I have this much:
A table that is used to store the "Folders". Each folder may contain subfolders and files, so when I click a folder I have to list its contents.
The table that represents the folder listing looks something like the following:
FolderID Name Type Desc ParentID
For subfolders, ParentID refers to the FolderID of the parent folder.
Now, my questions are:
1.
a. There are 3 types of folders, and I use 3 data lists to categorize them. Can I load the entire table in a single fetch and then use LINQ to split it by type?
OR
b. Load each category by passing 'Type' to a stored procedure, which means 3 database calls.
2.
a. If I click the parent folder, use LINQ to filter the contents of the folder (because we have the entire table in memory).
OR
b. If I click the parent folder, pass the FolderID of the parent folder and then fetch its contents from the database.
In the two cases above, which option makes more sense, and which is better in terms of performance?
There are a number of things you need to consider.
What is the size of the folder tree? If it is not currently large, could it potentially become very large?
What is the likelihood that the folder table will be modified whilst a user is using/viewing it? If there is a high chance then it may be worthwhile to make smaller, more frequent calls to the DB so that the user is aware of any changes which have been made by other users.
Will users be working with one folder type at a time? Or will they be switching between these three different trees?
As an instinctive answer I would be drawn towards loading 1 or 2 levels at a time. For example, start by loading the root folder and its immediate children; as the user navigates down into the tree, retrieve more children...
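A minimal sketch of that lazy-loading approach, shown in TypeScript; `runQuery`, the SQL text, and the `Folder` shape are placeholders for whatever data-access layer you actually use (stored procedure call, ORM, etc.):

```typescript
// Sketch: load only one level of the tree per click.
interface Folder {
  folderId: number;
  name: string;
  type: number;
  desc: string;
  parentId: number | null;
}

// Placeholder for a real database call.
declare function runQuery(sql: string, params: unknown[]): Promise<Folder[]>;

// Immediate children of one folder (root folders have ParentID = NULL).
async function loadChildren(parentId: number | null): Promise<Folder[]> {
  if (parentId === null) {
    return runQuery("SELECT * FROM Folders WHERE ParentID IS NULL", []);
  }
  return runQuery("SELECT * FROM Folders WHERE ParentID = @parentId", [parentId]);
}

// Initial load: root folders plus their immediate children; deeper levels
// are only fetched when the user actually clicks into them.
async function loadInitialTree(): Promise<Map<number, Folder[]>> {
  const tree = new Map<number, Folder[]>();
  for (const root of await loadChildren(null)) {
    tree.set(root.folderId, await loadChildren(root.folderId));
  }
  return tree;
}
```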
When you are asking about performance, the only real answer is:
Measure it! Implement both scenarios and see how each one loads your system.
Think about how you will cache your data to avoid heavy database load.
Everything is fast for a small n, so nothing can be said for sure without measuring.
If your data is small and does not change frequently, then use caching and run LINQ queries against the cached data.
If your data can't be kept in the cache because it is huge, or because it changes constantly, then cache the results of your queries, create cache dependencies for them, and again, measure it!
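To make the cached-data option concrete, here is a small sketch of loading the whole table once and answering both questions in memory. It is in TypeScript; the equivalent in C# would be LINQ's GroupBy/Where over a cached list. The names are illustrative only.

```typescript
// Sketch: fetch the whole Folders table once, cache it, then categorize
// and filter locally instead of going back to the database.
interface Folder {
  folderId: number;
  name: string;
  type: number;
  parentId: number | null;
}

class FolderCache {
  constructor(private readonly folders: Folder[]) {}

  // Question 1a: split the cached rows into one list per folder type.
  byType(): Map<number, Folder[]> {
    const groups = new Map<number, Folder[]>();
    for (const f of this.folders) {
      const list = groups.get(f.type) ?? [];
      list.push(f);
      groups.set(f.type, list);
    }
    return groups;
  }

  // Question 2a: contents of a folder, filtered in memory, no extra DB call.
  childrenOf(parentId: number): Folder[] {
    return this.folders.filter(f => f.parentId === parentId);
  }
}
```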
I currently have a need to add a local secondary index to a DynamoDB table but I see that they can't be added after the table is created. It's fine for me to re-create the table now while my project is in development, but it would be painful to do that later if I need another index when the project is publicly deployed.
That's made me wonder whether it would be sensible to re-create the table with the maximum number of secondary indexes allowed even though I don't need them now. The indexes would have generically-named attributes that I am not currently using as their sort keys. That way, if I ever need another local secondary index on this table in the future, I could just bring one of the unused ones into service.
I don't think it would be a waste of storage or a performance problem, because I understand that the indexes will only be written to when an item is written that includes the attribute they index on.
I Googled to see if this idea was a common practice, but haven't found anyone talking about it. Is there some reason why this wouldn't be a good idea?
Don’t do that. If a table has any LSIs it follows different rules and cannot grow an item collection beyond 10 GB or isolate hot items within an item collection. Why incur these downsides if you don’t need to? Plus later you can always create a GSI instead of an LSI.
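For illustration, a hedged sketch of the "add a GSI later" route using the AWS SDK for JavaScript v3; the table, attribute, and index names here are made up:

```typescript
// Adding a GSI to an existing table later, which is not possible with an LSI.
import { DynamoDBClient, UpdateTableCommand } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({});

await client.send(
  new UpdateTableCommand({
    TableName: "MyTable",                                    // hypothetical table name
    AttributeDefinitions: [
      { AttributeName: "CategoryId", AttributeType: "S" },   // new key attribute
    ],
    GlobalSecondaryIndexUpdates: [
      {
        Create: {
          IndexName: "CategoryIdIndex",                      // hypothetical index name
          KeySchema: [{ AttributeName: "CategoryId", KeyType: "HASH" }],
          Projection: { ProjectionType: "ALL" },
          // Omit ProvisionedThroughput for on-demand tables; specify it
          // for provisioned-capacity tables.
        },
      },
    ],
  })
);
```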
We are planning to implement a virtual filesystem using Google Firestore.
The idea of subcollections is nice because it allows us to model our data in terms of a folder hierarchy, like so: /folders/folderA/entities/folderB/entities/fileX
Much like an actual filesystem, we'd like to handle cross-folder moves, such as moving nested subfolder folderB from parent folderA to parent folderC. Indeed, it will often be the case that the folder we want to move itself contains its own subcollections of files and folders, nested an arbitrary K levels deep.
This comment suggests that moving a document will not automagically move its associated subcollections. Similarly, deleting a document will forego deleting its underlying subcollections, leaving them as orphans. It seems like the only way to move a folder (and its entities) from one parent to another would be through a recursive clone + delete strategy, which may be difficult to accomplish reliably and transactionally if its sub-entities are massive.
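For reference, a rough sketch of what that recursive clone + delete move might look like with the Firebase Admin SDK; the paths mirror the example above, and this is not transactional or safe for very large subtrees, which is exactly the concern:

```typescript
import { initializeApp } from "firebase-admin/app";
import { getFirestore, DocumentReference } from "firebase-admin/firestore";

initializeApp(); // assumes credentials are configured in the environment
const db = getFirestore();

async function moveDoc(src: DocumentReference, dst: DocumentReference): Promise<void> {
  // 1. Copy the document's own fields.
  const snap = await src.get();
  if (snap.exists) {
    await dst.set(snap.data()!);
  }
  // 2. Recursively copy every subcollection (e.g. "entities").
  for (const sub of await src.listCollections()) {
    for (const child of await sub.listDocuments()) {
      await moveDoc(child, dst.collection(sub.id).doc(child.id));
    }
  }
  // 3. Delete the source only after its subtree has been copied; because each
  //    recursive call deletes its own document, the source tree is removed
  //    bottom-up and no orphaned subcollection documents are left behind.
  await src.delete();
}

// Moving folderB from folderA to folderC:
await moveDoc(
  db.doc("folders/folderA/entities/folderB"),
  db.doc("folders/folderC/entities/folderB")
);
```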
The alternative is to abandon using subcollections and store all folders at the root instead, using a document field like parent_id to point to other docs within the flat collection. This shouldn't impact querying speeds due to Firestore's aggressive indexing, but we've been unable to reproduce this claim locally; i.e., querying via subcollections is vastly more performant as the total number of documents in the DB increases, versus storing everything at the top level. A reproducible repo is available here. Note that the repo uses a local emulator instance, as opposed to an actual Firestore server.
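And a sketch of the flat, parent_id-based alternative for comparison; the collection and field names are assumptions taken from the description above:

```typescript
import { initializeApp } from "firebase-admin/app";
import { getFirestore } from "firebase-admin/firestore";

initializeApp();
const db = getFirestore();

// Moving folderB from folderA to folderC touches exactly one document,
// regardless of how deep folderB's subtree is.
await db.collection("entities").doc("folderB").update({ parent_id: "folderC" });

// Listing a folder's contents is a single-field equality query, which
// Firestore indexes automatically.
const children = await db
  .collection("entities")
  .where("parent_id", "==", "folderA")
  .get();
children.forEach(doc => console.log(doc.id, doc.data()));
```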
Any advice would be very helpful!
My Firebase project is growing fast and I have a lot of children in some paths (the structure is kept as flat as possible).
My tree looks like this:
/client/section
-key: value
-key: value
...
In some sections I've got 80k+ children and the number is growing fast (I could easily hit 1mil+ in a few months). I was thinking of splitting the sections into section1, section2, ... but the problem is that I have to do a numChildren check (which of course loads all the children) before I insert another child, to keep each section inside the desired limit.
Another idea was to change the section to a date (Y-m-d or just Y-m), but that would again generate a lot of paths.
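As a concrete sketch of the Y-m idea (Firebase Admin SDK; the path layout and helper are assumptions, not part of the question):

```typescript
import { initializeApp } from "firebase-admin/app";
import { getDatabase } from "firebase-admin/database";

initializeApp(); // assumes databaseURL/credentials come from the environment
const db = getDatabase();

// Write each new child under /<client>/<Y-m> so no single section node keeps
// growing forever, and use push() keys so no numChildren check is needed
// before inserting.
async function addChild(client: string, value: unknown): Promise<string> {
  const yearMonth = new Date().toISOString().slice(0, 7); // e.g. "2016-03"
  const ref = db.ref(`/${client}/${yearMonth}`).push();
  await ref.set(value);
  return ref.key!;
}
```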
With the new schema I'd also like to add some properties to the current children (so I have to do some work on the schema anyway).
Another idea was to feed this data into a relational DB.
I'd like your input on how to structure the schema for the future.
Thank you!
I am a beginner in AX and I am trying to set access rights for some users; on a specific operation they get the error that they don't have access to the table SalesCreateReleaseOrderLineTmp. I have manually searched for this table in every category, but without success. I found the full description of this table on a website -> Order Lines - SalesCreateReleaseOrderLineTmp - ID: 995. I've searched for the ID as well, but again no result. With admin rights everything is ok, but that is obviously not a solution.
Is there a fixed location for this table, and can anyone tell me where it is? :) Or is there any way to search for this table (by ID or name)?
I guess that by "I have manually searched for this table in every category, but without success" you mean you tried to find the table in the form for maintaining the user group permissions?
If so, this is because temporary tables are hidden from that tree view: while building the tree view, SysSecurity.expandSecurityKey calls the class method SysDictTable.allowSecuritySetup, which, among other things, checks whether the table is temporary.
So essentially you have 3 options:
Give your permission group the desired access on the security key so that the group 'inherits' access to the table through it - the downside, of course, is that this could be too permissive, but the upside is better maintainability :)
Remove the security key on the temporary table, as having one there is IMHO the wrong decision anyway. The application shouldn't restrict access to temporary tables (which are intrinsically scoped to the user session anyway) but rather enforce access checks in the code filling that table, or in even higher-level processes.
Customize the code which builds the security tree view so that it includes temp. tables.
Try the first of the options above that works for you: the first does not need any application modification, and the second is only a simple property change which, in my opinion, is badly configured at the moment anyway. The last option should be the last resort.
Consider a set of data called Library, which contains a set of Books and each book contains a set of Pages.
Let's say you are using Riak to store this data, and you need to be able to access the data in two possible ways:
- Query for a particular page (with a unique id)
- Query for all pages in a particular book (with a unique name)
Additionally, you need to be able to easily update and delete pages of a particular Book.
What would be the best way to accomplish this in Riak?
Obviously Riak Search would do the trick, but it may be inefficient for what I am trying to do. I am wondering if it makes sense to set up buckets where each bucket is a Book (which would make for potentially millions of "Book" buckets). Maybe that is a bad idea...
Can this be accomplished with secondary indexes?
I am trying to keep this simple...
I am new to Riak and I am trying to find the best way to accomplish something that is probably relatively simple. I would appreciate any help from the Stack Overflow community. Thanks!
A common way to model master-detail relationships in Riak is to have the master record contain a list of detail record IDs, possibly together with some information about the detail record that may be useful when deciding which detail records to retrieve.
In your example, you could have two buckets called 'books' and 'pages'. The master record in the 'books' bucket will contain metadata and information about the book as a whole, together with a list of the pages included in the book. Each entry in that list would contain the ID of the 'pages' record holding the page data, as well as the corresponding page number. If you e.g. wanted to be able to query by chapter, you could also add information about which chapter a certain page belongs to.
The 'pages' bucket would contain the text of the page and possibly links to images and other media data that are included on that page. This data could be stored in yet another bucket.
In order to get a specific page or a range of pages, one would first retrieve the master record from the 'books' bucket and then, based on the contents of the record, retrieve the appropriate pages. Even though this requires several GET operations, they are all direct lookups based on keys, which is the most efficient and scalable way to retrieve data from Riak, so it will perform and scale well.
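As a rough sketch of that read path using Riak's HTTP API; the bucket names and the shape of the master record are assumptions for illustration only:

```typescript
const RIAK = "http://localhost:8098";

interface BookMaster {
  title: string;
  // Ordered list of page entries: page number + key into the 'pages' bucket.
  pages: { pageNumber: number; pageKey: string }[];
}

// Direct key lookup of one object via Riak's HTTP interface.
async function getObject<T>(bucket: string, key: string): Promise<T> {
  const res = await fetch(`${RIAK}/buckets/${bucket}/keys/${key}`);
  if (!res.ok) throw new Error(`GET ${bucket}/${key} failed: ${res.status}`);
  return res.json() as Promise<T>;
}

// Fetch a range of pages: one GET for the master record, then one direct
// key lookup per requested page.
async function getPages(bookKey: string, from: number, to: number) {
  const book = await getObject<BookMaster>("books", bookKey);
  const wanted = book.pages.filter(p => p.pageNumber >= from && p.pageNumber <= to);
  return Promise.all(wanted.map(p => getObject("pages", p.pageKey)));
}
```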
This approach also makes it simple to change the order of pages and/or chapters as only the master record needs to be updated. Adding, deleting or modifying pages would however require both the master record as well as one or more detail records to be updated, added or deleted.
You can most certainly also solve this problem by adding secondary indexes to the objects and querying based on those. Secondary index queries in Riak do, however, have to process a covering set (generally ring size / n_val) of partitions in order to fulfil the request, and therefore put a bit more load on the system and generally result in higher latencies than retrieving a single object containing keys through a direct key lookup (which only needs to involve the partitions where the object is actually stored).
Although maintaining a separate object containing indexes adds a bit of extra work when inserting or deleting pages/entries, this approach will generally result in more efficient reads, as only direct key lookups are required. If your application is heavy on reads, it probably makes sense to use this approach, while secondary indexes could be more efficient for a write heavy application as inserts and modifications are made cheaper at the expense of more expensive reads. You can however always add secondary indexes just in case in order to keep your options open.
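And a sketch of the secondary-index variant via Riak's HTTP interface (2i requires a backend that supports it, such as LevelDB); the index and bucket names are assumptions:

```typescript
const RIAK = "http://localhost:8098";

// Store a page and tag it with a book_id_bin secondary index entry.
async function putPage(pageKey: string, bookId: string, page: object): Promise<void> {
  await fetch(`${RIAK}/buckets/pages/keys/${pageKey}`, {
    method: "PUT",
    headers: {
      "Content-Type": "application/json",
      "x-riak-index-book_id_bin": bookId, // 2i entry on the object
    },
    body: JSON.stringify(page),
  });
}

// List the keys of all pages belonging to one book via a 2i query.
async function pageKeysForBook(bookId: string): Promise<string[]> {
  const res = await fetch(`${RIAK}/buckets/pages/index/book_id_bin/${bookId}`);
  const body = (await res.json()) as { keys: string[] };
  return body.keys;
}
```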
In cases like this I would usually recommend performing some benchmarks to test the solutions and check which one best matches your particular performance and scaling requirements.
The most efficient way would be to store the whole book as one object, and duplicate its pages as separate objects.
Pros:
you will be able to select any object by its key (the cheapest operation in Riak is a key/value query)
every query will have predictable latency
this is the natural way of storing data in Riak
Cons:
If you need to update any page, you must update the whole book and then the page. As Riak doesn't have atomic operations, you have to think about how to recover from any failure situation (like this one: the book was updated, but the page was not).
Riak is about availability and predictable latency, so if you use something like 2i to collect results, your queries will have unpredictable timing that grows with the number of pages.
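A small sketch of that two-step update and the failure window it leaves (Riak HTTP API; bucket names and object shapes are assumptions):

```typescript
const RIAK = "http://localhost:8098";

async function putJson(bucket: string, key: string, value: object): Promise<void> {
  const res = await fetch(`${RIAK}/buckets/${bucket}/keys/${key}`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(value),
  });
  if (!res.ok) throw new Error(`PUT ${bucket}/${key} failed: ${res.status}`);
}

// Update the whole-book object first, then the duplicated page object.
async function updatePage(
  book: { key: string; pages: object[] },
  pageIndex: number,
  newPage: object
): Promise<void> {
  book.pages[pageIndex] = newPage;
  await putJson("books", book.key, book);                       // step 1: whole book
  await putJson("pages", `${book.key}-p${pageIndex}`, newPage); // step 2: duplicated page
  // If step 2 fails, the book and the standalone page object disagree;
  // a retry or a background reconciliation job is needed to converge.
}
```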