How do I delete all instances of a node in all documents? - xquery

We're changing our schema to move nodes that are protected paths under a single parent node, to reduce the number of protected paths we need to maintain. For example,
/specialized/protectThis1
/specialized/protectThis2
becomes
/specialized/protected/protectThis1
/specialized/protected/protectThis2
I've tried getting the records via cts:uris, but the number of records fills the expanded tree cache. Any ideas on how I can clean out these nodes without filling the tree cache?

I would break the work up and run it as a batch job. CoRB (https://github.com/marklogic-community/corb2) is perfect for this: a URIs module selects the affected documents and a process module transforms them one URI at a time, so no single query ever has to hold the whole result set.
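If it helps to see the shape of that split outside of CoRB, here is a rough sketch, assuming a hypothetical runXQuery() helper that evaluates XQuery against MarkLogic (for example via the /v1/eval REST endpoint) and returns results as strings. The element names are just the ones from your example and are assumed to be in no namespace; CoRB's URIS-MODULE and PROCESS-MODULE play the same two roles, with proper threading and retry handling on top.
// URIs are cheap to fetch; it's pulling whole documents in bulk that blows the expanded tree cache.
const URIS_QUERY = `
  cts:uris((), (),
    cts:element-query(xs:QName("specialized"), cts:true-query()))
`;

// Rebuild <specialized> for one document: keep everything else, and wrap the two
// protected elements in a new <protected> node.
const PROCESS_QUERY = `
  declare variable $URI as xs:string external;
  let $spec := fn:doc($URI)/specialized
  return xdmp:node-replace($spec,
    element specialized {
      $spec/@*,
      $spec/node()[not(self::protectThis1 or self::protectThis2)],
      element protected { $spec/(protectThis1 | protectThis2) }
    })
`;

async function migrate(runXQuery: (xquery: string, vars?: Record<string, string>) => Promise<string[]>) {
  const uris = await runXQuery(URIS_QUERY);
  for (const uri of uris) {
    // One document per request, like a CoRB process module, so no single query expands more than one document.
    await runXQuery(PROCESS_QUERY, { URI: uri });
  }
}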

Related

Prevent parent node deletion if there is not a single value in it

I am using Firebase to store data. When I delete all data within a single parent node, the parent node also gets deleted. A parent node exists only if there is at least a single value of data in it.
How do I modify the rules in such a way that the parent node always exists, even if there is not a single value in it? (I set the values of .read and .write in the rules to true.)
For example, the parent 'Queue Members' has a value 'hello'. If I delete 'hello', the parent 'Queue Members' also gets deleted.
There is no way to have a node without a value under it.
The Firebase Database automatically creates nodes as values are added under them. It also automatically deletes nodes when no value exists under them anymore.
This should typically not lead to any problems in your code, since requesting the value of a non-existing node leads to an empty snapshot. If you're having a problem with this in your code, it would be useful to see how you're trying to deal with it.
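To make that concrete, here is a minimal sketch with the JavaScript SDK (the database URL is a placeholder): a path whose last value was deleted simply doesn't exist anymore, so you read it like any other path and treat a missing snapshot as empty.
import { initializeApp } from "firebase/app";
import { getDatabase, ref, get } from "firebase/database";

const app = initializeApp({ databaseURL: "https://your-project-default-rtdb.firebaseio.com" });
const db = getDatabase(app);

async function readQueueMembers() {
  const snapshot = await get(ref(db, "Queue Members"));
  // exists() is false and val() is null once the last child is deleted,
  // so "node is gone" and "node is empty" are handled the same way.
  return snapshot.exists() ? snapshot.val() : {};
}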

Meteor pagination: cursor fetch limit with getmore

I have an infinite scroll page where I'm not using Meteor templates to draw the items. The reason for that belongs in a whole other thread. I'm trying to figure out how to paginate the data without fetching all the items at once. I have an idea about using a limit on the cursor, but can't find any real samples online of the proper way to do this.
Should the server call return the cursor itself or just the find with limited data set? If the server doesn't return the cursor itself, won't I lose position when I try to fetch the next set of results?
Also, I want to make sure to retrieve data from the same cursor. Like if there are currently 100 items and I fetch 20, I expect the next 4 fetches to get 20-40, 40-60, 60-80, and 80-100. If in the interim some items got inserted or deleted, I don't want it to mess up the fetches. I am handling reactivity separately and letting users decide when to update the items (which should reset the cursor).
Help/advice appreciated!
What you would usually do is this:
var cursor = collection.find({},{limit:100+20*page});
The first {} is obviously the selector!
Docs:
http://docs.meteor.com/#/basic/Mongo-Collection-find
You don't have to worry about returning only the values 100-120 and then 120-140 etc., since Meteor's DDP does that for you!
If you were using Meteor's Blaze, or you just want to have the reactivity, you should probably store the page variable in the Session or create a dependency:
https://manual.meteor.com/#deps-asimpleexample
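For example (client side, with the collection name and import path as placeholders), the Session-based version might look like this; each time the page grows, DDP only has to send the documents you don't already have.
import { Session } from "meteor/session";
import { Tracker } from "meteor/tracker";
import { Items } from "/imports/api/items";  // placeholder: any Mongo.Collection you subscribe to

Session.setDefault("page", 1);

// Re-runs whenever the page number changes.
Tracker.autorun(() => {
  const limit = 20 * Session.get("page");
  const items = Items.find({}, { limit }).fetch();
  console.log("drawing", items.length, "items");  // draw them however you like; the question doesn't use templates
});

// Call this when the user scrolls near the bottom of the list.
export function loadMore() {
  Session.set("page", Session.get("page") + 1);
}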

Efficient way of paging with MongoDB and ASP.NET MVC

We are creating an application with MongoDB as the database and we are using the official C# driver for MongoDB. We have one collection which contains thousands of records and we want to create a list with paging. I have gone through the documentation, but I haven't found an efficient way of paging with the official MongoDB C# driver.
My requirement is to fetch exactly 50 records from the database. I have seen many examples, but they get the whole collection and perform Skip and Take via LINQ, which is not going to work in our case as we don't want to fetch thousands of records into memory.
Please provide any example code or link for that. Any help will be appreciated.
Thanks in advance for help.
You can use SetLimit on the cursor that represents the query. That will limit the results from MongoDB, not only in memory:
var cursor = collection.FindAll(); // Or any other query.
cursor.SetLimit(50); // Will only return 50.
foreach (var item in cursor)
{
    // Process item.
}
You can also use SetSkip to set a skip (surprisingly):
cursor.SetSkip(10);
Note: You must set those properties on the cursor before enumerating it. Setting those after will have no effect.
By the way, even if you do use LINQ's Skip and Take, you won't be retrieving thousands of documents. MongoDB automatically batches the result by size (the first batch is about 1 MB, the rest are about 4 MB each), so you would only get the first batch and take the first 50 docs out of it.
Edit: I think there's some confusion about LINQ here:
that get all collection and perform skip and take via LINQ which is not going to work in our case as we don't want to fetch thousand of records in memory.
Skip and Take are extension methods on both IEnumerable and IQueryable. IEnumerable is meant for in-memory collections, but IQueryable operations are translated by the specific provider (the C# driver in this case). So the above code is equivalent to:
foreach (var item in collection.AsQueryable().Take(50))
{
    // Process item.
}

DynamoDB batch write updates existing items

In the DynamoDB documentation it is stated that existing items cannot be updated with batch writing. However, when I try it, it replaces items that already exist. How can I prevent it from overwriting existing items?
As stated in the documentation, if you re-put an item it replaces the old one.
UpdateItem adds/changes attributes but doesn't remove the other ones.
So basically what you are doing is replacing items, not updating them.
With batch write you can't put conditions on individual items, thus you can't prevent it from overwriting existing items.
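If you need overwrites to be prevented, the usual workaround is to drop down to individual PutItem calls with a condition expression. A rough sketch with the AWS SDK for JavaScript v2, assuming a table named MyTable whose partition key is id:
import AWS from "aws-sdk";

const docClient = new AWS.DynamoDB.DocumentClient();

async function putIfAbsent(item: { id: string; [attr: string]: unknown }) {
  try {
    await docClient.put({
      TableName: "MyTable",                              // assumed table name
      Item: item,
      ConditionExpression: "attribute_not_exists(id)",   // fail instead of replacing an existing item
    }).promise();
  } catch (err: any) {
    if (err.code === "ConditionalCheckFailedException") {
      return;  // the item already exists; leave it untouched
    }
    throw err;
  }
}
The trade-off is that you lose BatchWriteItem's batching, so this is one request per item.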

Efficiently maintaining a cache of distinct items in a huge DB table

I have a very large (millions of rows) SQL table which represents name-value pairs (one column for the name of a property, the other for its value). In my ASP.NET web application I have to populate a control with the distinct values available in the name column. This set of values is usually not bigger than 100, most likely around 20. Running the query
SELECT DISTINCT name FROM nameValueTable
can take a significant time on this large table (even with the proper indexing etc.). I especially don't want to pay this penalty every time I load this web control.
So caching this set of names should be the right answer. My question is how to promptly update the cached set when a new name appears in the table. I looked into the SQL Server 2005 Query Notification feature, but the table gets updated frequently and very seldom with an actual new distinct name. The notifications would flow in all the time, and the web server would probably waste more time handling them than it saves by caching.
I would like to find a way to balance the time used to query the data, with the delay until the name set is updated.
Any ideas on how to efficiently manage this cache?
A little normalization might help. Break the property names out into a new table, and FK back to the original table using an int ID. You can then select from the new table to get the complete list, which will be really fast.
Figuring out your pattern of usage will help you come up with the right balance.
How often are new values added? Are the new values always unique? Is the activity on the table mostly updates? Do deletes occur?
One approach may be to have a SQL Server insert trigger that checks the cache table to see if the new key is there and, if it's not, adds it.
Add a unique, increasing sequence MySeq to your table. You may want to try clustering on MySeq instead of your current primary key so that the DB can build a small set and then sort it.
SELECT DISTINCT name FROM nameValueTable WHERE MySeq >= ?;
Set ? to the highest MySeq your cache had seen at its last update.
You will always have a lag between your cache and the DB, so if this is a problem you need to rethink the flow of the application. You could try making all requests flow through your cache/application if you manage the data:
requests --> cache --> db
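In code, that incremental refresh might look something like this (query() is a hypothetical helper that runs parameterised SQL and returns rows; the MAX(MySeq)/GROUP BY is only there so the cache can remember the highest sequence value it has seen):
let lastSeenSeq = 0;
const distinctNames = new Set<string>();

type QueryFn = (sql: string, params: unknown[]) => Promise<{ name: string; maxSeq: number }[]>;

async function refreshNameCache(query: QueryFn): Promise<Set<string>> {
  // Only rows added since the last refresh are scanned.
  const rows = await query(
    "SELECT name, MAX(MySeq) AS maxSeq FROM nameValueTable WHERE MySeq > ? GROUP BY name",
    [lastSeenSeq]
  );
  for (const row of rows) {
    distinctNames.add(row.name);
    lastSeenSeq = Math.max(lastSeenSeq, row.maxSeq);
  }
  return distinctNames;
}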
If you're not allowed to change the actual structure of this huge table (for example, due to huge numbers of reports relying on it), you could create a holding table of these 20 values and query against that. Then, on the huge table, have a trigger that fires on an INSERT or UPDATE, checks to see if the new NAME value is in the holding table, and if not, adds it.
I don't know the specifics of .NET, but I would pass all the update requests through the cache. Are all the update requests done by your ASP.NET web application? Then you could make a Proxy object for your database and have all the requests directed to it. Taking into consideration that your database only has key-value pairs, it is easy to use a Map as a cache in the Proxy.
Specifically, in pseudocode, all the requests would be as following:
// the client invokes cache.get(key)
if (cacheMap.has(key)) {
    return cacheMap.get(key);
} else {
    var value = database.retrieve(key);  // cache miss: fall through to the database
    cacheMap.put(key, value);
    return value;
}
// the client invokes cache.put(key, value)
cacheMap.put(key, value);
if (writeThrough) {
    database.put(key, value);  // write-through: keep the database in sync immediately
}
Also, in the background you could have an evictor thread which ensures that the cache does not grow too big. In your scenario, where you have a set of frequently accessed values, I would set an eviction strategy based on Time To Idle: if an item is idle for more than a set amount of time, it is evicted. This ensures that frequently accessed values remain in the cache. Also, if your cache is not write-through, you need to have the evictor write to the database on eviction.
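A tiny sketch of that Time To Idle idea (the ten-minute window and the one-minute timer are arbitrary numbers):
const MAX_IDLE_MS = 10 * 60 * 1000;
const cacheMap = new Map<string, { value: unknown; lastAccess: number }>();

setInterval(() => {
  const now = Date.now();
  for (const [key, entry] of cacheMap) {
    if (now - entry.lastAccess > MAX_IDLE_MS) {
      // If the cache is not write-through, persist the entry here before evicting it.
      cacheMap.delete(key);
    }
  }
}, 60 * 1000);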
Hope it helps :)
-- Flaviu Cipcigan
