I am new to the Redis cache implementation.
I want to search value in all the keys.
The values may or may not be nested collections of list.
What command should I use this to search data?
https://github.com/antirez/redis/issues/6802
I am implementing the same in .net core.
https://github.com/StackExchange/StackExchange.Redis
If you just want to search inside a hash key as in the screenshot, you can use HSCAN to traverse all the fields of the hash, this returns the value as well. Then test for the value client-side. Or, you can move this logic to a Lua script to do it Redis-server-side.
If you want to search in all the keys, consider the following:
You will need to traverse the whole keyspace, key by key, using SCAN.
Depending on the type, perform the search inside the key.
Sets and sorted sets can be searched with SSCAN and ZSCAN for values, using MATCH option.
For all other types, you need to do the search by your own.
Again, you can implement the above in a Lua script for a more efficient implementation. This answer can get you started.
Related
I thought Datastore's key was ordered by insertion date, but apparently I was wrong. I need to periodically look for new entities in the Datastore, fetch them and process them.
Until now, I would simply store the last fetched key and wrongly query for anything greater than it.
Is there a way of doing so?
Thanks in advance.
Datastore automatically generated keys are generated with uniform distribution, in order to make search more performant. You will not be able to understand which entity where added last using keys.
Instead, you can try couple of different approaches.
Use Pub/Sub and architecture your app so another background task will consume this last added entities. On entities add in DB, you will just publish new Event into Pub/Sub with key id. You event listener (separate routine) will receive it.
Use names and generate you custom names. But, as you want to create sequentially growing names, this will case performance hit on even not big ranges of data. You can find more about this in Best Practices of Google Datastore.
https://cloud.google.com/datastore/docs/best-practices#keys
You can add additional creation time column, and still use automatic keys generation.
I'm trying Amazon's tutorial on dynamoDB: http://docs.aws.amazon.com/amazondynamodb/latest/gettingstartedguide/GettingStarted.DDBLocal.html
As I'm working through it, I can't figure out how to do simple things like:
print the names of the tables I've created or figure out what the primary keys are in a particular table, t.
I'm assuming that there is probably some really easy way to do this, I just haven't seen it.
DynamoDBLocal is essentially a DynamoDB instance running on your own computer with its own endpoint. The way to interact with it is the same way you would with the actual DynamoDB service.
The easiest way to do that is choose an API and make requests with the local endpoint. See here for some basic examples of how to set the endpoint.
In your case, it sounds like you want to use a few different API operations, which the syntax will differ depending on which language/SDK you use:
ListTables - self-explanatory
Scan - "The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index".
DescribeTable - "Returns information about the table, including the current status of the table, when it was created, the primary key schema, and any indexes on the table."
I have a fairly full example of a few operations using the Java SDK in this answer if you want some reference.
I'm looking to extend the base Meteor JS find function to create a pseudo-relational database method on the client side. I'm essentially going to curse through each line of the returned record looking for anything ending in _id and then run a find on the appropriate table to get the related data. I'm aware of performance issues but it's well within tolerance for my particular use cases.
I'd rather not reassign the .find() function for every collection I create on the client side. How do I modify the collection constructor for Meteor so that I can use my own for every collection?
Update
I have successfully extended the Meteor.Collection.prototype.find function and its subsequent fetch method. Fetch now takes a depth variable, and looks through all keys for ones ending in _id (except for the actual _id of the document). These keys are then used to find and fetch documents from the appropriate collections. I've also used pluralize to create a PHP Cake-esque collection naming convention that recognizes many-to-many relational collections. It's potentially expensive with MongoDB, but leveraged carefully it's turned out pretty slick.
I'll post code as an answer here soon and maybe even as a GIT if I have time.
I am using a riak bucket to store a list of messages, using a UUID as the key and a json message as value. This is working fine.
What I need is an efficient way to get a single message from the bucket without knowing its key, at least in one of these two scenarios:
Get the last inserted object (this is my prefered approach).
Get a random object from the bucket (if the first alternative is not possible).
Is there any efficient way to achieve that?
I think one alternative could be to retrieve the keys in the bucket and then get the first one. But this means making two calls to riak, one to obtain all the keys (just to discard all but one) and a second one to obtain the object. It does not seem very efficient.
As Riak is a key-value store, the by far most efficient way to retrieve data is through the keys. Listing or retrieving all keys in a bucket, even if you only end up using the one returned first, is one of the least efficient operations you can perform as it causes Riak to scan ALL keys in the system (not just the bucket), and it is usually recommended NEVER to use this on a production system.
The most efficient way to get the last inserted object would probably be to store the id in a separate, known record in a different bucket. This would however require you to perform two writes on every insert and two reads for every read, but would do so in the most efficient way. You could possibly implement a post-commit hook (would have to be in Erlang as it is not currently not possible to write records using JavaScript functions) on the bucket containing messages to get the system to perform the update for you, which would remove the need for the last write.
If you write a lot of data to the bucket containing messages, you may want to adjust the separate bucket so that it does not allow multiple values and that the last value wins. This way you would reduce the risk of having lots of siblings created due to frequent updates to this single record across the system. This would always give you one of the last written records, but not necessarily the last one (especially if you frequently write messages to the database), as Riak does not support any type of atomicity and is an eventually consistent database.
You could also create one or more secondary indexes if you are using the leveldb backend, and use this to limit your scan to only recent records, which would be more efficient than a scann of all keys. You could then either select the most recent key or a random one through mapreduce, but this would be much less efficient than the previously described approach.
I can not think of any efficient way to retrieve a random record in a bucket from Riak unless you know the range of keys you have inserted and can decide randomly on the client which one to get. One way to do this would be to generate all keys in sequence rather than using a UUID, but that is naturally not a good idea in a highly concurrent distributed system.
1st task is pretty easy to implement:
Add post-commit hook that will write the last inserted key to some predefined key/bucket place
Get the key from that predefined key/bucket and issue a get query using them
It's still two operations but both are just gets that are fast. Plus additional overhead on hook but nothing too heavy either.
2nd scenario is also easy, but it is way too inefficient to be used practically:
Get all keys (extremely expensive operation)
Pick random
Issue get
I have come up with the same scenario. In My scenario I have to save the users. For that I required an auto increment Id. So what I did is, I placed the last inserted key in a separate bucket as like mentioned by "Christian Dahlqvist", every time I want to insert new record I fetch the last inserted key from that key bucket. Here we have only one value in that bucket with the key as "LastKey" which is always known to us. And I incremented the key based on the fetched key and again updated the key bucket. So always the key bucket contains the latest key in it.
I'd like some advice on designing a REST API which will allow clients to add/remove large numbers of objects to a collection efficiently.
Via the API, clients need to be able to add items to the collection and remove items from it, as well as manipulating existing items. In many cases the client will want to make bulk updates to the collection, e.g. adding 1000 items and deleting 500 different items. It feels like the client should be able to do this in a single transaction with the server, rather than requiring 1000 separate POST requests and 500 DELETEs.
Does anyone have any info on the best practices or conventions for achieving this?
My current thinking is that one should be able to PUT an object representing the change to the collection URI, but this seems at odds with the HTTP 1.1 RFC, which seems to suggest that the data sent in a PUT request should be interpreted independently from the data already present at the URI. This implies that the client would have to send a complete description of the new state of the collection in one go, which may well be very much larger than the change, or even be more than the client would know when they make the request.
Obviously, I'd be happy to deviate from the RFC if necessary but would prefer to do this in a conventional way if such a convention exists.
You might want to think of the change task as a resource in itself. So you're really PUT-ing a single object, which is a Bulk Data Update object. Maybe it's got a name, owner, and big blob of CSV, XML, etc. that needs to be parsed and executed. In the case of CSV you might want to also identify what type of objects are represented in the CSV data.
List jobs, add a job, view the status of a job, update a job (probably in order to start/stop it), delete a job (stopping it if it's running) etc. Those operations map easily onto a REST API design.
Once you have this in place, you can easily add different data types that your bulk data updater can handle, maybe even mixed together in the same task. There's no need to have this same API duplicated all over your app for each type of thing you want to import, in other words.
This also lends itself very easily to a background-task implementation. In that case you probably want to add fields to the individual task objects that allow the API client to specify how they want to be notified (a URL they want you to GET when it's done, or send them an e-mail, etc.).
Yes, PUT creates/overwrites, but does not partially update.
If you need partial update semantics, use PATCH. See http://greenbytes.de/tech/webdav/draft-dusseault-http-patch-14.html.
You should use AtomPub. It is specifically designed for managing collections via HTTP. There might even be an implementation for your language of choice.
For the POSTs, at least, it seems like you should be able to POST to a list URL and have the body of the request contain a list of new resources instead of a single new resource.
As far as I understand it, REST means REpresentational State Transfer, so you should transfer the state from client to server.
If that means too much data going back and forth, perhaps you need to change your representation. A collectionChange structure would work, with a series of deletions (by id) and additions (with embedded full xml Representations), POSTed to a handling interface URL. The interface implementation can choose its own method for deletions and additions server-side.
The purest version would probably be to define the items by URL, and the collection contain a series of URLs. The new collection can be PUT after changes by the client, followed by a series of PUTs of the items being added, and perhaps a series of deletions if you want to actually remove the items from the server rather than just remove them from that list.
You could introduce meta-representation of existing collection elements that don't need their entire state transfered, so in some abstract code your update could look like this:
{existing elements 1-100}
{new element foo with values "bar", "baz"}
{existing element 105}
{new element foobar with values "bar", "foo"}
{existing elements 110-200}
Adding (and modifying) elements is done by defining their values, deleting elements is done by not mentioning it the new collection and reordering elements is done by specifying the new order (if order is stored at all).
This way you can easily represent the entire new collection without having to re-transmit the entire content. Using a If-Unmodified-Since header makes sure that your idea of the content indeed matches the servers idea (so that you don't accidentally remove elements that you simply didn't know about when the request was submitted).
Best way is :
Pass Only Id Array of Deletable Objects from Front End Application To Web API
2. Then You have Two Options:
2.1 Web API Way : Find All Collections/Entities using Id arrays and Delete in API , but you need to take care of Dependant entities like Foreign Key Relational Table Data too
2.2. Database Way : Pass Ids to your database side, find all records in Foreign Key Tables and Primary Key Tables and Delete in same order i.e. F-Key Table records then P-Key Table records