Why does DynamoDB require expressionAttributeValue? - amazon-dynamodb

I'm learning how to filter results from a scan or query using Amazon's DynamoDB. I would expect an example filter to look like filter => name = Bob or some such. However, Amazon requires the use of an expression attribute, such as filter => name = :person, and then ExpressionAttributeValues => { ":person": {"S": "Bob"}}
This is confusing and hurts my head. Why can't I use the simple name = Bob?
Official docs: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#FilteringResults
Apparently working example near end: https://github.com/aws/aws-cli/issues/1073

This type of syntax follows an approach that is similar to prepared statements that are used in SQL systems. This was a design decision that the DynamoDB team at AWS made. One of the reasons is to allow fields that conflict with the lengthy list of reserved words (including 'name' that you were using in your example) that are defined by DynamoDB.
Avoiding reserved words is actually performed by using the ExpressionAttributeNames attribute and specifying the attribute names. You were referencing ExpressionAttributeValues which is where the list of values is specified. More information is available on the Using Placeholders for Attribute Names and Values documentation page.
Another motivation for this design is to separate the statement from the parameter names and values, similar to prepared statements in SQL, as I've already mentioned. While this may seem odd at first, it has the added benefit of effectively sanitizing your inputs in a NoSQL sense: user input is passed as data rather than as part of the expression, so malicious or accidental input cannot change the behavior of your request to DynamoDB.
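To make the separation concrete, here is a minimal Python sketch that only builds the request parameters. The helper build_scan_params and the table name "people" are hypothetical; the dictionary keys are the DynamoDB Scan parameters discussed above. Passing the result to a client (for example boto3's client.scan(**params)) is an assumption and is not shown.

```python
def build_scan_params(table_name, attribute, value):
    """Build Scan parameters that use placeholders instead of inlined names and values."""
    return {
        "TableName": table_name,
        # "#attr" stands in for the attribute name, so a reserved word like "name" is safe.
        "FilterExpression": "#attr = :val",
        "ExpressionAttributeNames": {"#attr": attribute},
        # ":val" stands in for the value; "S" marks it as a string.
        "ExpressionAttributeValues": {":val": {"S": value}},
    }


# Hypothetical table name; the odd-looking value stays plain data, it is never parsed.
params = build_scan_params("people", "name", "Bob' OR 1=1")
print(params["FilterExpression"])
```

Whatever the user types ends up only inside ExpressionAttributeValues, so the expression string itself never changes.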

Related

Wildcard searches

Our MarkLogic based web-application mostly uses cts.jsonPropertyValueQuery to access needed information.
We want to provide the possibility of wildcard searches against specific JSON properties.
What is the best way to do it?
Turning on one of the wildcard indexes for the whole database is not an option.
I figured out that adding a "wildcarded" parameter to the query itself may solve the problem:
cts.search(cts.jsonPropertyValueQuery("inventor", "R?th", ["wildcarded", "whitespace-sensitive"]));
But it may be slow due to the absence of indexes. Is there any way to create wildcard indexes only for that specific JSON property?
You could create a Path Field with an XPath to the inventor JSON field (and even for //inventor) and configure the field to have wildcard indexes, and then use a field query: cts.fieldValueQuery or cts.fieldWordQuery.

How can I limit and sort on document ID in firestore?

I have a collection where the documents are uniquely identified by a date, and I want to get the n most recent documents. My first thought was to use the date as a document ID, and then my query would sort by ID in descending order. Something like .orderBy(FieldPath.documentId, descending: true).limit(n). This does not work, because it requires an index, which can't be created because __name__ only indexes are not supported.
My next attempt was to use .limitToLast(n) with the default sort, which is documented here.
By default, Cloud Firestore retrieves all documents that satisfy the query in ascending order by document ID
According to that snippet from the docs, .limitToLast(n) should work. However, because I didn't specify a sort, it says I can't limit the results. To fix this, I tried .orderBy(FieldPath.documentId).limitToLast(n), which should be equivalent. This, for some reason, gives me an error saying I need an index. I can't create it for the same reason I couldn't create the previous one, but I don't think I should need to because they must already have an index like that in order to implement the default ordering.
Should I just give up and copy the document ID into the document as a field, so I can sort that way? I know it should be easy from an algorithms perspective to do what I'm trying to do, but I haven't been able to figure out how to do it using the API. Am I missing something?
Edit: I didn't realize this was important, but I'm using the flutterfire firestore library.
A few points. It is ALWAYS good practice to use random, well-distributed document IDs in Firestore for scale and efficiency. Related to that, there is effectively NO WAY to query by documentId, and the few circumstances where you can use it are VERY tricky (a range query, for example, is possible but requires inequalities, and you can only do inequalities on one field). IF there's a reason to search on an ID, yes, it is PERFECTLY appropriate to store it in the document as well - in fact, my wrapper library always does this.
The correct notation, btw, would be FieldPath.documentId() (method, not constant) - alternatively, __name__ - but I believe this only works in Queries. The reason it requested a new index is that without the () it assumed you had a field named FieldPath with a subfield named documentId.
Further: FieldPath.documentId() does NOT generate the documentId at the server - it generates the FULL PATH to the document - see Firestore collection group query on documentId for a more complete explanation.
So net:
=> documentId's should be as random as possible within a collection; it's generally best to let Firestore generate them for you.
=> a valid exception is when you have ONE AND ONLY ONE sub-document under another - for example, every "user" document might have one and only one "forms of Id" document as a subcollection. It is valid to use the SAME ID as the parent document in this exceptional case.
=> anything you want to query should be a FIELD in a document, and generally a simple field.
=> WORD TO THE WISE: Firestore "arrays" are ABSOLUTELY NOT ARRAYS. They are ORDERED LISTS, generally in the order they were added to the array. The SDK presents them to the CLIENT as arrays, but Firestore itself does not STORE them as ACTUAL ARRAYS - THE NUMBER YOU SEE IN THE CONSOLE is the order, not an index. Matching elements in an array (arrayContains, e.g.) requires matching the WHOLE element - if you store an ordered list of objects, you CANNOT query the "array" on sub-elements.
From what I've found:
FieldPath.documentId does not match on the documentId, but on the refPath (which it gets automatically if passed a document reference).
As such, since the documents are to be sorted by timestamp, it would be better to store a timestamp field such as createdAt rather than a human-readable string, which is sorted as text (character by character) rather than by the date it represents.
From there, you can simply sort by date and limit to last. You can keep the document IDs as you intend.
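The string-sorting pitfall can be seen locally with a small Python sketch. This does not call Firestore; the date strings are hypothetical IDs, and the parsed datetime plays the role a createdAt timestamp field would play.

```python
from datetime import datetime

# Hypothetical document IDs written as non-padded date strings.
ids = ["2021-9-5", "2021-10-1", "2021-11-20"]

# Sorting the strings compares characters, so "2021-10-1" sorts before "2021-9-5".
by_string = sorted(ids)

# Sorting by a real timestamp gives chronological order.
by_time = sorted(ids, key=lambda s: datetime.strptime(s, "%Y-%m-%d"))

# The n most recent entries, newest first.
n = 2
most_recent = by_time[-n:][::-1]

print(by_string)
print(most_recent)
```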

How to model Not In query in Couch DB [duplicate]

Folks, I was wondering what is the best way to model document and/or map functions that allows me "Not Equals" queries.
For example, my documents are:
1. { name : 'George' }
2. { name : 'Carlin' }
I want to trigger a query that returns every document where name does not equal 'John'.
Note: I don't have all possible names beforehand. So the parameter in the query can be any random text, like 'John' in my example.
In short: there is no easy solution.
You have four options:
sending a multi range query
filter the view response with a server-side list function
using a CouchDB plugin
use the mango query language
sending a multi range query
You can request the view with two ranges defined by startkey and endkey. You have to choose the ranges so that the key John is not included.
Unfortunately, you have to find the commit request that exists somewhere and compile your CouchDB with it. It's not included in the official source.
filter the view response with a server-side list function
It's not recommended, but you can use a list function and skip the row with the key John in your response, much as you would filter a JavaScript array.
using a CouchDB plugin
Create an additional index with e.g. couchdb-lucene. The lucene server has such query capabilities.
use the "mango" query language
It's included in the CouchDB 2.0 developer preview. It's not ready for production, but it will definitely be included in the stable release.
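For the Mango option, a "not equals" condition is written with the $ne operator inside a selector. The Python sketch below only builds the JSON request body; the helper name build_not_equal_query is hypothetical, and sending the body to the database's _find endpoint is left out.

```python
import json


def build_not_equal_query(field, excluded_value):
    """Build a Mango request body matching documents where field != excluded_value."""
    return {"selector": {field: {"$ne": excluded_value}}}


# The excluded value can be any text supplied at query time.
body = build_not_equal_query("name", "John")
payload = json.dumps(body)
print(payload)
```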

boto DynamoDb query/scan ProjectionExpression syntax?

From the documentation, it says "By default, a Scan returns all of the data attributes for every item; however, you can use the ProjectionExpression parameter so that the Scan only returns some of the attributes, rather than all of them."
I am wondering if anyone knows what's the syntax for using the ProjectionExpression parameter with boto?
For example I have
leagueTable = Table('leagues', schema=[HashKey('leagueId', data_type=NUMBER)])
I want to use the ProjectionExpression parameter to scan the table and only get back the selected field.
According to the documentation at http://docs.pythonboto.org/en/latest/ref/dynamodb2.html#boto.dynamodb2.table.Table.scan , the attributes parameter will allow you to specify a tuple of attributes and only return those attributes in the result set.
However, this uses the AttributesToGet API, instead of the newer ProjectionExpression API you are referring to. ProjectionExpression will allow you to retrieve individual list or map elements. To use ProjectionExpression, you would have to use the low-level API for boto, which matches the low-level DynamoDB API closely. The scan documentation for this can be found at: http://docs.pythonboto.org/en/latest/ref/dynamodb2.html#boto.dynamodb2.layer1.DynamoDBConnection.scan
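As a rough illustration of what the low-level request looks like, the Python sketch below builds the Scan parameters as a plain dictionary. The keys are the DynamoDB API parameter names; the helper build_projection_scan and the "name" attribute are hypothetical, and how the dictionary is handed to boto's low-level connection is not shown, so check the linked reference for the exact argument names.

```python
def build_projection_scan(table_name, attributes):
    """Build low-level Scan parameters that return only the given attributes."""
    # One "#pN" placeholder per attribute, so reserved words cannot clash.
    placeholders = ["#p%d" % i for i in range(len(attributes))]
    return {
        "TableName": table_name,
        "ProjectionExpression": ", ".join(placeholders),
        "ExpressionAttributeNames": dict(zip(placeholders, attributes)),
    }


# "leagues" and "leagueId" come from the question; "name" is a made-up attribute.
scan_params = build_projection_scan("leagues", ["leagueId", "name"])
print(scan_params["ProjectionExpression"])
```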
Hope that helps, good luck!

Linq to Entities - Drill down filter (Asp.net)

I've been searching for a good way of doing multiple "where" filters on an entity collection from linq. There are lots of sites that use a filter for their searches on the side, like ebay.
The technique used is called a "drill down" filter. Now I'm trying to find the right way of implementing this technique in my 3-tier model working with Linq-to-Entities.
The technique takes the previously retrieved entity collection and narrows it down with some kind of filter, but there are multiple filters, which can be both applied and removed, even within the same "category" of filtering.
I hope somebody can point me to a tutorial or a method for doing this properly.
In my experience, each "filter" on the side maps to a field in the database. This makes it simple to do a filter:
var result = db.Table
    .Where(t => t.Name.Contains(ddlName.Text))
    .Where(t => t.Attribute1.Contains(Attribute1.Text))
    .Where(t => t.Attribute2.Contains(Attribute2.Text));
Obviously you can substitute .Equals() where it makes sense. I've used this on several webapps with great success. It becomes a bit trickier when the filters you want do not map directly to fields in your database, but a similar approach can be taken.
