I am referring to a thread about creating an index with JSON.
I have a column called data in my DynamoDB table. It is stored as JSON, and its structure looks like this:
{
  "config": "aasdfds",
  "state": "PROCESSED",
  "value": "asfdasasdf"
}
The AWS documentation says that I can create an index on a top-level JSON attribute, but I don't know how to do this exactly. When I create the index, should I specify the partition key as data.state and then, in my code, query on data.state with the value set to PROCESSED? Or should I create the partition key as data and then, in my code, query the data column for state = "PROCESSED"?
"Top-level attribute" means that DynamoDB supports creating an index only on scalar attributes (String, Number, or Binary).
Your JSON data attribute is stored as a document data type (a Map), so an index can't be created on it.
The key schema for the index. Every attribute in the index key schema
must be a top-level attribute of type String, Number, or Binary. Other
data types, including documents and sets, are not allowed.
Scalar Types – A scalar type can represent exactly one value. The
scalar types are number, string, binary, Boolean, and null.
Document Types – A document type can represent a complex structure
with nested attributes—such as you would find in a JSON document. The
document types are list and map.
Set Types – A set type can represent multiple scalar values. The set
types are string set, number set, and binary set.
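Given that restriction, a common workaround (a sketch, not something from the original thread) is to duplicate the nested state value as its own top-level attribute and build a GSI on that copy. In the sketch below, the table name, index name, and the id key are assumptions:

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("MyTable")  # assumed table name

# Write the item with "state" both inside the JSON map and as its own
# top-level attribute, which is what the GSI partition key points at.
table.put_item(Item={
    "id": "item-1",  # assumed table partition key
    "data": {"config": "aasdfds", "state": "PROCESSED", "value": "asfdasasdf"},
    "state": "PROCESSED",
})

# Query the GSI (assumed name "state-index") on the top-level attribute.
response = table.query(
    IndexName="state-index",
    KeyConditionExpression=Key("state").eq("PROCESSED"),
)
print(response["Items"])

In that setup the query targets the top-level state attribute rather than data.state or the data map itself.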
Related
I have a table that contains SuperGroup, Group and User data where a SuperGroup contains multiple Groups and a Group contains multiple Users. Each of these has a type and uuid attribute, where the type corresponds to what they are.
I have a GSI with the hash key as the type attribute and the range key as the uuid and I need a way to query the table such that I can fetch the relevant data for a list of type and uuid pairs. There will always be exactly one of each type.
Pseudo-example of the query inputs:
query_inputs = [
("SuperGroup", "super-group-uuid"),
("Group", "group-uuid"),
("User", "user-uuid"),
]
Can I do this in a single query? I'd like to avoid a scan, but I'm open to modeling my data differently or creating the index differently, if that can help.
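For reference, a single (type, uuid) pair can be fetched from that GSI with one query, roughly as in the sketch below (table and index names are assumptions); the open question is whether several pairs can be combined into one call.

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Groups")  # assumed table name

def get_by_type_and_uuid(item_type, item_uuid):
    # One GSI query per (type, uuid) pair.
    response = table.query(
        IndexName="type-uuid-index",  # assumed GSI name
        KeyConditionExpression=Key("type").eq(item_type) & Key("uuid").eq(item_uuid),
    )
    return response["Items"]

results = [get_by_type_and_uuid(t, u) for t, u in query_inputs]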
I have a simple table "tags" containing a key and a value column. The key is always a string, the value can be either string, int64 or a double value.
I do not have any real data at this point to test with. But I'm curious about the index usage of the value column. I've defined the column as TEXT type - is SQLite still able to use the index on the value column when an int64 or double type is bound to the statement?
Here is the test table:
CREATE TABLE "tags" ("key" TEXT,"value" TEXT DEFAULT (null) );
INSERT INTO "tags" VALUES('test','test');
INSERT INTO "tags" VALUES('testint','1');
INSERT INTO "tags" VALUES('testdouble','2.0');
I see additional "Integer" and "Affinity" entries when analyzing the query via:
explain SELECT value FROM tags where key = "testint" and value >= 1
But I do not see any difference in index usage otherwise (e.g. IdxGT is always used). I'd rather have a definite answer than rely on possibly wrong assumptions drawn from this small test data.
The documentation says:
A column with TEXT affinity stores all data using storage classes NULL, TEXT or BLOB. If numerical data is inserted into a column with TEXT affinity it is converted into text form before being stored.
The sort order is well-defined for all types (NULL sorts before numeric values, which sort before text, which sorts before blobs).
Forcing the affinity to be TEXT makes comparisons on this column with numbers behave as if the values were text, which is probably not what you want.
In any case, indexes do not change the behaviour; they work correctly with all types, and apply affinities in exactly the same way as on non-indexed columns.
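A small sketch of what this means in practice, using Python's sqlite3 module; the index and the extra '10' row are assumptions added for illustration:

import sqlite3

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE tags ("key" TEXT, "value" TEXT DEFAULT (null))')
con.execute('CREATE INDEX idx_tags_value ON tags("value")')  # assumed index
con.executemany("INSERT INTO tags VALUES (?, ?)",
                [("test", "test"), ("testint", "1"),
                 ("testdouble", "2.0"), ("testbig", "10")])

# The bound integer 2 is compared against the TEXT column as text,
# so the row with value '10' is excluded even though 10 >= 2 numerically.
print(con.execute("SELECT key, value FROM tags WHERE value >= ?", (2,)).fetchall())

# EXPLAIN QUERY PLAN shows whether the index on "value" is used.
print(con.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM tags WHERE value >= ?", (2,)).fetchall())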
I tested a two-dimensional array like:
RETURN [[0,1],[2,3],[4,5],[6,7],[8,9]] AS collection
It works.
But when I try to add a two-dimensional array property to a relationship like:
MATCH (station_44:STATION {id:44}), (station_38:STATION {id:38}) CREATE UNIQUE (station_44)-[:test2 { path:[[1,2],[2,3],[3,4]] } ]->(station_38)
I get the error: Collections containing mixed types can not be stored in properties.
How can I do this? Is it a bug?
You cannot have an array containing an array as a node or relationship property value.
You can only have an array of a single primitive type, e.g. int or string.
Documentation reference on property values
In any case, if you need to query the sub-dimensions of the arrays, then your model is definitely wrong, and I would suggest redesigning it around the queries you need to run.
If you just want to store it as a property for later retrieval, you can store it as a JSON string and serialize/deserialize it at the application level.
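A minimal sketch of that JSON-string approach with the official Python driver; the connection details are placeholders and the labels/ids mirror the question:

import json
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
path = [[1, 2], [2, 3], [3, 4]]

with driver.session() as session:
    # Store the two-dimensional array serialized as a JSON string property.
    session.run(
        "MATCH (a:STATION {id: 44}), (b:STATION {id: 38}) "
        "CREATE (a)-[:test2 {path: $path_json}]->(b)",
        path_json=json.dumps(path),
    )
    # Read it back and deserialize at the application level.
    record = session.run(
        "MATCH (:STATION {id: 44})-[r:test2]->(:STATION {id: 38}) "
        "RETURN r.path AS path_json").single()
    print(json.loads(record["path_json"]))

Note that this trades queryability for storage: Cypher can no longer filter on the inner values without deserializing.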
I'm trying to create an index on a nested field, using the Dashboard in AWS Developer Console. E.g. if I have the following schema:
{
  "id": 1,
  "nested": {
    "mode": "mode1",
    "text": "nice text"
  }
}
I was able to create the index on nested.mode, but whenever I then query by the index, nothing ever comes back. It makes me think that DynamoDB created the index on a field named "nested.mode" instead of the mode field inside nested. Any hints on what I might be doing wrong?
You cannot (currently) create a secondary index off of a nested attribute. From the Improving Data Access with Secondary Indexes in DynamoDB documentation (emphasis mine):
For each secondary index, you must specify the following:
...
The key schema for the index. Every attribute in the index key schema must be a top-level attribute of type String, Number, or Binary. Nested attributes and multi-valued sets are not allowed. Other requirements for the key schema depend on the type of index:
You can, however, create an index on any top-level JSON element.
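Concretely, that usually means copying nested.mode into a top-level attribute (say mode) when writing items and indexing that copy. A sketch with boto3, where the table name, index name, and throughput values are assumptions:

import boto3

client = boto3.client("dynamodb")

client.update_table(
    TableName="MyTable",  # assumed table name
    AttributeDefinitions=[{"AttributeName": "mode", "AttributeType": "S"}],
    GlobalSecondaryIndexUpdates=[{
        "Create": {
            "IndexName": "mode-index",
            "KeySchema": [{"AttributeName": "mode", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 1,
                                      "WriteCapacityUnits": 1},
        }
    }],
)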
I'm trying to store a List as a DynamoDB attribute but I need to be able to retrieve the list order. At the moment the only solution I have come up with is to create a custom hash map by appending a key to the value and converting the complete value to a String and then store that as a list.
eg. key = position1, value = value1, String to be stored in the DB = "position1#value1"
To use the list I then need to filter, organise, substring, and reconvert to the original type. It seems like a long way round, but at the moment it's the only solution I can come up with.
Does anybody have any better solutions or ideas?
The List type in the newly added Document Types should help.
Document Data Types
DynamoDB supports List and Map data types, which can be nested to represent complex data structures.
A List type contains an ordered collection of values.
A Map type contains an unordered collection of name-value pairs.
Lists and maps are ideal for storing JSON documents. The List data type is similar to a JSON array, and the Map data type is similar to a JSON object. There are no restrictions on the data types that can be stored in List or Map elements, and the elements do not have to be of the same type.
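A short sketch with boto3's resource API; the table and attribute names are assumptions. The list comes back in the same order it was written, so no position prefix is needed:

import boto3

table = boto3.resource("dynamodb").Table("MyTable")  # assumed table name

table.put_item(Item={
    "id": "list-demo",
    "values": ["value1", "value2", "value3"],  # stored as a DynamoDB List
})

item = table.get_item(Key={"id": "list-demo"})["Item"]
print(item["values"])  # ['value1', 'value2', 'value3'], order preserved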
I don't believe it is possible to store an ordered list as an attribute, as DynamoDB only supports single-valued and (unordered) set attributes. However, the performance overhead of storing a string of comma-separated values (or some other separator scheme) is probably pretty minimal, given that all the attributes for a row must together be under 64 KB.
(source: http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/DataModel.html)
Add a range attribute to your primary keys.
Composite Primary Key for Range Queries
A composite primary key enables you to specify two attributes in a table that collectively form a unique primary index. All items in the table must have both attributes. One serves as a “hash partition attribute” and the other as a “range attribute.” For example, you might have a “Status Updates” table with a composite primary key composed of “UserID” (hash attribute, used to partition the workload across multiple servers) and a “Time” (range attribute). You could then run a query to fetch either: 1) a particular item uniquely identified by the combination of UserID and Time values; 2) all of the items for a particular hash “bucket” – in this case UserID; or 3) all of the items for a particular UserID within a particular time range. Range queries against “Time” are only supported when the UserID hash bucket is specified.
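A sketch of the three query patterns described above, using boto3; the table name and the ISO-8601 time strings are assumptions:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("StatusUpdates")  # assumed name

# 1) One item, uniquely identified by UserID + Time.
table.get_item(Key={"UserID": "user-1", "Time": "2015-01-01T00:00:00Z"})

# 2) All items in one UserID hash "bucket".
table.query(KeyConditionExpression=Key("UserID").eq("user-1"))

# 3) All items for one UserID within a time range.
table.query(
    KeyConditionExpression=Key("UserID").eq("user-1")
    & Key("Time").between("2015-01-01T00:00:00Z", "2015-01-31T23:59:59Z")
)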