No idea how to create a specific MapReduce in CouchDB

I've got 3 types of documents in my db:
{
  param: "a",
  timestamp: "t"
} (Type 1)
{
  param: "b",
  partof: "a"
} (Type 2)
{
  param: "b",
  timestamp: "x"
} (Type 3)
(I can't alter the layout...;-( )
Type 1 defines a start timestamp; it's like the start event. A Type 1 document is connected to several Type 3 docs by Type 2 documents.
I want to get the latest Type 3 (highest timestamp) and the corresponding Type 1 document.
How may I organize my Map/Reduce?

Easy. For highly relational data, use a relational database.

As user jhs stated before me, your data is relational, and if you can't change it, then you might want to reconsider using CouchDB.
By relational we mean that each "type 1" or "type 3" document in your data "knows" only about itself, and "type 2" documents hold the knowledge about the relation between documents of the other types. With CouchDB, you can only index by fields in the documents themselves, and going one level deeper when querying using includedocs=true. Thus, what you asked for cannot be achieved with a single CouchDB query, because some of the desired data is two levels away from the requested document.
Here is a two-query solution:
{
  "views": {
    "param-by-timestamp": {
      "map": "function(doc) { if (doc.timestamp) emit(doc.timestamp, [doc.timestamp, doc.param]); }",
      "reduce": "function(keys, values) { return values.reduce(function(p, c) { return c[0] > p[0] ? c : p }) }"
    },
    "partof-by-param": {
      "map": "function(doc) { if (doc.partof) emit(doc.param, doc.partof); }"
    }
  }
}
You query it first with param-by-timestamp?reduce=true to get the latest timestamp in value[0] and its corresponding param in value[1], and then query again with partof-by-param?key="<what you got in the previous query>". If you need to fetch the full documents together with the timestamp and param, then you will have to play with include_docs=true and emit the correct _id values.
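For illustration, here is a minimal sketch of the two round trips from Node.js, inside an async function (the database name mydb and the design document name mydesign are assumptions; adjust them to your setup):

const base = 'http://localhost:5984/mydb/_design/mydesign/_view';

// Query 1: the reduce collapses all rows into the latest [timestamp, param] pair.
const r1 = await fetch(base + '/param-by-timestamp?reduce=true');
const [latestTs, latestParam] = (await r1.json()).rows[0].value;

// Query 2: look up the partof value(s) that link that param to its Type 1 document.
const key = encodeURIComponent(JSON.stringify(latestParam));
const r2 = await fetch(base + '/partof-by-param?key=' + key);
const partofParams = (await r2.json()).rows.map(function (row) { return row.value; });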

Related

Extract values from web service JSON response with JSONPath

I have a JSON response from a web service that looks something like this:
[
  {
    "id": 4,
    "sourceID": null,
    "subject": "SomeSubjectOne",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "completed"
  },
  {
    "id": 12,
    "sourceID": null,
    "subject": "SomeSubjectTwo",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "assigned"
  }
]
What I need to do is extract the subjects from all of the entities by using a JSONPath query.
How can I get these results:
Subject from the first item - SomeSubjectOne
Filter on specific subject value from all entities (SomeSubjectTwo for example)
Get Subjects from all entities
Goessner's original JSONPath article is a good reference point, and all implementations more or less stick to the suggested query syntax. However, implementations like Jayway JsonPath/Java, JSONPath-Plus/JavaScript, and flow-jsonpath/PHP may behave a little differently in some areas. That's why it can be important to know which implementation you are actually using.
Subject from the first item
Just use an index to select the desired array element.
$.[0].subject
Returns:
SomeSubjectOne
Specific subject value
First, go for any element with the descendant operator .., then check those that have a subject with a filter expression [?(@.subject)] and use == '...' for the comparison:
$..[?(@.subject == 'SomeSubjectTwo')]
Returns
[
  {
    "id": 12,
    "sourceID": null,
    "subject": "SomeSubjectTwo",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "assigned"
  }
]
Get all subjects
$.[*].subject
or simply
$..subject
Returns
[ "SomeSubjectOne", "SomeSubjectTwo" ]

Can't scan on DynamoDB map nested attributes

I'm new to DynamoDB and I'm trying to query a table from JavaScript using the Dynamoose library. I have a table with a primary partition key of type String called "id", which is basically a long string with a user id. I have a second column in the table called "attributes", which is a DynamoDB map and is used to store arbitrary user attributes (I can't change the schema, as this is how a predefined persistence adapter works, and I'm stuck with it for convenience).
This is an example of a record in the table:
Item{2}
  attributes Map{2}
    10 Number: 2
    11 Number: 4
    12 Number: 6
    13 Number: 8
  id String: YVVVNIL5CB5WXITFTV3JFUBO2IP2C33BY
The numeric fields in the map, such as "12", can be interpreted as "week10", "week11", "week12" and "week13", and the numeric values 2, 4, 6 and 8 are the number of times the application was launched in that week.
What I need to do is get all user ids of the records that have more than 4 launches in a specific week (e.g. week 12), and I also need to get the list of user ids with a sum of 20 launches in a range of four weeks (e.g. from week 10 to 13).
With Dynamoose I have to use the following model:
dynamoose.model(
  DYNAMO_DB_TABLE_NAME,
  { id: String, attributes: Map },
  { useDocumentTypes: true, saveUnknown: true }
);
(to match the table structure generated by the persistence adapter I'm using).
I assume I will need to do a DynamoDB "scan" to achieve this rather than a "query", and I tried the following to get started, fetching the records where week 12 equals 6, to no avail (I get an empty set as a result):
const filter = {
  FilterExpression: 'contains(#attributes, :val)',
  ExpressionAttributeNames: {
    '#attributes': 'attributes',
  },
  ExpressionAttributeValues: {
    ':val': {'12': 6},
  },
};
model.scan(filter).all().exec(function (err, result, lastKey) {
  console.log('query result: ' + JSON.stringify(result));
});
If you don't know Dynamoose but can help with running a DynamoDB scan directly via the AWS SDK, that might also be helpful for me.
Thanks!!
Try the following.
const filter = {
  FilterExpression: '#attributes.#12 = :val',
  ExpressionAttributeNames: {
    '#attributes': 'attributes',
    '#12': '12'
  },
  ExpressionAttributeValues: {
    ':val': 6,
  },
};
It sounds like what you are really trying to do is filter the items where attributes.12 = 6, which is what the expression above does.
contains can't be used here: it only matches substrings of string attributes or elements of sets, not entries nested inside a map.
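For reference, the same filter through the plain AWS SDK for JavaScript would look roughly like this (a sketch using the v2 DocumentClient; the table name is an assumption, and large tables need pagination via LastEvaluatedKey):

const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

docClient.scan({
  TableName: 'YOUR_TABLE_NAME',  // assumption: replace with your table name
  FilterExpression: '#attributes.#week = :val',
  ExpressionAttributeNames: { '#attributes': 'attributes', '#week': '12' },
  ExpressionAttributeValues: { ':val': 6 }
}, function (err, data) {
  if (err) return console.error(err);
  console.log(data.Items.map(function (item) { return item.id; })); // the matching user ids
});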

Is it possible to get customDimensions["--"] dynamic properties in Application Insights? Can we write a loop in Kusto queries?

I need to loop through the customDimensions.
Example: I have a scenario with around 1000 tags (JSON objects) inside customDimensions, like below:
cust_0: array
cust_1: array
With that many tags inside customDimensions, I want to iterate through them and get each JSON object. Is it possible to loop in Kusto queries?
Sample data: I store data like the following in customDimensions, and I have multiple rows like it. In each row I want to combine (merge) array0, array1, array2, which are dynamically generated columns. How do I write the query to merge these records?
{
  "sample1": "data",
  "sample2": "data",
  "sample3": "data",
  "sample4": "daa",
  "sample5": "data",
  "sample6": "data",
  "array0": [
    {
      "1": "0",
      "2": "1",
      "3": "1",
      "4": "1 1",
      "5": "1 1",
      "6": "",
      "7": "",
      "8": "1(1)",
      "9": "1",
      "10": "1"
    }
  ],
  "array1": [
    {
      "1": "0",
      "2": "1",
      "3": "1",
      "4": "1 1",
      "5": "1 1",
      "6": "",
      "7": "",
      "8": "1(1)",
      "9": "1",
      "10": "1"
    }
  ]
}
{
  "sample1": "data",
  "sample2": "data",
  "sample3": "data",
  "sample4": "daa",
  "sample5": "data",
  "sample6": "data",
  "array0": [
    {
      "1": "0",
      "2": "1",
      "3": "1",
      "4": "1 1",
      "5": "1 1",
      "6": "",
      "7": "",
      "8": "1(1)",
      "9": "1",
      "10": "1"
    }
  ],
  "array1": [
    {
      "1": "0",
      "2": "1",
      "3": "1",
      "4": "1 1",
      "5": "1 1",
      "6": "",
      "7": "",
      "8": "1(1)",
      "9": "1",
      "10": "1"
    }
  ]
}
You may be able to achieve that using mv-expand or mv-apply.
If you're not sure how, please provide a sample data set (preferably using the datatable operator) and the expected output for it, with a verbal description of the logic you want to implement.
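For instance, a minimal sketch along those lines, merging two of the dynamically named arrays with array_concat and flattening the result with mv-expand (the customEvents table and the fixed array0/array1 keys are assumptions taken from the sample above; extend the concatenation for further arrays):

customEvents
| extend cd = todynamic(customDimensions)
// array_concat merges the per-row arrays; mv-expand then emits one output row per element
| extend merged = array_concat(todynamic(cd.array0), todynamic(cd.array1))
| mv-expand merged
| project timestamp, merged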

Invalid type for parameter error when using put_item dynamodb

I want to write data from a dataframe to a DynamoDB table
item = {}
for row in datasource_archived_df_join_repartition.rdd.collect():
    item['x'] = row.x
    item['y'] = row.y
    client.put_item(TableName='tryfail', Item=item)
but I'm getting this error:
Invalid type for parameter Item.x, value: 478.2, type: <type 'float'>, valid types: <type 'dict'>
Invalid type for parameter Item.y, value: 696- 18C 12, type: <type 'unicode'>, valid types: <type 'dict'>
Old question, but it still comes up high in a search and hasn't been answered properly, so here we go.
When putting an item into a DynamoDB table with the low-level client, it must be a dictionary in a particular nested form that indicates to the database engine the data type of the value for each attribute. The form looks like the example below. The way to think of this is that an AttributeValue is not a bare variable value but a combination of that value and its type. For example, an AttributeValue for the AlbumTitle attribute below is the dict {'S': 'Somewhat Famous'}, where the 'S' indicates a string type.
response = client.put_item(
    TableName='Music',
    Item={
        'AlbumTitle': {                # <-- attribute
            'S': 'Somewhat Famous',    # <-- attribute value with type string ('S')
        },
        'Artist': {
            'S': 'No One You Know',
        },
        'SongTitle': {
            'S': 'Call Me Today',
        },
        'Year': {
            'N': '2021'                # <-- note that numeric values are supplied as strings
        }
    }
)
In your case (assuming x and y are numbers) you might want something like this:
for row in datasource_archived_df_join_repartition.rdd.collect():
    item = {
        'x': {'N': str(row.x)},
        'y': {'N': str(row.y)}
    }
    client.put_item(TableName='tryfail', Item=item)
Two things to note here: first, each item corresponds to a row, so if you are putting items in a loop you must instantiate a new one with each iteration. Second, regarding the conversion of the numeric x and y into strings, the DynamoDB docs explain that the reason the AttributeValue dict requires this is "to maximize compatibility across languages and libraries. However, DynamoDB treats them as number type attributes for mathematical operations." For fuller documentation on the type system for DynamoDB take a look at this or read the Boto3 doc here since you are using Python.
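As a side note, the higher-level resource interface in Boto3 builds these type descriptors for you from native Python types; a minimal sketch under the same assumptions (note that Boto3 requires Decimal instead of float for numbers):

import boto3
from decimal import Decimal

table = boto3.resource('dynamodb').Table('tryfail')
for row in datasource_archived_df_join_repartition.rdd.collect():
    # The resource API maps Python types to DynamoDB types itself;
    # floats still have to be converted to Decimal.
    table.put_item(Item={'x': Decimal(str(row.x)), 'y': row.y})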
The error message is indicating you are using the wrong type: you need to assign a type-descriptor dictionary when setting item['x'] and item['y'], e.g.
item['x'] = {'N': str(row.x)}
item['y'] = {'S': row.y}

Neo4j - Cypher: mutual object with traversing relationships

I have a small Graph:
CREATE
  (Dic1:Dictioniary { name: 'Dic1' }),
  (Dic2:Dictioniary { name: 'Dic2' }),
  (Dic3:Dictioniary { name: 'Dic3' }),
  (File1:File { name: 'File1' }),
  (File2:File { name: 'File2' }),
  (File3:File { name: 'File3' }),
  (Dic2)-[:contains]->(Dic1),
  (Dic1)-[:contains]->(File1),
  (Dic3)-[:contains]->(File2),
  (File1)-[:references]->(File3),
  (File2)-[:references]->(File3)
I need a Cypher query to find out whether, for example, Dic2 and Dic3 have paths/relationships through which they reference the same File.
In this case it would be true; the mutual File is File3.
Thanks for your help
When you are looking for just two dictionaries you can achieve this in a single statement:
MATCH (d2:Dictioniary { name:'Dic2' }),(d3:Dictioniary { name:'Dic3' })
MATCH (d2)-[:contains|references*]->(f:File)<-[:contains|references*]-(d3)
RETURN f
The two unbounded path matches make it potentially expensive, but in practice it stays relatively cheap because it is bound from the outset by the two dictionary matches.
If you had an arbitrary number of Dictionaries to test you could do something like:
MATCH (d1:Dictioniary { name:'Dic1' }),(d2:Dictioniary { name:'Dic2' }),(d3:Dictioniary { name:'Dic3' })
WITH [d1,d2,d3] AS ds
MATCH (d)-[:contains|references*]->(f:File)
WHERE d IN ds
WITH f, ds, COLLECT(d) AS fds
WHERE length(ds) = length(fds)
RETURN f
This matches the dictionaries that you are interested in first, and for each of them in turn it finds the files that they reference. Importantly, the File object is preserved, and the Dictionary that referenced it is collected into an array (fds). If we know how many dictionaries we had to begin with (length(ds)) and a given file has the same number of related dictionaries (length(fds)), then all dictionaries must reference it.
If there may be multiple paths to a given File from a given Dictionary, you can insert the DISTINCT modifier into the second WITH statement:
WITH f, ds, COLLECT(DISTINCT(d)) AS fds
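For completeness, a sketch of the same idea with the dictionary names passed in as a parameter (assuming a recent Neo4j, where size() replaces length() for collections and $names is supplied by the driver):

MATCH (d:Dictioniary)
WHERE d.name IN $names
WITH collect(d) AS ds
UNWIND ds AS d
MATCH (d)-[:contains|references*]->(f:File)
WITH f, ds, collect(DISTINCT d) AS fds
WHERE size(fds) = size(ds)
RETURN f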
