JSONPath not working properly with Athena - jsonpath

I have a lambda function that converts my logs to this format:
{
"events": [
{
"field1": "value",
"field2": "value",
"field3": "value"
}, (...)
]
}
When I query it on S3, I get in this format:
[
{
"events": [
{ (...) }
]
}
]
And I'm trying to run a custom classifier for it because the data I want is inside the objects kept by 'events' and not events itself.
So I started with the simplest path I could think of that worked in my tests (https://jsonpath.curiousconcept.com/)
$.events[*]
And, sure, it worked in the tests, but when I run a crawler against the file, the table created includes only an events field with a struct inside it.
So I tried a bunch of other paths:
$[*].events
$[*].['events']
$[*].['events'].[*]
$.[*].events[*]
$.events[*].[*]
Some of these do not even make sense, and every single one of them gave me a schema with an events field marked as an array.
Can anyone point me to a better direction to handle this issue?
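For reference, a minimal sketch in plain Python of the rows the classifier is expected to produce, i.e. what a path like $[*].events[*] is meant to select. It uses no JSONPath library and says nothing about how the crawler itself evaluates paths; the field values and the select_events helper are placeholders made up for illustration.

```python
import json

# Shape of the file as it comes back from S3 (values are placeholders).
raw = """
[
  {"events": [
    {"field1": "a", "field2": "b", "field3": "c"},
    {"field1": "d", "field2": "e", "field3": "f"}
  ]}
]
"""


def select_events(document):
    """Return every object held in an 'events' array, i.e. $[*].events[*]."""
    # Accept both the bare object and the array-wrapped form seen on S3.
    wrappers = document if isinstance(document, list) else [document]
    return [event for wrapper in wrappers for event in wrapper.get("events", [])]


rows = select_events(json.loads(raw))
for row in rows:
    print(row)  # one flat record per event, which is the desired table row
```

Note that the outer array changes the root: $.events[*] matches the object the Lambda writes, but not the array-wrapped form shown above.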

Using reference function in an ARM template parameter file

Is there any way to use the reference function in an ARM parameter file? I understand the following can be used to retrieve the instrumentation key of an App Insights instance, but this doesn't seem to work in a parameter file.
"[reference('microsoft.insights/components/web-app-name-01', '2015-05-01').InstrumentationKey]"
I currently set a long list of environment variables using an array from a parameter file and need to include the dynamic app insights instrumentation key to that list of variables.
Unfortunately, no.
The reference function only works at runtime. It can't be used in the parameters or variables sections because both are resolved during the initial parsing phase of the template.
Here is an excerpt from the docs and also how to use reference correctly:
You can't use the reference function in the variables section of the template. The reference function derives its value from the resource's runtime state. However, variables are resolved during the initial parsing of the template. Construct values that need the reference function directly in the resources or outputs section of the template.
Not in a param file... it's possible to simulate what you want by nesting a deployment, if that's an option. So your param file can contain the resourceId of the insights resource and then a nested deployment can make the reference call - but TBH, it's probably easier to fetch the key as a pipeline step (or similar).
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"insightsResourceId": {
"type": "string",
"defaultValue": "microsoft.insights/components/web-app-name-01"
}
},
"resources": [
{
"apiVersion": "2018-02-01",
"type": "Microsoft.Resources/deployments",
"name": "nestedDeployment",
"properties": {
"mode": "Incremental",
"parameters": {
"instrumentationKey": {
"value": "[reference(parameters('insightsResourceId'), '2015-05-01').InstrumentationKey]"
}
},
"template": {
// original template goes here
}
}
}
]
}
Way 1
Use the reference function in your parameter file for resources that are already deployed in another template. For that you have to pass the apiVersion argument. Refer to the Microsoft docs, which give the following:
"value": "[reference(resourceId(variables('<AppInsightsResourceGroup>'),'Microsoft.Insights/components', variables('<ApplicationInsightsName>')), '2015-05-01', 'Full').properties.InstrumentationKey]"
You need to change the property that you are referencing from '.InstrumentationKey' to '.properties.InstrumentationKey'.
Refer to Kwill's answer for more information.
Way 2
Read the content of the parameter file into a PowerShell object using
$ParameterObject = Get-Content ./ParameterFileName.json -Raw | ConvertFrom-Json
Update the parameter file values using
# Assign the parameter value
$ParameterObject.parameters.<KeyName>.value = "your dynamic value"
Pass $ParameterObject to the -TemplateParameterObject parameter.
Refer here
Way 3
Add or modify the parameter file values with a script (PowerShell, or a development language such as Python or C#). After changing the parameter file, deploy it.
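As a rough illustration of Way 3, here is a minimal Python sketch that rewrites one value in a parameter file before deployment. The set_parameter helper, the file name, and the parameter names are hypothetical, and how the instrumentation key is obtained beforehand is left out.

```python
import json
import tempfile
from pathlib import Path


def set_parameter(param_file, name, value):
    """Set parameters.<name>.value in an ARM parameter file and save it."""
    path = Path(param_file)
    doc = json.loads(path.read_text(encoding="utf-8"))
    # Create the entry if it does not exist yet, otherwise overwrite it.
    doc.setdefault("parameters", {})[name] = {"value": value}
    path.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return doc


# Demo against a throwaway file; all names below are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "ParameterFileName.json"
    demo_file.write_text(json.dumps({"parameters": {"appSettings": {"value": []}}}))
    updated = set_parameter(demo_file, "instrumentationKey", "<key fetched earlier>")
    print(json.dumps(updated, indent=2))
```

The deployment step then uses the rewritten file exactly as it would use a hand-edited one.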

AWS Step Functions: Filter an array using JsonPath

I need to filter an array in my AWS Step Functions state. This seems like something I should easily be able to achieve with JsonPath but I am struggling for some reason.
The state I want to process looks like this:
{
"items": [
{
"id": "A"
},
{
"id": "B"
},
{
"id": "C"
}
]
}
I want to filter this array by removing entries for which id is not in a specified whitelist.
To do this, I define a Pass state in the following way:
"ApplyFilter": {
"Type": "Pass",
"ResultPath": "$.items",
"InputPath": "$.items.[?(@.id in ['A'])]",
"Next": "MapDeployments"
}
This makes use of the JsonPath in operator.
Unfortunately when I execute the state machine I receive an error:
{
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'ApplyFilter' (entered at the event id #8). Invalid path '$.items.[?(@.id in ['A'])]' : com.jayway.jsonpath.InvalidPathException: com.jayway.jsonpath.InvalidPathException: Space not allowed in path"
}
However, I don't understand what is incorrect with the syntax. When I test here everything works correctly.
What is wrong with what I have done? Is there another way of achieving this sort of filter using JsonPath?
According to the official AWS docs for Step Functions,
The following operators are not supported in paths: @ .. , : ? *
https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-paths.html
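Since filter expressions are rejected in InputPath, one possible workaround is to do the filtering in code, for example in a Lambda function invoked from a Task state in place of the Pass state. This is an assumption on my part rather than something the docs quoted above prescribe. The sketch below shows only the filtering logic; the handler name and the ALLOWED_IDS whitelist are hypothetical.

```python
# Hypothetical whitelist; in practice it could also be passed in the state input.
ALLOWED_IDS = {"A"}


def handler(event, context=None):
    """Return the state with 'items' reduced to entries whose id is whitelisted."""
    kept = [item for item in event.get("items", []) if item.get("id") in ALLOWED_IDS]
    # Copy the incoming state and replace only the 'items' array.
    return {**event, "items": kept}


state = {"items": [{"id": "A"}, {"id": "B"}, {"id": "C"}]}
print(handler(state))  # {'items': [{'id': 'A'}]}
```

Returning the whole state with only items replaced keeps any other keys available to the next state (MapDeployments in the question).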

With Meteor, how can I update a document based on MongoDB's ObjectID()?

I created a child array of objects in a document. Each of these array objects has:
children: [
{
_id: ObjectID("lkajsdflkajdsf"),
title: "Something"
}, ...
]
I'm getting ObjectId undefined error when trying to update a document:
Category.update(
{ "_id": "C2Rcjivw96htJSHRq", "children._id": ObjectId("1c46382a25d3888165dd338a") },
{ "$set": { "children.$.title": "Hello World" }}
);
As you can see, I'm attempting to update a specific array object by its associated _id. This does not work. I was reading this thread: Meteor collection update with traditional id
But it's a little outdated, and I'm also getting errors when trying to use it.
Is there a solid method for handling things in this fashion? I can do this in Mongo shell with no problem, but not through Meteor methods.
Thanks!

"Reverse formatting" Riak search results

Let's say I have an object in the test bucket in my Riak installation with the following structure:
{
"animals": {
"dog": "woof",
"cat": "miaow",
"cow": "moo"
}
}
When performing a search request for this object, the structure of the search results is as follows:
{
"responseHeader": {
"status": 0,
"QTime": 3,
"params": {
"q": "animals_cow:moo",
"q.op": "or",
"filter":"",
"wt": "json"
}
},
"response": {
"numFound": 1,
"start": 0,
"maxScore": "0.353553",
"docs": [
{
"id": "test",
"index": "test",
"fields": {
"animals_cat": "miaow",
"animals_cow": "moo",
"animals_dog": "woof"
},
"props": {}
}
]
}
}
As you can see, the way the object is stored, the cat, cow and dog keys are nested within animals. However, when the search results come back, none of the keys are nested, and are simply separated by _.
My question is this: Is there any way provided by Riak to "reverse format" the search, and return the fields of the object in the correct (nested) format? This becomes a problem when storing and returning user data that might possibly contain _.
I do see that the latest version of Riak (beta release) provides a search schema, but I can't seem to see whether my question would be answered by this.
What you receive back in the search result is what the object looked like after passing through the json analyzer. If you need the data formatted differently, you can use a custom analyzer. However, this will only affect newly put data.
For existing data, you can use the id field and issue a get request for the original object, or use the solr query as input to a MapReduce job.
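If re-nesting on the client side is acceptable, the flattened fields can be rebuilt by splitting on the separator. This is only a sketch of the idea, not a Riak feature; the nest helper is hypothetical, and it assumes no key is both a value and a prefix of another key.

```python
def nest(flat, sep="_"):
    """Rebuild a nested dict from keys such as 'animals_cow'."""
    nested = {}
    for key, value in flat.items():
        parts = key.split(sep)
        node = nested
        # Walk (or create) one dict level per key segment except the last.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


fields = {"animals_cat": "miaow", "animals_cow": "moo", "animals_dog": "woof"}
print(nest(fields))
# {'animals': {'cat': 'miaow', 'cow': 'moo', 'dog': 'woof'}}
```

This also shows the ambiguity raised in the question: a stored key that itself contains _ (say user_name) comes back split into extra levels, so fetching the original object by id remains the only lossless option.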

Google Cloud Datastore runQuery returning 412 "no matching index found"

** UPDATE **
Thanks to Alfred Fuller for pointing out that I need to create a manual index for this query.
Unfortunately, using the JSON API, from a .NET application, there does not appear to be an officially supported way of doing so. In fact, there does not officially appear to be a way to do this at all from an app outside of App Engine, which is strange since the Cloud Datastore API was designed to allow access to the Datastore outside of App Engine.
The closest hack I could find was to POST the index definition using RPC to http://appengine.google.com/api/datastore/index/add. Can someone give me the raw spec for how to do this exactly (i.e. URL parameters, what exactly should the body look like, etc), perhaps using Fiddler to inspect the call made by appcfg.cmd?
** ORIGINAL QUESTION **
According to the docs, "a query can combine equality (EQUAL) filters for different properties, along with one or more inequality filters on a single property".
However, this query fails:
{
"query": {
"kinds": [
{
"name": "CodeProse.Pogo.Tests.TestPerson"
}
],
"filter": {
"compositeFilter": {
"operator": "and",
"filters": [
{
"propertyFilter": {
"operator": "equal",
"property": {
"name": "DepartmentCode"
},
"value": {
"integerValue": "123"
}
}
},
{
"propertyFilter": {
"operator": "greaterThan",
"property": {
"name": "HourlyRate"
},
"value": {
"doubleValue": 50
}
}
},
{
"propertyFilter": {
"operator": "lessThan",
"property": {
"name": "HourlyRate"
},
"value": {
"doubleValue": 100
}
}
}
]
}
}
}
}
with the following response:
{
"error": {
"errors": [
{
"domain": "global",
"reason": "FAILED_PRECONDITION",
"message": "no matching index found.",
"locationType": "header",
"location": "If-Match"
}
],
"code": 412,
"message": "no matching index found."
}
}
The JSON API does not yet support local index generation, but we've documented a process that you can follow to generate the xml definition of the index at https://developers.google.com/datastore/docs/tools/indexconfig#Datastore_Manual_index_configuration
Please give this a shot and let us know if it doesn't work.
This is a temporary solution that we hope to replace with automatic local index generation as soon as we can.
The error "no matching index found." indicates that an index needs to be added for the query to work. See the auto index generation documentation.
In this case you need an index with the properties DepartmentCode and HourlyRate (in that order).
For gcloud-node I fixed it with those 3 links:
https://github.com/GoogleCloudPlatform/gcloud-node/issues/369
https://github.com/GoogleCloudPlatform/gcloud-node/blob/master/system-test/data/index.yaml
and most important link:
https://cloud.google.com/appengine/docs/python/config/indexconfig#Python_About_index_yaml to write your index.yaml file
As explained in the last link, an index is what allows complex queries to run faster by storing the result set of the queries in an index. When you get "no matching index found", it means that you tried to run a complex query involving an order or a filter for which no index exists. To make your query work, you need to create the index in Cloud Datastore by manually writing a config file that defines the indexes for the query you are trying to run. Here is how to fix it:
1. Create an index.yaml file in a folder (named indexes, for example) in your app directory by following the directives for the Python config file: https://cloud.google.com/appengine/docs/python/config/indexconfig#Python_About_index_yaml or get inspiration from the gcloud-node tests in https://github.com/GoogleCloudPlatform/gcloud-node/blob/master/system-test/data/index.yaml
2. Create the indexes from the config file with this command:
gcloud preview datastore create-indexes indexes/index.yaml
See https://cloud.google.com/sdk/gcloud/reference/preview/datastore/create-indexes
3. Wait for the indexes to start serving. In your developer console, under Cloud Datastore/Indexes, the interface should display "serving" once the index is built.
4. Once it is serving, your query should work.
For example for this query:
var q = ds.createQuery('project')
.filter('tags =', category)
.order('-date');
index.yaml looks like:
indexes:
- kind: project
  ancestor: no
  properties:
  - name: tags
  - name: date
    direction: desc
Try not to order the result. After removing orderby(), it worked for me.
