Convert String to Int on AWS DocumentDB - Metabase

I am currently trying to write a Metabase question against AWS DocumentDB, and I am running into an issue where I need to convert a string to an integer. Unfortunately, AWS DocumentDB does not appear to support $toInt, and I am not sure how to get around it. Here is the query:
[
  {"$match": {
    "metaData.fileSize": {"$exists": true}
  }},
  {"$project": {
    "file_size": "$metaData.fileSize",
    "timestamp": 1,
    "past7Days": {
      "$subtract": [ISODate(), 604800000]
    }
  }},
  {"$project": {
    "file_size": 1,
    "timestamp": 1,
    "dayofweek": {"$dayOfWeek": "$timestamp"},
    "past7DaysComp": {
      "$subtract": ["$timestamp", "$past7Days"]
    }
  }},
  {"$group": {
    "_id": {"dayofweek": "$dayofweek"},
    "size": {"$avg": "$file_size"}
  }}
]
The $group returns nothing for size because file_size is a string, not a numeric type. Any ideas how to convert file_size to an integer, double, or float?
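One workaround, since DocumentDB lacks $toInt/$convert: convert the field once in the data itself, so the aggregation can average real numbers. A minimal one-off migration sketch for the mongo shell ("files" is an assumed collection name; test on a copy first):

// One-off migration: parse the string field into a real number so $avg works.
// "files" is an assumed collection name - adjust to your data.
db.files.find({ "metaData.fileSize": { "$type": "string" } }).forEach(function (doc) {
    db.files.updateOne(
        { "_id": doc._id },
        { "$set": { "metaData.fileSize": parseInt(doc.metaData.fileSize, 10) } }
    );
});

After the migration, the original pipeline should work unchanged, since $avg then operates on numbers.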

Related

Elasticsearch query using Kibana does not work using Java Rest Client API

Can someone help determine why a Kibana query does not return hits when it is executed through the Elasticsearch Java REST Client API?
I am currently using:
Elasticsearch/Kibana: 7.16.2
Elasticsearch Java Client: 6.6.2
I am reluctant to upgrade the Java client because of the numerous geometry-related updates that would be needed.
Fields:
mydatefield: timestamp of the doc
category: keyword field
We have 1000 or more records per category per day. I want an aggregation that buckets categories by day and includes the first and last "mydatefield" for each category.
This query works in Kibana:
GET /mycategories/_search
{
  "size": 0,
  "aggregations": {
    "bucket_by_date": {
      "date_histogram": {
        "field": "mydatefield",
        "format": "yyyy-MM-dd",
        "interval": "1d",
        "offset": 0,
        "order": { "_key": "asc" },
        "keyed": false,
        "min_doc_count": 1
      },
      "aggregations": {
        "unique_categories": {
          "terms": {
            "field": "category",
            "size": 10,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": [
              { "_count": "desc" },
              { "_key": "asc" }
            ]
          },
          "aggregations": {
            "min_mydatefield": {
              "min": { "field": "mydatefield" }
            },
            "max_mydatefield": {
              "max": { "field": "mydatefield" }
            }
          }
        }
      }
    }
  }
}
The first bucket of the result shows category1 and category2 for 2022-05-07, each with its min and max "mydatefield":
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2593,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "bucket_by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2022-05-07",
          "key" : 1651881600000,
          "doc_count" : 2,
          "unique_categories" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "category1",
                "doc_count" : 1,
                "min_mydatefield" : {
                  "value" : 1.651967952E12,
                  "value_as_string" : "2022-05-07T13:22:17.000Z"
                },
                "max_mydatefield" : {
                  "value" : 1.651967952E12,
                  "value_as_string" : "2022-05-07T23:59:12.000Z"
                }
              },
              {
                "key" : "category2",
                "doc_count" : 1,
                "min_mydatefield" : {
                  "value" : 1.651967947E12,
                  "value_as_string" : "2022-05-07T03:47:23.000Z"
                },
                "max_mydatefield" : {
                  "value" : 1.651967947E12,
                  "value_as_string" : "2022-05-07T23:59:07.000Z"
                }
              }
            ]
          }
        },
I have successfully coded other, less complex aggregations without problems. However, I have not been able to get this one working with either an AggregationBuilder or a WrapperQuery; zero results are returned:
{"took":0,"timed_out":false,"_shards":{"total":0,"successful":0,"skipped":0,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}
Before executing the query, I copy the SearchRequest.source() into Kibana, where it runs and returns the desired information.
Below is the AggregationBuilder code that seems to replicate my Kibana query but returns no results.
AggregationBuilder aggregation =
    AggregationBuilders
        .dateHistogram("bucket_by_date")
        .format("yyyy-MM-dd")
        .minDocCount(1)
        .dateHistogramInterval(DateHistogramInterval.DAY)
        .field("mydatefield")
        .subAggregation(
            AggregationBuilders
                .terms("unique_categories")
                .field("category")
                .subAggregation(
                    AggregationBuilders
                        .min("min_mydatefield")
                        .field("mydatefield"))
                .subAggregation(
                    AggregationBuilders
                        .max("max_mydatefield")
                        .field("mydatefield")));

CosmosDB $elemMatch syntax error

I am getting a strange syntax error for some commands in the MongoDB API for CosmosDB. Say I have a collection called "Collection" with two documents:
{
  "_id" : 1,
  "arr" : [
    { "_id" : 11 },
    { "_id" : 12 }
  ]
}
{
  "_id" : 2,
  "arr" : [
    { "_id" : 21 },
    { "_id" : 22 }
  ]
}
If I try to run the query
db.getCollection('Collection').find( { _id : 2 }, { arr : { $elemMatch : { _id : 21 } } })
I get the result
{
  "_t" : "OKMongoResponse",
  "ok" : 0,
  "code" : 9,
  "errmsg" : "Syntax error, incorrect syntax near '10'.",
  "$err" : "Syntax error, incorrect syntax near '10'."
}
But the command works perfectly fine on my locally hosted instance of MongoDB, returning the expected result:
{
  "_id" : 2,
  "arr" : [
    { "_id" : 21 }
  ]
}
Anyway, this is certainly not a syntax error, but there is no helpful error message either. If this is not yet supported by CosmosDB, is there any way to get only certain embedded documents stored in an array?
If I try to use an aggregation pipeline to just extract the document in the array (I realize this should give a different result than the command above, but it would also work for my purposes), like so:
db.getCollection('Collection').aggregate([{ "$unwind" : "$arr" }, { "$match" : { "arr._id" : 21 } }] )
I get the result
{
  "_t" : "OKMongoResponse",
  "ok" : 0,
  "code" : 118,
  "errmsg" : "$match is currently only supported when it is the first and only stage of the aggregation pipeline. Please restructure your query to combine multiple $match stages into a single $match stage.",
  "$err" : "$match is currently only supported when it is the first and only stage of the aggregation pipeline. Please restructure your query to combine multiple $match stages into a single $match stage."
}
So that doesn't work for me either.
Try this: a $filter inside a $project stage keeps only the matching array elements, which sidesteps both the unsupported $elemMatch projection and the single-$match pipeline restriction.
db.collection.aggregate([
  {
    $match: { "_id": 2 }
  },
  {
    $project: {
      arr: {
        $filter: {
          input: "$arr",
          as: "ar",
          cond: { $eq: [ "$$ar._id", 21 ] }
        }
      }
    }
  }
])
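Run against the two sample documents above, this should return the same result the local MongoDB $elemMatch projection produced:
{ "_id" : 2, "arr" : [ { "_id" : 21 } ] }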

Insert date as epoch_seconds, output as formatted date

I have a set of timestamps formatted as seconds since the epoch. I'd like to index them into Elasticsearch as epoch_second, but when querying I would like to see the output as a pretty date, e.g. strict_date_optional_time.
My mapping below preserves whatever format the input arrived in - is there any way to normalize the output to a single format via the mapping API?
Current Mapping:
PUT example
{
  "mappings": {
    "time": {
      "properties": {
        "time_stamp": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_second"
        }
      }
    }
  }
}
Example docs
POST example/time
{
  "time_stamp": "2018-03-18T00:00:00.000Z"
}
POST example/time
{
  "time_stamp": "1521389162" // Would like this to output as: 2018-03-18T16:06:02.000Z
}
GET example/_search output:
{
  "total": 2,
  "max_score": 1,
  "hits": [
    {
      "_source": {
        "time_stamp": "1521389162" // Stayed as epoch_second
      }
    },
    {
      "_source": {
        "time_stamp": "2018-03-18T00:00:00.000Z"
      }
    }
  ]
}
Elasticsearch differentiates between the _source and the so-called stored fields; the first is supposed to represent your input verbatim.
If you actually use stored fields (by specifying store=true in your mapping) and specify multiple date formats, this is easy (emphasis mine):
Multiple formats can be specified by separating them with || as a separator. Each format will be tried in turn until a matching format is found. The first format will be used to convert the milliseconds-since-the-epoch value back into a string.
I have tested this with Elasticsearch 5.6.4 and it works fine:
PUT /test -d '{
  "mappings": {
    "doc": {
      "properties": {
        "post_date": {
          "type": "date",
          "format": "basic_date_time||epoch_millis",
          "store": true
        }
      }
    }
  }
}'
PUT /test/doc/2 -d '{
  "user" : "test1",
  "post_date" : "20150101T121030.000+01:00"
}'
PUT /test/doc/1 -d '{
  "user" : "test2",
  "post_date" : 1525167490500
}'
Note how the two different input formats produce the same output format when queried with GET /test/_search?stored_fields=post_date&pretty=1:
{
  "hits" : [
    {
      "_index" : "test",
      "_type" : "doc",
      "_id" : "2",
      "_score" : 1.0,
      "fields" : {
        "post_date" : [ "20150101T111030.000Z" ]
      }
    },
    {
      "_index" : "test",
      "_type" : "doc",
      "_id" : "1",
      "_score" : 1.0,
      "fields" : {
        "post_date" : [ "20180501T093810.500Z" ]
      }
    }
  ]
}
If you want to change the input (in _source), you're not so lucky: the mapping-transform feature has been removed:
This was deprecated in 2.0.0 because it made debugging very difficult. As of now there really isn’t a feature to use in its place other than transforming the document in the client application.
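For example, a minimal client-side normalization before indexing might look like this (a sketch, assuming Node.js and the epoch-second field from the question):

// Convert epoch-second strings to ISO 8601 before sending the doc to Elasticsearch,
// so _source only ever contains one format.
function normalizeTimestamp(doc) {
    if (/^\d+$/.test(doc.time_stamp)) {
        doc.time_stamp = new Date(Number(doc.time_stamp) * 1000).toISOString();
    }
    return doc;
}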
If, instead of changing the stored data, you are interested in formatting the output, have a look at this answer to Format date in elasticsearch query (during retrieval).

API Gateway and DynamoDB PutItem for String Set

I can't seem to find how to correctly call PutItem for a String Set in DynamoDB through API Gateway. If I call it the way I would for a List of Maps, then I get objects returned. Example data is below.
{
  "eventId": "Lorem",
  "eventName": "Lorem",
  "companies": [
    {
      "companyId": "Lorem",
      "companyName": "Lorem"
    }
  ],
  "eventTags": [
    "Lorem",
    "Lorem"
  ]
}
And my example template call for companies:
"companies" : {
"L": [
#foreach($elem in $inputRoot.companies) {
"M": {
"companyId": {
"S": "$elem.companyId"
},
"companyName": {
"S": "$elem.companyName"
}
}
} #if($foreach.hasNext),#end
#end
]
}
I've tried calling it with the String Set type, but it still errors out, telling me "Start of structure or map found where not expected" or that serialization failed.
"eventTags" : {
"SS": [
#foreach($elem in $inputRoot.eventTags) {
"S":"$elem"
} #if($foreach.hasNext),#end
#end
]
}
What is the proper way to call PutItem for converting an array of strings to a String Set?
If you are using the JavaScript AWS SDK, you can use the DocumentClient API (docClient.createSet) to store the SET data type.
docClient.createSet converts an array into a SET data type:
var docClient = new AWS.DynamoDB.DocumentClient();
var params = {
    TableName: table,
    Item: {
        "yearkey": year,
        "title": title,
        "product": docClient.createSet(['milk', 'veg'])
    }
};
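The item can then be written with docClient.put. Note that on the wire a string set serializes as a plain array of strings ({"SS": ["milk","veg"]}), not as objects wrapped in "S" keys - which is also why the mapping template above fails. A minimal sketch using the params above:

// Write the item; the set created by createSet is sent as {"SS": [...]}.
docClient.put(params, function (err, data) {
    if (err) {
        console.error("Unable to add item:", JSON.stringify(err, null, 2));
    } else {
        console.log("Added item:", JSON.stringify(data, null, 2));
    }
});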

nativescript firebase plugin, query by field

I am using nativescript-plugin-firebase to query the Firebase database in my Angular 2 NativeScript application. I went through the documentation on how to query the database by field, but I could not find a way to, for example, fetch the address of a user based on uid in the example database below. Any help will be appreciated.
{
  "address" : {
    "-KfBtEuTA43UzSFfK7kU" : {
      "house_number" : "hno1",
      "street" : "street1",
      "city" : "city1",
      "uid" : "0P3Km5i9cEd1Akg7gJfJnALUSZw2"
    },
    "-KfC4Myo69bTZQCzw1yz" : {
      "house_number" : "hno2",
      "street" : "street2",
      "city" : "city2",
      "uid" : "4sj3ADekxsVNf5RaAFjbLbF6x0K2"
    }
  }
}
The following code gave me the query result by uid.
firebase.query(result => {
    console.log("query result:", JSON.stringify(result));
}, "/address", {
    orderBy: {
        type: firebase.QueryOrderByType.CHILD,
        value: 'uid'
    },
    ranges: [
        {
            type: firebase.QueryRangeType.START_AT,
            value: uidValue
        },
        {
            type: firebase.QueryRangeType.END_AT,
            value: uidValue
        }
    ]
});
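If only the matched address objects are needed, they can be pulled out of the callback payload. A sketch, assuming the plugin delivers the matched children keyed by push id under result.value (the shape shown in the plugin's docs):

firebase.query(result => {
    // Assumed shape: result.value = { "-KfBtEuTA43UzSFfK7kU": { house_number: "hno1", ... }, ... }
    const matches = Object.keys(result.value || {}).map(key => result.value[key]);
    console.log("matched addresses:", JSON.stringify(matches));
}, "/address", {
    // same orderBy/ranges options as above
    orderBy: { type: firebase.QueryOrderByType.CHILD, value: 'uid' },
    ranges: [
        { type: firebase.QueryRangeType.START_AT, value: uidValue },
        { type: firebase.QueryRangeType.END_AT, value: uidValue }
    ]
});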
