Making timestamp field available in Kibana via Elasticsearch with Kafka

How do I allow Kibana to use my timestamp field as @timestamp when adding data via the Kafka Elasticsearch connector?
I am defining my avro schema like so
public static String userSchema = "{\"type\":\"record\"," +
"\"name\":\"myrecord\"," +
"\"fields\":[" +
"{\"name\":\"wSrcTime\",\"type\":[\"string\", \"null\"],\"default\":\"null\"}," +
"{\"name\":\"wTradePrice\",\"type\":[\"null\",\"float\"],\"default\":null}," +
"{\"name\":\"timestamp\",\"type\":{\"type\":\"long\",\"logicalType\":\"timestamp-millis\"}}" +
"]}";
and use this to populate the field
avroRecord.put("timestamp", System.currentTimeMillis());
I see the data in kafka-avro-console-consumer as follows:
{"wSrcTime":{"string":"2019-08-01 15:20:40.127"},"wTradePrice":null,"timestamp":1564672840137}
{"wSrcTime":{"string":"2019-08-01 15:20:41.062"},"wTradePrice":null,"timestamp":1564672841072}
{"wSrcTime":{"string":"2019-08-01 15:20:41.062"},"wTradePrice":null,"timestamp":1564672841073}
{"wSrcTime":{"string":"2019-08-01 15:20:41.064"},"wTradePrice":null,"timestamp":1564672841075}
{"wSrcTime":{"string":"2019-08-01 15:20:41.065"},"wTradePrice":null,"timestamp":1564672841076}
{"wSrcTime":{"string":"2019-08-01 15:20:41.410"},"wTradePrice":null,"timestamp":1564672841420}
And I see the data added to the Kibana index as
timestamp: number
wTradePrice: number
wSrcTime: string
Is there a recommended way of turning timestamp into @timestamp so I can use it on an axis?
Thank you

You can create an ingest pipeline in Elasticsearch to rename the timestamp field to @timestamp before the documents are indexed.
Create the pipeline in Elasticsearch:
PUT _ingest/pipeline/rename_timestamp
{
  "processors": [
    {
      "rename": {
        "field": "timestamp",
        "target_field": "@timestamp"
      }
    }
  ]
}
Then provide the pipeline name when indexing new documents, for example:
POST /es-index/_doc?pipeline=rename_timestamp
{
  "timestamp": "value"
  ...
}
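Since your documents arrive through the Kafka Connect Elasticsearch sink rather than manual requests, you cannot add a ?pipeline= parameter per request. One option (assuming Elasticsearch 6.5+, where the index.default_pipeline setting exists) is to attach the pipeline to the index as its default, so every document the connector writes is routed through it. A minimal sketch using the Java high-level REST client, with the index name es-index as a placeholder:

import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.settings.put.UpdateSettingsRequest;
import org.elasticsearch.action.ingest.PutPipelineRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.XContentType;

public class RenameTimestampSetup {
    public static void main(String[] args) throws IOException {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        // Same pipeline body as the PUT _ingest/pipeline request above.
        String pipelineBody =
                "{\"processors\":[{\"rename\":{\"field\":\"timestamp\",\"target_field\":\"@timestamp\"}}]}";
        client.ingest().putPipeline(
                new PutPipelineRequest("rename_timestamp", new BytesArray(pipelineBody), XContentType.JSON),
                RequestOptions.DEFAULT);

        // Make the pipeline the index default so documents written by the
        // Kafka connector run through it automatically.
        UpdateSettingsRequest defaultPipeline = new UpdateSettingsRequest("es-index")
                .settings(Settings.builder().put("index.default_pipeline", "rename_timestamp"));
        client.indices().putSettings(defaultPipeline, RequestOptions.DEFAULT);

        client.close();
    }
}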

Related

Firestore - Can you query fields in nested documents?

I currently have a data structure like this in Firebase Cloud Firestore
Database
  + ProductInventories (collection)
    + productId1 (document)
      + variantName (collection)
        + [auto-ID] (document)
          + location: "Paris"
          + count: 1334
How would I make a structuredQuery in POST to get the count for location 'Paris'?
Intuitively it might have been a POST to https://firestore.googleapis.com/v1/projects/projectName/databases/(default)/documents/ProductInventories/productId1:runQuery with the following JSON
{
  "structuredQuery": {
    "from": [
      {
        "collectionId": "variantName",
        "allDescendants": true
      }
    ],
    "where": {
      "fieldFilter": {
        "field": {
          "fieldPath": "location"
        },
        "op": "EQUAL",
        "value": {
          "stringValue": "Paris"
        }
      }
    }
  }
}
With this I get the error collection group queries are only allowed at the root parent, which means I need to make the POST to https://firestore.googleapis.com/v1/projects/projectName/databases/(default)/documents:runQuery instead. This however means I'll need to create a collection group index exemption for each variant (variantName) I have for each productId.
It seems I would be better off making location the document name below the variantName collection level, so I could access the count directly without making a query. But it seems to me the point of NoSQL was that I could be less careful about how I structure the data, so I'm wondering if there's a way to make the query as-is with the current data structure.
Using collection names that are not known ahead of time is usually an anti-pattern in Firestore, and what you ran into is one of the reasons: to create a collection group query across documents in multiple collections, you need to be able to define an index on the collection name, and that requires knowing those names at some point during development.
As usual with NoSQL databases, you can modify/augment your data structure to allow the use-case. For example, if you create a single subcollection for all variants, you can give that collection a fixed name and search for Paris and $variantName in there. This collection can either be a replacement for your current $variantName collections, or an addition to them.
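For illustration only, here is a sketch of that restructured query using the Firestore Admin SDK for Java, assuming the variants now live in a subcollection with the fixed name variants and carry the variant name as a field (both names are placeholders; the same shape applies to the REST structuredQuery in the question):

import com.google.api.core.ApiFuture;
import com.google.cloud.firestore.Firestore;
import com.google.cloud.firestore.FirestoreOptions;
import com.google.cloud.firestore.QueryDocumentSnapshot;
import com.google.cloud.firestore.QuerySnapshot;

public class VariantCountQuery {
    public static void main(String[] args) throws Exception {
        Firestore db = FirestoreOptions.getDefaultInstance().getService();

        // One collection group query over the fixed-name "variants" subcollections:
        // a single collection group index covers every product, instead of one
        // exemption per dynamically named variant collection. Combining two
        // equality filters here may require a composite index, which Firestore
        // will prompt you to create.
        ApiFuture<QuerySnapshot> future = db.collectionGroup("variants")
                .whereEqualTo("variantName", "variantName1") // placeholder variant
                .whereEqualTo("location", "Paris")
                .get();

        for (QueryDocumentSnapshot doc : future.get().getDocuments()) {
            System.out.println("count: " + doc.getLong("count"));
        }
    }
}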
Have you tried something like this?
fb.firestore().collection('ProductInventories')
  .doc('productId1')
  .collection('variantName')
  .where('location', '==', 'Paris')
  .get()
  .then(res => {
    res.docs.forEach((product, i) => {
      console.log('item ' + i + ': ' + product.data().count);
    });
  });

How do I suppress CosmosDB "default" info in resultsets?

I want to suppress the Cosmos DB information in the following resultset; how can that be done?
{
"id": null,
"_rid": null,
"_self": null,
"_ts": 0,
"_etag": null,
"topLevelCategory": "Shorts,Skirt"
},
This is an extract, of course, but I don't want to show the ID etc. as they serve no purpose in this result, and I cannot figure out how to suppress that info.
I expect the following
{
"topLevelCategory": "Shorts,Skirt"
},
Query looks as follows
$"SELECT DISTINCT locales.categories[0] AS topLevelCategory " +
$"FROM c JOIN locales in c.locales " +
$"WHERE locales.country = '{apiInputObject.Locale}' " +
$"AND locales.language = '{apiInputObject.Language}'";
The interesting thing is that if I cast the result as a JObject I don't get the system data; I only get it if I call CreateDocumentQuery as Document. So a workaround would be as follows:
IQueryable<JObject> queryResultSet = client.CreateDocumentQuery<JObject>(UriFactory.CreateDocumentCollectionUri(databaseName, databaseCollection), parsedQueryObject.SqlStatement, queryOptions);
but that has other async issues. The above does not show the system-generated IDs, but the one below does:
var query = client.CreateDocumentQuery<Document>(UriFactory.CreateDocumentCollectionUri(databaseName, databaseCollection), parsedQueryObject.SqlStatement, queryOptions).AsDocumentQuery();
var result = await query.ExecuteNextAsync<Document>();
These are system-generated properties of items in Cosmos DB.
You can simply filter them out in the SQL, e.g. select c.topLevelCategory from c: just don't mention them, and don't use select * from c. Filtering in the SQL is the best method, better than post-processing the result set.
Update Answer:
Your situation is that, executing the exact same query, the JObject does not show the system data but the Document does.
My explanation is as follows:
The Document class is a built-in base class of the DocumentDB .NET package, and it carries those generated system properties.
The SDK will try to map the result data, field by field, onto the entity class you specify in CreateDocumentQuery<T>.
So actually, you have already found the solution: define your own POJO to receive the result data and include only the properties you want, for example:
class Pojo
{
    public string topLevelCategory { get; set; }
}
That keeps the business-relevant fields and drops the redundant system ones. Hope I'm clear on this.

How to map a Firestore date object to a date in elasticsearch

I am using a cloud function to send a Firebase Firestore document to Elasticsearch for indexing. I am trying to find a way to map a Firestore timestamp field to an Elasticsearch date field in the index.
The Elasticsearch date type mapping supports the epoch_millis and epoch_second formats, but the Firestore date type is an object as follows:
"timestamp": {
"_seconds": 1551833330,
"_nanoseconds": 300000000
},
I could use the seconds field but would lose the fractional part of the second.
Is there a way to map the timestamp object to a date field in the index that calculates the epoch_millis from the _seconds and _nanoseconds fields? I recognize that precision will be lost (nanos to millis).
If you don't mind losing the fractional part of the second, you could set a mapping on your index like this, which is what I ended up doing:
"mappings": {
"date_detection": false,
"dynamic_templates": [
{
"dates": {
"match": ".*_seconds",
"match_pattern": "regex",
"mapping": {
"type": "date",
"format": "epoch_second"
}
}
}
]
}
It will convert any timestamps (even nested in the document) to dates with second precision.
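If the fractional second does matter, the conversion the question asks about is just arithmetic: epoch millis = _seconds * 1000 + _nanoseconds / 1,000,000, and the resulting field can be mapped with format epoch_millis. A small illustration in Java (the asker's cloud function would do the same computation in JavaScript before indexing):

public class FirestoreTimestampToEpochMillis {
    public static void main(String[] args) {
        // Values taken from the Firestore timestamp object in the question.
        long seconds = 1551833330L;       // "_seconds"
        long nanoseconds = 300_000_000L;  // "_nanoseconds"

        // Nanoseconds -> milliseconds; only sub-millisecond precision is lost.
        long epochMillis = seconds * 1_000L + nanoseconds / 1_000_000L;

        System.out.println(epochMillis);  // 1551833330300
    }
}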

Make book.randomID key in Amazon DynamoDB table

For some reason I want to use book.randomID as a key in an Amazon DynamoDB table using Java code. When I tried, it added a new field in the item named "book.randomID".
List<KeySchemaElement> keySchema = new ArrayList<KeySchemaElement>();
keySchema.add(new KeySchemaElement().withAttributeName("conceptDetailInfo.conceptId").withKeyType(KeyType.HASH)); // Partition
and here is the JSON structure:
{
  "_id": "123",
  "book": {
    "chapters": {
      "chapterList": [
        {
          "_id": "11310674",
          "preferred": true,
          "name": "1993"
        }
      ],
      "count": 1
    },
    "randomID": "1234"
  }
}
So is it possible to use such an element as a key? If yes, how can we use it as a key?
When creating DynamoDB tables, AWS limits key attributes to the types String, Binary and Number. Your attribute book.randomID seems to be a String.
As long as it's not one of the other data types like List, Map or Set, you should be fine.
Just going to the AWS console and trying it out worked for me:
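For reference, here is a rough Java sketch of creating such a table with the AWS SDK for Java v1 (table name and capacities are placeholders). Note that a key attribute named book.randomID is a literal top-level attribute whose name happens to contain a dot; DynamoDB does not dereference the nested randomID inside book, which matches the extra field you saw when writing items:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeDefinition;
import com.amazonaws.services.dynamodbv2.model.CreateTableRequest;
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement;
import com.amazonaws.services.dynamodbv2.model.KeyType;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;
import com.amazonaws.services.dynamodbv2.model.ScalarAttributeType;

public class CreateBookTable {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        // "book.randomID" is a plain attribute name here; its value must be
        // copied into this top-level attribute on every item you write.
        CreateTableRequest request = new CreateTableRequest()
                .withTableName("Books") // placeholder table name
                .withAttributeDefinitions(
                        new AttributeDefinition("book.randomID", ScalarAttributeType.S))
                .withKeySchema(
                        new KeySchemaElement("book.randomID", KeyType.HASH))
                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L));

        client.createTable(request);
    }
}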

NEST is adding TimeZone while indexing docs in Elasticsearch

I have a DateTime field in my C# class as below
public DateTime PassedCreatedDate { get; set; }
While indexing it from NEST to Elasticsearch, it is saved along with the local timezone. How can I avoid this?
"PassedCreatedDate": "2015-08-14T15:50:04.0479046+05:30" //Actual value saved in ES
"PassedCreatedDate": "2015-08-14T15:50:04.047" //Expected value
The mapping of PassedCreatedDate in Elasticsearch is
"PassedCreatedDate": {
"type": "date",
"format": "dateOptionalTime"
},
I am aware that I could store the field as a string and provide the format in ElasticProperty, but is there any setting that avoids this timezone addition while keeping the DateTime field?
There are two things to change to achieve saving DateTimes without the time zone offset.
Firstly, NEST uses JSON.Net for json serialization, so we need to change the serializer settings on the ElasticClient to serialize DateTimes into the format desired, and interpret those DateTimes as Local kind when deserializing
var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
settings.SetJsonSerializerSettingsModifier(jsonSettings =>
{
    jsonSettings.DateFormatString = "yyyy-MM-ddTHH:mm:ss";
    jsonSettings.DateTimeZoneHandling = DateTimeZoneHandling.Local;
});
var connection = new InMemoryConnection(settings);
var client = new ElasticClient(connection: connection);
Secondly, we need to tell Elasticsearch, via the mapping, the format of our DateTime for the field(s) in question:
"PassedCreatedDate": {
"type": "date",
"format": "yyyy-MM-ddTHH:mm:ss"
},
