Can't push data from AWS IoT Core to AWS Timestream - aws-iot-core

I have spent two days searching for a solution to this problem with no result, and I hope I can get some help here.
I am pushing data from a local PC running KEPServerEX to AWS IoT Core using the MQTT agent, and I can see the data updating on AWS without issue. I then created a Timestream database and table named Kep_DB and Table_Kep1 respectively.
My issue is with creating a rule on AWS IoT Core to send the data to that table. For the rule to work, it requires an SQL-like statement and dimensions.
Below is what I have tried:
`SELECT (SELECT v FROM values) as value FROM 'iotgateway'`
Then I added a single dimension, like below:
name: id, value: ${id}
My payload on AWS IoT Core has this format:
`{
"timestamp": 1668852877344,
"values": [
{
"id": "Simulation Examples.Functions.Random1",
"v": 7,
"q": true,
"t": 1668852868880
},
{
"id": "Simulation Examples.Functions.Ramp2",
"v": 161,
"q": true,
"t": 1668852868880
},
{
"id": "Simulation Examples.Functions.Sine4",
"v": 39.9302559,
"q": true,
"t": 1668852868880
}
]}`
I am still not able to see any data arriving in my database, even though I have tried several dimension names and several SQL statement formats.
Does anyone have experience with this?
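One thing that stands out: `(SELECT v FROM values) as value` produces an array of objects rather than a scalar, while a Timestream measure value needs to be a scalar, and the payload has no top-level id field, so the ${id} substitution template in the dimension may not resolve to anything. As a direction to try (only a sketch, not a verified fix), a statement that returns scalar top-level attributes by picking the first element of the array with the IoT SQL get() function would look like:
`SELECT get(values, 0).v AS value, get(values, 0).id AS id FROM 'iotgateway'`
If the rule engine does not accept dot access on the result of get(), the array may need to be flattened or split before it reaches IoT Core (for example, by publishing one reading per message from KEPServerEX).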

Related

Azure Data Factory Cosmos DB SQL API 'DateTimeFromParts' is not a recognized built-in function name

I am using a Copy activity in my Data Factory (V2) to query Cosmos DB (NoSQL/SQL API). I have a WHERE clause that builds a datetime from parts using the DateTimeFromParts function. This query works fine when I execute it in the Cosmos DB Data Explorer query window, but when I use the same query from my Copy activity I get the following error:
"message":"'DateTimeFromParts' is not a recognized built-in function name."}]}
ActivityId: ac322e36-73b2-4d54-a840-6a55e456e15e, documentdb-dotnet-sdk/2.5.1 Host/64-bit
I am trying to convert a string attribute like '20221231' (which translates to Dec 31, 2022) to a date so I can compare it with the current date; I use DateTimeFromParts to build the date. Is there another way to convert '20221231' to a valid date?
Select * from c where
DateTimeFromParts(StringToNumber(LEFT(c.userDate, 4)), StringToNumber(SUBSTRING(c.userDate,4, 2)), StringToNumber(RIGHT(c.userDate, 2))) < GetCurrentDateTime()
I suspect the error might be because the documentdb-dotnet-sdk is an old version. Is there a way to specify which SDK to use in the activity?
I tried to repro this and got the same error.
Instead of changing the format of the userDate column with DateTimeFromParts, try converting the output of GetCurrentDateTime() to the userDate column's yyyymmdd format. Since yyyymmdd strings sort lexicographically in chronological order, a plain string comparison works.
Workaround query:
SELECT * FROM c
where c.userDate <
replace(left(GetCurrentDateTime(),10),'-','')
Input data
[
{
"id": "1",
"userDate": "20221231"
},
{
"id": "2",
"userDate": "20211231",
}
]
Output data
[
{
"id": "2",
"userDate": "20211231"
}
]
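As an aside, if you prefer the comparison in ISO date form, a variant of the same idea (a sketch only, using the Cosmos DB string functions CONCAT, LEFT, SUBSTRING, and RIGHT plus GetCurrentDateTime()) is to build a yyyy-MM-dd string from userDate and compare it with the date part of GetCurrentDateTime():
SELECT * FROM c
WHERE CONCAT(LEFT(c.userDate, 4), '-', SUBSTRING(c.userDate, 4, 2), '-', RIGHT(c.userDate, 2)) < LEFT(GetCurrentDateTime(), 10)
Both sides are then yyyy-MM-dd strings, so the lexicographic comparison still matches chronological order.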
Apologies for the slow reply here. Holidays slowed getting an answer for this.
There is a workaround that allows you to use SDK v3, which then gives you access to the DateTimeFromParts() system function released in .NET SDK v3.13.0.
Option 1: Use AAD authentication (i.e., a service principal, or a system- or user-assigned managed identity) for the linked service object in ADF to Cosmos DB. This will automatically pick up the .NET SDK v3.
Option 2: Modify the linked service template. First, click Manage in the ADF designer, then click Linked Services, select the connection, and click the {} icon to open the JSON template; you can then set useV3 to true. Here is an example.
{
"name": "<CosmosDbV3>",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"annotations": [],
"type": "CosmosDb",
"typeProperties": {
"useV3": true,
"accountEndpoint": "<https://sample.documents.azure.com:443/>",
"database": "<db>",
"accountKey": {
"type": "SecureString",
"value": "<account key>"
}
}
}
}

Can fluentbit forward fluentbit_metrics as plain text instead of a JSON object to a port?

I am trying to send fluentbit metrics to an external source for processing. My understanding from the documentation is that the fluentbit_metrics input is intended to be used with output plugins that are for specific telemetry solutions like Prometheus, OpenTelemetry, etc. However, for my purposes, I cannot actually use any of those solutions and instead have to use a different bespoke metrics solution. For this to work, I would like to just send lines of text to a port that my metrics solution is listening on.
I am trying to use the fluentbit forward output to send data to this endpoint, but I am getting an error in response from my metrics solution because it receives a big JSON object that it can't parse. However, when I output the same fluentbit_metrics input to a file or to stdout, the contents of the file are more like what I would expect, where each metric is just a line of text. If these text lines were what was being sent to my metrics endpoint, I wouldn't have any issue ingesting them.
I know that I could take on the work to change my metrics solution to parse and process this JSON map, but before I do that, I wanted to check if this is the only way forward for me. So, my question is: is there a way to get fluentbit to send fluentbit_metrics to a forward output without converting the metrics into a big JSON object? Is the schema for that JSON object specific to Prometheus? Is there a reason why the outputs differ so substantially?
Here is a copy of an example config I am using with fluentbit:
[SERVICE]
    # This is a commented line
    Daemon off
    log_level info
    log_file C:\MyFolder\fluentlog.txt
    flush 1
    parsers_file .\parsers.conf

[INPUT]
    name fluentbit_metrics
    tag internal_metrics
    scrape_interval 2

[OUTPUT]
    Name forward
    Match internal_metrics
    Host 127.0.0.1
    Port 28232
    tag internal_metrics
    Time_as_Integer true

[OUTPUT]
    name stdout
    match *
And here is the output from the forward output plugin:
{
"meta": {
"cmetrics": {},
"external": {},
"processing": {
"static_labels": []
}
},
"metrics": [
{
"meta": {
"ver": 2,
"type": 0,
"opts": {
"ns": "fluentbit",
"ss": "",
"name": "uptime",
"desc": "Number of seconds that Fluent Bit has been running."
},
"labels": [
"hostname"
],
"aggregation_type": 2
},
"values": [
{
"ts": 1670884364820306500,
"value": 22,
"labels": [
"myHostName"
],
"hash": 16603984480778988994
}
]
}, etc.
And here is the output of the same metrics from stdout:
2022-12-12T22:02:13.444100300Z fluentbit_uptime{hostname="myHostName"} = 2
2022-12-12T22:02:11.721859000Z fluentbit_input_bytes_total{name="tail.0"} = 1138
2022-12-12T22:02:11.721859000Z fluentbit_input_records_total{name="tail.0"} = 12
2022-12-12T22:02:11.444943400Z fluentbit_input_files_opened_total{name="tail.0"} = 1
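As far as I can tell, the forward output always ships metrics in the cmetrics/msgpack encoding (which is what the JSON above is a rendering of), so there does not appear to be a switch on that plugin to emit the plain text lines. If the bespoke collector could pull over HTTP instead of listening on a raw port, one alternative worth considering (a sketch only, not part of the setup above) is the prometheus_exporter output, which serves a very similar text format to the stdout rendering:

[OUTPUT]
    # assumed host/port; the text-format metrics are then served at http://127.0.0.1:2021/metrics
    name  prometheus_exporter
    match internal_metrics
    host  127.0.0.1
    port  2021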

Is it possible to convert JSON input while sending data from IoT Hub to Cosmos DB using Stream Analytics

I have this JSON input (IoT Hub):
{
"time": 1574266369775,
"latitude": 70.703271,
"longitude": 25.8445082,
"accuracy": 23.320999145507812,
"altitude": 498.8999938964844,
"id": "abs8d5c2ff74b5a5"
}
and I want to store it in Cosmos DB, under a particular key, location:
{
"updatedtime": 1574345877283,
"time": 1574347747884,
"status": "available",
"deviceId":"abs8d5c2ff74b5a5",
"location": {
"time": 1574266369775,
"latitude": 70.703271,
"longitude": 25.8445082,
"accuracy": 23.320999145507812,
"altitude": 498.8999938964844,
"id": "abs8d5c2ff74b5a5"
}
}
Is this possible? I'm able to store it at the top level, but can it be stored under a key inside the document?
Stream Analytics integration with Azure Cosmos DB allows you to insert or update records in your container based on a given Document ID column. This is also referred to as an upsert.
The documentation also says that it enables partial updates to the document; that is, adding new properties or replacing an existing property is performed incrementally.
My 2 cents: try naming your output 'location' and make sure you use the same document ID when writing to the Cosmos DB output from your ASA job.

Spatial Indexing not working with ST_DISTANCE queries and '<'

Spatial indexing does not seem to be working on a collection which contains a document with GeoJSON coordinates. I've tried using the default indexing policy, which inherently provides spatial indexing on all fields.
I've tried creating a new Cosmos Db account, database, and collection from scratch without any success of getting the spatial indexing to work with ST_DISTANCE query.
I've set up a simple collection with the following indexing policy:
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
{
"path": "/\"location\"/?",
"indexes": [
{
"kind": "Spatial",
"dataType": "Point"
},
{
"kind": "Range",
"dataType": "Number",
"precision": -1
},
{
"kind": "Range",
"dataType": "String",
"precision": -1
}
]
}
],
"excludedPaths": [
{
"path": "/*",
},
{
"path": "/\"_etag\"/?"
}
]
}
The document that I've inserted into the collection:
{
"id": "document1",
"type": "Type1",
"location": {
"type": "Point",
"coordinates": [
-50,
50
]
},
"name": "TestObject"
}
The query that should return the single document in the collection:
SELECT * FROM f WHERE f.type = "Type1" and ST_DISTANCE(f.location, {'type': 'Point', 'coordinates':[-50,50]}) < 200000
Is not returning any results. If I explicitly query without using the spatial index like so:
SELECT * FROM f WHERE f.type = "Type1" and ST_DISTANCE({'type': 'Point', 'coordinates':[f.location.coordinates[0],f.location.coordinates[1]]}, {'type': 'Point', 'coordinates':[-50,50]}) < 200000
It returns the document as it should, but doesn't take advantage of the indexing which I will need because I will be storing a lot of coordinates.
This seems to be the same issue referenced here. If I add a second document far away and change the '<' to '>' in the first query it works!
I should mention this is only occurring on Azure. When I use the Azure Cosmos Db Emulator it works perfectly! What is going on here?! Any tips or suggestions are much appreciated.
UPDATE: I found out the reason the query works on the emulator and not on Azure - the database on the emulator doesn't have provisioned (shared) throughput among its collections, while I created the database in Azure with provisioned throughput to keep costs down (i.e., 4 collections sharing 400 RU/s). I created a database in Azure without provisioned throughput and the query works with spatial indexing! I will log this issue with Microsoft to see if there is a reason why this is the case.
Thanks for following up with the additional details about a fixed collection being the solution, but I did want to get some additional information.
The Cosmos DB Emulator now supports containers:
By default, you can create up to 25 fixed size containers (only supported using Azure Cosmos DB SDKs), or 5 unlimited containers using the Azure Cosmos Emulator. By modifying the PartitionCount value, you can create up to 250 fixed size containers or 50 unlimited containers, or any combination of the two that does not exceed 250 fixed size containers (where one unlimited container = 5 fixed size containers). However, it's not recommended to set up the emulator to run with more than 200 fixed size containers because of the overhead it adds to disk I/O operations, which results in unpredictable timeouts when using the endpoint APIs.
So, I want to check which version of the emulator you were using. The current version is azure-cosmosdb-emulator-2.2.2.
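If you do need more containers on the emulator, the PartitionCount value mentioned above is passed as a start-up switch to the emulator executable (the executable name varies by version - Microsoft.Azure.Cosmos.Emulator.exe in newer releases, CosmosDB.Emulator.exe in older ones), for example:
CosmosDB.Emulator.exe /PartitionCount=250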

Script paths into Azure Data Factory DataLakeAnalytics u-sql pipeline

I'm trying to publish a Data Factory solution with this ADF DataLakeAnalyticsU-SQL pipeline activity, following the Azure step-by-step doc (https://learn.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity).
{
"type": "DataLakeAnalyticsU-SQL",
"typeProperties": {
"scriptPath": "\\scripts\\111_risk_index.usql",
"scriptLinkedService": "PremiumAzureDataLakeStoreLinkedService",
"degreeOfParallelism": 3,
"priority": 100,
"parameters": {
"in": "/DF_INPUT/Consodata_Prelios_consegna_230617.txt",
"out": "/DF_OUTPUT/111_Analytics.txt"
}
},
"inputs": [
{
"name": "PremiumDataLakeStoreLocation"
}
],
"outputs": [
{
"name": "PremiumDataLakeStoreLocation"
}
],
"policy": {
"timeout": "06:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst",
"retry": 1
},
"scheduler": {
"frequency": "Minute",
"interval": 15
},
"name": "ConsodataFilesProcessing",
"linkedServiceName": "PremiumAzureDataLakeAnalyticsLinkedService"
}
During publishing I got this error:
25/07/2017 18:51:59- Publishing Project 'Premium.DataFactory'....
25/07/2017 18:51:59- Validating 6 json files
25/07/2017 18:52:15- Publishing Project 'Premium.DataFactory' to Data Factory 'premium-df'
25/07/2017 18:52:15- Value cannot be null.
Parameter name: value
Trying to figure out what could be wrong with the project, it appears that the issue lies in the activity's "typeProperties" options shown above, specifically the scriptPath and scriptLinkedService attributes. The doc says:
scriptPath: Path to folder that contains the U-SQL script. Name of the file is case-sensitive.
scriptLinkedService: Linked service that links the storage that contains the script to the data factory
Publishing the project without them (using a hard-coded script) completes successfully. The problem is that I can't figure out what exactly to put in them. I have tried several path combinations. The only thing I know is that the script file must be referenced locally in the solution as a dependency.
The script linked service needs to be Blob Storage, not Data Lake Storage.
Ignore the publishing error, it's misleading.
Have a linked service in your solution that points to an Azure Storage account, and refer to it in the 'scriptLinkedService' attribute. Then, in the 'scriptPath' attribute, reference the blob container + path.
For example:
"typeProperties": {
"scriptPath": "datafactorysupportingfiles/CreateDimensions - Daily.usql",
"scriptLinkedService": "BlobStore",
"degreeOfParallelism": 2,
"priority": 7
},
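For reference, a minimal sketch of what the 'BlobStore' linked service JSON might look like (classic ADF v1 format; the account name and key are placeholders):
{
    "name": "BlobStore",
    "properties": {
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<storageaccountname>;AccountKey=<storageaccountkey>"
        }
    }
}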
Hope this helps.
PS: Double-check the case sensitivity of attribute names; it can also throw unhelpful errors.
