Query required for Cosmos DB timestamp calculation - azure-cosmosdb

My Cosmos DB container is updated daily with data that looks like this:
{
"Price": {
"For": "9070.040",
"From": "700.990"
},
"ArticleNumber": "71151004",
"ArticleNumberPartitionKey": "7115",
"ForStatus": "ACTIVE",
"id": "71151004",
"_rid": "ky1XAMpBiEoDAAAAAAAAAA==",
"_self": "dbs/ky1XAA==/colls/ky1XAMpBiEo=/docs/ky1XAMpBiEoDAAAAAAAAAA==/",
"_etag": "\"1200f241-0000-0d00-0000-60c8dad00000\"",
"_attachments": "attachments/",
"_ts": 1623775952
}
My logic is set up with a recurrence of 24 hours. My issue is that I am unable to get only the data added to Cosmos DB in the last 24 hours. I would like to use "_ts" to get the data for the last 24 hours only. Any idea how to do that?
I tried this, but it does not give the required result:
Int32 unixTimestamp = (Int32)(DateTime.UtcNow.Subtract(new DateTime(1970, 1, 1))).TotalSeconds;

In .NET the following will get the unix time 24 hours ago:
long hoursAgo24 = DateTimeOffset.UtcNow.AddHours(-24).ToUnixTimeSeconds();
With that value you should be able to run a query like:
SELECT * FROM c
WHERE c._ts > @hoursAgo24
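As a rough sketch of the same idea in JavaScript, the cutoff can be computed in Unix seconds (the unit "_ts" uses) and passed as a query parameter. The helper names unixSecondsHoursAgo and lastDayQuerySpec are hypothetical; the { query, parameters } shape is assumed to match what the Cosmos DB SDK accepts as a parameterized query, so check it against the SDK version you use.

```javascript
// Unix time in seconds for `hours` hours before `now` (now is in milliseconds).
function unixSecondsHoursAgo(hours, now = Date.now()) {
  return Math.floor(now / 1000) - hours * 3600;
}

// Hypothetical helper: a parameterized query for documents changed in the last 24 hours.
function lastDayQuerySpec(now = Date.now()) {
  return {
    query: 'SELECT * FROM c WHERE c._ts > @hoursAgo24',
    parameters: [{ name: '@hoursAgo24', value: unixSecondsHoursAgo(24, now) }],
  };
}

console.log(JSON.stringify(lastDayQuerySpec(), null, 2));
```

Passing the cutoff as a parameter rather than concatenating it into the query text keeps the query string constant between runs.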

Related

Azure Time Series: json attribute present in raw data but cannot be graphed

I am trying to graph data in Azure Time Series Insights. For the provided sensor, I can only graph the event count; none of the values provided by the JSON are available, even though the raw data is clearly all present.
I can see the attribute state in the raw data, but it cannot be selected for graphing.
The raw data view:
The property selection for the entity:
The raw JSON (before it lands in Time Series Insights) is as follows (from another identical sensor). The entity_id and last_updated are used as the device id and update time for the event source:
{
"entity_id": "sensor.temperature_9",
"state": "21.0",
"attributes": {
"on": true,
"unit_of_measurement": "°C",
"friendly_name": "XXXX Schlafzimmer Temp",
"icon": "mdi:thermometer",
"device_class": "temperature"
},
"last_changed": "2021-03-02T07:45:23.239584+00:00",
"last_updated": "2021-03-02T07:45:23.239584+00:00",
"context": {
"id": "32d7edfe14b5738ee47509d026c6d5d3",
"parent_id": null,
"user_id": null
}
}
How can I graph the state from raw data?
Figured it out: the state value from the JSON is used by many objects; some report a numeric value and some an enumeration. This makes the field invalid for direct selection as a numeric data type.
Instead, using the value toDouble($event.state.String) in a type, and then assigning that type to the instance, allows the correct value to be displayed.

Can't scan on DynamoDB map nested attributes

I'm new to DynamoDB and I'm trying to query a table from javascript using the Dynamoose library. I have a table with a primary partition key of type String called "id" which is basically a long string with a user id. I have a second column in the table called "attributes" which is a DynamoDB map and is used to store arbitrary user attributes (I can't change the schema as this is how a predefined persistence adapter works and I'm stuck working with it for convenience).
This is an example of a record in the table:
Item{2}
attributes Map{2}
10 Number: 2
11 Number: 4
12 Number: 6
13 Number: 8
id String: YVVVNIL5CB5WXITFTV3JFUBO2IP2C33BY
The numeric fields, such as the "12" field, in the Map can be interpreted as "week10", "week11","week12" and "week13" and the numeric values 2,4,6 and 8 are the number of times the application was launched that week.
What I need to do is get all user ids of the records that have more than 4 launches in a specific week (eg week 12) and I also need to get the list of user ids with a sum of 20 launches in a range of four weeks (eg. from week 10 to 13).
With Dynamoose I have to use the following model:
dynamoose.model(
DYNAMO_DB_TABLE_NAME,
{id: String, attributes: Map},
{useDocumentTypes: true, saveUnknown: true}
);
(to match the table structure generated by the persistence adapter I'm using).
I assume I will need to do a DynamoDB "scan" to achieve this rather than a "query". To get started, I tried the following to get the records where week 12 equals 6, to no avail (I get an empty set as the result):
const filter = {
FilterExpression: 'contains(#attributes, :val)',
ExpressionAttributeNames: {
'#attributes': 'attributes',
},
ExpressionAttributeValues: {
':val': {'12': 6},
},
};
model.scan(filter).all().exec(function (err, result, lastKey) {
console.log('query result: '+ JSON.stringify(result));
});
If you don't know Dynamoose but can help with solving this via the AWS SDK to run a DynamoDB scan directly, that might also be helpful for me.
Thanks!!
Try the following.
const filter = {
FilterExpression: '#attributes.#12 = :val',
ExpressionAttributeNames: {
'#attributes': 'attributes',
'#12': '12'
},
ExpressionAttributeValues: {
':val': 6,
},
};
It sounds like what you are really trying to do is filter the items where attributes.12 = 6, which is what the filter above does.
contains only checks for a substring in a string or an element in a set or list; it cannot match a key/value pair inside a map.
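For the other two parts of the question, here is a minimal sketch under some assumptions. The "more than 4 launches in week 12" case can use the same nested-path filter with a > comparison. DynamoDB filter expressions do not support adding attributes together, so the four-week total is computed client-side here on items that have already been scanned into plain objects. The helper names buildWeekFilter and idsWithTotalLaunches are hypothetical, and "a sum of 20" is interpreted as "at least 20".

```javascript
// Scan parameters for "more than `min` launches in week `week`".
function buildWeekFilter(week, min) {
  return {
    FilterExpression: '#attributes.#week > :min',
    ExpressionAttributeNames: { '#attributes': 'attributes', '#week': String(week) },
    ExpressionAttributeValues: { ':min': min },
  };
}

// Client-side: ids of items whose launches from firstWeek to lastWeek total at least minTotal.
// Assumes items are plain objects shaped like { id, attributes: { '10': 2, ... } }.
function idsWithTotalLaunches(items, firstWeek, lastWeek, minTotal) {
  return items
    .filter((item) => {
      let total = 0;
      for (let week = firstWeek; week <= lastWeek; week++) {
        total += Number(item.attributes[String(week)] || 0); // missing weeks count as 0
      }
      return total >= minTotal;
    })
    .map((item) => item.id);
}

// The example record from the question: 2 + 4 + 6 + 8 = 20 launches in weeks 10 to 13.
const sample = [
  { id: 'YVVVNIL5CB5WXITFTV3JFUBO2IP2C33BY', attributes: { 10: 2, 11: 4, 12: 6, 13: 8 } },
];
console.log(buildWeekFilter(12, 4));
console.log(idsWithTotalLaunches(sample, 10, 13, 20));
```

The object returned by buildWeekFilter could be passed to model.scan(...) in the same way as the filter above.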

Pinot fasthll and distinctcounthll returns different values

We are using Pinot HLL and were advised to switch from fasthll to distinctcounthll, but the counts are very different: with the same condition we see a 1000x difference.
Example:
SELECT fasthll(my_hll), distinctcounthll(my_hll)
FROM counts_table WHERE timestamp >= 1500768000
I get results:
"aggregationResults": [
{
"function": "fastHLL_my_hll",
"value": "68685244"
}, {
"function": "distinctCountHLL_my_hll",
"value": "50535"
}]
Could anyone suggest what's the big difference between them?
Please refer to pinot-issue-5153.
fasthll converts each string into a HyperLogLog object, which may represent thousands of unique values. distinctcounthll treats each string as a plain value, not as a HyperLogLog object, so it returns an approximation of how many unique serialized HyperLogLog strings there are; that value should be close to the total number of rows scanned.
fasthll is deprecated because its deserialization is slow. You can generate a BYTES column holding the serialized HyperLogLog using org.apache.pinot.core.common.ObjectSerDeUtils.HYPER_LOG_LOG_SER_DE.serialize(hyperLogLog) and query it with distinctcounthll.

How to properly set Firebase Realtime Database to avoid null value at the beginning of array?

I'm new to NoSQL databases. Currently I'm trying to use Firebase and integrate it with iOS. To predefine the database, through trial and error, I tried to make it look like this:
When I retrieve the "stories" path in iOS, I get a JSON structure like this:
[
<null>,
{
comments: [
<null>,
1,
2,
3
],
desc: "Blue versus red in a classic battle of good versus evil and right versus wrong.",
duration: 30,
rating: 4.42,
tags: [
<null>,
"fantasy",
"scifi"
],
title: "The Order of the Midnight Sun",
writer: 1
}
]
My question is: why is there always a null at the beginning of each array? What should I do in the database editor to avoid the null?
It looks like you started adding data at index 1 instead of 0. Lists start at index 0, and Firebase returns a node whose keys are sequential integers as an array, so the missing index 0 comes back as null.
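The effect can be illustrated with a small plain-JavaScript simulation. This is not Firebase SDK code; the helper asArray is hypothetical and only mimics how a node with integer keys ends up as an array with a gap at any missing index.

```javascript
// Hypothetical helper: place each integer-keyed child at its index; gaps become null.
function asArray(node) {
  const keys = Object.keys(node).map(Number);
  const out = new Array(Math.max(...keys) + 1).fill(null);
  for (const k of keys) {
    out[k] = node[String(k)];
  }
  return out;
}

// Keys starting at 1 leave index 0 empty.
console.log(asArray({ 1: 'fantasy', 2: 'scifi' }));
// Keys starting at 0 fill the array from the beginning.
console.log(asArray({ 0: 'fantasy', 1: 'scifi' }));
```

Starting the keys at 0 in the database editor (0, 1, 2, ...) therefore avoids the leading null.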

LinkedIn historical company statistics giving me the wrong number of followers

I am attempting to write a backfill script that will pull in historical follower numbers for linkedin companies we are missing data for. My current script is able to get data back from linkedin, but these numbers appear incorrect for my test company. I am using this company: https://www.linkedin.com/company/3802814
I make a historical follower statistics call like so:
http://api.linkedin.com/v1/companies/3802814/historical-follow-statistics?start-timestamp=315554400&end-timestamp=1421349947&time-granularity=day&format=json
(these timestamps correspond to 01/01/1980 and 01/15/2015)
The data I'm getting back indicates 14 (not 6, as my company actually has) followers, all on random/incorrect dates, with all 0s:
{
"_total": 14,
"values": [
{
"organicFollowerCount": 0,
"paidFollowerCount": 0,
"time": 259200000,
"totalFollowerCount": 0
},
{
"organicFollowerCount": 0,
"paidFollowerCount": 0,
"time": 345600000,
"totalFollowerCount": 0
},
... (10 more similar records)
{
"organicFollowerCount": 0,
"paidFollowerCount": 0,
"time": 1296000000,
"totalFollowerCount": 0
},
{
"organicFollowerCount": 0,
"paidFollowerCount": 0,
"time": 1382400000,
"totalFollowerCount": 0
}
]
}
I would have guessed my timestamps were wrong until I saw that it's giving me more followers than I should actually have. Does anyone know what I might be doing wrong? Looking at the linkedin docs has thus far not given me any obvious answers. Data I expect would be a series of daily records updated by # of followers added on a given day. These followers were primarily added sometime in December 2014.
This is the proper request you should be making to get the information you are looking for:
GET https://api.linkedin.com/v1/companies/3802814/historical-follow-statistics?start-timestamp=315561600000&time-granularity=day&end-timestamp=1421308800000&format=json
Make sure you pass the timestamps in milliseconds.
Also, the _total value you are seeing is the number of results, not the number of followers.
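A short sketch of building such a request with millisecond timestamps follows. The helper name historicalFollowUrl is hypothetical, and the dates are taken at midnight UTC as an assumption; the endpoint and parameter names are copied from the request above. The last lines show how the original second-based values are read when treated as milliseconds: they land in January 1970, which is consistent with the 14 daily records (times 259200000 through 1382400000, i.e. days 3 through 16 after the epoch) in the response.

```javascript
// Hypothetical helper: build the request URL from Date objects, using milliseconds.
function historicalFollowUrl(companyId, start, end) {
  const params = new URLSearchParams({
    'start-timestamp': String(start.getTime()), // getTime() returns milliseconds
    'end-timestamp': String(end.getTime()),
    'time-granularity': 'day',
    format: 'json',
  });
  return `https://api.linkedin.com/v1/companies/${companyId}/historical-follow-statistics?${params}`;
}

// Assumption: midnight UTC on both dates.
const url = historicalFollowUrl(
  3802814,
  new Date(Date.UTC(1980, 0, 1)),
  new Date(Date.UTC(2015, 0, 15))
);
console.log(url);

// The original values were in seconds; read as milliseconds they fall in January 1970.
console.log(new Date(315554400).toISOString());
console.log(new Date(1421349947).toISOString());
```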

Resources