PromQL: Counting samples of a time series - count

For a 2-minute time window this vector has the following results (I am using Grafana Explore with a 2-minute time range selected):
instana_metrics{aggregation="max", endpoint="mutation addProduct"}
t1 - 3051
t2 - 5347
t3 - 5347
t4 - 4224
t5 - 4224
I need something equivalent to
SELECT Count(*)
FROM instana_metrics
with a result of 5.
The best I was able to come up with is this
count( instana_metrics{aggregation="max", endpoint="mutation addProduct"} )
t1 | 1
t2 | 1
t3 | 1
t4 | 1
t5 | 1
My interpretation is that every point in time has a count of 1 sample value. But the result itself is a time series, and I am expecting one scalar.
Btw: I understand that I can use Grafana transformation for this, but unfortunately I need a PromQL only solution.

Just use the count_over_time function. For example, the following query returns the number of raw samples over the last 2 minutes for each time series with the name instana_metrics:
count_over_time(instana_metrics[2m])
Note that Prometheus calculates the provided query independently at each point on the graph, i.e. each value on the graph shows the number of raw samples for each matching time series over a 2-minute lookbehind window ending at that point.
If you need just a single value per each matching series over the selected time range in Grafana, then use the following instant query:
count_over_time(instana_metrics[$__range])
See these docs about $__range variable.
See these docs about instant query type in Grafana.
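For the series in the question, combining its label selector with $__range gives a single number per matching series over the dashboard's selected time range (a sketch; the label values are taken from the question above):
count_over_time(instana_metrics{aggregation="max", endpoint="mutation addProduct"}[$__range])
With the five samples listed in the question, this instant query returns 5 for that series.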

Related

Datetime column from Table1 is not matching the DateTime column from Table 2

Hello, I have an issue matching two different datetime columns.
I need to compare the two of them (and their data), but when I put them in the same table (using a datetime relationship) I do not get the match I need:
What I need:
| Datetime_1 | Datetime_2 |
| ---------- | ---------- |
| 01/01/2023 08:00:00 AM | |
| ... | ... |
| 01/11/2023 12:00:00 AM | 01/11/2023 12:00:00 AM |
| 01/11/2023 01:00:00 AM | 01/11/2023 01:00:00 AM |
| ... | ... |
| 31/01/2023 12:00:00 PM | 31/01/2023 12:00:00 PM |
What I get:
Datetime_1 goes from 01/01/2023 12:00:00AM to 01/31/2023 11:00:00PM (with steps of 1h) and Datetime_2 goes from 01/11/2023 8:15:00 PM to 02/06/2023 7:45:00 PM (with steps of 30min).
I created a relationship between the two of them and didn't receive any error:
I already put both lists in Date/Time format in Power Query and Data panel.
However, I noticed my main datetime list doesn't have the hierarchy icon in the fields panel, while the secondary datetime lists do have it (but not the hour section):
Also, as I mentioned before, my list has a range between Jan and Feb. I do not understand why this range continues and matches some dates on my main datetime list:
Troubleshooting
Part of the difficulty troubleshooting this is the two columns are formatted differently. Just for now, make sure both are formatted as Long Date Time. When comparing the relationship, do not drag the hierarchy (for the one that has it) into the table but rather just the date itself. When you do, you will see the full timestamp for both columns and the issue will become more clear.
Power BI & Relationships on DateTime
Power BI will only match related rows if the date and time match exactly, so 4/15/2023 12:00:00 AM will not match 4/15/2023 12:00:01 AM. You mentioned one side of the relationship has 30 minute steps while the other has 1 hour steps. Power BI is not going to match up a 1:30am and 1:00am value for you. If you want that 1:30 value to match up to 1:00, create another column truncating the :30 minutes and build your relationship on the truncated column.
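For example, a calculated column along these lines truncates the 30-minute column down to the hour so it can line up with the 1-hour column (a sketch; the table name 'Table2' and the column name Datetime_2 are placeholders for your model):
Datetime_2 Hour =
VAR dt = 'Table2'[Datetime_2]
RETURN DATE(YEAR(dt), MONTH(dt), DAY(dt)) + TIME(HOUR(dt), 0, 0)
You would then build the relationship on Datetime_2 Hour instead of Datetime_2.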
Time Dimension
I'm not sure of your application so don't know if this will work, but when dealing with time, I try to separate Date and Time into separate columns and have both a Date and Time dimension. Below is my time dimension DAX. You can generate any minute-precise interval with it. Notice the last defined column "timekey". I create a column in my fact table to relate to this key.
DimTime =
VAR every_n_minutes = 15 /* between 0 and 60; remainders in last hourly slice */
/* DO NOT CHANGE BELOW THIS LINE */
VAR slice_per_hour = TRUNC(DIVIDE(60, every_n_minutes), 0)
VAR rtn =
    ADDCOLUMNS(
        SELECTCOLUMNS(
            GENERATESERIES(0, 24 * slice_per_hour - 1, 1),
            "hour24", TRUNC(DIVIDE([Value], slice_per_hour), 0),
            "mins", MOD([Value], slice_per_hour) * every_n_minutes
        ),
        "hour12", MOD([hour24] + 11, 12) + 1,
        "asTime", TIME([hour24], [mins], 0),
        "timekey", [hour24] * 100 + [mins]
    )
RETURN rtn
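To relate a fact table to this dimension, the matching key can be computed from the fact timestamp. A sketch assuming the same 15-minute slicing; the table name 'Fact' and column [Timestamp] are hypothetical:
Fact TimeKey =
VAR ts = 'Fact'[Timestamp]
RETURN HOUR(ts) * 100 + TRUNC(DIVIDE(MINUTE(ts), 15), 0) * 15
A timestamp of 12:37 PM, for example, produces 1230, which matches the DimTime row with hour24 = 12 and mins = 30.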
As requested, turning this into an answer. The reason you're getting these results is that your timestamps will never line up. Yes, it let you create the join, but my guess is that it is only because both fields have the same formatting. Also, it is best practice to separate your dates and times into separate date and time dimensions, then join them via a fact table.

Power BI DAX - Throughput Board Time Intelligence Metrics with different time zones

Dealing with a bit of a head scratcher. This is more of a logic based issue rather than actual Power BI code. Hoping someone can help out! Here's the scenario:
| Site | Shift Num | Start Time | End Time | Daily Output |
| ---- | --------- | ---------- | -------- | ------------ |
| A | 1 | 8:00AM | 4:00PM | 10000 |
| B | 1 | 7:00AM | 3:00PM | 12000 |
| B | 2 | 4:00PM | 2:00AM | 7000 |
| C | 1 | 6:00AM | 2:00PM | 5000 |
This table contains the sites as well as their respective shift times. The master table above is part of an effort in order to capture throughput data from each of the following sites. This master table is connected to tables with a running log of output for each site and shift like so:
| Site | Shift Number | Output | Timestamp |
| ---- | ------------ | ------ | --------- |
| A | 1 | 2500 | 9:45 AM |
| A | 1 | 4200 | 11:15 AM |
| A | 1 | 5600 | 12:37 PM |
| A | 1 | 7500 | 2:15 PM |
So there is a one-to-many relationship between the master table and these child throughput tables. The goal is to use a gauge chart with the following metrics:
Value: Latest Throughput Value (Latest Output in Child Table)
Maximum Value: Throughput Target for the Day (Shift Target in Master Table)
Target Value: Time-dependent project target
i.e. if we are halfway through Site A's shift 1, we should be at 5000 units: (time passed in shift / total shift time) * shift output
if the shift is currently in non-working hours, then the target value = maximum value
Easy enough, but the problem we are facing is that the target value errors out for shifts that cross into the next day (i.e. Site B's shift 2).
The shift times are stored as date-independent time values. Here's the code for the measure to get the target value:
Var CurrentTime = HOUR(UTCNOW()) * 60 + MINUTE(UTCNOW())
VAR ShiftStart = HOUR(MAX('mtb MasterTableUTC'[ShiftStartTimeUTC])) * 60 + MINUTE(MAX('mtb MasterTableUTC'[ShiftStartTimeUTC]))
VAR ShiftEnd = HOUR(MAX('mtb MasterTableUTC'[ShiftEndTimeUTC])) * 60 + MINUTE(MAX('mtb MasterTableUTC'[ShiftEndTimeUTC]))
VAR ShiftDiff = ShiftEnd - ShiftStart
Return
IF(CurrentTime > ShiftEnd || CurrentTime < ShiftStart, MAX('mtb MasterTableUTC'[OutputTarget]),
( (CurrentTime - ShiftStart) / ShiftDiff) * MAX('mtb MasterTableUTC'[OutputTarget]))
Basically, if the current time is outside the range of the shift, it should have the target value equal the total shift target, but if it is in the shift time, it calculates it as a ratio of time passed within the shift. This does not work with shifts that cross midnight as the shift end time value is technically earlier than the shift start time value. Any ideas on how to modify the measure to account for these shifts?
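One common way to handle a shift that crosses midnight is to normalize everything onto a single minute scale: when the end time is numerically earlier than the start time, push the end (and, if needed, the current time) forward by 24 hours (1440 minutes). A sketch based on the measure above, using the same columns; the measure name is a placeholder and this is not tested against your model:
Target Value =
VAR CurrentTime = HOUR(UTCNOW()) * 60 + MINUTE(UTCNOW())
VAR ShiftStart = HOUR(MAX('mtb MasterTableUTC'[ShiftStartTimeUTC])) * 60 + MINUTE(MAX('mtb MasterTableUTC'[ShiftStartTimeUTC]))
VAR ShiftEndRaw = HOUR(MAX('mtb MasterTableUTC'[ShiftEndTimeUTC])) * 60 + MINUTE(MAX('mtb MasterTableUTC'[ShiftEndTimeUTC]))
VAR CrossesMidnight = ShiftEndRaw < ShiftStart
VAR ShiftEnd = IF(CrossesMidnight, ShiftEndRaw + 1440, ShiftEndRaw)
VAR AdjustedTime = IF(CrossesMidnight && CurrentTime < ShiftStart, CurrentTime + 1440, CurrentTime)
VAR ShiftDiff = ShiftEnd - ShiftStart
RETURN
IF(AdjustedTime > ShiftEnd || AdjustedTime < ShiftStart,
    MAX('mtb MasterTableUTC'[OutputTarget]),
    ((AdjustedTime - ShiftStart) / ShiftDiff) * MAX('mtb MasterTableUTC'[OutputTarget]))
For Site B's shift 2 (4:00PM to 2:00AM), ShiftStart is 960 and ShiftEnd becomes 1560; a current time of 1:00AM (60) is shifted to 1500, giving a ratio of 0.9, i.e. nine hours into the ten-hour shift.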

R: Continuous futures working backward

I want to create a continuous futures series, that is, to eliminate the gap between two series.
The first thing I want is to download all individual contracts from the beginning until now; the syntax is always the same:
Quandl("CME/INSTRUMENT_MONTHCODE_YEAR")
1. INSTRUMENT is GC (gold) in this case
2. MONTHCODE is G, J, M, Q, V, Z
3. YEAR is from 1975 to 2017 (the current contract)
With the data, I start working from the last contract, in this case "CME/GCG1975", and the next contract, "CME/GCJ1975". Then I look at the last 6 values (the most recent ones) of the first contract, GCG1975:
require(Quandl)
GCG1975 = Quandl("CME/GCG1975",order="asc", type="raw")
tail(GCG1975,6)
order can be asc or desc (ascending or descending); type can be raw (data frame), ts, xts, or zoo
And it outputs:
Image: quandl-1.png = Last values of GCG1975
Then I just want the 6th row counting from the end, and I want to eliminate the columns "Last" and "Change" (this could be done before processing each individual contract):
Image: quandl-2.png = Last 6th value GCG1975
Then I want to find the row with the date 1975-02-18 (the 6th-from-last value of GCG1975) in the next contract (GCJ1975):
Image: quandl-3.png = 1975-02-18 on GCJ1975
Then I compute the difference between the "Settle" of the G contract and the "Settle" of the J contract.
Difference_contract = 183.6 - 185.4
Difference_contract = -1.8
So that means that the next (J) contract is 1.8 points above the previous contract, so we have to add -1.8 to all the following values of the J contract (Open, High, Low, Settle), including the row for 1975-02-18. Like this:
Image: quandl-4.png = Differences between contracts
And then we have a continuous series like this:
Image: quandl-5.png = Continuous series
All these differences and additions to make a continuous series are done from the last contract up to the current contract.
I think I can't post this properly because I don't have 10 reputation points and can only post 2 image links.
Any guidance would help me; if you have any questions, please ask.
Thanks and hope everything is well.
RTA
Edit: I have uploaded the photos and their links to my Dropbox, so you will need to look there, because Stack Overflow doesn't allow posting more than 2 links without 10 reputation points.
Dropbox file
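For reference, a minimal R sketch of a single roll under the assumptions in the question (a configured Quandl API key, ascending date order, and the Open/High/Low/Settle/Last/Change columns Quandl returns for CME contracts); building the full continuous series is a loop over consecutive contract pairs:
require(Quandl)
# Download two adjacent contracts
GCG1975 <- Quandl("CME/GCG1975", order = "asc", type = "raw")
GCJ1975 <- Quandl("CME/GCJ1975", order = "asc", type = "raw")
# Drop the columns that are not needed
GCG1975 <- GCG1975[, !(names(GCG1975) %in% c("Last", "Change"))]
GCJ1975 <- GCJ1975[, !(names(GCJ1975) %in% c("Last", "Change"))]
# Roll on the 6th row counting from the end of the expiring contract
roll_row  <- GCG1975[nrow(GCG1975) - 5, ]
roll_date <- roll_row$Date
# Settle of the next contract on the same date
next_settle <- GCJ1975$Settle[GCJ1975$Date == roll_date]
# Adjustment: old Settle minus new Settle (e.g. 183.6 - 185.4 = -1.8)
adj <- roll_row$Settle - next_settle
# Shift the price columns of the next contract from the roll date onward
price_cols <- c("Open", "High", "Low", "Settle")
keep <- GCJ1975$Date >= roll_date
GCJ1975[keep, price_cols] <- GCJ1975[keep, price_cols] + adj
# Splice the two pieces into one continuous series
continuous <- rbind(GCG1975[GCG1975$Date < roll_date, ], GCJ1975[keep, ])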

Extract data in Aster Teradata?

I have time series data like the one above. T0, T1, … represent the date on which a ticket was submitted to stage 1, stage 2, and so on. All I need to do is find the time spent in each stage, provided that following all the stages is not compulsory (i.e. a ticket can move directly from stage 1 to stage 3, in which case the date in T2 will be left blank). I need to get the result in Aster Teradata.
If you want the difference between stage_n and the next NOT NULL stage:
LEAST(T1,T2,T3,T4,T5) - T0 AS stage_0,
LEAST(T2,T3,T4,T5) - T1 AS stage_1,
LEAST(T3,T4,T5) - T2 AS stage_2,
LEAST(T4,T5) - T3 AS stage_3,
T5 - T4 AS stage_4,
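Put into a full query, that might look like the following (a sketch; the table name tickets and the ticket_id column are hypothetical):
SELECT
    ticket_id,
    LEAST(T1, T2, T3, T4, T5) - T0 AS stage_0,
    LEAST(T2, T3, T4, T5) - T1 AS stage_1,
    LEAST(T3, T4, T5) - T2 AS stage_2,
    LEAST(T4, T5) - T3 AS stage_3,
    T5 - T4 AS stage_4
FROM tickets;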

SSAS incremental cube processing shows wrong distinct count

I have created an XMLA script to process the cube incrementally. It uses type "ProcessUpdate" for dimensions and "ProcessAdd" for measure group partitions. I am facing an issue with distinct count. Let me take one example:
| Order Id | CustId | Amount |
| -------- | ------ | ------ |
| 1 | C1 | 100.00 |
| 2 | C2 | 200.00 |
| 3 | C3 | 300.00 |
| 4 | C4 | 400.00 |
| 5 | C5 | 500.00 |
If we browse the cube, SSAS shows the sum of orders as 1500.00 and the distinct customer count for all orders as 5. Now we add a new fact record to cancel one order,
e.g.:
| Order Id | CustId | Amount |
| -------- | ------ | ------ |
| 3 | C3 | -300.00 |
After incremental processing, it shows the sum of orders as 1200.00, which is correct. But the distinct customer count for all orders stays the same and shows 5, which is incorrect.
I understand that the rows get appended during incremental processing, which works for the sum operation but fails for the distinct count. I want to know if there is any way to remove order #3 from all aggregate operations while processing incrementally.
Distinct customer count remaining at five is correct, as it doesn't know that a minus 300 means the customer shouldn't show. If you processed the cube fully it would show as 5 distinct customers.
This isn't to do with incremental processing, this is to do with how SSAS handles distinct count - It's just "Count distinct customer Ids in fact table", and C3 is there twice with a sale of 300 and a sale of -300.
You need to reconsider how to handle this, ideally at the stage where you load your data warehouse. You could handle it in MDX by not including anyone who has sales of zero or less, but then the whole distinct count calculation will be done in MDX and will be much, much slower.
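If you do decide to accept the performance cost and handle it in MDX, the calculation would take roughly this shape (a sketch; the cube, dimension, hierarchy, and measure names are placeholders for your model):
WITH MEMBER [Measures].[Active Customer Count] AS
    // count only customers whose total amount is positive
    COUNT(
        FILTER(
            [Customer].[CustId].[CustId].MEMBERS,
            [Measures].[Amount] > 0
        )
    )
SELECT [Measures].[Active Customer Count] ON COLUMNS
FROM [Sales Cube]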
