Extract data in Aster Teradata?

I have time series data like the one above. T0, T1, … represent the dates on which a ticket was submitted to stage 1, stage 2, and so on. All I need to do is find the time spent in each stage, given that following every stage is not compulsory (i.e. a ticket can move directly from stage 1 to stage 3, in which case the date in T2 is left blank). I need to find the result in Aster Teradata.

If you want the difference between stage_n and the next NOT NULL stage:
LEAST(T1,T2,T3,T4,T5) - T0 AS stage_0,
LEAST(T2,T3,T4,T5) - T1 AS stage_1,
LEAST(T3,T4,T5) - T2 AS stage_2,
LEAST(T4,T5) - T3 AS stage_3,
T5 - T4 AS stage_4,
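The same "difference to the next non-null stage" logic that the LEAST expressions compute can be sketched outside the database. This Python sketch uses made-up dates (the original table didn't come through), with None standing in for a skipped stage:

```python
from datetime import date

# Hypothetical ticket: dates for stages T0..T5; None where a stage was skipped.
stages = [date(2023, 1, 1), date(2023, 1, 4), None,
          date(2023, 1, 10), None, date(2023, 1, 15)]

def days_in_each_stage(ts):
    """For each non-null stage date, days until the next non-null stage date."""
    result = {}
    for i, start in enumerate(ts):
        if start is None:
            continue  # stage was skipped; no duration to report
        nxt = next((d for d in ts[i + 1:] if d is not None), None)
        result[f"stage_{i}"] = (nxt - start).days if nxt else None
    return result

print(days_in_each_stage(stages))
# {'stage_0': 3, 'stage_1': 6, 'stage_3': 5, 'stage_5': None}
```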


Datetime column from Table1 is not matching the DateTime column from Table 2

Hello, I have an issue matching two different datetime columns.
I need to compare the two of them (and their data), but when I put them in the same table (using a datetime relationship) I do not get the match I need:
What I need:
| Datetime_1 | Datetime_2 |
| ---------- | ---------- |
| 01/01/2023 08:00:00 AM | |
... ...
| 01/11/2023 12:00:00 AM | 01/11/2023 12:00:00 AM |
| 01/11/2023 01:00:00 AM | 01/11/2023 01:00:00 AM |
... ...
| 31/01/2023 12:00:00 PM | 31/01/2023 12:00:00 PM |
What I get:
Datetime_1 goes from 01/01/2023 12:00:00AM to 01/31/2023 11:00:00PM (with steps of 1h) and Datetime_2 goes from 01/11/2023 8:15:00 PM to 02/06/2023 7:45:00 PM (with steps of 30min).
I created a relationship between the two of them and didn't receive any error:
I already put both lists in Date/Time format in Power Query and Data panel.
However, I noticed my main datetime list doesn't have the hierarchy icon in the fields panel, while the secondary datetime list has it (but not the hour section):
Also, as I mentioned before, my lists have a range between Jan and Feb. I do not understand why this range continues and matches some dates on my main datetime list:
Troubleshooting
Part of the difficulty in troubleshooting this is that the two columns are formatted differently. Just for now, make sure both are formatted as Long Date Time. When examining the relationship, do not drag the hierarchy (for the one that has it) into the table, but rather just the date itself. When you do, you will see the full timestamp for both columns and the issue will become clearer.
Power BI & Relationships on DateTime
Power BI will only match related rows if the date and time match exactly, so 4/15/2023 12:00:00 AM will not match 4/15/2023 12:00:01 AM. You mentioned one side of the relationship has 30-minute steps while the other has 1-hour steps. Power BI is not going to match up a 1:30am and 1:00am value for you. If you want that 1:30 value to match 1:00, create another column truncating the :30 minutes and build your relationship on the truncated column.
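To make the truncation idea concrete, here is a minimal Python sketch (with hypothetical timestamps) of the same "drop the minutes, then compare" step you would build as a calculated column:

```python
from datetime import datetime

def truncate_to_hour(ts: datetime) -> datetime:
    """Drop minutes/seconds so a 1:30 AM value relates to the 1:00 AM row."""
    return ts.replace(minute=0, second=0, microsecond=0)

a = datetime(2023, 1, 11, 1, 30)   # the 30-minute-step side
b = datetime(2023, 1, 11, 1, 0)    # the 1-hour-step side
print(truncate_to_hour(a) == b)    # True: the truncated values match exactly
```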
Time Dimension
I'm not sure of your application so don't know if this will work, but when dealing with time, I try to separate Date and Time into separate columns and have both a Date and Time dimension. Below is my time dimension DAX. You can generate any minute-precise interval with it. Notice the last defined column "timekey". I create a column in my fact table to relate to this key.
DimTime =
VAR every_n_minutes = 15 /* between 0 and 60; remainders in last hourly slice */
/* DO NOT CHANGE BELOW THIS LINE */
VAR slice_per_hour = TRUNC(DIVIDE(60, every_n_minutes), 0)
VAR rtn =
    ADDCOLUMNS(
        SELECTCOLUMNS(
            GENERATESERIES(0, 24 * slice_per_hour - 1, 1),
            "hour24", TRUNC(DIVIDE([Value], slice_per_hour), 0),
            "mins", MOD([Value], slice_per_hour) * every_n_minutes
        ),
        "hour12", MOD([hour24] + 11, 12) + 1,
        "asTime", TIME([hour24], [mins], 0),
        "timekey", [hour24] * 100 + [mins]
    )
RETURN
    rtn
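For readers less familiar with DAX, here is a rough Python equivalent of the same slice arithmetic (the hour24/mins/hour12/timekey formulas mirror the DAX above; the function name is just for illustration):

```python
def build_time_dim(every_n_minutes=15):
    """One row per time slice, keyed by hour24 * 100 + mins (the 'timekey')."""
    slice_per_hour = 60 // every_n_minutes
    rows = []
    for value in range(24 * slice_per_hour):          # GENERATESERIES(0, 24*slices-1, 1)
        hour24 = value // slice_per_hour
        mins = (value % slice_per_hour) * every_n_minutes
        rows.append({
            "hour24": hour24,
            "mins": mins,
            "hour12": (hour24 + 11) % 12 + 1,         # 0 -> 12, 13 -> 1, etc.
            "timekey": hour24 * 100 + mins,
        })
    return rows

dim = build_time_dim(15)
print(len(dim), dim[5])   # 96 rows; row 5 is 01:15, timekey 115
```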
As requested, turning this into an answer. The reason you're getting these results is that your timestamps will never line up. Yes, it let you create the join, but my guess is that is only because both fields have the same formatting. Also, it is best practice to separate your dates and times into separate date and time dimensions, then join them via a fact table. See also here.

PromQL: Counting samples of a time series

For a 2-minute time window this vector has the following results (I am using Grafana Explore with a 2-minute time range picked):
instana_metrics{aggregation="max", endpoint="mutation addProduct"}
t1 - 3051
t2 - 5347
t3 - 5347
t4 - 4224
t5 - 4224
I need something equivalent to
SELECT Count(*)
FROM instana_metrics
with a result of 5.
The best I was able to come up with is this
count( instana_metrics{aggregation="max", endpoint="mutation addProduct"} )
t1 | 1
t2 | 1
t3 | 1
t4 | 1
t5 | 1
My interpretation is that every point in time has a count of 1 sample value. But the result itself is a time series, and I am expecting one scalar.
Btw: I understand that I can use Grafana transformation for this, but unfortunately I need a PromQL only solution.
Just use the count_over_time function. For example, the following query returns the number of raw samples over the last 2 minutes for each time series with the name instana_metrics:
count_over_time(instana_metrics[2m])
Note that Prometheus calculates the provided query independently for each point on the graph, i.e. each value on the graph shows the number of raw samples for each matching time series over the 2-minute lookbehind window ending at that point.
If you need just a single value per each matching series over the selected time range in Grafana, then use the following instant query:
count_over_time(instana_metrics[$__range])
See these docs about $__range variable.
See these docs about instant query type in Grafana.
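To make the lookbehind-window semantics concrete, here is a small Python sketch (timestamps and values are made up; real Prometheus evaluates this once per point on the graph, which is why the naive count query returned a series of 1s):

```python
# Hypothetical raw samples for one series: (timestamp_seconds, value) pairs.
samples = [(15, 3051), (40, 5347), (70, 5347), (100, 4224), (125, 4224)]

def count_over_time(samples, now, window_seconds):
    """Number of raw samples inside the lookbehind window ending at `now`,
    which is what count_over_time(series[2m]) computes at each evaluation point."""
    return sum(1 for ts, _ in samples if now - window_seconds < ts <= now)

print(count_over_time(samples, now=130, window_seconds=120))  # 5
```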

Power BI DAX - Throughput Board Time Intelligence Metrics with different time zones

Dealing with a bit of a head scratcher. This is more of a logic based issue rather than actual Power BI code. Hoping someone can help out! Here's the scenario:
| Site | Shift Num | Start Time | End Time | Daily Output |
| ---- | --------- | ---------- | -------- | ------------ |
| A | 1 | 8:00AM | 4:00PM | 10000 |
| B | 1 | 7:00AM | 3:00PM | 12000 |
| B | 2 | 4:00PM | 2:00AM | 7000 |
| C | 1 | 6:00AM | 2:00PM | 5000 |
This table contains the sites as well as their respective shift times. The master table above is part of an effort to capture throughput data from each of the sites. It is connected to tables with a running log of output for each site and shift, like so:
| Site | Shift Number | Output | Timestamp |
| ---- | ------------ | ------ | --------- |
| A | 1 | 2500 | 9:45 AM |
| A | 1 | 4200 | 11:15 AM |
| A | 1 | 5600 | 12:37 PM |
| A | 1 | 7500 | 2:15 PM |
So there is a one-to-many relationship between the master table and these child throughput tables. The goal is to use a gauge chart with the following metrics:
Value: Latest Throughput Value (Latest Output in Child Table)
Maximum Value: Throughput Target for the Day (Shift Target in Master Table)
Target Value: Time-dependent project target
i.e. if we are halfway through Site A's shift 1, we should be at 5000 units: (time passed in shift / total shift time) * shift output
if the shift is currently in non-working hours, then the target value = maximum value
Easy enough, but the problem we are facing is the target value erroring for shifts that cross into the next day (i.e. Site B's shift 2).
The shift times are stored as date-independent time values. Here's the code for the measure to get the target value:
VAR CurrentTime = HOUR(UTCNOW()) * 60 + MINUTE(UTCNOW())
VAR ShiftStart = HOUR(MAX('mtb MasterTableUTC'[ShiftStartTimeUTC])) * 60 + MINUTE(MAX('mtb MasterTableUTC'[ShiftStartTimeUTC]))
VAR ShiftEnd = HOUR(MAX('mtb MasterTableUTC'[ShiftEndTimeUTC])) * 60 + MINUTE(MAX('mtb MasterTableUTC'[ShiftEndTimeUTC]))
VAR ShiftDiff = ShiftEnd - ShiftStart
RETURN
    IF(
        CurrentTime > ShiftEnd || CurrentTime < ShiftStart,
        MAX('mtb MasterTableUTC'[OutputTarget]),
        ((CurrentTime - ShiftStart) / ShiftDiff) * MAX('mtb MasterTableUTC'[OutputTarget])
    )
Basically, if the current time is outside the range of the shift, the target value should equal the total shift target; if it is within the shift time, it is calculated as a ratio of the time passed within the shift. This does not work for shifts that cross midnight, as the shift end time value is technically earlier than the shift start time value. Any ideas on how to modify the measure to account for these shifts?
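One common way to handle the wrap-around (a sketch of the arithmetic, not code from the thread) is to add 24 hours to the end time when it falls before the start time, and to the current time when it falls after midnight, before comparing. In Python terms, with all times as minutes since midnight:

```python
def shift_progress(current_min, start_min, end_min):
    """Fraction of the shift elapsed, or None when outside the shift
    (the None case maps to 'target value = maximum value' in the measure).
    Shifts with end < start are treated as crossing midnight."""
    DAY = 24 * 60
    if end_min < start_min:            # shift crosses midnight
        end_min += DAY
        if current_min < start_min:    # current time is past midnight
            current_min += DAY
    if current_min < start_min or current_min > end_min:
        return None                    # outside the shift
    return (current_min - start_min) / (end_min - start_min)

# Site B shift 2: 4:00 PM -> 2:00 AM; current time 1:00 AM
print(shift_progress(1 * 60, 16 * 60, 2 * 60))  # 0.9
```

The same two adjustments translate directly into extra VARs in the DAX measure before the IF comparison.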

Matching records between two tables

I have the below code to do matching between two different tables. The code only updates the first record as "Matched".
I want to compare each record in the ID field from T1 to see if it is present in T2: e.g. check whether A is present in T2, then go to the next record in T1 and check B, looping until all records in T1 are matched.
Table 1
| ID |
| -- |
| A |
| B |
| C |

Table 2
| ID |
| -- |
| A |
| B |

Expected Matching Results
| ID |
| -- |
| A |
| B |
Any help please
If rs2("ID").Value = rs1("ID").Value Then
    rs2.MoveNext()
    Do While Not rs2.EOF()
        rs1("Matching").Value = "Matched"
        rs1.Update()
        rs2.MoveFirst()
    Loop
End If
Assuming I'm understanding your problem correctly, you just want to update the "Matching" column in T1 if there's a corresponding ID record in T2. Doing this in SQL would be ideal, as Peter said.
If you really want to loop, Peter is correct that you have an endless loop: you exit when you hit the end of the recordset, but at the end of each iteration you are resetting back to the first record.
I don't know why you are looping through rs2 when you have already found a match. Once you know that rs1("ID").Value = rs2("ID").Value, just update the Matching record in rs1 and call rs2.MoveFirst() so you can start back at the top when trying to find a match for the next ID in T1. No loop is required here.
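The corrected logic amounts to one membership test per T1 row. A Python sketch with stand-in data (the dict rows play the role of the recordsets):

```python
# Stand-ins for the two recordsets: T1 rows carry a "Matching" flag.
t1 = [{"ID": "A", "Matching": None},
      {"ID": "B", "Matching": None},
      {"ID": "C", "Matching": None}]
t2_ids = {"A", "B"}   # a set lookup replaces the inner rs2 loop entirely

for row in t1:
    if row["ID"] in t2_ids:      # match found: flag it and move on
        row["Matching"] = "Matched"

print([r["ID"] for r in t1 if r["Matching"] == "Matched"])  # ['A', 'B']
```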

Calculate average of dates in sqlite3

I know there are many topics regarding this question, but none actually helped me solve my problem. I am still fairly new when it comes to databases, and I came across this problem.
I have a table named tests which contains two columns: id and date.
I want to calculate the average difference in days between a couple of values.
Say select date from tests where id=1, which will provide me with a list of dates. I want to calculate the average difference between those dates.
Table "tests"
1|2018-03-13
1|2018-03-01
2|2018-03-13
2|2018-03-01
3|2018-03-13
3|2018-03-01
1|2018-03-17
2|2018-03-17
3|2018-03-17
Select date from tests where id=1
2018-03-13
2018-03-01
2018-03-17
Now I am looking to calculate the average difference in days between those three dates.
Can really use some help, thank you!
Edit:
Sorry for being unclear, I'll clarify my question.
So student one had a test on 01/03, then on 13/03, and then on 17/03. What I want to calculate is the average difference in days from test to test, so:
The diff between the first and second is 12 days; the diff between the second and third is 4 days.
(12 + 4) / 2 = 8, since we have two gaps.
I am looking to calculate the average difference in days between those three dates.
And by average difference we mean "take the average of the absolute value of the difference between all pairs of dates". That's (12 + 16 + 4) / 3, or 10.6667.
We need all combinations of dates. For this we need a self-join with no repeats. That's accomplished by picking a field and using on with a < or >.
select t1.date, t2.date
from tests as t1
join tests as t2 on t1.id = t2.id and t1.date < t2.date
where t1.id = 1;
2018-03-01|2018-03-13
2018-03-01|2018-03-17
2018-03-13|2018-03-17
Now that we have all combinations, we can take the difference. But not by simply subtracting the dates; SQLite doesn't support that. First, convert them to Julian Days.
sqlite> select julianday(t1.date), julianday(t2.date) from tests as t1 join tests as t2 on t1.id = t2.id and t1.date < t2.date where t1.id = 1;
2458178.5|2458190.5
2458178.5|2458194.5
2458190.5|2458194.5
Now that we have numbers we can take the absolute value of the difference and do an average.
select avg(abs(julianday(t1.date) - julianday(t2.date)))
from tests as t1
join tests as t2 on t1.id = t2.id and t1.date < t2.date
where t1.id = 1;
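Assuming the sample data from the question, the all-pairs average can be verified from Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tests (id INTEGER, date TEXT);
INSERT INTO tests VALUES
  (1,'2018-03-13'),(1,'2018-03-01'),(1,'2018-03-17'),
  (2,'2018-03-13'),(2,'2018-03-01'),(2,'2018-03-17');
""")
avg_diff = conn.execute("""
  SELECT avg(abs(julianday(t1.date) - julianday(t2.date)))
  FROM tests AS t1
  JOIN tests AS t2 ON t1.id = t2.id AND t1.date < t2.date
  WHERE t1.id = 1
""").fetchone()[0]
print(avg_diff)  # (12 + 16 + 4) / 3, about 10.67
```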
UPDATE
What I want to calculate is the avg difference in days between test to test, so: Diff between first to second is 12 days. Diff between second to third is 4 days. Then (12+4)/2=8 which should be the result.
For this twist on the problem you want to compare each row with the next one. You want a table like this:
2018-03-01|2018-03-13
2018-03-13|2018-03-17
Other databases have window functions such as lag to accomplish this. SQLite doesn't (prior to version 3.25). Again, we'll use a self-join, but we have to do it per row. This is a correlated subquery.
select t1.date as date, (
select t2.date
from tests t2
where t1.id = t2.id and t2.date > t1.date
order by t2.date
limit 1
) as next
from tests t1
where id = 1
and next is not null
The subquery-as-column finds the next date for each row.
This is a bit unwieldy, so let's turn it into a view. Then we can use it as a table. Just take out the where id = 1 so it's generally useful.
create view test_and_next as
select t1.id, t1.date as date, (
select t2.date
from tests t2
where t1.id = t2.id and t2.date > t1.date
order by t2.date
limit 1
) as next
from tests t1
where next is not null
Now we can treat test_and_next as a table with the columns id, date, and next. Then it's the same as before: turn them into Julian Days, subtract, and take the average.
select avg(julianday(next) - julianday(date))
from test_and_next
where id = 1;
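The consecutive-gap average can likewise be checked with Python's sqlite3. Here the correlated subquery is inlined as a derived table instead of a view; note that avg() already skips the row whose next is NULL, so no extra filter is needed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tests (id INTEGER, date TEXT);
INSERT INTO tests VALUES (1,'2018-03-13'),(1,'2018-03-01'),(1,'2018-03-17');
""")
avg_gap = conn.execute("""
  SELECT avg(julianday(nxt) - julianday(date)) FROM (
    SELECT t1.date AS date,
           (SELECT t2.date FROM tests t2
            WHERE t2.id = t1.id AND t2.date > t1.date
            ORDER BY t2.date LIMIT 1) AS nxt   -- next test date, or NULL for the last row
    FROM tests t1
    WHERE t1.id = 1
  )
""").fetchone()[0]
print(avg_gap)  # (12 + 4) / 2 = 8.0
```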
Note that this will go sideways when you have two rows with the same date: there's no way for SQL to know which is the "next" one. For example, if there were two tests for ID 1 on "2018-03-13" they'll both choose "2018-03-17" as the "next" one.
2018-03-01|2018-03-13
2018-03-13|2018-03-17
2018-03-13|2018-03-17
I'm not sure how to fix this.
