sum hours and minutes in 2 tables and display results in HH:MM - sqlite

I have 2 tables. The first (master) table has columns on the left in HH:MM format, built from the hour/minute columns on the right, which hold plain numbers:

ACType  A      B      C      Ahr  Amin  Bhr  Bmin  Chr  Cmin
A320    12:34  85:45  07:23  12   34    85   45    7    23
B777    20:00  30:00  10:00  20   0     30   0     10   0

The second table has only hour and minute columns:

ACType  Bhr  Bmin  Chr  Cmin
A320    10   20    46   31

How can I get this final result, with B and C summed across both tables and rolled up into HH:MM?

ACType  A      B      C      Ahr  Amin  Bhr  Bmin  Chr  Cmin
A320    12:34  96:05  53:54  12   34    95   65    53   54
B777    20:00  30:00  10:00  20   0     30   0     10   0

SELECT
    first.ACType,
    first.A,
    (x.Bhr + x.Bmin / 60) || ':' || printf('%02d', x.Bmin % 60) AS B,
    (x.Chr + x.Cmin / 60) || ':' || printf('%02d', x.Cmin % 60) AS C,
    first.Ahr,
    first.Amin,
    x.Bhr,
    x.Bmin,
    x.Chr,
    x.Cmin
FROM first
LEFT JOIN (
    SELECT
        ACType,
        SUM(Bhr)  AS Bhr,
        SUM(Bmin) AS Bmin,
        SUM(Chr)  AS Chr,
        SUM(Cmin) AS Cmin
    FROM (
        SELECT ACType, Bhr, Bmin, Chr, Cmin FROM first
        UNION ALL
        SELECT ACType, Bhr, Bmin, Chr, Cmin FROM second
    )
    GROUP BY ACType
) AS x ON first.ACType = x.ACType

select null as A, *
from second_table
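As a sanity check of the minute roll-up (Bhr + Bmin/60 for the hours, Bmin % 60 for the minutes), here is a minimal sketch run through SQLite from Python, reduced to just the B columns; the schema here is a stand-in for the question's tables:

```python
import sqlite3

# Stand-in tables with only the B hour/minute columns from the question.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE first  (ACType TEXT, Bhr INT, Bmin INT);
CREATE TABLE second (ACType TEXT, Bhr INT, Bmin INT);
INSERT INTO first  VALUES ('A320', 85, 45), ('B777', 30, 0);
INSERT INTO second VALUES ('A320', 10, 20);
""")
rows = con.execute("""
SELECT ACType,
       -- roll excess minutes into hours, zero-pad the remaining minutes
       (SUM(Bhr) + SUM(Bmin) / 60) || ':' || printf('%02d', SUM(Bmin) % 60) AS B
FROM (SELECT ACType, Bhr, Bmin FROM first
      UNION ALL
      SELECT ACType, Bhr, Bmin FROM second)
GROUP BY ACType
ORDER BY ACType
""").fetchall()
print(rows)  # [('A320', '96:05'), ('B777', '30:00')]
```

Integer division (`/ 60`) is what makes the hour carry work here; SQLite divides two integers as integers.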


How to create a window of arbitrary size in Kusto?

Using prev() function I can access previous rows individually.
mytable
| sort by Time asc
| extend mx = max_of(prev(Value, 1), prev(Value, 2), prev(Value, 3))
How to define a window to aggregate over in more generic way? Say I need maximum of 100 values in previous rows. How to write a query that does not require repeating prev() 100 times?
This can be achieved by combining scan and series_stats_dynamic():
scan is used to build an array of the last x values, per record.
series_stats_dynamic() is used to get the max value of each array.
// Data sample generation. Not part of the solution
let mytable = materialize(range i from 1 to 15 step 1 | extend Time = ago(1d*rand()), Value = toint(rand(100)));
// Solution starts here
let window_size = 3; // >1
mytable
| order by Time asc
| scan declare (last_x_vals:dynamic)
with
(
step s1 : true => last_x_vals = array_concat(array_slice(s1.last_x_vals, -window_size + 1, -1), pack_array(Value));
)
| extend toint(series_stats_dynamic(last_x_vals).max)
i   Time                          Value  last_x_vals  max
5   2022-06-10T11:25:49.9321294Z  45     [45]         45
14  2022-06-10T11:54:13.3729674Z  82     [45,82]      82
2   2022-06-10T13:25:40.9832745Z  44     [45,82,44]   82
1   2022-06-10T17:38:28.3230397Z  24     [82,44,24]   82
7   2022-06-10T18:29:33.926463Z   17     [44,24,17]   44
15  2022-06-10T19:54:33.8253844Z  9      [24,17,9]    24
3   2022-06-10T20:17:46.1347592Z  43     [17,9,43]    43
12  2022-06-11T00:02:55.5315197Z  94     [9,43,94]    94
9   2022-06-11T00:11:18.5924511Z  61     [43,94,61]   94
11  2022-06-11T00:39:40.6858444Z  38     [94,61,38]   94
4   2022-06-11T03:54:59.418534Z   84     [61,38,84]   84
10  2022-06-11T05:55:38.2904242Z  6      [38,84,6]    84
6   2022-06-11T07:25:43.3977923Z  36     [84,6,36]    84
13  2022-06-11T09:36:08.7904844Z  28     [6,36,28]    36
8   2022-06-11T09:51:45.2225391Z  73     [36,28,73]   73
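The last-N-values logic that the scan step builds can be sketched outside Kusto with a bounded buffer; this is an illustrative Python equivalent, not KQL:

```python
from collections import deque

def rolling_max(values, window_size):
    """Max over the current value and up to window_size - 1 preceding values."""
    last = deque(maxlen=window_size)  # plays the role of last_x_vals in the scan
    out = []
    for v in values:
        last.append(v)       # append current value, oldest falls off the front
        out.append(max(last))
    return out

print(rolling_max([45, 82, 44, 24, 17, 9, 43], 3))
# [45, 82, 82, 82, 44, 24, 43]
```

The first few rows see fewer than window_size values, matching the shorter arrays at the top of the table above.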

Calculate euclidean distance between groups in r

I have a tracking dataset with 3 time moments (36, 76, 96) for a match.
My requirement is to calculate the distances between a given player and the opponents.
Dataframe contains following 5 columns
- time_id (second or instant)
- player ( identifier for player)
- x (x position)
- y (y position)
- team (home or away)
As an example, for home player 26 I need to calculate the distances to
all away players ("12", "17", "24", "37", "69", "77") at the
3 distinct time_id values (36, 76, 96).
Here we can see df data
https://pasteboard.co/ICiyyFB.png
Here it is the link to download sample rds with df
https://1drv.ms/u/s!Am7buNMZi-gwgeBpEyU0Fl9ucem-bw?e=oSTMhx
library(tidyverse)
dat <- readRDS(file = "dat.rds")
# Given home player with id 26
# I need to calculate on each time_id the euclidean distance
# with all away players on each time_id
p36_home <- dat %>% filter(player ==26)
# all away players
all_away <- dat %>% filter(team =='away')
# I know I can calculate it if I spread the positions across columns,
# but that is not elegant and it requires grouping by time_id
# mutate(dist = round(sqrt((x1-x2)^2 + (y1-y2)^2), 2))
# below distances row by row should be calculated
# time_id , homePlayer, awayPlayer , distance
#
# 36 , 26 , 12 , x
# 36 , 26 , 17 , x
# 36 , 26 , 24 , x
# 36 , 26 , 37 , x
# 36 , 26 , 69 , x
# 36 , 26 , 77 , x
#
# 76 , 26 , 12 , x
# 76 , 26 , 17 , x
# 76 , 26 , 24 , x
# 76 , 26 , 37 , x
# 76 , 26 , 69 , x
# 76 , 26 , 77 , x
#
# 96 , 26 , 12 , x
# 96 , 26 , 17 , x
# 96 , 26 , 24 , x
# 96 , 26 , 37 , x
# 96 , 26 , 69 , x
# 96 , 26 , 77 , x
This solution should work for you. I simply joined the two dataframes you provided and used your distance calculation, then selected the columns to get the desired result.
test <- left_join(p36_home, all_away, by = "time_id")
# left_join suffixes the duplicated column names: .x = home player, .y = away player
test$dist <- round(sqrt((test$x.x - test$x.y)^2 + (test$y.x - test$y.y)^2), 2)
test <- test[, c(1, 2, 6, 10)]  # keep time_id, the two players and the distance
names(test) <- c("time_id", "homePlayer", "awayPlayer", "distance")
test
The result looks something like this:
time_id homePlayer awayPlayer distance
36 26 37 26.43
36 26 17 28.55
36 26 24 20.44
36 26 69 24.92
36 26 77 11.22
36 26 12 22.65
.
.
.
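The join-then-compute pattern in the answer is language-agnostic; here is a minimal Python sketch of the same idea, with coordinates invented purely for illustration (a point 3 cm over and 4 cm up is distance 5 away):

```python
import math

# Hypothetical positions: time_id -> (x, y) for home player 26,
# plus one row per away player and time_id. All values are made up.
home_pos = {36: (10.0, 12.0), 76: (20.0, 5.0)}
away_rows = [(36, "12", 13.0, 16.0), (76, "17", 23.0, 9.0)]

def distances(home, rows):
    """Join on time_id, then apply the Euclidean formula row by row."""
    out = []
    for t, player, ax, ay in rows:
        hx, hy = home[t]
        out.append((t, player, round(math.hypot(hx - ax, hy - ay), 2)))
    return out

print(distances(home_pos, away_rows))  # [(36, '12', 5.0), (76, '17', 5.0)]
```

`math.hypot(dx, dy)` is just `sqrt(dx^2 + dy^2)`, the same expression used in the R answer.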

Function with a for loop to create a column with values 1:n conditioned by intervals matched by another column

I have a data frame like the following
my_df = data.frame(x = runif(100, min = 0, max = 60),
                   y = runif(100, min = 0, max = 60)) # x and y in cm
With this I need a new column with values from 1 to 36 that index the 10 cm x 10 cm cells covered by x and y. For example, if 0<=x<=10 & 0<=y<=10 the value is 1, if 10<=x<=20 & 0<=y<=10 it is 2, and so on up to 6; then 0<=x<=10 & 10<=y<=20 starts at 7, up to 12, etc. I tried to make a function with an if chain repeating the intervals for x 6 times, increasing the interval for y by 10 on each iteration. Here is the function
#my miscarried function 'zones'
>zones= function(x,y) {
i=vector(length = 6)
n=vector(length = 6)
z=vector(length = 36)
i[1]=0
z[1]=0
n[1]=1
for (t in 1:6) {
if (0<=x & x<10 & i[t]<=y & y<i[t]+10) { z[t] = n[t]} else
if (10<=x & x<20 & i[t]<=y & y<i[t]+10) {z[t]=n[t]+1} else
if (20<=x & x<30 & i[t]<=y & y<i[t]+10) {z[t]=n[t]+2} else
if (30<=x & x<40 & i[t]<=y & y<i[t]+10) {z[t]=n[t]+3} else
if (40<=x & x<50 & i[t]<=y & y<i[t]+10) {z[t]=n[t]+4}else
if (50<=x & x<=60 & i[t]<=y & y<i[t]+10) {z[t]=n[t]+5}
else {i[t+1]=i[t]+10
n[t+1]=n[t]+6}
}
return(z)
}
>xy$z=zones(x=xy$x,y=xy$y)
and I got
There were 31 warnings (use warnings() to see them)
>xy$z
[1] 0 0 0 0 25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Please, help me before I die alone!
I think this does the trick.
a <- cut(my_df$x, (0:6) * 10)
b <- cut(my_df$y, (0:6) * 10)
z <- interaction(a, b)
levels(z)
[1] "(0,10].(0,10]" "(10,20].(0,10]" "(20,30].(0,10]" "(30,40].(0,10]"
[5] "(40,50].(0,10]" "(50,60].(0,10]" "(0,10].(10,20]" "(10,20].(10,20]"
[9] "(20,30].(10,20]" "(30,40].(10,20]" "(40,50].(10,20]" "(50,60].(10,20]"
[13] "(0,10].(20,30]" "(10,20].(20,30]" "(20,30].(20,30]" "(30,40].(20,30]"
[17] "(40,50].(20,30]" "(50,60].(20,30]" "(0,10].(30,40]" "(10,20].(30,40]"
[21] "(20,30].(30,40]" "(30,40].(30,40]" "(40,50].(30,40]" "(50,60].(30,40]"
[25] "(0,10].(40,50]" "(10,20].(40,50]" "(20,30].(40,50]" "(30,40].(40,50]"
[29] "(40,50].(40,50]" "(50,60].(40,50]" "(0,10].(50,60]" "(10,20].(50,60]"
[33] "(20,30].(50,60]" "(30,40].(50,60]" "(40,50].(50,60]" "(50,60].(50,60]"
If these levels aren't to your taste, relabel them:
levels(z) <- 1:36
Is this what you're after? The resulting numbers are in column res:
# Get bin index for x values and y values
my_df$bin1 <- as.numeric(cut(my_df$x, breaks = seq(0, 60, by = 10)));
my_df$bin2 <- as.numeric(cut(my_df$y, breaks = seq(0, 60, by = 10)));
# Combine bin indices: x bins run 1-6 within each y bin, so each y bin adds 6
my_df$res <- my_df$bin1 + (my_df$bin2 - 1) * 6;
> head(my_df)
          x         y bin1 bin2 res
1 49.887499 47.302849    5    5  29
2 43.169773 50.931357    5    6  35
3 10.626466 43.673533    2    5  26
4 43.401454  3.397009    5    1   5
5  7.080386 22.870539    1    3  13
6 39.094724 24.672907    4    3  16
I've broken down the steps for illustration purposes; you probably don't want to keep the intermediate columns bin1 and bin2.
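The mapping from a pair of 10 cm bins to a single 1-to-36 index is pure integer arithmetic, which can be sketched directly (a minimal Python illustration, not part of either R answer):

```python
def zone(x, y, cell=10, ncols=6):
    """Map a point in [0, 60) x [0, 60) to a cell id 1..36.

    The column index (along x) moves fastest, 1..6; each row of cells
    along y adds another 6, matching the numbering in the question.
    """
    col = int(x // cell)  # 0..5
    row = int(y // cell)  # 0..5
    return row * ncols + col + 1

print(zone(5, 5), zone(15, 5), zone(5, 15), zone(59, 59))  # 1 2 7 36
```

This is the closed-form version of the cut/interaction and join approaches: no lookup table is needed when the grid is regular.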
We probably need a table showing the relationship between x, y, and z. After that, we can define a function to do the join.
The solution is related to and inspired by this post (R dplyr join by range or virtual column). You may find the other solutions there useful as well.
# Set seed for reproducibility
set.seed(1)
# Create example data frame
my_df <- data.frame(x=runif(100, min = 0,max = 60),
y=runif(100, min = 0,max = 60))
# Load the dplyr package
library(dplyr)
# Create a table to show the relationship between x, y, and z
r <- expand.grid(x_from = seq(0, 50, 10), y_from = seq(0, 50, 10)) %>%
mutate(x_to = x_from + 10, y_to = y_from + 10, z = 1:n())
# Define a function for dynamic join
dynamic_join <- function(d, r){
if (!("z" %in% colnames(d))){
d[["z"]] <- NA_integer_
}
d <- d %>%
mutate(z = ifelse(x >= r$x_from & x < r$x_to & y >= r$y_from & y < r$y_to,
r$z, z))
return(d)
}
re_dynamic_join <- function(d, r){
r_list <- split(r, r$z)
for (i in 1:length(r_list)){
d <- dynamic_join(d, r_list[[i]])
}
return(d)
}
# Apply the function
re_dynamic_join(my_df, r)
x y z
1 15.930520 39.2834357 20
2 22.327434 21.1918363 15
3 34.371202 16.2156088 10
4 54.492467 59.5610437 36
5 12.100916 38.0095959 20
6 53.903381 12.7924881 12
7 56.680516 7.7623409 6
8 39.647868 28.6870821 16
9 37.746843 55.4444682 34
10 3.707176 35.9256580 19
11 12.358474 58.5702417 32
12 10.593405 43.9075507 26
13 41.221371 21.4036147 17
14 23.046223 25.8884214 15
15 46.190485 8.8926936 5
16 29.861955 0.7846545 3
17 43.057110 42.9339640 29
18 59.514366 6.1910541 6
19 22.802111 26.7770609 15
20 46.646713 38.4060627 23
21 56.082314 59.5103172 36
22 12.728551 29.7356147 14
23 39.100426 29.0609715 16
24 7.533306 10.4065401 7
25 16.033240 45.2892567 26
26 23.166846 27.2337294 15
27 0.803420 30.6701870 19
28 22.943277 12.4527068 9
29 52.181451 13.7194886 12
30 20.420940 35.7427198 21
31 28.924807 34.4923319 21
32 35.973950 4.6238628 4
33 29.612478 2.1324348 3
34 11.173056 38.5677295 20
35 49.642399 55.7169120 35
36 40.108004 35.8855453 23
37 47.654392 33.6540449 23
38 6.476618 31.5616634 19
39 43.422657 59.1057134 35
40 24.676466 30.4585093 21
41 49.256778 40.9672847 29
42 38.823612 36.0924731 22
43 46.975966 14.3321207 11
44 33.182179 15.4899556 10
45 31.783175 43.7585774 28
46 47.361374 27.1542499 17
47 1.399872 10.5076061 7
48 28.633804 44.8018962 27
49 43.938824 6.2992584 5
50 41.563893 51.8726969 35
51 28.657177 36.8786983 21
52 51.672569 33.4295723 24
53 26.285826 19.7266391 9
54 14.687837 27.1878867 14
55 4.240743 30.0264584 19
56 5.967970 10.8519817 7
57 18.976302 31.7778362 20
58 31.118056 4.5165447 4
59 39.720305 16.6653560 10
60 24.409811 12.7619712 9
61 54.772555 17.0874289 12
62 17.616202 53.7056462 32
63 27.543944 26.7741194 15
64 19.943680 46.7990934 26
65 39.052228 52.8371421 34
66 15.481007 24.7874526 14
67 28.712715 3.8285088 3
68 45.978640 20.1292495 17
69 5.054815 43.4235568 25
70 52.519280 20.2569200 18
71 20.344376 37.8248473 21
72 50.366421 50.4368732 36
73 20.801009 51.3678999 33
74 20.026496 23.4815569 15
75 28.581075 22.8296331 15
76 53.531900 53.7267256 36
77 51.860368 38.6589458 24
78 23.399373 44.4647189 27
79 46.639242 36.3182068 23
80 57.637080 54.1848967 36
81 26.079569 17.6238093 9
82 42.750881 11.4756066 11
83 23.999662 53.1870566 33
84 19.521129 30.2003691 20
85 45.425229 52.6234526 35
86 12.161535 11.3516173 8
87 42.667273 45.4861831 29
88 7.301515 43.4699336 25
89 14.729311 56.6234891 32
90 8.598263 32.8587952 19
91 14.377765 42.7046321 26
92 3.536063 23.3343060 13
93 38.537296 6.0523876 4
94 52.576153 55.6381253 36
95 46.734881 16.9939500 11
96 47.838530 35.4343895 23
97 27.316467 6.6216363 3
98 24.605045 50.4304219 33
99 48.652215 19.0778211 11
100 36.295997 46.9710802 28

R- DataTable count the frequency of categorical variables and display each variable as column

I created a dummy data table called DT. I am trying to calculate the sum of Capacity (numerical) and count the frequencies of Code and State (categorical) within each ID. For the end result, I want to display the sum of Capacity, the frequency of A, B, C, ... and of each distinct State within each unique ID, so the columns will be ID, total.Cap, A, B, C, ..., AZ, CA, ...
DT <- data.table(ID = rep(1:500, 100),
                 Capacity = sample(1:1000, size = 50000, replace = T),
                 Code = sample(LETTERS[1:26], 50000, replace = T),
                 State = rep(c("AZ","CA","PA","NY","WA","SD"), length.out = 50000))
The format of the result would look like the table below:

ID  total.Cap  A   B   C   ...  AZ  CA  ...
1   28123      10  25  70  ...  29      ...
2   32182      20  42  50  ...  30      ...
3   ...

I have tried to use ddply, melt and dcast, but the result does not come out as I expected. Could anyone give me some hints about how to structure a table like this? Thank you!
You can do this by constructing the totals, state counts, and code counts with three separate data.table statements then joining them. On states and codes, you can use dcast to turn it into one column per state/code with the counts within each.
library(data.table)
totals <- DT[, list(total.Cap = sum(Capacity)), by = "ID"]
states <- dcast(DT, ID ~ State)
codes <- dcast(DT, ID ~ Code)
You can then join the three tables together:
result <- setkey(totals, "ID")[states, ][codes, ]
This results in a table something like:
ID total.Cap AZ CA NY PA SD WA A B C D E F G H I J K L M N O P Q R S T U
1: 1 287526 200 0 0 200 0 200 12 18 24 42 12 30 30 18 6 36 24 6 18 24 30 24 6 24 36 18 30
2: 2 293838 0 200 200 0 200 0 18 24 42 30 30 12 24 6 24 12 48 42 18 18 42 24 24 24 12 18 24
3: 3 279450 200 0 0 200 0 200 24 18 24 6 12 12 18 12 12 30 24 18 54 30 6 42 18 30 24 24 18
4: 4 298200 0 200 200 0 200 0 30 30 36 30 36 24 24 18 24 18 30 30 30 24 6 30 18 6 18 18 18
5: 5 294084 200 0 0 200 0 200 18 6 24 12 42 12 18 42 18 18 18 18 24 24 30 18 30 24 6 30 24
Note that if you have many columns like State and Code, you can do all of them at once by melting them first:
# replace State and Code with the categorical variables you want
melted <- melt(DT, measure.vars = c("State", "Code"))
state_codes <- dcast(melted, ID ~ value)
setkey(totals, "ID")[state_codes, ]
Note you still need to join with the totals, and that this will not preserve the order of columns like "states then codes" or vice versa.
This creates the total.Cap, Code, and State summary elements in three separate data tables then merges them by ID:
# Storing intermediate pieces
total_cap <- DT[, j = list(total.Cap = sum(Capacity)), by = ID]
code <- dcast(DT[, .N, by = c("ID", "Code")], ID ~ Code, fill = 0)
state <- dcast(DT[, .N, by = c("ID", "State")], ID ~ State, fill = 0)
mytable <- merge(total_cap, code, by = "ID")
mytable <- merge(mytable, state, by = "ID")
mytable
# As a one-liner
mytable <- merge(
merge(DT[, j = list(total.Cap = sum(Capacity)), by = ID],
dcast(DT[, .N, by = c("ID", "Code")], ID ~ Code, fill = 0),
by = "ID"),
dcast(DT[, .N, by = c("ID", "State")], ID ~ State, fill = 0),
by = "ID")
mytable
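The reshape both answers perform (a Capacity sum plus one count column per category level, keyed by ID) can be sketched with plain Python dicts, independent of data.table; the rows below are a tiny invented stand-in for DT:

```python
from collections import defaultdict

# Tiny stand-in for DT: (ID, Capacity, Code, State) tuples.
rows = [
    (1, 100, "A", "AZ"), (1, 250, "B", "AZ"), (1, 50, "A", "CA"),
    (2, 300, "C", "PA"),
]

summary = defaultdict(lambda: {"total.Cap": 0,
                               "codes": defaultdict(int),
                               "states": defaultdict(int)})
for id_, cap, code, state in rows:
    s = summary[id_]
    s["total.Cap"] += cap    # like sum(Capacity) by ID
    s["codes"][code] += 1    # one count per Code level, like dcast(DT, ID ~ Code)
    s["states"][state] += 1  # one count per State level, like dcast(DT, ID ~ State)

print(summary[1]["total.Cap"], dict(summary[1]["codes"]))
# 400 {'A': 2, 'B': 1}
```

dcast does the extra step of turning each counter key into its own column, with `fill = 0` supplying the zeros for levels an ID never sees.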

How to display data rangewise?

I want the data fetched from the database to appear like this:
Total_marks | No of Students
10-20       | 2
20-30       | 1
30-40       | 3
and so on.
Sample data:
barcode(student id) | Total_marks
200056 | 70
200071 | 51
200086 | 40
200301 | 56
200317 | 73
200316 | 35
200217 | 42
200104 | 80
200015 | 63
I tried :
SELECT
count(*) as no_of_students,
CASE
WHEN Total_marks BETWEEN 10 and 20 THEN '1'
WHEN Total_marks BETWEEN 20 and 30 THEN '2'
WHEN Total_marks BETWEEN 30 and 40 THEN '3'
WHEN Total_marks BETWEEN 40 and 50 THEN '4'
WHEN Total_marks BETWEEN 50 and 60 THEN '5'
WHEN Total_marks BETWEEN 60 and 70 THEN '6'
WHEN Total_marks BETWEEN 70 and 80 THEN '7'
WHEN Total_marks BETWEEN 80 and 90 THEN '8'
WHEN Total_marks BETWEEN 90 and 100 THEN '9'
END AS intervals
FROM
[database]
WHERE
0 - 10 = '0' and 10 - 20 = '1' and 20 - 30 = '2' and 30 - 40 = '3' and 40 - 50 = '4' and 50 - 60 = '5' and 60 - 70 = '6' and 70 -80 = '7' and 80 - 90 = '8' and 90 - 100 = '9' GROUP BY Total_marks
But I just get the column titles and no data. How do I formulate the query correctly? I also want this displayed as a chart in ASP.NET, where the code would be:
{
    connection.Open();
    SqlCommand cmd = new SqlCommand("Query", connection);
    SqlDataAdapter da = new SqlDataAdapter(cmd);  // da.Fill executes the command itself
    DataSet ds = new DataSet();
    da.Fill(ds);
    DataTable ChartData = ds.Tables[0];
    // storing total rows count to loop on each record
    string[] XPointMember = new string[ChartData.Rows.Count];
    Chart1.ChartAreas[0].AxisX.IsStartedFromZero = true;
    decimal[] YPointMember = new decimal[ChartData.Rows.Count];
    int totalrows = ChartData.Rows.Count;
    if (totalrows > 0)
    {
        for (int count = 0; count < ChartData.Rows.Count; count++)
        {
            // storing values for X axis (the interval labels)
            XPointMember[count] = ChartData.Rows[count]["intervals"].ToString();
            // storing values for Y axis (the student counts)
            YPointMember[count] = Convert.ToDecimal(ChartData.Rows[count]["no_of_students"]);
        }
    }
    connection.Close();  // close once, outside the loop
}
Try This
create table StudentData
(
StudentID INT,
Total_Marks INT
)
insert into StudentData values(1,10)
insert into StudentData values(2,30)
insert into StudentData values(3,10)
insert into StudentData values(4,50)
insert into StudentData values(5,50)
insert into StudentData values(6,70)
insert into StudentData values(7,80)
insert into StudentData values(8,70)
insert into StudentData values(9,80)
insert into StudentData values(10,80)
insert into StudentData values(11,90)
insert into StudentData values(12,100)
insert into StudentData values(13,40)
SELECT t.Intervals as [Intervals], count(*) as [NoOfStudents]
FROM (
SELECT CASE
WHEN Total_marks BETWEEN 10 and 20 THEN '10-20'
WHEN Total_marks BETWEEN 20 and 30 THEN '20-30'
WHEN Total_marks BETWEEN 30 and 40 THEN '30-40'
WHEN Total_marks BETWEEN 40 and 50 THEN '40-50'
WHEN Total_marks BETWEEN 50 and 60 THEN '50-60'
WHEN Total_marks BETWEEN 60 and 70 THEN '60-70'
WHEN Total_marks BETWEEN 70 and 80 THEN '70-80'
WHEN Total_marks BETWEEN 80 and 90 THEN '80-90'
WHEN Total_marks BETWEEN 90 and 100 THEN '90-100'
end as Intervals
from StudentData) t
group by t.Intervals
Output:
Intervals NoOfStudents
10-20 2
20-30 1
30-40 1
40-50 2
60-70 2
70-80 3
80-90 1
90-100 1
You need to use CASE to compute the intervals and then GROUP BY that interval to count the students who fall in each one.
Check this query; I hope it helps you.
Changed: Used table variable to store list of intervals and then used join to get the required output.
Declare @Intervals table (Interval varchar(10))
Insert into @Intervals values('10-20'),('21-30'),('31-40'),('41-50'),('51-60'),('61-70'),('71-80'),('81-90'),('91-100')
SELECT i.Interval as [Intervals], ISNULL(count(t.Intervals),0) as [NoOfStudents]
FROM (
SELECT CASE
WHEN Total_marks BETWEEN 10 and 20 THEN '10-20'
WHEN Total_marks BETWEEN 21 and 30 THEN '21-30'
WHEN Total_marks BETWEEN 31 and 40 THEN '31-40'
WHEN Total_marks BETWEEN 41 and 50 THEN '41-50'
WHEN Total_marks BETWEEN 51 and 60 THEN '51-60'
WHEN Total_marks BETWEEN 61 and 70 THEN '61-70'
WHEN Total_marks BETWEEN 71 and 80 THEN '71-80'
WHEN Total_marks BETWEEN 81 and 90 THEN '81-90'
WHEN Total_marks BETWEEN 91 and 100 THEN '91-100'
end as Intervals
from #yourtable) t
right join @Intervals i on i.Interval = t.Intervals
group by i.Interval
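When the buckets are a fixed width, the interval label can also be derived arithmetically instead of spelling out one CASE branch per range. A minimal sketch using SQLite from Python (column aliases follow the question; the sample marks are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE StudentData (StudentID INT, Total_marks INT)")
con.executemany("INSERT INTO StudentData VALUES (?, ?)",
                [(1, 15), (2, 18), (3, 27), (4, 35), (5, 38), (6, 31)])

# Integer-divide each mark by 10 to get its bucket, then build the label
# from the bucket's lower and upper bounds and count per bucket.
rows = con.execute("""
    SELECT ((Total_marks / 10) * 10) || '-' || ((Total_marks / 10) * 10 + 10)
               AS Intervals,
           COUNT(*) AS NoOfStudents
    FROM StudentData
    GROUP BY Total_marks / 10
    ORDER BY Total_marks / 10
""").fetchall()
print(rows)  # [('10-20', 2), ('20-30', 1), ('30-40', 3)]
```

Note one boundary difference: this puts marks that are exact multiples of 10 in the upper bucket (20 goes to 20-30), whereas the overlapping BETWEEN branches above assign them to the lower one.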
