Count number of consecutive occurrences of values in a table in SQL Server

I have a table like this:
ID Name
1 A
2 A
3 A
4 B
5 B
6 C
7 C
8 B
9 A
10 C
I want an incrementing frequency count for each Name, like this:
ID Name Frequency
1 A 1
2 A 2
3 A 3
4 B 1
5 B 2
6 C 1
7 C 2
8 B 3
9 A 4
10 C 3
I need an MS SQL query to calculate the Frequency column.

Related

How to add a column with repeating but changing sequence?

I'm trying to add a column with a repeating sequence, but one that changes for each group. In the example data, the group is the id column.
data <- tibble::expand_grid(id = 1:12, condition = c("a", "b", "c"))
data
id condition
1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
3 c
... and so on
I'd like to add a column called order to repeat various combinations like 1 2 3 2 3 1 3 1 2 1 3 2 2 1 3 3 2 1 for each id.
In the end, the desired output will look like this
id condition order
1 a 1
1 b 2
1 c 3
2 a 2
2 b 3
2 c 1
3 a 3
3 b 1
3 c 2
... and so on
I'm looking for a simple mutate solution or base R solution. I tried generating a list of combinations but I'm not sure how to create a variable from that.
You can use perms from package pracma to generate all permutations, e.g.,
library(dplyr)  # provides %>%
data %>%
  cbind(order = c(t(pracma::perms(1:3))))  # 18 values, recycled over the 36 rows
which gives
id condition order
1 1 a 3
2 1 b 2
3 1 c 1
4 2 a 3
5 2 b 1
6 2 c 2
7 3 a 2
8 3 b 3
9 3 c 1
10 4 a 2
11 4 b 1
12 4 c 3
13 5 a 1
14 5 b 2
15 5 c 3
16 6 a 1
17 6 b 3
18 6 c 2
19 7 a 3
20 7 b 2
21 7 c 1
22 8 a 3
23 8 b 1
24 8 c 2
25 9 a 2
26 9 b 3
27 9 c 1
28 10 a 2
29 10 b 1
30 10 c 3
31 11 a 1
32 11 b 2
33 11 c 3
34 12 a 1
35 12 b 3
36 12 c 2
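If the order column has to follow the exact sequence listed in the question (1 2 3, 2 3 1, 3 1 2, ...) rather than the order in which perms() happens to return the permutations, one option is to write the permutations out by hand and recycle them. This is only a minimal base R sketch: it assumes the rows are sorted by id with exactly three rows per id, and perm_list is a name made up for illustration.

```r
# Same layout as tibble::expand_grid(id = 1:12, condition = c("a", "b", "c")),
# built with base R (expand.grid varies its first argument fastest).
data <- expand.grid(condition = c("a", "b", "c"), id = 1:12,
                    stringsAsFactors = FALSE)[, c("id", "condition")]

# Hypothetical helper: the six orderings listed in the question, in that order.
perm_list <- list(c(1, 2, 3), c(2, 3, 1), c(3, 1, 2),
                  c(1, 3, 2), c(2, 1, 3), c(3, 2, 1))

# Flatten to 18 values and recycle them down the 36 rows.
data$order <- rep(unlist(perm_list), length.out = nrow(data))

head(data, 9)
```

With 12 ids and 6 permutations, each permutation is used for exactly two ids.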

Creating two columns of cumulative sum based on the categories of one column

I would like to create two columns with the cumulative frequency of "A" and "B" in the assignment column.
df = data.frame(id = 1:10, assignment= c("B","A","B","B","B","A","B","B","A","B"))
id assignment
1 1 B
2 2 A
3 3 B
4 4 B
5 5 B
6 6 A
7 7 B
8 8 B
9 9 A
10 10 B
The resulting table would have this format
id assignment A B
1 1 B 0 1
2 2 A 1 1
3 3 B 1 2
4 4 B 1 3
5 5 B 1 4
6 6 A 2 4
7 7 B 2 5
8 8 B 2 6
9 9 A 3 6
10 10 B 3 7
How can the code be generalized to more than two categories (say "A", "B", "C")?
Thanks
Use lapply over unique values in assignment to create new columns.
vals <- sort(unique(df$assignment))
df[vals] <- lapply(vals, function(x) cumsum(df$assignment == x))
df
# id assignment A B
#1 1 B 0 1
#2 2 A 1 1
#3 3 B 1 2
#4 4 B 1 3
#5 5 B 1 4
#6 6 A 2 4
#7 7 B 2 5
#8 8 B 2 6
#9 9 A 3 6
#10 10 B 3 7
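Because vals is computed from the data, the same two lines should carry over to more than two categories without changes. A small sketch with a third category follows; df3 and its values are made up for illustration, and the as.character() call is an added guard in case assignment is stored as a factor.

```r
# Hypothetical data with a third category "C".
df3 <- data.frame(id = 1:6,
                  assignment = c("B", "A", "C", "B", "C", "A"),
                  stringsAsFactors = FALSE)

# as.character() guards against assignment being a factor
# (the data.frame() default before R 4.0).
vals <- sort(unique(as.character(df3$assignment)))

# One running-count column per category, named after the category.
df3[vals] <- lapply(vals, function(x) cumsum(df3$assignment == x))
df3
```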
We can use model.matrix with colCumsums
library(matrixStats)
cbind(df, colCumsums(model.matrix(~ assignment - 1, df[-1])))
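If installing matrixStats is not an option, the same idea can be sketched with apply() from base R. Note that model.matrix() prefixes its column names with the variable name ("assignmentA", "assignmentB"), so this sketch strips the prefix; mm and res are names made up for illustration.

```r
df <- data.frame(id = 1:10,
                 assignment = c("B", "A", "B", "B", "B", "A", "B", "B", "A", "B"))

# Indicator matrix: one 0/1 column per category, no intercept.
mm <- model.matrix(~ assignment - 1, df)

# Strip the "assignment" prefix that model.matrix() adds to column names.
colnames(mm) <- sub("^assignment", "", colnames(mm))

# Column-wise cumulative sums using base R only.
res <- cbind(df, apply(mm, 2, cumsum))
res
```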
A base R option
transform(
  df,
  A = cumsum(assignment == "A"),
  B = cumsum(assignment == "B")
)
gives
id assignment A B
1 1 B 0 1
2 2 A 1 1
3 3 B 1 2
4 4 B 1 3
5 5 B 1 4
6 6 A 2 4
7 7 B 2 5
8 8 B 2 6
9 9 A 3 6
10 10 B 3 7

Split data.frame function by a field

I have a function that returns a data frame, and its output, which is displayed in an R Shiny app, is too long. I want to split it by the field fac, so that I get one table for fac = A, and so on for each unique value in fac. How could I do this? Thank you.
prod()
x y fac
1 1 1 C
2 1 2 B
3 1 3 B
4 1 4 B
5 1 5 A
6 1 6 B
7 1 7 B
8 1 8 C
9 1 9 C
10 1 10 C
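One common way to get a table per value of fac is split() from base R. The sketch below is only an illustration: out is a stand-in for the data frame that prod() returns in the question, and tables is a made-up name.

```r
# Stand-in for the data frame returned by prod() in the question.
out <- data.frame(x = 1, y = 1:10,
                  fac = c("C", "B", "B", "B", "A", "B", "B", "C", "C", "C"))

# split() returns a named list with one data frame per unique value of fac.
tables <- split(out, out$fac)

names(tables)   # the unique values of fac
tables[["A"]]   # the rows where fac == "A"
```

Each element of the list is an ordinary data frame, so it can be passed to whatever output function the app already uses.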

select duplicate rows based on another column value in R

How do I select duplicated rows which belong to the same ID based on a specific value in another column i.e. Activity C?
This is the original dataframe
ID Activity Cost
1 A 8
1 B 2
1 C 5
2 A 4
3 A 2
3 C 7
3 C 1
4 A 3
4 B 8
This is the targeted dataframe i.e. only ID 1 and ID 3 are selected as they contain Activity C
ID Activity Cost
1 A 8
1 B 2
1 C 5
3 A 2
3 C 7
3 C 1
On the flip side, how do I get a dataframe with only the IDs that do not have Activity C?
Flipside data frame:
ID Activity Cost
2 A 4
4 A 3
4 B 8
Cheers
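One possible base R approach is to collect the IDs that have at least one Activity C row and then filter with %in%. This is a sketch rather than a definitive answer; ids_with_c, with_c and without_c are names made up for illustration.

```r
df <- data.frame(ID = c(1, 1, 1, 2, 3, 3, 3, 4, 4),
                 Activity = c("A", "B", "C", "A", "A", "C", "C", "A", "B"),
                 Cost = c(8, 2, 5, 4, 2, 7, 1, 3, 8))

# IDs that have at least one row with Activity "C".
ids_with_c <- unique(df$ID[df$Activity == "C"])

with_c    <- df[df$ID %in% ids_with_c, ]    # all rows for IDs 1 and 3
without_c <- df[!df$ID %in% ids_with_c, ]   # all rows for IDs 2 and 4
```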

R cumulative sum based upon other columns

I have a data.frame as below. The data is sorted by column txt and then by column val. The summ column is the sum of the value in the val column and the summ value from the earlier row, provided that the current row and the earlier row have the same value in the txt column. How could I do this in R?
txt=c(rep("a",4),rep("b",5),rep("c",3))
val=c(1,2,3,4,1,2,3,4,5,1,2,3)
summ=c(1,3,6,10,1,3,6,10,15,1,3,6)
dd=data.frame(txt,val,summ)
> dd
txt val summ
1 a 1 1
2 a 2 3
3 a 3 6
4 a 4 10
5 b 1 1
6 b 2 3
7 b 3 6
8 b 4 10
9 b 5 15
10 c 1 1
11 c 2 3
12 c 3 6
If by "most earlier" (which in English is more properly written "earliest") you mean the nearest, which is what is implied by your expected output, then what you're talking about is a cumulative sum. You can apply cumsum() separately to each group of txt with ave():
dd <- data.frame(txt=c(rep("a",4),rep("b",5),rep("c",3)), val=c(1,2,3,4,1,2,3,4,5,1,2,3) );
dd$summ <- ave(dd$val,dd$txt,FUN=cumsum);
dd;
## txt val summ
## 1 a 1 1
## 2 a 2 3
## 3 a 3 6
## 4 a 4 10
## 5 b 1 1
## 6 b 2 3
## 7 b 3 6
## 8 b 4 10
## 9 b 5 15
## 10 c 1 1
## 11 c 2 3
## 12 c 3 6
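The same ave() pattern should also work for a running count per group rather than a running sum, which is the logic behind the Name/Frequency table at the top of this page. The sketch below only illustrates that logic in R, not the SQL query asked for there; tab is a made-up name, and the rows are assumed to be sorted by ID already.

```r
tab <- data.frame(ID = 1:10,
                  Name = c("A", "A", "A", "B", "B", "C", "C", "B", "A", "C"))

# Within each Name, number the rows 1, 2, 3, ... in their current order.
tab$Frequency <- ave(seq_along(tab$Name), tab$Name, FUN = seq_along)
tab
```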
