I have these sequences:
1,2,3,13,14,15,25,26,27
1,2,3,4,14,15,16,17,27,28,29,30,40,41,42,43
They are constructed as follows:
I choose a number M (for example, 3) and a start number F (for example, 1).
I square M (M^2 = 3*3 = 9); this is the length of the sequence.
The numbers in the sequence form an arithmetic progression with increment "GI" (for example, 1), except that the number at positions M+1, 2M+1, and so on is the previous number plus the increment "P2I" (for example, 10): [(F=)1, (1+GI=)2, (2+GI=)3, (3+P2I=)13, (13+GI=)14, ...]
How can I write a function that prints these sequences?
Thanks
Using the sequence function (its from and by arguments require R 4.0.0 or later):
fSeq <- function(M, F, GI, P21) {
sequence(rep(M, M), seq(F, by = (M - 1)*GI + P21, length.out = M), GI)
}
fSeq(3, 1, 1, 10)
#> [1] 1 2 3 13 14 15 25 26 27
fSeq(4, 1, 1, 10)
#> [1] 1 2 3 4 14 15 16 17 27 28 29 30 40 41 42 43
Using row and col:
fSeq2 <- function(M, F, GI, P21) {
m <- matrix(F - GI*M - P21, nrow = M, ncol = M)
return(c(m + row(m)*GI + col(m)*(GI*(M - 1) + P21)))
}
fSeq2(3, 1, 1, 10)
#> [1] 1 2 3 13 14 15 25 26 27
fSeq2(4, 1, 1, 10)
#> [1] 1 2 3 4 14 15 16 17 27 28 29 30 40 41 42 43
With cumsum:
fSeq3 <- function(M, F, GI, P21) {
cumsum(c(F - P21 + GI, rep(GI, M^2 - 1)) + c(P21 - GI, rep(0, M - 1)))
}
fSeq3(3, 1, 1, 10)
#> [1] 1 2 3 13 14 15 25 26 27
fSeq3(4, 1, 1, 10)
#> [1] 1 2 3 4 14 15 16 17 27 28 29 30 40 41 42 43
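As a cross-check on the vectorized versions above, the construction rule from the question can also be written as a plain loop. This is only a minimal sketch; the function name seq_loop and the argument name start (used here in place of F, which in R is also shorthand for FALSE) are assumptions made for illustration.

```r
# Hypothetical helper: builds the sequence one element at a time.
seq_loop <- function(M, start, GI, P2I) {
  out <- numeric(M^2)
  out[1] <- start
  for (i in seq_len(M^2)[-1]) {
    # Positions M+1, 2M+1, ... begin a new block and jump by P2I;
    # every other position advances by GI.
    step <- if ((i - 1) %% M == 0) P2I else GI
    out[i] <- out[i - 1] + step
  }
  out
}

seq_loop(3, 1, 1, 10)
#> [1]  1  2  3 13 14 15 25 26 27
```

The loop is slower than the vectorized answers but follows the verbal description line by line, which makes it handy for testing.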
Related
I am working with the R programming language.
I simulated this dataset which contains 1000 coin flips - then I calculated the number of "2 Flip Sequences":
Coin <- c('H', 'T')
Results = sample(Coin,1000, replace = TRUE)
My_Data = data.frame(id = 1:1000, Results)
Pairs = data.frame(first = head(My_Data$Results, -1), second = tail(My_Data$Results, -1))
Final = as.data.frame(table(Pairs))
first second Freq
1 H H 255
2 T H 245
3 H T 246
4 T T 253
I am curious - is it possible to extend the above code for "3 Flip Sequences"?
For example - I tried modifying parts of the code to see how the results change (and hoped to stumble across the correct way to write this code):
# First Attempt
Pairs = data.frame(first = head(My_Data$Results, -1), second = head(My_Data$Results, -1) , third = tail(My_Data$Results, -1))
Final = as.data.frame(table(Pairs))
first second third Freq
1 H H H 255
2 T H H 245
3 H T H 0
4 T T H 0
5 H H T 0
6 T H T 0
7 H T T 246
8 T T T 253
# Second Attempt
Pairs = data.frame(first = head(My_Data$Results, -1), second = tail(My_Data$Results, -1) , third = tail(My_Data$Results, -1))
Final = as.data.frame(table(Pairs))
first second third Freq
1 H H H 255
2 T H H 0
3 H T H 0
4 T T H 245
5 H H T 246
6 T H T 0
7 H T T 0
8 T T T 253
I am not sure which of these options, if either, is correct.
In general, I am looking to understand the logic as to how I can adapt the above code for an "arbitrary number of coin flips" (e.g. "4 flip sequences", "5 flip sequences", etc.)
Also, this might not be the most efficient way to calculate these frequencies - I would also be interested in learning about other ways that might be more efficient ( e.g. as the overall size of the data increases).
Thanks!
It might be helpful to work with strings.
coin <- c("H", "T")
results <- sample(coin, 1000, replace = TRUE)
Then to get sequence counts (assuming overlapping sequences also count) for triples, we could do something like:
triples <- table(
sapply(
1:(length(results) - 2), # the last triple starts at position length(results) - 2
function(i) sprintf(
"%s%s%s",
results[i],
results[i + 1],
results[i + 2]
)
)
)
which gives me something like:
HHH HHT HTH HTT THH THT TTH TTT
132 129 138 115 129 124 116 114
This idea could be generalized fairly easily, for example:
n_sequences <- function(n, results) {
helper <- function(i, n) if (n < 1) "" else sprintf(
"%s%s",
helper(i, n - 1),
results[i + n - 1]
)
result <- data.frame(
table(
sapply(
1:(length(results) - n + 1),
function(i) helper(i, n)
)
)
)
colnames(result) <- c("Sequence", "Frequency")
result
}
For example:
n_sequences(5, results)
Gives me something like:
Sequence Frequency
1 HHHHH 34
2 HHHHT 31
3 HHHTH 36
4 HHHTT 31
5 HHTHH 35
6 HHTHT 36
7 HHTTH 20
8 HHTTT 37
9 HTHHH 35
10 HTHHT 34
11 HTHTH 41
12 HTHTT 27
13 HTTHH 27
14 HTTHT 24
15 HTTTH 34
16 HTTTT 30
17 THHHH 31
18 THHHT 36
19 THHTH 36
20 THHTT 26
21 THTHH 34
22 THTHT 32
23 THTTH 31
24 THTTT 27
25 TTHHH 32
26 TTHHT 28
27 TTHTH 25
28 TTHTT 31
29 TTTHH 33
30 TTTHT 31
31 TTTTH 30
32 TTTTT 20
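An alternative sketch that stays closer to the head/tail idea from the question: for windows of k flips, column j of a matrix holds the flips shifted by j - 1 positions, so each row is one overlapping window. The function name count_windows and the example vector flips are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: counts overlapping windows of k consecutive flips.
count_windows <- function(results, k) {
  n <- length(results)
  stopifnot(k >= 1, k <= n)
  # Column j holds flips j, j + 1, ..., n - k + j,
  # so row i is the window that starts at flip i.
  windows <- sapply(seq_len(k), function(j) results[j:(n - k + j)])
  # sapply() returns a plain vector when there is a single window; force a matrix.
  windows <- matrix(windows, ncol = k)
  as.data.frame(table(Sequence = apply(windows, 1, paste, collapse = "")))
}

flips <- c("H", "H", "T", "H", "H", "T")
count_windows(flips, 3)
# windows are HHT, HTH, THH, HHT, so HHT is counted twice
```

With k = 2 this corresponds to the head(x, -1) / tail(x, -1) pairing in the question; for k = 3 the three columns are x[1:(n-2)], x[2:(n-1)] and x[3:n].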
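You could first cut the row indices along 3 + 1 breaks and split the results along the resulting levels. The interaction of the three pieces can then be tabulated to get the result. Note that this pairs the i-th flip of each third of the data, rather than three consecutive flips.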
My_Data$cut3 <- cut(seq_len(nrow(My_Data)), seq.int(1, nrow(My_Data), length.out=3 + 1), include.lowest=TRUE)
(res <- interaction(split(My_Data$Results, My_Data$cut3)) |> table() |> as.data.frame())
# Var1 Freq
# 1 H.H.H 51
# 2 T.H.H 58
# 3 H.T.H 43
# 4 T.T.H 49
# 5 H.H.T 38
# 6 T.H.T 51
# 7 H.T.T 64
# 8 T.T.T 46
To get the desired output, we can strsplit Var1.
strsplit(as.character(res$Var1), '\\.') |> do.call(what=rbind) |>
cbind.data.frame(res$Freq) |> setNames(c('first', 'second', 'third', 'Freq'))
# first second third Freq
# 1 H H H 51
# 2 T H H 58
# 3 H T H 43
# 4 T T H 49
# 5 H H T 38
# 6 T H T 51
# 7 H T T 64
# 8 T T T 46
Note that the number of rows in your data should be divisible by 3.
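To see what the cut/split/interaction steps do, here is a small sketch on a hypothetical six-flip vector (the names x, grp, parts and triples are made up for illustration). It suggests that each combination is built from the i-th element of every chunk.

```r
x <- c("H", "T", "T", "H", "H", "T")

# Three equal-width groups over the positions 1..6: {1,2}, {3,4}, {5,6}
grp <- cut(seq_along(x), seq.int(1, length(x), length.out = 3 + 1),
           include.lowest = TRUE)

# One character vector per group
parts <- split(x, grp)

# Element i of the result combines element i of each group
triples <- as.character(interaction(parts))
triples
# "H.T.H" "T.H.T"
```

Here the first combination is made of flips 1, 3 and 5, and the second of flips 2, 4 and 6.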
Edit
To generalize, we may write a small function.
f <- \(x, n) {
ct <- cut(seq_len(nrow(x)), seq.int(1L, nrow(x), length.out=n + 1L), include.lowest=TRUE)
res <- interaction(split(x$Results, ct)) |> table() |> as.data.frame()
strsplit(as.character(res$Var1), '\\.') |> do.call(what=rbind) |>
cbind.data.frame(res$Freq) |> setNames(c(LETTERS[seq_len(n)], 'Freq'))
}
f(My_Data, 4)
# A B C D Freq
# 1 H H H H 13
# 2 T H H H 25
# 3 H T H H 18
# 4 T T H H 17
# 5 H H T H 18
# 6 T H T H 15
# 7 H T T H 21
# 8 T T T H 24
# 9 H H H T 26
# 10 T H H T 15
# 11 H T H T 16
# 12 T T H T 18
# 13 H H T T 22
# 14 T H T T 18
# 15 H T T T 10
# 16 T T T T 24
Data:
set.seed(42)
My_Data <- data.frame(id=1:1200, Results=sample(c('H', 'T'), 1200, replace=TRUE))
A slightly generalized solution with tidyverse tools. Change the sets variable for longer or shorter sequences.
coin <- c("H", "T")
sets <- 4
rolls <- 10000
results <- sample(coin, sets * rolls, replace = TRUE)
named_results <- purrr::map_chr(
0:(rolls - 1),
~ paste0(results[(sets * .x + 1):(sets * .x + sets)],
collapse = ""
)
)
dplyr::count(tibble::tibble(x = named_results), x)
with output
# A tibble: 16 x 2
x n
<chr> <int>
1 HHHH 629
2 HHHT 627
3 HHTH 638
4 HHTT 599
5 HTHH 602
6 HTHT 633
7 HTTH 596
8 HTTT 661
9 THHH 631
10 THHT 589
11 THTH 633
12 THTT 647
13 TTHH 660
14 TTHT 637
15 TTTH 623
16 TTTT 595
sets = 8 would give something like
# A tibble: 256 x 2
x n
<chr> <int>
1 HHHHHHHH 37
2 HHHHHHHT 36
3 HHHHHHTH 43
4 HHHHHHTT 35
5 HHHHHTHH 38
6 HHHHHTHT 27
7 HHHHHTTH 32
8 HHHHHTTT 28
9 HHHHTHHH 33
10 HHHHTHHT 38
# ... with 246 more rows
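For comparison, the same non-overlapping grouping can be sketched in base R by reshaping the flips into a matrix with one block per row. The function name count_blocks and the example vector flips are hypothetical; dropping an incomplete trailing block is a choice made here, not something the answer above specifies.

```r
# Hypothetical helper: counts non-overlapping blocks of k consecutive flips.
count_blocks <- function(results, k) {
  n_blocks <- length(results) %/% k
  # Keep only complete blocks, then lay them out one block per row.
  m <- matrix(results[seq_len(n_blocks * k)], ncol = k, byrow = TRUE)
  table(apply(m, 1, paste, collapse = ""))
}

flips <- c("H", "H", "T", "H", "H", "T", "T")
count_blocks(flips, 3)
# two complete blocks, both "HHT"; the trailing "T" is dropped
```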
I am trying to make a piece-wise function. This is a really basic one. I want y to be a list of values (preferably not just a list of integers but a list of real numbers like (1.34, 20.92) in the future).
How might I make a piece-wise function?
y <- 1:10
if (y < 2){
print("CAN'T COMPUTE")
} else if (y >= 2 & y < 6){
print(y^2)
} else {
print(y * 2)
}
Let me give it a try:
library("dplyr")
y <- 1:10
y %>%
as_tibble() %>%
mutate(res = case_when(y < 2 ~ "CAN'T COMPUTE",
y >= 2 & y < 6 ~ as.character(y^2),
TRUE ~ as.character(y*2)))
Here are the results:
# A tibble: 10 x 2
value res
<int> <chr>
1 1 CAN'T COMPUTE
2 2 4
3 3 9
4 4 16
5 5 25
6 6 12
7 7 14
8 8 16
9 9 18
10 10 20
Here are some base R approaches. We have used NA instead of a character string in order to produce a numeric vector result. The first uses a nested ifelse. The second uses a single ifelse to select between NA and the other values and computes the other values using a formula. The third computes which leg of the result is wanted (1, 2 or 3) and then uses switch to select that leg. The fourth is a variation of the third that uses findInterval to compute the leg number.
ifelse(y < 2, NA, ifelse(y < 6, y^2, 2*y))
## [1] NA 4 9 16 25 12 14 16 18 20
ifelse(y < 2, NA, (y < 6) * y^2 + (y >= 6) * 2*y)
## [1] NA 4 9 16 25 12 14 16 18 20
mapply(switch, 1 + (y >= 2) + (y >= 6), NA, y^2, 2*y)
## [1] NA 4 9 16 25 12 14 16 18 20
mapply(switch, findInterval(y, c(-Inf, 2, 6, Inf), left.open = FALSE), NA, y^2, 2*y)
## [1] NA 4 9 16 25 12 14 16 18 20
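Since the question mentions real-valued input, here is a minimal sketch of the same piecewise rule wrapped in a function that fills each leg by logical indexing. The function name piecewise is hypothetical, and, as in the answers above, NA stands in for the "CAN'T COMPUTE" case.

```r
# Hypothetical helper: NA below 2, y^2 on [2, 6), 2 * y from 6 upwards.
piecewise <- function(y) {
  out <- rep(NA_real_, length(y))
  mid <- y >= 2 & y < 6
  high <- y >= 6
  out[mid] <- y[mid]^2
  out[high] <- 2 * y[high]
  out
}

piecewise(1:10)
piecewise(c(1.34, 2.5, 20.92))
```

For 1:10 this should reproduce the vectors shown above; for the real-valued input, 1.34 falls in the NA leg, 2.5 is squared and 20.92 is doubled.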
df <- data.frame(x = seq(1:10))
I want this:
df$y <- c(1, 2, 3, 4, 5, 15, 20 , 25, 30, 35)
i.e. each y is the sum of the previous five x values, except that the first five y are the same as x.
What I get is this:
df$y1 <- c(df$x[1:4], RcppRoll::roll_sum(df$x, 5))
x y y1
1 1 1
2 2 2
3 3 3
4 4 4
5 5 15
6 15 20
7 20 25
8 25 30
9 30 35
10 35 40
In summary, I need y but I am only able to achieve y1
1) enhanced sum function Define a function Sum which sums its first 5 values if it receives 6 values and returns the last value otherwise. Then use it with partial=TRUE in rollapplyr:
library(zoo) # provides rollapplyr, rollsumr and rollapply
x <- df$x
Sum <- function(x) if (length(x) < 6) tail(x, 1) else sum(head(x, -1))
rollapplyr(x, 6, Sum, partial = TRUE)
## [1] 1 2 3 4 5 15 20 25 30 35
2) sum 6 and subtract off original Another possibility is to take the running sum of 6 elements filling in the first 5 elements with NA and subtracting off the original vector. Finally fill in the first 5.
replace(rollsumr(x, 6, fill = NA) - x, 1:5, head(x, 5))
## [1] 1 2 3 4 5 15 20 25 30 35
3) specify offsets A third possibility is to use the offset form of width to specify the prior 5 elements:
c(head(x, 5), rollapplyr(x, list(-(1:5)), sum))
## [1] 1 2 3 4 5 15 20 25 30 35
4) alternative specification of offsets In this alternative we specify an offset of 0 for each of the first 5 elements and offsets of -(1:5) for the rest.
width <- replace(rep(list(-(1:5)), length(x)), 1:5, list(0))
rollapply(x, width, sum)
## [1] 1 2 3 4 5 15 20 25 30 35
Note
The scheme for filling in the first 5 elements seems quite unusual, and you might consider using partial sums for the first 5, with NA or 0 for the first one since there are no prior elements for that one:
rollapplyr(x, list(-(1:5)), sum, partial = TRUE, fill = NA)
## [1] NA 1 3 6 10 15 20 25 30 35
rollapplyr(x, list(-(1:5)), sum, partial = TRUE, fill = 0)
## [1] 0 1 3 6 10 15 20 25 30 35
rollapplyr(x, 6, sum, partial = TRUE) - x
## [1] 0 1 3 6 10 15 20 25 30 35
A simple approach would be:
df <- data.frame(x = seq(1:10))
mysum <- function(x, k = 5) {
res <- rep(NA, length(x))
for (i in seq_along(x)) {
if (i <= k) { # the first k values are copied unchanged
res[i] <- x[i]
} else {
res[i] <- sum(x[(i-k):(i-1)])
}
}
res
}
mysum(df$x)
# [1] 1 2 3 4 5 15 20 25 30 35
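mysum <- function(x, k = 5) {
res <- x[1:k] # the first k values are kept as they are
# window i covers x[i:(i + k - 1)], i.e. the k values before position i + k
rest <- sapply(seq_len(length(x) - k), function(i) sum(x[i:(i + k - 1)]))
return(c(res, rest))
}
mysum(df$x)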
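A loop-free base R variant can be sketched with cumsum: the sum of the k values before position i is the difference of two cumulative sums. The function name prev_sum is hypothetical, and returning x unchanged when it has at most k elements is an assumption made here.

```r
# Hypothetical helper: sum of the previous k values, first k values unchanged.
prev_sum <- function(x, k = 5) {
  n <- length(x)
  if (n <= k) return(x)
  cs <- cumsum(x)
  idx <- (k + 1):n
  # sum(x[(i - k):(i - 1)]) equals cs[i - 1] - cs[i - k - 1], with cs[0] taken as 0;
  # c(0, cs)[j + 1] is cs[j], so cs[i - k - 1] is c(0, cs)[i - k].
  x[idx] <- cs[idx - 1] - c(0, cs)[idx - k]
  x
}

prev_sum(1:10)
# 1 2 3 4 5 15 20 25 30 35
```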
I have two values,
3 and 12,
and I make a vector from them:
num1 <- 3
num2 <- 12
a <- c(num1, num2)
I want to add a number (12) to the vector "a", and
I also want to build a new vector by repeating this addition and appending the results,
like this:
3,12, 15,24, 27,36, 39,48 ....
The number of repetitions "n" is 6.
I don't have any idea how to do this.
Here are two methods in base R.
With outer, you could do
c(outer(c(3, 12), (12 * 0:4), "+"))
[1] 3 12 15 24 27 36 39 48 51 60
or with sapply, you can explicitly loop through and calculate the pairs of sums.
c(sapply(0:4, function(i) c(3, 12) + (12 * i)))
[1] 3 12 15 24 27 36 39 48 51 60
outer returns a matrix where every pair of elements of the two vectors have been added together. c is used to return a vector. sapply loops through 0:4 and then calculates the element-wise sum. It also returns a matrix in this instance, so c is used to return a vector.
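A small sketch of the intermediate matrix may make the role of c clearer (the variable name m is only for illustration, and three offsets are used instead of five to keep it short). The matrix has one row per element of the first vector and one column per offset, and c reads it column by column.

```r
m <- outer(c(3, 12), 12 * 0:2, "+")
m
#      [,1] [,2] [,3]
# [1,]    3   15   27
# [2,]   12   24   36

# c() flattens the matrix in column-major order
c(m)
# 3 12 15 24 27 36
```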
Here is a somewhat generic function that takes as input your original vector a, the number to add (12), and n:
f1 <- function(vec, x, n){
len1 <- length(vec)
v1 <- sapply(seq(n/len1), function(i) x*i)
v2 <- rep(v1, each = n/length(v1))
v3 <- rep(vec, n/len1)
return(c(vec, v3 + v2))
}
f1(a, 12, 6)
#[1] 3 12 15 24 27 36 39 48
f1(a, 11, 12)
#[1] 3 12 14 23 25 34 36 45 47 56 58 67 69 78
f1(a, 3, 2)
#[1] 3 12 6 15
EDIT
If by n=6 you mean 6 times the whole vector then,
f1 <- function(vec, x, n){
len1 <- length(vec)
v1 <- sapply(seq(n), function(i) x*i)
v2 <- rep(v1, each = len1)
v3 <- rep(vec, n)
return(c(vec, v3 + v2))
}
f1(a, 12, 6)
#[1] 3 12 15 24 27 36 39 48 51 60 63 72 75 84
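Using rep for repeating and cumsum for the addition (note that the result starts at 15, the first pair after the original a; prepend a with c(a, ...) if you also need the leading 3, 12):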
n = 6
rep(a, n) + cumsum(rep(c(12, 0), n))
# [1] 15 24 27 36 39 48 51 60 63 72 75 84
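If the result should start with the original pair, one possible sketch uses two calls to rep: one repeats the vector, the other repeats each offset once per element. The function name grow_pairs and its argument names are hypothetical, and n is read here as the number of pairs in the output.

```r
# Hypothetical helper: n copies of a, the j-th copy shifted by step * (j - 1).
grow_pairs <- function(a, step, n) {
  rep(a, times = n) + rep(step * (seq_len(n) - 1), each = length(a))
}

grow_pairs(c(3, 12), 12, 4)
# 3 12 15 24 27 36 39 48
```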
The seq function in R would give me a sequence from x to y with a constant step m:
seq(x, y, m)
E.g. seq(1,9,2) = c(1,3,5,7,9).
What would be the most elegant way to get a sequence from x to y with alternating steps m1 and m2, such that something like "seq(x, y, c(m1, m2))" would give me c(x, x + m1, (x + m1) + m2, (x + m1 + m2) + m1, ..., y), each time adding one of the steps (not necessarily reaching up to y, of course, as in seq)?
Example: x = 1; y = 19; m1 = 2; m2 = 4 and I get c(1,3,7,9,13,15,19).
I arrived at the solution as follows:
1. Use cumsum with a vector c(from,rep(by,times),...), where by is repeated times = ceiling((to-from)/sum(by)) times.
2. Truncate the sequence with !(seq > to).
seq_alt <- function(from, to, by) {
seq <- cumsum(c(from,rep(by,ceiling((to-from)/sum(by)))))
return(seq[! seq > to])
}
You can generate the first n terms of this sequence with
x = 1; m1 = 2; m2 = 4
n <- 0:10 # first 11 terms
x + ceiling(n/2)*m1 + ceiling((n-1)/2)*m2
# [1] 1 3 7 9 13 15 19 21 25 27 31
Here is another idea,
fun1 <- function(x, y, j, z){
if(j >= y) {return(x)}else{
s1 <- seq(x, y, j+z)
s2 <- seq(x+j, y, j+z)
return(sort(c(s1, s2)))
}
}
fun1(1, 19, 2, 4)
#[1] 1 3 7 9 13 15 19
fun1(1, 40, 4, 3)
#[1] 1 5 8 12 15 19 22 26 29 33 36 40
fun1(3, 56, 7, 10)
#[1] 3 10 20 27 37 44 54
fun1(1, 2, 2, 4)
#[1] 1
Here is an alternative that uses diffinv. This method over-allocates the values, so as a stopping rule, I keep only the elements that are less than or equal to the stopping value.
seqAlt <- function(start, stop, by1, by2) {
out <- diffinv(rep(c(by1, by2), ceiling(stop / (by1 + by2))), xi=start)
return(out[out <= stop])
}
seqAlt(1, 19, 2, 4)
[1] 1 3 7 9 13 15 19
You could use Reduce with accumulate = TRUE to iteratively add either 2 or 4:
Reduce(`+`, rep(c(2,4), 10), init = 1, accumulate = TRUE)
# [1] 1 3 7 9 13 15 19 21 25 27 31 33 37 39 43 45 49 51 55 57 61
The number of times you repeat c(2,4) determines the sequence length; since it is 10 above, 20 steps are added, and the sequence has length 21 including the initial value.
The purrr package has an accumulate wrapper, if you prefer the syntax:
purrr::accumulate(rep(c(2,4), 10), `+`, .init = 1)
## [1] 1 3 7 9 13 15 19 21 25 27 31 33 37 39 43 45 49 51 55 57 61
This is a perfect example of recycling vectors in R. Since it indexes into x:y, it works when the steps are positive integers:
# 1.
x = 1; y = 19; m1 = 2; m2 = 4
(x:y)[c(TRUE, rep(FALSE, m1-1), TRUE, rep(FALSE,m2-1))]
# [1] 1 3 7 9 13 15 19
# 2.
x = 3; y = 56; m1 = 7; m2 = 10
(x:y)[c(TRUE, rep(FALSE, m1-1), TRUE, rep(FALSE,m2-1))]
# [1] 3 10 20 27 37 44 54