Create sequence of binary values with a minimum run length - r

I want to create a random sequence of 0s and 1s of a certain length, say 100. My only restriction is that each value must occur at least twice in a row, so every run has length two or more.
Example of correct sequence, where all runs have at least two values:
1 1 0 0 0 1 1 1 0 0 0
Example of incorrect sequence, where some runs have fewer than two values:
1 0 0 1 0 1 1 1 0 1 1
^ ^ ^
This is my code, but it is not working:
x <- NULL
x[1] <- sample(c(0, 1), replace = TRUE, size = 1)
for(i in 2:100){
x[i] <- sample(c(0, 1), replace = TRUE, size = 1)
x[i] <- if(x[i] + x[i-1] == 1){
if(x[i-1] == 1){
1
} else {
0
}
} else {
sample(c(0, 1), replace = TRUE, size = 1)
}
}
print(x)

Here is a simple version. The first value of x is set to a random binomial number (0 or 1). The second value must be the same as the first. Then, the following code checks for each iteration if the two previous values are the same. If they are, then a random binomial is generated. If not, then x[i-1] is assigned as x[i] also.
set.seed(1234)
n <- 100
x <- numeric(n)
x[1] <- rbinom(1, 1, .5)
x[2] <- x[1]
for(i in 3:n) {
if(x[i-1] == x[i-2]) {
x[i] <- rbinom(1, 1, .5)
} else {
x[i] <- x[i-1]
}
}
x
[1] 1 1 1 1 1 1 0 0 0 1 1 1 1 1 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 1 1 0 0 1 1 0 0 0 1 1 1 1 1 0 0
[59] 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 0 0 1 1 0 0 1 1
Here is a function version to generate any length (n), while allowing you to change the probability of getting a 1 over a 0.
my_func <- function(n, prob = 0.5) {
x <- numeric(n)
x[1] <- rbinom(1, 1, prob)
x[2] <- x[1]
for(i in 3:n) {
if(x[i-1] == x[i-2]) {
x[i] <- rbinom(1, 1, prob)
} else {
x[i] <- x[i-1]
}
}
x
}
set.seed(1234)
my_func(n = 10, prob = 0.9)
[1] 1 1 1 1 1 1 1 1 1 1
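A quick way to check any generated vector against the constraint is rle(), which returns the length of each run. The helper below is only a sketch, and the name has_min_run is made up for illustration. It is worth running because the loop above can still end on a single value when the very last draw switches.

```r
# Hypothetical helper: TRUE when every run in x has at least min_run values.
has_min_run <- function(x, min_run = 2) {
  all(rle(x)$lengths >= min_run)
}

good <- c(1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0)  # valid example from the question
bad  <- c(1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1)  # invalid example from the question
has_min_run(good)  # TRUE
has_min_run(bad)   # FALSE
```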

Here is one way using a variable called consecutive_count, which counts how many consecutive identical values have been generated so far.
set.seed(1432)
#Initialise the vector of size 100
x <- numeric(100)
#Get the 1st value
x[1] <- sample(c(0,1), 1)
consecutive_count <- 1
for(i in 2:100){
#If the count is less than 2, repeat the previous value
if(consecutive_count < 2){
x[i] <- x[i-1]
#Increment the counter
consecutive_count <- consecutive_count + 1
} else {
#Randomly assign 1 or 0
x[i] <- sample(c(0,1), 1)
#If the new value equals the previous value, increment the count;
#otherwise reset consecutive_count to 1.
if(x[i] == x[i-1]) consecutive_count <- consecutive_count + 1
else consecutive_count <- 1
}
}
x
#[1] 0 0 0 1 1 0 0 1 1 0 0 0 1 1 0 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 1
#[39] 0 0 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 1
#[77] 1 0 0 0 0 1 1 1 1 1 1 1 0 0 1 1 0 0 1 1 1 1 0 0
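A different way to think about it is to draw the run lengths first and then expand them with rep(). The following is a rough sketch, not a drop-in replacement: the function name random_runs and the max_run cap are assumptions of mine, and the resulting distribution of runs is not the same as with the loops above.

```r
# Hypothetical alternative: draw run lengths first, then expand with rep().
# max_run is an assumed cap on run length, not part of the original question.
random_runs <- function(n, min_run = 2, max_run = 6) {
  stopifnot(n >= min_run, max_run >= min_run)
  k <- ceiling(n / min_run)  # enough runs to cover n even if all are minimal
  # sample.int avoids the sample(a:b) pitfall when min_run == max_run
  lens <- min_run - 1 + sample.int(max_run - min_run + 1, k, replace = TRUE)
  vals <- (sample(0:1, 1) + seq_len(k)) %% 2  # alternate 0/1 from a random start
  r <- rle(rep(vals, times = lens)[seq_len(n)])
  # Trimming to n can leave a short final run; merge it into the previous run
  m <- length(r$lengths)
  if (m > 1 && r$lengths[m] < min_run) r$values[m] <- r$values[m - 1]
  inverse.rle(r)
}

set.seed(42)
x <- random_runs(100)
min(rle(x)$lengths) >= 2  # TRUE: every run has at least two values
```

Because only the last run can be shortened by trimming, merging it into its neighbour is enough to guarantee the minimum run length everywhere.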


Computing an R function with while loop

I couldn't figure out how to compute the time it took for the process to stop, and I feel like I made a few errors; the function seems a bit off...
# y = number of books
sim_books <- function (y) {
y = 0
shelves <- c(rep(0, y - 3), 1)
plant_books <- y-3
animal_books <- 1
while (shelves != 0){
# time it took for the function to stop running
time_stop <-
}
return(time_stop)
}
You may try this way. If you have any questions, please let me know.
sim_books <- function (n) {
x <- c(-1, rep(1, n-1))
count <- 0
while(!(sum(x) %in% c(n, -n))) {
i <- sample(1:n, 1)
x[i] <- x[i] * -1
count <- count + 1
}
return(count)
}
sim_books(4)
[1] 7
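A single call returns just one random stopping time, so if what you are after is the typical time until the process stops, you could repeat the simulation and summarise. This is only a sketch based on the function above (repeated here so the block runs on its own); the number of repetitions, 1000, is an arbitrary choice.

```r
# Same simulation as in the answer above, repeated so this block is self-contained.
sim_books <- function(n) {
  x <- c(-1, rep(1, n - 1))
  count <- 0
  while (!(sum(x) %in% c(n, -n))) {
    i <- sample(1:n, 1)   # pick one position at random
    x[i] <- -x[i]         # flip it
    count <- count + 1
  }
  count
}

set.seed(1)
times <- replicate(1000, sim_books(4))  # 1000 independent runs
mean(times)   # Monte Carlo estimate of the expected number of steps
range(times)  # shortest and longest run observed
```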
You can try the code below
f <- function(n) {
shelves <- c(rep(0, n - 1), 1)
iter <- 0
res <- list(shelves)
while (var(shelves) != 0) {
shelves <- (shelves + replace(rep(0, n), sample(n, 1), 1)) %% 2
iter <- iter + 1
res[[iter + 1]] <- shelves
}
list(trajectory = res, niter = iter)
}
and you will see
> f(7)
$trajectory
$trajectory[[1]]
[1] 0 0 0 0 0 0 1
$trajectory[[2]]
[1] 1 0 0 0 0 0 1
$trajectory[[3]]
[1] 1 0 0 1 0 0 1
$trajectory[[4]]
[1] 1 0 0 1 0 0 0
$trajectory[[5]]
[1] 0 0 0 1 0 0 0
$trajectory[[6]]
[1] 0 1 0 1 0 0 0
$trajectory[[7]]
[1] 0 1 0 1 1 0 0
$trajectory[[8]]
[1] 0 1 0 1 1 0 1
$trajectory[[9]]
[1] 0 1 1 1 1 0 1
$trajectory[[10]]
[1] 0 1 1 1 1 1 1
$trajectory[[11]]
[1] 0 0 1 1 1 1 1
$trajectory[[12]]
[1] 0 0 0 1 1 1 1
$trajectory[[13]]
[1] 0 0 1 1 1 1 1
$trajectory[[14]]
[1] 1 0 1 1 1 1 1
$trajectory[[15]]
[1] 1 0 1 1 0 1 1
$trajectory[[16]]
[1] 1 0 1 1 1 1 1
$trajectory[[17]]
[1] 1 1 1 1 1 1 1
$niter
[1] 16

Showing missing levels in model matrix

I would like to know if there is a way to insert a column into a matrix such that...
p1 <- c("a","b","c","e","d","a","c")
p2 <- c("a","b","c","e","e","a","c")
p1mat <- model.matrix(~p1 + 0)
p2mat <- model.matrix(~p2 + 0)
colnames(p1mat) <- gsub("p1","",colnames(p1mat))
colnames(p2mat) <- gsub("p2","",colnames(p2mat))
this would give me for p1mat
a b c d e
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 1 0 0
4 0 0 0 0 1
5 0 0 0 1 0
6 1 0 0 0 0
7 0 0 1 0 0
and for p2mat
a b c e
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
5 0 0 0 1
6 1 0 0 0
7 0 0 1 0
My question is, is there a way to sneak in a column vector d consisting of only zeros into the matrix p2mat? such that
d
0
0
0
0
0
0
0
and the vector is automatically ordered and placed between columns c and e, resulting in the following matrix for p2mat
a b c d e
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 1 0 0
4 0 0 0 0 1
5 0 0 0 0 1
6 1 0 0 0 0
7 0 0 1 0 0
Basically I want matrix p2mat to look into every column in p1mat to create an identical size matrix and to keep track of the data via dummy matrices.
Thank you.
You can factor both your inputs, making sure they both have the same levels. Then model.matrix should work as you expected.
Example:
p1 <- c("a","b","c","e","d","a","c")
p2 <- c("a","b","c","e","e","a","c")
levs <- sort(unique(c(p1, p2)))
f1 <- factor(p1, levs)
f2 <- factor(p2, levs)
model.matrix(~f1 + 0)
# f1a f1b f1c f1d f1e
# 1 1 0 0 0 0
# 2 0 1 0 0 0
# 3 0 0 1 0 0
# 4 0 0 0 0 1
# 5 0 0 0 1 0
# 6 1 0 0 0 0
# 7 0 0 1 0 0
# attr(,"assign")
# [1] 1 1 1 1 1
# attr(,"contrasts")
# attr(,"contrasts")$f1
# [1] "contr.treatment"
model.matrix(~f2 + 0)
# f2a f2b f2c f2d f2e
# 1 1 0 0 0 0
# 2 0 1 0 0 0
# 3 0 0 1 0 0
# 4 0 0 0 0 1
# 5 0 0 0 0 1
# 6 1 0 0 0 0
# 7 0 0 1 0 0
# attr(,"assign")
# [1] 1 1 1 1 1
# attr(,"contrasts")
# attr(,"contrasts")$f2
# [1] "contr.treatment"
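To get back the plain column names from the question (a, b, c, d, e), you can strip the variable-name prefix afterwards. A small sketch for p2 only, with the shared levels written out by hand:

```r
p2 <- c("a", "b", "c", "e", "e", "a", "c")
levs <- c("a", "b", "c", "d", "e")      # shared levels, including the unused "d"
f2 <- factor(p2, levels = levs)
p2mat <- model.matrix(~ f2 + 0)
colnames(p2mat) <- sub("^f2", "", colnames(p2mat))  # drop the variable-name prefix
colnames(p2mat)         # "a" "b" "c" "d" "e"
all(p2mat[, "d"] == 0)  # TRUE: "d" is present but never observed
```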
If you're really looking to write a function, you might want to look at something like the following:
myfun <- function(..., overwrite = FALSE) {
l <- setNames(list(...), sapply(substitute(list(...))[-1], deparse))
cols <- sort(unique(unlist(lapply(l, colnames), use.names = FALSE)))
out <- lapply(l, function(x) {
cols_x <- c(colnames(x), setdiff(cols, colnames(x)))
temp <- `colnames<-`(x[, match(cols_x, colnames(x))], cols_x)[, cols]
replace(temp, is.na(temp), 0)
})
if (isTRUE(overwrite)) list2env(out, envir = .GlobalEnv)
out
}
This will take any number of items as inputs, compare the columns in all of them, and add missing columns where necessary. The output is stored as a list, which is a convenient structure to keep if you want to continue doing similar operations on all of the matrices. If you want to overwrite the original object, then you can change the "overwrite" argument to TRUE.
Here's some more sample data to work with.
set.seed(1)
p1 <- c("a","b","c","e","d","a","c"); p2 <-c("a","b","x","e","e","a","x")
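# Assumption: `cols` is not defined at this point in the original; the pooled values of p1 and p2 are used here instead.
p3 <- sample(c(unique(c(p1, p2)), "z"), 7, TRUE)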
p1mat <- model.matrix(~p1 + 0)
p2mat <- model.matrix(~p2 + 0)
p3mat <- model.matrix(~p3 + 0)
colnames(p1mat) <- gsub("p1","",colnames(p1mat))
colnames(p2mat) <- gsub("p2","",colnames(p2mat))
colnames(p3mat) <- gsub("p3","",colnames(p3mat))
Try the function out:
myfun(p1mat, p2mat)
myfun(p2mat, p1mat)
myfun(p3mat, p1mat)
myfun(p3mat, p1mat, p2mat)
This function takes 2 matrices and compares their column counts. If they differ, it inserts a new column of zeros into the second matrix at the position of the column it lacks, producing a matrix with the same dimensions as the first. It assumes the second matrix is missing exactly one column of the first.
match_matrices <- function(matrix1, matrix2) {
res <- matrix2 # returned unchanged when the column counts already match
if(ncol(matrix1) != ncol(matrix2)) {
get_cols <- function(x) { l <- list(); for(i in 1:ncol(x)) { l[i] <- list(as.numeric(x[,i])) }; return(l) }
k <- get_cols(matrix2)
odd_one_out <- setdiff(colnames(matrix1), colnames(matrix2))
insert_at <- which(colnames(matrix1) == odd_one_out)
res <- t(do.call('rbind', append(k, list(rep(0, nrow(matrix2))), insert_at-1)))
colnames(res) <- colnames(matrix1)
}
return(res)
}
Using your matrices:
match_matrices(p1mat, p2mat)
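If more than one column may be missing, a simpler route is to start from an all-zero matrix with the full set of column names and copy the existing columns into it. A minimal sketch, assuming every column name of the input appears in the target set; pad_columns is a made-up name:

```r
# Hypothetical helper: pad matrix m with zero columns so it has all of all_cols,
# in that order. Assumes every column name of m appears in all_cols.
pad_columns <- function(m, all_cols) {
  out <- matrix(0, nrow = nrow(m), ncol = length(all_cols),
                dimnames = list(rownames(m), all_cols))
  out[, colnames(m)] <- m  # copy the columns that exist; the rest stay 0
  out
}

p1 <- c("a", "b", "c", "e", "d", "a", "c")
p2 <- c("a", "b", "c", "e", "e", "a", "c")
p1mat <- model.matrix(~ p1 + 0); colnames(p1mat) <- sub("^p1", "", colnames(p1mat))
p2mat <- model.matrix(~ p2 + 0); colnames(p2mat) <- sub("^p2", "", colnames(p2mat))

all_cols <- sort(union(colnames(p1mat), colnames(p2mat)))
padded <- pad_columns(p2mat, all_cols)
padded  # same columns as p1mat, with "d" all zeros
```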

How to construct this binary variable in R?

The aim is to check whether the value at index i is 1 and, if so, set the previous six entries to 1 as well.
x <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1)
## Required output
y <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
## Attempt
for(j in seq_along(x)){
if(x[j] == 1){
for(i in (j-6):j)
x[i] = 1
}}
Could you help me fix this, or suggest a better approach?
Thanks.
A fully vectorized solution using filter from the stats package (call it as stats::filter if dplyr is loaded, since dplyr masks filter):
as.integer( #turn logical value into numbers
as.logical( #coerce to logical --> 0 becomes FALSE, everything else TRUE
rev( #reverse order
filter( #linear filtering
c(rep(0, 6), #pad with zeros in the beginning to avoid NAs
rev(x)), #reverse order of input vector
c(rep(1, 7)), sides=1 #y_i = x_i * 1 + x_(i-1) * 1 +...+ x_(i-6) * 1
)[-(1:6)]))) #remove NA values
#[1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
You could try the following options (though don't forget to initialize x when trying each option as I'm overriding it)
indx <- mapply(function(x, y) x:y, which(x == 1) - 6 , which(x == 1))
x[indx[indx > 0]] <- 1
x
## [1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
Or even simpler
indx <- sapply(which(x == 1) - 6, function(x) x:(x + 6))
x[indx[indx > 0]] <- 1
x
## [1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
Or
indx <- apply(cbind(which(x == 1) - 6 , which(x == 1)), 1, function(x) x[1]:x[2])
x[indx[indx > 0]] <- 1
x
## [1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
Or
indx <- seq_len(6)
indx <- sapply(which(x == 1), function(x) x - indx)
x[indx[indx > 0]] <- 1
x
## [1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
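If you would rather not overwrite x, the same idea can be wrapped in a small function with the window size as a parameter. This is just a sketch using outer() to build all the indices at once; the name mark_before and the argument k are mine.

```r
# Hypothetical helper: set the k entries before each 1 (and the 1 itself) to 1.
mark_before <- function(x, k = 6) {
  hits <- which(x == 1)
  idx <- outer(hits, 0:k, "-")  # one row per hit: hit, hit-1, ..., hit-k
  y <- x
  y[idx[idx > 0]] <- 1          # ignore positions before the start of x
  y
}

x <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1)
mark_before(x)
# [1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
```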
Using 'for' loop:
ddf = data.frame(x,y=0)
for(i in 1:nrow(ddf)){
if(ddf[i,'x']==1){
j = i-6
if(j<1) j=1
ddf[j:i,'y'] = 1
}
}
ddf
x y
1 0 1
2 0 1
3 0 1
4 1 1
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
10 0 0
11 0 1
12 0 1
13 0 1
14 0 1
15 0 1
16 0 1
17 1 1
18 0 1
19 1 1
y = ddf$y
y
[1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
y<-x
y[unlist(sapply(which(x==1),
function(val){
val:(max(val-6,1))
}
)
)
]<-1
> y
[1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
Explanation:
I first find the indices where x equals 1 with which(x==1). Then, for each of those indices, sapply(...) builds the range from that index back to the sixth position before it (clamped at 1), and unlist flattens the result into a single vector of indices for which y must be 1.
I then assign 1 to the corresponding y values.
Another way to write it, in two steps:
y<-x
ind<-unlist(sapply(which(x==1),function(val){val:(max(val-6,1))}))
y[ind]<-1
> y
[1] 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1

Using loop to make column selections using different vectors

Let's say I have 3 vectors (strings of 10):
X <- c(1,1,0,1,0, 1,1, 0, NA,NA)
H <- c(0,0,1,0,NA,1,NA,1, 1, 1 )
I <- c(0,0,0,0,0, 1,NA,NA,NA,1 )
Data.frame Y contains 10 columns and 6 rows:
1 2 3 4 5 6 7 8 9 10
0 1 0 0 1 1 1 0 1 0
1 1 1 0 1 0 1 0 0 0
0 0 0 0 1 0 0 1 0 1
1 0 1 1 0 1 1 1 0 0
0 0 0 0 0 0 1 0 0 0
1 1 0 1 0 0 0 0 1 1
I'd like to use vectors X, H and I to make column selections in data.frame Y, using the 1s and 0s in each vector as the selection criterion.
So the result for vector X, using 1 as the selection criterion, should be:
X <- c(1,1,0,1,0, 1,1, 0, NA,NA)
1 2 4 6 7
0 1 0 1 1
1 1 0 0 1
0 0 0 0 0
1 0 1 1 1
0 0 0 0 1
1 1 1 0 0
For vector H, using 1 as the selection criterion:
H <- c(0,0,1,0,NA,1,NA,1, 1, 1 )
3 6 8 9 10
0 1 0 1 0
1 0 0 0 0
0 0 1 0 1
1 1 1 0 0
0 0 0 0 0
0 0 0 1 1
For vector I, using 1 as the selection criterion:
I <- c(0,0,0,0,0, 1,NA,NA,NA,1 )
6 10
1 0
0 0
0 1
1 0
0 0
0 1
For convenience and speed I'd like to use a loop. It might be something like this:
all.ones <- lapply[,function(x) x %in% 1]
In the outcome (all.ones), the result for each vector should stay separate. For example:
X 1,2,4,6,7
H 3,6,8,9,10
I 6,10
The standard way of doing this is using the %in% operator:
Y[, X %in% 1]
To do this for multiple vectors (assuming you want an AND operation):
mylist = list(X, H, I, D, E, K)
Y[, Reduce(`&`, lapply(mylist, function(x) x %in% 1))]
The problem is the NA values; use which() to get around them. Consider the following:
x <- c(1,0,1,NA)
x[x==1]
[1] 1 1 NA
x[which(x==1)]
[1] 1 1
How about this?
idx <- which(X==1)
Y[,idx]
EDIT: For six vectors, do
idx <- which(X==1 & H==1 & I==1 & D==1 & E==1 & K==1)
Y[,idx]
Replace & with | if you want all columns of Y where at least one of the vectors has a 1.
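If the selections should stay separate per vector, as the question asks, one option is to keep the vectors in a named list and loop over it with lapply. This is only a sketch: Y is rebuilt here from the table in the question, so its column names (V1 to V10) are an assumption.

```r
X <- c(1, 1, 0, 1, 0, 1, 1, 0, NA, NA)
H <- c(0, 0, 1, 0, NA, 1, NA, 1, 1, 1)
I <- c(0, 0, 0, 0, 0, 1, NA, NA, NA, 1)
# Y rebuilt from the question's table; columns are named V1..V10 by default
Y <- as.data.frame(matrix(c(
  0, 1, 0, 0, 1, 1, 1, 0, 1, 0,
  1, 1, 1, 0, 1, 0, 1, 0, 0, 0,
  0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
  1, 0, 1, 1, 0, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
  1, 1, 0, 1, 0, 0, 0, 0, 1, 1), nrow = 6, byrow = TRUE))

selectors <- list(X = X, H = H, I = I)
# Column positions where each vector equals 1 (%in% treats NA as no match)
all.ones <- lapply(selectors, function(v) which(v %in% 1))
# One sub-data-frame per vector; drop = FALSE keeps a data frame even for one column
subsets <- lapply(all.ones, function(idx) Y[, idx, drop = FALSE])

all.ones$X        # 1 2 4 6 7
all.ones$H        # 3 6 8 9 10
names(subsets$I)  # "V6" "V10"
```

Swapping 1 for 0 in the %in% test gives the selections based on zeros instead.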

How to count the frequency in binary transactions per column and add the result after the last row in R?

I have a txt file (data5.txt):
1 0 1 0 0
1 1 1 0 0
0 0 1 0 0
1 1 1 0 1
0 0 0 0 1
0 0 1 1 1
1 0 0 0 0
1 1 1 1 1
0 1 0 0 1
1 1 0 0 0
I need to count the frequency of ones and zeros in each column.
If the frequency of ones >= the frequency of zeros, then I want to put 1 after the last row for that column.
I'm new to R. I tried this, but I got an error:
Error in if (z >= d) data[n, i] = 1 else data[n, i] = 0 :
missing value where TRUE/FALSE needed
my code:
data<-read.table("data5.txt", sep="")
m =length(data)
d=length(data[,1])/2
n=length(data[,1])+1
for(i in 1:m)
{
z=sum(data[,i])
if (z>=d) data[n,i]=1 else data[n,i]=0
}
You may try this (df is the data frame read from data5.txt):
rbind(df, ifelse(colSums(df == 1) >= colSums(df == 0), 1, NA))
# V1 V2 V3 V4 V5
# 1 1 0 1 0 0
# 2 1 1 1 0 0
# 3 0 0 1 0 0
# 4 1 1 1 0 1
# 5 0 0 0 0 1
# 6 0 0 1 1 1
# 7 1 0 0 0 0
# 8 1 1 1 1 1
# 9 0 1 0 0 1
# 10 1 1 0 0 0
# 11 1 1 1 NA 1
Update, thanks to a nice suggestion from @Arun:
rbind(df, ifelse(colSums(df == 1) >= ceiling(nrow(df)/2), 1, NA))
or even:
rbind(df, ifelse(colSums(df == 1) >= nrow(df)/2, 1, NA))
Thanks to @SvenHohenstein.
Possibly I misinterpreted your intended results. If you want 0 when the frequency of ones is less than the frequency of zeros, then this suffices:
rbind(df, colSums(df) >= nrow(df) / 2)
Again, thanks to @SvenHohenstein for his useful comments!
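For completeness, here is a self-contained sketch with the data typed inline instead of read from data5.txt. As far as I can tell, the error in the question arises because data[n,i]=1 in the first iteration appends an eleventh row whose other columns are NA, so sum(data[,i]) is NA for the next column and if() has nothing to test. Computing the whole row first and appending it once avoids that.

```r
# Data from the question, inlined so the example runs without data5.txt
df <- read.table(text = "1 0 1 0 0
1 1 1 0 0
0 0 1 0 0
1 1 1 0 1
0 0 0 0 1
0 0 1 1 1
1 0 0 0 0
1 1 1 1 1
0 1 0 0 1
1 1 0 0 0")

# 1 where ones are at least as frequent as zeros in a column, otherwise 0
majority <- as.integer(colSums(df == 1) >= colSums(df == 0))
result <- rbind(df, majority)  # append the summary as row 11
result[nrow(result), ]
#    V1 V2 V3 V4 V5
# 11  1  1  1  0  1
```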
