Alternative ways to create a repetitive vector in R

I am tasked to create the vector
0 1 0 1 0 1 0 1 0 1
using two approaches without using c() or rep() in R.
I have tried a bunch of methods, but none of them seem to work.
Here are some of my attempts (all of which have failed) -
vector(0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
a<-seq(from = 0, to = 1 , by = 1)
a
replicate(5, a)
b<-1*(0:1)
do.call(cbind, replicate(5, b, simplify=FALSE))
Any help on this would be appreciated! Thank you.

We can use bitwAnd
> bitwAnd(0:9, 1)
[1] 0 1 0 1 0 1 0 1 0 1
or kronecker
> kronecker(as.vector(matrix(1, 5)), 0:1)
[1] 0 1 0 1 0 1 0 1 0 1
> kronecker((1:5)^0, 0:1)
[1] 0 1 0 1 0 1 0 1 0 1
or outer
> as.vector(outer(0:1, (1:5)^0))
[1] 0 1 0 1 0 1 0 1 0 1

Solution 1: Generalized Function my_rep()
A generalized solution my_rep() for any vector x you wish repeated n times
my_rep <- function(x, n) {
return(
# Use modulo '%%' to subscript the original vector (whose length I'll call "m"), by
# cycling 'n' times through its indices.
x[0:(length(x) * n - 1) %% length(x) + 1]
# 1 2 ... m 1 2 ... m 1 2 ... m
# | 1st cycle | | 2nd cycle | ... | nth cycle |
)
}
which can solve this case
my_rep(x = 0:1, n = 5)
# [1] 0 1 0 1 0 1 0 1 0 1
and many others
# Getting cute, to make a vector of strings without using 'c()'.
str_vec <- strsplit("a b ", split = " ")[[1]]
str_vec
# [1] "a" "b" ""
my_rep(x = str_vec, n = 3)
# [1] "a" "b" "" "a" "b" "" "a" "b" ""
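To make the index arithmetic inside my_rep() concrete, here is a minimal sketch of the cycling indices on their own. The names m, n and idx are only illustrative and do not appear in the answer above.

```r
m <- 2   # length of the vector to repeat (illustrative)
n <- 3   # number of repetitions (illustrative)

# 0:(m * n - 1) counts 0..5; '%% m' wraps it to 0 1 0 1 0 1;
# '+ 1' shifts the result to R's 1-based indices.
idx <- 0:(m * n - 1) %% m + 1
idx
# [1] 1 2 1 2 1 2

# Subscripting with the cycled indices repeats the vector.
(0:1)[idx]
# [1] 0 1 0 1 0 1
```

This relies on ':' binding more tightly than '%%', which in turn binds more tightly than '+'.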
Solution 2: Binary Vector of Arbitrary Length
Another quick solution, for a 0 1 0 1 ... 0 1 vector of arbitrary length l
# Whatever length you desire.
l <- 10
# Generate a vector of alternating 0s and 1s, of length 'l'.
(1:l - 1) %% 2
which yields the output:
[1] 0 1 0 1 0 1 0 1 0 1
Note
Special thanks to @Adam, who figured out 0:9 %% 2 on their own, shortly after my comment with that same solution; and who gracefully retracted their initial answer in favor of mine. :)

Exploiting boolean coercion.
+(1:10*c(-1, 1) > 0)
# [1] 0 1 0 1 0 1 0 1 0 1
Or without c().
+(1:10*(0:1*2) - 1 > 0)
# [1] 0 1 0 1 0 1 0 1 0 1

Here is a way using the apply functions.
unlist(lapply(1:5, function(x) 0:1))
# [1] 0 1 0 1 0 1 0 1 0 1
Similar but with replicate.
as.vector(replicate(5, 0:1))
# [1] 0 1 0 1 0 1 0 1 0 1
And just in case you love trig.
abs(as.integer(cos((1:10 * pi) / 2)))
# [1] 0 1 0 1 0 1 0 1 0 1
And here is one last one that I consider cheating just because. This one generalizes to any vector you want!
unlist(unname(read.table(textConnection("0 1 0 1 0 1 0 1 0 1"))))

We can use purrr::accumulate, and a simple negate(!) operation.
accumulate will perform the same operation recursively over its data argument and output all intermediate results.
In this case, it can be broken down into:
output[1] <-0
output[2] <-!output[1]
output[3] <-!output[2]
...
the output would then be c(0, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE), which is coerced to numeric.
purrr::accumulate(0:9, ~!.x)
[1] 0 1 0 1 0 1 0 1 0 1
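For readers without purrr, the same "negate the previous result" recursion can be sketched in base R with Reduce() and accumulate = TRUE. This is an analogue for illustration, not part of the answer above; the name flips is hypothetical.

```r
# Start from 0 and negate the previous result nine times,
# keeping every intermediate value (the index i is ignored).
flips <- Reduce(function(acc, i) !acc, x = 1:9, init = 0, accumulate = TRUE)

# The results mix a numeric 0 with logicals, so coerce explicitly.
as.numeric(flips)
# [1] 0 1 0 1 0 1 0 1 0 1
```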

First, make a list of the given numbers, then apply unlist() to convert the list into a vector, as shown in the code below:
my_list = list(0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
v = unlist(my_list)
print(v)
[1] 0 1 0 1 0 1 0 1 0 1

Related

How to randomly replace a value

I have a vector of a certain length of which I want to randomly replace every 2 by 0 or 1, with a probability of 0.4 (for value=1). I have used this code below. I expected to have a different value (0 or 1) for the different 2 replaced, but I have only 1 or 0 that replace the 2.
vec<-c(rep(2,18),1,0)
ifelse(vec == 2, rbinom(1, 1, 0.40), vec)
here is one output
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
and another output
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
When you go into the source code of ifelse via typing View(ifelse), you will see a piece of code showing that
len <- length(ans)
ypos <- which(test)
npos <- which(!test)
if (length(ypos) > 0L)
ans[ypos] <- rep(yes, length.out = len)[ypos]
if (length(npos) > 0L)
ans[npos] <- rep(no, length.out = len)[npos]
ans
That means that when yes or no in ifelse is a single value, that value is repeated len times and placed at the corresponding logical positions.
In your case, rbinom(1,1,0.40) yields just a single value for yes, so that one realization is repeated for every replaced position.
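The recycling can be checked directly. The sketch below uses an arbitrary seed and two illustrative names (one_draw, many_draws); it draws length(vec) values so that each position has its own draw, which is one possible variant rather than the only fix.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
vec <- c(rep(2, 18), 1, 0)

# 'yes' is a single draw, so ifelse() recycles that one value to every 2.
one_draw <- ifelse(vec == 2, rbinom(1, 1, 0.40), vec)
length(unique(one_draw[vec == 2]))
# [1] 1

# Supplying one draw per element gives each 2 its own independent value.
many_draws <- ifelse(vec == 2, rbinom(length(vec), 1, 0.40), vec)
many_draws
```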
One workaround is like below
> ifelse(vec == 2, rbinom(sum(vec == 2), 1, 0.40), vec)
[1] 1 1 0 0 0 0 1 0 0 0 1 0 1 0 0 1 1 1 1 0
This replaces all 2 values with either 0 or 1
vec[vec == 2] <- rbinom(sum(vec == 2), 1, prob = .4)
If you draw a 0 and want the value to remain 2 then you could use sample, which would be equivalent to a binomial draw:
vec[vec == 2] <- sample(c(1, 2), sum(vec == 2), prob = c(0.4, 0.6), replace = TRUE)
Try the following code, which replaces every second element (rather than every value equal to 2):
#Code
vec<-c(rep(2,18),1,0)
vec2 <- unlist(lapply(seq(2,length(vec),by=2), function(x) {vec[x] <- rbinom(1,1,0.40)}))
vec[seq(2,length(vec),by=2)] <-vec2
Output:
vec
[1] 2 0 2 0 2 1 2 0 2 0 2 0 2 1 2 0 2 0 1 1

Selecting all rows which have a value > 1 in R

I have a data set and want to select all rows which have a value > 1 in R.
I tried
sel <- apply(data[,collist],1,function(row) "1" %in% row)
but it does not work and gives me the whole data frame.
How can I subset these data?
Thanks
The Note at the end shows the data used in the examples below. I have changed the headings as shown since the ones provided in the question are unwieldy and have removed the column of minus signs.
1) Using that data, the correct answer to the question of selecting all rows with a 1 in any column is that only the first two data rows are selected and that is, in fact, what happens:
subset(data, A == 1 | B == 1 | C == 1)
## Sym A B C
## 1 ACAP3 0 0 1
## 2 ACTRT2 0 0 1
2) This version does not make use of the headings:
has1 <- rowSums(data == 1) > 0
data[has1, ]
## Sym A B C
## 1 ACAP3 0 0 1
## 2 ACTRT2 0 0 1
3) Although the above should work it would be a bit safer to just check the numeric columns which for this data can be done like this:
has1 <- rowSums(data[-1] == 1) > 0
data[has1, ]
## Sym A B C
## 1 ACAP3 0 0 1
## 2 ACTRT2 0 0 1
4) or if we did not know which columns were numeric:
is.num <- sapply(data, is.numeric)
has1 <- rowSums(data[is.num] == 1) > 0
data[has1, ]
## Sym A B C
## 1 ACAP3 0 0 1
## 2 ACTRT2 0 0 1
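If the goal is instead values strictly greater than 1, as the question title suggests, the same rowSums idiom should carry over with a different comparison. The sketch below uses a small hypothetical data frame (data2) with one value above 1 added for illustration; it is not the asker's data.

```r
# Hypothetical data with the same layout as the example, plus one value above 1.
data2 <- data.frame(
  Sym = c("ACAP3", "ACTRT2", "AGRN"),
  A   = c(0, 0, 2),
  B   = c(0, 0, 0),
  C   = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Check only the numeric columns, then keep rows where any value exceeds 1.
is.num <- sapply(data2, is.numeric)
above1 <- rowSums(data2[is.num] > 1) > 0
data2$Sym[above1]
## [1] "AGRN"
```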
Note
As the question did not provide input in reproducible form, the input shown in such form is assumed to be:
Lines <- 'Hugo_Symbol "A - 3 A- A9J" "B - F2 - 7273 - 01" "C - FB - AAPP - 01"
ACAP3 0 0 - 1
ACTRT2 0 0 - 1
AGRN 0 0 - 0
ANKRD65 0 0 - 0
ATAD3A 0 0 - 0
'
data <- read.table(text = Lines, skip = 1, col.names = c("Sym", "A", "B", "X", "C"),
colClasses = c(NA, NA, NA, "NULL", NA))
The above produces this:
data
## Sym A B C
## 1 ACAP3 0 0 1
## 2 ACTRT2 0 0 1
## 3 AGRN 0 0 0
## 4 ANKRD65 0 0 0
## 5 ATAD3A 0 0 0

Find a numeric pattern R

I would like to find the pattern of a 0 or 1 followed by a 2, occurring three or more times in a row. I would like to find this pattern and transform the 2's in this pattern into 1s - such as
Input:
Y <- c(0,1,0,3,2,5,2,1,2,0,2,1,2,0,1,2,1,3,1,2,1)
Some Function findPattern that finds the pattern:
findPattern(Y)
And Outputs the following:
[1] 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0
I have tried the following:
as.numeric(Y == 2 & lead(Y) %in% 1:2)
1. Find 0/1 followed by 2s
findPattern<-function(Y){
as.numeric(Y==2 & (c(NA,Y[-length(Y)])==0 |c(NA,Y[-length(Y)])==1 ))
}
I add an NA at the start and remove the last item so that the vector is "shifted" by one position while keeping the same length. This way you avoid for loops.
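A small sketch of the shift on a short made-up vector may help; the name prev is only illustrative.

```r
Y <- c(0, 1, 2, 0, 2)

# Drop the last element and prepend NA: position i now holds Y[i - 1].
prev <- c(NA, Y[-length(Y)])
prev
# [1] NA  0  1  2  0

# Flag each 2 whose predecessor is 0 or 1.
as.numeric(Y == 2 & (prev == 0 | prev == 1))
# [1] 0 0 1 0 1
```

The NA in the first position is harmless here as long as Y does not start with a 2, because FALSE & NA evaluates to FALSE.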
If you want to use %in%, which avoids a second pass over the vector:
findPattern<-function(Y){
as.numeric(Y==2 & (c(NA,Y[-length(Y)]) %in% c(0,1)))
}
2. Select only those that have at least three 1s every other position
findPattern<-function(Y){
w <- which(Y==2 & (c(NA,Y[-length(Y)]) %in% c(0,1)))
centers<- w[((w - 2) %in% w) & ((w+2) %in% w)]
result<-rep(0, times = length(Y))
result[c(centers,centers-2,centers+2)]<-1
return(result)
}
Testing:
findPattern(c(0,1,0,3,2,5,2,1,2,0,2,1,2,0,1,2,1,3,1,2,1))
[1] 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0
Here is a possible approach that uses a regular expression to find the pattern. Note that lag() below is dplyr::lag, which shifts a vector by one position; stats::lag from base R does not do this.
Starting vector:
> Y
[1] 0 2 0 3 2 5 2 1 2 0 2 1 2 0 1
1) Find out all the 2s preceded by 0 or 1;
> ind <- as.integer(lag(Y %in% c(0, 1)) & (Y == 2) )
> ind
[1] 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0
2) Paste the resulting vector into a string and use regular expression to find out the location and length of the required pattern, i.e., alternating 0 and 1 equal or more than three times;
> id <- gregexpr("(01){3,}", paste0(ind, collapse = ""))
> id
[[1]]
[1] 8
attr(,"match.length")
[1] 6
attr(,"useBytes")
[1] TRUE
3) Extract the location and length from the regular expression result and convert them into index positions;
> start <- as.numeric(id[[1]])
> end <- start + attr(id[[1]], "match.length") - 1
> indArray <- unlist(Map(`:`, start, end))
> indArray
[1] 8 9 10 11 12 13
4) Set to 0 every flag that lies outside a run of three or more 01 repetitions
> ind[-indArray] <- 0
> ind
[1] 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0
Wrap them into a function:
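library(dplyr)  # provides the vector-shifting lag()
findPattern <- function(Y) {
  # Flag every 2 that is immediately preceded by a 0 or 1.
  ind <- as.integer(lag(Y %in% c(0, 1)) & (Y == 2))
  # Locate runs of three or more consecutive "01" pairs in the flag string.
  id <- gregexpr("(01){3,}", paste0(ind, collapse = ""))
  start <- as.numeric(id[[1]])
  # gregexpr() returns -1 when nothing matches; then no flag qualifies.
  if (start[1] == -1) return(rep(0, length(Y)))
  end <- start + attr(id[[1]], "match.length") - 1
  indArray <- unlist(Map(`:`, start, end))
  # Zero every flag outside the matched runs.
  ind[-indArray] <- 0
  ind
}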
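For comparison, here is a base R sketch of the same regular-expression idea that avoids the dplyr dependency by shifting the vector manually. The function name find_pattern_regex is hypothetical, and the sketch assumes Y contains no NA values.

```r
find_pattern_regex <- function(Y) {
  # Previous element of Y, with NA in the first position.
  prev <- c(NA, Y[-length(Y)])
  # 1 where a 2 is immediately preceded by a 0 or 1, else 0.
  ind <- as.integer(Y == 2 & prev %in% c(0, 1))
  # Runs of three or more "01" pairs in the flag string.
  m <- gregexpr("(01){3,}", paste0(ind, collapse = ""))[[1]]
  out <- integer(length(Y))
  # gregexpr() returns -1 when there is no match.
  if (m[1] == -1) return(out)
  for (k in seq_along(m)) {
    idx <- m[k]:(m[k] + attr(m, "match.length")[k] - 1)
    out[idx] <- ind[idx]  # keep flags only inside matched runs
  }
  out
}

Y <- c(0,1,0,3,2,5,2,1,2,0,2,1,2,0,1,2,1,3,1,2,1)
find_pattern_regex(Y)
# [1] 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0
```

Starting from an all-zero result and copying flags only inside the matched runs sidesteps negative indexing, so an input without any qualifying run simply returns zeros.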
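Using the stringi package (note that this only counts the 2s and flags the third one onward; it does not check the preceding value):
library(stringi)
Y <- c(0,1,0,3,2,5,2,1,2,0,2,1,2,0,1)
matchVec = stri_count(Y,fixed=2)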
remapVec = as.integer(matchVec & (cumsum(matchVec)>=3))
remapVec
#[1] 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0

Split vector into contiguous runs of equal values

I have a data table and one of the columns is a bunch of 0's and 1's, just like vec below.
vec = c(rep(1, times = 6), rep(0, times = 10), rep(1, times = 11), rep(0, times = 4))
> vec
[1] 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
What I want to do is to split the data every time there's a change in that column from 0 to 1 or vice-versa. Here is what I have done so far:
b = c(vec[1],diff(vec))
rowby = numeric(0)
for (i in 2:(length(b))) {
if (b[i] != 0) {
rowby <- c(rowby, i-1)
}
}
splitted_data <- split(vec, cumsum(c(TRUE,(1:length(vec) %in% rowby)[-length(vec)])))
There must be something right under my nose I can't see. What is a correct way to do this? This works for the example above, but not generally.
Try
split(vec,cumsum(c(1, abs(diff(vec)))))
#$`1`
#[1] 1 1 1 1 1 1
#$`2`
#[1] 0 0 0 0 0 0 0 0 0 0
#$`3`
#[1] 1 1 1 1 1 1 1 1 1 1 1
#$`4`
#[1] 0 0 0 0
Or use rle
split(vec,inverse.rle(within.list(rle(vec), values <- seq_along(values))))
With current versions of data.table, rleid is one function which can be used for this job:
library(data.table)#v1.9.5+
split(vec,rleid(vec))
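The cumsum idea can also be written without diff(), so that it applies to non-numeric vectors as well. A minimal base R sketch, with a made-up character vector x and an illustrative name run_id:

```r
# Hypothetical input with more than two distinct values, stored as text.
x <- c("a", "a", "b", "b", "b", "a", "c")

# TRUE at the first element and wherever a value differs from its predecessor;
# the running sum of those flags numbers the runs 1, 2, 3, ...
run_id <- cumsum(c(TRUE, x[-1] != x[-length(x)]))
run_id
# [1] 1 1 2 2 2 3 4

split(x, run_id)
```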

R : Updating a vector given a set of indices

I have a vector (initialized to zeros) and a set of indices of that vector. For each value in indices, I want to increment the corresponding index in the vector. So, say 6 occurs twice in indices (as in the example below), then the value of the 6th element of the vector should be 2.
Eg:
> v = rep(0, 10)
> v
[1] 0 0 0 0 0 0 0 0 0 0
> indices
[1] 7 8 6 6 2
The updated vector should be
> c(0, 1, 0, 0, 0, 2, 1, 1, 0, 0)
[1] 0 1 0 0 0 2 1 1 0 0
What is the most idiomatic way of doing this without using loops?
The function tabulate is made for that
> indices = c(7,8,6,6,2);
> tabulate(bin=indices, nbins=10);
[1] 0 1 0 0 0 2 1 1 0 0
You can use rle for this:
x <- rle(sort(indices))
v[x$values] <- x$lengths
v
# [1] 0 1 0 0 0 2 1 1 0 0
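It may help to see why a plain vectorised increment is not enough here, and how the tabulate approach extends to a vector that does not start at zero. The name naive is only illustrative.

```r
v <- rep(0, 10)
indices <- c(7, 8, 6, 6, 2)

# Naive vectorised increment: the duplicated index 6 is only counted once.
naive <- v
naive[indices] <- naive[indices] + 1
naive
# [1] 0 1 0 0 0 1 1 1 0 0

# Adding the tabulated counts also works when v is not all zeros.
v + tabulate(indices, nbins = length(v))
# [1] 0 1 0 0 0 2 1 1 0 0
```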

Resources