I have a vector of 1s and 0s in consecutive runs, like this:
vec <- c(1,1,1,0,0,1,1,1,1,0,0,0,1,1,1)
I want to replace each run of consecutive 1s with the order in which that run appears (first run, second run, and so on), so the final vector would look like the one below.
vec1 <- c(1,1,1,0,0,2,2,2,2,0,0,0,3,3,3)
I am not sure how to even start with this problem. Any help is appreciated.
One way using rle :
with(rle(vec), rep(values * cumsum(values), lengths))
#[1] 1 1 1 0 0 2 2 2 2 0 0 0 3 3 3
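For readers new to rle, it may help to inspect the pieces of that one-liner separately. This is only an illustrative sketch: the names r, run_id and result are introduced here and are not part of the answer, and it assumes vec contains only 0s and 1s.

```r
vec <- c(1,1,1,0,0,1,1,1,1,0,0,0,1,1,1)

r <- rle(vec)   # run-length encoding: one entry per run
r$values        # 1 0 1 0 1  (the value of each run)
r$lengths       # 3 2 4 3 3  (the length of each run)

# cumsum(values) counts the runs of 1s seen so far; multiplying by
# values sets the id of every run of 0s back to 0
run_id <- r$values * cumsum(r$values)   # 1 0 2 0 3

# expand each run id back to the original length of its run
result <- rep(run_id, r$lengths)
result
```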
One base R option could be:
cumsum(vec & c(0, head(vec, -1)) == 0) * vec
[1] 1 1 1 0 0 2 2 2 2 0 0 0 3 3 3
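The expression above can be read as "count the positions where a run of 1s starts". A hedged step-by-step sketch follows; prev, starts and res are illustrative names, and the input is again assumed to hold only 0s and 1s.

```r
vec <- c(1,1,1,0,0,1,1,1,1,0,0,0,1,1,1)

prev   <- c(0, head(vec, -1))   # vec shifted right by one, padded with a leading 0
starts <- vec & prev == 0       # TRUE where a 1 follows a 0 (or opens the vector)
which(starts)                   # 1 6 13

# the running count of starts labels each run; multiplying by vec zeroes the 0s
res <- cumsum(starts) * vec
res
```

Note that == binds more tightly than & in R, so the comparison with 0 is applied to the shifted vector before the logical AND.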
We can use rleid from data.table
library(data.table)
vec[as.logical(vec)] <- as.integer(factor(rleid(vec)[as.logical(vec)]))
vec
#[1] 1 1 1 0 0 2 2 2 2 0 0 0 3 3 3
I have a really large boolean vector (i.e. T or F) and I want to simply be able to estimate how many "blocks" of consecutive T there are in my vector contained between the F elements.
A simple example of a vector with 3 of these consecutive "blocks" of T elements:
x <- c(T,T,T,T,F,F,F,F,T,T,T,T,F,T,T)
Output:
1,1,1,1,0,0,0,0,2,2,2,2,0,3,3
You can do:
rle <- rle(x)
rle$values <- with(rle, cumsum(values) * values)
inverse.rle(rle)
[1] 1 1 1 1 0 0 0 0 2 2 2 2 0 3 3
And a simplified and more elegant version of the basic idea (proposed by @Lyngbakr):
with(rle(x), rep(cumsum(values) * values, lengths))
Another solution with rle/inverse.rle:
x <- c(T,T,T,T,F,F,F,F,T,T,T,T,F,T,T)
rle_x <- rle(x)
rle_x$values[rle_x$values] <- seq_len(sum(rle_x$values))
inverse.rle(rle_x)
# [1] 1 1 1 1 0 0 0 0 2 2 2 2 0 3 3
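If only the number of blocks is needed, rather than a label for every element, the same run-length encoding can be summed directly. A minimal sketch, with n_blocks as an illustrative name:

```r
x <- c(T,T,T,T,F,F,F,F,T,T,T,T,F,T,T)

# rle(x)$values holds one TRUE or FALSE per run, so summing it
# counts the runs of TRUE
n_blocks <- sum(rle(x)$values)
n_blocks
# [1] 3
```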
I have a vector:
> a <- c(0,1,2,3,4)
I am trying to replace the value of everything with that value incremented by 1, like below:
a <- c(1,2,3,4,5)
> replace(a,a==4,5)
[1] 0 1 2 3 5
But when I try to replace 3 with 4, there is some issue
replace(a,a==3,4)
[1] 0 1 2 4 4
Both 3 and 5 are getting converted to 4.
and again when I try to replace 2 with 3, the same happens
> replace(a,a==2,3)
[1] 0 1 3 3 4
Can someone point out what I am doing wrong here?
replace doesn't change its argument.
> a = c(0,1,2,3,4)
> replace(a,a==2,99)
[1] 0 1 99 3 4
But a is still the same:
> a
[1] 0 1 2 3 4
so when you thought you'd converted the 4 to a 5 in a you hadn't. Use the return value if you want to change a:
> a
[1] 0 1 2 3 4
> a = replace(a,a==2,99)
> a
[1] 0 1 99 3 4
[As pointed out in comments, there are better ways to add 1 to all values of a vector, a=a+1 being the best]
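A short sketch of both points, using the vector from the question (b is an illustrative name): replace returns a modified copy, and arithmetic on a vector is applied to every element.

```r
a <- c(0, 1, 2, 3, 4)

b <- replace(a, a == 4, 5)   # returns a modified copy
a                            # unchanged: 0 1 2 3 4
b                            # 0 1 2 3 5

a <- a + 1                   # vectorised: increments every element at once
a                            # 1 2 3 4 5
```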
I am trying to split one column in a data frame into multiple columns, using the values from the original column as the new column names. Each new column should then hold a 1 if that value occurred in the original column for that row, or a 0 if it did not. I realize this is not the best way to explain it, so, for example:
df <- data.frame(subject = c(1:4), Location = c('A', 'A/B', 'B/C/D', 'A/B/C/D'))
# subject Location
# 1 1 A
# 2 2 A/B
# 3 3 B/C/D
# 4 4 A/B/C/D
and would like to expand it to wide format, something such as, with 1's and 0's (or T and F):
# subject A B C D
# 1 1 1 0 0 0
# 2 2 1 1 0 0
# 3 3 0 1 1 1
# 4 4 1 1 1 1
I have looked into tidyr and the separate function, and reshape2 and the cast function, but I seem to be getting hung up on producing the logical values. Any help on the issue would be greatly appreciated. Thank you.
You may try cSplit_e from package splitstackshape:
library(splitstackshape)
cSplit_e(data = df, split.col = "Location", sep = "/",
type = "character", drop = TRUE, fill = 0)
# subject Location_A Location_B Location_C Location_D
# 1 1 1 0 0 0
# 2 2 1 1 0 0
# 3 3 0 1 1 1
# 4 4 1 1 1 1
You could take the following step-by-step approach.
## get the unique values after splitting
u <- unique(unlist(strsplit(as.character(df$Location), "/")))
## compare 'u' with 'Location'
m <- vapply(u, grepl, logical(nrow(df)), x = df$Location)
## coerce to integer representation
m[] <- as.integer(m)
## bind 'm' to 'subject'
cbind(df["subject"], m)
# subject A B C D
# 1 1 1 0 0 0
# 2 2 1 1 0 0
# 3 3 0 1 1 1
# 4 4 1 1 1 1
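One caveat, which is an assumption about data beyond this example: grepl does substring matching, so a label such as "A" would also match a location named "AB". A minimal base R sketch of an exact-match variant follows; parts, u, m and wide are illustrative names.

```r
df <- data.frame(subject = 1:4,
                 Location = c("A", "A/B", "B/C/D", "A/B/C/D"))

# one character vector of labels per row
parts <- strsplit(as.character(df$Location), "/")
# all distinct labels, in sorted order
u <- sort(unique(unlist(parts)))

# one row per subject, one column per label: 1 if the label occurs in that row
m <- do.call(rbind, lapply(parts, function(p) as.integer(u %in% p)))
colnames(m) <- u

wide <- cbind(df["subject"], m)
wide
```

Because %in% compares whole labels, "A" can only match an element that is exactly "A".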
I have this vector :
x = c(1,1,1,1,1,0,1,0,0,0,1,1)
And I want to do a cumulative sum for the positive numbers only. I should have the following vector in return:
xc = c(1,2,3,4,5,0,1,0,0,0,1,2)
How could I do it?
I've tried cumsum(x), but that does the cumulative sum over all values and gives:
cumsum(x)
[1] 1 2 3 4 5 5 6 6 6 6 7 8
One option is
x1 <- inverse.rle(within.list(rle(x), values[!!values] <-
(cumsum(values))[!!values]))
x[x1!=0] <- ave(x[x1!=0], x1[x1!=0], FUN=seq_along)
x
#[1] 1 2 3 4 5 0 1 0 0 0 1 2
Or a one-line code would be
x[x>0] <- with(rle(x), sequence(lengths[!!values]))
x
#[1] 1 2 3 4 5 0 1 0 0 0 1 2
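To see what the one-liner does, the intermediate values can be printed. This is a sketch with illustrative names (r, one_runs, out). Note that it numbers positions within each nonzero run, which equals the cumulative sum only because the nonzero entries here are all 1.

```r
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)

r <- rle(x)
one_runs <- r$lengths[r$values != 0]   # lengths of the nonzero runs: 5 1 2
sequence(one_runs)                     # 1 2 3 4 5 1 1 2

out <- x
out[out > 0] <- sequence(one_runs)     # write the counters back at the nonzero positions
out
```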
Here's a possible solution using data.table v >= 1.9.5 and its new rleid function
library(data.table)
as.data.table(x)[, cumsum(x), rleid(x)]$V1
## [1] 1 2 3 4 5 0 1 0 0 0 1 2
Base R, one-line solution with Map and Reduce:
> Reduce('c', Map(function(u,v) if(v==0) rep(0,u) else 1:u, rle(x)$lengths, rle(x)$values))
[1] 1 2 3 4 5 0 1 0 0 0 1 2
Or:
unlist(Map(function(u,v) if(v==0) rep(0,u) else 1:u, rle(x)$lengths, rle(x)$values))
x=c(1,1,1,1,1,0,1,0,0,0,1,1)
cumsum_ <- function(x) {
r <- rle(x)
s <- split(x, rep(seq_along(r$values), rle(x)$lengths))
return(unlist(sapply(s, cumsum), use.names = F))
}
(xc <- cumsum_(x))
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
I don't know much R, but I have written a small piece of code in Python. The logic remains the same in any language. Hope this helps.
x = [1,1,1,1,1,0,1,0,0,0,1,1]
tot = 0
for i in range(len(x)):
    if x[i] != 0:
        tot = tot + x[i]
        x[i] = tot
    else:
        tot = 0
print(x)
x<-c(1,1,1,1,1,0,1,0,0,0,1,1)
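skumulowana <- function(x) {
  dl <- length(x)
  xx <- numeric(dl + 1)
  for (i in seq_len(dl)) {
    # reset the running total at a zero, otherwise add the current value
    xx[i + 1] <- if (x[i] == 0) 0 else xx[i] + x[i]
  }
  wynik <- xx[-1]  # drop the leading placeholder element
  return(wynik)
}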
skumulowana(x)
## [1] 1 2 3 4 5 0 1 0 0 0 1 2
Try this one-liner...
Reduce(function(x,y) (x+y)*(y!=0), x, accumulate=T)
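A hedged unpacking of that one-liner, with the anonymous function given the illustrative name step: Reduce with accumulate = TRUE keeps every intermediate running total, and the factor (y != 0) resets the total whenever the next element is zero.

```r
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)

# acc is the running total so far, y is the next element;
# (y != 0) evaluates to FALSE (i.e. 0) at a zero, which resets the total
step <- function(acc, y) (acc + y) * (y != 0)

res <- Reduce(step, x, accumulate = TRUE)
res
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
```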
split and lapply version:
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
unlist(lapply(split(x, cumsum(x==0)), cumsum))
step by step:
a <- split(x, cumsum(x==0)) # divides x into pieces where each 0 starts a new piece
b <- lapply(a, cumsum) # calculates cumsum in each piece
unlist(b) # rejoins the pieces
Result has useless names but is otherwise what you wanted:
# 01 02 03 04 05 11 12 2 3 41 42 43
# 1 2 3 4 5 0 1 0 0 0 1 2
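If the names are unwanted, one option is to ask unlist not to generate them. A minimal sketch of the same split and lapply idea, with res as an illustrative name:

```r
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)

# use.names = FALSE drops the group-derived names from the result
res <- unlist(lapply(split(x, cumsum(x == 0)), cumsum), use.names = FALSE)
res
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
```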
Here is another base R solution using aggregate. The idea is to make a data frame with x and a new column named x.1 by which we can apply aggregate functions (cumsum in this case):
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
r <- rle(x)
df <- data.frame(x,
x.1=unlist(sapply(1:length(r$lengths), function(i) rep(i, r$lengths[i]))))
# df
# x x.1
# 1 1 1
# 2 1 1
# 3 1 1
# 4 1 1
# 5 1 1
# 6 0 2
# 7 1 3
# 8 0 4
# 9 0 4
# 10 0 4
# 11 1 5
# 12 1 5
agg <- aggregate(df$x~df$x.1, df, cumsum)
as.vector(unlist(agg$`df$x`))
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
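The same grouping idea can probably be expressed more compactly with ave, which applies a function within groups and returns the result in the original order. This is a sketch rather than part of the answer above; run_id and res are illustrative names.

```r
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
r <- rle(x)

# one group id per run, repeated to the length of the run
run_id <- rep(seq_along(r$lengths), r$lengths)   # 1 1 1 1 1 2 3 4 4 4 5 5

# cumulative sum within each run, returned in the original order
res <- ave(x, run_id, FUN = cumsum)
res
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
```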