How to change vector values?

How do I change 3 values as a set in a vector, eg:
v=c(1,1,1,1,1,6,2,3,4,4,4,4,4,2)
r=v[which(v %in% c(6,2,3))] = c(6,0,8)
r
[1] 1 1 1 1 1 6 0 8 4 4 4 4 4 6
The result comes with a warning: number of items to replace is not a multiple of replacement length.
The ideal result I need is:
[1] 1 1 1 1 1 6 0 8 4 4 4 4 4 2
I only want the 3 values to change when they occur together as a set/group, not individually. Any suggestions would be greatly appreciated!
Edit:
Sorry, there is actually more than one set of 6,2,3 in my data, so the example should look like:
v=c(1,1,2,1,1,6,2,3,4,4,4,4,4,2,6,2,3,4)
And the result should be:
[1] 1 1 2 1 1 6 0 8 4 4 4 4 4 2 6 0 8 4

You can use rollapply from the zoo package:
library(zoo)
rollapply(v, width=3, function(x) all(x == c(6, 2, 3)))
# [1] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
And then you can use that boolean to replace the values you want:
s <- which(rollapply(v, width=3, function(x) all(x == c(6, 2, 3))))
v[s:(s+2)] <- c(6, 0, 8)
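In one happy function (p is the pattern to find, r the replacement of the same length; every non-overlapping match is replaced):
library(zoo)
rolling_replace <- function(v, p, r) {
  l <- length(p)
  s <- which(rollapply(v, width = l, function(x) all(x == p)))
  # loop over all match starts so repeated occurrences are handled too
  for (i in s) v[i:(i + l - 1)] <- r
  v
}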

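A quite different way, using regular expressions:
as.numeric(strsplit(gsub("6 2 3", "6 0 8", paste(v, collapse = " ")), " ")[[1]])
# [1] 1 1 1 1 1 6 0 8 4 4 4 4 4 2
This will replace all instances of c(6, 2, 3). Note that the pattern is matched as text, so it would also match inside longer numbers such as 16 2 3 or 6 2 31.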

Base R function using rle
seqrepl <- function(v,find,repl) {
l <- length(find)
rv <- rle(v %in% find)
start <- cumsum(rv$lengths)[which(rv$lengths==l & rv$values==TRUE)-1]+1
v[start:(start+(l-1))] <- repl
v
}
Result:
seqrepl(v=v,find=c(6,2,3),repl=c(6,0,8))
#[1] 1 1 1 1 1 6 0 8 4 4 4 4 4 2
An alternative using embed:
seqrepl2 <- function(v,find,repl) {
l <- length(find)
start <- which(colSums(t(embed(v,l)) == rev(find)) == l)
v[start:(start+(l-1))] <- repl
v
}
seqrepl2(v=v,find=c(6,2,3),repl=c(6,0,8))
#[1] 1 1 1 1 1 6 0 8 4 4 4 4 4 2
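The functions above index with start:(start+(l-1)), which only covers a single match. For the edited question, where the pattern occurs more than once, a minimal sketch building on the embed idea might look like this. The name seqrepl_all is hypothetical, and the sketch assumes find and repl have the same length and that matches do not overlap.

```r
# Hypothetical helper: replace every occurrence of `find` in `v` with `repl`.
# Assumes length(find) == length(repl) and non-overlapping matches.
seqrepl_all <- function(v, find, repl) {
  stopifnot(length(find) == length(repl))
  l <- length(find)
  if (length(v) < l) return(v)
  # start positions whose window of length l equals `find`
  starts <- which(colSums(t(embed(v, l)) == rev(find)) == l)
  # expand each start into the l positions it covers
  idx <- as.vector(outer(seq_len(l) - 1, starts, "+"))
  v[idx] <- rep(repl, times = length(starts))
  v
}

v <- c(1,1,2,1,1,6,2,3,4,4,4,4,4,2,6,2,3,4)
seqrepl_all(v, find = c(6,2,3), repl = c(6,0,8))
# [1] 1 1 2 1 1 6 0 8 4 4 4 4 4 2 6 0 8 4
```

The lone 2 values at positions 3 and 14 are left untouched because they are not part of a full 6,2,3 run.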

Related

Count values of same kind in a row and omit interruptions

My actual problem is that I want to count the lengths of runs of equal values in my vector, for example:
v <- c(1,1,1,1,2,1,1,3,3,3,1,1,2,2,2)
But additionally I want to omit all interruptions with the length 1.
How can I achieve that my result here would be:
1,1,1,1,1,1,1,3,3,3,1,1,2,2,2
Note that the single "two" should now turn into a "one", so that with
v_new <- c(1,1,1,1,1,1,1,3,3,3,1,1,2,2,2)
rle(v_new)
lengths: int [1:4] 7 3 2 3
values : num [1:4] 1 3 1 2
Thanks,
Mike
v <- c(1,1,1,1,2,1,1,3,3,3,1,1,2,2,2)
# positions where the value rises and then immediately falls (a one-element peak)
local.peak <- which(diff(sign(diff(v))) == -2) + 1
v[local.peak] <- v[local.peak - 1]
v
# [1] 1 1 1 1 1 1 1 3 3 3 1 1 2 2 2
The local peak expression is taken from "Finding local maxima and minima". Note that it only catches single values that are higher than both neighbours, not single values that are lower.
Below is a little function that replaces every run of length 1 with the value either to the left or to the right of it.
Your input
v <- c(1,1,1,1,2,1,1,3,3,3,1,1,2,2,2)
fun(v)
# [1] 1 1 1 1 1 1 1 3 3 3 1 1 2 2 2
Modified input
v <- c(1,1,1,1,2,4,4,3,3,3,1,1,2,2,2)
# ^ ^
Usage
fun(v, align = "right")
# [1] 1 1 1 1 4 4 4 3 3 3 1 1 2 2 2
Default is left aligned
fun(v)
# [1] 1 1 1 1 1 4 4 3 3 3 1 1 2 2 2
function
fun <- function(x, align = c("left", "right")) {
align <- match.arg(align)
rle_x <- rle(x)
rle_x$values <- with(rle_x, replace(values, lengths == 1, NA))
switch(align,
left = approx(inverse.rle(rle_x), xout = seq_along(x), method = "constant", f = 0)$y,
right = approx(inverse.rle(rle_x), xout = seq_along(x), method = "constant", f = 1)$y)
}
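For comparison, the left-aligned behaviour can also be sketched without approx, by editing the run values directly. The name smooth_singletons is hypothetical; the sketch assumes a leading run of length 1 should be left unchanged, since it has no left neighbour.

```r
# Hypothetical helper: overwrite every run of length 1 with the value of the run before it.
smooth_singletons <- function(x) {
  r <- rle(x)
  single <- which(r$lengths == 1)
  single <- single[single > 1]  # a leading singleton has no left neighbour
  # walk left to right so consecutive singletons inherit the same earlier value
  for (i in single) r$values[i] <- r$values[i - 1]
  inverse.rle(r)
}

v <- c(1,1,1,1,2,1,1,3,3,3,1,1,2,2,2)
v_new <- smooth_singletons(v)
v_new
# [1] 1 1 1 1 1 1 1 3 3 3 1 1 2 2 2
rle(v_new)$lengths
# [1] 7 3 2 3
```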

Compare the present value with the previous value in R while in ifelse()

I am working on a project in which I want to compare the present value with the previous value and return 1 if the comparison is true and 0 if it is false.
I tried
brv_trx1$'first' <- ifelse(brv_trx1$`Total TRx` != lag(brv_trx1$`Total TRx`),1,0)
This code did not work as expected.
x= c(1,2,2,2,3,4,5,5,5,5,6,7)
I wanted an output similar to this:
x y
1 1
2 1
2 0
2 0
3 1
4 1
5 1
5 0
5 0
After this step I have a decile function
brv_trx1$decvar <- ifelse(brv_trx1$cum != 0 & brv_trx1$first == 1, (11 - ceiling(round((brv_trx1$cum/total) * 10, 4))),
ifelse(brv_trx1$cum != 0 & brv_trx1$first == 0 , lag(brv_trx1$decvar), 0))
For this function, I was getting a lot of NAs.
The output expected was :
Y Dec
1 10
1 10
1 9
0 9
0 9
1 8
0 8
1 8
1 8
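Because lag() will produce NA for the first entry, consider the following. This relies on dplyr::lag(); base R's stats::lag() does not shift the values of a plain vector, so the comparison would always be TRUE.
library(dplyr)
x <- c(1,2,2,2,3,4,5,5,5,5,6,7)
x <- data.frame(x = x)
x$y <- ifelse((x$x == lag(x$x)) %in% c(NA, FALSE), 1, 0)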
If the comparison of x == lag(x) is FALSE or NA (because it's the first comparison of the lag), flag 1, else flag 0 per your example above.
You can use indexes. Here I've made vectors that go from 1-9, and 2-10. Then you compare the elements of your original vector by using the "shifted by 1" indexes (1 compares to 2, 2 compares to 3, etc).
x <- c(1,2,2,3,4,4,4,5,6,7)
length(x)
#[1] 10
i.1 <- 1:(length(x)-1)
i.2 <- 2:length(x)
x[i.1] == x[i.2]
#[1] FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
Using diff
ifelse(c(1,diff(x))==0,0,1)
[1] 1 1 0 0 1 1 1 0 0 0 1 1
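The shifted-index idea can be written in base R in one line, and the same flag can then be used to carry a value forward instead of calling lag() on a column that is still being built. The sketch below is only an illustration: y and raw are made-up vectors that mirror the expected Y/Dec table in the question, with raw standing in for the per-row decile formula, and it assumes the first row is always flagged 1.

```r
x <- c(1, 2, 2, 2, 3, 4, 5, 5, 5, 5, 6, 7)
# 1 where an element differs from the one before it; the first element always counts as a change
first <- as.integer(c(TRUE, x[-1] != x[-length(x)]))
first
# [1] 1 1 0 0 1 1 1 0 0 0 1 1

# Hypothetical data: y is the change flag, raw the value computed on every row
y   <- c(1, 1, 1, 0, 0, 1, 0, 1, 1)
raw <- c(10, 10, 9, 7, 6, 8, 5, 8, 8)
# keep the values at flagged rows, then repeat each one until the next flagged row
dec <- raw[y == 1][cumsum(y)]
dec
# [1] 10 10  9  9  9  8  8  8  8
```

Indexing by cumsum(y) avoids the self-reference in the original ifelse() call, which is one plausible source of the NAs.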

Create dummy-column based on another columns

Let's say I have this dataset
> example <- data.frame(a = 1:10, b = 10:1, c = 1:5 )
I want to create a new variable d that is 1 when at least one of the variables a, b, c contains the value 1, 2 or 3, and 0 otherwise.
d should look like this:
d <- c(1, 1, 1, 0, 0, 1, 1, 1, 1, 1)
Thanks in advance.
You can use rowSums to get a logical vector indicating whether 1, 2 or 3 appears in each row, and wrap it in as.integer to convert it to 0 and 1, i.e.
as.integer(rowSums(example == 1 | example == 2 | example == 3) > 0)
#[1] 1 1 1 0 0 1 1 1 1 1
Will work for any number of vars:
example <- data.frame(a = 1:10, b = 10:1, c = 1:5 )
x <- c(1, 2, 3)
# test each column against x first, then OR the logical vectors together
as.integer(Reduce(`|`, lapply(example, `%in%`, x)))
With the dplyr package:
library(dplyr)
x <- 1:3
example %>% mutate(d = as.integer(a %in% x | b %in% x | c %in% x))
Two other possibilities which work with any number of columns:
#option 1
example$d <- +(rowSums(sapply(example, `%in%`, 1:3)) > 0)
#option 2
library(matrixStats)
example$d <- rowMaxs(+(sapply(example, `%in%`, 1:3)))
which both give:
> example
    a  b c d
1   1 10 1 1
2   2  9 2 1
3   3  8 3 1
4   4  7 4 0
5   5  6 5 0
6   6  5 1 1
7   7  4 2 1
8   8  3 3 1
9   9  2 4 1
10 10  1 5 1
You can do this using apply (although it is a little slow).
Logic: any checks whether 1, 2 or 3 is present, and apply runs this check on each row. Finally, the logical outcome is converted to numeric by adding +0 (you may use as.numeric here if you want to be more explicit).
d <- apply(example, 1, function(x) any(x==1|x==2|x==3))+0
To run the logic on only some of the columns, you can do this:
d <- apply(example[,c("a","b","c")], 1, function(x) any(x==1|x==2|x==3))+0
This gives you control over which columns to include or ignore.
Output:
> d
[1] 1 1 1 0 0 1 1 1 1 1
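The column restriction can also be wrapped in a small vectorised helper that avoids the row-by-row apply. This is only a sketch: flag_any and its cols argument are hypothetical names, and it assumes at least one column is selected.

```r
# Hypothetical helper: 1 if any of `cols` in a row holds one of `values`, else 0
flag_any <- function(df, values, cols = names(df)) {
  # one logical column per selected variable; cbind always yields a matrix
  hits <- do.call(cbind, lapply(df[cols], `%in%`, values))
  as.integer(rowSums(hits) > 0)
}

example <- data.frame(a = 1:10, b = 10:1, c = 1:5)
flag_any(example, 1:3)
# [1] 1 1 1 0 0 1 1 1 1 1
flag_any(example, 1:3, cols = c("a", "c"))
# [1] 1 1 1 0 0 1 1 1 0 0
```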
general solution:
example %>%
sapply(function(i)i %in% x) %>% apply(1,any) %>% as.integer
#[1] 1 1 1 0 0 1 1 1 1 1
Try this method, which checks whether any column contains at least one of the elements of x.
x<-c(1,2,3)
example$d<-as.numeric(example$a %in% x | example$b %in% x | example$c %in% x)
example
    a  b c d
1   1 10 1 1
2   2  9 2 1
3   3  8 3 1
4   4  7 4 0
5   5  6 5 0
6   6  5 1 1
7   7  4 2 1
8   8  3 3 1
9   9  2 4 1
10 10  1 5 1

Cumulative sum for positive numbers only [duplicate]

I have this vector :
x = c(1,1,1,1,1,0,1,0,0,0,1,1)
And I want to do a cumulative sum for the positive numbers only. I should have the following vector in return:
xc = c(1,2,3,4,5,0,1,0,0,0,1,2)
How could I do it?
I've tried cumsum(x), but that does the cumulative sum over all values and gives:
cumsum(x)
[1] 1 2 3 4 5 5 6 6 6 6 7 8
One option is
x1 <- inverse.rle(within.list(rle(x), values[!!values] <-
(cumsum(values))[!!values]))
x[x1!=0] <- ave(x[x1!=0], x1[x1!=0], FUN=seq_along)
x
#[1] 1 2 3 4 5 0 1 0 0 0 1 2
Or a one-line code would be
x[x>0] <- with(rle(x), sequence(lengths[!!values]))
x
#[1] 1 2 3 4 5 0 1 0 0 0 1 2
Here's a possible solution using data.table v >= 1.9.5 and its new rleid function
library(data.table)
as.data.table(x)[, cumsum(x), rleid(x)]$V1
## [1] 1 2 3 4 5 0 1 0 0 0 1 2
Base R, one line solution with Map Reduce :
> Reduce('c', Map(function(u,v) if(v==0) rep(0,u) else 1:u, rle(x)$lengths, rle(x)$values))
[1] 1 2 3 4 5 0 1 0 0 0 1 2
Or:
unlist(Map(function(u,v) if(v==0) rep(0,u) else 1:u, rle(x)$lengths, rle(x)$values))
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
cumsum_ <- function(x) {
  r <- rle(x)
  # label each run, then take the cumulative sum within each run
  s <- split(x, rep(seq_along(r$values), r$lengths))
  unlist(lapply(s, cumsum), use.names = FALSE)
}
(xc <- cumsum_(x))
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
I don't know much R, but I have written a small piece of code in Python. The logic is the same in any language. Hope this helps.
x = [1,1,1,1,1,0,1,0,0,0,1,1]
tot = 0
for i in range(len(x)):
    if x[i] != 0:
        # extend the running total and store it in place
        tot = tot + x[i]
        x[i] = tot
    else:
        # a zero resets the running total
        tot = 0
print(x)
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
skumulowana <- function(x) {
  dl <- length(x)
  xx <- numeric(dl + 1)
  for (i in seq_len(dl)) {
    # reset on zero, otherwise extend the running sum
    xx[i + 1] <- if (x[i] == 0) 0 else xx[i] + x[i]
  }
  xx[-1]  # drop the leading helper element
}
skumulowana(x)
## [1] 1 2 3 4 5 0 1 0 0 0 1 2
Try this one-liner...
Reduce(function(x,y) (x+y)*(y!=0), x, accumulate=T)
split and lapply version:
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
unlist(lapply(split(x, cumsum(x==0)), cumsum))
step by step:
a <- split(x, cumsum(x==0)) # divides x into pieces where each 0 starts a new piece
b <- lapply(a, cumsum) # calculates cumsum in each piece
unlist(b) # rejoins the pieces
Result has useless names but is otherwise what you wanted:
# 01 02 03 04 05 11 12 2 3 41 42 43
# 1 2 3 4 5 0 1 0 0 0 1 2
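If the names are unwanted, one possible variant is ave() with the same grouping. It applies cumsum within each group and returns a plain vector in the original order. This is a sketch using the same assumption as above, namely that every 0 starts a new group.

```r
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
# every 0 starts a new group; cumsum restarts inside each group
xc <- ave(x, cumsum(x == 0), FUN = cumsum)
xc
# [1] 1 2 3 4 5 0 1 0 0 0 1 2
```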
Here is another base R solution using aggregate. The idea is to make a data frame with x and a new column named x.1 by which we can apply aggregate functions (cumsum in this case):
x <- c(1,1,1,1,1,0,1,0,0,0,1,1)
r <- rle(x)
df <- data.frame(x,
x.1=unlist(sapply(1:length(r$lengths), function(i) rep(i, r$lengths[i]))))
# df
#    x x.1
# 1  1   1
# 2  1   1
# 3  1   1
# 4  1   1
# 5  1   1
# 6  0   2
# 7  1   3
# 8  0   4
# 9  0   4
# 10 0   4
# 11 1   5
# 12 1   5
agg <- aggregate(df$x~df$x.1, df, cumsum)
as.vector(unlist(agg$`df$x`))
# [1] 1 2 3 4 5 0 1 0 0 0 1 2

Generate vectors using R

I would like to ask whether anyone knows a simple way to solve this kind of problem:
I need to generate all combinations of A numbers taken from the set {0, 1, 2, ..., B} whose sum is C.
ie if A=2, B=3, C=2:
Solution in this case:
(1,1);(0,2);(2,0)
So the vectors are length 2 (A), sum of all its items is 2 (C), possible values for each of vectors elements come from the set {0,1,2,3} (maximum is B).
A functional version since I already started before SO updated:
A=2
B=3
C=2
myfun <- function(a=A, b=B, c=C) {
out <- do.call(expand.grid, lapply(1:a, function(x) 0:b))
return(out[rowSums(out)==c,])
}
> myfun()
  Var1 Var2
3    2    0
6    1    1
9    0    2
z <- expand.grid(0:3,0:3)
z[rowSums(z)==2, ]
  Var1 Var2
3    2    0
6    1    1
9    0    2
If you wanted to do the expand grid programmatically this would work:
z <- expand.grid( rep( list(0:B), A) )
You need to expand as a list so that the items remain separate. rep(0:3, 3) would not return 3 separate sequences. So for A=3:
> z <- expand.grid(rep(list(0:3), 3))
> z[rowSums(z)==2, ]
   Var1 Var2 Var3
3     2    0    0
6     1    1    0
9     0    2    0
18    1    0    1
21    0    1    1
33    0    0    2
Using the nifty partitions package, and more interesting values of A, B, and C:
library(partitions)
A <- 2
B <- 5
C <- 7
comps <- t(compositions(C, A))
ii <- apply(comps, 1, FUN=function(X) all(X %in% 0:B))
comps[ii, ]
#      [,1] [,2]
# [1,]    5    2
# [2,]    4    3
# [3,]    3    4
# [4,]    2    5
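The expand.grid approaches build all (B+1)^A rows before filtering, which grows quickly with A. As a rough alternative, the vectors can be generated recursively in base R so that only valid rows are ever built. The function name bounded_compositions is hypothetical, and the sketch assumes a >= 1 and b >= 0.

```r
# Hypothetical helper: all length-a vectors with entries in 0:b that sum to `total`,
# returned as the rows of a matrix.
bounded_compositions <- function(a, b, total) {
  empty <- matrix(numeric(0), nrow = 0, ncol = a)
  if (total < 0 || total > a * b) return(empty)  # no solution possible
  if (a == 1) return(matrix(total, nrow = 1))
  rows <- lapply(0:min(b, total), function(first) {
    # fix the first entry, then solve the smaller problem for the rest
    rest <- bounded_compositions(a - 1, b, total - first)
    if (nrow(rest) == 0) return(NULL)
    cbind(first, rest, deparse.level = 0)
  })
  out <- do.call(rbind, rows)
  if (is.null(out)) empty else out
}

bounded_compositions(2, 3, 2)        # rows (0,2), (1,1), (2,0)
nrow(bounded_compositions(3, 3, 2))  # 6, as in the expand.grid output for A=3
```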
