I have two timeseries vectors as follows -
a <- c(1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0)
b <- c(1, 0, 1, 0)
I want to look back 7 days and replace only 1's in vectors a and b with 2. It is important to check if there were any values 7 days before replacing.
The expected result is -
a = c(1, 0, 0, 0, 1, 0, 2, 1, 1, 0, 2, 0)
b = c(1, 0, 1, 0) - Since no value existed 7 days ago, nothing changes here.
Thanks!
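We can build the condition with lag() from dplyr: wherever the value six positions earlier is 1, the current value is replaced with 2.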
library(dplyr)
f1 <- function(vec) replace(vec, lag(vec, 6) == 1, 2)
-output
f1(a)
#[1] 1 0 0 0 1 0 2 1 1 0 2 0
f1(b)
#[1] 1 0 1 0
A base R option using a user-defined function f:
f <- function(v) replace(v, (ind <- which(v == 1) + 6)[ind <= length(v)], 2)
such that
> f(a)
[1] 1 0 0 0 1 0 2 1 1 0 2 0
> f(b)
[1] 1 0 1 0
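Both answers hard-code the offset of 6 and the values 1 and 2. As a rough sketch only, the same idea can be parameterised in base R. The function name shift_flag and its arguments k, from and to are made up for illustration and do not come from either answer.

```r
# Hypothetical helper: wherever the value k positions earlier equals `from`,
# overwrite the current position with `to`.
shift_flag <- function(v, k = 6, from = 1, to = 2) {
  n <- length(v)
  # Too short to have any value k positions back: nothing to change
  if (n <= k) return(v)
  # Positions of `from` among the first n - k elements, shifted forward by k
  hit <- which(v[seq_len(n - k)] == from) + k
  replace(v, hit, to)
}

a <- c(1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0)
b <- c(1, 0, 1, 0)
shift_flag(a)
shift_flag(b)
```

With the defaults this should reproduce the expected vectors from the question, and the early return leaves short vectors such as b untouched.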
I'm trying these two methods to solve a problem, but they don't work.
Column 3 should continue to be "1" according to the final condition after column 1 changes to 0 from 1.
Method 1:
a <- as.data.frame(c(0,0,0,1,1,1,1,1,0,0,0,0))
b <- as.data.frame(c(0,0,0,0,0,0,0,0,0,0,0,0))
df <- cbind(a,b)
df[1,3] <- 0
df[-1,3] <- ifelse(df[-1,1] == 1 & df[-1,2] == 0, 1, ifelse(df[-1,1] == 1 &
df[-1,2] == 1, 0, df[sum(!is.na(df[,3])),3]))
Method 2:
a <- as.data.frame(c(0,0,0,1,1,1,1,1,0,0,0,0))
b <- as.data.frame(c(0,0,0,0,0,0,0,0,0,0,0,0))
df <- cbind(a,b)
df[1,3] <- 0
ndates <- as.numeric(length(df[,1]))
x <- 1
while (ndates > x - 1){
df[-1,3] <- ifelse(df[-1,1] == 1 & df[-1,2] == 0, 1, ifelse(df[-1,1] == 1
& df[-1,2] == 1, 0, df[sum(!is.na(df[,3])),3]))
x <- x + 1
}
Any help would be appreciated... seems like I'm missing something that is probably quite basic.
Updated answer:
Okay with a better understanding of what you're trying to accomplish here's a for loop version that should be right. Let me know if I'm still missing what your intention is.
a <- as.data.frame(c(0,0,0,1,1,1,1,1,0,0,0,0))
b <- as.data.frame(c(0,0,0,0,0,0,0,0,0,0,0,0))
df <- cbind(a,b)
df[1,3] <- 0
for (i in 2:nrow(df)) {
df[i,3] <- ifelse(df[i,1] == 1 & df[i,2] == 0, 1,
ifelse(df[i,1] == 1 & df[i,2] == 1, 0, df[i-1,3]))
}
This code loops once through each row and updates based on the rules you gave ([1,0] = 1, [1,1] = 0, otherwise previous row). And this is the resulting output:
> df
c(0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0) c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) V3
1 0 0 0
2 0 0 0
3 0 0 0
4 1 0 1
5 1 0 1
6 1 0 1
7 1 0 1
8 1 0 1
9 0 0 1
10 0 0 1
11 0 0 1
12 0 0 1
>
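If the loop turns out to be slow on long data, one possible vectorised variant is sketched below. It assumes the same rules as the loop ([1,0] gives 1, [1,1] gives 0, otherwise carry the previous row, with row 1 fixed at 0). The names c1, c2, new, last_set and c3 are illustrative and not part of the original code.

```r
c1 <- c(0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0)
c2 <- rep(0, 12)

# Rows where the rules set a value directly; NA marks "carry the previous row"
new <- ifelse(c1 == 1 & c2 == 0, 1,
       ifelse(c1 == 1 & c2 == 1, 0, NA))
new[1] <- 0  # the first row is fixed at 0, as in the loop version

# For every row, the index of the most recent row with an explicit value
last_set <- cummax(ifelse(is.na(new), 0L, seq_along(new)))
c3 <- new[last_set]
c3
```

The cummax() step is a common last-observation-carried-forward trick: each NA row picks up the index of the latest non-NA row above it.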
Initial answer:
It might be helpful if you could clarify what you're trying to accomplish. I worked through your first method and it seems to give the expected results. This is my understanding in pseudocode:
if C1,C2 == [1,0]:
    set C3 to 1
else:
    if C1,C2 == [1,1]:
        set C3 to 0
    else:
        set C3 to the val in col=3,row=number of non NA vals in col3
Since column 3 has only one non-NA value, the final expression evaluates to cell [1,3], which is 0, so the code sets all the [1,0] cases to 1 and everything else to 0.
What exactly are you trying to accomplish with df[sum(!is.na(df[,3])),3]? This might be a logic error, but it's hard to tell without understanding what you're hoping the outcome will be.
I have a matrix (A) like this (the row and column names are identification codes (IDs)):
1 3 10 38 46
1 0 0.4 0 0 0
3 0 0 0 0 0
10 0 0 0.9 0.8 0
38 0 0 0 0 0
46 0 0.1 0 0 0
And another matrix (B) like this:
a b c
1 2.676651e-04 4.404911e-06 9.604227e-06
3 6.073389e-10 3.273222e-05 3.360321e-04
10 4.156392e-08 1.269607e-06 7.509217e-06
38 4.200699e-08 3.227431e-02 8.286920e-11
46 9.352353e-05 3.318948e-20 8.694981e-06
I would like to get the indices of the elements of matrix A that are > 0, so I used this command:
temp <- apply(A,1, FUN=function(x) which(x>0))
It returned a list with the correct indices of the elements > 0.
After that I would like to multiply the elements of matrix B selected by those indices. In particular, I would like to do something like this for each row:
1: 6.073389e-10*3.273222e-05*3.360321e-04
Here I used the information in matrix A (the second column of the first row holds a value > 0) as the index of the row of B whose elements are multiplied.
For the second row, I should obtain 0 because no element of A[2,] is > 0.
For the third row, I would like the same kind of result as for the first row, but summing the two products:
10: 4.156392e-08*1.269607e-06*7.509217e-06 +4.200699e-08*3.227431e-02*8.286920e-11
I tried to unlist the list, but that gives a vector that loses the correspondence with the IDs.
A <-
matrix(
c(0, 0.4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.9, 0.8, 0, 0, 0, 0, 0, 0, 0, 0.1, 0, 0, 0),
nrow = 5,
ncol = 5,
byrow = TRUE
)
B <-
matrix(
c(
2.676651e-04, 4.404911e-06, 9.604227e-06,
6.073389e-10, 3.273222e-05, 3.360321e-04,
4.156392e-08, 1.269607e-06, 7.509217e-06,
4.200699e-08, 3.227431e-02, 8.286920e-11,
9.352353e-05, 3.318948e-20, 8.694981e-06
),
nrow = 5,
ncol = 3,
byrow = TRUE
)
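# Row/column positions of every element of A that is > 0
idx <- which(A > 0, arr.ind = TRUE)
result <- 0
for (i in seq_len(nrow(idx))) {
# Print the selected element of A and the matching row of B
cat(A[idx[i,1],idx[i,2]], sep="\n")
cat(B[idx[i,2], ], sep="\n")
# Accumulate the A element times each entry of that B row
result <- result + sum(A[idx[i,1],idx[i,2]] * B[idx[i,2],])
}
cat("result=")
cat(result)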
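The loop above accumulates a single total. The question's worked examples instead describe one value per row of A: the product of the elements of each selected row of B, summed over the selected rows. A minimal sketch of that reading follows. It assumes the IDs are stored as dimnames, and the names ids, row_prod and per_row are illustrative.

```r
ids <- c("1", "3", "10", "38", "46")
A <- matrix(
  c(0, 0.4, 0,   0,   0,
    0, 0,   0,   0,   0,
    0, 0,   0.9, 0.8, 0,
    0, 0,   0,   0,   0,
    0, 0.1, 0,   0,   0),
  nrow = 5, byrow = TRUE, dimnames = list(ids, ids)
)
B <- matrix(
  c(2.676651e-04, 4.404911e-06, 9.604227e-06,
    6.073389e-10, 3.273222e-05, 3.360321e-04,
    4.156392e-08, 1.269607e-06, 7.509217e-06,
    4.200699e-08, 3.227431e-02, 8.286920e-11,
    9.352353e-05, 3.318948e-20, 8.694981e-06),
  nrow = 5, byrow = TRUE, dimnames = list(ids, c("a", "b", "c"))
)

# Product of the three entries in each row of B, named by ID
row_prod <- apply(B, 1, prod)

# For each row of A, sum the row products of B picked out by A > 0;
# rows with no positive entry give sum(numeric(0)), i.e. 0
per_row <- vapply(ids, function(id) sum(row_prod[A[id, ] > 0]), numeric(1))
per_row
```

Because vapply() iterates over the ID strings, the result is a named vector, so the link between each value and its ID is kept.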
In R I have a vector:
x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0)
I want to remove only one "0" (the first occurrence) from x, but my attempt removes every 0 in the vector.
Example
x <- x[!x %in% 0]
After this, all the zeros have been removed from x.
For example, in Python:
x = [0,1,0,1,0,0,0,1]
x.remove(0)
x
[1, 0, 1, 0, 0, 0, 1]
x.remove(0)
x
[1, 1, 0, 0, 0, 1]
We can use match to remove the first occurrence of a particular number
x <- c(1, 0, 1, 0, 0, 0, 1)
x[-match(1, x)]
#[1] 0 1 0 0 0 1
If you need to remove a different number from the vector, for example 5 in the case below,
x <- c(1, 0, 5, 5, 0, 0, 1)
x[-match(5, x)]
#[1] 1 0 5 0 0 1
You may need which.min(),
which determines the index of the first minimum of a vector:
x <- c(0,1,0,1,0,0,0,1)
x <- x[-which.min(x)]
x
# [1] 1 0 1 0 0 0 1
If your vector contains elements other than 0 or 1: x <- x[-which.min(x != 0)]
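One caveat with both snippets: if the value is absent, match() returns NA and which.min(x != 0) returns 1, so the result is either NA or a vector missing its first element. A small guarded wrapper is sketched below. The name remove_first is made up here, and the behaviour is intended to mirror Python's list.remove() except that it returns the vector unchanged instead of raising an error.

```r
# Hypothetical helper: drop the first occurrence of `value` from `x`,
# returning `x` unchanged when `value` does not occur.
remove_first <- function(x, value) {
  i <- match(value, x)  # index of the first match, or NA if none
  if (is.na(i)) x else x[-i]
}

x <- c(0, 1, 0, 1, 0, 0, 0, 1)
x <- remove_first(x, 0)
x
x <- remove_first(x, 0)
x
```

Calling it twice follows the two x.remove(0) calls in the Python example from the question.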
I have a sequence of 0s and 1s in this manner:
xx <- c(0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1)
And I want to select the 0s and the first 1s.
The results should be:
ans <- c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1)
What's the fastest way to do this in R?
Use rle() to extract the run lengths and values, do some minor surgery, and then put the run-length encoded vector "back together" using inverse.rle().
rr <- rle(xx)
rr$lengths[rr$values==1] <- 1
inverse.rle(rr)
# [1] 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1
Here's one way:
idx <- which(xx == 1)
pos <- which(diff(c(xx[1], idx)) == 1)
xx[-idx[pos]] # following Frank's suggestion
# [1] 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1
Without rle:
xx[head(c(TRUE, (xx != 1)), -1) | (xx != 1)]
#[1] 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1
Since OP mentioned speed, here's a benchmark:
josh = function(xx) {
rr <- rle(xx)
rr$lengths[rr$values==1] <- 1
inverse.rle(rr)
}
arun = function(xx) {
idx <- which(xx == 1)
pos <- which(diff(c(xx[1], idx)) == 1)
xx[setdiff(seq_along(xx), idx[pos])]
}
eddi = function(xx) {
xx[head(c(TRUE, (xx != 1)), -1) | (xx != 1)]
}
simon = function(xx) {
# The body of the function is supplied in @SimonO101's answer
first1(xx)
}
set.seed(1)
N = 1e6
xx = sample(c(0,1), N, T)
library(microbenchmark)
bm <- microbenchmark(josh(xx), arun(xx), eddi(xx), simon(xx) , times = 25)
print( bm , digits = 2 , order = "median" )
#Unit: milliseconds
# expr min lq median uq max neval
# simon(xx) 20 21 23 26 72 25
# eddi(xx) 97 102 104 118 149 25
# arun(xx) 205 245 253 258 332 25
# josh(xx) 228 268 275 287 365 25
Here's a quick Rcpp solution. Should be fastish (but I've no idea how it will stack up against the others here)...
Rcpp::cppFunction( 'std::vector<int> first1( IntegerVector x ){
std::vector<int> out;
for( IntegerVector::iterator it = x.begin(); it != x.end(); ++it ){
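// Keep every 0, and a 1 only when it starts a run (the first element has no predecessor)
if( *it == 0 || ( *it == 1 && ( it == x.begin() || *(it-1) != 1 ) ) )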
out.push_back(*it);
}
return out;
}')
first1(xx)
# [1] 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1
Even tho' I'm a staunch supporter of rle , since it's Friday here's an alternative method. I did it for fun, so YMMV.
yy<-paste(xx,collapse='')
zz<-gsub('[1]{1,}','1',yy) #I probably screwed up the regex here
aa<- as.numeric(strsplit(zz,'')[[1]])
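As a quick sanity check, the pure-R approaches above can be compared on the example vector from the question. This is only a sketch: the wrapper names via_rle, via_index and via_regex are illustrative, and the regex is written as '1+', which should match the same runs as '[1]{1,}'.

```r
xx <- c(0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
        0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1)
ans <- c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1)

# Run-length approach: shrink every run of 1s to length 1
via_rle <- function(xx) {
  rr <- rle(xx)
  rr$lengths[rr$values == 1] <- 1
  inverse.rle(rr)
}

# Logical-index approach: keep an element if it is not 1,
# or if the element before it is not 1
via_index <- function(xx) {
  xx[head(c(TRUE, xx != 1), -1) | (xx != 1)]
}

# String approach: collapse each run of "1" characters into a single "1"
via_regex <- function(xx) {
  as.numeric(strsplit(gsub("1+", "1", paste(xx, collapse = "")), "")[[1]])
}

identical(via_rle(xx), ans)
identical(via_index(xx), ans)
identical(via_regex(xx), ans)
```

The string version only makes sense for single-digit values, since it splits the collapsed string character by character.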