I received the error
Error in if (x[i] == 0 && x[i - 1] > 0) { :
missing value where TRUE/FALSE needed
when running this function on a numeric vector
number_rn <- function(x) {
a <- 0
for (i in 1:length(x)) {
if (x[i] == 0 && x[i-1] > 0) {
a <- a +1
}
}
print(a)
}
However, the following function works fine:
number_rr <- function(x) {
a <- 0
for (i in 1:length(x)) {
if (x[i] > 0 && x[i-1] > 0) {
a <- a +1
}
}
print(a)
}
I note from previous answers to similar questions that this can occur if the if conditional does not have either a TRUE or FALSE result, but I do not believe this to be the case in my example. What could be causing this error?
There are several issues with the for loop (even if x does not contain any NA values):
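In the first iteration (i == 1), x[i-1] refers to x[0]. Because indexing in R starts at 1, x[0] is a zero-length vector, so x[0] > 0 is logical(0), which && cannot resolve to TRUE or FALSE. When the first half of the condition is TRUE for x[1], the whole condition is therefore missing and if fails; when it is FALSE, && short-circuits and no error appears, which is why one function can fail while the other seems to work on the same data.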
The code is using a for loop where vectorized functions can be used.
Unfortunately, starting the loop at i == 2, i.e., for (i in 2:length(x)), is not error-proof in case of a one element vector where length(x) == 1.
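To make the points above concrete, here is a minimal sketch. The name number_rn_loop and the sample vector are assumptions made for illustration, and the loop is only one possible way to keep a for loop while avoiding x[0] and NA conditions.

```r
x <- c(0, 1, 0, 3)

x[0]       # numeric(0): index 0 selects nothing
x[0] > 0   # logical(0): a comparison with nothing is also empty

# The original condition at i == 1 cannot be resolved to TRUE or FALSE,
# so if() stops; tryCatch() captures the message instead of aborting.
msg <- tryCatch(
  if (x[1] == 0 && x[0] > 0) "yes" else "no",
  error = function(e) conditionMessage(e)
)
msg

# Hypothetical loop variant: seq_along(x)[-1] is empty when length(x) < 2,
# and isTRUE() treats an NA comparison as FALSE.
number_rn_loop <- function(x) {
  a <- 0
  for (i in seq_along(x)[-1]) {
    if (isTRUE(x[i] == 0 && x[i - 1] > 0)) a <- a + 1
  }
  a
}

number_rn_loop(x)            # 1
number_rn_loop(c(1))         # 0
number_rn_loop(numeric(0))   # 0
```

Returning the count instead of printing it lets the caller store or reuse the result.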
My suggestion is to use the vectorized version
number_rn_vec <- function(x) {
n <- length(x)
sum(x[2:n] == 0 & x[1:(n - 1)] > 0, na.rm = TRUE)
}
This returns the count without error for many use cases:
sapply(
list(
c(),
c(1),
c(1, 0),
c(1, 0, 3),
c(0, 1, 0, 3),
c(NA, 1, 0, 3),
c(1, NA, 0, 3),
c(1, 0, NA, 3),
c(1, 0, 3, NA)
),
number_rn_vec
)
[1] 0 0 1 1 1 1 0 1 1
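One case not covered by the list above is an empty numeric vector such as numeric(0): as far as I can tell, 1:(n - 1) then becomes c(1, 0, -1), which is not a valid index for it. A possible variant, sketched here under the hypothetical name number_rn_ht, builds the two shifted vectors with head() and tail() instead:

```r
number_rn_ht <- function(x) {
  # tail(x, -1) drops the first element, head(x, -1) drops the last one;
  # both are empty when x has fewer than two elements
  sum(tail(x, -1) == 0 & head(x, -1) > 0, na.rm = TRUE)
}

number_rn_ht(numeric(0))        # 0
number_rn_ht(c(1))              # 0
number_rn_ht(c(0, 1, 0, 3))     # 1
number_rn_ht(c(1, NA, 0, 3))    # 0
```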
This is most likely occurring because your vector x is NULL or contains NA values. See what happens when I try to run an if condition on a NULL value:
x <- NULL
if (x == 0 && x > 5) print("yes")
Make sure to remove any NAs or NULLs using is.na() or is.null() and you should be fine
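A minimal sketch of that clean-up step, assuming the goal is simply to drop missing entries and to skip the check for an empty input. The variable names are illustrative, and note that dropping NAs changes which elements end up next to each other, which may or may not be what you want.

```r
x <- c(1, NA, 0, 3)
x_clean <- x[!is.na(x)]   # keeps only the non-missing values
x_clean                   # 1 0 3

# is.null() guards the case where the input is NULL (which has length 0)
y <- NULL
if (is.null(y) || length(y) == 0) {
  result <- "nothing to check"
} else {
  result <- "ok"
}
result                    # "nothing to check"
```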
I was hoping to create a function with the if statements in the following code:
data <- data.frame(
id = c(1, 5, 6, 11, 15, 21),
intervention = c(2, 2, 2, 1, 1, 1),
death = c(0, 1, 0, 1, 0, 0)
)
test <- c()
for (i in data$id[which(data$intervention == 1)]) {
print(paste0("id = ", i, ": "))
for (j in data$id[which(data$intervention == 2)]) {
if (data$death[data$id == i] < data$death[data$id == j]) {
test <- c(test, -1)
} else if (data$death[data$id == i] > data$death[data$id == j]) {
test <- c(test, 1)
} else if (data$death[data$id == i] == data$death[data$id == j]) {
test <- c(test, 0)
}
}
print(test)
test <- c()
}
I tried to do it as follows, but the code does not write the result to the vector. If I replace return with print, however, the result is printed. Would anyone have any suggestions on what I might be doing wrong? Many thanks!
first <- function () {
if(data$death[data$id == i]<data$death[data$id ==j]){
return (test <- c(test,-1))}
else if(data$death[data$id == i]>data$death[data$id == j]){
return (test <- c(test,1))}
else if(data$death[data$id == i]==data$death[data$id == j]){
return (test <- c(test,0))}
}
for (i in data$id[which(data$intervention == 1)]){
for (j in data$id[which(data$intervention == 2)]){
first()
}
test
}
The following function returns a list of the wanted vectors.
first <- function(data, interv1 = 1, interv2 = 2) {
inx <- which(data[["intervention"]] == interv1)
jnx <- which(data[["intervention"]] == interv2)
out <- lapply(inx, \(i) {
sapply(jnx, \(j) sign(data[["death"]][i] - data[["death"]][j]))
})
setNames(out, data[["id"]][inx])
}
first(data)
#> $`11`
#> [1] 1 0 1
#>
#> $`15`
#> [1] 0 -1 0
#>
#> $`21`
#> [1] 0 -1 0
Created on 2022-11-22 with reprex v2.0.2
You can then access the return values as
test <- first(data)
# any of the following extracts the 1st vector
test[['11']]
#> [1] 1 0 1
# notice the backticks
test$`11`
#> [1] 1 0 1
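As for why the attempt in the question never updates test: an assignment with <- inside a function only changes a local copy. The sketch below illustrates this; no_effect and append_sign are hypothetical helpers, not part of the original code.

```r
# Assigning inside a function only changes a local copy of `test`.
test <- c()
no_effect <- function() {
  test <- c(test, -1)   # local variable, discarded when the function returns
  invisible(NULL)
}
no_effect()
n_after_call <- length(test)
n_after_call   # 0: the global vector is untouched

# Hypothetical helper: return the new value and let the caller assign it.
append_sign <- function(test, a, b) c(test, sign(a - b))
test <- append_sign(test, 1, 0)
test <- append_sign(test, 0, 1)
test           # 1 -1
```

Passing the vector in and assigning the returned value keeps the function free of hidden dependencies on global variables such as i, j and test.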
Can someone tell me what is wrong with this function in R? The function works on a single input, but when I use a vector I get a warning and only one result:
input_check3 <- function(x){
if (is.finite(x)) {
if (x %% 2 == 0){
print(TRUE)
} else {
print(FALSE)
}
} else {
NA
}
}
data_for_e2 <- c(1, 2, 4, 5, 3)
input_check3(data_for_e2)
#> [1] FALSE
#> Warning messages:
#> 1: In if (is.finite(x)) { : The length of the condition is greater than one, so only its first element can be used
#> 2: In if (x%%2 == 0) { : The length of the condition is greater than one, so only its first element can be used
You could use ifelse, which is a vectorized function:
input_check3 <- function(x){
ifelse(is.finite(x),
x %% 2 == 0, # equiv to ifelse(x %% 2 == 0, TRUE, FALSE), thanks Martin Gal!
NA)
}
Result
[1] FALSE TRUE TRUE FALSE FALSE
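To see the NA branch in action, here is a small check with non-finite inputs. The input vector is made up for illustration; the function body is the same ifelse call as above. Keep in mind that from R 4.2.0 onward, an if condition longer than one element is an error rather than a warning, so the original version stops outright on a vector.

```r
input_check3 <- function(x) {
  # is.finite() is FALSE for NA, NaN, Inf and -Inf, so those map to NA
  ifelse(is.finite(x), x %% 2 == 0, NA)
}

res <- input_check3(c(1, 2, NA, Inf, 4))
res   # FALSE TRUE NA NA TRUE
```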
I'm fairly new to R and am having trouble implementing something that should be very basic. Can someone point me in the right direction?
I need to apply a logical calculation based on the values of two vectors and return the value of that function in a third vector.
I want to do this in a user defined function so I can easily apply this in several other areas of the algorithm and make modifications to the implementation with ease.
Here's what I have tried, but I cannot get this implementation to work. I believe it is because I cannot send vectors as parameters to this function.
calcSignal <- function(fVector, sVector) {
if(!is.numeric(fVector) || !is.numeric(sVector)) {
0
}
else if (fVector > sVector) {
1
}
else if (fVector < sVector) {
-1
}
else {
0 # is equal case
}
}
# set up data frame
df <- data.frame(x=c("NA", 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, "NA"))
# call function
df$z <- calcSignal(df$x, df$y)
I want the output to be a vector with the following values, but I am not implementing the function correctly.
[0, 1, 1, -1, 0, 0]
Can someone help explain how to implement this function to properly perform the logic outlined?
I appreciate your assistance!
There are some misunderstandings in your code:
In R, "NA" in quotes is a character value (strings are called character in R). The correct form is NA without quotes.
Before R 4.0.0, data.frame automatically converted character columns to factors; this can be disabled with data.frame(..., stringsAsFactors = FALSE), which is the default from R 4.0.0 onward.
Each column of a data.frame has a type, not each element. When a column contains numbers and NA, its class is numeric and is.numeric returns TRUE for the whole column, NA elements included. Use is.na to test individual elements.
|| works on single values only: older versions of R silently used just the first element of each vector, and R 4.3.0 and later raise an error for longer inputs. | does the elementwise comparison.
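The points above can be checked directly in the console. This is just an illustrative sketch with made-up vectors:

```r
class(c("NA", 2, 9))   # "character": the quoted "NA" turns everything into strings
class(c(NA, 2, 9))     # "numeric": a real missing value keeps the vector numeric

x <- c(NA, 2, 9)
is.numeric(x)          # TRUE: one answer for the whole vector
is.na(x)               # TRUE FALSE FALSE: one answer per element

# | works element by element; || is meant for single TRUE/FALSE values
either_na <- is.na(c(NA, 2)) | is.na(c(4, NA))
either_na              # TRUE TRUE
```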
Now let's implement what you wanted:
Implementation 1:
#set up data frame
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
calcSignal <- function(f,s){
if(is.na(f) | is.na(s))
return(0)
else if(f>s)
return(1)
else if(f<s)
return(-1)
else
return(0)
}
df$z = mapply(calcSignal, df$x, df$y, SIMPLIFY = T)
To run a function on two or more vectors element-wise, we can use mapply.
Implementation 2
Not much different from the previous one, but here the function is easier to use.
#set up data frame
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
calcSignal <- function(fVector, sVector) {
res = mapply(function(f,s){
if(is.na(f) | is.na(s))
return(0)
else if(f>s)
return(1)
else if(f<s)
return(-1)
else
return(0)
},fVector,sVector,SIMPLIFY = T)
return(res)
}
df$z = calcSignal(df$x,df$y)
Implementation 3 (Vectorized)
This one is much better because it is vectorized and therefore much faster:
calcSignal <- function(fVector, sVector) {
res = rep(0,length(fVector))
res[fVector>sVector] = 1
res[fVector<sVector] = -1
#This line isn't necessary.It's just for clarification
res[(is.na(fVector) | is.na(sVector))] = 0
return(res)
}
df$z = calcSignal(df$x,df$y)
Output:
> df
   x  y  z
1 NA  4  0
2  2  1  1
3  9  5  1
4  7  9 -1
5  0  0  0
6  5 NA  0
No need for loopage as ?sign has your back:
# fixing the "NA" issue:
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
s <- sign(df$x - df$y)
s[is.na(s)] <- 0
s
#[1] 0 1 1 -1 0 0
ifelse is another handy function. Less elegant here than sign though
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
cs <- function(x, y){
a <- x > y
b <- x < y
out <- ifelse(a, 1, ifelse(b, -1, 0))
ifelse(is.na(out), 0, out)
}
cs(df$x, df$y)
#[1] 0 1 1 -1 0 0
For time series analysis I handle data that often contains leading and trailing zero elements. In this example, there are 3 zeros at the beginning and 2 at the end. I want to get rid of these elements and keep the contents in the middle (which may also contain zeros).
vec <- c(0, 0, 0, 1, 2, 0, 3, 4, 0, 0)
I did this by looping from the beginning and end, and masking out the unwanted elements.
mask <- rep(TRUE, length(vec))
# from begin
i <- 1
while(vec[i] == 0 && i <= length(vec)) {
mask[i] <- FALSE
i <- i+1
}
# from end
i <- length(vec)
while(i >= 1 && vec[i] == 0) {
mask[i] <- FALSE
i <- i-1
}
cleanvec <- vec[mask]
cleanvec
[1] 1 2 0 3 4
This works, but I wonder if there is a more efficient way to do this, avoiding the loops.
vec[ min(which(vec != 0)) : max(which(vec != 0)) ]
Basically the which(vec != 0) part gives the positions of the numbers that are different from 0, and then you take the min and max of them.
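One thing to watch: if vec contains no non-zero element, which() returns an empty result and min()/max() no longer give usable positions. A guarded version might look like the sketch below; the function name trim_zeros is an assumption for illustration.

```r
trim_zeros <- function(vec) {
  nz <- which(vec != 0)                  # positions of the non-zero elements
  if (length(nz) == 0) return(vec[0])    # nothing but zeros: return an empty vector
  vec[min(nz):max(nz)]                   # keep everything between first and last non-zero
}

trim_zeros(c(0, 0, 0, 1, 2, 0, 3, 4, 0, 0))   # 1 2 0 3 4
trim_zeros(c(0, 0))                            # numeric(0)
```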
We could use the range and Reduce to get the sequence
vec[Reduce(`:`, range(which(vec != 0)))]
#[1] 1 2 0 3 4
Take the cumsum of abs(vec) forward and backward and keep only the elements where both are > 0. If it is known that all elements of vec are non-negative, as in the question, then abs can be omitted.
vec[cumsum(abs(vec)) > 0 & rev(cumsum(rev(abs(vec)))) > 0]
## [1] 1 2 0 3 4
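Spelling out the intermediate vectors may make the trick easier to follow. The names fwd, bwd and keep are only for illustration:

```r
vec <- c(0, 0, 0, 1, 2, 0, 3, 4, 0, 0)

fwd <- cumsum(abs(vec))             # 0 0 0 1 3 3 6 10 10 10  (still 0 during the leading zeros)
bwd <- rev(cumsum(rev(abs(vec))))   # 10 10 10 10 9 7 7 4 0 0 (already 0 in the trailing zeros)

keep <- fwd > 0 & bwd > 0           # TRUE only between the first and last non-zero element
vec[keep]                           # 1 2 0 3 4
```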
I have a vector like this:
x <- c(0, 0, 0, 0, 4, 5, 0, 0, 3, 2, 7, 0, 0, 0)
I want to keep only the elements from position 5 to 11. I want to delete the zeroes in the start and end. For this vector it is quite easy since it is small.
I have very large data and need something in general for all vectors.
Try this:
x[ min( which ( x != 0 )) : max( which( x != 0 )) ]
Find the indices of all values that are not zero, then take the first (min) and the last (max) of them to subset x.
You can try something like:
x=c(0,0,0,0,4,5,0,0,3,2,7,0,0,0)
rl <- rle(x)
if(rl$values[1] == 0)
x <- tail(x, -rl$lengths[1])
if(tail(rl$values,1) == 0)
x <- head(x, -tail(rl$lengths,1))
x
## 4 5 0 0 3 2 7
Hope it helps,
alex
This would also work :
x[cumsum(x) & rev(cumsum(rev(x)))]
# [1] 4 5 0 0 3 2 7
I would probably define two functions, and compose them:
trim_leading <- function(x, value=0) {
w <- which.max(cummax(x != value))
x[seq.int(w, length(x))]
}
trim_trailing <- function(x, value=0) {
w <- which.max(cumsum(x != value))
x[seq.int(w)]
}
And then pipe your data through (%>% comes from the magrittr package):
library(magrittr)
x %>% trim_leading %>% trim_trailing
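For reference, the same composition can be written without any pipe by nesting the calls. The sketch repeats the two functions so that it runs on its own; the sample vector is the one from the question.

```r
trim_leading <- function(x, value = 0) {
  # cummax() switches to 1 at the first element that differs from `value`
  w <- which.max(cummax(x != value))
  x[seq.int(w, length(x))]
}
trim_trailing <- function(x, value = 0) {
  # cumsum() reaches its maximum at the last element that differs from `value`
  w <- which.max(cumsum(x != value))
  x[seq.int(w)]
}

x <- c(0, 0, 0, 0, 4, 5, 0, 0, 3, 2, 7, 0, 0, 0)
trimmed <- trim_trailing(trim_leading(x))
trimmed   # 4 5 0 0 3 2 7
```

One caveat: because which.max() returns 1 when nothing differs from value, a vector made up entirely of value is not reduced to an empty vector.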