Here is my vector:
x <- c("a", "b", "c")
I want to extract only the odd-positioned elements of the vector, so I wrote something like this:
ifelse(length(x) > 0, x[seq(from = 1, to = length(x), by = 2)], NA)
But the result returned is just "a". However, if I check the condition and run the TRUE branch separately, I get different results.
length(x) > 0 #TRUE
x[seq(from = 1, to = length(x), by = 2)] # "a" "c"
Does anyone know why? Thank you!
According to ?ifelse
A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no
where 'test', 'yes', 'no' are arguments
ifelse(test, yes, no)
In this example the lengths of 'test', 'yes' and 'no' do not match: 'test' is a logical vector of length 1, 'yes' has length 2 and 'no' has length 1. Because the result always has the same length as 'test', ifelse returns a single value. Here 'test' is TRUE, so that value is the first element of 'yes'. Instead, we can use if/else:
if(length(x) > 0) x[seq(from = 1, to = length(x), by = 2)] else NA
#[1] "a" "c"
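To make the length rule concrete, here is a minimal sketch using the vector from the question. The logical mask built with seq_along() is an addition that is not part of the answer above; it is shown only as one possible way to keep the indexing working when x is empty, where seq(from = 1, to = 0, by = 2) would fail.

```r
x <- c("a", "b", "c")
odd <- x[seq(from = 1, to = length(x), by = 2)]

# ifelse() returns a result as long as its test, which has length 1 here
res_ifelse <- ifelse(length(x) > 0, odd, NA)
print(res_ifelse)

# if/else returns the whole selected branch
res_if <- if (length(x) > 0) odd else NA
print(res_if)

# Alternative index (an assumption, not from the answer): a logical mask
# built with seq_along() also works for a zero-length vector
odd_mask <- x[seq_along(x) %% 2 == 1]
empty <- character(0)
empty_mask <- empty[seq_along(empty) %% 2 == 1]
print(odd_mask)
```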
Suppose x is a real number or a vector, and i is FALSE. Then x[i] returns numeric(0). I would like to treat this as the real number 0 or the integer 0, either of which works for addition.
Adding numeric(0) to any real number returns numeric(0), whereas I want the result to be the real number being added. How can I convert the numeric(0) value? Thanks in advance!
It is only a problem when we use +. It can be avoided if we use sum:
sum(numeric(0), 5)
#[1] 5
sum(numeric(0), 5, 10)
#[1] 15
Or, if we need to use +, an easy option is to concatenate with 0 and select the first element. If the element is numeric(0), it gets replaced by 0; in all other cases, the first element remains intact.
c(numeric(0), 0)[1]
#[1] 0
Using a small example
lst1 <- list(1, 3, numeric(0), 4, numeric(0))
out <- 0
for(i in seq_along(lst1)) {
out <- out + c(lst1[[i]], 0)[1]
}
out
#[1] 8
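You can use max/min with 0 to get 0 as output when the input is numeric(0). Note that this also affects non-empty input: max(x, 0) turns negative values into 0, and min(x, 0) turns positive values into 0.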
x <- 1:10
max(x[FALSE], 0)
#[1] 0
min(x[FALSE], 0)
#[1] 0
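If the goal is only to replace an empty result and leave everything else untouched, an explicit length check is another option. The helper name zero_if_empty below is hypothetical and does not come from any of the answers; this is a sketch, not a definitive implementation.

```r
# Hypothetical helper: return 0 for a zero-length vector,
# otherwise return the input unchanged
zero_if_empty <- function(v) if (length(v) == 0) 0 else v

x <- 1:10
total <- 5 + zero_if_empty(x[FALSE])   # x[FALSE] is integer(0)
print(total)

# Non-empty input, including negative values, passes through as-is
kept <- zero_if_empty(-3)
print(kept)
```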
I'm trying to identify mismatched values based on one element value before or after the focal value in a vector. Any thought about how to do it?
Let's say I have a vector: x<-c(1,1,2,1,3,3). For each element[i], I want to check whether it matches the element before or after it (element[i-1] or element[i+1]). If there is a match, element[i] should become "yes"; otherwise it should become "no".
The expected output for x<-c(1,1,2,1,3,3) should be c("yes","yes","no","no","yes","yes").
Use rle() to identify runs of equal values. A run with lengths == 1 means there are no equal values directly before or after the current one.
with(rle(x), rep(ifelse(lengths == 1, "no", "yes"), lengths))
# [1] "yes" "yes" "no" "no" "yes" "yes"
Edit: a more concise version that returns TRUE/FALSE instead of "yes"/"no" (thanks to @dww's comment)
with(rle(x), rep(lengths != 1, lengths))
# [1] TRUE TRUE FALSE FALSE TRUE TRUE
A one liner for this is to use diff
c(diff(x) == 0, F) | c(F, diff(x) == 0)
[1] TRUE TRUE FALSE FALSE TRUE TRUE
c(diff(x) == 0, F) will be true for each index with element[i] == element[i+1] (not applicable for last element) and c(F, diff(x) == 0) will be true for each index with element[i] == element[i-1] (not applicable for first element)
Here is one base R approach. We can generate shifted vectors, either one position to the left or one position to the right, from your original input vector. Then we can check whether each position in the original vector matches the same position in either of the shifted vectors. To give a visual:
x: [ 1, 1, 2, 1, 3, 3]
------------------------
x1: [NA, 1, 1, 2, 1, 3]
x2: [ 1, 2, 1, 3, 3, NA]
We can see the result you expect by inspection. Here is a code snippet implementing this:
x <- as.character(c(1,1,2,1,NA,NA))
x1 <- c('NA', x[-length(x)])  # shifted right: x1[i] is the previous element
x2 <- c(x[-1], 'NA')          # shifted left: x2[i] is the next element
result <- (x==x1 | is.na(x) & is.na(x1) | x==x2 | is.na(x) & is.na(x2))
output <- ifelse(is.na(result) | !result, "no", "yes")
output
[1] "yes" "yes" "no" "no" "yes" "yes"
Note that I deliberately converted your numerical vector into a character vector, so that I may use 'NA', a string literal, as a placeholder for a missing value. If we used the above logic with a numeric vector, NA could collide with actual missing values.
Here's one way to do it (using TRUE and FALSE in place of "yes" and "no"). Explanation in comments.
pre_or_post_matches <- function(vec){
# get length of `vec`, create empty return vector `out` that we fill
len <- length(vec)
out <- rep(NA, len)
# first element: just check if it equals the second
out[1] <- vec[1]==vec[2]
# last element: just check if it equals the second to last
out[len] <- vec[len]==vec[len-1]
# the other elements: check if equal to at least one neighbor
for (idx in 2:(len-1)){
out[idx] <- (vec[idx]==vec[idx-1]) | (vec[idx]==vec[idx+1])
}
return(out)
}
# apply func to example data provided by OP
x <- c(1, 1, 2, 1, 3, 3)
pre_or_post_matches(x)
## [1] TRUE TRUE FALSE FALSE TRUE TRUE
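The loop above assumes at least three elements, since 2:(len-1) counts backwards for shorter vectors. As a hedged sketch, the same neighbour check can be written without a loop and with an explicit guard for short input. The function name neighbor_match is hypothetical, and the input is assumed to contain no NA values.

```r
# Hypothetical vectorized variant; assumes `vec` has no NA values
neighbor_match <- function(vec) {
  n <- length(vec)
  # with fewer than two elements there is no neighbor to match
  if (n < 2) return(rep(FALSE, n))
  # same_as_next[i] is TRUE when vec[i] == vec[i + 1]
  same_as_next <- vec[-n] == vec[-1]
  # an element matches if it equals its next or its previous neighbor
  c(same_as_next, FALSE) | c(FALSE, same_as_next)
}

x <- c(1, 1, 2, 1, 3, 3)
matches <- neighbor_match(x)
labels <- ifelse(matches, "yes", "no")
print(labels)
```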
I have a data frame consisting of identical strings, but the identical() function is returning false when I compare them?
Example:
df <- data.frame("x" = rep("a", times = 10),
"y" = rep("a", times = 10))
checkEquality <- function(x) {
y = x[1]
z = x[2]
return(identical(y, z))
}
apply(df[1:2], 1, checkEquality)
This code returns a vector of FALSE when it should return a vector of TRUE. I have no idea what's going on here. Any help appreciated.
It's because they're not totally identical. Your function takes the data frame row by row and then compares the first two columns. Since you use the single bracket operator [], you keep the column and row names:
x = df[1,]
x[1]
x
1 a
x[2]
y
1 a
While the value is the same, the column names are different so the two vectors are not identical.
If you use the double bracket notation [[]], then it will extract just that one element, dropping the row and column names and it should work:
checkEquality <- function(x) {
y = x[[1]]
z = x[[2]]
return(identical(y, z))
}
apply(df, 1, checkEquality)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
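Inside apply(), the data frame is first converted to a matrix, so each row reaches the function as a named character vector rather than a one-row data frame. The sketch below reproduces that by hand to show where the names come from. The shortened three-row data frame and the use of unname() are illustrative additions, not part of the answer above.

```r
df <- data.frame(x = rep("a", times = 3),
                 y = rep("a", times = 3))

# Roughly what apply(df, 1, f) hands to f for the first row
row1 <- as.matrix(df)[1, ]
print(names(row1))

# Single brackets keep the element names, so the pieces differ
single <- identical(row1[1], row1[2])

# Double brackets, or unname(), drop the names before comparing
double <- identical(row1[[1]], row1[[2]])
stripped <- identical(unname(row1[1]), unname(row1[2]))
print(c(single, double, stripped))
```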
I haven't used identical() before, but have you tried ifelse()?
ifelse(col1==col2, 'TRUE', 'FALSE')
I'm using R and need to write code that checks whether a matrix has dimensions 3x3. I am stuck on the proper code using if() to check nrow and ncol. Any help or suggestions greatly appreciated.
localSmoother <- function(myMatrix = matrix(),
smoothingMatrix = matrix(1, nrow = 3, ncol = 3))
{
if(class(smoothingMatrix) != "matrix") stop ("smoothingMatrix must be a matrix.")
if(ncol(smoothingMatrix) != "3" & if(nrow(smoothingMatrix) != "3")
stop ("smoothingMatrix must have dimensions of 3x3")
print(myMatrix) # see what myMatrix is.
print(smoothingMatrix) # see what smoothingMatrix is.
return(matrix(, nrow = 3, ncol = 3))
}
# # TEST the CODE:
localSmoother(myMatrix = diag(x = 2, nrow = 5, ncol = 5), smoothingMatrix = matrix(2, nrow = 5, ncol = 3))
# # Error in localSmoother(myMatrix = diag(x = 2, nrow = 5, ncol = 5), smoothingMatrix = matrix(2, :
# # smoothingMatrix must be 3x3
Most immediately, you have two if in your conditional. You should just have one.
if(ncol(smoothingMatrix) != "3" & if(nrow(smoothingMatrix) != "3")
## should be
if(ncol(smoothingMatrix) != "3" & nrow(smoothingMatrix) != "3")
Next, your logic asks when the number of rows is not 3 and the number of columns is not 3. This condition works as intended (returns TRUE and follows to the stop statement in the if block) if your dimension is c(5,4), but would fail if your dimension is c(5,3).
x <- matrix(1:20,nrow=5)
dim(x)
# [1] 5 4
ncol(x) != "3" & nrow(x) != "3"
# [1] TRUE
x <- matrix(1:12,nrow=3)
dim(x)
# [1] 3 4
ncol(x) != "3" & nrow(x) != "3"
# [1] FALSE
I would instead use either of the following equivalent lines:
if(!(ncol(smoothingMatrix) == 3 && nrow(smoothingMatrix) == 3))
## is the same as
if(ncol(smoothingMatrix) != 3 || nrow(smoothingMatrix) != 3)
Note two things:
I'm using the scalar logical operators && and || instead of the vectorized operators & and |. Try c(TRUE, FALSE) & TRUE vs. c(TRUE, FALSE) && TRUE (since R 4.3.0 the latter is an error, because && requires operands of length one).
I am using the numeric form 3 instead of the character "3". R will coerce the number to a character, so the equality test works here, but it could trip you up in other cases.
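A small sketch of the difference between the two operator families, written so that it runs on both older and newer R versions. The matrix m is made-up example data.

```r
# & works element by element and returns a vector
elementwise <- c(TRUE, FALSE) & TRUE
print(elementwise)

# && returns a single value and stops as soon as the answer is known,
# so the stop() call on the right is never evaluated
short_circuit <- FALSE && stop("never evaluated")
print(short_circuit)

# Applied to the dimension check with a 5x3 example matrix
m <- matrix(1, nrow = 5, ncol = 3)
bad_dims <- ncol(m) != 3 || nrow(m) != 3
print(bad_dims)
```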
It may be easier to compare on the dimension of the matrix:
if(!isTRUE(all.equal(dim(smoothingMatrix),c(3L,3L))))
(isTRUE is needed because all.equal returns either TRUE or a character vector describing the differences; it never returns FALSE. An if statement wrapped directly around all.equal therefore throws an error when equality doesn't hold.)
all.equal(1,0)
# [1] "Mean relative difference: 1"
if(all.equal(1,0)) print(TRUE)
# Error in if (all.equal(1, 0)) print(TRUE) :
# argument is not interpretable as logical
@Blue Magister's answer is great with nice explanations, go there first.
For your specific task, you might find the stopifnot function useful. From ?stopifnot
stopifnot(...) If any of the expressions in ... are not all TRUE, stop is called, producing an error message indicating the first of the elements of ... which were not true.
Your code could then be
stopifnot(is.matrix(smoothingMatrix),
ncol(smoothingMatrix) == 3,
nrow(smoothingMatrix) == 3)
The downside is that the error messages are a bit less helpful; the upside is that you don't have to write them yourself, and the code is nice and clean.
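If custom error messages matter, the checks discussed in this thread can be combined into one small validator. This is a sketch only: the function name check_smoothing_matrix is hypothetical, and identical() on dim() is used here as an alternative to the all.equal form. It uses is.matrix() rather than comparing class() to "matrix", because from R 4.0.0 class() of a matrix returns c("matrix", "array"), which does not give a single TRUE/FALSE for if().

```r
# Hypothetical validator combining the checks from the answers above
check_smoothing_matrix <- function(m) {
  if (!is.matrix(m)) {
    stop("smoothingMatrix must be a matrix.")
  }
  # dim() of a matrix is an integer vector, so compare against integers
  if (!identical(dim(m), c(3L, 3L))) {
    stop("smoothingMatrix must have dimensions of 3x3")
  }
  invisible(TRUE)
}

ok <- check_smoothing_matrix(matrix(1, nrow = 3, ncol = 3))

# A 5x3 matrix and a plain vector are both rejected
wrong_dims <- try(check_smoothing_matrix(matrix(2, nrow = 5, ncol = 3)),
                  silent = TRUE)
not_matrix <- try(check_smoothing_matrix(1:9), silent = TRUE)
print(c(inherits(wrong_dims, "try-error"), inherits(not_matrix, "try-error")))
```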
Is there a better way to count how many elements of a result satisfy a condition?
a <- c(1:5, 1:-3, 1, 2, 3, 4, 5)
b <- c(6:-8)
u <- a > b
length(u[u == TRUE])
## [1] 7
sum does this directly, counting the number of TRUE values in a logical vector:
sum(u, na.rm=TRUE)
And of course there is no need to construct u for this:
sum(a > b, na.rm=TRUE)
works just as well. sum will return NA by default if any of the values are NA. na.rm=TRUE ignores NA values in the sum (for logical or numeric).
If z consists of only TRUE or FALSE, then simply
length(which(z))
I've always used table for this:
a <- c(1:5, 1:-3, 1, 2, 3, 4, 5)
b <- c(6:-8)
table(a>b)
FALSE TRUE
8 7
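The approaches differ once the logical vector contains NA. The sketch below uses a small made-up vector u to compare them. Note that the subsetting approach from the question counts the NA as well, because indexing with NA yields an NA element.

```r
# Made-up example with one missing value
u <- c(TRUE, FALSE, NA, TRUE)

n_sum_default <- sum(u)              # NA propagates by default
n_sum <- sum(u, na.rm = TRUE)        # counts only the TRUE values
n_which <- length(which(u))          # which() drops NA on its own
n_subset <- length(u[u == TRUE])     # NA index keeps an NA element
n_table <- unname(table(u)["TRUE"])  # table() ignores NA by default

print(c(n_sum, n_which, n_subset, n_table))
```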