Consider two vectors:
test1 <- c(1,2,3,4,5,3)
test2 <- c(2,3,4,5,6,7,2)
My goal is to create a vector that contains only the values found in both vectors. The result should be a vector like
2 3 4 5
I have two questions.
1) How can I get the wanted result in R? (Ideally this should also work with three vectors: with test3 <- c(1,3,5,6,7), I would want all values found in all three vectors, i.e. 3 5.)
2) I tried to write a loop for this, but it does not do the job as intended. Curiously, if I run each step of my code manually, everything works out. What am I missing? Why doesn't my code work?
The idea is to create a vector test4 <- c(test1, test2) and iteratively check whether each value can be found in both test1 and test2.
for(i in levels(as.factor(test4))){ #loop over all occurring levels
  log1 <- rep(0,nlevels(as.factor(test4))) #create logical vector
  log1 <- as.logical(log1) #to store results
  if(is.element(i,test1) == TRUE & is.element(i,test2) == TRUE){
    log1[which(levels(as.factor(test4)) == i)] <- TRUE
  } else{
    log1[which(levels(as.factor(test4)) == i)] <- FALSE
  }
  #if i is an element of test1 and test2, then the corresponding entry
  #in log1 becomes TRUE, otherwise FALSE
}
This leads to the result
log1
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Now one can think of errors in the loops. To check for that, I printed the values and they are all correct:
for(i in levels(as.factor(test4))){
if(is.element(i,test1) == TRUE & is.element(i,test2) == TRUE){
print(TRUE)
} else{
print(FALSE)
}
}
[1] FALSE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] FALSE
[1] FALSE
To check the index that belongs to each i, I ran this code:
for(i in levels(as.factor(test4))){
  j <- which(levels(as.factor(test4)) == i)
  print(j)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
All seems to be correct to this point. Now I run the code manually and get the wanted result:
test1 <- c(1,2,3,4,5)
test2 <- c(2,3,4,5,6,7)
test4 <- c(test1, test2)
log1 <- rep(0,nlevels(as.factor(test4)))
log1 <- as.logical(log1)
log1[1] <- is.element(1,test1) == TRUE & is.element(1,test2) == TRUE
log1[2] <- is.element(2,test1) == TRUE & is.element(2,test2) == TRUE
log1[3] <- is.element(3,test1) == TRUE & is.element(3,test2) == TRUE
log1[4] <- is.element(4,test1) == TRUE & is.element(4,test2) == TRUE
log1[5] <- is.element(5,test1) == TRUE & is.element(5,test2) == TRUE
log1[6] <- is.element(6,test1) == TRUE & is.element(6,test2) == TRUE
log1[7] <- is.element(7,test1) == TRUE & is.element(7,test2) == TRUE
log1
[1] FALSE TRUE TRUE TRUE TRUE FALSE FALSE
I also tried to set an index j <- which(levels(as.factor(test4)) == i) and replace the entries log1[j].
The if statement is not necessary, but it helped to locate the problem. The for loop could be written as
for(i in levels(as.factor(test4))){
log1 <- rep(0,nlevels(as.factor(test4)))
log1 <- as.logical(log1)
log1[which(levels(as.factor(test4)) == i)] <- is.element(i,test1) == TRUE & is.element(i,test2) == TRUE
}
That doesn't help either. I really don't know what I did wrong here. I searched the web and Stack Overflow, but I could not find a solution. I hope you can!
Gather the unique values of each vector, then keep those that are duplicated:
both <- c(unique(test1), unique(test2)) # named "both" so it does not mask base::all()
both[duplicated(both)]
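On the second question: in the loop as posted, log1 is re-created inside the loop body, so each iteration wipes the results of the previous ones. Only the entry for the last level survives, and 7 is not in test1, hence all FALSE. Below is a minimal sketch of both parts, assuming base R only: intersect() for the common values, Reduce() to extend it to three or more vectors, and the loop with the initialisation moved in front of it. The names common2, common3, vals and common_loop are only illustrative.

```r
test1 <- c(1, 2, 3, 4, 5, 3)
test2 <- c(2, 3, 4, 5, 6, 7, 2)
test3 <- c(1, 3, 5, 6, 7)

# 1) Values common to two vectors; duplicates are dropped
common2 <- intersect(test1, test2)
common2
# [1] 2 3 4 5

# Values common to any number of vectors: fold intersect() over a list
common3 <- Reduce(intersect, list(test1, test2, test3))
common3
# [1] 3 5

# 2) The loop idea from the question, with the result vector created
#    once, before the loop, so earlier results are not overwritten
vals <- sort(unique(c(test1, test2)))   # candidate values, 1 to 7 here
log1 <- logical(length(vals))           # all FALSE
for (k in seq_along(vals)) {
  log1[k] <- is.element(vals[k], test1) && is.element(vals[k], test2)
}
common_loop <- vals[log1]
common_loop
# [1] 2 3 4 5
```

Running the steps by hand worked because log1 was then initialised only once.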
Related
vector_1 = c(4,3,5,1,2)
vector_2 = c(3,1)
output:
[1] FALSE TRUE FALSE TRUE FALSE
How do I get this output, which flags the elements of vector_1 that occur in vector_2, using only basic operators/loops, without the %in% operator or any functions in R?
See match.fun(`%in%`)
match(vector_1,vector_2, nomatch = 0) > 0
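To make the match() line above less opaque, here is a short sketch of its intermediate result. match() returns, for each element of vector_1, its position in vector_2, and nomatch = 0 replaces the missing positions with 0, so comparing with > 0 yields the membership flags. The name pos is only illustrative.

```r
vector_1 <- c(4, 3, 5, 1, 2)
vector_2 <- c(3, 1)

# Position of each element of vector_1 within vector_2, 0 if absent
pos <- match(vector_1, vector_2, nomatch = 0)
pos
# [1] 0 1 0 2 0

# A position greater than 0 means the element was found
pos > 0
# [1] FALSE  TRUE FALSE  TRUE FALSE
```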
"Without functions" is a bit vague, since virtually everything in R is a function call. Probably this is an assignment and a for loop is wanted.
res <- logical(length(vector_1))
for (i in seq_along(vector_1)) {
for (j in seq_along(vector_2)) {
if (vector_1[i] == vector_2[j])
res[i] <- TRUE
}
}
res
# [1] FALSE TRUE FALSE TRUE FALSE
However, that's not very R-ish; you would rather do something like this (the \(x) shorthand needs R >= 4.1.0):
apply(outer(vector_1, vector_2, `==`), 1, \(x) x[which.max(x)])
# [1] FALSE TRUE FALSE TRUE FALSE
Data:
vector_1 <- c(4, 3, 5, 1, 2)
vector_2 <- c(3, 1)
One way with sapply() -
sapply(vector_1, function(x) any(x == vector_2))
[1] FALSE TRUE FALSE TRUE FALSE
I want to check if all elements in a list are named. I've come up with this solution, but I wanted to know if there is a more elegant way to check this.
x <- list(a = 1, b = 2)
y <- list(1, b = 2)
z <- list(1, 2)
any(stringr::str_length(methods::allNames(x)) == 0L) # FALSE, all elements are named.
any(stringr::str_length(methods::allNames(y)) == 0L) # TRUE, at least one element is not named. Throw an error here.
any(stringr::str_length(methods::allNames(z)) == 0L) # TRUE, at least one element is not named. Throw an error here.
I am not sure whether the following base R code works for your general cases, but it seems to work for the ones in your post.
Define a function f to check the names
f <- function(lst) length(lst) == sum(names(lst) != "",na.rm = TRUE)
and you will see
> f(x)
[1] TRUE
> f(y)
[1] FALSE
> f(z)
[1] FALSE
We can create a function that checks whether the names attribute is NULL or (|) there is a blank ("") name, and negates (!) the result
f1 <- function(lst1) is.list(lst1) && !(is.null(names(lst1))| '' %in% names(lst1))
-checking
f1(x)
#[1] TRUE
f1(y)
#[1] FALSE
f1(z)
#[1] FALSE
Or with allNames
f2 <- function(lst1) is.list(lst1) && !("" %in% methods::allNames(lst1))
-checking
f2(x)
#[1] TRUE
f2(y)
#[1] FALSE
f2(z)
#[1] FALSE
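For comparison, here is a base R sketch that spells out the three conditions checked above: the argument is a list, it has a names attribute, and no name is empty or NA. The helper name all_named() is hypothetical, and treating an empty list as "not all named" is an assumption you may want to change.

```r
x <- list(a = 1, b = 2)
y <- list(1, b = 2)
z <- list(1, 2)

# all_named() is a hypothetical helper name
all_named <- function(lst) {
  nms <- names(lst)
  # names() is NULL when no element is named; unnamed elements of a
  # partially named list get "" as their name
  is.list(lst) && !is.null(nms) && all(!is.na(nms) & nzchar(nms))
}

all_named(x)
# [1] TRUE
all_named(y)
# [1] FALSE
all_named(z)
# [1] FALSE
```

It can then be used as a guard, e.g. stopifnot(all_named(x)) at the top of a function.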
This should be very simple, but my R knowledge is limited.
I'm trying to find out, for each value in a vector, whether it is greater than all previous values.
An example would be
x<-c(1.1, 2.5, 2.4, 3.6, 3.2)
results:
NA True False True False
My real values are measurements with many decimal places, so I doubt I will get the same value twice.
You can use cummax() to get the biggest value so far. x >= cummax(x) basically gives you the answer, although element 1 is TRUE, so you just need to change that:
> out = x >= cummax(x)
> out[1] = NA
> out
[1] NA TRUE FALSE TRUE FALSE
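If ties should not count as "greater than all previous values", one option is to compare each element with the running maximum of the elements before it. This is only a sketch: is_new_max() is a hypothetical helper name and it assumes x has at least one element.

```r
x <- c(1.1, 2.5, 2.4, 3.6, 3.2)

# is_new_max() is a hypothetical helper name; assumes length(x) >= 1
is_new_max <- function(x) {
  prev_max <- cummax(x)[-length(x)]  # running max of everything before each element
  c(NA, x[-1] > prev_max)            # the first element has nothing to compare against
}

is_new_max(x)
# [1]    NA  TRUE FALSE  TRUE FALSE

is_new_max(c(1, 2, 2))   # a repeated maximum is not flagged
# [1]    NA  TRUE FALSE
```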
@Marius has got this absolutely correct, but here is an option with a loop (the first element comes out as TRUE rather than NA)
sapply(seq_along(x), function(i) all(x[i] >= x[seq_len(i)]))
#[1] TRUE TRUE FALSE TRUE FALSE
Or same logic with explicit for loop
out <- logical(length(x))
for(i in seq_along(x)) {
out[i] <- all(x[i] >= x[seq_len(i)])
}
out[1] <- NA
out
#[1] NA TRUE FALSE TRUE FALSE
We can use lapply
unlist(lapply(seq_along(x), function(i) all(x[i] >=x[seq(i)])))
#[1] TRUE TRUE FALSE TRUE FALSE
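Or with max.col (this reproduces the expected output for the example, but it is not correct in general: for x <- c(1, 3, 2, 0) it also flags the third element)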
max.col(t(sapply(x, `>=`, x)), 'last') > seq_along(x)
#[1] FALSE TRUE FALSE TRUE FALSE
Or with a for loop
mx <- x[1]
i1 <- logical(length(x))
for(i in seq_along(x)) {
  if (x[i] > mx) i1[i] <- TRUE # strictly greater than the maximum so far
  mx <- max(mx, x[i])
}
I have the following vector:
p<-c(0,0,1,1,1,3,2,3,2,2,2,2)
I'm trying to write a function that returns TRUE if the same value occurs x times in a row somewhere in the vector.
The function call found_duplications(p,3) will return TRUE because there are three consecutive 1's. The function call found_duplications(p,5) will return FALSE because no number occurs 5 times in a row. The function call found_duplications(p,4) will return TRUE because there are four consecutive 2's.
I have a couple ideas. There's the duplicated() function:
duplicated(p)
> [1] FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
I could write a for loop that counts the TRUEs in that vector, but the consecutive counter would be off by one. Can you think of any other solutions?
You could also do
find.dup <- function(x, n){
n %in% rle(x)$lengths
}
find.dup(p,3)
#[1] TRUE
find.dup(p,2)
#[1] TRUE
find.dup(p,5)
#[1] FALSE
find.dup(p,4)
#[1] TRUE
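Note that n %in% rle(x)$lengths asks for a run of exactly n. If a longer run should also count (a run of four contains three in a row), a small variation compares the run lengths with >= instead. This is only a sketch, and has_run() is a hypothetical name.

```r
p <- c(0, 0, 1, 1, 1, 3, 2, 3, 2, 2, 2, 2)

# has_run() is a hypothetical name: TRUE if some value repeats
# at least n times in a row
has_run <- function(x, n) {
  any(rle(x)$lengths >= n)
}

has_run(p, 3)
# [1] TRUE
has_run(p, 5)
# [1] FALSE

# Only one run, of length 4: the exact-length test says FALSE for n = 3
3 %in% rle(c(2, 2, 2, 2))$lengths
# [1] FALSE
has_run(c(2, 2, 2, 2), 3)
# [1] TRUE
```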
p<-c(0,0,1,1,1,3,2,3,2,2,2,2)
find.dup <- function(x, n) {
  consec <- 1 # length of the current run
  for(i in seq_along(x)[-1]) { # unlike 2:length(x), this is empty when length(x) < 2
    if(x[i] == x[i-1]) {
      consec <- consec + 1
    } else {
      consec <- 1
    }
    if(consec == n)
      return(TRUE) # or you could return x[i]
  }
  return(FALSE)
}
find.dup(p,3)
# [1] TRUE
find.dup(p,4)
# [1] TRUE
find.dup(p,5)
# [1] FALSE
A friend wrote up this function for determining unique members of a vector. I can't figure out (mentally) what this one line is doing and it's the crux of the function. Any help is greatly appreciated
myUniq <- function(x){
  len = length(x) # getting the length of the argument
  logical = rep(T, len) # creating a vector of logicals as long as the arg, populating with true
  for(i in 1:len){ # for i -> length of the argument
    logical = logical & x != x[i] # logical vector = logical vector & arg vector where arg vector != x[i] ??????
    logical[i] = T
  }
  x[logical]
}
This line I can't figure out:
logical = logical & x != x[i]
can anyone explain it to me?
Thanks,
Tom
logical is a logical vector that starts out as len copies of TRUE. x is a vector of some other data of the same length.
The second part, x != x[i], creates a logical vector that is TRUE where the elements of x differ from the value of x at the current iteration, and FALSE otherwise.
As a result, both sides of & are logical vectors. Because != binds more tightly than &, the line is evaluated as logical & (x != x[i]). & is an element-wise AND: the result is TRUE where the elements of logical and of x != x[i] are both TRUE, and FALSE otherwise. Hence, after the first iteration, logical is TRUE for all elements of x that differ from the first element of x, and FALSE where they are the same. The next line, logical[i] = T, then switches position i itself back to TRUE.
Here is a bit of an example:
logical <- rep(TRUE, 10)
set.seed(1)
x <- sample(letters[1:4], 10, replace = TRUE)
> x
[1] "b" "b" "c" "d" "a" "d" "d" "c" "c" "a"
> logical
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> x != x[1]
[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> logical & x != x[1]
[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
This seems very complex. Do you get the same results as:
unique(x)
gives you? If I run my x above through myUniq() and unique() I get the same output:
> myUniq(x)
[1] "b" "d" "c" "a"
> unique(x)
[1] "b" "c" "d" "a"
(well, except for the ordering...)
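The ordering difference comes from which copy survives: every iteration switches off all positions equal to x[i] and then switches position i back on, so a later duplicate always knocks out an earlier one, and the last occurrence of each value is kept. A small sketch of that, using the example vector above typed in by hand (so it does not depend on the random number generator) and the friend's function rewritten with seq_along() and the variable renamed to keep:

```r
# The friend's function, unchanged in logic; only seq_along() replaces
# 1:len and the variable "logical" is renamed to "keep"
myUniq <- function(x) {
  keep <- rep(TRUE, length(x))
  for (i in seq_along(x)) {
    keep <- keep & x != x[i]   # parsed as keep & (x != x[i])
    keep[i] <- TRUE            # position i itself stays in (for now)
  }
  x[keep]
}

x <- c("b", "b", "c", "d", "a", "d", "d", "c", "c", "a")

myUniq(x)
# [1] "b" "d" "c" "a"

# Same result: keep the last occurrence of each value
x[!duplicated(x, fromLast = TRUE)]
# [1] "b" "d" "c" "a"

# unique() keeps the first occurrence instead
unique(x)
# [1] "b" "c" "d" "a"
```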