Checking for sequences in an R vector - r

I'm looking for a function or operation such that if I have
A <- c(1, 2, 3, 4, 5)
and
B <- c(1, 2, 3)
and C <- c(2, 1)
I'd get TRUE when checking whether A contains B, and FALSE when checking whether A contains C.
Basically, I want the equivalent of the %in% operator, but one that actually cares about the order of elements.
In a perfect world, I'd be able to do this without some kind of apply statement, but I may end up having to.

Well, if one's allowed to use a kind-of apply loop, then this could work:
"%seq_in%" = function(b,a) any(sapply(1:(length(a)-length(b)+1),function(i) all(a[i:(i+length(b)-1)]==b)))
(edited thanks to bug-finding by John Coleman!)
EDIT 2:
I couldn't resist trying to solve the 'non-contiguous' case, too:
# find_subseq() returns positions within vec of ordered elements of x, or stops with NA upon failing
find_subseq = function(x,vec) {
p=match(x[1],vec)
if(is.na(p)||length(x)==1){ p }
else { c(p,p+find_subseq(x[-1],vec[-seq_len(p)])) }
}
"%seq_somewhere_in%" = function(b,a) all(!is.na(find_subseq(b,a)))
Examples:
1:3 %seq_in% 1:10
[1] TRUE
c(3,1,2) %seq_in% 1:10
[1] FALSE
c(1,2,3) %seq_in% c(3,2,1,2,3)
[1] TRUE
2:1 %seq_in% c(1,2,1)
[1] TRUE
1:3 %seq_somewhere_in% c(1,10,10,2,10,10,10,3,10)
[1] TRUE
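Since the question asks for something without an apply-style loop, here is a minimal sketch of the contiguous check built on stats::embed(), which lays out all sliding windows of a vector as the rows of a matrix. The name seq_in_embed is hypothetical, and the treatment of an empty b (returning TRUE) is an assumption rather than something the answer above specifies.

```r
# Hypothetical helper: contiguous subsequence test without sapply().
seq_in_embed <- function(b, a) {
  n <- length(b)
  if (n == 0L) return(TRUE)          # assumption: empty pattern always matches
  if (n > length(a)) return(FALSE)   # embed() would error on this case
  windows <- embed(a, n)             # one row per window, elements reversed
  # Compare column j of every window against rev(b)[j] in one vectorised step.
  hits <- windows == rep(rev(b), each = nrow(windows))
  any(rowSums(hits) == n)
}

seq_in_embed(c(1, 2, 3), 1:5)
seq_in_embed(c(2, 1), 1:5)
```

Because embed() materialises every window, this trades memory for the loop, so it is probably best suited to short patterns.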

Maybe you can define a custom function subseq_check like below
subseq_check <- function(x,y) grepl(toString(y),toString(x),fixed = TRUE)
which gives
> subseq_check(A,B)
[1] TRUE
> subseq_check(A,C)
[1] FALSE
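One caveat with the string-based check: because it matches substrings, an element such as 11 can be mistaken for a 1 at either end of the pattern. A minimal sketch of one way around this, assuming the elements convert cleanly to text, is to wrap both strings in delimiters; the name subseq_check_delim is hypothetical.

```r
# Hypothetical helper: delimit both strings so only whole elements can match.
subseq_check_delim <- function(x, y) {
  key <- function(v) paste0(",", paste(v, collapse = ","), ",")
  grepl(key(y), key(x), fixed = TRUE)
}

# The plain toString() version reports a match here, since "11, 2, 3"
# contains the text "1, 2, 3"; the delimited version does not.
subseq_check_delim(c(11, 2, 3), c(1, 2, 3))
subseq_check_delim(c(1, 2, 3, 4, 5), c(1, 2, 3))
```

This still compares text rather than numbers, so values that print differently from how they compare (for example floating-point results) may not behave as expected.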
A Hard-core approach
subseq_find <- function(x,y) {
inds <- which(x == head(y,1))
if (length(inds)==0) return(FALSE)
# isTRUE() guards against NA when a window runs past the end of x
any(sapply(inds, function(k) isTRUE(all(x[k:(k+length(y)-1)]==y))))
}
such that
> subseq_find(A,B)
[1] TRUE
> subseq_find(A,C)
[1] FALSE

Related

R: how to check if a vector is found in another vector of different length without using %in%

vector_1 = c(4,3,5,1,2)
vector_2 = c(3,1)
output:
[1] FALSE TRUE FALSE TRUE FALSE
How do I get this output using only basic operators and loops, without using the %in% operator or any functions in R?
See match.fun(`%in%`)
match(vector_1,vector_2, nomatch = 0) > 0
Without "functions" is a bit vague, since virtually anything in R is a function. Probably that's an assignment and a for loop is wanted.
res <- logical(length(vector_1))
for (i in seq_along(vector_1)) {
for (j in seq_along(vector_2)) {
if (vector_1[i] == vector_2[j])
res[i] <- TRUE
}
}
res
# [1] FALSE TRUE FALSE TRUE FALSE
However, that's not very R-ish where you rather want to do something like
apply(outer(vector_1, vector_2, `==`), 1, \(x) x[which.max(x)])
# [1] FALSE TRUE FALSE TRUE FALSE
Data:
vector_1 <- c(4, 3, 5, 1, 2)
vector_2 <- c(3, 1)
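The outer() comparison above can also be collapsed without apply(), since each row of the comparison matrix only needs to contain at least one TRUE. A minimal sketch using the same data (the variable names cmp and res are just for illustration):

```r
vector_1 <- c(4, 3, 5, 1, 2)
vector_2 <- c(3, 1)

# TRUE/FALSE matrix: one row per element of vector_1, one column per element of vector_2.
cmp <- outer(vector_1, vector_2, `==`)

# A row with at least one TRUE means that element of vector_1 occurs in vector_2.
res <- rowSums(cmp) > 0
res
```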
One way with sapply() -
sapply(vector_1, function(x) any(x == vector_2))
[1] FALSE TRUE FALSE TRUE FALSE

Is there a better way to check if all elements in a list are named?

I want to check if all elements in a list are named. I've come up with this solution, but I wanted to know if there is a more elegant way to check this.
x <- list(a = 1, b = 2)
y <- list(1, b = 2)
z <- list (1, 2)
any(stringr::str_length(methods::allNames(x)) == 0L) # FALSE, all elements are named.
any(stringr::str_length(methods::allNames(y)) == 0L) # TRUE, at least one element is not named.
# Throw an error here.
any(stringr::str_length(methods::allNames(z)) == 0L) # TRUE, at least one element is not named.
# Throw an error here.
I am not sure if the following base R code works for your general cases, but it seems to work for the ones in your post.
Define a function f to check the names
f <- function(lst) length(lst) == sum(names(lst) != "",na.rm = TRUE)
and you will see
> f(x)
[1] TRUE
> f(y)
[1] FALSE
> f(z)
[1] FALSE
We can create a function that checks whether the names attribute is NULL or (|) contains a blank ("") name, and then negates (!) the result
f1 <- function(lst1) is.list(lst1) && !(is.null(names(lst1))| '' %in% names(lst1))
-checking
f1(x)
#[1] TRUE
f1(y)
#[1] FALSE
f1(z)
#[1] FALSE
Or with allNames
f2 <- function(lst1) is.list(lst1) && !("" %in% allNames(lst1))
-checking
f2(x)
#[1] TRUE
f2(y)
#[1] FALSE
f2(z)
#[1] FALSE
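The checks above can be folded into one base R helper that also treats NA names as unnamed. This is only a sketch: the name all_named is hypothetical, and counting NA names and empty lists as "not all named" is an assumption about the desired behaviour.

```r
# Hypothetical helper: TRUE only if every element has a non-empty, non-NA name.
all_named <- function(lst) {
  nms <- names(lst)
  !is.null(nms) && !anyNA(nms) && all(nzchar(nms))
}

x <- list(a = 1, b = 2)
y <- list(1, b = 2)
z <- list(1, 2)

all_named(x)
all_named(y)
all_named(z)
```

To throw an error as the question wants, the result could be passed to stopifnot().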

Check if a number is between two others

I am looking for a function that verifies if a number is between two other numbers. I also need to control if I want a strict comparison (a
I know the function between() in dplyr. Yet, I have to know the upper and lower numbers.
MyNumber = 8
First = 2
Second = 10
# This will return TRUE
between(MyNumber, lower = First, upper = Second)
# But this will return FALSE
between(MyNumber, lower = Second, upper = First)
# This will return TRUE. I want it to return FALSE
First = 8
between(MyNumber, lower = First, upper = Second)
I need a function that returns TRUE regardless of the order in which the two bounds are given.
Something like:
between2 <- function(number,bounds) { number > min(bounds) & number < max(bounds)}
between2(8, c(2,10))
[1] TRUE
between2(8, c(10,2))
[1] TRUE
This function also deals with your added condition
between2(8,c(8,10))
[1] FALSE
You could do it with simple arithmetic:
between <- function(number, first, second) { (first - number) * (second - number) < 0 }
Here are some example outputs:
> between(8, 2, 10)
[1] TRUE
> between(8, 10, 2)
[1] TRUE
> between(8, 10, 12)
[1] FALSE
> between(8, 1, 2)
[1] FALSE
You could use %in% with the : function, once you know first and last (this only works for whole numbers):
first <- 2
last <- 10
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 2
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 12
number <- 8
number %in% first:last
[1] FALSE
first <- 12
last <- 10
number <- 8
number %in% first:last
[1] FALSE
Wrapped in a function, where strict controls whether the bounds themselves are excluded:
my_between <- function(n, f, l, strict = FALSE) {
  lo <- min(f, l)  # order the bounds so f > l also works
  hi <- max(f, l)
  if (!strict) {
    n %in% lo:hi # if strict == FALSE (default)
  } else {
    n %in% lo:hi & n != lo & n != hi # if strict == TRUE
  }
}
my_between(8, 2, 10)
What's wrong with
f_between <- function (num, L, R) num>=min(L,R) & num<=max(L,R)
f_between(8, 2, 10)
#[1] TRUE
f_between(6, 6, 10)
#[1] TRUE
f_between(2, -10, -2)
#[1] FALSE
f_between(3, 5, 7)
#[1] FALSE
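Putting the pieces from the answers above together, here is a minimal sketch that accepts the bounds in either order, works for non-integer values, and exposes the strict/non-strict choice as an argument. The name between_any and its argument names are hypothetical.

```r
# Hypothetical helper: order-independent range check with optional strictness.
between_any <- function(x, a, b, strict = FALSE) {
  lo <- pmin(a, b)  # pmin/pmax keep the function vectorised over a and b
  hi <- pmax(a, b)
  if (strict) {
    x > lo & x < hi
  } else {
    x >= lo & x <= hi
  }
}

between_any(8, 2, 10)
between_any(8, 10, 2)
between_any(8, 8, 10)
between_any(8, 8, 10, strict = TRUE)
```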

Search a matrix for rows with given values in any order

I have a matrix and a vector with values:
mat<-matrix(c(1,1,6,
3,5,2,
1,6,5,
2,2,7,
8,6,1),nrow=5,ncol=3,byrow=T)
vec<-c(1,6)
This is a small subset of an N by N matrix and a 1 by N vector. Is there a way to subset the rows that contain the values in vec?
The most straightforward way of doing this that I know of would be to use the subset function:
subset(mat, mat[,1] == 1 & mat[,2] == 6) #etc etc
The problem with subset is that you have to specify in advance which columns to look in and which specific combination to look for. My problem is structured such that I want to find all rows containing the numbers in "vec" in any position and order. So in the above example, I want to get a return matrix of:
1,1,6
1,6,5
8,6,1
Any ideas?
You can do
apply(mat, 1, function(x) all(vec %in% x))
# [1] TRUE FALSE TRUE FALSE TRUE
but this may give you unexpected results if vec contains repeated values:
vec <- c(1, 1)
apply(mat, 1, function(x) all(vec %in% x))
# [1] TRUE FALSE TRUE FALSE TRUE
so you would have to use something more complicated using table to account for repetitions:
vec <- c(1, 1)
is.sub.table <- function(table1, table2) {
all(names(table1) %in% names(table2)) &&
all(table1 <= table2[names(table1)])
}
apply(mat, 1, function(x)is.sub.table(table(vec), table(x)))
# [1] TRUE FALSE FALSE FALSE FALSE
However, if the vector length is equal to the number of columns in your matrix as you seem to indicate but is not the case in your example, you should just do:
vec <- c(1, 6, 1)
apply(mat, 1, function(x) all(sort(vec) == sort(x)))
# [1] TRUE FALSE FALSE FALSE FALSE
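The apply() calls above return a logical vector, whereas the question asks for the matching rows themselves. A minimal sketch of the last step, using the original vec of c(1, 6) (the names keep and res are just for illustration):

```r
mat <- matrix(c(1, 1, 6,
                3, 5, 2,
                1, 6, 5,
                2, 2, 7,
                8, 6, 1), nrow = 5, ncol = 3, byrow = TRUE)
vec <- c(1, 6)

# Logical index: which rows contain every value in vec?
keep <- apply(mat, 1, function(x) all(vec %in% x))

# drop = FALSE keeps the result a matrix even if only one row matches.
res <- mat[keep, , drop = FALSE]
res
```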

Check that a vector is contained in a matrix in R

I can't believe this is taking me this long to figure out, and I still can't figure it out.
I need to keep a collection of vectors, and later check that a certain vector is in that collection. I tried lists combined with %in% but that doesn't appear to work properly.
My next idea was to create a matrix and rbind vectors to it, but now I don't know how to check if a vector is contained in a matrix. %in% appears to compare sets and not exact rows. The same appears to apply to intersect.
Help much appreciated!
Do you mean like this:
wantVec <- c(3,1,2)
myList <- list(A = c(1:3), B = c(3,1,2), C = c(2,3,1))
sapply(myList, function(x, want) isTRUE(all.equal(x, want)), wantVec)
## or, is the vector in the set?
any(sapply(myList, function(x, want) isTRUE(all.equal(x, want)), wantVec))
We can do a similar thing with a matrix:
myMat <- matrix(unlist(myList), ncol = 3, byrow = TRUE)
## As the vectors are now in the rows, we use apply over the rows
apply(myMat, 1, function(x, want) isTRUE(all.equal(x, want)), wantVec)
## or
any(apply(myMat, 1, function(x, want) isTRUE(all.equal(x, want)), wantVec))
Or by columns:
myMat2 <- matrix(unlist(myList), ncol = 3)
## As the vectors are now in the cols, we use apply over the cols
apply(myMat2, 2, function(x, want) isTRUE(all.equal(x, want)), wantVec)
## or
any(apply(myMat2, 2, function(x, want) isTRUE(all.equal(x, want)), wantVec))
If you need to do this a lot, write your own function
vecMatch <- function(x, want) {
isTRUE(all.equal(x, want))
}
And then use it, e.g. on the list myList:
> sapply(myList, vecMatch, wantVec)
A B C
FALSE TRUE FALSE
> any(sapply(myList, vecMatch, wantVec))
[1] TRUE
Or even wrap the whole thing:
vecMatch <- function(x, want) {
out <- sapply(x, function(x, want) isTRUE(all.equal(x, want)), want)
any(out)
}
> vecMatch(myList, wantVec)
[1] TRUE
> vecMatch(myList, 5:3)
[1] FALSE
EDIT: Quick comment on why I used isTRUE() wrapped around the all.equal() calls. This is due to the fact that where the two arguments are not equal, all.equal() doesn't return a logical value (FALSE):
> all.equal(1:3, c(3,2,1))
[1] "Mean relative difference: 1"
isTRUE() is useful here because it returns TRUE iff its argument is TRUE, whilst it returns FALSE if it is anything else.
> M
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
v <- c(2, 5, 8)
Check each column:
c1 <- which(M[, 1] == v[1])
c2 <- which(M[, 2] == v[2])
c3 <- which(M[, 3] == v[3])
Here is a way to still use intersect() on more than 2 elements
> intersect(intersect(c1, c2), c3)
[1] 2
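The nested intersect() calls above can be generalised to any number of columns with Reduce(). This is a minimal sketch, assuming M is the 3 by 3 matrix printed above (built here as matrix(1:9, nrow = 3)); the names hits and rows are just for illustration.

```r
M <- matrix(1:9, nrow = 3)  # assumption: reproduces the M shown above
v <- c(2, 5, 8)

# For each column j, the row indices where M[, j] equals v[j].
hits <- lapply(seq_along(v), function(j) which(M[, j] == v[j]))

# Rows that match in every column, i.e. rows equal to v.
rows <- Reduce(intersect, hits)
rows
```

An empty result (integer(0)) means that no row of M equals v.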

Resources