Does a vector exist in a matrix? - R

How can I check whether a vector exists as a row of a matrix? The vector will always be of length 2. I have an approach that works, but I would like something vectorized and faster.
dim(m)
[1] 30 2
x = c(1, -2)
for(j in 1:nrow(m)){
if (isTRUE(x[1] == m[j, 1]) && isTRUE(x[2] == m[j, 2])) {
print(TRUE)
}
}
Note that order matters: x = c(1, -2) should not match a row of the matrix containing -2, 1.

If we are comparing the rows of the matrix ('m') with 'x', where 'x' has the same length as the number of columns of 'm', we can replicate 'x' (x[col(m)]) so that the lengths are the same, compare (!=), and take the rowSums. If the sum is 0 for a particular row, all the values in the vector match that row of 'm'. Negate (!) to convert 0 to TRUE and all other values to FALSE.
indx1 <- !rowSums(m!=x[col(m)])
Or if we need a solution using apply, we can use identical
indx2 <- apply(m, 1, identical, y=x)
identical(indx1, indx2)
#[1] TRUE
If the goal is only a single TRUE/FALSE, we can wrap 'indx1' or 'indx2' in any().
data
x <- c(1, -2)
set.seed(24)
m <- matrix(sample(c(1,-2,3,4), 30*2, replace=TRUE), ncol=2)
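To see why the x[col(m)] trick lines up correctly, here is a minimal sketch on a three-row matrix. The helper name row_matches and the matrix m_small are made up for illustration and are not part of the answer above; the length check is an added assumption.

```r
# Hypothetical helper wrapping the rowSums approach; the name is illustrative.
row_matches <- function(m, x) {
  # Assumption: x has one entry per column of m.
  stopifnot(length(x) == ncol(m))
  # x[col(m)] repeats x[1] down column 1, x[2] down column 2, and so on,
  # so every cell of m is compared with the entry of x for its own column.
  rowSums(m != x[col(m)]) == 0
}

m_small <- matrix(c( 1, -2,
                    -2,  1,
                     3,  4), ncol = 2, byrow = TRUE)
x <- c(1, -2)

hits <- row_matches(m_small, x)
hits        # one logical per row; only the first row matches in order
any(hits)   # single TRUE/FALSE answer
```

One difference between the two variants worth keeping in mind: identical() also compares storage type, so an integer matrix and a double vector would not match with the apply() version, while the != comparison ignores that distinction.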

Try
m<-matrix(rnorm(60),30)
x<-m[8,]
m[9,]<-c(x[2],x[1]) # to prove 1,-2 not same -2,1
apply(m,1,function(n,x) all(n==x),x=x)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[24] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
If you need just one TRUE/FALSE, use any():
any(apply(m,1,function(n,x) all(n==x),x=x))
[1] TRUE
If you run this code with akrun's data:
x <- c(1, -2)
set.seed(24)
m <- matrix(sample(c(1,-2,3,4), 30*2, replace=TRUE), ncol=2)
any(apply(m,1,function(n,x) all(n==x),x=x))
[1] TRUE

Related

R: how to check if a vector is found in another vector of different length without using %in%

vector_1 = c(4,3,5,1,2)
vector_2 = c(3,1)
output:
[1] FALSE TRUE FALSE TRUE FALSE
how do I get the output just by using basic operators/loops without using the operator %in% or any functions in R?
See match.fun(`%in%`)
match(vector_1,vector_2, nomatch = 0) > 0
Without "functions" is a bit vague, since virtually anything in R is a function. Probably that's an assignment and a for loop is wanted.
res <- logical(length(vector_1))
for (i in seq_along(vector_1)) {
for (j in seq_along(vector_2)) {
if (vector_1[i] == vector_2[j])
res[i] <- TRUE
}
}
res
# [1] FALSE TRUE FALSE TRUE FALSE
However, that's not very R-ish; you would rather do something like
apply(outer(vector_1, vector_2, `==`), 1, \(x) x[which.max(x)])
# [1] FALSE TRUE FALSE TRUE FALSE
Data:
vector_1 <- c(4, 3, 5, 1, 2)
vector_2 <- c(3, 1)
One way with sapply() -
sapply(vector_1, function(x) any(x == vector_2))
[1] FALSE TRUE FALSE TRUE FALSE
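Since outer() already yields one row per element of vector_1, the row-wise apply() in the earlier answer can also be replaced by a row sum. This is only a sketch of that variation, using the same example vectors; the name cmp is illustrative.

```r
vector_1 <- c(4, 3, 5, 1, 2)
vector_2 <- c(3, 1)

# cmp[i, j] is TRUE when vector_1[i] equals vector_2[j]
cmp <- outer(vector_1, vector_2, `==`)

# A row with at least one TRUE means that element occurs in vector_2
res <- rowSums(cmp) > 0
res
```

Note that the comparison matrix has length(vector_1) * length(vector_2) cells, so this is convenient for short vectors but memory-hungry for long ones.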

Function to find if a value is greater than all prior values in a vector

This should be very simple, but my R knowledge is limited.
For each value in a vector, I'm trying to find out whether it is greater than all previous values.
An example would be
x<-c(1.1, 2.5, 2.4, 3.6, 3.2)
results:
NA True False True False
My real values are measurements with many decimal places so I doubt I will get the same value twice
You can use cummax() to get the biggest value so far. x >= cummax(x) basically gives you the answer, although element 1 is TRUE, so you just need to change that:
> out = x >= cummax(x)
> out[1] = NA
> out
[1] NA TRUE FALSE TRUE FALSE
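If ties should not count as "greater", one possible variation is to compare each value with the running maximum of the values before it. This is a sketch, not part of the answer above; the name prev_max is illustrative.

```r
x <- c(1.1, 2.5, 2.4, 3.6, 3.2)

# Running maximum of all earlier values; the first element has none, so NA
prev_max <- c(NA, head(cummax(x), -1))

# Strictly greater than everything seen before
out <- x > prev_max
out
```

With this form the first element comes out as NA without a separate assignment, because comparing against NA yields NA.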
@Marius has got this absolutely right, but here is an option with a loop
sapply(seq_along(x), function(i) all(x[i] >= x[seq_len(i)]))
#[1] TRUE TRUE FALSE TRUE FALSE
Or same logic with explicit for loop
out <- logical(length(x))
for(i in seq_along(x)) {
out[i] <- all(x[i] >= x[seq_len(i)])
}
out[1] <- NA
out
#[1] NA TRUE FALSE TRUE FALSE
We can use lapply
unlist(lapply(seq_along(x), function(i) all(x[i] >=x[seq(i)])))
#[1] TRUE TRUE FALSE TRUE FALSE
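Or with max.col (this reproduces the expected output for this x, but it actually flags elements that are followed by a smaller or equal value, so it is not a general solution; for example, x <- c(3, 1) gives TRUE FALSE)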
max.col(t(sapply(x, `>=`, x)), 'last') > seq_along(x)
#[1] FALSE TRUE FALSE TRUE FALSE
or with for loop
mx <- x[1]
i1 <- logical(length(x))
for(i in seq_along(x)) {i1[i][x[i] > mx] <- TRUE; mx <- max(c(mx, x[i]))}

which rows match a given vector in R

I have a matrix A,
A = as.matrix(data.frame(col1 = c(1,1,2,3,1,2), col2 = c(-1,-1,-2,-3,-1,-2), col3 = c(2,6,1,3,2,4)))
And I have a vector v,
v = c(-1, 2)
How can I get a vector of TRUE/FALSE that compares the last two columns of the matrix and returns TRUE if the last two columns match the vector, or false if they don't?
I.e., If I try,
A[,c(2:3)] == v
I obtain,
col2 col3
[1,] TRUE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE
[5,] TRUE FALSE
[6,] FALSE FALSE
Which is not what I want, I want both columns to be the same as vector v, more like,
result = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
Since the first, and 5th rows match the vector v entirely.
Here's a simple alternative
> apply(A[, 2:3], 1, function(x) all(x==v))
[1] TRUE FALSE FALSE FALSE TRUE FALSE
Oops, by looking through the R mailing list I found an answer: https://stat.ethz.ch/pipermail/r-help/2010-September/254096.html
check.equal <- function(x, y)
{
isTRUE(all.equal(y, x, check.attributes=FALSE))
}
result = apply(A[,c(2:3)], 1, check.equal, y=v)
Not sure I need to define a function and do all that, maybe there are easier ways to do it.
Here's another straightforward option:
which(duplicated(rbind(A[, 2:3], v), fromLast=TRUE))
# [1] 1 5
results <- rep(FALSE, nrow(A))
results[which(duplicated(rbind(A[, 2:3], v), fromLast=TRUE))] <- TRUE
results
# [1] TRUE FALSE FALSE FALSE TRUE FALSE
Alternatively, as one line:
duplicated(rbind(A[, 2:3], v), fromLast=TRUE)[-(nrow(A)+1)]
# [1] TRUE FALSE FALSE FALSE TRUE FALSE
A dirty one:
result <- c()
for(n in 1:nrow(A)){result[n] <-(sum(A[n,-1]==v)==2)}
> result
[1] TRUE FALSE FALSE FALSE TRUE FALSE
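The comparison in the question goes wrong because R recycles v down the columns of A[, 2:3], not across its rows. A sketch of one way around that, assuming the same A and v, is to transpose first so that each column of the transposed matrix is one original row. The names per_row and result are illustrative.

```r
A <- as.matrix(data.frame(col1 = c(1, 1, 2, 3, 1, 2),
                          col2 = c(-1, -1, -2, -3, -1, -2),
                          col3 = c(2, 6, 1, 3, 2, 4)))
v <- c(-1, 2)

# After t(), each column holds one original row, so v is recycled per row
per_row <- t(A[, 2:3]) == v

# A row matches when every one of its length(v) comparisons is TRUE
result <- unname(colSums(per_row) == length(v))
result
```

This avoids calling a function once per row, which may matter for large matrices.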

Search a matrix for rows with given values in any order

I have a matrix and a vector with values:
mat<-matrix(c(1,1,6,
3,5,2,
1,6,5,
2,2,7,
8,6,1),nrow=5,ncol=3,byrow=T)
vec<-c(1,6)
This is a small subset of a N by N matrix and 1 by N vector. Is there a way so that I can subset the rows with values in vec?
The most straight forward way of doing this that I know of would be to use the subset function:
subset(mat, mat[,1] == 1 & mat[,2] == 6) #etc etc
The problem with subset is that you have to specify in advance which columns to look in and which specific combination to look for. My problem is structured so that I want to find all rows containing the numbers in "vec" in any position. So in the above example, I want to get a return matrix of:
1,1,6
1,6,5
8,6,1
Any ideas?
You can do
apply(mat, 1, function(x) all(vec %in% x))
# [1] TRUE FALSE TRUE FALSE TRUE
but this may give you unexpected results if vec contains repeated values:
vec <- c(1, 1)
apply(mat, 1, function(x) all(vec %in% x))
# [1] TRUE FALSE TRUE FALSE TRUE
so you would have to use something more complicated using table to account for repetitions:
vec <- c(1, 1)
is.sub.table <- function(table1, table2) {
all(names(table1) %in% names(table2)) &&
all(table1 <= table2[names(table1)])
}
apply(mat, 1, function(x)is.sub.table(table(vec), table(x)))
# [1] TRUE FALSE FALSE FALSE FALSE
However, if the vector length is equal to the number of columns in your matrix (as your description suggests, although that is not the case in your example), you should just do:
vec <- c(1, 6, 1)
apply(mat, 1, function(x) all(sort(vec) == sort(x)))
# [1] TRUE FALSE FALSE FALSE FALSE
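Any of the logical vectors above can then be used to pull out the matching rows, which is what the question asks for. A small sketch with the original vec; the names keep and matched are illustrative.

```r
mat <- matrix(c(1, 1, 6,
                3, 5, 2,
                1, 6, 5,
                2, 2, 7,
                8, 6, 1), nrow = 5, ncol = 3, byrow = TRUE)
vec <- c(1, 6)

# TRUE for rows that contain every value of vec, in any position
keep <- apply(mat, 1, function(x) all(vec %in% x))

# drop = FALSE keeps a matrix even if only one row matches
matched <- mat[keep, , drop = FALSE]
matched
```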

Insert elements into a vector at given indexes

I have a logical vector, for which I wish to insert new elements at particular indexes. I've come up with a clumsy solution below, but is there a neater way?
probes <- rep(TRUE, 15)
ind <- c(5, 10)
probes.2 <- logical(length(probes)+length(ind))
probes.ind <- ind + 1:length(ind)
probes.original <- (1:length(probes.2))[-probes.ind]
probes.2[probes.ind] <- FALSE
probes.2[probes.original] <- probes
print(probes)
gives
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
and
print(probes.2)
gives
[1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
[13] TRUE TRUE TRUE TRUE TRUE
So it works but is ugly looking - any suggestions?
These are all very creative approaches. I think working with indexes is definitely the way to go (Marek's solution is very nice).
I would just mention that there is a function to do roughly that: append().
probes <- rep(TRUE, 15)
probes <- append(probes, FALSE, after=5)
probes <- append(probes, FALSE, after=11)
Or you could do this in a loop over your indexes (you need to grow the "after" value on each iteration):
probes <- rep(TRUE, 15)
ind <- c(5, 10)
for(i in 0:(length(ind)-1))
probes <- append(probes, FALSE, after=(ind[i+1]+i))
Incidentally, this question was also previously asked on R-Help. As Barry says:
"Actually I'd say there were no ways of doing this, since I dont think you can actually insert into a vector - you have to create a new vector that produces the illusion of insertion!"
You can do some magic with indexes:
First create vector with output values:
probs <- rep(TRUE, 15)
ind <- c(5, 10)
val <- c( probs, rep(FALSE,length(ind)) )
# > val
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
# [13] TRUE TRUE TRUE FALSE FALSE
Now the trick: each old element gets a whole-number rank, and each new element gets a half-rank.
id <- c( seq_along(probs), ind+0.5 )
# > id
# [1] 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0
# [16] 5.5 10.5
Then use order to sort in proper order:
val[order(id)]
# [1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
# [13] TRUE TRUE TRUE TRUE TRUE
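The half-rank idea can be wrapped in a small function. This is a sketch only: insert_after and its arguments are hypothetical names, and it assumes ind gives positions in the original vector after which to insert.

```r
# Hypothetical helper built on the order() trick; names are illustrative.
insert_after <- function(x, ind, values) {
  # Recycle values so there is one new element per insertion point
  values <- rep_len(values, length(ind))
  combined <- c(x, values)
  # Old elements keep ranks 1, 2, 3, ...; new ones sit halfway to the next
  id <- c(seq_along(x), ind + 0.5)
  combined[order(id)]
}

res <- insert_after(rep(TRUE, 15), c(5, 10), FALSE)
which(!res)   # positions of the inserted FALSE values
```

Because the ranks carry the ordering, ind does not need to be sorted beforehand.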
probes <- rep(TRUE, 1000000)
ind <- c(50:100)
val <- rep(FALSE,length(ind))
new.probes <- vector(mode="logical",length(probes)+length(val))
new.probes[-ind] <- probes
new.probes[ind] <- val
Some timings:
My method
user system elapsed
0.03 0.00 0.03
Marek method
user system elapsed
0.18 0.00 0.18
R append with for loop
user system elapsed
1.61 0.48 2.10
How about this:
> probes <- rep(TRUE, 15)
> ind <- c(5, 10)
> probes.ind <- rep(NA, length(probes))
> probes.ind[ind] <- FALSE
> new.probes <- as.vector(rbind(probes, probes.ind))
> new.probes <- new.probes[!is.na(new.probes)]
> new.probes
[1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
[13] TRUE TRUE TRUE TRUE TRUE
That is sorta tricky. Here's one way. It iterates over the list, inserting each time, so it's not too efficient.
probes <- rep(TRUE, 15)
probes.ind <- ind + 0:(length(ind)-1)
for (i in probes.ind) {
probes <- c(probes[1:i], FALSE, probes[(i+1):length(probes)])
}
> probes
[1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
[13] TRUE TRUE TRUE TRUE TRUE
This should even work if ind has repeated elements, although ind does need to be sorted for the probes.ind construction to work.
Or you can do it using the insertRow function from the miscTools package.
library(miscTools)  # provides insertRow()
probes <- rep(TRUE, 15)
ind <- c(5,10)
for (i in ind){
probes <- as.vector(insertRow(as.matrix(probes), i, FALSE))
}
I came up with a good answer that's easy to understand and fairly fast to run, building off Wojciech's answer above. I'll adapt the method for the example here, but it can be easily generalized to pretty much any data type for an arbitrary pattern of missing points (shown below).
probes <- rep(TRUE, 15)
ind <- c(5,10)
probes.final <- rep(FALSE, length(probes)+length(ind))
probes.final[-ind] <- probes
The data I needed this for is sampled at a regular interval, but many samples are thrown out, and the resulting data file only includes the timestamps and measurements for those retained. I needed to produce a vector containing all the timestamps and a data vector with NAs inserted for timestamps that were tossed. I used the "not in" function stolen from here to make it a bit simpler.
`%notin%` <- Negate(`%in%`)
dat <- rnorm(50000) # Data given
times <- seq(from=554.3, by=0.1, length.out=70000) # "Original" time stamps
times <- times[-sample(2:69999, 20000)] # "Given" times with arbitrary points missing from interior
times.final <- seq(from=times[1], to=times[length(times)], by=0.1)
na.ind <- which(times.final %notin% times)
dat.final <- rep(NA, length(times.final))
dat.final[-na.ind] <- dat
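One thing to watch with the negative-index approach: ind is interpreted as positions in the final vector, so the probes example in this answer puts FALSE at positions 5 and 10 rather than after them. If ind instead means "insert after these positions of the original vector", as in the question, a sketch of the adjustment could look like this. The name pos is illustrative, and the sketch assumes no index exceeds length(probes).

```r
probes <- rep(TRUE, 15)
ind <- c(5, 10)

# Each earlier insertion shifts the later ones right by one position
pos <- sort(ind) + seq_along(ind)

probes.final <- rep(FALSE, length(probes) + length(ind))
probes.final[-pos] <- probes
which(!probes.final)   # positions of the inserted FALSE values
```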
Um, hi, I had the same question, but I couldn't understand what people had answered, because I'm still learning the language. So I tried to make my own solution and I think it works! I created a vector and I wanted to insert the value 100 after the 3rd, 5th and 6th indexes. This is what I wrote.
vector <- c(0:9)
indexes <- c(6, 3, 5)
indexes <- indexes[order(indexes)]
i <- 1
j <- 0
while(i <= length(indexes)){
vector <- append(vector, 100, after = indexes[i] + j)
i <-i + 1
j <- j + 1
}
vector
The vector "indexes" must be in ascending order for this to work. This is why I put them in order at the third line.
The variable "j" is necessary because at each iteration, the length of the new vector increases and the original values are moved.
If you wish to insert several new values next to each other, simply repeat the index. For instance, by assigning indexes <- c(3, 5, 5, 5, 6), you should get vector == 0 1 2 100 3 4 100 100 100 5 100 6 7 8 9
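A possible simplification of the same idea, offered as a sketch: if the indexes are processed from largest to smallest, earlier insertions never move the positions still to be handled, so the offset counter is not needed. The function name insert_values is hypothetical.

```r
# Hypothetical helper; inserts value after each given index of vec.
insert_values <- function(vec, indexes, value) {
  # Work from the back so pending positions are not shifted
  for (idx in sort(indexes, decreasing = TRUE)) {
    vec <- append(vec, value, after = idx)
  }
  vec
}

res <- insert_values(0:9, c(6, 3, 5), 100)
res
```

Repeated indexes behave the same way as in the while loop, for example insert_values(0:9, c(3, 5, 5, 5, 6), 100).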
