Get matrix column name by matching vector - r

I started with a matrix
Xray Stay Leave
[1,] "H" "H" "H"
[2,] "A" "L" "O"
And I have the following vector:
[1] "H" "L"
I want to get the output
"Stay".
I tried this:
which(vec %in% matrix )
but that gives me the following output:
[1] 1 2
That only seems to tell me the positions where it finds the H and the L. I need the name of the column that is an exact match.

Another approach:
vec <- c("H", "L")
colnames(mat)[colMeans(mat == vec) == 1]
# [1] "Stay"
where mat is the name of your matrix.
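Why this works: `mat == vec` recycles `vec` down each column (R stores matrices column by column), so the comparison is only meaningful when `length(vec)` equals `nrow(mat)`. A minimal sketch of the intermediate step, using a hypothetical helper `match_col` (the name and the length check are assumptions, not part of the answer above):

```r
mat <- matrix(c("H", "H", "H", "A", "L", "O"), nrow = 2, byrow = TRUE,
              dimnames = list(NULL, c("Xray", "Stay", "Leave")))
vec <- c("H", "L")

# vec is recycled down each column, giving one logical per cell
hits <- mat == vec
hits
#       Xray Stay Leave
# [1,]  TRUE TRUE  TRUE
# [2,] FALSE TRUE FALSE

# match_col is a hypothetical helper name used only for illustration
match_col <- function(mat, vec) {
  stopifnot(length(vec) == nrow(mat))  # recycling is only meaningful here
  colnames(mat)[colSums(mat == vec) == nrow(mat)]
}

match_col(mat, vec)
# [1] "Stay"
```

Using `colSums(...) == nrow(mat)` is equivalent to `colMeans(...) == 1` and avoids comparing a floating-point mean for equality.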

Assuming m is your matrix, you could do
> vec <- c("H", "L")
> colnames(m)[apply(m, 2, identical, vec)]
[1] "Stay"
Note: identical is used here because the original post says "I need the column name of the one that is an exact match".

This should return a logical vector:
logv <- apply(mat, 2, function(x) identical(vec,x))
Then this will select the correct column name:
dimnames(mat)[[2]][logv]
[1] "Stay"
Test case:
mat <- matrix( c( "H","H", "H", "A","L","O") ,2, byrow=TRUE,
dimnames=list(NULL, c('Xray', 'Stay', 'Leave') ) )

If mat and vec are the matrix and vector, you could try:
colnames(mat)[table(mat %in% vec, (seq_along(mat)-1)%/%nrow(mat) +1)[2,] >1]
#[1] "Stay"
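One caveat worth checking: this version counts how many cells of each column appear anywhere in vec, so it ignores order and, with the `> 1` threshold, assumes a two-element vec. A small sketch contrasting it with the `identical` approach on a reversed vector (`rev_vec` and the intermediate names are assumptions added for illustration):

```r
mat <- matrix(c("H", "H", "H", "A", "L", "O"), nrow = 2, byrow = TRUE,
              dimnames = list(NULL, c("Xray", "Stay", "Leave")))
rev_vec <- c("L", "H")  # same values as the question's vector, reversed

# column index of every cell, in column-major order: 1 1 2 2 3 3
col_id <- (seq_along(mat) - 1) %/% nrow(mat) + 1

# membership count per column: the order of rev_vec does not matter
by_membership <- colnames(mat)[table(mat %in% rev_vec, col_id)[2, ] > 1]
by_membership
# [1] "Stay"

# exact, order-sensitive comparison finds no column
by_identity <- colnames(mat)[apply(mat, 2, identical, rev_vec)]
by_identity
# character(0)
```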

Related

Replace a value in a vector if doesn't equal to two conditions (in R)

I have a vector v. I would like to replace everything that doesn't equal S or D. It can be replaced by x.
v <- c("S","D","hxbasxhas","S","cbsdc")
result
r <- c("S","D","x","S","x")
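A stringr approach is possible with a negative lookahead. Putting the $ anchor inside the lookahead makes the pattern reject only strings that are exactly S or D, rather than any string that merely contains one of those letters.
Using str_replace:
library(stringr)
str_replace(v, "^(?!(?:S|D)$).*$", "x")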
Result:
[1] "S" "D" "x" "S" "x"
You can negate %in% :
v <- c("S","D","hxbasxhas","S","cbsdc")
v[!v %in% c('S', 'D')] <- 'x'
v
#[1] "S" "D" "x" "S" "x"
Or use forcats::fct_other, which returns a factor rather than a character vector:
forcats::fct_other(v, c('S', 'D'), other_level = 'x')
I believe which() is what you are looking for:
v[which(v!="S" & v!="D")]<-"x"
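The two base R answers behave differently when the vector contains NA: `%in%` never returns NA, so the negated test replaces missing values too, while `which()` drops the NA comparisons and leaves them alone. A minimal sketch, assuming a vector `v_na` with a missing value (added here purely for illustration):

```r
v_na <- c("S", "D", NA, "hxbasxhas")

# negated %in%: NA is not in the keep set, so it is replaced
a <- v_na
a[!a %in% c("S", "D")] <- "x"
a
# [1] "S" "D" "x" "x"

# which(): NA != "S" is NA, which() skips it, so the NA survives
b <- v_na
b[which(b != "S" & b != "D")] <- "x"
b
# [1] "S" "D" NA  "x"
```

Which behavior is preferable depends on whether a missing value should count as "not S or D".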

name character vectors with same name of list

I have a list that looks like this.
my_list <- list(Y = c("p", "q"), K = c("s", "t", "u"))
I want to name each list element (the character vectors) with the name of the list element it is in. All elements of the same vector must have the same name.
I was able to write this function that works on a single list element
name_vector <- function(x){
names(x[[1]]) <- rep(names(x[1]), length(x[[1]]))
return(x)
}
> name_vector(my_list[1])
$Y
Y Y
"p" "q"
But I can't find a way to vectorize it. If I run it with an apply function, it just returns the list unchanged:
> lapply(my_list, name_vector)
$Y
[1] "p" "q"
$K
[1] "s" "t" "u"
My desired output for my_list is a named vector
Y Y K K K
"p" "q" "s" "t" "u"
We unlist the list while setting the names by replicating
setNames(unlist(my_list), rep(names(my_list), lengths(my_list)))
Or stack into a two column data.frame, extract the 'values' column and name it with 'ind'
with(stack(my_list), setNames(values, ind))
If your list names don't end with digits:
vec <- unlist(my_list)
names(vec) <- sub("\\d+$","",names(vec))
vec
# Y Y K K K
# "p" "q" "s" "t" "u"
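The `sub()` variant relies on `unlist()` appending a running number to each name, so it would also strip digits that belong to a real name. A short sketch of the `setNames()` approach on a list whose names do end in digits (the list `my_list2` is an assumption for illustration), where the names come through intact:

```r
my_list2 <- list(Y1 = c("p", "q"), K2 = c("s", "t", "u"))

# repeat each list name once per element of its vector
vec <- setNames(unlist(my_list2, use.names = FALSE),
                rep(names(my_list2), lengths(my_list2)))
vec
#  Y1  Y1  K2  K2  K2
# "p" "q" "s" "t" "u"

# stripping trailing digits from unlist()'s names loses the real suffix
stripped <- sub("\\d+$", "", names(unlist(my_list2)))
stripped
# [1] "Y" "Y" "K" "K" "K"
```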

Replace one element in vector with multiple elements

I have a vector in which I want to replace one element with multiple elements. I am able to replace it with one element but not with multiple. Can anyone help?
For example I have
data <- c('a', 'x', 'd')
> data
[1] "a" "x" "d"
I want to replace "x" with "b", "c" to get
[1] "a" "b" "c" "d"
However
gsub('x', c('b', 'c'), data)
gives me
[1] "a" "b" "d"
Warning message:
In gsub("x", c("b", "c"), data) :
argument 'replacement' has length > 1 and only the first element will
be used
Here's how I would tackle it:
data <- c('a', 'x', 'd')
lst <- as.list(data)
unlist(lapply(lst, function(x) if(x == "x") c("b", "c") else x))
# [1] "a" "b" "c" "d"
We're making use of the fact that list-structures are more flexible than atomic vectors. We can replace a length-1 list-element with a length>1 element and then unlist the result to go back to an atomic vector.
Since you want to replace exact matches of "x" I prefer not to use sub/gsub in this case.
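You may try this, although I believe the accepted answer is great:
unlist(strsplit(gsub("x", "b c", data), split = " "))
Logic: replace "x" with "b c" (separated by a space), split each string on that space with strsplit, and then flatten the result back into a vector with unlist. Note that gsub also rewrites any "x" inside a longer string, and elements that already contain spaces get split as well.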
This is a bit tricky of a problem because in your replacement you also want to grow your vector. That being said, I believe this should work:
replacement <- c("b","c")
new_data <- rep(data, times = ifelse(data=="x", length(replacement), 1))
new_data[new_data=="x"] <- replacement
new_data
#[1] "a" "b" "c" "d"
This will also work if you have multiple "x"s in your vector like:
data <- c("a","x","d","x")
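A quick sketch of that multiple-match case, run end to end. It relies on the two-element replacement being recycled across the four placeholder slots that `rep()` creates:

```r
data <- c("a", "x", "d", "x")
replacement <- c("b", "c")

# repeat each "x" once per replacement element, leave the others as-is
new_data <- rep(data, times = ifelse(data == "x", length(replacement), 1))
# new_data is now "a" "x" "x" "d" "x" "x"

# the replacement is recycled to fill every group of placeholders
new_data[new_data == "x"] <- replacement
new_data
# [1] "a" "b" "c" "d" "b" "c"
```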
Another approach:
data <- c('a', 'x', 'd')
pattern <- "x"
replacement <- c("b", "c")
sub_one_for_many <- function(pattern, replacement, x) {
  # position of the first exact match (NA if pattern does not occur)
  removal_index <- which(x == pattern)[1]
  if (is.na(removal_index)) return(x)
  # drop the matched element and splice the replacement in at its position;
  # append() also handles a match at the first or last position
  append(x[-removal_index], replacement, after = removal_index - 1)
}
answer <- sub_one_for_many(pattern, replacement, data)
Output:
> answer
[1] "a" "b" "c" "d"

Extract distinct characters that differ between two strings

I have two strings, a <- "AERRRTX"; b <- "TRRA".
I want to extract the characters in a that are not used in b, i.e. "ERX".
I tried the answer in "Extract characters that differ between two strings", which uses setdiff. It returns "EX", because b does have "R" and setdiff eliminates all three "R"s in a. My aim is to treat each character as distinct, so only two of the three R's in a should be eliminated.
Any suggestions on what I can use instead of setdiff, or some other approach to achieve my output?
A different approach using pmatch,
a1 <- unlist(strsplit(a, ""))
b1 <- unlist(strsplit(b, ""))
a1[!1:length(a1) %in% pmatch(b1, a1)]
#[1] "E" "R" "X"
Another example,
a <- "Ronak";b<-"Shah"
a1 <- unlist(strsplit(a, ""))
b1 <- unlist(strsplit(b, ""))
a1[!1:length(a1) %in% pmatch(b1, a1)]
# [1] "R" "o" "n" "k"
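To see why this works: `pmatch()`, with its default `duplicates.ok = FALSE`, uses each element of the table at most once, so repeated characters in b consume distinct positions in a. A minimal sketch showing the intermediate indices, wrapped in a hypothetical helper `char_diff` (the name is an assumption, not from the answer):

```r
# char_diff is a hypothetical helper name used only for illustration
char_diff <- function(a, b) {
  a1 <- strsplit(a, "")[[1]]
  b1 <- strsplit(b, "")[[1]]
  used <- pmatch(b1, a1)          # position in a1 consumed by each char of b1
  a1[!seq_along(a1) %in% used]    # keep positions that were never consumed
}

pmatch(c("T", "R", "R", "A"), c("A", "E", "R", "R", "R", "T", "X"))
# [1] 6 3 4 1

paste(char_diff("AERRRTX", "TRRA"), collapse = "")
# [1] "ERX"
```

Characters of b that do not occur in a yield NA from `pmatch()`, which `%in%` simply never matches, so they need no special handling.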
You can use the function vsetdiff from vecsets package
install.packages("vecsets")
library(vecsets)
a <- "AERRRTX"
b <- "TRRA"
Reduce(vsetdiff, strsplit(c(a, b), split = ""))
## [1] "E" "R" "X"
We can use Reduce() to successively eliminate from a each character found in b:
a <- 'AERRRTX'; b <- 'TRRA';
paste(collapse='',Reduce(function(as,bc) as[-match(bc,as,nomatch=length(as)+1L)],strsplit(b,'')[[1L]],strsplit(a,'')[[1L]]));
## [1] "ERX"
This will preserve the order of the surviving characters in a.
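The one-liner is dense, so here is a sketch of the same idea unrolled into an explicit loop (the function name `remove_once` is an assumption for illustration): for each character of b, drop its first remaining occurrence in a, if there is one.

```r
# remove_once is a hypothetical helper name used only for illustration
remove_once <- function(a, b) {
  remaining <- strsplit(a, "")[[1]]
  for (ch in strsplit(b, "")[[1]]) {
    i <- match(ch, remaining)                  # first remaining occurrence, or NA
    if (!is.na(i)) remaining <- remaining[-i]  # remove exactly one copy
  }
  paste(remaining, collapse = "")
}

remove_once("AERRRTX", "TRRA")
# [1] "ERX"
```

The `Reduce()` version avoids the explicit `NA` check by using `nomatch = length(as) + 1L`, so a missing character produces an out-of-range negative index that removes nothing.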
Another approach is to mark each character with its occurrence index in a, do the same for b, and then we can use setdiff():
a <- 'AERRRTX'; b <- 'TRRA';
pasteOccurrence <- function(x) ave(x,x,FUN=function(x) paste0(x,seq_along(x)));
paste(collapse='',substr(setdiff(pasteOccurrence(strsplit(a,'')[[1L]]),pasteOccurrence(strsplit(b,'')[[1L]])),1L,1L));
## [1] "ERX"
An alternative using the data.table package:
library(data.table)
x = data.table(table(strsplit(a, '')[[1]]))
y = data.table(table(strsplit(b, '')[[1]]))
dt = y[x, on='V1'][,N:=ifelse(is.na(N),0,N)][N!=i.N,res:=i.N-N][res>0]
rep(dt$V1, dt$res)
#[1] "E" "R" "X"

Delete elements of list appearing before one element and itself with R

I have a list of elements (letters in this example):
(l <- list(letters[1:2], letters[2:3]))
# [[1]]
# [1] "a" "b"
# [[2]]
# [1] "b" "c"
And another element:
(r <- letters[2])
# [1] "b"
I wrote a function that deletes the elements appearing before a given element, along with that element itself (here "b").
out = lapply(l, function(x) x[-c(1,which(x == "b"))])
Filter(length, out)
#[[1]]
#[1] "c"
Now my question: if r contains several elements rather than just "b", how can I apply this across the whole list?
For example:
r
[1] a
[2] b
I would like to have a result like this
[1] c
Thank you
Cheers
We can use %in%
Filter(length, lapply(l, function(x)
x[-seq(tail(which(x %in% r),1))]))
#[[1]]
#[1] "c"
data
r <- c('a', 'b')
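One edge case to be aware of: if a list element contains none of the values in r, `which()` returns an empty vector, the negative index becomes empty, and the subset returns nothing, so that element is silently filtered out. Whether that is wanted depends on the use case. A sketch that keeps such elements intact instead (the helper name `drop_through_last` and the extra list element are assumptions for illustration):

```r
l <- list(letters[1:2], letters[2:3], c("y", "z"))
r <- c("a", "b")

# drop_through_last is a hypothetical helper name used only for illustration
drop_through_last <- function(x, r) {
  hits <- which(x %in% r)
  if (length(hits) == 0) return(x)   # nothing to cut: keep x unchanged
  x[-seq_len(max(hits))]             # drop everything up to the last hit
}

res <- Filter(length, lapply(l, drop_through_last, r = r))
res
# [[1]]
# [1] "c"
#
# [[2]]
# [1] "y" "z"
```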
