Implementing operator - r

I need to implement an operator %in% that returns TRUE if the item on the left is an element of the list on the right, and FALSE otherwise. The problem is that I cannot use a loop or any extra packages.
Creating a list
ell <- list( 2, c( 2, 5), list( c( 2, 8)), "test")
Elements in list
2
c(2, 5)
list(c(2, 8))
"test"
Testing elements
2 %in% ell
# TRUE
5 %in% ell
# FALSE
list(c(2, 8)) %in% ell
# TRUE
list(list(2, 8)) %in% ell
# FALSE
"test" %in% ell
# TRUE

%in% tests each element of its first argument for a match in the second one, so it cannot treat a whole vector or list as a single item, and the behaviour you expect cannot be achieved with it. Here is another solution that gets what you want with sapply() and identical(). For example:
any(sapply(ell, identical, list(c( 2, 8))))
# [1] TRUE
any(sapply(ell, identical, list(list( 2, 8))))
# [1] FALSE
You can wrap the method above in a custom operator of your own. I named it %IN%:
`%IN%` <- function(a, b){
  any(sapply(b, identical, a))
}
2 %IN% ell
# [1] TRUE
5 %IN% ell
# [1] FALSE
list(c(2, 8)) %IN% ell
# [1] TRUE
list(list(2, 8)) %IN% ell
# [1] FALSE
"test" %IN% ell
# [1] TRUE
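A minimal sketch of the same idea built on vapply() instead of sapply(). The operator name %IN2% is hypothetical and only used for illustration. vapply() always returns a logical vector here, so the result stays well-defined even when the list on the right is empty.

```r
# Hypothetical variant of the operator above, built on vapply()
`%IN2%` <- function(a, b) {
  # identical(el, a) is evaluated for every element el of b
  any(vapply(b, identical, logical(1), a))
}

ell <- list(2, c(2, 5), list(c(2, 8)), "test")
2 %IN2% ell               # TRUE
5 %IN2% ell               # FALSE
list(c(2, 8)) %IN2% ell   # TRUE
2 %IN2% list()            # FALSE, no elements to compare against
```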

Related

How to check if an R object has a certain attribute?

How can I check if an R object has a certain attribute?
For example, I would like to check if a vector has a "labels" attribute. How can I do this? Is there already a function that does this?
my_vector <- c(1, 2, 3)
my_vector_labelled <- `attr<-`(my_vector, "labels", c(a = 1, b = 2, c = 3))
Let's assume there is a function named has_attribute(x, attr). Then the expected result would be:
> has_attribute(my_vector, "labels")
FALSE
> has_attribute(my_vector_labelled, "labels")
TRUE
Two ways:
%in% names(attributes(..)):
"labels" %in% names(attributes(my_vector))
# [1] FALSE
"labels" %in% names(attributes(my_vector_labelled))
# [1] TRUE
is.null(attr(..,"")):
is.null(attr(my_vector, "labels"))
# [1] TRUE # NOT present
is.null(attr(my_vector_labelled, "labels"))
# [1] FALSE # present
(Perhaps !is.null(attr(..)) is preferred?)
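One detail worth knowing about the attr() route: attr() matches attribute names partially by default, so a shortened name can appear to be present. The sketch below assumes a helper named has_attribute (the hypothetical name from the question) and passes exact = TRUE to switch partial matching off.

```r
my_vector_labelled <- structure(c(1, 2, 3), labels = c(a = 1, b = 2, c = 3))

# Hypothetical helper; exact = TRUE disables partial matching of the name
has_attribute <- function(x, which) {
  !is.null(attr(x, which, exact = TRUE))
}

has_attribute(my_vector_labelled, "labels")   # TRUE
has_attribute(my_vector_labelled, "lab")      # FALSE
!is.null(attr(my_vector_labelled, "lab"))     # TRUE: "lab" partially matches "labels"
```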
There is a function available in the BBmisc package:
> library(BBmisc)
> hasAttributes(my_vector_labelled, "labels")
[1] TRUE
> hasAttributes(my_vector, "labels")
[1] FALSE
Use attributes.
my_vector <- c(1, 2, 3)
my_vector_labelled <- `attr<-`(my_vector, "labels", c(a = 1, b = 2, c = 3))
attributes(my_vector)
#> NULL
names(attributes(my_vector_labelled))
#> [1] "labels"
has_attribute <- function(x, which){
  which %in% names(attributes(x))
}
has_attribute(my_vector_labelled, "labels")
#> [1] TRUE
Created on 2022-03-31 by the reprex package (v2.0.1)

Is the argument included in the list?

I want to define the operator %in%, whose operation is to return TRUE if the argument on the left is in the list on the right and FALSE otherwise. The task should be implemented without using a loop.
### Creating a simple list
ell <- list( 2, c( 2, 5), list( c( 2, 8)), "xyz")
### Testing of selected elements
2 %in% ell
5 %in% ell
list( c( 2, 8)) %in% ell
list( list( 2, 8)) %in% ell
"xyz" %in% ell
[1] TRUE
[1] FALSE
[1] TRUE
[1] FALSE
[1] TRUE
As MrFlick said in a comment, do not override built-in operators; it will definitely break something.
Try this one and see if it does what you want. I have named the new operator %IN%, since R is case sensitive.
`%IN%` <- function(x, y){
  x %in% unlist(y, recursive = FALSE)
}
2 %IN% ell
#[1] TRUE
5 %IN% ell
#[1] TRUE
list( c( 2, 8)) %IN% ell
#[1] TRUE
list( list( 2, 8)) %IN% ell
#[1] FALSE
"xyz" %IN% ell
#[1] TRUE
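Note that 5 comes back as TRUE above, while the question expects FALSE. A short sketch of why, and of one possible alternative: unlist(..., recursive = FALSE) splits c(2, 5) into separate elements, so 5 becomes an element of the flattened list. Comparing whole elements with identical() on the original list avoids the flattening. The variable names flat and five_in_ell are assumptions for illustration.

```r
ell <- list(2, c(2, 5), list(c(2, 8)), "xyz")

# One level of flattening: c(2, 5) is split into the separate elements 2 and 5
flat <- unlist(ell, recursive = FALSE)
str(flat)

# Comparing against the original elements keeps c(2, 5) as a single item
five_in_ell <- any(vapply(ell, identical, logical(1), 5))
five_in_ell   # FALSE
```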

R: find vector in list of vectors

I'm working with R, and my goal is to check whether a given vector is in a list of unique vectors.
The list looks like
final_states <- list(c("x" = 5, "y" = 1),
                     c("x" = 5, "y" = 2),
                     c("x" = 5, "y" = 3),
                     c("x" = 5, "y" = 4),
                     c("x" = 5, "y" = 5),
                     c("x" = 3, "y" = 5))
Now I want to check whether a given state is in the list. For example:
state <- c("x" = 5, "y" = 3)
As you can see, the vector state is an element of the list final_states. My idea was to check it with the %in% operator:
state %in% final_states
But I get this result:
[1] FALSE FALSE
Can anyone tell me what is wrong?
Greets,
lupi
If you just want to determine if the vector is in the list, try
Position(function(x) identical(x, state), final_states, nomatch = 0) > 0
# [1] TRUE
Position() basically works like match(), but on a list. If you set nomatch = 0 and check for Position > 0, you'll get a logical result telling you whether state is in final_states.
"final_states" is a list, so you could wrap "state" in a list as well and then do
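The Position() check can be wrapped in a small helper, sketched here under the assumption of a hypothetical function name in_list. Because the comparison uses identical(), names and types must match as well as values.

```r
# Hypothetical helper around the Position() idiom
in_list <- function(x, lst) {
  # Position() returns the index of the first element satisfying the predicate,
  # or nomatch (0 here) if there is none
  Position(function(el) identical(el, x), lst, nomatch = 0) > 0
}

final_states <- list(c(x = 5, y = 1), c(x = 5, y = 3))
in_list(c(x = 5, y = 3), final_states)   # TRUE
in_list(c(5, 3), final_states)           # FALSE: identical() also compares names
```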
final_states %in% list(state)
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
or use mapply to check whether all the elements of "state" equal the corresponding elements of each list element of "final_states" (assuming that the vector and the list elements have the same length)
f1 <- function(x,y) all(x==y)
mapply(f1, final_states, list(state))
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
Or rbind the list elements to a matrix and then check whether "state" and the "rows" of "m1" are the same.
m1 <- do.call(rbind, final_states)
!rowSums(m1!=state[col(m1)])
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
Or
m1[,1]==state[1] & m1[,2]==state[2]
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
Update
If you need to get a single TRUE/FALSE:
any(mapply(f1, final_states, list(state)))
#[1] TRUE
Or
any(final_states %in% list(state))
#[1] TRUE
Or
list(state) %in% final_states
#[1] TRUE
Or use the "faster" fmatch from fastmatch
library(fastmatch)
fmatch(list(state), final_states) >0
#[1] TRUE
Benchmarks
#Richard Sciven's base R function is very fast compared to other solutions except the one with fmatch
set.seed(295)
final_states <- replicate(1e6, sample(1:20, 20, replace=TRUE),
                          simplify=FALSE)
state <- final_states[[151]]
richard <- function() {Position(function(x) identical(x, state),
                                final_states, nomatch = 0) > 0}
Bonded <- function(){any( sapply(final_states, identical, state) )}
akrun2 <- function() {fmatch(list(state), final_states) >0}
akrun1 <- function() {f1 <- function(x,y) all(x==y)
  any(mapply(f1, final_states, list(state)))}
library(microbenchmark)
microbenchmark(richard(), Bonded(), akrun1(), akrun2(),
               unit='relative', times=20L)
#Unit: relative
#      expr          min           lq        mean      median          uq         max neval cld
# richard()     35.22635     29.47587    17.49164    15.66833    14.58235    14.62328    20 a
#  Bonded() 109440.56885 101382.92450 55252.86141 47734.96467 44289.80309 46299.43325    20  b
#  akrun1() 167001.23864 138812.85016 75664.91378 61417.59871 62667.94867 63890.68133    20   c
#  akrun2()      1.00000      1.00000     1.00000     1.00000     1.00000     1.00000    20 a
Whenever I see a list object, I first think of lapply. It seems to deliver the expected result with identical as the test and 'state' as the second argument:
> lapply(final_states, identical, state)
[[1]]
[1] FALSE
[[2]]
[1] FALSE
[[3]]
[1] TRUE
[[4]]
[1] FALSE
[[5]]
[1] FALSE
[[6]]
[1] FALSE
You get a possibly useful intermediate result with:
lapply(final_states, match, state)
... but it comes back as a series of position vectors where c(1,2) is the correct result.
If you want the result to come back as a vector, for instance because you want to use any, then use sapply instead of lapply.
> any( sapply(final_states[-3], identical, state) )
[1] FALSE
> any( sapply(final_states, identical, state) )
[1] TRUE

Search a matrix for rows with given values in any order

I have a matrix and a vector with values:
mat<-matrix(c(1,1,6,
              3,5,2,
              1,6,5,
              2,2,7,
              8,6,1),nrow=5,ncol=3,byrow=TRUE)
vec<-c(1,6)
This is a small subset of an N by N matrix and a 1 by N vector. Is there a way to subset the rows that contain the values in vec?
The most straightforward way of doing this that I know of would be to use the subset function:
subset(mat, mat[,1] == 1 & mat[,2] == 6) #etc etc
The problem with subset is that you have to specify in advance which columns to look in and which specific combination to look for. My problem is structured such that I want to find all rows containing the numbers in "vec" in any possible order. So in the above example, I want to get a return matrix of:
1,1,6
1,6,5
8,6,1
Any ideas?
You can do
apply(mat, 1, function(x) all(vec %in% x))
# [1] TRUE FALSE TRUE FALSE TRUE
but this may give you unexpected results if vec contains repeated values:
vec <- c(1, 1)
apply(mat, 1, function(x) all(vec %in% x))
# [1] TRUE FALSE TRUE FALSE TRUE
so you would have to use something more complicated using table to account for repetitions:
vec <- c(1, 1)
is.sub.table <- function(table1, table2) {
  all(names(table1) %in% names(table2)) &&
    all(table1 <= table2[names(table1)])
}
apply(mat, 1, function(x) is.sub.table(table(vec), table(x)))
# [1] TRUE FALSE FALSE FALSE FALSE
However, if the vector length is equal to the number of columns in your matrix (as you seem to indicate, although it is not the case in your example), you should just do:
vec <- c(1, 6, 1)
apply(mat, 1, function(x) all(sort(vec) == sort(x)))
# [1] TRUE FALSE FALSE FALSE FALSE
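The calls above return a logical vector with one entry per row. To obtain the matrix of matching rows that the question asks for, the logical vector can be used as a row index. A minimal sketch for the vec <- c(1, 6) case; the names keep and result are assumptions, and drop = FALSE keeps the result a matrix even when only one row matches.

```r
mat <- matrix(c(1, 1, 6,
                3, 5, 2,
                1, 6, 5,
                2, 2, 7,
                8, 6, 1), nrow = 5, ncol = 3, byrow = TRUE)
vec <- c(1, 6)

# TRUE for every row that contains all values of vec, in any order
keep <- apply(mat, 1, function(x) all(vec %in% x))

# Index the rows with the logical vector
result <- mat[keep, , drop = FALSE]
result
```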

Test if a vector contains a given element

How to check if a vector contains a given value?
Both match() (returns the position of the first appearance) and %in% (returns a Boolean) are designed for this.
v <- c('a','b','c','e')
'b' %in% v
## returns TRUE
match('b',v)
## returns the first location of 'b', in this case: 2
is.element() makes for more readable code, and is identical to %in%.
v <- c('a','b','c','e')
is.element('b', v)
'b' %in% v
## both return TRUE
is.element('f', v)
'f' %in% v
## both return FALSE
subv <- c('a', 'f')
subv %in% v
## returns a vector TRUE FALSE
is.element(subv, v)
## returns a vector TRUE FALSE
I will group the options based on output. Assume the following vector for all the examples.
v <- c('z', 'a','b','a','e')
For checking presence:
%in%
> 'a' %in% v
[1] TRUE
any()
> any('a'==v)
[1] TRUE
is.element()
> is.element('a', v)
[1] TRUE
For finding the first occurrence:
match()
> match('a', v)
[1] 2
For finding all occurrences as a vector of indices:
which()
> which('a' == v)
[1] 2 4
For finding all occurrences as a logical vector:
==
> 'a' == v
[1] FALSE TRUE FALSE TRUE FALSE
Edit:
Removed grep() and grepl() from the list for the reason mentioned in the comments.
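The presence checks listed above differ when the vector contains missing values. A short sketch with an assumed example vector: %in% always returns TRUE or FALSE, while any(x == v) returns NA when there is no match and an NA is present.

```r
v <- c("z", "a", NA)

"b" %in% v                    # FALSE
any("b" == v)                 # NA, because "b" == NA is NA
any("b" == v, na.rm = TRUE)   # FALSE

"a" %in% v                    # TRUE
any("a" == v)                 # TRUE, one TRUE is enough
```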
The any() function makes for readable code
> w <- c(1,2,3)
> any(w==1)
[1] TRUE
> v <- c('a','b','c')
> any(v=='b')
[1] TRUE
> any(v=='f')
[1] FALSE
You can use the %in% operator:
vec <- c(1, 2, 3, 4, 5)
1 %in% vec # true
10 %in% vec # false
Also, to find the position of an element, which() can be used:
pop <- c(3, 4, 5, 7, 13)
which(pop==13)
and to find the elements which are not contained in the target vector, one may do this:
pop <- c(1, 2, 4, 6, 10)
Tset <- c(2, 10, 7) # Target set
pop[which(!(pop%in%Tset))]
I really like grep() and grepl() for this purpose.
grep() returns a vector of integers, which indicate where matches are.
yo <- c("a", "a", "b", "b", "c", "c")
grep("b", yo)
[1] 3 4
grepl() returns a logical vector, with "TRUE" at the location of matches.
yo <- c("a", "a", "b", "b", "c", "c")
grepl("b", yo)
[1] FALSE FALSE TRUE TRUE FALSE FALSE
These functions are case-sensitive.
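Keep in mind that grep() and grepl() match a regular-expression pattern anywhere in each string, so they test for substrings rather than for exact values. A small sketch with an assumed example vector shows the difference from == and how anchoring the pattern restores an exact test.

```r
yo <- c("a", "ab", "b")

grepl("b", yo)      # FALSE  TRUE  TRUE, "ab" matches because it contains "b"
yo == "b"           # FALSE FALSE  TRUE, exact comparison
grepl("^b$", yo)    # FALSE FALSE  TRUE, anchored pattern matches only "b"
```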
Another option to check whether an element exists in a vector is the %in{}% operator from the inops package:
library(inops)
#>
#> Attaching package: 'inops'
#> The following object is masked from 'package:base':
#>
#> <<-
v <- c('a','b','c','e')
v %in{}% c("b")
#> [1] FALSE TRUE FALSE FALSE
Created on 2022-07-16 by the reprex package (v2.0.1)
