Extract operator `$( )` and non-syntactic names - r

Say I have the following list (note the usage of non-syntactic names)
list <- list(A = c(1,2,3),
`2` = c(7,8,9))
So the following two ways of extracting from the list work:
`$`(list,A)
## [1] 1 2 3
`$`(list,`2`)
## [1] 7 8 9
However, this way to proceed fails.
id <- 2
`$`(list,id)
## NULL
Could someone explain why the last way does not work and how I could fix it? Thank you.

Your id is a "computed index", which is not supported by the $ operator. From ?Extract:
Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]].
If you have a computed index, then use [[ to extract.
l <- list(a = 1:3)
id <- "a"
l[[id]]
## [1] 1 2 3
`[[`(l, id) # the same
## [1] 1 2 3
If you insist on using the $ operator, then you need to substitute the value of id in the $ call, like so:
eval(bquote(`$`(l, .(id))))
## [1] 1 2 3
It doesn't really matter whether id is non-syntactic:
l <- list(`!##$%^` = 1:3)
id <- "!##$%^"
l[[id]]
## [1] 1 2 3
`[[`(l, id)
## [1] 1 2 3
eval(bquote(`$`(l, .(id))))
## [1] 1 2 3
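To tie this back to the question, where id is a number and the element name is a non-syntactic name such as `2`: a minimal sketch, assuming the goal is lookup by name rather than by position. The names lst, by_dollar and by_name, and the extra element `10`, are made up for illustration.

```r
# Hypothetical list; the element `10` is added for illustration only
lst <- list(A = c(1, 2, 3), `2` = c(7, 8, 9), `10` = c(4, 5, 6))
id <- 10

# `$` does not evaluate its second argument: it looks for an element named "id"
by_dollar <- `$`(lst, id)

# `[[` evaluates id; as.character() forces lookup by name instead of position
by_name <- lst[[as.character(id)]]

print(by_dollar)  # NULL
print(by_name)    # 4 5 6
```

Without as.character(), a numeric id is treated as a position, so lst[[10]] would fail here because the list has only three elements.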

I am also trying to get a better grasp of non-syntactic names. Unfortunately, more complex patterns of their use are hard to find. First, read ?Quotes and what backticks do.
For the purpose of learning here is some code:
list <- list(A = c(1,2,3),
`2` = c(7,8,9))
id <- 2
id_backtics <- paste0("`", id,"`")
text <- paste0("`$`(list, ", id_backtics, ")")
text
#> [1] "`$`(list, `2`)"
eval(parse(text = text))
#> [1] 7 8 9
Created on 2022-01-24 by the reprex package (v2.0.1)


Using a list of characters in a function? [duplicate]

This question already has answers here:
Convert comma separated string to integer in R
I am using a function where Timepoints need to be defined as
Timepoints = c(x,y,z)
Now I have a chr list
List
$ chr: "1,2,3,4,5,6,7"
with the timepoints I need to use, already separated by commas.
I want to use this list in the function and lose the quotation marks, so the function can read my timepoints as
Timepoints = c(1,2,3,4,5,6,7)
I tried using noquote(List), but this is not accepted.
Am I missing something? Printing the list with noquote() results in the desired line of characters 1,2,3,4,5,6,7
1) Base R - scan Assuming that you have a list containing a single character string as shown in L below use scan as shown.
L <- list("1,2,3,4,5,6")
scan(text = L[[1]], sep = ",", quiet = TRUE)
## [1] 1 2 3 4 5 6
2) gsubfn::strapply Another possibility is to use strapply to match each string of digits, convert them to numeric and return it as a vector. (We assume that the numbers have no signs or decimal points but that could readily be added if needed.)
library(gsubfn)
strapply(L[[1]], "\\d+", as.numeric, simplify = unlist)
## [1] 1 2 3 4 5 6
Added
In a comment the poster indicated an interest in having a list of character strings as input. The output was not specified but if we assume we want a list of numeric vectors then
L2 <- list(A = "1,2,3,4,5,6", B = "1,2")
Scan <- function(x) scan(text = x, sep = ",", quiet = TRUE)
lapply(L2, Scan)
## $A
## [1] 1 2 3 4 5 6
##
## $B
## [1] 1 2
library(gsubfn)
strapply(L2, "\\d+", as.numeric)
## $A
## [1] 1 2 3 4 5 6
##
## $B
## [1] 1 2
Here is an option with strsplit.
as.integer(unlist(strsplit(L[[1]], ",")))
#[1] 1 2 3 4 5 6
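If the string may contain spaces or decimals, the strsplit idea can be wrapped in a small helper. This is only a sketch; parse_timepoints is a hypothetical name, not something from the question or from any package.

```r
# Hypothetical helper: turn one comma-separated string into a numeric vector
parse_timepoints <- function(x) {
  # split on literal commas; strsplit returns a list, so take its first element
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  # strip surrounding whitespace, then convert to numeric
  as.numeric(trimws(parts))
}

tp <- parse_timepoints("1, 2,3.5")
print(tp)  # 1.0 2.0 3.5
```

The result can then be passed directly, e.g. as Timepoints = tp.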

Extract the factor's values positions in level

I'm returning to R after some time, and the following has me stumped:
I'd like to build a list of the positions that the factor's values have in its list of levels.
Example:
> data = c("a", "b", "a","a","c")
> fdata = factor(data)
> fdata
[1] a b a a c
Levels: a b c
> fdata$lvl_idx <- ????
Such that:
> fdata$lvl_idx
[1] 1 2 1 1 3
Appreciate any hints or tips.
If you convert a factor to integer, you get the position in the levels:
as.integer(fdata)
## [1] 1 2 1 1 3
In certain situations, this is counter-intuitive:
f <- factor(2:4)
f
## [1] 2 3 4
## Levels: 2 3 4
as.integer(f)
## [1] 1 2 3
Also if you silently coerce to integer, for example by using a factor as a vector index:
LETTERS[2:4]
## [1] "B" "C" "D"
LETTERS[f]
## [1] "A" "B" "C"
Converting to character before converting to integer gives the expected values. See ?factor for details.
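A short sketch of the difference between the level positions and the original values, using the same factor as above. The variable names codes, values and via_levels are only for illustration.

```r
f <- factor(2:4)

# positions in levels(f)
codes <- as.integer(f)

# original values: go through character first
values <- as.integer(as.character(f))

# equivalent: convert the levels once, then index by the factor
via_levels <- as.integer(levels(f))[f]

print(codes)       # 1 2 3
print(values)      # 2 3 4
print(via_levels)  # 2 3 4
```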
The solution provided years ago by Matthew Lundberg is not robust. It could be that the as.integer() function was defined for a specific S3 type of factors. Imagine someone would create a new factor class to keep operators like >=.
as.myfactor <- function(x, ...) {
structure(as.factor(x), class = c("myfactor", "factor"))
}
# and that someone would create an S3 method for integers - it should
# only remove the operators, which makes sense...
as.integer.myfactor <- function(x, ...) {
as.integer(gsub("(<|=|>)+", "", as.character(x)))
}
Now this no longer works; it just removes the operators:
f <- as.myfactor(">=2")
as.integer(f)
#> [1] 2
But this is robust with any factor you want to know the index of the level of, using which():
f <- factor(2:4)
which(levels(f) == 2)
#> [1] 1
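The which() call above locates a single level. To get the level position of every value in the factor without relying on as.integer(), one option is match() against levels(). This is a sketch using the data from the question; lvl_idx is just an illustrative variable name.

```r
fdata <- factor(c("a", "b", "a", "a", "c"))

# look up each value (as a string) in the vector of levels
lvl_idx <- match(as.character(fdata), levels(fdata))

print(lvl_idx)  # 1 2 1 1 3
```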

Extracting unique numbers from string in R

I have a list of strings which contain random characters such as:
list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"
I'd like to know which numbers are present at least once (unique()) in this list. The solution of my example is:
solution: c(7,667,11,5,2)
If someone has a method that does not consider 11 as "eleven" but as "one and one", it would also be useful. The solution in this condition would be:
solution: c(7,6,1,5,2)
(I found this post on a related subject: Extracting numbers from vectors of strings)
For the second answer, you can use gsub to remove everything from the string that's not a number, then split the string as follows:
unique(as.numeric(unlist(strsplit(gsub("[^0-9]", "", unlist(ll)), ""))))
# [1] 7 6 1 5 2
For the first answer, similarly using strsplit,
unique(na.omit(as.numeric(unlist(strsplit(unlist(ll), "[^0-9]+")))))
# [1] 7 667 11 5 2
PS: don't name your variable list (as there's an inbuilt function list). I've named your data as ll.
Here is yet another answer, this one using gregexpr to find the numbers, and regmatches to extract them:
l <- c("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
temp1 <- gregexpr("[0-9]", l) # Individual digits
temp2 <- gregexpr("[0-9]+", l) # Numbers with any number of digits
as.numeric(unique(unlist(regmatches(l, temp1))))
# [1] 7 6 1 5 2
as.numeric(unique(unlist(regmatches(l, temp2))))
# [1] 7 667 11 5 2
A solution using stringi
library(stringi)
# extract the numbers:
nums <- stri_extract_all_regex(list, "[0-9]+")
# Make vector and get unique numbers:
nums <- unlist(nums)
nums <- unique(nums)
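And that's your first solution.
For the second solution I would use substr (note that this keeps only the first digit of each number, which happens to be enough for this example):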
nums_first <- sapply(nums, function(x) unique(substr(x,1,1)))
You could use ?strsplit (as suggested in @Arun's answer in Extracting numbers from vectors (of strings)):
l <- c("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
## split string at non-digits
s <- strsplit(l, "[^[:digit:]]")
## convert strings to numeric ("" become NA)
solution <- as.numeric(unlist(s))
## remove NA and duplicates
solution <- unique(solution[!is.na(solution)])
# [1] 7 667 11 5 2
A stringr solution with str_match_all and piped operators. For the first solution:
library(stringr)
str_match_all(ll, "[0-9]+") %>% unlist %>% unique %>% as.numeric
Second solution:
str_match_all(ll, "[0-9]") %>% unlist %>% unique %>% as.numeric
(Note: I've also called the list ll)
Use strsplit using pattern as the inverse of numeric digits: 0-9
For the example you have provided, do this:
tmp <- sapply(list, function (k) strsplit(k, "[^0-9]"))
Then simply take a union of all `sets' in the list, like so:
tmp <- Reduce(union, tmp)
Then you only have to remove the empty string.
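Put together, the three steps might look like the following sketch. It uses lapply instead of sapply so that each element is a plain character vector; the names strings, pieces, all_pieces and numbers are illustrative.

```r
strings <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")

# split each string at every non-digit character
pieces <- lapply(strings, function(k) strsplit(k, "[^0-9]")[[1]])

# union of all pieces across the list (still contains the empty string)
all_pieces <- Reduce(union, pieces)

# drop the empty string and convert to numbers
numbers <- as.numeric(all_pieces[all_pieces != ""])

print(numbers)  # 7 667 11 5 2
```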
Check out the str_extract_numbers() function from the strex package.
pacman::p_load(strex)
list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"
charvec <- unlist(list)
print(charvec)
#> [1] "djud7+dg[a]hs667" "7fd*hac11(5)" "2tu,g7gka5"
str_extract_numbers(charvec)
#> [[1]]
#> [1] 7 667
#>
#> [[2]]
#> [1] 7 11 5
#>
#> [[3]]
#> [1] 2 7 5
unique(unlist(str_extract_numbers(charvec)))
#> [1] 7 667 11 5 2
Created on 2018-09-03 by the reprex package (v0.2.0).

In R, can't set names of vector elements using assignment in combine function

Quick question. Why does the following work in R (correctly assigning the variable value "Hello" to the first element of the vector):
> a <- "Hello"
> b <- c(a, "There")
> b
[1] "Hello" "There"
And this works:
> c <- c("Hello"=1, "There"=2)
> c
Hello There
1 2
But this does not (making the vector element name equal to "a" rather than "Hello"):
> c <- c(a=1, "There"=2)
> c
a There
1 2
Is it possible to make R recognize that I want to use the value of a in the statement c <- c(a=1, "There"=2)?
I am not sure how c() internally creates the names attribute from the named objects. Perhaps it is along the lines of list() and unlist()? Anyway, you can assign the values of the vector first, and the names attribute later, as in the following.
a <- "Hello"
b <- c(1, 2)
names(b) = c(a, "There")
b
# Hello There
# 1 2
Then to access the named elements later:
b[a] <- 3
b
# Hello There
# 3 2
b["Hello"] <- 4
b
# Hello There
# 4 2
b[1] <- 5
b
# Hello There
# 5 2
Edit
If you really wanted to do it all in one line, the following works:
eval(parse(text = paste0("c(",a," = 1, 'there' = 2)")))
# Hello there
# 1 2
However, I think you'll prefer assigning values and names separately to the eval(parse()) approach.
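A one-line alternative that avoids eval(parse()) is setNames() from base R, which attaches the names after the values are built. A minimal sketch, with v as an illustrative variable name:

```r
a <- "Hello"

# values first, names second; a is evaluated because it is an ordinary argument
v <- setNames(c(1, 2), c(a, "There"))

print(v)
print(names(v))  # "Hello" "There"
```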
Assign the values in a named list. Then unlist it. e.g.
lR<-list("a" = 1, "There" = 2 )
v = unlist(lR)
this gives a named vector v
v
a There
1 2

How can I remove an element from a list?

I have a list and I want to remove a single element from it. How can I do this?
I've tried looking up what I think the obvious names for this function would be in the reference manual and I haven't found anything appropriate.
If you don't want to modify the list in-place (e.g. for passing the list with an element removed to a function), you can use indexing: negative indices mean "don't include this element".
x <- list("a", "b", "c", "d", "e"); # example list
x[-2]; # without 2nd element
x[-c(2, 3)]; # without 2nd and 3rd
Also, logical index vectors are useful:
x[x != "b"]; # without elements that are "b"
This works with dataframes, too:
df <- data.frame(number = 1:5, name = letters[1:5])
df[df$name != "b", ]; # rows without "b"
df[df$number %% 2 == 1, ] # rows with odd numbers only
I don't know R at all, but a bit of creative googling led me here: http://tolstoy.newcastle.edu.au/R/help/05/04/1919.html
The key quote from there:
I do not find explicit documentation for R on how to remove elements from lists, but trial and error tells me
myList[[5]] <- NULL
will remove the 5th element and then "close up" the hole caused by deletion of that element. That shuffles the index values, so I have to be careful in dropping elements. I must work from the back of the list to the front.
A response to that post later in the thread states:
For deleting an element of a list, see R FAQ 7.1
And the relevant section of the R FAQ says:
... Do not set x[i] or x[[i]] to NULL, because this will remove the corresponding component from the list.
Which seems to tell you (in a somewhat backwards way) how to remove an element.
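To make the FAQ's point concrete, here is a small sketch contrasting removal with actually storing NULL in a component. The list x and the names removed and kept are made up for illustration.

```r
x <- list(a = 1, b = 2, c = 3)

# assigning NULL removes the component and closes the gap
removed <- x
removed[["b"]] <- NULL

# assigning list(NULL) with single brackets keeps the component, with value NULL
kept <- x
kept["b"] <- list(NULL)

print(names(removed))  # "a" "c"
print(length(kept))    # 3
```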
I would like to add that if it's a named list you can simply use within.
l <- list(a = 1, b = 2)
> within(l, rm(a))
$b
[1] 2
So you can overwrite the original list
l <- within(l, rm(a))
to remove element named a from list l.
Here is how to remove the last element of a list in R:
x <- list("a", "b", "c", "d", "e")
x[length(x)] <- NULL
If x might be a vector then you would need to create a new object:
x <- c("a", "b", "c", "d", "e")
x <- x[-length(x)]
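This works for both lists and vectors.
Removing NULL elements from a list in a single line:
# a logical index is used instead of -which(...), which would return an empty list when x has no NULL elements
x <- x[!vapply(x, is.null, logical(1))]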
Cheers
If you have a named list and want to remove a specific element you can try:
lst <- list(a = 1:4, b = 4:8, c = 8:10)
if("b" %in% names(lst)) lst <- lst[ - which(names(lst) == "b")]
This will make a list lst with elements a, b, c. The second line removes element b after it checks that it exists (to avoid the problem @hjv mentioned).
or better:
lst$b <- NULL
This way it is not a problem to try to delete a non-existent element (e.g. lst$g <- NULL)
Use - (negative sign) with the position of the element. For example, to remove the 3rd element, use your_list[-3].
Input
my_list <- list(a = 3, b = 3, c = 4, d = "Hello", e = NA)
my_list
# $`a`
# [1] 3
# $b
# [1] 3
# $c
# [1] 4
# $d
# [1] "Hello"
# $e
# [1] NA
Remove single element from list
my_list[-3]
# $`a`
# [1] 3
# $b
# [1] 3
# $d
# [1] "Hello"
# $e
# [1] NA
Remove multiple elements from list
my_list[c(-1,-3,-2)]
# $`d`
# [1] "Hello"
# $e
# [1] NA
my_list[c(-3:-5)]
# $`a`
# [1] 3
# $b
# [1] 3
my_list[-seq(1:2)]
# $`c`
# [1] 4
# $d
# [1] "Hello"
# $e
# [1] NA
There's the rlist package (http://cran.r-project.org/web/packages/rlist/index.html) to deal with various kinds of list operations.
Example (http://cran.r-project.org/web/packages/rlist/vignettes/Filtering.html):
library(rlist)
devs <-
list(
p1=list(name="Ken",age=24,
interest=c("reading","music","movies"),
lang=list(r=2,csharp=4,python=3)),
p2=list(name="James",age=25,
interest=c("sports","music"),
lang=list(r=3,java=2,cpp=5)),
p3=list(name="Penny",age=24,
interest=c("movies","reading"),
lang=list(r=1,cpp=4,python=2)))
list.remove(devs, c("p1","p2"))
Results in:
# $p3
# $p3$name
# [1] "Penny"
#
# $p3$age
# [1] 24
#
# $p3$interest
# [1] "movies" "reading"
#
# $p3$lang
# $p3$lang$r
# [1] 1
#
# $p3$lang$cpp
# [1] 4
#
# $p3$lang$python
# [1] 2
Don't know if you still need an answer to this, but from my limited experience (3 weeks of self-teaching R) I found that the NULL assignment can fail or be sub-optimal, especially if you're dynamically updating a list in something like a for-loop.
To be more precise, using
myList[[5]] <- NULL
can throw the error
myList[[5]] <- NULL : replacement has length zero
or
more elements supplied than there are to replace
(this typically happens when myList is an atomic vector rather than a true list).
What I found to work more consistently is
myList <- myList[-5]
Just wanted to quickly add (because I didn't see it in any of the answers) that, for a named list, you can also do l["name"] <- NULL. For example:
l <- list(a = 1, b = 2, cc = 3)
l['b'] <- NULL
In the case of named lists I find those helper functions useful
member <- function(list,names){
## return the elements of the list with the input names
member..names <- names(list)
index <- which(member..names %in% names)
list[index]
}
exclude <- function(list,names){
## return the elements of the list not belonging to names
member..names <- names(list)
index <- which(!(member..names %in% names))
list[index]
}
aa <- structure(list(a = 1:10, b = 4:5, fruits = c("apple", "orange"
)), .Names = c("a", "b", "fruits"))
> aa
## $a
## [1] 1 2 3 4 5 6 7 8 9 10
## $b
## [1] 4 5
## $fruits
## [1] "apple" "orange"
> member(aa,"fruits")
## $fruits
## [1] "apple" "orange"
> exclude(aa,"fruits")
## $a
## [1] 1 2 3 4 5 6 7 8 9 10
## $b
## [1] 4 5
Using lapply and grep:
lst <- list(a = 1:4, b = 4:8, c = 8:10)
# say you want to remove a and c
toremove<-c("a","c")
lstnew<-lst[-unlist(lapply(toremove, function(x) grep(x, names(lst)) ) ) ]
#or
pattern<-"a|c"
lstnew<-lst[-grep(pattern, names(lst))]
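One caveat with grep is that it matches patterns, so a name that merely contains "a" is removed as well. If exact names are intended, %in% is a possible alternative. A sketch with a hypothetical list that has an extra element ab:

```r
# hypothetical list; `ab` is added to show the partial-match problem
lst <- list(a = 1:4, ab = 4:8, c = 8:10)
toremove <- c("a", "c")

# pattern match: "ab" also matches "a|c", so everything is removed
by_grep <- lst[-grep("a|c", names(lst))]

# exact match on names: only a and c are removed
by_exact <- lst[!names(lst) %in% toremove]

print(length(by_grep))  # 0
print(names(by_exact))  # "ab"
```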
You can also negatively index from a list using the extract function of the magrittr package to remove a list item.
a <- seq(1,5)
b <- seq(2,6)
c <- seq(3,7)
l <- list(a,b,c)
library(magrittr)
extract(l,-1) #simple one-function method
[[1]]
[1] 2 3 4 5 6
[[2]]
[1] 3 4 5 6 7
There are a few options in the purrr package that haven't been mentioned:
pluck and assign_in work well with nested values and you can access it using a combination of names and/or indices:
library(purrr)
l <- list("a" = 1:2, "b" = 3:4, "d" = list("e" = 5:6, "f" = 7:8))
# select values (by name and/or index)
all.equal(pluck(l, "d", "e"), pluck(l, 3, "e"), pluck(l, 3, 1))
[1] TRUE
# or if element location stored in a vector use !!!
pluck(l, !!! as.list(c("d", "e")))
[1] 5 6
# remove values (modifies in place)
pluck(l, "d", "e") <- NULL
# assign_in to remove values with name and/or index (does not modify in place)
assign_in(l, list("d", 1), NULL)
$a
[1] 1 2
$b
[1] 3 4
$d
$d$f
[1] 7 8
Or you can remove values using list_modify by assigning zap() or NULL:
all.equal(list_modify(l, a = zap()), list_modify(l, a = NULL))
[1] TRUE
You can remove or keep elements using a predicate function with discard and keep:
# remove numeric elements
discard(l, is.numeric)
$d
$d$e
[1] 5 6
$d$f
[1] 7 8
# keep numeric elements
keep(l, is.numeric)
$a
[1] 1 2
$b
[1] 3 4
Here is a simple solution that can be done using base R. It removes the number 5 from the original list of numbers. You can use the same method to remove whatever element you want from a list.
#the original list
original_list = c(1:10)
#the list element to remove
remove = 5
#the new list (which will not contain whatever the `remove` variable equals)
new_list = c()
#go through all the elements in the list and add them to the new list if they don't equal the `remove` variable
counter = 1
for (n in original_list){
if (n != remove){
new_list[[counter]] = n
counter = counter + 1
}
}
The new_list variable no longer contains 5.
new_list
# [1] 1 2 3 4 6 7 8 9 10
How about this? Again, using indices
> m <- c(1:5)
> m
[1] 1 2 3 4 5
> m[1:(length(m)-1)]
[1] 1 2 3 4
or
> m[-(length(m))]
[1] 1 2 3 4
You can use which.
x<-c(1:5)
x
#[1] 1 2 3 4 5
x<-x[-which(x==4)]
x
#[1] 1 2 3 5
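One thing to watch for with -which(): if nothing matches, which() returns an empty vector and the negative index then selects nothing at all. A logical index avoids this. A small sketch, with bad and good as illustrative names:

```r
x <- 1:5

# 9 is not in x, so which() returns integer(0) and x[-integer(0)] is empty
bad <- x[-which(x == 9)]

# a logical index leaves x unchanged when nothing matches
good <- x[x != 9]

print(length(bad))  # 0
print(good)         # 1 2 3 4 5
```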
If you'd like to avoid numeric indices, you can use
a <- a[setdiff(names(a),c("name1", ..., "namen"))]
to delete names name1...namen from a. This works for lists
> l <- list(a=1,b=2)
> l[setdiff(names(l),"a")]
$b
[1] 2
as well as for vectors
> v <- c(a=1,b=2)
> v[setdiff(names(v),"a")]
b
2
