How do I remove an element from a list in R?
Imagine this workflow:
# create list
my_list <- lapply(1:10, function(x) x)
# find which ones to exclude
my_list_boolean <- sapply(my_list, function(x) ifelse(x%%2>0,F,T))
# does not work like this!
my_list[[my_list_boolean]]
Is there a solution not having to use a for loop and create a big logic around my statement?
Just use [] and not [[]]
my_list <- lapply(1:10, function(x) x)
# flag the elements to keep (TRUE for even values)
my_list_boolean <- sapply(my_list, function(x) ifelse(x%%2>0,F,T))
# single brackets accept a logical vector, so this works
my_list[my_list_boolean]
#> [[1]]
#> [1] 2
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] 6
#>
#> [[4]]
#> [1] 8
#>
#> [[5]]
#> [1] 10
Created on 2018-11-03 by the reprex package (v0.2.1)
You can thus select elements of the list with a logical vector using [], rather than extracting the content of a single element (which is what [[]] does).
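The difference between the two operators can be sketched as follows. This is a minimal illustration, not part of the original answer; the names is_even, kept, dropped and third are made up for the example.

```r
my_list <- as.list(1:10)

# One TRUE/FALSE per element; vapply guarantees a logical vector
is_even <- vapply(my_list, function(x) x %% 2 == 0, logical(1))

kept    <- my_list[is_even]   # `[` returns a sub-list of the flagged elements
dropped <- my_list[!is_even]  # negate the vector to remove them instead
third   <- my_list[[3]]       # `[[` extracts the content of one element

length(kept)
third
```

`[[` expects a single position or name, which is why passing a logical vector of length 10 to it fails.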
Do you mean this?
my_list[my_list_boolean]
#[[1]]
#[1] 2
#
#[[2]]
#[1] 4
#
#[[3]]
#[1] 6
#
#[[4]]
#[1] 8
#
#[[5]]
#[1] 10
I have hundreds of observations of census data. Each feature is stored within a list named census. I am trying to perform two actions:
a) on all elements of all lists: I want to make all non-character elements numeric.
b) on a named element present within each list: I want to remove a prefix from a named column in every list.
A toy example is below.
census is a list of lists.
library(tidyverse)
library(purrr)
POA_CODE = c("POA101","POA102")
dogs = c(4,4)
cats = c(3,2)
children = c(0, 1)
salary = c(100, 120)
employed.prop = c(1,0.5)
pets <- list(POA_CODE, as.integer(dogs), as.integer(cats))
children <-list(POA_CODE, as.integer(children))
employment <-list(POA_CODE, salary, employed.prop)
census <- list(pets, children, employment)
Attempt to change all non-numeric elements in every list to numeric
#change all non-numeric elements to numeric
census_num <- census %>%
map(function(x){
ifelse(is.character == TRUE, x,
as.numeric(x))}
)
I get the following error message:
Error in is.character == TRUE :
comparison (1) is possible only for atomic and list types
Attempt to remove prefix from every postcode in census[[]]$'POA_CODE'
#Remove "POA" prefix from every postcode
census_code <- pmap(census, ~.x[["POA_CODE"]],function(x){
str_replace(POA_CODE,"POA","")
})
I get the error
Error: Element 2 of `.l` must have length 1 or 3, not 2
You have a nested list, so you need nested maps:
library(purrr)
map(census, function(x) map_if(x, is.character, ~as.numeric(sub('POA', '', .x))))
#[[1]]
#[[1]][[1]]
#[1] 101 102
#[[1]][[2]]
#[1] 4 4
#[[1]][[3]]
#[1] 3 2
#[[2]]
#[[2]][[1]]
#[1] 101 102
#[[2]][[2]]
#[1] 0 1
#[[3]]
#[[3]][[1]]
#[1] 101 102
#[[3]][[2]]
#[1] 100 120
#[[3]][[3]]
#[1] 1.0 0.5
In base R, we can solve it with nested lapply :
lapply(census, function(x) lapply(x, function(y)
if(is.character(y)) as.numeric(sub('POA', '', y)) else y))
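Since no output is shown for the nested lapply version, here is a self-contained run on the toy data. It is only a sketch: the helper name strip_prefix is made up for the example, and str() is used to check the resulting types.

```r
# Toy data from the question, rebuilt without the intermediate variables
census <- list(
  list(c("POA101", "POA102"), c(4L, 4L), c(3L, 2L)),
  list(c("POA101", "POA102"), c(0L, 1L)),
  list(c("POA101", "POA102"), c(100, 120), c(1, 0.5))
)

# Character vectors lose the "POA" prefix and become numeric;
# everything else is returned unchanged
strip_prefix <- function(y) {
  if (is.character(y)) as.numeric(sub("POA", "", y)) else y
}

# Outer lapply walks the three lists, inner lapply walks their elements
census_num <- lapply(census, function(x) lapply(x, strip_prefix))
str(census_num)
```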
You could use rapply() in base R:
rapply(
census,
function(x) if(is.character(x)) as.numeric(sub("^\\D+","", x)) else x,
how = "replace")
#> [[1]]
#> [[1]][[1]]
#> [1] 101 102
#>
#> [[1]][[2]]
#> [1] 4 4
#>
#> [[1]][[3]]
#> [1] 3 2
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [1] 101 102
#>
#> [[2]][[2]]
#> [1] 0 1
#>
#>
#> [[3]]
#> [[3]][[1]]
#> [1] 101 102
#>
#> [[3]][[2]]
#> [1] 100 120
#>
#> [[3]][[3]]
#> [1] 1.0 0.5
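As a variation on the rapply() call above, the classes argument can do the type check, so the function no longer needs its own if/else. This is a sketch on a reduced copy of the toy data; with how = "replace", elements whose class does not match are left as they are.

```r
census <- list(
  list(c("POA101", "POA102"), c(4L, 4L), c(3L, 2L)),
  list(c("POA101", "POA102"), c(100, 120), c(1, 0.5))
)

# f is applied only to character leaves; other leaves are kept untouched
res <- rapply(
  census,
  function(x) as.numeric(sub("^\\D+", "", x)),
  classes = "character",
  how = "replace"
)
str(res)
```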
or purrr::map_depth()
library(purrr)
map_depth(census, 2, ~if(is.character(.)) as.numeric(sub("^\\D+","", .)) else .)
#> [[1]]
#> [[1]][[1]]
#> [1] 101 102
#>
#> [[1]][[2]]
#> [1] 4 4
#>
#> [[1]][[3]]
#> [1] 3 2
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [1] 101 102
#>
#> [[2]][[2]]
#> [1] 0 1
#>
#>
#> [[3]]
#> [[3]][[1]]
#> [1] 101 102
#>
#> [[3]][[2]]
#> [1] 100 120
#>
#> [[3]][[3]]
#> [1] 1.0 0.5
We can use rrapply() from the rrapply package together with readr::parse_number():
library(rrapply)
library(readr)
rrapply(census, f = function(x) if(is.character(x)) readr::parse_number(x) else x)
#[[1]]
#[[1]][[1]]
#[1] 101 102
#[[1]][[2]]
#[1] 4 4
#[[1]][[3]]
#[1] 3 2
#[[2]]
#[[2]][[1]]
#[1] 101 102
#[[2]][[2]]
#[1] 0 1
#[[3]]
#[[3]][[1]]
#[1] 101 102
#[[3]][[2]]
#[1] 100 120
#[[3]][[3]]
#[1] 1.0 0.5
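For part b) of the question, the answers above convert every character element. If the inner lists were named (an assumption: the toy lists in the question are unnamed, so census_named below is a hypothetical variant), only the POA_CODE element could be modified, roughly like this:

```r
# Hypothetical named version of the toy data
census_named <- list(
  pets     = list(POA_CODE = c("POA101", "POA102"), dogs = c(4L, 4L)),
  children = list(POA_CODE = c("POA101", "POA102"), children = c(0L, 1L))
)

# Replace only the POA_CODE element of each inner list and return the list
census_code <- lapply(census_named, function(x) {
  x[["POA_CODE"]] <- sub("^POA", "", x[["POA_CODE"]])
  x
})
str(census_code)
```

The codes stay character here; wrap the sub() call in as.numeric() if numbers are wanted.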
I have a numeric vector and I need to get the intervals as a list of vectors.
I thought it was easy but I'm really struggling to find a good, simple way.
A bad, complex way would be to paste the vector and its lag, and then split the result.
Here is the working but ugly reprex:
library(tidyverse)
xx = c(1, 5, 10 ,15 ,20)
paste0(lag(xx), "-", xx-1) %>% str_split("-") #nevermind the first one, it cannot really make sense anyway
#> [[1]]
#> [1] "NA" "0"
#>
#> [[2]]
#> [1] "1" "4"
#>
#> [[3]]
#> [1] "5" "9"
#>
#> [[4]]
#> [1] "10" "14"
#>
#> [[5]]
#> [1] "15" "19"
Created on 2020-09-06 by the reprex package (v0.3.0)
Is there a cleaner way to do the same thing?
You can use Map :
Map(c, xx[-length(xx)], xx[-1] - 1)
#[[1]]
#[1] 1 4
#[[2]]
#[1] 5 9
#[[3]]
#[1] 10 14
#[[4]]
#[1] 15 19
We can also use lapply iterating over the length of the variable.
lapply(seq_along(xx[-1]), function(i) c(xx[i], xx[i+1] - 1))
We can use map2 from purrr
library(purrr)
map2(xx[-length(xx)], xx[-1] -1, c)
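All three answers pair each element with its successor minus one. A small base R wrapper makes that explicit; the function name intervals is made up for this sketch.

```r
intervals <- function(x) {
  starts <- head(x, -1)      # every element except the last
  ends   <- tail(x, -1) - 1  # every element except the first, minus one
  Map(c, starts, ends)       # one c(start, end) vector per pair
}

xx <- c(1, 5, 10, 15, 20)
res <- intervals(xx)
res[[1]]
res[[4]]
```

For a vector of length n this returns n - 1 intervals, so the meaningless first NA interval of the paste-based version never appears.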
purrr does not seem to support recycling the elements of a vector when one of the two inputs is shorter than the other (in purrr::map2 or purrr::walk2), unlike base R, where we only get a warning if the length of the longer vector is not a multiple of the length of the shorter one.
Consider this toy example:
This works:
map2(1:3,4:6,sum)
#
#[[1]]
#[1] 5
#[[2]]
#[1] 7
#[[3]]
#[1] 9
And this doesn't work:
map2(1:3,4:9,sum)
Error: .x (3) and .y (6) are different lengths
I understand very well why this is not allowed - as it can make catching bugs very difficult. But is there any way in purrr I can force this to happen? Perhaps using some base R trick with purrr?
You can put both vectors in a data frame and let data.frame() recycle the shorter one (this works only when the longer length is a multiple of the shorter one):
input <- data.frame(a = 1:3, b = 4:9)
purrr::map2(input$a, input$b, sum)
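The data frame trick depends on data.frame()'s own recycling rule, which is stricter than plain vector recycling. A quick base R check (the variable names ok and bad are just for illustration):

```r
# Lengths 3 and 6: the shorter column is recycled twice
ok <- data.frame(a = 1:3, b = 4:9)
ok$a

# Lengths 3 and 5: data.frame() raises an error instead of recycling,
# so the message is captured here rather than stopping the script
bad <- tryCatch(
  data.frame(a = 1:3, b = 4:8),
  error = function(e) conditionMessage(e)
)
bad
```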
It's by design with purrr but you can use Map :
Map(sum,1:3,4:9)
# [[1]]
# [1] 5
#
# [[2]]
# [1] 7
#
# [[3]]
# [1] 9
#
# [[4]]
# [1] 8
#
# [[5]]
# [1] 10
#
# [[6]]
# [1] 12
And here's how I would recycle if I had to :
x <- 1:3
y <- 4:9
l <- max(length(y), length(x))
map2(rep(x, length.out = l), rep(y, length.out = l), sum)
# [[1]]
# [1] 5
#
# [[2]]
# [1] 7
#
# [[3]]
# [1] 9
#
# [[4]]
# [1] 8
#
# [[5]]
# [1] 10
#
# [[6]]
# [1] 12
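The manual recycling can be wrapped in a small helper. This is only a sketch: map2_recycled is a made-up name, not a purrr function, and it uses base R's rep_len() to stretch both inputs to the longer length.

```r
library(purrr)

# Hypothetical helper: recycle both inputs to a common length, then map2
map2_recycled <- function(.x, .y, .f, ...) {
  n <- max(length(.x), length(.y))
  map2(rep_len(.x, n), rep_len(.y, n), .f, ...)
}

res <- map2_recycled(1:3, 4:9, sum)
unlist(res)
```

Unlike Map(), this gives no warning when the longer length is not a multiple of the shorter one, so it gives up exactly the safety check that purrr enforces by design.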
How can I easily change elements of a list or vectors to NAs depending on a predicate ?
I need it to be done in a single call for smooth integration in dplyr::mutate calls etc...
expected output:
make_na(1:10,`>`,5)
# [1] 1 2 3 4 5 NA NA NA NA NA
my_list <- list(1,"a",NULL,character(0))
make_na(my_list, is.null)
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[3]]
# [1] NA
#
# [[4]]
# character(0)
Note:
I answered my own question since I have one solution figured out, but I'd be happy to get alternative solutions. Also, maybe this functionality already exists in base R or in a prominent package.
Was inspired by my frustration in my answer to this post
We can build the following function:
make_na <- function(.x,.predicate,...) {
is.na(.x) <- sapply(.x,.predicate,...)
.x
}
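The function above relies on the replacement form of is.na(), which sets the flagged elements to NA. A short base R illustration of that mechanism on its own, outside make_na:

```r
# Atomic vector: elements where the logical index is TRUE become NA
x <- 1:10
is.na(x) <- x > 5
x

# List: the flagged element is replaced by NA, the others are untouched
my_list <- list(1, "a", NULL, character(0))
is.na(my_list) <- c(FALSE, FALSE, TRUE, FALSE)
str(my_list)
```

Because the assignment modifies a copy inside the function, make_na has to return .x explicitly as its last expression.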
Or, a bit better, leverage purrr's magic:
make_na <- function(.x,.predicate,...) {
if (requireNamespace("purrr", quietly = TRUE)) {
is.na(.x) <- purrr::map_lgl(.x,.predicate,...)
} else {
if(inherits(.predicate, "formula"))
stop("Formulas aren't supported unless package 'purrr' is installed")
is.na(.x) <- sapply(.x,.predicate,...)
}
.x
}
This way we'll be using purrr::map_lgl if the purrr package is installed, and sapply otherwise.
If purrr can be assumed, the function reduces to:
make_na <- function(.x,.predicate,...) {
is.na(.x) <- purrr::map_lgl(.x,.predicate,...)
.x
}
Some use cases:
make_na(1:10,`>`,5)
# [1] 1 2 3 4 5 NA NA NA NA NA
my_list <- list(1,"a",NULL,character(0))
make_na(my_list, is.null)
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[3]]
# [1] NA
#
# [[4]]
# character(0)
make_na(my_list, function(x) length(x)==0)
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[3]]
# [1] NA
#
# [[4]]
# [1] NA
If purrr is installed we can use this short form:
make_na(my_list, ~length(.x)==0)
I have a problem with R's grep() function apparently finding an "l" everywhere:
> l <- list(list(), list("a"), list("a","l"))
> grep("a",l)
[1] 2 3
> grep("l",l)
[1] 1 2 3
> grep("l",l,fixed=TRUE)
[1] 1 2 3
This problem seems to occur only with the letter "l". Does anyone have a hint on that?
Many thanks,
Cord
If you look at the documentation for the argument x in grep you'll see that it should be
a character vector where matches are sought, or an object which can be coerced by as.character to a character vector.
If you try that operation you'll see what goes wrong:
> as.character(l)
[1] "list()" "list(\"a\")" "list(\"a\", \"l\")"
So the same "problem" happens if you grep for "i", "s" or "t": every element's string representation starts with "list(".
You could try the following instead
sapply(l, function(i) grep("l", i))
which produces
[[1]]
integer(0)
[[2]]
integer(0)
[[3]]
[1] 2
Interesting post. I never knew that grep converts its x argument like this:
l <- list(list(), list("a"), list("a","l"))
l
#> [[1]]
#> list()
#>
#> [[2]]
#> [[2]][[1]]
#> [1] "a"
#>
#>
#> [[3]]
#> [[3]][[1]]
#> [1] "a"
#>
#> [[3]][[2]]
#> [1] "l"
Internally, grep converts l to a character vector:
grep
#> function (pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE,
#> fixed = FALSE, useBytes = FALSE, invert = FALSE)
#> {
#> if (!is.character(x))
#> x <- structure(as.character(x), names = names(x))
#> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#> .Internal(grep(as.character(pattern), x, ignore.case, value,
#> perl, fixed, useBytes, invert))
#> }
#> <bytecode: 0x0000000012e18610>
#> <environment: namespace:base>
So now l is actually:
structure(as.character(l), names = names(l))
#> [1] "list()" "list(\"a\")" "list(\"a\", \"l\")"
Each of these strings contains an "l" (from the word "list").
You could unlist l first to get the expected results (note that the indices then refer to positions in the flattened vector, not in l):
ul <- unlist(l)
ul
#> [1] "a" "a" "l"
grep("a",l)
#> [1] 2 3
grep("a",ul)
#> [1] 1 2
grep("l",l)
#> [1] 1 2 3
grep("l",ul)
#> [1] 3
grep("l",l,fixed=TRUE)
#> [1] 1 2 3
grep("l",ul,fixed=TRUE)
#> [1] 3
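After unlist(), the returned positions no longer say which top-level element of l matched. If that index is what is wanted, one option is to test each element separately. The helper name contains is made up for this sketch.

```r
l <- list(list(), list("a"), list("a", "l"))

# TRUE for each top-level element that holds at least one matching string
contains <- function(pattern, x) {
  vapply(
    x,
    function(el) any(grepl(pattern, unlist(el), fixed = TRUE)),
    logical(1)
  )
}

which(contains("l", l))
which(contains("a", l))
```

Here each sub-list is flattened on its own, so grepl() only ever sees the stored strings and never the deparsed "list(...)" text.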