I am scraping JSON from an apartment rental platform's REST API.
Let's say I have the following data structure, which conveys the square footage of each rental unit in a single building:
sqft1 <- list(c("1500"),
NULL,
c("1300"))
It's important that I retain the dimensionality of this data. If I try to unlist and aggregate into a data frame alongside other rental unit properties, I will lose the 2nd element and get an error.
But by finding the indices in the list that have a NULL element, I can replace them with a character vector containing an empty string as follows:
isNull1 <- lapply(1:length(sqft1), function(x) is.null(sqft1[[x]]))
sqft1[unlist(isNull1)] <- c("")
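A minimal base R sketch of the single-building case, showing both the dropped element and the effect of the replacement. The names n_flat, is_null and unit_df are only illustrative and do not come from the original post.

```r
sqft1 <- list("1500", NULL, "1300")

# unlist() silently drops NULL elements, so one unit goes missing.
n_flat <- length(unlist(sqft1))

# Replace NULLs with "" so the list keeps one entry per unit.
is_null <- vapply(sqft1, is.null, logical(1))
sqft1[is_null] <- ""

# The flattened vector now lines up with the other per-unit columns.
unit_df <- data.frame(unit = seq_along(sqft1), sqft = unlist(sqft1))
```

Here n_flat counts only the two non-NULL values, while unit_df ends up with one row per unit.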
My problem arises when I try to apply the same replacement function over a result set for multiple buildings. After running the following block, no replacements get made.
sqft3 <- list(list(c("1500"),
NULL,
c("1300")),
list(c("1400"),
c("1700")),
list(NULL,
c("1200")))
isNull3 <- lapply(1:length(sqft3), function(x) lapply(1:length(sqft3[[x]]), function(y) is.null(sqft3[[x]][[y]])))
lapply(1:length(sqft3), function(x) sqft3[[x]][unlist(isNull3[[x]])] <- c(""))
What concept about applying functions am I misunderstanding here? Any ideas how to make it work?
Thanks!
You could use a nested lapply as follows
lapply(sqft3, function (x) lapply(x, function (y) ifelse(is.null(y), "", y)))
This is similar to @MKR's solution.
Using the purrr package, you could also use modify_depth
library(purrr)
modify_depth(sqft3, .depth = 2, .f = ~ifelse(is.null(.x), "", .x))
The ifelse from above could also be replaced by the %||% operator ("null-default", also from purrr) as follows
modify_depth(sqft3, 2, `%||%`, "")
(%||% is short-hand for if (is.null(x)) y else x)
Result
#[[1]]
#[[1]][[1]]
#[1] "1500"
#
#[[1]][[2]]
#[1] ""
#
#[[1]][[3]]
#[1] "1300"
#
#
#[[2]]
#[[2]][[1]]
#[1] "1400"
#
#[[2]][[2]]
#[1] "1700"
#
#
#[[3]]
#[[3]][[1]]
#[1] ""
#
#[[3]][[2]]
#[1] "1200"
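On the "what concept am I misunderstanding" part of the question: an assignment made inside the function passed to lapply changes a copy that is local to that function call, and lapply only collects each call's return value. The sketch below, in base R with illustrative names (attempt, fixed), shows that the original list is untouched and that returning the modified sub-list fixes it.

```r
sqft3 <- list(list("1500", NULL, "1300"),
              list("1400", "1700"),
              list(NULL, "1200"))

# The assignment modifies a local copy of sqft3. The function's value is the
# value of the assignment expression (""), which is all lapply() collects.
attempt <- lapply(seq_along(sqft3), function(x) {
  sqft3[[x]][vapply(sqft3[[x]], is.null, logical(1))] <- ""
})

# Returning the modified sub-list keeps the change.
fixed <- lapply(sqft3, function(building) {
  building[vapply(building, is.null, logical(1))] <- list("")
  building
})
```

After running this, sqft3 still contains its NULL elements, attempt is just a list of empty strings, and fixed has the same shape as sqft3 with "" in place of each NULL.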
One option is to use the map function (from purrr) twice:
map(sqft3,function(x){
map(x, function(y){
y[is.null(y)] <- ""
y
})})
# [[1]]
# [[1]][[1]]
# [1] "1500"
#
# [[1]][[2]]
# [1] ""
#
# [[1]][[3]]
# [1] "1300"
#
#
# [[2]]
# [[2]][[1]]
# [1] "1400"
#
# [[2]][[2]]
# [1] "1700"
#
#
# [[3]]
# [[3]][[1]]
# [1] ""
#
# [[3]][[2]]
# [1] "1200"
Data:
sqft3 <- list(list(c("1500"),
NULL,
c("1300")),
list(c("1400"),
c("1700")),
list(NULL,
c("1200")))
sqft3
# [[1]]
# [[1]][[1]]
# [1] "1500"
#
# [[1]][[2]]
# NULL
#
# [[1]][[3]]
# [1] "1300"
#
#
# [[2]]
# [[2]][[1]]
# [1] "1400"
#
# [[2]][[2]]
# [1] "1700"
#
#
# [[3]]
# [[3]][[1]]
# NULL
#
# [[3]][[2]]
# [1] "1200"
Related
I have data in a nested list structure in R and I'd like to use a lookup table to change names no matter where they are in the structure.
Example
# build up an example
x <- as.list(c("a" = NA))
x[[1]] <- vector("list", 4)
names(x[[1]]) <- c("b","c","d","e")
x$a$b <- vector("list", 2)
names(x$a$b) <- c("d","f")
x$a$c <- 3
x$a$d <- 27
x$a$e <- "d"
x$a$b$d <- "data"
x$a$b$f <- "more data"
# make a lookup table for names I want to change from; to
lkp <- data.frame(matrix(data = c("a","z","b","bee","d","dee"),
ncol = 2,
byrow = TRUE), stringsAsFactors = FALSE)
names(lkp) <- c("from","to")
Output from the above
> x
$a
$a$b
$a$b$d
[1] "data"
$a$b$f
[1] "more data"
$a$c
[1] 3
$a$d
[1] 27
$a$e
[1] "d"
> lkp
from to
1 a z
2 b bee
3 d dee
Here is what I came up with to do this for only the first level:
> for(i in 1:nrow(lkp)){
+ names(x)[names(x) == lkp$from[[i]]] <- lkp$to[[i]]
+ }
> x
$z
$z$b
$z$b$d
[1] "data"
$z$b$f
[1] "more data"
$z$c
[1] 3
$z$d
[1] 27
$z$e
[1] "d"
So that works fine but uses a loop and only gets at the first level. I've tried various versions of the *apply world but have not yet been able to get something useful.
Thanks in advance for any thoughts
EDIT:
Interestingly, rapply fails miserably (or, I fail miserably in my attempt!) when trying to access and modify names. Here's an example of just trying to change all names to the same value:
> namef <- function(x) names(x) <- "z"
> rapply(x, namef, how = "list")
$a
$a$b
$a$b$d
[1] "z"
$a$b$f
[1] "z"
$a$c
[1] "z"
$a$d
[1] "z"
$a$e
[1] "z"
I used a character vector for look-up instead of your data.frame, but it will be easy to change it if you really want a data.frame.
library(purrr)  # for map() and %>%
lkp2 <- lkp$to
names(lkp2) <- lkp$from
rename <- function(nested_list) {
found <- names(nested_list) %in% names(lkp2)
names(nested_list)[found] <- lkp2[names(nested_list)[found]]
nested_list %>% map(~{
if (is.list(.x)) {
rename(.x)
} else {
.x
}
})
}
rename(x)
# $z
# $z$bee
# $z$bee$dee
# [1] "data"
#
# $z$bee$f
# [1] "more data"
#
#
# $z$c
# [1] 3
#
# $z$dee
# [1] 27
#
# $z$e
# [1] "d"
I am not sure this is the best way to do it, but it seems to do the job, and if you're only working with small lists (like XML documents) then there is no need to worry much about performance.
You might want to give the function a better name.
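The same recursion can be sketched in base R without purrr. It also hints at why the rapply attempt in the question only changes values: rapply applies its function to the leaves, whereas names belong to the list that contains them, so the renaming has to happen at the list level. The function name rename_nested and its lookup argument are illustrative, not part of the original answer.

```r
# Small version of the example data and a named character lookup vector.
x <- list(a = list(b = list(d = "data", f = "more data"),
                   c = 3, d = 27, e = "d"))
lookup <- c(a = "z", b = "bee", d = "dee")

rename_nested <- function(node, lookup) {
  # Leaves are returned unchanged; only lists carry names to rewrite.
  if (!is.list(node)) return(node)
  nms <- names(node)
  if (!is.null(nms)) {
    hit <- nms %in% names(lookup)
    nms[hit] <- unname(lookup[nms[hit]])
    names(node) <- nms
  }
  # Recurse into every element; lapply() keeps the (new) names.
  lapply(node, rename_nested, lookup = lookup)
}

renamed <- rename_nested(x, lookup)
```

Note that a data.frame is also a list, so this sketch would descend into data frame columns if the structure contained any.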
Using an external package, you can also do this with rrapply from the rrapply package (an extension of base rapply):
library(rrapply) ## v1.2.1
rrapply(list(x),
classes = "list",
f = function(x) {
newnames <- lkp$to[match(names(x), lkp$from)]
names(x)[!is.na(newnames)] <- newnames[!is.na(newnames)]
return(x)
},
how = "recurse"
)[[1]]
#> $z
#> $z$bee
#> $z$bee$dee
#> [1] "data"
#>
#> $z$bee$f
#> [1] "more data"
#>
#>
#> $z$c
#> [1] 3
#>
#> $z$dee
#> [1] 27
#>
#> $z$e
#> [1] "d"
Here, the f function achieves essentially the same as OP's for-loop. how = "recurse" tells the function to continue recursion after the application of f.
Note that the input is wrapped as list(x) so that the f function also modifies the name(s) of the list itself.
Update
rrapply v1.2.5 contains a dedicated option how = "names" to replace names in a nested list, which is a bit less convoluted:
rrapply(
x,
f = function(x, .xname) {
newname <- lkp$to[match(.xname, lkp$from)]
return(ifelse(is.na(newname), .xname, newname))
},
how = "names"
)
#> $z
#> $z$bee
#> $z$bee$dee
#> [1] "data"
#>
#> $z$bee$f
#> [1] "more data"
#>
#>
#> $z$c
#> [1] 3
#>
#> $z$dee
#> [1] 27
#>
#> $z$e
#> [1] "d"
I would like to add a sequential element onto a list. Suppose I have the following list
lst <- list("A"=list(e1="a",e2="!"), "B"=list(e1="b", e2="#"))
$A
$A$e1
[1] "a"
$A$e2
[1] "!"
$B
$B$e1
[1] "b"
$B$e2
[1] "#"
I would like to append an e3 element, which is the position index of that element in the list, so essentially I would like my list to be:
$A
$A$e1
[1] "a"
$A$e2
[1] "!"
$A$e3
[1] 1
$B
$B$e1
[1] "b"
$B$e2
[1] "#"
$B$e3
[1] 2
setNames(lapply(seq_along(lst), function(i){
temp = lst[[i]]
temp$e3 = i
temp
}), names(lst))
#$`A`
#$`A`$`e1`
#[1] "a"
#$`A`$e2
#[1] "!"
#$`A`$e3
#[1] 1
#$B
#$B$`e1`
#[1] "b"
#$B$e2
#[1] "#"
#$B$e3
#[1] 2
Here is a solution that doesn't assume that the sub-lists have the same known number of elements.
library("tidyverse")
library("glue")
lst <- list("A"=list(e1="a",e2="!"), "B"=list(e1="b", e2="#"))
# The part
# `setNames(list(.y), glue("e{length(.x) + 1}"))`
# creates a one-element list named accordingly to append to the previous list
map2(lst, seq(lst),
~ append(.x, setNames(list(.y), glue("e{length(.x) + 1}") )))
#> $A
#> $A$e1
#> [1] "a"
#>
#> $A$e2
#> [1] "!"
#>
#> $A$e3
#> [1] 1
#>
#>
#> $B
#> $B$e1
#> [1] "b"
#>
#> $B$e2
#> [1] "#"
#>
#> $B$e3
#> [1] 2
# If naming the additional element is not important, then this can be simplified to
map2(lst, seq(lst), append)
# or
map2(lst, seq(lst), c)
Created on 2019-03-06 by the reprex package (v0.2.1)
Another option using Map
Map(function(x, y) c(x, "e3" = y), x = lst, y = seq_along(lst))
#$A
#$A$e1
#[1] "a"
#$A$e2
#[1] "!"
#$A$e3
#[1] 1
#$B
#$B$e1
#[1] "b"
#$B$e2
#[1] "#"
#$B$e3
#[1] 2
This could be written even more concisely as
Map(c, lst, e3 = seq_along(lst))
Thanks to @thelatemail
We can use a for loop as well
for(i in seq_along(lst)) lst[[i]]$e3 <- i
Assuming I understood correctly that you want to add a 3rd element to each nested list, containing the index of that list in its parent list, this works:
library(rlist)
lst <- list("A"=list(e1="a",e2="!"), "B"=list(e1="b", e2="#"))
for(i in seq_along(lst)){
lst[[i]] <- list.append(lst[[i]],e3=i)
}
lst
We can loop along the length of lst with lapply, adding this sequential index to each element.
lst2 <- lapply(seq_along(lst), function(i) {
df <- lst[[i]]
df$e3 <- i
return(df)
})
names(lst2) <- names(lst) # Preserve names from lst
Or, if you're not scared about modifying in place:
lapply(seq_along(lst), function(i) {
lst[[i]]$e3 <<- i
})
Both give the same result (in lst2 for the first version, in lst itself for the second):
$A
$A$e1
[1] "a"
$A$e2
[1] "!"
$A$e3
[1] 1
$B
$B$e1
[1] "b"
$B$e2
[1] "#"
$B$e3
[1] 2
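A small sketch of what the in-place version actually does, assuming the same lst as above (ret is an illustrative name). The superassignment operator changes lst in the enclosing environment, while the value returned by lapply is only the list of indices, so it is lst that should be inspected afterwards.

```r
lst <- list(A = list(e1 = "a", e2 = "!"), B = list(e1 = "b", e2 = "#"))

# `<<-` updates lst outside the anonymous function; each call returns i.
ret <- lapply(seq_along(lst), function(i) {
  lst[[i]]$e3 <<- i
})

# ret holds the indices, lst holds the updated structure.
str(ret)
str(lst)
```

Because the side effect depends on where lst lives, the lst2 version or a plain for loop is usually easier to reason about.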
I have tried to use the code below but failed. I want to know why it failed and what's the correct (and elegant) way to do that?
a <- 1
b <- 2
res <- lapply(ls(), function(x, l) { l$x <- get(x)}, l=list())
I hope I get the result like
res
# $a
# [1] 1
# $b
# [1] 2
but what I get is
res
# [[1]]
# [1] 1
# [[2]]
# [1] 2
We can use mget to obtain the values of more than one object; it returns a named list
mget(ls())
#$a
#[1] 1
#$b
#[1] 2
If we need to use get, then set the names with ls()
setNames(lapply(ls(), get), ls())
Using sapply:
sapply(ls(), get, simplify = FALSE)
# $a
# [1] 1
#
# $b
# [1] 2
sapply has simplify and USE.NAMES arguments, both of which default to TRUE. With simplify = FALSE the result stays a list, and because ls() returns a character vector, USE.NAMES = TRUE names the elements after it.
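As to why the original attempt failed, two things are at play: l$x always creates an element literally named "x" rather than using the value of x, and the function returns the value of the assignment, so lapply collects the bare values. The sketch below uses a separate environment (env, an assumption made only to keep the example self-contained) and illustrative names wrong, right and res.

```r
# A dedicated environment standing in for the workspace.
env <- new.env()
assign("a", 1, envir = env)
assign("b", 2, envir = env)

nm <- "a"

wrong <- list()
wrong$nm <- get(nm, envir = env)     # element is literally named "nm"

right <- list()
right[[nm]] <- get(nm, envir = env)  # element is named by the value of nm

# mget() does the lookup and the naming in one step.
res <- mget(ls(env), envir = env)
```

In an interactive session the envir arguments can be dropped, which gives the mget(ls()) call shown above.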
I'm trying to append a list of dates to a list of lists such as myList below. This is working as expected except the date format for the date element in each list element is lost. Any ideas?
myList<-list(list("event"="A"),
list("event"="B"),
list("event"="C"))
dates<-as.Date(c("2011-06-05","2012-01-12","2016-05-09"))
outList<-mapply(FUN="c",myList,eventDate=as.list(dates),SIMPLIFY = FALSE)
I'm looking to achieve the below
[[1]]
[[1]]$event
[1] "A"
[[1]]$eventDate
[1] "2011-06-05"
[[2]]
[[2]]$event
[1] "B"
[[2]]$eventDate
[1] "2012-01-12"
[[3]]
[[3]]$event
[1] "C"
[[3]]$eventDate
[1] "2016-05-09"
Using Map, you can also create a small (lambda) function like so:
myList <- list(
list(event = "A"),
list(event = "B"),
list(event = "C")
)
dates <- as.Date(c("2011-06-05", "2012-01-12", "2016-05-09"))
outList <- Map(f = function(origList, date) {
origList$eventDate <- date
return(origList)
}, myList, dates)
outList
#> [[1]]
#> [[1]]$event
#> [1] "A"
#>
#> [[1]]$eventDate
#> [1] "2011-06-05"
#>
#>
#> [[2]]
#> [[2]]$event
#> [1] "B"
#>
#> [[2]]$eventDate
#> [1] "2012-01-12"
#>
#>
#> [[3]]
#> [[3]]$event
#> [1] "C"
#>
#> [[3]]$eventDate
#> [1] "2016-05-09"
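The reason the dates are converted to numbers is that c dispatches on its first argument, and its default method coerces everything to a common type and drops attributes such as the Date class, so each date is reduced to its underlying numeric value (days since 1970-01-01).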
For example:
c(123, as.Date("2016-01-01"))
#> [1] 123 16801
It may be better to assign by index, since c can coerce the Date to its underlying numeric storage value
for(i in seq_along(myList)) myList[[i]][['eventDate']] <- dates[i]
An additional list wrapper to insulate each Date element will also work here. I constructed that by running an lapply with the list function on the dates vector:
Map("c", myList, eventDate=lapply(dates, list))
[[1]]
[[1]]$event
[1] "A"
[[1]]$eventDate
[1] "2011-06-05"
[[2]]
[[2]]$event
[1] "B"
[[2]]$eventDate
[1] "2012-01-12"
[[3]]
[[3]]$event
[1] "C"
[[3]]$eventDate
[1] "2016-05-09"
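The two behaviours discussed in these answers can be compared side by side in a short base R sketch (the names d, combined and wrapped are illustrative). Combining a list with a bare Date strips the class, while wrapping the Date in its own list keeps it intact.

```r
d <- as.Date("2016-01-01")

# The list is the first argument, so the default c() method is used and the
# Date is reduced to its underlying number of days.
combined <- c(list(event = "A"), eventDate = d)
class(combined$eventDate)

# Wrapped in a list, the Date is carried over as a single intact element.
wrapped <- c(list(event = "A"), eventDate = list(d))
class(wrapped$eventDate)
```

This is the same effect that the lapply(dates, list) wrapper relies on.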
Assuming I have two lists:
xx <- as.list(1:3)
yy <- as.list(LETTERS[1:3])
How do I combine the two such that each element of the new list is a list of the corresponding elements of each component list? So if I combined the two above, I should get:
> combined_list
[[1]]
[[1]][[1]]
[1] 1
[[1]][[2]]
[1] "A"
[[2]]
[[2]][[1]]
[1] 2
[[2]][[2]]
[1] "B"
[[3]]
[[3]][[1]]
[1] 3
[[3]][[2]]
[1] "C"
If you can suggest a solution, I'd like to scale this to 3 or more lists.
This should do the trick. Nicely, mapply() will take an arbitrary number of lists as arguments.
xx <- as.list(1:3)
yy <- as.list(LETTERS[1:3])
zz <- rnorm(3)
mapply(list, xx, yy, zz, SIMPLIFY=FALSE)
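When the number of lists is not known in advance, one possible approach is to collect them in a single list and hand that to mapply through do.call. This is a sketch under that assumption; inputs and combined_list are illustrative names, and zz is a fixed logical list here so the result is reproducible.

```r
xx <- as.list(1:3)
yy <- as.list(LETTERS[1:3])
zz <- as.list(c(TRUE, FALSE, TRUE))

# Any number of equally long lists can be gathered here.
inputs <- list(xx, yy, zz)

# Equivalent to mapply(list, xx, yy, zz, SIMPLIFY = FALSE).
combined_list <- do.call(mapply,
                         c(list(FUN = list, SIMPLIFY = FALSE), inputs))
```

Each element of combined_list is then a list holding the corresponding elements of xx, yy and zz.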