I am trying to replace NULL values with NAs in a list pulled from an API, but the lengths differ, so the replacement fails.
I have tried using the nullToNA function in the toxboot package, but R cannot find the function when I call it (I don't know whether the package has changed or whether it is because the list is not pulled from a MongoDB). I have also tried the usual function call checks. My code is below. Any help?
library(httr)
library(toxboot)
library(RJSONIO)
library(lubridate)
library(xlsx)
library(reshape2)
resUrl <- "http://api.eia.gov/series/?api_key=2B5239FA427673D22505DBF45664B12E&series_id=NG.N3010CO3.M"
comUrl <- "http://api.eia.gov/series/?api_key=2B5239FA427673D22505DBF45664B12E&series_id=NG.N3020CO3.M"
indUrl <- "http://api.eia.gov/series/?api_key=2B5239FA427673D22505DBF45664B12E&series_id=NG.N3035CO3.M"
apiList <- list(resUrl, comUrl, indUrl)
results <- vector("list", length(apiList))
for (i in seq_along(apiList)) {
  raw <- GET(url = as.character(apiList[i]))
  char <- rawToChar(raw$content)
  list <- fromJSON(char)
  for (j in seq_along(list$series[[1]]$data)) {
    if (is.null(list$series[[1]]$data[[j]][[2]])) {
      ##nullToNA(list$series[[1]]$data[[j]][[2]])
      ##list$series[1]$data[[j]][[2]] <- NA
    }
  }
  ##seriesData <- list$series[[1]]$data
  unlistResult <- lapply(list, unlist)
  ##unlistResult <- lapply(seriesData, unlist)
  ##unlist2 <- lapply(unlistResult,unlist)
  ##results[[i]] <- unlistResult
  results[[i]] <- unlistResult
}
The commented-out lines (##) show some of the things I have tried, but there are a few other methods I haven't tried.
I have seen lapply(list, function(x) ifelse (x == "NULL", NA, x)) but haven't had any luck with that either.
Try this:
library(httr)
resUrl <- "http://api.eia.gov/series/?api_key=2B5239FA427673D22505DBF45664B12E&series_id=NG.N3010CO3.M"
x <- GET(resUrl)
y <- content(x)
str(head(y$series[[1]]$data))
# List of 6
# $ :List of 2
# ..$ : chr "201701"
# ..$ : NULL
# $ :List of 2
# ..$ : chr "201612"
# ..$ : num 6.48
# $ :List of 2
# ..$ : chr "201611"
# ..$ : num 7.42
# $ :List of 2
# ..$ : chr "201610"
# ..$ : num 9.75
# $ :List of 2
# ..$ : chr "201609"
# ..$ : num 12.1
# $ :List of 2
# ..$ : chr "201608"
# ..$ : num 14.3
In this first URL, only the first element within $series[[1]]$data contains a NULL. BTW: be careful to distinguish between NULL (the literal) and "NULL" (a character string with 4 letters).
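To make the NULL versus "NULL" distinction concrete, here is a minimal sketch (the variable names are made up for illustration). It also hints at why the ifelse(x == "NULL", NA, x) attempt does nothing useful on a literal NULL:

```r
literal_null <- NULL      # the absence of a value
string_null  <- "NULL"    # an ordinary character string

is.null(literal_null)     # TRUE
is.null(string_null)      # FALSE

# Comparing a literal NULL with == yields a zero-length logical,
# so ifelse() has nothing to test and cannot produce an NA.
comparison <- literal_null == "NULL"
length(comparison)        # 0
```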
Here are some ways (with various data types) to check for NULLs:
is.null(NULL)
# [1] TRUE
length(NULL)
# [1] 0
Simple enough so far; now let's try a list containing NULLs:
l <- list(NULL, 1)
is.null(l)
# [1] FALSE
sapply(l, is.null)
# [1] TRUE FALSE
length(l)
# [1] 2
lengths(l)
# [1] 0 1
sapply(l, length)
# [1] 0 1
(The "0" lengths indicate NULLs.) I'll use lengths here:
y$series[[1]]$data <- lapply(y$series[[1]]$data, function(z) { z[ lengths(z) == 0 ] <- NA; z; })
str(head(y$series[[1]]$data))
# List of 6
# $ :List of 2
# ..$ : chr "201701"
# ..$ : logi NA
# $ :List of 2
# ..$ : chr "201612"
# ..$ : num 6.48
# $ :List of 2
# ..$ : chr "201611"
# ..$ : num 7.42
# $ :List of 2
# ..$ : chr "201610"
# ..$ : num 9.75
# $ :List of 2
# ..$ : chr "201609"
# ..$ : num 12.1
# $ :List of 2
# ..$ : chr "201608"
# ..$ : num 14.3
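If the end goal is a table, the same replacement can be followed by a conversion to a data.frame. This is only a sketch: series_data is a hand-made stand-in for y$series[[1]]$data (so it runs without the API), and null_to_na, period and value are names I made up.

```r
# Hypothetical stand-in for y$series[[1]]$data (no network access needed)
series_data <- list(
  list("201701", NULL),
  list("201612", 6.48),
  list("201611", 7.42)
)

# Replace zero-length (NULL) entries of one (period, value) pair with NA
null_to_na <- function(pair) {
  pair[lengths(pair) == 0] <- NA
  pair
}
cleaned <- lapply(series_data, null_to_na)

# Every pair now has exactly two length-1 entries, so vapply() is safe
series_df <- data.frame(
  period = vapply(cleaned, function(p) p[[1]], character(1)),
  value  = vapply(cleaned, function(p) as.numeric(p[[2]]), numeric(1)),
  stringsAsFactors = FALSE
)
series_df
```

as.numeric() turns the logical NA into a numeric NA, so the value column stays numeric.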
I am making some plots in R in a for-loop and would like to store them using a name to describe the function being plotted, but also which data it came from.
So when I have a list of 2 data sets "x" and "y" and the loop has a structure like this:
x = matrix(
  c(1,2,4,5,6,7),
  nrow=3,
  ncol=2)
y = matrix(
  c(20,40,60,80,100,120),
  nrow=3,
  ncol=2)
data <- list(x,y)
for (i in data){
??? <- boxplot(i)
}
I would like the ??? to be "name" + (i) + "_" separator. In this case the 2 plots would be called "plot_x" and "plot_y".
I tried some stuff with paste("plot", names(i), sep = "_") but I'm not sure if this is what to use, and where and how to use it in this scenario.
We can create an empty list with the same length as 'data' and then store the output of each iteration by looping over the sequence of 'data':
out <- vector('list', length(data))
for(i in seq_along(data)) {
out[[i]] <- boxplot(data[[i]])
}
str(out)
#List of 2
# $ :List of 6
# ..$ stats: num [1:5, 1:2] 1 1.5 2 3 4 5 5.5 6 6.5 7
# ..$ n : num [1:2] 3 3
# ..$ conf : num [1:2, 1:2] 0.632 3.368 5.088 6.912
# ..$ out : num(0)
# ..$ group: num(0)
# ..$ names: chr [1:2] "1" "2"
# $ :List of 6
# ..$ stats: num [1:5, 1:2] 20 30 40 50 60 80 90 100 110 120
# ..$ n : num [1:2] 3 3
# ..$ conf : num [1:2, 1:2] 21.8 58.2 81.8 118.2
# ..$ out : num(0)
# ..$ group: num(0)
# ..$ names: chr [1:2] "1" "2"
If required, set the names of the list elements with the object names
names(out) <- paste0("plot_", c("x", "y"))
It is better not to create multiple objects in the global environment. Instead, as shown above, place the objects in a list.
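If the input list is itself named, the "plot_" names can be derived from it instead of being typed by hand. A minimal sketch, assuming 3x2 matrices with six values each; plot = FALSE is only there so the statistics are computed without opening a graphics device:

```r
x <- matrix(c(1, 2, 4, 5, 6, 7), nrow = 3, ncol = 2)
y <- matrix(c(20, 40, 60, 80, 100, 120), nrow = 3, ncol = 2)

# Naming the inputs lets the names travel with the data
data <- list(x = x, y = y)

# lapply() keeps the names of its input; drop plot = FALSE to draw the plots too
out <- lapply(data, boxplot, plot = FALSE)
names(out) <- paste0("plot_", names(data))

names(out)
out$plot_x$n
```

Individual results are then available as out$plot_x and out$plot_y.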
akrun is right: you should try to avoid creating names in the global environment. But if you really have to, you can try this:
> y = matrix(c(20,40,60,80,100,120,140,160,180),ncol=1)
> .GlobalEnv[[paste0("plot_","y")]] <- boxplot(y)
> str(plot_y)
List of 6
$ stats: num [1:5, 1] 20 60 100 140 180
$ n : num 9
$ conf : num [1:2, 1] 57.9 142.1
$ out : num(0)
$ group: num(0)
$ names: chr "1"
You can read up on .GlobalEnv by typing ?.GlobalEnv at the R command prompt.
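A middle ground, if you want computed object names without touching the global environment, is a dedicated environment together with assign() and get(). This is just a sketch; plots_env is a name I made up, and plot = FALSE again only skips the drawing:

```r
# A separate environment keeps the generated names out of .GlobalEnv
plots_env <- new.env()

y <- matrix(c(20, 40, 60, 80, 100, 120, 140, 160, 180), ncol = 1)
assign(paste0("plot_", "y"), boxplot(y, plot = FALSE), envir = plots_env)

ls(plots_env)
stats_y <- get("plot_y", envir = plots_env)$stats
stats_y
```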
I have lists of unknown structure (nesting) that always terminate with a named vector. I want to replace all the periods in the list or atomic vector names with underscores. There's rapply to apply functions to list elements, but how do I apply a function over the list/atomic vector's names? I am after a base R solution, but please share all solutions for others.
MWE
x <- list(
urban = list(
cars = c('volvo', 'ford'),
food.dining = list(
local.business = c('carls'),
chain.business = c('dennys', 'panera')
)
),
rural = list(
land.use = list(
farming =list(
dairy = c('cows'),
vegie.plan = c('carrots')
)
),
social.rec = list(
community.center = c('town.square')
),
people.type = c('good', 'bad', 'in.between')
),
other.locales = c('suburban'),
missing = list(
unknown = c(),
known = c()
),
end = c('wow')
)
Desired Outcome
## $urban
## $urban$cars
## [1] "volvo" "ford"
##
## $urban$food_dining
## $urban$food_dining$local_business
## [1] "carls"
##
## $urban$food_dining$chain_business
## [1] "dennys" "panera"
##
##
##
## $rural
## $rural$land_use
## $rural$land_use$farming
## $rural$land_use$farming$dairy
## [1] "cows"
##
## $rural$land_use$farming$vegie_plan
## [1] "carrots"
##
##
##
## $rural$social_rec
## $rural$social_rec$community_center
## [1] "town.square"
##
##
## $rural$people_type
## [1] "good" "bad" "in.between"
##
##
## $other_locales
## [1] "suburban"
##
## $missing
## $missing$unknown
## NULL
##
## $missing$known
## NULL
##
##
## $end
## [1] "wow"
Here is an idea for a recursive function. It first substitutes the periods in the names with underscores. It then checks whether the class of each element is list and, if so, applies the function to that element. Any other element, such as a character vector, is left unchanged, so periods inside the values themselves (for example 'town.square') are kept. Note that this will not work if there are, for example, data.frames in the list; that would have to be added to the function as an extension. Hope this helps!
Function:
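my_func <- function(x)
{
  # Rename only when names exist; an unnamed list has names(x) == NULL
  if(!is.null(names(x)))
  {
    names(x) <- gsub('\\.','_',names(x) )
  }
  # seq_along() also copes with empty lists, where 1:length(x) would give c(1, 0)
  for(i in seq_along(x))
  {
    # Recurse only into plain lists; every other element is left untouched
    if(any(class(x[[i]])=='list'))
    {
      x[[i]] <- my_func(x[[i]])
    }
  }
  return(x)
}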
y <- my_func(x)
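For comparison, the same recursion can be written with lapply() instead of an explicit loop. This is a sketch rather than a drop-in replacement: rename_dots and the small test list are made up, and it assumes the list contains no data.frames (is.list() is TRUE for those, so they would be turned into plain lists).

```r
rename_dots <- function(node) {
  # Leaves (character vectors, NULL, numbers, ...) are returned unchanged
  if (!is.list(node)) return(node)
  # Rename this level only if it has names at all
  if (!is.null(names(node))) {
    names(node) <- gsub(".", "_", names(node), fixed = TRUE)
  }
  # lapply() keeps the (new) names and also keeps NULL elements in place
  lapply(node, rename_dots)
}

small <- list(a.b = list(c.d = "keep.dots", e = NULL), f.g = 1:2)
renamed <- rename_dots(small)
str(renamed)
```

Only the names change; the value "keep.dots" still contains its period.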
Data:
x <- list(
urban = list(
cars = c('volvo', 'ford'),
food.dining = list(
local.business = c('carls'),
chain.business = c('dennys', 'panera')
)
),
rural = list(
land.use = list(
farming =list(
dairy = c('cows'),
vegie.plan = c('carrots')
)
),
social.rec = list(
community.center = c('town.square')
),
people.type = c('good', 'bad', 'in.between')
),
other.locales = c('suburban'),
missing = list(
unknown = c(),
known = c()
),
end = c('wow')
)
Output:
str(y)
List of 5
$ urban :List of 2
..$ cars : chr [1:2] "volvo" "ford"
..$ food_dining:List of 2
.. ..$ local_business: chr "carls"
.. ..$ chain_business: chr [1:2] "dennys" "panera"
$ rural :List of 3
..$ land_use :List of 1
.. ..$ farming:List of 2
.. .. ..$ dairy : chr "cows"
.. .. ..$ vegie_plan: chr "carrots"
..$ social_rec :List of 1
.. ..$ community_center: chr "town.square"
..$ people_type: chr [1:3] "good" "bad" "in.between"
$ other_locales: chr "suburban"
$ missing :List of 2
..$ unknown: NULL
..$ known : NULL
$ end : chr "wow"
For list objects, it will rename the list and recursively call the same function for each of its elements. For character objects, it will just return the character.
library('purrr')
fix_names.list <- function(v) {
names(v) <- gsub('\\.', '_', names(v))
map(v, fix_names)
}
fix_names.default <- function(v) v
fix_names <- function(v) UseMethod('fix_names')
fix_names(x) %>% str
# List of 5
# $ urban :List of 2
# ..$ cars : chr [1:2] "volvo" "ford"
# ..$ food_dining:List of 2
# .. ..$ local_business: chr "carls"
# .. ..$ chain_business: chr [1:2] "dennys" "panera"
# $ rural :List of 3
# ..$ land_use :List of 1
# .. ..$ farming:List of 2
# .. .. ..$ dairy : chr "cows"
# .. .. ..$ vegie_plan: chr "carrots"
# ..$ social_rec :List of 1
# .. ..$ community_center: chr "town.square"
# ..$ people_type: chr [1:3] "good" "bad" "in.between"
# $ other_locales: chr "suburban"
# $ missing :List of 2
# ..$ unknown: NULL
# ..$ known : NULL
# $ end : chr "wow"
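To see which method handles which kind of node in the S3 approach above, a tiny stand-alone generic helps. node_kind and its return strings are purely illustrative; the point is that plain lists dispatch to the .list method, while character vectors and NULL fall through to .default:

```r
node_kind <- function(v) UseMethod("node_kind")
node_kind.list    <- function(v) "list: rename, then recurse"
node_kind.default <- function(v) "leaf: return unchanged"

node_kind(list(a.b = 1))        # handled by node_kind.list
node_kind(c("volvo", "ford"))   # handled by node_kind.default
node_kind(NULL)                 # handled by node_kind.default
```

This is why the NULL entries under $missing survive untouched in the output.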
Not a base-R approach, but might still be relevant as this can be done out-of-the-box with rrapply in the rrapply-package (an extension of base-rapply):
x1 <- rrapply::rrapply(
x, ## nested list
f = function(x, .xname) gsub("\\.", "_", .xname), ## new names
how = "names" ## replace names instead of content
)
str(x1)
#> List of 5
#> $ urban :List of 2
#> ..$ cars : chr [1:2] "volvo" "ford"
#> ..$ food_dining:List of 2
#> .. ..$ local_business: chr "carls"
#> .. ..$ chain_business: chr [1:2] "dennys" "panera"
#> $ rural :List of 3
#> ..$ land_use :List of 1
#> .. ..$ farming:List of 2
#> .. .. ..$ dairy : chr "cows"
#> .. .. ..$ vegie_plan: chr "carrots"
#> ..$ social_rec :List of 1
#> .. ..$ community_center: chr "town.square"
#> ..$ people_type: chr [1:3] "good" "bad" "in.between"
#> $ other_locales: chr "suburban"
#> $ missing :List of 2
#> ..$ unknown: NULL
#> ..$ known : NULL
#> $ end : chr "wow"
I was trying to convert the nested list below into a data.frame, but without luck. There are a few complications, mainly that the "results" element at position 1 is inconsistent with the one at position 2, as there is no result at position 2.
item length inconsistent across different positions
[[1]]
[[1]]$html_attributions
list()
[[1]]$results
geometry.location.lat geometry.location.lng
1 25.66544 -100.4354
id place_id
1 6ce0a030663144c8e992cbce51eb00479ef7db89 ChIJVy7b7FW9YoYRdaH2I_gOJIk
reference
1 CmRSAAAATdtVfB4Tz1aQ8GhGaw4-nRJ5lZlVNgiOR3ciF4QjmYC56bn6b7omWh1SJEWWqQQEFNXxGZndgEwSgl8sRCOtdF8aXpngUY878Q__yH4in8EMZMCIqSHLARqNgGlV4mKgEhDlvkHLXLiBW4F_KQVT83jIGhS5DJipk6PAnpPDXP2p-4X5NPuG9w
[[1]]$status
[1] "OK"
[[2]]
[[2]]$html_attributions
list()
[[2]]$results
list()
[[2]]$status
[1] "ZERO_RESULTS"
I tried the following code, but none of it works.
#1
m1 <- do.call(rbind, lapply(myDataFrames, function(y) do.call(rbind, y)))
relist(m1, skeleton = myDataFrames)
#2
relist(matrix(unlist(myDataFrames), ncol = 4, byrow = T), skeleton = myDataFrames)
#3
library(data.table)
df<-rbindlist(myDataFrames, idcol = "index")
df<-rbindlist(myDataFrames, fill=TRUE)
#4
myDataFrame <- do.call(rbind.data.frame, c(myDataFrames, list(stringsAsFactors = FALSE)))
I think I have enough of the original JSON to be able to create a reproducible example:
okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
jsons <- list(okjson, emptyjson, okjson)
From here, I'll step (slowly) through the process. I've included much of the intermediate structure for reproducibility; I apologize for the verbosity. This can easily be grouped together and/or put within a magrittr pipeline.
lists <- lapply(jsons, jsonlite::fromJSON)
str(lists)
# List of 3
# $ :List of 3
# ..$ html_attributions: list()
# ..$ results :'data.frame': 1 obs. of 1 variable:
# .. ..$ geometry:'data.frame': 1 obs. of 3 variables:
# .. .. ..$ location:'data.frame': 1 obs. of 2 variables:
# .. .. .. ..$ lat: num 25.7
# .. .. .. ..$ lon: num -100
# .. .. ..$ id : chr "foo"
# .. .. ..$ place_id: chr "quux"
# ..$ status : chr "OK"
# $ :List of 3
# ..$ html_attributions: list()
# ..$ results : list()
# ..$ status : chr "ZERO_RESULTS"
# $ :List of 3
# ..$ html_attributions: list()
# ..$ results :'data.frame': 1 obs. of 1 variable:
# .. ..$ geometry:'data.frame': 1 obs. of 3 variables:
# .. .. ..$ location:'data.frame': 1 obs. of 2 variables:
# .. .. .. ..$ lat: num 25.7
# .. .. .. ..$ lon: num -100
# .. .. ..$ id : chr "foo"
# .. .. ..$ place_id: chr "quux"
# ..$ status : chr "OK"
goodlists <- Filter(function(a) "results" %in% names(a) && length(a$results) > 0, lists)
goodresults <- lapply(goodlists, `[[`, "results")
str(goodresults)
# List of 2
# $ :'data.frame': 1 obs. of 1 variable:
# ..$ geometry:'data.frame': 1 obs. of 3 variables:
# .. ..$ location:'data.frame': 1 obs. of 2 variables:
# .. .. ..$ lat: num 25.7
# .. .. ..$ lon: num -100
# .. ..$ id : chr "foo"
# .. ..$ place_id: chr "quux"
# $ :'data.frame': 1 obs. of 1 variable:
# ..$ geometry:'data.frame': 1 obs. of 3 variables:
# .. ..$ location:'data.frame': 1 obs. of 2 variables:
# .. .. ..$ lat: num 25.7
# .. .. ..$ lon: num -100
# .. ..$ id : chr "foo"
# .. ..$ place_id: chr "quux"
goodresultsdf <- lapply(goodresults, function(a) jsonlite::flatten(as.data.frame(a)))
str(goodresultsdf)
# List of 2
# $ :'data.frame': 1 obs. of 4 variables:
# ..$ geometry.id : chr "foo"
# ..$ geometry.place_id : chr "quux"
# ..$ geometry.location.lat: num 25.7
# ..$ geometry.location.lon: num -100
# $ :'data.frame': 1 obs. of 4 variables:
# ..$ geometry.id : chr "foo"
# ..$ geometry.place_id : chr "quux"
# ..$ geometry.location.lat: num 25.7
# ..$ geometry.location.lon: num -100
We now have a list-of-data.frames, a good place to be.
do.call(rbind.data.frame, c(goodresultsdf, stringsAsFactors = FALSE))
# geometry.id geometry.place_id geometry.location.lat geometry.location.lon
# 1 foo quux 25.66544 -100.4354
# 2 foo quux 25.66544 -100.4354
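Dropping the empty results loses the information about which request each row came from. If that matters, the position can be carried along. A base-R sketch with hand-made stand-ins for the flattened results (the column names and source_index are my own choices):

```r
# Stand-ins for flattened results; the empty list() mimics ZERO_RESULTS
results <- list(
  data.frame(lat = 25.66544, lon = -100.4354, id = "foo", stringsAsFactors = FALSE),
  list(),
  data.frame(lat = 25.66544, lon = -100.4354, id = "foo", stringsAsFactors = FALSE)
)

# Keep only elements that are data.frames with at least one row
has_rows <- vapply(results, function(r) is.data.frame(r) && nrow(r) > 0, logical(1))
combined <- do.call(rbind, results[has_rows])

# Repeat each original position once per row it contributed
row_counts <- vapply(results[has_rows], nrow, integer(1))
combined$source_index <- rep(which(has_rows), times = row_counts)
combined
```

Here the rows are tagged 1 and 3, so it is clear that the second request returned nothing.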
Edit: I rewrote this question, as I have two related questions that might be answered better together...
I've got some large nested lists with nearly the same structure and without names. All items of the list have attributes and I want to assign these as names in all levels of the list. Furthermore I want to drop a needless list-level.
So this:
before <- list(list("value_1"), list(list("value_2a"), list("value_2b")), list(list("value_3a"), list("value_3b"), list("value_3c")), list("value_4"))
for(i in 1:4) attr(before[[i]], "tag") <- paste0("tag_", i)
attr(before[[2]][[1]], "code") <- "code_2a"
attr(before[[2]][[2]], "code") <- "code_2b"
attr(before[[3]][[1]], "code") <- "code_3a"
attr(before[[3]][[2]], "code") <- "code_3b"
attr(before[[3]][[3]], "code") <- "code_3c"
str(before)
## List of 4
## $ :List of 1
## ..$ : chr "value_1"
## ..- attr(*, "tag")= chr "tag_1"
## $ :List of 2
## ..$ :List of 1
## .. ..$ : chr "value_2a"
## .. ..- attr(*, "code")= chr "code_2a"
## ..$ :List of 1
## .. ..$ : chr "value_2b"
## .. ..- attr(*, "code")= chr "code_2b"
## ..- attr(*, "tag")= chr "tag_2"
## $ :List of 3
## ..$ :List of 1
## .. ..$ : chr "value_3a"
## .. ..- attr(*, "code")= chr "code_3a"
## ..$ :List of 1
## .. ..$ : chr "value_3b"
## .. ..- attr(*, "code")= chr "code_3b"
## ..$ :List of 1
## .. ..$ : chr "value_3c"
## .. ..- attr(*, "code")= chr "code_3c"
## ..- attr(*, "tag")= chr "tag_3"
## $ :List of 1
## ..$ : chr "value_4"
## ..- attr(*, "tag")= chr "tag_4"
(Note: 1st level list items have a "tag"-attribute, 2nd level items have a "code"-attribute.)
Should be this:
after <- list(tag_1="value_1", tag_2=list(code_2a="value_2a", code_2b="value_2b"), tag_3=list(code_3a="value_3a", code_3b="value_3b", code_3c="value_3c"), tag_4="value_4")
str(after)
## List of 4
## $ tag_1: chr "value_1"
## $ tag_2:List of 2
## ..$ code_2a: chr "value_2a"
## ..$ code_2b: chr "value_2b"
## $ tag_3:List of 3
## ..$ code_3a: chr "value_3a"
## ..$ code_3b: chr "value_3b"
## ..$ code_3c: chr "value_3c"
## $ tag_4: chr "value_4"
Since the lists are large, I want to avoid for loops, to get a better performance.
Got it! Three steps, but works perfectly.
# the ugly list
ugly_list <- list(list("value_1"), list(list("value_2a"), list("value_2b")), list(list("value_3a"), list("value_3b"), list("value_3c")), list("value_4"))
for(i in 1:4) attr(ugly_list[[i]], "tag") <- paste0("tag_", i)
attr(ugly_list[[2]][[1]], "code") <- "code_2a"
attr(ugly_list[[2]][[2]], "code") <- "code_2b"
attr(ugly_list[[3]][[1]], "code") <- "code_3a"
attr(ugly_list[[3]][[2]], "code") <- "code_3b"
attr(ugly_list[[3]][[3]], "code") <- "code_3c"
# set names for 1st level
level_1_named <- setNames(ugly_list, sapply(ugly_list, function(x) attributes(x)$tag))
# set names for 2nd level
level_2_named <- lapply(level_1_named, function(x) lapply(x, function(y) setNames(y, attributes(y)$code)))
# clean list
clean_list <- lapply(level_2_named, function(x) unlist(x, recursive=FALSE))
Thanks for trying. :-)
You can easily do this by recursing through the list. Try this:
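setListNames <- function(mylist){
  # Base case: a non-list object (a value) is returned unchanged
  if( !is.list(mylist) ){
    return( mylist )
  }
  # The attributes sit on the child lists, so read each child's "tag" or "code"
  keys = lapply(mylist, function(el) c(attr(el, 'tag'), attr(el, 'code'))[1])
  # lapply through all sublists and recursively call
  mylist = lapply(mylist, setListNames)
  # Drop the needless level: unwrap unnamed single-value lists
  mylist = lapply(mylist, function(el) if( is.list(el) && length(el) == 1 && is.null(names(el)) ) el[[1]] else el)
  # Name the children only when every child carried an attribute
  if( !any(vapply(keys, is.null, logical(1))) ){
    names( mylist ) = unlist(keys)
  }
  # Return named list
  return( mylist )
}
# Test run
before_named = setListNames(before)
# Check it worked
print( names( before_named ) )
print( names( before_named[[2]] ) )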
I tried to find the subset, but it shows an error:
I am performing Data Envelopment Analysis using the Benchmarking package in R.
Although I saw that similar questions were asked before, they didn't help me.
Update: structure and summary of the database
I am performing DEA for V6 and V7.
I guess you need
Large.Cap$V1[e_crs$eff > 0.85]
Using a reproducible example from ?dea
library(Benchmarking)
x <- matrix(c(100,200,300,500,100,200,600),ncol=1)
y <- matrix(c(75,100,300,400,25,50,400),ncol=1)
Large.Cap <- data.frame(v1= LETTERS[1:7], v2= 1:7, stringsAsFactors = TRUE)
e_crs <- dea(x, y, RTS='crs', ORIENTATION='in')
e_crs
#[1] 0.7500 0.5000 1.0000 0.8000 0.2500 0.2500 0.6667
The e_crs object is a list
str(e_crs)
#List of 12
# $ eff : num [1:7] 0.75 0.5 1 0.8 0.25 ...
# $ lambda : num [1:7, 1:7] 0 0 0 0 0 0 0 0 0 0 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : NULL
# .. ..$ : chr [1:7] "L1" "L2" "L3" "L4" ...
# $ objval : num [1:7] 0.75 0.5 1 0.8 0.25 ...
# $ RTS : chr "crs"
# $ primal : NULL
# $ dual : NULL
# $ ux : NULL
# $ vy : NULL
# $ gamma :function (x)
# $ ORIENTATION: chr "in"
# $ TRANSPOSE : logi FALSE
# $ param : NULL
# - attr(*, "class")= chr "Farrell"
We extract the 'eff' list element from 'e_crs' to subset the 'v1' column in 'Large.Cap' dataset.
droplevels(Large.Cap$v1[e_crs$eff > 0.85])
#[1] C
#Levels: C
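The subsetting itself is plain logical indexing and does not depend on the Benchmarking package. A sketch with the efficiency scores from the example typed in by hand (eff and firms are stand-ins for e_crs$eff and Large.Cap):

```r
# Scores copied from the e_crs printout above
eff <- c(0.75, 0.5, 1, 0.8, 0.25, 0.25, 0.6667)
firms <- data.frame(v1 = LETTERS[1:7], v2 = 1:7, stringsAsFactors = FALSE)

# One TRUE/FALSE per row; it must have the same length as the column
keep <- eff > 0.85

firms$v1[keep]   # "C"
firms[keep, ]    # the full row(s) instead of a single column
```

With a character column there are no factor levels to drop, so droplevels() is not needed here.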