Populating a list with lists (replicate, lapply) - r

How should I generate a list of lists of known size?
Currently I do it like this:
create_scout_bees <- function(search_space, num_scouts){
  gen_bee <- function(unused, sear_spac){
    create_random_bee(sear_spac)
  }
  bees <- lapply(1:num_scouts, gen_bee, search_space)
  #bees <- replicate(num_scouts, create_random_bee(search_space))
  cat('\nclass of bees is:',class(bees),'\n')
  bees
}
where create_random_bee(sear_spac) returns list(vector=random_vector(sear_spac)). This seems too complicated. I found the replicate function (see the commented-out line in the code), but it does not return the same thing. To be honest, I'm not entirely sure what it returns.
The lapply option seems to return a list of lists:
[[1]]
[[1]]$vector
[1] -3.772477 -4.178604
[[2]]
[[2]]$vector
[1] -1.237291 -2.430769
[[3]]
[[3]]$vector
[1] -2.211511 -1.352074
[[4]]
[[4]]$vector
[1] 4.102391 -1.437620
[[5]]
[[5]]$vector
[1] -0.1355444 -2.0270074
The replicate version returns a list
$vector
[1] 3.780779 3.588892
$vector
[1] -4.290371 4.098709
$vector
[1] 1.051525 -3.374406
$vector
[1] -0.4593861 -4.8412850
$vector
[1] 2.164383 -4.903347
I can index both returned values. But the second option seems to be just a list with 5 elements of type vector with the same key. When accessing by key it returns the first element.
How do you generate a list of lists of known size?

You can try
replicate(5, list(vector=rnorm(2)), simplify=FALSE)
# [[1]]
#[[1]]$vector
#[1] -1.5239454 -0.1326934
#[[2]]
#[[2]]$vector
#[1] -1.4369404 0.3701259
#[[3]]
#[[3]]$vector
#[1] 0.3251298 -1.4289498
#[[4]]
#[[4]]$vector
#[1] 0.8346002 -0.2974959
#[[5]]
#[[5]]$vector
#[1] 0.4581858 -0.8066517
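To see why the two calls in the question behave differently, here is a minimal sketch. It assumes a hypothetical make_bee() helper as a stand-in for create_random_bee(), which is not shown in the question. With the default simplification, replicate merges the one-element lists into a single flat list; simplify=FALSE keeps each result as its own list.

```r
set.seed(42)  # only to make the sketch reproducible

# Hypothetical stand-in for create_random_bee(); the real function is not shown.
make_bee <- function() list(vector = rnorm(2))

# Default simplification: the one-element lists are merged into a single
# flat list whose elements all carry the name "vector".
flat_bees <- replicate(5, make_bee())

# simplify = FALSE keeps each bee as its own list (a list of lists).
nested_bees <- replicate(5, make_bee(), simplify = FALSE)

names(flat_bees)         # "vector" repeated five times
flat_bees$vector         # `$` returns only the first matching element
nested_bees[[3]]$vector  # the third bee's vector
```

This also accounts for the behaviour noted in the question: with duplicated names, access by key only ever reaches the first element.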

Related

How to extract values from a list with multiple levels in r

I have a list that looks like this:
[[1]]
[[1]][[1]]
[[1]][[1]]$p1est.z
[1] 2.890829
[[1]][[1]]$p1se.z
[1] 0.1418367
[[1]][[2]]
[[1]][[2]]$p2est.w
[1] 4.947014
[[1]][[2]]$p2se.w
[1] 0.5986682
[[2]]
[[2]][[1]]
[[2]][[1]]$p1est.z
[1] 3.158164
[[2]][[1]]$p1se.z
[1] 0.138770
[[2]][[2]]
[[2]][[2]]$p2est.w
[1] 5.052874
[[2]][[2]]$p2se.w
[1] 0.585608
How can I extract the values of "p1est.z" from both levels? I need to compute their average.
Thanks!
Actually the unlist() function out of the box should probably work here:
output <- unlist(your_list)
output[names(output) == "p1est.z"]
p1est.z p1est.z
2.890829 3.158164
Data:
your_list <- list(
  list(list(p1est.z=2.890829, p1se.z=0.1418367),
       list(p2est.w=4.947014, p2se.w=0.5986682)),
  list(list(p1est.z=3.158164, p1se.z=0.138770),
       list(p2est.w=5.052874, p2se.w=0.585608)))
One way to do this, using Tim Biegeleisen's representation of your data, is to write a function that extracts p1est.z and apply it. Your top-level list has two elements, and in both of them the first element has a p1est.z, so you could do
fn <- function(x) { x[[1]]$p1est.z }
and then apply it
sapply(your_list, fn)
# [1] 2.890829 3.158164
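Since the goal in the question is the average, the extraction can feed straight into mean. The following is a small sketch that assumes the same structure as the data above (p1est.z always sits in the first sub-list); vapply is used instead of sapply only so that the result is guaranteed to be numeric.

```r
# Same structure as the question's list (values copied from the question).
your_list <- list(
  list(list(p1est.z = 2.890829, p1se.z = 0.1418367),
       list(p2est.w = 4.947014, p2se.w = 0.5986682)),
  list(list(p1est.z = 3.158164, p1se.z = 0.138770),
       list(p2est.w = 5.052874, p2se.w = 0.585608)))

# Pull p1est.z out of the first sub-list of every top-level element.
p1_values <- vapply(your_list, function(x) x[[1]]$p1est.z, numeric(1))

# Average of the extracted values.
p1_mean <- mean(p1_values)
p1_mean
```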

Split c() inside of a string vector

I am working with a vector of strings in R. However, when I look at the first item in the list I see this:
> uni_list[1]
[1] c("ENSMUSG00000000204", "ENSMUSG00000115878", "ENSMUSG00000116453", "ENSMUSG00000116134")
15940 Levels: c("ENSMUSG00000000204", "ENSMUSG00000115878", "ENSMUSG00000116453", "ENSMUSG00000116134")
How can I split this one in separate values?
Thanks in advance,
Juan
You can use split, i.e.
split(l3[[1]], seq(length(l3[[1]])))
$`1`
[1] "ENSMUSG00000000204"
$`2`
[1] "ENSMUSG00000115878"
$`3`
[1] "ENSMUSG00000116453"
$`4`
[1] "ENSMUSG00000116134"
where
l3
[[1]]
[1] "ENSMUSG00000000204" "ENSMUSG00000115878" "ENSMUSG00000116453" "ENSMUSG00000116134"
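For reference, a self-contained version of the split approach might look like the sketch below. It assumes, as the answer does, that the IDs are already stored as a character vector inside a list called l3; as.list is shown as an alternative that gives the same pieces without names.

```r
# Assumed input, mirroring the l3 object shown in the answer.
l3 <- list(c("ENSMUSG00000000204", "ENSMUSG00000115878",
             "ENSMUSG00000116453", "ENSMUSG00000116134"))

ids <- l3[[1]]

# One list element per ID, named "1", "2", ...
by_split <- split(ids, seq_along(ids))

# Same pieces, but as an unnamed list.
by_as_list <- as.list(ids)

by_split[["2"]]
```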

Reducing a data.tree created from List

I'm working on a shiny app which plots data trees. I'm looking to incorporate shinyTree to permit quick comparison of plotted nodes. The issue is that shinyTree returns a redundant list of lists: every selected branch is reported together with all of its shorter sub-branches.
The actual list of lists is included below. I would like to keep the longest branches only. I would also like to remove the id node (the integer node); I don't understand why it shows up at all, given the list. I have tried many different methods to work with this list, but it's been a real struggle. The list concept is difficult to understand.
I create the data.tree and plot via:
dataTree.a <- FromListSimple(checkList)
plot(dataTree.a)
> checkList
[[1]]
[[1]]$Asia
[[1]]$Asia$China
[[1]]$Asia$China$Beijing
[[1]]$Asia$China$Beijing$Round
[[1]]$Asia$China$Beijing$Round$`20383994`
[1] 0
[[2]]
[[2]]$Asia
[[2]]$Asia$China
[[2]]$Asia$China$Beijing
[[2]]$Asia$China$Beijing$Round
[1] 0
[[3]]
[[3]]$Asia
[[3]]$Asia$China
[[3]]$Asia$China$Beijing
[1] 0
[[4]]
[[4]]$Asia
[[4]]$Asia$China
[[4]]$Asia$China$Shanghai
[[4]]$Asia$China$Shanghai$Round
[[4]]$Asia$China$Shanghai$Round$`23740778`
[1] 0
[[5]]
[[5]]$Asia
[[5]]$Asia$China
[[5]]$Asia$China$Shanghai
[[5]]$Asia$China$Shanghai$Round
[1] 0
[[6]]
[[6]]$Asia
[[6]]$Asia$China
[[6]]$Asia$China$Shanghai
[1] 0
[[7]]
[[7]]$Asia
[[7]]$Asia$China
[1] 0
[[8]]
[[8]]$Asia
[[8]]$Asia$India
[[8]]$Asia$India$Delhi
[[8]]$Asia$India$Delhi$Round
[[8]]$Asia$India$Delhi$Round$`25703168`
[1] 0
[[9]]
[[9]]$Asia
[[9]]$Asia$India
[[9]]$Asia$India$Delhi
[[9]]$Asia$India$Delhi$Round
[1] 0
[[10]]
[[10]]$Asia
[[10]]$Asia$India
[[10]]$Asia$India$Delhi
[1] 0
[[11]]
[[11]]$Asia
[[11]]$Asia$India
[1] 0
[[12]]
[[12]]$Asia
[[12]]$Asia$Japan
[[12]]$Asia$Japan$Tokyo
[[12]]$Asia$Japan$Tokyo$Round
[[12]]$Asia$Japan$Tokyo$Round$`38001000`
[1] 0
[[13]]
[[13]]$Asia
[[13]]$Asia$Japan
[[13]]$Asia$Japan$Tokyo
[[13]]$Asia$Japan$Tokyo$Round
[1] 0
[[14]]
[[14]]$Asia
[[14]]$Asia$Japan
[[14]]$Asia$Japan$Tokyo
[1] 0
[[15]]
[[15]]$Asia
[[15]]$Asia$Japan
[1] 0
[[16]]
[[16]]$Asia
[1] 0
Well, I did cobble together a poor hack to make this work. Here is what I did to the 'checkList' list:
# `tree` and `vals` come from the surrounding Shiny server code
# (presumably the shinyTree input and a reactiveValues object).
library(shinyTree)  # get_selected()
library(stringr)    # str_replace_all()
checkList <- get_selected(tree, format = "slices")
# Convert and collapse shinyTree slices to data.tree.
# This is a bit of a kludge to make the graphic work with
# shinyTree; an alternative one-liner is in the works.
# The transform finds the longest branches and plots only
# those, since the other branches are subsets of them
# (a consequence of the slices format).
# Extract the checkList names (as characters) from the checkList
tmp <- names(unlist(checkList))
# Determine the depth (number of name components) of each checkList name
lens <- lapply(tmp, function(x) length(strsplit(x, ".", fixed=TRUE)[[1]]))
# Find the indices of the elements with the greatest depth
lens.max <- which(lens == max(sapply(lens, max)))
# Replace all '.' with '/' in preparation for the data.frame conversion
tmp <- relist(str_replace_all(tmp, "\\.", "/"), skeleton=tmp)
# Add a root node to work with multiple branches
tmp <- unlist(lapply(tmp, function(x) paste0("Root/", x)))
# Create a list of only the longest branches
longBranches <- as.list(tmp[lens.max])
# Convert the list into a data.frame for conversion to a data.tree
longBranches.df <- data.frame(pathString = do.call(rbind, longBranches))
# Publish the data.frame for use
vals$selDF <- longBranches.df
#save(checkList, file = "chkLists.RData") # Save for troubleshooting
print(vals$selDF)
The new checkList looks like this:
[1] "Root/Europe/France/Paris/Round/10843285" "Root/Europe/France/Paris/Round"
[3] "Root/Europe/France/Paris" "Root/Europe/France"
[5] "Root/Europe/Germany/Berlin/Diamond/3563194" "Root/Europe/Germany/Berlin/Diamond"
[7] "Root/Europe/Germany/Berlin/Round/3563194" "Root/Europe/Germany/Berlin/Round"
[9] "Root/Europe/Germany/Berlin" "Root/Europe/Germany"
[11] "Root/Europe/Italy/Rome/Round/3717956" "Root/Europe/Italy/Rome/Round"
[13] "Root/Europe/Italy/Rome" "Root/Europe/Italy"
[15] "Root/Europe/United Kingdom/London/Round/10313307" "Root/Europe/United Kingdom/London/Round"
[17] "Root/Europe/United Kingdom/London" "Root/Europe/United Kingdom"
[19] "Root/Europe"
It works :) but I think this could be done with a two-liner. I'll work on it again in a week or so. Any other ideas would be appreciated.
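As one possible shortening, the longest-branch filter can be sketched in base R alone. The checkList below is a hand-built, reduced stand-in for the shinyTree output shown earlier (not real app output), and the sketch assumes that node names contain no "." characters, since "." is what unlist uses to join the name components.

```r
# Reduced, hand-built stand-in for the slices returned by shinyTree.
checkList <- list(
  list(Asia = list(China = list(Beijing = list(Round = list(`20383994` = 0))))),
  list(Asia = list(China = list(Beijing = list(Round = 0)))),
  list(Asia = list(China = list(Shanghai = list(Round = list(`23740778` = 0))))),
  list(Asia = list(China = 0))
)

# unlist() joins the nested names with ".", giving one path per slice.
paths <- names(unlist(checkList))

# Depth of each path = number of "."-separated components.
depth <- lengths(strsplit(paths, ".", fixed = TRUE))

# Keep only the deepest paths and rewrite them as "Root/..." path strings.
longest <- paths[depth == max(depth)]
path_strings <- paste0("Root/", gsub(".", "/", longest, fixed = TRUE))

# A data.frame with a pathString column, as built in the code above.
long_branches_df <- data.frame(pathString = path_strings)
long_branches_df
```

Note that, like the original code, this keeps only branches of the maximum depth, so a complete but shallower branch would be dropped.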

How to use unlist with nested lapply in R [duplicate]

I am working on a difficult function. Giving an example of the function itself is very hard, so I have tried to give an example that is very close to my problem. I would like to get the output as a flat list instead of a list of lists.
Input
x <- list(rnorm(10,2,3), rnorm(10,3,4))
y <- list(rnorm(10,4,5), rnorm(10,5,6))
z <- list(x, y)
xy <- lapply(seq_along(z), function(i) {
  lapply(seq_along( z[[i]]), function(j) {
    x[[i]][[j]]*z[[i]][[j]]
  })
})
unlist(xy)
The Output
xy
[[1]]
[[1]][[1]]
[1] 2.2280230 -4.9779716 4.1359718 10.3939970 -5.2133243 -1.2696787 0.5000506 4.7157700 7.8720780 7.0678141
[[1]][[2]]
[1] -14.950644 -7.263222 -6.586231 9.762505 -4.686088 4.259647 -3.579593 -7.341470 -13.626069 4.979983
[[2]]
[[2]][[1]]
[1] 3.2567110 18.8390907 32.7599898 16.5438238 10.7631826 35.8007750 7.0666637 -9.0148408 -2.5030033 -0.6119803
[[2]][[2]]
[1] 26.766508 9.292216 8.767470 20.690148 20.456934 22.686122 1.981408 1.763479 9.060410 35.391961
expected Output
xy
[[1]]
[1] 2.2280230 -4.9779716 4.1359718 10.3939970 -5.2133243 -1.2696787 0.5000506 4.7157700 7.8720780 7.0678141
[[2]]
[1] -14.950644 -7.263222 -6.586231 9.762505 -4.686088 4.259647 -3.579593 -7.341470 -13.626069 4.979983
[[3]]
[1] 3.2567110 18.8390907 32.7599898 16.5438238 10.7631826 35.8007750 7.0666637 -9.0148408 -2.5030033 -0.6119803
[[4]]
[1] 26.766508 9.292216 8.767470 20.690148 20.456934 22.686122 1.981408 1.763479 9.060410 35.391961
I tried unlist but it gave me a vector.
Use unlist(xy, recursive = FALSE).
This stops unlist from descending into the components of the list, so only the top level of nesting is removed.
The output is:
[[1]]
[1] 0.27862974 1.47723685 -1.82963782 3.47664717 0.62645954 1.67429065 -0.06359767 -1.21542539 1.65609366 2.65336458
[[2]]
[1] 1.167232 3.318266 5.949589 -18.459982 -5.321955 7.810067 -12.792953 2.723463 9.934529 16.385867
[[3]]
[1] 5.4596367 1.3340797 4.8059125 -0.2578762 1.2808736 2.6462153 -3.6259595 1.4900160 -0.1496829 -0.8140339
[[4]]
[1] 13.130614 2.957532 2.270956 1.015446 -3.254110 -4.939529 1.465290 -3.141455 5.803487 15.114528
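The difference between the two unlist calls is easy to check on a small example. The sketch below uses a tiny hand-made nested list in place of the random data from the question.

```r
# Small stand-in for xy: a list of two lists, each holding two numeric vectors.
xy <- list(list(c(1, 2), c(3, 4)),
           list(c(5, 6), c(7, 8)))

# Default (recursive = TRUE): everything is collapsed into one numeric vector.
flat_vector <- unlist(xy)

# recursive = FALSE: only the outer level is removed, leaving a list of vectors.
flat_list <- unlist(xy, recursive = FALSE)

length(flat_vector)  # 8 numbers
length(flat_list)    # 4 list elements
```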
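You can also do the following with purrr (note that in purrr 1.0.0 and later, flatten() is deprecated in favour of list_flatten()):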
library(purrr)
flatten(xy)
I think this is what you wanted, but let me know if otherwise.

Why does sapply of an ordered list output my content twice

I stored a list of files in a list using this code:
filesList <- list.files(path="/Users/myPath/data/", pattern="*.csv")
I then wanted to output it without the indexes (which usually appear in the form [1] at the start of each line), so I tried this:
sapply(filesList[order(filesList)], print)
The result is given below copied exactly from RStudio. Why does my list of files output twice? I can work with this, I am just curious.
[1] "IMDB_Bottom250movies.csv"
[1] "IMDB_Bottom250movies2_OMDB_Detailed.csv"
[1] "IMDB_Bottom250movies2.csv"
[1] "IMDB_ErrorLogIDs1_OMDB_Detailed.csv"
[1] "IMDB_ErrorLogIDs1.csv"
[1] "IMDB_ErrorLogIDs2_OMDB_Detailed.csv"
[1] "IMDB_ErrorLogIDs2.csv"
[1] "IMDB_OMDB_Kaggle_TestSet_OMDB_Detailed.csv"
[1] "IMDB_OMDB_Kaggle_TestSet.csv"
[1] "IMDB_Top250Engmovies.csv"
[1] "IMDB_Top250Engmovies2_OMDB_Detailed.csv"
[1] "IMDB_Top250Engmovies2.csv"
[1] "IMDB_Top250Indianmovies.csv"
[1] "IMDB_Top250Indianmovies2_OMDB_Detailed.csv"
[1] "IMDB_Top250Indianmovies2.csv"
[1] "IMDB_Top250movies.csv"
[1] "IMDB_Top250movies2_OMDB_Detailed.csv"
[1] "IMDB_Top250movies2.csv"
[1] "TestDoc2_KaggleData_OMDB_Detailed.csv"
[1] "TestDoc2_KaggleData.csv"
[1] "TestDoc2_KaggleData68_OMDB_Detailed.csv"
[1] "TestDoc2_KaggleData68.csv"
[1] "TestDoc2_KaggleDataHUGE_OMDB_Detailed.csv"
[1] "TestDoc2_KaggleDataHUGE.csv"
IMDB_Bottom250movies.csv IMDB_Bottom250movies2_OMDB_Detailed.csv
"IMDB_Bottom250movies.csv" "IMDB_Bottom250movies2_OMDB_Detailed.csv"
IMDB_Bottom250movies2.csv IMDB_ErrorLogIDs1_OMDB_Detailed.csv
"IMDB_Bottom250movies2.csv" "IMDB_ErrorLogIDs1_OMDB_Detailed.csv"
IMDB_ErrorLogIDs1.csv IMDB_ErrorLogIDs2_OMDB_Detailed.csv
"IMDB_ErrorLogIDs1.csv" "IMDB_ErrorLogIDs2_OMDB_Detailed.csv"
IMDB_ErrorLogIDs2.csv IMDB_OMDB_Kaggle_TestSet_OMDB_Detailed.csv
"IMDB_ErrorLogIDs2.csv" "IMDB_OMDB_Kaggle_TestSet_OMDB_Detailed.csv"
IMDB_OMDB_Kaggle_TestSet.csv IMDB_Top250Engmovies.csv
"IMDB_OMDB_Kaggle_TestSet.csv" "IMDB_Top250Engmovies.csv"
IMDB_Top250Engmovies2_OMDB_Detailed.csv IMDB_Top250Engmovies2.csv
"IMDB_Top250Engmovies2_OMDB_Detailed.csv" "IMDB_Top250Engmovies2.csv"
IMDB_Top250Indianmovies.csv IMDB_Top250Indianmovies2_OMDB_Detailed.csv
"IMDB_Top250Indianmovies.csv" "IMDB_Top250Indianmovies2_OMDB_Detailed.csv"
IMDB_Top250Indianmovies2.csv IMDB_Top250movies.csv
"IMDB_Top250Indianmovies2.csv" "IMDB_Top250movies.csv"
IMDB_Top250movies2_OMDB_Detailed.csv IMDB_Top250movies2.csv
"IMDB_Top250movies2_OMDB_Detailed.csv" "IMDB_Top250movies2.csv"
TestDoc2_KaggleData_OMDB_Detailed.csv TestDoc2_KaggleData.csv
"TestDoc2_KaggleData_OMDB_Detailed.csv" "TestDoc2_KaggleData.csv"
TestDoc2_KaggleData68_OMDB_Detailed.csv TestDoc2_KaggleData68.csv
"TestDoc2_KaggleData68_OMDB_Detailed.csv" "TestDoc2_KaggleData68.csv"
TestDoc2_KaggleDataHUGE_OMDB_Detailed.csv TestDoc2_KaggleDataHUGE.csv
"TestDoc2_KaggleDataHUGE_OMDB_Detailed.csv" "TestDoc2_KaggleDataHUGE.csv"
The second copy (without the indexes) is close enough to copy, paste, and use; I'm just wondering why this happened.
What is happening here is that sapply calls print on each element of filesList[order(filesList)], which prints each file name to the screen. The console (here RStudio) then auto-prints the return value of the sapply call itself: because print returns its argument, this is a named character vector holding the same file names. You can use cat to print values without the [1], or wrap sapply in invisible to suppress its output. https://stackoverflow.com/a/12985020/6490232
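Both suggestions can be tried on a short vector. The file names below are made up for illustration; sort(x) is used as a shorthand for x[order(x)].

```r
# Hypothetical file names standing in for the result of list.files().
filesList <- c("b.csv", "a.csv", "c.csv")
sorted_files <- sort(filesList)

# sapply() over a character vector returns a named character vector;
# this return value is what gets auto-printed after the print() calls.
result <- sapply(sorted_files, function(f) f)

# Option 1: print one name per line, without the [1] prefix.
cat(sorted_files, sep = "\n")

# Option 2: keep the print() calls but suppress the auto-printed return value.
invisible(sapply(sorted_files, print))
```

writeLines(sorted_files) is another base R way to get one plain line per element.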
