Reducing a data.tree created from List - r

I'm working on a Shiny app that plots data trees, and I want to incorporate shinyTree so that plotted nodes can be compared quickly. The problem is that shinyTree returns a redundant list of lists for the plotted sub-nodes.
The actual list of lists is included below. I would like to keep only the longest branches. I would also like to remove the id node (the integer node); I don't understand why it shows up at all, given the list. I have tried many different ways of working with this list without success, and I find nested lists difficult to reason about.
I create the data.tree and plot via:
dataTree.a <- FromListSimple(checkList)
plot(dataTree.a)
> checkList
[[1]]
[[1]]$Asia
[[1]]$Asia$China
[[1]]$Asia$China$Beijing
[[1]]$Asia$China$Beijing$Round
[[1]]$Asia$China$Beijing$Round$`20383994`
[1] 0
[[2]]
[[2]]$Asia
[[2]]$Asia$China
[[2]]$Asia$China$Beijing
[[2]]$Asia$China$Beijing$Round
[1] 0
[[3]]
[[3]]$Asia
[[3]]$Asia$China
[[3]]$Asia$China$Beijing
[1] 0
[[4]]
[[4]]$Asia
[[4]]$Asia$China
[[4]]$Asia$China$Shanghai
[[4]]$Asia$China$Shanghai$Round
[[4]]$Asia$China$Shanghai$Round$`23740778`
[1] 0
[[5]]
[[5]]$Asia
[[5]]$Asia$China
[[5]]$Asia$China$Shanghai
[[5]]$Asia$China$Shanghai$Round
[1] 0
[[6]]
[[6]]$Asia
[[6]]$Asia$China
[[6]]$Asia$China$Shanghai
[1] 0
[[7]]
[[7]]$Asia
[[7]]$Asia$China
[1] 0
[[8]]
[[8]]$Asia
[[8]]$Asia$India
[[8]]$Asia$India$Delhi
[[8]]$Asia$India$Delhi$Round
[[8]]$Asia$India$Delhi$Round$`25703168`
[1] 0
[[9]]
[[9]]$Asia
[[9]]$Asia$India
[[9]]$Asia$India$Delhi
[[9]]$Asia$India$Delhi$Round
[1] 0
[[10]]
[[10]]$Asia
[[10]]$Asia$India
[[10]]$Asia$India$Delhi
[1] 0
[[11]]
[[11]]$Asia
[[11]]$Asia$India
[1] 0
[[12]]
[[12]]$Asia
[[12]]$Asia$Japan
[[12]]$Asia$Japan$Tokyo
[[12]]$Asia$Japan$Tokyo$Round
[[12]]$Asia$Japan$Tokyo$Round$`38001000`
[1] 0
[[13]]
[[13]]$Asia
[[13]]$Asia$Japan
[[13]]$Asia$Japan$Tokyo
[[13]]$Asia$Japan$Tokyo$Round
[1] 0
[[14]]
[[14]]$Asia
[[14]]$Asia$Japan
[[14]]$Asia$Japan$Tokyo
[1] 0
[[15]]
[[15]]$Asia
[[15]]$Asia$Japan
[1] 0
[[16]]
[[16]]$Asia
[1] 0

Well, I did cobble together a poor hack to make this work. Here is what I did to the 'checkList' list:
checkList <- get_selected(tree, format = "slices")
# Convert and collapse shinyTree slices to data.tree.
# This is a bit of a kludge to make the graphic work with
# shinyTree; an alternative one-liner is in the works.
# This transform works by finding the longest branches
# and plotting only those, since the other branches are
# subsets of them (a consequence of the slices format).
# Extract the checkList names (as character strings) from checkList
tmp <- names(unlist(checkList))
# Determine the depth of each name (number of "."-separated parts)
lens <- lengths(strsplit(tmp, ".", fixed=TRUE))
# Find the indices of the elements with the greatest depth
lens.max <- which(lens == max(lens))
# Replace every '.' with '/' in preparation for the data.frame conversion
# (str_replace_all() comes from the stringr package)
tmp <- relist(str_replace_all(tmp, "\\.", "/"), skeleton=tmp)
# Add a root node to work with multiple branches
tmp <- unlist(lapply(tmp, function(x) paste0("Root/", x)))
# Create a list of only the longest branches
longBranches <- as.list(tmp[lens.max])
# Convert the list into a data.frame for convert
longBranches.df <- data.frame(pathString = do.call(rbind, longBranches))
# Publish the data.frame for use
vals$selDF <- longBranches.df
#save(checkList, file = "chkLists.RData") # Save for troubleshooting
print(vals$selDF)
The new checkList looks like this:
[1] "Root/Europe/France/Paris/Round/10843285" "Root/Europe/France/Paris/Round"
[3] "Root/Europe/France/Paris" "Root/Europe/France"
[5] "Root/Europe/Germany/Berlin/Diamond/3563194" "Root/Europe/Germany/Berlin/Diamond"
[7] "Root/Europe/Germany/Berlin/Round/3563194" "Root/Europe/Germany/Berlin/Round"
[9] "Root/Europe/Germany/Berlin" "Root/Europe/Germany"
[11] "Root/Europe/Italy/Rome/Round/3717956" "Root/Europe/Italy/Rome/Round"
[13] "Root/Europe/Italy/Rome" "Root/Europe/Italy"
[15] "Root/Europe/United Kingdom/London/Round/10313307" "Root/Europe/United Kingdom/London/Round"
[17] "Root/Europe/United Kingdom/London" "Root/Europe/United Kingdom"
[19] "Root/Europe"
It works :) but I think this could be done in two lines. I'll come back to it in a week or so. Any other ideas would be appreciated.
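A shorter route can be sketched in base R under a few assumptions. The `checkList` below is a small hand-built stand-in for the slices shown in the question, node names are assumed to contain no ".", and the id node is recognised only because its name consists entirely of digits. Unlike the maximum-depth filter above, this keeps every path that is not a prefix of another path, so branches of different depths survive.

```r
# Hypothetical stand-in for get_selected(tree, format = "slices")
checkList <- list(
  list(Asia = list(China = list(Beijing = list(Round = list(`20383994` = 0))))),
  list(Asia = list(China = list(Beijing = list(Round = 0)))),
  list(Asia = list(China = list(Beijing = 0))),
  list(Asia = list(China = 0)),
  list(Asia = list(India = list(Delhi = list(Round = list(`25703168` = 0))))),
  list(Asia = list(India = 0)),
  list(Asia = 0)
)

# One "/"-separated path per slice (assumes no "." inside node names)
paths <- gsub(".", "/", names(unlist(checkList)), fixed = TRUE)

# A path is redundant if some other path continues it
is_prefix <- vapply(
  paths,
  function(p) any(startsWith(paths, paste0(p, "/"))),
  logical(1)
)
leaves <- paths[!is_prefix]

# Drop a trailing all-digit node (the id) and add a common root
leaves <- unique(sub("/[0-9]+$", "", leaves))
longBranches.df <- data.frame(pathString = paste0("Root/", leaves))
print(longBranches.df)
```

The resulting data frame has the same `pathString` column as the one built in the answer above, so it can be handed to the same plotting step.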

Related

How to extract values from a list with multiple levels in r

I have a list that looks like this:
[[1]]
[[1]][[1]]
[[1]][[1]]$p1est.z
[1] 2.890829
[[1]][[1]]$p1se.z
[1] 0.1418367
[[1]][[2]]
[[1]][[2]]$p2est.w
[1] 4.947014
[[1]][[2]]$p2se.w
[1] 0.5986682
[[2]]
[[2]][[1]]
[[2]][[1]]$p1est.z
[1] 3.158164
[[2]][[1]]$p1se.z
[1] 0.138770
[[2]][[2]]
[[2]][[2]]$p2est.w
[1] 5.052874
[[2]][[2]]$p2se.w
[1] 0.585608
How can I extract the values of "p1est.z" from both levels? I need to compute their average.
Thanks!
Actually the unlist() function out of the box should probably work here:
output <- unlist(your_list)
output[names(output) == "p1est.z"]
p1est.z p1est.z
2.890829 3.158164
Data:
your_list <- list(
  list(list(p1est.z=2.890829, p1se.z=0.1418367),
       list(p2est.w=4.947014, p2se.w=0.5986682)),
  list(list(p1est.z=3.158164, p1se.z=0.138770),
       list(p2est.w=5.052874, p2se.w=0.585608)))
One way to do this, using Tim Biegeleisen's representation of your data, is to write a function that extracts p1est.z and apply it. Your top-level list has two elements, and in both of them the first element has a p1est.z, so you could do
fn <- function(x) { x[[1]]$p1est.z }
and then apply it
sapply(your_list, fn)
# [1] 2.890829 3.158164
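Since the question asks for the average, either approach can be followed by `mean()`. The sketch below rebuilds the list from the values printed in the question and uses `vapply()` instead of `sapply()` so that the result is guaranteed to be numeric; that substitution is a stylistic choice, not part of the original answers.

```r
# Data rebuilt from the values printed in the question
your_list <- list(
  list(list(p1est.z = 2.890829, p1se.z = 0.1418367),
       list(p2est.w = 4.947014, p2se.w = 0.5986682)),
  list(list(p1est.z = 3.158164, p1se.z = 0.138770),
       list(p2est.w = 5.052874, p2se.w = 0.585608))
)

# Pull p1est.z out of the first sub-list of every top-level element
p1 <- vapply(your_list, function(x) x[[1]]$p1est.z, numeric(1))

# Average of the extracted estimates
p1_mean <- mean(p1)
print(p1_mean)
```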

Split c() inside of a string vector

I am working with a vector of strings in r. However, when I see the first item in the list I see this:
> uni_list[1]
[1] c("ENSMUSG00000000204", "ENSMUSG00000115878", "ENSMUSG00000116453", "ENSMUSG00000116134")
15940 Levels: c("ENSMUSG00000000204", "ENSMUSG00000115878", "ENSMUSG00000116453", "ENSMUSG00000116134")
How can I split this one in separate values?
Thanks in advance,
Juan
You can use split, i.e.
split(l3[[1]], seq(length(l3[[1]])))
$`1`
[1] "ENSMUSG00000000204"
$`2`
[1] "ENSMUSG00000115878"
$`3`
[1] "ENSMUSG00000116453"
$`4`
[1] "ENSMUSG00000116134"
where
l3
[[1]]
[1] "ENSMUSG00000000204" "ENSMUSG00000115878" "ENSMUSG00000116453" "ENSMUSG00000116134"
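The "Levels:" line in the question suggests that `uni_list` may actually be a factor whose labels are the literal text `c("...", "...")`, not a list of character vectors. If that is the case (an assumption, since the question does not show how `uni_list` was built), the IDs can be pulled out of each label with a regular expression. A minimal sketch:

```r
# Hypothetical reconstruction: a factor whose single label is the text of a c() call
s <- 'c("ENSMUSG00000000204", "ENSMUSG00000115878", "ENSMUSG00000116453", "ENSMUSG00000116134")'
uni_list <- factor(s)

# Work on the labels as plain strings
x <- as.character(uni_list)

# Extract every double-quoted token, then strip the quotes
ids <- regmatches(x, gregexpr('"[^"]*"', x))
ids <- lapply(ids, function(v) gsub('"', "", v, fixed = TRUE))

# One character vector of IDs per original element
print(ids[[1]])
```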

How to use unlist with nested lapply in R [duplicate]

This question already has answers here:
How to remove a level of lists from a list of lists
(2 answers)
Closed 4 years ago.
I am working on a difficult function. It is hard to share the function itself, so I have written an example that is very close to my problem. I would like to get the output as a list instead of a list of lists.
Input
x <- list(rnorm(10,2,3), rnorm(10,3,4))
y <- list(rnorm(10,4,5), rnorm(10,5,6))
z <- list(x, y)
xy <- lapply(seq_along(z), function(i) {
lapply(seq_along( z[[i]]), function(j) {
x[[i]][[j]]*z[[i]][[j]]
})
})
unlist(xy)
The Output
xy
[[1]]
[[1]][[1]]
[1] 2.2280230 -4.9779716 4.1359718 10.3939970 -5.2133243 -1.2696787 0.5000506 4.7157700 7.8720780 7.0678141
[[1]][[2]]
[1] -14.950644 -7.263222 -6.586231 9.762505 -4.686088 4.259647 -3.579593 -7.341470 -13.626069 4.979983
[[2]]
[[2]][[1]]
[1] 3.2567110 18.8390907 32.7599898 16.5438238 10.7631826 35.8007750 7.0666637 -9.0148408 -2.5030033 -0.6119803
[[2]][[2]]
[1] 26.766508 9.292216 8.767470 20.690148 20.456934 22.686122 1.981408 1.763479 9.060410 35.391961
expected Output
xy
[[1]]
[1] 2.2280230 -4.9779716 4.1359718 10.3939970 -5.2133243 -1.2696787 0.5000506 4.7157700 7.8720780 7.0678141
[[2]]
[1] -14.950644 -7.263222 -6.586231 9.762505 -4.686088 4.259647 -3.579593 -7.341470 -13.626069 4.979983
[[3]]
[1] 3.2567110 18.8390907 32.7599898 16.5438238 10.7631826 35.8007750 7.0666637 -9.0148408 -2.5030033 -0.6119803
[[4]]
[1] 26.766508 9.292216 8.767470 20.690148 20.456934 22.686122 1.981408 1.763479 9.060410 35.391961
I tried unlist but it gave me a vector.
Use unlist(xy, recursive = FALSE).
This prevents unlist() from being applied recursively to the components of the list, so only the outer level of nesting is removed.
The output is:
[[1]]
[1] 0.27862974 1.47723685 -1.82963782 3.47664717 0.62645954 1.67429065 -0.06359767 -1.21542539 1.65609366 2.65336458
[[2]]
[1] 1.167232 3.318266 5.949589 -18.459982 -5.321955 7.810067 -12.792953 2.723463 9.934529 16.385867
[[3]]
[1] 5.4596367 1.3340797 4.8059125 -0.2578762 1.2808736 2.6462153 -3.6259595 1.4900160 -0.1496829 -0.8140339
[[4]]
[1] 13.130614 2.957532 2.270956 1.015446 -3.254110 -4.939529 1.465290 -3.141455 5.803487 15.114528
You can do the following:
library(purrr)
flatten(xy)
I think this is what you wanted, but let me know if otherwise.
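The difference between the two forms of `unlist()` is easier to see on a small deterministic list. The sketch below uses made-up integer vectors in place of the random data from the question.

```r
# A list of two lists, each holding two vectors (made-up data)
xy <- list(list(1:3, 4:6), list(7:9, 10:12))

# Default: everything is collapsed into one atomic vector
all_flat <- unlist(xy)

# recursive = FALSE: only the outer level is removed, the vectors stay intact
one_level <- unlist(xy, recursive = FALSE)

# Base R alternative that also removes exactly one level
one_level_c <- do.call(c, xy)

str(one_level)
```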

Saving dates in a matrix ("origin must be supplied") with r

I am writing my bachelor thesis and don't have much experience with R so far.
My problem is that my dates, which I created with this command:
t<-strptime(x, "%d.%m.%Y %H.%M")
stop working when I save them in a matrix together with the other information for those dates.
I am a bit confused, because everything works fine as long as I don't put them in a matrix, for example t[1:10].
But this happens as soon as I try to save them in a matrix:
matrix1<-matrix(c(t,v2,v3,v4),nrow=length(v2))
Fehler in as.POSIXct.numeric(X[[i]], ...) : 'origin' muss angegeben werden
The message is in German; it means "'origin' must be supplied".
Any ideas what I have to do to fix it? I am a bit frustrated :)
Roland is right. You can't have POSIXlt objects in a matrix. What you can do is save the dates as numeric timestamps in the matrix and convert them back to dates when you access them.
Converting to a numeric timestamp:
> date <- as.numeric(as.POSIXct("2014-02-16 2:13:46 UTC"))
> date
[1] 1392545626
(The trailing "UTC" in the string is not parsed, so the number depends on the session time zone; pass tz = "UTC" to as.POSIXct() to pin it down.)
Then save those timestamps in the matrix as you do now. To convert one back to a date, call as.POSIXct() on the stored number and supply the origin, for example as.POSIXct(date, origin = "1970-01-01").
t (a terrible name, by the way, as it is easily confused with the t function) is a POSIXlt object, which internally is a list. First you should check what c(t,v2,v3,v4) returns (I don't know how v2 etc. are defined).
Then we can look into the documentation in help("matrix"):
data
an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded.
The important bit is "all attributes discarded". This is what you get if you discard the attributes (which include the class attribute) of a POSIXlt object:
x <- strptime(c("2016-05-09 12:00:00", "2016-05-09 13:00:00"), format = "%Y-%m-%d %H:%M:%S")
attributes(x) <- NULL
print(x)
# [[1]]
# [1] 0 0
#
# [[2]]
# [1] 0 0
#
# [[3]]
# [1] 12 13
#
# [[4]]
# [1] 9 9
#
# [[5]]
# [1] 4 4
#
# [[6]]
# [1] 116 116
#
# [[7]]
# [1] 1 1
#
# [[8]]
# [1] 129 129
#
# [[9]]
# [1] 1 1
#
# [[10]]
# [1] "CEST" "CEST"
#
# [[11]]
# [1] NA NA
A matrix can't contain POSIXlt objects (or any objects, i.e., anything with an explicit class).
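Putting both answers together, one possible workflow is sketched below. The sample dates, the `v2` column, and the choice of UTC are assumptions made for illustration; the question does not show the real data. The sketch stores the times as seconds since the epoch in a numeric matrix, converts them back on access, and also shows a data frame, which can hold a POSIXct column directly.

```r
# Made-up input in the format used in the question
x  <- c("09.05.2016 12.00", "09.05.2016 13.00")
v2 <- c(1.5, 2.5)

# strptime() returns POSIXlt (a list); fix the time zone to get reproducible numbers
t_parsed <- strptime(x, "%d.%m.%Y %H.%M", tz = "UTC")

# POSIXct is a single number per date: seconds since 1970-01-01 UTC
secs <- as.numeric(as.POSIXct(t_parsed))

# A numeric matrix can hold the timestamps next to the other columns
m <- cbind(time = secs, v2 = v2)

# Convert back to date-times when reading from the matrix
back <- as.POSIXct(m[, "time"], origin = "1970-01-01", tz = "UTC")
print(format(back, "%d.%m.%Y %H.%M"))

# Alternative: a data frame keeps the class of the date column
df <- data.frame(time = as.POSIXct(t_parsed), v2 = v2)
str(df)
```

A data frame is usually the more convenient container when the columns have different types, because no manual conversion is needed on access.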

Populating a list with lists (replicate, lapply)

How should I generate a list of lists of known size?
Currently I do it like this:
create_scout_bees <- function(search_space, num_scouts){
  gen_bee <- function(unused, sear_spac){
    create_random_bee(sear_spac)
  }
  bees <- lapply(1:num_scouts, gen_bee, search_space)
  #bees <- replicate(num_scouts, create_random_bee(search_space))
  cat('\nclass of bees is:',class(bees),'\n')
  bees
}
where create_random_bee(sear_spac) ends with return(list(vector=random_vector(search_space))). This seems too complicated. I found the replicate function (see the comment in the code), but it does not return the same thing. To be honest, I'm not entirely sure what it returns.
The lapply option seems to return a list of lists:
[[1]]
[[1]]$vector
[1] -3.772477 -4.178604
[[2]]
[[2]]$vector
[1] -1.237291 -2.430769
[[3]]
[[3]]$vector
[1] -2.211511 -1.352074
[[4]]
[[4]]$vector
[1] 4.102391 -1.437620
[[5]]
[[5]]$vector
[1] -0.1355444 -2.0270074
The replicate version returns a flat list:
$vector
[1] 3.780779 3.588892
$vector
[1] -4.290371 4.098709
$vector
[1] 1.051525 -3.374406
$vector
[1] -0.4593861 -4.8412850
$vector
[1] 2.164383 -4.903347
I can index both results, but the second seems to be just a flat list of 5 numeric vectors that all share the same name. Accessing it by name returns only the first element.
How do you generate a list of lists of known size?
You can try
replicate(5, list(vector=rnorm(2)), simplify=FALSE)
# [[1]]
#[[1]]$vector
#[1] -1.5239454 -0.1326934
#[[2]]
#[[2]]$vector
#[1] -1.4369404 0.3701259
#[[3]]
#[[3]]$vector
#[1] 0.3251298 -1.4289498
#[[4]]
#[[4]]$vector
#[1] 0.8346002 -0.2974959
#[[5]]
#[[5]]$vector
#[1] 0.4581858 -0.8066517
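Applied to the function from the question, the same idea might look like the sketch below. `create_random_bee()` and the two-column `search_space` matrix (lower and upper bounds per dimension) are hypothetical stand-ins, since the question does not show them.

```r
# Hypothetical helper: one random point inside the bounds of search_space
create_random_bee <- function(search_space) {
  list(vector = runif(nrow(search_space), search_space[, 1], search_space[, 2]))
}

# simplify = FALSE keeps each result as its own list, like the lapply() version
create_scout_bees <- function(search_space, num_scouts) {
  replicate(num_scouts, create_random_bee(search_space), simplify = FALSE)
}

# Assumed layout: one row per dimension, columns = lower and upper bound
search_space <- matrix(c(-5, -5, 5, 5), ncol = 2)

bees <- create_scout_bees(search_space, 5)
str(bees[[1]])
```

With the default `simplify` setting, `replicate()` tries to collapse the results the way `sapply()` does, which is why the question saw a flat list whose elements were all named `vector`.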

Resources