How to access attributes of a dendrogram in R

How to access attributes of a dendrogram in R - r

From a dendrogram which i created with
hc<-hclust(kk)
hcd<-as.dendrogram(hc)
i picked a subbranch
k=hcd[[2]][[2]][[2]][[2]][[2]][[2]][[2]][1]
When i simply have k displayed, this gives:
> k
[[1]]
[[1]][[1]]
[1] 243
attr(,"label")
[1] "NAfrica_002"
attr(,"members")
[1] 1
attr(,"height")
[1] 0
attr(,"leaf")
[1] TRUE
[[1]][[2]]
[1] 257
attr(,"label")
[1] "NAfrica_016"
attr(,"members")
[1] 1
attr(,"height")
[1] 0
attr(,"leaf")
[1] TRUE
attr(,"members")
[1] 2
attr(,"midpoint")
[1] 0.5
attr(,"height")
[1] 37
How can i access, for example, the "midpoint" attribute, or the second of the "label" attributes?
(I hope i use the correct terminology here)
I have tried things like
k$midpoint
attr(k,"midpoint")
but both returned 'NULL'.
Sorry for question number 2: how could i add a "label" attribute after the attribute "midpoint"?

Your k is still buried one layer too deep. The attributes have been set on the first element of the list k.
attributes(k[[1]]) # Display attributes
attributes(k[[1]])$label # Access attributes
attributes(k[[1]])$label <- 'new' # Change attribute
Alternatively, you can use attr:
attr(k[[1]],'label') # Display attribute

You can change parameters manually as in the previous answer. The problem with this is that it is not efficient to do manually when you want to do it many times. Also, while it is easy to change parameters - that change may not be reflected in any other function, since they won't implement any action based on that change (it must be programmed).
For your specific question - it generally depends on which attribute we want to view. For "midpoint", use the get_nodes_attr function, with the "midpoint" parameter - from the dendextend package.
# install.packages("dendextend")
library(dendextend)
dend <- as.dendrogram(hclust(dist(USArrests[1:5,])))
# Like:
# dend <- USArrests[1:5,] %>% dist %>% hclust %>% as.dendrogram
# midpoint for all nodes
get_nodes_attr(dend, "midpoint")
And you get this:
[1] 1.25 NA 1.50 0.50 NA NA 0.50 NA NA
To also change an attribute, you can use the various assign functions from the package: assign_values_to_leaves_nodePar, assign_values_to_leaves_edgePar, assign_values_to_nodes_nodePar, assign_values_to_branches_edgePar, remove_branches_edgePar, remove_nodes_nodePar
If all you want is to change the labels, the following ability from the package would solve your question:
> labels(dend)
[1] "Arkansas" "Arizona" "California" "Alabama" "Alaska"
> labels(dend) <- 1:5
> labels(dend)
[1] 1 2 3 4 5
For more details on the package, you can have a look at its vignette.

Related

How to extract values from a list with multiple levels in r

I have a list looks like this
[[1]]
[[1]][[1]]
[[1]][[1]]$p1est.z
[1] 2.890829
[[1]][[1]]$p1se.z
[1] 0.1418367
[[1]][[2]]
[[1]][[2]]$p2est.w
[1] 4.947014
[[1]][[2]]$p2se.w
[1] 0.5986682
[[2]]
[[2]][[1]]
[[2]][[1]]$p1est.z
[1] 3.158164
[[2]][[1]]$p1se.z
[1] 0.138770
[[2]][[2]]
[[2]][[2]]$p2est.w
[1] 5.052874
[[2]][[2]]$p2se.w
[1] 0.585608
How can I extract values of "p1est.z" from both levels? since I need to compute the average of them.
Thanks!

Actually the unlist() function out of the box should probably work here:
output <- unlist(your_list)
output[names(output) == "p1est.z"]
p1est.z p1est.z
2.890829 3.158164
Data:
your_list <- list(
list(list(p1est.z=2.890829, p1se.z=0.1418367),
list(p1est.w=4.947014, p2se.w=0.5986682)),
list(list(p1est.z=3.158164, p1se.z=0.138770),
list(p1est.w=5.052874, p2se.w=0.585608)))

One way to do this, using Tim Biegeleisen's representation of your data is to make a function to extract p1est.z and apply that. Your top level list has two elements, in both, the first element has a p1est.z so you could do
fn <- function(x) { x[[1]]$p1est.z }
and then apply it
sapply(your_list, fn)
# [1] 2.890829 3.158164

Reducing a data.tree created from List

I'm working on a shiny app which plots data trees. I'm looking to incorporate the shinyTree app to permit quick comparison of plotted nodes. The issue is that the shinyTree app returns a redundant list of lists of the sub node plot.
The actual list of list is included below. I would like to keep the longest branches only. I would also like to remove the id node (integer node), I'm struggling as to why it even shows up based on the list. I have tried many different methods to work with this list but it's been a real struggle. The list concept is difficult to understand.
I create the data.tree and plot via:
dataTree.a <- FromListSimple(checkList)
plot(dataTree.a)
> checkList
[[1]]
[[1]]$Asia
[[1]]$Asia$China
[[1]]$Asia$China$Beijing
[[1]]$Asia$China$Beijing$Round
[[1]]$Asia$China$Beijing$Round$`20383994`
[1] 0
[[2]]
[[2]]$Asia
[[2]]$Asia$China
[[2]]$Asia$China$Beijing
[[2]]$Asia$China$Beijing$Round
[1] 0
[[3]]
[[3]]$Asia
[[3]]$Asia$China
[[3]]$Asia$China$Beijing
[1] 0
[[4]]
[[4]]$Asia
[[4]]$Asia$China
[[4]]$Asia$China$Shanghai
[[4]]$Asia$China$Shanghai$Round
[[4]]$Asia$China$Shanghai$Round$`23740778`
[1] 0
[[5]]
[[5]]$Asia
[[5]]$Asia$China
[[5]]$Asia$China$Shanghai
[[5]]$Asia$China$Shanghai$Round
[1] 0
[[6]]
[[6]]$Asia
[[6]]$Asia$China
[[6]]$Asia$China$Shanghai
[1] 0
[[7]]
[[7]]$Asia
[[7]]$Asia$China
[1] 0
[[8]]
[[8]]$Asia
[[8]]$Asia$India
[[8]]$Asia$India$Delhi
[[8]]$Asia$India$Delhi$Round
[[8]]$Asia$India$Delhi$Round$`25703168`
[1] 0
[[9]]
[[9]]$Asia
[[9]]$Asia$India
[[9]]$Asia$India$Delhi
[[9]]$Asia$India$Delhi$Round
[1] 0
[[10]]
[[10]]$Asia
[[10]]$Asia$India
[[10]]$Asia$India$Delhi
[1] 0
[[11]]
[[11]]$Asia
[[11]]$Asia$India
[1] 0
[[12]]
[[12]]$Asia
[[12]]$Asia$Japan
[[12]]$Asia$Japan$Tokyo
[[12]]$Asia$Japan$Tokyo$Round
[[12]]$Asia$Japan$Tokyo$Round$`38001000`
[1] 0
[[13]]
[[13]]$Asia
[[13]]$Asia$Japan
[[13]]$Asia$Japan$Tokyo
[[13]]$Asia$Japan$Tokyo$Round
[1] 0
[[14]]
[[14]]$Asia
[[14]]$Asia$Japan
[[14]]$Asia$Japan$Tokyo
[1] 0
[[15]]
[[15]]$Asia
[[15]]$Asia$Japan
[1] 0
[[16]]
[[16]]$Asia
[1] 0

Well, I did cobble together a poor hack to make this work here is what I did to the 'checkList' list
checkList <- get_selected(tree, format = "slices")
# Convert and collapse shinyTree slices to data.tree
# This is a bit of a cluge to work the graphic with
# shinyTree an alternate one liner is in works
# This transform works by finding the longest branches
# and only plotting them since the other branches are
# subsets due to the slices.
# Extract the checkList name (as characters) from the checkList
tmp <- names(unlist(checkList))
# Determine the length of the individual checkList Names
lens <- lapply(tmp, function(x) length(strsplit(x, ".", fixed=TRUE)[[1]]))
# Find the elements with the highest length returns a list of high vals
lens.max <- which(lens == max(sapply(lens, max)))
# Replace all '.' with '\' prepping for DataFrameTable Converions
tmp <- relist(str_replace_all(tmp, "\\.", "/"), skeleton=tmp)
# Add a root node to work with multiple branches
tmp <- unlist(lapply(tmp, function(x) paste0("Root/", x)))
# Create a list of only the longest branches
longBranches <- as.list(tmp[lens.max])
# Convert the list into a data.frame for convert
longBranches.df <- data.frame(pathString = do.call(rbind, longBranches))
# Publish the data.frame for use
vals$selDF <- longBranches.df
#save(checkList, file = "chkLists.RData") # Save for troubleshooting
print(vals$selDF)ode here
The new checkList looks like this:
[1] "Root/Europe/France/Paris/Round/10843285" "Root/Europe/France/Paris/Round"
[3] "Root/Europe/France/Paris" "Root/Europe/France"
[5] "Root/Europe/Germany/Berlin/Diamond/3563194" "Root/Europe/Germany/Berlin/Diamond"
[7] "Root/Europe/Germany/Berlin/Round/3563194" "Root/Europe/Germany/Berlin/Round"
[9] "Root/Europe/Germany/Berlin" "Root/Europe/Germany"
[11] "Root/Europe/Italy/Rome/Round/3717956" "Root/Europe/Italy/Rome/Round"
[13] "Root/Europe/Italy/Rome" "Root/Europe/Italy"
[15] "Root/Europe/United Kingdom/London/Round/10313307" "Root/Europe/United Kingdom/London/Round"
[17] "Root/Europe/United Kingdom/London" "Root/Europe/United Kingdom"
[19] "Root/Europe"
It works :)... but I think this could be done with a two liner.... I'll work on it again in a week or so. Any other Ideas would be appreciated.

Convert character column to a list within the data frame

When I read the csv file into df, SoftwareOwner is a character column
> df
Software SoftwareOwner
<chr> <chr>
1 I-DEAS Siemens
2 TeamViewer Autodesk, TeamViewer, Siemens
3 Inventor PTC, Google, SpaceClaim, Bricys
4 AutoCAD Autodesk
I want to make SoftwareOwner a list within this data frame so I tried the simple solution
> df$SoftwareOwner <- as.list(df$SoftwareOwner)
But all this did was make each entry in the column a list with one entry
> df$SoftwareOwner[2]
[[1]]
[1] "Autodesk, TeamViewer, Siemens"
I've tried adding parameters like sep = "," and all.names = TRUE to as.list but neither worked. Is there any way to access just Autodesk or TeamViewer or Siemens when calling something like what I have just above?

Might I recommend making Siemens, Autodesk, Teamviewer, etc. their own columns and coding a 1 or 0 to indicate ownership? In my experience this is a far more flexible approach.

A possible solution :
# recreate your data.frame
df <- read.csv(text=
"Software;SoftwareOwner
I-DEAS;Siemens
TeamViewer;Autodesk, TeamViewer, Siemens
Inventor;PTC, Google, SpaceClaim, Bricys
AutoCAD;Autodesk",sep=";")
df$SoftwareOwner <- lapply(strsplit(as.character(df$SoftwareOwner),split=','),trimws)
# > df$SoftwareOwner
# [[1]]
# [1] "Siemens"
#
# [[2]]
# [1] "Autodesk" "TeamViewer" "Siemens"
#
# [[3]]
# [1] "PTC" "Google" "SpaceClaim" "Bricys"
#
# [[4]]
# [1] "Autodesk"
# > df$SoftwareOwner[[2]][3]
# [1] "Siemens"
# > df$SoftwareOwner[[3]][2]
# [1] "Google"

Saving dates in a matrix ("origin must be supplied") with r

I am writing my bachelor thesis and I have not much experience with r so far.
My problem is that my dates which I made with this commands :
t<-strptime(x, "%d.%m.%Y %H.%M")
don't work anymore when I save them in a matrix with the other information on those specific dates.
I am a bit confused because it works just fine when I don't put them in a matrix like this t[1:10]
But that happens as soon as I try to save them in a matrix
matrix1<-matrix(c(t,v2,v3,v4),nrow=length(v2))
Fehler in as.POSIXct.numeric(X[[i]], ...) : 'origin' muss angegeben werden
It's German but it means origin must be supplied.
Any ideas what I have to do to fix it? I am a bit frustrated :)

Roland is right. You can't have Posixlt objects in a matrix. What you can do is save those dates as numeric timestamps in the matrix and convert them back to dates while accessing
Converting to numeric timestamp:
>date<- as.numeric(as.POSIXct("2014-02-16 2:13:46 UTC",origin="01-01-1970"))
>date
[1] 1392545626
Then save those timestamps in a matrix as you do and to convert it back to date, use the above command again without converting it into a numeric.

t (terrible name by the way, easily confused with the t function) is a POSIXlt object, which internally is a list. First you should check, what c(t,v2,v3,v4) returns (I don't know how v2 etc are defined).
Then we can look into the documentation in help("matrix"):
data
an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded.
The important bit is "all attributes discarded". This is what you get if you discard the attributes (which include the class attribute) of a POSIXlt object:
x <- strptime(c("2016-05-09 12:00:00", "2016-05-09 13:00:00"), format = "%Y-%m-%d %H:%M:%S")
attributes(x) <- NULL
print(x)
# [[1]]
# [1] 0 0
#
# [[2]]
# [1] 0 0
#
# [[3]]
# [1] 12 13
#
# [[4]]
# [1] 9 9
#
# [[5]]
# [1] 4 4
#
# [[6]]
# [1] 116 116
#
# [[7]]
# [1] 1 1
#
# [[8]]
# [1] 129 129
#
# [[9]]
# [1] 1 1
#
# [[10]]
# [1] "CEST" "CEST"
#
# [[11]]
# [1] NA NA
A matrix can't contain POSIXlt objects (or any objects, i.e., anything with an explicit class).

Sorting a key,value list in R by value

Given a list animals, call it m, which contains
$bob
[1] 3
$ryan
[1] 4
$dan
[1] 1
How can I sort this guy by the numerical value?
Basically I'd like to see my code look like this
m=sort(m,sortbynumber)
$ryan
[1] 4
$bob
[1] 3
$dan
[1] 1
I can't figure this out unfortunately. Seems like a simple solution.

You can try order
m[order(-unlist(m))]
#$ryan
#[1] 4
#$bob
#[1] 3
#$dan
#[1] 1
Or a slightly more efficient option would be to use decreasing=TRUE argument of order (from #nicola's comments)
m[order(unlist(m), decreasing=TRUE)]

here is the optimized solution
library(hashmap)
a1<-hashmap("hello",1)
a1$insert("hello1",4)
a1$insert("hello2",2)
a1$insert("hello3",3)
sort(a1$data(),decreasing = TRUE)
#OUTPUT
hello1 hello3 hello2 hello
4 3 2 1

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

How to access attributes of a dendrogram in R - r

Related

How to extract values from a list with multiple levels in r

Reducing a data.tree created from List

Convert character column to a list within the data frame

Saving dates in a matrix ("origin must be supplied") with r

Sorting a key,value list in R by value

Categories

Resources