Is there a way to view a list - r

When I have data.frame objects, I can simply do View(df), and then I get to see the data.frame in a nice table (even if I can't see all of the rows, I still have an idea of what variables my data contains).
But when I have a list object, the same command does not work. And when the list is large, I have no idea what the list looks like.
I've tried head(mylist) but my console simply cannot display all of the information at once. What's an efficient way to look at a large list in R?

Here are a few ways to look at a list:
Look at one element of a list:
myList[[1]]
Look at the head of one element of a list:
head(myList[[1]])
See the elements that are in a list neatly:
summary(myList)
See the structure of a list (more in depth):
str(myList)
Alternatively, as suggested above you could make a custom print method as such:
printList <- function(list) {
  # seq_along() is safe for empty lists, unlike 1:length(list)
  for (item in seq_along(list)) {
    print(head(list[[item]]))
  }
}
The above will print out the head of each item in the list.

I use str to see the structure of any object, especially complex lists.
RStudio also shows you the structure when you click the blue arrow in the data window.

You can also use a package called listviewer
library(listviewer)
jsonedit( myList )

If you have a really large list, you can look at part of it using
str(myList, max.level=1)
(If you don't feel like typing out the second argument, it can be written as max=1 since there are no other arguments that start with max.)
I do this often enough that I have an alias in my .Rprofile for it:
str1 <- function(x, ...) str(x, max.level=1, ...)
And a couple others that limit the printed output (see example(str) for an example of using list.len):
strl <- function(x, len=10L, ...) str(x, list.len=len, ...) # lowercase L in the func name
str1l <- function(x, len=10L, ...) str(x, max.level=1, list.len=len, ...)
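To see what these two arguments do on a concrete object, here is a small sketch using a made-up nested list (the name nested and its contents are illustrative only):

```r
# A made-up nested list, for illustration only
nested <- list(
  meta = list(id = 1L, tags = c("a", "b")),
  data = list(iris = head(iris), cars = head(mtcars)),
  note = "example"
)

# Only the top level: nested lists are reported but not expanded
str(nested, max.level = 1)

# Full depth, but at most two elements shown per list
str(nested, list.len = 2)
```

Comparing the two outputs with a plain str(nested) shows how much each option trims.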

You can check the "head" of your data frames using the lapply family:
lapply(yourList, head)
which will return the "heads" of your list.
For example:
df1 <- data.frame(x = runif(3), y = runif(3))
df2 <- data.frame(x = runif(3), y = runif(3))
dfs <- list(df1, df2)
lapply(dfs, head)
Returns:
> lapply(dfs, head)
[[1]]
          x         y
1 0.3149013 0.8418625
2 0.8807581 0.5048528
3 0.2490966 0.2373453
[[2]]
          x         y
1 0.4132597 0.5762428
2 0.0303704 0.3399696
3 0.9425158 0.5465939
Instead of "head" you can use any function that works on data frames, e.g. names, nrow...
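A short sketch of that idea, using two made-up data frames (the names first and second are illustrative): lapply keeps a list, while sapply collapses simple results into a vector.

```r
set.seed(1)  # only so the example is reproducible
dfs <- list(
  first  = data.frame(x = runif(3), y = runif(3)),
  second = data.frame(x = runif(5), z = runif(5))
)

# Column names of every data frame, returned as a list
col_names <- lapply(dfs, names)
col_names

# Row counts, simplified to a named integer vector
row_counts <- sapply(dfs, nrow)
row_counts
```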

Seeing as you explicitly specify that you want to use View() with a list, this is probably what you are looking for:
View(myList[[x]])
Where x is the number of the list element that you wish to view.
For example:
View(myList[[1]])
will show you the first element of the list in the standard View() format that you will be used to in RStudio.
If you know the name of the list item you wish to view, you can do this:
View(myList[["itemOne"]])
There are several other ways, but these will probably serve you best.

This is a simple edit of giraffehere's excellent answer.
For some lists it is convenient to print the head of only a subset of the nested objects, and to print the name of the given slot above the output of head().
Arguments:
#' @param list a list object name
#' @param n an integer - the number of objects within the list that you wish to print
#' @param hn an integer - the number of rows you wish head to print
USAGE: printList(mylist, n = 5, hn = 3)
printList <- function(list, n = length(list), hn = 6) {
  # seq_len() avoids iterating over 1:0 when n is 0
  for (item in seq_len(n)) {
    cat("\n", names(list[item]), ":\n")
    print(head(list[[item]], hn))
  }
}
For numeric lists, output may be more readable if the number of digits is limited to 3, eg:
printList <- function(list, n = length(list), hn = 6) {
  # seq_len() avoids iterating over 1:0 when n is 0
  for (item in seq_len(n)) {
    cat("\n", names(list[item]), ":\n")
    print(head(list[[item]], hn), digits = 3)
  }
}

I had a similar problem and managed to solve it using as_tibble() on my list (dplyr or tibble packages), then just use View() as usual.

In recent versions of RStudio, you can just use View() (or alternatively click on the little blue arrow beside the object in the Global Environment pane).
For example, if we create a list with:
test_list <- list(
iris,
mtcars
)
Then either of the above methods will show you:

I like using as.matrix() on the list and then can use the standard View() command.
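For reference, a quick base R sketch of what as.matrix() does to a list (my_list is a made-up example): the default method turns it into a one-column list-matrix with one row per element, using the element names as row names.

```r
# A made-up list with elements of different types and lengths
my_list <- list(id = 1:3, label = "abc", flags = c(TRUE, FALSE))

m <- as.matrix(my_list)

dim(m)        # one row per list element, a single column
rownames(m)   # the element names
is.list(m)    # the cells still hold the original objects
```

Each cell holds a whole element, so long vectors are summarised rather than spelled out.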

Related

trying to get a proper names(list) output

I'm trying to split a 2 level deep list of characters into a 1 level list using a suffix.
More precisely, I have a list of genes, each containing 6 lists of probes corresponding to 6 bins. The architecture looks like :
feat_indexed_probes_bin$HSPB6$bin1
[1] "cg14513218" "cg22891287" "cg20713852" "cg04719839" "cg27580050" "cg18139462" "cg02956481" "cg26608795" "cg15660498" "cg25654926" "cg04878216"
I'm trying to get a list "bins_indexed_probes" with the following architecture :
bins_indexed_probes$HSPB6_bin6 containing the same probes so I can pass it to my map-reducing function.
I tried many solutions such as melt(), for loops, etc., but I can't figure out how to perform a double nested loop (over genes and over bins) and get a list output with only 1 level of depth.
For the moment, my func to do so is the following :
create_map <- function(indexes = feat_indexed_probes_bin, binlist = c("bin1", "bin2", "bin3", "bin4", "bin5", "bin6"), genes = features) {
  map <- list()
  ret <- lapply(binlist, function(bin) {
    lapply(rownames(features), function(gene) {
      map[[paste(gene, "_", bin, sep = "")]] <- feat_indexed_probes_bin[[gene]][[bin]]
      tmp_names <<- paste(gene, "_", bin, sep = "")
      return(map)
    })
    names(map) <- tmp_names
    rm(tmp_names)
  })
  return(ret)
}
it returns:
[[6]][[374]]
GDF10_bin6
"cg13565300"
[[6]][[375]]
NULL
[[6]][[376]]
[[6]][[376]]$HNF1B_bin6
[1] "cg03433642" "cg09679923" "cg17652435" "cg03348978" "cg02435495" "cg02701059" "cg05110178" "cg11862993" "cg09463047"
[[6]][[377]]
[[6]][[377]]$GPIHBP1_bin6
[1] "cg01953797" "cg00152340"
instead, I would expect something like
$GPIHBP1_bin1
"cg...." "cg...."
...
$GPIHBP1_bin6
"someotherprobe"
$someothergene_bin1
"probe" "probe"
...
I hope I'm being clear, and since this is my first time asking a question, I apologise in advance if I didn't follow the Stack Overflow protocol.
Thank you in advance for reading.
Consider a nested lapply with extract, [[, and setNames calls, all wrapped in do.call using c to bind return elements together.
bins_indexed_probes <- do.call(c,
  lapply(1:6, function(i)
    setNames(lapply(feat_indexed_probes_bin, `[[`, i),
             paste0(names(feat_indexed_probes_bin), "_bin", i))
  )
)
# RE-ORDER ELEMENTS BY NAME
bins_indexed_probes <- bins_indexed_probes[sort(names(bins_indexed_probes))]
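To make the result concrete, here is the same pattern run on a tiny made-up version of the data (two genes, two bins, invented probe IDs). It indexes the bins by name rather than by position, which is an assumption about how the real list is labelled.

```r
# Toy stand-in for the real data: gene -> bin -> probes
feat_indexed_probes_bin <- list(
  HSPB6 = list(bin1 = c("cg1", "cg2"), bin2 = "cg3"),
  GDF10 = list(bin1 = "cg4", bin2 = character(0))
)
bins <- c("bin1", "bin2")

# For each bin, pull that bin out of every gene and name it gene_bin;
# do.call(c, ...) then glues the per-bin lists into one flat list
bins_indexed_probes <- do.call(c, lapply(bins, function(bin) {
  setNames(lapply(feat_indexed_probes_bin, `[[`, bin),
           paste0(names(feat_indexed_probes_bin), "_", bin))
}))

# Re-order so that all bins of a gene sit next to each other
bins_indexed_probes <- bins_indexed_probes[sort(names(bins_indexed_probes))]
names(bins_indexed_probes)
```

Empty bins are kept as zero-length elements, so every gene/bin pair still has an entry.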

How to add an attribute to any level of objects (list, list$frame, list$frame$column)?

My problem is as follows: I'm trying to write a function that sets a collection of attributes on an object in a given environment. I'm trying to mimic a metadata layer, like SAS does, so you can set various attributes on a variable, like label, decimal places, date format, and many others.
Example:
SetAttributes(object = "list$dataframe$column", label="A label", width=20, decDigits=2,
dateTimeFormat="....", env=environment())
But I have to set attributes on different levels of objects, say:
comment(list$dataframe$column) <- "comment on a column of a dataframe in a list"
comment(dataframe$column) <- "comment on a column of a dataframe"
comment(list) <- "comment on a list/dataframe/vector"
Alternatively it can be done like this:
comment("env[[list]][[dataframe]][[column]]") <- "text"
# (my function recognizes both formats, as a variable and as a string with chain of
# [[]] components).
So I have implemented it this way:
SetAttributes <- function(varDescription, label="", .........., env=.GlobalEnv) {
  parts <- strsplit( varDescription, "$", fixed=TRUE)[[1]]
  if(length(parts) == 3) {
    lst <- parts[1]
    df <- parts[2]
    col <- parts[3]
    if(!is.na(label)) comment(env[[lst]][[df]][[col]]) <- label
    if(!is.na(textWidth)) attr(env[[lst]][[df]][[col]], "width") <- textWidth
    ....
  } else if(length(parts) == 2) {
    df <- parts[1]
    col <- parts[2]
    if(!is.na(label)) comment(env[[df]][[col]]) <- label
    if(!is.na(textWidth)) attr(env[[df]][[col]], "width") <- textWidth
    ....
  } else if(length(parts) == 1) {
    ....
You see the problem now: I have three blocks of similar code for length(parts) == 3, 2 and 1
When I tried to automate it this way:
path <- c()
sapply(parts, FUN=function(comp){ path <<- paste0(path, "[[", comp, "]]") })
comment(eval(parse(text=paste0(".GlobalEnv", path)))) <- "a comment"
I've got an error:
Error in comment(eval(parse(text = paste0(".GlobalEnv", path)))) <- "a comment" :
target of assignment expands to non-language object
Is there any way to get an object on any level and set attributes for it not having a lot of repeated code?
PS: yes, I heard thousand times that changing external variables from inside a function is an evil, so please don't mention it. I know what I want to achieve.
Just to make sure you hear it 1001 times, it's a very bad idea for a function to have side effects like this. This is a very un R-like way to program something like this. If you're going to write R code, it's better to do things the R way. This means returning modified objects that can optionally be reassigned. This would make life much easier.
Here's a simplified version which only focuses on the comment.
SetComment <- function(varDescription, label=NULL, env=.GlobalEnv) {
  obj <- parse(text= varDescription)[[1]]
  eval(substitute(comment(X)<-Y, list(X=obj, Y=label)), env)
}
a<-list(b=4)
comment(a$b)
# NULL
SetComment("a$b", "check")
comment(a$b)
# [1] "check"
Here, rather than parsing and splitting the string, we build an expression that we evaluate in the proper context. We use substitute() to pop in the values you want to the actual call.
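The same substitute() pattern should extend from comment() to arbitrary attributes. A minimal sketch, assuming a hypothetical helper called SetAttr (not part of the answer above) and a scratch environment e instead of the global workspace:

```r
# Hypothetical generalisation of SetComment: set any attribute on the
# object described by a string such as "lst$df$col"
SetAttr <- function(varDescription, name, value, env = .GlobalEnv) {
  target <- parse(text = varDescription)[[1]]
  # Build the call attr(<target>, <name>) <- <value> ...
  assignment <- substitute(attr(X, N) <- V,
                           list(X = target, N = name, V = value))
  # ... and evaluate it where the object lives
  eval(assignment, env)
  invisible(NULL)
}

# Scratch environment with a list holding a data frame
e <- new.env()
assign("lst", list(df = data.frame(col = c(1.234, 5.678))), envir = e)

SetAttr("lst$df$col", "width", 20, env = e)    # column level
SetAttr("lst$df", "label", "A label", env = e) # data frame level
SetAttr("lst", "label", "whole list", env = e) # list level

attributes(e$lst$df$col)
```

Because the target is parsed into an expression, one code path serves all three nesting depths.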

Assign output from a loop to a list

This might be quite a strange question but...
I have 3 vectors:
myseq=seq(8,22,1)
myseqema3=seq(3,4,1)
myseqema15=seq(10,20,1)
And I want to assign the results to my list:
SLResultsloop=vector(mode="list")
With this loop:
for (i in myseq){
for(j in myseqema3){
for( k in myseqema15){
SLResultsloop[[i-7]]= StopLoss(data=mydata,n=i,EMA3=j,EMA15=k)
names(SLResultsloop[[i-7]])=rep(paste("RSI=",i,"EMA3=",j,"EMA15=",k,sep="|"),
length=length(SLResultsloop[[i-7]]))
}
}
}
The problem is as follows: the loop above overwrites the list elements. Does anyone have a clever solution for assigning the loop results to unique list elements (without overwriting previous results)?
One solution could be to assign the output to different lists but it is a bit of an ugly solution...
Best Regards
You can skip the loops entirely by using expand.grid and apply (or something similar):
g <-
  expand.grid(myseq = myseq,
              myseqema3 = myseqema3,
              myseqema15 = myseqema15)
apply(g, 1, function(a) {
  StopLoss(data=mydata, n=a[1], EMA3=a[2], EMA15=a[3])
})
You can then build your names for each element of the return value from apply using something like:
paste("RSI=",g[,1], "EMA3=", g[,2],"EMA15=", g[,3], sep="|")
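Put together, the idea could look like the sketch below. StopLoss here is a hypothetical stand-in that only echoes its parameters, and mydata is a placeholder, since neither is shown in the question. The sketch uses lapply over the grid rows instead of apply, because apply may simplify equal-length results into a matrix, whereas lapply always returns a list.

```r
# Hypothetical stand-in for the real StopLoss(); it just returns its parameters
StopLoss <- function(data, n, EMA3, EMA15) c(n = n, EMA3 = EMA3, EMA15 = EMA15)
mydata <- NULL  # placeholder for the real data

# One row per parameter combination
g <- expand.grid(myseq = seq(8, 22, 1),
                 myseqema3 = seq(3, 4, 1),
                 myseqema15 = seq(10, 20, 1))

# One list element per row, so nothing gets overwritten
SLResults <- lapply(seq_len(nrow(g)), function(r) {
  StopLoss(data = mydata, n = g$myseq[r],
           EMA3 = g$myseqema3[r], EMA15 = g$myseqema15[r])
})

# Label each element with the combination that produced it
names(SLResults) <- paste("RSI=", g[, 1], "EMA3=", g[, 2],
                          "EMA15=", g[, 3], sep = "|")

length(SLResults)
names(SLResults)[1]
```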

combination of expand.grid and mapply?

I am trying to come up with a variant of mapply (call it xapply for now) that combines the functionality (sort of) of expand.grid and mapply. That is, for a function FUN and a list of arguments L1, L2, L3, ... of unknown length, it should produce a list of length n1*n2*n3 (where ni is the length of list i) which is the result of applying FUN to all combinations of the elements of the list.
If expand.grid worked to generate lists of lists rather than data frames, one might be able to use it, but I have in mind that the lists may be lists of things that won't necessarily fit into a data frame nicely.
This function works OK if there are exactly three lists to expand, but I am curious about a more generic solution. (FLATTEN is unused, but I can imagine that FLATTEN=FALSE would generate nested lists rather than a single list ...)
xapply3 <- function(FUN,L1,L2,L3,FLATTEN=TRUE,MoreArgs=NULL) {
  retlist <- list()
  count <- 1
  for (i in seq_along(L1)) {
    for (j in seq_along(L2)) {
      for (k in seq_along(L3)) {
        retlist[[count]] <- do.call(FUN,c(list(L1[[i]],L2[[j]],L3[[k]]),MoreArgs))
        count <- count+1
      }
    }
  }
  retlist
}
edit: forgot to return the result. One might be able to solve this by making a list of the indices with combn and going from there ...
I think I have a solution to my own question, but perhaps someone can do better (and I haven't implemented FLATTEN=FALSE ...)
xapply <- function(FUN,...,FLATTEN=TRUE,MoreArgs=NULL) {
  L <- list(...)
  inds <- do.call(expand.grid,lapply(L,seq_along)) ## Marek's suggestion
  retlist <- list()
  for (i in seq_len(nrow(inds))) {  # seq_len() is safe when there are no rows
    arglist <- mapply(function(x,j) x[[j]],L,as.list(inds[i,]),SIMPLIFY=FALSE)
    if (FLATTEN) {
      retlist[[i]] <- do.call(FUN,c(arglist,MoreArgs))
    }
  }
  retlist
}
edit: I tried @baptiste's suggestion, but it's not easy (or wasn't for me). The closest I got was
xapply2 <- function(FUN,...,FLATTEN=TRUE,MoreArgs=NULL) {
  L <- list(...)
  xx <- do.call(expand.grid,L)
  f <- function(...) {
    do.call(FUN,lapply(list(...),"[[",1))
  }
  mlply(xx,f)
}
which still doesn't work. expand.grid is indeed more flexible than I thought (although it creates a weird data frame that can't be printed), but enough magic is happening inside mlply that I can't quite make it work.
Here is a test case:
L1 <- list(data.frame(x=1:10,y=1:10),
data.frame(x=runif(10),y=runif(10)),
data.frame(x=rnorm(10),y=rnorm(10)))
L2 <- list(y~1,y~x,y~poly(x,2))
z <- xapply(lm,L2,L1)
xapply(lm,L2,L1)
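For comparison, the index-grid idea can also be written without an explicit loop or a growing list. This is only a sketch under the same assumptions as xapply above (xapply_sketch is a hypothetical name, and FLATTEN is not handled at all):

```r
# Apply FUN to every combination of elements drawn from the lists in ...
xapply_sketch <- function(FUN, ..., MoreArgs = NULL) {
  L <- list(...)
  # Grid of positions, one column per input list; first list varies fastest
  inds <- expand.grid(lapply(L, seq_along))
  lapply(seq_len(nrow(inds)), function(i) {
    # Pick the i-th combination: element j of each list
    args <- Map(function(x, j) x[[j]], L, as.list(inds[i, , drop = FALSE]))
    do.call(FUN, c(args, MoreArgs))
  })
}

# Small check with plain values instead of models
res <- xapply_sketch(function(a, b) paste0(a, b),
                     list("x", "y"), list(1, 2, 3))
unlist(res)
```

Because only positions go through expand.grid, the list elements themselves can be data frames, formulas or anything else.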
@ben-bolker, I had a similar desire and think I have a preliminary solution worked out, which I've also tested in parallel. The function, which I somewhat confusingly called gmcmapply (g for grid), takes an arbitrarily large named list mvars (that gets expand.grid-ed within the function) and a FUN that uses the list names as if they were arguments to the function itself. gmcmapply updates the formals of FUN so that by the time FUN is passed to mcmapply, its arguments reflect the variables that the user would like to iterate over (which would be layers in a nested for loop). mcmapply then dynamically updates the values of these formals as it cycles over the expanded set of variables in mvars.
I've posted the preliminary code as a gist (reprinted with an example below) and would be curious to get your feedback on it. I'm a grad student and a self-described intermediately skilled R enthusiast, so this is pushing my R skills for sure. You or other folks in the community may have suggestions that would improve on what I have. Even as it stands, I think I'll be coming back to this function quite a bit in the future.
gmcmapply <- function(mvars, FUN, SIMPLIFY = FALSE, mc.cores = 1, ...){
  require(parallel)
  FUN <- match.fun(FUN)
  funArgs <- formals(FUN)[which(names(formals(FUN)) != "...")] # allow for default args to carry over from FUN.
  expand.dots <- list(...) # allows for expanded dot args to be passed as formal args to the user specified function
  # Implement non-default arg substitutions passed through dots.
  if(any(names(funArgs) %in% names(expand.dots))){
    dot_overwrite <- names(funArgs[which(names(funArgs) %in% names(expand.dots))])
    funArgs[dot_overwrite] <- expand.dots[dot_overwrite]
    #for arg naming and matching below.
    expand.dots[dot_overwrite] <- NULL
  }
  ## build grid of mvars to loop over, this ensures that each combination of various inputs is evaluated (equivalent to creating a structure of nested for loops)
  grid <- expand.grid(mvars,KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
  # specify formals of the function to be evaluated by merging the grid to mapply over with expanded dot args
  argdefs <- rep(list(bquote()), ncol(grid) + length(expand.dots) + length(funArgs) + 1)
  names(argdefs) <- c(colnames(grid), names(funArgs), names(expand.dots), "...")
  argdefs[which(names(argdefs) %in% names(funArgs))] <- funArgs # replace with proper dot arg inputs.
  argdefs[which(names(argdefs) %in% names(expand.dots))] <- expand.dots # replace with proper dot arg inputs.
  formals(FUN) <- argdefs
  if(SIMPLIFY) {
    #standard mapply
    do.call(mcmapply, c(FUN, c(unname(grid), mc.cores = mc.cores))) # mc.cores = 1 == mapply
  } else{
    #standard Map
    do.call(mcmapply, c(FUN, c(unname(grid), SIMPLIFY = FALSE, mc.cores = mc.cores)))
  }
}
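The core trick in the function above, rewriting a function's formals so that the grid's column names become its arguments, can be seen in isolation. A minimal sketch, using a hypothetical function f and base mapply in place of mcmapply:

```r
# f refers to l1 and l2, which are not (yet) arguments of f
f <- function(...) l1 + l2

# Rewrite the formals so that l1 and l2 become arguments without defaults
formals(f) <- alist(l1 = , l2 = )

# Every combination of the inputs; the first variable varies fastest
grid <- expand.grid(l1 = 1:2, l2 = c(10L, 20L), KEEP.OUT.ATTRS = FALSE)

# Pass the grid columns positionally, one call of f per row
res <- do.call(mapply, c(list(FUN = f), unname(as.list(grid))))
res
```

The body of f is untouched; only its argument list changes, which is why the names in mvars must match the variables used inside FUN.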
example code below:
# Example 1:
# just make sure variables used in your function appear as the names of mvars
myfunc <- function(...){
  return_me <- paste(l3, l1^2 + l2, sep = "_")
  return(return_me)
}
mvars <- list(l1 = 1:10,
              l2 = 1:5,
              l3 = letters[1:3])
### list output (SIMPLIFY = FALSE, like Map)
lreturns <- gmcmapply(mvars, myfunc)
### concatenated output (SIMPLIFY = TRUE, like mapply)
lreturns <- gmcmapply(mvars, myfunc, SIMPLIFY = TRUE)
## N.B. This is equivalent to running:
lreturns <- c()
for(l1 in 1:10){
  for(l2 in 1:5){
    for(l3 in letters[1:3]){
      lreturns <- c(lreturns,myfunc(l1,l2,l3))
    }
  }
}
### concatenated output run on 2 cores.
lreturns <- gmcmapply(mvars, myfunc, SIMPLIFY = TRUE, mc.cores = 2)
# Example 2. Pass non-default args to FUN.
## Since the apply functions don't accept full calls as inputs (calls are internal), user can pass arguments to FUN through dots, which can overwrite a default option for FUN.
# e.g. apply(x,1,FUN) works and apply(x,1,FUN(arg_to_change= not_default)) does not, the correct way to specify non-default/additional args to FUN is:
# gmcmapply(mvars, FUN, arg_to_change = not_default)
## update myfunc to have a default argument
myfunc <- function(rep_letters = 3, ...){
  return_me <- paste(rep(l3, rep_letters), l1^2 + l2, sep = "_")
  return(return_me)
}
lreturns <- gmcmapply(mvars, myfunc, rep_letters = 1)
Two bits of additional functionality I would like to add but am still trying to work out:
Cleaning up the output to be a pretty nested list with the names of mvars (normally, I'd create multiple lists within a nested for loop and tag lower-level lists onto higher-level lists all the way up until all layers of the gigantic nested loop were done). I think using some abstracted variant of the solution provided here will work, but I haven't figured out how to make the solution flexible to the number of columns in the expand.grid-ed data.frame.
An option to log the outputs of the child processes that get called in mcmapply to a user-specified directory, so you could look at .txt outputs from every combination of variables generated by expand.grid (i.e. if the user prints model summaries or status messages as a part of FUN, as I often do). I think a feasible solution is to use the substitute() and body() functions, described here, to edit FUN to open a sink() at the beginning of FUN and close it at the end if the user specifies a directory to write to. Right now, I just program it right into FUN itself, but later it would be nice to just pass gmcmapply an argument called something like log_children = "path_to_log_dir" and then edit the body of the function to (pseudocode) sink(file = file.path(log_children, paste0(paste(names(mvars), sep = "_"), ".txt")))
Let me know what you think!
-Nate

Accessing same named list elements of the list of lists in R

Frequently I encounter situations where I need to create a lot of similar models for different variables. Usually I dump them into the list. Here is the example of dummy code:
modlist <- lapply(1:10,function(l) {
  data <- data.frame(Y=rnorm(10),X=rnorm(10))
  lm(Y~.,data=data)
})
Now getting the fit for example is very easy:
lapply(modlist,predict)
What I want to do sometimes is to extract one element from the list. The obvious way is
sapply(modlist,function(l)l$rank)
This does what I want, but I wonder if there is a shorter way to get the same result?
These are probably a little simpler:
> z <- list(list(a=1, b=2), list(a=3, b=4))
> sapply(z, `[[`, "b")
[1] 2 4
> sapply(z, get, x="b")
[1] 2 4
and you can define a function like:
> `%c%` <- function(x, n)sapply(x, `[[`, n)
> z %c% "b"
[1] 2 4
and also this looks like an extension of $:
> `%$%` <- function(x, n) sapply(x, `[[`, as.character(as.list(match.call())$n))
> z%$%b
[1] 2 4
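Applied to the list of models from the question, the `[[` approach might look like the following sketch (three models instead of ten, and a fixed seed, purely for illustration). The vapply variant is an addition, not part of the answer above: it does the same extraction but fails loudly if an element is not a single number.

```r
set.seed(42)  # only so the example is reproducible
modlist <- lapply(1:3, function(i) {
  d <- data.frame(Y = rnorm(10), X = rnorm(10))
  lm(Y ~ ., data = d)
})

# Same result as sapply(modlist, function(l) l$rank), but shorter
ranks <- sapply(modlist, `[[`, "rank")
ranks

# Stricter variant: each extracted value must be one number
ranks_checked <- vapply(modlist, `[[`, numeric(1), "rank")
ranks_checked
```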
I usually use kohske's way, but here is another trick:
sapply(modlist, with, rank)
It is more useful when you need more elements, e.g.:
sapply(modlist, with, c(rank, df.residual))
As I remember I stole it from hadley (from plyr documentation I think).
The main difference between the [[ and with solutions is how they handle missing elements. [[ returns NULL when an element is missing. with throws an error unless an object with the same name as the searched element exists in the global workspace. So e.g.:
dah <- 1
lapply(modlist, with, dah)
returns a list of ones when modlist doesn't have any dah element.
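A small sketch of that difference on a made-up list whose second element lacks the field being extracted (the names a and extra are illustrative, and the sketch assumes no object called extra exists anywhere on the search path):

```r
z <- list(list(a = 1, extra = 2), list(a = 3))

# `[[` quietly returns NULL for the element without "extra"
by_bracket <- lapply(z, `[[`, "extra")
by_bracket

# with() fails instead; the error is caught here so that it can be inspected
by_with <- tryCatch(
  lapply(z, with, extra),
  error = function(e) conditionMessage(e)
)
by_with
```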
With Hadley's lowliner package (since renamed purrr) you can supply map() with a numeric index or an element name to elegantly pluck components out of a list. map() is the equivalent of lapply() with some extra tricks.
library(purrr)
l <- list(
  list(a = 1, b = 2),
  list(a = 3, b = 4)
)
map(l, "b")
map(l, 2)
There are also versions that simplify the result to a vector; lowliner called this map_v(), while purrr provides typed variants such as map_dbl():
map_dbl(l, "a")
map_dbl(l, 1)
