I have a list of data frames. I want to perform a bunch of operations within a for loop but before that, I need to extract the string name of each dataset to use as variable/data frame name suffixes.
for(i in dflist) {
suffix<- deparse(substitute(i))
print(suffix)
}
However, my output shows as the following:
[1] "i"
[1] "i"
[1] "i"
I know that this is because of R's lazy evaluation framework. But how do I get around this limitation and get R to assign the names of the data frames in dflist to the suffix variable?
You are almost there. Try to understand what the elements of a list are and how indexing works when you work with lists.
example data
dflist <- list(
name_df1 = data.frame(a = 1:3)
,name_df2 = data.frame(mee = c("A","B","C","D"))
)
understand list elements
What do we access with index i in a for loop over a list?
for(i in dflist){
print(class(i))
print(names(i))
}
[1] "data.frame"
[1] "a"
[1] "data.frame"
[1] "mee"
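The loop variable i holds the list element itself (here, a data frame), not its name in the list, so names(i) returns the column names of that data frame.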
your case
for(i in dflist){
suffix <- names(i)
print(suffix)
}
[1] "a"
[1] "mee"
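To get the names of the data frames rather than their column names, one option is to loop over names(dflist) and fetch each element by name. This is a minimal sketch using the example data above; it assumes the list is named (for a list built as list(df1, df2), names(dflist) is NULL and the names would have to be set first).

```r
dflist <- list(
  name_df1 = data.frame(a = 1:3),
  name_df2 = data.frame(mee = c("A", "B", "C", "D"))
)

for (suffix in names(dflist)) {
  df <- dflist[[suffix]]   # the data frame stored under that name
  print(paste(suffix, "has", nrow(df), "rows"))
}
```

Inside the loop, suffix is a plain character string, so it can be pasted into other object names.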
I'm trying to find the difference between two columns in a CSV file, which I named Test.
I'd like to add a new column called 'Results' that contains the difference between Events_1 & Events_2. If there is no difference the Results can be blank.
This is a basic example of what I'm trying to accomplish; the real list contains hundreds of events in both columns.
Not tested with your data, but
vec2 <- c("hello,goodbye","hello,goodbye")
vec1 <- c("hello","hello,goodbye")
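# "[,[:space:]]+" splits on commas and whitespace; "\\s" is avoided here because
# R's default regex engine may not treat it as whitespace inside [...]
Map(setdiff, strsplit(vec2, "[,[:space:]]+"), strsplit(vec1, "[,[:space:]]+"))
# [[1]]
# [1] "goodbye"
# [[2]]
# character(0)
If you need them to be comma-delimited strings, then
mapply(function(a,b) paste(setdiff(a,b), collapse=","), strsplit(vec2, "[,[:space:]]+"), strsplit(vec1, "[,[:space:]]+"))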
# [1] "goodbye" ""
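To put this back into the data frame, something like the following might work. It is only a sketch: the Test data frame below is made up, the helper split_events is a name I introduced, and I am assuming "difference" means the events in Events_2 that are missing from Events_1 (swap the two arguments for the other direction).

```r
# Made-up stand-in for the CSV contents
Test <- data.frame(
  Events_1 = c("hello", "hello,goodbye"),
  Events_2 = c("hello,goodbye", "hello,goodbye"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split each cell on commas and whitespace
split_events <- function(x) strsplit(x, "[,[:space:]]+")

# Events in Events_2 that do not appear in Events_1, blank when there are none
Test$Results <- mapply(
  function(a, b) paste(setdiff(a, b), collapse = ","),
  split_events(Test$Events_2),
  split_events(Test$Events_1)
)

Test$Results
```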
I'm working on some time-series stuff in R (version 3.4.1), and would like to extract coefficients from regressions I ran, in order to do further analysis.
All results are so far saved as uGARCHfit objects, which are basically complicated list objects, from which I want to extract the coefficients in the following manner.
What I want is in essence this:
for(i in list){
i_GARCH_mxreg <- i_GARCH@fit$robust.matcoef[5,1]
}
"list" is a list object, where every element is the name of one observation. For now, I want my loop to create a new numeric object named as I specified in the loop.
Now this obviously doesn't work because the index, 'i', isn't replaced as I would want it to be.
How do I rewrite my loop appropriately?
Minimal working example:
list <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
for (i in list){
i_b <- i_a
}
What this should give me is:
> one_b
[1] 1
> two_b
[1] 2
> three_b
[1] 3
Clarification:
I want to extract the coefficients from multiple list objects. These are named in the manner 'string'_obj. The problem is that I don't have a function that would extract these coefficients, the list "is not subsettable", so I have to call the individual objects via obj@fit$robust.matcoef[5,1] (or is there another way?). I wanted to use the loop to take my list of strings and, in every iteration, take one string, evaluate 'string'_obj@fit$robust.matcoef[5,1], and save this value into an object, named again with " 'string'_name ".
It might well be easier to have this in a list rather than in individual objects, as someone suggested with lapply, but this is not my primary concern right now.
There is likely an easy way to do this, but I am unable to find it. Sorry for any confusion and thanks for any help.
The following should match your desired output:
# your list
l <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
# my workspace: note that there is no one_b, two_b, three_b
ls()
[1] "l" "one_a" "three_a" "two_a"
for (i in l){
# first, let's define the names as characters, using paste:
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# then let's assign the values. Since we are working with
# characters, the functions assign and get come in handy:
assign(dest, get(orig) )
}
# now let's check my workspace again. Note one_b, two_b, three_b
ls()
[1] "dest" "i" "l" "one_a" "one_b" "orig" "three_a"
[8] "three_b" "two_a" "two_b"
# let's check that the values are correct:
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
To comment on the functions used: assign takes a character as first argument, which is supposed to be the name of the newly created object. The second argument is the value of that object. get takes a character and looks up the value of the object in the workspace with the same name as that character. For instance, get("one_a") will yield 1.
Also, just to follow up on my comment earlier: If we already had all the coefficients in a list, we could do the following:
# hypothetical coefficients stored in list:
lcoefs <- list(1,2,3)
# let's name the coefficients:
lcoefs <- setNames(lcoefs, paste0(c("one", "two", "three"), "_c"))
# push them into the global environment:
list2env(lcoefs, envir = .GlobalEnv)
# look at environment:
ls()
[1] "dest" "i" "l" "lcoefs" "one_a" "one_b" "one_c"
[8] "orig" "three_a" "three_b" "three_c" "two_a" "two_b" "two_c"
one_c
[1] 1
two_c
[1] 2
three_c
[1] 3
And to address the comments, here is a slightly more realistic example, taking the list structure into account:
l <- as.list(c("one", "two", "three"))
# let's "hide" the values in a list:
one_a <- list(val = 1)
two_a <- list(val = 2)
three_a <- list(val = 3)
for (i in l){
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# let's get the list-object:
tmp <- get(orig)
# extract value:
val <- tmp$val
assign(dest, val )
}
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
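The same get-by-name idea should carry over to objects whose value sits in a slot, which is what the question's obj@fit$robust.matcoef[5,1] suggests. I cannot test against real uGARCHfit objects here, so the sketch below uses a made-up S4 class, MockFit, with a made-up constructor make_mock; only the access pattern is meant to match. It collects the values in one named vector instead of separate objects.

```r
# Hypothetical stand-in class: one slot `fit` holding a list with a
# `robust.matcoef` matrix. This is NOT the real uGARCHfit class.
setClass("MockFit", representation(fit = "list"))

make_mock <- function(value) {
  m <- matrix(0, nrow = 5, ncol = 4)
  m[5, 1] <- value
  new("MockFit", fit = list(robust.matcoef = m))
}

one_GARCH   <- make_mock(0.1)
two_GARCH   <- make_mock(0.2)
three_GARCH <- make_mock(0.3)

ids <- c("one", "two", "three")

# Look up each object by its name, then pull out row 5, column 1
mxreg <- vapply(ids, function(id) {
  obj <- get(paste0(id, "_GARCH"))
  obj@fit$robust.matcoef[5, 1]
}, numeric(1))

mxreg
```

Because ids is a character vector, vapply names the result after it, so mxreg["two"] gives the value for two_GARCH.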
I'm using parLapply to read lots of small CSV files, then running table() on each one to tabulate the results and collect them in a list. I pass the id (the CSV file name without the extension) into the function given to parLapply.
ll <- parLapply(cl, ids, function(id){
df<-read.csv(paste0(id,".csv"))
return(table(df$result))})
However, the names of the list are lost (names(ll) returns NULL). How can I get each element of the list named after the id it came from?
It's because your list is not named. You can name it using names(ids) <- ids:
ids <- list(3,2,1)
names(ids) <- ids
parLapply(cl,ids,function(x) x)
$`3`
[1] 3
$`2`
[1] 2
$`1`
[1] 1
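Applied to the CSV case, a minimal sketch might look like the following. The temporary directory, the files a.csv and b.csv, their contents, and the two-worker cluster are all made up for illustration; the only part that matters is passing a named vector (here via setNames) as the input.

```r
library(parallel)

# Made-up setup: two small CSV files in a temporary directory
csv_dir <- tempfile("csv_demo_")
dir.create(csv_dir)
ids <- c("a", "b")
for (id in ids) {
  write.csv(data.frame(result = c("ok", "ok", "fail")),
            file.path(csv_dir, paste0(id, ".csv")), row.names = FALSE)
}

cl <- makeCluster(2)
# setNames(ids, ids) names each input after itself; parLapply keeps those names
ll <- parLapply(cl, setNames(ids, ids), function(id, dir) {
  df <- read.csv(file.path(dir, paste0(id, ".csv")))
  table(df$result)
}, dir = csv_dir)
stopCluster(cl)

names(ll)
```

The directory is passed as an extra argument because the workers do not see variables from the calling session unless they are exported or passed in.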
I am trying to get an object's name from the list containing this object. I searched through similar questions and found some suggestions about using deparse(substitute(object)):
> my.list <- list(model.product, model.i, model.add)
> lapply(my.list, function(model) deparse(substitute(model)))
and the result is:
[[1]]
[1] "X[[1L]]"
[[2]]
[1] "X[[2L]]"
[[3]]
[1] "X[[3L]]"
whereas I want to obtain:
[1] "model.product", "model.i", "model.add"
Thank you in advance for being of some help!
You can write your own list() function so it behaves like data.frame(), i.e., uses the un-evaluated arg names as entry names:
List <- function(...) {
names <- as.list(substitute(list(...)))[-1L]
setNames(list(...), names)
}
my.list <- List(model.product, model.i, model.add)
Then you can just access the names via:
names(my.list)
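For a quick check of that idea, here is a self-contained sketch. The three model objects are replaced by plain numbers since the real models are not available, and I convert the captured argument expressions to strings explicitly with deparse, which is my own variation on the function above.

```r
List <- function(...) {
  # Capture the unevaluated arguments and turn each one into a string
  arg_names <- vapply(as.list(substitute(list(...)))[-1L], deparse, character(1))
  setNames(list(...), arg_names)
}

# Stand-ins for the real model objects
model.product <- 1
model.i <- 2
model.add <- 3

my.list <- List(model.product, model.i, model.add)
names(my.list)
```

Note that this only helps if the list is built with List() in the first place; it cannot recover names from a list that already exists.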
names(my.list)
# NULL
Oh wait, you didn't actually create names did you? There is actually no "memory" for the list function. It returns a list with the values of its arguments but not from whence they came, unless you add names to the pairlist given as the argument.
You won't be able to extract the information that way once you've created my.list.
The underlying way R works is that expressions are not evaluated until they're needed; using deparse(substitute()) will only work before the expression has been evaluated. So:
deparse(substitute(list(model.product, model.i, model.add)))
should work, while yours doesn't.
To save stuffing around, you could employ mget to collect your free-floating variables into a list with the names included:
one <- two <- three <- 1
result <- mget(c("one","two","three"))
result
#$one
#[1] 1
#
#$two
#[1] 1
#
#$three
#[1] 1
Then you can follow @DWin's suggestion:
names(result)
#[1] "one" "two" "three"
I have a list in R with the following elements:
[[812]]
[1] "" "668" "12345_s_at" "667" "4.899777748"
[6] "49.53333333" "10.10930207" "1.598228663" "5.087437057"
[[813]]
[1] "" "376" "6789_at" "375" "4.899655078"
[6] "136.3333333" "27.82508792" "2.20223398" "5.087437057"
[[814]]
[1] "" "19265" "12351_s_at" "19264" "4.897730912"
[6] "889.3666667" "181.5874908" "1.846451572" "5.087437057"
I know I can access them with something like list_elem[[814]][3] in case that I want to extract the third element of the position 814.
I need to extract the third element of all the list, for example 12345_s_at, and I want to put them in a vector or list so I can compare their elements to another list later on. Below is my code:
elem<-(c(listdata))
lp<-length(elem)
for (i in 1:lp)
{
newlist<-c(listdata[[i]][3]) ###maybe to put in a vector
print(newlist)
}
When I print the results I get the third element, but like this:
[1] "1417365_a_at"
[1] "1416336_s_at"
[1] "1416044_at"
[1] "1451201_s_at"
so I cannot traverse them with an index like newlist[3], because it returns NA. Where is my mistake?
If you want to extract the third element of each list element you can do:
List <- list(c(1:3), c(4:6), c(7:9))
lapply(List, '[[', 3) # This returns a list with only the third element
unlist(lapply(List, '[[', 3)) # This returns a vector with the third element
Using your example and taking into account @GSee's comment you can do:
yourList <- list(c("","668","12345_s_at","667", "4.899777748","49.53333333",
"10.10930207", "1.598228663","5.087437057"),
c("","376", "6789_at", "375", "4.899655078","136.3333333",
"27.82508792", "2.20223398", "5.087437057"),
c("", "19265", "12351_s_at", "19264", "4.897730912",
"889.3666667", "181.5874908","1.846451572","5.087437057" ))
sapply(yourList, '[[', 3)
[1] "12345_s_at" "6789_at" "12351_s_at"
Next time you can provide some data using dput on a portion of your dataset so we can reproduce your problem easily.
With purrr you can extract elements and ensure data type consistency:
library(purrr)
listdata <- list(c("","668","12345_s_at","667", "4.899777748","49.53333333",
"10.10930207", "1.598228663","5.087437057"),
c("","376", "6789_at", "375", "4.899655078","136.3333333",
"27.82508792", "2.20223398", "5.087437057"),
c("", "19265", "12351_s_at", "19264", "4.897730912",
"889.3666667", "181.5874908","1.846451572","5.087437057" ))
map_chr(listdata, 3)
## [1] "12345_s_at" "6789_at" "12351_s_at"
There are other map_ functions that enforce the type consistency as well and a map_df() which can finally help end the do.call(rbind, …) madness.
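If you would rather stay in base R, vapply gives a similar type guarantee: it errors unless every result is a single character string. A minimal sketch with a shortened version of the data (only the first three fields of each element are kept):

```r
listdata <- list(c("", "668", "12345_s_at"),
                 c("", "376", "6789_at"),
                 c("", "19265", "12351_s_at"))

# character(1) declares that each result must be exactly one string
third <- vapply(listdata, function(x) x[[3]], character(1))
third
## [1] "12345_s_at" "6789_at"    "12351_s_at"
```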
In case you wanted to use the code you typed in your question, below is the fix:
listdata <- list(c("","668","12345_s_at","667", "4.899777748","49.53333333",
"10.10930207", "1.598228663","5.087437057"),
c("","376", "6789_at", "375", "4.899655078","136.3333333",
"27.82508792", "2.20223398", "5.087437057"),
c("", "19265", "12351_s_at", "19264", "4.897730912",
"889.3666667", "181.5874908","1.846451572","5.087437057" ))
v <- character() #creates empty character vector
list_len <- length(listdata)
for(i in 1:list_len)
v <- c(v, listdata[[i]][3]) #fills the vector with list elements (not efficient, but works fine)
print(v)
[1] "12345_s_at" "6789_at" "12351_s_at"
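If the loop form is kept for a long list, a variant that allocates the vector once and fills it by index avoids growing it on every iteration. This is a sketch of the same loop with a shortened version of the data; seq_along is used so that an empty list is handled without error.

```r
listdata <- list(c("", "668", "12345_s_at"),
                 c("", "376", "6789_at"),
                 c("", "19265", "12351_s_at"))

v <- character(length(listdata))   # allocate the full vector up front
for (i in seq_along(listdata)) {
  v[i] <- listdata[[i]][3]         # store the third element at position i
}

v[3]
# [1] "12351_s_at"
```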