Access Variable in a Data Frame List - r

Quick question:
How can I access a variable directly out of a dataframes list?
I tried
df_list[i]$variable which is not working.
or df_list[(i)]$variable...
or df_list[i[1]]
is there a way?

You may use [[ to access the dataframe inside the list. I am assuming you are running in a for loop. To access the variable you may use df_list[[i]]$variable.
You may also read The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe

Related

Referencing recently used objects in R

My question refers to redundant code and a problem that I've been having with a lot of my R-Code.
Consider the following:
list_names<-c("putnam","einstein","newton","kant","hume","locke","leibniz")
combined_df_putnam$fu_time<-combined_df_putnam$age*365.25
combined_df_einstein$fu_time<-combined_einstein$age*365.25
combined_df_newton$fu_time<-combined_newton$age*365.25
...
combined_leibniz$fu_time<-combined_leibniz$age*365.25
I am trying to slim-down my code to do something like this:
list_names<-c("putnam","einstein","newton","kant","hume","locke","leibniz")
paste0("combined_df_",list_names[0:7]) <- data.frame("age"=1)
paste0("combined_df_",list_names[0:7]) <- paste0("combined_df_",list_names[0:7])$age*365.25
When I try to do that, I get "target of assignment expands to non-language object".
Basically, I want to create a list that contains descriptors, use that list to create a list of dataframes/lists and use these shortcuts again to do calculations. Right now, I am copy-pasting these assignments and this has led to various mistakes because I failed to replace the "name" from the previous line in some cases.
Any ideas for a solution to my problem would be greatly appreciated!
The central problem is that you are trying to assign a value (or data.frame) to the result of a function.
In paste0("combined_df_",list_names[0:7]) <- data.frame("age"=1), the left-hand-side returns a character vector:
> paste0("combined_df_",list_names[0:7])
[1] "combined_df_putnam" "combined_df_einstein" "combined_df_newton"
[4] "combined_df_kant" "combined_df_hume" "combined_df_locke"
[7] "combined_df_leibniz"
R will not just interpret these strings as variables that should be created and be referenced to. For that, you should look at the function assign.
Similarily, in the code paste0("combined_df_",list_names[0:7])$age*365.25, the paste0 function does not refer to variables, but simply returns a character vector -- for which the $ operator is not accepted.
There are many ways to solve your problem, but I will recommend that you create a function that performs the necessary operations of each data frame. The function should then return the data frame. You can then re-use the function for all 7 philosophers/scientists.

How to get a list data that are in global environment into a list

I am working on a list of .csv data which I have read in and keep the variables that i need for study. During this process, I have build multiple data set with name xxx_101(PY_101, vB_101_FG_101, etc.) which store in global environment. Now I want to put every new data set with ending _101 into a list. Is it a clever way to build that list other than type them in one by one? Once I read them in to a list, I would like to rename the each list with their original data name. Is there a easy way to do that?
I could do it one by one, but just feel there should be a better way to do. Thanks.
We can use mget with ls and specify the pattern with "_101" as the end ($) of the object name. It would get the values of all those objects into a list
lst1 <- mget(ls(pattern = "_101$"))

Dynamically assign elements to object

In R, I am trying to dynamically select columns of a data.frame called DF. If
cutOffYear=2014
and
forecast_years=3
Then this piece of code
paste0("DF$X",cutOffYear+1:forecast_years)
yields:
[1] "DF$X2015" "DF$X2016" "DF$X2017"
Assuming all three columns exist in DF how do I assign the column variables to the characters?
I have tried a lot of combinations of get, assign and paste0 but I am failing.
We can try with [ to select the columns. It is often error prone when using $. If we need to get the output as a data.frame with columns specified in the pasted combination of 'cutOffYear', 'forecast_years', then the below code should work fine
DF[paste0("X", cutOffYear+1:forecast_years)]

Using strsplit ON a list in R [duplicate]

I'm just learning R and having a hard time wrapping my head around how to extract elements from objects in a list. I've parsed a json file into R giving me list object. But I can't figure out how, from there, to extract various json elements from the list. here's a truncated look at how my data appears after parsing the json:
> #Parse data into R objects#
> list.Json= fromJSON(,final.name, method = "C")
> head(listJson,6)
[[1]]
[[1]]$contributors
NULL
[[1]]$favorited
[1] FALSE
...[truncating]...
[[5]]
[[5]]$contributors
NULL
[[5]]$favorited
[1] FALSE
I can figure out how to extract the favorites data for one of the objects in the list
> first.object=listJson[1]
> ff=first.object[[1]]$favorited
> ff
[1] FALSE
But I'm very confused about how to extract favorited for all objects in the list. I've looked into sappily, is that the correct approach? Do I need to put the above code into a for...next loop?
sapply is going to apply some function to every element in your list. In your case, you want to access each element in a (nested) list. sapply is certainly capable of this. For instance, if you want to access the first child of every element in your list:
sapply(listJson, "[[", 1)
Or if you wanted to access the item named "favorited", you could use:
sapply(listJson, "[[", "favorited")
Note that the [ operator will take a subset of the list you're working with. So when you access myList[1], you still have a list, it's just of length 1. However, if you reference myList[[1]], you'll get the contents of the first space in your list (which may or may not be another list). Thus, you'll use the [[ operator in sapply, because you want to get down to the contents of the list.

Files details from folder

I'd like to loop through a list of files and record detailed info about them (size, no. of rows, means of columns).
I just started with storing the info in a data frame:
df<-data.frame()
all <-list.files(pattern=".csv")
for (i in all){
file<-read.csv(i)
filas<-nrow(file)
cols<-ncol(file)
info<-c(i,filas,cols)
df<-rbind(df,i,filas,cols)
}
but it triggers an error caused by the 'i' variable, which is just a file name. What am I doing wrong?
Thanks in advance, p.
Don't use for loops. Rather, use lapply in combination with do.call to obtain your desired result. Try:
do.call(rbind,lapply(all,function(x) {y<-read.csv(x); c(file=x, filas=nrow(y), cols=ncol(y))}))
Your approach was failing because in order of rbind to work, you need two data.frames with the same number of columns. You initially have created an empty data.frame (with 0 column) and this couldn't be rbinded to a vector of length 3 (assuming that you want a row for each file showing file name, number of rows and number of columns). If you really want to use a for loop, you should do something like:
for (i in 1:length(all)) {
file<-read.csv(all[i])
info<- data.frame(file=all[i], filas=nrow(file), cols=ncol(file))
if (i==1) df<-info else df<-rbind(df,info)
}

Resources