How to keep/change the class after using paste() - r

I am working with netCDF-files and have the following problem:
I created 70 objects of "ncdf4" class. They are named ncin1950 - ncin2019.
In the next step I need to extract certain information (the so-called "consecutive_dry_days_index_per_time_period") from each object, which is normally done with the following command, using ncin1950 as an example:
cdd.array1950 <- ncvar_get(ncin1950,"consecutive_dry_days_index_per_time_period")
This works perfectly fine.
Since I do not want to apply this command for every single object and thereby 70 times, I tried to use a loop:
for (i in 1950:2019) {
assign(paste0("cdd.array",i), ncvar_get(paste("ncin",i,sep = ""),"consecutive_dry_days_index_per_time_period"))
}
This did not work due to this:
error in ncvar_get(paste("ncin", i, sep = ""), "consecutive_dry_days_index_per_time_period") :
first argument (nc) is not of class ncdf4!
I think the reason for this is the fact that paste() automatically creates results of the class "character":
> class(paste("ncin",1950,sep = ""))
[1] "character"
while
> class(ncin1950)
[1] "ncdf4"
So the question is, how can I keep the "ncdf4" class in the for-loop while using the paste()-function or anything similar which concatenates?

You need get() to look up an object by its name as a string: ncvar_get(get(paste("ncin", i, sep = "")), "consecutive_dry_days_index_per_time_period"). Alternatively, use mget() to fetch all of the objects at once as a list, then lapply() over it.
Or, best of all, rearrange your upstream workflow so that you create a list of objects rather than cluttering your workspace with 70 separate ones. Then you can operate on the list with a for loop, or lapply, or fancier methods (e.g. purrr::map()). This is more idiomatic and will simplify your life in the long run ...
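A minimal sketch of the get()/mget() idea, written so it runs without any netCDF files. The ncin objects here are plain lists standing in for real "ncdf4" objects, and extract_cdd() is a hypothetical stand-in for ncvar_get(nc, "consecutive_dry_days_index_per_time_period"); only three years are used to keep it short.

```r
# Stand-ins for the ncdf4 objects (hypothetical; real ones would come from ncdf4::nc_open()).
years <- 1950:1952
for (y in years) assign(paste0("ncin", y), list(year = y, cdd = y - 1900))

# Hypothetical extractor standing in for
# ncvar_get(nc, "consecutive_dry_days_index_per_time_period").
extract_cdd <- function(nc) nc$cdd

# get() turns one name into the object it refers to.
single <- extract_cdd(get(paste0("ncin", 1950)))

# mget() does the same for a whole vector of names and returns a named list.
ncin_list <- mget(paste0("ncin", years))
cdd_list <- lapply(ncin_list, extract_cdd)
names(cdd_list) <- paste0("cdd.array", years)
```

With real data, the extractor would be replaced by the ncvar_get() call, and cdd_list[["cdd.array1950"]] would take the place of the separate cdd.array1950 variable.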

Related

Referencing recently used objects in R

My question refers to redundant code and a problem that I've been having with a lot of my R-Code.
Consider the following:
list_names<-c("putnam","einstein","newton","kant","hume","locke","leibniz")
combined_df_putnam$fu_time<-combined_df_putnam$age*365.25
combined_df_einstein$fu_time<-combined_einstein$age*365.25
combined_df_newton$fu_time<-combined_newton$age*365.25
...
combined_leibniz$fu_time<-combined_leibniz$age*365.25
I am trying to slim-down my code to do something like this:
list_names<-c("putnam","einstein","newton","kant","hume","locke","leibniz")
paste0("combined_df_",list_names[0:7]) <- data.frame("age"=1)
paste0("combined_df_",list_names[0:7]) <- paste0("combined_df_",list_names[0:7])$age*365.25
When I try to do that, I get "target of assignment expands to non-language object".
Basically, I want to create a list that contains descriptors, use that list to create a list of dataframes/lists and use these shortcuts again to do calculations. Right now, I am copy-pasting these assignments and this has led to various mistakes because I failed to replace the "name" from the previous line in some cases.
Any ideas for a solution to my problem would be greatly appreciated!
The central problem is that you are trying to assign a value (or data.frame) to the result of a function.
In paste0("combined_df_",list_names[0:7]) <- data.frame("age"=1), the left-hand-side returns a character vector:
> paste0("combined_df_",list_names[0:7])
[1] "combined_df_putnam" "combined_df_einstein" "combined_df_newton"
[4] "combined_df_kant" "combined_df_hume" "combined_df_locke"
[7] "combined_df_leibniz"
R does not interpret these strings as the names of variables to create or refer to. For that, look at the function assign (and get for reading the objects back).
Similarly, in the code paste0("combined_df_",list_names[0:7])$age*365.25, the paste0 function does not refer to variables; it simply returns a character vector, for which the $ operator is not valid.
There are many ways to solve your problem, but I will recommend that you create a function that performs the necessary operations of each data frame. The function should then return the data frame. You can then re-use the function for all 7 philosophers/scientists.
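One way the function-based suggestion could look, as a sketch rather than the answerer's exact method. The helper name add_fu_time and the one-row data frames are made up for illustration; it assumes each data frame has an age column in years, and it keeps the data frames in one named list instead of separate combined_df_* variables.

```r
list_names <- c("putnam", "einstein", "newton")  # shortened for the example

# Hypothetical helper: adds follow-up time in days, assuming an `age` column in years.
add_fu_time <- function(df) {
  df$fu_time <- df$age * 365.25
  df
}

# One named list of data frames instead of one variable per name.
combined <- setNames(
  lapply(list_names, function(nm) data.frame(age = 1)),
  list_names
)

# Apply the same operation to every data frame; names are preserved.
combined <- lapply(combined, add_fu_time)

einstein_fu <- combined$einstein$fu_time
```

Because the name is written once in list_names and never retyped, the copy-paste mistakes described in the question cannot occur.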

$ operator for variable object names

I am trying to use the $ operator to select and reformat specific columns in a for loop on data.frame objects whose names are created dynamically. I tried 4 different solutions in my commented code, but none of them works. I looked all over SO but I can't seem to find another solution to try.
How can I use the $ operator to select specific columns when the data.frame names are variable?
Thanks
weather_data_files<-c("CMC","ECMWF","ECMWF_VAR_EPS_MONTHLY_FORECAST",
"GFS","ICON_EU","UKMET_EURO4")
for(filename in weather_data_files){
#create data frame environment objects
assign(paste(filename),read.csv(file = paste(filename,".csv",sep = ""),sep = ";"))
#first solution does not work, because filename is here an atomic vector
#rather than a data.frame
#ErrorMessage: $ operator is invalid for atomic vectors
filename$Forecast.Time<- as.POSIXct(filename$Forecast.Time,
format="%d.%m.%Y %H:%M+%S",tz="UTC")
#OK, got it; let's try a second solution, but
#it also does not work although I try to get the data.frame object
#ErrorMessage: could not find function "get<-"
get(filename)$Forecast.Time<-
as.POSIXct(get(filename)$Forecast.Time,format="%d.%m.%Y %H:%M+%S",tz="UTC")
#Third solution as.name also does not work
#ErrorMessage: object of type 'symbol' is not subsettable
as.name(filename)$Forecast.Time<-
as.POSIXct(as.name(filename)$Forecast.Time,format="%d.%m.%Y %H:%M+%S",tz="UTC")
#Fourth solution comparable to second solution, still not working
#ErrorMessage: could not find function "eval<-"
eval(assign(filename,get(filename)))$Forecast.Time<-
as.POSIXct(eval(assign(filename,get(filename)))$Forecast.Time,
format="%d.%m.%Y %H:%M+%S",tz="UTC")
}
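The problem is that you are passing character strings, not objects. get() retrieves the object that a string names, but the result of get(filename) cannot be assigned into: R looks for a replacement function "get<-", which does not exist.
You can load the object into a temporary variable as you loop, operate on the temporary variable, and then assign it back when you are done.
for(filename in weather_data_files){
  tmp <- get(filename)   # fetch the data frame by name
  tmp$Forecast.Time <- as.POSIXct(tmp$Forecast.Time,format="%d.%m.%Y %H:%M+%S",tz="UTC")
  assign(filename,tmp)   # store the modified copy under the same name
}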
You could also skip most of the for loop and use the apply family.
files <- lapply(paste(c("CMC","ECMWF","ECMWF_VAR_EPS_MONTHLY_FORECAST",
                        "GFS","ICON_EU","UKMET_EURO4"),".csv",sep=""),
                read.csv,sep=";")
files <- lapply(files,function(x){
  x$Forecast.Time <- as.POSIXct(x$Forecast.Time,format="%d.%m.%Y %H:%M+%S",tz="UTC")
  x
})
Now you have a list of your files you can work on. You could assign them to global variables if you want.
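A small sketch of working with such a list afterwards. The data frames are toy stand-ins for the CSV contents (one made-up timestamp each, in the format used in the question), and the environment named target is only there so the example does not touch the global workspace.

```r
# Toy stand-in for the list returned by lapply(..., read.csv);
# the real data frames would hold the CSV contents.
model_names <- c("CMC", "ECMWF", "GFS")
files <- lapply(model_names, function(nm) {
  data.frame(Forecast.Time = "01.02.2020 06:00+00", stringsAsFactors = FALSE)
})
names(files) <- model_names  # name the elements so they can be looked up by model

files <- lapply(files, function(x) {
  x$Forecast.Time <- as.POSIXct(x$Forecast.Time,
                                format = "%d.%m.%Y %H:%M+%S", tz = "UTC")
  x
})

# Access one data frame by name instead of by a separate variable.
gfs_time <- files$GFS$Forecast.Time

# Optional: copy the list elements into an environment as individual variables.
target <- new.env()
list2env(files, envir = target)
```

Passing envir = globalenv() to list2env() would create the global variables mentioned above, though keeping everything in the list is usually easier to manage.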

Create a list with a number of objects from the local environment

I've created a lot of character objects in R that I would like to put into a list (storing all their information).
Each object looks like this, and the object names share the pattern "TMC":
str(TMCS09g10086933)
chr [1:10] "TMCS09g1008699" "TMCS09g1008610" "TMCS09g10086101" "TMCS09g10086104" "TMCS09g100864343" "TMCS09g10086434343" "TMCS09g10086994111" ...
I have hundreds of these objects. Could someone tell me how to do this?
You can use the function objects with the argument pattern to list them.
Then, you can call the function get to fetch them. If you do this with an lapply, you will get a list returned right away.
TMClist <- lapply(objects(pattern = "^TMC"), get)
First you need to find the objects, which you can do with a regex search through the list of the objects in your environment grep("^TMC", ls(), value = TRUE), then you need to get the objects using the character vector of their names. For that you use mget.
your_list <- mget(grep("^TMC", ls(), value = TRUE))
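The two answers differ in one practical detail: lapply(..., get) returns an unnamed list, while mget() keeps the object names. A short sketch of the mget() variant follows; the object names and contents are invented, and a separate environment e is used so the example leaves the workspace untouched.

```r
# Dedicated environment holding a few made-up objects.
e <- new.env()
assign("TMCS09g1", c("a", "b"), envir = e)
assign("TMCS09g2", "c", envir = e)
assign("other", 1:3, envir = e)   # does not match the pattern

# Find the matching names, then fetch all of them as a named list.
tmc_names <- ls(e, pattern = "^TMC")
tmc_list <- mget(tmc_names, envir = e)
```

In an interactive session the envir arguments can be dropped so that ls() and mget() look in the current workspace.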

In R, I am trying to make a for loop that will cycle through variable names and perform functions on them

I have variables that are named team.1, team.2, team.3, and so forth.
First of all, I would like to know how to go through each of these and assign a data frame to each one. So team.1 would have data from one team, then team.2 would have data from a second team. I am trying to do this for about 30 teams, so instead of typing the code out 30 times, is there a way to cycle through each with a counter or something similar?
I have tried things like
vars = list(sprintf("team.x%s", 1:33))
to create my variables, but then I have no luck assigning anything to them.
Along those same lines, I would like to be able to run a function I made for cleaning and sorting the individual data sets on all of them at once.
For this, I have tried a for loop
for (j in 1:33) {
assign(paste("team.",j, sep = ""), cleaning1(paste("team.",j, sep =""), j))
}
where cleaning1 is my function, which takes two arguments and is called like this:
cleaning1(team.1, 1)
This produces the error message
Error in who[, -1] : incorrect number of dimensions
So obviously I am hoping the loop would count through my data sets, and also input my function calls and reassign my datasets with the newly cleaned data.
Is something like this possible? I am a complete newbie, so the more basic, the better.
Edit:
cleaning1:
cleaning1 = function (who, year) {
who[,-1]
who$SeasonEnd = rep(year, nrow(who))
who = (who[-nrow(who),])
who = tbl_df(who)
for (i in 1:nrow(who)) {
if ((str_sub(who$Team[i], -1)) == "*") {
who$Playoffs[i] = 1
} else {
who$Playoffs[i] = 0
}
}
who$Team = gsub("[[:punct:]]",'',who$Team)
who = who[c(27:28,2:26)]
return(who)
}
This works just fine when I run it on the data sets I have compiled myself.
To run it though, I have to go through and reassign each data set, like this:
team.1 = cleaning1(team.1, 1)
team.2 = cleaning1(team.2, 2)
So, I'm trying to find a way to automate that part of it.
I think your problem would be better solved by using a list of data frames instead of many variables containing one data frame each.
You do not say where you get your data from, so I am not sure how you would create the list. But assuming you have your data frames already stored in the variables team.1 etc., you could generate the list with
team.list <- list(team.1, team.2, ...,team.33)
where the dots stand for the variables that I did not write explicitly (you will have to do that). This is tedious, of course, and could be simplified as follows
team.list <- mget(paste0("team.",1:33))
The paste0 command creates the variable names as strings, and mget looks them up and returns the actual objects as a named list.
Now that you have all your data in a list, it is much easier to apply a function on all of them. I am not quite sure how the year argument should be used, but from your example, I assume that it just runs from 1 to 33 (let me know, if this is not true and I'll change the code). So the following should work:
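team.list.cleaned <- mapply(cleaning1,team.list,1:33,SIMPLIFY = FALSE)
It will go through all elements of team.list and 1:33 and apply the function cleaning1 with the elements as its arguments. SIMPLIFY = FALSE matters here: without it, mapply may try to combine the returned data frames into a matrix. With it, the result will again be a list containing the output of each call, i.e.,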
list( cleaning1(team.list[[1]],1), cleaning1(team.list[[2]],2), ...)
Since you are new to R, I strongly recommend that you read the help on the apply commands (apply, lapply, tapply, mapply). They are very useful, and once you get used to them you will use them all the time...
There is probably also a simple way to directly generate the list of data frames using lapply. As an example: if the data frames are read in from files and you have the file names stored in a character vector file.names, then something along the lines of
team.list <- lapply(file.names,read.table)
might work.
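To make the list-based approach concrete, here is a small runnable sketch. The two data frames and the function clean_team() are invented stand-ins for the real team data and for cleaning1(); Map() is used because it behaves like mapply(..., SIMPLIFY = FALSE) and so always returns a list.

```r
# Toy list of data frames standing in for team.1, team.2, ...
team.list <- list(
  team.1 = data.frame(Team = c("A*", "B"), Wins = c(10, 5)),
  team.2 = data.frame(Team = c("C", "D*"), Wins = c(7, 8))
)

# Hypothetical, simplified stand-in for cleaning1(): takes a data frame and a year.
clean_team <- function(who, year) {
  who$SeasonEnd <- year
  # A trailing "*" on the team name marks a playoff team.
  who$Playoffs <- as.integer(endsWith(as.character(who$Team), "*"))
  who$Team <- gsub("[[:punct:]]", "", who$Team)
  who
}

# Pair each data frame with its index and clean them all in one call.
cleaned <- Map(clean_team, team.list, seq_along(team.list))
```

Each element of cleaned is a data frame, and the names team.1 and team.2 carry over from team.list.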

Assigning and removing objects in a loop: eval(parse(paste(

I am looking to assign objects in a loop. I've read that some form of eval(parse( is what I need to perform this, but I'm running into errors listing invalid text or no such file or directory. Below is sample code of generally what I'm attempting to do:
x <- array(seq(1,18,by=1),dim=c(3,2,3))
for (i in 1:length(x[1,1,])) {
eval(parse(paste(letters[i],"<-mean(x[,,",i,"])",sep="")
}
And when I'm finished using these objects, I would like to remove them (the actual objects are very large and cause memory problems later on...)
for (i in 1:length(x[1,1,])) eval(parse(paste("rm(",letters[i],")",sep="")))
Both eval(parse(paste( portions of this script return errors for invalid text or no such file or directory. Am I missing something in using eval(parse(? Is there a easier/better way to assign objects in a loop?
That's a pretty disgusting and frustrating way to go about it. Use assign to assign and rm's list argument to remove objects.
> for (i in 1:length(x[1,1,])) {
+ assign(letters[i],mean(x[,,i]))
+ }
> ls()
[1] "a" "b" "c" "i" "x"
> a
[1] 3.5
> b
[1] 9.5
> c
[1] 15.5
> for (i in 1:length(x[1,1,])) {
+ rm(list=letters[i])
+ }
> ls()
[1] "i" "x"
>
Whenever you feel the need to use parse, remember fortune(106):
If the answer is parse() you should
usually rethink the question.
-- Thomas Lumley, R-help (February 2005)
Although it seems there are better ways to handle this, if you really did want to use the "eval(parse(paste(" approach, what you're missing is the text argument.
parse assumes that its first argument is a path to a file which it will then parse. In your case, you don't want it to go reading a file to parse, you want to directly pass it some text to parse. So, your code, rewritten (in what has been called disgusting form above) would be
letters=c('a','b','c')
x <- array(seq(1,18,by=1),dim=c(3,2,3))
for (i in 1:length(x[1,1,])) {
eval(parse(text=paste(letters[i],"<-mean(x[,,",i,"])",sep="")))
}
In addition to not specifying "text=" you're missing a few parentheses on the right side to close your parse and eval statements.
It sounds like your problem has been solved, but for people who reach this page who really do want to use eval(parse(paste, I wanted to clarify.
Very bad idea; you should never use eval or parse in R unless you know exactly what you are doing.
Variables can be created using:
name<-"x"
assign(name,3) # Equivalent to x <- 3
And removed by:
name<-"x"
rm(list=name)
But in your case, it can be done with a simple named vector:
v <- apply(x,3,mean)
names(v) <- letters[seq_along(v)]
v
v["b"]
#Some operations on v
rm(v)
It is best to avoid using either eval(parse(paste( or assign in this case. Doing either creates many global variables that just cause additional headaches later on.
The best approach is to use existing data structures to store your objects, lists are the most general for these types of cases.
Then you can use the [ls]apply functions to do things with the different elements, usually much quicker than looping through global variables. If you want to save all the objects created, you have just one list to save/load. When it comes time to delete them, you just delete 1 single object and everything is gone (no looping). You can name the elements of the list to refer to them by name later on, or by index.
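Applied to the array from the question, the list approach might look like the sketch below. The name slice_means is made up for illustration; the array is the same x used in the question.

```r
x <- array(seq(1, 18, by = 1), dim = c(3, 2, 3))

# One list element per slice along the third dimension, instead of variables a, b, c.
slice_means <- lapply(seq_len(dim(x)[3]), function(i) mean(x[, , i]))
names(slice_means) <- letters[seq_along(slice_means)]

# Elements can be reached by name or by index.
b_mean <- slice_means$b
third_mean <- slice_means[[3]]

# Cleaning up removes everything in one step, with no loop.
rm(slice_means)
```

The values match the assign()-based answer above: 3.5, 9.5 and 15.5 for a, b and c.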

Resources