How to rename an output dataset within a function with R? - r

I wrote a function that outputs three different datasets in .csv format, and I would like the name of the original dataset to appear in the name of each output file.
For example:
If the original dataset is named "microbial_mat1", I would like the output to be "microbial_mat1_output1.csv"; at the moment I only get "_output1.csv".
Is there a way to do this?
My function looks like the following code:
myFunction <- function(original_dataset,
                       parameter1,
                       parameter2 = TRUE){
  # a long bunch of code
  if(parameter2){
    write.csv(dataset_temporal, "_output.csv")
  } else{
    print("No parameter2")
  }
}
Thanks in advance for your help.

We may need to extract the object name. One option is to call deparse(substitute()) on original_dataset at the top of the function and then use the result (nm1) with paste0 to build the file name:
myFunction <- function(original_dataset,
                       parameter1,
                       parameter2 = TRUE){
  # capture the name of the object passed as original_dataset
  nm1 <- deparse(substitute(original_dataset))
  # ...
  # ...
  if(parameter2){
    write.csv(dataset_temporal, paste0(nm1, "_output.csv"))
  } else{
    print("No parameter2")
  }
}
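As a minimal, self-contained sketch of the same idea: the function name write_outputs, the out_dir argument, and the sample data below are hypothetical and only serve to make the example runnable. Note that deparse(substitute()) returns the expression exactly as typed in the call, so it gives a clean name only when the function is called directly with a named object.

```r
# Hypothetical helper: writes its input to "<object name>_output1.csv".
# out_dir defaults to a temporary directory so the sketch has no side effects
# on the working directory.
write_outputs <- function(original_dataset, out_dir = tempdir()) {
  # Capture the name of the object as written in the call
  nm1 <- deparse(substitute(original_dataset))
  out_path <- file.path(out_dir, paste0(nm1, "_output1.csv"))
  write.csv(original_dataset, out_path, row.names = FALSE)
  out_path
}

# Made-up data, only for illustration
microbial_mat1 <- data.frame(sample = c("s1", "s2"), count = c(10, 20))
path <- write_outputs(microbial_mat1)
basename(path)
```

Returning the path makes it easy to check where the file was written.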

Related

Nested lapply in R

I have a function that takes two arguments to read .txt files from a site: the name of a city and the type of data. I have two lists to pass as arguments: the list of cities and the list of data types. How could I use nested lapply to read the files from the site? My attempt looks something like this:
cities <- c("sydney","brisbane"...)
typedatas <- c("Max", "Avg","Min")
url<- "https:/sitename/datasets/"
read.text <- function(city, typedata){
c(url,typedata,"/year/",city, ".txt) >%>
paste0()
}
finaldata <- lapply(cities, function(x) lapply(typedatas,function(x){read.ts})) %>% set_names(cities)
It creates a big list, but it does not actually read the files. The output looks like this:
final data list [10]
sydney list[3]
function
function
function
brisbane list[3]
function...
....
How can I make it read the files and also name the data frames appropriately, using the type of data for each city?
Kind of hard to do this without a reproducible example but you can try this:
library(magrittr)  # provides %>% and set_names
cities <- c("sydney","brisbane")
typedatas <- c("Max", "Avg","Min")
url<- "https:/sitename/datasets/"
# builds the link to one file; it does not read it
read.text <- function(city, typedata){
  paste0(url,typedata,"/year/",city, ".txt")
}
# outer loop over cities, inner loop over data types
finaldata <- lapply(cities, function(cty){
  lapply(typedatas,function(ds_type){
    read.text(cty, ds_type)
  })
}) %>% set_names(cities)
Notice that your read.text function doesn't actually read the file but only creates the hyperlink to it so you will need to add some kind of function to actually read the file.
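One way to add the reading step and name both levels of the list is sketched below. This is an assumption-laden sketch: build_url, read_all, and the reader argument are hypothetical names, read.table is only a guess at a suitable reader for these files, and the URL is the placeholder from the question. The dry run swaps in identity as the reader so nothing is downloaded.

```r
cities <- c("sydney", "brisbane")
typedatas <- c("Max", "Avg", "Min")
base_url <- "https:/sitename/datasets/"  # placeholder copied from the question

# Builds the link to one file
build_url <- function(city, typedata) {
  paste0(base_url, typedata, "/year/", city, ".txt")
}

# reader is the function that actually reads a URL; read.table is an assumption
read_all <- function(cities, typedatas, reader = read.table) {
  out <- lapply(cities, function(cty) {
    per_type <- lapply(typedatas, function(ds_type) {
      reader(build_url(cty, ds_type))
    })
    setNames(per_type, typedatas)  # name the inner list by data type
  })
  setNames(out, cities)            # name the outer list by city
}

# Dry run: identity returns the URL instead of reading it
urls <- read_all(cities, typedatas, reader = identity)
urls$sydney$Max
```

With a real site, calling read_all(cities, typedatas) would return one data frame per city and data type, reachable as for example finaldata$sydney$Max.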

R function used to rename columns of a data frame

I have a data frame, say acs10. I need to relabel the columns. To do so, I created another data frame, named as labelName with two columns: The first column contains the old column names, and the second column contains names I want to use, like the table below:
column_1    column_2
oldLabel1   newLabel1
oldLabel2   newLabel2
Then I wrote a for loop to change the column names:
for (i in seq_len(nrow(labelName))){
  names(acs10)[names(acs10) == labelName[i,1]] <- labelName[i,2]
}
This works.
However, when I tried to put the for loop into a function, because I need to rename column names for other data frames as well, the function failed. The function I wrote looks like below:
renameDF <- function(dataF,varName){
for (i in seq_len(nrow(varName))){
names(dataF)[names(dataF) == varName[i,1]] <- varName[i,2]
print(varName[i,1])
print(varName[i,2])
print(names(dataF))
}
}
renameDF(acs10, labelName)
where dataF is the data frame whose names I need to change, and varName is another data frame in which old and new variable names are paired. I used print(names(dataF)) to debug, and the printout suggests that the function works. However, calling the function does not actually change the column names. I suspect it has something to do with scope, but I want to know how to make it work.
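In your function you need to return the changed data frame. R modifies a local copy of dataF inside the function, so the caller also has to assign the returned value back to the original variable.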
renameDF <- function(dataF,varName){
for (i in seq_len(nrow(varName))){
names(dataF)[names(dataF) == varName[i,1]] <- varName[i,2]
}
return(dataF)
}
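A short sketch of why the assignment matters, using made-up data that mirrors the tables in the question (the column values and the extra column other are assumptions for illustration):

```r
# Made-up dictionary and data frame
labelName <- data.frame(column_1 = c("oldLabel1", "oldLabel2"),
                        column_2 = c("newLabel1", "newLabel2"),
                        stringsAsFactors = FALSE)
acs10 <- data.frame(oldLabel1 = 1:2, oldLabel2 = 3:4, other = 5:6)

renameDF <- function(dataF, varName) {
  for (i in seq_len(nrow(varName))) {
    names(dataF)[names(dataF) == varName[i, 1]] <- varName[i, 2]
  }
  dataF  # the modified copy is the return value
}

renamed <- renameDF(acs10, labelName)
names(acs10)    # still the old names: the function changed only its own copy
names(renamed)  # the new names

# To update acs10 itself, assign the result back
acs10 <- renameDF(acs10, labelName)
```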
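You can also simplify this and avoid the for loop by using match. Columns that are not listed in varName keep their original names:
renameDF <- function(dataF,varName){
  idx <- match(names(dataF), varName[[1]])
  found <- !is.na(idx)  # columns that appear in the dictionary
  names(dataF)[found] <- varName[[2]][idx[found]]
  return(dataF)
}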
This should do the whole thing in one line.
colnames(acs10)[colnames(acs10) %in% labelName$column_1] <- labelName$column_2[match(colnames(acs10)[colnames(acs10) %in% labelName$column_1], labelName$column_1)]
This also works when a column name isn't in the data dictionary, but it's a bit more convoluted:
library(tibble)
df <- tribble(~column_1,~column_2,
"oldLabel1", "newLabel1",
"oldLabel2", "newLabel2")
d <- tibble(oldLabel1 = NA, oldLabel2 = NA, oldLabel3 = NA)
fun <- function(dat, dict) {
names(dat) <- sapply(names(dat), function(x) ifelse(x %in% dict$column_1, dict[dict$column_1 == x,]$column_2, x))
dat
}
fun(d, df)
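You can create a function containing just one line of code. Note that pmatch returns NA for names it cannot match, so this assumes every column of df appears in varName: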
renameDF <- function(df, varName){
setNames(df,varName[[2]][pmatch(names(df),varName[[1]])])
}

Saving the output of a function as a new data.frame and using the name of the original input in the saved name

I've written a function that takes the top 50 results from a list based on ranks.
myfunction <- function(x){
  # ...selects the top 50 results
  # return(the top 50 results)
}
If you need the exact function I can add more detail.
Ideally, what I would like the function to do is save the top 50 list as a new data.frame by automatically including the name of the input like this:
'x'_top_50
So I can repeatedly use the function and it automatically save the output for use later.
Any help would be great, thank you!
EDIT
This is what I have from the first answer:
t50<-function(x){
m101mb<-x[,1:2]
m101ts<-x[,3:4]
incommon<-intersect(m101mb$mb_gs,m101ts$ts_gs)
df1<-m101mb[m101mb$mb_gs %in% incommon,]
df2<-m101ts[m101ts$ts_gs %in% incommon,]
df3<-df1[order(df1$mb_gs),]
df4<-df2[order(df2$ts_gs),]
df5<-cbind(df3,df4)
df6<-data.frame(df5$mb_rank,df5$ts_rank)
df7<-rowMeans(df6)
df8<-data.frame(df5,df7)
df9<-data.frame(df8$ts_gs,df8$df7)
df10<-df9[order(df9$df8.df7),]
colnames(df10)<-c("Gene_symbol","Gene_rank")
assign(paste(deparse(substitute(x)),"top", "50", sep = "_"), df10[1:50,])
return(df10[1:50,])
}
But it still won't save as a new variable.
Thanks.
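I believe what you want is the function assign() to assign the data frame to a specific name, and deparse() and substitute() to extract the name of the variable that was passed in. Without the envir argument, assign() creates the variable inside the function, where it disappears when the function returns.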
myfunction <- function(x){
#whatever you're doing it's here that results in the_top_50_results as a dataframe
assign(paste(deparse(substitute(x)), "top", "50", sep = "_"), the_top_50_results, envir = .GlobalEnv)
}
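A runnable sketch of this approach follows. The function name save_top, the argument n, and the sample data are hypothetical; the sketch also uses envir = parent.frame() (the caller's environment) rather than .GlobalEnv, which is an assumption on my part and behaves the same when the function is called from the top level.

```r
# Hypothetical helper: stores the n best-ranked rows as "<input name>_top_<n>"
save_top <- function(x, n = 50, envir = parent.frame()) {
  # Build the target name from the name of the input object
  nm <- paste(deparse(substitute(x)), "top", n, sep = "_")
  top <- head(x[order(x$Gene_rank), ], n)
  assign(nm, top, envir = envir)  # create the variable in the caller's environment
  invisible(top)                  # also return it, without printing
}

# Made-up data, only for illustration
genes <- data.frame(Gene_symbol = c("A", "B", "C"),
                    Gene_rank = c(3, 1, 2),
                    stringsAsFactors = FALSE)
save_top(genes, n = 2)
genes_top_2
```

Creating variables as a side effect can be hard to follow; returning the result and assigning it in the calling code, for example genes_top_2 <- save_top(genes, n = 2), is often easier to reason about.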

Changing column names within a function

When writing a function, how do I get the new name for baseline to change depending on what the name of my dataset is? With this function the column names become dataset_baseline and dataset_adverse instead of for example Inflation_baseline and Inflation_adverse.
renaming <- function(dataset) {
dataset <- dataset %>%
rename(dataset_baseline = baseline, dataset_adverse = adverse)
return(dataset)
}
Try this:
library(dplyr)
renaming <- function(dataset,columns) {
  call = as.list(match.call())
  dataset.name <- toString(call$dataset)
  # prefix each selected column with the dataset name, e.g. Inflation_baseline
  dataset %>% rename_at(columns, ~ paste0(dataset.name, "_", .))
}
Inflation <- renaming(Inflation,c("baseline","adverse"))
NOTE: Do not try to assign to dataset from within your function. It won't work, because dataset there refers to a local variable of the function; assign the returned value in the calling code instead.
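The same idea can be sketched in base R without dplyr. The function name prefix_columns and the sample data are hypothetical, and the object name is captured here with deparse(substitute()) instead of match.call():

```r
# Hypothetical helper: prefixes the chosen columns with the name of the data frame
prefix_columns <- function(dataset, columns) {
  dataset_name <- deparse(substitute(dataset))  # name as written in the call
  hit <- names(dataset) %in% columns
  names(dataset)[hit] <- paste0(dataset_name, "_", names(dataset)[hit])
  dataset
}

# Made-up data, only for illustration
Inflation <- data.frame(year = 2020:2021,
                        baseline = c(1.5, 2.0),
                        adverse = c(3.0, 4.5))
Inflation <- prefix_columns(Inflation, c("baseline", "adverse"))
names(Inflation)
```

Because the prefix comes from the expression used in the call, the function has to be called with the data frame under the name you want to see in the columns.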

Inserting values from a data frame into code in R

I have the names of 1000 people in a "name" data frame:
df=c("John","Smith", .... "Machine")
I have 1000 data frames, one for each person (e.g., a1 to a1000).
And I have the following code:
a1$name="XXXX"
a2$name="XXXX" ...
a1000$name="XXXX"
I would like to replace "XXXX" in the code above with the values in the name data frame. The resulting code would look like this:
a1$name="John"
a2$name="Smith" ...
a1000$name="Machine"
First you need to combine them into a list. (I do not know whether this works well with 1000 data frames.)
df=c("John","Smith", .... "Machine")
list_object_names = sprintf("a%s", 1:1000)
list_df = lapply(list_object_names, get)
for (i in seq_along(list_df)){
  list_df[[i]][,'name']=df[i]
}
You can also try lapply rather than a for loop, something like:
lapply(list_df, function(df) {
#what you want to do
})
Here is my shot at this, without knowing if there is any more to the a1,a2...a1000 lists.
# generate your data
df = c("John", "Smith", "Machine")
# build your example
for(i in 1:3){
assign(paste0("a",i), list(name = "XXXX"))
}
# solve your problem, even if there is more to a1 than you are showing us.
for(i in 1:3){
anew <- get(paste0("a",i)) # pulls the object from the environment
anew[['name']] <- df[i] # rewrites only the name element
assign(paste0("a",i), anew) # writes the updated object back under its original name
}
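If keeping the objects together in a named list is acceptable, the get/assign round trip can be avoided. The sketch below assumes three small made-up data frames a1 to a3 and a hypothetical vector people; mget() collects existing objects by name and Map() pairs each one with its person.

```r
# Made-up inputs, only for illustration
people <- c("John", "Smith", "Machine")
a1 <- data.frame(value = 1:2)
a2 <- data.frame(value = 3:4)
a3 <- data.frame(value = 5:6)

# Collect the existing objects into a named list: names are "a1", "a2", "a3"
obj_names <- paste0("a", seq_along(people))
frames <- mget(obj_names)

# Pair each data frame with its person and add the name column
frames <- Map(function(d, nm) {
  d$name <- nm
  d
}, frames, people)

frames$a2
```

The individual results remain reachable as frames$a1, frames$a2, and so on, which scales to 1000 objects without creating or overwriting 1000 variables.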
