R function to write results into a CSV file

Hi, I have created this R function to do some operations and store the results in separate files.
x1=rnorm(100,0,1)
x2=rnorm(100,0,1)
dataaa=data.frame(x1,x2)
func1=function(dataaa,name1,name2)
{
xqr=dataaa[,1]^2
xcube=dataaa[,1]^3
write.csv(xqr,"name1.csv")
write.csv(xcube,"name2.csv")
}
func1(dataaa,xr,xc)
The function runs, but the file names did not change: the two CSV files should be named xr.csv and xc.csv, but they were saved as name1.csv and name2.csv.
How can I modify this function so that I get the correct file names?
Thank you.

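This should work. Build each file name from the argument with paste0 instead of the literal string "name1.csv", and pass the names as quoted strings when calling it, e.g. func1(dataaa,"xr","xc"):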
func1=function(dataaa,name1,name2)
{
xqr=dataaa[,1]^2
xcube=dataaa[,1]^3
write.csv(xqr, paste0(name1,".csv"))
write.csv(xcube, paste0(name2,".csv"))
}
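As a minimal sketch of the same idea, the variant below takes the file names as character strings and writes into a temporary directory, so it can be run without touching the working directory. The function name write_powers and the out_dir argument are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: write squares and cubes of the first column
# to two CSV files whose names are supplied as character strings.
write_powers <- function(dat, name1, name2, out_dir = tempdir()) {
  sq   <- dat[, 1]^2
  cube <- dat[, 1]^3
  # Build the paths from the arguments, not from literal strings
  path1 <- file.path(out_dir, paste0(name1, ".csv"))
  path2 <- file.path(out_dir, paste0(name2, ".csv"))
  write.csv(sq, path1)
  write.csv(cube, path2)
  # Return the paths invisibly so the caller can check them
  invisible(c(path1, path2))
}

dat   <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
paths <- write_powers(dat, "xr", "xc")
basename(paths)
```

Passing unquoted xr and xc would fail here with "object not found", because paste0 has to evaluate the arguments.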

Related

How to save the R output in different directory?

I have three different folders named "5k", "10k" and "15k". I can save the R output from the following code using this for loop.
iter_no=c(5000,10000,15000)
iter_name=c("5k","10k","15k")
for ( i in 1:length(iter_no)){
y=rnorm(iter_no[i])
setwd(paste0("C:/Users/Owner/Desktop/prac_fol/",iter_name[i]))
save(y, file =paste0("ydat",iter_name[i],".RData"))
}
Is there a shortcut or a better way to do this?
Any help is appreciated.
Try the following code. Rather than calling setwd inside the loop, it builds the full path in one paste0 call; note the "/" between the folder name and the file name.
iter_no=c(5000,10000,15000)
iter_name=c("5k","10k","15k")
for ( i in 1:length(iter_no)){
y=rnorm(iter_no[i])
file = paste0("C:/Users/Owner/Desktop/prac_fol/",iter_name[i], '/' , "ydat",iter_name[i],".RData")
save(y, file = file)
}
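A possible variation, sketched under the assumption that the target folders may not exist yet: file.path joins the path components and dir.create makes each folder on demand. The temporary base_dir is a stand-in for the real "C:/Users/Owner/Desktop/prac_fol" directory, and the saved vector is added only to keep track of what was written.

```r
# Assumption: a temporary base directory stands in for the real one
base_dir  <- file.path(tempdir(), "prac_fol")
iter_no   <- c(5000, 10000, 15000)
iter_name <- c("5k", "10k", "15k")

saved <- character(length(iter_no))
for (i in seq_along(iter_no)) {
  y <- rnorm(iter_no[i])
  out_dir <- file.path(base_dir, iter_name[i])
  # Create the folder if it does not exist yet
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  saved[i] <- file.path(out_dir, paste0("ydat", iter_name[i], ".RData"))
  save(y, file = saved[i])
}
saved
```

Because every path is absolute, the working directory is never changed, which avoids surprises later in the session.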

Converting the argument name of a function into string

I have developed a function that takes a list of files, runs some statistical tests, and generates an Excel file. In the last line of the function I want to write an Excel file with the same name as the input object. In my example it should give list_file.xlsx. If I pass another object, say tslist_file, it should automatically produce tslist_file.xlsx. The function itself works properly. Please suggest how to write the last line of the function so that it generalises.
newey<-function(list_files){
tsmom<-do.call(cbind,lapply(list_files,function(x) read_excel(x)[,2]))
tsmom<-xts(tsmom[,1:5],order.by = seq(as.Date("2005-02-01"),length=183,by="months")-1)
names(tsmom)<-c("tsmom121","tsmom123","tsmom126","tsmom129","tsmom1212")
## newey west
newey_west<-function(x){
model<-lm(x~1)
newey_west<-coeftest(model,vcov=NeweyWest(model,verbose=T))
newey_west[c(1,3,4)]
}
## running newey west
cs_nw_full<-do.call(cbind,lapply(tsmom,newey_west))
library(gtools)
p_values<-cs_nw_full[3,]
cs_nw_full[2,]<-paste0(cs_nw_full[2,],stars.pval(p_values))
write.xlsx(cs_nw_full,"list_file.xlsx")
}
Try:
write.xlsx(cs_nw_full, paste0(eval(substitute(list_files)), ".xlsx"))
Edit:
@jeetkamal is absolutely right - you need to use
write.xlsx(cs_nw_full, paste0(deparse(substitute(list_files)), ".xlsx"))
here.
I apologize for the mistake. eval would only work if list_files were, for example, the name of a file rather than a list object.
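To illustrate what deparse(substitute(...)) does here, the following minimal sketch uses a hypothetical helper, output_name, and a made-up input vector. It only builds the file name and does not write any Excel file.

```r
# Hypothetical helper: derive an output file name from the name of
# the object passed as the argument.
output_name <- function(list_files, ext = ".xlsx") {
  # substitute() captures the unevaluated argument expression,
  # deparse() turns that expression into a character string
  paste0(deparse(substitute(list_files)), ext)
}

tslist_file <- c("a.xlsx", "b.xlsx")  # made-up input for illustration
output_name(tslist_file)              # "tslist_file.xlsx"
```

This relies on the caller passing a plain variable name; if an expression such as c("a.xlsx", "b.xlsx") is passed directly, the deparsed code of that expression ends up in the file name.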

How to use a list name as character

I would like to train a model and give it a name. I would like to use this name as character as well to create a text file with model summary. So I created a function as below
C50Training<-function(ModeName,DF_Trai,Form,
Str_PathSum){
library(C50);
ModeName<-C5.0(formula=Form,data=DF_Trai);
capture.output(summary(ModeName),file=paste(Str_PathSum,"/Summ",ModeName,".txt",sep=""));
}
In the function I want to use ModeName as a character string. I tried to run it but it does not work. ModeName is a list in this case. How can I use ModeName as a character string?
To change a variable name to string, you can use deparse and substitute, as follows:
deparse(substitute(ModeName))
It returns "ModeName", which can be used as part of your file path.
I tried this. It works.
ModeName=c(1,2,3)
f<-function(ModeName){
print(paste("/Summ",deparse(substitute(ModeName)),".txt",sep=""))
}
f(ModeName)
and this works too:
ModeName=c(1,2,3)
f<-function(list){
print(paste("/Summ",deparse(substitute(list)),".txt",sep=""))
}
f(ModeName)
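One caveat for the original C50Training function: it reassigns ModeName inside the body, and after that reassignment substitute(ModeName) would pick up the new value rather than the expression the caller typed. A minimal sketch of a workaround, capturing the name first, is shown below. The function name train_and_summarise and the variable model_label are hypothetical, and lm() is swapped in for C5.0() so the example runs with base R only.

```r
# Hypothetical stand-in for C50Training: lm() replaces C5.0() so the
# sketch runs with base R only.
train_and_summarise <- function(ModeName, DF_Trai, Form, Str_PathSum) {
  # Capture the argument's name before anything is assigned to it
  model_label <- deparse(substitute(ModeName))
  fit <- lm(formula = Form, data = DF_Trai)
  out_file <- file.path(Str_PathSum, paste0("Summ", model_label, ".txt"))
  # Write the printed model summary to the text file
  capture.output(summary(fit), file = out_file)
  invisible(out_file)
}

# MyModel does not need to exist: the argument is never evaluated
out <- train_and_summarise(MyModel, mtcars, mpg ~ wt, tempdir())
basename(out)
```

Keeping the fitted model in a separate variable (fit) also avoids overwriting the argument at all.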

Loop works outside function but in functions it doesn't.

I have been going around in circles for hours with this. It is my first question online about R. I am trying to create a function that contains a loop. The function takes a vector that the user submits, as in pollutantmean(4:6), then loads a bunch of csv files (from the directory mentioned) and binds them. What is strange (to me) is that if I assign the variable id and then run the loop without using a function, it works! When I put it inside a function so that the user can supply the id vector, it does nothing. Can someone help? Thank you!
pollutantmean<-function(id=1:332)
{
#read files
allfiles<-data.frame()
id<-str_pad(id,3,pad = "0")
direct<-"/Users/ped/Documents/LearningR/"
for (i in id) {
path<-paste(direct,"/",i,".csv",sep="")
file<-read.csv(path)
allfiles<-rbind(allfiles,file)
}
}
Your function is missing a return value. (@Roland)
pollutantmean<-function(id=1:332) {
#read files
allfiles<-data.frame()
id<-str_pad(id,3,pad = "0")
direct<-"/Users/ped/Documents/LearningR/"
for (i in id) {
path<-paste(direct,"/",i,".csv",sep="")
file<-read.csv(path)
allfiles<-rbind(allfiles,file)
}
return(allfiles)
}
Edit:
Your mistake was that you did not specify what the function should return. In R, objects created inside a function live in the function's own environment, so you have to specify which object the function returns.
With my comment about accepting my answer, I meant this: (...To mark an answer as accepted, click on the check mark beside the answer to toggle it from greyed out to filled in...).
Consider also lapply with do.call, which does not need an explicit return: a function returns the value of its last expression, and an assignment returns its value invisibly:
pollutantmean <- function(id=1:332) {
id <- str_pad(id,3,pad = "0")
direct_files <- paste0("/Users/ped/Documents/LearningR/", id, ".csv")
# READ FILES INTO LIST AND ROW BIND
allfiles <- do.call(rbind, lapply(direct_files, read.csv))
}
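The three behaviours discussed in these answers can be compared in a small sketch. The function names below are made up for illustration and do not read any files.

```r
# A for loop evaluates to NULL, so this function returns NULL
no_return <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
}

# Naming the object as the last expression returns it
with_return <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# An assignment as the last expression returns its value invisibly:
# nothing is printed at the console, but the result can be stored
invisible_return <- function(n) {
  total <- sum(seq_len(n))
}

is.null(no_return(3))       # TRUE
with_return(3)              # 6
res <- invisible_return(3)
res                         # 6
```

In every case the objects created inside the function stay local to it; only the returned value reaches the caller.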
OK, I got it. I was expecting the objects built inside the function to be created and show up in R's global environment, but they don't; R still does all the calculations. Thanks a lot for the replies!
pollutantmean<-function(directory,pollutant,id)
{
#read files
allfiles<-data.frame()
id2<-str_pad(id,3,pad = "0")
direct<-paste("/Users/pedroalbuquerque/Documents/Learning R/",directory,sep="")
for (i in id2) {
path<-paste(direct,"/",i,".csv",sep="")
file<-read.csv(path)
allfiles<-rbind(allfiles,file)
}
#averaging pollutants
mean(allfiles[,pollutant],na.rm = TRUE)
}
pollutantmean("specdata","nitrate",23:35)

Merging a large number of csv datasets

Here are 2 sample datasets.
PRISM-APPT_1895.csv
https://copy.com/SOO2KbCHBX4MRQbn
PRISM-APPT_1896.csv
https://copy.com/JDytBqLgDvk6JzUe
I have 100 of these types of data sets that I'm trying to merge into one data frame, export that to csv, and then merge that into another very large dataset.
I need to merge everything by "gridNumber" and "Year", creating a time series dataset.
Originally, I imported all of the annual datasets and then tried to merge them with this :
df <- join_all(list(Year_1895, Year_1896, Year_1897, Year_1898, Year_1899, Year_1900, Year_1901, Year_1902,
Year_1903, Year_1904, Year_1905, Year_1906, Year_1907, Year_1908, Year_1909, Year_1910,
Year_1911, Year_1912, Year_1913, Year_1914, Year_1915, Year_1916, Year_1917, Year_1918,
Year_1919, Year_1920, Year_1921, Year_1922, Year_1923, Year_1924, Year_1925, Year_1926,
Year_1927, Year_1928, Year_1929, Year_1930, Year_1931, Year_1932, Year_1933, Year_1934,
Year_1935, Year_1936, Year_1937, Year_1938, Year_1939, Year_1940, Year_1941, Year_1942,
Year_1943, Year_1944, Year_1945, Year_1946, Year_1947, Year_1948, Year_1949, Year_1950,
Year_1951, Year_1952, Year_1953, Year_1954, Year_1955, Year_1956, Year_1957, Year_1958,
Year_1959, Year_1960, Year_1961, Year_1962, Year_1963, Year_1964, Year_1965, Year_1966,
Year_1967, Year_1968, Year_1969, Year_1970, Year_1971, Year_1972, Year_1973, Year_1974,
Year_1975, Year_1976, Year_1977, Year_1978, Year_1979, Year_1980, Year_1981, Year_1982,
Year_1983, Year_1984, Year_1985, Year_1986, Year_1987, Year_1988, Year_1989, Year_1990,
Year_1991, Year_1992, Year_1993, Year_1994, Year_1995, Year_1996, Year_1997, Year_1998,
Year_1999, Year_2000),
by = c("gridNumber","Year"),type="full")
But R keeps crashing, I think because the merge is too large for it to handle, so I'm looking for something that would work better. Maybe data.table? Or another option.
Thanks for any help you can provide.
Almost nine months later, your question still has no answer. I could not find your datasets, but I will show one way to do the job. It is trivial in awk.
Here is a minimal awk script:
BEGIN {
for(i=0;i<10;i++) {
filename = "out" i ".csv";
# stop at end of file (0) or on a read error (-1), e.g. a missing file
while((getline line < filename) > 0) print line;
close(filename);
}
}
The script is run as
awk -f s.awk
where s.awk is the above script in a text file.
This script builds ten filenames: out0.csv, out1.csv ... out9.csv. These are the already-existing files with the data. The first file is opened and all of its records are sent to standard output. The file is then closed, and the next filename is built and opened. As written, the script has little to offer over a command-line read/redirect. You would typically use awk to process a long list of filenames read from another file, with statements that selectively ignore lines or columns depending on various criteria.
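Since the question asked for an approach in R, here is a minimal sketch of the same concatenation idea. It assumes that each yearly file holds the same columns for a single year, so stacking the rows already gives the gridNumber-by-Year time series and no join is needed. The sample data, the APPT column name and the temporary folder are made up for illustration, since the original datasets are not available.

```r
# Made-up data: two small yearly files in a temporary folder
data_dir <- file.path(tempdir(), "prism")
dir.create(data_dir, showWarnings = FALSE)
for (yr in c(1895, 1896)) {
  df <- data.frame(gridNumber = 1:3, Year = yr, APPT = c(1.5, 2.5, 3.5))
  write.csv(df, file.path(data_dir, paste0("PRISM-APPT_", yr, ".csv")),
            row.names = FALSE)
}

# Read every matching file and stack the rows into one data frame
files  <- list.files(data_dir, pattern = "^PRISM-APPT_[0-9]{4}[.]csv$",
                     full.names = TRUE)
yearly <- lapply(files, read.csv)
long   <- do.call(rbind, yearly)

# Sort into a gridNumber-by-Year time series layout
long <- long[order(long$gridNumber, long$Year), ]
nrow(long)
```

If the yearly files instead carry different variables that must sit side by side, a real join is needed; Reduce(function(a, b) merge(a, b, by = c("gridNumber", "Year"), all = TRUE), yearly) is one base R option, though it may be slow for very large inputs.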
