Write Excel files to a path with a variable in R [duplicate]

I want to manipulate different .csv files through a loop and a list. That part works fine, but for the output I have to create many .xlsx files, and each file has to be named according to the value of a certain variable.
I've already tried piping into the write_xlsx function with an ifelse condition, like:
for (i in 1:length(files)) {
  files[[i]] %>%
    write_xlsx(files[[i]], paste(ifelse(x = "test1", "/Reportings/test1.xlsx",
                                 ifelse(x = "test2", "/Reportings/test2.xlsx", "test3")
}
I expect that multiple .xlsx files will be created in the folder Reportings.

Not easy to answer precisely with the information you gave, but here is a minimal example that seems to do what you want.
This assumes that your list is composed of data frames, that x is a variable in each of them, and that it always has the same value within a given element.
library(openxlsx)  # provides write.xlsx

df <- data.frame(x = rep("test1", 3), y = rep("test1", 3))
df2 <- data.frame(x = rep("test2", 3), y = rep("test2", 3))
files <- list(df, df2)
files[[1]]$x[1]
for (i in 1:length(files)) {
  write.xlsx(files[[i]], paste0("Reportings/", files[[i]]$x[1], ".xlsx"))
}
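The key move is just that the output path is built from the data itself. A minimal base-R sketch of the filename construction alone (no Excel package needed; the data frames are invented for illustration):

```r
df  <- data.frame(x = rep("test1", 3), y = rep("test1", 3))
df2 <- data.frame(x = rep("test2", 3), y = rep("test2", 3))
files <- list(df, df2)

# One output path per list element, taken from the first value of x
paths <- vapply(files, function(d) paste0("Reportings/", d$x[1], ".xlsx"),
                character(1))
```

Any writer (write_xlsx, write.xlsx, write_csv) can then be fed paths[i] inside the loop.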

Related

How to extract a data point from multiple csv files using R

I have a very big data frame and multiple copies of it that I programmed R to read automatically. I am now trying to figure out a way to read the data points automatically as well.
I previously had the code written as the following when it was reading a single csv file that I named "mydata":
subject_variable = mydata$X0[1]
size_variable = mydata$X29[1]
vertical_movement =
if (mydata$X37[1] == 0)
{
"None"
}else{
"Moving"
}
horizontal_start= mydata$X36[1]
Instead of a single data file, I currently have a list of data files. How would the code I wrote above change so that it reads the list of data files that I have?
Would I have to use a for loop for each variable? or would it work with just one for loop?
Thanks!
From your code, it seems like mydata is a dataframe, so as long as you have a list of dataframes, the following approach will work well:
#create a function with the process you've already done for one of the dataframes
fn <- function(mydata){
  subject_variable <- mydata$X0[1]
  size_variable <- mydata$X29[1]
  vertical_movement <- if (mydata$X37[1] == 0) "None" else "Moving"
  horizontal_start <- mydata$X36[1]
  # return the extracted values so lapply can collect them
  list(subject = subject_variable,
       size = size_variable,
       vertical_movement = vertical_movement,
       horizontal_start = horizontal_start)
}
#do that for all the dataframes (note the argument order: the data comes first, then the function)
lapply(list_of_dataframes, fn)
If you have a list of file locations, you can also use lapply along with read_csv (or similar) to create a list of dataframes and use the above approach.
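A minimal sketch of that last step, using two throwaway temp files in place of the real CSVs (the column names are invented for illustration):

```r
# Two throwaway CSV files stand in for the real data files
paths <- replicate(2, tempfile(fileext = ".csv"))
for (p in paths) {
  write.csv(data.frame(X0 = 1, X37 = 0), p, row.names = FALSE)
}

# lapply + read.csv turns the file locations into a list of dataframes,
# which can then be passed to fn via lapply(list_of_dataframes, fn)
list_of_dataframes <- lapply(paths, read.csv)
```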

Why does "write.dat" (R) save data files within folders?

In order to conduct some analysis using a particular software, I am required to have separate ".dat" files for each participant, with each file named as the participant number, all saved in one directory.
I have tried to do this using the "write.dat" function in R (from the 'multiplex' package).
I have written a loop that outputs a ".dat" file for each participant in a dataset. I would like each file that is outputted to be named the participant number, and for them all to be stored in the same folder.
## Using write.dat
participants_ID <- unique(newdata$SJNB)
for (i in 1:length(participants_ID)) {
data_list[[i]] <- newdata %>%
filter(SJNB == participants_ID[i])
write.dat(data_list[[i]], paste0("/Filepath/Directory/", participants_ID[i], ".dat"))
}
## Using write_csv this works perfectly:
participants_ID <- unique(newdata$SJNB)
for (i in 1:length(participants_ID)) {
newdata %>%
filter(SJNB == participants_ID[i]) %>%
write_csv(paste0("/Filepath/Directory/", participants_ID[i], ".csv"), append = FALSE)
}
If I use the function "write_csv", this works perfectly (saving .csv files for each participant). However, if I use the function "write.dat" each participant file is saved inside a separate folder - the folder name is the participant number, and the file inside the folder is called "data_list[[i]]". In order to get all of the data_list files into the same directory, I then have to rename them which is time consuming.
I could theoretically output the files to .csv and then convert them to .dat, but I'm just intrigued to know if there's anything I could do differently to get the write.dat function to work the way I'm trying it :)
The documentation on write.dat is subminimal, but it would appear that you have confused a directory path with a file name. You have effectively created a directory named "/Filepath/Directory/<participants_ID[i]>.dat", and that's where each output file is placed. That you cannot assign a name to the .dat file itself appears to be a defect in the package as supplied.
However, not all is lost. Inside your loop, replace your write.dat line with the following lines, or something similar (not tested):
edit
It occurs to me that there's a smoother solution, albeit using the dreaded eval:
Again inside the loop (assuming participants_ID[i] is a character string):
eval(parse(text = paste0(participants_ID[i], ' <- data_list[[i]]')))
eval(parse(text = paste0('write.dat(', participants_ID[i], ', "/Filepath/Directory/")')))
previous answer
write.dat(data_list[[i]], "/Filepath/Directory/")
thecommand <- paste0("mv '/Filepath/Directory/data_list[[i]]' /Filepath/Directory/",
                     participants_ID[i], ".dat")
system(thecommand)
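A more portable variant of that rename step uses base R's file.rename instead of shelling out to mv (the directory and participant ID below are illustrative stand-ins for the loop variables):

```r
out_dir <- tempdir()
participant <- "P01"

# Per the question, write.dat leaves a file literally named "data_list[[i]]"
# in the output directory; simulate that here, then rename it to <participant>.dat
writeLines("dummy", file.path(out_dir, "data_list[[i]]"))
ok <- file.rename(file.path(out_dir, "data_list[[i]]"),
                  file.path(out_dir, paste0(participant, ".dat")))
```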

Reading multiple offline html files to a list in R

I have rawdata as 20 offline html files stored in following format
../rawdata/1999_table.html
../rawdata/2000_table.html
../rawdata/2001_table.html
../rawdata/2002_table.html
.
.
../rawdata/2017_table.html
These files contain tables that I am extracting and reshaping to a particular format.
I want to read these files at once to a list and process them one by one through a function that I have written.
What I tried:
I put the names of these files into a file called filestoread.csv and used a for loop to load the files using the names listed there, but it doesn't seem to work:
filestoread <- fread("../rawdata/filestoread.csv")
x <- list()
for (i in nrow(filestoread)) {
x[[i]] <- read_html(paste0("../rawdata/", filestoread[i]))
}
How can this be done?
Also, after reading the HTML files I want to extract the tables from them and reshape them using a function I wrote after converting it to a data table.
My final objective is to rbind all the tables and have a single data table with year wise entries of the tables in the html file.
First, store the paths of your data files in one of the following ways.
Either hardcoded:
filestoread <- paste0("../rawdata/", 1999:2017, "_table.html")
or by listing all HTML files in the directory (full.names = TRUE keeps the "../rawdata/" prefix on each path):
filestoread <- list.files(path = "../rawdata/", pattern = "\\.html$", full.names = TRUE)
Then use lapply()
library(rvest)
lapply(filestoread, function(x) try(read_html(x)))
Note: try() keeps the loop going even when a file is missing, instead of stopping at the error.
The second part of your question is a little broad, depends on the content of your files, and there are already some answers, you could consider e.g. this answer. In principle you use a combination of ?html_nodes and ?html_table.
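For the combining step, once each file has been parsed into a data frame (e.g. with html_table()), the year-tagging and rbind can be sketched in base R (the table contents here are invented for illustration):

```r
# Pretend these two data frames came from html_table() on the parsed files
tables <- list(data.frame(value = 1:2), data.frame(value = 3:4))
filestoread <- c("../rawdata/1999_table.html", "../rawdata/2000_table.html")

# Pull the year out of each file name, attach it as a column, then stack
years <- as.integer(sub(".*(\\d{4})_table\\.html$", "\\1", filestoread))
tables <- Map(function(tb, yr) transform(tb, year = yr), tables, years)
combined <- do.call(rbind, tables)
```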

Dynamically define tables to write to csv in r

I am trying to write multiple dataframes to multiple .csv files dynamically. I have found online how to do the latter part, but not the former (dynamically defining the dataframe).
# create separate dataframes from each 12 month interval of closed age
for (i in 1:max_age) {assign(paste("closed",i*12,sep=""),
mc_masterc[mc_masterc[,7]==i*12,])
write.csv(paste("closed",i*12,sep=""),paste("closed",i*12,".csv",sep=""),
row.names=FALSE)
}
In the code above, the problem is with the first part of the write.csv statement. It will create the .csv file dynamically, but not with the actual content from the table I am trying to specify. What should the first argument of the write.csv statement be? Thank you.
The first argument of write.csv needs to be an R object, not a string. If you don't need the objects in memory you can do it like so:
for (i in 1:max_age) {
  df <- mc_masterc[mc_masterc[,7]==i*12,]
  write.csv(df, paste("closed", i*12, ".csv", sep=""),
            row.names=FALSE)
}
and if you need them in memory, you can either do that separately, or use get to return an object based on a string. Separately:
for (i in 1:max_age) {
  df <- mc_masterc[mc_masterc[,7]==i*12,]
  assign(paste("closed", i*12, sep=""), df)
  write.csv(df, paste("closed", i*12, ".csv", sep=""),
            row.names=FALSE)
}
With get:
for (i in 1:max_age) {
  assign(paste("closed", i*12, sep=""), mc_masterc[mc_masterc[,7]==i*12,])
  write.csv(get(paste("closed", i*12, sep="")),
            paste("closed", i*12, ".csv", sep=""),
            row.names=FALSE)
}
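The assign()/get() pairing used above can be checked in isolation:

```r
# assign() creates a variable whose name is computed from a string;
# get() retrieves the object back from that same string
nm <- paste("closed", 1*12, sep = "")
assign(nm, data.frame(months = 12))
df_back <- get(nm)
```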

R: Loading data from folder with multiple files

I have a folder with multiple files to load:
Every file contains a list, and I want to combine all the loaded lists into a single list. I am using the following code (the variable loaded from each file is called TotalData):
Filenames <- paste0('DATA3_',as.character(1:18))
Data <- list()
for (ii in Filenames){
  load(ii)
  Data <- append(Data, TotalData)
}
Is there a more elegant way to write it? For example using apply functions?
You can use lapply. I assume that your files have been stored using save, because you use load to get them. I create two files to use in my example as follows:
TotalData <- list(1:10)
save(TotalData, file="DATA3_1")
TotalData <- list(11:20)
save(TotalData, file="DATA3_2")
And then I read them in by
Filenames <- paste0('DATA3_',as.character(1:2))
Data <- lapply(Filenames, function(fn) {
  load(fn)
  return(TotalData)
})
After this, Data will be a list that contains the lists from the files as its elements. Since you are using append in your example, I assume this extra nesting is not what you want, so I remove one level of it with
Data <- unlist(Data,recursive=FALSE)
For my two example files, this gave the same result as your code. Whether it is more elegant can be debated, but I would claim that it is more R-ish than the for-loop.
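The effect of unlist(recursive = FALSE) on the nesting can be seen in a tiny standalone example:

```r
# Each "file" contributes a one-element list; lapply wraps them in an outer list
Data <- list(list(1:10), list(11:20))

# Dropping exactly one level of nesting leaves a flat list of two vectors
flat <- unlist(Data, recursive = FALSE)
```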
