I'm having some trouble writing a function to read in multiple .xlsx files as separate data frames in R with a for loop. When I run the function nothing happens: no errors, but no data frames are loaded into R either. When I take the assign piece out of the function and manually replace the loop input with an example, the assign call works. Here is an example of the code:
library(readxl)
Load<-function(File_Path,Samp){
setwd(File_Path)
for (i in 1:length(Samp)){
assign(paste("Sample_",Samp[i],sep = ""),read_excel(paste("Sample_",Samp[i],".xlsx",sep = "")))
}
}
Load(File_Path = "~/Desktop/Test",Samp = "A") # Doesn't Work
#When this piece is taken out of the loop and the Sample ("A") replaced it works.
assign(paste("Sample_","A",sep = ""),read_excel(paste("Sample_","A",".xlsx",sep = ""))) # Does Work
In reality there is a long list of samples to load, and I want to do so by assigning a vector to "Samp" such as c("A","C","D"). Thanks up front for any assistance.
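You can fix your problem by adding inherits = TRUE to assign(). By default, assign() creates the object in the environment it is called from, here the function's own environment, which is discarded when Load() returns. With inherits = TRUE, R searches the enclosing environments for an existing variable of that name and, if none is found, assigns in the global environment.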
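As a rough illustration of why the original function appears to do nothing, here is a minimal sketch. fake_read() is a hypothetical stand-in for read_excel() so that the example runs without any files, and load_local(), load_into() and target are names made up for this sketch. It shows one alternative to inherits = TRUE, namely passing the target environment explicitly through envir.

```r
# Hypothetical stand-in for read_excel(); it only records the path it was given.
fake_read <- function(path) data.frame(file = path, stringsAsFactors = FALSE)

# Same shape as the question's Load(): assign() writes into the function's
# own environment, so the objects vanish when the function returns.
load_local <- function(samp) {
  for (s in samp) {
    assign(paste0("Sample_", s), fake_read(paste0("Sample_", s, ".xlsx")))
  }
  invisible(NULL)
}

# Variant that names the target environment explicitly (defaults to global).
load_into <- function(samp, envir = globalenv()) {
  for (s in samp) {
    assign(paste0("Sample_", s), fake_read(paste0("Sample_", s, ".xlsx")),
           envir = envir)
  }
  invisible(NULL)
}

target <- new.env()                      # a scratch environment for the demo
load_local(c("A", "C"))
load_into(c("A", "C"), envir = target)

local_visible <- exists("Sample_A", inherits = FALSE)  # nothing was created here
loaded_names <- ls(target)                              # objects created in target
```

Calling load_into(c("A", "C")) without the envir argument would create Sample_A and Sample_C in the global environment instead, which is what the question was after.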
Related
I have a very big data frame and multiple copies of it that I programmed R to read automatically. I am now trying to figure out a way to read the data points automatically as well.
I previously had the code written as the following when it was reading a single csv file that I named "mydata":
subject_variable = mydata$X0[1]
size_variable = mydata$X29[1]
vertical_movement =
if (mydata$X37[1] == 0)
{
"None"
}else{
"Moving"
}
horizontal_start= mydata$X36[1]
Instead of a single data file, I currently have a list of data files. How would the code I wrote above change so that it reads the list of data files that I have?
Would I have to use a for loop for each variable, or would it work with just one for loop?
Thanks!
From your code, it seems like mydata is a dataframe, so as long as you have a list of dataframes, the following approach will work well:
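# Create a function with the process you've already done for one of the dataframes
fn <- function(mydata){
  subject_variable = mydata$X0[1]
  size_variable = mydata$X29[1]
  vertical_movement =
    if (mydata$X37[1] == 0)
    {
      "None"
    }else{
      "Moving"
    }
  horizontal_start= mydata$X36[1]
  # Return all four values; otherwise only the last assignment is returned
  list(subject_variable = subject_variable,
       size_variable = size_variable,
       vertical_movement = vertical_movement,
       horizontal_start = horizontal_start)
}
# Do that for all the dataframes (the list comes first, the function second)
lapply(list_of_dataframes, fn)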
If you have a list of file locations, you can also use lapply along with read_csv (or similar) to create a list of dataframes and use the above approach.
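To make that last point concrete, here is a minimal sketch using base R's read.csv() rather than read_csv(). The column names X0, X29, X36 and X37 come from the question; the temporary directory, the file names, the values and the helper first_row_summary() are all made up for illustration.

```r
# Create two small example files in a temporary directory (made-up data).
tmp <- tempfile("csv_demo_")
dir.create(tmp)
for (k in 1:2) {
  write.csv(data.frame(X0 = k, X29 = 10 * k, X36 = 5, X37 = k - 1),
            file.path(tmp, paste0("subject_", k, ".csv")),
            row.names = FALSE)
}

# Read every CSV into one list of data frames, named after the files.
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
list_of_dataframes <- lapply(files, read.csv)
names(list_of_dataframes) <- basename(files)

# Hypothetical helper: the question's first-row logic as a one-row data frame.
first_row_summary <- function(mydata) {
  data.frame(
    subject_variable  = mydata$X0[1],
    size_variable     = mydata$X29[1],
    vertical_movement = if (mydata$X37[1] == 0) "None" else "Moving",
    horizontal_start  = mydata$X36[1],
    stringsAsFactors  = FALSE
  )
}

# One row per file, bound together into a single data frame.
results <- do.call(rbind, lapply(list_of_dataframes, first_row_summary))
```

Returning a one-row data frame from the helper means a single lapply() call covers all four variables, so no separate loop per variable is needed.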
I'm attempting to write an R script in a way that remains as automated as possible. To this end, I am trying to create a for loop to execute a function on multiple files. The outputs need to be saved as objects for the purposes of the program I am using and therefore each output from the for loop needs to have a distinct name. This is the code I have so far:
filenames <- as.list(Sys.glob("*.ab1"))
SeqOb <- list()
for (i in filenames)
{
SeqOb <- readsangerseq(i)
}
"readsangerseq" is the function I'm attempting to execute to create multiple SeqOb objects. What I've read from other discussions led me to create an empty list in which to store my output objects, but I can't seem to figure out how to make the for loop write them as distinct outputs.
If you would like to continue using the for loop and want distinct outputs instead of a list you may consider using assign(paste()) in order to give each file a unique object name. Although, as a relative newcomer to R myself, I'm starting to learn there are more elegant ways than for loops as well, such as MrFlick's answer.
for (i in seq_along(filenames)) {
  # You may be able to substitute your function in the line below.
  # Pass the file name, filenames[[i]], rather than the loop index i.
  assign(paste("SomeNamingRule", i, sep = ""), readsangerseq(filenames[[i]]))
}
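For comparison, here is a minimal sketch of the list-based alternative, where each result is stored under its own name in one list instead of as a separate object. read_stub() is a hypothetical stand-in for readsangerseq() so the example runs without the package or any .ab1 files, and the file names are made up.

```r
# Hypothetical stand-in for readsangerseq(); it only records the file name.
read_stub <- function(path) list(source = path)

# Made-up file names; in the question these come from Sys.glob("*.ab1").
filenames <- c("run1.ab1", "run2.ab1")

# One list element per file, named after the file without its extension.
SeqOb <- lapply(filenames, read_stub)
names(SeqOb) <- sub("\\.ab1$", "", filenames)

# Each result is still individually addressable by name.
second <- SeqOb[["run2"]]
```

If the downstream program really does need free-standing objects, list2env(SeqOb, envir = globalenv()) would create one object per list element.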
I want to read in a pdf via pdf tools, extract some data from it and write it to a csv. I have been able to do this successfully for one pdf, but I have many (440) to do. I'm trying to write a loop that goes through a list I have created that has all my file paths in it. The problem is it overwrites every time. So I think my program is doing what I've asked of it, but I am not asking the correct thing! My code is below:
temp <-as.list(list.files(pattern = "*.pdf"))
file_path <- file.path(getwd(),temp)%>%as.list()
data_anz<-as.character()
for (i in 1:length(file_path)){
data_anz<-pdf_text(file_path[[i]])[2]%>%str_split("\n")%>%.[[1]]%>%str_split_fixed("\\s{2,}", n=4)%>%as.data.frame(i:length(file_path))%>%rename(Commodity =V1, Level = V2, Change = V3, Description = V4)
}
What I would like to achieve is a data frame that grows with every iteration instead of being overwritten. So after the first run, the df has 1 row and 4 cols, after the next run 2 rows, etc.
I'm lost! But I can get it to work for an individual member of the list and it seems to work through the whole list, but overwrites.
Any help would be super appreciated!
Each iteration of the loop is assigning your table to the same variable. You might want to try something like
data_anz<-list()
for (i in 1:length(file_path)){
data_anz[[i]] <- ...
}
data_anz_all <- do.call(rbind, data_anz)
which puts each table into its own position in a list, and then row-binds them all together at the end (assuming the columns of the individual frames are compatible).
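A minimal runnable sketch of that pattern follows. The strings in pages are made-up stand-ins for the text that pdf_text() would return, and base strsplit() is used in place of str_split_fixed() so that no extra packages are needed; the column names are the ones from the question.

```r
# Made-up stand-ins for one extracted line per PDF.
pages <- list(
  "Wheat  310  +2  Firm demand",
  "Corn  190  -1  Ample supply"
)

# Pre-allocate one list slot per input.
data_anz <- vector("list", length(pages))

for (i in seq_along(pages)) {
  # Split on runs of two or more whitespace characters, as in the question.
  fields <- strsplit(pages[[i]], "\\s{2,}")[[1]]
  data_anz[[i]] <- data.frame(
    Commodity   = fields[1],
    Level       = fields[2],
    Change      = fields[3],
    Description = fields[4],
    stringsAsFactors = FALSE
  )
}

# Row-bind all the per-file tables at the end.
data_anz_all <- do.call(rbind, data_anz)
```

Binding once at the end also avoids copying the growing data frame on every iteration, which matters more as the number of files increases.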
I have a series of tables and graphs that are produced from a list in R. I would like to create a pdf for each element of the list. I have tried simply using pdf() within the function but I get the error that too many graphics devices are open. How can I do this (and name the files according to the list element name)?
So far I have tried:
ReportPDF<-function(x){
pdf(paste(name(x),"~\\Myfile.pdf")
tb<-table(x$acolumn)
print(fb)
}
lapply(mylist,ReportPDF)
I can't quite work out how to attach the name of the list element to the filename, and I'm not even sure this is the best way to create a pdf from lapply.
Can you clean some of this up?
Please give a more specific example of the object you're passing to ReportPDF(), I would expect a plot object, rather than what appears to be a data frame that you are selecting a column from.
The function example has some errors too, did you mean this?
ReportPDF<-function(x){
pdf(paste(names(x),"Myfile.pdf"))
tb<-table(x$acolumn)
print(tb)
dev.off()
}
lapply(mylist,ReportPDF)
I believe I've done something similar before and can update this answer when I get the other information.
Here's an update making some assumptions about your objects. It uses a for loop as lmo suggests, but I think a more elegant method must exist. I'm using the for loop because lapply passes the object within each element of the list, with no reference to name of the element in the list -- which is what you need to name the files individually. Note the difference between calling mylist[i] and mylist[[i]], which is part of what's breaking the code in your example. In your code, names(x) will get the names of the columns within x, rather than the name of x as it is inside of mylist, which is what you want.
x <- data.frame(acolumn = rnorm(10))
q<- data.frame(acolumn = rnorm(10))
mylist <- list(a = x,b = q)
for(i in seq_along(mylist) ){
filename <- paste(names(mylist[i]),'-myFile.pdf', sep = "")
pdf(filename)
plot(mylist[[i]]$acolumn)
dev.off()
}
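One possible way to keep an apply-style call is to iterate over the element names instead of the elements, so the name is available inside the function. This is only a sketch under the same assumptions about mylist as above; out_dir and written are names introduced here, and the files go to a temporary directory.

```r
set.seed(1)  # only so the example data are reproducible
mylist <- list(a = data.frame(acolumn = rnorm(10)),
               b = data.frame(acolumn = rnorm(10)))

out_dir <- tempdir()  # assumption: write the PDFs to a temporary directory

# Iterate over the names, then look each element up by name.
written <- vapply(names(mylist), function(nm) {
  path <- file.path(out_dir, paste0(nm, "-myFile.pdf"))
  pdf(path)
  # Close the device even if plotting fails, so devices do not pile up.
  on.exit(dev.off(), add = TRUE)
  plot(mylist[[nm]]$acolumn, main = nm)
  path
}, character(1))
```

The on.exit(dev.off()) call is what prevents the "too many graphics devices" error from the question: each device is closed before the next one is opened.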
I have a bunch of CSV files and I would like to perform the same analysis (in R) on the data within each file. Firstly, I assume each file must be read into R (as opposed to running a function on the CSV and providing output, like a sed script).
What is the best way to input numerous CSV files to R, in order to perform the analysis and then output separate results for each input?
Thanks (btw I'm a complete R newbie)
You could go for Sean's option, but it's going to lead to several problems:
You'll end up with a lot of unrelated objects in the environment, with the same name as the file they belong to. This is a problem because...
For loops can be pretty slow, and because you've got this big pile of unrelated objects, you're going to have to rely on for loops over the filenames for each subsequent piece of analysis - otherwise, how the heck are you going to remember what the objects are named so that you can call them?
Calling objects by pasting their names in as strings - which you'll have to do, because, again, your only record of what the object is called is in this list of strings - is a real pain. Have you ever tried to call an object when you can't write its name in the code? I have, and it's horrifying.
A better way of doing it might be with lapply().
# List files
filelist <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
# Now we use lapply to perform a set of operations
# on each entry in the list of filenames.
to_dispose_of <- lapply(filelist, function(x) {
# Read in the file specified by 'x' - an entry in filelist
data.df <- read.csv(x, skip = 1, header = TRUE)
# Store the filename, minus .csv. This will be important later.
filename <- substr(x = x, start = 1, stop = (nchar(x)-4))
# Your analysis work goes here. You only have to write it out once
# to perform it on each individual file.
...
# Eventually you'll end up with a data frame or a vector of analysis
# to write out. Great! Since you've kept the value of x around,
# you can do that trivially
write.table(x = data_to_output,
file = paste0(filename, "_analysis.csv"),
sep = ",")
})
And done.
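Here is a small end-to-end sketch of the same idea with the elided analysis step filled in by a placeholder (a mean and a row count). The input files, the value column and the separate output directory are assumptions made for the example, and tools::file_path_sans_ext() replaces the substr() call for stripping the extension.

```r
# Made-up input files in a temporary directory.
in_dir <- tempfile("analysis_in_")
out_dir <- tempfile("analysis_out_")
dir.create(in_dir)
dir.create(out_dir)
write.csv(data.frame(value = c(1, 2, 3)), file.path(in_dir, "first.csv"),
          row.names = FALSE)
write.csv(data.frame(value = c(10, 20)), file.path(in_dir, "second.csv"),
          row.names = FALSE)

filelist <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)

outputs <- vapply(filelist, function(x) {
  data.df <- read.csv(x)
  # Placeholder analysis: one summary row per input file.
  result <- data.frame(mean_value = mean(data.df$value), n = nrow(data.df))
  # Build the output name from the input name, minus its extension.
  stem <- tools::file_path_sans_ext(basename(x))
  out_path <- file.path(out_dir, paste0(stem, "_analysis.csv"))
  write.csv(result, out_path, row.names = FALSE)
  out_path
}, character(1), USE.NAMES = FALSE)
```

Writing the results to a separate directory is a deliberate choice in this sketch: if the _analysis.csv files landed next to the inputs, a second run would pick them up as inputs too.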
You can try the following code after putting all the csv files in the same directory.
filenames = list.files(pattern = "\\.csv$")  # csv file names
for (i in seq_along(filenames)) { assign(filenames[i], read.csv(filenames[i], skip = 1, header = TRUE)) }
Hope this helps !