I have a very stupid question. It has already been asked, but none of the solutions provided seem to work for me.
I am looping over a list containing different data frames, to perform an analysis and save an output file named differently for each input data frame. The name would be something like originalname_output.txt.
I wrote this piece of code, which seems to work fine (it does all the analysis correctly), but it gives an error when it reaches the write.table part.
library(qqman)
library(QuASAR)
list_QuASAR <- list (Fw, Rv, tot) #all of them are dfs
for (i in list_QuASAR){
output <- fitQuasarMpra(i[,2], i[,3], i[,4])
print(sum(output$padj_quasar<0.1))
qq(output$pval3, col = "black", cex = 1)
write.table(output, paste0("quasar_output/", i, "_output.txt"), col.names = T, sep = "\t")
}
fitQuasarMpra is a function of a package called QuASAR. Of course the subdirectory called quasar_output already exists.
The error I am getting is:
Error in file(file, ifelse(append, "a", "w")) :
invalid 'description' argument
In addition: Warning message:
In if (file == "") file <- stdout() else if (is.character(file)) { :
the condition has length > 1 and only the first element will be used
I know it's a trivial problem but I am currently stuck. I may consider switching to lapply, but then I may encounter the same problem and I wanted to solve this first.
Many thanks for your help.
You're trying to use a data frame object (i) as part of a file name; i.e. the data frame itself, not its name. You could try iterating over a named list instead:
list_QuASAR <- list(Fw = Fw, Rv = Rv, tot = tot)
for (i in names(list_QuASAR)) {
  output <- fitQuasarMpra(list_QuASAR[[i]][,2], list_QuASAR[[i]][,3], list_QuASAR[[i]][,4])
  print(sum(output$padj_quasar < 0.1))
  qq(output$pval3, col = "black", cex = 1)
  write.table(output, paste0("quasar_output/", i, "_output.txt"), col.names = TRUE, sep = "\t")
}
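To see why the original loop fails: paste0() is vectorised, so handing it a data frame produces one string per column rather than a single file name. Here is a minimal sketch of that, and of the named-list loop, using toy data. The data frames, the summarise_df() stand-in for the real analysis, and the temporary output directory are all assumptions made for illustration, not part of QuASAR.

```r
# Toy data frames standing in for Fw and Rv (hypothetical data).
Fw  <- data.frame(id = 1:3, a = c(1, 2, 3), b = c(4, 5, 6))
Rv  <- data.frame(id = 1:3, a = c(7, 8, 9), b = c(1, 1, 1))
dfs <- list(Fw = Fw, Rv = Rv)

# Pasting a data frame gives one string per column, not one file name.
bad_name <- paste0("quasar_output/", Fw, "_output.txt")
length(bad_name)  # one element per column of Fw

# Hypothetical stand-in for the real analysis step.
summarise_df <- function(d) data.frame(mean_a = mean(d$a), mean_b = mean(d$b))

# Write into a temporary directory so the sketch leaves no files behind.
out_dir <- file.path(tempdir(), "quasar_output")
dir.create(out_dir, showWarnings = FALSE)

# Iterate over the names so that each file name is a single string.
out_files <- character(0)
for (nm in names(dfs)) {
  res <- summarise_df(dfs[[nm]])
  out_file <- file.path(out_dir, paste0(nm, "_output.txt"))
  write.table(res, out_file, col.names = TRUE, sep = "\t")
  out_files <- c(out_files, out_file)
}
```

The same pattern should carry over to lapply or Map if you iterate over names(dfs) rather than over the data frames themselves.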
Related
I am trying to extract data using the rgbif package for multiple species (once the code works I'll be running a list of about 200 species, so it is important for me to implement a list).
I have tried to adapt code written in following link:
https://github.com/ropensci/rgbif/issues/377
This is what my input file looks like:
csv file
And my code looks as follows:
library("rgbif")
#input <- read.csv("C:/Users/omi30wk/Desktop/TESTsampledata_udi.csv", header = TRUE, fill = TRUE, sep = ",")
#since you guys don't have my csv file here are three samples species I'm using:
# Acanthorrhynchium papillatum, Acrolejeunea sandvicensis, Acromastigum cavifolium
#'taxon' as header, see image posted above of my csv file for clarity
allpts <- vector('list', length(input))
names(allpts) = input
for (taxon in input){
cat(taxon, "\n")
allpts[[taxon]] <- occ_data(scientificName = taxon, limit = 2) #error here
df <- allpts[[taxon]]$data
df$networkKeys = NULL
if (!is.null(df)) {
df <- df[, !apply(df, 2, function(z)
is.null(unlist(z)))]
write.csv(df, paste("/Users/user/Desktop/DATA Bats/allpts_30sept/", gsub(" ", "_", taxon), ".csv", sep = "")) } }
However I get following error message at the moment:
Error in `[[<-`(`*tmp*`, taxon, value = list(`Acanthorrhynchium papillatum` = list( :
no such index at level 1
I'm even happy to try different code to extract data for multiple species. I've already tried many approaches (e.g. loops) that also kept giving me error messages I haven't been able to solve.
Any help is greatly appreciated!
I have been googling for hours but still could not figure this out. I am looping through some objects in the R global environment and saving them under different file names. But an error always pops up in the final stage:
Error in as.vector(x, mode = "character") :
no method for coercing this S4 class to a vector
The idea is to take all the lists in the environment that start with proj2, and eventually save them to a txt file. I guess it is because the file name also contains x, which refers to the object itself rather than its name? But how do I save my files under different names without doing this?
Here is my code:
savelist <- function(x){
x_up <- subset(x, padj <= 0.05 & log2FoldChange >= 0 )
x_up$fac_lab <- rep("treated_up", nrow(x_up))
x_f_up <- as.data.frame(x_up)[,c(7,2,6)]
write.table(x_f_up, file= paste(x, "up", sep ="_"), sep="\t",quote=FALSE, row.names=TRUE, col.names = FALSE, append = FALSE)
}
lapply(mget(ls(pattern = "proj2")), savelist)
Any suggestions will be helpful. Thank you.
I am having trouble using lapply with a custom function I created, but it is a fairly specific function and I haven't been able to find an answer that suits me.
The function is quite long and does a lot of things, but I think I've managed to trim it down to a reproducible example that gives the error I am getting.
The thing is, I have a folder with two different types of files, that I want to load into R as elements of a list. They have more or less the same information, but they have different layout, different file extension, different almost everything, so the process to import them is different for each type.
The function goes like this:
f <- function(a_file, b_file, type){
if (type == "a"){act <- read.table(a_file, skip = 19, header = TRUE, sep = "", dec = ".")}
if (type == "b"){act <- read.table(b_file, header=FALSE, sep="\t", dec=".")}
return(act)}
Then I create vectors with the names of the two types of files I want to call, like this:
a_files <- dir(pattern=".deg")
b_files <- dir(pattern=".act")
And finally try to apply the function like this:
act_list <- lapply(a_files, f, b_files, type = "b")
which works if type = "a", but fails for type = "b", giving the error:
Error in file(file, "rt") : invalid 'description' argument
which I am pretty sure has to do with the fact that I am applying the function to only the "a_files" vector and not to "b_files", but as much as I try I can't figure out a way to fix it...
There is a much simpler way to solve your problem.
First, define the function to have only one argument for the filename. Then decide inside it what type of file it is and do the right type of read.
readFiles = function(file){
  # Pick the reader from the file extension; "file" is the only argument.
  if(grepl(file,pattern = "\\.deg")){
    act <- read.table(file, skip = 19, header = TRUE, sep = "", dec = ".")
  }
  if(grepl(file,pattern = "\\.act")){
    act <- read.table(file, header=FALSE, sep="\t", dec=".")
  }
  return(act)
}
Finally, you only have to call lapply on your files vector:
filesVector = dir(pattern = "(\\.act|\\.deg)")
result = lapply(filesVector, readFiles)
filesVector will contain the names of all files that contain ".act" or ".deg".
NOTE: Your pattern is not correct, because an unescaped "." in a regular expression matches any character, so it will return files whose names contain any character followed by "act" or by "deg".
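A small sketch of the pattern issue from the note above. The file names are made up for illustration, and the `$` anchor is an extra assumption (it is not in the answer's code) that restricts the match to the end of the name.

```r
# Hypothetical file names used only for illustration.
files <- c("sample1.deg", "sample2.act", "xdeg_notes.txt", "actors.csv")

# "." matches any character, so this also hits "xdeg_notes.txt".
loose <- grepl(".deg", files)

# Escaped dot, anchored at the end of the name: only true ".deg" files.
strict <- grepl("\\.deg$", files)

# Select both extensions with one anchored pattern.
wanted <- files[grepl("\\.(act|deg)$", files)]
```

The same anchored pattern can be passed to dir(pattern = ...) when listing a real folder.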
I created a function clean_to_CSV(df) that takes in a data frame, cleans it, spits it back out, and also writes it to a CSV, with the CSV's filename using the inputted name of the dataset:
clean_to_CSV <- function(df) {
# df <- # code that cleans the df (runs with no errors)
write.csv(df, file = paste0(deparse(substitute(df)), "_clean.csv"), row.names = FALSE)
df
}
However, this returns:
Error in file(file, ifelse(append, "a", "w")) :
invalid 'description' argument
In addition: Warning messages:
3: In if (file == "") file <- stdout() else if (is.character(file)) { :
the condition has length > 1 and only the first element will be used
This is very puzzling because taking the exact same write.csv... line and running it outside the function works perfectly. Also, I know that writing to CSV and returning the df don't interfere with each other. Finally, I did look at related SO posts, but they were either more complex questions or didn't have a solid answer. None were such a simple case where one line of code works outside a function but not inside it.
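After you apply the cleaning code (the part you commented out), df has been reassigned inside the function, so substitute(df) no longer gives the name it had as an argument; it gives the cleaned data frame itself, and deparse() then returns a character vector with more than one element.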
To fix it, just capture the name of the object as soon as you enter the function, and then reference that value later. Here's an example.
clean_to_CSV <- function(df) {
obj_name = deparse(substitute(df))
# df <- # code that cleans the df (runs with no errors)
write.csv(df, file = paste0(obj_name, "_clean.csv"), row.names = FALSE)
df
}
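A minimal sketch of the difference between capturing the name early and late. The function name clean_to_csv_demo, the complete.cases() cleaning step and the tempdir() output location are placeholders for illustration, not the original cleaning code.

```r
clean_to_csv_demo <- function(df, out_dir = tempdir()) {
  # Capture the argument's expression before df is reassigned.
  obj_name <- deparse(substitute(df))

  # Placeholder cleaning step (assumption): drop rows with missing values.
  df <- df[complete.cases(df), , drop = FALSE]

  # After the reassignment, substitute(df) yields the data, not the name.
  late_name <- deparse(substitute(df))

  out_file <- file.path(out_dir, paste0(obj_name, "_clean.csv"))
  write.csv(df, file = out_file, row.names = FALSE)
  list(obj_name = obj_name, late_name = late_name, file = out_file)
}

my_data <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"))
res <- clean_to_csv_demo(my_data)
res$obj_name   # "my_data"
```

Here res$late_name holds the deparsed data frame rather than "my_data", which is what makes the file name invalid in the original version.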
This question is almost the same as a previous question, but differs enough that the answers for that question don't work here. Like @chase in the last question, I want to write out multiple files, one for each split of a dataframe, in the following format (custom fasta).
#same df as last question
df <- data.frame(
var1 = sample(1:10, 6, replace = TRUE)
, var2 = sample(LETTERS[1:2], 6, replace = TRUE)
, theday = c(1,1,2,2,3,3)
)
#how I want the data to look
write(paste(">", df$var1,"_", df$var2, "\n", df$theday, sep=""), file="test.txt")
#whole df output looks like this:
#test.txt
>1_A
1
>8_A
1
>4_A
2
>9_A
2
>2_A
3
>1_A
3
However, instead of getting the output from the entire dataframe I want to generate individual files for each subset of data. Using d_ply as follows:
d_ply(df, .(theday), function(x) write(paste(">", df$var1,"_", df$var2, "\n", df$theday, sep=""), file=paste(x$theday,".fasta",sep="")))
I get the following output error:
Error in file(file, ifelse(append, "a", "w")) :
invalid 'description' argument
In addition: Warning messages:
1: In if (file == "") file <- stdout() else if (substring(file, 1L, :
the condition has length > 1 and only the first element will be used
2: In if (substring(file, 1L, 1L) == "|") { :
the condition has length > 1 and only the first element will be used
Any suggestions on how to get around this?
Thanks,
zachcp
There were two problems with your code.
First, in constructing the file name, you passed the vector x$theday to paste(). Since x$theday is taken from a column of a data.frame, it often has more than one element. The error you saw was write() complaining when you passed several file names to its file= argument. Using unique(x$theday) instead ensures that you only ever paste together a single file name, because every row in a given subset has the same value of theday.
Second, you didn't get far enough to see it, but you probably want to write the contents of x (the current subset of the data.frame), rather than the entire contents of df to each file.
Here is the corrected code, which appears to work just fine.
d_ply(df, .(theday),
function(x) {write(paste(">", x$var1,"_", x$var2, "\n", x$theday, sep=""),
file=paste(unique(x$theday),".fasta",sep=""))
})
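For comparison, here is a base R sketch of the same idea without plyr, using split() so that each group's key is already a single string. The fixed toy values and the tempdir() output location are assumptions made so the result is reproducible.

```r
# Toy data with fixed values (the question used sample(), which is random).
df <- data.frame(
  var1 = c(1, 8, 4, 9, 2, 1),
  var2 = c("A", "A", "B", "A", "B", "A"),
  theday = c(1, 1, 2, 2, 3, 3)
)

out_dir <- tempdir()

# split() returns a named list; the names are the distinct values of theday.
pieces <- split(df, df$theday)

# Write one file per group, using the group name as a single file name.
out_files <- vapply(names(pieces), function(day) {
  x <- pieces[[day]]
  out_file <- file.path(out_dir, paste0(day, ".fasta"))
  write(paste0(">", x$var1, "_", x$var2, "\n", x$theday), file = out_file)
  out_file
}, character(1))
```

Because the file name comes from names(pieces), there is no need for unique() here.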