I was able to write some R code that batch-converts all my .txt files to .csv files:
setwd("~/Desktop/test/")
filelist = list.files(pattern = ".txt")
for (i in 1:length(filelist)){
input<-filelist[i]
output<-paste0(input, ".csv")
print(paste("Processing the file:", input))
data = read.delim(input, header = TRUE)
setwd("~/Desktop/test/done/")
write.table(data, file=output, sep=",", col.names=TRUE, row.names=FALSE)
setwd("~/Desktop/test/")
}
It works great, but the output file still keeps the .txt extension in its name.
Example: the original file is "sample1.txt" and the new file is named "sample1.txt.csv". The file works as a .csv, but the extra ".txt" in the file name annoys me. Does anyone know how to remove it from the name?
Thanks
You could delete the unwanted .txt:
output <- paste0(gsub("\\.txt$", "", input), ".csv")
The backslash marks the dot as a literal dot (an unescaped dot matches any character in a regular expression). The backslash has to be doubled because a backslash is itself an escape character inside an R string, so "\\." is what passes a single backslash to the regex engine. The dollar sign anchors the match to the end of the string, so only a ".txt" at the end of the filename gets removed.
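To make the difference between the escaped, anchored pattern and a plain ".txt" concrete, here is a small sketch. The file names are hypothetical and were picked only because the two patterns disagree on them; tools::file_path_sans_ext() is shown as a base R alternative.

```r
# Hypothetical file names, chosen to show where the patterns differ.
files <- c("sample1.txt", "notes.txt.bak.txt", "mytxtfile.txt")

# Unescaped dot, no anchor: removes the first match of "any character + txt".
loose <- sub(".txt", "", files)

# Escaped dot plus "$": removes only a literal ".txt" at the end of the name.
stems <- sub("\\.txt$", "", files)
csv_names <- paste0(stems, ".csv")

# Base R alternative that strips whatever the final extension is.
csv_names2 <- paste0(tools::file_path_sans_ext(files), ".csv")

print(loose)
print(csv_names)
```

With the loose pattern, "mytxtfile.txt" loses the "ytxt" in the middle of its name rather than its extension, which is why the escaped and anchored form is the safer choice.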
write.table(data,file=paste0("~/Desktop/test/done/",sub("\\.txt$","",filelist[i]),".csv"),row.names=FALSE,col.names=TRUE,quote=FALSE,sep=",")
Alternative help:
setwd("~/Users/Rio/Documents/Data/")
FILES <- list.files(pattern = "\\.txt$")
for (i in seq_along(FILES)) {
# read each tab-delimited file, then write it back out as comma-separated
FILE=read.table(file=FILES[i],header=TRUE,sep="\t")
write.table(FILE,file=paste0("~/Users/Rio/Documents/Data/",sub("\\.txt$","",FILES[i]),".csv"),row.names=FALSE,quote=FALSE,sep=",")
}
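The loops above depend on setwd() and on pasting folder names by hand. A possible variant, sketched here under the assumption of two hypothetical folders created under tempdir() (txt_in and csv_out are made-up names), uses full.names = TRUE and file.path() so the working directory never changes:

```r
# Hypothetical folders under tempdir() so the sketch is safe to run anywhere.
in_dir  <- file.path(tempdir(), "txt_in")
out_dir <- file.path(tempdir(), "csv_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)

# Two small tab-delimited sample files to convert.
write.table(data.frame(id = 1:2, value = c("a", "b")),
            file.path(in_dir, "sample1.txt"), sep = "\t",
            row.names = FALSE, quote = FALSE)
write.table(data.frame(id = 3:4, value = c("c", "d")),
            file.path(in_dir, "sample2.txt"), sep = "\t",
            row.names = FALSE, quote = FALSE)

# full.names = TRUE returns complete paths, so no setwd() is needed.
txt_files <- list.files(in_dir, pattern = "\\.txt$", full.names = TRUE)

for (f in txt_files) {
  dat <- read.delim(f, header = TRUE)
  # basename() drops the folder; sub() swaps the extension.
  out <- file.path(out_dir, sub("\\.txt$", ".csv", basename(f)))
  write.csv(dat, out, row.names = FALSE)
}

print(list.files(out_dir))
```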
I have multiple similarly named but different folders, each containing similarly named but different csv files.
For example, I have three folders named "output", each containing "image.csv" and "cells.csv".
How do I loop through each "output" folder, read each CSV file in it, and apply a function to these files?
Here's what I tried :
Firstly, I list the folders named "output":
dirs<-list.dirs()
dirs<-dirs[grepl("output",dirs)]
Then I want to set up a function to join both CSV files, something like the one below (the code is incomplete, please help me correct it):
object_extraction<-function(x){ image<-read.csv(image.csv, header=T, sep=",")
cells<-read.csv(cells.csv, header=T, sep=",")
object<-dplyr::inner_join(cells,image,by="ImageNumber")
return(object)}
Finally I want to loop the function above through the "output" folders
object<-list()
for(i in 1:length(dirs)){
object[[i]]<-object_extraction(dirs[i])
Thank you
Make the path passed to read.csv dynamic in your function:
object_extraction<-function(x){
image<-read.csv(paste0(x, '/image.csv'), header=T, sep=",")
#header = T and sep = ',' is default in read.csv so this should
#work without specifying them as well.
cells<-read.csv(paste0(x, '/cells.csv'))
object<-dplyr::inner_join(cells,image,by="ImageNumber")
return(object)
}
and then apply the function to each folder.
dirs <- list.dirs(recursive=FALSE)
dirs <- grep('output', dirs, value = TRUE)
result <- lapply(dirs, object_extraction)
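For a version that can be run end to end, here is a sketch under stated assumptions: the folder layout under tempdir(), the folder names run1_output and run2_output, and the column values are all made up for illustration. It also swaps dplyr::inner_join for base R's merge(), which keeps only matching rows by default, so that no extra package is required.

```r
# Hypothetical layout under tempdir(): two "output" folders, each holding
# image.csv and cells.csv with a shared ImageNumber column.
root <- file.path(tempdir(), "join_demo")
for (d in file.path(root, c("run1_output", "run2_output"))) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(ImageNumber = 1:2, plate = c("A", "B")),
            file.path(d, "image.csv"), row.names = FALSE)
  write.csv(data.frame(ImageNumber = c(1, 1, 2), area = c(10, 12, 9)),
            file.path(d, "cells.csv"), row.names = FALSE)
}

object_extraction <- function(x) {
  image <- read.csv(file.path(x, "image.csv"))
  cells <- read.csv(file.path(x, "cells.csv"))
  # merge() keeps only matching rows by default, like an inner join.
  merge(cells, image, by = "ImageNumber")
}

dirs <- list.dirs(root, recursive = FALSE)
# Match on the folder name only, not on the rest of the path.
dirs <- dirs[grepl("output", basename(dirs))]

result <- lapply(dirs, object_extraction)
names(result) <- basename(dirs)  # label each result by its folder
```

Naming the list elements by folder makes it easier to tell afterwards which joined table came from which "output" folder.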
Two errors I can spot in your code:
You need to use the directory name from the dirs variable, e.g.:
object_extraction<-function(x){
image<-read.csv(file.path(x, "image.csv"), header=T, sep=",")
cells<-read.csv(file.path(x, "cells.csv"), header=T, sep=",")
object<-dplyr::inner_join(cells,image,by="ImageNumber")
return(object)
}
And the file names should be quoted strings, "image.csv" and "cells.csv".
HTH
I have some data that I would like to write to a temporary CSV file in R.
Users have the option to specify a filename of their choice, which is stored in an environment (called 'envr') separate from .GlobalEnv
if (!is.null(envr$filename)) {
write.csv(df, file = paste(envr$filename, ".csv", sep = ""))
}
In order to do this successfully, I need to create a temporary file that is assigned to the filename chosen by the user.
if (!is.null(envr$filename)) {
file.name <- get("filename", envir = envr)
tempfile(fileext = ".csv")
write.csv(df, file = file.name)
}
The above if statement however does not do the job, as a CSV file is not saved in $TMPDIR.
How can I easily integrate tempfile() into the first if statement above without having to assign it to a variable name (file.name)?
You can concatenate the session's temporary folder (from tempdir()), the file name (the filename variable stored in the envr environment), and the .csv extension, as follows:
if (!is.null(envr$filename)) {
write.csv(df, file = paste0(tempdir(), "/", get("filename", envir = envr), ".csv"))
}
Let me know if it answers your question or if you need any further help.
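A self-contained sketch of the same idea follows. The contents of envr and df are hypothetical stand-ins for the objects in the question. It also shows why calling tempfile() on its own did not help: it only returns a path and never creates a file.

```r
# Stand-ins for the objects in the question (hypothetical values).
envr <- new.env()
envr$filename <- "my_results"
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

if (!is.null(envr$filename)) {
  # file.path() joins the session's temporary folder and the chosen name.
  out <- file.path(tempdir(), paste0(envr$filename, ".csv"))
  write.csv(df, file = out, row.names = FALSE)
}

print(file.exists(out))

# tempfile() only returns a path; it does not create or write a file.
# It also appends random characters, so the user's name is not kept exactly.
tmp <- tempfile(pattern = "my_results_", fileext = ".csv")
print(file.exists(tmp))
```

If the exact user-chosen name matters, building the path from tempdir() is the simpler route. tempfile() is a better fit when any unique name will do.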
I am trying to use a for loop to remove the initial part of each text and then export the revised text using write.table, but I noticed that write.table generates a set of new files in my folder instead of replacing the original ones. Can anyone show me how to overwrite the existing files?
for(i in 1:length(file.names)){
text.v <- scan(file.names[i], what="character", encoding = "UTF-8")
novel.v <- paste(text.v, collapse=" " )
word.v <- gsub(".*</Header> ","", novel.v)
write.table(paste(word.v,collapse = " "), paste(file.names[i],".txt",sep=""), row.names=FALSE, col.names=FALSE, quote=FALSE)
}
It seems to me that you are writing your files with two extensions: you read them as filenameWithExtension and then write them as filenameWithExtension.txt. If that's the case, the solution is just to change paste(file.names[i],".txt",sep="") to file.names[i], so that write.table overwrites the file it read.
In case I'm wrong, please show us an example of the contents of file.names.
Inside the loop, you can remove the existing file i with
file.remove(paste0(file.names[i],".txt"))
and then run your code:
write.table(paste(word.v,collapse = " "), paste(file.names[i],".txt",sep=""), row.names=FALSE, col.names=FALSE, quote=FALSE)
Check that the object paste(word.v,collapse = " ") correctly replaces your original file. I often write this kind of loop, and I have to check the structure of the newly written file several times (quotes, NAs, delimiters, and so on).
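As a runnable illustration of overwriting in place, here is a minimal sketch. It assumes a hypothetical file novel1.txt created under tempdir(), with made-up contents. Writing to the same path that was read replaces the file, so no second file appears.

```r
# Hypothetical sample file under tempdir(), with a header to strip.
f <- file.path(tempdir(), "novel1.txt")
writeLines(c("<Header>title</Header> First line", "second line"), f)

text.v  <- scan(f, what = "character", encoding = "UTF-8", quiet = TRUE)
novel.v <- paste(text.v, collapse = " ")
word.v  <- sub(".*</Header> ", "", novel.v)

# Writing to the path that was read replaces the file's contents.
write.table(word.v, f, row.names = FALSE, col.names = FALSE, quote = FALSE)

print(readLines(f))
```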
I have some .vcf files. I have selected those files from my directory and want to convert them to two other formats.
I am a bit confused about using if and else if here. I want it to work like this: if there is no .bgz file for the i-th .vcf file, convert the .vcf file to a .bgz file while keeping the original.
If there is already a .bgz file but no .bgz.tbi file for the i-th .bgz file, then create the .bgz.tbi file from the .bgz file, keeping the .bgz file that I got from the .vcf file.
Can someone please help me finish this loop? It works for the if condition, but I don't know how to proceed from there.
path.file<-"/mypath/for/files/"
all.files <- list.files("/mypath/for/files")
all.files <- all.files[grepl(".vcf$",all.files)]
for (i in 1:length(all.files)){
if(!exists(paste0(all.files[i],".bgz"))){
bgzip(paste0(path.file,all.files[i]), overwrite=FALSE)
}else{(!exists(paste0(all.files[i],".bgz",".tbi"))){
#if(!exists(paste0(all.files[i],".bgz",".tbi"))){
indexTabix(paste0(paste0(path.file,all.files[i]),".bgz"), format="vcf")
}
}
Try this (not tested):
library(Rsamtools) # assumed source of bgzip() and indexTabix()
#get VCF files with path
all.files <- list.files("/mypath/for/files", pattern = "\\.vcf$",
full.names = TRUE)
for (i in all.files) {
#make output names, so we don't mess about with paste
file_bgz <- paste0(i, ".bgz")
file_bgz_tbi <- paste0(i, ".bgz.tbi")
#if bgz exists don't zip else zip
#file.exists() checks the disk; exists() only checks for R objects
if(!file.exists(file_bgz))
bgzip(i, file_bgz)
#if tbi exists don't index else tabix
if(!file.exists(file_bgz_tbi))
indexTabix(file_bgz, format = "vcf")
}
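One detail that is easy to miss in loops like this is that exists() and file.exists() answer different questions. The sketch below uses a hypothetical file under tempdir(), and file.copy() stands in for the real conversion step so that it runs without any extra packages.

```r
# Hypothetical file under tempdir() standing in for a .vcf file.
vcf <- file.path(tempdir(), "exists_demo.vcf")
writeLines("##fileformat=VCFv4.2", vcf)

# exists() asks whether an R object with this name is defined.
print(exists(vcf))

# file.exists() asks whether a file is present at this path.
print(file.exists(vcf))

# Skip-if-present pattern; file.copy() is only a stand-in for the
# real conversion step.
out <- paste0(vcf, ".copy")
if (!file.exists(out)) file.copy(vcf, out)
print(file.exists(out))
```

Because exists() looks for an R object with that name, a test like !exists(path) is almost always TRUE for a file path, so the conversion would be attempted on every run.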
I am writing output to a file, but the data is not being appended. Only the last row is kept each time. The code is as follows:
op <- function(crime) {
filename <- paste(crime,".txt")
fileconn <- file(filename)
cat(nthecrime, file=fileconn, sep=" ",append=TRUE)
#write(nthecrime,file=fileconn, ncolumns=9, append=TRUE,sep="\t")
close(fileconn)
}
Both cat & write create a new file each time I call the above lines instead of appending. What am I missing?
Regards
Ganesh
From the ?cat help:
append logical. Only used if the argument file is the name of file
(and not a connection or "|cmd"). If TRUE output will be appended to
file; otherwise, it will overwrite the contents of file.
You should pass filename, not fileconn, as the file argument. Try
cat(nthecrime, file=filename, sep=" ",append=TRUE)
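A short sketch of both options, assuming a hypothetical file crime_log.txt under tempdir(): passing the file name so that append = TRUE takes effect, or keeping a connection but opening it in append mode ("a").

```r
# Hypothetical output file under tempdir(); start from a clean slate.
filename <- file.path(tempdir(), "crime_log.txt")
if (file.exists(filename)) file.remove(filename)

# Passing the file name lets append = TRUE take effect.
cat("row one\n", file = filename, append = TRUE)
cat("row two\n", file = filename, append = TRUE)

# To keep using a connection, open it in append mode ("a") instead.
con <- file(filename, open = "a")
cat("row three\n", file = con)
close(con)

print(readLines(filename))
```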