I am trying to use a for loop to remove the initial part of each text and then export the revised text using write.table, but I noticed that write.table generates a set of new files in my folder instead of replacing the original ones. Can anyone show me how to overwrite the existing files?
for(i in 1:length(file.names)){
text.v <- scan(file.names[i], what="character", encoding = "UTF-8")
novel.v <- paste(text.v, collapse=" " )
word.v <- gsub(".*</Header> ","", novel.v)
write.table(paste(word.v,collapse = " "), paste(file.names[i],".txt",sep=""), row.names=FALSE, col.names=FALSE, quote=FALSE)
}
It seems to me that you are writing your files with two extensions: you read each file by its full name, extension included (filenameWithExtension), and then write it as filenameWithExtension.txt. If that's the case, the solution is just to change paste(file.names[i],".txt",sep="") to file.names[i].
In case I'm wrong, you should show us an example of the contents of file.names.
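To illustrate the suggestion above, here is a minimal sketch that writes each cleaned text back to the same path it was read from. The temporary directory, the file name novel1.txt and the sample text are made up for the demonstration; only the loop body follows the question.

```r
# Hypothetical demo file in a temporary directory (not from the original post).
demo.dir <- tempfile("overwrite_demo")
dir.create(demo.dir)
demo.file <- file.path(demo.dir, "novel1.txt")
writeLines("<Header>meta data</Header> Call me Ishmael .", demo.file)

file.names <- list.files(demo.dir, full.names = TRUE)
for (i in seq_along(file.names)) {
  text.v <- scan(file.names[i], what = "character", encoding = "UTF-8", quiet = TRUE)
  novel.v <- paste(text.v, collapse = " ")
  word.v <- gsub(".*</Header> ", "", novel.v)
  # Writing to the same path that was read replaces the original file.
  write.table(word.v, file.names[i],
              row.names = FALSE, col.names = FALSE, quote = FALSE)
}

result <- readLines(demo.file)
files.after <- list.files(demo.dir)
```

After the loop the folder should still contain a single file, now without the header part.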
Inside the loop you can remove the current file i with
file.remove(paste0(file.names[i],".txt"))
And after that run your code
write.table(paste(word.v,collapse = " "), paste(file.names[i],".txt",sep=""), row.names=FALSE, col.names=FALSE, quote=FALSE)
Check that the object paste(word.v,collapse = " ") correctly replaces your original file. I often write this kind of loop, and I have to check the structure of the newly written file several times (quotes, NAs, delimiters, and so on).
I want to save my program's logging to a text file using R. I was able to save the entire log to a text file. However, the challenge is that the text file name should include the date and time. For example:
file1<- function(x){
flog.info("hi",name = 'trail')
summary = summary(x)
mean = mean(x,na.rm=T)
outpurt = list(summary,mean)
return(outpurt)
}
Calling this function:
files = file1(airquality)
Since I need to add the date and time:
Curr_date = Sys.time()
The appender function is used in order to save the logging (the flog.info call mentioned above) to a file.
flog.appender(appender.file(sprintf(paste0(Curr_date,'.log'))),
name='trail.io')
As you can see, I was trying to use the paste0 function to get a text file name with the date and time, but nothing works.
filename = paste(gsub(":", "-", Sys.time()),"_file.txt",sep="")
# [1] "2016-12-29 00-49-08_file.txt"
# to write the content to a .txt file with the above filename
write.table("your content", file = paste0("D:/", filename))
Did I understand the problem correctly?
x = as.character(as.POSIXct(Sys.time()))
filename = paste(x,"_file.csv",sep="")
filename = gsub(":","-",filename)
filename = gsub(" ","_",filename)
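As a variation on the two answers above, format() can produce a file-name-safe timestamp in one step, so no gsub calls are needed afterwards. This is only a sketch: make_filename, its suffix argument and the fixed example time are hypothetical names and values chosen for illustration.

```r
# Build a file name from a time stamp without colons or spaces.
# make_filename and its arguments are illustrative, not from the original posts.
make_filename <- function(time = Sys.time(), suffix = "_file.txt") {
  paste0(format(time, "%Y-%m-%d_%H-%M-%S"), suffix)
}

# A fixed time makes the result reproducible.
fixed <- as.POSIXct("2016-12-29 00:49:08", tz = "UTC")
example.name <- make_filename(fixed)
# example.name is "2016-12-29_00-49-08_file.txt"

# In real use, call it without arguments to use the current time.
current.name <- make_filename()
```

Avoiding colons matters mainly on Windows, where they are not allowed in file names.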
I was able to save the file with the date and time. The appender function flog.appender() should be called first, and then flog.info() should be used inside every function.
result <- function(x1, y) {
require(futile.logger) # package name
x = Sys.time()
# Save the log to a file whose name includes the date and time.
# flog.appender() expects an appender function, so the file name is wrapped in appender.file().
# For futile.logger see r bloggers.
flog.appender(appender.file(paste(x1, y, format(x, "%y-%m-%d %I %p"), ".log", sep = "")))
}
I have been able to write R code to batch convert all my .txt files to .csv files using
setwd("~/Desktop/test/")
filelist = list.files(pattern = ".txt")
for (i in 1:length(filelist)){
input<-filelist[i]
output<-paste0(input, ".csv")
print(paste("Processing the file:", input))
data = read.delim(input, header = TRUE)
setwd("~/Desktop/test/done/")
write.table(data, file=output, sep=",", col.names=TRUE, row.names=FALSE)
setwd("~/Desktop/test/")
}
Works great, but the file still retains the .txt extension in its name.
Example: the original file is "sample1.txt" and the new file is named "sample1.txt.csv". The file works as a .csv, but the extra ".txt" in the file name annoys me. Does anyone know how to remove it from the name?
thanks
You could delete the unwanted .txt:
output <- paste0(gsub("\\.txt$", "", input), ".csv")
The backslash marks the dot as a literal dot (an unescaped dot matches any character in a regular expression). The backslash has to be doubled because a single backslash is itself an escape character in R string literals. The dollar sign represents the end of the string, so only ".txt" at the end of the filename gets removed.
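A small sketch of the difference the escaping and the anchor make. The file names are invented for the demonstration, and tools::file_path_sans_ext is shown only as an alternative from base R's tools package, not as something the original answer used.

```r
# Made-up file names that expose the difference between the two patterns.
input <- c("sample1.txt", "my.txt.notes.txt", "atxt_file.txt")

# Escaped dot plus end anchor: only a trailing ".txt" is removed.
anchored <- sub("\\.txt$", "", input)

# Unescaped, unanchored pattern: "." matches any character, and the
# first match anywhere in the name is removed.
loose <- sub(".txt", "", input)

# Alternative: strip whatever the last extension is.
sans.ext <- tools::file_path_sans_ext("sample1.txt")

output <- paste0(anchored, ".csv")
```

With the loose pattern, "atxt_file.txt" loses its first four characters instead of its extension, which is why the anchored form is the safer choice.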
write.table(data,file=paste0("~/Desktop/test/done/",sub("\\.txt$","",filelist[i]),".csv"),row.names=FALSE,col.names=TRUE,quote=FALSE,sep=",")
Alternative help:
setwd("~/Users/Rio/Documents/Data/")
FILES <- list.files( pattern = ".txt")
for (i in 1:length(FILES)) {
FILE=read.table(file=FILES[i],header=T,sep="\t")
write.table(FILE,file=paste0("~/Users/Rio/Documents/Data/",sub("\\.txt$","",FILES[i]),".csv"),row.names=F,quote=F,sep=",")
}
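The loops above can also be written without calling setwd at all, by building full paths with file.path. This is a sketch under assumptions: the temporary folders, the "done" subfolder and the sample table stand in for the real data.

```r
# Hypothetical input and output folders standing in for the real ones.
in.dir <- tempfile("txt_in")
out.dir <- file.path(in.dir, "done")
dir.create(out.dir, recursive = TRUE)

# One small tab-delimited sample file to convert.
write.table(data.frame(id = 1:2, name = c("a", "b")),
            file.path(in.dir, "sample1.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# "\\.txt$" only matches names that END in .txt.
txt.files <- list.files(in.dir, pattern = "\\.txt$")
for (input in txt.files) {
  data <- read.delim(file.path(in.dir, input), header = TRUE)
  output <- paste0(sub("\\.txt$", "", input), ".csv")
  write.csv(data, file.path(out.dir, output), row.names = FALSE)
}

converted <- list.files(out.dir)
check <- read.csv(file.path(out.dir, "sample1.csv"))
```

Iterating over the file names directly also avoids the 1:length(...) pitfall when the folder contains no matching files.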
I have a data frame which contains Arabic data. I want to save it as a .csv (or .txt) file, but when I try, I have a problem with the encoding of the Arabic data.
I read my data like this:
cname=readLines('C:/Users/Ahmed/Desktop/Bureau/arabic data R/cnn-arabic-utf8/cnn-arabic-utf8/spt/sportcnnAr08sport (2).html.txt',encoding='UTF-8')
I tried to save it in different ways:
con<-file('C:/Users/ahmed/Desktop/test.csv',encoding="utf8")
write.csv(clust.df ,file=con)
save(clust.df , file = "C:/Users/ahmed/Desktop/clust.txt")
write.csv(clust.df, file = "C:/Users/ahmed/Desktop/clust.txt",fileEncoding='UTF-8')
The output is always:
"<U+0623><U+062D><U+0627><U+0644><U+062A>",1
thank you in advance
Try this:
testfile <- "C:/Users/ahmed/Desktop/test.csv"
log <- function(msg="") {
con <- file(testfile, "a")
tryCatch({
cat(iconv(msg, to="UTF-8"), file=con, sep="\n")
},
finally = {
close(con)
})
}
I am not 100% sure, but I am 99% sure :) that a CSV or txt file does not store any information about its character encoding, so the program that opens it has to guess.
So I suggest trying with an Excel file (just to test whether Excel shows the correct data or not).
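One workaround that is often suggested for this kind of problem is to convert the text to UTF-8 explicitly and write the bytes unchanged. The sketch below assumes the escaped output comes from R re-encoding the text to the native encoding on write, which may not be the cause in every setup. The output path, the column names and the hand-built CSV lines are illustrative only.

```r
# Sample Arabic text written with \u escapes so the script itself stays ASCII.
arabic <- "\u0623\u062d\u0627\u0644\u062a"
clust.df <- data.frame(word = arabic, cluster = 1L, stringsAsFactors = FALSE)

out.file <- tempfile(fileext = ".csv")  # hypothetical output path

# Build the CSV lines by hand (simple two-column case only,
# no quoting of embedded commas).
csv.lines <- c(
  paste(names(clust.df), collapse = ","),
  paste(clust.df$word, clust.df$cluster, sep = ",")
)

# useBytes = TRUE asks writeLines to write the bytes as they are,
# without re-encoding them to the native encoding first.
writeLines(enc2utf8(csv.lines), out.file, useBytes = TRUE)

# Read the file back, declaring it as UTF-8.
back <- readLines(out.file, encoding = "UTF-8")
```

Whether the file then displays correctly still depends on the viewer being told, or guessing, that it is UTF-8.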
I am writing output to a file, but the data is not being appended: only the last row is kept each time. The code is as follows:
op <- function(crime) {
filename <- paste(crime,".txt")
fileconn <- file(filename)
cat(nthecrime, file=fileconn, sep=" ",append=TRUE)
#write(nthecrime,file=fileconn, ncolumns=9, append=TRUE,sep="\t")
close(fileconn)
}
Both cat & write create a new file each time I call the above lines instead of appending. What am I missing?
Regards
Ganesh
From the ?cat help:
append logical. Only used if the argument file is the name of file
(and not a connection or "|cmd"). If TRUE output will be appended to
file; otherwise, it will overwrite the contents of file.
You should use filename, not fileconn. Try
cat(nthecrime, file=filename, sep=" ",append=TRUE)
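A minimal sketch of both ways to append, assuming a temporary file and made-up rows: pass the file name so that append = TRUE is honoured, or open the connection yourself in append mode. append_row is a hypothetical helper, not part of the original code.

```r
crime.file <- tempfile(fileext = ".txt")  # hypothetical output path

# Passing the file NAME lets cat() honour append = TRUE.
append_row <- function(values, path) {
  cat(paste(values, collapse = " "), "\n", file = path, sep = "", append = TRUE)
}
append_row(c("theft", 12), crime.file)
append_row(c("arson", 3), crime.file)

# Alternative: open the connection explicitly in append mode ("a").
con <- file(crime.file, open = "a")
cat("fraud 7\n", file = con)
close(con)

rows <- readLines(crime.file)
```

In the question, file(filename) creates an unopened connection, which cat then opens for writing, so each call starts the file from scratch.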
I was trying to read all the files in a folder using R, but I always got an error like this:
>folder<-"/Volumes/cphg/projects/PROVIDE/freeze" #working directory
>filelist<-list.files(folder) #all files in the directory
>data<-vector("list", length(filelist)) #empty list
>names(data)<-filelist
>for (name in filelist) {
+ data[[name]]<-read.table(paste(folder, name, sep="/"), header=T)
+}
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
Does anybody know what's wrong here and how to fix it?
You can use tryCatch and return NULL if reading the file fails. Then you can Filter the results to exclude the NULLs
L <- setNames(lapply(filelist, function(x) {
# use the function argument x, not an outer variable, to build the path
tryCatch(read.table(file.path(folder, x)), error=function(e) NULL)
}), filelist)
data <- Filter(NROW, L)
Just to make it clear and to close the question properly:
The problem is that at least one file is empty. Check which file name it is on when it throws the error.
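To find the empty files up front and still read the rest, something like the following sketch could be used. The temporary folder and the two sample files are invented for the demonstration, and file.size requires R 3.2.0 or later.

```r
# Hypothetical folder with one valid and one empty file.
folder <- tempfile("read_demo")
dir.create(folder)
writeLines(c("a b", "1 2"), file.path(folder, "good.txt"))
file.create(file.path(folder, "empty.txt"))

filelist <- list.files(folder)

# Files of size zero are the ones that trigger "no lines available in input".
empty.files <- filelist[file.size(file.path(folder, filelist)) == 0]

# Read everything, turning a failed read into NULL, then drop the NULLs.
L <- setNames(lapply(filelist, function(x) {
  tryCatch(read.table(file.path(folder, x), header = TRUE),
           error = function(e) NULL)
}), filelist)
data <- Filter(Negate(is.null), L)
```

Here empty.files names the culprit, and data keeps only the tables that could be read.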