Saving Arabic data as a CSV file in R

I have a data frame that contains Arabic text. I want to save it as a CSV (or .txt) file, but the Arabic characters are not encoded correctly when I write it out.
I read my data like this:
cname=readLines('C:/Users/Ahmed/Desktop/Bureau/arabic data R/cnn-arabic-utf8/cnn-arabic-utf8/spt/sportcnnAr08sport (2).html.txt',encoding='UTF-8')
I have tried saving it in several ways:
con<-file('C:/Users/ahmed/Desktop/test.csv',encoding="utf8")
write.csv(clust.df ,file=con)
save(clust.df , file = "C:/Users/ahmed/Desktop/clust.txt")
write.csv(clust.df, file = "C:/Users/ahmed/Desktop/clust.txt",fileEncoding='UTF-8')
The output is always:
"<U+0623><U+062D><U+0627><U+0644><U+062A>",1
Thank you in advance.

Try this:
testfile <- "C:/Users/ahmed/Desktop/test.csv"
log <- function(msg="") {
  con <- file(testfile, "a")
  tryCatch({
    cat(iconv(msg, to="UTF-8"), file=con, sep="\n")
  },
  finally = {
    close(con)
  })
}

I am not 100% sure, but I am 99% sure :) that a CSV or txt file does not store its character encoding.
So I suggest trying an Excel file (just to test whether Excel shows the data correctly).
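As far as I know, the <U+0623> escapes usually show up on Windows when R converts the text through a native encoding that cannot represent Arabic. One possible workaround, sketched below, is to build the CSV lines yourself and write the UTF-8 bytes untouched with writeLines(..., useBytes = TRUE). The data frame, its column names (word, cluster) and the temp-file path are made up for illustration; they are not the asker's real data.

```r
# Hypothetical stand-in for clust.df: one text column, one numeric column
clust.df <- data.frame(
  word    = c("\u0623\u062d\u0627\u0644\u062a", "\u0643\u0631\u0629"),
  cluster = c(1L, 2L),
  stringsAsFactors = FALSE
)

out_path <- tempfile(fileext = ".csv")  # assumed output location

# Build the CSV lines by hand and make sure the text is UTF-8
csv_lines <- c(
  "word,cluster",
  paste0('"', enc2utf8(clust.df$word), '",', clust.df$cluster)
)

# useBytes = TRUE writes the bytes as they are, with no re-encoding
con <- file(out_path, open = "w")
writeLines(csv_lines, con, useBytes = TRUE)
close(con)

# Read the file back, declaring it as UTF-8
back <- read.csv(out_path, encoding = "UTF-8", stringsAsFactors = FALSE)
```

If the round trip works, back$word holds the same Arabic strings as clust.df$word. Note that some spreadsheet programs only detect UTF-8 when you import the file with that encoding selected.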


How can I use error handling when reading xlsx files in R?

I am trying to read 23 files, store each in a list, and then rbind them into one data set. Some of these files are csv and some of them are xlsx. However, I got the following message:
Error: Can't establish that the input is either xls or xlsx.
So I want to identify which ones are giving error and then append it manually.
My function is the following:
make_df<-function(filename){
  library(readxl)
  library(foreign)
  if (str_sub(filename,-3,-1) == "csv"){
    df<-read.csv(filename,fileEncoding="latin1")
  } else {
    df<-read_excel(filename)
  }
  return(df)
}
filenames_vector<-list.files(# directory)
datalist = list()
for (i in 1:23){
  datalist[[i]] <- make_df(filenames_vector[i])
}
mega_data = do.call(rbind,datalist)
How can I add something to make_df to print out the names of the files that are causing the error message? Also, is there another workaround for when the error message is about not being able to distinguish xlsx from xls?
This can be done with a tryCatch block. Without example data it's a little hard to recreate. I'm not sure what you mean in your second question.
Try the code below to catch errors and print out the filename if there's an error, otherwise return the df object.
make_df <- function(filename){
  library(readxl)
  library(foreign)
  library(stringr)  # provides str_sub()
  df <- tryCatch(
    { # try block
      if (str_sub(filename, -3, -1) == "csv"){
        read.csv(filename, fileEncoding="latin1")
      } else {
        read_excel(filename)
      }
    },
    error = function(cond){ return(filename) } # grab the filename if there was an error
  )
  # is.character() is safer than class(df) == 'character': a tibble has several classes
  if (is.character(df)) {
    print(df)
  } else {
    return(df)
  }
}
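To collect the failing file names in one place instead of only printing them, something along these lines might work. It is a minimal sketch using base R only: read_one is a hypothetical helper, and the three temp files (two valid CSVs and one empty file that makes read.csv fail) stand in for the real directory.

```r
# Hypothetical test files: two readable CSVs and one empty (unreadable) file
good1 <- tempfile(fileext = ".csv")
good2 <- tempfile(fileext = ".csv")
bad   <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:2, b = c("x", "y")), good1, row.names = FALSE)
write.csv(data.frame(a = 3:4, b = c("z", "w")), good2, row.names = FALSE)
file.create(bad)  # empty file: read.csv() stops with "no lines available in input"

# Return the data frame, or NULL (with a message) if reading fails
read_one <- function(filename) {
  tryCatch(
    read.csv(filename),
    error = function(cond) {
      message("Could not read: ", filename, " (", conditionMessage(cond), ")")
      NULL
    }
  )
}

files   <- c(good1, bad, good2)
results <- lapply(files, read_one)

# Names of the files that failed, and the combined data from the rest
failed    <- files[vapply(results, is.null, logical(1))]
mega_data <- do.call(rbind, Filter(Negate(is.null), results))
```

Returning NULL for a failed file keeps do.call(rbind, ...) from choking on a stray file name, and failed can then be inspected and fixed by hand.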

Save a text file with the date and time in its name in R

I want to save my program's logging to a text file using R. I was able to save all of the logging to a text file. However, the challenge is that the file name should contain the date and time. For example:
file1 <- function(x){
  flog.info("hi", name = 'trail')
  summary = summary(x)
  mean = mean(x, na.rm=T)
  output = list(summary, mean)
  return(output)
}
I call this function with:
files = file1(airquality)
Since I need to add the date and time, I store the current time:
Curr_date = Sys.time()
An appender is then used to save the logging (the flog.info call mentioned above) to a file:
flog.appender(appender.file(sprintf(paste0(Curr_date,'.log'))),
name='trail.io')
As you can see, I was trying to use paste0 to build a file name containing the date and time, but nothing works.
filename = paste(gsub(":", "-", Sys.time()),"_file.txt",sep="")
# [1] "2016-12-29 00-49-08_file.txt"
# to write the content to a .txt file with the above filename
write.table("your content", file = paste0("D:/", filename))
Did I understand the problem correctly?
x = as.character(as.POSIXct(Sys.time()))
filename = paste(x,"_file.csv",sep="")
filename = gsub(":","-",filename)
filename = gsub(" ","_",filename)
I was able to save the file with the date and time. The appender function, flog.appender(), should be called first, and then flog.info() can be used inside every function.
result <- function(x1, y){
  require(futile.logger) # package name
  x = Sys.time()
  # save the log file with date and time; for futile.logger see R-bloggers
  flog.appender(appender.file(paste(x1, y, format(x, "%y-%m-%d %I %p"), ".log", sep = "")))
}
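The file-name part can also be done in one step with format(), which avoids the gsub() calls because the colons never appear in the first place. This is only a sketch: the "trail_" prefix and the temp directory are assumptions, and plain cat() stands in for the logger.

```r
# Timestamp without colons or spaces, so it is safe in file names on Windows too
stamp <- format(Sys.time(), "%Y-%m-%d_%H-%M-%S")

# Hypothetical log location and prefix
log_file <- file.path(tempdir(), paste0("trail_", stamp, ".log"))

# Stand-in for the logger: append one line to the timestamped file
cat("hi\n", file = log_file, append = TRUE)
```

With futile.logger, the same log_file string could be passed to appender.file() before the first flog.info() call.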

Conditionally process (bgzip, tabix) files using loop and if else statement

I have some .vcf files. I have selected those files from my directory and want to convert them to two other formats.
I am a bit confused about using if and else if here. I want to do it like this: if there isn't a .bgz file for the [i]th .vcf file, I want to convert it to a .bgz file while keeping the original file.
If there is already a .bgz file, but no .bgz.tbi file for the [i]th .bgz file, then I want to create the .bgz.tbi file while keeping the .bgz file that I got from the .vcf file.
Can someone please help me finish this loop? It works for the if condition, but I don't know how to proceed from there.
path.file<-"/mypath/for/files/"
all.files <- list.files("/mypath/for/files")
all.files <- all.files[grepl(".vcf$",all.files)]
for (i in 1:length(all.files)){
if(!exists(paste0(all.files[i],".bgz"))){
bgzip(paste0(path.file,all.files[i]), overwrite=FALSE)
}else{(!exists(paste0(all.files[i],".bgz",".tbi"))){
#if(!exists(paste0(all.files[i],".bgz",".tbi"))){
indexTabix(paste0(paste0(path.file,all.files[i]),".bgz"), format="vcf")
}
}
Try this (not tested):
library(Rsamtools)  # provides bgzip() and indexTabix()
# get VCF files with full paths
all.files <- list.files("/mypath/for/files", pattern = "\\.vcf$",
                        full.names = TRUE)
for (i in all.files) {
  # make output names once, so we don't mess about with paste
  file_bgz <- paste0(i, ".bgz")
  file_bgz_tbi <- paste0(i, ".bgz.tbi")
  # note: file.exists() checks the disk; exists() only looks for an R object
  # if the .bgz file is missing, compress the .vcf (the original is kept)
  if (!file.exists(file_bgz))
    bgzip(i, file_bgz)
  # if the .tbi index is missing, build it from the .bgz file
  if (!file.exists(file_bgz_tbi))
    indexTabix(file_bgz, format = "vcf")
}
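One thing worth checking in a loop like this is the difference between exists() and file.exists(): the first looks for an R object with that name, the second looks on disk. A small sketch with temp files makes this visible. Here file.create() is only a stand-in for bgzip(), so Rsamtools is not needed to run it.

```r
# Hypothetical .vcf file in a temp directory
vcf <- tempfile(fileext = ".vcf")
file.create(vcf)
file_bgz <- paste0(vcf, ".bgz")

# Before "compressing": neither an R object nor a file has this name
object_before <- exists(file_bgz)
file_before   <- file.exists(file_bgz)

# Stand-in for bgzip(): just create the output file
file.create(file_bgz)

# exists() still says FALSE, because it never looks at the file system
object_after <- exists(file_bgz)
file_after   <- file.exists(file_bgz)
```

So a condition written with !exists(...) is always TRUE for file paths, and the conversion would be attempted on every pass.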

Data not appending with R when writing

I am writing output to a file, but the data is not being appended. Only the last row ends up in the file each time. The code is as follows:
op <- function(crime) {
filename <- paste(crime,".txt")
fileconn <- file(filename)
cat(nthecrime, file=fileconn, sep=" ",append=TRUE)
#write(nthecrime,file=fileconn, ncolumns=9, append=TRUE,sep="\t")
close(fileconn)
}
Both cat & write create a new file each time I call the above lines instead of appending. What am I missing?
Regards
Ganesh
From the ?cat help:
append logical. Only used if the argument file is the name of file
(and not a connection or "|cmd"). If TRUE output will be appended to
file; otherwise, it will overwrite the contents of file.
You should use filename, not fileconn. Try
cat(nthecrime, file=filename, sep=" ",append=TRUE)
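Here is a small sketch of the two ways that append correctly: passing the file name with append=TRUE, or opening the connection yourself in "a" (append) mode. The temp file and the row text are placeholders.

```r
out <- tempfile(fileext = ".txt")  # placeholder output file

# Option 1: give cat() the file NAME, so append = TRUE is honoured
cat("row 1\n", file = out, append = TRUE)
cat("row 2\n", file = out, append = TRUE)

# Option 2: open the connection in append mode; cat() then ignores 'append'
con <- file(out, open = "a")
cat("row 3\n", file = con)
close(con)

rows <- readLines(out)
```

In the original function, file(filename) was created without an open mode, so cat() opened it for writing on each call and the earlier contents were replaced.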

How to save a data frame as CSV to a user selected location using tcltk

I have a data frame called Fail.
I would like to save Fail as a CSV in a location that the user selects. Below is some example code that I found, but I don't know how to incorporate Fail into it.
require(tcltk)
fileName <- tclvalue(tkgetSaveFile())
if (!nchar(fileName)) {
tkmessageBox(message = "No file was selected!")
} else {
tkmessageBox(message = paste("The file selected was", fileName))
}
Take a look at the write.csv or the write.table functions. You just have to supply the file name the user selects to the file parameter, and the dataframe to the x parameter:
write.csv(x=df, file="myFileName")
You don't even need the "tcltk" package. You can simply do this:
write.csv(x, file = "c:\\myname\\yourfile.csv", row.names = FALSE)
Use your own path instead of "c:\\myname\\yourfile.csv".
Another option is file.choose():
write.csv([enter name of dataframe here],file = file.choose(new = T))
After running the line above, a file chooser window opens.
Type the new file name with its extension in the File name field and click Open. It will ask whether to create a new file; select Yes and the file will be created and saved in the desired location.
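Putting the tcltk dialog and write.csv together could look roughly like this. It is a sketch under assumptions: save_df is a hypothetical helper, the Fail data frame is invented sample data, and the dialog only runs in an interactive session.

```r
# Invented sample data standing in for the real Fail data frame
Fail <- data.frame(id = 1:3, reason = c("timeout", "crash", "timeout"))

# Hypothetical helper: write df to path, or report that nothing was chosen
save_df <- function(df, path) {
  if (!nzchar(path)) {
    message("No file was selected!")
    return(invisible(FALSE))
  }
  write.csv(df, file = path, row.names = FALSE)
  invisible(TRUE)
}

# The dialog needs a user, so only show it in an interactive session
if (interactive()) {
  library(tcltk)
  save_df(Fail, tclvalue(tkgetSaveFile()))
}
```

tkgetSaveFile() returns an empty string when the user cancels, which is why the helper checks nzchar(path) before writing.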
