I have a tab-delimited .csv file. While running the code
data <- read.table("xxx.csv",sep = "\t", dec=".", header = TRUE,
encoding="UTF-8", stringsAsFactors = FALSE)
R reads it as a single column instead of splitting it into the expected 42 columns. Any ideas? Link to file.
The problem arises because each line is between quotation marks (the whole line).
There are two possible ways to read the file.
1. Keep all the quotation marks. Use the parameter quote = "" to disable quote processing:
read.table("xxx.csv", sep = "\t", dec = ".", header = TRUE,
encoding = "UTF-8", stringsAsFactors = FALSE, quote = "")
2. Remove the quotation marks before reading the file:
tmp <- gsub('^\"|\"$', '', readLines("xxx.csv"))
read.table(text = tmp, sep = "\t", dec = ".", header = TRUE,
encoding = "UTF-8", stringsAsFactors = FALSE)
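A minimal reproduction of why the file collapses into one column (the sample data is made up): each line is wrapped in quotes, so by default read.table treats the entire line as one quoted field.

```r
# Build a tiny stand-in for the problem file: every line wrapped in quotes.
tmp_file <- tempfile(fileext = ".csv")
writeLines(c("\"a\tb\tc\"", "\"1\t2\t3\""), tmp_file)

# Default quoting: the whole line is one quoted field, so a single column.
bad <- read.table(tmp_file, sep = "\t", header = TRUE)

# quote = "" disables quote processing, so the tabs split the fields again.
ok  <- read.table(tmp_file, sep = "\t", header = TRUE, quote = "")
```

Note that with quote = "" the quote characters stay embedded in the first and last cell of each row, which is why the second option strips them before parsing.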
I have this code:
write.table(df, file = f, append = F, quote = TRUE, sep = ";",
eol = "\n", na = "", dec = ".", row.names = FALSE,
col.names = TRUE, qmethod = c("escape", "double"))
where df is my data frame and f is a .csv file name.
The problem is that the resulting csv file has an empty line at the end.
When I try to read the file:
dd <- read.table(f, fileEncoding = "UTF-8", sep = ";", header = TRUE, quote = "\"")
I get the following error:
incomplete final line found by readTableHeader
Is there something I am missing?
Thank you in advance.
UPDATE: I solved the problem by removing the fileEncoding = "UTF-8" argument from the read.table call:
dd <- read.table(f, sep = ";", header = TRUE, quote = "\"")
but I can't explain why this works, since the file written by write.table seems to be UTF-8 anyway (I checked this with a text tool).
Any idea of why this is happening?
Thank you,
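For what it's worth, that warning normally means the reader did not see a newline after the very last record. A minimal check of the raw bytes can confirm whether that is the case (the file here is a made-up stand-in for f):

```r
# Create a file that deliberately lacks a newline after the last record.
tmp <- tempfile(fileext = ".csv")
writeBin(charToRaw("a;b\n1;2"), tmp)

# Inspect the final byte: 0x0A is "\n".
bytes <- readBin(tmp, "raw", file.size(tmp))
ends_with_newline <- bytes[length(bytes)] == as.raw(0x0A)  # FALSE for this file
```

If the last byte is not 0x0A, read.table will emit "incomplete final line" regardless of encoding settings.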
I'm trying to read a file in R but the fourth record appears as a new line (see attached). After the third line, there's no tab, just two spaces. I'm using this code:
df = read.delim("text.txt", header = FALSE, stringsAsFactors = FALSE, quote = "")
UPDATE: the third line has "¬" at the end.
Use the sep argument of read.delim to specify the separator; in this case you would need:
df = read.delim("text.txt", header = FALSE, stringsAsFactors = FALSE, quote = "", sep = "\t")
I want to read a file with the read.csv2 function. This file contains blank spaces in the column names. With the parameter header = FALSE I can read the file, but when I replace FALSE with TRUE, I get this error:
Error in make.names(col.names, unique = TRUE) : chaîne de caractères multioctets incorrecte 7
(i.e., "invalid multibyte string 7")
How can I manage this error?
My code :
client <- read.csv2("./data/Clients.csv", header = T, na.strings = "",
stringsAsFactors = FALSE, sep = ";", encoding = "UTF-8")
Thanks for your help.
The error indicates that a column name contains a multi-byte character that is not valid UTF-8.
An option is to use encoding = "UCS-2LE":
client <- read.csv2("./data/Clients.csv", header = TRUE, na.strings = "",
stringsAsFactors = FALSE, sep = ";", encoding = "UCS-2LE")
I have some strings in one of the columns of my data frame that look like:
bem\\2015\black.rec
When I export the data frame into a text file using the next line:
write.table(data, file = "sample.txt", quote = FALSE, row.names = FALSE, sep = '\t')
Then in the text file the text looks like:
bem\2015BELblack.rec
Do you know an easy way to keep all the backslashes intact when writing the table to a text file?
The way I resolved this was to convert the backslashes into forward slashes:
library(readr)    # for read_delim
library(stringr)  # for str_replace_all

dataset <- read_delim(dataset.path, delim = '\t', col_names = TRUE, escape_backslash = FALSE)
dataset$columnN <- str_replace_all(dataset$Levfile, "\\\\", "//")  # each \ becomes //
dataset$columnN <- str_replace_all(dataset$columnN, "//", "/")     # then collapse // into /
write.table(dataset, file = "sample.txt", quote = FALSE, row.names = FALSE, sep = '\t')
This exports the text imported as bem\\2015\black.rec with the required slashes: bem//2015/black.rec
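If the goal is just the slash conversion, a single gsub with fixed = TRUE does the same thing in one step (the sample string mirrors the one above):

```r
# fixed = TRUE treats the backslash as a literal character, not a regex escape.
x <- "bem\\\\2015\\black.rec"          # the string bem\\2015\black.rec
y <- gsub("\\", "/", x, fixed = TRUE)  # every backslash becomes a forward slash
# y is "bem//2015/black.rec"
```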
I want to create a function that changes specific cells of a given CSV file(s) and saves them back into the exact same format.
I have tried to read the CSV file and write it back with the edited values, but the only way to do it without misinterpreting commas is to quote everything.
Is there a way to edit the files without putting quotes for every single value?
My guess is that there is some other way to read and write the CSV files than read.table
p.s. The csv files I want to edit are not simply comma-separated values but have a rather vague format (there may be unquoted strings, quoted strings (mostly because of a comma in the string), and integers in the same column/row).
Here is the code I use:
editcell <- function(id,
                     newvalue,
                     row,
                     col,
                     dbpath = "C:/data/",
                     add = FALSE) {
  # READ: quote = "" keeps any quote characters as part of the cell text
  temptable <- read.table(file = paste0(dbpath, id, ".csv"),
                          header = FALSE,
                          sep = ",",
                          dec = ".",
                          fill = TRUE,
                          quote = "",
                          comment.char = "",
                          colClasses = "character",
                          # nrows = 400,
                          stringsAsFactors = FALSE)
  cat("There are", nrow(temptable), "rows and",
      ncol(temptable), "columns.\n")
  # EDIT the requested cell
  if (add) {
    temptable[row, col] <- paste0(temptable[row, col], newvalue)
  } else {
    temptable[row, col] <- newvalue
  }
  # WRITE
  write.table(temptable,
              file = paste0(dbpath, id, ".csv"),
              append = FALSE,
              quote = TRUE,
              sep = ",",
              eol = "\n",
              na = "NA",
              dec = ".",
              row.names = FALSE,
              col.names = FALSE,
              qmethod = "double")
}
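One way to avoid re-quoting every value is to not round-trip through read.table/write.table at all: edit the raw lines with readLines, so every cell you do not touch is written back byte-for-byte with its original quoting. A sketch under the assumption that you can identify the target line and the old cell text (the file contents here are made up):

```r
# Create a stand-in file mixing quoted and unquoted cells.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,comment", "1,\"hello, world\"", "2,plain"), tmp)

# Replace one cell's text on its line; fixed = TRUE avoids regex surprises.
lines <- readLines(tmp)
lines[2] <- sub("\"hello, world\"", "\"goodbye, world\"", lines[2], fixed = TRUE)
writeLines(lines, tmp)
```

This only works when the old cell text is unambiguous within its line, but it sidesteps the quoting problem entirely because the file is never re-serialised.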