fileEncoding="UTF8" in write.csv does not work - r

df <- dir(getwd(), full.names = T) %>% map_df(~ read_excel(.x, col_names = TRUE))
write.csv(df, file = "mynewfile.csv", col.names = T, row.names = F, fileEncoding = "UTF8", quote = FALSE)
Unfortunately the output is not encoded in UTF-8; ö, 360° and similar values still come out as invalid characters.
It works when I save it with write.xlsx, but unfortunately that fails for larger numbers of rows (with 50k rows I ran into memory problems).
This is how it looks in a sample (after I saved it as csv, opened it, and used Text to Columns to split it into separate columns):
Any suggestions?
Let's say this is what my sample file looks like:
df<-data.frame(A = c("ö","ö","ö"), B=c("360°", "360°", "360°"), C= c(123,123,123))

Try with encoding="UTF-8":
write.csv(df, file = "mynewfile.csv", col.names = T, row.names = F, encoding="UTF-8", quote = FALSE)
Edit: your code works in my RStudio. Have you set the correct options in your global options (Default text encoding)?
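For reference, here is a minimal base-R round trip that names the encoding explicitly on both the write and the read side; the hyphenated spelling "UTF-8" is the safest form across platforms. File name and sample data are illustrative:

```r
# Sketch: write with an explicit UTF-8 encoding and read it back the same way.
# "\u00f6" is o-umlaut and "\u00b0" is the degree sign.
df <- data.frame(A = c("\u00f6", "\u00f6"), B = c("360\u00b0", "360\u00b0"),
                 C = c(123, 123), stringsAsFactors = FALSE)
f <- tempfile(fileext = ".csv")
write.csv(df, file = f, row.names = FALSE, fileEncoding = "UTF-8", quote = FALSE)
back <- read.csv(f, fileEncoding = "UTF-8", stringsAsFactors = FALSE)
```

Note that the col.names argument from the question is dropped here: write.csv sets column-name handling itself and warns if you try to override it.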

Related

R write.table function inserts unwanted empty line at the bottom of my csv

I have this code:
write.table(df, file = f, append = F, quote = TRUE, sep = ";",
eol = "\n", na = "", dec = ".", row.names = FALSE,
col.names = TRUE, qmethod = c("escape", "double"))
where df is my data frame and f is a .csv file name.
The problem is that the resulting csv file has an empty line at the end.
When I try to read the file:
dd<-read.table(f,fileEncoding = "UTF-8",sep = ";",header = T,quote = "\"")
I get the following error:
incomplete final line found by readTableHeader
Is there something I am missing?
Thank you in advance.
UPDATE: I solved the problem by removing the UTF-8 file encoding from the read.table call:
dd <- read.table(f, sep = ";", header = TRUE, quote = "\"")
but I can't explain why, since the default for write.table seems to be UTF-8 anyway (I checked this with an advanced text tool).
Any idea why this is happening?
Thank you,
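For what it's worth, the trailing newline itself is expected behaviour rather than a bug: write.table terminates every record, including the last one, with eol, and editors render that final eol as an empty line. A small sketch with a temporary file and made-up data:

```r
# Sketch: the last record also ends with eol, so the file ends in "\n".
df <- data.frame(a = 1:2, b = c("x", "y"))
f <- tempfile(fileext = ".csv")
write.table(df, file = f, sep = ";", row.names = FALSE, col.names = TRUE,
            quote = TRUE)
raw <- readChar(f, file.size(f), useBytes = TRUE)
endsWith(raw, "\n")   # TRUE: the "empty line" editors show is this final eol
back <- read.table(f, sep = ";", header = TRUE, quote = "\"")
```

A file whose last line lacks that newline is what actually triggers the "incomplete final line" warning, so the warning in the question most likely came from the encoding-related re-reading of the file, not from the extra line.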

Separating unique column to multiple columns in multiple csv files in R

I'm trying to split a single column into multiple columns in several csv files. I've already done it for one file with this code:
tempmax <- read.csv(file="path", header=TRUE, sep=";", fill = TRUE)
colnames(tempmax) = c("Fecha", "Hora", "Temperatura max")
rbind(tempmax)
write.csv(tempmax, "path", sep = ";", append = FALSE, row.names = FALSE, col.names = FALSE)
However, I haven't found a way to do this for multiple csv files saved in a folder. I would like to do the same with each of them: read, modify, and write the new one.
I used this to read the multiple files:
getwd <- ("path")
filenames <- list.files("path",
pattern = "*.csv", full.names = TRUE)
But I just can't find a way to edit them the way I want. (I'm pretty new to R.)
I appreciate the help. Thanks!
If we have several files, we can use lapply. The transformation isn't clear, so here the file is written back after selecting the first column:
lapply(filenames, function(file) {
  tempmax <- read.csv(file = file, header = TRUE, sep = ";", fill = TRUE)
  colnames(tempmax) <- c("Fecha", "Hora", "Temperatura max")
  write.csv(tempmax[1], file, row.names = FALSE)
})
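One caveat worth knowing: write.csv deliberately ignores sep, append and col.names (it warns and enforces CSV conventions), so if the semicolon separator should survive into the output, write.table is the right tool. A sketch under that assumption, keeping the in-place overwrite of each file from the question:

```r
# Sketch: rename the columns and write the file back semicolon-separated.
process_file <- function(file) {
  tempmax <- read.csv(file = file, header = TRUE, sep = ";", fill = TRUE)
  colnames(tempmax) <- c("Fecha", "Hora", "Temperatura max")
  # write.table honours sep = ";"; write.csv would silently force ","
  write.table(tempmax, file, sep = ";", row.names = FALSE, col.names = TRUE,
              quote = FALSE)
}
# lapply(filenames, process_file)
```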

How to save special characters in a csv while preserving them for excel?

Let's consider this simple creation of a csv file with a dataframe that contains special characters:
d <- data.frame(x = "Édifice", y="Arrêt")
write.table(x = d, file = "test.csv", sep = ",", row.names = F, col.names = F, quote = F, fileEncoding = "UTF-8")
The csv file looks as expected:
Édifice,Arrêt
But when I open this csv in Excel, the accented characters are garbled.
I have tried using readr, collapsing the columns and writing them with writeLines, writing with write.xlsx, and checking the encoding options. None worked.
My constraint is that the input is a dataframe, and the output must be a csv readable in excel.
Same problem with German umlauts. I use write_excel_csv from readr:
library(readr)
write_excel_csv(x = d, file = "test.csv", col_names = FALSE)
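The reason write_excel_csv works is that it prepends a UTF-8 byte-order mark (bytes EF BB BF), which is the signal Excel uses to pick UTF-8. If adding readr is not an option, the same effect can be sketched in base R (temporary file used for illustration):

```r
# Sketch: write the UTF-8 BOM first, then append the UTF-8-encoded CSV body.
d <- data.frame(x = "\u00c9difice", y = "Arr\u00eat")   # Edifice, Arret with accents
f <- tempfile(fileext = ".csv")
con <- file(f, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)   # UTF-8 byte-order mark
close(con)
write.table(d, f, sep = ",", row.names = FALSE, col.names = FALSE,
            quote = FALSE, fileEncoding = "UTF-8", append = TRUE)
```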

Fewer records in data frame than expected

I have found a few people posting similar issues, but I still can't solve my problem. I expected 957463 rows in the dataframe, but only 392400 were extracted.
I used read.delim2("test.csv", header = TRUE, sep = ",", quote = "\"", fill = TRUE) to create the dataframe, but the output had fewer records than expected.
#set working directory --------------------------------
L <- setwd("C:/Users/abmo8004/Documents/R Project/csv/")
#List files in the path ------------------------
l <- list.files(L)
#form dataframe from csv file ---------------------------
df <- read.delim2("test.csv", header = TRUE, sep = ",", quote = "\"", fill = TRUE)
I expect the output to be 957463 rows, but the actual output is 392400. Can anyone please look at the code?
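A frequent cause of a row count this far off is a stray quote character in the data: a quote at the start of a field opens a quoted string, and everything up to the next quote, newlines included, is swallowed into one field, collapsing many physical rows into one record. A small reproduction with made-up data:

```r
# Sketch: an unbalanced quote makes the reader swallow the following rows.
f <- tempfile(fileext = ".csv")
writeLines(c("id,comment",
             "1,\"unclosed",     # this opening quote is never closed
             "2,ok",
             "3,ok"), f)
bad  <- read.delim2(f, sep = ",", quote = "\"", fill = TRUE)
good <- read.delim2(f, sep = ",", quote = "",   fill = TRUE)
nrow(bad)    # fewer than 3: the rows after the stray quote were merged
nrow(good)   # 3: with quoting disabled every physical row survives
```

If the real file does contain legitimate quoted fields, count.fields(f, sep = ",", quote = "") is a quick way to locate lines with an unexpected field count instead of disabling quoting wholesale.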

My R code isn't throwing any errors, but it's not doing what it's supposed to

Some background for my question: This is an R script that a previous research assistant wrote, but he did not provide any guidance to me on using it for myself. After working through an R textbook, I attempted to use the code on my data files.
What this code is supposed to do is load multiple .csv files, delete certain items/columns from them, and then write the new cleaned .csv files to a specified directory.
When I run the code, I don't get any errors, but it isn't doing anything. I originally thought this was a file-permissions problem, but I'm still seeing it after changing the permissions. Not sure what to try next.
Here's the code:
library(data.table)
library(magrittr)
library(stringr)
# create a function to delete unnecessary variables from a CAFAS or PECFAS
# data set and save the reduced copy
del.items <- function(file)
{
  data <- read.csv(input = paste0("../data/pecfas|cafas/raw",
    str_match(pattern = "cafas|pecfas", string = file) %>% tolower, "/raw/",
    file), sep = ",", header = TRUE, na.strings = "", stringsAsFactors = FALSE,
    skip = 0, colClasses = "character", data.table = FALSE)
  data <- data[-grep(pattern = "^(CA|PEC)FAS_E[0-9]+(T(Initial|[0-9]+|Exit)|SP[a-z])_(G|S|Item)[0-9]+$",
    x = names(data))]
  write.csv(data, file = paste0("../data/pecfas|cafas/items-del",
    str_match(pattern = "cafas|pecfas", string = file) %>% tolower,
    "/items-del/", sub(pattern = "ExportData_", x = file, replacement = "")) %>%
    tolower, sep = ",", row.names = FALSE, col.names = TRUE)
}
# delete items from all cafas data sets
cafas.files <- list.files("../data/cafas/raw/", pattern = ".csv")
for (file in cafas.files){
del.items(file)
}
# delete items from all pecfas data sets
pecfas.files <- list.files("../data/pecfas/raw/", pattern = ".csv")
for (file in pecfas.files){
del.items(file)
}
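Two things in del.items stand out: input = and data.table = FALSE are arguments of data.table::fread, not read.csv, and the literal "pecfas|cafas" inside paste0 is a regex alternation that never names a real directory, so the constructed paths point nowhere. Since the script runs without errors, list.files most likely matched no files at all (note that a pattern of ".csv" is itself a regex; "\\.csv$" is the anchored form), so the loops ran zero times. A hedged sketch of the intended function, with explicit placeholder directories instead of the guessed-together paths:

```r
# Sketch: the same cleaning logic with base read.csv/write.csv.
# in_dir and out_dir are placeholders for the real raw and items-del folders.
del.items <- function(file, in_dir, out_dir) {
  data <- read.csv(file.path(in_dir, file), header = TRUE, na.strings = "",
                   colClasses = "character", stringsAsFactors = FALSE)
  # drop the per-item score columns matched by the original pattern
  drop <- grep("^(CA|PEC)FAS_E[0-9]+(T(Initial|[0-9]+|Exit)|SP[a-z])_(G|S|Item)[0-9]+$",
               names(data))
  if (length(drop) > 0) data <- data[-drop]
  write.csv(data, file.path(out_dir, tolower(sub("ExportData_", "", file))),
            row.names = FALSE)
}
```

The caller would then pass the real cafas and pecfas directories separately, e.g. one pair of in_dir/out_dir per data set, instead of encoding both names into a single path string.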
