When I run read.csv on a dataset
read.csv(file = msleep_ggplot2, header = TRUE, sep = ",")
I get an error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, : 'file' must be a character string or connection
The CSV file loaded in RStudio and looks good. Any idea what the problem might be?
Related
I am trying to read a CSV file containing text in several different scripts using read.csv.
This is a sample of the file content:
device,country_code,keyword,indexed_clicks,indexed_cost
Mobile,JP,お金 借りる,5.913037843442198,103.05985173478956
Desktop,US,email,82.450427682737157,81.871030974598241
Desktop,US,news,414.14755054432345,66.502397615344861
Mobile,JP,ヤフートラベル,450.9622861586314,55.733902871922957
If I use the following call to read the data:
texts <- read.csv("text.csv", sep = ",", header = TRUE)
The data frame is imported into R, but the non-ASCII characters are garbled:
device country_code keyword indexed_clicks indexed_cost
1 Mobile JP ã\u0081Šé‡‘ 借りる 5.913038 103.05985
2 Desktop US email 82.450428 81.87103
3 Desktop US news 414.147551 66.50240
4 Mobile JP ヤフートラベル 450.962286 55.73390
If I use the same call with fileEncoding = "utf-8" added:
texts <- read.csv("text.csv", sep = ",", header = TRUE, fileEncoding = "utf-8")
I get the following warning messages:
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
invalid input found on input connection 'text.csv'
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'text.csv'
Does anyone know how to read this file properly?
I tried to reproduce your problem with both:
texts <- read.csv("text.csv", sep = ",", header = TRUE)
and
texts_ <- read.csv("text.csv", sep = ",", header = TRUE, encoding = "utf-8")
and both work perfectly fine (RStudio v1.4.1717, Ubuntu 20.04.3 LTS).
Some possibilities I can think of:
The CSV file wasn't saved properly as UTF-8, or it is corrupted. Have you checked the file again?
If you are using Windows, try using encoding instead of fileEncoding. These problems tend to show up with non-ASCII characters (Windows Encoding Hell).
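To make the difference between the two arguments concrete, here is a minimal sketch. It assumes the file really is UTF-8; the temporary file and its two data rows are made up for illustration, modelled on the sample in the question. As described in ?read.table, fileEncoding re-encodes the input while it is read, whereas encoding only marks the strings that were read as UTF-8 (or Latin-1) without converting them.

```r
# Hypothetical sample data, written as raw UTF-8 bytes regardless of the locale.
lines <- c("device,country_code,keyword",
           "Mobile,JP,\u304a\u91d1 \u501f\u308a\u308b",
           "Desktop,US,email")
path <- tempfile(fileext = ".csv")
# writeLines ends the file with a newline, which avoids the
# "incomplete final line" warning seen in the question.
writeLines(enc2utf8(lines), path, useBytes = TRUE)

# encoding = "UTF-8" marks the strings as UTF-8 instead of re-encoding them.
texts <- read.csv(path, sep = ",", header = TRUE,
                  encoding = "UTF-8", stringsAsFactors = FALSE)

# The non-ASCII keyword is now flagged as UTF-8; pure ASCII values stay "unknown".
keyword_encodings <- Encoding(texts$keyword)
```

If the keywords still look garbled after this, the file is probably not UTF-8 in the first place, which is the first possibility listed above.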
I have a .csv file that contains 285000 observations. When I try to import the dataset, I get the warning below and only 166000 observations are read.
Joint <- read.csv("joint.csv", header = TRUE, sep = ",")
Warning message:
In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
EOF within quoted string
When I disabled quoting, as follows, I got an error instead:
Joint2 <- read.csv("joint.csv", header = TRUE, sep = ",", quote="", fill= TRUE)
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
When I use read.table like this, it shows 483000 observations:
Joint <- read.table("joint.csv", header = TRUE, sep = ",", quote="", fill= TRUE)
What should I do to read the file properly?
I think the problem has to do with file encoding. There are a lot of special characters in the header.
If you know how your file is encoded, you can specify it using the fileEncoding argument to read.csv.
Otherwise you could try to use fread from data.table. It is able to read the file despite the encoding issues. It will also be significantly faster for reading such a large data file.
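A minimal sketch of the fread suggestion, assuming the data.table package is installed (the code falls back to read.csv if it is not). The file here is a small made-up stand-in for joint.csv. Comparing the number of rows read with the number of lines in the file is a quick sanity check; it assumes no field contains an embedded line break.

```r
# Hypothetical stand-in for joint.csv.
path <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,alpha", "2,beta", "3,gamma"), path)

if (requireNamespace("data.table", quietly = TRUE)) {
  # fread detects most settings itself; sep and header are given explicitly here.
  joint <- data.table::fread(file = path, sep = ",", header = TRUE)
} else {
  joint <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)
}

# One header line plus one line per observation is expected.
n_expected <- length(readLines(path)) - 1L
rows_match <- nrow(joint) == n_expected
```

On the real file, a mismatch between the two counts (such as 166000 or 483000 rows against 285000 expected) suggests that quotes or separators are being misread. fread also accepts an encoding argument ("unknown", "UTF-8" or "Latin-1") if the encoding is known.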
I was having difficulties importing an Excel sheet (saved as CSV) into R. After reading this post I was able to import it, but I noticed that some of the numbers in one column now start with an unwanted character: "Ï52,386.43" "Ï6,887.61" "Ï32,923.45". Any ideas how I can change these to numbers?
Here's my code below:
df <- read.csv("data.csv", header = TRUE, strip.white = TRUE,
fileEncoding="latin1", stringsAsFactors=FALSE)
I've also tried fileEncoding = "UTF-8", but this doesn't work. I get the following warnings:
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
invalid input found on input connection 'data.csv'
2: In read.table(file = file, header = header, sep = sep, quote = quote
I am using a Mac with "R version 3.2.4 (2016-03-10)" (if that makes any difference). Here are the first ten entries from the affected column:
[1] "Ï52,386.43" "Ï6,887.61" "Ï32,923.45" "" "Ï82,108.44"
[6] "Ï6,378.10" "" "Ï22,467.43" "Ï3,850.14" "Ï5,547.83"
It turns out the issue was a pound sign that got changed into Ï when the xls file was saved as CSV (on Windows, then opened on a Mac). Thanks for your replies.
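For the original question of turning these strings into numbers, one possible approach is to strip everything that is not a digit, decimal point or minus sign and then convert. This is only a sketch: the vector below copies a few values from the output above, and the variable names are made up.

```r
# Sample values copied from the affected column ("\u00cf" is the stray character).
raw <- c("\u00cf52,386.43", "\u00cf6,887.61", "\u00cf32,923.45", "", "\u00cf82,108.44")

# Drop the stray prefix and the thousands separators, keep digits, "." and "-".
cleaned <- gsub("[^0-9.-]", "", raw)

# Empty strings become NA.
amount <- as.numeric(cleaned)
```

The same gsub call also removes a correctly decoded currency symbol, so it should keep working if the file is later re-exported with the right encoding.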
I have a CSV file which has only one column, and the first couple of cells are empty. When I try to read it into R, intending to have those blank lines read as missing values, I run into the following error. Any help is appreciated!
Test = read.csv("test.csv", header = FALSE, blank.lines.skip = FALSE, nrows = 10)
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
empty beginning of file
I'm loading a CSV file into R from the Quandl database.
The file is comma delimited and the data looks as follows:
quandl code,name
WIKI/ACT,"Actavis, Inc."
WIKI/ADM,"Archer-Daniels-Midland Company"
WIKI/AEE,"Ameren Corporation"
...
...
I use the following code to load the data:
US.Stocks <-read.table(file=ABC,header=FALSE,sep=",")
However, I get the following error:
Error in read.table(data.frame(file = ABC, header = FALSE, :
'file' must be a character string or connection
Can someone please help me with what I'm doing wrong? I suspect I've not specified some parameter in the read.csv command.
Thanks
Tom
You should use
read.csv(file = yourFilePath, header = TRUE)
but I think the problem is in your file path. Maybe you are missing the file extension; also remember to wrap the file path in double quotes ("yourfilepath").
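A small sketch of the point about the file argument: it has to be a character string (or a connection), not a bare object name. The temporary file and the names path, stocks and msg are made up for illustration; the rows are taken from the question.

```r
# Write a small sample file so the example is self-contained.
path <- file.path(tempdir(), "stocks.csv")
writeLines(c("quandl code,name",
             'WIKI/ACT,"Actavis, Inc."',
             'WIKI/ADM,"Archer-Daniels-Midland Company"'), path)

# Works: 'file' is a character string holding the path.
stocks <- read.csv(file = path, header = TRUE, stringsAsFactors = FALSE)

# Fails: 'file' is an object that is neither a string nor a connection
# (here a data frame that has already been loaded).
msg <- tryCatch(read.csv(file = stocks),
                error = function(e) conditionMessage(e))
```

In an English locale, msg holds the same "'file' must be a character string or connection" text as in the question, which is what happens when an unquoted name refers to an object that is not a path.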
UPDATE:
read.csv is just a wrapper around read.table with some different default parameters:
function (file, header = TRUE, sep = ",", quote = "\"", dec = ".",
          fill = TRUE, comment.char = "", ...) {
  read.table(file = file, header = header, sep = sep, quote = quote,
             dec = dec, fill = fill, comment.char = comment.char, ...)
}
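As a quick check of that equivalence, the sketch below reads the same made-up file both ways, passing read.table the defaults that read.csv uses. The file contents and variable names are hypothetical.

```r
# Hypothetical sample file with one quoted field containing a comma.
path <- tempfile(fileext = ".csv")
writeLines(c("code,name", 'A1,"Alpha, Inc."', "B2,Beta"), path)

via_csv <- read.csv(path)

# read.table with the defaults that read.csv supplies.
via_table <- read.table(path, header = TRUE, sep = ",", quote = "\"",
                        dec = ".", fill = TRUE, comment.char = "")

same_result <- identical(via_csv, via_table)
```

So switching between the two functions only changes the defaults; the requirement that file be a path string or a connection is the same for both.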