I have a CSV file with fixed headers, but some of the column values can be missing, leaving consecutive commas. This is creating problems for read.csv.sql. Am I missing a parameter for this function? I expect missing values to be read as NULL/NA. Is there a workaround?
sample file content -
day,car1,car2
1,bmw,audi
2,merc,bmw
3,,audi
4,,
I get this error -
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 1 did not have 45 elements
The huge file was first split into smaller chunks. The CSV was then opened in Excel with "," as the delimiter. There was a mismatch between the number of fields in the header (45 elements) and the next line (43 elements).
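As a workaround, base R's read.csv() (read.table() with fill = TRUE behaves similarly) accepts rows with empty fields instead of erroring, so the file can be read outside read.csv.sql. A minimal sketch using the sample data from the question:

```r
# Base read.csv() tolerates empty fields: blank numeric fields become NA,
# blank character fields become "".
txt <- "day,car1,car2
1,bmw,audi
2,merc,bmw
3,,audi
4,,"
df <- read.csv(text = txt, stringsAsFactors = FALSE)
# Turn empty strings into NA as well, so all missing values look alike.
df[df == ""] <- NA
```

If the real file truly has fewer fields on some lines than the header (as in the 45 vs. 43 case), fill = TRUE pads the short rows with NA.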
Related
I am trying to read a .csv file from a website into R as follows:
poll = read.csv("http://www.aec.gov.au/About_AEC/cea-notices/files/2013/prdelms.gaz.statics.130901.09.00.02.csv")
But then I got the warning message:
Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
EOF within quoted string
Then I searched previous StackOverflow questions and changed my code to:
poll = read.csv("http://www.aec.gov.au/About_AEC/cea-notices/files/2013/prdelms.gaz.statics.130901.09.00.02.csv", quote="")
This seemed to solve the problem: I got no warning, and I got 8855 × 26 data. My question is:
What did the original problem mean, and why did the second call fix it?
Thank you!
Your file contains a stray " character, and this symbol is normally interpreted as a quote. It broke the line that contains it. You have to disable the use of this character as a quote, which is what quote = "" does.
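A minimal sketch of the failure mode (made-up data, not the AEC file): a stray, unbalanced " makes scan() hunt for the closing quote across line ends, which produces the "EOF within quoted string" warning and swallows the rest of the input into one field. Passing quote = "" disables quote processing entirely:

```r
txt <- 'id,size\n1,5" pipe\n2,normal'
# With default quoting, the lone " opens a quoted string that never closes,
# so read.csv() warns about EOF within a quoted string.
res <- tryCatch(read.csv(text = txt, stringsAsFactors = FALSE),
                warning = function(w) conditionMessage(w))
# Disabling the quote character reads every line literally:
good <- read.csv(text = txt, quote = "", stringsAsFactors = FALSE)
```

With quote = "", any " characters simply become part of the field's text, which is usually what you want for data that uses quotes as inch marks or apostrophes rather than as field delimiters.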
I'm new to R, and I'm trying to read a TSV file that sometimes contains a "#" in the table. R just stops reading when it comes across the "#" and gives me the error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 6227 did not have 6 elements
I looked at that line in the file and I found the "#". The data looks like this:
CM School Supply #1 Upland CA 3 8 Shopping
When I delete it, R can continue reading the table, but I have more "#"s in the file...
How do I set the arguments in read.table()? I tried to search for a solution everywhere but failed... Hope someone here can help me out. Thanks!
You can completely turn off read.table()'s interpretation of comment characters (by default set to "#") by setting comment.char="" in your call to read.table().
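A minimal sketch using the line quoted above: with the default comment.char = "#", everything from the "#" onward is dropped, so the row comes up short; comment.char = "" turns comment handling off:

```r
txt <- "CM School Supply #1\tUpland\tCA\t3\t8\tShopping"
# Default comment.char = "#": the row is cut off at "#" -> only 1 field.
cut <- read.table(text = txt, sep = "\t", stringsAsFactors = FALSE)
# comment.char = "": the "#" is ordinary data -> all 6 fields survive.
full <- read.table(text = txt, sep = "\t", comment.char = "",
                   stringsAsFactors = FALSE)
```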
I have very large text messages that contain "" and \n. A file whose column contains such text is not read properly, just because the messages contain "" and "\n". I have used the following:
dat = read.csv("abc", header=F, sep=",", quote ="\"'", stringsAsFactors=FALSE, allowEscapes=T, flush=T, comment.char="")
read.csv reads the file incorrectly, and reading it with read.table gives an error:
dat = read.table("abc", header=F, sep=",", quote="\"'", stringsAsFactors=FALSE, allowEscapes=T, flush=T, comment.char="")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 38 did not have 20 elements
So my row gets broken in the message column. I saved my file with eol = '\r\r\n' and quote = TRUE, but while reading I did not find any parameter to read it back in the same format.
I saved the file as
write.table(z, file="abc", append=F, quote=T, sep=",", eol="\r\r\r\r\r\n", row.names=F, col.names=F)
In this example
"In case you know,give some hint
lot of text.....
.................
---------------------------------------------------------------------------
\"thank you very much for your time
and your effort\"
---------------------------------------------------------------------------"
it breaks after
"In case you know,give some hint
lot of text.....
.................
---------------------------------------------------------------------------
\"thank you very much for your time
While reading, how can I use eol to retrieve the complete text message in the same column? I am not able to read the written file back, though the file uploads successfully into MySQL with a loading script. Any help in this direction is appreciated.
Thanks.
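One possible way out (a sketch with made-up data, not the asker's file): keep the default eol = "\n" and write with qmethod = "double". Quoted fields may then legitimately contain embedded newlines and doubled quotes, and read.table() reassembles them without any special eol parameter on the reading side:

```r
z <- data.frame(id = 1,
                msg = "In case you know, give some hint\nlot of text \"quoted\"",
                stringsAsFactors = FALSE)
f <- tempfile(fileext = ".csv")
# qmethod = "double" doubles embedded quotes, which read.table understands;
# the default qmethod = "escape" writes \" which read.table does NOT parse
# unless allowEscapes = TRUE.
write.table(z, file = f, quote = TRUE, sep = ",", qmethod = "double",
            row.names = FALSE, col.names = FALSE)
back <- read.table(f, sep = ",", stringsAsFactors = FALSE)
# back$V2 is identical to z$msg, newline and quotes included.
```

The non-standard eol = "\r\r\r\r\r\n" in the question is what breaks the round trip; read.table has no matching eol argument, so sticking to the default line ending is the simplest fix.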
A question closer to mine was asked and answered here.
My problem is fairly simple: I need to import a .tsv file into R, but I cannot, because some elements contain a \t, so I get an error like:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 34 did not have 6 elements
One way to proceed would be to use gsub to replace the \ts. But the file is quite big, around 11 GB, and this pre-processing would probably be too much for my machine. Any idea of a possible shortcut here?
Some context: in the end I need to import the whole dataset into a SQL database; I could do it without this conversion, but at that point I would run into the same problem.
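A cheap first step (sketched here with inline sample data; for the real 11 GB file you would point count.fields() at the file itself): count.fields() scans the field count of each line without building a data frame, so the offending lines can be located and inspected before deciding how to repair them:

```r
# Sample file: the third line has a stray tab, giving 4 fields instead of 3.
txt <- "a\tb\tc
1\t2\t3
1\tstray\ttab here\t3"
f <- tempfile()
writeLines(txt, f)
# Count fields per line; disable quote/comment handling to see raw structure.
n <- count.fields(f, sep = "\t", quote = "", comment.char = "")
bad <- which(n != n[1])  # lines whose field count differs from the header's
```

Since count.fields() streams the file line by line, it is far lighter than reading the data in, and the resulting line numbers can drive a targeted fix (e.g., rewriting only the bad lines).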
I have a text document that is separated by tabs. I noticed a bunch of tabs after the data in the text file and am unsure whether that is the issue here.
I have set the working directory:
setwd("D:/Classes/CSC/gmcar_price")
Then I attempt to read the table using
data=read.table("gmcar_price.txt", header=T)
But this error is coming up:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 11 did not have 13 elements
Any idea what is going on here? I have looked at line 11 and all the data is there.
Edit:
this is the format of the data
Price,Mileage,Make,Model,Trim,Type,Cylinder,Liter,Doors,Cruise,Sound,Leather
data=read.table("gmcar_price.txt", header=T, sep = "\t")
Thanks to shujaa, this solved the issue that I was having.
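A minimal sketch of why sep = "\t" matters here (made-up two-column data, not the gmcar file): with the default sep = "", any run of whitespace separates fields, so a value containing a space, such as a model name, splits into extra fields and triggers the "did not have N elements" error; with sep = "\t", only tabs separate:

```r
txt <- "Make\tModel
Buick\tPark Avenue"
f <- tempfile()
writeLines(txt, f)
# Default sep = "": "Park Avenue" splits in two -> wrong element count, error.
bad <- try(read.table(f, header = TRUE), silent = TRUE)
# sep = "\t": spaces inside fields are preserved.
good <- read.table(f, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
```

Trailing tabs after the data add empty fields at the end of each line; combining sep = "\t" with fill = TRUE handles those as well.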