I am trying to read a huge file (2GB in size) with this:
data1<-read.table("file1.txt", sep=",",header=F)
I get this error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 513836 did not have 8 elements
Is there a way to skip lines with missing data, or to replace the missing values with NA?
This error is most commonly fixed by adding fill = TRUE to your read.table() call. In your case, that would be:
data1 <- read.table("file1.txt", sep = ",", fill = TRUE)
Additionally, header = FALSE is the default setting for the header argument in read.table() and therefore unnecessary in your code.
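To see what fill = TRUE does, here is a minimal sketch on a made-up three-line file (the sample data and the names tmp, padded and complete_only are assumptions for illustration, not taken from the question). It also shows one way to drop the incomplete rows afterwards, if skipping them is preferred over keeping them with NA.

```r
# Hypothetical sample file: the second row is missing its last field.
tmp <- tempfile(fileext = ".txt")
writeLines(c("1,2,3", "4,5", "7,8,9"), tmp)

# Without fill = TRUE this call stops with
# "line 2 did not have 3 elements".
padded <- read.table(tmp, sep = ",", fill = TRUE)

# The short row is kept; its missing numeric field is read as NA.
print(padded)

# To skip incomplete rows instead, drop any row that contains an NA.
complete_only <- padded[complete.cases(padded), ]
print(complete_only)
```

Note that fill = TRUE can hide real problems in the file, such as a stray quote that merges several lines, so it is worth checking the padded rows before relying on the result.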
Related
I get a repeated error message (below) when trying to import a CSV file with data. There were no problems in previous years when using exactly the same R script and the read.csv command. I get the impression this is a common problem, and the usual advice is to use read.csv rather than scan, but as I have already done this I am stuck and would be grateful for any information.
Here is the script:
#Read in all individual data for the year to be updated
Idata <- read.csv("Exp3.csv", sep = ";", header = T,
colClasses=c("numeric", rep("character",4), rep("factor",8), "numeric",
"factor", rep("numeric",11), "factor"))
Here is the error message:
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'a real', got '63991,21.1074,Ischnura,elegans,06/20/21,HojeA14,1074,0,mature,1,1073,blue,,0,androchrome,,,,,,,,,,,2021,KP'
Would be grateful for any help!
Check the separator argument. Your read.csv() call sets sep = ";", but the line quoted in the error message is comma-separated, so the whole line is being read as a single field.
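One way to confirm which separator a file really uses is to look at its first raw lines before parsing it. The sketch below uses a made-up two-line file standing in for Exp3.csv; the column names and the helper variables first_lines, n_commas, n_semis and sep_guess are assumptions for illustration.

```r
# Hypothetical stand-in for Exp3.csv: a comma-separated file.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,genus,species,year",
             "63991,Ischnura,elegans,2021"), tmp)

# Inspect the raw text to see which delimiter is actually present.
first_lines <- readLines(tmp, n = 2)
print(first_lines)

# Count candidate delimiters in the header line.
header <- first_lines[1]
n_commas <- lengths(regmatches(header, gregexpr(",", header, fixed = TRUE)))
n_semis  <- lengths(regmatches(header, gregexpr(";", header, fixed = TRUE)))

# Pass the delimiter that really occurs in the file.
sep_guess <- if (n_semis > n_commas) ";" else ","
Idata <- read.csv(tmp, sep = sep_guess, header = TRUE)
str(Idata)
```

If the export format changed from one year to the next (for example because of a different locale setting in the spreadsheet program), this kind of check shows it immediately.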
When reading my CSV data into R, a number of rows are read normally, but after that the data stops being separated at "," and a lot of data goes missing.
Here is how I load my data into R.
CODATA <- read.table( file.choose("CO2 Emissions per country.cvs"), header = TRUE, sep = "," )
I get this error and these warnings:
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 1 did not have 3 elements
Warning messages:
1: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
EOF within quoted string
2: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
number of items read is not a multiple of the number of columns
Then the value is...
"Cote dIvoire,0.42,0.44,0.44,0.43,0.45,0.49,0.46,0.51,0.45,0.4,0.33,0.32,0.38,0.33,0.29,0.28,0.26,0.25,0.23,0.21,0.21,0.2,0.21,0.21,0.22,0.25,0.3,0.3,0.4,0.37,0.36,0.36,0.29,0.31,0.32,0.32,0.3,0.34,0.31,0.29\nNigeria,0.1,0.12,0.15,0.16,0.18,0.22,0.27,0.3,0.32,0.35,0.39,0.42,0.41,0.37,0.38,0.35,0.35,0.36,0.36,0.3,0.34,0.4,0.36,0.29,0.28,0.3,0.34,0.29,0.31,0.34,0.38,0.39,0.36,0.36,0.4,0.35,0.32,0.33,0.27,0.29\nKenya,0.28,0.28.....(and so on)
where the values haven't been separated. The data is meant to start a new line with each country. It reads the previous 100 or so countries as normal up to Cote dIvoire.
Is there any way to fix this without editing the CSV file, just by changing the code that loads it?
Thank you for any help given.
It is best to check the CSV file again for any problems. You could also try CODATA <- read.csv("CO2 Emissions per country.csv") rather than read.table().
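A likely reason read.csv() behaves differently here: read.table() treats both " and ' as quote characters by default, while read.csv() only treats " as one. The warning "EOF within quoted string" is what an unmatched apostrophe typically produces, because everything after it is swallowed into one field. The sketch below assumes the file contains an apostrophe in a country name such as Cote d'Ivoire; the sample data is made up for illustration.

```r
# Hypothetical sample: the apostrophe in the third line is unmatched.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Country,y1,y2",
             "Ghana,0.30,0.31",
             "Cote d'Ivoire,0.42,0.44",
             "Nigeria,0.10,0.12"), tmp)

# Turning quote handling off keeps every line intact in read.table().
CODATA <- read.table(tmp, header = TRUE, sep = ",", quote = "")
print(CODATA)

# read.csv() only treats the double quote as a quote character,
# so the apostrophe is read as ordinary text.
CODATA2 <- read.csv(tmp)
print(CODATA2)
```

If some fields in the real file are wrapped in double quotes, quote = "\"" is the safer choice than quote = "", since it still honours those while ignoring apostrophes.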
I imported a dataset into R, but the values I got were all NA, and all the numeric values from my CSV file were lost. I thought it was a problem with factor conversion, so I tried stringsAsFactors = FALSE, as well as read.csv(file="habits.csv",header=TRUE,colClasses=c("numeric","numeric","numeric","numeric"))
But with this last call I get this error:
> datagpa <- read.csv(file="habits.csv",header=TRUE,colClasses=c("integer","integer","integer","integer","integer","integer","integer","integer","integer","integer","integer"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'an integer', got '2018/10/192:32:52PMEET;3.5;8;3;50;53;0;0;8;8;10;10'
> getwd()
[1] "/Users/rachelepesce/Desktop/QRM/R"
> datagpa <- read.csv(file="habits.csv",header=TRUE,colClasses=c("integer","integer","integer","integer","integer","integer","integer","integer","integer","integer","integer"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'an integer', got '2018/10/192:32:52PMEET;3.5;8;3;50;53;0;0;8;8;10;10'
> datagpa <- read.csv(file="habits.csv",header=TRUE,colClasses=c("numeric","numeric","numeric","numeric","numeric","numeric","numeric","numeric","numeric","numeric","numeric"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'a real', got '2018/10/192:32:52PMEET;3.5;8;3;50;53;0;0;8;8;10;10'
Does anyone know how to solve this?
I am appending 100 files in a folder that use "|" as the delimiter. Below is the code I used. I am getting an error which I am not able to debug:
file_list <- list.files()
dataset <- ldply(file_list, read.table, header=TRUE, sep="|")
ERROR - Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 284 did not have 12 elements
I would appreciate help from anyone who has experience with this.
I'm trying to read a text file into R using the below code:
d = read.table("test_data.txt")
It returned the following error message:
"Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 2 did not have 119 elements"
I tried this:
read.table("man_cohort9_check.txt", header=T, sep="\t")
but it gave this error:
"Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 43 did not have 116 elements"
I don't understand what's going wrong??
It's because your file has rows with different numbers of columns. To start investigating, you can run:
d = read.table("test_data.txt", fill=TRUE, header=TRUE, sep="\t")
The usual causes of this are unmatched quotes and/or lurking octothorpes ("#"). I would investigate by seeing which of these produces the most regular table:
table( count.fields("test_data.txt", quote="", comment.char="") )
table( count.fields("test_data.txt", quote="") )
table( count.fields("test_data.txt", comment.char="") )
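Once the field counts are known, they can also be used to locate the offending lines. This is a minimal sketch using base R's count.fields() on a made-up tab-separated file; the sample data and the names n_fields, expected and bad_lines are assumptions for illustration.

```r
# Hypothetical tab-separated file; line 3 has only two fields.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5", "6\t7\t8"), tmp)

# Number of fields on each line, with quote and comment handling off.
n_fields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")
print(table(n_fields))

# Treat the most common field count as the expected one.
expected <- as.integer(names(which.max(table(n_fields))))

# Line numbers that deviate, and their raw text for inspection.
bad_lines <- which(n_fields != expected)
print(bad_lines)
print(readLines(tmp)[bad_lines])
```

Looking at the raw text of those lines usually makes the cause obvious: a missing trailing field, a stray quote, or a "#" that was taken as the start of a comment.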