I had trouble importing the data I need from .csv files into R.
So, to check, I created a simple .csv from Excel with 2 columns and 3 rows. It reads like this in Notepad:
what,now
1,4
2,5
3,6
When I try to import this data into R with
d <- read.csv("D:/Book1.csv")
it gives a warning message,
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'D:/Book1.csv'
and then when I view the data, it's some gibberish.
What do I do?
I was using a work PC, and the files were encrypted, which was why importing data into R was not working. I worked around it by copying the data into a text document.
Thanks everyone!
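A quick way to check whether a file is really plain text, before blaming read.csv(), is to look at its first bytes. The sketch below is only an illustration: looks_like_text() is a hypothetical helper, not part of R, and it assumes an ASCII file, so legitimate non-ASCII text would also be flagged.

```r
# Hypothetical helper: TRUE if the first `n` bytes are printable ASCII
# (plus tab, LF, CR). Encrypted or binary files normally fail this check.
looks_like_text <- function(path, n = 64L) {
  bytes <- as.integer(readBin(path, what = "raw", n = n))
  allowed <- c(9L, 10L, 13L, 32:126)
  length(bytes) > 0 && all(bytes %in% allowed)
}

# Demo on temporary files (made-up content, not the asker's data).
plain <- tempfile(fileext = ".csv")
writeLines(c("what,now", "1,4", "2,5", "3,6"), plain)

scrambled <- tempfile(fileext = ".csv")
writeBin(as.raw(c(0x00, 0x9f, 0x1a, 0xff, 0x03)), scrambled)

plain_ok <- looks_like_text(plain)
scrambled_ok <- looks_like_text(scrambled)
```

If the check fails for a file that opens fine in Excel, the file on disk is probably not the plain text that read.csv() expects.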
I'm trying to upload a GSS data set into R Markdown for creating a lecture presentation.
Each time I do, I get errors that I do not understand. Any help would be appreciated.
read.csv("directory/GSS2018.xls", headers = TRUE)
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
unused argument (headers = TRUE)
My .xls has headers, so I'm not sure why it is saying "untrue". Even still, I tried taking out the headers option and received this:
read.csv("~directory/GSS2018.xls")
line 1 appears to contain embedded nulls
line 2 appears to contain embedded nulls
line 3 appears to contain embedded nulls
line 4 appears to contain embedded nulls
line 5 appears to contain embedded nulls
Error in make.names(col.names, unique = TRUE) :
invalid multibyte string at '<1a>'
I can't quite work out what this error is telling me, nor how to fix it. I can import my data just fine using the "Import Dataset" button in the Environment pane of RStudio, but when I put that code into R Markdown, it throws all these errors.
Any help is appreciated!
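Two separate things appear to go wrong in this question. The argument is spelled header, not headers, and an .xls file is most likely a binary Excel workbook rather than delimited text, which would be consistent with the "embedded nulls" warnings. A minimal sketch follows; the file contents are made up, read_workbook() is a hypothetical wrapper, and the readxl package is an assumption (it has to be installed separately).

```r
# Self-contained demo: a small CSV written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("year,age", "2018,34", "2018,51"), csv_path)

# The argument is `header` (singular); `headers = TRUE` is rejected as unused.
gss_csv <- read.csv(csv_path, header = TRUE)

# For a genuine Excel workbook, use a reader that understands the format.
# Assumes the readxl package is installed; read_workbook() is hypothetical.
read_workbook <- function(path) {
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("Package 'readxl' is needed to read .xls/.xlsx files")
  }
  as.data.frame(readxl::read_excel(path))
}
```

When knitting, R Markdown by default resolves relative paths against the folder containing the .Rmd file, so a path that works in the console may still need adjusting.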
I am trying to read a csv file in R, but I am getting some errors.
This is what I have and also I have set the correct path
mydata <- read.csv("food_poisioning.csv")
But I am getting this error
Error in make.names(col.names, unique = TRUE) :
invalid multibyte string at '<ff><fe>Y'
In addition: Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 2 appears to contain embedded nulls
I believe I am getting this error because my csv file is actually not separated by commas; it has spaces instead. This is what it looks like:
I tried using sep=" ", but it didn't work.
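The bytes <ff><fe> at the start of the header are the byte order mark of a little-endian UTF-16 file, and the "embedded nulls" are the second byte of each ASCII character in that encoding, so the separator is probably not the main problem. Here is a sketch, assuming the file is tab-separated UTF-16LE (the kind of file Excel's "Unicode Text" export writes); the column names and values are invented.

```r
# Build a small UTF-16LE file with a BOM so the example is self-contained.
txt <- "Year\tCases\n2019\t12\n2020\t7\n"
payload <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
path <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xff, 0xfe)), payload), path)

# Tell R the encoding; for "UTF-16LE" R drops a leading BOM itself.
mydata <- read.csv(path, sep = "\t", fileEncoding = "UTF-16LE")
```

If the real file turns out to be comma-separated, dropping the sep argument should be enough; the fileEncoding part is what removes the multibyte error.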
If you're having difficulty using read.csv() or read.table() (or writing other import commands), try the "Import Dataset" button in the Environment pane in RStudio. It is especially useful when you are not sure how to specify the table format or when the table format is complex.
For your .csv file, use "From Text (readr)..."
A window will pop up and allow you to choose a file/URL to upload. You will see a preview of the data table after you select a file/URL. You can click on the column names to change the column class or even "skip" the column(s) you don't need. Use the Import Options to further manage your data.
Here is an example using CreditCard.csv from Vincent Arel-Bundock's Github projects:
You can also modify and/or copy and paste the code in Code Preview, or click Import to run the code when you are ready.
I am trying to read a data file with several delimited columns into R. Some columns have entries that contain special characters (such as an arrow). read.table comes back with this warning:
incomplete final line found by readTableHeader
and does not read the file. I tried the UTF-8 and UTF-16 encoding options, which didn't help either. Here is a small example file.
I am not able to reproduce the arrow in this question box, so I am attaching an image of the Notepad screen showing a small file (test1.txt).
Here is what I get when I try to open it.
test <- read.table("test1.TXT", header=T, sep=",", fileEncoding="UTF-8", stringsAsFactor=F)
Warning message: In read.table("test1.TXT", header = T, sep = ",",
fileEncoding = "UTF-8", : incomplete final line found by
readTableHeader on 'test1.TXT'
However, if I remove the second line (with the special character) and try to import the file, R imports it without problem.
test2.txt =
id, ti, comment
1001, 105AB, "All OK"
test <- read.table("test2.TXT", header=T, sep=",", fileEncoding="UTF-8", stringsAsFactor=F)
id ti comment
1 1001 105AB All OK
Although this is a small example, the file I am working with is very large. Is there a way I can import the file to R with those special characters in place?
Thank you.
test1.txt
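One thing worth ruling out, as an assumption rather than a diagnosis of this particular file: if the "arrow" is the control character 0x1A (SUB, which is drawn as a right arrow in the CP437 character set), Windows treats it as an end-of-file marker when a file is opened in text mode, so reading stops at that line. Reading the lines through a binary-mode connection avoids that. The sample data below is made up.

```r
# Write a small file containing the byte 0x1A inside a field.
path <- tempfile(fileext = ".txt")
writeBin(charToRaw("id,ti,comment\n1001,105AB,All OK\n1002,106AB,See \x1a note\n"), path)

# Open in binary mode ("rb") so no byte is interpreted as end-of-file,
# read the raw lines, then parse them from memory.
con <- file(path, open = "rb")
lines <- readLines(con)
close(con)

test <- read.table(text = lines, header = TRUE, sep = ",",
                   quote = "", comment.char = "", stringsAsFactors = FALSE)
```

For a very large file this holds all lines in memory at once, so it is a diagnostic step first and a reading strategy second.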
I am using Rstudio with R 3.3.1 on Windows 7 and I have installed CITAN package. I am trying to import bibliography entries from a CSV file that I exported from Scopus (as it is, untouched), choosing to export all available information.
This is the error that I get:
example <- Scopus_ReadCSV("scopus.csv")
Error in Scopus_ReadCSV("scopus.csv") : Column not found: `Source'.
In addition: Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
invalid input found on input connection 'scopus.csv'
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'scopus.csv'
Column `Source' is there when I open the file, so I do not know why it says 'not found'.
Eventually I came to the following conclusions:
The CSV file exported from Scopus was encoded as UTF-8 with a BOM, which R does not seem to recognize when using Scopus_ReadCSV("file.csv") or read.table("file.csv", header = TRUE, sep = ",", fileEncoding = "UTF-8").
Even though the Scopus file declares an encoding, it contains some "strange" non-English characters that the read functions in R cannot handle. (I mainly found this problem in names with special characters.)
Solutions for those issues:
Open the CSV file in a text editor such as Notepad++ and save it with UTF-8 encoding (without the BOM) so that R can read it as UTF-8.
When you run the read function in R, you will notice that it stops reading partway through (e.g. at the 40th of 200 records). Note exactly where it stopped; that tells you where the special character is. Open the CSV in the text editor and delete or change that character so the issue does not recur in R.
Another solution that worked for me:
Open the file in Google Sheets, then download it from there again as a *.csv-file. R opens it correctly afterwards.
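In R 3.0.0 and later there is also a way to handle the byte order mark without re-saving the file: passing fileEncoding = "UTF-8-BOM" tells read.csv() and read.table() to skip it. Below is a small sketch with an invented two-column file; whether Scopus_ReadCSV() accepts an encoding argument is not something this example assumes.

```r
# Create a UTF-8 file that starts with a BOM (bytes EF BB BF).
path <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xef, 0xbb, 0xbf))
writeBin(c(bom, charToRaw("Authors,Source\nSmith J.,Scopus\n")), path)

# With "UTF-8-BOM" the mark is dropped, so the first column name is clean.
scopus <- read.csv(path, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
```

Without that setting the BOM typically ends up glued to the first column name, which would explain a "Column not found" error for a column that is visibly present in the file.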
I am trying to import CSV files to graph for a project. I'm using R 2.15.2 on Mac OS X.
The first way tried
The script I'm trying to run to import the CSV file is this:
group4 <- read.csv("XXXX.csv", header=T)
But I keep getting this error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
object 'XXXXXX.csv' not found
The second way tried
I tried changing my working directory but got another error saying I can't change my working directory. So I went into the Preferences tab and set the working directory to the folder that has my CSV files. But I still get the same error (as in the first attempt).
The third way tried
Then I tried this script:
group4 <- read.table(file.choose(), sep="\t", header=T)
And I get this error:
Warning message:
In read.table(file.choose(), sep = "\t", header = T) :
incomplete final line found by readTableHeader on '/Users/xxxxxx/Documents/Programming/R/xxxxxx/xxxxxx.csv'
I've searched on the R site and all over the Internet, and nothing has got me to the point where I can import this simple CSV file into the R console.
The file is not in your working directory; change the working directory, or use an absolute path.
Then you are pointing to a non-existent directory, or you do not have write privileges there.
The last line of your file is malformed.
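To see where R is actually looking, it helps to print the working directory and test the path before reading. Here is a sketch that uses a temporary directory as a stand-in for the real folder (the file name group4.csv and its contents are hypothetical):

```r
# Stand-in for the folder that holds the CSV files.
data_dir <- tempdir()
csv_path <- file.path(data_dir, "group4.csv")
writeLines(c("x,y", "1,2", "3,4"), csv_path)

getwd()                         # the directory relative paths are resolved against
found <- file.exists(csv_path)  # FALSE here means the path is wrong

# The file name must be a quoted string; a full path avoids depending on getwd().
group4 <- read.csv(csv_path, header = TRUE)
```

An "object ... not found" message, as opposed to "cannot open file", usually means the file name was typed without quotes, so R looked for a variable of that name.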
As to the incomplete final line (i.e. the last line in the file is not terminated by a newline)...
Usually, a data file should end with a newline, so that its last line is complete. Perhaps check whether that is the case for your file.
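A minimal sketch of that check, assuming the file is small enough to read into memory; ends_with_newline() is a hypothetical helper and the file content is invented:

```r
# Hypothetical helper: does the file's last byte equal LF (0x0A)?
ends_with_newline <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  length(bytes) > 0 && bytes[length(bytes)] == as.raw(0x0a)
}

# A file whose last line is not terminated.
path <- tempfile(fileext = ".csv")
writeBin(charToRaw("what,now\n1,4\n2,5\n3,6"), path)
before <- ends_with_newline(path)

# Append the missing line terminator, then read as usual.
if (!before) cat("\n", file = path, append = TRUE)
after <- ends_with_newline(path)
d <- read.csv(path)
```

Opening the file in a text editor, pressing Enter after the last row and saving has the same effect.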
As an alternative, I would suggest trying readLines(). This function reads each line of your data file into a character vector. If you know the format of your input, i.e. the number of columns in the table, you could do this:
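number.of.columns <- 5  # the number of columns in your data file
delimiter <- "\t"       # this is what separates the values in your data file
# read every line (n = -1L means "read to the end of the file")
lines <- readLines("path/to/your/file.csv", n = -1L)
# split each line on the delimiter and flatten the pieces into one vector
values <- unlist(strsplit(lines, delimiter, fixed = TRUE))
# fill the values row by row into a character matrix
data <- matrix(values, byrow = TRUE, ncol = number.of.columns)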
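The result of this approach is a character matrix with the header row included. One way to turn it into a typed data frame is sketched here on an invented two-column, tab-separated file that lacks a final newline:

```r
# Invented sample file: tab-separated, last line not terminated.
path <- tempfile(fileext = ".csv")
writeBin(charToRaw("what\tnow\n1\t4\n2\t5\n3\t6"), path)

# readLines() still returns the last line; it only warns about it.
lines <- suppressWarnings(readLines(path))
values <- unlist(strsplit(lines, "\t", fixed = TRUE))
m <- matrix(values, byrow = TRUE, ncol = 2)

# First row holds the column names; the rest is data.
df <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
names(df) <- m[1, ]

# Convert each character column to its natural type (integer, numeric, ...).
df[] <- lapply(df, type.convert, as.is = TRUE)
```

This only works when every line has the same number of fields; a field that itself contains the delimiter would shift the matrix.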