issues of reading csv files using read.table [duplicate]

This question already has answers here:
'Incomplete final line' warning when trying to read a .csv file into R
(17 answers)
Closed 9 years ago.
I am trying to import CSV files to graph for a project. I'm using R 2.15.2 on Mac OS X.
The first way tried
The script I'm trying to run to import the CSV file is this:
group4 <- read.csv("XXXX.csv", header=T)
But I keep getting this error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
object 'XXXXXX.csv' not found
The second way tried
I tried changing my working directory but got another error saying I can't change my working directory. So I went into the Preferences tab and changed the working directory to the folder that has my CSV files. But I still get the same error (as with the first way).
The third way tried
Then I tried this script:
group4 <- read.table(file.choose(), sep="\t", header=T)
And I get this error:
Warning message:
In read.table(file.choose(), sep = "\t", header = T) :
incomplete final line found by readTableHeader on '/Users/xxxxxx/Documents/Programming/R/xxxxxx/xxxxxx.csv'
I've searched on the R site and all over the Internet, and nothing has got me to the point where I can import this simple CSV file into the R console.

The file is not in your working directory. Change the working directory, or use an absolute path.
Then you are pointing to a non-existent directory, or you do not have write privileges there.
The last line of your file is malformed.
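
The first two points can be checked from the console before calling read.csv. The following is a minimal sketch, assuming a throwaway file named demo.csv in a temporary folder (both are stand-ins, not the asker's real paths). It may also be worth checking that the file name is quoted: "object ... not found" is the wording R uses when a name is evaluated as a variable.

```r
# Create a small demo file in a temporary folder (a stand-in for the real CSV).
demo_dir <- tempdir()
demo_path <- file.path(demo_dir, "demo.csv")
writeLines(c("what,now", "1,4", "2,5", "3,6"), demo_path)

# Where does R resolve relative paths, and does the file exist?
print(getwd())
print(file.exists(demo_path))

# Option 1: read the file through its absolute path.
group4 <- read.csv(demo_path, header = TRUE)

# Option 2: change the working directory, then use the bare file name.
# setwd() returns the previous directory, so it can be restored afterwards.
old_wd <- setwd(demo_dir)
group4_rel <- read.csv("demo.csv", header = TRUE)
setwd(old_wd)
```

If file.exists() returns FALSE for the path you pass, read.csv cannot succeed either, so that is the first thing to fix.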

As to the missing end-of-line marker (i.e. the last line in the file is incomplete)...
Usually, a data file should end with a newline, so that an empty line follows the last row of data when you open it in a text editor. Check whether that is the case for your file.
As an alternative, I would suggest trying readLines(). This function reads each line of your data file into a character vector. If you know the format of your input, i.e. the number of columns in the table, you could do this...
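number.of.columns <- 5  # the number of columns in your data file
delimiter <- "\t"       # what separates the values (use "," for a comma-separated file)
# warn = FALSE suppresses the "incomplete final line" warning; the line is still read
lines <- readLines("path/to/your/file.csv", n = -1L, warn = FALSE)
# strsplit() is vectorized over the lines, so no lapply() is needed
values <- unlist(strsplit(lines, delimiter, fixed = TRUE))
# one matrix row per input line; a header line, if present, becomes the first row
data <- matrix(values, byrow = TRUE, ncol = number.of.columns)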
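
Another option is to repair the file itself. The sketch below is only an illustration: ends_with_newline is a hypothetical helper (not part of base R), and the demo file is written to a temporary path. It checks whether the last byte of the file is a newline and appends one if it is missing, after which read.csv can be used as usual.

```r
# Hypothetical helper: TRUE if the last byte of the file is a newline (0x0a).
# It reads the whole file into memory, so it is meant for small files.
ends_with_newline <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  length(bytes) > 0 && bytes[length(bytes)] == as.raw(0x0a)
}

# Demo file whose final line is NOT terminated by a newline.
demo_path <- tempfile(fileext = ".csv")
cat("what,now\n1,4\n2,5\n3,6", file = demo_path)

before <- ends_with_newline(demo_path)

# Append the missing newline, then read the file normally.
if (!before) {
  cat("\n", file = demo_path, append = TRUE)
}
d <- read.csv(demo_path)
print(d)
```

Note that the message is a warning, not an error, so it is worth inspecting the returned data frame before deciding that the import failed.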

Related

Importing data from csv File to R

I had trouble importing the data I need from .csv files into R.
So to check, I created a simple .csv from Excel with 2 columns and 3 rows. It reads like this in Notepad:
what,now
1,4
2,5
3,6
When I try to import this data into R
d <- read.csv("D:/Book1.csv")
it gives a warning message,
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'D:/Book1.csv'
and then when I view the data, it's some gibberish.
What do I do?

I was using a work PC, and the files were encrypted - which was the reason why importing data into R was not working. I bypassed it by copying data into a text document.
Thanks everyone!

How to import YRBS ASCII .dat file into R

I'm trying to import the YRBS ASCII .dat file found here to analyze in R, but I'm having trouble importing the file. I followed the recommendations here and here but none seem to work. More specifically, it's still showing up as being one column/variable in R with 14,765 observations.
I've tried using the readLines(), read.table, and read.csv functions but none seem to be separating the columns.
Here is the specific code I tried:
readLines("D:/Projects/XXH2017_YRBS_Data.dat", n=5)
read.csv("D:/Projects/XXH2017_YRBS_Data.dat", header = FALSE)
read.table("D:/Projects/XXH2017_YRBS_Data.dat", header = FALSE)
readLines and read.csv only provided one column and I got an error message from using read.table that stated that line 1 did not have 23 elements (which I'm assuming is just referring to the missing values?).
The data also starts from line 1 so I cannot use skip = 1 like some have suggested online.
How do I import this file into R so that I can separate the columns?

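The file is bulky, so I did not download it.
First, get the Access version of the file, then try the following code.
Compare the result with the Access data.
# in readr 2.0 and later, read_table() replaces the deprecated read_table2()
data <- readr::read_table2("XXH2017_YRBS_Data.dat", col_names = FALSE, na = ".")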
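
If the .dat file turns out to be fixed-width (an assumption, since the question does not show its layout), base R's read.fwf can split each line by character positions. The sketch below uses a made-up three-line file; the widths and column names are hypothetical and would have to come from the documentation that accompanies the real data.

```r
# Made-up fixed-width data: 2 characters for id, 2 for age, 3 for score.
demo_path <- tempfile(fileext = ".dat")
writeLines(c("0117125", "0218130", "0316118"), demo_path)

# widths and col.names are hypothetical; take the real ones from the codebook.
demo <- read.fwf(demo_path,
                 widths = c(2, 2, 3),
                 col.names = c("id", "age", "score"))
print(demo)
```

A file with no delimiter between values would also explain why read.csv and readLines return a single column.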

Read csv in r with special characters

I am trying to read a data file into R with several delimited columns. Some columns have entries which are special characters (such as an arrow). read.table comes back with this warning:
incomplete final line found by readTableHeader
and does not read the file. I tried the UTF-8 and UTF-16 encoding options, which didn't help either. Here is a small example file.
I am not able to reproduce the arrow in this question box, hence I am attaching the image of the notepad screen of a small file (test1.txt).
Here is what I get when I try to open it.
test <- read.table("test1.TXT", header=T, sep=",", fileEncoding="UTF-8", stringsAsFactor=F)
Warning message: In read.table("test1.TXT", header = T, sep = ",",
fileEncoding = "UTF-8", : incomplete final line found by
readTableHeader on 'test1.TXT'
However, if I remove the second line (with the special character) and try to import the file, R imports it without problem.
test2.txt =
id, ti, comment
1001, 105AB, "All OK"
test <- read.table("test2.TXT", header=T, sep=",", fileEncoding="UTF-8", stringsAsFactor=F)
id ti comment
1 1001 105AB All OK
Although this is a small example, the file I am working with is very large. Is there a way I can import the file to R with those special characters in place?
Thank you.
[Image: Notepad screenshot of test1.txt]

Loading csv into R with `sep=,` as the first line

The program I am exporting my data from (PowerBI) saves the data as a .csv file, but the first line of the file is sep=, and then the second line of the file has the header (column names).
Sample fake .csv file:
sep=,
Initiative,Actual to Estimate (revised),Hours Logged,Revised Estimate,InitiativeType,Client
FakeInitiative1 ,35 %,320.08,911,Platform,FakeClient1
FakeInitiative2,40 %,161.50,400,Platform,FakeClient2
I'm using this command to read the file:
initData <- read.csv("initData.csv",
                     row.names=NULL,
                     header=T,
                     stringsAsFactors = F)
but I keep getting an error that there are the wrong number of columns (because it thinks the first line tells it the number of columns).
If I do header=F instead then it loads, but then when I do names(initData) <- initData[2,] then the names have spaces and illegal characters and it breaks the rest of my program. Obnoxious.
Does anyone know how to tell R to ignore that first line? I can go into the .csv file in a text editor and just delete the first line manually before I load it each time (if I do that, everything works fine) but I have to export a bunch of files and this is a bit stupid and tedious.
Any help would be much appreciated.

There are many ways to do that. Here's one:
all_content <- readLines("initData.csv")
skip_first_line <- all_content[-1]
initData <- read.csv(textConnection(skip_first_line),
                     row.names = NULL,
                     header = TRUE,
                     stringsAsFactors = FALSE)
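
A shorter variant of the same idea is the skip argument, which read.csv passes on to read.table. This is a minimal sketch that writes the sample data from the question to a temporary file (the temporary path is a stand-in for the real export) and skips the sep=, line, so the second line is used as the header.

```r
# Recreate the sample export from the question in a temporary file.
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "sep=,",
  "Initiative,Actual to Estimate (revised),Hours Logged,Revised Estimate,InitiativeType,Client",
  "FakeInitiative1 ,35 %,320.08,911,Platform,FakeClient1",
  "FakeInitiative2,40 %,161.50,400,Platform,FakeClient2"
), demo_path)

# skip = 1 drops the "sep=," line before the header is parsed.
initData <- read.csv(demo_path,
                     skip = 1,
                     header = TRUE,
                     stringsAsFactors = FALSE)

# Because the header is parsed by read.csv, the column names are made
# syntactically valid (spaces and brackets become dots).
print(names(initData))
```

This avoids assigning names from a data row by hand, which is where the spaces and illegal characters came from.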

Your file could be in UTF-16 encoding. See hrbrmstr's answer on how to read a UTF-16 file.
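
As a rough illustration of that suggestion (this is not hrbrmstr's code), the sketch below writes a small UTF-16LE file and reads it back with the fileEncoding argument. The encoding name "UTF-16LE" is an assumption for this demo; the right name for a real export depends on how the file was written.

```r
# Build a tiny UTF-16LE file (no byte order mark) for the demonstration.
demo_path <- tempfile(fileext = ".csv")
txt <- "id,comment\n1001,All OK\n"
raw_utf16 <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(raw_utf16, demo_path)

# fileEncoding tells R to re-encode the file while reading it.
d <- read.csv(demo_path,
              fileEncoding = "UTF-16LE",
              stringsAsFactors = FALSE)
print(d)
```

Reading a UTF-16 file without declaring its encoding can produce embedded nul and incomplete-line warnings, which is why it is worth ruling this out.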

Using read.csv when a data entry is a space (not blank!)

I am having a problem with using read.csv in R. I am trying to import a file that has been saved as a .csv file in Excel. Missing values are blank, but I have a single entry in one column which looks blank, but is in fact a space. Using the standard command that I have been using for similar files produces this error:
raw.data <- read.csv("DATA_FILE.csv", header=TRUE, na.strings="", encoding="latin1")
Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, na.strings = character(0L)) :
invalid multibyte string at ' floo'
I have tried a few variations, adding arguments to the read.csv() command such as na.strings=c(""," ") and strip.white=TRUE, but these result in the exact same error.
It is a similar error to what you get when you use the wrong encoding option, but I am pretty sure this shouldn't be a problem here. I have of course tried manually removing the space (in Excel), and this works, but as I'm trying to write generic code for a Shiny tool, this is not really optimal.