I have read loads of CSV files into R. For some reason, the file I am working on now, which has several variables, is read as if it had only one variable.
R loads it and adds it to the global environment; the number of rows is right, but there is only one column.
This has never happened before. I have been looking around for a solution but can't find one. Thanks!
I have tried the following code:
read.csv("file.csv",sep=",",header=TRUE)
read.csv("file.csv")
read.table("file.csv",sep=",")
image of excel file
OK, I think I found the problem. I had checked whether the commas were actually commas: I copied and pasted them, and they looked and behaved like commas. To make sure, I replaced all commas with semicolons (within Excel), then read the file again, and it worked. Thanks for the replies!
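For anyone hitting the same symptom, here is a minimal sketch of how a delimiter mismatch produces a single column. It assumes the file really is semicolon-separated once Excel has saved it; the sample text and variable names are made up for illustration and are not the original file.

```r
# Made-up sample standing in for file.csv; assumes ';' is the real delimiter.
sample_text <- "id;name;score\n1;ann;3.5\n2;bob;4.0"

# With the default sep = ",", each whole line is read as a single field.
wrong <- read.csv(text = sample_text)
ncol(wrong)

# Naming the delimiter explicitly recovers the separate columns.
right <- read.csv(text = sample_text, sep = ";")
ncol(right)
names(right)
```

read.csv2() is the same reader with sep = ";" and dec = "," as defaults, which may be the better fit if the numbers in the file use decimal commas.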
When you read CSV files other than the default groceries.csv, every transaction has an additional entry in it (a blank space), which messes up all of the calculations for analysis (and can even crash R if the CSV file is big enough). I've tried inserting NAs into all of the blank cells in my CSV file, but I cannot find a way to remove them within the read.transactions() command (removing duplicates still leaves a single NA). I haven't found a trustworthy fix in any of the other questions on Stack Overflow, nor anywhere else on the internet.
Example entry:
> inspect(trans[1:5])
items
1 {,
FACEBOOK.COM,
Google,
Google Web Search}
It is hard to say. I assume you read the data with read.transactions(). Does your CSV file have leading white spaces in some/all lines? You could try to use the cols parameter in read.transactions() to fix the problem.
An example with data and the code to replicate the problem would help.
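Until a reproducible example turns up, here is a rough base-R sketch for checking the raw file for the usual suspects. It assumes the blank item comes from empty fields (leading or trailing commas) or leading whitespace; the sample lines and the names raw_lines, suspect, and baskets are invented for illustration.

```r
# Made-up basket lines; the leading comma, trailing commas, and leading space
# are assumptions about what might produce the blank item.
raw_lines <- c(",FACEBOOK.COM,Google,Google Web Search",
               "Google,Bing,,",
               " Yahoo,Google")

# Flag lines that start with whitespace or with a separator.
suspect <- grepl("^[[:space:],]", raw_lines)

# Split each basket, trim surrounding whitespace, and drop empty items.
baskets <- lapply(strsplit(raw_lines, ",", fixed = TRUE), function(items) {
  items <- trimws(items)
  items[nzchar(items)]
})

# With arules loaded, a cleaned list like this can be coerced directly:
# trans <- as(baskets, "transactions")
```

With a real file, raw_lines would come from readLines() on the CSV path instead of the hard-coded vector.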
I am trying to read an ftp file from the internet ("ftp://ftp.cmegroup.com/pub/settle/stlint") into R, using the following command:
aaa<-read.table("ftp://ftp.cmegroup.com/pub/settle/stlint", sep="\t", skip=2, header=FALSE)
The result shows the 8th, 51st, 65th, 71st, 72nd, 73rd, 74th, etc. rows of the resulting dataset with later rows appended onto the end of them. Basically, instead of returning
{row8}
{row9}
etc
{row162}
{row163}
It returns (adding in the quotes around the \n)
{row8'\n'row9'\n'etc...etc...'\n'row162}
{row163}
If it seems like I'm picking arbitrary numbers, run the code above and take a look at the actual FTP file (as of midday Feb 18) and you'll see I'm not; it really is appending about 155 rows onto the end of the 8th row. So what I'm looking for is simply a way to read the file in without the random appending of rows. Thanks, and apologies in advance: I'm new to R and was not able to find a fix after a while of searching.
Hi, sorry, this is my first post here; my apologies if I made a mistake.
I'm fairly new to R and I was given an assignment where I load a CSV file into R. When I read.csv the whole file, I get a ton of blank spots where values should be. The only info printed out is the N/A in the cells, which is actually what I am trying to replace.
So I took a small sample of the file, only the first couple of rows, and the info came up correctly in my read.csv command. My question is: is the .csv too large for R to display the original data from my main .csv file?
Also, how would I go about replacing all the N/A and NA values in the file with blank cells or ""?
Sorry if I described my scenario poorly.
First, make sure that all of your data in the CSV file is in GENERAL format!
There should be a title for each of the columns too.
If you have an empty cell in your CSV file, then input a 0 into it.
And make sure that you CLEAR ALL the cells around the data, just in case there is anything funny in them.
Hope that helps. If not, you could send me your file at sgreenaway#vmware.com and I will check it out for you :)
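On the second part of the question (turning N/A and NA into blanks), a minimal sketch follows. It assumes both spellings mean "missing" in the file; the sample data and the names df, n_missing, and blanked are made up for illustration.

```r
# Made-up sample; "N/A" and "NA" both stand for missing values here.
sample_text <- "id,score,grade\n1,N/A,a\n2,7,NA\n3,NA,b"

# na.strings tells read.csv which spellings to treat as missing.
df <- read.csv(text = sample_text,
               na.strings = c("NA", "N/A"),
               stringsAsFactors = FALSE)
n_missing <- sum(is.na(df))

# Replacing NA with "" turns any affected numeric column into character.
blanked <- df
blanked[is.na(blanked)] <- ""

# Alternative that keeps column types: blank the NAs only when writing out.
# write.csv(df, "cleaned.csv", na = "", row.names = FALSE)
```

The trade-off is that a column holding "" can no longer be numeric, so writing with na = "" is usually the gentler option if the blanks are only needed in the output file.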
I am trying to load a table into R in which some of the column headings start with a number, e.g. 55353, 555xx, abvab, 77agg.
After loading the file, I found that all the headings that start with a number have an X prepended, i.e. they changed to X55353, X555xx, abvab, X77agg.
What can I do to solve this problem? Please note that not all column headings start with a number.
Many thanks
Probably your issue will be solved by adding check.names=FALSE to your read.table() call.
For more information see ?read.table
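A small sketch of the difference, using the headings from the question with one made-up data row (the variable names are only for illustration):

```r
# Headings from the question plus one invented row of values.
sample_text <- "55353,555xx,abvab,77agg\n1,2,3,4"

# Default behaviour: names are passed through make.names(), which prepends X.
default_names <- names(read.csv(text = sample_text))

# check.names = FALSE keeps the headings exactly as written in the file.
kept <- read.csv(text = sample_text, check.names = FALSE)
kept_names <- names(kept)

# Non-syntactic names need [[ ]] or backticks when accessed.
first_value <- kept[["55353"]]
second_value <- kept$`555xx`
```

The X is added because a name starting with a digit is not a syntactically valid R name, which is why the unmodified columns have to be quoted when used.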
I have just started using R, so this may be a very dumb question. I am trying to import the data using:
emdata=read.csv(file="http://lottery.merseyworld.com/cgi-bin/lottery?days=19&Machine=Z&Ballset=0&order=1&show=1&year=0&display=CSV",header=TRUE)
My problem is that it reads the CSV file into a single column instead of formatting it into however many columns of data there are. (By the way, I chose the lottery data simply because it is publicly available to download; I am using it as an exercise to understand what I can and can't do in R.) Would someone mind helping out, please, even though this is trivial?
Hm, that's kind of obnoxious for a page purporting to be in csv format. You can skip the first 5 lines, which will cause R to read (most of) the rest of the file correctly.
emdata=read.csv(file=...., header=TRUE, skip=5)
I got the number of lines to skip by looking at the source. You'll still have to remove the cruft in the middle and end, and then clean up the columns (they'll all be factors, or character vectors in R 4.0 and later, because of the embedded text).
It would be much easier to save the page to your hard disk, edit it to remove all the useless bits, then import it.
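To show what skip does without hitting the network, here is a sketch on a made-up page with five junk lines ahead of the data. The contents of raw_lines are invented and only mimic the shape of such a page.

```r
# Invented stand-in for the downloaded page: five non-CSV lines, then data.
raw_lines <- c("<html>",
               "<head><title>Results</title></head>",
               "<body>",
               "<pre>",
               "Some banner text",
               "No,Day,N1,N2",
               "1,Sat,5,12",
               "2,Wed,7,30")

# skip = 5 drops the first five lines, so the header row is read as line one.
emdata <- read.csv(text = raw_lines, header = TRUE, skip = 5)
dim(emdata)
names(emdata)
```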
... to answer your REAL question, yes, you can import data directly from the web. In general, wherever you would read a file, you can substitute a fully qualified URL -- R is smart enough to do the Right Thing[tm]. This specific URL just happens to be particularly messy.
You could read text from the given url, filter out the obnoxious lines and then read the result as CSV like so:
lines <- readLines(url("http://lottery.merseyworld.com/cgi-bin/lottery?days=19&Machine=Z&Ballset=0&order=1&show=1&year=0&display=CSV"))
read.csv(text=lines[grep("([^,]*,){5,}", lines)])
The above regular expression matches any lines containing at least five commas.
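The same filter can be tried offline on a few made-up lines to see which ones survive. The vector raw_lines below is invented for illustration; the pattern is the one used above.

```r
# Invented page content: markup, a header row, two data rows, and stray text.
raw_lines <- c("<html><body>",
               "No,Day,DD,MMM,YYYY,N1",
               "1,Sat,1,Jan,2000,5",
               "2,Wed,5,Jan,2000,17",
               "Some text, with one comma",
               "</body></html>")

# Indices of lines with at least five comma-terminated fields.
keep <- grep("([^,]*,){5,}", raw_lines)

# Only the surviving lines are handed to read.csv.
draws <- read.csv(text = raw_lines[keep])
dim(draws)
```

A line of ordinary prose with five or more commas would also pass this filter, so the threshold may need raising for pages with long text passages.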