R not reading ampersand from .txt file with read.table() [duplicate]

I am trying to load a table into R where some of the column headings start with a number, e.g. 55353, 555xx, abvab, 77agg.
I found that after loading the file, every heading that starts with a number has an X prepended to it, i.e. they become X55353, X555xx, abvab, X77agg.
What can I do to solve this problem? Please note that not all of the column headings start with a number.
Many thanks

Your issue will probably be solved by adding check.names=FALSE to your read.table() call.
For more information see ?read.table
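A minimal sketch of the difference (the headers mirror the ones in the question; the data are made up, and the text= argument is used only to keep the example self-contained):

```r
# A small CSV whose headers start with digits (illustrative data)
txt <- "55353,555xx,abvab,77agg
1,2,3,4"

# Default behaviour: headers are run through make.names(), which
# prepends an "X" to any name starting with a digit
df1 <- read.csv(text = txt)
names(df1)  # "X55353" "X555xx" "abvab"  "X77agg"

# check.names = FALSE keeps the headers verbatim
df2 <- read.csv(text = txt, check.names = FALSE)
names(df2)  # "55353" "555xx" "abvab" "77agg"
```

Note that columns whose names are not syntactically valid must then be accessed with backticks, e.g. df2$`55353`.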

Related

How to solve duplicated core names in an RWL file in R

When I read some tree-ring width data from rwl files (with the function read.rwl in the package dplR), I get errors saying the files cannot be read because of duplicated core names. So I intended to read the data with read.delim instead, convert it to one-column format, and rename the columns. But now I don't know how to do the conversion. Please give me some suggestions to resolve this problem. Thanks in advance.

Reading a csv file: different variables not detected

I have read loads of csv files into R. For some reason this file I am working on, with several variables, is read as if it had only one variable.
R loads it and adds it to the global environment; the number of rows is right, but there is only one column.
This has never happened before. I have been looking around for a solution but can't find one. Thanks!
I have tried the following code:
read.csv("file.csv",sep=",",header=TRUE)
read.csv("file.csv")
read.table("file.csv",sep=",")
[image of the Excel file]
OK, I think I found out what the problem was. I had checked whether the commas were actually commas: I copied and pasted them, and they looked and behaved like commas. But to make sure, I decided to replace (within Excel) all commas with semicolons. Then I read the file again and it worked. Thanks for the replies!
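If a file really is semicolon-delimited (common in locales that use the comma as the decimal mark), base R can read it directly; a small sketch with made-up data:

```r
# Semicolon-separated text (illustrative)
txt <- "a;b;c
1;2;3"

# Either pass the separator explicitly ...
df1 <- read.csv(text = txt, sep = ";")

# ... or use read.csv2(), whose defaults are sep = ";" and dec = ","
df2 <- read.csv2(text = txt)

ncol(df1)  # 3 columns, not 1
```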

Deal with escaped commas in CSV file?

I'm reading a file into R using fread, like so:
test.set = fread("file.csv", header=FALSE, fill=TRUE, blank.lines.skip=TRUE)
where my csv consists of 6 columns. An example of a row in this file is:
"2014-07-03 11:25:56","61073a09d113d3d3a2af6474c92e7d1e2f7e2855","Securenet Systems Radio Playlist Update","Your Love","Fred Hammond & Radical for Christ","50fcfb08424fe1e2c653a87a64ee92d7"
However, certain rows are formatted in a particular way when there is a comma inside one of the cells. For instance,
"2014-07-03 11:25:59","37780f2e40f3af8752e0d66d50c9363279c55be6","Spotify","\"Hello\", He Lied","Red Box","b226ff30a0b83006e5e06582fbb0afd3"
produces an error of the sort
Expecting 6 cols, but line 5395818 contains text after processing all
cols. Try again with fill=TRUE. Another reason could be that fread's
logic in distinguishing one or more fields having embedded sep=','
and/or (unescaped) '\n' characters within unbalanced unescaped quotes
has failed. If quote='' doesn't help, please file an issue to figure
out if the logic could be improved.
As you can see, the value causing the error is "\"Hello\", He Lied", which I want fread to read as "Hello, He Lied". I'm not sure how to account for this, though; I've tried using fill=TRUE and quote="" as suggested, but the error still comes up. It's probably just a matter of finding the right parameter(s) for fread; does anyone know what those might be?
In read.table() from base R this issue is solvable; see Import data into R with an unknown number of columns?
In fread from data.table this is not currently possible.
An issue has been logged for it: https://github.com/Rdatatable/data.table/issues/2669
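For the base-R route, the relevant argument is allowEscapes, which tells the parser to process C-style escapes such as \" instead of reading the backslash literally. A hedged sketch with a shortened version of the problem row (whether this matches the real file exactly is an assumption):

```r
# One record with a backslash-escaped quote inside a quoted field
txt <- '"2014-07-03 11:25:59","Spotify","\\"Hello\\", He Lied","Red Box"'

# allowEscapes = TRUE processes the \" escape instead of treating the
# quote as the end of the field
df <- read.csv(text = txt, header = FALSE, allowEscapes = TRUE)
ncol(df)  # should be 4 if the escapes are processed
```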

"arules" library's "read.transactions()" reads in CSV files with an additional, blank column for every transaction

When you read CSV files other than the default groceries.csv, every transaction gets an additional entry (a blank item), which throws off all of the calculations for the analysis (and can even crash R if your CSV file is big enough). I've tried inserting NAs into all of the blank cells in my CSV file, but I cannot find a way to remove all of them within the read.transactions() call (removing duplicates still leaves a single NA). I haven't found a reliable fix in any of the other questions on Stack Overflow, nor anywhere else on the internet.
Example entry:
> inspect(trans[1:5])
items
1 {,
FACEBOOK.COM,
Google,
Google Web Search}
It is hard to say without seeing the data. I assume you read the data with read.transactions(). Does your CSV file have leading white space or leading commas in some or all lines? You could try the cols parameter of read.transactions() to fix the problem.
An example with data and the code to replicate the problem would help.
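If the blank item comes from a leading comma in each basket line (which the inspect() output above suggests), one workaround is to clean the raw lines before handing them to read.transactions(). A sketch under that assumption (the data and file name are made up):

```r
library(arules)

# Illustrative basket lines, each starting with an empty item
lines <- c(", FACEBOOK.COM, Google, Google Web Search",
           ", Google, Bing")

# Drop the leading comma (and surrounding white space) from each line
lines <- sub("^\\s*,\\s*", "", lines)

# Write the cleaned lines out and read them as basket-format transactions;
# adjust sep to match how items are actually delimited in your file
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)
trans <- read.transactions(tmp, format = "basket", sep = ", ")
```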

characters converted to dates when using write.csv

my data frame has a column A with strings in character form
> df$A
[1] "2-60" "2-61" "2-62" "2-63" ...
I saved the table using write.csv, but when I open it with Excel, column A appears formatted as dates:
Feb-60
Feb-61
Feb-62
Feb-63
etc
Does anyone know what I can do to avoid this?
I tweaked the arguments of write.csv but nothing worked, and I can't seem to find an example on Stack Overflow that solves this problem.
As said in the comments, this is Excel behaviour, not R's, and it can't be deactivated:
Microsoft Excel is preprogrammed to make it easier to enter dates. For
example, 12/2 changes to 2-Dec. This is very frustrating when you
enter something that you don't want changed to a date. Unfortunately
there is no way to turn this off. But there are ways to get around it.
Microsoft Office Article
The first way around it suggested by the article is not helpful, because it relies on changing the cell formatting, but that is too late once you open the .csv file in Excel (the value has already been converted to an integer representing the date).
There is, however, a useful tip:
If you only have a few numbers to enter, you can stop Excel from
changing them into dates by entering:
An apostrophe (‘) before you enter a number, such as ’11-53 or ‘1/47. The apostrophe isn’t displayed in the cell after you press Enter.
So you can make the data display as original by using
vec <- c("2-60", "2-61", "2-62", "2-63")
vec <- paste0("'", vec)
Just remember that the values will still have the apostrophe if you read them back into R, so you might have to use
vec <- sub("'", "", vec)
This might not be ideal but at least it works.
One alternative is enclosing the text in ="" as an Excel formula, but that has the same end result and uses more characters.
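A sketch of that formula alternative, with the matching undo step for when the file comes back into R:

```r
vec <- c("2-60", "2-61", "2-62", "2-63")

# Wrap each value in an Excel formula, ="2-60", so Excel displays it as text
vec_formula <- paste0('="', vec, '"')

# Strip the wrapper again after re-importing
vec_back <- gsub('^="|"$', "", vec_formula)
identical(vec_back, vec)  # TRUE
```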
Another solution, a bit tedious: use Import Text File in Excel, click through the dialog boxes, and in Step 3 of 3 of the Text Import Wizard you will have the option of setting the column data format; use "Text" for the column that has "2-60", "2-61", "2-62", "2-63". If you use General (the default), Excel tries to be smart and converts the values for you.
I solved the problem by saving the file in .xlsx format with write.xlsx() from the package xlsx (https://www.rdocumentation.org/packages/xlsx/versions/0.6.5).
