Strange argument parsing in write.csv - r

Consider the following two commands:
> write.csv(irfilt,'foo.bar',row.names=FALSE)
#works fine but:
> write.csv(irfilt,'foo.bar',row.n=FALSE)
Error in write.table(irfilt, "foo.bar", row.n = FALSE, col.names = NA, :
'col.names = NA' makes no sense when 'row.names = FALSE'
I would have expected row.n to auto-expand to row.names but apparently that's not happening. There isn't any other allowed argument to write.table which could be confused with row.names. Does anyone know what is causing this misinterpretation? I thought it might be related to the fact that write.csv has no named arguments, but it seems odd that I wouldn't just get an error message about an unknown argument, rather than a misinterpreted arg.

You don't get any partial argument matching inside write.csv because its only formal argument is the ellipsis (...). write.csv looks for row.names in your call by its full name, so its attempt to manipulate the call fails here:
rn <- eval.parent(Call$row.names)
Call$col.names <- if (is.logical(rn) && !rn) TRUE else NA
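Your call contains row.n rather than row.names, so Call$row.names is NULL and col.names is set to NA. row.n is then partially matched to row.names in the call to write.table, but the write.table call generated by write.csv is: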
write.table(irfilt, "foo.bar", row.n = FALSE, col.names = NA,
sep = ",", dec = ".", qmethod = "double")
Which is why you're getting the error about col.names = NA while row.names = FALSE.
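The lookup failure can be reproduced without writing any file. The following is a minimal sketch, not the actual write.csv source: peek_row_names is a hypothetical stand-in that, like write.csv, has only ... as a formal argument and inspects the captured call.

```r
# Hypothetical stand-in for the start of write.csv: the only formal is `...`,
# so R has no named formal against which to expand an abbreviated argument.
peek_row_names <- function(...) {
  Call <- match.call(expand.dots = TRUE)
  Call$row.names  # searches the captured call for an element named row.names
}

# The arguments are never evaluated, so irfilt does not need to exist.
abbreviated <- peek_row_names(irfilt, "foo.bar", row.n = FALSE)
full_name   <- peek_row_names(irfilt, "foo.bar", row.names = FALSE)

print(is.null(abbreviated))  # TRUE: nothing named row.names in the call
print(full_name)             # FALSE: the value supplied by the caller
```

Spelling out row.names in full, or calling write.table directly, avoids the problem.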

Related

Remove extra row in printing to file

I'm attempting to print to file the output of a str_split operation as follows:
s <- t(unlist(str_split("foo_bar_0.5", "_"), use.names = FALSE))
write.csv(s, "test.csv", quote = FALSE, row.names = FALSE, col.names = FALSE)
With the row.names = FALSE argument, I was able to remove the row names. However, this code still writes an extra line with the column names to the file as follows:
V1,V2,V3
foo,bar,0.5
With the following warning:
Warning message:
In write.csv(s, "test.csv", quote = FALSE, :
attempt to set 'col.names' ignored
I want only the second line. Any ideas what I am doing wrong?
Use write.table instead of write.csv:
write.table(s, "test.csv",sep=',', quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table takes a sep argument, which sets the delimiter (a comma in this case), and it also accepts col.names. Setting col.names = FALSE drops the header line.
Also, the documentation (?write.csv) says the following about the ellipsis (...):
... arguments to write.table: append, col.names, sep, dec and qmethod
cannot be altered.
A more detailed explanation is also given in the documentation, which mentions the warning you are getting:
write.csv and write.csv2 provide convenience wrappers for writing CSV
files. They set sep and dec (see below), qmethod = "double", and
col.names to NA if row.names = TRUE (the default) and to TRUE
otherwise.
write.csv uses "." for the decimal point and a comma for the
separator.
write.csv2 uses a comma for the decimal point and a semicolon for the
separator, the Excel convention for CSV files in some Western European
locales.
These wrappers are deliberately inflexible: they are designed to
ensure that the correct conventions are used to write a valid file.
Attempts to change append, col.names, sep, dec or qmethod are ignored,
with a warning.
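A self-contained sketch of the write.table approach follows. It assumes base R's strsplit in place of str_split so that no package is needed, and it writes to a temporary file rather than test.csv.

```r
# Build the same one-row matrix as in the question, using base R only.
s <- t(unlist(strsplit("foo_bar_0.5", "_", fixed = TRUE)))

# Write to a temporary file so nothing in the working directory is touched.
out <- tempfile(fileext = ".csv")
write.table(s, out, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Inspect what was written: a single data line with no header.
written <- readLines(out)
print(written)
```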

Warning message in R when using colClasses when reading csv files

I am using lapply to read a list of files. The files have multiple rows and columns, and I am interested in the first row in the first column. The code I am using is:
lapply(file_list, read.csv,sep=',', header = F, col.names=F, nrow=1, colClasses = c('character', 'NULL', 'NULL'))
The first row has three columns but I am only reading the first one. From other posts on stackoverflow I found that the way to do this would be to use colClasses = c('character', 'NULL', 'NULL'). While this approach is working, I would like to know the underlying issue that is causing the following error message to be generated and hopefully prevent it from popping up:
"In read.table(file = file, header = header, sep = sep, quote = quote, :
cols = 1 != length(data) = 3"
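The warning reports a column-count mismatch: read.table set up one column (cols = 1) but the scan returned three (length(data) = 3), of which you keep only one. Note that your NULL is in quotation marks: "NULL" is the colClasses value documented for skipping a column, whereas an unquoted NULL simply vanishes inside c(), so c('character', NULL, NULL) is the same as 'character'. The likely trigger is col.names = F, which supplies a single column name while colClasses describes three columns.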
An example:
write.csv(data.frame(fi=letters[1:3],
fy=rnorm(3,500,1),
fo=rnorm(3,50,2))
,file="a.csv",row.names = F)
write.csv(data.frame(fib=letters[2:4],
fyb=rnorm(3,5,1),
fob=rnorm(3,50,2))
,file="b.csv",row.names = F)
file_list=list("a.csv","b.csv")
lapply(file_list, read.csv,sep=',', header = F, col.names=F, nrow=1, colClasses = c('character', 'NULL', 'NULL'))
Which results in:
[[1]]
FALSE.
1 fi
[[2]]
FALSE.
1 fib
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
cols = 1 != length(data) = 3
Which is the same as if you used:
lapply(file_list, read.csv,sep=',', header = F, col.names=F,
nrow=1, colClasses = c('character', 'asdasd', 'asdasd'))
But the warning goes away (and you get the rest of the row as a result) if you do:
lapply(file_list, read.csv,sep=',', header = F, col.names=F,
nrow=1, colClasses = c( 'character',NULL, NULL))
You can see where errors and warnings come from in source code for a function by entering, for example, read.table directly without anything following it, then searching for your particular warning within it.
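If the goal is only to read the first cell without the warning, one option is to leave out col.names = F. This is a sketch under the assumption that the single-element col.names is what sets the expected column count to one; the sample file is recreated in a temporary location with fixed numbers instead of rnorm().

```r
# Recreate a small three-column file like a.csv from the example above.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(fi = letters[1:3],
                     fy = c(500.1, 499.8, 500.3),
                     fo = c(50.2, 48.9, 51.0)),
          file = path, row.names = FALSE)

# No col.names here, so read.csv counts the columns from the file itself.
# The quoted "NULL" entries skip the second and third columns.
first_cell <- read.csv(path, header = FALSE, nrows = 1,
                       colClasses = c("character", "NULL", "NULL"))
print(first_cell)
```

The same arguments can be passed through lapply(file_list, read.csv, ...) as in the question.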

How to specify .csv delimiter while using Map()

I have a list containing 2 or more dataframes:
d <- data.frame(x=1:3, y=letters[1:3])
f <- data.frame(x=11:13, y=letters[11:13])
df <- list(d, f)
To save them as .csv, I use the following syntax:
filenames = paste0('C:/Output_', names(df), '.csv')
Map(write.csv, df, filenames)
But I would like to add some strings to obtain a specific format, like:
quote = FALSE, row.names = FALSE, sep = "\t", na = "", col.names = FALSE
And the thing is that I am not that sure where to add that syntax. Wherever I try, I get a warning saying my syntax has been ignored.
> Warning messages:
1: In (function (...) : attempt to set 'col.names' ignored
2: In (function (...) : attempt to set 'sep' ignored
3: In (function (...) : attempt to set 'col.names' ignored
4: In (function (...) : attempt to set 'sep' ignored
Any suggestions? In BaseR preferably!
Why you're still getting col.names warnings: farther down in the documentation (?write.csv) you'll see
These wrappers [write.csv and write.csv2] are deliberately inflexible: they are designed to
ensure that the correct conventions are used to write a valid
file. Attempts to change ‘append’, ‘col.names’, ‘sep’, ‘dec’ or
‘qmethod’ are ignored, with a warning.
The warnings should go away if you use write.table() instead.
You need to use an anonymous function in order to pass further arguments. Because write.csv ignores sep and col.names, call write.table inside it, i.e.
Map(function(...) write.table(..., quote = FALSE, row.names = FALSE, col.names = FALSE, sep = "\t", na = ""), df, filenames)
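A runnable sketch of the write.table route is shown below. It makes two assumptions that are not in the question: the list is given names (for an unnamed list, names(df) is NULL, so every file would get the same name), and the files go to tempdir() with a .tsv extension instead of C:/.

```r
d <- data.frame(x = 1:3, y = letters[1:3])
f <- data.frame(x = 11:13, y = letters[11:13])
df <- list(d = d, f = f)  # named, so names(df) yields "d" and "f"

# Assumed output location: the session's temporary directory.
filenames <- file.path(tempdir(), paste0("Output_", names(df), ".tsv"))

# The anonymous function forwards each data frame and file name to
# write.table together with the fixed formatting arguments.
invisible(Map(function(x, file) {
  write.table(x, file, quote = FALSE, row.names = FALSE,
              col.names = FALSE, sep = "\t", na = "")
}, df, filenames))

first_file <- readLines(filenames[1])
print(first_file)
```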

R read.table skip not working. Why?

I have a file similar to
ColA ColB ColC
A 1 0.1
B 2 0.2
But with many more columns.
I want to read the table and set the correct type of data for each column.
I am doing the following:
data <- read.table("file.dat", header = FALSE, na.string = "",
dec = ".",skip = 1,
colClasses = c("character", "integer","numeric"))
But I get the following error:
Error in scan(...): scan() expected 'an integer', got 'ColB'
What am I doing wrong? Why is it trying to parse also the first line according to colClasses, despite skip=1?
Thanks for your help.
Some notes: This file has been generated in a Linux environment and is being worked on in a Windows environment. I am thinking of a problem with newline characters, but I have no idea what to do.
Also, if I read the table without colClasses the table is read correctly (skipping the first line) but all columns are factor type. I can probably change the class later, but still I would like to understand what is happening.
Instead of skipping the first line, you can set header = TRUE and it should work fine:
data <- read.table("file.dat", header = TRUE, na.string = "",
dec = ".",colClasses = c("character", "integer","numeric"), sep = ",")
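The header = TRUE approach can be tried on a small stand-in file. This sketch assumes the file is whitespace-delimited, as the sample in the question suggests, so sep is left at its default; sep = "," would only apply if the real file is comma-delimited. The temporary file and its contents are made up for illustration.

```r
# Recreate the sample from the question as a whitespace-delimited file.
path <- tempfile(fileext = ".dat")
writeLines(c("ColA ColB ColC",
             "A 1 0.1",
             "B 2 0.2"), path)

# With header = TRUE the first line supplies the column names and is not
# parsed against colClasses.
data <- read.table(path, header = TRUE, na.strings = "", dec = ".",
                   colClasses = c("character", "integer", "numeric"))
str(data)
```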

Printing several pieces of output to the same CSV in R?

I am using the TraMineR package. I am printing output to a CSV file, like this:
write.csv(seqient(sequences.seq), file = "diversity_measures.csv", quote = FALSE, na = "", row.names = TRUE)
write.csv(seqici(sequences.seq), file = "diversity_measures.csv", quote = FALSE, na = "", row.names = TRUE, append= TRUE)
write.csv(seqST(sequences.seq), file = "diversity_measures.csv", quote = FALSE, na = "", row.names = TRUE, append= TRUE)
The dput(sequences.seq) object can be found here.
However, this does not append the output properly but creates this error message:
In write.csv(seqST(sequences.seq), file = "diversity_measures.csv", : attempt to set 'append' ignored
Additionally, it only gives me the output for the last command, so it seems like it overwrites the file each time.
Is it possible to get all the columns in a single CSV file, with a column name for each (i.e. entropy, complexity, turbulence)
You can use append=TRUE in write.table calls and use the same file name, but you'll need to specify all the other arguments as needed. append=TRUE is not available for the wrapper function write.csv, as noted in the documentation:
These wrappers are deliberately inflexible: they are designed to
ensure that the correct conventions are used to write a valid file.
Attempts to change append, col.names, sep, dec or qmethod are ignored,
with a warning.
Or you could write out
write.csv(data.frame(entropy=seqient(sequences.seq),
complexity=seqici(sequences.seq),
turbulence=seqST(sequences.seq)),
'output.csv')
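For completeness, the append route mentioned above can be sketched with write.table. Note that appending adds rows below the existing ones, not new columns, so the data.frame approach is the one that gives one column per measure. The two data frames here are made-up placeholders, not TraMineR output.

```r
out <- tempfile(fileext = ".csv")

# Placeholder data standing in for two chunks of results.
first  <- data.frame(id = 1:2, value = c(0.5, 0.7))
second <- data.frame(id = 3:4, value = c(0.9, 1.1))

# The first call writes the header; the CSV conventions must be spelled out.
write.table(first, out, sep = ",", dec = ".", qmethod = "double",
            row.names = FALSE, col.names = TRUE)

# Later calls append rows and leave the header out.
write.table(second, out, sep = ",", dec = ".", qmethod = "double",
            row.names = FALSE, col.names = FALSE, append = TRUE)

combined <- read.csv(out)
print(combined)
```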
