Write a data.frame to an R file

I wish to write a data.frame from one R file to another R file using base R. So far I have tried cat, capture.output, write.table and sink. I found suggested solutions for capture.output and write.table here:
writing a data.frame using cat
However, I have not been able to obtain an ideal solution. write.table comes the closest but returns an unwanted warning message.
Here is the data.frame in the source R file:
my.df <- data.frame(scenario = 3333,
                    AA = 200,
                    BB = 999,
                    CC = 444,
                    DD = 7)
Here is the desired appearance in the recipient R file, except that I want the name to be my.df, not desired.format:
desired.format <- read.table(text = '
scenario AA BB CC DD
3333 200 999 444 7
', header = TRUE, stringsAsFactors = FALSE)
Here is the full code for the source R file except for the setwd() statement:
R.file <- 'my_R_file.R'
cat(' ' , file = R.file, sep=c("\n") )
cat('This is my stuff' , file = R.file, sep=c("\n"), append = TRUE)
cat('#' , file = R.file, sep=c("\n"), append = TRUE)
cat(' ' , file = R.file, sep=c("\n"), append = TRUE)
my.df <- data.frame(scenario = 3333,
                    AA = 200,
                    BB = 999,
                    CC = 444,
                    DD = 7)
str(my.df)
# Desired format in my_R_file.R
desired.format <- read.table(text = '
scenario AA BB CC DD
3333 200 999 444 7
', header = TRUE, stringsAsFactors = FALSE)
str(desired.format)
# capture.output includes an unwanted row number
cat(' ' , file = R.file, sep=c("\n"), append = TRUE)
cat('capture.output' , file = R.file, sep=c("\n"), append = TRUE)
capture.output(my.df, file = R.file, append = TRUE)
cat(' ' , file = R.file, sep=c("\n"), append = TRUE)
# write.table returns an unwanted warning message
cat('write.table' , file = R.file, sep=c("\n"), append = TRUE)
cat('my.df <- read.table(text = \'' , file = R.file, sep=c("\n"), append = TRUE)
write.table(my.df, file = R.file, col.names = TRUE, row.names = FALSE, quote = FALSE, append=TRUE)
cat('\', header = TRUE, stringsAsFactors = FALSE)' , file = R.file, sep=c("\n"), append = TRUE)
cat(' ' , file = R.file, sep=c("\n"), append = TRUE)
# sink does not return any useful output
#cat('sink' , file = R.file, sep=c("\n"), append = TRUE)
#sink(R.file)
#sink()
#my.df
#sink()
#cat(' ' , file = R.file, sep=c("\n"), append = TRUE)
cat('This is the end' , file = R.file, sep=c("\n"), append = TRUE)
cat(' ' , file = R.file, sep=c("\n"), append = TRUE)
Here are the full contents of the recipient R file my_R_file.R:
This is my stuff
#
capture.output
scenario AA BB CC DD
1 3333 200 999 444 7
write.table
my.df <- read.table(text = '
scenario AA BB CC DD
3333 200 999 444 7
', header = TRUE, stringsAsFactors = FALSE)
This is the end
Here is the warning message returned by write.table:
Warning message:
In write.table(my.df, file = R.file, col.names = TRUE, row.names = FALSE, :
appending column names to file
Thank you for any suggestions on eliminating this warning message or arriving at a better solution. I would rather not suppress all warning messages.

Notice that the base:::print.data.frame method is involved when evaluating my.df, which is of class "data.frame". It has arguments such as row.names=, so you may specify:
capture.output(print(my.df, row.names=FALSE), file=R.file, append=TRUE)
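Building on that, here is a minimal, self-contained sketch (my own, not from the original answer) of a source file that writes the read.table(text = '...') block without calling write.table at all, so no warning is raised; it reuses the R.file name from the question:
my.df <- data.frame(scenario = 3333, AA = 200, BB = 999, CC = 444, DD = 7)
R.file <- 'my_R_file.R'
# open the read.table(text = ' ... ') call in the recipient file
cat("my.df <- read.table(text = '\n", file = R.file)
# print() without row numbers, captured straight into the file
capture.output(print(my.df, row.names = FALSE), file = R.file, append = TRUE)
# close the quoted text block and the read.table() call
cat("', header = TRUE, stringsAsFactors = FALSE)\n", file = R.file, append = TRUE)
If the script has already written other lines to R.file, add append = TRUE to the first cat() as well; sourcing my_R_file.R then recreates my.df with the same structure as desired.format.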

Related

Why R reads CSV file differently

I am using
myCounts <- read.csv("myCounts.csv", header = TRUE, row.names = 1, sep = ",")
and
Book4 <- read_delim("Book4.csv", delim = ";",
                    escape_double = FALSE, trim_ws = TRUE)
to read two csv files, but read.csv and read_delim are parsing them differently.
Could you please explain how to read in the Book4 data with the same structure as the myCounts data?
I tried the following, and it works:
df <- read.delim("~/Documents/sample.csv", sep = ";", row.names = 1)
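As a side note (not part of the original answer): base R also provides read.csv2(), which assumes a semicolon separator and a comma decimal mark, so a sketch of an equivalent call for a semicolon-delimited file like Book4.csv would be:
# read.csv2() defaults to sep = ";" and dec = ","; row.names = 1 mirrors the myCounts call
Book4 <- read.csv2("Book4.csv", header = TRUE, row.names = 1)
Whether the decimal mark in Book4.csv is really a comma is an assumption; if it is a period, read.delim with sep = ";" (as above) is the safer choice.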

data.table::fread error when converting MAF files to data table

I want to merge the 50 MAF files with the sample information so that I can read it as a data.table and subset it.
library(maftools)
# Load MAF files
maf = system.file("extdata", list.files(path="mafs/"), package="maftools")
# Load sample information
si <- system.file("extdata", "sample-information.tsv", package="maftools")
d = read.maf(maf=maf, clinicalData=si)
Traceback:
Error in data.table::fread(file = maf, sep = "\t", stringsAsFactors = FALSE, :
File '' does not exist or is non-readable. getwd()=='C:/Users/User/Documents/VanAllen'
> traceback()
3: stop("File '", file, "' does not exist or is non-readable. getwd()=='",
getwd(), "'")
2: data.table::fread(file = maf, sep = "\t", stringsAsFactors = FALSE,
verbose = FALSE, data.table = TRUE, showProgress = TRUE,
header = TRUE, fill = TRUE, skip = "Hugo_Symbol", quote = "")
1: read.maf(maf = maf, clinicalData = si)
1: data.table::fread(input = maf)
Maftools documentation:
https://www.bioconductor.org/packages/release/bioc/manuals/maftools/man/maftools.pdf
When I run your code, maf indeed contains an empty string (""), which of course cannot be read by fread. However, when I try
fread("R/x86_64-pc-linux-gnu-library/3.6/maftools/extdata/brca.maf.gz")
it works as expected.
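A likely explanation (my reading of the traceback, not confirmed in the thread): system.file() returns "" when the requested file is not found inside the installed package, and the files listed by list.files(path = "mafs/") live in a local folder, not in maftools' extdata. If the 50 MAF files are in a local mafs/ directory, a sketch that skips system.file() and passes a real local path to read.maf() would be:
library(maftools)
# full.names = TRUE returns usable relative paths instead of bare file names
maf_files <- list.files(path = "mafs/", full.names = TRUE)   # assumes mafs/ sits in getwd()
si <- "sample-information.tsv"                               # assumes a local copy of the sample sheet
d <- read.maf(maf = maf_files[1], clinicalData = si)         # one MAF file; repeat or merge for all 50
The file names here are assumptions based on the question; the point is only that read.maf() needs readable paths, which system.file() does not supply for files that are not shipped with the package.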

Omitting the final end-of-line character for txt file (write.table in R)

I have a simple data.frame that I would like to write to an output .txt file using R.
Sample code:
my_df <- data.frame(name = c("Wendy", "Quinn"), age = c(23, 43))
write.table(my_df, file = "my_output_file.txt", sep = " ", col.names = F, row.names = F, quote = F, eol = "\n")
The trouble is that I am getting the following output file when viewed in Notepad++ (see screenshot). I understand the eol = "\n" argument places a newline at the end of each line -- I want that for the line separation between these two rows, but not at the end of the document. Is there a method to omit the final newline, which currently makes my .txt file 3 lines long instead of only 2?
I don't know of an automatic way to do it, but try this:
my_df <- data.frame(name = c("Wendy", "Quinn"), age = c(23, 43))
write.table(my_df, file = "my_output_file.txt", sep = " ", col.names = F, row.names = F, quote = F, eol = "\n")
produces the same output (the file still ends with a trailing newline):
but this
my_output <- capture.output(write.table(my_df, sep = " ", col.names = F, row.names = F, quote = F, eol = "\n"))
writeBin(paste(my_output, collapse = "\n"), "my_output_file2.txt")
produces this (a file without the trailing newline):
You can write the object minus the last line, then append it without a line ending.
write.table(my_df[1:(nrow(my_df)-1),], file = "my_output_file.txt",
            sep = " ", col.names = F, row.names = F, quote = F, eol = "\n")
write.table(my_df[nrow(my_df),], file = "my_output_file.txt",
            sep = " ", col.names = F, row.names = F, quote = F, eol = "", append = T)

R asks for a list which seems to be a list according to is.list (=TRUE)

I am using the RAM package.
The function I use is very simple: it computes diversity indices and adds them as columns to my metadata:
outname <-OTU.diversity(data=OTUtables, meta=metatables)
(Arguments: data, a list of OTU tables; meta, the metadata to append the outputs to.)
I am looping it but I get this error:
please provide otu tables as list; see ?RAM.input.formatting
So I go to that help menu and read this:
one data set:
data=list(data=otu)
multiple data sets:
data=list(data1=otu1, data2=otu2, data3=otu3)
Here is my code:
i <- 1
for (i in 1:nrow(metadataMasterTax)) {
  temp <- read.table(paste(metadataMasterTax$DataAnFilePath[i], metadataMasterTax$meta[i], sep = ""),
                     sep = "\t", header = TRUE, dec = ".", comment.char = "", quote = "",
                     stringsAsFactors = TRUE, as.is = TRUE)
  temp2 <- temp
  temp2$row.names <- NULL  # to drop the row numbers generated in the margin
  trans <- read.table(paste(metadataMasterTax$taxPath[i], metadataMasterTax$taxName[i], sep = ""),
                      sep = "\t", header = TRUE, dec = ".", comment.char = "", quote = "",
                      stringsAsFactors = TRUE, as.is = TRUE, check.names = FALSE)
  trans2 <- trans
  trans2$row.names <- NULL  # to drop the row numbers generated in the margin
  data = list(data = trans2[i])
  temp2[i] <- OTU.diversity(data = trans2[i], meta = temp2[i])
  # Error in OTU.diversity(trans2, temp2) :
  #   please provide otu tables as list; see ?RAM.input.formatting
  # is.list(trans2)
  # [1] TRUE
  # is.list(data)
  # [1] TRUE
  temp$taxonomy <- temp2$taxonomy
  write.table(temp, file = paste(pathDataAn, "diversityDir/", metadataMasterTax$ShortName[i], ".meta.div.tsv", sep = ""),
              append = FALSE,
              sep = "\t",
              row.names = FALSE)
}
Can anyone help me, please? Thanks a lot.
Because the main problem appears to be getting the OTU.diversity function to work, I focus on that issue. The code snippet below runs OTU.diversity without any problems, using the Google Sheets data provided by the OP.
library(gsheet)
library(RAM)
for (i in 1:2) {
  # Meta data
  temp <- as.data.frame(gsheet2tbl("https://drive.google.com/open?id=1hF47MbYZ1MG6RzGW-fF6tbMT3z4AxbGN5sAOxL4E8xM"))
  temp$row.names <- NULL
  # OTU
  trans <- as.data.frame(gsheet2tbl("https://drive.google.com/open?id=1gOaEjDcs58T8v1GA-OKhnUsyRDU8Jxt2lQZuPWo6XWU"))
  trans$row.names <- NULL
  rownames(temp) <- colnames(trans)[-ncol(trans)]
  temp2 <- OTU.diversity(data = list(data = trans), meta = temp)
  write.table(temp2,
              file = paste0("file", i, ".meta.div.tsv"),  # replace
              append = FALSE,
              sep = "\t",
              row.names = FALSE)
}
Replace for (i in 1:2) with for(i in 1:nrow(metadataMasterTax)), as.data.frame(gsheet2tbl(...)) with read.table(...), and the file argument in write.table with the appropriate string.
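The key difference from the loop in the question (as I read it) is how the OTU table is passed: ?RAM.input.formatting asks for a list of OTU tables, so the whole table has to be wrapped in list(data = ...) rather than handed over directly or as a single column:
# what the question's loop passes: one column of the OTU table, not a list
# OTU.diversity(data = trans2[i], meta = temp2[i])
# what ?RAM.input.formatting asks for: the whole table wrapped in a named list
temp2 <- OTU.diversity(data = list(data = trans2), meta = temp2)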

Column names shift to left on read.table or read.csv

Originally I have this TSV file (sample):
name type qty
cxfm 1C 0
d2 H50 2
g3g 1G 2
hb E37 1
nlx E45 4
I am using read.csv to read data from the .tsv file, but I always get this output:
name type qty
1 cxfm 1C 0
2 d2 H50 2
3 g3g 1G 2
4 hb E37 1
5 nlx E45 4
instead of getting this one:
name type qty
1 cxfm 1C 0
2 d2 H50 2
3 g3g 1G 2
4 hb E37 1
5 nlx E45 4
Any ideas? This is what I am using to read the files:
file_list <- list.files()
for (file in file_list) {
  if (!exists("dataset")) {
    dataset <- read.table(file, header = TRUE, sep = "\t", row.names = NULL, blank.lines.skip = TRUE, fill = TRUE)
    names(dataset) <- c("rowID", names(dataset)[1:ncol(dataset)-1])
  }
  if (exists("dataset")) {
    temp_dataset <- read.table(file, header = TRUE, sep = "\t", row.names = NULL, blank.lines.skip = TRUE, fill = TRUE)
    names(temp_dataset) <- c("rowID", names(temp_dataset)[1:ncol(temp_dataset)-1])
    dataset <- rbind(dataset, temp_dataset)
    rm(temp_dataset)
  }
}
dataset <- unique(dataset)
write.table(dataset, file = "dataset.tsv", sep = "\t")
There appears to be a missing column header in your source TSV file. One option here would be to leave your read.csv() call as it is and simply adjust the names of the resulting data frame:
df <- read.csv(file,
               header = TRUE,
               sep = "\t",
               row.names = NULL,
               blank.lines.skip = TRUE,
               fill = TRUE,
               comment.char = "",
               quote = "",
               stringsAsFactors = FALSE)
names(df) <- c("rowID", names(df)[1:ncol(df)-1])
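A small aside on that last line (not part of the original answer): because : binds tighter than binary -, 1:ncol(df)-1 evaluates to 0:(ncol(df) - 1), and the 0 index is silently dropped, so it happens to select the first ncol(df) - 1 names; the intent is clearer with explicit parentheses:
# equivalent, with the intended range spelled out
names(df) <- c("rowID", names(df)[1:(ncol(df) - 1)])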
This is what I had to do to fix it: set row.names to FALSE when writing the file:
write.table(dataset, file = "data.tsv", sep = "\t", row.names = FALSE)
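For context (my own sketch, not part of the answer above): with row.names = FALSE the written file has the same number of header fields as data columns, so reading it back keeps each name over the right column:
# write without the row-name column, then read back; the headers line up with the data
write.table(dataset, file = "data.tsv", sep = "\t", row.names = FALSE)
check <- read.table("data.tsv", header = TRUE, sep = "\t")
str(check)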
