Converting *.rds into *.csv file - r

I am trying to convert an *.rds file into a *.csv file. First, I import the file via data <- readRDS("file.rds") and then I try to write the CSV file via write.csv(data, file="file.csv").
However, this yields the following error:
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ‘structure("dgCMatrix", package = "Matrix")’ to a data.frame
How can I turn the *.rds file into a *.csv file?

Sparse matrices often cannot be converted directly into a data frame.
This approach can be very resource-intensive, but it may work: convert the sparse matrix to a dense matrix first and then save that to a CSV.
Try this:
write.csv(as.matrix(data),file="file.csv")
This solution is not efficient and might crash R, so save your work beforehand.
As a general comment, the resulting CSV file will be huge, so it might be more helpful to use more efficient data storage, like a database engine.
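If the dense conversion runs out of memory, a less resource-hungry alternative (a sketch; the sparseMatrix() call below just builds a toy stand-in for the object loaded from file.rds) is to export only the non-zero entries in triplet form using the Matrix package's summary() method:

```r
library(Matrix)

# Toy sparse matrix standing in for the dgCMatrix loaded from file.rds
data <- sparseMatrix(i = c(1, 3), j = c(2, 4), x = c(5, 7), dims = c(4, 4))

# summary() on a sparse matrix returns a data frame of triplets:
# row index i, column index j, and the non-zero value x
triplets <- summary(data)
write.csv(triplets, "file_triplets.csv", row.names = FALSE)
```

This keeps the output proportional to the number of non-zero entries rather than the full dense dimensions.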

Related

Converting RData to CSV file returns incorrect CSV file

I do not have any expertise in R and I have to convert RData files to CSV to analyze the data. I followed these links to do this: Converting Rdata files to CSV and "filename.rdata" file Exploring and Converting to CSV. The second option seemed simpler, as I failed to understand the first one. This is what I have tried so far, along with the results:
>ddata <- load("input_data.RData")
>print(ddata)
[1] "input_data"
> print(ddata[[1]])
[1] "input_data"
> write.csv(ddata,"test.csv")
From the first link I learnt that we can inspect the RData type, and when I ran str(ddata) I found that it is a list of size 1. Hence, I checked whether print(ddata[[1]]) would print anything other than just "input_data". With write.csv I was able to write it to a CSV without any errors, but the file contains just the following two lines:
"","x"
"1","input_data"
Can you please help me understand what I am doing wrong and show a way to get all the details into a CSV?
The object ddata contains the name of the object(s) that load() created. Try typing the command ls(). That should give you the names of the objects in your environment. One of them should be input_data. That is the object. If it is a data frame (str(input_data)), you can create the csv file with
write.csv(input_data, "test.csv")
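A self-contained sketch of the whole round trip (the save() step here just builds a small demonstration .RData file standing in for input_data.RData from the question):

```r
# Build a small demonstration .RData file (stand-in for input_data.RData)
input_data <- data.frame(x = 1:3, y = c("a", "b", "c"))
save(input_data, file = "input_data.RData")
rm(input_data)

# load() returns the *names* of the restored objects, not the objects themselves
obj_names <- load("input_data.RData")
input_data <- get(obj_names[1])   # fetch the actual object by its name
write.csv(input_data, "test.csv", row.names = FALSE)
```

Passing ddata straight to write.csv, as in the question, writes the character vector of names, which is why the CSV contained only the string "input_data".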

Does R have an equivalent to python's io for saving file like objects to memory?

In Python we can import io and then make a file-like object with some_variable = io.BytesIO(), then download any type of file into it and interact with it as if it were a locally saved file, except that it's in memory. Does R have something like that? To be clear, I'm not asking about what any particular OS does when you save some R object to a temp file.
This is kind of a duplicate of Can I write to and access a file in memory in R?, but that is about 9 years old, so maybe the functionality exists now, either in base R or in a package.
Yes, readBin.
readBin("/path", raw(), file.info("/path")$size)
This is a working example:
tfile <- tempfile()
writeBin(serialize(iris, NULL), tfile)
x <- readBin(tfile, raw(), file.info(tfile)$size)
unserialize(x)
…and you get back your iris data.
This is just an example; for R objects, it is far more convenient to use readRDS()/saveRDS().
However, if the object is an image you want to analyse, readBin gives a raw memory representation.
For text files, you should then use:
rawToChar(x)
but again there are readLines(), read.table(), etc., for these tasks.
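For a closer analogue of io.BytesIO — an in-memory buffer you can write to and read from without touching disk — base R also has rawConnection() (and textConnection() for text). A sketch:

```r
# Serialize an object into an in-memory buffer instead of a file
con <- rawConnection(raw(0), "w")
saveRDS(iris, con)
bytes <- rawConnectionValue(con)  # the buffer contents as a raw vector
close(con)

# Read it back from memory, just like readRDS() on a file
con2 <- rawConnection(bytes, "r")
restored <- readRDS(con2)
close(con2)
```

Most functions that accept a file path also accept a connection, so this pattern works beyond readRDS()/saveRDS().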

How to change the object when importing data in R

I have an Excel file containing a column of 10000 numbers that I wish to import into R.
However, no matter which method I use, the resulting object is either a list of 1, or 10000 obs. of 1 variable (I have used read.csv on the .csv version of the file, and read_xlsx on the .xlsx version). If this is expected, how can I work these objects into ordinary arrays?
I have tried importing the same files into MATLAB and everything works normally there (it's immediately an ordinary array).
If it's an Excel file, you might want to try the readxl package.
library("readxl")
dt <- read_excel("your_file_path")
Found an easy method:
convert the data to a data frame, and then convert that to a numeric matrix (an ordinary array):
my_data<-data.frame(my_data)
my_data<-data.matrix(my_data)
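If the goal is an ordinary numeric vector rather than a 10000 × 1 data frame, extracting the single column directly also works. A sketch, where the data frame stands in for what read.csv()/read_excel() returns:

```r
df <- data.frame(x = rnorm(10000))  # stand-in for the imported one-column data

v <- df[[1]]          # the column as a plain numeric vector
m <- data.matrix(df)  # or the whole frame as a numeric matrix
```

read.csv() and read_excel() always return a data frame (one variable per column), which is why the import looked like "10000 obs. of 1 variable"; that is expected behaviour, not an error.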

Error: Invalid: File is too small to be a well-formed file - error when using feather in R

I'm trying to use feather (v. 0.0.1) in R to read a fairly large (3.5 GB) csv file with 21178665 rows and 16 columns.
I use the following lines to load the file:
library(feather)
path <- "pp-complete.csv"
df <- read_feather(path)
But I get the following error:
Error: Invalid: File is too small to be a well-formed file
There's no explanation in the documentation of read_feather, so I'm not sure what the problem is. I guess this function expects a different file format, but I'm not sure what that would be.
Btw, I can read the file with read_csv from the readr library, but it takes a while.
The feather file format is distinct from a CSV file format. They are not interchangeable. The read_feather function cannot read simple CSV files.
If you want to read CSV files quickly, your best bets are probably readr::read_csv or data.table::fread. For large files, it will still usually take a while just to read from disk.
After you've loaded the data into R, you can create a file in the feather format with write_feather so you can read it with read_feather the next time.
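A sketch of that round trip, demonstrated with a built-in data frame (in the question this would instead be the result of readr::read_csv("pp-complete.csv") or data.table::fread()):

```r
library(feather)

df <- mtcars  # stand-in for the large data frame read from CSV

path <- tempfile(fileext = ".feather")
write_feather(df, path)    # one-time conversion to the feather format
df2 <- read_feather(path)  # fast to reload on subsequent sessions
```

The slow CSV parse then happens only once; later sessions can start from the feather file.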

writing to a tab-delimited file or a csv file

I have RMA-normalized data (from CEL files) and would like to write it to a file that I can open in Excel, but I am having some problems.
library(affy)
cel <- ReadAffy()
pre<-rma(cel)
write.table(pre, file="norm.txt", sep="\t")
write.table(pre, file="norma.txt")
The output is arranged row-wise in the text file written by the commands above, so when it is exported to Excel it is in the wrong form and much of the information is cut off as the maximum number of rows is used up. The output looks like this:
GSM 133971.CEL 5.85302 3.54678 6.57648 9.45634
GSM 133972.CEL 4.65784 3.64578 3.54213 7.89566
GSM 133973.CEL 6.78543 3.54623 2.54345 7.89767
How to write it in a proper format from CEL files in R to a notepad or excel ?
You need to extract the values from the normalised probes using the exprs function. Something like:
write.csv(exprs(pre), file="output.csv", row.names=FALSE)
should do the trick.
I'm not totally clear about what the problem is; what do you mean by "proper format"? Excel will struggle with huge tables, and doing your analysis in R with Bioconductor is likely a better way to go; you could then export a smaller results or summary table to Excel.
Nevertheless, if you want the file written column-wise, you could try:
write.csv(t(exprs(pre)), file="norm.csv")
But Excel allows (or at least used to allow) many more rows than columns.
