Save big data file in R to be loaded afterwards in Matlab

I have created a 300000 x 7 numeric matrix in R and I want to work with it in both R and Matlab. However, I'm not able to create a file that Matlab can read properly.
When using save() with file = "xx.csv", Matlab recognizes only 5 columns; with a .txt extension, all the data ends up in a single column.
I have also tried with packages ff and ffdf to manage this big data (I guess the problem of R identifying rows and column when saving is related somehow to this), but I don't know how to save it in a readable format for Matlab afterwards.
An example of this dataset would be:
output <- matrix(runif(2100000, 1, 1000), ncol=7, nrow=300000)

If you want to work both with R and Matlab, and you have a matrix as big as yours, I'd suggest using the R.matlab package. The package provides methods readMat and writeMat. Both methods read/write the binary format that is understood by Matlab (and through R.matlab also by R).
Install the package by typing
install.packages("R.matlab")
Subsequently, don't forget to load the package, e.g. by
library(R.matlab)
The documentation of readMat and writeMat, accessible through ?readMat and ?writeMat, contains easy usage examples.
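A minimal sketch of the round trip, using the example matrix from the question (the file name output.mat and the use of tempdir() are arbitrary choices for illustration):

```r
library(R.matlab)

# Example matrix from the question
output <- matrix(runif(2100000, 1, 1000), ncol = 7, nrow = 300000)

# Write a binary .mat file; in Matlab, open it with: load('output.mat')
path <- file.path(tempdir(), "output.mat")
writeMat(path, output = output)

# Read it back into R; readMat returns a named list of the saved variables
restored <- readMat(path)
dim(restored$output)  # 300000 x 7
```

writeMat stores each named argument under that name, so after load the matrix appears in Matlab's workspace as the variable output.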

Related

R does not import values as numbers

The base data was generated with a SQL query, and the intention is to process it in R. However, when importing from a .csv or an .xlsx file, R imports the numbers as characters despite the data type being changed in the built-in import tool. Furthermore, basic arithmetic operations raise the following error:
In Ops.factor(data$A, data$B) : '/' not meaningful for factors
Is there a simple way to solve this?
Analysing the data set with the str() function revealed that R had imported the columns in question as factors. So far I have:
Used the varhandle package and its unfactor function to unfactorize the data
Used as.numeric for some columns which were read as characters instead of factors
Tried changing the data types in Excel before importing
data$A <- unfactor(data$A)
data$B <- unfactor(data$B)
data$PERCENTAGE <- (data$B)/(data$A)*100
By what means can R import the data as per specified data-types?
Thank you for the help in advance!
For .csv files I would recommend read_csv from the readr package, part of Hadley Wickham's excellent tidyverse. It has intelligent defaults that cope with most things I throw at it, and a col_types argument for forcing specific column types.
For .xlsx there is read_excel from the readxl package (other packages are available too).
Alternatively, just export a .csv from within Excel and use read_csv.
[Note that these functions import files as a "tibble", which is essentially a data frame on steroids without some of the headaches, but is easily converted to a data.frame if you prefer.]
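A short sketch of forcing numeric parsing, assuming the column names A and B from the question (the inline string stands in for the real .csv file):

```r
library(readr)

# Inline stand-in for the real .csv; I() marks it as literal data
raw <- I("A,B\n10,5\n20,8\n")

# col_types forces both columns to be parsed as doubles, never factors
data <- read_csv(raw, col_types = cols(A = col_double(), B = col_double()))

# Division now works because the columns are numeric
data$PERCENTAGE <- data$B / data$A * 100
```

If a column has already arrived as a factor, the safe base-R conversion is as.numeric(as.character(x)); calling as.numeric on the factor directly returns the internal level indices instead of the values.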

using the 'ptw' package in R

I am working on applying the ptw package to my GC-MS wine data. So far I have been able to use this package correctly on the apples example data described in the vignette (MTBLS99). Since I am new to R, I have not been able to get my .CDF files into the format used at the start of the vignette. It starts from three data frames (All.pks, All.tics, All.xset), which I assume were generated using the xcms package, but I cannot recreate the specific steps used to format the data this way. Has anyone successfully applied ptw to their LC/GC-MS data? Can someone share the code used for generating the All.pks, All.tics and All.xset data frames?

reading files: XLConnect modifies my number

I am reading an Excel spreadsheet with the following commands:
models <- XLConnect::loadWorkbook("bla.xlsx")
models <- XLConnect::readWorksheet(models, sheet = 1, check.names = FALSE)
However, I noticed that R changes my numbers. For example, Excel shows:
1.011379571
and in R
1.011379570999999977
Has anyone had this issue? I know it's a very small difference, but I really need 100% precision.
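This is almost certainly ordinary double-precision behaviour rather than XLConnect altering the value: 1.011379571 has no exact binary representation, so both Excel and R store the nearest 64-bit double, and R is simply showing more of its digits. A sketch:

```r
x <- 1.011379571

# Default printing rounds to 7 significant digits
print(x)            # 1.01138

# Asking for more digits exposes the stored binary approximation
sprintf("%.18f", x)

# Rounded back to 9 decimal places, the original digits reappear
sprintf("%.9f", x)  # "1.011379571"

# The value itself survives a round trip through the literal unchanged
x == 1.011379571    # TRUE
```

Excel also stores this approximation internally; it just never displays more than 15 significant digits, which hides the trailing ...0999999977.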

R equivalent to matrix row insertion in Matlab

In Matlab, without any coding, I can create a matrix, open up its spreadsheet, and copy multiple columns of values from Excel and paste them into the spreadsheet. I can then right click this matrix and plot it instantly.
I've tried googling for how to do the equivalent in R, and everything seems to involve creating a function iterating over each value with a for loop. This seems a bit cumbersome, is there an equivalent simple way to do this in RStudio?
Thanks.
You can certainly get similar functionality through R's clipboard integration. In particular, standard R functions that support clipboard operations include connection functions (base package), such as file(), url(), pipe() and others; clipboard text-transfer functions (utils package), such as readClipboard() and writeClipboard(); as well as data-import functions (utils package) that take a connection argument, such as scan() or read.table().
This functionality differs from platform to platform. On Windows you use the connection name "clipboard"; on Mac (OS X) you can use pipe("pbpaste") (see this StackOverflow discussion for more details and alternative methods). The Kmisc package appears to offer a platform-independent approach to this functionality; however, I haven't used it, so I can't confirm that it works as expected. See this discussion for details.
The simplest example of the above functionality (on Windows) is:
data <- read.table("clipboard", sep = "\t", header = TRUE)
An explanation and further examples are available in this blog post. As far as plotting the imported data goes, RStudio not only allows you to use standard R approaches, but also adds an element of interactivity via its bundled manipulate package. See this post for more details and examples.
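A self-contained sketch of the workflow; the inline text stands in for whatever block of cells was actually copied, and the clipboard variants are commented out because they only work interactively:

```r
# Simulated clipboard contents: tab-separated cells copied from Excel
pasted <- "x\ty\n1\t2.0\n2\t4.1\n3\t5.9"
data <- read.table(text = pasted, sep = "\t", header = TRUE)

# Interactive equivalents:
# data <- read.table("clipboard", sep = "\t", header = TRUE)      # Windows
# data <- read.table(pipe("pbpaste"), sep = "\t", header = TRUE)  # macOS

# One-line plot of the imported columns, similar to Matlab's right-click plot
plot(data$x, data$y, type = "b")
```

No loop over individual values is needed: read.table parses the whole pasted block into a data frame in one call.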

How can I save a data set created using the memisc package in r?

I'm using memisc to read in a massive SPSS file in order to reshape it. This seems to work very well.
However, I'd also like to be able to output the results to an SPSS-readable file that retains labels and descriptions, since that is the advantage of using the data-set construct memisc has set up.
I've searched the memisc documentation for any mention of an export, save, or output function. I've also tried passing a memisc data set to write.foreign (from the foreign package).
Why am I doing this in the first place? Reshaping massive data files in SPSS is a pain. R makes it easy. But the folks I'm doing this for want to maintain the labels and descriptions.
Thanks!
