Essentially I want to know if there is a practical way to read a particular kind of binary file into R. I have some MATLAB code which does what I want, but ideally I want to be able to do this in R.
The MATLAB code is:
fid = fopen('filename');
A(:) = fread(fid, size*2, '2*uint8=>uint8',510,'ieee-le');
and so far in R I've been using:
to.read = file("filename", "rb")
bin = readBin(to.read, integer(), n = 76288, endian = "little")
My confusion is with the 3rd and 5th arguments of the MATLAB function fread(): I don't understand exactly what '2*uint8=>uint8' or 'ieee-le' mean in terms of interpreting the binary data. This is what is holding me back from implementing it in R.
Also, the file extension is .cwa; apparently this is a very efficient format for recording high-frequency (100 Hz) activity data.
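For what it's worth, here is a minimal sketch of how I read that fread() call, under stated assumptions: '2*uint8=>uint8' reads unsigned bytes in blocks of two and keeps them as uint8, the 510 is the number of bytes skipped after each block, and 'ieee-le' asks for little-endian byte order (which cannot matter for single bytes). read_strided_uint8 is a hypothetical helper name, and this has not been checked against a real .cwa file.

```r
# Hypothetical helper: keep `block` bytes, skip `skip` bytes, repeat,
# and return at most `n_values` unsigned byte values (0-255).
read_strided_uint8 <- function(path, n_values, block = 2L, skip = 510L) {
  # Read the whole file as raw bytes; readBin() accepts a file name directly.
  bytes <- readBin(path, what = "raw", n = file.size(path))
  stride <- block + skip
  # Zero-based position within each 512-byte stride; keep only the first `block`.
  keep <- ((seq_along(bytes) - 1L) %% stride) < block
  head(as.integer(bytes[keep]), n_values)
}

# Usage (file name as in the question):
# A <- read_strided_uint8("filename", n_values = 76288)
```

By contrast, readBin(to.read, integer(), ...) without size = 1 reads 4-byte signed integers and skips nothing, which is not what the MATLAB call does.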
I've been working with a Garmin chartplotter lately, but all of the data requires an iOS app to even look at. I'm interested in taking the sonar/sounder data and pulling it into a CSV to simply extract depth by time (so I can merge it with the data from the GPX file for a depth track).
Anyone have any experience or suggestions for doing so?
to.read <- file("TestSonar.RSD", "rb")
a <- readBin(to.read,
             raw(),
             n = file.size("TestSonar.RSD"),
             endian = "little", signed = FALSE)
close(to.read)
This produces a nice full vector of hex bytes, but I'm not sure where to go from here.
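One possible next step, sketched here with made-up bytes: once the file is in a raw vector, readBin() can be called again on slices of that vector to decode numbers at known positions. I do not know the RSD layout, so the offsets, field types, and the helper names read_i32 and read_f32 below are purely hypothetical; the real ones have to come from a format description.

```r
# Decode 4 bytes starting at a 1-based offset as a little-endian 32-bit integer.
read_i32 <- function(bytes, offset) {
  readBin(bytes[offset:(offset + 3L)], what = "integer",
          n = 1L, size = 4L, endian = "little")
}

# Decode 4 bytes starting at a 1-based offset as a little-endian 32-bit float.
read_f32 <- function(bytes, offset) {
  readBin(bytes[offset:(offset + 3L)], what = "numeric",
          n = 1L, size = 4L, endian = "little")
}

# Made-up 8-byte buffer standing in for the raw vector `a` from the question.
demo <- c(
  writeBin(42L, raw(), size = 4L, endian = "little"),
  writeBin(1.5, raw(), size = 4L, endian = "little")
)

read_i32(demo, 1)  # integer stored in bytes 1-4
read_f32(demo, 5)  # float stored in bytes 5-8
```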
I'm trying to use DESeq2's plotPCA function in a meta-analysis of data.
Most of the files I have received are raw counts pre-normalization. I'm then running DESeq2 to normalize them, then running plotPCA.
One of the files I received does not have raw counts or even the FASTQ files, just the data that has already been normalized by DESeq2.
How could I go about importing this data (non-integers) as a DESeqDataSet object after it has already been normalized?
Consensus in vignettes and other comments seems to be that objects can only be constructed from matrices of integers.
I was mostly concerned with getting the format the same between plots. Ultimately, I just used a workaround to get the plots looking the same via ggfortify.
If anyone is curious, I just ended up doing this. Note that the "names" file is organized like the metadata file used for colData when building a DESeq object with DESeqDataSetFromMatrix, but I changed the name of the design column from "conditions" to "group" so it would match the output of plotPCA. The result should look identical.
library(ggfortify)
data <- read.csv("COUNTS.csv", sep = ",", header = TRUE, row.names = 1)
names <- read.csv("NAMES.csv")
PCA <- prcomp(t(data))
autoplot(PCA, data = names, colour = "group", size = 3)
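One difference worth being aware of: as far as I know, plotPCA by default keeps only the 500 most variable rows (ntop = 500) before calling prcomp, and it is normally run on transformed (vst or rlog) values rather than on normalized counts. The base-R sketch below mimics just that row selection so the two kinds of plot are closer to comparable; top_variance_pca is a hypothetical helper, not part of DESeq2 or ggfortify.

```r
# Hypothetical helper: PCA on the `ntop` most variable rows (genes) of a
# genes-by-samples numeric matrix, returning sample scores and the
# fraction of variance explained by each component.
top_variance_pca <- function(mat, ntop = 500) {
  row_vars <- apply(mat, 1, var)
  keep <- order(row_vars, decreasing = TRUE)[seq_len(min(ntop, length(row_vars)))]
  pca <- prcomp(t(mat[keep, , drop = FALSE]))
  list(scores = pca$x, percent_var = pca$sdev^2 / sum(pca$sdev^2))
}

# Usage with the objects from the code above (assumes `data` is all numeric):
# res <- top_variance_pca(as.matrix(data))
# plot(res$scores[, 1], res$scores[, 2], xlab = "PC1", ylab = "PC2")
```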
Error in FUN(X[[i]], ...) :
Input dataset is incorrect, it should contain "Data", "xyCoords", and "Dates",
check help for details or use loadNCDF to read NetCDF file.
If time series input is needed, and your input is a time series, please put "TS = yourinput".
I am trying to apply a bias correction to rainfall forecasts but keep getting this error. My data is a time series with four columns (Date, Observed, Hindcast, Forecast). I do not know how to let R know that my data is a time series. I am new to R.
Install the hyfo package, then follow these steps:
library(hyfo)
bb <- biasCorrect(Forecast, hindcast, observation, method = "gqm", preci = TRUE)
NB: the method of bias correction depends on the researcher's preference, e.g. eqm, scaling, delta, etc.
The date format is usually the problem; enter dates as, e.g., 2019-2-1.
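A minimal sketch of preparing the inputs, assuming (as I understand the hyfo documentation) that each time-series argument should be a data frame whose first column is the date and whose remaining columns are values. The rainfall numbers and the name rain are made up for illustration; the hyfo call itself is left commented out so the snippet runs without the package.

```r
# Made-up four-column table mirroring the layout described in the question.
rain <- data.frame(
  Date     = c("2019-2-1", "2019-2-2", "2019-2-3"),
  Observed = c(0.0, 5.2, 1.1),
  Hindcast = c(0.3, 4.0, 2.5),
  Forecast = c(0.1, 6.3, 0.9)
)

# Convert the text dates to a proper Date column.
rain$Date <- as.Date(rain$Date, format = "%Y-%m-%d")

# One two-column data frame (date first, then values) per argument.
observation <- rain[, c("Date", "Observed")]
hindcast    <- rain[, c("Date", "Hindcast")]
Forecast    <- rain[, c("Date", "Forecast")]

# library(hyfo)
# bb <- biasCorrect(Forecast, hindcast, observation, method = "gqm", preci = TRUE)
```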
I have an Agilent dataset. When I read it in R, I create a target file, which is a .txt file listing all the sample .txt files. How can we read all the samples in MATLAB the way we do in R?
The code in MATLAB is:
AGFEData = agferead(File)
% Example:
agfeStruct = agferead('fe_sample.txt')
We use quantile normalization in R. How do we normalize in MATLAB? The normalization code in MATLAB for a microarray is XNorm = manorm(X).
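For reference when comparing results between the two environments, this is a minimal base-R sketch of what quantile normalization does: every column (array) is forced onto the same distribution, namely the mean of the sorted columns. quantile_normalize is a hypothetical helper written for illustration, and it breaks ties by order of appearance, which may differ from what a packaged implementation does. I believe the MATLAB Bioinformatics Toolbox has a quantilenorm function for the same purpose, but treat that as something to verify.

```r
# Hypothetical helper: quantile-normalize the columns of a numeric matrix.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank within each column
  sorted <- apply(x, 2, sort)                         # each column sorted ascending
  ref    <- rowMeans(sorted)                          # reference distribution
  out    <- apply(ranks, 2, function(r) ref[r])       # map ranks back to reference
  dimnames(out) <- dimnames(x)
  out
}

# Small made-up example: two "arrays" with three probes each.
x <- cbind(c(1, 2, 3), c(30, 10, 20))
quantile_normalize(x)
```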
I am trying to read a 320 MB Stata file with more than 5000 variables in Stata and R. I first used Stata to read the file, but the maximum number of variables it can read is 5000, so I can't use Stata to read it. My questions are:
Is there a way to read the file in Stata by first asking it to keep only certain variables (I know which ones), so that the number of variables is less than 5000?
Is there a way to read this Stata file in R? I am using 32-bit Vista, and R gives me the error "Error: cannot allocate vector of size 21k.Kb".
I used the following Stata and R code to read the file:
#The stata file is in the webpage: http://www.federalreserve.gov/econresdata/scf/scf_2010survey.htm#STATADAT
#1. set mem 400m
set maxvar 4000
use p10i6.dta, clear
keep x8166 x8167 x8168 x8163 x8164 x2422 x2506 x2606 x2623 x604 x614 x623 x716 x507 x513 x526 x1706 x1705 x1806 x1805 x1906 x1905 x2002 x2012 x1409 x1509 x1609 x1415 x1515 x1615 x1417 x1517 x1617 x1619 x1621 x3124 x3224 x3324 x3129 x3229 x3329 x3335 x3408 x3412 x3416 x3420 x3424 x3428 x4020 x4024 x4028 x4018 x4022 x4026 x4030 x4022 x4026 x4030 x4018 x3507 x3511 x3515 x3519 x3523 x3527 x3506 x3510 x3514 x3518 x3522 x3526 x3529 x3804 x3807 x3810 x3813 x3816 x3818 x3930 x3721 x3821 x3823 x3825 x3827 x3829 x3822 x3824 x3826 x3828 x3830
save p10i6.dta, replace
#2.
library (foreign)
year<-2010
yr <- substr( year , 3 , 4 )
p10i6.dta<-read.dta(paste0( "p" , yr , "i6.dta" ))
saveRDS(p10i6.dta,file=paste0( "p" , yr , "i6.rda" ))
p10i6.rda<-readRDS(paste0( "p" , yr , "i6.rda" ))
To read the data into R, there might be a way to do this with the Stata.file function from the memisc package. Instead of reading in all the variables, select the variables that you need using subset. For example:
library(memisc)
?Stata.file
d1 <- subset(
  Stata.file(paste0("p", yr, "i6.dta")),
  select = c(x8166, x2606, x2623, x604)
)
I presume you have Stata/IC or an older version of Stata. The current Stata/SE and Stata/MP can handle more than 32,000 variables. So the first logical step would be to upgrade your Stata to a version that can handle larger datasets. That is, if the problem doesn't originate from a lack of available memory; to tell, the Stata error message would be helpful.
As Richard Herron already said in the comments, you should be able to read in a subset of the data using:
use X8166 X8167 ... using p10i6, clear
Remember that Stata is case sensitive. The variables are called X... instead of x..., according to the website you linked to.
If you want to load the data into R using the foreign package, make sure you set the memory available to R to the maximum your system allows. On 32-bit Vista that's going to be about 3.5 GB (memory.limit() takes the size in MB):
memory.limit(3500)
If that doesn't help, your dataset is too big for this approach, and you can use any of the ASCII import methods in either Stata or R to load the ASCII data provided on the website.
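As a rough sketch of the ASCII route in R: read.csv() can skip unwanted columns entirely if their colClasses entry is "NULL", which keeps memory use down. This assumes the ASCII file is delimited text with a header row, which I have not checked for the SCF download; read_selected_columns and the file name in the usage comment are hypothetical.

```r
# Hypothetical helper: read only the columns named in `wanted` from a
# delimited text file with a header row.
read_selected_columns <- function(path, wanted) {
  # Read just the first data row to discover the column names.
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))
  # "NULL" drops a column while reading; NA lets R guess the type.
  classes <- rep("NULL", length(header))
  classes[header %in% wanted] <- NA
  read.csv(path, colClasses = classes, check.names = FALSE)
}

# Usage (hypothetical file name):
# d1 <- read_selected_columns("p10i6.csv", c("X8166", "X2606", "X2623", "X604"))
```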