I have some European Data Format (EDF) files that I would like to import into R.
There are some Python libraries for parsing EDF files and the EDF spec is available, so I know it's possible, but I would rather avoid writing code if I could.
Does there already exist a facility for importing these kinds of files?
Was looking for the same thing. Found this function written by Fabien Feschet - works well for my data. http://feschet.fr/?p=11
Found another resource recently. This works very well. You need to download both read_edf.R and utilities.R:
https://github.com/bwrc/edf/tree/master/R
I tried looking for the same thing a while ago, but I couldn't find anything for R. I ended up using the biosig Python module to convert EDFs to ASCII. There is also this edf2ascii-converter.
I guess there wasn't any package available at the time the question was asked, but now you could use edfReader:
https://cran.r-project.org/web/packages/edfReader/
https://github.com/Pisca46/edfReader
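For completeness, a minimal sketch of reading a file with edfReader (hedged: the two-step header/signals calls below follow the package's documentation; "myRecording.edf" is a placeholder file name):
library(edfReader)
# Step 1: read the header, which describes the channels in the file
hdr <- readEdfHeader("myRecording.edf")
# Step 2: read the signals themselves, using the header
signals <- readEdfSignals(hdr)
summary(signals)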
I have a 5.1 GB JSON file that I would like to read into R using rjson. Afterwards I want to construct a data frame from it; however, it won't load because the file is too large.
Is there any way to work around it?
Thank you for your help =)
Nina, I would recommend using the jsonlite package instead of rjson.
library(jsonlite)
your_json <- "your_path.json"  # path to your JSON file
# Read a limited number of lines and parse them as NDJSON records
unpacked_json <- jsonlite::stream_in(textConnection(readLines(your_json, n = 100000)), verbose = FALSE)
Here you limit the number of lines read at once, so that R can handle the file without running out of memory. For more information, I would also recommend doing some research on this topic:
https://community.rstudio.com/t/how-to-read-large-json-file-in-r/13486
Reading a huge json file in R, issues
I know for sure that it is sometimes really hard to cope with documentation (and, like all other human beings, we are lazy); I don't like reading documentation myself, but I highly recommend making yourself familiar with the jsonlite documentation and vignettes. Here's the CRAN link: https://cran.r-project.org/web/packages/jsonlite/index.html
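If each line of the file is a separate JSON record (NDJSON), you can also process it in chunks so the whole 5.1 GB never sits in memory at once. A hedged sketch using stream_in's handler and pagesize arguments (the path and page size are placeholders; if the file is instead one giant JSON object, this approach won't apply):
library(jsonlite)
chunks <- list()
con <- file("your_path.json")
# The handler is called once per page of parsed records; filter or
# aggregate inside it to keep the accumulated result small
jsonlite::stream_in(con, handler = function(df) {
  chunks[[length(chunks) + 1]] <<- df
}, pagesize = 10000, verbose = FALSE)
big_df <- do.call(rbind, chunks)  # assumes all pages share the same columns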
I am trying to follow @hadley's book closely to learn best practices for writing R packages. And I was thrilled to read these lines about the philosophy of the book:
anything that can be automated, should be automated. Do as little as possible by hand. Do as much as possible with functions.
So when I was reading about dependencies and the (sort of) confusing differences between import directives in the NAMESPACE file and the "Imports:" field in the DESCRIPTION file, I was hoping that roxygen2 would automatically handle both of them. After all
Every package mentioned in NAMESPACE must also be present in the Imports or Depends fields.
I was hoping that roxygen2 would take every @import in my functions and make sure it is included in the DESCRIPTION file. But it does not do that automatically.
So I either have to add it manually to the DESCRIPTION file or almost manually using devtools::use_package.
Looking around for an answer, I found this question on SO, where @hadley confirms in the comments that
Currently, the namespace roclet will modify NAMESPACE but not DESCRIPTION
and other posts (e.g. here or here) where collate_roclet is discussed, but "This only matters if your code has side-effects; most commonly because you’re using S4".
I wonder:
why DESCRIPTION is not automatically updated (which sort of contradicts the aforementioned philosophy, presumably shared by roxygen2), and
whether someone has already crafted a way to do it
I have written a little R package for that task:
https://github.com/markusdumke/pkghelper
It scans the R code and NAMESPACE for packages in use and adds them to the Imports section.
The namespace_roclet edits the NAMESPACE file based on the tags added before each function in the script. As there are three types of dependencies (Depends, Imports, and Suggests), a similar method to the namespace_roclet's would require three different tags (note that Imports should be a distinct one, to differentiate it from the packages to attach via NAMESPACE).
If you are willing to accept a semi-automated process, you could identify the packages you have used, as sketched below, and add the missing ones to DESCRIPTION in the adequate sections.
library(reinstallr)
package.dir <- getwd()
base_path <- normalizePath(package.dir)
# Scan every file under R/ for package usage (scan_for_packages is
# an unexported reinstallr function, hence the ::: access)
files <- list.files(file.path(base_path, "R"), full.names = TRUE)
packages <- unique(reinstallr:::scan_for_packages(files)$package)
packages
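From there you could feed the missing names to devtools::use_package, which writes them into the Imports field for you. A hedged sketch (the DESCRIPTION parsing below is deliberately naive and ignores version constraints such as "dplyr (>= 0.5)"):
# Read the Imports field of DESCRIPTION, if present
desc <- read.dcf(file.path(base_path, "DESCRIPTION"))
declared <- if ("Imports" %in% colnames(desc)) {
  trimws(strsplit(desc[1, "Imports"], ",")[[1]])
} else {
  character(0)
}
# Declare every detected-but-undeclared package
for (p in setdiff(packages, declared)) {
  devtools::use_package(p)
}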
Regarding the two bullets you wonder about at the bottom:
Updates to the DESCRIPTION file could be further automated with additional roclets; however, such a pull request was deferred more than four years ago already:
https://github.com/klutometis/roxygen/pull/76
I have to assume that the maintainers would indeed rather have you use the devtools package to update the DESCRIPTION file than add this to roxygen2. So in that sense, devtools would be the first available choice.
I'm writing my first R package and have made a successful build with documentation using roxygen2 and added data sets.
However, I would also like to ship an example script showing how I use the functions in the package. But I don't know where to put it.
Let's say I have created MyPackage. I have put my function scripts in the /R folder. Let's say I have:
foo1.R
foo2.R
foo3.R
Somewhere I'd also like to put a script with my workflow. Let's say I have a file, MyWorkflow.R:
library(MyPackage)
load(file = 'inData.RData')  # loads input variables A, B and C
X <- foo1(A)
Y <- foo2(X, B)
Z <- foo3(Y, C)
Can I do this? If so, where do I put it? Is it an OK procedure - or generally frowned upon?
Any help or thoughts are appreciated.
Thanks.
Carl
Edit:
I looked at the link on demo/ and exec/, but didn't understand the exec/ folder part. I'd be grateful if you could clarify/exemplify/point to good uses of...
If I understand correctly, I'm not looking for an example or demo/, since the script won't necessarily be executable without tweaking by the user (e.g. to provide input data or paths). I "just" want to add an example script showing how I work with these functions.
I realise I should probably dive into the world of vignettes, but have difficulty in finding the time/oomph/energy to do so.
I also saw that there's the inst/ folder. Could you shed some light on the different uses of these options, or hint at good examples of where they've been used? (I often find examples more informative than explanatory text that's above my level - I often get the feeling of being like a dog looking at a ceiling fan ;)
Will add info to the GitHub README. Thanks for the good suggestion!
Created inst/Workflow_Example/workflow.R. Upon build & reload, a Workflow_Example folder was created in the library with the workflow.R script in it.
In combination with an explanatory remark in the README, this looks like what I was after. Problem solved or am I not seeing something obvious? Am I e.g. violating conventions/conduct/good practice?
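For reference, a sketch of how users could then locate the installed script (assuming the inst/Workflow_Example layout above; the inst/ prefix is dropped on installation):
path <- system.file("Workflow_Example", "workflow.R", package = "MyPackage")
file.show(path)  # or file.edit(path) to tweak a local copy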
You could either put it in demo/ or exec/ depending on the format of the script. See here for more details. I would mention the workflow and where it lives in the README regardless, and if you host your code on Github, you could create a wiki to describe the workflow and place the script there. This would be similar to what nrussell has mentioned in a comment above.
I am trying to load a Matlab file with the R.matlab package, but it keeps loading indefinitely (e.g. table <- readMat("~/desktop/hg18_with_miR_20080407.mat")). I have a genome file from the Broad Institute (hg18_with_miR_20080407.mat).
You can find it at:
http://genepattern.broadinstitute.org/ftp/distribution/genepattern/dev_archive/GISTIC/broad.mit.edu:cancer.software.genepattern.module.analysis/00125/1.1/
I was wondering: has anyone tried the package and have similar issues?
(hopefully helpful, but not really an answer, though there was too much formatting for a comment)
I think you may need to get a friend with matlab access to save the file into a more reasonable format or use python for data processing. It "hangs" for me as well (OS X 10.9, R 3.1.1). The following works in python:
import scipy.io
# loadmat returns a dict mapping variable names to numpy arrays
mat = scipy.io.loadmat("hg18_with_miR_20080407.mat")
You can see and work with the crufty rg and cyto numpy arrays, but they can't be converted to JSON with json.dumps, and even jsonpickle.encode coughs up a lungful of errors (i.e. you won't be able to use rPython to get access to the object, which was the original workaround I was looking for), so no serialization to a file either (and I have to believe the resultant JSON would have been ugly to deal with).
Your options are to:
get a friend to convert it (as suggested previously)
make CSV files out of the numpy arrays in python
use matlab
I'd like to create a zip archive from within R, and need maximal cross-platform compatibility, so I would prefer not to use a system("zip") command.
Within utils there's zip.file.extract (aka unzip), which uses [a lot of] C code derived from zlib 1.1.3, in a file called dounzip.c. I couldn't find any similar capabilities for creating zip files.
It's also tricky to construct a specific Google query for "cran create zip" or equivalent!
Also, a tar will not suffice; I need to create zips to use as input for another set of non-R tools.
I'd appreciate any pointers.
cheers,
mark
As usual, the amazing Omega Project for Statistical Computing is a valuable resource! Take a look at the Rcompression package and try, for example, something like:
?gzip
# Compress a string in memory and write the raw bytes to a file
txt <- paste(rep("This is a string", 40), collapse = "\n")
v <- gzip(txt)
writeBin(v, "test.txt.zip")
HTH
I think the command gzfile() may also do what you're looking for. Also note that in the upcoming version 2.10.0 there are some enhancements to compression functions that may be relevant. (see https://svn.r-project.org/R/trunk/NEWS -- the svn server may ask you to accept a certificate)
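For what it's worth, a minimal sketch with gzfile() (base R, so fully cross-platform; note it produces a gzip stream rather than a true multi-file zip archive, so check whether your downstream tools accept that):
# Write text through a gzip-compressed connection
con <- gzfile("test.txt.gz", "w")
writeLines(paste(rep("This is a string", 40), collapse = "\n"), con)
close(con)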