How to read an .arff file with R?

Is there any way to do that?
I'm new to R.

read.arff in package foreign reads data from Weka Attribute-Relation File Format (ARFF) files.
Update: there is a new package on CRAN:
farff: A Faster 'ARFF' File Reader and Writer
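As a minimal sketch of the foreign approach (the file contents and the names arff_path and weather are made up for illustration), the following writes a tiny ARFF file to a temporary path and reads it back with read.arff:

```r
library(foreign)

# Write a tiny ARFF file to a temporary location (illustrative data only).
arff_path <- tempfile(fileext = ".arff")
writeLines(c(
  "@relation weather",
  "@attribute temperature numeric",
  "@attribute outlook {sunny,rainy}",
  "@data",
  "21.5,sunny",
  "14.0,rainy",
  "18.2,sunny"
), arff_path)

# read.arff() returns a data frame; nominal attributes become factors.
weather <- read.arff(arff_path)
str(weather)
```

With a real file you would skip the writeLines() step and pass your own path to read.arff().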

In general the answer to questions like this can be found via the sos package, which accesses a full-text search of all the packages on CRAN.
install.packages("sos")
library("sos")
findFn("arff")
finds functions in the foreign (as noted above) and RWeka packages. Since foreign is a recommended package, it will be installed on your system by default. Hence you would have found the answer with
help.search("arff")
in the first place, without installing the sos package. sos is still worth having for times when the string you are searching for isn't in the metadata (title, keywords, alias, etc.), which is all that help.search searches, or not in a package you already have installed on your system (ditto). (Looking through the R Data Import/Export Manual, which also comes with your system, is generally useful but would not have found the answer to this question ...)
It might be useful to know about the RWeka version on the off chance that the version in foreign (which you should try first) fails for some reason.

Even though this question is already answered, I realize there is another noteworthy solution. Check the RWeka package, which lets you read and write ARFF files. It also provides a wrapper for Weka functions, so you can use Weka functionality without installing Weka itself (though it does install the .jar files). See also the documentation for read.arff.

If you only care about the data and not the relations, you can treat the header lines, which all start with @, as comments:
read.csv("data.arff", header=FALSE, comment.char = "@")
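A small sketch of that shortcut, assuming a made-up file: ARFF header lines (@relation, @attribute, @data) all start with @, so the example passes comment.char = "@" to make read.csv drop them. The names arff_path and raw are only for illustration.

```r
# Illustrative ARFF file written to a temporary path.
arff_path <- tempfile(fileext = ".arff")
writeLines(c(
  "@relation weather",
  "@attribute temperature numeric",
  "@attribute outlook {sunny,rainy}",
  "@data",
  "21.5,sunny",
  "14.0,rainy"
), arff_path)

# Treat "@" as the comment character so every header line is dropped.
# Caveats: ARFF "%" comment lines are not skipped, and the attribute
# names are lost (columns come back as V1, V2, ...).
raw <- read.csv(arff_path, header = FALSE, comment.char = "@")
print(raw)
```

Because the attribute declarations are thrown away, this only makes sense when you do not need column names or factor levels from the header.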

The easiest way to do it is to use the RWeka package, which has a read.arff() function that reads .arff files.
library(RWeka)
test=read.arff("../Test/test.arff")
Hope this helps.

Related

Restore a previous version of a file using the aws.s3 package in R

I am trying to write a function in R with the package aws.s3 that allows previous versions of a file (object) to be restored (from AWS S3's versioning feature). I haven't found a solution yet - but if anyone has one, or has done this before any help would be much appreciated!
I think a good start would be the ability to retrieve the versions of individual bucket objects using get_versions. I've tried get_versions(bucket="bucket/path_to_object") and get_versions(bucket="bucket", path = "path_to_object"), but I don't think either of these works.
It doesn't technically answer the question, but you can do it easily with the paws package. You can just do:
s3 <- paws::s3()
s3$list_object_versions(Bucket = "bucket", Prefix = prefix)
s3$get_object(Bucket = "bucket", Key = key, VersionId = vid)
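To actually restore an older version, one common S3 pattern is to copy that version back onto the same key, which makes it the current version. The helper below is a hypothetical sketch, not part of paws: restore_previous_version is a made-up name, s3 is assumed to be a client created with paws::s3(), and it assumes S3 lists the versions of a key newest first.

```r
# Hypothetical helper: copy the most recent non-current version of `key`
# back on top of the current one.
restore_previous_version <- function(s3, bucket, key) {
  resp <- s3$list_object_versions(Bucket = bucket, Prefix = key)

  # Prefix matching can return other keys, so keep exact matches only.
  versions <- Filter(function(v) identical(v$Key, key), resp$Versions)
  previous <- Filter(function(v) !isTRUE(v$IsLatest), versions)
  if (length(previous) == 0) {
    stop("No previous version found for ", key)
  }
  vid <- previous[[1]]$VersionId

  # Copying an old version onto the same key makes it the current version.
  # Keys with special characters would need URL encoding in CopySource.
  s3$copy_object(
    Bucket = bucket,
    Key = key,
    CopySource = paste0(bucket, "/", key, "?versionId=", vid)
  )
  invisible(vid)
}

# Usage (needs AWS credentials, so it is left commented out):
# s3 <- paws::s3()
# restore_previous_version(s3, "bucket", "path_to_object")
```

The function only ever adds a new version, so the version that was current before the call remains available in the bucket history.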

Including hpp files in an R package

I'm writing an R package making use of Rcpp to call functions written in C++ into the R code. Some of these functions and templates are written in files with a .hpp extension following the convention used by boost (and also discussed here).
This does not result in an error when building (R CMD build .) and checking (R CMD check --as-cran package.tar.gz) the package, but it returns the following warning:
Subdirectory ‘src’ contains:
file.hpp example.hpp
These are unlikely file names for src files
OK, this is not a big issue, but my concern is: why the warning? Is naming files *.hpp considered bad practice in the R community? Are there objective or community reasons why I should use *.cpp/*.h files instead of *.hpp for the templates?
I originally left this information as a comment, but realized it actually answers your question I think, so here goes:
As Dirk Eddelbuettel points out in the comments, when you have a question about an R Core Team policy on R extension packages, your best bet is to look through their excellent Writing R Extensions manual. This manual tells you almost anything you could ever need to know.
In your case specifically, you needed to look at Section 1.1.5, which explains that "[the R Core Team] recommend[s] using .h for headers" because (as they explain in footnote 18) "Using .hpp is not guaranteed to be portable."

CRAN submission - How should I document hidden functions in R?

Long story short:
My aim is to submit an R package developed with roxygen2 to CRAN, and I need to find some guidelines on writing and documenting hidden functions.
More details:
I am writing my first R package using roxygen2 in RStudio. I have documented all the functions I have written so far, so that my collaborators can easily understand their purpose before going into the details of the script. All the functions that are of no use to the user, but necessary to the package, are not exported in the namespace and are removed from the package manual/index (@keywords internal). At the same time, my collaborators can still read their documentation using the help in RStudio.
Eventually, I would like to remove the documentation of these "hidden functions" from the help because there is no reason to keep it. I was considering following a suggestion I found in other posts, namely changing #' to ## in the documentation created with roxygen2. However, I am not sure this is the correct procedure, or whether removing a function's documentation from the help and the manual is compatible with CRAN requirements.
Can anyone point me to some guidelines related to this issue, or has any experience to share?
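One option worth checking, offered as a sketch rather than an official CRAN guideline: roxygen2 has a @noRd tag that keeps the comment block in the source but generates no help page, while @keywords internal keeps the help page and only hides it from the index. As far as I know, R CMD check only complains about missing documentation for exported objects, so neither variant should be a problem for non-exported functions. The two functions below are invented for illustration.

```r
#' Rescale a numeric vector to the range 0 to 1
#'
#' Documented for developers, but kept out of the help index.
#'
#' @param x A numeric vector.
#' @return A numeric vector of the same length as `x`.
#' @keywords internal
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

#' Clamp values to the interval from 0 to 1
#'
#' The roxygen block stays in the source, but no .Rd file is generated.
#'
#' @param x A numeric vector.
#' @noRd
clamp01 <- function(x) {
  pmin(pmax(x, 0), 1)
}
```

Neither function has an @export tag, so both stay out of the namespace. Compared with switching #' to ##, @noRd keeps the structured tags readable in the source and makes it easy to turn the help page back on later.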

non standard file "data-raw" note on building/checking a package in R

I get this warning
Non-standard file/directory found at top level:
‘data-raw’
when building my package, even though creating this folder is the recommended way to prepare package data: http://r-pkgs.had.co.nz/data.html#data-sysdata
Any comments on that, or do I need a specific setting to get rid of this message?
When used, data-raw should be added to .Rbuildignore. As explained in the Data section of Hadley's R Packages book (also linked in the question):
Often, the data you include in data/ is a cleaned up version of raw data you’ve gathered from elsewhere. I highly recommend taking the time to include the code used to do this in the source version of your package. This will make it easy for you to update or reproduce your version of the data. I suggest that you put this code in data-raw/. You don’t need it in the bundled version of your package, so also add it to .Rbuildignore. Do all this in one step with:
usethis::use_data_raw()
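If you would rather do the .Rbuildignore part by hand, a rough base-R sketch follows. It uses a temporary directory as a stand-in for the package root (pkg_dir and ignore_file are illustrative names); in a real package you would point at your own top-level directory.

```r
# Stand-in for the package root; in practice this is your package directory.
pkg_dir <- tempfile("mypkg")
dir.create(file.path(pkg_dir, "data-raw"), recursive = TRUE)

# Each line of .Rbuildignore is a regular expression matched against
# file paths relative to the package root.
ignore_file <- file.path(pkg_dir, ".Rbuildignore")
cat("^data-raw$\n", file = ignore_file, append = TRUE)

readLines(ignore_file)
```

The anchors ^ and $ keep the pattern from accidentally excluding other paths that merely contain "data-raw".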

R - no documentation for .fill_short_gaps

I was looking at the R source code for the zoo package (many of whose functions are extremely useful). I noticed a function, .fill_short_gaps, that is used quite a lot, but I can't find any documentation for it in either the zoo source code or the base source code.
Is this an internal function? What is this function supposed to do?
It's an internal function. A checkin comment on version 661 of the source file says "Use base R coding style convention for internal non-exported functions: .fill_short_gaps() instead of fillShortGaps()."
I found the source on r-forge:
http://r-forge.r-project.org/scm/viewvc.php/pkg/zoo/R/na.approx.R?view=markup&root=zoo
.fill_short_gaps() is at the bottom of that file.
Since the function was renamed recently, you should make sure that all of the libraries that you're using that reference it are using a compatible version of zoo.
