I like the GBM package in R.
I can't get R's memory management to work with my particular combination of machine, data set, and task, for reasons that have been covered elsewhere and should be considered off topic for this question.
I would like to "rip" the GBM algorithm out of R and rebuild it as standalone code.
Unfortunately there is no Makefile in the package tarball (or indeed any R package tarball I've seen). Is there a place I can look for straightforward Makefiles of R packages? Or do I really have to go way back to ground zero and write my own Makefile for the long painful journey ahead?
As Henry Spencer quipped: "Those who do not understand Unix are doomed to reinvent it, poorly."
R packages do not ship a Makefile because R generates one on the fly when building the package, combining the defaults of the current R installation with the settings in the package, which are typically supplied via a Makevars file.
Run the usual command R CMD INSTALL foo_1.2.3.tar.gz and you will see the effect of the generated Makefile as the build proceeds. Worst case, you can always start by copying and pasting the compiler invocations it prints.
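For a sense of what the package itself contributes, the compilation settings usually live in a short src/Makevars rather than a full Makefile. A minimal sketch, using standard R make variables but with placeholder values:

PKG_CPPFLAGS = -I../inst/include
PKG_LIBS = -lm

Running R CMD INSTALL --preclean foo_1.2.3.tar.gz then prints every compiler and linker line R derives from these settings, and those lines are a reasonable starting point for a standalone Makefile.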
You could also take a look at CMake which can quite easily create makefiles for you. It took me minimal time to get it working for a project of mine.
I want to use Kernel LDA in Julia 1.6.1.
I found the repo.
https://github.com/remusao/LDA.jl
I read README.md, and I typed
] add LDA
But it does not work; I get:
The following package names could not be resolved:
LDA (not found in project, manifest or registry)
I also tried all of the following commands, and they still do not work:
add https://github.com/remusao/LDA.jl
add https://github.com/remusao/LDA.jl.git
Pkg.clone("https://github.com/remusao/LDA.jl.git")
What is the problem? How can I install LDA.jl in my Julia installation?
The package you have linked, https://github.com/remusao/LDA.jl, has had no commits in over eight years. Among other things, it lacks a Project.toml file, which is necessary for installation in modern Julia.
Julia was only about a year old and at version 0.2 back in 2013, when this package last saw maintenance. The language has changed drastically since then, so the code in this package would likely no longer work even if you could get it to install.
If you can't find any alternative to this package for your work, forking it and upgrading it to work with modern Julia would be a nice intermediate-beginner project.
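If you do fork it, the missing Project.toml is largely boilerplate. A minimal sketch, where the UUID is a placeholder you would generate yourself (e.g. with using UUIDs; uuid4()) and the [deps] section lists whatever the code actually imports:

name = "LDA"
uuid = "00000000-0000-0000-0000-000000000000"
version = "0.1.0"

[deps]

Once your fork has that file (and the code runs on a current Julia), ] add https://github.com/yourname/LDA.jl should resolve, where yourname is your GitHub account.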
I am in the process of developing an R package with external source code that takes a long time to compile. While compilation time isn't a big problem for a one-off installation, I have to routinely reinstall the package to test new additions. Is it possible to prevent re-compiling the source code if there haven't been any changes to it?
I don't necessarily need this to be automated, but I can't figure out a manual solution either. As my source code is in Rust, the following serves as the most representative example I have (note that it requires Rust cargo to be installed):
git clone https://github.com/r-rust/hellorust
Rscript -e "devtools::install('hellorust', quick = TRUE)"
When I run the above, I see that the hellorust.so file has been created in the src directory, but how do I make devtools::install() use this file rather than recompile everything? It doesn't seem like devtools::install(quick = TRUE) or devtools::install(build = FALSE) are meant for this...
Alternatively, is it possible to achieve the desired behavior on the Rust side of things? I don't understand why cargo would recompile everything if there haven't been any changes and the target directory is still there. That said, I'm quite new to Rust and compiled languages in general so my understanding of the broader concepts involved here is unfortunately quite limited...
I would also be interested to learn if there is a better way to test R packages during development than manually reinstalling them.
Based on the comments by r2evans, the final answer seems to be that this isn't what devtools::install is for.
As per the devtools documentation, there are three main tools for "frequent development tasks":
load_all
document
test
Of these, load_all "simulates installing and reloading your package, loading R code in R/, compiled shared objects in src/ and data files in data/". By default, load_all() will not recompile source code in src/ (unless the recompile flag is set to TRUE).
So the answer is to use load_all as opposed to install during package development and manually control when to compile the source code using something like devtools::compile_dll.
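A minimal development loop along those lines, assuming the package source sits in the hellorust directory from the question, might look like this in R:

devtools::compile_dll("hellorust")   # rebuild the shared object in src/ only when the native code changed
devtools::load_all("hellorust")      # load R code plus the existing .so without a full reinstall

A full devtools::install() is then only needed when you want to test the package as it will actually be installed.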
I can only find information on how to install a ready-made R extension package, but nowhere is it mentioned which commands a developer of an extension package has to use during daily development. I am using Rcpp and I am on Windows.
If this were a typical C++ project, it would go like this:
edit
make # oops, typo
edit # fix typo
make # oops, forgot an #include
edit
make # good; updates header dependencies for subsequent 'make' automatically
./fooreader # test it
make install # only now I'm ready
Which commands do I need for daily development of an Rcpp package project?
I've created a skeleton project using these commands from the R command line:
library(Rcpp)
Rcpp.package.skeleton("FooReader", example_code=FALSE,
    author="My Name", email="my.email@example.com")
This created 3 files:
DESCRIPTION
NAMESPACE
man/FooReader-package.Rd
Now I dropped source code into
src/readfoo.cpp
with these contents:
#include <Rcpp.h>
#error here
I know I can run this from the R command line:
Rcpp::sourceCpp("D:/Projects/FooReader/src/readfoo.cpp")
(this does run the compiler and indicates the #error).
But I want to develop a package ultimately.
There is no universal answer for everybody, I guess.
For some people, RStudio is everything, and with some reason. One can use the package creation facility to create an Rcpp package, then edit and just hit the buttons (or keyboard shortcuts) to compile and re-load and test.
I also work a lot in a shell, so I do a fair amount of editing in Emacs/ESS along with R CMD INSTALL (where, thanks to ccache, recompilation of unchanged code is immediate), plus command-line use via r from the littler package -- this lets me write compact expressions that load the new package and evaluate something, e.g. r -lnewpackage -e'someFunc(somearg)' to test newpackage::someFunc() with somearg.
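For the ccache part, one common setup (not the only one) is to point R's compiler variables at ccache in ~/.R/Makevars; the compiler names below assume a gcc toolchain and are just an example:

CC = ccache gcc
CXX = ccache g++
CXX11 = ccache g++

With that in place, repeated R CMD INSTALL runs only pay the full compile cost for files that actually changed.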
You can also launch the build and test from Emacs. As I said, it all depends.
Both of those approaches are for packages, where I do real work. When I just test something in a single file, I do that in one Emacs buffer and run sourceCpp() in an R session in another buffer of the same Emacs. Or sometimes I edit in Emacs and run sourceCpp() in RStudio.
There is no one answer. Find what works for you.
Also, the first part of your question describes the initial setup of a package. That is not part of the edit/compile/link/test cycle, as it is a one-off. And for that, too, we have different approaches, many of which have been discussed here.
Edit: The other main misunderstanding in your question is that once you have a package you generally do not use sourceCpp() anymore.
In order to test an R package, it has to be installed into a (temporary) library such that it can be attached to a running R process. So you will typically need:
R CMD build . to build package_version.tar.gz
R CMD check <package_version.tar.gz> to test your package, including tests placed into the tests folder
R CMD INSTALL <package_version.tar.gz> to install it into a library
After that you can attach the package and test it. Quite often I try to use a more TDD approach, which means I do not have to INSTALL the package. Running the unit tests (e.g. via R CMD check) is enough.
All that is independent of Rcpp. For a package using Rcpp you need to call Rcpp::compileAttributes() before these steps, e.g. with Rscript -e 'Rcpp::compileAttributes()'.
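Putting it together for the FooReader package from the question, one typical cycle (the version number is just a placeholder) would be:

Rscript -e 'Rcpp::compileAttributes()'
R CMD build .
R CMD check FooReader_1.0.tar.gz
R CMD INSTALL FooReader_1.0.tar.gz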
If you use RStudio for package development, it offers a lot of automation via the devtools package. I still find it useful to know what goes on under the hood, but that is by no means required.
I would like to share my R package but keep the source code until after an article is published. If I compile a package using R CMD INSTALL --build, is there any way for an end user to read the C source code?
According to p 44 of R News 2006-4,
In order to access the sources of compiled code (i.e., C, C++, or Fortran), it is not sufficient to have the binary version of R or a contributed package installed.
I would be satisfied with this knowledge (indeed, I would prefer to release the source), but I need to assuage the fears of my collaborators.
My primary question is to confirm: if I distribute a binary created by R CMD INSTALL --build, will the C source be inaccessible?
Update: it is not very clear to me why this question has received so many down votes (4 at this point). A downvote indicates "This question has not shown any research effort; it is unclear or not useful". I am only asking about native R functionality, not trying to promote any nefarious intent.
If the .c source files aren't in the distributed archive file (a .tar.gz for Linux, maybe a .zip for Windows), then no, you can't get the source. I just did a quick test with a skeletal package and a single foo.c file, and it's not there for me, just a compiled foo.so file.
Unless you've used Rcpp and put the C code into inline R functions, of course.
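If it helps reassure your collaborators, you can inspect the built binary archive directly. A quick check along these lines, where the archive name is whatever R CMD INSTALL --build produces on your platform:

R CMD INSTALL --build FooReader
tar -tzf FooReader_1.0_R_x86_64-pc-linux-gnu.tar.gz | grep '\.c$'    # should print nothing
tar -tzf FooReader_1.0_R_x86_64-pc-linux-gnu.tar.gz | grep '\.so$'   # only the compiled object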
If you distribute only compiled binaries, the source code itself is not accessible. The only way to get some idea of it is to disassemble them. Of course, the contents of your header files get compiled in as well.
I have a software system comprising 30+ open-source packages, most of them using the GNU Autotools suite.
Are there tools to automatically generate package-to-package dependency graph? I.e. I'd like to see something like gst-plugins-good -> gst-plugins-base -> gstreamer -> glib.
I don't think so, but you could probably whip something together with this knowledge:
Scan the file named either configure.ac or configure.in in the package's root directory.
Look for a string of the form PKG_CHECK_MODULES([...],[...]...)
The second argument of that macro consists of package requirements of the form package or package >= version separated by whitespace.
The requirement string might not be the same as the package tarball name; a tarball that contains package.pc or package.pc.in provides the package package.
This only works for dependencies that use pkg-config. Some don't and you'll need to keep track of those dependencies by hand.
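A rough sketch of such a scan over a directory of unpacked source trees, assuming each package sits in its own subdirectory (it will miss PKG_CHECK_MODULES calls that span several lines, so treat the output as a starting point):

for f in */configure.ac */configure.in; do
  [ -f "$f" ] || continue
  echo "== $f"
  grep -o 'PKG_CHECK_MODULES([^)]*)' "$f"
done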
Probably not, because this is a hard problem. If there were only one way to build a package, it might not be too bad, but in general this isn't the case. You have the --enable-foo and --with-foo options that you can pass to configure. Those are sometimes package-dependent as well, requiring yet more packages. Most Linux distros (I think, but am not completely sure) maintain these sorts of dependency lists for yum or zypper or apt or whatever the package manager is by hand, and only one layer deep, leaving it up to the package manager to traverse the graph. The packages for the distro are only built one way. It's not unusual for these lists to be broken, either.