Rcpp: Save compiled function as Robj - r

If I define a function in R, I can save the function object using the save function. Then I can load that function object using the load function and use it directly. However, if I have a rcpp function, and if I try to save the compiled version and load it back to the memory, I can no longer use that function object directly. Is this even possible? The reason I ask is because it takes a while to compile the function, and if there is a way to avoid that cost every time I launch an R environment, that will be great. Thanks!

No, in general you cannot serialize (and hence save) a function compiled with cxxfunction() or sourceCpp(). You need to freshly compile it, unless you place it in a package. Which is why packages are the way to go to really install your compiled code beyond quick experimentation.

Related

Self-written R package does not find its own function

I created a package with some functions which are helpful at my company. Recently, I restructered the package such that there are helper functions which need not to be accessible for everyone, but are called internally from other (exported) functions of the package. These helper functions are not exported to the namespace (no #' #export in the respective .R files).
Now, when I call one of the "major" (exported) functions, I get the error message (no real function names):
Error in major_function() : could not find function "helper_function"
Im fairly new in building packages, but from what I understood so far (from https://cran.r-project.org/web/packages/roxygen2/vignettes/namespace.html), it should neither be necessary to export the helper functions, nor to add #' importFrom my_package helper_function to the .R file of the major function.
When I tried this, it actually produced errors when checking the package. I also tried to call the helper functions with my_package:::helper_function, but this lead to the note that it should almost never be necessary to call functions from the same package like this.
Maybe useful information:
The error occurs only when I call a major_function_1 which internally calls major_function_2 which calls a helper_function.
I think there is more to your problem than what you state. As long as all your functions are defined in the same namespace (this also means that all your functions need to live in .R files in the same folder), the calling function should find the helper-functions accordingly.
I suspect you have your helper functions nested in some way, and that is causing the problem.
I recommend to recheck your namespace structure, or post a simplistic outline of your package here.
Another reason that could come to mind, is that you do not export your 'mayor_function2' in your NAMESPACE-file in your package root (maybe you have not recompiled the Roxygen documentation generating this file), and additionally have a local shadow of the the calling function 'mayor_function1'. Try to check this and rerun from a clean compile.

how to make a function available at start up in R

I have a user defined function in R
blah=function(a,b){
something with a and b
}
is it possile to put this somewhere so that I do not need to remember to load in the workspace every time I start up R? Similar to a built in function like
summary(); t.test(); max(); sd()
You can put the function into your .rprofile file.
However, be very careful with what you put there, since it essentially makes your code non-reproducible — it now depends on your .rprofile:
Let’s say you have an R code file performing some analysis, and the code uses the function blah. Executing the code on any other system will fail due to the non-existence of the blah function.
As a consequence, this file should only contain system-specific setup. Don’t define helper functions in there — or if you do, make them defined only in interactive sessions, so that you have a clear environment when R is running a non-interactive script:
if (interactive()) {
# Helper functions go here.
}
And if you find yourself using the same helper functions over and over again, bundle them into packages (or modules) and reuse those.

How to call R script from another R script, both in same package?

I'm building a package that uses two main functions. One of the functions model.R requires a special type of simulation sim.R and a way to set up the results in a table table.R
In a sharable package, how do I call both the sim.R and table.R files from within model.R? I've tried source("sim.R") and source("R/sim.R") but that call doesn't work from within the package. Any ideas?
Should I just copy and paste the codes from sim.R and table.R into the model.R script instead?
Edit:
I have all the scripts in the R directory, the DESCRIPTION and NAMESPACE files are all set. I just have multiple scripts in the R directory. ~R/ has premodel.R model.R sim.R and table.R. I need the model.R script to use both sim.R and table.R functions... located in the same directory in the package (e.g. ~R/).
To elaborate on joran's point, when you build a package you don't need to source functions.
For example, imagine I want to make a package named TEST. I will begin by generating a directory (i.e. folder) named TEST. Within TEST I will create another folder name R, in that folder I will include all R script(s) containing the different functions in the package.
At a minimum you need to also include a DESCRIPTION and NAMESPACE file. A man (for help files) and tests (for unit tests) are also nice to include.
Making a package is pretty easy. Here is a blog with a straightforward introduction: http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
As others have pointed out you don't have to source R files in a package. The package loading mechanism will take care of losing the namespace and making all exported functions available. So usually you don't have to worry about any of this.
There are exceptions however. If you have multiple files with R code situations can arise where the order in which these files are processed matters. Often it doesn't matter or the default order used by R happens to be fine. If you find that there are some dependencies within your package that aren't resolved properly you may be faced with a situation where a custom processing order for the R files is required. The DESCRIPTION file offers the optional Collate field for this purpose. Simply list all your R files in the order they should be processed to satisfy the dependencies.
If all your files are in R directory, any function will be in memory after you do a package build or Load_All.
You may have issues if you have code in files that is not in a function tho.
R loads files in alphabetical order.
Usually, this is not a problem, because functions are evaluated when they are called for execution, not at loading time (id. a function can refer another function not yet defined, even in the same file).
But, if you have code outside a function in model.R, this code will be executed immediately at time of file loading, and your package build will fail usually with a
ERROR: lazy loading failed for package 'yourPackageName'
If this is the case, wrap the sparse code of model.R into a function so you can call it later, when the package has fully loaded, external library too.
If this piece of code is there for initialize some value, consider to use_data() to have R take care of load data into the environment for you.
If this piece of code is just interactive code written to test and implement the package itself, you should consider to put it elsewhere or wrap it to a function anyway.
if you really need that code to be executed at loading time or really have dependency to solve, then you must add the collate line into DESCRIPTION file, as already stated by Peter Humburg, to force R to load files order.
Roxygen2 can help you, put before your code
#' #include sim.R table.R
call roxygenize(), and collate line will be generate for you into the DESCRIPTION file.
But even doing that, external library you may depend are not yet loaded by the package, leading to failure again at build time.
In conclusion, you'd better don't leave code outside functions in a .R file if it's located inside a package.
Since you're building a package, the reason why you're having trouble accessing the other functions in your /R directory is because you need to first:
library(devtools)
document()
from within the working directory of your package. Now each function in your package should be accessible to any other function. Then, to finish up, do:
build()
install()
although it should be noted that a simple document() call will already be sufficient to solve your problem.
Make your functions global by defining them with <<- instead of <- and they will become available to any other script running in that environment.

How to permanently "source" an R script

I am using an R script that someone sent me. This is not a big package but just one function. To use it, I source the file.
However,whenever I restart R, I must type in source("directory") again to use the function.
Is there any way I can avoid this and set that function permanently?
I think that you just want to add source("directory") to your .Rprofile so that the function gets loaded at startup.
See this SO question for some handy things you can do with .Rprofile.

Importing Functions into Current Namespace

Let's say I have an R source file comprised of some functions, doesn't matter what they are, e.g.,
fnx = function(x){(x - mean(x))/sd(x)}
I would like to be able to access them in my current R session (without typing them in obviously). It would be nice if library("/path/to/file/my_fn_lib1.r") worked, as "import" works in Python, but it doesn't. One obvious solution is to create an R Package, but i want to avoid that overhead just to import a few functions.
Use the source() command. In your case:
source("/path/to/file/my_fn_lib1.r")
Incidentally, creating a package is fairly easy with the package.skeleton() function (if you're planning to reuse this frequently).

Resources