Recently I am trying to add a Fortran function into an existing R package that contains C++ code and is build under Rcpp package. I have successfully added the Fortran function and build the R package. But when I try to run an example for the R package for different times, each time the function returns a different value, which a bit confusing since the Fortran function I added has no undeterministic parameter or contains self iteration. Also sometimes when I try to run the function, R crashes.
This is a part of the Fortran function that are in the file "src/k2s.f95"
REAL*8 FUNCTION k2(n1,n2,n3,vec1,length1,a1,vec2,length2,a2)
integer :: n1,n2,n3,length1,length2
integer, dimension(length1) :: vec1
double precision :: a1,a2
double precision, dimension(length2):: vec2
double precision :: P(n1+2)
...
k2 = -1.0
IF ((n1<1) .OR. (n2<1) .OR. (length2 .NE. n1+n2-1) .OR. (n3<1) .OR. (n3 > 3)) RETURN
k2 = -2.0
IF(MINVAL(vec1).LE.0) RETURN
IF(SUM(vec1).NE.(n1+n2)) RETURN
IF(MINVAL(vec2).LE.0) RETURN
...
END FUNCTION K2
And I have used this answer Integrate Fortran, C++ with R to construct the package. Therefore I have written the following C++ file under src dictionary. This is the content of C++ file "k2s2.cpp"
#include "Rcpp.h"
extern "C" {
double k2_(int *n1, int *n2, int *n3, int vec1[], int *length1, double *a1, double vec2[], int *length2, double *a2);
}
// [[Rcpp::export]]
double K2_fortran(int n1, int n2, int n3, Rcpp::IntegerVector vec1, double a1, Rcpp::NumericVector vec2, double a2)
{
int length1 = vec1.size();
int length2 = vec2.size();
double q = 0;
pval = k2_(&n1,&n2,&n3,vec1.begin(),&length1,&a1,vec2.begin(),&length2,&a2);
return q;
}
I think it should be fine. Then I build the package (with Rcpp automatically generates the file "src/RcppExports.cpp" and "R/RcppExports.R" to link the cpp function K2_fortran to R) and load all the functions, then I get the R function K2_fortran. And I run the following example.
K2_fortran(120, 150, 1, c(80,70,40,80), 0.1, rep(1,269), 1e-6)
And the result is -2. Then I rerun the code for the second time, I got the expected result 0.175. For the third time, I got 1(some other returned value when there is an error). Then I run it again, a fatal error occurs in R and closes the R.
If I reopen it and run this example 4 times again. The result will be again -2, 0.175, 1 and a fatal error. I am not sure what has happened.
Related
I am struggling with a probably minor problem while calling compiled ODEs to be solved
via the R package 'deSolve' and I seeking advice from more expert users.
Background
I have a couple of ODE systems to be solved with 'deSolve'. I have defined the ODEs in separate C++ functions (one for each model) I am calling through R in conjunction with 'Rcpp'. The initial values of the system change if the function takes input from another model (so basically to have a cascade).
This works quite nicely, however, for one model I have to set the initial parameters for t < 2. I've tried to do this in the C++ function, but it does not seem to work.
Running code example
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export("set_ODE")]]
SEXP set_ODE(double t, NumericVector state, NumericVector parameters) {
List dn(3);
double tau2 = parameters["tau2"];
double Ae2_4 = parameters["Ae2_4"];
double d2 = parameters["d2"];
double N2 = parameters["N2"];
double n2 = state["n2"];
double m4 = state["m4"];
double ne = state["ne"];
// change starting conditions for t < 2
if(t < 2) {
n2 = (n2 * m4) / N2;
m4 = n2;
ne = 0;
}
dn[0] = n2*d2 - ne*Ae2_4 - ne/tau2;
dn[1] = ne/tau2 - n2*d2;
dn[2] = -ne*Ae2_4;
return(Rcpp::List::create(dn));
}
/*** R
state <- c(ne = 10, n2 = 0, m4 = 0)
parameters <- c(N2 = 5e17, tau2 = 1e-8, Ae2_4 = 5e3, d2 = 0)
results <- deSolve::lsoda(
y = state,
times = 1:10,
func = set_ODE,
parms = parameters
)
print(results)
*/
The output reads (here only the first two rows):
time ne n2 m4
1 1 1.000000e+01 0.000000e+00 0.000000e+00
2 2 1.000000e+01 2.169236e-07 -1.084618e-11
Just in case: How to run this code example?
My example was tested using RStudio:
Copy the code into a file with the ending *.cpp
Click on the 'Source' button (or <shift> + <cmd> + <s>)
It should work also without RStudio present, but the packages 'Rcpp' and 'deSolve' must be installed and to compile the code it needs Rtools on Windows, GNU compilers on Linux and Xcode on macOS.
Problem
From my understanding, ne should be 0 for time = 1 (or t < 2). Unfortunately, the solver does not seem to consider what I have provided in the C++ function, except for the ODEs. If I change state in R to another value, however, it works. Somehow the if-condition I have defined in C++ is ignored, but I don't understand why and how I can calculate the initial values in C++ instead of R.
I was able to reproduce your code. It seems to me that this is indeed elegant, even if it does not leverage the full power of the solver. The reason is, that Rcpp creates an interface to the compiled model via an ordinary R function. So back-calls from the slovers (e.g. lsoda) to R are necessary in each time step. Such back-calls are not for the "plain" C/Fortran interface. Here communication between solver and model takes place at the machine code level.
With this informational, I can see that we don't need to expect initialization issues at the C/C++ level, but it looks like a typical case. As the model function is simply the derivative of the model (and only this). The integration is done by the solver "from outside". It calls the model always with the actual integration state, derived from the time step before (roughly speaking). Therefore, it is not possible to force the state variables to fixed values within the model function.
However, there are several options how to resolve this:
chaining of lsoda calls
use of events
The following shows a chained approach, but I am not yet sure about the initialization of the parameters in the first time segment, so may only be part of the solution.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export("set_ODE")]]
SEXP set_ODE(double t, NumericVector state, NumericVector parameters) {
List dn(3);
double tau2 = parameters["tau2"];
double Ae2_4 = parameters["Ae2_4"];
double d2 = parameters["d2"];
double N2 = parameters["N2"];
double n2 = state["n2"];
double m4 = state["m4"];
double ne = state["ne"];
dn[0] = n2*d2 - ne*Ae2_4 - ne/tau2;
dn[1] = ne/tau2 - n2*d2;
dn[2] = -ne*Ae2_4;
return(Rcpp::List::create(dn));
}
/*** R
state <- c(ne = 10, n2 = 0, m4 = 0)
parameters <- c(N2 = 5e17, tau2 = 1e-8, Ae2_4 = 5e3, d2 = 0)
## the following is not yet clear to me !!!
## especially as it is essentially zero
y1 <- c(ne = 0,
n2 = unname(state["n2"] * state["m4"]/parameters["N2"]),
m4 = unname(state["n2"]))
results1 <- deSolve::lsoda(
y = y,
times = 1:2,
func = set_ODE,
parms = parameters
)
## last time step, except "time" column
y2 <- results1[nrow(results1), -1]
results2 <- deSolve::lsoda(
y = y2,
times = 2:10,
func = set_ODE,
parms = parameters
)
## omit 1st time step in results2
results <- rbind(results1, results2[-1, ])
print(results)
*/
The code has also another potential problem as the parameters span several magnitudes from 1e-8 to 1e17. This can lead to numerical issues, as the relative precision of most software, including R covers only 16 orders of magnitude. Can this be the reason, why the results are all zero? Here it may help to re-scale the model.
I am trying to generate a set of uniformly distributed numbers in R. I know that we can use the function "runif" in R to do the same. But I really want to understand the idea behind how this function would have been developed. In the sense how does the code work for the function "runif". So, in a nutshell, I want to create my own function which can do the same task as the "runif"
Ultimately, runif calls a pseudorandom number generator. One of the simpler ones can be found here defined in C within the R code base and should be straightforward to emulate
static unsigned int I1=1234, I2=5678;
void set_seed(unsigned int i1, unsigned int i2)
{
I1 = i1; I2 = i2;
}
void get_seed(unsigned int *i1, unsigned int *i2)
{
*i1 = I1; *i2 = I2;
}
double unif_rand(void)
{
I1= 36969*(I1 & 0177777) + (I1>>16);
I2= 18000*(I2 & 0177777) + (I2>>16);
return ((I1 << 16)^(I2 & 0177777)) * 2.328306437080797e-10; /* in [0,1) */
}
So effectively this takes the initial integer seed values, shuffles them bitwise, then recasts them as double precision floating point numbers via multiplying by a small constant that normalises the doubles into the [0, 1) range.
I've written a function in Rcpp and compiled it with inline. Now, I want to run it in parallel on different cores, but I'm getting a strange error. Here's a minimal example, where the function funCPP1 can be compiled and runs well by itself, but cannot be called by snow's clusterCall function. The function runs well as a single process, but gives the following error when ran in parallel:
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
2 nodes produced errors; first error: NULL value passed as symbol address
And here is some code:
## Load and compile
library(inline)
library(Rcpp)
library(snow)
src1 <- '
Rcpp::NumericMatrix xbem(xbe);
int nrows = xbem.nrow();
Rcpp::NumericVector gv(g);
for (int i = 1; i < nrows; i++) {
xbem(i,_) = xbem(i-1,_) * gv[0] + xbem(i,_);
}
return xbem;
'
funCPP1 <- cxxfunction(signature(xbe = "numeric", g="numeric"),body = src1, plugin="Rcpp")
## Single process
A <- matrix(rnorm(400), 20,20)
funCPP1(A, 0.5)
## Parallel
cl <- makeCluster(2, type = "SOCK")
clusterExport(cl, 'funCPP1')
clusterCall(cl, funCPP1, A, 0.5)
Think it through -- what does inline do? It creates a C/C++ function for you, then compiles and links it into a dynamically-loadable shared library. Where does that one sit? In R's temp directory.
So you tried the right thing by shipping the R frontend calling that shared library to the other process (which has another temp directory !!), but that does not get the dll / so file there.
Hence the advice is to create a local package, install it and have both snow processes load and call it.
(And as always: better quality answers may be had on the rcpp-devel list which is read by more Rcpp constributors than SO is.)
Old question, but I stumbled across it while looking through the top Rcpp tags so maybe this answer will be of use still.
I think Dirk's answer is proper when the code you've written is fully de-bugged and does what you want, but it can be a hassle to write a new package for such as small piece of code like in the example. What you can do instead is export the code block, export a "helper" function that compiles source code and run the helper. That'll make the CXX function available, then use another helper function to call it. For instance:
# Snow must still be installed, but this functionality is now in "parallel" which ships with base r.
library(parallel)
# Keep your source as an object
src1 <- '
Rcpp::NumericMatrix xbem(xbe);
int nrows = xbem.nrow();
Rcpp::NumericVector gv(g);
for (int i = 1; i < nrows; i++) {
xbem(i,_) = xbem(i-1,_) * gv[0] + xbem(i,_);
}
return xbem;
'
# Save the signature
sig <- signature(xbe = "numeric", g="numeric")
# make a function that compiles the source, then assigns the compiled function
# to the global environment
c.inline <- function(name, sig, src){
library(Rcpp)
funCXX <- inline::cxxfunction(sig = sig, body = src, plugin="Rcpp")
assign(name, funCXX, envir=.GlobalEnv)
}
# and the function which retrieves and calls this newly-compiled function
c.namecall <- function(name,...){
funCXX <- get(name)
funCXX(...)
}
# Keep your example matrix
A <- matrix(rnorm(400), 20,20)
# What are we calling the compiled funciton?
fxname <- "TestCXX"
## Parallel
cl <- makeCluster(2, type = "PSOCK")
# Export all the pieces
clusterExport(cl, c("src1","c.inline","A","fxname"))
# Call the compiler function
clusterCall(cl, c.inline, name=fxname, sig=sig, src=src1)
# Notice how the function now named "TestCXX" is available in the environment
# of every node?
clusterCall(cl, ls, envir=.GlobalEnv)
# Call the function through our wrapper
clusterCall(cl, c.namecall, name=fxname, A, 0.5)
# Works with my testing
I've written a package ctools (shameless self-promotion) which wraps up a lot of the functionality that is in the parallel and Rhpc packages for cluster computing, both with PSOCK and MPI. I already have a function called "c.sourceCpp" which calls "Rcpp::sourceCpp" on every node in much the same way as above. I'm going to add in a "c.inlineCpp" which does the above now that I see the usefulness of it.
Edit:
In light of Coatless' comments, the Rcpp::cppFunction() in fact negates the need for the c.inline helper here, though the c.namecall is still needed.
src2 <- '
NumericMatrix TestCpp(NumericMatrix xbe, int g){
NumericMatrix xbem(xbe);
int nrows = xbem.nrow();
NumericVector gv(g);
for (int i = 1; i < nrows; i++) {
xbem(i,_) = xbem(i-1,_) * gv[0] + xbem(i,_);
}
return xbem;
}
'
clusterCall(cl, Rcpp::cppFunction, code=src2, env=.GlobalEnv)
# Call the function through our wrapper
clusterCall(cl, c.namecall, name="TestCpp", A, 0.5)
I resolved it by sourcing on each cluster cluster node an R file with the wanted C inline function:
clusterEvalQ(cl,
{
library(inline)
invisible(source("your_C_func.R"))
})
And your file your_C_func.R should contain the C function definition:
c_func <- cfunction(...)
I have an object of class big.matrix in R with dimension 778844 x 2. The values are all integers (kilometres). My objective is to calculate the Euclidean distance matrix using the big.matrix and have as a result an object of class big.matrix. I would like to know if there is an optimal way of doing that.
The reason for my choice of using the class big.matrix is memory limitation. I could transform my big.matrix to an object of class matrix and calculate the Euclidean distance matrix using dist(). However, dist() would return an object of size that would not be allocated in the memory.
Edit
The following answer was given by John W. Emerson, author and maintainer of the bigmemory package:
You could use big algebra I expect, but this would also be a very nice use case for Rcpp via sourceCpp(), and very short and easy. But in short, we don't even attempt to provide high-level features (other than the basics which we implemented as proof-of-concept). No single algorithm could cover all use cases once you start talking out-of-memory big.
Here is a way using RcppArmadillo. Much of this is very similar to the RcppGallery example. This will return a big.matrix with the associated pairwise (by row) euclidean distances. I like to wrap my big.matrix functions in a wrapper function to create a cleaner syntax (i.e. avoid the #address and other initializations.
Note - as we are using bigmemory (and therefore concerned with RAM usage) I have this example returned the N-1 x N-1 matrix of only lower triangular elements. You could modify this but this is what I threw together.
euc_dist.cpp
// To enable the functionality provided by Armadillo's various macros,
// simply include them before you include the RcppArmadillo headers.
#define ARMA_NO_DEBUG
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo, BH, bigmemory)]]
using namespace Rcpp;
using namespace arma;
// The following header file provides the definitions for the BigMatrix
// object
#include <bigmemory/BigMatrix.h>
// C++11 plugin
// [[Rcpp::plugins(cpp11)]]
template <typename T>
void BigArmaEuclidean(const Mat<T>& inBigMat, Mat<T> outBigMat) {
int W = inBigMat.n_rows;
for(int i = 0; i < W - 1; i++){
for(int j=i+1; j < W; j++){
outBigMat(j-1,i) = sqrt(sum(pow((inBigMat.row(i) - inBigMat.row(j)),2)));
}
}
}
// [[Rcpp::export]]
void BigArmaEuc(SEXP pInBigMat, SEXP pOutBigMat) {
// First we tell Rcpp that the object we've been given is an external
// pointer.
XPtr<BigMatrix> xpMat(pInBigMat);
XPtr<BigMatrix> xpOutMat(pOutBigMat);
int type = xpMat->matrix_type();
switch(type) {
case 1:
BigArmaEuclidean(
arma::Mat<char>((char *)xpMat->matrix(), xpMat->nrow(), xpMat->ncol(), false),
arma::Mat<char>((char *)xpOutMat->matrix(), xpOutMat->nrow(), xpOutMat->ncol(), false)
);
return;
case 2:
BigArmaEuclidean(
arma::Mat<short>((short *)xpMat->matrix(), xpMat->nrow(), xpMat->ncol(), false),
arma::Mat<short>((short *)xpOutMat->matrix(), xpOutMat->nrow(), xpOutMat->ncol(), false)
);
return;
case 4:
BigArmaEuclidean(
arma::Mat<int>((int *)xpMat->matrix(), xpMat->nrow(), xpMat->ncol(), false),
arma::Mat<int>((int *)xpOutMat->matrix(), xpOutMat->nrow(), xpOutMat->ncol(), false)
);
return;
case 8:
BigArmaEuclidean(
arma::Mat<double>((double *)xpMat->matrix(), xpMat->nrow(), xpMat->ncol(), false),
arma::Mat<double>((double *)xpOutMat->matrix(), xpOutMat->nrow(), xpOutMat->ncol(), false)
);
return;
default:
// We should never get here, but it resolves compiler warnings.
throw Rcpp::exception("Undefined type for provided big.matrix");
}
}
My little wrapper
bigMatrixEuc <- function(bigMat){
zeros <- big.matrix(nrow = nrow(bigMat)-1,
ncol = nrow(bigMat)-1,
init = 0,
type = typeof(bigMat))
BigArmaEuc(bigMat#address, zeros#address)
return(zeros)
}
The test
library(Rcpp)
sourceCpp("euc_dist.cpp")
library(bigmemory)
set.seed(123)
mat <- matrix(rnorm(16), 4)
bm <- as.big.matrix(mat)
# Call new euclidean function
bm_out <- bigMatrixEuc(bm)[]
# pull out the matrix elements for out purposes
distMat <- as.matrix(dist(mat))
distMat[upper.tri(distMat, diag=TRUE)] <- 0
distMat <- distMat[2:4, 1:3]
# check if identical
all.equal(bm_out, distMat, check.attributes = FALSE)
[1] TRUE
require(inline)
func <- cxxfunction(, 'return Rcpp::wrap( qnorm(0.95,0.0,1.0) );' ,plugin="Rcpp")
error: no matching function for call to ‘qnorm5(double, int, int)’
require(inline)
func <- cxxfunction(, 'return Rcpp::wrap( qnorm(0.95, 0.0, 1.0, 1, 0) );'
,plugin="Rcpp")
error: no matching function for call to ‘qnorm5(double, double, double, int, int)’
require(inline)
code <-'
double a = qnorm(0.95, 0.0, 1.0);
return Rcpp::wrap( a );
'
func <-
cxxfunction(, code ,plugin="Rcpp")
func()
error: no matching function for call to ‘qnorm5(double, double, double)’
How can I use qnorm on Rcpp?
By making the mean and sd arguments double as the error message shows -- so try this is a full example
library(inline)
f <- cxxfunction(signature(xs="numeric", plugin="Rcpp", body='
Rcpp::NumericVector x(xs);
return Rcpp::wrap(Rcpp::qnorm(x, 1.0, 0.0));
')
and have a look at the examples and unit tests -- I just looked this up in the unit test file runit.stats.R which has a lot of test cases for these statistical 'Rcpp sugar' functions.
Edit on 2012-11-14: With Rcpp 0.10.0 released today, you can call do the signature R::qnorm(double, double, double, int, int) if you want to use C-style code written against Rmath.h. Rcpp sugar still gives you vectorised versions.