I'm trying to convert an R function to Rcpp to speed things up, since it involves a for loop. Along the way I need to calculate the mean of the entries of a vector, which in R would be as simple as mean(x), but it does not appear to work in Rcpp: I get 0 0 as the result every time.
My code looks like this:
cppFunction(
"NumericVector fun(int n, double lambda, ...) {
...
NumericVector y = rpois(n, lambda);
NumericVector w = dpois(y, lambda);
NumericVector x = w*y;
double z = mean(x);
return z;
}")
Edit: So I thought my error was due to what was mentioned above, and the return of a single double of z is just me trying to isolate the issue. The following code however still does not work:
cppFunction(
"NumericVector zstat(int n, double lambda, double lambda0, int m) {
NumericVector z(m);
for (int i=1; i<m; ++i){
NumericVector y = rpois(n, lambda0);
NumericVector w = dpois(y, lambda)/dpois(y,lambda0);
double x = mean(w*y);
z[i] = (x-2)/(sqrt(2/n));
}
return z;
}")
The return type of your function is NumericVector, but Rcpp::mean returns a scalar value convertible to double. Fixing this will correct the issue:
library(Rcpp)
cppFunction(
"double fun(int n, double lambda) {
NumericVector y = rpois(n, lambda);
NumericVector w = dpois(y, lambda);
NumericVector x = w*y;
double z = mean(x);
return z;
}")
set.seed(123)
fun(50, 1.5)
# [1] 0.2992908
What happens in your code is that, because NumericVector was specified as the return type, this constructor is called,
template <typename T>
Vector(T size,
typename Rcpp::traits::enable_if<traits::is_arithmetic<T>::value, void>::type* = 0) {
Storage::set__( Rf_allocVector( RTYPE, size) ) ;
init() ;
}
which casts the double to an integral type and creates a NumericVector with length equal to the truncated value of the double. To demonstrate,
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector from_double(double x) {
return x;
}
/*** R
sapply(0.5:4.5, from_double)
# [[1]]
# numeric(0)
#
# [[2]]
# [1] 0
#
# [[3]]
# [1] 0 0
#
# [[4]]
# [1] 0 0 0
#
# [[5]]
# [1] 0 0 0 0
*/
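The same implicit conversion can be reproduced outside Rcpp. The sketch below is only an illustration: ToyVector is a hypothetical type (not part of Rcpp) that models nothing but a non-explicit size constructor like the one quoted above, and it assumes the value passed in is non-negative.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for Rcpp's Vector; only the size constructor is modelled.
struct ToyVector {
    std::vector<double> data;

    // Non-explicit constructor accepting any arithmetic type as a length.
    // The value is truncated to an integral size and the storage is zero-filled.
    template <typename T,
              typename = typename std::enable_if<std::is_arithmetic<T>::value>::type>
    ToyVector(T size) : data(static_cast<std::size_t>(size), 0.0) {}
};

// The returned double is silently converted through the size constructor,
// so the caller receives a vector of zeros rather than the value itself.
ToyVector from_double(double x) {
    return x;
}
```

Marking such a constructor explicit would turn the silent conversion into a compile error, which is one reason to declare the return type as double when a scalar is intended.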
Edit: Regarding your second question, you are dividing by sqrt(2 / n), where 2 and n are both integers. The quotient 2 / n is therefore computed with integer division and is 0 for any n > 2, so sqrt(2 / n) is 0 and the division by it produces the Inf values in the result vector. You can fix this by using 2.0 instead of 2:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector zstat(int n, double lambda, double lambda0, int m) {
NumericVector z(m);
for (int i=1; i<m; ++i){
NumericVector y = rpois(n, lambda0);
NumericVector w = dpois(y, lambda)/dpois(y,lambda0);
double x = mean(w * y);
// z[i] = (x - 2) / sqrt(2 / n);
// ^^^^^
z[i] = (x - 2) / sqrt(2.0 / n);
// ^^^^^^^
}
return z;
}
/*** R
set.seed(123)
zstat(25, 2, 3, 10)
# [1] 0.0000000 -0.4427721 0.3199805 0.1016661 0.4078687 0.4054078
# [7] -0.1591861 0.9717596 0.6325110 0.1269779
*/
C++ is not R -- you need to be more careful about the types of your variables.
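The difference between the two divisions can be isolated in plain C++. This is a minimal sketch with illustrative function names (they do not come from the question), assuming n is nonzero:

```cpp
#include <cmath>

// 2 / n is evaluated in integer arithmetic, so the fractional part is
// discarded before the square root is taken.
double scale_integer_division(int n) {
    const int quotient = 2 / n;
    return std::sqrt(static_cast<double>(quotient));
}

// 2.0 / n promotes n to double, so the quotient keeps its fractional part.
double scale_double_division(int n) {
    return std::sqrt(2.0 / n);
}
```

With n = 25 the first function returns 0, so any nonzero numerator divided by it becomes infinite, while the second returns the square root of 0.08.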
Is there any way to efficiently translate the outer() function for multiplication of two vectors from R base to RcppArmadillo? I attempted to do so but it is not efficient at all.
Take the following example:
library(Rcpp)
library(RcppArmadillo)
library(microbenchmark)
#Outer attempt
cppFunction(depends = "RcppArmadillo",
'
arma::mat outer_rcpp(arma::vec x, arma::vec y) {
int x_length = x.n_elem;
int y_length = y.n_elem;
arma::mat final(x_length, y_length);
// And use loops instead of outer
for(int i = 0; i < x_length; i++) {
final.col(i) = x[i] * y;
}
return(final);
}
'
)
#Test for equal results
a <- rnorm(5)
base <- base::outer(a, a)
rcpp <- outer_rcpp(a, a)
all.equal(base, rcpp)
#Test for speed
b <- rnorm(5000)
microbenchmark(base = base::outer(b, b),
rcpp = outer_rcpp(b, b), times = 10)
The Rcpp version is about 2 times slower than base R. I am sure that this can be done through matrix multiplication; any idea how?
As @thelatemail pointed out in the comments, the outer routine is already using a heavily optimized C routine.
src/library/base/R/outer.R: tcrossprod usage.
src/main/array.c: underlying C routine powering the tcrossprod computation.
Armadillo itself has its own optimizations for matrix multiplication, using the dgemm and dgemv routines from BLAS:
armadillo_bits/mul_gemm.hpp: C := alpha*op(A)*op(B) + beta*C,
armadillo_bits/mul_gemv.hpp: y := alpha*A*x + beta*y, or y := alpha*A**T*x + beta*y,
Playing around with the outer product calculations leads to a few optimizations. Mainly, we're opting to move the outer product into Armadillo operations instead of loops:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat outer_rcpp(const arma::vec& x, const arma::vec& y) {
int x_length = x.n_elem;
int y_length = y.n_elem;
arma::mat final(x_length, y_length);
// Fill the matrix one column at a time: column j is x scaled by y[j]
for(int j = 0; j < y_length; j++) {
final.col(j) = x * y[j];
}
return final;
}
// [[Rcpp::export]]
arma::mat outer_with_armadillo(const arma::vec& x, const arma::vec& y) {
arma::mat final = x*y.t();
return final;
}
// [[Rcpp::export]]
arma::mat outer_with_armadillo_transposed(const arma::vec& x, const arma::rowvec& y) {
arma::mat final = x*y;
return final;
}
Revisiting the benchmarking code, we have:
b = rnorm(5000)
b_tranposed = t(b)
bench_results = microbenchmark::microbenchmark(base = base::outer(b, b),
outer_armadillo_loop = outer_rcpp(b, b),
outer_armadillo_optimized = outer_with_armadillo(b, b),
outer_armadillo_optimized_transposed = outer_with_armadillo_transposed(b, b_tranposed), times = 10)
bench_results
expr                                      min       lq     mean   median       uq      max neval
base                                 132.8601 141.3532 156.9979 146.7993 154.8954 234.2619    10
outer_armadillo_loop                 278.4115 279.9204 317.7907 288.4212 329.0769 451.6872    10
outer_armadillo_optimized            272.4348 283.3380 347.7913 304.1181 339.3282 728.2264    10
outer_armadillo_optimized_transposed 269.7855 270.7108 297.9580 279.8099 312.3488 386.4270    10
From the results, the fastest variant I could achieve passes a pre-transposed b, converted from column-vector form into row-vector form, so that the product is (n x 1) * (1 x m). Even so, it remains slower than base::outer in this benchmark.
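For reference, every variant above computes out(i, j) = x[i] * y[j]. A minimal plain C++ sketch of that definition, assuming column-major storage in a std::vector (the name outer_colmajor is illustrative and is not an Armadillo or R function):

```cpp
#include <cstddef>
#include <vector>

// Outer product of x (length n) and y (length m), returned as an n x m
// matrix in column-major order: element (i, j) lives at index i + j * n.
std::vector<double> outer_colmajor(const std::vector<double>& x,
                                   const std::vector<double>& y) {
    const std::size_t n = x.size();
    const std::size_t m = y.size();
    std::vector<double> out(n * m);
    // Iterating over columns in the outer loop writes memory contiguously.
    for (std::size_t j = 0; j < m; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            out[i + j * n] = x[i] * y[j];
        }
    }
    return out;
}
```

The sketch makes the cost visible: n * m multiplications and as many writes, which leaves little room to beat an already optimized routine.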
I have an Rcpp function that should take an IntegerVector as input (as toInt). I want to use it on vector of integers, but also on vector of doubles that are just integers (e.g. 1:4 is of type integer but 1:4 + 1 is of type double).
Yet, when this is used on real floating point numbers (e.g. 1.5), I would like it to return a warning or an error instead of silently rounding all values (to make them integers).
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector toInt(RObject x) {
return as<IntegerVector>(x);
}
> toInt(c(1.5, 2.4)) # I would like a warning
[1] 1 2
> toInt(1:2 + 1) # No need of warning
[1] 2 3
Rcpp sugar has all you need. Here is one possible implementation:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector fprive(const RObject & x) {
NumericVector nv(x);
IntegerVector iv(x);
if (is_true(any(nv != NumericVector(iv)))) warning("Uh-oh");
return(iv);
}
/*** R
fprive(c(1.5, 2))
fprive(c(1L, 2L))
*/
Its output is as follows:
R> Rcpp::sourceCpp('/tmp/fprive.cpp')
R> fprive(c(1.5, 2))
[1] 1 2
R> fprive(c(1L, 2L))
[1] 1 2
Warning message:
In fprive(c(1.5, 2)) : Uh-oh
R>
Because it is a warning object, you can control via options("warn") whether you want to abort, print immediately, print at end, ignore, ...
The first solution I thought of:
// [[Rcpp::export]]
IntegerVector toInt2(const NumericVector& x) {
for (int i = 0; i < x.size(); i++) {
if (x[i] != (int)x[i]) {
warning("Uh-oh");
break;
}
}
return as<IntegerVector>(x);
}
but I wondered if there wasn't an unnecessary copy when x was an IntegerVector, so I made this other solution:
// [[Rcpp::export]]
IntegerVector toInt3(const RObject& x) {
NumericVector nv(x);
for (int i = 0; i < nv.size(); i++) {
if (nv[i] != (int)nv[i]) {
warning("Uh-oh");
break;
}
}
return as<IntegerVector>(x);
}
But, maybe the best solution would be to test if the RObject is already of type int and to fill the resulting vector at the same time of checking the type:
// [[Rcpp::export]]
SEXP toInt4(const RObject& x) {
if (TYPEOF(x) == INTSXP) return x;
NumericVector nv(x);
int i, n = nv.size();
IntegerVector res(n);
for (i = 0; i < n; i++) {
res[i] = nv[i];
if (nv[i] != res[i]) {
warning("Uh-oh");
break;
}
}
for (; i < n; i++) res[i] = nv[i];
return res;
}
Some benchmarking:
x <- seq_len(1e7)
x2 <- x; x2[1] <- 1.5
x3 <- x; x3[length(x3)] <- 1.5
microbenchmark::microbenchmark(
fprive(x), toInt2(x), toInt3(x), toInt4(x),
fprive(x2), toInt2(x2), toInt3(x2), toInt4(x2),
fprive(x3), toInt2(x3), toInt3(x3), toInt4(x3),
times = 20
)
Unit: microseconds
expr min lq mean median uq max neval
fprive(x) 229865.629 233539.952 236049.68870 235623.390 238500.4335 244608.276 20
toInt2(x) 98249.764 99520.233 102026.44305 100468.627 103480.8695 114144.022 20
toInt3(x) 50631.512 50838.560 52307.34400 51417.296 52524.0260 58311.909 20
toInt4(x) 1.165 6.955 46.63055 10.068 11.0755 766.022 20
fprive(x2) 63134.534 64026.846 66004.90820 65079.292 66674.4835 74907.065 20
toInt2(x2) 43073.288 43435.478 44068.28935 43990.455 44528.1800 45745.834 20
toInt3(x2) 42968.743 43461.838 44268.58785 43682.224 44235.6860 51906.093 20
toInt4(x2) 19379.401 19640.198 20091.04150 19918.388 20232.4565 21756.032 20
fprive(x3) 254034.049 256154.851 258329.10340 258676.363 259549.3530 264550.346 20
toInt2(x3) 77983.539 79162.807 79901.65230 79424.011 80030.3425 87906.977 20
toInt3(x3) 73521.565 74329.410 76050.63095 75128.253 75867.9620 88240.937 20
toInt4(x3) 22109.970 22529.713 23759.99890 23072.738 23688.5365 30905.478 20
So, toInt4 seems the best solution.
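The core test used by these functions, whether a double holds an integer value, can be sketched in plain C++ without Rcpp. The helper names below are illustrative. Unlike the (int) casts above, this version also checks that the value fits in an int before comparing, and it treats NaN and infinities as non-integers, which is an assumption about the desired behaviour:

```cpp
#include <climits>
#include <cmath>
#include <vector>

// True when v is finite, fits in an int, and has no fractional part.
bool is_int_valued(double v) {
    return std::isfinite(v) &&
           v >= static_cast<double>(INT_MIN) &&
           v <= static_cast<double>(INT_MAX) &&
           v == std::trunc(v);
}

// True when every element can be converted to int without losing information.
// Stops at the first offending element, like the early break in the loops above.
bool all_int_valued(const std::vector<double>& x) {
    for (double v : x) {
        if (!is_int_valued(v)) return false;
    }
    return true;
}
```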
Please note that this error was taken from a bigger context, which obviously I cannot reproduce here in full.
I have the following functions in the file fun.cpp
#include <RcppArmadilloExtensions/sample.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
arma::vec colMeans(arma::mat data){
int n_0 = data.n_rows;
arma::vec xbar(data.n_cols);
for(int i = 0; i < data.n_rows; i++){
for(int j = 0; j < data.n_cols; j++){
xbar[j] += data(i,j) /n_0;
}
}
return xbar;
}
// [[Rcpp::export]]
List PosteriorNIW(arma::mat data, arma::vec mu0, double lambda0,
double df0, arma::mat V){
// Compute posterior
int n = data.n_rows;
arma::vec xbar = colMeans(data);
double lambdan = lambda0 + n;
arma::vec mun = (lambda0 * mu0 + n * xbar) / lambdan;
arma::mat S;
S.zeros(data.n_cols, data.n_cols);
for(int i = 0; i < n; i++){
S += (arma::conv_to<arma::vec>::from(data.row(i)) - xbar) * arma::trans(arma::conv_to<arma::vec>::from(data.row(i)) - xbar);
}
arma::mat Vn = V + S + ((lambda0*n)/(lambda0 + n)) * (xbar - mu0) * arma::trans(xbar - mu0);
return List::create(_["mun"] = mun,
_["Vn"] = Vn,
_["lambdan"] = lambdan);
}
Calling now:
library(Rcpp); library(RcppArmadillo)
mu0 <- c(3,3)
V0 <- matrix(c(2.5,0.0,0.0,2.5), nrow = 2)
sourceCpp("fun.cpp")
data <- cbind(rep(5,15),rep(0,15))
PosteriorNIW(data, mu0, 1, 1, V0)
gives the expected result.
$mun
[,1]
[1,] 4.8750
[2,] 0.1875
$Vn
[,1] [,2]
[1,] 6.250 -5.6250
[2,] -5.625 10.9375
$lambdan
[1] 16
Now if I add the following functions to the file fun.cpp (again, these are taken from a bigger context, so don't bother trying to understand them, just paste them), strange things happen:
// [[Rcpp::export]]
NumericMatrix myFun(arma::mat t_dish, arma::cube data){
int l = 0;
for(int j = 0; j < data.n_rows; j++){
l++;
}
NumericMatrix Dk(l, 2);
return Dk;
}
// [[Rcpp::export]]
int myFun2(arma::cube n_cust){
arma::mat temp = n_cust.subcube(arma::span(0), arma::span(), arma::span());
int i;
for(i = 0; i < n_cust.n_cols; i++){
arma::rowvec temp2 = temp.row(i);
}
return i + 1;
}
// [[Rcpp::export]]
arma::vec myFun3(arma::mat k_tables){
arma::vec temp(k_tables.n_cols * k_tables.n_rows);
int l = 0;
if(!R_IsNA(k_tables(0,0))){
l++;
}
arma::vec temp2(l);
arma::vec tmp3 = sort(temp2);
return tmp3;
}
double myFun4(arma::vec x, double nu, arma::vec mu, arma::mat Sigma){
arma::vec product = (arma::trans(x - mu) * arma::inv(Sigma) * (x - mu));
double num = pow(1 + (1 / nu) * product[0], - ( nu + 2 ) / 2);
double den = pow(sqrt(M_PI * nu),2) * sqrt(arma::det(Sigma));
return num / den;
}
bool myFun5(NumericVector X, double z) {
return std::find(X.begin(), X.end(), z)!=X.end();
}
calling PosteriorNIW(data, mu0, 1, 1, V0) repeatedly starts giving different results every time. Note that there is no randomness in the functions, and that the new functions should have no impact, since they are not called by the original function.
I have tried a different machine to make sure it was not a problem with my compiler, but the error keeps happening.
I know that removing those functions (even just one of them) fixes the problem, but clearly this is not a feasible solution when I am working with more functions.
I would like to know if other users are able to replicate this behavior and, if so, whether there is a fix for it.
Thank you in advance
EDIT:
The version of R is 3.3.2 and Rtools is 3.4. Both Rcpp and RcppArmadillo are up-to-date
You're not zeroing xbar in your colMeans function. If I do that:
arma::vec colMeans(arma::mat data){
int n_0 = data.n_rows;
arma::vec xbar;
xbar.zeros(data.n_cols);
for(int i = 0; i < data.n_rows; i++){
for(int j = 0; j < data.n_cols; j++){
xbar[j] += data(i,j) /n_0;
}
}
return xbar;
}
I get this every time:
> PosteriorNIW(data, mu0, 1, 1.1, V0)
$mun
[,1]
[1,] 4.8750
[2,] 0.1875
$Vn
[,1] [,2]
[1,] 6.250 -5.6250
[2,] -5.625 10.9375
$lambdan
[1] 16
Even when I do add your extra block of code.
I don't know if these vectors are documented to be initialised to zero by their constructor (in which case this might be a bug there) or not, in which case it's your bug!
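The pattern that matters here is that an accumulator must start from a known value before += is applied to it. A minimal plain C++ sketch of column means with an explicitly zero-initialised accumulator, assuming column-major data in a std::vector and at least one row (the name col_means and the layout are illustrative, not Armadillo's API):

```cpp
#include <cstddef>
#include <vector>

// Column means of an n_rows x n_cols matrix stored in column-major order.
std::vector<double> col_means(const std::vector<double>& data,
                              std::size_t n_rows, std::size_t n_cols) {
    // Explicit zero fill: every sum starts from 0.0 rather than from
    // whatever happened to be in memory.
    std::vector<double> xbar(n_cols, 0.0);
    for (std::size_t j = 0; j < n_cols; ++j) {
        for (std::size_t i = 0; i < n_rows; ++i) {
            xbar[j] += data[i + j * n_rows];
        }
        xbar[j] /= static_cast<double>(n_rows);
    }
    return xbar;
}
```

Reading uninitialised memory would also explain why adding unrelated functions changed the results: they alter what is left in memory, not the logic of PosteriorNIW.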
I have the following function declared in Rcpp:
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
#include <Rmath.h>
using namespace Rcpp;
// [[Rcpp::export]]
double loglikZeta(double zold, double zstar, NumericVector y, int K, double p){
NumericVector num = Rcpp::dbinom(y,K,p*zstar);
NumericVector den = Rcpp::dbinom(y,K,p*zold);
return (num[0]/den[0]);
}
// [[Rcpp::export]]
double singleZetaSampler(NumericVector z, NumericVector y,
double p, int K, int i, double zstar){
return loglikZeta(z[i-1],zstar,y[i-1],K,p);
}
Now declare (after having loaded package and file):
z <- y <- c(rep(1,20),rep(0,20))
n <- length(y)
K <- 3
p <- 0.5
i <- 30
zstar <- 1
The unexpected behaviour is that I get a different result every time I call the function (there is nothing random in it):
singleZetaSampler(z,y,p,K,i,zstar)
[1] 1.000051
singleZetaSampler(z,y,p,K,i,zstar)
[1] 0.1887447
singleZetaSampler(z,y,p,K,i,zstar)
[1] 0.9999998
Am I making some big error here, or are these results actually unexpected?
EDIT:
Sorry if the function doesn't make sense used as it is. This was the original function:
// [[Rcpp::export]]
NumericVector zetaSampler(int n, NumericVector z, NumericVector y,
double p, int K){
NumericVector xx(n);
for(int i = 0; i < n; i++){
xx(i) = loglikZeta(z[i],1,y[i],K,p);
}
return xx;
}
and calling:
zetaSampler(length(z),z,y,p,K)
as before gives different results every time.
Two things. One actual error, one sort-of-stylistic.
The stylistic issue is that you include Rmath.h and depend on RcppArmadillo when you should not. The real error is that you sample 20 times but then set i=30 and access the 30th element. So you get random inputs.
Here is what I just ran, and it gets three times the same result.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double loglikZeta(double zold, double zstar, NumericVector y, int K, double p){
NumericVector num = Rcpp::dbinom(y,K,p*zstar);
NumericVector den = Rcpp::dbinom(y,K,p*zold);
return (num[0]/den[0]);
}
// [[Rcpp::export]]
double singleZetaSampler(NumericVector z, NumericVector y,
double p, int K, int i, double zstar){
return loglikZeta(z[i-1],zstar,y[i-1],K,p);
}
/*** R
z <- y <- c(rep(1,20),rep(0,20))
n <- length(y)
K <- 3
p <- 0.5
i <- 20 # not 30
zstar <- 1
singleZetaSampler(z,y,p,K,i,zstar)
singleZetaSampler(z,y,p,K,i,zstar)
singleZetaSampler(z,y,p,K,i,zstar)
*/
Output:
R> sourceCpp("/tmp/foo.cpp")
R> z <- y <- c(rep(1,20),rep(0,20))
R> n <- length(y)
R> K <- 3
R> p <- 0.5
R> i <- 20 # not 30
R> zstar <- 1
R> singleZetaSampler(z,y,p,K,i,zstar)
[1] 1
R> singleZetaSampler(z,y,p,K,i,zstar)
[1] 1
R> singleZetaSampler(z,y,p,K,i,zstar)
[1] 1
R>
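Edit: It appears to work better in a repaired version that forces scalar arguments to loglikZeta(). In the original, y[i-1] is a double passed where a NumericVector is expected, so it is converted to a vector whose length is that value truncated to an integer; when y[i-1] is 0 the vector is empty and num[0] reads past its end: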
// [[Rcpp::export]]
double loglikZeta(double zold, double zstar, double y, int K, double p){
double num = R::dbinom(y, K, p*zstar, false);
double den = R::dbinom(y, K, p*zold, false);
return (num/den);
}
Note that Rcpp::dbinom() has a signature of Rcpp::dbinom(Rcpp::NumericVector, int, double, bool=false).
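To make the scalar computation concrete, here is a plain C++ sketch of the same ratio that does not depend on R or Rcpp. The names binom_pmf and lik_ratio are illustrative; the pmf is written out directly and assumes 0 <= y <= K and a probability in [0, 1]. Note that, despite the name loglikZeta, the quantity is a ratio of likelihoods and not a logarithm.

```cpp
#include <cmath>

// Binomial probability mass P(Y = y) for size K and success probability p.
double binom_pmf(int y, int K, double p) {
    // Binomial coefficient C(K, y), built up as a product of ratios.
    double choose = 1.0;
    for (int k = 1; k <= y; ++k) {
        choose *= static_cast<double>(K - y + k) / k;
    }
    return choose * std::pow(p, y) * std::pow(1.0 - p, K - y);
}

// Ratio of the pmf under the proposed value zstar to the pmf under zold.
// Every argument is a scalar, so no vector is constructed along the way.
double lik_ratio(double zold, double zstar, int y, int K, double p) {
    return binom_pmf(y, K, p * zstar) / binom_pmf(y, K, p * zold);
}
```

When zold equals zstar the ratio is 1, matching the output shown above for i = 20.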
Let Y and K be an n-dimensional (column) vector and n by n matrix, respectively. Think of Y and K as a sample vector and its covariance matrix.
Corresponding to each entry $Y_i$ of Y there is a row vector $s_i$ (of size 2) encoding the location of the sample in a two-dimensional space. Construct the n by 2 matrix S by stacking all the $s_i$ vectors. The ij-th entry of K is of the form
$K_{ij} = f(\lVert s_i - s_j \rVert, b)$
in which $\lVert \cdot \rVert$ denotes the usual Euclidean norm, f is the covariance function and b represents the covariance parameters. For instance, for the powered exponential covariance we have $f(x) = \exp(-(|x|/r)^q)$ and $b = (r, q)$.
The goal is to compute the following quantity in Rcpp, in parallel ($Y^T$ stands for Y transpose and $\lVert K \rVert^2$ denotes the sum of squared entries of K):
$\frac{Y^T K Y}{\lVert K \rVert^2}$
Here is the piece of code I've written to do the job. While it is running, RStudio runs out of memory after a few seconds and the following message is displayed: "R encountered a fatal error. The session was terminated". I've only recently started using OpenMP in Rcpp and I have no idea why this happens. Can anybody tell me what I have done wrong here?
#include <Rcpp.h>
#include<math.h>
#include<omp.h>
// [[Rcpp::plugins(openmp)]]
using namespace Rcpp;
// [[Rcpp::export]]
double InnerProd(NumericVector x, NumericVector y) {
int n = x.size();
double total = 0;
for(int i = 0; i < n; ++i) {
total += x[i]*y[i];
}
return total;
}
// [[Rcpp::export]]
double CorFunc(double r, double range_param, double beta) {
double q,x;
x = r/range_param;
q = exp( -pow(x,beta) );
return(q);
}
// [[Rcpp::export]]
double VarianceComp( double range, NumericVector Y, NumericMatrix s, double
beta, int t ){
int n,i,j;
double Numer = 0, Denom = 0, dist, CorVal, ObjVal;
NumericVector DistVec;
n = Y.size();
omp_set_num_threads(t);
# pragma omp parallel for private(DistVec,CorVal,dist,j) \
reduction(+:Numer,Denom)
for( i = 0; i < n; ++i) {
for( j = 0; j < n; ++j){
DistVec = ( s(i,_)-s(j,_) );
dist = sqrt( InnerProd(DistVec,DistVec) );
CorVal = CorFunc(dist,range,beta);
Numer += Y[i]*Y[j]*CorVal/n;
Denom += pow( CorVal, 2 )/n;
}
}
ObjVal = Numer/Denom;
return( ObjVal );
}
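A likely cause, offered as an assumption and not as a confirmed diagnosis: the loop assigns to DistVec, an Rcpp NumericVector, inside the parallel region, and Rcpp objects allocate through R's memory manager, which is not thread-safe. The sketch below computes the same quantity using only plain C++ types inside the loop. The function name variance_comp and the split of the locations into sx and sy are illustrative choices. The division by n in the original is dropped because it cancels between numerator and denominator.

```cpp
#include <cmath>
#include <vector>

// Computes (Y^T K Y) / (sum of squared entries of K), where
// K(i, j) = exp(-(dist(i, j) / range)^beta) and dist is the Euclidean
// distance between locations (sx[i], sy[i]) and (sx[j], sy[j]).
double variance_comp(double range, double beta,
                     const std::vector<double>& Y,
                     const std::vector<double>& sx,
                     const std::vector<double>& sy) {
    const long n = static_cast<long>(Y.size());
    double numer = 0.0;
    double denom = 0.0;

    // Only doubles and read-only std::vector access occur inside the loop,
    // so no R API call is made from a worker thread. Without OpenMP enabled
    // the pragma is ignored and the loop runs serially.
    #pragma omp parallel for reduction(+:numer, denom)
    for (long i = 0; i < n; ++i) {
        for (long j = 0; j < n; ++j) {
            const double dist = std::hypot(sx[i] - sx[j], sy[i] - sy[j]);
            const double cor = std::exp(-std::pow(dist / range, beta));
            numer += Y[i] * Y[j] * cor;
            denom += cor * cor;
        }
    }
    return numer / denom;
}
```

In an Rcpp wrapper, the inputs would be copied into such plain containers (or accessed through raw pointers) before the parallel region starts, and the number of threads set as before.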