Rcpp gamma integral - r

I am trying to rewrite into (R)cpp an original R function that makes use of the gamma function (from double input). Below the original source. When comping with sourceCpp the following error is raised "no matching function for call to 'gamma(Rcpp::traits::storage_type(<14>:.type)'"
The gamma function should has been put within sugar (as the mean below use) so I expect there should be easily called.
#include <Rcpp.h>
#include <math.h>
using namespace Rcpp;
// original R function
// function (y_pred, y_true)
// {
// eps <- 1e-15
// y_pred <- pmax(y_pred, eps)
// Poisson_LogLoss <- mean(log(gamma(y_true + 1)) + y_pred -
// log(y_pred) * y_true)
// return(Poisson_LogLoss)
// }
// [[Rcpp::export]]
double poissonLogLoss(NumericVector predicted, NumericVector actual) {
NumericVector temp, y_pred_new;
double out;
const double eps=1e-15;
y_pred_new=pmax(predicted,eps);
long n = predicted.size();
for (long i = 0; i < n; ++i) {
temp[i] = log( gamma(actual[i]+1)+y_pred_new[i]-log(y_pred_new[i])*actual[i]);
}
out=mean(temp); // using sugar implementation
return out;
}

You are making this too complicated as the point of Rcpp Sugar is work vectorized. So the following compiles as well:
#include <Rcpp.h>
#include <math.h>
using namespace Rcpp;
// [[Rcpp::export]]
double poissonLogLoss(NumericVector predicted, NumericVector actual) {
NumericVector temp, y_pred_new;
double out;
const double eps=1e-15;
y_pred_new=pmax(predicted,eps);
temp = log(gamma(actual + 1)) + y_pred_new - log(y_pred_new)*actual;
out=mean(temp); // using sugar implementation
return out;
}
Now, you didn't supply any test data so I do not know if this computes correctly or not. Also, because your R expression is already vectorized, this will not be much faster.
Lastly, your compile error is likely due to the Sugar function gamma() expecting an Rcpp object whereas you provided a double.

Related

Why do I get the error for using "pnorm" in Rcpp

I need to involve variable from arma::in my Rcpp code. But I ran into a problem when trying to use the sugar function pnorm. Here is a demo:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
double pget(NumericVector x, NumericVector beta) {
arma::colvec xx = Rcpp::as<arma::colvec>(x) ;
arma::colvec bb = Rcpp::as<arma::colvec>(beta) ;
double tt = as_scalar( arma::trans(xx) * bb);
double temp = Rcpp::pnorm(tt);
return temp;
}
Then I got an error: no matching function for call to 'pnorm5'
Does that mean I cannot use Rcpp::pnorm???
The Rcpp sugar functions are meant for vector type arguments like Rcpp::NumericVector. For scalar arguments you can use the functions in the R namespace:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
double pget(NumericVector x, NumericVector beta) {
arma::colvec xx = Rcpp::as<arma::colvec>(x) ;
arma::colvec bb = Rcpp::as<arma::colvec>(beta) ;
double tt = as_scalar( arma::trans(xx) * bb);
double temp = R::pnorm(tt, 0.0, 1.0, 1, 0);
return temp;
}
/*** R
x <- rnorm(5)
beta <- rnorm(5)
pget(x, beta)
*/
BTW, here two variants. First variant uses arma instead of Rcpp vectors as arguments. Since these are const references, no data is copied. In addition, arma::dot is used:
// [[Rcpp::export]]
double pget2(const arma::colvec& xx, const arma::colvec& bb) {
double tt = arma::dot(xx, bb);
return R::pnorm(tt, 0.0, 1.0, 1, 0);
}
The second variant calculates the scalar product without resorting to Armadillo:
// [[Rcpp::export]]
double pget3(NumericVector x, NumericVector beta) {
double tt = Rcpp::sum(x * beta);
return R::pnorm(tt, 0.0, 1.0, 1, 0);
}
I'm much less of an expert than #RalfStubner at Rcpp, so I had to hack around (with help from StackOverflow and the Rcpp cheat sheat) to get the following code. Instead of using the R-namespace versions on scalars, I converted back to a NumericVector ... this can almost certainly be done more efficiently/skipping a few steps by someone who actually knows what they're doing ... e.g. it's possible that the arma-to-NumericVector conversion could be done directly without going through as_scalar ... ?
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
NumericVector pget(NumericVector x, NumericVector beta) {
colvec xx = as<colvec>(x) ;
colvec bb = as<colvec>(beta) ;
double tt = as_scalar(trans(xx) * bb);
NumericVector tt2 = NumericVector::create( tt );
NumericVector temp = Rcpp::pnorm(tt2);
return temp;
}

efficiently sample from modified arma::vec object

I am using Rcpp to speed up some R code. However, I'm really struggling with types - since these are foreign in R. Here's a simplified version of what I'm trying to do:
#include <RcppArmadillo.h>
#include <algorithm>
//[[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
NumericVector fun(SEXP Pk, int k, int i, const vec& a, const mat& D) {
// this is dummy version of my actual function - with actual arguments.;
// I'm guessing SEXP is going to need to be replaced with something else when it's called from C++ not R.;
return D.col(i);
}
// [[Rcpp::export]]
NumericVector f(const arma::vec& assignment, char k, int B, const mat& D) {
uvec k_ind = find(assignment == k);
NumericVector output(assignment.size()); // for dummy output.
uvec::iterator k_itr = k_ind.begin();
for(; k_itr != k_ind.end(); ++k_itr) {
// this is R code, as I don't know the best way to do this in C++;
k_rep = sample(c(assignment[assignment != k], -1), size = B, replace = TRUE);
output = fun(k_rep, k, *k_itr, assignment, D);
// do something with output;
}
// compile result, ultimately return a List (after I figure out how to do that. For right now, I'll cheat and return the last output);
return output;
}
The part I'm struggling with is the random sampling of assignment. I know that sample has been implemented in Rarmadillo. However, I can see two approaches to this, and I'm not sure which is more efficient/doable.
Approach 1:
Make a table of theassignment values. Replace assignment == k with -1 and set its "count" equal to 1.
sample from the table "names" with probability proportional to the count.
Approach 2:
Copy the relevant subset of the assignment vector into a new vector with an extra spot for -1.
Sample from the copied vector with equal probabilities.
I want to say that approach 1 would be more efficient, except that assignment is currently of type arma::vec, and I'm not sure how to make the table from that - or how much of a cost there is to converting it to a more-compatible format. I think I could implement Approach 2, but I'm hoping to avoid the expensive copy.
Thanks for any insights you can provide.
many variable declaration is not coherent with the assignment made by you, like assignment = k is impossible to compare as assignment has real value and k is a char. as the queston is bad written I feel free to change the variables type. this compile..
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::export]]
arma::vec fun(const Rcpp::NumericVector& Pk, int k, unsigned int i, const arma::ivec& a, const arma::mat& D)
{
return D.col(i);
}
// [[Rcpp::export]]
Rcpp::NumericMatrix f(const arma::ivec& assignment, int k, unsigned int B, const arma::mat& D)
{
arma::uvec k_ind = find(assignment == k);
arma::ivec KK = assignment(find(assignment != k));
//these 2 row are for KK = c(assignment[assignment != k], -1)
//I dont know what is this -1 is for, why -1 ? maybe you dont need it.
KK.insert_rows(KK.n_rows, 1);
KK(KK.n_rows - 1) = -1;
arma::uvec k_ind_not = find(assignment != k);
Rcpp::NumericVector k_rep(B);
arma::mat output(D.n_rows,k_ind.n_rows); // for dummy output.
for(unsigned int i =0; i < k_ind.n_rows ; i++)
{
k_rep = Rcpp::RcppArmadillo::sample(KK, B, true);
output(arma::span::all, i) = fun(k_rep, k, i, assignment, D);
// do something with output;
}
// compile result, ultimately return a List (after I figure out how to do that. For right now, I'll cheat and return the last output);
return Rcpp::wrap(output);
}
this is not optimized (as the question is bogus), this is badly written, beccause as I think R would be sufficiently faster in searching index of a vector (so do this in R and implemement only fun in Rcpp)...is not useful to waste time here, there are other problems that need a solver implemented in Rcpp , not this searching stuff...
but this is not a useful question as you are asking more for an algorithm than for example signature of function

Calling igraph from within Rcpp

As a part of utilizing network data drawn at random before further processing, I am trying to call a couple of functions from the igraph package at the beginning of each iteration. The code I use is as follows:
#define ARMA_64BIT_WORD
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
using namespace Rcpp;
using arma::sp_mat;
// [[Rcpp::export]]
sp_mat adj_mat(int n, double p) {
Environment igraph("package:igraph");
Function game_er = igraph["erdos.renyi.game"];
Function get_adjacency = igraph["get.adjacency"];
List g = game_er(Named("n", n), Named("p", p));
NumericMatrix A_m = get_adjacency(Named("g", g));
sp_mat A = as<sp_mat>(A_m);
return A;
}
/*** R
set.seed(20130810)
library(igraph)
adj_mat(100, 0.5)
*/
So, while the C++ compiles without warnings, the following error is thrown:
> sourceCpp("Hooking-R-in-cpp.cpp")
> set.seed(20130810)
> library(igraph)
> adj_mat(100, 0.5)
Error in adj_mat(100, 0.5) :
Not compatible with requested type: [type=S4; target=double].
From the error it looks like I am passing a S4 class to a double? Where is the error?
You were imposing types in the middle of your C++ functions that did not correspond to the representation, so you got run-time errors trying to instantiate them.
The version below works. I don't know igraph well enough to suggest what else you use to store the first return; for the S4 you can use the dgCMatrix matrix but S4 is an ok superset.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
using namespace Rcpp;
using arma::sp_mat;
// [[Rcpp::export]]
sp_mat adj_mat(int n, double p) {
Environment igraph("package:igraph");
Function game_er = igraph["erdos.renyi.game"];
Function get_adjacency = igraph["get.adjacency"];
SEXP g = game_er(Named("n", n), Named("p", p));
S4 A_m = get_adjacency(Named("g", g));
sp_mat A = as<sp_mat>(A_m);
return A;
}
/*** R
set.seed(20130810)
library(igraph)
adj_mat(100, 0.5)
*/

Inconsistent results between Rcpp and R code

UPDATE
Previous example is complicated, hence please allow me to use a simpler example as shown below:
Here is the Rcpp code:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Rmath.h>
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp ;
using namespace arma;
using namespace std;
// [[Rcpp::export]]
double chooseC(double n, double k) {
return Rf_choose(n, k);
}
// [[Rcpp::export]]
double function3(double n, double m, double beta) {
double prob;
NumericVector k(m);
NumericVector k_vec(m);
if(n<m){prob=0;}
else{
if(chooseC(n,m)==R_PosInf){
k=seq_len(m)-1;
k_vec= (n-k)/(m-k)*std::pow((1-beta),(n-m)/m)*beta;
prob=std::accumulate(k_vec.begin(),k_vec.end(), 1, std::multiplies<double>())*beta;
}
else{
prob = beta * chooseC(n,m) * std::pow(beta,m) * std::pow((1-beta),(n-m));
}
}
return(prob);
}
Here is the R code:
function4 <- function ( n , m , beta )
{
if ( n < m )
{
prob <- 0.0
}
else
{
if (is.infinite(choose(n,m))){
k<-0:(m-1)
prob <- beta *prod((n-k)/(m-k)*(1-beta)^((n-m)/m)*beta)
}
else{
prob <- beta * choose(n,m) * beta^m * (1-beta)^(n-m)
}
}
prob
}
Comparison:
input<-619
beta<-0.09187495
x<-seq(0, (input+1)/beta*3)
yy<-sapply(x,function(n)function3(n,input, beta=beta))
yy2<-sapply(x,function(n)function4(n,input, beta=beta))
sum(yy)=0
sum(yy2)=1
However, with other input:
input<-1
beta<-0.08214248
Both results are the same, sum(yy)=sum(yy2)=0.9865887.
I used double in Rcpp code, I don't know what else could cause the inconsistent precision between Rcpp and R code.
Thanks a lot!
I think I fix the Rcpp code, so right now both Rcpp and R code produce the same results when the results are very small values. The solution is shown as below:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Rmath.h>
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp ;
using namespace arma;
using namespace std;
// [[Rcpp::export]]
double chooseC(double n, double k) {
return Rf_choose(n, k);
}
// [[Rcpp::export]]
double function3(double n, double m, double beta) {
double prob;
arma::vec k = arma::linspace<vec>(0, m-1, m);
arma::vec k_vec;
if(n<m){prob=0;}
else{
if(chooseC(n,m)==R_PosInf){
k_vec= (n-k)/(m-k)*pow((1-beta),(n-m)/m)*beta;
prob=arma::prod(k_vec)*beta;
}
else{
prob = beta * chooseC(n,m) * pow(beta,m) * pow((1-beta),(n-m));
}
}
return(prob);
}
However, I still do not understand why by writing code in this way will fix the precision inconsistent. Rcpp and RcppArmadillo still look like black boxes to me.

Rcpp cannot convert ‘SEXP {aka SEXPREC*}’ to ‘double’ in initialization

I am trying to duplicate the R vectorised sum in Rcpp
I first try the following trouble-free code:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double call(NumericVector x){
return sum(x);
}
Type call(Time)
> call(Time)
[1] 1919853
Then an environment version, also works well,
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double call(){
Environment env = Environment::global_env();
NumericVector Time = env["Time"];
return sum(Time);
}
Type call()
> call()
[1] 1919853
Now I am trying something weird as following,
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double call(){
Environment base("package:base");
Function sumc = base["sum"];
Environment env = Environment::global_env();
NumericVector Time = env["Time"];
double res = sumc(Time);
return res;
}
This time I got a error message:
trycpp.cpp:10:25: error: cannot convert ‘SEXP {aka SEXPREC*}’ to ‘double’ in initialization
double res = sumc(Time);
Any idea what's going wrong ?
You cannot call an R function (ie sumc() on one of Rcpp's vectors. Do this:
// [[Rcpp::export]]
double mycall(){
Environment base("package:base");
Function sumc = base["sum"];
Environment env = Environment::global_env();
NumericVector Time = env["Time"];
double res = sum(Time);
return res;
}
Here sum() is the Rcpp sugar function.

Resources