Rcpp cannot convert ‘SEXP {aka SEXPREC*}’ to ‘double’ in initialization - r

I am trying to duplicate the R vectorised sum in Rcpp
I first try the following trouble-free code:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double call(NumericVector x){
return sum(x);
}
Type call(Time)
> call(Time)
[1] 1919853
Then an environment version, also works well,
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double call(){
Environment env = Environment::global_env();
NumericVector Time = env["Time"];
return sum(Time);
}
Type call()
> call()
[1] 1919853
Now I am trying something weird as following,
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double call(){
Environment base("package:base");
Function sumc = base["sum"];
Environment env = Environment::global_env();
NumericVector Time = env["Time"];
double res = sumc(Time);
return res;
}
This time I got a error message:
trycpp.cpp:10:25: error: cannot convert ‘SEXP {aka SEXPREC*}’ to ‘double’ in initialization
double res = sumc(Time);
Any idea what's going wrong ?

You cannot call an R function (ie sumc() on one of Rcpp's vectors. Do this:
// [[Rcpp::export]]
double mycall(){
Environment base("package:base");
Function sumc = base["sum"];
Environment env = Environment::global_env();
NumericVector Time = env["Time"];
double res = sum(Time);
return res;
}
Here sum() is the Rcpp sugar function.

Related

Using Rcpp for faster extraction and summary of lm output

I'm trying to speed up extraction from R's summary lm object when contained in an extremely large loop. The following is my attempt as a Rcpp to speed it up.
Toy-Data
synthetic_LMsummary<-lapply(1:100000L,function(UU){summary(lm(rnorm(500)~replicate(2,{rnorm(500)})))})
R Version
tidy.train <- function(s,SelectedRow=3) {
out<-data.frame(
estimate=s$coefficients[, "Estimate"][SelectedRow],
std.error=s$coefficients[, "Std. Error"][SelectedRow],
statistic=s$coefficients[, "t value"][SelectedRow],
p.value=s$coefficients[, "Pr(>|t|)"][SelectedRow],
rsquared=s$r.squared
)
row.names(out) <- NULL
out
}
synthetic_LMsummary %>% purrr::map_dfr(~tidy.train(.,SelectedRow=3))
Attempted Rcpp Version
Rcpp::sourceCpp(code='
#include <Rcpp.h>
// [[Rcpp::export]]
using namespace std;
using namespace Rcpp;
Rcpp::NumericMatrix RcppTidy(Rcpp::List Summary_List,int mat_cols, int select) {
int mat_rows = Summary_List.length();
Rcpp::NumericMatrix Output_Mat(mat_rows,mat_cols);
for (int i = 0; i < Summary_List.length(); ++i) {
Rcpp::List SubSet=Summary_List[i];
Rcpp::NumericMatrix CoefDF=SubSet["coefficients"];
Rcpp::NumericVector Coef=CoefDF.row(select);
Rcpp::NumericVector rsquared=SubSet["adj.r.squared"];
Output_Mat(i,_) = cbind(Coef,rsquared);
}
return Output_Mat;
}
')
Tidy_synLM<-RcppTidy(synthetic_LMsummary,5L,3L)

Why do I get the error for using "pnorm" in Rcpp

I need to involve variable from arma::in my Rcpp code. But I ran into a problem when trying to use the sugar function pnorm. Here is a demo:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
double pget(NumericVector x, NumericVector beta) {
arma::colvec xx = Rcpp::as<arma::colvec>(x) ;
arma::colvec bb = Rcpp::as<arma::colvec>(beta) ;
double tt = as_scalar( arma::trans(xx) * bb);
double temp = Rcpp::pnorm(tt);
return temp;
}
Then I got an error: no matching function for call to 'pnorm5'
Does that mean I cannot use Rcpp::pnorm???
The Rcpp sugar functions are meant for vector type arguments like Rcpp::NumericVector. For scalar arguments you can use the functions in the R namespace:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
double pget(NumericVector x, NumericVector beta) {
arma::colvec xx = Rcpp::as<arma::colvec>(x) ;
arma::colvec bb = Rcpp::as<arma::colvec>(beta) ;
double tt = as_scalar( arma::trans(xx) * bb);
double temp = R::pnorm(tt, 0.0, 1.0, 1, 0);
return temp;
}
/*** R
x <- rnorm(5)
beta <- rnorm(5)
pget(x, beta)
*/
BTW, here two variants. First variant uses arma instead of Rcpp vectors as arguments. Since these are const references, no data is copied. In addition, arma::dot is used:
// [[Rcpp::export]]
double pget2(const arma::colvec& xx, const arma::colvec& bb) {
double tt = arma::dot(xx, bb);
return R::pnorm(tt, 0.0, 1.0, 1, 0);
}
The second variant calculates the scalar product without resorting to Armadillo:
// [[Rcpp::export]]
double pget3(NumericVector x, NumericVector beta) {
double tt = Rcpp::sum(x * beta);
return R::pnorm(tt, 0.0, 1.0, 1, 0);
}
I'm much less of an expert than #RalfStubner at Rcpp, so I had to hack around (with help from StackOverflow and the Rcpp cheat sheat) to get the following code. Instead of using the R-namespace versions on scalars, I converted back to a NumericVector ... this can almost certainly be done more efficiently/skipping a few steps by someone who actually knows what they're doing ... e.g. it's possible that the arma-to-NumericVector conversion could be done directly without going through as_scalar ... ?
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
NumericVector pget(NumericVector x, NumericVector beta) {
colvec xx = as<colvec>(x) ;
colvec bb = as<colvec>(beta) ;
double tt = as_scalar(trans(xx) * bb);
NumericVector tt2 = NumericVector::create( tt );
NumericVector temp = Rcpp::pnorm(tt2);
return temp;
}

Calling igraph from within Rcpp

As a part of utilizing network data drawn at random before further processing, I am trying to call a couple of functions from the igraph package at the beginning of each iteration. The code I use is as follows:
#define ARMA_64BIT_WORD
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
using namespace Rcpp;
using arma::sp_mat;
// [[Rcpp::export]]
sp_mat adj_mat(int n, double p) {
Environment igraph("package:igraph");
Function game_er = igraph["erdos.renyi.game"];
Function get_adjacency = igraph["get.adjacency"];
List g = game_er(Named("n", n), Named("p", p));
NumericMatrix A_m = get_adjacency(Named("g", g));
sp_mat A = as<sp_mat>(A_m);
return A;
}
/*** R
set.seed(20130810)
library(igraph)
adj_mat(100, 0.5)
*/
So, while the C++ compiles without warnings, the following error is thrown:
> sourceCpp("Hooking-R-in-cpp.cpp")
> set.seed(20130810)
> library(igraph)
> adj_mat(100, 0.5)
Error in adj_mat(100, 0.5) :
Not compatible with requested type: [type=S4; target=double].
From the error it looks like I am passing a S4 class to a double? Where is the error?
You were imposing types in the middle of your C++ functions that did not correspond to the representation, so you got run-time errors trying to instantiate them.
The version below works. I don't know igraph well enough to suggest what else you use to store the first return; for the S4 you can use the dgCMatrix matrix but S4 is an ok superset.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
using namespace Rcpp;
using arma::sp_mat;
// [[Rcpp::export]]
sp_mat adj_mat(int n, double p) {
Environment igraph("package:igraph");
Function game_er = igraph["erdos.renyi.game"];
Function get_adjacency = igraph["get.adjacency"];
SEXP g = game_er(Named("n", n), Named("p", p));
S4 A_m = get_adjacency(Named("g", g));
sp_mat A = as<sp_mat>(A_m);
return A;
}
/*** R
set.seed(20130810)
library(igraph)
adj_mat(100, 0.5)
*/

Inconsistent results between Rcpp and R code

UPDATE
Previous example is complicated, hence please allow me to use a simpler example as shown below:
Here is the Rcpp code:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Rmath.h>
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp ;
using namespace arma;
using namespace std;
// [[Rcpp::export]]
double chooseC(double n, double k) {
return Rf_choose(n, k);
}
// [[Rcpp::export]]
double function3(double n, double m, double beta) {
double prob;
NumericVector k(m);
NumericVector k_vec(m);
if(n<m){prob=0;}
else{
if(chooseC(n,m)==R_PosInf){
k=seq_len(m)-1;
k_vec= (n-k)/(m-k)*std::pow((1-beta),(n-m)/m)*beta;
prob=std::accumulate(k_vec.begin(),k_vec.end(), 1, std::multiplies<double>())*beta;
}
else{
prob = beta * chooseC(n,m) * std::pow(beta,m) * std::pow((1-beta),(n-m));
}
}
return(prob);
}
Here is the R code:
function4 <- function ( n , m , beta )
{
if ( n < m )
{
prob <- 0.0
}
else
{
if (is.infinite(choose(n,m))){
k<-0:(m-1)
prob <- beta *prod((n-k)/(m-k)*(1-beta)^((n-m)/m)*beta)
}
else{
prob <- beta * choose(n,m) * beta^m * (1-beta)^(n-m)
}
}
prob
}
Comparison:
input<-619
beta<-0.09187495
x<-seq(0, (input+1)/beta*3)
yy<-sapply(x,function(n)function3(n,input, beta=beta))
yy2<-sapply(x,function(n)function4(n,input, beta=beta))
sum(yy)=0
sum(yy2)=1
However, with other input:
input<-1
beta<-0.08214248
Both results are the same, sum(yy)=sum(yy2)=0.9865887.
I used double in Rcpp code, I don't know what else could cause the inconsistent precision between Rcpp and R code.
Thanks a lot!
I think I fix the Rcpp code, so right now both Rcpp and R code produce the same results when the results are very small values. The solution is shown as below:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Rmath.h>
#include <Rcpp.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp ;
using namespace arma;
using namespace std;
// [[Rcpp::export]]
double chooseC(double n, double k) {
return Rf_choose(n, k);
}
// [[Rcpp::export]]
double function3(double n, double m, double beta) {
double prob;
arma::vec k = arma::linspace<vec>(0, m-1, m);
arma::vec k_vec;
if(n<m){prob=0;}
else{
if(chooseC(n,m)==R_PosInf){
k_vec= (n-k)/(m-k)*pow((1-beta),(n-m)/m)*beta;
prob=arma::prod(k_vec)*beta;
}
else{
prob = beta * chooseC(n,m) * pow(beta,m) * pow((1-beta),(n-m));
}
}
return(prob);
}
However, I still do not understand why by writing code in this way will fix the precision inconsistent. Rcpp and RcppArmadillo still look like black boxes to me.

Rcpp gamma integral

I am trying to rewrite into (R)cpp an original R function that makes use of the gamma function (from double input). Below the original source. When comping with sourceCpp the following error is raised "no matching function for call to 'gamma(Rcpp::traits::storage_type(<14>:.type)'"
The gamma function should has been put within sugar (as the mean below use) so I expect there should be easily called.
#include <Rcpp.h>
#include <math.h>
using namespace Rcpp;
// original R function
// function (y_pred, y_true)
// {
// eps <- 1e-15
// y_pred <- pmax(y_pred, eps)
// Poisson_LogLoss <- mean(log(gamma(y_true + 1)) + y_pred -
// log(y_pred) * y_true)
// return(Poisson_LogLoss)
// }
// [[Rcpp::export]]
double poissonLogLoss(NumericVector predicted, NumericVector actual) {
NumericVector temp, y_pred_new;
double out;
const double eps=1e-15;
y_pred_new=pmax(predicted,eps);
long n = predicted.size();
for (long i = 0; i < n; ++i) {
temp[i] = log( gamma(actual[i]+1)+y_pred_new[i]-log(y_pred_new[i])*actual[i]);
}
out=mean(temp); // using sugar implementation
return out;
}
You are making this too complicated as the point of Rcpp Sugar is work vectorized. So the following compiles as well:
#include <Rcpp.h>
#include <math.h>
using namespace Rcpp;
// [[Rcpp::export]]
double poissonLogLoss(NumericVector predicted, NumericVector actual) {
NumericVector temp, y_pred_new;
double out;
const double eps=1e-15;
y_pred_new=pmax(predicted,eps);
temp = log(gamma(actual + 1)) + y_pred_new - log(y_pred_new)*actual;
out=mean(temp); // using sugar implementation
return out;
}
Now, you didn't supply any test data so I do not know if this computes correctly or not. Also, because your R expression is already vectorized, this will not be much faster.
Lastly, your compile error is likely due to the Sugar function gamma() expecting an Rcpp object whereas you provided a double.

Resources