I am trying to call R function optim() in Rcpp. I saw an example in Calling R's optim function from within C++ using Rcpp, but I am unable to modify it correctly for my use case. Basically, the objective function depends on the x and y but I want to optimize it with respect to b.
Here is the R code that does what I want:
example_r = function(b, x, y) {
phi = rnorm(length(x))
tar_val = (x ^ 2 + y ^ 2) * b * phi
objftn_r = function(beta, x, y) {
obj_val = (x ^ 2 + y ^ 2) * beta
return(obj_val)
}
b1 = optim(b, function(beta) {
sum((objftn_r(beta, x, y) - tar_val) ^ 2)
}, method = "BFGS")$par
result = (x ^ 2 + y ^ 2) * b1
return(b1)
}
Here's is my attempt to translate it to _RcppArmadillo:
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
arma::vec example_rcpp(arma::vec b, arma::vec x, arma::vec y){
arma::vec tar_val = pow(x,2)%b-pow(y,2);
return tar_val;
}
// [[Rcpp::export]]
arma::vec optim_rcpp(const arma::vec& init_val, arma::vec& x, arma::vec& y){
Rcpp::Environment stats("package:stats");
Rcpp::Function optim = stats["optim"];
Rcpp::List opt_results = optim(Rcpp::_["par"] = init_val,
Rcpp::_["fn"] = Rcpp::InternalFunction(&example_rcpp),
Rcpp::_["method"] = "BFGS");
arma::vec out = Rcpp::as<arma::vec>(opt_results[0]);
return out;
}
However, this code is returning:
> optim_rcpp(1:3,2:4,3:5)
Error in optim_rcpp(1:3, 2:4, 3:5) : not compatible with requested type
I'm not sure what the error is here.
Before we begin, I have a few remarks:
Please show all of your attempt.
In particular, make sure your example is a minimal reproducible example
Do not delete or shorten code unless asked.
Keep the scope of your question narrow.
Using optim from R in C++ is very different than using in C++ the underlying C++ code for opt() from nlopt.
Avoid spamming questions.
If you find yourself asking more than 3 questions in rapid succession, please read the documentation or talk in person with someone familiar with the content.
I've cleaned up your question as a result... But, in the future, this likely will not happen.
Data Generation Process
The data generation process seems to be done in 2 steps: First, outside of the example_r function, and, then inside the function.
This should be simplified so that it is done outside of the optimization function. For example:
generate_data = function(n, x_mu = 0, y_mu = 1, beta = 1.5) {
x = rnorm(n, x_mu)
y = rnorm(n, y_mu)
phi = rnorm(length(x))
tar_val = (x ^ 2 + y ^ 2) * beta * phi
simulated_data = list(x = x, y = y, beta = beta, tar_val = tar_val)
return(simulated_data)
}
Objective Functions and R's optim
Objective functions must return a single value, e.g. a scalar, in R. Under the posted R code, there was effectively two functions designed to act as an objective function in sequence, e.g.
objftn_r = function(beta, x, y) {
obj_val = (x ^ 2 + y ^ 2) * beta
return(obj_val)
}
b1 = optim(b, function(beta) {
sum((objftn_r(beta, x, y) - tar_val) ^ 2)
}, method = "BFGS")$par
This objective function should therefore be re-written as:
objftn_r = function(beta_hat, x, y, tar_val) {
# The predictions generate will be a vector
est_val = (x ^ 2 + y ^ 2) * beta_hat
# Here we apply sum of squares which changes it
# from a vector into a single "objective" value
# that optim can work with.
obj_val = sum( ( est_val - tar_val) ^ 2)
return(obj_val)
}
From there, the calls should align as:
sim_data = generate_data(10, 1, 2, .3)
b1 = optim(sim_data$beta, fn = objftn_r, method = "BFGS",
x = sim_data$x, y = sim_data$y, tar_val = sim_data$tar_val)$par
RcppArmadillo Objective Functions
Having fixed the scope and behavior of the R code, let's focus on translating it into RcppArmadillo.
In particular, notice that the objection function defined after the translation returns a vector and not a scalar into optim, which is not a single value. Also of concern is the lack of a tar_val parameter in the objective function. With this in mind, the objective function will translate to:
// changed function return type and
// the return type of first parameter
double obj_fun_rcpp(double& beta_hat,
arma::vec& x, arma::vec& y, arma::vec& tar_val){
// Changed from % to * as it is only appropriate if
// `beta_hat` is the same length as x and y.
// This is because it performs element-wise multiplication
// instead of a scalar multiplication on a vector
arma::vec est_val = (pow(x, 2) - pow(y, 2)) * beta_hat;
// Compute objective value
double obj_val = sum( pow( est_val - tar_val, 2) );
// Return a single value
return obj_val;
}
Now, with the objective function set, let's address the Rcpp call into R for optim() from C++. In this function, the parameters of the
function must be explicitly supplied. So, x, y, and tar_val must be present in the optim call. Thus, we will end up with:
// [[Rcpp::export]]
arma::vec optim_rcpp(double& init_val,
arma::vec& x, arma::vec& y, arma::vec& tar_val){
// Extract R's optim function
Rcpp::Environment stats("package:stats");
Rcpp::Function optim = stats["optim"];
// Call the optim function from R in C++
Rcpp::List opt_results = optim(Rcpp::_["par"] = init_val,
// Make sure this function is not exported!
Rcpp::_["fn"] = Rcpp::InternalFunction(&obj_fun_rcpp),
Rcpp::_["method"] = "BFGS",
// Pass in the other parameters as everything
// is scoped environmentally
Rcpp::_["x"] = x,
Rcpp::_["y"] = y,
Rcpp::_["tar_val"] = tar_val);
// Extract out the estimated parameter values
arma::vec out = Rcpp::as<arma::vec>(opt_results[0]);
// Return estimated values
return out;
}
All together
The full functioning code can be written in test_optim.cpp and compiled via sourceCpp() as:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// changed function return type and
// the return type of first parameter
// DO NOT EXPORT THIS FUNCTION VIA RCPP ATTRIBUTES
double obj_fun_rcpp(double& beta_hat,
arma::vec& x, arma::vec& y, arma::vec& tar_val){
// Changed from % to * as it is only appropriate if
// `beta_hat` is the same length as x and y.
// This is because it performs element-wise multiplication
// instead of a scalar multiplication on a vector
arma::vec est_val = (pow(x, 2) - pow(y, 2)) * beta_hat;
// Compute objective value
double obj_val = sum( pow( est_val - tar_val, 2) );
// Return a single value
return obj_val;
}
// [[Rcpp::export]]
arma::vec optim_rcpp(double& init_val,
arma::vec& x, arma::vec& y, arma::vec& tar_val){
// Extract R's optim function
Rcpp::Environment stats("package:stats");
Rcpp::Function optim = stats["optim"];
// Call the optim function from R in C++
Rcpp::List opt_results = optim(Rcpp::_["par"] = init_val,
// Make sure this function is not exported!
Rcpp::_["fn"] = Rcpp::InternalFunction(&obj_fun_rcpp),
Rcpp::_["method"] = "BFGS",
// Pass in the other parameters as everything
// is scoped environmentally
Rcpp::_["x"] = x,
Rcpp::_["y"] = y,
Rcpp::_["tar_val"] = tar_val);
// Extract out the estimated parameter values
arma::vec out = Rcpp::as<arma::vec>(opt_results[0]);
// Return estimated values
return out;
}
Test Case
# Setup some values
beta = 2
x = 2:4
y = 3:5
# Set a seed for reproducibility
set.seed(111)
phi = rnorm(length(x))
tar_val = (x ^ 2 + y ^ 2) * beta * phi
optim_rcpp(beta, x, y, tar_val)
# [,1]
# [1,] 2.033273
Note: If you would like to avoid a matrix of size 1 x1 from being returned please use double as the return parameter of optim_rcpp and switch Rcpp::as<arma::vec> to Rcpp::as<double>
Related
I am trying to convert my R function to C++ using Rcpp, but I came around errors that I don't understand quite well.
The following code gives my R function, my (poor) attempt to translate it and some examples of uses at the end (testing that the two function return the same thing...)
My R Code function:
intersect_rectangles <- function(x_min, x_max, y_min, y_max) {
rez <- list()
rez$min <- pmax(x_min, y_min)
rez$max <- pmin(x_max, y_max)
if (any(rez$min > rez$max)) {
return(list(NULL))
}
return(rez)
}
My attempt to create the same function with Rcpp.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
List Cpp_intersect_rectangles(NumericVector x_min,NumericVector
x_max,NumericVector y_min,NumericVector y_max) {
// Create a list :
NumericVector min = pmax(x_min,y_min);
NumericVector max = pmin(x_max,y_max);
List L = List::create(R_NilValue);
if (! any(min > max)) {
L = List::create(Named("min") = min , _["max"] = max);
}
return(L);
}
I receive the following error messages:
/Library/Frameworks/R.framework/Versions/3.5/Resources/library/Rcpp/include/Rcpp/sugar/logical/SingleLogicalResult.h:36:2: error: implicit instantiation of undefined template 'Rcpp::sugar::forbidden_conversion<false>'
forbidden_conversion<x>{
^
/Library/Frameworks/R.framework/Versions/3.5/Resources/library/Rcpp/include/Rcpp/sugar/logical/SingleLogicalResult.h:74:40: note: in instantiation of template class 'Rcpp::sugar::conversion_to_bool_is_forbidden<false>' requested here
conversion_to_bool_is_forbidden<!NA> x ;
^
file637e53281965.cpp:13:9: note: in instantiation of member function 'Rcpp::sugar::SingleLogicalResult<true, Rcpp::sugar::Negate_SingleLogicalResult<true, Rcpp::sugar::Any<true, Rcpp::sugar::Comparator<14, Rcpp::sugar::greater<14>, true, Rcpp::Vector<14, PreserveStorage>, true, Rcpp::Vector<14, PreserveStorage> > > > >::operator bool' requested here
if (! any(min > max))
If the Rcpp function is implemented correctly, then the following should work:
u = rep(0,4)
v = rep(1,4)
w = rep(0.3,4)
x = c(0.8,0.8,3,3)
all.equal(intersect_rectangles(u,v,w,x), Cpp_intersect_rectangles(u,v,w,x))
all.equal(intersect_rectangles(u,v,w,w), Cpp_intersect_rectangles(u,v,w,w))
What's wrong with my cpp code?
The reason the code isn't translating correctly is due to how the any() Rcpp sugar implementation was created. In particular, we have that:
The actual return type of any(X) is an instance of the
SingleLogicalResult template class, but the functions is_true
and is_false may be used to convert the return value to bool.
Per https://thecoatlessprofessor.com/programming/unofficial-rcpp-api-documentation/#any
Therefore, the solution is to add .is_true() to the any() function call, e.g. !any(condition).is_true().
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
List Cpp_intersect_rectangles(NumericVector x_min, NumericVector x_max,
NumericVector y_min, NumericVector y_max) {
// Create a list :
NumericVector min = pmax(x_min, y_min);
NumericVector max = pmin(x_max, y_max);
List L = List::create(R_NilValue);
if (! any(min > max).is_true()) {
// ^^^^^^^^^ Added
L = List::create(Named("min") = min , _["max"] = max);
}
return(L);
}
Then, through testing we get:
u = rep(0,4)
v = rep(1,4)
w = rep(0.3,4)
x = c(0.8,0.8,3,3)
all.equal(intersect_rectangles(u,v,w,x), Cpp_intersect_rectangles(u,v,w,x))
# [1] TRUE
all.equal(intersect_rectangles(u,v,w,w), Cpp_intersect_rectangles(u,v,w,w))
# [1] TRUE
I have a question concerning a fast implementation. Imagine that you have a matrix Ys in which each row refers to a vector of observed values stemming from a multivariate normal distribution, e.g.,
Ys = matrix(c(1.0,1.0,1.0,0.0,0.5,0.6,0.1,0.1,0.3), nrow = 3, ncol = 3)
Furthermore, there is a matrix Sigs in which each row refers to the diagonal elements of the variance covariance matrix for each of the outcome vectors in Ys, e.g.,
Sigs = matrix(c(1.0,0.5,0.1,0.2,0.3,0.4,0.3,0.7,0.8), nrow = 3, ncol = 3)
What I want to do is to compute the density value of each row in Ys given the diagonal elemnts in the respective row in Sigs.
One could use a for-loop in R, e.g.
colSigs = ncol(Sigs)
res = rep(0,3)
means = rep(0,colSigs)
for (i in 1:nrow(Ys) ) {
sigma = diag(Sigs[i,],colSigs)
res[i] = mvtnorm::dmvnorm(Ys[i,],means,sigma)
}
however, in my case Ys and Sigs contain about 100,000 rows. So I wrote an Rcpp-function that is considerably faster. Nevertheless, I was wondering whether there is a fancy trick (a more efficient way) so that I do not have to do looping? Any ideas are welcome.
----
EDIT: I was asked to add the Rcpp functions. Here, you go:
This function computes the quadratic form appearing in the multivariate normal density:
double dmvnorm_distance( arma::rowvec y, arma::mat Sigma )
{
int n = Sigma.n_rows;
double res=0;
double fac=1;
for (int ii=0; ii<n; ii++){
for (int jj=ii; jj<n; jj++){
if (ii==jj){ fac = 1; } else { fac = 2;}
res += fac *y(0,ii) * Sigma(ii,jj) * y(0,jj);
}
}
return res;
}
This function computes the density value:
double dmvnorm_rcpp( arma::rowvec y, arma::mat Sigma )
{
int p = Sigma.n_rows;
// inverse Sigma
arma::mat Sigma1 = arma::inv(Sigma);
// determinant Sigma
double det_Sigma = arma::det(Sigma);
// distance
double dist = dmvnorm_distance( y, Sigma1);
double pi1 = 3.14159265358979;
double l1 = - p * std::log(2*pi1) - dist - std::log( det_Sigma );
double ll = 0.5 * l1;
return ll;
}
and this function contains the for-loop and is called from R:
Rcpp::NumericVector mvnorm_loop( arma::mat Ys, arma::mat SIGs )
{
int n = Ys.n_rows;
Rcpp::NumericVector out(n);
for (int ii=0; ii<n; ii++){
// get yi and diagonal entries
arma::rowvec yi = Ys.row(ii);
arma::rowvec si = SIGs.row(ii);;
// make Sigma
arma::mat Sigma = arma::diagmat(si);
// compute likelihood value
out[ii] = dmvnorm_rcpp( yi, Sigma );
}
return out;
}
So basically the question is whether there is an alternative way to implement the insertion in Rcpp to make the whole thing even more faster.
----
Best,
Stefan
PS: I also used apply in R and it is slower than the Rcpp loop-function.
I'm very new to Rcpp. I 'm trying to write a coordinate descent algorithm for lasso in Rcpp as a self study. The code return an error:
Mat::init(): requested size is not compatible with column vector layout
I also have some problems while writing the code.
no matching function for call to 'sign'
no matching function for call to 'ifelse'
pow(X.col(j),2) : no viable conversion
I write
(S1>0)-(S1<0) for (1) to compute the sign of S1,
a if(){}else{} statement for (2) and
X.col(j)%X.col(j) for (3).
Any suggestion, please?
Here is the code.
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
mat betamat(NumericMatrix Xr, NumericVector yr, NumericVector lambdar, double tol=0.0000001) {
int N = Xr.nrow(), p = Xr.ncol(), l = lambdar.size();
mat X(Xr.begin(), N, p, false);
colvec y(yr.begin(), yr.size(), false);
vec lambda(lambdar.begin(), lambdar.size(),false);
colvec ols = solve(X,y);
mat betas = zeros<mat>(p,l);
//
bool converged = false;
for (int i = 0; i < l; ++i) {
colvec b = zeros<vec>(p);
colvec r = y-X*b;
while(converged == false){
colvec beta_old = betas;
for(int j = 0; j < p; ++j){
r = r + X.col(j)*b(j);
double xr = dot(X.col(j),r);
double S1 = xr/N;
double xx = sum(X.col(j)%X.col(j))/N;
b(j) =((S1>0)-(S1<0))*(abs(S1)-lambda(i))/xx;
if(b(j)>0){
b(j)=b(j);
}else{
b(j)=0;
}
r = r - X.col(j)*b(j);
}
converged = (sum(abs(betas - beta_old)) < tol);
}
betas.col(i) = b;
}
return betas;
}
In R, I'm calling this with
library(Rcpp)
sourceCpp("filename.cpp")
set.seed(1)
X <- matrix(rnorm(100*3),100)
y <- rnorm(100)
coefficients <- betamat(X,y,seq(0,1,0.0005))
I have a very basic question about C++ integration in R via Rcpp. Suppose I want to implement a simple function like this one in C++:
inte = function(x, y, a, b){
model = approxfun(x, y)
return(integrate(model, a, b)$value)
}
So a very basic approach would be to call R's function 'integrate' as much as needed:
// [[Rcpp::export]]
double intecxx(Function inte, NumericVector x, NumericVector y,
double a, double b) {
NumericVector res;
res = inte(x, y, a, b);
return res[0];
}
However, I need to use this 'intecxx' in many other parts of my C++ code, so calling it from somewhere else results in 'inte' not being available in the scope. Any help is appreciated.
If you are willing to modify intecxx by hardcoding the call to inte inside the body, rather than trying to pass it as a parameter, you could use this approach:
#include <Rcpp.h>
/*** R
inte = function(x, y, a, b){
model = approxfun(x, y)
return(integrate(model, a, b)$value)
}
.x <- 1:10
set.seed(123)
.y <- rnorm(10)
*/
// [[Rcpp::export]]
double intecxx(Rcpp::NumericVector x, Rcpp::NumericVector y, double a, double b) {
Rcpp::NumericVector res;
Rcpp::Environment G = Rcpp::Environment::global_env();
Rcpp::Function inte = G["inte"];
res = inte(x, y, a, b);
return res[0];
}
I defined inte in the same source file as intecxx to ensure that it is available in the global environment, and therefore callable from within intecxx through G.
R> inte(.x, .y, 1, 10)
[1] 1.249325
R> intecxx(.x, .y, 1, 10)
[1] 1.249325
R> all.equal(inte(.x, .y, 1, 10),intecxx(.x, .y, 1, 10))
[1] TRUE
Consider the following R code,
## ----------- R version -----------
caller <- function(x=1:3, fun = "identity", ...){
## do some other stuff
## ...
## then call the function
eval(call(fun, x))
}
fun1 <- function(x, ...){
x + x
}
fun2 <- function(x, a = 10) a * x
caller(fun = "fun1")
caller(fun = "fun2")
The user can pass a function name "fun", that is used by caller. I wish to perform the same task with RcppArmadillo objects (as part of a more complex task, obviously). The function would be defined in C++, and the user selects it at the R level by referring to its name:
caller_cpp(1:3, "fun1_cpp")
or
caller_cpp(1:3, "fun2_cpp")
etc.
Here's my naive attempt for the caller function, that even fails to compile:
## ----------- C++ version -----------
library(Rcpp)
require( RcppArmadillo )
sourceCpp( code = '
// [[Rcpp::depends("RcppArmadillo")]]
#include <RcppArmadillo.h>
using namespace arma ;
using namespace Rcpp ;
colvec fun1_cpp(const colvec x)
{
colvec y ;
y = x + x;
return (y);
}
colvec fun2_cpp(const colvec x)
{
colvec y ;
y = 10*x;
return (y);
}
// mysterious pointer business in an attempt
// to select a compiled function by its name
typedef double (*funcPtr)(SEXP);
SEXP putFunPtrInXPtr(SEXP funname) {
std::string fstr = Rcpp::as<std::string>(funname);
if (fstr == "fun1")
return(Rcpp::XPtr<funcPtr>(new funcPtr(&fun1_cpp)));
else if (fstr == "fun2")
return(Rcpp::XPtr<funcPtr>(new funcPtr(&fun2_cpp)));
}
// [[Rcpp::export]]
colvec caller_cpp(const colvec x, character funname)
{
Rcpp::XPtr fun = putFunPtrInXPtr(funname);
colvec y ;
y = fun(x);
return (y);
}
')
Edit: adapted the example after following Dirk's suggestion to look at RcppDE.
(Sometime you need to use svn log ... on files to see how dated they are...)
I think a better use case is in my "port" of the C-based DEoptim to Rcpp / RcppArmadillo: RcppDE. In it, I allow the optimization routine to use either an R function (as DEoptim does) or a user-supplied compiled function -- which is what you want here as I understand it.
There is a tiny bit of C++ scaffolding, but you should have no problem following that.
Edit on 2013-01-21 Below is a complete solution which I have also justed posted as this new post at the Rcpp Gallery -- including some comments and sample usage.
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace arma;
using namespace Rcpp;
vec fun1_cpp(const vec& x) { // a first function
vec y = x + x;
return (y);
}
vec fun2_cpp(const vec& x) { // and a second function
vec y = 10*x;
return (y);
}
typedef vec (*funcPtr)(const vec& x);
// [[Rcpp::export]]
XPtr<funcPtr> putFunPtrInXPtr(std::string fstr) {
if (fstr == "fun1")
return(XPtr<funcPtr>(new funcPtr(&fun1_cpp)));
else if (fstr == "fun2")
return(XPtr<funcPtr>(new funcPtr(&fun2_cpp)));
else
return XPtr<funcPtr>(R_NilValue); // runtime error as NULL no XPtr
}
// [[Rcpp::export]]
vec callViaString(const vec x, std::string funname) {
XPtr<funcPtr> xpfun = putFunPtrInXPtr(funname);
funcPtr fun = *xpfun;
vec y = fun(x);
return (y);
}
// [[Rcpp::export]]
vec callViaXPtr(const vec x, SEXP xpsexp) {
XPtr<funcPtr> xpfun(xpsexp);
funcPtr fun = *xpfun;
vec y = fun(x);
return (y);
}