I am trying to update a variable within a loop but I am receiving the error
static assertion failed: cannot convert type to SEXP
I am trying to reproduce the following R code in Rcpp:
> v = rep(1, 5)
> for(k in 0:3){
+ v = cumsum(v)
+ }
> print(v)
[1] 1 5 15 35 70
I have gone through the following attempts (uncommenting / commenting the relevant chunks of code) but all give the same error. How can I do this and what am I doing wrong please?
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector fun() {
IntegerVector v = rep(1, 5);
// Attempt 1.
for(int k = 0; k < 4; ++k){
v = cumsum(v);
}
// Attempt 2.
// IntegerVector tempv;
// for(int k = 0; k < 4; ++k){
// tempv = cumsum(v);
// v = tempv;
// }
// can reproduce error more simply with the following:
// so issue is assigning back to variable or change of class?
// v = cumsum(v);
// Attempt 3.
// IntegerVector tempv;
// for(int k = 0; k < 4; ++k){
// tempv = cumsum(v);
// v = as<IntegerVector>(tempv);
// }
return v;
}
EDIT:
Okay, so I have something working (thanks to this)
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector fun() {
IntegerVector v = rep(1, 5);
for(int k = 0; k < 4; ++k){
std::partial_sum(v.begin(), v.end(), v.begin());
}
return v;
}
So I suppose my question is now what I was doing wrong previously? Thanks
As I hinted in my earlier comment, that is supposed to work. Since it does not, you have found a bug.
Whether it is worth fixing is another matter. Whenever I compute on or with vectors, I typically reach for RcppArmadillo. So here is a minimal (working) version of your first attempt, in RcppArmadillo.
Code
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::ivec fun() {
arma::ivec v(5, arma::fill::ones);
for (int k=0; k<3; k++) {
v = arma::cumsum(v);
}
return(v);
}
/*** R
fun()
*/
Output
R> sourceCpp("~/git/stackoverflow/59936632/answer.cpp")
R> fun()
[,1]
[1,] 1
[2,] 4
[3,] 10
[4,] 20
[5,] 35
R>
Edit
Made one small fix and replaced the earlier C++11 curly-init with a call to ones to replicate rep(1,5).
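For comparison, the same repeated cumulative sum can be sketched in plain C++ with std::partial_sum, the approach from the question's edit. This is only an illustration: the function name repeated_cumsum and its parameters are made up here, and std::vector<int> stands in for the Rcpp and Armadillo vector types. Note that the R loop over 0:3 runs four times, so four passes are needed to reproduce 1 5 15 35 70; three passes give the 1 4 10 20 35 shown above.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Rcpp or Armadillo): start from a
// vector of `n` ones and apply an in-place cumulative sum `passes` times.
std::vector<int> repeated_cumsum(int n, int passes) {
    std::vector<int> v(n, 1);
    for (int k = 0; k < passes; ++k) {
        // Writing back into the input range is allowed for partial_sum.
        std::partial_sum(v.begin(), v.end(), v.begin());
    }
    return v;
}
```

Calling repeated_cumsum(5, 4) corresponds to the R loop in the question, and repeated_cumsum(5, 3) to the loop with k < 3.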
Related
Hi, I am new to Rcpp and am computing the inner product of two variables, but I get the error "inner_product was not declared in this scope" for the following code:
#include <math.h>
#include <RcppCommon.h>
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector polynomial_kernel(NumericVector x, NumericMatrix Y, double scale = 1, double offset =
1, int d=1){
int n = Y.nrow();
NumericVector kernel(n);
for (int j = 0; j < n; j++){
NumericVector v = Y( j,_ );
double crossProd =innerProduct(x,v);
kernel[j]= pow((scale*crossProd+offset),2);
}
return kernel;
}
Please help me to resolve this problem.
Below is a simpler, repaired version of your code that actually compiles. It uses Armadillo types for consistency and, instead of calling a non-existent "inner_product" routine, computes the inner product of two vectors the standard way, via multiplication.
#include <RcppArmadillo.h> // also pulls in Rcpp.h and cmath
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::vec polynomial_kernel(arma::vec x, arma::mat Y,
double scale = 1, double offset = 1, int d=1) {
int n = Y.n_rows;
arma::vec kernel(n);
for (int j = 0; j < n; j++){
arma::rowvec v = Y.row(j);
double crossProd = arma::as_scalar(v * x);
kernel[j] = std::pow((scale*crossProd+offset),2);
}
return kernel;
}
Your example was not a minimal, complete, and verifiable example, so I cannot show it on any data you might have supplied. On some made-up data it seems to work:
R> set.seed(123)
R> polynomial_kernel(runif(4), matrix(rnorm(16),4))
[,1]
[1,] 3.317483
[2,] 3.055690
[3,] 1.208345
[4,] 0.301834
R>
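As a side note, the C++ standard library does offer std::inner_product in <numeric>, so the same kernel can be sketched without Rcpp or Armadillo. The version below is an assumption-laden illustration, not the answer's method: the name polynomial_kernel_std and the row-of-vectors layout for Y are invented here, and unlike the code above it raises to the power d rather than a fixed 2.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the kernel above: x is a vector, Y a list of rows.
// Assumption: the exponent is the degree d, not the hard-coded 2 used above.
std::vector<double> polynomial_kernel_std(
        const std::vector<double>& x,
        const std::vector<std::vector<double>>& Y,
        double scale = 1.0, double offset = 1.0, int d = 1) {
    std::vector<double> kernel;
    kernel.reserve(Y.size());
    for (const auto& row : Y) {
        // Assumes every row has the same length as x.
        double dot = std::inner_product(row.begin(), row.end(), x.begin(), 0.0);
        kernel.push_back(std::pow(scale * dot + offset, d));
    }
    return kernel;
}
```

The initial value 0.0 is a double on purpose; passing the integer 0 would make std::inner_product accumulate in int and truncate the result.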
I am using RcppParallel to speed up some calculations. However, I am running out of memory in the process, so I would like to save only those results within the parallel loop that pass some relevance threshold. Below is a toy example to illustrate my point:
#include <Rcpp.h>
#include <RcppParallel.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::plugins(cpp11)]]
struct Example : public RcppParallel::Worker {
RcppParallel::RVector<double> xvals, xvals_output, yvals;
Example(const NumericVector & xvals, NumericVector & yvals, NumericVector & xvals_output) :
xvals(xvals), xvals_output(xvals_output), yvals(yvals) {}
void operator()(std::size_t begin, size_t end) {
for(std::size_t i=begin; i < end; i++) {
double y = xvals[i] * (xvals[i] - 1);
// if(y < 0) {
// xvals_output.push_back(xvals[i]);
// yvals.push_back(y);
// }
xvals_output[i] = xvals[i];
yvals[i] = y;
}
}
};
// [[Rcpp::export]]
List find_values(NumericVector xvals) {
NumericVector xvals_output(xvals.size());
NumericVector yvals(xvals.size());
Example ex(xvals, yvals, xvals_output);
parallelFor(0, xvals.size(), ex);
List L = List::create(xvals_output, yvals);
return(L);
}
The R code would be:
find_values(seq(-10,10, by=0.5))
The commented out code is what I would like to do.
That is, I would like to initialize an empty vector, and append only the y-values that pass a certain threshold and also the associated x-values.
In my real usage, I am calculating a MxN matrix, so memory is an issue.
What is the correct way to approach this issue?
If anyone ever comes across a similar problem, here's a solution using "concurrent_vector" from TBB (which RcppParallel uses under the hood and is available as a header).
#include <Rcpp.h>
#include <RcppParallel.h>
#include <tbb/concurrent_vector.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::plugins(cpp11)]]
struct Example : public RcppParallel::Worker {
RcppParallel::RVector<double> xvals;
tbb::concurrent_vector< std::pair<double, double> > &output;
Example(const NumericVector & xvals, tbb::concurrent_vector< std::pair<double, double> > &output) :
xvals(xvals), output(output) {}
void operator()(std::size_t begin, size_t end) {
for(std::size_t i=begin; i < end; i++) {
double y = xvals[i] * (xvals[i] - 1);
if(y < 0) {
output.push_back( std::pair<double, double>(xvals[i], y) );
}
}
}
};
// [[Rcpp::export]]
List find_values(NumericVector xvals) {
tbb::concurrent_vector< std::pair<double, double> > output;
Example ex(xvals,output);
parallelFor(0, xvals.size(), ex);
NumericVector xout(output.size());
NumericVector yout(output.size());
for(int i=0; i<output.size(); i++) {
xout[i] = output[i].first;
yout[i] = output[i].second;
}
List L = List::create(xout, yout);
return(L);
}
Output:
> find_values(seq(-10,10, by=0.5))
[[1]]
[1] 0.5
[[2]]
[1] -0.25
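A different technique from the concurrent_vector used above is to give each chunk of work its own buffer and concatenate the buffers at the end, so nothing is shared while the chunks are filled. The sketch below uses only the standard library; the names collect_chunk and find_values_chunked are hypothetical, and the chunks are processed in a plain loop to keep the example dependency-free. In a parallel version, each call to collect_chunk would be handed to one worker.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Pair = std::pair<double, double>;

// Work for one chunk [begin, end): append only the pairs with y < 0
// to a buffer owned by this chunk, so no locking is needed.
void collect_chunk(const std::vector<double>& xvals,
                   std::size_t begin, std::size_t end,
                   std::vector<Pair>& local) {
    for (std::size_t i = begin; i < end; ++i) {
        double y = xvals[i] * (xvals[i] - 1);
        if (y < 0) local.emplace_back(xvals[i], y);
    }
}

// Hypothetical driver: split the input into chunks, give each chunk its
// own buffer, then concatenate the buffers in chunk order.
std::vector<Pair> find_values_chunked(const std::vector<double>& xvals,
                                      std::size_t n_chunks) {
    if (n_chunks == 0) n_chunks = 1;
    std::vector<std::vector<Pair>> buffers(n_chunks);
    const std::size_t n = xvals.size();
    const std::size_t step = (n + n_chunks - 1) / n_chunks;  // ceiling division
    for (std::size_t c = 0; c < n_chunks; ++c) {
        std::size_t begin = std::min(n, c * step);
        std::size_t end = std::min(n, begin + step);
        collect_chunk(xvals, begin, end, buffers[c]);
    }
    std::vector<Pair> out;
    for (const auto& b : buffers) out.insert(out.end(), b.begin(), b.end());
    return out;
}
```

Memory use is bounded by the number of values that pass the threshold, and the merged output keeps the input order, which concurrent push_back does not guarantee.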
I have to convert individual elements of Rcpp::IntegerVector into their string form so I can add another string to them. My code looks like this:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::String int_to_char_single_fun(int x){
// Obtain environment containing function
Rcpp::Environment base("package:base");
// Make function callable from C++
Rcpp::Function int_to_string = base["as.character"];
// Call the function and receive its list output
Rcpp::String res = int_to_string(Rcpp::_["x"] = x); // example of original param
// Return test object in list structure
return (res);
}
//[[Rcpp::export]]
Rcpp::CharacterVector add_chars_to_int(Rcpp::IntegerVector x){
int n = x.size();
Rcpp::CharacterVector BASEL_SEG(n);
for(int i = 0; i < n; i++){
BASEL_SEG[i] = "B0" + int_to_char_single_fun(x[i]);
}
return BASEL_SEG;
}
/*** R
int_vec <- as.integer(c(1,2,3,4,5))
BASEL_SEG_char <- add_chars_to_int(int_vec)
*/
I get the following error:
no match for 'operator+'(operand types are 'const char[3]' and 'Rcpp::String')
I cannot import any C++ libraries like Boost to do this and can only use Rcpp functionality to do this. How do I add string to integer here in Rcpp?
We basically covered this over at the Rcpp Gallery when we covered Boost in an example for lexical_cast (though that one went the other way). So rewriting it quickly yields this:
Code
// We can now use the BH package
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/lexical_cast.hpp>
using namespace Rcpp;
using boost::lexical_cast;
using boost::bad_lexical_cast;
// [[Rcpp::export]]
std::vector<std::string> lexicalCast(std::vector<int> v) {
std::vector<std::string> res(v.size());
for (unsigned int i=0; i<v.size(); i++) {
try {
res[i] = lexical_cast<std::string>(v[i]);
} catch(bad_lexical_cast &) {
res[i] = "(failed)";
}
}
return res;
}
/*** R
lexicalCast(c(42L, 101L))
*/
Output
R> Rcpp::sourceCpp("/tmp/lexcast.cpp")
R> lexicalCast(c(42L, 101L))
[1] "42" "101"
R>
Alternatives
Because converting numbers to strings is as old as computing itself you could also use:
itoa()
snprintf()
streams
and probably a few more I keep forgetting.
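Two of the alternatives listed above, snprintf() and streams, can be sketched with the standard library alone. The helper names label_snprintf and label_stream are invented for this illustration, and the "B0" prefix is taken from the question.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper: prefix plus integer, formatted with snprintf.
std::string label_snprintf(int x) {
    char buf[32];  // ample room for "B0" plus any 32-bit int
    std::snprintf(buf, sizeof buf, "B0%d", x);
    return std::string(buf);
}

// Hypothetical helper: the same result built with a string stream.
std::string label_stream(int x) {
    std::ostringstream os;
    os << "B0" << x;
    return os.str();
}
```

Either result can then be assigned to an element of a character vector, as the question's loop does.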
As others have pointed out, there are several ways to do this. Here are two very straightforward approaches.
1. std::to_string
Rcpp::CharacterVector add_chars_to_int1(Rcpp::IntegerVector x){
int n = x.size();
Rcpp::CharacterVector BASEL_SEG(n);
for(int i = 0; i < n; i++){
BASEL_SEG[i] = "B0" + std::to_string(x[i]);
}
return BASEL_SEG;
}
2. Creating a new Rcpp::CharacterVector
Rcpp::CharacterVector add_chars_to_int2(Rcpp::IntegerVector x){
int n = x.size();
Rcpp::CharacterVector BASEL_SEG(n);
Rcpp::CharacterVector myIntToStr(x.begin(), x.end());
for(int i = 0; i < n; i++){
BASEL_SEG[i] = "B0" + myIntToStr[i];
}
return BASEL_SEG;
}
Calling them:
add_chars_to_int1(int_vec) ## using std::to_string
[1] "B01" "B02" "B03" "B04" "B05"
add_chars_to_int2(int_vec) ## converting to CharacterVector
[1] "B01" "B02" "B03" "B04" "B05"
I was trying to use Rcpp/Armadillo with OpenMP to speed up a loop in R. The loop takes as input a matrix in which each row contains indices into a location vector (or matrix, for 2D locations), along with other matrices and vectors. Inside the loop, I extract each row of the index matrix, find the corresponding locations, calculate the distance matrix and the covariance matrix, do a Cholesky decomposition and a backsolve, and save the backsolve results to a new matrix. Here is the Rcpp code:
`#include <iostream>
#include <RcppArmadillo.h>
#include <omp.h>
#include <Rcpp.h>
// [[Rcpp::plugins(openmp)]]
using namespace Rcpp;
using namespace arma;
using namespace std;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
mat NZentries_new2 (int m, int nnp, const mat& locs, const umat& revNNarray, const mat& revCondOnLatent, const vec& nuggets, const vec covparms){
// initialized the output matrix
mat Lentries=zeros(nnp,m+1);
// initialized objects in parallel part
int n0; //number of !is_na elements
uvec inds;//
vec revCon_row;//
uvec inds00;//
vec nug;//
mat covmat;//
vec onevec;//
vec M;//
mat dist;//
int k;//
omp_set_num_threads(2);// selects the number of cores to use.
#pragma omp parallel for shared(locs,revNNarray,revCondOnLatent,nuggets,nnp,m,Lentries) private(k,M,dist,onevec,covmat,nug,n0,inds,revCon_row,inds00) default(none) schedule(static)
for (k = 0; k < nnp; k++) {
// extract a row to work with
inds=revNNarray.row(k).t();
revCon_row=revCondOnLatent.row(k).t();
if (k < m){
n0=k+1;
} else {
n0=m+1;
}
// extract locations
inds00=inds(span(m+1-n0,m))-ones<uvec>(n0);
nug=nuggets.elem(inds00) % (ones(n0)-revCon_row(span(m+1-n0,m))); // vec is vec, cannot convert to mat
dist=calcPWD2(locs.rows(inds00));
#pragma omp critical
{
//calculate covariance matrix
covmat= MaternFun(dist,covparms) + diagmat(nug) ; // summation from arma
}
// get last row of inverse Cholesky
onevec = zeros(n0);
onevec[n0-1] = 1;
M=solve(chol(covmat,"upper"),onevec);
// save the entries to matrix
Lentries(k,span(0,n0-1)) = M.t();
}
return Lentries;
}`
The current version works fine, but it is slow (almost the same as the non-parallel version). If I take the line out of the omp critical block, it causes a segmentation fault and R crashes. MaternFun is a function I defined as below, along with several other small functions. So my question is: why does MaternFun have to stay in the critical section?
// [[Rcpp::export]]
mat MaternFun( mat distmat, vec covparms ){
int d1 = distmat.n_rows;
int d2 = distmat.n_cols;
int j1;
int j2;
mat covmat(d1,d2);
double scaledist;
double normcon = covparms(0)/(pow(2.0,covparms(2)-1)*Rf_gammafn(covparms(2)));
for (j1 = 0; j1 < d1; j1++){
for (j2 = 0; j2 < d2; j2++){
if ( distmat(j1,j2) == 0 ){
covmat(j1,j2) = covparms(0);
} else {
scaledist = distmat(j1,j2)/covparms(1);
covmat(j1,j2) = normcon*pow( scaledist, covparms(2) )*
Rf_bessel_k(scaledist,covparms(2),1.0);
}
}
}
return covmat;
}
// [[Rcpp::export]]
double dist2(double lat1,double long1,double lat2,double long2) {
double dist = sqrt(pow(lat1 - lat2, 2) + pow(long1 - long2, 2)) ;
return (dist) ;
}
// [[Rcpp::export]]
mat calcPWD2( mat x) {//Rcpp::NumericMatrix
int outrows = x.n_rows ;
int outcols = x.n_rows ;
mat out(outrows, outcols) ;
for (int arow = 0 ; arow < outrows ; arow++) {
for (int acol = 0 ; acol < outcols ; acol++) {
out(arow, acol) = dist2(x(arow, 0),x(arow, 1),
x(acol, 0),x(acol, 1)) ; //extract element from mat
}
}
return (out) ;
}
Here are some sample inputs for testing MaternFun in R:
library(fields)
distmat=rdist(1:5) # distance matrix
covparms=c(1,0.2,1.5)
The issue is that there are two calls to R math functions (Rf_bessel_k and Rf_gammafn), and these must be accessed from a single thread rather than in parallel.
To get around this, let's add a dependency on boost via BH to obtain the cyl_bessel_k and tgamma functions. Alternatively, there is always the option of reimplementing R's besselK and gamma in C++ so it doesn't use the single-threaded R variant.
This gives:
#include <Rcpp.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>
// [[Rcpp::depends(BH)]]
// [[Rcpp::export]]
double besselK_boost(double x, double v) {
return boost::math::cyl_bessel_k(v, x);
}
// [[Rcpp::export]]
double gamma_fn_boost(double x) {
return boost::math::tgamma(x);
}
Test Code
x0 = 9.536743e-07
nu = -10
all.equal(besselK(x0, nu), besselK_boost(x0, nu))
# [1] TRUE
x = 2
all.equal(gamma(x), gamma_fn_boost(x))
# [1] TRUE
Note: The order of parameters for boost's variant differs from R's:
cyl_bessel_k(v, x)
Rf_bessel_k(x, v, expon.scaled = FALSE)
From here, we can modify MaternFun. Unfortunately, because calcPWD2 is missing, the furthest we can go is switching to boost and adding the OpenMP include guard.
#include <RcppArmadillo.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>
#ifdef _OPENMP
#include <omp.h>
#endif
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::export]]
arma::mat MaternFun(arma::mat distmat, arma::vec covparms) {
int d1 = distmat.n_rows;
int d2 = distmat.n_cols;
int j1;
int j2;
arma::mat covmat(d1,d2);
double scaledist;
double normcon = covparms(0) /
(pow(2.0, covparms(2) - 1) * boost::math::tgamma(covparms(2)));
for (j1 = 0; j1 < d1; ++j1){
for (j2 = 0; j2 < d2; ++j2){
if ( distmat(j1, j2) == 0 ){
covmat(j1, j2) = covparms(0);
} else {
scaledist = distmat(j1, j2)/covparms(1);
covmat(j1, j2) = normcon * pow( scaledist, covparms(2) ) *
boost::math::cyl_bessel_k(covparms(2), scaledist);
}
}
}
return covmat;
}
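If adding BH is not an option, the gamma part of the normalising constant can at least be computed with std::tgamma from <cmath> (available since C++11), which does not call into R. The sketch below is only an illustration: the name matern_normcon is invented, and sigma2 and nu are assumed labels for covparms(0) and covparms(2). The Bessel term is not covered here and still needs something like the boost call above.

```cpp
#include <cmath>

// Hypothetical helper: the normalising constant from MaternFun,
// sigma2 / (2^(nu - 1) * Gamma(nu)), using std::tgamma from <cmath>.
double matern_normcon(double sigma2, double nu) {
    return sigma2 / (std::pow(2.0, nu - 1.0) * std::tgamma(nu));
}
```

For the sample inputs covparms = c(1, 0.2, 1.5), this is 1 / (sqrt(2) * Gamma(1.5)), which equals sqrt(2 / pi).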
I have a data frame 'tmp' on which I need to perform a calculation using the last row of another data frame, 'SpreadData'. I am using the following code:
for(i in 1:ncol(tmp)){for(j in 1:nrow(tmp)){PNLData[j,i] = 10*tmp[j,i]*SpreadData[nrow(SpreadData),i]}}
Is there a faster method, using mapply or something else, so that I do not need a for loop?
Thanks
You can use sweep():
PNLData <- sweep(10 * tmp, 2, SpreadData[nrow(SpreadData), ], "*")
PS1: you can replace SpreadData[nrow(SpreadData), ] by tail(SpreadData, 1).
PS2: I think this makes two copies of your data. If you have a large matrix, you'd better use Rcpp.
Edit: Rcpp solution: put this in a .cpp file and source it.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix rcppFun(const NumericMatrix& x,
const NumericVector& lastCol) {
int n = x.nrow();
int m = x.ncol();
NumericMatrix res(n, m);
int i, j;
for (j = 0; j < m; j++) {
for (i = 0; i < n; i++) {
res(i, j) = 10 * x(i, j) * lastCol[j];
}
}
return res;
}
Then, in R, call PNLData <- rcppFun(tmp, SpreadData[nrow(SpreadData), ]).
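The loop itself does not depend on Rcpp. As a minimal sketch under stated assumptions, here is the same column-wise scaling with standard containers; the Matrix alias and the name scale_columns are hypothetical, and the matrix is stored as a vector of rows rather than in R's column-major layout.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // rows of equal length

// Hypothetical helper: res(i, j) = 10 * x(i, j) * factors[j].
// Assumes factors has one entry per column of x.
Matrix scale_columns(const Matrix& x, const std::vector<double>& factors) {
    Matrix res(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        res[i].resize(x[i].size());
        for (std::size_t j = 0; j < x[i].size(); ++j) {
            res[i][j] = 10 * x[i][j] * factors[j];
        }
    }
    return res;
}
```

Here factors plays the role of the last row of SpreadData, and only one result matrix is allocated.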