Convert individual Rcpp::IntegerVector element to a character - r

I have to convert individual elements of Rcpp::IntegerVector into their string form so I can add another string to them. My code looks like this:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::String int_to_char_single_fun(int x){
// Obtain environment containing function
Rcpp::Environment base("package:base");
// Make function callable from C++
Rcpp::Function int_to_string = base["as.character"];
// Call the function and receive its list output
Rcpp::String res = int_to_string(Rcpp::_["x"] = x); // example of original param
// Return test object in list structure
return (res);
}
//[[Rcpp::export]]
Rcpp::CharacterVector add_chars_to_int(Rcpp::IntegerVector x){
int n = x.size();
Rcpp::CharacterVector BASEL_SEG(n);
for(int i = 0; i < n; i++){
BASEL_SEG[i] = "B0" + int_to_char_single_fun(x[i]);
}
return BASEL_SEG;
}
/*** R
int_vec <- as.integer(c(1,2,3,4,5))
BASEL_SEG_char <- add_chars_to_int(int_vec)
*/
I get the following error:
no match for 'operator+'(operand types are 'const char[3]' and 'Rcpp::String')
I cannot import any C++ libraries like Boost to do this and can only use Rcpp functionality to do this. How do I add string to integer here in Rcpp?

We basically covered this over at the Rcpp Gallery when we covered Boost in an example for lexical_cast (though that one went the other way). So rewriting it quickly yields this:
Code
// We can now use the BH package
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/lexical_cast.hpp>
using namespace Rcpp;
using boost::lexical_cast;
using boost::bad_lexical_cast;
// [[Rcpp::export]]
std::vector<std::string> lexicalCast(std::vector<int> v) {
std::vector<std::string> res(v.size());
for (unsigned int i=0; i<v.size(); i++) {
try {
res[i] = lexical_cast<std::string>(v[i]);
} catch(bad_lexical_cast &) {
res[i] = "(failed)";
}
}
return res;
}
/*** R
lexicalCast(c(42L, 101L))
*/
Output
R> Rcpp::sourceCpp("/tmp/lexcast.cpp")
R> lexicalCast(c(42L, 101L))
[1] "42" "101"
R>
Alternatives
Because converting numbers to strings is as old as computing itself you could also use:
itoa()
snprintf()
streams
and probably a few more I keep forgetting.

As others have pointed out, there are several ways to do this. Here are two very straightforward approaches.
1. std::to_string
Rcpp::CharacterVector add_chars_to_int1(Rcpp::IntegerVector x){
int n = x.size();
Rcpp::CharacterVector BASEL_SEG(n);
for(int i = 0; i < n; i++){
BASEL_SEG[i] = "B0" + std::to_string(x[i]);
}
return BASEL_SEG;
}
2. Creating a new Rcpp::CharacterVector
Rcpp::CharacterVector add_chars_to_int2(Rcpp::IntegerVector x){
int n = x.size();
Rcpp::CharacterVector BASEL_SEG(n);
Rcpp::CharacterVector myIntToStr(x.begin(), x.end());
for(int i = 0; i < n; i++){
BASEL_SEG[i] = "B0" + myIntToStr[i];
}
return BASEL_SEG;
}
Calling them:
add_chars_to_int1(int_vec) ## using std::to_string
[1] "B01" "B02" "B03" "B04" "B05"
add_chars_to_int2(int_vec) ## converting to CharacterVector
[1] "B01" "B02" "B03" "B04" "B05"

Related

RcppParallel RVector push_back or something similar?

I am using RcppParallel to speed up some calculations. However, I am running out of memory in the process, so I would like to save results within the Parallel loop that are pass some relevance threshold. Below is a toy example to illustrate my point:
#include <Rcpp.h>
#include <RcppParallel.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::plugins(cpp11)]]
struct Example : public RcppParallel::Worker {
RcppParallel::RVector<double> xvals, xvals_output, yvals;
Example(const NumericVector & xvals, NumericVector & yvals, NumericVector & xvals_output) :
xvals(xvals), xvals_output(xvals_output), yvals(yvals) {}
void operator()(std::size_t begin, size_t end) {
for(std::size_t i=begin; i < end; i++) {
double y = xvals[i] * (xvals[i] - 1);
// if(y < 0) {
// xvals_output.push_back(xvals[i]);
// yvals.push_back(y);
// }
xvals_output[i] = xvals[i];
yvals[i] = y;
}
}
};
// [[Rcpp::export]]
List find_values(NumericVector xvals) {
NumericVector xvals_output(xvals.size());
NumericVector yvals(xvals.size());
Example ex(xvals, yvals, xvals_output);
parallelFor(0, xvals.size(), ex);
List L = List::create(xvals_output, yvals);
return(L);
}
The R code would be:
find_values(seq(-10,10, by=0.5))
The commented out code is what I would like to do.
That is, I would like to initialize an empty vector, and append only the y-values that pass a certain threshold and also the associated x-values.
In my real usage, I am calculating a MxN matrix, so memory is an issue.
What is the correct way to approach this issue?
If anyone ever comes across a similar problem, here's a solution using "concurrent_vector" from TBB (which RcppParallel uses under the hood and is available as a header).
#include <Rcpp.h>
#include <RcppParallel.h>
#include <tbb/concurrent_vector.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::plugins(cpp11)]]
struct Example : public RcppParallel::Worker {
RcppParallel::RVector<double> xvals;
tbb::concurrent_vector< std::pair<double, double> > &output;
Example(const NumericVector & xvals, tbb::concurrent_vector< std::pair<double, double> > &output) :
xvals(xvals), output(output) {}
void operator()(std::size_t begin, size_t end) {
for(std::size_t i=begin; i < end; i++) {
double y = xvals[i] * (xvals[i] - 1);
if(y < 0) {
output.push_back( std::pair<double, double>(xvals[i], y) );
}
}
}
};
// [[Rcpp::export]]
List find_values(NumericVector xvals) {
tbb::concurrent_vector< std::pair<double, double> > output;
Example ex(xvals,output);
parallelFor(0, xvals.size(), ex);
NumericVector xout(output.size());
NumericVector yout(output.size());
for(int i=0; i<output.size(); i++) {
xout[i] = output[i].first;
yout[i] = output[i].second;
}
List L = List::create(xout, yout);
return(L);
}
Output:
> find_values(seq(-10,10, by=0.5))
[[1]]
[1] 0.5
[[2]]
[1] -0.25

Import list to Rcpp

I have a list (with list inside) in R and need import to C using Rcpp.
# R
MainList <- list()
MainList$myint <- 2
MainList$mylist <- list(matrix(1,2,2), matrix(2,2,2))
MainList
My goal is import the list in R (MainList$mylist in example) and copy to one 3D array in C.
I tried this:
// Rcpp
// [[Rcpp::export]]
List MyFunction (List MainList){
int N = as<int>(MainList["myint"]);
List mylistRcpp = as<List>(MainList["mylist"]); // It this work? Apparently no
double*** mylistC; // already with allocate memory
for (int h=0; h<N; h++){
NumericMatrix temp = mylistRcpp[h];
for (int i=0; i<N; i++){
for (int n=0; n<N; n++){
mylistC[h][i][n] = temp(i, n);
}
}
}
return List::create(Named("1") = N,
Named("2") = N);
}
I can import the list this way? There are some easy way to copy without copy one by one? I need the 3D array for another function. I not sure about how import the list from R to Rcpp.
Yes you can insert an Rcpp::List inside an Rcpp::List and recurse as deep as you want. No need for double** and other gymnastics.
Code
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::List ListExample() {
std::string abc = "def";
double tol = 0.001;
Rcpp::List l = Rcpp::List::create(Rcpp::Named("method", abc),
Rcpp::Named("tolerance", tol));
Rcpp::List ll = Rcpp::List::create(Rcpp::Named("method", abc),
Rcpp::Named("tolerance", tol),
Rcpp::Named("list", l));
return ll;
}
/*** R
ListExample()
*/
Demo
R> sourceCpp("/tmp/soQ.cpp")
R> ListExample()
$method
[1] "def"
$tolerance
[1] 0.001
$list
$list$method
[1] "def"
$list$tolerance
[1] 0.001
R>
As you can see we have a list inside a list.

Segment fault when using Rcpp/Armadillo and openMP prarallel with user-defined function

I was trying to use rcpp/armadillo with openmp to speed up a loop in R. The loop takes a matrix with each row containing indices of a location vector(or matrix if it's 2D locations) as input(and other matrix/vec to be used). Inside the loop, I extracted each row of input indices matrix and find the corresponding locations, calculate distance matrix, and covariance matrix, do cholesky and backsolve, save the backsolve results to a new matrix. Here is the rcpp code:
`#include <iostream>
#include <RcppArmadillo.h>
#include <omp.h>
#include <Rcpp.h>
// [[Rcpp::plugins(openmp)]]
using namespace Rcpp;
using namespace arma;
using namespace std;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
mat NZentries_new2 (int m, int nnp, const mat& locs, const umat& revNNarray, const mat& revCondOnLatent, const vec& nuggets, const vec covparms){
// initialized the output matrix
mat Lentries=zeros(nnp,m+1);
// initialized objects in parallel part
int n0; //number of !is_na elements
uvec inds;//
vec revCon_row;//
uvec inds00;//
vec nug;//
mat covmat;//
vec onevec;//
vec M;//
mat dist;//
int k;//
omp_set_num_threads(2);// selects the number of cores to use.
#pragma omp parallel for shared(locs,revNNarray,revCondOnLatent,nuggets,nnp,m,Lentries) private(k,M,dist,onevec,covmat,nug,n0,inds,revCon_row,inds00) default(none) schedule(static)
for (k = 0; k < nnp; k++) {
// extract a row to work with
inds=revNNarray.row(k).t();
revCon_row=revCondOnLatent.row(k).t();
if (k < m){
n0=k+1;
} else {
n0=m+1;
}
// extract locations
inds00=inds(span(m+1-n0,m))-ones<uvec>(n0);
nug=nuggets.elem(inds00) % (ones(n0)-revCon_row(span(m+1-n0,m))); // vec is vec, cannot convert to mat
dist=calcPWD2(locs.rows(inds00));
#pragma omp critical
{
//calculate covariance matrix
covmat= MaternFun(dist,covparms) + diagmat(nug) ; // summation from arma
}
// get last row of inverse Cholesky
onevec = zeros(n0);
onevec[n0-1] = 1;
M=solve(chol(covmat,"upper"),onevec);
// save the entries to matrix
Lentries(k,span(0,n0-1)) = M.t();
}
return Lentries;
}`
The current version works fine but speed is slow(almost the same as no parallel version), if I take the line in omp critical bracket out, it cause segment fault and R will be crashed. This MaterFun is a function I defined as below with several other small functions. So my question is that why MaternFun has to stay in the critical part.
// [[Rcpp::export]]
mat MaternFun( mat distmat, vec covparms ){
int d1 = distmat.n_rows;
int d2 = distmat.n_cols;
int j1;
int j2;
mat covmat(d1,d2);
double scaledist;
double normcon = covparms(0)/(pow(2.0,covparms(2)-1)*Rf_gammafn(covparms(2)));
for (j1 = 0; j1 < d1; j1++){
for (j2 = 0; j2 < d2; j2++){
if ( distmat(j1,j2) == 0 ){
covmat(j1,j2) = covparms(0);
} else {
scaledist = distmat(j1,j2)/covparms(1);
covmat(j1,j2) = normcon*pow( scaledist, covparms(2) )*
Rf_bessel_k(scaledist,covparms(2),1.0);
}
}
}
return covmat;
}
// [[Rcpp::export]]
double dist2(double lat1,double long1,double lat2,double long2) {
double dist = sqrt(pow(lat1 - lat2, 2) + pow(long1 - long2, 2)) ;
return (dist) ;
}
// [[Rcpp::export]]
mat calcPWD2( mat x) {//Rcpp::NumericMatrix
int outrows = x.n_rows ;
int outcols = x.n_rows ;
mat out(outrows, outcols) ;
for (int arow = 0 ; arow < outrows ; arow++) {
for (int acol = 0 ; acol < outcols ; acol++) {
out(arow, acol) = dist2(x(arow, 0),x(arow, 1),
x(acol, 0),x(acol, 1)) ; //extract element from mat
}
}
return (out) ;
}
Here is some sample inputs for testing the MaterFun in R:
library(fields)
distmat=rdist(1:5) # distance matrix
covparms=c(1,0.2,1.5)
The issue is there are two calls to R math functions (Rf_bessel_k and Rf_gammafn) that require the access to be single threaded instead of parallel.
To get around this, let's add a dependency on boost via BH to obtain the cyl_bessel_k and tgamma functions. Alternatively, there is always the option of reimplementing R's besselK and gamma in C++ so it doesn't use the single-threaded R variant.
This gives:
#include <Rcpp.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>
// [[Rcpp::depends(BH)]]
// [[Rcpp::export]]
double besselK_boost(double x, double v) {
return boost::math::cyl_bessel_k(v, x);
}
// [[Rcpp::export]]
double gamma_fn_boost(double x) {
return boost::math::tgamma(x);
}
Test Code
x0 = 9.536743e-07
nu = -10
all.equal(besselK(x0, nu), besselK_boost(x0, nu))
# [1] TRUE
x = 2
all.equal(gamma(x), gamma_fn_boost(x))
# [1] TRUE
Note: The order of parameters for boost's variant differs from R's:
cyl_bessel_k(v, x)
Rf_bessel_k(x, v, expon.scaled = FALSE)
From here, we can modify the MaternFun. Unfortunately, because calcPWD2 is missing, the furthest we can go is switching to use boost and incorporating in OpenMP protections.
#include <RcppArmadillo.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>
#ifdef _OPENMP
#include <omp.h>
#endif
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::export]]
arma::mat MaternFun(arma::mat distmat, arma::vec covparms) {
int d1 = distmat.n_rows;
int d2 = distmat.n_cols;
int j1;
int j2;
arma::mat covmat(d1,d2);
double scaledist;
double normcon = covparms(0) /
(pow(2.0, covparms(2) - 1) * boost::math::tgamma(covparms(2)));
for (j1 = 0; j1 < d1; ++j1){
for (j2 = 0; j2 < d2; ++j2){
if ( distmat(j1, j2) == 0 ){
covmat(j1, j2) = covparms(0);
} else {
scaledist = distmat(j1, j2)/covparms(1);
covmat(j1, j2) = normcon * pow( scaledist, covparms(2) ) *
boost::math::cyl_bessel_k(covparms(2), scaledist);
}
}
}
return covmat;
}

Vectorized Method instead of using for loop

I have a dataframe 'tmp' on which I need to do perform calculation using the last row of another dataframe 'SpreadData'. I am using following code:
for(i in 1:ncol(tmp)){for(j in 1:nrow(tmp)){PNLData[j,i] = 10*tmp[j,i]*SpreadData[nrow(SpreadData),i]}}
Is there any faster method using mapply or something else so that I need not to use for loop.
Thanks
You can use sweep():
PNLData <- sweep(10 * tmp, 2, SpreadData[nrow(SpreadData), ], "*")
PS1: you can replace SpreadData[nrow(SpreadData), ] by tail(SpreadData, 1).
PS2: I think this makes two copies of your data. If you have a large matrix, you'd better use Rcpp.
Edit: Rcpp solution: put that an .cpp file and source it.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix rcppFun(const NumericMatrix& x,
const NumericVector& lastCol) {
int n = x.nrow();
int m = x.ncol();
NumericMatrix res(n, m);
int i, j;
for (j = 0; j < m; j++) {
for (i = 0; i < n; i++) {
res(i, j) = 10 * x(i, j) * lastCol[j];
}
}
return res;
}
And do in R PNLData <- rcppFun(tmp, SpreadData[nrow(SpreadData), ]).

How to add multiple classes to an object in Rcpp?

I have tried this:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector difflag(NumericVector x, int lag) {
int n = x.size();
NumericVector out(n-lag);
for(int i=0; i<(n-lag); i++) {
out[i] = x[i+lag] - x[i];
}
out.attr("class") += "myclass";
return out;
}
It gives me an error:
all.cpp: In function ‘Rcpp::NumericVector difflag(Rcpp::NumericVector, int)’:
all.cpp:64:26: error: no match for ‘operator+=’ in ‘Rcpp::RObject::attr(const string&) const((* & std::basic_string<char>(((const char*)"class"), (*(const std::allocator<char>*)(& std::allocator<char>()))))) += "myclass"’
make: *** [all.o] Error 1
Perhaps something like :
CharacterVector classes = out.attr( "class" ) ;
classes.push_back( "myclass" ) ;
out.attr( "class" ) = classes ;
There could be room for a generic append function.

Resources