I am a newbie to Rcpp and C++. I am trying to convert the following R code into Rcpp.
library (compiler)
robzscore<-cmpfun(function(x) {
byec <- sum(!is.na(x))
byer <- rank(x, na.last="keep", ties.method="average") - 0.5
as.data.frame(suppressWarnings(qnorm(byer/byec)), row.names=NULL)
})
I am struggling with the syntax for the part where I need to get the ranks. For example, this is what I wrote (in a separate cpp file that I compile with sourceCpp), based on other code I found on SO, as an equivalent of R's rank(x, ...) function, assuming there are no NAs and not handling ties:
#include <Rcpp.h>
#include <algorithm>
#include <iostream>
using namespace Rcpp;
template <typename T>
std::vector<size_t> sort_indexes(const std::vector<T> &v) {
// initialize original index locations
std::vector<size_t> idx(v.size());
for (size_t i=0; i!=idx.size();++i) idx[i]=i;
// sort indexes based on comparing values in v
std::sort(idx.begin(),idx.end(),[&v](size_t i1, size_t i2) {return v[i1]<v[i2];});
// return the values
return idx;
}
// [[Rcpp::export]]
NumericVector do_rank(NumericVector x) {
std::vector<float> y=as<std::vector<float> >(x);
return wrap(sort_indexes(y));
}
The errors I get are:
lambda expressions only available with -std=c++0x or -std=gnu++0x [enabled by default] - followed by - no matching function for call to 'sort(std::vector<long long unsigned int>::iterator, std::vector<long long unsigned int>::iterator, sort_indexes(const std::vector<T> &) [ with T=float]::<lambda(long long unsigned int, long long unsigned int)>)' at the place where my code says std::sort(idx.begin(),...).
~/R/win-library/3.0/Rcpp/include/Rcpp/internal/wrap.h : invalid conversion from 'long long unsigned int' to 'SEXP' [-fpermissive].
I suspect the main issue is an error in the syntax I used to convert between Rcpp and C++ data structures.
Can someone help me interpret the errors and/or what could be the right way?
Thanks,
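For what it's worth, here is a minimal sketch of the ranking step in plain standard C++ (no Rcpp types), assuming NAs arrive as NaN and should stay NaN, as with na.last="keep", and that ties get the average rank. The name average_ranks is made up for illustration. The lambda needs C++11, which is what the first error message is complaining about. Returning double instead of size_t is intended to sidestep the wrap() conversion complaint in the last message, though that reading of the error is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: 1-based ranks, ties averaged, NaN inputs stay NaN.
std::vector<double> average_ranks(const std::vector<double>& v) {
    // Collect the positions of the non-missing values.
    std::vector<std::size_t> idx;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (!std::isnan(v[i])) idx.push_back(i);

    // Sort the positions by the values they point at (C++11 lambda).
    std::sort(idx.begin(), idx.end(),
              [&v](std::size_t a, std::size_t b) { return v[a] < v[b]; });

    std::vector<double> ranks(v.size(),
                              std::numeric_limits<double>::quiet_NaN());
    std::size_t i = 0;
    while (i < idx.size()) {
        // Find the run [i, j) of equal values.
        std::size_t j = i + 1;
        while (j < idx.size() && v[idx[j]] == v[idx[i]]) ++j;
        // The run occupies ranks i+1 .. j; every member gets their mean.
        double avg = (static_cast<double>(i + 1) + static_cast<double>(j)) / 2.0;
        for (std::size_t k = i; k < j; ++k) ranks[idx[k]] = avg;
        i = j;
    }
    return ranks;
}
```

From there the robust z-score is qnorm((rank - 0.5) / n), with n the number of non-missing values, exactly as in the R version.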
Suppose I create a large Tensor in Eigen and would like to return it to R as a multi-dimensional array. I know how to do it with a data copy, as below. Question: is it possible to do it without the data-copy step?
#include <Rcpp.h>
#include <RcppEigen.h>
#include <unsupported/Eigen/CXX11/Tensor>
// [[Rcpp::depends(RcppEigen)]]
using namespace Rcpp;
template <typename T>
NumericVector copyFromTensor(const T& x)
{
int n = x.size();
NumericVector ans(n);
IntegerVector dim(x.NumDimensions);
for (int i = 0; i < x.NumDimensions; ++i) {
dim[i] = x.dimension(i);
}
memcpy((double*)ans.begin(), (double*)x.data(), n * sizeof(double));
ans.attr("dim") = dim;
return ans;
}
// [[Rcpp::export]]
NumericVector getTensor() {
Eigen::Tensor<double, 3> x(4, 3, 1);
x.setRandom();
return copyFromTensor(x);
}
/*** R
getTensor()
*/
As a general rule you can go zero-copy on the way into your C++ code, with data coming from R and already managed by R.
On the way out of your C++ code, with data returning to R, anything that was not created using the R allocator has to be copied.
Here your object x is stack-allocated, so you need a copy. See Writing R Extensions about the R allocator; Eigen may let you use it when you create a new Tensor object. Not a trivial step. I think I would just live with the copy.
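To make the "allocate with R first, then fill in place" idea concrete, here is a sketch in plain C++ under some loud assumptions: std::vector stands in for the R-allocated NumericVector, and View3 is a made-up non-owning view playing the role that a mapping class (Eigen has TensorMap for this kind of thing, but check its documentation) would play over that buffer. The point is only that when the buffer is created first and written through a view, there is nothing left to copy on return.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-owning 3-d view over a caller-supplied buffer,
// using column-major layout (the layout R arrays use).
struct View3 {
    double* data;
    std::size_t d0, d1, d2;

    double& operator()(std::size_t i, std::size_t j, std::size_t k) {
        return data[i + d0 * (j + d1 * k)];
    }
};

// Writes straight into the caller's buffer; no second allocation is made.
void fill_in_place(View3 t) {
    for (std::size_t k = 0; k < t.d2; ++k)
        for (std::size_t j = 0; j < t.d1; ++j)
            for (std::size_t i = 0; i < t.d0; ++i)
                t(i, j, k) = 100.0 * i + 10.0 * j + 1.0 * k;
}

// Stand-in for "allocate via R, then compute into it".
std::vector<double> make_filled(std::size_t d0, std::size_t d1, std::size_t d2) {
    std::vector<double> buf(d0 * d1 * d2);  // would be the NumericVector
    View3 view{buf.data(), d0, d1, d2};
    fill_in_place(view);
    return buf;
}
```

In an Rcpp setting the dim attribute would still need to be set on the vector, as in the question's copyFromTensor.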
My code is the following
#include <RcppArmadillo.h>
#include <Rcpp.h>
using namespace std;
using namespace Rcpp;
using namespace arma;
//RNGScope scope;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat hh(arma::mat Z, int n, int m){
if(Z.size()==0){
Z = arma::randu<mat>(n,m); # if matrix Z is null, then generate random numbers to fill in it
return Z;
}else{
return Z;
}
}
Error reported:
conflicting declaration of C function 'SEXPREC* sourceCpp_1_hh(SEXP, SEXP, SEXP)'
Do you have any idea about this question?
Thank you in advance!
Let's slow down and clean up, following other examples:
Never ever include both Rcpp.h and RcppArmadillo.h. It errors. And RcppArmadillo.h pulls in Rcpp.h for you, and at the right time. (This matters for the generated code.)
No need to mess with RNGScope unless you really know what you are doing.
I recommend against flattening namespaces.
For reasons discussed elsewhere at length, you probably want R's RNGs.
The code doesn't compile as posted: C++ uses // for comments, not #.
The code doesn't compile as posted: Armadillo uses different matrix creation.
The code doesn't run as intended as size() is not what you want there. We also do not let a 'zero element' matrix in---maybe a constraint on our end.
That said, once repaired, we now get correct behavior for a slightly changed spec:
Output
R> Rcpp::sourceCpp("~/git/stackoverflow/63984142/answer.cpp")
R> hh(2, 2)
[,1] [,2]
[1,] 0.359028 0.775823
[2,] 0.645632 0.563647
R>
Code
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat hh(int n, int m) {
arma::mat Z = arma::mat(n,m,arma::fill::randu);
return Z;
}
/*** R
hh(2, 2)
*/
I commonly work with a short Rcpp function that takes as input a matrix where each row contains K probabilities that sum to 1. The function then randomly samples for each row an integer between 1 and K corresponding to the provided probabilities. This is the function:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadilloExtensions/sample.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
int n = x.nrow();
IntegerVector result(n);
for ( int i = 0; i < n; ++i ) {
result[i] = RcppArmadillo::sample(choice_set, 1, false, x(i, _))[0];
}
return result;
}
I recently updated R and all packages. Now I cannot compile this function anymore. The reason is not clear to me. Running
library(Rcpp)
library(RcppArmadillo)
Rcpp::sourceCpp("sample_matrix.cpp")
throws the following error:
error: call of overloaded 'sample(Rcpp::IntegerVector&, int, bool, Rcpp::Matrix<14>::Row)' is ambiguous
This basically tells me that my call to RcppArmadillo::sample() is ambiguous. Can anyone enlighten me as to why this is the case?
There are two things happening here, and two parts to your problem and hence the answer.
The first is "meta": why now? Well, we had a bug left in the sample() code / setup, which Christian kindly fixed for the most recent RcppArmadillo release (and it is all documented there). In short, the interface for the very probability argument giving you trouble here was changed, as it was not safe for re-use / repeated use. It is now.
Second, the error message. You didn't say what compiler or version you use, but mine (currently g++-9.3) is actually pretty helpful with the error. It is still C++, so some interpretative dance is needed, but in essence it is clearly stating that you called with Rcpp::Matrix<14>::Row and no interface is provided for that type. Which is correct. sample() offers a few interfaces, but none for a Row object. So the fix is, once again, simple. Add a line to aid the compiler by making the row a NumericVector and all is good.
Fixed code
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
int n = x.nrow();
IntegerVector result(n);
for ( int i = 0; i < n; ++i ) {
Rcpp::NumericVector z(x(i, _));
result[i] = RcppArmadillo::sample(choice_set, 1, false, z)[0];
}
return result;
}
Example
R> Rcpp::sourceCpp("answer.cpp") # no need for library(Rcpp)
R>
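For readers wondering how an "ambiguous" call like the one above can arise at all, here is a plain C++ sketch of one common mechanism: a proxy type that converts equally well to two different parameter types. Whether this is the exact mechanism inside sample.h is an assumption; the types NumVec, ArmaVec and RowProxy and the function pick are made up for illustration. The fix has the same shape as in the answer: materialise the proxy as one concrete type before the call.

```cpp
#include <vector>

// Two unrelated vector-like types, standing in for the two interfaces.
struct NumVec  { std::vector<double> v; };
struct ArmaVec { std::vector<double> v; };

// A proxy convertible to either type, like a matrix row view.
struct RowProxy {
    std::vector<double> v;
    operator NumVec()  const { return NumVec{v}; }
    operator ArmaVec() const { return ArmaVec{v}; }
};

int pick(const NumVec&)  { return 1; }
int pick(const ArmaVec&) { return 2; }

int call_with_row(const RowProxy& r) {
    // pick(r);    // would not compile: both overloads need one
    //             // user-defined conversion, so neither is better
    NumVec z = r;  // choose the concrete type explicitly
    return pick(z);
}
```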
Is there a way to call an R script from C++?
I have an R script, for example:
myScript.R:
runif(100)
I want to execute this script from C++ and pass the result.
I tried:
#include <Rcpp.h>
#include <iostream>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector loadFile(CharacterVector inFile){
NumericVector one = system(inFile);
return one;
}
inFile : "C:/Program Files/R/R-3.4.2/bin/x64/Rscript C:/Rscripts/myScript.R"
but it gives me :
cannot convert Rcpp::CharacterVector {aka Rcpp::Vector<16>} to const char* for argument 1 to int system(const char*)
The usual convention is the other way around: use Rcpp to write the expensive parts in C++ and call them from R.
If you would like to invoke an R script from within C++ and then work with the result, one approach would be to make use of the popen function.
To see how to convert an Rcpp character vector element to std::string, see Converting element of 'const Rcpp::CharacterVector&' to 'std::string'
Example:
// popen takes a const char*, so pass c_str(); the space in the path may need quoting
std::string r_execute = "C:/Program Files/R/R-3.4.2/bin/x64/Rscript C:/Rscripts/myScript.R";
FILE *fp = popen(r_execute.c_str(), "r");
You should be able to read the result of the operation from the file stream.
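A slightly fuller sketch of reading the stream, assuming a POSIX system where popen and pclose are declared in <stdio.h> (Windows toolchains typically spell them _popen and _pclose). The function name run_and_capture is made up for illustration; it returns whatever the command printed to standard output as one string, which you would then parse into numbers yourself.

```cpp
#include <stdio.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: run a shell command and collect its stdout.
std::string run_and_capture(const std::string& cmd) {
    FILE* fp = popen(cmd.c_str(), "r");
    if (!fp) throw std::runtime_error("popen failed");

    std::string out;
    char buf[256];
    // fgets returns a null pointer at end of stream.
    while (fgets(buf, sizeof buf, fp) != nullptr) out += buf;

    pclose(fp);
    return out;
}
```

With the question's setup the argument would be the Rscript command line. Note that Rscript prints R's console representation of the result (index prefixes like [1] included), so having the script write plain numbers with cat() is probably easier to parse.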
I'm writing a package with some functions calling RcppArmadillo::sample from RcppArmadillo.
However I met the following error when compiling.
In file included from Citrus.cpp:2:
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/RcppArmadilloExtensions/sample.h: In function ‘T Rcpp::RcppArmadillo::sample(const T&, int, bool, Rcpp::NumericVector) [with T = arma::subview_col]’:
Citrus.cpp:241: instantiated from here
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/RcppArmadilloExtensions/sample.h:45: error: ‘const struct arma::subview_col’ has no member named ‘size’
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/RcppArmadilloExtensions/sample.h:48: error: no matching function for call to ‘arma::subview_col::subview_col(const int&)’
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/armadillo_bits/subview_bones.hpp:236: note: candidates are: arma::subview_col::subview_col() [with eT = double]
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/armadillo_bits/subview_meat.hpp:2608: note: arma::subview_col::subview_col(const arma::Mat&, arma::uword, arma::uword, arma::uword) [with eT = double]
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/armadillo_bits/subview_meat.hpp:2597: note: arma::subview_col::subview_col(const arma::Mat&, arma::uword) [with eT = double]
./R/x86_64-unknown-linux-gnu-library/3.0/RcppArmadillo/include/armadillo_bits/forward_bones.hpp:29: note: arma::subview_col::subview_col(const arma::subview_col&)
make: *** [Citrus.o] Error 1
The RcppArmadillo I'm using is 0.7.700.0.0.
The same error appeared on both Linux and OSX. When compiling using RStudio, the error message is as follows:
no member named 'size' in 'arma::subview_col<double>'.
no matching constructor for initialization of 'arma::subview_col<double>'
I used RcppArmadillo::sample in my previous work a lot. It suddenly doesn't work. I appreciate any help.
This feature works on pre-subset data held in either an arma::vec or a NumericVector. It always has and always will. Do not use it with an intermediary view obtained from a subset operation (e.g. .col(), .cols(), or .submat()).
The issue you are running into is that you subset the data within the call to sample(). (You've omitted the code needed to diagnose this part, so I'm speculating here.) Since sample() needs to work with both Rcpp and Armadillo data types, it never called Armadillo-specific size member functions. Instead, the library opted to call the STL-style .size() member function, which Armadillo also supports, since that is shared between both kinds of object. However, Armadillo implements that member function only for the "active" data structures and not for temporaries. As a result, .size() is not implemented for subview_col, and we end up with this error:
error: ‘const struct arma::subview_col’ has no member named ‘size’
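The mechanism can be shown in a few lines of plain C++, with made-up stand-in types: a template that relies on .size() compiles for any type providing it and fails for a view type that does not. ColumnView, count_elems and materialise are hypothetical names, not Armadillo's. Copying the view into a real container first is the simplest way out, at the cost of a copy.

```cpp
#include <cstddef>
#include <vector>

// Generic code that only assumes an STL-style size() member.
template <typename T>
std::size_t count_elems(const T& x) { return x.size(); }

// A lightweight view with no size() member, like a temporary subview.
struct ColumnView {
    const double* ptr;
    std::size_t n_elem;
};

// Copy the viewed elements into a concrete container.
std::vector<double> materialise(const ColumnView& c) {
    return std::vector<double>(c.ptr, c.ptr + c.n_elem);
}

std::size_t count_view(const ColumnView& c) {
    // count_elems(c);  // would not compile: ColumnView has no size()
    return count_elems(materialise(c));
}
```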
To get around this limitation and save memory, use the advanced vec constructor, which reuses the existing memory and thus avoids creating an intermediary arma::subview_col.
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::export]]
void adv_rnd(int nrow, int ncol, bool replace = true){
// Create a matrix of given dimensions
arma::mat X(nrow, ncol);
X.randn();
// Show state before randomization
Rcpp::Rcout << "Before Randomization:" << std::endl << X << std::endl;
// Randomize each column
for(int i = 0; i < ncol; ++i){
arma::vec Y(X.colptr(i), nrow, false, true);
X.col(i) = Rcpp::RcppArmadillo::sample(Y, nrow, replace);
}
// Show state after randomization
Rcpp::Rcout << "After Randomization:" << std::endl << X << std::endl;
}
Sample output:
> adv_rnd(3,3)
Before Randomization:
-0.7197 1.2590 -0.5898
0.0253 0.1493 -0.0685
-0.6074 1.3843 0.0400
After Randomization:
-0.7197 1.2590 0.0400
-0.6074 1.2590 -0.5898
-0.6074 0.1493 -0.0685