Is there an Rcpp Modules "internal" (or, say, correct) way to export overloaded methods?
The Rcpp-modules vignette still has a "TODO" on providing a good example (in section 2.2.5 it says "TODO: mention overloading, need good example.").
I can export my overloaded methods following this solution from Romain François; however, issues can occur with the provided solutions. E.g.:
library(Rcpp)
# define example class in C++
sourceCpp(code = paste0('
#include<Rcpp.h>
class Test {
private:
int a;
public:
Test(): a{0} {};
Test(int x): a{x} {};
int foo();
int foo(int x);
};
int Test::foo() {
return a;
}
int Test::foo(int x) {
return a + x;
}
RCPP_MODULE(rawdata_module) {',
# solution 1 to handle member function overloading provided in
# https://lists.r-forge.r-project.org/pipermail/rcpp-devel/2010-November/001326.html
' int (Test::*foo1)(int x) = &Test::foo;
int (Test::*foo0)() = &Test::foo;
Rcpp::class_<Test>( "Test" )
.constructor()
.constructor<int>()',
# works: foo with 1 argument before 0 arguments!
' .method("foo", foo1)
.method("foo", foo0)',
# solution 2 (also working)
#works: foo with 1 argument before 0 arguments!
' //.method("foo", ( int (Test::*)(int) )(&Test::foo) )
//.method("foo", ( int (Test::*)() )(&Test::foo) )
;
}
'))
# create new object in R
obj0 <- Test$new()
obj1 <- Test$new(5L)
# test overloading
obj0$foo()
obj0$foo(3L)
obj1$foo()
obj1$foo(3L)
For both solutions, if we export the zero-argument method before the one-argument method (foo0 before foo1 in solution 1), the code compiles fine and both calls can be made from R; however, only the zero-argument foo is ever called.
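For reference, the two disambiguation techniques used in the module code above can be tried in plain C++, outside of Rcpp. The following is only a minimal sketch: the aliases Foo0 and Foo1 and the helper functions are hypothetical names chosen for illustration, and static_cast stands in for the C-style cast of solution 2. It shows that both techniques select the intended overload at the C++ level; it says nothing about how Rcpp Modules dispatches between registered methods.

```cpp
#include <cassert>

// Plain C++ stand-in for the Test class from the question (no Rcpp needed).
class Test {
    int a;
public:
    Test() : a{0} {}
    explicit Test(int x) : a{x} {}
    int foo() { return a; }
    int foo(int x) { return a + x; }
};

// Pointer-to-member types, one per overload (hypothetical alias names).
using Foo0 = int (Test::*)();
using Foo1 = int (Test::*)(int);

// Technique 1: initializing a typed variable selects the overload.
inline Foo0 pick_foo0() {
    Foo0 p = &Test::foo;
    return p;
}

// Technique 2: an explicit cast names the target signature.
inline Foo1 pick_foo1() {
    return static_cast<Foo1>(&Test::foo);
}

// Helpers that call the selected overload on an object.
inline int call_foo0(Test& obj) { return (obj.*pick_foo0())(); }
inline int call_foo1(Test& obj, int x) { return (obj.*pick_foo1())(x); }

// Small self-check mirroring the R calls obj1$foo() and obj1$foo(3L).
inline void self_check() {
    Test obj1(5);
    assert(call_foo0(obj1) == 5);
    assert(call_foo1(obj1, 3) == 8);
}
```

Since both pointers resolve correctly here, the order dependence described above would have to come from the dispatch on the R side rather than from the C++ overload selection.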
Thx & please be patient with me. I'd call myself rather inexperienced in C++...
Compiler info:
gcc (Debian 10.2.1-6) 10.2.1 20210110
R session info:
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Progress Linux 6.99 (fuchur-backports)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rcpp_1.0.8.3 colorout_1.2-2
loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0 codetools_0.2-18 RhpcBLASctl_0.21-247.1
Related
This question is related to this old question and this old question.
R has the nice wrapper-ish function anyNA for quicker evaluation of any(is.na(x)). When working in Rcpp a similar minimal implementation could be given by:
// CharacterVector example
#include <Rcpp.h>
using namespace Rcpp;
template<typename T, typename S>
bool any_na(S x){
T xx = as<T>(x);
for(auto i : xx){
if(T::is_na(i))
return true;
}
return false;
}
// [[Rcpp::export(rng = false)]]
LogicalVector any_na(SEXP x){
return any_na<CharacterVector>(x);
}
// [[Rcpp::export(rng = false)]]
SEXP overhead(SEXP x){
CharacterVector xx = as<CharacterVector>(x);
return wrap(xx);
}
/***R
library(microbenchmark)
vec <- sample(letters, 1e6, TRUE)
vec[1e6] <- NA_character_
any_na(vec)
# [1] TRUE
*/
But comparing the performance of this to anyNA, I was surprised by the benchmark below:
library(microbenchmark)
microbenchmark(
Rcpp = any_na(vec),
R = anyNA(vec),
overhead = overhead(vec),
unit = "ms"
)
Unit: milliseconds
expr min lq mean median uq max neval cld
Rcpp 2.647901 2.8059500 3.243573 3.0435010 3.675051 5.899100 100 c
R 0.800300 0.8151005 0.952301 0.8577015 0.961201 3.467402 100 b
overhead 0.001300 0.0029010 0.011388 0.0122510 0.015751 0.048401 100 a
where the last line is the "overhead" incurred from converting back and forth from SEXP to CharacterVector (which turns out to be negligible). As is immediately evident, the Rcpp version is roughly 3.5 times slower than the R version. I was curious, so I checked the source for Rcpp's is_na. Finding no obvious reason for the slow performance, I went on to check the source of anyNA for R's own character vectors and reimplemented the function using R's C API, hoping to speed this up:
// Added after SEXP overhead(SEXP x){ --- }
inline bool anyNA2(SEXP x){
R_xlen_t n = Rf_length(x);
for(R_xlen_t i = 0; i < n; i++){
if(STRING_ELT(x, i) == NA_STRING)
return true;
}
return false;
}
// [[Rcpp::export(rng = false)]]
SEXP any_na2(SEXP x){
bool xx = anyNA2(x);
return wrap(xx);
}
// [[Rcpp::export(rng = false)]]
SEXP any_na3(SEXP x){
Function anyNA("anyNA");
return anyNA(x);
}
/***R
microbenchmark(
Rcpp = any_na(vec),
R = anyNA(vec),
R_C_api = any_na2(vec),
Rcpp_Function = any_na3(vec),
overhead = overhead(vec),
unit = "ms"
)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# Rcpp 2.654901 2.8650515 3.54936501 3.2392510 3.997901 8.074201 100 d
# R 0.803701 0.8303015 1.01017200 0.9400015 1.061751 2.019902 100 b
# R_C_api 2.336402 2.4536510 3.01576302 2.7220010 3.314951 6.905101 100 c
# Rcpp_Function 0.844001 0.8862510 1.09259990 0.9597505 1.120701 3.011801 100 b
# overhead 0.001500 0.0071005 0.01459391 0.0146510 0.017651 0.101401 100 a
*/
Note that I've included a simple wrapper calling anyNA through Rcpp::Function as well. Once again, this implementation of anyNA is not just a little but a lot slower than the base implementation.
So the question becomes twofold:
1. Why is the Rcpp version so much slower?
2. Derived from 1: How could this be "changed" to speed up the code?
The questions are not very interesting in themselves, but they become interesting if this affects multiple parts of Rcpp implementations, which could in aggregate gain a significant performance boost.
Session info:
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_Denmark.1252 LC_CTYPE=English_Denmark.1252 LC_MONETARY=English_Denmark.1252 LC_NUMERIC=C LC_TIME=English_Denmark.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microbenchmark_1.4-7 cmdline.arguments_0.0.1 glue_1.4.2 R6_2.5.0 Rcpp_1.0.6
loaded via a namespace (and not attached):
[1] codetools_0.2-18 lattice_0.20-41 mvtnorm_1.1-1 zoo_1.8-8 MASS_7.3-53 grid_4.0.3 multcomp_1.4-15 Matrix_1.2-18 sandwich_3.0-0 splines_4.0.3
[11] TH.data_1.0-10 tools_4.0.3 survival_3.2-7 compiler_4.0.3
Edit (Not only a windows problem):
I wanted to make sure this is not a "Windows problem", so I ran the code in a Docker container running Linux. The result is shown below and is very similar:
# Unit: milliseconds
# expr min lq mean median uq max neval
# Rcpp 2.3399 2.62155 4.093380 3.12495 3.92155 26.2088 100
# R 0.7635 0.84415 1.459659 1.10350 1.42145 12.1148 100
# R_C_api 2.3358 2.56500 3.833955 3.11075 3.65925 14.2267 100
# Rcpp_Function 0.8163 0.96595 1.574403 1.27335 1.56730 11.9240 100
# overhead 0.0009 0.00530 0.013330 0.01195 0.01660 0.0824 100
Session info:
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS
Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblasp-r0.3.8.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microbenchmark_1.4-7 Rcpp_1.0.5
loaded via a namespace (and not attached):
[1] compiler_4.0.2 tools_4.0.2
This is an interesting question, but the answer is pretty simple: there are two versions of STRING_ELT. One, in Rinlinedfuns.h, is used internally by R (or if you set the USE_RINTERNALS macro); the other, in memory.c, is for plebs.
Comparing the two versions, you can see that the pleb version has more checks, which fully accounts for the difference in speed.
If you really want speed and don't care about safety, you can usually beat R by at least a little bit.
// [[Rcpp::export(rng = false)]]
bool any_na_unsafe(SEXP x) {
SEXP* ptr = STRING_PTR(x);
R_xlen_t n = Rf_xlength(x);
for(R_xlen_t i=0; i<n; ++i) {
if(ptr[i] == NA_STRING) return true;
}
return false;
}
Bench:
> microbenchmark(
+ R = anyNA(vec),
+ R_C_api = any_na2(vec),
+ unsafe = any_na_unsafe(vec),
+ unit = "ms"
+ )
Unit: milliseconds
expr min lq mean median uq max neval
R 0.5058 0.52830 0.553696 0.54000 0.55465 0.7758 100
R_C_api 1.9990 2.05170 2.214136 2.06695 2.10220 12.2183 100
unsafe 0.3170 0.33135 0.369585 0.35270 0.37730 1.2856 100
As written this is unsafe, but adding a few checks before the loop would make it fine.
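The trade-off between a per-element check and one check before the loop can be illustrated without the R headers. The sketch below is only an analogy in plain C++: na_string(), checked_elt() and the two scanning functions are hypothetical stand-ins, with a unique sentinel address playing the role of NA_STRING and std::vector::at() playing the role of an accessor that validates every call. In the R setting, the up-front check would be something like verifying that the argument really is a character vector (TYPEOF(x) == STRSXP) before taking the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for NA_STRING: one unique object whose address
// marks a missing value, so NA is detected by pointer comparison.
static const char kNaMarker = '\0';
inline const char* na_string() { return &kNaMarker; }

// Accessor that validates on every call, loosely like a checked accessor.
inline const char* checked_elt(const std::vector<const char*>& x, std::size_t i) {
    return x.at(i);  // bounds-checked each time; throws std::out_of_range
}

// Variant 1: every element access goes through the checked accessor.
inline bool any_na_checked(const std::vector<const char*>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (checked_elt(x, i) == na_string()) return true;
    }
    return false;
}

// Variant 2: validate once, then scan through a raw pointer.
inline bool any_na_hoisted(const std::vector<const char*>& x) {
    if (x.empty()) return false;        // the single up-front check
    const char* const* ptr = x.data();  // fetched once, outside the loop
    assert(ptr != nullptr);
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (ptr[i] == na_string()) return true;
    }
    return false;
}
```

Both variants return the same answer; the second simply moves the validation out of the hot loop, which is the design choice the unsafe version above takes to its extreme.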
This question turns out to be a good example of why some people rail and rant against microbenchmarks.
Baseline is a built-in primitive
The function that is supposed to be beaten here is actually a primitive, which already makes it a little tricky:
> anyNA
function (x, recursive = FALSE) .Primitive("anyNA")
>
ALTREP puts a performance floor down
Next, a little experiment shows that the baseline function anyNA() never loops. We define a very short vector srt and a long vector lng, both containing an NA value. Turns out ... R is optimised via ALTREP, keeping a matching bit in the data structure headers, and the cost of checking is independent of length:
> srt <- c("A",NA_character_); lng <- c(rep("A", 1e6), NA_character_)
> microbenchmark(short=function(srt) { anyNA(srt) },
+ long=function(lng) { anyNA(lng) }, times=1000)
Unit: nanoseconds
expr min lq mean median uq max neval cld
short 48 50 69.324 51 53 5293 1000 a
long 48 50 92.166 51 52 15494 1000 a
>
Note the units here (nanoseconds) and the time spent. We are measuring a look at a single bit.
(Edit: Scrap that. Thinko of mine in a rush, see comments.)
Rcpp functions have some small overhead
This is not new, and it is documented. If you look at the code generated by Rcpp Attributes, which conveniently gives us an R function of the same name as the C++ function we designate, you see that at least one other function call is involved, plus a baked-in try/catch layer, RNG setting (here turned off) and so on. That cannot be zero, but if amortized against anything reasonable it neither matters nor shows up in measurements.
Here, however, the exercise was set up to match a primitive function looking at one bit. It's a race one cannot win. So here is my final table:
> microbenchmark(anyNA = anyNA(vec), Rcpp_plain = rcpp_c_api(vec),
+ Rcpp_tmpl = rcpp_any_na(vec), Rcpp_altrep = rcpp_altrep(vec),
+ times = .... [TRUNCATED]
Unit: microseconds
expr min lq mean median uq max neval cld
anyNA 643.993 658.43 827.773 700.729 819.78 6280.85 5000 a
Rcpp_plain 1916.188 1952.55 2168.708 2022.017 2191.64 8506.71 5000 d
Rcpp_tmpl 1709.380 1743.04 1933.043 1798.788 1947.83 8176.10 5000 c
Rcpp_altrep 1501.148 1533.88 1741.465 1590.572 1744.74 10584.93 5000 b
It contains the primitive R function; the original (templated) C++ function, which still looks pretty good; something using Rcpp (and its small overhead) with just C API use (plus the automatic wrappers in/out), which is a little slower; and then, for comparison, a function from Michel's checkmate package which does look at the ALTREP bit. And it is barely faster.
So really what we are looking at here is overhead from function calls getting in the way of measuring a micro-operation. So no, Rcpp cannot be made faster than a highly optimised primitive. The question looked interesting, but was, at the end of the day, somewhat ill-posed. Sometimes it is worth working through that.
My code version follows below.
// CharacterVector example
#include <Rcpp.h>
using namespace Rcpp;
template<typename T, typename S>
bool any_na(S x){
T xx = as<T>(x);
for (auto i : xx){
if (T::is_na(i))
return true;
}
return false;
}
// [[Rcpp::export(rng = false)]]
LogicalVector rcpp_any_na(SEXP x){
return any_na<CharacterVector>(x);
}
// [[Rcpp::export(rng = false)]]
SEXP overhead(SEXP x){
CharacterVector xx = as<CharacterVector>(x);
return wrap(xx);
}
// [[Rcpp::export(rng = false)]]
bool rcpp_c_api(SEXP x) {
R_xlen_t n = Rf_length(x);
for (R_xlen_t i = 0; i < n; i++) {
if(STRING_ELT(x, i) == NA_STRING)
return true;
}
return false;
}
// [[Rcpp::export(rng = false)]]
SEXP any_na3(SEXP x){
Function anyNA("anyNA");
return anyNA(x);
}
// courtesy of the checkmate package
// [[Rcpp::export(rng=false)]]
R_xlen_t rcpp_altrep(SEXP x) {
#if defined(R_VERSION) && R_VERSION >= R_Version(3, 5, 0)
if (STRING_NO_NA(x))
return 0;
#endif
const R_xlen_t nx = Rf_xlength(x);
for (R_xlen_t i = 0; i < nx; i++) {
if (STRING_ELT(x, i) == NA_STRING)
return i + 1;
}
return 0;
}
/***R
library(microbenchmark)
srt <- c("A",NA_character_)
lng <- c(rep("A", 1e6), NA_character_)
microbenchmark(short = function(srt) { anyNA(srt) },
long = function(lng) { anyNA(lng) },
times=1000)
N <- 1e6
vec <- sample(letters, N, TRUE)
vec[N] <- NA_character_
anyNA(vec) # to check
microbenchmark(
anyNA = anyNA(vec),
Rcpp_plain = rcpp_c_api(vec),
Rcpp_tmpl = rcpp_any_na(vec),
Rcpp_altrep = rcpp_altrep(vec),
#Rcpp_Function = any_na3(vec),
#overhead = overhead(vec),
times = 5000
# unit="relative"
)
*/
I made a simple Rcpp function to calculate all Pearson correlation coefficients that can be computed from all row combinations of an input matrix E. The results are stored with 4 decimals of precision (in integer format) in a vector v. The function works fine if the dimensions of E aren't too large, but it just crashes when I test with a data size similar to that of the real data that I want to process with the function.
Here is the Rcpp code:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
void pearson(NumericMatrix E, IntegerVector v){
int rows = E.nrow();
int cols = E.ncol();
int j, irow, jrow;
double rowsum;
NumericVector means(rows);
int k = 0;
double cov, varx, vary;
double pearson;
for(irow = 0; irow < rows; irow++){
rowsum = 0;
for(j = 0; j < cols; j++){
rowsum += E(irow, j);
}
means[irow] = rowsum / cols;
}
for(irow = 0; irow < rows - 1; irow++){
for(jrow = irow + 1; jrow < rows; jrow++){
cov = 0;
varx = 0;
vary = 0;
for(j = 0; j < cols; j++) {
cov += (E(irow, j) - means[irow]) * (E(jrow, j) - means[jrow]);
varx += std::pow(E(irow, j) - means[irow], 2);
vary += std::pow(E(jrow, j) - means[jrow], 2);
}
pearson = cov / std::sqrt(varx * vary);
v[k] = (int) (pearson * 10000);
k++;
}
}
}
And then for testing it in R I started with the following:
library(Rcpp)
sourceCpp("pearson.cpp")
testin <- matrix(rnorm(1000 * 1100), nrow = 1000, ncol = 1100)
testout <- integer( (nrow(testin) * (nrow(testin) - 1)) / 2 )
pearson(testin, testout) # success!
However when increasing input size the R session crashes after executing the last line in the following script:
library(Rcpp)
sourceCpp("pearson.cpp")
testin <- matrix(rnorm(16000 * 17000), nrow = 16000, ncol = 17000)
testout <- integer( (nrow(testin) * (nrow(testin) - 1)) / 2 )
pearson(testin, testout) # sad
I feel like this is strange since I'm able to allocate the input and the output just fine before executing the function. Inside the function the output vector is modified by reference. I can't figure out what is wrong. Currently I'm working on a machine with 16GB RAM.
EDIT: output of sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Mexico.1252
[2] LC_CTYPE=Spanish_Mexico.1252
[3] LC_MONETARY=Spanish_Mexico.1252
[4] LC_NUMERIC=C
[5] LC_TIME=Spanish_Mexico.1252
attached base packages:
[1] stats graphics grDevices
[4] utils datasets methods
[7] base
other attached packages:
[1] Rcpp_1.0.5
loaded via a namespace (and not attached):
[1] compiler_4.0.4
Just for the sake of giving closure to this question: as suggested in the comments, I tried running the function with only the inputs allocated and without the actual algorithm, and it returns just fine. I think that on Windows, for some reason, when the input reaches a certain size the window will dim and say "not responding" next to the R console's window name. However, the function is still running: it will eventually finish if given enough time, and the R console's window will return to normal. The fact that the process took so long, and that the window looked the way it does when Rcpp crashes, led me to think the process was not running and that it was some sort of crash.
What I ended up doing is programming a parallel version of the algorithm with the aid of this very helpful tutorial by some of the creators of RcppParallel. Since I cannot afford using the base R cor() function due to memory constraints, making the parallel version suited my needs perfectly.
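The long running time is consistent with the amount of work involved: 16000 rows give 16000 * 15999 / 2 = 127,992,000 row pairs, and each pair loops over 17000 columns, so the innermost loop body runs roughly 2.18e12 times. Independently of parallelization, the work per pair can be reduced by centring each row and computing its sum of squares once, so that only a dot product remains in the inner loop. The following is a minimal serial sketch in plain C++ under stated assumptions: pearson_pairs is a hypothetical name, the matrix is passed as a flat row-major std::vector (R matrices are column-major), and the centred copy doubles the memory needed for the input, which matters at the sizes discussed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation for every pair of rows, scaled by 10000 and truncated
// to int as in the question. `e` holds rows * cols values in row-major order.
std::vector<int> pearson_pairs(const std::vector<double>& e,
                               std::size_t rows, std::size_t cols) {
    assert(e.size() == rows * cols);

    // Pass 1: centre each row and store its sum of squares once.
    std::vector<double> centered(e);
    std::vector<double> ss(rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < cols; ++j) sum += e[i * cols + j];
        const double mean = sum / static_cast<double>(cols);
        for (std::size_t j = 0; j < cols; ++j) {
            double& c = centered[i * cols + j];
            c -= mean;
            ss[i] += c * c;
        }
    }

    // Pass 2: a single dot product per pair of rows.
    std::vector<int> out;
    out.reserve(rows * (rows - 1) / 2);
    for (std::size_t i = 0; i + 1 < rows; ++i) {
        for (std::size_t k = i + 1; k < rows; ++k) {
            double cov = 0.0;
            for (std::size_t j = 0; j < cols; ++j) {
                cov += centered[i * cols + j] * centered[k * cols + j];
            }
            const double r = cov / std::sqrt(ss[i] * ss[k]);
            out.push_back(static_cast<int>(r * 10000));
        }
    }
    return out;
}
```

The pair index is implied by the push order (pairs are emitted in the same order as in the original loops), and std::size_t is used for all counters so that larger row counts do not overflow an int.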
I use Rcpp::sourceCpp("test.cpp") and it outputs the following error information. Note that check1() works and check2() fails; the difference is arma::vec versus arma::fvec. The error happens on Windows; on Linux, it works.
(EDIT: I have added my R environment on Linux. PS: Results on Linux show that float is faster than double, which is why I prefer using float)
C:/RBuildTools/3.5/mingw_64/bin/g++ -std=gnu++11 -I"C:/PROGRA~1/R/R-36~1.1/include" -DNDEBUG -I../inst/include -fopenmp -I"C:/Users/wenji/OneDrive/Documents/R/win-library/3.6/Rcpp/include" -I"C:/Users/wenji/OneDrive/Documents/R/win-library/3.6/RcppArmadillo/include" -I"Y:/" -O2 -Wall -mtune=generic -c check.cpp -o check.o
C:/RBuildTools/3.5/mingw_64/bin/g++ -shared -s -static-libgcc -o sourceCpp_3.dll tmp.def check.o -fopenmp -LC:/PROGRA~1/R/R-36~1.1/bin/x64 -lRlapack -LC:/PROGRA~1/R/R-36~1.1/bin/x64 -lRblas -lgfortran -lm -lquadmath -LC:/PROGRA~1/R/R-36~1.1/bin/x64 -lR
check.o:check.cpp:(.text+0xa18): undefined reference to `sdot_'
collect2.exe: error: ld returned 1 exit status
Below is the R environment on Windows:
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Matrix products: default
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1 RcppArmadillo_0.9.850.1.0
[4] Rcpp_1.0.3
Below is the R environment on Linux:
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.3 tools_3.6.3
[3] RcppArmadillo_0.9.850.1.0 Rcpp_1.0.4
Below is the code of "test.cpp":
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
return x * 2;
}
// [[Rcpp::export]]
arma::vec check1(arma::vec x1, arma::vec x2, int rep){
int n = x1.size();
arma::vec y(n);
y.fill(0);
for(int i = 0; i < rep; i ++){
y += x1 * arma::dot(x1, x2);
}
return y;
}
// [[Rcpp::export]]
arma::fvec check2(arma::fvec x1, arma::fvec x2, int rep){
int n = x1.size();
arma::fvec y(n);
y.fill(0);
for(int i = 0; i < rep; i ++){
y += x1 * arma::dot(x1, x2);
}
return y;
}
// You can include R code blocks in C++ files processed with sourceCpp
// (useful for testing and development). The R code will be automatically
// run after the compilation.
//
/*** R
timesTwo(42)
n = 100000
x1 = rnorm(n)
x2 = rnorm(n)
rep = 1000
system.time(y1 <- check1(x1, x2, rep))
system.time(y2 <- check2(x1, x2, rep))
head(y1)
head(y2)
*/
Below is the output on Linux:
> system.time(y1 <- check1(x1, x2, rep))
user system elapsed
0.156 0.000 0.160
> system.time(y2 <- check2(x1, x2, rep))
user system elapsed
0.088 0.000 0.100
There are two questions here:
Why did it work on Linux but not Windows?
R only has int and double, but not float (or 64-bit integer). On Windows you may be linking with R's own internal LAPACK which likely only has double. On Linux float may be present in the system LAPACK. That is my best guess.
Can you / should you use float with Armadillo?
Not really. R only has double and not float so to get values back and forth will always involve copies and is less efficient. I would stick with double.
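The copying mentioned above can be made concrete with a small plain C++ sketch. It assumes nothing about how RcppArmadillo performs the conversion internally; to_float and to_double are hypothetical helper names. It only shows that moving between double storage (what R provides) and float storage needs a second buffer and an element-wise conversion, and that the narrowing step is lossy for values such as 0.1.

```cpp
#include <cassert>
#include <vector>

// double -> float: a new buffer plus an element-wise narrowing conversion.
std::vector<float> to_float(const std::vector<double>& x) {
    std::vector<float> out;
    out.reserve(x.size());
    for (double v : x) out.push_back(static_cast<float>(v));
    return out;
}

// float -> double: again a new buffer; widening is exact but still a copy.
std::vector<double> to_double(const std::vector<float>& x) {
    return std::vector<double>(x.begin(), x.end());
}

// Self-check: 0.5 survives the round trip, 0.1 does not.
inline void self_check() {
    const std::vector<double> x = {0.5, 0.1};
    const std::vector<double> back = to_double(to_float(x));
    assert(back[0] == 0.5);
    assert(back[1] != 0.1);
}
```

Any speed gained from float arithmetic therefore has to be weighed against these two copies per call and the reduced precision of the result handed back to R.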
I have a large raw vector, e.g.:
x <- rep(as.raw(1:10), 4e8) # this vector is about 4 GB
I just want to remove the first element, but no matter what I do it uses a huge amount of memory.
> x <- tail(x, length(x)-1)
Error: cannot allocate vector of size 29.8 Gb
> x <- x[-1L]
Error: cannot allocate vector of size 29.8 Gb
> x <- x[seq(2, length(x)-1)]
Error: cannot allocate vector of size 29.8 Gb
What's going on? Do I really have to rely on C to do such a simple operation? (I know it's simple to do with Rcpp but that's not the point).
SessionInfo:
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_0.8.3
loaded via a namespace (and not attached):
[1] tidyselect_0.2.5 compiler_3.6.1 magrittr_1.5 assertthat_0.2.1
[5] R6_2.4.0 pillar_1.4.2 glue_1.3.1 tibble_2.1.3
[9] crayon_1.3.4 Rcpp_1.0.2 pkgconfig_2.0.2 rlang_0.4.0
[13] purrr_0.3.2
Rcpp solution as #jangoreki asked for:
#include <Rcpp.h>
using namespace Rcpp;
// solution for the original question
// [[Rcpp::export]]
IntegerVector popBeginningOfVector(IntegerVector x, int npop) {
return IntegerVector(x.begin() + npop, x.end());
}
// generic negative indexing
// [[Rcpp::export]]
IntegerVector efficientNegativeIndexing(IntegerVector x, IntegerVector neg_idx) {
std::sort(neg_idx.begin(), neg_idx.end());
size_t ni_size = neg_idx.size();
size_t xsize = x.size();
int * xptr = INTEGER(x);
int * niptr = INTEGER(neg_idx);
size_t xtposition = 0;
IntegerVector xt(xsize - ni_size); // allocate new vector of the correct size
int * xtptr = INTEGER(xt);
int range_begin, range_end;
for(size_t i=0; i < ni_size; ++i) {
if(i == 0) {
range_begin = 0;
} else {
range_begin = neg_idx[i-1];
}
range_end = neg_idx[i] - 1;
// std::cout << range_begin << " " << range_end << std::endl;
std::copy(xptr+range_begin, xptr+range_end, xtptr+xtposition);
xtposition += range_end - range_begin;
}
std::copy(xptr+range_end+1, xptr + xsize, xtptr+xtposition);
return xt;
}
The problem is that the code to do subsetting allocates a vector of the indices corresponding to the elements you want. For your example, that's the vector 2:4e9.
Recent versions of R can store such vectors very compactly (just first and last element), but the code doing the subsetting doesn't do that, so it needs to store all 4e9-1 values.
Integers would use 4 bytes each, but 4e9 is too big to be an integer, so R stores all those values as 8 byte doubles. That adds up to 32000000040 bytes according to pryr::object_size(2:4e9). That's 29.8 Gb.
To get around this, you would need to make very low level changes to the subsetting code in https://svn.r-project.org/R/trunk/src/main/subset.c and
the subscripting code in https://svn.r-project.org/R/trunk/src/main/subscript.c.
Since this is such a specialized case and the alternative (doing it all in C or C++) is so much easier, I don't think R Core is going to put a lot of effort into this.
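The size quoted in the error message can be checked with a few lines of arithmetic. The sketch below assumes 8 bytes per index (doubles, as explained above) and that R's "Gb" means 1024^3 bytes; the function and constant names are hypothetical, and the vector header is ignored, which would presumably account for the remaining 48 bytes in the pryr::object_size figure.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for an index vector of n_index entries (header ignored).
constexpr std::uint64_t index_bytes(std::uint64_t n_index,
                                    std::uint64_t bytes_per_index) {
    return n_index * bytes_per_index;
}

// Numbers from the example: x has 4e9 elements and x[-1] keeps indices 2..4e9.
constexpr std::uint64_t kLength = 4000000000ULL;
constexpr std::uint64_t kKept = kLength - 1;
constexpr std::uint64_t kIndexBytes = index_bytes(kKept, 8);  // stored as doubles
static_assert(kIndexBytes == 31999999992ULL, "index vector payload in bytes");

// Convert bytes to units of 1024^3 bytes.
constexpr double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// Self-check: the payload is about 29.8 units of 1024^3 bytes.
inline void self_check() {
    assert(to_gib(kIndexBytes) > 29.80);
    assert(to_gib(kIndexBytes) < 29.81);
}
```

By comparison, the raw vector itself occupies only 4e9 bytes, so the temporary index vector is about eight times larger than the data being subset.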
My problem is probably trivial (I hope), but I haven't found specific help on errors from this package, and posts on compilation errors concern cases where people wrote the code themselves (so they could change it).
I'm trying to replicate the first example from the BMA package help:
library(MASS)
library(BMA)
data(birthwt)
y <- birthwt$lo
x <- data.frame(birthwt[,-1])
x$race <- as.factor(x$race)
x$ht <- (x$ht>=1)+0
x <- x[,-9]
x$smoke <- as.factor(x$smoke)
x$ptl <- as.factor(x$ptl)
x$ht <- as.factor(x$ht)
x$ui <- as.factor(x$ui)
### add 41 columns of noise
noise<- matrix(rnorm(41*nrow(x)), ncol=41)
colnames(noise)<- paste('noise', 1:41, sep='')
x<- cbind(x, noise)
iBMA.glm.out<- iBMA.glm( x, y, glm.family="binomial",
factor.type=FALSE, verbose = TRUE,
thresProbne0 = 5 )
summary(iBMA.glm.out)
Everything goes fine until the iBMA.glm function, which returns a lengthy compilation error that I don't understand at all (I've never compiled anything inside R by hand):
Warning message:
running command 'make -f "C:/PROGRA~1/R/R-32~1.2/etc/x64/Makeconf" -f "C:/PROGRA~1/R/R-32~1.2/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="file1b2c792f3888.dll" WIN=64 TCLBIN=64 OBJECTS="file1b2c792f3888.o"' had status 127
ERROR(s) during compilation: source code errors or compiler configuration errors!
Program source:
1: #include <R.h>
2: #include <Rdefines.h>
3: #include <R_ext/Error.h>
4:
5:
6: /* This is taken from envir.c in the R 2.15.1 source
7: https://github.com/SurajGupta/r-source/blob/master/src/main/envir.c
8: */
9: #define FRAME_LOCK_MASK (1<<14)
10: #define FRAME_IS_LOCKED(e) (ENVFLAGS(e) & FRAME_LOCK_MASK)
11: #define UNLOCK_FRAME(e) SET_ENVFLAGS(e, ENVFLAGS(e) & (~ FRAME_LOCK_MASK))
12:
13:
14: extern "C" {
15: SEXP file1b2c792f3888 ( SEXP env );
16: }
17:
18: SEXP file1b2c792f3888 ( SEXP env ) {
19:
20: if (TYPEOF(env) == NILSXP)
21: error("use of NULL environment is defunct");
22: if (TYPEOF(env) != ENVSXP)
23: error("not an environment");
24:
25: UNLOCK_FRAME(env);
26:
27: // Return TRUE if unlocked; FALSE otherwise
28: SEXP result = PROTECT( Rf_allocVector(LGLSXP, 1) );
29: LOGICAL(result)[0] = FRAME_IS_LOCKED(env) == 0;
30: UNPROTECT(1);
31:
32: return result;
33:
34: warning("your C program does not return anything!");
35: return R_NilValue;
36: }
Error in compileCode(f, code, language, verbose) :
Compilation ERROR, function(s)/method(s) not created! Warning message:
running command 'make -f "C:/PROGRA~1/R/R-32~1.2/etc/x64/Makeconf" -f "C:/PROGRA~1/R/R-32~1.2/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="file1b2c792f3888.dll" WIN=64 TCLBIN=64 OBJECTS="file1b2c792f3888.o"' had status 127
In addition: Warning message:
running command 'C:/PROGRA~1/R/R-32~1.2/bin/x64/R CMD SHLIB file1b2c792f3888.cpp 2> file1b2c792f3888.cpp.err.txt' had status 1
My session info is:
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Polish_Poland.1250 LC_CTYPE=Polish_Poland.1250 LC_MONETARY=Polish_Poland.1250
[4] LC_NUMERIC=C LC_TIME=Polish_Poland.1250
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] BMA_3.18.4 rrcov_1.3-8 inline_0.3.14 robustbase_0.92-5 leaps_2.9
[6] survival_2.38-3 rJava_0.9-6 relaimpo_2.2-2 mitools_2.3 survey_3.30-3
[11] boot_1.3-17 MASS_7.3-43
loaded via a namespace (and not attached):
[1] mvtnorm_1.0-3 lattice_0.20-33 corpcor_1.6.8 pcaPP_1.9-60 stats4_3.2.2 splines_3.2.2
[7] tools_3.2.2 DEoptimR_1.0-3 cluster_2.0.3