change vector element by name in rcpp - r

I have a function where I need to make a table (tab), then change one value: the value where tab.names() == k, where k is given in the function call.
Looking at http://dirk.eddelbuettel.com/code/rcpp/Rcpp-quickref.pdf, I had hoped that the following code would work (replacing "foo" with a variable name), but I guess that requires the element name to be static, and mine won't be. I've also tried using which, but that won't compile (invalid conversion from 'char' to 'Rcpp::traits::storage_type<16>::type {aka SEXPREC*}'), so I'm doing something wrong there.
#include <RcppArmadillo.h>
#include <algorithm>
//[[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector fun(const arma::vec& assignment, int k) {
// count number of peptides per protein
IntegerVector tab = table(as<IntegerVector>(wrap(assignment)));
CharacterVector all_proteins = tab.names();
char kc = '0' + k;
// what I need a working version of:
tab(kc) = 1; // gets ignored, as does a [] version of the same thing.
// or
tab('0' + k) = 1; // also ignored
int ki = which(all_proteins == kc); // gives me compile errors
// extra credit
// tab.names(k-1) = "-1";
return tab;
}
/*** R
set.seed(23)
x <- rpois(20, 5)
k <- 5
fun(x, k)
# same thing in R:
expected_output <- table(x)
expected_output # before modification
# x
# 3 4 5 6 7 9 10 12
# 2 4 3 3 4 2 1 1
expected_output[as.character(k)] <- 1 # this is what I need help with
expected_output
# x
# 3 4 5 6 7 9 10 12
# 2 4 1 3 4 2 1 1
# extra credit:
names(expected_output)[names(expected_output) == as.character(k)] <- "-1"
*/
I'm still learning rcpp, and more importantly, still learning how to read the manual pages and plug in the right search terms into google/stackoverflow. I'm sure this is basic stuff (and I'm open to better methods - I currently think like an R programmer in terms of initial approaches to problems, not a C++ programmer.)
(BTW - The use of arma::vec is used in other parts of the code which I'm not showing for simplicity - I realize it's not useful here. I debated on switching it, but decided against it on the principle that I've tested that part, it works, and the last thing I want to do is introduce an extra bug...)
Thanks!

You can use the .findName() method to get the relevant index:
#include <RcppArmadillo.h>
#include <algorithm>
//[[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector fun(const arma::vec& assignment, int k) {
// count number of peptides per protein
IntegerVector tab = table(as<IntegerVector>(wrap(assignment)));
CharacterVector all_proteins = tab.names();
// std::to_string also handles multi-digit names such as "10"; '0' + k only works for k in 0..9
int index = tab.findName(std::to_string(k));
tab(index) = 1;
all_proteins(index) = "-1";
tab.names() = all_proteins;
return tab;
}
/*** R
set.seed(23)
x <- rpois(20, 5)
k <- 5
fun(x, k)
*/
Output:
> Rcpp::sourceCpp('table-name.cpp')
> set.seed(23)
> x <- rpois(20, 5)
> k <- 5
> fun(x, k)
3 4 -1 6 7 9 10 12
2 4 1 3 4 2 1 1

You could write your own function (use String instead of char):
int first_which_equal(const CharacterVector& x, String y) {
int n = x.size();
for (int i = 0; i < n; i++) {
if (x[i] == y) return(i);
}
return -1;
}
Also, it seems that tab(kc) converts kc to its integer character code (for example, '5' becomes 53) and uses that as a position rather than as a name.
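To make the name-lookup idea concrete without Rcpp, here is a minimal standard-C++ sketch. It assumes the table is held as two parallel vectors (names and counts); `find_name` and `set_count_by_name` are hypothetical helpers made up for this illustration, not Rcpp functions. It also shows why building the key with std::to_string is safer than '0' + k once k has more than one digit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): zero-based position of `key`
// in `names`, or -1 if it is absent.
int find_name(const std::vector<std::string>& names, const std::string& key) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i] == key) return static_cast<int>(i);
    }
    return -1;
}

// Set the count stored under the name of integer `k` to `value`.
// Returns false (and changes nothing) when no such name exists.
bool set_count_by_name(const std::vector<std::string>& names,
                       std::vector<int>& counts, int k, int value) {
    assert(names.size() == counts.size());
    // std::to_string(12) is "12"; the char '0' + 12 would be '<', not a name.
    const int pos = find_name(names, std::to_string(k));
    if (pos < 0) return false;
    counts[static_cast<std::size_t>(pos)] = value;
    return true;
}
```

With the names and counts from the question's table, `set_count_by_name(names, counts, 5, 1)` changes the third count to 1, and a lookup for a name that is not in the table leaves the counts untouched.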

Related

CUDA thrust vector: copy and sum the values from device_vectorA to device_vectorB

I'm new to CUDA.
I want to copy and sum values in device_vector in the following ways. Are there more efficient ways (or functions provided by thrust) to implement these?
thrust::device_vector<int> device_vectorA(5);
thrust::device_vector<int> device_vectorB(20);
1. Copy device_vectorA 4 times into device_vectorB in the following way:
for (size_t i = 0; i < 4; i++)
{
size_t offset_sta = i * 5;
thrust::copy(device_vectorA.begin(), device_vectorA.end(), device_vectorB.begin() + offset_sta);
}
2. Sum every 5 values in device_vectorB and store the results in a new device_vector (size 4):
// Example
// device_vectorB = 1 2 3 4 5 | 1 2 3 4 5 | 1 2 3 4 5 | 1 2 3 4 5
// device_vectorC = 15 15 15 15
thrust::device_vector<int> device_vectorC(4);
for (size_t i = 0; i < 4; i++)
{
size_t offset_sta = i * 5;
size_t offset_end = (i + 1) * 5; // one past the last element: the end iterator is exclusive
device_vectorC[i] = thrust::reduce(device_vectorB.begin() + offset_sta, device_vectorB.begin() + offset_end, 0);
}
Are there more efficient ways (or functions provided by thrust) to implement these?
P.S. 1 and 2 are separate instances. For simplicity, these two instances just use the same vectors to illustrate.
Step 1 can be done with a single thrust::copy operation using a permutation iterator that uses a transform iterator working on a counting iterator to generate the copy indices "on the fly".
Step 2 is a partitioned reduction, using thrust::reduce_by_key. We can again use a transform iterator working on a counting iterator to create the flags array "on the fly".
Here is an example:
$ cat t2124.cu
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/discard_iterator.h>
#include <iostream>
#include <iterator> // std::ostream_iterator
using namespace thrust::placeholders;
const int As = 5;
const int Cs = 4;
const int Bs = As*Cs;
int main(){
thrust::device_vector<int> A(As);
thrust::device_vector<int> B(Bs);
thrust::device_vector<int> C(Cs);
thrust::sequence(A.begin(), A.end(), 1); // fill A with 1,2,3,4,5
thrust::copy_n(thrust::make_permutation_iterator(A.begin(), thrust::make_transform_iterator(thrust::counting_iterator<int>(0), _1%A.size())), B.size(), B.begin()); // step 1
auto my_flags_iterator = thrust::make_transform_iterator(thrust::counting_iterator<int>(0), _1/A.size());
thrust::reduce_by_key(my_flags_iterator, my_flags_iterator+B.size(), B.begin(), thrust::make_discard_iterator(), C.begin()); // step 2
thrust::host_vector<int> Ch = C;
thrust::copy_n(Ch.begin(), Ch.size(), std::ostream_iterator<int>(std::cout, ","));
std::cout << std::endl;
}
$ nvcc -o t2124 t2124.cu
$ compute-sanitizer ./t2124
========= COMPUTE-SANITIZER
15,15,15,15,
========= ERROR SUMMARY: 0 errors
$
If we wanted to, even the device vector A could be dispensed with; it could be generated "on the fly" using a counting iterator. But presumably your inputs are not actually 1,2,3,4,5.
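The index arithmetic behind both steps can be checked on the host. The sketch below is plain C++, not Thrust, and the function names `tile` and `segment_sums` are made up for this illustration. It uses the same two mappings as the iterators above: `i % A.size()` picks the source element for step 1, and `i / A.size()` is the segment key for step 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side analogue of step 1: out[i] = a[i % a.size()].
std::vector<int> tile(const std::vector<int>& a, std::size_t copies) {
    assert(!a.empty());
    std::vector<int> out(a.size() * copies);
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = a[i % a.size()];
    return out;
}

// Host-side analogue of step 2: element i belongs to segment i / seg_len,
// and each segment is reduced to a single sum.
std::vector<int> segment_sums(const std::vector<int>& b, std::size_t seg_len) {
    assert(seg_len > 0 && b.size() % seg_len == 0);
    std::vector<int> out(b.size() / seg_len, 0);
    for (std::size_t i = 0; i < b.size(); ++i) out[i / seg_len] += b[i];
    return out;
}
```

On the GPU the same mappings are evaluated lazily by the transform iterators, so no index or key array has to be stored in memory.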

Poisson draw in Rcpp and R different results

I get different results when I run what I think is the same code in R and in Rcpp.
In R I run the following code
t = 0
for(i in 1:50){
t = t + rpois(1, 0.5)
}
t
[1] 28
and I get back a positive value of t. Now I type what I believe are the same commands in Rcpp
#include <Rcpp.h>
#include<Rmath.h>
using namespace Rcpp;
// [[Rcpp::export]]
int Pois(int l){
int t=0;
for(int i=0; i<50;++i){
t+=R::rpois(l);
}
return t;
}
and when I call the function in R
Pois(0.5)
[1] 0
which is wrong, since in R the result was different from zero.
What is going wrong?
You should use double l rather than int l, e.g.,
int Pois(double l){
int t=0;
for(int i=0; i<50;++i){
t+=R::rpois(l);
}
return t;
}
otherwise (int) 0.5 gives you 0.
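The truncation can be reproduced outside Rcpp. This is a minimal sketch under stated assumptions: it uses the C++ standard library generator instead of R's RNG, so the draws will not match R::rpois, and both function names are hypothetical.

```cpp
#include <cassert>
#include <random>

// Mimics the original signature: a rate of 0.5 converted to int arrives as 0.
int rate_seen_by_int_param(int l) { return l; }

// Sum of n Poisson draws with a double rate, using std::poisson_distribution
// (an assumption for illustration; this is not R's generator).
int sum_poisson(double rate, int n, unsigned seed) {
    assert(rate > 0.0 && n >= 0);
    std::mt19937 gen(seed);
    std::poisson_distribution<int> pois(rate);
    int total = 0;
    for (int i = 0; i < n; ++i) total += pois(gen);
    return total;
}
```

A Poisson variable with rate 0 is always 0, which is why the original function with int l returned 0 no matter how many draws were summed.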
@ThomasIsCoding already showed you the main issue. But recall that besides R::rpois() we also have the vectorised Rcpp::rpois(). And, as usual, given the same seed it gives the same draws as R:
> set.seed(123)
> rpois(10, 0.5)
[1] 0 1 0 1 2 0 0 1 0 0
> Rcpp::cppFunction("NumericVector myrp(int n, double l) { return Rcpp::rpois(n, l); }")
> set.seed(123)
> myrp(10, 0.5)
[1] 0 1 0 1 2 0 0 1 0 0
>

Rcpp and R: pass by reference

Working with Rcpp and R I observed the following behaviour, which I do not understand at the moment. Consider the following simple function written in Rcpp
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix hadamard_product(NumericMatrix & X, NumericMatrix & Y){
unsigned int ncol = X.ncol();
unsigned int nrow = X.nrow();
int counter = 0;
for (unsigned int j=0; j<ncol; j++) {
for (unsigned int i=0; i<nrow; i++) {
X[counter++] *= Y(i, j);
}
}
return X;
}
This simply returns the component-wise product of two matrices. Now I know that the arguments to this function are passed by reference, i.e., calling
M <- matrix(rnorm(4), ncol = 2)
N <- matrix(rnorm(4), ncol = 2)
M_copy <- M
hadamard_product(M, N)
will overwrite the original M. However, it also overwrites M_copy, which I do not understand. I thought that M_copy <- M makes a copy of the object M and stores it elsewhere in memory, rather than making M_copy point to the same object as M. Copying is what seems to happen when executing
x <- 1
y <- x
x <- 2
for example. This does not change y but only x.
So why does the behaviour above occur?
No, R does not make a copy immediately, only if it is necessary, i.e., copy-on-modify:
x <- 1
tracemem(x)
#[1] "<0000000009A57D78>"
y <- x
tracemem(x)
#[1] "<0000000009A57D78>"
x <- 2
tracemem(x)
#[1] "<00000000099E9900>"
Since you modify M by reference outside R, R can't know that a copy is necessary, so M and M_copy still share the same memory. If you want to ensure a copy is made, you can use data.table::copy. Alternatively, avoid the side effect in your C++ code, e.g., make a deep copy there with Rcpp::clone.
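As a loose analogy only (this is not how R is implemented), two handles to one shared buffer behave the same way in plain C++. In the sketch below `Buffer`, `scale_in_place` and `scaled_copy` are hypothetical names; the in-place version changes what every alias sees, while the deep-copy version plays the role that Rcpp::clone would play in the Rcpp function.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using Buffer = std::shared_ptr<std::vector<double>>;

// In-place scaling through the shared handle: every alias sees the change.
void scale_in_place(const Buffer& x, double factor) {
    assert(x != nullptr);
    for (double& v : *x) v *= factor;
}

// Deep copy first, then modify only the copy; the input stays untouched.
Buffer scaled_copy(const Buffer& x, double factor) {
    assert(x != nullptr);
    Buffer out = std::make_shared<std::vector<double>>(*x);
    for (double& v : *out) v *= factor;
    return out;
}
```

If `m_copy = m` only copies the handle, calling `scale_in_place(m, 10.0)` also changes what `m_copy` sees, whereas `scaled_copy(m, 10.0)` leaves both untouched.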

R: Summing up neighboring matrix elements. How to speed up?

I'm working with large matrices of about 2500x2500x50 (lonxlatxtime). The matrix contains only 1 and 0. I need to know for each timestep the sum of the 24 surrounding elements. So far I did it about this way:
xdim <- 2500
ydim <- 2500
tdim <- 50
a <- array(0:1,dim=c(xdim,ydim,tdim))
res <- array(0:1,dim=c(xdim,ydim,tdim))
for (t in 1:tdim){
for (x in 3:(xdim-2)){
for (y in 3:(ydim-2)){
res[x,y,t] <- sum(a[(x-2):(x+2),(y-2):(y+2),t])
}
}
}
This works, but it is much too slow for my needs. Does anybody have advice on how to speed this up?
Intro
I have to say, there are many hidden things behind just the setup of the arrays. The remainder of the problem is trivial, though. As a result, there are really two ways to go about it:
1. Brute force, as suggested by @Alex (written in C++)
2. Observing replication patterns
Bruteforce with OpenMP
If we want to 'brute force' it, then we can use the suggestion given by @Alex and employ OpenMP with Armadillo:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// Add a flag to enable OpenMP at compile time
// [[Rcpp::plugins(openmp)]]
// Protect against compilers without OpenMP
#ifdef _OPENMP
#include <omp.h>
#endif
// [[Rcpp::export]]
arma::cube cube_parallel(arma::cube a, arma::cube res, int cores = 1) {
// Extract the different dimensions
unsigned int tdim = res.n_slices;
unsigned int xdim = res.n_rows;
unsigned int ydim = res.n_cols;
// Same calculation loop
#pragma omp parallel for num_threads(cores)
for (unsigned int t = 0; t < tdim; t++){
// pop the T
arma::mat temp_mat = a.slice(t);
// Subset the rows
for (unsigned int x = 2; x < xdim-2; x++){
arma::mat temp_row_sub = temp_mat.rows(x-2, x+2);
// Iterate over the columns with unit accumulative sum
for (unsigned int y = 2; y < ydim-2; y++){
res(x,y,t) = accu(temp_row_sub.cols(y-2,y+2));
}
}
}
return res;
}
Replication Patterns
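However, the smarter approach is to understand how array(0:1, dims) is constructed: R recycles 0:1 down the columns, so the pattern depends only on whether xdim and ydim are even or odd.
Most notably:
Case 1: If xdim is even, only the rows alternate: every column is 0, 1, 0, 1, ... and every slice is identical.
Case 2: If xdim is odd and ydim is odd, rows and columns alternate within each slice (a checkerboard), and the slices alternate as well.
Case 3: If xdim is odd and ydim is even, rows and columns alternate within each slice, but every slice is identical.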
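All three cases follow from one formula. R fills arrays in column-major order and recycles 0:1, so each element is the parity of its linear index. A minimal sketch, assuming zero-based indices (`recycled01` is a hypothetical name used only here):

```cpp
#include <cassert>
#include <cstddef>

// Value of array(0:1, dim = c(xdim, ydim, tdim)) at zero-based (x, y, t).
// The value is the parity of the column-major linear index
// x + xdim * y + xdim * ydim * t.
int recycled01(std::size_t x, std::size_t y, std::size_t t,
               std::size_t xdim, std::size_t ydim) {
    assert(x < xdim && y < ydim);
    return static_cast<int>((x + xdim * y + xdim * ydim * t) % 2);
}
```

When xdim is even, the terms involving y and t are always even and only x matters. When xdim is odd, y contributes as well, and t contributes only if xdim * ydim is odd, that is, if ydim is odd too.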
Examples
Let's see the cases in action to observe the patterns.
Case 1:
xdim <- 2
ydim <- 3
tdim <- 2
a <- array(0:1,dim=c(xdim,ydim,tdim))
Output:
, , 1
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 1 1 1
, , 2
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 1 1 1
Case 2:
xdim <- 3
ydim <- 3
tdim <- 3
a <- array(0:1,dim=c(xdim,ydim,tdim))
Output:
, , 1
[,1] [,2] [,3]
[1,] 0 1 0
[2,] 1 0 1
[3,] 0 1 0
, , 2
[,1] [,2] [,3]
[1,] 1 0 1
[2,] 0 1 0
[3,] 1 0 1
, , 3
[,1] [,2] [,3]
[1,] 0 1 0
[2,] 1 0 1
[3,] 0 1 0
Case 3:
xdim <- 3
ydim <- 4
tdim <- 2
a <- array(0:1,dim=c(xdim,ydim,tdim))
Output:
, , 1
[,1] [,2] [,3] [,4]
[1,] 0 1 0 1
[2,] 1 0 1 0
[3,] 0 1 0 1
, , 2
[,1] [,2] [,3] [,4]
[1,] 0 1 0 1
[2,] 1 0 1 0
[3,] 0 1 0 1
Pattern Hacking
Alrighty, based on the above discussion, we opt to make a bit of code that exploits this unique pattern.
Creating Alternating Vectors
An alternating vector in this case switches between two different values.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// ------- Make Alternating Vectors
arma::vec odd_vec(unsigned int xdim){
// make a temporary vector to create alternating 0-1 effect by row.
arma::vec temp_vec(xdim);
// Alternating vector (anyone have a better solution? )
for (unsigned int i = 0; i < xdim; i++) {
temp_vec(i) = (i % 2 ? 0 : 1);
}
return temp_vec;
}
arma::vec even_vec(unsigned int xdim){
// make a temporary vector to create alternating 0-1 effect by row.
arma::vec temp_vec(xdim);
// Alternating vector (anyone have a better solution? )
for (unsigned int i = 0; i < xdim; i++) {
temp_vec(i) = (i % 2 ? 1 : 0); // changed
}
return temp_vec;
}
Creating the three cases of matrix
As mentioned above, there are three cases of matrix. The even, first odd, and second odd cases.
// --- Handle the different cases
// [[Rcpp::export]]
arma::mat make_even_matrix(unsigned int xdim, unsigned int ydim){
arma::mat temp_mat(xdim,ydim);
temp_mat.each_col() = even_vec(xdim);
return temp_mat;
}
// xdim is odd and ydim is even
// [[Rcpp::export]]
arma::mat make_odd_matrix_case1(unsigned int xdim, unsigned int ydim){
arma::mat temp_mat(xdim,ydim);
arma::vec e_vec = even_vec(xdim);
arma::vec o_vec = odd_vec(xdim);
// Alternating column
for (unsigned int i = 0; i < ydim; i++) {
temp_mat.col(i) = (i % 2 ? o_vec : e_vec);
}
return temp_mat;
}
// xdim is odd and ydim is odd
// [[Rcpp::export]]
arma::mat make_odd_matrix_case2(unsigned int xdim, unsigned int ydim){
arma::mat temp_mat(xdim,ydim);
arma::vec e_vec = even_vec(xdim);
arma::vec o_vec = odd_vec(xdim);
// Alternating column
for (unsigned int i = 0; i < ydim; i++) {
temp_mat.col(i) = (i % 2 ? e_vec : o_vec); // slight change
}
return temp_mat;
}
Calculation Engine
Same as the previous solution, just without the t as we no longer need to repeat calculations.
// --- Calculation engine
// [[Rcpp::export]]
arma::mat calc_matrix(arma::mat temp_mat){
unsigned int xdim = temp_mat.n_rows;
unsigned int ydim = temp_mat.n_cols;
arma::mat res = temp_mat;
// Subset the rows
for (unsigned int x = 2; x < xdim-2; x++){
arma::mat temp_row_sub = temp_mat.rows(x-2, x+2);
// Iterate over the columns with unit accumulative sum
for (unsigned int y = 2; y < ydim-2; y++){
res(x,y) = accu(temp_row_sub.cols(y-2,y+2));
}
}
return res;
}
Call Main Function
Here is the core function that pieces everything together. This gives us the desired distance arrays.
// --- Main Engine
// Create the desired cube information
// [[Rcpp::export]]
arma::cube dim_to_cube(unsigned int xdim = 4, unsigned int ydim = 4, unsigned int tdim = 3) {
// Initialize values in A
arma::cube res(xdim,ydim,tdim);
if(xdim % 2 == 0){
res.each_slice() = calc_matrix(make_even_matrix(xdim, ydim));
}else{
if(ydim % 2 == 0){
res.each_slice() = calc_matrix(make_odd_matrix_case1(xdim, ydim));
}else{
arma::mat first_odd_mat = calc_matrix(make_odd_matrix_case1(xdim, ydim));
arma::mat sec_odd_mat = calc_matrix(make_odd_matrix_case2(xdim, ydim));
for(unsigned int t = 0; t < tdim; t++){
res.slice(t) = (t % 2 ? sec_odd_mat : first_odd_mat);
}
}
}
return res;
}
Timing
Now, the real question is how well this performs:
Unit: microseconds
expr min lq mean median uq max neval
r_1core 3538.022 3825.8105 4301.84107 3957.3765 4043.0085 16856.865 100
alex_1core 2790.515 2984.7180 3461.11021 3076.9265 3189.7890 15371.406 100
cpp_1core 174.508 180.7190 197.29728 194.1480 204.8875 338.510 100
cpp_2core 111.960 116.0040 126.34508 122.7375 136.2285 162.279 100
cpp_3core 81.619 88.4485 104.54602 94.8735 108.5515 204.979 100
cpp_cache 40.637 44.3440 55.08915 52.1030 60.2290 302.306 100
Script used for timing:
cpp_parallel = cube_parallel(a,res, 1)
alex_1core = alex(a,res,xdim,ydim,tdim)
cpp_cache = dim_to_cube(xdim,ydim,tdim)
op_answer = cube_r(a,res,xdim,ydim,tdim)
all.equal(cpp_parallel, op_answer)
all.equal(cpp_cache, op_answer)
all.equal(alex_1core, op_answer)
xdim <- 20
ydim <- 20
tdim <- 5
a <- array(0:1,dim=c(xdim,ydim,tdim))
res <- array(0:1,dim=c(xdim,ydim,tdim))
ga = microbenchmark::microbenchmark(r_1core = cube_r(a,res,xdim,ydim,tdim),
alex_1core = alex(a,res,xdim,ydim,tdim),
cpp_1core = cube_parallel(a,res, 1),
cpp_2core = cube_parallel(a,res, 2),
cpp_3core = cube_parallel(a,res, 3),
cpp_cache = dim_to_cube(xdim,ydim,tdim))
Here's one solution that's fast for a large array:
res <- apply(a, 3, function(a) t(filter(t(filter(a, rep(1, 5), circular=TRUE)), rep(1, 5), circular=TRUE)))
dim(res) <- c(xdim, ydim, tdim)
I filtered the array using rep(1,5) as the weights (i.e. sum values within a neighborhood of 2) along each dimension. I then modified the dim attribute since it initially comes out as a matrix.
Note that this wraps the sum around at the edges of the array (which might make sense since you're looking at latitude and longitude; if not, I can modify my answer).
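For readers who want to see what the two filter passes compute, here is a rough standard-C++ sketch of the same idea: a circular box sum done as two 1-D passes, one along each dimension. It is not a translation of R's filter; `Matrix` and `circular_box_sum` are names invented for this example, and the sketch assumes each dimension is at least as long as the window.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // m[row][col]

// Circular box sum over a (2r + 1) x (2r + 1) neighbourhood, computed as
// two 1-D passes: first along each row, then along each column.
Matrix circular_box_sum(const Matrix& a, std::size_t r) {
    const std::size_t nr = a.size();
    assert(nr >= 2 * r + 1);
    const std::size_t nc = a[0].size();
    assert(nc >= 2 * r + 1);
    Matrix pass1(nr, std::vector<int>(nc, 0));
    Matrix out(nr, std::vector<int>(nc, 0));
    // Pass 1: sum columns j - r .. j + r of each row, wrapping at the edges.
    for (std::size_t i = 0; i < nr; ++i)
        for (std::size_t j = 0; j < nc; ++j)
            for (std::size_t d = 0; d <= 2 * r; ++d)
                pass1[i][j] += a[i][(j + nc - r + d) % nc];
    // Pass 2: sum rows i - r .. i + r of the intermediate result.
    for (std::size_t i = 0; i < nr; ++i)
        for (std::size_t j = 0; j < nc; ++j)
            for (std::size_t d = 0; d <= 2 * r; ++d)
                out[i][j] += pass1[(i + nr - r + d) % nr][j];
    return out;
}
```

With r = 2 this does 5 + 5 additions per cell instead of 25, which is where the separable approach gets its advantage.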
For a concrete example:
xdim <- 500
ydim <- 500
tdim <- 15
a <- array(0:1,dim=c(xdim,ydim,tdim))
and here's what you're currently using (with NAs at the edges) and how long this example takes on my laptop:
f1 <- function(a, xdim, ydim, tdim){
res <- array(NA_integer_,dim=c(xdim,ydim,tdim))
for (t in 1:tdim){
for (x in 3:(xdim-2)){
for (y in 3:(ydim-2)){
res[x,y,t] <- sum(a[(x-2):(x+2),(y-2):(y+2),t])
}
}
}
return(res)
}
system.time(res1 <- f1(a, xdim, ydim, tdim))
# user system elapsed
# 14.813 0.005 14.819
And here's a comparison with the version I described:
f2 <- function(a, xdim, ydim, tdim){
res <- apply(a, 3, function(a) t(filter(t(filter(a, rep(1, 5), circular=TRUE)), rep(1, 5), circular=TRUE)))
dim(res) <- c(xdim, ydim, tdim)
return(res)
}
system.time(res2 <- f2(a, xdim, ydim, tdim))
# user system elapsed
# 1.188 0.047 1.236
You can see there's a significant speed boost (for large arrays). And to check that it's giving the correct solution (note that I'm adding NAs so both results match, since the one I gave filters in a circular manner):
## Match NAs
res2NA <- ifelse(is.na(res1), NA, res2)
all.equal(res2NA, res1)
# [1] TRUE
I'll add that your full array (2500x2500x50) took just under a minute (about 55 seconds), although it did use a lot of memory in the process, FYI.
Your current code has a lot of overhead from redundant subsetting and calculation. Clean this up if you want better speed.
At xdim <- ydim <- 20; tdim <- 5, I see a 23% speedup on my machine.
At xdim <- ydim <- 200; tdim <- 10, I see a 25% speedup.
This comes at small cost of additional memory, which is obvious by examining the code below.
xdim <- ydim <- 20; tdim <- 5
a <- array(0:1,dim=c(xdim,ydim,tdim))
res <- array(0:1,dim=c(xdim,ydim,tdim))
microbenchmark(op= {
for (t in 1:tdim){
for (x in 3:(xdim-2)){
for (y in 3:(ydim-2)){
res[x,y,t] <- sum(a[(x-2):(x+2),(y-2):(y+2),t])
}
}
}
},
alex= {
for (t in 1:tdim){
temp <- a[,,t]
for (x in 3:(xdim-2)){
temp2 <- temp[(x-2):(x+2),]
for (y in 3:(ydim-2)){
res[x,y,t] <- sum(temp2[,(y-2):(y+2)])
}
}
}
}, times = 50)
Unit: milliseconds
expr min lq mean median uq max neval cld
op 4.855827 5.134845 5.474327 5.321681 5.626738 7.463923 50 b
alex 3.720368 3.915756 4.213355 4.012120 4.348729 6.320481 50 a
Further improvements:
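If you write this in C++, my guess is that recognizing res[x,y,t] = res[x,y-1,t] - sum(a[...,y-3,...]) + sum(a[...,y+2,...]) will save you additional time, because the window centred at y-1 covers columns y-3 to y+1 while the window centred at y covers y-2 to y+2. In R, it did not in my timing tests.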
This problem is also embarrassingly parallel. There's no reason you couldn't split the t dimension to make more use of a multi-core architecture.
Both of these are left to the reader / OP.
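As one possible starting point for the first suggestion, here is a plain C++ sketch of the running-sum idea for a single time slice. It has not been benchmarked, and `Matrix`, `window_sums` and `brute_sum` are hypothetical names. The window centred at y - 1 spans columns y - 3 to y + 1, so stepping to y drops column y - 3 and adds column y + 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // m[x][y]

// 5 x 5 neighbourhood sums for interior cells; border cells are left at 0.
Matrix window_sums(const Matrix& a) {
    const std::size_t xdim = a.size();
    const std::size_t ydim = xdim ? a[0].size() : 0;
    Matrix res(xdim, std::vector<int>(ydim, 0));
    if (xdim < 5 || ydim < 5) return res;  // no interior cells
    for (std::size_t x = 2; x + 2 < xdim; ++x) {
        // col[y] = sum of a[x-2 .. x+2][y], computed once per row band.
        std::vector<int> col(ydim, 0);
        for (std::size_t y = 0; y < ydim; ++y)
            for (std::size_t k = x - 2; k <= x + 2; ++k) col[y] += a[k][y];
        int running = col[0] + col[1] + col[2] + col[3] + col[4];
        res[x][2] = running;
        for (std::size_t y = 3; y + 2 < ydim; ++y) {
            running += col[y + 2] - col[y - 3];  // slide the window by one
            res[x][y] = running;
        }
    }
    return res;
}

// Direct 25-term sum, used only to cross-check the running-sum version.
int brute_sum(const Matrix& a, std::size_t x, std::size_t y) {
    int s = 0;
    for (std::size_t i = x - 2; i <= x + 2; ++i)
        for (std::size_t j = y - 2; j <= y + 2; ++j) s += a[i][j];
    return s;
}
```

The outer loop over x only reads from a, so repeating this per time slice remains independent across t and could still be split over cores.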

equivalent of 'which' function in Rcpp

I'm a newbie to C++ and Rcpp. Suppose, I have a vector
t1<-c(1,2,NA,NA,3,4,1,NA,5)
and I want to get the indices of the elements of t1 that are NA. I can write:
NumericVector retIdxNA(NumericVector x) {
// Step 1: get the positions of NA in the vector
LogicalVector y=is_na(x);
// Step 2: count the number of NA
int Cnt=0;
for (int i=0;i<x.size();i++) {
if (y[i]) {
Cnt++;
}
}
// Step 3: create an output vector whose length equals the number of NAs
// and return the answer
NumericVector retIdx(Cnt);
int Cnt1=0;
for (int i=0;i<x.size();i++) {
if (y[i]) {
retIdx[Cnt1]=i+1;
Cnt1++;
}
}
return retIdx;
}
then I get
retIdxNA(t1)
[1] 3 4 8
I was wondering:
(i) is there any equivalent of which in Rcpp?
(ii) is there any way to make the above function shorter/crisper? In particular, is there any easy way to combine the Step 1, 2, 3 above?
Recent versions of RcppArmadillo have functions to identify the indices of finite and non-finite values.
So this code
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::uvec whichNA(arma::vec x) {
return arma::find_nonfinite(x);
}
/*** R
t1 <- c(1,2,NA,NA,3,4,1,NA,5)
whichNA(t1)
*/
yields your desired answer (modulo the off-by-one, since C/C++ indices are zero-based; note that find_nonfinite also flags NaN and infinite values):
R> sourceCpp("/tmp/uday.cpp")
R> t1 <- c(1,2,NA,NA,3,4,1,NA,5)
R> whichNA(t1)
[,1]
[1,] 2
[2,] 3
[3,] 7
R>
Rcpp can do it too if you first create the sequence to subset into:
// [[Rcpp::export]]
Rcpp::IntegerVector which2(Rcpp::NumericVector x) {
Rcpp::IntegerVector v = Rcpp::seq(0, x.size()-1);
return v[Rcpp::is_na(x)];
}
Added to code above it yields:
R> which2(t1)
[1] 2 3 7
R>
The logical subsetting is also somewhat new in Rcpp.
Try this:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector which4( NumericVector x) {
int nx = x.size();
std::vector<int> y;
y.reserve(nx);
for(int i = 0; i < nx; i++) {
if (R_IsNA(x[i])) y.push_back(i+1);
}
return wrap(y);
}
which we can run like this in R:
> which4(t1)
[1] 3 4 8
Performance
Note that we have changed the above solution to reserve space for the output vector. This replaces which3 which is:
// [[Rcpp::export]]
IntegerVector which3( NumericVector x) {
int nx = x.size();
IntegerVector y;
for(int i = 0; i < nx; i++) {
// if (internal::Rcpp_IsNA(x[i])) y.push_back(i+1);
if (R_IsNA(x[i])) y.push_back(i+1);
}
return y;
}
Then the performance on a vector 9 elements long is the following with which4 the fastest:
> library(rbenchmark)
> benchmark(retIdxNA(t1), whichNA(t1), which2(t1), which3(t1), which4(t1),
+ replications = 10000, order = "relative")[1:4]
test replications elapsed relative
5 which4(t1) 10000 0.14 1.000
4 which3(t1) 10000 0.16 1.143
1 retIdxNA(t1) 10000 0.17 1.214
2 whichNA(t1) 10000 0.17 1.214
3 which2(t1) 10000 0.25 1.786
Repeating this for a vector 9000 elements long the Armadillo solution comes in quite a bit faster than the others. Here which3 (which is the same as which4 except it does not reserve space for the output vector) comes in worst while which4 comes second.
> tt <- rep(t1, 1000)
> benchmark(retIdxNA(tt), whichNA(tt), which2(tt), which3(tt), which4(tt),
+ replications = 1000, order = "relative")[1:4]
test replications elapsed relative
2 whichNA(tt) 1000 0.09 1.000
5 which4(tt) 1000 0.79 8.778
3 which2(tt) 1000 1.03 11.444
1 retIdxNA(tt) 1000 1.19 13.222
4 which3(tt) 1000 23.58 262.000
All of the solutions above are serial. Although not trivial, it is quite possible to take advantage of threading for implementing which. See this write up for more details. For such small sizes, though, it would do more harm than good: like taking a plane for a short distance, you lose too much time at airport security.
R implements which by allocating a buffer as large as the input, making a single pass to store the indices in that buffer, and finally copying the used part into a properly sized integer vector.
Intuitively this seems less efficient than a double pass loop, but not necessarily, as copying a data range is cheap. See more details here.
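The single-pass idea can be sketched in standard C++ as follows. This is an illustration, not R's or Rcpp's actual code: `which_nan` is a made-up name, and it treats every NaN as missing because std::isnan is true for R's NA_real_ as well as for ordinary NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One-based positions of NaN entries, in the spirit of which(is.na(x)).
std::vector<int> which_nan(const std::vector<double>& x) {
    std::vector<int> idx;
    idx.reserve(x.size());  // worst case: every element is NaN
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (std::isnan(x[i])) idx.push_back(static_cast<int>(i) + 1);
    }
    idx.shrink_to_fit();    // optional: release the unused tail
    return idx;
}
```

Reserving the worst-case size up front means push_back never has to reallocate, which is the same trade-off which4 makes compared with which3.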
Just write a function for yourself like:
which_1<-function(a,b){
return(which(a>b))
}
Then pass this function to your Rcpp code.
