Rcpp: generate multiple random observations from a custom distribution

This question is related to a previous one on calling functions within functions in Rcpp.
I need to generate a large number of random draws from a custom distribution, in a way similar to rnorm() or rbinom(), with the additional complication that my function produces a vector output.
As a solution, I thought of defining a function that generates observations from the custom distribution, and a main function that draws n times from the generating function via a for loop. Below is a much simplified working version of the code:
#include <Rcpp.h>
using namespace Rcpp;

// generating function
NumericVector gen(NumericVector A, NumericVector B){
  NumericVector out = no_init_vector(2);
  out[0] = R::runif(A[0],A[1]) + R::runif(B[0],B[1]);
  out[1] = R::runif(A[0],A[1]) - R::runif(B[0],B[1]);
  return out;
}

// [[Rcpp::export]]
// draw n observations
NumericMatrix rdraw(int n, NumericVector A, NumericVector B){
  NumericMatrix out = no_init_matrix(n, 2);
  for (int i = 0; i < n; ++i) {
    out(i,_) = gen(A, B);
  }
  return out;
}
I am looking for ways to speed up the draws. My questions are: is there any more efficient alternative to the for loop? Would parallelization help in this case?
Thank you for any help!

There are different ways to speed this up:
Use inline on gen(), reducing the number of function calls.
Use Rcpp::runif instead of a loop with R::runif to remove even more function calls.
Use a faster RNG that allows for parallel execution.
Here are points 1 and 2:
#include <Rcpp.h>
using namespace Rcpp;

// generating function
inline NumericVector gen(NumericVector A, NumericVector B){
  NumericVector out = no_init_vector(2);
  out[0] = R::runif(A[0],A[1]) + R::runif(B[0],B[1]);
  out[1] = R::runif(A[0],A[1]) - R::runif(B[0],B[1]);
  return out;
}

// [[Rcpp::export]]
// draw n observations
NumericMatrix rdraw(int n, NumericVector A, NumericVector B){
  NumericMatrix out = no_init_matrix(n, 2);
  for (int i = 0; i < n; ++i) {
    out(i,_) = gen(A, B);
  }
  return out;
}

// [[Rcpp::export]]
// draw n observations, vectorized
NumericMatrix rdraw2(int n, NumericVector A, NumericVector B){
  NumericMatrix out = no_init_matrix(n, 2);
  out(_, 0) = Rcpp::runif(n, A[0],A[1]) + Rcpp::runif(n, B[0],B[1]);
  out(_, 1) = Rcpp::runif(n, A[0],A[1]) - Rcpp::runif(n, B[0],B[1]);
  return out;
}
/*** R
set.seed(42)
system.time(rdraw(1e7, c(0,2), c(1,3)))
system.time(rdraw2(1e7, c(0,2), c(1,3)))
*/
Result:
> set.seed(42)
> system.time(rdraw(1e7, c(0,2), c(1,3)))
user system elapsed
1.576 0.034 1.610
> system.time(rdraw2(1e7, c(0,2), c(1,3)))
user system elapsed
0.458 0.139 0.598
For comparison, your original code took about 1.8 s for 10^7 draws. For point 3, I am adapting code from the parallel vignette of my dqrng package:
#include <Rcpp.h>
// [[Rcpp::depends(dqrng)]]
#include <xoshiro.h>
#include <dqrng_distribution.h>
// [[Rcpp::plugins(openmp)]]
#include <omp.h>
// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>
// [[Rcpp::plugins(cpp11)]]

// [[Rcpp::export]]
Rcpp::NumericMatrix rdraw3(int n, Rcpp::NumericVector A, Rcpp::NumericVector B, int seed, int ncores) {
  dqrng::uniform_distribution distA(A(0), A(1));
  dqrng::uniform_distribution distB(B(0), B(1));
  dqrng::xoshiro256plus rng(seed);
  Rcpp::NumericMatrix res = Rcpp::no_init_matrix(n, 2);
  RcppParallel::RMatrix<double> output(res);

  #pragma omp parallel num_threads(ncores)
  {
    dqrng::xoshiro256plus lrng(rng);      // make thread-local copy of rng
    lrng.jump(omp_get_thread_num() + 1);  // advance rng by 1 ... ncores jumps
    auto genA = std::bind(distA, std::ref(lrng));
    auto genB = std::bind(distB, std::ref(lrng));

    #pragma omp for
    for (int i = 0; i < n; ++i) {
      output(i, 0) = genA() + genB();
      output(i, 1) = genA() - genB();
    }
  }
  return res;
}
/*** R
system.time(rdraw3(1e7, c(0,2), c(1,3), 42, 2))
*/
Result:
> system.time(rdraw3(1e7, c(0,2), c(1,3), 42, 2))
user system elapsed
0.276 0.025 0.151
So with a faster RNG and moderate parallelism, we can gain an order of magnitude in execution time. The results will be different, of course, but summary statistics should be the same.
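For completeness, here is a single-threaded variant of the same idea (my own sketch, reusing the dqrng headers and the logic of rdraw3 above; rdraw4 is a made-up name). It isolates the gain from the faster RNG from the gain from OpenMP, which can be useful on machines where parallel execution is not an option.
#include <Rcpp.h>
// [[Rcpp::depends(dqrng)]]
#include <xoshiro.h>
#include <dqrng_distribution.h>
// [[Rcpp::plugins(cpp11)]]

// [[Rcpp::export]]
Rcpp::NumericMatrix rdraw4(int n, Rcpp::NumericVector A, Rcpp::NumericVector B, int seed) {
  dqrng::uniform_distribution distA(A(0), A(1));
  dqrng::uniform_distribution distB(B(0), B(1));
  dqrng::xoshiro256plus rng(seed);
  Rcpp::NumericMatrix out = Rcpp::no_init_matrix(n, 2);
  for (int i = 0; i < n; ++i) {
    // four draws per row, as in gen() above
    double a1 = distA(rng), b1 = distB(rng);
    double a2 = distA(rng), b2 = distB(rng);
    out(i, 0) = a1 + b1;
    out(i, 1) = a2 - b2;
  }
  return out;
}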

Related

"inner_product" was not declared in this scope

Hi, I am new to Rcpp. I am computing the inner product of two variables, but I get the error "inner_product was not declared in this scope" for the following code:
#include <math.h>
#include <RcppCommon.h>
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector polynomial_kernel(NumericVector x, NumericMatrix Y,
                                double scale = 1, double offset = 1, int d = 1){
  int n = Y.nrow();
  NumericVector kernel(n);
  for (int j = 0; j < n; j++){
    NumericVector v = Y( j,_ );
    double crossProd = innerProduct(x,v);
    kernel[j] = pow((scale*crossProd+offset),2);
  }
  return kernel;
}
Please help me to resolve this problem.
Below is a simpler, repaired version of your code that actually compiles. It uses Armadillo types for consistency and, instead of calling a non-existent inner_product routine, computes the inner product of two vectors the standard way via matrix multiplication.
#include <RcppArmadillo.h>   // also pulls in Rcpp.h and cmath
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::vec polynomial_kernel(arma::vec x, arma::mat Y,
                            double scale = 1, double offset = 1, int d = 1) {
  int n = Y.n_rows;
  arma::vec kernel(n);
  for (int j = 0; j < n; j++){
    arma::rowvec v = Y.row(j);
    double crossProd = arma::as_scalar(v * x);
    kernel[j] = std::pow((scale*crossProd+offset), 2);
  }
  return kernel;
}
Your example was not a minimal, complete, verifiable example, so I cannot demonstrate it on data you could have supplied. On some made-up data it seems to work:
R> set.seed(123)
R> polynomial_kernel(runif(4), matrix(rnorm(16),4))
[,1]
[1,] 3.317483
[2,] 3.055690
[3,] 1.208345
[4,] 0.301834
R>
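As an aside, the name in the error message does exist in the standard library: std::inner_product from <numeric>. If you prefer to stay with plain Rcpp types, a sketch along these lines should also work (polynomial_kernel_rcpp is a name I made up to avoid clashing with the Armadillo version above; note it uses the exponent d, whereas the posted code hard-coded 2):
#include <Rcpp.h>
#include <numeric>   // std::inner_product
#include <cmath>     // std::pow
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector polynomial_kernel_rcpp(NumericVector x, NumericMatrix Y,
                                     double scale = 1, double offset = 1, int d = 1) {
  int n = Y.nrow();
  NumericVector kernel(n);
  for (int j = 0; j < n; j++) {
    NumericVector v = Y(j, _);   // copy of row j
    double crossProd = std::inner_product(x.begin(), x.end(), v.begin(), 0.0);
    kernel[j] = std::pow(scale * crossProd + offset, d);
  }
  return kernel;
}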

Rcpp implementation of mvtnorm::pmvnorm slower than original R function

I am trying to get an Rcpp version of pmvnorm to work at least as fast as mvtnorm::pmvnorm in R.
I have found https://github.com/zhanxw/libMvtnorm and created an Rcpp skeleton package with the relevant source files. I have added the following functions, which make use of Armadillo (since I'm using it across other code I've been writing).
//[[Rcpp::export]]
arma::vec triangl(const arma::mat& X){
  arma::mat LL = arma::trimatl(X, -1); // omit the main diagonal
  return LL.elem(arma::find(LL != 0));
}

//[[Rcpp::export]]
double pmvnorm_cpp(arma::vec& bound, arma::vec& lowtrivec){
  double error;
  int n = bound.n_elem;
  double* boundptr = bound.memptr();
  double* lowtrivecptr = lowtrivec.memptr();
  double result = pmvnorm_P(n, boundptr, lowtrivecptr, &error);
  return result;
}
From R after building the package, this is a reproducible example:
set.seed(1)
covar <- rWishart(1, 10, diag(5))[,,1]
sds <- diag(covar) ^-.5
corrmat <- diag(sds) %*% covar %*% diag(sds)
triang <- triangl(corrmat)
bounds <- c(0.5, 0.9, 1, 4, -1)
rbenchmark::benchmark(pmvnorm_cpp(bounds, triang),
                      mvtnorm::pmvnorm(upper=bounds, corr = corrmat),
                      replications=1000)
This shows that pmvnorm_cpp is much slower than mvtnorm::pmvnorm, and the result is different:
> pmvnorm_cpp(bounds, triang)
[1] 0.04300643
> mvtnorm::pmvnorm(upper=bounds, corr = corrmat)
[1] 0.04895361
This puzzles me because I thought the underlying Fortran code was the same. Is there something in my code that makes everything go slow? Or should I try to port the mvtnorm::pmvnorm code directly? I have literally no experience with Fortran.
Suggestions appreciated; excuse my incompetence otherwise.
EDIT: to make a quick comparison with an alternative, this:
//[[Rcpp::export]]
NumericVector pmvnorm_cpp(NumericVector bound, NumericMatrix cormat){
  Environment stats("package:mvtnorm");
  Function f = stats["pmvnorm"];
  NumericVector lower(bound.length(), R_NegInf);
  NumericVector mean(bound.length());
  NumericVector res = f(lower, bound, mean, cormat);
  return res;
}
has essentially the same performance as an R call (the following on a 40-dimensional mvnormal):
> rbenchmark::benchmark(pmvnorm_cpp(bounds, corrmat),
+ mvtnorm::pmvnorm(upper=bounds, corr = corrmat),
+ replications=100)
                                              test replications elapsed relative user.self sys.self
2 mvtnorm::pmvnorm(upper = bounds, corr = corrmat)          100   16.86    1.032     16.60     0.00
1                     pmvnorm_cpp(bounds, corrmat)          100   16.34    1.000     16.26     0.01
So it seems to me there must be something going on in the previous code, either with how I'm handling things with Armadillo or with how the other pieces are connected. I would assume there should be a performance gain compared to this last implementation.
Instead of trying to use an additional library for this, I would try to use the C API exported by mvtnorm, cf. https://github.com/cran/mvtnorm/blob/master/inst/NEWS#L44-L48. While doing so, I found three reasons why the results differ. One of them is also responsible for the performance difference:
mvtnorm uses R's RNG, while this has been removed from the library you are using, cf. https://github.com/zhanxw/libMvtnorm/blob/master/libMvtnorm/randomF77.c.
Your triangl function is incorrect. It returns the lower triangular matrix in column-major order, but the underlying Fortran code expects it in row-major order, cf. https://github.com/cran/mvtnorm/blob/master/src/mvt.f#L36-L39 and https://github.com/zhanxw/libMvtnorm/blob/master/libMvtnorm/mvtnorm.cpp#L60.
libMvtnorm uses 1e-6 instead of 1e-3 as relative precision, cf. https://github.com/zhanxw/libMvtnorm/blob/master/libMvtnorm/mvtnorm.cpp#L65. This is also responsible for the performance difference.
We can test this using the following code:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::depends(mvtnorm)]]
#include <mvtnormAPI.h>

//[[Rcpp::export]]
arma::vec triangl(const arma::mat& X){
  int n = X.n_cols;
  arma::vec res(n * (n-1) / 2);
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < i; ++j) {
      res(j + i * (i-1) / 2) = X(i, j);
    }
  }
  return res;
}

// [[Rcpp::export]]
double pmvnorm_cpp(arma::vec& bound,
                   arma::vec& lowertrivec,
                   double abseps = 1e-3){
  int n = bound.n_elem;
  int nu = 0;
  int maxpts = 25000;   // default in mvtnorm: 25000
  double releps = 0;    // default in mvtnorm: 0
  int rnd = 1;          // Get/PutRNGstate

  double* bound_ = bound.memptr();
  double* correlationMatrix = lowertrivec.memptr();
  double* lower = new double[n];
  int* infin = new int[n];
  double* delta = new double[n];
  for (int i = 0; i < n; ++i) {
    infin[i] = 0;       // (-inf, bound]
    lower[i] = 0.0;
    delta[i] = 0.0;
  }

  // return values
  double error;
  double value;
  int inform;

  mvtnorm_C_mvtdst(&n, &nu, lower, bound_,
                   infin, correlationMatrix, delta,
                   &maxpts, &abseps, &releps,
                   &error, &value, &inform, &rnd);
  delete[] (lower);
  delete[] (infin);
  delete[] (delta);

  return value;
}
/*** R
set.seed(1)
covar <- rWishart(1, 10, diag(5))[,,1]
sds <- diag(covar) ^-.5
corrmat <- diag(sds) %*% covar %*% diag(sds)
triang <- triangl(corrmat)
bounds <- c(0.5, 0.9, 1, 4, -1)
set.seed(1)
system.time(cat(mvtnorm::pmvnorm(upper=bounds, corr = corrmat), "\n"))
set.seed(1)
system.time(cat(pmvnorm_cpp(bounds, triang, 1e-6), "\n"))
set.seed(1)
system.time(cat(pmvnorm_cpp(bounds, triang, 0.001), "\n"))
*/
Results:
> system.time(cat(mvtnorm::pmvnorm(upper=bounds, corr = corrmat), "\n"))
0.04896221
user system elapsed
0.000 0.003 0.003
> system.time(cat(pmvnorm_cpp(bounds, triang, 1e-6), "\n"))
0.04895756
user system elapsed
0.035 0.000 0.035
> system.time(cat(pmvnorm_cpp(bounds, triang, 0.001), "\n"))
0.04896221
user system elapsed
0.004 0.000 0.004
With the same RNG (and RNG state), the correct lower triangular correlation matrix and the same relative precision, results are identical and performance is comparable. With higher precision, performance suffers.
All this is for a stand-alone file using Rcpp::sourceCpp. In order to use this in a package, you need to add LinkingTo: mvtnorm to your DESCRIPTION file.
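As a minor variation on the wrapper above (my own sketch, not part of the original answer; pmvnorm_cpp_raii is a made-up name), the scratch buffers can be held in std::vector instead of raw new/delete, so they are released automatically even if something throws; the call to mvtnorm_C_mvtdst is unchanged:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::depends(mvtnorm)]]
#include <mvtnormAPI.h>
#include <vector>

// [[Rcpp::export]]
double pmvnorm_cpp_raii(arma::vec& bound, arma::vec& lowertrivec, double abseps = 1e-3) {
  int n = bound.n_elem;
  int nu = 0;
  int maxpts = 25000;                     // default in mvtnorm
  double releps = 0;
  int rnd = 1;                            // use R's RNG (Get/PutRNGstate)

  // scratch buffers managed by std::vector, no manual delete[] required
  std::vector<double> lower(n, 0.0), delta(n, 0.0);
  std::vector<int> infin(n, 0);           // 0 => interval (-inf, bound]

  double error, value;
  int inform;
  mvtnorm_C_mvtdst(&n, &nu, lower.data(), bound.memptr(),
                   infin.data(), lowertrivec.memptr(), delta.data(),
                   &maxpts, &abseps, &releps,
                   &error, &value, &inform, &rnd);
  return value;
}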

Segmentation fault when using Rcpp/Armadillo and OpenMP in parallel with a user-defined function

I was trying to use Rcpp/Armadillo with OpenMP to speed up a loop in R. The loop takes as input a matrix in which each row contains indices into a location vector (or matrix, for 2D locations), along with some other matrices and vectors. Inside the loop I extract each row of the index matrix, find the corresponding locations, calculate the distance matrix and the covariance matrix, do a Cholesky decomposition and a backsolve, and save the backsolve results to a new matrix. Here is the Rcpp code:
#include <iostream>
#include <RcppArmadillo.h>
#include <omp.h>
#include <Rcpp.h>
// [[Rcpp::plugins(openmp)]]
using namespace Rcpp;
using namespace arma;
using namespace std;
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
mat NZentries_new2(int m, int nnp, const mat& locs, const umat& revNNarray,
                   const mat& revCondOnLatent, const vec& nuggets, const vec covparms){
  // initialize the output matrix
  mat Lentries = zeros(nnp, m+1);
  // initialize objects used in the parallel part
  int n0;           // number of !is_na elements
  uvec inds;
  vec revCon_row;
  uvec inds00;
  vec nug;
  mat covmat;
  vec onevec;
  vec M;
  mat dist;
  int k;

  omp_set_num_threads(2); // selects the number of cores to use
  #pragma omp parallel for shared(locs,revNNarray,revCondOnLatent,nuggets,nnp,m,Lentries) private(k,M,dist,onevec,covmat,nug,n0,inds,revCon_row,inds00) default(none) schedule(static)
  for (k = 0; k < nnp; k++) {
    // extract a row to work with
    inds = revNNarray.row(k).t();
    revCon_row = revCondOnLatent.row(k).t();
    if (k < m){
      n0 = k+1;
    } else {
      n0 = m+1;
    }
    // extract locations
    inds00 = inds(span(m+1-n0, m)) - ones<uvec>(n0);
    nug = nuggets.elem(inds00) % (ones(n0) - revCon_row(span(m+1-n0, m))); // vec is vec, cannot convert to mat
    dist = calcPWD2(locs.rows(inds00));
    #pragma omp critical
    {
      // calculate covariance matrix
      covmat = MaternFun(dist, covparms) + diagmat(nug); // summation from arma
    }
    // get last row of inverse Cholesky
    onevec = zeros(n0);
    onevec[n0-1] = 1;
    M = solve(chol(covmat, "upper"), onevec);
    // save the entries to matrix
    Lentries(k, span(0, n0-1)) = M.t();
  }
  return Lentries;
}
The current version works, but it is slow (almost the same as the non-parallel version). If I take the line inside the omp critical block out of that block, it causes a segmentation fault and R crashes. MaternFun is a function I defined below, along with several other small functions. So my question is: why does MaternFun have to stay in the critical section?
// [[Rcpp::export]]
mat MaternFun( mat distmat, vec covparms ){
  int d1 = distmat.n_rows;
  int d2 = distmat.n_cols;
  int j1;
  int j2;
  mat covmat(d1,d2);
  double scaledist;
  double normcon = covparms(0)/(pow(2.0,covparms(2)-1)*Rf_gammafn(covparms(2)));
  for (j1 = 0; j1 < d1; j1++){
    for (j2 = 0; j2 < d2; j2++){
      if ( distmat(j1,j2) == 0 ){
        covmat(j1,j2) = covparms(0);
      } else {
        scaledist = distmat(j1,j2)/covparms(1);
        covmat(j1,j2) = normcon*pow( scaledist, covparms(2) )*
          Rf_bessel_k(scaledist,covparms(2),1.0);
      }
    }
  }
  return covmat;
}

// [[Rcpp::export]]
double dist2(double lat1, double long1, double lat2, double long2) {
  double dist = sqrt(pow(lat1 - lat2, 2) + pow(long1 - long2, 2)) ;
  return (dist) ;
}

// [[Rcpp::export]]
mat calcPWD2( mat x) { // Rcpp::NumericMatrix
  int outrows = x.n_rows ;
  int outcols = x.n_rows ;
  mat out(outrows, outcols) ;
  for (int arow = 0 ; arow < outrows ; arow++) {
    for (int acol = 0 ; acol < outcols ; acol++) {
      out(arow, acol) = dist2(x(arow, 0), x(arow, 1),
                              x(acol, 0), x(acol, 1)) ; // extract element from mat
    }
  }
  return (out) ;
}
Here are some sample inputs for testing MaternFun in R:
library(fields)
distmat=rdist(1:5) # distance matrix
covparms=c(1,0.2,1.5)
The issue is that there are two calls to R math functions (Rf_bessel_k and Rf_gammafn) that require access to be single-threaded instead of parallel.
To get around this, let's add a dependency on boost via BH to obtain the cyl_bessel_k and tgamma functions. Alternatively, there is always the option of reimplementing R's besselK and gamma in C++ so they do not rely on the single-threaded R variants.
This gives:
#include <Rcpp.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>
// [[Rcpp::depends(BH)]]

// [[Rcpp::export]]
double besselK_boost(double x, double v) {
  return boost::math::cyl_bessel_k(v, x);
}

// [[Rcpp::export]]
double gamma_fn_boost(double x) {
  return boost::math::tgamma(x);
}
Test Code
x0 = 9.536743e-07
nu = -10
all.equal(besselK(x0, nu), besselK_boost(x0, nu))
# [1] TRUE
x = 2
all.equal(gamma(x), gamma_fn_boost(x))
# [1] TRUE
Note: The order of parameters for boost's variant differs from R's:
cyl_bessel_k(v, x)
Rf_bessel_k(x, v, expon.scaled = FALSE)
From here, we can modify MaternFun. Unfortunately, because calcPWD2 is missing, the furthest we can go is switching to boost and incorporating OpenMP protections.
#include <RcppArmadillo.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>

#ifdef _OPENMP
#include <omp.h>
#endif

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(openmp)]]

// [[Rcpp::export]]
arma::mat MaternFun(arma::mat distmat, arma::vec covparms) {
  int d1 = distmat.n_rows;
  int d2 = distmat.n_cols;
  int j1;
  int j2;
  arma::mat covmat(d1,d2);
  double scaledist;
  double normcon = covparms(0) /
    (pow(2.0, covparms(2) - 1) * boost::math::tgamma(covparms(2)));
  for (j1 = 0; j1 < d1; ++j1){
    for (j2 = 0; j2 < d2; ++j2){
      if ( distmat(j1, j2) == 0 ){
        covmat(j1, j2) = covparms(0);
      } else {
        scaledist = distmat(j1, j2)/covparms(1);
        covmat(j1, j2) = normcon * pow( scaledist, covparms(2) ) *
          boost::math::cyl_bessel_k(covparms(2), scaledist);
      }
    }
  }
  return covmat;
}
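To illustrate why this matters for the original segfault (my own sketch, not from the answer; besselK_parallel is a made-up name): the boost call can sit inside an OpenMP loop without a critical section, because unlike Rf_bessel_k it does not touch R's internal state. The output vector is allocated before the parallel region and each thread writes only its own elements.
#include <Rcpp.h>
#include <boost/math/special_functions/bessel.hpp>
#ifdef _OPENMP
#include <omp.h>
#endif
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(openmp)]]

// [[Rcpp::export]]
Rcpp::NumericVector besselK_parallel(Rcpp::NumericVector x, double nu, int threads = 2) {
  int n = x.size();
  Rcpp::NumericVector out(n);            // allocated before the parallel region
  #pragma omp parallel for num_threads(threads)
  for (int i = 0; i < n; ++i) {
    // no critical section needed: boost does not call into R's math functions
    out[i] = boost::math::cyl_bessel_k(nu, x[i]);
  }
  return out;
}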

Why is R's mclapply more efficient than Rcpp + OpenMP?

I have a function with a loop (EstimateUniques) that is parallelized with OpenMP. I expected multithreading to be more efficient than multiprocessing, but when I compared this function with a simple run of mclapply, it showed lower performance. What is the proper way to achieve the same level of parallelization in C++ as in R? Am I doing something wrong?
Performance comparison (time in seconds):
#Cores  CPP     R
1       1.721s  1.538s
2       1.945s  1.080s
3       2.858s  0.801s
R code:
Rcpp::sourceCpp('ReproducibleExample.cpp')
arr <- 1:10000
n_rep <- 150
n_iters <- 200

EstimateUniquesR <- function(arr, n_iters, n_rep, cores) {
  parallel::mclapply(1:n_iters, function(i)
    GetNumberOfUniqSamples(arr, i * 10, n_rep), mc.cores=cores)
}

cpp_times <- sapply(1:3, function(threads)
  system.time(EstimateUniques(arr, n_iters, n_rep, threads))['elapsed'])
r_times <- sapply(1:3, function(cores)
  system.time(EstimateUniquesR(arr, n_iters, n_rep, cores))['elapsed'])
data.frame(CPP=cpp_times, R=r_times)
ReproducibleExample.cpp file:
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
#include <algorithm>
#include <cmath>     // std::round
#include <cstdlib>   // rand
#include <vector>
#include <omp.h>

// [[Rcpp::export]]
int GetNumberOfUniqSamples(const std::vector<int> &bs_array, int size, unsigned n_rep) {
  unsigned long sum = 0;
  for (unsigned i = 0; i < n_rep; ++i) {
    std::vector<int> uniq_vals(size);
    for (int try_num = 0; try_num < size; ++try_num) {
      uniq_vals[try_num] = bs_array[rand() % bs_array.size()];
    }
    std::sort(uniq_vals.begin(), uniq_vals.end());
    sum += std::distance(uniq_vals.begin(), std::unique(uniq_vals.begin(), uniq_vals.end()));
  }
  return std::round(double(sum) / n_rep);
}

// [[Rcpp::export]]
std::vector<int> EstimateUniques(const std::vector<int> &bs_array, const int n_iters,
                                 const int n_rep = 1000, const int threads = 1) {
  std::vector<int> uniq_counts(n_iters);
  #pragma omp parallel for num_threads(threads) schedule(dynamic)
  for (int i = 0; i < n_iters; ++i) {
    uniq_counts[i] = GetNumberOfUniqSamples(bs_array, (i + 1) * 10, n_rep);
  }
  return uniq_counts;
}
I tried to use other types of scheduling in OpenMP, but they gave even worse results.

Rcpp updating data in base environment

I am working on a simple matching algorithm in Rcpp, which takes a number of individuals (I), a number of schools (J), a number of submitted choices (nc), a priority ranking of individuals (pos), the number of vacancies (emp), and the true choices.
#include <Rcpp.h>
using namespace Rcpp;

//[[Rcpp::export]]
NumericVector gs2(int I, int J, int nc, NumericVector pos, NumericVector emp, NumericMatrix choices) {
  NumericVector admits(J);
  NumericVector out(I);
  std::fill(out.begin(), out.end(), J+1);
  for (int i=0; i<I; i++){
    NumericVector apply = choices(pos(i),_)-1;
    for (int j=0; j<nc; j++){
      if (emp(apply(j))>0)
      {
        out(pos(i)) = apply(j)+1;
        admits(apply(j)) = admits(apply(j)) + 1;
        emp(apply(j)) = emp(apply(j)) - 1;
        break;
      }
    }
  }
  return out;
}
The code works fine, except that it looks like it is messing with my data: after running the code, my size variable has been changed. Am I missing something? Thanks.
set.seed(123)
rank = (1:20)-1
stuchoice = matrix(sample(1:3,6*20,replace=T),byrow=T,ncol=6,nrow=20)
size = c(7,11,4)
gs2(20,3,6,rank,size,stuchoice)
size
Your size variable is changing because you are changing it in your C++ code. In particular this line:
emp(apply(j)) = emp(apply(j)) - 1;
Rcpp passes variables by reference, so anything you do to them inside the function will be reflected in your top-level R variables. If you want to avoid this, you need to clone your variable. Changing your code to the following corrects the problem.
#include <Rcpp.h>
using namespace Rcpp;

// Note the change in the name of 'emp' to 'emp_'!!!
//[[Rcpp::export]]
NumericVector gs2(int I, int J, int nc, NumericVector pos, NumericVector emp_, NumericMatrix choices) {
  NumericVector admits(J);
  NumericVector out(I);

  // clone your emp
  NumericVector emp = clone(emp_);

  std::fill(out.begin(), out.end(), J+1);
  for (int i=0; i<I; i++){
    NumericVector apply = choices(pos(i),_)-1;
    for (int j=0; j<nc; j++){
      if (emp(apply(j))>0)
      {
        out(pos(i)) = apply(j)+1;
        admits(apply(j)) = admits(apply(j)) + 1;
        emp(apply(j)) = emp(apply(j)) - 1;
        break;
      }
    }
  }
  return out;
}
Test
library(Rcpp)
sourceCpp("test.cpp")
set.seed(123)
rank = (1:20)-1
stuchoice = matrix(sample(1:3,6*20,replace=T),byrow=T,ncol=6,nrow=20)
size = c(7,11,4)
gs2(20,3,6,rank,size,stuchoice)
size
[1] 7 11 4
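To isolate the behaviour described above (a minimal sketch of my own, not part of the answer; the function names are made up): modifying a NumericVector argument writes through to the caller's object when no type coercion occurs, while clone() works on an independent copy.
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void decrement_in_place(NumericVector x) {
  x[0] = x[0] - 1;            // writes through to the R vector passed in
}

// [[Rcpp::export]]
void decrement_copy(NumericVector x) {
  NumericVector y = clone(x); // independent copy
  y[0] = y[0] - 1;            // caller's vector is untouched
}
Called from R on a double vector such as size <- c(7, 11, 4), decrement_in_place(size) changes size, while decrement_copy(size) leaves it as-is; an integer vector would be coerced (and therefore copied) on the way in and would not show the effect.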
