Extract diagonal matrix from a given sparse matrix

Using RcppEigen I want to extract only the diagonal of a sparse matrix, as a sparse matrix. It seemed easy enough; below you will find my attempts, and none delivers my desired result. Mind you, attempt 5 doesn't even compile. Here are some resources I used: the Rcpp Gallery, the KDE Forum (and, in the same post, KDE Forum (2)), the Eigen Sparse Tutorial, and SO. I feel like I am close... maybe not... I will let the experts decide.
// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>
#include <Eigen/SparseCore>

// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat1(Eigen::Map<Eigen::SparseMatrix<double> > &X){
  // cannot access diagonal of mapped sparse matrix
  const int n(X.rows());
  Eigen::VectorXd dii(n);
  for (int i = 0; i < n; ++i) {
    dii[i] = X.coeff(i,i);
  }
  Eigen::SparseMatrix<double> ans(dii.asDiagonal());
  return ans;
}
// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat2(Eigen::SparseMatrix<double> &X){
  Eigen::SparseVector<double> dii(X.diagonal().sparseView());
  Eigen::SparseMatrix<double> ans(dii);
  return ans;
}
// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat3(Eigen::SparseMatrix<double> &X){
  Eigen::VectorXd dii(X.diagonal());
  Eigen::SparseMatrix<double> ans(dii.asDiagonal());
  ans.pruned(); // hoping this helps -- but note pruned() returns a pruned
                // *expression* and leaves ans itself unchanged, so this
                // statement has no effect
  return ans;
}
// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat4(Eigen::SparseMatrix<double> &X){
  Eigen::SparseMatrix<double> ans(X.diagonal().asDiagonal());
  return ans;
}
// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat5(Eigen::SparseMatrix<double> &X){
  struct keep_diag {
    inline bool operator() (const int& row, const int& col, const double&) const
    { return row==col; }
  };
  Eigen::SparseMatrix<double> ans(X.prune(keep_diag()));
  return ans;
}
/*** R
library(Matrix)
set.seed(42)
nc <- nr <- 5
m <- rsparsematrix(nr, nc, nnz = 10)
diag_mat1(m)
diag_mat2(m)
diag_mat3(m)
diag_mat4(m)
*/
EDIT: Added the results that each attempt gives:
> diag_mat1(m)
5 x 5 sparse Matrix of class "dgCMatrix"
[1,] 0 . . . .
[2,] . -0.095 . . .
[3,] . . 0 . .
[4,] . . . 2 .
[5,] . . . . 1.5
> diag_mat2(m)
5 x 1 sparse Matrix of class "dgCMatrix"
[1,] .
[2,] -0.095
[3,] .
[4,] 2.000
[5,] 1.500
> diag_mat3(m)
5 x 5 sparse Matrix of class "dgCMatrix"
[1,] 0 . . . .
[2,] . -0.095 . . .
[3,] . . 0 . .
[4,] . . . 2 .
[5,] . . . . 1.5
> diag_mat4(m)
5 x 5 sparse Matrix of class "dgCMatrix"
[1,] 0 . . . .
[2,] . -0.095 . . .
[3,] . . 0 . .
[4,] . . . 2 .
[5,] . . . . 1.5
EDIT2: Added desired output. Note how it differs from the results above only in that the zero diagonal entries are not stored (printed as . rather than 0):
5 x 5 sparse Matrix of class "dgCMatrix"
[1,] . . . . .
[2,] . -0.095 . . .
[3,] . . . . .
[4,] . . . 2 .
[5,] . . . . 1.5
Answer, with inspiration thanks to Aleh:
// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat6(Eigen::Map<Eigen::SparseMatrix<double> > &X){
  const int n(X.rows());
  Eigen::SparseMatrix<double> dii(n, n);
  for (int i = 0; i < n; ++i) {
    if (X.coeff(i,i) != 0.0) dii.insert(i, i) = X.coeff(i,i);
  }
  dii.makeCompressed();
  return dii;
}
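For larger matrices it may also be worth avoiding the two X.coeff(i,i) lookups per row (each is a binary search within the stored entries of a column) and instead walking only the entries that are actually stored. A minimal sketch of that idea (diag_mat7 is my name for it; it assumes the usual compressed column-major storage):

// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>

// [[Rcpp::export]]
Eigen::SparseMatrix<double> diag_mat7(Eigen::Map<Eigen::SparseMatrix<double> > &X){
  const int n(X.rows());
  Eigen::SparseMatrix<double> dii(n, n);
  dii.reserve(Eigen::VectorXi::Constant(n, 1)); // at most one entry per column
  // visit only the stored entries, column by column, keeping those on the diagonal
  for (int k = 0; k < X.outerSize(); ++k) {
    for (Eigen::Map<Eigen::SparseMatrix<double> >::InnerIterator it(X, k); it; ++it) {
      if (it.row() == it.col()) dii.insert(it.row(), it.col()) = it.value();
    }
  }
  dii.makeCompressed();
  return dii;
}

Diagonal positions that are not stored in X are simply never visited, so no explicit zeros are created and no pruning step is needed.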

I prefer RcppArmadillo because it generally behaves more like R than RcppEigen does.
For your problem, with RcppArmadillo, you can do:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
arma::sp_mat extractDiag(const arma::sp_mat& x) {
  int n = x.n_rows;
  arma::sp_mat res(n, n);
  for (int i = 0; i < n; i++)
    res(i, i) = x(i, i);
  return res;
}
As suggested by @mtall, you can simply use:
// [[Rcpp::export]]
arma::sp_mat extractDiag3(const arma::sp_mat& x) {
  return arma::diagmat(x);
}
If you really want to do this in Eigen, from the documentation, I came up with:
// [[Rcpp::export]]
Eigen::SparseMatrix<double> extractDiag2(Eigen::Map<Eigen::SparseMatrix<double> > &X){
  int n = X.rows();
  Eigen::SparseMatrix<double> res(n, n);
  double d;
  typedef Eigen::Triplet<double> T;
  std::vector<T> tripletList;
  tripletList.reserve(n);
  for (int i = 0; i < n; i++) {
    d = X.coeff(i, i);
    if (d != 0) tripletList.push_back(T(i, i, d));
  }
  res.setFromTriplets(tripletList.begin(), tripletList.end());
  return res;
}
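For completeness, the question's own test harness can be reused to compare the three functions (a sketch, assuming the definitions above sit in a single sourceCpp file):

/*** R
library(Matrix)
set.seed(42)
m <- rsparsematrix(5, 5, nnz = 10)
extractDiag(m)
extractDiag2(m)
extractDiag3(m)
*/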

I think you just need to skip the zero elements along the diagonal:
for (int i = 0; i < n; ++i) {
  if (X.coeff(i,i) != 0.0)
    dii[i] = X.coeff(i,i);
}

Related

Is there a simple way to generate a multidimensional array in Rcpp and export that array to R

I'm looking for a simple way to build a 3D array using Rcpp and make that array accessible in R. I'm still very new to Rcpp and C++ coding in general, so any help would be greatly appreciated.
Here is my source code:
#include <Rcpp.h>
using namespace Rcpp;

int d[5][5][5] = {0};

// [[Rcpp::export]]
int fit(){
  for (int X = 0; X < 5; X++){
    for (int Y = 0; Y < 5; Y++){
      for (int Z = 0; Z < 5; Z++){
        d[X][Y][Z] = X + Y + Z;
      }
    }
  }
  return d; // note: returning a 3-D array as int does not compile
}
The idea being that I could assign fit() to an object in R (say D), and be able to read from locations in that array.
I realize I could just make fit a function of x, y, and z; but for my purposes, having the array in R works much faster.
Again, any help would be super appreciated.
Edit
Thank you for the help, Dirk.
One thing I'm still struggling with is running the loops. When I run this loop:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector rcpp_matrix(){
  IntegerVector v = IntegerVector(Dimension(2,2));
  for (int i = 0; i < 2; i++){
    for (int j = 0; j < 2; j++){
      v(i,j) = (i + 1) * (j + 1);
    }
  }
  // Return the vector to R
  return v;
}
It works as intended. However, when I try to span out to three dimensions, like so:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector rcpp_matrix(){
  IntegerVector v = IntegerVector(Dimension(2,2,2));
  for (int i = 0; i < 2; i++){
    for (int j = 0; j < 2; j++){
      for (int k = 0; k < 2; k++){
        v(i,j,k) = (i + 1) * (j + 1) * (k + 1);
      }
    }
  }
  // Return the vector to R
  return v;
}
I get errors:
file10c2d06d0b.cpp: In function ‘Rcpp::IntegerVector rcpp_matrix()’:
file10c2d06d0b.cpp:14:16: error: no match for call to ‘(Rcpp::IntegerVector {aka Rcpp::Vector<13>}) (int&, int&, int&)’
14 | v(i,j,k) = (i + 1) * (j + 1) * (k + 1);
| ^
Is there some other aspect I'm missing?
Thank you again.
Vectors are key in R, and a matrix is just a vector with a 2-d dimension attribute:
> v <- 1:12
> dim(v) <- c(3,4)
> v
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
>
So we can also do 3-d:
> dim(v) <- c(2,3,2)
> v
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

>
and so on. And that is the exact same thing in Rcpp, as seen e.g. in this snippet from a unit test:
// [[Rcpp::export]]
IntegerVector integer_dimension_ctor_3(){
  return IntegerVector( Dimension( 2, 3, 4) ) ;
}
which we can test quickly:
> Rcpp::cppFunction("IntegerVector ivec3() { return IntegerVector(Dimension(2, 3, 4));}")
> ivec3()
, , 1

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0

, , 2

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0

, , 3

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0

, , 4

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0

>
All that said, I recommend you look into Armadillo 'Cubes' and what you can do from RcppArmadillo -- likely more fully featured.
Edit: Here is a much simpler Armadillo variant of your approach, using a cube<int> aka icube:
Code
#include <RcppArmadillo/Lightest>   // new 'lighter' header

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::icube make3dvec(){
  arma::icube v(2,2,2);
  for (int x = 0; x < 2; x++){
    for (int y = 0; y < 2; y++){
      for (int z = 0; z < 2; z++){
        v(x,y,z) = (x + 1) * (y + 1) * (z + 1);
      }
    }
  }
  return v;
}

/*** R
make3dvec()
*/
Output
> Rcpp::sourceCpp("~/git/stackoverflow/75036466/answer.cpp")
> make3dvec()
, , 1

     [,1] [,2]
[1,]    1    2
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    2    4
[2,]    4    8

>
Okay. I found a solution, albeit a little convoluted, to the specific issue I was having.
sourceCpp(code = "
  #include <Rcpp.h>
  using namespace Rcpp;

  // [[Rcpp::export]]
  IntegerVector fit(){
    int d[2][2][2] = {0};
    IntegerVector V = IntegerVector(Dimension(8));
    int Count = 0;
    for (int X = 0; X < 2; X++){
      for (int Y = 0; Y < 2; Y++){
        for (int Z = 0; Z < 2; Z++){
          d[X][Y][Z] = (X + 1) * (Y + 1) * (Z + 1);
          V(Count) = d[X][Y][Z];
          Count = Count + 1;
        }
      }
    }
    return V;
  }
")
This creates a vector in fit(), which I can then use in R to get the array I want, vis-à-vis:
array(fit(), dim = c(2,2,2))
Thank you again for the help. This is speeding up my original model quite a bit.
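As a side note, since IntegerVector(Dimension(2,2,2)) already carries the dim attribute, one could also fill it directly by computing the column-major linear index and skip the array() call on the R side entirely. A minimal sketch of that idea (fit_dim is a hypothetical name):

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector fit_dim(){
  IntegerVector v(Dimension(2, 2, 2));
  // R arrays are column-major, so element (i, j, k) of a 2 x 2 x 2 array
  // sits at linear index i + 2*j + 4*k in the underlying vector
  for (int k = 0; k < 2; k++)
    for (int j = 0; j < 2; j++)
      for (int i = 0; i < 2; i++)
        v[i + 2 * j + 4 * k] = (i + 1) * (j + 1) * (k + 1);
  return v; // already arrives in R with dim = c(2, 2, 2)
}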

Rcpp returns the same matrix

My Rcpp function returns the same matrix. Inside this function I change some of the studyClones numbers, but when I take the result, I get a matrix identical to studyClones. What am I doing wrong?
Rcpp code:
NumericMatrix myFunction(NumericMatrix study, NumericMatrix numMatrix, double coef){
  int ind = 0;
  int sizeImage = study.rows();
  NumericVector randomNumbers;
  for(int i = 0; i < numMatrix.rows(); i++){
    for(int j = ind; j < (numMatrix(i,0) + ind); j++){
      randomNumbers = sample(sizeImage, ceil(numMatrix(i,0) * coef), false);
      for(int k = 0; k < randomNumbers.length(); k++){
        if(study(randomNumbers[k],j) == 1){
          study(randomNumbers[k],j) = 0;
        }else{
          study(randomNumbers[k],j) = 1;
        }
      }
    }
    ind += numMatrix(i,0);
  }
  return study;
}
R code:
result <- myFunction(studyMatrix, numericMatrix, coefficienM)
all(result==studyMatrix)
[1] TRUE
What you did wrong is that you missed that study is (roughly) a pointer to the original R data. When you modify study at the C++ level you modify the original matrix, not a copy. Thus the R object studyMatrix is modified in place, and you also return it. So basically result and studyMatrix are the same original object, modified in place in memory. Thus they are equal.
Try this code to understand:
void f(NumericMatrix M)
{
  M(0,0) = 0;
  return;
}
Then in R
m = matrix(1, 2, 2)
m
#>      [,1] [,2]
#> [1,]    1    1
#> [2,]    1    1
f(m)
m
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    1    1
To fix your issue you can use clone:
NumericMatrix f(NumericMatrix M)
{
  NumericMatrix MM = clone(M);
  MM(0,0) = 0;
  return MM;
}
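A quick check that the cloning version really leaves the input alone (a sketch; f_clone is a hypothetical name, used to avoid clashing with f above):

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix f_clone(NumericMatrix M)
{
  NumericMatrix MM = clone(M); // deep copy: MM owns fresh memory
  MM(0,0) = 0;
  return MM;
}

/*** R
m <- matrix(1, 2, 2)
r <- f_clone(m)
m[1, 1]  # still 1: the input matrix was not touched
r[1, 1]  # 0: only the returned copy was modified
*/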

Results for calculating nearest positive definite matrix are different in R function and Rcpp function

In R, we can use Matrix::nearPD() to calculate the nearest positive definite matrix.
I have written an Rcpp version, nearPD_c, myself as follows (C++ file):
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace arma;
using namespace Rcpp;
// [[Rcpp::plugins(cpp11)]]

// [[Rcpp::export]]
vec rep_each(const vec& x, const int each) {
  std::size_t n = x.n_elem;
  std::size_t n_out = n * each;
  vec res(n_out);
  auto begin = res.begin();
  for (std::size_t i = 0, ind = 0; i < n; ind += each, ++i) {
    auto start = begin + ind;
    auto end = start + each;
    std::fill(start, end, x[i]);
  }
  return res;
}

mat mat_vec_same_len(mat mt1, vec v1){
  // do not check the input...
  int t = 0;
  for(int i = 0; i < mt1.n_cols; i++){
    for(int j = 0; j < mt1.n_rows; j++){
      mt1(j,i) = mt1(j,i) * v1(t);
      t++;
    }
  }
  return(mt1);
}

// [[Rcpp::export]]
vec pmax_c(double a, vec b){
  vec c(b.n_elem);
  for(int i = 0; i < b.n_elem; i++){
    c(i) = std::max(a, b(i));
  }
  return c;
}
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
mat nearPD_c(mat x,
             bool corr = false, bool keepDiag = false
             , bool do2eigen = true      // if TRUE do a sfsmisc::posdefify() eigen step
             , bool doSym = false        // symmetrize after tcrossprod()
             , bool doDykstra = true     // do use Dykstra's correction
             , bool only_values = false  // if TRUE simply return lambda[j].
             , double eig_tol = 1e-6     // defines relative positiveness of eigenvalues compared to largest
             , double conv_tol = 1e-7    // convergence tolerance for algorithm
             , double posd_tol = 1e-8    // tolerance for enforcing positive definiteness
             , int maxit = 100           // maximum number of iterations allowed
             , bool trace = false        // set to TRUE (or 1 ..) to trace iterations
             ){
  int n = x.n_cols;
  vec diagX0;
  if(keepDiag) {
    diagX0 = x.diag();
  }
  mat D_S;
  if(doDykstra) {
    // D_S should be like x, but filled with '0' -- the following also works for 'Matrix':
    D_S = x;
    D_S.zeros(); // set all elements to zero
  }
  mat X = x;
  int iter = 0;
  bool converged = false;
  double conv = R_PosInf;
  mat Y;
  mat R;
  mat B;
  while (iter < maxit && !converged) {
    Y = X;
    if(doDykstra){
      R = Y - D_S;
    }
    vec d;
    mat Q;
    if(doDykstra){
      B = R;
    }else{
      B = Y;
    }
    eig_sym(d, Q, B);
    // create mask from relative positive eigenvalues
    uvec p = (d > eig_tol * d[1]);
    if(sum(p) == 0){
      // stop("Matrix seems negative semi-definite")
      break;
    }
    // use p mask to only compute 'positive' part
    uvec p_indexes(sum(p));
    int p_i_i = 0;
    for(int i = 0; i < p.n_elem; i++){
      if(p(i)){
        p_indexes(p_i_i) = i;
        p_i_i++;
      }
    }
    Q = Q.cols(p_indexes);
    X = mat_vec_same_len(Q, rep_each(d.elem(p_indexes), Q.n_rows)) * Q.t();
    // update Dykstra's correction D_S = \Delta S_k
    if(doDykstra){
      D_S = X - R;
    }
    // project onto symmetric and possibly 'given diag' matrices:
    if(doSym){
      X = (X + X.t())/2;
    }
    if(corr){
      X.diag().ones(); // set diagonals to one
    }
    else if(keepDiag){
      X.diag() = diagX0;
    }
    conv = norm(Y-X, "inf") / norm(Y, "inf");
    iter = iter + 1;
    if (trace){
      // cat(sprintf("iter %3d : #{p}=%d, ||Y-X|| / ||Y||= %11g\n",
      //             iter, sum(p), conv))
      Rcpp::Rcout << "iter " << iter << " : #{p}= " << sum(p) << std::endl;
    }
    converged = (conv <= conv_tol);
    // force symmetry is *NEVER* needed, we have symmetric X here!
    // X <- (X + t(X))/2
    if(do2eigen || only_values) {
      // begin from posdefify(sfsmisc)
      eig_sym(d, Q, X);
      double Eps = posd_tol * std::abs(d[1]);
      // if (d[n] < Eps) { // should be n-1?
      if (d(n-1) < Eps) {
        uvec d_comp = d < Eps;
        for(int i = 0; i < sum(d_comp); i++){
          if(d_comp(i)){
            d(i) = Eps;
          }
        }
        // d[d < Eps] = Eps; // how to assign values like this?
        if(!only_values) {
          vec o_diag = X.diag();
          X = Q * (d * Q.t());
          vec D = sqrt(pmax_c(Eps, o_diag) / X.diag());
          x = D * X * rep_each(D, n);
        }
      }
      if(only_values) return(d);
      // unneeded(?!): X <- (X + t(X))/2
      if(corr) {
        X.diag().ones(); // set diag to ones
      }
      else if(keepDiag){
        X.diag() = diagX0;
      }
    } // end from posdefify(sfsmisc)
  }
  if(!converged){ // not converged
    Rcpp::Rcout << "did not converge! " << std::endl;
  }
  return X;
  // return List::create(_["mat"] = X, _["eigenvalues"] = d,
  //                     _["corr"] = corr, _["normF"] = norm(x-X, "fro"),
  //                     _["iterations"] = iter, _["rel.tol"] = conv,
  //                     _["converged"] = converged);
}
However, although nearPD and nearPD_c give similar results, they are not identical. For example (in R):
> mt0 = matrix(c( 0.5416, -0.0668, -0.1538, -0.2435,
+                -0.0668,  0.9836, -0.0135, -0.0195,
+                -0.1538, -0.0135,  0.0226,  0.0334,
+                -0.2435, -0.0195,  0.0334,  0.0487), 4, byrow = T)
> nearPD(mt0)$mat
4 x 4 Matrix of class "dpoMatrix"
[,1] [,2] [,3] [,4]
[1,] 0.55417390 -0.06540967 -0.14059121 -0.22075966
[2,] -0.06540967 0.98375373 -0.01203943 -0.01698557
[3,] -0.14059121 -0.01203943 0.03650733 0.05726836
[4,] -0.22075966 -0.01698557 0.05726836 0.08983952
> nearPD_c(mt0)
[,1] [,2] [,3] [,4]
[1,] 0.55417390 -0.06540967 -0.14059123 -0.22075967
[2,] -0.06540967 0.98375373 -0.01203944 -0.01698557
[3,] -0.14059123 -0.01203944 0.03650733 0.05726837
[4,] -0.22075967 -0.01698557 0.05726837 0.08983952
There are some differences in the 7th or 8th decimal, which make nearPD(mt0) positive definite while nearPD_c(mt0) is not.
> chol(nearPD(mt0)$mat)
4 x 4 Matrix of class "Cholesky"
[,1] [,2] [,3] [,4]
[1,] 7.444286e-01 -8.786561e-02 -1.888579e-01 -2.965491e-01
[2,] . 9.879440e-01 -2.898297e-02 -4.356729e-02
[3,] . . 1.029821e-04 1.014128e-05
[4,] . . . 1.071201e-04
> chol(nearPD_c(mt0))
Error in chol.default(nearPD_c(mt0)) :
the leading minor of order 3 is not positive definite
I sense that there might be some rounding issue in Rcpp, but I couldn't identify it. Any insight into what goes wrong?
There is at least one logic error in your post-processing. In R we have:
e <- eigen(X, symmetric = TRUE)
d <- e$values
Eps <- posd.tol * abs(d[1])
if (d[n] < Eps) {
  d[d < Eps] <- Eps
  [...]
While you have:
eig_sym(d, Q, X);
double Eps = posd_tol * std::abs(d[1]);
// if (d[n] < Eps) { // should be n-1?
if (d(n-1) < Eps) {
  uvec d_comp = d < Eps;
  for(int i = 0; i < sum(d_comp); i++){
    if(d_comp(i)){
      d(i) = Eps;
    }
  }
According to the Armadillo docs, eigenvalues are in ascending order, while they are in decreasing order in R. So R builds Eps from the largest eigenvalue, while you use the second(!) smallest. Then R compares with the smallest eigenvalue, while you compare with the largest. Something like this should give the same results as R (untested):
eig_sym(d, Q, X);
double Eps = posd_tol * std::abs(d[n-1]);
if (d(0) < Eps) {
  uvec d_comp = d < Eps;
  for(int i = 0; i < sum(d_comp); i++){
    if(d_comp(i)){
      d(i) = Eps;
    }
  }
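As an aside, the elementwise assignment asked about in the code comment (R's d[d < Eps] <- Eps) has a direct Armadillo idiom that would replace the loop above; a minimal sketch (floor_at is a hypothetical name):

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// Armadillo equivalent of R's d[d < Eps] <- Eps:
// find() returns the indices where the condition holds,
// elem() views those elements, and fill() overwrites them.
// [[Rcpp::export]]
arma::vec floor_at(arma::vec d, double Eps) {
  d.elem(arma::find(d < Eps)).fill(Eps);
  return d;
}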
BTW, you only need // [[Rcpp::export]] for functions that you want to call from R.

MPI C - Gather 2d Array Segments into One Global Array

I am trying to print a dynamically allocated 2d array from my master process after receiving all its components from all other processes. By components I mean subarrays, or blocks.
I have made the code generic with respect to the number of processes. The following diagram will help you see how the blocks are arranged in the complete array. Each block is handled by one process. Just for here though, let's assume that I run the program using 12 processes (natively I have 8 cores), using the command:
mpiexec -n 12 ./gather2dArray
This is the diagram, which specifically targets the 12-process scenario:
The answer by Jonathan to this question helped me a great deal, but unfortunately I have not been able to fully implement what I want.
I first create the blocks in each process, which I name grid. Every array is a dynamically allocated 2d array. I also create the global array (universe), visible only to the master process (#0).
Finally I have to use MPI_Gatherv(...) to assemble all the subarrays into the global array. Then I proceed to display the local arrays and the global array.
When I run the program with the command above I get a segmentation fault upon reaching the MPI_Gatherv(...) call. I can't figure out what I am doing incorrectly. I have provided the complete code (heavily commented) below:
EDIT
I have fixed some mistakes in the code. Now MPI_Gatherv() is somewhat successful. I am able to print the entire first row of the global array correctly (I check the individual elements of the processes and they always match). But when I reach the second row some hieroglyphics appear and finally a segmentation fault. I haven't been able to figure out what is wrong there. Still looking into it..
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <time.h>

void print2dCharArray(char** array, int rows, int columns);

int main(int argc, char** argv)
{
  int master = 0, np, rank;
  char version[10];
  char processorName[20];
  int strLen[10];

  // Initialize MPI environment
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &np);
  if (np != 12) { MPI_Abort(MPI_COMM_WORLD, 1); }
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // We need a different seed for each process
  srand(time(0) ^ (rank * 33 / 4));

  int nDims = 2;                    // array dimensions
  int rows = 4, columns = 6;        // rows and columns of each block
  int prows = 3, pcolumns = 4;      // rows and columns of blocks. Each block is handled by 1 process
  double density = 0.5;             // fill density; not declared in the original post, value assumed

  char** grid = malloc(rows * sizeof(char*));
  for (int i = 0; i < rows; i++)
    grid[i] = malloc(columns * sizeof(char));

  char** universe = NULL;           // Global array
  char* recvPtr;                    // Pointer to start of Global array
  int Rows = rows * prows;          // Global array rows
  int Columns = columns * pcolumns; // Global array columns
  int sizes[2];                     // No of elements in each dimension of the whole array
  int subSizes[2];                  // No of elements in each dimension of the subarray
  int startCoords[2];               // Starting coordinates of each subarray
  MPI_Datatype recvBlock, recvMagicBlock;

  if (rank == master){              // For the master's eyes only
    universe = malloc(Rows * sizeof(char*));
    for (int i = 0; i < Rows; i++)
      universe[i] = malloc(Columns * sizeof(char));

    // Create a subarray (a rectangular block) datatype from a regular, 2d array
    sizes[0] = Rows;
    sizes[1] = Columns;
    subSizes[0] = rows;
    subSizes[1] = columns;
    startCoords[0] = 0;
    startCoords[1] = 0;
    MPI_Type_create_subarray(nDims, sizes, subSizes, startCoords, MPI_ORDER_C, MPI_CHAR, &recvBlock);

    // Now modify the newly created datatype to fit our needs, by specifying
    // (lower bound remains the same = 0)
    // - new extent
    // The new region / block will now "change" sooner, as soon as we reach a region of elements
    // occupied by a new block, ie. every: (columns) * sizeof(elementType)
    MPI_Type_create_resized(recvBlock, 0, columns * sizeof(char), &recvMagicBlock);
    MPI_Type_commit(&recvMagicBlock);
    recvPtr = &universe[0][0];
  }

  // populate arrays
  for (int y = 0; y < rows; y++){
    for (int x = 0; x < columns; x++){
      if (( (double) rand() / RAND_MAX) <= density)
        grid[y][x] = '#';
      else
        grid[y][x] = '.';
    }
  }

  // display local array
  for (int i = 0; i < np; i++){
    if (i == rank) {
      printf("\n[Rank] of [total]: No%d of %d\n", rank, np);
      print2dCharArray(grid, rows, columns);
    }
    MPI_Barrier(MPI_COMM_WORLD);
  }

  /* MPI_Gathering.. */
  int recvCounts[np], displacements[np];

  // recvCounts: how many chunks of data each process has -- in units of blocks here --
  for (int i = 0; i < np; i++)
    recvCounts[i] = 1;

  // prows * pcolumns = np
  // displacements: displacement relative to global buffer (universe) at which to place the
  // incoming data block from process i -- in block extents! --
  int index = 0;
  for (int p_row = 0; p_row < prows; p_row++)
    for (int p_column = 0; p_column < pcolumns; p_column++)
      displacements[index++] = p_column + p_row * (rows * pcolumns);

  // MPI_Gatherv(...) is a collective routine
  // Gather the local arrays to the global array in the master process
  // send type: MPI_CHAR (a char)
  // recv type: recvMagicBlock (a block)
  MPI_Gatherv(&grid[0][0], rows * columns, MPI_CHAR,  //: parameters relevant to sender
              recvPtr, recvCounts, displacements, recvMagicBlock, master, //: parameters relevant to receiver
              MPI_COMM_WORLD);

  // display global array
  MPI_Barrier(MPI_COMM_WORLD);
  if (rank == master){
    printf("\n---Global Array---\n");
    print2dCharArray(universe, Rows, Columns);
  }

  MPI_Finalize();
  return 0;
}

void print2dCharArray(char** array, int rows, int columns)
{
  int i, j;
  for (i = 0; i < rows; i++){
    for (j = 0; j < columns; j++){
      printf("%c ", array[i][j]);
    }
    printf("\n");
  }
  fflush(stdout);
}
The following is the output I'm getting. No matter what I try, I cannot get past this. As you can see, the first line of the global array is printed properly using the first blocks of the first 4 processes. When jumping to the next line we get hieroglyphics..
hostname#User:~/mpi$ mpiexec -n 12 ./gather2darray
MPICH Version: 3User
Processor name: User
[Rank] of [total]: No0 of 12
. . # . . #
# . # # # .
. . . # # .
. . # . . .
[Rank] of [total]: No1 of 12
. . # # . .
. . . . # #
. # . . # .
. . # . . .
[Rank] of [total]: No2 of 12
. # # # . #
. # . . . .
# # # . . .
. . . # # .
[Rank] of [total]: No3 of 12
. . # # # #
. . # # . .
# . # . # .
. . . # . .
[Rank] of [total]: No4 of 12
. # . . . #
# . # . # .
# . . . . .
# . . . . .
[Rank] of [total]: No5 of 12
# # . # # .
# . . # # .
. . . . # .
. # # . . .
[Rank] of [total]: No6 of 12
. . # # . #
. . # . # .
# . . . . .
. . . # # #
[Rank] of [total]: No7 of 12
# # . # # .
. # # . . .
. . . . . #
. . . # # .
[Rank] of [total]: No8 of 12
. # . . . .
# . # . # .
. . . # . #
# . # # # .
[Rank] of [total]: No9 of 12
. . . . . #
. . # . . .
. . # . . #
. . # # . .
[Rank] of [total]: No10 of 12
. . . . # .
# . . . . .
. . # # . .
. . . # . #
[Rank] of [total]: No11 of 12
. # . . # .
. # . # # .
. . . # . .
. # . # . #
---Global Array---
. . # . . # . . # # . . . # # # . # . . # # # #
� � < * � � e { � � � � � �
J
*** Error in `./gather2darray': double free or corruption (out): 0x0000000001e4c050 ***
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
*** stack smashing detected ***: ./gather2darray terminated
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 10979 RUNNING AT User
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Help will be very much appreciated. Thanks in advance.
Your code is almost correct; you just forgot an important MPI principle. When you use an array with MPI functions, MPI assumes that the array's memory is allocated contiguously. So you have to change your 2-D array allocations.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <time.h>

void print2dCharArray(char** array, int rows, int columns);

int main(int argc, char** argv)
{
  int master = 0, np, rank;
  char version[10];
  char processorName[20];
  int strLen[10];

  // Initialize MPI environment
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &np);
  if (np != 12) { MPI_Abort(MPI_COMM_WORLD, 1); }
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // We need a different seed for each process
  srand(time(0) ^ (rank * 33 / 4));

  int nDims = 2;                    // array dimensions
  int rows = 4, columns = 6;        // rows and columns of each block
  int prows = 3, pcolumns = 4;      // rows and columns of blocks. Each block is handled by 1 process

  char* pre_grid = (char*) malloc(rows * columns * sizeof(char));
  char** grid = (char**) malloc(rows * sizeof(char*));
  for (int i = 0; i < rows; i++)
    grid[i] = &(pre_grid[i * columns]);

  char** universe = NULL;           // Global array
  char* pre_universe = NULL;
  char* recvPtr;                    // Pointer to start of Global array
  int Rows = rows * prows;          // Global array rows
  int Columns = columns * pcolumns; // Global array columns
  int sizes[2];                     // No of elements in each dimension of the whole array
  int subSizes[2];                  // No of elements in each dimension of the subarray
  int startCoords[2];               // Starting coordinates of each subarray
  MPI_Datatype recvBlock, recvMagicBlock;

  if (rank == master){              // For the master's eyes only
    /* universe = malloc(Rows * sizeof(char*));        */
    /* for (int i = 0; i < Rows; i++)                  */
    /*   universe[i] = malloc(Columns * sizeof(char)); */
    pre_universe = (char*) malloc(Rows * Columns * sizeof(char));
    universe = (char**) malloc(Rows * sizeof(char*));
    for (int i = 0; i < Rows; i++) {
      universe[i] = &(pre_universe[i * Columns]);
    }

    // Create a subarray (a rectangular block) datatype from a regular, 2d array
    sizes[0] = Rows;
    sizes[1] = Columns;
    subSizes[0] = rows;
    subSizes[1] = columns;
    startCoords[0] = 0;
    startCoords[1] = 0;
    MPI_Type_create_subarray(nDims, sizes, subSizes, startCoords, MPI_ORDER_C, MPI_CHAR, &recvBlock);

    // Now modify the newly created datatype to fit our needs, by specifying
    // (lower bound remains the same = 0)
    // - new extent
    // The new region / block will now "change" sooner, as soon as we reach a region of elements
    // occupied by a new block, ie. every: (columns) * sizeof(elementType)
    MPI_Type_create_resized(recvBlock, 0, columns * sizeof(char), &recvMagicBlock);
    MPI_Type_commit(&recvMagicBlock);
    recvPtr = &universe[0][0];
  }

  // populate arrays
  for (int y = 0; y < rows; y++){
    for (int x = 0; x < columns; x++){
      grid[y][x] = rank + 65;
    }
  }

  // display local array
  for (int i = 0; i < np; i++){
    if (i == rank) {
      printf("\n[Rank] of [total]: No%d of %d\n", rank, np);
      print2dCharArray(grid, rows, columns);
    }
    MPI_Barrier(MPI_COMM_WORLD);
  }

  /* MPI_Gathering.. */
  int recvCounts[np], displacements[np];

  // recvCounts: how many chunks of data each process has -- in units of blocks here --
  for (int i = 0; i < np; i++)
    recvCounts[i] = 1;

  // prows * pcolumns = np
  // displacements: displacement relative to global buffer (universe) at which to place the
  // incoming data block from process i -- in block extents! --
  int index = 0;
  for (int p_row = 0; p_row < prows; p_row++)
    for (int p_column = 0; p_column < pcolumns; p_column++)
      displacements[index++] = p_column + p_row * (rows * pcolumns);

  // MPI_Gatherv(...) is a collective routine
  // Gather the local arrays to the global array in the master process
  // send type: MPI_CHAR (a char)
  // recv type: recvMagicBlock (a block)
  MPI_Gatherv(&grid[0][0], rows * columns, MPI_CHAR,  //: parameters relevant to sender
              recvPtr, recvCounts, displacements, recvMagicBlock, master, //: parameters relevant to receiver
              MPI_COMM_WORLD);

  // display global array
  MPI_Barrier(MPI_COMM_WORLD);
  if (rank == master){
    printf("\n---Global Array---\n");
    print2dCharArray(universe, Rows, Columns);
  }

  free(grid[0]);
  free(grid);
  if (rank == master) {
    free(universe[0]);
    free(universe);
    MPI_Type_free(&recvMagicBlock);
    MPI_Type_free(&recvBlock);
  }

  MPI_Finalize();
  return 0;
}

void print2dCharArray(char** array, int rows, int columns)
{
  int i, j;
  for (i = 0; i < rows; i++){
    for (j = 0; j < columns; j++){
      printf("%c ", array[i][j]);
    }
    printf("\n");
  }
  fflush(stdout);
}
Output:
---Global Array---
A A A A A A B B B B B B C C C C C C D D D D D D
A A A A A A B B B B B B C C C C C C D D D D D D
A A A A A A B B B B B B C C C C C C D D D D D D
A A A A A A B B B B B B C C C C C C D D D D D D
E E E E E E F F F F F F G G G G G G H H H H H H
E E E E E E F F F F F F G G G G G G H H H H H H
E E E E E E F F F F F F G G G G G G H H H H H H
E E E E E E F F F F F F G G G G G G H H H H H H
I I I I I I J J J J J J K K K K K K L L L L L L
I I I I I I J J J J J J K K K K K K L L L L L L
I I I I I I J J J J J J K K K K K K L L L L L L
I I I I I I J J J J J J K K K K K K L L L L L L

How to write Rcpp function for simple matrix multiplication in R

I have written Rcpp code to compute element-wise matrix multiplication in R. But when I try to run this code, R stops working and exits. How can I correct this function?
Thanks in advance.
library(Rcpp)
func <- 'NumericMatrix mmult( NumericMatrix m, NumericMatrix v, bool byrow = true )
{
  if( ! m.nrow() == v.nrow() ) stop("Non-conformable arrays") ;
  if( ! m.ncol() == v.ncol() ) stop("Non-conformable arrays") ;
  NumericMatrix out(m) ;
  for (int i = 1; i <= m.nrow(); i++)
  {
    for (int j = 1; j <= m.ncol(); j++)
    {
      out(i,j) = m(i,j) * v(i,j) ;
    }
  }
  return out ;
}'
cppFunction( func )
m1 <- matrix(1:4, 2, 2)
m2 <- m1
r1 <- mmult(m1, m2)
r2 <- m1 * m2
The (at least to me) obvious choice is to use RcppArmadillo:
R> cppFunction("arma::mat matmult(arma::mat A, arma::mat B) { return A % B; }",
+              depends="RcppArmadillo")
R> m1 <- m2 <- matrix(1:4, 2, 2)
R> matmult(m1, m2)
     [,1] [,2]
[1,]    1    9
[2,]    4   16
R>
as Armadillo is strongly typed and has an element-by-element multiplication operator (%), which we use in the one-liner above.
You have to keep in mind that C++ uses 0-indexed arrays. (See Why does the indexing start with zero in 'C'? and Why are zero-based arrays the norm?.)
So you need to define your loops to run from 0 to m.nrow() - 1 and from 0 to m.ncol() - 1.
Try this:
func <- '
NumericMatrix mmult( NumericMatrix m, NumericMatrix v, bool byrow = true )
{
  if( ! m.nrow() == v.nrow() ) stop("Non-conformable arrays") ;
  if( ! m.ncol() == v.ncol() ) stop("Non-conformable arrays") ;
  NumericMatrix out(m) ;
  for (int i = 0; i < m.nrow(); i++)
  {
    for (int j = 0; j < m.ncol(); j++)
    {
      out(i,j) = m(i,j) * v(i,j) ;
    }
  }
  return out ;
}
'
Then I get:
> mmult(m1, m2)
     [,1] [,2]
[1,]    1    9
[2,]    4   16
> m1*m2
     [,1] [,2]
[1,]    1    9
[2,]    4   16
