I solved LeetCode problem 329 (given an integer matrix, find the length of the longest increasing path) using recursion, and I am not sure about its time complexity.
For the time complexity, first there are the two for loops on the outside, which contribute
T(m, n) = O(m*n)
Inside the loops there is a recursive call to findPath, which looks like
T(m,n) = T(m-1, n)+T(m+1, n)+T(m, n-1)+T(m, n+1)
and I am completely lost on this one. Thanks if you can help explain it to me.
Following is my code:
int longestIncreasingPath(vector<vector<int>>& matrix) {
    if (matrix.size() == 0 || matrix[0].size() == 0) return 0;
    vector<vector<int>> cached(matrix.size(), vector<int>(matrix[0].size(), 0));
    int maxVal = 0;
    for (int i = 0; i < matrix.size(); i++) {
        for (int j = 0; j < matrix[0].size(); j++) {
            int length = findPath(matrix, i, j, cached, INT_MAX);
            maxVal = max(length, maxVal);
        }
    }
    return maxVal;
}

int findPath(vector<vector<int>>& matrix, int i, int j,
             vector<vector<int>>& cached, int lastValue) {
    if (i < 0 || j < 0 || i >= matrix.size() || j >= matrix[0].size() || matrix[i][j] >= lastValue) {
        return 0;
    }
    if (cached[i][j] == 0) {
        int current = matrix[i][j];
        int temp = 0;
        temp = max(temp, findPath(matrix, i-1, j, cached, current));
        temp = max(temp, findPath(matrix, i+1, j, cached, current));
        temp = max(temp, findPath(matrix, i, j-1, cached, current));
        temp = max(temp, findPath(matrix, i, j+1, cached, current));
        cached[i][j] = temp + 1;
    }
    return cached[i][j];
}
Once a call to findPath has computed a position, cached[i][j] is always at least 1. Therefore subsequent calls to the position (i, j) return the cached value and do not lead to recursive calls to the positions around it. We can then deduce that each position (i, j) is called at most 5 times: once from the outer loops and up to 4 times from the positions horizontally or vertically adjacent to it, and only the first call that gets past the bounds and value check leads on to further recursive calls. We also assume the worst case, in which matrix[i][j] >= lastValue is never satisfied. Each position then does a constant amount of work, so the upper bound is O(mn), where m and n are the dimensions of matrix.
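One way to sanity-check this bound is to count the calls. The sketch below is the same memoized search wrapped in a hypothetical PathCounter struct; the struct, the Grid alias and the calls counter are my own additions for illustration and are not part of the code in the question. Under those assumptions the counter should never exceed 5*m*n.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Hypothetical wrapper: same memoized search as in the question, plus a call counter.
struct PathCounter {
    long long calls = 0;  // total findPath invocations (instrumentation only)

    int findPath(const Grid& matrix, int i, int j, Grid& cached, int lastValue) {
        ++calls;
        const int rows = static_cast<int>(matrix.size());
        const int cols = static_cast<int>(matrix[0].size());
        if (i < 0 || j < 0 || i >= rows || j >= cols || matrix[i][j] >= lastValue)
            return 0;
        if (cached[i][j] == 0) {  // first visit: the only time this cell recurses
            const int current = matrix[i][j];
            int best = 0;
            best = std::max(best, findPath(matrix, i - 1, j, cached, current));
            best = std::max(best, findPath(matrix, i + 1, j, cached, current));
            best = std::max(best, findPath(matrix, i, j - 1, cached, current));
            best = std::max(best, findPath(matrix, i, j + 1, cached, current));
            cached[i][j] = best + 1;
        }
        return cached[i][j];
    }

    int longest(const Grid& matrix) {
        if (matrix.empty() || matrix[0].empty()) return 0;
        const int rows = static_cast<int>(matrix.size());
        const int cols = static_cast<int>(matrix[0].size());
        Grid cached(rows, std::vector<int>(cols, 0));
        int maxVal = 0;
        for (int i = 0; i < rows; ++i)
            for (int j = 0; j < cols; ++j)
                maxVal = std::max(maxVal, findPath(matrix, i, j, cached, INT_MAX));
        return maxVal;
    }
};
```

Calling longest on an m x n grid and then reading calls gives the number of invocations, which can be compared against 5*m*n for a few inputs.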
Below is a piece of C code run from R used to compare each row of a matrix to a vector. The number of identical values is stored in the first column of a two-column matrix.
I know it can easily be done in R (as done to check the results), but this is a first step for a more complex use case.
When OpenMP is not used, it works fine. When OpenMP is used, it gives correlated (0.99) but inconsistent results.
Question1: What am I doing wrong?
Question2: I use a double for loop to fill the output matrix (ret) with zeros. What would be a better solution?
Also, inconsistencies were observed when the code was used in a package. I tried to make the code reproducible using inline, but it does not recognize the openmp statements (I tried to include 'omp.h', in the parameters of cfunction, ...).
Question3: How can we make this code work with inline?
I'm (too?) far outside my comfort zone on this topic.
compare <- cfunction(c(x = "integer", vec = "integer"), "
  const int I = nrows(x), J = ncols(x);
  SEXP ret;
  PROTECT(ret = allocMatrix(INTSXP, I, 2));
  int *ptx = INTEGER(x), *ptvec = INTEGER(vec), *ptret = INTEGER(ret);

  for (int i=0; i<I; i++)
    for (int j=0; j<2; j++)
      ptret[j * I + i] = 0;

  int i, j;
  #pragma omp parallel for default(none) shared(ptx, ptvec, ptret) private(i,j)
  for (j=0; j<J; j++)
    for (i=0; i<I; i++)
      if (ptx[i + I * j] == ptvec[j]) {++ptret[i];}

  UNPROTECT(1);
  return ret;
")
N = 3e3
M = 1e4
m = matrix(sample(c(-1:1), N*M, replace = TRUE), nc = M)
v = sample(-1:1, M, replace = TRUE)
cc = compare(m, v)
cr = rowSums(t(t(m) == v))
all.equal(cc[,1], cr)
Thanks to the comments above, I reconsidered the data race issue.
IIUC, my loop was parallelized on j (the columns). Then, each thread had its own value of i (the rows), but possible identical values across threads, that were then trying to increment ptret[i] at the same time.
To avoid this, I now loop on i first, so that only a single thread will increment each row.
Then, I realized that I could move the zero-initialization of ptret within the first loop.
It seems to work. I get identical results, increased CPU usage, and 3-4x speedup on my laptop.
I guess that solves questions 1 and 2. I will have a closer look at the inline/openmp problem.
Code below, fwiw.
#include <omp.h>
#include <R.h>
#include <Rinternals.h>
#include <stdio.h>

SEXP c_compare(SEXP x, SEXP vec)
{
  const int I = nrows(x), J = ncols(x);
  SEXP ret;
  PROTECT(ret = allocMatrix(INTSXP, I, 2));
  int *ptx = INTEGER(x), *ptvec = INTEGER(vec), *ptret = INTEGER(ret);
  int i, j;
  #pragma omp parallel for default(none) shared(ptx, ptvec, ptret) private(i, j)
  for (i = 0; i < I; i++) {
    // init ptret to zero
    ptret[i] = 0;
    ptret[I + i] = 0;
    for (j = 0; j < J; j++)
      if (ptx[i + I * j] == ptvec[j]) {
        ++ptret[i];
      }
  }
  UNPROTECT(1);
  return ret;
}
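For anyone who wants to try the row-wise scheme outside R, here is a minimal standalone C++ sketch of the same idea. The function name countRowMatches and the use of std::vector are my own assumptions, not part of the package code; the matrix is taken to be stored column-major, as R does. Each iteration of the parallel loop owns exactly one output slot, so no two threads write to the same element.

```cpp
#include <cstddef>
#include <vector>

// x holds a rows-by-cols matrix in column-major order (as R stores it);
// vec has one entry per column. Returns, per row, how many entries equal vec.
std::vector<int> countRowMatches(const std::vector<int>& x,
                                 const std::vector<int>& vec, int rows) {
    const int cols = static_cast<int>(vec.size());
    std::vector<int> counts(rows, 0);
    const int* px = x.data();
    const int* pv = vec.data();
    int* out = counts.data();

    // Parallel over rows: thread-private accumulator, one write per row.
    // Without an OpenMP flag the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int i = 0; i < rows; ++i) {
        int matches = 0;
        for (int j = 0; j < cols; ++j) {
            if (px[i + static_cast<std::size_t>(rows) * j] == pv[j]) ++matches;
        }
        out[i] = matches;
    }
    return counts;
}
```

Accumulating into a local variable and storing once per row also avoids repeated writes to neighbouring elements of the shared output from different threads.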
Given a bag with a maximum of 100 chips, each chip has its value written on it.
Determine the fairest division between two persons. This means that the difference between the amounts each person obtains should be minimized. The value of a chip varies from 1 to 1000.
Input: The number of coins m, and the value of each coin.
Output: Minimal positive difference between the amount the two persons obtain when they divide the chips from the corresponding bag.
I am finding it difficult to form a DP solution for it. Please help me.
Initially I tried a non-DP solution; actually, I hadn't thought of solving it using DP. I simply sorted the value array, assigned the largest value to one of the persons, and then assigned each remaining value to whichever of the two gives the smaller difference. But that solution didn't work.
I am posting my solution here :
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <vector>
using namespace std;

// Strict comparison: std::sort needs a strict weak ordering, so >= is not allowed here.
bool myfunction(int i, int j)
{
    return (i > j);
}

int main()
{
    int T, m, sum1, sum2, temp_sum1, temp_sum2, i;
    cin >> T;
    while (T--)
    {
        cin >> m;
        sum1 = 0; sum2 = 0; temp_sum1 = 0; temp_sum2 = 0;
        vector<int> arr(m);
        for (i = 0; i < m; i++)
            cin >> arr[i];
        if (m == 1)
            // a single chip cannot be split, so the difference is its value
            cout << arr[0] << endl;
        else {
            sort(arr.begin(), arr.end(), myfunction);
            for (i = 0; i < m; i++)
            {
                temp_sum1 = sum1 + arr[i];
                temp_sum2 = sum2 + arr[i];
                // give the chip to whichever person keeps the gap smaller
                if (abs(temp_sum1 - sum2) <= abs(temp_sum2 - sum1))
                    sum1 = sum1 + arr[i];
                else
                    sum2 = sum2 + arr[i];
                temp_sum1 = 0;
                temp_sum2 = 0;
            }
            cout << abs(sum1 - sum2) << endl;
        }
    }
    return 0;
}
What I understand from your question is that you want to divide the chips between two persons so as to minimize the difference between the sums of the numbers written on them.
If that understanding is correct, then you can potentially follow the approach below to arrive at a solution.
Sort the values array, i.e. int values[100]
Start adding elements from both ends of the array in a for loop, i.e. for(i=0, j=values.length-1; i<j; i++, j--)
The sum from odd-numbered iterations belongs to one person and the sum from even-numbered iterations to the other person
Run the loop while i < j
Now the difference between the two sums obtained in the odd and even iterations should be small, as the array was sorted earlier. Note that this pairing is a heuristic, so the result is not guaranteed to be the true minimum.
If my understanding of the question is correct, then this solution should resolve your problem.
Reflect as appropriate.
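Since the question asks for a DP formulation, here is a minimal subset-sum sketch of one, assuming the limits stated in the question (at most 100 chips, each worth at most 1000, so the total is at most 100000). The function name minPartitionDifference is hypothetical. The idea is to mark every sum that some subset of chips can reach, then pick the reachable sum closest to half the total. For the values 5, 5, 4, 3, 3 this finds the split {5, 5} versus {4, 3, 3} with difference 0, whereas the greedy assignment in the question ends with a difference of 2.

```cpp
#include <bitset>
#include <numeric>
#include <vector>

// Assumes the limits from the question: at most 100 chips of value 1..1000.
int minPartitionDifference(const std::vector<int>& chips) {
    constexpr int kMaxTotal = 100 * 1000;
    // reachable[s] is true when some subset of the chips sums to exactly s.
    std::bitset<kMaxTotal + 1> reachable;
    reachable[0] = true;
    for (int v : chips) {
        reachable |= reachable << v;  // either skip this chip or add it
    }
    const int total = std::accumulate(chips.begin(), chips.end(), 0);
    // One person gets s, the other total - s; try s as close to total/2 as possible.
    for (int s = total / 2; s >= 0; --s) {
        if (reachable[s]) return total - 2 * s;
    }
    return total;  // not reached: s = 0 is always reachable
}
```

The bitset shift handles all sums for one chip in a single pass, so the work is roughly (number of chips) x (maximum total) / (machine word size) operations.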
Can we parallelize a recursive function using MPI?
I am trying to parallelize the quick sort function, but I don't know if it works in MPI because it is recursive. I also want to know where I should put the parallel region.
// quickSort.c
#include <stdio.h>

void quickSort( int[], int, int);
int partition( int[], int, int);

int main(void)
{
    int a[] = { 7, 12, 1, -2, 0, 15, 4, 11, 9};
    int i;
    printf("\n\nUnsorted array is: ");
    for(i = 0; i < 9; ++i)
        printf(" %d ", a[i]);
    quickSort( a, 0, 8);
    printf("\n\nSorted array is: ");
    for(i = 0; i < 9; ++i)
        printf(" %d ", a[i]);
    return 0;
}

void quickSort( int a[], int l, int r)
{
    int j;
    if( l < r )
    {
        // divide and conquer
        j = partition( a, l, r);
        quickSort( a, l, j-1);
        quickSort( a, j+1, r);
    }
}

int partition( int a[], int l, int r) {
    int pivot, i, j, t;
    pivot = a[l];
    i = l; j = r+1;
    while( 1)
    {
        // check the bound before reading a[i], so we never read past a[r]
        do ++i; while( i <= r && a[i] <= pivot );
        do --j; while( a[j] > pivot );
        if( i >= j ) break;
        t = a[i]; a[i] = a[j]; a[j] = t;
    }
    t = a[l]; a[l] = a[j]; a[j] = t;
    return j;
}
I would also really appreciate it if there is another simpler code for the quick sort.
Well, technically you can, but I'm afraid this would be efficient only on SMP. And does the array fit on a single node? If not, then you cannot perform even the first pass of a quick sort.
If you really need to sort an array on a parallel system using MPI, you might want to consider using merge sort instead (of course you still can use quick sort for single blocks at each node, before you begin merging the blocks).
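To make the "sort blocks, then merge" idea concrete, here is a single-process C++ sketch with no MPI calls in it. The function name blockSortThenMerge and the nodes parameter are hypothetical: each contiguous block stands in for the data one node would hold, and in a real MPI program the merge phase would involve sending blocks between processes rather than merging in place.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Single-process stand-in for the scheme: each "node" owns one contiguous block.
void blockSortThenMerge(std::vector<int>& data, std::size_t nodes) {
    const std::size_t n = data.size();
    if (nodes == 0 || n == 0) return;
    const std::size_t block = (n + nodes - 1) / nodes;  // ceil(n / nodes)

    // Phase 1: every node sorts its own block (any local sort works here).
    for (std::size_t lo = 0; lo < n; lo += block) {
        std::sort(data.begin() + lo, data.begin() + std::min(lo + block, n));
    }

    // Phase 2: merge neighbouring sorted runs, doubling the run width each round.
    for (std::size_t width = block; width < n; width *= 2) {
        for (std::size_t lo = 0; lo + width < n; lo += 2 * width) {
            std::inplace_merge(data.begin() + lo,
                               data.begin() + lo + width,
                               data.begin() + std::min(lo + 2 * width, n));
        }
    }
}
```

Phase 1 is the part that parallelizes trivially, since the blocks are independent; phase 2 is where the communication cost of a distributed version would show up.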
If you still want to use quick sort, but you are confused with the recursive version, here is a sketch of non-recursive algorithm which hopefully can be parallelized a bit easier, although it's essentially the same:
std::stack<std::pair<int, int> > unsorted;
unsorted.push(std::make_pair(0, size-1));
while (!unsorted.empty()) {
  std::pair<int, int> u = unsorted.top();
  unsorted.pop();  // remove the interval we are about to process
  int m = partition(A, u.first, u.second);
  // here you can send one of intervals to another node instead of
  // pushing it into the stack, so it would be processed in parallel.
  if (m+1 < u.second) unsorted.push(std::make_pair(m+1, u.second));
  if (u.first < m-1) unsorted.push(std::make_pair(u.first, m-1));
}
Theoretically "anything" can be parallelized using MPI, but remember that MPI isn't doing any parallelization itself. It's just providing the communication layer between processes. As long as all of your sends and receives (or collective calls) match up, it's a correct program for the most part. That being said, it may not be the most efficient thing to use MPI, depending on your algorithm. If you are going to be sorting lots and lots of data (more than can fit in the memory of one node) then it could be efficient to use MPI (you probably want to take a look at the RMA chapter in that case) or some other higher level library that might make things even simpler for this type of application (UPC, Co-array Fortran, SHMEM, etc.).
I am trying to implement a "coupling from the past" algorithm in Rcpp. For this I need to store a matrix of random numbers, and if the algorithm did not converge, create a new matrix of random numbers and store that as well. This might have to be done 10+ times or so until convergence.
I was hoping I could use a List and dynamically update it, similar as I would in R. I was actually very surprised it worked a bit but I got errors whenever the list size becomes large. This seems to make sense as I did not allocate the needed memory for the additional list elements, although I am not that familiar with C++ and not sure if that is the problem.
Here is an example of what I tried. However, be aware that this will probably crash your R session:
includes = '
NumericMatrix RandMat(int nrow, int ncol)
{
  int N = nrow * ncol;
  NumericMatrix Res(nrow,ncol);
  NumericVector Rands = runif(N);
  for (int i = 0; i < N; i++)
  {
    Res[i] = Rands[i];
  }
  return(Res);
}
'
code = '
void foo()
{
  // This is the relevant part, I create a list then update it and print the results:
  List x;
  for (int i=0; i<10; i++)
  {
    x[i] = RandMat(100,10);
  }
}
'
Does anyone know a way to do this without crashing R? I guess I could initiate the list at a fixed amount of elements here, but in my application the amount of elements is random.
You have to "allocate" enough space for your list. Maybe you can use something like a resize function:
List resize( const List& x, int n ){
    int oldsize = x.size() ;
    List y(n) ;
    for( int i=0; i<oldsize; i++) y[i] = x[i] ;
    return y ;
}
and whenever you want your list to be bigger than it is now, you can do:
x = resize( x, n ) ;
Your initial list is of size 0, so it is expected that you get unpredictable behavior at the first iteration of your loop.
Have you tried the latest Codility test?
I felt like there was an error in the definition of what a K-Sparse number is that left me confused and I wasn't sure what the right way to proceed was. So it starts out by defining a K-Sparse Number:
In the binary number "100100010000" there are at least two 0s between
any two consecutive 1s. In the binary number "100010000100010" there
are at least three 0s between any two consecutive 1s. A positive
integer N is called K-sparse if there are at least K 0s between any
two consecutive 1s in its binary representation. (My emphasis)
So the first number you see, 100100010000 is 2-sparse and the second one, 100010000100010, is 3-sparse. Pretty simple, but then it gets down into the algorithm:
Write a function:
class Solution { public int sparse_binary_count(String S,String T,int K); }
that, given:
string S containing a binary representation of some positive integer A,
string T containing a binary representation of some positive integer B,
a positive integer K.
returns the number of K-sparse integers within the range [A..B] (both
ends included)
and then states this test case:
For example, given S = "101" (A = 5), T = "1111" (B=15) and K=2, the
function should return 2, because there are just two 2-sparse integers
in the range [5..15], namely "1000" (i.e. 8) and "1001" (i.e. 9).
Basically it is saying that 8, or 1000 in base 2, is a 2-sparse number, even though it does not have two consecutive ones in its binary representation. What gives? Am I missing something here?
I tried solving that one. The problem's assumption that binary representations of powers of two are K-sparse by default is somewhat confusing and counterintuitive.
What I understood was that 8 --> 1000 is 2 to the power 3, so 8 is 3-sparse, and 16 --> 10000 is 2 to the power 4, and hence 4-sparse.
Even if we assume that is true, and if you are interested, below is my solution code (C) for this problem. It doesn't handle some cases correctly, where there are powers of two between the two input numbers; I am trying to see if I can fix that:
int sparse_binary_count (const string &S,const string &T,int K)
char buf[50];
char *str1,*tptr,*Sstr,*Tstr;
int i,len1,len2,cnt=0;
long int num1,num2;
char *pend,*ch;
Sstr = (char *)S.c_str();
Tstr = (char *)T.c_str();
str1 = (char *)malloc(300001);
tptr = str1;
num1 = strtol(Sstr,&pend,2);
num2 = strtol(Tstr,&pend,2);
buf[i] = '0';
buf[i] = '\0';
str1 = tptr;
if( (i & (i-1))==0)
if(i >= (pow((float)2,(float)K)))
str1 = myitoa(i,str1,2);
ch = strstr(str1,buf);
if(ch == NULL)
if((i % 2) != 0)
return cnt;
char* myitoa(int val, char *buf, int base){
int i = 299999;
int cnt=0;
for(; val && i ; --i, val /= base)
buf[i] = "0123456789abcdef"[val % base];
buf[i+cnt+1] = '\0';
return &buf[i+1];
There was a note within the test details covering this specific case. According to that note, any power of 2 is considered K-sparse for any K.
You can solve this simply with bitwise operations on integers. You can even determine a bound above which there are no K-sparse integers less than or equal to the integer represented by T.
As far as I can see, you must also pay a lot of attention to performance, as there are sometimes hundreds of millions of integers to be checked.
My own solution, written in Python, which works very efficiently even on large ranges of integers and was successfully tested on many inputs, failed. The results were not very descriptive, saying only that it does not work as required by the question (although it meets all the requirements in my opinion).
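As an illustration of the bitwise approach, here is a minimal C++ sketch; the names isKSparse and countKSparse are my own and it works on 64-bit integers rather than on the binary strings from the task. A number fails to be K-sparse exactly when two of its set bits are at distance K or less, which can be tested by ANDing the number with shifted copies of itself. A number with a single set bit passes for every K, which agrees with the note about powers of 2.

```cpp
#include <cstdint>

// True when any two set bits of n have at least k zero bits between them.
bool isKSparse(std::uint64_t n, int k) {
    for (int d = 1; d <= k && d < 64; ++d) {
        // n & (n >> d) is non-zero iff two set bits are exactly d positions apart
        if (n & (n >> d)) return false;
    }
    return true;
}

// Brute-force count over [a, b]; fine for small ranges only.
std::uint64_t countKSparse(std::uint64_t a, std::uint64_t b, int k) {
    std::uint64_t count = 0;
    for (std::uint64_t n = a; n <= b; ++n) {
        if (isKSparse(n, k)) ++count;
        if (n == b) break;  // avoids wrap-around when b is the maximum value
    }
    return count;
}
```

The brute-force loop is only there to check small cases such as the [5..15] example; for the very large ranges mentioned above, a counting approach over the bit positions would be needed instead.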
A solution with bitwise operators:
The number of bits per int is 32 on a 32-bit system. Check for the pattern (for K=2,
like 1001 or 1000) at each shift and increment the count; repeat this
for all numbers in the range.
int KsparseNumbers(int a, int b, int s) {
int nbits = sizeof(int)*8;
int slen = 0;
int lslen = pow(2, s);
int scount = 0;
int i = 0;
for (; i < s; ++i) {
slen += pow(2, i);
printf("\n slen = %d\n", slen);
for(; a <= b; ++a) {
int num = a;
for(i = 0 ; i < nbits-2; ++i) {
if ( (num & slen) == 0 && (num & lslen) ) {
printf("\n Scount = %d\n", scount);
num >>=1;
return scount;
int main() {
printf("\n No of 2-sparse numbers between 5 and 15 = %d\n", KsparseNumbers(5, 15, 2));