Base case condition in quick sort algorithm - recursion

In the recursive quicksort algorithm, every call checks the condition if(p < r). Please correct me if I am wrong: as far as I know, every recursive algorithm checks a condition when it enters the routine, and this condition determines whether the base case has been reached. But I still cannot understand how to set and test this condition correctly.
void quickSort(int* arr, int p, int r)
{
if(p < r)
{
int q = partition(arr,p,r);
quickSort(arr,p,q-1);
quickSort(arr,q+1,r);
}
}
For my entire code, please refer to the following:
/*
filename : main.c
description: quickSort algorithm
*/
#include<iostream>
using namespace std;
void exchange(int* val1, int* val2)
{
int temp = *val1;
*val1 = *val2;
*val2 = temp;
}
int partition(int* arr, int p, int r)
{
int x = arr[r];
int j = p;
int i = j-1;
while(j<=r-1)
{
if(arr[j] <= x)
{
i++;
// exchange arr[i] with arr[j]
exchange(&arr[i],&arr[j]);
}
j++;
}
exchange(&arr[i+1],&arr[r]);
return i+1;
}
void quickSort(int* arr, int p, int r)
{
if(p < r)
{
int q = partition(arr,p,r);
quickSort(arr,p,q-1);
quickSort(arr,q+1,r);
}
}
// driver program to test the quick sort algorithm
int main(int argc, const char* argv[])
{
int arr1[] = {13,19,9,5,12,8,7,4,21,2,6,11};
cout <<"The original array is: ";
for(int i=0; i<12; i++)
{
cout << arr1[i] << " ";
}
cout << "\n";
quickSort(arr1,0,11);
//print out the sorted array
cout <<"The sorted array is: ";
for(int i=0; i<12; i++)
{
cout << arr1[i] << " ";
}
cout << "\n";
cin.get();
return 0;
}

Your question is not quite clear, but I will try to answer.
Quicksort works by sorting smaller and smaller subarrays. The base case is a subarray with fewer than two elements, because such a subarray needs no sorting.
At each step it picks a pivot value and partitions the array so that every value to the left of the pivot is less than or equal to it and every value to the right is greater. In other words, it puts the pivot in its final position. Then it recursively sorts the subarray to the left of the pivot and the subarray to the right of it.
In your code, p is the index of the first element and r is the index of the last element, so the range holds r - p + 1 elements. The predicate p < r is true only for a range of at least two elements. In other words, if p >= r the range has one element (p == r) or none (p > r), and there is no work to do.
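To make the base case visible, the same algorithm can be written with an early return instead of wrapping the body in if(p < r). This is only a sketch: the names partitionRange and quickSortRange are made up for illustration, and std::vector replaces the raw pointer from the question.

```cpp
#include <utility>
#include <vector>

// Lomuto partition, same scheme as in the question: the pivot is arr[r].
// Returns the final index of the pivot.
int partitionRange(std::vector<int>& arr, int p, int r)
{
    int pivot = arr[r];
    int i = p - 1;
    for (int j = p; j < r; ++j) {
        if (arr[j] <= pivot) {
            ++i;
            std::swap(arr[i], arr[j]);
        }
    }
    std::swap(arr[i + 1], arr[r]);
    return i + 1;
}

void quickSortRange(std::vector<int>& arr, int p, int r)
{
    // Base case: the range [p, r] holds one element (p == r) or none (p > r).
    if (p >= r)
        return;

    // Recursive case: place the pivot, then sort both sides of it.
    int q = partitionRange(arr, p, r);
    quickSortRange(arr, p, q - 1);
    quickSortRange(arr, q + 1, r);
}
```

A quick way to test the condition is to call the function on an empty range (p = 0, r = -1) and on a one-element range (p = r = 0): both calls should return immediately without touching the array.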

Related

Pointer back and next for a node

I'm new to C++. I'm trying to create a class with back and next pointers. My code is listed below:
#include<iostream>
using namespace std;
class Node
{
public:
Node(int d, Node*k = NULL, Node*q = NULL) :data(d), back(k), next(q){};
int data;
Node*next; // point to next value on the list
Node*back; // point to back value on the list
};
int main()
{
int n;
Node*p = NULL;
Node*k = NULL; //k is back
while (cin >> n)
{
p = new Node(n,k);
p->back->next = p;
k = p;
}
for (; p; p = p->back)
cout << p->data << "->";
cout << "*\n";
system("pause");
}
However, I always get this error: "Access violation writing location"
I wonder if anybody has a solution? Thanks
In the first iteration of the loop p->back is NULL. You get the access violation because you dereference it. Write this instead:
while (cin >> n)
{
p = new Node(n,k);
if (p->back != NULL) // p->back == NULL in the first iteration
p->back->next = p;
k = p;
}
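The same guard can be moved into a small helper so the loop body stays short. This is a sketch under assumptions: append and destroy are hypothetical helper names, not part of the question's code, and Node is redeclared here so the block stands alone.

```cpp
#include <cstddef>

struct Node {
    int data;
    Node* next; // next node in the list, or null at the tail
    Node* back; // previous node in the list, or null at the head

    Node(int d, Node* b = nullptr, Node* n = nullptr)
        : data(d), next(n), back(b) {}
};

// Appends a value after `tail` and returns the new tail.
// `tail` may be null, which means the list is still empty.
Node* append(Node* tail, int value)
{
    Node* node = new Node(value, tail);
    if (tail != nullptr)   // only link forward when a previous node exists
        tail->next = node;
    return node;
}

// Frees every node by walking the back pointers from the tail.
void destroy(Node* tail)
{
    while (tail != nullptr) {
        Node* prev = tail->back;
        delete tail;
        tail = prev;
    }
}
```

With this helper the input loop becomes `while (cin >> n) p = append(p, n);`, and the separate variable k is no longer needed.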

parallelize for loop using boost MPI

I am learning to use Boost.MPI to parallelize a large computation. Below is a simple test to see whether I have the MPI logic right, but I could not get it to work. I used world.size()=10 and there are 50 elements in total in the data array, so each process does 5 iterations. I want each process to send its updated data array to the root process, which receives the updates and prints the result. But only a few elements get updated.
Thanks for helping me.
#include <boost/mpi.hpp>
#include <iostream>
#include <cstdlib>
namespace mpi = boost::mpi;
using namespace std;
#define max_rows 100
int data[max_rows];
int modifyArr(const int index, const int arr[]) {
return arr[index]*2+1;
}
int main(int argc, char* argv[])
{
mpi::environment env(argc, argv);
mpi::communicator world;
int num_rows = 50;
int my_number;
if (world.rank() == 0) {
for ( int i = 0; i < num_rows; i++)
data[i] = i + 1;
}
broadcast(world, data, 0);
for (int i = world.rank(); i < num_rows; i += world.size()) {
my_number = modifyArr(i, data);
data[i] = my_number;
world.send(0, 1, data);
//cout << "i=" << i << " my_number=" << my_number << endl;
if (world.rank() == 0)
for (int j = 1; j < world.size(); j++)
mpi::status s = world.recv(boost::mpi::any_source, 1, data);
}
if (world.rank() == 0) {
for ( int i = 0; i < num_rows; i++)
cout << "i=" << i << " results = " << data[i] << endl;
}
return 0;
}
Your problem is probably here:
mpi::status s = world.recv(boost::mpi::any_source, 1, data);
This is the only way data can get back to the master node.
However, you do not tell the master node where in data to store the answers it is getting. Since data is the address of the array, everything should get stored in the zeroth element.
Interleaving which elements of the array you are processing on each node is a pretty bad idea. You should assign blocks of the array to each node so that you can send entire chunks of the array at once. That will reduce communication overhead significantly.
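As a rough sketch of the block assignment described above, the helper below computes which contiguous index range each rank would own. It deliberately contains no MPI calls; blockRange is a hypothetical name, and how the blocks are actually sent back to the root is left open.

```cpp
#include <utility>

// Returns the half-open index range [begin, end) owned by `rank`
// when `numItems` elements are split into `numRanks` contiguous blocks.
// If the split is uneven, the first (numItems % numRanks) ranks get one extra element.
std::pair<int, int> blockRange(int numItems, int numRanks, int rank)
{
    int base = numItems / numRanks;   // minimum block size
    int extra = numItems % numRanks;  // number of ranks that get one more
    int begin = rank * base + (rank < extra ? rank : extra);
    int end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}
```

With 50 elements and 10 processes, rank 0 would own indices 0 to 4, rank 1 indices 5 to 9, and so on, so each process could return its whole block in one message instead of one message per element.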
Also, if your issue is simply speeding up for loops, you should consider OpenMP, which can do things like this:
#pragma omp parallel for
for(int i=0;i<100;i++)
data[i]*=4;
Bam! I just split that for loop up between all of my threads with no further work needed.

Performance issues : segment tree, update function

I update a segment tree with the function below. Profiling says this is the bottleneck:
void update (int tree[], int root, int left, int right, int pos, double val)
{
if (left == right)
{
data[tree[root]] = val;
}
else
{
int middle = (left + right) / 2;
if (pos <= middle)
update(tree, root*2, left, middle, pos, val);
else
update(tree, root*2+1, middle+1, right, pos, val);
tree[root] = indexOfMax(tree, tree[root*2], tree[root*2+1]); // simple comparisons
}
}
// indexOfMax is just a simple comparison
int indexOfMax(int tree[], int a, int b)
{
//cout << data[tree[a]] << " > " << data[tree[b]] << " ? " << tree[a] << " : " << tree[b] << endl;
return data[a] > data[b] ? a : b;
}
Memory operations are fast, so I'm wondering whether this is caused by recursion overhead, even though the depth is usually no more than 20.
What I get from my primitive profiler is:
4.39434ms - average time for a single binary search over data
2642.94ms - time for a single update
19.9097ms - time for a single RMQ-query.
So.. The time spent on a single update is dramatic :).
Answer: a hidden std::find over a map was found. std::find scans the map linearly, which explains the slow update.
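The offending call is not shown in the question, so the following is only an illustration of the difference, with a hypothetical std::map<int, double> and made-up helper names. std::find_if is used here so that only the key is compared.

```cpp
#include <algorithm>
#include <map>
#include <utility>

// Generic algorithm: visits the map entry by entry, so the cost grows
// linearly with the number of elements.
bool containsLinear(const std::map<int, double>& m, int key)
{
    return std::find_if(m.begin(), m.end(),
                        [key](const std::pair<const int, double>& kv) {
                            return kv.first == key;
                        }) != m.end();
}

// Member function: uses the ordering of the tree, so the cost grows
// logarithmically with the number of elements.
bool containsTree(const std::map<int, double>& m, int key)
{
    return m.find(key) != m.end();
}
```

Both functions return the same answer. Only the member find takes advantage of the map's ordering, which matters when the lookup sits inside a function that is called for every update.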

OpenCL autocorrelation kernel

I have written a simple program that does autocorrelation as follows. I've used PGI accelerator directives to move the computation to GPUs.
//autocorrelation
void autocorr(float *restrict A, float *restrict C, int N)
{
int i, j;
float sum;
#pragma acc region
{
for (i = 0; i < N; i++) {
sum = 0.0;
for (j = 0; j < N; j++) {
if ((i+j) < N)
sum += A[j] * A[i+j];
else
continue;
}
C[i] = sum;
}
}
}
I wrote a similar program in OpenCL, but I am not getting correct results. The program is as follows. I am new to GPU programming, so apart from hints that could fix my error, any other advice is welcome.
__kernel void autocorrel1D(__global double *Vol_IN, __global double *Vol_AUTOCORR, int size)
{
int j, gid = get_global_id(0);
double sum = 0.0;
for (j = 0; j < size; j++) {
if ((gid+j) < size)
{
sum += Vol_IN[j] * Vol_IN[gid+j];
}
else
continue;
}
barrier(CLK_GLOBAL_MEM_FENCE);
Vol_AUTOCORR[gid] = sum;
}
Since I have passed the dimension as 1, I am assuming my get_global_id(0) call gives me the id of the current block, which is used to access the input 1D array.
Thanks,
Sayan
The code is correct. As far as I know, it should run fine and give correct results.
barrier(CLK_GLOBAL_MEM_FENCE); is not needed. You'll get more speed without that statement.
Your problem should be outside the kernel. Check that you are passing the input correctly and that you are reading the correct data back from the GPU.
BTW, I suppose you are using a GPU that supports double precision, since you are doing double calculations.
Check that you are also passing double values. Remember you can't point a float pointer at double data, or vice versa. That will give you wrong results.
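One way to narrow down whether the fault is in the kernel or in the host code is to compare the buffer read back from the GPU against a plain CPU version of the same loop. The sketch below is host-side C++ only; autocorrReference is a hypothetical name and the OpenCL setup is not shown.

```cpp
#include <cstddef>
#include <vector>

// CPU reference for the kernel in the question:
// out[lag] = sum over j of in[j] * in[j + lag], for all j with j + lag < n.
std::vector<double> autocorrReference(const std::vector<double>& in)
{
    const std::size_t n = in.size();
    std::vector<double> out(n, 0.0);
    for (std::size_t lag = 0; lag < n; ++lag) {
        double sum = 0.0;
        for (std::size_t j = 0; j + lag < n; ++j)
            sum += in[j] * in[j + lag];
        out[lag] = sum;
    }
    return out;
}
```

If the host buffers are double on both sides and the GPU output still differs from this reference beyond rounding error, the transfer code (buffer sizes, element type, read-back call) is the next place to look.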

Runtime allocation of multidimensional array

So far I thought that the following syntax was invalid,
int B[ydim][xdim];
But today I tried it and it worked! I ran it many times to make sure it did not work by chance, and even valgrind didn't report any segfault or memory leak! I am very surprised. Is it a new feature introduced in g++? I have always used 1D arrays to store matrices, indexing them with the correct strides as done with A in the program below. But this new method, as with B, is the simple and elegant syntax I have always wanted. Is it really safe to use? See the sample program.
PS. I am compiling it with g++-4.4.3, if that matters.
#include <cstdlib>
#include <iostream>
int test(int ydim, int xdim) {
// Allocate 1D array
int *A = new int[xdim*ydim](); // with C++ new operator
// int *A = (int *) malloc(xdim*ydim * sizeof(int)); // or with C style malloc
if (A == NULL)
return EXIT_FAILURE;
// Declare a 2D array of variable size
int B[ydim][xdim];
// populate matrices A and B
for(int y = 0; y < ydim; y++) {
for(int x = 0; x < xdim; x++) {
A[y*xdim + x] = y*xdim + x;
B[y][x] = y*xdim + x;
}
}
// read out matrix A
for(int y = 0; y < ydim; y++) {
for(int x = 0; x < xdim; x++)
std::cout << A[y*xdim + x] << " ";
std::cout << std::endl;
}
std::cout << std::endl;
// read out matrix B
for(int y = 0; y < ydim; y++) {
for(int x = 0; x < xdim; x++)
std::cout << B[y][x] << " ";
std::cout << std::endl;
}
delete []A;
// free(A); // or in C style
return EXIT_SUCCESS;
}
int main() {
return test(5, 8);
}
int B[ydim][xdim] declares a 2-d array on the stack. new, on the other hand, allocates the array on the heap.
For any non-trivial array size, it's almost certainly better to have it on the heap, both to avoid running out of stack space and so that you can pass the array back to something outside the current scope.
This is a C99 'variable length array' or VLA. VLAs are not part of standard C++; g++ accepts them as an extension.
Nice, aren't they?
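If portability matters, a similar two-index syntax can be had in standard C++ by wrapping a heap-allocated std::vector. This is a minimal sketch and not something from the answers above; the class name Matrix2D and its members are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Row-major 2D matrix with runtime dimensions, stored on the heap.
class Matrix2D {
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    // Element access with the same stride arithmetic as A[y*xdim + x].
    int& at(std::size_t y, std::size_t x) { return cells_[y * cols_ + x]; }
    int at(std::size_t y, std::size_t x) const { return cells_[y * cols_ + x]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> cells_; // freed automatically when the matrix goes out of scope
};
```

Usage would look like `Matrix2D B(ydim, xdim); B.at(y, x) = y*xdim + x;`. The storage does not depend on stack size, and the object can be returned from a function.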
