MPI combine vector from all processors

I am coding an MPI Fortran program and have, let's say, three vectors of different lengths on three ranks. I would like to concatenate them, such as:
Rank 0: a0 = [1 2 3 4 5]
Rank 1: a1 = [3 5 7 9]
Rank 2: a2 = [2 4 6 8 10 12]
Combine them to:
Rank 0: a = [1 2 3 4 5 3 5 7 9 2 4 6 8 10 12]
Could you please tell me how I can do that?

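Since the vectors have different sizes on each rank, you can use MPI_Gatherv() to achieve the expected result. Unlike MPI_Gather(), it takes a receive count and a displacement for each rank, so every rank can contribute a different number of elements.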
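To make the role of the receive counts and displacements concrete, here is a minimal sketch in plain Python. It does not call MPI at all; it only simulates how the root's receive buffer would be laid out. The function name gatherv_layout and the variable names are hypothetical, chosen for illustration.

```python
# Simulation only: no MPI library is involved. All names are hypothetical.

def gatherv_layout(local_vectors):
    """Return (recvcounts, displs, recvbuf) as the root would see them."""
    # recvcounts[i] is the number of elements contributed by rank i.
    recvcounts = [len(v) for v in local_vectors]

    # displs[i] is the offset in the receive buffer where rank i's data starts.
    displs = []
    offset = 0
    for count in recvcounts:
        displs.append(offset)
        offset += count

    # Place each rank's vector at its displacement in the receive buffer.
    recvbuf = [None] * offset
    for rank, vec in enumerate(local_vectors):
        start = displs[rank]
        recvbuf[start:start + recvcounts[rank]] = vec
    return recvcounts, displs, recvbuf


vectors = [[1, 2, 3, 4, 5], [3, 5, 7, 9], [2, 4, 6, 8, 10, 12]]
recvcounts, displs, a = gatherv_layout(vectors)
print(recvcounts)  # [5, 4, 6]
print(displs)      # [0, 5, 9]
print(a)           # [1, 2, 3, 4, 5, 3, 5, 7, 9, 2, 4, 6, 8, 10, 12]
```

In a real program the root usually does not know the lengths in advance, so a common pattern is to collect them first (for example with MPI_Gather on each rank's length) and then build the displacements as above.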


I want to create 2D array with 5 rows by 1 column

If I want to create a 2D array with 1 row and 5 columns, I could do this:
julia> a = [1 2 3 4 5]
1×5 Array{Int64,2}:
1 2 3 4 5
But to create a 2D array with 5 rows and 1 column, I have tried:
julia> b = [1; 2; 3; 4; 5]
5-element Array{Int64,1}:
1
2
3
4
5
But I got back a 1D array, which is NOT what I wanted.
The only way to get it to work is:
julia> b=reshape([1 2 3 4 5],5,1)
5×1 Array{Int64,2}:
1
2
3
4
5
Perhaps I am missing some crucial information here.
You could also do a = [1 2 3 4 5]'.
On a side note, for Julia versions > 0.6 the type of a wouldn't be Array{Int64, 2} but a LinearAlgebra.Adjoint{Int64,Array{Int64,2}} as conjugate transpose is lazy in this case. One can get <= 0.6 behavior by a = copy([1 2 3 4 5]').
AFAIK there is no syntactic sugar for it.
I usually write:
hcat([1, 2, 3, 4, 5])
which is short and I find it easy to remember.
If you use reshape you can replace one dimension with : which means you do not have to count (it is useful e.g. when you get an input vector as a variable):
reshape([1 2 3 4 5], :, 1)
Finally you could use:
permutedims([1 2 3 4 5])

Operation between two dataframe with different size in R

I'd like to sum two dataframe with different size in R.
> x = data.frame(a=c(1,2,3),b=c(5,6,7))
> y = data.frame(x=c(1,1,1))
> x
a b
1 1 5
2 2 6
3 3 7
> y
x
1 1
2 1
3 1
The result I want is,
>
a b
1 2 6
2 3 7
3 4 8
How can I do this?
Maybe easiest to convert y to a vector with unlist and then perform the operation. Here, the vector in unlist(y) will be recycled over the columns of the data.frame x.
x + unlist(y)
a b
1 2 6
2 3 7
3 4 8
As a side note, data.frames are a special type of list object, and performing operations on lists can sometimes be a bit more involved. On the other hand, they tend to work fairly well with vectors as long as the dimensions line up (here, as long as the vector has the same length as the number of rows in the data.frame).
We can make the dimensions same and then get the sum
x + rep(y, ncol(x))
# a b
#1 2 6
#2 3 7
#3 4 8
Or another option is sweep
sweep(x, 1, y$x, `+`)
# a b
#1 2 6
#2 3 7
#3 4 8

Kolmogorov-Smirnov Test gives result different from max(abs(difference(x, y)))

I'm using the ks.test function in R to perform a Kolmogorov-Smirnov test. I found that the Kolmogorov-Smirnov test gives a result different from
max(abs(difference(x, y)))
According to the definition of Kolmogorov-Smirnov Test in Wikipedia, the results should be equivalent.
Does anyone know why?
The KS statistic is not supposed to be equal to max(|x-y|).
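It is computed from the empirical cumulative distribution functions (CDFs): the statistic is the largest absolute difference between the two CDFs, not between the raw values. It therefore reflects the proportion of observations that differ between the two samples (or between a sample and a reference distribution) rather than the magnitude of the differences.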
See the two examples below executed in MATLAB (although I expect the results to be identical in R):
x = [1 2 3 4 5 6 7 8 9 10];
y = [1 2 3 4 5 6 7 8 9 11];
[~, ~, ks2s] = kstest2(x,y)
ks2s =
0.1000 (1)
x = [1 2 3 4 5 6 7 8 9 10];
y = [1 2 3 4 5 6 7 8 9 12];
[~, ~, ks2s] = kstest2(x,y)
ks2s =
0.1000 (2)
Thus, although the maximum absolute difference between the values of x and y is larger in (2), the KS statistic is the same: in both cases the two empirical CDFs differ by at most one step of 1/10, because only one of the ten samples differs.
If y has an extra sample, for example, the result changes:
x = [1 2 3 4 5 6 7 8 9 10];
y = [1 2 3 4 5 6 7 8 9 10 11];
[h, p, ks2s] = kstest2(x,y)
ks2s =
0.0909
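The three results above can be reproduced by computing the statistic directly from the empirical CDFs. The following is a minimal sketch in plain Python, not the implementation used by kstest2 or ks.test; the function name ks_two_sample_statistic is hypothetical. It relies on the fact that the largest gap between two step functions is reached at one of the data points.

```python
from bisect import bisect_right


def ks_two_sample_statistic(x, y):
    """Largest absolute difference between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    # Both empirical CDFs only change at data points, so checking those suffices.
    for t in xs + ys:
        fx = bisect_right(xs, t) / len(xs)  # fraction of x values <= t
        fy = bisect_right(ys, t) / len(ys)  # fraction of y values <= t
        d = max(d, abs(fx - fy))
    return d


x = list(range(1, 11))
d1 = ks_two_sample_statistic(x, [1, 2, 3, 4, 5, 6, 7, 8, 9, 11])
d2 = ks_two_sample_statistic(x, [1, 2, 3, 4, 5, 6, 7, 8, 9, 12])
d3 = ks_two_sample_statistic(x, list(range(1, 12)))
print(f"{d1:.4f} {d2:.4f} {d3:.4f}")  # 0.1000 0.1000 0.0909
```

Changing the last value of y from 11 to 12 moves the raw values further apart but does not change either CDF's step heights, which is why the first two results agree.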

Arguments for Subset within a function in R colon v. greater or equal to

Suppose I have the following data.
x<- c(1,2, 3,4,5,1,3,8,2)
y<- c(4,2, 5,6,7,6,7,8,9)
data<-cbind(x,y)
x y
1 1 4
2 2 2
3 3 5
4 4 6
5 5 7
6 1 6
7 3 7
8 8 8
9 2 9
Now, if I subset this data to select only the observations with "x" between 1 and 3 I can do:
s1<- subset(data, x>=1 & x<=3)
and obtain my desired output:
x y
1 1 4
2 2 2
3 3 5
4 1 6
5 3 7
6 2 9
However, if I subset using the colon operator I obtained a different result:
s2<- subset(data, x==1:3)
x y
1 1 4
2 2 2
3 3 5
This time it only includes the first observation in which "x" was 1,2, or 3. Why?
I would like to use the ":" operator because I am writing a function so the user would input a range of values from which she wants to see an average calculated over the "y" variable. I would prefer if they can use ":" operator to pass this argument to the subset function inside my function but I don't know why subsetting with ":" gives me different results.
I'd appreciate any suggestions in this regard.
You can use %in% instead of ==
subset(data, x %in% 1:3)
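With ==, the shorter vector 1:3 is recycled to the length of x and compared element by element, so a row is kept only when x happens to equal the recycled value at the same position; here that is true only for the first three rows. In general, if we are comparing two vectors of unequal sizes, %in% should be used. There are cases where we can take advantage of the recycling (it can fail too) if the length of one vector is a multiple of the length of the other. Some examples with descriptions are here.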
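The difference between the two comparisons can be mimicked outside R. The sketch below, in plain Python, imitates R's recycling for == and the membership test of %in% on the data from the question. The helper names recycled_equal and is_in are hypothetical and only approximate R's behaviour for this simple case.

```python
from itertools import cycle


def recycled_equal(x, values):
    """Imitate R's `x == values`: values is repeated until it is as long as x."""
    return [xi == v for xi, v in zip(x, cycle(values))]


def is_in(x, values):
    """Imitate R's `x %in% values`: test each element of x for membership."""
    lookup = set(values)
    return [xi in lookup for xi in x]


x = [1, 2, 3, 4, 5, 1, 3, 8, 2]
y = [4, 2, 5, 6, 7, 6, 7, 8, 9]
wanted = [1, 2, 3]

eq_mask = recycled_equal(x, wanted)  # compares x against 1,2,3,1,2,3,1,2,3
in_mask = is_in(x, wanted)

rows_eq = [(a, b) for a, b, keep in zip(x, y, eq_mask) if keep]
rows_in = [(a, b) for a, b, keep in zip(x, y, in_mask) if keep]
print(rows_eq)  # [(1, 4), (2, 2), (3, 5)]
print(rows_in)  # [(1, 4), (2, 2), (3, 5), (1, 6), (3, 7), (2, 9)]
```

The first result matches s2 in the question and the second matches s1, which is the behaviour the function's users would expect when passing a range.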

Gathering results of MPI_SCAN

I have this array [1 2 3 4 5 6 7 8 9] and I am performing a scan operation on it.
I have 3 MPI tasks and each task gets 3 elements. Each task then calculates its own scan and returns the result to the master task:
task 0 - [1 2 3] => [1 3 6]
task 1 - [4 5 6 ] => [4 9 15]
task 2 - [7 8 9] => [7 15 24]
Now task 0 gets all the results [1 3 6] [4 9 15] [7 15 24]
How can I combine these results to produce final scan output?
final scan output of array would be [1 3 6 10 15 21 28 36 45]
Can anyone help me please?
Are you trying to implement your own scan operation? This is not what MPI_SCAN does. It applies the scan operation element-wise across ranks, over the i-th element of the input array stored on each rank, so the result will be more like:
rank 0 - [1 2 3] => [ 1 2 3]
rank 1 - [4 5 6] => [ 5 7 9]
rank 2 - [7 8 9] => [12 15 18]
Nevertheless, in order to obtain the result that you want, you should add 6 (the last element from the first scan in task 0) to all elements in the next scans:
[ 1 3 6][ 4 9 15][ 7 15 24]
+6 -------------->
=
[ 1 3 6][10 15 21][13 21 30]
Then you should add 15 (the last element from the scan in task 1 before 6 was added) to all elements in the next scans and so forth.
[ 1 3 6][10 15 21][13 21 30]
+15 ---->
=
[ 1 3 6][10 15 21][28 36 45]
Alternatively you could add 6 only to the results from the second scan, then add 21 to the results from the third scan and so forth.
Maybe you can find some clever way to do that using MPI operations.
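The second variant (add the last element of the already corrected previous chunk to every element of the next one) can be sketched as follows. This is plain Python that runs on the gathered results on the master task, not MPI code; the function name combine_local_scans is hypothetical.

```python
def combine_local_scans(local_scans):
    """Turn per-task inclusive scans into one global inclusive scan."""
    combined = []
    offset = 0  # sum of all elements held by the earlier tasks
    for scan in local_scans:
        # Shift this task's scan by the total of everything before it.
        shifted = [offset + value for value in scan]
        combined.extend(shifted)
        if shifted:
            # The last shifted value is the running total so far.
            offset = shifted[-1]
    return combined


gathered = [[1, 3, 6], [4, 9, 15], [7, 15, 24]]
final_scan = combine_local_scans(gathered)
print(final_scan)  # [1, 3, 6, 10, 15, 21, 28, 36, 45]
```

If the correction should happen on each task instead of on the master, one possibility is to compute each task's offset with an exclusive prefix sum of the chunk totals (MPI_Exscan exists for this purpose) and add it locally before gathering; that is a suggestion, not something tested here.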
