Octave - comparing vectors (element by element)

How do I compare 2 vectors of equal length? I want to get the number of elements (at the same position in both vectors) that differ.
Example:
x=[1 0 0 1 1]
y=[1 0 1 1 0]
The result should be 2, since the 3rd and 5th elements of the two vectors differ.

One possible solution:
x==y will return a vector of length length(x) (or length(y) since x and y are the same length) with 1 where x(i)==y(i) and 0 where x(i)~=y(i):
>> x==y
ans =
1 1 0 1 0
So all you need to do is sum the elements of x==y and subtract that from length(x):
>> length(x)-sum(x==y)
ans = 2
Arnaud

Compare the two matrices (or vectors) with
z = eq(x, y) % returns 1 for a match and 0 for a mismatch
which returns a matrix z of 0s and 1s. Finally, count the number of zeros in it:
sum(z == 0) % count the non-matching elements

sum(ne(x, y)) % count the elements that differ
gives 2

Related

Sum from each 0 position in a 0 1 vector in R

I am trying to calculate the sum of all 1s after each 0 in a vector of 0s and 1s.
E.g.,
0 0 1 0 1
would be:
2 (all 1s after the 1st zero) + 2 (all 1s after the 2nd zero) + 1 (the single 1 after the 3rd zero, which sits at the 4th position) = 5
So not 6, which is what you would get if you simply summed the whole vector once for each 0 it contains (3 * 2).
This is what I have tried, but it does not work:
a <- rbinom(10, 1, 0.5)
counter <- 0
for (i in a){
if(i == 0)
counter <- counter + sum(a[i:10])
}
print(counter)
I first create a vector of 10 random 0s and 1s. I make a counter that starts at 0, and then I try to add the sum from each position i up to the final (10th) position, but only when the element at that position equals 0.
What it actually does is just calculate the sum of all 1s for each 0 in the vector.
Thanks for any help on this!
Given
x <- c(0, 0, 1, 0, 1)
Here is a vectorized way
sum(rev(cumsum(rev(x))) * !x)
#[1] 5
Or using this input
set.seed(1)
a <- rbinom(10, 1, 0.5)
a
# [1] 0 0 1 1 0 1 1 1 1 0
The result is
sum(rev(cumsum(rev(a))) * !a)
# [1] 16
Step by step
When we calculate the reversed cumulative sum of rev(x) we get
rev(cumsum(rev(x)))
# [1] 2 2 2 1 1
The result shows, for each element of x, how many 1s there are from that position to the end of the vector.
The idea behind multiplying this vector by !x is that we only want to sum those elements for which x is zero, i.e. where !x is TRUE.
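For reference, the logical mask on the example x is:
!x
# [1]  TRUE  TRUE FALSE  TRUE FALSE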
Result
rev(cumsum(rev(x))) * !x
# [1] 2 2 0 1 0
Summing this up gives the desired output.
Try this:
a <- rbinom(10, 1, 0.5)
counter <- 0
for (i in seq_along(a)){
if(a[i] == 0)
counter <- counter + sum(a[i:10])
}
print(counter)
In your example, i is not an index; it is a value of the vector a. So a[i:10] gives either a[0:10] or a[1:10].
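For illustration, here is roughly what the two loop forms do (using the same rbinom setup):
a <- rbinom(10, 1, 0.5)
for (i in a) print(i)             # i runs over the VALUES of a: only 0s and 1s
for (i in seq_along(a)) print(i)  # i runs over the POSITIONS: 1, 2, ..., 10
# so in the original loop, a[i:10] is either a[0:10] or a[1:10],
# both of which are simply the whole vector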

finding n-1 matrix elements in R

Probably it's a very simple question with a very simple answer, but I just can't figure it out by myself. I have a matrix called 'hz' with 1 column and 115 rows (hz[1:115, 1]), and I'm trying to find the values preceding those that are smaller than 1 and replace them. I did the following:
hz[c(hz < 1)], which got me 11 values.
Then I tried to find the preceding ones with hz[c(hz < 1) - 1]; I expected 11 values but got 114.
If I try to find specific elements like hz[c(6, 26, 36)] and the preceding ones hz[c(6, 26, 36) - 1], I get 3 values in both cases, as expected. So what's the difference? Is it a problem that I have a condition (< 1) in the index?
Thank you for your help!
Viktor
Basically you want hz[which(hz < 1) - 1].
Note, hz < 1 returns a logical vector, i.e., TRUE / FALSE. If you take the subtraction (hz < 1) - 1, TRUE is treated as 1 and FALSE as 0, so you get a numeric vector of 0s and -1s rather than positions; indexing with it ignores the 0s and lets the -1 exclude the first element, which is why you got 114 values instead of 11. Applying which to a logical vector gives you the positions (as integers) of the TRUE entries, which is what you want.
Consider the following demonstration:
x <- 1:5
x < 3
#[1] TRUE TRUE FALSE FALSE FALSE
(x < 3) - 1
#[1] 0 0 -1 -1 -1
which(x < 3)
#[1] 1 2
which(x < 3) - 1
#[1] 0 1
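Applied to the original hz problem, the replacement step might look like this (a sketch; the replacement value 0 is just a placeholder, and the guard covers the case where the very first element is already below 1):
idx <- which(hz < 1) - 1
idx <- idx[idx > 0]   # if hz[1] < 1, it has no preceding element
hz[idx] <- 0          # placeholder: use whatever replacement value you actually need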

Fastest way to find switching from positive to negative in a vector in R

I have a vector that contains both positive and negative values. For example something like
x = c(1,2,1,-2,-3,3,-4,5,1,1,-3)
And now I want to flag the positions in the vector where the value changes from positive to negative or from negative to positive. So for the example above I would want a vector of 0/1 flags that looks like this:
y=c(0,0,0,1,0,1,1,1,0,0,1)
I am doing this in R so if possible I would like to avoid using for-loops.
I think this should work:
+(c(0, diff(sign(x))) != 0)
#[1] 0 0 0 1 0 1 1 1 0 0 1
all.equal(+(c(0, diff(sign(x))) != 0), y)
#[1] TRUE
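If the one-liner is hard to parse, here are the intermediate steps on the example x (note this relies on x containing no exact zeros, since sign(0) is 0):
sign(x)
# [1]  1  1  1 -1 -1  1 -1  1  1  1 -1
diff(sign(x))
# [1]  0  0 -2  0  2 -2  2  0  0 -2
c(0, diff(sign(x))) != 0
# [1] FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE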
Here's one way:
yy = rep(0, length(x))
yy[with(rle(sign(x)),{ p = cumsum(c(1,lengths)); p[ -c(1,length(p)) ] })] = 1
all.equal(yy,y) # TRUE
...which turned out more convoluted than I expected at first.

Generate N random integers that sum to M in R

I would like to generate N random positive integers that sum to M. I would like the random positive integers to be selected around a fairly normal distribution whose mean is M/N, with a small standard deviation (is it possible to set this as a constraint?).
Finally, how would you generalize the answer to generate N random positive numbers (not just integers)?
I found other relevant questions, but couldn't determine how to apply their answers to this context:
https://stats.stackexchange.com/questions/59096/generate-three-random-numbers-that-sum-to-1-in-r
Generate 3 random number that sum to 1 in R
R - random approximate normal distribution of integers with predefined total
Normalize.
rand_vect <- function(N, M, sd = 1, pos.only = TRUE) {
  # draw N values around the target mean M/N
  vec <- rnorm(N, M/N, sd)
  # guard against a sum near zero before rescaling (the division below would blow up)
  if (abs(sum(vec)) < 0.01) vec <- vec + 1
  # rescale so the values sum to M, then round to integers
  vec <- round(vec / sum(vec) * M)
  # rounding can miss M by a few units; nudge randomly chosen elements by +/- 1 until it matches
  deviation <- M - sum(vec)
  for (. in seq_len(abs(deviation))) {
    vec[i] <- vec[i <- sample(N, 1)] + sign(deviation)
  }
  # optionally remove negatives by moving one unit at a time from a positive to a negative element
  if (pos.only) while (any(vec < 0)) {
    negs <- vec < 0
    pos  <- vec > 0
    vec[negs][i] <- vec[negs][i <- sample(sum(negs), 1)] + 1
    vec[pos][i]  <- vec[pos ][i <- sample(sum(pos ), 1)] - 1
  }
  vec
}
For a continuous version, simply use:
rand_vect_cont <- function(N, M, sd = 1) {
vec <- rnorm(N, M/N, sd)
vec / sum(vec) * M
}
Examples
rand_vect(3, 50)
# [1] 17 16 17
rand_vect(10, 10, pos.only = FALSE)
# [1] 0 2 3 2 0 0 -1 2 1 1
rand_vect(10, 5, pos.only = TRUE)
# [1] 0 0 0 0 2 0 0 1 2 0
rand_vect_cont(3, 10)
# [1] 2.832636 3.722558 3.444806
rand_vect(10, -1, pos.only = FALSE)
# [1] -1 -1 1 -2 2 1 1 0 -1 -1
I just came up with an algorithm to generate N random numbers greater than or equal to k whose sum is S, in a uniformly distributed manner. I hope it will be of use here!
First, generate N-1 random numbers between k and S - k(N-1), inclusive. Sort them in descending order. Then, for all x_i with i <= N-2, apply x'_i = x_i - x_(i+1) + k, and set x'_(N-1) = x_(N-1) (use two buffers). The Nth number is just S minus the sum of all the obtained quantities. This has the advantage of giving the same probability to all possible combinations. If you want positive integers, use k = 0 (or maybe 1?). If you want reals, use the same method with a continuous RNG. If your numbers are to be integers, you may care about whether they can or cannot be equal to k. Best wishes!
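A minimal R sketch of this construction for the integer case (the function name, argument names, and the sample.int plumbing are mine; it assumes k*N <= S):
rand_sum_int <- function(N, S, k = 1) {
  stopifnot(N >= 2, k * N <= S)
  # the N-1 auxiliary draws live between k and S - k*(N-1), inclusive
  vals <- k:(S - k * (N - 1))
  x <- sort(vals[sample.int(length(vals), N - 1, replace = TRUE)], decreasing = TRUE)
  # consecutive differences of the sorted draws, shifted back up by k
  out <- c(x[-(N - 1)] - x[-1] + k, x[N - 1])
  # the Nth number takes whatever is left of S
  c(out, S - sum(out))
}
rand_sum_int(5, 20)       # five integers, each >= 1
sum(rand_sum_int(5, 20))  # always exactly 20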
Explanation: by taking out one of the numbers, all the combinations of values which allow a valid Nth number form a simplex when represented in (N-1)-space, lying at one vertex of an (N-1)-cube (the (N-1)-cube described by the range of the random values). After generating them, we have to map all points in the (N-1)-cube to points in the simplex. For that purpose, I have used one method of triangulation which involves all possible permutations of coordinates in descending order. By sorting the values, we are mapping all (N-1)! simplices to only one of them. We also have to translate and scale the numbers vector so that all coordinates lie in [0, 1], by subtracting k and dividing the result by S - kN. Let us name the new coordinates y_i.
Then we apply the transformation by multiplying the inverse matrix of the original basis, something like this:
    / 1 1 1 \           / 1 -1  0 \
B = | 0 1 1 |,  B^-1 =  | 0  1 -1 |,  Y' = B^-1 Y
    \ 0 0 1 /           \ 0  0  1 /
This gives y'_i = y_i - y_(i+1). When we rescale the coordinates, we get:
x'_i = y'_i (S - kN) + k = y_i (S - kN) - y_(i+1) (S - kN) + k = (x_i - k) - (x_(i+1) - k) + k = x_i - x_(i+1) + k,
hence the formula above. This is applied to all elements except the last one.
Finally, we should take into account the distortion that this transformation introduces into the probability distribution. Actually, and please correct me if I'm wrong, the transformation applied to the first simplex to obtain the second should not alter the probability distribution. Here is the proof.
The probability increase at any point is the increase in the volume of a local region around that point as the size of the region tends to zero, divided by the total volume increase of the simplex. In this case, the two volumes are the same (just take the determinants of the basis vectors). The probability distribution will be the same if the linear increase of the region volume is always equal to 1. We can calculate it as the determinant of the transpose matrix of the derivative of a transformed vector V' = B^-1 V with respect to V, which, of course, is B^-1.
Calculation of this determinant is quite straightforward, and it gives 1, which means that the points are not distorted in any way that would make some of them more likely to appear than others.
I figured out what I believe to be a much simpler solution. You first generate random integers from your minimum to maximum range, count how many times each value occurs, and then return the vector of counts (including zeros). That vector has one entry per value in the range, and its entries sum to the requested total.
Note that this solution may include zeros even if the minimum value is greater than zero.
Hope this helps future R people with this problem :)
rand.vect.with.total <- function(min, max, total) {
  # generate 'total' random draws from the range min:max
  x <- sample(min:max, total, replace = TRUE)
  # count how often each value occurs
  sum.x <- table(x)
  # build one count per value in min:max (look up counts by value, not by loop index)
  vals <- min:max
  out <- vector()
  for (i in seq_along(vals)) {
    out[i] <- sum.x[as.character(vals[i])]
  }
  # values that were never drawn get a count of 0
  out[is.na(out)] <- 0
  return(out)
}
rand.vect.with.total(0, 3, 5)
# [1] 3 1 1 0
rand.vect.with.total(1, 5, 10)
#[1] 4 1 3 0 2

Resources