Random number generation - constrained sequence - r

I'm trying to produce a set of 480 random integers between 1 and 9. However, there are some constraints:
the sequence cannot contain the same digit twice in a row.
the sequence must include exactly 4 sequences of odd numbers and 4 sequences of even numbers (in any order) within every 80 digit sequence (e.g. 6 4 5 2 4 8 3 4 6 9 1 5 4 6 1).
I have been able to produce a set of random numbers that allows repeated digits, using:
NumRep <- sample(1:9, 480, replace=T)
but I have not been able to work out how to allow digits to be repeated over the entire set while disallowing sequential repeats (e.g. 2 5 3 would be okay, 2 5 5 would not). I have got nowhere with the odd/even constraint.
For context, this is not homework! I am a researcher, and this is part of a psychological experiment that I am creating.
Any help would be greatly appreciated!

First, note that under those conditions the sequence is no longer purely "random". Anyway, this code handles the first constraint:
# Length of the vector
n <- 480
# Build a vector of that length
C <- integer(n)
# The first element
C[1] <- sample(1:9, 1)
# Complete the rest
for (i in 2:n) {
  # Take a random number not equal to the previous one
  C[i] <- sample(setdiff(1:9, C[i - 1]), 1)
}
# Is each number even? (TRUE = even, FALSE = odd)
C %% 2 == 0
# How many consecutive even (or odd) numbers are in the sequence?
# Build a run-length table with this information
TAB <- rle(C %% 2 == 0)
# Maximum of consecutive even numbers
max(TAB$lengths[TAB$values])
# Maximum of consecutive odd numbers
max(TAB$lengths[!TAB$values])
I don't understand the second constraint, but I hope the final part of the code helps. You should use that information in order to interchange some values.
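As a quick sanity check on the above, here is a minimal sketch that wraps the loop in a function, verifies that no digit repeats immediately, and counts the odd and even runs in each block of 80 digits. The function name `constrained_sample`, the seed, and the choice to count every run (including runs of length 1) are my assumptions, since the second constraint is not fully specified.

```r
# Hypothetical helper: n digits from `digits` with no immediate repeats
constrained_sample <- function(n, digits = 1:9) {
  out <- integer(n)
  out[1] <- sample(digits, 1)
  for (i in seq_len(n)[-1]) {
    # Draw from all digits except the previous one
    out[i] <- sample(setdiff(digits, out[i - 1]), 1)
  }
  out
}

set.seed(1)  # arbitrary seed, only for reproducibility
C <- constrained_sample(480)

# TRUE if no two neighbouring digits are equal
no_repeats <- all(diff(C) != 0)

# Count runs of odd and of even digits inside each block of 80
blocks <- split(C, rep(1:6, each = 80))
run_counts <- sapply(blocks, function(b) {
  r <- rle(b %% 2 == 0)
  c(odd_runs = sum(!r$values), even_runs = sum(r$values))
})
```

`run_counts` is a 2 x 6 matrix (one column per block), which shows how far each block is from whatever target the odd/even constraint requires.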

Related

R: least number of sets with decimal numbers on some condition?

Consider
> data.frame(n=runif(6),m=1:6)
           n m
1 0.44000000 1
2 0.12102262 2
3 0.95483015 3
4 0.35628753 4
5 0.55000000 5
6 0.50189420 6
where you want to form the least number of sets of decimal numbers such that the sum of the numbers in each set is less than 1.
Here is an example trial at finding the partitions. It is not necessarily an optimal way to find them (particularly with bigger sets).
For example, one partition is the set containing only number 3, because 0.95483015 < 1. Another partition is the set of 5 and 1, because 0.55 + 0.44 < 1. The remaining numbers go into a third partition, such that
partition: 3
partition: 5,1
partition: 2,4,6
Now I have a big list of numbers like that, which I need to split into the least number of partitions (sets of decimal numbers).
Does there exist some R package to find partitions with some optimal criteria like the least number of partitions with some condition?
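The trial above looks like a bin-packing problem with capacity 1. As an illustration only, here is a sketch of the first-fit decreasing heuristic, which I am swapping in for the hand-made trial; it is not guaranteed to find the minimum number of sets, and it is not taken from any package. The function name `first_fit_decreasing` is hypothetical, and the sketch assumes every value is itself below the capacity.

```r
# First-fit decreasing heuristic: returns a list of index vectors,
# one per set, so that the values in each set sum to less than `capacity`
first_fit_decreasing <- function(x, capacity = 1) {
  bins <- list()        # indices placed in each set
  loads <- numeric(0)   # current sum of each set
  for (i in order(x, decreasing = TRUE)) {
    fits <- which(loads + x[i] < capacity)
    if (length(fits) > 0) {
      b <- fits[1]                    # first set with enough room
      bins[[b]] <- c(bins[[b]], i)
      loads[b] <- loads[b] + x[i]
    } else {
      bins[[length(bins) + 1]] <- i   # open a new set
      loads <- c(loads, x[i])
    }
  }
  bins
}

n <- c(0.44, 0.12102262, 0.95483015, 0.35628753, 0.55, 0.50189420)
parts <- first_fit_decreasing(n)
```

For the six numbers above this yields three sets, {3}, {5, 1} and {6, 4, 2}, which is the same partition as in the trial.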

Picking random elements from a vector but excluding certain numbers each time

I have this vector
K=c(1,2,3,4,5,6,8,10,12,14)
I want to pick 2 random elements from K such that my output never includes 6 or 14 (or both). How can I do this so that the output is the same as if I had used
S=c(1,2,3,4,5,8,10,12)
sample(S,2)
You may take 6 and 14 out of the vector of candidates to sample from, as in
sample(setdiff(K, c(6, 14)), 2)

Largest Permutation in k steps (R)

I have a problem where I would like to swap 2 numbers in a set up to k times such that each swap gives the largest possible permutation, and then print the result after k swaps. For example with k=2, for the set (1,4,2,5,3,3), in step 1 I would swap (1,5) to create (5,4,2,1,3,3). In step 2 I would swap (2,3) to create (5,4,3,1,3,2). If after n < k swaps we already have the largest permutation, e.g. (5,4,3,3,2,1), then we stop.
So far this is what I have:
x <- y <- c(1,4,2,5,3,3)
n <- length(x); k <- 2
sx <- sort(x, decreasing=TRUE)
if (k >= n) { cat(sx) } else { ### If we have more operations than numbers?
  i <- 0
  m <- max(y)
  while (i < k) {
    if (all(sx == y)) { break }
    i <- i + 1
    a <- max(which(y == m))
    while (length(which(y[c(1:a)] < m)) == 0) {
      m <- m - 1
      a <- which(y == m) ### Location of the largest number
      if (length(a) == 0) {
        a <- 1
        next
      }
      a <- max(a)
    }
    y[c(min(which(y[c(1:a)] < m)), a)] <- y[c(a, min(which(y[c(1:a)] < m)))]
  }
  cat(y)
}
5 4 3 1 3 2
Essentially the code finds the max of the current set. Then finds the right most occurrence of this max number. Then finds the left most number lower than the max number. Then switches them. This continues until we have performed k steps OR we have the largest permutation before k. Then prints it.
This code works but takes too long if there are more than 10^4 digits for large k. Is there a way to reduce complexity to O(n) in R?
Here's an O(n log n)-time algorithm (O(n) is impossible with comparisons only, though maybe you know something about the input numbers).
For each subarray of the permutation having (zero-indexed) bounds [i 2^j, (i+1) 2^j) for some integers i, j, store the minimum and the maximum value in the subarray (i.e., set up a segment tree structure).
To find the maximum in an interval in time O(log n), decompose the interval into O(log n) elementary intervals and return the maximum of the maximums.
To find the rightmost maximum in an elementary interval in time O(log n), repeatedly descend to the right child if its maximum is equal to the target, else descend to the left child.
To find the leftmost number lower than a target number in an interval in time O(log n), decompose the interval into O(log n) elementary intervals and find the leftmost elementary interval whose minimum is lower than the target. From this interval, repeatedly descend to the left child if its minimum is lower than the target, else descend to the right child.
Following your example:
      1,5
4,5   1,2   3,3
5 4   2 1   3 3
We just moved 5, and 4 is in the right place, so we want the maximum on the rest of the array. The elementary intervals are 2 1 and 3 3. The maximum is 3. We examine 3 3 and discover that the maximum is in the right child, all the way at the end.
Now we seek the leftmost number less than 3. The elementary intervals are 2 1 and 3 3. We see that 2 1 has a minimum less than 3, so we examine it. So does the left child of 2 1, so 2 is what we swap 3 with.
      *1,5*
4,5   *1,2*   *3,3*
5 4    3 1     3 2
The aggregates in *asterisks* may need updating (ancestors of the swapped values). There are O(log n) of these. We work bottom up.
      *1,5*
4,5    1,3    *3,3*
5 4    3 1     3 2

       1,5
4,5    1,3    *3,3*
5 4    3 1     3 2

       1,5
4,5    1,3     2,3
5 4    3 1     3 2
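The queries described above can be sketched in R roughly as follows. This is only an illustration under my own assumptions: all function names are hypothetical, the array is padded to a power of two, and the tree lives in an environment so updates modify it in place (R's copy semantics may still make single updates costlier than the O(log n) bound suggests). The sketch also keeps a pointer `p` to the first position not yet in its final place, so the "leftmost lower number" always turns out to be position `p`; the query is kept to mirror the description.

```r
build_tree <- function(x) {
  size <- 1
  while (size < length(x)) size <- size * 2
  tr <- new.env()
  tr$size <- size
  tr$mn <- rep(Inf, 2 * size)    # padding is never "lower than" a target
  tr$mx <- rep(-Inf, 2 * size)   # padding is never a maximum
  leaves <- size + seq_along(x) - 1
  tr$mn[leaves] <- x
  tr$mx[leaves] <- x
  for (node in rev(seq_len(size - 1))) {
    tr$mn[node] <- min(tr$mn[2 * node], tr$mn[2 * node + 1])
    tr$mx[node] <- max(tr$mx[2 * node], tr$mx[2 * node + 1])
  }
  tr
}

# Change one leaf and refresh its ancestors, bottom up
set_value <- function(tr, pos, value) {
  node <- tr$size + pos - 1
  tr$mn[node] <- value
  tr$mx[node] <- value
  node <- node %/% 2
  while (node >= 1) {
    tr$mn[node] <- min(tr$mn[2 * node], tr$mn[2 * node + 1])
    tr$mx[node] <- max(tr$mx[2 * node], tr$mx[2 * node + 1])
    node <- node %/% 2
  }
}

# Maximum over positions lo..hi
range_max <- function(tr, lo, hi, node = 1, nl = 1, nr = tr$size) {
  if (hi < nl || nr < lo) return(-Inf)
  if (lo <= nl && nr <= hi) return(tr$mx[node])
  mid <- (nl + nr) %/% 2
  max(range_max(tr, lo, hi, 2 * node, nl, mid),
      range_max(tr, lo, hi, 2 * node + 1, mid + 1, nr))
}

# Rightmost position in lo..hi with value >= target (NA if none)
last_at_least <- function(tr, lo, hi, target, node = 1, nl = 1, nr = tr$size) {
  if (hi < nl || nr < lo || tr$mx[node] < target) return(NA)
  if (nl == nr) return(nl)
  mid <- (nl + nr) %/% 2
  res <- last_at_least(tr, lo, hi, target, 2 * node + 1, mid + 1, nr)
  if (is.na(res)) res <- last_at_least(tr, lo, hi, target, 2 * node, nl, mid)
  res
}

# Leftmost position in lo..hi with value < target (NA if none)
first_below <- function(tr, lo, hi, target, node = 1, nl = 1, nr = tr$size) {
  if (hi < nl || nr < lo || tr$mn[node] >= target) return(NA)
  if (nl == nr) return(nl)
  mid <- (nl + nr) %/% 2
  res <- first_below(tr, lo, hi, target, 2 * node, nl, mid)
  if (is.na(res)) res <- first_below(tr, lo, hi, target, 2 * node + 1, mid + 1, nr)
  res
}

largest_permutation <- function(x, k) {
  n <- length(x)
  tr <- build_tree(x)
  p <- 1       # positions before p already hold their final values
  swaps <- 0
  while (swaps < k && p < n) {
    m <- range_max(tr, p, n)
    if (tr$mx[tr$size + p - 1] == m) { p <- p + 1; next }
    a <- last_at_least(tr, p, n, m)   # rightmost occurrence of the maximum
    b <- first_below(tr, p, a, m)     # leftmost value lower than the maximum
    vb <- tr$mx[tr$size + b - 1]
    set_value(tr, b, m)
    set_value(tr, a, vb)
    swaps <- swaps + 1
  }
  tr$mx[tr$size + seq_len(n) - 1]
}

largest_permutation(c(1, 4, 2, 5, 3, 3), 2)  # 5 4 3 1 3 2
```

With k = 1 the same call stops at 5 4 2 1 3 3, matching the first step of the example.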

Is there a closed form available for the following table?

Below is a table which has a recursive relation as current cell value is the sum of the upper and left cell.
I want to find the odd positions for any given row denoted by v(x) as represented in the first column.
Currently, I am maintaining two one-dimensional arrays which I update with the new sum values, literally checking whether the value at each position is odd or even.
Is there a closed form that exists which would allow me to directly say what are the odd positions available (say, for the 4th row, in which case it should tell me that p1 and p4 are the odd places).
Since it is following a particular pattern I feel very certain that a closed form should exist which would mathematically tell me the positions rather than calculating each value and checking it.
The numbers that you're looking at are the numbers in Pascal's triangle, just rotated ninety degrees. You more typically see it written out like this:
                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1
  1   7  21  35  35  21   7   1
               ...
You're cutting Pascal's triangle into diagonal stripes running down to the left (or right, depending on your perspective), and the question you're asking is how to find the positions of the odd numbers in each stripe.
There's a mathematical result called Lucas's theorem which is useful for determining whether a given entry in Pascal's triangle is even or odd. The entry in row m, column n of Pascal's triangle is given by (m choose n), and Lucas's theorem says that (m choose n) mod 2 (1 if the number is odd, 0 otherwise) can be found by comparing the bits of m and n. If n has a bit that's set in a position where m doesn't have that bit set, then (m choose n) is even. Otherwise, (m choose n) is odd.
As an example, let's try (5 choose 3). The bits in 5 are 101. The bits in 3 are 011. Since the 2's bit of 3 is set and the 2's bit of 5 is not set, the quantity (5 choose 3) should be even. And since (5 choose 3) = 10, we see that this is indeed the case!
In pseudocode using bitwise operators, you essentially want the following:
if ((~m & n) != 0) {
// Row m, entry n is even
} else {
// Row m, entry n is odd.
}
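The same test can be sketched in R with the base function `bitwAnd`. The helper names `is_odd_binomial` and `odd_positions` are mine, and the sketch uses the equivalent condition that every bit set in n is also set in m, i.e. `bitwAnd(m, n) == n`. Positions are numbered from 0 within a row of Pascal's triangle; mapping them onto the rotated table in the question is left as an assumption.

```r
# TRUE where choose(m, n) is odd: every bit set in n must also be set in m
is_odd_binomial <- function(m, n) bitwAnd(m, n) == n

# Zero-based positions of the odd entries in row m of Pascal's triangle
odd_positions <- function(m) which(is_odd_binomial(m, 0:m)) - 1

odd_positions(5)  # row 5 is 1 5 10 10 5 1, so this gives 0 1 4 5
```

No binomial coefficient is ever computed, so this also works for rows where `choose(m, n)` would be too large to represent exactly.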

Split matrix into 4 sub-matrices with lowest difference between their sum

I have to find the difference between the sums of 4 sub-matrices, obtained by splitting the matrix A in some way, such that the difference between the sub-matrix sums is as low as possible.
For example, for a matrix A,
 3  0  2 -8 -8
 5  3  2  2  3
 2  5  2  1  4
 3  4 -1  4  2
-3  6  2  4  3
I could split it like this:
 3 |  0  2 -8 -8
 5 |  3  2  2  3
 2 |  5  2  1  4
-------------------
 3  4 -1 |  4  2
-3  6  2 |  4  3
The sum of all elements within each sub-matrix gives the following result:
10 | 8
-------
11 | 13
Afterwards, I compute all the possible absolute differences between the sums, i.e.
abs(10 - 8) = 2
abs(10 - 11) = 1
abs(10 - 13) = 3
abs(8 - 11) = 3
abs(8 - 13) = 5
abs(11 -13) = 2
Finally, I choose the maximum distance, which is 5.
However, if I split the matrix A in any other way, it will give a different maximum distance, which I don't want. I have to find just 5, but if I do this by brute force, I spend too much time going through all the possibilities. Does this problem have a name, or maybe you can give me a hint?
ADDED
The allowable splits are a horizontal split followed by a vertical split above and a possibly different vertical split below the horizontal split. In the example, there are 4 x 4 x 4 = 64 allowable partitions of the matrix.
The max difference between the submatrices of a particular partition is formed by considering all pairs of the 4 submatrices of that partition (there will be 6 such pairs) and taking the largest difference between the sums of the elements of one of the submatrices of the pair and the sum of the elements of the other submatrix of the pair. We wish to find the minimum over all max differences.
The actual matrix may be up to 4000 x 4000.
There are some speed-ups over brute force. First of all, by accumulating sums along rows and then down columns you can build a table giving, for each point, the total sum of all points, including that one, no further up than it and no further right than it. Then you can compute the sum in any rectangle by adding and subtracting at most four of these sub-totals: roughly speaking, the sum from the top right corner plus the sum from the bottom left corner minus the sums from the other two corners.
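A minimal sketch of such a table in R, using the example matrix from the question. The names `prefix_sums` and `rect_sum` are hypothetical, and I anchor the sums at the top-left corner rather than the bottom-left one, which changes nothing but the orientation.

```r
# S[i + 1, j + 1] holds the sum of A[1:i, 1:j]; row 1 and column 1 are zeros
prefix_sums <- function(A) {
  S <- matrix(0, nrow(A) + 1, ncol(A) + 1)
  for (i in seq_len(nrow(A))) {
    S[i + 1, -1] <- S[i, -1] + cumsum(A[i, ])
  }
  S
}

# Sum of A[r1:r2, c1:c2] from four table entries
rect_sum <- function(S, r1, r2, c1, c2) {
  S[r2 + 1, c2 + 1] - S[r1, c2 + 1] - S[r2 + 1, c1] + S[r1, c1]
}

A <- matrix(c( 3, 0,  2, -8, -8,
               5, 3,  2,  2,  3,
               2, 5,  2,  1,  4,
               3, 4, -1,  4,  2,
              -3, 6,  2,  4,  3), nrow = 5, byrow = TRUE)
S <- prefix_sums(A)

# The split shown in the question
sums <- c(rect_sum(S, 1, 3, 1, 1), rect_sum(S, 1, 3, 2, 5),
          rect_sum(S, 4, 5, 1, 3), rect_sum(S, 4, 5, 4, 5))
sums                    # 10 8 11 13
max(sums) - min(sums)   # 5
```

The largest pairwise absolute difference is simply the maximum sum minus the minimum sum, so the six pairs do not need to be enumerated.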
For the split pattern the OP has diagrammed, with a horizontal line splitting the entire matrix followed by different vertical lines splitting in each half, the vertical splits must be the most even vertical split of their half. If the most extreme difference between sums is within a vertical split, evening the vertical splits can only improve it. If the most extreme difference between sums is between (for example) a high sum from the top left and a low sum on the bottom right, then evening out either vertical split will either bring the high sum down or the low sum up, evening out the most extreme difference. This means you need only consider the best split in the top half and the best split in the bottom half - you don't need to consider all pairs of splits.
For the case where you have two vertical splits on the same side of a horizontal split, you do not have to try all pairs of positions for the vertical splits: you can start with the leftmost split at the far left, and adjust the rightmost split to cut the remainder as evenly as possible in two. Then move the leftmost split slowly to the right and, as you do so, the rightmost split can be repeatedly adjusted by moving it to the right so as to keep splitting the remainder as evenly as possible.
Using these ideas, it seems to me that, for each possible split pattern and each position of the longest line in that pattern, you can find the minimum-cost split in O(N) time for a square matrix of side N. With N positions for the longest line, that is O(N^2) per pattern. This is about the same as the time it takes to build the table of sums, which is linear in the total number of cells in the matrix, or O(N^2) for a square matrix of side N. It is annoying, though, that there seem to be six different split patterns.
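For the one split pattern diagrammed in the question (a horizontal line, then an independent vertical line in each half), the idea can be sketched as follows. This is a rough illustration under my own assumptions: the name `min_max_difference` is hypothetical, the matrix needs at least two rows and two columns, and the other split patterns are not handled.

```r
min_max_difference <- function(A) {
  nr <- nrow(A)
  nc <- ncol(A)
  # S[i + 1, j + 1] holds the sum of A[1:i, 1:j]
  S <- matrix(0, nr + 1, nc + 1)
  for (i in seq_len(nr)) S[i + 1, -1] <- S[i, -1] + cumsum(A[i, ])

  best <- Inf
  cuts <- seq_len(nc - 1)            # vertical cut after column j
  for (h in seq_len(nr - 1)) {       # horizontal cut after row h
    top_left  <- S[h + 1, cuts + 1]
    top_right <- S[h + 1, nc + 1] - top_left
    bot_left  <- S[nr + 1, cuts + 1] - S[h + 1, cuts + 1]
    bot_right <- (S[nr + 1, nc + 1] - S[h + 1, nc + 1]) - bot_left
    # Only the most even vertical cut of each half needs to be considered
    jt <- which.min(abs(top_left - top_right))
    jb <- which.min(abs(bot_left - bot_right))
    sums <- c(top_left[jt], top_right[jt], bot_left[jb], bot_right[jb])
    best <- min(best, max(sums) - min(sums))
  }
  best
}
```

Each horizontal position costs O(N) work, so the whole search is O(N^2) for a square matrix of side N, on top of building the table.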
