This is a math question: how do I calculate the index of a given permutation in the ordered list of all permutations without repetition? For example:
A = {a, b, c}. There are 3! = 6 permutations: (a,b,c); (a,c,b); (b,a,c); (b,c,a); (c,a,b); (c,b,a)
I searched for an algorithm to get the index of a permutation in this list, but online I could find only algorithms for permutations with repetition.
The index of (b,c,a) in this zero-based list is obviously 3. Is there an easy way to calculate the position directly with a formula?
I cannot use Python's itertools, because I work with very large permutations (for example, 120!). I once tried itertools' permutations function, searching the resulting iterator for the index of an element, but that did not work well. I need a mathematical solution that gives the index directly.
Thanks for reading.
Some clues:
You have n! permutations. Note that (n-1)! permutations start from the first element (a), next (n-1)! permutations start from the second element (b) and so on.
So you can calculate the first term of permutation rank as (n-1)! * Ord(P[0]) where Ord gives ordering number of the first element of permutation in initial sequence (0 for a, 1 for b etc).
Then continue with the second element using (n-2)! multiplier and so on.
Don't forget to exclude used elements from the ordering: in your example b is already used, so at the second stage c has index 1 rather than 0, and the rank is 2! * 1 + 1! * 1 + 0! * 0 = 3.
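The clues above can be put together in a few lines. The following is only a minimal Python sketch of that idea; the function name permutation_rank and its two parameters are made up for illustration, and it assumes that all elements are distinct and that the list of permutations is ordered by the elements' positions in the initial sequence.

```python
from math import factorial

def permutation_rank(perm, alphabet):
    """Zero-based index of `perm` among all permutations of `alphabet`,
    listed in the order induced by the initial sequence `alphabet`."""
    remaining = list(alphabet)      # elements not used yet, in initial order
    rank = 0
    for element in perm:
        # Ord of the element among the elements that are still unused
        ordinal = remaining.index(element)
        # Every smaller unused element accounts for (m - 1)! permutations,
        # where m is the number of elements still unused
        rank += ordinal * factorial(len(remaining) - 1)
        remaining.pop(ordinal)      # exclude the used element
    return rank

print(permutation_rank("bca", "abc"))  # 3
```

Python integers have arbitrary precision, so the factorials involved for 120 elements are not a problem, and nothing is ever enumerated. The list lookups make this quadratic in the number of elements, which should be acceptable at that size.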
I am looking for help with pseudocode (unless you are a user of Game Maker 8.0 by Mark Overmars and know the GML equivalent of what I need) for generating a list / array of the unique combinations of a set of X integers, where the size of the set is variable. It can be 1-5 or 1-1000.
For example:
IntegerList{1,2,3,4}
1,2
1,3
1,4
2,3
2,4
3,4
I feel like the math behind this is simple, but I just can't seem to wrap my head around it, even after checking multiple sources on how to do it in languages such as C++ and Java. Thanks, everyone.
As there are not many details in the question, I assume:
Your input is a natural number n and the resulting array contains all natural numbers from 1 to n.
The expected output, given by the combinations above, resembles a symmetric relation, i.e. in your case [1, 2] is considered the same as [2, 1].
Combinations [x, x] are excluded.
There are only combinations with 2 elements.
There is no List<> datatype or dynamic array, so the array length has to be known before creating the array.
The number of elements in your result is therefore the binomial coefficient m = n over 2 = n! / (2! * (n - 2)!) (which is 4! / (2! * (4 - 2)!) = 24 / 4 = 6 in your example) with ! being the factorial.
First, initializing the array with the first n natural numbers should be quite easy using the array element index. However, the index is a property of the array elements, so you don't need to initialize them in the first place.
You need 2 nested loops processing the array. The outer loop ranges i from 1 to n - 1, the inner loop ranges j from i + 1 to n, so that each pair is produced exactly once with the smaller number first. If your indexes start from 0 instead of 1, you have to take this into consideration for the loop limits. Now, you only need to fill your target array with the combinations [i, j]. To find the correct index in your target array, you should use a third counter variable, initialized with the first index and incremented at the end of each iteration of the inner loop.
I agree, the math behind it is not that hard, and I think this explanation should suffice to develop the corresponding code yourself.
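For readers who want to check their own version, here is a rough Python sketch of the nested loops under the assumptions listed above; it is not GML, and the function name unique_pairs is invented for this example. The inner loop starts at i + 1 so that [x, x] and mirrored pairs never appear, and the result is preallocated with n * (n - 1) / 2 slots to mimic a fixed-length array.

```python
def unique_pairs(n):
    """All 2-element combinations of the numbers 1..n, smaller number first."""
    total = n * (n - 1) // 2        # binomial coefficient "n over 2"
    pairs = [None] * total          # fixed-length target array
    k = 0                           # third counter: next free slot in `pairs`
    for i in range(1, n):           # outer loop: 1 .. n - 1
        for j in range(i + 1, n + 1):   # inner loop: i + 1 .. n
            pairs[k] = (i, j)
            k += 1                  # advance after each pair is stored
    return pairs

print(unique_pairs(4))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```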
Note: I edited the original question to explain more precisely.
While running a simulation for my new method, I needed to generate a special type of dataset consisting of multiple subsets. The problem is that some variables are "shared" across the subsets, and the number of shared variables is called the "overlap" here. Since the distribution of the overlap proportion is given, I need to generate a list of variables for each subset such that their overlap follows the given distribution. But I have failed to implement such an algorithm...
I am not sure whether there is a specific algorithm for this kind of problem,
but I have failed to find one after a long search.
I prefer an R solution, but anything else would also be very much appreciated. Please help me solve this problem! Thank you so much in advance!
Below is a standardized explanation of my problem. I tried to explain it as generally as I can, but please tell me if it is not sufficient.
Purpose: Generate n sets from given overlap matrix of elements. Each set contains k elements.
Input: An n*n matrix whose (i,j)th cell gives the number of elements shared by the i-th set and the j-th set.
Output: A list of k element identifiers (anything can be used, such as numbers) for each of the n sets.
Assumption: The number of elements in each set is k, and it is the same across all n sets. Hence, the input matrix is symmetric.
Example (assumes k=3 and n=3)
Input
3 1 0
1 3 1
0 1 3
Output
Set 1: A B C
Set 2: A D E
Set 3: D F G
In the example input above, the (1,2)th and (2,1)th cells are 1 because sets 1 and 2 share the element "A", and the diagonal cells are 3 (= k) because each set shares all of its elements with itself.
I would repeat the following process until I had accounted for all the matrix entries:
1) Treat the matrix as the adjacency matrix of a graph, and find the largest clique in it. That is, find the largest possible set S of indexes such that M(i,j) > 0 for all i, j in S.
2) Create an item that is in all of the sets which correspond to the indexes in S. In fact, if the minimum value of M(i,j) over all i, j in S is v, create v such items.
3) Subtract v from M(i,j) for all i, j in S, accounting for the counts generated by the items you have just created.
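The three steps can be sketched in Python as follows (not R, but the structure should carry over). This is only an illustration under several assumptions: the function name sets_from_overlap is invented, items are labelled with consecutive integers instead of letters, the diagonal is treated like any other entry, and the largest clique is found by brute force, which is only practical for small n. It is not verified here that this greedy procedure reproduces every valid input matrix.

```python
from itertools import combinations

def sets_from_overlap(matrix):
    """Greedy sketch: build one list of item ids per set from an overlap matrix."""
    n = len(matrix)
    m = [row[:] for row in matrix]      # work on a copy of the input
    sets = [[] for _ in range(n)]
    next_item = 0

    def largest_clique():
        # Step 1: try group sizes from n down to 1 (brute force).
        for size in range(n, 0, -1):
            for group in combinations(range(n), size):
                if all(m[i][j] > 0 for i in group for j in group):
                    return group
        return None                     # every remaining entry is zero

    while True:
        group = largest_clique()
        if group is None:
            break
        # Step 2: create v items shared by all sets in the clique.
        v = min(m[i][j] for i in group for j in group)
        for _ in range(v):
            for i in group:
                sets[i].append(next_item)
            next_item += 1
        # Step 3: subtract the counts those items account for.
        for i in group:
            for j in group:
                m[i][j] -= v
    return sets

example = [[3, 1, 0],
           [1, 3, 1],
           [0, 1, 3]]
print(sets_from_overlap(example))  # [[0, 2, 3], [0, 1, 4], [1, 5, 6]]
```

For the example input this yields three sets of three items in which sets 1 and 2 share one item, sets 2 and 3 share one item, and sets 1 and 3 share none, matching the matrix.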
I was looking at Stirling numbers of the second kind (http://mathworld.wolfram.com/StirlingNumberoftheSecondKind.html), which count the ways to split a set of n elements into k non-empty subsets, where order does not matter, and was wondering how to write a non-naive algorithm to compute
S(n, k, {occurrences of each element})
Where
S(6, 3, {1, 2, 3} )
would give the total number of ways a multiset with 6 elements, in which 3 are copies of one element, 2 are copies of another, and 1 is unique, could be split into 3 non-empty sets, ignoring permutations.
There is a recursive formula for regular Stirling numbers of the second kind S(n, k), but there is unlikely to be a comparable formula for multisets.
So what's an algorithm that could calculate this number?
Relevant question on Math.SE here, without a real method to calculate this number.
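For reference, the recursive formula mentioned for the regular case is S(n, k) = k * S(n - 1, k) + S(n - 1, k - 1). A small memoized Python sketch of it is shown below; the function name stirling2 is chosen here for illustration. It covers only ordinary sets with distinct elements and does not answer the multiset variant asked about.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind for a set of n distinct elements."""
    if n == k:
        return 1                    # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    # Element n either joins one of the k existing blocks
    # or forms a new block on its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

print(stirling2(6, 3))  # 90
```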
I'm new to R and I'm looking through a book called "Discovering Statistics using R".
Although the book implies you don't need any statistical background, some of the content isn't covered/explained...
I'm trying to sum the elements of a vector starting from position 1 until a positive element is present.
I found this question which is very similar to what I'm trying to achieve. However when I implement it, it doesn't always seem to work (and it sometimes appears to include the first positive element)...
My program is:
vecA <- runif(10, -10, 10);
sumA <-sum(vecA [1:min(which(vecA < 0))]);
Is there a more robust way to calculate this without using loops that works every time and doesn't add the positive element? I'm not at the looping stage of my books yet.
I also found this site which asks a similar question but their answer errors:
sum(vecA [seq_len(which.max(vecA > 0)]);
You can use the following code:
sum(vecA * !cumsum(vecA > 0))
This also works if the first element is positive or all elements are negative.
You want to use > not < to sum all elements until the first positive one is reached.
You're currently summing from 1 until the first negative value is reached (including the first negative value).
sum(vecA[1:min(which(vecA>0))-1])
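The which() function returns the positions of all positive elements, so summing from position 1 up to the position of the first positive element minus 1 adds up exactly the elements that come before it. Because : binds more tightly than - in R, the index here is actually 0:(first - 1); R silently ignores the 0 index, so the result is the same.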
The match function is usually the fastest way to find the first occurrence of an element in a vector, so another version could look as follows:
first.positive <- match(TRUE, vecA > 0)
sumA <- sum( vecA[ 1 : first.positive ] ) - vecA[first.positive]
This gives zero if the first element is positive.
When I compute the difference between the largest and the smallest number in an empty vector (v←⍳0) using ⌈⌿(⌈/c)- ⌊⌿(⌊/c), it gives me a domain error. This statement works fine with normal vectors and matrices.
How do I handle the exception so that it does not give me an error when the vector is empty? It should return nothing or just return zero.
A guard is the best way to do this:
{0=⍴⍵:0 ⋄ (⌈/⍵)-⌊/⍵}
Note that the use of two reductions, one with axis specification, is not really needed, and is actually not correct. That is, if you want it to work on all of the elements of a simple array of any dimension, simply ravel the argument first:
{0=⍴⍵:0 ⋄ (⌈/⍵)-⌊/⍵},10 10 ⍴⍳100
99
Or for an array of any structure or depth, you can use "super ravel":
{0=⍴⍵:0 ⋄ (⌈/⍵)-⌊/⍵}∊(1 2 3)(7 8 9 10)
9
Note that ⎕ML (Migration Level) must be set to 1 or higher (3, for example) to ensure that ∊ is "super ravel" (enlist) rather than type.
Note also the equivalence of the following when operating on a matrix:
⌈⌿⌈/10 10 ⍴⍳100
99
⌈/⌈/10 10 ⍴⍳100
99
⌈/⌈⌿10 10 ⍴⍳100
99
⌈⌿⌈⌿10 10 ⍴⍳100
99
Using reduction with axis is not needed in this case, and obscures the intent and is also potentially more expensive. Better to just ravel the whole thing.
As I mentioned in the comments, Dyalog APL has guards, which can be used for conditional execution, and thus you can simply check for the empty vector and give a different answer.
This can, however, also be implemented in a more traditional/pure APL way.
This version only works in one dimension.
In the APL font:
Z←DIFFERENCE V
⍝ Calculate the difference between the largest and smallest elements of a vector, with empty set protection
⍝ Difference is calculated by subtracting the reduced floor from the reduced ceiling
⍝ eg. (⌈⌿(⌈V)) - (⌊⌿(⌊V))
⍝ Protection is implemented by comparison against the empty set ⍬≡V
⍝ Which yields 0 or 1, and using that result to select an answer from a tuple
⍝ If empty, then it drops the first element, yielding just a zero, otherwise both are retained
⍝ eg. <condition>↓(a b) => 0 = (a b), 1 = (b)
⍝ The final operation is first ↑, which takes the first element of what remains of the tuple.
Z←↑(⍬≡V)↓(((⌈⌿(⌈V)) - (⌊⌿(⌊V))) 0)
Or in brace notation, for people without the font.
Z{leftarrow}DIFFERENCE V
{lamp} Calculate the difference between the largest and smallest elements of a vector, with empty set protection
{lamp} Difference is calculated by subtracting the reduced floor from the reduced ceiling
{lamp} eg. ({upstile}{slashbar}({upstile}V)) - ({downstile}{slashbar}({downstile}V))
{lamp} Protection is implemented by comparison against the empty set {zilde}{equalunderbar}V
{lamp} Which yields 0 or 1, and using that result to select an answer from a tuple
{lamp} If empty, then it drops the first element, yielding just a zero, otherwise both are retained
{lamp} eg. <condition>{downarrow}(a b) => 0 = (a b), 1 = (b)
{lamp} The final operation is first {uparrow}, which takes the first element of what remains of the tuple.
Z{leftarrow}{uparrow}({zilde}{equalunderbar}V){downarrow}((({upstile}{slashbar}({upstile}V)) - ({downstile}{slashbar}({downstile}V))) 0)
Updated: multi-dimensional version
Z←DIFFERENCE V
⍝ Calculate the difference between the largest and smallest elements of an array, with empty set protection
⍝ First enlist the array to reduce it to a single dimension
⍝ eg. ∊V
⍝ Difference is calculated by subtracting the reduced floor from the reduced ceiling
⍝ eg. (⌈/V) - (⌊/V)
⍝ Protection is implemented by comparison against the empty set ⍬≡V
⍝ Which yields 0 or 1, and using that result to select an answer from a tuple
⍝ If empty, then it drops the first element, yielding just a zero, otherwise both are retained
⍝ eg. <condition>↓(a b) => 0 = (a b), 1 = (b)
⍝ The final operation is first ↑, which takes the first element of what remains of the tuple.
V←∊V
Z←↑(⍬≡V)↓(((⌈/V) - (⌊/V)) 0)