If there are M different boxes and N identical balls - math

and we need to put these balls into boxes.
How many states of the states could there be?
This is part of a computer simulation puzzle. I've almost forget all my math knowledges.

I believe you are looking for the Multinomial Coefficient.
I will check myself and expand my answer.
Edit:
If you take a look at the wikipedia article I gave a link to, you can see that the M and N you defined in your question correspond to the m and n defined in the Theorem section.
This means that your question corresponds to: "What is the number of possible coefficient orderings when expanding a polynomial raised to an arbitrary power?", where N is the power, and M is the number of variables in the polynomial.
In other words:
What you are looking for is to sum over the multinomial coefficients of a polynomial of M variables expanded when raised to the power on N.
The exact equations are a bit long, but they are explained very clearly in wikipedia.
Why is this true:
The multinomial coefficient gives you the number of ways to order identical balls between baskets when grouped into a specific grouping (for example, 4 balls grouped into 3, 1, and 1 - in this case M=4 and N=3). When summing over all grouping options you get all possible combinations.
I hope this helped you out.

These notes explain how to solve the "balls in boxes" problem in general: whether the balls are labeled or not, whether the boxes are labeled or not, whether you have to have at least one ball in each box, etc.

this is a basic combinatorial question (distribution of identical objects into non identical slots)
the number of states is [(N+M-1) choose (M-1)]

Related

What is the difference between permutations and derangements?

I have been given a program to write difference combinations of set of number entered by user and when I researched for the same I get examples with terms permutations and derangements.
I am unable to find the clarity between the them. Also adding to that one more term is combinations. Any one please provide a simple one liner for clarity on the question.
Thanks in advance.
http://en.wikipedia.org/wiki/Permutation
The notion of permutation relates to the act of rearranging, or permuting, all the members of a set into some sequence or order (unlike combinations, which are selections of some members of the set where order is disregarded). For example, written as tuples, there are six permutations of the set {1,2,3}, namely: (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), and (3,2,1). As another example, an anagram of a word, all of whose letters are different, is a permutation of its letters.
http://en.wikipedia.org/wiki/Derangement
In combinatorial mathematics, a derangement is a permutation of the elements of a set such that none of the elements appear in their original position.
The number of derangements of a set of size n, usually written Dn, dn, or !n, is called the "derangement number" or "de Montmort number". (These numbers are generalized to rencontres numbers.) The subfactorial function (not to be confused with the factorial n!) maps n to !n.1 No standard notation for subfactorials is agreed upon; n¡ is sometimes used instead of !n.2

Calculating simple probability in R

I have just started with R and am having trouble with how to enter a seemingly very simple probability question in R. The question is:
There is a big court case, and 20% of the adult population believe the accused is innocent prior to jury selection. Assume that the 12 jurors were selected randomly and independently from the population.
Find the probability that the jury has at least one member who believe in the accused's innocence prior to jury selection (hint: define the binomial (12,.2) random variable X to be the number of jurors believing in the accused's innocence).
Find the probability that the jury had at least two members who believed in the accused's innocence (hint: P(X ≥ 2) = 1-P(X ≤ 1), and P(X ≤ 1) = P(X=0) + P(X=1)).
I know that the dbinom(1,12,.2) will give me the probability of exactly one jury member believing in the accused's innocence, but I am unsure what code to enter for "at least," since this is how the question is phrased. I assume once I am able to calculate "at least" for one jury member, the same code will apply for the second question. The hint for the second question is throwing me off, however.
This is certainly a stupid and simple question, but any and all help would be greatly appreciated!
If you want at least 1, you could just calculate the probability of having none and take the difference to 1:
1-dbinom(0,12,.2)
I guess you could think about it, since it is hinted in the second part of the problem. However, in R the function pbinom gives the cumulative probability; so the expression:
pbinom(2,12,.2)
amounts to the probability of having 0,1 or 2 people believing the innocence. You could get the same result through:
sum(dbinom(0:2,12,.2))
Here I'm taking the probability of 0,1 or 2 people and summing them all

Solving a system of linear equations in a non-square matrix [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 3 years ago.
Improve this question
I have a system of linear equations that make up an NxM matrix (i.e. Non-square) which I need to solve - or at least attempt to solve in order to show that there is no solution to the system. (more likely than not, there will be no solution)
As I understand it, if my matrix is not square (over or under-determined), then no exact solution can be found - am I correct in thinking this? Is there a way to transform my matrix into a square matrix in order to calculate the determinate, apply Gaussian Elimination, Cramer's rule, etc?
It may be worth mentioning that the coefficients of my unknowns may be zero, so in certain, rare cases it would be possible to have a zero-column or zero-row.
Whether or not your matrix is square is not what determines the solution space. It is the rank of the matrix compared to the number of columns that determines that (see the rank-nullity theorem). In general you can have zero, one or an infinite number of solutions to a linear system of equations, depending on its rank and nullity relationship.
To answer your question, however, you can use Gaussian elimination to find the rank of the matrix and, if this indicates that solutions exist, find a particular solution x0 and the nullspace Null(A) of the matrix. Then, you can describe all your solutions as x = x0 + xn, where xn represents any element of Null(A). For example, if a matrix is full rank its nullspace will be empty and the linear system will have at most one solution. If its rank is also equal to the number of rows, then you have one unique solution. If the nullspace is of dimension one, then your solution will be a line that passes through x0, any point on that line satisfying the linear equations.
Ok, first off: a non-square system of equations can have an exact solution
[ 1 0 0 ][x] = [1]
[ 0 0 1 ][y] [1]
[z]
clearly has a solution (actually, it has an 1-dimensional family of solutions: x=z=1). Even if the system is overdetermined instead of underdetermined it may still have a solution:
[ 1 0 ][x] = [1]
[ 0 1 ][y] [1]
[ 1 1 ] [2]
(x=y=1). You may want to start by looking at least squares solution methods, which find the exact solution if one exists, and "the best" approximate solution (in some sense) if one does not.
Taking Ax = b, with A having m columns and n rows. We are not guaranteed to have one and only one solution, which in many cases is because we have more equations than unknowns (m bigger n). This could be because of repeated measurements, that we actually want because we are cautious about influence of noise.
If we observe that we can not find a solution that actually means, that there is no way to find b travelling the column space spanned by A. (As x is only taking a combination of the columns).
We can however ask for the point in the space spanned by A that is nearest to b. How can we find such a point? Walking on a plane the closest one can get to a point outside it, is to walk until you are right below. Geometrically speaking this is when our axis of sight is perpendicular to the plane.
Now that is something we can have a mathematical formulation of. A perpendicular vector reminds us of orthogonal projections. And that is what we are going to do. The simplest case tells us to do a.T b. But we can take the whole matrix A.T b.
For our equation let us apply the transformation to both sides: A.T Ax = A.T b.
Last step is to solve for x by taking the inverse of A.T A:
x = (A.T A)^-1 * A.T b
The least squares recommendation is a very good one.
I'll add that you can try a singular value decomposition (SVD) that will give you the best answer possible and provide information about the null space for free.

Sample uniformly at random from an n-dimensional unit simplex

Sampling uniformly at random from an n-dimensional unit simplex is the fancy way to say that you want n random numbers such that
they are all non-negative,
they sum to one, and
every possible vector of n non-negative numbers that sum to one are equally likely.
In the n=2 case you want to sample uniformly from the segment of the line x+y=1 (ie, y=1-x) that is in the positive quadrant.
In the n=3 case you're sampling from the triangle-shaped part of the plane x+y+z=1 that is in the positive octant of R3:
(Image from http://en.wikipedia.org/wiki/Simplex.)
Note that picking n uniform random numbers and then normalizing them so they sum to one does not work. You end up with a bias towards less extreme numbers.
Similarly, picking n-1 uniform random numbers and then taking the nth to be one minus the sum of them also introduces bias.
Wikipedia gives two algorithms to do this correctly: http://en.wikipedia.org/wiki/Simplex#Random_sampling
(Though the second one currently claims to only be correct in practice, not in theory. I'm hoping to clean that up or clarify it when I understand this better. I initially stuck in a "WARNING: such-and-such paper claims the following is wrong" on that Wikipedia page and someone else turned it into the "works only in practice" caveat.)
Finally, the question:
What do you consider the best implementation of simplex sampling in Mathematica (preferably with empirical confirmation that it's correct)?
Related questions
Generating a probability distribution
java random percentages
This code can work:
samples[n_] := Differences[Join[{0}, Sort[RandomReal[Range[0, 1], n - 1]], {1}]]
Basically you just choose n - 1 places on the interval [0,1] to split it up then take the size of each of the pieces using Differences.
A quick run of Timing on this shows that it's a little faster than Janus's first answer.
After a little digging around, I found this page which gives a nice implementation of the Dirichlet Distribution. From there it seems like it would be pretty simple to follow Wikipedia's method 1. This seems like the best way to do it.
As a test:
In[14]:= RandomReal[DirichletDistribution[{1,1}],WorkingPrecision->25]
Out[14]= {0.8428995243540368880268079,0.1571004756459631119731921}
In[15]:= Total[%]
Out[15]= 1.000000000000000000000000
A plot of 100 samples:
alt text http://www.public.iastate.edu/~zdavkeos/simplex-sample.png
I'm with zdav: the Dirichlet distribution seems to be the easiest way ahead, and the algorithm for sampling the Dirichlet distribution which zdav refers to is also presented on the Wikipedia page on the Dirichlet distribution.
Implementationwise, it is a bit of an overhead to do the full Dirichlet distribution first, as all you really need is n random Gamma[1,1] samples. Compare below
Simple implementation
SimplexSample[n_, opts:OptionsPattern[RandomReal]] :=
(#/Total[#])& # RandomReal[GammaDistribution[1,1],n,opts]
Full Dirichlet implementation
DirichletDistribution/:Random`DistributionVector[
DirichletDistribution[alpha_?(VectorQ[#,Positive]&)],n_Integer,prec_?Positive]:=
Block[{gammas}, gammas =
Map[RandomReal[GammaDistribution[#,1],n,WorkingPrecision->prec]&,alpha];
Transpose[gammas]/Total[gammas]]
SimplexSample2[n_, opts:OptionsPattern[RandomReal]] :=
(#/Total[#])& # RandomReal[DirichletDistribution[ConstantArray[1,{n}]],opts]
Timing
Timing[Table[SimplexSample[10,WorkingPrecision-> 20],{10000}];]
Timing[Table[SimplexSample2[10,WorkingPrecision-> 20],{10000}];]
Out[159]= {1.30249,Null}
Out[160]= {3.52216,Null}
So the full Dirichlet is a factor of 3 slower. If you need m>1 samplepoints at a time, you could probably win further by doing (#/Total[#]&)/#RandomReal[GammaDistribution[1,1],{m,n}].
Here's a nice concise implementation of the second algorithm from Wikipedia:
SimplexSample[n_] := Rest## - Most## &[Sort#Join[{0,1}, RandomReal[{0,1}, n-1]]]
That's adapted from here: http://www.mofeel.net/1164-comp-soft-sys-math-mathematica/14968.aspx
(Originally it had Union instead of Sort#Join -- the latter is slightly faster.)
(See comments for some evidence that this is correct!)
I have created an algorithm for uniform random generation over a simplex. You can find the details in the paper in the following link:
http://www.tandfonline.com/doi/abs/10.1080/03610918.2010.551012#.U5q7inJdVNY
Briefly speaking, you can use following recursion formulas to find the random points over the n-dimensional simplex:
x1=1-R11/n-1
xk=(1-Σi=1kxi)(1-Rk1/n-k), k=2, ..., n-1
xn=1-Σi=1n-1xi
Where R_i's are random number between 0 and 1.
Now I am trying to make an algorithm to generate random uniform samples from constrained simplex.that is intersection between a simplex and a convex body.
Old question, and I'm late to the party, but this method is much faster than the accepted answer if implemented efficiently.
In Mathematica code:
#/Total[#,{2}]&#Log#RandomReal[{0,1},{n,d}]
In plain English, you generate n rows * d columns of randoms uniformly distributed between 0 and 1. Then take the Log of everything. Then normalize each row, dividing each element in the row by the row total. Now you have n samples uniformly distributed over the (d-1) dimensional simplex.
If found this method here: https://mathematica.stackexchange.com/questions/33652/uniformly-distributed-n-dimensional-probability-vectors-over-a-simplex
I'll admit, I'm not sure why it works, but it passes every statistical test I can think of. If anyone has a proof of why this method works, I'd love to see it!

How to check if m n-sized vectors are linearly independent?

Disclaimer
This is not strictly a programming question, but most programmers soon or later have to deal with math (especially algebra), so I think that the answer could turn out to be useful to someone else in the future.
Now the problem
I'm trying to check if m vectors of dimension n are linearly independent. If m == n you can just build a matrix using the vectors and check if the determinant is != 0. But what if m < n?
Any hints?
See also this video lecture.
Construct a matrix of the vectors (one row per vector), and perform a Gaussian elimination on this matrix. If any of the matrix rows cancels out, they are not linearly independent.
The trivial case is when m > n, in this case, they cannot be linearly independent.
Construct a matrix M whose rows are the vectors and determine the rank of M. If the rank of M is less than m (the number of vectors) then there is a linear dependence. In the algorithm to determine the rank of M you can stop the procedure as soon as you obtain one row of zeros, but running the algorithm to completion has the added bonanza of providing the dimension of the spanning set of the vectors. Oh, and the algorithm to determine the rank of M is merely Gaussian elimination.
Take care for numerical instability. See the warning at the beginning of chapter two in Numerical Recipes.
If m<n, you will have to do some operation on them (there are multiple possibilities: Gaussian elimination, orthogonalization, etc., almost any transformation which can be used for solving equations will do) and check the result (eg. Gaussian elimination => zero row or column, orthogonalization => zero vector, SVD => zero singular number)
However, note that this question is a bad question for a programmer to ask, and this problem is a bad problem for a program to solve. That's because every linearly dependent set of n<m vectors has a different set of linearly independent vectors nearby (eg. the problem is numerically unstable)
I have been working on this problem these days.
Previously, I have found some algorithms regarding Gaussian or Gaussian-Jordan elimination, but most of those algorithms only apply to square matrix, not general matrix.
To apply for general matrix, one of the best answers might be this:
http://rosettacode.org/wiki/Reduced_row_echelon_form#MATLAB
You can find both pseudo-code and source code in various languages.
As for me, I transformed the Python source code to C++, causes the C++ code provided in the above link is somehow complex and inappropriate to implement in my simulation.
Hope this will help you, and good luck ^^
If computing power is not a problem, probably the best way is to find singular values of the matrix. Basically you need to find eigenvalues of M'*M and look at the ratio of the largest to the smallest. If the ratio is not very big, the vectors are independent.
Another way to check that m row vectors are linearly independent, when put in a matrix M of size mxn, is to compute
det(M * M^T)
i.e. the determinant of a mxm square matrix. It will be zero if and only if M has some dependent rows. However Gaussian elimination should be in general faster.
Sorry man, my mistake...
The source code provided in the above link turns out to be incorrect, at least the python code I have tested and the C++ code I have transformed does not generates the right answer all the time. (while for the exmample in the above link, the result is correct :) -- )
To test the python code, simply replace the mtx with
[30,10,20,0],[60,20,40,0]
and the returned result would be like:
[1,0,0,0],[0,1,2,0]
Nevertheless, I have got a way out of this. It's just this time I transformed the matalb source code of rref function to C++. You can run matlab and use the type rref command to get the source code of rref.
Just notice that if you are working with some really large value or really small value, make sure use the long double datatype in c++. Otherwise, the result will be truncated and inconsistent with the matlab result.
I have been conducting large simulations in ns2, and all the observed results are sound.
hope this will help you and any other who have encontered the problem...
A very simple way, that is not the most computationally efficient, is to simply remove random rows until m=n and then apply the determinant trick.
m < n: remove rows (make the vectors shorter) until the matrix is square, and then
m = n: check if the determinant is 0 (as you said)
m < n (the number of vectors is greater than their length): they are linearly dependent (always).
The reason, in short, is that any solution to the system of m x n equations is also a solution to the n x n system of equations (you're trying to solve Av=0). For a better explanation, see Wikipedia, which explains it better than I can.

Resources