OR-multiplication on big integers - math

Multiplication of two n-bit numbers A and B can be understood as a sum of shifts:
(A << i1) + (A << i2) + ...
where i1, i2, ... are the positions of the bits that are set to 1 in B.
Now let's replace PLUS with OR to get the new operation I actually need:
(A << i1) | (A << i2) | ...
This operation is quite similar to regular multiplication, for which many faster algorithms exist (Schönhage-Strassen, for example).
Is there a similar algorithm for the operation I presented here?
The size of the numbers is 6000 bits.
edit:
For some reason I have no link/button to post comments (any idea why?), so I will edit my question instead.
I am indeed looking for an algorithm faster than O(n^2) for the operation defined above.
And yes, I am aware that it is not ordinary multiplication.

Is there a similar algorithm? I think probably not.
Is there some way to speed things up beyond O(n^2)? Possibly. If you consider a number A to be the analogue of the polynomial A(x) = Σ a_n x^n, where the a_n are the binary digits of A, then your operation with bitwise ORs (let's call it A ⊕ B) can be expressed as follows, where "⇔" means "analogue of":
A ⇔ A(x) = Σ a_n x^n
B ⇔ B(x) = Σ b_n x^n
C = A ⊕ B ⇔ C(x) = f(A(x) B(x)) = f(V(x)), where f(V(x)) = f(Σ v_n x^n) = Σ u(v_n) x^n, with u(v_n) = 0 if v_n = 0 and u(v_n) = 1 otherwise.
Basically you are doing the equivalent of taking two polynomials and multiplying them together, then identifying all the nonzero terms. From a bit-string standpoint, this means treating the bitstring as an array of samples of zeros or ones, convolving the two arrays, and collapsing the resulting samples that are nonzero. There are fast convolution algorithms that are O(n log n), using FFTs for instance, and the "collapsing" step here is O(n)... but somehow I wonder if the O(n log n) evaluation of fast convolution treats something (like multiplication of large integers) as O(1) so you wouldn't actually get a faster algorithm. Either that, or the constants for orders of growth are so large that you'd have to have thousands of bits before you got any speed advantage. ORing is so simple.
edit: there appears to be something called "binary convolution" (see this book for example) that sounds awfully relevant here, but I can't find any good links to the theory behind it and whether there are fast algorithms.
edit 2: maybe the term is "logical convolution" or "bitwise convolution"... here's a page from CPAN (bleah!) talking a little about it along with Walsh and Hadamard transforms which are kind of the bitwise equivalent to Fourier transforms... hmm, no, that seems to be the analog for XOR rather than OR.

You can do this in O(#1-bits in A * #1-bits in B).
a_bitnums = set(x for x in range(A.bit_length()) if ((1 << x) & A) != 0)
b_bitnums = set(x for x in range(B.bit_length()) if ((1 << x) & B) != 0)
c = 0
for a_bit in a_bitnums:
    for b_bit in b_bitnums:
        c |= 1 << (a_bit + b_bit)
This might be worthwhile if A and B are sparse in the number of 1 bits present.

I presume you are asking for the name of the additive technique you have given when you write "Is there a similar algorithm for the operation I presented here?"...
Have you looked at the Peasant multiplication technique?
Please read the Wikipedia description if you do not get the 3rd column in this example.
 B        A    lsb(B)
27   x   15   :   1
13       30   :   1
 6       60   :   0
 3      120   :   1
 1      240   :   1
B is 27 == binary form 11011b
27x15 = 15 + 30 + 120 + 240
= 15<<0 + 15<<1 + 15<<3 + 15<<4
= 405
Sounds familiar?
Here is your algorithm.
Choose the smaller number as your A
Initialize C as your result area
while B is not zero,
if lsb of B is 1, add A to C
left shift A once
right shift B once
C has your multiplication result (unless you rolled over sizeof C)
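Here is a minimal Python sketch of that loop, together with the OR variant the question actually asks about (the function names are mine):
def peasant_multiply(a, b):
    # shift-and-add multiplication, exactly as described above
    c = 0
    while b != 0:
        if b & 1:      # lsb of B is 1
            c += a     # add A to C
        a <<= 1        # left shift A once
        b >>= 1        # right shift B once
    return c

def peasant_or(a, b):
    # the same loop with + replaced by |, i.e. the operation from the question
    c = 0
    while b != 0:
        if b & 1:
            c |= a
        a <<= 1
        b >>= 1
    return c

assert peasant_multiply(27, 15) == 405
assert peasant_or(0b110000011, 0b1010101) == 0b111111111111111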
Update If you are trying to get a fast algorithm for the shift and OR operation across 6000 bits,
there might actually be one. I'll think a little more on that.
It would appear like 'blurring' one number over the other. Interesting.
A rather crude example here,
110000011 X 1010101 would look like
      110000011
    11000001100
  1100000110000
110000011000000
---------------
111111111111111
The number of 1s in the two numbers will decide the amount of blurring towards a number with all its bits set.
Wonder what you want to do with it...
Update2 This is the nature of the shift+OR operation with two 6000 bit numbers.
The result will be 12000 bits of course
the operation can be done with two bit streams, but need not be carried out in its entirety
the 'middle' part of the 12000 bit stream will almost certainly be all 1s (provided both numbers are non-zero)
the problem will be in identifying the depth to which we need to process this operation to get both ends of the 12000 bit stream
the pattern at the two ends of the stream will depend on the largest consecutive 1s present in both the numbers
I have not got to a clean algorithm for this yet; I have updated for anyone else wanting to recheck or go further from here. Also, describing the need for such an operation might motivate further interest :-)
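As a quick sanity check of those observations, here is a throwaway Python sketch (it just repeats the shift-and-OR loop sketched earlier in this thread):
import random

def or_multiply(a, b):
    # same shift-and-OR loop as in the peasant sketch above
    c = 0
    while b:
        if b & 1:
            c |= a
        a <<= 1
        b >>= 1
    return c

a = random.getrandbits(6000) | (1 << 5999)   # force full 6000-bit width
b = random.getrandbits(6000) | (1 << 5999)
bits = bin(or_multiply(a, b))[2:]
print(len(bits), bits.count('0'))            # ~12000 bits; for random inputs the few zeros sit near the two ends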

The best I could come up with is to use a fast out in the looping logic. Combined with the possibility of using the non-zero approach described by themis, you can answer your question by inspecting less than 2% of the N^2 problem.
Below is some code that gives the timing for numbers that are between 80% and 99% zero.
When the numbers get to around 88% zero, themis' approach becomes the better choice (it is not coded in the sample below, though).
This is not a highly theoretical solution, but it is practical.
OK, here is some "theory" of the problem space:
Basically, each bit of X (the output) is the OR of the cell values along one diagonal of a grid constructed by placing the bits of A along the top (MSB to LSB, left to right) and the bits of B along the side (MSB to LSB, top to bottom), where each cell is the AND of its row bit and column bit. Since a bit of X is 1 if any cell on its diagonal is 1, you can perform an early out on the cell traversal.
The code below does this and shows that even for numbers that are ~87% zero, you only have to check ~2% of the cells. For more dense (more 1's) numbers, that percentage drops even more.
In other words, I would not worry about tricky algorithms and just do some efficient logic checking. I think the trick is to look at the bits of your output as the diagonals of the grid, as opposed to the bits of A shift-ORed with the bits of B. The trickiest thing in this case is keeping track of the bits you can look at in A and B and how to index the bits properly.
Hopefully this makes sense. Let me know if I need to explain this a bit further (or if you find any problems with this approach).
NOTE: If we knew your problem space a bit better, we could optimize the algorithm accordingly. If your numbers are mostly non-zero, then this approach is better than themis', since his would result in more computations and storage space needed (sizeof(int) * NNZ).
NOTE 2: This assumes the data is basically bits, and I am using .NET's BitArray to store and access the data. I don't think this would cause any major headaches when translated to other languages. The basic idea still applies.
using System;
using System.Collections;

namespace BigIntegerOr
{
    class Program
    {
        private static Random r = new Random();

        private static BitArray WeightedToZeroes(int size, double pctZero, out int nnz)
        {
            nnz = 0;
            BitArray ba = new BitArray(size);
            for (int i = 0; i < size; i++)
            {
                ba[i] = (r.NextDouble() < pctZero) ? false : true;
                if (ba[i]) nnz++;
            }
            return ba;
        }

        static void Main(string[] args)
        {
            // make sure there are enough bytes to hold the 6000 bits
            int size = (6000 + 7) / 8;
            int bits = size * 8;

            Console.WriteLine("PCT ZERO\tSECONDS\t\tPCT CELLS\tTOTAL CELLS\tNNZ APPROACH");

            for (double pctZero = 0.8; pctZero < 1.0; pctZero += 0.01)
            {
                // fill the "BigInts"
                int nnzA, nnzB;
                BitArray a = WeightedToZeroes(bits, pctZero, out nnzA);
                BitArray b = WeightedToZeroes(bits, pctZero, out nnzB);

                // this is the answer "BigInt" that is at most twice the size minus 1
                int xSize = bits * 2 - 1;
                BitArray x = new BitArray(xSize);

                int LSB, MSB;
                LSB = MSB = bits - 1;

                // stats
                long cells = 0;
                DateTime start = DateTime.Now;

                for (int i = 0; i < xSize; i++)
                {
                    // compare using the diagonals
                    for (int bit = LSB; bit < MSB; bit++)
                    {
                        cells++;
                        x[i] |= (b[MSB - bit] && a[bit]);
                        if (x[i]) break;
                    }

                    // update the window over the bits
                    if (LSB == 0)
                    {
                        MSB--;
                    }
                    else
                    {
                        LSB--;
                    }

                    //Console.Write(".");
                }

                // stats
                TimeSpan elapsed = DateTime.Now.Subtract(start);
                double pctCells = (cells * 100.0) / (bits * bits);
                Console.WriteLine(pctZero.ToString("p") + "\t\t" + elapsed.TotalSeconds.ToString("00.000") + "\t\t" +
                                  pctCells.ToString("00.00") + "\t\t" + cells.ToString("00000000") + "\t" + (nnzA * nnzB).ToString("00000000"));
            }

            Console.ReadLine();
        }
    }
}

Just use any FFT polynomial multiplication algorithm and transform all resulting coefficients that are greater than or equal to 1 into 1.
Example:
10011 * 10001
[1 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0] * [1 x^4 + 0 x^3 + 0 x^2 + 0 x^1 + 1 x^0]
== [1 x^8 + 0 x^7 + 0 x^6 + 1 x^5 + 2 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0]
-> [1 x^8 + 0 x^7 + 0 x^6 + 1 x^5 + 1 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0]
-> 100110011
For an example of the algorithm, check:
http://www.cs.pitt.edu/~kirk/cs1501/animations/FFT.html
BTW, it is of linearithmic complexity, i.e., O(n log(n))
Also see:
http://everything2.com/title/Multiplication%2520using%2520the%2520Fast%2520Fourier%2520Transform
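As a rough illustration, here is a small Python sketch of that approach (it uses NumPy's convolve for brevity; for the O(n log(n)) bound you would swap in an FFT-based convolution such as scipy.signal.fftconvolve):
import numpy as np

def or_multiply(a, b):
    # treat the bits of a and b as polynomial coefficients
    a_bits = np.array([(a >> i) & 1 for i in range(a.bit_length())])
    b_bits = np.array([(b >> i) & 1 for i in range(b.bit_length())])
    coeffs = np.convolve(a_bits, b_bits)      # ordinary polynomial product
    result = 0
    for i, v in enumerate(coeffs):
        if v >= 1:                            # clamp every nonzero coefficient to 1
            result |= 1 << i
    return result

print(bin(or_multiply(0b10011, 0b10001)))     # 0b100110011, as in the example above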

Related

I am trying to solve recursion by hand

I am self-taught and thought that I understood recursion, but I cannot solve this problem:
What is returned by the call recur(12)?
What is returned by the call recur(25)?
public static int recur (int y)
{
    if (y <= 3)
        return y % 4;
    return recur(y-2) + recur(y-1) + 1;
}
Would someone please help me with understanding how to solve these problems?
First of all, I assume you mean:
public static int recur(int y)
but the results of this method are discovered by placing a print statement at the beginning of the method:
public static int recur(int y)
{
    System.out.println(y);
    if (y <= 3)
        return y % 4;
    return recur(y-2) + recur(y-1) + 1;
}
I am not sure what you mean by what is returned because there are several returns, though. Anyway, these are the steps to figure this out:
is 12 <= 3? No
recur(10) Don't proceed to the next recursion statement yet
is 10 <= 3? No
recur(8) Don't proceed to the next recursion statement yet
Continue this pattern until y <= 3 is true. Then you return y % 4 (whatever that number may be).
Now you are ready to go to the second recursive statement in the most recent recur() call. So, recur(y - 1).
Is y <= 3? If so, return y % 4. If not, do a process similar to step 1
once you return you add the result of recur(y - 2) + recur(y - 1) + 1. This will be a number of course.
continue this process for many iterations.
Recursion is difficult to follow and understand sometimes even for advanced programmers.
Here is a very common (and similar) problem for you to look into:
Java recursive Fibonacci sequence
I hope this helps!
Good luck!
I have removed the modulus there, since any nonnegative n less than 4 will just become n, so I ended up with:
public static int recur (int y)
{
    return y <= 3 ?
        y :
        recur(y-2) + recur(y-1) + 1;
}
I like to start off by testing the base cases, so what happens for y = 0, 1, 2, 3? Well, the result is just the argument: 0, 1, 2, 3 of course.
What about 4? Well, then it's not less than or equal to 3, so you get to replace it with recur(4-2) + recur(4-1) + 1, which is recur(2) + recur(3) + 1. Now you can solve each of those recur calls, since we established earlier that they just become their argument, so you end up with 2 + 3 + 1.
Now doing this for 12 or 25 is exactly the same just with more steps. Here is 5 which has just one more step:
recur(5); //=>
recur(3) + recur(4) + 1; //==>
recur(3) + ( recur(2) + recur(3) + 1 ) + 1; //==>
3 + 2 + 3 + 1 + 1; // ==>
10
So in reality the recursion pauses the current call until its sub-calls have produced the answers that the current call adds together. I could have done this the opposite way (bottom up), but then instead of pausing I would have reused previously calculated values.
You should have enough info to do any y.
This is nothing more than an augmented Fibonacci sequence.
The first four terms are defined as 0, 1, 2, 3. Thereafter, each term is the sum of the previous two terms, plus one. This +1 augmentation is where it differs from the classic Fibonacci sequence. Just add up the series by hand:
0
1
2
3
3+2+1 = 6
6+3+1 = 10
10+6+1 = 17
17+10+1 = 28
...
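If you want to check a hand trace, here is a small bottom-up Python sketch of the same sequence (it avoids the exponential recursion, so even recur(25) is instant):
def recur(y):
    # bottom-up version of the augmented Fibonacci above: the first four
    # terms are 0, 1, 2, 3, and each later term is the sum of the previous
    # two terms plus one
    if y <= 3:
        return y % 4
    prev2, prev1 = 2, 3                # recur(2), recur(3)
    for _ in range(4, y + 1):
        prev2, prev1 = prev1, prev2 + prev1 + 1
    return prev1

print(recur(12), recur(25))            # the two values the exercise asks for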

How is uint overflow defined in OpenCL?

What happens when the result of a multiplication or sum in OpenCL overflows? Does it wrap?
In particular I'd like to know if I can catch an overflow in
uint4 x = ( get_global_id( 0 ) * 4 + (uint4)(0, 1, 2, 3) ) * q + r;
with
int4 invalid = x < get_global_id( 0 ) * 4;
or how else that would be possible. (Assuming r >= 0 && q > r && q < (1 << 20) and the id will be at most just big enough to cause an overflow.)
Context: I want to check every 32 bit uint x for which x % q == r , where q and r are known. With vectors I can check 4 at a time, but the number of tests may not be divisible by 4.
I'm targeting the GPU, but that shouldn't be relevant, right?
OpenCL 1.2 standard (section 6.2.3.3) refers to C99 standard (section 6.3.1.3):
...if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Generally, get_global_id returns size_t, so a narrowing conversion is a bad idea IMO. That said, I have never faced an NDRange big enough to exceed the uint range.
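For intuition about the wrapping itself, here is a small host-side sketch (Python, purely illustrative; MASK and lanes are names I made up) that shows which lanes of the uint4 computation would wrap modulo 2^32:
MASK = (1 << 32) - 1          # unsigned 32-bit arithmetic wraps modulo 2**32

def lanes(gid, q, r):
    # exact (unbounded) values next to what the uint4 register would actually hold
    exact = [(gid * 4 + i) * q + r for i in range(4)]
    wrapped = [v & MASK for v in exact]
    overflowed = [w != v for w, v in zip(wrapped, exact)]
    return wrapped, overflowed

print(lanes(1073741823, 5, 2))   # near the top of the uint range: the lanes wrap
print(lanes(1000, 5, 2))         # small id: no wrap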

Determining the big Oh for (n-1)+(n-1)

I have been trying to get my head around this particular complexity computation, but everything I read about this type of recursion says it is big O(2^n). However, if I add a counter to the code and check how many times it iterates for a given n, it seems to follow a 4^n curve instead. Maybe I just misunderstood, as I placed a count++; inside the scope.
Is this not of type big O(2^n)?
public int test(int n)
{
    if (n == 0)
        return 0;
    else
        return test(n-1) + test(n-1);
}
I would appreciate any hints or explanation on this! I am completely new to this complexity calculation and this one has thrown me off the track.
//Regards
#include <stdio.h>

int test(int n)
{
    printf("%d\n", n);
    if (n == 0) {
        return 0;
    }
    else {
        return test(n - 1) + test(n - 1);
    }
}

int main(void)
{
    test(8);
    return 0;
}
With a printout at the top of the function, running test(8) and counting the number of times each n is printed yields this output, which clearly shows 2^n growth.
$ ./test | sort | uniq -c
256 0
128 1
64 2
32 3
16 4
8 5
4 6
2 7
1 8
(uniq -c counts the number of times each line occurs. 0 is printed 256 times, 1 128 times, etc.)
Perhaps you mean you got a result of O(2^(n+1)), rather than O(4^n)? If you add up all of these numbers you'll get 511, which for n=8 is 2^(n+1) - 1.
If that's what you meant, then that's fine. O(2^(n+1)) = O(2·2^n) = O(2^n)
First off: the 'else' statement is redundant, since the if already returns when it evaluates to true.
On topic: every call forks 2 further calls, which fork 2 calls themselves, etc. So for n=1 the function is called 2 times, plus the originating call; for n=2 the bottom level has 4 calls, then 8, then 16, etc. The complexity is therefore clearly 2^n, since the constants are cancelled out by the exponential.
I suspect your counter wasn't properly reset between calls.
Let x(n) be the number of base-case calls of test (calls with argument 0).
x(0) = 1
x(n) = 2 * x(n-1) = 2 * 2 * x(n-2) = 2 * 2 * ... * 2
There is a total of n twos, hence 2^n calls.
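A quick way to verify both counts (a small Python sketch; it tallies base-case calls and total calls):
def test(n, stats):
    stats["calls"] += 1
    if n == 0:
        stats["base"] += 1
        return 0
    return test(n - 1, stats) + test(n - 1, stats)

for n in range(9):
    s = {"calls": 0, "base": 0}
    test(n, s)
    print(n, s["base"], s["calls"])   # 2**n base-case calls, 2**(n+1) - 1 calls in total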
The running time T(n) of this function can easily be shown to satisfy T(n) = c + 2*T(n-1). The recurrence given by
T(0) = 0
T(n) = c + 2*T(n-1)
has as its solution T(n) = c*(2^n - 1). It's O(2^n).
Now, if you take the input size of your function to be m = lg n (the number of bits needed to represent n, the true input size), the situation is even worse: n = 2^m, so the running time 2^n becomes 2^(2^m), which is doubly exponential in the number of input bits.

bitxor, solve equation, boolean logic

I have the equation:
C = A^b + (2*A)^b + (4*A)^b.
Where C and A are known, but b is unknown. How to find b?
All numbers are 8 bit bytes. Is there any possible method much faster than brute-force?
Does the + sign indicate addition on bytes and * multiplication, with overflowing bits discarded? If so, I think the answer is
b = C ^ (A + 2 * A + 4* A)
How to reach that conclusion :
C = A^b + (2*A)^b + (4*A)^b
hence
C^b = A^b^b + (2*A)^b^b + (4*A)^b^b = A + 2*A + 4*A
then
C^C^b = b = C^(A + 2*A + 4*A)
EDIT Just to make sure : This answer is not correct. Shame on me. I'll have to think more about it.
I'm taking the same assumptions: + and * are addition and multiplication with overflow ignored.
Look-up Table
This would probably be the fastest solution: Precompute the results, and store them in a look-up table. It would require 2^16 bytes of memory, or 64 kB.
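For example, a small Python sketch of the look-up table idea (f and solve are helper names I am introducing; + and * are assumed to wrap modulo 256 as discussed above):
def f(a, b):
    return ((a ^ b) + (((2 * a) & 0xFF) ^ b) + (((4 * a) & 0xFF) ^ b)) & 0xFF

# table[a][c] -> the b with f(a, b) == c (None if no such b exists)
table = [[None] * 256 for _ in range(256)]
for a in range(256):
    for b in range(256):
        table[a][f(a, b)] = b

def solve(a, c):
    return table[a][c]

print(bin(solve(0b00001100, 0b01100111)))   # should match the traced example below: 0b110001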
Guess, Check, Refine Method
Presented in C-family-like pseudocode:
byte Solve(byte a, byte c){
    byte guess = 0, lastGuess = 0, result = 0, lastResult = 0;
    do {
        guess = lastGuess ^ lastResult ^ c;  // see explanation below
        result = (a ^ guess) + ((2*a) ^ guess) + ((4*a) ^ guess);  // overflow past 8 bits is discarded
        lastGuess = guess;
        lastResult = result;
    } while (result != c);
    return guess;
}
The idea of this algorithm is that it makes a guess at what b is, then plugs it into the formula for a tentative result, and checks it against c. Whatever bits in the guess caused the result to differ from c are changed. This corresponds to the XOR of the last guess, last result, and c (if this statement is a little bit of a jump, I encourage you to draw a truth table, and not just take my word for it!).
Explanation
It works because changing a bit can only affect the results of that bit, and the more significant bits, but not lower bits (since when you do addition with pen and paper, the carries can propagate to the left). So in the worst case the algorithm takes 2 guesses to get the least significant bit correct, another guess for the 2nd lsb, another for the 3rd, etc. for a maximum of 9 guesses given any combination of a and c.
Here's an example trace from my test program:
a: 00001100
c: 01100111
Guess: 01100111
Result: 01000001
Guess: 01000001
Result: 00010111
Guess: 00110001
Result: 01100111
b: 00110001

Solving a linear equation

I need to programmatically solve a system of linear equations in C, Objective C, or (if needed) C++.
Here's an example of the equations:
-44.3940 = a * 50.0 + b * 37.0 + tx
-45.3049 = a * 43.0 + b * 39.0 + tx
-44.9594 = a * 52.0 + b * 41.0 + tx
From this, I'd like to get the best approximation for a, b, and tx.
Cramer's Rule and Gaussian Elimination are two good, general-purpose algorithms (also see Simultaneous Linear Equations). If you're looking for code, check out GiNaC, Maxima, and SymbolicC++ (depending on your licensing requirements, of course).
EDIT: I know you're working in C land, but I also have to put in a good word for SymPy (a computer algebra system in Python). You can learn a lot from its algorithms (if you can read a bit of python). Also, it's under the new BSD license, while most of the free math packages are GPL.
You can solve this with a program exactly the same way you solve it by hand (with multiplication and subtraction, then feeding results back into the equations). This is pretty standard secondary-school-level mathematics.
-44.3940 = 50a + 37b + c (A)
-45.3049 = 43a + 39b + c (B)
-44.9594 = 52a + 41b + c (C)
(A-B): 0.9109 = 7a - 2b (D)
(B-C): -0.3455 = -9a - 2b (E)
(D-E): 1.2564 = 16a (F)
(F/16): a = 0.078525 (G)
Feed G into D:
0.9109 = 7a - 2b
=> 0.9109 = 0.549675 - 2b (substitute a)
=> 0.361225 = -2b (subtract 0.549675 from both sides)
=> -0.1806125 = b (divide both sides by -2) (H)
Feed H/G into A:
-44.3940 = 50a + 37b + c
=> -44.3940 = 3.92625 - 6.6826625 + c (substitute a/b)
=> -41.6375875 = c (subtract 3.92625 - 6.6826625 from both sides)
So you end up with:
a = 0.0785250
b = -0.1806125
c = -41.6375875
If you plug these values back into A, B and C, you'll find they're correct.
The trick is to use a simple 4x3 matrix which reduces in turn to a 3x2 matrix, then a 2x1 which is "a = n", n being an actual number. Once you have that, you feed it into the next matrix up to get another value, then those two values into the next matrix up until you've solved all variables.
Provided you have N distinct equations, you can always solve for N variables. I say distinct because these two are not:
7a + 2b = 50
14a + 4b = 100
They are the same equation multiplied by two so you cannot get a solution from them - multiplying the first by two then subtracting leaves you with the true but useless statement:
0 = 0 + 0
By way of example, here's some C code that works out the simultaneous equations that you placed in your question. First some necessary types, variables, a support function for printing out an equation, and the start of main:
#include <stdio.h>

typedef struct { double r, a, b, c; } tEquation;

tEquation equ1[] = {
    { -44.3940, 50, 37, 1 },    // -44.3940 = 50a + 37b + c (A)
    { -45.3049, 43, 39, 1 },    // -45.3049 = 43a + 39b + c (B)
    { -44.9594, 52, 41, 1 },    // -44.9594 = 52a + 41b + c (C)
};
tEquation equ2[2], equ3[1];

static void dumpEqu (char *desc, tEquation *e, char *post) {
    printf ("%10s: %12.8lf = %12.8lfa + %12.8lfb + %12.8lfc (%s)\n",
            desc, e->r, e->a, e->b, e->c, post);
}

int main (void) {
    double a, b, c;
Next, the reduction of the three equations with three unknowns to two equations with two unknowns:
    // First step, populate equ2 based on removing c from equ.
    dumpEqu (">", &(equ1[0]), "A");
    dumpEqu (">", &(equ1[1]), "B");
    dumpEqu (">", &(equ1[2]), "C");
    puts ("");

    // A - B
    equ2[0].r = equ1[0].r * equ1[1].c - equ1[1].r * equ1[0].c;
    equ2[0].a = equ1[0].a * equ1[1].c - equ1[1].a * equ1[0].c;
    equ2[0].b = equ1[0].b * equ1[1].c - equ1[1].b * equ1[0].c;
    equ2[0].c = 0;

    // B - C
    equ2[1].r = equ1[1].r * equ1[2].c - equ1[2].r * equ1[1].c;
    equ2[1].a = equ1[1].a * equ1[2].c - equ1[2].a * equ1[1].c;
    equ2[1].b = equ1[1].b * equ1[2].c - equ1[2].b * equ1[1].c;
    equ2[1].c = 0;

    dumpEqu ("A-B", &(equ2[0]), "D");
    dumpEqu ("B-C", &(equ2[1]), "E");
    puts ("");
Next, the reduction of the two equations with two unknowns to one equation with one unknown:
    // Next step, populate equ3 based on removing b from equ2.
    // D - E
    equ3[0].r = equ2[0].r * equ2[1].b - equ2[1].r * equ2[0].b;
    equ3[0].a = equ2[0].a * equ2[1].b - equ2[1].a * equ2[0].b;
    equ3[0].b = 0;
    equ3[0].c = 0;

    dumpEqu ("D-E", &(equ3[0]), "F");
    puts ("");
Now that we have a formula of the type number1 = unknown * number2, we can simply work out the unknown value with unknown <- number1 / number2. Then, once you've figured that value out, substitute it into one of the equations with two unknowns and work out the second value. Then substitute both those (now-known) unknowns into one of the original equations and you now have the values for all three unknowns:
    // Finally, substitute values back into equations.
    a = equ3[0].r / equ3[0].a;
    printf ("From (F ), a = %12.8lf (G)\n", a);

    b = (equ2[0].r - equ2[0].a * a) / equ2[0].b;
    printf ("From (D,G ), b = %12.8lf (H)\n", b);

    c = (equ1[0].r - equ1[0].a * a - equ1[0].b * b) / equ1[0].c;
    printf ("From (A,G,H), c = %12.8lf (I)\n", c);

    return 0;
}
The output of that code matches the earlier calculations in this answer:
>: -44.39400000 = 50.00000000a + 37.00000000b + 1.00000000c (A)
>: -45.30490000 = 43.00000000a + 39.00000000b + 1.00000000c (B)
>: -44.95940000 = 52.00000000a + 41.00000000b + 1.00000000c (C)
A-B: 0.91090000 = 7.00000000a + -2.00000000b + 0.00000000c (D)
B-C: -0.34550000 = -9.00000000a + -2.00000000b + 0.00000000c (E)
D-E: -2.51280000 = -32.00000000a + 0.00000000b + 0.00000000c (F)
From (F ), a = 0.07852500 (G)
From (D,G ), b = -0.18061250 (H)
From (A,G,H), c = -41.63758750 (I)
Take a look at the Microsoft Solver Foundation.
With it you could write code like this:
SolverContext context = SolverContext.GetContext();
Model model = context.CreateModel();
Decision a = new Decision(Domain.Real, "a");
Decision b = new Decision(Domain.Real, "b");
Decision c = new Decision(Domain.Real, "c");
model.AddDecisions(a,b,c);
model.AddConstraint("eqA", -44.3940 == 50*a + 37*b + c);
model.AddConstraint("eqB", -45.3049 == 43*a + 39*b + c);
model.AddConstraint("eqC", -44.9594 == 52*a + 41*b + c);
Solution solution = context.Solve();
string results = solution.GetReport().ToString();
Console.WriteLine(results);
Here is the output:
===Solver Foundation Service Report===
Datetime: 04/20/2009 23:29:55
Model Name: Default
Capabilities requested: LP
Solve Time (ms): 1027
Total Time (ms): 1414
Solve Completion Status: Optimal
Solver Selected: Microsoft.SolverFoundation.Solvers.SimplexSolver
Directives:
Microsoft.SolverFoundation.Services.Directive
Algorithm: Primal
Arithmetic: Hybrid
Pricing (exact): Default
Pricing (double): SteepestEdge
Basis: Slack
Pivot Count: 3
===Solution Details===
Goals:
Decisions:
a: 0.0785250000000004
b: -0.180612500000001
c: -41.6375875
For a 3x3 system of linear equations I guess it would be okay to roll out your own algorithms.
However, you might have to worry about accuracy, division by zero or really small numbers and what to do about infinitely many solutions. My suggestion is to go with a standard numerical linear algebra package such as LAPACK.
Are you looking for a software package that will do the work for you, or do you actually want to do the matrix operations and implement each step yourself?
For the first, a coworker of mine just used Ocaml GLPK. It is just a wrapper for the GLPK, but it removes a lot of the steps of setting things up. It looks like you're going to have to stick with the GLPK, in C, though. For the latter, thanks to delicious for saving an old article I used to learn LP a while back (PDF). If you need specific help setting things up further, let us know and I'm sure someone will wander back in and help, but I think it's fairly straightforward from here. Good luck!
Template Numerical Toolkit from NIST has tools for doing that.
One of the more reliable ways is to use a QR Decomposition.
Here's an example of a wrapper so that I can call "GetInverse(A, InvA)" in my code and it will put the inverse into InvA.
void GetInverse(const Array2D<double>& A, Array2D<double>& invA)
{
    // Build an identity matrix of the same size, then solve A * invA = I.
    int n = A.dim1();
    Array2D<double> I(n, n, 0.0);
    for (int i = 0; i < n; i++) I[i][i] = 1.0;

    QR<double> qr(A);
    invA = qr.solve(I);
}
Array2D is defined in the library.
In terms of run-time efficiency, others have answered better than I. If you will always have the same number of equations as variables, I like Cramer's rule as it's easy to implement. Just write a function to calculate the determinant of a matrix (or use one that's already written, I'm sure you can find one out there), and divide the determinants of two matrices.
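For instance, a small Python sketch of Cramer's rule applied to the system in the question (det3 is a helper name I am introducing):
def det3(m):
    # determinant of a 3x3 matrix by cofactor expansion
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

A = [[50.0, 37.0, 1.0],
     [43.0, 39.0, 1.0],
     [52.0, 41.0, 1.0]]
y = [-44.3940, -45.3049, -44.9594]

d = det3(A)
solution = []
for col in range(3):
    Ai = [row[:] for row in A]          # replace column col with the right-hand side
    for r in range(3):
        Ai[r][col] = y[r]
    solution.append(det3(Ai) / d)

a, b, tx = solution
print(a, b, tx)   # approximately 0.078525, -0.1806125, -41.6375875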
Personally, I'm partial to the algorithms of Numerical Recipes. (I'm fond of the C++ edition.)
This book will teach you why the algorithms work, plus show you some pretty-well debugged implementations of those algorithms.
Of course, you could just blindly use CLAPACK (I've used it with great success), but I would first hand-type a Gaussian Elimination algorithm to at least have a faint idea of the kind of work that has gone into making these algorithms stable.
Later, if you're doing more interesting linear algebra, looking around the source code of Octave will answer a lot of questions.
From the wording of your question, it seems like you have more equations than unknowns and you want to minimize the inconsistencies. This is typically done with linear regression, which minimizes the sum of the squares of the inconsistencies. Depending on the size of the data, you can do this in a spreadsheet or in a statistical package. R is a high-quality, free package that does linear regression, among a lot of other things. There is a lot to linear regression (and a lot of gotchas), but it's straightforward to do for simple cases. Here's an R example using your data. Note that "tx" is the intercept of your model.
> y <- c(-44.394, -45.3049, -44.9594)
> a <- c(50.0, 43.0, 52.0)
> b <- c(37.0, 39.0, 41.0)
> regression = lm(y ~ a + b)
> regression
Call:
lm(formula = y ~ a + b)
Coefficients:
(Intercept)            a            b
  -41.63759      0.07852     -0.18061
function x = LinSolve(A,y)
%
% Recursive Solution of Linear System Ax=y
% matlab equivalent: x = A\y
% x = n x 1
% A = n x n
% y = n x 1
% Uses stack space extensively. Not efficient.
% C allows recursion, so convert it into C.
% ----------------------------------------------
n = length(y);
x = zeros(n,1);
if (n > 1)
    x(1:n-1,1) = LinSolve( A(1:n-1,1:n-1) - (A(1:n-1,n)*A(n,1:n-1))./A(n,n) , ...
                           y(1:n-1,1) - A(1:n-1,n).*(y(n,1)/A(n,n)) );
    x(n,1) = (y(n,1) - A(n,1:n-1)*x(1:n-1,1))./A(n,n);
else
    x = y(1,1) / A(1,1);
end
For general cases, you could use Python along with NumPy for Gaussian elimination, and then plug the values back in to get the remaining variables.
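For example, a minimal NumPy sketch (np.linalg.solve handles the square system; np.linalg.lstsq covers the over-determined "best approximation" case mentioned in the question):
import numpy as np

A = np.array([[50.0, 37.0, 1.0],
              [43.0, 39.0, 1.0],
              [52.0, 41.0, 1.0]])
y = np.array([-44.3940, -45.3049, -44.9594])

a, b, tx = np.linalg.solve(A, y)   # approximately 0.078525, -0.1806125, -41.6375875
print(a, b, tx)

# with more equations than unknowns, use least squares instead:
# a, b, tx = np.linalg.lstsq(A, y, rcond=None)[0]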
