I have a math expression and I want to simplify it, if possible, so that it has the fewest operations and is thus the fastest to calculate. I'm not interested in precision, just speed. I found many online sites that simplify math expressions, but only for human readability, not for computation. Is there any algorithm/method to do so?
Btw one of my expressions is:
a(a*x+b*y+c*z)+d(d*x-b*z+c*y)+z(d*y-c*x+a*z)-b(d*z-a*y+b*x)
The rest are similar.
Let's start with
a*(a*x+b*y+c*z)+d*(d*x-b*z+c*y)+z*(d*y-c*x+a*z)-b*(d*z-a*y+b*x)
that has 16 multiplications and 11 additions/subtractions.
The third bracket looks a bit odd having z as a multiplier; I might have expected a constant c here.
If we expand
a*a*x+a*b*y+a*c*z+d*d*x-d*b*z+d*c*y+z*d*y-c*x*z+a*z*z-b*d*z-a*b*y+b*b*x
it goes up to 24 multiplications and 11 additions/subtractions.
Grouping by powers of x,y,z
(a*a+d*d+b*b)*x-c*x*z+(a*b+d*c-a*b)*y+d*y*z+(a*c-d*b-b*d)*z+a*z*z
Gives 18 multiplications and 11 additions. We could have gone to
(a*a+d*d+b*b-c*z)*x+(a*b+d*c-a*b+d*z)*y+(a*c-d*b-b*d+a*z)*z
With 15 multiplications and 11 additions. There is some simplification possible,
as the a*b*y terms cancel and there are two d*b*z terms.
(a*a+d*d+b*b-c*z)*x+(d*c+d*z)*y+(a*c-2*d*b+a*z)*z
13 multiplications and 8 additions.
Some further grouping
(a*a+d*d+b*b-c*z)*x+(c+z)*d*y+(a*(c+z)-2*d*b)*z
drops to 11 multiplications and 8 additions. There is a common c+z term, so we could use a temporary variable
c_z = c+z
(a*a+d*d+b*b-c*z)*x+c_z*d*y+(a*c_z-2*d*b)*z
11 multiplications and 7 additions, which, I think, is the best you are going to get.
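As a concrete illustration, the final form translates directly into C (a sketch of mine; the function name is arbitrary):

/* Final grouped form with the common c+z hoisted into a temporary:
   11 multiplications and 7 additions/subtractions, as counted above. */
double eval(double a, double b, double c, double d,
            double x, double y, double z)
{
    double c_z = c + z;                  /* 1 addition */
    return (a*a + d*d + b*b - c*z) * x   /* 5 mults, 3 adds/subs */
         + c_z * d * y                   /* 2 mults */
         + (a*c_z - 2*d*b) * z;          /* 4 mults, 1 sub, 2 joining adds */
}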
The first thing to note is that this is not substantially better than the initial version: 18 operations compared to 27, perhaps saving a third of the evaluation time. You may well find that this is not the bottleneck in your program.
There is an algorithm called Horner's Rule which can simplify the evaluation of polynomials. This tends to work better if you have higher powers of a single variable.
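For example (an illustration, not taken from the expression above), a cubic a*x^3 + b*x^2 + c*x + d can be rewritten as ((a*x + b)*x + c)*x + d, needing 3 multiplications and 3 additions instead of 6 and 3:

/* Horner form of a cubic: 3 multiplications, 3 additions. */
double horner_cubic(double a, double b, double c, double d, double x)
{
    return ((a*x + b)*x + c)*x + d;
}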
The above suggests an algorithm:
1. Expand all terms to form a set of monomials, cancel terms, and collect like terms.
2. Find the most frequent symbol, say x.
3. Group using that symbol.
4. Repeat from step 2.
So in your case expanding gives
a*a*x+a*b*y+a*c*z+d*d*x-d*b*z+d*c*y+z*d*y-c*x*z+a*z*z-b*d*z-a*b*y+b*b*x
cancelling and collecting like terms
a*a*x+a*c*z+d*d*x+d*c*y+z*d*y-c*x*z+a*z*z-2*b*d*z+b*b*x
find most common symbol, say x, group using that
(a*a+d*d-c*z+b*b)*x+a*c*z+d*c*y+z*d*y+a*z*z-2*b*d*z
repeat. Common symbol is z,
(a*a+d*d-c*z+b*b)*x+(a*c+d*y+a*z-2*b*d)*z+d*c*y
repeat. This time the most common symbol is the a inside the second bracket.
(a*a+d*d-c*z+b*b)*x+((c+z)*a+d*y-2*b*d)*z+d*c*y
repeat again, this time the most common symbol is d
(a*a+d*d-c*z+b*b)*x+((c+z)*a+(y-2*b)*d)*z+d*c*y
Giving a version with 11 multiplications and 8 additions/subtractions.
An alternative choice for the first common symbol is d. Starting again from
a*a*x+a*c*z+d*d*x+d*c*y+z*d*y-c*x*z+a*z*z-2*b*d*z+b*b*x
Group by d:
a*a*x+a*c*z+ (d*x+c*y+z*y-2*b*z)*d-c*x*z+a*z*z+b*b*x
Then by x:
(a*a-c*z+b*b)*x+a*c*z+ (d*x+c*y+z*y-2*b*z)*d+a*z*z
Then by a:
(a*a-c*z+b*b)*x+(c*z+z*z)*a+ (d*x+c*y+z*y-2*b*z)*d
Then by z:
(a*a-c*z+b*b)*x+((c+z)*z)*a+ (d*x+c*y+(y-2*b)*z)*d
Again a solution with 11 multiplications and 8 additions/subtractions.
I have a challenge. This may be a little tricky or even impossible, but I wanted to check if anyone has any thoughts on this.
PS: This question is general and not related only to R. Maybe I can say it's general mathematics.
I have some data:
df
ColA ColB ColC
6 9 27
1 4 32
4 8 40
If you observe closely, there is some relationship between these columns.
For example, (ColC/ColB)+ColA will give you the number 9.
df
ColA ColB ColC ColD
6 9 27 9
1 4 32 9
4 8 40 9
However, this data is manipulated; I made sure there is some relation.
But in general, let us take any numbers: is there a way to find if there is any relationship between them? It need not be (ColC/ColB)+ColA. It could be anything.
Say we have 5 columns of numeric data. I need to find a mathematical operation between these so that a common number exists.
This is more a mathematics (algebra) question.
Can anyone let me know if this is even possible?
For some types of relationships this is doable. But when such a method fails to find a relationship, it typically just means there could be a relationship of a kind not covered by your approach.
One common tool for finding relationships is linear algebra, and linear dependencies in particular. Write your data in a matrix like you did. Consider a linear equation:
a*ColA + b*ColB + c*ColC = 0
Use standard techniques such as Gaussian elimination to find coefficients a, b, c which satisfy this equation but are not all zero. You can probably find a library to compute the kernel of a matrix which you can use for that. Now you know whether one of the columns can be expressed as a linear combination of the other two.
This is a very limited class of relationships, and it doesn't cover your example yet. But you can improve it by including more columns. Include a column with ones everywhere to allow for a constant term in your formula. Include all pairwise products.
x + a*ColA + b*ColB + c*ColC + ab*ColA*ColB + ac*ColA*ColC + bc*ColB*ColC + aa*ColA^2 + bb*ColB^2 + cc*ColC^2 = 0
Now for your data this could tell you that there is a solution of the form
b=-9 c=1 ab=1 x=a=ac=bc=aa=bb=cc=0
-9*ColB + ColC + ColA*ColB = 0
which is equivalent to the relationship you described in your question.
But also observe that you are now using 3 data points to determine 10 variables. So this one relationship is by far not the only one.
In general you want at least as many data points as you have variables in your equation: at least as many rows as you have columns in your extended matrix. Only then can you say that a relationship between them is indeed a property of the underlying data and not merely an artifact of having too much flexibility and too little data.
In R you might want to look into using linear models for determining coefficients in the presence of imprecise data. You can also use powers of formulas to include all interactions between columns, i.e. the higher-degree terms which I included above as well.
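To make the kernel computation concrete, here is a minimal sketch in C (an illustration of mine, not library code; for simplicity it only builds the constant-plus-linear columns, not the products and squares):

#include <stdio.h>
#include <math.h>

#define ROWS 3   /* data points */
#define COLS 4   /* columns: 1, ColA, ColB, ColC */

/* Reduce the matrix and extract one non-trivial kernel vector, if any. */
int find_kernel_vector(double m[ROWS][COLS], double coeff[COLS])
{
    int pivot_col[ROWS], rank = 0;

    for (int c = 0; c < COLS && rank < ROWS; c++) {
        int p = -1;                       /* find a usable pivot row */
        for (int r = rank; r < ROWS; r++)
            if (fabs(m[r][c]) > 1e-9) { p = r; break; }
        if (p < 0) continue;              /* free column, no pivot here */
        for (int j = 0; j < COLS; j++) {  /* swap pivot row into place */
            double t = m[rank][j]; m[rank][j] = m[p][j]; m[p][j] = t;
        }
        double d = m[rank][c];            /* normalize the pivot row */
        for (int j = 0; j < COLS; j++) m[rank][j] /= d;
        for (int r = 0; r < ROWS; r++)    /* clear column c elsewhere */
            if (r != rank) {
                double f = m[r][c];
                for (int j = 0; j < COLS; j++) m[r][j] -= f * m[rank][j];
            }
        pivot_col[rank++] = c;
    }
    for (int c = 0; c < COLS; c++) {      /* first free column -> solution */
        int is_pivot = 0;
        for (int r = 0; r < rank; r++) if (pivot_col[r] == c) is_pivot = 1;
        if (!is_pivot) {
            for (int j = 0; j < COLS; j++) coeff[j] = 0;
            coeff[c] = 1;                 /* set the free variable to 1 */
            for (int r = 0; r < rank; r++)
                coeff[pivot_col[r]] = -m[r][c];
            return 1;
        }
    }
    return 0;                             /* kernel is trivial */
}

int main(void)
{
    double m[ROWS][COLS] = {              /* the df rows from the question */
        {1, 6, 9, 27},
        {1, 1, 4, 32},
        {1, 4, 8, 40},
    };
    double coeff[COLS];
    if (find_kernel_vector(m, coeff))
        printf("%g + %g*ColA + %g*ColB + %g*ColC = 0\n",
               coeff[0], coeff[1], coeff[2], coeff[3]);
    return 0;
}

With 3 rows and 4 columns a non-trivial kernel is guaranteed; on this data it finds 12*ColA - 11*ColB + ColC = 0, which illustrates the point above that with too few rows some relationship will always turn up.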
I have a pretty easy question (I think). As much as I've tried, I cannot find an answer to it.
I am creating a function for which I want the user to enter two numbers. The first is the number of terms of a certain infinite series to add together. The second is the number of digits the user would like the truncated sum to be accurate to.
Say the terms of the sequence are a_i. How much precision n would be required in MPFR to guarantee that the result of adding the a_i, from i=0 up to the user's entered value, is accurate to the number of digits the user needs?
By the way, I'm adding the a_i in a naive way.
Any help will be much appreciated.
Thanks,
Rick
You can convert between decimal digits of precision, d, and binary digits of precision, b, with logarithms
b = d × log(10) / log(2)
A little rearranging shows why
b × log(2) = d × log(10)
log(2^b) = log(10^d)
2^b = 10^d
Each term of the series (and each addition) will introduce a rounding error at the least significant digit, so, assuming each of the t terms involves n (two-argument) arithmetic operations, you will want to add an extra
log(t * (n+2))/log(2)
bits.
You'll need to round the number of bits of precision up to be sure that you have enough room for your decimal digits of precision
b = ceil((d*log(10.0) + log(t*(n+2)))/log(2.0));
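Putting it together, here is a sketch of the setup in C with MPFR (illustrative only; compute_term is a hypothetical function that fills in a_i):

#include <math.h>
#include <mpfr.h>

void compute_term(mpfr_t term, long i);   /* hypothetical: computes a_i */

/* Choose a working precision from the requested decimal digits d,
   the term count t, and the operations per term n, then sum naively. */
void sum_series(mpfr_t sum, long d, long t, long n)
{
    mpfr_prec_t prec =
        (mpfr_prec_t) ceil((d * log(10.0) + log(t * (n + 2.0))) / log(2.0));
    mpfr_t term;

    mpfr_set_prec(sum, prec);             /* caller passes an initialized sum */
    mpfr_init2(term, prec);
    mpfr_set_ui(sum, 0, MPFR_RNDN);
    for (long i = 0; i < t; i++) {
        compute_term(term, i);
        mpfr_add(sum, sum, term, MPFR_RNDN);
    }
    mpfr_clear(term);
}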
Finally, you should be aware that the terms may introduce cancellation errors, in which case this simple calculation will dramatically underestimate the required number of bits, even assuming I've got it right in the first place ;-)
EDIT
So it seems I "underestimated" what varying-length numbers meant. I didn't even think about situations where the operands are 100 digits long. In that case, my proposed algorithm is definitely not efficient. I'd probably need an implementation whose complexity depends on the number of digits in each operand, as opposed to its numerical value, right?
As suggested below, I will look into the Karatsuba algorithm...
Write the pseudocode of an algorithm that takes in two arbitrary length numbers (provided as strings), and computes the product of these numbers. Use an efficient procedure for multiplication of large numbers of arbitrary length. Analyze the efficiency of your algorithm.
I decided to take the (semi) easy way out and use the Russian Peasant Algorithm. It works like this:
a * b = a/2 * 2b if a is even
a * b = (a-1)/2 * 2b + b if a is odd
My pseudocode is:
rpa(x, y) {
    if x is 1
        return y
    if x is even
        return rpa(x/2, 2y)
    if x is odd
        return rpa((x-1)/2, 2y) + y
}
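For reference, here is a direct C translation of this pseudocode (a sketch; like the pseudocode it assumes x >= 1, and note it uses a fixed-width integer type, so it is not arbitrary length, which turns out to be the crux):

/* Russian Peasant multiplication, directly from the pseudocode above. */
unsigned long long rpa(unsigned long long x, unsigned long long y)
{
    if (x == 1)
        return y;
    if (x % 2 == 0)                      /* x is even */
        return rpa(x / 2, 2 * y);
    return rpa((x - 1) / 2, 2 * y) + y;  /* x is odd */
}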
I have 3 questions:
Is this efficient for arbitrary-length numbers? I implemented it in C and tried varying-length numbers. The run-time was near-instant in all cases, so it's hard to tell empirically...
Can I apply the Master Theorem to understand the complexity...?
a = # subproblems in recursion = 1 (max 1 recursive call across all states)
n / b = size of each subproblem = n / 1 -> b = 1 (problem doesn't change size...?)
f(n^d) = work done outside recursive calls = 1 -> d = 0 (the addition when a is odd)
a = 1, b^d = 1, a = b^d -> complexity is in n^d*log(n) = log(n)
this makes sense logically since we are halving the problem at each step, right?
What might my professor mean by providing arbitrary-length numbers "as strings"? Why do that?
Many thanks in advance
What might my professor mean by providing arbitrary-length numbers "as strings"? Why do that?
This actually changes everything about the problem (and makes your algorithm incorrect).
It means that 1234 is provided as 1,2,3,4 and you cannot operate directly on the whole number. You need to analyze your algorithm in terms of the number of additions, multiplications, and divisions.
You should expect a division to be a bit more expensive than a multiplication, and a multiplication to be a lot more expensive than an addition. So a good algorithm tries to reduce the number of divisions and multiplications.
Check out the Karatsuba algorithm (but don't copy it; that's not what your teacher wants); it is one of the fastest for this task.
Regarding 3): Native integers are limited in how large (or small) the numbers they represent can be (32- or 64-bit integers, for example). To represent arbitrary-length numbers you can choose strings, because then you are not really limited by this. The problem is then, of course, that your arithmetic units are not really made to add strings ;-)
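As a sketch of what "operating on strings" looks like (illustrative names and buffer sizes, not from the assignment), here is digit-by-digit addition in C, the kind of primitive such algorithms are analyzed in terms of:

#include <stdio.h>
#include <string.h>

/* Add two non-negative decimal numbers given as strings, one digit at
   a time with a carry, exactly as on paper. */
void add_strings(const char *a, const char *b, char *out)
{
    char tmp[1024];
    int i = (int) strlen(a) - 1, j = (int) strlen(b) - 1, k = 0, carry = 0;

    while (i >= 0 || j >= 0 || carry) {
        int s = carry;
        if (i >= 0) s += a[i--] - '0';
        if (j >= 0) s += b[j--] - '0';
        tmp[k++] = (char) ('0' + s % 10);   /* digits produced in reverse */
        carry = s / 10;
    }
    while (k > 0)
        *out++ = tmp[--k];                  /* un-reverse into the output */
    *out = '\0';
}

int main(void)
{
    char result[1024];
    add_strings("1234", "99", result);
    printf("%s\n", result);                 /* prints 1333 */
    return 0;
}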
Given the following letters in a license plate, how many combinations of them can you create?
AAAA1234
Please note that this is not a homework question (I am too old for college :)
I am only trying to understand permutations and combinations. I always get lost when I see questions like this. Do I use n!, nPr, or nCr?
Any book on this subject in addition to the logic used to arrive at the answer will also be greatly appreciated.
I have faith in exactly one method to remember such formulas: Rethink through the reasoning to justify it as needed. Then, each time you need the formula, remembering it becomes a mental exercise that makes it easier to remember it the next time. It also allows you to know the math on your own authority, instead of someone else's authority.
If the letters are all different, then there are n choices for the first letter, n-1 choices for the second letter, and so on. That makes n! arrangements. However, in your problem the letters are not all different. One trick is to tag them to make them different, so that you are overcounting, and then divide by the amount that you are overcounting. If a of the symbols are A, then you can tag them in a! ways. They are then all different, so the answer to the modified question is n!. The answer to the original question is therefore n!/a!. (This assumes that the symbols other than the A's are fixed, distinct numbers.)
Another argument is to count the positions for the numbers. There are n positions for the 1, n-1 positions for the 2, etc., so you get n(n-1)...(n-r+1) = n!/a!, where r = n-a.
In fact the answer is the same as the permutation formula nPr. And your arrangements are much the same as partial permutations, which is what the formula is for. But you'll learn it better if you reason through it before looking at the formula.
As for books, I might suggest Brualdi, Introductory Combinatorics.
One strategy that you can use (there will be many) is to get all the permutations possible, then divide out the repeats.
Permutations of 8 elements = 8!
But for each unique arrangement of these, there are a bunch more with the same positions of the A's. So, how many ways can you arrange four A's in one particular set of positions?
Permutations of 4 A's = 4!
So the total unique arrangements should be 8! / 4!
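Numerically, that is 8! / 4! = 40320 / 24 = 1680 unique arrangements.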
If I'm totally wrong, someone just say so and I'll delete this answer...
If you mean 4 letters A-Z and 4 digits 0...9 in that order, then you have
26 letters x
26 letters x
26 letters x
26 letters x
10 digits x
10 digits x
10 digits x
10 digits
= 26^4 * 10^4
= 4569760000
If no leading "0" is allowed, you get a few less.
Edit1: Miscounted the "A"
Edit2: I reread the question. Originally I thought it was just four letters at the beginning followed by 4 numbers. If it's just a permutation thing, then the answer is obviously different: 8! permutations in all, but 4! permutations of the A's are the same, so 8! / 4! = 1680.
The answer is 8!/4!.
Let's try to explain with a simpler question: how many arrangements of 112 are there?
There are three: 112, 121, and 211. If all digits were unique, we could just find the answer with 3!. But there is a repeating digit, so we divide out the rearrangements of the repeated digit: 3!/2! = 3.
Another example is 1122. We have two repeating digits here, so we divide out twice: 4!/(2!*2!) = 6.
I think this is a good explanation of permutations and combinations:
Easy Permutations and Combinations at BetterExplained.
It goes step by step until you discover how to make the calculations.
No need for permutations here, because all letters can be repeated, and so can the numbers.
Since the given example is [AAAA1234], we have 4 letters and 4 digits.
For each letter we have 26 {A-Z} possible choices.
That's why for 4 letters we will have 26^4.
For each digit we have 10 {0-9} possible choices, except the leading digit, which has 9 possible choices if it is not allowed to be 0 (case 1); otherwise it has 10 (case 2).
That's why for the 4 digits we will have 9*10^3 (case 1) or 10^4 (case 2).
The total number of combinations is 9*(26^4)*(10^3) in case 1, or (26^4)*(10^4) in case 2.
But if your question is about permutations of the set {A,A,A,A,1,2,3,4}, then consider the equivalent set {1,2,3,4,5,6,7,8} and avoid counting repeated sequences by dividing by the permutations of {5,6,7,8}; the answer is 8!/4! = 5*6*7*8 = 1680. The {5,6,7,8} represent the {A,A,A,A}. See #Tesserex & #erkangur.
How many distinct sets of positions can the A's occupy? Given this value, multiply by the number of distinct arrangements of 1234 and you have your answer. You'll need to choose the positions for the A's, and then 4! will help with the arrangements of 1234.
Consider a simpler example. Let's say you had asked the question:
How many arrangements are there of the symbols: ABCD1234?
Now, since every symbol is distinct, there are 8! ways to arrange them.
Now let's build up to your problem. If we change the letter B to an A, we have AACD1234.
This destroys the uniqueness of exactly half the possible combinations, since any combination where we could have previously switched the A and the B is now non-unique. Therefore, we now have 8!/2 combinations.
Similarly, replacing the C with another A removes more uniqueness: with three A's, each remaining arrangement now stands for 3! = 6 tagged versions instead of 2!, so we divide by 3; replacing the D then divides by 4.
So, if one symbol is duplicated k times in total, the generalized formula is (number of symbols total)!/k!
In your case, the number of possible arrangements is 8!/4! = 1680.
I'm trying to learn C and have come across the inability to work with REALLY big numbers (i.e., 100 digits, 1000 digits, etc.). I am aware that there exist libraries to do this, but I want to attempt to implement it myself.
I just want to know if anyone has or can provide a very detailed, dumbed down explanation of arbitrary-precision arithmetic.
It's all a matter of adequate storage and algorithms to treat numbers as smaller parts. Let's assume you have a compiler in which an int can only be 0 through 99 and you want to handle numbers up to 999999 (we'll only worry about positive numbers here to keep it simple).
You do that by giving each number three ints and using the same rules you (should have) learned back in primary school for addition, subtraction and the other basic operations.
In an arbitrary precision library, there's no fixed limit on the number of base types used to represent our numbers, just whatever memory can hold.
Addition for example: 123456 + 78:
12 34 56
      78
-- -- --
12 35 34
Working from the least significant end:
initial carry = 0.
56 + 78 + 0 carry = 134 = 34 with 1 carry
34 + 00 + 1 carry = 35 = 35 with 0 carry
12 + 00 + 0 carry = 12 = 12 with 0 carry
This is, in fact, how addition generally works at the bit level inside your CPU.
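A sketch of that scheme in C (an illustration, not my original library code): base-100 "digits" stored least significant first, so 123456 is {56, 34, 12}:

#define LIMBS 3

/* Add two base-100 numbers of LIMBS digits each; out has room for the
   possible carry into one extra digit. */
void add_base100(const int a[LIMBS], const int b[LIMBS], int out[LIMBS + 1])
{
    int carry = 0;
    for (int i = 0; i < LIMBS; i++) {
        int sum = a[i] + b[i] + carry;   /* at most 99 + 99 + 1 */
        out[i] = sum % 100;              /* keep the low base-100 digit */
        carry = sum / 100;               /* carry is 0 or 1 */
    }
    out[LIMBS] = carry;
}

With a = {56, 34, 12} and b = {78, 0, 0} this produces {34, 35, 12, 0}, i.e. 123534, matching the worked example above.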
Subtraction is similar (using subtraction of the base type and borrow instead of carry), multiplication can be done with repeated additions (very slow) or cross-products (faster) and division is trickier but can be done by shifting and subtraction of the numbers involved (the long division you would have learned as a kid).
I've actually written libraries to do this sort of stuff using the maximum powers of ten that can be fit into an integer when squared (to prevent overflow when multiplying two ints together, such as a 16-bit int being limited to 0 through 99 to generate 9,801 (<32,768) when squared, or 32-bit int using 0 through 9,999 to generate 99,980,001 (<2,147,483,648)) which greatly eased the algorithms.
Some tricks to watch out for.
1/ When adding or multiplying numbers, pre-allocate the maximum space needed, then reduce later if you find it's too much. For example, adding two 100-"digit" (where a digit is an int) numbers will never give you more than 101 digits. Multiplying a 12-digit number by a 3-digit number will never generate more than 15 digits (add the digit counts).
2/ For added speed, normalise (reduce the storage required for) the numbers only if absolutely necessary - my library had this as a separate call so the user can decide between speed and storage concerns.
3/ Addition of a positive and negative number is subtraction, and subtracting a negative number is the same as adding the equivalent positive. You can save quite a bit of code by having the add and subtract methods call each other after adjusting signs.
4/ Avoid subtracting big numbers from small ones since you invariably end up with numbers like:
10
11-
-- -- -- --
99 99 99 99 (and you still have a borrow).
Instead, subtract 10 from 11, then negate it:
11
10-
--
1 (then negate to get -1).
Here are the comments (turned into text) from one of the libraries I had to do this for. The code itself is, unfortunately, copyrighted, but you may be able to pick out enough information to handle the four basic operations. Assume in the following that -a and -b represent negative numbers and a and b are zero or positive numbers.
For addition, if signs are different, use subtraction of the negation:
-a + b becomes b - a
a + -b becomes a - b
For subtraction, if signs are different, use addition of the negation:
a - -b becomes a + b
-a - b becomes -(a + b)
Also special handling to ensure we're subtracting small numbers from large:
small - big becomes -(big - small)
Multiplication uses entry-level math as follows:
475(a) x 32(b) = 475 x (30 + 2)
= 475 x 30 + 475 x 2
= 4750 x 3 + 475 x 2
= 4750 + 4750 + 4750 + 475 + 475
The way in which this is achieved involves extracting each of the digits of 32 one at a time (backwards) then using add to calculate a value to be added to the result (initially zero).
ShiftLeft and ShiftRight operations are used to quickly multiply or divide a LongInt by the wrap value (10 for "real" math). In the example above, we add 475 to zero 2 times (the last digit of 32) to get 950 (result = 0 + 950 = 950).
Then we left shift 475 to get 4750 and right shift 32 to get 3. Add 4750 to zero 3 times to get 14250 then add to result of 950 to get 15200.
Left shift 4750 to get 47500, right shift 3 to get 0. Since the right shifted 32 is now zero, we're finished and, in fact 475 x 32 does equal 15200.
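The same flow on ordinary integers, as a sketch (a real bignum version would call the library's add and shift primitives instead of native arithmetic):

unsigned long mul_shift_add(unsigned long a, unsigned long b)
{
    unsigned long result = 0;
    while (b != 0) {
        unsigned long digit = b % 10;    /* lowest digit of b */
        for (unsigned long i = 0; i < digit; i++)
            result += a;                 /* add a 'digit' times */
        a *= 10;                         /* ShiftLeft */
        b /= 10;                         /* ShiftRight */
    }
    return result;                       /* mul_shift_add(475, 32) == 15200 */
}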
Division is also tricky but based on early arithmetic (the "gazinta" method for "goes into"). Consider the following long division for 12345 / 27:
       457
   +-------
27 | 12345     27 is larger than 1 or 12 so we first use 123.
     108       27 goes into 123 4 times, 4 x 27 = 108, 123 - 108 = 15.
     ---
      154      Bring down 4.
      135      27 goes into 154 5 times, 5 x 27 = 135, 154 - 135 = 19.
      ---
       195     Bring down 5.
       189     27 goes into 195 7 times, 7 x 27 = 189, 195 - 189 = 6.
       ---
         6     Nothing more to bring down, so stop.
Therefore 12345 / 27 is 457 with remainder 6. Verify:
457 x 27 + 6
= 12339 + 6
= 12345
This is implemented by using a draw-down variable (initially zero) to bring down the segments of 12345 one at a time until it's greater than or equal to 27.
Then we simply subtract 27 from that until we get below 27 - the number of subtractions is the segment added to the top line.
When there are no more segments to bring down, we have our result.
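A sketch of that draw-down loop on an ordinary integer divisor (illustrative only; a bignum version brings down limbs and uses the library's compare and subtract):

#include <stdio.h>

/* Long division by repeated subtraction: bring down one digit of the
   dividend at a time, subtract the divisor until below it. */
unsigned long long div_drawdown(const char *num, unsigned long long den,
                                unsigned long long *rem)
{
    unsigned long long quotient = 0, drawdown = 0;

    for (const char *p = num; *p; p++) {
        drawdown = drawdown * 10 + (unsigned long long) (*p - '0');
        unsigned long long count = 0;
        while (drawdown >= den) {        /* the "gazinta" step */
            drawdown -= den;
            count++;
        }
        quotient = quotient * 10 + count;
    }
    *rem = drawdown;
    return quotient;
}

int main(void)
{
    unsigned long long r, q = div_drawdown("12345", 27, &r);
    printf("%llu rem %llu\n", q, r);     /* 457 rem 6 */
    return 0;
}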
Keep in mind these are pretty basic algorithms. There are far better ways to do complex arithmetic if your numbers are going to be particularly large. You can look into something like GNU Multiple Precision Arithmetic Library - it's substantially better and faster than my own libraries.
It does have the rather unfortunate misfeature in that it will simply exit if it runs out of memory (a rather fatal flaw for a general purpose library in my opinion) but, if you can look past that, it's pretty good at what it does.
If you cannot use it for licensing reasons (or because you don't want your application just exiting for no apparent reason), you could at least get the algorithms from there for integrating into your own code.
I've also found that the bods over at MPIR (a fork of GMP) are more amenable to discussions on potential changes - they seem a more developer-friendly bunch.
While re-inventing the wheel is extremely good for your personal edification and learning, it's also an extremely large task. I don't want to dissuade you, as it's an important exercise and one that I've done myself, but you should be aware that there are subtle and complex issues at work that larger packages address.
For example, multiplication. Naively, you might think of the 'schoolboy' method, i.e. write one number above the other, then do long multiplication as you learned in school. example:
123
x 34
-----
492
+ 3690
---------
4182
but this method is extremely slow (O(n^2), n being the number of digits). Instead, modern bignum packages use either a discrete Fourier transform or a number-theoretic transform to turn this into an essentially O(n log n) operation.
And this is just for integers. When you get into more complicated functions on some type of real representation of number (log, sqrt, exp, etc.) things get even more complicated.
If you'd like some theoretical background, I highly recommend reading the first chapter of Yap's book, "Fundamental Problems of Algorithmic Algebra". As already mentioned, the gmp bignum library is an excellent library. For real numbers, I've used MPFR and liked it.
Don't reinvent the wheel: it might turn out to be square!
Use a third party library, such as GNU MP, that is tried and tested.
You do it in basically the same way you do with pencil and paper...
The number is to be represented in a buffer (array) able to take on an arbitrary size (which means using malloc and realloc) as needed
you implement basic arithmetic as much as possible using language supported structures, and deal with carries and moving the radix-point manually
you scour numerical analysis texts to find efficient algorithms for dealing with the more complex functions
you only implement as much as you need.
Typically you will use as your basic unit of computation
bytes containing 0-99 or 0-255
16-bit words containing either 0-9999 or 0-65535
32-bit words containing...
...
as dictated by your architecture.
The choice of binary or decimal base depends on your desires for maximum space efficiency, human readability, and the presence or absence of Binary Coded Decimal (BCD) math support on your chip.
You can do it with high-school mathematics, though more advanced algorithms are used in reality. For example, to add two 1024-byte numbers:
unsigned char first[1024], second[1024], result[1025];
unsigned int carry = 0;
unsigned int sum = 0;

/* limbs are base-256 digits, least significant first */
for (size_t i = 0; i < 1024; i++)
{
    sum = first[i] + second[i] + carry;  /* at most 255 + 255 + 1 = 511 */
    result[i] = sum & 0xFF;              /* keep the low byte */
    carry = sum >> 8;                    /* carry is 0 or 1 */
}
result[1024] = (unsigned char) carry;    /* extra place for the final carry */
The result has to be bigger by one place in the case of addition, to take care of maximum values. Look at this:
9
+
9
----
18
TTMath is a great library if you want to learn. It is built using C++. The above example was a silly one, but this is how addition and subtraction are done in general!
A good reference on the subject is Computational complexity of mathematical operations. It tells you how much space is required for each operation you want to implement. For example, if you have two N-digit numbers, you need 2N digits to store the result of their multiplication.
As Mitch said, it is by far not an easy task to implement! I recommend you take a look at TTMath if you know C++.
One of the ultimate references (IMHO) is Knuth's TAOCP Volume II. It explains lots of algorithms for representing numbers and arithmetic operations on these representations.
@Book{Knuth:taocp:2,
  author    = {Knuth, Donald E.},
  title     = {The Art of Computer Programming},
  volume    = {2: Seminumerical Algorithms, second edition},
  year      = {1981},
  publisher = {Addison-Wesley},
  isbn      = {0-201-03822-6},
}
Assuming that you wish to write big-integer code yourself, this can be surprisingly simple to do, speaking as someone who did it recently (though in MATLAB). Here are a few of the tricks I used:
I stored each individual decimal digit as a double number. This makes many operations simple, especially output. While it does take up more storage than you might wish, memory is cheap here, and it makes multiplication very efficient if you can convolve a pair of vectors efficiently. Alternatively, you can store several decimal digits in a double, but beware then that convolution to do the multiplication can cause numerical problems on very large numbers.
Store a sign bit separately.
Addition of two numbers is mainly a matter of adding the digits, then checking for a carry at each step.
Multiplication of a pair of numbers is best done as convolution followed by a carry step, at least if you have a fast convolution code on tap.
Even when you store the numbers as a string of individual decimal digits, division (also mod/rem ops) can be done to gain roughly 13 decimal digits at a time in the result. This is much more efficient than a divide that works on only 1 decimal digit at a time.
To compute an integer power of an integer, compute the binary representation of the exponent. Then use repeated squaring operations to compute the powers as needed.
Many operations (factoring, primality tests, etc.) will benefit from a powermod operation. That is, when you compute mod(a^p,N), reduce the result mod N at each step of the exponentiation where p has been expressed in a binary form. Do not compute a^p first, and then try to reduce it mod N.
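A sketch of that powermod idea on ordinary integers (illustrative only; it assumes intermediate products fit in 64 bits, a worry a bignum version would not have):

/* Compute a^p mod N by binary exponentiation, reducing mod N at
   every step rather than computing a^p first. */
unsigned long long powmod(unsigned long long a, unsigned long long p,
                          unsigned long long N)
{
    unsigned long long result = 1 % N;
    a %= N;
    while (p > 0) {
        if (p & 1)                       /* this binary digit of p is 1 */
            result = (result * a) % N;
        a = (a * a) % N;                 /* square for the next digit */
        p >>= 1;
    }
    return result;
}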
Here's a simple (naive) example I did in PHP.
I implemented "Add" and "Multiply" and used that for an exponent example.
http://adevsoft.com/simple-php-arbitrary-precision-integer-big-num-example/
Code snippet:
// Add two big non-negative integers given as decimal strings.
function ba($a, $b)
{
    if ($a === "0") return $b;
    else if ($b === "0") return $a;

    // Strip leading zeros, reverse, and split into 9-digit chunks,
    // so chunk 0 is the least significant group.
    $aa = str_split(strrev(strlen($a) > 1 ? ltrim($a, "0") : $a), 9);
    $bb = str_split(strrev(strlen($b) > 1 ? ltrim($b, "0") : $b), 9);
    $rr = array();
    $maxC = max(count($aa), count($bb));

    // Un-reverse each chunk and pad both numbers to the same length,
    // plus one extra chunk for a possible final carry.
    $aa = array_pad(array_map("strrev", $aa), $maxC + 1, "0");
    $bb = array_pad(array_map("strrev", $bb), $maxC + 1, "0");

    for ($i = 0; $i <= $maxC; $i++)
    {
        // Add matching chunks; the sum fits easily in a native int.
        $t = str_pad((string) ($aa[$i] + $bb[$i]), 9, "0", STR_PAD_LEFT);
        if (strlen($t) > 9)
        {
            // Carry the leading digit into the next chunk.
            $aa[$i + 1] = ba($aa[$i + 1], substr($t, 0, 1));
            $t = substr($t, 1);
        }
        array_unshift($rr, $t);   // build the result most significant first
    }

    // Trim the leading zeros introduced by the padding.
    $t = ltrim(implode($rr), "0");
    return $t === "" ? "0" : $t;
}