Project Euler #211 - efficiency issue - math

I've been slowly working my way through the list of Project Euler problems, and I've come to one that I know how to solve, but it seems like I can't (given the way my solution was written).
I am using Common Lisp for this, and my script has been running for over 24 hours (well over their one-minute goal).
For the sake of conciseness, here's my solution (it's a spoiler, but only if you have one hell of a fast processor):
(defun square? (num)
  (if (integerp (sqrt num)) T))

(defun factors (num)
  (let ((l '()))
    (do ((current 1 (1+ current)))
        ((> current (/ num current)))
      (if (= 0 (mod num current))
          (if (= current (/ num current))
              (setf l (append l (list current)))
              (setf l (append l (list current (/ num current)))))))
    (sort l #'<)))

(defun o_2 (n)
  (reduce #'+ (mapcar (lambda (x) (* x x)) (factors n))))

(defun sum-divisor-squares (limit)
  (loop for i from 1 to limit when (square? (o_2 i)) summing i))

(defun euler-211 ()
  (sum-divisor-squares 64000000))
The time required to solve the problem using smaller, more friendly test arguments seems to grow worse than exponentially... which is a real problem.
It took:
0.007 seconds to solve for 100
0.107 seconds to solve for 1000
2.020 seconds to solve for 10000
56.61 seconds to solve for 100000
1835.385 seconds to solve for 1000000
24+ hours to solve for 64000000
I'm really trying to figure out which part(s) of the script are causing it to take so long. I've put some thought into memoizing the factors function, but I'm at a loss as to how to actually implement that.
For those that want to take a look at the problem itself, here it be.
Any ideas on how to make this thing go faster would be greatly appreciated.
*Sorry if this is a spoiler to anyone; it's not meant to be... but if you have the computing power to run this in a decent amount of time, more power to you.

Here's a solution, keeping in mind the spirit of [Project] Euler. [Warning: spoiler. I've tried to reveal the hints gradually, so that you can read only part of the answer and think on your own if you want. :)]
When you are confronted with a problem having to do with numbers, one good strategy (as you probably already know from solving 210 Project Euler problems) is to look at small examples, find a pattern, and prove it. [The last part may be optional depending on your attitude to mathematics ;-)]
In this problem, though, looking at small examples -- for n=1,2,3,4,... will probably not give you any hint. But there is another sense of "small examples" when dealing with number-theoretic problems, which you also probably know by now -- primes are the building blocks of the natural numbers, so start with the primes.
For a prime number p, its only divisors are 1 and p, so the sum of the squares of its divisors is 1 + p^2.
For a prime power p^k, its only divisors are 1, p, p^2, …, p^k, so the sum of the squares of its divisors is 1 + p^2 + p^4 + … + p^(2k) = (p^(2k+2) - 1)/(p^2 - 1).
That was the simplest case: you've solved the problem for all numbers with only one prime factor.
So far nothing special. Now suppose you have a number n that has two prime factors, say n = pq. Then its factors are 1, p, q, and pq, so the sum of the squares of its divisors is 1 + p^2 + q^2 + p^2q^2 = (1 + p^2)(1 + q^2).
What about n = p^a q^b? What is the sum of the squares of its factors?
[............................Dangerous to read below this line...................]
It is ∑_{0≤c≤a, 0≤d≤b} (p^c q^d)^2 = ((p^(2a+2) - 1)/(p^2 - 1)) · ((q^(2b+2) - 1)/(q^2 - 1)).
That should give you the hint, both on what the answer is and how to prove it: the sum of the squares of the divisors of n is simply the product of the (answer) for each of the prime powers in its factorization, so all you need to do is to factorize 64000000 (which is very easy to do even in one's head :-)) and multiply the answer for each (=both, because the only primes are 2 and 5) of its prime powers.
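If you want to see that closed form in code, a quick Common Lisp sketch (just a sketch: it assumes the factorization is handed in as (prime . exponent) pairs, e.g. ((2 . 12) (5 . 6)) since 64000000 = 2^12 * 5^6) might look like:
;; sigma2 of n from its prime factorization, using
;; 1 + p^2 + p^4 + ... + p^(2k) = (p^(2k+2) - 1) / (p^2 - 1)
(defun sigma2-from-factorization (factorization)
  (reduce #'*
          (mapcar (lambda (prime-power)
                    (destructuring-bind (p . k) prime-power
                      (/ (1- (expt p (* 2 (1+ k))))
                         (1- (expt p 2)))))
                  factorization)))

;; (sigma2-from-factorization '((2 . 12) (5 . 6)))  ; sigma2 of 64000000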
That solves the Project Euler problem; now the moral to take away from it.
The more general fact here is about multiplicative functions -- functions on the natural numbers such that f(mn) = f(m)f(n) whenever gcd(m,n)=1, i.e. m and n have no prime factors in common. If you have such a function, the value of the function at a particular number is completely determined by its values at prime powers (can you prove this?)
The slightly harder fact, which you can try to prove [it's not that hard], is this: if you have a multiplicative function f [here, f(n) = n^2] and you define the function F as F(n) = ∑_{d | n} f(d) (as the problem did here), then F(n) is also a multiplicative function.
[In fact something very beautiful is true, but don't look at it just yet, and you'll probably never need it. :-)]

I think that your algorithm is not the most efficient possible. Hint: you may be starting from the wrong side.
edit: I'd like to add that choosing 64000000 as the upper limit is likely the problem poster's way of telling you to think of something better.
edit: A few efficiency hints:
instead of
(setf l (append l (...)))
you can use
(push (...) l)
which destructively modifies your list by consing a new cell with your value as car and the former l as cdr, then points l to this cell. This is much faster than appending, which has to traverse the whole list on every call. If you need the list in the other order, you can nreverse it after it is complete (but that is not needed here).
why do you sort l?
you can make (> current (/ num current)) more efficient by comparing with the square root of num instead (which only needs to be computed once per num).
is it perhaps possible to find the factors of a number more efficiently?
And a style hint: You can put the scope of l into the do declaration:
(do ((l ())
     (current 1 (+ current 1)))
    ((> current (/ num current))
     l)
  ...)
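Putting the push and square-root hints together, a reworked factors might look something like this (just a sketch; note that it returns the divisors unsorted, which is fine here):
(defun factors (num)
  ;; collect each divisor d together with num/d, walking d only up to sqrt(num)
  (let ((limit (isqrt num))   ; computed once per call
        (divisors '()))
    (do ((current 1 (1+ current)))
        ((> current limit) divisors)
      (when (zerop (mod num current))
        (push current divisors)
        (unless (= current (/ num current))
          (push (/ num current) divisors))))))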

I would attack this by doing the prime factorization of the number (for example: 300 = 2^2 * 3^1 * 5^2), which is relatively fast, especially if you generate this by sieve. From this, it's relatively simple to generate the factors by iterating i=0..2; j=0..1; k=0..2, and doing 2^i * 3^j * 5^k.
5 3 2   (exponents of 5, 3, and 2)
-----
0 0 0 = 1
0 0 1 = 2
0 0 2 = 4
0 1 0 = 3
0 1 1 = 6
0 1 2 = 12
1 0 0 = 5
1 0 1 = 10
1 0 2 = 20
1 1 0 = 15
1 1 1 = 30
1 1 2 = 60
2 0 0 = 25
2 0 1 = 50
2 0 2 = 100
2 1 0 = 75
2 1 1 = 150
2 1 2 = 300
This might not be fast enough, though.
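In Common Lisp, generating the divisors from a factorization given as (prime . exponent) pairs could look roughly like this (only a sketch of the idea above, not tuned for speed):
(defun divisors-from-factorization (factorization)
  ;; e.g. ((2 . 2) (3 . 1) (5 . 2)) for 300; multiply in each prime's
  ;; powers 0..max-exponent, exactly as in the table above
  (let ((divisors (list 1)))
    (dolist (prime-power factorization divisors)
      (destructuring-bind (prime . max-exponent) prime-power
        (setf divisors
              (loop for d in divisors
                    append (loop for e from 0 to max-exponent
                                 collect (* d (expt prime e)))))))))

;; (divisors-from-factorization '((2 . 2) (3 . 1) (5 . 2)))
;; => the 18 divisors of 300 listed above, in a different order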

The clever trick you are missing is that you don't need to factor the numbers at all.
How many numbers from 1..N are multiples of 1? N
How many numbers from 1..N are multiples of 2? N/2
The trick is to accumulate each number's sum of squared divisors in one big list (or array), indexed by the number itself.
For 1, add 1^2 to every entry in the list. For 2, add 2^2 to every second entry.
For 3, add 3^2 to every 3rd entry.
Don't check for divisibility at all.
At the end, you do have to check whether the sum is a perfect square, and that's it.
In C++, this worked in 58 seconds for me.
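For reference, here is roughly how that sieve might look in Common Lisp (a sketch, not the 58-second C++; it uses isqrt for the perfect-square test, which stays exact for large sums, and it needs an array of 64 million counters, i.e. a few hundred MB):
(defun euler-211-sieve (limit)
  (let ((sigma2 (make-array (1+ limit) :initial-element 0)))
    ;; every d divides exactly the multiples of d, so no divisibility tests
    (loop for d from 1 to limit
          do (loop for multiple from d to limit by d
                   do (incf (aref sigma2 multiple) (* d d))))
    ;; sum the n whose sigma2 is a perfect square
    (loop for n from 1 below limit
          for s = (aref sigma2 n)
          for r = (isqrt s)
          when (= (* r r) s)
            sum n)))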

Sorry, I don't understand LISP well enough to read your answer. But my first impression is that the time cost of the brute force solution should be:
open bracket
sqrt(k) to find the divisors of k (by trial division), square each one (constant time per factor), and sum them (constant time per factor). The resulting sum is σ_2(k), which I will call x.
plus
not sure what the complexity of a good integer square root algorithm is, but certainly no worse than sqrt(x) (dumb trial multiplication). x might well be big-O larger than k, so I reserve judgement here, but x is obviously bounded above by k^3, because k has at most k divisors, each itself no bigger than k and hence its square no bigger than k^2. It's been so long since my maths degree that I have no idea how fast Newton-Raphson converges, but I suspect it's faster than sqrt(x), and if all else fails a binary chop is log(x).
close bracket
multiplied by n (as k ranges 1 .. n).
So if your algorithm is worse than O(n * sqrt(n^3)) = O(n^(5/2)) in the dumb-sqrt case, or O(n * (sqrt(n) + log(n^3))) = O(n^(3/2)) in the clever-sqrt case, I think something has gone wrong which should be identifiable in the algorithm. At this point I'm stuck because I can't debug your LISP.
Oh, I've assumed that arithmetic is constant-time for the numbers in use. It darn well should be for numbers as small as 64 million: their squares fit comfortably in a 64-bit unsigned integer, and the sums of squared divisors are only a small constant factor bigger than the square. But even if your LISP implementation is making arithmetic worse than O(1), it shouldn't be worse than O(log n), so it won't have much effect on the complexity. Certainly won't make it super-polynomial.
This is where someone comes along and tells me just how wrong I am.
Oops, I just looked at your actual timing figures. They aren't worse than exponential. Ignoring the first and last values (because small times aren't accurately measurable and you haven't finished, respectively), multiplying n by 10 multiplies time by no more than 30-ish. 30 is about 10^1.5, which is about right for brute force as described above.
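A quick check of that against the timings above:
(/ 56.61 2.020)     ;=> ~28.0
(/ 1835.385 56.61)  ;=> ~32.4, and (expt 10 1.5) => ~31.6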

I think you can attack this problem with something like a prime sieve. That's only my first impression though.

I've reworked the program with some notes taken from the comments here. The factors function is now ever so slightly more efficient, and I also had to modify the σ_2(n) function (o_2) to accept the new output.
'factors' went from having an output like:
$ (factors 10) => (1 2 5 10)
to having one like
$ (factors 10) => ((2 5) (1 10))
The revised o_2 function looks like this:
(defun o_2 (n)
  "sum of squares of divisors"
  (reduce #'+ (mapcar (lambda (x) (* x x)) (reduce #'append (factors n)))))
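One possible shape for the reworked factors (not necessarily the exact code used here, but it reproduces the ((2 5) (1 10)) output using the push and isqrt hints from the answers above):
(defun factors (num)
  ;; hypothetical: push each (d num/d) pair, collapsing d = num/d to (d)
  (let ((limit (isqrt num))
        (pairs '()))
    (do ((current 1 (1+ current)))
        ((> current limit) pairs)
      (when (zerop (mod num current))
        (if (= current (/ num current))
            (push (list current) pairs)
            (push (list current (/ num current)) pairs))))))

;; (factors 10) => ((2 5) (1 10))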
After the modest re-writes I did, I only saved about 7 seconds in the calculation for 100,000.
Looks like I'm going to have to get off of my ass and write a more direct approach.

Related

Can't understand this Tree recursion problem

So I'm going through the SICP book. I'm in the tree recursion chapter. I googled tree recursion to gain more knowledge about it, and I stumbled upon this exercise, and I'm having a hard time understanding it completely.
Exercise:
I want to go up a flight of stairs that has n steps. I can either take 1 or 2 steps each time. How many different ways can I go up this flight of stairs?
The answer was:
For example, in the case where n is 5, there are 8 possible ways:
1 1 1 1 1
2 1 1 1
1 2 1 1
1 1 2 1
1 1 1 2
1 2 2
2 1 2
2 2 1
And this is the code block I had trouble fully understanding:
(define (count-stairs n)
  (cond [(= n 1) 1]
        [(= n 2) 2]
        [else (+ (count-stairs (- n 1))
                 (count-stairs (- n 2)))]))
The image illustrating the process
My problem is: why is there the + sign? Doesn't count-stairs(4) + count-stairs(3) result in 7 steps? Or am I missing something here?
ALSO: here's the full link to the exercise https://berkeley-cs61as.github.io/textbook/tree-recursion.html
Please, I need your help!
The tree diagram is just giving the space of function calls and their arguments that occur starting with (count-stairs 5). When we call the function with argument 5, it will call (count-stairs 4) due to the expression (count-stairs (- n 1)) and it will call (count-stairs 3) due to the expression (count-stairs (- n 2)). Of course, these values get added with + which becomes the return value of the call. The tree just doesn't show that return value information, just the call arguments.
(count-stairs 5) doesn't mean "count five stairs", but "call the count-stairs function with argument 5 to calculate how many different ways there are to go up a flight of 5 stairs".
For (count-stairs 3) the result will be 3, because (count-stairs 1) and (count-stairs 2) just return 1 and 2, respectively.
However, (count-stairs 4) adds (count-stairs 3) and (count-stairs 2), therefore (count-stairs 4) -> 5.
We can use this arrow notation to annotate the expressions in the tree with their result values, starting from the bottom and working upward. At the top of the tree we will end up with (count-stairs 5) -> 8.
count-stairs is just a slight variation of the recursive Fibonacci function in disguise.
Why does this calculate the number of ways of ascending the stairs using 1 or 2 sized steps? Firstly, the base cases are clear. If a staircase has one step, there is only one way to traverse it: we take that one step. So (count-stairs 1) -> 1. If there are two steps, then there are two ways: take each step, or take both of them in one stride. Thus (count-stairs 2) -> 2. Then comes the tricky inductive part. If we are faced with three or more stairs, what is the solution?
If we are faced with a staircase with n steps, n > 2, then we have two possibilities about how to begin climbing. Possibility (1): we can take one step, and then climb the remaining staircase of n - 1 steps; or, possibility (2) we can take two steps as a single stride, and then climb the remaining staircase of n - 2 steps. Thus the number of ways of climbing n steps is the sum of the ways from these two possibilities: the number of ways of climbing n - 1 steps, plus the number of ways of climbing n - 2 steps.
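To see the addition at work, here is the same recursion sketched in Common Lisp, with the return values from the bottom of the tree written out:
(defun count-stairs (n)
  (cond ((= n 1) 1)
        ((= n 2) 2)
        (t (+ (count-stairs (- n 1))
              (count-stairs (- n 2))))))

;; (count-stairs 1) -> 1
;; (count-stairs 2) -> 2
;; (count-stairs 3) -> (+ 2 1) -> 3
;; (count-stairs 4) -> (+ 3 2) -> 5
;; (count-stairs 5) -> (+ 5 3) -> 8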

White-box and Black-box testing of recursive functions

I learned white-box and black-box testing in terms of iterative functions. Now I need to do white-box and black-box testing of several recursive functions (in F#). Take the following recursive algorithm for gcd:
gcd (m, n)
  if (m % n) = 0 then
    n
  else
    gcd n (m % n)
For the white-box test: how exactly do I go about covering the different branches of the algorithm? Naively one could say there are two branches, but when the function is called more than once the possible branches will obviously increase. Should I do testing with arguments which result in different numbers of recursive calls, or how exactly do I determine which values to test with?
Black-box: I get the general idea of black-box testing. We should look at possible values we might want to call the function with, without having knowledge of its inner workings. In this case I am just not sure which values we might want to call it with. One way could be just to start with two values m and n for which gcd = 1, then do the same for values m and n for which gcd = 2, up to some gcd = n for some arbitrary number n. Is this how one is supposed to go about this?
First of all, I don't think there is one single established definition of how to do white-box and black-box testing of recursive functions, but here is how I interpret it.
White-box testing. We want to test the function based on its inner working. In case of recursive functions, I think this means that we want to test that the recursive calls it makes are the ones we would expect. One way to do this is to log all recursive calls. A simple implementation of gcd that does this adds a parameter to keep a log and returns it with the result:
let rec gcd log m n =
  let log = (m, n)::log
  if (m % n) = 0 then List.rev log, n
  else gcd log n (m % n)
Now, for some two parameters, say 54 and 22, you can do the calculation by hand, decide what the parameters of the recursive calls should be and write a test for that:
let log, res = gcd [] 54 22
log |> shouldEqual [ (54, 22); (22, 10); (10, 2) ]
Black-box testing. Here, we assume we do not know how exactly the function works, so we cannot test its internals. All we can do is to test it using a number of inputs. It is probably a good idea to think of corner-case or tricky inputs because those are the ones that could cause problems. Given a simple implementation:
let rec gcd m n =
  if (m % n) = 0 then n
  else gcd n (m % n)
I would probably write tests for the following:
// A random case where one of the numbers is the result
gcd 100 50 |> shouldEqual 50
gcd 50 100 |> shouldEqual 50
// A random case where the only common divisor is 1
gcd 13 123 |> shouldEqual 1
gcd 123 13 |> shouldEqual 1
// The following are problematic and I'm not sure what the right behaviour is
gcd 0 0 // This probably should not be allowed
gcd 10 -5 // This returns -5, but I'm not sure that's what we want
Random testing.
You could also use random testing (which is a form of black box testing) to generate multiple test cases automatically. There are at least two random tests I can think of:
Generate two random numbers, a and b and check that gcd a b = gcd b a. This is testing only a very basic property, but it can cover quite a lot of cases.
Pick a random number a and a couple of primes p1, p2, .... Then split the primes into two groups and produce a*p1*p3*p5 and a*p2*p4*p6. Write a test that checks that the GCD of the two numbers is a.
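The first of those random tests, sketched in Common Lisp rather than F# just to show the shape of the property (the built-in gcd stands in for the implementation under test):
(defun gcd-commutative-p (trials limit)
  ;; check gcd(a, b) = gcd(b, a) for TRIALS random pairs in 1..LIMIT
  (loop repeat trials
        for a = (1+ (random limit))
        for b = (1+ (random limit))
        always (= (gcd a b) (gcd b a))))

;; (gcd-commutative-p 1000 1000000) => T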

Why 2 ^ 3 ^ 4 = 0 in Julia?

I just read a post from Quora:
http://www.quora.com/Is-Julia-ready-for-production-use
At the bottom, there's an answer that said:
2 ^ 3 ^ 4 = 0
I tried it myself:
julia> 2 ^ 3 ^ 4
0
Personally I don't consider this a bug in the language. We can add parentheses for clarity, both for Julia and for us human beings:
julia> (2 ^ 3) ^ 4
4096
So far so good; however, this doesn't work:
julia> 2 ^ (3 ^ 4)
0
Since I'm learning, I'd like to know: how does Julia evaluate this expression to 0? What's the evaluation precedence?
julia> typeof(2 ^ 3 ^ 4)
Int64
I'm surprised I couldn't find a duplicate question about this on SO yet. I figure I'll answer this slightly differently than the FAQ in the manual since it's a common first question. Oops, I somehow missed: Factorial function works in Python, returns 0 for Julia
Imagine you've been taught addition and multiplication, but never learned any numbers higher than 99. As far as you're concerned, numbers bigger than that simply don't exist. So you learned to carry ones into the tens column, but you don't even know what you'd call the column you'd carry tens into. So you just drop them. As long as your numbers never get bigger than 99, everything will be just fine. Once you go over 99, you wrap back down to 0. So 99+3 ≡ 2 (mod 100). And 52*9 ≡ 68 (mod 100). Any time you do a multiplication whose result has two or more factors of 10, your answer will be zero: 25*32 ≡ 0 (mod 100). Now, after you do each computation, someone could ask you "did you go over 99?" But that takes time to answer… time that could be spent computing your next math problem!
This is effectively how computers natively do arithmetic, except they do it in binary with 64 bits. You can see the individual bits with the bits function:
julia> bits(45)
"0000000000000000000000000000000000000000000000000000000000101101"
As we multiply it by 2, 101101 will shift to the left (just like multiplying by 10 in decimal):
julia> bits(45 * 2)
"0000000000000000000000000000000000000000000000000000000001011010"
julia> bits(45 * 2 * 2)
"0000000000000000000000000000000000000000000000000000000010110100"
julia> bits(45 * 2^58)
"1011010000000000000000000000000000000000000000000000000000000000"
julia> bits(45 * 2^60)
"1101000000000000000000000000000000000000000000000000000000000000"
… until it starts falling off the end. If you multiply 64 or more twos together, the answer will always be zero (just like multiplying two or more tens together in the example above). We can ask the computer if it overflowed, but doing so by default for every single computation has some serious performance implications. So in Julia you have to be explicit. You can either ask Julia to check after a specific multiplication:
julia> Base.checked_mul(45, 2^60) # or checked_add for addition
ERROR: OverflowError()
in checked_mul at int.jl:514
Or you can promote one of the arguments to a BigInt:
julia> bin(big(45) * 2^60)
"101101000000000000000000000000000000000000000000000000000000000000"
In your example, you can see that the answer is 1 followed by 81 zeros when you use big integer arithmetic:
julia> bin(big(2) ^ 3 ^ 4)
"1000000000000000000000000000000000000000000000000000000000000000000000000000000000"
For more details, see the FAQ: why does julia use native machine integer arithmetic?

range based number guessing game

I want to find the difference between two numbers in a range, but I need to be able to wrap around to the beginning of the range, like a circular list.
The range is 9.
So if the number is 6 and the guess is 5 the answer should be 1, but if the number is 8 and the guess is 2, then the answer should be 3.
My first thought was to bump the number by 10 like this:
n is the correct number, g is the guess, r is the result.
(let [r (min (- (+ n 10) g) (- g n))]
  (if (> 0 r) (* -1 r) r))
... and that worked for wrapping around, but then the problem is that the existence of the number 10 increases the result by 1 if it wraps. Just subtracting 1 from the result or the number doesn't work either in all cases.
Depending on the numbers in question, the result is negative, so the if statement is to swap it around to positive.
This isn't a clojure problem exactly, it's really a math issue and I'd have this problem in any language, but it so happens that's what I'm writing it in. I've only just started using clojure (or any functional language), so it's entirely possible I'm doing things wrong or wildly unidiomatic.
Thanks for any help.
If you include zero in your range, the question becomes: take the smaller of (A - B) mod R and (B - A) mod R.
R = 9
8 - 3 mod 9 = 5
3 - 8 mod 9 = 4
min(5 4) = 4
You can get the effect of not allowing guesses of zero by decrementing the numbers you read and incrementing the numbers you print, though your game will be mathematically simpler if you allow them to guess the number zero.
(defn mod- [x y r]
  (let [res (rem (- x y) r)]
    (if (neg? res)
      (+ res r)
      res)))
(min (mod- A B r) (mod- B A r))
Are you sure that in the case where the number is 8 and the guess is 2 the answer should be 3, not 6? Provided that's the case, what about the following function to calculate the difference:
(defn difference [rrange number-to-guess guess]
  (-> rrange
      (- guess)
      (+ number-to-guess)
      (mod rrange)))
Here are the tests.
user=> (= 1 (difference 9 6 5))
true
user=> (= 6 (difference 9 8 2))
true

Multiply without + or *

I'm working my way through How to Design Programs on my own. I haven't quite grasped complex linear recursion, so I need a little help.
The problem:
Define multiply, which consumes two natural numbers, n and x, and produces n * x without using Scheme's *. Eliminate + from this definition, too.
Straightforward with the + sign:
(define (multiply n m)
  (cond
    [(zero? m) 0]
    [else (+ n (multiply n (sub1 m)))]))

(= (multiply 3 3) 9)
I know to use add1, but I can't get the recursion right.
Thanks.
Split the problem in two functions. First, you need a function (add m n) which adds m to n. What is the base case? When n is zero, return m. What is the recursive step? Add one to the result of calling add again, but decrementing n. You guessed it, add1 and sub1 will be useful.
The other function, (mul m n), is similar. What is the base case? If either m or n is zero, return 0. What is the recursive step? Add (using the previously defined function) m to the result of calling mul again, but decrementing n. And that's it!
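A sketch of those two functions, written here in Common Lisp (the book's add1 and sub1 correspond to 1+ and 1-):
(defun add (m n)
  (if (zerop n)
      m
      (1+ (add m (1- n)))))      ; add one, recurse with n decremented

(defun mul (m n)
  (if (or (zerop m) (zerop n))
      0
      (add m (mul m (1- n)))))   ; add m to the product of m and n-1

;; (mul 3 3) => 9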
Since this is almost certainly a homework-type question, hints only.
How do you add 7 and 2? While most people just come up with 9, is there a more basic way?
How about you increment the first number and decrement the second number until one of them reaches zero?
Then the other one is the answer. Let's try the sample:
7 2
8 1
9 0 <- bingo
This will work fine for natural numbers though you need to be careful if you ever want to apply it to negatives. You can get into the situation (such as with 10 and -2) where both numbers are moving away from zero. Of course, you could check for that before hand and swap the operations.
So now you know you can write + in terms of an increment and a decrement instruction. It's not fantastic for recursion but, since your multiply-by-recursive-add already suffers the same problem, it's probably acceptable.
Now you just have to find out how to increment and decrement in LISP without using +. I wonder whether there might be some specific instructions for this :-)
