GLPK Timeout (tm_lim) returns best solution found? - glpk

My experiments (using the C library directly) suggest that using the tm_lim parameter to limit the time taken by GLPK on a mixed integer programming problem results in a problem pointer that contains the best solution found so far. However, I can't find any confirmation of this in the documentation. Does a timed-out computation always leave the best discovered solution in the problem buffer?
Thanks!

The tm_lim parameter does indeed return the best solution from my anecdotal experience. I could not find verification of this in the documentation either, so I looked at the source.
GLPK runs a loop that updates the solution in place until one of four termination criteria (optimal solution, unbounded solution, time limit, iteration limit) is satisfied. Once this happens, GLPK stops updating the solution and returns a value indicating which criterion was met.
You can verify this in the function ssx_phase_II in src/glpssx02.c in https://ftp.gnu.org/gnu/glpk/glpk-4.35.tar.gz. Look at references to tm_lim.
A final piece of justification is the documentation for the --tmlim command line option:
--tmlim nnn limit solution time to nnn seconds (--tmlim 0 allows
obtaining solution at initial point)
Passing --tmlim 0 would return the initial solution.
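The loop structure described above can be illustrated with a minimal Python sketch. This is not GLPK code: `run_with_time_limit`, `Status`, and the `step` callback are hypothetical names, the unbounded case is left out for brevity, and the clock is injectable only so the behaviour is easy to check. It shows why a run that hits the time limit still leaves the latest solution in the caller's buffer.

```python
import time
from enum import Enum


class Status(Enum):
    OPTIMAL = "optimal"
    TIME_LIMIT = "time limit"
    ITERATION_LIMIT = "iteration limit"


def run_with_time_limit(solution, step, tm_lim, it_lim=1_000_000, clock=time.monotonic):
    """Improve `solution` (a dict) in place until a stopping criterion fires.

    `step(solution)` is a hypothetical callback that updates the dict in
    place and returns True once no further improvement is possible.
    """
    start = clock()
    for _ in range(it_lim):
        if clock() - start >= tm_lim:
            # Stop updating; `solution` keeps whatever the last step wrote.
            return Status.TIME_LIMIT
        if step(solution):
            return Status.OPTIMAL
    return Status.ITERATION_LIMIT


def demo_step(solution):
    # Toy "improvement": raise the objective by 1 until it reaches 10.
    solution["obj"] += 1
    return solution["obj"] >= 10
```

With `tm_lim=0` the first check fires before any step runs, so the caller gets back the initial point untouched, which mirrors the `--tmlim 0` remark in the help text.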

Related

Struggling with building an intuition for recursion

Though I have studied recursion and am able to understand some recursive programs, I still cannot come up with a recursive solution as intuitively as I can with iteration. Is there any course or track available for building an intuition for recursion? How can one master the concept of recursion?
If you want to gain a thorough understanding of how recursion works, I highly recommend that you start with understanding mathematical induction, as the two are very closely related, if not arguably identical.
Recursion is a way of breaking down seemingly complicated problems into smaller bits. Consider the trivial example of the factorial function.
def factorial(n):
    if n < 2:
        return 1
    return n * factorial(n - 1)
To calculate factorial(100), for example, all you need is to calculate factorial(99) and multiply it by 100. This follows from the familiar definition of the factorial.
Here are some tips for coming up with a recursive solution:
Assume you know the result returned by the immediately preceding recursive call (e.g. in calculating factorial(100), assume you already know the value of factorial(99). How do you go from there?)
Consider the base case (i.e. when should the recursion come to a halt?)
The first bullet point might seem rather abstract, but all it means is this: a large portion of the work has already been done. How do you go from there to complete the task? In the case of the factorial, factorial(99) constituted this large portion of work. In many cases, you will find that identifying this portion of work simply amounts to examining the argument to the function (e.g. n in factorial), and assuming that you already have the answer to func(n - 1).
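As a small sketch of these two tips applied to a different problem (the function name `total` is only an illustration), consider summing a list. Assume the sum of everything after the first element is already known; the remaining work is a single addition.

```python
def total(numbers):
    # Base case: an empty list sums to 0, so the recursion stops here.
    if not numbers:
        return 0
    # Recursive step: trust that total(numbers[1:]) is already correct,
    # then add the one element that call did not cover.
    return numbers[0] + total(numbers[1:])
```

Calling `total([1, 2, 3, 4])` peels off one element per call until the list is empty, then adds the elements back up on the way out.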
Here's another example for concreteness. Let's say we want to reverse a string without using in-built functions. In using recursion, we might assume that string[:-1], or the substring until the very last character, has already been reversed. Then, all that is needed is to put the last remaining character in the front. Using this inspiration, we might come up with the following recursive solution:
def my_reverse(string):
    if not string:  # base case: empty string
        return string  # return empty string, nothing to reverse
    return string[-1] + my_reverse(string[:-1])
With all of this said, recursion is built on mathematical induction, and these two are inseparable ideas. In fact, one can easily prove that recursive algorithms work using induction. I highly recommend that you check out this lecture.

Using large hash tables in R

I'm trying to use package hash, which I understand is the most commonly adopted implementation (other than directly using environments).
If I try to create and store hashes larger than ~20MB, I start getting protect(): protection stack overflow errors.
pryr::object_size(hash::hash(1:120000, 1:120000)) # * (see end of post)
#> 21.5 MB
h <- hash::hash(1:120000, 1:120000)
#> Error: protect(): protection stack overflow
If I run the h <- ... command once, the error only appears once. If I run it twice, I get an infinite loop of errors appearing in the console, freezing RStudio and forcing me to restart it from the Task Manager.
From multiple other SO questions, I understand this means I'm creating more pointers than R can protect. This makes sense to me, since hashes are actually just environments (which themselves are just hash tables), so I assume R needs to keep track of each value in the hash table as a separate pointer.
The common solution I've seen for the protect() error is to use rstudio.exe --max-ppsize=500000 (which I assume propagates that option to R itself), but it doesn't help in this case, the error remains. This is somewhat surprising, since the hash in the example above is only 120,000 keys/pointers long, much smaller than the given ppsize of 500,000.
So, how can I use large hashes in R? I'm assuming changing to pure environments won't help, since hash is really just a wrapper around environments.
* For the record, the hash::hash() call above will create hashes with non-syntactic names, but that's irrelevant: my real case has simple character keys and integer values and shows the same behavior.
This is a bug in RStudio, not a limitation in R. The bug happens when it tries to examine the h object for display in the environment pane. The bug is on their issue list as https://github.com/rstudio/rstudio/issues/5546 .

CPLEX outputting different results on consecutive runs - Asynchronity issue?

I'm running CPLEX from IBM ILOG CPLEX Optimization Studio 12.6.
Here I'm facing a weird issue. Solving the same optimization problem (pure LP) multiple times in a row, yields different results.
The aim is to solve once, then iteratively modify the coefficient matrix, and re-solve the problem. However, we experienced that the changes between iterations did not correspond to the modifications.
This led us to try re-solving the problem without doing modifications in between, which returned different results.
The catch is that we still do one major modification before we start iterating, and our hypothesis is that this change (cplex.setCoef(...) on about 10,000 rows) is done asynchronously, so that it is only partially done during the first re-solution iterations.
However, we cannot seem to find any documentation stating that this method is asynchronous, nor any way to ensure synchronous execution, so that all the changes are done before CPLEX restarts.
Does anyone know if this is the case? Is there any way to delay restart until cplex.setCoef(...) is done? The problem is quite huge, but the representative lines are:
functionUsingSetCoefOn10000rows();
for(var j = 0; j < 100; j++){
    cplex.solve();
    writeln("Iteration " + j + ": " + cplex.getObjValue());
    for(var k = 0; k < 100000; k++){
        doBusyWork(); //Just to kill time
    }
}
which outputs
Iteration 0: 1529486959.814946
Iteration 1: 1544325969.750444
Iteration 2: 1549669732.757587
Iteration 3: 1551818419.584333
...
Iteration 33: 1564007987.849925
...
Iteration 98: 1564007987.849925
Iteration 99: 1564007987.849925
Last minute update
Reducing the number of calls to cplex.setCoef to about 2500 removes the issue, and all iterations return the same objective value. Sadly, we do need to change all the 10,000 coefficients.
Edit: The OPL scripting and engine log: http://goo.gl/ywJhkm and here: http://goo.gl/v2Qhm9
Sorry that this is not really an answer, but it is too big to go as a comment...
I don't think that the setCoef() calls would be asynchronous and not complete - that would be very surprising. Such behaviour would be too unpredictable and too many other people would have problems with this behaviour. However, CPLEX itself will use multiple threads to solve a problem and that means that it can generate different solutions each time it runs. The example objective values that you show do seem to change significantly, so a few questions/observations:
1: The numbers seem to be monotonically increasing - are they all increasing like this until they reach the maximum value? It looks like some kind of convergence behaviour. On re-running, CPLEX will start from a previous solution if it can. Check that there isn't some other CPLEX parameter stopping the search early such as an iteration or time limit or wider solution optimality tolerance.
2: Have you looked at the CPLEX logs from each run to see what CPLEX is doing in each run?
3: If you have doubts about the model being solved, try dumping out the model as an LP file and check the values in each iteration. They should all be the same in your case. You can also try solving the LP file in the CPLEX standalone optimiser to see what value that gives.
4: Have you tried setting the parameters to make CPLEX use a different LP algorithm (e.g. primal simplex, barrier etc)?
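For point 3, one way to compare the exported models is a plain text diff. The sketch below is only an illustration: it assumes you have already exported the model to LP files from two iterations by whatever means your setup provides (the file names are hypothetical), and it uses nothing but the Python standard library.

```python
import difflib
from pathlib import Path


def lp_diff(path_a, path_b):
    """Return the unified-diff lines between two exported LP files."""
    a = Path(path_a).read_text().splitlines()
    b = Path(path_b).read_text().splitlines()
    return list(
        difflib.unified_diff(a, b, fromfile=str(path_a), tofile=str(path_b), lineterm="")
    )


# Usage (hypothetical file names):
#   changes = lp_diff("iteration_0.lp", "iteration_1.lp")
#   print("\n".join(changes) or "models are identical")
```

An empty result means the two dumps are textually identical, which would point away from the model and towards the solver settings or stopping limits.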

Product of range in Prolog

I need to write a program which calculates a product over a range:
I wrote the following code:
mult(N,N,R,R).
mult(N,Nt,R,Rt):-N1=Nt+1,R1=Rt*(1/(log(Nt))),mult(N,N1,R,R1).
This should implement basic product from Nt to N of 1/ln(j). As far as I understand it's got to be stopped when Nt and N are equal. However, I can't get it working due to:
?- mult(10,2,R,1), write(R).
ERROR: Out of global stack
That is the error I get. Is there any other way to implement a loop without using the default libraries of SWI-Prolog?
Your program never terminates! To see this consider the following failure-slice of your program:
mult(N,N,R,R) :- false.
mult(N,Nt,R,Rt):-
    N1=Nt+1,
    R1=Rt*(1/(log(Nt))),
    mult(N,N1,R,R1), false.
This new program never terminates, and thus the original program doesn't terminate either. To see that it never terminates, consider the two (=)/2 goals. In the first, the new variable N1 is unified with something. This will always succeed. Similarly, the second goal will always succeed. There will never be a possibility for failure prior to the recursive goal. And thus, this program never terminates.
You need to add some goal, or to replace existing goals, in the visible part. Maybe add
N > Nt.
Further, it might be a good idea to replace the two (=)/2 goals by (is)/2. But this is not required for termination, strictly speaking.
Out of global stack means you entered a too-long chain of recursion, possibly an infinite one.
The problem stems from using = instead of is in your assignments.
mult(N,N,R,R).
mult(N,Nt,R,Rt):-N1 is Nt+1, R1 is Rt*(1/(log(Nt))), mult(N,N1,R,R1).
You might want to insert a cut in your first clause to avoid going on after getting an answer.
If you have a graphical debugger (like the one in SWI), try setting 'trace' and 'debug' on and running. You'll soon realize that performing N1 = Nt+1 given Nt as 2 yields the term 2+1. Since 2+1+1+1+(...) will never unify with 10, that's the problem right there.
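For readers more at home in an imperative language, here is a rough Python transliteration of the corrected predicate (with the `is` fix applied). It is a sketch, not Prolog semantics: Python evaluates `nt + 1` eagerly, which is what `is` does and `=` does not. The name `mult` and the argument order mirror the question; the explicit guard plays the role of the suggested `N > Nt` goal. Like the Prolog version, it stops when `nt` equals `n`, so the factor for `n` itself is not included.

```python
import math


def mult(n, nt, rt=1.0):
    # Base case, like mult(N,N,R,R): the counter has reached N.
    if nt == n:
        return rt
    # Guard corresponding to the suggested N > Nt goal: without it,
    # a start value above n would recurse without end.
    if nt > n:
        raise ValueError("nt must not exceed n")
    # Recursive step: evaluate the arithmetic first, then recurse.
    return mult(n, nt + 1, rt * (1 / math.log(nt)))
```

`mult(10, 2)` corresponds to the query `mult(10,2,R,1)` and multiplies the factors 1/ln(j) for j from 2 to 9.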

R: reference iteration number in call to sfLapply(1:N, function(x))

Is it possible to reference the iteration number in a sfLapply call as follows -
wrapper <- function(a) {
    y.mat <- data.frame(get(foo[i,1]), get(foo[i,2]))
    ...
    ...
    do other things....
}
results <- sfLapply(1:200000, wrapper)
Where i is the iteration number as sfLapply cycles through 1:200000.
The problem I am faced with is that I have over 200,000 cases to test, with each case requiring the construction of a data.frame to which various operations will be performed.
I have a 2 GHz Intel Core 2 Duo processor (MacBook laptop), so I began to investigate the snowfall package to take advantage of parallel processing. This led me to sfLapply, and I started to investigate whether I could rewrite my code to work with lapply(). However, I have yet to come across examples that reference the iteration number in lapply() calls.
Maybe I am heading in the wrong direction. If anyone has any suggestions I would be greatly appreciative.
You're not using parameter a in the code to wrapper. All the numbers from 1:200000 will be passed to wrapper, so it is this a that represents your iteration (instead of i).
Don't forget, though, that the iterations will not necessarily run in order (courtesy of sfLapply).
As far as I know, there is no way of knowing how many iterations have already been completed when a given one starts, as the different processes don't know what the others are doing.
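The same idea, that the element handed to the worker function is the iteration number, can be sketched in Python. This is an analogy rather than snowfall code: `wrapper` and `cases` are hypothetical, and a thread pool from `concurrent.futures` stands in for the cluster. Tasks may execute in any order, but `map` hands the results back in input order.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the rows of `foo`.
cases = [("a", "b"), ("c", "d"), ("e", "f")]


def wrapper(i):
    # `i` is the iteration number, received as the function's argument.
    first, second = cases[i]
    return i, first + second


with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(wrapper, range(len(cases))))
```

Returning the index alongside the result, as done here, is one simple way to keep track of which case each result belongs to.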
