I'm trying to use scipy.optimize.minimize with method trust-constr to optimize a function of 8 variables. Unfortunately, the function is too complicated to post in full here; it involves around 1000 terms, each of which involves an integral. But here is an excerpt that I think already shows it doesn't work as expected.
def objective(variables):
    starts_neg = variables[0]
    starts_neu = variables[1]
    # etc.
    return _____
(These are the two variables that cause the problem.)
Having defined the objective function, I then need to define the constraints. There are 12 constraints involving the 8 variables; but I'll just show the ones involving the offending variables.
from scipy.optimize import LinearConstraint, minimize

constraint_matrix = [[0 for j in range(8)] for i in range(12)]
constraint_matrix[0][0] = 1
constraint_matrix[1][1] = 1
constraint_matrix[2][0] = 1
constraint_matrix[2][1] = 1
# etc.
lower_bounds = [10**(-12) for i in range(12)]
upper_bounds = [1 for i in range(12)]
prob_constraints = LinearConstraint(constraint_matrix, lower_bounds, upper_bounds, keep_feasible=True)
My intent here is to say 0 < starts_neg < 1, 0 < starts_neu < 1, 0 < starts_neg + starts_neu < 1. The lower bounds are changed from 0 to 10^-12 to avoid nan errors, since the objective function involves taking the logs of the variables.
I then give scipy an initial estimate x0=[estimate,estimate,etc.]
Lastly, I call minimize as follows:
result = minimize(objective, x0, method='trust-constr', constraints=[prob_constraints], options={'xtol': 10**(-9)}).x
Unfortunately, this yielded a nan error. So I tried inserting the following in the objective function and running again:
if starts_neg <= 0 or starts_neg >= 1 or starts_neu <= 0 or starts_neu >= 1 or starts_neu + starts_neg >= 1:
    print(starts_neg, starts_neu)
This outputs -0.02436406136453448 0.7588112085953852 before the nan error & traceback, which seems too large a constraint violation to be explained by rounding error. And no, this was not the initial estimate x0; I checked for that too.
So clearly scipy disobeyed one of my constraints, despite my setting keep_feasible=True. Did I set up something wrong? Sorry the function is too long to include in full.
My goal is to understand how Eva shrinks the intervals for a variable. For example:
unsigned int nondet_uint(void);

int main()
{
    unsigned int x = nondet_uint();
    unsigned int y = nondet_uint();
    //@ assert x >= 20 && x <= 30;
    //@ assert y <= 60;
    //@ assert x >= y;
    return 0;
}
So we have x = [20,30] and y = [0,60]. However, Eva shrinks y to [0,30], which is the part of its domain where the assertion x >= y can hold.
[eva] ====== VALUES COMPUTED ======
[eva:final-states] Values at end of function main:
x ∈ [20..30]
y ∈ [0..30]
__retres ∈ {0}
I tried some options for the Eva plugin, but none showed the steps involved. Could you point me to the method, or to a publication, describing how these values are computed?
Showing values during abstract interpretation
I tried some options for the Eva plugin, but none showed the steps for it.
The most efficient way to follow the evaluation is not via command-line options, but by adding Frama_C_show_each(exp) statements in the code. These are special function calls which, during the analysis, emit the values of the expressions passed to them. They are especially useful in loops, for instance to see when a widening is triggered, or what happens to the values of a loop counter.
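For instance, in the program from the question, one could add calls such as these (a minimal sketch; any suffix after Frama_C_show_each can be used to label the output, and any number of expressions can be passed):

unsigned int nondet_uint(void);

int main()
{
    unsigned int x = nondet_uint();
    unsigned int y = nondet_uint();
    //@ assert x >= 20 && x <= 30;
    Frama_C_show_each_x(x);       // emits the abstract value of x at this point
    //@ assert y <= 60;
    //@ assert x >= y;
    Frama_C_show_each_x_y(x, y);  // emits both values, after the reductions by the asserts
    return 0;
}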
Note that displaying all of the intermediary evaluation and reduction steps would be very verbose, even for very small programs. By default, this information is not exposed, since it is too dense and rarely useful.
For starters, try adding Frama_C_show_each statements, and use the Frama-C GUI to see the result. It allows focusing on any expression in the code and, in the Values tab, shows the values for the given expression, at the selected statement, for each callstack. You can also press Ctrl+E and type an arbitrary expression to have its value evaluated at that statement.
If you want more details about the values, their reductions, and the overall mechanism, see the section below.
Detailed information about values in Eva
Your question is related to the values used by the abstract interpretation engine in Eva.
Chapter 3 of the Eva User Manual describes the abstractions used by the engine, which are, succinctly:
sets of integers, which are maximally precise but limited to a number of elements (modified by option -eva-ilevel, which on Frama-C 22 is set to 8 by default);
integer intervals with periodicity information (also called modulo, or congruence), e.g. [2..42],2%10 being the set containing {2, 12, 22, 32, 42}. In the simple case, e.g. [2..42], all integers between 2 and 42 are included;
sets of addresses (for pointers), with offsets represented using the above values (sets of integers or intervals);
intervals of floating-point variables (unlike integers, there are no small sets of floating-point values).
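As a small illustration of the first two abstractions (a hypothetical snippet; Frama_C_interval comes from Frama-C's built-in stubs in __fc_builtin.h):

#include "__fc_builtin.h"

int main(void)
{
    int small = Frama_C_interval(0, 5);   /* 6 values: below the default ilevel of 8,
                                             kept as the precise set {0; 1; 2; 3; 4; 5} */
    int large = Frama_C_interval(0, 50);  /* 51 values: too many for a small set,
                                             kept as the interval [0..50] */
    Frama_C_show_each(small, large);
    return 0;
}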
Why is all of this necessary? Because without knowing some of these details, you'll have a hard time understanding why the analysis is sometimes precise, sometimes imprecise.
Note that the term reduction is used in the documentation, instead of shrinkage. So look for words related to reduce in the Eva manual when searching for clues.
For instance, in the following code:
int a = Frama_C_interval(-5, 5);
if (a != 0) {
    //@ assert a != 0;
    int b = 5 / a;
}
By default, the analysis will not be able to remove the 0 from the interval inside the if, because [-5..-1];[1..5] is not an interval, but a disjoint union of intervals. However, if the number of elements drops below -eva-ilevel, then the analysis will convert it into a small set, and get a precise result. Therefore, changing some analysis options will result in different ranges, and different results.
In some cases, you can force Eva to compute using disjunctions, for instance by adding the split ACSL annotation, e.g. //@ split a < b || a >= b;. But you still need to give the analysis some "fuel" for it to evaluate both branches separately. The easiest way to do so is to use -eva-precision N, with N being an integer between 0 and 11. The higher N is, the more splitting is allowed to happen, but the longer the analysis may take.
Note that, to ensure termination of the analysis, some mechanisms such as widening are used. Without it, a simple loop might require billions of evaluation steps to terminate. This mechanism may introduce extra values which lead to a less precise analysis.
Finally, there are also some abstract domains (option -eva-domains) which allow other kinds of values besides the default ones mentioned above. For instance, the sign domain allows splitting values between negative, zero and positive, and would avoid the imprecision in the above example. The Eva user manual contains examples of usage of each of the domains, indicating when they are useful.
I am using the R function expect_equal to test if two large vectors are equal (almost) up to a certain tolerance. I was wondering if there was a way to only print the cases where expect_equal breaks the tolerance.
For example
a <- c(2.001, 3.5)
b <- c(2, 3)
expect_equal(object = a, expected = b, tolerance = 0.015, scale = 1)
This prints the error:
Error: c(2, 3) not equal to c(2.001, 3.5)
2/2 mismatches (average diff: 0.25).
First 2:
pos x y diff
1 2 2.0 -0.001
2 3 3.5 -0.500
Even though case 1 "passes" my test. Is it possible to only print the cases that break the tolerance level? Even better would be if I could then store and refer to the cases which fail, so that I can root out the errors more quickly.
The quick answer is "no". You can't show only the values that break the tolerance. The reason is that equality is tested using the "all.equal" function, which doesn't have that option (to see this, you can look at the function "compare.numeric" in testthat via
testthat:::compare.numeric
at the R command prompt).
The longer answer depends on how hard you want to work and how often you will reuse the method. The simplest is to do as #VermillionAzure mentioned: manually generate the vector of out-of-tolerance values and check that its length is 0 (or a similar test). For that test, you could use the expect_true function (see the sketch below). A more complex method would be to create your own data class (other than numeric) and then write your own compare method for that class. If you really need the result summarized your way, you may have to go down the path of creating your own compare function.
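A minimal sketch of that expect_true approach, using the vectors and the 0.015 tolerance from the question:

library(testthat)

a <- c(2.001, 3.5)
b <- c(2, 3)
tol <- 0.015

# positions where the absolute difference exceeds the tolerance
fails <- which(abs(a - b) > tol)

# passes only if nothing is out of tolerance; on failure, `info`
# reports exactly which positions broke it
expect_true(length(fails) == 0,
            info = paste("positions out of tolerance:", toString(fails)))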
For the second part of your question (storing to refer to tests that failed later), you can store the results of the test() function call from testthat, and from that, you can find what function had the errors.
results <- test()
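From there, a hedged sketch of filtering for the failures (this assumes a testthat version whose results object supports as.data.frame; the exact column names may differ between versions):

df <- as.data.frame(results)   # one row per test, with counts of failures
df[df$failed > 0, c("file", "test", "failed")]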
I have the following problem set up in glpk. Two variables, p and v, and three constraints. The objective is to maximize v.
p >= 0
p == 1
-v + 3p >= 0
The answer should be v==3, but for some reason, the solver tells me it is infeasible when using the simplex method, and complains about numerical instability when using an interior point method.
This problem is generated as a subproblem of a bigger problem, and obviously not all subproblems are as trivial or I would just hardcode the solution.
Because, for some reason, by default, column variables are fixed at 0 (GLP_FX) rather than free. I don't see how that default makes sense.
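A minimal sketch of the subproblem using GLPK's C API, with the glp_set_col_bnds calls that override the GLP_FX default (the column numbering, p as column 1 and v as column 2, is an assumption):

#include <stdio.h>
#include <glpk.h>

int main(void)
{
    glp_prob *lp = glp_create_prob();
    glp_set_obj_dir(lp, GLP_MAX);

    glp_add_cols(lp, 2);
    glp_set_col_bnds(lp, 1, GLP_LO, 0.0, 0.0);   /* p >= 0 */
    glp_set_col_bnds(lp, 2, GLP_FR, 0.0, 0.0);   /* v free; otherwise it stays fixed at 0 */
    glp_set_obj_coef(lp, 2, 1.0);                /* maximize v */

    glp_add_rows(lp, 2);
    glp_set_row_bnds(lp, 1, GLP_FX, 1.0, 1.0);   /* p == 1 */
    glp_set_row_bnds(lp, 2, GLP_LO, 0.0, 0.0);   /* -v + 3p >= 0 */

    /* constraint matrix in coordinate form (1-based; element 0 unused) */
    int ia[]    = {0, 1, 2, 2};
    int ja[]    = {0, 1, 1, 2};
    double ar[] = {0, 1.0, 3.0, -1.0};
    glp_load_matrix(lp, 3, ia, ja, ar);

    glp_simplex(lp, NULL);
    printf("v = %g\n", glp_get_col_prim(lp, 2));  /* expect v = 3 */
    glp_delete_prob(lp);
    return 0;
}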
I am new to WinBUGS/OpenBUGS and having difficulty debugging my code.
Does anyone know of a list of potential error messages for BUGS models and their meanings in plain English?
The WinBUGS manual has a list of some common errors. I have added some additional notes from my own experience:
expected variable name indicates an inappropriate variable name. I occasionally get this error when providing the data, e.g. having used 1.02e04 instead of 1.02E04.
undefined variable - variables in a data file must be defined in a model (just put them in as constants or with vague priors). If a logical node is reported undefined, the problem may be with a node on the 'right hand side'. I occasionally get this error when I have removed a variable from the model but not from the data or missed a comma in the data.
invalid or unexpected token scanned - check that the value field of a logical node in a Doodle has been completed.
index out of range - usually indicates that a loop-index goes beyond the size of a vector (or matrix dimension); sometimes, however, appears if the # has been omitted from the beginning of a comment line
linear predictor in probit regression too large indicates numerical overflow. See possible solutions below for Trap 'undefined real result'.
logical expression too complex - a logical node is defined in terms of too many parameters/constants or too many operators: try introducing further logical nodes to represent parts of the overall calculation; for example, a1 + a2 + a3 + b1 + b2 + b3 could be written as A + B, where A and B are the simpler logical expressions a1 + a2 + a3 and b1 + b2 + b3, respectively. Note that linear predictors with many terms should be formulated by 'vectorizing' parameters and covariates and by then using the inprod(.,.) function (see the sketch after this list)
unable to choose update method indicates that a restriction in the program has been violated
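As a hedged illustration of that vectorization advice (the names mu, beta, X, y, tau and N are placeholders for your own model):

for (i in 1:N) {
    # instead of a long scalar sum such as
    # mu[i] <- b1*x1[i] + b2*x2[i] + b3*x3[i] + ...
    mu[i] <- inprod(beta[], X[i, ])
    y[i] ~ dnorm(mu[i], tau)
}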
You might also hit a trap at the start of or during the MCMC. The BUGS manual lists the following common traps (I always get the first two, and have never met the last two):
undefined real result indicates numerical overflow. Possible reasons include:
initial values generated from a 'vague' prior distribution may be numerically extreme - specify appropriate initial values;
numerically impossible values such as log of a non-positive number - check, for example, that no zero expectations have been given when Poisson modelling;
numerical difficulties in sampling. Possible solutions include:
better initial values;
more informative priors - uniform priors might still be used but with their range restricted to plausible values;
better parameterisation to improve orthogonality;
standardisation of covariates to have mean 0 and standard deviation 1.
can happen if all initial values are equal. Probit models are particularly susceptible to this problem, i.e. generating undefined real results. If a probit is a stochastic node, it may help to put reasonable bounds on its distribution, e.g.
probit(p[i]) <- delta[i]
delta[i] ~ dnorm(mu[i], tau)I(-5, 5)
This trap can sometimes be escaped from by simply clicking on the update button. The equivalent construction
p[i] <- phi(delta[i])
may be more forgiving.
index array out of range
possible reasons include:
attempting to assign values beyond the declared length of an array;
if a logical expression is too long to evaluate, break it down into smaller components.
stack overflow can occur if there is a recursive definition of a logical node.
NIL dereference (read) can occur at compilation in some circumstances when an inappropriate transformation is made, for example an array into a scalar.
Trap messages referring to DFreeARS indicate numerical problems with the derivative-free adaptive rejection algorithm used for log-concave distributions. One possibility is to change to "Slice" sampling.
This WinBUGS User Manual might be of some use.
I was trying to learn Scipy, using it for mixed integration and differentiation, but at the very first step I encountered the following problems.
For numerical differentiation, it seems that the only Scipy function that works on callable functions is scipy.derivative(), if I'm right!? However, I couldn't get it to work:
1st) when I am not going to specify the point at which the differentiation is to be taken, e.g. when the differentiation is under an integral, so that it is the integral that should assign the numerical values to its integrand's variable, not me. As a simple example, I tried this code in Sage's notebook:
import scipy as sp
from scipy import integrate, derivative
var('y')
f=lambda x: 10^10*sin(x)
g=lambda x,y: f(x+y^2)
I=integrate.quad( sp.derivative(f(y),y, dx=0.00001, n=1, order=7) , 0, pi)[0]; show(I)
show( integral(diff(f(y),y),y,0,1).n() )
It also gives the warning "Warning: The occurrence of roundoff error is detected, which prevents the requested tolerance from being achieved. The error may be underestimated.", and I don't know what this warning means, as it persists even with increasing "dx" and decreasing the "order".
2nd) when I want to find the derivative of a multivariable function like g(x,y) in the above example: something like sp.derivative(g(x,y), (x,0.5), dx=0.01, n=1, order=3) gives an error, as is easily expected.
Looking forward to hearing from you about how to resolve the above cited problems with numerical differentiation.
Best Regards
There are some strange problems with your code that suggest you need to brush up on some Python! I don't know how you even made these definitions in Python, since they are not legal syntax.
First, I think you are using an older version of scipy. In recent versions (at least from 0.12+) you need from scipy.misc import derivative. derivative is not in the scipy global namespace.
Second, var is not defined, although it is not necessary anyway (I think you meant to import sympy first and use sympy.var('y')). sin has also not been imported from math (or numpy, if you prefer). show is not a valid function in sympy or scipy.
^ is not the power operator in Python. You meant **.
You seem to be mixing up the idea of symbolic and numeric calculus operations here. scipy won't numerically differentiate an expression involving a symbolic object -- the second argument to derivative is supposed to be the point at which you wish to take the derivative (i.e. a number). As you say you are trying to do numeric differentiation, I'll resolve the issue for that purpose.
from scipy import integrate
from scipy.misc import derivative
from math import *
f = lambda x: 10**10*sin(x)
df = lambda x: derivative(f, x, dx=0.00001, n=1, order=7)
I = integrate.quad( df, 0, pi)[0]
Now, this last expression generates the warning you mentioned, and the value returned is not very close to zero at -0.0731642869874073 in absolute terms, although that's not bad relative to the scale of f. You have to appreciate the issues of roundoff error in finite differencing. Your function f varies on your interval between 0 and 10^10! It probably seems paradoxical, but making the dx value for differentiation too small can actually magnify roundoff error and cause numerical instability. See the second graph here ("Example showing the difficulty of choosing h due to both rounding error and formula error") for an explanation: http://en.wikipedia.org/wiki/Numerical_differentiation
In fact, in this case, you need to increase it, say to 0.001: df = lambda x: derivative(f, x, dx=0.001, n=1, order=7)
Then, you can integrate safely, with no terrible roundoff.
I=integrate.quad( df, 0, pi)[0]
I don't recommend throwing away the second return value from quad. It's an important verification of what happened, as it is "an estimate of the absolute error in the result". In this case, I == 0.0012846582250212652 and the abs error is ~ 0.00022, which is not bad (the interval that implies still does not include zero). Maybe some more fiddling with the dx and absolute tolerances for quad will get you an even better solution, but hopefully you get the idea.
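For instance, a hedged sketch of that fiddling (epsabs and epsrel are quad's tolerance parameters; the values chosen here are arbitrary):

from scipy import integrate
from scipy.misc import derivative
from math import sin, pi

f = lambda x: 10**10*sin(x)
df = lambda x: derivative(f, x, dx=0.001, n=1, order=7)

# experiment with quad's tolerances, and keep the error estimate this time
val, abserr = integrate.quad(df, 0, pi, epsabs=1e-10, epsrel=1e-10)
print(val, abserr)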
For your second problem, you simply need to create a proper scalar function (call it gx) that represents g(x,y) along y=0.5 (this is essentially partial application, often loosely called currying in computer science).
g = lambda x, y: f(x+y**2)
gx = lambda x: g(x, 0.5)
derivative(gx, 0.2, dx=0.01, n=1, order=3)
gives you a value of the derivative at x=0.2. Naturally, the value is huge given the scale of f. You can integrate using quad like I showed you above.
If you want to be able to differentiate g itself, you need a different numerical differentiation function. I don't think scipy or numpy support this, although you could hack together a central difference calculation by making a 2D fine mesh (size dx) and using numpy.gradient. There are probably other library solutions that I'm not aware of, but I know my PyDSTool software contains a function diff that will do that (if you rewrite g to take one array argument instead). It uses Ridder's method and is inspired by the Numerical Recipes pseudocode.
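A rough sketch of that mesh idea (the function name grad_g and the stencil size n are my own; g must accept NumPy array inputs, hence numpy.sin rather than math.sin in f):

import numpy as np

f = lambda x: 10**10*np.sin(x)
g = lambda x, y: f(x + y**2)

def grad_g(g, x0, y0, dx=1e-3, n=5):
    # sample g on a small (2n+1) x (2n+1) grid around (x0, y0)
    xs = x0 + dx * np.arange(-n, n + 1)
    ys = y0 + dx * np.arange(-n, n + 1)
    X, Y = np.meshgrid(xs, ys, indexing='ij')
    Z = g(X, Y)
    # central differences along each axis; axis 0 is x, axis 1 is y
    dZdx, dZdy = np.gradient(Z, dx, dx)
    return dZdx[n, n], dZdy[n, n]   # the two partial derivatives at (x0, y0)

print(grad_g(g, 0.2, 0.5))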