GEKKO - optimization in matrix form - constraints

I am trying to solve an optimization problem where I need to specify the problem and the constraints using a 2D matrix. I have been using SciPy, where 1D arrays are required. I want to check whether GEKKO allows one to specify the objective function, bounds, and constraints with a 2D matrix.
I have provided details and a reproducible version of the problem in the post here:
SCIPY - building constraints without listing each variable separately
Thanks
C

You can use the m.Array function in Gekko. I don't recommend using np.triu() with the Gekko array because the eliminated variables are still solved but are potentially hidden from the results. Here is a solution:
import numpy as np
import scipy.optimize as opt
from gekko import GEKKO

p = np.array([4, 5, 6.65, 12])       # p = prices
pmx = np.triu(p - p[:, np.newaxis])  # pmx = price matrix, upper triangular

m = GEKKO(remote=False)
q = m.Array(m.Var, (4, 4), lb=0, ub=10)

# only upper triangular can change
for i in range(4):
    for j in range(4):
        if j <= i:
            q[i, j].upper = 0  # set upper bound = 0

def profit(q):
    profit = np.sum(q.flatten() * pmx.flatten())
    return profit

for i in range(4):
    m.Equation(np.sum(q[i, :]) <= 10)
    m.Equation(np.sum(q[:, i]) <= 8)

m.Maximize(profit(q))
m.solve()
print(q)
This gives the solution:
[[[0.0] [2.5432017412] [3.7228765674] [3.7339217013]]
[[0.0] [0.0] [4.2771234426] [4.2660783187]]
[[0.0] [0.0] [0.0] [0.0]]
[[0.0] [0.0] [0.0] [0.0]]]

Related

Can Gekko solve vector-based dynamic optimization problems for optimal control?

I have tried many solvers but am getting errors somewhere, so now I am going to try Gekko for my problem.
Please let me know whether Gekko can handle this kind of problem, where M in the Python function takes the variable q, and all variables and parameters are in the form of vectors or matrices.
Thank you.
q should be a function of time, and M, c, sai, and the other matrices depend on q and u.
Here is a similar problem with an inverted pendulum
There is also a related Stack Overflow question: how to use arrays in gekko optimizer for python. Gekko can solve path planning problems that are subject to actuator constraints and equations of motion. One of the challenging mathematical features of this problem is how to smoothly pose the path constraints so that the robot does not pause at each intermediate path point. A potential method is to create a cubic spline of the path that helps to define the desired location for each time point (a minimal cspline sketch appears after the array examples below). Here is one problem with matrices:
from gekko import GEKKO
import numpy as np

m = GEKKO(remote=False)
ni = 3; nj = 2; nk = 4

# solve AX=B
A = m.Array(m.Var, (ni, nj), lb=0)
X = m.Array(m.Var, (nj, nk), lb=0)
AX = np.dot(A, X)
B = m.Array(m.Var, (ni, nk), lb=0)

# equality constraints
m.Equations([AX[i, j] == B[i, j]
             for i in range(ni) for j in range(nk)])
m.Equation(5 == m.sum([m.sum([A[i][j] for i in range(ni)])
                       for j in range(nj)]))
m.Equation(2 == m.sum([m.sum([X[i][j] for i in range(nj)])
                       for j in range(nk)]))

# objective function
m.Minimize(m.sum([m.sum([B[i][j] for i in range(ni)])
                  for j in range(nk)]))

m.solve()
print(A)
print(X)
print(B)
Here is another test script that demonstrates dot products and the trace function with Numpy:
import numpy as np
from gekko import GEKKO
m = GEKKO(remote=False)
# Random 3x3
A = np.random.rand(3,3)
# Random 3x1
b = np.random.rand(3,1)
# Gekko array 3x3
p = m.Array(m.Param,(3,3))
# Gekko array 3x1
y = m.Array(m.Var,(3,1))
# Dot product of A p
x = np.dot(A,p) # or A@p
# Dot product of x y
w = x@y
# Dot product of p y
z = p@y # or np.dot(p,y)
# Trace (sum of diag) of p
t = np.trace(p)
# solve Ax = b
s = m.axb(A,b)
m.solve()
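As a side note on the cubic-spline idea mentioned above, Gekko's m.cspline object can define a desired location as a smooth function of time. The following is only a minimal sketch with made-up waypoints and a single-integrator model, not the setup from the linked question:
from gekko import GEKKO
import numpy as np

m = GEKKO(remote=False)
m.time = np.linspace(0, 4, 41)

# made-up path waypoints: desired position as a function of time
t_data = [0, 1, 2, 3, 4]
x_data = [0.0, 1.0, 3.0, 3.5, 5.0]

t = m.Var(value=0)               # clock variable
m.Equation(t.dt() == 1)

x_ref = m.Var()                  # spline-interpolated reference position
m.cspline(t, x_ref, t_data, x_data)

u = m.MV(value=0, lb=-2, ub=2)   # actuator (velocity command)
u.STATUS = 1
x = m.Var(value=0)               # actual position
m.Equation(x.dt() == u)

m.Minimize((x - x_ref)**2)       # track the spline without pausing at waypoints
m.options.IMODE = 6              # dynamic optimization mode
m.solve()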

Is there a way to optimize the calculation of Bernoulli Log-Likelihoods for many multivariate samples?

I currently have two Torch Tensors, p and x, which both have the shape of (batch_size, input_size).
I would like to calculate the Bernoulli log likelihoods for the given data, and return a tensor of size (batch_size)
Here's an example of what I'd like to do:
I have the formula for log likelihoods of Bernoulli Random variables:
\sum_{i=1}^{d} x_i \ln(p_i) + (1 - x_i) \ln(1 - p_i)
Say I have p Tensor:
[[0.6 0.4 0], [0.33 0.34 0.33]]
And say I have the x tensor for the binary inputs based on those probabilities:
[[1 1 0], [0 1 1]]
And I want to calculate the log likelihood for every sample, which would result in:
[[ln(0.6)+ln(0.4)], [ln(0.67)+ln(0.34)+ln(0.33)]]
Would it be possible to do this computation without the use of for loops?
I know I could use torch.sum(axis=1) to do the final summation over the logs, but is it possible to do the Bernoulli log-likelihood computation without for loops, or with at most one for loop? I am trying to vectorize this operation as much as possible.
Though not a good practice, you can directly use the formula on the tensors as follows (works because these are element wise operations):
import torch
p = torch.tensor([
    [0.6, 0.4, 0],
    [0.33, 0.34, 0.33]
])
x = torch.tensor([
    [1., 1, 0],
    [0, 1, 1]
])
eps = 1e-8
bll1 = (x * torch.log(p+eps) + (1-x) * torch.log(1-p+eps)).sum(axis=1)
print(bll1)
#tensor([-1.4271162748, -2.5879497528])
Note that to avoid a log(0) error, I have introduced a very small constant eps inside the log.
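As a side note (not part of the original formula, just a common alternative), you can clamp the probabilities away from 0 and 1 instead of adding eps inside the log; this reuses p, x, and eps from above:
# alternative guard against log(0): clamp p into [eps, 1-eps] before taking logs
p_safe = p.clamp(min=eps, max=1 - eps)
bll1_clamped = (x * torch.log(p_safe) + (1 - x) * torch.log(1 - p_safe)).sum(axis=1)
print(bll1_clamped)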
A better way to do this is to use BCELoss from the nn module in PyTorch.
import torch.nn as nn
bce = nn.BCELoss(reduction='none')
bll2 = -bce(p, x).sum(axis=1)
print(bll2)
#tensor([-1.4271162748, -2.5879497528])
Since PyTorch computes the BCE as a loss, it prepends your formula with a negative sign. The argument reduction='none' says that I do not want the computed losses to be reduced (averaged/summed) across the batch in any way. This approach is advisable since we do not need to manually take care of numerical stability and error handling (such as adding eps above).
You can indeed verify that the two solutions return the same tensor (up to a tolerance):
torch.allclose(bll1, bll2)
# True
or the tensors (without summing each row):
torch.allclose((x * torch.log(p+eps) + (1-x) * torch.log(1-p+eps)), -bce(p, x))
# True
Feel free to ask for further clarifications.

Generate random natural numbers that sum to a given number and comply to a set of general constraints

I had an application that required something similar to the problem described here.
I too need to generate a set of positive integer random variables {Xi} that add up to a given sum S, where each variable might have constraints such as mi<=Xi<=Mi.
This I know how to do; the problem is that in my case I might also have constraints between the random variables themselves, say Xi <= Fi(Xj) for some given Fi (let's also say Fi's inverse is known). Now, how should one generate the random variables "correctly"? I put "correctly" in quotes because I'm not really sure what it would mean, except that I want the generated numbers to cover all possible cases with as uniform a probability as possible for each case.
Say we even look at a very simple case:
4 random variables X1,X2,X3,X4 that need to add up to 100 and comply with the constraint X1 <= 2*X2, what would be the "correct" way to generate them?
P.S. I know this seems like it would be a better fit for Math Overflow, but I found no solutions there either.
For 4 random variables X1, X2, X3, X4 that need to add up to 100 and comply with the constraint X1 <= 2*X2, one could use a multinomial distribution.
As long as the probability of the first number is low enough, the condition will almost always be satisfied; if not, reject and repeat.
And the multinomial distribution by design has a sum equal to 100.
Code, Windows 10 x64, Python 3.8
import numpy as np

def x1x2x3x4(rng):
    while True:
        v = rng.multinomial(100, [0.1, 1/2-0.1, 1/4, 1/4])
        if v[0] <= 2*v[1]:
            return v
    return None
rng = np.random.default_rng()
print(x1x2x3x4(rng))
print(x1x2x3x4(rng))
print(x1x2x3x4(rng))
UPDATE
Lots of freedom in selecting probabilities. E.g., you could make the other variables (X2, X3, X4) symmetric. Code:
def x1x2x3x4(rng, pfirst = 0.1):
    pother = (1.0 - pfirst)/3.0
    while True:
        v = rng.multinomial(100, [pfirst, pother, pother, pother])
        if v[0] <= 2*v[1]:
            return v
    return None
UPDATE II
If you start rejecting combinations, then you artificially bump the probabilities of one subset of events and lower the probabilities of another subset, while the total sum is always 1. There is NO WAY to have uniform probabilities with the conditions you want to meet. The code below runs the multinomial with equal probabilities and computes histograms and mean values. Each mean is supposed to be exactly 25 (=100/4), but as soon as you reject some samples, you lower the mean of the first value and increase the mean of the second value. The difference is small, but UNAVOIDABLE. If that is okay with you, so be it. Code:
import numpy as np
import matplotlib.pyplot as plt

def x1x2x3x4(rng, summa, pfirst = 0.1):
    pother = (1.0 - pfirst)/3.0
    while True:
        v = rng.multinomial(summa, [pfirst, pother, pother, pother])
        if v[0] <= 2*v[1]:
            return v
    return None

rng = np.random.default_rng()

s = 100
N = 5000000

# histograms
first = np.zeros(s+1)
secnd = np.zeros(s+1)
third = np.zeros(s+1)
forth = np.zeros(s+1)

mfirst = np.float64(0.0)
msecnd = np.float64(0.0)
mthird = np.float64(0.0)
mforth = np.float64(0.0)

for _ in range(0, N): # sampling with equal probabilities
    v = x1x2x3x4(rng, s, 0.25)
    q = v[0]
    mfirst += np.float64(q)
    first[q] += 1.0
    q = v[1]
    msecnd += np.float64(q)
    secnd[q] += 1.0
    q = v[2]
    mthird += np.float64(q)
    third[q] += 1.0
    q = v[3]
    mforth += np.float64(q)
    forth[q] += 1.0

x = np.arange(0, s+1, dtype=np.int32)

fig, axs = plt.subplots(4)
axs[0].stem(x, first, markerfmt=' ')
axs[1].stem(x, secnd, markerfmt=' ')
axs[2].stem(x, third, markerfmt=' ')
axs[3].stem(x, forth, markerfmt=' ')
plt.show()

print((mfirst/N, msecnd/N, mthird/N, mforth/N))
prints
(24.9267492, 25.0858356, 24.9928602, 24.994555)
NB! As I said, the first mean is lower and the second is higher. The histograms are a little bit different as well.
UPDATE III
Ok, Dirichlet, so be it. Let's compute the mean values of your generator before and after the filter. Code:
import numpy as np

def generate(n=10000):
    uv = np.hstack([np.zeros([n, 1]),
                    np.sort(np.random.rand(n, 2), axis=1),
                    np.ones([n, 1])])
    return np.diff(uv, axis=1)

a = generate(1000000)

print("Original Dirichlet sample means")
print(a.shape)
print(np.mean((a[:, 0] * 100).astype(int)))
print(np.mean((a[:, 1] * 100).astype(int)))
print(np.mean((a[:, 2] * 100).astype(int)))

print("\nFiltered Dirichlet sample means")
q = (a[(a[:, 0] <= 2*a[:, 1]) & (a[:, 2] > 0.35), :] * 100).astype(int)
print(q.shape)
print(np.mean(q[:, 0]))
print(np.mean(q[:, 1]))
print(np.mean(q[:, 2]))
I've got
Original Dirichlet sample means
(1000000, 3)
32.833758
32.791228
32.88054
Filtered Dirichlet sample means
(281428, 3)
13.912784086871243
28.36360987535
56.23109285501087
Do you see the difference? As soon as you apply any kind of filter, you alter the distribution. Nothing is uniform anymore.
OK, so I have this solution for my actual question, where I generate 9000 triplets of 3 random variables by prepending zeros to sorted random tuple arrays, appending ones, and then taking their differences, as suggested in the SO answer I mentioned in my original question.
Then I simply filter out the ones that don't match my constraints and plot them.
import numpy as np
import matplotlib.pyplot as plt

S = 100

def generate(n=9000):
    uv = np.hstack([np.zeros([n, 1]),
                    np.sort(np.random.rand(n, 2), axis=1),
                    np.ones([n, 1])])
    return np.diff(uv, axis=1)

a = generate()

def plotter(a):
    fig = plt.figure(figsize=(10, 10), dpi=100)
    ax = fig.add_subplot(projection='3d')
    surf = ax.scatter(*zip(*a), marker='o', color=a / 100)
    ax.view_init(elev=25., azim=75)
    ax.set_xlabel('$A_1$', fontsize='large', fontweight='bold')
    ax.set_ylabel('$A_2$', fontsize='large', fontweight='bold')
    ax.set_zlabel('$A_3$', fontsize='large', fontweight='bold')
    lim = (0, S)
    ax.set_xlim3d(*lim)
    ax.set_ylim3d(*lim)
    ax.set_zlim3d(*lim)
    plt.show()

b = a[(a[:, 0] <= 3.5 * a[:, 1] + 2 * a[:, 2]) &
      (a[:, 1] >= (a[:, 2])), :] * S
plotter(b.astype(int))
As you can see, the samples are uniformly distributed over these arbitrary limits on the simplex, but I'm still not sure whether I could forgo throwing away samples that don't adhere to the constraints (work the constraints somehow into the generation process? I'm almost certain now that it can't be done for general {Fi}). This would be useful in the general case if your constraints limit the sampled area to a very small subarea of the entire simplex (since resampling like this means that to sample from a constrained area a, you need to sample from the simplex on the order of 1/a times).
If someone has an answer to this last question I will be much obliged (will change the selected answer to his).
I have an answer to my question. Under a general set of constraints, what I do is:
Sample the constraints in order to evaluate s, the constrained area.
If s is big enough, then generate random samples and throw out those that do not comply with the constraints, as described in my previous answer.
Otherwise:
Enumerate the entire simplex.
Apply the constraints to filter out all tuples outside the constrained area.
List the resulting filtered tuples.
When asked to generate, I generate by choosing uniformly from this result list (a rough sketch of this enumerate-and-filter approach appears below).
(note: this is worth my effort only because I'm asked to generate very often)
A combination of these two strategies should cover most cases.
Note: I also had to handle cases where S was a randomly generated parameter (m < S < M); in that case I simply treat it as another random variable constrained between m and M, generate it together with the rest of the variables, and handle it as described earlier.
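Here is a minimal sketch of the enumerate-and-filter strategy for the simple example above (4 variables summing to 100 with X1 <= 2*X2); the constraint predicate and the sum S are just placeholders for whatever {Fi} and S you actually have:
import random

S = 100  # required sum

# Step 1: enumerate every non-negative integer 4-tuple that sums to S
# (about 177k tuples for S=100, which is still cheap to hold in memory).
candidates = [(x1, x2, x3, S - x1 - x2 - x3)
              for x1 in range(S + 1)
              for x2 in range(S + 1 - x1)
              for x3 in range(S + 1 - x1 - x2)]

# Step 2: filter with the constraints (here just X1 <= 2*X2 as a placeholder).
feasible = [t for t in candidates if t[0] <= 2 * t[1]]

# Step 3: generating a sample is now a uniform choice from the feasible list.
def sample():
    return random.choice(feasible)

print(len(feasible))
print(sample())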

check_totals wrt a large vector in OpenMDAO

I'd like to check the total derivatives of an output with respect to a large array of inputs, but I don't want to check the derivative with respect to every member of the array, since the array is too large, and the complex steps (or finite differences) across each member of the array would take too long. Is there a way to check_totals wrt just a single member of an array?
Alternatively, is there a way to perform a directional derivative across the entire array for check_totals? This feature seems to exist for check_partials only?
As of Version 3.1.1 of OpenMDAO we don't have directional checking for totals, but it is a good idea and we are probably going to implement it when we figure out the best way.
As a workaround for now, I think the easiest way to take a directional derivative of your model is to temporarily modify your model by creating a component that takes a "step" in some random direction, and then inserting it in front of your component with wide inputs. I've put together a simple example here:
import numpy as np
import openmdao.api as om

n = 50

class DirectionalComp(om.ExplicitComponent):
    def setup(self):
        self.add_input('x', 1.0)
        self.add_output('y', np.ones(n))
        self.A = -1.0 + 2.0 * np.random.random(n)
        self.declare_partials('y', 'x', rows=np.arange(n), cols=np.repeat(0, n), val=self.A)

    def compute(self, inputs, outputs, discrete_inputs=None, discrete_outputs=None):
        x = inputs['x']
        outputs['y'] = x * self.A

prob = om.Problem()
model = prob.model

# Add something like this
model.add_subsystem('p', om.IndepVarComp('x', 1.0))
model.add_subsystem('direction', DirectionalComp())
model.connect('p.x', 'direction.x')
model.connect('direction.y', 'comp.x')
model.add_design_var('p.x')

# Old Model
model.add_subsystem('comp', om.ExecComp('y = 2.0*x', x=np.ones((n, )), y=np.ones((n, ))))
model.add_constraint('comp.y', lower=0.0)

prob.setup()
prob.run_model()
totals = prob.check_totals()

Math behind Conv2D function in Keras

I am using the Conv2D model of Keras 2.0. However, I cannot fully understand what the function is doing mathematically. I tried to understand the math using randomly generated data and a very simple network:
import numpy as np
import keras
from keras.layers import Input, Conv2D
from keras.models import Model
from keras import backend as K
# create the model
inputs = Input(shape=(10,10,1)) # 1 channel, 10x10 image
outputs = Conv2D(32, (3, 3), activation='relu', name='block1_conv1')(inputs)
model = Model(outputs=outputs, inputs=inputs)
# input
x = np.random.random(100).reshape((10,10))
# predicted output for x
y_pred = model.predict(x.reshape((1,10,10,1))) # y_pred.shape = (1,8,8,32)
I tried to calculate, for example, the value of the first row, first column in the first feature map, following the demo here.
w = model.layers[1].get_weights()[0] # w.shape = (3,3,1,32)
w0 = w[:,:,0,0]
b = model.layers[1].get_weights()[1] # b.shape = (32,)
b0 = b[0] # b0 = 0
y_pred_000 = np.sum(x[0:3,0:3] * w0) + b0
But relu(y_pred_000) is not equal to y_pred[0][0][0][0].
Could anyone point out what's wrong with my understanding? Thank you.
It's easy, and it comes from the Theano dimension ordering. The result of applying a filter is stored in a so-called channel dimension. In the case of TensorFlow this is the last dimension, and that's why the results are good. In the case of Theano it's the second dimension (the convolution result has shape (cases, channels, width, height)), so in order to solve your problem you need to change the prediction line to:
y_pred = model.predict(x.reshape((1,1,10,10)))
Also, you need to change the way you get the weights: since weights in Theano have shape (output_channels, input_channels, width, height), change the weight getter to:
w = model.layers[1].get_weights()[0] # w.shape = (32,1,3,3)
w0 = w[0,0,:,:]
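Putting the two changes together, a minimal end-to-end check might look like this (a sketch that assumes the Theano "channels_first" layout described above and reuses model and x from the question):
import numpy as np

# prediction with channels-first input: shape (1, 1, 10, 10) -> (1, 32, 8, 8)
y_pred = model.predict(x.reshape((1, 1, 10, 10)))

w = model.layers[1].get_weights()[0]   # (32, 1, 3, 3) under Theano ordering
b = model.layers[1].get_weights()[1]   # (32,)
w0 = w[0, 0, :, :]                     # first filter, first input channel
b0 = b[0]

# manual cross-correlation for the top-left value of feature map 0, plus relu
y_manual = max(np.sum(x[0:3, 0:3] * w0) + b0, 0.0)

print(np.isclose(y_manual, y_pred[0, 0, 0, 0]))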
