Impact of the parameter dimension in gather function - torch

I am trying to use the gather function in PyTorch but can't understand the role of the dim parameter.
Code:
t = torch.Tensor([[1,2],[3,4]])
print(torch.gather(t, 0, torch.LongTensor([[0,0],[1,0]])))
Output:
1 2
3 2
[torch.FloatTensor of size 2x2]
Dimension set to 1:
print(torch.gather(t, 1, torch.LongTensor([[0,0],[1,0]])))
Output becomes:
1 1
4 3
[torch.FloatTensor of size 2x2]
How does the gather function actually work?

I realized how the gather function works.
t = torch.Tensor([[1,2],[3,4]])
index = torch.LongTensor([[0,0],[1,0]])
torch.gather(t, 0, index)
Since the dimension is zero, the output will be:
| t[index[0, 0], 0] t[index[0, 1], 1] |
| t[index[1, 0], 0] t[index[1, 1], 1] |
If the dimension is set to one, the output will become:
| t[0, index[0, 0]] t[0, index[0, 1]] |
| t[1, index[1, 0]] t[1, index[1, 1]] |
So the formula is:
For a 3-D tensor the output is specified by:
out[i][j][k] = input[index[i][j][k]][j][k] # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k] # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]] # if dim == 2
Reference: http://pytorch.org/docs/master/torch.html?highlight=gather#torch.gather
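The indexing rule can also be checked without PyTorch. The sketch below is a plain-Python re-implementation for 2-D nested lists; the helper name gather_2d is made up for illustration and is not part of torch. It reproduces both outputs from the question.

```python
def gather_2d(t, dim, index):
    """Plain-Python illustration of the gather rule for 2-D nested lists.

    gather_2d is a hypothetical helper, not a PyTorch function.
    """
    rows, cols = len(index), len(index[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if dim == 0:
                # index replaces the row coordinate; the column stays j
                out[i][j] = t[index[i][j]][j]
            elif dim == 1:
                # index replaces the column coordinate; the row stays i
                out[i][j] = t[i][index[i][j]]
            else:
                raise ValueError("dim must be 0 or 1 for a 2-D input")
    return out


t = [[1, 2], [3, 4]]
index = [[0, 0], [1, 0]]
print(gather_2d(t, 0, index))  # [[1, 2], [3, 2]]
print(gather_2d(t, 1, index))  # [[1, 1], [4, 3]]
```

In both cases the output has the shape of index, and only the coordinate along dim is looked up in index.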

Just to add to the existing answer: one application of gather is to collect scores along a designated dimension.
For instance, suppose we have the following setting:
3 classes and 5 examples
Each example is assigned a score for every class
The objective is to collect the scores indicated by the labels y
The code is as follows:
import torch
torch.manual_seed(0)
num_examples = 5
num_classes = 3
scores = torch.randn(num_examples, num_classes)
print(scores)
# tensor([[ 1.5410, -0.2934, -2.1788],
#         [ 0.5684, -1.0845, -1.3986],
#         [ 0.4033,  0.8380, -0.7193],
#         [-0.4033, -0.5966,  0.1820],
#         [-0.8567,  1.1006, -1.0712]])
y = torch.LongTensor([1, 2, 1, 0, 2])
# For each row i, pick the score in column y[i].
res = scores.gather(1, y.view(-1, 1)).squeeze()
print(res)
Outputs:
#print of gather results
tensor([-0.2934, -1.3986, 0.8380, -0.4033, -1.0712])
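To see what the gather call selects here, the same lookup can be written in plain Python on the printed values. This is only an illustration on nested lists, with the numbers copied from the output above; it is not how PyTorch computes the result.

```python
# Scores copied from the printed tensor above (5 examples x 3 classes).
scores = [
    [1.5410, -0.2934, -2.1788],
    [0.5684, -1.0845, -1.3986],
    [0.4033, 0.8380, -0.7193],
    [-0.4033, -0.5966, 0.1820],
    [-0.8567, 1.1006, -1.0712],
]
y = [1, 2, 1, 0, 2]

# gather along dim 1 with one index per row: take scores[i][y[i]].
picked = [row[label] for row, label in zip(scores, y)]
print(picked)  # [-0.2934, -1.3986, 0.838, -0.4033, -1.0712]
```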

How to use CP-SAT solver or other tools to find 3D arrays with constraints on rows, columns and tubes (representing school classes, terms and days)?

I will be grateful to anyone who can help me write some Python code to enumerate the 21×2×3 arrays, indexed by i, j and k, which are two-thirds filled with 0's and one-third filled with the values 'Ava', 'Bob', 'Joe', 'Mia', 'Sam', 'Tom', 'Zoe' in such a way that:
for each fixed index i, there are exactly two empty 2-tuples and one 2-tuple with two different non-zero values;
for each fixed index k, there are exactly fourteen empty 2-tuples and seven 2-tuples with two different non-zero values;
for each fixed pair of indexes j and k, there is a 21-tuple with fourteen zero values and exactly one occurrence of each of the non-zero values, respecting the following constraints:
a) "Ava" can appear only in a row with index 0, 1, 4, 6, 10, 11, 13, 14, 15, 19 or 20;
b) "Bob" can appear only in a row with index 2, 3, 5, 7, 8, 9, 12, 16, 17 or 18;
c) "Joe" can appear only in a row with index 2, 4, 5, 7, 8, 10, 14, 15, 18 or 20;
d) "Mia" can appear only in a row with index 0, 1, 3, 6, 9, 12, 13, 16, 17 or 19;
e) "Sam" can appear only in a row with index 1, 2, 7, 9, 15, 17 or 20;
f) "Tom" can appear only in a row with index 0, 3, 8, 10, 12, 16 or 19;
g) "Zoe" can appear only in a row with index 4, 5, 6, 11, 13, 14 or 18.
As a result I would like to obtain something like this:
[ 0 0 [Tom Mia [ 0 0
0 0 Ava Sam 0 0
0 0 Sam Bob 0 0
0 0 Bob Tom 0 0
0 0 0 0 Joe Zoe
0 0 Joe Zoe 0 0
0 0 0 0 Zoe Ava
Joe Sam 0 0 0 0
0 0 0 0 Tom Bob
0 0 0 0 Mia Sam
Tom Ava 0 0 0 0
Ava Zoe 0 0 0 0
Bob Mia 0 0 0 0
0 0 Mia Ava 0 0
0 0 Zoe Joe 0 0
Sam Joe 0 0 0 0
0 0 0 0 Bob Tom
0 0 0 0 Sam Mia
Zoe Bob 0 0 0 0
Mia Tom 0 0 0 0
0 0 ] 0 0 ] Ava Joe]
Rows represent school classes, columns represent school terms (there are 2 of them), tubes represent class days (there are 3 of them: Monday, Wednesday and Friday). So the first horizontal slice of the above solution means that class 1A has lessons only on Wednesday, in the first term with teacher Tom and in the second term with teacher Mia. (Teachers can only work in some classes and not in others.)
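As a cross-check of the three conditions, here is a small plain-Python validator applied to the example array above. It is only a sketch: the compact encoding (day index, first-term teacher, second-term teacher) and the names EXAMPLE, build_array and is_valid are illustrative choices, not part of any library.

```python
from collections import Counter

NAMES = ["Ava", "Bob", "Joe", "Mia", "Sam", "Tom", "Zoe"]
ALLOWED = {
    "Ava": {0, 1, 4, 6, 10, 11, 13, 14, 15, 19, 20},
    "Bob": {2, 3, 5, 7, 8, 9, 12, 16, 17, 18},
    "Joe": {2, 4, 5, 7, 8, 10, 14, 15, 18, 20},
    "Mia": {0, 1, 3, 6, 9, 12, 13, 16, 17, 19},
    "Sam": {1, 2, 7, 9, 15, 17, 20},
    "Tom": {0, 3, 8, 10, 12, 16, 19},
    "Zoe": {4, 5, 6, 11, 13, 14, 18},
}

# The example from the question, one entry per class i:
# (day k, teacher in term 0, teacher in term 1).
EXAMPLE = [
    (1, "Tom", "Mia"), (1, "Ava", "Sam"), (1, "Sam", "Bob"), (1, "Bob", "Tom"),
    (2, "Joe", "Zoe"), (1, "Joe", "Zoe"), (2, "Zoe", "Ava"), (0, "Joe", "Sam"),
    (2, "Tom", "Bob"), (2, "Mia", "Sam"), (0, "Tom", "Ava"), (0, "Ava", "Zoe"),
    (0, "Bob", "Mia"), (1, "Mia", "Ava"), (1, "Zoe", "Joe"), (0, "Sam", "Joe"),
    (2, "Bob", "Tom"), (2, "Sam", "Mia"), (0, "Zoe", "Bob"), (0, "Mia", "Tom"),
    (2, "Ava", "Joe"),
]


def build_array(compact):
    """Expand the compact encoding into a[i][j][k] (class, term, day)."""
    a = [[[0] * 3 for _ in range(2)] for _ in range(len(compact))]
    for i, (k, first, second) in enumerate(compact):
        a[i][0][k] = first
        a[i][1][k] = second
    return a


def is_valid(a):
    n = len(a)
    # Condition 1: each class has exactly one filled 2-tuple, with two different names.
    for i in range(n):
        pairs = [(a[i][0][k], a[i][1][k]) for k in range(3)]
        filled = [pair for pair in pairs if pair != (0, 0)]
        if len(filled) != 1:
            return False
        first, second = filled[0]
        if first == 0 or second == 0 or first == second:
            return False
    # Condition 3 (together with condition 1 this implies condition 2):
    # each (term, day) column holds every name exactly once, in an allowed row.
    expected = Counter({0: n - len(NAMES), **{name: 1 for name in NAMES}})
    for j in range(2):
        for k in range(3):
            column = [a[i][j][k] for i in range(n)]
            if Counter(column) != expected:
                return False
            for i, name in enumerate(column):
                if name != 0 and i not in ALLOWED[name]:
                    return False
    return True


print(is_valid(build_array(EXAMPLE)))  # True
```

A checker like this is handy for testing any solver output, whichever tool produced it.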
Thanks in advance!
Update n. 1
As a starting point, I tried to attack the following toy problem:
Enumerate all arrays with a given number of rows and 3 columns which are two-thirds filled with "0" and one-third filled with "1" in such a way that summing the values in each row you always get 1 and summing the values in each column you always get rows / 3.
Finally, after struggling a bit, I think I managed to get a solution with the following code, that I kindly ask you to correct or improve. (I have set rows = 6 because the number of permutations of the obvious solution is 6!/(2!*2!*2!) = 90, whereas setting rows = 21 I would have got 21!/(7!*7!*7!) = 399,072,960 solutions.)
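The two counts quoted above can be double-checked with a few lines of plain Python. This is a sketch that does not use OR-Tools; multinomial and brute_force_count are illustrative helper names, and the brute force is only practical for small row counts that are a multiple of the number of columns.

```python
from itertools import product
from math import factorial


def multinomial(total, parts):
    """total! / (parts[0]! * parts[1]! * ...), computed with exact integers."""
    result = factorial(total)
    for part in parts:
        result //= factorial(part)
    return result


def brute_force_count(rows, columns=3):
    """Count 0/1 arrays with one 1 per row and rows // columns ones per column."""
    target = rows // columns
    count = 0
    # Each row holds exactly one 1, so a candidate is just a column choice per row.
    for choice in product(range(columns), repeat=rows):
        if all(choice.count(j) == target for j in range(columns)):
            count += 1
    return count


print(multinomial(6, [2, 2, 2]))   # 90
print(brute_force_count(6))        # 90
print(multinomial(21, [7, 7, 7]))  # 399072960
```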
from ortools.sat.python import cp_model
# Create model.
model = cp_model.CpModel()
# Create variables.
rows = 6
columns = 3
x = []
for i in range(rows):
    x.append([model.NewBoolVar(f'x[{i}][{j}]') for j in range(columns)])
# Add constraints.
for i in range(rows):
    model.Add(sum(x[i]) == 1)
# Uncomment the following four lines of code (and indent the loop after the else) if you want to solve the slightly
# more general problem that asks to enumerate all boolean arrays, with a given number of rows and columns, filled in such
# a way that summing the values in each row you always get 1 and summing the values in each column you always get no more
# than the ceiling of (rows / columns).
# if rows % columns != 0:
#     for j in range(columns):
#         model.Add(sum(x[i][j] for i in range(rows)) <= rows // columns + 1)
# else:
for j in range(columns):
    model.Add(sum(x[i][j] for i in range(rows)) == rows // columns)
class MyPrintedSolution():
    def __init__(self, sol, sol_number):
        self.sol = sol
        self.sol_number = sol_number
    def PrintReadableTable(self):
        print(f'Solution {self.sol_number}, printed in readable form:')
        counter = 0
        for v in self.sol:
            if counter % columns != columns-1:
                print(v, end = ' ')
            else:
                print(v)
            counter += 1
        print()
    def PrintRawSolution(self):
        print(f'Solution {self.sol_number}, printed in raw form:')
        counter = 0
        for v in self.sol:
            print(f'{v}', end = '')
            counter += 1
        print('\n')
class VarArraySolutionPrinter(cp_model.CpSolverSolutionCallback):
    def __init__(self, variables, limit):
        cp_model.CpSolverSolutionCallback.__init__(self)
        self.__variables = variables
        self.__solution_count = 0
        self.__solution_limit = limit
    def solution_count(self):
        return self.__solution_count
    def on_solution_callback(self):
        self.__solution_count += 1
        solution = [self.Value(v) for v in self.__variables]
        myprint = MyPrintedSolution(solution, self.__solution_count)
        myprint.PrintReadableTable()
        # myprint.PrintRawSolution()
        if self.__solution_count >= self.__solution_limit:
            print(f'Stop search after {self.__solution_limit} solutions')
            self.StopSearch()
# Create solver and solve model.
solver = cp_model.CpSolver()
# solver.parameters.num_workers = 16 # Solver works better with more workers. (At least 8, 16 if enough cores.)
# solver.parameters.log_search_progress = True
solver.parameters.enumerate_all_solutions = True
# solver.parameters.max_time_in_seconds = 10.0
solution_limit = 100000
solution_printer = VarArraySolutionPrinter([x[i][j] for i in range(rows) for j in range(columns)], solution_limit)
solver.Solve(model, solution_printer)
Update n. 2
Following @Christopher Hamkins' initial roadmap and subsequent precious suggestions, I think I finally got what I wanted, using the following code (although I am of course always open to corrections or further suggestions).
from ortools.sat.python import cp_model
# Create model.
model = cp_model.CpModel()
# Create variables.
classes = 21 # indexed with "i", but one could as well have chosen "c"
terms = 2 # indexed with "j", but one could as well have chosen "t"
days = 3 # indexed with "k", but one could as well have chosen "d"
persons = 8 # indexed with "p"
persons_names = [' 0 ', 'Ava', 'Bob', 'Joe', 'Mia', 'Sam', 'Tom', 'Zoe']
classes_names = ['1A', '1B', '1C', '1D', '1E', '1F', '1G', '2A', '2B', '2C', '2D', '2E', '2F', '2G', '3A', '3B', '3C', '3D', '3E', '3F', '3G']
classes_p = [[] for _ in range(persons)]
classes_p[0] = list(range(classes))
classes_p[1] = [0, 1, 4, 6, 10, 11, 13, 14, 15, 19, 20] # list of classes in which person 1 can work
classes_p[2] = [2, 3, 5, 7, 8, 9, 12, 16, 17, 18] # list of classes in which person 2 can work
classes_p[3] = [2, 4, 5, 7, 8, 10, 14, 15, 18, 20] # list of classes in which person 3 can work
classes_p[4] = [0, 1, 3, 6, 9, 12, 13, 16, 17, 19] # list of classes in which person 4 can work
classes_p[5] = [1, 2, 7, 9, 15, 17, 20] # list of classes in which person 5 can work
classes_p[6] = [0, 3, 8, 10, 12, 16, 19] # list of classes in which person 6 can work
classes_p[7] = [4, 5, 6, 11, 13, 14, 18] # list of classes in which person 7 can work
x = {}
for i in range(classes):
    for j in range(terms):
        for k in range(days):
            for p in range(persons):
                x[i, j, k, p] = model.NewBoolVar(f'x[{i}, {j}, {k}, {p}]')
# Add constraints.
"""
For all i, j, k constrain the sum of x[i, j, k, p] over p in the range of people to be equal to 1,
so exactly nobody or one person is selected at a given slot.
"""
for i in range(classes):
    for j in range(terms):
        for k in range(days):
            model.Add(sum(x[i, j, k, p] for p in range(persons)) == 1)
"""
For all i constrain the sum of x[i, j, k, p] over all j, k, p in their respective ranges (except p = 0)
to be exactly equal to 2, so exactly two people are in a given row.
"""
for i in range(classes):
    model.Add(sum(x[i, j, k, p] for j in range(terms) for k in range(days) for p in range(1, persons)) == 2)
"""
For all i, k, and for p = 0, add the implications
x[i, 0, k, 0] == x[i, 1, k, 0]
"""
for i in range(classes):
    for k in range(days):
        model.Add(x[i, 0, k, 0] == x[i, 1, k, 0])
"""
For all i, p (except p = 0), constrain the sum of x[i, j, k, p] over all j and k
to be at most 1.
"""
for i in range(classes):
    for p in range(1, persons):
        model.Add(sum(x[i, j, k, p] for j in range(terms) for k in range(days)) <= 1)
        # for k in range(days): # Equivalent alternative to the previous line of code
        #     model.AddBoolOr([x[i, 0, k, p].Not(), x[i, 1, k, p].Not()])
"""
For all j, k constrain the sum of x[i, j, k, p] over all i, p in their respective ranges (except p = 0)
to be exactly equal to 7, so exactly seven people are in a given column.
"""
for j in range(terms):
    for k in range(days):
        model.Add(sum(x[i, j, k, p] for i in range(classes) for p in range(1, persons)) == 7)
"""
For all j, k, p (except p = 0) constrain the sum of x[i, j, k, p] over all i
to be exactly equal to 1, so each person appears exactly once in the column.
"""
for j in range(terms):
    for k in range(days):
        for p in range(1, persons):
            model.Add(sum(x[i, j, k, p] for i in range(classes)) == 1)
"""
For all j and k, constrain x[i, j, k, p] == 0 for the row i in which each person p can't appear.
"""
for p in range(persons):
    # Iterate directly over the classes in which person p is not allowed.
    for i in set(range(classes)) - set(classes_p[p]):
        for j in range(terms):
            for k in range(days):
                model.Add(x[i, j, k, p] == 0)
class MyPrintedSolution():
    def __init__(self, sol, sol_number):
        self.sol = sol
        self.sol_number = sol_number
    def PrintReadableTable1(self):
        print(f'Solution {self.sol_number}, printed in first readable form:')
        print(' | Mon | Wed | Fri ')
        print(' Cl | Term1 Term2 | Term1 Term2 | Term1 Term2')
        print('----------------------------------------------------', end='')
        # Reorder each class's 48 values from (term, day, person) to (day, term, person).
        q = list(range(8)) + list(range(24, 32)) + list(range(8, 16)) + list(range(32, 40)) + list(range(16, 24)) + list(range(40, 48))
        r = []
        for i in range(21):
            r += [n+48*i for n in q]
        shuffled_sol = [self.sol[m] for m in tuple(r)]
        counter = 0
        for w in shuffled_sol:
            if (counter % (persons * days * terms)) == 0:
                print('\n ', classes_names[counter // (terms * days * persons)], sep='', end=' |')
            if w:
                print(' ', persons_names[counter % persons], sep='', end=' ')
            counter += 1
        print('\n')
    def PrintReadableTable2(self):
        print(f'Solution {self.sol_number}, printed in second readable form:')
        print(' Cl | Term1 Term2 ')
        print(' Cl | Mon Wed Fri Mon Wed Fri ')
        print('----------------------------------------', end = '')
        counter = 0
        for v in self.sol:
            if (counter % (persons * days * terms)) == 0:
                print('\n ', classes_names[counter // (terms * days * persons)], sep = '', end = ' |')
            if v:
                print(' ', persons_names[counter % persons], sep = '', end = ' ')
            counter += 1
        print('\n')
    def PrintRawSolution(self):
        print(f'Solution {self.sol_number}, printed in raw form:')
        counter = 0
        for v in self.sol:
            print(f'{v}', end = '')
            counter += 1
        print('\n')
class VarArraySolutionPrinter(cp_model.CpSolverSolutionCallback):
    def __init__(self, variables, limit):
        cp_model.CpSolverSolutionCallback.__init__(self)
        self.__variables = variables
        self.__solution_count = 0
        self.__solution_limit = limit
    def solution_count(self):
        return self.__solution_count
    def on_solution_callback(self):
        self.__solution_count += 1
        solution = [self.Value(v) for v in self.__variables]
        myprint = MyPrintedSolution(solution, self.__solution_count)
        myprint.PrintReadableTable1()
        # myprint.PrintReadableTable2()
        # myprint.PrintRawSolution()
        if self.__solution_count >= self.__solution_limit:
            print(f'Stop search after {self.__solution_limit} solutions')
            self.StopSearch()
# Create solver and solve model.
solver = cp_model.CpSolver()
# solver.parameters.num_workers = 16 # Solver works better with more workers. (At least 8, 16 if enough cores.)
# solver.parameters.log_search_progress = True
solver.parameters.enumerate_all_solutions = True
# solver.parameters.max_time_in_seconds = 10.0
solution_limit = 20
solution_printer = VarArraySolutionPrinter([x[i, j, k, p] for i in range(classes) for j in range(terms) for k in range(days) for p in range(persons)], solution_limit)
status = solver.Solve(model, solution_printer)
Update n. 3
@AirSquid proposed a solution using PuLP which is to me almost as valuable as the one using CP-SAT. It provides only one solution at a time, but (it has other advantages and) one can always get around this by adding some ad hoc further constraints, for example to see a different solution with a certain person in a specific position.
Your "toy" problem is definitely going in the right direction.
For your actual problem, try making a 21×2×3×8 array x of BoolVars, indexed with i, j, k and p (for person). The last index represents the person: it needs 0 to represent "nobody", and for the rest Ava = 1, Bob = 2, etc., so its range is one larger than the number of people. If the variable x[i, j, k, p] is true (1), it means that the given person p is present at the index i, j, k. If x[i, j, k, 0] is true, it means a 0 = nobody is present at i, j, k.
For all i, j, k, constrain the sum of x[i, j, k, p] for p in the range of people to be equal to 1, so exactly nobody or one person is selected at a given slot.
For point 1: fixed the index i you have exactly two empty 2-tuples and one 2-tuple with different non-zero values:
For all i constrain the sum of x[i, j, k, p] for all j, k, p in their respective ranges (except p = 0) to be exactly equal to 2, so exactly two people are in a given row.
For all i, k, and for p = 0, add the implications
x[i, 0, k, 0] == x[i, 1, k, 0]
This will ensure that if one of the pair is 0, so is the other.
For all i, k and p except p = 0, add the implications
x[i, 0, k, p] implies x[i, 1, k, p].Not() and
x[i, 1, k, p] implies x[i, 0, k, p].Not()
(Actually one of these alone should be sufficient)
You can directly add an implication with the AddImplication(self, a, b) method, or you can realize that "a implies b" means the same thing as "b or not a" and add the implication with the AddBoolOr method. For the first implication, take x[i, 0, k, p] as a and x[i, 1, k, p].Not() as b, therefore adding:
AddBoolOr([x[i, 0, k, p].Not(), x[i, 1, k, p].Not()])
Note that both variables are negated with Not() in the expression.
Since the other implication assigns x[i, 1, k, p] as a, and x[i, 0, k, p].Not() as b, the resulting expression is exactly the same
AddBoolOr([x[i, 0, k, p].Not(), x[i, 1, k, p].Not()])
so you only need to add it once.
This will ensure a tuple will consist of two different people.
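The claim that both implications collapse into the same clause can be verified with a truth table. The snippet below is a plain-Python sketch that does not touch OR-Tools; implies and forms are illustrative helpers, and a and b stand for x[i, 0, k, p] and x[i, 1, k, p].

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return q if p else True


def forms(a, b):
    """Four ways of saying 'a and b are not both true'."""
    return (
        implies(a, not b),       # a implies not b
        implies(b, not a),       # b implies not a
        (not a) or (not b),      # the clause passed to AddBoolOr
        int(a) + int(b) <= 1,    # the linear form: sum at most 1
    )


for a, b in product([False, True], repeat=2):
    print(a, b, forms(a, b))
```

All four entries agree on every row, and they are false only when a and b are both true, which is why the sum-based alternative formulation is interchangeable with the clause for a single pair.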
Alternative formulation of the last part:
For all i and p except p = 0, constrain the sum of x[i, j, k, p] for all j and k to be at most 1.
For point 2: fixed the index k you have exactly fourteen empty 2-tuples and seven 2-tuple with different non-zero values;
For all j and k constrain the sum of x[i, j, k, p] for all i and p (except p=0) in their respective ranges to be exactly equal to 7, so exactly seven people are in a given column.
For all j, k, and p (except p = 0) constrain the sum of x[i, j, k, p] over all i to be exactly equal to 1, so each person appears exactly once in the column (that is, once for each value of the indices j and k, for some value of i).
For point 3:
For all j and k, Constrain x[i, j, k, p] == 0 for the row i in which each person p can't appear.
Let us know how it works.
You're taking a pretty big swing if you are new to the trifecta of python, linear programming, and pulp, but the problem you describe is very doable...perhaps the below will get you started. It is a smaller example that should work just fine for the data you have, I just didn't type it all in.
A couple notes:
The below is a linear program. It is "naturally integer" as coded, preventing the need to restrict the domain of the variables to integers, so it is much easier to solve. (A topic for you to research, perhaps).
You could certainly code this up as a constraint problem as well, I'm just not as familiar. You could also code this up like a matrix as you are doing with i, j, k, but most frameworks allow more readable names for the sets.
The teaching day M/W/F is arbitrary and not linked to anything else in the problem, so you can (externally to the problem), just pick 1/3 of the assignments per day from the solution for each course & term.
The transition from the verbiage to the constraint formulation is most of the magic in linear programming, and you'd be well served by an introductory text if you continue along!
Code:
# teacher assignment
import pulp
from itertools import chain
# some data...
teach_days = {'M', 'W', 'F'}
terms = {'Spring', 'Fall'}
courses = {'Math 101', 'English 203', 'Physics 201'}
legal_asmts = {'Bob': {'Math 101', 'Physics 201'},
               'Ann': {'Math 101', 'English 203'},
               'Tim': {'English 203'},
               'Joe': {'Physics 201'}}
# quick sanity check
assert courses == set.union(*chain(legal_asmts.values())), 'course mismatch'
# set up the problem
prob = pulp.LpProblem('teacher_assignment', pulp.LpMaximize)
# make a 3-tuple index of the term, class, teacher
idx = [(term, course, teacher) for term in terms for course in courses for teacher in legal_asmts.keys()]
assn = pulp.LpVariable.dicts('assign', idx, cat=pulp.LpContinuous, lowBound=0)
# OBJECTIVE: teach as many courses as possible within constraints...
prob += pulp.lpSum(assn)
# CONSTRAINTS
# teach each class no more than once per term
for term in terms:
    for course in courses:
        prob += pulp.lpSum(assn[term, course, teacher] for teacher in legal_asmts.keys()) <= 1
# each teacher no more than 1 course per term
for term in terms:
    for teacher in legal_asmts.keys():
        prob += pulp.lpSum(assn[term, course, teacher] for course in courses) <= 1
# each teacher can only teach within legal assmts, and if legal, only teach it once
for teacher in legal_asmts.keys():
    for course in courses:
        if course in legal_asmts.get(teacher):
            prob += pulp.lpSum(assn[term, course, teacher] for term in terms) <= 1
        else: # it is not legal assignment
            prob += pulp.lpSum(assn[term, course, teacher] for term in terms) <= 0
prob.solve()
#print(prob)
# Inspect results...
for i in idx:
    if assn[i].varValue: # will be true if value is non-zero
        print(i, assn[i].varValue)
Output:
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Presolve 16 (-10) rows, 12 (-12) columns and 32 (-40) elements
Perturbing problem by 0.001% of 1 - largest nonzero change 0.00010234913 ( 0.010234913%) - largest zero change 0
0 Obj -0 Dual inf 11.99913 (12)
10 Obj 5.9995988
Optimal - objective value 6
After Postsolve, objective 6, infeasibilities - dual 0 (0), primal 0 (0)
Optimal objective 6 - 10 iterations time 0.002, Presolve 0.00
Option for printingOptions changed from normal to all
Total time (CPU seconds): 0.00 (Wallclock seconds): 0.00
('Spring', 'Math 101', 'Bob') 1.0
('Spring', 'Physics 201', 'Joe') 1.0
('Spring', 'English 203', 'Ann') 1.0
('Fall', 'Math 101', 'Ann') 1.0
('Fall', 'Physics 201', 'Bob') 1.0
('Fall', 'English 203', 'Tim') 1.0
EDIT: corrected form of problem
Misunderstood part of the problem statement. The below is fixed. Needed to introduce a binary indicator variable for the day assignment per form and had some fun with tabulate.
Using an LP has the advantage that (with the included obj statement) it will do the best possible within the constraints to teach as much as possible, even if there is a teacher shortage, where a CP will not. A CP on the other hand can enumerate all the combos that satisfy the constraints, the LP cannot.
Code
# teacher assignment
import pulp
from tabulate import tabulate
# some data...
teach_days = {'M', 'W', 'F'}
terms = {'Spring', 'Fall'}
# NOTE: this covers forms 0..19 only, matching the output below; use range(21) to include form 20 from the question
forms = list(range(20))
teach_capable = {"Ava": [0, 1, 4, 6, 10, 11, 13, 14, 15, 19, 20],
                 "Bob": [2, 3, 5, 7, 8, 9, 12, 16, 17, 18],
                 "Joe": [2, 4, 5, 7, 8, 10, 14, 15, 18, 20],
                 "Mia": [0, 1, 3, 6, 9, 12, 13, 16, 17, 19],
                 "Sam": [1, 2, 7, 9, 15, 17, 20],
                 "Tom": [0, 3, 8, 10, 12, 16, 19],
                 "Zoe": [4, 5, 6, 11, 13, 14, 18]}
# set up the problem
prob = pulp.LpProblem('teacher_assignment', pulp.LpMaximize)
# make a 4-tuple index of the day, term, class, teacher
idx = [(day, term, form, teacher)
       for day in teach_days
       for term in terms
       for form in forms
       for teacher in teach_capable.keys()]
# variables
assn = pulp.LpVariable.dicts('assign', idx, cat=pulp.LpContinuous, lowBound=0)
form_day = pulp.LpVariable.dicts('form-day',
                                 [(form, day) for form in forms for day in teach_days],
                                 cat=pulp.LpBinary) # indicator for which day the form uses
# OBJECTIVE: teach as many courses as possible within constraints...
prob += pulp.lpSum(assn)
# CONSTRAINTS
# 1. Teach each form on no more than 1 day
for form in forms:
    prob += pulp.lpSum(form_day[form, day] for day in teach_days) <= 1 # limit to 1 day per form
for form in forms:
    for day in teach_days:
        for term in terms:
            # no more than 1 assignment, if this day is the designated "form-day"
            prob += pulp.lpSum(assn[day, term, form, teacher] for teacher in teach_capable.keys()) \
                    <= form_day[form, day]
# 2. Each teacher can only teach within legal assmts, and limit them to teaching that form once
for teacher in teach_capable.keys():
    for form in forms:
        if form in teach_capable.get(teacher):
            prob += pulp.lpSum(assn[day, term, form, teacher] for day in teach_days for term in terms) <= 1
        else: # it is not legal assignment
            prob += pulp.lpSum(assn[day, term, form, teacher] for day in teach_days for term in terms) <= 0
# 3. Each teacher can only teach once per day per term
for teacher in teach_capable.keys():
    for term in terms:
        for day in teach_days:
            prob += pulp.lpSum(assn[day, term, form, teacher] for form in forms) <= 1
prob.solve()
print("Status = %s" % pulp.LpStatus[prob.status])
#print(prob)
# gather results...
selections = []
for i in idx:
    if assn[i].varValue: # will be true if value is non-zero
        selections.append(i)
        #print(i, assn[i].varValue)
# Let's try to make some rows for tabulate... hacky but fun
def row_index(label):
    """return the form, column index, and name"""
    col = 1
    if 'W' in label: col += 2
    elif 'F' in label: col += 4
    if 'Fall' in label: col += 1
    return label[2], col, label[-1]
headers = ['Form', 'Mon-1', 'Mon-2', 'Wed-1', 'Wed-2', 'Fri-1', 'Fri-2']
row_data = [[f,'','','','','',''] for f in forms]
for selection in selections:
    form, col, name = row_index(selection)
    row_data[form][col] = name
print(tabulate(row_data, headers=headers, tablefmt='grid'))
Output:
Status = Optimal
+--------+---------+---------+---------+---------+---------+---------+
| Form | Mon-1 | Mon-2 | Wed-1 | Wed-2 | Fri-1 | Fri-2 |
+========+=========+=========+=========+=========+=========+=========+
| 0 | | | Ava | Tom | | |
+--------+---------+---------+---------+---------+---------+---------+
| 1 | Mia | Sam | | | | |
+--------+---------+---------+---------+---------+---------+---------+
| 2 | | | | | Sam | Joe |
+--------+---------+---------+---------+---------+---------+---------+
| 3 | | | | | Bob | Mia |
+--------+---------+---------+---------+---------+---------+---------+
| 4 | Ava | Zoe | | | | |
+--------+---------+---------+---------+---------+---------+---------+
| 5 | Zoe | Joe | | | | |
+--------+---------+---------+---------+---------+---------+---------+
| 6 | | | Mia | Zoe | | |
+--------+---------+---------+---------+---------+---------+---------+
| 7 | | | Joe | Sam | | |
+--------+---------+---------+---------+---------+---------+---------+
| 8 | Bob | Tom | | | | |
+--------+---------+---------+---------+---------+---------+---------+
| 9 | | | Sam | Mia | | |
+--------+---------+---------+---------+---------+---------+---------+
| 10 | | | | | Ava | Tom |
+--------+---------+---------+---------+---------+---------+---------+
| 11 | | | Zoe | Ava | | |
+--------+---------+---------+---------+---------+---------+---------+
| 12 | | | Tom | Bob | | |
+--------+---------+---------+---------+---------+---------+---------+
| 13 | | | | | Zoe | Ava |
+--------+---------+---------+---------+---------+---------+---------+
| 14 | | | | | Joe | Zoe |
+--------+---------+---------+---------+---------+---------+---------+
| 15 | Joe | Ava | | | | |
+--------+---------+---------+---------+---------+---------+---------+
| 16 | | | | | Tom | Bob |
+--------+---------+---------+---------+---------+---------+---------+
| 17 | Sam | Bob | | | | |
+--------+---------+---------+---------+---------+---------+---------+
| 18 | | | Bob | Joe | | |
+--------+---------+---------+---------+---------+---------+---------+
| 19 | Tom | Mia | | | | |
+--------+---------+---------+---------+---------+---------+---------+
[Finished in 165ms]

Update cartesianIndex

I am stuck on a problem: I want to update my Cartesian indices.
I have a 200x6 binary matrix x (1 if assigned, 0 otherwise). I want to find the Cartesian indices where x is 1 in the first 3 columns and in the last 3 columns.
I have the following code:
index_right = findall( x -> x == 1, sol.x_assignment[:,1:3])
index_left = findall( x -> x == 1, sol.x_assignment[:,4:6])
index_left
index_right is correct, but index_left is wrong: its column indices are between 1 and 3 instead of between 4 and 6:
CartesianIndex(2, 1)
CartesianIndex(3, 1)
CartesianIndex(10, 2)
CartesianIndex(11, 1)
Expected output:
CartesianIndex(2, 4)
CartesianIndex(3, 4)
CartesianIndex(10, 5)
CartesianIndex(11, 4)
How can I update index_left to add 3 to the second index of every entry?
One solution could be
index_left = findall( x -> x == 1, sol.x_assignment[:,4:6])
index_left = map(x -> x + CartesianIndex(0, 3), index_left)
I think you can also use ==(1) in place of x -> x == 1, which looks a bit nicer :)
index_left = findall(==(1), sol.x_assignment[:,4:6])
and the in-place version of map should work too:
map!(x -> x + CartesianIndex(0, 3), index_left, index_left)
An alternative could be first finding all the indices with 1 and then filtering afterwards, so something like
full_index = findall(==(1), sol.x_assignment)
and then (using the names from the question, where index_right covers columns 1 to 3)
index_right = filter(x -> x[2] <= 3, full_index)
index_left = filter(x -> x[2] > 3, full_index)
Assuming your x is:
using Random;Random.seed!(0);
x = rand(Bool, (5,6))
The first set can be found as:
findall(isone, @view x[:, 1:3])
For the second set you need to shift the results, hence you want:
findall(isone, @view x[:, 4:6]) .+ Ref(CartesianIndex(0, 3))
If you are searching for a different value, e.g. 2, use ==(2) rather than a lambda as this is faster.
Similarly, @view avoids unnecessary allocations.
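For readers who think in Python rather than Julia, the underlying point is that indices found in a slice are relative to that slice and must be shifted by the slice's starting column. The sketch below shows this on nested lists; find_ones is an illustrative helper, indices are 0-based, and results come out in row-major order, whereas Julia's findall walks the matrix column by column.

```python
def find_ones(matrix, col_start, col_stop):
    """Return (row, col) pairs, in full-matrix coordinates, where the entry is 1.

    Only columns col_start .. col_stop - 1 are searched (0-based).
    """
    found = []
    for r, row in enumerate(matrix):
        for offset, value in enumerate(row[col_start:col_stop]):
            if value == 1:
                # offset is relative to the slice, so shift it back
                found.append((r, col_start + offset))
    return found


x = [
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
]
print(find_ones(x, 0, 3))  # [(1, 0), (2, 2)]
print(find_ones(x, 3, 6))  # [(1, 3), (2, 4)]
```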

Get a sequence number from 0 and alternate positive/negative incrementing every other time

I would like to be able to obtain a (non-convergent) sequence of numbers by a simple calculation that would look like this: 0, 1, -1, 2, -2, 3, -3, 4, -4 ...
By simple calculation I mean being able to do it with a single variable that would start from 1 (or 0) without having to rearrange this sequence.
I made several (unsuccessful) attempts in Lua, here is what it should look like in principle (this example only alternates 0s and 1s):
do
  local n = 0
  for i = 1, 10 do
    print(n)
    n = n==0 and 1 or -n + (n/n)
  end
end
Is this possible and how?
Update:
I just succeeded like this:
local n, j = 0, 2
for i = 1, 10 do
  print(n)
  n = n==0 and 1 or j%2==0 and -(n+(n/math.abs(n))) or -n
  j = j + 1
end
But I had to resort to a second variable. Is it possible to do it with only n?
The whole numbers are enumerable. Thus there exists a mapping from the natural numbers to whole numbers. You'll now have to use a loop to loop over natural numbers, then compute a function that gives you a whole number:
-- 0, 1...10, -1...-10 -> 21 numbers total
for n = 1, 21 do
  local last_bit = n % 2
  local sign = 1 - (2 * last_bit)
  local abs = (n - last_bit) / 2
  print(sign * abs)
end
prints
-0
1
-1
2
-2
...
10
-10
on Lua 5.1; on newer Lua versions, you can use n // 2 instead of (n - last_bit) / 2 to (1) use ints and (2) make extracting the abs cheaper.
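For comparison, here is the same mapping as a Python sketch; nth_integer is an illustrative name. With Python integers the first value prints as 0 rather than -0, since integers have no negative zero.

```python
def nth_integer(n):
    """Map n = 1, 2, 3, ... to 0, 1, -1, 2, -2, ... (same arithmetic as the Lua loop)."""
    last_bit = n % 2            # 1 for odd n, 0 for even n
    sign = 1 - 2 * last_bit     # -1 for odd n, +1 for even n
    magnitude = n // 2          # integer division replaces (n - last_bit) / 2
    return sign * magnitude


print([nth_integer(n) for n in range(1, 10)])  # [0, 1, -1, 2, -2, 3, -3, 4, -4]
```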
Simply "emit" both n and -n in each iteration:
for n = 0, 10 do
  print(n)
  print(-n)
end
My problem was solved by @EgorSkriptunoff in a comment on my question; his approach is:
n = (n > 0 and 0 or 1) - n
The output of:
local n = 0
for i=1,10 do
  io.write(n..", ")
  n = (n > 0 and 0 or 1) - n
end
Actually gives:
0, 1, -1, 2, -2, 3, -3, 4, -4, 5,
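The one-variable update works because a positive n is simply negated (0 - n), while a zero or negative n becomes 1 - n, that is, |n| + 1. A Python transcription of the same recurrence, with alternating as an illustrative function name:

```python
def alternating(count):
    """Return the first `count` terms of 0, 1, -1, 2, -2, ... using only n as state."""
    n = 0
    terms = []
    for _ in range(count):
        terms.append(n)
        # positive n -> -n ; zero or negative n -> 1 - n (one step further from zero)
        n = (0 if n > 0 else 1) - n
    return terms


print(alternating(10))  # [0, 1, -1, 2, -2, 3, -3, 4, -4, 5]
```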

Force y axis to start at 0 and still use automated labeling

I have a plot whose y min starts well above 0. But I want to include 0 as the min of the y-axis and still have Stata automatically create evenly-spaced y-axis labels.
Here is the baseline:
sysuse auto2, clear
scatter turn displacement
This produces a plot that is almost what I want, except that the y range does not start at 0.
Based on this answer by Nick Cox (https://www.statalist.org/forums/forum/general-stata-discussion/general/1598753-force-chart-y-axis-to-start-at-0), I modify the code to be:
scatter turn displacement, yscale(range(0 .)) ylabel(0)
This succeeds in starting the y-axis at 0, but the labels other than 0 go away.
I proceed to remove ylabel(0):
scatter turn displacement, yscale(range(0 .))
This produces the opposite problem - the y-axis labels are the same as in the first plot.
How can I have Stata automatically produce the y-axis labels from 0 to the max? For instance, 0, 10, 20, 30, 40, 50 - importantly, though, I have many plots and need a solution that determines the exact values automatically, without needing me to input the y max, etc. So it would not be me who chooses 10, 20, ..., 50, but Stata.
By coincidence, I have been working on a command in this territory. Here is a reproducible example.
sysuse auto, clear
summarize turn, meanonly
local max = r(max)
nicelabels 0 `max', local(yla)
* shows 0 20 40 60
scatter turn displacement, yla(`yla', ang(h))
nicelabels 0 `max', local(yla) nvals(10)
* shows 0 10 20 30 40 50 60
scatter turn displacement, yla(`yla', ang(h))
where nicelabels is at present this code.
*! 1.0.0 NJC 25 April 2022
program nicelabels
    /// fudge() undocumented
    version 9
    gettoken first 0 : 0, parse(" ,")
    capture confirm numeric variable `first'
    if _rc == 0 {
        // syntax varname(numeric), Local(str) [ nvals(int 5) tight Fudge(real 0) ]
        syntax [if] [in] , Local(str) [ nvals(int 5) tight Fudge(real 0) ]
        local varlist `first'
        marksample touse
        quietly count if `touse'
        if r(N) == 0 exit 2000
    }
    else {
        // syntax #1 #2 , Local(str) [ nvals(int 5) tight Fudge(real 0) ]
        confirm number `first'
        gettoken second 0 : 0, parse(" ,")
        syntax , Local(str) [ nvals(int 5) tight Fudge(real 0) ]
        if _N < 2 {
            preserve
            quietly set obs 2
        }
        tempvar varlist touse
        gen double `varlist' = cond(_n == 1, `first', `second')
        gen byte `touse' = _n <= 2
    }
    su `varlist' if `touse', meanonly
    local min = r(min) - (r(max) - r(min)) * `fudge'/100
    local max = r(max) + (r(max) - r(min)) * `fudge'/100
    local tight = "`tight'" == "tight"
    mata: nicelabels(`min', `max', `nvals', `tight')
    di "`results'"
    c_local `local' "`results'"
end
mata :
void nicelabels(real min, real max, real nvals, real tight) {
    if (min == max) {
        st_local("results", min)
        exit(0)
    }
    real range, d, newmin, newmax
    colvector nicevals
    range = nicenum(max - min, 0)
    d = nicenum(range / (nvals - 1), 1)
    newmin = tight == 0 ? d * floor(min / d) : d * ceil(min / d)
    newmax = tight == 0 ? d * ceil(max / d) : d * floor(max / d)
    nvals = 1 + (newmax - newmin) / d
    nicevals = newmin :+ (0 :: nvals - 1) :* d
    st_local("results", invtokens(strofreal(nicevals')))
}
real nicenum(real x, real round) {
    real expt, f, nf
    expt = floor(log10(x))
    f = x / (10^expt)
    if (round) {
        if (f < 1.5) nf = 1
        else if (f < 3) nf = 2
        else if (f < 7) nf = 5
        else nf = 10
    }
    else {
        if (f <= 1) nf = 1
        else if (f <= 2) nf = 2
        else if (f <= 5) nf = 5
        else nf = 10
    }
    return(nf * 10^expt)
}
end
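To make the label arithmetic easier to follow, here is a rough Python port of the two Mata functions above. It is a sketch for illustration only, not part of the Stata command: nice_labels and its keyword arguments are names chosen here, and 51 is used as an assumed stand-in for r(max) of turn.

```python
import math


def nicenum(x, rounding):
    """Snap x to 1, 2, 5 or 10 times a power of ten (x must be positive)."""
    expt = math.floor(math.log10(x))
    f = x / 10 ** expt  # leading part of x, between 1 and 10
    if rounding:
        if f < 1.5:
            nf = 1
        elif f < 3:
            nf = 2
        elif f < 7:
            nf = 5
        else:
            nf = 10
    else:
        if f <= 1:
            nf = 1
        elif f <= 2:
            nf = 2
        elif f <= 5:
            nf = 5
        else:
            nf = 10
    return nf * 10 ** expt


def nice_labels(lo, hi, nvals=5, tight=False):
    """Evenly spaced 'nice' labels covering [lo, hi] (or inside it when tight)."""
    if lo == hi:
        return [lo]
    span = nicenum(hi - lo, False)
    d = nicenum(span / (nvals - 1), True)  # step between labels
    if tight:
        new_lo, new_hi = d * math.ceil(lo / d), d * math.floor(hi / d)
    else:
        new_lo, new_hi = d * math.floor(lo / d), d * math.ceil(hi / d)
    count = 1 + round((new_hi - new_lo) / d)
    return [new_lo + i * d for i in range(count)]


print(nice_labels(0, 51))            # [0, 20, 40, 60]
print(nice_labels(0, 51, nvals=10))  # [0, 10, 20, 30, 40, 50, 60]
```

The two printed lists agree with the label sets noted in the comments of the reproducible example.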
EDIT
If you go
sysuse auto, clear
summarize turn, meanonly
local max = r(max)
scatter turn displacement, yla(0(10)`max', ang(h))
scatter turn displacement, yla(0(20)`max', ang(h))
you get good solutions. Clearly in this case we need to know that 10 or 20 is a step size to use. There would be scope to calculate a good width programmatically using your own recipe.
EDIT 10 May 2022
A revised and documented nicelabels is now downloadable from SSC.

Sum on Spotfire

I am trying to compare two expressions in a table in Spotfire,
but they do not give me the same result, and I don't understand why from a mathematical point of view.
Sum([OUTS_P] - [OUTS_P2])
Sum([OUTS_P]) - Sum([OUTS_P2])
Do you have an idea in which cases these two operations could give different results?
take this example table:
A B
1 3
2 2
3 1
we have these two results:
Sum([A]) - Sum([B]) = Sum(1, 2, 3) - Sum(3, 2, 1) = 6 - 6 = 0
Sum([A] - [B]) = Sum( (1 - 3), (2 - 2), (3 - 1) ) = Sum(-2, 0, 2) = 0
this is what you're expecting, and this will work 100% of the time.
unless, of course, your table resembles this one:
A B
1 3
2 (Empty)
3 1
the B value in the second row is NULL, or (Empty). this table results in the expressions being evaluated as:
Sum([A]) - Sum([B]) = Sum(1, 2, 3) - Sum(3, 1) = 6 - 4 = 2
Sum([A] - [B]) = Sum( (1 - 3), (3 - 1) ) = Sum(-2, 2) = 0
the reason is that NULL is non-numeric: it's not possible to evaluate 2 - NULL to a number, so that row is ignored by Sum([A] - [B]), whereas Sum([A]) still counts the 2.
if you want both expressions to always result in the same answer, you can create a calculated column like this for each column you'll be using in Sum():
If([Column] is NULL, 0, [Column])
and then aggregate on this column instead of the original.
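The effect is easy to reproduce outside Spotfire. In the Python sketch below, None stands in for NULL and the helpers null_sum and null_sub only emulate the behaviour described above (a difference involving NULL is NULL, and the sum skips NULLs); they are not Spotfire functions.

```python
A = [1, 2, 3]
B = [3, None, 1]  # None plays the role of NULL / (Empty)


def null_sum(values):
    """Sum that skips missing values, as described for Sum() above."""
    return sum(v for v in values if v is not None)


def null_sub(a, b):
    """Row-wise difference that is missing whenever either operand is missing."""
    return None if a is None or b is None else a - b


diff_of_sums = null_sum(A) - null_sum(B)
sum_of_diffs = null_sum(null_sub(a, b) for a, b in zip(A, B))
print(diff_of_sums, sum_of_diffs)  # 2 0

# The suggested fix: replace NULL by 0 before aggregating.
B_filled = [0 if b is None else b for b in B]
fixed_diff_of_sums = null_sum(A) - null_sum(B_filled)
fixed_sum_of_diffs = null_sum(null_sub(a, b) for a, b in zip(A, B_filled))
print(fixed_diff_of_sums, fixed_sum_of_diffs)  # 2 2
```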
