Setting up linear program for allocation/assignment problem - r

I have a linear program that I already solved and use in Excel, but now I want to do it in R or Python because I have reached the limits of Excel and its Solver. Therefore I am asking for help on this specific topic.
I tried the lpSolve package, including altering the lp.assign function, but I cannot come up with a solution.
The problem is as follows:
Let's say I am a deliverer of a commodity good.
I have different depots which serve different areas. These areas MUST be served with their demands.
My depots, on the other hand, have a constraint on the capacity they can handle and deliver.
One depot can serve several areas, but one area can only be served by one depot.
I have the distance/cost matrix for the connections between depots and areas, as well as the demand of each area.
The objective is that the areas should be served at the minimal possible total cost.
Let's say the cost/distance matrix looks something like this:
assign.costs <- matrix(c(2, 7, 7, 2, 7, 7, 3, 2, 7, 2, 8, 10, 1, 9, 8, 2, 7, 8, 9, 10), 4, 10)  # the 20 values are recycled to fill 4 x 10
So this creates my matrix, with the customers/areas as the columns and the depots as the rows.
Now the demand of the areas/customers is:
assign.demand <- matrix (c(1,2,3,4,5,6,7,8,9,10), 1, 10)
The capacity restrictions, i.e. the amount the depots are able to serve, are:
assign.capacity <- matrix(c(15, 15, 15, 15), 4, 1)
So now I would like this problem to be solved by an LP to generate the allocation, i.e. which area should be served by which depot according to these restrictions.
The result should look something like this:
assign.solution <- matrix (c(1,0,0,0 ,0,1,0,0, 1,0,0,0, 1,0,0,0 ,0,0,0,1), 4, 10)
As for the restrictions, this means that every column must sum up to one.
I tried the lp and lp.assign functions from lpSolve, but I don't know exactly how to implement this exact kind of restriction, and I already tried to alter the lp.assign function without success.
If it helps, I can also write down the equations for the LP.
Thank you all for your help, I am really stuck right now :D
BR

Step 1. Develop a mathematical model
The mathematical model can look like this, where i indexes the depots, j the customers, and ship[i, j] is a binary decision variable indicating whether we ship from i to j (everything else is data):

minimize    sum over i, j of cost[i, j] * ship[i, j]
subject to  sum over j of demand[j] * ship[i, j] <= capacity[i]   for every depot i
            sum over i of ship[i, j] = 1                          for every customer j
            ship[i, j] in {0, 1}

The first constraint says that the total amount shipped from depot i must not exceed its capacity. The second constraint says that there must be exactly one supplier i for each customer j.
Step 2. Implementation
This is now just a question of being precise. I follow the model from the previous section as closely as I can.
library(dplyr)
library(tidyr)
library(ROI)
library(ROI.plugin.symphony)
library(ompr)
library(ompr.roi)
num_depots <- 4
num_cust <- 10
# the 20 cost values are recycled column-wise to fill the 4 x 10 matrix,
# exactly as in the question
cost <- matrix(c(2, 7, 7, 2, 7, 7, 3, 2, 7, 2, 8, 10, 1, 9, 8, 2, 7, 8, 9, 10),
               num_depots, num_cust)
demand <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
capacity <- c(15, 15, 15, 15)

m <- MIPModel() %>%
  # ship[i, j] = 1 if depot i serves customer j
  add_variable(ship[i, j], i = 1:num_depots, j = 1:num_cust, type = "binary") %>%
  # total demand served by depot i must not exceed its capacity
  add_constraint(sum_expr(demand[j] * ship[i, j], j = 1:num_cust) <= capacity[i],
                 i = 1:num_depots) %>%
  # each customer is served by exactly one depot
  add_constraint(sum_expr(ship[i, j], i = 1:num_depots) == 1, j = 1:num_cust) %>%
  set_objective(sum_expr(cost[i, j] * ship[i, j],
                         i = 1:num_depots, j = 1:num_cust), "min") %>%
  solve_model(with_ROI(solver = "symphony", verbosity = 1))

cat("Status:", solver_status(m), "\n")
cat("Objective:", objective_value(m), "\n")
get_solution(m, ship[i, j]) %>%
  filter(value > 0)
We see how important it is to first write down a mathematical model. It is much more compact and easier to reason about than a bunch of code; going directly to code often leads to all kinds of problems, like building a house without a blueprint. Even for this small example, writing down the mathematical model is a useful exercise.
For the implementation I used OMPR instead of the lpSolve package because OMPR allows me to stay closer to the mathematical model. lpSolve has a matrix interface, which is very difficult to use except for very structured models.
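To illustrate that point, here is a hedged sketch of the same model through lpSolve's matrix interface (my own construction, reusing cost, demand and capacity from above, not part of the original answer): the 40 binary variables ship[i, j] have to be flattened into one vector (column-major, so the depot index i varies fastest), and every constraint row has to be assembled by hand.
library(lpSolve)
nd <- 4; nc <- 10
# one capacity row per depot: coefficient demand[j] on every ship[i, j] of depot i
cap_con <- t(sapply(1:nd, function(i) {
  row <- matrix(0, nd, nc); row[i, ] <- demand; as.vector(row)
}))
# one assignment row per customer: each customer is served exactly once
asg_con <- t(sapply(1:nc, function(j) {
  row <- matrix(0, nd, nc); row[, j] <- 1; as.vector(row)
}))
res <- lp("min", as.vector(cost),
          rbind(cap_con, asg_con),
          c(rep("<=", nd), rep("=", nc)),
          c(capacity, rep(1, nc)),
          all.bin = TRUE)
matrix(res$solution, nd, nc)  # rows = depots, columns = customers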
Step 3: Solve it
Status: optimal
Objective: 32
variable i j value
1 ship 1 1 1
2 ship 4 2 1
3 ship 2 3 1
4 ship 1 4 1
5 ship 3 5 1
6 ship 4 6 1
7 ship 4 7 1
8 ship 2 8 1
9 ship 1 9 1
10 ship 3 10 1
I believe this is the correct solution.

Related

Assignment algorithm variant

I have a square matrix in R containing distances between cities:
set.seed(3)
x <- matrix(sample(1:15, 16, replace = TRUE), nrow = 4)
x
# [,1] [,2] [,3] [,4]
#[1,] 3 10 9 9
#[2,] 13 10 10 9
#[3,] 6 2 8 14
#[4,] 5 5 8 13
Every row represents a city from which a courier can be sent, and every column represents a city where a package has to be delivered. All couriers carry the same kind of package, so any courier can be assigned to any destination city.
Normally I would use the Hungarian algorithm in clue::solve_LSAP() to find an optimal assignment so that the total cost (in this case total distance) would be minimized:
y <- clue::solve_LSAP(x)
y
#Optimal assignment:
#1 => 1, 2 => 4, 3 => 2, 4 => 3
However, in this specific case I would like to minimize the spread of the distances.
I have been searching for quite some time now, and I found a suitable formulation in this book at page 270 (and something similar in this book at page 195):
The objective stated there is to minimize the difference between the maximum and minimum assigned distance, which is exactly what I am looking for. The assignment from the Hungarian algorithm gives the following difference between maximum and minimum distance:
distances <- x[cbind(1:4, y)]
max(distances) - min(distances)
#[1] 7
However, the best assignment to minimize the new objective is (this solution was found by brute force):
#Optimal assignment to minimize the new objective:
#1 => 2, 2 => 4, 3 => 1, 4 => 3
yNew <- c(2, 4, 1, 3)
distancesNew <- x[cbind(1:4, yNew)]
max(distancesNew) - min(distancesNew)
#[1] 4
So clearly, the Hungarian algorithm doesn't give the desired solution.
Now my question: is there any existing R code that finds an optimal assignment by minimizing the objective mentioned above (the difference between the maximum and minimum assigned cost value)? Or maybe some R code with an algorithm that achieves a similar result?
Those books I mentioned describe the algorithm I want, but 1) they both start from the solution of the bottleneck assignment problem (an algorithm for which I couldn't find R code either) and 2) implementing this algorithm (and possibly also the bottleneck assignment algorithm) myself would be far from ideal.
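For what it's worth, the min-spread objective can also be stated directly as a mixed-integer program, in the same OMPR style as the answer to the first question on this page. The following is only a sketch of my own formulation, not the bottleneck-based algorithm from the books: dmax and dmin are auxiliary variables I introduce for the largest and smallest assigned distance, and the big-M trick assumes all distances are nonnegative.
library(dplyr)
library(ROI)
library(ROI.plugin.symphony)
library(ompr)
library(ompr.roi)

n <- nrow(x)
M <- max(x)  # big enough to switch a dmin constraint off
result <- MIPModel() %>%
  add_variable(a[i, j], i = 1:n, j = 1:n, type = "binary") %>%
  add_variable(dmax, type = "continuous") %>%
  add_variable(dmin, type = "continuous") %>%
  # a must be a permutation: one destination per courier and vice versa
  add_constraint(sum_expr(a[i, j], j = 1:n) == 1, i = 1:n) %>%
  add_constraint(sum_expr(a[i, j], i = 1:n) == 1, j = 1:n) %>%
  # dmax is at least every assigned distance
  add_constraint(dmax >= x[i, j] * a[i, j], i = 1:n, j = 1:n) %>%
  # dmin is at most every assigned distance (relaxed by M when unassigned)
  add_constraint(dmin <= x[i, j] + M * (1 - a[i, j]), i = 1:n, j = 1:n) %>%
  set_objective(dmax - dmin, "min") %>%
  solve_model(with_ROI(solver = "symphony"))
get_solution(result, a[i, j]) %>% filter(value > 0.5)
For the 4 x 4 example above this should recover the brute-force spread of 4 (the assignment itself may differ if several assignments tie).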

R seq function between item 1 and 2, then between 2 and 3 of a vector

I have a vector c(5, 10, 15) and would like to use something like the seq function to create a new vector: 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15. This is how I would do it now, but it seems inelegant at best. In the final (functional) form, I would need to increment by any given number, not necessarily units of 1.
original_vec <- c(5, 10, 15)
new_vec <- unique(c(seq(original_vec[1],original_vec[2],1),seq(original_vec[2],original_vec[3],1)))
> new_vec
[1] 5 6 7 8 9 10 11 12 13 14 15
Is there a way (I'm sure there is!) to use an apply or similar function to apply a sequence across multiple items in a vector, also without repeating the number in the middle? (In the case above, 10 would be repeated if not for the unique() call.)
Edit: Some other possible scenarios might include changing c(1, 5, 7, 10, 12) to 1, 1.5, 2, 2.5, ..., 10, 10.5, 11, 11.5, 12, or c(1, 7, 4), where the value increases and then decreases by an interval.
The answer may be totally obvious and I just can't quite figure it out. I have looked through manuals and searched for the answer already. Thank you!
While this isn't the answer to my original question, after discussing with my colleague, we don't have cases where seq(min(original_vec), max(original_vec), by = 0.5) wouldn't work, so that's the simplest answer.
However, a more generalized answer might be:
interval = 1
seq(original_vec[1], original_vec[length(original_vec)], by = interval)
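A still more general apply-style pattern, which also handles the direction-change scenario from the edit, could look like this (a sketch; it assumes consecutive values differ, since sign() of zero would break seq()):
expand_vec <- function(v, by = 1) {
  # build a seq() between each consecutive pair, flipping the sign of
  # `by` wherever the values decrease
  segs <- Map(function(a, b) seq(a, b, by = sign(b - a) * by),
              head(v, -1), tail(v, -1))
  # drop the duplicated join point at the start of every later segment
  segs[-1] <- lapply(segs[-1], function(s) s[-1])
  unlist(segs, use.names = FALSE)
}
expand_vec(c(5, 10, 15))             # 5 6 7 ... 15
expand_vec(c(1, 5, 7, 10, 12), 0.5)  # 1 1.5 2 ... 12
expand_vec(c(1, 7, 4))               # 1 2 ... 7 6 5 4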
Edit: Just thought I'd go ahead and include the finished product, which uses the seq() call in a larger context and works for increasing values AND for cases where the values change direction. The use case is the linear interpolation of utilities, given original prices and utilities.
orig_price <- c(2, 4, 6)
orig_utils <- c(2, 1, -3)
utility.expansion <- function(x, y, by = 1) {
  # x = original prices, y = original utilities
  require(zoo)
  # expand the price grid, then linearly interpolate the utilities onto it
  new_price <- seq(x[1], x[length(x)], by)
  temp_ind <- new_price %in% x            # positions of the original prices
  new_utils <- rep(NA, length(new_price))
  new_utils[temp_ind] <- y                # pin the known utilities
  new_utils <- na.approx(new_utils)       # linear interpolation of the gaps
  list("new price" = new_price, "new utilities" = new_utils)
}
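For example, with the inputs above (this assumes every original price lands on the expanded grid, which the %in% lookup requires):
utility.expansion(orig_price, orig_utils, by = 0.5)
# $`new price`
# [1] 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0
# $`new utilities`
# [1]  2.00  1.75  1.50  1.25  1.00  0.00 -1.00 -2.00 -3.00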

What data structure should be used while designing algo for multiplication and division problems?

Consider a basic example of multiplication, where 12 * 24 = 288. I am looking for one or more data structures in which I can keep every piece of information about the intermediate steps performed during the multiplication, e.g. that 2 * 4 yields 8, 1 * 4 yields 4, etc.
I need to store this intermediate information so that I can tell the user exactly where he went wrong in his operations.
Focus first on the capability you need to provide.
For example, the user will enter one digit of his answer, and you need to check it and give feedback. Take 28 x 57 and assume you are teaching traditional "long multiplication": the user needs to multiply 28 by 7, recording 6 in the units and carrying 5, then compute 7 x 2 = 14, add the carried 5 to get 19, record the 9 and carry the 1. Suppose he enters 4 in the tens column; you might want to say "Yes, 7 x 2 is 14, but don't forget to add the 5 you carried".
So to support this you need functions such as
getCorrectWorkingDigit( int leftDigitIndex, int rightDigitIndex)
In this case we'd call
getCorrectWorkingDigit( 1, 0 ) and get 9 as the answer
and
getWorkingCarryDigit( int leftDigitIndex, int rightDigitIndex)
so
getWorkingCarryDigit( 1, 0 ) and get 5 as the answer
You will need a few other such functions, including functions for the final answer's digits.
Now, what data structure would allow you to do this? Your requirement is simply to enable your functions to be implemented. Clearly you could build some kind of array of objects representing each working position and each position in the final answer. But I think that's overkill: you can implement those functions directly against the question. All you actually need are the two integers (28 and 57 in my example); you can compute the function values on the fly, with no need to store the intermediate results.
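Purely as an illustration (the question is language-agnostic, so this is a hypothetical sketch in R, the language used elsewhere on this page; index 0 means the units digit):
digit_at <- function(n, idx) (n %/% 10^idx) %% 10  # idx 0 = units, 1 = tens, ...

get_correct_working_digit <- function(left, right, leftIdx, rightIdx) {
  row <- left * digit_at(right, rightIdx)  # one row of working, e.g. 28 * 7 = 196
  digit_at(row, leftIdx)
}

get_working_carry_digit <- function(left, right, leftIdx, rightIdx) {
  d <- digit_at(right, rightIdx)
  # the carry into column leftIdx is whatever overflows out of the lower columns
  (d * (left %% 10^leftIdx)) %/% 10^leftIdx
}

get_correct_working_digit(28, 57, 1, 0)  # 9, the tens digit of 196
get_working_carry_digit(28, 57, 1, 0)    # 5, carried out of 7 * 8 = 56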
Having written all that, I've just realised that you probably also want to keep the values the user entered, and for that a data structure will be useful: keeping the individual digits will be convenient.
For each "row" of working, and for the final result, how about an array of digits where the index corresponds to the power of 10? So represent 196 as
[6, 9, 1]
and for the working, put those rows in a map, keyed by the power of ten of the right-hand digit. In my 28 x 57 example:
0 -> [6, 9, 1] // this is 7 x 28
10 -> [0, 4, 1] // this is 5 x 28
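In R, that keyed structure could be sketched as a named list, one digit vector per working row (purely illustrative):
working <- list(
  "0"  = c(6, 9, 1),  # 7 x 28 = 196, least-significant digit first
  "10" = c(0, 4, 1)   # 5 x 28 = 140
)
working[["10"]][1]  # the first digit recorded in the second working row: 0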

Dominance-directed (tournament) graph metrics

I am interested in deriving dominance metrics (as in a dominance hierarchy) for nodes in a dominance directed graph, aka a tournament graph. I can use R and the package igraph to easily construct such graphs, e.g.
library(igraph)
# create a data frame of edges
the.froms <- c(1, 1, 1, 2, 2, 3)
the.tos <- c(2, 3, 4, 3, 4, 4)
the.set <- data.frame(the.froms, the.tos)
set.graph <- graph.data.frame(the.set)
plot(set.graph)
This plotted graph shows that node 1 influences nodes 2, 3, and 4 (is dominant to them), that 2 is dominant to 3 and 4, and that 3 is dominant to 4.
However, I see no easy way to actually calculate a dominance hierarchy, as on this page: https://www.math.ucdavis.edu/~daddel/linear_algebra_appl/Applications/GraphTheory/GraphTheory_9_17/node11.html . So, my first and main question is: does anyone know how to derive a dominance hierarchy / node-based dominance metric for a graph like this, using some hopefully already-coded solution in R?
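For reference, the metric on that page (a node's "power" is its number of one- and two-stage dominances) can be computed directly from the adjacency matrix; a small sketch for the graph above:
# row sums of A + A^2 count direct wins plus two-step wins
A <- get.adjacency(set.graph, sparse = FALSE)
power <- rowSums(A + A %*% A)
sort(power, decreasing = TRUE)
# 1 2 3 4
# 6 3 1 0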
Moreover, in my real case, I actually have a sparse matrix that is missing some interactions, e.g.
incomplete.set <- the.set[-2, ]
incomplete.graph <- graph.data.frame(incomplete.set)
plot(incomplete.graph)
In this plotted graph, there is no connection between species 1 and 3, however making some assumptions about transitivity, the dominance hierarchy is the same as above.
This is a much more complicated problem, but if anyone has any input about how I might go about deriving node-based metrics of dominance for sparse matrices like this, please let me know. I am hoping for an already coded solution in R, but I'm certainly MORE than willing to code it myself.
Thanks in advance!
Not sure if this is perfect or whether I fully understand it, but from some trial and error it seems to work as it should:
library(relations)
result <- relation_consensus(endorelation(graph=the.set),method="Borda")
relation_class_ids(result)
#1 2 3 4
#1 2 3 4
There are lots of potential options for method= for dealing with ties etc. - see ?relation_consensus for more information. Using method="SD/L", which produces a linear order, might be the most appropriate for your data, though due to conflicts it can suggest multiple possible solutions in more complex examples. For the current simple data this is not the case though - try:
result <- relation_consensus(endorelation(graph=the.set),method="SD/L",
control=list(n="all"))
result
#An ensemble of 1 relation of size 4 x 4.
lapply(result,relation_class_ids)
#[[1]]
#1 2 3 4
#1 2 3 4
Methods of dealing with this are again provided in the examples in ?relation_consensus.

Picking fair teams - and the math to prove it

Application: similar to picking playground teams.
I must divide a collection of n sequentially ranked elements into two teams of n/2. The teams must be as "even" as possible. Think of "even" in terms of playground teams, as described above. The rankings indicate relative "skill" or value levels. Element #1 is worth 1 "point", element #2 is worth 2, etc. No other constraints.
So if I had a collection [1,2,3,4], I would need two teams of two elements. The possibilities are
[1,2] & [3,4]
[1,3] & [2,4]
[1,4] & [2,3]
(Order is not important.)
Looks like the third option is the best in this case. But how can I best assess larger sets? Average/mean is one approach, but that would result in identical rankings for the following candidate pair, which otherwise seems uneven:
[1,2,3,4,13,14,15,16] & [5,6,7,8,9,10,11,12]
I can use brute force to evaluate all candidate solutions for my problem domain.
Is there some mathematical/statistical approach I can use to verify the "evenness" of two teams?
Thanks!
Your second, longer example does not seem uneven (or unfair) to me. In fact, it accords with what you seem to think is the preferred answer for the first example.
Therein lies the non-programming-related nub of your problem. What you have are ordinal numbers and what you want are cardinal numbers. To turn the former into the latter you have to define your own mapping; there is no universal, off-the-shelf approach.
You might, for example, compare each element of the 2 sets in turn, e.g. a1 vs b1, a2 vs b2, ..., and regard the sets as even enough if the number of cases where a is better than b is about the same as the number of cases where b is better than a.
But for your application, I don't think you will do better than the playground algorithm: each team leader chooses the best unchosen player, and they take turns choosing alternately. Why do you need anything more complicated?
The numbers represent rankings? Then no, there is no algorithm to get fair teams, because there's not enough information. It could be that even the match-up
[1] & [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
is stacked against the larger team. This would be the case, for example, for chess teams, if the difference in strength between [1] and [2] was large.
Even the matchup you mentioned as being "unfair":
[1,2,3,4,13,14,15,16] & [5,6,7,8,9,10,11,12]
Could be completely fair in a game like baseball. After all, players 13-16 still need to bat!
So, probably the most fair thing to do would be to just pick teams randomly. That would also avoid any form of "gaming" the system (like my friends and I did in gym class in high school :) )
I don't think there's enough information to determine an answer.
What does it really mean for someone to be #1 vs #2? Are they 50% better, 10% better, or 1% better? How much better is #1 vs #5? It's really the algorithm that assigns the values that needs to be accurate, and the distribution algorithm needs to reflect this properly.
For example, like I said, if you have Kobe Bryant mixed in with a bunch of high school basketball kids, what would the relative values be? Because in basketball, Kobe Bryant could single-handedly beat all the high school kids. So would his rank be #1, and the rest of the kids #1000+?
As well, you have to assume that the value determination takes the size of a team into account. Does the team only need 2 players, or does it need 10? In the latter case, in your second example, the 2nd team seems okay because the top 4 players would be playing alongside 6 much worse players, which could affect their success.
If all you are doing is distributing values, and if the notion of "fairness" is built into the value system, then the mean values seem to be a fair way to distribute the players.
You need an iterative ranking approach, with automated picking to produce evenly ranked teams on each iteration. This works even when the mix of participants changes to some extent over time. I created a tool to do just this for my 5-a-side group and then opened it up to all comers; google "Fair Team Picker" to find it.
The pattern from above,
Team A: 1, 4, 5, 8, 9, 12, 13, 16
Team B: 2, 3, 6, 7, 10, 11, 14, 15
snakes through the list using the pattern
A-B-B-A; A-B-B-A; etc.
This selection is pretty easy to code: place the ordered list of players into pairs, then reverse every odd-numbered pair (assuming the first pair is the 0th pair).
However, there is a "better" way to make teams, using the Thue-Morse sequence. For a more in-depth description of this algorithm see: https://www.youtube.com/watch?v=prh72BLNjIk
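Both picking orders are easy to sketch in R (my own illustration, not code from the linked video; the Thue-Morse team of the k-th pick, counting from 0, is decided by the parity of one-bits in k):
players <- 1:16  # rank 1 = best

# snake draft: repeat the A-B-B-A pattern down the ranked list
snake <- rep(c("A", "B", "B", "A"), length.out = length(players))
split(players, snake)
# $A: 1 4 5 8 9 12 13 16    $B: 2 3 6 7 10 11 14 15

# Thue-Morse: team A gets the picks whose 0-based index has an even
# number of one-bits
ones_parity <- function(k) {
  sapply(k, function(x) sum(as.integer(intToBits(as.integer(x)))) %% 2)
}
tm <- ifelse(ones_parity(players - 1) == 0, "A", "B")
split(players, tm)
# $A: 1 4 6 7 10 11 13 16   $B: 2 3 5 8 9 12 14 15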
Aren't the teams equal if each "round" of picking is simply done in reverse order of the preceding round? If there are 10 players whose talent is 1-10 and we are creating 5 teams (2 players each), the first pick of the first round would obviously be the best player (talent level 10). The next pick would be 9, and so on, with the 5th pick getting the player with talent level 6. In the second round the pick order is reversed, so the team that just got talent level 6 would pick talent level 5 (the highest left), and so on until the captain who picked first in the 1st round gets the last player (talent level 1). Thus each team has a talent level of 11, with one team having 10 and 1, the next having 9 and 2, and so on. This would work for as many players/teams as there are.
First one selects 1; from then on they take turns choosing 2 each.
Assuming:
An even number of elements to choose
Every chooser gives the same value to each element
The value of the elements is very similar but different
Lower is better
[BEST] First one selects 1; from then on they take turns choosing 2 each:
16 items average
Team 1: 1, 4, 5, 8, 9, 12, 13, 16 8.5
Team 2: 2, 3, 6, 7, 10, 11, 14, 15 8.5
14 items average
Team 1: 1, 4, 5, 8, 9, 12, 13 7.42857
Team 2: 2, 3, 6, 7, 10, 11, 14 7.57143
Choosing first 1, second 2 and then 1 each:
16 items average
Team 1: 1, 4, 6, 8, 10, 12, 14, 16 8.875
Team 2: 2, 3, 5, 7, 9, 11, 13, 15 8.125
14 items average
Team 1: 1, 4, 6, 8, 10, 12, 14 7.85714
Team 2: 2, 3, 5, 7, 9, 11, 13 7.14286
[WORST] Comparing with selecting 1 each:
16 items average
Team 1: 1, 3, 5, 7, 9, 11, 13, 15 8
Team 2: 2, 4, 6, 8, 10, 12, 14, 16 9
14 items average
Team 1: 1, 3, 5, 7, 9, 11, 13 7
Team 2: 2, 4, 6, 8, 10, 12, 14 8
