R OMPR package - Limiting the number of unique variable components chosen

I'm using the ompr package for creating and solving an integer programming problem. For simplicity's sake, I will use NFL football fantasy players as my example.
I want to maximize the number of points scored across the 2 games, while only playing 1 player at each position per game. (For simplicity's sake, here assume that any player can play any position.)
The part I'm having trouble with is that of the 25 possible players, I want to limit the total number of players chosen across the two games to 15. The i component of the added ompr variable represents the player indices, but I'm not sure how to add a constraint that limits the total unique i's chosen.
Any help would be greatly appreciated!
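library(magrittr) # provides %>%
library(ompr)
library(ompr.roi) # provides with_ROI()
library(ROI.plugin.symphony) # registers the 'symphony' solver
n_players = 25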
n_positions = 11
n_games = 2
# Points each player will score at each position per game
points_game1 = matrix(runif(25*11), nrow = 25, ncol = 11)
points_game2 = matrix(runif(25*11), nrow = 25, ncol = 11)
points_array <- array(c(points_game1, points_game2), dim = c(n_players, n_positions, 2))
mip <- ompr::MIPModel() %>%
# Initialize player/position set of binary options
ompr::add_variable(x[i, j, k], i = 1:n_players, j = 1:n_positions, k = 1:n_games, type = 'binary') %>%
# Every player/game can only be 0 or 1 across all positions
ompr::add_constraint(sum_expr(x[i, j, k], j = 1:n_positions) <= 1, i = 1:n_players, k = 1:n_games) %>%
# Every position/game has to be exactly 1 across all players
ompr::add_constraint(sum_expr(x[i, j, k], i = 1:n_players) == 1, j = 1:n_positions, k = 1:2) %>%
# ****** Limit to 15 players total ??? ****
# Objective is to maximize points
ompr::set_objective(sum_expr(x[i, j, k] * points_array[i, j, k], i = 1:n_players, j = 1:n_positions, k = 1:n_games), 'max') %>%
# Solve model
ompr::solve_model(with_ROI(solver = 'symphony', verbosity = -2))

You can add a set of binary variables indexed across the players that tracks whether or not a player is used in any position in any game. Then you can limit the sum of those variables to your limit (15). This lets you count a player only once even if they get used in both games. Then you can add a big M constraint that forces the new binary variables to be 1 if a player is used in any position in any game, but lets the variable be 0 if the player is not used. Since we have two games and a player can be in at most 1 position in each game, we can set the big M to be 2 for all players:
ompr::add_variable(is_used[i], i = 1:n_players, type = 'binary') %>%
ompr::add_constraint(sum_expr(is_used[i],i = 1:n_players) <= 15) %>%
# big M constraint ensuring that is_used is 1 if a player is used
ompr::add_constraint(2*is_used[i] >= sum_expr(x[i,j,k],j = 1:n_positions, k = 1:2), i = 1:n_players) %>%
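To see why a big M of 2 is enough, here is a minimal base-R sketch (no solver involved) that checks the constraint on a hand-made assignment. The toy sizes and the assignment itself are assumptions made up for illustration; only the constraint shape comes from the answer above.

```r
# Toy assignment: x[i, j, k] = 1 if player i plays position j in game k.
n_players <- 4
n_positions <- 2
n_games <- 2
x <- array(0, dim = c(n_players, n_positions, n_games))
x[1, 1, 1] <- 1  # player 1, position 1, game 1
x[2, 2, 1] <- 1  # player 2, position 2, game 1
x[1, 2, 2] <- 1  # player 1 again, position 2, game 2
x[3, 1, 2] <- 1  # player 3, position 1, game 2

# Number of appearances per player across all positions and games.
appearances <- apply(x, 1, sum)

# A player appears at most once per game, so appearances <= n_games.
big_m <- n_games

# Smallest binary is_used that satisfies big_m * is_used[i] >= appearances[i].
is_used <- as.integer(appearances > 0)
constraint_holds <- all(big_m * is_used >= appearances)

# Player 1 is used twice but counted once.
unique_players <- sum(is_used)
```

Here `unique_players` counts distinct players rather than appearances, which is the quantity the `<= 15` constraint is meant to bound.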


Dynamic Time Warping (DTW) monotonicity constraint

How to specify a monotonicity constraint (that one time series should not come before the other) when using dynamic time warping?
For example, I have cost and revenue data; one should impact the other but not vice versa. I am using the basic dtw package but I know that there are many others that could be better. Below is my current alignment.
(I would like to save the corresponding revenue point into a separate column, would that be possible?)
library(dtw)
asy<-dtw(df$cost,
df$revenue,
keep=TRUE,
window.type = "sakoechiba", # window.size is ignored unless a window type is set
window.size = 7, # max 7 days shift
step.pattern=asymmetric # gives best results for this problem (other: symmetric1 & symmetric2)
);
plot(asy, type="two", off=1);
Thank you for your help!
I think you can enforce this by defining your own window function.
For example, take these series:
library(dtw)
set.seed(310L)
idx <- seq(0, 6.28, len = 100L)
reference <- sin(idx)
query <- cos(idx) + runif(100L) / 10
foo <- dtw(query, reference, keep = TRUE, step.pattern = symmetric2, window.type = sakoeChibaWindow, window.size = 30L)
plot(foo, type = "two", off = 2)
The red line is the reference,
and you want the query's values to only match values from the past or the same day.
win_fun <- function(i, j, ...) { i >= j }
bar <- dtw(query, reference, keep = TRUE, step.pattern = symmetric2, window.type = win_fun)
plot(bar, type = "two", off = 2)
If you want to match past values strictly excluding values from the same time,
change the condition to i > j.
Check the documentation of dtwWindowingFunctions for more options.
You might want to add a window size constraint.
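As a rough sketch of combining both ideas, the window function below allows a cell only when the reference index is not in the future and the lag is bounded. The name `lagged_window` is made up here, and the assumption that dtw forwards `window.size` to a custom window function is based on my reading of the dtwWindowingFunctions documentation. The block only builds the mask with base R so it can be inspected without running an alignment.

```r
# Hypothetical window function: cell (i, j) is allowed only if
#   j <= i               (never match against a later time point), and
#   i - j <= window.size (bounded lag).
lagged_window <- function(iw, jw, window.size, ...) {
  (iw >= jw) & ((iw - jw) <= window.size)
}

# Inspect the mask on a small 5 x 5 grid of (query, reference) indices.
idx <- matrix(0, nrow = 5, ncol = 5)
mask <- lagged_window(row(idx), col(idx), window.size = 2)
allowed_cells <- sum(mask)
```

If the assumption holds, it would be used as `dtw(query, reference, keep = TRUE, window.type = lagged_window, window.size = 7L)`. Note that a very tight window can leave no admissible warping path for some step patterns.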

R OMPR- Adding a new constraint and dimension to matrix

I'm using the OMPR package in R in order to solve some constraint problems with my keeper-league soccer/football team. This league is operated very much like a real Premier League team, where I want to maximize the total number of goals contributed by each player per game. Essentially I want to pick the 10 best players each day in order to maximize the output (goals) of my team.
But there is a catch! Not only does my fantasy team play two games a day (and need to maximize production across both games), but I can only use 5 players from last week's team on this week's team. So in more direct terms I need to optimize points where:
Max 10 Players are selected
Players are expected to get different amount of goals in each of the two games per day
Players can play any position (for simplicity)
The total number of players who played last week cannot exceed 5 on the "optimally" selected roster
At first pass this looks very similar to this question with an additional wrinkle. Unlike that question, I need to add an additional constraint to the process where I set a maximum threshold for players who played last week (5).
I'm having trouble both conceptualizing how to add a binary "was used last week" column to the array to be optimized as well as setting the actual constraint in the optimization function. Any wisdom/guidance would be appreciated.
#total player pool
num_players = 20
#total positions
num_positions = 9
#total number of games to optimize over
num_games = 2
# Goal each player will generate at each position per game
Goal_1 = matrix(runif(20*9)*10, nrow = 20, ncol = 9)
Goal_2 = matrix(runif(20*9)*10, nrow = 20, ncol = 9)
#matrix that generates 1/0 if you were used last week...1=you were used last week
#first number in vector = first row (player) in each Goal_`` matrix
last_week= sample(c(0,1), replace=TRUE, size=20)
# ******How do I add this last_week vector to the below matrix to use in the optimization function???****
Goal_Matrix <- array(c(Goal_1, Goal_2), dim = c(num_players, num_positions, num_games))
#******i need to add an additional constraint where only five players (max) from last week are used******
mip <- ompr::MIPModel() %>%
# Initialize player/position set of binary options
ompr::add_variable(x[i, j, k], i = 1:num_players, j = 1:num_positions, k = 1:num_games, type = 'binary') %>%
# Every player/game can only be 0 or 1 across all positions
ompr::add_constraint(sum_expr(x[i, j, k], j = 1:num_positions) <= 1, i = 1:num_players, k = 1:num_games) %>%
# Every position/game has to be exactly 1 across all players
ompr::add_constraint(sum_expr(x[i, j, k], i = 1:num_players) == 1, j = 1:num_positions, k = 1:2) %>%
# Limit to 10 players total via Big M
ompr::add_variable(u[i], i = 1:num_players, type = 'binary') %>%
ompr::add_constraint(sum_expr(u[i],i = 1:num_players) <= 10) %>%
# big M constraint ensuring that u[i] is 1 if player i is used
ompr::add_constraint(2*u[i] >= sum_expr(x[i,j,k],j = 1:num_positions, k = 1:2), i = 1:num_players) %>%
# ****** Limit to max 5 players used last week via the `last_week vector` ??? ****
# Objective is to maximize Goal
ompr::set_objective(sum_expr(x[i, j, k] * Goal_Matrix[i, j, k], i = 1:num_players, j = 1:num_positions, k = 1:num_games), 'max') %>%
# Solve model
ompr::solve_model(with_ROI(solver = 'symphony', verbosity = -2))
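One possible way to express the missing constraint, offered as a sketch rather than a tested model: since `u[i]` already flags whether player i is used, weighting it by `last_week[i]` counts only the returning players. The candidate roster `u` below is a made-up stand-in for a solver result, used only to show the arithmetic in base R.

```r
set.seed(1)
num_players <- 20
# 1 = the player was used last week (same construction as in the question).
last_week <- sample(c(0, 1), replace = TRUE, size = num_players)

# Hypothetical candidate roster: u[i] = 1 if player i is used this week.
u <- rep(0, num_players)
u[1:10] <- 1

# Returning players on the roster: only entries with both flags set count.
carried_over <- sum(last_week * u)
feasible <- carried_over <= 5

# Inside the model, the same expression would follow the pattern of the
# objective, which already multiplies a variable by an external array:
# ompr::add_constraint(sum_expr(last_week[i] * u[i], i = 1:num_players) <= 5) %>%
```

No extra dimension in `Goal_Matrix` is needed under this approach, because `last_week` enters only as a coefficient vector.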

While loop with sampling until object takes on one of select values

I am trying to set up a process using a while loop in order to have my code consistently sample among certain xd[i] before one particular xd[i] becomes equal to x.
I know it would be more efficient to put everything under one for loop (except for the while loop) but I am trying to create this step by step. Right now, I am stuck on the while loop part. I cannot run that part of the code without R crashing, or if it does not crash, it seems to continue sampling nonstop until I manually stop it. How can I change my while loop such that it samples over the xd vector until one of the elements of xd matches with x?
Thank you
reset1 = {
a = 0.3 #lower legal threshold
b = 0.9 #upper legal threshold
x = 0
theta = runif(1,min = a, max = b)
theta
A = 5 ## monetary value of harm from
maxw = 2*A
minw = 0
wbar = (maxw+minw)/2 ##average cost
wbar
xd = c(1,2,3)
w = c(1,2,3)
}
for (i in 1:length(xd)){w[i] = runif(1, min = 0, max = 2)} #trying to make it create a w for each person
##Drivers problem: pick the x that will minimize your cost
for(i in 1:length(xd)){xd[i] = min(c(1-(w[i]/(2*A)),((2+b)-sqrt(b^2-2*b+1+3*(w[i]/A)*(b-a)))/3,b))}
xd
for(i in 1:length(xd)){proba = function(xd){(xd-1)^2}}
proba(xd) #ith individual probability of getting in an accident given their xd[i]
proba(xd[c(1:3)])
probn = 1 - proba(xd) #probability of not getting in an accident given driveri's effort level
probn
while (any(x!=xd)) {x = sample(c(xd[c(1,2,3)],0,0,0),size = 1, replace = TRUE, prob = c(proba(xd), probn)) ###the x is selected based on which ever x resulted in an accident
}
show(x)
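Perhaps use this as the loop condition, keeping your sampling statement as the loop body:
while(sum(xd!=x)==3){
# your sample() call goes here
}
This loop runs as long as no element of xd equals x. Your original condition, any(x!=xd), stays TRUE unless x equals every element of xd at once, which is why the loop never ends.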
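A self-contained sketch of the corrected loop is below. The values in `xd` are assumed toy effort levels, and the `max_draws` guard is an addition of mine so that a mistake in the condition cannot hang the session again.

```r
set.seed(42)
xd <- c(0.6, 0.7, 0.8)            # assumed toy effort levels
proba <- function(xd) (xd - 1)^2  # accident probability, as in the question
probn <- 1 - proba(xd)            # probability of no accident

x <- 0
draws <- 0
max_draws <- 10000                # safety cap (not in the original code)

# Keep sampling until x is one of the elements of xd.
while (!(x %in% xd) && draws < max_draws) {
  x <- sample(c(xd, 0, 0, 0), size = 1, prob = c(proba(xd), probn))
  draws <- draws + 1
}
```

`x %in% xd` is equivalent to the `sum(xd != x) == 3` test but does not depend on the length of `xd`.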

RStudio not responding with large Linear Optimization

I have a large two-step optimization problem that I've tried to simplify for this question. The first step is to choose 10 elements to maximize utility with certain constraints. I need 200 of these sets, but due to the nature of what I'm trying to do, there needs to be 600 generated so that the correct combinations can manifest.
Wrapping around these mini-optimization problems is a larger constraint where each individual element can only be used within a certain range. The first optimization tunes each element's utility so that each one is relatively close to the bounds, but it's not possible for all of them to be within their bounds. Therefore, the second step is to choose 200 of the 600 sets such that each individual element's min/max usage is satisfied. This is what I need help with.
I made a function using lpSolve that works, but over 80% of the time it freezes RStudio and it's just becoming too much of a hassle - I need to either improve my current approach, or need a completely new approach. I don't know if lpSolve is really the best approach to begin with. While I do have an overall set-score that I can maximize, all I really care about is having each element within the bounds. I've made a simplified example to get at the essence of my problem.
I'm in charge of making 200 meals from a set of 80 different fruits. Each meal uses 10 fruit and cannot have more than 1 of the same fruit. I'm limited in the number of fruits that I have (and my boss is making me use a minimum of each fruit otherwise they'll go bad), so they need to be within certain bounds. I have a list of 600 meals already created (Meals), and each one has its own unique Health-Score. Ideally I would like to maximize the Health-Score, but obviously the most important piece is that each fruit is used the correct number of times, otherwise the meals can't be made in the first place.
Here's my code to 1) Setup the 600 Meals (random) 2) Set the min/max times each fruit must be used (random) 3) Run a linear optimization to select 200 of the 600 meals such that the individual fruit constraints are fulfilled. The program tries to choose 200 of the 600, but if the constraints don't allow it, then it loosens the constraints (e.g. if the solver doesn't work the first time, then I'll decrease the minimum number of times an Apple can be used, and increase the maximum number of times it can be used). It does this one fruit at a time, rather than all at once. Eventually the constraints should be loosened so much that any 200 of the 600 will work (i.e. when all fruit minPercent is less than 0 and all fruit maxPercent is greater than 100), but it doesn't matter because R freezes up.
library(stringr)
library(dplyr)
library(lpSolve)
# Inputs
MealsNeeded <- 200
Buffer <- 3
# Setup the meals (this is the output of another optimizer in my actual program. Considered "Step 1" as I mentioned above)
Meals <- data.frame()
for(i in 1:(MealsNeeded*Buffer)){
run <- i
meal <- sample(fruit, 10)
healthFactor <- round(runif(1, 10, 30), 0) #(Health factor for the entire meal)
df <- data.frame(Run = run, Fruit = meal, healthFactor = healthFactor, stringsAsFactors = FALSE)
Meals <- rbind(Meals, df)
}
# The minimum/maximum number of times each fruit must be used across all 200 meals (these would be inputs in my program)
set.seed(11)
fruitDF <- data.frame(Name = fruit, minSelectPct = round(runif(length(fruit), .05, .1)*100, 0), stringsAsFactors = FALSE) %>%
mutate(maxSelectPct = round(minSelectPct/2 + runif(length(fruit), .05, .1)*100, 0))
#### Actual Program Start
# Get objective
obj <- Meals %>%
distinct(Run, healthFactor) %>%
ungroup() %>%
select(healthFactor) %>%
pull()
# Dummy LU - for each fruit give 1/0 whether or not they were in the meal
dummyLUInd <- data.frame(FruitName = fruitDF$Name, stringsAsFactors = FALSE)
for(i in unique(Meals$Run)){
selectedFruit <- Meals %>%
filter(Run == i) %>%
select(Fruit) %>%
mutate(Indicator = 1)
dummyLUIndTemp <- fruitDF %>%
left_join(selectedFruit, by = c('Name' = 'Fruit')) %>%
mutate(Indicator = ifelse(is.na(Indicator), 0, Indicator)) %>%
select(Indicator)
dummyLUInd <- cbind(dummyLUInd, dummyLUIndTemp)
}
## Table create
dummyLUInd <- rbind(dummyLUInd, dummyLUInd)[,-1]
dummyLUInd <- as.data.frame(t(dummyLUInd))
dummyLUInd$Total = 1
## Directions
dirLT <- c(rep('<=', (ncol(dummyLUInd)-1)/2))
dirGT <- c(rep('>=', (ncol(dummyLUInd)-1)/2))
## Multiply percentages by total Meals
MinExp = round(fruitDF$minSelectPct/100 * MealsNeeded - 0.499, 0)
MaxExp = round(fruitDF$maxSelectPct/100 * MealsNeeded + 0.499, 0)
# Setup constraints like # of tries
CounterMax <- 10000
LPSum = 0
Counter = 0
# Create DF to make it easier to change constraints for each run
MinExpDF <- data.frame(Place = 1:length(MinExp), MinExp = MinExp)
MaxExpDF <- data.frame(Place = 1:length(MaxExp), MaxExp = MaxExp)
cat('\nStarting\n')
Sys.sleep(2)
# Try to get the 200 of 600 Meals that satisfy the constraints for the individual Fruit.
# If the solution doesn't exist, loosen the constraints for each fruit (one at a time) until it does work
while (LPSum == 0 && Counter <= CounterMax) {
rowUse <- Counter %% length(MaxExp)
# Knock one of minimum, starting with highest exposure, one at a time
MinExpDF <- MinExpDF %>%
mutate(Rank = rank(-MinExp, na.last = FALSE, ties.method = "first"),
MinExp = ifelse(Rank == rowUse, MinExp - 1, MinExp)
)
MinExp <- MinExpDF$MinExp
# Add one of maximum, starting with highest exposure, one at a time
MaxExpDF <- MaxExpDF %>%
mutate(Rank = rank(-MaxExp, na.last = FALSE, ties.method = "first"),
MaxExp = ifelse(Rank == rowUse, MaxExp + 1, MaxExp))
MaxExp <- MaxExpDF$MaxExp
# Solve
dir <- 'max'
f.obj <- obj
f.mat <- t(dummyLUInd)
f.dir <- c(dirGT, dirLT, '==')
f.rhs <- c(MinExp, MaxExp, MealsNeeded)
Sol <- lp(dir, f.obj, f.mat, f.dir, f.rhs, all.bin = T)$solution
LPSum <- sum(Sol)
Counter = Counter + 1
if(Counter %% 10 == 0) cat(Counter, ', ', sep = '')
}
# Get the Run #'s from the lpSolve
if(Counter >= CounterMax){
cat("Unable to find right exposure, returning all Meals\n")
MealsSolved <- Meals
} else {
MealsSolved <- data.frame(Run = unique(Meals$Run))
MealsSolved$selected <- Sol
MealsSolved <- MealsSolved[MealsSolved$selected == 1,]
}
# Final Meals
FinalMeals <- Meals %>%
filter(Run %in% MealsSolved$Run)
If you run this code enough times, eventually RStudio will freeze up on you (at least it does for me, if it doesn't for you I suppose increase the number of Meals). It happens during the actual lp, so there's really not much you can do since it's really C code. This is where I'm lost.
A part of me thinks this isn't really an lpSolve issue since I'm not really trying to maximize anything (Health-Factor isn't all too important). My real "loss function" is the number of times each fruit goes above/below their min/max exposure, but I can't think of how to set something like this up. Can my current approach work, or do I need to do something different completely?
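The "loss function" described above can at least be written down and evaluated for any candidate selection. The sketch below does that in base R on a made-up 3-fruit, 3-meal example; the function name `exposure_loss` and the toy data are assumptions, and the loss here is the total amount by which fruits fall outside their bounds.

```r
# indicator: rows = fruits, columns = meals, 1 if the fruit is in the meal.
# selected:  0/1 vector over meals.
exposure_loss <- function(indicator, selected, min_use, max_use) {
  usage <- as.vector(indicator %*% selected)  # times each fruit is used
  shortfall <- pmax(min_use - usage, 0)       # amount below the minimum
  excess <- pmax(usage - max_use, 0)          # amount above the maximum
  sum(shortfall + excess)
}

indicator <- matrix(c(1, 1, 0,
                      1, 0, 1,
                      0, 1, 1), nrow = 3, byrow = TRUE)
selected <- c(1, 1, 0)  # pick meals 1 and 2

# Usage is 2, 1, 1: fruit 1 is one over, fruit 2 one under, fruit 3 one over.
loss <- exposure_loss(indicator, selected,
                      min_use = c(1, 2, 0), max_use = c(1, 3, 0))
```

Minimizing a quantity like this directly, instead of repeatedly loosening hard bounds and re-solving, would need one solve rather than up to 10000; in a linear program the `pmax` terms are usually modeled with non-negative slack variables, which is a design option to consider rather than something shown in the code above.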

A number turns into a vector in R, and the answer turns out to be the sum of the vector's components

UrnA =rep(c(10,5,1),c(5,5,5))
UrnB =rep(c(20,5,1),c(9,3,3))
n=1e3
sum=0
for( i in 1:n ){
dice=sample(1:6,1)
sum=sum+(dice<=4)*sample(UrnA,2,replace = FALSE)+(dice>=5)*sample(UrnB,2,replace = FALSE)
}
E=sum/n
I want to use the code above to solve the problem below.
"Urn A contains 5 $10 bills, 5 $5 bills, and 5 $1 bills.
Urn B contains 9 $20 bills, 3 $5 bills, and 3 $1 bills.
A dice is thrown. If it lands on 1,2,3, or 4, two bills are drawn from Urn A (without replacement),
Otherwise two bills are drawn from Urn B. Let X = the total value of the bills drawn.
(a) Use simulations to estimate E[X]."
The problem is that when I run the code, sum turns out to be a vector with two components, which really confuses me. When I calculate it by hand, adding the two components of sum together gives the right answer.
You can avoid the for loop if you consider that rolling a single die n times is the same as rolling n dice once.
UrnA <- rep(c(10,5,1), c(5,5,5))
UrnB <- rep(c(20,5,1), c(9,3,3))
n <- 1e3
set.seed(2018);
sum(as.integer(sapply(sample(1:6, n, replace = T), function(x)
if (x <= 4) sample(UrnA, 2) else sample(UrnB, 2))))
#[1] 15818
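I'm using a fixed seed here for reproducibility; remove if necessary. Note that this is the total over all n rolls; dividing by n gives the estimate of E[X], here 15818 / 1000 = 15.818.
We can confirm convergence by repeating the process 1000 times: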
val <- sapply(1:1000, function(x)
sum(as.integer(sapply(sample(1:6, n, replace = T), function(x)
if (x <= 4) sample(UrnA, 2) else sample(UrnB, 2)))))
library(ggplot2)
ggplot(data.frame(idx = 1:1000, val = val), aes(idx, val)) +
geom_point() +
ylim(0, pretty(max(val))[2])
Both of your sample() calls return a vector of two values. You need to sum their components.
# Instead of
sum=sum+(dice<=4)*sample(UrnA,2,replace = FALSE)+(dice>=5)*sample(UrnB,2,replace = FALSE)
#Use:
sum=sum+sum((dice<=4)*sample(UrnA,2,replace = FALSE)+(dice>=5)*sample(UrnB,2,replace = FALSE))
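As a cross-check on the simulation, the expectation can also be computed exactly: by linearity of expectation, two bills drawn without replacement have an expected total of twice the urn's mean bill value. The sketch below compares that value with a simulated estimate; the helper name `draw_once`, the seed, and the number of repetitions are choices made here for illustration.

```r
UrnA <- rep(c(10, 5, 1), c(5, 5, 5))
UrnB <- rep(c(20, 5, 1), c(9, 3, 3))

# Exact value: P(die <= 4) = 4/6 picks Urn A, otherwise Urn B.
exact <- (4 / 6) * 2 * mean(UrnA) + (2 / 6) * 2 * mean(UrnB)  # 716/45, about 15.91

# Simulation: one die roll, then the summed value of the two bills drawn.
draw_once <- function() {
  dice <- sample(1:6, 1)
  if (dice <= 4) sum(sample(UrnA, 2)) else sum(sample(UrnB, 2))
}

set.seed(1)
n <- 1e4
estimate <- mean(replicate(n, draw_once()))
```

Summing inside `draw_once()` keeps each trial a single number, so the accumulated result never turns into a two-component vector.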
