Ideas for optimization algorithm for Fantasy Football - r

So, this is a bit different from standard fantasy football. What I have is a list of players, their average "points per game" (PPG), and their salaries. I want to maximize total points per game under the constraint that my team does not exceed a salary cap. A team consists of 1 QB, 1 TE, 3 WRs, and 2 RBs. So, if we have 15 players at each position, there are 15 x 15 x (15 c 3) x (15 c 2) = 10,749,375 possible teams.
That is pretty expensive to enumerate. I can use a bit of branch and bound, i.e. once a partial team has exceeded the salary cap I can prune that branch of the tree, but even with that the algorithm is still pretty slow. I tried another option where I used a "genetic algorithm": I made 10 random teams, picked the best one, "mutated" it (randomly changing some of the players) into another 10 teams, picked the best of those, and looped until the points per game of the "best team" stopped improving.
There must be a better way to do this. I'm not a computer scientist and I've only taken an intro course in algorithmics. Programmers - what are your thoughts? I have a feeling that some sort of application of dynamic programming could help.
Thanks
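As a quick sanity check on the count in the question, base R's choose() reproduces the figure. This only verifies the arithmetic; it is not part of any solution, and the variable names are made up for illustration.

```r
# Number of distinct rosters with 15 candidates per position
# (1 QB, 1 TE, 3 WRs, 2 RBs); order within a position does not matter.
n_per_position <- 15
n_teams <- n_per_position *        # choices for the QB
  n_per_position *                 # choices for the TE
  choose(n_per_position, 3) *      # unordered sets of 3 WRs
  choose(n_per_position, 2)        # unordered sets of 2 RBs
n_teams  # 10749375
```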

I think a genetic algorithm, intelligently implemented, will yield an acceptable result for you. You might want to use a metric like points per salary dollar rather than straight PPG to decide the best team. This way you are inherently measuring value added. Also, you should consider running the full algorithm/mutation to satisfactory completion numerous times so that you can identify which players consistently show up in the final outcomes. These players should then be valued above others.
Of course, the problem with the genetic approach is that you need a good mutation algorithm, and how you implement that is a highly personal choice.

Take i as the number of players considered so far (out of n players) and j as the remaining salary. Take m[i, j] to be the best total PPG achievable using only the first i players with salary budget j.
Then m[i, 0] = 0, m[0, j] = 0
and
m[i, j] = m[i - 1, j] if salary for player i is greater than j
else
m[i, j] = max ( m[i - 1, j], m[i - 1, j - salary of player i] + PPG of player i)
Sorry that I don't know R but I'm good with algorithms so I hope this helps.
A further optimization you can make is that you really only need 2 rows of m[i, j] because the DP solution only uses the current row and the last row (you can save memory this way)
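The recurrence above is the classic 0/1 knapsack, and a minimal R sketch of it might look like the following. It assumes salaries are whole numbers (for example, expressed in thousands of dollars), and like the recurrence it ignores the positional requirements (1 QB, 1 TE, 3 WRs, 2 RBs). The function name and the sample numbers are made up for illustration.

```r
# 0/1 knapsack over players, keeping only the previous and current rows.
# prev[j + 1] holds the best total PPG for budget j using players seen so far.
knapsack_best_ppg <- function(ppg, salary, cap) {
  prev <- numeric(cap + 1)                 # m[0, j] = 0 for every budget j
  for (i in seq_along(ppg)) {
    cur <- prev                            # default: skip player i
    if (salary[i] <= cap) {
      j <- salary[i]:cap                   # budgets that can afford player i
      cur[j + 1] <- pmax(prev[j + 1],
                         prev[j - salary[i] + 1] + ppg[i])
    }
    prev <- cur
  }
  prev[cap + 1]
}

# Hypothetical players: PPG and salary in thousands of dollars.
knapsack_best_ppg(ppg = c(10, 7, 4), salary = c(5, 3, 2), cap = 5)  # 11
```

With a cap of 5, the two cheaper players together (7 + 4 = 11 PPG) beat the single expensive one (10 PPG), which is the kind of trade-off the table captures.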

First of all, the number of variations you have given does not look right. The best way to build a team is to limit the choices by position, and there is absolutely no sense in permuting three players of the same position among themselves.
Cristiano Ronaldo, Suarez and Messi will give you the same sum of fantasy points in any line-up, like:
Cristiano Ronaldo, Suarez and Messi
or
Suarez, Cristiano Ronaldo and Messi
or
Messi, Suarez and Cristiano Ronaldo
First step: reduce the number of variations you consider.
Next step: calculate the average price, and build the team one player at a time by adding the player with the lower salary but the higher value. When you reach the salary limit, remove an expensive player and add a cheaper one with the same fantasy points, and so on. Don't enumerate the variations; weight each player by a combination of salary and fantasy points.

Does this help? It sets up the constraints and maximises points.
You could adapt it to read the data from Excel.
http://pena.lt/y/2014/07/24/mathematically-optimising-fantasy-football-teams
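The idea of setting up constraints and maximising points can be sketched in R as a 0/1 integer program. This is a minimal sketch, not the linked article's code: it assumes the lpSolve package is installed, and the player pool, salaries, and cap below are invented for illustration.

```r
library(lpSolve)

# Hypothetical player pool (names, PPG, and salaries are made up).
players <- data.frame(
  name   = paste0("P", 1:11),
  pos    = c("QB", "QB", "TE", "TE", "WR", "WR", "WR", "WR", "RB", "RB", "RB"),
  ppg    = c(22, 18, 9, 7, 15, 14, 12, 10, 16, 13, 11),
  salary = c(9000, 7000, 5000, 3500, 8000, 7500, 6000, 4500, 8500, 6500, 5000),
  stringsAsFactors = FALSE
)
cap  <- 45000
need <- c(QB = 1, TE = 1, WR = 3, RB = 2)   # roster from the question

# One indicator row per position: 1 if the player plays that position.
pos_rows <- t(sapply(names(need), function(p) as.numeric(players$pos == p)))

sol <- lp(direction    = "max",
          objective.in = players$ppg,                 # maximise total PPG
          const.mat    = rbind(players$salary, pos_rows),
          const.dir    = c("<=", rep("==", length(need))),
          const.rhs    = c(cap, need),
          all.bin      = TRUE)                        # each player is in or out

team <- players[sol$solution > 0.5, ]
team
```

Each decision variable is 1 if the player is on the team and 0 otherwise, so the first constraint row is the salary cap and the remaining rows fix the head count per position.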

Related

Whats the logic behind 'impact' argument in ahp topsis function

library(topsis)
d <- matrix(rpois(12, 5), nrow = 4)
w <- c(1, 1, 2)
i <- c("+", "-", "+")
topsis(d, w, i)
This is the function available in R for AHP TOPSIS. I am confused about how to assign the "+" and "-" signs for the "impact" argument. How is it done in this example?
Good question.
'c("+", "-", "+")' indicates which criteria you need to maximise and which criteria you need to minimise.
TOPSIS was developed in 1981 by Hwang and Yoon [1] and is a common algorithm used for MCDM (multi-criteria decision making) problems. TOPSIS is based on the premise that the 'best' solution out of a set of alternatives is the one with the shortest geometric distance to the ideal solution and the farthest geometric distance from the anti-ideal solution.
Each alternative is characterised by several criteria. A criterion can be a benefit or a cost. If it is a benefit you want to maximise it, but if it is a cost you want to minimise it.
So, let's say you want to select the 'best' car from an array of car alternatives.
Price is a cost criterion that you want to minimise. But maybe 'speed limit' is something you want to maximise.
As said, the '+' and '-' signs indicate which attributes are benefits and which are costs, so that you can compute the ideal and anti-ideal solutions.
Resources:
TOPSIS Package documentation. Retrieved from https://cran.r-project.org/web/packages/topsis/topsis.pdf
Manoj Mathew. TOPSIS - Technique for Order Preference by Similarity to Ideal Solution Retrieved from: https://www.youtube.com/watch?v=kfcN7MuYVeI
REFERENCES:
[1] Hwang, C. L., & Yoon, K. (1981). Methods for multiple attribute decision making. In Multiple attribute decision making (pp. 58-191). Springer, Berlin, Heidelberg.
First off, I don't have any experience with TOPSIS but the code of that function explains what is going on and matches the description of TOPSIS. You can see the code by typing topsis.
Matrix d in this example is a 4x3 matrix. Each row represents one alternative (for instance, a model of car available in the market) while each column represents a criterion on which these alternatives are to be judged (for instance, you might use cost, efficiency, torque and ground clearance to select a car).
The + and - just show how that particular criterion (column) impacts the outcome. For instance, the cost of a car might be -ve while torque will be +ve.
The algorithm uses these impact signs to arrive at a positive ideal solution and a negative (worst) ideal solution.
The positive ideal solution is derived by using the max value of the +ve columns and the min value of the -ve columns. Here's the relevant line from the code.
u <- as.integer(impacts == "+") * apply(V, 2, max) + as.integer(impacts == "-") * apply(V, 2, min)
The negative ideal is the opposite.
From there on, the code computes the distance of each alternative from these best and worst outcomes and ranks the alternatives.
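To make the role of the signs concrete, here is a minimal base-R sketch of the steps described above. It is an illustration written from that description, not the source of the topsis package; the function name topsis_sketch and the car data are assumptions.

```r
# Illustrative TOPSIS in base R: rows are alternatives, columns are criteria.
topsis_sketch <- function(d, w, impacts) {
  w <- w / sum(w)                                  # normalise the weights
  N <- sweep(d, 2, sqrt(colSums(d^2)), "/")        # vector-normalise each column
  V <- sweep(N, 2, w, "*")                         # weighted normalised matrix
  col_max <- apply(V, 2, max)
  col_min <- apply(V, 2, min)
  best  <- ifelse(impacts == "+", col_max, col_min)  # positive ideal solution
  worst <- ifelse(impacts == "+", col_min, col_max)  # negative ideal solution
  d_best  <- sqrt(rowSums(sweep(V, 2, best,  "-")^2))
  d_worst <- sqrt(rowSums(sweep(V, 2, worst, "-")^2))
  score <- d_worst / (d_best + d_worst)            # 1 = ideal, 0 = anti-ideal
  data.frame(alt = seq_len(nrow(d)), score = score, rank = rank(-score))
}

# Hypothetical cars: top speed (+), price (-), comfort rating (+).
cars <- matrix(c(250, 20000, 4,
                 300, 15000, 5,
                 200, 30000, 3), nrow = 3, byrow = TRUE)
res <- topsis_sketch(cars, w = c(1, 1, 2), impacts = c("+", "-", "+"))
res
```

In this made-up data the second car is fastest, cheapest, and most comfortable, so it coincides with the positive ideal and gets a score of 1; flipping the sign on the price column would instead reward expensive cars.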

Calculate the number of trips in graph traversal

Hello Stack Overflow Community,
I'm attempting to solve this problem:
https://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&problem=1040
The problem is to find the best path based on the capacity of the edges. I get that this can be solved using dynamic programming, but I'm confused by the example they provide:
According to the problem description, if someone is trying to get 99 people from city 1 to city 7, the route should be 1-2-4-7, which I get, since the weight of each edge represents the maximum number of passengers that can travel at once. What I don't get is that the description says it takes at least 5 trips. Where does the 5 come from? 1-2-4-7 is 3 hops. Since 25 is the most limited hop on the route, I would say you need 99/25, rounded up, or at least 4 trips. Is this a typo, or am I missing something?
Given the first line of the problem statement:
Mr. G. works as a tourist guide.
It is likely that Mr. G must always be present on the bus, so he takes up one seat on every trip. With x the number of trips and best_route the capacity of the most limited edge on the best route, the equation is:
x = (ceil(x) + number_of_passengers) / best_route
rather than simply:
x = number_of_passengers / best_route
or, for your numbers:
x = (ceil(x) + 99) / 25
Which is satisfied by:
x == 4.16, which rounds up to 5 trips
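Put differently, if the guide occupies one seat per trip, each trip carries at most best_route - 1 tourists. A minimal R sketch of both parts follows: a Floyd-Warshall-style pass for the widest (maximum bottleneck) path, then the trip count. The edge list is an assumption for illustration, chosen so that route 1-2-4-7 has a bottleneck of 25 as described in the question.

```r
# Widest-path capacities between all pairs of cities (maximin Floyd-Warshall).
widest_path_capacity <- function(n, edges) {
  cap <- matrix(0, n, n)
  for (e in seq_len(nrow(edges))) {
    a <- edges[e, 1]; b <- edges[e, 2]; w <- edges[e, 3]
    cap[a, b] <- max(cap[a, b], w)       # roads are two-way
    cap[b, a] <- max(cap[b, a], w)
  }
  for (k in seq_len(n)) for (i in seq_len(n)) for (j in seq_len(n)) {
    # Going via k is limited by the weaker of the two legs.
    cap[i, j] <- max(cap[i, j], min(cap[i, k], cap[k, j]))
  }
  cap
}

# Assumed edges: from, to, passenger capacity.
edges <- matrix(c(1, 2, 30,  1, 3, 15,  1, 4, 10,  2, 4, 25,  2, 5, 60,
                  3, 4, 40,  3, 6, 20,  4, 7, 35,  5, 7, 20,  6, 7, 30),
                ncol = 3, byrow = TRUE)
best_route <- widest_path_capacity(7, edges)[1, 7]   # 25
tourists <- 99
trips <- ceiling(tourists / (best_route - 1))        # guide takes one seat
trips  # 5
```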

R Optimisation - Integer Programming

I have tried to use the R package lpSolve, and in particular the lp.transport function, to solve an optimisation problem. In my fictitious example below I have 5 office sites that I need to resource with a minimum number of employees, and I have set up a cost matrix that gives the distance from each employee's home to each office. I want to minimize the total distance traveled to work whilst meeting the minimum number of employees per office.
Initially this was working, as I was treating all employees as equal (1). However, problems started to occur when I rated each employee by how efficient they are. For example, I now want to say that officeX needs the equivalent of 2 engineers, which might be made up of 4 engineers who are 50% efficient or 1 who is 200% efficient. When I do this, however, the solution found splits an employee across a number of offices. What I need is an additional constraint to impose that an employee can only be at 1 office.
Anyway, hopefully that is enough background. Here is my example:
library(lpSolve)
Employee <- c("Jim","John","Jonah","James","Jeremy","Jorge")
Office1 <- c(2.58321505105556, 5.13811249390279, 2.75943834864996,
             6.73543614029559, 6.23080251653027, 9.00620341764497)
Office2 <- c(24.1757667923894, 19.9990724784926, 24.3538456922105,
             27.9532073293925, 26.3310994833106, 14.6856664813007)
Office3 <- c(38.6957155251069, 37.9074293509861, 38.8271000719858,
             40.3882569566947, 42.6658938732098, 34.2011184027657)
Office4 <- c(28.8754359274453, 30.396841941228, 28.9595182970988,
             29.2042274337124, 33.3933900645023, 28.6340025144932)
Office5 <- c(49.8854888720157, 51.9164328512659, 49.948290261029,
             49.4793138594302, 54.4908258333456, 50.1487397648236)
#create CostMatrix
costMat<-data.frame(Employee,Office1, Office2, Office3, Office4, Office5)
#efficiency is the worth of employees, eg if 1 they are working at 100%
#so if for example I wanted 5 Employees
#working in a office then I could choose 5 at 100% or 10 working at 50% etc...
efficiency<-c(0.8416298, 0.8207991, 0.7129663, 1.1406839, 1.3868177, 1.1989748)
#Uncomment next line to see the working version based on headcount
#efficiency<-c(1,1,1,1,1,1)
#Minimum is the minimum number of Employees we want in each office
minimum<-c(1, 1, 2, 1, 1)
#solve problem
opSol <- lp.transport(cost.mat = as.matrix(costMat[,-1]),
                      direction = "min",
                      col.signs = rep(">=", length(minimum)),
                      col.rhs = minimum,
                      row.signs = rep("==", length(efficiency)),
                      row.rhs = efficiency,
                      integers = NULL)
#view solution
opSol$solution
# My issue is that one employee is being spread across multiple offices.
# What I really want is an extra constraint that says that in a row there
# can only be 1 non-zero value.
I think this is no longer a transportation problem. However you still can solve it as a MIP model:
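One way to write such a model down, as a minimal sketch rather than the answerer's exact formulation: let x[i, j] be a binary variable that is 1 if employee i is assigned to office j, require each employee to appear in at most one office, and require the summed efficiency at each office to meet its minimum. The sketch assumes the lpSolve package and uses a small invented data set (4 employees, 2 offices) so that the result is easy to check by hand.

```r
library(lpSolve)

# Hypothetical data: rows are employees, columns are offices.
cost <- matrix(c(2, 9,
                 4, 7,
                 6, 3,
                 8, 1), nrow = 4, byrow = TRUE)   # travel distances
efficiency <- c(0.8, 1.2, 0.7, 1.4)
minimum    <- c(1, 2)                              # required efficiency per office
n <- nrow(cost); m <- ncol(cost)

# Variables are x[i, j] stacked column by column, matching as.vector(cost).
one_office  <- kronecker(t(rep(1, m)), diag(n))    # sum_j x[i, j] <= 1
office_need <- kronecker(diag(m), t(efficiency))   # sum_i eff[i] * x[i, j] >= minimum[j]

sol <- lp(direction    = "min",
          objective.in = as.vector(cost),
          const.mat    = rbind(one_office, office_need),
          const.dir    = c(rep("<=", n), rep(">=", m)),
          const.rhs    = c(rep(1, n), minimum),
          all.bin      = TRUE)                     # no splitting of employees

assignment <- matrix(sol$solution, nrow = n)       # employees x offices
assignment
```

Changing "<=" to "==" in the first block of constraints would force every employee to be placed somewhere. Note that sol$status should be checked before reading the solution: with the efficiencies and minimums from the question, a quick count suggests that no assignment without splitting exists (four offices would each need a single employee rated at least 1, and only three employees are), in which case lp reports an infeasible problem.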

R function for weighting teams by strength of opponent?

I'm analyzing some sports data, and I have a set of win/loss records for about 40 teams. I would like to come up with a ranking where each win is weighted by the strength of the opponent. This would have to be some iterative/recursive sort of thing where the weights and ranks are updated on each iteration until convergence. Does anyone know if there is an existing function or package for doing this sort of thing? My guess is that it wouldn't be a sports-specific package, but I imagine this sort of thing is common across a lot of fields.
EDIT:
Here's some example data. There are 5 teams, A, B, C, D, and E, and each played every other team once, resulting in 10 unique games. The data are doubled so that each team's four games are listed as their own rows, with the column "a.win" indicating whether "team.a" won the game (1=Yes).
dat<-data.frame(
team.a=c("A","A","A","A","B","B","B","B","C","C","C","C","D","D","D","D","E","E","E","E"),
team.b=c("B","C","D","E","A","C","D","E","A","B","D","E","A","B","C","E","A","B","C","D"),
a.win=c(1,1,0,1,0,0,1,0,0,1,1,0,1,0,0,1,0,1,1,0))
From these data, team A won 3/4, B won 1/4, and C, D, and E each won 2/4. But team D beat A, whereas C and E both lost to A. So intuitively D should be ranked slightly higher than C and E, since one of its wins came against the highest-rated opponent. Similarly, team C lost to team B (the only team with only one win), so intuitively it should be ranked lower than D and E.
I'm trying to figure out how best to assign ranks (e.g., from -1 to 1, or based on probability of winning, or number of losses, etc.), and then how best to re-weight each team based not just on the number of wins/losses, but on the rank of the opponents they defeated.
Try the PlayerRatings package.
http://cran.r-project.org/web/packages/PlayerRatings/index.html
It implements the Elo and Glicko ratings used in Chess, but it can be extended to other sports as well. The package also contains functions for updating the ratings of players based on the previous rating and game outcomes. This is a basic starting point, which you will have to build on depending on your situation.
http://en.wikipedia.org/wiki/Elo_rating_system#Elo_ratings_beyond_chess
I don't think there will be a tailored solution for what you want to do, since how you go about ratings will depend on the specifics of your scenario.
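To show the flavour of an Elo-style update without depending on any package, here is a minimal base-R sketch run on the example data from the question. It is not the PlayerRatings implementation; the function name, the starting rating of 1500, and K = 32 are conventional but arbitrary assumptions, and the result depends on the order in which games are processed.

```r
# Example data from the question (each game appears twice, once per team).
dat <- data.frame(
  team.a = c("A","A","A","A","B","B","B","B","C","C","C","C","D","D","D","D","E","E","E","E"),
  team.b = c("B","C","D","E","A","C","D","E","A","B","D","E","A","B","C","E","A","B","C","D"),
  a.win  = c(1,1,0,1,0,0,1,0,0,1,1,0,1,0,0,1,0,1,1,0),
  stringsAsFactors = FALSE
)
games <- dat[dat$team.a < dat$team.b, ]   # keep each of the 10 games once

elo_ratings <- function(games, k = 32, start = 1500) {
  teams <- sort(unique(c(games$team.a, games$team.b)))
  r <- setNames(rep(start, length(teams)), teams)
  for (g in seq_len(nrow(games))) {
    a <- games$team.a[g]; b <- games$team.b[g]
    # Expected score of team a given the current rating gap.
    expected_a <- 1 / (1 + 10^((r[[b]] - r[[a]]) / 400))
    delta <- k * (games$a.win[g] - expected_a)
    r[a] <- r[a] + delta                  # a win over a strong team moves more
    r[b] <- r[b] - delta                  # the update is zero-sum
  }
  sort(r, decreasing = TRUE)
}

ratings <- elo_ratings(games)
ratings
```

Because the gain from a win shrinks as the winner's rating advantage grows, beating a highly rated opponent is worth more than beating a weak one, which is the weighting the question asks for. Iterating over the game list several times, or using the package's own functions, would reduce the dependence on game order.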

How to calculate the expected cost?

I am not good at probability and I know it's not a coding problem directly. But I wish you would help me with this. While I was solving a computation problem I found this difficulty:
Problem definition:
The Little Elephant from the Zoo of Lviv is going to the Birthday
Party of the Big Hippo tomorrow. Now he wants to prepare a gift for
the Big Hippo. He has N balloons, numbered from 1 to N. The i-th
balloon has the color Ci and it costs Pi dollars. The gift for the Big
Hippo will be any subset (chosen randomly, possibly empty) of the
balloons such that the number of different colors in that subset is at
least M. Help Little Elephant to find the expected cost of the gift.
Input
The first line of the input contains a single integer T - the number
of test cases. T test cases follow. The first line of each test case
contains a pair of integers N and M. The next N lines contain N pairs
of integers Ci and Pi, one pair per line.
Output
In T lines print T real numbers - the answers for the corresponding test cases. Your answer will be considered correct if it has at most 10^-6 absolute or relative error.
Example
Input:
2
2 2
1 4
2 7
2 1
1 4
2 7
Output:
11.000000000
7.333333333
So, here I don't understand why the expected cost of the gift for the second case is 7.333333333, because the expected cost equals Summation[x * P(x)], and according to this formula it should be 33/2?
Yes, it is a CodeChef question. But I am not asking for the solution or the algorithm (because if I take the algorithm from someone else, it would not improve my coding ability). I just don't understand their example, and hence I am not able to start thinking about the algorithm.
Please help. Thanks in advance!
There are three possible choices, 1, 2, 1+2, with costs 4, 7 and 11. Each is equally likely, so the expected cost is (4 + 7 + 11) / 3 = 22 / 3 = 7.33333.
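The same reasoning can be checked by brute force: enumerate every subset, keep those with at least M distinct colors, and average their costs. The R sketch below does exactly that. It is exponential in N and is meant only for understanding the example, not as a solution to the actual problem; the function name is made up.

```r
# Average cost over all subsets that contain at least m distinct colors.
expected_gift_cost <- function(colors, prices, m) {
  n <- length(colors)
  bits <- as.integer(2^(0:(n - 1)))
  total <- 0
  count <- 0
  for (mask in 0:(2^n - 1)) {
    chosen <- bitwAnd(as.integer(mask), bits) != 0   # which balloons are in
    if (length(unique(colors[chosen])) >= m) {
      total <- total + sum(prices[chosen])
      count <- count + 1
    }
  }
  total / count          # every valid subset is equally likely
}

expected_gift_cost(colors = c(1, 2), prices = c(4, 7), m = 2)  # 11
expected_gift_cost(colors = c(1, 2), prices = c(4, 7), m = 1)  # 22/3 = 7.333...
```

In the second case the empty subset is excluded because it has zero colors, which leaves the three subsets listed in the answer.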

Resources