'At least one choice' Genetic Algorithm Constraint - r

I just started learning genetic algorithms in R using this example and I've run into an interesting problem trying to apply it. I have a dataset of shops, supply centers, and distance (miles) between shops and supply centers in the data frame dataset:
Shop  Center  Distance  DistanceSave
A     1       700       300
A     2       200       800
A     3       300       700
B     1       400       600
B     2       100       900
B     3       150       850
C     1       600       400
C     2       500       500
C     3       200       800
I'm trying to minimize Distance (or maximize DistanceSave which is 1000 minus Distance) subject to the constraint that each shop must be tied to a Center and I'm having trouble coding that last part:
#Generate Evaluation Function
library(genalg)

evalFunc <- function(x) {
  current_solution_savings <- x %*% dataset$DistanceSave
  current_solution_shop <- length(unique(dataset$Shop[x == 1]))
  #Set Conditions in Function
  if (current_solution_shop != length(unique(dataset$Shop)))
    return(0) else return(-current_solution_savings)
}
#Run GA Algorithm with 100 iterations
iter = 100
genes = nrow(dataset) # number of binary genes: one per shop-center pair
GAmodel <- rbga.bin(size = genes, popSize = 200, iters = iter, mutationChance = 0.01,
                    elitism = T, evalFunc = evalFunc)
cat(summary.rbga(GAmodel))
I thought the current_solution_shop != length(unique(dataset$Shop)) condition would be enough, but unfortunately it's not: sometimes it still assigns the same shop twice.
EDIT: It looks like the Facility Location Problem is what I need to research, but can anyone recommend a multi-facility approach for R or Python?

If you're trying to assign each shop to exactly one center and aren't allowed to assign multiple shops to a particular center then this is called the assignment problem, and can be solved exactly in an efficient manner using linear programming.
Here's an approach using the lp.assign function from the lpSolve package:
# Cost matrix (centers as rows, shops as columns, since matrix() fills by column)
(dist.mat <- matrix(dataset$Distance, nrow=3))
# [,1] [,2] [,3]
# [1,] 700 400 600
# [2,] 200 100 500
# [3,] 300 150 200
# Solve the assignment problem
library(lpSolve)
sol <- lp.assign(dist.mat, "min")
sol$solution
# [,1] [,2] [,3]
# [1,] 0 1 0
# [2,] 1 0 0
# [3,] 0 0 1
sol$objval
# [1] 800
The optimal solution assigns store A to center 2, store B to center 1, and store C to center 3, with cost 800.
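If you prefer the result as shop/center pairs rather than a 0/1 matrix, here is a small follow-up sketch (the label mapping is my addition, assuming the row/column layout of dist.mat above):
# A 1 in row i, column j of the solution means shop j is served by center i
assignment <- which(sol$solution > 0.5, arr.ind = TRUE)
data.frame(Shop = LETTERS[assignment[, "col"]],
           Center = assignment[, "row"])
#   Shop Center
# 1    A      2
# 2    B      1
# 3    C      3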

how to generate random numbers with conditions imposed in R?

I would like to generate 500 different combinations of a, b, and c meeting the following conditions:
a + b + c = 1 and
a < b < c
Here is a basic sample of generating random numbers; however, I need to generate them based on the aforementioned conditions.
Coeff = data.frame(a=runif(500, min = 0, max = 1),
                   b=runif(500, min = 0, max = 1),
                   c=runif(500, min = 0, max = 1))
One approach is to draw three positive numbers, normalize them so they sum to 1, and then sort each triple so that a < b < c:
myrandom <- function(n) {
  m <- matrix(runif(3*n), ncol=3)
  m <- cbind(m, rowSums(m)) # rowSums is efficient
  t(apply(m, 1, function(a) sort(a[1:3] / a[4])))
}
Demonstration:
set.seed(2)
(m <- myrandom(5))
# [,1] [,2] [,3]
# [1,] 0.1099815 0.3287708 0.5612477
# [2,] 0.1206611 0.2231769 0.6561620
# [3,] 0.2645362 0.3509054 0.3845583
# [4,] 0.2057215 0.2213517 0.5729268
# [5,] 0.2134069 0.2896015 0.4969916
all(abs(rowSums(m) - 1) < 1e-8) # CONSTRAINT 1: a+b+c = 1
# [1] TRUE
all(apply(m, 1, diff) > 0) # CONSTRAINT 2: a < b < c
# [1] TRUE
Note:
my test for "sum to 1" is more than just == 1 because of IEEE-754 and R FAQ 7.31, which suggest that any floating-point test should be an inequality rather than a test for equality; if you test for == 1, you will eventually find occurrences where it does not appear to be satisfied:
set.seed(2)
m <- myrandom(1e5)
head(which(rowSums(m) != 1))
# [1] 73 109 199 266 367 488
m[73,]
# [1] 0.05290744 0.24824770 0.69884486
sum(m[73,])
# [1] 1
sum(m[73,]) == 1
# [1] FALSE
abs(sum(m[73,]) - 1) < 1e-15
# [1] TRUE
max(abs(rowSums(m) - 1))
# [1] 1.110223e-16
I would like to point out that ANY distribution law (uniform, gaussian, exponential, ...) will produce numbers a, b and c meeting your condition as soon as you normalize and sort them, so there should be some domain knowledge to prefer one over the other.
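For instance (a quick sketch of my own, using exponential draws instead of uniform ones), the same normalize-and-sort recipe works unchanged:
# draw positive numbers, normalize each row to sum to 1, sort so that a < b < c
set.seed(1)
m <- matrix(rexp(3 * 500), ncol = 3)
m <- t(apply(m / rowSums(m), 1, sort))
all(abs(rowSums(m) - 1) < 1e-8)  # TRUE
all(apply(m, 1, diff) > 0)       # TRUE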
As an alternative, I would propose using the Dirichlet distribution, which produces numbers naturally satisfying your first condition: a+b+c=1. It has been applied to rainfall modelling as well, I believe (https://arxiv.org/pdf/1801.02962.pdf).
library(MCMCpack)
n <- 500
abc <- rdirichlet(n, c(1,1,1))
sum(abc) # should output n, since each row sums to 1
You could vary the concentration (shape) parameters to shape the data and, of course, sort each row to satisfy your second condition. In many cases it is easier to reason about your model's behavior if it uses a Dirichlet (the Dirichlet being the prior for the multinomial in the Bayesian approach, e.g.).
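Putting both conditions together, a minimal sketch (the sorting step and the checks are my addition):
abc <- rdirichlet(500, c(1, 1, 1))              # each row already sums to 1
abc <- t(apply(abc, 1, sort))                   # sort within rows so that a < b < c
colnames(abc) <- c("a", "b", "c")
all(abs(rowSums(abc) - 1) < 1e-8)               # constraint 1
all(abc[, 1] < abc[, 2] & abc[, 2] < abc[, 3])  # constraint 2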

How to implement q-learning in R?

I am learning about q-learning and found a Wikipedia post and this website.
According to the tutorials and pseudocode, I wrote this much in R:
#q-learning example
#http://mnemstudio.org/path-finding-q-learning-tutorial.htm
#https://en.wikipedia.org/wiki/Q-learning
set.seed(2016)
iter = 100
dimension = 5
alpha = 0.1 # learning rate
gamma = 0.8 # discount factor
# n x n matrix
Q = matrix(rep(0, len = dimension*dimension), nrow = dimension)
Q
# R: -1 is fire pit, 0 is safe path and 100 is the goal state
R = matrix(sample(-1:0, dimension*dimension, replace = T, prob = c(1, 2)), nrow = dimension)
R[dimension, dimension] = 100
R # reward matrix
################
for (i in 1:iter) {
  row = sample(1:dimension, 1)
  col = sample(1:dimension, 1)
  I = Q[row, col] # randomly choosing initial state
  Q[row,col] = Q[row,col] + alpha*(R[row,col] + gamma*max(Qdash - Q[row,col])
  # equation from wikipedia
}
But I have a problem with max(Qdash-Q[row,col]), which according to the website is Max[Q(next state, all actions)]. How do I programmatically search all actions for the next state?
The second problem is this pseudo code
Do While the goal state hasn't been reached.
Select one among all possible actions for the current state.
Using this possible action, consider going to the next state.
Get maximum Q value for this next state based on all possible actions.
Compute: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
Set the next state as the current state.
End Do
Is it this?
while(Q<100){
Q[row,col]=Q[row,col]+alpha*(R[row,col]+gamma*max(Qdash-Q[row,col])
}
This post is by no means a complete implementation of Q-learning in R. It is an attempt to answer the OP with regards to the description of the algorithm in the website linked in the post and in Wikipedia.
The assumption here is that the reward matrix R is as described in the website. Namely that it encodes reward values for possible actions as non-negative numbers, and -1's in the matrix represent null values (i.e., where there is no possible action to transition to that state).
With this setup, an R implementation of the Q update is:
Q[cs,ns] <- Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]) - Q[cs,ns])
where
cs is the current state at the current point in the path.
ns is the new state based on a (randomly) chosen action at the current state. This action is chosen from the collection of possible actions at the current state (i.e., for which R[cs,] > -1). Since the state transition itself is deterministic here, the action is the transition to the new state.
For this action resulting in ns, we want to add its maximum (future) value over all possible actions that can be taken at ns. This is the so-called Max[Q(next state, all actions)] term in the linked website and the "estimate of optimal future value" in Wikipedia. To compute this, we want to maximize over the ns-th row of Q but consider only columns of Q for which columns of R at the corresponding ns-th row are valid actions (i.e., for which R[ns,] > -1). Therefore, this is:
max(Q[ns, which(R[ns,] > -1)])
An interpretation of this value is a one-step look ahead value or an estimate of the cost-to-go in dynamic programming.
The equation in the linked website is the special case in which alpha, the learning rate, is 1. We can view the equation in Wikipedia as:
Q[cs,ns] <- (1-alpha)*Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]))
where alpha "interpolates" between the old value Q[cs,ns] and the learned value R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]). As noted in Wikipedia,
In fully deterministic environments, a learning rate of alpha=1 is optimal
Putting it all together into a function:
q.learn <- function(R, N, alpha, gamma, tgt.state) {
## initialize Q to be zero matrix, same size as R
Q <- matrix(rep(0,length(R)), nrow=nrow(R))
## loop over episodes
for (i in 1:N) {
## for each episode, choose an initial state at random
cs <- sample(1:nrow(R), 1)
## iterate until we get to the tgt.state
while (1) {
## choose next state from possible actions at current state
## Note: if only one possible action, then choose it;
## otherwise, choose one at random
next.states <- which(R[cs,] > -1)
if (length(next.states)==1)
ns <- next.states
else
ns <- sample(next.states,1)
## this is the update
Q[cs,ns] <- Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]) - Q[cs,ns])
## break out of while loop if target state is reached
## otherwise, set next.state as current.state and repeat
if (ns == tgt.state) break
cs <- ns
}
}
## return resulting Q normalized by max value
return(100*Q/max(Q))
}
where the input parameters are:
R is the rewards matrix as defined in the blog
N is the number of episodes to iterate
alpha is the learning rate
gamma is the discount factor
tgt.state is the target state of the problem.
Using the example in the linked website as a test:
N <- 1000
alpha <- 1
gamma <- 0.8
tgt.state <- 6
R <- matrix(c(-1,-1,-1,-1,0,-1,-1,-1,-1,0,-1,0,-1,-1,-1,0,-1,-1,-1,0,0,-1,0,-1,0,-1,-1,0,-1,0,-1,100,-1,-1,100,100),nrow=6)
print(R)
## [,1] [,2] [,3] [,4] [,5] [,6]
##[1,] -1 -1 -1 -1 0 -1
##[2,] -1 -1 -1 0 -1 100
##[3,] -1 -1 -1 0 -1 -1
##[4,] -1 0 0 -1 0 -1
##[5,] 0 -1 -1 0 -1 100
##[6,] -1 0 -1 -1 0 100
Q <- q.learn(R, N, alpha, gamma, tgt.state)
print(Q)
## [,1] [,2] [,3] [,4] [,5] [,6]
##[1,] 0 0 0.0 0 80 0.00000
##[2,] 0 0 0.0 64 0 100.00000
##[3,] 0 0 0.0 64 0 0.00000
##[4,] 0 80 51.2 0 80 0.00000
##[5,] 64 0 0.0 64 0 100.00000
##[6,] 0 80 0.0 0 80 99.99994
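As a quick sanity check (my addition, not part of the original answer), you can read off the greedy policy from the learned Q by always taking the highest-valued permitted action until the target state is reached:
greedy.path <- function(Q, R, start, tgt.state) {
  path <- start
  cs <- start
  ## assumes tgt.state is reachable from start under the greedy policy
  while (cs != tgt.state) {
    valid <- which(R[cs, ] > -1)          # actions allowed at cs
    ns <- valid[which.max(Q[cs, valid])]  # greedy action among them
    path <- c(path, ns)
    cs <- ns
  }
  path
}
greedy.path(Q, R, start = 1, tgt.state = 6)
## [1] 1 5 6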

Is there any alternative for Excel solver in R?

We have the below code for solving an optimization problem where we want to maximize sales by applying constraints on profit and the number of items.
We want to apply this profit threshold as a percentage of the revenue generated by 200 items only.
We have done it by applying a formula on profit using a changing variable in Excel Solver with the GRG nonlinear algorithm. We want a similar alternative for R.
Is there any way to assign a changing variable in R?
Dataset
item  sales  profit
A     1200   120
B     5600   45
C     450    0
D     990    -90
E     1000   80
F     560    120
G     500    23
H     2000   350
Code
library(lpSolveAPI)
dataset<-read.csv("Dataset.csv",header=T,na.strings='NA',stringsAsFactors =F)
dataset$keep_flag <-1
# all the functions in the lpSolveAPI package
ls("package:lpSolveAPI")
summary(dataset)
# Passing the parameters
ncol <- nrow(dataset)
# you have eight rows that can be picked or dropped from the solution set
lp_rowpicker <- make.lp(ncol=ncol)
set.type(lp_rowpicker, columns=1:ncol, type = c("binary"))
# checking the model
lp_rowpicker
# setting objective
obj_vals <- dataset$Revenue_1hr.Projected
#obj_vals<- dataset[, 2]
obj_vals
set.objfn(lp_rowpicker, obj_vals)
lp.control(lp_rowpicker,sense='max')
# Adding constraints
# Profit constraint
xt<- (dataset$Profit_1hr.Projected)
add.constraint(lp_rowpicker, xt, ">=", 100)
xt
#No.of items to be kept
xt<- (dataset$keep_flag)
add.constraint(lp_rowpicker, xt, "=", 4)
xt
#model check
lp_rowpicker
#solving equation
solve(lp_rowpicker)
#Maximised revenue
get.objective(lp_rowpicker)
#The one with binary as 1 is our item
dataset$keep_flag<- get.variables(lp_rowpicker)
dataset$keep_flag <- as.data.frame(dataset$keep_flag)
sum(dataset$keep_flag)
final_set <- cbind(dataset,final_flag)
final_set <- final_set[which(final_set$final_flag==1),]
final_set$keep_flag <- NULL
final_set$final_flag<- NULL
This code snippet applies the profit threshold on total no. of items rather than applying it on selected items.
Edit
This is the model that got created when I ran #Karsten W. code:
C1 C2 C3 C4 C5 C6 C7 C8
Maximize 1200 5600 450 990 1000 560 500 2000
R1 120 45 0 -90 80 120 23 350 >= 100
R2 1 1 1 1 1 1 1 1 = 4
Kind Std Std Std Std Std Std Std Std
Type Int Int Int Int Int Int Int Int
Upper 1 1 1 1 1 1 1 1
Lower 0 0 0 0 0 0 0 0
And the output obtained is:
item sales profit
1 A 1200 120
1.1 A 1200 120
1.2 A 1200 120
1.3 A 1200 120
The same item is returned four times. I want 4 unique items. Plus, I want to apply the profit constraint as a percentage of the sales generated by those 4 items.
By the way, we kept 'keep_flag' for a function similar to what your 'nitems' is doing. It is a changing variable that takes binary values.
Your code seems OK to me, except that the variable names do not fit the dataset you provided. In particular, it is not clear to me what keep_flag stands for; is that some sort of preselection?
The profit constraint in your code is applied only to the four variables selected by the solver.
Here is your code, a bit cleaned up.
library(lpSolveAPI)
dataset <- data.frame(item=LETTERS[1:8], sales=c(1200, 5600, 450, 990, 1000, 560, 500, 2000), profit=c(120, 45, 0, -90, 80, 120, 23, 350))
nitems <- nrow(dataset)
# make lp
lprec <- make.lp(0, ncol=nitems)
set.type(lprec, columns=seq.int(nitems), type="binary")
# set objective
lp.control(lprec, sense="max", bb.rule="gap", timeout=30)
set.objfn(lprec, obj=dataset[, "sales"])
# constraints
min_rel_profit <- 0.10 # min. 10% profit
add.constraint(lprec, dataset[, "profit"]-min_rel_profit*dataset[,"sales"], ">=", 0) # required profit
add.constraint(lprec, rep(1, nitems), "=", 4) # four products
print(lprec)
solve(lprec)
dataset[get.variables(lprec)==1,]
The profit constraint is derived as follows (p is the vector of profits, s is the vector of sales, x is the 0/1 decision vector, all of length nitems, and minp is the minimum relative profit):
sum(profit) / sum(sales) >= minp translates to p'x / s'x >= minp,
which is equivalent to (p - minp * s)'x >= 0.
Hence the minimum relative profit has to appear as part of the coefficients on the left-hand side.
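A quick check of the solution against this constraint (my addition, reusing the objects defined above):
sel <- dataset[get.variables(lprec) == 1, ]
sum(sel$profit) / sum(sel$sales) >= min_rel_profit  # should be TRUE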
If you are encountering long solving times, you can fine-tune the parameters; see ?lp.control.options for more details. Use timeout to set a time limit while testing. For this kind of problem (MIP) the bb.rule parameter is helpful. Given your example data, a solution with a 9.5% threshold was found in less than one second.
I would look at a few and choose the best:
lpSolve https://cran.r-project.org/web/packages/lpSolve/lpSolve.pdf
This is a simple linear solver. It's pretty much similar to lpSolveAPI, but I find it much easier to use.
minqa https://cran.r-project.org/web/packages/minqa/minqa.pdf
This is a derivative-free optimizer based on quadratic approximation (BOBYQA and friends), mostly useful for nonlinear problems.
Gurobi http://www.gurobi.com/products/modeling-languages/r
This is a commercial solver comparable to IBM's CPLEX. Very good and competent.
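For reference, the item-selection problem above can also be written directly with lpSolve's lp() function (a sketch only, reusing the dataset and the 10% threshold from the answer above):
library(lpSolve)
obj <- dataset$sales                                 # maximize sales
con <- rbind(dataset$profit - 0.10 * dataset$sales,  # profit >= 10% of sales
             rep(1, nrow(dataset)))                  # number of items picked
sol <- lp("max", obj, con, c(">=", "="), c(0, 4), all.bin = TRUE)
dataset[sol$solution == 1, ]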

R: too long computation of likelihood function for conditional logit model

I am trying to maximize a log-likelihood function to get coefficients for a conditional logit model. I have a big data frame with about 9M rows (300k choice sets) and about 40 parameters to be estimated. It looks like this:
ChoiceSet  Choice  SKU   Price  Caramel  etc.
1          1       1234  1.0    1        ...
1          0       145   2.0    1        ...
1          0       5233  2.0    0        ...
2          0       1432  1.5    1        ...
2          0       5233  2.0    0        ...
2          1       8320  2.0    0        ...
3          0       1234  1.5    1        ...
3          1       145   1.0    1        ...
3          0       8320  1.0    0        ...
Where ChoiceSet is the set of products available in the store at the moment of purchase and Choice = 1 when the SKU is chosen.
Since choice sets might vary, I use the log-likelihood function:
library(data.table)
library(foreach)

clogit.ll <- function(beta, X) { #### This is the function to be maximized
  X <- as.data.table(X)
  setkey(X, ChoiceSet, Choice)
  sum((as.matrix(X[J(t(as.vector(unique(X[,1,with=F]))),1),3:ncol(X),with=F]))%*%beta)-
    sum(foreach(chset=unique(X[,list(ChoiceSet)])$ChoiceSet, .combine='c', .packages='data.table') %dopar% {
      Z <- as.matrix(X[J(chset,0:1),3:ncol(X), with=F])
      Zb <- Z%*%beta
      e <- exp(Zb)
      log(sum(e))
    })
}
Create a new data frame without SKU (it's not needed) and a zero starting vector:
X0 <- Data[,-3]
b0 <- rep(0,ncol(X0)-2)
I maximize this function with the help of the maxLik package, where I use a gradient to make the calculation faster:
grad.clogit.ll <- function(beta, X) { ### This is the gradient of the likelihood function
  X <- as.data.table(X)
  setkey(X, ChoiceSet, Choice)
  colSums(foreach(chset=unique(X[,list(ChoiceSet)])$ChoiceSet, .combine='rbind', .packages='data.table') %dopar% {
    Z <- as.matrix(X[J(chset,0:1),3:ncol(X), with=F])
    Zb <- Z%*%beta
    e <- exp(Zb)
    as.vector(X[J(chset,1),3:ncol(X),with=F]-t(as.vector(X[J(chset,0:1),3:ncol(X),with=F]))%*%(e/sum(e)))
  })
}
The maximization call is the following:
fit <- maxLik(logLik = clogit.ll, grad = grad.clogit.ll, start=b0, X=X0, method="NR", tol=10^(-6), iterlim=100)
Generally, it works fine for small samples, but takes too long for big ones:
Number of choice sets   Duration of computation
300                     4.5 min
400                     10.5 min
1000                    25 min
But when I run it for 5000+ choice sets, R terminates the session.
So (if you are still reading this), how can I maximize this function if I have 300,000+ choice sets and 1.5 weeks to finish my course work? Please help, I have no idea.
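For reference, the same log-likelihood can also be computed in a single grouped data.table pass instead of one foreach task per choice set, which may cut the per-set overhead; this is only a sketch under the column layout above (ChoiceSet, Choice, then covariates) and the function name is mine:
library(data.table)

clogit.ll.grouped <- function(beta, X) {
  DT <- as.data.table(X)
  covs <- setdiff(names(DT), c("ChoiceSet", "Choice"))
  DT[, xb := as.vector(as.matrix(.SD) %*% beta), .SDcols = covs]  # linear index per row
  ll <- DT[, .(contrib = sum(xb[Choice == 1]) - log(sum(exp(xb)))),
           by = ChoiceSet]                                        # one term per choice set
  sum(ll$contrib)
}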

parallel k-means in R

I am trying to understand how to parallelize some of my code using R. So, in the following example I want to use k-means to cluster data using 2,3,4,5,6 centers, while using 20 iterations.
Here is the code:
library(parallel)
library(BLR)
data(wheat)
parallel.function <- function(i) {
  kmeans( X[1:100,100], centers=?? , nstart=i )
}
out <- mclapply( c(5, 5, 5, 5), FUN=parallel.function )
How can we parallelize over both the iterations and the centers simultaneously?
How do I track the outputs, assuming I want to keep all the outputs from k-means across all iterations and centers, just to learn how?
This looked very simple to me at first ... and then I tried it. After a lot of monkey typing and face palming during my lunch break, however, I arrived at this:
library(parallel)
library(BLR)
data(wheat)
mc = mclapply(2:6, function(x,centers)kmeans(x, centers), x=X)
It looks right though I didn't check how sensible the clustering was.
> summary(mc)
Length Class Mode
[1,] 9 kmeans list
[2,] 9 kmeans list
[3,] 9 kmeans list
[4,] 9 kmeans list
[5,] 9 kmeans list
On reflection the command syntax seems sensible - although a lot of other stuff that failed seemed reasonable too... The examples in the help documentation are maybe not that great.
Hope it helps.
EDIT
As requested, here is the same approach over the two variables nstart and centers:
(pars = expand.grid(i=1:3, cent=2:4))
i cent
1 1 2
2 2 2
3 3 2
4 1 3
5 2 3
6 3 3
7 1 4
8 2 4
9 3 4
L=list()
# zikes horrible
pars2=apply(pars,1,append, L)
mc = mclapply(pars2, function(x,pars)kmeans(x, centers=pars$cent,nstart=pars$i ), x=X)
> summary(mc)
Length Class Mode
[1,] 9 kmeans list
[2,] 9 kmeans list
[3,] 9 kmeans list
[4,] 9 kmeans list
[5,] 9 kmeans list
[6,] 9 kmeans list
[7,] 9 kmeans list
[8,] 9 kmeans list
[9,] 9 kmeans list
How'd you like them apples?
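A tidier way to run the same grid (my own sketch, not from the answer above) is to index the rows of the parameter grid directly and then pull a summary statistic out of each fit:
pars <- expand.grid(nstart = 1:3, centers = 2:4)
mc <- mclapply(seq_len(nrow(pars)), function(j)
  kmeans(X, centers = pars$centers[j], nstart = pars$nstart[j]))
# e.g. total within-cluster sum of squares for each parameter combination
sapply(mc, function(fit) fit$tot.withinss)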
There's a CRAN package called knor that is derived from a research paper; it improves performance using a memory-efficient variant of Elkan's pruning algorithm. It's an order of magnitude faster than everything in these answers.
install.packages("knor")
require(knor)
iris.mat <- as.matrix(iris[,1:4])
k <- length(unique(iris[, dim(iris)[2]])) # Number of unique classes
nthread <- 4
kms <- Kmeans(iris.mat, k, nthread=nthread)
You may use parallel to try k-means from different random starting points on multiple cores.
The code below is an example (K = the k in k-means, N = number of random starting points, C = number of cores you would like to use):
suppressMessages( library("Matrix") )
suppressMessages( library("irlba") )
suppressMessages( library("stats") )
suppressMessages( library("cluster") )
suppressMessages( library("fpc") )
suppressMessages( library("parallel") )
#Calculate KMeans results
calcKMeans <- function(matrix, K, N, C){
  # Parallel runs from various random starting points (using C cores)
  results <- mclapply(rep(N %/% C, C), FUN=function(nstart) kmeans(matrix, K, iter.max=15, nstart=nstart), mc.cores=C)
  # Find the solution with the smallest total within-cluster sum of squares
  tmp <- sapply(results, function(r){r[['tot.withinss']]})
  km <- results[[which.min(tmp)]]
  # return cluster, centers, totss, withinss, tot.withinss, betweenss, size
  return(km)
}
runKMeans <- function(fin_uf, K, N, C,
                      #fout_center, fout_label, fout_size,
                      fin_record=NULL, fout_prediction=NULL){
  uf = read.table(fin_uf)
  km = calcKMeans(uf, K, N, C)
  rm(uf)
  #write.table(km$cluster, file=fout_label, row.names=FALSE, col.names=FALSE)
  #write.table(km$center, file=fout_center, row.names=FALSE, col.names=FALSE)
  #write.table(km$size, file=fout_size, row.names=FALSE, col.names=FALSE)
  str(km)
  return(km$center)
}
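A minimal usage sketch (my own example, calling calcKMeans on iris in memory rather than reading a file through runKMeans):
km <- calcKMeans(as.matrix(iris[, 1:4]), K = 3, N = 8, C = 2)
km$size          # cluster sizes of the best of the 8 starts
km$tot.withinss  # total within-cluster sum of squares of that fit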
Hope it helps!
