I have a list of rooms, the rooms' maximum square feet, programs, the programs' maximum square feet, and the values of how well a room matches (match #) the intended program use. With help, I have been able to maximize match # and square feet use for one program per room. However, I would like to take this analysis one step further and allow for multiple programs in the same room or multiples of the same program if it has the highest match #, so long as the multiples still fit within the square foot requirements. Moreover, I would like to tell lpSolve overall that I only want "x" number of Offices, "y" number of Studios, etc. throughout the entire building. Here is my data and code thus far:
program.size <- c(120,320,300,800,500,1000,500,1000,1500,400,1500,2000)
room.size <- c(1414,682,1484,2938,1985,1493,427,1958,708,581,1485,652,727,2556,1634,187,2174,205,1070,2165,1680,1449,1441,2289,986,298,590,2925)
(obj.vals <- matrix(c(3,4,2,8,3,7,4,8,6,4,7,7,
3,4,2,8,3,7,4,8,6,4,7,7,
4,5,3,7,4,6,5,7,5,3,6,6,
2,3,1,7,2,6,3,7,7,5,6,6,
4,5,3,7,4,6,5,7,5,3,6,6,
3,6,4,8,5,7,4,8,7,7,7,7,
3,4,2,8,3,7,4,8,6,4,7,7,
4,5,3,7,4,6,5,7,5,3,6,6,
6,7,5,7,6,6,7,7,5,3,6,6,
6,7,5,7,6,6,7,7,5,3,6,6,
5,6,6,6,5,7,8,6,4,2,5,5,
6,7,5,7,6,6,7,7,5,3,6,6,
6,7,5,7,6,6,7,7,5,3,6,6,
3,4,4,8,3,9,6,8,6,4,7,7,
3,4,2,6,3,5,4,6,6,4,5,5,
4,5,3,5,4,4,5,5,5,3,4,4,
5,6,4,8,5,7,6,8,6,4,7,7,
5,6,4,8,5,7,6,8,6,4,7,7,
4,5,5,7,4,8,7,7,5,3,6,6,
5,6,4,8,5,7,6,8,6,4,7,7,
3,4,2,6,3,5,4,6,6,4,5,5,
5,6,4,8,5,7,6,8,6,4,7,7,
5,6,4,8,5,7,6,8,6,4,7,7,
5,4,4,6,5,5,6,6,6,6,7,5,
6,5,5,5,6,4,5,5,5,7,6,4,
4,5,3,7,4,6,5,7,7,5,6,6,
6,5,5,5,6,4,5,5,5,7,6,4,
3,4,4,6,3,7,6,6,6,4,5,5), nrow=12))
rownames(obj.vals) <- c("Enclosed Offices", "Open Office", "Reception / Greeter", "Studio / Classroom",
"Conference / Meeting Room", "Gallery", "Public / Lobby / Waiting",
"Collaborative Space", "Mechanical / Support", "Storage / Archives",
"Fabrication", "Performance")
(obj.adj <- obj.vals * outer(program.size, room.size, "<="))
nr <- nrow(obj.adj)
nc <- ncol(obj.adj)
library(lpSolve)
obj <- as.vector(obj.adj)
con <- t(1*sapply(1:nc, function(x) rep(1:nc == x, each=nr)))
dir <- rep("<=", nc)
rhs <- rep(1, nc)
mod <- lp("max", obj, con, dir, rhs, all.bin=TRUE)
final <- matrix(mod$solution, nrow=nr)
And so now my question is: how can I allow the solver to maximize square foot use and match # within each room (column), and allow either multiples of the same program or a combination of programs to accomplish this? I know I would have to lift the "<= 1" restriction in "mod", but I can't figure out how to let it find the best fit in each room and then, ultimately, overall.
The solution that should come for room [,1] is:
$optimum
33
And it's going to try to fit 11 Enclosed Offices within the room (11 x 120 = 1320 sq ft, which fits in 1414 sq ft), which scores a much higher optimum match # (11 x 3 = 33) than 1 Collaborative Space (8 matches) plus 1 Storage / Archives (4 matches) for a total of 12 matches.
And so this leads to my next question about limiting the overall number of certain programs within my solution matrix. I assume it would include some kind of
as.numeric(data$EnclosedOffices "<=" 5)
but I also can't figure out how to limit that. These numbers would be different for all the programs.
Thanks for any and all help and feel free to ask for any clarification.
Update:
Constraints
Maximize the match # (obj.vals) for each room.
Maximize the program.size square feet within each room.size square
feet. Do this by either using the same program multiple times (5 x
Enclosed Offices) or combining programs (1 Collaborative Space and
1 Performance).
Restrict and/or force the number of programs returned in the
solution. The programs can be split up among rooms so long as it doesn't go over the maximum number I provide for that program in all the rooms. (Only 5 Enclosed Offices, 8 Studios/Classrooms, 1 Fabrication, etc. can be selected across all 28 columns.)
If you use the R package lpSolveAPI (a wrapper for lpSolve) then it becomes a little easier.
First, take a look at the mathematical formulation (an Integer Program) and then I show you the code to solve your problem.
Formulation
Let X_r_p be the decision variable that takes on non-negative integer values.
X_r_p = Number of programs of type p assigned to room r
(In all, your problem will have 28*12 = 336 decision variables.)
Objective Function
Maximize matching score
Max sum(r) sum(p) C_r_p * X_r_p # Here C_r_p is the score of assigning p to room r
Subject to
Room Area Limitation Constraint
Sum(p) Max_area_p * X_r_p <= Room Size (r) for each room r
(We will have 28 such constraints)
Restrict the Number of programs Constraint
Sum(r) X_r_p <= Max_allowable(p) for each program p
(We will have 12 such constraints)
X_r_p >= 0, Integer
That is the formulation in all. 336 columns and 40 rows.
Implementation in R
Here's an implementation in R, using lpSolveAPI.
Note: Since the OP didn't provide the max_allowable programs in the building, I generated my own data for max_programs.
program.size <- c(120,320,300,800,500,1000,500,1000,1500,400,1500,2000)
room.size <- c(1414,682,1484,2938,1985,1493,427,1958,708,581,1485,652,727,2556,1634,187,2174,205,1070,2165,1680,1449,1441,2289,986,298,590,2925)
(obj.vals <- matrix(c(3,4,2,8,3,7,4,8,6,4,7,7,3,4,2,8,3,7,4,8,6,4,7,7,
4,5,3,7,4,6,5,7,5,3,6,6,2,3,1,7,2,6,3,7,7,5,6,6, 4,5,3,7,4,6,5,7,5,3,6,6,
3,6,4,8,5,7,4,8,7,7,7,7,3,4,2,8,3,7,4,8,6,4,7,7, 4,5,3,7,4,6,5,7,5,3,6,6,
6,7,5,7,6,6,7,7,5,3,6,6,6,7,5,7,6,6,7,7,5,3,6,6, 5,6,6,6,5,7,8,6,4,2,5,5,
6,7,5,7,6,6,7,7,5,3,6,6, 6,7,5,7,6,6,7,7,5,3,6,6, 3,4,4,8,3,9,6,8,6,4,7,7,
3,4,2,6,3,5,4,6,6,4,5,5, 4,5,3,5,4,4,5,5,5,3,4,4, 5,6,4,8,5,7,6,8,6,4,7,7,
5,6,4,8,5,7,6,8,6,4,7,7, 4,5,5,7,4,8,7,7,5,3,6,6, 5,6,4,8,5,7,6,8,6,4,7,7,
3,4,2,6,3,5,4,6,6,4,5,5, 5,6,4,8,5,7,6,8,6,4,7,7, 5,6,4,8,5,7,6,8,6,4,7,7,
5,4,4,6,5,5,6,6,6,6,7,5, 6,5,5,5,6,4,5,5,5,7,6,4, 4,5,3,7,4,6,5,7,7,5,6,6,
6,5,5,5,6,4,5,5,5,7,6,4, 3,4,4,6,3,7,6,6,6,4,5,5), nrow=12))
rownames(obj.vals) <- c("Enclosed Offices", "Open Office", "Reception / Greeter", "Studio / Classroom",
"Conference / Meeting Room", "Gallery", "Public / Lobby / Waiting",
"Collaborative Space", "Mechanical / Support", "Storage / Archives",
"Fabrication", "Performance")
For each of the 12 programs, let's set a maximum number of repetitions that can be assigned to all rooms combined. Note this is something that I added, since this data was not provided by the OP. (Restrict them from getting assigned to too many rooms.)
max_programs <- c(1,2,3,1,5,2,3,4,1,3,1,2)
library(lpSolveAPI)
nrooms <- 28
nprgs <- 12
ncol = nrooms*nprgs
lp_matching <- make.lp(ncol=ncol)
#we want integer assignments
set.type(lp_matching, columns=1:ncol, type = c("integer"))
# sum r,p Crp * Xrp
set.objfn(lp_matching, as.vector(obj.vals)) # 28 rooms * 12 programs, flattened column-major (12 program scores per room)
lp.control(lp_matching,sense='max')
#' Set Max Programs constraints
#' No more than max number of programs over all the rooms
#' X1p + x2p + x3p ... + x28p <= max(p) for each p
Add_Max_program_constraint <- function (prog_index) {
prog_cols <- (0:(nrooms-1))*nprgs + prog_index
add.constraint(lp_matching, rep(1,nrooms), indices=prog_cols, rhs=max_programs[prog_index])
}
#Add a max_number constraint for each program
lapply(1:nprgs, Add_Max_program_constraint)
#' Sum of all the programs assigned to each room, over all programs
#' area_1 * Xr1+ area 2* Xr2+ ... + area12* Xr12 <= room.size[r] for each room
Add_room_size_constraint <- function (room_index) {
room_cols <- (room_index-1)*nprgs + (1:nprgs) #relevant columns for a given room
add.constraint(lp_matching, xt=program.size, indices=room_cols, rhs=room.size[room_index])
}
#Add a room-size constraint for each room
lapply(1:nrooms, Add_room_size_constraint)
To solve this:
> solve(lp_matching)
> get.objective(lp_matching)
[1] 195
get.variables(lp_matching) # to see which programs went to which rooms
> print(lp_matching)
Model name:
a linear program with 336 decision variables and 40 constraints
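To read the solution more easily, you can reshape the flat variable vector back into a programs-by-rooms matrix. This is a small convenience I am adding (not part of the original answer); it relies on the same column ordering used when building the model:
sol <- matrix(get.variables(lp_matching), nrow=nprgs)  # programs as rows, rooms as columns
dimnames(sol) <- list(rownames(obj.vals), paste0("Room", 1:nrooms))
sol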
You can also write the IP model to a file to examine it:
#Give identifiable column and row names
rp<- t(outer(1:nrooms, 1:nprgs, paste, sep="_"))
rp_vec <- as.vector(rp)   # "room_program" labels, in the same column order as the model
col_names <- paste("x_", rp_vec, sep="")
# Row names
rownames1 <- paste("MaxProg", 1:nprgs, sep="_")
rownames2 <- paste("Room", 1:nrooms, "AreaLimit", sep="_")
dimnames(lp_matching) <- list(c(rownames1, rownames2), col_names)
write.lp(lp_matching,filename="room_matching.lp")
Hope that helps.
Update based on follow-up questions
Follow up Question 1: Alter the code to ensure that every room has at least one program.
Add the following constraint set:
Sum(p) X_r_p >= 1 for each room r
Note: Since this is a maximization problem, the optimal solution should honor this constraint by default. In other words, it will always assign a program to a room if it can, assuming positive scores for assigning.
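If you do want to add it explicitly, here is a minimal sketch in lpSolveAPI (my own helper name, following the same column-indexing convention as above):
#At least one program in every room: sum over p of X_r_p >= 1
Add_min_room_constraint <- function (room_index) {
  room_cols <- (room_index-1)*nprgs + (1:nprgs)
  add.constraint(lp_matching, rep(1,nprgs), type=">=", indices=room_cols, rhs=1)
}
lapply(1:nrooms, Add_min_room_constraint)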
Follow up question 2: Another question is whether I can ask it to have more than 28 programs in total. For instance, if I want 28 Enclosed Offices, almost all of them could fit in the one room with 2938 square feet. How then can I ask R to still find other programs if the max is set to 28?
In order to achieve this goal, you can do it a bit differently. Do not have the sum of all programs <= 28 constraint at all. (If you note in the solution above, my constraints are slightly different.)
The constraint:
Sum(r) X_r_p <= Max_allowable(p) for each program p
only limits the max per type of program. There is no limit on the total. Also, you don't have to write one such constraint for each type of program; write it only for the program types whose occurrence you want to limit.
To generalize this, you can set the lower and upper bounds for the total of each type of programs. This will give you very fine control over the assignments.
min_allowable(p) <= sum(over r) X_r_p <= max_allowable(p) for any program type p
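As a sketch, this could look as follows in lpSolveAPI (min_programs is a hypothetical vector I'm introducing here; the max side reuses max_programs from above):
min_programs <- rep(0, nprgs)  # hypothetical lower bounds; set as needed
Add_program_bounds <- function (prog_index) {
  prog_cols <- (0:(nrooms-1))*nprgs + prog_index
  add.constraint(lp_matching, rep(1,nrooms), type=">=", indices=prog_cols,
                 rhs=min_programs[prog_index])
  add.constraint(lp_matching, rep(1,nrooms), type="<=", indices=prog_cols,
                 rhs=max_programs[prog_index])
}
lapply(1:nprgs, Add_program_bounds)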
Related
I am testing 2 ways of calculating prod(b-a), where a and b are vectors of length n: prod(b-a) = (b1-a1)*(b2-a2)*...*(bn-an), where b_i > a_i > 0 for all i = 1, 2, ..., n. For some special cases, another way (Method 2) of calculating prod(b-a) is more efficient. It uses the following formula, which expands the product and sums the resulting terms:

prod_{i=1..n} (b_i - a_i) = sum over all subsets S of {1,...,n} of (-1)^|S| * (prod_{i in S} a_i) * (prod_{i not in S} b_i)
Here is my question: when a_i is very close to b_i, the true result can be very, very close to 0, something like 10^(-16). Method 1 (subtract and multiply) always returns a positive output. Method 2 (the expansion formula) sometimes returns a negative output (about 7~8% of the time in my experiment). Mathematically, these 2 methods should return exactly the same output, but in floating-point arithmetic they apparently produce different outputs.
Here are my codes to run the test. When I run the testing code 10000 times, about 7~8% of my runs for Method 2 return a negative output. According to the official documentation, the R double goes down to 2.225074e-308, as indicated by .Machine$double.xmin. Why is it getting into negative values when the differences are between 10^(-16) and 10^(-18)? Any help that sheds light on this will be appreciated. I would also love suggestions on how to practically increase the precision to the higher level indicated in the R documentation.
########## Testing code 1.
ftest1case<-function(a,b) {
n<-length(a)
if (length(b)!=n) stop("--------- length a and b are not right.")
if ( any(b<a) ) stop("---------- b has to be greater than a all the time.")
out1<-prod(b-a)
out2<-0
N<-2^n
for ( i in 1:N ) {
tidx<-rev(as.integer(intToBits(x=i-1))[1:n])
tsign<-ifelse( (sum(tidx)%%2)==0,1.0,-1.0)
out2<-out2+tsign*prod(b[tidx==0])*prod(a[tidx==1])
}
c(out1,out2)
}
########## Testing code 2.
ftestManyCases<-function(N,printFreq=1000,smallNum=10^(-20))
{
tt<-matrix(0,nrow=N,ncol=2)
n<-12
for ( i in 1:N) {
a<-runif(n,0,1)
b<-a+runif(n,0,1)*0.1
tt[i,]<-ftest1case(a=a,b=b)
if ( (i%%printFreq)==0 ) cat("----- i = ",i,"\n")
if ( tt[i,2]< smallNum ) cat("------ i = ",i, " ---- Negative summation found.\n")
}
tout<-apply(tt,2,FUN=function(x) { round(sum(x<smallNum)/N,6) } )
names(tout)<-c("PerLess0_Method1","PerLess0_Method2")
list(summary=tout, data=tt)
}
######## Step 1. Test for 1 case.
n<-12
a<-runif(n,0,1)
b<-a+runif(n,0,1)*0.1
ftest1case(a=a,b=b)
######## Step 2 Test Code 2 for multiple cases.
N<-300
tt<-ftestManyCases(N=N,printFreq = 100)
tt[[1]]
It's hard for me to imagine when an algorithm that consists of generating 2^n signed terms and adding them up is going to be more efficient than a straightforward product of differences, but I'll take your word for it that there are some special cases where it is.
As suggested in comments, the root of your problem is the accumulation of floating-point errors when adding values of different magnitudes; see here for an R-specific question about floating point and here for the generic explanation.
First, a simplified example:
n <- 12
set.seed(1001)
a <- runif(n,0,1)
b <- a + 0.01
prod(a-b) ## 1e-24
out2 <- 0
N <- 2^n
out2v <- numeric(N)
for ( i in 1:N ) {
tidx <- rev(as.integer(intToBits(x=i-1))[1:n])
tsign <- ifelse( (sum(tidx)%%2)==0,1.0,-1.0)
j <- as.logical(tidx)
out2v[i] <- tsign*prod(b[!j])*prod(a[j])
}
sum(out2v) ## -2.011703e-21
Using extended precision (with 1000 bits of precision) to check that the simple/brute force calculation is more reliable:
library(Rmpfr)
a_m <- mpfr(a, 1000)
b_m <- mpfr(b, 1000)
prod(a_m-b_m)
## 1.00000000000000857647286522936696473705868726043995807429578968484409120647055193862325070279593735821154440625984047036486664599510856317884962563644275433171621778761377125514191564456600405460403870124263023336542598111475858881830547350667868450934867675523340703947491662460873009229537576817962228e-24
This proves the point in this case, but in general doing extended-precision arithmetic will probably kill any performance gains you would get.
Redoing the permutation-based calculation with mpfr values (using out2 <- mpfr(0, 1000), and going back to the out2 <- out2 + ... running summation rather than accumulating the values in a vector and calling sum()) gives an accurate answer (at least to the first 20 or so digits, I didn't check farther), but takes 6.5 seconds on my machine (instead of 0.03 seconds when using regular floating-point).
Why is this calculation problematic? First, note the difference between .Machine$double.xmin (approx 2e-308), which is the smallest floating-point value that the system can store, and .Machine$double.eps (approx 2e-16), which is the smallest value x such that 1 + x > 1 in floating point, i.e. the smallest relative value that can be added without catastrophic cancellation (values a little bit bigger than this magnitude will experience severe, but not catastrophic, cancellation).
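You can see the distinction directly in R:
.Machine$double.xmin            # ~2.225074e-308: smallest storable positive double
.Machine$double.eps             # ~2.220446e-16: smallest x with 1 + x > 1
1 + .Machine$double.eps/2 == 1  # TRUE: a relative difference this small is lost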
Now look at the distribution of the individual terms collected in out2v:
hist(out2v)
There are clusters of negative and positive numbers of similar magnitude. If our summation happens to add a bunch of values that almost cancel (so that the result is very close to 0), then add that to another value that is not nearly zero, we'll get bad cancellation.
It's entirely possible that there's a way to rearrange this calculation so that bad cancellation doesn't happen, but I couldn't think of one easily.
Here is my problem - I would like to generate a fairly large number of factorial combinations and then apply some constraints on them to narrow down the list of all possible combinations. However, this becomes an issue when the number of all possible combinations becomes extremely large.
Let's take an example - Assume we have 8 variables (A; B; C; etc.) each taking 3 levels/values (A={1,2,3}; B={1,2,3}; etc.).
The list of all possible combinations would contain 3^8 = 6561 entries and can be generated as follows:
tic <- function(){start.time <<- Sys.time()}
toc <- function(){round(Sys.time() - start.time, 4)}
nX = 8
tic()
lk = as.list(NULL)
lk = lapply(1:nX, function(x) c(1,2,3))
toc()
tic()
mapx = expand.grid(lk)
mapx$idx = 1:nrow(mapx)
toc()
So far so good, these operations are done pretty quickly (< 1 second) even if we significantly increase the number of variables.
The next step is to generate a corrected set of all pairwise comparisons. (An uncorrected set would be obtained by freely combining all 6561 options with each other, leading to 6561*6561 = 43046721 combinations.) The size of this "universe" would be: 6561*(6561-1)/2 = 21520080. Already pretty big!
I am using the R built-in function combn to get it done. In this example the running time remains acceptable (about 20 seconds on my PC), but things become impossible with a higher number of variables and/or more levels per variable (the running time increases exponentially; for example, it already took 177 seconds with 9 variables!). But my biggest concern is actually that the object size would become so large that R could no longer handle it (a memory issue).
tic()
univ = t(combn(mapx$idx,2))
toc()
The next step would be to identify the list of combinations meeting some pre-defined constraints. For instance, I would like to sub-select all combinations sharing exactly 3 common elements (i.e. 3 variables take the same values). Again the running time will be very long (even with 8 variables), as my approach is to loop over all combinations previously defined.
tic()
vrf = NULL
vrf = sapply(1:nrow(univ), function(x){
j1 = mapx[mapx$idx==univ[x,1],-ncol(mapx)]
j2 = mapx[mapx$idx==univ[x,2],-ncol(mapx)]
cond = ifelse(sum(j1==j2)==3,1,0)
return(cond)})
toc()
tic()
univ = univ[vrf==1,]
toc()
Would you know how to overcome this issue? Any tips/advices would be more than welcome!
Reading the book by Aziz & Prakash (2021), I am a bit stuck on problem 3.7 and the associated solution, which I am trying to implement.
The problem says :
You have n users with unique hashes h1 through hn and
m servers, numbered 1 to m. User i has Bi bytes to store. You need to
find numbers K1 through Km such that all users with hashes between
Kj and Kj+1 get assigned to server j. Design an algorithm to find the
numbers K1 through Km that minimize the load on the most heavily loaded server.
The solution says:
Let L(a,b) be the maximum load on a server when
users with hash h1 through ha are assigned to servers S1 through Sb in
an optimal way so that the max load is minimised. We observe the
following recurrence:

L(a, b) = min over x in {1, ..., a} of max( L(x, b-1), B_{x+1} + B_{x+2} + ... + B_a )
In other words, we find the right value of x such that if we pack the first x users into b-1 servers and the remaining users into the last server, the max load on a given server is minimized.
Using this relationship, we can tabulate the values of L until we get L(n,m). While computing L(a,b), when the values of L are tabulated for all lower values of a and b, we need to find the right value of x to minimize the load. As we increase x, L(x, b-1) in the above expression increases while the sum term decreases, so we can do a binary search for the x that minimises their max.
I know that we can probably use some sort of dynamic programming, but how could we possibly implement this idea into a code?
The dynamic programming algorithm is defined fairly well given that formula: implementing a top-down DP algorithm just needs you to loop over x from 1 to a and record which value minimizes the max(L(x, b-1), sum(B_i)) expression.
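For concreteness, here is a minimal memoised top-down sketch in R (the naming is my own; it assumes B is the vector of per-user byte counts, ordered by hash):
max_load <- function(B, m) {
  n <- length(B)
  prefix <- c(0, cumsum(B))            # prefix[a+1] = B_1 + ... + B_a
  memo <- matrix(NA_real_, n + 1, m)   # memo[a+1, b] caches L(a, b)
  L <- function(a, b) {
    if (b == 1) return(prefix[a + 1])  # one server takes all remaining users
    if (a == 0) return(0)              # no users, no load
    if (!is.na(memo[a + 1, b])) return(memo[a + 1, b])
    best <- Inf
    for (x in 0:a)                     # first x users on b-1 servers, rest on server b
      best <- min(best, max(L(x, b - 1), prefix[a + 1] - prefix[x + 1]))
    memo[a + 1, b] <<- best
    best
  }
  L(n, m)
}
max_load(c(10, 20, 30, 40, 50), 3)     # 60, e.g. {10,20,30} | {40} | {50}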
There is, however, a simpler (and faster) greedy/binary search algorithm for this problem that you should consider, which goes like this:
Compute prefix sums for B
Find the minimum value of L such that we can partition B into m contiguous subarrays whose maximum sum is equal to L.
We know 1 <= L <= sum(B). So, perform a binary search to find L, with a helper function canSplit(v) that tests whether we can split B into such subarrays of sum <= v.
canSplit(v) works greedily: Remove as many elements from the start of B as possible so that our sum does not exceed v. Repeat this a total of m times; return True if we've used all of B.
You can use the prefix sums to run canSplit in O(m log n) time, with an additional inner binary search.
Given L, use the same strategy as the canSplit function to determine the m-1 partition points; find the m partition boundaries from there.
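Here is a sketch of that approach in R (my own naming; for simplicity the feasibility check scans linearly in O(n) rather than using the O(m log n) prefix-sum variant described above):
# Can B be cut into at most m contiguous pieces, each with sum <= v?
can_split <- function(B, m, v) {
  pieces <- 1; current <- 0
  for (x in B) {
    if (x > v) return(FALSE)      # a single user already exceeds v
    if (current + x > v) {        # close the current piece, open a new one
      pieces <- pieces + 1
      current <- 0
    }
    current <- current + x
  }
  pieces <= m
}

min_max_load <- function(B, m) {
  lo <- max(B); hi <- sum(B)
  while (lo < hi) {               # binary search on the answer L
    mid <- (lo + hi) %/% 2
    if (can_split(B, m, mid)) hi <- mid else lo <- mid + 1
  }
  lo
}
min_max_load(c(10, 20, 30, 40, 50), 3)  # 60, matching the DP above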
I have a question for an assignment I'm doing.
Q:
"Set the seed at 1, then using a for-loop take a random sample of 5 mice 1,000 times. Save these averages.
What proportion of these 1,000 averages are more than 1 gram away from the average of x ?"
I understand that basically I need to write code that answers: what percentage of the "nulls" is + or - 1 gram from the average of "x"? I'm not really certain how to write that, given that this course hasn't yet shown us how, but is asking us to do so anyway. Any help on how to do so?
library(downloader)  # provides download(); needed for the call below
url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/femaleControlsPopulation.csv"
filename <- basename(url)
download(url, destfile=filename)
x <- unlist( read.csv(filename) )
set.seed(1)
n <- 1000
nulls<-vector("numeric", n)
for(i in 1:n){
control <- sample(x, 5)
nulls[i] <-mean(control)
##I know my last line for this should be something like this
## mean(nulls "+ or - 1")> or < mean(x)
## not certain if they're asking for abs() to be involved.
## is the question asking only for those that are 1 gram MORE than the avg of x?
}
Thanks for any help.
Z
I do think that the absolute distance is what they're after here.
Vectors in R are nice in that you can just perform arithmetic operations between a vector and a scalar and it will apply it element-wise, so computing the absolute value of nulls - mean(x) is easy. The abs function also takes vectors as arguments.
Logical operators (such as < and >) can also be used in the same way, making it equally simple to compare the result with 1. This will yield a vector of booleans (TRUE/FALSE) where TRUE means the value at that index was indeed greater than 1, but booleans are really just numbers (1 or 0), so you can just sum that vector to find the number of TRUE elements.
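For instance, on a toy vector (deliberately not the assignment data):
v <- c(2.5, 4.1, 2.9, 5.0)
abs(v - 3) > 1        # FALSE  TRUE FALSE  TRUE
mean(abs(v - 3) > 1)  # 0.5: the proportion of elements more than 1 away from 3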
I don't know what programming level you are on, but I hope this helps without giving the solution away completely (since you said it's for an assignment).
I have tried to use the R package lpSolve, and in particular the lp.transport function, to solve an optimisation problem. In my fictitious example below I have 5 office sites that I need to resource with a minimum number of employees, and I have set up a cost matrix that gives the distance from each employee's home to each office. I want to minimize the total distance traveled to work whilst meeting the minimum number of employees per office.
Initially this was working, as I was treating all employees as equal (1). However, problems started to occur when I rated each employee by how efficient they are. For example, I now want to say that officeX needs the equivalent of 2 engineers, which might be made up of 4 engineers who are 50% efficient or 1 who is 200% efficient. When I do this, however, the solution found will split an employee across a number of offices. What I need is an additional constraint to impose that an employee can only be at 1 office.
Anyway hopefully that is enough background here is my example:
Employee <- c("Jim","John","Jonah","James","Jeremy","Jorge")
Office1 <- c(2.58321505105556, 5.13811249390279, 2.75943834864996,
6.73543614029559, 6.23080251653027, 9.00620341764497)
Office2 <- c(24.1757667923894, 19.9990724784926, 24.3538456922105,
27.9532073293925, 26.3310994833106, 14.6856664813007)
Office3 <- c(38.6957155251069, 37.9074293509861, 38.8271000719858,
40.3882569566947, 42.6658938732098, 34.2011184027657)
Office4 <- c(28.8754359274453, 30.396841941228, 28.9595182970988,
29.2042274337124, 33.3933900645023, 28.6340025144932)
Office5 <- c(49.8854888720157, 51.9164328512659, 49.948290261029,
49.4793138594302, 54.4908258333456, 50.1487397648236)
#create CostMatrix
costMat<-data.frame(Employee,Office1, Office2, Office3, Office4, Office5)
#efficiency is the worth of employees, eg if 1 they are working at 100%
#so if for example I wanted 5 Employees
#working in an office then I could choose 5 at 100% or 10 working at 50% etc...
efficiency<-c(0.8416298, 0.8207991, 0.7129663, 1.1406839, 1.3868177, 1.1989748)
#Uncomment next line to see the working version based on headcount
#efficiency<-c(1,1,1,1,1,1)
#Minimum is the minimum number of Employees we want in each office
minimum<-c(1, 1, 2, 1, 1)
#solve problem
opSol <-lp.transport(cost.mat = as.matrix(costMat[,-1]),
direction = "min",
col.signs = rep(">=",length(minimum)),
col.rhs = minimum,
row.signs = rep("==", length(efficiency)),
row.rhs = efficiency,
integers=NULL)
#view solution
opSol$solution
# My issue is that one employee is being spread across multiple offices;
# what I really want is an extra constraint that says that in a row there
# can only be 1 non-zero value.
I think this is no longer a transportation problem. However, you can still solve it as a MIP model:
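Here is a minimal sketch of such a MIP with lpSolveAPI (my own formulation of the requirement, not a quoted solution): binary variables x[e, o] = 1 if employee e sits in office o; each employee gets exactly one office, and each office must reach its minimum total efficiency. Note that with the sample efficiencies above these minima cannot actually be met by whole employees (which is exactly why lp.transport split people across offices), so solve() may report infeasibility (status 2) until the minima are relaxed:
library(lpSolveAPI)

nE <- length(efficiency)             # 6 employees
nO <- length(minimum)                # 5 offices
cost <- as.matrix(costMat[, -1])

# One binary variable per (employee, office) pair;
# column (e-1)*nO + o corresponds to x[e, o]
lp_assign <- make.lp(0, nE * nO)
set.type(lp_assign, 1:(nE * nO), "binary")
set.objfn(lp_assign, as.vector(t(cost)))  # t() so costs follow the same ordering
lp.control(lp_assign, sense = "min")

# Each employee works in exactly one office
for (e in 1:nE)
  add.constraint(lp_assign, rep(1, nO), "=", 1, indices = (e - 1) * nO + 1:nO)

# Each office must reach its minimum total efficiency
for (o in 1:nO)
  add.constraint(lp_assign, efficiency, ">=", minimum[o],
                 indices = (1:nE - 1) * nO + o)

status <- solve(lp_assign)           # 0 = optimal, 2 = infeasible
matrix(get.variables(lp_assign), nrow = nE, byrow = TRUE,
       dimnames = list(Employee, paste0("Office", 1:nO)))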