GAMS Error in MIP Transportation Problem - Uncontrolled set entered as constant

I am trying to formulate a MIP model in which shipments can be carried either by the available trains or by investing in new ships. My current code includes three tables: monthly costs for trains, monthly costs for ships, and initial investment costs for ships.
GAMS raises error 149 at the "cost.. z =e=" line: Uncontrolled set entered as constant. Errors 257 and 141 are also raised at the 56th and 57th rows, respectively.
Sets
i supply nodes /Plant1, Plant2, Plant3, Plant4/
j demand nodes /City1, City2, City3, City4, City5, Dummy/;
Parameters
a(i) supply capacities
/Plant1 290
Plant2 220
Plant3 180
Plant4 280/
b(j) demands
/City1 180
City2 200
City3 160
City4 140
City5 250
Dummy 40/;
Table c1(i,j) transport costs for trains
City1 City2 City3 City4 City5 Dummy
Plant1 8.5 7 8 6.5 9 0
Plant2 7.5 8 7 10 8.5 0
Plant3 11 6 6.5 8 7 0
Plant4 9 7 12 6 7.5 0 ;
Table c2(i,j) transport costs for ships
City1 City2 City3 City4 City5 Dummy
Plant1 5.5 6 99999 3.5 4 0
Plant2 3 4.5 4 6.5 6 0
Plant3 99999 99999 3 4 4.5 0
Plant4 5 4.5 7 3 99999 0 ;
Table in(i,j) investment costs for ships
City1 City2 City3 City4 City5 Dummy
Plant1 40 90 99999 40 80 0
Plant2 60 40 80 20 40 0
Plant3 99999 99999 80 60 100 0
Plant4 100 60 60 80 99999 0 ;
Positive Variables
x(i,j) flow between supply node i and demand node j;
Variables
y(i,j) whether a ship is bought for the transfer from i to j
z total cost;
Binary Variables y;
Equations
cost objective function
supply(i) supply constraint
demand(j) demand constraint;
cost.. z =e= sum((i,j), c1(i,j)*x(i,j)*12*10*(1-y(i,j)) + c2(i,j)*x(i,j)*12*10*y(i,j)) + in(i,j)*y(i,j);
supply(i).. sum(j, x(i,j)) =l= a(i);
demand(j).. sum(i, x(i,j)) =g= b(j);
Model homework1c /all/;
homework1c.OPTFILE=1;
Solve homework1c using MIP minimizing z;
Display x.l, x.M, y.l;
I would appreciate any suggestions to fix them, thanks in advance.

I see two issues:
The indices i and j in the final term + in(i,j)*y(i,j) of your cost equation are not controlled: the closing parenthesis of the sum comes before that term, so it sits outside the sum. Was this term supposed to be part of the sum over i and j?
You are trying to solve a MIP, which must be linear, but the cost equation multiplies the variable x by the variable y. So you need to either solve the model as a MINLP or reformulate your cost equation.
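One possible reformulation, sketched in Python rather than GAMS and only as an illustration: split the flow into a train part and a ship part, and let y switch the ship part on through a big-M bound. The names x_train, x_ship and BIG_M are assumptions that do not appear in the question; the cost figures are the Plant1 to City1 entries from the tables.

```python
# Illustrative sketch only: x_train, x_ship and BIG_M are hypothetical names,
# not part of the original GAMS model.
PERIODS = 12 * 10  # the 12*10 factor used in the question's cost equation
BIG_M = 290        # assumed bound: the largest supply capacity in a(i)


def bilinear_cost(c1, c2, inv, x, y):
    """Cost term as written in the question (x multiplied by y)."""
    return c1 * x * PERIODS * (1 - y) + c2 * x * PERIODS * y + inv * y


def linear_cost(c1, c2, inv, x_train, x_ship, y):
    """Same cost with separate train and ship flows: linear in every variable."""
    return c1 * x_train * PERIODS + c2 * x_ship * PERIODS + inv * y


def ship_flow_allowed(x_ship, y):
    """Big-M linking constraint: ship flow is possible only if the ship is bought."""
    return 0 <= x_ship <= BIG_M * y


# Plant1 -> City1 figures from the tables: train 8.5, ship 5.5, investment 40.
for y in (0, 1):
    x = 100
    x_train, x_ship = x * (1 - y), x * y  # all flow goes by one mode in this check
    assert ship_flow_allowed(x_ship, y)
    assert bilinear_cost(8.5, 5.5, 40, x, y) == linear_cost(8.5, 5.5, 40, x_train, x_ship, y)
```

In a model built this way, the supply and demand constraints would apply to x_train + x_ship. Note that this version lets train and ship flow coexist on a route when y = 1, which the original product form does not; whether that matters depends on the intended model.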

Related

Nested logit model using panel data in R

I am new to R and would be grateful for help with this, because I am having serious difficulties.
I have unbalanced panel data that shows companies' monthly performance compared to the rest of the market in dollar terms (e.g., this month company 1 made $1000 more than the market average). Each of these companies decided on a strategy (1 through 8) when it entered the market. These strategies are nested into two groups (a, b): strategies 1, 2, and 3 belong to group a, while strategies 4 through 8 belong to group b. I need a ranking of the strategies from best to worst.
I have discretized my DV so that it now only shows whether a company performed higher or lower than the market in a given month. However, I am not sure this is the right approach, because I lose the information about how much better or worse each company performed compared to the market.
My data looks like this:
ID  Main  Strategy  YearMonth  DiffPerformance  Control1  Control2  DiffPerformanceHL
1   a     2         201706     9.037            2         57        H
1   a     2         201707     4.371            2         57        H
1   a     2         201708     1.633            2         57        H
1   a     2         201709     -3.521           2         59        L
1   a     2         201710     13.096           2         59        H
1   a     2         201711     5.070            2         60        H
1   a     2         201712     4.25             2         60        H
2   b     5         201904     6.78             4         171       H
2   b     5         201905     -15.26           4         169       L
2   b     5         201906     7.985            4         169       H
Here ID is the company; Main is the group (a or b); Strategy runs from 1 through 8 and is nested as described above; YearMonth is the specific month; DiffPerformance is the DV as a continuous variable; Control1 is a categorical variable (1 through 6) that is static over time; Control2 is a count variable that changes over time; and DiffPerformanceHL is the discretized DV.
Can you please help me figure out how to create a nested logit model in R? I would be very grateful.
Thanks

How to estimate additional variables in a df by group in R?

I have a data frame, consisting of stand ID, Treatment type, revision, and tree diameter. I want to estimate an additional variable - Quadratic mean diameter (and other variables) for every stand, revision and treatment separately, using a function: sqrt(sum(dia^2)/n).
An example of my dataset:
ID Rev Treat dia
1523 1 A 7.549834
1523 1 A 4.500000
1523 1 B 1.500000
1523 1 B 2.949576
1523 2 A 6.348228
1523 2 A 2.900000
1523 2 B 3.400000
1523 2 B 6.449806
1545 1 A 2.349468
1545 1 A 5.249762
1545 1 B 6.249800
1545 1 B 8.748714
1545 2 A 0.100000
1523 2 A 0.100000
1523 2 B 3.200000
1523 2 B 3.200000
So, basically what I want is an estimate of Dq for 1) Stand 1523, Rev 1, Treat A; 2) Stand 1523, Rev 1, Treat B; 3) Stand 1523, Rev 2, Treat A; and so on.
My dataset is much larger, consisting of 4 treatments, 6 revisions and 8 stands. A loop would be one option, I guess, but is there an easier way to do this?
Here is one way using dplyr:
library(dplyr)
data.df %>%
group_by(ID, Rev, Treat) %>%
summarise(quadratic_mean_diameter = sqrt(sum(dia^2)/length(dia)))
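For readers who want to see the group-wise calculation spelled out, here is a minimal sketch in plain Python (not R, and not part of the original answer). It applies sqrt(sum(dia^2)/n) to each (ID, Rev, Treat) group, using the first six rows of the example data; the function name quadratic_mean_by_group is made up for illustration.

```python
import math
from collections import defaultdict

# First rows of the example data: (ID, Rev, Treat, dia)
rows = [
    (1523, 1, "A", 7.549834),
    (1523, 1, "A", 4.500000),
    (1523, 1, "B", 1.500000),
    (1523, 1, "B", 2.949576),
    (1523, 2, "A", 6.348228),
    (1523, 2, "A", 2.900000),
]


def quadratic_mean_by_group(records):
    """Return {(ID, Rev, Treat): sqrt(sum(dia^2) / n)} for each group."""
    groups = defaultdict(list)
    for stand, rev, treat, dia in records:
        groups[(stand, rev, treat)].append(dia)
    return {
        key: math.sqrt(sum(d * d for d in dias) / len(dias))
        for key, dias in groups.items()
    }


dq = quadratic_mean_by_group(rows)
for key in sorted(dq):
    print(key, round(dq[key], 3))
```

This mirrors what group_by() followed by summarise() does: one result per distinct combination of the grouping columns.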

Mutation of non-conformable arrays

library(boot)
install.packages("AMORE")
library(AMORE)
l.data=nrow(melanoma)
set.seed(5)
idxTrain<-sample(1:l.data,100)
idxTest<-setdiff(1:l.data,idxTrain)
set.seed(3)
net<-newff(n.neurons=c(6,6,3),
learning.rate.global=0.02,
momentum.global=0.5,
hidden.layer="sigmoid",
output.layer="purelin",
method="ADAPTgdwm",
error.criterium="LMS")
result<-train(net,
melanoma[idxTrain,-2],
melanoma$status,
error.criterium="LMS",
report=TRUE,
show.step=10,
n.shows=800)
The problem is that the train() call fails with the error "target - non-conformable arrays".
I know that the problem is with melanoma$status, but I have no idea how to alter the data accordingly. Any ideas? Here are a few sample rows (in case you don't use the boot package from RStudio).
melanoma:
time status sex age year thickness ulcer
1 10 3 1 76 1972 6.76 1
2 30 3 1 56 1968 0.65 0
3 35 2 1 41 1977 1.34 0
4 99 3 0 71 1968 2.90 0
5 185 1 1 52 1965 12.08 1
Your target should first be restricted to the training indices. In addition, it should be one-hot encoded, with one column per class. Something like this:
net<-newff(n.neurons=c(6,6,3),
learning.rate.global=0.02,
momentum.global=0.5,
hidden.layer="sigmoid",
output.layer="purelin",
method="ADAPTgdwm",
error.criterium="LMS")
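# One-hot target: one row per training case, one column per status class (1..3)
Target = matrix(data=0, nrow=length(idxTrain), ncol=3)
status_mat=matrix(nrow=length(idxTrain), ncol=2)
status_mat[,1] = c(1:length(idxTrain)) # row index
status_mat[,2] = melanoma$status[idxTrain] # class label
# R matrices are column-major, so element (row, class) has linear index (class-1)*nrow + row
Target[(status_mat[,2]-1)*length(idxTrain)+status_mat[,1]]=1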
result<-train(net,
melanoma[idxTrain,-2],
Target,
error.criterium="LMS",
report=TRUE,
show.step=10,
n.shows=800)
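The one-hot idea used for Target can be shown in isolation. The following is a small Python sketch, not the AMORE API: the helper one_hot is hypothetical, and it assumes the labels are integers 1..n_classes, as melanoma$status is in the sample rows.

```python
def one_hot(labels, n_classes):
    """Return one row per label, with a 1 in column (label - 1) and 0 elsewhere.

    Labels are assumed to be integers 1..n_classes, like melanoma$status.
    """
    rows = []
    for label in labels:
        if not 1 <= label <= n_classes:
            raise ValueError(f"label {label} outside 1..{n_classes}")
        row = [0] * n_classes
        row[label - 1] = 1
        rows.append(row)
    return rows


# status values of the five sample rows shown in the question
status = [3, 3, 2, 3, 1]
target = one_hot(status, 3)
for row in target:
    print(row)
```

The resulting matrix has as many rows as training cases and as many columns as classes, which is the shape the three output neurons in n.neurons=c(6,6,3) expect.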

How to prepare my data for a factorial repeated measures analysis?

Currently, my data frame is in wide format, and I want to do a factorial repeated measures analysis with two between-subject factors (sex & org) and one within-subject factor (task type). Below I've illustrated how my data looks with a sample (the actual dataset has a lot more variables). The variables starting with '1_' and '2_' belong to measurements during task 1 and task 2, respectively. This means that 1_FD_H_org and 2_FD_H_org are the same measurement taken during tasks 1 and 2, respectively.
id sex org task1 task2 1_FD_H_org 1_FD_H_text 2_FD_H_org 2_FD_H_text 1_apv 2_apv
2 F T Correct 2 69.97 68.9 116.12 296.02 10 27
6 M T Correct 2 53.08 107.91 73.73 333.15 16 21
7 M T Correct 2 13.82 30.9 31.8 78.07 4 9
8 M T Correct 2 42.96 50.01 88.81 302.07 4 24
9 F H Correct 3 60.35 102.9 39.81 96.6 15 10
10 F T Incorrect 3 78.61 80.42 55.16 117.57 20 17
I want to analyze whether there is a difference between the two tasks on e.g. FD_H_org for the different groups/conditions (sex & org).
How do I reshape my data so I can analyze it with a model like this?
ezANOVA(data=df, dv=.(FD_H_org), wid=.(id), between=.(sex, org), within=.(task))
I think that the correct format of my data should look like this:
id sex org task outcome FD_H_org FD_H_text apv
2 F T 1 Correct 69.97 68.9 10
2 F T 2 2 116.12 296.02 27
6 M T 1 Correct 53.08 107.91 16
6 M T 2 2 73.73 333.15 21
But I'm not sure. I tried to achieve this with the reshape2 package but couldn't figure out how to do it. Can anybody help?
I think you probably need to rebuild it by binding the two subsets of columns together with rbind(). The only issue is that your two outcome columns have different data types, so I coerced them both to text:
require(plyr)
dt<-read.table(file="dt.txt",header=TRUE,sep=" ") # this was to bring in your data
newtab=rbind(
ddply(dt,.(id,sex,org),summarize, task=1, outcome=as.character(task1), FD_H_org=X1_FD_H_org, FD_H_text=X1_FD_H_text, apv=X1_apv),
ddply(dt,.(id,sex,org),summarize, task=2, outcome=as.character(task2), FD_H_org=X2_FD_H_org, FD_H_text=X2_FD_H_text, apv=X2_apv)
)
newtab[order(newtab$id),]
id sex org task outcome FD_H_org FD_H_text apv
1 2 F T 1 Correct 69.97 68.90 10
7 2 F T 2 2 116.12 296.02 27
2 6 M T 1 Correct 53.08 107.91 16
8 6 M T 2 2 73.73 333.15 21
3 7 M T 1 Correct 13.82 30.90 4
9 7 M T 2 2 31.80 78.07 9
4 8 M T 1 Correct 42.96 50.01 4
10 8 M T 2 2 88.81 302.07 24
5 9 F H 1 Correct 60.35 102.90 15
11 9 F H 2 3 39.81 96.60 10
6 10 F T 1 Incorrect 78.61 80.42 20
12 10 F T 2 3 55.16 117.57 17
EDIT - obviously you don't need plyr for this (and it may slow it down) unless you're doing further transformations. This is the code with no non-standard dependencies:
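newcolnames<-c("id","sex","org","task","outcome","FD_H_org","FD_H_text","apv")
# id, sex, org, org again (placeholder for task), task1, then the three 1_ measurements
t1<-dt[,c(1,2,3,3,4,6,7,10)]
t1$org.1<-1 # overwrite the placeholder column with the task number
colnames(t1)<-newcolnames
# same layout for task 2: task2 and the three 2_ measurements
t2<-dt[,c(1,2,3,3,5,8,9,11)]
t2$org.1<-2
t2$task2<-as.character(t2$task2)
colnames(t2)<-newcolnames
newt<-rbind(t1,t2)
newt[order(newt$id),]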

How to choose the best splitting attribute with same gain information

I am computing, step by step, how CART (Classification and Regression Trees) chooses the best attribute to split on, using this training data set:
Car Age Children Location
1 sedan 23 0 yes
2 sports 31 1 no
3 sedan 36 1 no
4 truck 25 2 no
5 sports 30 0 no
6 sedan 36 0 no
7 sedan 25 0 yes
8 truck 36 1 no
9 sedan 30 2 yes
10 sedan 31 1 yes
11 sports 25 0 no
12 truck 45 0 yes
Results given by R:
n= 12
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 12 5 no (0.5833333 0.4166667)
2) Car=sports,truck 6 1 no (0.8333333 0.1666667)
4) Age
5) Age>=40.5 1 0 yes (0.0000000 1.0000000) *
3) Car=sedan 6 2 yes (0.3333333 0.6666667)
6) Age>=33.5 2 0 no (1.0000000 0.0000000) *
7) Age
For the root node, Gini(root)=0.486:
- with the Car attribute, GainGini(Car)=0.1255;
- with the Age attribute, I got the same gain with thresholds 27.5 and 33.5. Which one should be chosen when GainGini(Age) is maximized?
- with the Children attribute, the 2 child nodes are very pure, so GainGini(Children)=0.486.
My first question is: why does the tree use the Car attribute for the first split?
For the first right child node, Gini(node2)=0.444:
- with the Age attribute, threshold 33.5 gives GainGini(Age)=0.444;
- with the Children attribute, same as the root node (all instances are pure): GainGini(Children)=0.444.
This is my second question: how does CART choose the split attribute when two attributes have the same gain?
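The hand calculation can be cross-checked with a short script. This is a minimal Python sketch of the Gini gain for a binary split, assuming that Location (yes/no) is the class label; it is not rpart's implementation, and the three candidate splits tested are only examples.

```python
from collections import Counter

# Training set from the question: (Car, Age, Children, Location).
# Assumption: Location (yes/no) is the class label.
DATA = [
    ("sedan", 23, 0, "yes"), ("sports", 31, 1, "no"), ("sedan", 36, 1, "no"),
    ("truck", 25, 2, "no"), ("sports", 30, 0, "no"), ("sedan", 36, 0, "no"),
    ("sedan", 25, 0, "yes"), ("truck", 36, 1, "no"), ("sedan", 30, 2, "yes"),
    ("sedan", 31, 1, "yes"), ("sports", 25, 0, "no"), ("truck", 45, 0, "yes"),
]


def gini(labels):
    """Gini impurity 1 - sum(p_k^2); an empty node counts as pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(rows, goes_left):
    """Parent impurity minus the size-weighted impurity of the two children."""
    labels = [r[-1] for r in rows]
    left = [r[-1] for r in rows if goes_left(r)]
    right = [r[-1] for r in rows if not goes_left(r)]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
    return gini(labels) - weighted


root_gini = gini([r[-1] for r in DATA])
gain_car = gini_gain(DATA, lambda r: r[0] in ("sports", "truck"))
gain_age = gini_gain(DATA, lambda r: r[1] < 33.5)
gain_children = gini_gain(DATA, lambda r: r[2] < 0.5)
print(round(root_gini, 3), round(gain_car, 3), round(gain_age, 3), round(gain_children, 3))
```

For these 12 rows the sketch gives a root Gini of about 0.486 and a gain of 0.125 for the Car split, in line with the figures above. The Age and Children splits tested here come out smaller than the Car gain, so the values computed by hand for those attributes may be worth rechecking.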
