Constrained multivariate optimization in R [duplicate] - r

Can somebody help me solve this multivariate parameter optimization problem in R? I have a data set like this. This is just a subset of the data; the full dataset has dimensions n types * m regions * 12 months.
Month region type physics maths allsub
Jan r1 1 4 5 9
Feb r1 1 3 8 11
Mar r1 1 5 4 9
Apr r1 1 6 7 13
May r1 1 4 4 8
Jun r1 1 8 9 17
Jul r1 1 4 3 7
Aug r1 1 5 4 9
Sep r1 1 3 8 11
Oct r1 1 9 2 11
Nov r1 1 4 7 11
Dec r1 1 7 3 10
Jan r1 2 5 8 13
Feb r1 2 4 9 13
Mar r1 2 8 3 11
Apr r1 2 5 6 11
May r1 2 6 4 10
Jun r1 2 7 6 13
Jul r1 2 3 7 10
Aug r1 2 4 8 12
Sep r1 2 4 4 8
Oct r1 2 8 1 9
Nov r1 2 2 3 5
Dec r1 2 1 6 7
... ... .. ... ... ....
... ... .. ... ... ....
I have one more dataset which has the maximum number of physics and maths students in each region. My objective function is 100*(physics) + 65*(maths) >= 0. I want to minimize this function, and my constraints are:
1. sum of physics and maths should always be equal to allsub for that region and month.
2. total number of physics students in a region every month should be less than maximum number of physics students available in that region.
3. total number of maths students in a region every month should be less than maximum number of maths students available in that region.
I am trying to use R. The whole idea is to find the right number of physics and maths students in each region/type/month minimizing the objective function and meeting the constraints. Can someone help me with this?
EDIT : As requested in the comments.
Here is the total capacity dataset. dataframe name = totalcap
Month region physicscap mathscap
1 Jan r1 9 13
2 Feb r1 7 17
3 Mar r1 13 7
4 Apr r1 11 13
5 May r1 10 8
6 Jun r1 15 15
7 Jul r1 7 10
8 Aug r1 9 12
9 Sep r1 7 12
10 Oct r1 17 3
11 Nov r1 6 10
12 Dec r1 8 9
Here is the script I have tried,
library(dplyr)
library(MASS)
library(Rsolnp)
Month <- c('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec')
region <- c('r1')
physicscap <- c(5,5,8,6,7,9,5,6,4,10,5,8)
mathscap <- c(5,8,5,8,5,10,5,5,8,5,8,5)
totalcap <- data.frame(Month,region,physicscap,mathscap)
#Constraints for the optimization.
constraints2 <- function(efforts){
# constraints are:
# 1. effort - allsub <= 0 in each region/month
#
efforts$effort_calculated <- efforts$physics + efforts$maths
reqeff <- summarise(group_by(efforts,region,Month),monthlyeffreg=sum(effort_calculated))
reqeffallsub <- summarise(group_by(efforts,region,Month),allsubsum=sum(allsub))
cons1 <- mutate(inner_join(reqeff,reqeffallsub,by=c('region'='region','Month'='Month'))
,diff=monthlyeffreg-allsubsum)
constout <- cons1$diff
# 2. sum(physics) - total physics available <= 0 in each region/month
#
phyreqeff <- summarise(group_by(efforts,region,Month),physicseff=sum(physics))
cons2 <- mutate(inner_join(totalcap,phyreqeff,by=c('region'='region','Month'='Month')),
diff=physicseff-physicscap)
constout <- c(constout,cons2$diff)
# 3. sum(maths) - total maths available <= 0 in each region/month
#
matreqeff <- summarise(group_by(efforts,region,Month),mathseff=sum(maths))
cons3 <- mutate(inner_join(totalcap,matreqeff,by=c('region'='region','Month'='Month')),
diff=mathseff-mathscap)
constout <- c(constout,cons3$diff)
constout
}
#Objective function to minimize the cost function.
objectivefunc <- function(efforts){
nb_physics <- sum(efforts$physics)
nb_maths <- sum(efforts$maths)
objective <- (100*nb_physics + 55*nb_maths - 110)
objective
}
Out2 <- solnp(pars = efforts,fun=objectivefunc,ineqfun=constraints2,ineqLB = rep(-100000,36),
ineqUB = rep(0,36), LB = rep(0,length(u)))
Here is the error I am getting,
Error in p0/vscale[(neq + 2):(nc + np + 1)] :
non-numeric argument to binary operator
I hope this answers the questions in the comments. I tried my best here and hope someone can help me solve this.
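One likely cause of the error above (an assumption, since `efforts` itself is not shown) is that `pars` is a data frame containing character columns such as Month and region, whereas general-purpose optimizers like solnp work on a plain numeric starting vector. A minimal base R sketch of packing the decision variables into such a vector and unpacking them inside the objective; the three rows are taken from the type 1 data above, and the names `pars0` and `objective_vec` are hypothetical:

```r
# Illustrative subset: first three months of type 1 from the question
efforts <- data.frame(
  Month   = c("Jan", "Feb", "Mar"),
  region  = "r1",
  physics = c(4, 3, 5),
  maths   = c(5, 8, 4),
  stringsAsFactors = FALSE
)

# Pack the decision variables into one numeric vector: physics first, then maths
n <- nrow(efforts)
pars0 <- c(efforts$physics, efforts$maths)

# Objective written in terms of the numeric vector (hypothetical helper)
objective_vec <- function(pars) {
  physics <- pars[seq_len(n)]       # first n entries
  maths   <- pars[n + seq_len(n)]   # last n entries
  100 * sum(physics) + 65 * sum(maths)
}

objective_vec(pars0)
```

The constraint function would need the same unpacking step, with Month and region looked up from the original data frame rather than passed through the optimizer.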

Here is an approach with lpSolveAPI:
dat <- data.frame(
mon=rep(c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),2),
region="r1",
type=c(rep("1", 12), rep("2", 12)),
physicsmin=1,
mathsmin=1,
allsub=c(9, 11, 9, 13, 8, 17, 7, 9, 11, 11, 11, 10, 13,13,11,11,10,13,10,12,8,9,5,7),
stringsAsFactors=FALSE
)
dat
capdat <- data.frame(
mon=c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),
region="r1",
physicscap=c(9,7,13,11,10,15,7,9,7,17,6,8),
mathscap=c(13,17,7,13,8,15,10,12,12,3,10,9),
stringsAsFactors=FALSE
)
capdat
Now for each month/region combination an optimization problem is to be solved. That is why
we wrap the calculation in a function:
library(lpSolveAPI)
ntypes <- length(unique(dat[,"type"])) # number of types
typemap <- setNames(seq.int(ntypes), unique(dat[,"type"])) # map typename to 1,...,ntypes
solve_one <- function(subdat, capdat) {
# create object
lprec <- make.lp(0, ncol=2*ntypes) # for each type, two decision variables
# By convention, we assume that the first ntypes variables are physics for type 1, ..., ntypes
# and the second ntypes variables are maths
# add objective and type
set.objfn(lprec, obj=c(rep(100, ntypes), rep(65, ntypes)))
set.type(lprec, columns=seq.int(2*ntypes), type="integer") # no reals
# add capacity constraints
idx <- which(capdat[,"mon"]==subdat[1,"mon"] & capdat[,"region"]==subdat[1,"region"]) # lookup the right cap
add.constraint(lprec, rep(1, ntypes), type="<=", rhs=capdat[idx,"physicscap"], indices=seq.int(ntypes))
add.constraint(lprec, rep(1, ntypes), type="<=", rhs=capdat[idx,"mathscap"], indices=seq.int(ntypes+1, 2*ntypes))
# add allsub equality constraints and minimum constraints
for (typ in subdat[,"type"]) {
row <- which(subdat[,"type"]==typ) # row of this type within the current month/region subset
add.constraint(lprec, c(1,1), type="=", rhs=subdat[row, "allsub"], indices=c(typemap[typ], ntypes+typemap[typ]))
add.constraint(lprec, 1, type=">=", rhs=subdat[row,"physicsmin"], indices=typemap[typ])
add.constraint(lprec, 1, type=">=", rhs=subdat[row,"mathsmin"], indices=ntypes+typemap[typ])
}
# solution data.frame
ans <- subdat[, c("mon", "region", "type")]
# solve
if(solve(lprec)==0) {
sol <- get.variables(lprec)
for (i in seq.int(nrow(subdat))) {
ans[i, "physics"] <- sol[typemap[subdat[i,"type"]]]
ans[i, "maths"] <- sol[typemap[subdat[i,"type"]]+ntypes]
}
} else ans[,c("physics", "maths")] <- NA # no solution found
return(ans)
}
Now we apply the function to each subdataset which includes all types for each month/region combination. We
use a split/apply/combine approach here:
sp <- split(dat, list(dat[,"mon"], dat[,"region"]))
results <- lapply(sp, solve_one, capdat=capdat)
results <- do.call(rbind, results)
rownames(results) <- NULL
results
The code does not assume that all types are present for each month/region combination (some types may be omitted). However, the solution will be wrong if there are several entries for the same month/region/type combination; the code would need to be adapted for that case.
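As a sanity check on the structure of the problem: because physics + maths must equal allsub in every row, the cost can be rewritten as 65*allsub + 35*physics, so minimizing it means using as few physics students as the caps allow. A brute-force sketch in base R for January only, using allsub = 9 and 13 and caps 9 and 13 from the question; it assumes a lower bound of 0 instead of the minimum of 1 used in the data frame above, and it treats the caps as "<=":

```r
allsub     <- c(9, 13)  # January, types 1 and 2
physicscap <- 9         # January physics cap for r1
mathscap   <- 13        # January maths cap for r1

# Enumerate every integer split of allsub into physics (p) and maths (m)
grid <- expand.grid(p1 = 0:allsub[1], p2 = 0:allsub[2])
grid$m1 <- allsub[1] - grid$p1
grid$m2 <- allsub[2] - grid$p2

# Keep only the splits that respect both regional caps
ok <- grid$p1 + grid$p2 <= physicscap & grid$m1 + grid$m2 <= mathscap
feasible <- grid[ok, ]
feasible$cost <- 100 * (feasible$p1 + feasible$p2) + 65 * (feasible$m1 + feasible$m2)

min(feasible$cost)
```

For this month the caps sum to exactly the total demand (22), so every feasible split uses 9 physics and 13 maths students and has the same cost. Enumeration only scales to tiny cases; the linear program is the practical route for the full data.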

Related

How to calculate the duration of employment spells

I have monthly data on people's employment status for 1 year. The dataframe includes 4 variables: ID of a person, Country, Month and Main Activity in that specific month (Employed, Unemployed, Inactive, Other). I put an example of it here:
ID <- c(1:10, 1:10)
country <- c("AT", "BE", "CH", "CZ", "HR", "SO", "SV", "RU", "GR", "GE", "AT", "BE", "CH", "CZ", "HR",
"SO", "SV", "RU", "GR", "GE")
month <- c("Jan", "Feb", "Mar", "Apr", "May", "Aug", "Dec", "Nov", "Sep", "Jan", "Jun", "Jul", "Oct",
"Jan", "Feb", "Mar", "Apr", "May", "Aug", "Dec")
act <- c("Unemployed", "Employed", "Other", "Other", "Inactive", "Unemployed", "Employed",
"Employed", "Employed", "Unemployed", "Other", "Unemployed", "Unemployed", "Unemployed",
"Other", "Other", "Employed", "Other", "Other", "NA")
df <- data.frame(ID, country, month, act)
df[order(ID),]
ID country month act
1 1 AT Jan Unemployed
11 1 AT Jun Other
21 1 AT Nov Unemployed
2 2 BE Feb Employed
12 2 BE Jul Unemployed
22 2 BE Sep Unemployed
3 3 CH Mar Other
13 3 CH Oct Unemployed
23 3 CH Jan NA
4 4 CZ Apr Other
14 4 CZ Jan Unemployed
24 4 CZ Jun Unemployed
5 5 HR May Inactive
15 5 HR Feb Other
25 5 HR Jul Other
6 6 SO Aug Unemployed
16 6 SO Mar Other
26 6 SO Oct Employed
7 7 SV Dec Employed
17 7 SV Apr Employed
27 7 SV Nov Employed
8 8 RU Nov Employed
18 8 RU May Other
28 8 RU Jan NA
9 9 GR Sep Employed
19 9 GR Aug Other
29 9 GR Jun Inactive
10 10 GE Jan Unemployed
20 10 GE Dec NA
30 10 GE Aug Unemployed
My goal is to create a new dataframe, where every row represents one spell of employment but with the condition that prior to AND after the spell of employment had to be a spell of unemployment. So that I am able to include only spells of employment where people switched from unemployment to employment and back to unemployment and calculate the duration of these spells. Ideally in the end there would be 4 variables: PersID, Country, duration of spell, starting month, end month. It should look like this:
ID country spell_duration starting ending
1 1 AT 5 Jan May
11 1 AT 5 Jun Oct
2 2 BE 7 Feb Aug
12 2 BE 6 Jul Dec
3 3 CH 10 Mar Dec
13 3 CH 1 Oct Oct
4 4 CZ 8 Apr Nov
14 4 CZ 5 Jan May
5 5 HR 5 May Sep
15 5 HR 4 Feb May
6 6 SO 2 Aug Sep
16 6 SO 6 Mar Aug
7 7 SV 1 Dec Dec
17 7 SV 9 Apr Dec
8 8 RU 8 Nov Dec
18 8 RU 7 May Nov
9 9 GR 3 Sep Nov
19 9 GR 2 Aug Sep
10 10 GE 8 Jan Aug
20 10 GE 1 Dec Dec
I already found this solution by Maria (How to calculate number and duration of categorical spells by ID in R), but her problem is different. I don't want the overall duration of employment, and I also don't really need the number of spells.
I used the data.table package for manipulation, and the loops seem to work.
EDIT: One extra "}" left and I edited it. I've tried it and it works.
EDIT2: I added "setDT(df)" too.
library(data.table)
df <- fread(paste("ID country month act
1 AT Jan Unemployed
1 AT Jun Other
1 AT Nov Unemployed
2 BE Feb Employed
2 BE Jul Unemployed
2 BE Sep Unemployed
3 CH Mar Other
3 CH Oct Unemployed
3 CH Jan NA
4 CZ Apr Other
4 CZ Jan Unemployed
4 CZ Jun Unemployed
5 HR May Inactive
5 HR Feb Other
5 HR Jul Other
6 SO Aug Unemployed
6 SO Mar Other
6 SO Oct Employed
7 SV Dec Employed
7 SV Apr Employed
7 SV Nov Employed
8 RU Nov Employed
8 RU May Other
8 RU Jan NA
9 GR Sep Employed
9 GR Aug Other
9 GR Jun Inactive
10 GE Jan Unemployed
10 GE Dec NA
10 GE Aug Unemployed", collapse = '\n'))
setDT(df)
df[, monthInt := match(month, month.abb)]
df <- df[order(ID,monthInt)]
finalDt <- data.table()
for (i in unique(df[, ID])) {
tempT <- df[ID == i]
for (tim in 1:(nrow(tempT)-1)) {
timT <- data.table(ID = tempT[tim,ID],
country = tempT[tim, country],
spell_duration = tempT[tim+1, monthInt] - tempT[tim, monthInt],
starting = month.abb[tempT[tim, monthInt]],
ending = month.abb[tempT[tim+1, monthInt]-1])
finalDt <- rbind(finalDt,timT)
}
}
Without much thought, the first thing that came to my mind. Very cumbersome, though. I'm sure there are more elegant solutions to this, but this doesn't require any additional packages.
data <- df
Empl_spells <- data.frame(ID = c(), Start = c(), End = c())
for(user in unique(data$ID)){
# subset per user
user_dat <- data[data$ID == user,]
# initiate a list to store where changes occur and a counter for
# entries to this list
if(nrow(user_dat) > 2){
Changes_data <- list()
entry <- 1
# for every row, check if it switches from employed to unemployed
# or the opposite. Mark with "break" if some other entry interrupts
for(i in 2:nrow(user_dat)){
if(user_dat$act[i] == "Employed" &
user_dat$act[i-1] == "Unemployed"){
Changes_data[[entry]] <- c("Start", i)
entry <- entry + 1
}else if(user_dat$act[i] == "Unemployed" &
user_dat$act[i-1] == "Employed"){
Changes_data[[entry]] <- c("End", i)
entry <- entry + 1
}else if(user_dat$act[i] != "Employed" &
user_dat$act[i] != "Unemployed"){
Changes_data[[entry]] <- c("Break", i)
entry <- entry + 1
}
}
# see where an "End" immediately follows a "Start" in the new list
Changes_df <- do.call(rbind.data.frame, Changes_data)
EmplToUnempl <- which(Changes_df[-nrow(Changes_df), 1] == "Start" & Changes_df[-1, 1] == "End")
if(length(EmplToUnempl) >= 1){
append <- data.frame(ID = user,
Start = user_dat$month[as.numeric(Changes_df[EmplToUnempl, 2])],
End = user_dat$month[as.numeric(Changes_df[EmplToUnempl + 1, 2])-1])
# append the data to the data.frame for all of the people
Empl_spells <- rbind(Empl_spells, append)
}
}
}
Since I don't have your data, I didn't test this. Is this what you want?
Edit (vectorize; probably makes it faster):
data <- df
users <- unique(data$ID)
calculate <- function(user){
# subset per user
user_dat <- data[data$ID == user,]
# initiate a list to store where changes occur and a counter for
# entries to this list
if(nrow(user_dat) > 2){
Changes_data <- list()
entry <- 1
# for every row, check if it switches from employed to unemployed
# or the opposite. Mark with "break" if some other entry interrupts
for(i in 2:nrow(user_dat)){
if(user_dat$act[i] == "Employed" &
user_dat$act[i-1] == "Unemployed"){
Changes_data[[entry]] <- c("Start", i)
entry <- entry + 1
}else if(user_dat$act[i] == "Unemployed" &
user_dat$act[i-1] == "Employed"){
Changes_data[[entry]] <- c("End", i)
entry <- entry + 1
}else if(user_dat$act[i] != "Employed" &
user_dat$act[i] != "Unemployed"){
Changes_data[[entry]] <- c("Break", i)
entry <- entry + 1
}
}
# see where an "End" immediately follows a "Start" in the new list
Changes_df <- do.call(rbind.data.frame, Changes_data)
EmplToUnempl <- which(Changes_df[-nrow(Changes_df), 1] == "Start" & Changes_df[-1, 1] == "End")
if(length(EmplToUnempl) >= 1){
append <- data.frame(ID = user,
Start = user_dat$month[as.numeric(Changes_df[EmplToUnempl, 2])],
End = user_dat$month[as.numeric(Changes_df[EmplToUnempl + 1, 2])-1])
# append the data to the data.frame for all of the people
return(append)
}
}
}
empl_spells <- lapply(users, FUN = calculate)
Empl_spells <- do.call(rbind.data.frame, empl_spells)
Edit #2 (calculate duration):
MonthToNumeric <- function(x){
which(c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec") == x)
}
calcDuration <- function(Start, End){
return(MonthToNumeric(End) - MonthToNumeric(Start) + 1)
}
Empl_spells$Duration <- mapply(FUN = calcDuration, Start = Empl_spells[, 2], End = Empl_spells[, 3])
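A more compact alternative to the row-by-row loops is run-length encoding with base R's rle(). The sketch below is not taken from the answers above; `find_spells` is a hypothetical helper, and it assumes one person's records are already sorted by month with exactly one row per consecutive month. It keeps only "Employed" runs whose neighbouring runs are both "Unemployed":

```r
# Hypothetical helper: employment spells directly flanked by unemployment
find_spells <- function(act, month_int) {
  r <- rle(act)                      # consecutive runs of the same status
  n <- length(r$values)
  ends   <- cumsum(r$lengths)        # last row of each run
  starts <- ends - r$lengths + 1     # first row of each run
  k <- which(r$values == "Employed") # candidate runs (NA statuses are dropped)
  k <- k[k > 1 & k < n]              # need a run before and a run after
  # %in% is NA-safe, unlike ==
  k <- k[r$values[k - 1] %in% "Unemployed" & r$values[k + 1] %in% "Unemployed"]
  data.frame(
    starting = month.abb[month_int[starts[k]]],
    ending   = month.abb[month_int[ends[k]]],
    spell_duration = month_int[ends[k]] - month_int[starts[k]] + 1,
    stringsAsFactors = FALSE
  )
}

# Made-up example for one person: only the Feb to Apr spell qualifies,
# because the Jul spell is preceded by "Other"
act <- c("Unemployed", "Employed", "Employed", "Employed",
         "Unemployed", "Other", "Employed", "Unemployed")
spells <- find_spells(act, month_int = 1:8)
spells
```

To cover all people, the helper could be applied per ID, for example with split() and lapply(), and the ID and country columns added back afterwards.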

Create id with a pair of two sequenced number

In my data, I have a year-week column and want to create a variable that groups every two consecutive weeks.
For example, here is my column x:
x = c(201336, 201336, 201336, 201337, 201337, 201340, 201341, 201341, 201342, 201343, 201344, ...)
In x, 201336 means 36th week of 2013, 201337 means 37th week of 2013 and so on. I want to indicate that
(201336,201337) -> 1
(201338,201339) -> 2
(201340,201341) -> 3
(201342,201343) -> 4
(201344,201345) -> 5
and so on
So my desired vector is:
x2 = c(1,1,1,1,1,3,3,3,4,4,5,...)
If there is only one year, as @Psidom mentioned above, it seems to be:
(x - min(x))%/%2 + 1
[update]
If your vector contains entries for different years:
set.seed(112358)
x <- sample(1992:2017,1000, replace = T)
x <- as.integer((x + sample(seq(.1,.52,length.out=length(1992:2017)),
1000,replace = T)
)*100)
x <- x[order(x)]
names(x) <- x%/%100
head(x, 11)
#1992 1992 1992 1992 1992 1992 1992 1992 1992 1992 1992
#199210 199211 199213 199213 199215 199216 199220 199220 199220 199221 199221
tail(x, 11)
#2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017
#201736 201738 201740 201740 201741 201741 201743 201745 201746 201750 201752
foo <- function(x){
return((x-min(x))%/%2+1)
}
xm <- unlist(sapply(unique(names(x)), function(i) foo(x[names(x) == i])), use.names = F)
mdf <- data.frame(original = x, modified = xm)
head(mdf)
# original modified
#1 199210 1
#2 199211 1
#3 199213 2
#4 199213 2
#5 199215 3
#6 199216 4
tail(mdf)
# original modified
#995 201741 16
#996 201743 17
#997 201745 18
#998 201746 19
#999 201750 21
#1000 201752 22
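The per-year grouping can also be written with base R's ave(), which applies a function within groups and returns the result in the original order. This is a sketch of the same idea as foo() above, shown on a small made-up vector; it assumes the year is the part of the value above the last two digits:

```r
# Made-up year-week values spanning two years
x <- c(201336, 201337, 201340, 201401, 201402, 201403)

# Within each year (x %/% 100), pair weeks relative to that year's first observed week
pair_id <- ave(x, x %/% 100, FUN = function(v) (v - min(v)) %/% 2 + 1)
pair_id
```

As with foo(), the pairs are counted from the first week that appears in each year, not from week 1.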
This should work for you:
x = c(201336, 201336, 201336, 201337, 201337, 201340, 201341, 201341, 201342, 201343, 201344)
as.integer((x - 201336) / 2) + 1
# [1] 1 1 1 1 1 3 3 3 4 4 5

Read Data into Time Series Object in R

My data looks as follows:
Month/Year;Number
01/2010; 1.0
02/2010;19.0
03/2010; 1.0
...
How can I read this into a ts(object) in R?
Try this (assuming your data is called df)
ts(df$Number, start = c(2010, 01), frequency = 12)
## Jan Feb Mar
## 2010 1 19 1
Edit: this will work only if you don't have missing dates and your data is in the correct order. For a more general solution, see @Ananda's answer below.
I would recommend using zoo as a starting point. This will ensure that if there are any month/year combinations missing, they would be handled properly.
Example (notice that data for April is missing):
mydf <- data.frame(Month.Year = c("01/2010", "02/2010", "03/2010", "05/2010"),
Number = c(1, 19, 1, 12))
mydf
# Month.Year Number
# 1 01/2010 1
# 2 02/2010 19
# 3 03/2010 1
# 4 05/2010 12
library(zoo)
as.ts(zoo(mydf$Number, as.yearmon(mydf$Month.Year, "%m/%Y")))
# Jan Feb Mar Apr May
# 2010 1 19 1 NA 12
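If adding zoo is not an option, the same gap handling can be sketched in base R by building the complete monthly sequence and matching the observed values onto it. This is an illustrative alternative, not part of the answer above; the names `dates`, `all_months` and `myts` are made up for the example:

```r
mydf <- data.frame(Month.Year = c("01/2010", "02/2010", "03/2010", "05/2010"),
                   Number = c(1, 19, 1, 12))

# Parse "mm/yyyy" as the first day of that month
dates <- as.Date(paste0("01/", mydf$Month.Year), format = "%d/%m/%Y")

# Every month between the first and last observation
all_months <- seq(min(dates), max(dates), by = "month")

# Months without an observation become NA
values <- mydf$Number[match(all_months, dates)]

first <- min(dates)
myts <- ts(values,
           start = c(as.integer(format(first, "%Y")), as.integer(format(first, "%m"))),
           frequency = 12)
myts
```

The match() step also puts the observations in date order, so the input rows do not need to be sorted beforehand.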

forecast throws error K must be not be greater than period/2

I issue the following commands:
ops <- read.csv("ops.csv")
ops.ts <- ts(ops, frequency=12, start=c(2014,1))
ops.fc <- forecast(ops.ts)
forecast() then throws the following error:
Error in ...fourier(x, K, 1:length(x)) :
K must be not be greater than period/2
The data from the csv looks like this according to summary(ops):
1 10
2 3
3 7
4 4
5 2
6 20
7 13
8 9
9 8
10 7
11 6
12 11
13 7
R is up to date, Forecast is installed via CRAN.
I appreciate any advice, especially because I am quite new to R.
The error message is self-explanatory.
You have 13 elements in your dataset so when you do:
ops.ts <- ts(ops, frequency = 12, start=c(2014, 1))
You get (notice the 2015 value here):
#> ops.ts
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#2014 10 3 7 4 2 20 13 9 8 7 6 11
#2015 7
I'm guessing you only want to use the first 12 months and then use forecast() ? If that is the case you can do either:
ops.ts <- ts(ops, frequency = 12, start = 2014, end = c(2015, 0))
ops.fc <- forecast(ops.ts)
or
ops <- ops[1:12, ]
ops.ts <- ts(ops, frequency = 12, start = 2014)
ops.fc <- forecast(ops.ts)
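Another way to keep only the first year, sketched here with base R's window(), is to build the full series first and then cut it. The values are the 13 numbers from the question, entered as a plain vector; forecast() itself is not called, so nothing here depends on the forecast package:

```r
# The 13 monthly values shown in the question
ops <- c(10, 3, 7, 4, 2, 20, 13, 9, 8, 7, 6, 11, 7)
ops.ts <- ts(ops, frequency = 12, start = c(2014, 1))

# Keep January to December 2014 only
ops.first_year <- window(ops.ts, end = c(2014, 12))
length(ops.first_year)
```

The trimmed series can then be passed to forecast() in the same way as in the two variants above.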
