Knapsack 0-1 in R

I have been trying to formulate a simple knapsack problem, but I cannot see why it is not working.
i <- c(1, 2, 3, 4)
v <- c(100, 80, 10, 120)
w <- c(10, 5, 10, 4)
k <- 15
F <- function(i, k) {
  if (i == 0 | k == 0) {
    output <- 0
  } else if (k < w[i]) {
    output <- F(i-1, w)
  } else {
    output <- max(v[i] + F(i-1, k-w[i]), F(i-1, k))
  }
  return(output)
}

Having a look at the knapsack function from the package adagio should help you, where w is the vector of weights, p the vector of profits, and cap is your k (see ?knapsack):
knapsack <- function(w, p, cap) {
  n <- length(w)
  x <- logical(n)
  F <- matrix(0, nrow = cap + 1, ncol = n)
  G <- matrix(0, nrow = cap + 1, ncol = 1)
  for (k in 1:n) {
    F[, k] <- G
    H <- c(numeric(w[k]), G[1:(cap + 1 - w[k]), 1] + p[k])
    G <- pmax(G, H)
  }
  fmax <- G[cap + 1, 1]
  f <- fmax
  j <- cap + 1
  for (k in n:1) {
    if (F[j, k] < f) {
      x[k] <- TRUE
      j <- j - w[k]
      f <- F[j, k]
    }
  }
  inds <- which(x)
  wght <- sum(w[inds])
  prof <- sum(p[inds])
  return(list(capacity = wght, profit = prof, indices = inds))
}
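For the data in the question this picks items 1 and 4 (weight 14, profit 220; I traced the recurrence by hand, so treat the printed result as a sanity check):
knapsack(w = c(10, 5, 10, 4), p = c(100, 80, 10, 120), cap = 15)
#> $capacity
#> [1] 14
#>
#> $profit
#> [1] 220
#>
#> $indices
#> [1] 1 4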
However, the problems in your function seem to be:
You did not declare all the objects used in your function (w and v): they should be parameters of the function rather than globals it happens to find.
In the k < w[i] branch the recursive call is F(i-1, w), which passes the whole weight vector as the remaining capacity instead of k, so the recursion works on the wrong value. It is also safer not to name the function F, since that masks R's built-in shorthand for FALSE.
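A minimal corrected sketch (the name ks is purely illustrative): w and v become parameters, and the k < w[i] branch passes k on:
ks <- function(i, k, w, v) {
  if (i == 0 || k == 0) {
    0
  } else if (k < w[i]) {
    ks(i-1, k, w, v)
  } else {
    max(v[i] + ks(i-1, k-w[i], w, v), ks(i-1, k, w, v))
  }
}
ks(4, 15, w = c(10, 5, 10, 4), v = c(100, 80, 10, 120))
#> [1] 220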

Is there any way to improve performance (e.g. vectorize) this look-up and recoding problem implemented by a for loop?

I need to recode data sets of the following form.
# List elements of varying length
set.seed(12345)
n = 1e3
m = sample(2:5, n, T)
V = list()
for(i in 1:n) {
  for(j in 1:m[i]) {
    if(j == 1) V[[i]] = 0 else V[[i]][j] = V[[i]][j-1] + rexp(1, 1/10)
  }
}
As an example consider
[1] 0.00000 23.23549 30.10976
Each list element contains an ascending vector of length m[i], starting with 0 and ending somewhere in the positive real numbers.
Now, consider a value s, where s is smaller than the maximum v_m of each V[[i]]. Also let v_m_ denote the (m-1)-th element of V[[i]]. Our goal is to find all elements of V[[i]] that lie between v_m_ - s and v_m - s. In the example above, if s = 5, the desired vector v would be 23.23549. v can contain more elements if the interval encloses more values. As an example consider:
> V[[1]]
[1] 0.000000 2.214964 8.455576 10.188048 26.170458
If we now let s = 16, the resulting vector is 0 2.214964 8.455576, so it has length 3. The code below implements this procedure using a for loop. It returns v in a list for all n. Note that I also attach the (upper/lower) bound before/after v if the bound led to a reduction in the length of v (in other words, if the bound has a positive value).
This loop is too slow in my application because n is large and the procedure is part of a larger algorithm that has to be run many times with some parameters changing. Is there a way to obtain the result faster than with a for loop, for example using vectorization? I know lapply in general is not faster than for.
# Series maximum and one below maximum
v_m = sapply(V, function(x) x[length(x)])
v_m_ = sapply(V, function(x) x[length(x)-1])
# Set some offsets s
s = runif(n, 0, v_m)
# Procedure
d1 = (v_m_ - s)
d2 = (v_m - s)
if(sum(d2 < 0) > 0) stop('s cannot be larger than series maximum.')
# For loop - can this be done faster?
W = list()
for(i in 1:n){
  v = V[[i]]
  l = length(v)
  v = v[v > d1[i]]
  if(l > length(v)) v = c(d1[i], v)
  l = length(v)
  v = v[v < d2[i]]
  if(l > length(v)) v = c(v, d2[i])
  W[[i]] = v
}
I guess you can try mapply like below (note the sanity check now tests v[2, ] - s, i.e., d2, against 0, matching the check in the question):
V <- lapply(m, function(i) c(0, cumsum(rexp(i - 1, 1/10))))
v <- sapply(V, tail, 2)  # row 1 holds v_m_, row 2 holds v_m
s <- runif(n, 0, v[1, ])
if (sum(v[2, ] - s < 0) > 0) stop("s cannot be larger than series maximum.")
W <- mapply(
  function(x, lb, ub) c(pmax(lb, 0), x[x >= lb & x <= ub], pmin(ub, max(x))),
  V,
  v[1, ] - s,
  v[2, ] - s
)
I don't think vectorization will be an option since the operation goes from a list of unequal-length vectors to another list of unequal-length vectors.
For example, the following vectorizes all the comparisons, but the unlist/relist operations are too expensive (not to mention the final lapply(..., unique)). Stick with the for loop.
W <- lapply(
  relist(
    pmax(
      pmin(unlist(V), rep.int(d2, lengths(V))),
      rep.int(d1, lengths(V))
    ),
    V
  ),
  unique
)
I see two things that will give modest gains in speed. First, if s is always greater than 0, your final if statement will always evaluate to TRUE, so it can be skipped, simplifying some of the code. The second is to pre-allocate W. These are both implemented in fRecode2 below. A third thing that gives a slight gain is to avoid multiple reassignments to v. This is implemented in fRecode3 below.
For additional speed, move to Rcpp: it will allow the vectors in W to be built via a single pass through each vector element in V instead of two (a sketch follows after the benchmark below).
set.seed(12345)
n <- 1e3
m <- sample(2:5, n, T)
V <- lapply(m, function(i) c(0, cumsum(rexp(i - 1, 1 / 10))))
v_m <- sapply(V, function(x) x[length(x)])
v_m_ <- sapply(V, function(x) x[length(x)-1])
s <- runif(n,0,v_m)
d1 <- (v_m_ - s)
d2 <- (v_m - s)
if(sum(d2 < 0) > 0) stop('s cannot be larger than series maximum.')
fRecode1 <- function() {
  # original function
  W = list()
  for(i in 1:n){
    v = V[[i]]
    l = length(v)
    v = v[v > d1[i]]
    if(l > length(v)) v = c(d1[i], v)
    l = length(v)
    v = v[v < d2[i]]
    if(l > length(v)) v = c(v, d2[i])
    W[[i]] = v
  }
  W
}
fRecode2 <- function() {
  W <- vector("list", length(V))
  i <- 0L
  for(v in V){
    l <- length(v)
    v <- v[v > d1[i <- i + 1L]]
    if (l > length(v)) v <- c(d1[i], v)
    W[[i]] <- c(v[v < d2[i]], d2[[i]])
  }
  W
}
fRecode3 <- function() {
  W <- vector("list", length(V))
  i <- 0L
  for(v in V){
    idx1 <- sum(v <= d1[i <- i + 1L]) + 1L
    idx2 <- sum(v < d2[i])
    if (idx1 > 1L) {
      if (idx2 >= idx1) {
        W[[i]] <- c(d1[i], v[idx1:idx2], d2[i])
      } else {
        W[[i]] <- c(d1[i], d2[i])
      }
    } else {
      W[[i]] <- c(v[1:idx2], d2[i])
    }
  }
  W
}
microbenchmark::microbenchmark(fRecode1 = fRecode1(),
                               fRecode2 = fRecode2(),
                               fRecode3 = fRecode3(),
                               times = 1e3,
                               check = "equal")
#> Unit: milliseconds
#>      expr    min      lq     mean  median      uq     max neval
#>  fRecode1 2.0210 2.20405 2.731124 2.39785 2.80075 12.7946  1000
#>  fRecode2 1.2829 1.43315 1.917761 1.54715 1.88495 51.8183  1000
#>  fRecode3 1.2710 1.38920 1.741597 1.45640 1.76225  5.4515  1000
Not a huge speed boost: fRecode3 shaves just under a microsecond on average for each vector in V.
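As for the Rcpp suggestion above, here is a minimal sketch, assuming the Rcpp package is installed (recode_cpp is an illustrative name, not an existing function). The two filter-and-attach passes collapse into one loop per vector:
library(Rcpp)
cppFunction('
List recode_cpp(List V, NumericVector d1, NumericVector d2) {
  int n = V.size();
  List W(n);
  for (int i = 0; i < n; ++i) {
    NumericVector v = V[i];
    std::vector<double> out;
    bool cut_low = false, cut_high = false;
    for (int j = 0; j < v.size(); ++j) {
      if (v[j] <= d1[i]) { cut_low = true; continue; }   // dropped by the lower bound
      if (v[j] >= d2[i]) { cut_high = true; continue; }  // dropped by the upper bound
      out.push_back(v[j]);
    }
    if (cut_low)  out.insert(out.begin(), d1[i]);        // attach lower bound in front
    if (cut_high) out.push_back(d2[i]);                  // attach upper bound at the end
    W[i] = wrap(out);
  }
  return W;
}')
# sanity check against the original loop:
# all.equal(recode_cpp(V, d1, d2), fRecode1())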

Length of vector as a function of a parameter, filled with a sequence, in R

I would like to create vectors V and W whose length is a function of the parameter G. The job is related to matrices. Here is my code for G = 2:
ncol <- 13
G <- 2
m <- matrix(, nrow = ncol, ncol = ncol)
l <- matrix(1:(G*G), nrow = G, ncol = G)
N <- ncol - G
for (i in 0:N) {
  for (j in 0:N) {
    V <- c(i+1, i+2)
    W <- c(j+1, j+2)
    m[V, W] <- l
  }
}
For this example the length of V and W is 2. If I change G=3 I would like to have:
V <- c(i+1, i+2, i+3)
W <- c(j+1, j+2, j+3)
For G=4:
V <- c(i+1, i+2, i+3, i+4)
W <- c(j+1, j+2, j+3, j+4)
How do I create V and W so that their length depends on G?
Use seq or : to generate the sequence:
for (i in 0:N) {
  for (j in 0:N) {
    V <- seq(i+1, i+G)
    # V <- (i+1):(i+G)
    W <- seq(j+1, j+G)
    # W <- (j+1):(j+G)
    m[V, W] <- l
  }
}
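As a quick self-contained check with G = 3 (same setup as the question, only G changed, and with the : form used directly inside the subscript):
ncol <- 13
G <- 3
m <- matrix(NA, nrow = ncol, ncol = ncol)
l <- matrix(1:(G*G), nrow = G, ncol = G)
N <- ncol - G
for (i in 0:N) {
  for (j in 0:N) {
    m[(i+1):(i+G), (j+1):(j+G)] <- l
  }
}
m[1:4, 1:4]  # top-left corner; later blocks overwrite earlier ones where they overlap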

Error: "Unused argument N" Why is it giving me this error?

Here is my code,
BinAmerPut <- function(S0, R, sigmaA, sigmaB, K, T, N) {
  Cn <- rep(0, N+1)
  tempC <- 0
  C0 <- 0
  sigma <- (sigmaA + sigmaB)/2
  Dt <- T/N
  D <- exp(-R*Dt)
  a <- exp(-R*Dt) + exp((R*Dt) + (Dt*sigma^2))/2
  d <- a - sqrt(a^2 - 1)
  u <- 1/d
  p <- (exp(R*Dt) - d)/(u - d)
  q <- (1 - p)
  for (n in N:1) {
    for (j in 1:(n+1)) {
      if (n == N) {
        Cn[j] <- max((S0*(u^(j-1)*(d^(N-(j-1))))) - K, 0)
      }
      if (n < N) {
        tempC <- D*(p*Cn[j+1] + q*Cn[j])
        Cn[j] <- max(S0*(u^(j-1))*(d^(N-(j-1))) - K, tempC)
      }
    }
  }
  C0 <- max(S0 - K, D*(p*Cn[2] + q*Cn[1]))
  return(C0)
}
S0 <- 100
R <- 0.03
T <- 1.0
N <- 4
K <- 100
BinAmerPut(S0, R, 0.1, 0.25, K, T, N)
S0 <- 100
R <- 0.03
T <- 1.0
N <- 4
K <- 100
BinAmerPut(S0, R, 0.1, 0.25, K, T, N)
Perhaps I am missing something that I can't seem to find? Eventually, my goal is to add the bisection method in a while loop.
Thanks!

Efficient implementation of double for-loop

I am new to R and I was wondering if there is a more efficient implementation of the following setting. The time series x and y have length around 5000, and h != nrow(q).
set.seed(1)
h <- 21
x <- rnorm(5e3, 1)
y <- rnorm(5e3, 2)
q <- c(0.1, 0.3, 0.5, 0.7, 0.9)
qx <- quantile(x, probs = q)
qx <- expand.grid(qx, qx)
qy <- quantile(y, probs = q)
qy <- expand.grid(qy, qy)
q <- expand.grid(q, q)
# note: prod(q[i, ]) makes f() depend on the loop index i, which it finds in
# the calling (global) environment rather than receiving as an argument
f <- function(z, l, qz) {
  n <- length(z)
  1/(n - l) * sum((z[1:(n-l)] <= qz[[1]]) * (z[(1+l):n] <= qz[[2]])) - prod(q[i, ])
}
sum <- 0
for (i in 1:h) {
  for (j in 1:nrow(q)) {
    sum <- sum + (f(x, l = i, qx[j, ]) - f(y, l = i, qy[j, ]))^2
  }
}
sum
# 0.0008698279
Thank you very much!
One faster alternative to loops might, under some circumstances, be sapply. This function works as follows: for each element of a vector, apply some function. Alternatively, you could take a look at the foreach package, which offers some fast looping (a small sketch of it follows after the benchmark below).
Here is an example using sapply: depending on what exactly you need, you might want to use either of the variants. Also, sapply is just one of the faster ways of doing this, not necessarily the fastest.
# setup from the question (note h = 1 here)
set.seed(1)
h <- 1
x <- rnorm(5e3, 1)
y <- rnorm(5e3, 2)
q <- c(0.1, 0.3, 0.5, 0.7, 0.9)
qx <- quantile(x, probs = q)
qx <- expand.grid(qx, qx)
qy <- quantile(y, probs = q)
qy <- expand.grid(qy, qy)
q <- expand.grid(q, q)
f <- function(z, l, qz) {
  n <- length(z)
  1/(n - l) * sum((z[1:(n-l)] <= qz[[1]]) * (z[(1+l):n] <= qz[[2]])) - prod(q[i, ])
}
# load microbenchmark library for comparison of execution times
library(microbenchmark)
microbenchmark({
  # the version from the question with a for loop
  sum <- 0
  for (i in 1:h) {
    for (j in 1:nrow(q)) {
      sum <- sum + (f(x, l = i, qx[j, ]) - f(y, l = i, qy[j, ]))^2
    }
  }
},
{
  # using sapply and assigning into sum: this gives you the h * j matrix
  # as well as the sum
  sum <- 0
  sapply(1:h, function(i) sapply(1:nrow(q), function(j) {
    sum <<- sum + (f(x, l = i, qx[j, ]) - f(y, l = i, qy[j, ]))^2
  }))
},
{
  # use sapply and sum the output
  sum(sapply(1:h, function(i) sapply(1:nrow(q), function(j) {
    (f(x, l = i, qx[j, ]) - f(y, l = i, qy[j, ]))^2
  })))
},
# run each code block 200 times to get the time comparison
times = 200
)
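Since the foreach package was mentioned above, here is a minimal sequential sketch of it (assuming foreach is installed; %do% runs without a parallel backend). Because f() picks up the loop index i from the global environment via prod(q[i, ]), this sketch uses a variant fq() (illustrative, not from the answer) that takes the row index explicitly; the subtracted prod() term cancels in the difference f(x, ...) - f(y, ...), so the total is unchanged:
library(foreach)
fq <- function(z, l, qz, j) {
  n <- length(z)
  1/(n - l) * sum((z[1:(n-l)] <= qz[[1]]) * (z[(1+l):n] <= qz[[2]])) - prod(q[j, ])
}
total <- foreach(i = 1:h, .combine = `+`) %do% {
  sum(sapply(1:nrow(q), function(j) (fq(x, i, qx[j, ], j) - fq(y, i, qy[j, ], j))^2))
}
total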

Repeating a function multiple times using a for loop

I have code that creates a random graph in the form of a matrix. Now I would like it to create many, say m, random graphs, so the output is m matrices. I am trying to do this with a for loop. This would be my preferred method, however I am open to other suggestions (apply family?). Here is my code, where n is the number of nodes/vertices the graph has and beta is the amount of preferential attachment (keep this between 0 and 1.5):
multiplerandomgraphs <- function(n, beta, m) {
  for(k in 1:m) {
    randomgraph <- function(n, beta) {
      binfunction <- function(y) {
        L <- length(y)
        x <- c(0, cumsum(y))
        U <- runif(1, min = 0, max = sum(y))
        for(i in 1:L) {
          if(x[i] <= U && x[i+1] > U){
            return(i)
          }
        }
      }
      mat <- matrix(0, n, n)
      mat[1,2] <- 1
      mat[2,1] <- 1
      for(i in 3:n) {
        degvect <- colSums(mat[ , (1:(i-1))])
        degvect <- degvect^(beta)
        j <- binfunction(degvect)
        mat[i,j] <- 1
        mat[j,i] <- 1
      }
      return(mat)
    }
  }
}
You can define your randomgraph function as randomgraph <- function(i, n, beta) {} with the same body as your current definition, leaving the parameter i as a dummy. Then use an apply-family function, listOfMatrix <- lapply(1:m, randomgraph, n, beta), which returns a list of matrices.
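A runnable sketch of that suggestion (body copied from the question, with inner loop variables renamed so they cannot be confused with the dummy index i):
randomgraph <- function(i, n, beta) {
  binfunction <- function(y) {
    x <- c(0, cumsum(y))
    U <- runif(1, min = 0, max = sum(y))
    for (b in 1:length(y)) {
      if (x[b] <= U && x[b+1] > U) return(b)
    }
  }
  mat <- matrix(0, n, n)
  mat[1, 2] <- 1
  mat[2, 1] <- 1
  for (node in 3:n) {
    degvect <- colSums(mat[, 1:(node-1)])^beta
    j <- binfunction(degvect)
    mat[node, j] <- 1
    mat[j, node] <- 1
  }
  mat
}
listOfMatrix <- lapply(1:5, randomgraph, n = 10, beta = 1)  # a list of five 10 x 10 adjacency matrices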
