I am trying to set up a linear programming solution using lpSolveAPI in R to solve a scheduling problem. Below is a small sample of the data: the minutes required for each session id, and their 'preferred' order/weight.
id <- 1:100
min <- sample(0:500, 100)
weight <- (1:100)/sum(1:100)
data <- data.frame(id, min, weight)
What I want to do is arrange/schedule these session IDs so that each day holds the maximum number of sessions, preferring higher-weight sessions, with each day capped at a total of 400 minutes.
This is how I have set it up currently in R:
require(lpSolveAPI)
#Set up a matrix to hold the results; each row represents a day
r <- 5
c <- 10
row <- 1
results <- matrix(0, nrow = r, ncol = c)
rownames(results) <- format(seq(Sys.Date(), by = "days", length.out = r), "%Y-%m-%d")
for (i in 1:r) {
  for (j in 1:c) {
    lp <- make.lp(0, nrow(data))
    set.type(lp, 1:nrow(data), "binary")
    set.objfn(lp, rep(1, nrow(data)))
    lp.control(lp, sense = "max")
    add.constraint(lp, data$min, "<=", 400)
    set.branch.weights(lp, data$weight)
    solve(lp)
    a <- get.variables(lp) * data$id
    b <- a[a != 0]
    tryCatch(results[row, 1:length(b)] <- b, error = function(x) 0)
    if (dim(data[!data$id == a, ])[1] > 0) {
      data <- data[!data$id == a, ]
      row <- row + 1
    }
    break
  }
}
sum(results > 0)
barplot(results) #View of scheduled IDs
A quick look at the results matrix tells me that while the setup maximises the number of sessions (so that each day's total minutes are as close to 400 as possible), it doesn't follow the given weights. I expect my results matrix to be filled with increasing session IDs.
I have tried assigning different weights, weights in reverse order, etc., but for some reason my setup doesn't seem to enforce "set.branch.weights".
I have read the documentation for "set.branch.weights" from lpSolveAPI but I think I am doing something wrong here.
Example - Data:
id min weight
1 67 1
2 72 2
3 36 3
4 91 4
5 80 5
6 44 6
7 76 7
8 58 8
9 84 9
10 96 10
11 21 11
12 1 12
13 41 13
14 66 14
15 89 15
16 62 16
17 11 17
18 42 18
19 68 19
20 25 20
21 44 21
22 90 22
23 4 23
24 33 24
25 31 25
The result should be:
Day 1 67 72 36 91 80 44 76
Day 2 58 84 96 21 1 41 66 89
Day 3 62 11 42 68 25 44 90 4 33 31
Each day has a cumulative sum of <= 480 minutes.
My simple-minded approach:
df = read.table(header=T,text="
id min weight
1 67 1
2 72 2
3 36 3
4 91 4
5 80 5
6 44 6
7 76 7
8 58 8
9 84 9
10 96 10
11 21 11
12 1 12
13 41 13
14 66 14
15 89 15
16 62 16
17 11 17
18 42 18
19 68 19
20 25 20
21 44 21
22 90 22
23 4 23
24 33 24
25 31 25")
# assume sorted by weight
daynr <- 1
daymax <- 480
dayusd <- 0
for (i in 1:nrow(df)) {
  v <- df$min[i]
  dayusd <- dayusd + v
  if (dayusd > daymax) {
    daynr <- daynr + 1
    dayusd <- v
  }
  df$day[i] <- daynr
}
This will give:
> df
id min weight day
1 1 67 1 1
2 2 72 2 1
3 3 36 3 1
4 4 91 4 1
5 5 80 5 1
6 6 44 6 1
7 7 76 7 1
8 8 58 8 2
9 9 84 9 2
10 10 96 10 2
11 11 21 11 2
12 12 1 12 2
13 13 41 13 2
14 14 66 14 2
15 15 89 15 2
16 16 62 16 3
17 17 11 17 3
18 18 42 18 3
19 19 68 19 3
20 20 25 20 3
21 21 44 21 3
22 22 90 22 3
23 23 4 23 3
24 24 33 24 3
25 25 31 25 3
>
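The same greedy fill can be wrapped in a small reusable helper (a sketch; `assign_days` is a made-up name, not from the post):

```r
# greedy day assignment: walk the sessions in order and start a new day
# whenever the running total would exceed the daily cap
assign_days <- function(minutes, daymax = 480) {
  day <- integer(length(minutes))
  daynr <- 1
  dayusd <- 0
  for (i in seq_along(minutes)) {
    dayusd <- dayusd + minutes[i]
    if (dayusd > daymax) {  # overflow: this session opens the next day
      daynr <- daynr + 1
      dayusd <- minutes[i]
    }
    day[i] <- daynr
  }
  day
}

mins <- c(67, 72, 36, 91, 80, 44, 76, 58, 84, 96, 21, 1, 41, 66, 89,
          62, 11, 42, 68, 25, 44, 90, 4, 33, 31)
assign_days(mins)
#> [1] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
```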
I will concentrate on the first solve. We basically solve a knapsack problem (objective + one constraint):
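(The lpSolveAPI model itself isn't reproduced above. As a rough, solver-free illustration of the same 0/1 knapsack — maximise the session count subject to the minutes cap — here is a small dynamic-programming sketch; `knapsack_count` is a made-up helper name.)

```r
# 0/1 knapsack via dynamic programming: maximise the number of sessions
# whose total minutes fit within the daily capacity
knapsack_count <- function(minutes, cap) {
  # best[w + 1] = max session count achievable with total minutes <= w
  best <- rep(-Inf, cap + 1)
  best[1] <- 0
  for (m in minutes) {
    if (m > cap) next
    # walk capacities downwards so each session is used at most once
    for (w in cap:m) {
      best[w + 1] <- max(best[w + 1], best[w - m + 1] + 1)
    }
  }
  max(best)
}

# first 8 sessions from the example data, 480-minute day
knapsack_count(c(67, 72, 36, 91, 80, 44, 76, 58), 480)
#> [1] 7
```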
When I run this model as is I get:
> solve(lp)
[1] 0
> x <- get.variables(lp)
> weightx <- data$weight * x
> sum(x)
[1] 14
> sum(weightx)
[1] 0.5952381
Now when I change the objective to
I get:
> solve(lp)
[1] 0
> x <- get.variables(lp)
> weightx <- data$weight * x
> sum(x)
[1] 14
> sum(weightx)
[1] 0.7428571
I.e. the count stayed at 14, but the weight improved.
Related
I am trying to create a new column in a data.frame that is built by taking every 9th row of a column, starting at the first row (i.e. rows 1, 9, 17, ...). Once it reaches the end of the column, it needs to repeat the process starting at row 2 (selecting rows 2, 10, 18, ...). I have a fixed number of rows (96), so it needs to repeat until the starting row would be row 9, and then stop.
Here is an example of what I would like to do:
df <- data.frame(Row=1:96)
> df$nineth <- c(1,9,17,25,33,41,49,57,65,73,81,89,2,10,18,26,34,42,50,58,66,74,82,90)
> print(df)
Row nineth
1 1 1
2 2 9
3 3 17
4 4 25
5 5 33
6 6 41
7 7 49
8 8 57
9 9 65
10 10 73
11 11 81
12 12 89
13 13 2
14 14 10
15 15 18
16 16 26
17 17 34
18 18 42
19 19 50
20 20 58
21 21 66
22 22 74
23 23 82
24 24 90
Is there a way to do this using a for loop? I am more familiar with them than the apply family.
You can use R's matrix/vector duality to do this easily...
df <- data.frame(Row=1:96)
df$nineth <- as.vector(matrix(df$Row, byrow = TRUE, ncol = 8))
head(df,15)
Row nineth
1 1 1
2 2 9
3 3 17
4 4 25
5 5 33
6 6 41
7 7 49
8 8 57
9 9 65
10 10 73
11 11 81
12 12 89
13 13 2
14 14 10
15 15 18
The following also works:
n <- 9
df$nineth <- unlist(lapply(1:(n - 1), function(x) {
  df$Row[seq(x, nrow(df), by = n - 1)]
}))
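Since the question explicitly asks for a for-loop, here is a plain nested-loop sketch of the same reordering (assuming the fixed 96 rows and the stride of 8 used above):

```r
df <- data.frame(Row = 1:96)
step <- 8                        # rows 1, 9, 17, ... differ by 8
out <- integer(nrow(df))
k <- 1
for (start in 1:step) {          # starting rows 1 through 8
  for (r in seq(start, nrow(df), by = step)) {
    out[k] <- df$Row[r]
    k <- k + 1
  }
}
df$nineth <- out
head(df$nineth, 15)
#> [1]  1  9 17 25 33 41 49 57 65 73 81 89  2 10 18
```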
I have a dataframe (k by 4). I have ordered one of the four columns in descending order (from 19 to -9, let's say). I would like to throw away the rows whose values in that column are smaller than 1.5.
I have tried various combinations of the following code, without success:
subset(w, select = -c(columnofinterest, <=1.50))
Can anyone help me?
Thanks a lot!
You can use arrange and filter from the dplyr package:
library(dplyr)
w <- data.frame(use_this = round(runif(100, min = -9, max = 19)),
                second = runif(100),
                third = runif(100),
                fourth = runif(100)) %>%
  arrange(desc(use_this)) %>%
  filter(use_this >= 1.5)
Output:
> w
use_this second third fourth
1 19 0.264306555 0.11234097 0.30149863
2 19 0.574675520 0.50406805 0.71502833
3 19 0.376586752 0.21530618 0.35323250
4 18 0.949974135 0.46726122 0.36008741
5 17 0.339737597 0.11358402 0.04035303
6 16 0.180291264 0.81855913 0.16109650
7 16 0.958398058 0.94827266 0.54693974
8 16 0.297317238 0.28726682 0.63560208
9 16 0.653006870 0.15175848 0.69305851
10 16 0.685338886 0.30493976 0.89360112
11 16 0.493931093 0.52830391 0.68391458
12 16 0.945083084 0.19880501 0.66769341
13 16 0.910927578 0.86032225 0.73062990
14 15 0.662130980 0.19207451 0.44240610
15 15 0.730482762 0.92418574 0.46387086
16 15 0.547101759 0.87847767 0.27973739
17 15 0.487773258 0.05870471 0.40147753
18 15 0.695824922 0.91289504 0.94897518
19 14 0.576095914 0.42914670 0.27707368
20 14 0.156691824 0.02187951 0.31940887
21 13 0.079037019 0.16993999 0.53232350
22 13 0.944372064 0.63485350 0.23548337
23 13 0.016378244 0.42772076 0.76618218
24 13 0.606340182 0.33611591 0.36017352
25 13 0.170346203 0.43325314 0.16285515
26 13 0.605379012 0.95574187 0.23941377
27 12 0.157352454 0.90963650 0.01611328
28 12 0.353934785 0.80058806 0.13782414
29 12 0.464950823 0.81835421 0.12771521
30 12 0.624139506 0.69472154 0.02833191
31 11 0.362033514 0.98849181 0.37684822
32 11 0.067974815 0.24154922 0.49300890
33 11 0.522271380 0.03502680 0.50665790
34 10 0.810183210 0.56598130 0.41279787
35 10 0.609560713 0.46745813 0.34939724
36 10 0.087748839 0.56531646 0.02249387
37 10 0.008262635 0.68432285 0.35648525
38 10 0.757824842 0.57826099 0.89973902
39 10 0.428174539 0.12538288 0.69233083
40 10 0.785175550 0.21516237 0.36578714
41 10 0.631388832 0.63700087 0.40933640
42 10 0.171396873 0.37925970 0.27935731
43 10 0.773437320 0.24710107 0.23902388
44 8 0.443778088 0.77238651 0.08517639
45 8 0.954302451 0.87102748 0.52031446
46 8 0.347608835 0.79912385 0.36169856
47 8 0.839238717 0.54200177 0.52221408
48 8 0.235710838 0.85575923 0.78092366
49 7 0.610772265 0.16833538 0.94704562
50 7 0.242917834 0.02852729 0.87131760
51 7 0.875879507 0.04537683 0.81000861
52 7 0.577880660 0.54259171 0.43301336
53 6 0.541772984 0.06164861 0.62867700
54 6 0.071746509 0.51758874 0.70365933
55 5 0.103953563 0.99147043 0.33944620
56 5 0.504618656 0.95827073 0.65527417
57 5 0.726648637 0.37460291 0.47072657
58 5 0.796268586 0.09644167 0.93960812
59 5 0.796498528 0.68346948 0.23290885
60 5 0.490859592 0.76727730 0.39888256
61 5 0.949232913 0.02954981 0.56672834
62 4 0.360401806 0.62879833 0.31107107
63 4 0.926329930 0.87624801 0.91260914
64 4 0.922783983 0.11524112 0.06240194
65 3 0.518727534 0.23927630 0.37114683
66 3 0.951288192 0.58672287 0.45337659
67 3 0.767943126 0.76102957 0.24347122
68 2 0.786254279 0.39824869 0.58548193
69 2 0.321557042 0.75393236 0.43273743
70 2 0.872124621 0.89918160 0.55623725
71 2 0.242389529 0.85453423 0.78540085
72 2 0.013294874 0.61593974 0.70549476
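For comparison, the same filter-and-sort can be sketched in base R without dplyr (same made-up columns as above):

```r
set.seed(42)  # only so the sketch is reproducible
w <- data.frame(use_this = round(runif(100, min = -9, max = 19)),
                second = runif(100),
                third = runif(100),
                fourth = runif(100))
w <- subset(w, use_this >= 1.5)                  # drop values below the cutoff
w <- w[order(w$use_this, decreasing = TRUE), ]   # descending order
```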
In R, I have a data.frame that looks like this:
X Y
20 7
25 84
15 62
22 12
60 24
40 10
60 60
12 50
11 17
Now I want a new column, let's call it "SumX", where each entry is the sum of two consecutive values of X, and another column "SumY" that does the same for Y. The resulting data.frame would look like this:
X Y SumX SumY
20 7 20 #first row = X 7 #first row = Y
25 84 45 #X0 + X1 91 #Y0 + Y1
15 62 40 #X1 + X2 146 #Y1 + Y2
22 12 37 #X2 + X3 74 #Y2 + Y3
60 24 82 #X3 + X4 36 #Y3 + Y4
40 10 100 #X4 + X5 34 #Y4 + Y5
60 60 100 #and so on 70 #and so on
12 50 72 110
11 17 23 67
I can do simple X + Y into a new column with
myFrame$SumXY <- with(myFrame, X+Y)
but is there a simple way to add two consecutive X values (n + (n-1)) into SumX, and two consecutive Y values into SumY? Even a while-loop would do, though I would prefer a simpler way (there is a lot of data like this). Any help is much appreciated! (I'm still pretty new to R.)
The rollapply function from the zoo package will work here.
The following code block will create the rolling sum of each 2 adjacent values.
require(zoo)
# rollapply returns n - 1 pairwise sums, so keep the first element as-is
myFrame$SumX <- c(myFrame$X[1], rollapply(myFrame$X, 2, sum)) # rolling sum of every 2 values
You could add by = 2 as an argument to rollapply in order to not have a rolling sum (i.e. it sums values 1+2, then 3+4, then 5+6 etc.).
Look up ?rollapply for more info.
Here's a dplyr approach.
Use mutate() to add a new colum and var + lag(var, default = 0) to compute your variable. Example:
library(dplyr)
d <- data.frame(
x = 1:10,
y = 11:20,
z = 21:30
)
mutate(d, sumx = x + lag(x, default = 0))
#> x y z sumx
#> 1 1 11 21 1
#> 2 2 12 22 3
#> 3 3 13 23 5
#> 4 4 14 24 7
#> 5 5 15 25 9
#> 6 6 16 26 11
#> 7 7 17 27 13
#> 8 8 18 28 15
#> 9 9 19 29 17
#> 10 10 20 30 19
More variables can be handled similarly:
mutate(d, sumx = x + lag(x, default = 0), sumy = y + lag(y, default = 0))
#> x y z sumx sumy
#> 1 1 11 21 1 11
#> 2 2 12 22 3 23
#> 3 3 13 23 5 25
#> 4 4 14 24 7 27
#> 5 5 15 25 9 29
#> 6 6 16 26 11 31
#> 7 7 17 27 13 33
#> 8 8 18 28 15 35
#> 9 9 19 29 17 37
#> 10 10 20 30 19 39
If you know that you want to do this for many, or even EVERY column in your data frame, then here's a standard evaluation approach with mutate_() that uses a custom function I adapted from this blog post (note you need to have the lazyeval package installed). The function gets applied to each column in a for loop (which could probably be optimised).
f <- function(df, col, new_col_name) {
mutate_call <- lazyeval::interp(~ x + lag(x, default = 0), x = as.name(col))
df %>% mutate_(.dots = setNames(list(mutate_call), new_col_name))
}
for (var in names(d)) {
d <- f(d, var, paste0('sum', var))
}
d
#> x y z sumx sumy sumz
#> 1 1 11 21 1 11 21
#> 2 2 12 22 3 23 43
#> 3 3 13 23 5 25 45
#> 4 4 14 24 7 27 47
#> 5 5 15 25 9 29 49
#> 6 6 16 26 11 31 51
#> 7 7 17 27 13 33 53
#> 8 8 18 28 15 35 55
#> 9 9 19 29 17 37 57
#> 10 10 20 30 19 39 59
Just to continue the tidyverse theme, here's a solution using the purrr package (again, works for all columns, but can subset columns if need to):
library(purrr)
# Create new columns in a new data frame.
# Subset `d` here if you only want selected columns.
# Qualify dplyr::lag explicitly: if plain lag resolves to stats::lag,
# the vector comes back unshifted and every value is simply doubled.
sum_d <- map_df(d, ~ . + dplyr::lag(., default = 0))
# Set names correctly and
# bind back to original data
names(sum_d) <- paste0("sum", names(sum_d))
d <- cbind(d, sum_d)
d
#> x y z sumx sumy sumz
#> 1 1 11 21 1 11 21
#> 2 2 12 22 3 23 43
#> 3 3 13 23 5 25 45
#> 4 4 14 24 7 27 47
#> 5 5 15 25 9 29 49
#> 6 6 16 26 11 31 51
#> 7 7 17 27 13 33 53
#> 8 8 18 28 15 35 55
#> 9 9 19 29 17 37 57
#> 10 10 20 30 19 39 59
You can use the lag function (from dplyr) to achieve something like this:
myFrame$SumX[1] <- myFrame$X[1]
myFrame$SumX[2:nrow(myFrame)] <- myFrame$X[2:nrow(myFrame)] + lag(myFrame$X)[2:nrow(myFrame)]
#SumX
cumsum(df$X) - c(0, 0, cumsum(df$X)[1:(nrow(df)-2)])
#[1] 20 45 40 37 82 100 100 72 23
#SumY
cumsum(df$Y) - c(0, 0, cumsum(df$Y)[1:(nrow(df)-2)])
#[1] 7 91 146 74 36 34 70 110 67
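As a further aside (not from the original answers), stats::filter can produce the same pairwise sums in one call; it leaves the first element NA, which the question's expected output replaces with X[1]:

```r
x <- c(20, 25, 15, 22, 60, 40, 60, 12, 11)
sumx <- as.numeric(stats::filter(x, c(1, 1), sides = 1))  # x[i] + x[i-1]
sumx[1] <- x[1]   # the first element has no predecessor
sumx
#> [1]  20  45  40  37  82 100 100  72  23
```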
I have a data.frame named final that looks like:
labels gvs order color f3
1 Adygei -2.3321916 1 1 353.0184
2 Basque -0.8519079 2 1 368.1515
3 French -0.9298674 3 1 365.2545
4 Italian -2.8859587 4 1 354.4481
5 Orcadian -1.4996229 5 1 350.9650
6 Russian -1.5597359 6 1 358.9736
7 Sardinian -1.4494841 7 1 355.1171
8 Tuscan -2.4279528 8 1 362.4717
9 Bedouin -3.1717421 9 2 319.3706
10 Druze -0.5058627 10 2 346.2211
11 Mozabite -2.6491331 11 2 299.5014
12 Palestinian -0.7819299 12 2 330.4576
13 Balochi -1.4095947 13 3 327.1238
14 Brahui -1.2534511 14 3 331.0927
15 Burusho 1.7958170 15 3 335.0919
16 Hazara 2.2810477 16 3 325.2444
17 Kalash -0.9258497 17 3 337.7116
18 Makrani -0.9007551 18 3 321.5726
19 Pathan 2.5543214 19 3 326.1923
20 Sindhi 2.6614486 20 3 318.7025
21 Uygur -1.2207974 21 3 322.0286
22 Cambodian 2.3706977 22 4 310.8989
23 Dai -0.9441980 23 4 305.5687
24 Daur -1.0325107 24 4 309.0984
25 Han -0.7381369 25 4 309.1198
26 Hezhen -2.7590587 26 4 296.9128
27 Japanese -0.5644325 27 4 297.9313
28 Lahu -0.8449225 28 4 307.0776
29 Miao -0.7237586 29 4 303.6593
30 Mongola -0.9452944 30 4 302.1380
31 Naxi -0.1625003 31 4 311.8019
32 Oroqen -1.2035258 32 4 308.7219
33 She -2.7758460 33 4 302.1271
34 Tu -0.7703779 34 4 307.3750
35 Tujia -1.0265275 35 4 303.5923
36 Xibo -1.1163019 36 4 295.5764
37 Yakut -3.2102686 37 4 315.0111
38 Yi -0.9614190 38 4 296.8134
39 Colombian -1.9659984 39 5 311.3134
40 Karitiana -0.9195156 40 5 300.8539
41 Maya 2.1239768 41 5 333.8995
42 Pima -3.0895998 42 5 325.3484
43 Surui -0.9377928 43 5 313.8505
44 Melanesian -1.6961014 44 6 294.5214
45 Papuan -0.7037952 45 6 286.7389
46 BantuKenya -1.9311354 46 7 152.9971
47 BantuSouthAfrica -1.8515908 47 7 133.6722
48 BiakaPygmy -1.7657017 48 7 117.5555
49 Mandenka -0.5423822 49 7 152.8525
50 MbutiPygmy -1.6244801 50 7 114.1691
51 San -0.9049735 51 7 0.0000
52 Yoruba 2.0949378 52 7 154.4460
I'm using the following code to make a graph
jpeg("F3.SCZ.Jul_22.jpg", 700,700)
final$color <- as.factor(final$color)
levels(final$color) <- c("blue","yellow3","red","pink","purple","green","orange")
plot(final$gvs, final$f3, cex=2,pch = 21, bg = as.character(final$color), xaxt="n", xlab="Genetic Values", ylab="F3", main="SCZ")
dev.off()
that looks like:
I would like to split the y-axis at 200, so that the 0 to 200 range takes up only 10% of the graph while the 200 to 400 range takes up the remaining 90% of the y-axis. Is that possible?
EDIT:
Here is the data that is running into issues:
labels gvs order color f3
1 Adygei -2.3321916 1 1 0.09862109
2 Basque -0.8519079 2 1 0.09942770
3 French -0.9298674 3 1 0.10357547
4 Italian -2.8859587 4 1 0.09960179
5 Orcadian -1.4996229 5 1 0.10244666
6 Russian -1.5597359 6 1 0.10097691
7 Sardinian -1.4494841 7 1 0.10189642
8 Tuscan -2.4279528 8 1 0.09794686
9 Bedouin -3.1717421 9 2 0.09272493
10 Druze -0.5058627 10 2 0.09682272
11 Mozabite -2.6491331 11 2 0.08563901
12 Palestinian -0.7819299 12 2 0.09331649
13 Balochi -1.4095947 13 3 0.09227273
14 Brahui -1.2534511 14 3 0.09328593
15 Burusho 1.7958170 15 3 0.09396032
16 Hazara 2.2810477 16 3 0.09342432
17 Kalash -0.9258497 17 3 0.09666599
18 Makrani -0.9007551 18 3 0.09222257
19 Pathan 2.5543214 19 3 0.09468376
20 Sindhi 2.6614486 20 3 0.09172395
21 Uygur -1.2207974 21 3 0.09140727
22 Cambodian 2.3706977 22 4 0.08655821
23 Dai -0.9441980 23 4 0.08739080
24 Daur -1.0325107 24 4 0.08656669
25 Han -0.7381369 25 4 0.08764395
26 Hezhen -2.7590587 26 4 0.08802065
27 Japanese -0.5644325 27 4 0.08810874
28 Lahu -0.8449225 28 4 0.08609791
29 Miao -0.7237586 29 4 0.08700414
30 Mongola -0.9452944 30 4 0.08921706
31 Naxi -0.1625003 31 4 0.08646436
32 Oroqen -1.2035258 32 4 0.08719536
33 She -2.7758460 33 4 0.08656100
34 Tu -0.7703779 34 4 0.08818588
35 Tujia -1.0265275 35 4 0.08737680
36 Xibo -1.1163019 36 4 0.08806230
37 Yakut -3.2102686 37 4 0.08965344
38 Yi -0.9614190 38 4 0.08593454
39 Colombian -1.9659984 39 5 0.09114697
40 Karitiana -0.9195156 40 5 0.09040477
41 Maya 2.1239768 41 5 0.09068139
42 Pima -3.0895998 42 5 0.09084750
43 Surui -0.9377928 43 5 0.08925535
44 Melanesian -1.6961014 44 6 0.08430903
45 Papuan -0.7037952 45 6 0.08272786
46 BantuKenya -1.9311354 46 7 0.04668356
47 BantuSouthAfrica -1.8515908 47 7 0.03914248
48 BiakaPygmy -1.7657017 48 7 0.03546243
49 Mandenka -0.5423822 49 7 0.04612336
50 MbutiPygmy -1.6244801 50 7 0.03098719
51 San -0.9049735 51 7 0.00000000
52 Yoruba 2.0949378 52 7 0.04561542
You can do:
my_color <- as.factor(final$color)
levels(my_color) <- c("blue","yellow3","red","pink","purple","green","orange")
par(mfrow = c(1,2))
# original plot
pos <- seq(min(final$f3), max(final$f3), by = 25) ## y-axis tick marks position.
plot(final$gvs, final$f3, cex=2, pch=21, bg = as.character(my_color),
xaxt="n", yaxt="n", xlab="Genetic Values", ylab="F3", main="SCZ")
axis(2, at = pos, labels = pos) ## add y-axis
# new plot
threshold <- 260 ## cut off threshold
## some rescaling
## if f3 < threshold, we take new_f3 <- 0.1 * f3
## if f3 > threshold, we take new_f3 <- f3 - 0.9 * threshold
new_f3 <- ifelse(final$f3 < threshold, 0.1 * final$f3, final$f3 - threshold * 0.9)
## we apply the same transform to `pos` to get `new_pos`
new_pos <- ifelse(pos < threshold, 0.1 * pos, pos - threshold * 0.9)
plot(final$gvs, new_f3, cex=2, pch=21, bg = as.character(my_color),
xaxt="n", yaxt="n", xlab="Genetic Values", ylab="F3", main="SCZ")
abline(h = threshold * 0.1, lty = 3) # threshold line
axis(2, at = new_pos, labels = pos)
I would use trans_new() from scales package to transform the y-axis. This should get you close. I prefer the continuously differentiable transform (first), but you can also do a step change in scale (second). H/T to Gregor for pointing out that pmin and pmax handle vectors and are correct here.
setwd("C:/Users/rherron1/Desktop/")
final <- read.table("Scratch2.txt", header=TRUE)
final$id <- NULL
# default y-scale
require(ggplot2)
a <- ggplot(final, aes(gvs, f3, color=factor(color)))
a <- a + geom_point()
a
# transform y-axis
require(scales)
skew <- function(x) x^2
iskew <- function(x) x^(1/2)
skew_trans <- function() trans_new("skew", "skew", "iskew")
b <- a + coord_trans(y="skew")
b
# transform y-axis
require(scales)
sku <- function(x) pmin(x, 200) + 9*pmax(x-200, 0)
isku <- function(x) pmax((x-200)/9, 0) + pmin(x, 200)
sku_trans <- function() trans_new("sku", "sku", "isku")
c <- a + coord_trans(y="sku")
c
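As a quick sanity check (my own addition, not part of the answer), you can verify that the second transform pair really are inverses, which coord_trans relies on when placing axis breaks:

```r
sku  <- function(x) pmin(x, 200) + 9 * pmax(x - 200, 0)
isku <- function(x) pmax((x - 200) / 9, 0) + pmin(x, 200)

# below the split point values pass through unchanged;
# above it they are stretched by a factor of 9
stopifnot(sku(150) == 150,
          sku(350) == 1550,
          all(isku(sku(c(50, 199, 200, 250, 400))) == c(50, 199, 200, 250, 400)))
```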
I am playing around with developing a sampling function to do randomization and make my days easier:
Question:
pln <- 1:80
bcap <- cumsum(c(20, 12, 16, 16, 16))
bcap
[1] 20 32 48 64 80
I want to randomize pln within the blocks 1:20, 21:32, 33:48, 49:64, 65:80 for this example. The block sizes might vary for different scenarios.
newpln <- c(sample(1:20), sample(21:32), sample(33:48),
sample(49:64), sample(65:80))
I want to create a general function where bcap can be of any length; however, pln should always run from 1 to max(bcap).
Is this what you want?
> unlist(sapply(mapply(seq, c(1, bcap[1:(length(bcap)-1)]+1), bcap), sample))
[1] 13 19 4 16 11 2 5 20 9 14 10 3 1 7 6 8 17 12 15 18 27 24 30 32 23 25 28 21 31 26 29 22 39 41 48 36 37 45 42 47 43 38 40 34 35
[46] 44 46 33 60 52 50 58 51 54 62 55 64 61 59 49 63 53 56 57 72 74 76 78 67 69 70 66 73 79 68 80 77 71 75 65
Testing:
> pln <- 1:12
> pln
[1] 1 2 3 4 5 6 7 8 9 10 11 12
> bcap <- cumsum(c(4, 3, 2, 3))
> bcap
[1] 4 7 9 12
> unlist(sapply(mapply(seq, c(1, bcap[1:(length(bcap)-1)]+1), bcap), sample))
[1] 4 2 3 1 6 5 7 8 9 12 11 10
> unlist(sapply(mapply(seq, c(1, bcap[1:(length(bcap)-1)]+1), bcap), sample))
[1] 4 2 3 1 6 5 7 9 8 10 12 11
> unlist(sapply(mapply(seq, c(1, bcap[1:(length(bcap)-1)]+1), bcap), sample))
[1] 2 3 1 4 7 6 5 8 9 11 10 12
You can do this with one call to mapply. You just need an object that contains what's inside the cumsum call of your bcap object.
bvec <- c(20, 12, 16, 16, 16)
mapply(function(x,y) sample(x)+y-x, bvec, cumsum(bvec))
A small example:
bvec <- c(2,1,3,1)
set.seed(21)
unlist(mapply(function(x,y) sample(x)+y-x, bvec, cumsum(bvec)))
# [1] 2 1 3 4 5 6 7
library("plyr")
unlist(
  llply(
    mlply(
      data.frame(from = c(1, bcap[-length(bcap)] + 1), to = bcap),
      seq),
    sample),
  use.names = FALSE)
Make a data.frame with each range's from/to endpoints (note the + 1, so consecutive blocks don't overlap), use that to build a list of the sequences, sample each one, and then combine them back together.
UPDATE:
worked for me:
> library("plyr")
> bcap <- cumsum(c(4, 3, 2, 3))
> unlist(llply(mlply(data.frame(from=c(1,bcap[-length(bcap)]+1), to=bcap),seq),sample),use.names=FALSE)
 [1]  4  2  3  1  6  5  7  9  8 12 11 10
> unlist(llply(mlply(data.frame(from=c(1,bcap[-length(bcap)]+1), to=bcap),seq),sample),use.names=FALSE)
 [1]  3  1  2  4  5  7  6  8  9 11 10 12
> unlist(llply(mlply(data.frame(from=c(1,bcap[-length(bcap)]+1), to=bcap),seq),sample),use.names=FALSE)
 [1]  2  3  4  1  6  5  7  9  8 10 12 11
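To address the "general function" part of the question directly, the block-wise sampling can be wrapped in a helper (`block_shuffle` is a made-up name; note that sample(a:b) misbehaves when a == b, which sample.int avoids):

```r
# permute 1:max(bcap) within the blocks delimited by bcap
block_shuffle <- function(bcap) {
  starts <- c(1, head(bcap, -1) + 1)
  unlist(Map(function(a, b) {
    # sample(a:b) would sample 1:a when a == b (a length-one block),
    # so shuffle positions with sample.int instead
    seq(a, b)[sample.int(b - a + 1)]
  }, starts, bcap), use.names = FALSE)
}

set.seed(1)
block_shuffle(cumsum(c(4, 3, 2, 3)))
```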