Suppose I have a vector of dice rolls and a target number:
target.mountain <- 10
Roll_dice <- sample(1:6, 4, replace=TRUE)
With Roll_dice producing
[1] 6, 5, 3, 2 as an example
How can I produce a list of every sum that can be formed by adding together any 2, 3 or 4 of the values in Roll_dice?
For example [1] 2, 3, 5, 5, 6, 7, 11, ....
I suggest checking out the RcppAlgos package, which has some awesome (and fast!) functions for operations on combinations/permutations with constraints.
update
library(RcppAlgos)
library(vecsets)
library(data.table)
target.mountain <- 10
Roll_dice <- c(5, 5, 3, 2)
L <- lapply( 2:4, function(x) {
as.data.table(comboGeneral( Roll_dice,
x,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = target.mountain ),
keep.rownames = TRUE )
})
# [[1]]
# V1 V2
# 1: 5 5
#
# [[2]]
# V1 V2 V3
# 1: 2 3 5
# so 5-5 or 2-3-5 can be chosen to get to 10
#remaining dice
DT <- data.table::rbindlist( L, fill = TRUE )
remains <- lapply( transpose(DT), function(x) {
v <- as.vector(x)
v <- v[ !is.na(v) ]
sum( vecsets::vsetdiff( Roll_dice, v) )
})
remains
# with leftovers:
# $V1
# [1] 5
#
# $V2
# [1] 5
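For comparison, here is a minimal base-R sketch of the same idea that needs neither RcppAlgos nor vecsets. It works on positions rather than values, so with two 5s in the roll the 2-3-5 solution shows up once per 5 (unlike comboGeneral above, which reports it once). The name hits is just illustrative.

```r
target.mountain <- 10
Roll_dice <- c(5, 5, 3, 2)

hits <- list()
for (k in 2:4) {
  # one column per way of choosing k positions out of the dice
  idx <- combn(seq_along(Roll_dice), k)
  # sum of the dice at those positions, one sum per column
  sums <- colSums(matrix(Roll_dice[idx], nrow = k))
  for (j in which(sums == target.mountain)) {
    hits[[length(hits) + 1]] <- list(
      used = Roll_dice[idx[, j]],    # dice that add up to the target
      left = Roll_dice[-idx[, j]]    # dice that remain
    )
  }
}

# total of the leftover dice for each solution
sapply(hits, function(h) sum(h$left))
```

Because every solution sums to the target, the leftover total is always sum(Roll_dice) - target.mountain; the list is only needed if you want to know which dice remain.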
old answer
library(RcppAlgos)
target.mountain <- 10
Roll_dice <- c(6, 4, 5, 5)
sapply( 2:4, function(x) {
comboGeneral( Roll_dice,
x,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = target.mountain )
})
# [[1]]
# [,1] [,2]
# [1,] 4 6
# [2,] 5 5
#
# [[2]]
# [,1] [,2] [,3]
#
# [[3]]
# [,1] [,2] [,3] [,4]
Something like this?
> sapply(
+ 2:4,
+ function(k) combn(Roll_dice, k, sum)
+ )
[[1]]
[1] 11 9 8 8 7 5
[[2]]
[1] 14 13 11 10
[[3]]
[1] 16
Or do you need this?
> lapply(
+ setNames(2:4, 2:4),
+ function(k) target.mountain %in% combn(Roll_dice, k, sum)
+ )
$`2`
[1] FALSE
$`3`
[1] TRUE
$`4`
[1] FALSE
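If a single flat vector of all the sums is what is wanted, the per-size results above can be flattened. This is a small sketch using only base R and the example roll from the question; all_sums is an illustrative name.

```r
target.mountain <- 10
Roll_dice <- c(6, 5, 3, 2)

# sums of every combination of 2, 3 or 4 dice, flattened into one vector
all_sums <- unlist(lapply(2:4, function(k) combn(Roll_dice, k, sum)))

sort(all_sums)
# [1]  5  7  8  8  9 10 11 11 13 14 16

# is the target reachable at all?
target.mountain %in% all_sums
# [1] TRUE
```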
Say, I have the following list
raw <- list(list(1:2, 2:3, 3:4), list(4:5, 5:6, 6:7), list(7:8, 8:9, 9:10))
I would like to find the mean of the corresponding entries of the out-most list. The expected output would be something like
[[1]]
[1] 4 5
[[2]]
[1] 5 6
[[3]]
[1] 6 7
This is because the mean of 1:2, 4:5, and 7:8 would be 4:5.
I have been experimenting with stuff like lapply(raw, function(x) lapply(x, mean)), but apparently it doesn't return the desired output.
This is pretty ugly, but we can use mapply to iterate over the sub-lists in parallel; do.call is needed to expand the outer list into separate arguments.
do.call("mapply", c(function(...) rowMeans(data.frame(...)), raw, SIMPLIFY=FALSE))
You can make this prettier using the purrr package
purrr::pmap(raw, ~rowMeans(data.frame(...)))
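To make the do.call step less mysterious, here is a sketch of what it expands to for this particular three-element list. The helper name elementwise_mean is made up, and cbind is used instead of data.frame only to keep the example small.

```r
raw <- list(list(1:2, 2:3, 3:4), list(4:5, 5:6, 6:7), list(7:8, 8:9, 9:10))

# mean across the vectors passed in, position by position
elementwise_mean <- function(...) rowMeans(cbind(...))

# written out by hand: one argument per element of raw
explicit <- mapply(elementwise_mean, raw[[1]], raw[[2]], raw[[3]],
                   SIMPLIFY = FALSE)

# the same call built programmatically, for any length of raw
via_do_call <- do.call(mapply,
                       c(list(FUN = elementwise_mean), raw, list(SIMPLIFY = FALSE)))

identical(explicit, via_do_call)
# [1] TRUE
```

The hand-written form only works when you know raw has three elements; do.call is what lets the same code handle an outer list of any length.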
1
n = length(raw[[1]])
lapply(1:n, function(i){
d = do.call(rbind, lapply(seq_along(raw), function(j){
raw[[j]][[i]]
}))
apply(d, 2, mean)
})
#[[1]]
#[1] 4 5
#[[2]]
#[1] 5 6
#[[3]]
#[1] 6 7
2
aggregate(. ~ ind, do.call(rbind, lapply(raw, function(x)
data.frame(cbind(do.call(rbind, x), ind = seq_along(x))))), mean)
# ind V1 V2
#1 1 4 5
#2 2 5 6
#3 3 6 7
You could put the thing into an array and take the cell medians (I suppose you want these instead of means).
A <- array(matrix(unlist(raw), 2, byrow=FALSE), dim=c(2, 3, 3))
v.mds <- t(apply(A, 1:2, median))
lapply(1:3, function(x) v.mds[x, ])
# [[1]]
# [1] 4 5
#
# [[2]]
# [1] 5 6
#
# [[3]]
# [1] 6 7
Generalized like so:
A <- array(matrix(unlist(raw), length(el(el(raw))), byrow=0),
dim=c(length(el(el(raw))), el(lengths(raw)), length(raw)))
v.mds <- t(apply(A, 1:2, median))
lapply(1:nrow(v.mds), function(x) v.mds[x, ])
I would like to simulate the frequency and severity over a list of parameters.
Here is the case for the first item in the list:
data <- data.frame(
lamda = c(5, 2, 3),
meanlog = c(9, 10, 11),
sdlog = c(2, 2.1, 2.2))
s <- 2  # number of simulations per row (assumed; same value the answer below uses)
freq <- rpois(s, data$lamda[1])
freqsev <- lapply(freq, function(k) rlnorm(k, data$meanlog[1], sdlog = data$sdlog[1]))
freq
freqsev
How do I set up a loop or an lapply statement to iterate over all the items in data (not just the first)?
Thanks.
We can use map (from the purrr package, part of the tidyverse package) as follows to create list columns. The contents are now stored in the freq and freqsev columns.
library(tidyverse)
set.seed(123)
s <- 2
data2 <- data %>%
mutate(freq = map(lamda, ~rpois(s, .x)),
freqsev = map(freq, ~map(.x, function(k) rlnorm(k, meanlog, sdlog))))
data2$freq
# [[1]]
# [1] 4 7
#
# [[2]]
# [1] 2 4
#
# [[3]]
# [1] 6 0
data2$freqsev
# [[1]]
# [[1]][[1]]
# [1] 9330.247 28897.323 2605520.369 20370.283
#
# [[1]][[2]]
# [1] 645.4047 5206.2183 22461.1778 93729.0634 46892.3129 144595.7492 10110.8606
#
#
# [[2]]
# [[2]][[1]]
# [1] 2665.955 938950.074
#
# [[2]][[2]]
# [1] 21931.9763 354.2858 280122.6952 3147.6681
#
#
# [[3]]
# [[3]][[1]]
# [1] 957.5257 13936.3063 6265.3530 1886.0077 5927.8540 1464.5081
#
# [[3]][[2]]
# numeric(0)
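As a base-R sketch of the loop the question asks for, Map can walk the three parameter columns in parallel, so each row's severities are drawn with that row's own meanlog and sdlog. (Inside mutate() without row-wise grouping, bare meanlog and sdlog refer to the whole columns, so rlnorm recycles them across draws; it is worth checking whether that is what you intend.) The helper name simulate_row and the choice s = 2 are assumptions for illustration.

```r
data <- data.frame(
  lamda   = c(5, 2, 3),
  meanlog = c(9, 10, 11),
  sdlog   = c(2, 2.1, 2.2))

set.seed(123)
s <- 2  # assumed number of simulations per row

# Hypothetical helper: draw s claim counts, then the severities for each count
simulate_row <- function(lambda, meanlog, sdlog) {
  freq <- rpois(s, lambda)
  list(freq    = freq,
       freqsev = lapply(freq, function(k) rlnorm(k, meanlog, sdlog)))
}

# one list entry per row of data
sims <- Map(simulate_row, data$lamda, data$meanlog, data$sdlog)
```

sims[[2]]$freq then holds the counts for the second row and sims[[2]]$freqsev the matching severity vectors.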
Update
Here is a way to cap the values at 500, i.e. replace every value greater than or equal to 500 with 500.
data3 <- data2 %>%
mutate(capat500 = map(freqsev, ~map(.x, function(y) ifelse(y >= 500, 500, y))))
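The same capping step can be sketched in base R, assuming the severities are stored as a nested list of numeric vectors like freqsev above. rapply visits every leaf and pmin caps it; the example list here is made up.

```r
# made-up nested list with the same shape as freqsev
sev <- list(
  list(c(100, 900), numeric(0)),
  list(c(500, 20))
)

# cap every numeric leaf at 500, keeping the nested list structure
capped <- rapply(sev, function(y) pmin(y, 500), how = "list")
```

Empty draws (numeric(0)) pass through unchanged, since pmin of an empty vector is empty.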
Short version:
How does one programmatically select sub-arrays from an n-dimensional array when n is arbitrary?
(If the short version of this question is clear enough, feel free to skip the rest of this post.)
Suppose that A is an array such that dim(A) is the vector of positive integers (d1, d2, …, dn), with n > 2.
For example:
> d <- 5:2
> set.seed(0)
> A <- array(runif(prod(d)), dim = d)
Here the array A corresponds to the definition given earlier, with n = 4, and dk = 6 - k, for k ∈ {1, 2, 3, 4}.
Then, if 1 ≤ i ≤ d1 and 1 ≤ j ≤ d2, the expression A[i, j … ] (where … is a placeholder for n - 2 commas) evaluates to an (n - 2)-dimensional array.
To continue the previous example, if we take i = 3 and j = 2, my notation A[i, j … ] would denote the (n - 2 = 2)-dimensional array shown below:
> A[3, 2, ,]
[,1] [,2]
[1,] 0.94467527 0.4785452
[2,] 0.01339033 0.7111212
[3,] 0.02333120 0.1293723
More generally, if
1 ≤ k1 < k2 < … < km ≤ n
and
1 ≤ ir ≤ dkr, ∀r ∈ {1, … m}, then an expression of the general form
A[ … i1 … i2 … … im … ]
...(where the …'s are placeholders for sequences of indices ik and commas), evaluates to an (n - m)-dimensional array.
For example,
> d <- c(4, 2, 5, 4, 2, 7, 3)
> set.seed(1)
> A <- array(runif(prod(d)), dim = d)
> A[3, 1, 4, , 1, 6, ]
[,1] [,2] [,3]
[1,] 0.5320469 0.77282382 0.18034186
[2,] 0.6817434 0.08627063 0.77227529
[3,] 0.8572805 0.32337850 0.63322550
[4,] 0.6555618 0.20578391 0.01257377
Now, one can write out expressions like A[i, j … ] and
A[ … i1 … i2 … … im … ] in full (i.e. filling in all the … placeholders) only if one knows n.
Of course, when one is working interactively, one usually knows (or can easily find out) what n is, and can use this knowledge to decide how many commas to insert in, e.g., A[i, j … ]. This is not the case, however, when one is writing code to work with multi-dimensional arrays of any number of dimensions.
How would one express selections such as A[i, j … ] and A[ … i1 … i2 … … im … ] when one does not know n?
Perhaps this will work for you:
func <- function(ary, ..., drop = TRUE) {
d <- length(dim(ary))
dots <- list(...)
if (length(dots) > d) stop("incorrect number of dimensions")
rest <- rep(TRUE, d - length(dots))
do.call(`[`, c(list(ary), c(dots, rest, drop = drop)))
}
Using your data:
d <- rev(2:5)
set.seed(0)
A <- array(runif(prod(d)), dim = d)
You normally need to know how many commas to include for the correct dimensionality:
A[3,2]
# Error in A[3, 2] : incorrect number of dimensions
This function "fills in" the rest of it for you:
func(A, 3, 2)
# [,1] [,2]
# [1,] 0.94467527 0.4785452
# [2,] 0.01339033 0.7111212
# [3,] 0.02333120 0.1293723
func(A, 3)
# , , 1
# [,1] [,2] [,3]
# [1,] 0.3721239 0.21214252 0.6470602
# [2,] 0.9446753 0.01339033 0.0233312
# [3,] 0.1765568 0.59956583 0.8612095
# [4,] 0.7176185 0.79423986 0.3162717
# , , 2
# [,1] [,2] [,3]
# [1,] 0.2936034 0.71251468 0.3531973
# [2,] 0.4785452 0.71112122 0.1293723
# [3,] 0.8394404 0.05893438 0.7317925
# [4,] 0.8643395 0.45527445 0.7155661
It correctly handles all dimensions:
A[3,2,1,1]
# [1] 0.9446753
func(A, 3, 2, 1, 1)
# [1] 0.9446753
And errors similarly with too many dimensions:
A[3,2,1,1,1]
# Error in A[3, 2, 1, 1, 1] : incorrect number of dimensions
func(A, 3, 2, 1, 1, 1)
# Error in func(A, 3, 2, 1, 1, 1) (from #4) : incorrect number of dimensions
Edit: and the part that I missed. In order to catch blanks, we need to have a little fun.
func <- function(ary, ..., drop = TRUE) {
d <- length(dim(ary))
dots <- as.list(match.call()[-(1:2)])
if (length(dots) > d) stop("incorrect number of dimensions")
pf <- parent.frame()
dots <- lapply(seq_along(dots), function(i) {
x <- dots[[i]]
if (missing(x)) TRUE else eval(dots[[i]], envir = pf)
})
rest <- rep(TRUE, d - length(dots))
do.call(`[`, c(list(ary), c(dots, rest, drop = drop)))
}
I had a simpler version of this function (without the lapply), but it tended to fail if any of the positional arguments were variables rather than literals.
d <- c(4, 2, 5, 4, 2, 7, 3)
set.seed(1)
A <- array(runif(prod(d)), dim = d)
A[3, 1, 4, , 1, 6, ]
# [,1] [,2] [,3]
# [1,] 0.007668596 0.1818094 0.3278203
# [2,] 0.286473525 0.4119333 0.4825088
# [3,] 0.008869468 0.4767760 0.7649491
# [4,] 0.330141563 0.3438217 0.8710419
func(A, 3, 1, 4, , 1, 6)
# [,1] [,2] [,3]
# [1,] 0.007668596 0.1818094 0.3278203
# [2,] 0.286473525 0.4119333 0.4825088
# [3,] 0.008869468 0.4767760 0.7649491
# [4,] 0.330141563 0.3438217 0.8710419
i <- 3
func(A, i, 1, 2+2, , 1, 6)
# [,1] [,2] [,3]
# [1,] 0.007668596 0.1818094 0.3278203
# [2,] 0.286473525 0.4119333 0.4825088
# [3,] 0.008869468 0.4767760 0.7649491
# [4,] 0.330141563 0.3438217 0.8710419
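If capturing blank arguments feels too magical, another option is to name the dimensions explicitly. The sketch below is an alternative to func, not a replacement for it: slice_dims, dims and idx are made-up names. It starts from TRUE ("take everything") for every dimension and overwrites only the positions you ask for.

```r
# Hypothetical helper: fix the dimensions listed in dims at the indices in idx
slice_dims <- function(ary, dims, idx, drop = TRUE) {
  n <- length(dim(ary))
  stopifnot(length(dims) == length(idx), all(dims >= 1), all(dims <= n))
  args <- rep(list(TRUE), n)   # TRUE selects the whole extent of a dimension
  args[dims] <- idx            # overwrite only the requested dimensions
  do.call(`[`, c(list(ary), args, list(drop = drop)))
}

d <- 5:2
set.seed(0)
A <- array(runif(prod(d)), dim = d)
identical(slice_dims(A, c(1, 2), list(3, 2)), A[3, 2, , ])
# [1] TRUE

d <- c(4, 2, 5, 4, 2, 7, 3)
set.seed(1)
A7 <- array(runif(prod(d)), dim = d)
identical(slice_dims(A7, c(1, 2, 3, 5, 6), list(3, 1, 4, 1, 6)),
          A7[3, 1, 4, , 1, 6, ])
# [1] TRUE
```

Since idx is a list, each entry can also be a vector of indices, e.g. list(1:2, 3) to keep two slices of one dimension.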
Consider a list of mixed classes, like what boxplot returns. I want to concatenate corresponding list elements, i.e. stack each pair of elements horizontally.
(I clicked all of the "similar questions" and searched and am not aware of a base function to do this, modifyList being similar but not exactly what I want. I also looked quickly through the package rlist, but nothing struck me as similar. Also this question/answer is similar but only works for vectors)
f <- function(x) boxplot(mpg ~ vs, data = x, plot = FALSE)
(bp1 <- f(mtcars[mtcars$vs == 0, ]))
# $stats
# [,1]
# [1,] 10.40
# [2,] 14.70
# [3,] 15.65
# [4,] 19.20
# [5,] 21.00
#
# $n
# [1] 18
#
# $conf
# [,1]
# [1,] 13.97416
# [2,] 17.32584
#
# $out
# [1] 26
#
# $group
# [1] 1
#
# $names
# [1] "0"
(bp2 <- f(mtcars[mtcars$vs == 1, ]))
# $stats
# [,1]
# [1,] 17.8
# [2,] 21.4
# [3,] 22.8
# [4,] 30.4
# [5,] 33.9
#
# $n
# [1] 14
#
# $conf
# [,1]
# [1,] 18.99955
# [2,] 26.60045
#
# $out
# numeric(0)
#
# $group
# numeric(0)
#
# $names
# [1] "1"
The idea is to combine the two lists above into what one would get having simply done the following:
(bp <- f(mtcars))
# $stats
# [,1] [,2]
# [1,] 10.40 17.8
# [2,] 14.70 21.4
# [3,] 15.65 22.8
# [4,] 19.20 30.4
# [5,] 21.00 33.9
#
# $n
# [1] 18 14
#
# $conf
# [,1] [,2]
# [1,] 13.97416 18.99955
# [2,] 17.32584 26.60045
#
# $out
# [1] 26
#
# $group
# [1] 1
#
# $names
# [1] "0" "1"
This function seems to get the job done but is simple, so it can probably be broken easily.
cList <- function (x, y) {
islist <- function(x) inherits(x, 'list')
get_fun <- function(x, y)
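# class() can have length > 1 (a matrix is c("matrix", "array") in R >= 4.0), so use the first entry
switch(class(if (is.null(x)) y else x)[1L],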
matrix = cbind,
data.frame = function(x, y)
do.call('cbind.data.frame', Filter(Negate(is.null), list(x, y))),
factor = function(...) unlist(list(...)), c)
stopifnot(islist(x), islist(y))
nn <- names(rapply(c(x, y), names, how = 'list'))
if (is.null(nn) || any(!nzchar(nn)))
stop('All non-NULL list elements should have unique names', domain = NA)
nn <- unique(c(names(x), names(y)))
z <- setNames(vector('list', length(nn)), nn)
for (ii in nn)
z[[ii]] <- if (islist(x[[ii]]) && islist(y[[ii]]))
Recall(x[[ii]], y[[ii]]) else
(get_fun(x[[ii]], y[[ii]]))(x[[ii]], y[[ii]])
z
}
f <- function(x) boxplot(mpg ~ vs, data = x, plot = FALSE)
bp1 <- f(mtcars[mtcars$vs == 0, ])
bp2 <- f(mtcars[mtcars$vs == 1, ])
bp <- f(mtcars)
identical(cList(bp1, bp2), bp)
# [1] TRUE
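For the simple case of two flat lists with identical names in the same order, the core idea can be sketched with Map: cbind the matrices and c() everything else. This is only an illustration of the principle (combine_flat is a made-up name); it has none of the handling for nesting, data frames, factors or mismatched names that cList provides.

```r
f <- function(x) boxplot(mpg ~ vs, data = x, plot = FALSE)
bp1 <- f(mtcars[mtcars$vs == 0, ])
bp2 <- f(mtcars[mtcars$vs == 1, ])

# Hypothetical helper: pairwise combine two flat, identically named lists
combine_flat <- function(x, y) {
  stopifnot(identical(names(x), names(y)))
  Map(function(a, b) if (is.matrix(a)) cbind(a, b) else c(a, b), x, y)
}

merged <- combine_flat(bp1, bp2)
dim(merged$stats)
merged$names
```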
Also works on nested lists or lists not having the same elements in the same order, the caveat being the lists must be named, otherwise the function doesn't know which elements to concatenate.
l0 <- list(x = 1:5, y = matrix(1:4, 2), z = head(cars), l = list(1:5))
l1 <- list(x = factor(1:5), y = matrix(1:4, 2), z = head(cars), l = list(zz = 1:5))
l2 <- list(z = head(cbind(cars, cars)), x = factor('a'), l = list(zz = 6:10))
cList(l0, l2) ## should throw error
cList(l1, l2)
# $x
# [1] 1 2 3 4 5 a
# Levels: 1 2 3 4 5 a
#
# $y
# [,1] [,2]
# [1,] 1 3
# [2,] 2 4
#
# $z
# speed dist speed dist speed dist
# 1 4 2 4 2 4 2
# 2 4 10 4 10 4 10
# 3 7 4 7 4 7 4
# 4 7 22 7 22 7 22
# 5 8 16 8 16 8 16
# 6 9 10 9 10 9 10
#
# $l
# $l$zz
# [1] 1 2 3 4 5 6 7 8 9 10
Update: a new version which can rbind or cbind rectangular objects (matrices, data frames)
cList <- function(x, y, how = c('cbind', 'rbind')) {
if (missing(y))
return(x)
how <- match.arg(how)
islist <- function(x) inherits(x, 'list')
get_fun <- function(x, y)
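# class() can have length > 1 (a matrix is c("matrix", "array") in R >= 4.0), so use the first entry
switch(class(if (is.null(x)) y else x)[1L],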
matrix = match.fun(how),
data.frame = function(x, y)
do.call(sprintf('%s.data.frame', how),
Filter(Negate(is.null), list(x, y))),
factor = function(...) unlist(list(...)), c)
stopifnot(islist(x), islist(y))
nn <- names(rapply(c(x, y), names, how = 'list'))
if (is.null(nn) || any(!nzchar(nn)))
stop('All non-NULL list elements should have unique names', domain = NA)
nn <- unique(c(names(x), names(y)))
z <- setNames(vector('list', length(nn)), nn)
for (ii in nn)
z[[ii]] <- if (islist(x[[ii]]) && islist(y[[ii]]))
Recall(x[[ii]], y[[ii]], how) else
(get_fun(x[[ii]], y[[ii]]))(x[[ii]], y[[ii]])
z
}
I have a range of values
c(1,2,3,4,5,8,9,10,13,14,15)
And I want to find the ranges where the numbers become discontinuous. All I want is this as output:
(1,5)
(8,10)
(13,15)
I need to find break points.
I need to do it in R.
Something like this?
x <- c(1:5, 8:10, 13:15) # example data
unname(tapply(x, cumsum(c(1, diff(x)) != 1), range))
# [[1]]
# [1] 1 5
#
# [[2]]
# [1] 8 10
#
# [[3]]
# [1] 13 15
Another example:
x <- c(1, 5, 10, 11:14, 20:21, 23)
unname(tapply(x, cumsum(c(1, diff(x)) != 1), range))
# [[1]]
# [1] 1 1
#
# [[2]]
# [1] 5 5
#
# [[3]]
# [1] 10 14
#
# [[4]]
# [1] 20 21
#
# [[5]]
# [1] 23 23
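To see why the grouping works, it can help to look at the intermediate values. The sketch below assumes x is sorted and uses the illustrative names gap, run_id and ranges; it also shows a shortcut that picks the start and end of each run directly.

```r
x <- c(1:5, 8:10, 13:15)

# TRUE wherever the step to the next value is not exactly 1
gap <- diff(x) != 1

# run number for every element: increases by one after each gap
run_id <- cumsum(c(FALSE, gap))
run_id
# [1] 0 0 0 0 0 1 1 1 2 2 2

# a run starts at the first element and right after each gap,
# and ends right before each gap and at the last element
ranges <- data.frame(start = x[c(TRUE, gap)], end = x[c(gap, TRUE)])
ranges
#   start end
# 1     1   5
# 2     8  10
# 3    13  15
```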
x <- c(1:5, 8:10, 13:15)
rr <- rle(x - seq_along(x))
rr$values <- seq_along(rr$values)
s <- split(x, inverse.rle(rr))
s
# $`1`
# [1] 1 2 3 4 5
#
# $`2`
# [1] 8 9 10
#
# $`3`
# [1] 13 14 15
## And then to get *literally* what you asked for:
cat(paste0("(", gsub(":", ",", sapply(s, deparse)), ")"), sep="\n")
# (1,5)
# (8,10)
# (13,15)
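The key observation in the rle approach is that x - seq_along(x) is constant within a run of consecutive integers and jumps at every gap. A shorter sketch of the same idea, assuming x is strictly increasing (offset and labels are illustrative names):

```r
x <- c(1:5, 8:10, 13:15)

# constant within each run of consecutive values, larger after every gap
offset <- x - seq_along(x)
offset
# [1] 0 0 0 0 0 2 2 2 4 4 4

# split on the offset and format each run as "(min,max)"
runs <- split(x, offset)
labels <- vapply(runs, function(r) sprintf("(%d,%d)", min(r), max(r)), character(1))
cat(labels, sep = "\n")
# (1,5)
# (8,10)
# (13,15)
```

If x is not sorted, two separate runs could share an offset, so sort it first or use the rle version above.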
I published seqle which will do this for you in one line. You can load the package cgwtools or search SO for the code, as it's been posted a couple times.
Assuming that you don't care about the exact output and are looking for the min and max of each range, you can use diff/cumsum/range as follows:
x <- c(1:5, 8:10, 13:15)
x. <- c(0, cumsum( diff(x)-1 ) )
lapply( split(x, x.), range )