Group a vector of numbers by range - r

I have a sorted vector:
c(1, 10, 31, 80, 100, 120, 160)
I would like to split its elements into several groups by range, which I set to 31 in this example.
The result is like this:
[[1]] 1, 10, 31
[[2]] 80, 100
[[3]] 100, 120
[[4]] 160
The range within each group is less than 31. I have tried a loop, but I do not like it. I also tried the outer function to calculate all pairwise differences:
res <- outer(vec, vec, "-")
Then I filtered each column by the condition > 0 and < 31:
apply(res, 2, function(x) x[x > 0 & x < 31])
The result is not good enough though...

I think this will serve your purpose.
first_list extracts the items that satisfy your range condition, whereas final_list removes the groups that are entirely contained in another group.
vec <- c(1, 10, 31, 80, 100, 120, 160)
first_list <- unique(apply(outer(vec, vec, "-"), 1, function(x){vec[(x < 31 & x >= 0)] }))
final_list <- first_list[!sapply(seq_along(first_list), function(i) max(sapply(first_list[-i],function(L) all(first_list[[i]] %in% L))))]
> final_list
[[1]]
[1] 1 10 31
[[2]]
[1] 80 100
[[3]]
[1] 100 120
[[4]]
[1] 160

Here's a neat solution
x <- c(1, 10, 31, 80, 100, 120, 160)
y <- findInterval(x+30, x)
lapply(seq_along(x)[!duplicated(y)], function(z) x[z:y[z]])
#> [[1]]
#> [1] 1 10 31
#>
#> [[2]]
#> [1] 80 100
#>
#> [[3]]
#> [1] 100 120
#>
#> [[4]]
#> [1] 160
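To see why this works: for a sorted x, findInterval(x + 30, x) gives, for each element, the position of the last element that is at most 30 larger, and !duplicated(y) keeps only the first start for each end position, which drops groups nested inside an earlier one. Here is a hedged sketch that wraps the same idea in a function for an arbitrary width. The names group_by_range and width are made up for illustration; it assumes x is sorted in ascending order and uses left.open = TRUE so that each group's range is strictly less than width.

```r
# Hypothetical helper, not part of the answer above.
# Assumes x is sorted ascending; the returned groups may overlap.
group_by_range <- function(x, width) {
  # last[i] = number of elements strictly below x[i] + width,
  # i.e. the index of the last element that can share a group with x[i]
  last <- findInterval(x + width, x, left.open = TRUE)
  # keep only the first start for each end index (later starts are nested)
  starts <- which(!duplicated(last))
  lapply(starts, function(i) x[i:last[i]])
}

x <- c(1, 10, 31, 80, 100, 120, 160)
res <- group_by_range(x, 31)
```

With width = 31 this should reproduce the four groups shown above.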

Related

Reordering elements of a list

I'm trying to create a function in R that takes a list as an input and reorder its elements based on the mean of each element (for example, the first element will be the one with the largest mean value, etc.) I'm trying to find the easiest way to do it without having to use any loops. I tried to sort the mean like below but couldn't figure out how to have the elements associated with the mean to move along. Any suggestions or advice would be appreciated.
function1 <- function(x){
return(sort(mean(x), decreasing = T))
}
function2 <- function(x) {
return(lapply(function1, x))
}
testlist <- list(c(10, 5, 1), c(3, 2), c(77, 90, 1), c(23, 34), c(2, 35, 22))
function2(testlist)
Here's one way that is really just testlist[order(sapply(testlist, mean))], but put inside a function. The idea is that sapply() returns a vector giving the mean of each list element, and order() returns the element positions sorted by those means. Indexing the list with those positions returns its elements in order of increasing mean; pass decreasing = TRUE to order() to put the largest mean first.
testlist <- list(c(10, 5, 1), c(3, 2), c(77, 90, 1), c(23, 34), c(2, 35, 22))
function1 <- function(x){
x[order(sapply(x, mean))]
}
function1(testlist)
#> [[1]]
#> [1] 3 2
#>
#> [[2]]
#> [1] 10 5 1
#>
#> [[3]]
#> [1] 2 35 22
#>
#> [[4]]
#> [1] 23 34
#>
#> [[5]]
#> [1] 77 90 1
Created on 2023-02-18 by the reprex package (v2.0.1)
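The intermediate values make the indexing step easier to follow. This sketch uses vapply() instead of sapply() only as a stricter variant (it errors if a mean is not a single number); the variable names means and idx are illustrative.

```r
testlist <- list(c(10, 5, 1), c(3, 2), c(77, 90, 1), c(23, 34), c(2, 35, 22))

# one mean per list element
means <- vapply(testlist, mean, numeric(1))

# positions of the elements, from smallest mean to largest
idx <- order(means)

# indexing with those positions reorders the list itself
sorted <- testlist[idx]
```

Here idx is 2 1 5 4 3, so the element with the smallest mean, c(3, 2), comes first.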
You can use the sapply function and then order the elements in decreasing order, so that the first element is the one with the highest mean:
function2 <- function(z) {
order_ind <- order(sapply(z, mean), decreasing = TRUE)
z[order_ind]
}
sapply() computes the mean value of each element and
z[order_ind] outputs the ordered list using the ordered indexes
function2(testlist)
the results:
[[1]]
[1] 77 90 1
[[2]]
[1] 23 34
[[3]]
[1] 2 35 22
[[4]]
[1] 10 5 1
[[5]]
[1] 3 2
Using fmean from collapse
library(collapse)
testlist[order(fmean(testlist))]
-output
[[1]]
[1] 3 2
[[2]]
[1] 10 5 1
[[3]]
[1] 2 35 22
[[4]]
[1] 23 34
[[5]]
[1] 77 90 1
data
testlist <- list(c(10, 5, 1), c(3, 2), c(77, 90, 1), c(23, 34), c(2, 35,
22))

Random Numbers Are Too Similar To Each Other?

I am using this code that generates 3 random numbers that add to 72:
# https://stackoverflow.com/questions/24845909/generate-n-random-integers-that-sum-to-m-in-r
rand_vect <- function(N, M, sd = 1, pos.only = TRUE) {
vec <- rnorm(N, M/N, sd)
if (abs(sum(vec)) < 0.01) vec <- vec + 1
vec <- round(vec / sum(vec) * M)
deviation <- M - sum(vec)
for (. in seq_len(abs(deviation))) {
vec[i] <- vec[i <- sample(N, 1)] + sign(deviation)
}
if (pos.only) while (any(vec < 0)) {
negs <- vec < 0
pos <- vec > 0
vec[negs][i] <- vec[negs][i <- sample(sum(negs), 1)] + 1
vec[pos][i] <- vec[pos ][i <- sample(sum(pos ), 1)] - 1
}
vec
}
If I run this code 100 times:
results <- list()
for (i in 1:100)
{
r_i = rand_vect(3,72)
results[[i]] <- r_i
}
When I look at the numbers that came out:
[[1]]
[1] 23 24 25
[[2]]
[1] 25 24 23
[[3]]
[1] 24 25 23
[[4]]
[1] 23 24 25
[[5]]
[1] 25 24 23
[[6]]
[1] 24 24 24
[[7]]
[1] 24 25 23
[[8]]
[1] 24 25 23
[[9]]
[1] 24 25 23
[[10]]
[1] 24 23 25
In each iteration, all the numbers add to 72 as expected - but the numbers don't really "look that random". They all seem to be "clumped" around "24, 23, 25". I was hoping to see more "randomness" in these numbers. For example:
[[11]]
[1] 5 50 17
[[12]]
[1] 12 40 20
Why are the numbers in the code I am using "clumped" around 24, 23, 25 - and how can I change the above code so that there is more "randomness" in the numbers being generated?
Thank you!
If you just want three random integers that sum to 72 you could do something like
diff(c(0, sort(sample(72, 2)), 72))
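As for why the original numbers clump: rand_vect() draws from rnorm(N, M/N, sd) with the default sd = 1, so every draw lands near 72/3 = 24. Calling it with a larger sd, for example rand_vect(3, 72, sd = 10), should spread the values out. The one-liner above instead picks two cut points and takes the gaps between them. A minimal sketch that generalizes the idea, assuming strictly positive parts are wanted (rand_parts is a hypothetical name, not from the answer):

```r
# Hypothetical helper: split `total` into `n` positive integers that sum to
# `total` by choosing n - 1 distinct cut points in 1..(total - 1).
rand_parts <- function(n, total) {
  cuts <- sort(sample.int(total - 1, n - 1))
  # the gaps between consecutive cut points are the parts
  diff(c(0, cuts, total))
}

set.seed(42)  # only to make the example reproducible
parts <- rand_parts(3, 72)
sum(parts)    # always equals total
```

Because the cut points are drawn uniformly, results such as 5 50 17 are as likely as 23 24 25.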

Simulating Snakes and Ladders in R

I am a beginner in R and I have to simulate a snakes and ladders game in R for an assignment. The board has 100 squares. The only winning square is 100; for example, if you’re on square 98 and roll a 6 you would go forward 2 spaces to 100 and then bounce back 4 spaces to 96. My difficulty is inserting the snakes/ladders transitions in the complete transition matrix, and coding the winning condition. Here is my code so far:
snakesNladders <-
function()
{
transitions <- rbind(
c(40, 3),
c(4, 25),
c(27, 5),
c(13, 46),
c(43, 18),
c(54, 31),
c(33, 49),
c(99, 41),
c(42, 63),
c(66, 45),
c(50, 69),
c(89, 53),
c(76, 58),
c(62, 81),
c(74, 92))
transmat <- 1:100
names(transmat) <- as.character(1:100)
transmat[transitions[,1]] <- transitions[,2]
firstpos <- 0
curpos_player1 <- NULL
curpos_player2 <- NULL
while(curpos_player1 & curpos_player2 < 100) {
curpos_player1 <- firstpos + curpos_player1 + sample(1:6, 1, replace=TRUE)
curpos_player2 <- firstpos + curpos_player2 + sample(1:6, 1, replace = TRUE)
curpos_player1 <- transmat[curpos_player1]
curpos_player2 <- transmat[curpos_player2]
if(curpos_player1 | curpos_player2 == 100){
print(win)
}else if(curpos_player1 > 100){
return()
}else if(curpos_player2 > 100){
return()
}
}
}
}
}
I'm not sure what I should put in the return brackets to simulate the winning condition, or whether the rest of the code is ok. I would really appreciate any help.
Here's how I might approach it. Maybe you can get some ideas by walking through the code.
library(data.table)
set.seed(1633654196) # for reproducibility
# for the transitions, position 1 is the start position, position 2 is the
# square at the bottom left of the board, position 101 is the winning position
# get a vector of snake/ladder transitions (where a player moves after
# landing on each square)
trans <- 1:101
trans[c(40, 4, 27, 13, 43, 54, 33, 99, 42, 66, 50, 89, 76, 62, 74) + 1] <-
c( 3, 25, 5, 46, 18, 31, 49, 41, 63, 45, 69, 53, 58, 81, 92) + 1
# get a matrix of all possible starting/ending positions (based on die
# roll 1 through 6)
m <- c(sequence(rep(6L, 100), 2:101), rep(101L, 6))
# handle overshooting the winning square
blnBounce <- m > 101
m[blnBounce] <- 202 - m[blnBounce]
# add in the snake/ladder transitions
# (faster to index moves as a list of vectors than as a matrix)
to <- asplit(m <- matrix(trans[m], 6), 2)
simgame <- function(nplayers = 1, to, blnPrint = TRUE, turns = 50L, start = rep(1L, nplayers)) {
# turn 1 is the starting position, and won't be retained in the output, so
# add 1 to turns
turns <- as.integer(turns) + 1L
# sample the rolls for all players up front
rolls <- matrix(sample(6, nplayers*turns, TRUE), ncol = nplayers)
# initialize the output matrix
out <- matrix(0, turns, nplayers)
out[1,] <- start
for (j in 1:nplayers) {
for (i in 2:turns) {
# move player j from that player's own previous position
out[i, j] <- to[[out[i - 1L, j]]][rolls[i, j]]
if (out[i, j] == 101L) {
turns <- i
break
}
}
}
# subtract 1 so "1" corresponds to square 1 and "100" corresponds to the
# winning square
out <- out[2:turns,,drop = FALSE] - 1L
turns <- turns - 1L
winner <- match(100L, out[turns,], 0L)
if (winner) {
if (blnPrint) print(paste("Player", winner, "wins"))
out
} else {
# no winner yet, recursively call simgame
# (add 1 to convert square numbers back to position indices)
rbind(out, Recall(nplayers, to, blnPrint, turns, out[turns,] + 1L))
}
}
# simulate a 2-player game
(game <- simgame(2, to))
#> [1] "Player 2 wins"
#> [,1] [,2]
#> [1,] 25 6
#> [2,] 31 30
#> [3,] 32 32
#> [4,] 34 34
#> [5,] 3 36
#> [6,] 25 6
#> [7,] 5 28
#> [8,] 11 10
#> [9,] 46 14
#> [10,] 52 51
#> [11,] 31 31
#> [12,] 35 37
#> [13,] 39 38
#> [14,] 3 41
#> [15,] 8 7
#> [16,] 12 12
#> [17,] 46 17
#> [18,] 69 48
#> [19,] 75 71
#> [20,] 78 77
#> [21,] 84 83
#> [22,] 87 53
#> [23,] 93 92
#> [24,] 96 94
#> [25,] 41 100
# simulate the number of rounds for 15k games
system.time(turns <- replicate(15e3, nrow(simgame(2, to, FALSE))))
#> user system elapsed
#> 1.18 0.01 1.06
hist(turns)
# Probability of single player finishing by turn
dt <- data.table(from = c(col(m)), to = c(m))[, .(prob = .N/6), from:to]
# full transition matrix
m1 <- matrix(0L, 101, 101)
m1[as.matrix(dt[, 1:2])] <- dt$prob
# calculate the probabilities for 300 turns
# (start from the identity matrix so that mm equals m1^i after i multiplications)
mm <- diag(101)
prob1 <- numeric(300)
for (i in 1:300) {
mm <- mm %*% m1
prob1[i] <- mm[1, 101]
}
plot(prob1, xlab = "Roll number", main = "Probability of winning by turn")
# for a 3-player game:
prob3 <- 1 - (1 - prob1)^3
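The bounce-back rule and the snake/ladder lookup can also be written as a plain loop for a single player, which may be easier to follow for a beginner. This is a minimal sketch that uses square numbers 0 to 100 directly (0 is the start, before the first roll) rather than the shifted positions used above; jump, move and play_one are hypothetical names.

```r
# jump[s] is where a player ends up after landing on square s
jump <- 1:100
from <- c(40, 4, 27, 13, 43, 54, 33, 99, 42, 66, 50, 89, 76, 62, 74)
dest <- c( 3, 25,  5, 46, 18, 31, 49, 41, 63, 45, 69, 53, 58, 81, 92)
jump[from] <- dest

# one move: advance, bounce back from 100 on overshoot,
# then follow any snake or ladder on the landing square
move <- function(pos, roll) {
  nxt <- pos + roll
  if (nxt > 100) nxt <- 200 - nxt
  jump[nxt]
}

# play one player until square 100 is reached; returns the number of rolls
play_one <- function() {
  pos <- 0
  n <- 0
  while (pos != 100) {
    pos <- move(pos, sample(6, 1))
    n <- n + 1
  }
  n
}

move(98, 6)  # forward 2 to 100, back 4 to 96
```

A two-player game could call move() once per player per round and stop as soon as either position equals 100.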

From a list of numeric values, create a list of indices

I have a list of numeric vectors:
a <- list(c(2, 3, 4, 5, 6, 7), c(4, 5, 6, 7, 8), c(6, 7, 8, 9, 10))
> a
[[1]]
[1] 2 3 4 5 6 7
[[2]]
[1] 4 5 6 7 8
[[3]]
[1] 6 7 8 9 10
I want to create a list where each element corresponds to values from 1 to the max value in the original list "a". The values in each element of the new list are the indices in the original list containing the focal value.
For example, the first element in the result contains the indices in "a" with the value 1. Because no element contains 1, the result is NULL. The second element contains the indices in "a" with the value 2, i.e. the first element, 1. The value 4 is found in element 1 and 2.
> res
[[1]]
NULL
[[2]]
[1] 1
[[3]]
[1] 1
[[4]]
[1] 1 2
[[5]]
[1] 1 2
[[6]]
[1] 1 2 3
[[7]]
[1] 1 2 3
[[8]]
[1] 2 3
[[9]]
[1] 3
[[10]]
[1] 3
I tried this with nested loops, but it takes too much time, and growing lists within loops is very slow. I have 60,000 sublists in my main list, so is there a vectorized solution for this?
Thanks in Advance.
Here is a base R way.
lapply(seq.int(max(unique(unlist(a)))), \(i){
which(sapply(a, \(x) any(i == x)))
})
Another way:
searchInList <- function(list2search, e){
idx2search <- 1:length(list2search)
list2search2 <- lapply(list2search, `length<-`, max(lengths(list2search)))
output <- matrix(unlist(list2search2), ncol = length(list2search2[[1]]), byrow = TRUE)
idx <- apply(output, 1, function(x){ (e %in% x) } )
return(idx2search[idx])
}
result <- lapply(1:max(unlist(a)), function(x) { searchInList(a, x) } )
Here is one way using match and rapply.
apply(matrix(rapply(a, \(x) !is.na(match(1:max(unlist(a)), x))),,length(a)), 1, which)
# [[1]]
# integer(0)
#
# [[2]]
# [1] 1
#
# [[3]]
# [1] 1
#
# [[4]]
# [1] 1 2
#
# [[5]]
# [1] 1 2
#
# [[6]]
# [1] 1 2 3
#
# [[7]]
# [1] 1 2 3
#
# [[8]]
# [1] 2 3
#
# [[9]]
# [1] 3
#
# [[10]]
# [1] 3
Another solution using base R:
apply(sapply(a, `%in%`, x = seq_len(max(unlist(a)))), 1, which)
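Spelling out the intermediate step may help here: sapply() builds a logical matrix with one row per candidate value and one column per sublist, and which() then reads off the TRUE columns in each row. One caveat is that apply() simplifies its result to a vector or matrix when every row has the same number of matches, so this sketch uses lapply() over the rows to always get a list. The names vals, m and res are illustrative.

```r
a <- list(c(2, 3, 4, 5, 6, 7), c(4, 5, 6, 7, 8), c(6, 7, 8, 9, 10))
vals <- seq_len(max(unlist(a)))

# m[v, k] is TRUE when value v occurs in sublist k
m <- sapply(a, function(x) vals %in% x)

# column positions of TRUE in each row, always returned as a list
res <- lapply(seq_len(nrow(m)), function(v) which(m[v, ]))
```

The matrix has length(vals) * length(a) cells, so memory use grows with both the largest value and the number of sublists.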
A tidyverse approach:
library(purrr)
a <- list(c(2, 3, 4, 5, 6, 7), c(4, 5, 6, 7, 8), c(6, 7, 8, 9, 10))
i = 1:10
map(i, ~map_int(imap(a, ~(..3 %in% .x)*.y, i), ~.x[.y], .x) %>% .[. != 0])
The logic is to get the positions of the TRUE values and multiply them by the list element index. Here, the first elements of the first, second, and third vectors together hold all matches for the value 1 and thus form the first element in the target list once the zeros are dropped.
imap(a, ~(..3 %in% .x)*.y, i)
[[1]]
[1] 0 1 1 1 1 1 1 0 0 0
[[2]]
[1] 0 0 0 2 2 2 2 2 0 0
[[3]]
[1] 0 0 0 0 0 3 3 3 3 3
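Since the question mentions 60,000 sublists, a single pass with split() may scale better than testing every value against every sublist. This is only a sketch and has not been benchmarked here. It assumes the values are positive whole numbers and that no value is repeated within a sublist (otherwise an index would appear more than once); owner is a made-up name.

```r
a <- list(c(2, 3, 4, 5, 6, 7), c(4, 5, 6, 7, 8), c(6, 7, 8, 9, 10))

vals <- unlist(a)
# owner[k] is the index of the sublist that vals[k] came from
owner <- rep(seq_along(a), lengths(a))

# group the sublist indices by value; unused levels (here 1) stay as empty groups
res <- split(owner, factor(vals, levels = seq_len(max(vals))))
res <- unname(res)
```

Values that occur nowhere come back as integer(0) rather than NULL.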

R : Swapping 2 values in a nested list

[[1]]
[[1]][[1]]
[1] 1 46 107 69 1
[[1]][[2]]
[1] 1 146 145 71 92 1
####################
[[2]]
[[2]][[1]]
[1] 1 46 18 92 1
[[2]][[2]]
[1] 1 127 145 53 168 1
Assume I have a nested list like the two-element one shown above. I'm looking for a function that replaces one number (say 46) wherever it occurs in the list with another number from the list (say 92), and replaces each 92 with 46, without altering the structure of the list.
Expected output will be something like this
[[1]]
[[1]][[1]]
[1] 1 92 107 69 1
[[1]][[2]]
[1] 1 146 145 71 46 1
####
[[2]]
[[2]][[1]]
[1] 1 92 18 46 1
[[2]][[2]]
[1] 1 127 145 53 168 1
The rlist library has functions like list.find/list.findi, which work only for named nested lists. Mine is not a named list.
This is another way to achieve that. First you simply convert your list to vector (unlist(l)). Do your necessary swaps and convert it back to your list (relist(x, skeleton = l)).
x <- unlist(l)
a <- which(x==46)
b <- which(x==92)
x[a] <- 92
x[b] <- 46
relist(x, skeleton = l)
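The same steps can be wrapped in a small reusable function. This is a sketch under the assumption that the list holds only numeric vectors; swap_in_list is a hypothetical name. Both sets of positions are recorded before anything is overwritten, so the second assignment does not undo the first.

```r
# Hypothetical helper: swap every occurrence of a and b in a nested list
swap_in_list <- function(l, a, b) {
  x <- unlist(l)
  is_a <- x == a   # positions holding a
  is_b <- x == b   # positions holding b
  x[is_a] <- b
  x[is_b] <- a
  # rebuild the original nesting from the flat vector
  relist(x, skeleton = l)
}

l <- list(list(c(1, 46, 107, 69, 1), c(1, 146, 145, 71, 92, 1)),
          list(c(1, 46, 18, 92, 1), c(1, 127, 145, 53, 168, 1)))
out <- swap_in_list(l, 46, 92)
```

In out, the first inner vector becomes 1 92 107 69 1, matching the expected output in the question.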
Benchmarking
library(microbenchmark)
l <- list(list(c(1, 46, 107, 69, 1), c(1, 146, 145, 71, 92, 1)), list(
c(1, 46, 18, 92, 1), c(1, 127, 145, 53, 168, 1)))
f_m0h3n <- function(l){x <- unlist(l);a <- which(x==46);b <- which(x==92);x[a] <- 92;x[b] <- 46;relist(x, l);}
f_jakub <- function(li) rapply(li, function(x) ifelse(x == 46, 92,ifelse(x==92, 46, x)), how = "list")
all.equal(f_m0h3n(l), f_jakub(l))
# [1] TRUE
microbenchmark(f_m0h3n(l), f_jakub(l))
# Unit: microseconds
# expr min lq mean median uq max neval
# f_m0h3n(l) 100.942 103.509 109.7108 107.3580 111.6355 204.879 100
# f_jakub(l) 126.178 131.738 142.8850 137.9405 143.7150 357.148 100
Larger scale
library(microbenchmark)
set.seed(123)
l <- list(list(sample(1000), sample(2000)),list(sample(1000), sample(2000)))
all.equal(f_m0h3n(l), f_jakub(l))
# [1] TRUE
microbenchmark(f_m0h3n(l), f_jakub(l))
# Unit: microseconds
# expr min lq mean median uq max neval
# f_m0h3n(l) 588.973 615.0645 896.9371 651.2065 692.268 2827.242 100
# f_jakub(l) 1022.683 1053.9070 1914.0769 1253.0115 2848.842 3287.898 100
f_m0h3n is faster than f_jakub in both benchmarks (median of about 107 vs 138 microseconds on the small list). The gap widens at the larger scale, where its median time is roughly half (about 651 vs 1253 microseconds).
Could it be a simple rapply? See this example, where 46 is replaced by 92 (and the other way round, as added by @akrun):
li = list(list(c(1, 46, 107, 69, 1),
c(1, 146, 145, 71, 92, 1)))
# [[1]]
# [[1]][[1]]
# [1] 1 46 107 69 1
#
# [[1]][[2]]
# [1] 1 146 145 71 92 1
rapply(li, function(x) ifelse(x == 46, 92,ifelse(x==92, 46, x)), how = "list")
# [[1]]
# [[1]][[1]]
# [1] 1 92 107 69 1
#
# [[1]][[2]]
# [1] 1 146 145 71 46 1
The how = "list" makes sure you get the original structure back.
