I have a question about sampling: I would like to sample runs of successive numbers from a vector without replacement. Is there a simple way to do so?
For example,
sample(c(1:100), 10, replace = FALSE)
76 99 94 53 12 34 5 82 75 30
gives me 10 numbers between 1 and 100. Now I would like to have 10 sequences of 3 successive integers without replacement: c(2,3,4), c(10,11,12), c(82,83,84), etc.
The sequences can't overlap: if c(2,3,4) is my first sample, then none of the following ones can contain these numbers.
I would also like to be able to sample 10 sequences of different sizes, with the sizes given by a vector like
sizevec <- sample(c(1:4), 10, replace = TRUE)
Thanks for the help
set.seed(42)
lapply(sample(1:10, 1) + cumsum(sample(4:10, 10, TRUE)), function(x) x + 1:3)
# [[1]]
# [1] 21 22 23
# [[2]]
# [1] 27 28 29
# [[3]]
# [1] 36 37 38
# [[4]]
# [1] 44 45 46
# [[5]]
# [1] 51 52 53
# [[6]]
# [1] 60 61 62
# [[7]]
# [1] 64 65 66
# [[8]]
# [1] 72 73 74
# [[9]]
# [1] 80 81 82
# [[10]]
# [1] 87 88 89
A solution using two while loops to take the samples. After running the code, x is a list holding the desired output.
# Set seed for reproducibility
set.seed(123)
# Create a list to store values
x <- list()
# Create a vector to store values in x
y <- integer()
# Set the threshold to stop
threshold <- 4
# Set the condition
condition <- TRUE
while (length(x) < threshold){
  while (condition){
    # Sample a starting number between 1 and 98
    s <- sample(c(1:98), 1)
    # Create a sequence of three successive integers
    se <- s:(s + 2)
    # Check whether any value in se is already in y; keep sampling while it is
    condition <- any(se %in% y)
  }
  # Save se to the list
  x[[length(x) + 1]] <- se
  # Update y
  y <- unlist(x)
  # Reset the condition
  condition <- TRUE
}
# View the results
x
# [[1]]
# [1] 29 30 31
#
# [[2]]
# [1] 79 80 81
#
# [[3]]
# [1] 41 42 43
#
# [[4]]
# [1] 89 90 91
Hi, your question is unclear about whether the vectors may overlap. Assuming they may overlap, this should work:
lapply(sample(c(1:97), 10, replace = FALSE), function(i){ 0:2 + i})
A random length would then look like this:
lapply(sample(c(1:97), 10, replace = FALSE), function(i){ 0:sample(1:10,1) + i})
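For the non-overlapping case with sizes taken from a vector such as sizevec, one possible sketch (not taken from the answers above) is to treat each sequence as a single "slot" placed among the integers that stay unused. The helper name sample_runs and its arguments are made up for illustration. The sequences come out in increasing order of position, so shuffle the sizes first if their order along the range should be random too.

```r
# Hypothetical helper: draw non-overlapping runs of consecutive integers from 1:n.
# sizes[j] is the length of the j-th run (counted from the left).
sample_runs <- function(n, sizes) {
  k <- length(sizes)
  free <- n - sum(sizes)               # integers that stay uncovered
  stopifnot(k >= 1, free >= 0)
  # Line up `free` single integers and `k` runs, and pick which slots are runs
  run_slots <- sort(sample.int(free + k, k))
  # Number of uncovered integers to the left of each run
  gaps_before <- run_slots - seq_len(k)
  # Start = uncovered integers before + total length of earlier runs + 1
  starts <- gaps_before + cumsum(c(0, sizes[-k])) + 1
  Map(function(s, len) seq(s, length.out = len), starts, sizes)
}

set.seed(42)
sizevec <- sample(1:4, 10, replace = TRUE)
runs <- sample_runs(100, sizevec)
runs
```

Two runs may end up directly adjacent (for example 4:6 followed by 7:8), which still satisfies the no-overlap requirement; unlike rejection sampling, the loop never has to retry.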
I am using this code that generates 3 random numbers that add to 72:
# https://stackoverflow.com/questions/24845909/generate-n-random-integers-that-sum-to-m-in-r
rand_vect <- function(N, M, sd = 1, pos.only = TRUE) {
  vec <- rnorm(N, M/N, sd)
  if (abs(sum(vec)) < 0.01) vec <- vec + 1
  vec <- round(vec / sum(vec) * M)
  deviation <- M - sum(vec)
  for (. in seq_len(abs(deviation))) {
    vec[i] <- vec[i <- sample(N, 1)] + sign(deviation)
  }
  if (pos.only) while (any(vec < 0)) {
    negs <- vec < 0
    pos <- vec > 0
    vec[negs][i] <- vec[negs][i <- sample(sum(negs), 1)] + 1
    vec[pos][i] <- vec[pos ][i <- sample(sum(pos ), 1)] - 1
  }
  vec
}
If I run this code 100 times:
results <- list()
for (i in 1:100)
{
r_i = rand_vect(3,72)
results[[i]] <- r_i
}
When I look at the numbers that came out:
[[1]]
[1] 23 24 25
[[2]]
[1] 25 24 23
[[3]]
[1] 24 25 23
[[4]]
[1] 23 24 25
[[5]]
[1] 25 24 23
[[6]]
[1] 24 24 24
[[7]]
[1] 24 25 23
[[8]]
[1] 24 25 23
[[9]]
[1] 24 25 23
[[10]]
[1] 24 23 25
In each iteration, all the numbers add to 72 as expected - but the numbers don't really "look that random". They all seem to be "clumped" around "24, 23, 25". I was hoping to see more "randomness" in these numbers. For example:
[[11]]
[1] 5 50 17
[[12]]
[1] 12 40 20
Why are the numbers in the code I am using "clumped" around 24, 23, 25 - and how can I change the above code so that there is more "randomness" in the numbers being generated?
Thank you!
If you just want three random integers that sum to 72 you could do something like
diff(c(0, sort(sample(72, 2)), 72))
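The clumping in the question comes from rnorm(N, M/N, sd) with the default sd = 1: every draw starts within a few units of M/N = 24, so passing a larger sd to rand_vect spreads the values out. The one-liner above avoids the normal distribution altogether, but sample(72, 2) may pick 72 itself, in which case the last number is 0. A minimal sketch that always returns strictly positive integers, using the made-up name rand_parts, cuts the range at N - 1 distinct points between 1 and M - 1:

```r
# Hypothetical helper: N positive integers that sum to M (requires N <= M)
rand_parts <- function(N, M) {
  stopifnot(N >= 1, N <= M)
  # N - 1 distinct cut points in 1..(M - 1)
  cuts <- sort(sample.int(M - 1, N - 1))
  # Differences between consecutive cut points are the parts
  diff(c(0, cuts, M))
}

set.seed(1)
res <- replicate(100, rand_parts(3, 72), simplify = FALSE)
all(vapply(res, sum, numeric(1)) == 72)   # TRUE: every draw sums to 72
range(unlist(res))                        # spread of the individual parts
```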
I want to generate the permutations of a partition.
I generate the partitions with this code:
library(partitions)
x <- c(2,4,6)
parts <- listParts(length(x))
out <- rapply(parts, function(ii) x[ii], how="replace")
which produces
out
[[1]]
[1] (2,4,6)
[[2]]
[1] (2,6)(4)
[[3]]
[1] (2,4)(6)
[[4]]
[1] (4,6)(2)
[[5]]
[1] (2)(4)(6)
Take one element, for example [1] (2)(4)(6): I would like to generate all possible permutations of it. I tried:
library(combinat)
permn(x)
but the returned element is not in the same form as the input. For example, with the element
[1] (21,33,41,40,39,3,6)(13,37) it returns:
[[1]]
[[1]]$`1`
[1] 21 33 41 40 39 3 6
[[1]]$`2`
[1] 13 37
[[2]]
[[2]]$`2`
[1] 13 37
[[2]]$`1`
[1] 21 33 41 40 39 3 6
I asked a similar question some weeks ago, but the solution I was given generates the permutations for all possible partitions at the same time as it generates the partitions, and for efficiency reasons I can't use it.
The solution was this:
library(partitions)
permListParts <- function (x)
{
  f <- function(pp) {
    out <- split(seq_along(pp), pp)
    myPerms <- perms(length(out))
    apply(myPerms, 2, function(x) {
      temp <- out[x]
      class(temp) <- c(class(temp), "equivalence")
      temp
    })
  }
  apply(setparts(x), 2, f)
}
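If only one partition needs to be permuted at a time, a base R sketch along the following lines may be enough. It assumes the partition is stored as a plain list of blocks and that "all possible permutations" means every ordering of those blocks; perm_blocks is a made-up name, not a function from partitions or combinat.

```r
# Hypothetical helper: every ordering of the blocks of a single partition.
# `blocks` is a list; each element is one block (a vector).
perm_blocks <- function(blocks) {
  if (length(blocks) <= 1) return(list(blocks))
  out <- list()
  for (i in seq_along(blocks)) {
    # Put block i first, then append every ordering of the remaining blocks
    for (rest in perm_blocks(blocks[-i])) {
      out[[length(out) + 1]] <- c(blocks[i], rest)
    }
  }
  out
}

part <- list(c(21, 33, 41, 40, 39, 3, 6), c(13, 37))
perm_blocks(part)   # a list with the 2 orderings of the two blocks
```

The number of results grows as the factorial of the number of blocks, so this is only practical for partitions with few blocks.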
Problem:
I have a list of two lists of three vectors. I would like to remove the zero vector from each sublist.
Example:
x <- list(x1=c(0,0,0), x2=c(3,4,5), x3=c(45,34,23))
y <- list(y1=c(2,33,4), y2=c(0,0,0), y3=c(4,5,44))
z <- list(x, y)
Try:
I tried this:
res <- lapply(1:2, function(i) {lapply(1:3, function(j) z[[i]][[j]][z[[i]][[j]] != 0])})
Which gave me this:
> res
[[1]]
[[1]][[1]]
numeric(0)
[[1]][[2]]
[1] 3 4 5
[[1]][[3]]
[1] 45 34 23
[[2]]
[[2]][[1]]
[1] 2 33 4
[[2]][[2]]
numeric(0)
[[2]][[3]]
[1] 4 5 44
Problem with the output:
I do not want numeric(0).
Expected output:
x= list(x2, x3)
y=list(y1, y3)
Any idea, please?
You can try a tidyverse approach if the nested list structure is not important:
library(tidyverse)
z %>%
flatten() %>%
keep(~all(. != 0))
$x2
[1] 3 4 5
$x3
[1] 45 34 23
$y1
[1] 2 33 4
$y3
[1] 4 5 44
Given your structure of list of lists I would go with the following:
filteredList <- lapply(z, function(i) Filter(function(x) any(x != 0), i))
x <- filteredList[[1]]
y <- filteredList[[2]]
x
##$`x2`
##[1] 3 4 5
##$x3
##[1] 45 34 23
y
##$`y1`
##[1] 2 33 4
##$y3
##[1] 4 5 44
Define z as
z <- c(x, y)
# z <- unlist(z, recursive = FALSE) if you cannot define z yourself.
then use:
z[sapply(z, any)]
#$`x2`
#[1] 3 4 5
#$x3
#[1] 45 34 23
#$y1
#[1] 2 33 4
#$y3
#[1] 4 5 44
Please note:
As in C, every non-zero integer or numeric value is coerced to TRUE and zero to FALSE, so we can use that logic here: any() evaluates to FALSE only if all values are 0. With double vectors R also emits a coercion warning.
Or:
x <- list(x1=c(0,0,0), x2=c(3,4,5), x3=c(45,34,23))
y <- list(y1=c(2,33,4), y2=c(0,0,0), y3=c(4,5,44))
z <- list(x, y)
lapply(z, function(a) a[unlist(lapply(a, function(b) !identical(b, rep(0,3))))])
#[[1]]
#[[1]]$`x2`
#[1] 3 4 5
#
#[[1]]$x3
#[1] 45 34 23
#
#
#[[2]]
#[[2]]$`y1`
#[1] 2 33 4
#
#[[2]]$y3
#[1] 4 5 44
With purrr it can be really compact:
library(purrr)
map(z, keep ,~all(.!=0))
# [[1]]
# [[1]]$x2
# [1] 3 4 5
#
# [[1]]$x3
# [1] 45 34 23
#
#
# [[2]]
# [[2]]$y1
# [1] 2 33 4
#
# [[2]]$y3
# [1] 4 5 44
If it weren't for the annoying warnings, we could just do map(z, keep, all)
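The answers above use two slightly different predicates: any(x != 0) keeps a vector as soon as one element is non-zero, while all(. != 0) drops a vector that contains even a single zero. For the data in the question both give the same result; the difference only shows up with mixed vectors. A small base R illustration with made-up data:

```r
# Made-up data: `b` mixes zero and non-zero values
mixed <- list(list(a = c(0, 0, 0), b = c(0, 5, 3), c = c(1, 2, 3)))

# Keep vectors that have at least one non-zero element
keep_any <- lapply(mixed, function(s) Filter(function(v) any(v != 0), s))
# Keep only vectors in which every element is non-zero
keep_all <- lapply(mixed, function(s) Filter(function(v) all(v != 0), s))

names(keep_any[[1]])   # "b" "c"
names(keep_all[[1]])   # "c"
```

If "zero vector" means a vector made up entirely of zeros, the any() form is the one that matches that definition.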
I am working on a 230 x 230 matrix in R and I want to extract the 10 (or any number other than 1) largest entries in the matrix, both their positions and their values.
The extra problem is that this is a similarity matrix, so I have 1s on the diagonal, which of course I want to leave out of the search.
Any ideas or commands for that?
A neat way to do this in general is with the underused arrayInd function, which gives you row and column positions for plain jane vector positions. That's how which(..., arr.ind = TRUE) does it. Here's how you might do it:
## creating a random 230x230 matrix
n <- 230
set.seed(1)
m <- matrix(sample.int(100000, n*n, replace = TRUE), n, n)
diag(m) <- 1
## function to return the n largest values and their positions for matrix m
## sim = TRUE assumes m is symmetric, so every value appears twice and only
## every second position is kept
nlargest <- function(m, n, sim = TRUE) {
  mult <- 1
  if (sim) mult <- 2
  ## decreasing = TRUE puts the largest values first; NAs are sorted last
  res <- order(m, decreasing = TRUE)[seq_len(n) * mult]
  pos <- arrayInd(res, dim(m), useNames = TRUE)
  list(values = m[res],
       position = pos)
}
## exclude the diagonal from the search
diag(m) <- NA
## the random m above is not symmetric, so sim = FALSE is used here
nlargest(m, 10, sim = FALSE)
## returns a list: $values holds the 10 largest off-diagonal entries and
## $position is a matrix with their row and col indices
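For a genuinely symmetric similarity matrix, an alternative to taking every second position is to blank out the lower triangle together with the diagonal, so that each pair is considered once. This is only a sketch under that symmetry assumption; top_n_sym is a made-up name, while lower.tri, order and arrayInd are base R.

```r
# Hypothetical helper: n largest off-diagonal entries of a symmetric matrix
top_n_sym <- function(m, n) {
  # Drop the diagonal and the duplicated lower triangle
  m[lower.tri(m, diag = TRUE)] <- NA
  # Largest values first; order() puts NA last by default
  idx <- order(m, decreasing = TRUE)[seq_len(n)]
  pos <- arrayInd(idx, dim(m))
  data.frame(row = pos[, 1], col = pos[, 2], value = m[idx])
}

# Small symmetric example
s <- matrix(c(1.0, 0.9, 0.2,
              0.9, 1.0, 0.5,
              0.2, 0.5, 1.0), nrow = 3, byrow = TRUE)
top_n_sym(s, 2)
#   row col value
# 1   1   2   0.9
# 2   2   3   0.5
```

n should not exceed the number of off-diagonal pairs, otherwise the extra rows refer to NA entries.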
If I have a vector as such:
dat <- c(1,2,3,4,5,19,20,21,56,80,81,92)
How can I break it up into a list as:
[[1]]
1 2 3 4 5
[[2]]
19 20 21
[[3]]
56
[[4]]
80 81
[[5]]
92
Just use split in conjunction with diff:
> split(dat, cumsum(c(1, diff(dat) != 1)))
$`1`
[1] 1 2 3 4 5
$`2`
[1] 19 20 21
$`3`
[1] 56
$`4`
[1] 80 81
$`5`
[1] 92
Not exactly what you asked for, but the "R.utils" package has a couple of related fun functions:
library(R.utils)
seqToIntervals(dat)
# from to
# [1,] 1 5
# [2,] 19 21
# [3,] 56 56
# [4,] 80 81
# [5,] 92 92
seqToHumanReadable(dat)
# [1] "1-5, 19-21, 56, 80-81, 92"
I think Robert Krzyzanowski is correct. So here is a tidyverse approach that involves placing the vector into a tibble (data frame).
library(tidyverse)
# library(dplyr)
# library(tidyr)
df <- c(1,2,3,4,5,19,20,21,56,80,81,92) %>%
  tibble(dat = .)
# using lag()
df %>%
  group_by(seq_id = cumsum(dat != lag(dat) + 1 | is.na(dat != lag(dat) + 1))) %>%
  nest()
# using diff()
df %>%
  group_by(seq_id = cumsum(c(1, diff(dat)) != 1)) %>%
  nest()
Of course, you need not nest the resulting groups into list-columns, and can instead perform some kind of summary operation.
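As one example of such a summary, the same grouping vector can be used in base R to report the start, end and length of each run. A minimal sketch; the names run_id, runs and run_summary are only illustrative:

```r
dat <- c(1,2,3,4,5,19,20,21,56,80,81,92)

# A new run starts wherever the step from the previous value is not 1
run_id <- cumsum(c(1, diff(dat) != 1))
runs <- split(dat, run_id)

# One row per run: first value, last value, number of elements
run_summary <- data.frame(
  from = vapply(runs, min, numeric(1)),
  to   = vapply(runs, max, numeric(1)),
  n    = lengths(runs)
)
run_summary
```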