Is there a straightforward way to generate all possible permutations of a vector of integers (1 to max 999) that specifically excludes duplicated elements?
For example, for a vector with three elements in a range of 1 to 9, the sequence 1 2 3 would be acceptable, as would 1 2 9, but 1 2 2 would be invalid. The sequence must contain exactly n elements (in this case, three). EDIT: to avoid confusion, the order is significant, so 1 2 9 and 9 2 1 are both valid and required.
There are many questions on permutations and combinations using R on SO (such as this and this) but none that seem to fit this particular case. I'm hoping there's an obscure base R or package function out there that will take care of it without me having to write a graceless function myself.
Using gtools package:
require(gtools)
permutations(n = 9, r = 3, v = 1:9)
# n -> size of source vector
# r -> size of target vector
# v -> source vector, defaults to 1:n
# repeats.allowed = FALSE (default)
utils::combn() and combinat::combn() (combinations) or combinat::permn() (permutations of a whole vector) are related alternatives.
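If you'd rather avoid packages, here is a minimal base-R sketch (fine for small n, since it enumerates all 9^3 tuples before filtering):
x <- 1:9
g <- expand.grid(rep(list(x), 3))                             # all ordered triples from 1:9
perms <- g[apply(g, 1, function(r) anyDuplicated(r) == 0), ]  # drop tuples with repeats
nrow(perms)
# [1] 504   (= 9 * 8 * 7)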
EDIT: this is not what the OP asked for, but I'm leaving this answer up to avoid confusion.
My math is a little rusty, but I think you are describing combinations, not permutations. The base function combn() returns combinations.
I'll illustrate with a manageable set: all combinations of length 3 from the vector 1:4:
combn(4, 3)
     [,1] [,2] [,3] [,4]
[1,]    1    1    1    2
[2,]    2    2    3    3
[3,]    3    4    4    4
The difference between combinations and permutations is that in combinations the order doesn't matter. So, (2, 3, 4) and (4, 3, 2) are the same combination, but different permutations.
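To see the difference in counts directly (a quick check, assuming gtools is installed):
ncol(combn(4, 3))                  # 4 combinations (order ignored)
nrow(gtools::permutations(4, 3))   # 24 permutations (order matters)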
I was searching online to find out whether it is possible to create a vector subject to certain conditions: each value must contain the digits 2 and 6 but not 5 and 1, must lie in a specific range (2,000,000 to 4,999,999), and must be even.
I genuinely have no idea how to express these conditions in R, even though I know the basic functions for creating a vector.
Thanks in advance for your time and help.
You can try the code below:
# create a sequence from 2000000 to 4999999
v <- 2e6:(5e6 - 1)
# filter the sequence with given criteria
v[grepl("(2.*6)|(6.*2)", v) & !grepl("(1.*5)|(5.*1)", v)]
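The question also asks for even numbers; assuming "even" refers to the numeric value itself, a natural extension of the same filter adds a modulus test:
# additionally keep only even values (assumption: evenness of the number itself)
res <- v[grepl("(2.*6)|(6.*2)", v) & !grepl("(1.*5)|(5.*1)", v) & v %% 2 == 0]
head(res)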
You can create it using the seq() function.
a <- seq(from = 2, to = 7, by = 2)
a
#> [1] 2 4 6
Then use the setdiff() function to remove the specific values you don't need.
remove <- c(2)
setdiff(a, remove)
#> [1] 4 6
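An alternative that keeps duplicates and the original order (setdiff() returns only unique values) is plain logical indexing:
a[!a %in% remove]
#> [1] 4 6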
Let's say I have a vector of integers 1:6
w=1:6
I am attempting to obtain a matrix of 90 rows and 6 columns that contains the multinomial combinations from these 6 integers taken as 3 groups of size 2.
6!/(2!*2!*2!)=90
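For reference, that count can be checked directly in R:
factorial(6) / prod(factorial(c(2, 2, 2)))
# [1] 90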
So, columns 1 and 2 of the matrix would represent group 1, columns 3 and 4 would represent group 2 and columns 5 and 6 would represent group 3. Something like:
1 2 3 4 5 6
1 2 3 5 4 6
1 2 3 6 4 5
1 2 4 5 3 6
1 2 4 6 3 5
...
Ultimately, I would want to expand this to other multinomial combinations of limited size (because the numbers get large rather quickly) but I am having trouble getting things to work. I've found several functions that do binomial combinations (only 2 groups) but I could not locate any functions that do this when the number of groups is greater than 2.
I've tried two approaches to this:
1. Building up the matrix from nothing using for loops, and attempting things with the reshape package (thinking there might be something there for this with melt()).
2. Working backwards from the permutation matrix (720 rows) by attempting to retain unique rows within groups and/or remove duplicated rows within groups.
Neither worked for me.
The permutation matrix can be obtained with
library(gtools)
dat=permutations(6, 6, set=TRUE, repeats.allowed=FALSE)
I think working backwards from the full permutation matrix is a bit excessive, but I'm trying anything at this point.
Is there a package with a prebuilt function for this? Anyone have any ideas how I should proceed?
Here is how you can implement your "working backwards" approach:
gps <- list(1:2, 3:4, 5:6)        # column indices of the three groups
get.col <- function(x, j) x[, j]  # pull out one group's columns
# TRUE for rows whose group columns appear in increasing order
is.ordered <- function(x) !colSums(diff(t(x)) < 0)
# a row is valid only if every group is internally ordered
is.valid <- Reduce(`&`, Map(is.ordered, Map(get.col, list(dat), gps)))
dat <- dat[is.valid, ]
nrow(dat)
# [1] 90
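A quick check that the first retained rows match the layout in the question (assuming gtools' lexicographic ordering of permutations):
head(dat, 3)
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    2    3    4    5    6
# [2,]    1    2    3    5    4    6
# [3,]    1    2    3    6    4    5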
Consider the following matrix:
MAT <- matrix(nrow=3,ncol=3,1:9)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
I want to retrieve the row number if I provide a vector which exactly matches a row in MAT. So if I provide c(2,5,8), I should get back 2. I'm unsure how to accomplish this; the closest thing I know is using which to find the location of a single number in a matrix. An alternative could be a very slow quadruple for loop checking whether the given vector matches a row in the matrix. Is there a one-line solution for this problem?
You can use identical() to test, an apply() loop over the rows, and which() to identify the match:
which(apply(MAT, 1, function(x) identical(x, c(2L, 5L, 8L))))
[1] 2
Note that the values in the matrix are stored as integers, so you need to specify that in the vector you test against (hence 2L, 5L, 8L).
You can apply a simple matching function to each row, then use which to find the row number:
search_vec = c(2, 5, 8)
vec_matches = apply(MAT, 1, function(row, search_vec) all(row == search_vec), search_vec)
row_num = which(vec_matches)
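A loop-free variant of the same idea (a sketch that relies on R recycling search_vec down the columns of t(MAT)):
which(colSums(t(MAT) == search_vec) == ncol(MAT))
# [1] 2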
When subsetting arrays, R behaves differently depending on whether one of the dimensions is of length 1 or not. If a dimension has length 1, that dimension is lost during subsetting:
ax <- array(1:24, c(2,3,4))
ay <- array(1:12, c(1,3,4))
dim(ax)
#[1] 2 3 4
dim(ay)
#[1] 1 3 4
dim(ax[,1:2,])
#[1] 2 2 4
dim(ay[,1:2,])
#[1] 2 4
From my point of view, ax and ay are the same, and performing the same subset operation on them should return an array with the same dimensions. I can see that the way that R is handling the two cases might be useful, but it's undesirable in the code that I'm writing. It means that when I pass a subsetted array to another function, the function will get an array that's missing a dimension, if I happened to reduce a dimension to length 1 at an earlier stage. (So in this case R's flexibility is making my code less flexible!)
How can I prevent R from losing a dimension of length 1 during subsetting? Is there another way of indexing? Some flag to set?
As you've found out, by default R drops dimensions of length 1. Adding drop = FALSE while indexing prevents this:
> dim(ay[,1:2,])
[1] 2 4
> dim(ax[,1:2,])
[1] 2 2 4
> dim(ay[, 1:2, , drop = FALSE])
[1] 1 2 4
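If the subsetting happens inside helper functions, it may be worth baking drop = FALSE into them so callers always receive a full three-dimensional array. A minimal sketch (subset_cols is a hypothetical helper, not an existing function):
subset_cols <- function(a, j) a[, j, , drop = FALSE]  # hypothetical helper
dim(subset_cols(ay, 1:2))
# [1] 1 2 4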
How can I reduce a vector to a lower dimension?
Say, for example, X := (1,2,3,4,5,6,7,8,9,10) is a 10-dimensional vector, and suppose I want to reduce it to a 5-dimensional space. Is there any way to do this?
I have a situation where I need to compare an N-d vector with a corresponding vector of a lower dimension.
There are infinitely many ways to convert a 10-d vector into a 5-d vector.
This is like saying "I want a function that takes two integer parameters and returns an integer; can I make such a function?" There are infinitely many such functions.
It really depends on what you want to do with the vector. What are the meanings of your 10d and 5d vectors?
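For instance, one of infinitely many possible 10-d to 5-d maps is to average consecutive pairs of entries (purely illustrative, not a recommendation):
x <- 1:10
colMeans(matrix(x, nrow = 2))  # mean of each consecutive pair
# [1] 1.5 3.5 5.5 7.5 9.5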
If my assumption is right, the OP would like to convert a vector of 10 values to a matrix with 2 columns.
This could be done easily in R:
# make up the demo data
> v <- c(1,2,3,4,5,6,7,8,9,10)
# modify the dimensions of 'v' to have 2 columns
> dim(v) <- c(5,2)
# and check the result
> v
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
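The same reshaping can be done in one step with matrix(), which also fills column-wise by default:
matrix(1:10, ncol = 2)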