I would like to generate all combinations of two vectors, given two constraints: there can never be more than 3 characters from the first vector, and there must always be at least one character from the second vector. I would also like to vary the final number of characters in the combination.
For instance, here are two vectors:
vec1=c("A","B","C","D")
vec2=c("W","X","Y","Z")
Say I wanted 3 characters in the combination. Possible acceptable permutations would be: "A" "B" "X" or "A" "Y" "Z". An unacceptable permutation would be: "A" "B" "C", since there is not at least one character from vec2.
Now say I wanted 5 characters in the combination. Possible acceptable permutations would be: "A" "C" "Z" "Y" or "A" "Y" "Z" "X". An unacceptable permutation would be: "A" "C" "D" "B" "X", since there are more than 3 characters from vec1.
I suppose I could use expand.grid to generate all combinations and then somehow subset, but there must be an easier way. Thanks in advance!
I'm not sure whether this is easier, but you can avoid generating results that do not satisfy your conditions with this strategy:
1. Generate all combinations from vec1 that are acceptable.
2. Generate all combinations from vec2 that are acceptable.
3. Generate all combinations taking one solution from 1. plus one solution from 2. Here I'd filter on the third condition (the total number of characters) afterwards.
4. (If you're looking for combinations, you're done; otherwise:) produce all permutations of the letters within each result.
Now, let's have
vec1 <- LETTERS [1:4]
vec2 <- LETTERS [23:26]
## lists can eat up lots of memory, so use character vectors instead.
combine <- function (x, y)
combn (y, x, paste, collapse = "")
res1 <- unlist (lapply (0:3, combine, vec1))
res2 <- unlist (lapply (1:length (vec2), combine, vec2))
Now we have:
> res1
[1] "" "A" "B" "C" "D" "AB" "AC" "AD" "BC" "BD" "CD" "ABC"
[13] "ABD" "ACD" "BCD"
> res2
[1] "W" "X" "Y" "Z" "WX" "WY" "WZ" "XY" "XZ" "YZ"
[11] "WXY" "WXZ" "WYZ" "XYZ" "WXYZ"
res3 <- outer (res1, res2, paste0)
res3 <- res3 [nchar (res3) == 5]
So here you are:
> res3
[1] "ABCWX" "ABDWX" "ACDWX" "BCDWX" "ABCWY" "ABDWY" "ACDWY" "BCDWY" "ABCWZ"
[10] "ABDWZ" "ACDWZ" "BCDWZ" "ABCXY" "ABDXY" "ACDXY" "BCDXY" "ABCXZ" "ABDXZ"
[19] "ACDXZ" "BCDXZ" "ABCYZ" "ABDYZ" "ACDYZ" "BCDYZ" "ABWXY" "ACWXY" "ADWXY"
[28] "BCWXY" "BDWXY" "CDWXY" "ABWXZ" "ACWXZ" "ADWXZ" "BCWXZ" "BDWXZ" "CDWXZ"
[37] "ABWYZ" "ACWYZ" "ADWYZ" "BCWYZ" "BDWYZ" "CDWYZ" "ABXYZ" "ACXYZ" "ADXYZ"
[46] "BCXYZ" "BDXYZ" "CDXYZ" "AWXYZ" "BWXYZ" "CWXYZ" "DWXYZ"
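Since the question also asks to vary the total number of characters, the steps above could be wrapped in a function. This is only a sketch: constrained_combos, pick, n, max1 and min2 are names made up for illustration, not part of the answer's code. Instead of filtering with nchar afterwards, it loops over how many letters come from vec1 and takes the rest from vec2.

```r
# Hypothetical helper (not part of the original answer).
# n    : total number of letters wanted in each combination
# max1 : most letters allowed from vec1
# min2 : fewest letters required from vec2
constrained_combos <- function(vec1, vec2, n, max1 = 3, min2 = 1) {
  # all k-letter combinations of v, collapsed to strings; "" when k is 0
  pick <- function(v, k) {
    if (k == 0) "" else combn(v, k, paste, collapse = "")
  }
  upper <- min(max1, length(vec1), n - min2)
  if (upper < 0) return(character(0))
  out <- character(0)
  for (k1 in 0:upper) {
    k2 <- n - k1                  # letters still needed from vec2
    if (k2 > length(vec2)) next   # vec2 cannot supply that many
    out <- c(out, as.vector(outer(pick(vec1, k1), pick(vec2, k2), paste0)))
  }
  out
}

vec1 <- LETTERS[1:4]
vec2 <- LETTERS[23:26]
combos5 <- constrained_combos(vec1, vec2, 5)
combos3 <- constrained_combos(vec1, vec2, 3)
length(combos5)   # 52
```

The strings come out in a different order than with the outer-then-filter approach, but for n = 5 the count is the same 52.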
If you prefer the results split into single letters:
res <- matrix (unlist (strsplit (res3, "")), nrow = length (res3), byrow = TRUE)
> res
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "B" "C" "W" "X"
[2,] "A" "B" "D" "W" "X"
[3,] "A" "C" "D" "W" "X"
[4,] "B" "C" "D" "W" "X"
(snip)
[51,] "C" "W" "X" "Y" "Z"
[52,] "D" "W" "X" "Y" "Z"
Which are your combinations.
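If permutations are wanted rather than combinations (the optional last step of the strategy), each result still has to be expanded into all of its orderings. A minimal base R sketch follows; all_perms and permute_string are hypothetical helper names, and a package function such as combinat::permn could be used instead.

```r
# Hypothetical helper: all orderings of the elements of v, as a list of vectors.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    # put v[i] first, then append every ordering of the remaining elements
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Expand one combination string such as "ABX" into all of its orderings.
permute_string <- function(s) {
  chars <- strsplit(s, "")[[1]]
  vapply(all_perms(chars), paste, character(1), collapse = "")
}

permute_string("ABX")
# [1] "ABX" "AXB" "BAX" "BXA" "XAB" "XBA"
```

Applied to every string, e.g. unlist(lapply(res3, permute_string)), this grows quickly: each 5-letter result has 5! = 120 orderings, so the 52 combinations become 6240 strings.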
Related
I have character data like this
[[1]]
[1] "F" "S"
[[2]]
[1] "Y" "Q" "Q"
[[3]]
[1] "C" "T"
[[4]]
[1] "G" "M"
[[5]]
[1] "A" "M"
And I want to generate all permutations for each individual list (not mixed between lists) and combine them together into one big list.
For example, for the first and second lists, which are "F" "S" and "Y" "Q" "Q", I want to get the permutation lists as c("FS", "SF"), and c("YQQ", "QYQ", "QQY"), and then combine them into one.
Here's an approach with combinat::permn:
library(combinat)
lapply(data,function(x)unique(sapply(combinat::permn(x),paste,collapse = "")))
#[[1]]
#[1] "FS" "SF"
#
#[[2]]
#[1] "YQQ" "QYQ" "QQY"
#
#[[3]]
#[1] "CT" "TC"
#
#[[4]]
#[1] "GM" "MG"
#
#[[5]]
#[1] "AM" "MA"
Or together with unlist:
unlist(lapply(data,function(x)unique(sapply(combinat::permn(x),paste,collapse = ""))))
# [1] "FS" "SF" "YQQ" "QYQ" "QQY" "CT" "TC" "GM" "MG" "AM" "MA"
Data:
data <- list(c("F", "S"), c("Y", "Q", "Q"), c("C", "T"), c("G", "M"),
c("A", "M"))
It looks like your desired output is not exactly the same as this related post (Generating all distinct permutations of a list in R). But we can build on the answer there.
library(combinat)
# example data, based on your description
X <- list(c("F","S"), c("Y", "Q", "Q"))
result <- lapply(X, function(x1) {
unique(sapply(permn(x1), function(x2) paste(x2, collapse = "")))
})
print(result)
Output
[[1]]
[1] "FS" "SF"
[[2]]
[1] "YQQ" "QYQ" "QQY"
The first (outer) lapply iterates over each element of the list, each of which is a vector of individual letters. On each iteration, permn takes the individual letters (e.g. "F" and "S") and returns a list with all possible permutations (e.g. "F" "S" and "S" "F"). To format the output as you described, the inner sapply takes each of those permutations and collapses it into a single character value, and unique then drops the repeated values.
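As a quick sanity check on the unique step, the number of distinct permutations to expect can be computed without generating them: n! divided by the factorials of the repeat counts. A small sketch (n_distinct_perms is a made-up name):

```r
# Hypothetical helper: number of distinct orderings of x, counting repeats once.
n_distinct_perms <- function(x) {
  counts <- as.vector(table(x))   # how often each distinct value occurs
  factorial(length(x)) / prod(factorial(counts))
}

n_distinct_perms(c("F", "S"))        # 2
n_distinct_perms(c("Y", "Q", "Q"))   # 3
```

For "Y" "Q" "Q", permn produces 3! = 6 orderings, of which only 3 are distinct, which is why unique is needed.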
library(combinat)
# unique() drops the repeated strings that permn() yields for repeated letters (e.g. "Y" "Q" "Q")
final <- unlist(lapply(X, function(test_X) unique(sapply(permn(test_X), paste, collapse = ''))))
I have a large list of lists where I want to remove duplicated elements in each list. Example:
x <- list(c("A", "A", "B", "C"), c("O", "C", "A", "Z", "O"))
x
[[1]]
[1] "A" "A" "B" "C"
[[2]]
[1] "O" "C" "A" "Z" "O"
I want the result to be a list that looks like this, where duplicates within a list are removed, but the structure of the list remains.
[[1]]
[1] "A" "B" "C"
[[2]]
[1] "O" "C" "A" "Z"
My main strategy has been to use rapply (also tried lapply) to identify duplicates and remove them. I tried:
x[rapply(x, duplicated) == T]
but received the following error:
"Error: (list) object cannot be coerced to type 'logical'"
Does anyone know a way to solve this issue?
Thanks!
We can use lapply with unique:
lapply(x, unique)
#[[1]]
#[1] "A" "B" "C"
#[[2]]
#[1] "O" "C" "A" "Z"
The issue with rapply is that it applies duplicated recursively and, by default, returns a single flattened vector instead of a list of logical vectors:
rapply(x, duplicated)
#[1] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
Instead, it can be:
lapply(x, function(u) u[!duplicated(u)])
#[[1]]
#[1] "A" "B" "C"
#[[2]]
#[1] "O" "C" "A" "Z"
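If you would rather stay with rapply, its how argument controls the shape of the result. With how = "list" the list structure is kept, so something along these lines should also work (the variable names dups, deduped and deduped2 are just for illustration):

```r
x <- list(c("A", "A", "B", "C"), c("O", "C", "A", "Z", "O"))

# how = "list" keeps the nesting instead of flattening the result
dups <- rapply(x, duplicated, how = "list")

# drop the flagged positions, element by element
deduped <- Map(function(v, d) v[!d], x, dups)

# or let rapply apply unique() to every leaf directly
deduped2 <- rapply(x, unique, how = "list")
```

Both deduped and deduped2 should match the result of lapply(x, unique); for a flat list of vectors like this one, lapply remains the simpler choice.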
I am trying to remove the text before and including a character ("-") for every element in a list.
Ex-
x = list(c("a-b","b-c","c-d"),c("a-b","e-f"))
desired output:
"b" "c" "d"
"b" "f"
I have tried using various combinations of lapply and gsub, such as
lapply(x,gsub,'.*-','',x)
but this just returns a null list-
[[1]]
[1] ""
[[2]]
[1] ""
And only using
gsub(".*-","",x)
returns
"d\")" "f\")"
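You are close. lapply passes each list element to gsub as its first unnamed argument, so in your call the element is matched to pattern rather than to the text being searched. Naming pattern and replacement explicitly leaves x as the only remaining slot for the list element: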
x <- list(c("a-b","b-c","c-d"),c("a-b","e-f"))
lapply(x, gsub, pattern = "^.*-", replacement = "")
[[1]]
[1] "b" "c" "d"
[[2]]
[1] "b" "f"
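Another way to avoid the argument-matching confusion is an anonymous function, where it is visible which argument receives the list element. The sketch below also uses a slightly different pattern, which is my own variation and not required by the question: "^[^-]*-" removes text only up to the first hyphen, whereas ".*-" is greedy and removes everything up to the last one.

```r
x <- list(c("a-b", "b-c", "c-d"), c("a-b", "e-f"))

# the anonymous function makes it explicit which argument receives the element
res <- lapply(x, function(el) sub("^[^-]*-", "", el))

# with more than one hyphen the two patterns differ
gsub("^.*-", "", "a-b-c")      # "c"   (greedy: cuts through the last hyphen)
sub("^[^-]*-", "", "a-b-c")    # "b-c" (cuts through the first hyphen only)
```

For the example data, where every string contains exactly one hyphen, both patterns give the same result.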
This can also be done with a for loop:
val <- vector("list", length(x))   # preallocate one slot per list element
for (i in seq_along(x)) {
  val[[i]] <- gsub('.*-', "", x[[i]])
}
val
[[1]]
[1] "b" "c" "d"
[[2]]
[1] "b" "f"
I have a list L of unnamed character vectors of unequal length. I need to drop the vectors that have fewer than 4 elements from L. How can this be done? Example L:
> L
[[1]]
[1] "A" "B" "C" "D"
[[2]]
[1] "E" "F" "G"
In the example above I would like to end up with:
> L
[[1]]
[1] "A" "B" "C" "D"
We can use lengths to get the length of each list element as a vector, create a logical vector based on that, and use it to subset the list:
L[lengths(L)>3]
#[[1]]
#[1] "A" "B" "C" "D"
A less optimized approach (used earlier) is to loop through the list elements with sapply, get each length, and use that to subset:
L[sapply(L, length)>3]
data
L <- list(LETTERS[1:4], LETTERS[5:7])
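For completeness, base R's Filter expresses the same idea as "keep the elements for which a condition holds". A small sketch, where min_len and kept are hypothetical names for the cutoff and the result:

```r
L <- list(LETTERS[1:4], LETTERS[5:7])

min_len <- 4   # hypothetical name for the cutoff
kept <- Filter(function(v) length(v) >= min_len, L)
```

This should give the same one-element list as L[lengths(L) > 3]; it is mostly a matter of taste, although lengths is likely faster on long lists.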
I want to apply a long index vector (50+ non-sequential integers) to a long list of vectors (50+ character vectors containing 100+ names) in order to retrieve specific values (as a list, vector, or data frame).
A simplified example is below:
> my.list <- list(c("a","b","c"),c("d","e","f"))
> my.index <- 2:3
Desired Output
[[1]]
[1] "b"
[[2]]
[1] "f"
##or
[1] "b"
[1] "f"
##or
[1] "b" "f"
I know I can get the same value from each element using:
> lapply(my.list, function(x) x[2])
##or
> lapply(my.list,'[', 2)
I can pull the second and third values from each element by:
> lapply(my.list,'[', my.index)
[[1]]
[1] "b" "c"
[[2]]
[1] "e" "f"
##or
> for(j in my.index) for(i in seq_along(my.list)) print(my.list[[i]][[j]])
[1] "b"
[1] "e"
[1] "c"
[1] "f"
I don't know how to pull just the one value from each element.
I've been looking for a few days and haven't found any examples of this being done, but it seems fairly straightforward. Am I missing something obvious here?
Thank you,
Scott
Whenever you have a problem that is like lapply but involves multiple parallel lists/vectors, consider Map or mapply (Map simply being a wrapper around mapply with SIMPLIFY=FALSE hardcoded).
Try this:
Map("[",my.list,my.index)
#[[1]]
#[1] "b"
#
#[[2]]
#[1] "f"
..or:
mapply("[",my.list,my.index)
#[1] "b" "f"