Generate all possible permutations (or n-tuples) - r

I'd like to create a data.frame of all possible permutations of 10 variables that can be either 1 or 2
2*2*2*2*2*2*2*2*2*2 = 1024 # possible
1,1,1,1,1,1,1,1,1,1
1,2,1,1,1,1,1,1,1,1
1,2,2,1,1,1,1,1,1,1
1,2,2,2,1,1,1,1,1,1
...
Is there a "quick" way to do this in R?

How about this:
tmp = expand.grid(1:2,1:2,1:2,1:2,1:2,1:2,1:2,1:2,1:2,1:2)
or this (thanks Tyler):
x <- list(1:2)
tmp = expand.grid(rep(x, 10))
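The same idea can be wrapped in a small function for any set of values and any tuple length. This is only a sketch: the helper name all_tuples and the column names V1, V2, ... are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper: every length-k tuple whose entries come from `values`.
all_tuples <- function(values, k) {
  # expand.grid accepts a list of vectors; rep() repeats the same vector k times
  grid <- expand.grid(rep(list(values), k))
  names(grid) <- paste0("V", seq_len(k))  # illustrative column names
  grid
}

tmp <- all_tuples(1:2, 10)
dim(tmp)            # rows = 2^10, columns = 10
anyDuplicated(tmp)  # 0 means every row is a distinct tuple
```

Note that expand.grid varies the first column fastest, so the rows come out in that order rather than in the order sketched in the question.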

Some people have asked the same question about letters. The expand.grid solution is usually given, but I find this to be much simpler:
library(magrittr)  # provides the %>% pipe used below
sapply(LETTERS[1:3], function(x){paste0(x, LETTERS[1:3])}) %>% c()
# [1] "AA" "AB" "AC" "BA" "BB" "BC" "CA" "CB" "CC"

Related

Create a list of sequential letters like in excel's header in R [duplicate]

This question already has answers here:
Numeric to Alphabetic Lettering Function in R [duplicate]
(4 answers)
is there a way to extend LETTERS past 26 characters e.g., AA, AB, AC...?
(9 answers)
Closed 2 years ago.
How can I create a vector of sequential letters like Excel's column headers in R? I have 559 columns in my Excel file, so I would like a vector such as "A, B, ..., Z, AA, AB, ..., AZ, BA, BB, ..." and so on. I want to build my own data dictionary, so I need to map to Excel's column headers.
If you are interested in a base R solution, you can try this:
all <- expand.grid(LETTERS, LETTERS)
all <- all[order(all$Var1,all$Var2),]
out <- c(LETTERS, do.call('paste0',all))
out is a character vector of 702 values (26 single letters plus 26^2 = 676 pairs). I believe you want only the first 559, so you can write out[1:559].
To rename your columns, use the following, where data_frame is the name of your data frame:
names(data_frame) <- out[1:559]
One important note: I am assuming here that you need column names of at most two characters.
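The two-character assumption can be made explicit with a guard. The sketch below uses a hypothetical helper name, excel_labels, and simply refuses requests beyond the 702 labels that one and two letters can provide.

```r
# Hypothetical helper: Excel-style labels "A".."Z", "AA".."ZZ" (at most 702).
excel_labels <- function(n) {
  stopifnot(n >= 1, n <= 26 + 26^2)
  # expand.grid varies its first argument fastest, so put the second letter first
  grid <- expand.grid(second = LETTERS, first = LETTERS,
                      stringsAsFactors = FALSE)
  c(LETTERS, paste0(grid$first, grid$second))[seq_len(n)]
}

labels <- excel_labels(559)
labels[c(26, 27, 52, 53)]  # the boundaries Z -> AA and AZ -> BA
labels[559]                # label of the last column
```

Building the grid in this order avoids the separate order() step, at the cost of a slightly less obvious argument order.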
A generic approach using gtools
comb <- lapply(1:3, function(x)gtools::permutations(26,x, LETTERS, repeats.allowed = TRUE))
## 1:3 covers Excel-style labels of one, two, and three letters
unlist(lapply(comb, function(x)do.call('paste0', data.frame(x,stringsAsFactors = FALSE))))
Some observations:
> out[1:50]
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K"
[12] "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V"
[23] "W" "X" "Y" "Z" "AA" "AB" "AC" "AD" "AE" "AF" "AG"
[34] "AH" "AI" "AJ" "AK" "AL" "AM" "AN" "AO" "AP" "AQ" "AR"
[45] "AS" "AT" "AU" "AV" "AW" "AX"
num2let takes a number or vector of numbers and converts it to letter notation. Pass the sequence 1:n to get sequential output.
We also provide the inverse function let2num, which takes codes and returns the numbers.
The main advantages of this approach are that (1) it also works on input vectors that are not sequences, e.g. num2let(599) finds the letter code for column number 599; (2) if used with a sequence, the sequence length does not have to be a multiple or a power of 26; (3) it can be used with other codes and bases, not just the 26 LETTERS, e.g. num2let(1:40, head(letters)); (4) there is no upper bound on n, e.g. num2let(1e10) works; (5) both directions are provided; and (6) only base R is used.
num2let <- function(n, lets = LETTERS) {
  base <- length(lets)
  if (length(n) > 1) return(sapply(n, num2let, lets = lets))
  stopifnot(n > 0)
  out <- ""
  repeat {
    if (n > base) {
      rem <- (n-1) %% base
      n <- (n-1) %/% base
      out <- paste0(lets[rem+1], out)
    } else return( paste0(lets[n], out) )
  }
}
let2num <- function(x, lets = LETTERS) {
  base <- length(lets)
  s <- strsplit(x, "")
  sapply(s, function(x) sum((match(x, lets)) * base ^ seq(length(x) - 1, 0)))
}
Test
lets <- num2let(1:40)
lets
## [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O"
## [16] "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z" "AA" "AB" "AC" "AD"
## [31] "AE" "AF" "AG" "AH" "AI" "AJ" "AK" "AL" "AM" "AN"
let2num(lets)
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
num2let(599)
## [1] "WA"
let2num("WA")
## [1] 599
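The arithmetic behind the "WA" example can be checked by hand. The sketch below does not call the functions above; it only spells out the positional scheme, in which every letter is a digit from 1 to 26 and there is no zero digit. The variable names are illustrative.

```r
# Decode "WA": weight each letter position by a power of 26.
digits <- match(c("W", "A"), LETTERS)   # 23 and 1
value  <- sum(digits * 26^c(1, 0))      # 23 * 26 + 1 * 1
value

# Encode 599 again: subtract 1 before dividing, because the digits run
# from 1 to 26 rather than from 0 to 25.
last  <- (599 - 1) %% 26 + 1            # position of the last letter
first <- (599 - 1) %/% 26               # what is left, here a single letter
code  <- paste0(LETTERS[first], LETTERS[last])
code
```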
Other
I previously answered this question in "Numeric to Alphabetic Lettering Function in R"; my answers there take different approaches from the one here, so you can look at those too.
Base R one liner:
c(LETTERS, unlist(sapply(seq_along(LETTERS), function(i){paste0(LETTERS[i], LETTERS)})))

Misunderstanding sub function

I'm working on a Shiny app that loops through an html file replacing an instance of a phrase with a different phrase relative to its position.
That is, the first time "aa" comes, I put "bluh",
the second time "aa" comes, I put "gfgf".
I have a table of all the 2nd phrases in order.
I think I'm misunderstanding the sub function documentation:
The two *sub functions differ only in that sub replaces only the first
occurrence of a pattern whereas gsub replaces all occurrences.
But here is a minimal reproducible example:
tt <- c("aa", "aa","bb","aa")
sub("aa","test",tt)
# [1] "test" "test" "bb" "test"
gsub("aa","test",tt)
# [1] "test" "test" "bb" "test"
tt
# [1] "aa" "aa" "bb" "aa"
I expected
sub("aa","test",tt)
# [1] "test" "aa" "bb" "aa"
so that I could loop through and go:
og.list <- c("aa","cat","aa","cat","aa")
repl.list <- c("the","is","happy")
for(i in 1:3){
  og.list <- sub("aa",repl.list[i], og.list)
}
Instead, all "aa" become "the". I thought that's what gsub did, but both sub and gsub behave this way.
Thank you.
I think you might want just this:
og.list[og.list == "aa"] <- repl.list
#[1] "the" "cat" "is" "cat" "happy"
Thank you Wiktor^.
I now understand that I would need to separate each item into its own string and then sub.
og.list <- c("aa","cat","aa","cat","aa")
repl.list <- c("the","is","happy")
# positions of the elements that contain "aa"
og.index <- grep("aa",og.list)
for(i in seq_along(og.index)){
  curr.index <- og.index[i]
  og.list[curr.index] <- sub("aa",
                             repl.list[i],
                             og.list[curr.index])
}
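The behaviour in the question follows from sub being vectorised: "first occurrence" means the first match inside each element, not the first matching element of the vector. A small sketch on a single string shows the difference between sub and gsub; the variable names page and repl are made up, and fixed = TRUE is used on the assumption that the phrase is a literal string rather than a regular expression.

```r
page <- "aa cat aa cat aa"          # one string with three occurrences
repl <- c("the", "is", "happy")

first_only <- sub("aa", "test", page, fixed = TRUE)   # only the first match
every_one  <- gsub("aa", "test", page, fixed = TRUE)  # all matches

# On a single string, repeated sub() calls consume one occurrence per call,
# provided no replacement itself contains "aa".
for (r in repl) page <- sub("aa", r, page, fixed = TRUE)
page
```

If the html file is read as one long string, for example by collapsing the lines first, the original loop idea works as intended.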

How to generate AA,AB,AC...,BA,BB,BC...,CA,CB,CC,...,ZZ series in R?

I have to generate an alphabet series (AA,AB,AC...,BA,BB,BC...,CA,CB,CC,...,ZZ) similar to the default column names of an Excel file.
I tried the combn function as follows:
combn(LETTERS,2)
However, this did not meet my requirement.
as.vector(sapply(LETTERS,function(x) sapply(LETTERS , function(y) paste0(x,y))))
An alternative would be outer
out <- c(outer(LETTERS, LETTERS, paste0))
head(sort(out))
# [1] "AA" "AB" "AC" "AD" "AE" "AF"
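Two details may be worth spelling out. combn returns only unordered pairs of distinct letters, which is why it falls short here, and c(outer(...)) flattens column by column, which is why the result needs sorting. The sketch below, with illustrative variable names, swaps the arguments inside the function so that the series comes out in order without sort.

```r
# combn gives choose(26, 2) pairs: no "AA", and "AB" but never "BA".
n_combn <- ncol(combn(LETTERS, 2))

# outer(X, Y, f)[i, j] is f(X[i], Y[j]); flattening runs down the columns,
# so put the fast-changing letter second by swapping the arguments in f.
two_letter <- c(outer(LETTERS, LETTERS, function(x, y) paste0(y, x)))
head(two_letter)
length(two_letter)
```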

How do I paste string columns in data.frame [duplicate]

This question already has answers here:
Concatenate rows of a data frame
(4 answers)
Closed 6 years ago.
suppose we have:
mydf <- data.frame(a= LETTERS, b = LETTERS, c =LETTERS)
Now we want to add a new column, containing a concatenation of all columns.
So that rows in the new column read "AAA", "BBB", ...
In my mind the following should work?
mydf[,"Concat"] <- apply(mydf, 1, paste0)
In addition to @akrun's answer, here is a short explanation of why your code didn't work.
apply passes each row to paste0 as a single character vector, and paste and paste0 leave the elements of a single vector separate:
> paste0(c("A","A","A"))
[1] "A" "A" "A"
To concatenate the elements of a vector into one string, you need the collapse argument:
> paste0(c("A","A","A"), collapse="")
[1] "AAA"
Consequently, your code should have been:
> apply(mydf, 1, paste0, collapse="")
[1] "AAA" "BBB" "CCC" "DDD" "EEE" "FFF" "GGG" "HHH" "III" "JJJ" "KKK" "LLL" "MMM" "NNN" "OOO" "PPP" "QQQ" "RRR" "SSS" "TTT" "UUU" "VVV"
[23] "WWW" "XXX" "YYY" "ZZZ"
We can use do.call with paste0 for faster execution; it passes each column as a separate argument, so paste0 concatenates them element-wise:
mydf[, "Concat"] <- do.call(paste0, mydf)
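The two approaches can be compared side by side. The sketch below rebuilds the example data frame so that it runs on its own; stringsAsFactors = FALSE and the "-" separator in the last line are choices made here for illustration, not part of the answers above.

```r
mydf <- data.frame(a = LETTERS, b = LETTERS, c = LETTERS,
                   stringsAsFactors = FALSE)

by_row <- apply(mydf, 1, paste0, collapse = "")  # one paste0 call per row
by_col <- do.call(paste0, mydf)                  # one call, columns as arguments

all(by_row == by_col)

# The same pattern works with a separator by adding sep to the argument list.
dashed <- do.call(paste, c(mydf, sep = "-"))
head(dashed, 3)
```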

How to add a "1" to end of each variable name in R?

I have a data frame "data" with 50 variables. For the analysis, I want to rename all of these variables by adding 1 at the end of each name. This is the procedure I followed (for a data frame "datasample" of 10 variables):
# original colnames for 10 variables
names(datasample)
[1] "a" "z" "y" "b" "bb" "ca" "a3"
[8] "b2" "as" "ask"
#rename 10 variables
names(datasample)<-c("a1","z1","y1","b1","bb1","ca1","a31","b21","as1","ask1")
I was wondering whether there is an efficient way of renaming these multiple variables. Thanks in advance.
names(datasample) <- paste(names(datasample), "1", sep="")
Or, equivalently,
names(datasample) <- paste0(names(datasample), "1")
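Because paste0 is vectorised over the names, the same pattern extends to renaming only some of the columns. The sketch below uses a small made-up data frame and a hypothetical selection, suffix_these, purely for illustration.

```r
datasample <- data.frame(a = 1, z = 2, y = 3, bb = 4)  # toy data, not the real set
suffix_these <- c("a", "bb")                           # hypothetical selection

hit <- names(datasample) %in% suffix_these
names(datasample)[hit] <- paste0(names(datasample)[hit], "1")
names(datasample)
```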
