How do I generate all the vectors by swapping two positions? - r

Suppose I have a column vector of [1 1 1 2 2 2 3 3 3] and I want to generate all the different column vectors that can be obtained by switching exactly two positions. For example, one such vector would be
[1 1 3 2 2 2 1 3 3].

Try this (it gives you a matrix, each row of which is a unique vector with 2 elements swapped from the original vector; there are 28 such unique vectors, including the original one):
v <- c(1,1,1,2,2,2,3,3,3)
unique(t(apply(t(combn(1:length(v), 2)), 1, function(x) {v[x] <- v[rev(x)]; v})))
with output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 2 2 2 3 3 3 # original one
[2,] 2 1 1 1 2 2 3 3 3 # swap 1st & 4th elements
[3,] 2 1 1 2 1 2 3 3 3 # swap 1st & 5th
[4,] 2 1 1 2 2 1 3 3 3 # ...
[5,] 3 1 1 2 2 2 1 3 3
[6,] 3 1 1 2 2 2 3 1 3
[7,] 3 1 1 2 2 2 3 3 1
[8,] 1 2 1 1 2 2 3 3 3
[9,] 1 2 1 2 1 2 3 3 3
[10,] 1 2 1 2 2 1 3 3 3
[11,] 1 3 1 2 2 2 1 3 3
[12,] 1 3 1 2 2 2 3 1 3
[13,] 1 3 1 2 2 2 3 3 1
[14,] 1 1 2 1 2 2 3 3 3
[15,] 1 1 2 2 1 2 3 3 3
[16,] 1 1 2 2 2 1 3 3 3
[17,] 1 1 3 2 2 2 1 3 3
[18,] 1 1 3 2 2 2 3 1 3
[19,] 1 1 3 2 2 2 3 3 1
[20,] 1 1 1 3 2 2 2 3 3
[21,] 1 1 1 3 2 2 3 2 3
[22,] 1 1 1 3 2 2 3 3 2
[23,] 1 1 1 2 3 2 2 3 3
[24,] 1 1 1 2 3 2 3 2 3
[25,] 1 1 1 2 3 2 3 3 2
[26,] 1 1 1 2 2 3 2 3 3
[27,] 1 1 1 2 2 3 3 2 3
[28,] 1 1 1 2 2 3 3 3 2 # swap 6th & 9th
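The one-liner above can be unpacked into named steps. This is only a sketch of the same idea; the helper name swap_two is hypothetical and does not appear in the original answer:

```r
# Hypothetical helper: all distinct vectors reachable from v by one swap
swap_two <- function(v) {
  pairs <- combn(seq_along(v), 2)          # 2 x choose(n, 2) matrix of index pairs
  swapped <- apply(pairs, 2, function(idx) {
    out <- v
    out[idx] <- v[rev(idx)]                # exchange the two positions
    out
  })
  unique(t(swapped))                       # one candidate per row, duplicates dropped
}

v <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
res <- swap_two(v)
dim(res)
```

Swapping two equal values reproduces the original vector, which is why the original shows up once in the result after unique().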

First, we can use the utils function combn to generate all of the possible pairs of positions to swap. Here, I am assuming you don't want to swap equal values (e.g. 1 and 1), so I check that the two values differ:
startVec <- c(1,1,1,2,2,2,3,3,3)
allCombo <- combn(1:length(startVec), 2)
toKeep <- apply(allCombo, 2, function(x) {
startVec[x[1]] != startVec[x[2]]
})
Then, apply along those that you are keeping, and swap the positions.
outVecs <- apply(allCombo[ , toKeep], 2, function(x){
temp <- startVec
temp[x] <- startVec[rev(x)]
return(temp)
})
This returns a matrix with one swapped vector per column, but you can convert it to a list, which may be easier to manage, like so:
outVecsInList <- as.list(as.data.frame(outVecs))
head(outVecsInList) shows:
$V1
[1] 2 1 1 1 2 2 3 3 3
$V2
[1] 2 1 1 2 1 2 3 3 3
$V3
[1] 2 1 1 2 2 1 3 3 3
$V4
[1] 3 1 1 2 2 2 1 3 3
$V5
[1] 3 1 1 2 2 2 3 1 3
$V6
[1] 3 1 1 2 2 2 3 3 1
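For reference, the steps of this answer can be run end to end as follows. This sketch assumes startVec is the vector from the question, and it builds the list with lapply rather than going through a data frame, which is a substitution and not the answer's own method:

```r
startVec <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)   # assumed: the vector from the question

allCombo <- combn(seq_along(startVec), 2)
# keep only pairs whose values differ, so every swap changes the vector
toKeep <- startVec[allCombo[1, ]] != startVec[allCombo[2, ]]

outVecs <- apply(allCombo[, toKeep, drop = FALSE], 2, function(x) {
  temp <- startVec
  temp[x] <- startVec[rev(x)]
  temp
})

# one list element per swapped vector
outVecsInList <- lapply(seq_len(ncol(outVecs)), function(j) outVecs[, j])
length(outVecsInList)
```

Because same-value swaps are filtered out up front, the original vector is not part of this result.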

Related

Make a probability with specific format in R

I have data like this:
x = c(1,2,3)
prob = c(0.13,0.13,0.74)
# Total sample size
n = 70
result = rep(x, round(n * prob))
Final<-replicate(1, sample(result))
I want to make a matrix[7,10] that has probabilities of (0.14,0.14,0.72) for (1,2,3). In this matrix, within every seven values, 1 and 2 should each appear once and 3 should appear 5 times, like this:
3 3 3 1 2 3 3
3 3 3 3 2 1 3
3 2 1 3 3 3 3
3 3 3 1 3 3 2
2 1 3 3 3 3 3
2 1 3 3 3 3 3
3 3 3 2 1 3 3
3 3 3 3 2 3 1
3 2 1 3 3 3 3
So, I will get just one 1 and one 2 in each row. Could you please help me write the code?
One way is to populate a matrix with 3's, then assign 1 and 2 randomly to a column position for each row.
set.seed(1)
m <- matrix(rep(3, 7*10), ncol = 7)
pos <- replicate(10, sample(1:7, 2))
for (i in 1:nrow(m)) m[i, pos[,i]] <- 1:2
m
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 3 3 2 3 3 3
#> [2,] 2 3 3 3 3 3 1
#> [3,] 3 1 3 3 2 3 3
#> [4,] 3 3 2 3 3 3 1
#> [5,] 3 2 3 3 3 1 3
#> [6,] 3 3 1 3 3 3 2
#> [7,] 1 3 3 3 2 3 3
#> [8,] 3 2 3 3 1 3 3
#> [9,] 3 3 3 3 3 1 2
#> [10,] 2 1 3 3 3 3 3
Created on 2022-04-15 by the reprex package (v2.0.1)
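A possible alternative, sketched here under the same assumption of 10 rows and 7 columns, is to shuffle the fixed set of values for each row instead of filling a matrix of 3's and overwriting two cells. The name m2 is only illustrative:

```r
set.seed(1)
# each row is an independent shuffle of one 1, one 2 and five 3s
m2 <- t(replicate(10, sample(c(1, 2, rep(3, 5)))))
dim(m2)
table(m2) / length(m2)   # observed share of each value
```

Since every row contains exactly one 1, one 2 and five 3's, the shares are fixed at 1/7, 1/7 and 5/7 regardless of the seed.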

Exporting Items from List into Single .csv File R

I have a list which contains vectors that I would like to export as a single .csv file containing all vectors as named columns.
For instance, suppose I have four vectors of roughly ten items each from hypothetical cluster analyses of four models containing a variable number of data points, created by
library(vegan) #provides vegdist()
library(cluster) #provides agnes()
veglist=list.files(pattern="TXT") #create list of files
veg=lapply(veglist,read.csv,header=TRUE,row.names=1) #read list of files
vegbc=lapply(veg,vegdist,method="bray") #create dissimilarity matrix from each file
av=lapply(vegbc,agnes,method="average") #do clustering analysis with each dissimilarity matrix
av2=lapply(av,cutree,k=2) #cut the hierarchical analysis at 2 groups level
when I type in fix(av2) I would see:
list(c(1,1,1,1,1,1,2,2,2,2,2,2),c(1,1,1,1,1,2,2,2,2,2),c(1,1,1,2,1,2,2,2,2,2),c(1,1,1,1,2,1,2,2,2,2,2,2,2))
If I type in av2 I see
[[1]]
[1] 1 1 1 1 1 1 2 2 2 2 2 2
[[2]]
[1] 1 1 1 1 1 2 2 2 2 2
[[3]]
[1] 1 1 1 2 1 2 2 2 2 2
[[4]]
[1] 1 1 1 1 2 1 2 2 2 2 2 2 2
I have tried following this example How to read every .csv file in R and export them into single large file. This did not work.
I think the underlying problem is that my vectors are not the same size. What I want to do is output the vectors into a single table that looks something like:
a b c d
1 1 1 1
1 1 1 1
1 1 1 1
1 1 2 1
1 1 1 1
1 2 2 2
2 2 2 2
2 2 2 2
2 2 2 2
2 2 2 2
2     2
2     2
      2
Where a,b,c,d are in place of my actual names. Preferably it would look prettier than this, but I could work with it.
I apologize for the very long question, but I was trying to provide enough of an example to go by. I am also sorry if this has a very easy answer, but I am not yet good with R. Thanks in advance.
Here is one way to do it:
l <- list(c(1,1,1,1,1,1,2,2,2,2,2,2),c(1,1,1,1,1,2,2,2,2,2),c(1,1,1,2,1,2,2,2,2,2),c(1,1,1,1,2,1,2,2,2,2,2,2,2))
maxlength <- max(sapply(l, length))
df <- data.frame(sapply(l, function(x) c(x, rep(NA, (maxlength - length(x))))))
df
X1 X2 X3 X4
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
4 1 1 2 1
5 1 1 1 2
6 1 2 2 1
7 2 2 2 2
8 2 2 2 2
9 2 2 2 2
10 2 2 2 2
11 2 NA NA 2
12 2 NA NA 2
13 NA NA NA 2
You would first need to extend each vector to the length of the longest vector, and then you could cbind them together so that write.csv would write them out as "columns":
> maxlength <- max(sapply(l, length))
> mat <- cbind(sapply(l, `length<-`, maxlength))
> mat
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] 1 1 1 1
[3,] 1 1 1 1
[4,] 1 1 2 1
[5,] 1 1 1 2
[6,] 1 2 2 1
[7,] 2 2 2 2
[8,] 2 2 2 2
[9,] 2 2 2 2
[10,] 2 2 2 2
[11,] 2 NA NA 2
[12,] 2 NA NA 2
[13,] NA NA NA 2
> write.csv(mat, file="mycsv.csv")
Which looks like this in a text editor (and would be imported into Excel properly):
"","V1","V2","V3","V4"
"1",1,1,1,1
"2",1,1,1,1
"3",1,1,1,1
"4",1,1,2,1
"5",1,1,1,2
"6",1,2,2,1
"7",2,2,2,2
"8",2,2,2,2
"9",2,2,2,2
"10",2,2,2,2
"11",2,NA,NA,2
"12",2,NA,NA,2
"13",NA,NA,NA,2
This can be done with stri_list2matrix from stringi
library(stringi)
m1 <- stri_list2matrix(l)
mode(m1) <- "integer"
m1
# [,1] [,2] [,3] [,4]
# [1,] 1 1 1 1
# [2,] 1 1 1 1
# [3,] 1 1 1 1
# [4,] 1 1 2 1
# [5,] 1 1 1 2
# [6,] 1 2 2 1
# [7,] 2 2 2 2
# [8,] 2 2 2 2
# [9,] 2 2 2 2
#[10,] 2 2 2 2
#[11,] 2 NA NA 2
#[12,] 2 NA NA 2
#[13,] NA NA NA 2
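To get the named columns and blank cells the question asks for, the padded matrix can be written with row.names = FALSE and na = "". This is a sketch only: the names a, b, c, d are the placeholders from the question, and out_file is a hypothetical temporary path:

```r
l <- list(c(1,1,1,1,1,1,2,2,2,2,2,2), c(1,1,1,1,1,2,2,2,2,2),
          c(1,1,1,2,1,2,2,2,2,2), c(1,1,1,1,2,1,2,2,2,2,2,2,2))
names(l) <- c("a", "b", "c", "d")          # placeholder names from the question

maxlength <- max(lengths(l))
padded <- sapply(l, `length<-`, maxlength) # pads with NA; column names come from names(l)

out_file <- tempfile(fileext = ".csv")     # hypothetical output path
write.csv(padded, out_file, row.names = FALSE, na = "")
readLines(out_file, n = 2)
```

With na = "" the shorter columns end in empty cells rather than the literal text NA, which tends to look cleaner in a spreadsheet.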

all combinations of n numbers

I would like to generate all unique combinations of n numbers without replacement. For example if n = 4, there are gamma(4+1) combinations shown here:
combos4 <- read.table(text='
x1 x2 x3 x4
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1
', header = TRUE)
I can obtain these 24 combinations using the following code:
combos <- expand.grid(x1=1:4, x2=1:4, x3=1:4, x4=1:4)
my.combos <- combos[apply(combos,1,function(x) length(unique(x)) == ncol(combos)),]
my.combos
However, I always assumed there was an easier way in base R. Specifically, I always assumed the function combn() could return these combinations. However, if combn() can do this, I now realize I do not know how.
The expand.grid approach seems a little inefficient, especially when n is large.
In searching the internet and StackOverflow I have found several similar questions that use the algorithm tag and the code posted in answers, when provided, is not simple. Sorry if this is a duplicate or if I am missing something obvious.
The combinat package has a permn function
library(combinat)
> permn(1:4)
[[1]]
[1] 1 2 3 4
[[2]]
[1] 1 2 4 3
[[3]]
[1] 1 4 2 3
[[4]]
[1] 4 1 2 3
[[5]]
[1] 4 1 3 2
[[6]]
[1] 1 4 3 2
[[7]]
[1] 1 3 4 2
[[8]]
[1] 1 3 2 4
[[9]]
[1] 3 1 2 4
[[10]]
[1] 3 1 4 2
[[11]]
[1] 3 4 1 2
[[12]]
[1] 4 3 1 2
[[13]]
[1] 4 3 2 1
[[14]]
[1] 3 4 2 1
[[15]]
[1] 3 2 4 1
[[16]]
[1] 3 2 1 4
[[17]]
[1] 2 3 1 4
[[18]]
[1] 2 3 4 1
[[19]]
[1] 2 4 3 1
[[20]]
[1] 4 2 3 1
[[21]]
[1] 4 2 1 3
[[22]]
[1] 2 4 1 3
[[23]]
[1] 2 1 4 3
[[24]]
[1] 2 1 3 4
gtools has a permutations function that might help:
library(gtools)
permutations(4,4,1:4)
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 1 2 4 3
## [3,] 1 3 2 4
## [4,] 1 3 4 2
## [5,] 1 4 2 3
## [6,] 1 4 3 2
## [7,] 2 1 3 4
## [8,] 2 1 4 3
## [9,] 2 3 1 4
## [10,] 2 3 4 1
## [11,] 2 4 1 3
## [12,] 2 4 3 1
## [13,] 3 1 2 4
## [14,] 3 1 4 2
## [15,] 3 2 1 4
## [16,] 3 2 4 1
## [17,] 3 4 1 2
## [18,] 3 4 2 1
## [19,] 4 1 2 3
## [20,] 4 1 3 2
## [21,] 4 2 1 3
## [22,] 4 2 3 1
## [23,] 4 3 1 2
## [24,] 4 3 2 1
and runs pretty fast:
user system elapsed
0.001 0.000 0.000
There is also the iterpc package.
> library(iterpc)
> I = iterpc(4, ordered=TRUE)
> getall(I)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 4 3
[3,] 1 3 2 4
[4,] 1 3 4 2
[5,] 1 4 2 3
[6,] 1 4 3 2
[7,] 2 1 3 4
[8,] 2 1 4 3
[9,] 2 3 1 4
[10,] 2 3 4 1
...
You can also do it iteratively
> getnext(I)
[1] 1 2 3 4
> getnext(I)
[1] 1 2 4 3
> getnext(I)
[1] 1 3 2 4
> getnext(I)
[1] 1 3 4 2
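If adding a package is not an option, the permutations can also be built recursively in base R. A minimal sketch, where the helper name perms is hypothetical; it is meant for small n only, since the result has factorial(n) rows:

```r
# Hypothetical helper: all permutations of v, one per row (base R only)
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], perms(v[-i]))              # fix v[i] first, permute the rest
  }))
}

p <- perms(1:4)
nrow(p)                                    # factorial(4) rows
```

Each recursive call places one element in the first column and binds it to every permutation of the remaining elements.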

Creating a matrix from multiple column vectors

How can I create a matrix from multiple column vectors?
I know that I can easily create a data frame with column vectors:
> colA <- 1:5
> colB <- 21:25
> colC <- 31:35
> data.frame(colA, colB, colC)
colA colB colC
1 1 21 31
2 2 22 32
3 3 23 33
4 4 24 34
5 5 25 35
However, when I try matrix(), it gives me unexpected results, as shown below. How can I create my desired matrix? I know I can do as.matrix(df), which nicely preserves the column names, but I'm looking for a more direct approach.
> matrix(colA, colB, colC)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[2,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[3,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[4,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[5,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[6,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[7,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[8,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[9,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[10,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[11,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[12,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[13,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[14,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[15,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[16,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[17,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[18,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[19,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[20,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[21,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
[1,] 4 5 1 2 3 4 5 1 2 3 4 5
[2,] 5 1 2 3 4 5 1 2 3 4 5 1
[3,] 1 2 3 4 5 1 2 3 4 5 1 2
[4,] 2 3 4 5 1 2 3 4 5 1 2 3
[5,] 3 4 5 1 2 3 4 5 1 2 3 4
[6,] 4 5 1 2 3 4 5 1 2 3 4 5
[7,] 5 1 2 3 4 5 1 2 3 4 5 1
[8,] 1 2 3 4 5 1 2 3 4 5 1 2
[9,] 2 3 4 5 1 2 3 4 5 1 2 3
[10,] 3 4 5 1 2 3 4 5 1 2 3 4
[11,] 4 5 1 2 3 4 5 1 2 3 4 5
[12,] 5 1 2 3 4 5 1 2 3 4 5 1
[13,] 1 2 3 4 5 1 2 3 4 5 1 2
[14,] 2 3 4 5 1 2 3 4 5 1 2 3
[15,] 3 4 5 1 2 3 4 5 1 2 3 4
[16,] 4 5 1 2 3 4 5 1 2 3 4 5
[17,] 5 1 2 3 4 5 1 2 3 4 5 1
[18,] 1 2 3 4 5 1 2 3 4 5 1 2
[19,] 2 3 4 5 1 2 3 4 5 1 2 3
[20,] 3 4 5 1 2 3 4 5 1 2 3 4
[21,] 4 5 1 2 3 4 5 1 2 3 4 5
[,26] [,27] [,28] [,29] [,30] [,31]
[1,] 1 2 3 4 5 1
[2,] 2 3 4 5 1 2
[3,] 3 4 5 1 2 3
[4,] 4 5 1 2 3 4
[5,] 5 1 2 3 4 5
[6,] 1 2 3 4 5 1
[7,] 2 3 4 5 1 2
[8,] 3 4 5 1 2 3
[9,] 4 5 1 2 3 4
[10,] 5 1 2 3 4 5
[11,] 1 2 3 4 5 1
[12,] 2 3 4 5 1 2
[13,] 3 4 5 1 2 3
[14,] 4 5 1 2 3 4
[15,] 5 1 2 3 4 5
[16,] 1 2 3 4 5 1
[17,] 2 3 4 5 1 2
[18,] 3 4 5 1 2 3
[19,] 4 5 1 2 3 4
[20,] 5 1 2 3 4 5
[21,] 1 2 3 4 5 1
Warning message:
In matrix(colA, colB, colC) :
data length [5] is not a sub-multiple or multiple of the number of rows [21]
You can use cbind to produce the desired matrix:
mat <- cbind(colA, colB, colC)
mat
# colA colB colC
# [1,] 1 21 31
# [2,] 2 22 32
# [3,] 3 23 33
# [4,] 4 24 34
# [5,] 5 25 35
class(mat)
# [1] "matrix"   (R >= 4.0.0 prints "matrix" "array")
You don't get the matrix you're expecting with the call of matrix(colA, colB, colC), because your arguments are getting interpreted as the first, second, and third arguments to the matrix function (aka data, nrow, and ncol). If you wanted to use the matrix function, you would need to provide your data as a single argument, with something like mat <- matrix(c(colA, colB, colC), ncol=3). If you used this syntax, you would not get the column names from the variables like we did with cbind.
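If the matrix() route is preferred, the column names can be supplied through its dimnames argument. A short sketch using the vectors from the question:

```r
colA <- 1:5; colB <- 21:25; colC <- 31:35

# data is passed as ONE vector; matrix() fills column by column by default
mat <- matrix(c(colA, colB, colC), ncol = 3,
              dimnames = list(NULL, c("colA", "colB", "colC")))
mat
```

This should give the same values and column names as cbind(colA, colB, colC), at the cost of typing the names by hand.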

find unique elements in matrix based on a subset of columns

I have a table which I want to transform:
t LabelA LabelB start stop
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[4,] 2 4 9 1 2
[5,] 2 3 5 1 2
[6,] 2 1 6 1 2
[7,] 2 7 2 2 2
[8,] 3 3 5 3 4
[9,] 3 1 6 3 4
[10,] 3 7 2 3 5
[11,] 3 4 9 3 5
I want to filter the data so that rows which differ only by the number in the first column are removed (not all of them, only the duplicates). So for rows 1 and 4 only row 1 should remain in the table. Or for row 3 and 9 only row 9 should remain. It is important that the information in the first column is retained and that the earliest occurrence of the row remains in the table, not the later ones.
You can use duplicated:
mat[!duplicated(as.data.frame(mat[, -1])), ]
t LabelA LabelB start stop
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[7,] 2 7 2 2 2
[8,] 3 3 5 3 4
[9,] 3 1 6 3 4
[10,] 3 7 2 3 5
[11,] 3 4 9 3 5
where mat is the name of your matrix.
Try using the duplicated function:
mymx <- matrix(c(1,4,9,1,2 ,1,3,5,1,2 ,1,1,6,1,2 ,2,4,9,1,2 ,2,3,5,1,2 ,2,1,6,1,2 ,2,7,2,2,2 ,3,3,5,3,4 ,3,1,6,3,4 ,3,7,2,3,5 ,3,4,9,3,5), ncol=5, byrow=TRUE)
> mymx[!duplicated(mymx[,-1]),]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[4,] 2 7 2 2 2
[5,] 3 3 5 3 4
[6,] 3 1 6 3 4
[7,] 3 7 2 3 5
[8,] 3 4 9 3 5
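Because subsetting a matrix without row names renumbers the rows in the printed output, it can help to keep the logical index around and look up which original rows survive. A sketch using the question's column names; the variable names key_cols and keep are illustrative:

```r
mymx <- matrix(c(1,4,9,1,2, 1,3,5,1,2, 1,1,6,1,2, 2,4,9,1,2,
                 2,3,5,1,2, 2,1,6,1,2, 2,7,2,2,2, 3,3,5,3,4,
                 3,1,6,3,4, 3,7,2,3,5, 3,4,9,3,5),
               ncol = 5, byrow = TRUE,
               dimnames = list(NULL, c("t", "LabelA", "LabelB", "start", "stop")))

key_cols <- c("LabelA", "LabelB", "start", "stop")   # columns that define a duplicate
keep <- !duplicated(mymx[, key_cols])                # TRUE for the first occurrence only
which(keep)                                          # original row numbers that survive
mymx[keep, ]
```

duplicated() marks every repeat after the first occurrence, so the earliest row of each group is the one that is kept, along with its value in the first column.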

Resources