create all possible permutations of two vectors in R [duplicate] - r

This question already has answers here:
Generate all possible permutations (or n-tuples)
(2 answers)
Closed 9 years ago.
I have two vectors like this:
f1=c('a','b','c','d')
e1=c('e','f','g')
There are 4^3 different permutations of them. I need to create all possible permutations of them in R. For example:
(1):
a e
a f
a g
(2):
a e
a f
b g
...
Moreover, my real data are very huge and I need speed codes.

It sounds like you are looking for expand.grid.
> expand.grid(f1, e1)
Var1 Var2
1 a e
2 b e
3 c e
4 d e
5 a f
6 b f
7 c f
8 d f
9 a g
10 b g
11 c g
12 d g
I don't know what "speed codes" are, so I'm not sure I can help from that aspect.
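If the goal is the 4^3 = 64 assignments counted in the question (one element of f1 chosen for each element of e1) rather than the 12 pairs shown above, one possible sketch is to pass f1 to expand.grid once per element of e1. This is an interpretation of the question, not something the asker confirmed, and the name assignments is made up for illustration.

```r
f1 <- c("a", "b", "c", "d")
e1 <- c("e", "f", "g")

# One column per element of e1; each column ranges over all of f1.
assignments <- expand.grid(rep(list(f1), length(e1)), stringsAsFactors = FALSE)
names(assignments) <- e1

nrow(assignments)     # number of assignments, length(f1)^length(e1)
head(assignments, 3)  # each row pairs e, f and g with one element of f1
```

expand.grid is vectorised, so this should be reasonably fast, but the number of rows grows as length(f1)^length(e1), which may become the real limit for large inputs.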

Related

Randomize vector order with maximum variance

I have a vector that looks something like this.
v <- as.data.frame(list(v=(c("a","b","c",'d','e'))))
v
v
1 a
2 b
3 c
4 d
5 e
My vector has 5 different values. This means I can make 120 permutations of my vector.
Here are some examples of permutations
v v2 v3
1 a a a
2 b b c
3 c c b
4 d e d
5 e d e
I would like to create only 10 different vectors out of the 120 possible ones, and I would like to select the combination that maximises their covariance. Any idea how I could do this?
thanks a lot in advance for your help

r - Collapse multiple rows in one following multiple conditions with tidyr [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 5 years ago.
I have a data structure like this:
A B C
n 1 M
n 2 U
n 1 U
f 3 M
f 4 M
f 1 U
Using the package tidyr, I want to obtain this result:
A B C
n 1 M
n 3 U
f 7 M
f 1 U
So I want to sum the B values over rows that share the same A value and, within each such subset, the same C value.
How can I do this?
library(dplyr)
df %>%
group_by(A,C) %>%
summarize(B=sum(B)) %>%
data.frame()
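For reference, the same grouped sum can be sketched in base R without dplyr. The data frame df below is rebuilt from the example table in the question; the name result is only for illustration.

```r
# Example data copied from the question.
df <- data.frame(
  A = c("n", "n", "n", "f", "f", "f"),
  B = c(1, 2, 1, 3, 4, 1),
  C = c("M", "U", "U", "M", "M", "U"),
  stringsAsFactors = FALSE
)

# Sum B within each (A, C) combination.
result <- aggregate(B ~ A + C, data = df, FUN = sum)
result
```

The rows may come back in a different order than in the desired output, since aggregate sorts by the grouping columns.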

Trying to understand membership type from cluster_edge_betweenness (igraph) R language

I am trying to understand the return type of the membership function applied to the result of cluster_edge_betweenness.
Saying
ceb <- cluster_edge_betweenness(g)
data <- membership(ceb)
print(data)
a b c d e f g h i j k l m n o p q r s t
1 2 3 4 5 4 6 6 7 8 9 10 11 3 6 12 5 3 13 6
I want to be able to ask, for node a, which cluster it is a member of.
print(data[2])
gives
b
2
Saying
print(data[[2]])
gives
[1] 2
I want to be able to write something that returns the 'b' part of this strange data type.
class(data)
gives
membership
typeof(data)
gives
double
data[2:10]
gives
b c d e f g h i j
2 3 4 5 4 6 6 7 8
What I was hoping to write was some code that says
vertex f is a member of cluster 4
data[[6]] will give me 4, but how do I get access to the f part?
Just discovered that it's possible to say
data[['a']]
and get the expected answer, so this data type is some kind of array addressed by node name. I guess my question needs to be: how can I get a list of the keys of such a construct?
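A minimal sketch of one way to get at the keys, assuming the membership object behaves like a named numeric vector (as the printed output above suggests). To stay self-contained it uses a plain named vector as a stand-in for membership(ceb), with only the first six nodes from the output.

```r
# Stand-in for membership(ceb): a named numeric vector (assumption).
data <- c(a = 1, b = 2, c = 3, d = 4, e = 5, f = 4)

names(data)             # the node names, i.e. the "keys"
data[["f"]]             # cluster of node f
names(data)[data == 4]  # all nodes that belong to cluster 4

# One sentence per vertex, built from the names and the values.
msg <- sprintf("vertex %s is a member of cluster %d", names(data), as.integer(data))
msg[6]
```

If names() returns NULL on the real object, the graph's vertices probably have no name attribute, and this approach would not apply as written.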

Convert from n x m matrix to long matrix in R [duplicate]

This question already has answers here:
Create dataframe from a matrix
(6 answers)
Closed 1 year ago.
Note: This is not a graph question.
I have an n x m matrix:
> m = matrix(1:6, 2, 3, byrow=TRUE, dimnames=list(c("d","e"), c("a","b","c")))
> m
a b c
d 1 2 3
e 4 5 6
I would like to convert this to a long matrix:
> m.l
a d 1
a e 4
b d 2
b e 5
c d 3
c e 6
Obviously nested for loops would work but I know there are a lot of nice tools for reshaping matrixes in R. So far, I have only found literature on converting from long or wide matrixes to an n x m matrix and not the other way around. Am I missing something obvious? How can I do this conversion?
Thank you!
If you need a single column matrix
matrix(m, dimnames=list(t(outer(colnames(m), rownames(m), FUN=paste)), NULL))
# [,1]
#a d 1
#a e 4
#b d 2
#b e 5
#c d 3
#c e 6
For a data.frame output, you can use melt from reshape2
library(reshape2)
melt(m)
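If reshape2 is not available, a base R sketch of the same reshaping could look like the following. It assumes the matrix carries the row names d, e and column names a, b, c shown in the question; the column names col, row and value are made up for illustration.

```r
# Matrix as printed in the question: rows d, e and columns a, b, c.
m <- matrix(1:6, 2, 3, byrow = TRUE,
            dimnames = list(c("d", "e"), c("a", "b", "c")))

# as.vector(m) walks the matrix column by column, so repeat the
# column names in blocks and cycle through the row names to match.
long <- data.frame(
  col   = rep(colnames(m), each = nrow(m)),
  row   = rep(rownames(m), times = ncol(m)),
  value = as.vector(m),
  stringsAsFactors = FALSE
)
long
```

This relies only on R's column-major storage order, so it avoids nested loops.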

Creating a new variable column based on data from another column

I'm pretty new to R, and programming in general, and I'm wondering the best way to loop through a column so I can add a column to the data frame further describing the observations I looped through.
I currently have a list of amino acids and their positions on a protein that looks like this:
Residue Position
H 1
R 2
K 3
D 4
E 5
H 6
R 7
K 8
D 9
E 10
I'd like something that looks like this (where H, R, and K are basic amino acids, and D and E are acidic amino acids):
Residue Position Properties
H 1 Basic
R 2 Basic
K 3 Basic
D 4 Acidic
E 5 Acidic
H 6 Basic
R 7 Basic
K 8 Basic
D 9 Acidic
E 10 Acidic
I'm really not sure where to start, and I'm having difficulty finding a good resource for this kind of situation in R.
I started by trying to subset the data, but then I realized that wouldn't do the trick:
# Basic
h.dat <- subset(all, all$Residue == "H")
r.dat <- subset(all, all$Residue == "R")
k.dat <- subset(all, all$Residue == "K")
# Acidic
d.dat <- subset(all, all$Residue == "D")
e.dat <- subset(all, all$Residue == "E")
Thanks!
Note:
H = Histidine (Basic amino acid)
R = Arginine (Basic)
K = Lysine (Basic)
E = Glutamic Acid (Acidic)
D = Aspartic Acid (Acidic)
You can use ifelse. If df is the name of your original data,
df$Property <- ifelse(df$Residue %in% c("H", "R", "K"), "Basic", "Acidic")
df
# Residue Position Property
# 1 H 1 Basic
# 2 R 2 Basic
# 3 K 3 Basic
# 4 D 4 Acidic
# 5 E 5 Acidic
# 6 H 6 Basic
# 7 R 7 Basic
# 8 K 8 Basic
# 9 D 9 Acidic
# 10 E 10 Acidic
Try:
> df1
Residue Position
1 H 1
2 R 2
3 K 3
4 D 4
5 E 5
6 H 6
7 R 7
8 K 8
9 D 9
10 E 10
Create a reference table:
> df2
Residue Property
1 H Basic
2 R Basic
3 K Basic
4 D Acidic
5 E Acidic
Then merge:
> merge(df1, df2)
Residue Position Property
1 D 9 Acidic
2 D 4 Acidic
3 E 5 Acidic
4 E 10 Acidic
5 H 1 Basic
6 H 6 Basic
7 K 8 Basic
8 K 3 Basic
9 R 7 Basic
10 R 2 Basic
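The merge approach can be sketched end to end as follows. The data frames are rebuilt from the tables printed above; all.x = TRUE and the final reordering by Position are additions not in the original answer, included on the assumption that unmatched residues should be kept and the original order restored.

```r
df1 <- data.frame(
  Residue  = c("H", "R", "K", "D", "E", "H", "R", "K", "D", "E"),
  Position = 1:10,
  stringsAsFactors = FALSE
)

# Reference table: one row per residue type.
df2 <- data.frame(
  Residue  = c("H", "R", "K", "D", "E"),
  Property = c("Basic", "Basic", "Basic", "Acidic", "Acidic"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps residues missing from df2 (their Property becomes NA).
merged <- merge(df1, df2, by = "Residue", all.x = TRUE)

# merge() sorts by the key column, so put the rows back in positional order.
merged <- merged[order(merged$Position), ]
merged
```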
I think you might want to allow for non-polar amino acids as well:
c(rep("Basic",3),rep("Acidic",2),"Non-Polar")[ # those are the choices
match(dat$Residue, c("H","R","K","E","D"), nomatch=6) ] #select indices
So I added an 11th residue named "Z" and tested:
> dat$Property <- c(rep("Basic",3),rep("Acidic",2),"Non-Polar")[
match(dat$Residue, c("H","R","K","E","D"), nomatch=6) ]
> dat
Residue Position Property
1 H 1 Basic
2 R 2 Basic
3 K 3 Basic
4 D 4 Acidic
5 E 5 Acidic
6 H 6 Basic
7 R 7 Basic
8 K 8 Basic
9 D 9 Acidic
10 E 10 Acidic
11 Z 11 Non-Polar
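A named lookup vector is another way to sketch the same idea, and it may be easier to extend as more residue classes are added. The label "Other" for unlisted residues is a placeholder chosen here, not a chemical classification, and the names lookup and prop are made up for illustration.

```r
dat <- data.frame(
  Residue  = c("H", "R", "K", "D", "E", "Z"),
  Position = 1:6,
  stringsAsFactors = FALSE
)

# Map each known residue code to its property.
lookup <- c(H = "Basic", R = "Basic", K = "Basic", D = "Acidic", E = "Acidic")

# Indexing by name returns NA for codes that are not in the lookup.
prop <- unname(lookup[dat$Residue])
prop[is.na(prop)] <- "Other"  # placeholder label (assumption)

dat$Property <- prop
dat
```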
