I am trying to accomplish the following task to get to matrix d:
d1<-matrix(as.factor(rep(sample(1:10,10,T),5)),ncol=5)
d2<-matrix(as.factor(rep(sample(1:10,10,T),5)),ncol=5)
d3<-matrix(as.factor(rep(sample(1:10,10,T),5)),ncol=5)
d<-cbind(
cbind(d1[,2],d1[,5]),
cbind(d2[,2],d2[,5]),
cbind(d3[,2],d3[,5])
)
But I need to do this for many matrices d1, ..., dn.
More generally, I would like to select the same column numbers from a series of matrices and append them into a single matrix. The focus of this task is on combining the matrices, not on creating them. The factor-type column vectors should be preserved.
I thought about something along the lines of
d<-matrix(nrow=10)
dl<-list(d1,d2,d3)
for (i in 1:3){
d<-cbind(d,dl[[i]][,2],dl[[i]][,5])
}
But maybe there is a better way.
You can create a list of your matrices and use do.call and lapply to get what you want:
matList <- list(d1, d2, d3)
do.call(cbind, lapply(matList, function(x) x[, c(2, 5)]))
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] "3" "3" "3" "3" "10" "10"
# [2,] "4" "4" "2" "2" "3" "3"
# [3,] "6" "6" "7" "7" "7" "7"
# [4,] "10" "10" "4" "4" "2" "2"
# [5,] "3" "3" "8" "8" "3" "3"
# [6,] "9" "9" "5" "5" "4" "4"
# [7,] "10" "10" "8" "8" "1" "1"
# [8,] "7" "7" "10" "10" "4" "4"
# [9,] "7" "7" "4" "4" "9" "9"
# [10,] "1" "1" "8" "8" "4" "4"
By the way, the data type in your matrix is character, not factor. See the help page at ?matrix where you will find the following:
The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column, applying as.vector to factors and format to other non-character columns.
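If keeping the columns as factors really matters, one option is to hold the data in data frames instead of matrices, since a matrix cannot store factors. The following is only a sketch under that assumption; make_df and dfList are hypothetical names that do not appear in the question.

```r
# Hypothetical helper: builds a 10 x 5 data frame whose columns are all factors
make_df <- function() {
  f <- factor(sample(1:10, 10, replace = TRUE))
  data.frame(V1 = f, V2 = f, V3 = f, V4 = f, V5 = f)
}

dfList <- list(make_df(), make_df(), make_df())

# Same do.call/lapply pattern as above, applied to data frames
d <- do.call(cbind, lapply(dfList, function(x) x[, c(2, 5)]))

# Every selected column is still a factor
sapply(d, is.factor)
```

The result is a data frame with repeated column names (V2, V5, V2, ...), so you may want to rename the columns afterwards.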
I wish to find the row number, based on multiple parameters. I have made this test matrix:
data=
[,1] [,2] [,3]
[1,] "1" "a" "0"
[2,] "2" "b" "0"
[3,] "3" "c" "0"
[4,] "4" "a" "0"
[5,] "1" "b" "0"
[6,] "2" "c" "0"
[7,] "3" "a" "0"
[8,] "4" "b" "0"
Then I want to get the row number where
data[,1]==1 and data[,2]=='b'
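One way this lookup could be done is with which() on the combined logical condition. This is a minimal sketch that rebuilds the test matrix shown above; the way the matrix is constructed here is an assumption, only its contents come from the question.

```r
# Rebuild the 8 x 3 character test matrix from the question
data <- cbind(as.character(rep(1:4, 2)),
              rep(c("a", "b", "c"), length.out = 8),
              "0")

# Row numbers where column 1 is "1" AND column 2 is "b"
rows <- which(data[, 1] == "1" & data[, 2] == "b")
rows
```

Because the matrix is character, comparing against "1" makes the intent explicit, although data[, 1] == 1 would also work through coercion.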
I have the variable assignments
N <- 10
H <- 10
K <- 2 # number of subarrays
perms <- 10
I set up an empty array as follows:
pop <- array(dim = c(c(perms, N), K))
Then I assign character labels:
haps <- as.character(1:H)
Now, I assign probabilities:
probs <- rep(1/H, H)
I then create a 'for' loop:
for(j in 1:perms){
for(i in 1:K){
pop[i,j] <- sample(haps, size = N, replace = TRUE, prob = probs)
}
}
'pop' should now contain character labels from 1:H across both subarrays. Instead, I end up with an error "incorrect number of subscripts on matrix."
I am not sure why R is producing the error.
Any assistance is appreciated.
You gave 2 subscripts to pop, which is a 3-dimensional array. Also, even if you had added a third subscript, the length of the sample would not have matched the slice of pop being assigned, because the subscripts were in the wrong order.
dim(pop)
[1] 10 10 2
You may have wanted this:
for(j in 1:perms){
for(i in 1:K){
pop[1:10,j,i] <- sample(haps, size = N, replace = TRUE, prob = probs)
}
}
pop
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "6" "8" "8" "6" "3" "3" "1" "9" "7" "8"
[2,] "7" "10" "5" "3" "7" "10" "7" "1" "5" "8"
[3,] "9" "7" "1" "4" "1" "3" "2" "7" "6" "5"
[4,] "6" "8" "9" "4" "9" "7" "10" "9" "7" "2"
[5,] "1" "3" "2" "6" "10" "3" "3" "9" "10" "6"
[6,] "9" "3" "8" "1" "6" "6" "4" "8" "8" "9"
[7,] "9" "2" "2" "2" "3" "9" "8" "6" "10" "10"
[8,] "9" "5" "3" "8" "3" "4" "1" "6" "8" "4"
[9,] "10" "8" "1" "3" "10" "2" "5" "10" "6" "4"
[10,] "2" "1" "8" "10" "5" "5" "7" "8" "7" "6"
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "1" "10" "10" "7" "3" "6" "3" "2" "4" "1"
[2,] "8" "5" "10" "8" "3" "6" "6" "6" "8" "2"
[3,] "1" "6" "1" "5" "6" "3" "6" "1" "7" "9"
[4,] "7" "10" "5" "5" "7" "5" "3" "3" "10" "1"
[5,] "7" "3" "6" "8" "3" "9" "6" "2" "7" "3"
[6,] "1" "9" "4" "9" "1" "1" "10" "4" "3" "9"
[7,] "9" "3" "10" "1" "2" "2" "2" "3" "5" "4"
[8,] "8" "10" "8" "6" "9" "6" "9" "9" "2" "4"
[9,] "4" "6" "1" "1" "6" "5" "6" "6" "10" "3"
[10,] "5" "9" "9" "7" "9" "6" "4" "2" "10" "9"
It seems strange to me that you want the values in haps, and therefore in pop, to be characters, but I assume that you have your reasons.
This assigns the samples column-wise. I'm not sure if it matters, but if you prefer row-wise assignment you can index like this:
pop[j,1:10,i]
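If the loop is not needed for anything else, the whole array could probably be filled in one call, since every cell is an independent draw from the same distribution. A sketch under that assumption (set.seed is added here only for reproducibility and is not part of the question):

```r
N <- 10
H <- 10
K <- 2      # number of subarrays
perms <- 10

haps <- as.character(1:H)
probs <- rep(1 / H, H)

set.seed(42)  # assumption: only to make the sketch reproducible

# Draw all perms * N * K labels at once and shape them into the 3-D array
pop <- array(sample(haps, size = perms * N * K, replace = TRUE, prob = probs),
             dim = c(perms, N, K))

dim(pop)
```

This is statistically equivalent to the nested loops, but the individual values will differ from a loop-filled array for the same seed because the draws land in a different order.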
I'm trying to apply a loop on a matrix (matrixExample, as described below) to retrieve Columns V1 to V6 on 3 rows each time.
matrixExample
ID CHR SEGNUM GENOSEG V1 V2 V3 V4 V5 V6
[1,] "CHAR8" "1" "1" "102505" "1" "0" "2" "5" "0" "5"
[2,] "LIMO9" "1" "1" "012505" "0" "1" "2" "5" "0" "5"
[3,] "SIM10" "1" "1" "122505" "1" "2" "2" "5" "0" "5"
[4,] "CHAR8" "1" "2" "111520" "1" "1" "1" "5" "2" "0"
[5,] "LIMO9" "1" "2" "221520" "2" "2" "1" "5" "2" "0"
[6,] "SIM10" "1" "2" "222520" "2" "2" "2" "5" "2" "0"
[7,] "CHAR8" "1" "3" "501111" "5" "0" "1" "1" "1" "1"
[8,] "LIMO9" "1" "3" "501100" "5" "0" "1" "1" "0" "0"
[9,] "SIM10" "1" "3" "502011" "5" "0" "2" "0" "1" "1"
[10,] "CHAR8" "2" "1" "102505" "1" "0" "2" "5" "0" "5"
[11,] "LIMO9" "2" "1" "012505" "0" "1" "2" "5" "0" "5"
[12,] "SIM10" "2" "1" "122505" "1" "2" "2" "5" "0" "5"
[13,] "CHAR8" "2" "2" "111520" "1" "1" "1" "5" "2" "0"
[14,] "LIMO9" "2" "2" "221520" "2" "2" "1" "5" "2" "0"
[15,] "SIM10" "2" "2" "222520" "2" "2" "2" "5" "2" "0"
[16,] "CHAR8" "2" "3" "501111" "5" "0" "1" "1" "1" "1"
[17,] "LIMO9" "2" "3" "501100" "5" "0" "1" "1" "0" "0"
[18,] "SIM10" "2" "3" "502011" "5" "0" "2" "0" "1" "1"
As an example, from the first 3 rows, I would like to do some matrix calculations using the submatrix:
"1" "0" "2" "5" "0" "5"
"0" "1" "2" "5" "0" "5"
"1" "2" "2" "5" "0" "5"
After this calculation, I need to go to rows 4 to 6 ...
I tried this code:
for(i in seq(1, dim(exampleDoubleSort)[1], 3))
{
print(matrixExample[(i:i+2),c(4:10)]) # using print only as an example
# do some matrix calculations using the subset matrix
}
It only prints the elements from one row, not from the combination of 3 rows.
If I try out of the loop, I can obtain the expected result.
print(matrixExample[(1:3),c(5:10)])
V1 V2 V3 V4 V5 V6
[1,] "1" "0" "2" "5" "0" "5"
[2,] "0" "1" "2" "5" "0" "5"
[3,] "1" "2" "2" "5" "0" "5"
Could you please give me some idea of how to read 3 rows at a time and retrieve a matrix subset for further calculations?
If I have 30 rows, I need to retrieve 10 submatrices and perform 10 calculations. The calculation will be implemented as a function; print is used here only as an example.
Thanks in advance!
Cheers!
RV
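The likely cause is operator precedence: in R, `:` binds tighter than `+`, so i:i+2 is evaluated as (i:i)+2, which is the single row i+2. Writing i:(i+2) selects three rows. The sketch below uses a small hypothetical stand-in matrix m (6 rows, three V columns) rather than the full matrixExample.

```r
# Hypothetical stand-in for matrixExample: an ID column plus V1..V3
m <- cbind(ID = rep(c("CHAR8", "LIMO9", "SIM10"), 2),
           V1 = c("1", "0", "1", "1", "2", "2"),
           V2 = c("0", "1", "2", "1", "2", "2"),
           V3 = c("2", "2", "2", "1", "1", "2"))

1:1 + 2    # a single value, because this is (1:1) + 2
1:(1 + 2)  # the three row indices that were intended

# Collect one numeric 3-row block per group of rows
blocks <- lapply(seq(1, nrow(m), by = 3), function(i) {
  sub <- m[i:(i + 2), c("V1", "V2", "V3"), drop = FALSE]
  storage.mode(sub) <- "numeric"  # convert the character cells to numbers
  sub
})

blocks[[1]]
```

Each element of blocks can then be passed to whatever calculation function you have. Note also that the loop in the question takes its row count from exampleDoubleSort rather than matrixExample, and selects columns 4:10 rather than 5:10.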
I create a data frame from different vectors:
pat1<-c(11, 12, 13, 14, 15)
pat2<-c(1:5)
pat3<-seq(1,10, by=2)
pat4<-seq(-5,3, by=2)
pat5<-c(pat1+pat2)
variables<-c("a","b","c","d","e")
mydata<-data.frame(variables, pat1, pat2,pat3, pat4, pat5)
mydata<-t(mydata)
I transpose it so that columns become rows and I get the correct table, but the numbers are character strings, not doubles:
[,1] [,2] [,3] [,4] [,5]
variables "a" "b" "c" "d" "e"
pat1 "11" "12" "13" "14" "15"
pat2 "1" "2" "3" "4" "5"
pat3 "1" "3" "5" "7" "9"
pat4 "-5" "-3" "-1" " 1" " 3"
pat5 "12" "14" "16" "18" "20"
How shall I get doubles for my pat values?
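A matrix can hold only one type, so once the character column variables is part of the data frame, t() has to return a character matrix. One possible approach, sketched here, is to transpose only the numeric columns and attach the labels as column names instead.

```r
pat1 <- c(11, 12, 13, 14, 15)
pat2 <- 1:5
pat3 <- seq(1, 10, by = 2)
pat4 <- seq(-5, 3, by = 2)
pat5 <- pat1 + pat2
variables <- c("a", "b", "c", "d", "e")

# Leave the character labels out of the data frame before transposing
mydata <- t(data.frame(pat1, pat2, pat3, pat4, pat5))

# Use the labels as column names rather than as a row of data
colnames(mydata) <- variables

typeof(mydata)
mydata["pat5", "c"]
```

The labels are still available through colnames(mydata), and rows such as mydata["pat4", ] can be used directly in arithmetic.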
I have a list of graphs (igraph format) and I would like to obtain a merged graph containing only those vertices and edges that are shared by a certain percentage of the graphs.
I know that the igraph library has the function graph.intersection(), but this function keeps only the vertices and edges present in all the graphs.
Any help would be much appreciated.
Here is a brief example
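df1 <- data.frame(V1=c(1,2,2,3,4), V2=c(3,3,4,5,5)) # reconstructed from the get.edgelist(g1) output below
g1 <- graph.data.frame(df1, directed=F)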
df2 <- data.frame(V1=c(1,2,2,3,4), V2=c(3,3,5,5,5))
g2 <- graph.data.frame(df2, directed=F)
df3 <- data.frame(V1=c(1,2,3,4), V2=c(3,3,5,5))
g3 <- graph.data.frame(df3, directed=F)
df4 <- data.frame(V1=c(1,1,2,3), V2=c(2,3,4,5))
g4 <- graph.data.frame(df4, directed=F)
get.edgelist(g1)
[,1] [,2]
[1,] "1" "3"
[2,] "2" "3"
[3,] "2" "4"
[4,] "3" "5"
[5,] "4" "5"
get.edgelist(g2)
[,1] [,2]
[1,] "1" "3"
[2,] "2" "3"
[3,] "2" "5"
[4,] "3" "5"
[5,] "4" "5"
get.edgelist(g3)
[,1] [,2]
[1,] "1" "3"
[2,] "2" "3"
[3,] "3" "5"
[4,] "4" "5"
get.edgelist(g4)
[,1] [,2]
[1,] "1" "2"
[2,] "1" "3"
[3,] "2" "4"
[4,] "3" "5"
If I put all the graphs in a list:
mylist <- list(g1,g2,g3,g4)
And then apply the graph.intersection() function:
g.int <- graph.intersection(mylist, keep.all.vertices=FALSE)
The result is a graph with the following nodes and edges:
V(g.int)
[1] "1" "2" "3" "4" "5"
get.edgelist(g.int)
[,1] [,2]
[1,] "3" "5"
[2,] "1" "3"
What I want is to include those vertices and edges that appear in a certain percentage of the graphs. In this example I would like to include the edges present in at least 75% of the graphs. Thus, the optimal result would be:
V(g.int)
[1] "1" "2" "3" "4" "5"
get.edgelist(g.int)
[,1] [,2]
[1,] "3" "5"
[2,] "1" "3"
[3,] "4" "5"
I hope it is clearer now.
You can create a graph of all of the edges in the graphs, then eliminate edges which do not appear often enough.
library(igraph)
# generate graphs
edgeset <- combn(1:20, 2)
graphs <- list()
for (i in 1:10) {
graphs[[i]] <- graph(i + edgeset[, sample(ncol(edgeset), 150)])
}
# Get a list of all edges in all graphs
myedges <- lapply(graphs, get.edgelist)
# Make a graph of all of the edges including overlap
# rbind() gives an n x 2 edge list; graph() reads from/to pairs in sequence, so transpose first
uniongraph <- graph(t(do.call(rbind, myedges)))
# Eliminate edges not overlapped enough
resultgraph <- graph.adjacency(get.adjacency(uniongraph) >= 0.75 * length(graphs))
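To check the idea against the small example from the question, the same counting can be sketched in base R on the edge lists alone, without building the graphs. The first data frame is an assumption read off the printed get.edgelist(g1) output, and edge_dfs, edge_keys, counts and kept are hypothetical names used only for illustration.

```r
# Edge lists of g1..g4 (the first one is inferred from get.edgelist(g1))
edge_dfs <- list(
  data.frame(V1 = c(1, 2, 2, 3, 4), V2 = c(3, 3, 4, 5, 5)),
  data.frame(V1 = c(1, 2, 2, 3, 4), V2 = c(3, 3, 5, 5, 5)),
  data.frame(V1 = c(1, 2, 3, 4),    V2 = c(3, 3, 5, 5)),
  data.frame(V1 = c(1, 1, 2, 3),    V2 = c(2, 3, 4, 5))
)

# Undirected edges: order the endpoints so that 3-1 and 1-3 get the same key,
# and count each edge at most once per graph
edge_keys <- function(df) {
  lo <- pmin(df$V1, df$V2)
  hi <- pmax(df$V1, df$V2)
  unique(paste(lo, hi, sep = "-"))
}

# Number of graphs in which each edge occurs
counts <- table(unlist(lapply(edge_dfs, edge_keys)))

# Keep the edges present in at least 75% of the graphs
kept <- names(counts)[counts >= 0.75 * length(edge_dfs)]
kept

# Back to a two-column edge list, e.g. to build the final graph from it
kept_edges <- do.call(rbind, strsplit(kept, "-"))
```

With these edge lists, the edge 2-3 also occurs in three of the four graphs, so it passes the 75% threshold along with 1-3, 3-5 and 4-5.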