I have a square matrix M in R whose entries are all real numbers between 0.5 and 1.9. I want to build an adjacency matrix by thresholding: for a threshold of 0.6, for example, every element less than 0.6 should be replaced by 0 and every other element by 1. I want to do this for all 141 thresholds in seq(0.5, 1.9, 0.01), so that I get 141 adjacency matrices. How can I do this, and how can I save or print all of these matrices in R? Any help will be appreciated. Kindly bear with my limited knowledge of R :-)
You could use lapply to loop over the values of "Seq1", create the binary matrix ((M>=x)+0L) and store it in a list ("lst")
lst <- lapply(Seq1, function(x) (M >=x)+0L)
length(lst)
#[1] 141
data
Seq1 <- seq(0.5, 1.9, 0.01)
set.seed(24)
M <- matrix(sample(Seq1, 10*10, replace=TRUE), ncol=10)
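The question also asks how to print or save the 141 matrices. One possible sketch, assuming the same example data as above: name each list element after its threshold, print one by name, and write each matrix to its own CSV file. The "thr_" name prefix, the output folder under tempdir(), and the file names are illustrative assumptions, not part of the original answer.

```r
# Rebuild the example data and the list of binary matrices
Seq1 <- seq(0.5, 1.9, 0.01)
set.seed(24)
M <- matrix(sample(Seq1, 10 * 10, replace = TRUE), ncol = 10)
lst <- lapply(Seq1, function(x) (M >= x) + 0L)

# Name each matrix after its threshold (hypothetical naming scheme)
names(lst) <- sprintf("thr_%.2f", Seq1)

# Print a single adjacency matrix by name
print(lst[["thr_0.60"]])

# Save every matrix as its own CSV file in a temporary folder (assumed location)
out_dir <- file.path(tempdir(), "adjacency")
dir.create(out_dir, showWarnings = FALSE)
for (nm in names(lst)) {
  write.csv(lst[[nm]], file.path(out_dir, paste0(nm, ".csv")), row.names = FALSE)
}

# Alternatively, keep the whole list in one file and reload it later with readRDS()
saveRDS(lst, file.path(out_dir, "all_adjacency.rds"))
```

Printing lst itself would show all 141 matrices at once, which is usually too much output to be useful.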
I have a document-term matrix:
document_term_matrix <- as.matrix(DocumentTermMatrix(corpus, control = list(stemming = FALSE, stopwords=FALSE, minWordLength=3, removeNumbers=TRUE, removePunctuation=TRUE )))
For this document-term matrix, I've calculated the local term- and global term weighing as follows:
lw_tf <- lw_tf(document_term_matrix)
gw_idf <- gw_idf(document_term_matrix)
lw_tf is a matrix with the same dimensionality as the document-term-matrix (nxm) and gw_idf is a vector of size n. However, when I run:
tf_idf <- lw_tf * gw_idf
The dimensionality of tf_idf is again nxm.
Originally, I would not expect this multiplication to work, as the dimensionalities are not conformable. However, given this output I now expect the dimensionality of gw_idf to be mxm. Is this indeed the case? And if so: what happened to the gw_idf vector of size n?
Matrix multiplication is done in R with %*%, not * (the latter is element-wise multiplication). Your reasoning is partially correct; you were just using the wrong operator.
A matrix product is only defined if the second dimension of the first matrix equals the first dimension of the second matrix. The result has as many rows as the first matrix and as many columns as the second.
If you treat gw_idf as a 1 x n matrix and multiply it by an n x m matrix, the result is a 1 x m matrix. You can check this with the following example:
a <- matrix(runif(100, 0 , 1), nrow = 1, ncol = 100)
b <- matrix(runif(100 * 200, 0, 1), nrow = 100, ncol = 200)
c <- a %*% b
dim(c)
[1] 1 200
Now, about your specific case: I don't have the package that builds the document-term matrix (a small reproducible example would help), but multiplying an n x m matrix element-wise (with *, as noted at the beginning) by a plain vector of length n does work in R. The vector is recycled down the columns, so row i is scaled by the i-th element of the vector and the result is again n x m. So gw_idf is still a vector of size n; it never became m x m. Only if gw_idf were stored as an n x 1 matrix would * fail with a "non-conformable arrays" error.
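As a quick check of how * behaves between a matrix and a plain vector, here is a small sketch. The names lw and gw and their values are made up for illustration and merely stand in for lw_tf and gw_idf.

```r
# Hypothetical stand-ins: a 3 x 2 matrix and a vector with one weight per row
lw <- matrix(1:6, nrow = 3, ncol = 2)
gw <- c(10, 100, 1000)

# Element-wise product: gw is recycled down each column,
# so row i of lw is multiplied by gw[i]
res <- lw * gw
dim(res)
res

# The same row scaling written explicitly with sweep()
res_sweep <- sweep(lw, 1, gw, `*`)

# The matrix product gives a different shape: (1 x 3) %*% (3 x 2) is 1 x 2
prod_mat <- gw %*% lw
dim(prod_mat)
```

If the weights were meant to apply per column instead, sweep(lw, 2, weights, `*`) with a vector of length ncol(lw) would be one option.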
I am given a matrix M. I now need to determine a matrix of the same dimension, which is defined by
N_{i,j} = M_{A(i,j),B(i,j)}
for two matrices A and B of the same dimension, which define indices.
As an example,
set.seed(1)
M <- matrix(LETTERS[1:(4*6)], ncol=6)
A <- matrix(sample(c(1:4), 4*6, replace=TRUE), ncol=6)
B <- matrix(sample(c(1:6), 4*6, replace=TRUE), ncol=6)
How do I now quickly determine N?
Try this:
replace(M, TRUE, M[cbind(c(A), c(B))])
or
array(M[cbind(c(A), c(B))], dim(M))
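Both variants rely on indexing a matrix with a two-column matrix, where each row is a (row, column) pair. The following sketch, using the example data from the question, compares the result with an explicit double loop. The name N_loop is introduced here only for the check.

```r
set.seed(1)
M <- matrix(LETTERS[1:(4 * 6)], ncol = 6)
A <- matrix(sample(1:4, 4 * 6, replace = TRUE), ncol = 6)
B <- matrix(sample(1:6, 4 * 6, replace = TRUE), ncol = 6)

# c(A) and c(B) flatten the index matrices column by column;
# cbind() pairs them into (row, column) lookups into M
N <- array(M[cbind(c(A), c(B))], dim(M))

# Reference implementation with explicit loops, for comparison only
N_loop <- M
for (i in seq_len(nrow(M))) {
  for (j in seq_len(ncol(M))) {
    N_loop[i, j] <- M[A[i, j], B[i, j]]
  }
}

identical(N, N_loop)
```

The vectorised version avoids the loop entirely, which matters once the matrices get large.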
I have 2000 covariance matrices of size 27x27, and I want the mean covariance matrix over all 2000 of them. The result should be one 27x27 matrix in which position [1,1] is the mean of position [1,1] across the 2000 given matrices.
I could see from other posts that I should build an array and use the apply function, but it does not work!
My code:
a<-array(ml.1[c(1:2000)])
apply(a,c(1,2),mean)
I get this error message:
Error in if (d2 == 0L) { : missing value where TRUE/FALSE needed
I would appreciate if anyone can help me to solve this problem.
First, @eipi10 is right: your question is not reproducible. But the key here is how you set up your array.
#Make some fake data 10 matrices 10x10
m <- lapply(1:10, function(x) matrix(rnorm(100), nrow = 10))
#bind the matrices together
a <- do.call(cbind, m)
#recast the matrix into three dimensions
dim(a) <- c(10,10,10)
#now apply should work
apply(a, c(1,2), mean)
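Two alternatives that may be worth knowing, sketched on the same kind of fake data: simplify2array() stacks a list of equally sized matrices into a 3-D array in one step, and Reduce() sums the matrices directly without building an array at all. The names mean_apply and mean_reduce are made up for this example.

```r
set.seed(1)
# Fake data: a list of 10 matrices, each 10 x 10
m <- lapply(1:10, function(x) matrix(rnorm(100), nrow = 10))

# Stack the list into a 10 x 10 x 10 array, then average over the third dimension
a <- simplify2array(m)
mean_apply <- apply(a, c(1, 2), mean)

# Same result without an array: add all matrices and divide by their count
mean_reduce <- Reduce(`+`, m) / length(m)

all.equal(mean_apply, mean_reduce)
```

For 2000 matrices of size 27x27 held in a list, the Reduce() form is probably the shortest route.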
I wish to represent p-values and distances as the lower-triangular and upper-triangular entries of a single matrix. While I managed to create an upper-triangular or lower-triangular matrix for each, I have been unable to merge them into a single data frame in R.
dist[(upper.tri(dist,diag=FALSE))]=0 #upper tri of distances
pval[(lower.tri(pval,diag=FALSE))]=0 #lower tri of p-values
I tried the following line, but it does not work:
dist[(upper.tri(dist,diag=FALSE))]=pval[(lower.tri(pval,diag=FALSE))]
Any possible way of doing this?
I'm sure this could be done more elegantly, but I think this does what you want:
a <- matrix(0, nrow = 10, ncol = 10)
b <- matrix(1, nrow = 10, ncol = 10)
a[upper.tri(a)]
b[lower.tri(b)]
new <- matrix(NA, nrow = 10, ncol = 10)
new[upper.tri(new)] <- a[upper.tri(a)]
new[lower.tri(new)] <- b[lower.tri(b)]
new
Since you did not supply a reproducible example, I can't be sure, but basically I just take the upper and lower of matrices (one of 0s and the other of 1s) and combine them in new. As proof of concept, new has 0s above the diagonal, 1s below, and NAs on the diagonal itself. Hopefully this gives you some insight into your issue.
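Applied to the situation in the question, the same idea might look like the sketch below. The matrices dist_mat and pval_mat are simulated placeholders, since the original data was not provided, and setting the diagonal to NA is an arbitrary choice.

```r
set.seed(1)
n <- 4

# Placeholder data: pairwise distances between n random points,
# and a matrix of made-up p-values of the same size
dist_mat <- as.matrix(dist(matrix(rnorm(n * 2), ncol = 2)))
pval_mat <- matrix(runif(n * n), nrow = n, ncol = n)

# Start from the distances (upper triangle stays as it is),
# then overwrite the lower triangle with the p-values at the same positions
combined <- dist_mat
combined[lower.tri(combined)] <- pval_mat[lower.tri(pval_mat)]
diag(combined) <- NA

combined_df <- as.data.frame(combined)
```

Using lower.tri() on both sides of the assignment keeps each p-value at its own (i, j) position. Mixing upper.tri() on one side with lower.tri() on the other, as in the attempt in the question, pairs up elements in a different order.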
Although this question has already been answered, I would like to add the following code for anybody's future use.
First, create two 10 by 10 matrices, one filled with 1s and the other with 2s. Then use the Matrix package to keep only the strict lower and strict upper triangles. Since the two do not overlap, we can simply add them to combine the two matrices. Finally, convert the resulting "dgeMatrix" first into a matrix and then into a data frame.
a <- matrix(1,10,10)
b <- matrix(2,10,10)
library(Matrix)
a <- tril(a, -1) # strict lower triangular matrix (omit diagonals)
b <- triu(b, 1) # strict upper triangular matrix
c <- a + b
c <- as.data.frame(as.matrix(c))
I have a list of 100,000 simulated values of T in R (min: 1.5, max: 88.8) and I want to calculate the probability of T being between 10 and 50.
I simulated 100,000 values of T, where T is t(y) %*% M %*% y, M is an 8x8 matrix of constants and y is an 8x1 matrix. The element in the i-th row of y is a_i + b_i, where a is a vector of constants and b is a vector whose elements follow a normal distribution with mean 0 and sd = 2 (each element is a separate draw from that distribution).
Is it in a vector or a list? If it's a vector, the following should work. If it's in a list, you may use unlist() to convert it to a vector.
mylist <- runif(100000,1.5,88.8) #this is just to generate a random number vector
length(which(mylist>=10 & mylist<=50))/length(mylist)
set.seed(42)
myrandoms <- rnorm(100000, mean=5, sd=2)
mydistr <- ecdf(myrandoms)
#probability of being between 1 and 3:
diff(mydistr(c(1, 3)))
#[1] 0.13781
#compare with normal distribution
diff(pnorm(c(1, 3), mean=5, sd=2))
#[1] 0.1359051
If you really have a list, use myrandoms <- do.call(c, mylist) to make it a vector.
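To tie this back to the quadratic form described in the question, here is a rough sketch. The matrix M (an identity matrix) and the vector a (all ones) are placeholders, since the actual constants were not given, and 10,000 draws are used instead of 100,000 to keep it quick. The mean of a logical vector is the proportion of TRUE values, which gives the empirical probability directly.

```r
set.seed(1)
M <- diag(8)       # placeholder for the 8 x 8 matrix of constants
a <- rep(1, 8)     # placeholder for the vector of constants

# Simulate T = t(y) %*% M %*% y with y = a + b, where b ~ N(0, sd = 2)
Tsim <- replicate(10000, {
  y <- a + rnorm(8, mean = 0, sd = 2)
  drop(t(y) %*% M %*% y)
})

# Empirical probability that T lies between 10 and 50
p <- mean(Tsim >= 10 & Tsim <= 50)
p
```

This is the same quantity as length(which(...)) / length(...), written more compactly.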