How to fix code to enter a covariance matrix using lavaan package - r

I am receiving the following error message when entering a covariance matrix using lavaan:
Error in lav_matrix_lower2full(c(0.77, 0.38, 0.65, 0.39, 0.39, 0.62, -0.25, :
p == round(p, 0) is not TRUE
I've used the following code before with no issue.
Full.cor<-lav_matrix_lower2full(c(.77,.38,.65,.39,.39,.62,-.25,-.32,-.27,6.09,.31,.29,.26,-.36,7.67,.24,.25,.19,-.18,.51,1.69,-3.16,-3.56,-2,63,6.09,-3.12,-4.58,204.79,-.92,-.88,-.72,.88,-1.49,-1.41,16.53,7.24))
Any ideas where I went wrong?

From ?lav_matrix_lower2full (bold-face mine):
The ‘lav_matrix_vechr_reverse’ (alias: ‘lav_matrix_vechu_reverse’
and ‘lav_matrix_lower2full’) creates a symmetric matrix, given
only the lower triangular elements, row by row. If diagonal =
FALSE, an diagonal with zero elements is added.
For any symmetric n × n matrix there are n(n+1)/2 lower triangular elements (including the diagonal).
The error arises because you did not provide the correct number of "unpacked" matrix elements.
For example, for a 3 × 3 matrix we need to provide 6 elements:
lav_matrix_lower2full(c(0, 1, 2, 3, 4, 5))
#      [,1] [,2] [,3]
# [1,]    0    1    3
# [2,]    1    2    4
# [3,]    3    4    5
If we do instead
lav_matrix_lower2full(c(0, 1, 2, 3, 4))
we get the error
Error in lav_matrix_lower2full(c(0, 1, 2, 3, 4)) :
p == round(p, 0) is not TRUE
In your case you have 37 elements, which suggests that either
you have an additional erroneous element for a potential 8 × 8 covariance matrix requiring 36 elements, or
you are missing 8 additional elements for a potential 9 × 9 covariance matrix requiring 45 elements.
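The element count can be checked directly before calling lav_matrix_lower2full. The sketch below is only an illustration: n_lower and implied_dim are hypothetical helper names, and the formula in implied_dim is presumably close to what lavaan checks internally (an assumption based on the p == round(p, 0) message, not on its source).

```r
# Number of lower-triangular elements (diagonal included) of an n x n matrix
n_lower <- function(n) n * (n + 1) / 2

# Invert n(n + 1)/2 = len; a non-integer result means len cannot fill a triangle
implied_dim <- function(len) (sqrt(1 + 8 * len) - 1) / 2

# The vector exactly as posted in the question
x <- c(.77, .38, .65, .39, .39, .62, -.25, -.32, -.27, 6.09,
       .31, .29, .26, -.36, 7.67, .24, .25, .19, -.18, .51, 1.69,
       -3.16, -3.56, -2, 63, 6.09, -3.12, -4.58, 204.79,
       -.92, -.88, -.72, .88, -1.49, -1.41, 16.53, 7.24)

length(x)               # 37
implied_dim(length(x))  # about 8.12, not a whole number
n_lower(8)              # 36
n_lower(9)              # 45
```

One candidate for the extra element is the run -2,63 in the posted vector, which may be a mistyped -2.63. That is only a guess from reading the numbers, so it should be checked against the original covariance table.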

Related

How to get the pivot and rank from Matrix::qr() like that of base::qr()?

When applying Matrix::qr() to a sparse matrix in R, the output is quite different from that of base::qr(). There are V, beta, p, R and q, but no rank or pivot. Below is a small example. I want to detect linearly dependent columns of the sparse matrix A, which requires the pivot and rank. How can I get this information?
library(Matrix)
A <- matrix(c(0, -2, 1, 0,
0, -4, 2, 0,
1, -2, 1, 2,
1, -2, 1, 2,
1, -2, 1, 2), nrow = 5, byrow = T)
A.s <- as(A, "dgCMatrix")
qrA.M <- Matrix::qr(A.s)
qrA.R <- base::qr(A)
There is another related but not answered question, Get base::qr pivoting in Matrix::qr method
I would reconstruct your example matrix A a little bit:
A <- A[, c(1,4,3,2)]
# [,1] [,2] [,3] [,4]
#[1,] 0 0 1 -2
#[2,] 0 0 2 -4
#[3,] 1 2 1 -2
#[4,] 1 2 1 -2
#[5,] 1 2 1 -2
You did not mention in your question why rank and pivot returned by a dense QR factorization are useful. But I think this is what you are looking for:
dQR <- base::qr(A)
with(dQR, pivot[1:rank])
#[1] 1 3
So columns 1 and 3 of A give a basis for A's column space.
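If the goal is to list the linearly dependent columns, they are the ones that the pivot leaves out. A minimal sketch using the rearranged A from above (the variable names independent and dependent are just for illustration):

```r
# Rearranged example matrix from this answer
A <- matrix(c(0, 0, 1, -2,
              0, 0, 2, -4,
              1, 2, 1, -2,
              1, 2, 1, -2,
              1, 2, 1, -2), nrow = 5, byrow = TRUE)

dQR <- base::qr(A)

# Columns kept by the pivoted factorization form a basis of the column space
independent <- sort(dQR$pivot[seq_len(dQR$rank)])

# Everything else is a linear combination of the kept columns
dependent <- setdiff(seq_len(ncol(A)), independent)

independent  # 1 3
dependent    # 2 4
```

Using setdiff rather than negative indexing keeps the result sensible in the edge case where the rank is zero.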
I don't really understand the logic of a sparse QR factorization. The 2nd column of A is perfectly linearly dependent on the 1st column, so I expect column pivoting to take place during the factorization. But very much to my surprise, it doesn't!
library(Matrix)
sA <- Matrix(A, sparse = TRUE)
sQR <- Matrix::qr(sA)
sQR@q + 1L
#[1] 1 2 3 4
No column pivoting is done! As a result, there isn't an obvious way to determine the rank of A.
At this moment, I could only think of performing a dense QR factorization on the R factor to get what you are looking for.
R <- as.matrix(Matrix::qrR(sQR))
QRR <- base::qr(R)
with(QRR, pivot[1:rank])
#[1] 1 3
Why does this work? Well, the Q factor has orthogonal, hence linearly independent, columns, so the columns of R inherit the linear dependence or independence of the columns of A. For a matrix with many more rows than columns, the computational cost of this second QR factorization is negligible.
I need to figure out the algorithm behind a sparse QR factorization before coming up with a better idea.
I've been looking at a similar problem and I ended up not relying on Matrix::qr() to calculate rank and to detect linear dependency. Instead I programmed the function GaussIndependent in the package SSBtools.
In the package examples I included one that demonstrates a wrong conclusion from rankMatrix(x, method = "qr"). The input x is a 44 × 20 dummy matrix.
Starting with your example matrix, A.s:
library(SSBtools)
GaussIndependent(A.s) # List of logical vectors specifying independent rows and columns
# $rows
# [1] TRUE FALSE TRUE FALSE FALSE
#
# $columns
# [1] TRUE TRUE FALSE FALSE
GaussRank(A.s) # the rank
# [1] 2

Solve A linear equation system b=0 Rstudio

I would like to solve a linear equation system like this:
x_1*3+x_2*4+x_3*5+x_4*6+x_6*2=0
x_1*21+x_2*23+x_3*45+x_4*37+x_6*0=0
x_1*340+x_2*24+x_3*25+x_4*31+x_6*0=0
x_1*32+x_2*45+x_3*5+x_4*6+x_7*2=0
x_1*9+x_2*11+x_3*13+x_4*49+x_7*0=0
x_1*5+x_2*88+x_3*100+x_4*102+x_7*2=0
     [x_1] [x_2] [x_3] [x_4] [,5]
[1,]     3     4     5     6    2
[2,]    21    23    45    37    0
[3,]   340    24    25    31    0
[4,]    32    45     5     6    2
[5,]     9    11    13    49    0
[6,]     5    88   100   102    2
I solve this homogeneous linear system with MASS::Null(t(M)),
but the problem is that it finds x_1, ..., x_4 and only a single value for x_5, whereas I need three different values: x_5,1, x_5,2 and x_5,3.
The values of the matrix are random and can change.
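OK, I had to reactivate my rusty linear algebra knowledge. You can do this with the singular value decomposition: if all singular values (the diagonal part of the SVD) are nonzero, only the trivial solution exists. Otherwise, the right singular vectors belonging to the zero singular values span the solution space:
solution_space <- function(A, tol = 1e-10) {
  # Ask for all ncol(A) right singular vectors so wide matrices are covered too
  my_svd <- svd(A, nv = ncol(A))
  # Pad with zeros when A has more columns than rows
  d <- c(my_svd$d, rep(0, ncol(A) - length(my_svd$d)))
  if (all(d > tol)) {
    return(rep(0, ncol(A)))  # only the trivial solution
  } else {
    # Compare against a tolerance; exact tests for zero are unreliable in floating point
    return(my_svd$v[, d <= tol, drop = FALSE])
  }
}
A %*% solution_space(A)  # numerically zero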
You can try the code with these matrices:
A <- matrix(c(1,1,0,1,1,0,0,0,1), 3, 3)
A <- matrix(c(1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1), 4, 4)
A <- matrix(c(1,1,0,1,1,0,0,0,0), 3, 3)
With the update which shows you've got 5 equations with 7 unknowns, it's obvious that there is a multidimensional surface of solutions.
I fear I don't have code to calculate that surface, but let me toot my own horn and offer the ktsolve package. For any given set of inputs from your { $x_1, x_2, ...x_7$ } [ah rats no latex markdown] , enter a collection of known values and ktsolve will run a back-solver (usually BB ) to find the unknowns.
So if you can feed your problem a selected set of any two of {X_5, X_6, X_7}, you can find all five of the other values.
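The "multidimensional surface" of solutions mentioned above is the null space of the coefficient matrix, and a basis for it can be computed in base R. The following is a minimal sketch under stated assumptions: null_basis is a hypothetical helper name, the tolerance is an arbitrary choice, and the small 2 × 4 system is made up for illustration rather than taken from the question.

```r
# Basis of {x : A x = 0} from the right singular vectors of A
null_basis <- function(A, tol = 1e-10) {
  s <- svd(A, nv = ncol(A))                   # request the full V matrix
  d <- c(s$d, rep(0, ncol(A) - length(s$d)))  # pad for wide matrices
  s$v[, d <= tol * max(s$d, 1), drop = FALSE]
}

# Illustrative system: 2 equations, 4 unknowns
A <- matrix(c(1, 1, 0, 0,
              0, 0, 1, 1), nrow = 2, byrow = TRUE)

N <- null_basis(A)
ncol(N)            # dimension of the solution space: 2
max(abs(A %*% N))  # numerically zero

# Every solution is a linear combination of the columns of N
x <- N %*% c(2, -3)
max(abs(A %*% x))  # numerically zero
```

Fixing some of the unknowns, as suggested with ktsolve, corresponds to picking one particular point in this space.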

Algorithm for finding a permutation matrix of a matrix

I see some similar questions:
Generate permutation matrix from permutation vector
https://math.stackexchange.com/questions/345166/what-is-the-name-for-a-non-square-permutation-matrix
Given elements:
elems = [1,2,3,4] # dimensions 1x4
If I have a vector:
M = [4,2,3,1] # dimensions 1x4
I know there is some permutation matrix p that I can multiply elems * p = M, which in this case would be:
p =
[
0 0 0 1
0 1 0 0
0 0 1 0
1 0 0 0
] # dimensions 4x4
# eg:
# elems * P = M
# 1x4 * 4x4 = 1x4
Now, for my question, I am interested in what it would look like in the case when M is a non-vector, non-square matrix, like:
M' = [
4 2 3 1
4 3 2 1
1 2 3 4
] # dimensions 3x4
For the same
elems' = [
1 2 3 4
1 2 3 4
1 2 3 4
] # where this is now tripled to be conformant dimensions
# dimensions 3x4
#
# meaning P is still 4x4
You can see M_prime and elems_prime in this case are still just permutations, but now multivariate, rather than just a single vector as originally.
I know I am not able to just do the following kind of thing, because the matrix is not square, and thus not invertible:
elems' * P = M'
P = elems'^-1 * M'
# eg:
# elems' * P = M'
# 3x4 * 4x4 = 3x4
When I try, in R at least, I see:
> P <- ginv(elems_prime) %*% M_prime
[,1] [,2] [,3] [,4]
[1,] 0.1 0.07777778 0.08888889 0.06666667
[2,] 0.2 0.15555556 0.17777778 0.13333333
[3,] 0.3 0.23333333 0.26666667 0.20000000
[4,] 0.4 0.31111111 0.35555556 0.26666667
Does this give me back M'?
> elems_prime %*% P
[,1] [,2] [,3] [,4]
[1,] 3 2.333333 2.666667 2
[2,] 3 2.333333 2.666667 2
[3,] 3 2.333333 2.666667 2
!= M' # No, does not.
So this is not right.
My questions are:
What is the right P that would correctly permute the elems' matrix into the M' matrix?
What is the name of the algorithm to find it? (An implementation in R, Haskell, or pseudocode is great.)
Is there a way to restrict the values of P to be integers, preferably 0 or 1?
for R reproducibility
> dput(elems_prime)
structure(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4), .Dim = 3:4)
> dput(M_prime)
structure(c(4, 4, 1, 2, 3, 2, 3, 2, 3, 1, 1, 4), .Dim = 3:4)
Notice that the column space of M' has a higher dimension than the column space of elems' (M' has rank 3, while elems' has rank 1). This implies that there is no linear mapping from elems' to M', because a linear mapping cannot increase the rank of a matrix (it is useful to think about this as a transformation of basis).
It follows that any M' generated by elems' * P can have rank at most 1, leaving only the conventional permutation matrices as candidates for P.
It is an entirely different question if we look at going from M' back to elems', and this asymmetry is also noteworthy.
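The rank argument can be checked numerically with the dput output from the question. This is only a sketch: the random P is an arbitrary stand-in for "any 4 × 4 matrix", and the seed is chosen for reproducibility.

```r
elems_prime <- structure(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4), .Dim = 3:4)
M_prime     <- structure(c(4, 4, 1, 2, 3, 2, 3, 2, 3, 1, 1, 4), .Dim = 3:4)

qr(elems_prime)$rank  # 1: all three rows are identical
qr(M_prime)$rank      # 3: the rows are linearly independent

# Whatever P is, the product cannot exceed the rank of elems_prime
set.seed(1)
P <- matrix(rnorm(16), 4, 4)
qr(elems_prime %*% P)$rank  # at most 1
```

Since every row of elems_prime %*% P is the same, no single P can reproduce three different rows.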
When M is not a vector, this is not possible.
Here is why. In general, if we multiply an n×m matrix by an m×p matrix we get an n×p matrix. Here elems is a vector, that is, a 1×4 matrix, so elems * P has to be a 1×? matrix of some sort. By making P wider, you can make M longer, but you'd have to change elems to make M taller.
Incidentally in linear algebra it is standard to flip vectors to be columns and put the matrices on their left. The reason for that is that the matrix represents a linear function, and that puts the matrix in the same place where the linear function goes. So it is very nice when going from functional notation to matrix notation. Also if you've got to write a square matrix anyways, it takes less room on the page to write a vertical vector on the right rather than a horizontal one on the left...
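What does work is one permutation matrix per row. A minimal sketch, assuming the entries of elems are distinct; perm_matrix is a hypothetical helper, not a function from any package:

```r
elems <- c(1, 2, 3, 4)
M_prime <- matrix(c(4, 2, 3, 1,
                    4, 3, 2, 1,
                    1, 2, 3, 4), nrow = 3, byrow = TRUE)

# P[i, j] is 1 when from[i] lands in position j of to, and 0 otherwise
perm_matrix <- function(from, to) outer(from, to, "==") * 1

# One 4 x 4 permutation matrix for each row of M_prime
P_list <- lapply(seq_len(nrow(M_prime)),
                 function(i) perm_matrix(elems, M_prime[i, ]))

P_list[[1]]            # the 0/1 matrix P written out in the question
elems %*% P_list[[2]]  # reproduces the second row of M_prime
```

The entries are 0 or 1 by construction, which also addresses the question about restricting P to integers in the single-row case.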

R, use binomial distribution with more than two possibilities

I know this is probably elementary, but I seem to have a mental block. Let's say you want to calculate the probability of tossing a 4, 5, or 6 on a roll of one die. In R, it's easy enough:
sum(1/6, 1/6, 1/6)
This gives 1/2 which is the correct answer. However, I have in the back of my mind (where it possibly should remain) that I should be able to use the binomial distribution for this. I've tried various combinations of arguments for pbinom and dbinom, but I can't get the right answer.
With coin tosses, it works fine. Is it entirely inappropriate for situations where there are more than two possible outcomes? (I'm a programmer, not a statistician, so I'm expecting to get killed by the stat guys here.)
Question: How can I use pbinom() or dbinom() to calculate the probability of throwing a 4, 5, or 6 with one roll of a die? I'm familiar with the prob and dice packages, but I really want to use one of the built-in distributions.
Thanks.
As @Alex mentioned above, dice-throwing can be represented in terms of multinomial probabilities. The probability of rolling a 4, for example, is
dmultinom(c(0, 0, 0, 1, 0, 0), size = 1, prob = rep(1/6, 6))
# [1] 0.1666667
and the probability of rolling a 4, 5, or 6 is
X <- cbind(matrix(rep(0, 9), ncol = 3), diag(1, 3))
X
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    0    0    0    1    0    0
# [2,]    0    0    0    0    1    0
# [3,]    0    0    0    0    0    1
sum(apply(X, MARGIN = 1, dmultinom, size = 1, prob = rep(1/6, 6)))
# [1] 0.5
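The built-in binomial functions can also be used if the six faces are collapsed into two outcomes, "success" (a 4, 5, or 6) and "failure" (a 1, 2, or 3). A minimal sketch under that framing; p_success is just an illustrative variable name:

```r
# Three of the six equally likely faces count as a success
p_success <- 3 / 6

# Probability of exactly one success in one roll
dbinom(1, size = 1, prob = p_success)                      # 0.5

# Same event via the distribution function: P(X >= 1) = P(X > 0)
pbinom(0, size = 1, prob = p_success, lower.tail = FALSE)  # 0.5

# The framing extends to several rolls, e.g. at least one success in three rolls
1 - pbinom(0, size = 3, prob = p_success)                  # 0.875
```

The binomial distribution only needs each trial to have two outcomes of interest, so it stays appropriate as long as the event is phrased as "happened or not".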
Though it's not quite obvious, this can be done with pmultinom, implemented either in my pmultinom package on CRAN or this other pmultinom package on Github.
You conceptualize it as the event that it is not a 1, 2 or 3. Then, you write this probability as
P(X_1 ≤ 0, X_2 ≤ 0, X_3 ≤ 0, X_4 ≤ ∞, X_5 ≤ ∞, X_6 ≤ ∞)
where X_i is the number of occurrences of side i. All the X's together have a multinomial distribution, with a size parameter of 1, and all probabilities equal to 1/6. This probability above can be calculated (using my package) as
pmultinom(upper=c(0, 0, 0, Inf, Inf, Inf), size=1,
probs=c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6), method="exact")
# [1] 0.5
Though it's a bit of an awkward reformulation, I like it because I prefer to use a "p" function rather than take a sum of "d" functions.

R Generic solution to create 2*2 confusion matrix

My question is related to this one on producing a confusion matrix in R with the table() function. I am looking for a solution without using a package (e.g. caret).
Let's say these are our predictions and labels in a binary classification problem:
predictions <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86, 0.15, 0.52, 0.74, 0.24)
labels <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0)
For these values, the solution below works well to create a 2*2 confusion matrix for, let's say, threshold = 0.5:
# Confusion matrix for threshold = 0.5
conf_matrix <- as.matrix(table(predictions>0.5,labels))
conf_matrix
       labels
        0 1
  FALSE 4 3
  TRUE  2 5
However, I do not get a 2*2 matrix if I select any value that is smaller than min(predictions) or larger than max(predictions), since the data won't have either a FALSE or TRUE occurrence e.g.:
conf_matrix <- as.matrix(table(predictions>0.05,labels))
conf_matrix
      labels
       0 1
  TRUE 6 8
I need a method that consistently produces a 2*2 confusion matrix for all possible thresholds (decision boundaries) between 0 and 1, as I use this as an input in an optimisation. Is there a way I can tweak the table function so it always returns a 2*2 matrix here?
You can make your thresholded prediction a factor variable to achieve this:
(conf_matrix <- as.matrix(table(factor(predictions > 0.05, levels = c(FALSE, TRUE)), labels)))
#        labels
#         0 1
#   FALSE 0 0
#   TRUE  6 8
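Since the matrix feeds an optimisation over thresholds, the same idea can be wrapped in a function. This is a sketch under stated assumptions: conf_matrix_at is a hypothetical name, and the labels are assumed to be coded 0/1 and are also turned into a factor so that both columns survive even if one class is absent.

```r
predictions <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97,
                 0.89, 0.78, 0.86, 0.15, 0.52, 0.74, 0.24)
labels <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0)

# Always returns a 2 x 2 table, whatever the threshold
conf_matrix_at <- function(predictions, labels, threshold = 0.5) {
  predicted <- factor(predictions > threshold, levels = c(FALSE, TRUE))
  actual    <- factor(labels, levels = c(0, 1))
  table(predicted = predicted, actual = actual)
}

conf_matrix_at(predictions, labels, 0.5)   # same counts as the first table above
conf_matrix_at(predictions, labels, 0.05)  # FALSE row is all zeros
conf_matrix_at(predictions, labels, 0.99)  # TRUE row is all zeros
```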
