Cortest.mat converting to class "psych,sim" - r

I've been using the psych package to compare two correlation matrices using the function cortest.
Now I want to try the cortest.mat and cortest.jennrich functions, which require an object of class psych and sim. I have tried converting my correlation matrices with sim.structure, which does produce an object of those classes, but I get an error when running either function.
Here is what I've tried using Random numbers:
Random<-cor(matrix(rnorm(400, 0, .25), nrow=(20), ncol=(20)))
SimRandom<-sim.structure(Random)
class(SimRandom)
cortest.jennrich(SimRandom,SimRandom,n1=400, n2=400)
Yields the following:
Error in if (dim(R1)[1] != p) { : argument is of length zero
I'm sure I'm doing it wrong, both because of the error message and because the values in Random and SimRandom are not exactly the same.
What is the correct way to convert a correlation matrix to class psych, sim so that it can be used as input to cortest.mat?
Thanks in advance.
EDIT: A short explanation of what I want to do. The random numbers serve only as an example. The actual correlation matrices are built as follows. I have a huge list of files, each composed of 100 observations for a specific genetic location. These files can be grouped into sets of, say, 20 files based on known genetic relationships, so I take each group of files, load them into a matrix as columns, and calculate cor(). That gives a correlation matrix. As a control, I load random files and treat them the same way. This matrix contains real data, but the grouping is done randomly. In the end I have two correlation matrices: (1) one that contains the correlations of the pre-selected files and (2) one that contains the correlations between randomly loaded files. Both matrices are the same size.
What I would like to do is to compare the two correlation matrices to have an idea whether the grouping has an influence on the correlation values observed.
Sorry for not explaining this earlier, I wanted to avoid the long explanation and keep the question simple.
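A minimal sketch of one way to run the comparison, assuming (as the error message suggests) that cortest.jennrich calls dim() on its first argument and therefore expects a plain square correlation matrix rather than the list-like object sim.structure returns. The names grouped, control, n_obs and n_var are hypothetical stand-ins for the real data, and the psych call is guarded so the block still runs if the package is not installed.

```r
set.seed(123)
n_obs <- 100  # observations per file, as described in the EDIT
n_var <- 20   # files (columns) per group

# Two correlation matrices built the same way from independent random data
grouped <- cor(matrix(rnorm(n_obs * n_var), nrow = n_obs, ncol = n_var))
control <- cor(matrix(rnorm(n_obs * n_var), nrow = n_obs, ncol = n_var))

# A list has no dim(), which would explain "argument is of length zero"
wrapped <- list(model = grouped)  # hypothetical stand-in for a list-like object
is.null(dim(wrapped))

# Pass the matrices directly; n1 and n2 are the numbers of observations
if (requireNamespace("psych", quietly = TRUE)) {
  res <- psych::cortest.jennrich(grouped, control, n1 = n_obs, n2 = n_obs)
  print(res)
}
```

Note that n1 and n2 here are the number of rows that went into cor(), not the number of cells in the data matrix.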

Related

SPSS: correlating two vectors

I have two vectors in my dataset Vs = s1 to s10 and Vt= t1 to t10.
They describe two pictures, and I want to know the correlation between them for each case.
However, there is no function such as Cor(Vs, Vt), because vectors apparently cannot be used in the standard functions. There is not even a mean(Vs)!
I tried to write syntax but failed, partly because of missing values (implementing pairwise deletion seems complex).
Any hint is welcome.
Is it possible to ask a question that is only seen by SPSS experts?
Calculating the correlation in the present structure is probably feasible but would be pretty complex. I suggest restructuring the data, after which it all becomes easy:
The code assumes you have a line ID in the data, called lineNum.
If you don't, create one with the first line of the code.
compute lineNum=$casenum. /* this is only necessary if you don't have some other line ID.
varstocases /make V_s from s1 to s10 /make V_t from t1 to t10 /index=pairNum(V_s).
sort cases by lineNum.
split file by lineNum.
correlations V_s with V_t. /* you can edit the code here to add features to the analysis.
split file off.
That's it. Now the results will appear in the output window - one correlation for each of the original lines. If you need to import the correlations back to the original data you can do that by using OMS control to capture the results into a new dataset and then matching it back to the original file.
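For readers who want to sanity-check the per-case correlations outside SPSS, the same computation can be sketched in R. This is only an illustration under assumed data: dat is a hypothetical data frame with columns s1..s10 and t1..t10, and one value is set to missing to show that only complete pairs are used.

```r
# Hypothetical data: 5 cases, columns s1..s10 and t1..t10 as in the question
set.seed(1)
dat <- as.data.frame(matrix(rnorm(5 * 20), nrow = 5))
names(dat) <- c(paste0("s", 1:10), paste0("t", 1:10))
dat[2, "s3"] <- NA  # one missing value to exercise pairwise deletion

S  <- as.matrix(dat[paste0("s", 1:10)])
Tm <- as.matrix(dat[paste0("t", 1:10)])

# One correlation per case, using only the pairs where both values are present
row_cor <- vapply(
  seq_len(nrow(dat)),
  function(i) cor(S[i, ], Tm[i, ], use = "pairwise.complete.obs"),
  numeric(1)
)
row_cor
```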

Estimation to plot person-item map not feasible because items "have no 0-responses" in data matrix

I am trying to create a person-item map that orders the questions from a dataset by difficulty. I am using the eRm package, and the output should look like the following:
[person-item map] (https://hansjoerg.me/post/2018-04-23-rasch-in-r-tutorial_files/figure-html/unnamed-chunk-3-1.png)
One of the steps before running the function that outputs the map is to fit the data set, which produces the object that the plotting function uses to create the actual map, but I get an error when creating that object.
I have already reviewed some documentation, listed here in case it is useful:
[Tutorial] https://hansjoerg.me/2018/04/23/rasch-in-r-tutorial/#plots
[Plotting function] https://rdrr.io/rforge/eRm/man/plotPImap.html
[Documentation] https://eeecon.uibk.ac.at/psychoco/2010/slides/Hatzinger.pdf
Now, this is the code that I am using. First, I install and load the respective libraries and the data:
> library(eRm)
> library(ltm)
Loading required package: MASS
Loading required package: msm
Loading required package: polycor
> library(difR)
Then I fit the PCM and generate the object of class Rm and here is the error:
*the PCM function here is specific for polytomous data, if I use a different one the output says that I am not using a dichotomous dataset
> res <- PCM(my.data)
>Warning:
The following items have no 0-responses:
AUT_10_04 AUN_07_01 AUN_07_02 AUN_09_01 AUN_10_01 AUT_11_01 AUT_17_01
AUT_20_03 CRE_05_02 CRE_07_04 CRE_10_01 CRE_16_02 EFEC_03_07 EFEC_05
EFEC_09_02 EFEC_16_03 EVA_02_01 EVA_07_01 EVA_12_02 EVA_15_06 FLX_04_01
... [rest of items]
>Responses are shifted such that lowest
category is 0.
Warning:
The following items do not have responses on
each category:
EFEC_03_07 LC_07_03 LC_11_05
Estimation may not be feasible. Please check
data matrix
I must clarify that every item in the dataset ranges from 1 to 5. It is a Likert-type polytomous dataset.
Finally, I try to use the plot function and it produces no output; the system just keeps loading indefinitely with no answer.
>plotPImap(res, sorted=TRUE)
I would like to add the description of that particular function and the arguments:
>PCM(X, W, se = TRUE, sum0 = TRUE, etaStart)
#X
Input data matrix or data frame with item responses (starting from 0);
rows represent individuals, columns represent items. Missing values are
inserted as NA.
#W
Design matrix for the PCM. If omitted, the function will compute W
automatically.
#se
If TRUE, the standard errors are computed.
#sum0
If TRUE, the parameters are normed to sum-0 by specifying an appropriate
W.
If FALSE, the first parameter is restricted to 0.
#etaStart
A vector of starting values for the eta parameters can be specified. If
missing, the 0-vector is used.
I do not understand why the scores need to begin at 0. I think that is what the message is trying to say, but I don't fully understand the output.
I would highly appreciate any hint you can provide.
Feel free to ask for any information that could help solve this issue.
The problem is not caused by the fact that there are no items with 0-responses. The model automatically corrects this by centering the response scale categories on zero. (You'll notice that the PI-map that you linked to is centered on zero. Also, I believe the map you linked to is of dichotomous data. Polytomous data should include the scale categories on the PI-map, I believe.)
Without being able to see your data, it is impossible to know the exact cause though.
It may be that the model is not converging. That may be what this error was alluding to: Estimation may not be feasible. Please check data matrix. You could check by entering > res at the prompt. If the model was able to converge you should see something like:
Conditional log-likelihood: -2.23709
Number of iterations: 27
Number of parameters: 8
...
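To see which items triggered the second warning, a check along these lines may help. It is a sketch on made-up data: toy is a hypothetical stand-in for my.data, and the shift by 1 simply mirrors what the warning says PCM does (lowest category becomes 0).

```r
# Hypothetical stand-in for my.data: 1-5 Likert responses, 50 persons x 4 items
set.seed(1)
toy <- as.data.frame(matrix(sample(1:5, 200, replace = TRUE), ncol = 4))
names(toy) <- paste0("item", 1:4)
toy$item4[toy$item4 == 3] <- 2  # make category 3 unused on item4

# Shift so that the lowest category is 0
shifted <- toy - 1

# Responses per category (rows "0".."4") for each item (columns)
counts <- sapply(shifted, function(x) table(factor(x, levels = 0:4)))
counts

# Items with at least one empty category
empty_items <- names(which(colSums(counts == 0) > 0))
empty_items
```

Items listed in empty_items are candidates for collapsing categories or being dropped before fitting, since a category nobody used gives the model nothing to estimate its threshold from.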
Does your data contain answers with decimal numbers? I ran into the same error and solved it with the dplyr::dense_rank() function:
df_ranked <- sapply(df_decimal_data, dense_rank)
Worked.
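What dense ranking does to the values can be illustrated in base R. The helper dense_rank_base below is a hypothetical equivalent written for this example, not part of dplyr: it maps each distinct value to consecutive integers with no gaps, which turns decimal scores into integer categories.

```r
# Base-R illustration of dense ranking (hypothetical helper, not from dplyr)
dense_rank_base <- function(x) match(x, sort(unique(x)))

x <- c(1.5, 3, 1.5, 4.25, NA, 3)
dense_rank_base(x)
# 1 2 1 3 NA 2
```

Be aware that this recoding also changes the spacing between categories, so it is worth checking that the result still means what the original scale meant.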

random data generation for Uniform Distribution for Set of Parameter Vector

I am interested in generating data from a uniform distribution using a vector of parameters (say, a parameter vector of size 10). I tried this in R but it does not work as expected. Please see the code below: it gives only one observation, but I want all 10 values.
parameter=c(1,2,4,5,3,45,10,14,7,12)
runif(1,0,parameter)
runif(10,0,parameter)
Or if you want it to automatically detect how many values to generate based on the length...
runif(length(parameter), 0, parameter)
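A short sketch of why this works: the first argument of runif fixes how many values are returned, and min and max are recycled across those draws, so runif(1, 0, parameter) only ever uses the first upper bound. The name n_draws and the matrix layout in the second part are assumptions added for illustration.

```r
parameter <- c(1, 2, 4, 5, 3, 45, 10, 14, 7, 12)

set.seed(42)
# One draw per parameter: the i-th value lies between 0 and parameter[i]
draws <- runif(length(parameter), min = 0, max = parameter)
all(draws >= 0 & draws <= parameter)

# Several draws per parameter: one row per draw, one column per parameter
n_draws <- 3
draw_matrix <- matrix(
  runif(n_draws * length(parameter), min = 0,
        max = rep(parameter, each = n_draws)),
  nrow = n_draws
)
dim(draw_matrix)
```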

Accessing class values in R's poLCA

I am trying my hand at learning Latent Class Analysis, while also learning R. I'm using the poLCA package, and am having a bit of trouble accessing the attributes. I can run the sample code just fine:
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
ds = within(ds, (cesdcut = ifelse(cesd>20, 1, 0)))
library(poLCA)
res2 = poLCA(cbind(homeless=homeless+1,
cesdcut=cesdcut+1, satreat=satreat+1,
linkstatus=linkstatus+1) ~ 1,
maxiter=50000, nclass=3,
nrep=10, data=ds)
but in order to make this more useful, I'd like to access the attributes within the objects created by the poLCA class as such:
attr(res2, 'Nobs')
attr(res2, 'maxiter')
but they both come up as NULL. I expect Nobs to be 453 (determined by the function) and maxiter to be 50000 (dictated by my input value).
I'm sure I'm just being naive, but I could use any help available. Thanks a lot!
Welcome to R. You've got the model-fitting syntax right, in that you can get a model out (I don't know how latent class analysis works, so I can't speak to the statistical validity of your result). However, you've mixed up the different ways in which R can store information pertaining to a model.
poLCA returns an object of class poLCA, which is
a list containing the following elements:
(. . .)
Nobs number of fully observed cases (less than or equal to N).
maxiter maximum number of iterations through which the estimation algorithm was set
to run.
Since it's a list, you can extract individual elements from your model object using the $ operator:
res2$Nobs # number of observations
res2$maxiter # maximum iterations
In some cases, there might be extractor functions to get this information without having to do low-level indexing. For example, many model-fitting functions will have a fitted method, which pulls out the vector of fitted values on the training data; and similarly residuals pulls out the vector of residuals. You should check whether there are such extractor functions provided by the poLCA package and use them if possible; that way, you're not making assumptions about the structure of the model object that might be broken in the future.
This is distinct from getting the attributes of an object, which is what you use attr for. Attributes in R are what you might call metadata: they contain R-specific information about an object itself, rather than information about whatever it is the object relates to. Examples of common attributes include class (the class of an object), dim (the dimensions of an array or matrix), names (names of individual elements of a vector/list/array) and so on.
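The difference can be seen on a small toy object. This is a sketch only: fit is a hypothetical list made up for illustration (it is not a poLCA result), and the lm fit on the built-in cars data is there just to show an extractor function in action.

```r
# A toy "model object": a list with a class attribute (hypothetical)
fit <- structure(list(Nobs = 453, maxiter = 50000), class = "toyfit")

fit$Nobs                # list element: 453
fit[["maxiter"]]        # same idea with [[ ]]: 50000
attr(fit, "Nobs")       # NULL, because Nobs is an element, not an attribute
names(attributes(fit))  # the actual attributes: names and class

# Extractor functions avoid reaching into the object's structure directly
m <- lm(dist ~ speed, data = cars)
head(fitted(m))
head(residuals(m))
```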

Creating a correlation matrix using novel distance function

I'm trying to write a function that will create a correlation matrix using a fancy distance estimate (dcorr, Brownian distance). More generally, I want to write code for a generic "correlation" matrix in which you can plug in any distance estimator.
My data is formatted such that columns are variables and rows are observations.
I'm having problems with my basic code. My algorithm is as follows:
Use apply to take a variable
Pass to function that will again take apply on the entire matrix
At this point you should have a pair of variables
Use na.omit to remove missing observations (necessary for dcorr)
Calculate dcorr
I was hoping this would result in the correlation matrix, but I'm having a lot of problems with basic variable management. I'm having difficulty passing variables to the apply function. In particular, I want to take the column that was pulled in the first apply and pass it to the second apply (which is applied to the entire original matrix).
My code:
dcormatrix <- function(Matrix){
  dcorhelper <- function (Col1){
    as.matrix(apply(Matrix,2,function(Col2){
      B <- na.omit(cbind(Col1,Col2))
      dcor(B[,1],B[,2],index=1)
    },Col1=Col1))
  }
  apply(Matrix,2,dcorhelper(),Matrix=Matrix)
}
Any ideas? I'm sure there's gotta be an easy way to do this.
You may want to check out designdist from the vegan package. It allows one to define alternate distance / dissimilarity matrices. See here.
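Staying with the approach in the question, two things stand out in the posted code: dcorhelper() is called (with no argument) instead of being passed as a function, and the extra Col1=Col1 and Matrix=Matrix arguments are forwarded to inner functions that do not accept them. Inner functions can already see Col1 and Matrix through lexical scoping. A generic version could be sketched as follows; pairwise_matrix and its stat argument are hypothetical names, the statistic is assumed to be symmetric, and base cor stands in for the distance correlation so the block runs without extra packages.

```r
# Build a p x p matrix by applying stat(x, y) to every pair of columns,
# dropping rows where either value is missing (pairwise deletion).
pairwise_matrix <- function(X, stat = cor) {
  X <- as.matrix(X)
  p <- ncol(X)
  out <- matrix(NA_real_, p, p, dimnames = list(colnames(X), colnames(X)))
  for (i in seq_len(p)) {
    for (j in i:p) {  # stat is assumed symmetric, so fill both triangles
      ok <- complete.cases(X[, i], X[, j])
      out[i, j] <- out[j, i] <- stat(X[ok, i], X[ok, j])
    }
  }
  out
}

# Example: columns are variables, rows are observations
set.seed(7)
X <- matrix(rnorm(60), nrow = 20, dimnames = list(NULL, c("a", "b", "c")))
X[c(2, 5), "b"] <- NA

pearson  <- pairwise_matrix(X)
spearman <- pairwise_matrix(X, function(x, y) cor(x, y, method = "spearman"))
pearson
```

With the energy package installed, the distance correlation from the question could be plugged in as pairwise_matrix(X, function(x, y) energy::dcor(x, y, index = 1)).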
