How to generate iterations of list object in R

I'd like to generate a set of list objects.
To start, I have a 2*2 matrix from which I should get a list of outputs.
The list contains a projection matrix, an asymptotic dynamic, a transient dynamic and a matrix of elasticity: hence 4 objects. I can get all of them from the function projection.
My difficulty is this:
In task 1, I'd like to vary one of the elements (the third, called gamma) of the starting matrix and then get the corresponding list of outputs, one per value.
What I did keeps only the first element of the list for each iteration.
#Creating function projection matrix
projection<- function(sigma1,sigma2,gama,phi){
A <- matrix(c(sigma1*(1-gama),phi,sigma1*gama, sigma2),
byrow = T, ncol = 2)
if(sigma1>1|sigma1<0){stop("sigma1 must be bounded in 0 and 1")}
if(gama>1|gama<0){stop("gama must be bounded in 0 and 1")}
if(phi<0){stop("phi must be greater or equal to 0")}
library(popbio)
e.a <- eigen.analysis(A)
as <- e.a$lambda1
tr <- -log(as)
Dynamic <- list(projection.matrix = A, assymtotic.dynamic=as,
transient.dynamic=tr, Elasticity=e.a$elasticities)
return(Dynamic)
}
#Try with B
B <- projection(0.5,0.9,0.1,1.5)
#Task 1
Task1 <- function(Gama){
n <- length(as.vector(Gama))
g <- list()
for (i in 1:n){g[i]<-projection(sigma1 = 0.5,sigma2 = 0.9,
gama = Gama[i],phi = 1.5)}
return(g)
}
G <- seq(from=0, to=1, by= 0.1)
Task1(G)

There's a fairly easy fix. Instead of using [<- to assign the indexed projection object, use the [[<- function. Also, don't forget to assign the result to an object name so you can inspect and use it. Otherwise the result is only printed at the console, and the list built in the (temporary) environment of the function gets garbage-collected.
Task1 <- function(Gama){
n <- length(as.vector(Gama))
g <- list()
for (i in 1:n){g[[i]]<-projection(sigma1 = 0.5,sigma2 = 0.9,
gama = Gama[i],phi = 1.5)}
return(g)
}
G <- seq(from=0, to=1, by= 0.1)
resG <- Task1(G)
resG[1]
#--- result is a list of lists.
[[1]]
[[1]]$projection.matrix
[,1] [,2]
[1,] 0.5 1.5
[2,] 0.0 0.9
[[1]]$assymtotic.dynamic
[1] 0.9
[[1]]$transient.dynamic
[1] 0.1053605
[[1]]$Elasticity
[,1] [,2]
[1,] 0 0
[2,] 0 1
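The difference between the two assignment forms can be reproduced on a toy list without popbio. This is only a minimal sketch: make_result is a hypothetical stand-in for projection() and is not part of the original code.

```r
# Hypothetical stand-in for projection(): returns a list with two components
make_result <- function(gama) list(gama = gama, doubled = 2 * gama)

# `[<-` tries to spread the components of the value over the selected slots.
# With a single slot, only the first component is kept (R warns about it).
g_single <- list()
suppressWarnings(g_single[1] <- make_result(0.1))

# `[[<-` stores the whole returned list as one element
g_double <- list()
g_double[[1]] <- make_result(0.1)

# lapply() builds the same list of lists without an explicit loop
G <- seq(from = 0, to = 1, by = 0.1)
res <- lapply(G, make_result)
str(res[[1]])
```

With lapply() there is no index to get wrong, which is why it is often preferred over a for loop when each iteration returns a list.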


Evaluate expression in environment passed to function as parameter in R

I am trying to create a function for a theoretical Hessian matrix that I can then evaluate at different locations. First I tried storing expressions as values in a matrix or array, but although I could initially put an expression into a matrix, I couldn't replace it with the calculated value.
hessian_matrix <- function(gx, respect_to){
out_mat <- matrix(0, nrow=length(respect_to), ncol=length(respect_to))
for(i in 1:length(respect_to)){
for(j in 1:length(respect_to)){
dthetad2x <- deriv(D(gx, respect_to[i]), respect_to[j], function.arg=TRUE)
# also tried
# dthetad2x <- as.expression(D(D(gx, respect_to[i])))
out_mat[i,j] <- dthetad2x
}
}
return(out_mat)
}
Because that didn't work, I decided to create an environment to house the entries of the Hessian matrix as objects.
hessian_matrix <- function(gx, respect_to){
out_env <- new.env()
for(i in 1:length(respect_to)){
for(j in 1:length(respect_to)){
dthetad2x <- as.call(D(D(gx, respect_to[i]), respect_to[j]))
assign(paste0(i,j), dthetad2x, out_env)
}
}
return(out_env)
}
g <- expression(x^3-2*x*y-y^6)
h_g <- hessian_matrix(g, respect_to = c('x', 'y'))
This worked, and when I pass this in as a parameter to evaluate, I can see the expression, but I can't evaluate it. I tried call(), eval(), do.call(), get(), etc., and none of them worked. I also tried assigning the answer within the environment passed, making a new environment to return, or simply using variables.
fisher_observed <- function(h, at_val, params, sum=TRUE){
out_env <- new.env()
# add params to passed environment
for(i in 1:length(at_val)){
h[[names(at_val)[i]]] <- unname(at_val[i])
}
for(i in ls(h)){
value <- do.call(i, envir=h, at_val)
assign(i, value, out_env)
}
return(h)
}
fisher_observed(h_g, at_val=list(x=1,y=2))
According to the code for do.call(), this is how it should be used, but it isn't working when passed as a parameter in this way.
R can already compute the Hessian matrix, so you do not have to write this yourself. You could use deriv or deriv3 as shown below:
g <- expression(x^3 - 2 * x * y - y^6)
eval(deriv3(g, c('x','y')),list(x=1,y=2))
[1] -67
attr(,"gradient")
x y
[1,] -1 -194
attr(,"hessian")
, , x
x y
[1,] 6 -2
, , y
x y
[1,] -2 -480
If you want to use a function, you could do:
hessian <- function(expr,values){
nms <- names(values)
f <- eval(deriv3(expr, nms),as.list(values))
matrix(attr(f, 'hessian'), length(values), dimnames = list(nms,nms))
}
hessian(g, c(x=1,y=2))
x y
x 6 -2
y -2 -480
That said, the function is not really necessary: if you wanted both the gradient and the Hessian, calling it in addition to deriv3 would mean doing the computation twice.
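To avoid that double computation, the value, gradient and Hessian can all be read from a single deriv3 evaluation. A minimal sketch in base R; the variable names at, f, grad and hess are only illustrative:

```r
g <- expression(x^3 - 2 * x * y - y^6)
at <- c(x = 1, y = 2)

# One evaluation carries the value plus "gradient" and "hessian" attributes
f <- eval(deriv3(g, names(at)), as.list(at))

value <- as.numeric(f)                 # function value at the point
grad  <- drop(attr(f, "gradient"))     # named vector of first derivatives
hess  <- matrix(attr(f, "hessian"), length(at),
                dimnames = list(names(at), names(at)))

value
grad
hess
```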
I think this (almost) does what you're looking for:
fisher_observed <- function(h, at_val) {
values <- numeric(length = length(names(h)))
for (i in seq_len(length(names(h)))) {
values[i] = purrr::pmap(.l = at_val, function(x, y) eval(h[[names(h)[i]]]))
}
names(values) = names(h)
return(values)
}
This currently returns a named list of evaluated points:
$`21`
[1] -2
$`22`
[1] -480
$`11`
[1] 6
$`12`
[1] -2
You'd still need to re-arrange this into a matrix (this should be fairly easy given that the names are preserved). I think the key thing is that the names must be characters when looking up values in h_g.
You cannot have a matrix of "calls" but you can have a character matrix then evaluate it:
hessian_matrix <- function(gx, respect_to){
out_mat <- matrix("", nrow=length(respect_to), ncol=length(respect_to))
for(i in 1:length(respect_to)){
for(j in 1:length(respect_to)){
dthetad2x <- D(D(gx, respect_to[i]), respect_to[j])
out_mat[i,j] <- deparse(dthetad2x)
}
}
return(out_mat)
}
g <- expression(x^3-2*x*y-y^6)
h_g <- hessian_matrix(g, respect_to = c('x', 'y'))
h_g
#> [,1] [,2]
#> [1,] "3 * (2 * x)" "-2"
#> [2,] "-2" "-(6 * (5 * y^4))"
apply(h_g, 1:2, \(x) eval(str2lang(x), list(x=1, y=2)))
#> [,1] [,2]
#> [1,] 6 -2
#> [2,] -2 -480

How to modify non-zero elements of a large sparse matrix based on a second sparse matrix in R

I have two large sparse matrices (about 41,000 x 55,000 in size). The density of nonzero elements is around 10%. They both have the same row index and column index for nonzero elements.
I now want to modify the values in the first sparse matrix if values in the second matrix are below a certain threshold.
library(Matrix)
# Generating the example matrices.
set.seed(42)
# Rows with values.
i <- sample(1:41000, 227000000, replace = TRUE)
# Columns with values.
j <- sample(1:55000, 227000000, replace = TRUE)
# Values for the first matrix.
x1 <- runif(227000000)
# Values for the second matrix.
x2 <- sample(1:3, 227000000, replace = TRUE)
# Constructing the matrices.
m1 <- sparseMatrix(i = i, j = j, x = x1)
m2 <- sparseMatrix(i = i, j = j, x = x2)
I now get the rows, columns and values from the first matrix in a new matrix. This way, I can simply subset them and only the ones I am interested in remain.
# Getting the positions and values from the matrices.
position_matrix_from_m1 <- rbind(i = m1@i, j = summary(m1)$j, x = m1@x)
position_matrix_from_m2 <- rbind(i = m2@i, j = summary(m2)$j, x = m2@x)
# Subsetting to get the elements of interest.
position_matrix_from_m1 <- position_matrix_from_m1[,position_matrix_from_m1[3,] > 0 & position_matrix_from_m1[3,] < 0.05]
# We add 1 to the values, since the sparse matrix is 0-based.
position_matrix_from_m1[1,] <- position_matrix_from_m1[1,] + 1
position_matrix_from_m1[2,] <- position_matrix_from_m1[2,] + 1
Now I am getting into trouble. Overwriting the values in the second matrix takes too long. I let it run for several hours and it did not finish.
# This takes hours.
m2[position_matrix_from_m1[1,], position_matrix_from_m1[2,]] <- 1
m1[position_matrix_from_m1[1,], position_matrix_from_m1[2,]] <- 0
I thought about pasting the row and column information together. Then I have a unique identifier for each value. This also takes too long and is probably just very bad practice.
# We would get the unique identifiers after the subsetting.
m1_identifiers <- paste0(position_matrix_from_m1[1,], "_", position_matrix_from_m1[2,])
m2_identifiers <- paste0(position_matrix_from_m2[1,], "_", position_matrix_from_m2[2,])
# Now, I could use which and get the position of the values I want to change.
# This also uses too much memory.
m2_identifiers_of_interest <- which(m2_identifiers %in% m1_identifiers)
# Then I would modify the x values in the position_matrix_from_m2 matrix and overwrite m2@x in the sparse matrix object.
Is there a fundamental error in my approach? What should I do to run this efficiently?
Is there a fundamental error in my approach?
Yes. Here it is.
# This takes hours.
m2[position_matrix_from_m1[1,], position_matrix_from_m1[2,]] <- 1
m1[position_matrix_from_m1[1,], position_matrix_from_m1[2,]] <- 0
Syntax like mat[rn, cn] (whether mat is a dense or sparse matrix) selects all rows in rn and all columns in cn, so you get a length(rn) x length(cn) matrix. Here is a small example:
A <- matrix(1:9, 3, 3)
# [,1] [,2] [,3]
#[1,] 1 4 7
#[2,] 2 5 8
#[3,] 3 6 9
rn <- 1:2
cn <- 2:3
A[rn, cn]
# [,1] [,2]
#[1,] 4 7
#[2,] 5 8
What you intend to do is to select (rn[1], cn[1]), (rn[2], cn[2]), ... only. The correct syntax is then mat[cbind(rn, cn)]. Here is a demo:
A[cbind(rn, cn)]
#[1] 4 8
So you need to fix your code to:
m2[cbind(position_matrix_from_m1[1,], position_matrix_from_m1[2,])] <- 1
m1[cbind(position_matrix_from_m1[1,], position_matrix_from_m1[2,])] <- 0
Oh wait... Based on your construction of position_matrix_from_m1, this is just
ij <- t(position_matrix_from_m1[1:2, ])
m2[ij] <- 1
m1[ij] <- 0
Now, let me explain how you can do better. You have underused summary(). It returns a 3-column data frame giving the (i, j, x) triplets, where both i and j are 1-based indices. You could have worked with this output directly, as follows:
# Getting (i, j, x) triplet (stored as a data.frame) for both `m1` and `m2`
position_matrix_from_m1 <- summary(m1)
# you never seem to use `position_matrix_from_m2` so I skip it
# Subsetting to get the elements of interest.
position_matrix_from_m1 <- subset(position_matrix_from_m1, x > 0 & x < 0.05)
Now you can do:
ij <- as.matrix(position_matrix_from_m1[, 1:2])
m2[ij] <- 1
m1[ij] <- 0
Is there an even better solution? Yes! Note that nonzero elements in m1 and m2 are located in the same positions. So basically, you just need to change m2@x according to m1@x.
ind <- m1@x > 0 & m1@x < 0.05
m2@x[ind] <- 1
m1@x[ind] <- 0
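The slot-based update relies on both matrices storing their nonzero entries in exactly the same order, so it may be worth checking that before overwriting anything. A small sketch on toy data, assuming the Matrix package is installed; the explicit pattern check and the drop0() call are additions that are not in the original answer:

```r
library(Matrix)

# Toy matrices sharing the same (i, j) positions
i <- c(1, 2, 3, 3)
j <- c(1, 2, 1, 3)
m1 <- sparseMatrix(i = i, j = j, x = c(0.01, 0.5, 0.03, 0.9))
m2 <- sparseMatrix(i = i, j = j, x = c(3, 2, 3, 1))

# Guard: identical row indices and column pointers mean that the
# x slots of both matrices are aligned entry by entry
stopifnot(identical(m1@i, m2@i), identical(m1@p, m2@p))

ind <- m1@x > 0 & m1@x < 0.05
m2@x[ind] <- 1
m1@x[ind] <- 0

# The zeros written into m1@x are still stored explicitly;
# drop0() removes them from the sparse structure
m1_clean <- drop0(m1)
```

Whether to call drop0() depends on what happens next: keeping the explicit zeros preserves the shared sparsity pattern with m2, while dropping them saves memory.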
A complete R session
I don't have enough RAM to create your large matrix, so I reduced your problem size a little bit for testing. Everything worked smoothly.
library(Matrix)
# Generating the example matrices.
set.seed(42)
## reduce problem size to what my laptop can bear with
squeeze <- 0.1
# Rows with values.
i <- sample(1:(41000 * squeeze), 227000000 * squeeze ^ 2, replace = TRUE)
# Columns with values.
j <- sample(1:(55000 * squeeze), 227000000 * squeeze ^ 2, replace = TRUE)
# Values for the first matrix.
x1 <- runif(227000000 * squeeze ^ 2)
# Values for the second matrix.
x2 <- sample(1:3, 227000000 * squeeze ^ 2, replace = TRUE)
# Constructing the matrices.
m1 <- sparseMatrix(i = i, j = j, x = x1)
m2 <- sparseMatrix(i = i, j = j, x = x2)
## give me more usable RAM
rm(i, j, x1, x2)
##
## fix to your code
##
m1a <- m1
m2a <- m2
# Getting (i, j, x) triplet (stored as a data.frame) for both `m1` and `m2`
position_matrix_from_m1 <- summary(m1)
# Subsetting to get the elements of interest.
position_matrix_from_m1 <- subset(position_matrix_from_m1, x > 0 & x < 0.05)
ij <- as.matrix(position_matrix_from_m1[, 1:2])
m2a[ij] <- 1
m1a[ij] <- 0
##
## the best solution
##
m1b <- m1
m2b <- m2
ind <- m1@x > 0 & m1@x < 0.05
m2b@x[ind] <- 1
m1b@x[ind] <- 0
##
## they are identical
##
all.equal(m1a, m1b)
#[1] TRUE
all.equal(m2a, m2b)
#[1] TRUE
Caveat:
I know that some people may propose
m1c <- m1
m2c <- m2
logi <- m1 > 0 & m1 < 0.05
m2c[logi] <- 1
m1c[logi] <- 0
It looks completely natural in R's syntax. But trust me, it is extremely slow for large matrices.

Write a loop to select all combination of variable values generating positive equation values in R

I have the following four equations (a,b,c,d), with several different variables (x,t,v,w,n,f). My goal would be to try and find all sets of variable values that would generate all positive (and non-zero) numbers for equations(a,b,c,d). A regular loop would just go through each number of the sequence generated and systematically check if it generates a positive value or not. I want it to pick up random numbers from each sequence and test it against the others in R.
For example (x=8, t = 2.1,v=13,w=1,n=10,f=1) is a possible set of combinations.
Please do not suggest analytically solving for these and then finding out values. These are simply representations of equations I'm dealing with. The equations I have are quite complex, and more than 15 variables.
#Equations
a <- x * t - 2*x
b <- v - x^2
c <- x - w*t - t*t
d <- (n - f)/t
x <- seq(from = 0.0001, to = 1000, by = 0.1)
t <- seq(from = 0.0001, to = 1000, by = 0.1)
v <- seq(from = 0.0001, to = 1000, by = 0.1)
w <- seq(from = 0.0001, to = 1000, by = 0.1)
n <- seq(from = 0.0001, to = 1000, by = 0.1)
f <- seq(from = 0.0001, to = 1000, by = 0.1)
For a start, it might be better to organize your equations and your probe values into lists:
set.seed(1222)
values <- list(x = x, t = t, v = v, w = w, n = n, f = f)
eqs <- list(
a = expression(x * t - 2 * x),
b = expression(v - x^2),
c = expression(x - w*t - t*t),
d = expression((n - f)/t)
)
Then we can define a number of samples to take randomly from each probe vector:
samples <- 3
values.sampled <- lapply(values, sample, samples)
$x
[1] 642.3001 563.1001 221.3001
$t
[1] 583.9001 279.0001 749.1001
$v
[1] 446.6001 106.7001 0.7001
$w
[1] 636.0001 208.8001 525.5001
$n
[1] 559.8001 28.4001 239.0001
$f
[1] 640.4001 612.5001 790.1001
We can then iterate over each stored equation, evaluating the equation within the "sampled" environment:
results <- sapply(eqs, eval, envir = values.sampled)
a b c d
[1,] 373754.5 -412102.82 -711657.5 -0.1380373
[2,] 155978.8 -316975.02 -135533.2 -2.0935476
[3,] 165333.3 -48973.03 -954581.8 -0.7356827
From there you can remove any value that is 0 or less:
results[results <= 0] <- NA
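If the goal is to keep only the variable sets for which every equation is positive, the same idea can be written so that each draw is one row and the filter works on whole rows. This is a sketch in base R; the names all_positive, draws and hits are illustrative, and with ranges this wide the set of hits may well be empty for a given seed:

```r
set.seed(1222)
domain <- seq(from = 0.0001, to = 1000, by = 0.1)
vars <- c("x", "t", "v", "w", "n", "f")

eqs <- list(
  a = quote(x * t - 2 * x),
  b = quote(v - x^2),
  c = quote(x - w * t - t * t),
  d = quote((n - f) / t)
)

# TRUE for every row of `draws` where all equations are strictly positive
all_positive <- function(draws) {
  Reduce(`&`, lapply(eqs, function(e) eval(e, draws) > 0))
}

# One random value per variable and per draw; each row is one candidate set
n_draws <- 10000
draws <- as.data.frame(
  lapply(setNames(vars, vars),
         function(nm) sample(domain, n_draws, replace = TRUE))
)

hits <- draws[all_positive(draws), , drop = FALSE]
nrow(hits)
```

Because each row keeps the six sampled values together, hits directly lists the combinations that passed, rather than a matrix of equation values with NA holes.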
If every independent value can take on the same value (e.g. seq(from = 0.0001, to = 1000, by = 0.1)), we can approach this with much greater rigor and avoid the possibility of generating duplicates. First we create a masterFun that is essentially a wrapper for all of the functions you want to define:
masterFun <- function(y) {
## y is a vector with 6 values
## y[1] -->> x
## y[2] -->> t
## y[3] -->> v
## y[4] -->> w
## y[5] -->> n
## y[6] -->> f
fA <- function(x, t) {x * t - 2*x}
fB <- function(v, x) {v - x^2}
fC <- function(x, w, t) {x - w*t - t*t}
fD <- function(n, f, t) {(n - f)/t}
## one can easily filter out negative
## results as @jdobres has done.
c(a = fA(y[1], y[2]), b = fB(y[3], y[1]),
c = fC(y[1], y[4], y[2]), d = fD(y[5], y[6], y[2]))
}
Now, using permuteSample, which is capable of generating random permutations of a vector and subsequently applying any given user defined function to each permutation, from RcppAlgos (I am the author), we have:
## Not technically the domain, but this variable name
## is concise and very descriptive
domain <- seq(from = 0.0001, to = 1000, by = 0.1)
library(RcppAlgos)
## number of variables ... x, t, v, w, n, f
## ||
## \/
permuteSample(domain, m = 6, repetition = TRUE,
n = 3, seed = 123, FUN = masterFun)
[[1]]
a b c d
218830.316100 -608541.146040 -310624.596670 -1.415869
[[2]]
a b c d
371023.322880 -482662.278860 -731052.643620 1.132836
[[3]]
a b c d
18512.60761001 -12521.71284001 -39722.27696002 -0.09118721
In short, the underlying algorithm is capable of generating the nth lexicographical result, which allows us to apply a mapping from 1 to "# of total permutations" to the permutations themselves. For example, given the permutations of the vector 1:3:
permuteGeneral(3, 3)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 2
[3,] 2 1 3
[4,] 2 3 1
[5,] 3 1 2
[6,] 3 2 1
We can easily generate the 2nd and the 5th permutation above without generating the first permutation or the first four permutations:
permuteSample(3, 3, sampleVec = c(2, 5))
[,1] [,2] [,3]
[1,] 1 3 2
[2,] 3 1 2
This allows us to have a more controlled and tangible grasp of our random samples as we can now think of them in a more familiar way (i.e. a random sample of numbers).
If you actually want to see which variables were used in the above calculation, we simply drop the FUN argument:
permuteSample(domain, m = 6, repetition = TRUE, n = 3, seed = 123)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 780.7001 282.3001 951.5001 820.8001 289.1001 688.8001
[2,] 694.8001 536.0001 84.9001 829.2001 757.3001 150.1001
[3,] 114.7001 163.4001 634.4001 80.4001 327.2001 342.1001
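The idea of mapping a rank to the corresponding lexicographic result can be illustrated in base R for the repetition case used above. This is a hypothetical sketch and not the RcppAlgos implementation; nth_tuple is an invented name, and the arithmetic is only exact while length(domain)^m stays below 2^53:

```r
# Map a 1-based rank k to the k-th tuple (in lexicographic order) of
# length m drawn with repetition from `domain`.
nth_tuple <- function(k, domain, m) {
  base <- length(domain)
  stopifnot(k >= 1, k <= base^m)
  k <- k - 1                      # switch to a 0-based rank
  idx <- numeric(m)
  for (pos in m:1) {              # fill the digits from the right
    idx[pos] <- k %% base
    k <- k %/% base
  }
  domain[idx + 1]
}

# Sampling ranks without replacement gives tuples without duplicates
set.seed(123)
ranks <- sample(3^2, 4)           # four distinct ranks out of 9 tuples
tuples <- t(sapply(ranks, nth_tuple, domain = 1:3, m = 2))
tuples
```

Sampling the ranks rather than the values is what rules out duplicate combinations: distinct ranks always map to distinct tuples.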

Error with parallelization in R for S4 objects

I am trying to optimize a function that I am going to apply to several rasters with millions of cells, so I want to parallelize it.
The initial Raster
So this is the initial raster:
library(raster)
SPA <- raster(nrows=3, ncols=3, xmn = -10, xmx = -4, ymn = 4, ymx = 10)
values(SPA) <- c(0.1, 0.4, 0.6, 0, 0.2, 0.4, 0, 0.1, 0.2)
plot(SPA)
The objective of the function is to get a dataframe with the distance between all of the cells present in the raster with a column from, a column to, and a column distance.
Transition layer
In order to do that, I create a transition layer using the gdistance package:
library(gdistance)
h16 <- transition(SPA, transitionFunction=function(x){1},16,symm=FALSE)
h16 <- geoCorrection(h16, scl=FALSE)
and the origin points for every cell:
B <- xyFromCell(SPA, cell = 1:ncell(SPA))
head(B)
x y
[1,] -9 9
[2,] -7 9
[3,] -5 9
[4,] -9 7
[5,] -7 7
[6,] -5 7
Distance function
With some help from some Stack Overflow answers, I made this function, which is faster than the accCost one in gdistance:
accCost2 <- function(x, fromCoords) {
fromCells <- cellFromXY(x, fromCoords)
tr <- transitionMatrix(x)
tr <- rBind(tr, rep(0, nrow(tr)))
tr <- cBind(tr, rep(0, nrow(tr)))
startNode <- nrow(tr)
adjP <- cbind(rep(startNode, times = length(fromCells)), fromCells)
tr[adjP] <- Inf
adjacencyGraph <- graph.adjacency(tr, mode = "directed", weighted = TRUE)
E(adjacencyGraph)$weight <- 1/E(adjacencyGraph)$weight
return(shortest.paths(adjacencyGraph, v = startNode, mode = "out")[-startNode])
}
What I want to parallelize
And using apply I get my desired data.frame:
connections <- data.frame(from = rep(1:nrow(B), each = nrow(B)),to = rep(1:nrow(B), nrow(B)), dist =as.vector(apply(B,1, accCost2, x = h16)))
head(connections)
from to dist
1 1 1 0.0
2 1 2 219915.7
3 1 3 439831.3
4 1 4 221191.8
5 1 5 312305.7
6 1 6 493316.1
This is what I tried with parApply
library("parallel")
cl = makeCluster(3)
clusterExport(cl, c("B", "h16", "accCost2"))
clusterEvalQ(cl, library(gdistance), library(raster))
connections <- data.frame(from = rep(1:nrow(B), each = nrow(B)),to = rep(1:nrow(B), nrow(B)), dist =as.vector(parRapply(cl, B,1, accCost2, x = h16)))
stopCluster(cl)
But I get the following error:
Error in x[i, , drop = FALSE] : object of type 'S4' is not subsettable
I am fairly new to parallelization, and I am not sure what I am doing wrong.
There are several syntax issues in your code.
This code works for me.
library("parallel")
accCost_wrap <- function(x){accCost2(h16,x)}
#Instead of including h16 in the parRapply function,
#just get it in the node environment
cl = makeCluster(3)
clusterExport(cl, c("h16", "accCost2"))
#B will be "sent" to the nodes through the parRapply function.
clusterEvalQ(cl, {library(gdistance)})
#raster is a dependency of gdistance, so no need to include raster here.
pp <- parRapply(cl, x=B, FUN=accCost_wrap)
stopCluster(cl)
connections <- data.frame(from = rep(1:nrow(B), each = nrow(B)),
to = rep(1:nrow(B), nrow(B)),
dist = as.vector(pp))
Your version of accCost is indeed faster than the version in gdistance. Your version omits the checks to see if your points are within the extent of your transition layer. Proceed with caution.
(You could make your function even faster by taking the cell numbers as input. Also, sending so much data back from each node does not seem very efficient.)
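A likely reason for the original error is that the first data argument of parRapply is itself named x, so passing x = h16 makes parRapply try to split the S4 transition object into rows instead of B. The wrapper pattern can be tried without gdistance on toy data. In this sketch centre stands in for h16 and dist_to for accCost2; both are hypothetical names:

```r
library(parallel)

# Toy coordinates; `centre` plays the role of the object shared by all rows
B <- cbind(x = c(-9, -7, -5), y = c(9, 9, 7))
centre <- c(x = -7, y = 7)

# The worker function's first argument is also called `x`, like accCost2
dist_to <- function(x, fromCoords) sqrt(sum((fromCoords - x)^2))

# Wrapper taking only the row, so no argument named `x` has to be
# passed through parRapply()
dist_wrap <- function(pt) dist_to(centre, pt)

cl <- makeCluster(2)
clusterExport(cl, c("centre", "dist_to"))
res_par <- parRapply(cl, B, dist_wrap)
stopCluster(cl)

# Same result as the sequential version
res_seq <- apply(B, 1, dist_wrap)
res_par
```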

Is there a way to simulate a dataset based on a model object in TAM?

I've estimated a multidimensional IRT model using the TAM package, based on a dataset that I have.
So now that I have the TAM fit object, is there any way to use it to simulate a new dataset that "abides by the rules" of that model I estimated?
Here is something similar, but about lme fit objects:
https://stats.stackexchange.com/questions/11233/how-to-simulate-data-based-on-a-linear-mixed-model-fit-object-in-r
Thanks in advance,
KH
Edit
Since TAM version 1.10-0, this is possible using the function IRT.simulate (see the respective help file). Thanks again for the request.
library(TAM)
data(data.gpcm)
psych::describe(data.gpcm)
resp <- data.gpcm
# define three dimensions and different loadings of item categories
# on these dimensions in B loading matrix
I <- 3 # 3 items
D <- 3 # 3 dimensions
# define loading matrix B
# 4 categories for each item (0, 1, 2, 3)
B <- array(0 , dim = c(I, 4, D))
for (ii in 1:I){
B[ii, 1:4, 1] <- 0:3
B[ii, 1, 2] <- 1
B[ii, 4, 3] <- 1
}
dimnames(B)[[1]] <- colnames(resp)
B[1, , ]
## > B[1,,]
## [,1] [,2] [,3]
## [1,] 0 1 0
## [2,] 1 0 0
## [3,] 2 0 0
## [4,] 3 0 1
#-- test run
mod1 <- tam.mml(resp, B = B, control = list(snodes = 1000, maxiter = 5))
sim.dat <- IRT.simulate(mod1, nobs = 2000)
Old Solution
I wouldn't say it is impossible. However, for the time being, it is not easy since it involves handling of TAM internal functions and attributes of the estimation object. That is, there is no method yet that lets you extract the response probability function at prespecified trait points.
However, thanks to your request, we are working on exactly this very valuable feature and I'll give an update to this answer as soon as the method is on CRAN.
For now, let's extend the example from that request ("Implement ConQuest score command in TAM"), which Alex also included in the manual page of the tam function as EXAMPLE 20.
data(data.gpcm)
psych::describe(data.gpcm)
resp <- data.gpcm
# define three dimensions and different loadings of item categories
# on these dimensions in B loading matrix
I <- 3 # 3 items
D <- 3 # 3 dimensions
# define loading matrix B
# 4 categories for each item (0, 1, 2, 3)
B <- array(0 , dim = c(I, 4, D))
for (ii in 1:I){
B[ii, 1:4, 1] <- 0:3
B[ii, 1, 2] <- 1
B[ii, 4, 3] <- 1
}
dimnames(B)[[1]] <- colnames(resp)
B[1, , ]
## > B[1,,]
## [,1] [,2] [,3]
## [1,] 0 1 0
## [2,] 1 0 0
## [3,] 2 0 0
## [4,] 3 0 1
#-- test run
mod1 <- tam.mml(resp, B = B, control = list(snodes = 1000, maxiter = 5))
Now for the part where we extract the attributes that are necessary for the computation of the response probabilities and generate new testees.
# Extract necessary item attributes
xsi <- mod1$xsi$xsi
A <- mod1$A
B <- mod1$B
maxK <- mod1$maxK
nI <- dim(A)[1]
iIndex <- 1:nI
AXsi <- matrix(0, nrow = nI, ncol = maxK)
# Simulate new testees
nnodes <- 2000
theta <- MASS::mvrnorm(n = nnodes, mod1$beta, mod1$variance)
The response probabilities can be obtained from a call to an internal function.
# Calculate response probabilities and simulate
p <- TAM:::calc_prob.v5(iIndex, A, AXsi, B, xsi, theta, nnodes, maxK, recalc = TRUE)$rprobs
p[,,1] # response probability of testee 1 to each category 0, 1, 2, 3 for all three items
# [,1] [,2] [,3] [,4]
# [1,] 0.06738066 0.8111365 0.1043441 0.0171387
# [2,] 0.02545206 0.4895568 0.3182046 0.1667866
# [3,] 0.04503185 0.5105446 0.3429603 0.1014633
With this, simulate the success cut and compare that to the response probabilities.
sim.data <- matrix(runif(nnodes * nI), nrow = nnodes, ncol = nI)
for(pp in 1:nnodes){
cat.success.pp <- (sim.data[pp, ] > t(apply(p[, , pp], 1, cumsum)))
sim.data[pp, ] <- c(cat.success.pp %*% rep(1, maxK))
}
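The last step, turning category probabilities into simulated responses by comparing a uniform draw with the cumulative probabilities, can be checked in isolation. A base R sketch with made-up probabilities for two items; draw_category is a hypothetical helper and not a TAM function:

```r
# p: one row per item, one column per category (0, 1, 2, 3); rows sum to 1
# u: one uniform draw per item
# Returns, per item, the number of cumulative thresholds that u exceeds,
# which is the simulated category.
draw_category <- function(p, u) {
  cum <- t(apply(p, 1, cumsum))
  # the last cumulative value is 1, so it is left out of the comparison
  rowSums(u > cum[, -ncol(cum), drop = FALSE])
}

p <- rbind(c(0.07, 0.81, 0.10, 0.02),
           c(0.03, 0.49, 0.32, 0.16))

# Fixed draws to follow the logic by hand, then random draws
draw_category(p, u = c(0.5, 0.9))
set.seed(1)
sim <- draw_category(p, u = runif(nrow(p)))
```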
Best,
Tom
