Fitting an inhomogeneous Cox LGCP to a replicated point process using mppm - r

My recent foray into spatial point patterns has brought me to examining LGCP Cox processes. In my case I actually have a series of point patterns that I want to fit a single model to. One of my previous inquiries led me to using mppm to train such models (thanks Adrian Baddeley!). My next question relates to using this type of Cox model in the context of mppm.
Is it possible to fit an inhomogeneous LGCP Cox process (or another type of Cox process) to a replicated point pattern using mppm? I see some info on fitting Gibbs processes, but not really for Cox processes.
It seems like the answer may be "possibly" through some creative use of the "random" argument.
For the sake of example, let's say I'm fitting a model using point pattern Y with a single covariate X (which is a single im). The call to kppm would be:
myModel = kppm(Y ~ X, "LGCP")
If I were fitting a simple inhomogeneous Poisson process to a replicated point pattern and associated covariate in hyperframe G, I believe the call would look like the following:
myModel = mppm(Y ~ X, data=G)
After going through Chapter 16 of the spatstat book I think that fitting a replicated LGCP Cox model might be accomplished by using the simulated intensities from calls to rLGCP, maybe like this...
myLGCP = rLGCP(model="exp", mu=0, saveLambda=TRUE, nsim=2, win=myWindow)
myIntensity = lapply(myLGCP, function(x) attributes(x)$Lambda)
G$Z = myIntensity
myModel = mppm(Y ~ X, data=G, random=~Z|id)
The above approach "runs" without errors... but I have no idea if I'm even remotely close to actually accomplishing what I wanted to do. It's also a little unclear how to use the fitted object to then simulate a realization of the model, since simulate.kppm requires a kppm object.
Thoughts and suggestions appreciated.

mppm does not currently support Cox processes.
You could do the following (a code sketch follows the list):

1. Fit the trend part of the model to your replicated point pattern data using mppm, for example m <- mppm(Y ~ X, data=G).
2. Extract the fitted intensities for each point pattern using predict.mppm.
3. For each point pattern, using the corresponding intensity obtained from the model, compute the inhomogeneous K function using Kinhom (with argument ratio=TRUE).
4. Combine the K functions using pool.
5. Estimate the cluster parameters of the LGCP by applying lgcp.estK to the pooled K function.

Optionally, after step 4, you could instead convert the pooled K function to a pair correlation function using pcf.fv and then fit the cluster parameters using lgcp.estpcf.

This approach assumes that the same cluster parameters apply to each point pattern. If your data consist of several distinct groups of patterns, and you want the model to assign different cluster parameter values to the different groups, just apply steps 4 and 5 separately to each group.
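Here is a minimal, untested sketch of that workflow. It assumes a hyperframe G with the point patterns in column Y and the covariate images in column X; for step 2 it uses subfits plus predict.ppm, which is one convenient route to the per-pattern fitted intensities that predict.mppm would give you.

library(spatstat)
## Step 1: fit the trend part of the model
m <- mppm(Y ~ X, data = G)
## Step 2: fitted intensity image for each point pattern
lambdas <- lapply(subfits(m), predict)
## Step 3: inhomogeneous K function for each pattern, keeping ratio information
Klist <- mapply(function(pat, lam) Kinhom(pat, lambda = lam, ratio = TRUE),
                G$Y, lambdas, SIMPLIFY = FALSE)
## Step 4: pool the K functions
Kpool <- do.call(pool, Klist)
## Step 5: estimate the LGCP cluster parameters from the pooled K function
cpars <- lgcp.estK(Kpool)
## Optional variant: convert to a pair correlation function instead
cpars2 <- lgcp.estpcf(pcf(Kpool))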

Related

How to decompose a gamma distribution into two gamma distributions in R

Is there an algorithm available in R that can decompose a gamma distribution into two (or more) gamma distributions? If so, can you give me an example? Basically, I have a data set that looks like a gamma distribution when I plot it with respect to time (it's time series data). The data describe the movement of an animal, and the animal can be in two different states: hungry and not hungry. My immediate reaction was to use a Hidden Markov Model and see if I can predict the two states. I was trying to use the depmix() function from the depmixS4 library in R to see if I can recover the two different states. However, I don't really know how to use this function with a gamma distribution. The following is the code that I wrote, but it says that I need an argument for gamma, which I don't understand. Can someone tell me what parameter I should use and how to determine it? Thanks!
mod <- depmix(freq ~ 1, data = mod.data, nstates = 2, family = gamma())
fit.mod <- fit(mod)
Thank you!
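One possible explanation for the error, offered as a guess rather than a verified fix: in R, lowercase gamma() is the gamma function, which demands an argument x, while the GLM family constructor for gamma-distributed responses is Gamma() with a capital G. So the call may just need the family spelled as Gamma():

library(depmixS4)
# Assumption: mod.data is a data frame with a positive-valued column 'freq'.
# Gamma() (capital G) is the GLM family; gamma() is the gamma function,
# which is why R complains about a missing argument.
mod <- depmix(freq ~ 1, data = mod.data, nstates = 2, family = Gamma())
fit.mod <- fit(mod)
summary(fit.mod)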

How to figure out the parameters from mppm in R

I am working with the spatstat library in R.

I have several point pattern objects built from my own dataset. The point patterns contain only the x and y coordinates of the points. I wanted to fit the point patterns with a Gibbs process with Strauss interaction to build a model and simulate similar point patterns. I was able to use the ppm function for that purpose when working with one point pattern at a time. I used the rmhmodel function on the object returned by ppm, which gave me the parameters beta, gamma and r that I needed in the rStrauss function to simulate new point patterns. FYI, I am not using the simulate function directly because I want the new simulated point patterns to have a flexible number of points, which simulate does not give me.

Now, if I want to work with all the point patterns I have, I can build a hyperframe of point patterns as described in the replicated point patterns chapter of the Baddeley textbook, but that requires the mppm function instead of ppm to fit the model, and rmhmodel does not work with an mppm object when I try to extract the model parameters beta, gamma and r.
How can I extract the fitted beta, gamma and r from an mppm object?
There are several ways to do this.
If you print a fitted model (obtained from ppm or from mppm) simply by typing the name of the object, the printed output contains a description of the fitted model including the model parameters.
If you apply the function parameters to a fitted model obtained from ppm you will obtain a list of the parameter values with appropriate names.
fit <- ppm(cells ~ 1, Strauss(0.12))
fit
parameters(fit)
For a model obtained from mppm, there could be different parameter values applying to each row of the hyperframe of data, so you would have to do lapply(subfits(model), parameters); the result is a list with one entry for each row of the hyperframe, containing the parameters relevant to that row.
A <- hyperframe(Bugs=waterstriders)
mfit <- mppm(Bugs ~ 1, data=A, Strauss(5))
lapply(subfits(mfit), parameters)
Alternatively you can extract the canonical parameters with coef and transform them to the natural parameters.
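For example, for the Strauss model the natural parameters are exponentials of the canonical ones. A sketch using the fit from above ("(Intercept)" and "Interaction" are the coefficient names spatstat assigns for this model):

cc <- coef(fit)                    # canonical (log-scale) parameters
beta <- exp(cc[["(Intercept)"]])   # first-order intensity parameter
gamma <- exp(cc[["Interaction"]])  # Strauss interaction parameter
# the interaction radius r = 0.12 was fixed when the model was specified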
You wrote:
I am not using the simulate function directly because I want the new simulated point patterns to have a flexible number of points, which simulate does not give me.
This cannot be right. The function simulate.mppm generates simulated realisations with a variable number of points. Try simulate(mfit).

Estimation of a state-space model with lags in the measurement equation in R

I'm trying to estimate an SS model from this paper that has the following form:
Setting the order of the first lag polynomial to zero and the second one to one, we can reformulate it using terms from the MARSS package guide where applicable (x is the state, y is the observed variable, d is exogenous):

x(t) = b x(t-1) + w(t)
y(t) = z x(t) + c x(t-1) + d d(t) + a y(t-1) + v(t)
The MARSS package allows for estimation of a simpler model that doesn't include lagged variables in the measurement equation. Is there a way to estimate this one using MARSS or any other package without rewriting the estimation routine for this special case? Maybe there is a way to reformulate it so it could be "fed" to MARSS or some other package?
Take a look at how, say, the BSM structural time series model or an ARMA model is formulated as a MARSS model, i.e. a multivariate state-space model. That'll give you an idea of how to rewrite your model in multivariate state-space form.
Basically, your x will look like

x_1(t) = b x_1(t-1) + w(t)
x_2(t) = x_1(t-1)

See how the x_2 is just a dummy that is forced to be x_1(t-1)?
Now the y equation:

y(t) = z x_1(t) + c x_2(t) + d d(t) + a y(t-1) + v(t)

The d and a are the coefficients on your inputs; they go into the D matrix below. I wrote them in lower case to indicate that they are scalars, but they can be matrices in general (if y is multivariate, say). Your inputs are d(t) and y(t-1); you prepare that 2 x T matrix and pass it in as the input d.
Be careful with your initial condition specification. It's probably best/easiest to set it at t=1 and either estimate it or use a diffuse prior.
You can fit this model with MARSS, or with any Kalman filter function that allows you to pass inputs into the y equation (some do, some don't). KFAS::KFS() allows that via the SSMcustom() model constructor.
In MARSS the model list will look like so
mod.list = list(
  B = matrix(list("b", 1, 0, 0), 2, 2),  # x1 evolves with coefficient b; x2 copies x1(t-1)
  U = matrix(0, 2, 1),
  Q = matrix(list("q", 0, 0, 0), 2, 2),  # process noise on x1 only
  Z = matrix(c("z", "c"), 1, 2),         # y loads on x1 and the dummy x2 = x1(t-1)
  A = matrix(0),
  R = matrix("r"),
  D = matrix(c("d", "a"), 1, 2),         # coefficients on the inputs d(t) and y(t-1)
  x0 = matrix(c("x1", "x2"), 2, 1),
  tinitx = 1,
  d = rbind(dt[2:TT], y[1:(TT-1)])       # 2 x (TT-1) input matrix
)
dat <- y[2:TT] # since you need y_{t-1} in the d (inputs)
fit <- MARSS(dat, model=mod.list)
It'll probably complain that it wants initial conditions for x0; anything will work, since the EM algorithm isn't sensitive to starting values the way a BFGS or Newton algorithm is. But method="BFGS" is actually often better for this type of structural time series model, and in that case pick a reasonable initial condition for x (reasonable = close to your data in this case, I think).
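For instance, a sketch of that refit (an assumption here is that MARSS accepts the previously fitted model object as its inits argument, seeding BFGS with the EM estimates):

fitb <- MARSS(dat, model = mod.list, method = "BFGS", inits = fit)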

Evaluate forecasts in terms of p-value and Pearson correlation

I am using R to do some evaluation of two different forecasting models. The basic idea of the evaluation is to compare the Pearson correlation and its corresponding p-value using the function cor.test(). The graph below shows the final result: the correlation coefficient and its p-value for each model.
We suggest that the model which has a lower correlation coefficient with a corresponding lower p-value (less than 0.05) is better (or a higher correlation coefficient but with a pretty high corresponding p-value).

So, in this case, overall, we would say that model 1 is better than model 2.

But the question here is: is there any other specific statistical method to quantify the comparison?

Thanks a lot!
Assuming you're working with time series data, since you called out a "forecast", I think what you're really looking for is backtesting of your forecast model. You might want to take a look at the backtest.R function from Ruey S. Tsay's "An Introduction to Analysis of Financial Data with R".
backtest(m1, rt, orig, h, xre=NULL, fixed=NULL, inc.mean=TRUE)
# m1: a fitted time-series model object
# rt: the time series
# orig: the starting forecast origin
# h: forecast horizon
# xre: the independent variables
# fixed: parameter constraint
# inc.mean: flag for the constant term of the model
Backtesting allows you to see how well your models perform on past data, and Tsay's backtest.R reports RMSE and mean absolute error, which gives you another perspective beyond correlation. Caution: depending on the size of your data and the complexity of your model, this can be a very slow-running test.
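A hypothetical call might look like the following (assumptions: rt is your observed series, an ARMA(1,1) candidate model, and Tsay's script downloaded from the book's website):

source("backtest.R")                       # Tsay's script, not on CRAN
m1 <- arima(rt, order = c(1, 0, 1))        # hypothetical candidate model
bt <- backtest(m1, rt, orig = 200, h = 1)  # out-of-sample forecasts from observation 200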
To compare models you'll normally look at RMSE, which is essentially the standard deviation of your model's errors. The two models' RMSE values are directly comparable, and smaller is better.
An even better alternative is to set up training, testing, and validation sets before you build your models. If you train two models on the same training/testing data, you can compare them against your validation set (which has never been seen by either model) to get a more accurate measurement of performance.
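A minimal sketch of that comparison, assuming hypothetical vectors actual (the validation values) and pred1, pred2 (each model's forecasts for the same period):

rmse <- function(actual, pred) sqrt(mean((actual - pred)^2))
mae <- function(actual, pred) mean(abs(actual - pred))
c(model1 = rmse(actual, pred1), model2 = rmse(actual, pred2))  # smaller is better
c(model1 = mae(actual, pred1), model2 = mae(actual, pred2))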
One final alternative: if you have a "cost" associated with an inaccurate forecast, apply those costs to your predictions and add them up. If one model performs poorly on a more expensive segment of the data, you may want to avoid using it.
As a side note, your interpretation of the p-value as "lower is better" isn't quite right.
P-values address only one question: how likely are your data, assuming the null hypothesis is true? They do not measure support for the alternative hypothesis.
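A quick illustration with synthetic data: with a large sample, even a negligible correlation earns a tiny p-value, which is why the p-value alone cannot rank models.

set.seed(1)
n <- 10000
x <- rnorm(n)
y <- 0.05 * x + rnorm(n)  # essentially noise; true correlation is about 0.05
ct <- cor.test(x, y)
ct$estimate               # small correlation coefficient
ct$p.value                # yet highly "significant", purely because n is large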

Multivariate time series model using MARSS package (or maybe dlm)

I have two temporal processes. I would like to see if one temporal process (X_{t,2}) can be used to produce better forecasts of the other process (X_{t,1}). I have multiple sources providing temporal data on X_{t,2} (e.g. 3 time series measuring X_{t,2}). All time series require a seasonal component.
I found MARSS's notation to be pretty natural for fitting this type of model, and the code looks like this:
Z=factor(c("R","S","S","S")) # observation matrix
B=matrix(list(1,0,"beta",1),2,2) #evolution matrix
A="zero" #demeaned
R=matrix(list(0),4,4); diag(R)=c("r","s","s","s")
Q="diagonal and unequal"
U="zero"
period = 12
per.1st = 1 # Now create factors for seasons
c.in = diag(period)
for(i in 2:(ceiling(TT/period))) {c.in = cbind(c.in,diag(period))}
c.in = c.in[,(1:TT)+(per.1st-1)]
rownames(c.in) = month.abb
C = "unconstrained" #2 x 12 matrix
dlmfit = MARSS(data, model=list(Z=Z,B=B,Q=Q,C=C, c=c.in,R=R,A=A,U=U))
I got a beta estimate implying that the second temporal process is useful in forecasting the first process, but to my dismay, MARSS gives me an error when I use MARSSsimulate to forecast, because one of the matrices (related to seasonality) is time-varying.
Does anyone know a way around this issue with the MARSS package? And if not, any tips on fitting an analogous model using, say, the dlm package?
I was able to represent my state-space model in a form suitable for the dlm package, but I encountered some problems using dlm too. First, the ML estimates are VERY unstable. I bypassed this issue by constructing the dlm model from the MARSS estimates. However, dlmFilter is not working properly; I think the issue is that dlmFilter is not designed to deal with models that have multiple sources for one time series plus additional seasonal components. dlmForecast, though, gives me the forecasts I need!
In summary, for my multivariate time series model (with multiple sources providing data for one of the temporal processes), the MARSS library gave me reasonable estimates of the parameters and allowed me to obtain filtered and smoothed values of the states, but forecast values were not possible. On the other hand, dlm gave fishy estimates for my model and dlmFilter didn't work, but I was able to use dlmForecast to forecast values using the model fitted in MARSS and re-expressed in the appropriate dlm form.
