Calculating variance covariance matrix for improved kppm model - r

I get the following error when I try to improve the intensity estimate of a kppm object, if I include the argument vcov = TRUE in the function improve.kppm:
Error in improve.kppm(object, type = type, rmax = rmax, dimyx = dimyx, :
object 'gminus1' not found
If I don't include the argument, the function runs but I cannot use the summary() function on the improved kppm object. I get the same error message as above. The same thing happens when I use vcov().
The call that I used to create my kppm object was (number of covariates has been reduced for clarity):
a05 = kppm(a2005nests ~ nest + nest2, cluster = "Thomas", covariates = fitcov(2))
where fitcov(2) is a function that returns a list of im objects. Could this be the issue? I've noticed that some spatstat functions on kppm objects throw errors if I used this function in the original kppm call. Usually it says something along the lines of Error: Covariates ‘nest’ and ‘nest2’ were not found.

There is a bug in the logical flow of improve.kppm: if vcov=TRUE and type != "quasi", the variable gminus1 is not defined. We will fix this in the development version of spatstat as soon as possible.
Did you perhaps select type="clik1" or type="wclik1" in the original call to kppm?
For the moment, you should be able to avoid the bug by omitting the argument type, or explicitly selecting type="quasi", in calls to kppm and improve.kppm.
The second problem, in which kppm fails to find the covariates, appears to be a scoping problem, but I can't reproduce it here. It would help if you can supply a minimal working example.


To find valid argument for a function in R's help document (meaning of ...)

This question may seem basic, but it has bothered me for quite a while. The help document for many functions has ... as one of its arguments, but somehow I can never get my head around this ... thing.
For example, suppose I have created a model say model_xgboost and want to make a prediction based on a dataset say data_tbl using the predict() function, and I want to know the syntax. So I look at its help document which says:
?predict
**Usage**
predict (object, ...)
**Arguments**
object a model object for which prediction is desired.
... additional arguments affecting the predictions produced.
The syntax and examples didn't really enlighten me, as I still have no idea what the valid syntax/arguments are for the function. An online course uses something like the code below, which works:
data_tbl %>%
predict(model_xgboost, new_data = .)
However, looking through the help doc, I cannot find the new_data argument. Instead, it mentions a newdata argument in its Details section, which actually didn't work when I replaced new_data = . with newdata = .:
Error in `check_pred_type_dots()`:
! Did you mean to use `new_data` instead of `newdata`?
My questions are:
How do I know exactly what argument(s) / syntax can be used for a function like this?
Why new_data but not newdata in this example?
I might be missing something here, but is there any reference/resource that explains how to use/interpret a help document in plain English? (A lot of documentation, including R help files, seems to give just a brief sentence like "additional arguments affecting the predictions produced", etc.)
@CarlWitthoft's answer is good; I want to add a little bit of nuance about this particular function. The reason the help page for ?predict is so vague is an unfortunate consequence of the fact that predict() is a generic function in R: that is, it's a function that can be applied to a variety of different object types, using slightly different (but appropriate) methods in each case. As such, the ?predict help page only lists object (which is required as the first argument in all methods) and ..., because different predict methods could take very different arguments/options.
If you call methods("predict") in a clean R session (before loading any additional packages) you'll see a list of 16 methods that base R knows about. After loading library("tidymodels"), the list expands to 69 methods. I don't know what class your object is (you can check with class(model_xgboost)), but assuming that it's of class model_fit, we look at ?predict.model_fit to see
predict(object, new_data, type = NULL, opts = list(), ...)
This tells us that we need to call the new data new_data (and, reading a bit farther down, that it needs to be "A rectangular data object, such as a data frame")
The help page for predict says
Most prediction methods which are similar to those for linear
models have an argument ‘newdata’ specifying the first place to
look for explanatory variables to be used for prediction
(emphasis added). I don't know why the parsnip authors (the predict.model_fit method comes from the parsnip package) decided to use new_data rather than newdata, presumably in line with the tidyverse style guide, which says
Use underscores (_) (so called snake case) to separate words within a name.
In my opinion this might have been a mistake, but you can see that the parsnip/tidymodels authors have realized that people are likely to mix up the two names and added an informative error message, as shown in your example.
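One way to check which arguments a particular predict method accepts is to look the method up and inspect its argument list. The following is a minimal sketch using only base R; the lm method from the stats package stands in for the parsnip method, which is assumed not to be loaded here.

```r
# The generic itself only declares `object` and `...`
generic_args <- names(formals(predict))
print(generic_args)

# Look up the method that is dispatched for objects of class "lm"
lm_method <- getS3method("predict", "lm")

# The method's own argument list includes `newdata`
method_args <- names(formals(lm_method))
print(method_args)

# Fit a small model on a built-in dataset and predict,
# using the argument name that this particular method defines
fit <- lm(dist ~ speed, data = cars)
pred <- predict(fit, newdata = data.frame(speed = c(10, 20)))
print(pred)
```

For a parsnip model the same idea applies once the package is loaded: the relevant help page is ?predict.model_fit rather than ?predict.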
Among other things, the existence of ... in a function definition means you can enter any arguments (values, functions, etc) you want to. There are some cases where the main function does not even use the ... but passes them to functions called inside the main function. Simple example:
foo <- function(x, ...) {
  y <- x^2
  # anything supplied via ... is handed on to plot() unchanged
  plot(x, y, ...)
}
I know of functions which accept a function as an input argument, at which point the items to include via ... are specific to the selected input function name.
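As a small illustration of that last case, here is a sketch of a function that takes another function as input and forwards ... to it. The name apply_fun is hypothetical; which extra arguments are valid depends entirely on the function that is passed in.

```r
# Hypothetical wrapper: calls f on x and forwards any extra arguments to f
apply_fun <- function(x, f, ...) {
  f(x, ...)
}

x <- c(1, NA, 3)

# na.rm is an argument of mean(), not of apply_fun(); it travels through ...
res_mean <- apply_fun(x, mean, na.rm = TRUE)

# With a different function, a different set of extra arguments is valid
res_round <- apply_fun(3.14159, round, digits = 2)

print(res_mean)
print(res_round)
```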

EnvStats::epdfPlot() throws error with continuous distribution

I am trying to compute an empirical probability distribution for a continuous variable using the epdfPlot() function in the EnvStats:: package. I keep getting an error when I accept the default of discrete=FALSE.
Error in UseMethod("density") :
no applicable method for 'density' applied to an object of class "c('double', 'numeric')"
Reading through the documentation, I think this is somehow a result of how the function passes arguments to stats::density() because I don't have this problem when I set discrete = TRUE. As the documentation notes, the argument density.arg.list=NULL is ignored when discrete = TRUE. Here is the reproducible example:
library(EnvStats)
dat<-rnorm(500, 0, 1)
demo1<-epdfPlot(dat, discrete = FALSE, plot.it=FALSE) # throws error
demo2<-epdfPlot(dat, discrete = TRUE, plot.it=FALSE) # works
demo2
Is this possibly a bug?
Turns out this occurs because of a conflict with the labdsv::density() function. EnvStats::epdfPlot() doesn't specify stats::density(), so when density() gets masked by another package, EnvStats::epdfPlot() calls that function. I tried running the code in a different session and it worked without error.
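To check for this kind of masking in your own session, you can ask R where a name is defined and call the intended version through its namespace. A minimal sketch using base R only (labdsv is not loaded here, so only the stats version should show up):

```r
set.seed(1)
dat <- rnorm(500, 0, 1)

# List every attached package or environment that defines `density`,
# in search-path order; more than one entry means the later ones are masked
where_defined <- find("density")
print(where_defined)

# Calling through the namespace bypasses whatever comes first on the search path
d <- stats::density(dat)
print(class(d))
```

If find() reports another package ahead of package:stats, detaching that package or restarting the session without it is one way to get the expected behaviour back.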

Error in aeqSurv(Y) : aeqSurv exception, an interval has effective length 0

I'm using R's coxph function to fit a survival regression model, and I'm trying to model time dependent covariates (see this vignette). When fitting the model, I get the following error:
Error in aeqSurv(Y) :
aeqSurv exception, an interval has effective length 0
Other than the source code, I couldn't find any references to this error online. Would appreciate any ideas about how to handle this exception.
I ran into the same error. The probable cause is the aeqSurv routine, which treats time values that differ only by a tiny amount as tied. This is actually useful, and the error is potentially pointing to an issue with the data.
However, if you need to force a solution, you can use the options in coxph.control.
Just setting timefix = FALSE in the call to coxph should do the trick!
Source:
https://rdrr.io/cran/survival/src/R/aeqSurv.R
I had this error after I used the survSplit function to make time intervals, prior to fitting with coxph. I noticed that survSplit introduced trailing digits (i.e., 20 days turned into 20.0 days). So I removed those digits with the round function and it worked.
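To see how an interval can end up with an "effective length of 0", here is a small base-R sketch of the underlying floating-point issue. The variable names are made up for illustration, and the exact tolerance that aeqSurv applies is not reproduced here.

```r
# Two time points that should be identical but differ by floating-point error
t_start <- 0.3
t_stop <- 0.1 + 0.2

# Strictly speaking the interval has positive length ...
is_longer <- t_stop > t_start
print(t_stop - t_start)

# ... but within numerical tolerance the two endpoints are the same
effectively_equal <- isTRUE(all.equal(t_start, t_stop))

# Rounding both endpoints to a sensible precision removes the spurious difference
t_start_r <- round(t_start, 6)
t_stop_r <- round(t_stop, 6)
print(c(is_longer, effectively_equal))
```

If an interval like this is genuine (start and stop really are the same time), rounding will not help and the data itself needs a closer look.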
Like the above answer, adding the control variable in the coxph function should solve the problem. Please see the reference: https://github.com/therneau/survival/issues/76
model <- coxph(formula = Surv(time1, time2, event) ~ cluster(cluster),
               data = dataframe,
               control = coxph.control(timefix = FALSE)) # add the control variable

Adapt mle2 to use an unnamed parameter vector & additional arguments

Good evening,
I have a quick question about mle2() syntax. I have an optim() routine that optimizes a log-likelihood function of the following form (and this runs thousands of times, so I don't want to change much):
ObjFun <- function(p, X, y, ModelFunction, CostFunction)
where p is a vector of 1-8 parameters, X is the input matrix, y is the response (dependent) variable vector, ModelFunction is a function specifying the shape of a model, and CostFunction specifies the cost/loss function the model likelihood should incorporate during the optimization. The code works fine with optim() or maxLik with the following code:
maxLik(ObjFun, method="NM", start=c(1,2,3,4,5),
X=conc, y=y, ModelFunction=Model1, CostFunction=GCost)
constrOptim(init.par, ObjFun, ui = Ui, ci = Ci, method = "Nelder-Mead",
control = control1, X=X, y=y, ModelFunction= get(Model1),
CostFunction= get(GCost))
## I'm obviously using constrained optimization in my actual problem.
But I can't get it to work easily with mle() or mle2(). I just want to run it in mle2 to compare the likelihood profile with my own profiling function. Before I go digging through the mle2() code, does anyone know whether it's my unnamed parameter vector or the additional arguments that make the function crash? I thought it was the former, but I am confused by the error that this call gives me:
mle2(ObjFun, method="Nelder-Mead", start=c(1,2,3,4,5),
X=X, y=y, ModelFunction=Model1, CostFunction=GCost)
"minuslogl() : argument "ModelFunction" is missing, with no default"
and that argument is clearly specified. I couldn't really find any examples with additional parameters in the vignettes.
PS:
I would have just commented on this post as it's obviously related:
Creating function arguments from a named list (with an application to stats4::mle)
But I don't have enough points to comment.
UPDATE:
mle2() has vecpar and parnames options that, according to Ben's vignette, exist "for compatibility with optim". I simplified the objective function (the log-likelihood) to include hard-coded loss and model examples. The result looks like this:
mod2 <- mle2(ObjFun2, method="Nelder-Mead", start=list(1,2,3,4,5),
vecpar=T, parnames=c("A", "B", "C", "D", "E"))
However, this still doesn't work. I have a hard time troubleshooting it because I don't know how to refer to the parameters inside the objective function after the call from mle2. If I include debugging commands such as print(p[2]) inside ObjFun2, it returns NULL, so the parameter is no longer in vector form. However, print(A) forces the function to crash. Again, I can't find any working examples of this online, so maybe I'm missing something very simple.
I can't use the parameters argument as Ben suggested in the above link because my models are not linear.
I attempted to look inside the mle2() but got stuck on a call to calc_mle2_function().
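One generic way to hand an optimizer a function of p alone is to wrap ObjFun in a closure that fixes the extra arguments. Below is a minimal base-R sketch with optim(). The names linear_model, sq_cost and make_objective are hypothetical stand-ins for the real model and cost functions, and whether mle2() accepts such a wrapper for these models is an assumption that would need checking against the bbmle documentation. Note also that the error message refers to minuslogl(), so the function handed to mle2() should return a negative log-likelihood.

```r
# Hypothetical stand-ins for the real ModelFunction and CostFunction
linear_model <- function(p, X) p[1] + p[2] * X
sq_cost <- function(pred, y) sum((pred - y)^2)

# Objective with the same signature as in the question
ObjFun <- function(p, X, y, ModelFunction, CostFunction) {
  CostFunction(ModelFunction(p, X), y)
}

# Closure that fixes everything except the parameter vector p
make_objective <- function(X, y, ModelFunction, CostFunction) {
  function(p) {
    ObjFun(p, X = X, y = y,
           ModelFunction = ModelFunction, CostFunction = CostFunction)
  }
}

# Toy data generated from known parameters (intercept 2, slope 3)
X <- 1:10
y <- 2 + 3 * X

obj <- make_objective(X, y, linear_model, sq_cost)

# The optimizer now only ever sees a function of p
fit <- optim(c(0, 0), obj, method = "Nelder-Mead")
print(fit$par)
```

The original ObjFun stays untouched, so the existing optim() and maxLik code paths are unaffected.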

How to specify additional parameters in R functions?

For fitting Markov switching models with the package MSwM (function msmFit), there is a 'control' argument, which is a list of control parameters.
The syntax of msmFit is:
msmFit(object, k, sw, p, data, family, control)
The 'control' argument is a list that can supply any of the following components:
-trace: A logical value. If it is TRUE, tracing information on the progress of the optimization is produced.
-maxiter: The maximum number of iterations in the EM method. Default is 100.
and so on.
My question is how to specify, for example, '-maxiter'. I tried component(maxiter=50), component-maxiter=50, and component[["maxiter"]]=50. Each gives an error: "unexpected '='" or other errors connected to the argument.
Format your call something like this:
mod=msmFit(model,k=2,sw=rep(TRUE,8),control=list(maxiter=10))
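To show why a named list is the right shape, here is a sketch of how a function commonly merges a user-supplied control list with its defaults. fit_sketch is a hypothetical function, not MSwM's implementation; the maxiter default of 100 comes from the documentation quoted above, while trace = FALSE is an assumed default.

```r
# Hypothetical fitting function that only resolves its control settings
fit_sketch <- function(control = list()) {
  # maxiter = 100 is the documented default; trace = FALSE is an assumption
  defaults <- list(trace = FALSE, maxiter = 100)
  # Components named in `control` override the defaults; the rest are kept
  modifyList(defaults, control)
}

# Components are addressed by name inside list(), without the leading dash
settings <- fit_sketch(control = list(maxiter = 50))
print(settings)

# Leaving control out keeps every default
default_settings <- fit_sketch()
```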
