Errors in segmented package: breakpoints confusion - r

Using the segmented package to fit a piecewise linear regression, I get an error when I try to set my own breakpoints. It seems to happen only when I set more than two.
(EDIT) Here is the code I am using:
# data
bullard <- structure(list(Rt = c(0, 4.0054, 25.1858, 27.9998, 35.7259, 39.0769,
45.1805, 45.6717, 48.3419, 51.5661, 64.1578, 66.828, 111.1613,
114.2518, 121.8681, 146.0591, 148.8134, 164.6219, 176.522, 177.9578,
180.8773, 187.1846, 210.5131, 211.483, 230.2598, 262.3549, 266.2318,
303.3181, 329.4067, 335.0262, 337.8323, 343.1142, 352.2322, 367.8386,
380.09, 388.5412, 390.4162, 395.6409), Tem = c(15.248, 15.4523,
16.0761, 16.2013, 16.5914, 16.8777, 17.3545, 17.3877, 17.5307,
17.7079, 18.4177, 18.575, 19.8261, 19.9731, 20.4074, 21.2622,
21.4117, 22.1776, 23.4835, 23.6738, 23.9973, 24.4976, 25.7585,
26.0231, 28.5495, 30.8602, 31.3067, 37.3183, 39.2858, 39.4731,
39.6756, 39.9271, 40.6634, 42.3641, 43.9158, 44.1891, 44.3563,
44.5837)), .Names = c("Rt", "Tem"), class = "data.frame", row.names = c(NA,
-38L))
library(segmented)
# create a linear model
out.lm <- lm(Tem ~ Rt, data=bullard)
o<-segmented(out.lm, seg.Z=~Rt, psi=list(Rt=c(200,300)), control=seg.control(display=FALSE))
Using the psi option, I have tried the following:
psi = list(x = c(150, 300)) -- OK
psi = list(x = c(100, 200)) -- OK
psi = list(x = c(200, 300)) -- OK
psi = list(x = c(100, 300)) -- OK
psi = list(x = c(120, 150, 300)) -- error 1 below
psi = list(x = c(120, 300)) -- OK
psi = list(x = c(120, 150)) -- OK
psi = list(x = c(150, 300)) -- OK
psi = list(x = c(100, 200, 300)) -- error 2 below
(1) Error in segmented.lm(out.lm, seg.Z = ~Rt, psi = list(Rt = c(120, 150, :
only 1 datum in an interval: breakpoint(s) at the boundary or too close
(2) Error in diag(Cov[id, id]) : subscript out of bounds
I have already listed my data at this question, but as a guide the limits on the x data are about 0--400.
A second question that pertains to this one is: how do I actually fix the breakpoints using this segmented package?

The issue here seems to be poor error trapping in the segmented package. Having a look at the code for segmented.lm allows a bit of debugging. For example, in the case of psi = list(x = c(100, 200, 300)), an augmented linear model is fitted as shown below:
lm(formula = Tem ~ Rt + U1.Rt + U2.Rt + U3.Rt + psi1.Rt + psi2.Rt +
psi3.Rt, data = mf)
Call:
lm(formula = Tem ~ Rt + U1.Rt + U2.Rt + U3.Rt + psi1.Rt + psi2.Rt +
psi3.Rt, data = mf)
Coefficients:
(Intercept)           Rt        U1.Rt        U2.Rt        U3.Rt      psi1.Rt
   15.34303      0.04149      0.04591    742.74186   -742.74499      1.02252
    psi2.Rt      psi3.Rt
         NA           NA
As you can see, the fit has NA values which then result in a degenerate variance-covariance matrix (called Cov in the code). The function doesn't check for this and tries to pull out diagonal entries from Cov and fails with the error message shown. At least the first error, although perhaps not overly helpful, is caught by the function itself and suggests that the break-points are too close.
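The mechanism can be reproduced in base R without segmented. This is only a toy sketch, not the package's internal code: the data, the collinear column x2 and the index id are made up for illustration. An aliased coefficient comes back as NA, the covariance matrix covers only the estimable terms, and indexing it at the position of the NA term fails with the same message.

```r
# Toy data: x2 is an exact multiple of x1, so its coefficient cannot be estimated
set.seed(1)
d <- data.frame(x1 = 1:10)
d$x2 <- 2 * d$x1
d$y <- 3 + 0.5 * d$x1 + rnorm(10)

fit <- lm(y ~ x1 + x2, data = d)
coef(fit)                          # the x2 coefficient is NA (aliased)

# Covariance of the estimable coefficients only: 2 x 2, not 3 x 3
Cov <- summary(fit)$cov.unscaled
dim(Cov)

# Index the matrix as if all three coefficients had been estimated
id <- 3
res <- tryCatch(diag(Cov[id, id, drop = FALSE]),
                error = function(e) conditionMessage(e))
res                                # the captured error message
```

Checking anyNA(coef(fit)) before touching the covariance matrix is the kind of guard the function is missing.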
In the absence of better error trapping in the function, I think that all you can do is adopt a trial and error approach (and avoid break points which are too close). For example, psi = list(x = c(50, 200, 300)) seems to work ok.
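On the second question (holding the breakpoints fixed rather than estimating them), one option that sidesteps segmented altogether is an ordinary lm with hinge terms of the form pmax(x - psi, 0). This is a minimal sketch on simulated data, assuming breakpoints at 150 and 300; the data and variable names are made up, and it is not a feature of the segmented package.

```r
# Simulated data on roughly the same x range as the question (0 to 400)
set.seed(42)
x <- seq(0, 400, length.out = 80)
# Assumed truth: the slope changes at x = 150 and again at x = 300
y <- 15 + 0.04 * x + 0.06 * pmax(x - 150, 0) - 0.03 * pmax(x - 300, 0) +
  rnorm(80, sd = 0.2)
d <- data.frame(x = x, y = y)

# Each hinge term is zero left of its breakpoint and linear to the right,
# so its coefficient is the change in slope at that fixed breakpoint
fixed_fit <- lm(y ~ x + pmax(x - 150, 0) + pmax(x - 300, 0), data = d)
coef(fixed_fit)
```

Because the breakpoints are treated as known, this is a plain linear model and the usual summary() and predict() methods apply; the standard errors do not reflect any uncertainty in the breakpoint locations.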

If you wrap the call in while and tryCatch, you can make the command repeat until the model fits without an error (@jaySf). I'm guessing the intermittent failures are down to the randomiser settings in the function, which can be seen in seg.control.
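lm.model <- lm(xdat ~ ydat, data = x)
if.false <- FALSE
# keep refitting until segmented() returns without an error
# (note: this never terminates if the error occurs on every attempt)
while (!if.false) {
  tryCatch({
    # seg.Z must name a predictor that is in the lm, here ydat
    s <- segmented(lm.model, seg.Z = ~ydat, psi = NA)
    if.false <- TRUE
  }, error = function(e) {
    # swallow the error and try again
  })
}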

Related

rootogram() error when checking for overdispersion in GAM

I have run the below GAM and am trying to plot a rootogram() using the countreg package to check for overdispersion, but get the error Error in X[, pstart[i] - 1 + 1:object$nsdf[i]] <- Xp : number of items to replace is not a multiple of replacement length.
I understand what the error message is telling me, that the lengths of two vectors/objects do not match, but I am none the wiser as to how to fix it. Any help or suggestions would be appreciated. Has anyone had this problem previously, and if so, how did you fix it?
This may be arising due to a peculiarity in my data as I have never previously had a problem producing rootograms when using other datasets.
# I cannot fit a rootogram from the following GAM
> knots2 <- list(nMonth = c(0.5, 12.5))
> sup15 <- gam(Number ~ State + Virus + State*Virus + s(nMonth, bs = "cc", k = 12, by = Virus) + s(Time, k = 60, by = Virus),
data = supply.pad,
family = nb(),
method = "REML",
knots = knots2)
> root_nb <- rootogram(sup15, style = "hanging", plot = FALSE)
Error in X[, pstart[i] - 1 + 1:object$nsdf[i]] <- Xp :
number of items to replace is not a multiple of replacement length
# But can fit a rootogram from the below GAM. Note that these are different datasets but pretty much the same code.
> knots1 <- list(month = c(0.5, 12.5))
> gam10 <- gam(n ~ State + s(month, bs = "cc", k = 12) + s(time),
data = rhdv.gp.pad,
family = nb(),
method = "REML",
knots = knots1)
> root_nb1 <- rootogram(gam10, style = "hanging", plot = FALSE)

Problem with updating terms in the multinom function

I am trying to add1 all interaction terms on top of a multinomial baseline model using multinom() but it shows the error
trying + x1:x2
Error in if (trace) { : argument is not interpretable as logical
Called from: nnet.default(X, Y, w, mask = mask, size = 0, skip = TRUE, softmax = TRUE,
censored = censored, rang = 0, ...)
What is the problem here? I appreciate any input. Here is a reproducible example:
require(nnet)
data <- data.frame(y=sample(1:3, 24, replace = TRUE),
x1 = c(rep(1,12), rep(2,12)),
x2 = rep(c(rep(1,4), rep(2,4), rep(3,4)),2),
x3=rnorm(24),
z1 = sample(1:10, 24, replace = TRUE))
m0 <- multinom(y ~ x1 + x2 + x3 + z1, data = data)
m1 <- add1(m0, scope = .~. + .^2, test="Chisq")
My end goal is to see which terms are appropriate to drop by later adding the line m1[order(add1.m1$'Pr(>Chi)'),].

Generalized estimating equations working by themselves but not within functions (R)

I am trying to write a function to run GEE using the geepack package. It works fine "on its own" but not within a function, please see example below:
library(geepack)
library(pstools)
df <- data.frame(study_id = c(1:20),
leptin = runif(20),
insulin = runif(20),
age = runif(20, min = 20, max = 45),
sex = sample(c(0,1), size = 20, replace = TRUE))
#Works
geepack::geeglm(leptin ~ insulin + age + sex, id = study_id, data = df)
#Doesn't work
model_function_covariates_gee <- function(x,y) {
M1 <- paste0(x, "~", y, "+ age + sex")
M1_fit <- geepack::geeglm(M1, id = study_id, data = df)
s <- summary(M1_fit)
return(s)
}
model_function_covariates_gee("leptin", "insulin")
Error message:
Error in mcall$formula[3] <- switch(match(length(sformula), c(0, 2, 3)), :
incompatible types (from language to character) in subassignment type fix
Does anyone know why this is? I've fiddled around with it but can't get it to change. Thanks in advance.
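One likely cause, judging from the error text ("from language to character"), is that paste0() returns a character string while geeglm expects a formula object. A minimal sketch of that fix, assuming the rest of the function is unchanged, converts the string with as.formula() first; the example data is regenerated here with a seed so the block is self-contained.

```r
library(geepack)

set.seed(1)
df <- data.frame(study_id = 1:20,
                 leptin = runif(20),
                 insulin = runif(20),
                 age = runif(20, min = 20, max = 45),
                 sex = sample(c(0, 1), size = 20, replace = TRUE))

model_function_covariates_gee <- function(x, y) {
  # Build the model as a string, then convert it to a formula object
  M1 <- as.formula(paste0(x, " ~ ", y, " + age + sex"))
  M1_fit <- geepack::geeglm(M1, id = study_id, data = df)
  summary(M1_fit)
}

s <- model_function_covariates_gee("leptin", "insulin")
```

reformulate(c(y, "age", "sex"), response = x) from base R builds the same formula without pasting strings.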

Error in R: Non-conformable arrays, how to fix?

I am trying to create an effect plot for a cox proportional hazards model:
fitC7 <- coxph(Surv(TimeDeath, event == 1) ~
strata(sex) * mutation + age
+ ns(BM1, 3),
data = data)
I created a new dataset as follows:
ND1a <- with(data, expand.grid(age = seq(30, 75, length.out = 40), mutation = factor(c("Yes", "No")), sex = factor(c("male", "female")), BM1 = 1.583926))
Then, I tried to use the predict function:
predict(fitC7, newdata = ND1a, type = "lp", se.fit = T)
However, I keep getting the error:
Error in newx - xmeans[match(newstrat, row.names(xmeans)), ] : non-conformable arrays
and I do not know how to correct this.
It does work when I put in a model without sex as a stratifier, e.g.,
fitC9 <- coxph(Surv(TimeDeath, event ==1) ~
sex * mutation + age +
ns(BM1, 3), data = data)
I hope someone can help me, I could not figure it out with previous question and answer threads.

Creating Survival Trees with MST package: Undefined Columns Error?

I am trying to create a survival tree with the MST package for R. I have been looking into this paper.
I replicated their example with randomly generated data and it works just fine. I adjusted my data to fit the same model. My data has the same columns and the same data types.
I keep getting this error:
Error in `[.data.frame`(mf_data[col.split.var], , 3) : undefined columns selected
with the following line of code:
fit <- MST(formula = Surv(time,status) ~ x1 + | id), data = data)
I have looked through all of the documentation and didn't find anything, and I can't understand why this error appears.
The code from the paper looks like this:
set.seed(186117)
data <- rmultime(N = 200, K = 4, beta = c(-1, 0.8, 0.8, 0, 0),cutoff = c(0.5, 0.3, 0, 0), model = "marginal.multivariate.exponential", rho = 0.65)$dat
test <- rmultime(N = 100, K = 4, beta = c(-1, 0.8, 0.8, 0, 0), cutoff = c(0.5, 0.3, 0, 0), model = "marginal.multivariate.exponential",rho = 0.65)$dat
fit <- MST(formula = Surv(time, status) ~ x1 + x2 + x3 + x4 | id,data, test, method = "marginal", minsplit = 100, minevents = 20,selection.method = "test.sample")
I tried running your code and I do get an error, although not the one you are getting. After looking at it, I'm fairly sure you need to use the [edit] feature of SO to fix your question.
> fit <- MST(formula = Surv(time,status) ~ x1 + | id), data = data)
Error: unexpected '|' in "fit <- MST(formula = Surv(time,status) ~ x1 + |"
The formula given is obviously wrong and there is an unnecessary closing parenthesis. I am able to get the error you report with:
> fit <- MST(formula = Surv(time,status) ~ x1 | id, data = data)
[1] "No test sample supplied, changed selection.method = 'bootstrap'"
Error in `[.data.frame`(mf_data[col.split.var], , 3) :
undefined columns selected
.... but not with the original code:
fit <- MST(formula = Surv(time, status) ~ x1 + x2 + x3 + x4 | id,data, test, method = "marginal", minsplit = 100, minevents = 20,selection.method = "test.sample")
I also see an error with x1+x2|id on the RHS of the formula but not with three variables:
> fit <- MST(formula = Surv(time, status) ~ x1 +x2 | id,data, test, method = "marginal", minsplit = 100, minevents = 20,selection.method = "test.sample")
Error in `[.data.frame`(mf_data[col.split.var], , 3) :
undefined columns selected
> fit <- MST(formula = Surv(time, status) ~ x1 +x2+x3| id,data, test, method = "marginal", minsplit = 100, minevents = 20,selection.method = "test.sample")
So I'm thinking this is a bug that the developers had not anticipated. Here's how to obtain the email address needed to report it:
> maintainer("MST")
[1] "Peter Calhoun <calhoun.peter@gmail.com>"
