Can I get pairwise contrasts in or from a Gamlss model? - gamlss

I am using the following code to model how a one-inflated response relates to treatment (a categorical variable). I would like to get post-hoc pairwise comparisons of the treatments. The gamlss model function has a "contrasts" argument, but I have no clue what is supposed to go there and cannot find any examples.
propfertmodel <- gamlss(propfert~Trx, data = only_eggs, family = BEINF)
In the gamlss model, I did try contrasts = "Trx". I received this error, "Error in model.matrix.default(mu.terms, mu.frame, contrasts) :
object 'Trx' not found"
I also tried contrasts = "pairwise". That did not produce an error, but it gave these warnings:
Warning messages:
1: In model.matrix.default(mu.terms, mu.frame, contrasts) :
non-list contrasts argument ignored
2: In model.matrix.default(sigma.terms, sigma.frame, contrasts) :
non-list contrasts argument ignored
3: In model.matrix.default(form.nu, nu.frame, contrasts) :
non-list contrasts argument ignored
4: In model.matrix.default(form.tau, tau.frame, contrasts) :
non-list contrasts argument ignored
Finally, I tried contrasts = c("C", "H", "L", "T") and got the same warnings:
Warning messages:
1: In model.matrix.default(mu.terms, mu.frame, contrasts) :
non-list contrasts argument ignored
2: In model.matrix.default(sigma.terms, sigma.frame, contrasts) :
non-list contrasts argument ignored
3: In model.matrix.default(form.nu, nu.frame, contrasts) :
non-list contrasts argument ignored
4: In model.matrix.default(form.tau, tau.frame, contrasts) :
non-list contrasts argument ignored
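The warnings above suggest that the contrasts value is handed to model.matrix(), which expects a named list keyed by factor name, and that it controls how the factor is coded rather than requesting post-hoc comparisons. The following is a minimal base-R sketch of that behaviour, not a gamlss-specific recipe. The data frame `toy` and the column `Trx_H` are hypothetical; only the level names C, H, L, T come from the question.

```r
# Hypothetical toy data; only the level names C, H, L, T come from the question.
toy <- data.frame(Trx = factor(rep(c("C", "H", "L", "T"), each = 3)))

# A named list keyed by the factor name is what model.matrix() accepts.
X_sum <- model.matrix(~ Trx, data = toy,
                      contrasts.arg = list(Trx = "contr.sum"))
colnames(X_sum)

# Changing the reference level changes which pairwise differences the
# treatment-coded coefficients represent (here: each level versus "H").
toy$Trx_H <- relevel(toy$Trx, ref = "H")
X_ref <- model.matrix(~ Trx_H, data = toy)
colnames(X_ref)
```

Whether gamlss forwards such a list unchanged to each parameter's model matrix is an assumption based on the warnings shown. Refitting with a different reference level is one way to read other pairwise differences off the mu coefficients.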

Related

'caret' package error in train(): "all the Accuracy metric values are missing"

I am using the 'caret' package to train a model with the 'polr' package (doing an ordinal logistic regression). Here is my setup:
fitControl <- trainControl(
  method = "repeatedcv",
  number = 5,
  repeats = 5)
set.seed(825)
ordinallogitfit <- train(BTC ~
    TallyZero3 + LastMove:ConsecUp + LastMove:ConsecDown +
    TallyUp1:ConsecDown + LastMove:UpMaxPoint, data = training,
  method = "polr",
  trControl = fitControl,
  verbose = TRUE)
When I run this, I get the following output:
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :5     NA's   :5
Error: Stopping
In addition: There were 14 warnings (use warnings() to see them)
warnings()
Warning messages:
1: model fit failed for Fold1.Rep1: method=logistic Error in family$linkfun(mustart) :
Argument mu must be a nonempty numeric vector
2: In glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) :
no observations informative at iteration 1
3: glm.fit: algorithm did not converge
4: model fit failed for Fold1.Rep1: method=probit Error : object of type 'closure' is not subsettable
5: In glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) :
no observations informative at iteration 1
6: glm.fit: algorithm did not converge
7: model fit failed for Fold1.Rep1: method=loglog Error : object of type 'closure' is not subsettable
8: In glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) :
no observations informative at iteration 1
9: glm.fit: algorithm did not converge
10: model fit failed for Fold1.Rep1: method=cloglog Error : object of type 'closure' is not subsettable
11: In glm.fit(X, y1, wt, family = binomial("cauchit"), offset = offset) :
no observations informative at iteration 1
12: glm.fit: algorithm did not converge
13: model fit failed for Fold1.Rep1: method=cauchit Error : object of type 'closure' is not subsettable
14: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... :
There were missing values in resampled performance measures.
It looks like that very last message is the most important one, but I don't know what to do. My data has no NAs and no infinite values. My response variable is represented as a factor with 3 levels. All of my input variables are integers (not categorical variables, but just integers/counts).
What is the cause of the error and how can I fix it?
Thank you very much!
EDIT: I just tried using the polr() function directly and there is no error - looks like this must be something with the 'caret' package itself.

How to use the Box-Tidwell function with a logistic regression in R

I get errors using the boxTidwell function from the 'car' package for a logistic regression model.
I want to model
fatalCancer ~ globy1, where fatalCancer is a factor with two levels and globy1 is numeric (all positive). I am testing this to check the assumption of linearity of globy1 with the logit of the outcome.
Looking at the error messages (below) and the boxTidwell function code, it seems that the problem may be that fatalCancer is a factor. The boxTidwell documentation says nothing about specifying that the model is logistic. In Section 6.4 of Fox's An R Companion to Applied Regression (p. 312), the example logistic regression didn't require any such specification.
Is there a way to fix the syntax of the boxTidwell function below?
> library(car)
Loading required package: carData
>
> load("m2dat.RData")
> m2dat <- na.omit(m2dat)
> dim(m2dat)
[1] 116 3
> head(m2dat)
dog globy1 fatalCancer
1 101A 3.1 No
2 102A 2.9 No
3 103A 4.9 No
4 104A 3.1 Yes
5 105A 2.8 Yes
6 106A 3.5 No
> boxTidwell(fatalCancer ~ globy1, data=m2dat)
MLE of lambda Score Statistic (z) Pr(>|z|)
6.5694 NA NA
iterations = 21
There were 48 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
3: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
4: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
5: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
6: In Ops.factor(r, 2) : ‘^’ not meaningful for factors
7: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
8: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
9: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
10: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
...
46: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
47: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
48: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
The score statistic ends up NA, and I would like to run the test successfully.
The linearity assumption for logistic regression is between the log-odds and the predictor variables, not between the outcome and predictor variables (as you have entered them into the function).
lreg <- glm(fatalCancer ~ globy1, data=m2dat, family = binomial(link="logit"))
logodds <- lreg$linear.predictors
boxTidwell(logodds ~ globy1)
Alternatively, this is something you can assess with a scatter plot:
plot(logodds ~ globy1)
I hope this helps!
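Another check that is often used for logistic models is to add an x * log(x) term to the glm and test its coefficient; a small p-value hints that the logit is not linear in x. The sketch below illustrates that variant on simulated data. It is not the boxTidwell() procedure itself, and the names `sim`, `x`, `y` and `p_nonlinear` are hypothetical.

```r
# Simulated data (hypothetical): the logit is linear in x by construction.
set.seed(1)
n <- 200
x <- runif(n, 1, 6)                       # predictor must be positive for log(x)
y <- rbinom(n, 1, plogis(-2 + 0.6 * x))
sim <- data.frame(x = x, y = y)

# Add the x * log(x) term and inspect the Wald test for its coefficient.
fit <- glm(y ~ x + I(x * log(x)), data = sim,
           family = binomial(link = "logit"))
p_nonlinear <- coef(summary(fit))["I(x * log(x))", "Pr(>|z|)"]
p_nonlinear
```

With the data from the question, the same formula would use globy1 in place of x and fatalCancer as the response.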

vif, multicollinearity, Error in as.vector(y) - mean(y) : non-numeric argument to binary operator,

Please help! When I use vif to check for multicollinearity, I get these errors:
vif (M1)
Error in as.vector(y) - mean(y) : non-numeric argument to binary operator
In addition: Warning message:
In mean.default(y) : argument is not numeric or logical: returning NA
My script:
library(lmtest)
M1 <- lm(y ~ x + x1 + x2,
data = Mydata)
summary(M1)
library(VIF)
vif (M1)
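The vif() you are calling comes from the VIF package, which appears to expect a response vector and a predictor matrix rather than a fitted lm object. Use vif() from the car package instead: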
install.packages("car")
library(car)
vif(M1)
It works for me
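For a linear model with numeric predictors only, the variance inflation factor of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on the others. The sketch below computes it by hand in base R as a cross-check. The data frame `d` is simulated and hypothetical; only the variable names y, x, x1, x2 follow the question.

```r
# Simulated data (hypothetical); x2 is deliberately correlated with x.
set.seed(42)
n <- 100
d <- data.frame(x = rnorm(n), x1 = rnorm(n))
d$x2 <- d$x + rnorm(n, sd = 0.5)
d$y  <- 1 + d$x + d$x1 + d$x2 + rnorm(n)

predictors <- c("x", "x1", "x2")

# Regress each predictor on the remaining ones and convert R^2 to a VIF.
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = d))$r.squared
  1 / (1 - r2)
})
vif_manual
```

Here x and x2 should show clearly larger values than x1, which was generated independently.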

Plotting glm in R throws error regarding row counts in x and y

I'm trying to plot a glm in R. My current code is below, but I keep getting the error shown after it:
Code:
RAVE_CYC_PR3_reg <- glm(RAVE_CYC_PR3$`BVAS/WG Score` ~ RAVE_CYC_PR3$Days,
                        family=poisson())
summary(RAVE_CYC_PR3_reg)
plot(RAVE_CYC_PR3$Days, RAVE_CYC_PR3$`BVAS/WG Score`)
plot(x=seq(0,365,1), predict(RAVE_CYC_PR3_reg,
                             newdata = data.frame(days=seq(0,365,1)),
                             type = "response"))
Error:
Error in xy.coords(x, y, xlabel, ylabel, log) :
'x' and 'y' lengths differ
In addition: Warning message:
'newdata' had 366 rows but variables found have 393 rows
My intention is to plot both the raw data that I have as well as the predicted values from the glm to visualize how well the model fits the data.
Thank you all for any input.
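The warning about 393 rows suggests that predict() could not match newdata to the model terms, because the model was fitted with `RAVE_CYC_PR3$Days` while newdata has a column called `days`. A minimal sketch of one way around this is to fit with `data =` and bare column names, then give newdata a column with exactly the same name. The data frame `dat` and its columns `Days` and `Score` are simulated and hypothetical.

```r
# Simulated stand-in data (hypothetical column names Days and Score).
set.seed(7)
dat <- data.frame(Days = sample(0:365, 120, replace = TRUE))
dat$Score <- rpois(nrow(dat), lambda = exp(2 - 0.005 * dat$Days))

# Fit with data = and bare column names so predict() can match newdata.
fit <- glm(Score ~ Days, data = dat, family = poisson())

# newdata must use the same column name as the formula: Days, not days.
grid <- data.frame(Days = seq(0, 365, 1))
grid$pred <- predict(fit, newdata = grid, type = "response")

# Raw data as points, fitted mean as a line on the same axes.
plot(dat$Days, dat$Score, xlab = "Days", ylab = "Score")
lines(grid$Days, grid$pred)
```

With the original data, a column name containing a slash and a space can be written in backticks inside the formula, as in the question.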

Designating numeric variables as factors within lme4

Is there any way to specify contrast matrices within an lmer call if the variable in question is not a factor outside the call? As an example, here is toy data for a 2 x 4 mixed model with group as a between-subjects factor and time as a within-subjects factor. I am testing the between-group differences in score at each level of time. However, this is more of a technical question about whether it is possible to treat numeric variables as factors within an lmer call (i.e. while keeping them numeric outside it).
set.seed(345)
A0 <- rnorm(4,2,.5)
B0 <- rnorm(4,2+3,.5)
A1 <- rnorm(4,6,.5)
B1 <- rnorm(4,6+2,.5)
A2 <- rnorm(4,10,.5)
B2 <- rnorm(4,10+1,.5)
A3 <- rnorm(4,14,.5)
B3 <- rnorm(4,14+0,.5)
score <- c(A0,B0,A1,B1,A2,B2,A3,B3)
id <- rep(1:8,times = 4, length = 32)
time <- rep(0:3, each = 8, length = 32)
group <- rep(c("A","B"), times =2, each = 4, length = 32)
df <- data.frame(id = id, group = group, time = time, score = score)
Note that time, which is a numeric variable, is included in the model wrapped in a factor() call.
summary(lmer(score ~ group*factor(time) + (1|id), data = df))
This works fine on its own. However when I specify contrasts using the same factor(time) syntax:
summary(lmer(score ~ group*factor(time) + (1|id), data = df, contrasts = list(factor(time) = contr.helmert(4), group = contr.helmert(2))))
I get the error:
Error: unexpected '=' in "summary(lmer(score ~ group*factor(time) + (1|id), data = df, contrasts = list(factor(time) ="
...which I assume is down to my specifying the contrast matrices for the time variable within a factor() command.
And if I do not call time a factor in the contrasts = call...
summary(lmer(score ~ group*factor(time) + (1|id), data = df, contrasts = list(time = contr.helmert(4), group = contr.helmert(2))))
it returns a result but ignores that factor, with the warning messages....
1: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
2: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
3: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
4: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
5: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
6: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
7: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
8: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
9: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
10: In model.matrix.default(fixedform, fr, contrasts) :
variable 'time' is absent, its contrast will be ignored
I am aware that I am violating the syntax of lme4 here but I am trying to demonstrate what I would like to achieve, while acknowledging that I am ignorant of how to do so or if what I would like to do is even possible.
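The parse error arises because `factor(time) = ...` is not a valid argument name inside list(). A quoted name is valid, and model.matrix() matches the names in the contrasts list against the model-frame columns, one of which is literally called "factor(time)". The sketch below shows this with lm() in base R on hypothetical data (`toy`). That lmer() behaves the same way is an assumption, based on the warnings above showing that it passes contrasts to model.matrix.default().

```r
# Hypothetical toy data; time stays numeric in the data frame.
set.seed(345)
toy <- data.frame(
  group = factor(rep(c("A", "B"), each = 4, times = 4)),
  time  = rep(0:3, each = 8),
  score = rnorm(32)
)

fit <- lm(
  score ~ group * factor(time),
  data = toy,
  contrasts = list(
    "factor(time)" = contr.helmert(4),  # quoted name matches the model-frame column
    group = contr.helmert(2)
  )
)

# The stored contrasts show which coding was actually applied.
fit$contrasts
```

An alternative that avoids the quoted name is to add a factor copy of time to a copy of the data frame and use that column in the formula.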
