Custom nonlinear function in R's `gnm` - r

I'm trying to build a nonlinear model as described in the manual for gnm in R. The desired form of the model is
y = b0*x0^g0 + b1*x1^g1 + ...
This seems like the simplest possible form of nonlinear model to me, but as far as I can tell (please correct me if I'm wrong!) I have to write a custom nonlin function to fit it in R. Very well!
df <- read.csv("d:/mydataframe.csv")
require(gnm)
mypower <- function(x) {
  list(predictors = list(beta = 1, gamma = 1),
       variables = list(substitute(x)),
       term = function(predlabels, varlabels) {
         paste(predlabels[1], "*(", varlabels[1], "**", predlabels[2], ")")
       })
}
class(mypower) <- "nonlin"
Now when I try
fit <- gnm(formula=y ~ mypower(x1), data=df)
I get a fitted value of beta and gamma from the model. But when I try
fit <- gnm(formula=y ~ mypower(x1)+mypower(x2), data=df)
I get the error
Algorithm failed - no model could be estimated.
So, question 1: how can I solve this?
Also, when trying to match all the x's, I try
fit <- gnm(formula=PedalCycles ~ mypower(.), data=df)
I get
Error in eval(expr, envir, enclos) : object '.' not found
Is this the right way to specify a sum of all xs each raised to a power?

To estimate y = b_0*x_0^g_0 you could use gnm's built-in Exp() to estimate
Exp( 1 + I( log(x_0) ) )
This gives you coefficients:
b'_0 for the intercept
g'_0 for log(x_0)
Hence g'_0 is your desired g_0 (since e^(g'_0 * log(x_0)) = x_0^(g'_0)) and e^(b'_0) is b_0. Your model is now a sum of such terms.
Caveat: This will not work if x_0 takes non-positive values in your data set.
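Putting this together, a sketch of the full model as a sum of such terms (an assumption-laden example: a data frame df with columns y, x1, x2, both x's strictly positive):

```r
library(gnm)
# Each Exp() term contributes exp(b'_i) * x_i^(g'_i); the -1 drops the global
# intercept so only the per-term intercepts b'_i remain.
fit <- gnm(y ~ -1 + Exp(1 + I(log(x1))) + Exp(1 + I(log(x2))), data = df)
coef(fit)  # b'_1, g'_1, b'_2, g'_2; exponentiate the b' terms to recover b_i
```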

Related

R: How to obtain diagnostic plots for a lavaan mediation model?

I wasn't sure whether this was more appropriate to ask here or CrossValidated as I'm specifically asking about using R / lavaan...
I'm not sure if I've completely misunderstood how violations of assumptions are checked. I understand that we can obtain diagnostic plots for linear models with:
model <- lm(data$outcome ~ data$predictor)
plot(model, which = c(1:6))
But I'm having trouble figuring out how I should do this for a mediation model fitted like so:
model <- 'outcome ~ c*predictor + b*mediator
          mediator ~ a*predictor
          indirect_effect := a*b
          total_effect := c + (a*b)'
model.fit <- lavaan::sem(
  model = model,
  data = data,
  missing = "FIML",
  estimator = "ML")
Then if I try obtaining plots in the same way (plot(model.fit, which = c(1:6))), I get Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double'.
Also, to check for violations of assumptions for Pearson's correlation, would we do so by looking at the structure of each variable individually, or by making a linear model (lm(data$outcome ~ data$predictor)), or using the correlation itself (cor.test(data$var1, data$var2)) in some way?
Try:
lavaanPlot::lavaanPlot(model = model.fit, coefs = TRUE)
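Note that lavaanPlot draws the path diagram with coefficients rather than lm-style diagnostic plots. For the residual-diagnostics part of the question, one option (a sketch, not a full equivalent of plot.lm, and FIML-handled missing rows may need care) is to inspect lavaan's residuals:

```r
# Summary of residual covariances/correlations for the fitted model
lavaan::lavResiduals(model.fit)

# Casewise checks: compare observed outcomes with model-implied values
pred <- lavaan::lavPredict(model.fit, type = "ov")
res  <- data$outcome - pred[, "outcome"]
plot(pred[, "outcome"], res)  # residuals vs fitted, by analogy with plot(lm, which = 1)
```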

How to Fit Conway–Maxwell-Poisson regression in R?

I want to fit a Conway–Maxwell-Poisson regression with one response and two randomly generated covariates in R. How do I fit it?
library(COMPoissonReg)
n1 = 200
x1 = rnorm(n1, 0, 1)
x2 = rnorm(n1, 0, 1)
b0 = 0.05; b1 = 0.0025; b2 = 0.005; b7 = 0.0001
y = b0 + (b1 * x1) + (b2 * x2)
y1 = exp(y)
nu = exp(y)
y2 = rcmp(n1, y1, nu)
model = glm.cmp(y1 ~ x1 + x2, formula.nu = x1 + x2)
I found the following error
Error in formula.default(object, env = baseenv()) : invalid formula
so please guide me on how to fit this model.
For your data above, you did not simulate the overdispersion parameter, so you don't need to specify a formula for nu. The call below would then work, except that your dependent variable is not an integer, so it still throws an error:
glm.cmp(y1 ~ x1+x2)
You can simulate data like this; the example below includes overdispersion, so you can see it reflected in the estimate for nu:
n1 <- 200
x1 <- rnorm(n1, 0, 1)
x2 <- rnorm(n1, 0, 1)
b0 <- 0.05; b1 <- 3; b2 <- 2
mu <- exp(b0 + (b1 * x1) + (b2 * x2))
y <- rnbinom(length(mu), mu = mu, size = 1)
glm.cmp(y ~ x1 + x2)
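If you do want a nu that varies with the covariates, note that formula.nu must be a one-sided formula, which is what triggered the "invalid formula" error in the question. A sketch simulating genuine CMP counts with rcmp and then fitting both formulas:

```r
library(COMPoissonReg)
set.seed(1)
n1 <- 200
x1 <- rnorm(n1); x2 <- rnorm(n1)
lambda <- exp(0.05 + 0.0025 * x1 + 0.005 * x2)  # rate parameter
nu <- rep(1, n1)                                 # dispersion (1 = Poisson case)
y <- rcmp(n1, lambda, nu)                        # integer counts, as glm.cmp requires
fit <- glm.cmp(y ~ x1 + x2, formula.nu = ~ x1 + x2)
```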

How to get AIC from lm_robust object

How do I get an AIC from an lm_robust object (package estimatr)? I'm using lm_robust because I want to use a robust estimator for calculating the SE. Unlike the lm function, AIC is not provided when you run the summary function and running the AIC function on a lm_robust object produces an error. Below is a toy example of the kind of model I'm trying to run.
library(estimatr)
fake_data<-data.frame(outcome=rnorm(100,3.65,1),
pred1=rnorm(100,15,7),
pred2=as.factor(sample(1:5, 100, replace = T)))
mod1<-lm_robust(outcome~pred1+pred2,data=fake_data)
AIC(mod1)
here is what the error message looks like:
> AIC(mod1)
Error in UseMethod("logLik") :
no applicable method for 'logLik' applied to an object of class "lm_robust"
If you have to do it with lm_robust, you may choose to calculate it by yourself as below,
The formula of AIC,
AIC = 2*k + n * (log(2*pi * RSS/n) + 1)
# n : number of observations
# k : number of estimated parameters (all coefficients, including the constant and the non-reference factor levels, plus the residual variance)
# RSS : residual sum of squares
If we apply it to R for your case,
# Note that I take k = 7 here: the intercept, pred1, 4 dummy coefficients for the 5-level factor, and the residual variance
AIC_calculated <- 2*7 + 100* (log( 2*pi* (1-mod1$r.squared)*mod1$tss/100 ) + 1)
[1] 332.2865
which is the same as the AIC reported for both lm and glm fits.
mod2<-lm(outcome~pred1+pred2,data=fake_data)
> AIC(mod2)
[1] 332.2865
And finally, of course, you can wrap this calculation in a function that you call with the lm_robust model itself, without having to set N and k by hand for any given data:
myAIC <- function(model) {
  2 * (model$k + 1) + model$N * (log(2 * pi * (1 - model$r.squared) * model$tss / model$N) + 1)
}
> myAIC(mod1)
[1] 332.2865
Note: your results may differ, because the data frame is generated randomly (including the sample() call) without a fixed seed.
Here's a workaround
mod1 = lm_robust(outcome ~ pred1 + pred2, data = fake_data)
#Create any fitted model using 'lm' as a placeholder
mod2 = with(list(x = rnorm(10), y = rnorm(10)), lm(y ~ x))
#Copy values in `mod2` from `mod1`
mod2[names(mod2)] = mod1[names(mod2)]
#Calculate residuals in `mod2` (observed minus fitted)
mod2$residuals = fake_data$outcome - mod2$fitted.values
AIC(mod2)
#[1] 326.6092

Apply Durbin Watson test on Prais Winsten Model

I am having trouble running a Durbin Watson test on the prais winsten model I generated.
value3<-prais.winsten(value1$model)
dwtest(value3)
I receive this error:
Error in terms.default(formula) : no terms component nor attribute
Without a reasonable reproducible example it is hard to pinpoint the issue, but this is the default way to go:
# Calculate Durbin-Watson
ols <- lm(y ~ x1 + x2)
dwtest(y ~ x1 + x2)
prais.winsten.lm(ols)
# Calcuate Autocorrelation and Partial Autocorrelation Coefficients
acf(resid(ols),lag.max = 5,plot = FALSE)
pacf(resid(ols),lag.max = 5,plot = FALSE)
You need to call dwtest with the lm function on the list that prais.winsten returns.
From your example it would be:
dwtest(lm(value3[[1]]))
When you display value3 you should see something like this at the top
[[1]]
Call:
lm(formula = fo)
This is what I am referring to.

change null hypothesis in lmtest in R

I have a linear model generated using lm. I use the coeftest function in the lmtest package to test a hypothesis with my desired vcov from the sandwich package. The default null hypothesis is beta = 0. What if I want to test beta = 1, for example? I know I can simply take the estimated coefficient, subtract 1, and divide by the provided standard error to get the t-stat for my hypothesis. However, there must already be functionality for this in R. What is the right way to do this?
MWE:
require(lmtest)
require(sandwich)
set.seed(123)
x = 1:10
y = x + rnorm(10)
mdl = lm(y ~ x)
z = coeftest(mdl, df=Inf, vcov=NeweyWest)
b = z[2,1]
se = z[2,2]
mytstat = (b-1)/se
print(mytstat)
The formally correct way to do this:
require(multcomp)
zed = glht(model=mdl, linfct=matrix(c(0,1), nrow=1, ncol=2), rhs=1, alternative="two.sided", vcov.=NeweyWest)
summary(zed)
Use an offset of -1*x
mdl<-lm(y~x)
mdl2 <- lm(y ~ x-offset(x) )
> mdl
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
0.5255 0.9180
> mdl2
Call:
lm(formula = y ~ x - offset(x))
Coefficients:
(Intercept) x
0.52547 -0.08197
You can look at summary(mdl2) to see the p-value; the t-test on the x coefficient now tests the null hypothesis beta = 1 (the intercept is the same as in mdl).
As far as I know, there is no built-in function to test the model coefficients against an arbitrary value (1 in your case). There is the offset trick presented in the other answer, but it's not that straightforward (and always be careful with such model modifications). So your expression (b-1)/se is actually a good way to do it.
I have two notes on your code:
You can use summary(mdl) to get the t-test for 0.
You are using lmtest with covariance structure (which will change the t-test values), but your original lm model doesn't have it. Perhaps this could be a problem? Perhaps you should use glm and specify the correlation structure from the start.
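If you prefer a packaged route, car::linearHypothesis can test a coefficient against an arbitrary value with a robust covariance matrix (an alternative to the multcomp approach above; assumes the car package is installed):

```r
library(car)
library(sandwich)
# Test H0: beta_x = 1 using the Newey-West covariance estimate
linearHypothesis(mdl, "x = 1", vcov. = NeweyWest(mdl))
```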
