Regression result showing just an intercept with no estimate for the variable - R

I was trying to fit a regression model with several independent variables. When checking the summary() of that model I saw that the estimate for one of the variables didn't show up. So I tried to fit a model with just that independent variable, which you can see in the sample code below (I changed the variable names for easier understanding). Basically, no coefficient is calculated for this variable and the output shows only the intercept. In other regressions the variable worked fine and produced an estimate, so I don't know why this happens here. I have a panel dataset, in case this matters, and the value of Y changes from data point to data point, so it's not just a constant.
Does anyone have an idea why this happens?
Sample code:
> TestFit = plm(Y ~ X, data = dataset, model = "between", index = c("Index", "DatesNum"))
> TestFit
Model Formula: Y ~ X
Coefficients:
(Intercept)
0.00014546

It would be better to show your dataset (de-identified if necessary) so people can answer your question more easily.
If you define Y and X outside plm(), you probably don't need plm.
When you have data = dataset, Y and X should be column names in dataset.
Does changing plm to lm work?
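Following up on the last comment, here is a minimal sketch (reusing the Y, X, dataset, and Index names from the question) for comparing plm's between fit with a plain lm() and for checking whether X still varies once it is averaged within each panel unit; X becoming (near) constant across unit means is only an assumption to check, but it would explain the coefficient dropping out:
library(plm)

# Plain OLS on the raw data, for comparison with the between fit
lm_fit <- lm(Y ~ X, data = dataset)
summary(lm_fit)

# model = "between" works on the per-unit means of each variable,
# so check whether X still varies once it is averaged within "Index"
unit_means <- aggregate(X ~ Index, data = dataset, FUN = mean)
var(unit_means$X)  # (near) zero would explain X dropping out of the between model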

Related

Generalized Linear Model (GLM) in R

I have a response variable (A), which I transformed (logA), and a predictor (B) from data (X); both are continuous. How do I check the linearity between the two variables using a Generalized Additive Model (GAM) in R? I use the following code
model <- gamlss(logA ~ pb(B) , data = X, trace = F)
but I am not sure about it. Can I add "family=Poisson" to the code when logA is continuous? Any thoughts on this?
Thanks in advance
If your dependent variable is a count variable, you can use family=PO() without the log transformation; with family=PO() a log link is already applied to transform the mean. See the help page for the gamlss families and also the vignette on count regression, section 2.1.
So it will go like this:
library(gamlss)
# gear is a count, so model it directly with the Poisson family (log link)
fit <- gamlss(gear ~ pb(mpg), data = mtcars, family = PO())
You can see that the predictions are on the log (link) scale and you need to take the exponential:
with(mtcars, plot(mpg, gear))
points(mtcars$mpg, exp(predict(fit, what = "mu")), col = "blue", pch = 20)

Error in glsEstimate(object, control = control) : computed "gls" fit is singular, rank 19

First time asking in the forums; this time I couldn't find the solution in other answers.
I'm just starting to learn R, so I can't help but think this has a simple solution I'm failing to see.
I'm analyzing the relationship between different insect species (SP) and temperature (T), the explanatory variables, and the femur area of the resulting adult (Femur.area), the response variable.
This is my linear model:
ModeloP <- lm(Femur.area ~ T * SP, data=Datos)
No error, but when I want to model variance with gls,
modelo_varPower <- gls(Femur.area ~ T * SP,
                       weights = varPower(),
                       data = Datos)
I get the following error:
Error in glsEstimate(object, control = control) :
computed "gls" fit is singular, rank 19
The linear model barely passes the Shapiro test of normality; could this be the issue?
Shapiro-Wilk normality test
data: re
W = 0.98269, p-value = 0.05936
Strangely, I've run this model with another explanatory variable and had no errors. Everything I can find in the forums has to do with multiple samplings over a period of time, and that's not my case.
Since the only difference is the response variable, I'm uploading an image of how the table looks in case it helps.
You have some missing cells in your SP:T interaction. lm() tolerates these (if you look at coef(lm(Femur.area ~ SP * T, data = Datos)) you'll see some NA values for the missing interactions); gls() does not. One way to deal with this is to create an interaction variable and drop the missing levels, then fit the model as (effectively) a one-way rather than a two-way ANOVA. (I called the data dd rather than Datos.)
dd3 <- transform(na.omit(dd), SPT = droplevels(interaction(SP, T)))
library(nlme)
gls(Femur.area ~ SPT, weights = varPower(form = ~fitted(.)), data = dd3)
If you want the main effects and the interaction term and the power-law variance that's possible, but it's harder.
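For reference, a quick way to see those missing cells is to cross-tabulate the two factors and to look at the NA coefficients mentioned above (a small sketch reusing the Datos, SP, T, and Femur.area names from the question):
# Cross-tabulate species by temperature; zero counts mark the empty cells
with(Datos, table(SP, T))

# lm() silently returns NA coefficients for the inestimable interaction terms
coef(lm(Femur.area ~ SP * T, data = Datos))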

Why is it that when a random slope and intercept are uncorrelated, the choice of a reference category for a predictor variable matters?

I had originally asked this in an edit to a previous question, but I think that it deserves its own question.
I am running a glmer with a single dichotomous predictor (coded 1/0). The model also includes a random subject intercept, as well as a random item intercept and slope.
Changing which level of the predictor serves as the reference category doesn’t change the absolute value of the coefficient, EXCEPT when the random intercept and slope are uncorrelated.
This happens whether I keep the predictor as a numeric variable, or change the predictor into a factor and use the following code:
t1 <- glmer(DV ~ IV + (1|PPT) + (0 + dummy(IV, "1")|Item) + (1|Item), data = data, family = "binomial")
Is this a genuine result? If so, can anyone explain why the uncorrelated random intercept and slope allow it to emerge? If not, how can I run a model that has an uncorrelated random intercept and slope that would prevent the choice of reference category from affecting the result?
Thank you very much!
Edit:
Here is some sample data.
I'm sorry, I wasn't sure how to link to it with an R command, but a sample of the CSV data is here: https://pastebin.com/embed_js/X2h9yT4c
library(lme4)

testdata <- read.csv("test.csv")
testdata$PPT <- as.factor(testdata$PPT)
testdata$BalancedIV <- as.factor(testdata$BalancedIV)
testdata$BalancedIVReversed <- as.factor(testdata$BalancedIVReversed)
testdata$BIV <- as.numeric(as.character(testdata$BalancedIV))
testdata$BIVR <- as.numeric(as.character(testdata$BalancedIVReversed))
testdata$UIV <- as.numeric(as.character(testdata$UnBalancedIV))
testdata$UIVR <- as.numeric(as.character(testdata$UnbalancedIVReversed))
These two models have the same predictor but with reverse coding (i.e., 1/0 the first time, and 0/1 the second time). You can see that their coefficients have different absolute values.
t19 <- glmer(DV ~ BalancedIV + (1|PPT) + (0 + dummy(BalancedIV, "1")|Item) + (1|Item), data = testdata, family = "binomial")
t20 <- glmer(DV ~ BalancedIVReversed + (1|PPT) + (0 + dummy(BalancedIVReversed, "1")|Item) + (1|Item), data = testdata, family = "binomial")
summary(t19)
summary(t20)

Obtaining predicted (i.e. expected) values from the orm function (Ordinal Regression Model) from rms package in R

I've run a simple model using orm (i.e. reg <- orm(formula = y ~ x)) and I'm having trouble understanding how to get predicted values for Y. I've never worked with models that use multiple intercepts. I want to know, for each and every value of Y in my dataset, what the predicted value from the model would be. I tried predict(reg, type="mean") and this produced values that are close to the predicted values from an OLS regression, but I'm not sure if this is what I want. I really just want something analogous to OLS, where you can obtain E(Y) given a set of predictors. If possible, please provide code I can run to do this, with a very short explanation.
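Not from the original thread, but a minimal sketch of one commonly suggested approach, using the rms package's Mean() helper to turn linear predictors into expected values (the y and x names follow the question; treat this as an outline rather than a definitive answer):
library(rms)

reg <- orm(y ~ x)                       # ordinal regression model, as in the question

# Mean() returns a function that converts linear predictors into E(Y)
M <- Mean(reg)

# expected value of Y for every observation used to fit the model
expected_y <- M(predict(reg, type = "lp"))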

Tukey HSD for multiple variables and single variable return different results

I have tried to run Tukey's HSD on a multi-variable dataset. However, when I run the same test on a single variable, the results are completely opposite.
While running it for multiple variables, I observed the following message in the ANOVA output:
8 out of 87 effects not estimable
Estimated effects may be unbalanced
While running it for a single variable, I observed the following message in the ANOVA output:
Estimated effects may be unbalanced
Is this in any way related to the completely opposite Tukey HSD output I received? Also, how do I go about solving this problem?
I used aov() and have close to 500,000 data points in my dataset.
To be more specific, the following code gave me different results:
code1:
library(multcomp)
lm_test1 <- lm(y ~ x1 + x2, data = data)   # Tukey contrasts for x1, adjusting for x2
glht(lm_test1, linfct = mcp(x1 = "Tukey"))
code2:
lm_test1 <- lm(y ~ x1, data = data)        # Tukey contrasts for x1 alone
glht(lm_test1, linfct = mcp(x1 = "Tukey"))
Please tell me how this is possible...
After some more research I found the answer, so I thought I should post it. anova() in R uses Type I (sequential) sums of squares by default. That means the effect of the first variable entered is assessed without controlling for any other factors, while later variables are assessed after controlling for the variables entered before them. Since I was entering my variable second, the results shown were after controlling for the first variable, and they happened to point in the completely opposite direction from the direct effect.
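To illustrate that order dependence, here is a small sketch with the hypothetical y, x1, x2 names from the code above; the car package call at the end is only a suggestion for order-independent (Type II) tests:
# Type I (sequential) sums of squares: each term is assessed only after the terms before it
anova(lm(y ~ x1 + x2, data = data))   # x1 unadjusted, x2 adjusted for x1
anova(lm(y ~ x2 + x1, data = data))   # x2 unadjusted, x1 adjusted for x2

# Type II tests, where each main effect is adjusted for the other
car::Anova(lm(y ~ x1 + x2, data = data), type = 2)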
