I want to manually set a coefficient for a variable that is not input in my linear model so that I can perform a spatial prediction.
I'll try to state my question as simply and clearly as I can.
What I have:
a raster stack with 4 binary variables for soil cover: agro, open, tran and urb
a linear model lm(formula = no2 ~ open + tran + urb, data = df)
The reason I only used 3 of the variables in my linear model was to prevent multicollinearity, because they are proportions of land coverage that add up to 100%.
So, my goal is to add a coefficient to my model for the agro variable, so that all of the 4 variables are used correctly in raster::predict()
You can use the offset term in the formula and include the desired coefficient and variable therein:
lm(formula = no2 ~ open + tran + urb + offset(agro*400), data = df)
This regresses no2 on open, tran and urb plus the fixed term agro * 400, whose coefficient is not estimated. To fix more than one coefficient, add further offset() terms.
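As a minimal sketch of how the offset behaves, the following uses simulated data (the data frame, the true coefficients and the value 400 are only illustrative assumptions, not your actual data). It checks that agro gets no estimated coefficient and that predict() still adds agro * 400 when given new data. I have not tested this with raster::predict(), but it should work the same way as long as the stack has a layer named agro.

```r
set.seed(1)
n <- 200

# Hypothetical land-cover proportions that sum to 1 in every row
raw  <- matrix(rexp(4 * n), ncol = 4)
prop <- raw / rowSums(raw)
df <- data.frame(agro = prop[, 1], open = prop[, 2],
                 tran = prop[, 3], urb = prop[, 4])
df$no2 <- 10 + 5 * df$open + 20 * df$tran + 30 * df$urb + rnorm(n)

# agro enters with a fixed coefficient of 400 (illustrative value)
fit <- lm(no2 ~ open + tran + urb + offset(agro * 400), data = df)

# Only the intercept, open, tran and urb are estimated
est <- coef(fit)
print(est)

# predict() adds the offset for new data automatically
newdata <- df[1:5, c("agro", "open", "tran", "urb")]
pred <- predict(fit, newdata = newdata)

# Same prediction computed by hand: estimated part + fixed part
manual <- est["(Intercept)"] +
  est["open"] * newdata$open +
  est["tran"] * newdata$tran +
  est["urb"]  * newdata$urb +
  400 * newdata$agro
```

Note that the estimated coefficients absorb whatever the fixed term leaves unexplained, so they will differ from those of the model without the offset.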
You can avoid the collinearity by leaving the intercept out of your model. Use
lm(formula = no2 ~ open + tran + urb + agro - 1, data = df)
and you'll be able to estimate coefficients for all of the predictors (but no intercept term).
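Here is a small sketch on simulated data (the data and the true coefficients are assumptions for illustration only) showing why this works: when the four proportions sum to 1, the no-intercept model with all four predictors is just a reparameterisation of the three-predictor model with an intercept, so the fitted values are identical.

```r
set.seed(2)
n <- 200

# Hypothetical land-cover proportions that sum to 1 in every row
raw  <- matrix(rexp(4 * n), ncol = 4)
prop <- raw / rowSums(raw)
df <- data.frame(agro = prop[, 1], open = prop[, 2],
                 tran = prop[, 3], urb = prop[, 4])
df$no2 <- 10 * df$agro + 15 * df$open + 30 * df$tran + 40 * df$urb + rnorm(n)

# Model with intercept, agro dropped (agro is the reference category)
fit_ref <- lm(no2 ~ open + tran + urb, data = df)

# Model without intercept, all four proportions included
fit_noint <- lm(no2 ~ open + tran + urb + agro - 1, data = df)

b_ref   <- coef(fit_ref)
b_noint <- coef(fit_noint)
print(b_ref)
print(b_noint)

# The agro coefficient plays the role of the intercept, and each of the
# other coefficients equals intercept + the corresponding reference-model slope
```

In the no-intercept fit each coefficient is the expected no2 for a cell covered 100% by that class, which is often easier to interpret. Keep in mind that R-squared is computed differently for models without an intercept, so it is not comparable between the two fits.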
I have data from an Experience Sampling Study, which consists of 8140 observations nested in 106 participants. I want to test if there is a mediation, in which I also want to compare the predictors (X1= socialInteraction_tech, X2= socialInteraction_ftf, M = MPEE_int, Y= wellbeing). X1, X2, and M are person-mean centred in order to obtain the within-person effects. To account for the autocorrelation I have fit a model with an ARMA(2,1) structure. We control for time with the variable "obs".
This is the final model including all variables of interest:
fit_mainH1xmy <- lme(fixed = wellbeing ~ 1 + obs # Controls
+ MPEE_int_centred + socialInteraction_tech_centred + socialInteraction_ftf_centred,
random = ~ 1 + obs | ID, correlation = corARMA(form = ~ obs | ID, p = 2, q = 1),
data = file, method = "ML", na.action=na.exclude)
summary(fit_mainH1xmy)
The mediation is partial, as my predictor X still significantly predicts Y after adding M.
However, I can't find a way to calculate the indirect effect (a*b), as opposed to c' (cprime), the direct effect.
I have found the mlma package, but it looks weird and requires me to do transformations to my data.
I have tried melting the data in a long format and using lmer() to fit the model (following https://quantdev.ssri.psu.edu/sites/qdev/files/ILD_Ch07_2017_Within-PersonMedationWithMLM.html), but lmer() does not let me take into account the moving average (MA-part of the ARMA(2,1) structure).
Does anyone know how I could now obtain the indirect effect?
I am supposed to find the intercept term using Ridge Regression model.
"Use Ridge Regression with alpha = 0 and lambda = 0, divorce as the response and all the other variables as predictors."
I know I'm supposed to convert my data to matrix mode and then transform it to fit the glmnet function. I've converted my response to matrix mode, but I'm not sure how to convert all my predictors into matrix mode, too.
set.seed(100)
require(faraway)
require(leaps)
require(glmnet)
mydata = divusa
mymodel = lm(divorce ~ year + unemployed + femlab + marriage + birth +
military, data=mydata)
summary(mymodel)
.
.
.
y = model.matrix(divorce~.,mydata)
Can anyone help with the code for my x variable? I'm very new to R and finding it very hard to understand it.
Your y = model.matrix(divorce~.,mydata) actually created your predictor matrix (usually called X). Try
X = model.matrix(divorce~.,mydata)[, -1]  # drop the intercept column; glmnet fits its own intercept
y = mydata$divorce
glmnet(X,y)
glmnet(X,y,alpha=0,lambda=0)
I think if you set lambda=0 you're actually doing ordinary regression (i.e., you're setting the penalty to zero, so ridge -> OLS).
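To illustrate that last point, here is a base-R sketch of ridge regression in closed form on simulated data. This is not how glmnet computes its solution (glmnet uses coordinate descent, standardises the predictors by default and scales lambda differently), and the function name ridge_closed_form is made up for this example. It only shows that with a zero penalty the ridge estimate, including the intercept, coincides with OLS.

```r
# Closed-form ridge with an unpenalised intercept (illustrative helper)
ridge_closed_form <- function(X, y, lambda) {
  x_means <- colMeans(X)
  Xc <- sweep(X, 2, x_means)  # centre predictors
  yc <- y - mean(y)           # centre response
  beta <- drop(solve(crossprod(Xc) + lambda * diag(ncol(X)),
                     crossprod(Xc, yc)))
  # Recover the intercept from the means
  c("(Intercept)" = mean(y) - sum(x_means * beta), beta)
}

set.seed(100)
n <- 100
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 2 + drop(X %*% c(1, -0.5, 0.25)) + rnorm(n)

ols      <- coef(lm(y ~ ., data = data.frame(y = y, X)))
ridge_0  <- ridge_closed_form(X, y, lambda = 0)   # no penalty
ridge_10 <- ridge_closed_form(X, y, lambda = 10)  # some shrinkage

print(rbind(ols = ols, ridge_0 = ridge_0, ridge_10 = ridge_10))
```

With glmnet itself you would read the intercept from coef() of the fitted object. Small numerical differences from lm() are possible there because glmnet iterates to a convergence threshold.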
I have a problem with the estimation of a panel data model in R.
I want to estimate the effect of a change in the Real GDP and the relative price level in the respective country on the contribution of the tourism sector.
If I use the command
Y <- cbind(ln_Differences_Contribution)
X <- cbind(ln_price_differences, Differences_ln_gdp)
and then
fixed <- plm(Y~X, data=pdata, model = "within")
I do not have an effect for the different years.
Is there any way I can add a time variable?
If you want to control for the effect of the time period ("years"), you can use the effect argument to additionally specify time effects in a two-way within model:
plm(y ~ x1 + x2, data = pdata, model = "within", effect = "twoways")
Another way would be to explicitly include the time variable in the formula as a factor and specify a one-way individual model as in your question, assuming your time variable is called year:
plm(y ~ x1 + x2 + factor(year), data = pdata, model = "within")
The output would be a bit "polluted" by the explicit estimates for the years (usually, one is not interested in those).
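As a quick check that the two specifications give the same slope estimates, here is a sketch using the Grunfeld data that ships with plm (a balanced panel; the variables inv, value and capital stand in for your own, and the index columns are firm and year):

```r
library(plm)

data("Grunfeld", package = "plm")

# Two-way within model: individual and time effects are swept out
fe_twoways <- plm(inv ~ value + capital, data = Grunfeld,
                  index = c("firm", "year"),
                  model = "within", effect = "twoways")

# One-way (individual) within model with explicit year dummies
fe_dummies <- plm(inv ~ value + capital + factor(year), data = Grunfeld,
                  index = c("firm", "year"),
                  model = "within")

b_twoways <- coef(fe_twoways)
b_dummies <- coef(fe_dummies)[c("value", "capital")]

print(b_twoways)
print(b_dummies)
```

The second model additionally reports one coefficient per year (minus the reference year), which is the extra output mentioned above.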
I'd like to use the ols() (ordinary least squares) function from the rms package to do a multivariate linear regression, but I would not like it to calculate the intercept. Using lm() the syntax would be like:
model <- lm(formula = z ~ 0 + x + y, data = myData)
where the 0 stops it from calculating an intercept, and only two coefficients are returned, one for x and the other for y. How do I do this when using ols()?
Trying
model <- ols(formula = z ~ 0 + x + y, data = myData)
did not work, it still returns an intercept and a coefficient each for x and y.
Here is a link to a csv file
It has five columns. For this example, you only need the first three columns:
model <- ols(formula = CorrEn ~ intEn_anti_ncp + intEn_par_ncp, data = ccd)
Thanks!
rms::ols uses rms:::Design instead of model.frame.default. Design is called with the default of intercept = 1, so there is no (obvious) way to specify that there is no intercept. I assume there is a good reason for this, but you can try changing ols using trace.
I have built a survival cox-model, which includes a covariate * time interaction (non-proportionality detected).
I am now wondering how could I most easily get survival predictions from my model.
My model was specified:
coxph(formula = Surv(event_time_mod, event_indicator_mod) ~ Sex +
ageC + HHcat_alt + Main_Branch + Acute_seizure + TreatmentType_binary +
ICH + IVH_dummy + IVH_dummy:log(event_time_mod)
I was hoping to get predictions using survfit, passing newdata for the combinations of covariates I want predictions for:
survfit(cox, newdata = new)
Now as I have event_time_mod in the right-hand side in my model I need to specify it in the new data frame passed on to survfit. This event_time would need to be set at individual times of the predictions. Is there an easy way to specify event_time_mod to be the correct time to survfit?
Or are there any other options for achieving predictions from my model?
Of course I could create as many rows in the new data frame as there are distinct times in the predictions and set event_time_mod to the correct value in each, but that feels really cumbersome and I thought there must be a better way.
You have done what is referred to as
An obvious but incorrect approach ...
as stated in Using Time Dependent Covariates and Time Dependent Coefficients in the Cox Model vignette in version 2.41-3 of the R survival package. Instead, you should use the time-transform functionality, i.e., the tt function as stated in the same vignette. The code would be something similar to the example in the vignette
> library(survival)
> vfit3 <- coxph(Surv(time, status) ~ trt + prior + karno + tt(karno),
+ data=veteran,
+ tt = function(x, t, ...) x * log(t+20))
>
> vfit3
Call:
coxph(formula = Surv(time, status) ~ trt + prior + karno + tt(karno),
data = veteran, tt = function(x, t, ...) x * log(t + 20))
              coef exp(coef) se(coef)     z       p
trt        0.01648   1.01661  0.19071  0.09  0.9311
prior     -0.00932   0.99073  0.02030 -0.46  0.6462
karno     -0.12466   0.88279  0.02879 -4.33 1.5e-05
tt(karno)  0.02131   1.02154  0.00661  3.23  0.0013
Likelihood ratio test=53.8 on 4 df, p=5.7e-11
n= 137, number of events= 128
However, survfit does not work when the model has a tt term:
> survfit(vfit3, veteran[1, ])
Error in survfit.coxph(vfit3, veteran[1, ]) :
The survfit function can not yet process coxph models with a tt term
However, you can easily get out the terms, linear predictor or mean response with predict. Further, you can create the term over time for the tt term using the answer here.
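As a rough sketch of building the time-dependent term by hand, the following refits the vignette model and evaluates the time-varying coefficient of karno, beta(t) = b_karno + b_tt * log(t + 20), at a few time points. The time points are arbitrary choices for illustration, and this gives hazard ratios only, not survival curves.

```r
library(survival)

# Same model as in the vignette example above
vfit3 <- coxph(Surv(time, status) ~ trt + prior + karno + tt(karno),
               data = veteran,
               tt = function(x, t, ...) x * log(t + 20))

b <- coef(vfit3)

# Arbitrary time points (days) at which to evaluate the effect of karno
times <- c(0, 30, 90, 180, 365)

# Time-varying log hazard ratio per unit of karno
beta_karno_t <- unname(b["karno"]) + unname(b["tt(karno)"]) * log(times + 20)

effect_over_time <- data.frame(
  time         = times,
  log_hr_karno = beta_karno_t,
  hr_karno     = exp(beta_karno_t)
)
print(effect_over_time)
```

The transformation inside log() has to match the tt function used when fitting, so for your own model you would use log(t) together with the coefficients of IVH_dummy and its tt term.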