roc curve for bayesian logistic regression - r

Is there anyone can help me implement a ROC curve for a bayesian logistic regression? been trying DPpackage but is it me or it just doesn't work.
the two models i want to compare using ROC Curve are showed below:
bayes_mod=MCMClogit(Default ~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr, data=mydata, burnin=500000,mcmc=10000, tune=0.6,b0=coef(mylogit.reduced),B0=information2, subset=c(-1772,-2064,-655))
bayes_mod1=MCMClogit(Default ~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr, data=mydata, burnin=500000,mcmc=10000,tune=0.6,subset=c(-1772,-2064,-655))
where Default ~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr are my arguments; mydata is the database; mylogit.reduced is a logistic regression estimated prior to bayesian, B0 is the covariation matrix, and subset=c are the eliminated observations.

I don't know this package, but it probably provides a predict function (actually it does, I just can't find if it does for MCMClogit models as I can't find the doc for this function). You can then pass it to a ROC function like pROC:
library(pROC)
predictions <- predict(mydata, newdata=mytestdata)
roc(mytestdata$Default, predictions)

Related

How can I calculate marginal effects from a binomial logistic model with clustered standard errors in R studio?

I would like to calculate marginal effects for this logistic model with clustered standard errors which I computed with miceadds::glm.cluster.
fullmodel3 <- miceadds::glm.cluster(data = SDdataset17,
formula = stigmatisation_dummy_num ~ gender + age +
agesquared + education_new + publicsector +
retired + socialisation_sd + selfplacement_num +
years_membership + voteshare,
cluster = "voteshare", family = "binomial")
Given that I am not using glm(), most functions I have seen around do not work.
Any suggestions?

Significant variables for Logistic regression in R

I am still new to R and still struggling. I am trying to do a logistic regression using a categorical and continuous variable and I am supposed to select the right variable for my model. There are 27 variables and a 8,000 observations.
I have gone through a couple of articles online including stepwise regression by AIC and all I do is confuse myself the more. I was also told to select my variables from the correlation matrix but when I do the correlation I don't seem to find the correlation especially with the categorical variable. I also try to fit all the model and I get some variables with p-value less than 0.5. This is the code:
d4 <- d3[,c('SW','MOI','YOI','DOI_CMC','RMOB','RYOB','RDOB_CMC',
'RCA','Region','TPR','DPR','NV','HEL','Has_Radio','Has_TV',
'Religion','WI','MOFB','YOB','DOB_CMC','DOFB_CMC','AOR','MTFBI',
'DSOUOM_CMC','RW','RH','RBMI')]
cor(d4)
d5 <- cor(d4)
round(cor(d4),2)
When I select the significant variables and try to apply logistic regression all the p value will be between 0.9 to 1. See code:
d3 <- lm(TPR ~ SW + MOI + RMOB + RYOB + RCA + Region + TPR + DPR +
NV + HEL + Has_Radio + Has_TV + Religion + WI + MOFB +
YOB + DOB_CMC + DOFB_CMC + AOR + MTFBI + DSOUOM_CMC +
RW + RH + RBMI,
data = d3, family = "binomial")
summary(d3)
I need help with this please!!
Here is the sample of d3

Hedonic price method multivariate regression in R | Interpretation of linear-linear, log-linear and log-log model

I am aware that there are similar questions on this site, however, none of them seem to answer my question sufficiently.
I am performing a multivariate regression in order to predict real estate data using Hedonic price method.
EXCERPT OF DATA USED
Dependent variable is AV_TOTAL, which is actually the price of the unit apartments'.
Distances from the closer park/highway are expressed in meters.
U_NUM_PARKS/U_FPLACE(presence of parkings and fireplace) are taken into account as dummy variables.
1) Linear-Linear Model --> Results Model 1
lm(AV_TOTAL ~ LIVINGA_AREAM2 + NUM_FLOORS +
U_BASE_FLO + U_BDRMS + factor(U_NUM_PARK) + DIST_PARKS +
DIST_HIGHdiff + DIST_BIGDIG, data = data)
Residuals Model 1
2) Log-linear Model --> Results Model 2
lm(log(AV_TOTAL) ~ LIVINGA_AREAM2 + NUM_FLOORS +
U_BASE_FLO + U_BDRMS + factor(U_NUM_PARK) + DIST_PARKS + DIST_HIGHdiff + DIST_BIGDIG, data = data)
Residuals Model 2
3) Log-Log Model --> Results Model 3
lm(formula = log(AV_TOTAL) ~ log(LIVINGA_AREAM2) + NUM_FLOORS +
U_BASE_FLO + log(U_BDRMS) + factor(U_NUM_PARK) + log(DIST_PARKS) +
log(DIST_HIGHdiff) + log(DIST_BIGDIG), data = data)
Residuals Model 3
All the models have quite good R^2 while residuals plot shows better normal distribution for Model 2 and 3.
I can't figure out which is the difference between model 2 and 3 especially in interpreting the variable DIST_PARKS (distance from parks) and also which is the more correct model.

Compare beta coefficients of the same regression

is there a way to compare (standardized) beta coefficients of one sample and regression without generating two models and conducting an anova? Is there a simpler method with e.g. one function?
For example, if I have this model and would want to compare beta coefficients of SE_gesamt and CE_gesamt (only two variables):
library(lm.beta)
fit1 <- lm(Umint_gesamt ~ Alter + Geschlecht_Dummy + SE_gesamt + CE_gesamt + EmoP_gesamt + Emp_gesamt + IN_gesamt + DN_gesamt + SozID_gesamt, data=dataset)
summary(fit1)
lm.beta(fit1)
All the best,
Karen

fitting model in svyglm

I am using svyglm model, I need to get "AIC" to compare differents models.
My problem is that if I use my model:
Modelo12=svyglm(formula = Asiste ~ sexo + E27 + JovenActivo + hijos +
+ jefe + LN_YSVL_sin_joven_prom + aniosed + climaeducativo +
+ icv2 + TV + Computadora + Telefono + Cable + Calefon +
+ DVD + Microhondas + Aire + Auto_o_moto + Secadora + Lavavajillas +
+ Refrigerador + Actividad_del_Jefe + Hacinamiento, family = quasibinomial(link
=
"probit"), data = Personas.con.muestra, design=diseno_personas_14_17)
when I write summary(Modelo12) AIC doesnt appears.
In other hand, I can`t use stepwise, this is the error:
stepwise(Modelo12)
Direction: backward/forward forward/backward backward forward
Criterion: BIC
Error en extractAIC.svyglm(fit, scale, k = k, ...) :
svyglm not fitted by maximum likelihood
THANKS!!!
Natalia
AIC is a likelihood based measure. svyglm does not use maximum likelihood fitting, as the error and ?svyglm tell you. There you also get the advice to use regTermTest, which provide Wald tests and adjusted Rao-Scott likelihood ratio test.

Resources