Regression Kriging of binomial data in geoRglm R package - r

I am using the binom.krige() function from the R package geoRglm to obtain spatial predictions of a binary (0/1) response variable with several continuous and discrete covariates.
Using glm() with a binomial family and logit link, I found that the response variable depends significantly on several covariates.
I included the trend in binom.krige() through krige.glm.control(), where I specified the two trend models as
> trend.d=trend.spatial(~ rivers + roads + annual_pre + annual_tem + elevation_ + host_densi + lulc + moist_dq + moist_in + moist_wq, data_points)
> trend.l=trend.spatial(~ rivers1 + roads + annual_pre + annual_tem + elevation_ + host_densi + lulc + moist_dq + moist_in + moist_wq, pred_grid)
What confuses me is this: when trend.d and trend.l go into krige.glm.control() and eventually into binom.krige(), does it actually fit a GLM with a binomial logit link, or just a linear model (the formulas above look like a linear model)?
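As far as I understand the binomial-logit model in geoRglm, a formula passed to trend.spatial() only defines the design matrix of the trend, and that trend enters the model on the logit scale. It is therefore "linear" only in the sense of a linear predictor, not a Gaussian linear model for the 0/1 response. A minimal base-R sketch of that idea, using a small made-up stand-in for data_points and illustrative coefficients (both are assumptions, not values from the question):

```r
# Hypothetical stand-in for data_points with two of the covariates from the question
data_points <- data.frame(
  elevation_ = c(120, 340, 410, 230, 480),
  lulc = factor(c("forest", "crop", "forest", "urban", "crop"))
)

# Design matrix: intercept, one numeric column, and treatment contrasts for the factor
D <- model.matrix(~ elevation_ + lulc, data = data_points)

# Illustrative coefficients only (one per column of D)
beta <- c(-1, 0.002, 0.5, -0.3)

eta <- drop(D %*% beta)  # trend on the logit scale (linear predictor)
p <- plogis(eta)         # inverse logit maps the trend to probabilities in (0, 1)

print(colnames(D))
print(round(p, 3))
```

The spatial model adds a correlated random effect to eta before the inverse logit is applied, which this sketch leaves out.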

Related

How can I calculate marginal effects from a binomial logistic model with clustered standard errors in R studio?

I would like to calculate marginal effects for this logistic model with clustered standard errors which I computed with miceadds::glm.cluster.
fullmodel3 <- miceadds::glm.cluster(data = SDdataset17,
formula = stigmatisation_dummy_num ~ gender + age +
agesquared + education_new + publicsector +
retired + socialisation_sd + selfplacement_num +
years_membership + voteshare,
cluster = "voteshare", family = "binomial")
Because the model was not fitted with glm() directly, most of the marginal-effects functions I have found do not work with it.
Any suggestions?
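One option is to compute average marginal effects by hand. Clustering changes the standard errors but not the coefficient estimates, so for a continuous predictor in a logit model the average marginal effect is the coefficient times the mean of p(1 - p). The sketch below uses simulated data and a plain glm() fit; the data frame d and its values are hypothetical, and only two predictors from the question are reused as names.

```r
set.seed(123)
n <- 500

# Simulated stand-in data (hypothetical values)
d <- data.frame(age = rnorm(n, 50, 10), education_new = rnorm(n))
d$stigmatisation_dummy_num <- rbinom(
  n, 1, plogis(-1 + 0.03 * (d$age - 50) + 0.8 * d$education_new)
)

fit <- glm(stigmatisation_dummy_num ~ age + education_new,
           family = binomial, data = d)

# Average marginal effect of each continuous predictor:
# d p / d x_k = beta_k * p * (1 - p), averaged over the sample
p <- fitted(fit)
ame <- mean(p * (1 - p)) * coef(fit)[-1]
print(ame)
```

Standard errors for these effects would still need the clustered covariance matrix (for example through the delta method). Dummy predictors are better handled as a difference in predicted probabilities than as a derivative.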

Fixed effect model - using plm and lm, different R2 values

I tried to fit a fixed-effects model in two ways: with plm, and with lm plus factor(countries).
I got the same coefficient estimates, but different values for R2, the residual standard error and the F-statistic. Why?
And a second question: can I get the country coefficients from plm (as with lm + factor(countries))?
Here is my data set: https://www.dropbox.com/s/a8r0vl85rb1ak6e/panel_data.csv?dl=0
It contains some financial measures and GDP growth for several countries and years. There are some NaNs (the panel is unbalanced).
library(readxl)
library(plm)
proba <- read_excel("my_data.xlsx")
pdata <- pdata.frame(proba, index = c("id", "year"))
fixed <- plm(GDP_growth ~ gfdddi01 + gfdddi02 + gfdddi04 + gfdddi05, data = pdata, model = "within")
fixed.dum <- lm(GDP_growth ~ gfdddi01 + gfdddi02 + gfdddi04 + gfdddi05 + factor(country) - 1, data = pdata_srednie_5_letnie)
Thank you!
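A likely explanation is that the within estimator reports an R2 for the demeaned regression, while the dummy-variable regression also counts the variation absorbed by the country dummies (and, without an intercept, lm uses an uncentered R2). The following base-R sketch on simulated data (all names and values hypothetical) reproduces this: the slopes agree, the R2 values differ, and the country effects can be recovered from the within fit. In plm itself, fixef(fixed) returns those effects.

```r
set.seed(1)
n_id <- 8
n_t <- 6

# Simulated panel (hypothetical): 8 units observed over 6 periods
d <- data.frame(id = factor(rep(seq_len(n_id), each = n_t)),
                x = rnorm(n_id * n_t))
alpha <- rnorm(n_id, sd = 3)
d$y <- alpha[as.integer(d$id)] + 0.5 * d$x + rnorm(nrow(d))

# Dummy-variable (LSDV) regression, as with lm + factor(country) - 1
lsdv <- lm(y ~ x + id - 1, data = d)

# Within transformation: subtract unit means, then regress without intercept
# (this plain lm does not correct the degrees of freedom for the absorbed dummies)
d$y_w <- d$y - ave(d$y, d$id)
d$x_w <- d$x - ave(d$x, d$id)
within_fit <- lm(y_w ~ x_w - 1, data = d)

# Unit effects recovered from the within fit: mean of y - beta * x per unit
fe <- tapply(d$y - coef(within_fit)["x_w"] * d$x, d$id, mean)

print(c(lsdv = unname(coef(lsdv)["x"]), within = unname(coef(within_fit)["x_w"])))
print(c(lsdv_R2 = summary(lsdv)$r.squared, within_R2 = summary(within_fit)$r.squared))
print(fe)
```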

Hedonic price method multivariate regression in R | Interpretation of linear-linear, log-linear and log-log model

I am aware that there are similar questions on this site, however, none of them seem to answer my question sufficiently.
I am performing a multivariate regression to predict real estate prices using the hedonic price method.
EXCERPT OF DATA USED
The dependent variable is AV_TOTAL, which is the price of each apartment unit.
Distances to the nearest park/highway are expressed in meters.
U_NUM_PARKS/U_FPLACE (presence of parking and a fireplace) are included as dummy variables.
1) Linear-Linear Model --> Results Model 1
lm(AV_TOTAL ~ LIVINGA_AREAM2 + NUM_FLOORS +
U_BASE_FLO + U_BDRMS + factor(U_NUM_PARK) + DIST_PARKS +
DIST_HIGHdiff + DIST_BIGDIG, data = data)
Residuals Model 1
2) Log-linear Model --> Results Model 2
lm(log(AV_TOTAL) ~ LIVINGA_AREAM2 + NUM_FLOORS +
U_BASE_FLO + U_BDRMS + factor(U_NUM_PARK) + DIST_PARKS + DIST_HIGHdiff + DIST_BIGDIG, data = data)
Residuals Model 2
3) Log-Log Model --> Results Model 3
lm(formula = log(AV_TOTAL) ~ log(LIVINGA_AREAM2) + NUM_FLOORS +
U_BASE_FLO + log(U_BDRMS) + factor(U_NUM_PARK) + log(DIST_PARKS) +
log(DIST_HIGHdiff) + log(DIST_BIGDIG), data = data)
Residuals Model 3
All three models have a fairly good R^2, while the residual plots look closer to a normal distribution for Models 2 and 3.
I can't figure out what the difference is between Models 2 and 3, especially when interpreting the variable DIST_PARKS (distance from parks), or which of the two is the more appropriate model.
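For the interpretation, the usual reading is that in the log-linear model the DIST_PARKS coefficient is an approximate proportional change in price per additional meter, while in the log-log model it is an elasticity (percent change in price per one percent change in distance). A small sketch on simulated data, where the true elasticity of -0.10 and all values are assumptions chosen for illustration:

```r
set.seed(7)
n <- 200

# Simulated stand-in data (hypothetical): price falls with distance, elasticity -0.10
DIST_PARKS <- runif(n, 50, 2000)
AV_TOTAL <- exp(12 - 0.10 * log(DIST_PARKS) + rnorm(n, sd = 0.05))

m2 <- lm(log(AV_TOTAL) ~ DIST_PARKS)       # log-linear
m3 <- lm(log(AV_TOTAL) ~ log(DIST_PARKS))  # log-log

# Log-linear: percent change in price for one extra meter
pct_per_metre <- 100 * (exp(coef(m2)["DIST_PARKS"]) - 1)

# Log-log: percent change in price for a 1% increase in distance
elasticity <- coef(m3)["log(DIST_PARKS)"]

print(pct_per_metre)
print(elasticity)
```

Which form is "more correct" depends on whether the effect of distance is expected to be constant per meter or constant per percentage change, so residual plots and subject knowledge both matter here.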

specifying multiple random effects in R lmer (translating from HLM model)

I'm attempting to "translate" a model run in HLM7 software to R lmer syntax.
This is from the now-ubiquitous "Math achievement" dataset. The outcome is math achievement score, and in the dataset there are various student-level predictors (such as minority status, SES, and whether or not the student is female) and various school level predictors (such as Catholic vs. Public).
The only predictors in the model I want to fit are student-level predictors, which have all been group-mean centered to deal with dummy variables (aside: contrast codes are better). The students are nested in schools, so we should (I think) have random effects specified for all of the components of the model.
Here is the HLM model:
Level-1 Model
(note: all predictors at level one are group mean centered)
MATHACHij = β0j + β1j*(MINORITYij) + β2j*(FEMALEij) + β3j*(SESij) + rij
Level-2 Models
β0j = γ00 + u0j
β1j = γ10 + u1j
β2j = γ20 + u2j
β3j = γ30 + u3j
Mixed Model
MATHACHij = γ00 + γ10*MINORITYij + γ20*FEMALEij + γ30*SESij + u0j + u1j*MINORITYij + u2j*FEMALEij + u3j*SESij + rij
Translating it to lmer syntax, I try:
(note: _gmc means the variable has been group mean centered, the grouping factor is "school_id")
model1<-lmer(mathach~minority_gmc+female_gmc+ses_gmc+(minority_gmc|school_id)+(female_gmc|school_id)+(ses_gmc|school_id), data=data, REML=F)
When I run this model I get results that don't mesh with the HLM results. Am I specifying the random effects incorrectly?
Thanks!
When you specify your random-effects structure, you can include all the random effects for one grouping factor in a single set of parentheses. While this may not resolve the discrepancies in your results, I believe the appropriate random-effects syntax for your model is this:
lmer(mathach~minority_gmc + female_gmc + ses_gmc + (1 + minority_gmc + female_gmc + ses_gmc |school_id), data=data, REML=F)
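To see how the two specifications differ without fitting anything, one can parse both formulas with lme4's lFormula() and inspect the random-effects terms. With three separate terms, each term carries its own random intercept and a 2x2 covariance block (3 parameters each); the single term gives one 4x4 covariance matrix (10 parameters) in which all slopes and the intercept can covary. The data below are simulated placeholders, not the Math achievement data.

```r
library(lme4)
set.seed(1)

# Hypothetical data: 20 schools with 15 students each
d <- data.frame(school_id = factor(rep(1:20, each = 15)),
                minority_gmc = rnorm(300),
                female_gmc = rnorm(300),
                ses_gmc = rnorm(300))
d$mathach <- rnorm(300)

# Specification from the question: three separate random-effects terms
separate <- lFormula(mathach ~ minority_gmc + female_gmc + ses_gmc +
                       (minority_gmc | school_id) + (female_gmc | school_id) +
                       (ses_gmc | school_id), data = d)

# Specification from the answer: one term with intercept and all three slopes
joint <- lFormula(mathach ~ minority_gmc + female_gmc + ses_gmc +
                    (1 + minority_gmc + female_gmc + ses_gmc | school_id),
                  data = d)

# Columns in each random-effects term (note the repeated "(Intercept)")
print(separate$reTrms$cnms)
print(joint$reTrms$cnms)

# Number of covariance parameters in each specification
print(length(separate$reTrms$theta))
print(length(joint$reTrms$theta))
```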

roc curve for bayesian logistic regression

Can anyone help me produce a ROC curve for a Bayesian logistic regression? I have been trying DPpackage, but either I am doing something wrong or it just doesn't work.
The two models I want to compare using ROC curves are shown below:
bayes_mod=MCMClogit(Default ~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr, data=mydata, burnin=500000,mcmc=10000, tune=0.6,b0=coef(mylogit.reduced),B0=information2, subset=c(-1772,-2064,-655))
bayes_mod1=MCMClogit(Default ~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr, data=mydata, burnin=500000,mcmc=10000,tune=0.6,subset=c(-1772,-2064,-655))
where Default ~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr is the model formula; mydata is the data set; mylogit.reduced is a logistic regression estimated before the Bayesian models; B0 is the prior matrix for the coefficients (note that MCMClogit treats B0 as a prior precision rather than a covariance); and subset=c(...) lists the excluded observations.
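I don't know this package well, but as far as I can tell MCMClogit returns the posterior sample as an mcmc object rather than a fitted model with a predict method. You can compute the predicted probabilities yourself (here from the posterior mean coefficients, with mytestdata as your test set) and then pass them to a ROC function like pROC:
library(pROC)
# Design matrix for the test data, using the same formula as the model
X <- model.matrix(~ ACTIVITY + CIF + MAN + STA + PIA + COL + CurrLiq + DebtCov + GDPgr, data = mytestdata)
# Posterior mean coefficients -> linear predictor -> probabilities
predictions <- plogis(drop(X %*% colMeans(bayes_mod)))
roc(mytestdata$Default, predictions)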
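A ROC curve only needs one score per observation, so another option is to average the predicted probability over all posterior draws and compute the AUC directly. The sketch below is self-contained in base R: the matrix draws is a simulated stand-in for the posterior sample (an assumption, not MCMClogit output), and auc_rank is a small helper written for this example.

```r
set.seed(42)
n <- 300

# Simulated data and a stand-in "posterior sample" centered on the glm estimate
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)
X <- model.matrix(fit)
S <- 1000
draws <- sweep(matrix(rnorm(S * 2), S, 2) %*% chol(vcov(fit)), 2, coef(fit), "+")

# Posterior mean predicted probability: average plogis(X beta) over the draws
p_hat <- rowMeans(plogis(X %*% t(draws)))

# AUC from the rank-sum (Mann-Whitney) formula; ties receive average ranks
auc_rank <- function(y, score) {
  r <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc <- auc_rank(y, p_hat)
print(auc)
```

Computing the AUC for each model on the same held-out observations gives a simple basis for comparing the two fits.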
