Why does ivreg function generate strange error

I'm trying to use instrumental variables for the following three variables: pwtopen, inc and incsqr, with the following three instruments: elhsfs, incf and incfsqr. polity is an exogenous variable.
answer<- ivreg(sulfdm ~ polity + pwtopen + inc + incsqr|polity + elhsfs + incf + incfsqr,
mydata)
I am then getting the error message:
Error in ivreg(sulfdm ~ polity + pwtopen + inc + incsqr | polity + elhsfs + :
length(formula)[1] == 1L is not TRUE
Any thoughts? Thanks

You should name the arguments in this case. The signature is:
ivreg(formula, instruments, data, subset, na.action, weights, offset,
contrasts = NULL, model = TRUE, y = TRUE, x = FALSE, ...)
Because you passed mydata without a name, it was matched by position to the second argument, instruments, rather than to data.
So, this should solve your problem:
answer<- ivreg(sulfdm ~ polity + pwtopen + inc + incsqr|polity + elhsfs + incf + incfsqr,
data = mydata)
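
The positional matching described above can be reproduced without ivreg itself. The sketch below uses a hypothetical stand-in function, show_matching, which only copies the first three argument names of ivreg and reports which of them received a value; the toy data frame is also made up for illustration.

```r
# Hypothetical stand-in: same leading argument names as ivreg(),
# but it only reports which arguments were supplied.
show_matching <- function(formula, instruments, data, ...) {
  c(instruments_supplied = !missing(instruments),
    data_supplied = !missing(data))
}

toy <- data.frame(y = 1:3, x = 1:3, z = 1:3)  # made-up data

# Unnamed second argument: the data frame lands in `instruments`.
positional <- show_matching(y ~ x | z, toy)

# Named argument: the data frame lands in `data`.
named <- show_matching(y ~ x | z, data = toy)

print(positional)
print(named)
```

In the unnamed call the data frame fills the instruments slot and data stays empty, which is the situation that leads to the error in the question.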

I faced the same issue. Rewriting the command as below solved it:
answer <- ivreg(sulfdm ~ polity + pwtopen + inc + incsqr, ~polity + elhsfs + incf + incfsqr, mydata)

Related

How can I solve "invalid 'times' value" error in tidymv package in R?

I am using a generalized additive model (gam) in R and trying to use the "tidymv" package for its visualisation. My gam model is the following:
gam_reg <-
gam(
asinhnondurable ~ as.factor(CountyName) + zero5 + m611 + f611 + m1217 + f1217 +
m1864Illiterate + f1864Illiterate + m1864Primary +
f1864Primary + m1864Secondary + f1864Secondary +
m1864University + f1864University + AgeCtgover65 +
car + HomeOwner + Land + Livestock + Year +
Year * as.factor(CountyName) + Year2 * as.factor(CountyName) +
s(tempshock) + s(rainfallshock),
family = gaussian,
data = data,
weights = data$Weight
)
And then I use the following code for visualisation:
model_p <- predict_gam(gam_reg)
but I get the following error:
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
invalid 'times' value
Does anyone know what the problem is and how I could solve the problem?

String formula for linear models

I'm trying to create a string formula with the independent variables that are significant within my linear model, though I'm finding it difficult trying to include the + at the end of each variable.
I have tried:
as.formula(sprintf("encounter ~ %s",
names(tbest$model)[-1]))
However, this only gives the first variable:
encounter ~ open_shrubland
Warning message:
Using formula(x) is deprecated when x is a character vector of length > 1.
Consider formula(paste(x, collapse = " ")) instead.
How would I include all of them such that: encounter ~ X1 + X2 + X3 ..., also, can this be made functional, such that if I wanted to remove a variable, I would only have to do my.formula[-3] to remove it?
list of variable names:
c("open_shrubland", "Appalachian_Mountains", "Boreal_Hardwood_Transition",
"Central_Hardwoods", "Piedmont", "wetland", "Badlands_And_Prairies",
"Peninsular_Florida", "Central_Mixed_Grass_Prairie", "water",
"New_England_Mid_Atlantic_Coast", "grassland", "mixed_forest",
"cropland", "Oaks_And_Prairies", "Eastern_Tallgrass_Prairie",
"evergreen_needleleaf", "year", "pland_change", "evergreen_broadleaf",
"Southeastern_Coastal_Plain", "Prairie_Potholes", "Shortgrass_Prairie",
"urban", "Prairie_Hardwood_Transition", "Lower_Great_Lakes_St.Lawrence_Plain",
"mosaic", "Mississippi_Alluvial_Valley", "deciduous_broadleaf",
"deciduous_needleleaf", "barren")

Using reformulate will be helpful.
reformulate(names(tbest$model)[-1], 'encounter')
If the list of variable names is in x:
reformulate(x, 'encounter')
encounter ~ open_shrubland + Appalachian_Mountains + Boreal_Hardwood_Transition +
Central_Hardwoods + Piedmont + wetland + Badlands_And_Prairies +
Peninsular_Florida + Central_Mixed_Grass_Prairie + water +
New_England_Mid_Atlantic_Coast + grassland + mixed_forest +
cropland + Oaks_And_Prairies + Eastern_Tallgrass_Prairie +
evergreen_needleleaf + year + pland_change + evergreen_broadleaf +
Southeastern_Coastal_Plain + Prairie_Potholes + Shortgrass_Prairie +
urban + Prairie_Hardwood_Transition + Lower_Great_Lakes_St.Lawrence_Plain +
mosaic + Mississippi_Alluvial_Valley + deciduous_broadleaf +
deciduous_needleleaf + barren

We can create a formula with paste:
as.formula(paste('encounter ~', paste(names(tbest$model)[-1], collapse = " + ")))
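
Neither answer covers the second part of the question, dropping a variable later on. A minimal sketch, assuming the variable names are kept in a character vector (shortened here to three names from the question's list): index the vector before calling reformulate, or remove a term from an existing formula with update.

```r
# Shortened version of the question's variable list (for illustration only).
vars <- c("open_shrubland", "wetland", "year")

# Full formula built from the character vector.
f_all <- reformulate(vars, response = "encounter")

# Drop the third variable by indexing the vector before building the formula.
f_drop <- reformulate(vars[-3], response = "encounter")

# Alternatively, remove a term by name from an existing formula.
f_upd <- update(f_all, . ~ . - wetland)

print(f_all)
print(f_drop)
print(f_upd)
```

Keeping the names in a vector and rebuilding the formula is usually easier than indexing the formula object itself, because a formula is stored as a nested call rather than a flat list of terms.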

Add column in regression output

I would like to add a column with the X² (Chi-Square) as well as a column with the Exp(B) to my regression output. Any idea how to do this? Many thanks in advance. For now I have calculated this manually for every model and variable, which is quite time-consuming.
model_simple <- as.formula("completion_yesno ~ ac + ov + UCRate + FirstWeek + LastWeek + DayofWeekSu + DayofWeekMo + DayofWeekTu + DayofWeekWe + DayofWeekTh + DayofWeekFr + MonthofYearJan + MonthofYearFeb + MonthofYearMar + MonthofYearApr +MonthofYearMay+ MonthofYearJun + MonthofYearJul + MonthofYearAug + MonthofYearSep + MonthofYearOct + MonthofYearNov")
clog_simple1 = glm(model_simple,data=cllw,family = binomial(link = cloglog))
summary(clog_simple1)

Maybe you can elaborate on what you mean by chi-square and exp(B). You can do the below:
da <- MASS::Pima.tr
model <- glm(type ~ .,data=da,family = binomial(link = cloglog))
results <- data.frame(coefficients(summary(model)),check.names=FALSE)
# some random values
results$chisq = rchisq(nrow(results),1)
results$expB = exp(results$Estimate)
Or you can use tidy from broom:
library(broom)
results = tidy(model)
# tidy() names the coefficient column "estimate" (lower case)
results$expB = exp(results$estimate)
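
If "chi-square" in the question means the Wald chi-square of each coefficient (an assumption, since the asker did not say), it is the square of the z value already reported by summary, so the random placeholder above can be replaced. A sketch on the same Pima.tr example data:

```r
# Same example data and model as in the answer above.
da <- MASS::Pima.tr
model <- glm(type ~ ., data = da, family = binomial(link = "cloglog"))

# Coefficient table; check.names = FALSE keeps names such as "z value".
results <- data.frame(coefficients(summary(model)), check.names = FALSE)

# Wald chi-square (1 df) is the squared z statistic; assumes this is
# the chi-square the asker wants.
results$chisq <- results[["z value"]]^2

# Exponentiated coefficients.
results$expB <- exp(results$Estimate)

print(round(results, 4))
```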

upper scope has term ‘NA’ not included in model

I am working on a data set and would like to do step wise logistic regression using some variables and to do so I am using the add1() function in R. A sample of the data set can be downloaded from the link here: https://drive.google.com/file/d/0B0N-Nc7kEi4bVjhDd1FDaEE5cEE/view?usp=sharing
I thereby fit a logistic regression using:
train <- read.csv('training.csv')
glm.model_step_1 <- glm(loan_status ~ acc_open_past_24mths + annual_inc + avg_cur_bal + bc_open_to_buy + delinq_2yrs + dti + inq_last_6mths + installment + int_rate + mo_sin_old_il_acct + mo_sin_old_rev_tl_op + mo_sin_rcnt_rev_tl_op + mo_sin_rcnt_tl + mort_acc + mths_since_last_delinq + mths_since_recent_bc + mths_since_recent_inq + num_accts_ever_120_pd + num_actv_bc_tl + num_actv_rev_tl + num_bc_tl + num_il_tl + num_op_rev_tl + num_tl_op_past_12m + pct_tl_nvr_dlq + percent_bc_gt_75 + pub_rec_bankruptcies + revol_bal + revol_util + term + total_acc + total_bc_limit + total_il_high_credit_limit + fico_mean + addr_state + emp_length + verification_status + Count_NA + Info_missing + Engineer + Teacher + Doctor + Professor + Manager + Director + Analyst + senior + lead + consultant + home_ownership_own + home_ownership_rent + purpose_debt_consolidation + purpose_medical + purpose_credit_card + purpose_other,
data = train,
family = binomial(link = 'logit'))
And use the add1() function to do a forward selection.
add1(glm.model_step_1, scope = train)
This code does not work. I get the below error:
Error in factor.scope(attr(terms1, "factors"), list(add = attr(terms2, :
upper scope has term ‘NA’ not included in model
Does anyone know how to solve this error?
A question asked previously on datascience.stackexchange (https://datascience.stackexchange.com/questions/11604/checking-regression-coefficients-stability) mentioned checking for NAs. There aren't any NAs in the data set, which can be confirmed by running sapply(train, function(x) sum(is.na(x)))

The train dataset from @Jash Sash contains some anomalous values that force read.csv to read some numeric variables as factors with many levels.
Anyway, I consider here a model with only a few variables in order to show how to avoid the error message reported above.
Remember that the scope argument must be a "formula giving the terms to be considered for adding or dropping"; it cannot be a data.frame as in the code of @Jash Sash.
train <- read.csv('training.csv')
# flag the columns that were read in as factors (apply() would coerce to a matrix first)
is_factor <- sapply(train, is.factor)
glm.model_step_1 <- glm(loan_status ~ acc_open_past_24mths + avg_cur_bal + bc_open_to_buy,
data = na.omit(train),
family = binomial(link = 'logit'))
add1(glm.model_step_1, scope=~.+delinq_2yrs+inq_last_6mths+int_rate)
The result is:
Model:
loan_status ~ acc_open_past_24mths + avg_cur_bal + bc_open_to_buy
               Df Deviance    AIC
<none>              1038.6 1046.6
delinq_2yrs     1   1037.9 1047.9
inq_last_6mths  1   1038.0 1048.0
int_rate        1   1038.0 1048.0
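
When the intent is to offer every column of the data frame as a candidate term, as scope = train in the question suggests, the scope formula can be built from the column names. The sketch below uses simulated data, so the names toy, x1, x2, x3 and y are made up for illustration.

```r
# Simulated stand-in for the training data (names are hypothetical).
set.seed(1)
toy <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
toy$y <- rbinom(200, 1, plogis(0.5 * toy$x1 - 0.5 * toy$x2))

# Starting model with a single predictor.
base_fit <- glm(y ~ x1, data = toy, family = binomial(link = "logit"))

# Build a one-sided scope formula from every column except the response.
scope <- reformulate(setdiff(names(toy), "y"))

# add1() evaluates each term in the scope that is not yet in the model.
steps <- add1(base_fit, scope = scope, test = "Chisq")
print(steps)
```

Passing a formula built this way keeps the convenience of "use all columns" while giving add1 the type of scope argument it expects.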

Trying to run glmulti with quasipoisson

I have tried running the following command:
modruns2 <- glmulti(SRI ~ TVIS + SNH3000 + BU1000 + TBE250 + TS3000 + TAB3000 +
PELHL1000 + SR250 + SW250, data = data1, family = quasipoisson,
glmulti-cvalue = 6.219132, level =1, maxsize = 4, crit = qaicc)
And it comes up with the error:
Error: unexpected '=' in "modruns2 <- glmulti(SRI ~ TVIS + SNH3000 + BU1000 + TBE250 +
TS3000 + TAB3000 + PELHL1000 + SR250 + SW250, data = data1, family = quasipoisson, glmulti-
cvalue ="
I have been trying to follow http://cran.r-project.org/web/packages/glmulti/glmulti.pdf & http://vcalcagnoresearch.wordpress.com/package-glmulti/
But it seems I may be specifying the c-hat incorrectly.
Any help?

Install the package named R.utils, which provides setOption().
What LyzandeR suggested was correct, but instead of
setOption('glmulti-cvalue'=6.219132)
use
setOption('glmulti-cvalue',6.219132)
I had the same issue and found the answer in an example on this page.
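
The original "unexpected '='" message is a syntax error rather than a glmulti problem: glmulti-cvalue contains a hyphen, so R reads it as a subtraction and cannot accept it as an argument name. A small base-R sketch that only parses the two spellings without running them (f is a hypothetical placeholder function name):

```r
# Unquoted, hyphenated "argument name": R sees `glmulti - cvalue` followed
# by `=`, which is not valid syntax, so parsing fails.
bad <- try(parse(text = "f(glmulti-cvalue = 6.219132)"), silent = TRUE)
print(inherits(bad, "try-error"))

# Option name passed as a quoted string, value as a second argument:
# this parses without error.
ok <- parse(text = "f('glmulti-cvalue', 6.219132)")
print(ok)
```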
