String formula for linear models - r

I'm trying to build a formula from the names of the independent variables that are significant in my linear model, but I can't work out how to put the + between the variables.
I have tried:
as.formula(sprintf("encounter ~ %s",
However, this only gives the first variable:
encounter ~ open_shrubland
Warning message:
Using formula(x) is deprecated when x is a character vector of length > 1.
Consider formula(paste(x, collapse = " ")) instead.
How would I include all of them, so that the formula reads encounter ~ X1 + X2 + X3 ...? Also, can this be made flexible, so that if I wanted to remove a variable I would only have to do something like my.formula[-3]?
List of variable names:
c("open_shrubland", "Appalachian_Mountains", "Boreal_Hardwood_Transition",
"Central_Hardwoods", "Piedmont", "wetland", "Badlands_And_Prairies",
"Peninsular_Florida", "Central_Mixed_Grass_Prairie", "water",
"New_England_Mid_Atlantic_Coast", "grassland", "mixed_forest",
"cropland", "Oaks_And_Prairies", "Eastern_Tallgrass_Prairie",
"evergreen_needleleaf", "year", "pland_change", "evergreen_broadleaf",
"Southeastern_Coastal_Plain", "Prairie_Potholes", "Shortgrass_Prairie",
"urban", "Prairie_Hardwood_Transition", "Lower_Great_Lakes_St.Lawrence_Plain",
"mosaic", "Mississippi_Alluvial_Valley", "deciduous_broadleaf",
"deciduous_needleleaf", "barren")

reformulate() builds a formula directly from a character vector of term labels:
reformulate(names(tbest$model)[-1], 'encounter')
If the variable names are in a character vector x:
reformulate(x, 'encounter')
encounter ~ open_shrubland + Appalachian_Mountains + Boreal_Hardwood_Transition +
Central_Hardwoods + Piedmont + wetland + Badlands_And_Prairies +
Peninsular_Florida + Central_Mixed_Grass_Prairie + water +
New_England_Mid_Atlantic_Coast + grassland + mixed_forest +
cropland + Oaks_And_Prairies + Eastern_Tallgrass_Prairie +
evergreen_needleleaf + year + pland_change + evergreen_broadleaf +
Southeastern_Coastal_Plain + Prairie_Potholes + Shortgrass_Prairie +
urban + Prairie_Hardwood_Transition + Lower_Great_Lakes_St.Lawrence_Plain +
mosaic + Mississippi_Alluvial_Valley + deciduous_broadleaf +
deciduous_needleleaf + barren
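On the second part of the question: indexing the formula object itself does not drop a term, but the character vector it is built from can be indexed freely. A minimal sketch, using a shortened subset of the variable names above; vars and make_formula are illustrative names, not part of the original post.

```r
# Shortened subset of the variable names from the question
vars <- c("open_shrubland", "Appalachian_Mountains", "wetland", "water")

# Hypothetical helper: build encounter ~ v1 + v2 + ... from a character vector
make_formula <- function(v, response = "encounter") {
  reformulate(v, response = response)
}

f_all  <- make_formula(vars)      # all four predictors
f_drop <- make_formula(vars[-3])  # same formula without the third one ("wetland")

print(f_all)
print(f_drop)
```

Keeping the variable names in a vector and rebuilding the formula on demand means the model call never has to be edited by hand.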

We can also create the formula with paste and as.formula:
as.formula(paste('encounter~', paste(names(tbtest$model)[-1], collapse = "+")))


This is an event-study difference-in-differences regression. How can this Stata code be translated to R?

reg laccidentsvso2 weakban strongban lpop lunemp permale2 lrgastax laccidentmv2 st1-st50 t1-t48 time stt1-stt50 [aweight=pop],cluster(state)
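Stata's regress corresponds to lm in R, but the command needs a few adjustments. Stata varlist ranges such as st1-st50 have to be expanded by hand (in an R formula, - removes a term), [aweight=pop] becomes weights = pop, and cluster(state) is not a model term but a request for cluster-robust standard errors, which the sandwich and lmtest packages provide. Assuming the dummies are numbered consecutively (st1, ..., st50 and so on) and the data frame is called rstata, the following should give the same coefficients and closely matching standard errors:
library(sandwich)
library(lmtest)
rhs <- c("weakban", "strongban", "lpop", "lunemp", "permale2", "lrgastax", "laccidentmv2", paste0("st", 1:50), paste0("t", 1:48), "time", paste0("stt", 1:50))  # Stata ranges expanded
model <- lm(reformulate(rhs, response = "laccidentsvso2"), data = rstata, weights = pop)
coeftest(model, vcov = vcovCL(model, cluster = ~state))  # standard errors clustered by state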
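One detail worth checking when porting Stata varlists: in an R formula the minus sign removes a term, so a range such as st1-st50 is not expanded into fifty variables. The sketch below illustrates this with a toy three-dummy range (y, x and st1 to st3 are placeholder names, not the variables from the question) and shows one way to build the expanded right-hand side with paste0 and reformulate.

```r
# Written the Stata way: R reads this as "x + st1, minus the term st3"
f_wrong <- y ~ x + st1-st3
labels_wrong <- attr(terms(f_wrong), "term.labels")
print(labels_wrong)   # st2 and st3 are not in the model

# Expanding the range explicitly
rhs <- c("x", paste0("st", 1:3))
f_right <- reformulate(rhs, response = "y")
labels_right <- attr(terms(f_right), "term.labels")
print(labels_right)
```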

r glmmLasso : Error in n %*% s (glmm.rmd#64): requires numeric/complex matrix/vector arguments

I am trying to fit a glmmLasso model using this code:
lasso <- glmmLasso(stars ~ ADJRIND + AUC + KAPPA + DICE + ICCORR + JACRD + MUTINF + RNDIND + SURFOVLP + SURFDICE + VOLSMTY + HDRFDST + AVGDIST + MAHLNBS + VARINFO + GCOERR + PROBDST + SNSVTY + SPCFTY + PRCISON + RECALL + FMEASR + ACURCY + FALLOUT + TP + FP + TN + FN + GTVOL + SEGVOL, rnd = ~1|participant, family = acat(), data = mixedModel_df, lambda=10, switch.NR=TRUE, control=list(print.iter=TRUE))
I get the following error message:
Error in n %*% s (glmm.rmd#64): requires numeric/complex matrix/vector arguments
All the IVs and my DV are doubles; only my random-effect variable is a factor.
Also, what should I specify for family?
As requested, here is the head of my data frame:
I somehow got a calculation running by dropping some predictors, switching the family, and removing the random effects for the moment:
glm3 <- glmmLasso(stars ~ ADJRIND + AUC + KAPPA + DICE + ICCORR + JACRD + MUTINF + RNDIND + SURFOVLP + SURFDICE + VOLSMTY + HDRFDST + AVGDIST + MAHLNBS + VARINFO + GCOERR + PROBDST + SNSVTY + SPCFTY + PRCISON + RECALL + FMEASR + ACURCY + FALLOUT, rnd = NULL, family = cumulative(), data = mixedModel_df, lambda=10, switch.NR=TRUE, control=list(print.iter=TRUE))
However, the calculation has now been stuck at iteration 89 for 30 minutes.
I have trouble deciding which family is correct for my problem, though that question is probably better suited for CV. How do I correctly specify my random effects? My final model will have two.

upper scope has term ‘NA’ not included in model

I am working on a data set and would like to do stepwise logistic regression on some of its variables, using the add1() function in R. A sample of the data set can be downloaded from the link here:
I first fit a logistic regression using:
train <- read.csv('training.csv')
glm.model_step_1 <- glm(loan_status ~ acc_open_past_24mths + annual_inc + avg_cur_bal + bc_open_to_buy + delinq_2yrs + dti + inq_last_6mths + installment + int_rate + mo_sin_old_il_acct + mo_sin_old_rev_tl_op + mo_sin_rcnt_rev_tl_op + mo_sin_rcnt_tl + mort_acc + mths_since_last_delinq + mths_since_recent_bc + mths_since_recent_inq + num_accts_ever_120_pd + num_actv_bc_tl + num_actv_rev_tl + num_bc_tl + num_il_tl + num_op_rev_tl + num_tl_op_past_12m + pct_tl_nvr_dlq + percent_bc_gt_75 + pub_rec_bankruptcies + revol_bal + revol_util + term + total_acc + total_bc_limit + total_il_high_credit_limit + fico_mean + addr_state + emp_length + verification_status + Count_NA + Info_missing + Engineer + Teacher + Doctor + Professor + Manager + Director + Analyst + senior + lead + consultant + home_ownership_own + home_ownership_rent + purpose_debt_consolidation + purpose_medical + purpose_credit_card + purpose_other,
data = train,
family = binomial(link = 'logit'))
I then use the add1() function to do forward selection:
add1(glm.model_step_1, scope = train)
This code does not work. I get the error below:
Error in factor.scope(attr(terms1, "factors"), list(add = attr(terms2, :
upper scope has term ‘NA’ not included in model
Does anyone know how to solve this error?
A question asked previously on datascience.stackexchange mentioned checking for NAs. There aren't any NAs in the data set, which can be confirmed by running sapply(train, function(x) sum(
The train dataset of @Jash Sash contains some anomalous values that force read.csv to read some numerical variables as factors with many levels.
Here I consider a model with only a few variables, to show how to avoid the error message reported above.
Remember that the scope argument must be a "formula giving the terms to be considered for adding or dropping"; it cannot be a data.frame, as in the code of @Jash Sash.
train <- read.csv('training.csv')
is_factor <- sapply(train, is.factor)  # flags the columns that were read in as factors
glm.model_step_1 <- glm(loan_status ~ acc_open_past_24mths + avg_cur_bal + bc_open_to_buy,
data = na.omit(train),
family = binomial(link = 'logit'))
add1(glm.model_step_1, scope=~.+delinq_2yrs+inq_last_6mths+int_rate)
The result is:
loan_status ~ acc_open_past_24mths + avg_cur_bal + bc_open_to_buy
               Df Deviance    AIC
<none>              1038.6 1046.6
delinq_2yrs     1   1037.9 1047.9
inq_last_6mths  1   1038.0 1048.0
int_rate        1   1038.0 1048.0
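If every remaining column of the data frame should be a candidate, the scope formula can be generated from the column names instead of typed out. A minimal sketch on simulated data; d, x1 to x3 and y are made-up names, not columns from training.csv.

```r
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 * d$x1 - 0.5 * d$x2))

# Starting model with a single predictor
fit <- glm(y ~ x1, data = d, family = binomial(link = "logit"))

# Upper scope as a one-sided formula built from the column names,
# i.e. a formula rather than the data frame itself
upper <- reformulate(setdiff(names(d), "y"))

res <- add1(fit, scope = upper)
print(res)
```

add1() only lists terms that are in the scope but not yet in the model, so x1 does not appear among the candidates.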

Error in neurons[[i]] %*% weights[[i]] : requires numeric/complex matrix/vector arguments

I encountered this error while trying to train a neural network with the R neuralnet package. I have 716 input variables that are either 1 or 0, and I am trying to predict a column that is also 1 or 0. From what I have found so far, this error appears to be caused by a non-numeric value, but all my values are numeric, so what could the problem be?
Here is the code:
train <- read.csv("training.csv")
n <- names(train)
f <- as.formula(paste("conclusion ~", paste(n[!n %in% "conclusion"], collapse = " + ")))
nn <- neuralnet(f, data=train, hidden = 1, threshold=0.4)
Here is the output of paste
conclusion ~ not + work + virus + me + have + down + sick + outbreak +
tabs + pill + pills + dont + do + n.t + fall + falling +
fell + study + school + fever + again + symptom + why + bird +
bad + shot + ill + return + cough + throat + soar + itch...........
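One way to test the assumption that every column really is numeric after read.csv is to inspect the column classes directly. This is only a diagnostic sketch and may not explain the error; the toy data frame stands in for training.csv, and the fever column with its stray entry is invented for illustration.

```r
# Toy stand-in for the data read from training.csv
train <- data.frame(
  conclusion = c(1, 0, 1),
  virus      = c(0, 1, 1),
  fever      = c("1", "0", "x"),  # a single stray entry makes the whole column character
  stringsAsFactors = FALSE
)

# TRUE for columns stored as numeric, FALSE otherwise
is_num <- vapply(train, is.numeric, logical(1))

# Names of the columns that are not numeric
bad_cols <- names(train)[!is_num]
print(bad_cols)
```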

Error in the class of random forest

I'm feeding a new data set to a random forest prediction model and get this error:
Error in checkData(oldData, RET) :
Classes of new data do not match original data
Here's the code:
fit1 <- cforest((b == 'three')~ affect+ certain+ negemo+ future+swear+sad
+negate+ppron+sexual+death + filler+leisure + conj+ funct + i
+future + past + bio + body+cause + cogmech + death +
discrep + future +incl + motion + quant + sad + tentat + excl+insight +percept +posemo
+ppron +quant + relativ + space + article + age + s_all + s_sad + gender
, data = trainset1,
controls=cforest_unbiased(ntree=500, mtry= 1))
testset2$pre_swl<-predict(fit1, newdata=testset2 , type='response')
Both the training set and the test set are data frames.
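Since the message complains about classes, a reasonable first step is to compare the class of each shared column in the two data frames. The sketch below does this on toy data; trainset, testset and their columns are made up, and differing factor levels, which this check does not cover, may also be worth inspecting.

```r
# Toy data: age is numeric in the training set but character in the test set
trainset <- data.frame(age = c(21, 35, 40), gender = factor(c("f", "m", "f")))
testset  <- data.frame(age = c("30", "28", "51"), gender = factor(c("m", "f", "m")),
                       stringsAsFactors = FALSE)

# Columns present in both data frames
common <- intersect(names(trainset), names(testset))

# First class of each shared column
cls_train <- vapply(trainset[common], function(col) class(col)[1], character(1))
cls_test  <- vapply(testset[common],  function(col) class(col)[1], character(1))

# Columns whose class differs between the two sets
mismatch <- common[cls_train != cls_test]
print(mismatch)
```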
