Can glmnet perform logistic regression? - r

I am using glmnet to perform LASSO on binary logistic models, with cv.glmnet to test selection consistency, and I would like to compare its performance with plain glm logistic regression. For fairness in comparing the outputs, I would like to use glmnet to perform this regression too; however, the only way I can find is to call glmnet with alpha = 0 and lambda = 0 (ridge with no penalty). I am unsure about this method: it seems slightly hacky, glmnet's manual discourages supplying a single lambda value (for speed reasons), and it gives me no z-values with which to judge the confidence level of the coefficients. (Essentially, the ideal output would be something similar to that of R's glm function.)
I've read through the entire manual and can't find a method of doing this. Is there a way to perform logistic regression with glmnet, without the penalty, so as to get output similar to R's glm?
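A minimal sketch of the workaround described above (assuming the glmnet package is installed, with simulated data): with lambda = 0 the penalty vanishes, so the coefficients should essentially match glm(), though glmnet still reports no standard errors or z-values.

```r
# Sketch (assumes the glmnet package is installed): an unpenalized logistic
# fit via glmnet, compared against glm(). The data here are simulated.
library(glmnet)

set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)
y <- rbinom(200, 1, plogis(x %*% c(1, -0.5, 0)))

fit_glmnet <- glmnet(x, y, family = "binomial", lambda = 0)  # no penalty
fit_glm    <- glm(y ~ x, family = binomial)

# Coefficients should agree closely, but only glm() gives std. errors/z-values
cbind(glmnet = as.vector(coef(fit_glmnet)), glm = as.numeric(coef(fit_glm)))
```

For inference (z-values, p-values), summary(fit_glm) remains the natural tool; the glmnet fit only serves to confirm that the two estimators land in the same place when the penalty is zero.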

Related

How to find the best fitted models using forward, backward and stepwise selection in Poisson regression using R programming?

I am using the regsubsets method for linear regression, and came across the step() method for selecting columns for logistic regression models.
I am not sure whether we can use regsubsets or step() for Poisson regression. It would be helpful if there is a method to find the best subsets for Poisson regression in R.
From here it looks like the options include:
step() (base R: works on glm objects -> includes Poisson regression models)
bestglm package
glmulti package
Possibly others.
Be prepared for the GLM (Poisson etc.) case to be much slower than the analogous problem for Gaussian responses (OLS/lm).
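A minimal base-R sketch of the step() option for the Poisson case (data simulated for illustration):

```r
# AIC-based stepwise selection on a Poisson GLM with base R's step().
set.seed(42)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- rpois(100, lambda = exp(0.5 + d$x1))  # only x1 truly matters

full <- glm(y ~ x1 + x2 + x3, family = poisson, data = d)
sel  <- step(full, direction = "both", trace = FALSE)  # same call as for lm/glm
formula(sel)  # x1 should survive; x2/x3 will usually be dropped
```

Note this is AIC-based, not exhaustive best-subsets; for the latter, the bestglm and glmulti packages mentioned above are the places to look.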

ROCR Package - Classification algo other than logistic regression

I was referring to this link, which explains the usage of the ROCR package for plotting ROC curves and other related accuracy metrics. The author mentions logistic regression at the beginning, but do these functions (prediction and performance from ROCR) apply to other classification algorithms like SVMs, decision trees, etc.?
I tried using the prediction() function with the results of my SVM model, but it threw a format error even though the arguments were of the same type and dimensions. Also, I am not sure that if we come up with ROC curves for these algorithms, we would get a shape like the one generally seen for logistic regression (like this).
The prediction and performance functions are model-agnostic in the sense that they only require the user to input actual and predicted values from a binary classifier. (More precisely, this is what prediction requires, and performance takes as input an object output by prediction). Therefore, you should be able to use both functions for any classification algorithm that can output this information - including both logistic regression and SVM.
Having said this, model predictions can come in different formats (e.g., propensity scores vs. classes; results stored as numeric vs. factor), and you'll need to ensure that what you feed into prediction is appropriate. The requirements can be quite specific: for example, while the predictions argument can represent continuous or class information, it can't be a factor. See the "Details" section of the function's help file for more information.
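A hedged sketch (assuming the ROCR package is installed; the simulated scores stand in for any classifier's numeric output, e.g. SVM decision values):

```r
# prediction() only needs a numeric score vector plus the true labels,
# regardless of which model produced the scores.
library(ROCR)

set.seed(1)
labels <- rbinom(100, 1, 0.5)
scores <- as.numeric(labels + rnorm(100))  # must be numeric, not a factor

pred <- prediction(scores, labels)
perf <- performance(pred, "tpr", "fpr")    # ROC curve coordinates
plot(perf)                                 # shape depends on the classifier
auc  <- performance(pred, "auc")@y.values[[1]]
auc
```

For an SVM fit with, say, e1071, the analogous score would be the decision values (predict(..., decision.values = TRUE)) coerced to a plain numeric vector, rather than the predicted class factor, which is a common source of the format error described above.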

Lambdas from glmnet (R) used in online SGD

I'm using cv.glmnet from the glmnet package (in R). In the output I get a vector of lambdas (regularization parameters). I would like to use them in an online SGD algorithm. Is there a way of doing so, and how?
Any suggestion would be helpful.
I am wondering how I can compare results (in terms of the model's coefficients and the regularization parameter) between a generalized linear model with l1 regularization and a binomial distribution (logistic link function), fitted once offline with cv.glmnet from the R package (which I think uses a Newton-Raphson-type estimation algorithm), and an online model of the same type whose estimates are re-calculated after every new observation using the classic (type I) stochastic gradient descent algorithm.
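As a base-R sketch of the online half of such a comparison: below is a proximal SGD update for l1-penalized logistic regression, where the lambda could in principle be taken from cv.glmnet (e.g. lambda.min). Note, as an assumption rather than a guarantee, that glmnet scales its penalty relative to the averaged loss, so a cross-validated lambda is only a rough starting point for an SGD scheme.

```r
# Online (one observation at a time) SGD for l1-penalized logistic regression.
sgd_l1_logistic <- function(x, y, lambda, eta = 0.1, epochs = 5) {
  beta <- numeric(ncol(x))
  for (ep in seq_len(epochs)) {
    for (i in sample(nrow(x))) {
      p <- plogis(sum(x[i, ] * beta))
      beta <- beta - eta * (p - y[i]) * x[i, ]                # log-loss gradient
      beta <- sign(beta) * pmax(abs(beta) - eta * lambda, 0)  # l1 proximal step
    }
  }
  beta
}

set.seed(1)
x <- matrix(rnorm(500 * 2), ncol = 2)
y <- rbinom(500, 1, plogis(x %*% c(2, 0)))
b <- sgd_l1_logistic(x, y, lambda = 0.05)
round(b, 2)  # first coefficient large, second shrunk toward zero
```

The soft-thresholding line is one standard way to handle the non-differentiable l1 term in SGD; coefficients from this scheme will generally differ somewhat from cv.glmnet's batch solution even at the "same" lambda, which is exactly the comparison the question asks about.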

Is it possible to customize a likelihood function for logit models using speedglm, biglm, and glm packages

I am trying to fit a customized logistic regression / survival analysis function using the optim/maxBFGS functions in R, literally defining the likelihood functions by hand.
I was always under the impression that in the speedglm, biglm, and glm packages the likelihood functions for logit models (or any other distribution) were hard-coded. However, I am wondering whether I was mistaken, or whether it is possible to specify my own likelihood function. The reason is that optim/maxBFGS is a LOT slower to run than speedglm.
The R glm function is set up to work only with likelihoods from the exponential family. The fitting algorithm won't work with any other kind of likelihood, and with any other you are not in fact fitting a GLM but some other kind of model.
The glm functions fit using iteratively reweighted least squares (IRWLS); the special form of the likelihood for exponential families makes Newton's method for solving the maximum-likelihood equations equivalent to fitting a weighted ordinary least squares regression repeatedly until convergence is achieved.
This is a faster process than generic nonlinear optimization; so if the likelihoods you want to use have been customized to the point where they are no longer from an exponential family, you are no longer fitting a generalized linear model. The IRWLS algorithm is then not applicable, and the fit will be slower, as you are finding.
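The IRWLS idea described above can be sketched in a few lines of base R: each Newton step is a weighted least-squares fit on a "working response", and for a logistic model the result should agree with glm() to numerical tolerance.

```r
# Base-R sketch of IRWLS for logistic regression.
irls_logistic <- function(X, y, tol = 1e-8, maxit = 25) {
  beta <- numeric(ncol(X))
  for (it in seq_len(maxit)) {
    eta <- drop(X %*% beta)
    mu  <- plogis(eta)
    w   <- mu * (1 - mu)              # weights from the variance function
    z   <- eta + (y - mu) / w         # working response
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    if (max(abs(beta_new - beta)) < tol) return(beta_new)
    beta <- beta_new
  }
  beta
}

set.seed(1)
X <- cbind(1, rnorm(200))             # intercept + one covariate
y <- rbinom(200, 1, plogis(X %*% c(-0.5, 1)))
rbind(irwls = irls_logistic(X, y),
      glm   = coef(glm(y ~ X - 1, family = binomial)))
```

Nothing in this loop is specific to the logistic case except the plogis link and the mu * (1 - mu) variance; that pair is exactly what an exponential-family likelihood supplies, and what a hand-rolled likelihood generally does not.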

Is there a way to only include factors that are significant at P < 0.05 in a backward elimination in logistic regression

When doing a backward elimination using step(), is it possible to keep only those factors that are significant at, for example, P < 0.05?
I am using this line at the moment
step(FulMod3,direction="backward",trace=FALSE)
to get my final model.
Answers to these questions give starting points
Logistic Regression in R (SAS-like output)
Stepwise Regression using P-Values to drop variables with nonsignificant p-values
In particular they point you towards fastbw in the rms package, which can be used in conjunction with rms::lrm (logistic regression). They also explain why stepwise regression via p-values is often a really, really, really BAD idea: see also http://www.stata.com/support/faqs/stat/stepwise.html. There are a few contexts where it is appropriate (otherwise Frank Harrell, the author of the rms package and crusader against foolish uses of stepwise regression, wouldn't have written fastbw), but they are relatively rare, usually dominated by (e.g.) penalized regression approaches or by stepwise approaches via AIC (as implemented in step): see e.g. https://stats.stackexchange.com/questions/13686/what-are-modern-easily-used-alternatives-to-stepwise-regression and https://stats.stackexchange.com/questions/20836/algorithms-for-automatic-model-selection
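A hedged sketch of the fastbw route (assuming the rms package is installed; the simulated data and the sls = 0.05 threshold are purely illustrative):

```r
# p-value-based backward elimination with rms::fastbw on a logistic model
# fitted by rms::lrm.
library(rms)

set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
d$y <- rbinom(200, 1, plogis(1.5 * d$x1))

fit <- lrm(y ~ x1 + x2 + x3, data = d)
fastbw(fit, rule = "p", sls = 0.05)  # drop terms with p >= 0.05
```

With rule = "aic" instead, fastbw mimics AIC-based deletion, which is closer to what base R's step() does, and which the links above generally prefer over p-value thresholds.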
