stepAIC forward function in R has a long run time - r

I am using the stepAIC function in R to run a stepwise regression on a dataset with 28 predictor variables. The backward method works perfectly; however, the forward method has been running for the past half hour with no output whatsoever thus far. I believe it has to do with how I'm defining the scope attribute. Without the scope it runs instantaneously, but only one step is run and no changes are made to the initial model.
Bstep.NNN.top<-step(lm(Response~X1+X2+X3+X4+.....+X28,data=df),
scope=list(upper=Response~X1*X2*X3*.....*X28,lower=Response~1),direction=c("forward"))
Does anybody know of a method that is quicker to run? Or if there is a way to simplify the scope attribute to a point where the run time will decrease?
Thanks
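One likely reason for the run time: an upper scope of X1*X2*...*X28 expands to every interaction of every order, which is 2^28 - 1 candidate terms, and the starting model already contains all 28 main effects. A minimal sketch of a narrower setup follows, assuming that two-way interactions are enough and that forward selection should start from the intercept-only model. The built-in mtcars data, the four chosen predictors, and the names cars_df and n_terms_full are stand-ins, not the original dataset or code.

```r
# Number of terms implied by a full crossing of k predictors: 2^k - 1
n_terms_full <- function(k) 2^k - 1
n_terms_full(28)

# Stand-in data: mtcars replaces the original data frame (assumption)
cars_df <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]

# Upper scope limited to main effects plus two-way interactions
upper <- ~ (wt + hp + qsec + drat)^2
n_upper <- length(attr(terms(upper), "term.labels"))  # 4 main effects + 6 pairs

# Forward selection starts from the intercept-only model
null_model <- lm(mpg ~ 1, data = cars_df)
fwd <- step(null_model,
            scope = list(lower = ~ 1, upper = upper),
            direction = "forward", trace = 0)
formula(fwd)
```

With 28 predictors, the same ^2 scope would contain 28 + 378 = 406 terms rather than 2^28 - 1.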

Related

Quantile regression model from `quantreg` does not finish computation

I am developing an iterative algorithm that uses quantile regression models at each iteration. For that I use the rq function from the quantreg package in R. So far it has worked fine. However, I have found a dataset where, at one of the iterations, the rq function simply gets stuck. No error message, no warning. It simply goes on as if still working, but never finishes computation.
I provide here a very small minimal code example. You can download the problematic data on this link:
https://www.dropbox.com/s/yrlotit1ovk9yzd/r555.RData?dl=0
library(quantreg)
load('~r555.RData')
dependent = r$dependent
independent = r$independent
quantreg::rq(dependent ~ -1 + independent, tau=0.1)
If you execute the code above, the rq function will get stuck and never finish. Be aware that the data provided is part of the iterative process I am developing, so it has no direct interpretation by itself. I am writing to ask about possible reasons for this behaviour and possible solutions.
I don't know if it matters, but I have tested this on two different computers running Windows 10 and using different versions of the quantreg package.
Changing the default method="br" to method="fn" fixes the problem.
quantreg::rq(dependent ~ -1 + independent, tau=0.1, method="fn")

How can I get My.stepwise.glm to return the model outside the console?

I asked this question on RCommunity but haven't had anyone bite... so I'm here!
My current project involves predicting whether some trees will survive under future climate change scenarios. Against better judgement (which would be to use something like Maxent), I've decided to pursue this with a GLM, which requires presence and absence data. Every time I generate my absence data (as I was only given presence data) using randomPoints from dismo, the resulting GLM has different significant variables. I found a package called My.stepwise that has a My.stepwise.glm function (here: My.stepwise.glm: Stepwise Variable Selection Procedure for Generalized Linear... in My.stepwise: Stepwise Variable Selection Procedures for Regression Analysis), which goes through a forward/backward selection process to find the best variables and returns a model ready for you.
My problem is that I don't want to run My.stepwise.glm just once and use the model it spits out for me. I'd like to run it roughly 100 times with different pseudo-absence data and see which variables it returns, then take the most frequent variables and move forward with building my model using those. The issue is that the My.stepwise.glm function ends by 'print(summary(initial.model))' and I would like to be able to access the output similar to how step() returns a list, where you can then say 'step$coefficients' and have the function coefficients return as numerics. Can anyone help me with this?
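A minimal sketch of the repeat-and-tabulate idea, assuming base R's step() on a glm is an acceptable substitute for My.stepwise.glm (step() selects by AIC, so the variables it keeps may not match My.stepwise.glm's choices). The simulated data, the bootstrap resampling that stands in for regenerating pseudo-absences, and the helper name select_once are all hypothetical.

```r
set.seed(1)
n <- 200
# Simulated presence/absence data; only x1 truly affects the response
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.5 * dat$x1))

# Run one stepwise selection and return the names of the retained terms
select_once <- function(d) {
  fit <- step(glm(y ~ 1, family = binomial, data = d),
              scope = ~ x1 + x2 + x3, direction = "both", trace = 0)
  attr(terms(fit), "term.labels")
}

# Repeat on resampled data (stand-in for new pseudo-absences) and count
runs <- lapply(1:20, function(i) select_once(dat[sample(n, replace = TRUE), ]))
freq <- sort(table(unlist(runs)), decreasing = TRUE)
freq
```

Because step() returns the fitted model object, coef() on that object gives the coefficients as a numeric vector, and names(freq) lists the candidate variables ordered by how often they were selected.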

R Neural Network Forecasting - Aspect of Randomness?

I have been trying different methods of forecasting and stumbled upon the nnetar() function in the forecast package of R. I quickly realized that while this does work to forecast, it gives me something different every time I run it. Could anybody help to explain why this happens? I thought I had a decent understanding of neural nets and I don't see what could make drastic differences in forecasts, unless the nnetar() function randomly selects the number of nodes or something. Any help?
By default, 20 networks are trained with random starting values, and their predictions are averaged when you use the function.
Because the function uses random starting values for each run, the forecasts will be different for each call too.
EDIT: new question from OP in the comments
In order to control the function and get the same random starting values each time, you can simply use the function set.seed() with the value of your choice.
For example:
set.seed(666)
forecast(nnetar(...),...)
set.seed(666)
forecast(nnetar(...),...)
set.seed(666)
forecast(nnetar(...),...)
will give the same results every time you run it with this "seed" value (666). You have to run set.seed(666) before every run of the rest of your code, of course.
EDIT 2: new new question from OP in the comments
In order to have 100 different networks to fit with random starting weights:
nnetar(...,repeats=100,...)

Can I tell JAGS to re-start automatically after failure with initial values?

My model failed with the following error:
Compiling rjags model...
Error: The following error occured when compiling and adapting the model using rjags:
Error in rjags::jags.model(model, data = dataenv, inits = inits, n.chains = length(runjags.object$end.state), :
Error in node Y[34,10]
Observed node inconsistent with unobserved parents at initialization.
Try setting appropriate initial values.
I have done some diagnosis and found that there was a problem with initial values in chain 3. However, this can happen from time to time. Is there any way to tell run.jags or JAGS itself to retry and re-run the model in such cases? For example, to tell it to make another N attempts to initialize the model properly. That would be a very logical thing to do instead of just failing. Or do I have to do it manually with some tryCatch construct?
P.S.: note that I am currently using run.jags to run JAGS from R.
There is no facility for that provided within runjags, but it would be fairly simple to write yourself like so:
success <- FALSE
while (!success) {
  # try() returns an object of class "try-error" if run.jags() fails
  s <- try(results <- run.jags(...))
  success <- !inherits(s, "try-error")
}
results
[Note that if this model NEVER works, the loop will never stop!]
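A bounded variant of the loop above, as a sketch: it gives up after a fixed number of attempts instead of looping forever. The helper name retry, the max_attempts argument, and the flaky_fit function (which stands in for a run.jags call that fails at initialization a couple of times) are hypothetical and not part of runjags.

```r
# Call fun() until it succeeds or max_attempts is reached
retry <- function(fun, max_attempts = 5) {
  for (attempt in seq_len(max_attempts)) {
    result <- try(fun(), silent = TRUE)
    if (!inherits(result, "try-error")) return(result)
  }
  stop("all ", max_attempts, " attempts failed")
}

# Hypothetical stand-in for run.jags(...): fails twice, then succeeds
calls <- 0
flaky_fit <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("Observed node inconsistent with unobserved parents at initialization")
  "fit"
}

out <- retry(flaky_fit)
```

In real use, the function passed to retry would wrap the actual run.jags call, for example function() run.jags(...), with the arguments filled in.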
A better idea might be to specify an initial values function/list that provides initial values that are guaranteed to work (if possible).
In runjags version 2, it will be possible to recover successful simulations when some simulations have crashed, so if you ran (say) 5 chains in parallel then if 1 or 2 crashed you would still have 3 or 4. That should be released within the next couple of weeks, and contains a large number of other improvements.
Usually when this error occurs it is an indication of a serious underlying problem. I don't think a strategy of "try again" is useful in general (and especially because default initial values are deterministic).
The default initial values generated by JAGS are given by a "typical" value from the prior distribution (e.g. mean, median, or mode). If it turns out that this is inconsistent with the data then there are typically two possible causes:
1. A posteriori constraints that need to be taken into account, such as when modelling censored survival data with the dinterval distribution.
2. Prior-data conflict, e.g. the prior mean is so far away from the value supported by the data that it has zero likelihood.
These problems remain the same when you are supplying your own initial values.
If you think you can generate good initial values most of the time, with occasional failures, then it might be worth repeated attempts inside a call to try() but I think this is an unusual case.

Show progress R2WinBUGS

I am doing several Bayesian analyses using R2WinBUGS so I can put them in a for loop. It works perfectly: R calls WinBUGS, then the simulation starts, and when it is done the results are saved and the next analysis starts.
When I normally use WinBUGS, without R, I can monitor the simulations already done in the update screen, so I roughly know how fast it is going and how long it will take to finish. My question is: is there an option with R2WinBUGS, or maybe a different package, to call WinBUGS in for loops and still force WinBUGS to show the progress made?
I hope my question is clear :)
I don't think it is possible using R2WinBUGS. You can set debug=TRUE to follow the simulations in WinBUGS itself, but it will mess up your for loop as you will then need to quit WinBUGS manually after each model run.
BRugs shows the same progress as a WinBUGS log file: you can run the model check, initialise parameters, compile the model and update the simulations, with the output printed in the R console.
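If coarse, per-model progress is enough, the R loop itself can report it. A minimal sketch using txtProgressBar from the utils package follows; it does not expose WinBUGS's own update counter, and run_one_analysis is a hypothetical stand-in for the bugs() call inside the loop.

```r
# Hypothetical stand-in for one bugs() call; replace with the real model run
run_one_analysis <- function(i) {
  Sys.sleep(0.05)
  i^2
}

n_models <- 10
results <- vector("list", n_models)
pb <- txtProgressBar(min = 0, max = n_models, style = 3)
for (i in seq_len(n_models)) {
  results[[i]] <- run_one_analysis(i)
  setTxtProgressBar(pb, i)  # advance the bar after each finished model
}
close(pb)
```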
