I'm trying to run a boosted robust regression in caret (with the Huber family); however, I get an error when training the model:
library(caret)
X <- rnorm(300, 0, 100)
Y <- rnorm(300, 0, 100000)
data <- cbind(X,Y)
model <- train(Y~X, method="glmboost", data=data, family=Huber())
I get the error 'could not find function "Huber"', even though Huber() is explicitly exported by the mboost package (the one on which glmboost is based).
Any help would be really appreciated.
If you just run library(caret) with method="glmboost", it will load the mboost package, but it will not attach mboost to your search path. Packages are discouraged from automatically attaching other packages, since doing so can introduce functions that conflict with ones you already have loaded, so most packages load their dependencies privately. If you fully qualify the function name with the package name, you can use it in your model:
model <- train(Y~X, method="glmboost", data=data, family=mboost::Huber())
Alternatively, run library(mboost) yourself to attach the package to your search path, so you don't need the package-name prefix.
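For completeness, a minimal end-to-end sketch (assuming caret and mboost are installed; the 2 * X relationship below is made up so the toy fit has some signal):

```r
library(caret)

set.seed(42)
X <- rnorm(300, 0, 100)
Y <- 2 * X + rnorm(300, 0, 100)   # toy data with signal
dat <- data.frame(X, Y)

# Fully qualify Huber() so it is found without attaching mboost
model <- train(Y ~ X, data = dat, method = "glmboost",
               family = mboost::Huber())
```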
I am trying to install the miceadds package to estimate clustered standard errors, but it won't install on my computer (I am using a Mac). Instead, I am now trying the coeftest function from one of the packages that did install.
The coeftest function, however, doesn't give the correct result for the probit regression I ran earlier, since the standard errors don't change no matter which cluster I use.
install.packages("sandwich")
install.packages("plm")
install.packages("lmtest")
library(sandwich)
library(plm)
library(lmtest)
coeftest(probitreg1, vcov. = vcovHC(probitreg1, cluster = c("state", "year")))
coeftest(probitreg1, vcov. = vcovHC(probitreg1, cluster = "year"))
The clustered standard errors are coming out smaller than I expected, even though they should ideally be larger: intra-state correlation in voting means conventional standard errors are underestimated. Am I using coeftest correctly?
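A note on the API, sketched with a toy probit since the original probitreg1 object isn't shown here: sandwich's vcovHC() has no cluster argument (one passed via ... is silently ignored); the clustered estimator in sandwich is vcovCL(), which accepts a formula for the cluster variable:

```r
library(sandwich)
library(lmtest)

# Toy probit so the example is self-contained
set.seed(1)
d <- data.frame(y     = rbinom(200, 1, 0.5),
                x     = rnorm(200),
                state = factor(rep(1:20, each = 10)))
probitreg1 <- glm(y ~ x, family = binomial(link = "probit"), data = d)

# Cluster-robust covariance by state
coeftest(probitreg1, vcov. = vcovCL(probitreg1, cluster = ~ state))
```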
A similar question was closed, and the accepted solution was to check whether the caret package was installed correctly. As instructed there, I checked that caret was installed and loaded correctly; I have reloaded the package and it is available in the current session. The following call to train(...) produces the error:
library(caret)
library(ggplot2)  # for the diamonds dataset
model <- train(
  price ~ ., data = diamonds,
  method = "lm",
  trControl = trainControl(
    method = "cv", number = 10,
    verboseIter = TRUE
  )
)
Here I am trying to train and cross-validate a linear model on the well-known diamonds dataset. However, the following error is observed:
Error: All inputs to rbind.fill must be data.frames
It doesn't provide any further information about the error. My warnings are on. Is there any way I can debug this?
I copy-pasted your code into my console and it worked just fine. Try updating your caret package.
I am working with the R package 'zoib' to perform beta regression in R. I am trying to replicate the example on page 41 of the paper the package authors published in The R Journal:
Liu F and Kong Y. 2015. zoib: An R Package for Bayesian Inference for Beta Regression and Zero/One Inflated Beta Regression. The R Journal 7(2)
I believe I am using the exact same data and code that they use:
library(zoib)
data("GasolineYield", package="zoib")
GasolineYield$batch <- as.factor(GasolineYield$batch)
d <- GasolineYield
eg1.fixed <- zoib(yield ~ temp + as.factor(batch) | 1, data=GasolineYield,
                  joint=FALSE, random=0, EUID=1:nrow(d),
                  zero.inflation=FALSE, one.inflation=FALSE,
                  n.iter=1050, n.thin=5, n.burn=50)
sample1 <- eg1.fixed$coeff
traceplot(sample1)
autocorr.plot(sample1)
gelman.diag(sample1)
However, I am getting an error when I try to do the diagnostic plots on the samples. This is the error message:
Error in ts(seq(from = start(x), to = end(x), by = thin(x)), start = start(x), :
invalid time series parameters specified
I cannot understand why the code isn't working or how to fix it. I can trace the error to the time() function called by zoib, and it seems the problem may be that the sample object lacks a tsp attribute. But the zoib authors make it clear that their model output is meant to be used with coda, so I am confused. I don't have much experience with MCMC or time-series objects, so maybe I am missing something obvious. Can anyone explain why the authors' own example fails, and what the solution is?
I e-mailed the package author (Fang Liu), and she informed me that there was in fact a bug in the version of the package I had, but that it is fixed in the most recent version of zoib (version 1.4.2). With the update, the code now works.
So I generated a random dataset online and I need to apply the C4.5 algorithm to it.
I installed the RWeka package and all its dependencies, but I do not know how to run it.
Can somebody help me with links to tutorials (anything apart from the RWeka documentation), or a sample C4.5 example in R so I can understand how it works?
Thank you
I think it would be worth your time to check out the caret package. It standardizes the syntax for most machine learning packages in R, including RWeka.
It also has a ton of really useful helper functions and a great tutorial on its website.
Here's the syntax for predicting Species in the iris dataset using RWeka's C4.5-like J48 trees:
library(caret)
set.seed(123)  # reproducible split
train_rows <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_set <- iris[train_rows, ]
test_set  <- iris[-train_rows, ]
fit.rweka <- train(Species ~ ., data = train_set, method = "J48")
pred <- predict(fit.rweka, newdata = test_set)
Then, if you want to try a gradient boosting machine or some other algorithm, just change it to method='gbm'.
I'm working on an R package where I need to call predict.lm on a model I've already fit. I've saved the linear model as a file which I can put in the data folder of the package. I'm worried about slowing things down if I load the model every time the function is called. The function that uses this model is the meat of the package and gets called on every iteration of a simulation, so I'd prefer to read the saved model once when the package is loaded. Is there a way to do that?
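One common pattern (a sketch of the idea, not tied to any particular package): save the fitted model once and load it a single time, rather than inside the hot function. In a package, objects saved to R/sysdata.rda are lazy-loaded along with the namespace, so your functions can reference the model with no per-call read. The mechanics, demonstrated with a temp file and a made-up mtcars model:

```r
# Fit once and save; in a package you would write to R/sysdata.rda
fit <- lm(mpg ~ wt, data = mtcars)
path <- file.path(tempdir(), "sysdata.rda")
save(fit, file = path)

# Later (or at package load): load once, then reuse on every call
rm(fit)
load(path)
yhat <- predict(fit, newdata = data.frame(wt = 3))
```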
Why not just save the coefficients and then "predict" with them?
c.vec <- coef(fit)                    # intercept + slopes
Yhat  <- sum(c.vec * c(1, data.vec))  # dot product, not elementwise
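A self-contained base-R sketch of that idea (with a made-up mtcars model), checking the hand-rolled dot product against predict.lm. Note the terms must be summed, not left as an elementwise product:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

c.vec    <- coef(fit)               # intercept + slopes, in formula order
data.vec <- c(wt = 3.0, hp = 110)   # one new observation, same order

Yhat <- sum(c.vec * c(1, data.vec)) # dot product, including the intercept

# agrees with the full predict.lm machinery
all.equal(unname(Yhat),
          unname(predict(fit, newdata = data.frame(wt = 3.0, hp = 110))))
```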