Error with gamsel R Package

I'm trying to use the gamsel R package to fit a sparse generalized additive model, and I can't seem to get it to work on real data. When I run on synthetic data as described in the package documentation, everything works well:
library(gamsel)
data=gendata(n=500,p=12,k.lin=3,k.nonlin=3,deg=8,sigma=0.5)
attach(data)
bases=pseudo.bases(X,degree=10,df=6)
gamsel.out=gamsel(X,y,bases=bases)
But when I run on real data, I get the following error:
library(gamsel)
X = as.matrix(read.csv("X.csv"),header=FALSE)
y = as.matrix(read.csv("y.csv"),header=FALSE)
gam_fit = gamsel(X,y)
Error in if (abs((df - current.df)/df) < 1e-04 | iterations == 1)
return(list(lambda = lambda, : missing value where TRUE/FALSE
needed
You can access sample data files that will reproduce this result here. Any thoughts about how to fix this error?
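The message "missing value where TRUE/FALSE needed" means the condition inside that if evaluated to NA, so some quantity computed from the inputs became NA or NaN. Note also that in the calls above header=FALSE is passed to as.matrix, not to read.csv, so the first row of each file is still read as a header. A minimal sketch of checks worth running on X and y before fitting, using base R only. The helper name check_inputs and the demo data are hypothetical, and it is only an assumption that missing values or columns with very few distinct values are what trips gamsel here:

```r
# Hypothetical helper (not part of gamsel): summarise input properties that
# can lead to NA/NaN inside a fitting routine.
check_inputs <- function(X, y) {
  X <- as.matrix(X)
  y <- as.vector(as.matrix(y))
  # Number of distinct values in each column of X
  n_unique <- apply(X, 2, function(col) length(unique(col)))
  list(
    is_numeric   = is.numeric(X) && is.numeric(y),  # no character/factor data
    rows_match   = nrow(X) == length(y),            # one response per row
    n_missing    = sum(is.na(X)) + sum(is.na(y)),   # NA or NaN entries
    n_unique     = n_unique,
    constant_col = which(n_unique < 2)              # columns with no variation
  )
}

# Made-up demo data: column 2 is constant, column 3 is binary.
set.seed(1)
X_demo <- cbind(rnorm(20), rep(1, 20), rep(c(0, 1), 10))
y_demo <- rnorm(20)
report <- check_inputs(X_demo, y_demo)
print(report)
```

If the report shows missing values, constant columns, or a row mismatch, fixing those first (and moving header=FALSE inside read.csv) narrows down whether the data or the package is at fault.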

Related

Poisson distribution in R mixdist package

I think my data might be described by a sum of Poisson distributions and I found out about mixdist package for R. I managed to fit gamma and lnorm distributions, but I can't figure out how to use Poisson. I tried with the example data first:
library(mixdist)
data(poisdat)
data(poispar)
plot.mixdata(poisdat) #this works
fitp<-mix(poisdat, coef(poispar), "pois") #this doesn't
but I get an error:
Error in if (usecondit & ncol(mixdat) - 2 != k) stop("Conditional data are not consistent with mixpar.") : argument is of length zero
How can I get a working example of mixdist with a Poisson mixture?
Are you looking for this:
mix(poisdat, poispar, "pois", constr = mixconstr(consigma = "POIS"))
Calling mix(poisdat, poispar, "pois") on its own gives an error:
Error in testconstr(mixdat, mixpar, dist, constr) :
Poisson needs consigma = POIS.
That is why the constr parameter is needed.

R implementation of kohonen SOMs: prediction error due to data type.

I have been trying to run an example code for supervised kohonen SOMs from https://clarkdatalabs.github.io/soms/SOM_NBA . When I tried to predict test set data I got the following error:
pos.prediction <- predict(NBA.SOM3, newdata = NBA.testing)
Error in FUN(X[[i]], ...) :
Data type not allowed: should be a matrix or a factor
I tried newdata = as.matrix(NBA.testing) but it did not help. Neither did as.factor().
Why does it happen? And how can I fix that?
You should pass one more argument to the predict function, "whatmap", and set its value to 1.
The code would be like:
pos.prediction <- predict(NBA.SOM3, newdata = NBA.testing, whatmap = 1)
To verify the prediction result, you can check using:
table(NBA$Pos[-training_indices], pos.prediction$predictions[[2]], useNA = 'always')
The result may differ from the tutorial's, because the tutorial does not call set.seed().
I suggest calling set.seed() with an arbitrary number somewhere before the training phase.
For simplicity, put it once at the top of your script, e.g.
set.seed(12345)
This will guarantee a reproducible result of your model next time you re-run your script.
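A small base R illustration of why this works, independent of kohonen: resetting the seed to the same value replays the same sequence of random draws. The variable names are only for illustration.

```r
# Same seed, same draws: the two samples are identical.
set.seed(12345)
first_draw <- sample(1:100, 5)

set.seed(12345)
second_draw <- sample(1:100, 5)

print(identical(first_draw, second_draw))

# Without resetting the seed, the next draw continues the random stream,
# which is why an unseeded script need not reproduce the same draws.
third_draw <- sample(1:100, 5)
```

The same applies to any random initialisation or sampling a training routine performs, provided the seed is set before it runs.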
Hope that will help.

R Leaps Package: Regsubsets - coef "Reordr" Fortran error

I'm using the R leaps package to obtain a fit to some data:
(My dataframe df contains a Y variable and 41 predictor variables)
require(leaps)
N=3
regsubsets(Y ~ ., data = df, nbest=1, nvmax=N+1,force.in="X", method = 'exhaustive')-> regfit
coef(regfit,id = N)
When I run the code more than once (the first time works fine) I get the following error when I run the coef command:
Error in .Fortran("REORDR", np = as.integer(object$np), nrbar = as.integer(object$nrbar), :
"reordr" not resolved from current namespace (leaps)
Any help with why this is happening would be much appreciated.
A.
I had to build the package from source after adding the PACKAGE = 'leaps' argument to the .Fortran("REORDR", ...) call in the leaps.R file. It now works fine every time.
The solution is related to:
R: error message --- package error: "functionName" not resolved from current namespace

Error with Generalized portmanteau tests for garch [gBox] in R

I am trying to use the gBox() function from the TSA package in R. I want to test the goodness of fit for a GARCH model. But when I try to run the function I get this error message:
Error in filter(M, filter = beta, method = "recursive", sides = 1,
init = rep(sigma2,dims [product 2] do not match the length of object
[1]
The funny thing is that this is from an exact replica of the example that the package instructions provide, so there really shouldn't be any errors one would think. I get the same error message for my own data as well and I just don't know what to do.
Here is the example code:
library(TSA)
library(tseries)
data(CREF)
r.cref=diff(log(CREF))*100
m1=garch(x=r.cref,order=c(1,1))
summary(m1)
gBox(m1,x=r.cref,method='squared')
The length of the time series r.cref is 500 and the length of the garch object m1 is 10, so they are obviously not the same length, but how do I fix this error?

Evaluating weka classifier J48 with missing values in test set, R RWeka

I have an error when evaluating a simple test set with evaluate_Weka_classifier. Trying to learn how the interface works from R to Weka with RWeka, but I still don't get this.
library("RWeka")
iris_input <- iris[1:140,]
iris_test <- iris[-(1:140),]
iris_fit <- J48(Species ~ ., data = iris_input)
evaluate_Weka_classifier(iris_fit, newdata = iris_test, numFolds=5)
No problems here, as we would expect (it is of course a crude test, with no random holdout data, etc.). But now I want to simulate a lot of missing data, so I set Petal.Width to missing:
iris_test$Petal.Width <- NA
evaluate_Weka_classifier(iris_fit, newdata = iris_test, numFolds=5)
Which gives the error:
Error in .jcall(evaluation, "S", "toSummaryString", complexity) :
java.lang.IllegalArgumentException: Can't have more folds than instances!
Edit: This error suggests that I do not have enough instances, but I have 10.
Edit: If I use write.arff, it can be exported and read in by Weka. Change Petal.Width {} into Petal.Width numeric to make the two files exactly the same. Then it works in Weka.
Is my reasoning wrong? According to Machine Learning, Practical machine learning tools and techniques, this should be legitimate. Maybe I just have to tell RWeka that I want to use fractions when a split uses a missing variable?
Thanks!
The issue is that you need to tell J48() what to do with missing values.
library(RWeka)
?J48()
#pertinent output
J48(formula, data, subset, na.action,
control = Weka_control(), options = NULL)
na.action tells R what to do with missing values. When following up on na.action you will find that "The ‘factory-fresh’ default is na.omit". Under this setting of course there are not enough instances!
Instead of leaving na.action as the default omit, I have changed it as follows,
iris_fit<-J48(Species~., data = iris_input, na.action=NULL)
and it works like a charm!
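The effect of na.omit on this test set can be seen in base R without RWeka. This is a sketch under the assumption, implied by the answer above, that the evaluation data are built through a model frame that honours na.action; it does not call any RWeka function.

```r
# Rebuild the test set from the question: 10 rows, Petal.Width all missing.
iris_test <- iris[-(1:140), ]
iris_test$Petal.Width <- NA

nrow(iris_test)           # 10 rows before any NA handling
nrow(na.omit(iris_test))  # 0 rows: every row contains an NA, so all are dropped

# A model frame built with na.action = NULL keeps the incomplete rows,
# while na.omit leaves nothing to evaluate on.
kept    <- model.frame(Species ~ ., data = iris_test, na.action = NULL)
dropped <- model.frame(Species ~ ., data = iris_test, na.action = na.omit)
print(c(kept = nrow(kept), dropped = nrow(dropped)))
```

With zero instances left, any number of folds is "more folds than instances", which matches the Java exception in the question.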
