gbm package summary function throws an error in R - r

I am trying to use the gbm package in R and am having problems with the summary function. I am hoping that someone can help. My code is as follows:
library(ISLR)
Caravan$Purchase <- ifelse(Caravan$Purchase == "Yes", 1, 0)
train_index <- 1:1000
train <- Caravan[train_index, ]
test <- Caravan[-train_index, ]
library(gbm)
set.seed(1234)
boost <- gbm(Purchase ~ ., data = train, n.trees=400, shrinkage =0.01,
distribution = "bernoulli")
summary(boost)
I get the following error message with traceback:
Error in plot.window(xlim, ylim, log = log, ...) : need finite 'xlim' values
5. plot.window(xlim, ylim, log = log, ...)
4. barplot.default(rel.inf[i[cBars:1]], horiz = TRUE, col = rainbow(cBars, start = 3/6, end = 4/6), names = object$var.names[i[cBars:1]], xlab = "Relative influence", ...)
3. barplot(rel.inf[i[cBars:1]], horiz = TRUE, col = rainbow(cBars, start = 3/6, end = 4/6), names = object$var.names[i[cBars:1]], xlab = "Relative influence", ...)
2. summary.gbm(boost)
1. summary(boost)
I have tried using the workaround here: http://www.samuelbosch.com/2015/09/workaround-ntrees-is-missing-in-r.html to no avail.
Any suggestions?
EDIT 1: Confirmed error occurs in R 3.5.0, but does not occur in 3.4.4.

Related

Error message when charting efficient frontier in RStudio

When I run these lines of code, I get the following error message:
chart.RiskReward(maxret, risk.col = "StdDev", return.col = "mean",
chart.assets = "False")
chart.EfficientFrontier(maxret, match.col="StdDev", n.portfolios=100, type="l", tangent.line=FALSE)
Error in seq.default(from = minret, to = maxret, length.out = n.portfolios) :
'from' must be a finite number

Parallel analysis works with psych::fa.parallel but gives an error message with the paran package

I am running a parallel analysis with fa.parallel, which works, but it suggests fewer factors (3) than I would expect (5):
fa.parallel(test3[, c(7:28)], fm="ml",sim=TRUE,n.iter = 100)
The answer I get in the R console:
Parallel analysis suggests that the number of factors = 3 and the number of components = 3
It also produces a graph.
But how can I inspect the eigenvalues themselves?
Secondly, when I tried to run the parallel analysis another way, using the paran package, it did not compute the analysis and instead gave me this error message:
Error in svd(X) : infinite or missing values in 'x'.
I searched for this error message and found it discussed not in the context of parallel analysis but of PCA, where it is caused by missing values, which I acknowledge I have in my dataset. What should I do? The code used for paran is:
paran(test2[, c(7:28)], iterations = 5000, centile = 0, quietly = FALSE,
status = TRUE, all = TRUE, cfa = TRUE, graph = TRUE, color = TRUE,
col = c("black", "red", "blue"), lty = c(1, 2, 3), lwd = 1, legend = TRUE,
file = "", width = 640, height = 640, grdevice = "png", seed = 0)
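Since the error is raised by svd() and the dataset is known to contain missing values, one possible approach is to drop incomplete rows before the analysis. The following is only a minimal base-R sketch: the matrix `scores` is a made-up stand-in for test2[, c(7:28)], and it assumes listwise deletion is acceptable for the analysis.

```r
# Hypothetical stand-in for test2[, c(7:28)]: a small numeric matrix with one NA.
scores <- matrix(c(1, 2, NA, 4, 5,
                   2, 1, 4, 3, 6,
                   5, 3, 2, 6, 1), ncol = 3)

# svd() refuses non-finite input; capture the message instead of stopping.
svd_message <- tryCatch(
  {
    svd(scores)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(svd_message)

# Listwise deletion: keep only the rows that have no missing values.
complete_scores <- scores[complete.cases(scores), , drop = FALSE]
print(nrow(complete_scores))

# The decomposition now runs on the complete rows.
singular_values <- svd(complete_scores)$d
print(singular_values)
```

With the real data, the same idea would be to pass something like na.omit(test2[, c(7:28)]) to paran instead of the raw columns. Whether dropping rows is appropriate depends on how much data is missing and why.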

How do I contour a smoothScatter plot with missing values in RStudio?

This is the code I am using:
smoothScatter(longesttimeon$nhmc,longesttimeon$pt2,nrpoints=0)
smoothScatter(longesttimeon$nhmc,longesttimeon$pt2,nrpoints=0,colramp=colorRampPalette(c("white","dodgerblue2","gold","firebrick")))
library(readxl)
kern <- kde2d(longesttimeon$nhmc, longesttimeon$pt2)
contour(kern, drawlabels = FALSE, nlevels = 6,
col = rev(heat.colors(6)), add = TRUE, lwd = 3)
I get this error:
Error in kde2d(longesttimeon$nhmc, longesttimeon$pt2) :
missing or infinite values in the data are not allowed
I am trying to make it look like this
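The error message says kde2d will not accept missing or infinite values, so one option is to filter both vectors with the same logical index before estimating the density. This is a rough sketch, not a definitive fix: the vectors `nhmc` and `pt2` below are invented stand-ins for longesttimeon$nhmc and longesttimeon$pt2, and it assumes the MASS package (which provides kde2d) is installed.

```r
# Hypothetical stand-ins for longesttimeon$nhmc and longesttimeon$pt2.
nhmc <- c(1.2, 2.5, NA, 3.1, 4.8, 2.2, Inf, 3.9)
pt2  <- c(0.5, 1.1, 0.9, NA, 2.0, 1.4, 1.7, 1.9)

# Keep only the pairs where BOTH coordinates are finite (drops NA, NaN, Inf).
keep <- is.finite(nhmc) & is.finite(pt2)
x <- nhmc[keep]
y <- pt2[keep]
print(sum(keep))

# kde2d lives in MASS; run the density estimate only if MASS is available.
if (requireNamespace("MASS", quietly = TRUE)) {
  kern <- MASS::kde2d(x, y)
  print(dim(kern$z))
}
```

The filtered `x` and `y` can then be passed to smoothScatter, and `kern` to contour(..., add = TRUE), so that the scatter and the contours are drawn from the same points.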

Different behavior of the same code on different installations of R

When I run the following code:
library(turboEM)
library(SparseM)
library(quantreg)
library(numDeriv)
library(foreach)
library(parallel)
library(iterators)
library(doParallel)
library(logbin)
library(glm2)
#-------------------------------------------------------------------------
data(heart)
head(heart)
start.p <- sum(data1$deaths) / sum(data1$patients)
fit.glm <- glm(cbind(deaths, patients-deaths) ~ factor(age) + factor(severity) +
factor(onset) + factor(region), family = binomial(log),
start = c(log(start.p), -rep(1e-4, 8)), data = data1,trace = TRUE, maxit = 100)
fit.logbin <- logbin(formula(fit.glm), trace = 1,data=data1)
summary(fit.logbin)
# Speed up convergence by using single EM algorithm
fit.logbin.em <- update(fit.logbin, method = "em")
summary(fit.logbin.em)
# Speed up convergence by using acceleration methods
fit.logbin.acc <- update(fit.logbin, accelerate = "squarem")
summary(fit.logbin.acc)
fit.logbin.em.acc <- update(fit.logbin.em, accelerate = "squarem")
summary(fit.logbin.em.acc)
sink("C:\\Users\\ak.biglarian\\Desktop\\MrsZooghi\\Out.logbin.txt")
summary(fit.logbin.em)
sink()
#-------------------------------------------------------------------------
logbin(formula = formula(fit.glm), data = heart, method = "em",
trace = 1)
On different installations of R, I get different results from the same code!
On one installation the code runs fine, but on another I get the following errors:
> fit.logbin <- logbin(formula(fit.glm), trace = 1,data=data1)
Error in logbin(formula(fit.glm), trace = 1, data = data1) :
unused argument (data = data1)
or
> logbin(formula = formula(fit.glm), data = heart, method = "em",
+ trace = 1)
Error in logbin(formula = formula(fit.glm), data = heart, method = "em", :
unused arguments (formula = formula(fit.glm), data = heart, method = "em")
What is the reason? And how shall I fix this?
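An "unused argument" error means that the function R actually found under that name does not have a formal argument called `data` (or `formula`, `method`). One possible explanation, which is only a guess without seeing the failing installation, is that a different `logbin` (another version of the package, or an object of the same name defined elsewhere) is being picked up. The sketch below reproduces the same kind of error in base R, using `median` purely as a stand-in for `logbin`; the shadowing function is hypothetical.

```r
# Hypothetical shadowing definition: `median` stands in for `logbin` here.
median <- function(values) "shadowed"

# Calling it with the packaged function's argument name fails the same way.
shadow_message <- tryCatch(
  median(x = c(1, 2, 3)),
  error = function(e) conditionMessage(e)
)
print(shadow_message)

# Compare the arguments of what R finds first with the packaged version.
print(names(formals(median)))
print(names(formals(stats::median)))

# A namespace-qualified call bypasses the shadowing copy.
qualified_result <- stats::median(c(1, 2, 3))
print(qualified_result)

# Remove the shadowing definition again.
rm(median)
```

On the installation that fails, the analogous checks would be names(formals(logbin)), environment(logbin), and packageVersion("logbin"), compared against the installation that works. Calling logbin::logbin(...) explicitly would show whether the packaged function itself accepts those arguments.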

Unknown error messages in R using nls() with SSlogis() and predict()

I'm fitting a logistic model (self-starting; SSlogis) to data of multiple populations of birds using nls(). My goal is to fit an expected function to the data (using only part of each dataset) and display a measure of the variance about the expectation on a graph. I then want to fit and plot the observed function (using the entire dataset for each population) to determine if the observed dynamics fell within the variance of the expectation. Here's my code as currently written to accomplish this:
CE.mod = nls(CE.observed ~ SSlogis(t.CattleEgret, Asym, xmid, scal))
with(collapse.data, plot(CE.time, CE.obs))
CE.extrap = predict(CE.mod, data.frame(t.CattleEgret = CE.time))
lines(CE.time, CE.extrap)
CE.se.fit = sqrt(apply(attr(CE.extrap, "gradient"), 1, function(x)
sum(vcov(CE.mod)*outer(x,x))))
matplot(CE.time, CE.extrap+outer(CE.se.fit, qnorm(c(0.5, 0.025, 0.975))),
type = "l", lty = c(1,1,1), ylab = "Abundance (# per party hour)",
xlab = "Time (year)", main = "Cattle Egret Collapse Analysis",
pch = 15, font.lab = 2, font.axis = 2, cex = 4, cex.lab = 1.5,
cex.axis = 2, cex.main = 2, frame.plot = FALSE, lwd = 4, 10)
with(collapse.data, matpoints(CE.time, CE.obs, pch = 15, cex = 3))
lines(CE.time, predict(nls(CE.obs ~ SSlogis(log(CE.time),
Asym, xmid, scal))), lty = 3, lwd = 4)
Where (from the "collapse.data" file):
t.CattleEgret = c(1:20)
CE.time = c(1:45)
CE.obs = c(0.3061324, 0.0000100, 0.2361211, 0.5058240, 2.0685032, 2.1944544,
4.2689494, 4.9508297, 3.1334720, 3.6570752, 5.6753381, 10.9133183,
5.4518257, 20.4166979, 15.9741054, 19.0970426, 13.7559959, 14.1358153,
15.9986416, 29.6762828, 10.3760667, 8.4284488, 6.1060359, 3.7099982,
3.3584060, 2.5981386, 2.5697082, 2.8091952, 5.5487979, 1.6505442,
2.2696972, 2.1835692, 3.6747876, 4.8307886, 3.5019731, 2.8397137,
1.8605288, 11.1848738, 2.6268683, 4.1215127, 2.3996210, 2.6569938,
2.1987387, 3.0267252, 2.4420927)
CE.observed = c(0.3061324, 0.0000100, 0.2361211, 0.5058240, 2.0685032, 2.1944544,
4.2689494, 4.9508297, 3.1334720, 3.6570752, 5.6753381, 10.9133183,
5.4518257, 20.4166979, 15.9741054, 19.0970426, 13.7559959, 14.1358153,
15.9986416, 29.6762828)
That code works fine and produces a figure like this:
If, however, I remove the "log()" from the final line of the code so as to write this:
lines(CE.time, predict(nls(CE.obs ~ SSlogis(CE.time,
Asym, xmid, scal))), lty = 3, lwd = 4)
The line will not plot and I receive this error:
Error in nls(y ~ 1/(1 + exp((xmid - x)/scal)), data = xy, start = list(xmid =
aux[1L], : step factor 0.000488281 reduced below 'minFactor' of 0.000976562
which I cannot get around, even if I adjust nls.control and change the 'minFactor' value. I also get this error message following the initial line defining the model (the ##.mod portion) for some populations.
Also, for some populations I receive an error message following the final line of code that reports this:
Error in qr.solve(QR.B, cc) : singular matrix 'a' in solve
I can think of no rationale for natural log-transforming the data, and I'm left to assume that I have simply altered the data (in this case arbitrarily logged it) in such a way as to allow the predict() and SSlogis() functions to work properly, but I don't know why. I haven't been able to find any suitable answers to this issue in any forums. Any help would be greatly appreciated.
*Update: I've attempted to implement the nlsLM function as recommended by Roland (below). That does indeed clean up the portion of code with the confusing log() use:
lines(CE.time, predict(nlsLM(CE.obs ~ Asym/(1 + exp((xmid - CE.time)/scal)),
start = list(Asym = max(CE.obs), xmid = popsizetime[1], scal = 1),
control = nls.lm.control(maxiter = 1000))))
However, for other populations I run into the same error message as above at the initial model specification:
ChMa.mod = nls(ChMa.observed ~ SSlogis(t.ChestnutMannikin, Asym, xmid, scal))
Error in nls(y ~ 1/(1 + exp((xmid - x)/scal)), data = xy, start = list(xmid =
aux[1L], : step factor 0.000488281 reduced below 'minFactor' of 0.000976562
Switched to:
ChMa.mod = nlsLM(ChMa.observed ~ Asym/(1 + exp((xmid - t.ChestnutMannikin)/
scal)), start = list(Asym = max(ChMa.obs), xmid = popsizetime[2],
scal = 1), control = nls.lm.control(maxiter = 1000))
Where
ChMa.observed = c(4.02785074, 0.33847154, 0.99029776, 2.86516540, 0.59588068,
0.01334333, 2.07693362, 0.62485994, 3.48979515, 3.67785202, 20.84180181)
t.ChestnutMannikin = c(1:11)
popsizetime[2] = 11
While this switch does avoid the error message, nlsLM evaluates the function but does not evaluate the gradient. Without the evaluation of the gradient I cannot use the se.fit code and therefore cannot obtain an estimate of the variance for plotting.
I've found the answer to my problems: I need to add a component of my model that generates a gradient for the function I'm regressing with nlsLM.
log.model = function(t.RedventedBulbul, Asym, xmid, scal) {
numericDeriv(quote(Asym/(1 + exp((xmid - t.RedventedBulbul)/scal))),
c("Asym", "xmid", "scal"), parent.frame())
}
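To illustrate how a numerically computed gradient feeds into the se.fit calculation, here is a minimal sketch under stated assumptions. The data are simulated rather than taken from the bird counts, plain nls() with hand-picked starting values stands in for nlsLM so that only base R is needed, and the names `years`, `abundance`, `fit`, `grad`, and `se_fit` are illustrative. The standard error uses the same delta-method expression as the se.fit code above: for each time point, the gradient g of the fitted curve with respect to the parameters and the covariance matrix V give sqrt(g' V g).

```r
# Simulated logistic growth data (assumption: not the real bird counts).
set.seed(42)
years <- as.numeric(1:20)
abundance <- 20 / (1 + exp((10 - years) / 2)) + rnorm(20, sd = 0.5)

# Fit the explicit logistic formula; nls() stands in for nlsLM() here.
fit <- nls(abundance ~ Asym / (1 + exp((xmid - years) / scal)),
           start = list(Asym = 18, xmid = 9, scal = 1.5))

# Evaluate the fitted curve and its numerical gradient at the estimates.
# numericDeriv() needs the parameters as numeric variables in an environment.
env <- list2env(c(as.list(coef(fit)), list(years = years)))
pred <- numericDeriv(quote(Asym / (1 + exp((xmid - years) / scal))),
                     c("Asym", "xmid", "scal"), env)
grad <- attr(pred, "gradient")  # one row per time point, one column per parameter

# Delta method: se = sqrt(g' V g) for each row g of the gradient.
se_fit <- sqrt(rowSums((grad %*% vcov(fit)) * grad))

# Approximate 95% band around the fitted curve.
band <- as.numeric(pred) + outer(se_fit, qnorm(c(0.5, 0.025, 0.975)))
print(head(band))
```

The `band` matrix has the same layout as the argument passed to matplot() in the question (fitted value, lower limit, upper limit), so it could be plotted the same way.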
