Change x and y labels on a gbm partial plot - r

I am having trouble changing the x and y labels on a partial plot for a gbm model. I need to rename them for the journal article.
I read the model in and summarize it as follows:
library(gbm)
final<- readRDS(final_gbm_model)
summary(final, n.trees=final$n.trees)
Here is the summary output:
var rel.inf
ProbMn50ppb ProbMn50ppb 11.042750
ProbDOpt5ppm ProbDOpt5ppm 7.585275
Ngw_1975 Ngw_1975 6.314080
PrecipMinusETin_1971_2000_GWRP PrecipMinusETin_1971_2000_GWRP 4.988598
N_total N_total 4.776950
DTW60YrJurgens DTW60YrJurgens 4.415016
CVHM_TextZone CVHM_TextZone 4.225048
RiverDist_NEAR RiverDist_NEAR 4.165035
LateralPosition LateralPosition 4.036406
CAML1990_natural_water CAML1990_natural_water 3.720303
PctCoarseMFUpActLayer PctCoarseMFUpActLayer 3.668184
BioClim_BIO12 BioClim_BIO12 3.561071
MFDTWSpr2000Faunt MFDTWSpr2000Faunt 3.383900
PBot_krig PBot_krig 3.362289
WaterUse2 WaterUse2 3.291040
AVG_CLAY AVG_CLAY 3.280454
Age_yrs Age_yrs 3.144734
MFVelSept2000 MFVelSept2000 3.064030
AVG_SILT AVG_SILT 2.882709
ScreenLength ScreenLength 2.683542
HydGrp_C HydGrp_C 2.666106
AVG_POR AVG_POR 2.563147
MFVelFeb2000 MFVelFeb2000 2.505106
HiWatTabDepMin HiWatTabDepMin 2.421521
RechargeAnnualmmWolock RechargeAnnualmmWolock 2.252706
I can create a partial dependence plot as follows:
plot(final,"ProbMn50ppb",n.trees=final$n.trees)
But if I try to set the label arguments I get the following error:
plot(final,"ProbMn50ppb",n.trees=final$n.trees,ylab="LNNO3")
Error in plot.default(X$X1, X$y, type = "l", xlab = x$var.names[i.var], :
formal argument "ylab" matched by multiple actual arguments
How can I change the y and x axis labels?

The plot.gbm function already passes its own xlab and ylab to the underlying plot() call, so a label you supply through ... collides with it and R reports the argument as matched multiple times. You therefore cannot customize the labels in that mode. The authors did provide an alternative, though: with return.grid=TRUE, plot.gbm returns the partial dependence data instead of drawing a plot. You can then pass that data to any plotting function, including ggplot2.
plotdata <- plot(gbm1, return.grid=TRUE)
plot(plotdata, type="l", ylab="ylab", xlab="xlab")
Here gbm1 is the example model from help(gbm).

You can also change the gbm object itself before plotting (or in a function):
your_gbm_obj$var.names[index] = "axis label"
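
The collision described in this answer is not specific to gbm, and a minimal base R sketch can reproduce it. The wrapper plot_wrapper below is hypothetical: it only mimics the pattern visible in the error message (a hard-coded ylab plus forwarded ... arguments) and is not the actual plot.gbm source.

```r
# Hypothetical wrapper, NOT the real plot.gbm: it hard-codes ylab
# and also forwards any extra arguments through `...`.
plot_wrapper <- function(x, y, ...) {
  plot(x, y, type = "l", ylab = "fixed label", ...)
}

pdf(NULL)  # draw to a null device so the sketch runs without opening a window

# Works: no user-supplied ylab, so plot.default() receives ylab only once.
plot_wrapper(1:10, (1:10)^2)

# Fails: plot.default() now receives ylab twice.
msg <- tryCatch(
  {
    plot_wrapper(1:10, (1:10)^2, ylab = "my label")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
invisible(dev.off())
print(msg)
```

Any wrapper written this way rejects a user-supplied label, which is why returning the data and plotting it yourself is the more flexible route.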

Related

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Following is my code.
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
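
The masking problem in the first issue can be illustrated without either package. In this sketch a user-defined sd() stands in, hypothetically, for a second package's function of the same name; the :: prefix still reaches the intended version.

```r
x <- c(1, 2, 3)

# Hypothetical stand-in for a same-named function from another package:
# defining sd() here masks stats::sd().
sd <- function(x) "masking version"

masked   <- sd(x)         # resolves to the masking definition
explicit <- stats::sd(x)  # the namespace prefix always selects stats::sd()

rm(sd)                    # remove the mask again
print(masked)
print(explicit)
```

Writing the package name explicitly, as in dbscan::dbscan(), makes the result independent of the order in which packages were loaded.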
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
  geom_point(mapping = aes(x = Species, y = cluster, colour = cluster),
             position = "jitter") +
  labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.

Is there a way to remove points from a Mclust classification plot in R?

I am trying to plot the GMM of my dataset using the Mclust package in R. While the plotting is a success, I do not want points to show in the final plot, just the ellipses. For a reference, here is the plot I have obtained:
GMM Plot
But, I want the resulting plot to have only the ellipses, something like this:
GMM desired plot
I have been looking at the Mclust plot page at https://rdrr.io/cran/mclust/man/plot.Mclust.html, and from the arguments of the function I see there is scope for passing other graphical parameters. According to the documentation of the plot function, there is a parameter type = 'n' which might do what I want, but when I use it, it produces the following error:
Error in plot.default(data[, 1], data[, 2], type = "n", xlab = xlab, ylab = ylab, :
formal argument "type" matched by multiple actual arguments
For reference, this is the code I used for the first plot:
library(mclust)
Data1_2 <- Mclust(Data, G=15)
summary(Data1_2, parameters = TRUE, classification = TRUE)
plot(Data1_2, what="classification")
The code I tried using for getting the graph below is:
Data1_4 <- Mclust(Data, G=8)
summary(Data1_4, parameters = TRUE, classification = TRUE)
plot(Data1_4, what="classification", type = "n")
Any help on this matter will be appreciated. Thanks!
If you look at the source code of plot.Mclust, it calls plot.Mclust.classification, which in turn calls coordProj to draw the points and ellipses. Inside that function, the point size is controlled by the argument CEX= and the point shape by PCH=.
So for your purpose, do:
library(mclust)
clu = Mclust(iris[,1:4], G = 3)
plot(clu,what="classification",CEX=0)

adjust plot parameters in R while plotting regsubsets object in R (more room below x axis)

I'm trying to adjust the plotting parameters, as I would normally do with par(mar=c(10,4.1,4.1,2.1)), to allow more room below the x-axis for these labels. Right now the variable names are running off the screen.
Is there something about the leaps package or the regsubsets object I'm plotting that ignores par(mar=c(10,4.1,4.1,2.1))?
Here's a boiled down example of what I'm trying to do.
require('leaps')
par(mar=c(10,4.1,4.1,2.1))
leaps <- regsubsets(mpg~disp+hp+drat+wt+qsec, data=mtcars, nbest=2, nvmax=5)
## artificially making labels longer... my labels are longer than this example dataset
labs <- sapply(leaps$xnames, function(x) paste(rep(x,5), collapse=''))
plot(leaps, scale=c('adjr2'), labels=labs)
This is caused by the function (plot.regsubsets()) setting mar within the function body. This overrides the mar that you set.
You can fix this by adding a mar argument to the plot.regsubsets() function and passing it to the par() call on line 3 of the function body:
plot.regsubsets<-function(x,labels=obj$xnames,main=NULL,
                          scale=c("bic","Cp","adjr2","r2"),
                          col=gray(seq(0,0.9,length=10)),mar = c(7,5,6,3)+0.1, ...){
    obj<-x
    lsum<-summary(obj)
    par(mar=mar)
    nmodels<-length(lsum$rsq)
    np<-obj$np
    propscale<-FALSE
    sscale<-pmatch(scale[1],c("bic","Cp","adjr2","r2"),nomatch=0)
    if (sscale==0)
        stop(paste("Unrecognised scale=",scale))
    if (propscale)
        stop(paste("Proportional scaling only for probabilities"))
    yscale<-switch(sscale,lsum$bic,lsum$cp,lsum$adjr2,lsum$rsq)
    up<-switch(sscale,-1,-1,1,1)
    index<-order(yscale*up)
    colorscale<- switch(sscale,
                        yscale,yscale,
                        -log(pmax(yscale,0.0001)),-log(pmax(yscale,0.0001)))
    image(z=t(ifelse(lsum$which[index,],
                     colorscale[index],NA+max(colorscale)*1.5)),
          xaxt="n",yaxt="n",x=(1:np),y=1:nmodels,xlab="",ylab=scale[1],col=col)
    laspar<-par("las")
    on.exit(par(las=laspar))
    par(las=2)
    axis(1,at=1:np,labels=labels)
    axis(2,at=1:nmodels,labels=signif(yscale[index],2))
    if (!is.null(main))
        title(main=main)
    box()
    invisible(NULL)
}
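
The override mechanism can be seen in isolation with a small base R sketch. Both functions below are hypothetical stand-ins that only mimic the pattern (a par(mar=...) call inside the plotting function); they are not the leaps code.

```r
pdf(NULL)  # null device so the sketch runs without opening a window

# Hypothetical function that sets its own margins, as plot.regsubsets() does.
fixed_mar_plot <- function(x, ...) {
  par(mar = c(7, 5, 6, 3) + 0.1)
  plot(x, ...)
  invisible(par("mar"))  # report the margins that were actually in effect
}

# Same idea, but the margins are exposed as an argument with the old default.
flexible_mar_plot <- function(x, mar = c(7, 5, 6, 3) + 0.1, ...) {
  par(mar = mar)
  plot(x, ...)
  invisible(par("mar"))
}

par(mar = c(10, 4.1, 4.1, 2.1))   # user setting made before the call
used_fixed <- fixed_mar_plot(1:5)  # the user setting is overwritten inside
used_flex  <- flexible_mar_plot(1:5, mar = c(10, 4.1, 4.1, 2.1))

invisible(dev.off())
print(used_fixed)
print(used_flex)
```

Keeping the original value as the default of the new mar argument means existing calls behave as before, while callers who need more room can ask for it.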

Plotting vectors in a constrained ordination without labels

I would like to plot vectors from a capscale ordination using VEGAN.
I am familiar with display="bp", but this adds labels that are obscured by the site points. Is there an easy way to remove them? I am happy to add them later, i.e. once the plot is exported and in Word for publication.
My code thus far is as follows:
plot(mod, scaling = 3, type="n")
with(data, points(mod, display="sites", cex=Pointsize,
     pch=ifelse(Cat=="Reference",21,19), bg=Cat))
with(data,text(mod,display="bp"))
Help will be appreciated
Use the points() method instead of the text() method:
points(mod, display = "bp")
(There also should be no need for the with(data) in that last line of code you show.)
Here is a reproducible example:
require(vegan)
data(varespec)
data(varechem)
ord <- cca(varespec ~ ., data = varechem)
plot(ord, type = "n", display = "sites")
points(ord, display = "sites")
points(ord, display = "bp")

error labelling axis of plot using Ecdf

I am attempting to plot a graph using the code below:
require(Hmisc)
Ecdf(ceac_primary,xlab="axis label",xlim=c(5000,50000),q=c(0.9,0.1),
ylab="Probability of Success",main="CEAC")
Where ceac_primary is a data frame with 1 variable of 90k observations.
When I include the 'xlab="axis label"' I keep getting the following error:
Error in Ecdf.default(v, group = group, weights = weights, normwt = normwt, :
formal argument "xlab" matched by multiple actual arguments
However if I exclude the x axis label part of the code, it plots the graph fine.
Is this a known problem, and if so, are there alternative ways to plot an x axis label?
Thanks
Digging around in the source code for Ecdf.data.frame (the method that is called when passing a data.frame to Ecdf), it looks like that function creates an object that is later passed to the xlab argument. Therefore, xlab is not expected as a user-supplied argument when running Ecdf with a data.frame. Here's the code that creates the object lab that gets passed to xlab within Ecdf.data.frame:
lab <- if (vnames == "names")
nam[j]
else label(v, units = TRUE, plot = TRUE, default = nam[j])
Ecdf is then called with xlab = lab, and any arguments in the ellipsis (...) of Ecdf.data.frame are passed along to that call as well. Because xlab is not a formal argument of Ecdf.data.frame, a user-supplied xlab travels through ... and reaches Ecdf a second time, which is why you get the error.
To get around it, try either of the following:
Convert your data.frame to a vector of the appropriate class (numeric, I presume), and then run
Ecdf(ceac_primary_Vec, xlab = "axis label")
Or, you can create a label for the one column in your data.frame using the label function in the Hmisc package. If that column is called myCol, you can run
label(ceac_primary$myCol) <- "axis label"
Ecdf(ceac_primary)
And that should get your axis label printing correctly.
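
As a minimal sketch of the first option, the single column can be pulled out of the data frame with [[1]]. The data frame ceac_example and its values are made up for illustration, and the plot uses base R's ecdf() rather than Hmisc's Ecdf(), as one possible alternative that accepts xlab directly.

```r
# Hypothetical one-column data frame standing in for ceac_primary
ceac_example <- data.frame(cost = c(6000, 12000, 18000, 25000, 40000))

# [[1]] extracts the single column as a plain numeric vector
cost_vec <- ceac_example[[1]]

pdf(NULL)  # null device so the sketch runs without opening a window

# Base R alternative (stats::ecdf, not Hmisc::Ecdf): the labels are
# passed once each, so there is no collision.
plot(ecdf(cost_vec), xlab = "axis label",
     ylab = "Probability of Success", main = "CEAC")

invisible(dev.off())
```

The extracted vector can equally be passed to Ecdf() as in the first suggestion above.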
