Is there a way to remove points from a Mclust classification plot in R?

I am trying to plot the GMM of my dataset using the mclust package in R. The plotting works, but I do not want points to show in the final plot, just the ellipses. For reference, here is the plot I obtained:
GMM Plot
But, I want the resulting plot to have only the ellipses, something like this:
GMM desired plot
I have been looking at the plot.Mclust help page at https://rdrr.io/cran/mclust/man/plot.Mclust.html, and among the function's arguments I see there is scope for passing other graphical parameters. The documentation of the plot function lists a parameter, type = 'n', which might do what I want, but when I add it, I get the following error:
Error in plot.default(data[, 1], data[, 2], type = "n", xlab = xlab, ylab = ylab, :
formal argument "type" matched by multiple actual arguments
For reference, this is the code I used for the first plot:
library(mclust)
Data1_2 <- Mclust(Data, G=15)
summary(Data1_2, parameters = TRUE, classification = TRUE)
plot(Data1_2, what="classification")
The code I tried using for getting the graph below is:
Data1_4 <- Mclust(Data, G=8)
summary(Data1_4, parameters = TRUE, classification = TRUE)
plot(Data1_4, what="classification", type = "n")
Any help on this matter will be appreciated. Thanks!

If you look at the source code of plot.Mclust, it calls plot.Mclust.classification, which in turn calls coordProj to draw the points and ellipses. Inside this function, the point size is controlled by the option CEX= and the point shape by PCH=.
So for your purpose, do:
library(mclust)
clu = Mclust(iris[,1:4], G = 3)
plot(clu,what="classification",CEX=0)

Related

Suppress graph output of a function [duplicate]

I am trying to turn off the display of plot in R.
I read Disable GUI, graphics devices in R but the only solution given is to write the plot to a file.
What if I don't want to pollute the workspace, and what if I don't have write permission?
I tried options(device=NULL) but it didn't work.
The context is the NbClust package: I want what NbClust() returns, but I do not want to display the plot it produces.
Thanks in advance!
Edit: Here is a reproducible example using data from the rattle package :)
data(wine, package="rattle")
df <- scale (wine[-1])
library(NbClust)
# This produces a graph output which I don't want
nc <- NbClust(df, min.nc=2, max.nc=15, method="kmeans")
# This is the plot I want ;)
barplot(table(nc$Best.n[1,]),
xlab="Number of Clusters", ylab="Number of Criteria",
main="Number of Clusters Chosen by 26 Criteria")
You can wrap the call in
pdf(file = NULL)
and
dev.off()
This sends all the output to a null file which effectively hides it.
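As a minimal sketch of that wrapping, assuming a hypothetical helper named quietly_plot (it is not part of NbClust or base R), the null device can be opened and closed around any call that plots as a side effect:

```r
# Hypothetical helper: evaluate an expression while graphics output is discarded.
quietly_plot <- function(expr) {
  pdf(file = NULL)     # open a PDF device that writes to no file
  on.exit(dev.off())   # close that device again, even if expr fails
  force(expr)          # lazy evaluation: expr only runs now, on the null device
}

# Stand-in for a function that plots as a side effect and returns a value.
noisy_mean <- function(x) {
  plot(x)
  mean(x)
}

result <- quietly_plot(noisy_mean(1:10))
```

For the question's case, the call would presumably look like nc <- quietly_plot(NbClust(df, min.nc=2, max.nc=15, method="kmeans")). Using on.exit() means the device is closed even when the wrapped call throws an error.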
Luckily it seems that NbClust is one giant messy function with some other functions in it and lots of icky looking code. The plotting is done in one of two places.
Create a copy of NbClust:
> MyNbClust = NbClust
and then edit this function. Change the header to:
MyNbClust <-
function (data, diss = "NULL", distance = "euclidean", min.nc = 2,
max.nc = 15, method = "ward", index = "all", alphaBeale = 0.1, plotetc=FALSE)
{
and then wrap the plotting code in if blocks. Around line 1588:
if(plotetc){
par(mfrow = c(1, 2))
[etc]
cat(paste(...
}
and similarly around line 1610. Save. Now use:
nc = MyNbClust(...etc....)
and you see no plots unless you add plotetc=TRUE.
Then ask the devs to include your patch.

plot function in R producing legend without legend() being called

I'm trying to produce a cumulative incidence plot for a competing hazards survival analysis using plot() in R. For some reason, the plot that is produced has a legend that I have not called. The legend is intersecting with the lines on my graph and I can't figure out how to get rid of it. Please help!
My code is as follows:
CompRisk2 <- cuminc(ftime=ADI$time_DeathTxCensor, fstatus=ADI$status, group=ADI$natADI_quart)
cols <- c("darkorange","coral1","firebrick1","firebrick4","lightskyblue","darkturquoise","dodgerblue","dodgerblue4")
par(bg="white")
plot(CompRisk2,
col=cols,
xlab="Years",
ylab="Probability of Mortality or Transplant",
xlim=c(0,10),
ylim=c(0,0.6))
Which produces the following plot:
I tried adding the following code to move the legend out of the frame, but I got an error:
legend(0,5, legend=c(11,21,31,41,12,22,32,42),
col=c("darkorange","coral1","firebrick1","firebrick4","lightskyblue","darkturquoise","dodgerblue","dodgerblue4"),
lty=1:2, cex=0.8, text.font=4, box.lty=0)
Error: Error in title(...) : invalid graphics parameter
Any help would be much appreciated!
You are using the cuminc function from the cmprsk package. This produces an object of class cuminc, which has an S3 plot method. ?plot.cuminc shows you the documentation and typing plot.cuminc shows you the code.
There is some slightly obscure code that suggests a workaround:
u <- list(...)
if (length(u) > 0) {
i <- pmatch(names(u), names(formals(legend)), 0)
do.call("legend", c(list(x = wh[1], y = wh[2], legend = curvlab,
col = color, lty = lty, lwd = lwd, bty = "n", bg = -999999),
u[i > 0]))
}
This says that any additional arguments passed in ... whose names match the names of arguments to legend will be passed to legend(). legend() has a plot argument:
plot: logical. If ‘FALSE’, nothing is plotted but the sizes are returned.
So it looks like adding plot=FALSE to your plot() command will work.
In principle you could try looking at the other arguments to legend() and see if any of them will adjust the legend position/size as you want. Unfortunately the x argument to legend (which would determine the horizontal position) is masked by the first argument to plot.cuminc.
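To see which arguments that filter lets through, here is a small sketch of the same pmatch() test applied to a made-up argument list (the list u below is an illustration, not what plot.cuminc actually receives):

```r
# Arguments a user might pass through ... (made up for illustration).
u <- list(plot = FALSE, xlab = "Years", cex = 0.8)

# Same test as in the quoted code: 0 means no match among legend()'s formals.
i <- pmatch(names(u), names(formals(legend)), 0)

# Names of the arguments that would be forwarded to legend().
forwarded <- names(u)[i > 0]
forwarded
```

Here plot and cex are kept because legend() has arguments with those names, while xlab is dropped because it does not.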
I don't think that the ellipsis arguments are intended for the legend call inside plot.cuminc. The code quoted in Ben's answer suggests that there is a wh argument that determines the location of the legend: its two elements are passed to legend() as the x and y positions. If you look at the help page for plot.cuminc, you do in fact find that wh is documented.
I cannot test this because you have not offered us access to the ADI-object but my suggestion would be to try:
opar <- par(xpd=TRUE) # xpd lets graphics be placed 'outside'
plot(CompRisk2,
col=cols, wh=c(-.5, 7),
xlab="Years",
ylab="Probability of Mortality or Transplant",
xlim=c(0,10),
ylim=c(0,0.6))
par(opar) # restores original graphics parameters
It's always a bit risky to put out a code chunk without testing, but I'm happy to report that I did find a suitable test case and it seems to work reasonably as predicted. I used the code below, built on the object from a prior SO question about using the gg-packages with cmprsk:
library(cmprsk)
# some simulated data to get started
comp.risk.data <- data.frame("tfs.days" = rweibull(n = 100, shape = 1, scale = 1)*100,
"status.tfs" = c(sample(c(0,1,1,1,1,2), size=50, replace=T)),
"Typing" = sample(c("A","B","C","D"), size=50, replace=T))
# fitting a competing risks model
CR <- cuminc(ftime = comp.risk.data$tfs.days,
fstatus = comp.risk.data$status.tfs,
cencode = 0,
group = comp.risk.data$Typing)
opar <- par(xpd=TRUE) # xpd lets graphics be placed 'outside'
plot(CR,
wh=c(-15, 1.1), # obviously different than the OP's coordinates
xlab="Years",
ylab="Probability of Mortality or Transplant",
xlim=c(0,400),
ylim=c(0,1))
par(opar) # restores graphics parameters
I get the legend to move up and leftward from its original position.

How do I change line thickness in denscomp plots from the fitdistrplus package in R?

I'm over-plotting three densities onto my data histogram, using denscomp in the fitdistrplus package in R. The code below is working perfectly, but I don't know how to make the lines thicker.
denscomp(list(TryWeibull, TryGamma, TryLognormal), legendtext = plot.legend,
fitcol = c("indianred3","gray38", "darkblue"), fitlty = c("dashed", "longdash", "dotdash"),
xlab = "Age", ylab = "Proportion", main="")
fitcol is giving me the correct colours and fitlty is giving me the correct line types, but I can't work out the argument to make the lines thicker. I have two distribution densities that are close together, and I have been unable to distinguish them clearly using colour and line type alone.
I am trying to de-emphasize the Weibull and emphasise the gamma and lognormal. The proportions are estimates, so I am trying to fit the general shape, not the exact values.
I can't see an option in the denscomp function to specify line widths. I would rather not use the ggplot option, but can shift to that if required. I was hoping there was a function option I'm overlooking.
Edited to add: I raised this as a feature request on GitHub and it has been implemented into the package.
Although the author of this package allows you to specify multiple line types (fitlty) and line colours (fitcol), they didn't allow you to specify multiple line widths. But since R is open-source, you are free to modify the function in any way.
Type the following at the R console:
fix(denscomp)
Then add a new argument to the function after fitcol, called fitlwd.
..., fitcol, fitlwd, addlegend = TRUE, ...
Then after line 30 add the following:
if (missing(fitlwd))
fitlwd <- 1
Then after line 34 add the following:
fitlwd <- rep(fitlwd, length.out = nft)
Then modify line 136 as follows:
col = fitcol[i], lwd=fitlwd[i], ...)
Finally, modify line 142:
col = fitcol, lwd=fitlwd,
Save and call the new function as before but now specifying the fitlwd argument:
denscomp(..., fitlwd=c(1,3,3))
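The two added snippets (the missing() default and the rep() call) can be tried in isolation. This is only a sketch: recycle_lwd is a hypothetical stand-in for the relevant part of the edited denscomp, and nft is assumed to be the number of fitted distributions, as in the line above.

```r
# Hypothetical stand-in for the fitlwd handling added to denscomp.
recycle_lwd <- function(nft, fitlwd) {
  if (missing(fitlwd))
    fitlwd <- 1                    # default width when the argument is omitted
  rep(fitlwd, length.out = nft)    # recycle to one width per fitted distribution
}

default_widths <- recycle_lwd(3)            # no fitlwd supplied
custom_widths  <- recycle_lwd(3, c(1, 3))   # shorter vector is recycled
```

With three fits, omitting fitlwd gives a width of 1 for every line, and a two-element vector such as c(1, 3) is recycled to 1, 3, 1.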
I had the same question and followed Edward's solution, which was great and I learnt a lot, but it turned out you can just use ggplot to do that.
denscomp(..., plotstyle = "ggplot") + geom_line(linetype = "dashed", size = 1)

Change x and y labels on a gbm partial plot

I am having trouble changing the x and y labels on a partial plot for a gbm model. I need to rename them for the journal article.
I read this in and create the plot as follows:
library(gbm)
final<- readRDS(final_gbm_model)
summary(final, n.trees=final$n.trees)
Here is the summary output:
var rel.inf
ProbMn50ppb ProbMn50ppb 11.042750
ProbDOpt5ppm ProbDOpt5ppm 7.585275
Ngw_1975 Ngw_1975 6.314080
PrecipMinusETin_1971_2000_GWRP PrecipMinusETin_1971_2000_GWRP 4.988598
N_total N_total 4.776950
DTW60YrJurgens DTW60YrJurgens 4.415016
CVHM_TextZone CVHM_TextZone 4.225048
RiverDist_NEAR RiverDist_NEAR 4.165035
LateralPosition LateralPosition 4.036406
CAML1990_natural_water CAML1990_natural_water 3.720303
PctCoarseMFUpActLayer PctCoarseMFUpActLayer 3.668184
BioClim_BIO12 BioClim_BIO12 3.561071
MFDTWSpr2000Faunt MFDTWSpr2000Faunt 3.383900
PBot_krig PBot_krig 3.362289
WaterUse2 WaterUse2 3.291040
AVG_CLAY AVG_CLAY 3.280454
Age_yrs Age_yrs 3.144734
MFVelSept2000 MFVelSept2000 3.064030
AVG_SILT AVG_SILT 2.882709
ScreenLength ScreenLength 2.683542
HydGrp_C HydGrp_C 2.666106
AVG_POR AVG_POR 2.563147
MFVelFeb2000 MFVelFeb2000 2.505106
HiWatTabDepMin HiWatTabDepMin 2.421521
RechargeAnnualmmWolock RechargeAnnualmmWolock 2.252706
I can create a partial dependence plot as follows:
plot(final,"ProbMn50ppb",n.trees=final$n.trees)
But if I try to set the label arguments I get the following error:
plot(final,"ProbMn50ppb",n.trees=final$n.trees,ylab="LNNO3")
Error in plot.default(X$X1, X$y, type = "l", xlab = x$var.names[i.var], :
formal argument "ylab" matched by multiple actual arguments
How can I change the y and x axis labels?
The plot.gbm function passes its own xlab and ylab to the generic plot function, so a label you supply through ... collides with them. You therefore cannot customize the labels in that mode. But the authors did provide an alternative: when you set return.grid=TRUE, it returns the data itself instead of building a plot. You can then use that for any plot, including ggplot2.
plotdata <- plot(gbm1, return.grid=TRUE)
plot(plotdata, type="l", ylab="ylab", xlab="xlab")
Here gbm1 is the example model from help(gbm).
You can also change the gbm object itself before plotting (or in a function):
your_gbm_obj$var.names[index] = "axis label"

error labelling axis of plot using Ecdf

I am attempting to plot a graph using the code below:
require(Hmisc)
Ecdf(ceac_primary,xlab="axis label",xlim=c(5000,50000),q=c(0.9,0.1),
ylab="Probability of Success",main="CEAC")
Where ceac_primary is a data frame with 1 variable of 90k observations.
When I include the 'xlab="axis label"' I keep getting the following error:
Error in Ecdf.default(v, group = group, weights = weights, normwt = normwt, :
formal argument "xlab" matched by multiple actual arguments
However if I exclude the x axis label part of the code, it plots the graph fine.
Is this a known problem, and if so, are there alternative ways to plot an x axis label?
Thanks
Digging around in the source code for Ecdf.data.frame (the method that is called when passing a data.frame to Ecdf), it looks like that function creates an object that is later passed to the xlab argument. Therefore, xlab is not expected as a user-supplied argument when running Ecdf with a data.frame. Here's the code that creates the object lab that gets passed to xlab within Ecdf.data.frame:
lab <- if (vnames == "names")
nam[j]
else label(v, units = TRUE, plot = TRUE, default = nam[j])
Then Ecdf is called with xlab = lab, and any arguments in the ellipsis (...) of Ecdf.data.frame are passed along to Ecdf as well. Because xlab is not a formal argument of Ecdf.data.frame, a user-supplied xlab ends up in ... and is passed a second time, which is why you get the error.
To get around it, try either of the following:
Convert your data.frame to a vector of the appropriate class (numeric, I presume), and then run
Ecdf(ceac_primary_Vec, xlab = "axis label")
Or, you can create a label for the one column in your data.frame using the label function in the Hmisc package. If that column is called myCol, you can run
label(ceac_primary$myCol) <- "axis label"
Ecdf(ceac_primary)
And that should get your axis label printing correctly.
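The mechanism behind the "matched by multiple actual arguments" error can be reproduced without Hmisc. The sketch below uses two made-up functions, inner and wrapper, that stand in for Ecdf and Ecdf.data.frame; it is not their real code.

```r
# Made-up stand-in for the function that finally draws the plot.
inner <- function(x, xlab = NULL, ...) {
  paste("label:", xlab)
}

# Made-up stand-in for a method that fixes xlab itself and forwards ... as well.
wrapper <- function(x, ...) {
  inner(x, xlab = "set by wrapper", ...)
}

ok <- wrapper(1)   # works: xlab is supplied only once

# Supplying xlab again sends it through ..., so inner() receives it twice.
err <- tryCatch(wrapper(1, xlab = "mine"),
                error = function(e) conditionMessage(e))
```

The first call succeeds, while the second fails because xlab reaches inner() both explicitly and through the ellipsis.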
