R: ggbiplot legend doesn't show different colours AND shapes

I admit this question has already been asked at least two times, but none of the provided answers worked for me.
My data is very similar to the ggbiplot wine example, so I'll just use this to illustrate my problem.
I want my datapoints to be differently coloured AND shaped, so they look nice when printed in colour but can still be identified when printed in b/w. This actually works just fine, but the legend doesn't play along.
require(scales)
library(devtools)
install_github("vqv/ggbiplot")
library(ggbiplot)
data(wine)
wine.pca <- prcomp(wine, scale. = TRUE)
ggbiplot(wine.pca, obs.scale = 1, var.scale = 1,
groups = wine.class, ellipse = TRUE, circle = TRUE, var.axes=FALSE) +
scale_color_discrete(name = '') +
geom_point(aes(colour=wine.class,shape=wine.class),size=2) +
theme(legend.direction = 'horizontal', legend.position = 'top')
produces this:
The legend is obviously not what I want. The shapes and colours should be in the same legend.
(As a bonus, I'd also like to remove the small lines from the points, but if that's not easily doable I can live with them. )
I already tried what was suggested here and here.
The first suggestion
data(wine)
wine.pca <- prcomp(wine, scale. = TRUE)
ggbiplot(wine.pca, obs.scale = 1, var.scale = 1,
groups = wine.class, ellipse = TRUE, circle = TRUE, var.axes=FALSE) +
scale_color_discrete(name = '') +
geom_point(aes(colour = wine.class), size = "2") +
scale_color_manual(values=c(16,17,18)) +
theme(legend.direction = 'horizontal', legend.position = 'top')
produces this error message (and no plot at all)
Scale for 'colour' is already present. Adding another scale for 'colour',
which will replace the existing scale.
Fehler in coords$size * .pt :
nicht-numerisches Argument für binären Operator
(the last two lines translate to "error in coords$size * .pt : non-numeric argument for binary operator")
The second suggestion
data(wine)
wine.pca <- prcomp(wine, scale. = TRUE)
ggbiplot(wine.pca, obs.scale = 1, var.scale = 1,
groups = wine.class, ellipse = TRUE, circle = TRUE, var.axes=FALSE) +
scale_color_discrete(name = '') +
geom_point(size = 2) +
scale_colour_manual(name = "Wines",
labels = c("barolo","grignolino","barbera"),
values = c("blue", "red", "green")) +
scale_shape_manual(name = "Wines",
labels = c("barolo","grignolino","barbera"),
values = c(17,18,19))
gives me this error message
Scale for 'colour' is already present. Adding another scale for 'colour',
which will replace the existing scale.
and a funny plot with coloured ellipses, legend with coloured points, but black datapoints.
I'm afraid my knowledge of R plotting isn't enough to make anything of this. Can someone point me in the right way?
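For what it's worth, the two failures above seem to have separate causes. In the first attempt, size = "2" is a character string, which would explain the non-numeric argument error, and scale_color_manual(values = c(16, 17, 18)) passes what look like shape codes to a colour scale. In the second attempt, geom_point(size = 2) maps neither colour nor shape, so the points stay black. In ggplot2, the colour and shape legends are merged only when both aesthetics map the same variable and both scales share the same name and labels. Below is a minimal sketch of that principle using plain ggplot2 on PCA scores; iris stands in for the wine data and the colour and shape values are arbitrary choices, so this is not ggbiplot's own API.

```r
library(ggplot2)

# PCA on a built-in data set (iris is an assumption, standing in for wine)
pca <- prcomp(iris[, 1:4], scale. = TRUE)
scores <- data.frame(pca$x[, 1:2], group = iris$Species)

# Map colour AND shape to the same variable, and give both scales the same
# name, so ggplot2 can merge them into a single legend
p <- ggplot(scores, aes(PC1, PC2, colour = group, shape = group)) +
  geom_point(size = 2) +  # size must be numeric, not the string "2"
  scale_colour_manual(name = "Species",
                      values = c("blue", "red", "darkgreen")) +
  scale_shape_manual(name = "Species", values = c(16, 17, 15)) +
  theme(legend.direction = "horizontal", legend.position = "top")
print(p)
```

The same pair of scales, with matching name and labels, could presumably be added to a ggbiplot() result once the extra geom_point() layer maps both colour and shape to the grouping variable.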

Related

R correlation plot using ggcorrplot2: "x-axis" labels get cropped

I am using ggcorrplot2 (github page) to generate my correlation plots, since I need to overlay significance levels as *** on top.
This package relies on ggplot2, so I thought it would be easy to change different features like axis label font size, asterisk color, gradient colors, etc. But it is proving to be more complicated than I thought.
My current problem at hand is that the "x-axis" labels get cropped out of the plotting area... As you see below, this isn't actually the x-axis, but rather labels placed on top of the diagonal cells. Hence, it is quite difficult to change them.
Check out this MWE. I first did this:
data(mtcars)
#change "wt" to a very long name
names(mtcars)[6] <- "a very long name"
corrtest <- psych::corr.test(mtcars[,1:7], adjust="none")
all_matrix <- corrtest$r
all_pmat <- corrtest$p
###
P <- ggcorrplot2::ggcorrplot(all_matrix, type = "lower", method = "circle", p.mat = all_pmat, show.diag = FALSE,
insig = "label_sig", sig.lvl = c(0.05, 0.01, 0.001), pch = "*", pch.cex = 6) +
ggplot2::theme(axis.text.y=ggplot2::element_text(size=15),
legend.text=ggplot2::element_text(size=15))
grDevices::pdf(file="heat_all2.pdf", height=6, width=6)
print(
P
)
grDevices::dev.off()
Which produces this:
As you can see, I was able to modify the y-axis labels with ggplot2 theme, but not the "x-axis" labels or anything else...
So I figured I could use ggplot_build and tweak the plot before actually printing it, and I did the following:
P <- ggcorrplot2::ggcorrplot(all_matrix, type = "lower", method = "circle", p.mat = all_pmat, show.diag = FALSE,
insig = "label_sig", sig.lvl = c(0.05, 0.01, 0.001), pch = "*", pch.cex = 6) +
ggplot2::theme(axis.text.y=ggplot2::element_text(size=15),
legend.text=ggplot2::element_text(size=15))
P2 <- ggplot2::ggplot_build(P)
P2$data[[4]]$size <- 5
P2$data[[4]]$hjust <- 0
P2$data[[3]]$angle <- 15
P2$data[[3]]$colour <- "grey30"
grDevices::pdf.options(reset = TRUE, onefile = FALSE)
grDevices::pdf(file="heat_all2.pdf", height=6, width=6)
print(
graphics::plot(ggplot2::ggplot_gtable(P2))
)
grDevices::dev.off()
Which produces this:
Very close, but still not quite there yet. The problems I keep encountering are the following:
The "x-axis" labels get cropped
Weird grey area on top and bottom of the plot
I want to change the color gradient so the darker blue and darker red aren't that dark
I attempted to solve this by adding plot.margin=grid::unit(c(0,3,0,0),"cm") to theme, but the result is this (still cropped label and more grey space on top and bottom of the plot):
Any help? Thanks!
In the original function, the author set expand = c(0, 0) in scale_x_continuous(). You just need to override that part to get what you want:
library(tidyverse)
library(ggcorrplot2)
data(mtcars)
# change "wt" to a very long name
names(mtcars)[6] <- "a very long name"
corrtest <- psych::corr.test(mtcars[, 1:7], adjust = "none")
all_matrix <- corrtest$r
all_pmat <- corrtest$p
###
P <- ggcorrplot2::ggcorrplot(all_matrix,
type = "lower", method = "circle", p.mat = all_pmat, show.diag = FALSE,
insig = "label_sig", sig.lvl = c(0.05, 0.01, 0.001), pch = "*", pch.cex = 6) +
ggplot2::theme(axis.text.y = ggplot2::element_text(size = 15),
legend.text = ggplot2::element_text(size = 15))
P +
scale_x_continuous(expand = expansion(mult = c(0, 0.25)))
#> Scale for 'x' is already present. Adding another scale for 'x', which will
#> replace the existing scale.
Created on 2020-09-01 by the reprex package (v0.3.0)
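To see what the expansion() call does in isolation, here is a small sketch on an ordinary ggplot2 scatter plot (mtcars and the wt/mpg mapping are just stand-ins, not the correlation plot itself). It assumes ggplot2 3.3.0 or later, where expansion() is available, and reads the resulting x range back from the built plot.

```r
library(ggplot2)

# Plain scatter plot standing in for the correlation plot
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # no padding on the left, 25% of the data range added on the right
  scale_x_continuous(expand = expansion(mult = c(0, 0.25)))

# Compare the panel's x range with the range of the data
x_range <- ggplot_build(p)$layout$panel_params[[1]]$x.range
print(x_range)
print(range(mtcars$wt))
```

The lower limit should coincide with the smallest data value while the upper limit extends past the largest one, which is the extra room the long label needs.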

contourplot color and labels options in Lattice for R

I am quite new to Lattice and I am stuck with some possibly basic coding. I am using shapefiles and geoTIFFs to produce maps of animal distribution, and in particular I have:
1 x point shapefile
2 x geoTIFF
1 x polygon shapefile
I am overlaying a levelplot of one of the geoTIFFs (a UD generated with adehabitatHR) with a contourplot of the same geoTIFF at specific intervals (percentile values), a contourplot of the second geoTIFF (depth raster from ETOPO2) for three specific values (-200, -1000 and -2000), the point shapefile (animal locations) and the polygon shapefile (land). All works fine, but I need to change the font size of the contour plot labels, their length (i.e. from 0.12315 to 0.123) and their positioning for all the contourplots. For the depth contourplot I would like to change the style of each line to something like "continuous line", "dashed line" and "dotted line", and for the contourplot of the UD I would like to change the color of each line using a yellow to red palette.
As far as I understand, I should use panel functions to implement these changes (e.g. Controlling z labels in contourplot), but I am not quite sure how to do it. Part of my code to generate the "plot":
aa <-
quantile(
UD_raster,
probs = c(0.25, 0.75),
type = 8,
names = TRUE
)
my.at <- c(aa[1], aa[2])
depth <- c(-200, -1000, -2000)
levelplot(
UD_raster,
xlab = "",
ylab = "",
margin = FALSE,
contour = FALSE,
col.regions = viridis(100),
main = "A",
maxpixels = 2e5
) + layer(sp.polygons(Land, fill = "grey40", col = NA)) + layer(sp.points(locations, pts = 2, col = "red")) + contourplot(
UD_raster,
at = my.at,
labels = TRUE,
margin = FALSE
) + contourplot(
ETOPO2,
at = depth,
labels = TRUE,
margin = FALSE
)
A simplified image, with no UD layer and no point shapefile can be found here and as you can see it is pretty messy. Thanks for your help.
So far, for the ETOPO2 contourplot, I have solved it by removing the labels and adding the lty argument to style the line. Because I can't figure out how to use lty with a different value for each line in my contour, I have repeated the contourplot call three times on the same surface, one for each contour I am interested in (this was easy because I only need three contours).
For the position, font and font size of the labels of the remaining contourplot I have used
labels = list(cex = 0.8, "verdana"),
label.style = "flat"
To "shorten" the labels I have used the round function, specifying the number of decimal places to round to.
So now my new code looks like:
aa <-
quantile(
UD_raster,
probs = c(0.25, 0.75),
type = 8,
names = TRUE
)
my.at <- c(aa[1], aa[2])
my.at <- round(my.at, 3)
levelplot(
UD_raster,
xlab = "",
ylab = "",
margin = FALSE,
contour = FALSE,
col.regions = viridis(100),
main = "A",
maxpixels = 2e5
) + layer(sp.polygons(Land, fill = "grey40", col = NA)) + layer(sp.points(locations, pts = 2, col = "red")) + contourplot(
UD_raster,
at = my.at,
labels = list(cex = 0.8, "verdana"),
label.style = "flat",
margin = FALSE
) + contourplot(
ETOPO2,
at = -200,
labels = FALSE,
margin = FALSE,
lty = 1,
pretty = TRUE
) + contourplot(
ETOPO2,
at = -1000,
labels = FALSE,
margin = FALSE,
lty = 2,
pretty = TRUE
) + contourplot(
ETOPO2,
at = -2000,
labels = FALSE,
margin = FALSE,
lty = 3,
pretty = TRUE
)
As one could expect, it takes a bit longer to produce the plot. Still no idea on how to change the colors of the UD contourplot.
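One possible way to get a separate colour and line type for each contour without repeating contourplot() is to compute the contour lines once with contourLines() and draw them in a custom panel function. The sketch below uses only lattice and the built-in volcano matrix as a stand-in for the rasters; the levels, the yellow to red palette and the label placement at the first vertex of each line are all arbitrary assumptions, and a Raster object would first need converting to x, y and z values.

```r
library(lattice)

lev  <- c(120, 150, 180)                                  # contour levels (arbitrary)
cols <- colorRampPalette(c("yellow", "red"))(length(lev)) # one colour per level
ltys <- c(1, 2, 3)                                        # solid, dashed, dotted

# levelplot() on a matrix uses the row index as x and the column index as y
cl <- contourLines(seq_len(nrow(volcano)), seq_len(ncol(volcano)),
                   volcano, levels = lev)

p <- levelplot(volcano, col.regions = terrain.colors(100),
               xlab = "", ylab = "",
               panel = function(...) {
                 panel.levelplot(...)
                 # draw every contour segment with the style of its level
                 for (ln in cl) {
                   i <- match(ln$level, lev)
                   panel.lines(ln$x, ln$y, col = cols[i], lty = ltys[i], lwd = 1.5)
                   # rounded label at the first vertex of the segment
                   panel.text(ln$x[1], ln$y[1],
                              labels = round(ln$level, 3), cex = 0.8)
                 }
               })
print(p)
```

Because the contours are computed once and drawn in a single panel call, this may also avoid the slowdown that comes from stacking several contourplot() layers.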

Customize border of ggbiplot points

Given the following code using the ggbiplot library available via devtools::install_github():
library(ggbiplot)
data(iris)
log.ir <- log(iris[, 1:4])
ir.species <- iris[, 5]
ir.pca <- prcomp(log.ir, center = TRUE, scale. = TRUE)
g <- ggbiplot(ir.pca, obs.scale = 1, var.scale = 1, groups = ir.species)
g <- g + theme(legend.direction = 'vertical', legend.position = 'right')
g <- g + scale_color_manual(values=c("blue", "red", "green"))
print(g)
what is the best way to customize the border of the data points based on the grouping? I used scale_color_manual() to customize the color of those data points, but I can't think of a way to do that for the border.
Thanks
Assuming you want to adjust the border of the data points themselves...
The ggbiplot() call itself won't give you this flexibility, but setting alpha = 0 will make the points plotted by ggbiplot invisible (really, 100% transparent). Then you can add a separate layer with a geom_point() call where you specify shape as one of the 5 shapes (21-25) that have both a fill (the middle) and a color (the border) aesthetic.
ggbiplot(ir.pca, obs.scale = 1, var.scale = 1, groups = ir.species, alpha = 0) +
theme(legend.direction = 'vertical', legend.position = 'right') +
scale_color_manual(values=c("blue", "red", "green")) +
scale_fill_manual(values = c("red", "green", "blue")) + # just offset by one to show
geom_point(size = 3, shape = 21, aes(fill = groups, color = groups))
PS: It's probably always a good idea to mention in your question that the package you are using is only available via devtools::install_github() and not the standard install.packages().

How to change the line type of ellipses in ggbiplot?

Is it possible to change the type of lines of the normal probability ellipsoids in ggbiplot, e.g. have them dashed and dotted lines instead of or additional to the different colors?
I couldn't find anything in the documentation of ggbiplot except this to be used as MWE:
library(ggbiplot)
data(wine)
wine.pca <- prcomp(wine, scale. = TRUE)
print(ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups = wine.class, ellipse = TRUE, circle = TRUE))
To the best of my knowledge it isn't possible with any of the arguments passed to ggbiplot. Luckily, ggbiplot is a pretty simple wrapper for some ggplot2 commands and data massaging. You can copy the source code, make a custom function, and change line 124 of the original source from:
g <- g + geom_path(data = ell, aes(color = groups, group = groups))
to:
g <- g + geom_path(data = ell, aes(color = groups, group = groups, linetype = groups))
Because of the plot scale it's hard to tell the lines apart without changing the size outside of the aes() statement.
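If editing the ggbiplot source is not an option, a rough alternative is to build the plot directly with ggplot2 and map linetype inside stat_ellipse(). This is a swapped-in technique, not ggbiplot itself: the sketch assumes iris as stand-in data, and type = "norm" with level = 0.68 is chosen only to resemble ggbiplot's normal probability ellipses, so the curves may not match exactly.

```r
library(ggplot2)

# PCA scores for the first two components, plus the grouping variable
pca <- prcomp(iris[, 1:4], scale. = TRUE)
scores <- data.frame(pca$x[, 1:2], group = iris$Species)

p <- ggplot(scores, aes(PC1, PC2, colour = group)) +
  geom_point() +
  # one ellipse per group, each with its own line type
  stat_ellipse(aes(linetype = group), type = "norm", level = 0.68) +
  scale_linetype_manual(values = c("solid", "dashed", "dotted"))
print(p)
```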

How to manipulate y-axis text labels in R varImpPlot?

The following sample resembles my dataset:
require(randomForest)
alpha = c(1,2,3,4,5,6)
bravo = c(2,3,4,5,6,7)
charlie = c(2,6,5,3,5,6)
mydata = data.frame(alpha,bravo,charlie)
myrf = randomForest(alpha~bravo+charlie, data = mydata, importance = TRUE)
varImpPlot(myrf, type = 2)
I cannot seem to control the placement of the y-axis labels in varImpPlot. I have tried altering the plot parameters (e.g. mar, oma), with no success. I need the y-axis labels shifted to the left in order to produce a PDF with proper label placement.
How can I shift the y-axis labels to the left?
I tried to use the adj parameter, but it produces an error. Since varImpPlot uses dotchart behind the scenes, here is a version using lattice's dotplot instead. You can then customize your axes using the scales parameter.
library(lattice)
imp <- importance(myrf, class = NULL, scale = TRUE, type = 2)
dotplot(imp, scales=list(y =list(cex=2,
at = c(1,2),
col='red',
rot =20,
axs='i') ,
x =list(cex=2,col='blue')) )
You can extract the data needed to construct the plot from myrf and build the plot with ggplot. Doing so gives you more freedom in tweaking the plot. Here are some examples:
library(ggplot2)
str(myrf)
str(myrf$importance)
data <- as.data.frame(cbind(rownames(myrf$importance),round(myrf$importance[,"IncNodePurity"],1)))
colnames(data) <- c("Parameters","IncNodePurity")
data$IncNodePurity <- as.numeric(as.character(data$IncNodePurity))
Standard plot:
(p <- ggplot(data) + geom_point(aes(IncNodePurity,Parameters)))
Rotate y-axis labels:
(p1 <- p+ theme(axis.text.y = element_text(angle = 90, hjust = 1)))
Some more tweaking (also first plot shown here):
(p2 <- p1 + scale_x_continuous(limits=c(3,7),breaks=3:7) + theme(axis.title.y = element_blank()))
Plot that looks like the varImpPlot (second plot shown here) :
(p3 <- p2+ theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.major.y = element_line(colour = 'gray', linetype = 'dashed'),
panel.background = element_rect(fill='white', colour='black')))
Saving to pdf is easy with ggplot:
ggsave("randomforestplot.pdf",p2)
or
ggsave("randomforestplot.png",p2)
p2
p3
Did I understand correctly that you want to move the texts charlie and bravo further left of the plot boundary? If so, here's one hack to achieve this, based on modifying the rownames used in plotting:
myrf = randomForest(alpha~bravo+charlie, data = mydata, importance = TRUE)
#add white spaces at the end of the rownames
rownames(myrf$importance)<-paste(rownames(myrf$importance), " ")
varImpPlot(myrf, type = 2)
The adj parameter in dotchart is fixed at 0 (left-justified), so it cannot be changed without modifying the code of dotchart:
mtext(labs, side = 2, line = loffset, at = y, adj = 0, col = color,
las = 2, cex = cex, ...)
(from dotchart)
EDIT:
You can make another type of hack also. Take the code of dotchart, change the above line to
mtext(labs, side = 2, line = loffset, at = y, adj = adjust_ylab, col = color,
las = 2, cex = cex, ...)
Then add the argument adjust_ylab to the argument list, and rename the function to, for example, dotchartHack. Now copy the code of varImpPlot, find the line which calls dotchart, change the function name to dotchartHack and add the argument adjust_ylab = adjust_ylab to the function call. Rename this function to varImpPlotHack and add adjust_ylab to its argument list. Now you can change the alignment of charlie and bravo by changing the parameter adjust_ylab:
myrf = randomForest(alpha~bravo+charlie, data = mydata, importance = TRUE)
varImpPlotHack(myrf, type = 2,adjust_ylab=0.5)
From ?par:
The value of adj determines the way in which text strings are
justified in text, mtext and title. A value of 0 produces
left-justified text, 0.5 (the default) centered text and 1
right-justified text. (Any value in [0, 1] is allowed, and on most
devices values outside that interval will also work.)
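As a lighter-weight illustration of how adj and line interact in mtext, here is a sketch that draws a dotchart-like plot by hand in base graphics. The importance values are hypothetical placeholders (not output from the randomForest model above), and the margin and line settings are arbitrary starting points to tune for your PDF.

```r
# Hypothetical importance values, standing in for
# myrf$importance[, "IncNodePurity"]
imp  <- c(bravo = 4.2, charlie = 3.1)
ypos <- seq_along(imp)

op <- par(mar = c(4, 8, 2, 1))  # widen the left margin for the labels
plot(imp, ypos, pch = 19, yaxt = "n", ylab = "", xlab = "IncNodePurity",
     ylim = c(0.5, length(imp) + 0.5))
abline(h = ypos, lty = "dotted", col = "gray")
# adj = 0: left-justified; the text starts 6 margin lines out from the axis.
# Increase 'line' to shift the labels further left.
mtext(names(imp), side = 2, line = 6, at = ypos, las = 2, adj = 0)
par(op)                         # restore the previous margins
```

With real data, imp would be replaced by the importance column extracted from the fitted model.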

Resources