autoplot - How to adjust loading labels? - r

I would like to be able to adjust the positions of the loading labels so that they do not fall on top of the arrows. However, I do not know where the adjustments need to be made. geom_text can be used to adjust the position of the site labels, but I cannot find where the vectors are stored in str(g).
library(ggplot2)
library(ggfortify)
df <- data.frame(replicate(10,sample(-10:10,10,rep=TRUE)))
names(df) <- c('up','down','left','right','circle','square','triangle','x','r1','l1')
rownames(df) <- paste('Dummy Site', seq(0,9,1))
g <- autoplot(prcomp(df[,-11], scale=TRUE), data=df,
loadings.label=TRUE, loadings=TRUE,
loadings.label.size=8, loadings.colour='blue',
label.size=5) +
geom_text(vjust=-1, label=rownames(df)) +
theme(plot.background=element_blank(),
panel.background=element_rect(fill='transparent',color='black',size=1),
legend.text=element_text(hjust=1),
legend.key=element_blank())
g
I've looked in ggplot2::theme and I've examined the help docs for autoplot, but I can't find any mention of adjusting the label position. Bonus points if the adjustment can follow the direction of each arrow, but a static adjustment would be acceptable.
Currently, here is what the plot looks like:

You can get the coordinates by layer_data(g, 2). But autoplot(prcomp.obj) passes other arguments to ggbiplot(), so you can change label and loadings.label position using arguments of ggbiplot(), such as loadings.label.hjust (see ?ggbiplot).
example code:
arrow_ends <- layer_data(g, 2)[,c(2,4)]
autoplot(prcomp(df[,-11], scale=TRUE), data=df,
loadings.label=TRUE, loadings=TRUE,
loadings.label.size=8, loadings.colour='blue',
label.size=5, loadings.label.vjust = 1.2) + # change loadings.label position
geom_point(data = arrow_ends, aes(xend, yend), size = 3) + # the coordinates from layer_data(...)
geom_text(vjust=-1, label=rownames(df)) +
theme(plot.background=element_blank(),
panel.background=element_rect(fill='transparent',color='black',size=1),
legend.text=element_text(hjust=1),
legend.key=element_blank())
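For the bonus request (an offset that follows the direction of each arrow), one option is to compute the label positions yourself from the arrow end points. The following is only a minimal sketch in base R and not part of ggfortify: the helper name offset_along_arrow and its pad argument are made up for illustration, the arrow tips are invented stand-ins for layer_data(g, 2), and the arrows are assumed to start at the origin.

```r
# Hypothetical helper: push each label a fixed distance past its arrow tip,
# along the direction of the arrow (arrows are assumed to start at (0, 0)).
offset_along_arrow <- function(xend, yend, pad = 0.02) {
  len <- sqrt(xend^2 + yend^2)
  len[len == 0] <- 1  # avoid dividing by zero for zero-length arrows
  data.frame(x = xend + pad * xend / len,
             y = yend + pad * yend / len)
}

# Made-up arrow tips standing in for layer_data(g, 2)[, c("xend", "yend")]
tips <- data.frame(xend = c(0.3, 0, -0.4), yend = c(0.4, -0.2, 0))
label_pos <- offset_along_arrow(tips$xend, tips$yend, pad = 0.05)
print(label_pos)
```

The resulting x and y columns could then be drawn with a separate geom_text(data = ..., inherit.aes = FALSE) layer in place of loadings.label = TRUE; the pad value would need tuning to the scale of the plot.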

Related

Insert rectangle outside of ggplot to visualize plot segments

I hope you can help me. I would like to visualize segments within a plot using a rectangle placed next to the y or x axis, which means it would sit outside the plot area. It should look similar to the image below:
I tried to reach the mentioned output by trying two different approaches:
I created two viewports with the grid package, putting the plot in one viewport placed at the bottom and a second viewport on top of it. The big problem here is that I need the coordinates where the grey background panel of the ggplot starts, so that I can place the top viewport exactly there and the segments coincide with the length of the x-axis. My code looked like the following:
container_viewport <- viewport(x=0,y=0,height=1,width=1,just = c("left","bottom"))
pushViewport(container_viewport)
grid.draw(rectGrob())
popViewport()
section_viewport <- viewport(x=0.055,y=0.99,height=0.085,width=0.935,just=c("left","top"))
pushViewport(section_viewport)
plot_obj <- ggplot_build(testplot)
plot_data <- plot_obj$data[[1]]
grid.draw(rectGrob(gp = gpar(col = "red")))
popViewport()
plot_viewport <- viewport(x=0,y=0,height=0.9,width=1,just=c("left","bottom"))
pushViewport(plot_viewport)
grid.draw(ggplotGrob(testplot))
popViewport()
This looks fine but I had to hardcode the coordinates of the viewport at the top.
I used grid.arrange() to stack the plots vertically (instead of a grob for the rectangle, as in the other approach, I create a ggplot for it). Basically the same problem exists here, since I somehow need to put the plot representing the rectangle at the top in the right position along the x-axis. My code looked like the following:
p1 <- plot_data %>%
ggplot()+
geom_rect(aes(xmin=-Inf,xmax=Inf,ymin=-Inf,ymax=Inf))
p2 <- testplot
test_plot <- grid.arrange(p1,p2,heights=c(1,10))
This approach does not work very well.
Since I would like a solution that can be applied generally, trial and error with the viewport coordinates is not an option: the length of the y-axis label or tick labels can vary and, with it, the size and coordinates of the background panel. Once this step is done, segmenting the rectangle should no longer be a problem.
Maybe this is just not possible, but if it is, I would appreciate any help.
Thank you!
I would probably use patchwork here. Let's start by replicating your plot:
library(ggplot2)
library(patchwork)
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
geom_point(color = "red") +
labs(x = "test", y = "test")
p
That looks very similar. Now we define (in our own co-ordinates) where we want the section split to occur on the x axis.
section_split <- 5.25
Using just this number, we add rectangles and text annotations that cover a copy of our original plot, and remove its axis annotations using theme_void:
p2 <- p +
annotate("rect", xmin = c(-Inf, section_split), ymin = c(-Inf, -Inf),
xmax = c(section_split, Inf), ymax = c(Inf, Inf),
fill = c("#00a2e8", "#ff7f27")) +
annotate("text", label = c("Section A", "Section B"), size = 6,
y = rep(mean(layer_scales(p)$y$range$range), 2),
x = c((min(layer_scales(p)$x$range$range) + section_split)/2,
(max(layer_scales(p)$x$range$range) + section_split)/2)) +
theme_void()
Now we just draw this second plot above our first, adjusting the relative heights to about 1:10:
p2/p + plot_layout(heights = c(1, 10))
The benefit of doing it this way is that, since we copied the original plot, the positional mapping of the x axis is identical between the two plots, and patchwork will automatically line up the panels.
Created on 2023-02-04 with reprex v2.0.2
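If more than two sections are needed, the label positions passed to annotate("text", ...) can be computed rather than written out by hand. This is a small base-R sketch under that assumption; section_midpoints is a hypothetical helper, not a patchwork or ggplot2 function, and the axis range is taken directly from iris$Sepal.Length rather than from layer_scales(p).

```r
# Hypothetical helper: midpoints of the sections that the split points cut out
# of an axis range; these can serve as x positions for the section labels.
section_midpoints <- function(axis_range, splits) {
  edges <- c(axis_range[1], sort(splits), axis_range[2])
  (head(edges, -1) + tail(edges, -1)) / 2
}

x_range <- range(iris$Sepal.Length)
mids <- section_midpoints(x_range, splits = 5.25)
print(mids)
```

With a single split this gives the same two x positions as the manual expressions in the answer; passing several values to splits yields one midpoint per section.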

Multiple Splines using ggplot2 + Different colours + Line width + Custom X-axis markings

I have two small sets of points, viz. (1,a1),...,(9,a9) and (1,b1),...,(9,b9). I'm trying to interpolate these two sets of points separately using splines with the help of ggplot2. So, what I want is two different spline curves interpolating the two sets of points on the same plot (refer to the end of this post).
Since I have very little plotting experience with ggplot2, I copied a code snippet from this answer by Richard Telford. First, I stored the Y-values for the two sets of points in two numeric variables A and B, and wrote the following code:
library(ggplot2)
library(plyr)
A <- c(a1,...,a9)
B <- c(b1,...,b9)
d <- data.frame(x=1:9,y=A)
d2 <- data.frame(x=1:9,y=B)
dd <- rbind(cbind(d, case = "d"), cbind(d2, case = "d2"))
ddsmooth <- plyr::ddply(dd, .(case), function(k) as.data.frame(spline(k)))
ggplot(dd,aes(x, y, group = case)) + geom_point() + geom_line(aes(x, y, group = case), data = ddsmooth)
This produces the following output :
Now, I'm looking for an almost identical plot with the following customizations:
The two spline curves should have different colours
The line width should be user's choice (Like we do in plot function)
A legend (Specifying the colour and the corresponding attribute)
Markings on the X-axis should be 1,2,3,...,9
Hoping for a detailed solution to my problem, though any kind of help is appreciated. Thanks in advance for your time and help.
You have already shaped your data correctly for the plot. It's just a case of associating the case variable with colour and size scales.
Note the following:
I have inferred the values of A and B from your plot
Since the lines are opaque, we plot them first so that the points are still visible
I have included size and colour parameters to the aes call in geom_line
I have selected the colours by passing them as a character vector to scale_colour_manual
I have also selected the sizes of the lines by calling scale_size_manual
I have set the x axis breaks by adding a call to scale_x_continuous
The legend has been added automatically according to the scales used.
ggplot(dd, aes(x, y)) +
geom_line(aes(colour = case, size = case, linetype = case), data = ddsmooth) +
geom_point(colour = "black") +
scale_colour_manual(values = c("red4", "forestgreen"), name = "Legend") +
scale_size_manual(values = c(0.8, 1.5), name = "Legend") +
scale_linetype_manual(values = 1:2, name = "Legend") +
scale_x_continuous(breaks = 1:9)
Created on 2020-07-15 by the reprex package (v0.3.0)
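The ddsmooth data frame used above comes from the plyr::ddply() call in the question. For readers without plyr, the same per-group interpolation could be sketched in base R as follows. The y values for A and B are invented for illustration, since the original post does not give them.

```r
# Made-up y values; the original post does not state A and B.
A <- c(2, 4, 3, 6, 5, 8, 7, 9, 6)
B <- c(1, 2, 4, 3, 5, 4, 6, 5, 7)
dd <- rbind(data.frame(x = 1:9, y = A, case = "d"),
            data.frame(x = 1:9, y = B, case = "d2"))

# Interpolate each case separately; by default spline() returns
# 3 * length(x) equally spaced points between min(x) and max(x).
ddsmooth <- do.call(rbind, lapply(split(dd, dd$case), function(k) {
  s <- spline(k$x, k$y)
  data.frame(x = s$x, y = s$y, case = k$case[1])
}))
rownames(ddsmooth) <- NULL
str(ddsmooth)
```

The result has the same x, y and case columns as the ddply() version, so it can be passed to geom_line(data = ddsmooth) unchanged.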

3-variables plotting heatmap ggplot2

I'm currently working on a very simple data.frame, containing three columns:
x contains x-coordinates of a set of points,
y contains y-coordinates of the set of points, and
weight contains a value associated to each point;
Now, working in ggplot2, I seem to be able to plot contour levels for these data, but I can't find a way to fill the plot according to the variable weight. Here's the code that I used:
ggplot(df, aes(x,y, fill=weight)) +
geom_density_2d() +
coord_fixed(ratio = 1)
You can see that there's no filling whatsoever, sadly.
I've been trying for three days now, and I'm starting to get depressed.
Specifying fill=weight and/or color = weight in the general ggplot call, resulted in nothing. I've tried to use different geoms (tile, raster, polygon...), still nothing. Tried to specify the aes directly into the geom layer, also didn't work.
Tried to convert the object as a ppp but ggplot can't handle them, and also using base-R plotting didn't work. I have honestly no idea of what's wrong!
I'm attaching the first 10 points' data, which is spaced on an irregular grid:
x = c(-0.13397460,-0.31698730,-0.13397460,0.13397460,-0.28867513,-0.13397460,-0.31698730,-0.13397460,-0.28867513,-0.26794919)
y = c(-0.5000000,-0.6830127,-0.5000000,-0.2320508,-0.6547005,-0.5000000,-0.6830127,-0.5000000,-0.6547005,0.0000000)
weight = c(4.799250e-01,5.500250e-01,4.799250e-01,-2.130287e+12,5.798250e-01,4.799250e-01,5.500250e-01,4.799250e-01,5.798250e-01,6.618956e-01)
Any advice? The desired output would be something along the lines of the linked image.
Thank you in advance.
From your description geom_density doesn't sound right.
You could try geom_raster:
ggplot(df, aes(x,y, fill = weight)) +
geom_raster() +
coord_fixed(ratio = 1) +
scale_fill_gradientn(colours = rev(rainbow(7))) # colourmap
Here is a second-best option that uses fill=..level.. (there is a good explanation of ..level.. here).
# load libraries
library(ggplot2)
library(RColorBrewer)
library(ggthemes)
# build your data.frame
df <- data.frame(x=x, y=y, weight=weight)
# build color Palette
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")), space="Lab")
# Plot
ggplot(df, aes(x,y, fill=..level..) ) +
stat_density_2d( bins=11, geom = "polygon") +
scale_fill_gradientn(colours = myPalette(11)) +
theme_minimal() +
coord_fixed(ratio = 1)
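Because geom_raster draws equally sized tiles, points on an irregular grid may need to be binned (or interpolated) onto a regular grid first. One possible preprocessing step, sketched in base R, is to average weight within the cells of a regular grid. The names bin_weights and n_bins are illustrative and not part of ggplot2, and the toy data below is invented to keep the example small.

```r
# Hypothetical helper: average `weight` inside the cells of a regular
# n_bins x n_bins grid spanning the range of the data.
bin_weights <- function(df, n_bins = 20) {
  x_breaks <- seq(min(df$x), max(df$x), length.out = n_bins + 1)
  y_breaks <- seq(min(df$y), max(df$y), length.out = n_bins + 1)
  # all.inside = TRUE keeps points on the upper edge inside the last cell
  cells <- data.frame(
    xi = findInterval(df$x, x_breaks, all.inside = TRUE),
    yi = findInterval(df$y, y_breaks, all.inside = TRUE),
    weight = df$weight
  )
  agg <- aggregate(weight ~ xi + yi, data = cells, FUN = mean)
  # Cell centres, used as tile positions
  x_mid <- (head(x_breaks, -1) + tail(x_breaks, -1)) / 2
  y_mid <- (head(y_breaks, -1) + tail(y_breaks, -1)) / 2
  data.frame(x = x_mid[agg$xi], y = y_mid[agg$yi], weight = agg$weight)
}

# Invented toy data: two points share a cell, one sits in the opposite corner
toy <- data.frame(x = c(0, 0, 1), y = c(0, 0, 1), weight = c(1, 3, 5))
binned <- bin_weights(toy, n_bins = 2)
print(binned)
```

The result has one row per occupied cell and could be passed to geom_tile() or geom_raster() with aes(x, y, fill = weight). Extreme values such as the -2.13e12 in the sample data would still dominate the colour scale and may need separate handling.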

ggrepel: Repelling text in only one direction, and returning values of repelled text

I have a dataset, where each data point has an x-value that is constrained (represents an actual instance of a quantitative variable), y-value that is arbitrary (exists simply to provide a dimension to spread out text), and a label. My datasets can be very large, and there is often text overlap, even when I try to spread the data across the y-axis as much as possible.
Hence, I am trying to use the new ggrepel. However, I am trying to keep the text labels constrained at their x-value position, while only allowing them to repel from each other in the y-direction.
As an example, the code below produces a plot for 32 data points, where the x-values show the number of cylinders in a car, and the y-values are determined randomly (they have no meaning other than to provide a second dimension for text plotting purposes). Without using ggrepel, there is significant overlap in the text:
library(ggrepel)
library(ggplot2)
set.seed(1)
data = data.frame(x=runif(100, 1, 10),y=runif(100, 1, 10),label=paste0("label",seq(1:100)))
origPlot <- ggplot(data) +
geom_point(aes(x, y), color = 'red') +
geom_text(aes(x, y, label = label)) +
theme_classic(base_size = 16)
I can remedy the text overlap using ggrepel, as shown below. However, this changes not only the y-values, but also the x-values. I am trying to avoid changing the x-values, as they represent an actual physical meaning (the number of cylinders):
repelPlot <- ggplot(data) +
geom_point(aes(x, y), color = 'red') +
geom_text_repel(aes(x, y, label = label)) +
theme_classic(base_size = 16)
As a note, the reason I cannot allow the x-value of the text to change is that I am only plotting the text (not the points). In contrast, most examples in ggrepel seem to keep the position of the points (so that their values remain true) and only repel the x and y values of the labels. The points are then connected to the labels with segments (you can see that in my second plot example).
I kept the points in the two examples above for demonstration purposes. However, I am only retaining the text (and hence will be removing the points and the segments), leaving me with something like this:
repelPlot2 <- ggplot(data) + geom_text_repel(aes(x, y, label = label), segment.size = 0) + theme_classic(base_size = 16)
My question is twofold:
1) Is it possible for me to repel the text labels only in the y-direction?
2) Is it possible for me to obtain a structure containing the new (repelled) y-values of the text?
Thank you for any advice!
ggrepel version 0.6.8 (install from GitHub using devtools::install_github) now supports a "direction" argument, which enables repelling of labels only in the "x" or "y" direction.
repelPlot2 <- ggplot(data) + geom_text_repel(aes(x, y, label = label), segment.size = 0, direction = "y") + theme_classic(base_size = 16)
Getting the y values is harder -- one approach can be to use the "repel_boxes" function from ggrepel first to get repelled values and then input those into ggplot with geom_text. For discussion and sample code of that approach, see https://github.com/slowkow/ggrepel/issues/24. Note that if using the latest version, the repel_boxes function now also has a "direction" argument, which takes in "both","x", or "y".
I don't think it is possible to repel text labels only in one direction with ggrepel.
I would approach this problem differently, by instead generating the arbitrary y-axis positions manually. For example, for the data set in your example, you could do this using the code below.
I have used the dplyr package to group the data set by the values of x, and then created a new column of data y containing the row numbers within each group. The row numbers are then used as the values for the y-axis.
library(ggplot2)
library(dplyr)
data <- data.frame(x = mtcars$cyl, label = paste0("label", seq(1:32)))
data <- data %>%
group_by(x) %>%
mutate(y = row_number())
ggplot(data, aes(x = x, y = y, label = label)) +
geom_text(size = 2) +
xlim(3.5, 8.5) +
theme_classic(base_size = 8)
ggsave("filename.png", width = 4, height = 2)
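The per-group row number does not strictly require dplyr. As a rough base-R equivalent of the group_by() and mutate() step above, ave() can number the rows within each value of x. This is only a sketch of the same idea, using the built-in mtcars data as in the answer.

```r
# Same data as above: x is the number of cylinders, one label per car
data <- data.frame(x = mtcars$cyl,
                   label = paste0("label", seq_len(nrow(mtcars))))

# ave() applies seq_along() within each group of x, which gives a
# running row number per group (1, 2, 3, ... for each cylinder count)
data$y <- ave(seq_along(data$x), data$x, FUN = seq_along)
head(data)
```

The resulting data frame has the same x, label and y columns and can be passed to the ggplot() call in the answer unchanged.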

Label minimum and maximum of scale fill gradient legend with text: ggplot2

I have a plot created in ggplot2 that uses scale_fill_gradientn. I'd like to add text at the minimum and maximum of the scale legend. For example, at the legend minimum display "Minimum" and at the legend maximum display "Maximum". There are posts using discrete fills and adding labels with numbers instead of text (e.g. here), but I am unsure how to use the labels feature with scale_fill_gradientn to insert text only at the min and max. At present I keep getting errors:
Error in scale_labels.continuous(scale, breaks) :
Breaks and labels are different lengths
Is this text label possible within ggplot2 for this type of scale / fill?
# The example code here produces a plot for illustrative purposes only.
# create data frame, from ggplot2 documentation
df <- expand.grid(x = 0:5, y = 0:5)
df$z <- runif(nrow(df))
#plot
ggplot(df, aes(x, y, fill = z)) + geom_raster() +
scale_fill_gradientn(colours=topo.colors(7),na.value = "transparent")
For scale_fill_gradientn() you should provide both arguments, breaks= and labels=, with the same length. With the limits= argument you can extend the colorbar to the minimum and maximum values you need.
ggplot(df, aes(x, y, fill = z)) + geom_raster() +
scale_fill_gradientn(colours=topo.colors(7),na.value = "transparent",
breaks=c(0,0.5,1),labels=c("Minimum",0.5,"Maximum"),
limits=c(0,1))
Didzis Elfert's answer is not quite automatic enough in my opinion (though it does of course point to the core of the problem, +1 :).
Here is an option to programmatically define the minimum and maximum of your data.
Advantages:
You will not need to hard code values any more (which is error prone)
You will not need to hard code the limits (which is also error prone)
Passing a named vector: you don't need the labels argument (manually mapping labels to values is also error prone).
As a side effect you will avoid the "non-matching labels/breaks" problem
library(ggplot2)
foo <- expand.grid(x = 0:5, y = 0:5)
foo$z <- runif(nrow(foo))
myfuns <- list(Minimum = min, Mean = mean, Maximum = max)
ls_val <- unlist(lapply(myfuns, function(f) f(foo$z)))
# you only need to set the breaks argument!
ggplot(foo, aes(x, y, fill = z)) +
geom_raster() +
scale_fill_gradientn(
colours = topo.colors(7),
breaks = ls_val
)
# You can obviously also replace the middle value with sth else
ls_val[2] <- 0.5
names(ls_val)[2] <- 0.5
ggplot(foo, aes(x, y, fill = z)) +
geom_raster() +
scale_fill_gradientn(
colours = topo.colors(7),
breaks = ls_val
)

Resources