Consider the following graph
d1 = data.frame(x=LETTERS[1:2],y=c(1.9,2.3))
d2 = data.frame(x=LETTERS[1:2],y=c(1.9,3))
ggplot(d1, aes(x=x,y=y)) + geom_point(data=d1, color="red") +
geom_point(data=d2, color="blue")
The goal is to dodge the blue toward the right and the red dots toward the left. One way would be to merge the two data.frames
d1$category=1
d2$category=2
d = rbind(d1,d2)
d$category = as.factor(d$category)
ggplot(d, aes(x=x,y=y, color=category)) +
geom_point(data=d, position=position_dodge(0.3)) +
scale_color_manual(values=c("red","blue"))
Is there a another solution (a solution that does not require merging the data.frames)?
You can use position_nudge():
library(ggplot2)
d1 <- data.frame(x = LETTERS[1:2], y = c(1.9, 2.3))
d2 <- data.frame(x = LETTERS[1:2], y = c(1.9, 3))
ggplot(d1, aes(x, y)) +
geom_point(d1, color = "red", position = position_nudge(- 0.05)) +
geom_point(d2, color = "blue", position = position_nudge(0.05))
Related
I first make a plot
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill='red',alpha=..level..),geom='polygon', show.legend = F)
Then I want to change the geom_density values and use these in another plot.
# build plot
q <- ggplot_build(p)
# Change density
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
Build the other plot using the changed densities, something like this:
# Built another plot
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_density2d(dens)
This does not work however is there a way of doing this?
EDIT: doing it when there are multiple groups:
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40), group = c(rep('A',40), rep('B',60), rep('C',26)))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill=group,alpha=..level..),geom='polygon', show.legend = F)
q <- ggplot_build(p)
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F)
Results when applied to my own dataset
Although this is exactly what I'm looking for the fill colors seem not to correspond to the initial colors (linked to A, B and C):
Like this? It is possible to plot a transformation of the shapes plotted by geom_density. But that's not quite the same as manipulating the underlying density...
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F)
Edit - OP now has multiple groups. We can plot those with the code below, which produces an artistic plot of questionably utility. It does what you propose, but I would suggest it would be more fruitful to transform the underlying data and summarize that, if you are looking for representative output.
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = group, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F) +
theme_minimal()
I want to overlay two plots: one is a simple point plot where a variable is used to control the dot size; and another is a simple curve.
Here is a dummy example for the first plot;
library(ggplot2)
x <- seq(from = 1, to = 10, by = 1)
df = data.frame(x=x, y=x^2, v=2*x)
ggplot(df, aes(x, y, size = v)) + geom_point() + theme_classic() + scale_size("blabla")
Now lets overlay a curve to this plot with data from another dataframe:
df2 = data.frame(x=x, y=x^2-x+2)
ggplot(df, aes(x, y, size = v)) + geom_point() + theme_classic() + scale_size("blabla") + geom_line(data=df2, aes(x, y), color = "blue") + scale_color_discrete(name = "other", labels = c("nanana"))
It produces the error:
Error in FUN(X[[i]], ...) : object 'v' not found
The value in v is not used to draw the intended curse, but anyway, I added a dummy v to df2.
df2 = data.frame(x=x, y=x^2-x+2, v=replicate(length(x),0)) # add a dummy v
ggplot(df, aes(x, y, size = v)) + geom_point() + theme_classic() + scale_size("blabla") + geom_line(data=df2, aes(x, y), color = "blue") + scale_color_discrete(name = "other", labels = c("nanana"))
An the result has a messed legend:
What is the right way to achieve the desired plot?
You can put the size aes in the geom_point() call to make it so that you don't need the dummy v in df2.
Not sure exactly what you want regarding the legend. If you replace the above, then the blue portion goes away. If you want to have a legend for the line color, then you have to place color inside the geom_line aes call.
x <- seq(from = 1, to = 10, by = 1)
df = data.frame(x=x, y=x^2, v=2*x)
df2 = data.frame(x=x, y=x^2-x+2)
ggplot(df, aes(x, y)) +
geom_point(aes(size = v)) +
theme_classic() +
scale_size("blabla") +
geom_line(data=df2, aes(x, y, color = "blue")) +
scale_color_manual(values = "blue", labels = "nanana", name = "other")
I am using the ggplot function to plot this kind of graph
image
I want to add the specific value of the x-axis as shown in the picture
this is my code :
quantiles <- quantile(mat,prob = quant)
x <- as.vector(mat)
d <- as.data.frame(x=x)
p <- ggplot(data = d,aes(x=x)) + theme_bw() +
geom_histogram(aes(y = ..density..), binwidth=0.001,color="black",fill="white") +
geom_density(aes(x=x, y = ..density..),fill="blue", alpha=0.5, color = 'black')
x.dens <- density(x)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)
p <- p + geom_area(data = subset(df.dens, x <= quantiles), aes(x=x,y=y),
fill = 'green', alpha=0.6)
print(p)
Is there a method to overlay something analogous to a density curve when the vertical axis is frequency or relative frequency? (Not an actual density function, since the area need not integrate to 1.) The following question is similar:
ggplot2: histogram with normal curve, and the user self-answers with the idea to scale ..count.. inside of geom_density(). However this seems unusual.
The following code produces an overinflated "density" line.
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)
hist.1a <- ggplot(df1, aes(v)) +
stat_bin(aes(y = ..count..), color = "black", fill = "blue",
breaks = b1) +
geom_density(aes(y = ..count..))
hist.1a
#joran's response/comment got me thinking about what the appropriate scaling factor would be. For posterity's sake, here's the result.
When Vertical Axis is Frequency (aka Count)
Thus, the scaling factor for a vertical axis measured in bin counts is
In this case, with N = 164 and the bin width as 0.1, the aesthetic for y in the smoothed line should be:
y = ..density..*(164 * 0.1)
Thus the following code produces a "density" line scaled for a histogram measured in frequency (aka count).
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)
hist.1a <- ggplot(df1, aes(x = v)) +
geom_histogram(aes(y = ..count..), breaks = b1,
fill = "blue", color = "black") +
geom_density(aes(y = ..density..*(164*0.1)))
hist.1a
When Vertical Axis is Relative Frequency
Using the above, we could write
hist.1b <- ggplot(df1, aes(x = v)) +
geom_histogram(aes(y = ..count../164), breaks = b1,
fill = "blue", color = "black") +
geom_density(aes(y = ..density..*(0.1)))
hist.1b
When Vertical Axis is Density
hist.1c <- ggplot(df1, aes(x = v)) +
geom_histogram(aes(y = ..density..), breaks = b1,
fill = "blue", color = "black") +
geom_density(aes(y = ..density..))
hist.1c
Try this instead:
ggplot(df1,aes(x = v)) +
geom_histogram(aes(y = ..ncount..)) +
geom_density(aes(y = ..scaled..))
library(ggplot2)
smoothedHistogram <- function(dat, y, bins=30, xlabel = y, ...){
gg <- ggplot(dat, aes_string(y)) +
geom_histogram(bins=bins, center = 0.5, stat="bin",
fill = I("midnightblue"), color = "#E07102", alpha=0.8)
gg_build <- ggplot_build(gg)
area <- sum(with(gg_build[["data"]][[1]], y*(xmax - xmin)))
gg <- gg +
stat_density(aes(y=..density..*area),
color="#BCBD22", size=2, geom="line", ...)
gg$layers <- gg$layers[2:1]
gg + xlab(xlabel) +
theme_bw() + theme(axis.title = element_text(size = 16),
axis.text = element_text(size = 12))
}
dat <- data.frame(x = rnorm(10000))
smoothedHistogram(dat, "x")
I know that when you use par( fig=c( ... ), new=T ), you can create inset graphs. However, I was wondering if it is possible to use ggplot2 library to create 'inset' graphs.
UPDATE 1: I tried using the par() with ggplot2, but it does not work.
UPDATE 2: I found a working solution at ggplot2 GoogleGroups using grid::viewport().
Section 8.4 of the book explains how to do this. The trick is to use the grid package's viewports.
#Any old plot
a_plot <- ggplot(cars, aes(speed, dist)) + geom_line()
#A viewport taking up a fraction of the plot area
vp <- viewport(width = 0.4, height = 0.4, x = 0.8, y = 0.2)
#Just draw the plot twice
png("test.png")
print(a_plot)
print(a_plot, vp = vp)
dev.off()
Much simpler solution utilizing ggplot2 and egg. Most importantly this solution works with ggsave.
library(ggplot2)
library(egg)
plotx <- ggplot(mpg, aes(displ, hwy)) + geom_point()
plotx +
annotation_custom(
ggplotGrob(plotx),
xmin = 5, xmax = 7, ymin = 30, ymax = 44
)
ggsave(filename = "inset-plot.png")
Alternatively, can use the cowplot R package by Claus O. Wilke (cowplot is a powerful extension of ggplot2). The author has an example about plotting an inset inside a larger graph in this intro vignette. Here is some adapted code:
library(cowplot)
main.plot <-
ggplot(data = mpg, aes(x = cty, y = hwy, colour = factor(cyl))) +
geom_point(size = 2.5)
inset.plot <- main.plot + theme(legend.position = "none")
plot.with.inset <-
ggdraw() +
draw_plot(main.plot) +
draw_plot(inset.plot, x = 0.07, y = .7, width = .3, height = .3)
# Can save the plot with ggsave()
ggsave(filename = "plot.with.inset.png",
plot = plot.with.inset,
width = 17,
height = 12,
units = "cm",
dpi = 300)
I prefer solutions that work with ggsave. After a lot of googling around I ended up with this (which is a general formula for positioning and sizing the plot that you insert.
library(tidyverse)
plot1 = qplot(1.00*mpg, 1.00*wt, data=mtcars) # Make sure x and y values are floating values in plot 1
plot2 = qplot(hp, cyl, data=mtcars)
plot(plot1)
# Specify position of plot2 (in percentages of plot1)
# This is in the top left and 25% width and 25% height
xleft = 0.05
xright = 0.30
ybottom = 0.70
ytop = 0.95
# Calculate position in plot1 coordinates
# Extract x and y values from plot1
l1 = ggplot_build(plot1)
x1 = l1$layout$panel_ranges[[1]]$x.range[1]
x2 = l1$layout$panel_ranges[[1]]$x.range[2]
y1 = l1$layout$panel_ranges[[1]]$y.range[1]
y2 = l1$layout$panel_ranges[[1]]$y.range[2]
xdif = x2-x1
ydif = y2-y1
xmin = x1 + (xleft*xdif)
xmax = x1 + (xright*xdif)
ymin = y1 + (ybottom*ydif)
ymax = y1 + (ytop*ydif)
# Get plot2 and make grob
g2 = ggplotGrob(plot2)
plot3 = plot1 + annotation_custom(grob = g2, xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax)
plot(plot3)
ggsave(filename = "test.png", plot = plot3)
# Try and make a weird combination of plots
g1 <- ggplotGrob(plot1)
g2 <- ggplotGrob(plot2)
g3 <- ggplotGrob(plot3)
library(gridExtra)
library(grid)
t1 = arrangeGrob(g1,ncol=1, left = textGrob("A", y = 1, vjust=1, gp=gpar(fontsize=20)))
t2 = arrangeGrob(g2,ncol=1, left = textGrob("B", y = 1, vjust=1, gp=gpar(fontsize=20)))
t3 = arrangeGrob(g3,ncol=1, left = textGrob("C", y = 1, vjust=1, gp=gpar(fontsize=20)))
final = arrangeGrob(t1,t2,t3, layout_matrix = cbind(c(1,2), c(3,3)))
grid.arrange(final)
ggsave(filename = "test2.png", plot = final)
'ggplot2' >= 3.0.0 makes possible new approaches for adding insets, as now tibble objects containing lists as member columns can be passed as data. The objects in the list column can be even whole ggplots... The latest version of my package 'ggpmisc' provides geom_plot(), geom_table() and geom_grob(), and also versions that use npc units instead of native data units for locating the insets. These geoms can add multiple insets per call and obey faceting, which annotation_custom() does not. I copy the example from the help page, which adds an inset with a zoom-in detail of the main plot as an inset.
library(tibble)
library(ggpmisc)
p <-
ggplot(data = mtcars, mapping = aes(wt, mpg)) +
geom_point()
df <- tibble(x = 0.01, y = 0.01,
plot = list(p +
coord_cartesian(xlim = c(3, 4),
ylim = c(13, 16)) +
labs(x = NULL, y = NULL) +
theme_bw(10)))
p +
expand_limits(x = 0, y = 0) +
geom_plot_npc(data = df, aes(npcx = x, npcy = y, label = plot))
Or a barplot as inset, taken from the package vignette.
library(tibble)
library(ggpmisc)
p <- ggplot(mpg, aes(factor(cyl), hwy, fill = factor(cyl))) +
stat_summary(geom = "col", fun.y = mean, width = 2/3) +
labs(x = "Number of cylinders", y = NULL, title = "Means") +
scale_fill_discrete(guide = FALSE)
data.tb <- tibble(x = 7, y = 44,
plot = list(p +
theme_bw(8)))
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
geom_plot(data = data.tb, aes(x, y, label = plot)) +
geom_point() +
labs(x = "Engine displacement (l)", y = "Fuel use efficiency (MPG)",
colour = "Engine cylinders\n(number)") +
theme_bw()
The next example shows how to add different inset plots to different panels in a faceted plot. The next example uses the same example data after splitting it according to the century. This particular data set once split adds the problem of one missing level in one of the inset plots. As these plots are built on their own we need to use manual scales to make sure the colors and fill are consistent across the plots. With other data sets this may not be needed.
library(tibble)
library(ggpmisc)
my.mpg <- mpg
my.mpg$century <- factor(ifelse(my.mpg$year < 2000, "XX", "XXI"))
my.mpg$cyl.f <- factor(my.mpg$cyl)
my_scale_fill <- scale_fill_manual(guide = FALSE,
values = c("red", "orange", "darkgreen", "blue"),
breaks = levels(my.mpg$cyl.f))
p1 <- ggplot(subset(my.mpg, century == "XX"),
aes(factor(cyl), hwy, fill = cyl.f)) +
stat_summary(geom = "col", fun = mean, width = 2/3) +
labs(x = "Number of cylinders", y = NULL, title = "Means") +
my_scale_fill
p2 <- ggplot(subset(my.mpg, century == "XXI"),
aes(factor(cyl), hwy, fill = cyl.f)) +
stat_summary(geom = "col", fun = mean, width = 2/3) +
labs(x = "Number of cylinders", y = NULL, title = "Means") +
my_scale_fill
data.tb <- tibble(x = c(7, 7),
y = c(44, 44),
century = factor(c("XX", "XXI")),
plot = list(p1, p2))
ggplot() +
geom_plot(data = data.tb, aes(x, y, label = plot)) +
geom_point(data = my.mpg, aes(displ, hwy, colour = cyl.f)) +
labs(x = "Engine displacement (l)", y = "Fuel use efficiency (MPG)",
colour = "Engine cylinders\n(number)") +
scale_colour_manual(guide = FALSE,
values = c("red", "orange", "darkgreen", "blue"),
breaks = levels(my.mpg$cyl.f)) +
facet_wrap(~century, ncol = 1)
In 2019, the patchwork package entered the stage, with which you can create
insets
easily by using the inset_element() function:
require(ggplot2)
require(patchwork)
gg1 = ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
geom_point()
gg2 = ggplot(iris, aes(Sepal.Length)) +
geom_density()
gg1 +
inset_element(gg2, left = 0.65, bottom = 0.75, right = 1, top = 1)