I have the following data frame:
df = data.frame(x = c('a', 'b'),
y = c(2,4))
and the corresponding graph:
ggplot(df, aes(x,y)) +
geom_col()
My scale runs from 1 to 5, so I don't want 0 to appear on the y axis; I want the y axis to start at 1. I still want the blank space below the bars, though.
ggplot(df, aes(x,y)) +
geom_col() +
coord_cartesian(ylim = c(1,4)) +
scale_y_continuous(
expand = expand_scale(add = c(0.2,0)))
As you see, the y axis indeed starts at 1, but now the space below the 1 is filled with the black bar, and is no longer blank. The other posts I consulted deal with cases when the y axis starts at 0 so expand_scale() did the job. Not in my case.
Any idea on how to resolve this issue?
You could use the limits argument of scale_y_continuous to indicate what range you want to show. An NA in the limits means 'use the default'. Normally that would remove both bars, because they start at 0 and are therefore out-of-bounds (oob), but you can fix that by setting the oob argument of the scale to squish, a function from the scales package.
library(scales)
ggplot(df, aes(x,y)) +
geom_col() +
scale_y_continuous(limits = c(1, NA), oob = squish)
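To see why the bars survive, it may help to look at what squish does on its own, compared with the default out-of-bounds handler (censor). This is only a minimal sketch, assuming the scales package is installed; the input vector is made up for illustration.

```r
library(scales)

# Illustrative values only: 0 is where geom_col() anchors each bar,
# 2 and 4 are the bar tops from the question's data.
vals <- c(0, 2, 4)

# squish() clamps out-of-range values to the nearest limit.
squished <- squish(vals, range = c(1, 5))

# censor() replaces out-of-range values with NA instead, which is why
# the bars disappear when oob is left at its default.
censored <- censor(vals, range = c(1, 5))

print(squished)
print(censored)
```

With squish, the bar base of 0 is moved up to the lower limit rather than dropped, so each bar is drawn from 1 to its original top.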
In ggplot you can use position_nudge() to cheekily move an entire geom by a fixed distance:
ggplot(df, aes(x,y)) +
geom_col(position = position_nudge(y=1)) + #move everything up one
ylim(c(0,5)) #set the y axis limits
If I understand your problem correctly, you have several response levels and you want to show the level for each individual. Could you consider a discrete y axis?
It doesn't remove the 0-versus-1 issue, but only the levels matter here. If a bar is empty, it just means there is no answer, which corresponds to 0. I don't know whether you agree with my suggestion:
df = data.frame(x = c('a', 'b', 'c', 'd'),
y = factor(c(1, 2, 3, 4), levels = 1:4))
ggplot(df, aes(x, y)) +
geom_col() +
scale_y_discrete(expand = expansion(add = 1.2))
Change coord_cartesian() to the following:
coord_cartesian(ylim = c(1, 5), expand = FALSE)
Here's the full call to ggplot() with the change:
ggplot(df, aes(x, y)) +
  geom_col() +
  coord_cartesian(ylim = c(1, 5), expand = FALSE) +
  scale_y_continuous(
    expand = expand_scale(add = c(0.2, 0)))
Context
I want to show a specific value on the x axis in ggplot2. It is 2.84 in the reproducible code below.
I found the answer at How can I add specific value to x-axis in ggplot2?
It is very close to what I need.
Question
Is there a way to show a specific value on the x axis without having to set breaks and labels in scale_x_continuous?
I need to draw a large number of similar plots, so setting the breaks and labels for each one would be very tedious.
Reproducible code
# make up some data
d <- data.frame(x = 6*runif(10) + 1,
y = runif(10))
# generate break positions
breaks = c(seq(1, 7, by=0.5), 2.84)
# and labels
labels = as.character(breaks)
# plot
ggplot(d, aes(x, y)) + geom_point() + theme_minimal() +
scale_x_continuous(limits = c(1, 7), breaks = breaks, labels = labels,
name = "Number of treatments")
You can automate this process by creating a wrapper round scale_x_continuous that inserts your break into a vector of pretty breaks:
scale_x_fancy <- function(xval, ...) {
scale_x_continuous(breaks = ~ sort(c(pretty(.x, 10), xval)), ...)
}
So now you just add the x value(s) where you want the extra break to appear:
ggplot(d, aes(x, y)) +
geom_point() +
theme_minimal() +
scale_x_fancy(xval = 2.84, name = "Number of treatments")
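To check what the wrapper's break function produces, the same expression can be evaluated outside ggplot. This is a small sketch in base R; the limits c(1, 7) are an assumption taken from the question's scale, since inside the plot ggplot2 passes the actual scale limits as .x.

```r
# xval is the extra break from the example above.
xval <- 2.84

# Assumed limits; in the plot, ggplot2 supplies these to the lambda as .x.
limits <- c(1, 7)

# Same expression as inside scale_x_fancy(): pretty breaks plus the extra value.
brks <- sort(c(pretty(limits, 10), xval))
print(brks)
```

Because the extra value is merged into the pretty breaks and then sorted, the wrapper works for any data range without per-plot break vectors.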
I have created a chart with ggplot.
I have set the width of each bar, but I also want to set the spacing between the bars to a certain value (I want to reduce the spacing marked in red to 0.1, for example)? I know there are options like position_dodge, but that does not seem to work in combination with coord_flip().
In this related post it was suggested to use theme(aspect.ratio = .2), but this does not allow to additionally set the specific width of the bars.
Are there any suggestions to achieve this?
Code:
library(ggplot2)
set.seed(0)
numbers <- runif(5, 0, 10)
names <- LETTERS[seq(1, 5)]
df <- cbind.data.frame(names, numbers)
ggplot(data = df, aes(x = names, y = numbers)) +
geom_bar(stat = "identity", fill = "blue", width = 0.30) +
coord_flip()
I think the solution is a combination of the width argument of geom_bar() (which fills the space reserved for a bar) and the aspect.ratio argument of theme(), which squeezes the plot vertically, leading to 'small' bars.
With the following code:
library(ggplot2)
## your data
set.seed(0)
numbers <- runif(5, 0, 10)
names <- LETTERS[seq(1, 5)]
df <- cbind.data.frame(names, numbers) ## corrected args
ggplot(data = df, aes(x = names, y = numbers)) +
geom_bar(stat="identity",
fill = "blue",
width=0.9) + ### increased
theme(aspect.ratio = .2) + ### aspect ratio added
coord_flip()
you get the following graph:
I personally prefer to use the ggstance package to avoid messing around with coord_flip(). You need to switch your x and y:
library(ggplot2)
library(ggstance)
ggplot(df, aes(x = numbers, y = names)) +
geom_colh(fill = "blue", width = 0.9)
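If installing ggstance is not an option, a similar result may be possible with ggplot2 alone. The sketch below assumes ggplot2 3.3.0 or later, where geom_col() can detect a horizontal orientation from the aesthetics; the data frame is rebuilt from the question's code so the example stands on its own.

```r
library(ggplot2)

# Same data as in the question.
set.seed(0)
df <- data.frame(names = LETTERS[1:5], numbers = runif(5, 0, 10))

# With a discrete y and a continuous x, geom_col() draws horizontal bars
# (assumes ggplot2 >= 3.3.0); width still controls the bar thickness.
p <- ggplot(df, aes(x = numbers, y = names)) +
  geom_col(fill = "blue", width = 0.9)
p
```

As with the ggstance version, width = 0.9 leaves a gap of 0.1 between neighbouring bars, measured in units of the discrete axis.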
I am pretty sure that this is easy to do but I can't seem to find a proper way to query this question into google or stack, so here we are:
I have a plot made in ggplot2 which makes use of geom_jitter(), efficiently creating one row for each element in a factor and plotting its values.
I would like to add a complementary geom_violin() to the plot, but just adding the extra geom_ function to the plot code returns two layers: the jitter and the violin, one on top of the other (as usually expected).
EDIT:
This is what the plot looks like:
How can I have the violin as a separate row, without generating a second plot?
Side quest: how can I have the jitter and the violin geoms interleaved? (i.e. element A jitter row followed by element A violin row, and then element B jitter row followed by element B violin row)
This is the minimum required code to make it (without all the theme() embellishments):
P1 <- ggplot(data=TEST_STACK_SUB, aes(x=E, y=C, col=A)) +
theme(... , aspect.ratio=0.3) +
geom_point(position = position_jitter(w = 0.30, h = 0), alpha=0.2, size=0.5) +
geom_violin(data=TEST_STACK_SUB, mapping=aes(x=E, y=C), position="dodge") +
scale_x_discrete() +
scale_y_continuous(limits=c(0,1), breaks=seq(0,1,0.1),
labels=c(seq(0,1,0.1))) +
scale_color_gradient2(breaks=seq(0,100,20),
limits=c(0,100),
low="green3",
high="darkorchid4",
midpoint=50,
name="") +
coord_flip()
options(repr.plot.width=8, repr.plot.height=2)
plot(P1)
Here is a subset of the data to generate it (for you to try):
data
How about manipulating your factor as a continuous variable and nudging the entries across the aes() calls like so:
library(dplyr)
library(ggplot2)
set.seed(42)
tibble(x = rep(c(1, 3), each = 10),
y = c(rnorm(10, 2), rnorm(10))) -> plot_data
ggplot(plot_data) +
geom_jitter(aes(x = x - 0.5, y = y), width = 0.25) +
geom_violin(aes(x = x + 0.5, y = y, group = x), width = 0.5) +
coord_flip() +
labs(x = "x") +
scale_x_continuous(breaks = c(1, 3),
labels = paste("Level", 1:2),
trans = scales::reverse_trans())
I have a plot with three different lines. I want one of those lines to have points on it as well. I also want the two lines without points to be thicker than the one with points. I have managed to get the plot I want, but the legend isn't keeping up.
library(ggplot2)
y <- c(1:10, 2:11, 3:12)
x <- c(1:10, 1:10, 1:10)
testnames <- c(rep('mod1', 10), rep('mod2', 10), rep('meas', 10))
df <- data.frame(testnames, y, x)
ggplot(data=df, aes(x=x, y=y, colour=testnames)) +
geom_line(aes(size=testnames)) +
scale_size_manual("", values=c(0.5,1,1)) +
geom_point(aes(alpha=testnames), size=5, shape=4) +
scale_alpha_manual("", values=c(1, 0, 0))
I can remove the second (black) legend:
ggplot(data = df, aes(x=x, y=y, colour=testnames)) +
geom_line(aes(size=testnames)) +
scale_size_manual("", values=c(0.5,1,1), guide='none') +
geom_point(aes(alpha=testnames), size=5, shape=4) +
scale_alpha_manual("", values=c(1, 0.05, 0.05), guide='none')
But what I really want is a merge of the two legends - a legend with colours, cross only on the first variable (meas) and the lines of mod1 and mod2 thicker than the first line. I have tried guide and override, but with little luck.
You don't need transparency to hide the shapes for mod1 and mod2. You can omit these points from the plot and legend by setting their shape to NA in scale_shape_manual:
ggplot(data = df, aes(x = x, y = y, colour = testnames, size = testnames)) +
geom_line() +
geom_point(aes(shape = testnames), size = 5) +
scale_size_manual(values=c(0.5, 2, 2)) +
scale_shape_manual(values=c(8, NA, NA))
This gives the following plot:
NOTE: I used some more distinct values in the size-scale and another shape in order to better illustrate the effect.
I have a data frame of positive x and y values that I want to present as a scatterplot in ggplot2. The values are clustered away from the point (0,0), but I want to include the x=0 and y=0 lines in the plot to show overall magnitude. How can I do this?
set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))
ggplot(d, aes(x,y)) + geom_point()
But what I want is something roughly equivalent to this, without having to specify both ends of the limits:
ggplot(d, aes(x=x, y=y)) + geom_point() +
scale_x_continuous(limits = c(0,2)) + scale_y_continuous(limits = c(0,2))
One option is to just anchor the x and y min, but leave the max unspecified
ggplot(d, aes(x,y)) + geom_point() +
scale_x_continuous(limits = c(0,NA)) +
scale_y_continuous(limits = c(0,NA))
This solution is a bit hacky, but it works for standard plot also.
Where d is the original dataframe we add two "fake" data points:
d2 = rbind(d,c(0,NA),c(NA,0))
This first extra data point has x-coordinate=0 and y-coordinate=NA. This means 0 will be included in the xlim, but the point will not be displayed (because it has no y-coordinate).
The other data point does the same for the y limits.
Just plot d2 instead of d and it will work as desired.
If using ggplot, as opposed to plot, you will get a warning about missing values. This can be suppressed by replacing geom_point() with geom_point(na.rm = TRUE).
One downside with this solution (especially for plot) is that an extra value must be added for any other 'per-data-point' parameters, such as col= if you give each point a different colour.
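Put together, the approach described above might look like the following. This is only a sketch that reuses the question's data and assumes ggplot2 is installed; the name d2 comes from the answer.

```r
library(ggplot2)

# Data from the question.
set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))

# Two "fake" rows: (0, NA) pulls the x range down to 0,
# (NA, 0) does the same for y. Neither can be drawn as a point.
d2 <- rbind(d, c(0, NA), c(NA, 0))

# The ranges now include 0 on both axes.
print(range(d2$x, na.rm = TRUE))
print(range(d2$y, na.rm = TRUE))

# na.rm = TRUE silences the missing-value warning for the fake rows.
p <- ggplot(d2, aes(x, y)) + geom_point(na.rm = TRUE)
p
```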
Use the function expand_limits(x=0,y=0), i.e.:
set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))
ggplot(d, aes(x,y)) + geom_point() + expand_limits(x = 0, y = 0)