I have a data frame of positive x and y values that I want to present as a scatterplot in ggplot2. The values are clustered away from the point (0,0), but I want to include the x=0 and y=0 lines in the plot to show overall magnitude. How can I do this?
library(ggplot2)
set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))
ggplot(d, aes(x,y)) + geom_point()
But what I want is something roughly equivalent to this, without having to specify both ends of the limits:
ggplot(d, aes(x=x, y=y)) + geom_point() +
scale_x_continuous(limits = c(0,2)) + scale_y_continuous(limits = c(0,2))
One option is to anchor only the x and y minimums and leave the maximums unspecified (NA):
ggplot(d, aes(x,y)) + geom_point() +
scale_x_continuous(limits = c(0,NA)) +
scale_y_continuous(limits = c(0,NA))
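As a rough check of the one-sided limits, the sketch below uses the xlim()/ylim() shorthand, which also accepts NA for an end that should be computed from the data, and then reads back the axis ranges ggplot2 computed. The variable names (p, built, x_range, y_range) are only illustrative, and the reported ranges include the default axis padding, so the lower ends come out slightly below 0.

```r
library(ggplot2)

set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))

# xlim()/ylim() are shorthand for setting scale limits; NA leaves that end
# to be computed from the data.
p <- ggplot(d, aes(x, y)) +
  geom_point() +
  xlim(0, NA) +
  ylim(0, NA)

# Read back the axis ranges computed for the (single) panel.
# These include the default padding around the limits.
built <- ggplot_build(p)
x_range <- built$layout$panel_params[[1]]$x.range
y_range <- built$layout$panel_params[[1]]$y.range
print(x_range)
print(y_range)
```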
This solution is a bit hacky, but it also works for the standard plot() function.
Where d is the original data frame, we add two "fake" data points:
d2 = rbind(d,c(0,NA),c(NA,0))
This first extra data point has x-coordinate=0 and y-coordinate=NA. This means 0 will be included in the xlim, but the point will not be displayed (because it has no y-coordinate).
The other data point does the same for the y limits.
Just plot d2 instead of d and it will work as desired.
If you use ggplot rather than plot, you will get a warning about missing values. This can be suppressed by replacing geom_point() with geom_point(na.rm = TRUE).
One downside with this solution (especially for plot) is that an extra value must be added for any other 'per-data-point' parameters, such as col= if you give each point a different colour.
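A minimal sketch of this trick with base plot(), including the colour caveat. The point_cols vector is a made-up example of a per-point parameter; the only point is that it needs one extra entry for each placeholder row.

```r
set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))

# Two placeholder rows: (0, NA) stretches the x range, (NA, 0) the y range.
d2 <- rbind(d, c(0, NA), c(NA, 0))

# Per-point parameters need matching placeholder entries, here two extra NAs
# (hypothetical colours, alternating red and blue for the ten real points).
point_cols <- c(rep(c("red", "blue"), 5), NA, NA)

# The placeholder rows are not drawn, but 0 is now part of both axis ranges.
plot(d2$x, d2$y, col = point_cols, pch = 19)
```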
Use the function expand_limits(x=0,y=0), i.e.:
set.seed(349)
d <- data.frame(x = runif(10, 1, 2), y = runif(10, 1, 2))
ggplot(d, aes(x,y)) + geom_point() + expand_limits(x = 0, y = 0)
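One difference from hard limits that may matter: expand_limits() only widens the range so that 0 is included, whereas limits = c(0, NA) treats anything below 0 as out of bounds. The sketch below illustrates this with a small made-up data frame (d_neg is hypothetical, not the question's data) that contains a negative x value.

```r
library(ggplot2)

# Hypothetical data with one negative x value, to show the difference.
d_neg <- data.frame(x = c(-1, 0.5, 2), y = c(1, 2, 3))

# expand_limits() only guarantees that 0 is inside the range; it never clips.
p_expand <- ggplot(d_neg, aes(x, y)) +
  geom_point() +
  expand_limits(x = 0, y = 0)

# A hard lower limit of 0 treats x = -1 as out of bounds instead.
p_limit <- ggplot(d_neg, aes(x, y)) +
  geom_point() +
  scale_x_continuous(limits = c(0, NA))

# Compare the x ranges computed for each plot (default padding included).
range_with_expand <- ggplot_build(p_expand)$layout$panel_params[[1]]$x.range
range_with_limit  <- ggplot_build(p_limit)$layout$panel_params[[1]]$x.range
print(range_with_expand)
print(range_with_limit)
```

For the all-positive data in the question the two approaches should give the same picture; the distinction only shows up when some values fall on the other side of 0.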
When I use geom_histogram() to plot a histogram, the bars do not start at zero as I would expect. See my example below:
set.seed(20)
randomnum <- rnorm(40)
data <- data.frame(number = randomnum[randomnum > 0])
ggplot(data, aes(x = number)) +
geom_histogram(color="black", fill="grey40")
Even though the data contain no negative numbers, the histogram starts at a negative value. The code below may make this clearer:
ggplot(data, aes(x = number)) +
geom_histogram(color="black", fill="grey40", binwidth = 0.1) +
scale_x_continuous(breaks = c(0, seq(0, 2, 0.1)))
The histogram starts at -0.1, but the original data contain no negative values.
Ideally, the x axis would start at 0 and every bar would have the same width. The plot would look something like this:
There are some similar questions on Stack Overflow, link1 and link2. They change the parameters of scale_x_continuous(), but that only changes the axis breaks and labels and does not solve my problem.
OK, the solution is:
ggplot(data, aes(x = number)) +
geom_histogram(color="black", fill="grey40", binwidth = 0.1,
boundary = 0, closed = "left") +
scale_x_continuous(breaks = c(0, seq(0, 2, 0.1)))
The boundary is the key parameter!
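To see what boundary = 0 does to the bins, one can inspect the bin edges ggplot2 computed. This is only a sketch: the colour and fill settings are dropped for brevity, and bins is just an illustrative name for the first layer's computed data.

```r
library(ggplot2)

set.seed(20)
randomnum <- rnorm(40)
data <- data.frame(number = randomnum[randomnum > 0])

p <- ggplot(data, aes(x = number)) +
  geom_histogram(binwidth = 0.1, boundary = 0, closed = "left")

# The computed layer data holds one row per bin with its edges and count.
bins <- ggplot_build(p)$data[[1]]
print(head(bins[, c("xmin", "xmax", "count")]))
```

With boundary = 0 the bin edges should fall on multiples of the bin width, so no bin extends below 0 for all-positive data.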
I am pretty sure this is easy to do, but I can't find a good way to phrase the question for Google or Stack Overflow, so here we are:
I have a plot made in ggplot2 that uses geom_jitter(), effectively creating one row for each level of a factor and plotting its values.
I would like to add a complementary geom_violin() to the plot, but simply adding the extra geom_ function to the plot code produces two layers, the jitter and the violin, one on top of the other (as you would normally expect).
EDIT:
This is what the plot looks like:
How can I have the violin as a separate row, without generating a second plot?
Side quest: how can I have the jitter and the violin geoms interleaved? (i.e. element A jitter row followed by element A violin row, and then element B jitter row followed by element B violin row)
This is the minimum required code to make it (without all the theme() embellishments):
P1 <- ggplot(data=TEST_STACK_SUB, aes(x=E, y=C, col=A)) +
theme(... , aspect.ratio=0.3) +
geom_point(position = position_jitter(w = 0.30, h = 0), alpha=0.2, size=0.5) +
geom_violin(data=TEST_STACK_SUB, mapping=aes(x=E, y=C), position="dodge") +
scale_x_discrete() +
scale_y_continuous(limits=c(0,1), breaks=seq(0,1,0.1),
labels=c(seq(0,1,0.1))) +
scale_color_gradient2(breaks=seq(0,100,20),
limits=c(0,100),
low="green3",
high="darkorchid4",
midpoint=50,
name="") +
coord_flip()
options(repr.plot.width=8, repr.plot.height=2)
plot(P1)
Here is a subset of the data to generate it (for you to try):
data
How about manipulating your factor as a continuous variable and nudging the entries across the aes() calls like so:
library(dplyr)
library(ggplot2)
set.seed(42)
tibble(x = rep(c(1, 3), each = 10),
y = c(rnorm(10, 2), rnorm(10))) -> plot_data
ggplot(plot_data) +
geom_jitter(aes(x = x - 0.5, y = y), width = 0.25) +
geom_violin(aes(x = x + 0.5, y = y, group = x), width = 0.5) +
coord_flip() +
labs(x = "x") +
scale_x_continuous(breaks = c(1, 3),
labels = paste("Level", 1:2),
trans = scales::reverse_trans())
I have the following data frame:
df = data.frame(x = c('a', 'b'),
y = c(2,4))
and the corresponding graph:
ggplot(df, aes(x,y)) +
geom_col()
My scale is going from 1 to 5 so I don't want the 0 to appear on the y axis but want the y axis to start at 1. Yet I still want the blank space below the bars.
ggplot(df, aes(x,y)) +
geom_col() +
coord_cartesian(ylim = c(1,4)) +
scale_y_continuous(
expand = expand_scale(add = c(0.2,0)))
As you see, the y axis indeed starts at 1, but now the space below the 1 is filled with the black bar, and is no longer blank. The other posts I consulted deal with cases when the y axis starts at 0 so expand_scale() did the job. Not in my case.
Any idea on how to resolve this issue?
You could use the limits argument of scale_y_continuous() to indicate what range you want to show. NA in the limits means 'use the default'. Normally that would remove both bars, because they start at 0 and are therefore out-of-bounds (oob), but you can fix that by setting the oob argument of the scale to squish, which is a function found in the scales package.
library(scales)
ggplot(df, aes(x,y)) +
geom_col() +
scale_y_continuous(limits = c(1, NA), oob = squish)
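To illustrate the difference, here is a small sketch that applies both out-of-bounds functions from the scales package to plain numbers. The vector bar_ends is only an illustration: 0 stands for the base of a bar, and 2 and 4 are the bar tops from df.

```r
library(scales)

# A bar's base (0) and the two bar tops from df.
bar_ends <- c(0, 2, 4)

# censor() is the default oob behaviour for continuous scales:
# values outside the range become NA, so the bar can no longer be drawn.
censored <- censor(bar_ends, range = c(1, 5))

# squish() moves out-of-range values to the nearest limit instead,
# so the bar base is drawn at 1.
squished <- squish(bar_ends, range = c(1, 5))

print(censored)
print(squished)
```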
In ggplot you can use position_nudge() to cheekily move an entire geom by a fixed distance:
ggplot(df, aes(x,y)) +
geom_col(position = position_nudge(y=1)) + #move everything up one
ylim(c(0,5)) #set the y axis limits
If I understand your problem correctly, you have several response levels and you want to show the level for each individual. Could you consider a representation with a discrete y axis?
It doesn't avoid your problem of 0 or 1, but in fact only the levels matter. If there is nothing, it is just because you don't have an answer, so it is 0. I don't know if you agree with my proposal:
df = data.frame(x = c('a', 'b', 'c', 'd'),
y = factor(c(1, 2, 3, 4), levels = 1:4))
ggplot(df, aes(x, y)) +
geom_col() +
scale_y_discrete(expand = expansion(add = 1.2))
Change coord_cartesian() to the following:
coord_cartesian(ylim=c(1,5), expand=FALSE)
Here's the full call to ggplot() with the change:
ggplot(df, aes(x,y)) +
geom_col() +
coord_cartesian(ylim=c(1,5), expand=FALSE) +
scale_y_continuous(
expand = expand_scale(add = c(0.2,0)))
I am making a polar violin plot. I would like to add lines and labels to the plot to annotate what each spoke means.
I'm running into two problems.
The first is that when I try to create line segments, if x != xend, then the segments are drawn as curves rather than as lines.
For example:
data.frame(
x = rnorm(1000),
spoke = factor(sample(1:6, 1000, replace=T))
) %>%
ggplot(aes(x = spoke, fill=spoke, y = x)) +
geom_violin() +
coord_polar() +
annotate("segment", x=1.1, xend=1.3, y=0, yend=3, color="black", size=0.6) +
theme_minimal()
The second problem that arises occurs when I try to add an annotation between the last spoke and the first. In this case, the annotation causes the coordinate scale to shift, so that spokes are no longer evenly distributed.
See here:
data.frame(
x = rnorm(1000),
spoke = factor(sample(1:5, 1000, replace=T))
) %>%
ggplot(aes(x = spoke, fill=spoke, y = x)) +
geom_violin() +
coord_polar() +
scale_x_discrete(limits = 1:5) +
annotate("segment", x=5.9, xend=5.7, y=0, yend=3, color="black", size=0.6) +
theme_minimal()
Any assistance is greatly appreciated!
(PS: I do understand that there are perceptual issues with plots like these. I have a good reason...)
You want a 'generic annotation' as shown here.
You basically have to overlay your plots rather than use the layer facility, unless you want to calculate the exact distance in radians of each x for each y.
With cowplot
require(ggplot2) #again, you should specify your required packages in your question as well
require(cowplot)
my_dat <- data.frame(x = rnorm(1000),
spoke = factor(sample(1:6, 1000, replace=T)))
my_annot <- data.frame(para = c('start','end'), x = c(0,0.4), y = c(0,0.2))
#first point x/y = c(0,0) because this makes positioning easier
When I edited your question and removed the piping - that was not only a matter of good style, but also makes it much easier to then work with your different plots. So - I would suggest you should remove the pipe.
p1 <- ggplot(my_dat, aes(x = spoke, fill=spoke, y = x)) +
geom_violin() +
theme_minimal()+
coord_polar()
p2 <- ggplot(my_annot) +
geom_line(aes(x,y)) +
coord_cartesian(xlim = c(0,2), ylim =c(0,2)) +
# the limits change the length of your line too
theme_void()
ggdraw() +
draw_plot(p1) +
draw_plot(p2, x = 0.55, y = 0.6)
Obviously - you can now play around with both length of your line and its position within draw_plot()
I've got some data that share a common x-axis but have two different y variables:
set.seed(42)
data = data.frame(
x = rep(2000:2004, 2),
y = c(rnorm(5, 20, 5), rnorm(5, 150, 15)),
var = rep(c("A", "B"), each = 5)
)
I'm using a faceted line plot to display the data:
p = ggplot(data, aes(x, y)) +
geom_line() +
facet_grid(var ~ ., scales = "free_y")
I'd like the y-axis to include 0. This is easy enough:
p + expand_limits(y = 0)
but then my data looks crowded too close to the top of my facets. So I'd like to pad the range of the axis. Normally scale_y_continuous(expand = ...) is used for padding the axis, but the padding is applied symmetrically to the top and bottom, making the y-axis go well below 0.
p + expand_limits(y = 0) +
scale_y_continuous(expand = c(0.3, 0.2))
# the order of expand_limits and scale_y_continuous
# does not change the output
I can't explicitly set limits because of the facets with free y scales. What's the best way to have the y-scale extend down to 0 (not below!), while multiplicatively padding the top of the y scale?
You could create an extra data set with a single point for each facet and plot it invisibly with geom_blank(). The point is chosen to be a fixed factor larger than the maximum value in the given facet. Here, I choose that factor to be 1.5 to make the effect clearly visible:
max_data <- aggregate(y ~ var, data = data, FUN = function(y) max(y) * 1.5)
max_data <- transform(max_data, x = 2000)
p + geom_blank(data = max_data)
And this is what I get:
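As an alternative sketch, newer ggplot2 versions (3.3.0 and later, as far as I know) provide expansion(), which takes separate lower and upper padding, so the asymmetric padding can be set on the scale itself. The 0.3 multiplier and the names p_padded and lower_bounds are arbitrary choices for illustration.

```r
library(ggplot2)

set.seed(42)
data <- data.frame(
  x = rep(2000:2004, 2),
  y = c(rnorm(5, 20, 5), rnorm(5, 150, 15)),
  var = rep(c("A", "B"), each = 5)
)

p_padded <- ggplot(data, aes(x, y)) +
  geom_line() +
  facet_grid(var ~ ., scales = "free_y") +
  expand_limits(y = 0) +
  # mult = c(lower, upper): no padding below 0, 30% of the range above.
  scale_y_continuous(expand = expansion(mult = c(0, 0.3)))

# Lower end of the y range in each facet; both should sit at 0.
lower_bounds <- vapply(
  ggplot_build(p_padded)$layout$panel_params,
  function(pp) pp$y.range[1],
  numeric(1)
)
print(lower_bounds)
```

Because the padding is multiplicative and applied per facet, each free y scale gets headroom proportional to its own range.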