r ggplot2: fill area under curves with geom_step


I'm trying to fill the area under each step function using ggplot2 and geom_step. Here's an example dataset:
time = c(0, 5, 8, 11, 14, 18, 20, 0, 3, 7, 13, 19, 20, 0, 4, 9, 15, 18)
prob = c(1, 0.95, 0.80, 0.62, 0.30, 0.03, 0, 1, 0.92, 0.75, 0.57, 0.21, 0, 1, 0.80, 0.64, 0.43, 0)
group = c(1,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3)
df = data.frame(time, prob, group)
Here's the code I've tried:
plot1 = ggplot(df, aes(x = time, y = prob, group = group, fill = group)) +
geom_step()+
geom_ribbon(data = df, aes(ymin = 0, ymax = prob))
The problem is that, after filling the area, only group 1 shows its step line, and the filled area does not follow the step function.

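You may use geom_rect instead of geom_ribbon, drawing one rectangle per step. Converting group to a factor gives each group its own fill colour and step line.
library(dplyr)
library(ggplot2)
df %>%
mutate(group = as.factor(group)) %>%
# compute the right edge of each step within its own group
group_by(group) %>%
mutate(time_end = lead(time)) %>%
ungroup() %>%
ggplot(aes(x = time, y = prob, group = group, fill = group)) +
geom_step()+
geom_rect(aes(xmin = time, xmax = time_end,
ymin = 0, ymax = prob), alpha = 0.4)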
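If you would rather keep geom_ribbon, one possible approach (a sketch, not part of the answer above) is to expand the data so that every jump has two points: one holding the previous prob value and one holding the new value. A ribbon drawn through those points then traces the steps. The helper name to_steps and the data frame df_steps are hypothetical names chosen for this illustration.

```r
library(ggplot2)

time  <- c(0, 5, 8, 11, 14, 18, 20, 0, 3, 7, 13, 19, 20, 0, 4, 9, 15, 18)
prob  <- c(1, 0.95, 0.80, 0.62, 0.30, 0.03, 0, 1, 0.92, 0.75, 0.57, 0.21, 0,
           1, 0.80, 0.64, 0.43, 0)
group <- factor(c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3))
df <- data.frame(time, prob, group)

# Hypothetical helper: for one group, repeat each jump time twice so the
# curve stays at the previous prob until the jump, then drops vertically.
to_steps <- function(d) {
  n <- nrow(d)
  data.frame(
    time  = c(d$time[1], rep(d$time[-1], each = 2)),
    prob  = c(rep(d$prob[-n], each = 2), d$prob[n]),
    group = d$group[1]
  )
}

# Apply the helper group by group and stack the results
df_steps <- do.call(rbind, lapply(split(df, df$group), to_steps))

p <- ggplot(df_steps, aes(x = time, fill = group)) +
  geom_ribbon(aes(ymin = 0, ymax = prob), alpha = 0.4) +
  geom_step(data = df, aes(y = prob, colour = group))
p
```

The ribbon uses the expanded data while geom_step still uses the original df, so the outline and the fill should coincide.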

Related

How fill geom_ribbon with different colour in R?

I am trying to use different fill for geom_ribbon according to the x-values (For Temp = 0-20 one fill, 20-30.1 another fill and > 30.1 another fill). I am using the following code
library(tidyverse)
bounds2 <- df %>%
mutate(ymax = pmax(Growth.rate, slope),
ymin = pmin(Growth.rate, slope),
x_bins = cut(Temp, breaks = c(0,20,30.1,max(Temp)+5)))
ggplot(df, aes(x = Temp, y = Growth.rate)) +
geom_line(colour = "blue") +
geom_line(aes(y = slope), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~ .^1, name = "slope")) +
geom_ribbon(data = bounds2, aes(Temp, ymin = ymin, ymax = ymax, fill = x_bins),
alpha = 0.4)
It returns the following output.
As you can see from the output, some regions remain empty. How can I fill those parts of the curve?
Here is the data
df = structure(list(Temp = c(10, 13, 17, 20, 25, 28, 30, 32, 35, 38
), Growth.rate = c(0, 0.02, 0.19, 0.39, 0.79, 0.96, 1, 0.95,
0.65, 0), slope = c(0, 0.02, 0.16, 0.2, 0.39, 0.1, 0.03, -0.04,
-0.29, -0.65)), row.names = c(NA, 10L), class = "data.frame")

Here's a solution that involves interpolating new points at the boundaries between the areas. I used approx to get the values of ymin and ymax at Temp=30.1 and added this to the plotting dataset.
Then, instead of using cut just once as you did, I use it twice: once with the lower bounds included in each bin and once with the upper bounds included. Then I reshape the data to long format and de-duplicate the rows I don't need.
If you zoom in enough you can see that the boundary is at 30.1 not at 30.
bounds2 <- df %>%
mutate(ymax = pmax(Growth.rate, slope),
ymin = pmin(Growth.rate, slope))
bounds2 <- bounds2 |>
add_case(Temp=30.1,
ymax=approx(bounds2$Temp,bounds2$ymax,xout = 30.1)$y,
ymin=approx(bounds2$Temp,bounds2$ymin,xout = 30.1)$y) |>
mutate(x_bins2 = cut(Temp, breaks = c(0,20,30.1,max(Temp)+5),right=FALSE, labels=c("0-20","20-30.1","30.1-max")),
x_bins = cut(Temp, breaks = c(0,20,30.1,max(Temp)+5), labels=c("0-20","20-30.1","30.1-max"))) |>
tidyr::pivot_longer(cols=c(x_bins2, x_bins), names_to = NULL, values_to = "xb") |>
distinct()
ggplot(df, aes(x = Temp, y = Growth.rate)) +
geom_line(colour = "blue") +
geom_line(aes(y = slope), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~ .^1, name = "slope")) +
geom_ribbon(data = bounds2, aes(Temp, ymin = ymin, ymax = ymax, fill = xb),
alpha = 0.4)
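To make the two ingredients of this answer concrete, here is a small base R sketch (an illustration only, using the question's data) that shows the values approx interpolates at Temp = 30.1 and how the two cut calls assign a boundary point to different bins. The variable names ymax_at, ymin_at, upper_closed and lower_closed are made up for this example.

```r
df <- data.frame(
  Temp        = c(10, 13, 17, 20, 25, 28, 30, 32, 35, 38),
  Growth.rate = c(0, 0.02, 0.19, 0.39, 0.79, 0.96, 1, 0.95, 0.65, 0),
  slope       = c(0, 0.02, 0.16, 0.2, 0.39, 0.1, 0.03, -0.04, -0.29, -0.65)
)

ymax <- pmax(df$Growth.rate, df$slope)
ymin <- pmin(df$Growth.rate, df$slope)

# Linear interpolation between the neighbouring rows Temp = 30 and Temp = 32
ymax_at <- approx(df$Temp, ymax, xout = 30.1)$y  # 0.9975
ymin_at <- approx(df$Temp, ymin, xout = 30.1)$y  # 0.0265

# The same boundary values fall into different bins depending on `right`
breaks <- c(0, 20, 30.1, max(df$Temp) + 5)
upper_closed <- cut(c(20, 30.1), breaks = breaks)                 # bins 1 and 2
lower_closed <- cut(c(20, 30.1), breaks = breaks, right = FALSE)  # bins 2 and 3
```

Because each boundary point ends up in both neighbouring bins, adjacent ribbons share an edge and no gap is left between them.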

The idea is shown here, but the code could be improved considerably at the step that duplicates the boundary rows and moves the copies into the next category.
### Libraries
library(tidyverse)
df <- structure(list(Temp = c(10, 13, 17, 20, 25, 28, 30, 32, 35, 38
), Growth.rate = c(0, 0.02, 0.19, 0.39, 0.79, 0.96, 1, 0.95,
0.65, 0), slope = c(0, 0.02, 0.16, 0.2, 0.39, 0.1, 0.03, -0.04,
-0.29, -0.65)), row.names = c(NA, 10L), class = "data.frame")
### Preprocessing
bounds2 <- df %>%
mutate(ymax = pmax(Growth.rate, slope),
ymin = pmin(Growth.rate, slope),
x_bins = cut(Temp, breaks = c(0, 20, 30.1, max(Temp)+5)))
### Duplicate the last row of each of the first two x_bins categories and move the copies into the next category
bounds2 <- rbind(bounds2, bounds2[c(4, 7), ])
bounds2$x_bins[c(11, 12)] <- bounds2[c(5, 8), ]$x_bins
### Plot
ggplot(df, aes(x = Temp, y = Growth.rate)) +
geom_line(colour = "blue") +
geom_line(aes(y = slope), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~ .^1, name = "slope")) +
geom_ribbon(data = bounds2, aes(Temp, ymin = ymin, ymax = ymax, fill = x_bins),
alpha = 0.4)

scale_fill versus scale_color for geom_point

I want to generate bubble plots where the bubbles have a black outline. However, for some reason, I'm having trouble getting geom_point to accept scale_fill. This gives me a nice plot where the bubble color scales with a continuous variable, color:
age <-c(16, 5, 6, 22, 11, 12, 11, 13, 4, 8)
y <- c(0.53, 0.50, 0.50, 0.46, 0.44, 0.44, 0.44, 0.43, 0.40, 0.40)
s <- c(11.5, 78.0, 753.5, 44.5, 372.0, 62.0, 163.0, 25.0, 56.0, 80.5)
color <- c(29, 15, 7, 30, 15, 26, 8, 14, 17, 12)
df <- data.frame(age, y, s, color)
p <- ggplot(df)+
geom_point(aes(x = age, y = y, size = s, color = color))+
labs(x = "age", y = "rate")+
scale_size(range = c(.1, 10), name="s")+
scale_color_viridis(limits = c(5, 20), oob = squish, option = "magma")+
scale_x_continuous(breaks=c(0, 10, 20, 30, 40, 50), limits = c(0,55))+
scale_y_continuous(breaks=c(-0.4, -0.2, 0, 0.2, 0.4), limits=c(-0.5, 0.5))+
theme_minimal()
But if I just switch color to fill, I get all black bubbles:
p <- ggplot(df)+
geom_point(aes(x = age, y = y, size = s, fill = color))+
labs(x = "age", y = "rate")+
scale_size(range = c(.1, 10), name="s")+
scale_fill_viridis(limits = c(5, 20), oob = squish, option = "magma")+
scale_x_continuous(breaks=c(0, 10, 20, 30, 40, 50), limits = c(0,55))+
scale_y_continuous(breaks=c(-0.4, -0.2, 0, 0.2, 0.4), limits=c(-0.5, 0.5))+
theme_minimal()
If I specify shape:
p <- ggplot(df)+
geom_point(aes(x = age, y = y, size = s, shape = 21, fill = color))+
labs(x = "age", y = "rate")+
scale_size(range = c(.1, 10), name="s")+
scale_fill_viridis(limits = c(5, 20), oob = squish, option = "magma")+
scale_x_continuous(breaks=c(0, 10, 20, 30, 40, 50), limits = c(0,55))+
scale_y_continuous(breaks=c(-0.4, -0.2, 0, 0.2, 0.4), limits=c(-0.5, 0.5))+
theme_minimal()
I get
Error in scale_f():
! A continuous variable can not be mapped to shape
If I understand correctly, fill should specify bubble color and color should let me specify a black outline around the bubbles. What am I missing here? Thanks.

Your error says that you can't assign a continuous variable to shape. You should place shape outside your aes like this:
age <-c(16, 5, 6, 22, 11, 12, 11, 13, 4, 8)
y <- c(0.53, 0.50, 0.50, 0.46, 0.44, 0.44, 0.44, 0.43, 0.40, 0.40)
s <- c(11.5, 78.0, 753.5, 44.5, 372.0, 62.0, 163.0, 25.0, 56.0, 80.5)
color <- c(29, 15, 7, 30, 15, 26, 8, 14, 17, 12)
df <- data.frame(age, y, s, color)
library(ggplot2)
library(viridis)
library(scales)
p <- ggplot(df)+
geom_point(aes(x = age, y = y, size = s, fill = color), shape = 21)+
labs(x = "age", y = "rate")+
scale_size(range = c(.1, 10), name="s")+
scale_fill_viridis(limits = c(5, 20), oob = squish, option = "magma")+
scale_x_continuous(breaks=c(0, 10, 20, 30, 40, 50), limits = c(0,55))+
scale_y_continuous(breaks=c(-0.4, -0.2, 0, 0.2, 0.4), limits=c(-0.5, 0.5))+
theme_minimal()
p
#> Warning: Removed 1 rows containing missing values (geom_point).
Created on 2022-08-29 with reprex v2.0.2
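The all-black bubbles in the question come from the default point shape (19), which has no fill and therefore ignores the fill aesthetic. A minimal sketch of the pattern, assuming a small made-up data frame: shape, the outline colour and stroke are set as constants outside aes(), and only fill is mapped to the data. Here scale_fill_viridis_c from ggplot2 stands in for viridis::scale_fill_viridis, and the column name value is a stand-in for the question's color column.

```r
library(ggplot2)

# Made-up data for illustration only
df <- data.frame(
  age   = c(16, 5, 6),
  y     = c(0.40, 0.30, 0.20),
  s     = c(10, 50, 100),
  value = c(29, 15, 7)
)

p <- ggplot(df, aes(x = age, y = y)) +
  # shape 21 has both an outline (colour) and an interior (fill)
  geom_point(aes(size = s, fill = value),
             shape = 21, colour = "black", stroke = 0.5) +
  scale_fill_viridis_c(option = "magma")

# Inspect the aesthetics ggplot2 actually computed for the layer
built <- layer_data(p)
p
```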

I am not quite sure whether this is your expected output, but please note the following:
Note that shapes 21-24 have both stroke colour and a fill. The size of the filled part is controlled by size, the size of the stroke is controlled by stroke. Each is measured in mm, and the total size of the point is the sum of the two. Note that the size is constant along the diagonal in the following figure.
source: vignette("ggplot2-specs")
library(tidyverse)
ggplot(df, aes(x = age, y = y, size = s, color = color))+
geom_point(fill = color, shape=21)+
labs(x = "age", y = "rate")+
scale_size(range = c(.1, 10), name="s")+
scale_color_viridis(limits = c(5, 20), oob = squish, option = "magma")+
scale_x_continuous(breaks=c(0, 10, 20, 30, 40, 50), limits = c(0,55))+
scale_y_continuous(breaks=c(-0.4, -0.2, 0, 0.2, 0.4), limits=c(-0.5, 0.5))+
theme_minimal()

How to plot a zoom of the plot inside the same plot area using ggplot2?

This question seems difficult to understand, but to illustrate, I bring a figure as an example:
I am trying to replicate this graph. So far I've done the graphics separately, but I don't know how I can put them together as in the example.
Any help?
time <- seq(from = 0,
to = 10,
by = 0.5)
line_1 <- c(0, 0, 0, 66, 173, 426, 1440, 800, 1200, 400, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
line_2 <- c(0, 0, 0, 0, 0, 0, 0, 0, 1000, 25000, 5000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
df <- data.frame(time, line_1, line_2)
library(ggpubr)
#the plot
ggplot(data = df, aes(x = time)) +
geom_line(aes(y = line_2), color = "red",
position = position_nudge(x = 0.5, y = 1000)) +
geom_line(aes(y = line_1),color = "blue") +
geom_rect(aes(xmin = 1, xmax = 5, ymin = 0, ymax = 1500), color = "black", alpha = 0) +
theme_pubr( base_size = 8,
border = TRUE)
#The plot with a zoom
ggplot(data = df, aes(x = time, y = line_1)) +
geom_line(color = "blue") +
xlim (1, 5) +
ylim (0, 1500) +
theme_pubr( base_size = 8,
border = TRUE)

You can use a custom annotation:
p1 = ggplot(data = df, aes(x = time)) +
geom_line(aes(y = line_2), color = "red", position = position_nudge(x = 0.5, y = 1000)) +
geom_line(aes(y = line_1),color = "blue") +
geom_rect(aes(xmin = 1, xmax = 5, ymin = 0, ymax = 1500), color = "black", alpha = 0) +
theme_pubr( base_size = 8, border = TRUE)
#The plot with a zoom
p2 = ggplot(data = df, aes(x = time, y = line_1)) +
geom_line(color = "blue") +
xlim (1, 5) +
ylim (0, 1500) +
theme_pubr( base_size = 8,border = TRUE)
p1 +
annotation_custom(ggplotGrob(p2), xmin = 0, xmax = 4, ymin = 5000, ymax = 20000) +
geom_rect(aes(xmin = 0, xmax = 4, ymin = 5000, ymax = 20000), color='black', linetype='dashed', alpha=0) +
geom_path(aes(x,y,group=grp),
data=data.frame(x = c(1,0,5,4), y=c(1500,5000,1500,5000),grp=c(1,1,2,2)),
linetype='dashed')
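Stripped of theming and guide lines, the inset technique reduces to converting the zoomed plot to a grob and placing it at data coordinates. The sketch below is a simplified variant under stated assumptions: the column name signal is made up, only one line is drawn, and coord_cartesian is swapped in for xlim/ylim. That swap is a deliberate change: xlim and ylim drop observations outside the limits before drawing, whereas coord_cartesian only changes the visible window.

```r
library(ggplot2)

df <- data.frame(
  time   = seq(0, 10, by = 0.5),
  signal = c(0, 0, 0, 66, 173, 426, 1440, 800, 1200, 400, rep(0, 11))
)

base <- ggplot(df, aes(time, signal)) +
  geom_line(colour = "blue")

# Zoomed view: the window changes, the data stay intact
zoom <- base + coord_cartesian(xlim = c(1, 3), ylim = c(0, 500))

# Convert the zoomed plot to a grob and place it in an empty region
inset <- ggplotGrob(zoom)
p <- base +
  annotation_custom(inset, xmin = 6, xmax = 10, ymin = 600, ymax = 1400)
p
```

The xmin, xmax, ymin and ymax values are in the units of the main plot's axes, so the inset position has to be chosen where the main plot has no data.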

Is it possible to draw a legend for the different lengths of lines (birds-on-wire plot) drawn using geom_linerange

I am making a birds-on-wire plot using geom_linerange and geom_linerangeh to show subjects' treatment over time (wire) and the occurrence of some events (birds) during the treatment period. I am using the height of "birds" (i.e. length of vertical lines) to represent a certain characteristic of the "birds" (events). There are only 4 possible different lengths (see variable evt_type2) below. See mock data, code, and plot below.
library(tidyverse)
library(ggstance)
###Mock data###
#Treatment data
data_foo1 <- data.frame(subjectn = c(1, 1, 2, 3, 3, 3, 4, 5),
trt_start = c(1, 25, 1, 1, 50, 101, 1, 1),
trt_end = c(80, 60, 100, 25, 100, 200, 120, 90),
trt_type = as.factor(c(1, 2, 1, 3, 4, 5, 2, 4)),
stringsAsFactors = F)
#Some kind of events data
data_foo2 <- data.frame(subjectn = c(1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5),
evt_start = c(70, 20, 90, 92, 24, 50, 70, 120, 170, 69, 80, 90),
evt_type1 = as.factor(c(0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0)),
evt_type2 = c(0.2, 0.8, 0.2, 0.4, 0.4, 0.2, 0.2, 0.6, 0.4, 0.4, 0.6, 0.2))
###Plot###
data_foo2 %>%
ggplot() +
geom_linerange(aes(x = evt_start, y = subjectn, ymin = subjectn, ymax = subjectn + evt_type2, linetype = evt_type1), size=0.8, alpha = 0.5) +
geom_linerangeh(data = data_foo1, aes(x = trt_start, y = subjectn, xmin = trt_start, xmax = trt_end, color = trt_type), size = 1.2, alpha = 0.7) +
scale_y_continuous(breaks = c(1, 2, 3, 4, 5), labels = c("A001", "A002", "A003", "A004", "A005")) +
xlab("Time") + theme_bw()
See the resulting plot here
My question is, is it possible to add a legend for the 4 different vertical line lengths? THANK YOU!

As mentioned in the comment, showing line lengths in a legend is generally not easy. One option would be to create a fake legend: another plot, which you then add to your main plot.
But I don't find the visualisation very compelling. Line lengths are difficult to discern, especially when you have different line types and a line can be cut off oddly (see subject A003). It would also not be clear from the legend how to map a given line length to the length in the plot.
Thus, I recommend using a different aesthetic for visualising evt_type2.
One way would be to draw rectangles instead of lines and use the fill. You can make this categorical (as in my example) or continuous, and the legend maps easily to your data. Visually, in my opinion, the four different event types are easier to tell apart.
library(tidyverse)
library(ggstance)
ggplot() +
geom_linerangeh(
data = data_foo1,
aes(
y = subjectn,
xmin = trt_start,
xmax = trt_end,
color = trt_type
),
size = 1.2, alpha = 0.7
) +
geom_rect(
data = data_foo2, aes(
xmin = evt_start - 2,
xmax = evt_start + 2,
ymin = subjectn,
ymax = subjectn + 0.5,
linetype = evt_type1,
fill = as.character(evt_type2)
),
size = 0.2, color = "black", alpha = 0.5
) +
scale_fill_brewer() +
scale_y_continuous(breaks = 1:5, labels = paste0("A00", 1:5)) +
theme_bw()
Or, you can keep the lines, and add a second color aesthetic with ggnewscale.
ggplot() +
geom_linerangeh(
data = data_foo1,
aes(
y = subjectn,
xmin = trt_start,
xmax = trt_end,
color = trt_type
),
size = 1.2, alpha = 0.7
) +
scale_y_continuous(breaks = 1:5, labels = paste0("A00", 1:5)) +
ggnewscale::new_scale_color()+
geom_linerange(data = data_foo2,
aes(x = evt_start,
y = subjectn,
ymin = subjectn,
ymax = subjectn + 0.5,
color = as.character(evt_type2),
linetype = evt_type1),
size=0.8) +
scale_color_brewer() +
theme_bw()
Created on 2020-04-26 by the reprex package (v0.3.0)
The colors come out quite "light". If you want them "darker", you can use the shades package, e.g. by wrapping your scale_color function in one of its brightness-modifying functions: shades::brightness(scale_color_brewer(), shades::delta(-0.2))
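As a side note, and assuming a reasonably recent ggplot2 (3.4.0 or later for the linewidth argument), the horizontal treatment bars can probably be drawn without ggstance, because geom_linerange accepts y together with xmin and xmax. The sketch below only redraws the "wire" layer from the question's treatment data; it is an illustration, not a replacement for the answer's code.

```r
library(ggplot2)

data_foo1 <- data.frame(
  subjectn  = c(1, 1, 2, 3, 3, 3, 4, 5),
  trt_start = c(1, 25, 1, 1, 50, 101, 1, 1),
  trt_end   = c(80, 60, 100, 25, 100, 200, 120, 90),
  trt_type  = factor(c(1, 2, 1, 3, 4, 5, 2, 4))
)

p <- ggplot(data_foo1) +
  # Supplying y with xmin/xmax makes geom_linerange draw horizontally
  geom_linerange(aes(y = subjectn, xmin = trt_start, xmax = trt_end,
                     colour = trt_type),
                 linewidth = 1.2, alpha = 0.7) +
  scale_y_continuous(breaks = 1:5, labels = paste0("A00", 1:5)) +
  xlab("Time") +
  theme_bw()

ld <- layer_data(p)
p
```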

geom_area plot stacks areas by default

I am using geom_area to plot a very simple dataset. When plotting with geom_line everything is normal, but when I switch to geom_area, higher values are plotted. I think looking at the graphs is the best way to show my problem:
require(tidyverse)
x <- structure(list(Time = 0:40, X15.DCIA = c(0, 1, 0.5, 0, 2, 2.5,
1, 0.5, 0, 1, 1.5, 1, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 3,
5, 7, 6.5, 5.5, 4, 3, 2, 1.5, 1, 0.25, 0, 0, 0, 0, 0, 0, 0),
X100.DCIA = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1.5, 7, 8, 7.5, 6.5, 5, 3.5, 2.25,
1.75, 1.1, 0.4, 0.1, 0, 0, 0, 0, 0, 0)),
class = "data.frame", row.names = c(NA,-41L))
x %>% gather(prct.DCIA, Vol, -Time) %>% ggplot(aes(x=Time, y=Vol)) +
geom_line(aes(color=prct.DCIA))
x %>% gather(prct.DCIA, Vol, -Time) %>% ggplot(aes(x=Time, y=Vol)) +
geom_area(aes(fill=prct.DCIA))
The geom_line is what I expected (a line plot of my data).
But looking at the geom_area plot, you can see that 100DCIA has jumped up to 15.
I am more interested in an explanation rather than a fix or workaround.
Note:
This can be a workaround:
x %>% gather(prct.DCIA, Vol, -Time) %>% ggplot(aes(x=Time, y=Vol)) +
geom_polygon(aes(fill=prct.DCIA, alpha=0.5)) + guides(alpha=FALSE)

Explanation:
Your plots are stacking on top of one another.
The values you see following the red line in the geom_area graph are the sum of the values for the red and blue lines in your geom_line graph.
You can see this clearly if you separate out prct.DCIA with facet_grid():
x %>% gather(prct.DCIA, Vol, -Time) %>% ggplot(aes(x=Time, y=Vol)) +
geom_area(aes(fill=prct.DCIA)) + facet_grid(.~prct.DCIA)
This is simply because position = "stack" is the default argument in geom_area:
geom_area(mapping = NULL, data = NULL, stat = "identity",
position = "stack", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...)
One might presume this is because people use geom_area because they want to show the whole area on a diagram, rather than fill under some lines. Generally bars or area might represent a count of something, or the area filled in represents something, while points or lines may represent a point estimate and the area above or below the line or point isn't meaningful.
Cf. the default argument for geom_line is position = "identity".
geom_line(mapping = NULL, data = NULL, stat = "identity",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...)
Fix:
If you use position = position_dodge(), you can see that the areas return to looking like the line graph, with the red area plotted behind the blue area:
x %>% gather(prct.DCIA, Vol, -Time) %>% ggplot(aes(x=Time, y=Vol)) +
geom_area(aes(fill=prct.DCIA), position = position_dodge())
You can even set alpha < 1 and see this clearly:
x %>% gather(prct.DCIA, Vol, -Time) %>% ggplot(aes(x=Time, y=Vol)) +
geom_area(aes(fill=prct.DCIA), position = position_dodge(), alpha = 0.5)
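A more direct way to switch stacking off is probably position = "identity", the same default geom_line uses. The sketch below uses only the rows for Time 23 to 27 from the question's data, already in long format, and compares the tops of the drawn areas: stacked, the maximum is the sum 7 + 8 = 15 at Time 25; unstacked, it is the larger single value, 8. The object names long, stacked and overlaid are made up for this example.

```r
library(ggplot2)

# Subset of the question's data (Time 23 to 27), already in long format
long <- data.frame(
  Time      = rep(23:27, times = 2),
  prct.DCIA = rep(c("X15.DCIA", "X100.DCIA"), each = 5),
  Vol       = c(3, 5, 7, 6.5, 5.5,  1.5, 7, 8, 7.5, 6.5)
)

# Default: position = "stack", so the second area sits on top of the first
stacked <- ggplot(long, aes(Time, Vol, fill = prct.DCIA)) +
  geom_area()

# No stacking: each area is drawn from zero, as with geom_line
overlaid <- ggplot(long, aes(Time, Vol, fill = prct.DCIA)) +
  geom_area(position = "identity", alpha = 0.5)

# Highest point of the drawn areas in each version
top_stacked  <- max(layer_data(stacked)$ymax)
top_overlaid <- max(layer_data(overlaid)$ymax)
```

With position = "identity" the areas overlap, so an alpha below 1 keeps the one drawn first visible.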
