Creating a ggplot() from scratch in R to illustrate results

I'm a bit new to R and this is the first time I'd like to use ggplot(). My aim is to create a few plots that will look like the template below, which is an output from the package effects for those who know it:
Given this data:
     Average     Error     Area
1: 0.4407528 0.1853854 Loliondo
2: 0.2895050 0.1945540 Seronera
How can I replicate the plot seen in the image with labels, error bars as in Error and the line connecting both Average points?
I hope somebody can put me on the right track and then I will go from there for other data I have.
Any help is appreciated!

Using ggplot2::geom_errorbar you can add error bars by first deriving your ymin and ymax.
library(dplyr)    # for %>% and mutate()
library(ggplot2)
df <- tibble::tribble(~Average, ~Error, ~Area,
                      0.4407528, 0.1853854, "Loliondo",
                      0.2895050, 0.1945540, "Seronera")
# derive the lower and upper end of each error bar
dfnew <- df %>%
  mutate(ymin = Average - Error,
         ymax = Average + Error)
p <- ggplot(data = dfnew, aes(x = Area, y = Average)) +
  geom_point(colour = "blue") + geom_line(aes(group = 1), colour = "blue") +
  geom_errorbar(aes(x = Area, ymin = ymin, ymax = ymax), colour = "purple")
p  # print the plot

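Here's a quick and dirty one that is similar to what was just posted:
library(tibble)
library(dplyr)    # for %>%
library(ggplot2)
df <-
  tibble(
    average = c(0.44, 0.29),
    error = c(0.185, 0.195),
    area = c("Loliondo", "Seronera")
  )
df %>%
  ggplot(aes(x = area)) +
  geom_line(
    aes(y = average, group = 1),
    color = "blue"
  ) +
  geom_errorbar(
    # note: the 0.5 factor makes the total bar length equal to `error`;
    # drop it if `error` should extend in each direction from the average
    aes(ymin = average - 0.5 * error, ymax = average + 0.5 * error),
    color = "purple",
    width = 0.1
  )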
The trickiest part here is the group = 1 mapping, which you need for the line to be drawn when the x axis is discrete.
The aes(x = area) goes up top because it's used in both geoms, while the y, group, ymin, and ymax are used only locally. The color and width arguments appear outside of the aes() call since they are used for appearance modifications.
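Putting the pieces from the answers above together, here is a minimal sketch that draws points, the connecting line, error bars of plus or minus Error, and axis labels. The label text and the bar width of 0.1 are assumptions, since the template image is not available here.

```r
library(ggplot2)

# Data from the question
df <- data.frame(
  Average = c(0.4407528, 0.2895050),
  Error   = c(0.1853854, 0.1945540),
  Area    = c("Loliondo", "Seronera")
)

p <- ggplot(df, aes(x = Area, y = Average)) +
  # group = 1 puts both rows in one group so a line can join them
  geom_line(aes(group = 1), colour = "blue") +
  geom_point(colour = "blue") +
  # bars extend Error above and below each Average
  geom_errorbar(aes(ymin = Average - Error, ymax = Average + Error),
                colour = "purple", width = 0.1) +
  labs(x = "Area", y = "Average")  # assumed label text
p
```

Mapping x and y once in ggplot() lets all three layers share them; only the error bar layer needs the extra ymin and ymax mappings.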

Related

Shade parts of a ggplot based on a (changing) dummy variable

I want to shade areas in a ggplot but I don't want to manually tell geom_rect() where to stop and where to start. My data changes and I always want to shade several areas based on a condition.
Here for example with the condition "negative":
library("ggplot2")
set.seed(3)
plotdata <- data.frame(somevalue = rnorm(10), indicator = 0 , counter = 1:10)
plotdata[plotdata$somevalue < 0,]$indicator <- 1
plotdata
I can do that manually:
plotranges <- data.frame(from = c(1,4,9), to = c(2,4,9))
ggplot() +
  geom_line(data = plotdata, aes(x = counter, y = somevalue)) +
  geom_rect(data = plotranges, aes(xmin = from - 1, xmax = to, ymin = -Inf, ymax = Inf), alpha = 0.4)
But my problem is that, so to speak, the set.seed() argument changes and I want to still automatically generate the plot without specifying min and max values of the shading manually. Is there a way (maybe without geom_rect() but instead geom_bar()?) to plot shading based directly on my indicator variable?
edit: Here is my own best attempt, as you can see not great:
ggplot(data = plotdata, aes(x = counter, y = somevalue)) + geom_line() +
geom_bar(aes(y = indicator*max(somevalue)), stat= "identity")
You can use stat_summary() to calculate the extremes of runs of your indicator. In the code below data.table::rleid() groups the data by runs of indicators. In the summary layer, y doesn't really do anything, so we use it to store the resolution of your datapoints, which we then later use to offset the xmin/xmax parameters. The after_stat() function is used to access computed variables after the ranges have been computed.
library("ggplot2")
plotdata <- data.frame(somevalue = rnorm(10), counter = 1:10)
plotdata$indicator <- as.numeric(plotdata$somevalue < 0)
ggplot(plotdata, aes(counter, somevalue)) +
  stat_summary(
    geom = "rect",
    aes(group = data.table::rleid(indicator),
        xmin = after_stat(xmin - y * 0.5),
        xmax = after_stat(xmax + y * 0.5),
        alpha = I((indicator) * 0.4),
        y = resolution(counter)),
    fun.min = min, fun.max = max,
    orientation = "y", ymin = -Inf, ymax = Inf
  ) +
  geom_line()
Created on 2021-09-14 by the reprex package (v2.0.1)
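The run detection that rleid() performs in the answer above can also be done explicitly. The sketch below is an alternative, not the answerer's method: it uses base R's rle() to find consecutive runs of the indicator, builds the plotranges data frame automatically, and passes it to geom_rect(). The half-unit padding on each side is an assumption chosen to match the spacing of counter.

```r
library(ggplot2)

set.seed(3)
plotdata <- data.frame(somevalue = rnorm(10), counter = 1:10)
plotdata$indicator <- as.numeric(plotdata$somevalue < 0)

# Find consecutive runs of the indicator and their start/end rows
runs   <- rle(plotdata$indicator)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

# Keep only the runs where the condition holds (indicator == 1)
plotranges <- data.frame(
  from = plotdata$counter[starts],
  to   = plotdata$counter[ends]
)[runs$values == 1, ]

p <- ggplot() +
  geom_rect(data = plotranges,
            aes(xmin = from - 0.5, xmax = to + 0.5, ymin = -Inf, ymax = Inf),
            alpha = 0.4) +
  geom_line(data = plotdata, aes(x = counter, y = somevalue))
p
```

Because plotranges is recomputed from the data, changing the seed or the condition needs no manual edits to the shading ranges.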

How to control the order of ggplot2::geom_pointrange elements by colour, shape and linetype

data=data.frame("X"=c(22,5,8,17,7,22),
                "XMIN"=c(17.6,4,6.4,13.6,5.6,17.6),
                "XMAX"=c(26.4,6,9.6,20.4,8.4,26.4),
                "VAR"=c('A','B','C','A','B','C'),
                "L1"=c(1,2,3,1,2,3),
                "L2"=c(1,1,1,2,2,2))
ggplot(data) +
  geom_pointrange(aes(
    ymin = XMIN,
    ymax = XMAX,
    y = X,
    x = reorder(VAR, -X),
    colour = factor(L1),
    shape = factor(L1),
    linetype = factor(L2)))
I wish to add space between the lines for each variable A,B,C. Also within (A,B,C) for each variable I wish to sort the line from lowest to highest by X value.
This seems to do the trick.
Updated in response to comments so that variables L1 and L2 control the colour, shape and linetype aesthetics.
The really tricky problem was overcoming the conflict between the order imposed by using factor(L2) and the desired order, which is a combination of VAR and X.
The axis order will trump the linetype order where the x values are distinct.
So I created a continuous variable x_loc to locate observations on the x axis, which are then re-labelled with the required values from VAR.
library(ggplot2)
library(dplyr)
data=data.frame("X"=c(22,5,8,17,7,22),
                "XMIN"=c(17.6,4,6.4,13.6,5.6,17.6),
                "XMAX"=c(26.4,6,9.6,20.4,8.4,26.4),
                "VAR"=c('A','B','C','A','B','C'),
                "L1"=c(1,2,3,1,2,3),
                "L2"=c(1,1,1,2,2,2))
# reorder the data to be in the right plotting order: grouping by VAR with X in ascending order, then everything follows quite nicely.
data1 <-
  data %>%
  arrange(VAR, X) %>%
  mutate(x_loc = c(0.8, 1.1) + rep(0:2, each = 2))
data1
ggplot(data1) +
  labs(x = "VAR") +
  # one label per distinct VAR; data1$VAR[1:3] would give "A", "A", "B" after arrange()
  scale_x_continuous(breaks = 1:3, labels = unique(data1$VAR))+
  theme_minimal()+
  theme(legend.position = "top")+
  geom_pointrange(aes(
    ymin = XMIN,
    ymax = XMAX,
    y = X,
    x = x_loc,
    linetype = factor(L2),
    colour = factor(L1),
    shape = factor(L1)))
Note: for some reason I do not fully understand, adding additional ggplot layers after the geom_pointrange() call resulted in the list elements of the 'ggplot' layer being revealed. Something to follow up another time.
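The x_loc values in the answer above come from vector recycling: the two offsets c(0.8, 1.1) are repeated across the six rows and added to a base position for each VAR group. A small sketch of that arithmetic on its own (the names offsets and base are only for illustration):

```r
# Two offsets, one for each observation within a VAR group
offsets <- c(0.8, 1.1)

# Base position per row: 0, 0, 1, 1, 2, 2 (two rows per VAR group)
base <- rep(0:2, each = 2)

# offsets is recycled to length 6 before the addition
x_loc <- offsets + base
x_loc
# [1] 0.8 1.1 1.8 2.1 2.8 3.1
```

Each pair sits on either side of an integer position 1, 2, 3, which is why the axis breaks are set to 1:3.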

overlay discrete and continuous layer in ggplot - surprised that layer order matters

Consider the following example dataset:
library(dplyr)
library(ggplot2)
d = mtcars %>%
  as_tibble(rownames = "name") %>%
  mutate(wt.cat = cut(wt, seq(1.5, 5.5, by = 1))) %>%
  group_by(wt.cat) %>%
  summarize(
    Mean = mean(mpg),
    Min = min(mpg),
    Max = max(mpg)
  )
Say I want to plot points for the "mean" value of each category in wt.cat and a ribbon showing the range. This works:
ggplot(d, aes(x = wt.cat)) +
geom_point(aes(y= Mean)) +
geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue")
But the points are masked by the ribbon. However, if I change the order of the layers so that the points are plotted on top of the ribbon, I get an error:
ggplot(d, aes(x = wt.cat)) +
geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
geom_point(aes(y= Mean))
## Error: Discrete value supplied to continuous scale
So even though I'm specifying the discrete axis as the "default" aesthetic, it gets overridden by the specification of the first plotted layer. The only way I can find around this is to plot a dummy point layer first:
ggplot(d, aes(x = wt.cat)) +
geom_point(aes(y= Mean), shape = NA) +
geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
geom_point(aes(y= Mean))
## Warning message:
## Removed 4 rows containing missing values (geom_point).
Is there a more "proper" or correct way of combining discrete and continuous layers? Is there a solution that doesn't require creating a dummy layer?
Would something like this be a solution?
d %>% {
  ggplot(., aes(x = wt.cat)) +
    scale_x_discrete(labels = levels(.$wt.cat)) +
    geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
    geom_point(aes(y = Mean))
}
I just learned you can wrap a pipe step in { } and then reference the entire data frame with a dot (.).
As camille said, the issue is that geom_ribbon requires a continuous scale because it draws an area between adjacent x positions. I believe the scale gets converted to continuous when geom_ribbon is added, but the labels are maintained.
Hope this helps.
As per my reply, the following works just as well if you want ggplot2 to handle all the labeling:
d %>%
  ggplot(aes(x = wt.cat)) +
  scale_x_discrete() +
  geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
  geom_point(aes(y = Mean))
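Another way to avoid the discrete/continuous conflict discussed above is to keep the x axis continuous in every layer and restore the category names as axis labels. This is a sketch of that alternative, not something proposed in the thread; the alpha value and the axis title are assumptions.

```r
library(dplyr)
library(ggplot2)

d <- mtcars %>%
  mutate(wt.cat = cut(wt, seq(1.5, 5.5, by = 1))) %>%
  group_by(wt.cat) %>%
  summarize(Mean = mean(mpg), Min = min(mpg), Max = max(mpg))

# Every layer uses the numeric position of the factor level,
# so only a continuous x scale is ever needed
p <- ggplot(d, aes(x = as.numeric(wt.cat))) +
  geom_ribbon(aes(ymin = Min, ymax = Max), fill = "blue", alpha = 0.3) +
  geom_point(aes(y = Mean)) +
  # put the category names back on the axis
  scale_x_continuous(breaks = seq_along(levels(d$wt.cat)),
                     labels = levels(d$wt.cat)) +
  labs(x = "wt.cat")
p
```

With a single scale type the layer order no longer matters, so the points can be drawn after the ribbon without a dummy layer.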

R: ggplot2 & Plotly: Recreating 'reference bands' on bar graphs from Tableau in R

I am trying to recreate in R (ggplot2 and plotly) some functionality I use daily in Tableau. I need to be able to create reference bands and lines similar to the image below. I've figured out how to create the reference lines with geom_errorbar(). However, I can't seem to find a solution for the 'Reference Band'.
If a solution isn't possible in ggplot2 or plotly I would be open to another package, but I need something static for Rmarkdown reports and something dynamic for an html widgets dashboard.
Below I have some sample code. I would like to add reference bands of 'High' and 'Low' to the bar graph for each person.
library(ggplot2)
#Create Data
Name <- c("Rick","Carl","Daryl","Glenn")
Pos <- c("M","M","D","D")
Load <- c(100,110,90,130)
High <- c(150,160,130,140)
Low <- c(130,145,120,130)
data <- data.frame(Name,Pos,Load,High,Low)
rm(Name,Pos,Load,High,Low)
#create plot
ggplot(data = data, aes(x = Name, y = Load)) +
  geom_bar(stat ="identity", width=.4)
Any guidance would be appreciated. Thank you!
geom_rect() would be a better choice than geom_errorbar() because it lets you reproduce the image that you posted. Take a look at the documentation for both geom_rect() and geom_errorbar().
The following example could be used in the markdown:
library(dplyr)
library(ggplot2)
delta <- 0.5
data <- mtcars %>% group_by(cyl, vs) %>%
  summarise(xmin = first(cyl) - 1,
            xmax = first(cyl) + 1,
            wt = mean(wt),
            ymin = wt - delta,
            ymax = wt + delta)
ggplot(data = data, aes(x = cyl, y = wt)) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            fill = "indianred", alpha = 0.4) + # adds the reference band layer before
  geom_bar(stat = "identity", fill = "darkblue", width = 1) + # the bar layer
  facet_wrap(~vs) + theme_classic()
If you want just one reference band, use the same ymin and ymax values for all the observations.
You will still need more effort for the HTML version, because plotly::ggplotly() messes it up.
I was able to find a solution. I needed to set the xmin and xmax as numeric and then I was able to create the reference bars.
library(ggplot2)
#Create Data
NameID <- c("Rick","Carl","Daryl","Glenn")
Pos <- c("M","M","D","D")
Load <- c(100,110,90,130)
High <- c(110,160,130,140)
Low <- c(90,145,120,130)
# as.numeric(NameID) below needs a factor; R >= 4.0 no longer converts strings by default
df <- data.frame(NameID,Pos,Load,High,Low, stringsAsFactors = TRUE)
rm(NameID,Pos,Load,High,Low)
p <- ggplot()
p <- p + scale_x_discrete()
p <- p + geom_rect(data=df,
                   aes(xmin = as.numeric(NameID)-.25,
                       xmax = as.numeric(NameID)+.25,
                       ymin = Low,
                       ymax = High),
                   fill = "blue", alpha = 0.2)
p <- p + geom_bar(data = df, aes(x = NameID, y = Load), stat="identity", width = .4)
p
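If converting the names to numbers feels fragile, a per-person band can also be sketched with geom_tile(), which accepts a discrete x directly. This is an alternative to the geom_rect() approach above, not the poster's method; the band is centred midway between Low and High, and the tile width of 0.5 is an assumption.

```r
library(ggplot2)

df <- data.frame(
  NameID = c("Rick", "Carl", "Daryl", "Glenn"),
  Load   = c(100, 110, 90, 130),
  High   = c(110, 160, 130, 140),
  Low    = c(90, 145, 120, 130)
)

p <- ggplot(df, aes(x = NameID)) +
  # band: a tile centred between Low and High, as tall as the gap between them
  geom_tile(aes(y = (Low + High) / 2, height = High - Low),
            width = 0.5, fill = "blue", alpha = 0.2) +
  # bars for the observed load
  geom_col(aes(y = Load), width = 0.4)
p
```

Both layers map the same discrete x, so no explicit scale_x_discrete() call or factor conversion is needed.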

Modifying geom_ribbon borders

I am plotting a series of means and standard deviations over time with the code below, and am trying to use geom_ribbon to display the SDs.
Due to the significant overlap, I'd like to add a border to each ribbon that is the same color as the corresponding variable but drawn as a dashed line, but I can't figure out where in the code this would go. I know the "colour" and "linetype" arguments are involved somehow...
Thanks!
graph.msd <- ggplot(data=g.data, aes(x=quarter,y=mean,group=number))
graph.msd <- graph.msd + geom_line(aes(colour = number),size=1)+geom_ribbon(aes(ymin=mean-sd,ymax=mean+sd,fill=number),linetype=2,alpha=0.1)
You need to pass a value for colour to geom_ribbon, something like
graph.msd <- graph.msd +
  geom_line(aes(colour = number),size=1)+
  geom_ribbon(aes(ymin = mean-sd, ymax = mean+sd,
                  fill = number,colour = number), linetype=2, alpha=0.1)
Here is a reproducible example (using a variant of the examples in ?geom_ribbon):
library(ggplot2)
huron <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))
library(plyr) # to access round_any
huron$decade <- round_any(huron$year, 10, floor)
ggplot(huron, aes(x =year, group = decade)) +
  geom_ribbon(aes(ymin = level-1, ymax = level+1,
                  colour = factor(decade), fill = factor(decade)),
              linetype = 2, alpha= 0.1)
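Applied to data shaped like the question's, the same idea might look like the sketch below. The original g.data is not shown, so the data frame here is a made-up placeholder with the same column names (quarter, number, mean, sd); only the mapping of colour inside aes() and the constant linetype = 2 come from the answer.

```r
library(ggplot2)

# Placeholder data: hypothetical values, only the column names follow the question
g.data <- data.frame(
  quarter = rep(1:4, times = 2),
  number  = rep(c("one", "two"), each = 4),
  mean    = c(1, 2, 3, 4, 2, 2.5, 3, 3.5),
  sd      = c(0.5, 0.4, 0.6, 0.5, 0.3, 0.5, 0.4, 0.6)
)

p <- ggplot(g.data, aes(x = quarter, y = mean, group = number)) +
  # colour mapped inside aes() gives each ribbon a border in its own colour;
  # linetype = 2 outside aes() makes every border dashed
  geom_ribbon(aes(ymin = mean - sd, ymax = mean + sd,
                  fill = number, colour = number),
              linetype = 2, alpha = 0.1) +
  geom_line(aes(colour = number))
p
```

Mapping both fill and colour to the same variable keeps a single combined legend for the ribbons and lines.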

Resources