ggplot add a line from different data to a geom_area() plot - r

I want to create an area plot with a line above it (I'm showing that some components of my data don't sum up to the total and want to discuss that). Here is the code that throws an error:
library(ggplot2)
plotdata <- data.frame(x= rep(1:5,2), y = abs(rnorm(10)), id = rep(c("a","b"), each =5))
plotdata_total <- data.frame(x = 1:5,
y = plotdata[plotdata$id =="a", "y"]+
plotdata[plotdata$id =="b", "y"]+1:5/0.2)
ggplot(plotdata,
aes(x=x, y=y, group = id, fill = id)) +
geom_area() +
geom_line(plotdata_total, aes(x=x, y=y))
and the error is "mapping must be created by aes()". So there is something wrong in the mapping, but even if I manually add an id variable to plotdata_total, I get this error. It also doesn't help to specify color, group, and fill in both aes() calls. What am I missing? Comment out the last geom to see that the area plot works.
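A likely cause, sketched under the assumption that the code is otherwise as posted: the first positional argument of geom_line() is mapping, not data, so the data frame is being passed where aes() is expected. Naming the argument (data = plotdata_total) avoids that. Because the plot-level aes() also maps group and fill to id, which plotdata_total does not have, the layer below sets inherit.aes = FALSE. The set.seed() call and the object name p are additions for illustration only.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, added only to make the sketch reproducible

plotdata <- data.frame(
  x  = rep(1:5, 2),
  y  = abs(rnorm(10)),
  id = rep(c("a", "b"), each = 5)
)
plotdata_total <- data.frame(
  x = 1:5,
  y = plotdata[plotdata$id == "a", "y"] +
    plotdata[plotdata$id == "b", "y"] + 1:5 / 0.2
)

p <- ggplot(plotdata, aes(x = x, y = y, group = id, fill = id)) +
  geom_area() +
  # data must be named: the first positional argument of a geom is `mapping`
  # inherit.aes = FALSE stops the layer from looking for `id` in plotdata_total
  geom_line(data = plotdata_total, aes(x = x, y = y), inherit.aes = FALSE)

p
```

The line layer then uses only the five rows of plotdata_total, independent of the grouping used for the areas.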

Related

Is there a way to change the breaks of a ggplot legend without changing other properties of the aesthetic?

I wish to change the breaks of a ggplot legend without affecting the other properties of the aesthetic (e.g., palette, name, etc.). For example, a MWE where the aesthetic is colour:
## Original plot:
df <- data.frame(x = 1:10, y = 1:10, z = 1:10)
gg <- ggplot(df, aes(x, y, colour = z)) +
geom_point() +
scale_colour_distiller(palette = "Spectral", name = "Original title")
gg
## Plot with adjusted breaks:
gg + scale_colour_distiller(breaks = c(2.5, 7.5))
[Figure: original plot]
[Figure: plot with adjusted breaks]
In the second plot, the colour palette and the legend name are reset to their default values: I want to change the legend breaks only.
I understand why the above approach does not work; the first colour scale is completely replaced by the second scale. However, I don't know how to tackle this problem. Any advice is greatly appreciated!
I wrote a function which solves my question. It takes a ggplot object, the name of an aesthetic (as a string), and the breaks for the corresponding legend.
change_legend_breaks <- function(gg, aesthetic, breaks) {
  ## Find the scales associated with the specified aesthetic
  sc <- as.list(gg$scales)$scales
  all_aesthetics <- sapply(sc, function(x) x[["aesthetics"]][1])
  idx <- which(aesthetic == all_aesthetics)
  ## Overwrite the breaks of the specified aesthetic
  gg$scales$scales[[idx]][["breaks"]] <- breaks
  return(gg)
}
This is my first time dealing with ggplot objects at a low level, so perhaps there is a better, more robust approach: This works for me, though.
Interestingly, it seems to be a mutating function, that is, it alters the plot object itself, rather than a copy of the object. I didn't know this was possible in R.
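The mutation is most likely explained by the fact that ggplot2 scales are ggproto objects, which are built on environments, and environments in R are modified by reference rather than copied. The following base R sketch shows the difference between a list and an environment; the names set_breaks, a_list, and an_env are hypothetical and not part of ggplot2.

```r
# Hypothetical helper: overwrite a `breaks` element on whatever is passed in
set_breaks <- function(obj, breaks) {
  obj$breaks <- breaks
  invisible(obj)
}

# A list is copied on modification, so the caller's object is unchanged
a_list <- list(breaks = "default")
set_breaks(a_list, c(2.5, 7.5))

# An environment is modified in place, so the caller sees the change
an_env <- new.env()
an_env$breaks <- "default"
set_breaks(an_env, c(2.5, 7.5))

print(a_list$breaks)
print(an_env$breaks)
```

If the in-place change is unwanted, one option is to rebuild the plot from its original code before calling the function, rather than relying on a copy of gg.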
As a check that the function works as intended, here is a variant on the original MWE, this time with two aesthetics:
df <- data.frame(x = 1:10, y = 1:10, z1 = 1:10, z2 = 1:10)
gg <- ggplot(df, aes(x, y, colour = z1, size = z2)) +
geom_point() +
scale_size(name = "Original size title") +
scale_colour_distiller(palette = "Spectral", name = "Original colour title")
change_legend_breaks(gg, "colour", breaks = c(2.5, 7.5))
change_legend_breaks(gg, "size", breaks = c(1, 9))

ggplot2 - group aesthetic not working as expected am I missing something?

I am trying to plot a line graph with multiple lines, grouped by a categorical value (a factor). Based on what I have done in the past and what I can find online, the easiest way to do this is to assign the categorical value to the group aesthetic. This isn't working for me, though: I am only getting one line on the graph. I am 100% sure I am doing something super silly, but I can't for the life of me work it out. Thanks in advance :)
#dummy data for example
test <- data.frame(x = sample(seq(as.Date('2015/01/01'), as.Date('2020/01/01'), by="day"), 20),
y = sample(10:300, 10),
Origin_Station = as.factor(rep(1, 10)),
Neighbour_station = as.factor(rep(1:5, each = 20)))
#plot - what I want to see is a line for each of the 5 Neighbour_station categories (1:5) but what I get is just one line
ggplot(test, aes(x=x, y=y, group = Neighbour_station))+
geom_line()
I have also tried this:
ggplot(test, aes(x=x, y=y, group = factor(Neighbour_station), colour = Neighbour_station))+
geom_line()
Hi Rhetta also from Aus here, big ups Australian useRs:
library(ggplot2)
ggplot(test, aes(x = x, y = y, group = Neighbour_station, colour = Neighbour_station))+
geom_line()
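Note that the reason you can't see distinct lines is that your data are exactly the same for each factor level (Neighbour_station 1:5): data.frame() recycles the 20 x values and the 10 y values to fill all 100 rows, so the five lines are drawn on top of each other.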
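To see the grouping work, the dummy data need different y values per station. A minimal sketch, assuming monthly dates and random y values (test2, dates, and the seed are made up for illustration):

```r
library(ggplot2)

set.seed(42)  # arbitrary seed for reproducibility

# 20 dates, repeated once per station
dates <- seq(as.Date("2015-01-01"), by = "month", length.out = 20)

test2 <- data.frame(
  x = rep(dates, times = 5),
  y = sample(10:300, 100, replace = TRUE),  # 100 independent draws, no recycling
  Neighbour_station = factor(rep(1:5, each = 20))
)

p <- ggplot(test2, aes(x = x, y = y,
                       group = Neighbour_station,
                       colour = Neighbour_station)) +
  geom_line()

p
```

With one independent y series per station, the five lines no longer overlap.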

plotting multiple geom-vline in a graph

I am trying to plot two vertical lines with geom_vline() in a graph.
The code below works fine for one vertical line:
x=1:7
y=1:7
df1 = data.frame(x=x,y=y)
vertical.lines <- c(2.5)
ggplot(df1,aes(x=x, y=y)) +
geom_line()+
geom_vline(aes(xintercept = vertical.lines))
However, when I add the second desired vertical line by changing vertical.lines <- c(2.5,4), I get the error:
Error: Aesthetics must be either length 1 or the same as the data (7): xintercept
How do I fix that?
Just remove aes() when you use + geom_vline:
ggplot(df1,aes(x=x, y=y)) +
geom_line()+
geom_vline(xintercept = vertical.lines)
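It's not working because anything inside aes() is treated as a mapping and is evaluated against the layer's data, which here is the 7-row df1 inherited from ggplot(). A length-2 vector is neither a single value nor as long as the data, hence the error. Outside aes(), xintercept is a plain parameter and geom_vline() draws one line per value.
You should see + geom_vline as a layer of annotation on the graph, unlike + geom_point or + geom_line, which map data to the plot. (See here how they are in two different sections).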
All the aesthetics need to have either length 1 or the same as the data, as the error tells you. But the annotations can have different lengths.
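If the line positions should stay inside aes() (for example, to map colour to them later), another option is to give geom_vline() its own data frame, so the aesthetic has the same length as that layer's data. A minimal sketch; the data frame name vlines is an assumption for illustration:

```r
library(ggplot2)

df1 <- data.frame(x = 1:7, y = 1:7)

# One row per vertical line; this becomes the data for the vline layer only
vlines <- data.frame(xintercept = c(2.5, 4))

p <- ggplot(df1, aes(x = x, y = y)) +
  geom_line() +
  geom_vline(data = vlines, aes(xintercept = xintercept))

p
```

Here the mapping is evaluated against the two-row vlines data frame rather than the seven-row df1, so the length check passes.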
Data:
x=1:7
y=1:7
df1 = data.frame(x=x,y=y)
vertical.lines <- c(2.5,4)
ggplot(df1, aes(x = x, y = y)) +
geom_line() +
sapply(vertical.lines, function(xint) geom_vline(aes(xintercept = xint)))

Different behavior between ggplot2 and plotly using ggplotly

I want to make a line chart in plotly whose color is not the same along its whole length. The color is given by a continuous scale. This is easy in ggplot2, but when I translate the plot to plotly using the ggplotly function, the variable determining the color behaves like a categorical variable.
library(dplyr)
library(ggplot2)
library(plotly)
# tibble() replaces the deprecated data_frame()
df <- tibble(
  x = 1:15,
  group = rep(c(1,2,1), each = 5),
  y = 1:15 + group
)
gg <- ggplot(df) +
aes(x, y, col = group) +
geom_line()
gg # ggplot2
ggplotly(gg) # plotly
[Figure: ggplot2 output (desired)]
[Figure: plotly output]
I found one work-around that, on the other hand, behaves oddly in ggplot2.
df2 <- df %>%
tidyr::crossing(col = unique(.$group)) %>%
mutate(y = ifelse(group == col, y, NA)) %>%
arrange(col)
gg2 <- ggplot(df2) +
aes(x, y, col = col) +
geom_line()
gg2
ggplotly(gg2)
I also did not find a way to do this in plotly directly. Maybe there is no solution at all. Any ideas?
It looks like ggplotly is treating group as a factor, even though it's numeric. You could use geom_segment as a workaround to ensure that segments are drawn between each pair of points:
gg2 = ggplot(df, aes(x,y,colour=group)) +
geom_segment(aes(x=x, xend=lead(x), y=y, yend=lead(y)))
gg2
ggplotly(gg2)
Regarding @rawr's (now deleted) comment, I think it would make sense to have group be continuous if you want to map line color to a continuous variable. Below is an extension of the OP's example to a group column that's continuous, rather than having just two discrete categories.
set.seed(49)
# tibble() replaces the deprecated data_frame()
df3 <- tibble(
  x = 1:50,
  group = cumsum(rnorm(50)),
  y = 1:50 + group
)
Plot gg3 below uses geom_line, but I've also included geom_point. You can see that ggplotly is plotting the points. However, there are no lines, because no two points have the same value of group. If we hadn't included geom_point, the graph would be blank.
gg3 <- ggplot(df3, aes(x, y, colour = group)) +
geom_point() + geom_line() +
scale_colour_gradient2(low="red",mid="yellow",high="blue")
gg3
ggplotly(gg3)
Switching to geom_segment gives us the lines we want with ggplotly. Note, however, that line color will be based on the value of group at the first point in the segment (whether using geom_line or geom_segment), so there might be cases where you want to interpolate the value of group between each (x,y) pair in order to get smoother color gradations:
gg4 <- ggplot(df3, aes(x, y, colour = group)) +
geom_segment(aes(x=x, xend=lead(x), y=y, yend=lead(y))) +
scale_colour_gradient2(low="red",mid="yellow",high="blue")
ggplotly(gg4)
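The interpolation mentioned above could be sketched as follows, using base R's approx() to linearly interpolate both y and group onto a finer x grid before building the segments. The grid step of 0.1 and the names x_fine, dense, segs, and gg5 are arbitrary choices for illustration, not part of the original answer.

```r
library(ggplot2)

set.seed(49)
df3 <- data.frame(x = 1:50, group = cumsum(rnorm(50)))
df3$y <- df3$x + df3$group

# Finer x grid (step size is an arbitrary choice)
x_fine <- seq(min(df3$x), max(df3$x), by = 0.1)

# Linearly interpolate y and group between the original points
dense <- data.frame(
  x     = x_fine,
  y     = approx(df3$x, df3$y, xout = x_fine)$y,
  group = approx(df3$x, df3$group, xout = x_fine)$y
)

# Explicit segment endpoints: each row runs from point i to point i + 1
n <- nrow(dense)
segs <- data.frame(
  x     = dense$x[-n], xend = dense$x[-1],
  y     = dense$y[-n], yend = dense$y[-1],
  group = dense$group[-n]
)

gg5 <- ggplot(segs, aes(x = x, y = y, xend = xend, yend = yend, colour = group)) +
  geom_segment() +
  scale_colour_gradient2(low = "red", mid = "yellow", high = "blue")

gg5
```

Each short segment is still colored by the value at its starting point, but with ten segments per original interval the color steps are much smaller. Passing gg5 to ggplotly() should behave like gg4, though with many more segments to draw.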

Suppress message from geom_line with only one point

I'm iterating through multiple data sets to produce line plots for each set. How can I prevent ggplot from complaining when I use geom_line over one point?
Take, for example, the following data:
mydata = data.frame(
x = c(1, 2),
y = c(2, 2),
group = as.factor(c("foo", "foo"))
)
Creating a line graph works just fine because there are two points in the line:
ggplot(mydata, aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
However, plotting only the first row gives the message:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
ggplot(mydata[1,], aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
Some of my figures will only have one point and the messages cause hangups in the greater script that produces these figures. I know the plots still work, so my concern is avoiding the message. I'd also like to avoid using suppressWarnings() if possible in case another legitimate and unexpected issue arises.
Per an answer to this question: suppressMessages(ggplot()) fails because you have to wrap it around a print() call of the ggplot object--not the ggplot object itself. This is because the warning/message only occurs when the object is drawn.
So, to view your plot without a warning message run:
p <- ggplot(mydata[1,], aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
suppressMessages(print(p))
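Since suppressMessages() hides every message, a narrower option is to muffle only the messages that match a pattern and let the rest through. A minimal sketch using base R's withCallingHandlers(); the helper name muffle_matching is hypothetical, and the demonstration uses plain message() calls rather than a plot:

```r
# Hypothetical helper: evaluate `expr`, silencing only messages whose text
# matches `pattern`; all other messages, and all warnings, pass through
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    message = function(m) {
      if (grepl(pattern, conditionMessage(m))) {
        invokeRestart("muffleMessage")
      }
    }
  )
}

# Demonstration: the first message is silenced, the second is still shown
muffle_matching(
  {
    message("Each group consists of only one observation.")
    message("some other message")
  },
  pattern = "only one observation"
)
```

For the plot above this would be called as muffle_matching(print(p), "only one observation"), assuming the notice is emitted as a message with that wording, as the text quoted in the question suggests.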
I think the following if-else solution should resolve the problem:
if (nrow(mydata) > 1) {
ggplot(mydata, aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
} else {
ggplot(mydata, aes(x = x, y = y)) +
geom_point()
}
On community.rstudio.com, John Mackintosh suggests a solution which worked for me.
Freely quoting:
Rather than suppress warnings, change the plot layers slightly.
Facet wrap to create empty plot
Add geom_point for entire data frame
Subset the dataframe by creating a vector of groups with more than one data point, and filtering the original data for those groups. Only plot lines for this subset.
Details and example code in the followup of the link above.
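The subsetting step described above might look like the sketch below. This is not the code from the linked post; the extra "bar" row and the names counts, multi, and line_data are made up for illustration.

```r
library(ggplot2)

# Example data: group "foo" has two points, group "bar" has only one
mydata <- data.frame(
  x = c(1, 2, 1),
  y = c(2, 2, 3),
  group = factor(c("foo", "foo", "bar"))
)

# Groups with more than one observation can form a line
counts <- table(mydata$group)
multi <- names(counts)[counts > 1]
line_data <- mydata[mydata$group %in% multi, ]

p <- ggplot(mydata, aes(x = x, y = y)) +
  geom_point() +                                    # points for every row
  geom_line(data = line_data, aes(group = group))   # lines only where possible

p
```

Every observation still appears as a point, while geom_line() only receives groups that have at least two rows.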
