Plotting a continuous range of time data in R

I am trying to plot the distribution of turtle nesting activity over a night using ggplot, but I want to exclude the times from 8am - 6pm. Also, I need the x-axis to start at 7pm and end at 7am.
My code is
ggplot(sub.1) + geom_bar(aes(x = sub.1$time)) + scale_x_continuous(expand = c(0, 0), limits = c(0, 23), breaks = seq(0, 23, 1)) + xlab("Hour") + ylab ("Frequency")
Any assistance would really be appreciated.

Try this. I'll post some comments in there when I get a chance. Hope it helps!
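# Example data (hours 0-23, so midnight is included)
SampleHours <- sample(0:23, 3000, replace = TRUE)
# Keep wanted hours, listed in the order they should appear on the axis (7pm to 7am)
IncludedHours <- c(19:23, 0:7)
Index <- SampleHours %in% IncludedHours
# Create dataframe
sub.1 <- data.frame(Hour = SampleHours[Index])
# Convert to a factor whose levels follow the display order.
# (Assigning levels() after as.factor() would relabel the values instead of reordering them.)
sub.1$Hour <- factor(sub.1$Hour, levels = IncludedHours)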
# Plot
library(ggplot2)
ggplot(sub.1) +
geom_bar(aes(x = Hour)) +
xlab("Hour") +
ylab ("Frequency")
EDIT
Changed a part to identify which hours to keep rather than which ones to exclude. I think that makes it easier to follow.

My approach used dplyr. First, I generated some fake data:
times = sample(seq(0,23), 10000, replace=TRUE)
nest = sample(c(0,1), 10000, replace=TRUE)
data = data.frame(times, nest)
Then, I used dplyr to pipe the results:
library(dplyr)
library(ggplot2)
data %>% filter(times>18 | times<8) %>%
transform(times=factor(times, levels=c(19:23,0:7))) %>%
ggplot() + geom_bar(aes(x=times)) +
xlab("Hour") +
ylab ("Frequency")
The filter call selects the hours; the transform call does the same job as @William's factor step, setting the level order so that the hours run from 19 to 7.
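Both answers depend on the order of the factor levels to make the axis run through midnight. Here is a minimal base R sketch of that idea, using made-up hours (the vector names are illustrative and do not come from either answer). It also shows why factor(x, levels = ...) and levels(x) <- ... are not interchangeable:

```r
# Made-up hours of the day (0-23); purely illustrative
hours <- c(22, 1, 19, 7, 23, 0, 12, 15, 1)

# Hours to keep, listed in the order they should appear on the axis
night_order <- c(19:23, 0:7)
night <- hours[hours %in% night_order]

# factor(..., levels = ) sets the level order and leaves the values untouched
ordered_hours <- factor(night, levels = night_order)
counts <- table(ordered_hours)
print(counts)

# levels<- renames the existing (sorted) levels instead, which changes the data
relabelled <- factor(c(1, 7, 19))   # levels are "1", "7", "19"
levels(relabelled) <- c(19, 1, 7)   # 1 becomes 19, 7 becomes 1, 19 becomes 7
print(as.character(relabelled))
```

Passing a factor built the first way to geom_bar should give bars in the order 19, 20, ..., 23, 0, ..., 7.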

Related

Overlaying boxplot with a lineplot

I have some fake data representing the answering times of different users answering an online survey.
The dataset has three variables: the id of the respondent (user), the name of the question (question) and the answering time for each question (time).
n <- 1000
dat <- data.frame(user = 1:n,
question = sample(paste("q", 1:4, sep = ""), size = n, replace = TRUE),
time = round(rnorm(n, mean = 10, sd=4), 0)
)
pltSingleRespondent <- function(df, highlightUsers){
df %>%
ggplot(aes(x = question, y = time)) +
geom_boxplot(fill = 'orange') + coord_flip() +
ggtitle("Answering time per question")
}
pltSingleRespondent(dat, c(1, 31) )
I was creating a function that plots a boxplot with the answering times for each question. However, now I'd like to overlay that plot with the answering times of specific respondents (highlightUsers). The following image shows an example:
Can someone please explain to me how to do this?
I think the most direct way to do this is to subset your data within a call to geom_line.
I'll start with a different set of random data, since the sample data in the question does not include all questions for a user.
set.seed(2021)
dat <- expand.grid(user = factor(1:50), question = paste0("q", 1:4))
dat$time <- round(rnorm(200, mean = 10, sd = 4), 0)
dat %>%
ggplot(aes(x = question, y = time)) +
geom_boxplot(fill = 'orange') + coord_flip() +
ggtitle("Answering time per question") +
geom_line(aes(color = user, group = user), size = 2,
data = ~ subset(., user %in% c(1L, 34L)))
You can functionize it however you want. If you're using dplyr, you can use dplyr::filter instead of subset with no other change.
Also, I chose to factor(user), since otherwise ggplot2 tends to think its data is continuous (for color=user). You can choose to use or not use this, though you may need more wrangling to get it to be discrete.
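As one possible way to "functionize" the approach above, here is a hedged sketch. The function name plot_with_highlight and its arguments are assumptions made for illustration; the subsetting is done with base R before the data reaches geom_line:

```r
library(ggplot2)

# Hypothetical wrapper; the name and arguments are illustrative only
plot_with_highlight <- function(df, highlight_users) {
  # Rows belonging to the users that should be drawn as lines
  highlighted <- df[df$user %in% highlight_users, , drop = FALSE]

  ggplot(df, aes(x = question, y = time)) +
    geom_boxplot(fill = "orange") +
    geom_line(data = highlighted, aes(color = user, group = user)) +
    coord_flip() +
    ggtitle("Answering time per question")
}

# Same style of example data as above: every user answers every question
set.seed(2021)
dat <- expand.grid(user = factor(1:50), question = paste0("q", 1:4))
dat$time <- round(rnorm(nrow(dat), mean = 10, sd = 4), 0)

p <- plot_with_highlight(dat, c(1, 34))
print(p)
```

Keeping user as a factor means the highlighted lines get discrete colors without extra scale settings.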
Slightly different approach. Add a column to the data that indicates the highlighted users and map that variable to geom_line. Use scale_color_discrete(na.translate = FALSE) to color only the non-NA values.
library(dplyr)
library(ggplot2)
pltSingleRespondent <- function(df, highlightUsers) {
df %>%
mutate(User = factor(ifelse(user %in% highlightUsers, as.character(user), NA))) %>%
ggplot(aes(question, time)) +
geom_boxplot(fill = "orange") +
geom_line(aes(color = User, group = User)) +
ggtitle("Answering time per question") +
scale_color_discrete(na.translate = FALSE) +
coord_flip() +
theme_bw()
}
Using the example data from @r2evans:
pltSingleRespondent(dat, c(1, 34))

Is there a way I could plot t = 300, 350, 450, and 500 lines in one graph?

I want to plot multiple lines in one graph, but I can't figure out which code to use. Also, is there a way to assign a color to each of the lines? I'm new to RStudio and was assigned to pick up someone's work, so I've been doing a lot of trial and error, without any luck for the past few days. I hope someone can help me with this! Thank you so much.
ecdf.shift <- function(OUR_threshold, des_cap = 40, nint = 10000){
#create some empty vectors for later use in the loop
ecdf_med = c()
ecdf_obs = c()
for (i in 1:length(OUR_threshold)){
# filter out the OUR threshold data, then select only the capture column and create a ecdf function
ecdf_fun <- HRP_rESS_no %>%
filter(ESS > OUR_threshold[i]) %>%
.$TSS_con %>%
ecdf()
# extract the ecdf data and put in tibble dataframe, then create a linear interpolation of the curve.
ecdf_data <- tibble(TSS_con = environment(ecdf_fun)$x, prob = environment(ecdf_fun)$y)
ecdf_interpol <- approx(x = ecdf_data$TSS_con, y = ecdf_data$prob, n = nint)
# find the positions in x that correspond to the desired capture, then look up the matching probabilities in y. Take the median in case of multiple hits, and store it at position i (the loop index) of the result vector.
ecdf_med[i] <- median(ecdf_interpol$y[(round(ecdf_interpol$x,1) == des_cap)])
# calculate the number of observations when the filtering takes place.
ecdf_obs[i] <- HRP_rESS_no %>%
filter(ESS > OUR_threshold[i]) %>%
.$TSS_con %>%
length()
# Flush the ecdf data. The ecdf is encoded as a function with global parameters, so reset them every time the loop runs to avoid pesky bugs.
rm(ecdf_data)
}
#create a tibble dataframe with all the loop data.
ecdf_out <- tibble(OUR_ratio_cutoff = OUR_threshold, prob = (ecdf_med)*100, nobs = ecdf_obs)
return(ecdf_out)
}
ratio_threshold <- seq(0,115, by = 5)
t = ecdf_MLSS_target <- 400 %>%
ecdf.shift(ratio_threshold, .) %>%
filter(nobs > 2) %>%
ggplot(aes( x = OUR_ratio_cutoff, y = prob)) +
geom_line() +
geom_point() +
theme_bw(base_size = 12) +
theme(panel.grid = element_blank()) +
scale_y_continuous(limits = c(0,100),
breaks = seq(0,300, by = 5),
expand = c(0,0)) +
scale_x_continuous(limits = c(0,120),
breaks = seq(0,110, by = 10),
expand = c(0,0)) +
labs(x = "ESS mg TSS/L",
y = "Probability of contactor MLSS > 400 mg TSS/L ")
plot(t)
The easiest approach is to loop over your different t values first, combine the resulting data frames into one big data frame, and use that for your plot. Your code is not fully reproducible (it requires data that we do not have, i.e. HRP_rESS_no), so I have stripped the function down to its core: it creates a data frame that produces a different "line" for each t value. I simply used t as the slope.
I hope the idea is clear.
library(tidyverse)
ecdf.shift <- function(OUR_threshold, t) {
data.frame(x = OUR_threshold, y = t * OUR_threshold)
}
ratio_threshold <- seq(0, 115, by = 5)
t_df <-
map(1:5, function(t) ecdf.shift(ratio_threshold, t)) %>%
bind_rows(.id = "t")
ggplot(t_df, aes(x, y, color = t)) +
geom_line() +
geom_point()
Created on 2020-05-07 by the reprex package (v0.3.0)
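For the specific values in the question (t = 300, 350, 450, 500) and the request to assign a color to each line, a hedged variation could look like the sketch below. ecdf_shift_demo is a stand-in for the real function (t is again just a slope), and the color names are arbitrary choices:

```r
library(ggplot2)

# Stand-in for the real ecdf.shift(); t is simply used as a slope (assumption)
ecdf_shift_demo <- function(OUR_threshold, t) {
  data.frame(x = OUR_threshold, y = t * OUR_threshold)
}

ratio_threshold <- seq(0, 115, by = 5)
t_values <- c(300, 350, 450, 500)

# One data frame per t value, each tagged with its t, stacked into one
t_df <- do.call(rbind, lapply(t_values, function(t) {
  out <- ecdf_shift_demo(ratio_threshold, t)
  out$t <- factor(t, levels = t_values)
  out
}))

# Named vector: factor level -> color (colors chosen arbitrarily)
line_colours <- c("300" = "steelblue", "350" = "darkgreen",
                  "450" = "orange", "500" = "firebrick")

p <- ggplot(t_df, aes(x, y, colour = t)) +
  geom_line() +
  geom_point() +
  scale_colour_manual(values = line_colours)
print(p)
```

Storing t as a factor keeps the color scale discrete, so scale_colour_manual can map each value to a fixed color.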

ggplot2: creating different graph panels per id

I am trying to create a time series plot for each individual (ID) I have in my dataset.
Example data:
ID <- rep(c(2:5), each = 9, times = 4)
Attitude <- rep(c('A1', 'A2','A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9'), 16)
Answer <- rep(1:5, length.out = 144)
time <- as.character(rep(c(0, 1, 3, 4), each = 9, times = 4))
first_answer <- rep(1:5, length.out = 144)
df <- data.frame(ID, Attitude, Answer, time, first_answer)
df$time <- as.character(df$time)
The function code I am currently using:
library(dplyr)
spaghetti_plot <- function(input, MV, item_level){
MV <- enquo(MV)
titles <- enquo(item_level)
input %>%
filter(!!(MV) == item_level) %>%
mutate(first_answer = first_answer) %>%
ggplot(.,aes( x = time, y = jitter(Answer), group = ID)) +
geom_line(aes(colour = first_answer)) +
labs(title = titles ,x = 'Time', y = 'Answer', colour = 'Answer given at time 0')
}
This gives me a graph where I have a line for each individual, i.e. one plot for all individuals (equal to number of ID). Instead of this, I would like to have 1 plot with # panels = ID. For example, if I have data of 10 individuals, I would like to have a graph with 10 panels.
I tried using facet_wrap and facet_panel to get the job done, but I haven't found a proper solution yet.
EDIT using facet_wrap(~ID) gives
The result that I am after would look something like this:
Which was originally made in SAS.
EDIT2 Solution is in the comments.
The data from your reproducible example are a bit weird because you have only one value per ID, but I believe this is the code you are looking for:
library(ggplot2)
ggplot(df,aes(x = time, y = Answer)) +
geom_line()+
facet_grid(. ~ ID)
If you have too many facets, the data may not show up. Try increasing the size of the plot window, or export the image directly with ggsave; with a suitable width and height, all the panels should be visible in the saved image.
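A minimal sketch of the one-panel-per-ID idea, under the assumption that each individual has one answer at each time point. The data frame demo below is made up for illustration (it is not the question's df), and time is kept numeric so that geom_line connects the points within each panel:

```r
library(ggplot2)

# Made-up data: every ID has one answer at each of the four time points
set.seed(1)
demo <- expand.grid(ID = 2:5, time = c(0, 1, 3, 4))
demo$Answer <- sample(1:5, nrow(demo), replace = TRUE)

# One panel per individual; group = ID keeps each line within its own ID
p <- ggplot(demo, aes(x = time, y = Answer, group = ID)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ ID)
print(p)
```

If time has to stay a character or factor column, the group aesthetic matters even more, because geom_line otherwise treats each discrete x value as its own group.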

Fill area between two lines, with high/low and dates

Foreword: I provide a reasonably satisfactory answer to my own question. I understand this is acceptable practice. Naturally, my hope is to invite suggestions and improvements.
My purpose is to plot two time series (stored in a dataframe with dates stored as class 'Date') and to fill the area between the data points with two different colors according to whether one is above the other. For instance, to plot an index of Bonds and an index of Stocks, and to fill the area in red when the Stock index is above the bond index, and to fill the area in blue otherwise.
I have used ggplot2 for this purpose, because I am reasonably familiar with the package (author: Hadley Wickham), but feel free to suggest other approaches. I wrote a custom function based on the geom_ribbon() function of the ggplot2 package. Early on I faced problems related to my lack of experience in handling the geom_ribbon() function and objects of class 'Date'. The function below represents my effort to solve these problems; almost surely it is roundabout, unnecessarily complicated, clumsy, etc. So my question is: please suggest improvements and/or alternative approaches. Ultimately, it would be great to have a general-purpose function made available here.
Data:
set.seed(123456789)
df <- data.frame(
Date = seq.Date(as.Date("1950-01-01"), by = "1 month", length.out = 12*10),
Stocks = 100 + c(0, cumsum(runif(12*10-1, -30, 30))),
Bonds = 100 + c(0, cumsum(runif(12*10-1, -5, 5))))
library('reshape2')
df <- melt(df, id.vars = 'Date')
Custom Function:
## Function to plot geom_ribbon for class Date
geom_ribbon_date <- function(data, group, N = 1000) {
# convert column of class Date to numeric
x_Date <- as.numeric(data[, which(sapply(data, class) == "Date")])
# append numeric date to dataframe
data$Date.numeric <- x_Date
# ensure fill grid is as fine as data grid
N <- max(N, length(x_Date))
# generate a grid for fill
seq_x_Date <- seq(min(x_Date), max(x_Date), length.out = N)
# ensure the grouping variable is a factor
group <- factor(group)
# create a dataframe of min and max
area <- Map(function(z) {
d <- data[group == z,];
approxfun(d$Date.numeric, d$value)(seq_x_Date);
}, levels(group))
# create a categorical variable for the max
maxcat <- apply(do.call('cbind', area), 1, which.max)
# output a dataframe with x, ymin, ymax, is. max 'dummy', and group
df <- data.frame(x = seq_x_Date,
ymin = do.call('pmin', area),
ymax = do.call('pmax', area),
is.max = levels(group)[maxcat],
group = cumsum(c(1, diff(maxcat) != 0))
)
# convert back numeric dates to column of class Date
df$x <- as.Date(df$x, origin = "1970-01-01")
# create and return the geom_ribbon
gr <- geom_ribbon(data = df, aes(x, ymin = ymin, ymax = ymax, fill = is.max, group = group), inherit.aes = FALSE)
return(gr)
}
Usage:
ggplot(data = df, aes(x = Date, y = value, group = variable, colour = variable)) +
geom_ribbon_date(data = df, group = df$variable) +
theme_bw() +
xlab(NULL) +
ylab(NULL) +
ggtitle("Bonds Versus Stocks (Fake Data!)") +
scale_fill_manual('is.max', breaks = c('Stocks', 'Bonds'),
values = c('darkblue','darkred')) +
theme(legend.position = 'right', legend.direction = 'vertical') +
theme(legend.title = element_blank()) +
theme(legend.key = element_blank())
Result:
While there are related questions and answers on Stack Overflow, I haven't found one that was sufficiently detailed for my purpose. Here is a selection of useful exchanges:
create-geom-ribbon-for-min-max-range: Asks a similar question, but provides less detail than I was looking for.
possible-bug-in-geom-ribbon: Closely related, but intermediate steps on how to compute max/min are missing.
fill-region-between-two-loess-smoothed-lines-in-r-with-ggplot: Closely related, but focuses on loess lines. Excellent.
ggplot-colouring-areas-between-density-lines-according-to-relative-position: Closely related, but focuses on densities. This post greatly inspired me.
Perhaps I'm not understanding your full problem, but it seems that a fairly direct approach would be to define a third line as the minimum of the two time series at each time point. geom_ribbon then draws two ribbons (one for each unique value of Assets), each bounded by one of the series and the minimum line. Code could look like:
set.seed(123456789)
df <- data.frame(
Date = seq.Date(as.Date("1950-01-01"), by = "1 month", length.out = 12*10),
Stocks = 100 + c(0, cumsum(runif(12*10-1, -30, 30))),
Bonds = 100 + c(0, cumsum(runif(12*10-1, -5, 5))))
library(reshape2)
library(ggplot2)
df <- cbind(df,min_line=pmin(df[,2],df[,3]) )
df <- melt(df, id.vars=c("Date","min_line"), variable.name="Assets", value.name="Prices")
sp <- ggplot(data=df, aes(x=Date, fill=Assets))
sp <- sp + geom_ribbon(aes(ymax=Prices, ymin=min_line))
sp <- sp + scale_fill_manual(values=c(Stocks="darkred", Bonds="darkblue"))
sp <- sp + ggtitle("Bonds Versus Stocks (Fake Data!)")
plot(sp)
This produces the following chart:
I actually had the same question some time ago, and here is the related post. It defines one function that finds the intersections between two lines and another function that takes a dataframe as input and then colors the space between the two columns using matplot and polygon.
EDIT
Here is the code, modified a bit to allow the last polygon to be plotted
set.seed(123456789)
dat <- data.frame(
Date = seq.Date(as.Date("1950-01-01"), by = "1 month", length.out = 12*10),
Stocks = 100 + c(0, cumsum(runif(12*10-1, -30, 30))),
Bonds = 100 + c(0, cumsum(runif(12*10-1, -5, 5))))
intersects <- function(x1, x2) {
seg1 <- which(!!diff(x1 > x2)) # location of first point in crossing segments
above <- x2[seg1] > x1[seg1] # which curve is above prior to crossing
slope1 <- x1[seg1+1] - x1[seg1]
slope2 <- x2[seg1+1] - x2[seg1]
x <- seg1 + ((x2[seg1] - x1[seg1]) / (slope1 - slope2))
y <- x1[seg1] + slope1*(x - seg1)
data.frame(x=x, y=y, pindex=seg1, pabove=(1:2)[above+1L])
# pabove is greater curve prior to crossing
}
fillColor <- function(data, addLines=TRUE) {
## Find points of intersections
ints <- intersects(data[,2], data[,3]) # because the first column is for Dates
intervals <- findInterval(1:nrow(data), c(0, ints$x))
## Make plot
matplot(data, type="n", col=2:3, lty=1, lwd=4,xaxt='n',xlab='Date')
axis(1,at=seq(1,dim(data)[1],length.out=12),
labels=data[,1][seq(1,dim(data)[1],length.out=12)])
legend("topright", c(colnames(data)[2], colnames(data)[3]), col=3:2, lty=1, lwd=2)
## Draw the polygons
for (i in seq_along(table(intervals))) {
xstart <- ifelse(i == 1, 0, ints$x[i-1])
ystart <- ifelse(i == 1, data[1,2], ints$y[i-1])
xend <- ints$x[i]
yend <- ints$y[i]
x <- seq(nrow(data))[intervals == i]
polygon(c(xstart, x, xend, rev(x)), c(ystart, data[x,2], yend, rev(data[x,3])),
col=ints$pabove[i]%%2+2)
}
# add end of plot
xstart <- ints[dim(ints)[1],1]
ystart <- ints[dim(ints)[1],2]
xend <- nrow(data)
yend <- data[dim(data)[1],2]
x <- seq(nrow(data))[intervals == max(intervals)]
polygon(c(xstart, x, xend, rev(x)), c(ystart, data[x,2], yend, rev(data[x,3])),
col=ints[dim(ints)[1]-1,4]%%2+2)
## Add lines for curves
if (addLines)
invisible(lapply(1:2, function(x) lines(seq(nrow(data)), data[,x], col=x%%2+2, lwd=2)))
}
## Plot the data
fillColor(dat,FALSE)
and the final result is this (with the same data used for the question)
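To make the interpolation inside intersects() concrete, here is a small worked example on two made-up series of two points each. It follows the same formulas as the function above, applied by hand to a single crossing:

```r
# Two made-up series that cross once between index 1 and index 2
x1 <- c(0, 2)   # rises from 0 to 2
x2 <- c(1, 1)   # constant at 1

# Index of the first point of the segment in which the curves swap order
seg <- which(diff(x1 > x2) != 0)

# Per-step slopes of both series within that segment
slope1 <- x1[seg + 1] - x1[seg]
slope2 <- x2[seg + 1] - x2[seg]

# Linear interpolation: solve x1[seg] + slope1 * d == x2[seg] + slope2 * d for d
x_cross <- seg + (x2[seg] - x1[seg]) / (slope1 - slope2)
y_cross <- x1[seg] + slope1 * (x_cross - seg)

print(c(x = x_cross, y = y_cross))
```

Here the first series gains 2 per step while the second is flat, so the initial gap of 1 closes halfway through the segment, at x = 1.5 and y = 1.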
@walt's answer should remain the winner, but while implementing his solution I gave it a tidy update.
library(tidyverse)
set.seed(2345)
# fake data
raw_data <-
tibble(
date = as.Date("2020-01-01") + (1:40),
a = 95 + cumsum(runif(40, min = -20, max = 20)),
b = 55 + cumsum(runif(40, min = -1, max = 1))
)
# the steps
# the 'y' + 'min_line' + 'group' is the right granularity (by date) to
# create 2 separate ribbons
df <-
raw_data %>%
# find min of the two columns
mutate(min_line = pmin(a, b)) %>%
pivot_longer(c(a, b), names_to = "group", values_to = "y") %>%
print()
# the result
ggplot(data = df, aes(x = date, fill = group)) +
geom_ribbon(aes(ymax = y, ymin = min_line)) +
theme_classic()
Another option uses ggh4x. It requires the data to be wide, with the y values for lines 1 and 2 in different columns.
library(ggh4x)
#> Loading required package: ggplot2
set.seed(123456789)
df <- data.frame(
Date = seq.Date(as.Date("1950-01-01"), by = "1 month", length.out = 12*10),
Stocks = 100 + c(0, cumsum(runif(12*10-1, -30, 30))),
Bonds = 100 + c(0, cumsum(runif(12*10-1, -5, 5))))
## The data frame is NOT made long!!
ggplot(data = df, aes(x = Date)) +
stat_difference(aes(ymin = Stocks, ymax = Bonds)) +
scale_fill_brewer(palette = "Set1")
Created on 2022-11-24 with reprex v2.0.2

How to Create a Graph of Statistical Time Series

I have data in the following format:
Date Year Month Day Flow
1 1953-10-01 1953 10 1 530
2 1953-10-02 1953 10 2 530
3 1953-10-03 1953 10 3 530
I would like to create a graph like this:
Here is my current image and code:
library(ggplot2)
library(plyr)
library(reshape2)
library(scales)
library(lubridate) # provides month() and day(), used below
## Read Data
df <- read.csv("Salt River Flow.csv")
## Convert Date column to R-recognized dates
df$Date <- as.Date(df$Date, "%m/%d/%Y")
## Finds Water Years (Oct - Sept)
df$WY <- as.POSIXlt(as.POSIXlt(df$Date)+7948800)$year+1900
## Normalizes Water Years so stats can be applied to just months and days
df$w <- ifelse(month(df$Date) %in% c(10,11,12), 1903, 1904)
##Creates New Date (dat) Column
df$dat <- as.Date(paste(df$w,month(df$Date),day(df$Date), sep = "-"))
## Creates new data frame with summarised data by MonthDay
PlotData <- ddply(df, .(dat), summarise, Min = min(Flow), Tenth = quantile(Flow, probs = 0.10), TwentyFifth = quantile(Flow, probs = 0.25), Median = quantile(Flow, probs = 0.50), Mean = mean(Flow), SeventyFifth = quantile(Flow, probs = 0.75), Ninetieth = quantile(Flow, probs = 0.90), Max = max(Flow))
## Melts data so it can be plotted with ggplot
m <- melt(PlotData, id="dat")
## Plots
p <- ggplot(m, aes(x = dat)) +
geom_ribbon(aes(ymin = TwentyFifth, ymax = Median), data = PlotData, fill = alpha("black", 0.1), color = NA) +
geom_ribbon(aes(ymin = Median, ymax = SeventyFifth), data = PlotData, fill = alpha("black", 0.5), color = NA) +
scale_x_date(labels = date_format("%b"), breaks = date_breaks("month"), expand = c(0,0)) +
geom_line(data = subset(m, variable == "Mean"), aes(y = value), size = 1.2) +
theme_bw() +
geom_line(data = subset(m, variable %in% c("Min","Max")), aes(y = value, group = variable)) +
geom_line(data = subset(m, variable %in% c("Ninetieth","Tenth")), aes(y = value, group = variable), linetype = 2) +
labs(x = "Water Year", y = "Flow (cfs)")
p
I am very close but there are some issues I'm having. First, if you can see a way to improve my code, please let me know. The main problem I ran into was that I needed two dataframes to make this graph: one melted, and one not. The unmelted dataframe was necessary (I think) to create the ribbons. I tried many ways to use the melted dataframe for the ribbons, but there was always a problem with the aesthetic length.
Second, I know that to get a legend (and I want one), I need to map something in the aesthetics of each line/ribbon, but I am having trouble getting that to work. I think it would involve scale_fill_manual.
Third, and I don't know if this is possible, I would like to have each month label in between the tick marks, not on them (like in the above image).
Any help is greatly appreciated (especially with creating more efficient code).
Thank you.
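The water-year step in the question (October to September, labelled by the year in which it ends) can also be written without the seconds offset. This is a hedged base R sketch on made-up dates; the variable names are illustrative:

```r
# Made-up dates on both sides of the October boundary
dates <- as.Date(c("1953-09-30", "1953-10-01", "1954-02-15"))

yr <- as.integer(format(dates, "%Y"))
mo <- as.integer(format(dates, "%m"))

# October, November and December belong to the following water year
water_year <- yr + as.integer(mo >= 10)

print(data.frame(dates, water_year))
```

This should agree with the question's approach of adding 7948800 seconds (92 days, the combined length of October to December) before extracting the year.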
Something along these lines might get you close with base:
library(lubridate)
library(reshape2)
# simulating data...
Date <- seq(as.Date("1953-10-01"),as.Date("2010-10-01"),by="day")
Year <- year(Date)
Month <- month(Date)
Day <- day(Date)
set.seed(1)
Flow <- rpois(length(Date), 2000)
Data <- data.frame(Date=Date,Year=Year,Month=Month,Day=Day,Flow=Flow)
# use acast to get it in a convenient shape:
PlotData <- acast(Data,Year~Month+Day,value.var="Flow")
# apply for quantiles
Quantiles <- apply(PlotData,2,function(x){
quantile(x,probs=c(1,.9,.75,.5,.25,.1,0),na.rm=TRUE)
})
Mean <- colMeans(PlotData, na.rm=TRUE)
# ugly way to get month tick separators
MonthTicks <- cumsum(table(unlist(lapply(strsplit(names(Mean),split="_"),"[[",1))))
# and finally your question:
plot(1:366,seq(0,max(Flow),length=366),type="n",xlab = "Water Year",ylab="Discharge",axes=FALSE)
polygon(c(1:366,366:1),c(Quantiles["50%",],rev(Quantiles["75%",])),border=NA,col=gray(.6))
polygon(c(1:366,366:1),c(Quantiles["50%",],rev(Quantiles["25%",])),border=NA,col=gray(.4))
lines(1:366,Quantiles["90%",], col = gray(.5), lty=4)
lines(1:366,Quantiles["10%",], col = gray(.5))
lines(1:366,Quantiles["100%",], col = gray(.7))
lines(1:366,Quantiles["0%",], col = gray(.7), lty=4)
lines(1:366,Mean,lwd=3)
axis(1,at=MonthTicks, labels=NA)
text(MonthTicks-15,-100,1:12,pos=1,xpd=TRUE)
axis(2)
The plotting code really isn't that tricky. You'll need to clean up the aesthetics, but polygon() is usually my strategy for shaded regions in plots (confidence bands, whatever).
Perhaps this will get you closer to what you're looking for, using ggplot2 and plyr:
library(ggplot2)
library(plyr)
library(lubridate)
library(scales)
df$MonthDay <- df$Date - years( year(df$Date) + 100 ) #Normalize points to same year
df <- ddply(df, .(Month, Day), mutate, MaxDayFlow = max(Flow) ) #Max flow on day
df <- ddply(df, .(Month, Day), mutate, MinDayFlow = min(Flow) ) #Min flow on day
p <- ggplot(df, aes(x=MonthDay) ) +
geom_smooth(size=2,level=.8,color="black",aes(y=Flow)) + #80% conf. interval
geom_smooth(size=2,level=.5,color="black",aes(y=Flow)) + #50% conf. interval
geom_line( linetype="longdash", aes(y=MaxDayFlow) ) +
geom_line( linetype="longdash", aes(y=MinDayFlow) ) +
labs(x="Month",y="Flow") +
scale_x_date( labels = date_format("%b") ) +
theme_bw()
Edit: Fixed X scale and X scale label
(Partial answer with base plotting functions, not including the min, max, or mean.) I suspect you will need to construct a dataset before passing it to ggplot, since that is typical for that function. I already do something similar and then pass the resulting matrix to matplot. (It doesn't do that kewl highlighting, but maybe ggplot can do it.)
HDL.mon.mat <- aggregate(dfrm$Flow,
list( dfrm$Year + dfrm$Month/12),
quantile, probs=c(0.1,0.25,0.5,0.75, 0.9), na.rm=TRUE)
matplot(HDL.mon.mat[,1], HDL.mon.mat$x, type="pl")
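Since dfrm is not defined in the answer, here is a self-contained sketch of the same aggregate-then-matplot idea on simulated data. The simulated flows and the line types are assumptions made for illustration:

```r
# Simulated stand-in for dfrm: two years, ten flow readings per month
set.seed(1)
dfrm <- data.frame(
  Year  = rep(2000:2001, each = 120),
  Month = rep(rep(1:12, each = 10), times = 2),
  Flow  = rpois(240, 2000)
)

# One row per year-month; the quantiles come back as a matrix column named x
q <- aggregate(dfrm$Flow,
               list(ym = dfrm$Year + dfrm$Month / 12),
               quantile, probs = c(0.1, 0.25, 0.5, 0.75, 0.9), na.rm = TRUE)

# One line per quantile, plotted against the fractional year
matplot(q$ym, q$x, type = "l", lty = c(3, 2, 1, 2, 3), col = "black",
        xlab = "Year", ylab = "Flow")
```

Because quantile returns several values per group, aggregate stores them in a single matrix column, which is why the whole of q$x can be handed to matplot in one call.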
