I am trying to plot the change in a time series for each calendar year using ggplot, and I am having trouble with fine control of the x-axis. If I do not use scale="free_x", each facet's x-axis shows several years as well as the year in question, like this:
If I do use scale="free_x" then, as one would expect, every plot gets its own tick labels, and in some cases these differ from plot to plot, which I do not want:
I have made various attempts to define the x-axis using scale_x_date etc but without any success. My question is therefore:
Q. How can I control the x-axis breaks and labels on a ggplot facet grid so that the (time series) x-axis is identical for each facet, shows only at the bottom of the panel and is in the form of months formatted 1, 2, 3 etc or as 'Jan','Feb','Mar'?
Code follows:
require(lubridate)
require(ggplot2)
require(plyr)
# generate data
df <- data.frame(date=seq(as.Date("2009/1/1"), by="day", length.out=1115),price=runif(1115, min=100, max=200))
# remove weekend days
df <- df[!(weekdays(as.Date(df$date)) %in% c('Saturday','Sunday')),]
# add some columns for later
df$year <- as.numeric(format(as.Date(df$date), format="%Y"))
df$month <- as.numeric(format(as.Date(df$date), format="%m"))
df$day <- as.numeric(format(as.Date(df$date), format="%d"))
# calculate change in price since the start of the calendar year
df <- ddply(df, .(year), transform, pctchg = ((price/price[1])-1))
p <- ggplot(df, aes(date, pctchg)) +
geom_line( aes(group = 1, colour = pctchg),size=0.75) +
facet_wrap( ~ year, ncol = 2,scale="free_x") +
scale_y_continuous(formatter = "percent") +
opts(legend.position = "none")
print(p)
Here is an example (written for ggplot2 before version 0.9):
df <- transform(df, doy = as.Date(paste(2000, month, day, sep="/")))
p <- ggplot(df, aes(doy, pctchg)) +
geom_line( aes(group = 1, colour = pctchg),size=0.75) +
facet_wrap( ~ year, ncol = 2) +
scale_x_date(format = "%b") +
scale_y_continuous(formatter = "percent") +
opts(legend.position = "none")
p
Is this what you want?
The trick is to map every date onto the same dummy year (2000 here), so that all facets share an identical January-to-December x-axis.
UPDATED
Here is an example for the dev version (i.e., ggplot2 0.9):
library(scales) # provides date_format() and percent_format()
p <- ggplot(df, aes(doy, pctchg)) +
geom_line( aes(group = 1, colour = pctchg), size=0.75) +
facet_wrap( ~ year, ncol = 2) +
scale_x_date(labels = date_format("%b"), breaks = seq(min(df$doy), max(df$doy), "month")) +
scale_y_continuous(labels = percent_format()) +
opts(legend.position = "none")
p
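For readers on a current ggplot2 release, the same idea can be sketched as follows. This is a hedged adaptation rather than part of the original answer: it assumes a ggplot2 version that has the date_breaks and date_labels arguments and theme() in place of opts(), and it uses a simplified stand-in data set (weekends are not removed).

```r
library(ggplot2)

# Stand-in data (assumption): daily prices starting in 2009
set.seed(1)
df <- data.frame(date = seq(as.Date("2009-01-01"), by = "day", length.out = 1115))
df$price <- runif(nrow(df), min = 100, max = 200)
df$year <- as.integer(format(df$date, "%Y"))

# Change in price since the first observation of each calendar year
df$pctchg <- ave(df$price, df$year, FUN = function(p) p / p[1] - 1)

# Map every date onto the same dummy (leap) year so all facets share one x-axis
df$doy <- as.Date(format(df$date, "2000-%m-%d"))

p <- ggplot(df, aes(doy, pctchg)) +
  geom_line(aes(group = 1, colour = pctchg)) +
  facet_wrap(~ year, ncol = 2) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  scale_y_continuous(labels = scales::percent_format()) +
  theme(legend.position = "none")
```

Printing p should draw one panel per year with month abbreviations on a shared x-axis; use date_labels = "%m" for numeric months instead.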
I'm trying to create a facet_wrap() in which the unit of measure on the y-axis stays identical across the panels, while each panel is still allowed to slide along the y-axis to wherever its data lie.
To clarify what I mean, I have created a dataset df:
library(tidyverse)
df <- tibble(
Year = c(2010,2011,2012,2010,2011,2012),
Category=c("A","A","A","B","B","B"),
Value=c(1.50, 1.70, 1.60, 4.50, 4.60, 4.55)
)
With df, we can create the following plot using facet_wrap:
ggplot(data = df, aes(x=Year, y=Value)) + geom_line() + facet_wrap(.~ Category)
Plot 1
To make the differences between the two panels clearer, one can use scales = "free_y":
ggplot(data = df, aes(x=Year, y=Value)) + geom_line() +
facet_wrap(.~ Category, scales="free_y")
Plot 2
Although this is clearer, the y-axis increment in panel A is 0.025, whereas it is 0.0125 in panel B. This could mislead someone comparing A and B side by side.
So my question is whether there is an elegant way to plot something like the graph below (with a y-axis increment of 0.025 in both panels) without having to arrange two separate plots in a grid.
Thanks
Desired result:
Code for the grid:
# Grid
library(gridExtra) # provides grid.arrange()
## Plot A
df_A <- df %>%
filter(Category == "A")
plot_A <- ggplot(data = df_A, aes(x=Year, y=Value)) + geom_line() + coord_cartesian(ylim = c(1.5,1.7)) + ggtitle("A")
## Plot B
df_B <- df %>%
filter(Category == "B")
plot_B <- ggplot(data = df_B, aes(x=Year, y=Value)) + geom_line() + coord_cartesian(ylim = c(4.4,4.6)) + ggtitle("B")
grid.arrange(plot_A, plot_B, nrow=1)
Based on the info at Setting individual y axis limits with facet wrap NOT with scales free_y, you can use geom_blank() with manually specified y-limits for each Category:
# df from above code
df2 <- tibble(
Category = c("A", "B"),
y_min = c(1.5, 4.4),
y_max = c(1.7, 4.6)
)
df <- full_join(df, df2, by = "Category")
ggplot(data = df, aes(x=Year, y=Value)) + geom_line() +
facet_wrap(.~ Category, scales = "free_y") +
geom_blank(aes(y = y_min)) +
geom_blank(aes(y = y_max))
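The limits in df2 are typed in by hand. As a hedged sketch (not part of the original answer), they could instead be derived from the data so that every panel covers the same span, centred on its own midpoint; the names limits, mid and span are introduced here purely for illustration.

```r
library(dplyr)
library(ggplot2)

df <- tibble(
  Year = c(2010, 2011, 2012, 2010, 2011, 2012),
  Category = c("A", "A", "A", "B", "B", "B"),
  Value = c(1.50, 1.70, 1.60, 4.50, 4.60, 4.55)
)

# Midpoint and data range of each panel
limits <- df %>%
  group_by(Category) %>%
  summarise(mid = (min(Value) + max(Value)) / 2,
            span = max(Value) - min(Value)) %>%
  # Give every panel the widest span, centred on its own midpoint
  mutate(y_min = mid - max(span) / 2,
         y_max = mid + max(span) / 2)

# geom_blank() draws nothing but still stretches each panel's free y scale
p <- ggplot(left_join(df, limits, by = "Category"), aes(x = Year, y = Value)) +
  geom_line() +
  geom_blank(aes(y = y_min)) +
  geom_blank(aes(y = y_max)) +
  facet_wrap(~ Category, scales = "free_y")
```

Because both panels then span the same number of y units, a given vertical distance should represent the same change in Value in A and in B.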
I am currently doing this with ggplot2 and scales, but ideally the plot would show the date range of the data +/- 1 year (for example). I shouldn't really be hardcoding these dates, as it's not very efficient.
library(scales) #date time scales
library(ggplot2) # Visualization
ggplot(dataset, aes(x=datetime_start, y=Product, color=Stage, order = - as.numeric(Stage))) +
geom_segment(aes(x=From, xend=To, yend=Product), size=10) +
scale_x_datetime(
breaks = date_breaks("1 month"),
labels=date_format("%b%y"),
limits = c(
as.POSIXct("2016-03-01"),
as.POSIXct("2018-02-01")
)
) +
Expand the scale:
library(ggplot2)
df <- data.frame(x = seq(Sys.Date()-lubridate::years(2), Sys.Date(), by="3 month"))
df$y <- 1:nrow(df)
p <- ggplot(df, aes(x, y)) + geom_line()
p + scale_x_date(expand = c(0, 365))
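An alternative to expand is to compute the limits from the data instead of hardcoding them. The following is a minimal sketch under stated assumptions: the data frame is a hypothetical stand-in, x_limits is a name introduced here, and 365 days is used as an approximation of one year. For a POSIXct column with scale_x_datetime() the offsets would be in seconds rather than days.

```r
library(ggplot2)

# Hypothetical stand-in data: one observation per quarter
df <- data.frame(x = seq(as.Date("2016-03-01"), as.Date("2018-02-01"), by = "3 months"))
df$y <- seq_len(nrow(df))

# Pad the observed date range by 365 days on each side
x_limits <- range(df$x) + c(-365, 365)

p <- ggplot(df, aes(x, y)) +
  geom_line() +
  scale_x_date(limits = x_limits, date_breaks = "3 months", date_labels = "%b%y")
```

The limits now follow the data, so nothing needs editing when new dates arrive.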
Here is my sample data:
Singer <- c("A","B","C","A","B","C")
Rank <- c(1,2,3,3,2,1)
Episode <- c(1,1,1,2,2,2)
Votes <- c(0.3,0.28,0.11,0.14,0.29,0.38)
data <- data.frame(Episode,Singer,Rank,Votes) # data_frame() is deprecated; base data.frame() needs no extra package
data$Episode <- as.character(data$Episode)
I would like to make a line graph to show the performance of each singer.
I tried to use ggplot2 like below:
ggplot(data,aes(x=Episode,y=Votes,group = Singer)) + geom_line()
I have two questions:
1. How can I format the y-axis as percentage?
2. How can I label each dot in this line graph as the values of "Rank", which allows me to show rank and votes in the same graph?
To label each point use:
geom_label(aes(label = Rank))
# or
geom_text(aes(label = Rank), nudge_y = .01, nudge_x = 0)
To format the axis labels use:
scale_y_continuous(labels = scales::percent_format())
# or without package(scales):
scale_y_continuous(breaks = (seq(0, .4, .2)), labels = sprintf("%1.f%%", 100 * seq(0, .4, .2)), limits = c(0,.4))
Complete code:
library(ggplot2)
library(scales)
ggplot(data, aes(x = factor(Episode), y = Votes, group = Singer)) +
geom_line() +
geom_label(aes(label = Rank)) +
scale_y_continuous(labels = scales::percent_format())
Data:
Singer <- c("A","B","C","A","B","C")
Rank <- c(1,2,3,3,2,1)
Episode <- c(1,1,1,2,2,2)
Votes <- c(0.3,0.28,0.11,0.14,0.29,0.38)
data <- data.frame(Episode,Singer,Rank,Votes) # data_frame() is deprecated; base data.frame() needs no extra package
# no need to transform to character bc we use factor(Episode) in aes(x=..)
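To see what the two labelling approaches above actually produce, the label functions can be called directly on the Votes values. This is a small illustrative check, not part of the original answer; accuracy = 1 is passed explicitly here so that the rounding does not depend on the scales defaults.

```r
library(scales)

votes <- c(0.3, 0.28, 0.11, 0.14, 0.29, 0.38)

# Label function from scales, as passed to scale_y_continuous(labels = ...)
labels_scales <- percent_format(accuracy = 1)(votes)

# Base R equivalent, for when scales is not available
labels_base <- sprintf("%.0f%%", 100 * votes)

print(labels_scales)
print(labels_base)
```

Either character vector (or the function that creates it) can be supplied as labels in scale_y_continuous().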
I asked a question yesterday about annotating the x-axis with N in a faceted plot using a minimal example that turns out to be too simple, relative to my real problem. The answer given there works in the case of complete data, but if you have missing facets you would like to preserve, the combination of facet_wrap options drop=FALSE and scales="free_x" triggers an error: "Error in if (zero_range(from) || zero_range(to)) { : missing value where TRUE/FALSE needed"
Here is a new, less-minimal example. The goal here is to produce a large graph with two panels using grid.arrange; the first showing absolute values over time by treatment group; the second showing the change from baseline over time by treatment group. In the second panel, we need a blank facet when vis=1.
# setup
library(ggplot2)
library(plyr)
library(gridExtra)
trt <- factor(rep(LETTERS[1:2],150),ordered=TRUE)
vis <- factor(c(rep(1,150),rep(2,100),rep(3,50)),ordered=TRUE)
id <- c(c(1:150),c(1:100),c(1:50))
val <- rnorm(300)
data <- data.frame(id,trt,vis,val)
base <- with(subset(data,vis==1),data.frame(id,trt,baseval=val))
data <- merge(data,base,by=c("id","trt")) # merge on trt as well, otherwise it is split into trt.x and trt.y
data <- transform(data,chg=ifelse(vis==1,NA,val-baseval))
data.sum <- ddply(data, .(vis, trt), summarise, N=length(na.omit(val)))
data <- merge(data,data.sum)
data <- transform(data, trtN=paste(trt,N,sep="\n"))
mytheme <- theme_bw() + theme(panel.spacing = unit(0, "lines"), strip.background = element_blank()) # panel.spacing replaces the deprecated panel.margin
# no missing facets
plot.a <- ggplot(data) + geom_boxplot(aes(x=trtN,y=val,group=trt,colour=trt), show.legend=FALSE) +
facet_wrap(~ vis, drop=FALSE, switch="x", nrow=1, scales="free_x") +
labs(x="Visit") + mytheme
# first facet should be blank
plot.b <- ggplot(data) + geom_boxplot(aes(x=trtN,y=chg,group=trt,colour=trt), show.legend=FALSE) +
facet_wrap(~ vis, drop=FALSE, switch="x", nrow=1, scales="free_x") +
labs(x="Visit") + mytheme
grid.arrange(plot.a,plot.b,nrow=2)
You can add a blank layer to draw all the facets in your second plot. The key is that you need a variable that exists for every level of vis to use as your y variable. In your case you can simply use the variable you used in your first plot.
ggplot(data) +
geom_boxplot(aes(x = trtN, y = chg, group = trt, colour = trt), show.legend = FALSE) +
geom_blank(aes(x = trtN, y = val)) +
facet_wrap(~ vis, switch = "x", nrow = 1, scales = "free_x") +
labs(x="Visit") + mytheme
If your variables have different ranges, you can set the y limits using the overall min and max of your boxplot y variable.
+ scale_y_continuous(limits = c(min(data$chg, na.rm = TRUE), max(data$chg, na.rm = TRUE)))
I am trying to create a circular plot to display the frequency/counts of months in my dataset, but I would also like to group the months by season. Here is a similar plot for time of day, and now I would like to use the same approach to plot months/seasons. However, for some reason I can't seem to specify the right option to break my scale into non-overlapping month categories. Any suggestions are much appreciated.
library(lubridate)
library(ggplot2) # use at least 0.9.3 for theme_minimal()
library(circular)
### PLOT FOR HOURS ###
## generate random data in POSIX date-time format
set.seed(44)
N=500
events <- as.POSIXct("2011-01-01", tz="GMT") +
days(floor(365*runif(N))) +
hours(floor(24*rnorm(N))) + # using rnorm here
minutes(floor(60*runif(N))) +
seconds(floor(60*runif(N)))
# extract hour with lubridate function
hour_of_event <- hour(events)
# make a dataframe
eventdata <- data.frame(datetime = events, eventhour = hour_of_event)
# determine if event is in business hours
eventdata$Workday <- eventdata$eventhour %in% seq(6, 18)
# label each event as "day" or "night" (vectorised, no loop needed)
eventdata$diel <- ifelse(eventdata$Workday, "day", "night")
# Plot
ggplot(eventdata, aes(x = eventhour, fill = diel)) +
geom_histogram(breaks = seq(0,24), width = 2, colour = "grey") +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") + ggtitle("Events by Time of day") +
scale_x_continuous("", limits = c(0, 24), breaks = seq(0, 24), labels = seq(0,24))
This is my attempt to do a plot by month/season:
### PLOT FOR MONTHS ###
head(events)
# extract month with lubridate function
month_of_event <- month(events)
# make a dataframe
eventdata <- data.frame(datetime = events, months = month_of_event)
# classify months into seasons
summer<-c(1,2,12)
fall<-c(3,4,5)
winter<-c(6,7,8)
spring<-c(9,10,11)
season.names <- rep("",12)
season.names[summer] <- "Summer"
season.names[fall] <- "Fall"
season.names[winter] <- "Winter"
season.names[spring] <- "Spring"
season.names
eventdata$season<-season.names[eventdata$months]
str(eventdata)
# Plot
ggplot(eventdata, aes(x = months, fill = season)) +
geom_histogram(breaks = seq(0,12, by=1), width = 4) +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") +
scale_x_continuous("", limits = c(0, 12), breaks = seq(0, 12), labels = seq(0,12))
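The following simpler version works. It treats the month as a factor, so each month gets its own non-overlapping bar; geom_bar() is used because current ggplot2 versions do not accept a discrete x in geom_histogram():
ggplot(eventdata, aes(x = factor(months), fill = season)) +
geom_bar()+
coord_polar()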
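Building on the factor approach, a hedged sketch with month names on the axis might look like this. The random months and the season_of_month lookup are stand-ins introduced here; the lookup mirrors the southern-hemisphere season grouping from the question.

```r
library(ggplot2)

# Stand-in data (assumption): 500 random months
set.seed(44)
months <- sample(1:12, 500, replace = TRUE)

# Season of each month, January to December, as grouped in the question
season_of_month <- c("Summer", "Summer", "Fall", "Fall", "Fall", "Winter",
                     "Winter", "Winter", "Spring", "Spring", "Spring", "Summer")

eventdata <- data.frame(
  # Ordered levels keep the months in calendar order around the circle
  month = factor(month.abb[months], levels = month.abb),
  season = season_of_month[months]
)

p <- ggplot(eventdata, aes(x = month, fill = season)) +
  geom_bar(width = 1, colour = "grey") +
  scale_x_discrete(drop = FALSE) + # keep months that have no events
  coord_polar(start = 0) +
  theme_minimal() +
  labs(x = NULL, y = "Count", title = "Events by month")
```

With width = 1 the twelve wedges touch without overlapping, which is the non-overlapping month breakdown asked for.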