I'm working with a data frame that includes both the day and month of the observation:
Day Month
1 January
2 January
3 January
32 February
33 February
34 February
60 March
61 March
And so on. I am creating a line graph in ggplot that reflects the day by day value of column wp (which can just be assumed to be random values between 0 and 1).
Because there are so many days in the data set, I don't want them to be reflected in the x-axis tick marks. I need to refer to the day column in creating the plot so I can see the day-by-day change in wp, but I just want month to be shown on the x-axis labels. I can easily rename the x-axis title with xlab("Month"), but I can't figure out how to change the tick marks to only show the month. So ideally, I want to see "January", "February", and "March" as the labels along the x-axis.
test <- ggplot(data = df, aes(x=day, y=wp)) +
geom_line(color="#D4BF91", size = 1) +
geom_point(color="#D4BF91", size = .5) +
ggtitle("testex") +
xlab("Day") +
ylab("Value") +
theme_fivethirtyeight() + theme(axis.title = element_text())
When I treat the day as an actual date, I get the desired result:
library(ggplot2)
library(tibble)
library(lubridate)
dat <- tibble(
date = seq(mdy("01-01-2020"), mdy("01-01-2020")+75, by=1),
day = seq_along(date),
wp = runif(length(day), 0, 1)
)
ggplot(data = dat, aes(x=date, y=wp)) +
geom_line(color="#D4BF91", size = 1) +
geom_point(color="#D4BF91", size = .5) +
ggtitle("testex") +
xlab("Day") +
ylab("Value") +
theme(axis.title = element_text())
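If the default date labels are not what you want, scale_x_date() lets you choose both the tick spacing and the label format. This is a minimal sketch rather than the only way to do it: the data is simulated, base as.Date() stands in for lubridate, and "%B" (full month name) follows the current locale.

```r
library(ggplot2)

# Simulated data: 76 consecutive days starting 1 January 2020 (illustrative only)
set.seed(1)
dat <- data.frame(date = seq(as.Date("2020-01-01"), by = "day", length.out = 76))
dat$wp <- runif(nrow(dat))

p <- ggplot(dat, aes(x = date, y = wp)) +
  geom_line(color = "#D4BF91") +
  # one tick per month, labelled with the full month name
  scale_x_date(date_breaks = "1 month", date_labels = "%B") +
  labs(x = "Month", y = "Value")
p
```

Swapping "%B" for "%b" gives abbreviated month names instead.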
More generally, though, you could use scale_x_continuous() to do this:
library(dplyr)
dat <- tibble(
date = seq(mdy("01-01-2020"), mdy("01-01-2020")+75, by=1),
day = seq_along(date),
wp = runif(length(day), 0, 1),
month = as.character(month(date, label=TRUE))
)
# first row (i.e. first day) of each month, used below as the axis breaks
firsts <- dat %>%
group_by(month) %>%
slice(1)
ggplot(data = dat, aes(x=day, y=wp)) +
geom_line(color="#D4BF91", size = 1) +
geom_point(color="#D4BF91", size = .5) +
ggtitle("testex") +
xlab("Day") +
ylab("Value") +
scale_x_continuous(breaks=firsts$day, labels=firsts$month) +
theme(axis.title = element_text())
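The same break positions can be found without dplyr. A base-R sketch, assuming a data frame shaped like the one in the question (a running day number plus a month name; the df built here is made up for illustration):

```r
# Hypothetical data in the question's shape: running day number and month name
df <- data.frame(day = 1:76)
dates <- as.Date("2020-01-01") + df$day - 1
df$month <- month.name[as.integer(format(dates, "%m"))]

# Keep the first row of each month, in order of appearance
firsts <- df[!duplicated(df$month), ]
firsts$day    # day numbers where each month starts
firsts$month  # matching labels
```

These two vectors can then be passed as breaks and labels to scale_x_continuous().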
I have a dataframe which contains a variable for week-since-2017. So, it counts up from 1 to 313 in that column. I mutated another variable into the dataframe to indicate the year. So, in my scatterplot, I have each week as a point, but the x-axis is horrid, counting up from 1 to 313. Is there a way I can change the scale at the bottom to instead display the variable year, possibly even adding vertical lines in between to show when the year changes?
Currently, I have this:
ggplot(HS, aes(as.integer(Obs), Total)) + geom_point(aes(color=YEAR)) + geom_smooth() + labs(title="Weekly Sales since 2017",x="Week",y="Written Sales") + theme(axis.line = element_line(colour = "orange", size = 1, linetype = "solid"))
You can convert the number of weeks to a number of days using 7 * Obs and add this value on to the start date (as.Date('2017-01-01')). This gives you a date-based x axis which you can format as you please.
Here, we set the breaks at the turn of each year so the grid fits to them:
ggplot(HS, aes(as.Date('2017-01-01') + 7 * Obs, Total)) +
geom_point(aes(color = YEAR)) +
geom_smooth() +
labs(title = "Weekly Sales since 2017", x = "Week", y = "Written Sales") +
theme(axis.line = element_line(colour = "orange", size = 1)) +
scale_x_date('Year', date_breaks = 'year', date_labels = '%Y')
Data used
Obviously, we don't have your data, so I had to create a reproducible set with the same names and similar values to yours for the above example:
HS <- data.frame(Obs = 1:312,
Total = rnorm(312, seq(1200, 1500, length = 312), 200)^2,
YEAR = rep(2017:2022, each = 52))
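The question also asked about vertical lines where the year changes. One possible sketch, using the same made-up HS data, is to add geom_vline() at each 1 January. The date column and year_starts are names introduced here for illustration; the dates are passed as numbers (days since 1970-01-01), which is how a date scale stores them internally.

```r
library(ggplot2)

set.seed(42)
HS <- data.frame(Obs = 1:312,
                 Total = rnorm(312, seq(1200, 1500, length.out = 312), 200)^2,
                 YEAR = rep(2017:2022, each = 52))
HS$date <- as.Date("2017-01-01") + 7 * HS$Obs

# 1 January of every year after the first one
year_starts <- seq(as.Date("2018-01-01"), as.Date("2022-01-01"), by = "year")

p <- ggplot(HS, aes(date, Total)) +
  geom_point(aes(color = factor(YEAR))) +
  # dashed line at each change of year
  geom_vline(xintercept = as.numeric(year_starts),
             linetype = "dashed", colour = "grey40") +
  scale_x_date("Year", date_breaks = "1 year", date_labels = "%Y") +
  labs(color = "Year", y = "Written Sales")
p
```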
I am practicing with R and have hit a speedbump while trying to create a graph of airline passengers per month.
I want to show a separate monthly line graph for each year from 1949 to 1960, the period for which data has been recorded. To do this I have used ggplot to create a line graph with the values per month. This works fine; however, when I try to separate this by year using facet_wrap() and formatting the current month field: facet_wrap(format(air$month[seq(1, length(air$month), 12)], "%Y")); it returns this:
[Image: graph returned]
I have also tried to format the facet by inputting my own sequence for the years: rep(c(1949:1960), each = 12). This returns a different result which is better but still wrong:
[Image: second graph]
Here is my code:
air = data.frame(
month = seq(as.Date("1949-01-01"), as.Date("1960-12-01"), by="months"),
air = as.vector(AirPassengers)
)
ggplot(air, aes(x = month, y = air)) +
geom_point() +
labs(x = "Month", y = "Passengers (in thousands)", title = "Total passengers per month, 1949 - 1960") +
geom_smooth(method = lm, se = F) +
geom_line() +
scale_x_date(labels = date_format("%b"), breaks = "12 month") +
facet_wrap(format(air$month[seq(1, length(air$month), 12)], "%Y"))
# second attempt, used in place of the facet_wrap() line above:
# facet_wrap(rep(c(1949:1960), each = 12))
So how do I make an individual graph per year?
In the second try you were really close. The main problem is that each year's dates occupy a different part of the x axis (the dates include the year), so the facets cannot share a meaningful scale. An easy fix is to map the dates onto a common x-axis scale and then facet by year. The following code should produce the desired plot.
air %>%
# Get the year value to use it for the faceted plot
mutate(year = year(month),
# Get the month-day dates and set all dates with a dummy year (2021 in this case)
# This will get all your dates in a common x axis scale
month_day = as_date(paste(2021,month(month),day(month), sep = "-"))) %>%
# Do the same plot, just change the x variable to month_day
ggplot(aes(x = month_day,
y = air)) +
geom_point() +
labs(x = "Month",
y = "Passengers (in thousands)",
title = "Total passengers per month, 1949 - 1960") +
geom_smooth(method = lm,
se = F) +
geom_line() +
# Set the breaks to 1 month
scale_x_date(labels = scales::date_format("%b"),
breaks = "1 month") +
# Use the year variable to do the faceted plot
facet_wrap(~year) +
# You could rotate the x-axis labels by 90° to get a cleaner plot
theme(axis.text.x = element_text(angle = 90,
vjust = 0.5,
hjust = 1))
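The dummy-year step can also be done in base R, which makes it easy to check what it produces. A small sketch under the same assumptions (2021 as the arbitrary dummy year; month_day is the same illustrative column name as above):

```r
# Monthly dates as in the question; AirPassengers ships with base R
air <- data.frame(
  month = seq(as.Date("1949-01-01"), as.Date("1960-12-01"), by = "month"),
  air = as.vector(AirPassengers)
)

air$year <- as.integer(format(air$month, "%Y"))
# Swap the real year for a dummy one so every facet shares the same x range
air$month_day <- as.Date(format(air$month, "2021-%m-%d"))

range(air$month_day)

# Caveat for daily data: 29 February does not exist in 2021, so it becomes NA
as.Date("2021-02-29", format = "%Y-%m-%d")
```

For daily data that includes leap days, a leap year such as 2020 is a safer choice of dummy year.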
I'm trying to plot the monthly sales data with RStudio, but the dates on the x-axis are not showing correctly.
My code:
uc_ts_plot <- ggplot(monthly_sales, aes(DATE,DAUTONSA)) + geom_line(na.rm=TRUE) +
xlab("Month") + ylab("Auto Sales in Thousands") +
scale_x_date(labels = date_format(format= "%b-%Y"),
breaks = date_breaks("1 year")) +
stat_smooth(colour = "green")
I expect the dates on the x-axis to be displayed as Jan-2011, Jan-2012, as shown here.
All I'm getting is a 0001-01 at the left end and a 0002-01 at the right end of the x-axis.
The plot shown is filtered to the years 2011 to 2018, whereas your data starts in 1967.
The code below reproduces that plot:
monthly_sales %>%
mutate(DATE = as.Date(DATE)) %>%
filter(year(DATE) >= 2011 & year(DATE) < 2018) %>%
ggplot() + aes(DATE,DAUTONSA) +
geom_line(na.rm=TRUE) +
xlab("Month") + ylab("Auto Sales in Thousands") +
scale_x_date(labels = date_format(format= "%b-%Y"),
breaks = date_breaks("1 year")) +
stat_smooth(colour = "green")
You can remove the filter step to plot the data for all years, but that clutters the x-axis with a lot of labels.
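The as.Date() step matters because scale_x_date() only works with a Date column. A quick way to check is sketched below; the three-row monthly_sales here is made up and simply assumes DATE was read in as text, which may or may not be what happened with the real file.

```r
# Made-up stand-in for the real data, with DATE stored as character
monthly_sales <- data.frame(
  DATE = c("2011-01-01", "2011-02-01", "2011-03-01"),
  DAUTONSA = c(500, 520, 510),
  stringsAsFactors = FALSE
)

class(monthly_sales$DATE)                      # "character"
monthly_sales$DATE <- as.Date(monthly_sales$DATE)
class(monthly_sales$DATE)                      # "Date"

# Once DATE is a Date, formats such as "%b-%Y" apply to it directly
format(monthly_sales$DATE[1], "%Y")            # "2011"
```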
I am trying to create a circular plot to display the frequency/counts of months in my dataset, but I would also like to group the months by season. Here is a similar plot for time of day, and now I would like to use the same approach to plot months/seasons. However, for some reason I can't seem to specify the right option to break my scale into non-overlapping month categories. Any suggestions are much appreciated.
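library(lubridate) # for days(), hours(), minutes(), hour() and month()
library(ggplot2) # use at least 0.9.3 for theme_minimal()
## generate random data in POSIX date-time format
N <- 500 # assumed sample size; the value is not shown in the post
events <- as.POSIXct("2011-01-01", tz="GMT") +
days(floor(365*runif(N))) +
hours(floor(24*rnorm(N))) + # using rnorm here
minutes(floor(60*runif(N)))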
# extract hour with lubridate function
hour_of_event <- hour(events)
# make a dataframe
eventdata <- data.frame(datetime = events, eventhour = hour_of_event)
# determine if event is in business hours
eventdata$Workday <- eventdata$eventhour %in% seq(6, 18)
for (i in 1:ra){
# Plot
ggplot(eventdata, aes(x = eventhour, fill = diel)) +
geom_histogram(breaks = seq(0,24), width = 2, colour = "grey") +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") + ggtitle("Events by Time of day") +
scale_x_continuous("", limits = c(0, 24), breaks = seq(0, 24), labels = seq(0,24))
This is my attempt to do a plot by month/season:
# extract month with lubridate function
month_of_event <- month(events)
# make a dataframe
eventdata <- data.frame(datetime = events, months = month_of_event)
# classify months into seasons
season.names <- rep("",12)
season.names[summer] <- "Summer"
season.names[fall] <- "Fall"
season.names[winter] <- "Winter"
season.names[spring] <- "Spring"
# Plot
ggplot(eventdata, aes(x = months, fill = season)) +
geom_histogram(breaks = seq(0,12, by=1), width = 4) +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") +
scale_x_continuous("", limits = c(0, 12), breaks = seq(0, 12), labels = seq(0,12))
The following simple version works:
ggplot(eventdata, aes(x = factor(months), fill = season)) +
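Treating the month as a factor gives one bar per month instead of histogram bins on a continuous scale. A fuller sketch of that idea follows; the simulated months, the season_of_month lookup (northern-hemisphere meteorological seasons) and the month_label column are all assumptions made for illustration, not taken from the question.

```r
library(ggplot2)

set.seed(44)
# Simulated event months (1 to 12); stands in for month(events)
eventdata <- data.frame(months = sample(1:12, 500, replace = TRUE))

# Assumed grouping: December to February is winter, and so on
season_of_month <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                     "Summer", "Summer", "Fall", "Fall", "Fall", "Winter")
eventdata$season <- factor(season_of_month[eventdata$months],
                           levels = c("Winter", "Spring", "Summer", "Fall"))
# Ordered month labels so the bars run Jan..Dec around the circle
eventdata$month_label <- factor(month.abb[eventdata$months], levels = month.abb)

p <- ggplot(eventdata, aes(x = month_label, fill = season)) +
  geom_bar(colour = "grey") +
  scale_x_discrete(drop = FALSE) +   # keep months that have no events
  coord_polar(start = 0) +
  theme_minimal() +
  scale_fill_brewer() +
  labs(x = NULL, y = "Count", title = "Events by month")
p
```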
I am trying to plot the change in a time series for each calendar year using ggplot and I am having problems with the fine control of the x-axis. If I do not use scale="free_x" then I end up with an x-axis that shows several years as well as the year in question, like this:
If I do use scale="free_x" then, as one would expect, I end up with tick labels on every plot, which in some cases vary by plot, and I do not want that:
I have made various attempts to define the x-axis using scale_x_date etc but without any success. My question is therefore:
Q. How can I control the x-axis breaks and labels on a ggplot facet grid so that the (time series) x-axis is identical for each facet, shows only at the bottom of the panel and is in the form of months formatted 1, 2, 3 etc or as 'Jan','Feb','Mar'?
Code follows:
# generate data
df <- data.frame(date=seq(as.Date("2009/1/1"), by="day", length.out=1115),price=runif(1115, min=100, max=200))
# remove weekend days
df <- df[!(weekdays(as.Date(df$date)) %in% c('Saturday','Sunday')),]
# add some columns for later
df$year <- as.numeric(format(as.Date(df$date), format="%Y"))
df$month <- as.numeric(format(as.Date(df$date), format="%m"))
df$day <- as.numeric(format(as.Date(df$date), format="%d"))
# calculate change in price since the start of the calendar year
df <- ddply(df, .(year), transform, pctchg = ((price/price[1])-1))
p <- ggplot(df, aes(date, pctchg)) +
geom_line( aes(group = 1, colour = pctchg),size=0.75) +
facet_wrap( ~ year, ncol = 2,scale="free_x") +
scale_y_continuous(formatter = "percent") +
opts(legend.position = "none")
Here is an example:
df <- transform(df, doy = as.Date(paste(2000, month, day, sep="/")))
p <- ggplot(df, aes(doy, pctchg)) +
geom_line( aes(group = 1, colour = pctchg),size=0.75) +
facet_wrap( ~ year, ncol = 2) +
scale_x_date(format = "%b") +
scale_y_continuous(formatter = "percent") +
opts(legend.position = "none")
Is this what you want?
The trick is to map every date onto the same dummy year (2000 here), so that all facets share one x-axis.
Here is an example for the dev version (i.e., ggplot2 0.9):
p <- ggplot(df, aes(doy, pctchg)) +
geom_line( aes(group = 1, colour = pctchg), size=0.75) +
facet_wrap( ~ year, ncol = 2) +
scale_x_date(label = date_format("%b"), breaks = seq(min(df$doy), max(df$doy), "month")) +
scale_y_continuous(label = percent_format()) +
opts(legend.position = "none")
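The code above targets old ggplot2 releases; opts() and the formatter argument no longer exist in current versions. A sketch of the same dummy-year idea for a recent ggplot2 follows, with base R replacing ddply(). The data is simulated as in the question, and doy is the same illustrative column name.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(date = seq(as.Date("2009-01-01"), by = "day", length.out = 1115),
                 price = runif(1115, min = 100, max = 200))
# Drop weekends; "%u" is the weekday number (6 = Saturday, 7 = Sunday)
df <- df[!(format(df$date, "%u") %in% c("6", "7")), ]
df$year <- as.integer(format(df$date, "%Y"))
# Change in price since the first trading day of each calendar year
df$pctchg <- ave(df$price, df$year, FUN = function(x) x / x[1] - 1)
# Same month and day, moved into the dummy leap year 2000
df$doy <- as.Date(format(df$date, "2000-%m-%d"))

p <- ggplot(df, aes(doy, pctchg)) +
  geom_line(aes(group = 1, colour = pctchg)) +
  facet_wrap(~ year, ncol = 2) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  scale_y_continuous(labels = scales::percent) +
  theme(legend.position = "none")
p
```

Because facet_wrap() uses fixed scales by default, every panel gets the same Jan to Dec axis.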