R ggplot2 - Plot year variable one over the other in same plot

How do I plot each year as a separate line in ggplot2? I tried the code below, but it plots both years as one continuous series along the date axis.
library(ggplot2)
library(lubridate) # provides year()
# Dummy data
data <- data.frame(
Date = c(as.Date("2017-01-14") - 0:13,as.Date("2016-01-14") - 0:13),
value = runif(28)
)
#data$Date <- strptime(data$Date, "%Y-%m-%d" )
data$Year <- as.character(year(data$Date))
data$Year <- factor(data$Year)
ggplot(data) + geom_line(aes(x = Date, y = value, group=Year, color=Year)) +
scale_x_date(date_breaks = "1 day", date_labels = "%d-%m-%y") +
theme(axis.text.x = element_text(angle = 90))
But I want each year to be a separate line in the same plot, drawn one over the other.

Try this approach: format the day and month of your date. The plot is a mess because the dates carry different years, so the two series never overlap on the x-axis. Plotting only the formatted month and day helps. Here is the code:
library(ggplot2)
library(lubridate)
# Dummy data
data <- data.frame(
Date = c(as.Date("2017-01-14") - 0:13,as.Date("2016-01-14") - 0:13),
value = runif(28)
)
data$Year <- as.character(year(data$Date))
data$Year <- factor(data$Year)
#Format month
data$MonthDay <- format(data$Date,'%b-%d')
#Plot
ggplot(data) + geom_line(aes(x = MonthDay, y = value, group=Year, color=Year)) +
theme_bw()+
theme(axis.text.x = element_text(angle = 90))
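One caveat with a character MonthDay column is that a discrete axis sorts its labels alphabetically, so ranges spanning several months can come out of order. A minimal sketch of an alternative, assuming it is acceptable to map every date onto one dummy year (2000 is an arbitrary leap-year choice, and SameYear is a hypothetical column name), keeps a real date axis:

```r
library(ggplot2)

set.seed(42)
# Same dummy data as in the question
data <- data.frame(
  Date = c(as.Date("2017-01-14") - 0:13, as.Date("2016-01-14") - 0:13),
  value = runif(28)
)
data$Year <- factor(format(data$Date, "%Y"))

# Assumption: move every date into the dummy year 2000 so the years overlap
data$SameYear <- as.Date(paste0("2000-", format(data$Date, "%m-%d")))

p <- ggplot(data, aes(x = SameYear, y = value, group = Year, colour = Year)) +
  geom_line() +
  scale_x_date(date_breaks = "1 day", date_labels = "%d-%b") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90))
p
```

Because the x-axis is still a Date, scale_x_date() keeps chronological order and its break/label options remain available.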

Related

Timestamp on x-axis in timeseries ggplot

I have measurement data from the past months:
Variables
x <- df$DatoTid
y <- df$Partikler
color <- df$Opgave
I'm trying to plot my data based on the timestamp, so that I have the hours of the day in the x-axis, instead of the specific POSIXct datetime.
I would like the x-axis ticks and labels to be, for example, "00:00", "01:00", ..., "24:00", so that noon is in the middle of the x-axis.
So far I tried to convert the datetime values into characters.
It doesn't look good yet: the axis ticks and labels are gone, and possibly other things are wrong as well.
Can someone help me?
And please let me know how to upload the data for you. I don't know how to add a huge .csv-file....
# Round to the nearest 5 minutes (300 s) and shift by one hour (+ 3600 s):
head(df)
df$Tid2 <- format(strptime("1970-01-01", "%Y-%m-%d", tz="CET") +
round(as.numeric(df$DatoTid)/300)*300 + 3600, "%Y-%m-%d %H:%M:%S")
head(df)
df$Tid2 <- as.character(df$Tid2)
str(df)
x <- df$Tid2
y <- df$Partikler
color <- df$Opgave
plot2 <- ggplot(data = df, aes(x = x, y = y, color = color)) +
geom_point(shape=16, alpha=0.6, size=1.8) +
scale_y_continuous(labels=function(x) format(x, big.mark = ".", decimal.mark = ",", scientific = FALSE)) +
scale_x_discrete(breaks=c("00:00:00", "06:00:00", "09:00:00", "12:00:00", "18:00:00", "21:00:00")) +
scale_color_discrete(name = "Case") +
xlab(" ") +
ylab(expression(paste("Partikelkoncentration [pt/cc]"))) +
myTheme +
theme(legend.text=element_text(size=8), legend.title=element_text(size=8))
plot2
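I would approach this by making a new timestamp that uses a single day but keeps the hours/minutes/seconds of your existing timestamp. (In your attempt the ticks vanish because breaks such as "00:00:00" never match the full "YYYY-MM-DD HH:MM:SS" strings on the discrete axis.)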
First, here's a made-up version of your data, here using a linear trend in Partikler:
library(tidyverse); library(lubridate)
df <- tibble(Tid2 = seq.POSIXt(from = ymd_h(2019010100), # tibble() replaces the deprecated data_frame()
to = ymd_h(2019011500), by = 60*60),
Partikler = seq(from = 0, to = 2.5E5, along.with = Tid2),
Opgave = as.factor(floor_date(Tid2, "3 days")))
# Here's a plot that's structurally similar to yours:
ggplot(df, aes(Tid2, Partikler, col = Opgave)) +
geom_point() +
scale_color_discrete(name = "Case")
Now, if we change the timestamps to be in the same day, we can control them like usual in ggplot, but with them collapsed into a single day of timing. We can also change the x axis so it doesn't mention the date component of the time stamp:
df2 <- df %>%
mutate(Tid2_sameday = ymd_hms(paste(Sys.Date(),
hour(Tid2), minute(Tid2), second(Tid2))))
ggplot(df2, aes(Tid2_sameday, Partikler, col = Opgave)) +
geom_point() +
scale_color_discrete(name = "Case") +
scale_x_datetime(date_labels = "%H:%M")
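A variation on the same idea, as a minimal sketch: instead of pasting Sys.Date(), it anchors every timestamp to a fixed dummy day (1970-01-01 in UTC, an assumption made only so the axis limits are reproducible) and sets explicit breaks and limits so that noon sits in the middle of the axis. The column name Tid_sameday and the made-up data are illustrative.

```r
library(ggplot2)
library(lubridate)

# Made-up hourly measurements over three days
ts <- seq(ymd_hms("2019-01-01 00:00:00", tz = "UTC"), by = "1 hour", length.out = 72)
df <- data.frame(Tid = ts, Partikler = seq_along(ts))

# Assumption: collapse every timestamp onto one dummy day, keeping only time of day
day_start <- as.POSIXct("1970-01-01", tz = "UTC")
df$Tid_sameday <- day_start +
  hour(df$Tid) * 3600 + minute(df$Tid) * 60 + second(df$Tid)

p <- ggplot(df, aes(Tid_sameday, Partikler)) +
  geom_point() +
  scale_x_datetime(
    date_breaks = "3 hours",
    date_labels = "%H:%M",
    limits = c(day_start, day_start + 86400), # midnight to midnight
    timezone = "UTC"
  )
p
```

Note that the "%H:%M" format labels the right-hand end of the axis as "00:00" rather than "24:00".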

Programmatically display date range with R scale limits

I am currently doing this with ggplot2 and scales, but ideally the plot would show a date range of +/- 1 year (for example) around the data. I shouldn't really be hardcoding these dates, as it's not very efficient.
library(scales) #date time scales
library(ggplot2) # Visualization
ggplot(dataset,aes(x=datetime_start, y=dataset$Product, color=Stage, order = - as.numeric(Stage))) +
geom_segment(aes(x=From,xend=To,yend=dataset$Product), size=10) +
scale_x_datetime(
breaks = date_breaks("1 month"),
labels=date_format("%b%y"),
limits = c(
as.POSIXct("2016-03-01"),
as.POSIXct("2018-02-01")
)
) +
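Expand the scale instead of hardcoding limits. On a date scale the additive part of expand is measured in days, so c(0, 365) pads each side by one year: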
library(ggplot2)
df <- data.frame(x = seq(Sys.Date()-lubridate::years(2), Sys.Date(), by="3 month"))
df$y <- 1:nrow(df)
p <- ggplot(df, aes(x, y)) + geom_line()
p + scale_x_date(expand = c(0, 365))
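If explicit limits are preferred over expansion, they can be derived from the data rather than typed in. This is a minimal sketch with made-up dates; it assumes lubridate is available and uses its %m-% and %m+% operators so that month-end dates do not roll over. The name lims is illustrative.

```r
library(ggplot2)
library(lubridate)

# Made-up data on a Date axis
df <- data.frame(x = seq(as.Date("2016-03-01"), as.Date("2018-02-01"), by = "3 months"))
df$y <- seq_len(nrow(df))

# Limits computed from the data: one year before the first date, one year after the last
lims <- c(min(df$x) %m-% years(1), max(df$x) %m+% years(1))

p <- ggplot(df, aes(x, y)) +
  geom_line() +
  scale_x_date(limits = lims, date_breaks = "3 months", date_labels = "%b%y")
p
```

The same pattern should carry over to scale_x_datetime() when the column is POSIXct, since the operators also accept date-times.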

Weekly boxplot from hourly data [duplicate]

This code produces a single boxplot:
df <- data.frame(value = rnorm(62), my.date = seq(as.Date("2013-12-01"), as.Date("2014-01-31"), by="1 day"))
library(ggplot2)
library(scales) # provides date_format()
ggplot(df, aes(as.Date(my.date), value)) + geom_boxplot() + scale_x_date(minor_breaks = "1 week", labels = date_format("%W\n%b"))
How can I produce a plot that has single boxplots for each week between 1 December and 31 January? So within the single plot, there should be about 8 boxplots. Would prefer solution that uses either ggplot() or scale_x_date().
One option is to transform your date before using ggplot
library(ggplot2)
df <- data.frame(value = rnorm(62),
my.date = seq(as.Date("2013-12-01"), as.Date("2014-01-31"), by="1 day"))
weeks <- format(df$my.date, "%Y/%W")
weeks <- factor(weeks, levels = unique(weeks))
ggplot(df, aes(weeks, value)) +
geom_boxplot()
library(ggplot2)
ggplot(df, aes(format(as.Date(my.date), "%W\n%b"), value)) + geom_boxplot()
Edit:
To order the dates:
ggplot(df, aes(reorder(format(as.Date(my.date), "%W\n%b"),
as.Date(my.date)),
value)) +
geom_boxplot()
This fulfils @luciano's request to retain the functionality of scale_x_date:
library('scales')
library(ggplot2)
df <- data.frame(value = rnorm(62), my.date = seq(as.Date("2013-12-01"), as.Date("2014-01-31"), by="1 day"))
ggplot(df, aes(x=as.Date(my.date), y=value, group=format(as.Date(my.date),"%W-%b"))) + geom_boxplot() + scale_x_date(date_breaks = "1 week", date_labels="%Y-%b-%d")
Alternatively, if you don't want the data grouped by week number (which splits the week that spans New Year), you can group by week ending on Sunday, as below. To shift the week ending from Sunday to, say, Friday, something like the following can be used (x is a vector of dates; ceiling_date() comes from lubridate):
ceiling_date(x, "week") + ifelse(weekdays(x) %in% c("Saturday", "Sunday"), 5, -2)
library(lubridate) # provides ceiling_date()
ggplot(df, aes(x=as.Date(my.date), y=value, group=ceiling_date(my.date, "week"))) + geom_boxplot() + scale_x_date(date_breaks = "1 week", date_labels="%Y-%b-%d")
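The grouping idea can also be written with an explicit week-start column, which makes it easy to check how many boxes will be drawn. This is a minimal sketch; the column name week_start and the choice of Monday-based weeks are assumptions for illustration.

```r
library(ggplot2)
library(lubridate)

set.seed(1)
df <- data.frame(
  value = rnorm(62),
  my.date = seq(as.Date("2013-12-01"), as.Date("2014-01-31"), by = "1 day")
)

# Assumption: weeks run Monday to Sunday; each date is mapped to its Monday
df$week_start <- floor_date(df$my.date, "week", week_start = 1)

p <- ggplot(df, aes(x = week_start, y = value, group = week_start)) +
  geom_boxplot() +
  scale_x_date(date_breaks = "1 week", date_labels = "%d %b")
p
```

Since the x aesthetic is still a Date, scale_x_date() keeps working, and the week that spans New Year stays in one box.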

Plotting non-overlapping levels of a factor on a circular plot using ggplot2, R

I am trying to create a circular plot to display the frequency/counts of months in my dataset, and I would also like to group the months by season. Below is a similar plot for time of day; I would like to use the same approach to plot months/seasons. However, I can't find the right option to break my scale into non-overlapping month categories. Any suggestions are much appreciated.
library(lubridate)
library(ggplot2) # use at least 0.9.3 for theme_minimal()
library(circular)
### PLOT FOR HOURS ###
## generate random data in POSIX date-time format
set.seed(44)
N=500
events <- as.POSIXct("2011-01-01", tz="GMT") +
days(floor(365*runif(N))) +
hours(floor(24*rnorm(N))) + # using rnorm here
minutes(floor(60*runif(N))) +
seconds(floor(60*runif(N)))
# extract hour with lubridate function
hour_of_event <- hour(events)
# make a dataframe
eventdata <- data.frame(datetime = events, eventhour = hour_of_event)
# determine if event is in business hours
eventdata$Workday <- eventdata$eventhour %in% seq(6, 18)
# label each event as day or night based on the Workday flag
eventdata$diel <- ifelse(eventdata$Workday, "day", "night")
# Plot
ggplot(eventdata, aes(x = eventhour, fill = diel)) +
geom_histogram(breaks = seq(0,24), width = 2, colour = "grey") +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") + ggtitle("Events by Time of day") +
scale_x_continuous("", limits = c(0, 24), breaks = seq(0, 24), labels = seq(0,24))
This is my attempt at a plot by month/season:
### PLOT FOR MONTHS ###
head(events)
# extract month with lubridate function
month_of_event <- month(events)
# make a dataframe
eventdata <- data.frame(datetime = events, months = month_of_event)
# classify months into seasons
summer<-c(1,2,12)
fall<-c(3,4,5)
winter<-c(6,7,8)
spring<-c(9,10,11)
season.names <- rep("",12)
season.names[summer] <- "Summer"
season.names[fall] <- "Fall"
season.names[winter] <- "Winter"
season.names[spring] <- "Spring"
season.names
eventdata$season<-season.names[eventdata$months]
str(eventdata)
# Plot
ggplot(eventdata, aes(x = months, fill = season)) +
geom_histogram(breaks = seq(0,12, by=1), width = 4) +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") +
scale_x_continuous("", limits = c(0, 12), breaks = seq(0, 12), labels = seq(0,12))
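The following simpler version works. It treats the months as a factor, so each month gets its own non-overlapping bar (geom_bar() counts a discrete x; recent ggplot2 versions reject geom_histogram() on a factor):
ggplot(eventdata, aes(x = factor(months), fill = season)) +
geom_bar() +
coord_polar()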
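For a fuller picture, here is a self-contained sketch of the month/season version with labelled months. The random data and the names season_of and month are made up for illustration; the season mapping follows the one in the question (months 12, 1, 2 as summer).

```r
library(ggplot2)

set.seed(44)
months <- sample(1:12, 500, replace = TRUE)

# Season for each month number 1..12, matching the question's classification
season_of <- c("Summer", "Summer", "Fall", "Fall", "Fall", "Winter",
               "Winter", "Winter", "Spring", "Spring", "Spring", "Summer")

eventdata <- data.frame(
  month = factor(month.abb[months], levels = month.abb),
  season = factor(season_of[months], levels = c("Summer", "Fall", "Winter", "Spring"))
)

p <- ggplot(eventdata, aes(x = month, fill = season)) +
  geom_bar(width = 1, colour = "grey") + # one bar per month, no gaps or overlap
  scale_x_discrete(drop = FALSE) +       # keep all 12 months even if one is empty
  coord_polar(start = 0) +
  theme_minimal() +
  scale_fill_brewer() +
  ylab("Count") +
  ggtitle("Events by month")
p
```

Using a factor with explicit levels keeps the months in calendar order around the circle instead of alphabetical order.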

R ggplot and facet grid: how to control x-axis breaks

I am trying to plot the change in a time series for each calendar year using ggplot and I am having problems with the fine control of the x-axis. If I do not use scale="free_x" then I end up with an x-axis that shows several years as well as the year in question, like this:
If I do use scale="free_x" then as one would expect I end up with tick labels for each plot, and that in some cases vary by plot, which I do not want:
I have made various attempts to define the x-axis using scale_x_date etc but without any success. My question is therefore:
Q. How can I control the x-axis breaks and labels on a ggplot facet grid so that the (time series) x-axis is identical for each facet, shows only at the bottom of the panel and is in the form of months formatted 1, 2, 3 etc or as 'Jan','Feb','Mar'?
Code follows:
require(lubridate)
require(ggplot2)
require(plyr)
# generate data
df <- data.frame(date=seq(as.Date("2009/1/1"), by="day", length.out=1115),price=runif(1115, min=100, max=200))
# remove weekend days
df <- df[!(weekdays(as.Date(df$date)) %in% c('Saturday','Sunday')),]
# add some columns for later
df$year <- as.numeric(format(as.Date(df$date), format="%Y"))
df$month <- as.numeric(format(as.Date(df$date), format="%m"))
df$day <- as.numeric(format(as.Date(df$date), format="%d"))
# calculate change in price since the start of the calendar year
df <- ddply(df, .(year), transform, pctchg = ((price/price[1])-1))
p <- ggplot(df, aes(date, pctchg)) +
geom_line( aes(group = 1, colour = pctchg),size=0.75) +
facet_wrap( ~ year, ncol = 2,scale="free_x") +
scale_y_continuous(formatter = "percent") +
opts(legend.position = "none")
print(p)
Here is an example:
df <- transform(df, doy = as.Date(paste(2000, month, day, sep="/")))
p <- ggplot(df, aes(doy, pctchg)) +
geom_line( aes(group = 1, colour = pctchg),size=0.75) +
facet_wrap( ~ year, ncol = 2) +
scale_x_date(format = "%b") +
scale_y_continuous(formatter = "percent") +
opts(legend.position = "none")
p
Is this what you want?
The trick is to map every date onto the same dummy year (here 2000), so that all facets share one x-axis.
UPDATED
Here is an example for the dev version (i.e., ggplot2 0.9):
p <- ggplot(df, aes(doy, pctchg)) +
geom_line( aes(group = 1, colour = pctchg), size=0.75) +
facet_wrap( ~ year, ncol = 2) +
scale_x_date(label = date_format("%b"), breaks = seq(min(df$doy), max(df$doy), "month")) +
scale_y_continuous(label = percent_format()) +
opts(legend.position = "none")
p
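The snippets above target old ggplot2 releases; opts() and the formatter argument are no longer available in current versions. As a rough sketch of the same dummy-year trick for a recent ggplot2, assuming theme() in place of opts(), scales::percent for the y labels, and base R's ave() standing in for the plyr step:

```r
library(ggplot2)

set.seed(1)
df <- data.frame(date = seq(as.Date("2009-01-01"), by = "day", length.out = 1115))
df$price <- runif(nrow(df), min = 100, max = 200)

# Remove weekends; "%u" is locale-independent (6 = Saturday, 7 = Sunday)
df <- df[!(format(df$date, "%u") %in% c("6", "7")), ]
df$year <- as.integer(format(df$date, "%Y"))

# Change in price since the first remaining day of each calendar year
df$pctchg <- ave(df$price, df$year, FUN = function(p) p / p[1] - 1)

# Map every date onto the dummy year 2000 so all facets share one x-axis
df$doy <- as.Date(paste0("2000-", format(df$date, "%m-%d")))

p <- ggplot(df, aes(doy, pctchg)) +
  geom_line(aes(group = 1, colour = pctchg)) +
  facet_wrap(~ year, ncol = 2) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  scale_y_continuous(labels = scales::percent) +
  theme(legend.position = "none")
p
```

With the default fixed scales in facet_wrap(), every panel uses the same Jan to Dec axis.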
