Date format for subset of ticks on time axis - r

Problem
I would like to format my X-axis (time) so that the weekends are clearly visible. I would like to display the date as well as the day of the week.
Current situation
I do this with (full code below)
scale_x_date(breaks=myData$timestamp,
labels=paste(
substr(format(myData$timestamp, "%a"),1,1),
format(myData$timestamp, "%d"),
sep="\n")
)
which gives me
Wanted situation
I would rather have a one-letter abbreviation for the weekdays, since it became a bit tight there. Also, I'd like to color Sundays (and holidays, really) in red. Here's what I mean (made with GIMP). Note how the first Monday and last Friday were added by using
scale_x_date(breaks = "1 day",
minor_breaks = "1 days",
labels = date_format("%a\n%d"),
name="")
However, then I get a three letter abbreviation of the weekdays, which I removed in GIMP.
Here is the complete code for this example.
library(ggplot2)
library(scales)
library(reshape2)
minimumTime <- as.Date("2014-07-01")
maximumTime <- as.Date("2014-07-31")
x <- seq(minimumTime,maximumTime, by="1 day")
y1 <- sin(as.numeric(x)/3)
y2 <- cos(as.numeric(x)/3)
myData <- data.frame(timestamp=x, y1=y1, y2=y2)
myData <- melt(myData, id.vars="timestamp")
rects <- data.frame(saturdays=myData[weekdays(myData$timestamp) == "Saturday","timestamp"]-0.5, sundays = myData[weekdays(myData$timestamp) == "Saturday","timestamp"]+1.5)
myPlot <- ggplot() +
geom_rect(data=rects, aes(xmin=saturdays, xmax=sundays,ymin=-Inf, ymax=Inf), alpha=0.1) +
geom_line(data=myData, aes(x=timestamp, y=value, colour=variable,size=1)) +
geom_point(data=myData, aes(x=timestamp, y=value, colour=variable,size=2)) +
scale_x_date(breaks=myData$timestamp, labels=paste(substr(format(myData$timestamp, "%a"),1,1),format(myData$timestamp, "%d"),sep="\n")) +
#scale_x_date(breaks = "1 day", minor_breaks = "1 days", labels = date_format("%a\n%d"), name="") +
scale_size_continuous(range = c(1.5,5), guide=FALSE)
So to sum up:
Is there a way to color specific breaks in another color?
Is there a way to change the labels manually and still have them for the Monday and
the Friday at the beginning and the end in this case?
Also, if there's a way to have the lines of each label centered, that would be
awesome :)
Thank you!

You can use your custom formatter for the labels together with the breaks="1 day" argument: pass function(x) to labels= and replace myData$timestamp with x. This will also solve the third problem.
+ scale_x_date(breaks="1 day",
labels= function(x) paste(substr(format(x, "%a"),1,1),format(x, "%d"),sep="\n"))
Or you can define your transformation as a separate function and then use it for labels=.
my_date_trans<-function(x) {
paste(substr(format(x, "%a"),1,1),format(x, "%d"),sep="\n")
}
+ scale_x_date(breaks="1 day",labels=my_date_trans)
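Note that format(x, "%a") depends on the session locale, so its first letter is not always the English one. Here is a minimal sketch of a locale-independent variant, assuming English one-letter labels are wanted. The names day_letters and my_date_labels are made up for this example.

```r
# Hypothetical helper: one-letter English weekday plus day of month.
# "%u" is the ISO weekday number (1 = Monday, ..., 7 = Sunday),
# so the lookup does not depend on the locale.
day_letters <- c("M", "T", "W", "T", "F", "S", "S")

my_date_labels <- function(x) {
  paste(day_letters[as.integer(format(x, "%u"))], format(x, "%d"), sep = "\n")
}

# A Sunday and the following Monday
my_date_labels(as.Date(c("2014-07-06", "2014-07-07")))
```

It can be passed to scale_x_date(breaks="1 day", labels=my_date_labels) in the same way as my_date_trans.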
To change the colors of the labels, use theme() and axis.text.x=. Here I am using a vector of colors that contains black six times and then red, since your scale starts with a Monday. Those colors are then recycled along the axis.
ggplot() +
geom_rect(data=rects, aes(xmin=saturdays, xmax=sundays,ymin=-Inf, ymax=Inf), alpha=0.1) +
geom_line(data=myData, aes(x=timestamp, y=value, colour=variable,size=1)) +
geom_point(data=myData, aes(x=timestamp, y=value, colour=variable,size=2)) +
scale_x_date(breaks="1 day",labels=my_date_trans)+
scale_size_continuous(range = c(1.5,5), guide=FALSE)+
theme(axis.text.x=element_text(color=c(rep("black",6),"red")))
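The hard-coded color vector only works while the first break is a Monday. A hedged sketch of deriving the colors from the break dates instead, assuming the breaks are passed explicitly (axis_breaks and label_colours are names invented here):

```r
# Assumed break positions: one per day, covering the data plus a margin
axis_breaks <- seq(as.Date("2014-06-30"), as.Date("2014-08-01"), by = "1 day")

# "%u" is the ISO weekday number; "7" means Sunday regardless of locale
label_colours <- ifelse(format(axis_breaks, "%u") == "7", "red", "black")
```

These would then go into scale_x_date(breaks=axis_breaks, labels=my_date_trans) and theme(axis.text.x=element_text(color=label_colours)). Holidays could be handled the same way by also testing axis_breaks against a vector of holiday dates. Recent ggplot2 versions may warn that vectorized input to element_text() is not officially supported, so treat this as a workaround.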

Related

How to include ticks for every date using ggplot?

Let's consider data:
x <- c(
"1900-01-01", "1900-04-01", "1900-07-01", "1900-10-01", "1901-02-01",
"1901-05-01", "1901-08-01",
"1901-11-01", "1902-02-01", "1902-05-01", "1902-08-01", "1902-11-01", "1903-02-01"
)
x <- as.Date(x)
y <- 1:length(x)
df <- data.frame("Date" = x, "Preds" = y)
I want to make a plot using ggplot but with every date marked (in format %b-%Y):
My work so far:
ggplot(df, aes(x = Date, y = Preds)) +
geom_line() +
scale_x_date(date_breaks = "3 month", date_labels = "%b-%Y")
However, this code has one problem: the first tick is in February, whereas our dates start in January.
Do you know how I can shift the ticks back by one month? And do I have to specify explicitly that I want breaks every three months? Or is there a generic solution that would let me have a tick for every date without this specification?
We just need to include breaks = x:
ggplot(df, aes(x = Date, y = Preds)) +
geom_line() +
scale_x_date(breaks = x, date_labels = "%b-%Y")

position_dodge2 does not work for multiple values in a dataframe

I want to create a box plot with dodge as a position adjustment. The problem is that "position_dodge2" is not working when I use two layers of geom_boxplot with different data sources. An "obvious" solution is to merge both dataframes and then use ggplot with a single dataframe, but I do not know how to organize the data to make it work.
The data is composed of precipitation estimation from different sources for the same location for 6 months.
Date Prec1 Prec2 ...
01-01-2000 0.2 0.8 ...
.
.
.
Then, I used the "gather" function to collect the precipitation variables into one column. I also added the month and week columns.
Precipitation Date Value Month Week
Prec1 01-01-2000 0.2 Jan 1
Prec2 01-01-2000 0.6 Jan 1
.
.
.
This is the code I have:
p1 <- ggplot(data = df, aes(x=Date, y=Value, group = Week),
position = position_dodge2(width = 0.5)) +
geom_boxplot(data = subset(df, Model != "PrecF1")) +
geom_boxplot(data = subset(df, Model == "PrecF1"),
color = "red",
width = 0.32) +
xlab("Date") + ylab("Precipitation (mm)") +
scale_x_date(date_breaks = "1 month", date_labels = "%B") # adjust the x axis breaks
p1 + theme(axis.text.x = element_text(size=8, angle=0))
I want to compare "Prec1" values against the other estimations using boxplots side-by-side but position_dodge2 does not work :(
I appreciate any help. Please let me know if I am overlooking something obvious or I should provide the data. The output is attached.
The data is here: PrecData
Edited in the light of subsequent comments...
The problem is that you are using a continuous x axis for what is essentially a discrete set of data points, so dodging does not really make sense (as it changes the apparent date value). You can solve this by basing the x axis on as.factor(Week), and fiddling the labels appropriately. Try the following...
ggplot(data = df, aes(x=as.factor(Week), y=Value, colour = (Model=="PrecF1"))) +
geom_boxplot()
An alternative would be to use facets. Try the following...
ggplot(data = df, aes(x=Date, y=Value, group = Week)) +
geom_boxplot() +
facet_wrap(~(Model == "PrecF1")) +
xlab("Date") + ylab("Precipitation (mm)") +
scale_x_date(date_breaks = "1 month", date_labels = "%B") +
theme(axis.text.x = element_text(size=8, angle=0))
This will give you two facets side by side, depending on whether Model == "PrecF1" is true or false.
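For reference, here is a small sketch of the factor-based approach on made-up data, since the real data is not shown. The data frame toy and the column is_reference are hypothetical. The column names Week, Model and Value follow the question.

```r
library(ggplot2)

# Made-up long-format data: 3 models x 4 weeks x 10 values each
set.seed(1)
toy <- expand.grid(Week = 1:4,
                   Model = c("PrecF1", "Prec1", "Prec2"),
                   rep = 1:10)
toy$Value <- rexp(nrow(toy))
toy$is_reference <- toy$Model == "PrecF1"

# With a discrete x and a discrete colour, geom_boxplot dodges the
# boxes within each week; position_dodge2 is spelled out for clarity
p <- ggplot(toy, aes(x = factor(Week), y = Value, colour = is_reference)) +
  geom_boxplot(position = position_dodge2(preserve = "single")) +
  scale_colour_manual(values = c(`FALSE` = "black", `TRUE` = "red"),
                      name = "PrecF1") +
  labs(x = "Week", y = "Precipitation (mm)")
```

Printing p should show two boxes per week, the reference model in red next to the pooled other models.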

Is there a way to remove increment labels on the y axis in ggplot2?

I am plotting three years of data on a scatterplot in ggplot2, with years as the y-axis. The axis is scaling so that the tick mark labels are "2015.5, 2016, 2016.5 … etc." and I need them to just be "2016 2017 2018". I have tried to use the scale_y_discrete function.
Here is my code
x <- (plot <- ggplot(NULL, aes(sos, year)) +
geom_jitter(data = epic, aes(col = "EPIC")) +
geom_jitter(data = landsat, aes(col = "Landsat")) +
geom_jitter(data = pheno, aes(col = "PhenoCam")))
x + labs(title = "Start of Season Comparison",
x = "DOY",
y = "Year")
and here is the current scatterplot.
Thank you!
Have you checked whether the year variable is numeric? If it is, then I think Solorzano's solution should work.
One easy way is to convert year to a factor, but I am not sure whether this is the correct way of doing it.
ggplot(NULL, aes(sos, as.factor(year))) + # followed by the same geom_jitter() layers as in the question
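If year is numeric, another option is to keep the continuous scale and set the breaks explicitly with scale_y_continuous() (scale_y_discrete() is meant for factor or character data). A minimal sketch on made-up data, where toy and year_breaks are hypothetical names:

```r
library(ggplot2)

# Made-up data standing in for one of the data frames in the question
set.seed(42)
toy <- data.frame(sos = runif(30, 90, 150),
                  year = rep(2016:2018, each = 10))

# One break per year that actually occurs in the data
year_breaks <- sort(unique(toy$year))

p <- ggplot(toy, aes(sos, year)) +
  geom_jitter(height = 0.1) +
  scale_y_continuous(breaks = year_breaks, minor_breaks = NULL) +
  labs(title = "Start of Season Comparison", x = "DOY", y = "Year")
```

Setting minor_breaks = NULL also drops the grid lines at the half years.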

Timeseries in R - how to adjust stat_bin (or histogram) to start at a date instead of centering over it?

I've created two plots that are displayed next to each other vertically. The charts themselves are aligned properly, but the problem is that the uppermost plot, which uses stat_bin, refuses to start AT the first date, 30/09. I guess this makes sense, as most histograms start 0.5 to the left and end 0.5 to the right, but how can I adjust this? I want the uppermost plot to start at the first date so that it aligns with the second chart. This is my first post, by the way; I'm grateful for answers!
# Pulling data from SQL
res <- sqlQuery(dbhandle, Query)
bar <- sqlQuery(dbhandle, Query2)
# Closing SQL objects
rm(Query,Query2,dbhandle)
# Some filtering
res <- res[res$RTC=="S" & res$Category != "Other",]
# Converting datetime to date
bar$AsPerDate <- as.Date(as.character(bar$AsPerDate), format="%Y-%m-%d")
res$CREATED_ON <- as.Date(as.character(res$CREATED_ON), format="%Y-%m-%d")
# Plot 1
p <- ggplot(data=res, aes(x=CREATED_ON, color=Category)) +
stat_bin(data=res[res$Category=="1",],aes(y=cumsum(..count..)),geom="step",size=1) +
stat_bin(data=res[res$Category=="2",],aes(y=cumsum(..count..)),geom="step",size=1) +
stat_bin(data=res[res$Category=="3",],aes(y=cumsum(..count..)),geom="step",size=1) +
stat_bin(data=res[res$Category=="4",],aes(y=cumsum(..count..)),geom="step",size=1) +
theme(panel.grid.major=element_line(colour="grey"), panel.grid.minor=element_line(colour="grey")) +
scale_x_date(date_breaks="1 day", date_labels = "%d/%m", date_minor_breaks = "1 day",
limits = as.Date(c('2016-09-30','2016-10-15')))+
xlab("Date dd/mm")
# Plot 2
m <- ggplot(data=bar, aes(x=AsPerDate, y=Value)) +
geom_line(alpha=1) + geom_point(data=res,aes(x=CREATED_ON,y=REVENUE, color=Category)) +
theme(panel.grid.major=element_line(colour="grey"), panel.grid.minor=element_line(colour="grey")) +
scale_x_date(date_breaks="1 day", date_labels = "%d/%m", date_minor_breaks = "1 day",
limits = as.Date(c('2016-09-30','2016-10-15'))) +
xlab("Date dd/mm")
# Combining plots
p1 <- ggplot_gtable(ggplot_build(p))
p2 <- ggplot_gtable(ggplot_build(m))
maxWidth = unit.pmax(p1$widths[2:3], p2$widths[2:3])
p1$widths[2:3] <- maxWidth
p2$widths[2:3] <- maxWidth
grid.arrange(p1, p2, heights = c(3, 2))
Here is an image-link showing the plot
Try using the "center" argument of stat_bin, as in
stat_bin(data=res[res$Category=="1",],aes(y=cumsum(..count..)),geom="step",size=1, center = 0.5)
I am not sure about the 0.5 value of center, so you may need to try other values as well. The "boundary" argument may also be used.
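As a hedged sketch of how center could be chosen for daily Date data: dates are stored as days since 1970-01-01, so a binwidth of 1 with center set to the numeric value of a date gives one bin per day, centered on the date itself. The data below is made up and the names toy and start are hypothetical. after_stat(count) is the newer spelling of ..count...

```r
library(ggplot2)

# Made-up creation dates standing in for res$CREATED_ON:
# day k after the start date occurs k + 1 times
start <- as.Date("2016-09-30")
toy <- data.frame(CREATED_ON = start + rep(0:15, times = 1:16))

p <- ggplot(toy, aes(x = CREATED_ON)) +
  stat_bin(aes(y = cumsum(after_stat(count))),
           geom = "step",
           binwidth = 1,                # one bin per day
           center = as.numeric(start))  # bins centered on whole dates

# Inspect the computed bins; x is the bin center in days since 1970-01-01
bins <- layer_data(p)
```

With this setup the first step should sit at the first date rather than at an arbitrary bin center, but the exact appearance still depends on the scale limits and on the ggplot2 version.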

date_minor_breaks in ggplot2

I am a beginner in ggplot2. I am unable to use date_minor_breaks to show quarterly "ticks" on x-axis.
Here's my code:
library(ggplot2)
library(zoo) # as.yearqtr()
library(ggthemes) # theme_tufte()
x <- 1:12
time<-c("2010Q1","2010Q2","2010Q3","2010Q4","2011Q1","2011Q2", "2011Q3","2011Q4","2012Q1","2012Q2","2012Q3","2012Q4")
z<-data.frame(type = x,time = time)
z$time = as.yearqtr(z$time)
z$time = as.Date(z$time)
ggplot(data = z, aes(x=time,y=type)) +
geom_point() +
scale_x_date(date_labels = "%Y",date_minor_breaks = "3 months",name = "Year") +
theme_tufte() +
theme(legend.position = "none")
I researched this topic on SO Formatting dates with scale_x_date in ggplot2 and on https://github.com/hadley/ggplot2/issues/542, and found that there were some issues reported on this topic. However, I didn't quite follow the conversation about changes to ggplot2 because it's been only 6 days since I started using ggplot2.
Here's the graph I got (it doesn't have any ticks)...
Here's a sample graph with "tick marks" generated from Excel. Please ignore values because my point of creating this Excel chart is to demonstrate what I am looking for--i.e. "quarterly ticks". I'd appreciate your help.
You may have to make major breaks every three months and then pad your labels with blanks to give the illusion of major (labeled) and minor (unlabeled) ticks. See this answer for another example.
First manually make the breaks for the tick marks at every quarter.
breaks_qtr = seq(from = min(z$time), to = max(z$time), by = "3 months")
Then make the year labels and pad these labels with three blanks after each number.
labels_year = format(seq(from = min(z$time), to = max(z$time), by = "1 year"), "%Y")
labs = c(sapply(labels_year, function(x) {
c(x, rep("", 3))
}))
Now use the breaks and the labels with the labels and breaks arguments in scale_x_date. Notice that I'm not using date_labels and date_breaks for this.
ggplot(data = z, aes(x=time,y=type)) +
geom_point() +
scale_x_date(labels = labs, breaks = breaks_qtr, name = "Year") +
theme_tufte() +
theme(legend.position = "none")
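The padding step assumes that the first break is the first quarter of a year. A small sketch of an alternative that derives the labels from the break dates themselves, so that only January breaks get a year label (labs_qtr is a name invented for this example):

```r
# Same quarterly breaks as above, written out so the block runs on its own
breaks_qtr <- seq(as.Date("2010-01-01"), as.Date("2012-10-01"), by = "3 months")

# Label a break with its year only if it falls in January, else leave it blank
labs_qtr <- ifelse(format(breaks_qtr, "%m") == "01",
                   format(breaks_qtr, "%Y"),
                   "")
labs_qtr
```

The result can be used as labels = labs_qtr together with breaks = breaks_qtr in scale_x_date.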
You should also define your (major) date breaks:
ggplot(data = z, aes(x=time, y=type)) +
geom_point() +
scale_x_date(date_breaks = "1 year", name = "Year", date_minor_breaks="3 months",
limits = c(as.Date(as.yearqtr("2009Q4")),
as.Date(as.yearqtr("2013Q2"))),
expand=c(0,0), date_labels = "%Y") +
theme(legend.position = "none")
The limits and expand arguments are some other "fancy" stuff to align the minor ticks with the major ticks (I guess there are better ways to do this, but this works).

Resources