How to make scale_x_date week start with Sunday - r

I'm creating a weekly time series chart, and each week should start on Sunday. When I specify scale_x_date(breaks = date_breaks('1 week')), the grid lines and labels fall on Mondays, so the result looks slightly off. How can I force the scale_x_date weeks in ggplot to start on Sunday?
This is an example of my code:
library(ggplot2)
library(scales)
data.set <- structure(list(week.start = structure(c(15732, 15739,
15746, 15753, 15760, 15767, 15774, 15781,
15788, 15795, 15802, 15809 ), class =
"Date"), overtime.avg = c(2.8,
2.85666666666667, 2.18333333333333,
2.44666666666667, 2.04833333333333,
2.45833333333333, 2.12833333333333,
1.81666666666667, 1.82166666666667,
1.54333333333333, 2.09166666666667,
0.970833333333333)), .Names =
c("week.start", "overtime.avg"), row.names
= 29733:29744, class = "data.frame")
ggplot(data = data.set,
aes(x = week.start,
y = overtime.avg)) +
geom_line() +
geom_point() +
scale_x_date(breaks = date_breaks("1 week"),
labels = date_format(format = "%Y-%m-%d"))

One way is to build the break points yourself with seq(): start at the first Sunday (here the minimum of week.start, which is already a Sunday) and step with by = "week".
ggplot(data = data.set,aes(x = week.start,y = overtime.avg)) +
geom_line() +
geom_point() +
scale_x_date(breaks = seq(min(data.set$week.start),max(data.set$week.start),by="week"),
labels = date_format(format = "%Y-%m-%d"))
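If the first date in the data is not itself a Sunday, the same idea still works once the start of the sequence is moved back to the preceding Sunday. Here is a minimal sketch in base R; sunday_breaks is a hypothetical helper name, and the two sample dates are made up for illustration:

```r
# Hypothetical helper: weekly breaks aligned to Sunday, covering the data range.
sunday_breaks <- function(dates) {
  first <- min(dates, na.rm = TRUE)
  last  <- max(dates, na.rm = TRUE)
  # as.POSIXlt()$wday is 0 for Sunday, 1 for Monday, ..., 6 for Saturday,
  # so subtracting it moves back to the Sunday on or before `first`
  start <- first - as.POSIXlt(first)$wday
  seq(start, last, by = "week")
}

# Made-up sample: a Tuesday and a Wednesday a few weeks later
dates <- as.Date(c("2013-01-29", "2013-02-20"))
brks <- sunday_breaks(dates)
print(brks)
```

The result can then be passed to the scale as scale_x_date(breaks = sunday_breaks(data.set$week.start), labels = date_format(format = "%Y-%m-%d")).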

Related

How to fix arguments for scale_x_date in R code (ggplot2)?

Please, I need your help finding the error in this code. I am receiving the following error message: Error: Invalid input: date_trans works with objects of class Date only. I think the problem is with the scale_x_date arguments, but I am unable to fix it. Thank you.
library(ggplot2)
library(scales)
library(lubridate)
library(readxl)
entrada <- read_excel("R_codes_examples/entrada_turistas.xlsx", sheet = "mensal",
col_types = c("date", "numeric"))
ggplot(entrada, aes(x = entrada$`mes_ano`, y = entrada$`movimento_de_passageiros`)) +
geom_line( colour = "#0c4c8a") +
scale_x_date(date_breaks = "6 months", labels = date_format("%b-%Y"), limits = c(as.Date("2006-08-01"), NA)) +
scale_y_continuous(labels=function(n){format(n, scientific = FALSE)}) +
labs(y= "Movimento de Passageiros mensais 2006 a 2017 ", x = "Mês/Ano") +
xlab("") +
theme(axis.text.x=element_text(angle=60, hjust=1))
Please find the head of my data below:
> dput(head(entrada))
structure(list(mes_ano = structure(c(1136073600, 1138752000,
1141171200, 1143849600, 1146441600, 1149120000), tzone = "UTC",
class = c("POSIXct", "POSIXt")), movimento_de_passageiros =
c(119764, 100442, 114198,
124676, 113431, 115482)), row.names = c(NA, -6L), class =
c("tbl_df", "tbl", "data.frame"))
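A few things went wrong, and yes, your data is fine :)
The error itself comes from the class of mes_ano: it is POSIXct (date-time), and scale_x_date only accepts Date. Either switch to scale_x_datetime or convert the column with as.Date.
labels = date_format("%b-%Y") can be written more briefly as date_labels = "%b-%Y".
The limits must have the same class as the data: POSIXct for scale_x_datetime, Date for scale_x_date. The examples below set both ends of the range explicitly rather than leaving one as NA.
A code style point: entrada$`mes_ano` is not needed inside aes(), because the data frame is already passed to ggplot(); refer to the columns by name.
Here is the fixed code, assuming you keep the POSIXct data (including time):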
ggplot(entrada, aes(x = mes_ano, y = movimento_de_passageiros)) +
geom_line(colour = "#0c4c8a") +
scale_x_datetime(date_breaks = "6 months", date_labels = "%b-%Y", limits = c(as.POSIXct("2006-01-01"), as.POSIXct("2006-12-01"))) +
scale_y_continuous(labels=function(n){format(n, scientific = FALSE)}) +
labs(y= "Movimento de Passageiros mensais 2006 a 2017 ", x = "Mês/Ano") +
xlab("")
Here is an example if you work with Date values instead; make sure the column and the limits all have the same class:
ggplot(entrada, aes(x = as.Date(mes_ano), y = movimento_de_passageiros)) +
geom_line(colour = "#0c4c8a") +
scale_x_date(date_breaks = "6 months", date_labels = "%b-%Y", limits = c(as.Date("2006-01-01"), as.Date("2006-12-01"))) +
scale_y_continuous(labels=function(n){format(n, scientific = FALSE)}) +
labs(y= "Movimento de Passageiros mensais 2006 a 2017 ", x = "Mês/Ano") +
xlab("")
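The class mismatch behind the error can be checked before any plotting. Here is a minimal sketch in base R that rebuilds the first three mes_ano values from the dput() output above and converts them to Date; the variable names are only illustrative:

```r
# First three timestamps from the question's dput(), as POSIXct in UTC
mes_ano <- as.POSIXct(c(1136073600, 1138752000, 1141171200),
                      origin = "1970-01-01", tz = "UTC")

# scale_x_date expects class Date; this column is POSIXct
is_date <- inherits(mes_ano, "Date")
print(class(mes_ano))
print(is_date)

# Convert to Date, stating the time zone explicitly so the day cannot shift
mes_ano_date <- as.Date(mes_ano, tz = "UTC")
print(mes_ano_date)
```

After the conversion the column is of class Date, so it matches scale_x_date together with limits built by as.Date().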

plotly overrules ggplot2's scale_fill_manual's labels

I have a sample data set containing an end-of-week date and a churn value, which can be either negative or positive. In ggplot2 I use scale_fill_manual() with the sign of the value as the group.
This works perfectly fine, showing different colors for positive and negative values, and the legend labels are rewritten according to the labels provided. However, if I simply turn it into a plotly graph, I lose my labels and they are set back to the -1, 1 factor levels instead. Does plotly not support this, and if so, is there another way to get this done?
library(ggplot2)
library(plotly)
dt <- structure(list(date = structure(c(18651L, 18658L, 18665L, 18672L,
18679L, 18686L, 18693L, 18700L, 18707L, 18714L), class = c("IDate",
"Date")), churn = c(-3.27088948787062, -0.582518144525087, -0.125024925224327,
-0.333746898263027, -0.685714285714286, -0.340165549862042, 0.0601176470588235,
-0.119351608461635, -0.0132513279284316, -0.011201854099989)), row.names = c(NA,
-10L), class = c("data.table", "data.frame"))
plot_ggplot <- ggplot(dt, aes(x = date, y = churn * 100)) +
geom_bar(stat = "identity", aes(fill = factor(sign(churn)))) +
scale_fill_manual(
values = c("#4da63f", "#e84e62"),
breaks = c("-1", "1"),
labels = c("Growing base", "Declining base")
) +
ylim(-75, 25) +
labs(
title = "Weekly churn rate",
fill = "Legend"
)
plot_ggplot
plot_ggplotly <- ggplotly(plot_ggplot)
plot_ggplotly
Does this do the trick?
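# Recode the sign into descriptive labels in the data itself, so the legend
# text no longer depends on scale_fill_manual(labels = ...).
# As in the question, negative churn is a growing base, positive a declining one.
dt$base <- ifelse(dt$churn > 0, "Declining base", "Growing base")
plot_ggplot <- ggplot(dt, aes(x = date, y = churn * 100)) +
geom_bar(stat = "identity", aes(fill = base)) +
scale_fill_manual(
# named values tie each colour to its label regardless of level order
values = c("Growing base" = "#4da63f", "Declining base" = "#e84e62")
) +
ylim(-75, 25) +
labs(
title = "Weekly churn rate",
fill = "Legend"
)
plot_ggplot
plot_ggplotly <- ggplotly(plot_ggplot)
plot_ggplotly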
edit: I just read the comment, I think it is what was suggested
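Another way to move the labels into the data is to build a labelled factor directly from the sign. This is only a sketch: the churn values are made up, and it assumes, as the question reports, that ggplotly shows the factor levels rather than the scale labels. The ggplot2 part runs only if the package is installed:

```r
churn <- c(-3.27, -0.58, 0.06, -0.12)  # made-up sample values

# levels/labels map -1 -> "Growing base" and 1 -> "Declining base";
# a churn of exactly 0 would become NA under this mapping
base <- factor(sign(churn), levels = c(-1, 1),
               labels = c("Growing base", "Declining base"))

# Named colours stay attached to their label whatever the level order
fill_colours <- c("Growing base" = "#4da63f", "Declining base" = "#e84e62")
print(table(base))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(week = seq_along(churn), churn, base),
              aes(x = week, y = churn * 100, fill = base)) +
    geom_col() +
    scale_fill_manual(values = fill_colours)
  # plotly::ggplotly(p) could then be called on this plot
}
```

Because the legend text now comes from the factor levels, no labels argument is needed in scale_fill_manual().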

How to plot the graph using all the dates in x axis? [duplicate]

I'm having a very, very tough time getting the x-axis to look correct for my graphs.
Here is my data (generated via dput()):
df <- structure(list(Month = structure(1:12, .Label = c("2011-07-31", "2011-08-31", "2011-09-30", "2011-10-31", "2011-11-30", "2011-12-31", "2012-01-31", "2012-02-29", "2012-03-31", "2012-04-30", "2012-05-31", "2012-06-30"), class = "factor"), AvgVisits = c(6.98655104580674,7.66045407330464, 7.69761337479304, 7.54387561322994, 7.24483848458728, 6.32001400498928, 6.66794871794872, 7.207780853854, 7.60281201431308, 6.70113837397123, 6.57634103019538, 6.75321935568936)), .Names = c("Month","AvgVisits"), row.names = c(NA, -12L), class = "data.frame")
Here is the chart I am trying to graph:
ggplot(df, aes(x = Month, y = AvgVisits)) +
geom_bar() +
theme_bw() +
labs(x = "Month", y = "Average Visits per User")
That chart works fine - but, if I want to adjust the formatting of the date, I believe I should add this:
scale_x_date(labels = date_format("%m-%Y"))
I'm trying to make it so the date labels are 'MMM-YYYY'
ggplot(df, aes(x = Month, y = AvgVisits)) +
geom_bar() +
theme_bw() +
labs(x = "Month", y = "Average Visits per User") +
scale_x_date(labels = date_format("%m-%Y"))
When I plot that, I continue to get this error:
stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
Despite hours of research on formatting of geom_line and geom_bar, I can't fix it. Can anyone explain what I'm doing wrong?
Edit: As a follow-up thought: Can you use date as a factor, or should you use as.Date on a date column?
To show months as Jan 2017 Feb 2017 etc:
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")
Angle the dates if they take up too much space:
theme(axis.text.x=element_text(angle=60, hjust=1))
Can you use date as a factor?
Yes, but you probably shouldn't.
...or should you use as.Date on a date column?
Yes.
Which leads us to this:
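library(ggplot2)
library(scales)
# Month is a factor of "YYYY-MM-DD" strings; convert it to Date for scale_x_date
df$Month <- as.Date(df$Month)
ggplot(df, aes(x = Month, y = AvgVisits)) +
geom_bar(stat = "identity") + # plot the values as given instead of counting rows
theme_bw() +
labs(x = "Month", y = "Average Visits per User") +
scale_x_date(labels = date_format("%m-%Y")) # "%b-%Y" would give abbreviated month names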
in which I've added stat = "identity" to your geom_bar call.
In addition, the message about the binwidth wasn't an error. An error will actually say "Error" in it, and similarly a warning will always say "Warning" in it. Otherwise it's just a message.
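To see why the conversion matters, compare what a factor and a Date actually store. Here is a small base R sketch using the first three months from the question's data; the variable names are illustrative:

```r
month_factor <- factor(c("2011-07-31", "2011-08-31", "2011-09-30"))

# A factor only stores level codes, so it carries no notion of time
codes <- as.integer(month_factor)
print(codes)

# Converting through character gives real dates with real spacing
month_date <- as.Date(as.character(month_factor))
gaps <- as.numeric(diff(month_date))  # days between consecutive dates
print(gaps)

# "%m" is the month number; "%b" is the abbreviated month name,
# whose spelling depends on the current locale
numeric_labels <- format(month_date, "%m-%Y")
print(numeric_labels)
print(format(month_date, "%b-%Y"))
```

Note that geom_col() is a shorthand for geom_bar(stat = "identity"), so either can be used once the column is a Date.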

ggplot delete specific x-axis labels

library(tidyverse)
df <- data.frame(date = as.Date(c("2017-12-01", "2018-01-01", "2018-02-01",
"2018-03-01", "2018-04-01", "2018-05-01",
"2018-06-01", "2018-07-01", "2018-08-01",
"2018-09-01", "2018-10-01", "2018-11-01")),
value = c(0.567859562, 0.514907158, 0.035399304, 0.485728823,
0.925127361, 0.237531067, 0.301930968, 0.133373326,
0.082275426, 0.464255614, 0.2366749, 0.652084264))
ggplot(df, aes(date, value)) +
geom_col() +
scale_x_date(date_breaks = "1 month",
date_labels = "%b") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.3))
I want to keep the plot produced by the code above exactly as it is, with two exceptions: I want to remove the first Nov label and the last Dec label on the x-axis. I added coord_cartesian(xlim = as.Date(c("2017-12-01", "2018-11-01"))) to my code chunk above, but this eliminates the 'blank space' padding at either end of my x-axis.
How do I simply tell ggplot to delete the text of the first and last x-axis labels? This would be the first Nov and the last Dec. Note that these do not exist in my df data frame at all, so dplyr filters probably won't work.
You could achieve what you want by setting the breaks with seq.Date:
library(tidyverse);library(lubridate)
df <- data.frame(date = as.Date(c("2017-12-01", "2018-01-01", "2018-02-01",
"2018-03-01", "2018-04-01", "2018-05-01",
"2018-06-01", "2018-07-01", "2018-08-01",
"2018-09-01", "2018-10-01", "2018-11-01")),
value = c(0.567859562, 0.514907158, 0.035399304, 0.485728823,
0.925127361, 0.237531067, 0.301930968, 0.133373326,
0.082275426, 0.464255614, 0.2366749, 0.652084264))
ggplot(df, aes(date, value)) +
geom_col() +
scale_x_date(
date_labels = "%b",
breaks = seq.Date(ymd("2017-12-01"),ymd("2018-11-01"), by = "month")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.3))
I think this is what you want. The date_breaks argument is unnecessary; passing the dates themselves as breaks is enough.
ggplot(df, aes(date, value)) +
geom_col() +
scale_x_date(date_labels = "%b", breaks = df$date) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.3))
I would suggest looking into the lubridate package for R. It can convert messy date values into proper Date or POSIXct objects and makes it easy to extract the month. You could also store the month in its own column and use that as the axis label, which is cleaner, and then fill by month or do other things with it.
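Both answers above rely on the same point: explicit breaks are limited to the dates you supply, so months that fall only in the axis padding get no label. Here is a minimal base R sketch that builds the breaks from the data range; month_breaks is a hypothetical helper name, and it assumes the dates sit on the first of each month as in the question:

```r
# Hypothetical helper: one break per month between the first and last date
month_breaks <- function(dates) {
  seq(min(dates), max(dates), by = "month")
}

# The twelve first-of-month dates from the question
dates <- as.Date(c("2017-12-01", "2018-01-01", "2018-02-01", "2018-03-01",
                   "2018-04-01", "2018-05-01", "2018-06-01", "2018-07-01",
                   "2018-08-01", "2018-09-01", "2018-10-01", "2018-11-01"))

brks <- month_breaks(dates)
print(range(brks))   # the breaks stop at the data range
print(length(brks))
```

It can be used as scale_x_date(date_labels = "%b", breaks = month_breaks(df$date)), which leaves the padding at both ends of the axis in place.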
