Order of weekdays in a bar chart using lubridate and ggplot - r

When creating a bar plot where the x-axis is weekdays (using lubridate and ggplot), the order of my weekdays doesn't follow the chronological "Mon, Tue, Wed, ..."
I've tried to replicate the problem but, as you can see below, in my replication the order of the labels on the x-axis is fine.
library(lubridate)
library(dplyr)
library(ggplot2)
my_data <- data.frame(dates = sample(seq(as.Date('2010/01/01'), as.Date('2020/01/01'), by="day"), 100),
group = rep(c(1,2,3,4), times = 5))
my_data <- my_data %>%
mutate(Weekdays = wday(dates, label = TRUE)) %>%
filter(Weekdays != "Sat" &
Weekdays != "Sun")
ggplot(my_data, aes(Weekdays))+
geom_bar()+
facet_wrap(~group)
Output:
The code below is what I have used with the real data. As you can see, the weekdays are not in order. I'm not sure where my actual code differs from the attempted replication above.
library(ggplot2)
df1 <- filter(df, wday != "Sat")
ggplot(df1, aes(x = wday))+
geom_bar()+
facet_wrap(~Grouping)+
theme(axis.text.x = element_text(angle =90, hjust = 1))
Output:
Any help / advice would be most appreciated.

I think this would be the solution
library(forcats) # fct_reorder() comes from forcats
my_data %>%
mutate(weekday = fct_reorder(weekdays(dates, abbreviate = TRUE), # generating week days
wday(dates))) %>%
filter(!weekday %in% c("Sat", "Sun")) %>% # here you filter saturday and sunday
ggplot(aes(weekday)) +
geom_bar() +
facet_wrap(~ group)
As you can see, wday() returns the day-of-week number for each date, and fct_reorder() uses those numbers to put the weekday labels in order.
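One possible reason the real data behaves differently, shown as a minimal sketch: wday(..., label = TRUE) returns an ordered factor, but if that column has been turned into plain character somewhere along the way (for example after a round trip through a CSV file, which is only an assumption about the real df), ggplot falls back to alphabetical order. The dates and variable names below are made up for illustration.

```r
library(lubridate)

# Hypothetical dates: a Monday, a Wednesday and a Friday
dates <- as.Date(c("2024-01-01", "2024-01-03", "2024-01-05"))

lab <- wday(dates, label = TRUE)  # ordered factor, levels in week order
chr <- as.character(lab)          # plain character: the week order is lost

# A character column is plotted like factor(chr), i.e. with
# alphabetically sorted levels
levels(factor(chr))

# Restoring the week order by reusing the levels of the lubridate factor
fixed <- factor(chr, levels = levels(lab))
levels(fixed)
```

Checking class(df$wday) on the real data would show whether this is what happened.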

Related

Arrange weekdays starting on Sunday

Hi everyone!
How can I arrange weekdays, starting on Sunday, in R? I got the weekdays using the weekdays() function (base R), but the days appear in a seemingly random order (image attached) and I can't find a way to sort them. I tried the arrange function, but I guess it only works with numeric values. A bar chart looks very weird starting on Friday. This is what the code looks like:
my_dataset <- my_dataset %>%
mutate(weekDay = weekdays(Date))
my_dataset %>%
group_by(weekDay) %>%
summarise(mean_steps = mean(TotalSteps)) %>%
ggplot(aes(x = weekDay, y = mean_steps))+
geom_bar(stat = "identity")
Thanks!
You wrote: "I tried the arrange function, but I guess it only works with numeric values."
arrange() does work on character columns, but it sorts them alphabetically, and ggplot orders a character x-axis alphabetically regardless of the row order. Your weekDay vector is probably of class character. The solution is to convert this character vector into a factor.
There are several ways to get the x-axis in the order you want. All of them involve converting weekDay into a factor.
To stay close to your example, I first created a data frame with weekdays and some data. Both are generated randomly, so a seed is set to make the code reproducible.
One method is to create the data frame of summaries and then define weekDay in it as a factor with explicit levels.
The factor can also be defined inside the ggplot call, when creating the aesthetics.
library(tidyverse)
set.seed(111)
# weekdays() returns day names in the current locale; the factor levels
# below must match that output (an English locale is assumed here).
myData <- data.frame(
  weekDay = sample(weekdays(Sys.Date() + 0:6), 100, replace = TRUE),
  TotalSteps = sample(1000:8000, 100)
)
myData %>%
  group_by(weekDay) %>%
  summarise(mean_steps = mean(TotalSteps)) -> DF # new data.frame
# the following defines weekDay as a factor and also sets
# the sequence of factor levels. This sequence is then taken
# by ggplot to construct the x-axis.
DF$weekDay <- factor(DF$weekDay, levels = c(
  "Sunday", "Monday",
  "Tuesday", "Wednesday",
  "Thursday", "Friday",
  "Saturday"
))
ggplot(DF, aes(x = weekDay, y = mean_steps)) +
  geom_bar(stat = "identity") +
  labs(x = "")
# the factor can also be defined within the ggplot-call
myData %>%
  group_by(weekDay) %>%
  summarise(mean_steps = mean(TotalSteps)) %>%
  ggplot(aes(x = factor(weekDay, levels = c(
    "Sunday", "Monday",
    "Tuesday", "Wednesday",
    "Thursday", "Friday",
    "Saturday"
  )), y = mean_steps)) +
  geom_bar(stat = "identity") +
  labs(x = "")
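If the original dates are still available, a variant that avoids typing the day names by hand might look like the sketch below. It relies on lubridate's wday(), which with label = TRUE returns an ordered factor and with week_start = 7 puts Sunday first. The steps data frame and its columns are invented for illustration and only mimic the question's TotalSteps data.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

set.seed(111)
# Hypothetical data: four full weeks, one step count per day
steps <- data.frame(
  Date = as.Date("2024-01-01") + 0:27,
  TotalSteps = sample(1000:8000, 28)
)

daily_means <- steps %>%
  # ordered factor of day names; week_start = 7 makes Sunday the first level
  mutate(weekDay = wday(Date, label = TRUE, abbr = FALSE, week_start = 7)) %>%
  group_by(weekDay) %>%
  summarise(mean_steps = mean(TotalSteps), .groups = "drop")

# The factor levels drive the x-axis order, so the bars start on Sunday
ggplot(daily_means, aes(x = weekDay, y = mean_steps)) +
  geom_col() +
  labs(x = "")
```

Because the order comes from the factor levels rather than from the label text, this should work the same way whatever language the labels are in.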

Why can't I get the right horizontal axis labels on my ggplot2 chart?

I am trying to do a faceted plot of a grouped dataframe with ggplot2, using geom_line(). My dataframe has a Date column and I would like to have dates on the horizontal axis. If I just use Date in aes(x=Date, ...) I get nice labels on the horizontal axis. However, the line has an almost horizontal section where the date jumps from the end of one group to the beginning of the next group. This code and chart shows that:
dts <- seq.Date(as.Date("2020-01-01"), as.Date("2021-12-31"), by="day")
mos <- sapply(dts, month)
df <- data.frame(Date=dts, Month=mos)
nr <- nrow(df)
df$X <- rep(1, nr)
df %>%
group_by(Month) -> dfgrp
dfgrp %>%
group_by(Month) %>%
mutate(Time = Date[1:n()],
Z = cumsum(X)) %>%
ggplot(aes(x=Date, y=Z)) +
geom_line(color="darkgreen", size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
theme(axis.text.x = element_text(angle=45, size=7))
I would not like my chart to have those almost-horizontal lines when the date changes by a large amount. I was able to generate a chart without those lines using integers on aes() as follows:
dfgrp %>%
mutate(Time = 1:n() %>% as.integer(),
Z = cumsum(X)) %>%
ggplot(aes(x=Time, y=Z)) +
geom_line(color="darkgreen", size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
scale_x_continuous(breaks = seq(from=1, to=nr, by=10) %>% as.integer(),
labels = function(x) as.character(dfgrp$Date[x])) +
theme(axis.text.x = element_text(angle=45, size=7))
The line on the chart looks like I want it but the dates on the horizontal axis are not correct: they end in February 2020 in every facet while the dates in the dataframe end in December 2021 and the dates in the first chart begin and end on different months in different facets.
I tried many things but nothing worked. Any suggestions on how to have a chart with dates like in the first chart above and lines like in the second chart above?
Help will be much appreciated.
You may want to shift all dates into the same year, while keeping the original year as a variable:
library(lubridate)
dfgrp %>%
group_by(Month) %>%
mutate(year = year(Date),
adj_date = ymd(paste(2020, month(Date), day(Date)))) %>%
# 2020 was leap year so 2/29 won't be lost
mutate(Time = Date[1:n()],
Z = cumsum(X)) %>%
ggplot(aes(x=adj_date, y=Z, color = year, group = year)) +
geom_line(size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
theme(axis.text.x = element_text(angle=45, size=7))
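The date shift used in this answer can be checked on its own. The following is a minimal sketch with made-up dates; it uses lubridate's make_date() in place of ymd(paste(...)), which is a substitution of mine and not part of the answer.

```r
library(lubridate)

# Hypothetical dates from two different years
dates <- as.Date(c("2020-02-29", "2021-02-28", "2021-12-31"))

orig_year <- year(dates)                               # kept for colouring/grouping
adj_date  <- make_date(2020, month(dates), day(dates)) # same month and day, year 2020

print(data.frame(dates, orig_year, adj_date))
```

Mapping everything onto a leap year keeps 29 February valid, and each facet then covers a single month on its x-axis instead of spanning two years.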

how to order the x-axis (in a graph) in chronological order?

I created some fake (time series) data in R at daily intervals. I want to "aggregate" the data by "week", and then plot the data. I have posted my code below:
#set seed
set.seed(123)
#load libraries
library(xts)
library(ggplot2)
#create data
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
final_data <- data.frame(date_decision_made, property_damages_in_dollars)
#aggregate
y.mon<-aggregate(property_damages_in_dollars~format(as.Date(date_decision_made),
format="%W-%y"),data=final_data, FUN=sum)
y.mon$week = y.mon$`format(as.Date(date_decision_made), format = "%W-%y")`
#Plot
ggplot(y.mon, aes(x = week, y=property_damages_in_dollars))+
geom_line(aes(group=1))+
scale_x_discrete(guide = guide_axis(n.dodge=2))+
theme(axis.text.x = element_text(angle = 45))
However, the x-axis for the graph does not appear to be in chronological order. The points seem to be "alternating" between 2014 and 2015.
Can someone please show me how to fix this?
As @Allan mentioned, it is easy if you keep dates as dates and use scale_x_date to format the x-axis and specify the breaks the way you want.
library(dplyr)
library(ggplot2)
final_data %>%
mutate(date_decision_made = as.Date(date_decision_made, "%Y/%m/%d"),
week = format(date_decision_made, "%W-%y")) %>%
group_by(week) %>%
summarise(property_damages_in_dollars = sum(property_damages_in_dollars),
date = first(date_decision_made)) %>%
ggplot() + aes(x = date, y=property_damages_in_dollars) +
geom_line(aes(group=1)) +
scale_x_date(date_labels = "%W-%y", date_breaks = '1 month') +
theme(axis.text.x = element_text(angle = 45))
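As for why the original axis alternated between years: the "%W-%y" labels are character strings, and a discrete axis sorts them alphabetically, so week 00 of 2014 ends up next to week 00 of 2015. A small sketch with made-up labels shows the effect, along with a year-first label that happens to sort chronologically:

```r
# Hypothetical week labels in the "%W-%y" format used in the question
weeks <- c("52-14", "00-15", "00-14", "01-15")

# Alphabetical sorting interleaves the years
sort(weeks)
# "00-14" "00-15" "01-15" "52-14"

# Putting the year first makes alphabetical order chronological
year_first <- paste(substr(weeks, 4, 5), substr(weeks, 1, 2), sep = "-")
sort(year_first)
# "14-00" "14-52" "15-00" "15-01"
```

Keeping a real Date column, as in the answer above, avoids the problem entirely and is usually the more robust choice.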

Display in an R chart the maximum value of X for a given Y in a dataframe

I'm trying to draw a line chart in R that shows, for each project (Y), the maximum date that corresponds to it. Instead, the chart only shows the single highest date in the whole data frame, not the highest date of each project.
Example of Dataframe :
Chart:
Code:
output$grafica5 <-renderPlotly({
p <- ggplot(data(),aes(x=max(as.Date(data()[,names(data())[3]], format = "%d/%m/%Y")), y=as.factor(data()[,names(data())[2]]), group=1))+
geom_line()+
geom_point()+
theme_classic()+
theme(axis.text.x=element_text(angle=45, hjust=1))+scale_color_viridis(discrete = TRUE)+
labs(
y="Proyectos",
x="Fecha")})
Can you help me? Thanks in advance
You should do your data manipulation before plotting (converting with as.Date and filtering for the max values):
library(dplyr)
library(ggplot2)
# creation of a data.frame with repeated projects with due dates
df <- data.frame(
Projects = paste0("projects", sample(1:9, 20, T)),
Duedate = paste0(sample(10:30, 20, T), "/01/2020" )
)
# creation of a variable `dates` and filtering of the max date per project
df <- df %>%
mutate(dates = as.Date(Duedate, format = "%d/%m/%Y")) %>%
group_by(Projects) %>%
filter(dates == max(dates)) %>%
ungroup()
ggplot(df, aes(x = dates, y = Projects, group = 1)) +
geom_line() +
geom_point()
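The reason the original call showed only one date is that max() inside aes() is evaluated over the whole column, so every row gets the same x value. An alternative to filter() is to collapse each project to one row with summarise(). This is only a sketch on a tiny made-up data frame, with column names borrowed from the answer above:

```r
library(dplyr)

# Hypothetical data: project "a" has two due dates, project "b" has one
df <- data.frame(
  Projects = c("a", "a", "b"),
  Duedate  = c("10/01/2020", "15/01/2020", "12/01/2020")
)

latest <- df %>%
  mutate(dates = as.Date(Duedate, format = "%d/%m/%Y")) %>%
  group_by(Projects) %>%
  # one row per project, holding that project's latest date
  summarise(dates = max(dates), .groups = "drop")

print(latest)
```

Unlike filter(dates == max(dates)), this always returns exactly one row per project, even when the latest date occurs more than once.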

ggplot2 overlayed line chart by year?

Starting with the following dataset:
$ Orders,Year,Date
1608052.2,2019,2019-08-02
1385858.4,2018,2018-07-27
1223593.3,2019,2019-07-25
1200356.5,2018,2018-01-20
1198226.3,2019,2019-07-15
837866.1,2019,2019-07-02
Trying to make a similar format as:
with the criteria: X-axis will be days or months, y-axis will be sum of Orders, grouping / colors will be by year.
Attempts:
1) No overlay
dataset %>%
ggplot( aes(x=`Merge Date`, y=`$ Orders`, group=`Merge Date (Year)`, color=`Merge Date (Year)`)) +
geom_line()
2) ggplot month grouping
dataset %>%
mutate(Date = as.Date(Date)) %>%
mutate(Year = format(Date,'%Y')) %>%
mutate(Month = format(Date,'%b')) -> dataset2
ggplot(data=dataset2, aes(x=Month, y=`$ Orders`, group=Year, color=factor(Year))) +
geom_line(size=.75) +
ylab("Volume")
The lubridate package is your answer. Extract month from the Date field and turn it into a variable. This code worked for me:
library(tidyverse)
library(lubridate)
dataset <- read_delim("OrderValue,Year,Date\n1608052.2,2019,2019-08-02\n1385858.4,2018,2018-07-27\n1223593.3,2019,2019-07-25\n1200356.5,2018,2018-01-20\n1198226.3,2019,2019-07-15\n837866.1,2019,2019-07-02", delim = ",")
dataset <- dataset %>%
mutate(theMonth = month(Date))
ggplot(dataset, aes(x = as.factor(theMonth), y = OrderValue, group = as.factor(Year), color = as.factor(Year))) +
geom_line()
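Since the stated goal was the sum of orders per period, one possible extension is to aggregate before plotting, so that each year has a single point per month. This is a sketch under that assumption, reusing the sample rows from the question; the names orders, monthly and Total are made up.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# The six sample rows from the question
orders <- data.frame(
  OrderValue = c(1608052.2, 1385858.4, 1223593.3, 1200356.5, 1198226.3, 837866.1),
  Date = as.Date(c("2019-08-02", "2018-07-27", "2019-07-25",
                   "2018-01-20", "2019-07-15", "2019-07-02"))
)

monthly <- orders %>%
  mutate(Year  = year(Date),
         Month = month(Date, label = TRUE)) %>%  # ordered factor, Jan first
  group_by(Year, Month) %>%
  summarise(Total = sum(OrderValue), .groups = "drop")

# One line per year, months in calendar order on the x-axis
ggplot(monthly, aes(x = Month, y = Total,
                    group = factor(Year), color = factor(Year))) +
  geom_line() +
  geom_point() +
  labs(y = "Volume", color = "Year")
```

Using month(..., label = TRUE) keeps the months in calendar order because the result is an ordered factor rather than a character column.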
