ggplot delete specific x-axis labels - r

library(tidyverse)
df <- data.frame(date = as.Date(c("2017-12-01", "2018-01-01", "2018-02-01",
"2018-03-01", "2018-04-01", "2018-05-01",
"2018-06-01", "2018-07-01", "2018-08-01",
"2018-09-01", "2018-10-01", "2018-11-01")),
value = c(0.567859562, 0.514907158, 0.035399304, 0.485728823,
0.925127361, 0.237531067, 0.301930968, 0.133373326,
0.082275426, 0.464255614, 0.2366749, 0.652084264))
ggplot(df, aes(date, value)) +
geom_col() +
scale_x_date(date_breaks = "1 month",
date_labels = "%b") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.3))
I want to maintain my plot shown below, exactly as is, with two exceptions. I want to remove the first Nov on the x-axis label and the last Dec on the x-axis label. I added coord_cartesian(xlim = as.Date(c("2017-12-01", "2018-11-01"))) to my code chunk above, but this eliminates the 'blank space' padding at either end of my x-axis.
How do I simply tell ggplot to delete the text of the first and last x-axis labels? This would be the first Nov and the last Dec. Note that these do not exist in my df data frame at all, so dplyr filters probably won't work.

You could achieve what you want by setting the breaks explicitly with seq.Date:
library(tidyverse);library(lubridate)
df <- data.frame(date = as.Date(c("2017-12-01", "2018-01-01", "2018-02-01",
"2018-03-01", "2018-04-01", "2018-05-01",
"2018-06-01", "2018-07-01", "2018-08-01",
"2018-09-01", "2018-10-01", "2018-11-01")),
value = c(0.567859562, 0.514907158, 0.035399304, 0.485728823,
0.925127361, 0.237531067, 0.301930968, 0.133373326,
0.082275426, 0.464255614, 0.2366749, 0.652084264))
ggplot(df, aes(date, value)) +
geom_col() +
scale_x_date(
date_labels = "%b",
breaks = seq.Date(ymd("2017-12-01"),ymd("2018-11-01"), by = "month")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.3))

I think this is what you want. The date_breaks argument is unnecessary; pass the dates in the data as breaks instead.
ggplot(df, aes(date, value)) +
geom_col() +
scale_x_date(date_labels = "%b", breaks = df$date) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.3))
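Both answers above rest on the same idea: if breaks are supplied only for dates that are present in the data, ggplot has nothing to label in the padded area, so the padding stays but the extra Nov and Dec labels disappear. A minimal base R sketch of what the breaks vector contains (the names data_dates and month_breaks are illustrative, not from the original posts):

```r
# Dates as they appear in the question's data frame
data_dates <- as.Date(c("2017-12-01", "2018-01-01", "2018-02-01",
                        "2018-03-01", "2018-04-01", "2018-05-01",
                        "2018-06-01", "2018-07-01", "2018-08-01",
                        "2018-09-01", "2018-10-01", "2018-11-01"))

# seq() on Date objects dispatches to seq.Date(); by = "month" steps one
# calendar month at a time from the first bar to the last bar
month_breaks <- seq(min(data_dates), max(data_dates), by = "month")

# The generated breaks coincide with the data's own dates, which is why
# breaks = df$date gives the same axis here
all(month_breaks == data_dates)

# Labels that date_labels = "%b" would produce (month names depend on locale)
format(month_breaks, "%b")
```

Either vector can be passed as the breaks argument of scale_x_date().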

I would suggest looking into the lubridate package in R. It can parse messy date values into proper Date or POSIXct objects and makes it easy to extract the month. You could also store the month in its own column and use that as your axis label, which is cleaner, and you can then map fill to that column and do some other cool stuff!
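As a rough illustration of the month extraction mentioned above, assuming the lubridate package is installed (the three dates and the column name month are chosen for illustration only). One caveat: a month-only column sorts Jan to Dec, so it loses the Dec 2017 to Nov 2018 order of the original axis.

```r
library(lubridate)

df <- data.frame(date = as.Date(c("2017-12-01", "2018-01-01", "2018-11-01")))

# Numeric month of each date
month(df$date)

# Abbreviated month name as an ordered factor (names depend on locale)
df$month <- month(df$date, label = TRUE)

# The factor levels run Jan..Dec, so December 2017 would sort after
# November 2018 if this column were used directly as the x aesthetic
as.integer(df$month)
```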

Related

How to order geom_segment ggplot with colour

I am new to the ggplot library and am trying to draw a plot using the following data frame:
library(tidyverse)
df <-tribble(~event, ~startdate,~enddate,~loc,
"A",as.POSIXct("1984/02/10"),as.POSIXct("1987/06/10"),"1",
"B",as.POSIXct("1984/02/11"),as.POSIXct("1990/02/12"),"2",
"A",as.POSIXct("1992/05/15"),as.POSIXct("1999/06/15"),"3",
"C",as.POSIXct("2003/08/29"),as.POSIXct("2015/08/29"),"4",
"B",as.POSIXct("2002/04/11"),as.POSIXct("2012/04/12"),"5",
"E",as.POSIXct("2000/02/10"),as.POSIXct("2005/02/15"),"6")
max_date = max(df$startdate,df$enddate)
Using the following code snippet:
ggplot(NULL)+
geom_segment(data = df,aes(x=loc, xend =loc,y = startdate, yend = enddate,colour=event),size = 5,alpha=0.6) +
geom_label(aes(label=df$event,x = df$loc,y=max_date), size=2) +
#geom_point(data=final_df,aes(x=newspaper,y=date),color="black") + Point from other data frame
coord_flip() + xlab("LoC") + ylab("Year")
I am able to produce the following chart:
How can I order the above chart by colour, i.e. by the event field (in other words, how can I group by event so that all A events are displayed first, then B, C, etc.)? I have tried scale_x_continuous and reorder, but neither worked. How can I also display more "Year" labels on the x-axis? I tried scale_x_date (mentioned in "R: ggplot display all dates on x axis"), but it needs as.Date, whereas my geom_segment data is in POSIXct format. Please feel free to correct me!
Any help would be great! Thank you!
Two options. I've also reversed your x and y so you don't have to use coord_flip() and made several other small modifications including the x-axis labels (you were looking for scale_y_datetime since you flipped the axes and the "dates" were actually in POSIXct). Also, one difference with Duck's answer is my scales = "free" in facet_grid. You might decide your labels and your "loc" variable may not make sense given these new graphs anyway.
library(tibble); library(ggplot2)
df <-tribble(~event, ~startdate,~enddate,~loc,
"A",as.POSIXct("1984/02/10"),as.POSIXct("1987/06/10"),"1",
"B",as.POSIXct("1984/02/11"),as.POSIXct("1990/02/12"),"2",
"A",as.POSIXct("1992/05/15"),as.POSIXct("1999/06/15"),"3",
"C",as.POSIXct("2003/08/29"),as.POSIXct("2015/08/29"),"4",
"B",as.POSIXct("2002/04/11"),as.POSIXct("2012/04/12"),"5",
"E",as.POSIXct("2000/02/10"),as.POSIXct("2005/02/15"),"6")
max_date = max(df$startdate,df$enddate)
ggplot(df)+
geom_segment(aes(y=event, yend = event, x = startdate, xend = enddate, colour=event),size = 5,alpha=0.6) +
geom_label(aes(label=event, y = event, x=max_date), size=2) +
xlab("Year") + ylab("LoC") +
scale_x_datetime(date_breaks = "year", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
ggplot(df)+
geom_segment(aes(y=loc, yend = loc, x = startdate, xend = enddate, colour=event),size = 5,alpha=0.6) +
geom_label(aes(label=event, y = loc, x=max_date), size=2) +
xlab("Year") + ylab("LoC") +
scale_x_datetime(date_breaks = "year", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
facet_grid(rows = vars(event), scales = "free")
Created on 2020-10-18 by the reprex package (v0.3.0)
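The two plots above group by event through the y axis or through facets. If the goal is instead to keep one row per loc and only sort those rows by event, one option is to fix the factor levels of loc, since ggplot2 draws a discrete axis in factor-level order. This is only a sketch under that assumption, using the event/loc pairs from the question with the dates left out:

```r
# Same event/loc pairs as in the question (dates omitted for brevity)
df <- data.frame(event = c("A", "B", "A", "C", "B", "E"),
                 loc   = c("1", "2", "3", "4", "5", "6"),
                 stringsAsFactors = FALSE)

# Order the loc values by event, then fix that order as the factor levels
df$loc <- factor(df$loc, levels = df$loc[order(df$event)])

levels(df$loc)
```

The levels come out as 1, 3, 2, 5, 4, 6, i.e. both A rows first, then both B rows, then C and E. Whether that ordering still makes sense for the loc variable is a judgment call.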
Consider this as an option. As mentioned by @ArthurYip, reordering could affect the sense of your plot. You could drop the labels and use facet_grid() in the following way:
library(ggplot2)
#Plot
ggplot(df)+
geom_segment(aes(x=loc, xend =loc,y = startdate, yend = enddate,colour=event),size = 5,alpha=0.6) +
coord_flip() + xlab("LoC") + ylab("Year")+
facet_grid(event~.,switch = "x")

How to plot the graph using all the dates in x axis? [duplicate]

I'm having a very, very tough time getting the x-axis to look correct for my graphs.
Here is my data (generated via dput()):
df <- structure(list(Month = structure(1:12, .Label = c("2011-07-31", "2011-08-31", "2011-09-30", "2011-10-31", "2011-11-30", "2011-12-31", "2012-01-31", "2012-02-29", "2012-03-31", "2012-04-30", "2012-05-31", "2012-06-30"), class = "factor"), AvgVisits = c(6.98655104580674,7.66045407330464, 7.69761337479304, 7.54387561322994, 7.24483848458728, 6.32001400498928, 6.66794871794872, 7.207780853854, 7.60281201431308, 6.70113837397123, 6.57634103019538, 6.75321935568936)), .Names = c("Month","AvgVisits"), row.names = c(NA, -12L), class = "data.frame")
Here is the chart I am trying to graph:
ggplot(df, aes(x = Month, y = AvgVisits)) +
geom_bar() +
theme_bw() +
labs(x = "Month", y = "Average Visits per User")
That chart works fine - but, if I want to adjust the formatting of the date, I believe I should add this:
scale_x_date(labels = date_format("%m-%Y"))
I'm trying to make it so the date labels are 'MMM-YYYY'
ggplot(df, aes(x = Month, y = AvgVisits)) +
geom_bar() +
theme_bw() +
labs(x = "Month", y = "Average Visits per User") +
scale_x_date(labels = date_format("%m-%Y"))
When I plot that, I continue to get this error:
stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
Despite hours of research on formatting of geom_line and geom_bar, I can't fix it. Can anyone explain what I'm doing wrong?
Edit: As a follow-up thought: Can you use date as a factor, or should you use as.Date on a date column?
To show months as Jan 2017 Feb 2017 etc:
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")
Angle the dates if they take up too much space:
theme(axis.text.x=element_text(angle=60, hjust=1))
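The date_labels strings are strftime-style format codes, so they can be tried out with format() on a Date before going into the scale. A small sketch using the first date from the question's data (the variable d is just for illustration):

```r
d <- as.Date("2011-07-31")

# %m is the zero-padded month number, %Y the four-digit year
format(d, "%m-%Y")   # "07-2011"

# %b is the abbreviated month name (locale-dependent, "Jul" in an English locale)
format(d, "%b %Y")
```

So the 'MMM-YYYY' style asked for in the question corresponds to "%b-%Y" rather than "%m-%Y".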
Can you use date as a factor?
Yes, but you probably shouldn't.
...or should you use as.Date on a date column?
Yes.
Which leads us to this:
library(scales)
df$Month <- as.Date(df$Month)
ggplot(df, aes(x = Month, y = AvgVisits)) +
geom_bar(stat = "identity") +
theme_bw() +
labs(x = "Month", y = "Average Visits per User") +
scale_x_date(labels = date_format("%m-%Y"))
in which I've added stat = "identity" to your geom_bar call.
In addition, the message about the binwidth wasn't an error. An error will actually say "Error" in it, and similarly a warning will always say "Warning" in it. Otherwise it's just a message.
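The distinction can be checked programmatically, because R signals messages, warnings, and errors as different condition classes. A minimal base R sketch (the helper classify() is hypothetical, written only for this illustration):

```r
# Report which kind of condition, if any, an expression signals
classify <- function(expr) {
  tryCatch(
    {
      expr            # the argument is evaluated here, inside tryCatch
      "no condition"
    },
    message = function(m) "message",
    warning = function(w) "warning",
    error   = function(e) "error"
  )
}

classify(message("stat_bin: binwidth defaulted to range/30."))  # "message"
classify(warning("something looks off"))                        # "warning"
classify(stop("something is broken"))                           # "error"
classify(1 + 1)                                                 # "no condition"
```

Only the last two kinds change how the code runs by default: an error stops execution, a warning is reported after the call finishes, and a message is purely informational.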

How to add every day as an x axis label in ggplot

I'd like to make a plot with every day labeled on the x axis.
Here is my data
my_data <- read.table(text="day value
11/15/19 0.23633
11/16/19 0.28485
11/17/19 0.63127
11/18/19 0.15434
11/19/19 0.47964
11/20/19 0.65967
11/21/19 0.48741
11/22/19 0.84541
11/23/19 0.10123
11/24/19 0.78169
11/25/19 0.23189
11/26/19 0.86665
11/27/19 0.55184
11/28/19 0.81410
11/29/19 0.25821
11/30/19 0.23576
12/1/19 0.46397
12/2/19 0.55764
12/3/19 0.95645
12/4/19 0.63954
12/5/19 0.76766
12/7/19 0.74505
12/8/19 0.65515
12/9/19 0.58222
12/10/19 0.17294", header=TRUE, stringsAsFactors=FALSE)
Here is my code
my_data %>%
ggplot(aes(day, value)) +
geom_line() +
scale_x_continuous(breaks = seq(1, nrow(my_data)),
labels = my_data$day)
It gives me this error: Error in as.Date.numeric(value) : 'origin' must be supplied
I'd like every day to be labeled on the x axis; by default only a few of the days in this range are labeled.
Try scale_x_date instead of scale_x_continuous, parsing the dates with lubridate's mdy():
library(lubridate)
my_data %>%
ggplot(aes(x = mdy(day), value)) +
geom_line() +
scale_x_date(date_breaks = "1 day")+
theme(axis.text.x = element_text(angle = 45, hjust = 1))
You can use lubridate to convert the data to a proper date format:
library(lubridate)
my_data %>%
mutate(day = mdy(day)) %>%
ggplot(aes(day, value)) +
geom_line() +
scale_x_date(date_breaks = "1 day") +
theme(axis.text.x = element_text(angle=90, vjust=0.5))
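For reference, the conversion that mdy() performs here can also be written in base R with as.Date() and an explicit format. A minimal sketch on three of the values from the question (the variable names are illustrative):

```r
# Three of the day strings from the question's data
day_chr <- c("11/15/19", "12/1/19", "12/10/19")

# %m = month, %d = day, %y = two-digit year; single-digit days parse fine
day <- as.Date(day_chr, format = "%m/%d/%y")

class(day)                 # "Date"
format(day, "%Y-%m-%d")    # "2019-11-15" "2019-12-01" "2019-12-10"
```

Once the column is a Date, scale_x_date(date_breaks = "1 day") can place one break per day.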

How to make scale_x_date week start with Sunday

I'm creating a weekly time series chart, and each week should start on Sunday. When I specify scale_x_date(breaks = date_breaks('1 week')), the grid lines and labels start on Monday, so the result looks slightly off. How can I force the scale_x_date weeks to start on Sunday?
This is an example of my code:
library(ggplot2)
library(scales)
data.set <- structure(list(
week.start = structure(c(15732, 15739, 15746, 15753, 15760, 15767,
15774, 15781, 15788, 15795, 15802, 15809), class = "Date"),
overtime.avg = c(2.8, 2.85666666666667, 2.18333333333333,
2.44666666666667, 2.04833333333333, 2.45833333333333,
2.12833333333333, 1.81666666666667, 1.82166666666667,
1.54333333333333, 2.09166666666667, 0.970833333333333)),
.Names = c("week.start", "overtime.avg"),
row.names = 29733:29744, class = "data.frame")
ggplot(data = data.set,
aes(x = week.start,
y = overtime.avg)) +
geom_line() +
geom_point() +
scale_x_date(breaks = date_breaks("1 week"),
labels = date_format(format = "%Y-%m-%d"))
One way would be to use seq() to supply your own break points, starting from the first Sunday (here the minimum of week.start, which is a Sunday) and stepping with by = "week".
ggplot(data = data.set,aes(x = week.start,y = overtime.avg)) +
geom_line() +
geom_point() +
scale_x_date(breaks = seq(min(data.set$week.start),max(data.set$week.start),by="week"),
labels = date_format(format = "%Y-%m-%d"))
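The answer above works because the smallest week.start happens to be a Sunday. If that were not guaranteed, one option is to step back to the preceding Sunday first. A base R sketch, relying on as.POSIXlt()$wday counting days since Sunday (0 = Sunday); first_sunday and sunday_breaks are illustrative names:

```r
# week.start values from the question: 12 weekly dates
week_start <- as.Date(15732 + 7 * 0:11, origin = "1970-01-01")

# $wday is 0 for Sunday, 1 for Monday, ..., so subtracting it moves a date
# back to the Sunday on or before it
first_day <- min(week_start)
first_sunday <- first_day - as.POSIXlt(first_day)$wday

# Weekly breaks anchored on that Sunday
sunday_breaks <- seq(first_sunday, max(week_start), by = "week")

# Check that every break falls on a Sunday
all(as.POSIXlt(sunday_breaks)$wday == 0)
```

The resulting vector can be passed as scale_x_date(breaks = sunday_breaks, ...) in place of the seq() call in the answer.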
