scale_x_date and how to make it equidistant? - r

How can I, using ggplot2 or any other package, remove the blank space between months that appears because certain months are absent from my x axis? In other words, I want the x axis to look equidistant, without those blank gaps. The code is quite ordinary: it plots certain values against a column of dates (not all months are present in that date column).
filtredplot1<-reactive({
req(res_mod())
dat<-res_mod()
dt<-dat[dat$M_Datum >= input$dateRange[1] & dat$M_Datum <= input$dateRange[2],]
dt[,5]<-as.Date(format(as.Date(dt[,5]), "%Y-%m-01"))
req(dt$M_Datum,dt$Yield)
dr<-data.frame("M_Datum"=dt$M_Datum,"Yield"=dt$Yield)
mydf=aggregate(Yield ~ M_Datum, dr, length)
req(mydf$M_Datum,mydf$Yield)
koka<-data.frame("M_Datum"=mydf$M_Datum,"Yiel"=mydf$Yield)
ggplot(koka, aes(x=factor(format(M_Datum, "%b %Y")), y=Yiel,group = 1)) +
geom_point(size=7,colour="#EF783D",shape=17) +
geom_line(color="#EF783D")+
scale_x_date(labels="%b %Y")
theme(axis.text.x = element_text(angle = 0, vjust = 0.5, hjust=1))+
theme(axis.text.y.left = element_text(color = "#EF783D"),
axis.title.y.left = element_text(color = "#EF783D"))+
ylab("Wafer Quantity")+
xlab("")
})

The code below does not create the data frames dr and mydf; your data preparation is more complicated than it needs to be, and the following simpler version works.
Also, you have the typo Yiel for Yield twice in your code: first when creating koka and again in aes().
suppressPackageStartupMessages({
library(shiny)
library(ggplot2)
})
dt <- data.frame(
M_Datum=c("2018-02-05","2018-02-15","2018-02-10","2018-02-13","2017-02-05",
"2017-02-15","2017-02-10","2017-02-23","2020-02-25","2012-02-15",
"2020-02-10","2020-02-13"),
Yield=c(4,47,18,10,22,50,70,120,150,400,60,78)
)
dt$M_Datum <- as.Date(dt$M_Datum)
req(dt$M_Datum, dt$Yield)
koka <- aggregate(Yield ~ M_Datum, dt, length)
ggplot(koka, aes(x = M_Datum, y = Yield, group = 1)) +
geom_point(size = 7, colour = "#EF783D", shape = 17) +
geom_line(color = "#EF783D") +
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
xlab("") +
ylab("Wafer Quantity") +
theme(
axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1),
axis.text.y.left = element_text(color = "#EF783D"),
axis.title.y.left = element_text(color = "#EF783D")
)
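A continuous date scale places points in proportion to elapsed time, so months with no data still take up room on the axis. If the goal is equal spacing between the months that are actually present, one option is a discrete axis, which is what the factor(format(...)) call in the question was reaching for. The following is only a sketch under assumptions: the data are a shortened, made-up version of dt, and the names month, monthly and label are illustrative rather than taken from the original post.

```r
library(ggplot2)

# Made-up sample in the same shape as dt above
dt <- data.frame(
  M_Datum = as.Date(c("2018-02-05", "2018-02-15", "2017-02-05", "2017-02-23",
                      "2020-02-25", "2012-02-15", "2020-02-10")),
  Yield = c(4, 47, 22, 120, 150, 400, 60)
)

# Floor every date to the first day of its month, then count rows per month
dt$month <- as.Date(format(dt$M_Datum, "%Y-%m-01"))
monthly <- aggregate(Yield ~ month, dt, length)  # result is sorted by month

# Build a factor whose levels follow chronological order; a discrete axis
# spaces its levels evenly, however much time lies between them
month_labels <- format(monthly$month, "%b %Y")
monthly$label <- factor(month_labels, levels = month_labels)

p <- ggplot(monthly, aes(x = label, y = Yield, group = 1)) +
  geom_point(size = 7, colour = "#EF783D", shape = 17) +
  geom_line(colour = "#EF783D") +
  labs(x = NULL, y = "Wafer Quantity")
p
```

Because the x aesthetic is now a factor, scale_x_date() must not be added to this plot; the labels come from the factor levels themselves.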

Related

Having issues with bar chart x axis labels overlapping and spacing

I need to essentially do a graph of x-axis (the date) and y-axis (volume sold) and have each day in a calendar year be represented.
Issues: the first is the white space between dates. I tried to use factor(Date), but ran into problems when I wanted to make additional changes to the graph. The other issue is that the x-axis is currently labelled by month, which looks fine. However, when I try to label it by day, I get ...
In short, it looks like a mess. Probably because I'm trying to put every date in at once. Below is my code as is.
ggplot(MyDataCSV, aes(x = Date, y = Volume)) +
geom_col(stat = "identity",
width = 0.9,
fill = "coral",
alpha = 0.5,
colour = "black",
position = "dodge") +
scale_x_date(date_breaks = "1 day", labels = date_format("%m/%d")) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
ggtitle("Volume Sold by Date") +
theme(plot.title = element_text(hjust = 0.5))
I'm a beginner to R, so I know this is beginner level, but I'm very confused. I have ggplot, tidyverse, dplyr, lubridate, and scales installed. Essentially, I want my labels on the x-axis to look like the first picture, except with every date in my data set (about a year)
One solution is to specify the dimensions of the saved figure, e.g.
# Load libraries
library(tidyverse)
library(lubridate)
# Generate a fake dataset (minimal reproducible example)
df <- data.frame(Date = seq.Date(from = ymd("2021-01-01"),
to = ymd("2021-12-31"),
by = "1 day"),
Volume = runif(365, 0, 4e+08))
# Plot the fake data
ggplot(df, aes(x = Date, y = Volume)) +
geom_col(  # geom_col() already uses stat = "identity", so the argument is dropped
width = 0.9,
fill = "coral",
alpha = 0.5,
colour = "black",
position = "dodge") +
scale_x_date(date_breaks = "1 day", labels = scales::date_format("%m/%d")) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
ggtitle("Volume Sold by Date") +
theme(plot.title = element_text(hjust = 0.5))
# Save the plot
ggsave(filename = "example_1.png", width = 60, height = 10, limitsize = FALSE)
Then, if you zoom in, you can see the dates don't overlap:
Otherwise, you could change the date_breaks to "1 month", or "1 week" to stop the dates overlapping whilst keeping a 'normal' figure size:
# Plot the fake data
ggplot(df, aes(x = Date, y = Volume)) +
geom_col(  # geom_col() already uses stat = "identity", so the argument is dropped
width = 0.9,
fill = "coral",
alpha = 0.5,
colour = "black",
position = "dodge") +
scale_x_date(date_breaks = "1 week", labels = scales::date_format("%m/%d")) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
ggtitle("Volume Sold by Date") +
theme(plot.title = element_text(hjust = 0.5))
Or, the option that I would probably recommend, you could create facets by e.g. every 3 months:
# Load libraries
library(tidyverse)
library(lubridate)
# Generate a fake dataset (minimal reproducible example)
df <- data.frame(Date = seq.Date(from = ymd("2021-01-01"),
to = ymd("2021-12-31"),
by = "1 day"),
Volume = runif(365, 0, 4e+08))
plot_labels <- c(
"1" = "First Quarter, 2021",
"2" = "Second Quarter, 2021",
"3" = "Third Quarter, 2021",
"4" = "Fourth Quarter, 2021"
)
# Plot the fake data
df %>%
mutate(quarter = cut.Date(Date, breaks = "quarter", labels = FALSE)) %>%
ggplot(., aes(x = Date, y = Volume)) +
geom_col(  # geom_col() already uses stat = "identity", so the argument is dropped
width = 0.9,
fill = "coral",
alpha = 0.5,
colour = "black",
position = "dodge") +
scale_x_date(date_breaks = "1 day", labels = scales::date_format("%m/%d")) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
ggtitle("Volume Sold by Date") +
theme(plot.title = element_text(hjust = 0.5)) +
facet_wrap(~ quarter, ncol = 1, scales = "free_x",
labeller = labeller(quarter = plot_labels))

ggplot secondary axis scaling

I'm still pretty much a novice with R and ggplot. I have the following code
library(ggplot2)
library(dplyr)
library(tidyr)
maxDate <- "2020-07-07"
my_dates <- function(d) {
seq( d[1] + (wday(maxDate) - wday(d[1])+1) %% 7, d[2] + 6, by = "week")
}
stateWeekly <- #structure at https://pastebin.com/jT8WV4dy
endpoints <- stateWeekly %>%
group_by(state) %>%
filter(weekStarting == max(weekStarting)) %>%
select(weekStarting, posRate, state, cumRate, posRateChange) %>%
ungroup()
g <- stateWeekly %>% ggplot(aes(x = as.Date(weekStarting))) +
geom_col(aes(y=100*dailyTest), size=0.75, color="darkblue", fill="white") +
geom_line(aes(y=cumRate), size = 0.75, color="red") +
geom_line(aes(y=posRate), size = 0.75, color="forestgreen") +
geom_point(data = endpoints,size = 1.5,shape = 21,
aes(y = cumRate), color = "red", fill = "red", show.legend = FALSE) +
geom_label(data=endpoints, aes(label=paste(round(cumRate,1),"%",sep=""),
x=as.Date("2020-04-07", format="%Y-%m-%d"), y = 80),
color="red",
show.legend = FALSE,
nudge_y = 12) +
geom_label(data=endpoints, aes(label=paste(round(posRateChange,1),"%",sep=""),
x=as.Date("2020-04-28", format="%Y-%m-%d"), y = 80),
color="forestgreen",
show.legend = FALSE,
nudge_y = 12) +
scale_y_continuous(name = "Cum Test Positivity Rate",
sec.axis = sec_axis(~./100, name="Weekly % of Pop Tested")) +
scale_x_date(breaks = my_dates, date_labels = "%b %d") +
labs(x = "Week Beginning") +
#title = "COVID-19 Testing",
#subtitle = paste("Data as of", format(maxDate, "%A, %B %e, %y")),
#caption = "HQ AFMC/A9A \n Data: The COVID Tracking Project (https://covidtracking.com)") +
theme(plot.title = element_text(size = rel(1), face = "bold"),
plot.subtitle = element_text(size = rel(0.7)),
plot.caption = element_text(size = rel(1)),
axis.text.y = element_text(color='red'),
axis.title.y = element_text(color="red"),
axis.text.y.right = element_text(color="blue"),
axis.title.y.right = element_text(color="blue"),
axis.text.x = element_text(angle = 45,hjust = 1),
strip.background =element_rect(fill="white"),
strip.text = element_text(colour = 'blue')) +
#coord_cartesian(ylim=c(0,90)) +
facet_wrap(~ state)
print(g)
Which produces this chart
Georgia has obviously been screwing with their COVID data (again) so nevermind the negative testing :)
What I'd like to do is scale the secondary axis so that the testing rate lines aren't so squished...they're very small numbers but I'd like to be able to see more differentiation. Any guidance on how to achieve that would be most appreciated.
Edit:
One suggestion below was to change facet_wrap(~ state) to facet_wrap(~ state, scales='free'). Doing so changes the chart only slightly:
I can fix the label anchors, but this really didn't offer the level of differentiation in the line plots I was hoping for.
A second suggestion was to change sec.axis = sec_axis(~./100, name="Weekly % of Pop Tested")) to sec.axis = sec_axis(~./1000, name="Weekly % of Pop Tested"))
As far as I can tell, that does nothing to the actual plot and just changes the axis markings:
Finally, I've been struggling to implement the solution found here from Dag Hjermann. My secondary axis is the Weekly % of Population Tested, which is represented in the geom_col. A reasonable range for that is 0-1.1. The primary axis is the line plots, test positivity rates, which vary from 0-30. So if I follow that solution, I should add
ylim.prim <- c(0, 30)
ylim.sec <- c(0, 1.1)
b <- diff(ylim.prim)/diff(ylim.sec)
a <- b*(ylim.prim[1] - ylim.sec[1])
and then change the plot code to read
geom_col(aes(y=a + 100*dailyTest*b), size=0.75, color="darkblue", fill="white")
and the secondary axis to
sec.axis = sec_axis(~ (. -a)/(b*100), name="Weekly % of Pop Tested"))
Doing so produces the following
which is clearly not right.
At the risk of sounding really dumb here, is the issue at least somewhat due to the line plots (what I want to scale) being on the primary axis?
Maybe use permille instead of percentage.
scale_y_continuous(name = "Cum Test Positivity Rate",
sec.axis = sec_axis(~./1000, name="Weekly ‰ of Pop Tested"))
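For reference, the rescaling approach described in the question can be sketched on its own. ggplot2 draws everything in primary-axis units, so the secondary series has to be mapped into the primary range with a linear transformation, and sec_axis() has to apply the exact inverse of that transformation. This is a minimal sketch under assumptions, not the asker's actual data: the data frame weekly and its columns posRate and pctTested are made up, and a is computed as ylim.prim[1] - b * ylim.sec[1], which gives the same value as the question's formula whenever both ranges start at 0.

```r
library(ggplot2)

# Hypothetical weekly data: positivity in percent (0-30),
# share of population tested in percent (0-1.1)
set.seed(1)
weekly <- data.frame(
  week = seq(as.Date("2020-04-06"), by = "week", length.out = 12),
  posRate = runif(12, 5, 30),
  pctTested = runif(12, 0.1, 1.1)
)

ylim.prim <- c(0, 30)   # range of the primary (left) axis
ylim.sec <- c(0, 1.1)   # range of the secondary (right) axis

# Linear map from secondary units to primary units: prim = a + b * sec
b <- diff(ylim.prim) / diff(ylim.sec)
a <- ylim.prim[1] - b * ylim.sec[1]

p <- ggplot(weekly, aes(x = week)) +
  # Columns are drawn in primary units after the forward transformation
  geom_col(aes(y = a + b * pctTested), colour = "darkblue", fill = "white") +
  geom_line(aes(y = posRate), colour = "forestgreen") +
  scale_y_continuous(
    name = "Test Positivity Rate (%)",
    # The secondary axis undoes the transformation to label in original units
    sec.axis = sec_axis(~ (. - a) / b, name = "Weekly % of Pop Tested")
  )
p
```

With these ranges a column of height 1.1 on the right axis reaches 30 on the left axis, so the bars use the full height of the panel instead of being squashed near zero.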

ggplot - x-axis shows data beyond specified range for longer time periods

Here's some sample data for a company's Net revenue split by two cohorts:
data <- data.frame(dates = rep(seq(as.Date("2000/1/1"), by = "month", length.out = 48), each = 2),
revenue = rep(seq(10000, by = 1000, length.out = 48), each = 2) * rnorm(96, mean = 1, sd = 0.1),
cohort = c("Group 1", "Group 2"))
I can show one year's worth of data and it returns what I would expect:
start = "2000-01-01"
end = "2000-12-01"
ggplot(data, aes(fill = cohort, x = dates, y = revenue)) +
geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
xlab("Month") +
ylab("Net Revenue") +
geom_text(aes(label = round(revenue, 0)), vjust = -0.5, size = 3, position = position_dodge(width = 25)) +
scale_x_date(date_breaks = "1 month", limits = as.Date(c(start, end))) +
ggtitle("Monthly Revenue by Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title = element_text(hjust = 0.5)) +
scale_fill_manual(values=c("#00BFC4", "#F8766D"))
But if I expand the date range to two years or more and rerun the graph, it shows additional months on both sides of the x-axis despite not displaying any information on the y-axis.
start = "2000-01-01"
end = "2001-12-01"
#rerun the ggplot code from above
Note the nonexistent data points for 1999-12-01 and 2002-01-01. Why do these appear and how can I remove them?
Many (all?) of the scale_* functions take expand= as an argument. It's common in R plots (both base and ggplot2) to expand the axes just a little bit (by default 5% on each end for continuous ggplot2 scales, and about 4% in base graphics), so that none of the lines/points are scrunched up against the "box" boundary.
If you include expand=c(0,0), you get what you want.
(BTW: you have mismatched parens. Fixed here.)
ggplot(data, aes(fill = cohort, x = dates, y = revenue)) +
geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
xlab("Month") +
ylab("Net Revenue") +
geom_text(aes(label = round(revenue, 0)), vjust = -0.5, size = 3, position = position_dodge(width = 25)) +
scale_x_date(date_breaks = "1 month", limits = as.Date(c(start, end)), expand = c(0, 0)) +
ggtitle("Monthly Revenue by Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title = element_text(hjust = 0.5)) +
scale_fill_manual(values=c("#00BFC4", "#F8766D"))
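To see what expand does in isolation, the horizontal extent of the panel can be compared with and without the padding. This is a minimal sketch under assumptions: it uses a made-up data frame df with a simple line instead of the dodged bars above, and it reads the range from the object returned by ggplot_build(), an internal structure that may differ between ggplot2 versions.

```r
library(ggplot2)

# Made-up monthly series covering two years
df <- data.frame(
  dates = seq(as.Date("2000-01-01"), by = "month", length.out = 24),
  revenue = seq(10000, by = 1000, length.out = 24)
)

base <- ggplot(df, aes(x = dates, y = revenue)) + geom_line()

# Default padding versus no padding on the x axis
padded <- base + scale_x_date(date_breaks = "1 month")
tight <- base + scale_x_date(date_breaks = "1 month", expand = c(0, 0))

# Horizontal extent of the panel, in days since 1970-01-01
range_padded <- ggplot_build(padded)$layout$panel_params[[1]]$x.range
range_tight <- ggplot_build(tight)$layout$panel_params[[1]]$x.range

as.Date(range_padded, origin = "1970-01-01")  # starts before 2000-01-01
as.Date(range_tight, origin = "1970-01-01")   # matches the data exactly
```

With a two-year span the default padding is wide enough to reach into the neighbouring months, which is why monthly breaks for 1999-12-01 and 2002-01-01 show up; with a one-year span the padding is too narrow to include them.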
I am not sure what exactly the issue is, but if you change the x-axis from the "Date" class to any other class, it seems to work as expected. It also helps to filter the data to the specific range before passing it to ggplot.
For example in this case changing dates to month-year format,
library(dplyr)
library(ggplot2)
start = as.Date("2000-01-01")
end = as.Date("2001-12-01")
all_fac <- c(outer(month.abb, 2000:2001, paste, sep = "-"))
data %>%
filter(between(dates, start, end)) %>%
mutate(dates = factor(format(dates, "%b-%Y"),levels = all_fac)) %>%
ggplot() + aes(fill = cohort, x = dates, y = revenue) +
geom_bar(stat = "identity", position = "dodge") +
xlab("Month") +
ylab("Net Revenue") +
geom_text(aes(label = round(revenue, 0))) +
ggtitle("Monthly Revenue by Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title =
element_text(hjust = 0.5)) +
scale_fill_manual(values=c("#00BFC4", "#F8766D"))
You may want to beautify or reposition the labels on the bars.

How to use scale_x_date properly

I am a new user in R and I hope you can help me.
setwd("C:/Users/USER/Desktop/Jorge")
agua <- read_excel("agua.xlsx")
pbi <- read_excel("PBIagro.xlsx")
str(agua);
names(agua)[2] <- "Variación";
agua[,1] <- as.Date(agua$Trimestre)
lagpbi <- lag(pbi$PBIAgropecuario, k=1)
pbi[,3]<- lagpbi; pbi <- pbi[-c(1),];
names(pbi)[3] <- "PBIlag"
growth <- ((pbi$PBIAgropecuario-pbi$PBIlag)/pbi$PBIlag)*100
Anual_growth <- data.frame(growth); Anual_growth[,2] <- pbi$Año; names(Anual_growth)[2] <- "Año"
# Plot
Agro <- ggplot(Anual_growth, aes(x=Año, y=growth)) +
geom_line(color="steelblue") +
geom_point() +
geom_text(aes(label = round(Anual_growth$growth, 1)),
vjust = "inward", hjust = "inward", size=2.5, show.legend = FALSE) +
xlab("") +
theme_ipsum() +
theme(axis.text.x=element_text(angle=60, hjust=1)) +
ylim(-9.9,13.4) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
axis.line.x = element_blank(), plot.margin = unit(c(1,1,0.5,1),"cm"),
axis.line.y = element_blank(), axis.text.x=element_text(face = "bold", size=8,
angle=1,hjust=0.95,vjust=0.2),
axis.text.y = element_blank(), axis.title.y=element_blank())+
scale_x_continuous("Año", labels = as.character(Anual_growth$Año), breaks = Anual_growth$Año)
print(Agro)
The problem is that it shows all the years, but I only want even years on the x-axis, i.e. years with a step of 2.
I hope you can really help me.
Thank you.
Notice that the X-axis variable is a numeric string.
You can add something like
scale_x_date(date_breaks = "2 years", date_labels = "%Y") to your ggplot.
This is how it looks with my data, since you haven't posted yours. I am plotting a variable of class Date on the x axis.
1.
ggplot(mydata) +
aes(x = date, y = number, color = somevar) +
geom_line()
2.
ggplot(mydata) +
aes(x = date, y = number, color = somevar) +
geom_line() +
scale_x_date(date_breaks = "1 year", date_labels = "%Y")
3.
ggplot(mydata) +
aes(x = date, y = number, color = somevar) +
geom_line() +
scale_x_date(date_breaks = "2 years", date_labels = "%Y")
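Note that scale_x_date() only works when the x variable is of class Date, so a numeric year column would have to be converted first. The following is a sketch under assumptions: growth_df, year_date and even_years are made-up names, and each year is represented by its 1 January purely for plotting.

```r
library(ggplot2)

# Made-up yearly series with a numeric year column
set.seed(42)
growth_df <- data.frame(year = 1998:2020, growth = rnorm(23, mean = 3))

# Represent each year by its 1 January so the axis can be a date scale
growth_df$year_date <- as.Date(paste0(growth_df$year, "-01-01"))

# Explicit breaks on even years; date_breaks = "2 years" also gives a
# two-year step, but which years get labelled then depends on the data range
even_years <- growth_df$year_date[growth_df$year %% 2 == 0]

p <- ggplot(growth_df, aes(x = year_date, y = growth)) +
  geom_line(colour = "steelblue") +
  geom_point() +
  scale_x_date(breaks = even_years, date_labels = "%Y")
p
```

Passing the breaks explicitly keeps control over exactly which years are labelled, at the cost of one extra line of data preparation.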
If you want even years, and because your x-axis variable is numeric, you can specify in scale_x_continuous that the breaks argument should take only even numbers.
Here is how you can do it, using this small example:
year = 1998:2020
value = rnorm(23,mean = 3)
df = data.frame(year,value)
library(ggplot2)
ggplot(df, aes(x = year, y = value))+
geom_point()+
geom_line()+
scale_x_continuous(breaks = year[year %%2 ==0])
Conversely, if you want odd years, you just have to specify scale_x_continuous(breaks = year[year %%2 != 0])
So, in your code, you should write:
scale_x_continuous(breaks = Anual_growth$Año[Anual_growth$Año %%2 ==0])
Does that answer your question?

How to add more number of labels on x-axis using ggplot

I have the following plot but I want to add additional labels on the x axis.
I've already tried scale_x_continuous, but it doesn't work because my values are dates, not numeric values.
How can I solve this?
If by "more x values" you mean that you would like to have more labels on your x-axis, then you can adjust the frequency using the date_breaks argument of scale_x_date, like so:
scale_x_date(date_breaks = "1 month", date_labels = "%b-%y")
Here is my working example. Please post your own if I misunderstood your question:
library("ggplot2")
# make the results reproducible
set.seed(5117)
start_date <- as.Date("2015-01-01")
end_date <- as.Date("2017-06-10")
# the by=7 makes it one observation per week (adjust as needed)
dates <- seq(from = start_date, to = end_date, by = 7)
val1 <- rnorm(length(dates), mean = 12.5, sd = 3)
qnt <- quantile(val1, c(.05, .25, .75, .95))
mock <- data.frame(myDate = dates, val1)
ggplot(data = mock, mapping = aes(x = myDate, y = val1)) +
geom_line() +
geom_point() +
geom_hline(yintercept = qnt[1], colour = "red") +
geom_hline(yintercept = qnt[4], colour = "red") +
geom_hline(yintercept = qnt[2], colour = "lightgreen") +
geom_hline(yintercept = qnt[3], colour = "lightgreen") +
theme_classic() +
scale_x_date(date_breaks = "1 month", date_labels = "%b-%y") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
