Trouble with horizontal and vertical lines accepting class "Date" in ggplot? - r

I'm trying to make a Gantt chart in ggplot based on the generous code offered by user Didzis Elferts. I'm trying to add a vertical line showing today's date, but the geom_vline layer in the ggplot2 package simply returns Error: Discrete value supplied to continuous scale. Here is my code:
today <- as.Date(Sys.Date(), "%m/%d/%Y")
library(scales)
ggplot(mdfr, aes(time,name, colour = is.critical)) +
geom_line(size = 6) +
xlab("") + ylab("")+
labs(title="Sample Project Progress")+
theme_bw()+
scale_x_datetime(breaks=date_breaks("1 year"))+
geom_vline(aes(xintercept=today))
The plot without the geom_vline layer looks like this:
Is there any reason why geom_vline wouldn't work with class "Date"?
EDIT: Reproducible code used to generate plot:
### GANTT CHART 1 ###############
library(reshape2)  # provides melt()
tasks <- c("Meetings", "Client Calls", "Design", "Bidding", "Construction")
dfr <- data.frame(
name = factor(tasks, levels = tasks),
start.date = c("07/08/2013", "07/08/2013", "07/23/2013", "08/30/2013", "9/30/2013"),
end.date = c("07/12/2013", "07/13/2013", "08/15/2013", "09/12/2013", "12/01/2013"),
is.critical = c(TRUE, FALSE, TRUE, TRUE, TRUE))
mdfr <- melt(dfr, measure.vars = c("start.date", "end.date"))
mdfr$time <- as.POSIXct(strptime(mdfr$value,"%m/%d/%Y"))

There are two things you need to change in your code.
First, because you create the time column in mdfr with as.POSIXct(), do the same for today, so that both variables have the same class. Sys.Date() already returns a Date, so no format string is needed:
today <- as.POSIXct(Sys.Date())
Second, use as.numeric() inside the geom_vline() around today.
+ geom_vline(aes(xintercept=as.numeric(today)))
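To see why both changes matter, here is a minimal base-R sketch (not from the original answer; the variable names are only illustrative). A Date counts days since 1970-01-01, while a POSIXct counts seconds, so a Date placed on a datetime axis lands in the wrong place unless it is converted first.

```r
# Illustrative only: compare the two classes and their underlying numbers.
today_date <- Sys.Date()               # class "Date": days since 1970-01-01
today_time <- as.POSIXct(today_date)   # class "POSIXct": seconds since 1970-01-01 UTC

class(today_date)
class(today_time)

# The number that as.numeric() hands to geom_vline() is in seconds,
# which is the unit a datetime scale works in.
days    <- as.numeric(today_date)
seconds <- as.numeric(today_time)
seconds == days * 86400   # midnight UTC of the same day
```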

Related

ggplotly fails with geom_vline() with xintercept Date value

Trying to use ggplotly to graph time series data with a vertical line to indicate dates of interest.
Call fails with Error in Ops.Date(z[[xy]], 86400000) : * not defined for "Date" objects. I have tried unsuccessfully using both the latest CRAN and development versions of ggplot2 (as per plotly recommendation). Other SO questions (e.g., ggplotly and geom_bar when using dates - latest version of plotly (4.7.0)) do not address my concerns.
As illustrated below with plot object p - both ggplot and ggplotly work as expected. However, when a geom_vline() is added to the plot in p2, it only works correctly in ggplot, failing when calling ggplotly(p2).
library(plotly)
library(ggplot2)
library(magrittr)
set.seed(1)
df <- data.frame(date = seq(from = lubridate::ymd("2019-01-01"), by = 1, length.out = 10),
y = rnorm(10))
p <- df %>%
ggplot(aes(x = date, y = y)) +
geom_line()
p ## plots as expected
ggplotly(p) ## plots as expected
p2 <- p + geom_vline(xintercept = lubridate::ymd("2019-01-08"), linetype = "dashed")
p2 ## plots as expected
ggplotly(p2) ##fails
I just solved this using @Axeman's suggestion. In your case, you can just replace the date passed to xintercept:
lubridate::ymd("2019-01-08")
becomes
as.numeric(lubridate::ymd("2019-01-08"))
Not pretty, but it works.
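The error message hints at why this works: judging by the message, the conversion multiplies the x value by 86400000 (the number of milliseconds in a day), and R does not define multiplication for Date objects. A small base-R sketch of that arithmetic, with illustrative variable names:

```r
d <- as.Date("2019-01-08")

# Multiplying a Date fails, which is the error reported in the question.
res <- tryCatch(d * 86400000, error = function(e) conditionMessage(e))
res

# A plain number has no such restriction: days since 1970-01-01 times
# milliseconds per day gives milliseconds since 1970-01-01.
ms <- as.numeric(d) * 86400000
ms
```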
For future reference:
The hover pop-up for vertical lines created via date (or POSIX*) to numeric conversion is rather uninformative. This is particularly true for POSIX* values, where the exact time often cannot be read off directly.
If you need more meaningful pop-up content, defining a text aesthetic can help (you can ignore the 'unknown aesthetics' warning, as it doesn't seem to apply here). Then specify what you want to see on mouse hover via the tooltip argument, i.e. leave out xintercept, and you're all set.
p2 = p +
geom_vline(
aes(
xintercept = as.numeric(lubridate::ymd("2019-01-08"))
, text = "date: 2019-01-08"
)
, linetype = "dashed"
)
ggplotly(p2, tooltip = c("x", "y", "text"))
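If the date is kept in a variable, the tooltip label can be built from it instead of being typed twice. A minimal sketch (vline_date and vline_df are hypothetical names, not part of the original answer); the resulting data frame could be passed to geom_vline() through its data argument, with xintercept and text mapped in aes():

```r
vline_date <- as.Date("2019-01-08")

vline_df <- data.frame(
  xintercept = as.numeric(vline_date),                   # numeric position for the line
  text = paste("date:", format(vline_date, "%Y-%m-%d"))  # label shown on hover
)
vline_df

# The numeric value converts back to the same Date.
round_trip <- as.Date(vline_df$xintercept, origin = "1970-01-01")
round_trip
```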

ggplot2 not rendering correctly using Plotly

I have created a point plot using ggplot2 that works relatively well. I would love to run it using Plotly; however, when I do, it ends up upsetting the y axis and making the legend very wonky. I will post some before and after images below, but I am very new to both and looking for the right direction. The ggplot2 version is okay, but the added interactivity of Plotly would be a huge win for what we are doing. Also, a weird note: the top graph returned seems to cut off the plot (the highest value; not sure why). Thanks.
Code is:
library(ggplot2)
library(dplyr)
library(plotly)
library(sqldf)
library(tidyverse)
library(lubridate)
library(rio) #lets you use "import" for any file - without using extension name
options(scipen =999) #disable scientific notation
#prepare data:
setwd("C:/Users/hayescod/Desktop/BuysToForecastTracking")
Buys_To_Forecast <- import("BuysToForecastTrack")
colnames(Buys_To_Forecast) <- c("Date", "BusinessSegment", "Material", "StockNumber", "POCreatedBy", "PlantCode", "StockCategory", "Description", "Excess", "QuantityBought", "WareHouseSalesOrders", "GrandTotal", "Comments" )
Buys_To_Forecast$PlantCode <-factor(Buys_To_Forecast$PlantCode) #update PlantCode to factor
#use SQL to filter and order the data set:
btf <- sqldf("SELECT Date,
SUM(QuantityBought) AS 'QuantityBought',
Comments
FROM Buys_To_Forecast
GROUP BY Date, Comments
ORDER BY Date")
#use ggplot:
btfnew <- ggplot(data=btf, aes(x=Date, y=QuantityBought, color=Comments, size=QuantityBought)) +
geom_point() +
facet_grid(Comments~., scales="free")+
ggtitle("Buys To Forecast Review")+
theme(plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(color="DarkBlue", size = 18),
axis.title.y = element_text(color="Red", size = 14))
btfnew #display the plot in ggplot
ggplotly(btfnew) #display the plot in Plotly

ggplot x axis trouble

Currently, I have this plot that looks like this:
I don't like how on the x-axis there are weird lines / bars. I suspect this may be because ggplot can't fit all 540000 observations in the x axis. Here is the code I used to graph this:
data %>%
ggplot() +
geom_point(aes(x = dates_df$date, y = Quantity)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
labs(x = "Invoice Date", y = "Quantity", title = "Quantity vs Invoice Date")
What can I do to get rid of / solve this mess on the x-axis?
As was mentioned in the comments, the Date column seems to be messy, and you are using two separate data frames. First, join the data. I assume both data frames share an id or some other key column:
library("dplyr")
data <- left_join(data, dates_df, by = "id")  # keep the joined result
As was also mentioned, the date is stored as a character. If you haven't converted it already, use the as.Date function after joining:
data$date <- as.Date(data$date, "%m/%d/%Y")
you can find other date formats here: http://www.statmethods.net/input/dates.html
You said there are 540,000 observations on the x axis. My suggestion is to split the chart into one panel per year. To do this, use the facet_grid function inside ggplot:
library(lubridate)
ggplot(data, aes(x = date, y = Quantity)) +
geom_point() +
facet_grid(~ year(date))  # one panel per calendar year
Hope it helped :)
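The join and date conversion can be checked on a few rows before plotting. This is a sketch on made-up data; the id key and the sample values are assumptions, and base R's merge() with all.x = TRUE stands in for left_join() so that the example needs no packages.

```r
# Made-up sample data; in the question these are `data` and `dates_df`.
data <- data.frame(id = 1:4, Quantity = c(6, 8, 2, 12))
dates_df <- data.frame(
  id = 1:4,
  date = c("12/01/2010", "12/15/2010", "01/04/2011", "03/20/2011"),
  stringsAsFactors = FALSE
)

# Left join on the shared key, then convert the character column to Date.
merged <- merge(data, dates_df, by = "id", all.x = TRUE)
merged$date <- as.Date(merged$date, "%m/%d/%Y")

# Strings that do not match the format become NA, so check before plotting.
anyNA(merged$date)

# The year used for faceting.
merged$year <- as.integer(format(merged$date, "%Y"))
merged
```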

labeling axis of dates in ggplot?

I am trying to make plots using ggplot in R, and I have the same problem that was discussed in the question below.
Date axis labels in ggplot2 is one day behind
My data ranges from 2016-09-01 to 2016-09-30, but labels in plots say 2016-08-31 is the first day of data.
I solved the problem with the solution in the previous question, which is:
ggplot(df, aes(x, y)) +
geom_point() +
scale_x_datetime(breaks =df$x , labels = format(df$x, "%Y-%m-%d"))
(Does this set the breaks and labels by taking the exact dates from the data?)
Anyway, I now have a new problem: the dates match the labels, but the plot does not look good.
I am not complaining that the date labels are too long; what I don't like is that, with the solution above, I can't set breaks and labels by week or by a certain number of days.
Also, many dates are missing from my data.
What should I do to solve this problem? I need a new solution.
Just use this if you want your dates to appear vertically (that way you can see all your dates):
ggplot(df, aes(x, y)) +
geom_point() +
scale_x_datetime(breaks =df$x , labels = format(df$x, "%Y-%m-%d")) +
theme(axis.text.x = element_text(angle=90, vjust = 0.5))
I found a solution. Maybe my question was not described in enough detail.
My solution for the case where the dates did not match the axis labels and I wanted the plot to look better is to set the breaks first with seq():
# set weekly breaks first (seq() dispatches to seq.POSIXt for POSIXct input)
breaks.index <- seq(from = as.POSIXct("2020-01-01"), to = as.POSIXct("2020-12-31"), by = "1 week")
and then pass them to the scale:
# plot
plot <- ggplot(data, aes(x = date, y = y)) +
geom_point() +  # a geom layer is needed for anything to be drawn; geom_point() matches the question
scale_x_datetime(breaks = breaks.index, labels = format(breaks.index, "%Y-%m-%d"))
plot
I don't understand how this differs from using scale_x_date(date_labels = '%F') or why this code works, but it does.
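A possible explanation, offered as a sketch rather than a diagnosis: a POSIXct value is an instant, and the calendar date it prints as depends on the time zone used for formatting. If the data are in a local time zone ahead of UTC and the axis labels end up formatted in UTC, midnight shows up as the previous day. Calling format() on the breaks yourself uses the time zone stored in the values. scale_x_date(), by contrast, is meant for Date columns, which carry no time zone. The time zone below is an assumption chosen for illustration.

```r
# Midnight on 2016-09-01 in a zone ahead of UTC (assumed for illustration).
x <- as.POSIXct("2016-09-01 00:00:00", tz = "Asia/Seoul")

local_label <- format(x, "%Y-%m-%d")              # uses the tz stored in x
utc_label   <- format(x, "%Y-%m-%d", tz = "UTC")  # same instant, shown in UTC
local_label
utc_label

# Weekly breaks for the date range mentioned in the question.
weekly <- seq(as.Date("2016-09-01"), as.Date("2016-09-30"), by = "1 week")
format(weekly, "%Y-%m-%d")
```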

Add direct labels to ggplot2 geom_area chart

This is a continuation of the question here: Create non-overlapping stacked area plot with ggplot2
I have a ggplot2 area chart created by the following code. I want the labels from names be aligned on the right side of the graph. I think directlabels might work, but am willing to try whatever is most clever.
require(ggplot2)
require(plyr)
require(RColorBrewer)
require(RCurl)
require(directlabels)
link <- getURL("http://dl.dropbox.com/u/25609375/so_data/final.txt")
dat <- read.csv(textConnection(link), sep=' ', header=FALSE,
col.names=c('count', 'name', 'episode'))
dat <- ddply(dat, .(episode), transform, percent = count / sum(count))
# needed to make geom_area not freak out because of missing value
dat2 <- rbind(dat, data.frame(count = 0, name = 'lane',
episode = '02-tea-leaves', percent = 0))
g <- ggplot(arrange(dat2,name,episode), aes(x=episode,y=percent)) +
geom_area(aes(fill=name, group = name), position='stack') + scale_fill_brewer()
g1 <- g + geom_dl(method='last.points', aes(label=name))
I'm brand new to directlabels and not really sure how to get the labels to align to right side of the graph with the same colors as the areas.
You can use a simple geom_text to add labels. First, subset your data set to get the final x value:
dd=subset(dat, episode=="06-at-the-codfish-ball")
Then order the data frame by factor level:
dd = dd[with(dd, order(name, levels(dd$name))),]
Then work out the cumulative percent for plotting:
dd$cum = cumsum(dd$percent)
Then just use a standard geom_text call:
g + geom_text(data=dd, aes(x=6, y=cum, label=name))
Oh, and you may want to angle your x-axis labels to avoid overplotting (opts() and theme_text() have since been replaced by theme() and element_text()):
g + theme(axis.text.x = element_text(angle = -25, hjust = 0.5, size = 8))
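Note that cumsum() places each label at the top edge of its area. If the labels should sit in the middle of each band instead, half of the band's height can be subtracted. A sketch on made-up percentages (the names and values are hypothetical, and it assumes the rows are in the same order as the areas are stacked):

```r
# Hypothetical final-episode shares, one row per stacked area.
dd <- data.frame(
  name = c("don", "peggy", "roger"),
  percent = c(0.5, 0.3, 0.2)
)

dd$cum <- cumsum(dd$percent)         # top edge of each band
dd$mid <- dd$cum - dd$percent / 2    # vertical midpoint of each band
dd
```

Using y = mid instead of y = cum in the geom_text() call would then centre each label in its band.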
