Grouped bar chart with date on x-axis - r

I'm getting back to R, and I have some trouble plotting the data I want.
It's in this format:
date value1 value2
10/25/2016 50 60
12/16/2016 70 80
01/05/2017 35 45
I would like to plot value1 and value2 next to each other, with the corresponding date on the x-axis. So far I have this (I tried to plot only value1 first):
df$date <- as.Date(df$date, "%m/%d/%Y")
ggplot(data=df,aes(x=date,y=value1))
But the resulting plot doesn't show anything. The maximum values on the x and y axis seem to correspond to the ranges of my dataframe, but why is nothing showing up?
It works with plot(df$date,df$value1) though, so I don't get what I am doing wrong.

The ggplot() call alone does not create any layers on the plot; you need to add a geom.
For this you probably want geom_point() or geom_line():
ggplot(data=df,aes(x=date,y=value1)) +
geom_point()
or
ggplot(data=df,aes(x=date,y=value1)) +
geom_line()
or you could do both if you want points and lines
ggplot(data=df,aes(x=date,y=value1)) +
geom_point() +
geom_line()
If you want both values on the plot, I would recommend reshaping the data to long format first with the tidyr package:
library(tidyr)
library(ggplot2)
# Stack value1 and value2 into a single "value" column, labelled by "group"
df %>%
  gather(key = "group", value = "value", value1:value2) %>%
  ggplot(aes(date, value, color = group, group = group)) +
  geom_line()
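Since the question asks for the two values next to each other, a grouped (dodged) bar chart may be closer to what is wanted. The following is only a sketch under some assumptions: it rebuilds the example data from the question, uses pivot_longer() (the newer tidyr equivalent of gather()), and treats each date as a category via factor(), which gives evenly spaced bars rather than a true time axis.

```r
library(ggplot2)
library(tidyr)

# Example data copied from the question
df <- data.frame(
  date   = c("10/25/2016", "12/16/2016", "01/05/2017"),
  value1 = c(50, 70, 35),
  value2 = c(60, 80, 45)
)
df$date <- as.Date(df$date, "%m/%d/%Y")

# Reshape to long format: one row per date and series
long <- pivot_longer(df, cols = c(value1, value2),
                     names_to = "group", values_to = "value")

# Treat each date as a category so the bars get equal widths,
# and dodge the two series so they sit side by side
p <- ggplot(long, aes(x = factor(date), y = value, fill = group)) +
  geom_col(position = "dodge") +
  labs(x = "date")
p
```

If the real spacing between dates matters, x = date can be used instead of factor(date), although the bar widths may then need adjusting by hand.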

Related

How to add to each ggplot contained in a facet_wrap a second corresponding time series?

I have a facet_wrap with 24 plots. Each plot contains one time series, and I want to add a second time series to each of them, so that each of the 24 plots shows two time series. The series that I want to add come from a different data frame, df_norm1, which contains 24 columns (one for each of the 24 plots in the facet_wrap).
Here is the code that I used to obtain the facet_wrap (which presents only one time series per plot):
df_norm_graph <- df_norm %>% gather(key = "variable", value = "value", -date)
ts<-ggplot(df_norm_graph, aes(x = date, y = value)) +
geom_line(aes(color = variable), size = 1) +
theme_minimal() +
facet_wrap( ~ variable)
Do you have a way to do this? Thank you in advance!
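One possible approach, sketched here under assumptions, is to reshape both data frames to long format, tag each with its origin, and stack them so that every facet receives two lines. The small df_norm and df_norm1 below are hypothetical stand-ins, and the sketch assumes that the columns of df_norm1 carry the same names as those of df_norm; to_long and the source column are illustrative names, not anything from the original code.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Hypothetical stand-ins for the two wide data frames in the question:
# both have a date column plus identically named series columns
dates <- seq(as.Date("2020-01-01"), by = "month", length.out = 6)
df_norm  <- data.frame(date = dates, a = 1:6, b = 6:1)
df_norm1 <- data.frame(date = dates, a = (1:6) / 2, b = (6:1) * 2)

# Reshape one data frame to long format and record where it came from
to_long <- function(d, label) {
  d %>%
    pivot_longer(-date, names_to = "variable", values_to = "value") %>%
    mutate(source = label)
}
both <- bind_rows(to_long(df_norm, "df_norm"), to_long(df_norm1, "df_norm1"))

# One facet per variable, one line per source data frame
ggplot(both, aes(x = date, y = value, colour = source)) +
  geom_line() +
  theme_minimal() +
  facet_wrap(~ variable)
```

If the column names differ between the two data frames, they would need to be matched up (for example by renaming) before stacking.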

Draw monthly data in geofacet US maps

I have a data frame df in this format:
State Date Ratio
AL 2019-01 10.1
AL 2019-02 12.1
... ... ...
NY 2019-01 15.1
... ... ...
I would like to draw a time series for each state with the geofacet package. I suspect I am having trouble with the Date format.
ggplot(df,aes(Date, Ratio)) + geom_line() + facet_geo(~ State, grid = "us_state_grid2") + ylab("Rate (%)")
The following error is shown: geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
How can I adjust it?
Your date is structured 'yyyy-mm', so I'm guessing it's a character vector rather than a date object. You should convert it to class Date with as.Date() and then it should work as expected. (You'll need to paste on the day of the month.)
You get a grouping error because, when your x-axis is a character vector, geom_line groups the data by the values of that vector, so within a facet each group contains only one observation. Without facets, lines are instead drawn between the various y values at each x value. Here's an example using the geofacet package's own state_ranks dataset.
library(ggplot2)
library(dplyr)
library(geofacet)
data(state_ranks)
# The lines are not connected across a character x-axis.
ggplot(state_ranks) +
geom_line(aes(x = variable, y = rank))
# Throws error: geom_path: Each group consists of only one observation. Do
# you need to adjust the group aesthetic?
ggplot(state_ranks) +
geom_line(aes(x = variable, y = rank)) +
facet_geo(~ state)
If you group by state, you get the expected result (with an alphabetically ordered x-axis).
# Works, x-axis is alphabetized and lines are connected
ggplot(state_ranks) +
geom_line(aes(x = variable, y = rank, group = state)) +
facet_geo(~ state)
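The date conversion suggested above can be sketched as follows. The small data frame is a hypothetical stand-in in the shape described in the question (the second NY value is made up), and the choice of "-01" as the pasted-on day is an assumption; any valid day would do.

```r
# Hypothetical data in the shape described in the question
df <- data.frame(
  State = c("AL", "AL", "NY", "NY"),
  Date  = c("2019-01", "2019-02", "2019-01", "2019-02"),
  Ratio = c(10.1, 12.1, 15.1, 14.2),
  stringsAsFactors = FALSE
)

# Append a day of the month so that as.Date() can parse the string
df$Date <- as.Date(paste0(df$Date, "-01"), format = "%Y-%m-%d")
str(df$Date)
```

With Date stored as class Date, the x-axis is continuous, so the original ggplot() call with geom_line() and facet_geo(~ State) should connect the points within each state without an explicit group aesthetic.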

Plot multicolor vertical lines by using ggplot to show average time taken for each type as facet. Each type will have different vertical lines

I want to plot a chart in R that shows vertical lines for each type, with one facet per type.
df is a data frame giving the time in minutes that each person (X, Y, Z) takes to get from A to B, from B to C, and so on.
I have tried the code below but was not able to get the result.
df<-data.frame(type =c("X","Y","Z"), "A_to_B"= c(20,56,57), "B_to_C"= c(10,35,50), "C_to_D"= c(53,20,58))
ggplot(df, aes(x = 1,y = df$type)) + geom_line() + facet_grid(type~.)
I have attached an image from Excel showing the desired output, but I need only vertical lines where the segments join instead of the entire horizontal bar.
I would not use facets in your case, because there are only 3 variables.
So, to get a similar plot in R using ggplot2, you first need to reshape the data frame using gather() from tidyr (part of the tidyverse), which puts it in long, or tidy, format.
To my knowledge, there is no geom that does what you want in standard ggplot2, so some fiddling is necessary.
However, it's possible to produce the plot using geom_segment() and cumsum():
library(tidyverse)
# First reshape and calculate cumulative sums by type.
# This works because the route names begin with A, B, C
# and are thus ordered correctly.
df <- df %>%
gather(-type, key = "route", value = "time") %>%
group_by(type) %>%
mutate(cummulative_time = cumsum(time))
segment_length <- 0.2
df %>%
mutate(route = fct_rev(route)) %>%
ggplot(aes(color = route)) +
# factor() is needed in R >= 4.0, where data.frame() keeps type as character
geom_segment(aes(x = as.numeric(factor(type)) + segment_length, xend = as.numeric(factor(type)) - segment_length, y = cummulative_time, yend = cummulative_time)) +
scale_x_discrete(limits=c("1","2","3"), labels=c("Z", "Y","X"))+
coord_flip() +
ylim(0,max(df$cummulative_time)) +
labs(x = "type")
EDIT
This solution works because it assigns the labels for X, Y, Z by hand in scale_x_discrete. Be careful to assign the correct labels!

How to have custom list labels on R ggplot figure x-axis?

I want to have custom values on x-axis from the list dates which contains dates in string format.
I am not so interested in melting the data with mpg because main columns have the data structure where value is integer and I cannot have there Posixct dates.
Vars variable value
1: 1 Leo 164
...
Code which produces the current output in Fig. 1:
library('ggplot2')
str(mpg)
dates <- c("1.1.2017", "1.2.2017", "1.3.2017", "2.4.2017", "10.5.2017", "12.5.2017", "13.5.2017")
# TODO how to have here custom values on x-axis from dates?
ggplot(mpg, aes(x = class, y = hwy)) +
geom_boxplot()
You cannot simply have x = dates because dates does not belong to mpg.
Fig. 1 Current output with default x-labels
Expected output: those 7 dates on the x-axis of the figure.
R: 3.4.0 (backports)
OS: Debian 8.7
Try this:
ggplot(mpg, aes(x = class, y = hwy)) +
geom_boxplot() +
scale_x_discrete(labels = dates)
If you want to keep the tick values on an axis and only set its title, use a continuous scale instead; for instance, the following keeps the tick values for the y-axis:
scale_y_continuous("Y Axis Title")

ggplot why are bars not stacked?

I would like to create a stacked bar graph however my output shows overlaid bars instead of stacked. How can I rectify this?
#Create data
date <- as.Date(rep(c("1/1/2016", "2/1/2016", "3/1/2016", "4/1/2016", "5/1/2016"),2))
sales <- c(23,52,73,82,12,67,34,23,45,43)*1000
geo <- c(rep("Western Territory",5), rep("Eastern Territory",5))
data <- data.frame(date, sales, geo)
#Plot
library(ggplot2)
ggplot(data=data, aes(x=date, y=sales, fill=geo))+
stat_summary(fun.y=sum, geom="bar") +
ggtitle("TITLE")
Plot output:
As you can see from the summarized table below, it confirms the bars are not stacked:
>#Verify plot is correct
>ddply(data, c("date"), summarize, total=sum(sales))
date total
1 0001-01-20 90000
2 0002-01-20 86000
3 0003-01-20 96000
4 0004-01-20 127000
5 0005-01-20 55000
Thanks!
You have to include position="stack" in your stat_summary call:
stat_summary(position="stack",fun.y=sum, geom="bar")
Alternatively, since your data are already summarized, you could use geom_col (shorthand for geom_bar(stat = "identity")):
ggplot(data=data, aes(x=date, y=sales, fill=geo))+
geom_col() +
scale_x_date(date_labels = "%b-%d")
Produces:
Note that I changed the date parsing (by adding format = "%m/%d/%Y" to the as.Date call) and explicitly set the axis label formatting.
If your actual data have more than one entry per period, you can always summarise first, then pass that into ggplot instead of the raw data.
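Putting these pieces together, a minimal sketch might look like this. It reuses the data from the question with the date format given explicitly, and summarises with dplyr before plotting; the totals name is only illustrative, and the .groups argument assumes dplyr 1.0 or later.

```r
library(ggplot2)
library(dplyr)

# Data from the question, with the date format given explicitly
date  <- as.Date(rep(c("1/1/2016", "2/1/2016", "3/1/2016", "4/1/2016", "5/1/2016"), 2),
                 format = "%m/%d/%Y")
sales <- c(23, 52, 73, 82, 12, 67, 34, 23, 45, 43) * 1000
geo   <- c(rep("Western Territory", 5), rep("Eastern Territory", 5))
data  <- data.frame(date, sales, geo)

# Summarise to one row per date and territory before plotting
totals <- data %>%
  group_by(date, geo) %>%
  summarise(sales = sum(sales), .groups = "drop")

# geom_col stacks the fill groups by default
ggplot(totals, aes(x = date, y = sales, fill = geo)) +
  geom_col() +
  scale_x_date(date_labels = "%b-%d")
```

With this particular data the summarise step changes nothing, because there is already exactly one row per date and territory; it only matters when a period has several entries.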
