One dodged barchart per year - r

I have a dodged bar chart that shows the data "Created Date" and "Last Accessed" per year. The date data is formatted as year-month-date hour:minute:second.
Now I want to split the data up into one graph per year that shows Created date and Last accessed per date of the year. I tried to plot everything in one graph, however it was really messy..
Is it possible to plot per year, so in this example it will be 6 graphs because it is 6 years. I was thinking something like looping trough each year?
for(x in 1:number_of_years) {
... plot
}
This is my code:
```{r echo=FALSE, warning=FALSE}
# Libraries
library(ggplot2)
library(data.table)
library(tidyr)
# Read data
df = read.csv2(text = "File.Name|Created.Date|Last.Accessed|Visual.Group
60be1ba43bf7cjpg|1989-11-17 06:25:22|2017-07-15 01:25:22|0
60be1ba43bf89jpg|1989-02-04 04:03:16|2021-12-17 04:03:16|1
60be1ba43bf8djpg|2017-04-22 14:57:13|2017-11-17 23:57:13|2
60be1ba43bf90jpg|2021-04-12 23:03:44|2018-11-17 05:03:44|3
60be1ba43bf93jpg|2019-08-28 18:23:16|1989-09-07 12:23:16|4
60be1ba43bf95jpg|1989-09-11 08:16:20|2020-03-17 10:16:20|5
60be1ba43bf98jpg|2018-08-01 16:56:05|2017-04-24 03:56:05|5
60be1ba43bf9bjpg|2017-06-23 19:01:37|1989-07-14 22:01:37|6
60be1ba43bf9ejpg|2018-02-20 15:21:26|2020-02-01 18:21:26|7
60be1ba43bfa1jpg|2021-12-10 08:34:09|2021-04-17 02:34:09|8
60be1ba43bfa4jpg|2017-01-02 19:03:10|1989-11-16 14:03:10|9
60be1ba43bfa6jpg|2017-04-28 15:50:33|2019-02-22 06:50:33|9
60be1ba43bfd6jpg|2018-04-14 22:21:37|2021-08-28 14:21:37|9
60be1ba43bfdajpg|2019-07-15 04:31:04|2017-07-11 04:31:04|10
60be1ba43bfddjpg|2020-11-06 01:06:25|1989-08-16 09:06:25|10
60be1ba43bfe0jpg|2021-08-05 06:38:07|2021-12-25 15:38:07|11
60be1ba43bfe3jpg|2017-01-14 03:47:54|2020-08-03 20:47:54|12
60be1ba43bfe6jpg|1989-11-26 17:33:01|2018-11-07 22:33:01|13
60be1ba43bfe9jpg|2018-09-21 07:17:29|1989-10-05 03:17:29|14
60be1ba43bfebjpg|2017-03-13 09:42:04|2020-08-23 11:42:04|14
60be1ba43bfeejpg|2020-07-18 08:36:52|2018-10-04 12:36:52|15
60be1ba43bff1jpg|2021-01-21 05:20:29|2019-04-28 03:20:29|16
60be1ba43bff4jpg|2018-10-19 08:13:24|1989-10-16 18:13:24|17
60be1ba43bff7jpg|2017-06-15 07:49:44|2021-05-11 01:49:44|18
60be1ba43bff9jpg|2019-05-23 23:41:20|2019-08-26 13:41:20|18
60be1ba43bffcjpg|1989-02-17 03:13:27|1989-11-20 16:13:27|19
60be1ba43bfffjpg|2020-08-26 15:22:01|2018-08-21 17:22:01|20
60be1ba43c001jpg|2020-12-21 05:22:03|2017-04-04 15:22:03|21
60be1ba43c004jpg|2018-02-14 08:23:01|1989-03-26 06:23:01|21
60be1ba43c007jpg|2018-10-25 08:19:18|2021-07-06 08:19:18|21
60be1ba43c00ajpg|2017-11-28 07:22:03|2021-09-23 08:22:03|22
60be1ba43c00cjpg|2019-11-19 17:12:02|2019-05-23 03:12:02|23
60be1ba43c00fjpg|2017-02-26 00:43:21|2019-11-19 20:43:21|24
60be1ba43c012jpg|2017-10-27 01:05:53|1989-03-14 02:05:53|25
60be1ba43c015jpg|2019-06-18 08:06:07|1989-10-28 16:06:07|25
60be1ba43c017jpg|2021-08-19 02:01:32|2020-05-28 03:01:32|25
60be1ba43c01ajpg|2021-07-13 23:02:21|2017-07-05 03:02:21|25
60be1ba43c01djpg|1989-05-14 02:51:23|2020-03-24 01:51:23|25
60be1ba43c020jpg|2021-12-13 04:05:19|2021-04-16 03:05:19|25
60be1ba43c023jpg|2019-03-26 23:42:25|2021-08-03 20:42:25|26
60be1ba43c025jpg|1989-05-09 04:08:58|2019-10-23 00:08:58|26
60be1ba43c028jpg|2018-08-11 00:48:32|2019-05-08 08:48:32|26
60be1ba43c02bjpg|2018-07-14 08:38:02|2019-05-06 22:38:02|27
60be1ba43c02ejpg|2020-03-06 19:13:14|2019-09-18 18:13:14|28
60be1ba43c030jpg|1989-07-10 11:40:46|2019-08-03 01:40:46|28
60be1ba43c033jpg|2021-12-11 02:23:44|2019-08-09 01:23:44|28
60be1ba43c036jpg|2017-11-03 19:53:43|2021-02-13 17:53:43|29
60be1ba43c038jpg|2017-02-07 02:45:47|2021-09-20 09:45:47|29
60be1ba43c03bjpg|2020-07-28 20:56:59|2018-06-06 11:56:59|30
60be1ba43c040jpg|2019-06-20 16:42:30|2020-01-02 00:42:30|31
60be1ba43c042jpg|2020-05-27 03:39:41|2021-08-11 08:39:41|31
60be1ba43c045jpg|2017-08-07 12:11:03|2017-12-15 20:11:03|32
60be1ba43c048jpg|1989-04-28 01:19:49|2019-08-17 23:19:49|32
60be1ba43c04bjpg|2017-08-26 22:07:51|2019-02-25 16:07:51|33
60be1ba43c04djpg|1989-10-12 02:27:44|2020-11-10 22:27:44|34
60be1ba43c050jpg|2021-08-18 09:01:48|2020-01-08 07:01:48|35
60be1ba43c053jpg|2017-07-21 20:56:50|2018-02-24 20:56:50|36
60be1ba43c056jpg|1989-05-13 11:23:09|2020-01-20 11:23:09|37
60be1ba43c058jpg|2020-08-21 18:46:14|2017-04-14 10:46:14|37
60be1ba43c05bjpg|2020-04-08 11:14:54|2020-12-02 00:14:54|38
60be1ba43c05ejpg|2021-02-21 06:13:14|2018-08-21 03:13:14|38
60be1ba43c060jpg|2018-06-28 04:36:20|2020-09-28 15:36:20|39
60be1ba43c063jpg|2017-09-22 23:39:53|2021-05-27 20:39:53|39
60be1ba43c065jpg|2018-05-09 16:54:13|2021-06-03 16:54:13|39
60be1ba43c068jpg|2019-01-04 00:19:33|2017-05-18 08:19:33|40
60be1ba43c06bjpg|2019-04-12 21:19:37|2017-06-23 04:19:37|40
60be1ba43c06ejpg|2019-09-22 20:44:20|2021-10-23 12:44:20|40
60be1ba43c070jpg|2021-10-23 07:57:35|2019-02-23 21:57:35|40
60be1ba43c073jpg|2021-09-17 19:13:53|2020-05-27 06:13:53|40
60be1ba43c075jpg|2017-08-13 23:06:53|2019-08-03 00:06:53|41
60be1ba43c078jpg|2017-11-02 04:47:11|2018-07-05 03:47:11|42
60be1ba43c07bjpg|2017-05-11 23:04:25|2021-09-22 04:04:25|42
60be1ba43c07ejpg|2018-04-06 21:19:03|2021-09-16 21:19:03|42
60be1ba43c080jpg|2019-03-16 01:56:36|2021-04-03 18:56:36|42
60be1ba43c083jpg|2019-02-24 13:21:29|2019-10-28 09:21:29|43
60be1ba43c086jpg|2020-03-08 10:50:00|2018-06-22 15:50:00|44
60be1ba43c088jpg|1989-06-10 16:34:03|2020-11-10 18:34:03|45
60be1ba43c08bjpg|2018-04-20 10:29:46|2021-01-24 08:29:46|46
60be1ba43c08ejpg|2019-02-13 18:27:24|2021-10-16 07:27:24|47
60be1ba43c090jpg|2019-09-16 21:54:43|2018-07-21 22:54:43|48
60be1ba43c094jpg|2021-05-23 17:25:18|2019-06-18 00:25:18|49
60be1ba43c097jpg|2020-09-15 06:42:26|2019-03-25 13:42:26|50
60be1ba43c09ajpg|2020-05-02 20:14:35|2020-06-13 08:14:35|50
60be1ba43c09djpg|1989-02-25 15:11:22|2017-08-28 03:11:22|51
60be1ba43c09fjpg|2018-07-13 15:42:14|2019-02-23 14:42:14|52
60be1ba43c0a2jpg|2020-09-09 08:58:49|2019-07-11 18:58:49|53
60be1ba43c0a4jpg|1989-12-17 02:36:08|2021-10-04 10:36:08|54
60be1ba43c0a7jpg|2020-03-11 08:41:55|2017-10-16 17:41:55|55
60be1ba43c0aajpg|2020-12-18 19:59:08|2017-12-02 03:59:08|55
60be1ba43c0adjpg|2019-06-20 09:48:27|2020-11-10 17:48:27|56
60be1ba43c0b0jpg|2020-01-05 15:49:18|1989-11-27 05:49:18|56
60be1ba43c0b2jpg|1989-06-23 23:50:52|2017-09-08 02:50:52|56
60be1ba43c0b5jpg|2019-09-01 04:29:25|2020-10-25 00:29:25|56
60be1ba43c0b8jpg|2020-08-08 07:08:47|2021-05-22 20:08:47|57
60be1ba43c0bbjpg|2018-04-11 07:32:17|2018-06-21 12:32:17|58
60be1ba43c0bdjpg|2021-05-26 08:32:28|1989-02-04 12:32:28|58
60be1ba43c0c0jpg|1989-11-25 22:22:37|2019-07-16 04:22:37|58
60be1ba43c0c4jpg|2018-02-03 10:37:57|2019-08-02 08:37:57|58
60be1ba43c0c7jpg|2018-08-18 06:36:04|1989-03-17 08:36:04|58
60be1ba43c0cajpg|2019-02-12 23:31:52|2020-06-17 13:31:52|59",
sep="|",stringsAsFactors=TRUE, na.strings="unknown");
# Remove duplicates (Visual group defines duplicate)
df <-df[!duplicated(df$Visual.Group), ]
# Extract year
df$Created.Date.Year <- format(as.Date(df$Created.Date, format="%Y-%m-%d"), format="%Y");
df$Last.Accessed.Year <- format(as.Date(df$Last.Accessed, format="%Y-%m-%d"), format="%Y");
#set to data.table object
dt <- as.data.table(df)
#change column names to match desired names of groups
setnames(dt, old = c("Created.Date", "Last.Accessed"), new = c("Created date", "Last Accessed"))
#pivot longer (this column becomes the variable you use in aes(fill = ..) later on. Change it to whichever name you want to have as legend title
dt <- pivot_longer(dt, cols = c("Created date", "Last Accessed"), names_to = "Legend Title", values_to = "Date")
#plot
ggplot(dt, aes(x = Date, fill = `Legend Title`)) +
geom_bar(position = "dodge") +
theme_bw() + geom_text(stat = "count", aes(label = after_stat(count)), position = position_dodge(width = 1), vjust = -1)
```

perhaps this one? I switched to points and lines for better visualization.
library(tidyverse)
df %>%
pivot_longer(cols = 2:3) %>%
mutate(value = as.Date(value)) %>%
count(name, value) %>%
mutate(year = lubridate::year(value)) %>%
ggplot(aes(x = value, n, color =name)) +
geom_point() +
geom_line() +
scale_x_date(date_breaks = "3 month", date_labels = "%b") +
facet_grid(~year, scales = "free") +
theme_bw()

Related

Change multiple geom_text labels

I'm trying to change the name of all labels in geom_text from the example below:
mpg %>%
distinct(trans, .keep_all = T) %>%
ggplot(aes(x=cty, y=hwy))+
geom_text(aes(x=cty, y=hwy,label = trans))
I would like to edit "auto(l3)" to "Al3", "auto(l4)" to "Al4", "auto(l5)" to "Al5", "auto(l6)" to "Al6", and so on according to the names in the code below
mpg %>%
distinct(trans, .keep_all = T) %>%
ggplot(aes(x=cty, y=hwy))+
geom_text(aes(x=cty, y=hwy,
label = replace(trans, c("auto(l3)", "auto(l4)", "auto(l5)", "auto(l6)",
"auto(s4)", "auto(s5)", "auto(s6)",
"manual(m5)", "manual(m6)", "auto(av)"),
c("Al3", "Al4", "Al5", "Al6",
"As4", "selected", "good",
"m5", "m6", "av"))))
My output is the following error "Error in check_aesthetics():
! Aesthetics must be either length 1 or the same as the data (10): label"
I put exactly the 10 trans values in the code, is there another way to do this change?
I would suggest to do this kind of recoding outside of ggplot. IMHO this results in cleaner code and is easier to check and to debug. Additionally I switched to dplyr::recode using a named vector:
library(ggplot2)
library(dplyr)
rec_vec <- setNames(
c("Al3", "Al4", "Al5", "Al6", "As4", "selected", "good", "m5", "m6", "av"),
c(
"auto(l3)", "auto(l4)", "auto(l5)", "auto(l6)", "auto(s4)", "auto(s5)",
"auto(s6)", "manual(m5)", "manual(m6)", "auto(av)"
)
)
mpg %>%
distinct(trans, .keep_all = T) %>%
mutate(trans = recode(trans, !!!rec_vec)) |>
ggplot(aes(x = cty, y = hwy)) +
geom_text(aes(
x = cty, y = hwy,
label = trans
))

Hide some legend entries in ggplot

I have the follow lines of code:
ggplot() +
geom_line(data=TS_SimHeads_HOBS_final, aes(x=as.Date(Date), y=BH2672), color='red') +
geom_point(data=Hydro_dates_wellData_2014_2018, aes(x=as.Date(Date), y=BH2672), color='red') +
geom_line(data=TS_SimHeads_HOBS_final, aes(x=as.Date(Date), y=BH3025), color='green') +
geom_point(data=Hydro_dates_wellData_2014_2018, aes(x=as.Date(Date), y=BH3025), color='green') +
xlab("Date") + ylab("Head")
#theme_bw()
which generate the following plot:
What I am trying to do, unsuccessfully, is to include legends only for the lines (points are the experimental data and lines the simulated ones). Some data for reproduction purposes:
Date BH2672 BH278 BH2978 BH2987 BH3025 BH312 BH3963 BH3962 BH3957
2014-02-19 31.28400 78.86755 5.671027 39.48419 53.60201 44.29516 69.23685 61.70843 56.13871
2014-02-20 30.76656 78.87344 5.656940 39.49012 53.56489 44.50679 69.50910 61.70638 56.09621
2014-02-21 30.43226 78.88097 5.642136 39.49902 53.56041 44.65761 69.65709 61.70126 56.04346
2014-02-22 30.16532 78.88979 5.643818 39.51101 53.56065 44.78333 69.75621 61.69643 55.99459
2014-02-23 29.93577 78.89954 5.650873 39.52544 53.55970 44.89429 69.82983 61.69332 55.95241
2014-02-24 29.73162 78.90991 5.658991 39.54147 53.55682 44.99520 69.88845 61.69236 55.91639
As is quite often the case you first have to convert both of your datasets to long or tidy format using e.g. tidyr::pivot_longer which will result in a new column with the variable names as categories which could then be mapped on the color aes. Doing so will automatically create a legend and also allows to simplify your code. And if you want only the lines to appear in the legend then you could add show.legend=FALSE to geom_point. Finally you can set your desired colors via scale_color_manual.
As you provided only one dataset I used this for both datasets which however shouldn't matter. Also, to make my life a bit easier I have put the datasets in an named list:
library(dplyr, warn = FALSE)
library(tidyr)
library(ggplot2)
data_list <- list(data = Hydro_dates_wellData_2014_2018, sim = TS_SimHeads_HOBS_final) %>%
lapply(function(x) {
x %>%
select(Date, BH2672, BH3025) %>%
mutate(Date = as.Date(Date)) %>%
tidyr::pivot_longer(-Date)
})
ggplot() +
geom_line(data=data_list$sim, aes(x=Date, y=value, color = name)) +
geom_point(data=data_list$data, aes(x=Date, y=value, color = name), show.legend = FALSE) +
scale_color_manual(values = c(BH2672 = "red", BH3025 = "green")) +
labs(x = "Date", y = "Head")
DATA
TS_SimHeads_HOBS_final <- structure(list(Date = c(
"2014-02-19", "2014-02-20", "2014-02-21",
"2014-02-22", "2014-02-23", "2014-02-24"
), BH2672 = c(
31.284,
30.76656, 30.43226, 30.16532, 29.93577, 29.73162
), BH278 = c(
78.86755,
78.87344, 78.88097, 78.88979, 78.89954, 78.90991
), BH2978 = c(
5.671027,
5.65694, 5.642136, 5.643818, 5.650873, 5.658991
), BH2987 = c(
39.48419,
39.49012, 39.49902, 39.51101, 39.52544, 39.54147
), BH3025 = c(
53.60201,
53.56489, 53.56041, 53.56065, 53.5597, 53.55682
), BH312 = c(
44.29516,
44.50679, 44.65761, 44.78333, 44.89429, 44.9952
), BH3963 = c(
69.23685,
69.5091, 69.65709, 69.75621, 69.82983, 69.88845
), BH3962 = c(
61.70843,
61.70638, 61.70126, 61.69643, 61.69332, 61.69236
), BH3957 = c(
56.13871,
56.09621, 56.04346, 55.99459, 55.95241, 55.91639
)), class = "data.frame", row.names = c(
NA,
-6L
))
Hydro_dates_wellData_2014_2018 <- TS_SimHeads_HOBS_final

Rainfall runoff Plotting

My data frame is as follows,
Date,Precipitation,Observed,Simulated
1/1/1988,21.90,3.06,15.27
1/2/1988,12.34,6.70,17.46
1/3/1988,19.76,18.32,32.45
1/4/1988,5.54,98.20,46.67
1/5/1988,6.50,91.92,37.43
1/6/1988,11.04,38.12,20.94
1/7/1988,17.90,44.14,26.64
1/8/1988,23.76,41.07,118.2
1/9/1988,59.74,169.06,225.4
1/10/1988,51.58,371.91,325.3
1/11/1988,18.08,447.05,387.1
1/12/1988,2.92,295.40,329.6
1/13/1988,2.72,133.90,218.7
1/14/1988,0.00,95.82,128.6
1/15/1988,7.38,70.32,70.17
1/16/1988,3.50,36.21,38.42
1/17/1988,4.58,9.43,21.97
1/18/1988,7.84,35.74,12.7
1/19/1988,16.86,24.32,12.96
1/20/1988,5.00,33.90,18.56
1/21/1988,0.96,27.06,20.9
1/22/1988,15.54,23.20,42.36
1/23/1988,11.36,16.41,54.19
1/24/1988,1.06,23.94,48.3
1/25/1988,6.42,17.35,42.32
1/26/1988,0.00,14.91,33.91
1/27/1988,2.44,10.13,25.46
1/28/1988,2.00,14.33,16.39
1/29/1988,2.36,10.62,6.423
1/30/1988,4.20,18.65,6.175
1/31/1988,15.80,12.67,38.42
2/1/1988,0.46,15.50,52.92
2/2/1988,2.50,14.91,35.03
2/3/1988,1.24,11.36,22.25
2/4/1988,0.04,16.72,15.55
2/5/1988,3.02,22.83,11.84
2/6/1988,0.00,10.86,10.33
2/7/1988,2.72,5.43,9.352
2/8/1988,0.36,3.32,9.41
2/9/1988,4.92,2.25,8.246
2/10/1988,1.94,21.38,8.059
2/11/1988,1.68,66.37,8.936
2/12/1988,0.00,66.37,8.247
2/13/1988,14.04,14.91,8.071
2/14/1988,2.08,9.90,9.941
2/15/1988,5.64,7.29,10.41
2/16/1988,0.00,21.38,10.08
2/17/1988,0.00,15.21,9.821
2/18/1988,1.44,13.49,10.43
2/19/1988,5.66,8.32,10.66
2/20/1988,0.32,4.77,8.877
2/21/1988,0.96,3.32,7.622
2/22/1988,1.36,2.46,6.069
2/23/1988,0.00,1.94,6.852
2/24/1988,0.36,1.48,7.538
2/25/1988,1.96,1.23,7.785
2/26/1988,4.08,1.08,9.163
2/27/1988,0.00,0.94,10.55
2/28/1988,0.56,0.81,10.5
2/29/1988,0.00,0.81,10.34
3/1/1988,0.00,0.69,10.77
3/2/1988,0.08,0.58,11.25
3/3/1988,0.82,0.53,10.86
3/4/1988,7.48,0.49,7.274
3/5/1988,6.26,0.40,5.618
3/6/1988,17.26,0.49,6.707
3/7/1988,3.42,0.35,8.519
3/8/1988,0.64,0.28,9.339
3/9/1988,3.64,0.32,7.926
3/10/1988,2.00,0.64,6.804
3/11/1988,19.60,0.75,4.982
3/12/1988,0.72,0.69,1.149
3/13/1988,12.52,0.64,1.99
3/14/1988,12.58,0.69,4.057
3/15/1988,22.34,0.81,13.46
3/16/1988,23.44,18.65,20.58
3/17/1988,35.20,65.08,41.81
3/18/1988,10.24,78.59,54.7
3/19/1988,47.32,146.45,88.96
3/20/1988,10.50,172.28,111.9
3/21/1988,0.12,215.85,88.96
3/22/1988,0.00,86.61,62.25
3/23/1988,5.88,49.53,42.3
3/24/1988,4.34,32.56,27.56
3/25/1988,0.50,121.94,18.82
3/26/1988,2.00,88.11,12.78
3/27/1988,14.30,35.28,7.07
3/28/1988,4.76,55.83,6.714
3/29/1988,2.60,17.67,6.184
3/30/1988,2.80,52.92,6.394
3/31/1988,2.80,63.80,7.282
4/1/1988,4.12,34.36,6.681
4/2/1988,23.88,14.62,18.06
4/3/1988,40.14,21.74,63.34
4/4/1988,20.74,31.24,128
4/5/1988,14.72,196.87,148.2
4/6/1988,3.08,121.94,115.5
4/7/1988,8.58,52.92,76.39
4/8/1988,4.32,99.81,48.48
4/9/1988,14.54,31.68,29.99
4/10/1988,3.62,53.49,20.66
4/11/1988,2.20,112.29,17.43
4/12/1988,2.06,58.83,14.83
4/13/1988,1.84,33.00,14.34
4/14/1988,0.00,50.08,13.21
4/15/1988,4.32,25.87,11.88
4/16/1988,6.84,21.03,11.32
4/17/1988,1.74,130.16,10.49
4/18/1988,19.20,55.83,10.87
4/19/1988,8.32,42.08,13.14
4/20/1988,4.30,21.38,14.37
4/21/1988,7.96,116.62,14.34
4/22/1988,11.52,60.67,14.35
4/23/1988,33.82,59.44,107.3
4/24/1988,5.66,198.03,152.3
4/25/1988,18.32,247.39,131.8
4/26/1988,17.80,130.16,121
4/27/1988,2.68,219.52,85.55
4/28/1988,0.00,148.43,52.51
4/29/1988,5.42,98.20,33.55
4/30/1988,1.12,54.65,24.19
5/1/1988,2.90,30.81,19.78
5/2/1988,0.00,16.72,17.36
5/3/1988,5.28,10.62,15.42
5/4/1988,0.00,7.69,10.36
5/5/1988,0.00,9.43,8.719
5/6/1988,0.00,5.60,8.94
5/7/1988,0.16,4.46,7.924
5/8/1988,0.00,3.72,6.791
5/9/1988,0.32,3.19,4.776
5/10/1988,3.24,3.32,6.717
5/11/1988,0.08,6.89,8.701
5/12/1988,0.00,4.61,9.747
5/13/1988,0.00,4.01,8.976
5/14/1988,0.00,3.19,8.588
5/15/1988,0.00,2.04,10.46
5/16/1988,0.00,1.75,12.28
5/17/1988,0.00,1.84,12.8
5/18/1988,3.26,1.56,12.48
5/19/1988,0.00,1.39,11.55
5/20/1988,2.84,1.39,11.39
5/21/1988,14.68,1.48,11.79
5/22/1988,8.00,1.39,12.92
5/23/1988,0.24,1.31,13.17
5/24/1988,0.00,1.08,12.73
5/25/1988,0.12,1.23,12.26
5/26/1988,3.12,1.01,12.14
5/27/1988,0.00,0.94,12.18
5/28/1988,0.00,1.23,12.56
5/29/1988,0.48,1.16,12.86
5/30/1988,0.00,0.94,11.86
5/31/1988,0.00,0.94,11.16
6/1/1988,2.72,0.94,11.3
6/2/1988,0.00,0.88,12.22
6/3/1988,0.92,0.81,13.16
6/4/1988,0.00,0.81,13.31
6/5/1988,0.00,0.69,13.42
6/6/1988,0.00,0.69,13.69
6/7/1988,0.00,0.64,13.95
6/8/1988,0.00,0.64,14.15
6/9/1988,0.00,0.58,14.18
6/10/1988,0.00,0.58,13.17
6/11/1988,0.00,0.58,12.07
6/12/1988,0.00,0.53,11.68
6/13/1988,0.00,0.49,10.58
6/14/1988,0.00,0.40,9.865
6/15/1988,0.00,0.40,9.251
6/16/1988,0.00,0.32,9.095
6/17/1988,0.00,0.32,9.796
6/18/1988,0.00,0.28,11.09
6/19/1988,0.00,0.25,12.24
6/20/1988,0.00,0.25,12.51
6/21/1988,0.00,0.25,12.83
6/22/1988,0.00,0.25,12.24
6/23/1988,0.00,0.25,11.73
6/24/1988,0.00,0.18,11.34
6/25/1988,0.00,0.18,11.88
6/26/1988,0.00,0.18,12.57
6/27/1988,0.00,0.16,13.2
6/28/1988,5.50,0.16,12.72
6/29/1988,0.00,0.13,11.74
6/30/1988,0.00,0.13,11.08
7/1/1988,0.00,0.32,11.06
7/2/1988,0.00,0.28,11.23
7/3/1988,0.00,0.28,11.97
7/4/1988,0.00,0.25,12.37
7/5/1988,0.00,0.25,13
7/6/1988,0.00,0.21,13.21
7/7/1988,0.00,0.18,13.29
7/8/1988,0.00,0.13,13.41
7/9/1988,0.00,0.13,13.49
7/10/1988,0.00,0.13,13.33
7/11/1988,0.00,0.13,13.13
7/12/1988,0.00,0.13,11.59
7/13/1988,0.00,0.11,10.34
7/14/1988,0.00,0.11,9.258
7/15/1988,0.00,0.09,9.584
7/16/1988,0.00,0.09,10.67
7/17/1988,0.00,0.09,11.51
7/18/1988,0.00,0.09,11.22
7/19/1988,0.00,0.13,10.27
7/20/1988,0.00,0.18,6.438
7/21/1988,0.00,0.28,2.296
7/22/1988,0.00,0.25,3.29
7/23/1988,0.00,0.32,5.341
7/24/1988,0.00,0.40,7.088
7/25/1988,0.00,0.35,7.95
7/26/1988,0.00,0.32,9.277
7/27/1988,0.00,0.32,10.6
7/28/1988,0.00,0.25,10.83
7/29/1988,0.00,0.25,10.31
7/30/1988,0.00,0.25,8.075
7/31/1988,0.00,0.18,7.727
8/1/1988,0.00,0.21,7.823
8/2/1988,0.00,0.21,8.826
8/3/1988,0.00,0.18,9.881
8/4/1988,0.00,0.18,9.957
8/5/1988,0.00,0.18,10.3
8/6/1988,0.00,0.18,9.91
8/7/1988,2.00,0.18,10.55
8/8/1988,2.64,0.18,10.41
8/9/1988,2.60,0.18,10.79
8/10/1988,0.00,0.18,9.982
8/11/1988,0.00,0.58,9.223
8/12/1988,0.00,0.64,7.946
8/13/1988,0.00,0.40,2.514
8/14/1988,0.00,0.32,4.496
8/15/1988,0.00,0.69,6.865
8/16/1988,0.00,0.81,9.129
8/17/1988,0.00,0.58,10.58
8/18/1988,0.00,0.32,11.36
8/19/1988,0.00,0.53,11.24
8/20/1988,0.00,0.32,11.25
8/21/1988,3.00,0.32,10.33
8/22/1988,0.00,0.25,9.991
8/23/1988,0.00,0.25,10.26
8/24/1988,0.00,0.21,9.69
8/25/1988,0.00,0.18,10.23
8/26/1988,1.20,0.18,9.806
8/27/1988,0.40,0.18,10.48
8/28/1988,18.32,0.16,11.04
8/29/1988,0.00,0.13,10.73
8/30/1988,0.00,0.13,10.2
8/31/1988,0.00,0.44,8.738
9/1/1988,0.00,0.58,7.971
9/2/1988,0.00,0.49,6.974
9/3/1988,0.00,0.32,7.245
9/4/1988,0.00,0.25,6.757
9/5/1988,0.00,0.18,5.911
9/6/1988,0.00,0.16,7.438
9/7/1988,0.00,0.13,7.892
9/8/1988,1.74,0.13,8.176
9/9/1988,0.00,0.09,7.239
9/10/1988,0.32,0.09,8.05
9/11/1988,0.00,0.09,8.973
9/12/1988,0.00,0.06,9.029
9/13/1988,0.00,0.04,7.839
9/14/1988,0.00,0.03,8.298
9/15/1988,0.00,0.03,9.274
9/16/1988,0.00,0.03,9.598
9/17/1988,0.20,0.01,9.834
9/18/1988,0.00,0.01,8.965
9/19/1988,0.00,0.01,7.586
9/20/1988,0.00,0.00,6.39
9/21/1988,0.00,0.00,6.086
9/22/1988,0.00,0.01,5.242
9/23/1988,0.00,0.01,5.468
9/24/1988,0.00,0.00,7.112
9/25/1988,0.00,0.00,8.564
9/26/1988,0.00,0.01,8.098
9/27/1988,0.00,0.01,7.438
9/28/1988,0.00,0.01,5.027
9/29/1988,0.00,0.01,3.847
9/30/1988,0.00,0.00,3.736
10/1/1988,19.98,0.00,4.882
10/2/1988,1.44,0.00,5.67
10/3/1988,4.76,0.00,5.89
10/4/1988,0.00,0.00,5.984
10/5/1988,0.00,0.01,4.795
10/6/1988,0.46,0.09,1.18
10/7/1988,0.16,0.16,1.71
10/8/1988,2.92,0.09,2.805
10/9/1988,0.00,0.06,3.646
10/10/1988,0.00,0.13,3.978
10/11/1988,3.02,0.11,3.825
10/12/1988,12.22,0.09,4.28
10/13/1988,5.72,0.13,5.557
10/14/1988,0.00,0.13,5.941
10/15/1988,0.00,0.09,6.157
10/16/1988,0.00,0.09,6.48
10/17/1988,2.12,0.16,7.368
10/18/1988,0.00,0.40,7.508
10/19/1988,0.00,0.28,7.449
10/20/1988,0.00,0.21,6.893
10/21/1988,1.74,0.18,7.525
10/22/1988,0.00,0.09,8.059
10/23/1988,0.00,0.09,6.449
10/24/1988,5.74,0.09,2.135
10/25/1988,0.08,0.06,2.628
10/26/1988,8.02,0.04,3.698
10/27/1988,1.08,0.03,3.943
10/28/1988,0.00,0.02,3.831
10/29/1988,2.66,0.01,4.499
10/30/1988,1.66,0.01,5.826
10/31/1988,2.88,0.09,5.149
11/1/1988,4.52,0.03,4.765
11/2/1988,2.92,0.01,4.237
11/3/1988,1.66,0.00,5.34
11/4/1988,1.40,0.00,5.443
11/5/1988,2.44,0.00,2.925
11/6/1988,2.76,0.00,2.2
11/7/1988,4.86,0.00,0.872
11/8/1988,6.00,0.03,1.637
11/9/1988,10.34,0.06,0
11/10/1988,4.36,0.06,0.4189
11/11/1988,2.00,0.81,0.7399
11/12/1988,0.58,4.46,0.0379
11/13/1988,3.72,3.59,0
11/14/1988,0.00,2.04,0.7838
11/15/1988,0.00,1.39,2.604
11/16/1988,0.00,1.94,3.007
11/17/1988,2.56,1.39,3.594
11/18/1988,6.48,0.69,3.388
11/19/1988,0.48,0.69,3.61
11/20/1988,6.92,0.40,4.088
11/21/1988,9.48,0.58,4.012
11/22/1988,2.50,0.49,2.636
11/23/1988,1.92,1.94,4.444
11/24/1988,3.78,3.45,5.406
11/25/1988,0.04,1.56,6.103
11/26/1988,1.52,1.56,6.55
11/27/1988,2.04,1.39,6.061
11/28/1988,0.00,2.46,6.712
11/29/1988,2.32,1.56,7.358
11/30/1988,0.00,1.08,7.037
12/1/1988,0.00,0.88,7.034
12/2/1988,1.04,0.64,5.991
12/3/1988,9.80,0.49,6.684
12/4/1988,0.00,0.40,7.438
12/5/1988,1.26,0.32,7.088
12/6/1988,1.00,0.28,6.931
12/7/1988,2.70,0.25,5.799
12/8/1988,0.76,0.21,3.556
12/9/1988,3.12,0.09,2.857
12/10/1988,0.04,0.09,2.951
12/11/1988,4.00,0.13,3.475
12/12/1988,0.40,0.35,4.755
12/13/1988,0.00,0.35,6.497
12/14/1988,2.36,0.64,6.792
12/15/1988,0.00,0.69,5.049
12/16/1988,1.88,0.49,3.227
12/17/1988,0.64,0.40,3.096
12/18/1988,1.56,0.28,3.878
12/19/1988,2.32,0.25,5.608
12/20/1988,3.18,0.25,6.537
12/21/1988,3.84,0.32,7.767
12/22/1988,1.86,0.32,8.419
12/23/1988,3.28,0.28,8.728
12/24/1988,14.22,1.48,8.467
12/25/1988,4.64,1.48,5.57
12/26/1988,3.20,1.39,3.291
12/27/1988,0.00,23.57,0.897
12/28/1988,4.92,22.46,1.262
12/29/1988,2.02,5.26,1.976
12/30/1988,0.00,43.62,2.227
12/31/1988,0.00,55.83,3.271
I would like top produce a plot like the following using ggplot2
I did it with excel but I would like to plot it using ggplot2 so as I can customize most of stuffs.
The entire dataset for the plot is available in the following link.
https://drive.google.com/file/d/10VXqgHyDOp7V4Ui5qQJD-f2V-7OcRA-e/view?usp=sharing
Help please.
There's some solutions in the following link:
https://rpubs.com/cxiao/hydrograph-ggplot2-plot
I personally use mine
library(ggplot2)
library(data.table)
library(lubridate)
library(tidyverse)
# store the data in csv
data <- fread('data.csv')
data <- melt(data, measure.vars = c('Observed', 'Simulated'), value.name = 'Streamflow') %>%
.[,Date := fast_strptime(Date, '%m/%d/%Y') %>% as_date()]
summary(data)
# Calculate the range needed to avoid having your hyetograph and hydrograph overlap
maxRange <- 600 # set how wide of the first axis (streamflow)
coeff <- 0.4 # set the shrink coeffcient of Precipitation
# Plot the data
ggplot(data = data, aes(x = Date)) +
# Use geom_tile to create the inverted hyetograph
# y = the center point of each bar
# maxRange - Precipitation/coeff/2
geom_tile(aes(y = maxRange - Precipitation/coeff/2,
height = Precipitation/coeff,
fill = 'PColor')
)+
# Plot your discharge data
geom_line(aes(y = Streamflow,
color = variable),
alpha = 0.8,
size = 0.7) +
# Create a second axis with sec_axis() and format the labels to display the original precipitation units.
scale_y_continuous(name = "Streamflow ()",
limit = c(0, maxRange),
expand = c(0, 0),
sec.axis = sec_axis(trans = ~(.-maxRange)*coeff,
name = "Precipitation (mm/d)"))+
scale_fill_manual(values = c('PColor' = "#386cb0"),
labels = c('PColor' = 'Precipitation'),
name = NULL
)+
scale_color_manual(values = c('black', '#e41a1c'),
name = NULL)+
theme_bw()+
guides(color = guide_legend(nrow = 1)) +
theme(
# legend.position = c(0.75, 0.5),
legend.position = 'top',
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())

How to filter data for data visualization in r?

I am required to re-create the visualization but filter the data to keep only the businesses that had terms less than 360 months.
The data I am using is the SBA data from this link:
https://amstat.tandfonline.com/doi/full/10.1080/10691898.2018.1434342
library('magrittr')
library(dplyr)
library(tidyr)
library(ggplot2)
sba2 <- sba_data %>%
mutate(default_binary = ifelse(MIS_Status=="CHGOFF","Paid in Full","Default"), daysterm = Term*30, xx = as.Date(sba_data$DisbursementDate, format="%Y-%m-%d") + daysterm, recession_binary = ifelse(xx >= "2007-12-01" & xx <="2009-06-30","Active during Recession","Not Active during Recession"), smaller_business_binary = ifelse(NoEmp < 30, "Very Small Biz", "Not Very Small Biz"), business_length = ifelse(Term < 360, "Short Business", "Long Business"))
table(sba2$business_length)
sba_3 <- sba2 %>%
group_by(recession_binary, default_binary) %>%
summarise(frequencies=n()) %>%
drop_na() %>%
mutate(percents = round(frequencies/sum(frequencies),2))
ggplot(data = sba_3 ) +
geom_col(mapping = aes(x = recession_binary, y = percents, fill = default_binary)) +
coord_flip() +
scale_fill_manual(breaks = c("Default", "Paid in Full"),
values=c(rgb(232/255,74/255,39/255), rgb(19/255,41/255,75/255))) +
scale_y_continuous(labels = scales::percent)
This is my code so far to recreate the visualization. However, I am unsure how to filter the data to only keep business with a term less than 360 months. I had created the variable business_length when mutating sba2, but am not sure what the next steps are. Any help would be greatly appreciated, thanks!
Something like this?
sba_3 <- sba2 %>%
filter(Term < 360) %>%
group_by(recession_binary, default_binary) %>%
summarise(frequencies=n()) %>%
drop_na() %>%
mutate(percents = round(frequencies/sum(frequencies),2))

stat_Summary mean value for group of values

I have the following matrix :
df1 <- read.table(text= 'Sample Al2O3_D1 CaO_D1 Fe2O3_D1 K2O_D1 SiO2_D1 TiO2_D1 Al2O3_D1b0 CaO_D1b0 Fe2O3_D1b0 K2O_D1b0 SiO2_D1b0 TiO2_D1b0 Al2O3_D2 CaO_D2 Fe2O3_D2 K2O_D2 SiO2_D2 TiO2_D2 Al2O3_D2b0 CaO_D2b0 Fe2O3_D2b0 K2O_D2b0 SiO2_D2b0 TiO2_D2b0
BIL1 18.527771 18.256632 6.56251 8.342669 4.374867 0.5842755 17.7422245 23.987872 6.238913 6.780187 5.211082 0.1722543 14.524362 11.540785027 6.567724 11.59751 2.716793 0.148641 21.5237007 32.4114021 6.802355 12.696409 2.418464 0.9386197
Ha11 4.397232 12.888497 34.33309 52.961988 11.131223 45.1434655 15.0384636 12.980365 30.093978 54.002291 3.393567 38.6145158 12.323475 17.028706310 38.690775 49.06582 20.080664 45.619040 3.5297936 16.5442327 41.180170 47.440273 24.276975 44.3668208
Ha137 103.000052 2.212967 51.25095 306.746410 121.806659 63.6579611 212.8800243 2.165292 18.769017 364.236920 62.013419 21.9816389 106.235891 1.269353476 5.057160 287.15363 55.081696 58.204002 161.4670946 1.2884466 34.972412 298.504372 104.189560 31.8284467
Ha171 90.591557 1.569013 67.93309 232.845227 127.158291 75.5793762 254.3307398 1.513897 27.405290 310.109817 42.146396 27.6889288 79.012056 0.145186508 3.886563 222.66687 7.391477 68.673164 191.7846982 0.2314552 42.085088 246.272719 80.267907 36.9812088
Ha21 7.097864 17.014437 45.15964 69.600986 13.324667 55.9935294 25.9518750 17.074982 40.215454 74.263048 2.894586 46.7076167 15.344770 20.336227579 46.986192 62.41907 18.820986 55.880410 10.5328447 19.9025608 50.333141 60.448031 25.283637 52.3899333
Ha28a 14.276762 10.891896 36.92640 93.706836 22.456851 53.3221147 33.2916429 10.929970 31.081099 98.541650 11.519390 43.0020990 23.122859 13.698527956 38.803813 85.77801 28.915269 53.137565 17.2874734 13.2874359 42.799188 83.485379 35.607017 49.1309903
Ha45 8.755274 16.194970 23.72806 24.662651 10.998789 33.8922508 9.9868429 16.441356 22.187580 23.786921 6.279046 30.1799502 14.205216 22.219296703 30.537651 24.31487 22.675330 34.744338 4.7672324 21.8993271 30.730442 23.807887 24.471693 35.2918163
Ha56 2.519377 19.291306 17.78027 14.689078 1.340254 26.7134525 0.3009044 19.692186 16.525441 13.056810 1.866104 25.1075378 5.078764 25.877687826 24.668074 16.58434 12.172266 27.542356 0.9123158 25.9770408 24.642464 16.954296 13.036683 28.7758434
Ha78a 16.630011 10.518633 36.20753 65.697116 18.732162 49.0261161 31.7415436 10.576057 31.740250 68.840011 9.396106 41.7404661 25.679895 13.894132835 40.025832 59.56926 26.868566 49.336063 17.2903714 13.4394515 42.754641 57.575556 32.258478 47.4607437
Ha91 47.531258 6.963291 44.16573 120.753235 57.927836 57.5642663 92.4665900 6.947703 35.101718 135.376499 34.582463 39.3617521 56.780675 7.732582039 39.817328 109.13196 49.911602 56.081569 64.6430022 7.5310163 47.001601 108.587613 66.871031 46.5091579
L110 32.616500 6.162218 48.84669 204.508488 51.576478 64.5594220 100.1659655 6.133002 32.384991 232.810812 20.687400 40.3856100 35.670427 6.239853155 31.080776 188.34130 26.920909 61.901493 67.6463355 6.1250955 45.472784 190.557631 51.085145 47.6973747
L28d 213.963550 4.226734 91.60353 263.427318 250.950733 98.6845750 638.5723017 4.165268 36.270331 529.076984 70.311398 0.0038225 154.434579 2.280303654 3.936042 300.61284 42.674167 82.568998 501.6810607 2.4456430 49.319737 419.089754 116.983089 13.1846725
SARM41 9.206072 8.542555 16.87283 1.652141 6.469613 4.8801782 9.9053176 15.636966 17.675264 3.357829 7.712771 5.6633793 5.816894 1.068768333 9.466277 6.61795 1.658162 3.989925 12.5027313 25.0161949 9.891560 7.313323 1.511116 2.3360493', header=T, stringsAsFactors = F)
These are chemical analysis for different samples obtained with different techniques (techniques are D1, D1b0, D2, D2b0). So for each sample, I have an analysis of different elements (e.g.) Al2O3 made with 4 techniques (Al2O3_D1, Al2O3_D1b0, Al2O3_D2, Al2O3_D2bO)
I want to make boxplot of the variations associated with each element and each technique, with the mean value obtained for each element and each technique in a facet_wrap, so I made the following code:
df1 %>%
pivot_longer(cols = -Sample) %>%
separate(name, c("key","number"), sep = "_") %>%
group_by(number) %>%
ggplot(aes(x=number, y=value, fill = number)) +
geom_boxplot() +
scale_fill_brewer(palette = "RdBu") +
scale_y_log10(limits = c(0.01, NA)) +
theme(legend.position="bottom") +
stat_summary(aes(y = value, label = round(..y.., 2)),
fun = mean, geom = "text", vjust=10,
position = position_dodge(0.9)) +
facet_wrap(key~., ncol=2)
They are two problems with this code:
it is giving me a mean value for something else than what I want. I should have for Al2O3: D1 135.32, D1b0 390.2, D2 100.29, D2b0 307.55
Instead, I have D1 1.3, D1b0 1.53, D2 1.4, D2b0 1.38
I would like all the mean values to be aligned on the same line (e.g. along 1e-1 line) on top of their respective technique but they are not aligned.
Thanks a lot for your help, Anne-Christine
You can use something like the following
df1 %>%
pivot_longer(cols = -Sample) %>%
separate(name, c("key","number"), sep = "_") %>%
group_by(number) %>%
ggplot(aes(x=number, y=value, fill = number)) +
geom_boxplot() +
scale_fill_brewer(palette = "RdBu") +
theme(legend.position="bottom") +
stat_summary(aes(y = value, label = round(..y.., 2)), fun = mean, geom = "text",colour="darkblue") +
facet_wrap(key~., ncol=2, scales = "free")
I will just piggyback off Bappa's answer and your updated code (I don't have enough rep to just comment)
df1 %>%
pivot_longer(cols = -Sample) %>%
separate(name, c("key","number"), sep = "_") %>%
group_by(key, number) %>%
mutate(mean = mean(value, na.rm = T)) %>%
ggplot(aes(x=number, y=value, fill = number)) +
geom_boxplot() +
scale_fill_brewer(palette = "RdBu") +
scale_y_log10() +
geom_text(aes(label = round(mean, 2), y = 1e-01)) +
facet_wrap(key~., ncol=2)
By getting the mean value before creating the plot and log transforming, you should be able to get around the issue with the means you had. The numbers align with those Bappa displayed.
Using geom_text, you can position the labels wherever you like.

Resources