Reversing the y axis on discrete data - r

I have a chart built with ggridges, as below, for which I would like to reverse the date order:
To do this I've added the c_trans() function, as defined here, but it requires the date in POSIXct format, which appears to convert the y axis to a continuous scale, even when I define group as a factor:
ggplot( lengthCounts2 %>% filter(rwi == rwiFilter),
        aes( x = len,
             fill = date,
             group = factor(date)
        )
) +
  stat_density( aes( weight = normalised,
                     y = date, # time_trans works with objects of class POSIXct only
                     height = after_stat(density)
                ),
                geom = 'density_ridges',
                position = 'identity',
                adjust = 0.1
  ) +
  scale_y_continuous(trans = rev_date) +
  theme_minimal() +
  theme( plot.title = element_text(hjust = 0.5)
  ) +
  # scale_fill_brewer(palette = "Pastel1") +
  labs( title = glue("Sampled/normalised packet size distribution for rwi={rwiFilter} ({rwiText})\n"),
        x = "Length (bytes)",
        y = "Date"
  )
The result (plot not shown) is clearly not what I want. Is there any way to reverse the y axis but keep it discrete?
Input data:
> lengthCounts2
# A tibble: 8,724 x 5
# Groups: date, rwi [6]
date rwi len n normalised
<dttm> <chr> <dbl> <int> <dbl>
1 2022-04-13 00:00:00 01 35 677 0.0000319
2 2022-04-13 00:00:00 01 40 3113138 0.147
3 2022-04-13 00:00:00 01 41 15078 0.000710
4 2022-04-13 00:00:00 01 42 2077 0.0000978
5 2022-04-13 00:00:00 01 43 2554 0.000120
6 2022-04-13 00:00:00 01 44 29190 0.00137
7 2022-04-13 00:00:00 01 45 2065 0.0000972
8 2022-04-13 00:00:00 01 46 2054 0.0000967
9 2022-04-13 00:00:00 01 47 2625 0.000124
10 2022-04-13 00:00:00 01 48 146334 0.00689
# ... with 8,714 more rows

Use:
ylim("20220427", "20220420", "20220413")
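The ylim() call above only works on a discrete axis, so it presumably assumes that y is mapped to a character or factor version of the date rather than to the POSIXct column. A minimal sketch of that idea follows. The toy data frame, the date_label column and the use of geom_point() in place of the ridgeline layer are all assumptions made for illustration:

```r
library(ggplot2)

# Hypothetical toy data standing in for lengthCounts2
toy <- data.frame(
  date = as.POSIXct(rep(c("2022-04-13", "2022-04-20", "2022-04-27"), each = 3),
                    tz = "UTC"),
  len = rep(c(40, 41, 42), times = 3),
  normalised = c(0.5, 0.3, 0.2, 0.4, 0.4, 0.2, 0.1, 0.6, 0.3)
)

# Assumption: format the date as a string so ggplot2 treats the axis as discrete
toy$date_label <- format(toy$date, "%Y%m%d")

p <- ggplot(toy, aes(x = len, y = date_label)) +
  geom_point(aes(size = normalised)) +
  # Character limits give a discrete scale; the first value sits at the bottom
  ylim("20220427", "20220420", "20220413")
```

With many dates, scale_y_discrete(limits = rev) may be a convenient alternative to listing every level by hand, since it reverses whatever order the discrete scale would otherwise use.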

Related

How to use geom_rect to highlight specified facets

I have a facet plot in which I need to place a rectangle in, or otherwise highlight, 3 specific facets: 5, 6, and 10. See below:
I found some code using "geom_rect" that seems like it should work, but the rectangle doesn't show up and I don't get any error message. Here is the code:
weekly_TS_PDF<- ggplot(TS_stack, aes(x= TS_log, y = TS_depth, color= sentiment)) +
scale_y_reverse(limits= c(16,2), breaks= seq(16,2)) +
geom_rect(data = data.frame(Week = 5), aes(xmin = -65, xmax = -55, ymin = 1, ymax = 16), alpha = .3, fill="grey", inherit.aes = F) +
geom_point() + facet_grid(.~ Week) + geom_hline(data = week_avg_15E, aes(yintercept = x), linetype = "solid") +
ylab("Target Depth (m)") + xlab("Mean Target Strength (dB)") + ggtitle("Mean TS by Depth by Week (12 hour resolution)") +
guides(color=guide_legend("Year"))
Reprex data:
X TS_depth Group.1 x TS_log Date_time AMPM Week sentiment
1 1 9.593093 2020-12-01 18:00:00 5.390264e-07 -62.68390 2020-12-01 18:00:00 PM 5 Year 1
2 2 9.550032 2020-12-02 06:00:00 4.022841e-07 -63.95467 2020-12-02 06:00:00 AM 6 Year 1
3 3 9.677069 2020-12-02 18:00:00 6.277191e-07 -62.02235 2020-12-02 18:00:00 PM 7 Year 1
4 4 9.679256 2020-12-03 06:00:00 3.501608e-07 -64.55732 2020-12-03 06:00:00 AM 8 Year 1
5 5 9.606380 2020-12-03 18:00:00 6.698625e-07 -61.74014 2020-12-03 18:00:00 PM 9 Year 1
6 6 9.548408 2020-12-04 06:00:00 4.464622e-07 -63.50215 2020-12-04 06:00:00 AM 10 Year 1
I just need to highlight or put a rectangle in facets 5,6, and 10. Any help is appreciated.
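One possible approach, sketched here under assumptions rather than taken from an accepted answer: give geom_rect() its own data frame with one row per facet to highlight, and use -Inf/Inf for the rectangle bounds so that they cannot fall outside the axis limits. The TS_toy and highlight data frames are hypothetical stand-ins for TS_stack:

```r
library(ggplot2)

# Hypothetical stand-in for TS_stack, one point per week
TS_toy <- data.frame(
  TS_log   = c(-62.7, -63.9, -62.0, -64.6, -61.7, -63.5),
  TS_depth = c(9.59, 9.55, 9.68, 9.68, 9.61, 9.55),
  Week     = c(5, 6, 7, 8, 9, 10)
)

# One row per facet that should receive a rectangle
highlight <- data.frame(Week = c(5, 6, 10))

p <- ggplot(TS_toy, aes(x = TS_log, y = TS_depth)) +
  # Infinite bounds fill the whole panel regardless of the scale limits
  geom_rect(data = highlight,
            aes(xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf),
            fill = "grey", alpha = 0.3, inherit.aes = FALSE) +
  geom_point() +
  scale_y_reverse(limits = c(16, 2), breaks = seq(16, 2)) +
  facet_grid(. ~ Week)
```

Because the highlight data frame contains the faceting variable Week, each row is drawn only in the matching panel. In the original code, ymin = 1 lies outside the limits c(16, 2), which may be why that rectangle was dropped.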

How to use another variable values as labels on date x-axis in ggplot?

I have created a ggplot using date x axis but I would like to show their values from another variable instead of dates.
df
library(tidyverse)
library(lubridate)
df <- read_rds("https://github.com/johnsnow09/covid19-df_stack-code/blob/main/vaccine_milestones.rds?raw=true")
df
Updated.On cr_bin days_to_next_10cr_vacc
<date> <fct> <drtn>
1 2021-04-11 10 Cr 85 days
2 2021-05-27 20 Cr 46 days
3 2021-06-24 30 Cr 28 days
4 2021-07-18 40 Cr 24 days
5 2021-08-06 50 Cr 19 days
6 2021-08-25 60 Cr 19 days
7 2021-09-07 70 Cr 13 days
8 2021-09-18 80 Cr 11 days
9 2021-10-02 90 Cr 14 days
df %>%
ggplot(aes(x = Updated.On, y = days_to_next_10cr_vacc)) +
geom_col() +
scale_x_date(aes(labels = cr_bin))
Also tried: scale_x_date(aes(labels = c("10","20","30","40","50","60","70","80","90")))
In the plot on the x axis I would like to have values displayed from cr_bin instead of dates as 10 Cr, 20 cr, 30 Cr ... so on 90 Cr.
I have tried above code but I am not sure what else to use in place of labels to get desired results
You need to set breaks for labels. I'm using unique, just in case there might be duplicate rows.
Also note the conversion of difftime to integer.
library(tidyverse)
library(lubridate)
df <- read_rds("https://github.com/johnsnow09/covid19-df_stack-code/blob/main/vaccine_milestones.rds?raw=true")
df %>%
ggplot(aes(x = Updated.On, y = as.integer(days_to_next_10cr_vacc))) +
geom_col() +
scale_x_date(breaks = unique(df$Updated.On), labels = unique(df$cr_bin))
Created on 2021-10-21 by the reprex package (v2.0.1)

How do I change the origin of the x axis in ggplot to go from 'August to March' instead of 'Jan to March, August to December'?

I want to plot temperature data over time, with the x axis: "08-01", "09-01", "10-01", "11-01", "12-01", "01-01", "02-01", "03-01"
Rather than: "01-01", "02-01", "03-01", "08-01", "09-01", "10-01", "11-01", "12-01", which is what R is doing.
My data looks like the following. My x axis uses the Month_day column. Unique values in this column are: "08-01", "09-01", "10-01", "11-01", "12-01", "01-01", "02-01", "03-01".
> head(upstream)
Date daily_aveTempC Moving_Average_7day Year Month Day Month_day monthAbb Migration EmbryoDev
1 2007-08-01 13.49556 13.94947 2007 08 01 08-01 Aug Upstream
2 2007-08-02 13.44325 13.74864 2007 08 02 08-02 Aug Upstream
3 2007-08-03 12.93881 13.56086 2007 08 03 08-03 Aug Upstream
4 2007-08-04 12.78937 13.41106 2007 08 04 08-04 Aug Upstream
5 2007-08-05 13.13963 13.29029 2007 08 05 08-05 Aug Upstream
6 2007-08-06 13.11844 13.19651 2007 08 06 08-06 Aug Upstream
I have the following code that plots Month_day (x axis) vs Moving Average 7day (y axis).
png(paste0(read_out_final, "Migration_Upstream_7day_MovingAve_Sal_4.png"), res=300, width = 15, height = 8, units = "in")
ggplot(data=upstream, aes(x=as.factor(Month_day), y=factor(Moving_Average_7day, levels=upstream$Month_day), color=Year, group=Year)) +
geom_line(size=1) +
theme_bw() +
scale_y_continuous(n.breaks = 20,
limits=c(1,20)) +
scale_x_discrete(breaks = upstream$Month_day[grep("0*-01", upstream$Month_day)]) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(title="Salmon Creek 4: Upstream Migration from August to March",
x="Date",
y="Temperature (7-Day Rolling Average degrees C)")
dev.off()
This plots the data: "01-01", "02-01", "03-01", "08-01", "09-01", "10-01", "11-01", "12-01".
But I want the data plotted: "08-01", "09-01", "10-01", "11-01", "12-01", "01-01", "02-01", "03-01"
I've seen solutions to this issue using the plot() function, but not for ggplot.
Almost every question on ggplot2 that includes "order of ... axis" can be resolved by using factor(., levels=), and explicitly controlling the order of the levels.
dat <- data.frame(dt = seq(as.Date("2020-08-01"), as.Date("2021-04-01"), by="month"), y = 1:9)
dat$MonDay <- format(dat$dt, format = "%m-%d")
dat
# dt y MonDay
# 1 2020-08-01 1 08-01
# 2 2020-09-01 2 09-01
# 3 2020-10-01 3 10-01
# 4 2020-11-01 4 11-01
# 5 2020-12-01 5 12-01
# 6 2021-01-01 6 01-01
# 7 2021-02-01 7 02-01
# 8 2021-03-01 8 03-01
# 9 2021-04-01 9 04-01
library(ggplot2)
ggplot(dat, aes(MonDay, y)) + geom_point()
This plots the months in the wrong order because ggplot2 has to order its variables: numeric or integer values are ordered naturally, but character values are sorted lexicographically, so "08-01" comes after "04-01" (even though the strings were formed from an object that had the opposite ordering). Setting the factor levels explicitly, in date order, fixes this:
dat$MonDay <- factor(dat$MonDay, levels = unique(dat$MonDay[order(dat$dt)]))
ggplot(dat, aes(MonDay, y)) + geom_point()
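The same idea can be sketched against the shape of the asker's data. The upstream_toy data frame below is a hypothetical miniature of upstream that keeps only the Date and Month_day columns, and it assumes the rows cover a single August to March season:

```r
# Hypothetical miniature of the asker's `upstream` data (Aug 2007 to Mar 2008)
upstream_toy <- data.frame(
  Date = seq(as.Date("2007-08-01"), as.Date("2008-03-01"), by = "month")
)
upstream_toy$Month_day <- format(upstream_toy$Date, "%m-%d")

# Default character sorting puts January first
default_order <- sort(unique(upstream_toy$Month_day))

# Order the levels by the underlying Date instead
upstream_toy$Month_day <- factor(
  upstream_toy$Month_day,
  levels = unique(upstream_toy$Month_day[order(upstream_toy$Date)])
)
levels(upstream_toy$Month_day)
# "08-01" "09-01" "10-01" "11-01" "12-01" "01-01" "02-01" "03-01"
```

With several years of data, the levels follow the order in which each Month_day value first appears by date, so this relies on the earliest season in the data starting in August.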

for loop with ggplot2

I have to plot the same graph a couple of times with different rectangles.
head(df)
DATA n
<date> <int>
1 2018-01-02 243
2 2018-01-03 243
3 2018-01-04 221
4 2018-01-05 211
5 2018-01-06 35
6 2018-01-07 30
head(rectangles)
channel begin end
<chr> <date> <date>
1 aaaaaaaaaaaaa 2018-09-28 2018-12-28
2 bbbb 2018-08-31 2018-10-31
3 cccccccccccccc 2018-08-31 2018-10-31
4 aaaaaaaaaaaaaaaaaaaaaaa 2018-08-31 2018-10-31
5 ddddddddddddddddddddddddddddddd 2018-08-31 2018-10-31
What I have done so far to get one plot per unique value of rectangles$channel, each using the same df data:
unique_rectangles <- unique(rectangles$channel)
for (rect in unique_rectangles) {
plot <- ggplot(df, aes(x = DATA, y =n)) +
geom_rect(data = subset(rectangles, rectangles$channel==unique_rectangles[ret]), aes(xmin = begin, xmax = end, ymin = -Inf, ymax = +Inf), inherit.aes = FALSE, fill = 'red', alpha = 0.2) +
geom_line() +
ggtitle(paste(unique_rectangles[ret]))
print(plot)}
But all I got is:
Error: Aesthetics must be either length 1 or the same as the data (1): xmin, xmax
What can I do to get the multiple plots?
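A minimal sketch of one way to structure the loop, offered as an assumption rather than a confirmed diagnosis of the error above: iterate over the channel values themselves and use the loop variable directly when subsetting, instead of indexing unique_rectangles with a separate name. df_toy and rect_toy are hypothetical stand-ins for df and rectangles:

```r
library(ggplot2)

# Hypothetical stand-ins for df and rectangles
df_toy <- data.frame(
  DATA = seq(as.Date("2018-08-01"), as.Date("2018-12-31"), by = "day")
)
df_toy$n <- seq_len(nrow(df_toy))

rect_toy <- data.frame(
  channel = c("aaa", "bbbb"),
  begin   = as.Date(c("2018-09-28", "2018-08-31")),
  end     = as.Date(c("2018-12-28", "2018-10-31"))
)

plots <- list()
for (ch in unique(rect_toy$channel)) {
  # Keep only the rectangle(s) belonging to the current channel
  one_rect <- rect_toy[rect_toy$channel == ch, ]
  plots[[ch]] <- ggplot(df_toy, aes(x = DATA, y = n)) +
    geom_rect(data = one_rect,
              aes(xmin = begin, xmax = end, ymin = -Inf, ymax = Inf),
              inherit.aes = FALSE, fill = "red", alpha = 0.2) +
    geom_line() +
    ggtitle(ch)
  print(plots[[ch]])
}
```

Storing each plot in a named list keeps them available after the loop, for example to save them with ggsave().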

ggplot `geom_segment()` fails to recognize `group_by()` specification

library(tidyverse)
library(lubridate)
library(stringr)
df <-
  tibble(Date = as.Date(0:364, origin = "2017-07-01"), Value = rnorm(365)) %>%
  mutate(Year = str_sub(Date, 1, 4),
         MoFloor = floor_date(Date, unit = "month")) %>%
  group_by(Year, MoFloor) %>%
  mutate(MoAvgValue = mean(Value)) %>%
  ungroup() %>%
  group_by(Year) %>%
  mutate(MinMoFloor = min(MoFloor),
         MaxMoFloor = max(MoFloor),
         YearAvgValue = mean(MoAvgValue))
#> # A tibble: 365 x 8
#> # Groups: Year [2]
#> Date Value Year MoFloor
#> <date> <dbl> <chr> <date>
#> 1 2017-07-01 -1.83 2017 2017-07-01
#> 2 2017-07-02 -2.13 2017 2017-07-01
#> 3 2017-07-03 1.49 2017 2017-07-01
#> 4 2017-07-04 0.0753 2017 2017-07-01
#> 5 2017-07-05 -0.437 2017 2017-07-01
#> 6 2017-07-06 -0.327 2017 2017-07-01
#> 7 2017-07-07 -1.28 2017 2017-07-01
#> 8 2017-07-08 0.280 2017 2017-07-01
#> 9 2017-07-09 1.24 2017 2017-07-01
#> 10 2017-07-10 0.0921 2017 2017-07-01
#> # ... with 355 more rows, and 4 more
#> # variables: MoAvgValue <dbl>,
#> # MinMoFloor <date>,
#> # MaxMoFloor <date>,
#> # YearAvgValue <dbl>
Let's first plot the data frame above.
ggplot(df, aes(MoFloor, MoAvgValue, group = Year)) +
facet_grid(~Year, scale = "free_x", space = "free_x") +
geom_point()
In my call to the facet_grid() function I added the arguments scale = "free_x" and space = "free_x" to get rid of empty white space on the plots.
When I go ahead and add geom_segment()s based on group_by()d data, the scale = "free_x" and space = "free_x" arguments are negated. The empty white space reappears!
ggplot(df, aes(MoFloor, MoAvgValue, group = Year)) +
facet_grid(~Year, scale = "free_x", space = "free_x") +
geom_point() +
geom_segment(data = df,
aes(x = min(MinMoFloor),
y = YearAvgValue,
xend = max(MaxMoFloor),
yend = YearAvgValue))
My df data frame is grouped by Year. Why doesn't the geom_segment() function recognize this when I enter (for example) the x = min(MinMoFloor) argument? geom_segment() is pulling min(MinMoFloor) from the whole column, instead of from each group. How do I get geom_segment() to evaluate the MinMoFloor column as grouped data?
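ggplot2 evaluates aes() expressions against the layer's data as a whole and does not use dplyr grouping, so min(MinMoFloor) collapses to a single global value. Since MinMoFloor and MaxMoFloor were already computed per Year in the mutate() step, one option is to map them directly. A minimal sketch, in which the df2 name and the fixed seed are assumptions for illustration:

```r
library(dplyr)
library(lubridate)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df2 <- tibble(Date = as.Date(0:364, origin = "2017-07-01"),
              Value = rnorm(365)) %>%
  mutate(Year = substr(as.character(Date), 1, 4),
         MoFloor = floor_date(Date, unit = "month")) %>%
  group_by(Year, MoFloor) %>%
  mutate(MoAvgValue = mean(Value)) %>%
  group_by(Year) %>%
  mutate(MinMoFloor = min(MoFloor),
         MaxMoFloor = max(MoFloor),
         YearAvgValue = mean(MoAvgValue)) %>%
  ungroup()

p <- ggplot(df2, aes(MoFloor, MoAvgValue)) +
  facet_grid(~Year, scales = "free_x", space = "free_x") +
  geom_point() +
  # Map the per-year columns directly, without wrapping them in min()/max()
  geom_segment(aes(x = MinMoFloor, xend = MaxMoFloor,
                   y = YearAvgValue, yend = YearAvgValue))
```

Each facet then only receives segment endpoints from its own year, so the free x scales are no longer stretched to the global minimum and maximum.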
