How to change how POSIXct objects are displayed in facet titles? - r

I am trying to create a ggplot2 graph using facet_grid(). Each facet must be titled with a date (here a POSIXct object), and I would like to change how that date is displayed.
How can I control the way POSIXct objects are displayed in ggplot2 facet titles?
Example: this is how a date is displayed: "2018-03-29"
and this is how I would like it to be written: "29/03/2018"
I have already looked at the labeller function, but I can't figure out how to use it to change the way a POSIXct object is displayed. Maybe I am missing something.
I know facet labels can be changed "manually", but I want a solution that works for any POSIXct object.
# create a dummy dataframe named ex
ex = structure(list(date = structure(c(1510531200, 1510531200, 1522195200,
1522195200), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
cat = c("a", "b", "a", "b"), measure = c(0.0777420913800597,
0.71574708330445, 0.725231731543317, 0.217509124660864)), row.names = c(NA,
-4L), vars = "date", indices = list(0:1, 2:3), group_sizes = c(2L,
2L), biggest_group_size = 2L, labels = structure(list(date = structure(c(1510531200,
1522195200), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA,
-2L), class = "data.frame", vars = "date", indices = list(c(0L,
1L, 8L, 9L, 16L, 17L), c(2L, 3L, 4L, 5L, 10L, 11L, 12L, 13L,
18L, 19L, 20L, 21L), c(6L, 7L, 14L, 15L, 22L, 23L)), group_sizes = c(6L,
12L, 6L), biggest_group_size = 12L, labels = structure(list(date = structure(c(1510531200,
1522195200, 1543881600), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA,
-3L), class = "data.frame", vars = "date"), drop = TRUE), drop = TRUE, class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
# create a graph
library(ggplot2)
plot_ex = ggplot(ex, aes(x = cat, y = measure)) +
geom_bar(stat = "identity") +
facet_grid(.~date)
print(plot_ex)
The facets are named "2017-11-13" and "2018-03-28". I want them to be "13/11/2017" and "28/03/2018".
Many thanks for your help,

You can change how dates are printed with format. Using that, we can set an appropriate labeller, without changing the data.frame column.
ggplot(ex, aes(x = cat, y = measure)) +
geom_bar(stat = "identity") +
facet_grid(.~date, labeller = function(x) format(x, '%d/%m/%Y'))
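For reuse across plots, the same idea can be wrapped in a small helper. This is only a sketch: `date_labeller` is a hypothetical name, not a ggplot2 function, and it relies on the fact that a labeller receives a data frame of labels and may return a list of character vectors. The check below runs without ggplot2.

```r
# Hypothetical helper (not part of ggplot2): returns a labeller function that
# reformats date or date-time columns and leaves other labels as text.
date_labeller <- function(fmt = "%d/%m/%Y") {
  function(labels) {
    lapply(labels, function(col) {
      if (inherits(col, c("POSIXct", "Date"))) {
        format(col, fmt)      # format() honours the object's own time zone
      } else {
        as.character(col)
      }
    })
  }
}

# Stand-alone check on a label data frame like the one facet_grid() passes in
labels <- data.frame(
  date = as.POSIXct(c("2017-11-13", "2018-03-28"), tz = "UTC")
)
formatted <- date_labeller()(labels)
print(formatted$date)
```

With the plot from the question, this would be used as `facet_grid(. ~ date, labeller = date_labeller("%d/%m/%Y"))`. The underlying column stays a POSIXct, so the facets keep their chronological order.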

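We can use strftime to convert the column to formatted character strings. Pass the time zone explicitly, because strftime otherwise converts in the session's local time zone rather than the object's own.
ex$date <- strftime(ex$date, format="%d/%m/%Y", tz="UTC")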
library(ggplot2)
plot_ex <- ggplot(ex, aes(x=cat, y=measure)) +
geom_bar(stat="identity") +
facet_grid(.~date)
print(plot_ex)
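One caveat with converting the column to text: character facets are ordered alphabetically, which for day-first strings is not necessarily chronological. A possible workaround, sketched here with made-up dates (not the data from the question), is to turn the formatted strings into a factor whose levels follow the original date order.

```r
# Made-up example dates, chosen so that alphabetical and chronological order differ
dates <- as.POSIXct(c("2018-03-05", "2017-11-13", "2018-03-05"), tz = "UTC")

labels_chr <- format(dates, "%d/%m/%Y")

# Order the facets would get if the column were plain character
alphabetical <- sort(unique(labels_chr))

# Levels built from the sorted dates keep the chronological order
chronological <- format(sort(unique(dates)), "%d/%m/%Y")
date_f <- factor(labels_chr, levels = chronological)

print(alphabetical)
print(levels(date_f))
```

Assigning such a factor back to the faceting column should keep the facets in date order while showing the day-first labels.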

Related

Create Zip_Choropleth Map > Add County line overlay and adjust color gradient

Update (1/8/23)
With some great help from jrcalabrese, we were able to determine that the problem I was having stemmed from having zip codes in my sample that were not in the state I was zooming to. This was fixed by filtering the sample for only zip codes in the zoomed state.
There are 2 more things I want to do with this plot:
Create a manual color palette using...
scale_fill_gradientn(colours = c("#FFFFFF","#f688ee","#000775"))
Overlay county lines for
AZ_County <- (c("Apache County", "Coconino County", "Mohave County",
"Navajo County", "Yavapai County"))
I have not yet spent much time on these two points, since I am only now adding them in this update.
Here is data and code:
structure(list(
region = c("85324", "85332", "85360", "85362", "85901", "85902", "85911", "85912", "85920", "85923"), value = c(363L, 238L, 75L, 71L, 4454L, 136L, 68L, 31L, 39L, 132L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(region = c("85324", "85332", "85360", "85362", "85901", "85902", "85911", "85912", "85920", "85923"), .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), .drop = TRUE))
zip_choropleth(Pop_Zip,
state_zoom = "arizona",
title = "Test",
legend = "Population"
) +
coord_map()
Original question:
I am trying to create a Zip_Choropleth map.
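When I run the code, it just hangs, and ultimately RStudio times out. Other questions on this topic seemed outdated, so I started this thread. Here is the code I am using. Suggestions?
library(devtools)  # provides install_github()
install_github("arilamstein/choroplethrZip@v1.5.0")  # "@" selects a tag; "#" would refer to a pull request
library(choroplethrZip)
library(dplyr)  # provides %>%, count() and rename()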
region <- c("06360", "11953", "22740", "24277", "34275",
"38671", "40351", "46371","53168","56081", "61615", "61920")
df <- data.frame(region)
Pop_Zip <- df %>%
count(region) %>%
rename(value = n)
zip_choropleth(Pop_Zip,
state_zoom = "arizona",
title = "Test",
legend = "Population") +
coord_map()
Instead of scale_fill_gradient, you should use scale_fill_gradient2 because it allows for a midpoint. To get the county lines, you need to bring in data(county.map) and subset it, but make sure that your own list of counties doesn't have the "County" string on the end. geom_polygon will allow for the county line overlay. Finally, setting num_colors to 1 allows for a continuous legend.
library(choroplethr)
library(choroplethrZip)
library(choroplethrMaps)
library(tidyverse)
AZ_County <- c("Apache", "Coconino", "Mohave", "Navajo", "Yavapai")
data("county.map")
countyref <- county.map %>%
filter(NAME %in% AZ_County)
zip <- structure(list(
region = c("85324", "85332", "85360", "85362", "85901", "85902", "85911", "85912", "85920", "85923"),
value = c(363L, 238L, 75L, 71L, 4454L, 136L, 68L, 31L, 39L, 132L)),
class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L),
groups = structure(list(
region = c("85324", "85332", "85360", "85362", "85901", "85902", "85911", "85912", "85920", "85923"),
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L),
ptype = integer(0),
class = c("vctrs_list_of", "vctrs_vctr", "list"))),
class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA, -10L),
.drop = TRUE)) %>% as.data.frame()
zip_choropleth(zip,
num_colors = 1,
state_zoom = "arizona",
title = "Test",
legend = "Population") +
scale_fill_gradient2(
low = "#FFFFFF",
mid = "#f688ee",
high = "#000775",
na.value = "gray") +
geom_polygon(data = countyref,
aes(x = long, y = lat, group = group),
alpha = 0,
color = "black",
size = 0.2)
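If the county list already exists with the " County" suffix, as in the question, it does not have to be retyped. A minimal sketch using base R's `sub()`; the name `AZ_County_full` is only for illustration:

```r
# County names as written in the question, with the " County" suffix
AZ_County_full <- c("Apache County", "Coconino County", "Mohave County",
                    "Navajo County", "Yavapai County")

# Drop the suffix only when it appears at the end of the string
AZ_County <- sub(" County$", "", AZ_County_full)
print(AZ_County)
```

The resulting vector can then be used in the `filter(NAME %in% AZ_County)` step above.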

ggplot add aggregated summaries to a bar plot

I have the following data frame:
structure(list(StepsGroup = structure(c(1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L), .Label = c("(-Inf,3e+03]", "(3e+03,1.2e+04]", "(1.2e+04, Inf]"
), class = "factor"), GlucoseGroup = structure(c(1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L), .Label = c("<100", "100-180", ">180"
), class = "factor"), n = c(396L, 1600L, 229L, 787L, 4182L, 375L,
110L, 534L, 55L), freq = c(0.177977528089888, 0.719101123595506,
0.102921348314607, 0.147267964071856, 0.782559880239521, 0.0701721556886228,
0.157367668097282, 0.763948497854077, 0.0786838340486409)), class =
c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L), vars = "StepsGroup",
labels = structure(list(
StepsGroup = structure(1:3, .Label = c("(-Inf,3e+03]", "(3e+03,1.2e+04]",
"(1.2e+04, Inf]"), class = "factor")), class = "data.frame", row.names =
c(NA, -3L), vars = "StepsGroup", drop = TRUE), indices = list(0:2,
3:5, 6:8), drop = TRUE, group_sizes = c(3L, 3L, 3L), biggest_group_size =
3L)
I would like to create a stacked bar plot, and add a summary of each StepsGroup on top of each bar. So the first group will have 2225, the second 5344 and the third 699.
I am using the following script:
ggplot(d_stepsFastingSummary , aes(y = freq, x = StepsGroup, fill =
GlucoseGroup)) + geom_bar(stat = "identity") +
geom_text(aes(label = sum(n()), vjust = 0))
Everything up to the geom_text call works, but the last part gives the following error:
Error: This function should not be called directly
Any idea how to add the aggregated quantity?
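We could create a new data frame, stacked_df, holding the sum of n for each StepsGroup, and pass it to geom_text (here df is the data frame from the question):
library(dplyr)
library(ggplot2)
stacked_df <- df %>% group_by(StepsGroup) %>% summarise(nsum = sum(n))
ggplot(df) +
geom_bar(aes(y = freq, x = StepsGroup, fill = GlucoseGroup), stat = "identity") +
# freq sums to 1 within each bar, so y = 1.1 places the total just above it
geom_text(data = stacked_df, aes(label = nsum, x = StepsGroup, y = 1.1))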
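As a quick check of the totals, the same summary can be computed in base R. This sketch rebuilds only the two relevant columns from the question's data as a plain data frame; `d` and `totals` are illustrative names.

```r
# StepsGroup and n copied from the question's data
d <- data.frame(
  StepsGroup = rep(c("(-Inf,3e+03]", "(3e+03,1.2e+04]", "(1.2e+04, Inf]"), each = 3),
  n = c(396L, 1600L, 229L, 787L, 4182L, 375L, 110L, 534L, 55L)
)
# Keep the groups in their original order rather than alphabetical order
d$StepsGroup <- factor(d$StepsGroup, levels = unique(d$StepsGroup))

# One row per StepsGroup with the summed counts
totals <- aggregate(n ~ StepsGroup, data = d, FUN = sum)
names(totals)[2] <- "nsum"
print(totals)
```

The sums come out as 2225, 5344 and 699, matching the values expected in the question.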

geom_smooth doesn't appear in scatterplot

I'm trying to plot a geom_smooth line in a ggplot2 scatterplot but it doesn't appear when I add the function.
Here's a part of my data frame df.
df <- structure(list(Kennisnamedatum = structure(c(17168, 17169, 17170,
17171, 17172, 17173, 17174, 17175, 17176, 17177), class = "Date"),
misdrijven_per_dag = c(334L, 321L, 292L, 263L, 284L, 247L,
233L, 214L, 252L, 281L)), .Names = c("Kennisnamedatum", "misdrijven_per_dag"
), row.names = c(NA, -10L), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), vars = "Kennisnamedatum", drop = TRUE, indices = list(
0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), group_sizes = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), biggest_group_size = 1L, labels = structure(list(
Kennisnamedatum = structure(c(17168, 17169, 17170, 17171,
17172, 17173, 17174, 17175, 17176, 17177), class = "Date")), row.names = c(NA,
-10L), class = "data.frame", vars = "Kennisnamedatum", drop = TRUE, .Names = "Kennisnamedatum"))
I hope someone can tell me what I'm doing wrong by reading my plot code.
library(ggplot2)
library(ggiraph)
ggplot(df, aes(x = Kennisnamedatum, y = misdrijven_per_dag)) +
geom_point_interactive(aes(tooltip = Kennisnamedatum,
data_id = Kennisnamedatum),
alpha = 0.6,
colour = "#607D8B" ) +
geom_smooth() +
scale_x_date(date_breaks = "1 month",
date_labels = "%b")
I get this warning message: geom_smooth() using method = 'loess' and formula 'y ~ x'
I don't know ggiraph, so I did not test with geom_point_interactive. As mentioned by @Jimbou, if I use geom_point, I have no problem getting the graph, but without interactivity.
I can give you an option with plotly::ggplotly to add the interactivity, using your data:
library(ggplot2)
library(plotly)
plot <- ggplot(df, aes(x = Kennisnamedatum, y = misdrijven_per_dag)) +
geom_smooth() +
geom_point(aes(text = Kennisnamedatum),
alpha = 0.6,
colour = "#607D8B") +
scale_x_date(date_breaks = "1 day",
date_labels = "%b %d")
ggplotly(plot, tooltip = "text")
Note that I have changed the scale for x so it would display something with the data you provided.

geom_smooth does not plot a line for my data frame

I have a dataframe with the following data
my2016.regression.dataframe <- structure(list(Economy_Directorate = structure(c(9L, 1L, 18L,
11L, 5L, 7L), .Label = c("20128895", "25392278", "26802176",
"33214069", "34194316", "34863777", "34867843", "36497785", "37280694",
"37411816", "44460126", "45484123", "47463441", "48354697", "57954259",
"60187650", "65135916", "67317188"), class = "factor"), People_Directorate = structure(c(12L,
14L, 17L, 16L, 13L, 15L), .Label = c("20128895", "25392278",
"26802176", "33214069", "34194316", "34863777", "34867843", "36497785",
"37280694", "37411816", "44460126", "45484123", "47463441", "48354697",
"57954259", "60187650", "65135916", "67317188"), class = "factor")), .Names = c("Economy_Directorate",
"People_Directorate"), row.names = c(NA, -6L), class = "data.frame")
I used the following code to plot it. It plots the points, but it does not plot the lm line.
Could you help me understand why geom_smooth does not plot the lm line?
library(ggplot2)
ggplot(data =my2016.regression.dataframe )+
geom_point(aes(y=Economy_Directorate,x=People_Directorate))+
geom_smooth(method = "lm",aes(y=Economy_Directorate,x=People_Directorate),
fill="orange",colour="red")
Regards,
You need to convert your columns to numeric types. They are currently factors:
my2016.regression.dataframe$Economy_Directorate = as.numeric(as.character(my2016.regression.dataframe$Economy_Directorate))
my2016.regression.dataframe$People_Directorate = as.numeric(as.character(my2016.regression.dataframe$People_Directorate))
ggplot(data = my2016.regression.dataframe) +
geom_point(aes(y=Economy_Directorate,x=People_Directorate))+
geom_smooth(method = "lm",aes(y=Economy_Directorate,x=People_Directorate),
fill="orange",colour="red")
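The intermediate as.character() step matters: calling as.numeric() directly on a factor returns the internal level codes, not the numbers shown in the labels. A small illustration with three of the values from the data above:

```r
# Three Economy_Directorate values from the question, stored as a factor
f <- factor(c("37280694", "20128895", "67317188"))

codes  <- as.numeric(f)                # internal level codes, not the values
values <- as.numeric(as.character(f))  # the numbers written in the labels

print(codes)
print(values)
```

Here `codes` is 2, 1, 3 (the positions of each label among the sorted levels), while `values` holds the actual amounts.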

How do I count rows in my data starting with a particular occurrence of a value?

structure(list(PROD_DATE = structure(c(1465876800, 1465963200,
1466049600, 1466136000, 1466222400, 1466308800, 1466395200, 1466481600,
1466568000, 1466654400), class = c("POSIXct", "POSIXt"), tzone = ""),
FILENUM = c(51922L, 51922L, 51922L, 51922L, 51922L, 51922L,
51922L, 51922L, 51922L, 51922L), CHOKE_SETTING = c(16L, 18L,
50L, 40L, 30L, 23L, 29L, 32L, 35L, 30L)), .Names = c("PROD_DATE",
"FILENUM", "CHOKE_SETTING"), row.names = c(NA, -10L), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), vars = "FILENUM", drop = TRUE, indices = list(
0:9), group_sizes = 10L, biggest_group_size = 10L, labels = structure(list(
FILENUM = 51922L), row.names = c(NA, -1L), class = "data.frame", vars = "FILENUM", drop = TRUE, .Names = "FILENUM"))
df <- df %>% group_by(FILENUM) %>% arrange(PROD_DATE) %>%
mutate(DAYS_ON = row_number())
I'm using the code above to number the rows of the dataset, counting days since the start, rather than using the date-time variable PROD_DATE.
I am unsure how to add another column that counts days since the occurrence of the max value in a different column. Counting should start at the row where the value is 50; previous rows would be either NA or 0.
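One possible approach, sketched in base R on the CHOKE_SETTING values above: find the position of the first maximum with which.max(), then number the rows relative to it. It assumes the rows are already sorted by PROD_DATE and that there is one row per day; `days_since_max` is an illustrative name.

```r
# CHOKE_SETTING values from the data above, assumed sorted by PROD_DATE
choke <- c(16L, 18L, 50L, 40L, 30L, 23L, 29L, 32L, 35L, 30L)

# Position of the first occurrence of the maximum value
first_max <- which.max(choke)

# Count rows starting at 1 on the row holding the maximum
days_since_max <- seq_along(choke) - first_max + 1L

# Rows before the maximum get NA (use 0L instead if zeros are preferred)
days_since_max[days_since_max < 1L] <- NA_integer_
print(days_since_max)
```

Inside the existing dplyr pipeline, the same expression could presumably go into the mutate() call after group_by(FILENUM), with row_number() in place of seq_along(choke), so that the count restarts for each FILENUM.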
