With this data:
df <- data.frame(value =c(20, 50, 90),
group = c(1, 2,3))
I can get a bar chart:
df %>% ggplot(aes(x = group, y = value, fill = value)) +
geom_col() +
coord_flip()+
scale_fill_viridis_c(option = "C") +
theme(legend.position = "none")
But I would like the color within each bar to vary along its length, following the values from 0 up to that bar's entry in value.
I have managed to change them using geom_raster:
ggplot() +
geom_raster(aes(x = c(0:20), y = .9, fill = c(0:20)),
interpolate = TRUE) +
geom_raster(aes(x = c(0:50), y = 2, fill = c(0:50)),
interpolate = TRUE) +
geom_raster(aes(x = c(0:90), y = 3.1, fill = c(0:90)),
interpolate = TRUE) +
scale_fill_viridis_c(option = "C") +
theme(legend.position = "none")
This approach is not efficient when I have many groups in real data. Any suggestions to get it done more efficiently would be appreciated.
I found the accepted answer to a previous similar question, but it notes that "These numbers needs to be adjusted depending on the number of x values and range of y". I was looking for an approach where I do not have to adjust numbers based on the data. David Gibson's answer fits my purpose.
It does not look like this is supported natively in ggplot. I was able to get something close by adding rows (ranging from 0 to value) to the data, then using geom_tile and separating the tiles by specifying width.
library(tidyverse)
df <- data.frame(value = c(20, 50, 90),
group = c(1, 2, 3))
df_expanded <- df %>%
rowwise() %>%
summarise(group = group,
value = list(0:value)) %>%
unnest(cols = value)
df_expanded %>%
ggplot() +
geom_tile(aes(
x = group,
y = value,
fill = value,
width = 0.9
)) +
coord_flip() +
scale_fill_viridis_c(option = "C") +
theme(legend.position = "none")
If this is too pixelated, you can increase the number of rows generated by replacing list(0:value) with list(seq(0, value, by = 0.1)).
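As a rough sketch of that finer-grained expansion, the version below uses mutate() on a rowwise data frame instead of summarise(), which is a substitution on my part rather than the code from the answer. The name step and its value of 0.5 are assumptions picked for illustration. Note that geom_tile centers each tile on its y value, so every bar extends half a step past its value at both ends.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(value = c(20, 50, 90),
                 group = c(1, 2, 3))

# Hypothetical resolution of the gradient; smaller means smoother but more rows
step <- 0.5

# One row per step between 0 and each group's value
df_fine <- df %>%
  rowwise() %>%
  mutate(value = list(seq(0, value, by = step))) %>%
  ungroup() %>%
  unnest(cols = value)

p_fine <- ggplot(df_fine) +
  geom_tile(aes(x = group, y = value, fill = value), width = 0.9) +
  coord_flip() +
  scale_fill_viridis_c(option = "C") +
  theme(legend.position = "none")
```

Printing p_fine draws the chart; the number of rows grows as 1 / step, so very small steps can become slow with many groups.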
This is a real hack using ggforce. This package has a geom that can take color gradients, but it is for a line segment. I've just increased the size to make the line segment look like a bar. I made all the bars the same length to get the correct gradient, then covered part of each bar with a segment in the background color so that they appear to be the correct length. I had to hide the grid lines, however. :-)
library(ggforce) # provides geom_link
df %>%
ggplot() +
geom_link(aes(x = 0, xend = max(value), y = group, yend = group, color = stat(index)), size = 30) +
geom_link(aes(x = value, xend = max(value), y = group, yend = group), color = "grey", size = 31) +
scale_color_viridis_c(option = "C") +
theme(legend.position = "none", panel.background = element_rect(fill = "grey"),
panel.grid = element_blank()) +
ylim(0.5, max(df$group)+0.5 )
I have a Shiny dashboard which includes a line graph which tracks number of visitors on mon-thur and fri-sun periods per month for three years:
I originally also had an annotation which shaded the parts of the graph which occur during the Covid pandemic in Australia, i.e. 2020-03-01 to present. When ggplotly is called on the ggplot, it strips the annotations out. What I want to do is add the shading from 2020-03-01 to present back in. I've tried adding
%>% layout(
shapes = list(
list(type = "rect",
fillcolor = "blue", line = list(color = "blue"), opacity = 0.9,
x0 = "2020-03-01", x1 = Inf,
y0 = 0, y1 = Inf
)
)
)
after the ggplotly() call, but it doesn't do anything.
I also tried following the code in this question, but the shading doesn't start at the correct date, and it's also only on the first facet.
Reproducible code example:
date <- c("2019-01-01","2019-01-01","2019-02-01","2019-02-01","2019-03-01","2019-03-01","2019-04-01",
"2019-04-01","2019-05-01","2019-05-01","2019-06-01","2019-06-01","2019-07-01","2019-07-01",
"2019-08-01","2019-08-01","2019-09-01","2019-09-01","2019-10-01","2019-10-01","2019-11-01",
"2019-11-01","2019-12-01","2019-12-01","2020-01-01","2020-01-01","2020-02-01","2020-02-01",
"2020-03-01","2020-03-01","2020-04-01","2020-04-01","2020-05-01","2020-05-01","2020-06-01",
"2020-06-01","2020-07-01","2020-07-01","2020-08-01","2020-08-01","2020-09-01","2020-09-01",
"2020-10-01","2020-10-01","2020-11-01","2020-11-01","2020-12-01","2020-12-01","2021-01-01",
"2021-01-01","2021-02-01","2021-02-01","2021-03-01","2021-03-01","2021-04-01","2021-04-01",
"2021-05-01","2021-05-01","2021-06-01","2021-06-01","2019-01-01","2019-01-01","2019-02-01",
"2019-02-01","2019-03-01","2019-03-01","2019-04-01","2019-04-01","2019-05-01","2019-05-01",
"2019-06-01","2019-06-01","2019-07-01","2019-07-01","2019-08-01","2019-08-01","2019-09-01",
"2019-09-01","2019-10-01","2019-10-01","2019-11-01","2019-11-01","2019-12-01","2019-12-01",
"2020-01-01","2020-01-01","2020-02-01","2020-02-01","2020-03-01","2020-03-01","2020-04-01",
"2020-04-01","2020-05-01","2020-05-01","2020-06-01","2020-06-01","2020-07-01","2020-07-01",
"2020-08-01","2020-08-01","2020-09-01","2020-09-01","2020-10-01","2020-10-01","2020-11-01",
"2020-11-01","2020-12-01","2020-12-01","2021-01-01","2021-01-01","2021-02-01","2021-02-01",
"2021-03-01","2021-03-01","2021-04-01","2021-04-01","2021-05-01","2021-05-01","2021-06-01",
"2021-06-01")
location <- rep(c("1001", "1002"), c(60, 60))
daytype <- rep(c("mon-thur", "fri-sat"), 60)
visitors <- c(5694,6829,3087,4247,2814,4187,5310,6408,5519,5934,2817,4080,6762,6595,5339,6669,
4863,6137,8607,11974,4909,9103,7986,9493,15431,13044,6176,5997,6458,7694,5990,5419,
5171,8149,6091,7971,10677,10468,7782,7627,7210,9526,8554,9844,8262,9218,9418,9038,
13031,13418,7408,10621,6908,8122,8851,8861,7940,9179,5992,7026,7939,6923,8209,7815,
8190,7085,9136,7905,9784,8454,9467,9092,9183,8436,9029,8927,8828,8323,7679,7112,
1885,3156,6932,5530,6077,4975,4922,4008,5549,4557,3932,3395,4865,4820,5090,4529,
5407,4262,4858,4200,5101,4761,5108,4413,5209,4116,5405,4445,4140,2985,5589,4684,
5322,4540,4898,4214,5266,4188,5184,4555)
total <- data.frame(location, date, daytype, visitors)
mon_year_vis <- total %>%
ggplot(
mapping =
aes(
x = as.Date(date),
y = visitors,
group = daytype,
color = daytype
)
) +
geom_line() +
geom_point(show.legend = FALSE, size = 1) +
scale_y_continuous(labels = comma) +
facet_wrap( ~location, ncol = 1, scales = "free") +
scale_x_date(date_labels = "%b-%y",
breaks = "3 month",
limits = range)
ggplotly(mon_year_vis)
This task is a bit more complex than it appears, since you use scales = "free" in the facet_wrap call. Because of this, you need a small helper that holds the non-global (per-facet) limits of the shaded areas and use ggplot2::geom_rect; otherwise you could use ggplot2::annotate (for completeness I will list this option too). It is important to bear in mind that plotly seems to have issues with Inf as a coordinate limit, at least when using plotly::ggplotly. (I will omit the lines up to the declaration of your total variable.)
# libraries needed to make things work
library(dplyr)
library(ggplot2)
library(plotly)
library(scales)
Using ggplot2::geom_rect:
# needed for coordinates of shadowed area
helper <- total %>%
dplyr::group_by(location) %>%
dplyr::summarise(mv = max(visitors) , md = max(as.Date(date))) %>%
dplyr::ungroup()
mon_year_vis <- total %>%
ggplot(
mapping =
aes(
x = as.Date(date),
y = visitors,
group = daytype,
color = daytype
)
) +
# insert the geom_rect before the lines so that plotly gets the layer order right
geom_rect(data = helper, aes(xmin = as.Date("2020-03-01"), xmax = md, ymin = 0, ymax = mv), alpha = 0.3, fill="blue", inherit.aes = FALSE) +
geom_line() +
geom_point(show.legend = FALSE, size = 1) +
scale_y_continuous(labels = comma) +
facet_wrap( ~location, ncol = 1, scales = "free") +
scale_x_date(date_labels = "%b-%y",
breaks = "3 month",
limits = range)
ggplotly(mon_year_vis)
Using ggplot2::annotate:
mon_year_vis2 <- total %>%
group_by(daytype) %>%
mutate(maxy = max(visitors)) %>%
ggplot(
mapping =
aes(
x = as.Date(date),
y = visitors,
group = daytype,
color = daytype
)
) +
# insert the annotate before the lines so that plotly gets the layer order right
annotate("rect", xmin=as.Date("2020-03-01"), xmax=max(as.Date(date)), ymin=0, ymax=max(visitors), alpha=0.2, fill="blue") +
geom_line() +
geom_point(show.legend = FALSE, size = 1) +
scale_y_continuous(labels = comma) +
facet_wrap( ~location, ncol = 1, scales = "free") +
scale_x_date(date_labels = "%b-%y",
breaks = "3 month",
limits = range)
ggplotly(mon_year_vis2)
My ggmap on which I would like small piecharts with labels is generated with the code:
p <-
get_googlemap(
"Poland",
maptype = "roadmap",
zoom = 6,
color = "bw",
crop = T,
style = 'feature:all|element:labels|visibility:off' #'feature:administrative.country|element:labels|visibility:off' or 'feature:all|element:labels|visibility:off'
) %>%
ggmap() + coord_cartesian() +
scale_x_continuous(limits = c(14, 24.3), expand = c(0, 0)) +
scale_y_continuous(limits = c(48.8, 55.5), expand = c(0, 0))
I am trying to plot my small ggplot piecharts on a ggmap following the answer
R::ggplot2::geom_points: how to swap points with pie charts?
I prepare data as follows:
df <-
df %>% mutate(Ours = Potential * MS, Others = Potential - Ours) %>%
na.omit() %>% filter(Potential > 0) %>%
select(-L.p., -MS) %>%
group_by(Miasto) %>%
summarise_each_(vars = c("Potential", "Ours", "Others"),
funs = funs(Sum = "sum")) %>%
left_join(coordinatesTowns, by = c("Miasto" = "address")) %>%
distinct(Miasto, .keep_all = T) %>%
select(-X) %>% ungroup()
df <-df %>% gather(key=component, value=sales, c(Ours_Sum,Others_Sum)) %>%
group_by(lon, lat,Potential_Sum)
My data looks then like
tibble::tribble(
~Miasto, ~Potential_Sum, ~lon, ~lat, ~component, ~sales,
"Bialystok", 100, 23.16433, 53.13333, "Ours_Sum", 70,
"Bialystok", 100, 23.16433, 53.13333, "Others_Sum", 30,
"Bydgoszcz", 70, 18.00762, 53.1235, "Ours_Sum", 0,
"Bydgoszcz", 70, 18.00762, 53.1235, "Others_Sum", 70,
"Gdansk", 50, 18.64637, 54.35205, "Ours_Sum", 25,
"Gdansk", 50, 18.64637, 54.35205, "Others_Sum", 75,
"Katowice", 60, 19.02754, 50.25842, "Ours_Sum", 20,
"Katowice", 60, 19.02754, 50.25842, "Others_Sum", 40
)
The last line, group_by, is essential for generating the plots that will be pasted into my map. (I suspect this may be the cause of the problems described below.)
Instead of totals, I would like to provide labels for each share in a piechart.
In this answer I found the syntax that should add labels to the piecharts: https://stackoverflow.com/a/22804400/3480717
Below is the syntax in my script. The line with geom_text (commented out with a hash), if uncommented, causes my plots to disappear and produces a long list of warnings (16 entries, covering all the small plots):
1: Removed 1 rows containing missing values (geom_col).
I presume the reason may lie in the last line of the data preparation, where the data is grouped for plotting.
The line I marked with a hash is the problem. If I keep it commented out, the plots are correct; if I include it to get the desired labels on the slices, the plots disappear or become very narrow vertical slices.
df.grobs <- df %>%
do(subplots = ggplot(., aes(1, sales, fill = component)) +
geom_bar(position = "fill", alpha = 0.5, colour = "white", stat="identity") +
# geom_text( aes(label = round(sales), y=sales), position = position_stack(vjust = 0.5), size = 2.5) +
coord_polar(theta = "y") +
scale_fill_manual(values = c("green", "red"))+
theme_void()+ guides(fill = F)) %>%
mutate(subgrobs = list(annotation_custom(ggplotGrob(subplots),
x = lon-Potential_Sum/300, y = lat-Potential_Sum/300,
xmax = lon+Potential_Sum/300, ymax = lat+Potential_Sum/300)))
df.grobs
df.grobs %>%
{p +
.$subgrobs +
geom_col(data = df,
aes(0,0, fill = component),
colour = "white")+ geom_text(data=df, aes(label = Miasto),nudge_y = -0.15, size=2.5)}
Why is the line marked with a hash (if uncommented) destroying the plot instead of adding labels? It seems to completely redefine aesthetics.
EDIT: I modified the marked line; it now uses label=sales and y=sales. If I comment the line out, the plots are generated; if I uncomment it, the labels are generated in the correct position but without the plots. Why can't I have both?
Short answer:
I think the problem is actually in your earlier line:
geom_bar(position = "fill", alpha = 0.5, colour = "white", stat="identity") +
If you change the position from fill to stack (i.e. the default), it should work properly (at least it did on mine).
Long(-winded) explanation:
Let's use a summarised version of the mtcars dataset to reproduce the problem:
dfm <- mtcars %>% group_by(cyl) %>% summarise(disp = mean(disp)) %>% ungroup()
# correct pie chart
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "stack") +
geom_text(position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y") + theme_void()
# "empty" pie chart
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "fill") +
geom_text(position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y") + theme_void()
Why does changing geom_bar's position affect this? If we look at the plot before the coord_polar step, things may become clearer:
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "stack") +
geom_text(position = position_stack(vjust = 0.5))
Check the bar chart's y-axis. The bars & the labels are correctly positioned.
Now the version with position = "fill":
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "fill") +
geom_text(position = position_stack(vjust = 0.5))
Your bar chart now occupies the range 0-1 on the y-axis, while your labels continue to occupy the original full range, which is much larger. Thus when you convert the chart to polar coordinates, the bar chart is squeezed to a tiny slice that becomes practically invisible.
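If the proportional version is what you actually want, one way to keep bars and labels on the same 0-1 scale is to give geom_text the matching position, position_fill(vjust = 0.5). This is a minimal sketch on the same summarised mtcars data, not something tested against your map code; p_fill is just a name chosen here.

```r
library(dplyr)
library(ggplot2)

dfm <- mtcars %>%
  group_by(cyl) %>%
  summarise(disp = mean(disp)) %>%
  ungroup()

# Bars and labels both use a "fill" position, so both are rescaled to 0-1
p_fill <- ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
  geom_bar(stat = "identity", position = "fill") +
  geom_text(position = position_fill(vjust = 0.5)) +
  coord_polar(theta = "y") +
  theme_void()

# Label positions after the position adjustment (second layer = geom_text)
label_y <- layer_data(p_fill, 2)$y
```

Here label_y holds the computed label positions, which should all fall inside the range the bars occupy, so the pie is no longer squeezed when coord_polar is applied.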
I have a test dataset like this:
df_test <- data.frame(
proj_manager = c('Emma','Emma','Emma','Emma','Emma','Alice','Alice'),
proj_ID = c(1, 2, 3, 4, 5, 6, 7),
stage = c('B','B','B','A','C','A','C'),
value = c(15,15,20,20,20,70,5)
)
Preparation for viz:
input <- select(df_test, proj_manager, proj_ID, stage, value) %>%
filter(proj_manager=='Emma') %>%
do({
proj_value_by_manager = sum(distinct(., proj_ID, value)$value);
mutate(., proj_value_by_manager = proj_value_by_manager)
}) %>%
group_by(stage) %>%
do({
sum_value_byStage = sum(distinct(.,proj_ID,value)$value);
mutate(.,sum_value_byStage= sum_value_byStage)
}) %>%
mutate(count_proj = length(unique(proj_ID)))
commapos <- function(x, ...) {
format(abs(x), big.mark = ",", trim = TRUE,
scientific = FALSE, ...) }
Visualization:
ggplot (input, aes(x=stage, y = count_proj)) +
geom_bar(stat = 'identity')+
geom_bar(aes(y=-proj_value_by_manager),
stat = "identity", fill = "Blue") +
scale_y_continuous(labels = commapos)+
coord_flip() +
ylab('') +
geom_text(aes(label= sum_value_byStage), hjust = 5) +
geom_text(aes(label= count_proj), hjust = -1) +
labs(title = "Emma: 4 projects| $90M Values \n \n Commitment|Projects") +
theme(plot.title = element_text(hjust = 0.5)) +
geom_hline(yintercept = 0, linetype =1)
My questions are:
Why are the y-values not showing up right? E.g. C is labeled 20, but its bar nearly reaches 100 on the scale.
How to adjust the position of labels so that it sits on the top of its bar?
How to re-scale the y axis so that both the very short bar of 'count of project' and long bar of 'Project value' can be well displayed?
Thank you all for the help!
I think your issues are coming from the fact that:
(1) Your dataset has duplicated values. This causes geom_bar to add all of them together. For example there are 3 obs for B where proj_value_by_manager = 90 which is why the blue bar extends to 270 for that group (they all get added).
(2) in your second geom_bar you use y = -proj_value_by_manager but in the geom_text to label this you use sum_value_byStage. That's why the blue bar for A is extending to 90 (since proj_value_by_manager is 90) but the label reads 20.
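The arithmetic behind point (1) can be checked without drawing anything. The sketch below rebuilds only the relevant columns of input for Emma by hand (dup_rows and bar_lengths are names made up for this illustration) and sums them per stage, which is the length a stacked stat = "identity" bar ends up with.

```r
library(dplyr)

# One row per project for Emma, each carrying the same manager-level total of 90
dup_rows <- data.frame(
  stage = c("B", "B", "B", "A", "C"),
  proj_value_by_manager = 90
)

# Rows sharing an x value are stacked, so the drawn bar length per stage is the sum
bar_lengths <- dup_rows %>%
  group_by(stage) %>%
  summarise(bar_length = sum(proj_value_by_manager))
```

Stage B has three rows, so its bar reaches 3 * 90 = 270, while A and C each stay at 90.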
To get what I believe is the chart you want, you could do:
#Q1: De-duplicated dataset so geom_bar doesn't erroneously add rows together
input2 <- input[!duplicated(input[,-c(2,4)]),]
ggplot (input2, aes(x=stage, y = count_proj)) +
geom_bar(stat = 'identity')+
geom_bar(aes(y=-sum_value_byStage), #Q1: changed so this y-value matches your label
stat = "identity", fill = "Blue") +
scale_y_continuous(labels = commapos)+
coord_flip() +
ylab('') +
geom_text(aes(label= sum_value_byStage, y = -sum_value_byStage), hjust = 1) + #Q2: Added in y-value for label and hjust so it will be on top
geom_text(aes(label= count_proj), hjust = -1) +
labs(title = "Emma: 4 projects| $90M Values \n \n Commitment|Projects") +
theme(plot.title = element_text(hjust = 0.5)) +
geom_hline(yintercept = 0, linetype =1)
For your last question, there is no good way to display both of these. One option would be to rescale the small data and still label it with a 1 or 3. However, I didn't do this because once you scale down the blue bars the other bars look OK to me.
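For completeness, the rescaling option could look roughly like the sketch below. The per-stage numbers are worked out by hand from df_test for Emma, and scale_factor with its value of 10 is an assumption chosen purely for illustration. The count bars are stretched but still labeled with the real counts, so the axis on that side no longer reads as a project count.

```r
library(ggplot2)

# Per-stage summary for Emma, derived by hand from df_test
plot_df <- data.frame(
  stage             = c("A", "B", "C"),
  count_proj        = c(1, 3, 1),
  sum_value_byStage = c(20, 50, 20)
)

# Hypothetical stretch factor so the count bars are visible next to the value bars
scale_factor <- 10
plot_df$count_scaled <- plot_df$count_proj * scale_factor

p_scaled <- ggplot(plot_df, aes(x = stage)) +
  geom_col(aes(y = count_scaled)) +
  geom_col(aes(y = -sum_value_byStage), fill = "blue") +
  # Labels show the original values, not the stretched ones
  geom_text(aes(y = count_scaled, label = count_proj), hjust = -0.5) +
  geom_text(aes(y = -sum_value_byStage, label = sum_value_byStage), hjust = 1.2) +
  geom_hline(yintercept = 0) +
  coord_flip() +
  ylab("")
```

If you go this way, it is probably worth hiding the axis labels on the count side or explaining the factor in the title, since the two halves of the axis are no longer in the same units.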