ggplot piecharts on a ggmap: labels destroy the small plots

My ggmap on which I would like small piecharts with labels is generated with the code:
p <-
get_googlemap(
"Poland",
maptype = "roadmap",
zoom = 6,
color = "bw",
crop = T,
style = 'feature:all|element:labels|visibility:off' #'feature:administrative.country|element:labels|visibility:off' or 'feature:all|element:labels|visibility:off'
) %>%
ggmap() + coord_cartesian() +
scale_x_continuous(limits = c(14, 24.3), expand = c(0, 0)) +
scale_y_continuous(limits = c(48.8, 55.5), expand = c(0, 0))
I am trying to plot my small ggplot pie charts on a ggmap, following the answer to "R::ggplot2::geom_points: how to swap points with pie charts?".
I prepare data as follows:
df <-
df %>% mutate(Ours = Potential * MS, Others = Potential - Ours) %>%
na.omit() %>% filter(Potential > 0) %>%
select(-L.p., -MS) %>%
group_by(Miasto) %>%
summarise_each_(vars = c("Potential", "Ours", "Others"),
funs = funs(Sum = "sum")) %>%
left_join(coordinatesTowns, by = c("Miasto" = "address")) %>%
distinct(Miasto, .keep_all = T) %>%
select(-X) %>% ungroup()
df <-df %>% gather(key=component, value=sales, c(Ours_Sum,Others_Sum)) %>%
group_by(lon, lat,Potential_Sum)
My data looks then like
tibble::tribble(
~Miasto, ~Potential_Sum, ~lon, ~lat, ~component, ~sales,
"Bialystok", 100, 23.16433, 53.13333, "Ours_Sum", 70,
"Bialystok", 100, 23.16433, 53.13333, "Others_Sum", 30,
"Bydgoszcz", 70, 18.00762, 53.1235, "Ours_Sum", 0,
"Bydgoszcz", 70, 18.00762, 53.1235, "Others_Sum", 70,
"Gdansk", 50, 18.64637, 54.35205, "Ours_Sum", 25,
"Gdansk", 50, 18.64637, 54.35205, "Others_Sum", 75,
"Katowice", 60, 19.02754, 50.25842, "Ours_Sum", 20,
"Katowice", 60, 19.02754, 50.25842, "Others_Sum", 40
)
The final group_by is essential for generating the plots that will be pasted onto my map. (I suspected this might be the cause of the problem described below.)
Instead of totals, I would like to label each share in a pie chart.
In this answer I found syntax that should add labels to the pie charts: https://stackoverflow.com/a/22804400/3480717
Below is the code from my script. The geom_text line (commented out with a hash), if uncommented, makes my plots disappear and produces a long list of warnings (16 entries, one per small plot):
1: Removed 1 rows containing missing values (geom_col).
I presume the cause may be the last line of the data preparation, which groups the data for plotting.
The line marked with a hash is the problem. With the hash in place the plots are correct; if I remove it to get the desired labels on the slices, the plots disappear or become very narrow vertical slices.
df.grobs <- df %>%
do(subplots = ggplot(., aes(1, sales, fill = component)) +
geom_bar(position = "fill", alpha = 0.5, colour = "white", stat="identity") +
# geom_text( aes(label = round(sales), y=sales), position = position_stack(vjust = 0.5), size = 2.5) +
coord_polar(theta = "y") +
scale_fill_manual(values = c("green", "red"))+
theme_void()+ guides(fill = F)) %>%
mutate(subgrobs = list(annotation_custom(ggplotGrob(subplots),
x = lon-Potential_Sum/300, y = lat-Potential_Sum/300,
xmax = lon+Potential_Sum/300, ymax = lat+Potential_Sum/300)))
df.grobs
df.grobs %>%
{p +
.$subgrobs +
geom_col(data = df,
aes(0,0, fill = component),
colour = "white")+ geom_text(data=df, aes(label = Miasto),nudge_y = -0.15, size=2.5)}
Why does the line marked with a hash, once uncommented, destroy the plot instead of adding labels? It seems to completely redefine the aesthetics.
EDIT: I modified the marked line so that it now uses label = sales and y = sales. If I comment the line out, the plots are generated; if I uncomment it, the labels appear in the correct position but without the plots. Why can't I have both?

Short answer:
I think the problem is actually in your earlier line:
geom_bar(position = "fill", alpha = 0.5, colour = "white", stat="identity") +
If you change the position from fill to stack (i.e. the default), it should work properly (at least it did on mine).
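As a rough illustration of that change, here is a minimal sketch of a single subplot built from the Bialystok rows of the sample data in the question. The names one_town and pie are made up for this example, and the map and grob steps are left out:

```r
library(ggplot2)

# One town's two rows from the question's sample data (Bialystok)
one_town <- data.frame(
  component = c("Ours_Sum", "Others_Sum"),
  sales = c(70, 30)
)

# Bars and labels both use stacking, so they share the same 0-100 range
pie <- ggplot(one_town, aes(x = 1, y = sales, fill = component)) +
  geom_bar(stat = "identity", position = "stack",
           alpha = 0.5, colour = "white") +
  geom_text(aes(label = round(sales)),
            position = position_stack(vjust = 0.5), size = 2.5) +
  coord_polar(theta = "y") +
  scale_fill_manual(values = c("green", "red")) +
  theme_void() +
  guides(fill = "none")
```

Because each pie is drawn on its own polar axis, stacking still gives every town a full circle; the slices simply reflect raw sales rather than proportions rescaled to 0-1.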
Long(-winded) explanation:
Let's use a summarised version of the mtcars dataset to reproduce the problem:
dfm <- mtcars %>% group_by(cyl) %>% summarise(disp = mean(disp)) %>% ungroup()
# correct pie chart
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "stack") +
geom_text(position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y") + theme_void()
# "empty" pie chart
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "fill") +
geom_text(position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y") + theme_void()
Why does changing geom_bar's position affect this? If we look at the plot before the coord_polar step, things may become clearer:
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "stack") +
geom_text(position = position_stack(vjust = 0.5))
Check the bar chart's y-axis. The bars & the labels are correctly positioned.
Now the version with position = "fill":
ggplot(dfm, aes(x = 1, y = disp, label = factor(cyl), fill = factor(cyl))) +
geom_bar(stat = "identity", position = "fill") +
geom_text(position = position_stack(vjust = 0.5))
Your bar chart now occupies the range 0-1 on the y-axis, while your labels continue to occupy the original full range, which is much larger. Thus when you convert the chart to polar coordinates, the bar chart is squeezed to a tiny slice that becomes practically invisible.
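If you do want to keep position = "fill" for the bars, another option is to put the labels on the same 0-1 scale with position_fill(), which ggplot2 provides alongside position_stack(). A minimal sketch using the same mtcars summary; p_fill is just an illustrative name:

```r
library(ggplot2)
library(dplyr)

dfm <- mtcars %>%
  group_by(cyl) %>%
  summarise(disp = mean(disp)) %>%
  ungroup()

# Bars and labels are both rescaled to 0-1, so neither layer
# stretches the y-axis beyond the other
p_fill <- ggplot(dfm, aes(x = 1, y = disp,
                          label = factor(cyl), fill = factor(cyl))) +
  geom_bar(stat = "identity", position = "fill") +
  geom_text(position = position_fill(vjust = 0.5)) +
  coord_polar(theta = "y") +
  theme_void()
```

The general point is that the position adjustment of the text layer should match that of the bar layer.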

Related

How to avoid overlaying annotations?

I'm trying to plot some annotation in a geom_histogram() ggplot. See image below. These annotations are the count of the histogram for each bin, each group. However, I don't know how to distance the different annotations when the counts are similar. I only know to fix the annotation with vjust or hjust but I wonder if there's a relative way. I don't think an example is necessary. Probably just looking at my code will be easy for someone more experienced.
This is the code I have used:
bind_rows(
RN_df %>% mutate(type='RN'),
RVNM_df %>% mutate(type='RVM')
) %>% group_by(hash) %>%
summarise(n_eps = n(), genre, type) %>%
ggplot(aes(x = n_eps, fill = genre)) +
geom_histogram(binwidth = 1) +
stat_count(aes(y=..count..,label=..count.., colour = genre),geom="text",vjust= -1, hjust = 0.5, size = 3) +
facet_wrap(~type)
This is my output image:

You can use geom_text_repel from ggrepel (here df stands for the summarised data frame produced by your pipeline):
library(ggrepel)
ggplot(df, aes(x = n_eps, fill = genre)) +
geom_histogram(binwidth = 1) +
geom_text_repel(aes(label = ..count..), stat = 'count',
position = position_stack(vjust = 0.5), direction = 'y') +
facet_wrap(~type)

Change ggplot bar chart fill colors

With this data:
df <- data.frame(value =c(20, 50, 90),
group = c(1, 2,3))
I can get a bar chart:
df %>% ggplot(aes(x = group, y = value, fill = value)) +
geom_col() +
coord_flip()+
scale_fill_viridis_c(option = "C") +
theme(legend.position = "none")
But I would like to have the colors of those bars to vary according to their corresponding values in value.
I have managed to change them using geom_raster:
ggplot() +
geom_raster(aes(x = c(0:20), y = .9, fill = c(0:20)),
interpolate = TRUE) +
geom_raster(aes(x = c(0:50), y = 2, fill = c(0:50)),
interpolate = TRUE) +
geom_raster(aes(x = c(0:90), y = 3.1, fill = c(0:90)),
interpolate = TRUE) +
scale_fill_viridis_c(option = "C") +
theme(legend.position = "none")
This approach is not efficient when I have many groups in real data. Any suggestions to get it done more efficiently would be appreciated.
I found the accepted answer to a previous similar question, but "These numbers needs to be adjusted depending on the number of x values and range of y". I was looking for an approach that I do not have to adjust numbers based on data. David Gibson's answer fits my purpose.

It does not look like this is supported natively in ggplot. I was able to get something close by adding additional rows (ranging from 0 to value) to the data, then using geom_tile and separating the tiles by specifying width.
library(tidyverse)
df <- data.frame(value = c(20, 50, 90),
group = c(1, 2, 3))
df_expanded <- df %>%
rowwise() %>%
summarise(group = group,
value = list(0:value)) %>%
unnest(cols = value)
df_expanded %>%
ggplot() +
geom_tile(aes(
x = group,
y = value,
fill = value,
width = 0.9
)) +
coord_flip() +
scale_fill_viridis_c(option = "C") +
theme(legend.position = "none")
If this is too pixelated, you can increase the number of rows generated by replacing list(0:value) with list(seq(0, value, by = 0.1)).
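A finer-grained version of the tile approach might look like the sketch below. The step size of 0.1 and the names step and df_fine are assumptions made for illustration, and the expansion is written with lapply() instead of rowwise() purely to keep the example short:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(value = c(20, 50, 90),
                 group = c(1, 2, 3))

step <- 0.1  # smaller steps give a smoother gradient but more rows

# Expand each group into one row per step from 0 up to its value
df_fine <- df %>%
  mutate(value = lapply(value, function(v) seq(0, v, by = step))) %>%
  unnest(cols = value)

p_fine <- ggplot(df_fine) +
  geom_tile(aes(x = group, y = value, fill = value, width = 0.9)) +
  coord_flip() +
  scale_fill_viridis_c(option = "C") +
  theme(legend.position = "none")
```

The row count grows with value / step, so very small steps on large values can make the plot slow to render.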

This is a real hack using ggforce. This package has a geom that can take color gradients but it is for a line segment. I've just increased the size to make the line segment look like a bar. I made all the bars the same length to get the correct gradient, then covered a portion of each bar over with the same color as the background color to make them appear to be the correct length. Had to hide the grid lines, however. :-)
library(ggforce)
df %>%
ggplot() +
geom_link(aes(x = 0, xend = max(value), y = group, yend = group, color = stat(index)), size = 30) +
geom_link(aes(x = value, xend = max(value), y = group, yend = group), color = "grey", size = 31) +
scale_color_viridis_c(option = "C") +
theme(legend.position = "none", panel.background = element_rect(fill = "grey"),
panel.grid = element_blank()) +
ylim(0.5, max(df$group)+0.5 )

r ggplot2 facet_grid how to add space between the top of the chart and the border

Is there a way to add space between the labels at the top of the chart and the border of the plot when using ggplot's facet_grid? Below is a reproducible example.
library(dplyr)
library(ggplot2)
Titanic %>% as.data.frame() %>%
filter(Survived == "Yes") %>%
mutate(FreqSurvived = ifelse(Freq > 100, Freq*1e+04,Freq)) %>%
ggplot( aes(x = Age, y = FreqSurvived, fill = Sex)) +
geom_bar(stat = "identity", position = "dodge") +
facet_grid(Class ~ ., scales = "free") +
theme_bw() +
geom_text(aes(label = prettyNum(FreqSurvived,big.mark = ",")), vjust = 0, position = position_dodge(0.9), size = 2)
The resulting chart has the label of numbers right next to the border of the plot.

I wanted to add to @dww's answer, but don't have enough reputation.
The expand option actually will allow you to add space only to the top of your graph. From the ?expand_scale help file:
# No space below the bars but 10% above them
ggplot(mtcars) +
geom_bar(aes(x = factor(cyl))) +
scale_y_continuous(expand = expand_scale(mult = c(0, .1)))
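In ggplot2 3.3.0 and later, expand_scale() is deprecated in favour of expansion(), which takes the same mult and add arguments. A sketch of how one-sided padding might be applied to the faceted Titanic chart from the question, assuming a recent ggplot2 (the 10% figure is only a starting point):

```r
library(dplyr)
library(ggplot2)

dt <- Titanic %>%
  as.data.frame() %>%
  filter(Survived == "Yes") %>%
  mutate(FreqSurvived = ifelse(Freq > 100, Freq * 1e+04, Freq))

p <- ggplot(dt, aes(x = Age, y = FreqSurvived, fill = Sex)) +
  geom_col(position = "dodge") +
  facet_grid(Class ~ ., scales = "free") +
  theme_bw() +
  geom_text(aes(label = prettyNum(FreqSurvived, big.mark = ",")),
            vjust = 0, position = position_dodge(0.9), size = 2) +
  # No extra space below the bars, 10% of the data range above them
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))
```

Since the padding is multiplicative, it scales with each facet's own range when scales = "free" is used.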

One simple way is to use the expand argument of scale_y_continuous:
dt = Titanic %>% as.data.frame() %>%
filter(Survived == "Yes") %>%
mutate(FreqSurvived = ifelse(Freq > 100, Freq*1e+04,Freq))
ggplot(dt, aes(x = Age, y = FreqSurvived, fill = Sex)) +
geom_bar(stat = "identity", position = "dodge") +
facet_grid(Class ~ ., scales = "free") +
theme_bw() +
geom_text(aes(label = prettyNum(FreqSurvived,big.mark = ",")),
vjust = 0, position = position_dodge(0.9), size = 2) +
scale_y_continuous(expand = c(0.1,0))
The downside of using expand this way is that it adds space both above and below the bars. An alternative is to plot some invisible data on the graph at a height above the bars, which will force ggplot to expand the axis ranges to accommodate this dummy data. Here I add some invisible bars whose height is 1.2 times that of the actual bars:
Titanic %>% as.data.frame() %>%
filter(Survived == "Yes") %>%
mutate(FreqSurvived = ifelse(Freq > 100, Freq*1e+04,Freq)) %>%
ggplot( aes(x = Age, y = FreqSurvived, fill = Sex)) +
geom_bar(aes(y = FreqSurvived*1.2), stat = "identity",
position = "dodge", fill=NA) +
geom_bar(stat = "identity", position = "dodge") +
facet_grid(Class ~ ., scales = "free") +
theme_bw() +
geom_text(aes(label = prettyNum(FreqSurvived,big.mark = ",")),
vjust = 0,
position = position_dodge(0.9), size = 2)

Placing tick marks between bars in ggplot2

Using the diamonds data set in the ggplot2 package, I can generate the following chart.
library(ggplot2)
library(dplyr)
diamond.summary <-
diamonds %>%
mutate(carat = ifelse(runif(nrow(.)) < 0.05, NA_real_, carat)) %>%
group_by(carat_quintile = ntile(carat, 5)) %>%
summarise(avg_price = mean(price))
diamond.summary %>%
filter(!is.na(carat_quintile)) %>%
ggplot(aes(carat_quintile, avg_price)) +
geom_bar(stat = "identity",
color = "black",
width = 1) +
scale_x_continuous("Carat percentile",
breaks = 1:6 - 0.5,
labels = seq(0,100, by = 20)) +
scale_y_continuous(expand = c(0,0),
limits = c(0, 1.1* max(diamond.summary$avg_price)))
So far, so easy. However, I would also like to display the average price of the missing entries alongside the chart. Similar to the following:
diamond.summary %>%
mutate(Facet = is.na(carat_quintile),
carat_quintile_noNA = ifelse(Facet, "Unknown", carat_quintile)) %>%
ggplot(aes(x = carat_quintile_noNA, y = avg_price, fill = Facet)) +
geom_bar(stat = "identity") +
facet_grid(~Facet, scales = "free_x", space = "free_x") +
scale_x_discrete(breaks = (0:6) - 0.5)
However, when I try to perform the same trick using scale_x_continuous, I get the error Discrete value supplied to continuous scale. When I try to use scale_x_discrete(breaks = c(0:6 + 0.5)) for example, the axis ticks and labels disappear.
My question is, how can I get the same faceted chart above with the tick marks in the first panel placed as in the first chart in this post? Advice about chart design could be an acceptable solution, but I don't think all problems like this are solvable with a redesign.

The trick is to convert your factor to a numeric, assigning a magic number to the unknown quantity. (ggplot2 will not plot bars with true NA values.) Then use scale_x_continuous:
diamond.summary %>%
mutate(Facet = is.na(carat_quintile),
carat_quintile_noNA = ifelse(Facet, "Unknown", carat_quintile),
##
## 99 is a magic number. For our plot, it just has
## to be larger than 5. The value 6 would be a natural
## choice, but this means that the x tick marks would
## overflow into the 'unknown' facet. You could
## choose 7 to avoid this, but any large number works.
## I used 99 to make it clear that it's magic.
numeric = ifelse(Facet, 99, carat_quintile)) %>%
ggplot(aes(x = numeric, y = avg_price, fill = Facet)) +
geom_bar(stat = "identity", width = 1) +
facet_grid(~Facet, scales = "free_x", space = "free_x") +
scale_x_continuous(breaks = c(0:5 + 0.5, 99),
labels = c(paste0(c(0:5) * 20, "%"), "Unknown"))

One solution is to approach a bit differently, and reposition the bars instead of the ticks, using position_nudge.
library(ggplot2)
library(dplyr)
diamond.summary <-
diamonds %>%
mutate(carat = ifelse(runif(nrow(.)) < 0.05, NA_real_, carat)) %>%
group_by(carat_quintile = ntile(carat, 5)) %>%
summarise(avg_price = mean(price))
# nudge bars to the left
diamond.summary %>%
filter(!is.na(carat_quintile)) %>%
ggplot(aes(carat_quintile, avg_price)) +
geom_bar(stat = "identity",
color = "black",
width = 1,
position = position_nudge(x = -1)) +
scale_x_continuous("Carat percentile",
breaks = 1:6 - 0.5,
labels = seq(0,100, by = 20)) +
scale_y_continuous(expand = c(0,0),
limits = c(0, 1.1* max(diamond.summary$avg_price)))
# nudge bars to the right
diamond.summary %>%
filter(!is.na(carat_quintile)) %>%
ggplot(aes(carat_quintile, avg_price)) +
geom_bar(stat = "identity",
color = "black",
width = 1,
position = position_nudge(x = 1)) +
scale_x_continuous("Carat percentile",
breaks = 1:6 - 0.5,
labels = seq(0,100, by = 20)) +
scale_y_continuous(expand = c(0,0),
limits = c(0, 1.1* max(diamond.summary$avg_price)))

Create abbreviated legends manually for long X labels in ggplot2

I would like to create a simple bar chart with ggplot2. My problem is that my x variable contains long strings, so the labels overlap.
Here are some fake data and the plot:
library(dplyr)
library(tidyr)
library(ggplot2)
set.seed(42)
datas <- data.frame(label = sprintf("aLongLabel%d", 1:8),
ok = sample(seq(0, 1, by = 0.1), 8, rep = TRUE)) %>%
mutate(err = abs(ok - 1)) %>%
gather(type, freq, ok, err)
datas %>%
ggplot(aes(x = label, y = freq)) +
geom_bar(aes(fill = type), stat = "identity")
I would like to replace the labels with shorter ones and create a legend to show the matches.
What I've tried:
I use the shape aes parameter in geom_point, which creates a legend with shapes (and plots shapes that I hide with alpha = 0). Then I change the shapes with scale_shape_manual and replace the x labels with scale_x_discrete. With guides I override the alpha parameter of my shapes so they won't be invisible in the legend.
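# factor() keeps this working in R >= 4.0, where data.frame() no longer converts strings to factors
leg.txt <- levels(factor(datas$label))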
x.labels <- structure(LETTERS[seq_along(leg.txt)],
.Names = leg.txt)
datas %>%
ggplot(aes(x = label, y = freq)) +
geom_bar(aes(fill = type), stat = "identity") +
geom_point(aes(shape = label), alpha = 0) +
scale_shape_manual(name = "Labels", values = x.labels) +
guides(shape = guide_legend(override.aes = list(size = 5, alpha = 1))) +
scale_x_discrete(name = "Label", labels = x.labels)
It gives me the expected output, but I feel like this is very hacky.
Does ggplot2 provide a way to do this more directly? Thanks.

Rotation solution suggested by Pascal
Rotate the labels and align them to the edge:
datas %>%
ggplot(aes(x = label, y = freq)) +
geom_bar(aes(fill = type), stat = "identity") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
