Overlaying Pie Charts in ggplot2 - r

I am making a pie chart to go along with a series of plots all made in ggplot2. The data I'm using have two categories broken into a total of three subcategories. Basically, the data look like this:
  Category Category_Value Super_Category
  <fctr>            <dbl>          <dbl>
1 A            0.03733874              1
2 B            0.66732754              0
3 C            0.29533372              1
Here is the basic pie chart I have at the subcategory level:
And here is what I'd like to have (or something similar):
I had never made a pie chart in ggplot2 before, so here is my basic code to generate the top plot:
pie.chart <- ggplot(pie.data, aes(x = "", y = Category_Value, fill = Category, width = 1)) +
geom_bar(width = 1, stat = "identity") +
coord_polar("y", start = 0) +
scale_fill_manual(values = c("#4DAF4A", "#377EB8", "#E41A1C")) +
theme_bw() +
theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.border = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank()
)
Is this something that's doable? I messed around with making another plot grouped at the major category level and overlaying them without success.

You could use annotate to get an approximation of your picture.
First, I used your small subset of data:
pie.data <- data.frame(
Category = c("A", "B", "C"),
Category_Value = c(0.03733874, 0.66732754, 0.29533372),
Super_Category = c(1,0,1))
Then I applied your code:
pie.chart <- ggplot(pie.data, aes(x = "", y = Category_Value, fill = Category, width = 1)) +
geom_bar(width = 1, stat = "identity") +
coord_polar("y", start = 0) +
scale_fill_manual(values = c("#4DAF4A", "#377EB8", "#E41A1C")) +
theme_bw() +
theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.border = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank()
)
Finally, I drew the outline using:
pie.chart + annotate("rect", xmin = 1.5, xmax = 1.9, ymin = 0.2, ymax = 0.30, alpha = 0, colour = "black")
And the output:
Note that because you have more data than my sample, you will have to adjust the ymin = 0.2 and ymax = 0.30 values in annotate so that the outline covers the slices you want.
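Rather than tuning ymin and ymax by hand, one option is to read the stacked slice positions back from the plot with layer_data() and outline every slice in a given super category. This is only a sketch under assumptions: it uses the three-row sample above, it assumes the group ids returned by layer_data() follow the sorted Category values (which matches the row order of pie.data here), and the xmin/xmax values of 0.5 and 1.5 are my choice to span the full bar width.

```r
library(ggplot2)

pie.data <- data.frame(
  Category = c("A", "B", "C"),
  Category_Value = c(0.03733874, 0.66732754, 0.29533372),
  Super_Category = c(1, 0, 1)
)

pie.chart <- ggplot(pie.data, aes(x = "", y = Category_Value, fill = Category)) +
  geom_col(width = 1) +
  coord_polar("y", start = 0)

# Stacked position of each slice as computed by ggplot2 (one row per slice)
slices <- layer_data(pie.chart, 1)

# Assumption: group ids follow the sorted Category values,
# which here is also the row order of pie.data
slices$Super_Category <- pie.data$Super_Category[slices$group]
outline <- slices[slices$Super_Category == 1, c("ymin", "ymax")]

# One outline rectangle per slice that belongs to super category 1
pie.outlined <- pie.chart +
  annotate("rect", xmin = 0.5, xmax = 1.5,
           ymin = outline$ymin, ymax = outline$ymax,
           alpha = 0, colour = "black")
```

Printing pie.outlined draws the pie with a black outline around each slice of super category 1. Each slice is outlined separately, so a boundary line still appears where two outlined slices meet.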

Related

Combine two plots in ggplot?

I need to create a ggplot that is a column plot overlaid with a line plot. The line plot shows mean values, while the column plot shows how the mean values relate to benchmark values. I've managed to create two separate plots in ggplot, but I'm having trouble combining them.
My line plot looks like this:
And is created using this code:
benchMarkLine <- ggplot(UEQScores, aes(x=Scale, y=Score, group=1)) +
geom_line(size = 1.4, colour = "black") +
geom_point(size = 2.4, colour = "black") +
scale_y_continuous(name = "Score", breaks = seq(0, 2.5, 0.25), limits = c(0, 2.5)) +
scale_x_discrete(name = "Scale") +
ggtitle("Mean Scores") +
theme_bw() + # Set black and white theme +
theme(plot.title = element_text(hjust = 0.5, size=10), # Centre plot title
panel.grid.major = element_blank(), # Turn off major gridlines
panel.grid.minor = element_blank(), # Turn off minor gridlines
axis.title.x = element_text(size=10),
axis.text.x = element_text(angle=30, vjust=0.6),
axis.title.y = element_text(size=10))
benchMarkLine
My Column plot looks like this:
And was created with the following code:
benchmarkColPlot <- ggplot(benchmark_long, aes(x=factor(Scale, scaleLevels), y=value, fill=factor(cat, bmLevels))) +
geom_col(position="fill") +
scale_fill_manual(values = bmColours) +
scale_y_continuous(name = "Score", breaks = seq(-1.0, 1.0, 0.25), limits = c(-1, 1)) +
scale_x_discrete(name = "Scale") +
ggtitle("Benchmark Scores") +
theme_bw() + # Set black and white theme +
theme(plot.title = element_text(hjust = 0.5, size=10), # Centre plot title
panel.grid.major = element_blank(), # Turn off major gridlines
panel.grid.minor = element_blank(), # Turn off minor gridlines
axis.title.x = element_text(size=10),
axis.text.x = element_text(angle=30, vjust=0.6),
axis.title.y = element_text(size=10),
legend.title = element_blank())
benchmarkColPlot
How can I combine these two? I tried inserting geom_line(UEQScores, aes(x=Scale, y=Score, group=1)) + above geom_col(position="fill") + in the column plot code, but I just get the following error:
Error: `mapping` must be created by `aes()`
How do I combine these two plots?
OK, I've given up on this - I just created the chart in Excel as it seems to be a bit easier for what I'm doing here.
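For what it's worth, that error comes from argument order: the first positional argument of geom_line() is mapping, so the data frame has to be passed by name as data =. Below is a minimal sketch of the overlay. The two data frames are hypothetical stand-ins for UEQScores and benchmark_long (the real ones are not shown in the question), and inherit.aes = FALSE is used so the line layer does not pick up the fill mapping, which refers to a column the line data does not have.

```r
library(ggplot2)

# Hypothetical stand-ins for the question's data frames (not the real data)
UEQScores <- data.frame(
  Scale = c("S1", "S2", "S3"),
  Score = c(1.2, 0.8, 1.6)
)
benchmark_long <- data.frame(
  Scale = rep(c("S1", "S2", "S3"), each = 2),
  cat   = rep(c("Below average", "Above average"), times = 3),
  value = c(1.0, 1.5, 0.9, 1.6, 1.1, 1.4)
)

combined <- ggplot(benchmark_long, aes(x = Scale, y = value, fill = cat)) +
  geom_col() +
  # data must be named; the first positional argument is `mapping`
  geom_line(data = UEQScores, aes(x = Scale, y = Score, group = 1),
            inherit.aes = FALSE, colour = "black") +
  geom_point(data = UEQScores, aes(x = Scale, y = Score),
             inherit.aes = FALSE, colour = "black")
```

Note that the original column plot uses position = "fill", which rescales each column to the 0 to 1 range, so a line of raw scores would not share that scale. The sketch uses the default stacking instead.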

geom_text labelling bars incorrectly

I am trying to create a graph in R with ggplot. The graph is fine until I try to add labels with geom_text.
Data:
year <-c(2016,2017,2016,2017,2016,2017,2016,2017,2016,2017,2016,2017,2016,2017)
age <- c("0-15","0-15","16-25","16-25","26-35","26-35","36-45","36-45","46-55","46-55","56-65","56-65","66+","66+")
deaths <- c(10,4,40,33,38,28,23,22,18,22,13,16,44,33)
age_group <- factor(age)
fyear <- factor(year)
ideaths <- data.frame(fyear,age_group,deaths)
This is the code I have so far
ggplot(data = ideaths,mapping = aes(x = age_group, y=deaths,
fill=fyear)) +
geom_bar(position = "dodge", stat="identity", width=0.5) +
geom_text(label=deaths,vjust=-0.5) + ggtitle("Figure 8.") +
scale_fill_manual(values=c("#7F7F7F","#94D451")) +
scale_y_continuous(breaks=seq(0,55,5)) + theme_light() +
theme(panel.border = element_blank(), panel.grid.major.x =
element_blank(), panel.grid.minor.y =
element_blank(),panel.grid.major.y = element_line( size=.1,
color="grey"), axis.title = element_blank(), legend.position
= "bottom", legend.title=element_blank(), plot.title
=element_text(size=10))
Which gives me this graph:
I searched for how to align the labels with the bars and found position=position_dodge(width=0.9)
However, this puts the label over the wrong bar for me.
If anyone has any idea of how to fix this, or what is causing it in the first place it would be greatly appreciated!
You need to put label = deaths inside aes(), so that ggplot uses the deaths column of the ideaths data frame rather than the standalone deaths vector:
library(ggplot2)
ggplot(data = ideaths, aes(x = age_group, y = deaths, fill = fyear)) +
geom_col(position = position_dodge(width = 0.9)) +
geom_text(aes(x = age_group, y = deaths + 3, label = deaths),
position = position_dodge(width = 0.9)) +
ggtitle("Figure 8.") +
scale_fill_manual(values = c("#7F7F7F", "#94D451")) +
scale_y_continuous(breaks = seq(0, 55, 5)) +
theme_light() +
theme(
panel.border = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.major.y = element_line(size = .1, color = "grey"),
axis.title = element_blank(), legend.position = "bottom",
legend.title = element_blank(), plot.title = element_text(size = 10)
)
Created on 2018-11-19 by the reprex package (v0.2.1.9000)

Dealing with factors in geom_pointrange in ggplot

I am trying to visualize some data that consist of odds ratios and confidence intervals for regions nested in countries. I am using geom_pointrange for that, and in general it works very well.
My problem is that since the odds ratios (and upper confidence limits) can take quite high values, the axes of the plot are stretched to accommodate them. As a result, confidence intervals that lie between 0 and 1 do not appear clearly enough. One option I found through this community is to change the values into factors, so that the distance between them is the same for every measurement. This works for the odds ratios (I still need to tweak the axis tick marks), but when the lower and upper confidence limits are involved, the positions are totally wrong and the confidence intervals do not include the point estimate. I tried to solve this by including all values as levels of the factor, but this did not solve the issue.
What I am trying to do is either "magnify" the area between 0 and 1 in the graph while leaving the rest of the plot area unchanged, or make ggplot place the confidence intervals correctly around the odds ratios.
Below I include a simplified version of my data and the code I have been using, for reproducibility.
dat <- data.frame(region = rep(LETTERS[1:5], 2),
country = rep(c("A1", "A2"), each = 5),
or = c(6.459578, 1.696221, 0.895115, 3.393235, 2.325510,
4.457805, 0.407111, 22.760861, 3.354883, 2.214915),
lower = c(5.768999699, 0.237062909, 0.347443105, 0.369881529,
0.010233696, 1.020315696, 0.004419494, 3.87391259,
0.808667764, 0.874415935),
upper = c(7.2328221, 12.1367207, 2.3060778, 31.1290104,
28.4497981, 19.4763489, 0.750188, 337.2960785,
13.9182469, 5.610429))
library(ggplot2)
ggplot(dat, aes(x = region, y = or, ymin = lower, ymax = upper))+
geom_pointrange() +
geom_hline(yintercept = 1, linetype = 2) +
theme_bw() +
theme(plot.margin = unit(c(1, 1, 1, 4), "lines"),
axis.title = element_blank(),
axis.ticks.y = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position="none") +
facet_wrap(~ country) +
coord_flip(ylim = c(0, 100))
# Change numeric variable into factors
f.levels <- c(dat$or, dat$lower, dat$upper)
f.levels <- unique(f.levels)
f.levels <- as.character(f.levels[order(f.levels)])
dat$or <- factor(dat$or, levels = f.levels)
dat$lower <- factor(dat$lower, levels = f.levels)
dat$upper <- factor(dat$upper, levels = f.levels)
ggplot(dat, aes(x = region, y = or, ymin = lower, ymax = upper))+
geom_pointrange() +
geom_hline(yintercept = 1, linetype = 2) +
theme_bw() +
theme(plot.margin = unit(c(1, 1, 1, 4), "lines"),
axis.title = element_blank(),
axis.ticks.y = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position="none") +
facet_wrap(~ country) +
coord_flip(ylim = c(0, 30))
I am relatively new to ggplot so please excuse any newbie mistakes.
Any suggestions on this problem are highly appreciated.
Thank you!
I think the standard solution for this problem is to plot the ORs on a log10 scale. For a neat explanation see https://blogs.sas.com/content/iml/2015/07/29/or-plots-log-scale.html
ggplot(dat, aes(x = region, y = or, ymin = lower, ymax = upper)) +
geom_pointrange() +
geom_hline(yintercept = 1, linetype = 2) +
scale_y_log10() + ### This is the line that makes the transformation
theme_bw() +
theme(plot.margin = unit(c(1, 1, 1, 4), "lines"),
axis.title = element_blank(),
axis.ticks.y = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position="none") +
facet_wrap(~ country) +
coord_flip()
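The log scale needs the original numeric columns, not the factor versions created in the question. If the default tick marks are too sparse, explicit breaks can be passed to scale_y_log10(). The sketch below rebuilds dat from the question; the break values in or_breaks are only an illustrative assumption and should be chosen to suit the data.

```r
library(ggplot2)

# Data from the question, kept numeric
dat <- data.frame(region = rep(LETTERS[1:5], 2),
                  country = rep(c("A1", "A2"), each = 5),
                  or = c(6.459578, 1.696221, 0.895115, 3.393235, 2.325510,
                         4.457805, 0.407111, 22.760861, 3.354883, 2.214915),
                  lower = c(5.768999699, 0.237062909, 0.347443105, 0.369881529,
                            0.010233696, 1.020315696, 0.004419494, 3.87391259,
                            0.808667764, 0.874415935),
                  upper = c(7.2328221, 12.1367207, 2.3060778, 31.1290104,
                            28.4497981, 19.4763489, 0.750188, 337.2960785,
                            13.9182469, 5.610429))

# Illustrative tick positions (an assumption, not from the original answer)
or_breaks <- c(0.01, 0.1, 1, 10, 100)

p <- ggplot(dat, aes(x = region, y = or, ymin = lower, ymax = upper)) +
  geom_pointrange() +
  geom_hline(yintercept = 1, linetype = 2) +
  scale_y_log10(breaks = or_breaks, labels = as.character(or_breaks)) +
  theme_bw() +
  facet_wrap(~ country) +
  coord_flip()
```

On this scale an OR of 0.1 sits as far from 1 as an OR of 10, so intervals below 1 get the same visual room as intervals above it.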

ggplot2: Boxplots with points and fill separation [duplicate]

I have data that can be divided by two separators: one is the year and the second is a field characteristic.
box<-as.data.frame(1:36)
box$year <- c(1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997,
1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997)
box$year <- as.character(box$year)
box$case <- c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
6.00,6.11,6.40,7.00,NA,5.44,6.00, NA,6.00,
6.00,6.20,6.40,6.64,6.33,6.60,7.14,6.89,7.10,
6.73,6.27,6.64,6.41,6.42,6.17,6.05,5.89,5.82)
box$code <- c("L","L","L","L","L","L","L","L","L","L","L","L",
"L","L","L","L","L","L","M","M","M","M","M","M",
"M","M","M","M","M","M","M","M","M","M","M","M")
colour <- factor(box$code, labels = c("#F8766D", "#00BFC4"))
In boxplots, I want to display points over them, to see how data is distributed. That is easily done with one single boxplot for every year:
ggplot(box, aes(x = year, y = case, fill = "#F8766D")) +
geom_boxplot(alpha = 0.80) +
geom_point(colour = colour, size = 5) +
theme(text = element_text(size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
legend.position = "none")
But it becomes more complicated when I add the fill parameter:
ggplot(box, aes(x = year, y = case, fill = code)) +
geom_boxplot(alpha = 0.80) +
geom_point(colour = colour, size = 5) +
theme(text = element_text(size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
legend.position = "none")
And now the question: how do I move these points onto the boxplots they belong to, so that the blue points sit over the blue boxplot and the red points over the red one?
Like Henrik said, use position_jitterdodge() and shape = 21. You can clean up your code a bit too:
No need to define box, then fill it piece by piece
You can let ggplot hash out the colors if you wish and skip constructing the colors factor. If you want to change the defaults, look into scale_fill_manual and scale_color_manual.
box <- data.frame(year = c(1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997,
1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997),
case = c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
6.00,6.11,6.40,7.00,NA,5.44,6.00, NA,6.00,
6.00,6.20,6.40,6.64,6.33,6.60,7.14,6.89,7.10,
6.73,6.27,6.64,6.41,6.42,6.17,6.05,5.89,5.82),
code = c("L","L","L","L","L","L","L","L","L","L","L","L",
"L","L","L","L","L","L","M","M","M","M","M","M",
"M","M","M","M","M","M","M","M","M","M","M","M"))
ggplot(box, aes(x = factor(year), y = case, fill = code)) +
geom_boxplot(alpha = 0.80) +
geom_point(aes(fill = code), size = 5, shape = 21, position = position_jitterdodge()) +
theme(text = element_text(size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
legend.position = "none")
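If you do want to set the colours yourself, as mentioned above, scale_fill_manual() with a named vector is one way to do it. The sketch below uses a small illustrative subset of the values from the question (without the NA rows) and the two hex colours the question already used. The mapping of "L" and "M" to those colours is my assumption.

```r
library(ggplot2)

# Small illustrative subset of the question's data (NA rows left out)
box_small <- data.frame(
  year = rep(c(1996, 1997), each = 4),
  case = c(6.40, 6.75, 6.00, 6.20, 6.00, 6.11, 6.73, 6.27),
  code = rep(c("L", "L", "M", "M"), times = 2)
)

p <- ggplot(box_small, aes(x = factor(year), y = case, fill = code)) +
  geom_boxplot(alpha = 0.80) +
  geom_point(size = 5, shape = 21, position = position_jitterdodge()) +
  # Named vector: each level of `code` gets an explicit fill colour
  scale_fill_manual(values = c(L = "#F8766D", M = "#00BFC4"))
```

Because shape = 21 has both a fill and an outline, the points take their fill from the same scale as the boxes, so no separate colour factor is needed.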
I see you've already accepted @JakeKaupp's nice answer, but I thought I would throw in a different option, using geom_dotplot. The data you are visualizing is rather small, so why not forgo the boxplot?
ggplot(box, aes(x = factor(year), y = case, fill = code))+
geom_dotplot(binaxis = 'y', stackdir = 'center',
position = position_dodge())

ggplot2 x - y axis intersect while keeping axis labels

I posted my original question yesterday which got solved perfectly here
Original post
I made a few additions to my code:
library(lubridate)
library(ggplot2)
library(grid)
### Set up dummy data.
dayVec <- seq(ymd('2016-01-01'), ymd('2016-01-10'), by = '1 day')
dayCount <- length(dayVec)
dayValVec1 <- c(0,-0.22,0.15,0.3,0.4,0.10,0.17,0.22,0.50,0.89)
dayValVec2 <- c(0,0.2,-0.17,0.6,0.16,0.41,0.55,0.80,0.90,1.00)
dayValVec3 <- dayValVec2
dayDF <- data.frame(Date = rep(dayVec, 3),
DataType = factor(c(rep('A', dayCount), rep('B', dayCount), rep('C', dayCount))),
Value = c(dayValVec1, dayValVec2, dayValVec3))
ggplot(dayDF, aes(Date, Value, colour = DataType)) +
theme_bw() +
ggtitle("Cumulative Returns \n") +
scale_color_manual("",values = c("#033563", "#E1E2D2", "#4C633C"),
labels = c("Portfolio ", "Index ", "In-Sample ")) +
geom_rect(aes(xmin = ymd('2016-01-01'),
xmax = ymd('2016-01-06'),
ymin = -Inf,
ymax = Inf
), fill = "#E1E2D2", alpha = 0.03, colour = "#E1E2D2") +
geom_line(size = 2) +
scale_x_date(labels = scales::date_format('%b-%d'),
breaks = scales::date_breaks('1 day'),
expand = c(0,0)) +
scale_y_continuous( expand = c(0,0), labels = scales::percent) +
theme(axis.text.x = element_text(angle = 90),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major.x = element_blank(),
axis.line = element_line(size = 1),
axis.ticks = element_line(size = 1),
axis.text = element_text(size = 20, colour = "#033563"),
axis.title.x = element_text(hjust = 2),
plot.title = element_text(size = 40, face = "bold", colour = "#033563"),
legend.position = 'bottom',
legend.text = element_text(colour = "#033563", size = 20),
legend.key = element_blank()
)
which produces this output
The only thing that I still cannot get working is the position of the x axis. I want the x axis to be at y = 0 but still keep the x axis labels under the chart, exactly as in the excel version of it. I know the data sets are not the same but I didn't have the original data at hand so I produced some dummy data. Hope this was worth a new question, thanks.
> grid.ls(grid.force())
GRID.gTableParent.12660
background.1-5-7-1
spacer.4-3-4-3
panel.3-4-3-4
grill.gTree.12619
panel.background.rect.12613
panel.grid.minor.y.zeroGrob.12614
panel.grid.minor.x.zeroGrob.12615
panel.grid.major.y.polyline.12617
panel.grid.major.x.zeroGrob.12618
geom_rect.rect.12607
GRID.polyline.12608
panel.border.rect.12610
axis-l.3-3-3-3
axis.line.y.polyline.12631
axis
axis-b.4-4-4-4
axis.line.x.polyline.12624
axis
xlab.5-4-5-4
ylab.3-2-3-2
guide-box.6-4-6-4
title.2-4-2-4
> grid.gget("axis.1-1-1-1", grep=T)
NULL
ggplot2 doesn't make this easy. Below is one way to approach this interactively. Basically, you just grab the relevant parts of the plot (the axis line and ticks) and reposition them.
If p is your plot
p
grid.force()
# grab the relevant parts - have a look at grid.ls()
tck <- grid.gget("axis.1-1-1-1", grep=T)[[2]] # tick marks
ax <- grid.gget("axis.line.x", grep=T) # x-axis line
# add them to the plot, this time suppressing the x-axis at its default position
p + lapply(list(ax, tck), annotation_custom, ymax=0) +
theme(axis.line.x=element_blank(),
axis.ticks.x=element_blank())
Which produces
A quick note: more recent versions of ggplot2 do not draw the axis lines by default, and changes to axis.line are not automatically passed down to the x and y axes. Therefore, I tweaked your theme to define axis.line.x and axis.line.y separately.
That said, perhaps it's easier (and more robust?) to use geom_hline as suggested in the comments, and geom_segment for the ticks.
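A rough sketch of that alternative follows. It uses a single dummy series based on the first value vector in the question, and tick_len is a hypothetical tick length in data units that would need adjusting to the y range. The axis labels stay at the bottom of the panel because only the default axis line and ticks are hidden.

```r
library(ggplot2)

# Dummy data in the spirit of the question (one series only)
dayDF <- data.frame(
  Date  = as.Date("2016-01-01") + 0:9,
  Value = c(0, -0.22, 0.15, 0.3, 0.4, 0.10, 0.17, 0.22, 0.50, 0.89)
)

tick_len <- 0.02  # hypothetical tick length, in y data units

p <- ggplot(dayDF, aes(Date, Value)) +
  geom_line() +
  # x axis line drawn at y = 0 instead of at the bottom of the panel
  geom_hline(yintercept = 0) +
  # one short segment per date acts as a tick mark hanging below y = 0
  geom_segment(aes(x = Date, xend = Date, y = 0, yend = -tick_len)) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 day") +
  theme_bw() +
  theme(axis.line.x = element_blank(),
        axis.ticks.x = element_blank(),
        panel.border = element_blank())
```

Since the line and ticks are ordinary layers, they survive ggplot2 upgrades better than grob names pulled out with grid.gget(), at the cost of having to pick the tick length by hand.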
