Pie chart - How to get the percent text at the right location? - r

I am having trouble getting the percentage labels placed at the right locations on my pie chart. Could someone kindly help me with that? Thank you very much!
library(ggplot2)
# sample data frame
d <- data.frame(facet=c('a','b','c', "d"), value=c('0.46','0.11','0.18', "0.25"))
d$value <- as.numeric(as.character(d$value))
blank_theme <- theme_minimal()+
theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.border = element_blank(),
panel.grid=element_blank(),
axis.ticks = element_blank(),
plot.title=element_text(size=14, face="bold")
)
d$perc <- round(d$value/sum(d$value) * 100,0)
d$pos <- cumsum(d$perc) - sapply(d$perc,function(x) cumsum(x)-0.5*x)
bp <- ggplot(data=d, aes(x="", y=perc, fill=facet))+
geom_bar(width = 1, stat = "identity") +
geom_text(aes(x="", y=pos, label=paste0(perc,"%"))) +
#geom_text(aes(x="", y=value/4+ c(0, cumsum(value)[-length(value)]), label=percent(value/100)))
scale_fill_manual(values = c("a" = "#b2df8a", "b" = "#238b45", "c" = "#636363", "d"="orange"))
bp
pie <- bp + coord_polar("y", start=0) + blank_theme +
theme(axis.text.x=element_blank())
pie

For some reason ggplot2 goes in the other direction when dealing with labels. Hence, using the following instead
d$pos <- 100 - (cumsum(d$perc) - sapply(d$perc, function(x) cumsum(x) - 0.5 * x))
gives the labels at the right locations.
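The position arithmetic in this answer can be checked without drawing anything. The following is a minimal sketch, not the answerer's exact code: it assumes the same sample data as the question, and it replaces the sapply() expression with the equivalent midpoint formula cumsum(perc) - perc / 2 (the name mid is introduced here for illustration). The plotting part is only built if ggplot2 is installed.

```r
# Sample data from the question
d <- data.frame(facet = c("a", "b", "c", "d"),
                value = c(0.46, 0.11, 0.18, 0.25))
d$perc <- round(d$value / sum(d$value) * 100, 0)

# Midpoint of each slice when stacking from the bottom up
mid <- cumsum(d$perc) - 0.5 * d$perc

# Flip the direction so the first level ends up at the top of the stack
d$pos <- sum(d$perc) - mid
d

# Optional: build the pie with the flipped label positions
if (requireNamespace("ggplot2", quietly = TRUE)) {
  pie <- ggplot2::ggplot(d, ggplot2::aes(x = "", y = perc, fill = facet)) +
    ggplot2::geom_col(width = 1) +
    ggplot2::geom_text(ggplot2::aes(y = pos, label = paste0(perc, "%"))) +
    ggplot2::coord_polar("y", start = 0)
}
```

In recent ggplot2 versions, geom_text(position = position_stack(vjust = 0.5)) is a commonly used alternative that avoids computing positions by hand.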

This is straightforward to achieve with the ggpiestats function from the ggstatsplot package. It only requires a slight modification to your data frame:
library(ggstatsplot)
set.seed(123)
# data
d <-
data.frame(
facet = c('a', 'b', 'c', "d"),
value = c(46, 11, 18, 25)
)
# plot with statistical details in the subtitle
ggstatsplot::ggpiestats(data = d,
main = facet,
counts = value)
In case you don't want the statistical test details and want to further customize the aesthetics of the plot, you can also use ggplot2 functions:
# customizing it further
# change the slice label
ggstatsplot::ggpiestats(data = d,
main = facet,
counts = value,
slice.label = "both",
package = "wesanderson",
palette = "Royal2") +
ggplot2::labs(subtitle = NULL)
Created on 2019-02-09 by the reprex package (v0.2.1)

Related

Separating geom_point & geom_path plot layers in ggplot (R)

My plot consists of three circles and two points. I am hoping to complete two, seemingly simple but proving difficult, tasks. I am hoping to 1) Create two legends & 2) change the household point's shape, size, and color. The circles generated using the following function...
circleFun <- function(center,diameter, npoints){
# recovered from
# https://stackoverflow.com/questions/6862742/draw-a-circle-with-ggplot2
r = diameter / 2
tt <- seq(0,2*pi,length.out = npoints)
xx <- center[1] + r * cos(tt)
yy <- center[2] + r * sin(tt)
return(data.frame(x = xx, y = yy))
}
I then call the function with 3 different inputs to generate 100 x-y points for each circle
A <- circleFun(c(0,0), 1, npoints=100) %>%
cbind("A") %>%
set_names(c("x", "y", "Neighborhood"))
B <- circleFun(c(.5, .5), 1, npoints=100) %>%
cbind("B") %>%
set_names(c("x", "y", "Neighborhood"))
C <- circleFun(c(1, 1), 1, npoints=100) %>%
cbind("C") %>%
set_names(c("x", "y", "Neighborhood"))
neigh <- rbind(A, B, C)
I then create my point data
hh <- as.data.frame(matrix(c(.25,.5,.25,.5,1,2), 2, 3)) %>%
set_names(c("x", "y", "Household"))
Thus far I have two different data sets, both points, both following aes(x,y). However, their grouping is different: the first data set is grouped by "Neighborhood", the second is grouped by "Household".
I then plot what I have thus far..
# Plot Neighborhoods and set up plot specifics
c <- ggplot(data=neigh, aes(x,y, group = Neighborhood, color = Neighborhood)) +
geom_path(size = 1.5) +
xlab("Quality of Public Amenities") +
ylab("Price of Housing") +
ggtitle("Figure 2.5") +
theme(panel.grid = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks.x = element_blank(),
axis.ticks.y = element_blank(),
plot.title = element_text(hjust=0.5, face = 'bold', size = 14))
# Add corresponding household points
c+geom_point(data=hh, aes(x=x,y=y,group = as.factor(Household), color = as.factor(Household)))
This is my output..
So why am I asking for help here? I am hoping to 1) create two legends, one for Neighborhood and another for Household, and 2) change the household points' shape, size, and color. Because they're both point plots, R is not letting me separate the aesthetics of the layers (aes()), which keeps me from accomplishing tasks 1 and 2. The example is fully reproducible.
Try this. First, I use only x and y as global aesthetics. Second, instead of mapping Household to color in geom_point, I map it to fill, which adds a second legend. One drawback of this solution: you have to choose from the filled shapes, e.g. shape 21 for filled points. The size of the points can be set via the size argument, while the colors can be set e.g. via scale_fill_manual.
library(ggplot2)
library(dplyr)
library(purrr)
# Plot Neighborhoods and set up plot specifics
c <- ggplot(data=neigh, aes(x,y)) +
geom_path(aes(group = Neighborhood, color = Neighborhood), size = 1.5) +
xlab("Quality of Public Amenities") +
ylab("Price of Housing") +
ggtitle("Figure 2.5") +
theme(panel.grid = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks.x = element_blank(),
axis.ticks.y = element_blank(),
plot.title = element_text(hjust=0.5, face = 'bold', size = 14))
# Add corresponding household points
c +
geom_point(data=hh, aes(group = as.factor(Household), fill = as.factor(Household)), shape = 21, color = "transparent", size = 2) +
scale_fill_manual(name = "Household", values = c("black", "orange"))
Created on 2020-03-31 by the reprex package (v0.3.0)

How can I create bubble grid charts in ggplot?

I want to create bubble grid charts with ggplot.
Something like this:
I couldn't find any code or example online.
Thanks
Using geom_point with discrete x and y scales will get you started. Here's an example with some quick toy data:
library(tidyverse)
offenses <- c("robbery", "violence", "drugs")
actions <- c("formal", "informal", "considered")
counts <- sample(10:100, 9, replace = TRUE)
data <- expand.grid(offenses = offenses, actions = actions) %>% bind_cols(counts = counts)
ggplot(data,
aes(x = str_to_title(offenses),
y = str_to_title(actions),
colour = str_to_title(offenses),
size = counts)) +
geom_point() +
geom_text(aes(label = counts),
colour = "white",
size = 3) +
scale_x_discrete(position = "top") +
scale_size_continuous(range = c(10, 30)) + # Adjust as required.
scale_color_brewer(palette = "Set2") +
labs(x = NULL, y = NULL) +
theme(legend.position = "none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank())
Play around with the range parameter of scale_size_continuous as needed to get bubbles of a reasonable size for your data set.
Oh, I also tried meanwhile. It looks very similar ...
require(ggplot2)
require(ggsci) # for the scale_fill_npg()
grid.bubble.plot <- function(df,
axis_labels_size=16,
aspect_ratio=1/1,
values_text_size=3,
values_text_color="white",
x_axis_position="top", # or "bottom",
bubble_size_range=c(5, 24),
bubble_alpha=0.5,
bubble_shape=21,
bubble_edge_stroke=0) {
col_names <- colnames(df)
row_names <- rownames(df)
values <- as.vector(as.matrix(df))
values_x <- as.vector(sapply(col_names, function(i) rep(i, nrow(df))))
values_y <- as.vector(rep(row_names, dim(df)[2]))
res_df <- data.frame(values = values, values_x = values_x, values_y)
gg <- ggplot(res_df, aes(x=values_x, y=values_y, size = values, fill=factor(values_x))) +
geom_point(alpha=bubble_alpha, shape=bubble_shape, stroke=bubble_edge_stroke) +
scale_size(range = bubble_size_range) +
scale_fill_npg() +
scale_x_discrete(position = x_axis_position) +
geom_text(aes(label=values), size=values_text_size, color=values_text_color) +
theme(line=element_blank(),
panel.background=element_blank(),
legend.position="none",
axis.title=element_blank(),
axis.text=element_text(size=axis_labels_size),
aspect.ratio=aspect_ratio)
gg
}
grid.bubble.plot(df)
Play around with the values.
e.g. you can make also the text size vary with the values:
# thanks to @MSR for the example dataset
require(tidyverse)
offenses <- c("robbery", "violence", "drugs")
actions <- c("formal", "informal", "considered")
counts <- sample(10:100, 9, replace = TRUE)
df <- expand.grid(offenses = offenses, actions = actions) %>% bind_cols(counts = counts)
grid.bubble.plot(df, values_text_size=as.vector(as.matrix(df)))
For color, you can try other scale_fill_ variants from ggsci, such as:
scale_fill_aaas(), scale_fill_lancet(), scale_fill_jco(), scale_fill_tron()
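Reading the body of grid.bubble.plot, it appears to expect a wide table: one column per x category, one row per y category, with the y labels stored as row names. The toy data borrowed from the other answer is in long format, so it may need reshaping first. Here is a minimal base R sketch of that step, under that assumption; the names long and wide and the hard-coded counts are made up for illustration.

```r
# Long-format toy data, one row per offense/action pair (counts are made up)
long <- expand.grid(
  offenses = c("robbery", "violence", "drugs"),
  actions  = c("formal", "informal", "considered")
)
long$counts <- c(12, 45, 30, 80, 22, 67, 51, 19, 94)

# Cross-tabulate: one row per action, one column per offense.
# The row names carry the labels that end up on the y axis.
wide <- as.data.frame.matrix(xtabs(counts ~ actions + offenses, data = long))
wide

# grid.bubble.plot(wide)  # needs the function and packages from the answer above
```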

Label specific points in ggplot

I would like certain points I have created through ggplot to take labels at the side of the graph, but I am not able to do that with my current code.
Ceplane1 is a matrix with two columns and 100 rows (it can hold any random numbers). I want to plot column 2 on the x-axis and column 1 on the y-axis, which I have done using the code below. Now I want to change the code so that the labels sit at the side of the graph and not on the plot area itself. Additionally, I want the axes formatted with comma separators. You can take results.table[1,1] and results.table[3,1] to be some numbers and suggest a solution.
ggplot(Ceplane1, aes(x = Ceplane1[,2], y = Ceplane1[,1])) +
geom_point(colour="blue")+geom_abline(slope = -results.table[5,1],intercept = 0,colour="darkred",size=1.25)+
geom_point(aes(mean(Ceplane1[,2]),mean(Ceplane1[,1])),colour="red")+
geom_point(aes(results.table[1,1],results.table[3,1],colour="darkred"))+ggtitle("CE-Plane: Drug A vs Drug P")+
xlab("QALY Difference")+ylab("Cost Difference")+xlim(-0.05,0.05)+ylim(-6000,6000)+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),plot.background = element_rect(fill = "white", colour = "black", size = 0.5))+
geom_vline(xintercept = 0,colour="black")+geom_hline(yintercept = 0,colour="black")+
geom_label(aes(mean(Ceplane1[,2]),mean(Ceplane1[,1])),label="mean")+
geom_label(aes(results.table[1,1],results.table[3,1]),label="Base ICER")
I want to put the label at the side of the graph and not on the points of the graph itself. Please suggest me a way to do that.
I think the best way is to add the mean and Base ICER points to your dataset. Then add a column for the legend and you will see them show up as matching in the chart and the legend:
library(ggplot2)
set.seed(1)
Ceplane1 <- data.frame(y = rnorm(100),
x = rnorm(100))
results.table <- data.frame(z = rnorm(100))
Ceplane1$Legend <- "Data"
meanPoint <- data.frame(y = mean(Ceplane1[,1]), x = mean(Ceplane1[,2]), Legend = "Mean")
basePoint <- data.frame(y = results.table[3,1], x = results.table[1,1], Legend = "Base ICER")
Ceplane1 <- rbind(Ceplane1, meanPoint)
Ceplane1 <- rbind(Ceplane1, basePoint)
ggplot(Ceplane1, aes(x = x, y = y, color = Legend)) +
geom_point() +
geom_abline(slope = -results.table[5,1],intercept = 0,colour="darkred",size=1.25) +
ggtitle("CE-Plane: Drug A vs Drug P")+ xlab("QALY Difference")+ylab("Cost Difference") +
xlim(-3,3) + ylim(-3,3) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),plot.background = element_rect(fill = "white", colour = "black", size = 0.5)) +
geom_vline(xintercept = 0,colour="black") +
geom_hline(yintercept = 0,colour="black")
This gives me the following:
Note that I changed the xlim and ylim to match the random data I created.
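The question also asked for comma-formatted axes, which the code above does not cover. One possible sketch: pass a labelling function to the labels argument of scale_y_continuous. The helper comma_labels and the toy data are hypothetical and not part of the original answer; scales::comma is a commonly used ready-made alternative. The plot is only built if ggplot2 is installed.

```r
# Hypothetical helper: thousands separators using base R only
comma_labels <- function(x) {
  format(x, big.mark = ",", scientific = FALSE, trim = TRUE)
}

comma_labels(c(-6000, 0, 6000))

# Attach the helper to the y axis of a toy plot (made-up data)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  toy <- data.frame(x = c(-0.02, 0, 0.03), y = c(-6000, 0, 6000))
  p <- ggplot2::ggplot(toy, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_point() +
    ggplot2::scale_y_continuous(labels = comma_labels,
                                limits = c(-6000, 6000))
}
```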

Removing axis labelling for one geom when multiple geoms are present

All I want is for this R code to display the names of the players in "topNames" on the axis while hiding the names in "otherNames". I currently plot the two groups with two separate geom_col() layers.
epldata <- read.csv(file = 'epldata.csv')
epldata$srno <- c(1:461)
attach(epldata)
points <- epldata[order(-fpl_points),]
detach(epldata)
topNames[24:461] <- NA
epldata$topNames <- topNames
topPoints[24:461] <- NA
epldata$topPoints <- topPoints
epldata$otherNames <- NA
epldata$otherNames[24:461] <- as.character(points$name[c(24:461)])
epldata$otherPoints <- NA
epldata$otherPoints[24:461] <- as.numeric(points$fpl_points[c(24:461)])
ggplot(data = epldata) +
geom_col(aes(x=epldata$topNames, y=epldata$topPoints), fill = "red", alpha = 1) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
annotate("text", x=epldata$topNames, y=-50, #epldata$topPoints,
label = epldata$topNames, fontface = 1, size = 2, hjust = 0) +
geom_col(aes(x=epldata$otherNames, y=epldata$otherPoints), fill = "gray", alpha = 0.3) +
theme(legend.position = "none") +
#theme(axis.text.x = element_text(angle = 90, hjust = 1))+
xlab("Player Names") +
ylab("FPL Points") +
guides(fill=FALSE, color=FALSE, guide = FALSE) +
coord_flip() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank())
This is the kind of output I am looking for but without using the Annotate Hack that I am currently using but directly plotting the names on the axis.
Update : have added the entire code and the link to the data set is below :
https://drive.google.com/open?id=1KTitWDcLIBmeBsz8mLcHXDIyhQLZnlhS
Once you've created a list of topNames, you can use scale_x_discrete to display only these axis labels:
scale_x_discrete(breaks = topNames)
Also, rather than using two separate geom_col() geometries, you can create a new "highlight" column in the dataframe and use that with the fill and alpha aesthetics:
library(dplyr)
library(ggplot2)
# read data from google drive
id <- "1KTitWDcLIBmeBsz8mLcHXDIyhQLZnlhS" #google file ID
epldata <- read.csv(sprintf("https://docs.google.com/uc?id=%s&export=download", id),
stringsAsFactors = FALSE)
N <- 24 #number of players to highlight
#get list of names of top N players
topNames <- epldata %>%
arrange(-fpl_points) %>%
head(N) %>%
pull(name)
# make variable for highlighting
epldata <- epldata %>%
mutate(highlight = ifelse(name %in% topNames, TRUE, FALSE))
ggplot(data = epldata,
aes(x = name, y = fpl_points, fill = highlight, alpha = highlight)) +
geom_col() +
scale_fill_manual(guide = "none", # guide = FALSE is deprecated in newer ggplot2
values = c("gray", "red")) +
scale_alpha_manual(guide = "none",
values = c(0.4, 1)) +
scale_x_discrete(breaks = topNames) + #use breaks to determine axis labels
coord_flip() +
ylab("FPL Points") +
theme_classic() +
theme(axis.ticks.y = element_blank(),
axis.title.y = element_blank())
Created on 2018-09-19 by the reprex package (v0.2.1)

Closing the lines in a ggplot2 radar / spider chart

I need a flexible way to make radar / spider charts in ggplot2. From solutions I've found on github and the ggplot2 group, I've come this far:
library(ggplot2)
library(reshape2) # for melt()
# Define a new coordinate system
coord_radar <- function(...) {
structure(coord_polar(...), class = c("radar", "polar", "coord"))
}
is.linear.radar <- function(coord) TRUE
# rescale all variables to lie between 0 and 1
scaled <- as.data.frame(lapply(mtcars, ggplot2:::rescale01))
scaled$model <- rownames(mtcars) # add model names as a variable
as.data.frame(melt(scaled,id.vars="model")) -> mtcarsm
ggplot(mtcarsm, aes(x = variable, y = value)) +
geom_path(aes(group = model)) +
coord_radar() + facet_wrap(~ model,ncol=4) +
theme(strip.text.x = element_text(size = rel(0.8)),
axis.text.x = element_text(size = rel(0.8)))
which works, except for the fact that lines are not closed.
I thought that I would be able to do this:
mtcarsm <- rbind(mtcarsm,subset(mtcarsm,variable == names(scaled)[1]))
ggplot(mtcarsm, aes(x = variable, y = value)) +
geom_path(aes(group = model)) +
coord_radar() + facet_wrap(~ model,ncol=4) +
theme(strip.text.x = element_text(size = rel(0.8)),
axis.text.x = element_text(size = rel(0.8)))
in order to join the lines, but this does not work. Neither does this:
closes <- subset(mtcarsm,variable == names(scaled)[c(1,11)])
ggplot(mtcarsm, aes(x = variable, y = value)) +
geom_path(aes(group = model)) +
coord_radar() + facet_wrap(~ model,ncol=4) +
theme(strip.text.x = element_text(size = rel(0.8)),
axis.text.x = element_text(size = rel(0.8))) + geom_path(data=closes)
which does not solve the problem, and also produces lots of
"geom_path: Each group consist of only one observation. Do you need to
adjust the group aesthetic?"
messages. So, how do I go about closing the lines?
/Fredrik
Using the new ggproto mechanism available in ggplot2 2.0.0, coord_radar can be defined as:
coord_radar <- function (theta = "x", start = 0, direction = 1)
{
theta <- match.arg(theta, c("x", "y"))
r <- if (theta == "x")
"y"
else "x"
ggproto("CoordRadar", CoordPolar, theta = theta, r = r, start = start,
direction = sign(direction),
is_linear = function(coord) TRUE)
}
Not sure if the syntax is perfect but it is working...
The code here seems outdated for ggplot2 2.0.0.
Try my package zmisc: devtools::install_github("jerryzhujian9/ezmisc")
After you install it, you will be able to run:
df = mtcars
df$model = rownames(mtcars)
ez.radarmap(df, "model", stats="mean", lwd=1, angle=0, fontsize=0.6, facet=T, facetfontsize=1, color=id, linetype=NULL)
ez.radarmap(df, "model", stats="none", lwd=1, angle=0, fontsize=1.5, facet=F, facetfontsize=1, color=id, linetype=NULL)
if you are curious about what's inside, see my codes at github:
The main codes were adapted from http://www.cmap.polytechnique.fr/~lepennec/R/Radar/RadarAndParallelPlots.html
Key factors of the solution:
1. Add the duplicated mpg rows after melt, using rbind.
2. Inherit from CoordPolar with ggproto.
3. Set is_linear = function() TRUE in the ggproto call.
Setting is_linear = function() TRUE is especially important; without it you will get a plot like this...
With the is_linear = function() TRUE setting you get:
library(dplyr)
library(data.table)
library(ggplot2)
rm(list=ls())
scale_zero_to_one <-
function(x) {
r <- range(x, na.rm = TRUE)
min <- r[1]
max <- r[2]
(x - min) / (max - min)
}
scaled.data <-
mtcars %>%
lapply(scale_zero_to_one) %>%
as.data.frame %>%
mutate(car.name=rownames(mtcars))
plot.data <-
scaled.data %>%
melt(id.vars='car.name') %>%
rbind(subset(., variable == names(scaled.data)[1]))
# create new coord : inherit coord_polar
coord_radar <-
function(theta='x', start=0, direction=1){
# input parameter sanity check
match.arg(theta, c('x','y'))
ggproto(
NULL, CoordPolar,
theta=theta, r=ifelse(theta=='x','y','x'),
start=start, direction=sign(direction),
is_linear=function() TRUE)
}
plot.data %>%
ggplot(aes(x=variable, y=value, group=car.name, colour=car.name)) +
geom_path() +
geom_point(size=rel(0.9)) +
coord_radar() +
facet_wrap(~ car.name, nrow=4) +
theme_bw() +
theme(
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.x = element_blank(),
legend.position = 'none') +
labs(title = "Cars' Status")
final result
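The rbind step is the part that closes each outline: geom_path draws points in data order, so each group needs its first variable repeated at the end. A small base R sketch of just that step follows; the data are made up and the names long, closed and path_of are introduced here for illustration.

```r
# Toy long-format data: 2 models x 3 variables (made-up values)
long <- data.frame(
  model    = rep(c("m1", "m2"), times = 3),
  variable = factor(rep(c("mpg", "cyl", "disp"), each = 2),
                    levels = c("mpg", "cyl", "disp")),
  value    = c(0.2, 0.9, 0.5, 0.1, 0.7, 0.4)
)

# Append the rows of the first variable so every group's path
# returns to its starting point
first_var <- levels(long$variable)[1]
closed <- rbind(long, long[long$variable == first_var, ])

# Order in which the points of one model would be visited by geom_path
path_of <- function(d, m) as.character(d$variable[d$model == m])
path_of(closed, "m1")
```

The closing segment is only drawn as a straight edge when the coordinate system reports itself as linear, which is what the is_linear override in coord_radar is for.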
Sorry, I was being stupid. This seems to work:
library(ggplot2)
library(reshape2) # for melt()
# Define a new coordinate system
coord_radar <- function(...) {
structure(coord_polar(...), class = c("radar", "polar", "coord"))
}
is.linear.radar <- function(coord) TRUE
# rescale all variables to lie between 0 and 1
scaled <- as.data.frame(lapply(mtcars, ggplot2:::rescale01))
scaled$model <- rownames(mtcars) # add model names as a variable
as.data.frame(melt(scaled,id.vars="model")) -> mtcarsm
mtcarsm <- rbind(mtcarsm,subset(mtcarsm,variable == names(scaled)[1]))
ggplot(mtcarsm, aes(x = variable, y = value)) +
geom_path(aes(group = model)) +
coord_radar() + facet_wrap(~ model,ncol=4) +
theme(strip.text.x = element_text(size = rel(0.8)),
axis.text.x = element_text(size = rel(0.8)))
It turns out that geom_polygon still produces a polygon in polar coordinates, so that
# rescale all variables to lie between 0 and 1
scaled <- as.data.frame(lapply(mtcars, ggplot2:::rescale01))
scaled$model <- rownames(mtcars) # add model names as a variable
# melt the dataframe
mtcarsm <- reshape2::melt(scaled)
# plot it as using the polygon geometry in the polar coordinates
ggplot(mtcarsm, aes(x = variable, y = value)) +
geom_polygon(aes(group = model), color = "black", fill = NA, size = 1) +
coord_polar() + facet_wrap( ~ model) +
theme(strip.text.x = element_text(size = rel(0.8)),
axis.text.x = element_text(size = rel(0.8)),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()) +
xlab("") + ylab("")
works perfectly...
Thank you guys for the help, but it did not cover all of my needs. I wanted to compare two series of data, so I took the subset of mtcars for Mazda. Some points that came up:
1. Nobody mentioned the order of the x variable. ggplot2 sorts this variable for the plot but does not sort the data, which made my chart wrong at the first attempt. Sorting the data fixed it; for me that was dplyr::arrange(plot.data, x.variable.name).
2. I needed to annotate the chart with values. ggplot2::annotate() works fine, but it was not included in the recent answers.
3. The above code did not work properly for my data until I added ggplot2::geom_line.
Finally, this code chunk produced my chart:
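library(dplyr)
library(tidyr) # for gather()
library(ggplot2)
# coord_radar() is the ggproto-based definition from the earlier answers
scaled <- as.data.frame(lapply(mtcars, ggplot2:::rescale01))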
scaled$model <- rownames(mtcars)
mtcarsm <- scaled %>%
filter(grepl('Mazda', model)) %>%
gather(variable, value, mpg:carb) %>%
arrange(variable)
ggplot(mtcarsm, aes(x = variable, y = value)) +
geom_polygon(aes(group = model, color = model), fill = NA, size = 1) +
geom_line(aes(group = model, color = model), size = 1) +
annotate("text", x = mtcarsm$variable, y = (mtcarsm$value + 0.05), label = round(mtcarsm$value, 2), size = 3) +
theme(strip.text.x = element_text(size = rel(0.8)),
axis.text.x = element_text(size = rel(1.2)),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()) +
xlab("") + ylab("") +
guides(color = guide_legend()) +
coord_radar()
Hopefully useful for somebody
