Convert col plot to pyramid format - r

Is this type of plot possible with ggplot? I've seen examples such as this one:
http://alburez.me/2018-03-20-Population-pyramids-in-R-for-beginners/
But I actually want something much simpler, like this one:
Is this possible in ggplot? My first instinct was to create a bar plot/col plot, but I cannot get the columns centered like the ones in the attached picture above.

Sure, this can be achieved with ggplot2. One option is to use geom_rect, like so:
library(ggplot2)
dat <- data.frame(
name = factor(c("Gen Z", "Millenials", "Gen X")),
name_y = c(3:1),
pct = c(.3, .51, .19)
)
midpoint <- max(dat$pct) / 2
ggplot(dat) +
geom_rect(aes(xmin = midpoint - pct / 2, xmax = midpoint + pct / 2,
ymin = name_y - .45, ymax = name_y + .45),
fill = "steelblue", color = "white") +
geom_text(aes(x = midpoint, y = name_y, label = scales::percent(pct)), color = "white", fontface = "bold") +
scale_y_continuous(breaks = unique(dat$name_y), labels = unique(dat$name)) +
theme_minimal() +
labs(x = NULL, y = NULL) +
theme(axis.line.x = element_blank(), axis.text.x = element_blank(),
panel.grid = element_blank())
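The centering in the answer above comes down to a little arithmetic: every bar is placed symmetrically around one shared midpoint. Here is a minimal base-R sketch of just that step. The helper name centered_bounds is hypothetical and not part of ggplot2.

```r
# Hypothetical helper (not a ggplot2 function): compute symmetric
# xmin/xmax values around a shared midpoint, as done inline in geom_rect above.
centered_bounds <- function(pct) {
  midpoint <- max(pct) / 2
  data.frame(
    xmin = midpoint - pct / 2,
    xmax = midpoint + pct / 2
  )
}

bounds <- centered_bounds(c(0.30, 0.51, 0.19))
bounds$xmax - bounds$xmin        # bar widths, equal to the input shares
(bounds$xmin + bounds$xmax) / 2  # bar centres, all equal to the midpoint
```

Because the midpoint is half of the largest share, the widest bar starts at 0 and all narrower bars sit inside its extent.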

Related

Is there a method to set the theta-axis ticks for circular ring plot in ggplot2?

I want to show tick marks for the theta axis in a ggplot2 polar plot. However, neither axis.ticks.x nor axis.ticks.y in theme() works for the theta axis. Any help would be appreciated, thanks.
library(ggplot2)
df <- data.frame(
start = c(0, 121, 241),
end = c(120, 240, 359),
group = letters[1:3]
)
# an example circular ring plot
base <- ggplot(df, aes(ymax = end, ymin = start,
xmin = 0.8, xmax = 1,
fill = group)) +
geom_rect() +
coord_polar(theta = "y") +
xlim(c(0, 1))
base
# the ticks of the y axis can be changed
base + theme(axis.ticks.y = element_blank(), axis.text.y = element_blank())
# setting the ticks of the x axis does not work for the theta axis
base + theme(axis.ticks.x = element_line(color = "black", size = 2))
Thanks to @Vishal A.: the answer to "Controlling ticks and odd text in a pie chart generated from a factor variable in ggplot2" uses panel.grid.major.y. However, that adds major grid lines rather than ticks, as in the following:
base + theme(panel.grid.major.y = element_line(colour = "black"))
Created on 2021-12-20 by the reprex package (v2.0.1)
I see two options. You can use the panel grids, but you need to hide them. The usefulness of this solution depends on your intended plot background. I've used white, but this can be customised, of course.
The second option is to fake the ticks with an annotation, e.g., with the symbol "|".
Further smaller comments in the code below.
library(tidyverse)
df <- data.frame(
start = c(0, 121, 241),
end = c(120, 240, 359),
group = letters[1:3]
)
ggplot(df) +
## annotate with a rectangle, effectively covering your central hole
annotate(geom = "rect", xmin = 0, xmax = 1, ymin = min(df$start), ymax = max(df$end),
fill = "white") +
## move aes to the geom_layer
geom_rect(aes(ymax = end, ymin = start,
xmin = 0.8, xmax = 1,
fill = group)) +
coord_polar(theta = "y") +
xlim(c(0, 1)) +
theme(panel.grid.major.y = element_line(colour = "black"))
## Option 2 - fake the ticks
## the position along the theta axis is defined by y
## you need to rotate each fake tick according to its position along the theta axis.
df_annot <-
data.frame(y = seq(0,300,100), x = Inf, angle = 360-seq(0,300,100))
ggplot(df) +
## annotate with text, along your y
## by placing it beneath your geom_rect layer it will automatically be covered
geom_text(data = df_annot, aes(x, y, label = "|", angle = angle)) +
## move aes to the geom_layer
geom_rect(aes(ymax = end, ymin = start,
xmin = 0.8, xmax = 1,
fill = group)) +
coord_polar(theta = "y") +
xlim(c(0, 1))
Created on 2021-12-21 by the reprex package (v2.0.1)
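The angle column in option 2 can be derived rather than typed by hand. Below is a rough base-R sketch under stated assumptions: the theta axis starts at 12 o'clock and runs clockwise (the coord_polar defaults), and it spans exactly lo to hi. The helper tick_angle is hypothetical, and the real span depends on the scale limits of your plot, so treat the result as an approximation.

```r
# Hypothetical helper: rotation in degrees for a "|" glyph so that it
# points along the radius at theta position y.
# Assumes the theta axis covers [lo, hi], starting at the top, clockwise.
tick_angle <- function(y, lo, hi) {
  clockwise_deg <- 360 * (y - lo) / (hi - lo)
  # text angles are counter-clockwise, hence the subtraction
  (360 - clockwise_deg) %% 360
}

ticks <- data.frame(y = seq(0, 300, 100), x = Inf)
ticks$angle <- tick_angle(ticks$y, lo = 0, hi = 360)
ticks
```

With lo = 0 and hi = 360 this reproduces 360 - y from the answer, except that 360 is reported as the equivalent angle 0.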
You need to use panel.grid.minor.y instead of axis.ticks.y in order to change the ticks.
Your code will look like this:
base + theme(axis.ticks.y = element_blank(), axis.text.y = element_blank())
base + theme(panel.grid.minor.y = element_line(color = "black", size = 1))
The output will look like this:

Combine legend for fill and colour ggplot to give only single legend

I am fitting a smooth to my data with geom_smooth and using geom_ribbon to plot shaded confidence intervals for it. No matter what I try, I cannot get a single legend that represents both the smooth and the ribbon correctly, i.e. a single legend with the correct colours and labels for both. I have tried + guides(fill = FALSE) and guides(colour = FALSE), and I also read that giving colour and fill the same label inside labs() should produce a single unified legend.
Any help would be much appreciated.
Note that I have also tried to reset the legend labels and colours using scale_colour_manual()
The code below produces the figure below. Note that there are two essentially overlapping curves here. Relabelling and setting the colours worked for the geom_smooth legend but not for the geom_ribbon legend, and I still have two legends showing, which is not what I want.
ggplot(pred.dat, aes(x = age.x, y = fit, colour = tagged)) +
geom_smooth(size = 1.2) +
geom_ribbon(aes(ymin = lci, ymax = uci, fill = tagged), alpha = 0.2, colour = NA) +
theme_classic() +
labs(x = "Age (days since hatch)", y = "Body mass (g)", colour = "", fill = "") +
scale_colour_manual(labels = c("Untagged", "Tagged"), values = c("#3399FF", "#FF0033")) +
theme(axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
axis.text.x = element_text(size = 12),
axis.text.y = element_text(size = 12),
legend.text = element_text(size = 12))
The problem is that you provide new labels for the color-aesthetic but not for the fill-aesthetic. Consequently ggplot shows two legends because the labels are different.
You can either also provide the same labels for the fill-aesthetic (code option #1 below) or you can set the labels for the levels of your grouping variable ("tagged") before calling ggplot (code option #2).
library(ggplot2)
#make some data
x = seq(0,2*pi, by = 0.01)
pred.dat <- data.frame(x = c(x,x),
y = c(sin(x), cos(x)) + rnorm(length(x) * 2, 0, 1),
tag = rep(0:1, each = length(x)))
pred.dat$lci <- c(sin(x), cos(x)) - 0.4
pred.dat$uci <- c(sin(x), cos(x)) + 0.4
#option 1: set labels within ggplot call
pred.dat$tagged <- as.factor(pred.dat$tag)
ggplot(pred.dat, aes(x = x, y = y, color = tagged, fill = tagged)) +
geom_smooth(size = 1.2) +
geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.2, color = NA) +
scale_color_manual(labels = c("untagged", "tagged"), values = c("#F8766D", "#00BFC4")) +
scale_fill_manual(labels = c("untagged", "tagged"), values = c("#F8766D", "#00BFC4")) +
theme_classic() + theme(legend.title = element_blank())
#option 2: set labels before ggplot call
pred.dat$tagged <- factor(pred.dat$tag, levels = 0:1, labels = c("untagged", "tagged"))
ggplot(pred.dat, aes(x = x, y = y, color = tagged, fill = tagged)) +
geom_smooth(size = 1.2) +
geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.2, color = NA) +
theme_classic() + theme(legend.title = element_blank())
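A third variant, sketched here as an alternative rather than as part of the answer above, is to let one manual scale drive both aesthetics through the aesthetics argument of scale_colour_manual, so the labels and values cannot drift apart. The toy data frame and the names toy, pal and lab are hypothetical stand-ins for pred.dat.

```r
library(ggplot2)

# Hypothetical toy data standing in for pred.dat
x <- seq(0, 2 * pi, by = 0.05)
toy <- data.frame(
  x = c(x, x),
  y = c(sin(x), cos(x)),
  tagged = factor(rep(c("0", "1"), each = length(x)))
)
toy$lci <- toy$y - 0.4
toy$uci <- toy$y + 0.4

pal <- c("0" = "#3399FF", "1" = "#FF0033")  # values named by factor level
lab <- c("Untagged", "Tagged")              # labels in factor-level order

p <- ggplot(toy, aes(x, y, colour = tagged, fill = tagged)) +
  geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.2, colour = NA) +
  geom_line() +
  # one scale definition applied to both colour and fill
  scale_colour_manual(values = pal, labels = lab, name = NULL,
                      aesthetics = c("colour", "fill")) +
  theme_classic()
p
```

Since colour and fill then share the same name, breaks and labels, ggplot2 should merge them into a single legend.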

Putting horizontal lines on grouped boxplots

I am trying to make a boxplot with this basic code:
library(ggplot2)
library(latex2exp)  # provides TeX()
design=c("Red","Green","Blue")
actions=c("1","2","3","4","5","6","7","8")
proportion=(seq(1:240)+sample(1:500, 240, replace=T))/2000
df=data.frame(design, actions , proportion)
ggplot(df, aes(x=actions, y=proportion, fill=design)) +
geom_boxplot()+
xlab(TeX("group"))+
ylab("Y value")+
ggtitle("Y values for each group stratified by color")
Producing something like this:
I want to add horizontal lines for "true" Y values that are different for each group.
Does anyone have any tips for doing this? I don't know how to extract the width of each group of boxes, otherwise I could use geom_segment.
Here is a MWE with a non-grouped boxplot:
dBox <- data.frame(y = rnorm(10),group="1")
dBox=rbind(dBox,data.frame(y=rnorm(10),group="2"))
dLines <- data.frame(X =c(-0.36, 0.015),
Y = c(0.4, -0.2),
Xend = c(-0.015, 0.36),
Yend=c(0.4, -0.2),
group = c("True", "True"),
color = c("black", "red"))
ggplot(dBox, aes(x=0, y=y,fill=group)) +
geom_boxplot(outlier.shape = 1)+
geom_segment(data = dLines, aes(x = X, xend = Xend, y = Y, yend = Yend),color="red",size=1.5,linetype=1) +
theme(legend.background = element_rect(fill = "white", size = 0.1, linetype = "solid", colour = "black"))
This produces something like this:
However, it's difficult to make the geom_segments line up with the boxes exactly, and to then extend this to the grouped boxplot setting.
Thanks!
This can be done using a workaround with facets:
design=c("Red","Green","Blue")
actions=c("1","2","3","4","5","6","7","8")
proportion=(seq(1:240)+sample(1:500, 240, replace=T))/2000
df=data.frame(design, actions , proportion)
lines = data.frame(actions = 1:8, proportion=abs(rnorm(8)))
p = ggplot(df, aes(x=actions, y=proportion, fill=design)) +
geom_boxplot()+
xlab("group")+
ylab("Y value")+
ggtitle("Y values for each group stratified by color") +
facet_grid(~actions, scales = "free_x") +
theme(
panel.spacing.x = unit(0, "lines"),
strip.background = element_blank(),
strip.text.x = element_blank())
p + geom_hline(aes(yintercept = proportion), lines)
You could probably fiddle around with removing the spaces between the facets to make it look more like what you intended.
Thanks to @eugene100hickey for pointing out how to remove spacing between facets.
theme(panel.spacing.x) can remove those pesky lines:
p + geom_hline(aes(yintercept = proportion), lines) +
theme(panel.spacing.x = unit(0, "lines"))
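For comparison, here is a sketch of the geom_segment route the question mentions, without facets. It rests on an assumption that should be checked against your ggplot2 version: discrete x positions sit at the integers 1, 2, ..., and a dodged group of boxes spans a total width of 0.75 by default, so each line runs from position - 0.375 to position + 0.375. The data and the name true_vals are hypothetical.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data shaped like the question's: 8 groups x 3 designs
df <- data.frame(
  design = rep(c("Red", "Green", "Blue"), times = 80),
  actions = factor(rep(1:8, each = 30)),
  proportion = runif(240)
)

# Hypothetical "true" value per group
true_vals <- data.frame(actions = factor(1:8), y = seq(0.2, 0.9, by = 0.1))

half_width <- 0.375  # assumption: half of the default 0.75 group width
true_vals$x <- as.numeric(true_vals$actions) - half_width
true_vals$xend <- as.numeric(true_vals$actions) + half_width

p <- ggplot(df, aes(actions, proportion)) +
  geom_boxplot(aes(fill = design)) +
  geom_segment(data = true_vals,
               aes(x = x, xend = xend, y = y, yend = y),
               inherit.aes = FALSE, colour = "red")
p
```

If you pass a custom width to geom_boxplot, half_width has to change with it, which is the main drawback compared with the facet workaround.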

r - column wise heatmap using ggplot2

I would really appreciate if anyone could guide me with the following challenge.
I am trying to build a column-wise heatmap. For each column, I want the lowest value to be green and the highest value to be red. The current solution takes a matrix-wide approach.
I saw the solution on Heat map per column with ggplot2. As you can see, I implemented the same code but I am not getting the desired result [picture below]
df <- data.frame(
F1 = c(0.66610194649319, 0.666123551800434,
0.666100611954119, 0.665991102703081,
0.665979885730484),
acc_of_pred = c(0.499541627510021, 0.49960260221954,
0.499646067768102, 0.499447308828986,
0.499379552967265),
expected_mean_return = c(2.59756065316356e-07, 2.59799087404167e-07,
2.86466725381146e-07, 2.37977452007967e-07,
2.94242908573705e-07),
win_loss_ratio = c(0.998168189343307, 0.998411671274781,
0.998585272507726, 0.997791676357902,
0.997521287688458),
corr_pearson = c(0.00161443345430616, -0.00248811119331013,
-0.00203407575954095, -0.00496817102369628,
-0.000140531627184482),
corr_spearman = c(0.00214838517340878, -0.000308343671725617,
0.00228492127281917, -0.000359577740835049,
0.000608090759428587),
roc_vec = c(0.517972308828151, 0.51743161463546,
0.518033230192484, 0.518033294993802,
0.517931553535524)
)
combo <- data.frame(combo = c("baseline_120", "baseline_20",
"baseline_60", "baseline_288",
"baseline_5760"))
library(ggplot2)
library(reshape2)  # provides melt()
df.scaled <- scale(df)
df.scaled <- cbind(df.scaled,combo)
df.melt <- melt(df.scaled, id.vars = "combo")
ggplot(df.melt, aes(combo, variable)) +
geom_tile(aes(fill = value), colour = "white") +
scale_fill_gradient(low = "green", high = "red") +
geom_text(aes(label=value)) +
theme_grey(base_size = 9) +
labs(x = "", y = "") + scale_x_discrete(expand = c(0, 0)) +
scale_y_discrete(expand = c(0, 0)) +
theme(legend.position = "none", axis.ticks = element_blank(),
axis.text.x = element_text(size = 9 * 0.8,
angle = 0, hjust = 0, colour = "grey50"))
You are nearly there. Your plotting code is the same, but the person who asked the linked question did one extra step in the data preparation: they added a rescaled variable.
If you rescale your values before plotting and use the rescaled column as the fill aesthetic, it works (I only changed the fill in geom_tile to use the rescaled column after calculating it):
library(plyr)    # provides ddply() and .()
library(scales)  # provides rescale()
df.melt <- melt(df.scaled, id.vars = "combo")
df.melt <- ddply(df.melt, .(combo), transform, rescale = rescale(value))
ggplot(df.melt, aes(combo, variable)) +
geom_tile(aes(fill = rescale), colour = "white") +
scale_fill_gradient( low= "green", high = "red") +
geom_text(aes(label=round(value,4))) +
theme_grey(base_size = 9) +
labs(x = "", y = "") + scale_x_discrete(expand = c(0, 0)) +
scale_y_discrete(expand = c(0, 0)) +
theme(legend.position = "none", axis.ticks = element_blank(),
axis.text.x = element_text(size = 9 * 0.8,
angle = 0, hjust = 0, colour = "grey50"))
giving the plot:
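The grouping variable handed to ddply decides which cells share one colour scale. Here is a small base-R sketch of the same rescaling step on hypothetical toy data, showing both possible groupings side by side. rescale01 is a hypothetical helper that mimics a min-max rescale to [0, 1].

```r
# Hypothetical min-max helper, mapping a vector onto [0, 1]
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# Hypothetical long-format data: 3 combos x 2 metrics
long <- data.frame(
  combo = rep(c("a", "b", "c"), times = 2),
  variable = rep(c("m1", "m2"), each = 3),
  value = c(1, 2, 3, 10, 30, 20)
)

# Rescale within each metric: every metric gets its own green-to-red range
long$by_variable <- ave(long$value, long$variable, FUN = rescale01)
# Rescale within each combo: every combo gets its own green-to-red range
long$by_combo <- ave(long$value, long$combo, FUN = rescale01)
long
```

Which grouping is right depends on what "column" means in your plot: the metrics of the original data frame or the combos on the x axis.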

Create legend with manual shapes and colours

I use bars and a line to create my plot. The demo code is:
timestamp <- seq(as.Date('2010-01-01'),as.Date('2011-12-01'),by="1 mon")
data1 <- rnorm(length(timestamp), 3000, 30)
data2 <- rnorm(length(timestamp), 30, 3)
df <- data.frame(timestamp, data1, data2)
p <- ggplot()
p <- p + geom_histogram(data=df,aes(timestamp,data1),colour="black",stat="Identity",bindwidth=10)
p <- p + geom_line(data=df,aes(timestamp,y=data2*150),colour="red")
p <- p + scale_y_continuous(sec.axis = sec_axis(~./150, name = "data2"))
p <- p + scale_colour_manual(name="Parameter", labels=c("data1", "data2"), values = c('black', 'red'))
p <- p+ scale_shape_manual(name="Parameter", labels=c("data1", "data2"), values = c(15,95))
p
This results in a plot like this:
This figure does not have a legend. I followed this answer to create a customized legend, but it is not working in my case. I want a square and a line in my legend, corresponding to the bars and the line. How can I get that?
I want a legend as shown in the image below:
For the type of data you want to display, geom_bar is a better fit than geom_histogram. When you want to manipulate the appearance of the legend(s), you need to place the colour = ... parts inside aes(). To get the desired result, it is probably best to use different types of legend for the line and the bars. That way you are better able to change the appearance of the legends with guide_legend and override.aes.
A proposal for your problem:
ggplot(data = df) +
geom_bar(aes(x = timestamp, y = data1, colour = "black"),
stat = "Identity", fill = NA) +
geom_line(aes(x = timestamp, y = data2*150, linetype = "red"), colour = "red", size = 1) +
scale_y_continuous(sec.axis = sec_axis(~./150, name = "data2")) +
scale_linetype_manual(labels = "data2", values = "solid") +
scale_colour_manual(name = "Parameter\n", labels = "data1", values = "black") +
guides(colour = guide_legend(override.aes = list(colour = "black", size = 1),
order = 1),
linetype = guide_legend(title = NULL,
override.aes = list(linetype = "solid",
colour = "red",
size = 1),
order = 2)) +
theme_minimal() +
theme(legend.key = element_rect(fill = "white", colour = NA),
legend.spacing = unit(0, "lines"))
which gives:
