I am doing a ring plot with ggplot and I would like to add borders to the categories but they overlap. Is there a way to make the borders internal to the rectangle?
Reproducible example.
Data:
plot.df <- data.frame("number"=c(3455, 3714, 2345),
"group"=c("A","B", "C"))
plot.df$fraction <- plot.df$number / sum(plot.df$number)
plot.df <- plot.df[order(plot.df$fraction), ]
plot.df$ymax <- cumsum(plot.df$fraction)
plot.df$ymin = c(0, head(plot.df$ymax, n=-1))
Plot:
ggplot(plot.df, aes(color = group, fill=group,
ymax=ymax, ymin=ymin,
xmax=4, xmin=2.5)) +
geom_rect(alpha = 0.6, size = 4) +
coord_polar(theta="y") +
xlim(c(0, 4)) +
theme_bw() +
theme(panel.grid=element_blank(), axis.text=element_blank()) +
theme(axis.ticks=element_blank()) +
labs(title="My Ring Plot", x = "", y = "",
fill = "", color = "") +
theme(plot.title = element_text(hjust = 0.5))
I get the following plot, that is correct except for the borders.
For example, between B and C only the B (green) border is visible, and I would like to see a thick green line next to a thick blue line. Did I explain myself?
Thanks for your help!
EDIT:
I found a dirty solution. It is not perfect or elegant, but it more or less does the job.
First we need to modify the ymin column
plot.df$ymin = c(0.0125, head(plot.df$ymax, n=-1)+ 0.0125)
and then add a new row for a "ghost" category
# build the ghost row as a data frame so the numeric columns stay numeric
plot.df <- rbind(data.frame(number = 234, group = "D", fraction = 0.0125, ymax = 0.0125, ymin = 0), plot.df)
now we can make the plot hiding the "ghost" category
ggplot(plot.df, aes(color = group, fill=group,
ymax=ymax, ymin=ymin,
xmax=4, xmin=3)) +
geom_rect(alpha = 0.6, size = 4) +
coord_polar(theta="y") +
xlim(c(0, 4)) +
theme_bw() +
scale_fill_manual(breaks = c("A", "B", "C"),
values = c("red", "green", "blue", "white"),
aesthetics = c("colour", "fill")) +
theme(panel.grid=element_blank(), axis.text=element_blank()) +
theme(axis.ticks=element_blank()) +
labs(title="My Ring Plot", x = "", y = "",
fill = "", color = "") +
theme(plot.title = element_text(hjust = 0.5))
That looks like what I was looking for, but the way I made it is not ideal.
Any other solution to achieve this? Thanks!
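One way to generalise the workaround above without a ghost category is to shrink every segment by a small gap before plotting, so that each thick border has room of its own. The sketch below is a variant under assumptions rather than a definitive fix: the names pad, ymin_in and ymax_in are made up for illustration, and a suitable value for pad depends on the border width and the plot size, because border widths are given in millimetres while ymin and ymax are fractions of the circle.

```r
library(ggplot2)

plot.df <- data.frame(number = c(3455, 3714, 2345),
                      group  = c("A", "B", "C"))
plot.df$fraction <- plot.df$number / sum(plot.df$number)
plot.df <- plot.df[order(plot.df$fraction), ]
plot.df$ymax <- cumsum(plot.df$fraction)
plot.df$ymin <- c(0, head(plot.df$ymax, n = -1))

# Hypothetical gap between neighbouring segments, as a fraction of the circle
pad <- 0.0125
plot.df$ymin_in <- plot.df$ymin + pad / 2
plot.df$ymax_in <- plot.df$ymax - pad / 2

p <- ggplot(plot.df, aes(colour = group, fill = group,
                         ymin = ymin_in, ymax = ymax_in,
                         xmin = 3, xmax = 4)) +
  # `size` follows the question; newer ggplot2 versions prefer `linewidth`
  geom_rect(alpha = 0.6, size = 4) +
  coord_polar(theta = "y") +
  xlim(c(0, 4)) +
  # fix the angular range to the full circle, otherwise the first and
  # last segments would touch again
  ylim(c(0, 1)) +
  theme_bw() +
  theme(panel.grid = element_blank(), axis.text = element_blank(),
        axis.ticks = element_blank())
p
```

Every segment loses the same amount on both sides, so the proportions between groups are only slightly distorted; a smaller pad reduces that distortion at the cost of borders that start to overlap again.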
I am trying to display a graph showing the log10 transformed values of some data. When creating a violin graph with the below code:
tt <- EditedDF1 %>%
ggplot(aes(x=ct_marshallFAC, y=log10(uchl1_d1_pgml), fill = EditedDF1$ct_marshallFAC)) +
geom_violin() +
geom_boxplot(width=0.1, outlier.shape = NA, fill="white") +
theme_classic() +
theme(axis.text.x = element_text(size= 20)) +
labs(x="", y= "Log UCHL1(pg/ml)") +
ylim(-3.5, 6) +
theme(legend.position="none")+
scale_x_discrete(labels=c("I", "II", "III-IV", "V-VI")) +
geom_hline(yintercept = 1.568, colour = "red", linetype="dotted" )
My problem is the y axis gives a value of -3, -2, -1, 0 when what I want it to show is the actual value the log10 would equate to. For instance -1 would be 0.1, 0 would be 1, 1 would be 10 and so on.
I have chosen to display the log-transformed data as I have multiple panels of data. However I feel the reader would be better able to understand if I gave the actual values on the y axis.
Many thanks.
Dan W
Plot the raw values instead of log10(uchl1_d1_pgml), take out your ylim(-3.5, 6), and let ggplot's scale_y_log10 do the transformation. The limits go there and are given on the original scale (10^-3.5 and 10^6 rather than -3.5 and 6), and the reference line moves to the original scale as well:
library(ggplot2)
EditedDF1 <- data.frame(ct_marshallFAC = rep(c("I", "II", "III-IV", "V-VI"),10), uchl1_d1_pgml =runif(40)*100)
# y is the untransformed value; scale_y_log10() below applies the log10
tt <- ggplot(EditedDF1, aes(x=ct_marshallFAC, y=uchl1_d1_pgml,
fill = ct_marshallFAC)) +
geom_violin() +
geom_boxplot(width=0.1, outlier.shape = NA, fill="white") +
theme_classic() +
theme(axis.text.x = element_text(size= 20)) +
labs(x="", y= "UCHL1(pg/ml)") +
theme(legend.position="none")+
scale_x_discrete(labels=c("I", "II", "III-IV", "V-VI")) +
geom_hline(yintercept = 10^1.568, colour = "red", linetype="dotted" ) +
scale_y_log10(limits = c(10^-3.5, 10^6))
Edit: I've never had someone tell me that a logarithmic scale is confusing...I'd just use what you had to begin with and keep the label to reflect that the values are logarithmic.
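If the log10-transformed values are kept, as the edit suggests, the axis can still show the original values by relabelling the ticks. This is a minimal sketch on made-up data: demo, grade and conc are placeholder names, and back_label is a hypothetical helper, not something from the question.

```r
library(ggplot2)

set.seed(1)
demo <- data.frame(grade = rep(c("I", "II", "III-IV", "V-VI"), 10),
                   conc  = runif(40) * 100)

# Turn a position on the log10 axis back into the original value
back_label <- function(x) as.character(10^x)

p <- ggplot(demo, aes(x = grade, y = log10(conc), fill = grade)) +
  geom_violin() +
  geom_boxplot(width = 0.1, outlier.shape = NA, fill = "white") +
  # the data stay log10-transformed; only the tick labels change
  scale_y_continuous(breaks = -3:6, limits = c(-3.5, 6),
                     labels = back_label) +
  labs(x = "", y = "UCHL1 (pg/ml)") +
  theme_classic() +
  theme(legend.position = "none")
p
```

With this approach a tick at -1 is labelled 0.1, a tick at 0 is labelled 1, and so on, while ylim-style limits stay on the log10 scale. Large values may be printed in scientific notation by as.character().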
I know this question is similar to ones that have been asked before, but the suggested solutions don't seem to apply.
I set up the problem as follows
mat1 <- NULL
mat2 <- NULL
mat1 <- data.frame(matrix(nrow =16, ncol =2, data = rnorm(32, 0, 1)))
mat2 <- data.frame(matrix(nrow =16, ncol =2, data = rnorm(32, 0, 1)))
mat1[,1] = mat2[,1] = 1:16
colnames(mat1) = c("Window", "CM")
colnames(mat2) = c("Window", "FM")
ggplot() +
geom_line(data = mat1, aes(x = mat1$Window, y= mat1$CM), linetype ="twodash", color ="steelblue") +
geom_line(data = mat2, aes(x = mat2$Window, y= mat2$FM), color = "black") +
theme_classic() + xlab("Quarter after alpha assessment") + ylab("Estimated Coefficient") + labs(fill = "cohort model")
I want to add a legend. Specifically, I want the blue line to be labelled CM and the black line to be labelled FM.
In this kind of scenario I think it is often easiest to bring your data into the appropriate format for ggplot. Then you can properly use the whole ggplot toolset.
library(tidyverse)
# join on the shared Window column, then reshape to long form (one row per Window and model)
mat3 = left_join(mat1, mat2, by = "Window") %>%
gather(type, value, -Window)
mat3 %>%
ggplot(aes(x = Window, y = value, group = type, color = type, linetype = type)) +
geom_line() +
scale_color_manual("cohort model",
values = c("CM" = "steelblue","FM" = "black"),
breaks = c("CM", "FM")) +
scale_linetype_manual("cohort model",
values = c("twodash", "solid"),
breaks = c("CM", "FM")) +
labs(x = "Quarter after alpha assessment", y = "Estimated Coefficient") +
theme_classic()
I assume the simplest way to do this would be to use annotate():
ggplot() +
geom_line(data = mat1, aes(x = mat1$Window, y= mat1$CM), linetype ="twodash", color ="steelblue") +
geom_line(data = mat2, aes(x = mat2$Window, y= mat2$FM), color = "black") +
theme_classic() + xlab("Quarter after alpha assessment") + ylab("Estimated Coefficient") + labs(fill = "cohort model") +
xlim(NA,18) +
annotate(geom="text", x=16.5, y=1.51232841, label="CM", color="blue", size=3) +
annotate(geom="text", x=16.5, y=-0.487350382, label="FM", color="black", size=3)
You can easily change and adjust the position with x= and y=. I also slightly extended the upper limit of x-scale so that the text fits in.
Of course, I don't know if that's enough for you. Otherwise, you could also add a text field as legend. But this would be the easiest and fastest way.
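A third option, sketched here under the assumption that the two data frames should stay separate, is to map a constant label to colour and linetype inside aes(); ggplot then builds the legend itself. The seed and the simplified construction of mat1 and mat2 are illustrative only.

```r
library(ggplot2)

set.seed(42)
mat1 <- data.frame(Window = 1:16, CM = rnorm(16))
mat2 <- data.frame(Window = 1:16, FM = rnorm(16))

p <- ggplot() +
  # the strings "CM" and "FM" inside aes() are legend keys, not colours
  geom_line(data = mat1, aes(x = Window, y = CM,
                             colour = "CM", linetype = "CM")) +
  geom_line(data = mat2, aes(x = Window, y = FM,
                             colour = "FM", linetype = "FM")) +
  # the same title on both scales merges them into a single legend
  scale_colour_manual("cohort model",
                      values = c(CM = "steelblue", FM = "black")) +
  scale_linetype_manual("cohort model",
                        values = c(CM = "twodash", FM = "solid")) +
  labs(x = "Quarter after alpha assessment", y = "Estimated Coefficient") +
  theme_classic()
p
```

The actual colours and line types are assigned in the manual scales, keyed by the same labels, so the legend and the lines cannot drift apart.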
I have data similar to the following example:
dat1 <- data.frame(group=c("a", "a","a", "a","a","a","b","b","b","b","b", "b", "b","b","b","c","c","c","c","c","c"),
subgroup=c(paste0("R", rep(1:6)),paste0("R", rep(1:9)),paste0("R", rep(1:6))),
value=c(15,16,12,12,14,5,14,27,20,23,14,10,20,22,14,15,18,14,23,30,32),
pp=c("AT","BT","CT","AA","CC","SE","DN","AS","MM","XT","QQ","HH","MK","HT","dd","US","AG","TT","ZZ","XK","RU"),
clusters=c(rep("cluster1",6),rep("cluster2",9),rep("cluster3",6)))
colors <- c(rep("#74c1e8",6),rep("#808000",9),rep("#FF69B4",6))
names(colors) <- c("cluster1","cluster2","cluster3")
My code is:
pl <- ggplot(dat1, aes(y = pp, x = subgroup)) +
geom_point(aes(size=value)) +
facet_grid(~group, scales="free_x", space = "free") +
ylab("names") +
xlab(" ") +
theme(axis.text.y = element_text(color=colors))
pl
What I want is to add some space on y_axis after each cluster. For example, after cluster 3 (red ones) I want to add some space like space between panels, etc. in the following plot.
Is there a way to do that?
My solution converts the y axis to a factor and adds a geom_hline between each cluster:
library(tidyverse)
dat1 <- data.frame(group=c("a", "a","a", "a","a","a","b","b","b","b","b", "b", "b","b","b","c","c","c","c","c","c"),
subgroup=c(paste0("R", rep(1:6)),paste0("R", rep(1:9)),paste0("R", rep(1:6))),
value=c(15,16,12,12,14,5,14,27,20,23,14,10,20,22,14,15,18,14,23,30,32),
pp=c("AT","BT","CT","AA","CC","SE","DN","AS","MM","XT","QQ","HH","MK","HT","dd","US","AG","TT","ZZ","XK","RU"),
clusters=c(rep("cluster1",6),rep("cluster2",9),rep("cluster3",6)))
colors <- c(rep("#74c1e8",6),rep("#808000",9),rep("#FF69B4",6))
names(colors) <- c("cluster1","cluster2","cluster3")
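# factor levels follow the data order, so each cluster stays contiguous on the y axis
ggplot(dat1, aes(y = factor(pp, levels = unique(pp)), x = subgroup)) + geom_point(aes(size=value)) + facet_grid(~group, scales="free_x", space = "free")+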
ylab("names") +
xlab(" ") +
theme(axis.text.y = element_text(color=colors)) +
geom_hline(yintercept = 15.5, color = "white", size = 2) +
geom_hline(yintercept = 6.5, color = "white", size = 2)
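If real panel gaps are preferred over white lines, one possible variation is to facet on the cluster as well, so that each cluster gets its own row of panels. This is only a sketch: it assumes the clusters column from the example data, it does not reproduce the per-cluster label colours, and the spacing value is an arbitrary choice.

```r
library(ggplot2)

dat1 <- data.frame(
  group    = rep(c("a", "b", "c"), times = c(6, 9, 6)),
  subgroup = paste0("R", c(1:6, 1:9, 1:6)),
  value    = c(15, 16, 12, 12, 14, 5, 14, 27, 20, 23, 14, 10, 20, 22, 14,
               15, 18, 14, 23, 30, 32),
  pp       = c("AT", "BT", "CT", "AA", "CC", "SE", "DN", "AS", "MM", "XT",
               "QQ", "HH", "MK", "HT", "dd", "US", "AG", "TT", "ZZ", "XK",
               "RU"),
  clusters = rep(c("cluster1", "cluster2", "cluster3"), times = c(6, 9, 6))
)

p <- ggplot(dat1, aes(x = subgroup, y = pp)) +
  geom_point(aes(size = value)) +
  # one row of panels per cluster; free scales drop unused labels
  facet_grid(clusters ~ group, scales = "free", space = "free") +
  labs(x = " ", y = "names") +
  # vertical gap between the cluster rows (arbitrary value)
  theme(panel.spacing.y = unit(1, "lines"))
p
```

In this example every cluster belongs to exactly one group, so only the diagonal panels contain points; with data where clusters span several groups the grid fills up accordingly.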
I'm working on some data on party polarization (something like this) and used geom_dumbbell from ggalt and ggplot2. I keep getting the same aes error and other solutions in the forum did not address this as effectively. This is my sample data.
df <- data_frame(policy=c("Not enough restrictions on gun ownership", "Climate change is an immediate threat", "Abortion should be illegal"),
Democrats=c(0.54, 0.82, 0.30),
Republicans=c(0.23, 0.38, 0.40),
diff=sprintf("+%d", as.integer((Democrats-Republicans)*100)))
I wanted to keep order of the plot, so converted policy to factor and wanted % to be shown only on the first line.
df <- arrange(df, desc(diff))
df$policy <- factor(df$policy, levels=rev(df$policy))
percent_first <- function(x) {
x <- sprintf("%d%%", round(x*100))
x[2:length(x)] <- sub("%$", "", x[2:length(x)])
x
}
Then I used ggplot that rendered something close to what I wanted.
gg2 <- ggplot()
gg2 <- gg + geom_segment(data = df, aes(y=country, yend=country, x=0, xend=1), color = "#b2b2b2", size = 0.15)
# making the dumbbell
gg2 <- gg + geom_dumbbell(data=df, aes(y=country, x=Democrats, xend=Republicans),
size=1.5, color = "#B2B2B2", point.size.l=3, point.size.r=3,
point.color.l = "#9FB059", point.color.r = "#EDAE52")
I then wanted the dumbbell to read Democrat and Republican on top to label the two points (like this). This is where I get the error.
gg2 <- gg + geom_text(data=filter(df, country=="Government will not control gun violence"),
aes(x=Democrats, y=country, label="Democrats"),
color="#9fb059", size=3, vjust=-2, fontface="bold", family="Calibri")
gg2 <- gg + geom_text(data=filter(df, country=="Government will not control gun violence"),
aes(x=Republicans, y=country, label="Republicans"),
color="#edae52", size=3, vjust=-2, fontface="bold", family="Calibri")
Any thoughts on what I might be doing wrong?
I think it would be easier to build your own "dumbbells" with geom_path() and geom_point(). Working with your df and changing the variable references "country" to "policy":
library(tidyverse)
# gather data into long form to make ggplot happy
df2 <- gather(df,"party", "value", Democrats:Republicans)
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
# our dumbbell
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
# the text labels
geom_text(aes(label = party), vjust = -1.5) + # use vjust to shift the text up so it does not overlap the points
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red")) + # named vector to map colors to values in df2
scale_x_continuous(limits = c(0,1), labels = scales::percent) # use library(scales) nice math instead of pasting
Produces this plot:
Which has some overlapping labels. I think you could avoid that if you use just the first letter of party like this:
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
geom_text(aes(label = gsub("^(\\D).*", "\\1", party)), vjust = -1.5) + # just the first letter instead
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red"),
guide = "none") +
scale_x_continuous(limits = c(0,1), labels = scales::percent)
Only label the top issue with names:
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
geom_text(data = filter(df2, policy == "Not enough restrictions on gun ownership"),
aes(label = party), vjust = -1.5) +
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red")) +
scale_x_continuous(limits = c(0,1), labels = scales::percent)
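For reference, the long-form reshape that all three plots rely on can also be written with pivot_longer(), which the tidyr documentation recommends over gather() for new code. The sketch below rebuilds the sample data as a plain data frame without the diff column, which is an assumption made to keep the example short.

```r
library(tidyr)

df <- data.frame(
  policy = c("Not enough restrictions on gun ownership",
             "Climate change is an immediate threat",
             "Abortion should be illegal"),
  Democrats   = c(0.54, 0.82, 0.30),
  Republicans = c(0.23, 0.38, 0.40)
)

# one row per policy and party, which is the shape ggplot expects
df2 <- pivot_longer(df, cols = c(Democrats, Republicans),
                    names_to = "party", values_to = "value")
print(df2)
```

The rows come out in a different order than with gather(), but the plots above do not depend on row order because geom_path() is grouped by policy.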
Following guides like ggplot Donut chart I am trying to draw small gauges, doughnuts with a label in the middle, with the intention to put them later on on a map.
If the value reaches a certain threshold, I would like the fill of the doughnut to change to red. Is it possible to achieve this with if_else? It would be the most natural way, but it does not work.
library(tidyverse)
df <- tibble(ID=c("A","B"),value=c(0.7,0.5)) %>% gather(key = cat,value = val,-ID)
ggplot(df, aes(x = val, fill = cat)) + scale_fill_manual(aes,values = c("red", "yellow"))+
geom_bar(position="fill") + coord_polar(start = 0, theta="y")
ymax <- max(df$val)
ymin <- min(df$val)
p2 = ggplot(df, aes(fill=cat, y=0, ymax=1, ymin=val, xmax=4, xmin=3)) +
geom_rect(colour="black",stat = "identity") +
scale_fill_manual(values = if_else (val > 0.5, "red", "black")) +
geom_text( aes(x=0, y=0, label= scales::percent (1-val)), position = position_dodge(0.9))+
coord_polar(theta="y") +
xlim(c(0, 4)) +
theme_void() +
theme(legend.position="none") +
scale_y_reverse() + facet_wrap(facets = "ID")
The scale_fill_manual(values = if_else(...)) part does not work. The error says: Error in if_else(val > 0.5, "red", "black") : object 'val' not found. Is it my error, or is there another solution?
I also realise my code is not optimal. Initially gather was there because I expected more variables to be included in the plot, but I failed to stack one variable on top of the other. Now one variable should be enough to indicate the percentage of completion, so parts of my code are redundant. Can you help me out?
A solution for the color problem is to first create a variable in the data and then use that to map the color in the plot:
df <- tibble(ID=c("A","B"),value=c(0.7,0.5)) %>% gather(key = cat,value = val,-ID) %>%
mutate(color = if_else(val > 0.5, "red", "black"))
p2 = ggplot(df, aes(fill=color, y=0, ymax=1, ymin=val, xmax=4, xmin=3)) +
geom_rect(colour="black",stat = "identity") +
scale_fill_manual(values = c(`red` = "red", `black` = "black")) +
geom_text( aes(x=0, y=0, label= scales::percent (1-val)), position = position_dodge(0.9))+
coord_polar(theta="y") +
xlim(c(0, 4)) +
theme_void() +
theme(legend.position="none") +
scale_y_reverse() + facet_wrap(facets = "ID")
The result would be:
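As a variation on the answer above, when the helper column already contains valid colour names, scale_fill_identity() can use them directly instead of a manual scale. This sketch also drops the gather step, on the assumption that a single value per ID is enough, and uses base ifelse() in place of if_else() only to keep the dependencies small.

```r
library(ggplot2)

df <- data.frame(ID = c("A", "B"), val = c(0.7, 0.5))
# the fill colour is decided in the data, not in the scale
df$color <- ifelse(df$val > 0.5, "red", "black")

p2 <- ggplot(df, aes(fill = color, ymax = 1, ymin = val,
                     xmax = 4, xmin = 3)) +
  geom_rect(colour = "black") +
  # use the strings in `color` as the actual fill colours
  scale_fill_identity() +
  geom_text(aes(x = 0, y = 0, label = scales::percent(1 - val))) +
  coord_polar(theta = "y") +
  xlim(c(0, 4)) +
  theme_void() +
  scale_y_reverse() +
  facet_wrap(facets = "ID")
p2
```

The threshold comparison is strict, so a value of exactly 0.5 stays black, as in the answer above.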