I am plotting the following data using geom_tile and geom_text in ggplot2:
mydf
Var1 Var2 dc1 bin
1 H G 0.93333333 0
2 G H 0.06666667 1
3 I G 0.80000000 0
4 G I 0.20000000 1
5 J G 0.33333333 1
6 G J 0.66666667 0
7 K G 0.57894737 1
8 G K 0.42105263 0
9 I H 0.80000000 0
10 H I 0.20000000 1
11 J H 0.25000000 0
12 H J 0.75000000 1
13 K H 0.20000000 0
14 H K 0.80000000 1
15 J I 0.12500000 0
16 I J 0.87500000 1
17 K I 0.32000000 0
18 I K 0.68000000 1
19 K J 0.28571429 0
20 J K 0.71428571 1
I am plotting 'Var1' vs 'Var2', and then using the 'bin' variable as my geom_text. Currently, I have filled each tile based upon scale_fill_gradient using the variable 'dc1'.
### Plotting
ggplot(mydf, aes(Var2, Var1, fill = dc1)) +
geom_tile(colour="gray20", size=1.5, family="bold", stat="identity", height=1, width=1) +
geom_text(data=mydf, aes(Var2, Var1, label = bin), color="black", size=rel(4.5)) +
scale_fill_gradient(low = "white", high = "firebrick3", space = "Lab", na.value = "gray20",
guide = "colourbar") +
scale_x_discrete(expand = c(0, 0)) +
scale_y_discrete(expand = c(0, 0)) +
xlab("") +
ylab("") +
theme(axis.text.x = element_text(vjust = 1),
axis.text.y = element_text(hjust = 0.5),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_rect(fill=NA,color="gray20", size=0.5, linetype="solid"),
axis.line = element_blank(),
axis.ticks = element_blank(),
axis.text = element_text(color="white", size=rel(1.5)),
panel.background = element_rect(fill="gray20"),
plot.background = element_rect(fill="gray20"),
legend.position = "none"
)
Which gives this:
What I am trying to do (unsuccessfully) is to make the fill conditional upon the 'bin' variable. If bin==1 then I would like to fill according to 'dc1'. If bin==0 then I would like to fill with 'white'.
This would give the following which I have manually created as an example desired plot:
I tried messing around with scale_fill_gradient to try and introduce a second fill option, but cannot seem to figure this out. Thanks for any help/pointers.
This is the dput for mydf:
structure(list(Var1 = structure(c(4L, 5L, 3L, 5L, 2L, 5L, 1L,
5L, 3L, 4L, 2L, 4L, 1L, 4L, 2L, 3L, 1L, 3L, 1L, 2L), .Label = c("K",
"J", "I", "H", "G"), class = "factor"), Var2 = structure(c(1L,
2L, 1L, 3L, 1L, 4L, 1L, 5L, 2L, 3L, 2L, 4L, 2L, 5L, 3L, 4L, 3L,
5L, 4L, 5L), .Label = c("G", "H", "I", "J", "K"), class = "factor"),
dc1 = c(0.933333333333333, 0.0666666666666667, 0.8, 0.2,
0.333333333333333, 0.666666666666667, 0.578947368421053,
0.421052631578947, 0.8, 0.2, 0.25, 0.75, 0.2, 0.8, 0.125,
0.875, 0.32, 0.68, 0.285714285714286, 0.714285714285714),
bin = c(0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0,
1, 0, 1)), .Names = c("Var1", "Var2", "dc1", "bin"), row.names = c(NA,
-20L), class = "data.frame")
Perhaps replace fill = dc1 with fill = dc1 * bin? A stripped-down version of your code:
ggplot(data = mydf, aes(x = Var2, y = Var1, fill = dc1 * bin, label = bin)) +
geom_tile() +
geom_text() +
scale_fill_gradient(low = "white", high = "firebrick3")
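One caveat with multiplying: tiles with bin == 0 get the value 0, so they share the low end of the gradient with small dc1 values. A possible variation, sketched here on the first six rows of mydf, sets those tiles to NA and lets na.value paint them white. The `demo` data frame and the `fill_val` column are names made up for this illustration, and `limits = c(0, 1)` assumes dc1 is a proportion.

```r
library(ggplot2)

# First six rows of the question's mydf (dc1 rounded), rebuilt by hand
demo <- data.frame(
  Var1 = c("H", "G", "I", "G", "J", "G"),
  Var2 = c("G", "H", "G", "I", "G", "J"),
  dc1  = c(0.9333, 0.0667, 0.8, 0.2, 0.3333, 0.6667),
  bin  = c(0, 1, 0, 1, 1, 0)
)

# Keep dc1 where bin == 1; use NA elsewhere so na.value colours those tiles
# (fill_val is a hypothetical helper column, not part of the original data)
demo$fill_val <- ifelse(demo$bin == 1, demo$dc1, NA)

p <- ggplot(demo, aes(Var2, Var1, fill = fill_val, label = bin)) +
  geom_tile(colour = "gray20") +
  geom_text() +
  scale_fill_gradient(low = "white", high = "firebrick3",
                      limits = c(0, 1), na.value = "white")
p
```

If pure white for bin == 0 should stay distinguishable from low dc1 values, changing `low` to a pale tint rather than "white" may help.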
I need help adding colors to ggplot objects (specifically geom_bar).
Here is my data
Names Family Groups Values
H.sapiens A G1 2
H.erectus A G1 6
H.erectus B G2 12
M.griseus C G2 3
A.mellifera D G3 3
L.niger D G3 8
H.erectus D G3 2
L.niger A G1 3
L.niger B G2 3
A.mellifera A G1 8
So far I have succeeded in creating this plot with this code:
library(ggplot2)
library(ggstance)
library(ggthemes)
ggplot(table, aes(fill=Family, y=Names, x=Values)) +
geom_barh(stat="identity",colour="white")+ theme_minimal() +
scale_x_continuous(limits = c(0,60), expand = c(0, 0))
Now I would like to change the color depending on Groups. More precisely, I would like to choose a main color for each group, for instance G1 = blue, G2 = green, G3 = red,
and for each Family to get a gradient within these colors. For instance, B will be dark blue and C light blue.
Does anyone have an idea, please?
Here are the data :
dput(table)
structure(list(Names = structure(c(3L, 2L, 2L, 5L, 1L, 4L, 2L,
4L, 4L, 1L), .Label = c("A.mellifera", "H.erectus", "H.sapiens",
"L.niger", "M.griseus"), class = "factor"), Family = structure(c(1L,
1L, 2L, 3L, 4L, 4L, 4L, 1L, 2L, 1L), .Label = c("A", "B", "C",
"D"), class = "factor"), Groups = structure(c(1L, 1L, 2L, 2L,
3L, 3L, 3L, 1L, 2L, 1L), .Label = c("G1", "G2", "G3"), class = "factor"),
Values = c(2L, 6L, 12L, 3L, 3L, 8L, 2L, 3L, 3L, 8L)), class = "data.frame", row.names = c(NA,
-10L))
You may be able to tweak this one to suit your requirements (I have changed your sample data a bit to show different shades within the same Group):
df <- read.table(header = TRUE, text = "Names Family Groups Values
H.sapiens A G1 2
H.erectus B G1 6
H.erectus B G2 12
M.griseus C G2 3
A.mellifera D G3 3
L.niger D G3 8
H.erectus A G3 2
L.niger A G1 3
L.niger B G2 3
A.mellifera C G1 8")
library(tidyverse)
df %>% ggplot() +
geom_col(aes(x = Names, y = Values, fill = Groups, alpha = as.integer(as.factor(Family)))) +
coord_flip() +
scale_fill_manual(name = "Groups", values = c("blue", "green", 'red')) +
scale_alpha_continuous(name = "Family", range = c(0.2, 0.7)) +
theme_classic()
Created on 2021-06-12 by the reprex package (v2.0.0)
We can create range of colours for each Group then match on order of Family. You might need to play around with colours to make the difference more prominent:
cols <- lapply(list(G1 = c("darkblue", "lightblue"),
G2 = c("darkgreen", "lightgreen"),
G3 = c("red4", "red")),
function(i) colorRampPalette(i)(length(unique(table$Family))))
table$col <- mapply(function(g, i) cols[[ g ]][ i ],
g = table$Groups, i = as.numeric(table$Family))
ggplot(table, aes(x = Values, y = Names, fill = col )) +
geom_barh(stat = "identity", colour = "white") +
scale_x_continuous(limits = c(0, 60), expand = c(0, 0)) +
scale_fill_identity() +
theme_minimal()
Hello, I have a data frame such as:
tab
X molecule gene start_gene end_gene start_scaff end_scaff strand direction COL1 COL2
1 7 scaffold_1254 G7 6708 11967 1 20072 backward -1 10 20
2 5 scaffold_7638 G5 9567 10665 1 15336 backward -1 18 1
3 4 scaffold_7638 G4 3456 4479 1 15336 forward 1 18 1
4 2 scaffold_15158 G2 10105 10609 1 13487 backward -1 5 9
5 6 scaffold_8315 G6 2760 3849 1 10827 forward 1 25 7
6 3 scaffold_7180 G3 9814 10132 1 10155 backward -1 21 9
7 1 scaffold_74038 G1 1476 2010 1 2010 forward 1 8 34
So far, with this code:
ggplot(tab, aes(x = start_scaff, xend = end_scaff,
y = molecule, yend = molecule)) +
geom_segment(size = 3, col = "grey80") +
geom_segment(aes(x = ifelse(direction == 1, start_gene, end_gene),
xend = ifelse(direction == 1, end_gene, start_gene)),
data = tab,
arrow = arrow(length = unit(0.1, "inches")), size = 2) +
geom_text_repel(aes(x = start_gene, y = molecule, label = gene),
data = tab, nudge_y = 0.5,size=2) +
scale_y_discrete(limits = rev(levels(tab$molecule))) +
theme_minimal()
I managed to get this plot:
and I wondered whether there is a way to add text next to each geom_segment showing the COL1 and COL2 values, with the text colored by a threshold: green for values > 10, red for values <= 10,
and get something like
dput(tab)
structure(list(X = c(7L, 5L, 4L, 2L, 6L, 3L, 1L), molecule = structure(c(1L,
5L, 5L, 2L, 6L, 3L, 4L), .Label = c("scaffold_1254", "scaffold_15158",
"scaffold_7180", "scaffold_74038", "scaffold_7638", "scaffold_8315"
), class = "factor"), gene = structure(c(7L, 5L, 4L, 2L, 6L,
3L, 1L), .Label = c("G1", "G2", "G3", "G4", "G5", "G6", "G7"), class = "factor"),
start_gene = c(6708L, 9567L, 3456L, 10105L, 2760L, 9814L,
1476L), end_gene = c(11967L, 10665L, 4479L, 10609L, 3849L,
10132L, 2010L), start_scaff = c(1L, 1L, 1L, 1L, 1L, 1L, 1L
), end_scaff = c(20072L, 15336L, 15336L, 13487L, 10827L,
10155L, 2010L), strand = structure(c(1L, 1L, 2L, 1L, 2L,
1L, 2L), .Label = c("backward", "forward"), class = "factor"),
direction = c(-1L, -1L, 1L, -1L, 1L, -1L, 1L), COL1 = c(10L,
18L, 18L, 5L, 25L, 21L, 8L), COL2 = c(20L, 1L, 1L, 9L, 7L,
9L, 34L)), class = "data.frame", row.names = c(NA, -7L))
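An approximation would be the following (it assumes ggplot2 is loaded, along with ggrepel for geom_text_repel and dplyr for mutate):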
ggplot(tab, aes(x = start_scaff, xend = end_scaff,
y = molecule, yend = molecule)) +
geom_segment(size = 3, col = "grey80") +
geom_segment(aes(x = ifelse(direction == 1, start_gene, end_gene),
xend = ifelse(direction == 1, end_gene, start_gene)),
data = tab,
arrow = arrow(length = unit(0.1, "inches")), size = 2) +
geom_text_repel(aes(x = start_gene, y = molecule, label = gene),
data = tab, nudge_y = 0.5,size=2) +
scale_y_discrete(limits = rev(levels(tab$molecule))) +
theme_minimal() +
# TRUE (value <= 10) maps to "red", FALSE (value > 10) to "darkgreen", as asked
geom_text(data = mutate(tab, COLr1 = COL1 <= 10), aes(color = COLr1, label = COL1), position = position_nudge(x=20000)) +
geom_text(data = mutate(tab, COLr2 = COL2 <= 10), aes(color = COLr2, label = COL2), position = position_nudge(x=22000)) +
geom_text(data = mutate(tab, txt = "-"), aes(label = txt), position = position_nudge(x=21100)) +
scale_color_manual(values = c("darkgreen", "red")) +
xlim(c(NA,23000)) +
theme(legend.position = "none")
I have the following data frame:
library(tidyverse)
library(directlabels)
dat <- structure(list(time.course = c("CONTROL", "DAY03", "DAY06", "DAY09",
"DAY12", "DAY15", "CONTROL", "DAY03", "DAY06", "DAY09", "DAY12",
"DAY15", "CONTROL", "DAY03", "DAY06", "DAY09", "DAY12", "DAY15",
"CONTROL", "DAY03", "DAY06", "DAY09", "DAY12", "DAY15"), log_delta = c(0,
0.620163956872191, 0.97251217133899, 0.788819459139427, 0.412543422847407,
0.401621905837411, 0, -0.168711062429047, -0.973481367557294,
-1.46433243027353, -1.34771037206345, -1.77709667157235, 0, -0.187344700204557,
-0.254280909246003, -0.335330756378048, -0.655121382977672, -1.1733031812697,
0, -0.0160729795971869, -0.628563089917479, -1.43060414378064,
-1.466051599194, -2.57510172892555), `UMAP cluster` = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L, 7L, 7L, 7L, 7L, 7L,
7L, 13L, 13L, 13L, 13L, 13L, 13L), .Label = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15"
), class = "factor"), cell_name = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L), .Label = c("Macrophage", "Enteroendocrine",
"Endothelial", "Lymphatic", "Fibroblast", "T cell", "Myofibroblast",
"Absorbtice & secrectory cell", "Plasmacytoid DC", "Neutrophil",
"Plasma cell", "Cajal intestinal cell", "Glial cell", "Germinal center B cell"
), class = "factor")), row.names = c(NA, -24L), class = c("tbl_df",
"tbl", "data.frame"))
dat
It looks like this:
# A tibble: 24 x 4
time.course log_delta `UMAP cluster` cell_name
<chr> <dbl> <fct> <fct>
1 CONTROL 0 1 Macrophage
2 DAY03 0.620 1 Macrophage
3 DAY06 0.973 1 Macrophage
4 DAY09 0.789 1 Macrophage
5 DAY12 0.413 1 Macrophage
6 DAY15 0.402 1 Macrophage
7 CONTROL 0 5 Fibroblast
8 DAY03 -0.169 5 Fibroblast
9 DAY06 -0.973 5 Fibroblast
10 DAY09 -1.46 5 Fibroblast
11 DAY12 -1.35 5 Fibroblast
12 DAY15 -1.78 5 Fibroblast
13 CONTROL 0 7 Myofibroblast
14 DAY03 -0.187 7 Myofibroblast
15 DAY06 -0.254 7 Myofibroblast
16 DAY09 -0.335 7 Myofibroblast
17 DAY12 -0.655 7 Myofibroblast
18 DAY15 -1.17 7 Myofibroblast
19 CONTROL 0 13 Myofibroblast
20 DAY03 -0.0161 13 Myofibroblast
21 DAY06 -0.629 13 Myofibroblast
22 DAY09 -1.43 13 Myofibroblast
23 DAY12 -1.47 13 Myofibroblast
24 DAY15 -2.58 13 Myofibroblast
Notice that "Myofibroblast" occurs twice as UMAP cluster 7 and 13.
I tried to plot that using this code with directlabels package:
ggplot(dat, aes(x = time.course, y = log_delta,
color = cell_name)) +
geom_line(aes(group = `UMAP cluster`)) +
scale_x_discrete(expand = c(0, 2.5)) +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
xlab("") +
ylab("log(proportion / control proportion)") +
theme(legend.title = element_blank()) +
geom_dl(aes(label = cell_name), method = list(dl.trans(x = x + 0.3), "last.bumpup", cex = 0.8))
The plot looks like this:
Notice that Myofibroblast doesn't occur at the end of the two lines (blue).
What I want to do is:
1. color the two Myofibroblast lines with two different colors;
2. tag each of the Myofibroblast lines with its own label.
How can I achieve that?
It's not possible to do it with geom_dl as is, because it inherits the aes and takes the label directly from the column variable. I can think of two solutions. The first is to create a new variable by fusing the cluster id with the cell type:
# the column name `UMAP cluster` contains a space, which gives some problems, so rename it
colnames(dat)[3] = "UMAPcluster"
dat <- dat %>% mutate(new=paste(cell_name,UMAPcluster))
dat$new <- factor(dat$new,levels=unique(dat$new))
ggplot(dat, aes(x = time.course, y = log_delta, group = new, col = new)) +
geom_line() +
scale_x_discrete(expand = c(0, 2.5)) +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
xlab("") +
ylab("log(proportion / control proportion)") +
theme(legend.title = element_blank()) +
geom_dl(aes(label = new), method = list(dl.trans(x = x + 0.3), "last.bumpup", cex = 0.8))
Or you create a new data frame and annotate with geom_text (or geom_label if you like boxes). Preferably you keep the legend for the cluster so that it is clear what the colors mean.
LAB = dat %>% group_by(UMAPcluster) %>% top_n(1,wt=time.course)
ggplot(dat, aes(x = time.course, y = log_delta,color = UMAPcluster)) +
geom_line(aes(group = UMAPcluster)) +
scale_x_discrete(expand = c(0, 2.5)) +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
xlab("") +
ylab("log(proportion / control proportion)") +
geom_label(data=LAB,aes(label=cell_name),show.legend=FALSE,nudge_x=0.7)
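A side note on building the label data frame: top_n() is superseded in current dplyr, and slice_max() is the suggested replacement. Here is a minimal sketch on a toy stand-in for dat, assuming dplyr >= 1.0.0. The `toy` and `lab` names and the shortened time course are made up for illustration; only the idea of picking the last time point per cluster comes from the answer above.

```r
library(dplyr)

# Toy stand-in for `dat`: two clusters that share one cell name
toy <- data.frame(
  time.course = rep(c("CONTROL", "DAY03", "DAY06"), times = 2),
  log_delta   = c(0, -0.187, -0.254, 0, -0.016, -0.629),
  UMAPcluster = factor(rep(c("7", "13"), each = 3)),
  cell_name   = "Myofibroblast"
)

# One row per cluster: the row whose time.course sorts last.
# This relies on the labels sorting alphabetically in time order.
lab <- toy %>%
  group_by(UMAPcluster) %>%
  slice_max(time.course, n = 1, with_ties = FALSE) %>%
  ungroup()

lab
```

The resulting `lab` can then be passed as the `data` argument of geom_label or geom_text, in the same way as LAB above.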
I have been trying to plot the results of an lsmeans model, where boxes indicate the LS mean, error bars indicate the 95% confidence interval of the LS mean, and means sharing a letter are not significantly different. I would like to plot the following table cld.mixed.lme with ggplot2:
dput(cld.mixed.lme)
structure(list(hor = structure(c(3L, 3L, 3L, 1L, 1L, 1L, 2L,
2L, 2L), .Label = c("L", "F", "H"), class = "factor"), managem = structure(c(1L,
3L, 2L, 3L, 1L, 2L, 1L, 2L, 3L), .Label = c("WTH", "CH", "CHF"
), class = "factor"), response = c(23.6794086785122, 23.8174295982324,
24.4481975946679, 27.7814605969773, 28.6059616644958, 28.7459261527063,
37.1161977750334, 40.0618072489354, 40.062016186989), SE = c(2.47194303396734,
2.47194303396734, 2.47194303396734, 2.47194303396734, 2.47194303396734,
2.47194303396734, 2.47194303396734, 2.47194303396734, 2.47194303396734
), df = c(12.8849763292624, 12.8849763292851, 12.8849763290692,
12.8849763293197, 12.8849763292728, 12.8849763291023, 12.8849763292846,
12.88497632933, 12.8849763292846), lower.CL = c(15.4642399103678,
15.602260830088, 16.2330288265235, 19.5662918288329, 20.3907928963513,
20.5307573845618, 28.901029006889, 31.846638480791, 31.8468474188446
), upper.CL = c(31.8945774466566, 32.0325983663769, 32.6633663628123,
35.9966293651217, 36.8211304326402, 36.9610949208507, 45.3313665431779,
48.2769760170799, 48.2771849551334), .group = c("a", "ab", "ab",
"abc", "abcde", "abd", "bcde", "ce", "de")), .Names = c("hor",
"managem", "response", "SE", "df", "lower.CL", "upper.CL", ".group"
), row.names = c(8L, 5L, 2L, 6L, 9L, 3L, 7L, 1L, 4L), class = "data.frame")
it looks like this:
----------------------------------------------------------------------------------
hor managem response SE df lower.CL upper.CL .group
-------- ----- --------- ---------- ------- ------- ---------- ---------- --------
**8** H WTH 23.68 2.472 12.88 15.46 31.89 a
**5** H CHF 23.82 2.472 12.88 15.6 32.03 ab
**2** H CH 24.45 2.472 12.88 16.23 32.66 ab
**6** L CHF 27.78 2.472 12.88 19.57 36 abc
**9** L WTH 28.61 2.472 12.88 20.39 36.82 abcde
**3** L CH 28.75 2.472 12.88 20.53 36.96 ab d
**7** F WTH 37.12 2.472 12.88 28.9 45.33 bcde
**1** F CH 40.06 2.472 12.88 31.85 48.28 c e
**4** F CHF 40.06 2.472 12.88 31.85 48.28 de
---------------------------------------------------------------------------------
After running the following code, the plot is displayed, but the .group letters are mismatched: they fall on the wrong responses.
Example in the resulting plot: under hor = L, managem = WTH I get the .group letters "abc" instead of "abcde" (which falls under managem = CH instead).
Here is the code:
library(ggplot2)
pd = position_dodge(0.7)
plot.mixed.lme<-ggplot(cld.mixed.lme,aes(x = hor,y=response, color=managem, label=.group))+
theme_bw()+
geom_point(shape = 15, size = 4, position = pd) +
geom_errorbar(aes(ymin = lower.CL,ymax = upper.CL),width = 0.2,size = 0.7,position = pd) +
theme(axis.title = element_text(face = "bold"),
axis.text = element_text(face = "bold"),
plot.caption = element_text(hjust = 0)) +
geom_text(nudge_x = c(-0.3, 0, 0.3, -0.3, 0, 0.3,-0.3, 0, 0.3),
nudge_y = c(4.5, 4.5, 4.5,4.5, 4.5, 4.5,4.5, 4.5, 4.5),
color = "black")
plot.mixed.lme
Here is the resulting plot:
I welcome any suggestions and many thanks in advance for your help,
BAlpine
I found a workaround, but it is time-consuming. Basically, I reordered the nudge_x values in
geom_text to suit the row order of the table:
geom_text(nudge_x = c(-0.3, 0.3, 0, 0.3, -0.3, 0,-0.3, 0, 0.3),
Any idea how to match it automatically?
Many thanks
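One way this might be matched automatically is to let the labels use the same position_dodge as the points, instead of hand-ordered nudge_x values. The sketch below rebuilds the table from the rounded values shown in the question (the `cld` name and the `+ 1.5` vertical offset are choices made for this illustration). The key assumption is that `group = managem` must be mapped explicitly: once colour is fixed to "black" in geom_text, the colour mapping no longer defines the groups for that layer.

```r
library(ggplot2)

# Rounded values copied from the cld.mixed.lme table in the question
cld <- data.frame(
  hor      = factor(rep(c("H", "L", "F"), each = 3), levels = c("L", "F", "H")),
  managem  = factor(c("WTH", "CHF", "CH", "CHF", "WTH", "CH", "WTH", "CH", "CHF"),
                    levels = c("WTH", "CH", "CHF")),
  response = c(23.68, 23.82, 24.45, 27.78, 28.61, 28.75, 37.12, 40.06, 40.06),
  lower.CL = c(15.46, 15.60, 16.23, 19.57, 20.39, 20.53, 28.90, 31.85, 31.85),
  upper.CL = c(31.89, 32.03, 32.66, 36.00, 36.82, 36.96, 45.33, 48.28, 48.28),
  .group   = c("a", "ab", "ab", "abc", "abcde", "abd", "bcde", "ce", "de")
)

pd <- position_dodge(0.7)

# group = managem is shared by all layers, so every layer dodges the same way
p <- ggplot(cld, aes(x = hor, y = response, colour = managem, group = managem)) +
  geom_point(shape = 15, size = 4, position = pd) +
  geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL),
                width = 0.2, position = pd) +
  # Labels sit just above each upper confidence limit, dodged like the points
  geom_text(aes(y = upper.CL + 1.5, label = .group),
            colour = "black", position = pd) +
  theme_bw()
p
```

Because the horizontal placement now comes from the data rather than from the row order, re-sorting the table should not move the letters to the wrong bars.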
I have data of the form:
Day A B
1 1 4
1 2 5
1 3 6
2 2 2
2 3 4
2 5 6
3 6 7
3 4 6
And I would like to display this on a single chart, with Day along the x-axis, and with each x-position having a boxplot for each of A and B (colour coded).
Here's a (slight) modification of an example from the ?boxplot help page. The examples show off many common uses of the functions.
tg <- data.frame(
dose=ToothGrowth$dose[1:30],
A=ToothGrowth$len[1:30],
B=ToothGrowth$len[31:60]
)
head(tg)
# dose A B
# 1 0.5 4.2 15.2
# 2 0.5 11.5 21.5
# 3 0.5 7.3 17.6
# 4 0.5 5.8 9.7
# 5 0.5 6.4 14.5
# 6 0.5 10.0 10.0
boxplot(A ~ dose, data = tg,
boxwex = 0.25, at = 1:3 - 0.2,
col = "yellow",
main = "Guinea Pigs' Tooth Growth",
xlab = "Vitamin C dose mg",
ylab = "tooth length",
xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = "i")
boxplot(B ~ dose, data = tg, add = TRUE,
boxwex = 0.25, at = 1:3 + 0.2,
col = "orange")
legend(2, 9, c("A", "B"),
fill = c("yellow", "orange"))
Try:
ddf = structure(list(Day = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L), A = c(1L,
2L, 3L, 2L, 3L, 5L, 6L, 4L), B = c(4L, 5L, 6L, 2L, 4L, 6L, 7L,
6L)), .Names = c("Day", "A", "B"), class = "data.frame", row.names = c(NA,
-8L))
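library(reshape2)  # for melt()
library(ggplot2)
# reshape to long format: one row per Day/variable/value
mm = melt(ddf, id='Day')
ggplot(mm)+geom_boxplot(aes(x=factor(Day), y=value, fill=variable))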
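If reshape2 is not available, the same long-format reshape can probably be done with tidyr instead. This is a sketch under the assumption that tidyr >= 1.0.0 is installed. The `long` name is made up here, and the "variable"/"value" column names are chosen only to mirror what melt() produces.

```r
library(tidyr)
library(ggplot2)

# Same data as in the question
ddf <- data.frame(
  Day = c(1, 1, 1, 2, 2, 2, 3, 3),
  A   = c(1, 2, 3, 2, 3, 5, 6, 4),
  B   = c(4, 5, 6, 2, 4, 6, 7, 6)
)

# Stack columns A and B into one value column, keeping Day as the id
long <- pivot_longer(ddf, cols = c(A, B),
                     names_to = "variable", values_to = "value")

# One box per Day and variable, colour coded by variable
p <- ggplot(long, aes(x = factor(Day), y = value, fill = variable)) +
  geom_boxplot() +
  labs(x = "Day")
p
```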