My data looks like this:
var1, var2, mean, std
1 , 2 , 3 , 4
etc..
I want to plot these as a heat map, but with a text label inside each cell in the style mean±std (i.e. mean plus-or-minus error). In the case above, the cell at var1 = 1 (column) and var2 = 2 (row) would show 3±4, and the other cells would show their own values.
It doesn't have to be a heatmap; it could be the label of a point or of a bar. I just want to generate the labels so that I get the string "mean±std" for each one, e.g. 3±4. In my case, I will be making a heatmap where the colors are based on the value of mean, as in this answer: https://stackoverflow.com/a/14290705/1504411
Thank you!
You can use plotmath in geom_text by setting parse = TRUE. Based on @beetroot's answer:
ggplot(dat) +
geom_text(aes(x = 1, y = 2.5,
label = paste(mean, std, sep = "%+-%")),
parse = TRUE)
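One caveat with parse = TRUE, sketched here with made-up values: the label string is parsed as an R expression before it is drawn, so numbers written as text lose their trailing zeros (3.10 becomes 3.1). The snippet only uses base R's parse() to show what plotmath receives; it is an illustration, not part of the answer's code.

```r
# Build the same kind of string that geom_text(parse = TRUE) would receive.
# The values "3.10" and "0.50" are made-up illustration data.
lab <- paste("3.10", "0.50", sep = "%+-%")

# parse() turns the string into a call to the plotmath operator %+-%
e <- parse(text = lab)[[1]]

print(lab)       # the raw string, trailing zeros still present
print(e)         # the parsed call; both operands are now plain numbers
print(class(e))  # "call"
```

If a fixed number of decimals matters, a plain string label containing the \u00B1 character (without parse = TRUE) is drawn exactly as written.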
You can create labels with geom_text and paste the mean and std values together with the plus-minus sign as the separator (\u00B1 is the corresponding unicode escape):
dat <- data.frame(var1 = 1, var2 = 2, mean = 3, std = 4)
ggplot(dat) +
geom_text(aes(x = 1, y = 2.5, label = paste(mean, std, sep = "\u00B1")))
Thanks to beetroot's and Roland's answers, this was the final code that worked for me (plus some bells and whistles):
p1 <- ggplot(r_output, aes(var1, var2)) +
geom_tile(aes(fill = mean)) +
geom_text(aes(label = paste(round(mean, 2), round(std, 2), sep = "\u00B1")), size = 2) +
scale_fill_gradient(low = "red", high = "blue")
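A possible variation, as a minimal sketch: build the label column once with sprintf() so every cell shows the same number of decimals (round() drops trailing zeros when pasted). The data frame below is made up and only reuses the column names from the question; the plot is drawn only if ggplot2 is installed.

```r
# Made-up data with the same column names as in the question
r_output <- data.frame(
  var1 = c(1, 1, 2, 2),
  var2 = c(1, 2, 1, 2),
  mean = c(3, 2.456, 10.1, 0.5),
  std  = c(4, 0.3, 1.25, 0.049)
)

# sprintf() keeps two decimals for every value; \u00B1 is the plus-minus sign
r_output$label <- sprintf("%.2f\u00B1%.2f", r_output$mean, r_output$std)
print(r_output$label)

# Optional plot, only attempted when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(r_output, aes(factor(var1), factor(var2))) +
    geom_tile(aes(fill = mean)) +
    geom_text(aes(label = label), size = 3) +
    scale_fill_gradient(low = "red", high = "blue")
  print(p)
}
```

The same label column could be reused for points or bars, since it is just a character vector in the data.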
I want to draw box plots for two groups (A and B) and display the mean value on each box plot.
I have 30 rows and 2 columns: each row contains a value for group A (column 1) and a value for group B (column 2).
I made a box plot with the base graphics boxplot function:
boxplot(Data_Q4$Group.A,Data_Q4$Group.B,names=c("group A","group B"))
but it seems that adding a mean point to the box plot requires ggplot2.
I tried many things, but I always get an error message:
! Aesthetics must be either length 1 or the same as the data (30): x...
It seems my problem comes from the y axis. I need it to take the data from columns A and B, but I don't know how to do this.
If my data had a value column and a group column (A or B for each row) it would work, but I don't know how to rearrange it so that I get 2 columns (value and group) and 60 rows with the values of both groups.
Then I could run dataQ4 %>% ggplot(aes(x = group, y = value)) + geom_boxplot() + stat_summary(fun.y = mean)
and I think that would be OK.
So my problem is how to rearrange my data frame so that I can use ggplot2 to draw the box plot.
Thanks for your help!
Here is my data:
dput(Data_Q4) structure(list(Group.A = c(1.25310535, 0.5546414, 0.301283, 1.29312466, 0.99455579, 0.5141743, 2.0078324, 0.42224244, 2.17877257, 3.21778902, 0.55782935, 0.59461765, 0.97739581, 0.20986658, 0.30944786, 1.10593627, 0.77418776, 0.08967408, 1.10817666, 0.24726425, 1.57198685, 4.83281274, 0.43113213, 2.73038931, 1.13683142, 0.81336825, 0.83700649, 1.7847654, 2.31247163, 2.90988727), Group.B = c(2.94928948, 0.70302878, 0.69016263, 1.25069011, 0.43649776, 0.22462232, 0.39231981, 1.5763435, 0.42792839, 0.19608026, 0.37724368, 0.07071508, 0.03962611, 0.38580831, 2.63928857, 0.78220807, 0.66454197, 0.9568569, 0.02484568, 0.21600677, 0.88031195, 0.13567357, 0.68181725, 0.20116062, 0.4834762, 0.50102846, 0.15668497, 0.71992076, 0.68549794, 0.86150777)), class = "data.frame", row.names = c(NA, -30L))
First I create some random data:
df <- data.frame(group = rep(c("A", "B"), 15),
value = runif(30, 0, 10))
You can use the following code:
library(tidyverse)
ggplot(data = df,
aes(x = group, y = value)) +
geom_boxplot() +
stat_summary(fun = mean, color = "darkred", position = position_dodge(0.75),  # fun replaces the deprecated fun.y (ggplot2 >= 3.3.0)
geom = "point", shape = 18, size = 3,
show.legend = FALSE)
Output:
The red dots represent the mean.
Using your data, you can reshape it to long format with melt() and plot it the same way:
library(tidyverse)
library(reshape)
Data_Q4 %>%
melt() %>%  # long format: columns "variable" (Group.A / Group.B) and "value"
ggplot(aes(x = variable, y = value)) +
geom_boxplot() +
stat_summary(fun = mean, color = "darkred", position = position_dodge(0.75),  # fun replaces the deprecated fun.y
geom = "point", shape = 18, size = 3,
show.legend = FALSE)
Output:
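If you would rather not load an extra package for the reshaping step, here is a rough sketch using base R's stack(), which also prints the mean as a text label on each box. The small data frame is a made-up stand-in for Data_Q4 with the same column names, and the label position (vjust) is an arbitrary choice.

```r
# A small stand-in for Data_Q4 (made-up numbers, same column names)
Data_Q4 <- data.frame(
  Group.A = c(1.25, 0.55, 0.30, 1.29, 0.99),
  Group.B = c(2.95, 0.70, 0.69, 1.25, 0.44)
)

# stack() puts all values in one column and the source column name in another
long <- stack(Data_Q4)
names(long) <- c("value", "group")

# One mean per group, formatted as a text label
means <- aggregate(value ~ group, data = long, FUN = mean)
means$label <- sprintf("%.2f", means$value)
print(means)

# Optional plot, only attempted when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = group, y = value)) +
    geom_boxplot() +
    geom_point(data = means, color = "darkred", shape = 18, size = 3) +
    geom_text(data = means, aes(label = label), vjust = -1)
  print(p)
}
```

With the real 30-row data, long would have 60 rows, which is the layout asked for in the question.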
I have an input matrix consisting of 5 columns and 12 rows.
I am trying to plot a range for the same variables (let's say width) across two methods/conditions (Paper, Estimated). I am able to plot the range for one method/condition using this code:
Input <- read.table("File.txt", header = T, sep = "\t")
ggplot(Input, aes(x=Trait))+
geom_linerange(aes(ymin=min,ymax=max),linetype=3,color="Black")+
geom_point(aes(y=min),size=3,color="darkgreen")+
geom_point(aes(y=max),size=3,color="darkgreen")+ labs(y="-log10(P)", x="Traits") +
theme_bw()
But I want to plot each variable across both methods together in the same plot. I can do this by adding an extra suffix to each variable name. Is there a nicer way to do this? I have tried shape = Method, but it's not working for me. Any help will be highly appreciated.
I would suggest mapping Method to color instead of shape. But hey, it's your plot. (; To achieve your desired result without adding a suffix, you could make use of position_dodge like so:
library(tibble)
library(ggplot2)
ggplot(Input, aes(x = Trait, shape = Method)) +
geom_linerange(aes(ymin = min, ymax = max, group = Method), linetype = 3, color = "Black", position = position_dodge(.6)) +
geom_point(aes(y = min), color = "darkgreen", size = 3, position = position_dodge(.6)) +
geom_point(aes(y = max), color = "darkgreen", size = 3, position = position_dodge(.6)) +
labs(y = "-log10(P)", x = "Traits") +
theme_bw()
DATA
set.seed(42)
Input <- tibble(
Method = rep(c("Paper", "Estimated"), each = 3),
Trait = rep(c("Width", "Density", "Length"), 2),
Count = rep(c(2, 4, 10), 2),
min = runif(6, 5, 7),
max = min + runif(6, 0, 10)
)
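For comparison, a sketch of the color variant mentioned at the top of this answer. It assumes data in the same layout as the DATA block (rebuilt here with base R so the snippet runs on its own) and shares one position_dodge object so the ranges and their end points stay aligned. Because color is a discrete aesthetic, ggplot2 uses it for grouping, so no explicit group mapping is needed.

```r
set.seed(42)
# Same layout as the example data, built with base R
Input <- data.frame(
  Method = rep(c("Paper", "Estimated"), each = 3),
  Trait  = rep(c("Width", "Density", "Length"), 2)
)
Input$min <- runif(6, 5, 7)
Input$max <- Input$min + runif(6, 0, 10)
print(Input)

# Optional plot, only attempted when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  dodge <- position_dodge(width = 0.6)  # shared so ranges and points line up
  p <- ggplot(Input, aes(x = Trait, color = Method)) +
    geom_linerange(aes(ymin = min, ymax = max), linetype = 3, position = dodge) +
    geom_point(aes(y = min), size = 3, position = dodge) +
    geom_point(aes(y = max), size = 3, position = dodge) +
    labs(y = "-log10(P)", x = "Traits") +
    theme_bw()
  print(p)
}
```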
I want to highlight text based on the position in a string, for example if we have this text:
this is a really nice informative piece of text
Then I want to draw a rectangle around, say, positions 2 to 4:
t[his] is a really nice informative piece of text
I tried to do so in ggplot2 using the following code:
library(ggplot2)
library(dplyr)
box.data <- data.frame(
start = c(4,6,5,7,10,7),
type = c('BOX1.start', 'BOX1.start', 'BOX1.start','BOX1.end', 'BOX1.end', 'BOX1.end'),
text.id = c(1,2,3,1,2,3)
)
text.data <- data.frame(
x = rep(1,3),
text.id = c(1,2,3),
text = c('Thisissomerandomrandomrandomrandomtext1',
'Thisissomerandomrandomrandomrandomtext2',
'Thisissomerandomrandomrandomrandomtext3')
)
ggplot(data = text.data, aes(x = x, y = text.id)) +
scale_x_continuous(limits = c(1, nchar(as.character(text.data$text[1])))) +
geom_text(label = text.data$text, hjust = 0, size = 3) +
geom_line(data = box.data, aes(x = start, y = text.id, group = text.id), size = 3, alpha = 0.5, colour = 'red')
This produces the following graph:
My method fails because a letter does not cover exactly one unit of the x-axis. Is there any way to achieve this?
I just figured out that I can split the string into characters and plot these one by one; perhaps it is useful for someone else.
library(ggplot2)
library(dplyr)
library(splitstackshape)
# First remember the plotting window, which equals the text length
text.size = nchar(as.character(text.data$text[1]))
# Split the string into single characters, and adjust the X-position to the string position
text.data <- cSplit(text.data, 'text', sep = '', direction = 'long', stripWhite = FALSE) %>%
group_by(text.id) %>%
mutate(x1 = seq(1,n()))
# Plot each character and add highlights
ggplot(data = text.data, aes(x = x1, y = text.id)) +
scale_x_continuous(limits = c(1, text.size)) +
geom_text(aes(label = text)) +
geom_line(data = box.data, aes(x = start, y = text.id, group = text.id), size = 3, alpha = 0.5, colour = 'red')
Which produces this plot:
Perhaps the marking should extend a little bit upwards and downwards, but that's an easy fix.
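One way that fix might look, as a sketch rather than a tested replacement: split the strings with base R's strsplit() and draw the highlight with geom_rect, padded by half a unit horizontally so it covers whole letters. The vertical padding of 0.3 is an arbitrary assumption, and chars and rects are names introduced here for illustration.

```r
# Same box.data and strings as in the question
box.data <- data.frame(
  start = c(4, 6, 5, 7, 10, 7),
  type = c("BOX1.start", "BOX1.start", "BOX1.start",
           "BOX1.end", "BOX1.end", "BOX1.end"),
  text.id = c(1, 2, 3, 1, 2, 3)
)
text <- c("Thisissomerandomrandomrandomrandomtext1",
          "Thisissomerandomrandomrandomrandomtext2",
          "Thisissomerandomrandomrandomrandomtext3")

# One row per character: strsplit() with an empty pattern splits into letters
chars <- do.call(rbind, lapply(seq_along(text), function(i) {
  letters_i <- strsplit(text[i], "")[[1]]
  data.frame(text.id = i, x1 = seq_along(letters_i), label = letters_i)
}))

# One row per box: combine start and end positions by text.id
starts <- box.data[box.data$type == "BOX1.start", c("text.id", "start")]
ends   <- box.data[box.data$type == "BOX1.end", c("text.id", "start")]
names(ends)[2] <- "end"
rects <- merge(starts, ends, by = "text.id")

# Pad so the rectangle covers whole letters (0.3 is an arbitrary height)
rects$xmin <- rects$start - 0.5
rects$xmax <- rects$end + 0.5
rects$ymin <- rects$text.id - 0.3
rects$ymax <- rects$text.id + 0.3
print(rects)

# Optional plot, only attempted when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_rect(data = rects,
              aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
              fill = "red", alpha = 0.5) +
    geom_text(data = chars, aes(x = x1, y = text.id, label = label))
  print(p)
}
```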
I am generating density plots for observations. The observations belong to a species and some are also connected to an individual ID.
With the data below, I want to generate a line for each level of IndID for species One and Two, and only a single line for Species Three, which does not include IndID. There are related questions on SO, but not with reproducible data and looking for different results.
library(ggplot2)
set.seed(1)
dat <- data.frame(Species = c(rep(c("One", "Two"), each = 2, length = 30), rep("Three",50)),
IndID = c(rep(letters[1:5],each = 6),rep(NA,50) ),
Value = sample(1:20, replace = T))
Keeping the color aesthetic on the Species level, I want to create multiple lines for Species One and Two (green and red) and a single blue line for Species Three.
ggplot(dat, aes(Value)) + geom_density(aes(color = Species), size = 1.25) +
scale_colour_manual(values = c("darkgreen","blue", "red"))
If you want to be able to tell them apart, you can set the linetype to IndID. Note, however, that you will need to change the NA to some other value to (easily) get it to plot.
I also expanded your data a little bit to give enough values per individual to show meaningful lines. I also used geom_line(stat = "density") instead of geom_density() because it omits the line along the bottom and gives legends with lines instead of boxes.
set.seed(1)
dat <- data.frame(Species = c(rep(c("One", "Two"), each = 2, length = 60), rep("Three",50)),
IndID = c(rep(letters[1:5],each = 12),rep("NA",50) ),
Value = sample(1:20, 110, replace = T))
ggplot(dat
, aes(x = Value
, color = Species
, linetype = IndID)) +
geom_line(stat = "density"
, size = 1.25) +
scale_colour_manual(values = c("darkgreen","blue", "red"))
gives
If you want the lines to all be solid, you can run:
ggplot(dat
, aes(x = Value
, color = Species
, linetype = IndID)) +
geom_line(stat = "density"
, size = 1.25) +
scale_colour_manual(values = c("darkgreen","blue", "red")) +
scale_linetype_manual(values = rep("solid", 6)) +
guides(linetype = "none")
(or use group as @Henrik suggested in zir comment)
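A sketch of what the group variant could look like, assuming the expanded data from this answer. The placeholder string "none" for the missing IDs and the column name grp are choices made here for illustration. interaction() builds one group per Species/IndID combination, so every line stays solid and no linetype legend is created.

```r
set.seed(1)
dat <- data.frame(
  Species = c(rep(c("One", "Two"), each = 2, length = 60), rep("Three", 50)),
  IndID   = c(rep(letters[1:5], each = 12), rep("none", 50)),
  Value   = sample(1:20, 110, replace = TRUE)
)

# One group per Species/IndID combination; drop = TRUE keeps only
# the combinations that actually occur in the data
dat$grp <- interaction(dat$Species, dat$IndID, drop = TRUE)
print(nlevels(dat$grp))

# Optional plot, only attempted when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = Value, color = Species, group = grp)) +
    geom_line(stat = "density") +
    scale_colour_manual(values = c(One = "darkgreen", Three = "blue", Two = "red"))
  print(p)
}
```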
I wonder if it is possible to change the main fill colour according to a categorical variable.
Here is a reproducible example:
df = data.frame(x = c(rnorm(10, mean = 0),
rnorm(10, mean = 3)),
y = c(rnorm(10, mean = 0),
rnorm(10, mean = 3)),
grp = c(rep('a', times = 10),
rep('b', times = 10)),
val = rep(1:10, times = 2))
ggplot(data = df,
aes(x = x,
y = y)) +
geom_point(pch = 21,
aes(color = grp,
fill = val,
size = val))
Of course it is easy to change the circle colour/shape, according to the variable grp, but I'd like to have the a group in shades of red and the b group in shades of blue.
I also thought about using facets, but don't know if the fill gradient can be changed for the two panels.
Does anyone know if that can be done without gridExtra?
Thanks!
I think there are two ways to do this. The first is using the alpha aesthetic for your val column. This is a quick and easy way to accomplish your goal but may not be exactly what you want:
ggplot(data = df,
aes(x = x,
y = y)) +
geom_point(pch = 21,
aes(alpha=val,
fill = grp,
size = val)) + theme_minimal()
The second way would be to do something similar to this post: Vary the color gradient on a scatter plot created with ggplot2. I edited the code slightly so it's not a range from white to your color of interest but from a lighter color to a darker color. This requires a little bit of work and uses the scale_fill_identity function, which takes a variable that already holds the colors you want and maps them directly to each point (so it doesn't do any scaling).
This code is:
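library(plyr)    # ddply
library(scales)  # rescale
# grp must be a factor so that as.numeric(x$grp) below gives the group index (1 for a, 2 for b)
df$grp <- factor(df$grp)
# Rescale val to [0,1]
df$scaled_val <- rescale(df$val)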
low_cols <- c("firebrick1","deepskyblue")
high_cols <- c("darkred","deepskyblue4")
df$col <- ddply(df, .(grp), function(x)
data.frame(col=apply(colorRamp(c(low_cols[as.numeric(x$grp)[1]], high_cols[as.numeric(x$grp)[1]]))(x$scaled_val),
1,function(x)rgb(x[1],x[2],x[3], max=255)))
)$col
df
ggplot(data = df,
aes(x = x,
y = y)) +
geom_point(pch = 21,
aes(
fill = col,
size = val)) + theme_minimal() +scale_fill_identity()
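The same idea can be sketched with base R only, which avoids the plyr and scales dependencies. This is an illustration under the assumption that each row should get a shade between its group's light and dark colour; shade is a helper name introduced here, and the seed is only there to make the example repeatable.

```r
set.seed(1)  # only to make the example repeatable
df <- data.frame(
  x   = c(rnorm(10, mean = 0), rnorm(10, mean = 3)),
  y   = c(rnorm(10, mean = 0), rnorm(10, mean = 3)),
  grp = rep(c("a", "b"), each = 10),
  val = rep(1:10, times = 2)
)

# Light and dark end of the ramp for each group, looked up by group name
low_cols  <- c(a = "firebrick1", b = "deepskyblue")
high_cols <- c(a = "darkred",    b = "deepskyblue4")

# Rescale val to [0, 1] by hand, so no extra package is needed
scaled <- (df$val - min(df$val)) / (max(df$val) - min(df$val))

# Interpolate between two colours at position s in [0, 1]
shade <- function(s, low, high) {
  rgb(colorRamp(c(low, high))(s), maxColorValue = 255)
}

# One hex colour per row, using the colours of that row's group
df$col <- mapply(shade, scaled, low_cols[df$grp], high_cols[df$grp],
                 USE.NAMES = FALSE)
print(head(df$col))

# Optional plot, only attempted when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y)) +
    geom_point(pch = 21, aes(fill = col, size = val)) +
    scale_fill_identity() +
    theme_minimal()
  print(p)
}
```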
Thanks to this other post I found a way to visualize the fill bar in the legend, even though that wasn't what I meant to do.
Here's the output:
And the code:
library(dplyr)    # %>%, group_by, mutate
library(scales)   # rescale
library(ggplot2)
df = data.frame(x = c(rnorm(10, mean = 0),
rnorm(10, mean = 3)),
y = c(rnorm(10, mean = 0),
rnorm(10, mean = 3)),
grp = factor(c(rep('a', times = 10),
rep('b', times = 10)),
levels = c('a', 'b')),
val = rep(1:10, times = 2)) %>%
group_by(grp) %>%
mutate(scaledVal = rescale(val)) %>%
ungroup %>%
mutate(scaledValOffSet = scaledVal + 100*(as.integer(grp) - 1))
scalerange <- range(df$scaledVal)
gradientends <- scalerange + rep(c(0, 100), each = 2)  # one (low, high) pair per group, matching the four colours below
ggplot(data = df,
aes(x = x,
y = y)) +
geom_point(pch = 21,
aes(fill = scaledValOffSet,
size = val)) +
scale_fill_gradientn(colours = c('white',
'darkred',
'white',
'deepskyblue4'),
values = rescale(gradientends))
Basically one should rescale fill values (e.g. between 0 and 1) and separate them using another order of magnitude, provided by the categorical variable grp.
This is not what I wanted though: the snippet can be improved, of course, to make the whole thing less manual, but still lacks the simple usual discrete fill legend.
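To make the offset arithmetic concrete, here is a small base R sketch with illustrative numbers for two groups. It only reproduces the calculation, not the plot; offsetVal and stops are names introduced here, and the rescaling is written out by hand instead of calling rescale().

```r
# Scaled values lie in [0, 1] within each group (illustrative numbers)
scaledVal <- c(0, 0.5, 1, 0, 0.5, 1)
grp <- factor(c("a", "a", "a", "b", "b", "b"))

# Shift group b by 100 so the two groups occupy disjoint ranges
offsetVal <- scaledVal + 100 * (as.integer(grp) - 1)
print(offsetVal)

# Ends of each group's range: (0, 1) for a and (100, 101) for b
gradientends <- range(scaledVal) + rep(c(0, 100), each = 2)
print(gradientends)

# Positions of the four colour stops on the 0..1 scale of the gradient
stops <- (gradientends - min(gradientends)) / diff(range(gradientends))
print(round(stops, 4))
```

The first two stops bracket group a and the last two bracket group b; no data fall in the wide gap between the second and third stop, so the transition between the two colour ramps is never drawn on a point.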