How can I create a scatter plot with error bars in two directions? Usually the error bars are in the vertical direction (i.e. the uncertainty in the y value). However, my data has uncertainty in the x value as well:
X ErrX Y ErrY
1.0 0.1 3.0 0.2
1.5 0.3 4.2 0.1
etc
Using ggplot2, this is easy. You have complete control over the length of all four "sides" of the error bars. With geom_errorbar() you set the y-errors, and with geom_errorbarh() (the h is for horizontal) you set the x-errors.
#toy data
df <- data.frame(X = rnorm(4), errX = rnorm(4)*0.1, Y = rnorm(4), errY = rnorm(4)*0.2)
#load ggplot2
require(ggplot2)
#make graph
ggplot(data = df, aes(x = X, y = Y)) + geom_point() + #main graph
geom_errorbar(aes(ymin = Y-errY, ymax = Y+errY)) +
geom_errorbarh(aes(xmin = X-errX, xmax = X+errX))
You have separate control for the color of each bar, the linewidth, etc by setting parameters inside geom_errorbar(). See the help and Google for details. For example, you can control the width of the "caps" or eliminate them entirely with the width parameter. Compare the graph above to this one for an example of removing them:
ggplot(data = df, aes(x = X, y = Y)) + geom_point() +
geom_errorbar(aes(ymin = Y-errY, ymax = Y+errY), width = 0) +
geom_errorbarh(aes(xmin = X-errX, xmax = X+errX), height = 0)
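Since ymin/ymax and xmin/xmax are given separately, the four "sides" do not have to be symmetric. Here is a minimal sketch, assuming the data come with separate lower and upper uncertainties (the columns xlo, xhi, ylo and yhi are made up for illustration):

```r
library(ggplot2)

# Hypothetical data: separate lower/upper uncertainties in each direction
df_asym <- data.frame(
  X   = c(1.0, 1.5, 2.2),
  Y   = c(3.0, 4.2, 3.6),
  xlo = c(0.1, 0.3, 0.2), xhi = c(0.2, 0.1, 0.4),
  ylo = c(0.2, 0.1, 0.3), yhi = c(0.4, 0.2, 0.1)
)

p <- ggplot(df_asym, aes(x = X, y = Y)) +
  geom_point() +
  # vertical bars: lower and upper ends computed independently
  geom_errorbar(aes(ymin = Y - ylo, ymax = Y + yhi), width = 0) +
  # horizontal bars: left and right ends computed independently
  geom_errorbarh(aes(xmin = X - xlo, xmax = X + xhi), height = 0)
p
```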
As an alternative (using Curt F.'s "df"):
rangeX = range(c(df$X + df$errX, df$X - df$errX))
rangeY = range(c(df$Y + df$errY, df$Y - df$errY))
plot(df$X, df$Y, xlim = rangeX, ylim = rangeY)
segments(df$X, df$Y - df$errY, df$X, df$Y + df$errY)
segments(df$X - df$errX, df$Y, df$X + df$errX, df$Y)
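If you want caps on the bars in base graphics, a common idiom is arrows() with angle = 90 and code = 3, which draws a flat head at both ends. A sketch with its own toy data (abs() is my addition so that the error sizes are non-negative):

```r
set.seed(1)
# Toy data in the same shape as "df" above; abs() keeps the errors non-negative
df <- data.frame(X = rnorm(4), errX = abs(rnorm(4)) * 0.1,
                 Y = rnorm(4), errY = abs(rnorm(4)) * 0.2)

rangeX <- range(c(df$X + df$errX, df$X - df$errX))
rangeY <- range(c(df$Y + df$errY, df$Y - df$errY))

plot(df$X, df$Y, xlim = rangeX, ylim = rangeY, pch = 19)
# angle = 90 turns the arrow heads into flat caps; code = 3 draws them at both ends
arrows(df$X, df$Y - df$errY, df$X, df$Y + df$errY,
       length = 0.05, angle = 90, code = 3)
arrows(df$X - df$errX, df$Y, df$X + df$errX, df$Y,
       length = 0.05, angle = 90, code = 3)
```

The length argument is the cap size in inches, so it plays the same role as width/height in the ggplot2 version.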
Using error.crosses from my psych package + the toy data from Curt:
df1 <- data.frame(mean=df$X,sd=df$errX)
df2 <- data.frame(mean=df$Y,sd=df$errY)
error.crosses(df1,df2,sd=TRUE)
See the help page for error.crosses for some more complicated examples.
I was trying to recreate this plot:
using the following code -
library(tidyverse)
set.seed(0); r <- rnorm(10000);
df <- as.data.frame(r)
avg <- round(mean(r),2)
SD <- round(sd(r),2)
x.scale <- seq(from = avg - 3*SD, to = avg + 3*SD, by = SD)
x.lab <- c("-3SD", "-2SD", "-1SD", "Mean", "1SD", "2SD", "3SD")
df %>% ggplot(aes(r)) +
geom_histogram(aes(y=..density..), bins = 20,
colour="black", fill="lightblue") +
geom_density(alpha=.2, fill="darkblue") +
scale_x_continuous(breaks = x.scale, labels = x.lab) +
labs(x = "")
Using the code I plotted this, but it isn't close to the plot that I am trying to create. How do I add a second axis below the x axis? How do I add the lines that show the percentage of observations automatically? Is there any way to create a plot as nearly identical as possible using ggplot2?
Welcome to SO. Excellent first question!
It's actually quite tricky. You'd need to create a second plot (the second x axis), but it's not entirely straightforward to align both perfectly.
I will be using Z.lin's amazing modification of the cowplot package.
I am not using the reprex package, because I think I'd need to define every single function (and I don't know how to use trace within reprex.)
library(tidyverse)
library(cowplot)
set.seed(0); r <- rnorm(10000);
foodf <- as.data.frame(r)
avg <- round(mean(r),2)
SD <- round(sd(r),2)
x.scale <- round(seq(from = avg - 3*SD, to = avg + 3*SD, by = SD), 1)
x.lab <- c("-3SD", "-2SD", "-1SD", "Mean", "1SD", "2SD", "3SD")
x2lab <- -3:3
# calculate the density manually
dens_r <- density(r)
# for each x value, calculate the closest x value in the density object and get the respective y values
y_dens <- dens_r$y[sapply(x.scale, function(x) which.min(abs(dens_r$x - x)))]
# added annotation for segments and labels.
# Arrow segments can be added in a similar way.
p1 <-
ggplot(foodf, aes(r)) +
geom_histogram(aes(y=..density..), bins = 20,
colour="black", fill="lightblue") +
geom_density(alpha=.2, fill="darkblue") +
scale_x_continuous(breaks = x.scale, labels = x.lab) +
labs(x = NULL) +# use NULL here
annotate(geom = "segment", x = x.scale, xend = x.scale,
yend = 1.1 * max(dens_r$y), y = y_dens, lty = 2 ) +
annotate(geom = "text", label = x.lab,
x = x.scale, y = 1.2 * max(dens_r$y))
p2 <-
ggplot(foodf, aes(r)) +
scale_x_continuous(breaks = x.scale, labels = x2lab) +
labs(x = NULL) +
theme_classic() +
theme(axis.line.y = element_blank())
# This is with the modified plot_grid() / align_plot() function!!!
plot_grid(p1, p2, ncol = 1, align = "v", rel_heights = c(1, 0.1))
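The code above draws the dashed lines but not the percentages of observations. One way to get them, sketched here under the assumption that the empirical share of the sample in each one-SD band is what you want, is to cut r at the SD breaks and tabulate (band, pct, band_labels and mid_x are names I made up):

```r
set.seed(0)
r <- rnorm(10000)
avg <- mean(r)
SD <- sd(r)

# Band edges at -3SD ... +3SD around the mean
breaks <- avg + (-3:3) * SD

# Assign each observation to a band; values beyond 3 SD become NA
band <- cut(r, breaks = breaks)

# Share of all observations in each of the six bands, in percent
pct <- 100 * as.numeric(table(band)) / length(r)
band_labels <- paste0(round(pct, 1), "%")

# Horizontal position for each label: the centre of its band
mid_x <- head(breaks, -1) + SD / 2
```

The labels could then be placed with another annotate(geom = "text", x = mid_x, y = ..., label = band_labels) layer on p1, choosing y to suit the plot.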
I am performing a mantel test using the function mantel.rtest from ade4 on two Euclidean distance matrices to get the correlation between them. Since I would like to show the resulting plot for different tests, I would like to know if it would be possible to plot the mantel result using ggplot2 instead of the basic function plot.
First of all, I tried to convert r1 into a data.frame, but I get this error:
r2 <- as.data.frame(r1)
Error in as.data.frame.default(r1) :
cannot coerce class ‘c("mantelrtest", "randtest", "lightrandtest")’ to a data.fr
I am adding a reproducible example:
library(ade4)
data(yanomama)
gen <- quasieuclid(as.dist(yanomama$gen))
geo <- quasieuclid(as.dist(yanomama$geo))
plot(r1 <- mantel.rtest(geo,gen), main = "Mantel's test")
r1
Thanks a lot!
The following function will draw a ggplot for your mantelrtest object:
ggplot_mantel <- function(mant, fill = "gray50") {
df <- data.frame(x = mant$plot$hist$mids,
y = mant$plot$hist$counts)
ggplot(df, aes(x, y)) +
geom_col(orientation = "x",
width = diff(mant$plot$hist$breaks)[1],
fill = fill, color = "gray30") +
labs(x = mant$plot$hist$xname, y = "Frequency") +
scale_x_continuous(limits = mant$plot$xlim) +
geom_segment(aes(x = mant$obs, xend = mant$obs, y = 0,
yend = 0.75 * max(y))) +
geom_point(aes(x = mant$obs, y = 0.75 * max(y)), size = 5,
shape = 18)
}
So, using your own example:
plot(r1)
ggplot_mantel(r1)
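If you would rather have a plain data frame, for example to show several tests in one faceted plot, the same pieces can be pulled out by hand. This is only a sketch: mantel_hist_df() is a hypothetical helper, not part of ade4, and it relies on the same mant$plot$hist and mant$obs elements that the function above uses.

```r
library(ade4)

data(yanomama)
gen <- quasieuclid(as.dist(yanomama$gen))
geo <- quasieuclid(as.dist(yanomama$geo))
r1 <- mantel.rtest(geo, gen, nrepet = 99)

# Hypothetical helper (not an ade4 function): one row per histogram bin,
# plus a label for the test and the observed statistic repeated on each row
mantel_hist_df <- function(mant, label) {
  data.frame(test  = label,
             mid   = mant$plot$hist$mids,
             count = mant$plot$hist$counts,
             obs   = mant$obs)
}

hist_df <- mantel_hist_df(r1, "geo vs gen")
head(hist_df)
```

Data frames from several tests can be combined with rbind() and drawn with geom_col() plus facet_wrap(~ test).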
I'm currently finishing off my Masters project and need to include some graphics for the write-up. Without boring you too much, I have some data which is associated with AR(1) parameters ranging from 0.1 to 0.9 by 0.1 increments. As such I thought of doing a faceted histogram like the one below (worry not about the hideous fruit salad of colours, it will not be used).
I used this code.
ggplot(opt_lens_geom,aes(x=l_1024,fill=factor(rho))) + geom_histogram()+coord_flip()+facet_grid(.~rho,scales = "free_x")
I would also like to draw a trend line for the median values, since the AR(1) parameter is continuous. In a later iteration I deleted the padding and made it "look" like it was one graph, but I have had issues with the endpoints matching up since each facet is a separate graphical device. Can anyone give me some advice on how to do this? I am not particularly partial to the faceting, so if it is not needed I can do away with it.
I will try to upload sample data, but simulating 100 values for each of the 9 rhos would work just to get started, like:
opt_lens_geom <- data.frame(rho= rep(seq(0.1,0.9,by=0.1),each=100),l_1024=rnorm(900))
You might consider ggridges. I've assumed here that you want a median value for each value of rho.
library(ggplot2)
library(ggridges)
library(dplyr)
set.seed(1001)
opt_lens_geom <- data.frame(rho = rep(seq(0.1, 0.9, by = 0.1), each = 100),
l_1024 = rnorm(900))
opt_lens_geom %>%
mutate(rho_f = factor(rho)) %>%
ggplot(aes(l_1024, rho_f)) +
stat_density_ridges(quantiles = 2, quantile_lines = TRUE)
You can add scale = 1 as a parameter to stat_density_ridges if you don't like the amount of overlap.
Try the following. It uses a pre-computed data frame of the medians.
library(ggplot2)
df <- iris[c(1, 5)]
names(df) <- c("val", "rho")
med <- plyr::ddply(df, "rho", plyr::summarise, m = median(val))  # plyr is not attached, so qualify summarise too
ggplot(data = df, aes(x = val, fill = factor(rho))) +
geom_histogram() +
coord_flip() +
geom_vline(data = med, aes(xintercept = m), colour = 'black') +
facet_wrap(~ factor(rho))
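If you don't have plyr installed, the same table of medians can be built in base R. A small sketch using aggregate(), renaming the result column to m so that it matches the geom_vline() call above:

```r
df <- iris[c(1, 5)]
names(df) <- c("val", "rho")

# One median per level of rho; aggregate() returns columns "rho" and "val"
med <- aggregate(val ~ rho, data = df, FUN = median)
names(med)[2] <- "m"
med
```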
You could do a variant on this using geom_violin instead of using histograms, although you wouldn't get labelled counts, just an idea of the relative density. Example with made up data:
df = data.frame(
rho = rep(c(0.1, 0.2, 0.3), each = 50),
val = sample(1:10, 150, replace = TRUE)
)
df$val = df$val + (5 * (df$rho == 0.2)) + (8 * (df$rho == 0.3))
ggplot(df, aes(x = rho, y = val, fill = factor(rho))) +
geom_violin() +
stat_summary(aes(group = 1), colour = "black",
geom = "line", fun = "median")  # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
This produces a violin for each value of rho, and joins the medians for each violin.
This is my first Stack Overflow post and I am a relatively new R user, so please go gently!
I have a data frame with three columns, a participant identifier, a condition (factor with 2 levels either Placebo or Experimental), and an outcome score.
set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo",10),rep("Experimental",10)),
Outcome = rnorm(20,15,2),
ID = factor(rep(1:10,2)))
I would like to construct a bar plot with two bars showing the mean outcome score for each condition, with the standard deviation as an error bar. I would then like to overlay lines connecting the points for each participant's score in each condition, so that the plot displays the individual responses as well as the group means. If possible, I would also like to include an axis break.
I don't seem to be able to find any advice in other threads, apologies if I am repeating a question.
Many Thanks.
p.s. I realise that presenting data in this way will not be to everyone's taste. It is for a specific requirement!
This ought to work:
library(ggplot2)
library(dplyr)
dat.summ <- dat %>% group_by(Condition) %>%
summarize(mean.outcome = mean(Outcome),
sd.outcome = sd(Outcome))
ggplot(dat.summ, aes(x = Condition, y = mean.outcome)) +
geom_bar(stat = "identity") +
geom_errorbar(aes(ymin = mean.outcome - sd.outcome,
ymax = mean.outcome + sd.outcome),
color = "dodgerblue", width = 0.3) +
geom_point(data = dat, aes(x = Condition, y = Outcome),
color = "firebrick", size = 1.2) +
geom_line(data = dat, aes(x = Condition, y = Outcome, group = ID),
color = "firebrick", size = 1.2, alpha = 0.5) +
scale_y_continuous(limits = c(0, max(dat$Outcome)))
Some people are better with ggplot's stat functions and arguments than I am and might do it differently. I prefer to just transform my data first.
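On the axis break: as far as I know ggplot2 has no built-in broken axis. What you can do is zoom the y axis with coord_cartesian(). Unlike limits on scale_y_continuous(), which would drop bars that start at zero once zero is outside the limits, this only changes the visible window. A sketch (the window c(10, 20) is an arbitrary choice, and the means are computed with base aggregate() to keep the example self-contained):

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo", 10), rep("Experimental", 10)),
                  Outcome = rnorm(20, 15, 2),
                  ID = factor(rep(1:10, 2)))

# Mean outcome per condition (columns: Condition, Outcome)
dat.summ <- aggregate(Outcome ~ Condition, data = dat, FUN = mean)

p <- ggplot(dat.summ, aes(x = Condition, y = Outcome)) +
  geom_col() +
  geom_line(data = dat, aes(group = ID), colour = "firebrick", alpha = 0.5) +
  geom_point(data = dat, colour = "firebrick") +
  # zoom only: the bars are clipped by the window, not removed
  coord_cartesian(ylim = c(10, 20))
p
```

Keep in mind that a bar chart whose axis does not start at zero can exaggerate differences, so this is a zoom rather than a true break.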
set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo",10),rep("Experimental",10)),
Outcome = rnorm(20,15,2),
ID = factor(rep(1:10,2)))
dat.w <- reshape(dat, direction = 'wide', idvar = 'ID', timevar = 'Condition')
means <- colMeans(dat.w[, 2:3])
sds <- apply(dat.w[, 2:3], 2, sd)
ci.l <- means - sds
ci.u <- means + sds
ci.width <- .25
bp <- barplot(means, ylim = c(0,20))
segments(bp, ci.l, bp, ci.u)
segments(bp - ci.width, ci.u, bp + ci.width, ci.u)
segments(bp - ci.width, ci.l, bp + ci.width, ci.l)
segments(x0 = bp[1], x1 = bp[2], y0 = dat.w[, 2], y1 = dat.w[, 3], col = 1:10)
points(c(rep(bp[1], 10), rep(bp[2], 10)), dat$Outcome, col = 1:10, pch = 19)
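The indexing by column number above depends on what reshape() returns, so it can help to look at the wide data first. A small sketch of that step on its own (diffs is just an illustrative extra):

```r
set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo", 10), rep("Experimental", 10)),
                  Outcome = rnorm(20, 15, 2),
                  ID = factor(rep(1:10, 2)))

# One row per participant, one Outcome column per condition
dat.w <- reshape(dat, direction = "wide", idvar = "ID", timevar = "Condition")
names(dat.w)
head(dat.w, 3)

# With the data in wide form, per-participant changes are a simple subtraction
diffs <- dat.w$Outcome.Experimental - dat.w$Outcome.Placebo
```

Column 2 is the placebo score and column 3 the experimental score, which is why dat.w[, 2] and dat.w[, 3] serve as the start and end points of the connecting segments.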
Here is a method using the transformations inside ggplot2:
ggplot(dat) +
stat_summary(aes(x=Condition, y=Outcome, group=Condition), fun="mean", geom="bar") + # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
stat_summary(aes(x=Condition, y=Outcome, group=Condition), fun.data="mean_se", geom="errorbar", col="green", width=.8, size=2) +
geom_line(aes(x=Condition, y=Outcome, group=ID), col="red")
Following up on a recent question of mine, this one is a bit different and illustrates the problem more fully using simpler examples. Below are two data sets and three functions. The first one draws some points and a circle as expected:
library("ggplot2")
library("grid")
td1 <- data.frame(x = rnorm(10), y = rnorm(10))
tf1 <- function(df) { # works as expected
p <- ggplot(aes(x = x, y = y), data = df)
p <- p + geom_point(color = "red")
p <- p + annotation_custom(circleGrob())
print(p)
}
tf1(td1)
This next one seems to ask for the exact same plot, but the code is slightly different. It does not give an error but does not draw the circle:
tf2 <- function(df) { # circle isn't drawn, but no error either
p <- ggplot()
p <- p + geom_point(data = df, aes(x = x, y = y), color = "red")
p <- p + annotation_custom(circleGrob())
print(p)
}
tf2(td1)
Finally, this one involves a more complex aesthetic and gives an empty layer when you try to create the circle:
td3 <- data.frame(r = c(rnorm(5, 5, 1.5), rnorm(5, 8, 2)),
f1 = c(rep("L", 5), rep("H", 5)), f2 = rep(c("A", "B"), 5))
tf3 <- function(df) {
p <- ggplot()
p <- p + geom_point(data = df,
aes(x = f1, y = r, color = f2, group = f2))
# p <- p + annotation_custom(circleGrob()) # comment out and it works
print(p)
}
tf3(td3)
Now, I suspect the problem here is not the code but my failure to grasp the inner workings of ggplot2. I could sure use an explanation of why the circle is not drawn in the 2nd case and why the layer is empty in the third case. I looked at the code for annotation_custom and it has a hard-wired inherit.aes = TRUE, which I think is the problem. I don't see why this function needs any aesthetic at all (see the docs on it). I did try several ways to override it and set inherit.aes = FALSE, but I was unable to fully penetrate the namespace and make it stick. I tried to examine the objects created by ggplot2, but these proto objects are nested very deeply and are hard to decipher.
To answer this :
"I don't see why this function needs any aesthetic at all".
In fact, annotation_custom needs the x and y aesthetics to scale its grob and then to position it in native units.
Basically, it does this:
x_rng <- range(df$x, na.rm = TRUE) ## ranges of x :aes x
y_rng <- range(df$y, na.rm = TRUE) ## ranges of y :aes y
vp <- viewport(x = mean(x_rng), y = mean(y_rng), ## create a viewport
width = diff(x_rng), height = diff(y_rng),
just = c("center","center"))
dd <- editGrob(grob = circleGrob(), vp = vp) ## draw the grob in this viewport
To illustrate this I add a grob to a dummy plot used as a scale for my grob. The first is a big scale and the second is a small one.
base.big <- ggplot(aes(x = x1, y = y1), data = data.frame(x1=1:100,y1=1:100))
base.small <- ggplot(aes(x = x1, y = y1), data = data.frame(x1=1:20,y1=1:1))
I define my grob; note that I use native (data) units for xmin, xmax, ymin and ymax:
annot <- annotation_custom(grob = circleGrob(), xmin = 0,
xmax = 20,
ymin = 0,
ymax = 1)
Now compare the difference in scale (small point vs. big circle) between (base.big + annot) and (base.small + annot):
library(gridExtra)
grid.arrange(base.big+annot,
base.small+annot)
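The same idea applies to the scatter plot from the question: giving annotation_custom explicit bounds in data units ties the circle to the range of the points. This is a sketch only; the bounds and the gpar() settings are my choices, not something the question asked for.

```r
library(ggplot2)
library(grid)

set.seed(42)
td1 <- data.frame(x = rnorm(10), y = rnorm(10))

p <- ggplot(td1, aes(x = x, y = y)) +
  geom_point(color = "red") +
  # the grob is drawn inside the box spanned by the data, in data units
  annotation_custom(circleGrob(gp = gpar(fill = NA, col = "blue")),
                    xmin = min(td1$x), xmax = max(td1$x),
                    ymin = min(td1$y), ymax = max(td1$y))
p
```

Because circleGrob() sizes itself relative to the smaller side of its viewport, the circle touches the box on its shorter dimension only.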