I am trying to create Bland-Altman plots between 2 sets of percentages with a custom function that uses ggplot within it to generate the plot.
library(ggplot2)
Perc1 <- sample(1:100, 100)
Perc2 <- sample(1:100, 100)
d <- data.frame(Perc1, Perc2)
bland <- function(dat, x, y){
df <- subset(dat[ ,c(x, y)])
df$avg <- rowMeans(df)
df$diff <- df[[1]] - df[[2]]
mean_diff <- mean(df$diff)
lower <- mean_diff - 1.96 * sd(df$diff)
upper <- mean_diff + 1.96 * sd(df$diff)
p <- ggplot(df, aes(x = avg, y = diff)) +
geom_point(size=2) +
geom_hline(yintercept = mean_diff) +
geom_hline(yintercept = lower, color = "red", linetype="dashed") +
geom_hline(yintercept = upper, color = "red", linetype="dashed") +
ggtitle("Bland-Altman Plot") +
ylab("Difference Between Measurements") +
xlab("Average Measurement")
plot(p)
}
bland(d, Perc1, Perc2)
However, when I run the function, the graph shows the title and the x/y labels but none of the lines. Can anyone explain why this happens? Thanks in advance.
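Try this, which passes the two vectors to the function directly and builds the data frame inside it:
(Note also that the p <- and plot(p) are not strictly needed, since a function returns the value of its last expression anyway.)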
library(tidyverse)
Perc1 <- sample(1:100, 100)
Perc2 <- sample(1:100, 100)
bland <- function(x, y){
df <- data.frame(x, y)
df$avg <- rowMeans(df)
df$diff <- df[[1]] - df[[2]]
mean_diff <- mean(df$diff)
lower <- mean_diff - 1.96 * sd(df$diff)
upper <- mean_diff + 1.96 * sd(df$diff)
p <- ggplot(df, aes(x = avg, y = diff)) +
geom_point(size=2) +
geom_hline(yintercept = mean_diff) +
geom_hline(yintercept = lower, color = "red", linetype="dashed") +
geom_hline(yintercept = upper, color = "red", linetype="dashed") +
ggtitle("Bland-Altman Plot") +
ylab("Difference Between Measurements") +
xlab("Average Measurement")
plot(p)
}
bland(Perc1, Perc2)
Created on 2022-05-17 by the reprex package (v2.0.1)
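If you would rather keep the original interface and pass the data frame, one option is to give the column names as strings. As far as I can tell, the call bland(d, Perc1, Perc2) hands the function the global vectors Perc1 and Perc2 rather than column names, so dat[, c(x, y)] does not select the two intended columns. The sketch below is a minimal variant under that assumption; the name bland_df and the set.seed call are mine, not part of the original answer.

```r
library(ggplot2)

set.seed(42)  # assumption: seed added only so the example is reproducible
d <- data.frame(Perc1 = sample(1:100, 100), Perc2 = sample(1:100, 100))

# Hypothetical variant of bland(): x and y are column names given as strings
bland_df <- function(dat, x, y) {
  df <- dat[, c(x, y)]             # select the two columns by name
  df$avg  <- rowMeans(df)          # per-row mean of the two measurements
  df$diff <- df[[1]] - df[[2]]     # per-row difference
  mean_diff <- mean(df$diff)
  lower <- mean_diff - 1.96 * sd(df$diff)
  upper <- mean_diff + 1.96 * sd(df$diff)
  # The ggplot object is the last expression, so it is the return value
  ggplot(df, aes(x = avg, y = diff)) +
    geom_point(size = 2) +
    geom_hline(yintercept = mean_diff) +
    geom_hline(yintercept = c(lower, upper), color = "red", linetype = "dashed") +
    labs(title = "Bland-Altman Plot",
         x = "Average Measurement",
         y = "Difference Between Measurements")
}

p <- bland_df(d, "Perc1", "Perc2")
print(p)
```

Returning the plot instead of calling plot(p) inside the function lets the caller decide whether to print it, save it, or add further layers.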
Related
I'm trying to implement the log_2(x + 1) transformation in ggplot2 but am running into issues.
Here is an MWE:
library(ggplot2)
x <- rexp(100)
y <- rexp(100)
df <- data.frame(x = x, y = y)
p <- ggplot(df, aes(x = x, y = y)) + geom_point(colour = "blue") +
scale_x_continuous(trans = "log2") +
scale_y_continuous(trans = "log2")
print(p)
However, I'm unsure how best to transform the axes and label them as log_2(x + 1) and log_2(y + 1).
You could use log2_trans from scales with a function to add 1 like this:
library(ggplot2)
library(scales)
x <- rexp(100)
y <- rexp(100)
df <- data.frame(x = x, y = y)
p <- ggplot(df, aes(x = x, y = y)) + geom_point(colour = "blue") +
scale_x_continuous(trans = log2_trans(),
breaks = trans_breaks("log2", function(x) x + 1),
labels = trans_format("log2", math_format(.x + 1))) +
scale_y_continuous(trans = log2_trans(),
breaks = trans_breaks("log2", function(x) x + 1),
labels = trans_format("log2", math_format(.x + 1)))
print(p)
Created on 2022-11-04 with reprex v2.0.2
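Note that log2_trans() on its own applies log2(x), not log2(x + 1). If a true log2(x + 1) axis is wanted, one option is to define the transformation explicitly with scales::trans_new(). The sketch below is an alternative to the answer above, not a restatement of it; the helper name log2p1_trans and the seed are my own.

```r
library(ggplot2)
library(scales)

# Hypothetical helper: log2(x + 1) with its inverse 2^x - 1
log2p1_trans <- function() {
  trans_new(
    name      = "log2p1",
    transform = function(x) log2(x + 1),
    inverse   = function(x) 2^x - 1
  )
}

set.seed(1)  # assumption: seed added for reproducibility
df <- data.frame(x = rexp(100), y = rexp(100))

# Tick labels stay on the original scale; only the spacing follows log2(x + 1)
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point(colour = "blue") +
  scale_x_continuous(trans = log2p1_trans()) +
  scale_y_continuous(trans = log2p1_trans()) +
  labs(x = expression(log[2](x + 1)), y = expression(log[2](y + 1)))
print(p)
```

Unlike a plain log2 axis, this transformation is defined at zero, which is the usual reason for adding 1 in the first place.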
I have a data frame of 10 variables and I plotted it in two columns. But ggplot defines the whiskers as the 5th and 95th percentiles. I want the whiskers to extend to Q1 - 1.5*IQR and Q3 + 1.5*IQR for each of these plots, with outliers drawn as usual. A similar question has been posted (link), but I couldn't make use of it. Any help will be appreciated!
library(ggplot2)
library(tidyr)
df <- data.frame(matrix(rnorm(2000), ncol = 10))
plot.data <- gather(df, variable, value)
# plot.data$out <- as.numeric(rep(input_data, each = nrow(x_train)))
p <- ggplot(plot.data, aes(x = 0, y=value))
p <- p + geom_boxplot()
#p <- p + geom_point(aes(x = 0, y = test_data), color = "red")
p <- p + facet_wrap(~variable, scales = "free_x", strip.position = 'top', ncol = 2)
p <- p + coord_flip()
p <- p + xlab("") + ylab("")
p <- p + theme(legend.position="none") + theme_bw()
p <- p + theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank())
p
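By default (coef = 1.5), geom_boxplot() should already give you the whiskers you want: they extend to the most extreme observations within Q1 - 1.5*IQR and Q3 + 1.5*IQR, not to the 5th and 95th percentiles. See a more recent question (link). Note, though, that the exact positions depend on the quantile and IQR definitions used.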
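To see what the default whiskers amount to, the rule can be reproduced by hand in base R. This is a sketch of my understanding of the default (coef = 1.5, quantile() with its default type), not a copy of ggplot2's internals; the helper name whisker_ends is hypothetical.

```r
# Hypothetical helper: whisker ends and fences under the 1.5 * IQR rule
whisker_ends <- function(x, coef = 1.5) {
  q   <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower_fence <- q[1] - coef * iqr
  upper_fence <- q[2] + coef * iqr
  inside <- x[x >= lower_fence & x <= upper_fence]  # observations that are not outliers
  # Whiskers stop at the most extreme observations inside the fences
  c(lower_whisker = min(inside), upper_whisker = max(inside),
    lower_fence = lower_fence, upper_fence = upper_fence)
}

set.seed(123)  # assumption: seed added for reproducibility
x <- rnorm(200)
w <- whisker_ends(x)
print(w)

# Observations beyond the fences are the ones drawn as outlier points
print(x[x < w[["lower_fence"]] | x > w[["upper_fence"]]])
```

The whisker ends are actual data points, so they generally sit slightly inside the fences rather than exactly on Q1 - 1.5*IQR and Q3 + 1.5*IQR.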
If you insist on setting them manually with stat_summary:
# geom_boxplot parameters with stat summary
f <- function(x) {
r <- quantile(x, probs = c(0.25, 0.25, 0.5, 0.75, 0.75))
r[[1]]<-r[[1]]-1.5*IQR(x) #ymin lower whisker, as per geom_boxplot
r[[5]]<-r[[5]]+1.5*IQR(x) #ymax upper whisker
names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
r
}
# To subset the outlying points for plotting,
o <- function(x) {
r <- quantile(x, probs = c(0.25, 0.75))
r[[1]]<-r[[1]]-1.5*IQR(x)
r[[2]]<-r[[2]]+1.5*IQR(x)
subset(x, x < r[[1]] | r[[2]] < x)
}
# added seed for consistency
set.seed(123)
df <- data.frame(matrix(rnorm(2000), ncol = 10))
plot.data <- gather(df, variable, value)
# plot.data$out <- as.numeric(rep(input_data, each = nrow(x_train)))
p <- ggplot(plot.data, aes(x = 0, y=value))
p <- p + stat_summary(fun.data = f, geom="boxplot")+
# fun.y is deprecated since ggplot2 3.3.0; fun is its replacement
stat_summary(fun = o, geom="point")
#p <- p + geom_point(aes(x = 0, y = test_data), color = "red")
p <- p + facet_wrap(~variable, scales = "free_x", strip.position = 'top', ncol = 2)
p <- p + coord_flip()
p <- p + xlab("") + ylab("")
# theme_bw() is a complete theme, so it must come before the legend tweak
p <- p + theme_bw() + theme(legend.position="none")
p <- p + theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank())
I am using ggplot to draw this kind of graph:
[image]
I want to mark a specific value on the x-axis, as shown in the picture.
This is my code:
quantiles <- quantile(mat,prob = quant)
x <- as.vector(mat)
d <- as.data.frame(x=x)
p <- ggplot(data = d,aes(x=x)) + theme_bw() +
geom_histogram(aes(y = ..density..), binwidth=0.001,color="black",fill="white") +
geom_density(aes(x=x, y = ..density..),fill="blue", alpha=0.5, color = 'black')
x.dens <- density(x)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)
p <- p + geom_area(data = subset(df.dens, x <= quantiles), aes(x=x,y=y),
fill = 'green', alpha=0.6)
print(p)
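The picture is not available here, so the following is only a guess at the intent: marking one quantile on the x-axis with a vertical line and a text label. mat and quant are not defined in the snippet, so the sketch substitutes simulated data and a single 5% quantile; both are assumptions, as is the choice of geom_vline() plus annotate().

```r
library(ggplot2)

set.seed(1)
x <- rnorm(1000, mean = 0, sd = 0.01)      # assumption: stand-in for as.vector(mat)
q <- unname(quantile(x, probs = 0.05))     # assumption: one quantile of interest

x.dens  <- density(x)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)

p <- ggplot(data.frame(x = x), aes(x = x)) +
  theme_bw() +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.001,
                 color = "black", fill = "white") +
  geom_density(fill = "blue", alpha = 0.5, color = "black") +
  # shade the part of the density to the left of the quantile
  geom_area(data = subset(df.dens, x <= q), aes(x = x, y = y),
            fill = "green", alpha = 0.6) +
  # mark the quantile and print its value next to the line
  geom_vline(xintercept = q, linetype = "dashed") +
  annotate("text", x = q, y = 0, label = signif(q, 3), vjust = 1.5, hjust = 1.1)
print(p)
```

Using a single quantile keeps the comparison x <= q unambiguous; with a vector of quantiles the comparison would be recycled element by element.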
I created a plot with several geom_area layers using the following code:
library(ggplot2)
set.seed(1)
dat <- data.frame(matrix(rnorm(100, 10, 2), 100, 1))
dat_density <- data.frame(density(dat[, 1])[c("x", "y")])
quant <- quantile(dat[, 1], probs = seq(0, 1, 0.10))
library(RColorBrewer)
color_pal <- brewer.pal(length(quant)-1, "RdYlBu")
dens <- ggplot(data = dat_density, aes(x = x, y = y)) +
geom_line(size = 2)
for(i in 1:(length(color_pal))){
dens <- dens +
geom_area(data = subset(dat_density, x > quant[[i]] & x < quant[[i + 1]]), fill = color_pal[i])
}
dens
How can I add a common legend showing each color of the color_pal vector (one per 10% slice of the data)?
The easiest way is to define the groups in your dataset:
dat_density$quant <- cut(dat_density$x, breaks = c(-Inf, quant, Inf))
ggplot(data = dat_density, aes(x = x, y = y, fill = quant)) +
geom_line(size = 2) +
geom_area() +
scale_fill_brewer(palette = "RdYlBu")
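One thing to watch with the breaks above: quant already contains the minimum and maximum, so c(-Inf, quant, Inf) produces 12 groups, which is more than the 11 colours RdYlBu offers. A possible variant, assuming the two density tails outside the data range can simply be folded into the first and last decile, uses only the inner quantiles as breaks. The column name decile, the labels and the legend title are my own choices.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(value = rnorm(100, 10, 2))
dat_density <- data.frame(density(dat$value)[c("x", "y")])
quant <- quantile(dat$value, probs = seq(0, 1, 0.10))

# Use only the 10%..90% quantiles as inner breaks: 10 groups in total
inner <- quant[-c(1, length(quant))]
dat_density$decile <- cut(
  dat_density$x,
  breaks = c(-Inf, inner, Inf),
  labels = paste0(seq(0, 90, 10), "-", seq(10, 100, 10), "%")
)

dens <- ggplot(dat_density, aes(x = x, y = y)) +
  geom_area(aes(fill = decile)) +
  geom_line() +  # no fill grouping on this layer, so the outline stays continuous
  scale_fill_brewer(palette = "RdYlBu", name = "Decile")
print(dens)
```

Mapping fill only inside geom_area() keeps the outline as a single line while the legend still lists one entry per decile.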
How can I overlay the estimated negative binomial density function, using the same mean and variance, on a bar plot of my real data?
library(data.table)
library(ggplot2)
temp <- data.table(cbind(V1=c(1,2,3,4,5,9), N=c(50,40,30,20,10,2)))
ggplot(temp, aes(x=V1, y= N)) +
geom_histogram(stat="identity", binwidth = 2.5) +
scale_y_continuous(breaks=c(0, 100, 200, max(temp$N))) +
scale_x_continuous(breaks=c(0, 100, 200, max(temp$V1))) +
theme(panel.grid.minor.x=element_blank(),
panel.grid.major.x=element_blank()
)
I tried to add stat_function(fun = dnbinom, args = list(size=1, mu = mean(temp$V1)), color="red") but all I see is a red line on the abscissa. Same for dpois (with lambda=mean(temp$V1)) and dnorm (with mean = mean(temp$V1), sd = sd(temp$V1)).
Maybe my parametrization is wrong?
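@mmk is correct: normalization is the key. The bars show raw counts (up to 50), whereas dnbinom, dpois and dnorm return values well below 1 here, so the curve ends up flat against the x-axis. Here's how you can achieve what you want: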
#simplest normalization
temp$Nmod <- temp$N / sum(temp$N)
#alternative normalization
#temp$Nmod <- temp$N / sqrt(sum(temp$N * temp$N))
temp$pois <- dpois(temp$V1, lambda = mean(temp$V1))
temp$nbinom <- dnbinom(temp$V1, mu = mean(temp$V1), size = 1)
ggplot(temp, aes(x=V1, y= Nmod)) +
geom_histogram(stat="identity", binwidth = 2.5) +
theme(panel.grid.minor.x=element_blank(),
panel.grid.major.x=element_blank()) +
geom_line(aes(y = pois), col = "red") +
geom_line(aes(y = nbinom), col = "blue")
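Since the question asks for the same mean and variance, it may help to compute both from the counts first. The sketch below is my own addition, assuming V1 holds the observed values and N their frequencies: it takes the frequency-weighted mean and variance and derives the negative binomial size by the method of moments (size = mu^2 / (v - mu)), which only exists when the variance exceeds the mean.

```r
V1 <- c(1, 2, 3, 4, 5, 9)
N  <- c(50, 40, 30, 20, 10, 2)

p_obs <- N / sum(N)                # observed relative frequencies, sum to 1
mu    <- sum(V1 * p_obs)           # frequency-weighted mean
v     <- sum((V1 - mu)^2 * p_obs)  # frequency-weighted (population) variance

if (v > mu) {
  # Method of moments: variance = mu + mu^2 / size
  size   <- mu^2 / (v - mu)
  fitted <- dnbinom(V1, mu = mu, size = size)
} else {
  # Variance not above the mean: no negative binomial matches both moments,
  # so fall back to a Poisson with the same mean
  size   <- NA
  fitted <- dpois(V1, lambda = mu)
}

print(c(mean = mu, variance = v, size = size))
print(data.frame(V1, observed = p_obs, fitted))
```

For the example counts the weighted variance comes out below the weighted mean, so the sketch takes the Poisson branch; with overdispersed data it would return a negative binomial fit instead.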