I would like to align the plot areas (panels) of several plots "nicely", each created by a separate chunk in an RMarkdown document (preferably .html). My problem: because the y-axis texts differ in length, the plotted areas don't line up (a pity, because my actual x-axis is months).
Setting fig.width= and out.width= doesn't help here, because both size the whole figure, axis text included.
Dummy Data chunk:
require(ggplot2)
df = expand.grid(y = LETTERS,
x = paste0('A', 1:10),
stringsAsFactors = FALSE)
set.seed(42)
df$fill = rnorm(nrow(df))
df2 = df
df2$y = unlist(lapply(lapply(df2$y, function(x) rep(x, 10)), paste0, collapse = ''))
Plot-Chunk1:
gg1 = ggplot(df, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg1
Plot-Chunk2:
gg2 = ggplot(df2, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg2
The plots in the RMarkdown document should look like that (red lines highlight the desired alignment):
I achieved this with the patchwork package. However, this way everything has to go into a single chunk rather than several.
Patchwork-Plot-Chunk:
require(patchwork)
gg1 / gg2 +
plot_annotation(tag_levels = 'A')
Edited (tidier?) solution: cowplot::align_plots
Having a bit of a play around with cowplot::align_plots, it would be possible to set a standard panel width to use across all graphs. But to do this across chunks when you're constructing each graph 'blind' to the forthcoming ones, you could create a 'template' plot with labels as wide as needed (gg_set below). Each subsequent graph would then adopt the sizing of this unused plot:
require(ggplot2)
df <- expand.grid(y = LETTERS,
x = paste0('A', 1:10),
stringsAsFactors = FALSE)
set.seed(42)
df$fill = rnorm(nrow(df))
df2 <- df
df2$y <-
unlist(lapply(lapply(df2$y, function(x)
rep(x, 5)), paste0, collapse = ''))
# df for setting the maximum label size needed - may need some experimenting
dfset <- df
dfset$y <-
unlist(lapply(lapply(df$y, function(x)
rep(x, 10)), paste0, collapse = ''))
# 'template' plot
gg_set <- ggplot(dfset, aes(y = y, x = x, fill = fill)) +
geom_tile()
require(cowplot)
# Chunk 1
gg1 <- ggplot(df, aes(y = y, x = x, fill = fill)) +
geom_tile()
ggs <- align_plots(gg_set, gg1, align = "v")
# Only extracting relevant graph.
ggdraw(ggs[[2]])
# Chunk 2
gg2 <- ggplot(df2, aes(y = y, x = x, fill = fill)) +
geom_tile()
ggs <- align_plots(gg_set, gg2, align = "v")
ggdraw(ggs[[2]])
Created on 2021-12-17 by the reprex package (v2.0.1)
Untidy former solution
I've previously used an admittedly messy solution, which just pads every label with a blank line above and below that is wider than the longest label:
require(ggplot2)
#> Loading required package: ggplot2
df <- expand.grid(y = LETTERS,
x = paste0('A', 1:10),
stringsAsFactors = FALSE)
set.seed(42)
df$fill = rnorm(nrow(df))
df2 <- df
df2$y <-
unlist(lapply(lapply(df2$y, function(x)
rep(x, 10)), paste0, collapse = ''))
df$y <-
paste0(paste0(rep(" ", 40), collapse = ""), "\n", df$y, "\n", paste0(rep(" ", 40), collapse = ""))
df2$y <-
paste0(paste0(rep(" ", 40), collapse = ""), "\n", df2$y, "\n", paste0(rep(" ", 40), collapse = ""))
gg1 <- ggplot(df, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg1
gg2 <- ggplot(df2, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg2
I would hope there is a more formal solution that allows a static panel size, and I look forward to hearing other answers. But I had used this as a quick fix!
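The padding trick above can be wrapped in a small helper so that every chunk pads its labels the same way. This is only a sketch: pad_label() and its width argument are hypothetical names, not part of ggplot2, and strrep() stands in for the paste0(rep(...)) idiom.

```r
# Hypothetical helper (not from any package): surround each label with a
# line of `width` spaces above and below, so that every plot reserves the
# same horizontal space for its y-axis text.
pad_label <- function(label, width = 40) {
  pad <- strrep(" ", width)
  paste0(pad, "\n", label, "\n", pad)
}

padded <- pad_label(c("A", "AAAAAAAAAA"))

# Each padded label consists of three lines: padding, label, padding.
parts <- strsplit(padded, "\n", fixed = TRUE)
```

It would be used as df$y <- pad_label(df$y) before building each plot. Whether 40 spaces is wide enough depends on the font and the longest label, so the width may need adjusting.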
The patchwork package also includes the function align_patches(), which works similarly to cowplot::align_plots().
gg_l = patchwork::align_patches(gg1,
gg2)
Plot-Chunk1:
gg_l[[1]]
Plot-Chunk2:
gg_l[[2]]
Data from question.
There is a good discussion about using ggplot in loop and other creative ways at Looping over variables in ggplot. However, the discussion does not quite solve my problem.
I have a vertical dataset that I need to create plots from in a loop. There is no error in the code but my code only prints the last plot. Can't figure out why. Here is a reproducible example:
df <- cbind.data.frame(var = sample(c('a','b'), size = 100, replace = TRUE),
grp = sample(c('x','y'), size = 100, replace = TRUE), value = rnorm(100))
for (i in 2) {
plot.df <- df[which(df$var == c('a','b')[i]),]
print(ggplot(plot.df, aes(x = 1:nrow(plot.df), y = value, color = grp)) +
geom_line() + ggtitle(c('a','b')[i]))
}
As an alternative, you might also consider using lapply, as it makes the code a lot more readable.
If I am not mistaken you want to produce plots for each of the levels of the variable var.
You can first define a function and then apply it to all levels:
my_plot <- function(x){
# debug: x <- "a"
plot.df <- df[df$var %in% x,]
ggplot(plot.df, aes(x = 1:nrow(plot.df), y = value, color = grp)) +
geom_line() + ggtitle(x)
}
lapply(unique(df$var), my_plot)
The comment by @EJJ is correct: your loop isn't iterating over both levels, because for (i in 2) runs only once, with i = 2. You need something like
for (i in seq_along(1:nlevels(factor(df$var))))
library(ggplot2)
library(dplyr)
df <- cbind.data.frame(var = sample(c('a','b'), size = 100, replace = TRUE),
grp = sample(c('x','y'), size = 100, replace = TRUE), value = rnorm(100))
for (i in seq_along(1:nlevels(factor(df$var)))) {
plot.df <- df[which(df$var == c('a','b')[i]),]
print(ggplot(plot.df, aes(x = 1:nrow(plot.df), y = value, color = grp)) +
geom_line() + ggtitle(c('a','b')[i]))
}
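A quick way to see why the original loop drew only one plot: for (i in 2) iterates over the length-one vector 2, so the body runs once with i = 2. A base-R sketch (the variable names are illustrative, not from the question):

```r
# `for (i in 2)` loops over the single value 2, not over 1 and 2.
visited_single <- numeric(0)
for (i in 2) {
  visited_single <- c(visited_single, i)
}

# seq_along() yields one index per element, so every level is visited.
lvls <- c("a", "b")
visited_all <- integer(0)
for (i in seq_along(lvls)) {
  visited_all <- c(visited_all, i)
}
```

Here visited_single ends up holding only 2, while visited_all holds both 1 and 2.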
I've got an exponentially distributed variable that I'd like to plot using ggplot2. I'm going to take the log of the variable. However, instead of having the axis label be the log format, I'd like it to be the original exponentially distributed values. Here's an example.
set.seed(1000)
aero_df <-
data_frame(
x = rnorm(100,100,99),
y = sample(c('dream on',
'dude looks like a lady'),
100,
replace = T)) %>%
mutate(x = x*x,
log_x = log(x)) %>%
gather(key,value,-y)
aero_plot <- ggplot(aero_df,aes(value,color = y,fill = y))+
geom_density(show.legend = F)+
facet_wrap(key~y,scales = 'free')
I'd like to have the x variable labels on the log_x.
aero_plot
I started off with this, but the issue here is that you can see the normal log_x labels also in the x plots.
ticks <- c(3,6,9,12)
logticks <- c(exp(9),exp(10),exp(11))
ggplot(aero_df,aes(value,color = y,fill = y))+
geom_density(show.legend = F)+
scale_x_continuous(breaks = c(ticks,logticks), labels = c(ticks,log(logticks))) +
facet_wrap(key~y,scales = 'free')
ggplot's scale_x_log10 to the rescue, maybe? I'm not 100% sure I understand your question, because I didn't understand your example code. Hopefully this is what you mean...
library(tidyverse)
set.seed(1000)
aero_df <-
tibble(  # tibble() replaces the deprecated data_frame()
x = rnorm(100,100,99),
y = sample(c('dream on',
'dude looks like a lady'),
100,
replace = T))
aero_plot <- ggplot(aero_df,aes(x,color = y,fill = y)) +
geom_density(show.legend = F) +
scale_x_log10() +
facet_wrap(~y,scales = 'free')
print(aero_plot)
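If the log-transformed column has to be kept (as in the question) rather than letting scale_x_log10() do the transformation, one option is to place the breaks at log positions and label them with the original values. A minimal sketch for a single, unfaceted plot, assuming natural logs; orig_breaks is an illustrative choice and not taken from the question:

```r
# Break values in the original units (illustrative choice).
orig_breaks <- c(10, 100, 1000, 10000)
# Where those values sit on a natural-log axis.
log_breaks <- log(orig_breaks)

# The plot itself is only built if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  set.seed(1000)
  d <- data.frame(log_x = log(rnorm(100, 100, 99)^2))
  p <- ggplot2::ggplot(d, ggplot2::aes(log_x)) +
    ggplot2::geom_density() +
    # Ticks sit at the log positions but show the original values.
    ggplot2::scale_x_continuous(breaks = log_breaks,
                                labels = as.character(orig_breaks))
}
```

The data are still plotted on the log scale; only the tick labels are back-transformed.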
I have data in following format.
X ID Mean Mean+Error Mean-Error
61322107 cg09959428 0.39158198 0.39733463 0.38582934
61322255 cg17147820 0.30742542 0.31572314 0.29912770
61322742 cg08922201 0.47443355 0.47973039 0.46913671
61322922 cg08360511 0.06614797 0.06750279 0.06479315
61323029 cg00998427 0.05625839 0.05779519 0.05472160
61323113 cg15492820 0.10606674 0.10830587 0.10382761
61323284 cg02950427 0.36187007 0.36727818 0.35646196
61323413 cg01996653 0.35582920 0.36276991 0.34888849
61323667 cg14161454 0.77930230 0.78821970 0.77038491
61324205 cg25149253 0.93585347 0.93948514 0.93222180
How can I draw a bar (column) plot with error bars, where the x-axis shows the X value, so that each bar is drawn at its X position with a fixed width?
I'll try answering. I am using a package called plotly. You can look here for more details.
df <- read.csv('test.csv')
colnames(df) <- c("x", "id", "mean", "mean+error", "mean-error")
df$`mean+error` = df$`mean+error` - df$mean
df$`mean-error` = df$mean - df$`mean-error`
library(plotly)
p <- ggplot(df, aes(factor(x), y = mean)) + geom_bar(stat = "identity")
p <- plotly_build(p)
length(p$data)
p$layout$xaxis
plot_ly(df, x = 1:10, y = ~mean, type = "bar",
error_y = list(symmetric = F,
array = df$`mean+error`,
arrayminus = df$`mean-error`,
type = "data")) %>%
layout(xaxis = list(tickmode = "array",tickvals = 1:10,ticktext = df$x))
I get this:
The most popular approach would probably be using geom_errorbar() in ggplot2.
library("ggplot2")
ggplot(df, aes(x=ID, y = Mean)) +
geom_bar(stat="identity", fill="light blue") +
geom_errorbar(aes(ymin = Mean.Error.1, ymax = Mean.Error))
where Mean.Error (Mean+Error, the upper bound) and Mean.Error.1 (Mean-Error, the lower bound) are the column names you get for the mean +/- error columns when you read in your example as text.
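To check the column names mentioned above, the first few rows of the question's table can be read in as text with base R. This is a sketch, assuming read.table() with its default check.names = TRUE; txt and the two logical variables are illustrative names.

```r
# First three rows of the table from the question, as whitespace-separated text.
txt <- "X ID Mean Mean+Error Mean-Error
61322107 cg09959428 0.39158198 0.39733463 0.38582934
61322255 cg17147820 0.30742542 0.31572314 0.29912770
61322742 cg08922201 0.47443355 0.47973039 0.46913671"

# check.names = TRUE (the default) turns "Mean+Error" and "Mean-Error"
# into the syntactic, unique names "Mean.Error" and "Mean.Error.1".
df <- read.table(text = txt, header = TRUE)
col_names <- names(df)

# Mean.Error holds the upper bound, Mean.Error.1 the lower bound.
upper_is_first <- all(df$Mean.Error > df$Mean)
lower_is_second <- all(df$Mean.Error.1 < df$Mean)
```

Passing check.names = FALSE would keep the original headers, which then need backticks when used in aes().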
I am trying to insert labels into a proportional bar chart: one label per segment, showing the percentage of each segment. With the help of thothal I managed to do this:
var1 <- factor(as.character(c(1,1,2,3,1,4,3,2,3,2,1,4,2,3,2,1,4,3,1,2)))
var2 <- factor(as.character(c(1,4,2,3,4,2,1,2,3,4,2,1,1,3,2,1,2,4,3,2)))
data <- data.frame(var1, var2)
dat <- ddply(data, .(var1), function(.) {
res <- cumsum(prop.table(table(factor(.$var2))))
data.frame(lab = names(res), y = c(res))
})
ggplot(data, aes(x = var1)) + geom_bar(aes(fill = var2), position = 'fill') +
geom_text(aes(label = lab, x = var1, y = y), data = dat)
I would like the labels to show the percentage of each level, not the level name.
Any help appreciated!
You are telling geom_text to use var2 as your y variable. That is in fact as.numeric(data$var2), which translates to a range of 1-4. However, your barplot uses the cumulative percentages.
Hence you have to calculate these positions beforehand:
library(ggplot2)
library(plyr) # just for convenience
var1 <- factor(as.character(c(1,1,2,3,1,4,3,2,3,2,1,4,2,3,2,1,4,3,1,2)))
var2 <- factor(as.character(c(1,4,2,3,4,2,1,2,3,4,2,1,1,3,2,1,2,4,3,2)))
data <- data.frame(var1, var2)
dat <- ddply(data, .(var1), function(.) {
res <- cumsum(prop.table(table(factor(.$var2)))) # re-factor to use only used levels
res2 <- prop.table(table(factor(.$var2))) # re-factor to use only used levels
data.frame(lab = names(res), y = c(res), lab2 = c(res2))
})
ggplot(data, aes(x = var1)) + geom_bar(aes(fill = var2), position = 'fill') +
geom_text(aes(label = round(lab2, 2), x = var1, y = y), data = dat)
This places the labels at the top end of each segment. If you want them slightly offset, you should play around with the creation of dat.
Another way to get non-cumulative percentage plus centering the labels, for future reference:
dat <- ddply(data, .(var1), function(.) {
good <- prop.table(table(factor(.$var2)))
res <- cumsum(prop.table(table(factor(.$var2))))
data.frame(lab = names(res), y = c(res), good = good, pos = cumsum(good) - 0.5*good)
})
ggplot(data, aes(x = var1)) + geom_bar(aes(fill = var2), position = 'fill') +
geom_text(aes(label = round(good.Freq, 2), x = var1, y = pos.Freq), data = dat)
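The arithmetic behind the label text and the centred positions can be checked on a tiny example in base R. This is a sketch with made-up data; seg, prop, top, mid and label are illustrative names, not from the answers above.

```r
# One bar with four observations in three segments (made-up data).
seg <- c("a", "a", "b", "c")

# Share of each segment within the bar.
prop <- as.numeric(prop.table(table(seg)))

# Upper edge of each segment when stacked in level order.
top <- cumsum(prop)

# Centre of each segment: upper edge minus half the segment height.
mid <- top - prop / 2

# Label text as whole percentages.
label <- paste0(round(100 * prop), "%")
```

Depending on the ggplot2 version, position = 'fill' may stack the segments in the reverse of the factor-level order, so it is worth checking the computed positions against the drawn bars.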
I used the following code and it worked well for me; give it a try.
geom_text(aes(label = paste(round(dat2$value,0), "%"),
vjust = ifelse(value >= 0, -0.05, 1.15)
),
size = 4, position = position_stack(vjust=0.5)
)
Basically, you need label = paste(y value, "%"). In my code, dat2 is the name of the data frame and value is the Y value in the figure. In this case, I rounded the number to 0 decimal places. Good luck.