Side by side violin plots for multiple iterations - r

Here is my data
set.seed(42)
dat = data.frame(iter = rep(1:3, each = 10),
variable = rep(rep(letters[1:2], each = 5), 3),
value = rnorm(30))
I know I can draw violin plots for a and b with
library(ggplot2)
ggplot(data = dat, aes (x = variable, y = value)) + geom_violin()
But how do I draw violin plots for each iteration of a and b, so that there are three plots for a next to three plots for b? I have done this previously with base plot, but I am looking for a better solution since the number of iterations, as well as the number of 'a's and 'b's, keeps changing.

There are two possible ways: map iter to the fill aesthetic, or use facet_wrap (or facet_grid).
With fill:
ggplot(data = dat, aes (x = variable, y = value, fill = as.factor(iter))) + geom_violin(position = "dodge")
Or using facet_wrap:
ggplot(data = dat, aes (x = as.factor(iter), y = value)) + geom_violin(position = "dodge") + facet_wrap(~variable)
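The facet_grid variant mentioned above is not shown, so here is a minimal sketch of it using the data from the question. The labs() call is only an illustrative addition; nothing here depends on the number of iterations or variables.

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(iter = rep(1:3, each = 10),
                  variable = rep(rep(letters[1:2], each = 5), 3),
                  value = rnorm(30))

# One panel per level of variable, one violin per iteration inside each panel
p <- ggplot(dat, aes(x = factor(iter), y = value)) +
  geom_violin() +
  facet_grid(. ~ variable) +
  labs(x = "iter")
p
```

If more iterations or more variables are added to dat, the same call should produce the extra violins and panels without changes.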

Maybe there is a better way, but in this kind of situation I usually create a new variable for the x-axis:
library(dplyr)
library(ggplot2)
set.seed(42)
dat = data.frame(iter = rep(1:3, each = 10),
                 variable = rep(rep(letters[1:2], each = 5), 3),
                 value = rnorm(30))
# Encode variable and iter in one factor: 110, 120, 130 for a; 210, 220, 230 for b
dat <- dat %>% mutate(x_axis = as.factor(as.numeric(factor(variable))*100 + 10*iter))
# Name the middle level of each group after the variable so it can serve as the only axis label
levels(dat$x_axis) <- c("a1", "a", "a3", "b1", "b", "b3")
ggplot(data = dat,
       aes(x = x_axis,
           y = value, fill = variable)) + geom_violin() + scale_x_discrete(breaks = c("a","b"))
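Since the number of iterations and variables keeps changing, hard-coding the level names may become tedious. A possible variation, sketched here as an assumption rather than as part of the answer above, builds the combined x-axis factor with interaction(); the column name x_axis is reused from that answer.

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(iter = rep(1:3, each = 10),
                  variable = rep(rep(letters[1:2], each = 5), 3),
                  value = rnorm(30))

# lex.order = TRUE keeps all levels of variable together: a.1, a.2, a.3, b.1, ...
dat$x_axis <- interaction(dat$variable, dat$iter, lex.order = TRUE)

p <- ggplot(dat, aes(x = x_axis, y = value, fill = variable)) +
  geom_violin()
p
```

The axis then shows one label per variable and iteration pair instead of a single label per variable.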

Related

Comparing specific rows and whole rows in boxplot

I have a data frame ("Date", "A", "B"). I'm trying to use boxplots (by month) to analyze the data in "A" for the rows filtered by "B" and also for all rows of "A". So far I can only create two separate plots: one for the filtered rows and one for all rows.
I tried to put two geom_boxplot layers under one ggplot(), but the two boxplots just overlap each other. Here is the code I used. Does anyone know how I can combine those two boxplots into one plot, so that they share the same x axis and each month on the x axis has two boxes?
ggplot() +
geom_boxplot(data = df %>% filter(B == 1),
aes(x = Month, y = A, group=Month, fill = "Chamber_no fire"), outlier.shape = T) +
geom_boxplot(data = df, aes(x = Month, y = A, group=Month, fill="Chamber"), outlier.shape = T) +
theme_bw() +
theme(panel.grid.major = element_blank()) +
scale_x_continuous(breaks=seq(2,12,1), minor_breaks = F) +
geom_hline(yintercept = 0, linetype="dotted")
ggsave("sate_meas_O3_NOx_5km_nofire.png", width = 6, height = 4, units = "in")
One approach to achieve your desired result is to:
1) Bind the filtered dataset and the total dataset by row and add an identifier id for each dataset, which can easily be done via dplyr::bind_rows.
2) Make a boxplot where you map id to the fill aesthetic and group by both id and Month using interaction.
3) Set the legend labels via scale_fill_discrete.
As you provided no data, I make use of a random example data set:
set.seed(42)
df <- data.frame(
Month = sample(2:12, 100, rep = TRUE),
A = rnorm(100),
B = sample(1:2, 100, rep = TRUE)
)
library(ggplot2)
library(dplyr)
d <- bind_rows(list(b1 = df %>% filter(B == 1),
all = df), .id = "id")
ggplot(data = d, mapping = aes(x = Month, y = A, group=interaction(Month, id), fill = id)) +
geom_boxplot(outlier.shape = T, position = "dodge") +
scale_fill_discrete(labels = c(b1 = "Chamber_no fire", all = "Chamber")) +
theme_bw() +
theme(panel.grid.major = element_blank()) +
scale_x_continuous(breaks=seq(2,12,1), minor_breaks = F) +
geom_hline(yintercept = 0, linetype="dotted")
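To see what the binding step produces, the following sketch rebuilds the same example data and counts the rows per id. It spells out replace = TRUE instead of the abbreviated rep = TRUE; the count() call is an illustrative addition.

```r
library(dplyr)

set.seed(42)
df <- data.frame(
  Month = sample(2:12, 100, replace = TRUE),
  A = rnorm(100),
  B = sample(1:2, 100, replace = TRUE)
)

# The list names become the values of the new id column
d <- bind_rows(list(b1 = filter(df, B == 1), all = df), .id = "id")

# Rows with B == 1 appear twice: once under "b1" and once under "all"
count(d, id)
```

Because the filtered rows are duplicated rather than relabelled, the "all" boxes still summarize the complete data set.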

R: How to combine grouping and colour aesthetic in ggplot line plot

I am trying to create a line plot with 2 types of measurements, but my data is missing some x values. In "Line break when no data in ggplot2" I found how to create a plot that breaks the line where there is no data, but it does not allow plotting 2 lines (one for each Type).
1) When I try
ggplot(Data, aes(x = x, y = y, group = grp)) + geom_line()
it makes only one line, but with a break where there is no data
2) When I try
ggplot(Data, aes(x = x, y = y, col = Type)) +
geom_line()
it makes 2 lines, but without a break where there is no data
3) When I try
ggplot(Data, aes(x = x, y = y, col = Type, group = grp)) +
geom_line()
it makes an unreadable chart
4) Of course I could combine Type and grp into a new variable, but then the legend is not nice, and I get 4 groups (and colours) instead of 2.
5) I could also do something like the following, but it does not produce a legend, and in my real dataset I have far too many Types to do that
ggplot() +
geom_line(data = Data[Data$Type == "A",], aes(x = x, y = y, group = grp), col = "red") +
geom_line(data = Data[Data$Type == "B",], aes(x = x, y = y, group = grp), col = "blue")
Data sample:
Data <- data.frame(x = c(1:100, 201:300), y = rep(c(1, 2), 100), Type = rep(c("A", "B"), 100), grp = rep(c(1, 2), each = 100))
One way is to use interaction() to specify a grouping of multiple columns:
library(ggplot2)
Data <- data.frame(x = c(1:100, 201:300), y = rep(c(1, 2), 100), Type = rep(c("A", "B"), 100), grp = rep(c(1, 2), each = 100))
ggplot(Data, aes(x = x, y = y, col = Type, group = interaction(grp,Type))) +
geom_line()
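A quick check, sketched below with the sample data, shows why this works: the interaction defines four separate line segments, while the colour mapping still uses only the two levels of Type. The variable name groups is introduced here for illustration.

```r
Data <- data.frame(x = c(1:100, 201:300), y = rep(c(1, 2), 100),
                   Type = rep(c("A", "B"), 100), grp = rep(c(1, 2), each = 100))

# Each combination of grp and Type becomes its own group, i.e. its own line segment
groups <- interaction(Data$grp, Data$Type)
nlevels(groups)            # 4
length(unique(Data$Type))  # 2, so the legend keeps two colours
```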

R - geom_bar - 'stack' position without summing the values

I have this data frame
df <- data.frame(profile = rep(c(1,2), times = 1, each = 3), depth = c(100, 200, 300), value = 1:3)
This is my plot
ggplot() +
geom_bar(data = df, aes(x = profile, y = - depth, fill = value), stat = "identity")
My problem is that the y-axis labels don't correspond to the depth values in the data frame.
To help, my desired plot looks like this:
ggplot() +
geom_point(data = df, aes(x = profile, y = depth, colour = value), size = 20) +
xlim(c(0,3))
But with bars instead of points, vertically aligned.
NB: I don't want to correct it manually by changing the ticks with scale_y_discrete(labels = (desired_labels))
Thanks for your help
Considering you want a y-axis from 0 to -300, facet_grid() seems to be a good option, since it avoids summing the values together.
ggplot() + geom_bar(data = df, aes(x = as.factor(profile), y = -depth, fill = value), stat = 'identity') + facet_grid(~ value)
I have it!
Thanks for your replies, and thanks to this post: R, subtract value from previous row, group by
To summarize, the data:
df <- data.frame(profile = rep(c(1,2), times = 1, each = 3), depth = c(100, 200, 300), value = 1:3)
Then we compute the depth step within each profile:
df$diff <- ave(df$depth, df$profile, FUN=function(z) c(z[1], diff(z)))
And finally the plot :
ggplot(df, aes(x = factor(profile), y = -diff, fill = value)) + geom_col()
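The reason this works is that stacked bars add up their heights, so stacking the per-profile steps reproduces the original depths. A small sketch of that check, where the name rebuilt is introduced only for illustration:

```r
df <- data.frame(profile = rep(c(1, 2), each = 3), depth = c(100, 200, 300), value = 1:3)

# First depth of each profile, then the increments between consecutive depths
df$diff <- ave(df$depth, df$profile, FUN = function(z) c(z[1], diff(z)))
df$diff  # 100 100 100 100 100 100

# The cumulative sum within each profile recovers the original depth values
rebuilt <- ave(df$diff, df$profile, FUN = cumsum)
all(rebuilt == df$depth)  # TRUE
```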

R: ggplot2 removing some legend entries

require(reshape2);require(ggplot2)
df <- data.frame(time = 1:10,
x1 = rnorm(10),
x2 = rnorm(10),
x3 = rnorm(10),
y1 = rnorm(10),
y2 = rnorm(10))
df <- melt(df, id = "time")
ggplot(df, aes(x = time, y = value, color = variable, group = variable,
size = variable, linetype = variable)) +
geom_line() +
scale_linetype_manual(values = c(rep(1, 3), 2, 2)) +
scale_size_manual(values = c(rep(.3, 3), 2, 2)) +
scale_color_manual(values = c(rep("grey", 3), "red", "green")) +
theme_minimal()
This example might not be very representative, but imagine running a bunch of regression models that are not important individually and merely contribute to the overall picture, while I want to emphasize only the actual and averaged fit series. So the x variables are not important and should not appear in the legend.
I've tried setting scale_color_discrete(breaks = c("y1", "y2")) as suggested in some other posts. The problem is that all of the aesthetics are already set via manual scales, and adding another discrete scale overrides the properties already set for the graph (and messes up the whole thing). Ideally, I'd like to see exactly the same graph, but with only y1 and y2 displayed in the legend.
You can try subsetting the data set by variable name and plotting the subsets separately.
p <- ggplot(df, aes(x = time, y = value, color = variable,
group = variable, size = variable, linetype = variable)) +
geom_line(data=df[which(substr(df$variable,1,1)=='y'),])+
scale_linetype_manual(values = c(2, 2)) + scale_size_manual(values = c(2, 2)) +
scale_color_manual(values = c("red", "green")) +
theme_minimal() +
geom_line(data=df[which(substr(df$variable,1,1)=='x'),],
aes(x = time, y = value, group = variable),
color="grey",size=0.3,linetype=1)
# Plot elements that have attributes set outside of aes() will
# not appear on legend!
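Another option that may be worth trying, offered as a sketch and not taken from the answer above, is to keep the single-layer plot and pass the same breaks to each of the manual scales. Named values are used so that each level is matched explicitly; the long-format data is built directly here instead of via melt(), and the seed and the name keep are assumptions added for illustration.

```r
library(ggplot2)

set.seed(1)  # added only to make the sketch reproducible
df <- data.frame(time = rep(1:10, 5),
                 variable = factor(rep(c("x1", "x2", "x3", "y1", "y2"), each = 10)),
                 value = rnorm(50))

keep <- c("y1", "y2")  # series that should appear in the legend

# size is kept from the question; newer ggplot2 versions prefer linewidth for lines
p <- ggplot(df, aes(x = time, y = value, color = variable, group = variable,
                    size = variable, linetype = variable)) +
  geom_line() +
  scale_linetype_manual(values = c(x1 = 1, x2 = 1, x3 = 1, y1 = 2, y2 = 2), breaks = keep) +
  scale_size_manual(values = c(x1 = .3, x2 = .3, x3 = .3, y1 = 2, y2 = 2), breaks = keep) +
  scale_color_manual(values = c(x1 = "grey", x2 = "grey", x3 = "grey",
                                y1 = "red", y2 = "green"), breaks = keep) +
  theme_minimal()
p
```

Giving all three scales identical breaks is what should let ggplot2 keep them merged into one legend.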

geom_text with facet_wrap in ggplot2 when group specified

When using ggplot2 to make faceted plots, I'm having trouble getting individual labels in each facet when I also specify a grouping parameter. Without specifying group = ..., things work fine, but I'm trying to make plots of paired data that emphasize the before vs. after treatment changes.
Here is an example:
library(tidyr)
library(ggplot2)
set.seed(253)
data <- data.frame(Subject = LETTERS[1:10],
Day1.CompoundA = rnorm(10, 4, 2),
Day2.CompoundA = rnorm(10, 7, 2),
Day1.CompoundB = rnorm(10, 5, 2),
Day2.CompoundB = rnorm(10, 5.5, 2))
# Compare concentration of compounds by day
A <- t.test(data$Day1.CompoundA, data$Day2.CompoundA, paired = TRUE)
B <- t.test(data$Day1.CompoundB, data$Day2.CompoundB, paired = TRUE)
data.long <- gather(data, key = DayCompound, value = Concentration, -Subject) %>%
separate(DayCompound, c("Day", "Compound"))
# text to annotate graphs
graphLabels <- data.frame(Compound = c("CompoundA", "CompoundB"),
Pval = paste("p =", c(signif(A$p.value, 2),
signif(B$p.value, 2))))
Ok, now that the data are set up, I can make a boxplot just fine:
ggplot(data.long, aes(x = Day, y = Concentration)) +
geom_boxplot() +
facet_wrap(~ Compound) +
geom_text(data = graphLabels, aes(x = 1.5, y = 10, label = Pval))
But if I want to show line plots that emphasize the paired nature of the data by showing each subject in a different color, the facet labels don't work.
ggplot(data.long, aes(x = Day, y = Concentration, color = Subject, group = Subject)) +
geom_point() + geom_line() +
facet_wrap(~ Compound) +
geom_text(data = graphLabels, aes(x = 1.5, y = 10, label = Pval))
# Error in eval(expr, envir, enclos) : object 'Subject' not found
Any suggestions?
When you map aesthetics (i.e. aes(...,color = Subject)) in the top level ggplot() call, those mappings are passed on to each layer, which means that each layer expects data to have variables by those names.
You either need to specify the data and mapping separately in each layer, or unmap them explicitly:
ggplot(data.long, aes(x = Day, y = Concentration, color = Subject, group = Subject)) +
geom_point() + geom_line() +
facet_wrap(~ Compound) +
geom_text(data = graphLabels, aes(x = 1.5, y = 10, label = Pval,color = NULL,group= NULL))
There is also an inherit.aes argument that you can set to FALSE in any layer you don't want pulling in those other mappings, e.g.
ggplot(data.long, aes(x = Day, y = Concentration, color = Subject, group = Subject)) +
geom_point() + geom_line() +
facet_wrap(~ Compound) +
geom_text(data = graphLabels, aes(x = 1.5, y = 10, label = Pval),inherit.aes = FALSE)
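The first option mentioned above, specifying the mappings separately in each layer, is not shown in code, so here is a minimal sketch of it. The data below is a small made-up stand-in for data.long, and the p-value labels are placeholders, not results from the question.

```r
library(ggplot2)

# Made-up stand-in for data.long, for illustration only
set.seed(1)
data.long <- expand.grid(Subject = LETTERS[1:4],
                         Day = c("Day1", "Day2"),
                         Compound = c("CompoundA", "CompoundB"))
data.long$Concentration <- rnorm(nrow(data.long), 5, 2)

# Placeholder labels, one per facet
graphLabels <- data.frame(Compound = c("CompoundA", "CompoundB"),
                          Pval = c("p = 0.01", "p = 0.5"))

# Only x and y are shared; Subject is mapped in the layers that actually need it
p <- ggplot(data.long, aes(x = Day, y = Concentration)) +
  geom_point(aes(color = Subject)) +
  geom_line(aes(color = Subject, group = Subject)) +
  facet_wrap(~ Compound) +
  geom_text(data = graphLabels, aes(x = 1.5, y = 10, label = Pval))
p
```

With this layout the text layer never sees a Subject mapping, so graphLabels does not need that column.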
