Plotting multiple plots with geom_violin and lapply - r

I'm trying to use lapply to make multiple violin plots, stacked side by side.
The base code is:
ggplot(mpg, aes(x = class, y = cyl, fill = class)) +
geom_violin() + ggtitle("cyl") +
geom_jitter(shape=16,position=position_jitter(0.1))
So I'm trying to use lapply:
plots_list = lapply(
names(mpg[,3:5]),
function(n)
ggplot(mpg, aes(x = class, y = n, fill = class)) +
geom_violin() + geom_jitter(shape=16, position=position_jitter(0.1))
+ ggtitle(n)
)
plots_list[[1]]
But y = n gives no violin plot.
If I use:
plots_list = lapply(
mpg[,3:5],
function(n)
ggplot(mpg, aes(x = class, y = n, fill = class)) +
geom_violin() + geom_jitter(shape=16, position=position_jitter(0.1)) + ggtitle(n)
)
plots_list[[1]]
Then the plot titles are not correct.
Also, when using:
grid.arrange(plots_list[1:3], ncol = 2)
I get errors, but plotting with:
plots_list[1:3]
works like a charm

Your n is not a symbol, it's a string, so you need aes_string:
plots_list = lapply(
names(mpg[,3:5]),
function(n)
ggplot(mpg, aes_string(x = "class", y = n, fill = "class")) +
geom_violin() + geom_jitter(shape=16, position=position_jitter(0.1))
+ ggtitle(n)
)
plots_list[[1]]
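For what it's worth, aes_string() is deprecated in recent ggplot2 releases. Below is a minimal sketch of the same loop using the .data pronoun instead, assuming ggplot2 3.0.0 or later and the gridExtra package. The vars name and the labs(y = n) call are additions for illustration, not part of the original code. The sketch also passes the list through the grobs argument, because grid.arrange() does not take a list of plots as a positional argument, which is the likely cause of the errors mentioned in the question:

```r
library(ggplot2)
library(gridExtra)

# Column names to loop over (columns 3 to 5 of mpg)
vars <- names(mpg)[3:5]

plots_list <- lapply(vars, function(n) {
  # .data[[n]] looks up the column whose name is stored in the string n
  ggplot(mpg, aes(x = class, y = .data[[n]], fill = class)) +
    geom_violin() +
    geom_jitter(shape = 16, position = position_jitter(0.1)) +
    labs(title = n, y = n)
})

# A list of plots goes in through the grobs argument
grid.arrange(grobs = plots_list, ncol = 2)
```

Looping over the names rather than over the columns themselves keeps the string available for the title, which is what went wrong in the second attempt in the question.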

Related

Pass changed geom from object to other ggplot

I first make a plot
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill='red',alpha=..level..),geom='polygon', show.legend = F)
Then I want to change the geom_density values and use these in another plot.
# build plot
q <- ggplot_build(p)
# Change density
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
Build the other plot using the changed densities, something like this:
# Built another plot
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_density2d(dens)
This does not work, however. Is there a way of doing this?
EDIT: doing it when there are multiple groups:
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40), group = c(rep('A',40), rep('B',60), rep('C',26)))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill=group,alpha=..level..),geom='polygon', show.legend = F)
q <- ggplot_build(p)
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F)
Results when applied to my own dataset
Although this is exactly what I'm looking for, the fill colors do not seem to correspond to the initial colors (linked to A, B and C).
Like this? It is possible to plot a transformation of the shapes plotted by geom_density. But that's not quite the same as manipulating the underlying density...
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F)
Edit - OP now has multiple groups. We can plot those with the code below, which produces an artistic plot of questionable utility. It does what you propose, but I would suggest it would be more fruitful to transform the underlying data and summarize that, if you are looking for representative output.
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = group, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F) +
theme_minimal()
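On the mismatched fill colors: one plausible explanation is that mapping fill = fill hands the stored hex strings to the default discrete scale, which treats them as category labels and assigns its own palette. A minimal sketch of that effect and of scale_fill_identity() as a possible remedy, using a hypothetical toy data frame (squares) rather than the ggplot_build() output:

```r
library(ggplot2)

# Two unit squares, each carrying the color it is supposed to be drawn in
squares <- data.frame(
  x     = c(0, 1, 1, 0, 2, 3, 3, 2),
  y     = c(0, 0, 1, 1, 0, 0, 1, 1),
  piece = rep(c("a", "b"), each = 4),
  fill  = rep(c("#FF0000", "#0000FF"), each = 4)
)

# Default scale: the hex strings are treated as category labels
p_default <- ggplot(squares, aes(x, y, group = piece, fill = fill)) +
  geom_polygon()

# Identity scale: the hex strings are used as the colors themselves
p_identity <- p_default + scale_fill_identity()

# Compare the colors that actually get drawn
unique(layer_data(p_default)$fill)
unique(layer_data(p_identity)$fill)
```

Whether this applies to your own dataset depends on how the fill column was produced, so treat it as something to try rather than a diagnosis.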

overlaying plots from different dataframes in ggplot without messing with legend

I want to overlay two plots: one is a simple point plot where a variable is used to control the dot size; and another is a simple curve.
Here is a dummy example for the first plot;
library(ggplot2)
x <- seq(from = 1, to = 10, by = 1)
df = data.frame(x=x, y=x^2, v=2*x)
ggplot(df, aes(x, y, size = v)) + geom_point() + theme_classic() + scale_size("blabla")
Now let's overlay a curve on this plot with data from another dataframe:
df2 = data.frame(x=x, y=x^2-x+2)
ggplot(df, aes(x, y, size = v)) + geom_point() + theme_classic() + scale_size("blabla") + geom_line(data=df2, aes(x, y), color = "blue") + scale_color_discrete(name = "other", labels = c("nanana"))
It produces the error:
Error in FUN(X[[i]], ...) : object 'v' not found
The value in v is not used to draw the intended curve, but anyway, I added a dummy v to df2.
df2 = data.frame(x=x, y=x^2-x+2, v=replicate(length(x),0)) # add a dummy v
ggplot(df, aes(x, y, size = v)) + geom_point() + theme_classic() + scale_size("blabla") + geom_line(data=df2, aes(x, y), color = "blue") + scale_color_discrete(name = "other", labels = c("nanana"))
And the result has a messed-up legend:
What is the right way to achieve the desired plot?
You can put the size aesthetic in the geom_point() call, so that geom_line() no longer inherits it and you don't need the dummy v in df2.
Not sure exactly what you want regarding the legend. With that change alone, the blue portion of the legend goes away. If you want a legend for the line color, you have to place color inside the aes() call of geom_line():
x <- seq(from = 1, to = 10, by = 1)
df = data.frame(x=x, y=x^2, v=2*x)
df2 = data.frame(x=x, y=x^2-x+2)
ggplot(df, aes(x, y)) +
geom_point(aes(size = v)) +
theme_classic() +
scale_size("blabla") +
geom_line(data=df2, aes(x, y, color = "blue")) +
scale_color_manual(values = "blue", labels = "nanana", name = "other")

Create a vector with multiple expressions

In the following ggplot bar chart, how can I generate a vector with multiple expressions automatically?
data <- data.frame(x = LETTERS[1:11], y = 10^(0:10))
z <- 0:10
y.labels <- sprintf(paste0("10^", z))
ggplot(data, aes(x, y)) +
geom_bar(stat = "identity") +
scale_y_log10(breaks = 10^(z), labels = y.labels)
I've tried with bquote(.(10^c(z))), but it does not give the desired result.
My only alternative is to do it manually, but it is not automatic:
y.labels <- expression("10"^0, "10"^1, "10"^2, "10"^3, "10"^4, "10"^5, "10"^6, "10"^7, "10"^8, "10"^9, "10"^10)
Try parse(text = ...), which will convert the character vector y.labels into the expected expression:
ggplot(data, aes(x, y)) +
geom_bar(stat = "identity") +
scale_y_log10(breaks = 10^(z), labels = parse(text = y.labels))
We can use bquote with as.expression:
y.labels <- sapply(z, function(u) as.expression(bquote(10^.(u))))
ggplot(data, aes(x, y)) +
geom_bar(stat = "identity") +
scale_y_log10(breaks = 10^(z), labels =y.labels)
If you don't need to store both z and y.labels, you can use:
library(scales)
data <- data.frame(x = LETTERS[1:11], y = 10^(0:10))
ggplot(data, aes(x, y)) +
geom_bar(stat = "identity") +
scale_y_log10(breaks = trans_breaks(log10, function(x) 10^x, 10),
labels = trans_format(log10, math_format(10^.x)))
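To see what the parse(text = ...) and bquote() answers above actually build, here is a small base R sketch with no plotting involved. The names labels_parse and labels_bquote are made up for illustration, and z is shortened to 0:3 to keep the output small:

```r
z <- 0:3

# Approach 1: build the strings "10^0", "10^1", ... and parse them
labels_parse <- parse(text = paste0("10^", z))

# Approach 2: build one unevaluated call per exponent with bquote(),
# then combine the list of calls into a single expression vector
labels_bquote <- as.expression(lapply(z, function(u) bquote(10^.(u))))

# Both are expression vectors with one element per exponent
length(labels_parse)
length(labels_bquote)

# Evaluating each element recovers the break values 10^z
vapply(seq_along(labels_parse), function(i) eval(labels_parse[[i]]), numeric(1))
vapply(seq_along(labels_bquote), function(i) eval(labels_bquote[[i]]), numeric(1))
```

Either vector can be passed as the labels argument; the choice is mostly a matter of whether you prefer assembling strings or assembling calls.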

qplot (ggplot2): plot of more functions with the same color

I'm plotting 11 curves and the program below works well, but I'm not able to change the wild colors to plot 11 black curves:
library(ggplot2)
#library(latex2exp)
library(reshape)
fn <- "img/plot.eps"
fct1 <- function(x0 ){
return(1/sin(x0)+1/tan(x0))
}
fct2 <- function(beta, t ){
return(2*atan(exp(t)/beta))
}
t<-seq(from=0,to=10,by=0.01)
s1<-cbind(t, fct2(fct1(-pi+0.0001),t),
fct2(fct1(-1.5),t),
fct2(fct1(-0.5),t),
fct2(fct1(-0.05),t),
fct2(fct1(-0.01),t),
fct2(fct1(0),t),
fct2(fct1(0.01),t),
fct2(fct1(0.05),t),
fct2(fct1(0.5),t),
fct2(fct1(1.5),t),
fct2(fct1(pi),t))
colnames(s1)<-c("time","y1","y2","y3","y4","y5","y6","y7","y8","y9","y10","y11")
s2 <- melt(as.data.frame(s1), id = "time")
q <- ggplot(s2, aes(x = time, y = value, color = variable))
q <- q + geom_line() + ylab("y") + xlab("t")+ ylab("x(t)")+
theme_bw(base_size = 7) + guides(colour = FALSE)
ggsave(filename = fn, plot = q, width = 2, height = 1) # pass q explicitly; it has not been printed yet
q
EDIT Now the code should be reproducible
You need to map the variable to the group aesthetic instead of color; the lines are then drawn in the default black.
q <- ggplot() +
geom_line(data = s2, aes(x = time, y = value,
group = variable)) +
xlab("t")+ ylab("x(t)") +
theme_bw(base_size = 7) + guides(colour = FALSE)
q
To be perfectly clear, it is possible to map the color to the variable, which can produce black lines, but not without changing the legend. Here is how you would amend the colors after the fact, if you wanted to, having already mapped the color to the variable.
q <- ggplot() +
geom_line(data = s2, aes(x = time, y = value,
color = variable)) +
xlab("t")+ ylab("x(t)") +
theme_bw(base_size = 7) + guides(colour = FALSE) +
scale_color_manual(values = rep("black",11))
q

ggplot: relative frequencies of two groups

I want a plot like this except that each facet sums to 100%. Right now group M is 0.05+0.25=0.30 instead of 0.20+0.80=1.00.
df <- rbind(
data.frame(gender=c(rep('M',5)), outcome=c(rep('1',4),'0')),
data.frame(gender=c(rep('F',10)), outcome=c(rep('1',7),rep('0',3)))
)
df
ggplot(df, aes(outcome)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
facet_wrap(~gender, nrow=2, ncol=1)
(Using y = ..density.. gives worse results.)
Here's another way:
ggplot(df, aes(outcome)) +
geom_bar(aes(y = ..count.. / sapply(PANEL, FUN=function(x) sum(count[PANEL == x])))) +
facet_wrap(~gender, nrow=2, ncol=1)
I usually do this by simply precalculating the values outside of ggplot2 and using stat = "identity":
library(plyr) # for ddply
library(reshape2) # for melt
df1 <- melt(ddply(df,.(gender),function(x){prop.table(table(x$outcome))}),id.vars = 1)
ggplot(df1, aes(x = variable,y = value)) +
facet_wrap(~gender, nrow=2, ncol=1) +
geom_bar(stat = "identity")
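The same precalculation can also be sketched in base R, without plyr or reshape2. This is just one possible variant: the props name is introduced here for illustration, and geom_col() stands in for geom_bar(stat = "identity"), assuming ggplot2 2.2.0 or later:

```r
library(ggplot2)

df <- rbind(
  data.frame(gender = rep("M", 5), outcome = c(rep("1", 4), "0")),
  data.frame(gender = rep("F", 10), outcome = c(rep("1", 7), rep("0", 3)))
)

# Cross-tabulate, then turn counts into proportions within each gender
# (margin = 1 normalizes over rows, and gender is the row variable)
counts <- table(gender = df$gender, outcome = df$outcome)
props <- as.data.frame(prop.table(counts, margin = 1))

# props has the columns gender, outcome and Freq
p <- ggplot(props, aes(outcome, Freq)) +
  geom_col() +
  facet_wrap(~gender, nrow = 2, ncol = 1)
p
```

With the data from the question, group M comes out as 0.20 and 0.80, so each facet sums to 1.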
