ggMarginal ignores coord_cartesian. How to change marginal scales? - r

I'm trying to plot a 2D density plot with ggplot, with added marginal histograms. Problem is that the polygon rendering is stupid and needs to be given extra padding to render values outside your axis limits (e.g. in this case I set limits between 0 and 1, because values outside this range have no physical meaning). I still want the density estimate though, because often it's much cleaner than a blocky 2D heatmap.
Is there a way around this problem, besides scrapping ggMarginal entirely and spending another 50 lines of code trying to align histograms?
Unsightly lines:
Now rendering works, but ggMarginal ignores coord_cartesian(), which demolishes the plot:
Data here:
http://pasted.co/b581605a
dataset <- read.csv("~/Desktop/dataset.csv")
library(ggplot2)
library(ggthemes)
library(ggExtra)
plot_center <- ggplot(data = dataset, aes(x = E,
y = S)) +
stat_density2d(aes(fill=..level..),
bins= 8,
geom="polygon",
col = "black",
alpha = 0.5) +
scale_fill_continuous(low = "yellow",
high = "red") +
scale_x_continuous(limits = c(-1,2)) + # Render padding for polygon
scale_y_continuous(limits = c(-1,2)) + #
coord_cartesian(ylim = c(0, 1),
xlim = c(0, 1)) +
theme_tufte(base_size = 15, base_family = "Roboto") +
theme(axis.text = element_text(color = "black"),
panel.border = element_rect(colour = "black", fill=NA, size=1),
legend.text = element_text(size = 12, family = "Roboto"),
legend.title = element_blank(),
legend.position = "none")
ggMarginal(plot_center,
type = "histogram",
col = "black",
fill = "orange",
margins = "both")

You can solve this problem by using xlim() and ylim() instead of coord_cartesian.
dataset <- read.csv("~/Desktop/dataset.csv")
library(ggplot2)
library(ggthemes)
library(ggExtra)
plot_center <- ggplot(data = dataset, aes(x = E,
y = S)) +
stat_density2d(aes(fill=..level..),
bins= 8,
geom="polygon",
col = "black",
alpha = 0.5) +
scale_fill_continuous(low = "yellow",
high = "red") +
scale_x_continuous(limits = c(-1,2)) + # Render padding for polygon
scale_y_continuous(limits = c(-1,2)) + #
xlim(c(0,1)) +
ylim(c(0,1)) +
theme_tufte(base_size = 15, base_family = "Roboto") +
theme(axis.text = element_text(color = "black"),
panel.border = element_rect(colour = "black", fill=NA, size=1),
legend.text = element_text(size = 12, family = "Roboto"),
legend.title = element_blank(),
legend.position = "none")
ggMarginal(plot_center,
type = "histogram",
col = "black",
fill = "orange",
margins = "both")
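The difference between the two approaches comes down to how ggplot2 treats limits: limits set on a scale (xlim(), ylim(), scale_*_continuous(limits = ...)) discard out-of-range observations before any statistic is computed, while coord_cartesian() only zooms the finished panel. The sketch below uses made-up data (df_demo is hypothetical, not the dataset from the question) to show the effect on histogram counts; it says nothing about ggMarginal internals.

```r
library(ggplot2)

# Hypothetical toy data: two of the five values lie outside [0, 1].
df_demo <- data.frame(x = c(-0.5, 0.2, 0.4, 0.8, 1.5))

p_base <- ggplot(df_demo, aes(x)) + geom_histogram(binwidth = 0.5)

# coord_cartesian() zooms only: all five observations are still binned.
p_zoom <- p_base + coord_cartesian(xlim = c(0, 1))

# xlim() sets scale limits: out-of-range observations are dropped
# (with a warning) before binning.
p_limits <- p_base + xlim(0, 1)

# Total number of observations that ended up in the histogram bins.
n_zoom <- sum(layer_data(p_zoom, 1)$count)
n_limits <- suppressWarnings(sum(layer_data(p_limits, 1)$count))

print(c(zoom = n_zoom, limits = n_limits))
```

With scale limits the marginal histograms and the central panel are computed from the same restricted data, which is presumably why the two line up again.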

Related

histogram with densities estimated by a model in ggplot2

I'm trying to make a plot in ggplot2 of the densities estimated by a model fitted in gamlss.
I performed this using R base, as shown below:
library(gamlss)
library(ggplot2)
data(Orange)
mod.g = gamlss(circumference ~ age,
family=GA, data = Orange)
pred.g <- predict(mod.g, type = "r")
shapex = (mean(pred.g)/sd(pred.g))^2
ratex = mean(pred.g)/sd(pred.g)^2
hist(Orange$circumference, freq = FALSE, breaks = seq(0, 240, 20))
curve(dgamma(x,
shapex,
ratex), add = T,col = "blue",lwd=2)
legend("topright", legend = c("Gamma"), lty = 1, col = "blue")
Result:
However, when I tried to perform this in ggplot2 the lines are not being plotted, see:
ggplot(Orange, aes(x = circumference)) +
geom_histogram(color = "black", fill = "#225EA8", binwidth=30) +
geom_line(aes(shapex, ratex)) +
theme(legend.title = element_text(size = 15),
legend.text = element_text(size = 17),
axis.title = element_text(size = 22),
axis.text.x = element_text(color = "black", hjust=1),
axis.text.y = element_text(color = "black", hjust=1),
axis.text = element_text(size = 15),
strip.text.x = element_text(size = 18))
after_stat() is necessary, but it does not do the entire trick. With curve() you are actually plotting a function. In your ggplot2 code you pass two constants to geom_line(): how would ggplot2 know that you want to plot a gamma distribution with those two constants as parameters?
For this, you could use stat_function():
ggplot(Orange, aes(x = circumference)) +
  geom_histogram(aes(y = after_stat(density)), color = "black", fill = "#225EA8", binwidth = 30) +
  stat_function(fun = function(x) dgamma(x, shapex, ratex))
Created on 2023-02-15 with reprex v2.0.2
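stat_function() can also take the distribution function and its parameters separately through its args argument, which avoids the anonymous function. The sketch below is self-contained, so as an assumption it estimates shape and rate by the method of moments from the raw circumference values rather than from the gamlss predictions used in the question.

```r
library(ggplot2)

# Assumption for this sketch: moments of the raw data stand in for the
# fitted values pred.g from the gamlss model in the question.
m <- mean(Orange$circumference)
s <- sd(Orange$circumference)
shapex <- (m / s)^2  # gamma shape: mean^2 / variance
ratex <- m / s^2     # gamma rate: mean / variance

p <- ggplot(Orange, aes(x = circumference)) +
  geom_histogram(aes(y = after_stat(density)),
                 color = "black", fill = "#225EA8", binwidth = 30) +
  # Named parameters are forwarded to dgamma() through `args`.
  stat_function(fun = dgamma, args = list(shape = shapex, rate = ratex),
                color = "blue")

p
```

Naming the parameters in args guards against accidentally passing a scale where a rate is expected, since dgamma() accepts both.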

Dodge failing in violin plot

I would like to plot the congruence effects (incongruent minus congruent) as a violin plot per combination of stimulus age and response type. This is what my code looks like so far. I am not yet satisfied with the representation. How can I change it so that for each of the four conditions (adult frown, adult smile, child frown, child smile) I get the corresponding violin plot horizontally next to each other? Thanks in advance for the help. Attached is the code and an excerpt from the data frame.
dataset$congruency_effect <- ifelse(dataset$congruency == "congruent", dataset$avgAmplitude, -dataset$avgAmplitude)
p <- ggplot(dataset, aes(x = stimulusResponse, y = congruency_effect, fill = congruency_effect, group = stimulusAge)) +
geom_violin() +
geom_point(position = position_dodge(width = 0.75), size = 3, stat = "summary", fun.y = "mean") +
scale_fill_manual(values = c("#F8766D", "#00BFC4")) +
ggtitle("Conventional EEG 350-450 ms") +
scale_y_continuous(limits = c(-5, 5)) +
facet_wrap(~stimulusAge, scales = "free_x")
EEG_Conventional450_age_response <- p + theme(
# Set the plot title and axis labels to APA style
plot.title = element_text(face = "bold", size = 16),
axis.title = element_text(face = "bold", size = 14),
# Set the axis tick labels to APA style
axis.text = element_text(size = 12),
# Set the legend title and labels to APA style
legend.title = element_text(face = "bold", size = 14),
legend.text = element_text(size = 12),
# Set the plot and panel backgrounds to white
panel.background = element_rect(fill = "white"),
plot.background = element_rect(fill = "white")
)
EEG_Conventional450_age_response
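This has to do with the grouping aesthetic. Setting group = stimulusAge overrides the default grouping by the discrete x variable, so within each facet the frown and smile observations are pooled into a single group. Remove it, and your plot works.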
library(ggplot2)
set.seed(42)
dataset <- data.frame(stimulusResponse = rep(c("frown", "smile"), each = 20),
congruency_effect = rnorm(40),
stimulusAge = rep(c("baby", "adult"), 20))
## removed group = stimulusAge
ggplot(dataset, aes(x = stimulusResponse, y = congruency_effect)) +
geom_violin() +
geom_point(position = position_dodge(width = 0.75), size = 3, stat = "summary") +
facet_wrap(~stimulusAge, scales = "free_x")
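To get the four conditions from the question side by side in a single panel, one option is to combine the two factors into one x variable instead of faceting. This is only a sketch on simulated data; the condition column is a name introduced here for illustration, and interaction() is just one possible way to build it.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for the data frame in the question.
dataset <- data.frame(
  stimulusResponse = rep(c("frown", "smile"), each = 20),
  congruency_effect = rnorm(40),
  stimulusAge = rep(c("child", "adult"), 20)
)

# Hypothetical helper column: one level per age/response combination.
dataset$condition <- interaction(dataset$stimulusAge,
                                 dataset$stimulusResponse, sep = " ")

p <- ggplot(dataset, aes(x = condition, y = congruency_effect,
                         fill = stimulusAge)) +
  geom_violin() +
  # One point per violin marking the group mean.
  stat_summary(fun = mean, geom = "point", size = 3)

p
```

Because x is discrete and no group aesthetic is set, ggplot2 forms one group per condition, so each violin and each mean point is computed separately.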

ggplot2 barplot with dual Y-axis and error bars

I am trying to generate a barplot with dual Y-axis and error bars. I have successfully generated a plot with error bars for one variable but I don't know how to add error bars for another one. My code looks like this. Thanks.
library(ggplot2)
#Data generation
Year <- c(2014, 2015, 2016)
Response <- c(1000, 1100, 1200)
Rate <- c(0.75, 0.42, 0.80)
sd1<- c(75, 100, 180)
sd2<- c(75, 100, 180)
df <- data.frame(Year, Response, Rate,sd1,sd2)
df
# The errorbars overlapped, so use position_dodge to move them horizontally
pd <- position_dodge(0.7) # move them .05 to the left and right
png("test.png", units="in", family="Times", width=2, height=2.5, res=300) #pointsize is font size| increase image size to see the key
ggplot(df) +
geom_bar(aes(x=Year, y=Response),stat="identity", fill="tan1", colour="black")+
geom_errorbar(aes(x=Year, y=Response, ymin=Response-sd1, ymax=Response+sd1),
width=.2, # Width of the error bars
position=pd)+
geom_line(aes(x=Year, y=Rate*max(df$Response)),stat="identity",color = 'red', size = 2)+
geom_point(aes(x=Year, y=Rate*max(df$Response)),stat="identity",color = 'black',size = 3)+
scale_y_continuous(name = "Left Y axis", expand=c(0,0),limits = c(0, 1500),breaks = seq(0, 1500, by=500),sec.axis = sec_axis(~./max(df$Response),name = "Right Y axis"))+
theme(
axis.title.y = element_text(color = "black"),
axis.title.y.right = element_text(color = "blue"))+
theme(
axis.text=element_text(size=6, color = "black",family="Times"),
axis.title=element_text(size=7,face="bold", color = "black"),
plot.title = element_text(color="black", size=5, face="bold.italic",hjust = 0.5,margin=margin(b = 5, unit = "pt")))+
theme(axis.text.x = element_text(angle = 360, hjust = 0.5, vjust = 1.2,color = "black" ))+
theme(axis.line = element_line(size = 0.2, color = "black"),axis.ticks = element_line(colour = "black", size = 0.2))+
theme(axis.ticks.length = unit(0.04, "cm"))+
theme(plot.margin=unit(c(1,0.1,0.1,0.4),"mm"))+
theme(axis.title.y = element_text(margin = margin(t = 0, r = 4, b = 0, l = 0)))+
theme(axis.title.x = element_text(margin = margin(t = 0, r = 4, b = 2, l = 0)))+
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank())+
ggtitle("SRG3")+
theme(legend.position="top")+
theme( legend.text=element_text(size=4),
#legend.justification=c(2.5,1),
legend.key = element_rect(size = 1.5),
legend.key.size = unit(0.3, 'lines'),
legend.position=c(0.79, .8), #width and height
legend.direction = "horizontal",
legend.title=element_blank())
dev.off()
and my plot is as follows:
A suggestion for future questions: your example is far from a minimal reproducible example. The styling and annotations are unrelated to your problem, but they make the code overly complex and harder for others to work with.
The following would be sufficient:
ggplot(df) +
  geom_bar(aes(x = Year, y = Response),
           stat = "identity", fill = "tan1",
           colour = "black") +
  geom_errorbar(aes(x = Year, ymin = Response - sd1, ymax = Response + sd1),
                width = .2,
                position = pd) +
  geom_line(aes(x = Year, y = Rate * max(df$Response)),
            color = 'red', size = 2) +
  geom_point(aes(x = Year, y = Rate * max(df$Response)),
             color = 'black', size = 3)
(Notice that I've removed stat = "identity" from geom_line() and geom_point() because it is their default. Furthermore, y is not a valid aesthetic for geom_errorbar(), so I omitted that, too.)
Assuming that the additional variable you would like to plot error bars for is Rate * max(df$Response) and that the relevant standard deviation is sd2, you may simply append
+ geom_errorbar(aes(x = Year, ymin = Rate * max(df$Response) - sd2,
                    ymax = Rate * max(df$Response) + sd2),
                colour = "green",
                width = .2)
to the code chunk above. This yields the output below.
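The secondary axis works because Rate is multiplied by a constant before plotting and the axis labels are divided by the same constant. A minimal sketch of that round trip, with the factor stored once so the layers and the axis share it (scale_factor and Rate_scaled are names introduced here; sd2 is assumed, as above, to be on the scale of the left axis):

```r
library(ggplot2)

df <- data.frame(
  Year = c(2014, 2015, 2016),
  Response = c(1000, 1100, 1200),
  Rate = c(0.75, 0.42, 0.80),
  sd1 = c(75, 100, 180),
  sd2 = c(75, 100, 180)
)

# Hypothetical helper names: one shared factor for plotting and labelling.
scale_factor <- max(df$Response)
df$Rate_scaled <- df$Rate * scale_factor

p <- ggplot(df, aes(x = Year)) +
  geom_col(aes(y = Response), fill = "tan1", colour = "black") +
  geom_errorbar(aes(ymin = Response - sd1, ymax = Response + sd1),
                width = 0.2) +
  geom_line(aes(y = Rate_scaled), colour = "red") +
  geom_errorbar(aes(ymin = Rate_scaled - sd2, ymax = Rate_scaled + sd2),
                colour = "green", width = 0.2) +
  # The right axis undoes the multiplication applied to Rate.
  scale_y_continuous(name = "Left Y axis",
                     sec.axis = sec_axis(~ . / scale_factor,
                                         name = "Right Y axis"))

p
```

If sd2 were instead measured in units of Rate, it would have to be multiplied by scale_factor as well before being added to Rate_scaled.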

change color data points plotLearnerPrediction (MLR package)

I have produced some nice plots with the plotLearnerPrediction function of the MLR package. I was able to make some adjustments to the returned ggplot (see my code below). But I am not sure how to make the last adjustment. Namely, I want to change the coloring of the data points based on labels (groups in example plot).
My last plot (with black data points)
Another produced plot (overlapping data points)
This is the last version of my code (normally part of a for loop):
plot <- plotLearnerPrediction(learner = learner_name, task = tasks[[i]], cv = 0,
pointsize = 1.5, gridsize = 500) +
ggtitle(trimws(sprintf("Predictions %s %s", meta$name[i], meta$nr[i])),
subtitle = sprintf("DR = %s, ML = %s, CV = LOO, ACC = %.2f", meta$type[i],
toupper(strsplit(learner_name, "classif.")[[1]][2]), acc[[i]])) +
xlab(sprintf("%s 1", lab)) +
ylab(sprintf("%s 2", lab)) +
scale_fill_manual(values = colors) +
theme(plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 12, face = "bold", colour = "grey40"),
axis.text.x = element_text(vjust = 0.5, hjust = 1),
axis.text = element_text(size = 14, face = "bold"),
axis.title.x = element_text(vjust = 0.5),
axis.title = element_text(size = 16, face = "bold"),
#panel.grid.minor = element_line(colour = "grey80"),
axis.line.x = element_line(color = "black", size = 1),
axis.line.y = element_line(color = "black", size = 1),
panel.grid.major = element_line(colour = "grey80"),
panel.background = element_rect(fill = "white"),
legend.justification = "top",
legend.margin = margin(l = 0),
legend.title = element_blank(),
legend.text = element_text(size = 14))
Below is a part of the source code of the plotLearnerPrediction function. I want to overrule geom_point(colour = "black"). Adding simply geom_point(colour = "pink") to my code will not color data points, but the whole plot. Is there a solution to overrule that code with a vector of colors? Possibly a change in the aes() is also needed to change colors based on groups.
else if (taskdim == 2L) {
p = ggplot(mapping = aes_string(x = x1n, y = x2n))
p = p + geom_tile(data = grid, mapping = aes_string(fill = target))
p = p + scale_fill_gradient2(low = bg.cols[1L], mid = bg.cols[2L],
high = bg.cols[3L], space = "Lab")
p = p + geom_point(data = data, mapping = aes_string(x = x1n,
y = x2n, colour = target), size = pointsize)
p = p + geom_point(data = data, mapping = aes_string(x = x1n,
y = x2n), size = pointsize, colour = "black",
shape = 1)
p = p + scale_colour_gradient2(low = bg.cols[1L],
mid = bg.cols[2L], high = bg.cols[3L], space = "Lab")
p = p + guides(colour = FALSE)
}
You can always hack into gg objects. The following works for ggplot2 2.2.1 and adds a manual alpha value to all geom_point layers.
library(mlr)
library(ggplot2)
g = plotLearnerPrediction(makeLearner("classif.qda"), iris.task)
ids.geom.point = which(sapply(g$layers, function(z) class(z$geom)[[1]]) == "GeomPoint")
for (i in ids.geom.point) {
  g$layers[[i]]$aes_params$alpha = 0.1
}
g
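The same layer surgery can, in principle, replace a fixed colour = "black" with one colour per data point. The sketch below does this on a stand-in plot built directly from iris rather than on the output of plotLearnerPrediction(), so that it runs without mlr; class_cols and point_cols are hypothetical names, and whether the rows of the real plot's layer data follow the same order as the task data is an assumption you would need to check.

```r
library(ggplot2)

# Stand-in for the returned plot: a point layer with a hard-coded colour.
g <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(colour = "black", size = 1.5)

# Hypothetical palette: one colour per class label.
class_cols <- c(setosa = "red", versicolor = "green", virginica = "blue")
point_cols <- unname(class_cols[as.character(iris$Species)])

ids.geom.point <- which(
  sapply(g$layers, function(z) class(z$geom)[[1]]) == "GeomPoint"
)
for (i in ids.geom.point) {
  # Layers are ggproto objects (environments), so this edits the plot in place.
  layer <- g$layers[[i]]
  layer$aes_params$colour <- point_cols
}

g
```

A fixed parameter must have length one or one value per row of the layer's data, which is why point_cols is expanded to the full length of iris.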
The plotLearnerPrediction() function returns the ggplot plot object, which allows for some level of customization without having to modify the source code. In your particular case, you can use scale_fill_manual() to set custom fill colors:
library(mlr)
g = plotLearnerPrediction(makeLearner("classif.randomForest"), iris.task)
g + scale_fill_manual(values = c("yellow", "orange", "red"))

ggplot2: Adjust legend symbols in overlayed plot

I need to create a plot, in which a histogram gets overlayed by a density. Here is my result so far using some example data:
library("ggplot2")
set.seed(1234)
a <- round(rnorm(10000, 5, 5), 0)
b <- rnorm(10000, 5, 7)
df <- data.frame(a, b)
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., col = "histogram", linetype = "histogram"), fill = "blue") +
stat_density(aes(x = b, y = ..density.., col = "density", linetype = "density"), geom = "line") +
scale_color_manual(values = c("red", "white"),
breaks = c("density", "histogram")) +
scale_linetype_manual(values = c("solid", "solid")) +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.text = element_text(size = 15))
Unfortunately I can not figure out how I can change the symbols in the legend properly. The first symbol should be a relatively thick red line and the second symbol should be a blue box without the white line in the middle.
Based on some internet research, I tried to change different things in scale_linetype_manual and further I tried to use override.aes, but I could not figure out how I would have to use it in this specific case.
EDIT - Here is the best solution based on the very helpful answers below.
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., linetype = "histogram"),
fill = "blue",
# I added the following line to keep the white outline around the histogram bars.
col = "white") +
scale_linetype_manual(values = c("solid", "solid")) +
stat_density(aes(x = b, y = ..density.., linetype = "density"),
geom = "line", color = "red") +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.text = element_text(size = 15),
legend.key = element_blank()) +
guides(linetype = guide_legend(override.aes = list(linetype = c(1, 0),
fill = c("white", "blue"),
size = c(1.5, 1.5))))
As you thought, most of the work can be done via override.aes for linetype.
Note I removed color from the aes of both layers to avoid some trouble I was having with the legend box outline. Doing this also avoids the need for the scale_*_* function calls. To set the color of the density line I used color outside of aes.
In override.aes I set the linetype to be solid or blank, the fill to be either white or blue, and the size to be 2 or 0 for the density box and histogram box, respectively.
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., linetype = "histogram"), fill = "blue") +
stat_density(aes(x = b, y = ..density.., linetype = "density"), geom = "line", color = "red") +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.text = element_text(size = 15),
legend.key = element_blank()) +
guides(linetype = guide_legend(override.aes = list(linetype = c(1, 0),
fill = c("white", "blue"),
size = c(2, 0))))
Alternatively, map the fill aesthetic to the label "histogram" and the colour aesthetic to the label "density", and set their values with scale_*_manual(). This maps directly to the desired legend without needing any overrides.
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., fill = "histogram")) +
stat_density(aes(x = b, y = ..density.., colour="density"), geom = "line") +
scale_fill_manual(values = c("blue")) +
scale_colour_manual(values = c("red")) +
labs(fill="", colour="") +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.box.just = "left",
legend.background = element_rect(fill=NULL),
legend.key = element_rect(fill=NULL),
legend.text = element_text(size = 15))
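Recent ggplot2 releases deprecate the ..density.. notation in favour of after_stat(density) (since 3.4.0) and use linewidth instead of size for line thickness. A sketch of the last approach in that newer syntax, assuming ggplot2 3.4.0 or later; naming the entries in values is a choice made here so that each colour is tied to its label explicitly.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(a = round(rnorm(10000, 5, 5), 0),
                 b = rnorm(10000, 5, 7))

p <- ggplot(df) +
  # Constant labels inside aes() create one legend entry per layer.
  geom_histogram(aes(x = a, y = after_stat(density), fill = "histogram"),
                 bins = 30) +
  stat_density(aes(x = b, y = after_stat(density), colour = "density"),
               geom = "line", linewidth = 1.5) +
  scale_fill_manual(values = c(histogram = "blue")) +
  scale_colour_manual(values = c(density = "red")) +
  theme(legend.title = element_blank(),
        legend.text = element_text(size = 15))

p
```

Because fill and colour live on separate scales, the legend shows a blue box for the histogram and a red line for the density without any override.aes.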
