facet_wrap text labelling issues with stat_fit_glance - r

I am wondering why the text is trending higher in the plots... it won't stay put with the facet_wrap or facet_grid. In a more complex dataset plot, the text is illegible because of the overlap.
Below is data and code to reproduce the plot and issue. Adding geom="text" to stat_fit_glance results in Error: Discrete value supplied to continuous scale.
library(ggpmisc)
library(ggplot2)
DF <- data.frame(Site = rep(LETTERS[20:24], each = 4),
Region = rep(LETTERS[14:18], each = 4),
time = rep(LETTERS[1:10], each = 10),
group = rep(LETTERS[1:4], each = 10),
value1 = runif(n = 1000, min = 10, max = 15),
value2 = runif(n = 1000, min = 100, max = 150))
DF$time <- as.numeric(factor(DF$time)) # map letters A-J to 1-10; also works when strings are not stored as factors (R >= 4.0)
formula1 <- y~x
plot1 <- ggplot(data=DF,
aes(x=time, y= value2,group=Site)) +
geom_point(col="gray", alpha=0.5) +
geom_line(aes(group=Site),col="gray", alpha=0.5) +
geom_smooth(se=F, col="darkorange", alpha=0.8, fill="orange",
method="lm",formula=formula1) +
theme_bw() +
theme(strip.text.x = element_text(size=10),
strip.text.y = element_text(size=10, face="bold", angle=0),
strip.background = element_rect(colour="black", fill="gray90"),
axis.text.x = element_text(size=10), # x-axis text size
axis.text.y = element_text(size=10), # y-axis text size
axis.ticks = element_blank(), # remove axis ticks
axis.title.x = element_text(size=18), # x-axis title size
axis.title.y = element_text(size=25), # y-axis title size
panel.background = element_blank(),
panel.grid.major = element_blank(), # remove major grid lines
panel.grid.minor = element_blank(), # remove minor grid lines
plot.background = element_blank()) +
labs(y="", x="Year", title = "")+ facet_wrap(~group)
plot1 + stat_fit_glance(method = "lm", label.x="right", label.y="bottom",
method.args = list(formula = formula1),
aes(label = sprintf('R^2~"="~%.3f~~italic(p)~"="~%.2f',
stat(..r.squared..),stat(..p.value..))),
parse = TRUE)

When the label positions are set automatically, the npcy position is increased for each level of the grouping variable. You map Site to the group aesthetic, and because Site has 5 levels that appear unevenly across the facets, the rather crude algorithm in 'ggpmisc' positions the labels unevenly: the five rows correspond to the five Sites, one each. I have changed the mapping to colour so that this becomes more obvious, and deleted all code that is irrelevant to this question.
plot1 <- ggplot(data=DF,
aes(x=time, y= value2, color=Site)) +
geom_smooth(se=F, alpha=0.8,
method="lm",formula=formula1) +
facet_wrap(~group)
plot1 +
stat_fit_glance(method = "lm", label.x="right", label.y="bottom",
method.args = list(formula = formula1),
aes(label = sprintf('R^2~"="~%.3f~~italic(p)~"="~%.2f',
stat(..r.squared..),stat(..p.value..))),
parse = TRUE) +
expand_limits(y = 110)
To use fixed positions, one can pass npcy coordinates when using the default "geom_text_npcy()", or pass data coordinates and use "geom_text()". One position corresponds to each level of the grouping factor Site; if the vector is shorter, it is recycled. To fit more labels you can of course reduce the size of the text and add space by expanding the plotting area. In any case, in practice you will need to indicate, one way or another, which estimates correspond to which line.
plot1 +
stat_fit_glance(method = "lm", label.x="right", label.y= c(0.01, 0.06, 0.11, 0.01, 0.06),
method.args = list(formula = formula1),
aes(label = sprintf('R^2~"="~%.3f~~italic(p)~"="~%.2f',
stat(..r.squared..),stat(..p.value..))),
parse = TRUE, size = 2.5) +
expand_limits(y = 110)
Note: the Error: Discrete value supplied to continuous scale raised when attempting to use geom_text() is a bug in 'ggpmisc' that I fixed some days ago, but the fix has not yet made it to CRAN (future version 0.3.3).
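Until a fixed release is available, one possible workaround is to skip 'ggpmisc' for the labels: fit the models yourself and place the text with plain geom_text() in data coordinates. The sketch below is not the approach used above but a swapped-in alternative. It rebuilds a DF with the same layout as in the question (time already numeric), and the names pieces, fits, row, y and lab, as well as the starting height 100 and step 3, are illustrative assumptions.

```r
library(ggplot2)

set.seed(1)
# Same layout as DF in the question, with time already numeric
DF <- data.frame(Site   = rep(LETTERS[20:24], each = 4),
                 time   = rep(1:10, each = 10),
                 group  = rep(LETTERS[1:4], each = 10),
                 value2 = runif(n = 1000, min = 100, max = 150))

# Fit one lm per Site within each facet; keep R^2 and the slope p-value
pieces <- split(DF, list(DF$Site, DF$group), drop = TRUE)
fits <- do.call(rbind, lapply(pieces, function(d) {
  s <- summary(lm(value2 ~ time, data = d))
  data.frame(Site  = d$Site[1],
             group = d$group[1],
             r2    = s$r.squared,
             p     = coef(s)["time", "Pr(>|t|)"])
}))

# Number the sites *within* each facet so the labels stack without gaps
fits$row <- ave(seq_len(nrow(fits)), fits$group, FUN = seq_along)
fits$y   <- 100 + 3 * (fits$row - 1)   # assumed label heights in data units
fits$lab <- sprintf("R^2 = %.3f, p = %.2f", fits$r2, fits$p)

p <- ggplot(DF, aes(x = time, y = value2, colour = Site)) +
  geom_smooth(se = FALSE, method = "lm", formula = y ~ x) +
  geom_text(data = fits, aes(x = 10, y = y, label = lab),
            hjust = 1, size = 2.5, show.legend = FALSE) +
  facet_wrap(~group)
p
```

Because the row index is computed per facet rather than per level of Site, the labels start at the same height in every panel, and the shared colour mapping shows which estimate belongs to which line.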

Related

geom points are not placed on the boxplot?

I don't know how to align the dots so that each one sits on its own box plot.
Why are they appearing like that?
I found this post, but it addresses dodging, which is not part of my code.
Here is my code:
library(phyloseq)
library(ggplot2)
plot_richness(ps.prev.intesParts.f, x = "part", measures = "Shannon",
color = "Samples") +
geom_boxplot() +
theme_classic() +
theme(text = element_text(size = 20),
strip.background = element_blank(),
axis.text.x.bottom = element_text(angle = 90),
legend.title = element_blank()) +
labs(x = "Intestinal Parts", y = "Shannon Index")
Could you please advise?
You are correct that you need to specify dodging. However, dodging needs to be set in the underlying geom_point(...) of the plot_richness function itself. Unfortunately, phyloseq offers no such option. This means you'll need to calculate the alpha diversity measures yourself and generate your own plot. Luckily this only requires a few extra lines of code. Here's an example using phyloseq's GlobalPatterns.
require("phyloseq")
require("dplyr")
require("ggplot2")
# Load data
data("GlobalPatterns")
# Calculate alpha indices
a_div <- estimate_richness(GlobalPatterns, measures = "Shannon")
a_div$SampleID <- row.names(a_div)
# Add sample_data from physeq object
a_div <- left_join(a_div,
sample_data(GlobalPatterns),
by = c("SampleID" = "X.SampleID"))
# GlobalPatterns only has grouping by SampleType.
# Generate an extra group by duplicating all rows
a_div <- rbind(a_div, a_div)
a_div$Samples <- rep(x = c("MMV", "VMV"),
each = nrow(a_div)/2)
# Plot
ggplot(a_div,
aes(x = SampleType,
y = Shannon,
colour = Samples)) +
geom_boxplot(position = position_dodge(width = 0.9)) +
geom_point(position = position_dodge(width = 0.9)) +
theme_classic() +
theme(text = element_text(size = 20),
strip.background = element_blank(),
axis.text.x.bottom = element_text(angle = 90,
hjust = 1,
vjust = 0.5),
legend.title = element_blank())
Created on 2022-09-02 by the reprex package (v2.0.1)
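The dodging principle itself does not depend on phyloseq. As a minimal sketch, the built-in mtcars data can stand in for the alpha diversity table (cyl and am are only placeholders for SampleType and Samples): when the boxplot and point layers share the same position_dodge() width, the points land on their own boxes.

```r
library(ggplot2)

# mtcars as a stand-in: cyl plays the role of the x grouping,
# am the role of the colour grouping
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, colour = factor(am))) +
  geom_boxplot(position = position_dodge(width = 0.9)) +
  # same dodge width as the boxplots, so points and boxes line up
  geom_point(position = position_dodge(width = 0.9)) +
  theme_classic()
p
```

If the two widths differ, or the point layer has no dodge at all, the points fall back towards the centre of each x category, which is the symptom described in the question.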

How to add the count values calculated in the geom_histogram on top of the bars in R

I'd like to add the count values calculated in the geom_histogram function on ggplot2. I've put the ggplot2 into a loop so I can produce multiple plots, in my case 30 but for ease, here is a dummy set for only four plots. Facet wrap didn't work as the geom density was pooling the data across all factors before calculating proportions, rather than within a factor/variable. To produce this plot, I've essentially mixed a whole bunch of code from various sources so credit to them.
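library(plyr) # ddply(), numcolwise(); load before dplyr to limit masking
library(dplyr)
library(ggplot2)
library(ggridges)
library(reshape2)
library(gridExtra)
library(scales) # percent_format()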
#Make the data#
df.fact <- data.frame("A"=rnorm(400, mean = 350, sd=160),"B"=rnorm(400, mean = 300, sd=100), "C"=rnorm(400, mean = 200, sd=80), names=rep(factor(LETTERS[23:26]), 100))
df.test<-melt(df.fact, id.vars = "names", value.name = "Length2")
names(df.test)[names(df.test) =="variable"] <- "TSM.FACT"
#Create the plotlist##
myplots <- list()
#Loop for plots##
for(i in 1:(length(unique(df.test$names)))){
p1 <- eval(substitute(
ggplot(data=df.test[df.test$names == levels(df.test$names)[i],], aes(x=Length2, group=TSM.FACT, colour = TSM.FACT, fill=TSM.FACT)) +
geom_histogram(aes( y = stat(width*density)), position = "dodge", binwidth = 50, alpha =0.4, show.legend=T)+
ggtitle(paste0(levels(df.test$names)[i]))+
geom_density_line(stat="density", aes(y=(..count..)/sum(..count..)*50), alpha=0.3, size=0.5, show.legend=F) +
geom_vline(data=ddply(df.test[df.test$names == levels(df.test$names)[i],], ~ TSM.FACT, numcolwise(mean)), mapping=aes(xintercept = Length2, group=TSM.FACT, colour=TSM.FACT), linetype=2, size=1, show.legend=F) +
scale_y_continuous(labels = percent_format()) +
ylab("relative frequency") +
scale_color_manual(values= c("#00B2EE", "#1E90FF", "#104E8B")) +
scale_fill_manual(values= c("#00B2EE", "#1E90FF", "#104E8B")) +
theme_bw() + theme(
plot.title = element_text(lineheight=0.5, hjust= 0.5, size=10),
strip.text.y = element_text(hjust = 1, angle = 0),
strip.text.x = element_text(size=10, vjust = 0.9),
strip.text=element_text(margin = margin(t=0.3,r=1,b=0.3,l=1), size=8, debug = F, vjust=0.2),
strip.background = element_blank(),
axis.text.x = element_text(size=8, angle=0, vjust=0.2, margin = margin(t=0.3,r=0.1,b=0.3,l=0.1)),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
axis.line.x=element_line(colour="black"),
axis.line.y=element_line(colour="black"),
panel.grid.minor = element_blank(),
panel.border=element_blank(),
panel.background=element_blank(),
legend.position=(c(0.9,0.9)),
legend.title = element_blank(),
legend.key = element_blank()),
list(i = i)))
print(i)
print(p1)
myplots[[i]] <- p1
plot(p1)
}
#Join the plots
panelplot <- grid.arrange(grobs = myplots) # grid.arrange() has no plotlist or shared.legend arguments
Unfortunately, I am unable to reproduce your example. I can recommend adding a column that holds the sum of each bar (let's name it "Bar").
The required addition to the ggplot code is then:
geom_text(aes(label = Bar), position = position_stack(vjust = 1)) +
The height of the text above the bar can be adjusted with vjust.
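An alternative that avoids precomputing a "Bar" column is to let the binning stat produce the labels. This is only a sketch on made-up data (d and its single Length2 column are assumptions, and it shows plain counts rather than the relative frequencies used in the question). It relies on after_stat(), which requires ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(42)
# Made-up data standing in for one subset of df.test
d <- data.frame(Length2 = rnorm(400, mean = 300, sd = 100))

p <- ggplot(d, aes(x = Length2)) +
  geom_histogram(binwidth = 50, alpha = 0.4) +
  # Re-run the same binning and draw each bin's count as text above the bar
  stat_bin(aes(label = after_stat(count)), geom = "text",
           binwidth = 50, vjust = -0.5)
p
```

The binwidth must match in both layers, otherwise the labels are computed for different bins than the bars that are drawn.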

Plot timeseries and regression line for two groups of data

I have data from two sites across years (note the differences in sampling years). A sample is below:
df<- data.frame( year= c(seq(1997,2016,1), seq(2001,2017,1)),
site= c(rep("cr", 20),rep("ec", 17)),
mean= sample(1:50,37))
I would like to make a time series-like graph of mean for each year. Each data point would be connected (in the typical zig-zag fashion of time-series graphs) and then a regression line is superimposed to indicate the trend. I have created a time series-like plot using ggplot (I do not mind a solution from base package), but I am having trouble superimposing a dashed-regression line for each site without error.
Here is the code I have tried:
f1 <- ggplot(data = df, aes(x = year, y = mean, group= site, color=
site))+
geom_line(aes(color=site)) +
geom_point( aes(color=site),size=0.5)+
geom_smooth(method = "lm", se = FALSE, size= 0.5, aes(fill=site,
linetype= 2 ))+
scale_linetype_manual(values=c("solid", "solid"))+
scale_color_manual(values=c("#CC0000", "#000000"))+
theme_minimal()+
scale_x_continuous("Year",limits = c(1997, 2020), breaks =
seq(1995,2020,5)) +
scale_y_continuous("Mean Monthly Abundance", limits = c(0, 1500),
breaks=seq(0, 1500, by = 100)) +
theme_bw()+
theme(axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank())
f1
A few details I would like this graph to illustrate:
Each group (site) will have a different color (black ,red) for the points and the line connecting each point
The regression lines for each group (site) will be dashed and match the color specified above.
The regression lines should NOT extend to the y-axis and should be limited to the range of the data
Points do not need to be visible. Only the line connecting each point should be visible.
Preferably the dashed regression line will NOT display the shaded 95% CI.
As @kath stated, adding linetype = "dashed" would fix it. I've made some minor modifications to the code as well:
ggplot(data = df, aes(x = year, y = mean, group= site, color = site))+
geom_line() +
geom_point(size=0.5)+
geom_smooth(method = "lm", se = FALSE, size= 0.5, linetype = "dashed")+
scale_color_manual(values=c("#CC0000", "#000000"))+
theme_minimal()+
scale_x_continuous("Year",limits = c(1997, 2020), breaks =
seq(1995,2020,5)) +
scale_y_continuous("Mean Monthly Abundance", limits = c(0, 1500),
breaks=seq(0, 1500, by = 100)) +
theme_bw()+
theme(axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank())
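On the requirement that the regression lines should not extend beyond the data: geom_smooth() fits each group over that group's own x range by default (fullrange = FALSE), so no extra work should be needed. A small sketch to check this on the sample data, with the object names p and fit being illustrative:

```r
library(ggplot2)

set.seed(1)
df <- data.frame(year = c(seq(1997, 2016, 1), seq(2001, 2017, 1)),
                 site = c(rep("cr", 20), rep("ec", 17)),
                 mean = sample(1:50, 37))

p <- ggplot(df, aes(x = year, y = mean, colour = site)) +
  geom_line() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
              linetype = "dashed") +
  scale_color_manual(values = c("#CC0000", "#000000"))

# Inspect the x range of the fitted line for each site (group 1 = cr, 2 = ec)
fit <- layer_data(p, 2)
tapply(fit$x, fit$group, range)
```

Each fitted line spans only the years sampled at that site. Setting fullrange = TRUE would instead stretch both lines across the whole x scale.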

move legend title in ggplot2

I have been trying to shift my legend title across to be centered over the legend contents using the guide function. I've been trying to use the following code:
guides(colour=guide_legend(title.hjust = 20))
I thought of trying to make a reproducible example, but I think the reason it's not working has something to do with the above line not matching the rest of my code specifically. So here is the rest of the code I'm using in my plot:
NH4.cum <- ggplot(data=NH4_by_Date, aes(x=date, y=avg.NH4, group = CO2, colour=CO2)) +
geom_line(aes(linetype=CO2), size=1) + #line options
geom_point(size=3) + #point symbol sizes
#scale_shape_manual(values = c(1, 16)) + #manually choose symbols
theme_bw()+
theme(axis.text.x=element_text(colour="white"), #change x axis labels to white.
axis.title=element_text(size=12),
axis.title.x = element_text(color="white"), #Change x axis label colour to white
panel.border = element_blank(), #remove box border
axis.line.x = element_line(color="black", size = 0.5), #add x axis line
axis.line.y = element_line(color="black", size = 0.5), #add y axis line
legend.key = element_blank(), #remove grey box from around legend
legend.position = c(0.9, 0.6))+ #change legend position
geom_vline(xintercept=c(1.4,7.5), linetype="dotted", color="black")+ #put in dotted lines for season boundaries
scale_color_manual(values = c("#FF6600", "green4", "#0099FF"),
name=expression(CO[2]~concentration~(ppm))) + #manually define line colour
scale_linetype_manual(guide="none", values=c("solid", "solid", "solid")) + #manually define line types
scale_shape_manual(values = c(16, 16, 16)) + #manually choose symbols
guides(colour=guide_legend(title.hjust = 20))+
scale_y_continuous(expand = c(0, 0), limits = c(0,2200), breaks=seq(0,2200,200))+ #change x axis to intercept y axis at 0
xlab("Date")+
ylab(expression(Membrane~available~NH[4]^{" +"}~-N~(~mu~g~resin^{-1}~14~day^{-1})))+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
geom_errorbar(aes(ymin = avg.NH4 - se.NH4, #set y error bars
ymax = avg.NH4 + se.NH4),
width=0.1)
I have tried doing the following instead with no luck:
guides(fill=guide_legend(title.hjust=20))
I have also adjusted the hjust value from values between -2 to 20 just to see if that made a difference but it didn't.
I'll try to attach a picture of the graph so far so you can see what I'm talking about.
I've looked through all the questions I can on stack overflow and to the best of my knowledge this is not a duplicate as it's specific to a coding error of my own somewhere.
Thank you in advance!
The obvious approach e.g.
theme(legend.title = element_text(hjust = .5))
didn't work for me. I wonder if it is related to this open issue in ggplot2. In any case, one manual approach would be to remove the legend title, and position a new one manually:
ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
geom_point() +
stat_smooth(se = FALSE) +
theme_bw() +
theme(legend.position = c(.85, .6),
legend.title = element_blank(),
legend.background = element_rect(fill = alpha("white", 0)),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
annotate("text", x = 5, y = 27, size = 3,
label = "CO[2]~concentration~(ppm)", parse = TRUE)

ggplot2 x - y axis intersect while keeping axis labels

I posted my original question yesterday which got solved perfectly here
Original post
I made a few additions to my code:
library(lubridate)
library(ggplot2)
library(grid)
library(scales) # provides percent, date_format(), date_breaks()
### Set up dummy data.
dayVec <- seq(ymd('2016-01-01'), ymd('2016-01-10'), by = '1 day')
dayCount <- length(dayVec)
dayValVec1 <- c(0,-0.22,0.15,0.3,0.4,0.10,0.17,0.22,0.50,0.89)
dayValVec2 <- c(0,0.2,-0.17,0.6,0.16,0.41,0.55,0.80,0.90,1.00)
dayValVec3 <- dayValVec2
dayDF <- data.frame(Date = rep(dayVec, 3),
DataType = factor(c(rep('A', dayCount), rep('B', dayCount), rep('C', dayCount))),
Value = c(dayValVec1, dayValVec2, dayValVec3))
ggplot(dayDF, aes(Date, Value, colour = DataType)) +
theme_bw() +
ggtitle("Cumulative Returns \n") +
scale_color_manual("",values = c("#033563", "#E1E2D2", "#4C633C"),
labels = c("Portfolio ", "Index ", "In-Sample ")) +
geom_rect(aes(xmin = ymd('2016-01-01'),
xmax = ymd('2016-01-06'),
ymin = -Inf,
ymax = Inf
), fill = "#E1E2D2", alpha = 0.03, colour = "#E1E2D2") +
geom_line(size = 2) +
scale_x_date(date_labels = '%b-%d', # Date is of class Date (from ymd()), so use scale_x_date
date_breaks = '1 day',
expand = c(0,0)) +
scale_y_continuous( expand = c(0,0), labels = percent) +
theme(axis.text.x = element_text(angle = 90),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major.x = element_blank(),
axis.line = element_line(size = 1),
axis.ticks = element_line(size = 1),
axis.text = element_text(size = 20, colour = "#033563"), # the duplicate axis.title.x entry was dropped; it is already blanked above
plot.title = element_text(size = 40, face = "bold", colour = "#033563"),
legend.position = 'bottom',
legend.text = element_text(colour = "#033563", size = 20),
legend.key = element_blank()
)
which produces this output
The only thing that I still cannot get working is the position of the x axis. I want the x axis to be at y = 0 but still keep the x axis labels under the chart, exactly as in the Excel version of it. I know the data sets are not the same, but I didn't have the original data at hand so I produced some dummy data. Hope this was worth a new question, thanks.
> grid.ls(grid.force())
GRID.gTableParent.12660
background.1-5-7-1
spacer.4-3-4-3
panel.3-4-3-4
grill.gTree.12619
panel.background.rect.12613
panel.grid.minor.y.zeroGrob.12614
panel.grid.minor.x.zeroGrob.12615
panel.grid.major.y.polyline.12617
panel.grid.major.x.zeroGrob.12618
geom_rect.rect.12607
GRID.polyline.12608
panel.border.rect.12610
axis-l.3-3-3-3
axis.line.y.polyline.12631
axis
axis-b.4-4-4-4
axis.line.x.polyline.12624
axis
xlab.5-4-5-4
ylab.3-2-3-2
guide-box.6-4-6-4
title.2-4-2-4
> grid.gget("axis.1-1-1-1", grep=T)
NULL
ggplot2 doesn't make this easy. Below is one way to approach this interactively. Basically, you just grab the relevant parts of the plot (the axis line and ticks) and reposition them.
If p is your plot
p
grid.force()
# grab the relevant parts - have a look at grid.ls()
tck <- grid.gget("axis.1-1-1-1", grep=T)[[2]] # tick marks
ax <- grid.gget("axis.line.x", grep=T) # x-axis line
# add them to the plot, this time suppressing the x-axis at its default position
p + lapply(list(ax, tck), annotation_custom, ymax=0) +
theme(axis.line.x=element_blank(),
axis.ticks.x=element_blank())
Which produces
A quick note: the more recent versions of ggplot2 have the design decision to not show the axis. Also changes to axis.line are not automatically passed down to the x and y axis. Therefore, I tweaked your theme to define axis.line.x and axis.line.y separately.
That said, perhaps it's easier (and more robust?) to use geom_hline as suggested in the comments, and geom_segment for the ticks.
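A rough sketch of that geom_hline / geom_segment idea, on a cut-down version of the dummy data (two series instead of three, base as.Date() instead of lubridate). The ticks data frame and the tick length of 0.02 in data units are assumptions chosen for illustration.

```r
library(ggplot2)

days <- seq(as.Date("2016-01-01"), as.Date("2016-01-10"), by = "1 day")
dayDF <- data.frame(
  Date     = rep(days, 2),
  DataType = rep(c("A", "B"), each = length(days)),
  Value    = c(0, -0.22, 0.15, 0.3, 0.4, 0.10, 0.17, 0.22, 0.50, 0.89,
               0, 0.2, -0.17, 0.6, 0.16, 0.41, 0.55, 0.80, 0.90, 1.00)
)
ticks <- data.frame(Date = days)  # one tick per day

p <- ggplot(dayDF, aes(Date, Value, colour = DataType)) +
  geom_line() +
  geom_hline(yintercept = 0) +                        # axis line at y = 0
  geom_segment(data = ticks, inherit.aes = FALSE,     # hand-drawn tick marks
               aes(x = Date, xend = Date, y = 0, yend = -0.02)) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 day",
               expand = c(0, 0)) +
  theme_bw() +
  theme(axis.text.x  = element_text(angle = 90),
        axis.line.x  = element_blank(),   # hide the default axis line
        axis.ticks.x = element_blank(),   # and the default ticks
        panel.border = element_blank())
p
```

The axis labels stay at the bottom of the panel because only the line and ticks are redrawn. Unlike the grid approach, this survives resizing and saving with ggsave(), at the cost of specifying the tick length in data units.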
