How to tweak geom_text() in a ggplot2 function in R

I am creating a step plot using a self-defined function "stepPlot", which works. I got stuck when I tried to put a geom_text() inside this function. Interestingly, geom_text() works when it is not in the function. Can anyone help me tweak the geom_text() call? There are two parts: (1) "labelPosiX", the horizontal position of the label text, and (2) the geom_text() call at the end of the function. "labelPosiY", the vertical position of the label, will be specified manually with a number. These two lines of code have been commented out. Thanks in advance.
stepPlot <- function(Data,xVar, yVar, LegendTitle="", GroupLabels, Plottitle="",labelPosiY, labelText="A",
# plot specifications remain the same over data subsets. Ignore these setting when calling the function
GroupColour=c("black","blue","orange"), LineTypeGroup=c("solid","solid","solid"), LineSize=1,
LegendPosition=c(0.5,0.8),
YaxisTitle="", YAxisTitleSize=element_blank(),
XAxisText=element_text(size=20),AxisTextSize=15,LegendTitleSize=10, LegendTextSize=10,LegendKeySize=10,
PlotTitleSize=15
){
# define x limits (Xmin, Xmax), x break increments (BreakIncreX),level of breaks (GroupBreaks),horizontal position of label text (labelPosiX)
Xmin <- min(Data[xVar])-1
Xmax <- max(Data[xVar])+1
BreakIncreX <- round((Xmax-Xmin)/6)
GroupBreaks <-unique(Data$trt_label)
#labelPosiX <-min(Data[xVar])+2
# define y maximal limit (limitYMax),y break increments (BreakIncreY)
library(plyr)
limitYMax <- round_any(max(Data[yVar]), 100, f = ceiling)
BreakIncreY <- round_any(max(Data[yVar])/5, 100, f = ceiling)
# step plot
ggplot(Data, aes_string(x=xVar, y=yVar, group='trt_label'))+
geom_step(aes(colour=trt_label, linetype=trt_label), direction='hv',size= LineSize)+ #specify step curve from different group with colours, colour by default
scale_y_continuous(YaxisTitle, limits=c(0,limitYMax), expand=c(0,0), breaks=seq(0,limitYMax,by=BreakIncreY))+
scale_x_continuous("Age of adults in days", limits=c(Xmin, Xmax), expand=c(0,0), breaks=seq(Xmin,Xmax,by=BreakIncreX)) +
scale_colour_manual(name=LegendTitle,
breaks=GroupBreaks,
labels=GroupLabels,
values=GroupColour
)+ # change default colours to manually specified grey scale
scale_linetype_manual(name =LegendTitle,
breaks=GroupBreaks,
labels=GroupLabels,
values=LineTypeGroup
)+
guides(colour = guide_legend(LegendTitle), linetype = guide_legend(LegendTitle))+ # merge two legends into a single one
theme_bw() + # make background theme black and white
theme(axis.title.x = element_blank(), #font size of x axis title
axis.title.y = YAxisTitleSize, #font size of y axis title
axis.text.x = XAxisText, #font size of x axis text
axis.text.y = element_text(size=AxisTextSize), #font size of y axis text
legend.position=LegendPosition,
legend.title=element_text(size=LegendTitleSize), #font size of legend title
legend.text = element_text(colour="black", size = LegendTextSize, face = "bold"), #font size of legend text
legend.key.size=unit(LegendKeySize,'points'), ## ben - added to shrink the legend
legend.background=element_blank(), ## ben - added to get rid of white background
panel.grid.major = element_line(size = 0.5, colour = '#FFFFFF'),
panel.grid.minor = element_line(colour = NA), # colour = NA to suppress gridlines, reappear if colour='black'
plot.title=element_text( face="bold", size=PlotTitleSize) # adjust plot title size
)+
ggtitle(Plottitle)
# add label text
# + geom_text(aes(labelPosiX, labelPosiY, label="test"), colour="black",size=5)
}
My old way to add text works but I am hoping to move the geom_text into the function.
source("C:/Now/R/Rfunction_stepPlot.R")
fig17b <-stepPlot(Data=df17b,xVar= "age", yVar='mean_cumSumDurLeftByBeeAge',
LegendTitle="Precocious topical",GroupLabels=c("acetone", "untreated", "methoprene"),
Plottitle="weighed hive"
)+
geom_text(aes((min(df17b$age)+2), 3700, label="A"), colour="black",size=5)

Here's one way you can pass the (x,y) positions of geom_text to a function:
In general, make it a part of the data frame that ggplot is plotting. (Using SimonO101's suggestion, here's how it works for mtcars.)
plotFunction <- function (df, labelPosiY) {
df$xPos = df$cyl #add columns to the data frame
df$yPos = labelPosiY
p <- ggplot(data=df, aes(x=cyl, y=mpg)) + geom_step(aes(colour=gear), direction='hv', size=2) # direction and size are fixed settings, so they go outside aes()
p <- p + geom_text(aes(xPos, y=yPos, label="test"), colour="black",size=5)
return (p)
}
Now, calling
plotFunction(mtcars, 17)
produces a plot with the text "test" drawn at y = 17 for every value of cyl.
You can try making the geom_text part of your code work, and then bringing in all the other aspects of your plot.
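If the label sits at one fixed position, another option is annotate(), which takes plain values instead of an aes() mapping, so position variables computed inside the function are used directly. The following is only a minimal sketch, assuming a reasonably recent ggplot2 (for the .data pronoun); labelledStepPlot and its arguments are hypothetical names, and mtcars stands in for the real data:

```r
library(ggplot2)

# Hypothetical helper: draws a step plot and places one text label.
# labelPosiX is computed inside the function, labelPosiY is passed in.
labelledStepPlot <- function(df, xVar, yVar, labelPosiY, labelText = "A") {
  labelPosiX <- min(df[[xVar]]) + 2   # horizontal label position
  ggplot(df, aes(x = .data[[xVar]], y = .data[[yVar]])) +
    geom_step(direction = "hv") +
    # annotate() uses the values as given; no aes() lookup is involved
    annotate("text", x = labelPosiX, y = labelPosiY,
             label = labelText, colour = "black", size = 5)
}

p <- labelledStepPlot(mtcars, "wt", "mpg", labelPosiY = 30)
```

Printing p draws the step curve with a single "A" at the requested position, rather than one copy of the label per data row.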

Related

How do I flip the trendline patterns on my ggplot2 graph?

I want to make the Girls have the dashed trendline and the Boys have a solid trendline. I'd also like to remove the box around the graph, save the y and x-axis lines, and the shading behind the shapes on the key. I am using ggplot2 in R.
dr <- ggplot(DATASET,
aes(x=EC,
y=sqrt_Percent.5,
color=Sex1M,
shape=Sex1M,
linetype=Sex1M)) +
geom_point(size= 3,
aes(shape=Sex1M,
color=Sex1M)) +
scale_shape_manual(values=c(1,16))+
geom_smooth(method=lm,
se=FALSE,
fullrange=TRUE) +
labs(x="xaxis title",
y = "yaxis title",
fill= "") +
xlim(3,7) +
ylim(0,10) +
theme(legend.position = 'right',
legend.title = element_blank(),
panel.border = element_rect(fill=NA,
color = 'white'),
panel.background = NULL,
legend.background =element_rect(fill=NA,
size=0.5,
linetype="solid")) +
scale_color_grey(start = 0.0,
end = 0.4)
Current Graph
There is quite a lot going on in your visualisation. One strategy for developing it is to start from a base plot and add layers and features one at a time.
There are different ways to change the "sequence" of your colours, shapes, etc.
You can do this in ggplot with one of the scale_xxx_manual layers.
Conceptually, I suggest you deal with this in the data and only use the scales for "tweaking". But that is a question of style.
In your case, you use Sex1M as a categorical variable. Colours, shapes and linetypes are assigned automatically in the order of its levels, so you have to define the levels in another order.
As you have not provided a representative sample, I simulate some data points and define Sex1M as part of the data creation process.
DATASET <- data.frame(
x = sample(x = 2:7, size = 20, replace = TRUE)
, y = sample(x = 0.2:9.8, size = 20, replace = TRUE)
, Sex1M = sample(c("Boys", "Girls"), size = 20, replace = TRUE )
)
Now let's plot
library(dplyr)
library(ggplot2)
DATASET <- DATASET %>%
mutate(Sex1M = factor(Sex1M, levels = c("Boys","Girls"))) # set sequence of levels: boys are now the first level aka 1st colour, linetype, shape.
# plot
ggplot(DATASET,
aes(x=x, # adapted to simulated data
y=y, # adapted to simulated data
color=Sex1M, # these values are now defined in the sequence
shape=Sex1M, # of the categorical factor you created
linetype=Sex1M) # adapt the factor levels as needed (e.g change order)
) +
geom_point(size= 3,
aes(shape=Sex1M,
color=Sex1M)) +
scale_shape_manual(values=c(1,16))+
geom_smooth(method=lm,
se=FALSE,
fullrange=TRUE) +
labs(x="xaxis title",
y = "yaxis title",
fill= "") +
xlim(3,7) +
ylim(0,10) +
theme(legend.position = 'right',
legend.title = element_blank(),
panel.border = element_rect(fill=NA,
color = 'white'),
panel.background = NULL,
# ggplot is not always intuitive: legend.background is the panel
# behind the whole legend (keys and labels), while legend.key is
# the box behind each key (symbol). To remove the shading behind
# the keys, set legend.key:
legend.key = element_rect(fill = NA)
# The legend.background setting can go. To see the difference
# between background and key, set fill = "blue".
# legend.background =element_rect(fill = NA, size=0.5,linetype="solid")
) +
scale_color_grey(start = 0.0,
end = 0.4)
The settings for the background panel make the outer line disappear in my plot.
Hope this helps to get you started.
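For comparison, the scale-based route mentioned above can be sketched with named values in scale_linetype_manual(), which ties each linetype to a level regardless of the level order. This is an illustrative sketch on simulated stand-in data (the values are made up; only the column name Sex1M follows the question):

```r
library(ggplot2)

# Simulated stand-in data; only the column name Sex1M follows the question
set.seed(1)
DATASET <- data.frame(
  x = runif(20, 3, 7),
  y = runif(20, 0, 10),
  Sex1M = sample(c("Boys", "Girls"), size = 20, replace = TRUE)
)

# A named vector ties each linetype to a level, whatever the level order is
p <- ggplot(DATASET, aes(x = x, y = y, shape = Sex1M, linetype = Sex1M)) +
  geom_point(size = 3) +
  geom_smooth(method = lm, se = FALSE, colour = "black") +
  scale_linetype_manual(values = c(Boys = "solid", Girls = "dashed"))
```

The factor-level approach changes every aesthetic at once, while the named scale changes only the linetype, so which one fits depends on whether colours and shapes should swap as well.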

ggplot2 dodged boxplot with geom_point dodging and unequal number of subgroups

I am attempting to plot a dodged boxplot but I have run into a couple of difficulties. The x-axis has two levels of grouping: the "letter groups" (A, B, C etc.) are the main groups, which I specify as my x aesthetic (X_main_group). Within each main group I have subgroups called "X_group", and the boxes are coloured by subgroup. The problem is that each letter group has a different number of subgroups, e.g. for x=A I have 4 subgroups but for x=B I have only one. First, the dodging of the plotted points no longer works (see the example plot below), as the points do not align with the dodged boxplots. Second, the boxes are no longer centered around the x-axis tick, which is most obvious for x=B. How can I fix this?
I would also like to have small x-axis ticks below each subgroup (so 4 ticks for x=A, 1 tick for x=B, 3 for x=C etc.), but this has lower priority. I have attached the figure, and in red I drew some examples of what I hope to achieve with the tick marks. The ggplot2 code is shown below. I would like to provide a reproducible example, but I cannot manage to write code that creates a data frame with unequal numbers of subgroups so that people who want to help can run it. I can only make "symmetrical" data frames...
cbpallette <- c("#999999", "#666666", "#333333", "#000000", "#003300")
p1 <- ggplot(data=df, aes(x=X_main_group, y=Intensity, colour=factor(X_group))) +
  stat_boxplot(geom = "errorbar", width=.4, position = position_dodge(0.5, preserve="single")) +
  geom_boxplot(width=0.5, outlier.shape=NA, position=position_dodge(preserve = "single")) +
  theme_classic() +
  geom_point(position=position_jitterdodge(), alpha=0.3)
p2 <- p1 +
  scale_colour_manual(values = cbpallette) +
  theme(legend.position = "none") +
  theme(axis.ticks.length = unit(-0.1, "cm"),
        axis.text.x = element_text(size=30, vjust=-0.4),
        axis.text.y = element_text(size=35, hjust = 0.5, angle=45),
        axis.title = element_blank())
p3 <- p2 +
  theme(axis.text.x = element_text(margin = margin(t = .5, unit = "cm")),
        axis.text.y = element_text(margin = margin(r = .5, unit = "cm")))
p3
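One possible way to build a data frame with unequal numbers of subgroups is to generate each main group separately and bind the pieces together. A minimal base-R sketch, using the subgroup counts from the description (4 for A, 1 for B, 3 for C); the Intensity values and the 10 observations per subgroup are made-up assumptions:

```r
# Number of subgroups per main group: A has 4, B has 1, C has 3
n_sub <- c(A = 4, B = 1, C = 3)

set.seed(42)
df <- do.call(rbind, lapply(names(n_sub), function(g) {
  data.frame(
    X_main_group = g,
    # 10 observations for each subgroup 1..n of this main group
    X_group = rep(seq_len(n_sub[[g]]), each = 10),
    Intensity = rnorm(10 * n_sub[[g]], mean = 100, sd = 15)
  )
}))

# Cross-tabulate to confirm the layout: unused combinations show as 0
table(df$X_main_group, df$X_group)
```

The column names match the plotting code above, so the resulting df can be passed to it unchanged.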

How to set plot size in ggplot or ggarrange?

I work with Rmarkdown and I am creating many figures composed of different ggplot2 charts, using ggarrange. The problem is that I am not able to set different sizes for figures inside a single chunk. The only way I managed to set figure size is within chunk options, like:
{r Figure1, fig.height = 4, fig.width = 7}
Is there a way of setting the plot/grid size within each ggplot() or within ggarrange() function?
Details, data & code:
All my primary plots have the same size of the grid area, but the size of the final figures changes depending on the number of primary plots they contain. In addition, for annotation purposes (e.g. with the annotate_figure() function), charts in the first row must have a larger top margin (for the title), while charts in the last row must have a larger bottom margin (for captions). The extra margins that make room for the title and captions (e.g. text grobs) should not alter the plot grid size, which I want to be the same for all plots inside a figure.
One example of a dataframe called "A1" that I have:
library("pacman")
p_load(ggplot2, ggpubr, hrbrthemes, tidyverse)
year <- rep(seq(2010,2019, 1), each=3)
name <- rep(c("A", "B", "C"), times=10)
n1 <- c(0,0,1,0,2,1,1,1,1,2,0,2,0,2,1,3,2,0,1,4,2,2,9,4,8,11,8,7,9,8)
n2 <- c(7,3,1,14,1,1, 15,4,4,19,9,4,26,9,4,46,4,3,52,12,3,37,12,5,45,10,5,47,18,4)
name2 <- name
A1 <-data.frame(year,name,n1,n2,name2)
With this data frame, I build the first row of plots inside a chunk with specifications {fig.height = 4.3, fig.width = 7}. This plot row has three plots (made with facet_wrap) and a top margin of 0.3 inches to make room for title annotation in the final figure, and no bottom margin. This plot row also has its own title, which will function like a subtitle or a tag in the final figure.
AA.1 <- A1 %>%
ggplot( aes(x=year, y=n1)) +
geom_line( data=A1 %>% dplyr::select(-name), aes(group=name2), color="grey", size=1.0, alpha=0.6) +
geom_line( aes(color=name), color="black", size=1.5 ) +
theme_ipsum() +
theme(
axis.text.x = element_text(size=12, angle=45),
axis.text.y = element_text(size=12),
legend.position="none",
plot.title = element_text(size=16),
panel.grid = element_blank(),
plot.margin = unit(c(0.3, 0.2, 0, 0.2), "in")) + #Top row charts have a 0.3 top margin
labs(title="A. TAK") +
scale_x_continuous(name ="",
limits=c(2010,2019),
breaks=c(seq(2010,2019,2)))+
scale_y_continuous(name ="",
limits=c(0,12),
breaks=c(seq(0,12,3)))+
facet_wrap(~name) +
theme(strip.text = element_text(size=13))
AA.1
Then I create the bottom row plots, inside another chunk, with a different figure height specification: {fig.height = 4.1, fig.width = 7}. This row is also made of three plots, which should be similar in all aesthetic aspects to the first row, although I am plotting a different variable with different values (ref). This row has no top margin and a 0.1 inch bottom margin, to make room for captions.
AA.2 <- A1 %>%
ggplot( aes(x=year, y=n2)) +
geom_line( data=A1 %>% dplyr::select(-name), aes(group=name2), color="grey", size=1.0, alpha=0.6) +
geom_line( aes(color=name), color="black", size=1.5 )+
theme_ipsum() +
theme(
axis.text.x = element_text(size=12, angle=45),
axis.text.y = element_text(size=12),
legend.position="none",
plot.title = element_text(size=16),
panel.grid = element_blank(),
plot.margin = unit(c(0, 0.2, 0.1, 0.2), "in")) + #Margins are different
ggtitle("B. REF") +
scale_x_continuous(name ="",
limits=c(2010,2019),
breaks=c(seq(2010,2019,2)))+
scale_y_continuous(name ="",
limits=c(0,60),
breaks=c(seq(0,60,10)))+
facet_wrap(~name) +
theme(strip.text = element_text(size=13))
AA.2
Finally, I arrange both sets of plots within a figure using ggarrange(), and write the final title and captions with annotate_figure(). In this chunk, I set the total fig.height to 8.4 (the sum of the two previous chunks).
figureA1 <- ggarrange(AA.1, AA.2,
ncol=1, nrow=2,
heights=c(1,1))
annotate_figure(
figureA1,
top = text_grob("Figura A.1 - TAK & REF",
color = "black", face = "bold", size = 18),
bottom = text_grob("Source: My.Data (2020)", face="italic", color = "black",
hjust = 1, x = 1, size = 12))
Possible solutions:
I would like each plot to have a total plot grid area of 4 inches. As the first plot has a top margin of 0.3 and a bottom margin of 0, I set fig.height to 4.3. As the second plot has a top margin of 0 and a bottom margin of 0.1, I set fig.height to 4.1. However, the plot grid area does not seem to be the same size in both plots. Any ideas on how to fix this?
As the fig.height parameters are different for each plot, I need to split the code to build a figure into different chunks. I have many figures that are made in a similar fashion, and therefore, I would like to write a function to build and arrange plots within figures. However, I cannot write a function across different chunks.
I can think of two possible solutions for this second problem:
Some way of setting the grid plot size (or the total plot area) within the ggplot function; or
Some way of setting the size of each grid plot within the ggarrange function.
Does anyone have an idea of how I can do that?
Perhaps the second would be even better. I tried to make the rows in ggarrange the same height with the heights argument ("heights=c(1,1)"). I also tried to make them proportional to each fig.height ("heights=c(5.3/5.1,1)"), but the grid of the second plot row still looks taller than the first one.

Problem of different x-axis position when using grid.arrange and legend on bottom

I have to arrange two plots with the same axes next to each other and did this with ggplot2 and grid.arrange. For a tidier presentation, the legends have to be placed at the bottom. Unfortunately, the left plot sometimes has more legend entries than the right one and therefore needs a second line, which puts the x-axes at different y positions. Not only does this look untidy, it also defeats the purpose of being able to compare the plots.
Can anybody help?
plot_Volume_left <- some_ggplot2_fct(variable,left) +
theme(legend.position = "bottom")+
theme(legend.background = element_rect(size = 0.5, linetype="solid", colour ="black"))
plot_Volume_right <- some_ggplot2_fct(variable,right,f)+
theme(legend.position = "bottom")+
theme(legend.background = element_rect(size = 0.5, linetype="solid", colour ="black"))
# adjust y axis for more easy compare
upper_lim <- max(plot_Volume_right$data$value, plot_Volume_left$data$value)
lower_lim <- min(plot_Volume_right$data$value, plot_Volume_left$data$value)
plot_Volume_left <- plot_Volume_left + ylim(c(lower_lim, upper_lim))
plot_Volume_right <- plot_Volume_right + ylim(c(lower_lim, upper_lim))
# Arrange plots in grid
grid.arrange(plot_Volume_left, plot_Volume_right,
ncol = 2,
top = textGrob(strTitle,
gp = gpar(fontfamily = "Raleway", fontsize = 15, font = 2)))
In the picture you can see the result:
Do you know an easy way to solve this without too much change to the code? (The underlying framework is quite large.)
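One approach that may work here without touching the plotting functions is to convert both plots to gtables and give them identical row heights, so the taller legend reserves the same space on both sides. The sketch below is illustrative only: the two plots built from the mpg data set are stand-ins for the real ones, and it assumes both gtables have the same number of rows (which is typically the case when both plots use the same theme and legend position):

```r
library(ggplot2)
library(grid)
library(gridExtra)

# Stand-in plots: the left legend has more entries than the right one
plot_left <- ggplot(mpg, aes(displ, hwy, colour = manufacturer)) +
  geom_point() + theme(legend.position = "bottom")
plot_right <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() + theme(legend.position = "bottom")

# Convert to gtables and give both the row-wise maximum of the heights,
# so the taller legend reserves the same space in both plots
g_left <- ggplotGrob(plot_left)
g_right <- ggplotGrob(plot_right)
max_heights <- unit.pmax(g_left$heights, g_right$heights)
g_left$heights <- max_heights
g_right$heights <- max_heights

grid.arrange(g_left, g_right, ncol = 2)
```

Because only the final arrangement step changes, the existing ylim adjustments and the top title can stay as they are.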

R - how to allocate screen space to complex ggplot images

I am trying to write a script that produces four different plots in a single image. Specifically, I want to recreate this graphic as closely as possible:
My current script produces four plots similar to these but I cannot figure out how to allocate screen real-estate accordingly. I want to:
modify the height and width of the plots so that all four have uniform width, one is substantially taller than the others which have uniform height among them
define the position of the legends by coordinates so that I can use screen space effectively
modify the overall shape of my image explicitly as needed (maybe I will need it closer to square-shaped at some point)
# GENERATE SOME DATA TO PLOT
pt_id = c(1:279) # DEFINE PATIENT IDs
smoke = rbinom(279,1,0.5) # DEFINE SMOKING STATUS
hpv = rbinom(279,1,0.3) # DEFINE HPV STATUS
data = data.frame(pt_id, smoke, hpv) # PRODUCE DATA FRAME
# ADD ANATOMICAL SITE DATA
data$site = sample(1:4, 279, replace = T)
data$site[data$site == 1] = "Hypopharynx"
data$site[data$site == 2] = "Larynx"
data$site[data$site == 3] = "Oral Cavity"
data$site[data$site == 4] = "Oropharynx"
data$site_known = 1 # HACK TO FACILITATE PRODUCING BARPLOTS
# ADD MUTATION FREQUENCY DATA
data$freq = sample(1:1000, 279, replace = F)
# DEFINE BARPLOT
require(ggplot2)
require(gridExtra)
bar = ggplot(data, aes(x = pt_id, y = freq)) + geom_bar(stat = "identity") + theme(axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("Number of Mutations")
# DEFINE BINARY PLOTS
smoke_status = ggplot(data, aes(x=pt_id, y=smoke, fill = "red")) + geom_bar(stat="identity") + theme(legend.position = "none", axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("Smoking Status")
hpv_status = ggplot(data, aes(x=pt_id, y = hpv, fill = "red")) + geom_bar(stat="identity") + theme(legend.position = "none", axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("HPV Status")
site_status = ggplot(data, aes(x=pt_id, y=site_known, fill = site)) + geom_bar(stat="identity")
# PRODUCE FOUR GRAPHS TOGETHER
grid.arrange(bar, smoke_status, hpv_status, site_status, nrow = 4)
I suspect that the functions needed to accomplish these tasks are already included in ggplot2 and gridExtra but I have not been able to figure out how. Also, if any of my code is excessively verbose or there is a simpler, more-elegant way to do what I have already done - please feel free to comment on that as well.
Here are the steps to get the layout you describe:
1) Extract the legend as a separate grob ("graphical object"). We can then lay out the legend separately from the plots.
2) Left-align the edges of the four plots so that the left edges and the x-scales line up properly. The code to do that comes from this SO answer. That answer has a function to align an arbitrary number of plots, but I wasn't able to get that to work when I also wanted to change the proportional space allotted to each plot, so I ended up doing it the "long way" by adjusting each plot separately.
3) Lay out the plots and the legend using grid.arrange and arrangeGrob. The heights argument allocates different proportions of the total vertical space to each plot. We also use the widths argument to allocate horizontal space to the plots in one wide column and the legend in another narrow column.
4) Plot to a device in whatever size you desire. This is how you get a particular shape or aspect ratio.
library(gridExtra)
library(grid)
# Function to extract the legend from a ggplot graph as a separate grob
# Source: https://stackoverflow.com/a/12539820/496488
get_leg = function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
# Get legend as a separate grob
leg = get_leg(site_status)
# Add a theme element to change the plot margins to remove white space between the plots
thm = theme(plot.margin=unit(c(0,0,-0.5,0),"lines"))
# Left-align the four plots
# Adapted from: https://stackoverflow.com/a/13295880/496488
gA <- ggplotGrob(bar + thm)
gB <- ggplotGrob(smoke_status + thm)
gC <- ggplotGrob(hpv_status + thm)
gD <- ggplotGrob(site_status + theme(plot.margin=unit(c(0,0,0,0), "lines")) +
guides(fill="none")) # "none" hides the fill legend; FALSE is deprecated in recent ggplot2
maxWidth = grid::unit.pmax(gA$widths[2:5], gB$widths[2:5], gC$widths[2:5], gD$widths[2:5])
gA$widths[2:5] <- as.list(maxWidth)
gB$widths[2:5] <- as.list(maxWidth)
gC$widths[2:5] <- as.list(maxWidth)
gD$widths[2:5] <- as.list(maxWidth)
# Lay out plots and legend
p = grid.arrange(arrangeGrob(gA,gB,gC,gD, heights=c(0.5,0.15,0.15,0.21)),
leg, ncol=2, widths=c(0.8,0.2))
You can then determine the shape or aspect ratio of the final plot by setting the parameters of the output device. (You may have to adjust font sizes when you create the underlying plots to get the final layout to look the way you want.) The plot pasted in below is a png saved directly from the RStudio graph window. Here's how you would save the plot as a PDF file; many other devices (e.g., png, jpeg) are available for saving in other formats:
pdf("myPlot.pdf", width=10, height=5)
grid.draw(p) # p is a gtable; printing it would only show a text summary
dev.off()
You also asked about more efficient code. One thing you can do is create a list of plot elements that you use multiple times and then just add the name of the list object to each plot. For example:
my_gg = list(geom_bar(stat="identity", fill="red"),
theme(legend.position = "none",
axis.title.x = element_blank(),
axis.ticks.x = element_blank(),
axis.text.x = element_blank(),
plot.margin = unit(c(0,0,-0.5,0), "lines"))) # plot.margin is a theme setting, so it belongs inside theme()
smoke_status = ggplot(data, aes(x=pt_id, y=smoke)) +
labs(y="Smoking Status") +
my_gg

Resources