I am creating a forestplot using the forestplot package in R, and am having trouble with a few things.
Questions:
Is it possible to merge two adjacent text elements?
Is it possible to modify either the font of a single text element or the font of an entire row?
My Code:
library(forestplot)
# creating text
text <- rbind(c('', 'N (%)', 'SRT', 'ART', 'HR [95% CI]'),
c('', '', '5 year survival %', '5 year survival %', ''),
c('Seminal Vesicle Involvement', '', '', '', ''),
c(' Yes', '10 (20%)', '94', '12', '0.73 [0.36, 1.50]'),
c(' No', '40 (80%)', '96', '10', '1.78 [0.73, 4.35]'),
c('Gender', '', '', '', ''),
c(' Male', '13 (22.5%)', '84', '22', '0.06 [-0.2, 0.86]'),
c(' Female', '37 (77.5%)', '93', '13', '1.89 [0.90, 6.67]'))
# creating the plot
forestplot(text,
mean = c(NA, NA, NA, 0.73, 1.78, NA, 0.06, 1.89),
lower = c(NA, NA, NA, 0.36, 0.73, NA, -0.2, 0.90),
upper = c(NA, NA, NA, 1.50, 4.35, NA, 0.86, 6.67),
is.summary=c(T, T, T, F, F, T, F, F),
lineheight = unit(0.9, "cm"),
graph.pos = 5,
graphwidth = unit(4, 'cm'),
xticks = c(-1, 0, 1, 2, 3, 4),
ci.vertices = T,
txt_gp = fpTxtGp(ticks = gpar(cex = 1),
xlab = gpar(cex = 1),
label = gpar(cex = 0.8),
summary = gpar(cex = 0.8)),
col=fpColors(box="black",
line="darkgrey",
summary="black",
zero='grey20',
axes='grey20'),
hrzl_lines = list("2" = gpar(lwd=1, col = "#000044")))
Output:
Desired:
I would like the two 5 year survival % text bits to be combined into 1 (and centered between the two headings above), and either just those elements or the whole row to be italic font.
I have tried using summary=list(gpar(...)) for the txt_gp option, but that only seems to be able to modify the whole column, and I have found nothing on merging cells at all.
If you make the colgap much smaller in forestplot than usual, you can split the text that is currently duplicated in row 2 in columns 3 and 4 into two parts:
text[2, 4] <- 'survival % '
text[2, 3] <- '5 year '

forestplot(text,
mean = c(NA, NA, NA, 0.73, 1.78, NA, 0.06, 1.89),
lower = c(NA, NA, NA, 0.36, 0.73, NA, -0.2, 0.90),
upper = c(NA, NA, NA, 1.50, 4.35, NA, 0.86, 6.67),
is.summary=c(T, T, T, F, F, T, F, F),
lineheight = unit(0.9, "cm"),
graph.pos = 5,
graphwidth = unit(4, 'cm'),
xticks = c(-1, 0, 1, 2, 3, 4),
ci.vertices = T,
# added line: shrink the gap between columns
colgap=unit(.0011,"npc"),
txt_gp = fpTxtGp(ticks = gpar(cex = 1),
xlab = gpar(cex = 1),
label = gpar(cex = 0.8),
summary = gpar(cex = 0.8)),
col=fpColors(box="black",
line="darkgrey",
summary="black",
zero='grey20',
axes='grey20'),
hrzl_lines = list("2" = gpar(lwd=1, col = "#000044")))
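The split itself does not have to be done by hand. Below is a minimal sketch in base R, assuming a hypothetical helper `split_label()` (it is not part of forestplot) that breaks a label at the space nearest its midpoint, so the left half can go into column 3 and the right half into column 4:

```r
# Hypothetical helper, not part of the forestplot package:
# split a label at the space closest to its midpoint.
split_label <- function(label) {
  spaces <- gregexpr(" ", label, fixed = TRUE)[[1]]
  if (spaces[1] == -1) {
    return(c(label, ""))  # no space to split on
  }
  pos <- spaces[which.min(abs(spaces - nchar(label) / 2))]
  # The space stays at the end of the left half
  c(substr(label, 1, pos), substr(label, pos + 1, nchar(label)))
}

parts <- split_label("5 year survival %")
# parts[1] would go into text[2, 3], parts[2] into text[2, 4]
```

How well the two halves line up still depends on the column gap and column alignment, so the colgap value may need tuning per plot.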
I wrote a script using the "forestplot" package. I want to group the variables into categories and show those category rows in bold to accentuate them. How can I adjust my script so that only certain rows, i.e. Risk factor / OR (95% CI), Patient characteristics, Medication history, Comorbidities, Surgical history and Other, are shown in bold? I have two columns and 18 rows. Can someone help me? I would be very grateful!
My script is as below:
tabletext <- cbind(
c("Risk factor" ,"Patient characteristics","Sex, male*", "Bmi (5 points)",
"Alcohol (5 units)", "Smoking*","Medication history",
"Steroid use", "Anticoagulant use*","Comorbidities",
"COPD GOLD 1/2", "COPD GOLD 3/4", "Other pulmonary disease",
"Surgical history",
"Previous colorectal surgery*",
"Previous abdominal surgery (other)","Other", "HIPEC*"),
c("OR (95% CI)",NA, "1.78 (1.20-2.68)", "1.15 (0.95-1.38)", "1.04 (0.94-1.14)",
"1.78 (1.11-2.80)", NA," 1.40 (0.68-2.67)", "1.55 (1.02-2.32)",NA,
"1.40 (0.70-2.61)", "1.56 (0.42-4.67)", "1.78 (0.63-4.28)",NA,
"1.61 (1.03-2.49)", "0.80 (0.47-1.32)",NA, "4.14 (2.14-7.73)"))
?fpTxtGp
require(forestplot)
forestplot(tabletext,
txt_gp = fpTxtGp(label = list(gpar(fontfamily = "Times",
fontface="bold"),
gpar(fontfamily = "",
col = "black"))),
df_c,new_page = TRUE,
boxsize = 0.2,
is.summary = c(rep(FALSE,32)),
clip = c(0,17),
xlab = 'Odds ratio with 95% confidence interval
* indicates significance',
xlog = FALSE,
zero = 1,
plotwidth=unit(12, "cm"),
colgap=unit(2, "mm"),
col = fpColors(box = "royalblue",
line = "darkblue",
summary = "royalblue"))
It's not clear what df_c is, so I just created it based on your tabletext matrix:
df_c <- data.frame(mean = c(NA, NA, 1.78, 1.15, 1.04, 1.78, NA, 1.4, 1.55,
NA, 1.4, 1.56, 1.78, NA, 1.61, 0.8, NA, 4.14),
lower = c(NA, NA, 1.2, 0.95, 0.94, 1.11, NA, 0.68, 1.02, NA, 0.7,
0.42, 0.63, NA, 1.03, 0.47, NA, 2.14),
upper = c(NA, NA, 2.68, 1.38,1.14, 2.8, NA, 2.67,2.32, NA,
2.61, 4.67, 4.28, NA, 2.49, 1.32, NA, 7.73))
From there, it's just a matter of adjusting the values passed to is.summary so that the heading rows are marked TRUE:
forestplot(tabletext,
txt_gp = fpTxtGp(label = list(gpar(fontfamily = "Times"),
gpar(fontfamily = "",
col = "black"))),
df_c,new_page = TRUE,
boxsize = 0.2,
is.summary = c(TRUE, TRUE, rep(FALSE, 4),
TRUE, FALSE, FALSE, TRUE,
rep(FALSE, 3), TRUE, rep(FALSE, 2), TRUE, FALSE),
clip = c(0,17),
xlab = 'Odds ratio with 95% confidence interval
* indicates significance',
xlog = FALSE,
zero = 1,
plotwidth=unit(12, "cm"),
colgap=unit(2, "mm"),
col = fpColors(box = "royalblue",
line = "darkblue",
summary = "royalblue"))
Which generates the following figure:
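As a side note, counting 18 rows by hand is easy to get wrong. The sketch below uses base R only and assumes that heading rows are exactly the first row plus every row with NA in the second column. `parse_or()` is a hypothetical helper (not from forestplot) that pulls the estimate and interval out of strings such as "1.78 (1.20-2.68)". The matrix is a shortened stand-in for the real tabletext:

```r
# Shortened stand-in for the tabletext matrix
tabletext <- cbind(
  c("Risk factor", "Patient characteristics", "Sex, male*", "Other", "HIPEC*"),
  c("OR (95% CI)", NA, "1.78 (1.20-2.68)", NA, "4.14 (2.14-7.73)"))

# Heading rows: the first row plus every row without an estimate
is_summary <- is.na(tabletext[, 2])
is_summary[1] <- TRUE

# Hypothetical helper: extract mean, lower and upper from the label text
parse_or <- function(x) {
  x[is.na(x)] <- ""
  nums <- regmatches(x, gregexpr("[0-9]+\\.?[0-9]*", x))
  vals <- t(sapply(nums, function(n) {
    # Rows that do not hold exactly three numbers become NA
    if (length(n) == 3) as.numeric(n) else rep(NA_real_, 3)
  }))
  data.frame(mean = vals[, 1], lower = vals[, 2], upper = vals[, 3])
}

df_c <- parse_or(tabletext[, 2])
```

Both `is_summary` and `df_c` could then be passed to forestplot() in place of the hand-typed vectors, provided the label format stays consistent.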
I need to create a high-resolution forestplot. I used the forestplot() function from library(forestplot) to create my plot, and then attempted to use the tiff() function to create a high-resolution image for publication. However, the resulting image is blank.
Exporting directly from R works, but the resolution is lower than I need.
library(forestplot)
df <- structure(list(
mean = c(NA, 0.22, 0.20, 0.27),
lower = c(NA, 0.05, 0.04, 0.01),
upper = c(NA, 0.95, 1.08, 9.12)),
.Names = c("mean", "lower", "upper"),
row.names = c(NA, -4L),
class = "data.frame")
tabletext <- cbind(
c("", "Pooled", "Group 1", "Group 2"),
c("N", "4334", "3354", "980"),
c("HR (95% CI)", "0.22 (0.05, 0.95)", "0.20 (0.04, 1.08)", "0.27 (0.01, 9.12)"),
c("p-value", "0.042", "0.061", "0.467")
)
ggfp <- forestplot(tabletext,
df,
new_page = TRUE,
is.summary = c(TRUE, rep(FALSE, 3)),
clip = c(0, 2),
colgap = unit(5, "mm"),
line.margin = unit(2, "mm"),
lineheight = unit(1, "in"),
txt_gp = fpTxtGp(label = gpar(cex = 1),
ticks = gpar(cex = 1)),
align = c("l", "c", "c", "c"),
boxsize = 0.2,
xticks = seq(0, 2.0, 0.5),
zero = 1,
col = fpColors(box = "royalblue",
line = "darkblue"),
mar = unit(c(-1, 0.5, -2, 0.5), "in"))
tiff("forestplot.tiff", units = "in", width = 9, height = 7, res = 300)
ggfp
dev.off()
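The file was created, but it was a blank page.
This works for me (output file is 17MB). The forestplot() call is placed between tiff() and dev.off(), so the plot is drawn while the file device is open: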
library(forestplot)
setwd("/path/to/directory/for/plot")
df <- structure(list(
mean = c(NA, 0.22, 0.20, 0.27),
lower = c(NA, 0.05, 0.04, 0.01),
upper = c(NA, 0.95, 1.08, 9.12)),
.Names = c("mean", "lower", "upper"),
row.names = c(NA, -4L),
class = "data.frame")
tabletext <- cbind(
c("", "Pooled", "Group 1", "Group 2"),
c("N", "4334", "3354", "980"),
c("HR (95% CI)", "0.22 (0.05, 0.95)", "0.20 (0.04, 1.08)", "0.27 (0.01, 9.12)"),
c("p-value", "0.042", "0.061", "0.467")
)
tiff("forestplot.tiff", units = "in", width = 9, height = 7, res = 300)
forestplot(tabletext,
df,
new_page = TRUE,
is.summary = c(TRUE, rep(FALSE, 3)),
clip = c(0, 2),
colgap = unit(5, "mm"),
line.margin = unit(2, "mm"),
lineheight = unit(1, "in"),
txt_gp = fpTxtGp(label = gpar(cex = 1),
ticks = gpar(cex = 1)),
align = c("l", "c", "c", "c"),
boxsize = 0.2,
xticks = seq(0, 2.0, 0.5),
zero = 1,
col = fpColors(box = "royalblue",
line = "darkblue"),
mar = unit(c(-1, 0.5, -2, 0.5), "in"))
dev.off()
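The open, draw, close pattern can be wrapped so the device is always closed, even if the drawing code fails. This is a minimal sketch using only base R and grid. `with_device()` is a hypothetical helper, and pdf() plus grid.text() stand in for tiff() and forestplot() so the example runs without extra packages:

```r
library(grid)

# Hypothetical helper: open a device, run the drawing code, always close it
with_device <- function(open_device, draw) {
  open_device()
  on.exit(dev.off(), add = TRUE)
  draw()
  invisible(NULL)
}

out_file <- file.path(tempdir(), "device-demo.pdf")
with_device(
  open_device = function() pdf(out_file, width = 9, height = 7),
  draw = function() {
    grid.newpage()
    grid.text("drawn while the device is open")
  }
)
```

For the case above, `open_device` would presumably call tiff("forestplot.tiff", units = "in", width = 9, height = 7, res = 300) and `draw` would contain the forestplot() call.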
I would like my graphs to start at y = 0, but I would like the maximum to scale with a multiple of the data, or otherwise zoom out dynamically. I have 34 charts in this set with various ymax values.
I have tried scale_y_continuous and coord_cartesian. Adding expand = expand_scale(mult = 2) makes the maximum change dynamically, but then the y-axis starts at negative numbers, and I want it to start at 0.
title<- c(
"Carangidae",
"Atlantic cutlassfish",
"Lizardfish",
"Sharks",
"Mackerel")
#DATA#
biomass<- structure(list(timestep = structure(c(10957, 10988, 11017, 11048,
11078, 11109, 11139, 11170, 11201, 11231, 11262, 11292), class = "Date"),
bio_pre_Carangidae = c(0.01105, 0.0199, 0.017,
0.01018, 0.0119, 0.0101, 0.009874, 0.009507,
0.009019, 0.00843, 0.00841, 0.00805), bio_obs_Carangidae = c(NA,
NA, NA, NA, NA, 0.00239, NA, NA, NA, NA, NA, NA), bio_pre_Atl_cutlassfish = c(0.078,
0.069, 0.067, 0.06872, 0.0729, 0.0769,
0.0775, 0.075, 0.0743, 0.072, 0.071,
0.069), bio_obs_Atl_cutlassfish = c(NA, NA, NA, NA, NA,
0.0325, NA, NA, NA, NA, NA, NA), bio_pre_lizardfish = c(0.0635,
0.062, 0.057, 0.0536, 0.0505, 0.0604,
0.0627, 0.068, 0.0695, 0.066, 0.0623,
0.0598), bio_obs_lizardfish = c(NA, NA, NA, NA, NA, 0.037,
NA, NA, NA, NA, NA, NA), bio_pre_sharks = c(0.025, 0.0155,
0.0148, 0.0135, 0.01379, 0.01398, 0.014,
0.0139, 0.0136, 0.0132, 0.0126, 0.011),
bio_obs_sharks = c(NA, NA, NA, NA, NA, 0.003, NA, NA,
NA, NA, NA, NA), bio_pre_mackerel = c(0.0567, 0.0459,
0.0384, 0.03, 0.0328, 0.0336, 0.0299,
0.0296, 0.02343, 0.02713, 0.0239, 0.019
), bio_obs_mackerel = c(NA, NA, NA, NA, NA, 0.055, NA,
NA, NA, NA, NA, NA)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L))
This is my function:
library(ggplot2)
library(purrr)  # provides pmap(), used below
my_bio_plot <- function (biomass, .var1, .var2, .var3) {
p <- ggplot(biomass, aes(x = timestep)) +
geom_line(aes(y = .data[[.var1]], linetype = "Predicted")) + geom_point(size = 3, aes(y = .data[[.var2]], shape = "Observed")) +
ggtitle(paste0(.var3)) +
ylab(expression("biomass" ~ (t/km^2))) +
theme_classic() +
scale_y_continuous(limits = c(0, NA), expand = expand_scale(mult = 2))+
###This is the portion where I cannot figure out how to set ymin = 0 and then ymax to 2* the maximum value of a dataset.##
theme(legend.position = "right") +
theme(axis.ticks = element_line(size = 1), axis.ticks.length = unit(0.25, "cm"))
return(p)
}
## create two separate name vectors
var1_names <- colnames(biomass)[grepl("^bio_pre", colnames(biomass))]
var2_names <- colnames(biomass)[grepl("^bio_obs", colnames(biomass))]
var3_names <- title
## loop through two vectors simultaneously and save result in a list
# ..1 = var1_names, ..2 = var2_names
my_plot_b <- pmap(list(var1_names, var2_names, var3_names), ~ my_bio_plot(biomass, ..1, ..2, ..3))
## merge plots together
# https://cran.r-project.org/web/packages/cowplot/
# install.packages("cowplot", dependencies = TRUE)
dev.new(title = "Model Fit Biomass",
width = 12,
height = 6,
noRStudioGD = TRUE
)
print(my_plot_b)
I can manage to get EITHER a set ymin=0 (a) OR a dynamic ymax (b) but cannot manage to get both.
a
b
How about this? Seems to work on your data.
Define the max for each chart at the top of your function:
my_bio_plot <- function (biomass, .var1, .var2, .var3) {
max_y = 2.0 * max(biomass[[.var1]])
...
scale_y_continuous(limits = c(0, max_y)) +
...
This seems to create the requested output, with min y = 0 and max y = 2 * max y in data.
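One caveat: taking the maximum of the predicted column alone could cut off an observed point that lies above it. A minimal sketch in base R, assuming a hypothetical helper `y_upper()` that looks at several columns and ignores missing values:

```r
# Hypothetical helper: upper y limit as a multiple of the largest value
# found in the given columns, ignoring missing values
y_upper <- function(data, cols, mult = 2) {
  mult * max(unlist(data[cols]), na.rm = TRUE)
}

demo <- data.frame(
  bio_pre_mackerel = c(0.0567, 0.0459, 0.0384),
  bio_obs_mackerel = c(NA, 0.055, NA)
)
limit <- y_upper(demo, c("bio_pre_mackerel", "bio_obs_mackerel"))

# Inside the plotting function this could be used as
# scale_y_continuous(limits = c(0, y_upper(biomass, c(.var1, .var2))))
```

In recent ggplot2 versions, limits = c(0, NA) together with expand = expansion(mult = c(0, 1)) may give a similar result without computing the maximum, since the lower and upper expansion are set separately. I have not checked that against all 34 charts.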
Updated to add a substantially different approach from yours:
biomass %>%
gather(species, bio, -timestep) %>%
mutate(type = ifelse(stringr::str_detect(species, 'pre'), 'predicted', 'observed'),
species = gsub(".*_", "", species)) %>%
group_by(species) %>%
mutate(ul = max(bio, na.rm = TRUE) * 2) %>%
filter(species == "sharks") -> df
df %>%
ggplot(aes(timestep, bio, group = type)) +
geom_point(aes(shape = type)) +
geom_line(aes(linetype = type)) +
# facet_wrap(~species) +
scale_linetype_manual(name = "",
values = c("blank", 'solid')) +
scale_shape_manual(name = "",
values = c(19, NA))+
scale_y_continuous(limits = c(0, max(df$ul)))
You could remove the filter(species == "sharks") and uncomment the facet_wrap(~species), and you will get all the species plotted at the same time.
I need some help with the xlog=TRUE option.
I am asked to provide mean, lower, upper, zero, grid and clip already as exponentials, but the package appears to draw the grid lines at the exponential of the numbers I provide. As a consequence, the grid lines are in the wrong place.
metaan <-
structure(list(
mean = c(NA, NA, NA, 0.27, 0.47, 0.33, 0.69, 0.86, 0.37, 0.08, 0.44, 0.54, 0.41, NA),
lower = c(NA, NA, NA, 0.13, 0.12, 0.19, 0.12, 0.54, 0.17, 0.03, 0.16, 0.06, 0.29, NA),
upper = c(NA, NA, NA, 0.58, 1.81, 0.60, 3.97, 1.36, 0.81, 0.21, 1.25, 4.50, 0.58, NA)),
.Names = c("mean", "lower", "upper"),
row.names = c(NA, -14L),
class = "data.frame")
tabletext<-cbind(
c("", "AB class", "", " Aminoglycosides", " B-lactams", " Cephalosporins", " Fenicoles", " Fluoroquinolones", " Multiresistance", " Sulphamides", " Tetracyclines", " Tri/Sulpha", " Subtotal", ""),
c("", "OR", "", "0.27", "0.47", "0.33", "0.69", "0.86", "0.37", "0.08", "0.44", "0.54", "0.41", ""),
c("", "n", "", "4", "3", "2", "3", "4", "2", "3", "4", "3", "5", ""))
xticks <- c(0.1, 0.25, 0.5, 1, 1.5, 2, 3)
forestplot(tabletext,
graph.pos = 3,
txt_gp = fpTxtGp(label = gpar(fontsize=10)),
hrzl_lines = list("3" = gpar(lty=1)),
zero = 1,
line.margin = .05,
mean = cbind(metaan[,"mean"]),
lower = cbind(metaan[,"lower"]),
upper = cbind(metaan[,"upper"]),
is.summary=c(FALSE, TRUE, rep(FALSE, 9)),
col=fpColors(box=c("blue"), summary=c("blue")),
grid = structure(0.41,
gp = gpar(lty = 2, col = "#CCCCFF")),
clip=c(0.1, 3),
xlog=T,
xticks=xticks,
xlab="Odds ratio")
The grid line is drawn at the exponential of OR=0.41, instead of at OR=0.41.
When I provide the log instead to get the grid line in the correct place (e.g. -0.39, i.e. log10(0.41)), I get an error message saying that I should provide all parameters already as exponentials.
forestplot(tabletext,
graph.pos = 3,
txt_gp = fpTxtGp(label = gpar(fontsize=10)),
hrzl_lines = list("3" = gpar(lty=1)),
zero = 1,
line.margin = .05,
mean = cbind(metaan[,"mean"]),
lower = cbind(metaan[,"lower"]),
upper = cbind(metaan[,"upper"]),
is.summary=c(FALSE, TRUE, rep(FALSE, 9)),
col=fpColors(box=c("blue"), summary=c("blue")),
grid = structure(-0.39,
gp = gpar(lty = 2, col = "#CCCCFF")),
clip=c(0.1, 3),
xlog=T,
xticks=xticks,
xlab="Odds ratio")
Error in forestplot.default(tabletext, graph.pos = 3, txt_gp = fpTxtGp(label = gpar(fontsize = 10)), :
All argument values (mean, lower, upper, zero, grid and clip) should be provided as exponentials when using the log scale. This is an intentional break with the original forestplot function in order to simplify other arguments such as ticks, clips, and more.
I have tried passing the grid numbers as lists, but I always hit the same problem: the grid is misplaced if I provide the numbers as exponentials, and I get the error message if I provide them as logs.
I am wondering what I am doing wrong and if there is any other way to get the grid lines in the correct place.
Thanks in advance,
Magda.
Solved. Updated to 1.8 package version from GitHub and got the correct figure. :)
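For anyone hitting the same error message: it appears to be triggered by the negative grid value, since a value at or below zero cannot be placed on a log axis. The sketch below is a hypothetical pre-flight check in base R (not part of forestplot) that reports which arguments contain such values before the plot is drawn:

```r
# Hypothetical pre-flight check, not part of forestplot: report which
# named arguments contain values that cannot be placed on a log axis
non_positive_args <- function(...) {
  args <- list(...)
  bad <- vapply(args, function(v) any(v <= 0, na.rm = TRUE), logical(1))
  names(args)[bad]
}

ok_call  <- non_positive_args(zero = 1, grid = 0.41, clip = c(0.1, 3))
bad_call <- non_positive_args(zero = 1, grid = -0.39, clip = c(0.1, 3))

# On a log axis, OR = 0.41 should sit to the left of zero = 1
left_of_zero <- log(0.41) < log(1)
```

Here `ok_call` is empty and `bad_call` names the grid argument, which matches the two calls shown in the question.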
I would like to assign a name to every circle in a Venn diagram. I have tried changing the category option, but this seems to be the only setting I can use. My code is below; which part is wrong?
goterm3 = c(1,2,3,4,5,6)
goterm2 =c(2,2,3,4,3,5)
goterm1=c(4,5,3,2,4,3,2,4)
int12 = intersect(goterm1, goterm2)
int13 = intersect(goterm1, goterm3)
int23 = intersect(goterm2, goterm3)
intall = intersect(int12, goterm3)
require(VennDiagram)
venn.plot = draw.triple.venn(length(goterm1), length(goterm2), length(goterm3),
length(int12), length(int23), length(int13),length(intall),
category = rep("ORG1, ORG2,Org",3) ,rotation = 1, reverse = FALSE, euler.d = FALSE,
scaled = FALSE, lwd = rep(2, 3), lty = rep("solid", 3),
col = rep("black", 3), fill = c("blue", "red", "green"),
alpha = rep(0.5, 3),
label.col = rep("black", 7), cex = rep(1, 7), fontface = rep("plain", 7),
fontfamily = rep("serif", 7), cat.pos = c(0, 0, 180),
cat.dist = c(0.05, 0.05, 0.025), cat.col = rep("black", 3),
cat.cex = rep(1, 3), cat.fontface = rep("plain", 3),
cat.fontfamily = rep("serif", 3),
cat.just = list(c(0.5, 1), c(0.5, 1), c(0.5, 0)), cat.default.pos = "outer",
cat.prompts = FALSE, rotation.degree = 0, rotation.centre = c(0.5, 0.5),
ind = TRUE, sep.dist = 0.05, offset = 0)
This is what I get, and it does have the same labels as your categories (after I unmangled the string values for the categories):
category = c("ORG1", "ORG2","Org") # no rep needed and proper quotes
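The difference between the two forms is easy to check in base R. A short sketch, where `bad` and `good` are just illustrative names:

```r
bad  <- rep("ORG1, ORG2,Org", 3)   # one string, repeated three times
good <- c("ORG1", "ORG2", "Org")   # three separate labels

n_bad  <- length(unique(bad))    # number of distinct labels in the rep() form
n_good <- length(unique(good))   # number of distinct labels in the c() form
```

With the rep() form every circle receives the same combined string, which is presumably why the labels could not be set individually.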