How can I format the legend in a gtable plot object? - r

I am using the cutpointr package to generate a cutoff for a continuous variable. It works as described, but the plot objects it generates are complex: they are large gtable objects. I want to format or edit the legend in the plot, but I have failed completely with ggplot2.
This is the code with cutpointr used to generate the cut off:
opt_cut_b_cycle.type<- cutpointr(hcgdf_v2, beta.hcg, livebirth.factor, cycle.type,
method = maximize_boot_metric,
metric = youden, boot_runs = 1000,
boot_stratify = TRUE,
na.rm = TRUE) %>% add_metric(list(ppv, npv, odds_ratio, risk_ratio, p_chisquared))
The plot object is obtained by running 'plot' function
plot(opt_cut_b_cycle.type)
This is the plot generated
I want to edit the legend title from subgroup to Oocyte source
I want to change the labels EDET to Donor, IVFET to Autologous
I tried treating the plot object as a ggplot2 plot and running code such as the following, where p is the plot object.
p + scale_fill_discrete(name = "Oocyte source", labels = c("Donor", "Autologous"))
Unfortunately, the console returns 'NULL'
This is an example data set:
hcgdf_v2 <-tibble(id = 1:10, beta.hcg = seq(from = 5, to = 1500, length.out = 10),
livebirth.factor = c("yes", "no", "yes", "no", "no", "yes", "yes", "no", "no", "yes"),
cycle.type = c("edet","ivfet","edet", "edet", "edet", "edet", "ivfet", "ivfet", "ivfet","edet"))

When I attempted to use your code, it didn't work. However, based on the current type of graph and graph options you've called, this should work.
The legend title
I suggest you run this line and ensure it returns Subgroup before using it to change anything.
# assign the plot to an object
plt <- plot(opt_cut_b_cycle.type)
# printing this should return "Subgroup" - current legend title
plt$grobs[[2]]$grobs[[1]]$grobs[[2]][[4]][[1]][[6]][[1]][[1]]
# change the legend title
plt$grobs[[2]]$grobs[[1]]$grobs[[2]][[4]][[1]][[6]][[1]][[1]] <- "Oocyte"
Legend entries
The easiest method to change the legend entries is probably to rename the factors in your data. Alternatively, you can change these labels the same way you changed the legend title.
Note that the colors will swap between the two options when you change the factor levels. (That's because it is alphabetized.)
#### Option 1 - - Recommended method
# change the legend entries-- factor levels
# in your image, you have "EDET", but your data has "edet"
# make sure this has the capitalization used in your data
hcgdf_v2$cycle2 <- ifelse(hcgdf_v2$cycle.type == "edet", "Donor", "Autologous")
# now rerun plot with alternate subgroup
opt_cut_b_cycle.type<- cutpointr(hcgdf_v2, beta.hcg, livebirth.factor, cycle2,
method = maximize_boot_metric,
metric = youden, boot_runs = 1000,
boot_stratify = TRUE,
na.rm = TRUE) %>%
add_metric(list(ppv, npv, odds_ratio, risk_ratio, p_chisquared))
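The recoding step in Option 1 can also be written with factor(), which makes the mapping from old to new labels explicit and fixes the level order instead of leaving it alphabetical. This is a minimal sketch on a base data.frame copy of the example data (the original uses a tibble). Whether cutpointr uses the factor level order for the legend and colors is an assumption worth checking on your own plot.

```r
# Example data from the question, rebuilt as a plain data.frame
hcgdf_v2 <- data.frame(
  id = 1:10,
  beta.hcg = seq(from = 5, to = 1500, length.out = 10),
  livebirth.factor = c("yes", "no", "yes", "no", "no",
                       "yes", "yes", "no", "no", "yes"),
  cycle.type = c("edet", "ivfet", "edet", "edet", "edet",
                 "edet", "ivfet", "ivfet", "ivfet", "edet")
)

# Map each old level to its new label explicitly:
# "edet" -> "Donor", "ivfet" -> "Autologous"
hcgdf_v2$cycle2 <- factor(hcgdf_v2$cycle.type,
                          levels = c("edet", "ivfet"),
                          labels = c("Donor", "Autologous"))

# Cross-tabulate old against new values to confirm the recoding
table(hcgdf_v2$cycle.type, hcgdf_v2$cycle2)
```

The new column cycle2 can then be passed to cutpointr as the subgroup, exactly as in Option 1.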
#### Option 2 - - Not recommended due to legend spacing
# alternative to rename legend entry labels
# this should return "EDET"
plt$grobs[[2]]$grobs[[1]]$grobs[[7]][[4]][[1]][[6]][[1]][[1]]
# this should return "IVFET"
plt$grobs[[2]]$grobs[[1]]$grobs[[8]][[4]][[1]][[6]][[1]][[1]]
plt$grobs[[2]]$grobs[[1]]$grobs[[7]][[4]][[1]][[6]][[1]][[1]] <- "Donor"
plt$grobs[[2]]$grobs[[1]]$grobs[[8]][[4]][[1]][[6]][[1]][[1]] <- "Autologous"
To see your modified plot, use plot.
plot(plt)
When I ran this code, changing the legend title caused some odd behavior where the plot background wasn't entirely white. If that happens in your plot, do the following.
This requires the gridExtra package.
library(gridExtra)
# clear the plot
plot.new()
# recreate the grid
plt2 <- grid.arrange(plt$grobs[[1]]$grobs[[1]], # 2 small graphs top left
plt$grobs[[1]]$grobs[[2]], # ROC curve graph (top right)
plt$grobs[[1]]$grobs[[3]], # distro of optimal cut
plt$grobs[[1]]$grobs[[4]], nrow = 2) # out-of-bag estimates
plot.new()
# graphs and legend set to 4:1 ratio of space graphs to legend
grid.arrange(plt2, plt$grobs[[2]], ncol = 2, widths = c(4, 1))

Related

Missing right tick marks in R latticeExtra c.trellis

When using latticeExtra:::c.trellis to combine plots, the right-side tick marks and text/numeric labels go missing, and I'd like to bring them back:
library(latticeExtra)
set.seed(1)
foo <- data.frame(x = 1:100,
y = 1:100 + rnorm(100))
foo$resid <- with(foo, x-y)
## Plot 1 -----
(p1 <- xyplot(y~x, foo))
## Plot 2 -----
(p2 <-
xyplot(resid~x, foo,
scales = list(rot = 0, tck = c(1,1), alternating = 3),
between = list(y = 1), ylab.right = "ylab.right",
# par.settings = list(axis.components =
# list(right = list(pad1 = 2, pad2 = 2)))
# Note: this padding attempt does not restore the missing ticks,
# pad arguments get ignored when using c.trellis below
))
# tick marks appear on all four sides (as desired)
## Combine -----
(p12 <- latticeExtra:::c.trellis(p2, p1,layout = c(1,2)))
# right tick marks are missing
Is there a way to restore the right-side ticks and/or labels manually, say, by modifying the combined trellis object?
From the help file ?c.trellis:
Description
Combine the panels of multiple trellis objects into one.
and later,
Note that combining panels from different types of plots does not really fit the trellis model. Some features of the plot may not work as expected. In particular, some work may be needed to show or hide scales on selected panels. An example is given below.
It looks to me that you really aren't trying to combine panels into one object. You even use between to put some separation. Rather, you are trying to combine two plots.
You can use print,
print(p1,split=c(1,1,1,2),more=TRUE)
print(p2,split=c(1,2,1,2),more=FALSE)
See ?print.trellis.

Plotting quantile regression by variables in a single page

I am running quantile regressions for several independent variables separately (same dependent). I want to plot only the slope estimates over several quantiles of each variable in a single plot.
Here's a toy data:
set.seed(1988)
y <- rnorm(50, 5, 3)
x1 <- rnorm(50, 3, 1)
x2 <- rnorm(50, 1, 0.5)
# Running Quantile Regression
require(quantreg)
fit1 <- summary(rq(y~x1, tau=1:9/10), se="boot")
fit2 <- summary(rq(y~x2, tau=1:9/10), se="boot")
I want to plot only the slope estimates over quantiles. Hence, I am giving parm=2 in plot.
plot(fit1, parm=2)
plot(fit2, parm=2)
Now, I want to combine both these plots in a single page.
What I have tried so far;
I tried setting par(mfrow=c(2,2)) and plotting them. But it's producing a blank page.
I have tried using gridExtra and gridGraphics without success. Tried to convert base graphs into Grob objects as stated here
Tried using the layout function as in this document
I am trying to look into the source code of plot.rqs. But I am unable to understand how it's plotting confidence bands (I'm able to plot only the coefficients over quantiles) or to change mfrow parameter there.
Can anybody point out where I am going wrong? Should I look into the source code of plot.rqs and change any parameters there?
While quantreg::plot.summary.rqs has an mfrow parameter, it uses it to override par('mfrow') so as to facet over parm values, which is not what you want to do.
One alternative is to parse the objects and plot manually. You can pull the tau values and coefficient matrix out of fit1 and fit2, which are just lists of values for each tau, so in tidyverse grammar,
library(tidyverse)
c(fit1, fit2) %>% # concatenate lists, flattening to one level
# iterate over list and rbind to data.frame
map_dfr(~cbind(tau = .x[['tau']], # from each list element, cbind the tau...
coef(.x) %>% # ...and the coefficient matrix,
data.frame(check.names = TRUE) %>% # cleaned a little
rownames_to_column('term'))) %>%
filter(term != '(Intercept)') %>% # drop intercept rows
# initialize plot and map variables to aesthetics (positions)
ggplot(aes(x = tau, y = Value,
ymin = Value - Std..Error,
ymax = Value + Std..Error)) +
geom_ribbon(alpha = 0.5) +
geom_line(color = 'blue') +
facet_wrap(~term, nrow = 2) # make a plot for each value of `term`
Pull more out of the objects if you like, add the horizontal lines of the original, and otherwise go wild.
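The same "parse the objects and plot manually" idea also works in base graphics, where par(mfrow) behaves as expected because plot.summary.rqs is never called. This is a minimal sketch, assuming the bootstrap summaries from the question. The helper names slope_table and plot_slope are hypothetical, and the band is plus or minus one standard error as in the ggplot2 version, not necessarily the band quantreg draws.

```r
library(quantreg)

# Toy data and fits from the question
set.seed(1988)
y  <- rnorm(50, 5, 3)
x1 <- rnorm(50, 3, 1)
x2 <- rnorm(50, 1, 0.5)
fit1 <- summary(rq(y ~ x1, tau = 1:9/10), se = "boot")
fit2 <- summary(rq(y ~ x2, tau = 1:9/10), se = "boot")

# Hypothetical helper: one row per tau with the slope and its bootstrap SE
slope_table <- function(fit) {
  rows <- lapply(fit, function(s) {
    cf <- coef(s)  # coefficient matrix; row 2 is the slope
    data.frame(tau = s$tau,
               value = cf[2, "Value"],
               se = cf[2, "Std. Error"])
  })
  do.call(rbind, rows)
}

# Hypothetical helper: slope line with a +/- 1 SE band
plot_slope <- function(tab, main) {
  lo <- tab$value - tab$se
  hi <- tab$value + tab$se
  plot(tab$tau, tab$value, type = "n", ylim = range(lo, hi),
       xlab = "tau", ylab = "slope", main = main)
  polygon(c(tab$tau, rev(tab$tau)), c(lo, rev(hi)),
          col = "grey80", border = NA)
  lines(tab$tau, tab$value, col = "blue")
}

tab1 <- slope_table(fit1)
tab2 <- slope_table(fit2)

# Two stacked panels on one page; restore the old settings afterwards
old <- par(mfrow = c(2, 1))
plot_slope(tab1, "x1")
plot_slope(tab2, "x2")
par(old)
```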
Another option is to use magick to capture the original images (or save them with any device and reread them) and manually combine them:
library(magick)
plots <- image_graph(height = 300) # graphics device to capture plots in image stack
plot(fit1, parm = 2)
plot(fit2, parm = 2)
dev.off()
im1 <- image_append(plots, stack = TRUE) # attach images in stack top to bottom
image_write(im1, 'rq.png')
The plot function used by the quantreg package has its own mfrow parameter. If you do not specify it, the function picks a layout on its own and thus overrides your par(mfrow = c(2,2)).
Using the mfrow parameter within plot.rqs:
# make one plot, change the layout
plot(fit1, parm = 2, mfrow = c(2,1))
# add a new plot
par(new = TRUE)
# create a second plot
plot(fit2, parm = 2, mfrow = c(2,1))

Automatically scale x-axis by date range within a factor using xyplot()

I've been trying to write out an R script that will plot the date-temp series for a set of locations that are identified by a Deployment_ID.
Ideally, each page of the output pdf would have the name of the Deployment_ID (check), a graph with proper axes (check) and correct scaling of the x-axis to best show the date-temp series for that specific Deployment_ID (not check).
At the moment, the script makes a pdf that shows each ID over the full range of the dates in the date column (i.e. 1988-2010), instead of just the relevant dates (i.e. just 2005), which squishes the scatterplot down into uselessness.
I'm pretty sure it's something to do with how you define xlim, but I can't figure out how to have R access the date min and the date max for each factor as it draws the plots.
Script I have so far:
#Get CSV to read data from, change the file path and name
data <- read.csv(file.path("C:/Users/Person/Desktop", "SampleData.csv"))
#Make Date real date - must be in yyyy/mm/dd format from the csv to do so
data$Date <- as.Date(data$Date)
#Call lattice to library, note to install.packages(lattice) if you don't have it
library(lattice)
#Make the plots with lattice, this takes a while.
dataplot <- xyplot(data$Temp~data$Date|factor(data$Deployment_ID),
data=data,
stack = TRUE,
auto.key = list(space = "right"),
layout = c(1,1),
ylim = c(-10,40)
)
#make the pdf
pdf("Dataplots_SampleData.pdf", onefile = TRUE)
#print to the pdf? Not really sure how this works. Takes a while.
print(dataplot)
dev.off()
Use the scales argument. Give this a try:
dataplot <- xyplot(data$Temp~data$Date|factor(data$Deployment_ID),
data=data,
stack = TRUE,
auto.key = list(space = "right"),
layout = c(1,1),
scales= list( relation ="free")
)
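Note that relation = "free" at the top level of scales frees both axes, which is why ylim was dropped in the call above. If the fixed temperature range from the question should stay, the relation can be set for the x-axis only. This is a minimal sketch on made-up data; the column names follow the question, but the values and dates are invented for illustration.

```r
library(lattice)

# Made-up data: two deployments recorded in very different years
set.seed(1)
data <- data.frame(
  Deployment_ID = rep(c("A", "B"), each = 30),
  Date = c(seq(as.Date("1990-01-01"), by = "day", length.out = 30),
           seq(as.Date("2005-06-01"), by = "day", length.out = 30)),
  Temp = runif(60, 0, 30)
)

# Free the x-axis per panel, keep one fixed y range for all panels
dataplot <- xyplot(Temp ~ Date | factor(Deployment_ID),
                   data = data,
                   layout = c(1, 1),
                   ylim = c(-10, 40),
                   scales = list(x = list(relation = "free")))

# One page per deployment, each with its own date range
print(dataplot)
```

Using bare column names in the formula together with data = data also avoids repeating data$ for every variable.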

Rgraphviz: edge labels outside plotting region

I am trying to plot a Rgraphviz object with two edge labels. Unfortunately the labels fall outside the plot. Here is my example:
require('Rgraphviz')
set.seed(123)
g1 <- randomGraph(letters[1:10], 1:4, 0.4)
eAttrs <- list()
eAttrs$label <- c("a~g" = "I have a very long label 1", "a~i" = "and a long label 2")
plot(g1, edgeAttrs = eAttrs)
Here is my plot:
I tried several things with no success:
1. Set a larger bounding box
z <- agopen(g1, "foo")
z@boundBox@upRight@x <- z@boundBox@upRight@x + 300
z@boundBox@upRight@y <- z@boundBox@upRight@y + 300
plot(z, edgeAttrs = eAttrs)
2. Decrease the label fontsize (not really what I want in my application, anyway)
eAttrs$labelfontsize=c("a~g"="3")
plot(g1, edgeAttrs = eAttrs)
3. Change par attributes:
par(oma=c(10,10,10,10))
plot(g1, edgeAttrs = eAttrs)
4. Change node, edge and general attributes from ?Rgraphviz::GraphvizAttributes
attrs <- list(graph=list(size=c(1, 1)))
attrs$edge$fontsize<-8
plot(g1, edgeAttrs = eAttrs, attrs=attrs)
None of my attempts seem to work. Does anyone have an idea?
The problem
Calling plot() on a graph object dispatches an S4 method (shown by getMethod("plot", "graph")), which in turn calls the function shown by typing getMethod("plot", "Ragraph"). That function contains the following rather unfortunate lines which, regardless of any related parameter settings you've made, will override them to reset the left and right margins to 0. Frustrating!
oldpars <- par(mai = c(sheight, 0, mheight, 0))
on.exit(par(oldpars), add = TRUE)
A workaround
One workaround is to construct a three panel layout in which the left and right panels are just there to provide a bit of buffering space. Turn off clipping, plot your graph object in the middle panel, and it then seems to work:
layout(matrix(1:3, nrow=1), widths=c(1,5,1))
par(xpd=NA) ## turn off all clipping
plot.new() ## blank plot in Panel 1
plot(g1, edgeAttrs = eAttrs) ## graph in Panel 2
plot.new() ## blank plot in Panel 3
I found another solution: In my original question I changed the size of the bounding box in a laid out graph I got with agopen. Plotting the laid out graph showed no edge labels at all. I found that it is possible to pass the edge attributes from the graph object to agopen and then change the bounding box:
require('Rgraphviz')
set.seed(123)
g1 <- randomGraph(letters[1:10], 1:4, 0.4)
eAttrs <- list()
eAttrs$label <- c("a~g" = "I have a very long label 1", "a~i" = "and a long label 2")
z <- agopen(g1, "foo", edgeAttrs=eAttrs)
z@boundBox@botLeft@x <- z@boundBox@botLeft@x - 400 ##left
z@boundBox@upRight@x <- z@boundBox@upRight@x + 200 ##right
plot(z)
The plot:

lattice or latticeExtra combine multiple plots different yscaling (log10 and non-transformed)

I have a multivariable time series where some of the variables have rather large ranges. I wish to make a single-page plot with stacked plots of each variable, where some of the variables have a log10 y-axis scaling. I am relatively new to lattice and have not been able to figure out how to effectively mix the log10 scaling with non-transformed axes and get a publication-quality plot. If print.trellis is used, the plots are not aligned and the padding needs some work; if c.trellis is used, the layout is good, but the y-scaling from only one plot is used. Any suggestions for an efficient solution, where I can replicate the output of c.trellis using the different y-scaling for each (original) object?
Example below:
require(lattice)
require(latticeExtra)
# make data.frame
d.date <- as.POSIXct(c("2009-12-15", "2010-01-15", "2010-02-15", "2010-03-15", "2010-04-15"))
CO2dat <- c(100,200,1000,9000,2000)
pHdat <- c(10,9,7,6,7)
tmp <- data.frame(date=d.date ,CO2dat=CO2dat ,pHdat=pHdat)
# make plots
plot1 <- xyplot(pHdat ~ date, data=tmp
, ylim=c(5,11)
, ylab="pHdat"
, xlab="Date"
, origin = 0, border = 0
, scales=list(y=list(alternating=1))
, panel = function(...){
panel.xyarea(...)
panel.xyplot(...)
}
)
# make plot with log y scale
plot2 <- xyplot(CO2dat ~ date, data=tmp
, ylim=c(10,10^4)
, ylab="CO2dat"
, xlab="Date"
, origin = 0, border = 0
, scales=list(y=list(alternating=1,log=10))
, yscale.components = yscale.components.log10ticks
, panel = function(...){
panel.xyarea(...)
panel.xyplot(...)
# plot CO2air uatm
panel.abline(h=log10(390),col="blue",type="l",...)
}
)
# plot individual figures using split
print(plot2, split=c(1,1,1,2), more=TRUE)
print(plot1, split=c(1,2,1,2), more=F)
# combine plots (more convenient)
comb <- c(plot1, plot2, x.same=F, y.same=F, layout = c(1, 2))
# plot combined figure
update(comb, ylab = c("pHdat","log10 CO2dat"))
Using @joran's idea, I can get the axes to be closer but not exactly aligned; also, reducing padding gets the plots closer together but changes the aspect ratio. In the picture below I've reduced the padding, perhaps by too much, to show the inexact alignment; if the plots should sit this close, you'd clearly want to remove the x-axis labels on the top as well.
I looked into the code that sets up the layout, and the margin on the left side is calculated from the width of the labels, so @joran's idea is probably the only thing that will work when printing with split, unless one were to rewrite the plot.trellis command. Perhaps the c method could work, but I haven't found a way yet to set the scale components separately depending on the panel. That does seem more promising though.
mtheme <- standard.theme("pdf")
mtheme$layout.heights$bottom.padding <- -10
plot1b <- update(plot1, scales=list(y=list(alternating=1, at=5:10, labels=paste(" ",c(5:10)))))
plot2b <- update(plot2, par.settings=mtheme)
pdf(file="temp.pdf")
print(plot2b, split=c(1,1,1,2), more=TRUE)
print(plot1b, split=c(1,2,1,2), more=F)
# close the pdf device so the file is written
dev.off()

Resources