save yaxis legends as a separate grob? - r

I have a very large scatterplot of two categories where a dot is a "hit." I wanted to make histograms across the top and side of the plot to represent the hits as seen on the following website: http://blog.mckuhn.de/2009/09/learning-ggplot2-2d-plot-with.html
I can arrange the plots as a 2-by-two grid however I run into a problem: The yaxis of my primary scatterplot has very long titles (important for the project) and in 2x2 grid the top histogram stretches to the full width and is no longer aligned along the x-axis.
My thought was to make a 3x3 grid where I use the leftmost grid for titles. However this requires saving the Y-axis text as "grob." Iin the above blog-post this is achieved as follows:
p <- qplot(data = mtcars, mpg, hp, geom = "point", colour = cyl)
legend <- p + opts(keep= "legend_box")
this allows "legend" to be placed into the 2x2 grid layout. If I could use the same logic to make a seperate grob for the Yaxis labels I would be all good. I have tried at leastthe following:
legend <- p +opts(keep="Yaxis")
legend <- p +opts(keep="axis_text_y")
legend <- p +opts(keep="axis_text")
..... and many others
Is it possible to make a grob from things besides the Legend Box? If so - please let me know. If not, I'll take any suggestions on how to arrange the three plots while keeping them aligned and preserving the Y Labels.
thanks

This question has been sitting long enough, that it is time to document an answer for posterity.
The short answer is that highly-customized data visualizations cannot be done using function wrappers from the 'lattice' and 'ggplot2' packages. The purpose of a function wrapper is to take some of the decisions out of your hands, so you will always be limited to the decisions originally envisioned by the function coder. I highly recommend everyone learn the 'lattice' or 'ggplot2' packages, but these packages are more useful for data exploration than for being creative with data visualizations.
This answer is for those who want to create a customized visual. The following process may take half a day, but that is significantly less time than it would take to hack the 'lattice' or 'ggplot2' packages into the shape you want. This isn't a criticism of either of those packages; it's just a byproduct of their purpose. When you need a creative visual for a publication or client, 4 or 5 hours of your day is nothing compared to the payoff.
The work to make a customized visual is pretty simple with the 'grid' package, but that doesn't mean the math behind it is always simple. Most of the work in this example is actually the math and not the graphic.
Preface: There are some things you should know before you being working with the base 'grid' package for your visuals. The first is that 'grid' works off the idea of viewports. These are plotting spaces that allow you to reference from within that space, ignoring the rest of the graphic. This is important, because it allows you to make graphics without having to scale your work into fractions of the entire space. It's a lot like the layout options in the base plotting functions, except that they can overlap, be rotated, and made transparent.
Units are another thing to know. The viewports each have a variety of units that you can use to indicate positions and sizes. You can see the whole list in the 'grid' documentation, but there are only a few that I use very often: npc, native, strwidth, and lines. Npc units start at (0,0) in the bottom left and go to c(1,1) in the upper right. Native units use an 'xscale' and 'yscale' to create what is essentially a plotting space for data. Strwidth units tell you how wide a certain string of text would be once printed on the graphic. Lines units tell you how tall a line of text would be once printed on the graphic. Since multiple types of units are always available, you should get in the habit of always either explicitly defining a number with a 'unit' function or specifying the 'default.units' argument from within your drawing functions.
Finally, you have the ability to specify justifications for all your objects' locations. This is HUGE. It means you can specify the location of a shape and then say how you want that shape horizontally and vertically justified (center, left, right, bottom, top). You can line up things perfectly this way by referencing the location of other objects.
This is what we are making: This isn't a perfect graphic, since I'm having to guess what the OP wants, but it is enough to get us on our way to a perfect graphic.
Step 1: Load up some libraries to work with. When you want to do highly-customized visuals, use the 'grid' package. It's the base set of functions that wrappers like 'lattice' and 'ggplot2' are calling. When you want to work with dates, use the 'lubridate' package, because IT MAKES YOUR LIFE BETTER. This last one is a personal preference: when I'm going to doing any sort of data summary work, I like to use the 'plyr' package. It allows me to quickly shape my data into aggregate forms.
library(grid)
library(lubridate)
library(plyr)
Sample data generation: This isn't necessary if you already have your data, but for this example, I'm creating a set of sample data. You can play around with it by changing the user settings for the data generation. The script is flexible and will adapt to the data generated. Feel free to add more websites and play around with the lambda values.
set.seed(1)
#############################################
# User settings for the data generation. #
#############################################
# Set number of hours to generate data for.
time_Periods <- 100
# Set starting datetime in m/d/yyyy hh:mm format.
start_Datetime <- "2/24/2013 00:00"
# Specify a list of websites along with a
# Poisson lambda to represent the average
# number of hits in a given time period.
df_Websites <- read.table(text="
url lambda
http://www.asitenoonereallyvisits.com 1
http://www.asitesomepeoplevisit.com 10
http://www.asitesomemorepeoplevisit.com 20
http://www.asiteevenmorepeoplevisit.com 40
http://www.asiteeveryonevisits.com 80
", header=TRUE, sep=" ")
#############################################
# Generate the data. #
#############################################
# Initialize lists to hold hit data and
# website names.
hits <- list()
websites <- list()
# For each time period and for each website,
# flip a coin to see if any visitors come. If
# visitors come, use a Poisson distribution to
# see how many come.
# Also initialize the list of website names.
for (i in 1:nrow(df_Websites)){
hits[[i]] <- rbinom(time_Periods, 1, 0.5) * rpois(time_Periods, df_Websites$lambda[i])
websites[[i]] <- rep(df_Websites$url[i], time_Periods)
}
# Initialize list of time periods.
datetimes <- mdy_hm(start_Datetime) + hours(1:time_Periods)
# Tie the data into a data frame and erase rows with no hits.
# This is what the real data is more likely to look like
# after import and cleaning.
df_Hits <- data.frame(datetime=rep(datetimes, nrow(df_Websites)), hits=unlist(hits), website=unlist(websites))
df_Hits <- df_Hits[df_Hits$hits > 0,]
# Clean up data-generation variables.
rm(list=ls()[ls()!="df_Hits"])
Step 2: Now, we need to decide how we want our graphic to work. It's useful to separate things like sizes and colors into a different section of your code, so you can quickly make changes. Here, I've chosen some basic settings that should produce a decent graphic. You'll notice that a few of the size settings are using the 'unit' function. This is one of the amazing things about the 'grid' package. You can use various units to describe space on your graphic. For instance, unit(1, "lines") is the height of one line of text. This makes laying out a graphic significantly easier.
#############################################
# User settings for the graphic. #
#############################################
# Specify the window width and height and
# pixels per inch.
device_Width=12
device_Height=4.5
pixels_Per_Inch <- 100
# Specify the bin width (in hours) of the
# upper histogram.
bin_Width <- 2
# Specify a padding size for separating text
# from other plot elements.
padding <- unit(1, "strwidth", "W")
# Specify the bin cut-off values for the hit
# counts and the corresponding colors. The
# cutoff should be the maximum value to be
# contained in the bin.
bin_Settings <- read.table(text="
cutoff color
10 'darkblue'
20 'deepskyblue'
40 'purple'
80 'magenta'
160 'red'
", header=TRUE, sep=" ")
# Specify the size of the histogram plots
# in 'grid' units. Override only if necessary.
# histogram_Size <- unit(6, "lines")
histogram_Size <- unit(nrow(bin_Settings) + 1, "lines")
# Set the background color for distinguishing
# between rows of data.
row_Background <- "gray90"
# Set the color for the date lines.
date_Color <- "gray40"
# Set the color for marker lines on histograms.
marker_Color <- "gray80"
# Set the fontsize for labels.
label_Size <- 10
Step 3: It's time to make the graphic. I have limited space for explanations in an SO answer, so I will summarize and then leave the code comments to explain the details. In a nutshell, I'm calculating how big everything will be and then making the plots one at a time. For each plot, I format my data first, so I can specify the viewport appropriately. Then I lay down labels that need to be behind the data, and then I plot the data. At the end, I "pop" the viewport to finalize it.
#############################################
# Make the graphic. #
#############################################
# Make sure bin cutoffs are in increasing order.
# This way, we can make assumptions later.
bin_Settings <- bin_Settings[order(bin_Settings$cutoff),]
# Initialize plot window.
# Make sure you always specify the pixels per
# inch, so you have an appropriately scaled
# graphic for output.
windows(
width=device_Width,
height=device_Height,
xpinch=pixels_Per_Inch,
ypinch=pixels_Per_Inch)
grid.newpage()
# Push an initial viewport, so we can set the
# font size to use in calculating label widths.
pushViewport(viewport(gp=gpar(fontsize=label_Size)))
# Find the list of websites in the data.
unique_Urls <- as.character(unique(df_Hits$website))
# Calculate the width of the website
# urls once printed on the screen.
label_Width <- list()
for (i in 1:length(unique_Urls)){
label_Width[[i]] <- convertWidth(unit(1, "strwidth", unique_Urls[i]), "npc")
}
# Use the maximum url width plus two padding.
x_Label_Margin <- unit(max(unlist(label_Width)), "npc") + padding * 2
# Calculate a height for the date labels plus two padding.
y_Label_Margin <- unit(1, "strwidth", "99/99/9999") + padding * 2
# Calculate size of main plot after making
# room for histogram and label margins.
main_Width <- unit(1, "npc") - histogram_Size - x_Label_Margin
main_Height <- unit(1, "npc") - histogram_Size - y_Label_Margin
# Calculate x values, using the minimum datetime
# as zero, and counting the hours between each
# datetime and the minimum.
x_Values <- as.integer((df_Hits$datetime - min(df_Hits$datetime)))/60^2
# Initialize main plotting area
pushViewport(viewport(
x=x_Label_Margin,
y=y_Label_Margin,
width=main_Width,
height=main_Height,
xscale=c(-1, max(x_Values) + 1),
yscale=c(0, length(unique_Urls) + 1),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size)))
# Put grey background behind every other website
# to make data easier to read, and write urls as
# y-labels.
for (i in 1:length(unique_Urls)){
if (i%%2==0){
grid.rect(
x=unit(-1, "npc"),
y=i,
width=unit(2, "npc"),
height=1,
default.units="native",
just=c("left", "center"),
gp=gpar(col=row_Background, fill=row_Background))
}
grid.text(
unique_Urls[i],
x=unit(0, "npc") - padding,
y=i,
default.units="native",
just=c("right", "center"))
}
# Find the hour offset of the minimum date value.
time_Offset <- as.integer(format(min(df_Hits$datetime), "%H"))
# Find the dates in the data.
x_Labels <- unique(format(df_Hits$datetime, "%m/%d/%Y"))
# Find where the days begin in the data.
midnight_Locations <- (0:max(x_Values))[(0:max(x_Values)+time_Offset)%%24==0]
# Write the appropriate date labels on the x-axis
# where the days begin.
grid.text(
x_Labels,
x=midnight_Locations,
y=unit(0, "npc") - padding,
default.units="native",
just=c("right", "center"),
rot=90)
# Draw lines to vertically mark when days begin.
grid.polyline(
x=c(midnight_Locations, midnight_Locations),
y=unit(c(rep(0, length(midnight_Locations)), rep(1, length(midnight_Locations))), "npc"),
default.units="native",
id=rep(midnight_Locations, 2),
gp=gpar(lty=2, col=date_Color))
# Initialize bin assignment variable.
bin_Assignment <- 1
# Calculate which bin each hit value belongs in.
for (i in 1:nrow(bin_Settings)){
bin_Assignment <- bin_Assignment + ifelse(df_Hits$hits>bin_Settings$cutoff[i], 1, 0)
}
# Draw points, coloring according to the bin settings.
grid.points(
x=x_Values,
y=match(df_Hits$website, unique_Urls),
pch=19,
size=unit(1, "native"),
gp=gpar(col=as.character(bin_Settings$color[bin_Assignment]), alpha=0.5))
# Finalize the main plotting area.
popViewport()
# Create the bins for the upper histogram.
bins <- ddply(
data.frame(df_Hits, bin_Assignment, mid=floor(x_Values/bin_Width)*bin_Width+bin_Width/2),
.(bin_Assignment, mid),
summarize,
freq=length(hits))
# Initialize upper histogram area
pushViewport(viewport(
x=x_Label_Margin,
y=y_Label_Margin + main_Height,
width=main_Width,
height=histogram_Size,
xscale=c(-1, max(x_Values) + 1),
yscale=c(0, max(bins$freq) * 1.05),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size)))
# Calculate where to put four value markers.
marker_Interval <- floor(max(bins$freq)/4)
digits <- nchar(marker_Interval)
marker_Interval <- round(marker_Interval, -digits+1)
# Draw horizontal lines to mark values.
grid.polyline(
x=unit(c(rep(0,4), rep(1,4)), "npc"),
y=c(1:4 * marker_Interval, 1:4 * marker_Interval),
default.units="native",
id=rep(1:4, 2),
gp=gpar(lty=2, col=marker_Color))
# Write value labels for each marker.
grid.text(
1:4 * marker_Interval,
x=unit(0, "npc") - padding,
y=1:4 * marker_Interval,
default.units="native",
just=c("right", "center"))
# Finalize upper histogram area, so we
# can turn it back on but with clipping.
popViewport()
# Initialize upper histogram area again,
# but with clipping turned on.
pushViewport(viewport(
x=x_Label_Margin,
y=y_Label_Margin + main_Height,
width=main_Width,
height=histogram_Size,
xscale=c(-1, max(x_Values) + 1),
yscale=c(0, max(bins$freq) * 1.05),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size),
clip="on"))
# Draw bars for each bin.
for (i in 1:nrow(bin_Settings)){
active_Bin <- bins[bins$bin_Assignment==i,]
if (nrow(active_Bin)>0){
for (j in 1:nrow(active_Bin)){
grid.rect(
x=active_Bin$mid[j],
y=0,
width=bin_Width,
height=active_Bin$freq[j],
default.units="native",
just=c("center","bottom"),
gp=gpar(col=as.character(bin_Settings$color[i]), fill=as.character(bin_Settings$color[i]), alpha=1/nrow(bin_Settings)))
}
}
}
# Draw x-axis.
grid.lines(x=unit(c(0, 1), "npc"), y=0, default.units="native")
# Finalize upper histogram area.
popViewport()
# Calculate the frequencies for each website and bin.
freq_Data <- ddply(
data.frame(df_Hits, bin_Assignment),
.(website, bin_Assignment),
summarize,
freq=length(hits))
# Create the line data for the side histogram.
line_Data <- matrix(0, nrow=length(unique_Urls)+2, ncol=nrow(bin_Settings))
for (i in 1:nrow(freq_Data)){
line_Data[match(freq_Data$website[i], unique_Urls)+1,freq_Data$bin_Assignment[i]] <- freq_Data$freq[i]
}
# Initialize side histogram area
pushViewport(viewport(
x=x_Label_Margin + main_Width,
y=y_Label_Margin,
width=histogram_Size,
height=main_Height,
xscale=c(0, max(line_Data) * 1.05),
yscale=c(0, length(unique_Urls) + 1),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size)))
# Calculate where to put four value markers.
marker_Interval <- floor(max(line_Data)/4)
digits <- nchar(marker_Interval)
marker_Interval <- round(marker_Interval, -digits+1)
# Draw vertical lines to mark values.
grid.polyline(
x=c(1:4 * marker_Interval, 1:4 * marker_Interval),
y=unit(c(rep(0,4), rep(1,4)), "npc"),
default.units="native",
id=rep(1:4, 2),
gp=gpar(lty=2, col=marker_Color))
# Write value labels for each marker.
grid.text(
1:4 * marker_Interval,
x=1:4 * marker_Interval,
y=unit(0, "npc") - padding,
default.units="native",
just=c("center", "top"))
# Draw lines for each bin setting.
grid.polyline(
x=array(line_Data),
y=rep(0:(length(unique_Urls)+1), nrow(bin_Settings)),
default.units="native",
id=array(t(matrix(1:nrow(bin_Settings), nrow=nrow(bin_Settings), ncol=length(unique_Urls)+2))),
gp=gpar(col=as.character(bin_Settings$color)))
# Draw vertical line for the y-axis.
grid.lines(x=0, y=c(0, length(unique_Urls)+1), default.units="native")
# Finalize side histogram area.
popViewport()
# Draw legend.
# Draw box behind legend headers.
grid.rect(
x=0,
y=1,
width=unit(1, "strwidth", names(bin_Settings)[1]) + unit(1, "strwidth", names(bin_Settings)[2]) + 3 * padding,
height=unit(1, "lines"),
default.units="npc",
just=c("left","top"),
gp=gpar(col=row_Background, fill=row_Background))
# Draw legend headers from bin_Settings variable.
grid.text(
names(bin_Settings)[1],
x=padding,
y=1,
default.units="npc",
just=c("left","top"))
grid.text(
names(bin_Settings)[2],
x=unit(1, "strwidth", names(bin_Settings)[1]) + 2 * padding,
y=1,
default.units="npc",
just=c("left","top"))
# For each row in the bin_Settings variable,
# write the cutoff values and the color associated.
# Write the color name in the color it specifies.
for (i in 1:nrow(bin_Settings)){
grid.text(
bin_Settings$cutoff[i],
x=unit(1, "strwidth", names(bin_Settings)[1]) + padding,
y=unit(1, "npc") - i * unit(1, "lines"),
default.units="npc",
just=c("right","top"))
grid.text(
bin_Settings$color[i],
x=unit(1, "strwidth", names(bin_Settings)[1]) + 2 * padding,
y=unit(1, "npc") - i * unit(1, "lines"),
default.units="npc",
just=c("left","top"),
gp=gpar(col=as.character(bin_Settings$color[i])))
}

Related

Align text to edge of grid R

I made a grid, I drew some lines between each cell.
I want text in the upper lefthand corner of a cell, but I can only get alignment relative to the cell's center.
I've spent a lot of time searching, and this is the closest I could find is What do hjust and vjust do when making a plot using ggplot?. The [0,1] values of hjust and vjust align the text in reference to the points in this example, and on a grid (using grid.text) they align the text relative to the center of the cell. I've tried values of hvjust outside of [0,1] with no luck; I've tried using decimal places when specifying the row/column (1.5 should be between row 1 and 2 right?) for text placement, but the decimals just get rounded. I can't align by hand because my script should be aligning many names of variable length.
Code to layout the grid:
grid.newpage()
pushViewport(viewport(layout=grid.layout(29+1,7, heights = unit(rep(1,
(29+1), "null"), widths=unit(c(1,.5,.5,.5,1,1,1), "null"))))
grid.polyline(x=c(0,0,1,1,1.5,1.5,2,2,2.5,2.5,3.5,3.5,4.5,4.5,5.5,5.5)/5.5, y=rep(c(0,1), 8), id.lengths=rep(2,8))
grid.polyline(x=rep(c(0,1), 29+2), y=sort(rep(0:(29+1), 2)/(29+1)), id.lengths=rep(2,29+2))
Assume the 29s are variable numbers, I had to change them from something more specific. If the code doesn't layout the grid, I must've deleted an extra parenthesis. The only other code I have just places text in the near center of a cell of the grid.
My goal is to place text so that the first letter is in the upper left corner next to the gridlines. Any guidance is greatly appreciated.
You may find it easier to work with a gtable, in which case each label can be centred within its own cell. Otherwise, with your strategy, you'd need to keep track of all x and y positions (or define as many individual viewports, but that's essentially what gtable does on top of grid).
Here's an example (I used tableGrob to set up the gtable only for convenience),
library(gtable)
library(gridExtra)
g <- tableGrob(matrix("", 29+1, 7), theme = ttheme_minimal())
g$widths <- unit(c(1,.5,.5,.5,1,1,1), "null")
g$heights <- unit(rep(1, (29+1)), "null")
vs <- replicate(ncol(g)-1,
segmentsGrob(x1 = unit(0, "npc"), gp=gpar(lwd=0.5)),
simplify=FALSE)
hs <- replicate(nrow(g)-1,
segmentsGrob(y1 = unit(0, "npc"), gp=gpar(lwd=0.5)),
simplify=FALSE)
g <- gtable::gtable_add_grob(g, grobs = vs,
t = 1, b = nrow(g), l = seq_len(ncol(g)-1)+1)
g <- gtable::gtable_add_grob(g, grobs = hs,
l = 1, r = ncol(g), t = seq_len(nrow(g)-1))
labels <- list(textGrob("l", hjust=0, x=0),
textGrob("c", hjust=0.5, x=0.5),
textGrob("r", hjust=1, x=1))
g <- gtable_add_grob(g, labels, t=c(1,2,3), l=c(2, 4, 7), z = -1)
grid.newpage()
grid.draw(g)
Note that this strategy is however very inefficient, because it doesn't make use of vectorised grobs (instead individual grobs are placed). For a small enough table it may not matter, and the gain in convenience can be worthwhile. See this wiki page for further examples of gtable processing.

How to determine the size of a pointsGrob (vs the one of a textGrob)?

Why does determining the size (in mm, for example) of a points grob (pointsGrob) fail, but not for a text grob (textGrob)?
### puzzled about grobs (not) knowning how big they are:
require(grid)
convertUnit(grobWidth(textGrob("some tex")), "mm") # 16.93mm => fine
convertUnit(grobWidth(pointsGrob(0.5, 0.5)), "mm") # 0mm => ?
convertUnit(grobWidth(pointsGrob(0.5, 0.5, size=unit(3, "mm"))), "mm") ## still 0mm...
The reason why I am asking is: If you place a text grob and a points grob side-by-side, and change the value of cex, then suddenly the two grobs overlap (unwanted behavior).
Here is an example showing a similar prolem:
gt <- grobTree(pointsGrob(x=.5, y=.5, gp=gpar(cex=4)),
linesGrob(x=0:1, y=.5, gp=gpar(cex=4)),
pointsGrob(x=.5, y=.5, gp=gpar(cex=1)))
pg <- packGrob(frameGrob(vp=NULL), gt,
width = unit(1, "char"),
height = unit(1, "char"))
grid.newpage()
grid.draw(pg)
grid.rect(width=grobWidth(pg), height=grobHeight(pg), gp=gpar(col="red"))
The rectangle reveals that the grob width and height are not correct; pg does not "see" the size of the point with large cex. How can this be achieved?
I do not know how to solve the problem of zero point size, presumably it would have to be defined in the internals of the grid source code at the C level.
However, I want to point out that regardless of the pointsGrob issue, the grobWidth and grobHeight are not defined for your packGrob / gTree and the approach would fail even if pointsGrob were replaced by a textGrob.
You probably want to define a gTree of a new class, say "mygrob", and define your own widthDetails.mygrob and heightDetails.mygrob methods,
library(grid)
gt <- grobTree(linesGrob(x=c(0.2, 0.8), y=.5, gp=gpar(col="grey", lwd=10)),
textGrob("some label", gp=gpar(cex=2)),
cl = "mygrob")
widthDetails.mygrob <- function(x)
do.call(max, lapply(x$children, grobWidth))
heightDetails.mygrob <- function(x)
do.call(max, lapply(x$children, grobHeight))
grid.newpage()
grid.draw(gt)
grid.rect(width=grobWidth(gt), height=grobHeight(gt), gp=gpar(col="red"))

Get width of plot area in ggplot2

Is there any way to get the width of the plot area in the grid window? It grows or shrinks, for instance, if plot.margin is changed or if the y-axis labels' font-size is increased. Is is hidden somewhere in str(p)?
Any size measure would work. I need to be able to measure the relative change in the width of the plot area in different scenarios such as change of y-axis labels' font-size.
df = data.frame(x = (1:3),One=c(12, 8, 13),Two=c(13, 7, 11),Three=c(11, 9, 11))
df.melt = melt(df, id.vars="x")
p = ggplot(df.melt, aes(x=x, y=value, color=variable)) +
geom_line() +
coord_cartesian(xlim=c(min(df.melt$x),max(df.melt$x))) +
theme(legend.position="none", plot.margin = unit(c(1, 4, 1, 1), "cm"))
p
UPDATE – To clarify: Please help me calculate a/b.
p = ggplot(df.melt, aes(x=x, y=value, color=variable)) +
geom_line() + coord_cartesian(xlim=c(min(df.melt$x),max(df.melt$x))) +
theme(legend.position="none")
p1 = p + theme(plot.margin=unit(c(1,1,1,1),"cm"), axis.text.y=element_text(size=10))
p2 = p + theme(plot.margin=unit(c(1,1,1,2),"cm"), axis.text.y=element_text(size=30))
grid.arrange(p1, p2, ncol=2)
The plot in ggplot2 uses grid graphics. A graphical scene that has been produced
using the grid graphics package consists of grobs and viewports.
You can use gridDebug package for the inspection of the grobs.
showGrob show the locations and names of the grobs used to draw the scene
showGrob()
Get the gpath of the grob
sceneListing <- grid.ls(viewports=T, print=FALSE)
do.call("cbind", sceneListing)
name gPath
[1,] "ROOT" ""
[2,] "GRID.gTableParent.45019" ""
[3,] "background.1-5-6-1" "GRID.gTableParent.45019"
[4,] "spacer.4-3-4-3" "GRID.gTableParent.45019"
[5,] "panel.3-4-3-4" "GRID.gTableParent.45019"
[6,] "grill.gTree.44997" "GRID.gTableParent.45019::panel.3-4-3-4"
Retrieve the gorb
h <- grid.get(gPath="GRID.gTableParent.45019")
get h properties (e.g)
h$layoutvp$width
Application:
grid.get('x',grep=TRUE,global=T)
(polyline[panel.grid.minor.x.polyline.21899], polyline[panel.grid.major.x.polyline.21903], gTableChild[axis-l.3-3-3-3], gTableChild[axis-b.4-4-4-4], gTableChild[xlab.5-4-5-4])
> grid.get('x',grep=TRUE,global=T)[[3]]
gTableChild[axis-l.3-3-3-3]
> xx <- grid.get('x',grep=TRUE,global=T)[[3]]
> grobWidth(xx)
[1] sum(1grobwidth, 0.15cm+0.1cm)
This intrigued me enough to look into it deeper. I was hoping that the grid.ls function would give the information to navigate to the correct viewports to get the information, but for your example there are a bunch of the steps that get replaced with '...' and I could not see how to change that to give something that is easily worked with. However using grid.ls or other tools you can see the names of the different viewports. The viewports of interest are both named 'panel.3-4-3-4' for your example, below is some code that will navigate to the 1st, find the width in inches, navigate to the second and find the width of that one in inches.
grid.ls(view=TRUE,grob=FALSE)
current.vpTree()
seekViewport('panel.3-4-3-4')
a <- convertWidth(unit(1,'npc'), 'inch', TRUE)
popViewport(1)
seekViewport('panel.3-4-3-4')
b <- convertWidth(unit(1,'npc'), 'inch', TRUE)
a/b
I could not figure out an easy way to get to the second panel without poping the first one. This works and gives the information that you need, unfortunately since it pops the 1st panel off the list you cannot go back to it and find additional information or modify it. But this does give the info you asked for that could be used in future plots.
Maybe someone else knows how to navigate to the second panel without popping the first, or getting the full vpPath of each of them to navigate directly.
This answer is mainly in reply to comments by #java_xof. The reply is too long and includes code so it will not fit in a comment. However, it may help with the original question as well (or at least give a starting place).
Here is a function and some code using it (it requires the tcltk and tkrplot packages):
library(ggplot2)
library(tkrplot)
TkPlotLocations <- function(FUN) {
require(tkrplot)
cl <- substitute(FUN)
replot <- function() eval(cl)
tt <- tktoplevel()
img <- tkrplot(tt, replot, vscale=1.5, hscale=1.5)
tkpack(img)
tkpack(xfr <- tkframe(tt), side='left')
tkpack(yfr <- tkframe(tt), side='left')
xndc <- tclVar()
yndc <- tclVar()
xin <- tclVar()
yin <- tclVar()
tkgrid(tklabel(xfr, text='x ndc'), tklabel(xfr, textvariable=xndc))
tkgrid(tklabel(yfr, text='y ndc'), tklabel(yfr, textvariable=yndc))
tkgrid(tklabel(xfr, text='x inch'), tklabel(xfr, textvariable=xin))
tkgrid(tklabel(yfr, text='y inch'), tklabel(yfr, textvariable=yin))
iw <- as.numeric(tcl("image","width", tkcget(img, "-image")))
ih <- as.numeric(tcl("image","height",tkcget(img, "-image")))
cc <- function(x,y) {
x <- (as.real(x)-1)/iw
y <- 1-(as.real(y)-1)/ih
c(x,y)
}
mm <- function(x, y) {
xy <- cc(x,y)
tclvalue(xndc) <- xy[1]
tclvalue(yndc) <- xy[2]
tclvalue(xin) <- grconvertX(xy[1], from='ndc', to='inches')
tclvalue(yin) <- grconvertY(xy[2], from='ndc', to='inches')
}
tkbind( img, "<Motion>", mm)
invisible()
}
x <- runif(25)
y <- rnorm(25, x, 0.25)
plot(x,y)
par()$pin
par()$plt
TkPlotLocations(plot(x,y))
qplot(x,y)
par()$pin
par()$plt
TkPlotLocations(print(qplot(x,y)))
qplot(x,y) + xlab('Multi\nline\nx\nlabel')
par()$pin
par()$plt
TkPlotLocations(print(qplot(x,y) + xlab('Multi\nline\nx\nlabel')))
Defining the above function, then running the following lines will produce 3 plots of the same random data. You can see that the results of par()$pin and par()$plt (and other parameters) are exactly the same for the 3 plots even though the plotting regions differ in the plots.
There will also be 3 new windows that have popped up, in the windows you can move the mouse pointer over the graph and at the bottom of the window you will see the current location of the pointer in normalized device coordinates and in inches (both from the bottom left corner of the device region). You can hover the mouse pointer over the corners of the graph (or any other part) to see the values and compare between the 3 graphs.
This may be enough to answer at least part of the original question (just not programatically, which would be more useful). The functon can be modified to print out other measurements as well. I may expand this and include it in a package in the future if others would be interested.

Bubble chart for integer variables where the largest bubble has a diameter of 1 (on the x or y axis scale)?

I want to achieve the following outcomes:
Rescale the size of the bubbles such that the largest bubble has a
diameter of 1 (on whichever has the more compressed scale of the x
and y axes).
Rescale the size of the bubbles such that the smallest bubble has a diameter of 1 mm
Have a legend with the first and last points the minimum non-zero
frequency and the maximum frequency.
The best I have been able to do is as follows, but I need a more general solution where the value of maxSize is computed rather than hard-coded. If I was doing it in the traditional R plots I would use par("pin") to work out the size of plot area and work backwards, but I cannot figure out how to access this information with ggplot2. Any suggestions?
library(ggplot2)
agData = data.frame(
class=rep(1:7,3),
drv = rep(1:3,rep(7,3)),
freq = as.numeric(xtabs(~class+drv,data = mpg))
)
agData = agData[agData$freq != 0,]
rng = range(agData$freq)
mn = rng[1]
mx = rng[2]
minimumArea = mx - mn
maxSize = 20
minSize = max(1,maxSize * sqrt(mn/mx))
qplot(class,drv,data = agData, size = freq) + theme_bw() +
scale_area(range = c(minSize,maxSize),
breaks = seq(mn,mx,minimumArea/4), limits = rng)
Here is what it looks like so far:
When no ggplot, lattice or other highlevel package seems to do the job without hours of fine tuning I always revert to the base graphics. The following code gets you what you want, and after it I have another example based on how I would have plotted it.
Note however that I have set the maximum radius to 1 cm, but just divide size.range/2 to get diameter instead. I just thought radius gave me nicer plots, and you'll probably want to adjust things anyways.
size.range <- c(.1, 1) # Min and max radius of circles, in cm
# Calculate the relative radius of each circle
radii <- sqrt(agData$freq)
radii <- diff(size.range)*(radii - min(radii))/diff(range(radii)) + size.range[1]
# Plot in two panels
mar0 <- par("mar")
layout(t(1:2), widths=c(4,1))
# Panel 1: The circles
par(mar=c(mar0[1:3],.5))
symbols(agData$class, agData$drv, radii, inches=size.range[2]/cm(1), bg="black")
# Panel 2: The legend
par(mar=c(mar0[1],.5,mar0[3:4]))
symbols(c(0,0), 1:2, size.range, xlim=c(-4, 4), ylim=c(-2,4),
inches=1/cm(1), bg="black", axes=FALSE, xlab="", ylab="")
text(0, 3, "Freq")
text(c(2,0), 1:2, range(agData$freq), col=c("black", "white"))
# Reset par settings
par(mar=mar0)
Now follows my suggestion. The largest circle has a radius of 1 cm and area of the circles are proportional to agData$freq, without forcing a size of the smallest circle. Personally I think this is easier to read (both code and figure) and looks nicer.
with(agData, symbols(class, drv, sqrt(freq),
inches=size.range[2]/cm(1), bg="black"))
with(agData, text(class, drv, freq, col="white"))

lattice or latticeExtra combine multiple plots different yscaling (log10 and non-transformed)

I have a multiple variable time series were some of the variables have rather large ranges. I wish to make a single-page plot with multiple stacked plots of each variable were some of the variables have a log10 y-axis scaling. I am relatively new to lattice and have not been able to figure out how to effectively mix the log10 scaling with non-transformed axes and get a publication quality plot. If print.trellis is used the plots are not aligned and the padding needs some work, if c.trellis is used the layout is good, but only the y-scaling from only one plot is used. Any suggestions for an efficient solution, where I can replicate the output of c.trellis using the different y-scaling for each (original) object?
Example below:
require(lattice)
require(latticeExtra)
# make data.frame
d.date <- as.POSIXct(c("2009-12-15", "2010-01-15", "2010-02-15", "2010-03-15", "2010-04-15"))
CO2dat <- c(100,200,1000,9000,2000)
pHdat <- c(10,9,7,6,7)
tmp <- data.frame(date=d.date ,CO2dat=CO2dat ,pHdat=pHdat)
# make plots
plot1 <- xyplot(pHdat ~ date, data=tmp
, ylim=c(5,11)
, ylab="pHdat"
, xlab="Date"
, origin = 0, border = 0
, scales=list(y=list(alternating=1))
, panel = function(...){
panel.xyarea(...)
panel.xyplot(...)
}
)
# make plot with log y scale
plot2 <- xyplot(CO2dat ~ date, data=tmp
, ylim=c(10,10^4)
, ylab="CO2dat"
, xlab="Date"
, origin = 0, border = 0
, scales=list(y=list(alternating=1,log=10))
, yscale.components = yscale.components.log10ticks
, panel = function(...){
panel.xyarea(...)
panel.xyplot(...)
# plot CO2air uatm
panel.abline(h=log10(390),col="blue",type="l",...)
}
)
# plot individual figures using split
print(plot2, split=c(1,1,1,2), more=TRUE)
print(plot1, split=c(1,2,1,2), more=F)
# combine plots (more convenient)
comb <- c(plot1, plot2, x.same=F, y.same=F, layout = c(1, 2))
# plot combined figure
update(comb, ylab = c("pHdat","log10 CO2dat"))
Using #joran's idea, I can get the axes to be closer but not exact; also, reducing padding gets them closer together but changes the aspect ratio. In the picture below I've reduced the padding perhaps by too much to show the not exactness; if this close were desired, you'd clearly want to remove the x-axis labels on the top as well.
I looked into the code that sets up the layout and the margin on the left side is calculated from the width of the labels, so #joran's idea is probably the only thing that will work based on the printing using split, unless one were to rewrite the plot.trellis command. Perhaps the c method could work but I haven't found a way yet to set the scale components separately depending on the panel. That does seem more promising though.
mtheme <- standard.theme("pdf")
mtheme$layout.heights$bottom.padding <- -10
plot1b <- update(plot1, scales=list(y=list(alternating=1, at=5:10, labels=paste(" ",c(5:10)))))
plot2b <- update(plot2, par.settings=mtheme)
pdf(file="temp.pdf")
print(plot2b, split=c(1,1,1,2), more=TRUE)
print(plot1b, split=c(1,2,1,2), more=F)

Resources