Numbered point labels plus a legend in a scatterplot - r

I am trying to label points in a scatterplot in R (ggplot2) using numbers (1, 2, 3, ...) and then match the numbers to names in a legend (1 - Alpha, 2 - Bravo, 3 - Charlie... ), as a way of dealing with too many, too long labels on the plot.
Let's assume this is a.df:
Name     X Attribute  Y Attribute  Size Attribute  Color Attribute
Alpha    1            2.5          10              A
Bravo    3            3.5          5               B
Charlie  2            1.5          10              C
Delta    5            1            15              D
And this is a standard scatterplot:
ggplot(a.df, aes(x=X.Attribute, y=Y.Attribute, size=Size.Attribute, fill=Color.Attribute, label=Name)) +
geom_point(shape=21) +
geom_text(size=5, hjust=-0.2,vjust=0.2)
Is there a way to change it as follows?
have scatterplot points labeled with numbers (1,2,3...)
have a legend next to the plot assigning the plot labels (1,2,3...) to a.df$Name
In the next step I would like to assign other attributes to the point size and color, which may rule out some 'hacks'.

Here's an alternative solution, which draws the labels as geom_text. I've borrowed from the question "ggplot2 - annotate outside of plot".
library(MASS) # for Cars93 data
library(grid)
library(ggplot2)
d <- Cars93[1:30,]
d$row_num <- 1:nrow(d)
d$legend_entry <- paste(" ", d$row_num, d$Manufacturer, d$Model)
ymin <- min(d$Price)
ymax <- max(d$Price)
y_values <- ymax-(ymax-ymin)*(1:nrow(d))/nrow(d)
p <- ggplot(d, aes(x=Min.Price, y=Price)) +
geom_text(aes(label=row_num)) +
geom_text(aes(label=legend_entry, x=Inf, y=y_values, hjust=0)) +
theme(plot.margin = unit(c(1,15,1,1), "lines"))
gt <- ggplot_gtable(ggplot_build(p))
gt$layout$clip[gt$layout$name == "panel"] <- "off"
grid.draw(gt)

This is pretty hacky, but might help. The plot labels are simply added with geom_text, and to produce a legend I've mapped colour to a label in the data. Then, to stop the points being coloured, I override the colours with scale_colour_manual, where you can set the colour of the points as well as the labels in the legend. Finally, I made the points in the legend invisible by setting alpha = 0, and removed the squares that usually sit behind them with legend.key = element_blank() in theme().
dat <- data.frame(id = 1:10, x = rnorm(10), y = rnorm(10), label = letters[1:10])
ggplot(dat, aes(x, y)) + geom_point(aes(colour = label)) +
geom_text(aes(x = x + 0.1, label = id)) +
scale_colour_manual(values = rep("black", nrow(dat)),
labels = paste(dat$id, "=", dat$label)) +
guides(colour = guide_legend(override.aes = list(alpha = 0))) +
theme(legend.key = element_blank())
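Since the question also wants size and fill left free for other attributes, here is a minimal sketch that adapts the idea above to a.df. It assumes the column names X.Attribute, Y.Attribute, Size.Attribute and Color.Attribute (as R would create them from the table); the id and legend_entry columns are hypothetical helpers introduced only for this example. The legend is carried by a second, fully transparent point layer, so the visible points keep their own size and fill scales.

```r
library(ggplot2)

# Data from the question; the column names are an assumption based on the table
a.df <- data.frame(
  Name = c("Alpha", "Bravo", "Charlie", "Delta"),
  X.Attribute = c(1, 3, 2, 5),
  Y.Attribute = c(2.5, 3.5, 1.5, 1),
  Size.Attribute = c(10, 5, 10, 15),
  Color.Attribute = c("A", "B", "C", "D")
)

# Hypothetical helper columns: a running number and the matching legend text
a.df$id <- seq_len(nrow(a.df))
entries <- paste(a.df$id, "-", a.df$Name)
a.df$legend_entry <- factor(entries, levels = entries)  # keep numeric order

p <- ggplot(a.df, aes(x = X.Attribute, y = Y.Attribute)) +
  # Visible points: size and fill stay available for real attributes
  geom_point(aes(size = Size.Attribute, fill = Color.Attribute), shape = 21) +
  # Transparent layer whose only job is to create the "number - name" legend
  geom_point(aes(colour = legend_entry), alpha = 0) +
  # Numbers drawn next to the points
  geom_text(aes(label = id), hjust = -1.5, size = 5) +
  scale_colour_manual(name = "Name", values = rep("black", nrow(a.df))) +
  theme(legend.key = element_blank())

print(p)
```

Storing legend_entry as a factor with explicit levels keeps "10 - ..." after "9 - ..." once there are more than nine points.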

Related

ggplot2 colorbar with discontinuous jump for skewed data

Here is some fake data, x and y, with color information z. z is highly skewed, and as such renders the colorbar uninformative:
set.seed(1)
N <- 100
x <- rnorm(N)
y <- x + rnorm(N)
z <- x+y+rnorm(N)
z[z>2] <- z[z>2]+exp(z[z>2]-2)
d <- data.frame(x,y,z)
ggplot(d, aes(x=x, y=y, color = z)) + geom_point()
I'd like to have most of the colorbar reflect the main range of the data, but have a box for overflows, say above 5. Something like this:
Is there a way to do this in ggplot2? Note that I would like the colorbar to remain continuous, rather than discrete, for most of its range. I'll probably either discretize or topcode if what I want isn't feasible.
You can get that general plot, although the legends would need more work:
p <- ggplot(d, aes(x=x, y=y, color = z)) + geom_point(size = 5)
p + scale_color_gradient2(
low = 'green', high = 'red', mid = 'grey80', na.value = 'blue', limits= c(-10, 10)
)
You can cheat in some extra legend fluff, e.g.:
ggplot(d, aes(x=x, y=y, color = z, alpha = '>10')) +
geom_point(size = 5) +
scale_color_gradient2(
low = 'green', high = 'red', mid = 'grey80', na.value = 'blue', limits= c(-10, 10),
guide = guide_colorbar(title.position = 'left')
) +
scale_alpha_manual(
values = 1, name = 'z',
guide = guide_legend(
override.aes = list(color = 'blue'), title.position = 'left',
title.theme = element_text(color = 'white', angle = 0)
)
) +
theme(legend.margin = margin(-5, 10, -5, 10))
Note that red/green palettes are hard to read for people with color vision deficiency.
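If topcoding (which the question mentions as a fallback) is acceptable, a minimal sketch is to clamp out-of-range values to the scale limits with the oob argument and scales::squish, then relabel the end breaks. The cutoff cap = 5 and the blue/red colours are assumptions made for illustration; this does not produce a separate overflow box, it only makes the ends of the bar mean "at or beyond the limit".

```r
library(ggplot2)

# Same fake data as in the question
set.seed(1)
N <- 100
x <- rnorm(N)
y <- x + rnorm(N)
z <- x + y + rnorm(N)
z[z > 2] <- z[z > 2] + exp(z[z > 2] - 2)
d <- data.frame(x, y, z)

cap <- 5  # assumed cutoff; anything beyond +/- cap gets the end colour

p <- ggplot(d, aes(x = x, y = y, color = z)) +
  geom_point(size = 3) +
  scale_color_gradient2(
    low = "blue", mid = "grey80", high = "red",
    limits = c(-cap, cap),
    oob = scales::squish,  # clamp out-of-range values to the limits
    breaks = c(-cap, 0, cap),
    labels = c(paste("<=", -cap), "0", paste(">=", cap))
  )

print(p)
```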
Extending upon Axeman's answer I came up with the following slight hack to get blues into your color scale:
First, define a color map with 20 colors for the values within and 5 for the values outside your range.
cmap <- colorRampPalette(c("green","grey80","red"))(20)
cmap <- append(cmap,rep("blue",5))
Then cut the z values into 20 chunks between -10 and 10 and convert to numeric (resulting in NA's for values above 10). By specifying the cmap in scale_color_gradientn and limits of [1,25] we map values of -10 to 1 (green) and 10 to 20 (red). Finally by specifying breaks we manually add the correct labels (i.e. the 5th category corresponds to values between -6 and -5).
ggplot(d, aes(x=x, y=y, color=as.numeric(cut(z, breaks=seq(-10,10))))) +
geom_point(size=3) +
scale_color_gradientn(colors=cmap, limits=c(1,25), breaks=c(5,11,17,23),
labels=c(-6,0,6,">10"), name="z", na.value = "blue")
Lovely result :)
The only issue is that you have to make sure no values ever fall below -10, because with this method they would also be shown in blue.
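To see what the binning step does, here is a small base R sketch on a few hand-picked values (the values are made up for illustration). It shows the bin index that cut() assigns and why values on either side of the range end up as NA.

```r
# A few hand-picked z values, including two outside [-10, 10]
z <- c(-10.5, -9.5, -5.5, 0.5, 9.5, 12)

# Bin 1 is (-10, -9], bin 20 is (9, 10]; out-of-range values become NA
bins <- as.numeric(cut(z, breaks = seq(-10, 10)))
print(bins)

# NA appears for values below -10 as well as above 10, so both
# would be drawn in the na.value colour
print(is.na(bins))
```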

Overlaying points and controlling size with ggplot2

I want to plot some point estimates with a couple of interval estimates around them, and then to superimpose the true point values using a different color and size, with a legend for the color.
I've tried lots of things. If I just use a new call to geom_point, I can't figure out how to add a legend. Therefore, my current approach resorts to stacking the data on top of itself, which is clumsy. Even then, the graph comes out wrong with big blue points for the True values, with the desired orange points on top of them.
I'd appreciate any help I can get.
nms <- c("2.5%","25%","50%","75%","97.5%","dose","truep")
a <- c(9.00614679684893e-44,0.000123271800672435,0.0339603711049475,0.187721170170911,0.67452033450121,5,0.040752445325937)
b <- c(1.59502878028266e-25,0.00328588588499889,0.0738203422543555,0.25210200886225,0.714843425007051,10,0.0885844107052267)
cc <- c(1.41975723605948e-14,0.0184599181547097,0.118284929584256,0.311068595276067,0.74339745948793,15,0.141941915501108)
d <- c(0.0311851190805834,0.154722028150561,0.299318020818234,0.50887634580605,0.838779816278485,25,0.359181624981881)
e <- c(0.0529617924263383,0.289588386297245,0.566777817134668,0.883959271416755,0.999999999999317,40,0.680133380561602)
f <- c(0.0598904847882839,0.327655201251564,0.640100529843672,0.950060245074853,1,50,0.768120635812406)
g <- c(0.0641613025760661,0.355626055560067,0.686504841650593,0.978023943968809,1,60,0.823805809980712)
p <- as.data.frame(t(data.frame(a, b, cc, d, e, f, g)))
names(p) <- nms
# Faff duplicating data
p$truep <- 1.2 * p$truep
p2 <- p
p2[, 1:5] <- p$truep # truep is known, so there are no intervals
p3 <- rbind(p2, p)
p3$wh <- rep((c(2, 3)), each=nrow(p))
p3$col <- rep(c("orange", "blue"), each=nrow(p))
ggplot(p3, aes(dose, `50%`)) +
geom_point(aes(size=wh, color=col)) +
scale_size(range=c(5, 7), guide="none") +
scale_color_manual(name="", labels=c("Prior", "True"), values=c("blue", "orange")) +
geom_pointrange(aes(ymin=`2.5%`, ymax=`97.5%`, x=dose), color="blue") +
geom_pointrange(aes(ymin=`25%`, ymax=`75%`, x=dose), color="blue", size=2) +
geom_point(aes(dose, truep), color="orange") +
theme(axis.text.x=element_text(size=12), axis.title.x=element_text(size=14),
axis.text.y=element_text(size=12), axis.title.y=element_text(size=14),
legend.text=element_text(size=12))
R 3.3.1, ggplot2_2.1.1
Thanks,
Harry
I found a solution by splitting the dataset in two parts:
library(dplyr)
priors <- p%>%
mutate(datatype = 'Prior')
truevals <- p%>%
select(dose, truep)%>%
mutate(datatype = 'True')
ggplot(truevals, aes(x = dose, y = truep, colour = datatype))+
geom_pointrange(data = priors, aes(ymin=`25%`, ymax=`75%`, y = `50%`), size=1.5) +
geom_pointrange(data = priors, aes(ymin=`2.5%`, ymax=`97.5%`, y = `50%`))+
geom_point()+
scale_color_manual(name="", values=c("Prior" = "blue", "True" = "orange")) +
theme(axis.text.x=element_text(size=12), axis.title.x=element_text(size=14),
axis.text.y=element_text(size=12), axis.title.y=element_text(size=14),
legend.text=element_text(size=12))
First we plot the two pointranges based on the dataset with priors, then the actual values. By adding a column with the datatype to both datasets we get the legend. The result is this graph:
ggplot2::geom_point() has a show.legend argument, which is NA by default, so setting it to TRUE should help.
You can set the legend labels using the labels argument as follows:
ggplot2::scale_fill_manual(values = c("red", "black"),
labels = c("Number of people",
"Number of birds"))
You are already doing this with labels=c("Prior", "True").
You can also change the look of the legend with:
ggplot2::theme(legend.position = "bottom",
legend.text = ggplot2::element_text(size = 22),
legend.box = "horizontal",
legend.key = ggplot2::element_blank())
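Another common way to get a legend entry for a separately added layer is to map colour to a constant string inside aes() in each layer and then assign the actual colours in scale_colour_manual. The sketch below uses a small made-up data frame (est and its column names are hypothetical, not the question's data) to show the pattern.

```r
library(ggplot2)

# Made-up interval estimates and true values, for illustration only
est <- data.frame(
  dose  = c(5, 10, 15),
  lower = c(0.00, 0.01, 0.02),
  mid   = c(0.03, 0.07, 0.12),
  upper = c(0.67, 0.71, 0.74),
  truep = c(0.05, 0.11, 0.17)
)

p <- ggplot(est, aes(x = dose)) +
  # The constant "Prior" inside aes() creates a legend entry for this layer
  geom_pointrange(aes(y = mid, ymin = lower, ymax = upper, colour = "Prior")) +
  # A second constant adds the "True" entry to the same legend
  geom_point(aes(y = truep, colour = "True"), size = 3) +
  scale_colour_manual(name = "", values = c(Prior = "blue", True = "orange")) +
  labs(y = "probability")

print(p)
```

This avoids stacking or duplicating the data, at the cost of naming the legend entries in the layer calls.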

Add a legend for geom_polygon

I'm trying to produce a scatter plot with geom_point where the points are circumscribed by a smoothed polygon, with geom_polygon.
Here's my point data:
set.seed(1)
df <- data.frame(x=c(rnorm(30,-0.1,0.1),rnorm(30,0,0.1),rnorm(30,0.1,0.1)),y=c(rnorm(30,-1,0.1),rnorm(30,0,0.1),rnorm(30,1,0.1)),val=rnorm(90),cluster=c(rep(1,30),rep(2,30),rep(3,30)),stringsAsFactors=F)
I color each point according to the interval that df$val falls in. Here's the interval data:
intervals.df <- data.frame(interval=c("(-3,-2]","(-2,-0.999]","(-0.999,0]","(0,1.96]","(1.96,3.91]","(3.91,5.87]","not expressed"),
start=c(-3,-2,-0.999,0,1.96,3.91,NA),end=c(-2,-0.999,0,1.96,3.91,5.87,NA),
col=c("#2f3b61","#436CE8","#E0E0FF","#7d4343","#C74747","#EBCCD6","#D3D3D3"),stringsAsFactors=F)
Assigning colors and intervals to the points:
df <- cbind(df,do.call(rbind,lapply(df$val,function(x){
if(is.na(x)){
return(data.frame(col=intervals.df$col[nrow(intervals.df)],interval=intervals.df$interval[nrow(intervals.df)],stringsAsFactors=F))
} else{
idx <- which(intervals.df$start <= x & intervals.df$end >= x)
return(data.frame(col=intervals.df$col[idx],interval=intervals.df$interval[idx],stringsAsFactors=F))
}
})))
Preparing the colors for the legend, which will show each interval:
df$interval <- factor(df$interval,levels=intervals.df$interval)
colors <- intervals.df$col
names(colors) <- intervals.df$interval
Here's where I constructed the smoothed polygons (using a function courtesy of this link):
clusters <- sort(unique(df$cluster))
cluster.cols <- c("#ff00ff","#088163","#ccbfa5")
splinePolygon <- function(xy,vertices,k=3, ...)
{
# Assert: xy is an n by 2 matrix with n >= k.
# Wrap k vertices around each end.
n <- dim(xy)[1]
if (k >= 1) {
data <- rbind(xy[(n-k+1):n,], xy, xy[1:k, ])
} else {
data <- xy
}
# Spline the x and y coordinates.
data.spline <- spline(1:(n+2*k), data[,1], n=vertices, ...)
x <- data.spline$x
x1 <- data.spline$y
x2 <- spline(1:(n+2*k), data[,2], n=vertices, ...)$y
# Retain only the middle part.
cbind(x1, x2)[k < x & x <= n+k, ]
}
library(data.table)
hulls.df <- do.call(rbind,lapply(1:length(clusters),function(l){
dt <- data.table(df[which(df$cluster==clusters[l]),])
hull <- dt[, .SD[chull(x,y)]]
spline.hull <- splinePolygon(cbind(hull$x,hull$y),100)
return(data.frame(x=spline.hull[,1],y=spline.hull[,2],val=NA,cluster=clusters[l],col=cluster.cols[l],interval=NA,stringsAsFactors=F))
}))
hulls.df$cluster <- factor(hulls.df$cluster,levels=clusters)
And here's my ggplot command:
library(ggplot2)
p <- ggplot(df,aes(x=x,y=y,colour=interval))+geom_point(cex=2,shape=1,stroke=1)+labs(x="X", y="Y")+theme_bw()+theme(legend.key=element_blank(),panel.border=element_blank(),strip.background=element_blank())+scale_color_manual(drop=FALSE,values=colors,name="DE")
p <- p+geom_polygon(data=hulls.df,aes(x=x,y=y,group=cluster),color=hulls.df$col,fill=NA)
which produces:
My question is: how do I add a legend for the polygons under the legend for the points? I want it to be a legend with 3 lines colored according to the cluster colors, with the corresponding cluster number beside each line.
This gives slightly different output and changes only the last line of your code, but it may serve your purpose:
p+geom_polygon(data=hulls.df,aes(x=x,y=y,group=cluster, fill=cluster),alpha=0.1)
Say you want to add a legend for the_factor. My basic idea is:
(1) put the_factor into mapping by using unused aes arguments; aes(xx = the_factor)
(2) if (1) affects something, delete the effect by using scale_xx_manual()
(3) modify the legend by using guides(xx = guide_legend(override.aes = list()))
In your case, aes(fill) and aes(alpha) are unused. fill is the better choice because mapping it has no visible effect on these points (shape 1 has no fill). So I used aes(fill=as.factor(cluster)).
p <- ggplot(df,aes(x=x,y=y,colour=interval, fill=as.factor(cluster))) + # add aes(fill=...)
geom_point(cex=2, shape=1, stroke=1) +
labs(x="X", y="Y",fill="cluster") + # add fill="cluster"
theme_bw() + theme(legend.key=element_blank(),panel.border=element_blank(),strip.background=element_blank()) + scale_color_manual(drop=FALSE,values=colors,name="DE") +
guides(fill = guide_legend(override.aes = list(colour = cluster.cols, pch=0))) # add
p <- p+geom_polygon(data=hulls.df,aes(x=x,y=y,group=cluster), color=hulls.df$col,fill=NA)
Of course, you can make the same graph by using aes(alpha = the_factor). Because alpha does affect the points, you need to neutralise it with scale_alpha_manual().
g <- ggplot(df, aes(x=x,y=y,colour=interval)) +
geom_point(cex=2, shape=1, stroke=1, aes(alpha=as.factor(cluster))) + # add aes(alpha)
labs(x="X", y="Y",alpha="cluster") + # add alpha="cluster"
theme_bw() + theme(legend.key=element_blank(),panel.border=element_blank(),strip.background=element_blank()) + scale_color_manual(drop=FALSE,values=colors,name="DE") +
scale_alpha_manual(values=c(1,1,1)) + # add
guides(alpha = guide_legend(override.aes = list(colour = cluster.cols, pch=0))) # add
g <- g+geom_polygon(data=hulls.df,aes(x=x,y=y,group=cluster), color=hulls.df$col,fill=NA)
What you are asking for is two colour scales. My understanding is that this is not possible. But you can give the impression of having two colour scales with a bit of a cheat and using the filled symbols (shapes 21 to 25).
p <- ggplot(df, aes(x = x, y = y, fill = interval)) +
geom_point(cex = 2, shape = 21, stroke = 1, colour = NA)+
labs(x = "X", y = "Y") +
theme_bw() +
theme(legend.key = element_blank(), panel.border = element_blank(), strip.background = element_blank()) +
scale_fill_manual(drop=FALSE, values=colors, name="DE") +
geom_polygon(data = hulls.df, aes(x = x, y = y, colour = cluster), fill = NA) +
scale_colour_manual(values = cluster.cols)
p
Alternatively, use a filled polygon with a low alpha
p <- ggplot(df,aes(x=x,y=y,colour=interval))+
geom_point(cex=2,shape=1,stroke=1)+
labs(x="X", y="Y")+
theme_bw() +
theme(legend.key = element_blank(),panel.border=element_blank(), strip.background=element_blank()) +
scale_color_manual(drop=FALSE,values=colors,name="DE", guide = guide_legend(override.aes = list(fill = NA))) +
geom_polygon(data=hulls.df,aes(x=x,y=y,group=cluster, fill = cluster), alpha = 0.2, show.legend = TRUE) +
scale_fill_manual(values = cluster.cols)
p
But this might make the point colours difficult to see.
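The hull step in the question relies on data.table. As a minimal base R sketch of the same idea, the convex hull of each cluster can be computed with split() and chull(); the pts data frame below is made up for illustration, and the resulting rows could then be passed to geom_polygon or to a smoothing function such as the question's splinePolygon.

```r
# Made-up points in two clusters, for illustration only
set.seed(1)
pts <- data.frame(
  x = c(rnorm(30, -1, 0.3), rnorm(30, 1, 0.3)),
  y = c(rnorm(30, -1, 0.3), rnorm(30, 1, 0.3)),
  cluster = rep(1:2, each = 30)
)

# chull() returns the row indices of the hull vertices, in clockwise order
hulls <- do.call(rbind, lapply(split(pts, pts$cluster), function(g) {
  g[chull(g$x, g$y), ]
}))

# Number of hull vertices per cluster
print(table(hulls$cluster))
```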

Different size facets at x-axis

Length of x-axis is important for my plot because it allows one to compare between facets, therefore I want facets to have different x-axis sizes. Here is my example data:
group1 <- seq(1, 10, 2)
group2 <- seq(1, 20, 3)
x = c(group1, group2)
mydf <- data.frame (X =x , Y = rnorm (length (x),5,1),
groups = c(rep(1, length (group1)), rep(2, length(group2))))
And my code:
p1 = ggplot(data=mydf,aes(x=X,y=Y,color=factor(groups)) )+
geom_point(size=2)+
scale_x_continuous(labels=scales::comma)+
theme_bw()
p1+facet_grid(groups ~ .,scales = "fixed",space="free_x")
And the resulting figure:
Panel-1 has x-axis values less than 10, whereas panel-2 has x-axis values extending to 20. Still, both panels have the same size along the x-axis. Is there any way to make the x-axis panel size different for different panels, so that they correspond to their (x-axis) values?
I found an example from some different package that shows what I am trying to do, here is the figure:
Maybe something like this can get you started. There's still some formatting to do, though.
library(grid)
library(gridExtra)
library(dplyr)
library(ggplot2)
p1 <- ggplot(data=mydf[mydf$groups==1,],aes(x=X,y=Y))+
geom_point(size=2)+
theme_bw()
p2 <- ggplot(data=mydf[mydf$groups==2,],aes(x=X,y=Y))+
geom_point(size=2)+
theme_bw()
summ <- mydf %>% group_by(groups) %>% summarize(len=diff(range(X)))
summ$p <- summ$len/max(summ$len)
summ$q <- 1-summ$p
ng <- nullGrob()
grid.arrange(arrangeGrob(p1,ng,widths=unlist(summ[1,3:4])),
arrangeGrob(p2,ng,widths=unlist(summ[2,3:4])))
I'm sure there's a way to make this more general, and the axes don't line up perfectly yet, but it's a beginning.
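To make the width calculation above concrete, here is a small base R sketch of the same proportions using the question's X values (Y is omitted because it does not affect the widths). Each group's x-range is divided by the largest range, and the remainder is the width given to the empty filler grob.

```r
# X values and group labels from the question
group1 <- seq(1, 10, 2)
group2 <- seq(1, 20, 3)
mydf <- data.frame(
  X = c(group1, group2),
  groups = rep(1:2, c(length(group1), length(group2)))
)

# Range of X per group, as a share of the widest group
len <- tapply(mydf$X, mydf$groups, function(v) diff(range(v)))
p <- len / max(len)  # relative width of each plot
q <- 1 - p           # relative width of the empty filler next to it

print(rbind(len, p, q))
```

Group 1 spans 8 units and group 2 spans 18, so the first plot gets 8/18 of the available width and the second gets all of it.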
Here is a solution following OP's clarifying comment ("I guess axis will be same but the boxes will be of variable size. Is it possible by plotting them separately and aligning in grid?").
library(plyr); library(ggplot2)
buffer <- 0.5 # Extra space around the box
#Calculate box parameters
mydf.box <- ddply(mydf, .(groups), summarise,
max.X = max(X) + buffer,
min.X = 0,
max.Y = max(Y) + buffer,
min.Y = 0,
X = mean(X), Y = mean(Y)) #Dummy values for X and Y needed for geom_rect
p2 <- ggplot(data=mydf,aes(x=X, y=Y) )+
geom_rect(data = mydf.box, aes( xmax = max.X, xmin = min.X,
ymax = max.Y, ymin = min.Y),
fill = "white", colour = "black") +
geom_point(size=2) + facet_grid(groups ~ .,scales = "free_y") +
theme_classic() +
#Extra formatting to make your plot like the example
theme(panel.background = element_rect(fill = "grey85"),
strip.text.y = element_text(angle = 0),
strip.background = element_rect(colour = NA, fill = "grey65"))

Is there a way to have a barplot and a stacked barplot on the same graph using barplot or ggplot?

I have two pieces of data that I want to overlay onto the same plot. I've looked at several ggplot articles and I don't think it's possible within ggplot. So I have been using barplot. I have 5 tiers and I'm plotting total dollars by tier as a solid bar.
Then I have another piece of data that represents the number of tasks within those tiers by two different types of workers. I have this as a stacked bar plot. But I want to show them on the same graph with the total dollar amount as one bar and then the corresponding stacked bar next to it.
Here are the plots:
The data for the first graph looks like this (it's a table):
     1    2    3     4      5
0    9  340   97   812   4271
1    1  417  156  3163  11314
The data for the second graph looks like this (this is a dataset):
  Tier variable     value
1    1  Opp_Amt  16200.00
2    2  Opp_Amt 116067.50
3    3  Opp_Amt  35284.12
4    4  Opp_Amt 278107.10
5    5  Opp_Amt 694820.29
I want to put the graphs on top of each other but the bars keep overlapping and I want them to appear side by side by tier.
Code for what I have so far.
par(mar=c(2.5, 4, 4, 4)+2)
## Plot first set of data and draw its axis
barplot(data1$value, axes=FALSE,ylim=c(0,700000), xlab="", ylab="",
col="black",space=-10,main="Work Score")
axis(2, ylim=c(0,700000),col="black",las=1) ## las=1 makes horizontal labels
mtext("Total Opportunity Amount",side=2,line=3.5)
box()
## Allow a second plot on the same graph
par(new=TRUE)
## Plot the second plot and put axis scale on right
m <- barplot(counts, xlab="", ylab="", ylim=c(0,16000),axes=FALSE, col=c("red","darkblue"),space=3,width=0.5,density=20)
## a little farther out (line=4) to make room for labels
mtext("Task Ratio: Outbound to AE",side=4,col="red",line=3.5)
axis(4, ylim=c(0,16000), col="red",col.axis="black",las=1)
And it gives me this
Using ggplot, I would do something like one of these. They plot the two sets of data separately. The first arranges the data into one dataframe, then uses facet_wrap() to position the plots side-by-side. The second generates the two plot objects separately, then combines the two plots and the legend into a combined plot.
But if you really need the "dual y-axis" approach, then with some fiddling, and using the plots' layouts and gtable functions, it can be done (using code borrowed from here).
Like this:
library(ggplot2)
library(gtable)
library(plyr)
df1 <- data.frame(Tier = rep(1:5, each = 2),
y = c(9, 1, 340, 417, 97, 156, 812, 3163, 4271, 11314),
gp = rep(0:1, 5))
df2 <- read.table(text = "
Tier variable value
1 Opp_Amt 16200.00
2 Opp_Amt 116067.50
3 Opp_Amt 35284.12
4 Opp_Amt 278107.10
5 Opp_Amt 694820.29", header = TRUE)
dfA = df1
dfB = df2
names(dfA) = c("Tier", "Value", "gp")
dfA$var = "Task Ratio"
dfB = dfB[,c(1,3)]
dfB$gp = 3
dfB$var = "Total Opportunity Amount"
names(dfB) = names(dfA)
df = rbind(dfA, dfB)
df$var = factor(df$var)
df$var = factor(df$var, levels = rev(levels(df$var)))
ggplot(df, aes(Tier, Value, fill = factor(gp))) +
geom_bar(position = "stack", stat = "identity") +
facet_wrap( ~ var, scale = "free_y") +
scale_fill_manual("Group", breaks = c("0","1"), values = c("#F8766D", "#00BFC4", "black")) +
theme_bw() +
theme(panel.spacing = unit(2, "lines"),
panel.grid = element_blank())
Or this:
p1 <- ggplot(df1, aes(factor(Tier), y, fill = factor(gp))) +
geom_bar(position = "stack", stat = "identity") +
#guides(fill = FALSE) +
scale_y_continuous("Task Ratio",
limit = c(0, 1.1*max(ddply(df1, .(Tier), summarise, sum = sum(y)))),
expand = c(0,0)) +
scale_x_discrete("Tier") +
theme_bw() +
theme(panel.grid = element_blank())
p2 <- ggplot(df2, aes(factor(Tier), value)) +
geom_bar(stat = "identity") +
scale_y_continuous("Total Opportunity Amount", limit = c(0, 1.1*max(df2$value)), expand = c(0,0)) +
scale_x_discrete("Tier") +
theme_bw() +
theme(panel.grid = element_blank())
# Get the ggplot grobs,
# And get the legend from p1
g1 <- ggplotGrob(p1)
leg = gtable_filter(g1, "guide-box")
legColumn = g1$layout[which(g1$layout$name == "guide-box"), "l"]
g1 = g1[,-legColumn]
g2 <- ggplotGrob(p2)
# Make sure the width are the same in g1 and g2
library(grid)
maxWidth = unit.pmax(g1$widths, g2$widths)
g1$widths = as.list(maxWidth)
g2$widths = as.list(maxWidth)
# Combine g1, g2 and the legend
library(gridExtra)
grid.arrange(arrangeGrob(g2, g1, nrow = 1), leg,
widths = unit.c(unit(1, "npc") - leg$width, leg$width), nrow=1)
Or the dual y-axis approach (but not recommended, for the reasons given in @Phil's post):
width1 = 0.6 # width of bars in p1
width2 = 0.2 # width of bars in p2
pos = .5*width1 + .5*width2 # positioning bars in p2
p1 <- ggplot(df1, aes(factor(Tier), y, fill = factor(gp))) +
geom_bar(position = "stack", stat = "identity", width = width1) +
guides(fill = "none") +
scale_y_continuous("",
limit = c(0, 1.1*max(ddply(df1, .(Tier), summarise, sum = sum(y)))),
expand = c(0,0)) +
scale_x_discrete("Tier") +
theme_bw() +
theme(panel.grid = element_blank(),
axis.text.y = element_text(colour = "red", hjust = 0, margin = margin(l = 2, unit = "pt")),
axis.ticks.y = element_line(colour = "red"))
p2 <- ggplot(df2, aes(factor(Tier), value)) +
geom_blank() +
geom_bar(aes(x = Tier - pos), stat = "identity", width = width2) +
scale_y_continuous("", limit = c(0, 1.1*max(df2$value)), expand = c(0,0)) +
theme_bw() +
theme(panel.grid = element_blank())
# Get ggplot grobs
g1 <- ggplotGrob(p1)
g2 <- ggplotGrob(p2)
# Get locations of the panels in g1
pp1 <- c(subset(g1$layout, name == "panel", se = t:r))
## Get bars from g2 and insert them into the panel in g1
g <- gtable_add_grob(g1, g2$grobs[[which(g2$layout$name == "panel")]][[4]][[4]], pp1$t, pp1$l)
# Grab axis from g1, reverse elements, and put it on the right
index <- which(g1$layout$name == "axis-l")
grob <- g1$grobs[[index]]
axis <- grob$children[[2]]
axis$widths <- rev(axis$widths)
axis$grobs <- rev(axis$grobs)
axis$grobs[[1]]$x <- axis$grobs[[1]]$x - unit(1, "npc") + unit(3, "pt")
g <- gtable_add_cols(g, g1$widths[g1$layout[index, ]$l], pp1$r)
g <- gtable_add_grob(g, axis, pp1$t, pp1$l+1)
# Grab axis from g2, and put it on the left
index <- which(g2$layout$name == "axis-l")
grob <- g2$grobs[[index]]
axis <- grob$children[[2]]
g <- gtable_add_grob(g, rectGrob(gp = gpar(col = NA, fill = "white")), pp1$t-1, pp1$l-1, pp1$b+1)
g <- gtable_add_grob(g, axis, pp1$t, pp1$l-1)
# Add axis titles
# right axis title
RightAxisText = textGrob("Task Ratio", rot = 90, gp = gpar(col = "red"))
g <- gtable_add_cols(g, unit.c(unit(1, "grobwidth", RightAxisText) + unit(1, "line")), 5)
g <- gtable_add_grob(g, RightAxisText, pp1$t, pp1$r+2)
# left axis title
LeftAxisText = textGrob("Total Opportunity Amount", rot = 90)
g <- gtable_add_grob(g, LeftAxisText, pp1$t, pp1$l-2)
g$widths[2] <- unit.c(unit(1, "grobwidth", LeftAxisText) + unit(1, "line"))
# Draw it
grid.newpage()
grid.draw(g)
It appears you are trying to plot two variables on two different y scales in one chart. I recommend against this, as it is widely considered bad practice. See, for example, the answer by @hadley (the author of ggplot2) on a similar issue: https://stackoverflow.com/a/3101876/3022126
It is possible to plot two variables on one y axis if they have comparable scales, but the ranges of your two datasets barely overlap.
Consider other visualisations, perhaps using two separate charts.
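As a minimal sketch of the two-separate-charts suggestion in base R, the two datasets from the question can be drawn next to each other with par(mfrow = ...), each on its own y axis. The object names counts and amounts and the panel titles are assumptions chosen for this example.

```r
# Task counts by worker type (rows) and tier (columns), from the question
counts <- matrix(
  c(9, 1, 340, 417, 97, 156, 812, 3163, 4271, 11314),
  nrow = 2,
  dimnames = list(c("0", "1"), as.character(1:5))
)

# Total opportunity amount per tier, from the question
amounts <- c(16200.00, 116067.50, 35284.12, 278107.10, 694820.29)
names(amounts) <- as.character(1:5)

# Two panels side by side, each with its own y axis
old_par <- par(mfrow = c(1, 2), mar = c(4, 6, 3, 1))
barplot(amounts, col = "black", las = 1, xlab = "Tier",
        main = "Total Opportunity Amount")
barplot(counts, col = c("red", "darkblue"), las = 1, xlab = "Tier",
        main = "Tasks by worker type",
        legend.text = TRUE, args.legend = list(x = "topleft"))
par(old_par)  # restore the previous graphics settings
```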
Try looking at the add parameter for barplot.
## Function to create alpha colors for illustration.
col2alpha <- function(col, alpha = 0.5) {
tmp <- col2rgb(col)
rgb(tmp[1]/255, tmp[2]/255, tmp[3]/255, alpha)
}
## Some fake data
dat1 <- data.frame(id = 1:4, val = c(10, 8, 6, 4))
dat2 <- data.frame(id = 1:4, val = c(4, 6, 8, 10))
barplot(dat1$val, col = col2alpha("blue"))
barplot(dat2$val, col = col2alpha("red"), add = TRUE)
