More than six shapes in ggplot - r

I would like to plot lines with different shapes for more than six sets of data, using discrete colors. The problems are: 1) separate legends are generated for line color and shape, but there should be only one legend combining line color and shape; 2) when I correct the title of the line color legend, the color disappears.
t=seq(0,360,20)
for (ip in seq(0,10)) {
if (ip==0) {
df<-data.frame(t=t,y=sin(t*pi/180)+ip/2,sn=ip+100)
} else {
tdf<-data.frame(t=t,y=sin(t*pi/180)+ip/2,sn=ip+100)
df<-rbind(df,tdf)
}
}
head(df)
# No plot
# Error: A continuous variable can not be mapped to shape
gp <- ggplot(df,aes(x=t,y=y,group=sn,color=sn,shape=sn))
gp <- gp + labs(title = "Demo more than 6 shapes", x="Theta (deg)", y="Magnitude")
gp <- gp + geom_line() + geom_point()
print(gp)
# No plot
# Error: A continuous variable can not be mapped to shape (doesn't like integers)
gp <- ggplot(df,aes(x=t,y=y,group=sn,color=sn,shape=as.integer(sn)))
gp <- gp + labs(title = "Demo more than 6 shapes", x="Theta (deg)", y="Magnitude")
gp <- gp + geom_line() + geom_point()
print(gp)
# Gives warning about 6 shapes, and only shows 6 shapes, continuous sn colors
gp <- ggplot(df,aes(x=t,y=y,group=sn,color=sn,shape=as.factor(sn)))
gp <- gp + labs(title = "Only shows six shapes, and two legends, need discrete colors",
x="Theta (deg)", y="Magnitude")
gp <- gp + geom_line() + geom_point()
print(gp)
# This is close to what is desired, but correct legend title and combine legends
gp <- ggplot(df,aes(x=t,y=y,group=sn,color=as.factor(sn),shape=as.factor(sn %% 6)))
gp <- gp + labs(title = "Need to combine legends and correct legend title", x="Theta (deg)", y="Magnitude")
gp <- gp + geom_line() + geom_point()
print(gp)
# Correct legend title, but now the line color disappears
gp <- ggplot(df,aes(x=t,y=y,group=sn,color=as.factor(sn),shape=as.factor(sn %% 6)))
gp <- gp + labs(title = "Color disappeared, but legend title changed", x="Theta (deg)", y="Magnitude")
gp <- gp + geom_line() + geom_point()
gp <- gp + scale_color_manual("SN",values=as.factor(df$sn))
print(gp)
# Add color and shape in geom_line / geom_point commands,
gp <- ggplot(df,aes(x=t,y=y,group=sn))
gp <- gp + labs(title = "This is close, but legend symbols are wrong", x="Theta (deg)", y="Magnitude")
gp <- gp + geom_line(aes(color=as.factor(df$sn)))
gp <- gp + geom_point(color=as.factor(df$sn),shape=as.factor(df$sn %% 6))
gp <- gp + scale_color_manual("SN",values=as.factor(df$sn))
print(gp)

First, it would be easier to convert sn to a factor.
df$sn <- factor(df$sn)
Then, you need to use scale_shape_manual to specify your shapes to use.
gp <- ggplot(df,aes(x=t, y=y, group=sn,color=sn, shape=sn)) +
scale_shape_manual(values=1:nlevels(df$sn)) +
labs(title = "Demo more than 6 shapes", x="Theta (deg)", y="Magnitude") +
geom_line() +
geom_point(size=3)
gp
This should give you what you want. You need to use scale_shape_manual because, even with sn as a factor, ggplot will only assign up to 6 different symbols automatically. Beyond that, you have to specify them manually. You can change your symbols in a number of ways. Have a look at these pages for more information:
http://sape.inf.usi.ch/quick-reference/ggplot2/shape
http://www.cookbook-r.com/Graphs/Shapes_and_line_types/
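The question also asks for a single legend with a custom title. Here is a minimal sketch of one way to get that, assuming the same data as in the question (rebuilt with lapply for brevity) and the title "SN". When colour and shape are mapped to the same factor and both scales carry the same title, ggplot2 merges them into one legend, so the title is set for both aesthetics in labs().

```r
library(ggplot2)

# Rebuild the question's data: 11 sine curves, each offset by ip / 2
t <- seq(0, 360, 20)
df <- do.call(rbind, lapply(0:10, function(ip) {
  data.frame(t = t, y = sin(t * pi / 180) + ip / 2, sn = ip + 100)
}))
df$sn <- factor(df$sn)

gp <- ggplot(df, aes(x = t, y = y, group = sn, color = sn, shape = sn)) +
  geom_line() +
  geom_point(size = 3) +
  # One shape value per factor level, because only 6 are assigned automatically
  scale_shape_manual(values = seq_len(nlevels(df$sn))) +
  # Same title for both aesthetics keeps the legends merged
  labs(title = "Demo more than 6 shapes", x = "Theta (deg)", y = "Magnitude",
       color = "SN", shape = "SN")
print(gp)
```

If the two titles differ, or only one of them is changed, the legends split again.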

For me, the key to the error message about the 6 shapes is the part that says "Consider specifying shapes manually".
If you add in the values in scale_shape_manual, I believe you'll get what you want. I made sn a factor in the dataset first.
df$sn = factor(df$sn)
ggplot(df, aes(x = t, y = y, group = sn, color = sn, shape = sn)) +
geom_point() +
geom_line() +
scale_shape_manual(values = 0:10)
I go to the Cookbook for R site when I need to remember which numbers correspond to which shapes.
Edit: The example above shows adding 11 symbols, the same number of symbols as in your example dataset. Your comments indicate that you have many more unique values for the sn variable than in your example. Be careful with using a long series of numbers in values, as not all numbers are defined as symbols.
Ignoring whether it is a good idea to have so many shapes in a single graphic or not, you can use letters and numbers as well as symbols as shapes. So if you wanted, say, 73 unique shapes based on a factor with 73 levels, you could use 19 symbols, all upper and lower case letters, and the numbers 0 and 1 as your values.
scale_shape_manual(values = c(0:18, letters, LETTERS, "0", "1"))
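One thing worth checking before relying on a mixed vector like this: c() coerces the numbers to character strings when they are combined with letters. The short base R sketch below only inspects the vector itself. How strings such as "10" are then drawn as shapes may depend on the ggplot2 version, so treat that part as something to verify on your own plot.

```r
# The values vector suggested above
vals <- c(0:18, letters, LETTERS, "0", "1")

length(vals)            # number of values supplied
class(vals)             # numbers were coerced to character
vals[duplicated(vals)]  # strings that occur more than once
length(unique(vals))    # number of distinct values
```

Because 0 and 1 become the strings "0" and "1", they collide with the two digit characters at the end, so the vector holds fewer distinct values than its length suggests.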

You can get about a hundred different shapes if you need them. good.shapes is a vector of the shape numbers that render on my screen without any fill argument.
library(ggplot2)
N = 100; M = 1000
good.shapes = c(1:25,33:127)
foo = data.frame( x = rnorm(M), y = rnorm(M), s = factor( sample(1:N, M, replace = TRUE) ) )
ggplot(aes(x,y,shape=s ), data=foo ) +
scale_shape_manual(values=good.shapes[1:N]) +
geom_point()

Related

Add multiple legends to ggplot geom_tile

I'm using ggplot to create a heat-map style plot, and would like to add a second legend with the data scaled a different way. I'm wondering if there is a simple way to do this.
I do not believe that this is a duplicate of other "multiple legends" questions e.g. Multiple legends for a ggplot in R as crucially I want to add extra legends for the same aesthetic - i.e. one aesthetic mapping, two legends.
Example code
# Create a dataframe with some dummy data
x <- c()
y <- c()
for(i in 1:100){
for(j in 1:100){
x <- c(x, i)
y <- c(y, j)
}
}
example_data <- data.frame(x, y)
example_data$z <- example_data$x*example_data$y
example_data$z_rescale <- example_data$z*0.5
Now we've got some data that I'd like to plot as a heatmap with "z" as a colour gradient.
ggplot(example_data, aes(x = x, y = y, fill = z)) +
geom_tile() +
scale_fill_gradient(low = "blue", high = "red") +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0))
Doing the same with the rescaled z gives an identical plot, but with the rescaled legend:
ggplot(example_data, aes(x = x, y = y, fill = z_rescale)) +
geom_tile() +
scale_fill_gradient(low = "blue", high = "red") +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0))
What I'd like to do however is have a single plot showing the two different legends, which would look something like this mock-up:
Now, I imagine this would be possible by creating two plots, finding the grob that represents the legend in one of the plots and cunningly adding it to the second plot... however, is there a much simpler way that I'm overlooking?
Many thanks!
Add the following code after the geom_tile() line and you will get the desired second legend:
aes(color = z_rescale) +
scale_color_gradient(low = "blue", high = "red") +
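A minimal sketch of how that suggestion might look as a complete plot, assuming a smaller 20 x 20 grid built with expand.grid() in place of the question's 100 x 100 loop (the grid size is an assumption made for brevity). Mapping z_rescale to colour gives ggplot a second aesthetic to draw a colour bar for. As a side effect, the tile outlines are coloured too.

```r
library(ggplot2)

# Smaller stand-in for the question's data (assumption: 20 x 20 grid)
example_data <- expand.grid(x = 1:20, y = 1:20)
example_data$z <- example_data$x * example_data$y
example_data$z_rescale <- example_data$z * 0.5

p <- ggplot(example_data, aes(x = x, y = y, fill = z)) +
  geom_tile() +
  # Second aesthetic carrying the rescaled values, which adds a second legend
  aes(color = z_rescale) +
  scale_color_gradient(low = "blue", high = "red") +
  scale_fill_gradient(low = "blue", high = "red") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
print(p)
```

Both legends use the same blue-to-red gradient, so they differ only in their tick labels.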

Create ggplot2 legend for multiple datasets

I am trying to display background data in grey in a ggplot with an automatic legend. My aim is either to include the grey data points in the legend, or to make a second legend with a manual title. However, I fail at doing either of the two. My data is in long format.
require(ggplot2)
xx<-data.frame(observation="all cats",x=1:2,y=1:2)
yy<-data.frame(observation=c("red cats","blue cats"),x=3:4,y=3:4)
g<-ggplot() +
geom_point(aes(x,y, colour=factor(observation)), colour="grey60", size=5, data=xx) +
geom_point(aes(x,y, colour=factor(observation)), size=5, data=yy) +
scale_color_discrete(name = "ltitle")
g
I tried to merge the data.frames with rbind.data.frame, which produces a nice legend, but then I am not able to colour the background data in grey and keep ggplot colours at the same time.
I also realized that this solves the problem:
g<-ggplot(aes(x,y, colour=factor(observation)), colour="grey60", data=xx) +
geom_point(size=5) +
geom_point(aes(x,y, colour=factor(observation)), size=5, data=yy) +
scale_color_discrete(name = "ltitle")
g
however I can't do this, because I'm using a function which creates a complicated empty plot before, in which I then add the geom_points.
Assuming your plot doesn't have other geoms that require a fill parameter, the following is a workaround that fixes the colour of your background data geom_point layer without affecting the other geom_point layers:
g <- ggplot() +
geom_point(aes(x, y,
fill = "label"), # key change 1
shape = 21, # key change 2
color = "grey50", size = 5,
data = xx) +
geom_point(aes(x, y, colour = factor(observation)), size = 5, data = yy) +
scale_color_discrete(name = "ltitle") +
scale_fill_manual(name = "", values = c("label" = "grey50")) # key change 3
g
shape = 21 gives you a shape that looks like the default round dot, but accepts a fill parameter in addition to the colour parameter. You can then set xx's geom_point layer's fill to grey in scale_fill_manual() (this creates a fill legend), while leaving color = "grey50" outside aes() (this does not add to the colour legend).
The colour scale for yy's geom_point layer is not affected by any of this.
p.s. Just realized I used "grey50" instead of "grey60"... But everything else still applies. :)
One solution is to create a color vector and pass it to scale_color_manual.
xx <- data.frame(observation = "all cats",x = 1:2,y = 1:2)
yy <- data.frame(observation = c("red cats", "blue cats"),x = 3:4,y = 3:4)
# rbind both datasets
# OP tried to use rbind.data.frame here
plotData <- rbind(xx, yy)
# Create color vector
library(RColorBrewer)
# Extract 3 colors from brewer Set1 palette
colorData <- brewer.pal(length(unique(plotData$observation)), "Set1")
# Replace the first color with the wanted grey
colorData[1] <- "grey60"
# Plot data
library(ggplot2)
ggplot(plotData, aes(x, y, colour = observation)) +
geom_point(size = 5)+
scale_color_manual(values = colorData, name = "ltitle")
I came up with pretty much same solution as Z.Lin but using the combined dataframe from rbind.data.frame. Similarly, it uses scale_colour_manual with a vector colors specifying the color mapping:
require(ggplot2)
xx<-data.frame(observation="all cats",x=1:2,y=1:2)
yy<-data.frame(observation=c("red cats","blue cats"),x=3:4,y=3:4)
zz <- rbind.data.frame(xx,yy)
colors <- c(
"all cats" = "grey60",
"red cats" = "red",
"blue cats" = "blue"
)
g<-ggplot() +
geom_point(aes(x,y, colour=factor(observation)), size=5, data=zz) +
scale_color_manual(values= colors, name = "ltitle")
g

R 'cowplot' neatly produce gridded plot with shared (common) legends and unique legends

See my related question and the accepted answer here.
I am trying to produce a plot similar to that in the accepted answer i.e. a gridded plot with a shared common legend and a different unique legend attached to each plot on the grid.
Specifically, I want a 3 row, 1 column grid with 1 plot on each row. Like this:
Which was produced with the following code:
library (ggplot2)
library(gridExtra)
library (grid)
library(cowplot)
diamonds2 <- diamonds[sample(nrow(diamonds), 500), ]
# 3 ggplot plot objects with multiple legends 1 common legend and 3 unique legends
p1<- ggplot(diamonds2, aes(x=price, y= depth, color= clarity , shape= cut )) +
geom_point(size=5) + labs (shape = "unique legend", color = "common legend")
p2 <- ggplot(diamonds2, aes(x=price, y= depth, color= clarity , shape= color )) +
geom_point(size=5) + labs (shape = "unique legend", color = "common legend")
p3 <- ggplot(diamonds2, aes(x=price, y= depth, color= clarity , shape= clarity )) +
geom_point(size=5) + labs (shape = "unique legend", color = "common legend")
cowplot::plot_grid(
cowplot::plot_grid(
p1 + scale_color_discrete(guide = FALSE),
p2 + scale_color_discrete(guide = FALSE),
p3 + scale_color_discrete(guide = FALSE),
nrow=3, ncol = 1))
But with a shared legend which relates to the color = argument of each plot object.
I've tried many variations of the below code and have added/adjusted/removed various arguments/parameters in consultation with the cowplot documentation but I cannot get a neat plot like the one above with the shared legend at the bottom (or anywhere useful!) - everything I have attempted returns a crowded plot like below.
Adaptation of the above code to include the shared legend:
cowplot::plot_grid(
cowplot::plot_grid(
p1 + scale_color_discrete(guide = FALSE),
p2 + scale_color_discrete(guide = FALSE),
p3 + scale_color_discrete(guide = FALSE),
nrow=3, ncol = 1
),
cowplot::get_legend(p1 + scale_shape(guide = FALSE) + theme(legend.position = "bottom")), nrow=3)
Which results in a crowded plot like this with a lot of empty space:
Could anyone suggest where I might be going wrong?
Each call to plot_grid splits its plotting area. Here, you are nesting two calls to plot_grid, and you are asking for 3 rows in each. The outer call therefore splits the plotting area into three equal rows:
the first row holds the nested grid with your three scatter plots
the second row holds your legend, and the third row stays empty, creating a lot of empty space while squishing your scatter plots.
You can specify the relative height of each of your plotting areas, giving more space to the scatter plots and less space to the legend at the bottom. For instance, for 85% plots and 15% legend:
cowplot::plot_grid(
cowplot::plot_grid(
p1 + scale_color_discrete(guide = FALSE),
p2 + scale_color_discrete(guide = FALSE),
p3 + scale_color_discrete(guide = FALSE),
ncol = 1, align = "v"
),
cowplot::get_legend(p1 + scale_shape(guide = FALSE) +
theme(legend.position = "bottom")),
ncol=1, rel_heights=c(.85, .15))
which produces :

Consistent hexagon sizes and legend for manually assignment of colors

This is a continuation of a question I recently asked (Manually assigning colors with scale_fill_manual only works for certain hexagon sizes).
I was unable to plot geom_hex() so that all hexagons were the same size. Someone solved the problem. However, their solution removed the legend key. Now, I am unable to keep all the hexagons the same size while also retaining the legend.
To be specific, I really want to keep the legend labels sensical. In the example below, the legend has values (0,2,4,6,8,20), rather than hexadecimal labels (#08306B, #08519C, etc).
Below is an MWE illustrating the problem. At the end, as per the 3 comments, you can see that I am able to 1) create a plot with consistent hexagon sizes but no legend, 2) create a plot with a legend but inconsistent hexagon sizes, and 3) attempt to create a plot with consistent hexagon sizes and a legend, but fail:
library(ggplot2)
library(hexbin)
library(RColorBrewer)
library(reshape)
set.seed(1)
xbins <- 10
x <- abs(rnorm(10000))
y <- abs(rnorm(10000))
minVal <- min(x, y)
maxVal <- max(x, y)
maxRange <- c(minVal, maxVal)
buffer <- (maxRange[2] - maxRange[1]) / (xbins / 2)
bindata = data.frame(x=x,y=y,factor=as.factor(1))
h <- hexbin(bindata, xbins = xbins, IDs = TRUE, xbnds = maxRange, ybnds = maxRange)
counts <- hexTapply (h, bindata$factor, table)
counts <- t (simplify2array (counts))
counts <- melt (counts)
colnames (counts) <- c ("factor", "ID", "counts")
counts$factor =as.factor(counts$factor)
hexdf <- data.frame (hcell2xy (h), ID = h@cell)
hexdf <- merge (counts, hexdf)
my_breaks <- c(2, 4, 6, 8, 20, 1000)
clrs <- brewer.pal(length(my_breaks) + 3, "Blues")
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts, breaks = c(0, my_breaks, Inf), labels = rev(clrs))
# Has consistent hexagon sizes, but no legend
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(stat="identity", fill=hexdf$countColor) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
# Has legend, but inconsistent hexagon sizes
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(data=hexdf, stat="identity", aes(fill=countColor)) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
# One attempt to create consistent hexagon sizes and retain legend
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(data=hexdf, aes(fill=countColor)) + geom_hex(stat="identity", fill=hexdf$countColor) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
Any suggestions on how to keep the hexagon sizes consistent while retaining the legend would be very helpful!
Wow, this is an interesting one -- geom_hex seems to really dislike mapping color/fill onto categorical variables. I assume that's because it is designed to be a two-dimensional histogram and visualize continuous summary statistics, but if anyone has any insight into what's going on behind the scenes, I would love to know.
For your specific problem, that really throws a wrench in the works, because you're attempting to have categorical colorization that assigns non-linear groups to the individual hexagons. Conceptually, you might consider why you're doing that. There may be a good reason, but you're essentially taking a linear color gradient and mapping it non-linearly onto your data, which can end up being visually misleading.
However, if that is what you want to do, the best approach I could come up with was to create a new continuous variable that mapped linearly onto your chosen colors and then use those to create a color gradient. Let me try to walk you through my thought process.
You essentially have a continuous variable (counts) that you want to map onto colors. That's easy enough with a simple color gradient, which is the default in ggplot2 for continuous variables. Using your data:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=counts))
yields something close.
However, the bins with really high counts wash out the gradient for points with much lower counts, so we need to change the way the gradient maps colors onto values. You've already declared the colors you want to use in the clrs variable; we just need to add a column to your data frame to use in conjunction with these colors to create a smooth gradient. I did that as follows:
all_breaks <- c(0, my_breaks)
breaks_n <- 1:length(all_breaks)
get_break_n <- function(n) {
break_idx <- max(which((all_breaks - n) < 0))
breaks_n[break_idx]
}
hexdf$bin <- sapply(hexdf$counts, get_break_n)
We create the bin variable as the index of the break that is nearest the count variable without exceeding it. Now, you'll notice that:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin))
is getting much closer to the goal.
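As an aside, the bin index computed by get_break_n above can probably also be obtained in one vectorized call. The sketch below assumes base R's findInterval() with left.open = TRUE and compares it against the helper on a few made-up counts (the counts vector is hypothetical and not taken from hexdf).

```r
# Break points as defined in the answer above
my_breaks <- c(2, 4, 6, 8, 20, 1000)
all_breaks <- c(0, my_breaks)

# Helper from the answer: index of the largest break strictly below n
get_break_n <- function(n) max(which((all_breaks - n) < 0))

# Hypothetical example counts, including values that sit exactly on a break
counts <- c(1, 2, 3, 8, 9, 20, 21, 1000, 1500)

bin_loop <- sapply(counts, get_break_n)
# left.open = TRUE makes the intervals (lower, upper], so a count equal to a
# break falls into the bin below it, matching the strict "< 0" test above
bin_vec <- findInterval(counts, all_breaks, left.open = TRUE)

stopifnot(identical(bin_loop, bin_vec))
```

Both versions assume every count is above 0. A count of 0 has no break below it, so the helper and findInterval() would need separate handling for that case.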
The next step is to change how the color gradient maps onto that bin variable, which we can do by adding a call to scale_fill_gradientn:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1])) # odd color reversal to
# match OP's color mapping
This takes a vector of colors between which you want to interpolate a gradient. The way we've set it up, the points along the interpolation will perfectly match up with the unique values of the bin variable, meaning each value will get one of the colors specified.
Now we're cooking with gas, and the only thing left to do is add the various bells and whistles from the original graph. Most importantly, we need to make the legend look the way we want. This requires three things: (1) changing it from the default color bar to a discretized legend, (2) specifying our own custom labels, and (3) giving it an informative title.
# create the custom labels for the legend
all_break_labs <- as.character(all_breaks[1:(length(all_breaks)-1)])
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1]),
guide="legend", # (1) make legend discrete
labels=all_break_labs, # (2) specify labels
name="Count") + # (3) legend title
# All the other prettification from the OP
geom_abline(intercept = 0, color = "red", size = 0.25) +
labs(x = "A", y = "C") +
coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)),
ylim = c(-0.5, (maxRange[2]+buffer))) +
theme(aspect.ratio=1)
All of this leaves us with the following graph:
Hopefully that helps you out. For completeness, here's the new code in full:
# ... the rest of your code before the plots
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts,
breaks = c(0, my_breaks, Inf),
labels = rev(clrs))
### START OF NEW CODE ###
# create new bin variable
all_breaks <- c(0, my_breaks)
breaks_n <- 1:length(all_breaks)
get_break_n <- function(n) {
break_idx <- max(which((all_breaks - n) < 0))
breaks_n[break_idx]
}
hexdf$bin <- sapply(hexdf$counts, get_break_n)
# create legend labels
all_break_labs <- as.character(all_breaks[1:(length(all_breaks)-1)])
# create final plot
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1]),
guide="legend",
labels=all_break_labs,
name="Count") +
geom_abline(intercept = 0, color = "red", size = 0.25) +
labs(x = "A", y = "C") +
coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)),
ylim = c(-0.5, (maxRange[2]+buffer))) +
theme(aspect.ratio=1)

ggplot2 legend with only one category / with only the shape and no scale

How can I have a section of the legend with only one category? I tried to mess with override.aes without success. Alternatively, the desired output could be seen as a legend with only the shape but not the scale.
ggplot(iris) +
geom_point(aes(x=Sepal.Width, y=Sepal.Length, color=Species, size=Sepal.Length))+
scale_size_continuous("Legend with \n only 1 circle ",range = c(5,10))+
guides(size = guide_legend( override.aes=list(range= c(1,5))))
An illustration of the type of product I am trying to get to:
Points are scaled but the legend does not report the scale.
Just create one break in the scale. You can add a custom label to it as well (here it is ""). You can also control the size of the point in the legend with the break you choose.
The scale_color_discrete() line is there because otherwise the 1-point legend would be on top, not what you had in your desired picture.
require(ggplot2)
g <- ggplot(iris) + geom_point(aes(x = Sepal.Width, y = Sepal.Length,
color = Species, size = Sepal.Length)) +
scale_color_discrete(name = "Color") +
scale_size_continuous(name = "Legend with \n only 1 circle",
breaks = 5, labels = "")
Although @choff's solution is the best solution for the example I gave, here is the slightly different one I ended up using, as I needed to have control over the size range of the circles.
ggplot(iris) +
geom_point(aes(x=Sepal.Width, y=Sepal.Length, color=Species, size=Sepal.Length))+
scale_size_continuous("Legend with \n only 1 circle ",range = c(5,10), labels=c("","","","",""))+
guides(size =guide_legend( override.aes=list(size=c(4,0,0)))) +
theme(legend.key = element_blank())
From your target map chart, it looks like you want your legend to separate your chart symbols by color and shape, with size used only to set the size of the symbols. Also, your data would likely have a column separating countries with investments from those without. So we can add a column to iris which separates the rows into two values, map color and shape to that column, and then display those two aesthetics in a single, combined legend. The code looks like:
sp <- ggplot(transform(iris, Flower_size = ifelse(Petal.Width < 1, "Small Flower","Big Flower")))
sp <- sp + geom_point(aes(x=Sepal.Width, y=Sepal.Length, fill=Flower_size, shape=Flower_size, size=Sepal.Length), colour=NA)
sp <- sp + scale_size_continuous(range = c(4,7))
sp <- sp + scale_shape_manual(values=c(21, 22))
sp <- sp + scale_fill_manual(values=c("grey", "orange"))
sp <- sp + labs(fill="", shape="")
sp <- sp + guides(size=FALSE, fill=guide_legend(override.aes=list(size=7)))
sp <- sp + theme(legend.text=element_text(size=12))
plot(sp)
