ggplot2: Legend for NA in scale_fill_brewer - r

How can I get a legend category for NA values in scale_fill_brewer? Here is my MWE.
set.seed(12345)
dat <-
data.frame(
Row = rep(x = LETTERS[1:5], times = 10)
, Col = rep(x = LETTERS[1:10], each = 5)
, Y = c(rnorm(n = 48, mean = 500, sd = 1), NA, NA)
)
dat$Y1 <- addNA(cut(log(dat$Y), 5))
levels(dat$Y1)
# [1] "(6.21,6.212]" "(6.212,6.214]" "(6.214,6.216]" "(6.216,6.218]" "(6.218,6.22]" NA
library(ggplot2)
ggplot(data = dat, aes(x = Row, y = Col)) +
geom_tile(aes(fill = Y1), colour = "white") +
scale_fill_brewer(palette = "PRGn")

You could explicitly treat the missing values as another level of your Y1 factor to get it on your legend.
After cutting the variable as before, you will want to add NA to the levels of the factor. Here I add it as the last level.
dat$Y1 <- cut(log(dat$Y), 5)
levels(dat$Y1) <- c(levels(dat$Y1), "NA")
Then change all the missing values to the character string NA.
dat$Y1[is.na(dat$Y1)] <- "NA"
Re-running the plot code from the question then shows NA as a category in the legend.
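A minimal end-to-end sketch of the steps above, reusing the question's data. The only assumption beyond the original code is that the "PRGn" palette is asked for six colours (five bins plus the "NA" level), which is within its range.

```r
library(ggplot2)

set.seed(12345)
dat <- data.frame(
  Row = rep(LETTERS[1:5], times = 10),
  Col = rep(LETTERS[1:10], each = 5),
  Y   = c(rnorm(n = 48, mean = 500, sd = 1), NA, NA)
)

# Bin the values, then add "NA" as an explicit last factor level
dat$Y1 <- cut(log(dat$Y), 5)
levels(dat$Y1) <- c(levels(dat$Y1), "NA")

# Replace the missing entries with the new "NA" level
dat$Y1[is.na(dat$Y1)] <- "NA"

# "NA" is now an ordinary level, so it gets its own colour and legend key
p <- ggplot(dat, aes(x = Row, y = Col)) +
  geom_tile(aes(fill = Y1), colour = "white") +
  scale_fill_brewer(palette = "PRGn")
p
```

Because "NA" is an ordinary level here, it takes the last colour of the palette rather than the scale's na.value.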

I found a workaround that does not change the original data frame: add an extra legend through a dummy size aesthetic (based on this post):
ggplot(data = dat, aes(x = Row, y = Col)) +
geom_tile(aes(fill = Y1), colour = "white") +
scale_fill_brewer(palette = "PRGn")+
geom_point(data = dat, aes(size="NA"), shape =NA, colour = "grey95")+
guides(size=guide_legend("NA", override.aes=list(shape=15, size = 10)))
Colouring the NAs:
ggplot(data = dat, aes(x = Row, y = Col)) +
geom_tile(aes(fill = Y1), colour = "white") +
scale_fill_brewer(palette = "PRGn", na.value="red")+
geom_point(data = dat, aes(size="NA"), shape =NA, colour = "red")+
guides(size=guide_legend("NA", override.aes=list(shape=15, size = 10)))

Related

ggplot2 fill legend does not display the correct "fill" color

I have been confused by this problem for a long time. A simple data frame is constructed as follows
data <- data.frame(
x = 1:5,
y = 5:1,
fill = c(rep("pink", 3), rep("blue", 2)),
shape = c(rep(21, 3), rep(22, 2))
)
Suppose I want to show the legend of the fill
uniFill <- unique(data$fill)
p <- ggplot(data,
mapping = aes(x = x,
y = y,
fill = fill)) +
geom_point(shape = data$shape) +
# show legend so that I do not call `scale_fill_identity()`
scale_fill_manual(values = uniFill,
labels = uniFill,
breaks = uniFill)
p
The graphics are OK, however, the legend is not correct
I guess, maybe different shapes (21 to 25) cannot be merged? Then, I partition the data into two subsets where the first set has shape 21 and the second has shape 22.
data1 <- data[1:3, ]
data2 <- data[4:5, ]
# > data1$shape
# [1] 21 21 21
# > data2$shape
# [1] 22 22
ggplot(mapping = aes(x = x,
y = y,
fill = fill)) +
geom_point(data = data1, shape = data1$shape) +
geom_point(data = data2, shape = data2$shape) +
scale_fill_manual(values = uniFill,
labels = uniFill,
breaks = uniFill)
Unfortunately, the legend does not change. Then, I changed the shape from a vector to a scalar, as in
ggplot(mapping = aes(x = x,
y = y,
fill = fill)) +
geom_point(data = data1, shape = 21) +
geom_point(data = data2, shape = 22) +
scale_fill_manual(values = uniFill,
labels = uniFill,
breaks = uniFill)
Finally, the legend of the fill color is correct.
So what happens here? Is it a bug? Is it possible to add just a single layer but with different shapes (21 to 25)?
One possible solution is to add guides(), as in
p +
guides(fill = guide_legend(override.aes = list(fill = uniFill,
shape = 21)))
But I am more interested in why the legend of p does not work.
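The main reason the legend does not work in your first example is that shape is not mapped inside aes(). As far as I can tell, a fixed setting passed outside aes() is carried over to the legend keys only when it has length 1, so shape = data$shape (length 5) is ignored there and the keys are drawn with the default point shape 19, which has no fill. That is also why the version with the scalars shape = 21 and shape = 22 works.
I have a couple of other suggestions: do not store colors in your data frame; instead define a column of codes to map to the aesthetics, then define your fill and shape values explicitly. Each scale needs to have the same name (in this case "Legend") so that the fill and shape legends merge.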
Give this edit a try.
data <- data.frame(
x = 1:5,
y = 5:1,
fill = c(rep("p", 3), rep("b", 2))
)
uniFill <- c("p"="pink", "b"="blue")
uniShape <- c("p" = 21, "b" = 22)
p <- ggplot(data,
mapping = aes(x = x,
y = y,
fill = fill,
shape = fill)) +
geom_point() +
# show legend so that I do not call `scale_fill_identity()`
scale_fill_manual("Legend",values = uniFill,
labels = uniFill)+
scale_shape_manual("Legend",values = uniShape,
labels = uniFill)
p
(edit) If your fill and shape aesthetics do not match up, I don't see any other way than to use guides and two legends. Notice that if your attribute column is descriptive, you do not need to set the labels and your code will be cleaner (see shape vs fill aesthetics).
data <- data.frame(
x = 1:5,
y = 5:1,
fill = c(rep("p", 3), rep("b", 2)),
shape = c(rep("circles", 2), rep("squares", 3))
)
uniFill <- c("p"="pink", "b"="blue")
uniShape <- c("circles" = 21, "squares" = 22)
p <- ggplot(data,
mapping = aes(x = x,
y = y,
fill = fill,
shape = shape)) +
geom_point() +
# show legend so that I do not call `scale_fill_identity()`
scale_fill_manual("Legend fill",values = uniFill,
labels = uniFill)+
scale_shape_manual("Legend shape",values = uniShape )+
guides(fill = guide_legend("Legend fill", override.aes = list(shape = 21)))
p

How can I remove the NA label in my GGplot legend?

My original code
my_graph10 <- ggplot(Adata, aes(x = SVL, y = Fi)) +
geom_point(aes(color = Morph)) +
labs(x = "SVL (mm)", y = "Front Inner Limb (mm)") +
geom_smooth(method=lm,se=FALSE,aes(color = Morph,linetype = Morph)) +
scale_color_manual(values = c("orange", "steelblue"))
results in a legend that includes an NA entry.
After reading online, many said to use na.translate = F, so I added it to the code:
my_graph11 <- ggplot(Adata, aes(x = SVL, y = Fo)) +
geom_point(aes(color = Morph)) +
labs(x = "SVL (mm)", y = "Front Outer Limb (mm)") +
geom_smooth(method=lm,se=FALSE,aes(color = Morph,linetype = Morph)) +
scale_color_manual(na.translate = F, values = c("orange", "steelblue"))
and I am left with two legends.
This removes the NA from the original legend, but adds a new legend for linetype, under which NA is still listed. I tried the same approach for linetype but received this error message: "Error: Insufficient values in manual scale. 2 needed but only 0 provided"
You can use remove_missing
# Let's create some sample data first
library(ggplot2)
set.seed(2020)
x <- seq(0, 1, length.out = 100)
df <- data.frame(
x = x,
y = rnorm(length(x)),
Morph = sample(c("S", "U", NA), length(x), replace = TRUE))
# Use `remove_missing` with `na.rm = TRUE` to remove NA rows
ggplot(remove_missing(df, na.rm = TRUE), aes(x, y, colour = Morph)) +
geom_point() +
geom_smooth(aes(linetype = Morph), method = "lm", se = FALSE)
Alternatively, you can use na.omit
ggplot(na.omit(df), aes(x, y, colour = Morph)) +
geom_point() +
geom_smooth(aes(linetype = Morph), method = "lm", se = FALSE)
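The "Insufficient values in manual scale" error quoted in the question most likely means the manual linetype scale was called without a values argument; scale_linetype_manual() needs its own values just as scale_color_manual() does. A minimal sketch, assuming the sample data from this answer, with arbitrarily chosen colours and linetypes (df_complete is a hypothetical name):

```r
library(ggplot2)

set.seed(2020)
x <- seq(0, 1, length.out = 100)
df <- data.frame(
  x = x,
  y = rnorm(length(x)),
  Morph = sample(c("S", "U", NA), length(x), replace = TRUE)
)

# Drop rows with a missing Morph before plotting
df_complete <- df[!is.na(df$Morph), ]

# Both manual scales get explicit values; since both map Morph and share
# the default title, colour and linetype end up in a single legend
p <- ggplot(df_complete, aes(x, y, colour = Morph)) +
  geom_point() +
  geom_smooth(aes(linetype = Morph), method = "lm", se = FALSE) +
  scale_colour_manual(values = c(S = "orange", U = "steelblue")) +
  scale_linetype_manual(values = c(S = "solid", U = "dashed"))
p
```

Naming the values by level (S, U) keeps the colour and linetype assignment independent of the order in which the levels appear.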

How to connect points of two dataframes to each other using ggplot in R?

I have two dataframes df1 and df2 as follows:
> df1
time value
1 1 6
2 2 2
3 3 3
4 4 1
> df2
time value
1 2 3
2 3 8
3 4 4
4 5 5
I want to plot these dataframes in just one diagram, show their name on their plots with a colour, and connect each value of df1 to the corresponding value of df2. Actually, here is the diagram which I want:
The code which I wrote to try to get the above diagram is:
ggplot() +
geom_point() +
geom_line(data=df1, aes(x=time, y=value), color='green') +
geom_line(data=df2, aes(x=time, y=value), color='red') +
xlab("time") +
geom_text(aes(x = df1$time[1], y = 6.2, label = "df1", color = "green", size = 18)) +
geom_text(aes(x = df2$time[1], y = 2.8, label = "df2", color = "red", size = 18)) +
theme(axis.text=element_text(size = 14), axis.title=element_text(size = 14))
But the result is:
As you can see in the second plot, there are no points even though I used geom_point(), the colours of the names are wrong, there is no connection between each value of df1 and the corresponding value of df2, and I cannot increase the text size of the names even though I set size = 18 in my code.
A very similar solution to zx8754’s answer but with more explicit data wrangling. In theory my solution should be more general as the dataframes could be unsorted, they would just need a common variable to join.
library(magrittr)
library(ggplot2)
df1 = data.frame(
time = 1:4,
value = c(6,2,3,1),
index = 1:4
)
df2 = data.frame(
time = 2:5,
value = c(3,8,4,5),
index = 1:4
)
df3 = dplyr::inner_join(df1,df2,by = "index")
df1$type = "1"
df2$type = "2"
plot_df = dplyr::bind_rows(list(df1,df2))
plot_df %>% ggplot(aes(x = time, y = value, color = type)) +
geom_point(color = "black")+
geom_line() +
geom_segment(inherit.aes = FALSE,
data = df3,
aes(x = time.x,
y = value.x,
xend = time.y,
yend = value.y),
linetype = "dashed") +
scale_color_manual(values = c("1" = "green",
"2" = "red"))
Created on 2019-04-25 by the reprex package (v0.2.0).
Combine (cbind) dataframes then use geom_segment:
ggplot() +
geom_line(data = df1, aes(x = time, y = value), color = 'green') +
geom_line(data = df2, aes(x = time, y = value), color = 'red') +
geom_segment(data = setNames(cbind(df1, df2), c("x1", "y1", "x2", "y2")),
aes(x = x1, y = y1, xend = x2, yend = y2), linetype = "dashed")
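There is a very simple solution (from here), reusing plot_df from the answer above: give each pair of points a shared group id and let geom_line connect them.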
plot_df$'Kukulkan' <- rep(1:4, 2)
plot_df %>% ggplot(aes(x = time, y = value, color = type)) +
geom_point(size=3)+
geom_line(aes(group = Kukulkan))
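The remaining problems from the question (no points, wrong label colours, text size not changing) can be handled as well. geom_point() with no data has nothing to draw, and color = "green" or size = 18 placed inside aes() is treated as data to be mapped through a scale rather than as a literal setting. A minimal sketch, assuming the rows of df1 and df2 pair up by position; segs is a hypothetical helper name, and the label positions and text size are arbitrary choices:

```r
library(ggplot2)

df1 <- data.frame(time = 1:4, value = c(6, 2, 3, 1))
df2 <- data.frame(time = 2:5, value = c(3, 8, 4, 5))

# Pair rows by position; this assumes both frames are in matching order
segs <- data.frame(x = df1$time, y = df1$value,
                   xend = df2$time, yend = df2$value)

p <- ggplot(mapping = aes(x = time, y = value)) +
  geom_line(data = df1, colour = "green") +
  geom_point(data = df1) +
  geom_line(data = df2, colour = "red") +
  geom_point(data = df2) +
  geom_segment(data = segs, inherit.aes = FALSE,
               aes(x = x, y = y, xend = xend, yend = yend),
               linetype = "dashed") +
  # Constants go outside aes(): a literal colour and a text size in mm
  annotate("text", x = 1, y = 6.4, label = "df1", colour = "green", size = 6) +
  annotate("text", x = 2, y = 2.6, label = "df2", colour = "red", size = 6)
p
```

Using annotate() for the two labels also avoids the extra legend entries that geom_text() with mapped constants produces.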

R - geom_bar - 'stack' position without summing the values

I have this data frame
df <- data.frame(profile = rep(c(1,2), times = 1, each = 3), depth = c(100, 200, 300), value = 1:3)
This is my plot
ggplot() +
geom_bar(data = df, aes(x = profile, y = - depth, fill = value), stat = "identity")
My problem is that the y-axis labels do not correspond to the depth values in the data frame.
To illustrate, my desired plot looks like this:
ggplot() +
geom_point(data = df, aes(x = profile, y = depth, colour = value), size = 20) +
xlim(c(0,3))
But with bars instead of vertically aligned points.
NB: I don't want to correct it manually by changing the ticks with scale_y_discrete(labels = (desired_labels))
Thanks for your help
Considering you want a y-axis from 0 to -300, using facet_grid() seems to be a right option without summarising the data together.
ggplot() + geom_bar(data = df, aes(x = as.factor(profile), y = -depth, fill = value), stat = 'identity') + facet_grid(~ value)
I have it!
Thanks for your replies and to this post: R, subtract value from previous row, group by
To summarize, the data:
df <- data.frame(profile = rep(c(1,2), times = 1, each = 3), depth = c(100, 200, 300), value = 1:3)
Then we compute the depth step of each profile:
df$diff <- ave(df$depth, df$profile, FUN=function(z) c(z[1], diff(z)))
And finally the plot:
ggplot(df, aes(x = factor(profile), y = -diff, fill = value)) + geom_col()
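To see why this works, here is a small check of the arithmetic, assuming the same df as above (the stacked column is a hypothetical name added only for the check). Each bar segment now has the height of one depth step, so the running total within a profile lands back on the original depths.

```r
df <- data.frame(profile = rep(c(1, 2), each = 3),
                 depth = c(100, 200, 300),
                 value = 1:3)

# Depth step within each profile: first depth, then successive differences
df$diff <- ave(df$depth, df$profile, FUN = function(z) c(z[1], diff(z)))

# Stacking is a cumulative sum, so it should reproduce the depth column
df$stacked <- ave(df$diff, df$profile, FUN = cumsum)

print(df)
```

With evenly spaced depths every step is 100, which is why the stacked bars reach 300 instead of the 600 obtained by stacking the raw depths.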

Legend for summary statistics in ggplot2

Here is the code for the plot
library(ggplot2)
df <- data.frame(gp = factor(rep(letters[1:3], each = 10)), y = rnorm(30))
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))
ggplot(df, aes(x = gp, y = y)) +
geom_point() +
geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
I want a legend for this plot that identifies the data values and the mean values, something like this
Black point = Data
Red point = Mean.
How can I achieve this?
Use a manual scale, i.e. in your case scale_colour_manual. Then map the colours to values in the scale using the aes() function of each geom:
ggplot(df, aes(x = gp, y = y)) +
geom_point(aes(colour="data")) +
geom_point(data = ds, aes(y = mean, colour = "mean"), size = 3) +
scale_colour_manual("Legend", values=c("mean"="red", "data"="black"))
You can combine the means and the data in the same data.frame, then map colour and size to a factor column whose value is either data or mean.
library(reshape2)
# in long format
dsl <- melt(ds, value.name = 'y')
# add variable column to df data.frame
df[['variable']] <- 'data'
# combine
all_data <- rbind(df,dsl)
# drop sd rows
data_w_mean <- subset(all_data,variable != 'sd',drop = T)
# create vectors for use with scale_..._manual
colour_scales <- setNames(c('black','red'),c('data','mean'))
size_scales <- setNames(c(1,3),c('data','mean') )
ggplot(data_w_mean, aes(x = gp, y = y)) +
geom_point(aes(colour = variable, size = variable)) +
scale_colour_manual(name = 'Type', values = colour_scales) +
scale_size_manual(name = 'Type', values = size_scales)
Or, instead of combining them, you could include the column in both data sets
dsl_mean <- subset(dsl,variable != 'sd',drop = T)
ggplot(df, aes(x = gp, y = y, colour = variable, size = variable)) +
geom_point() +
geom_point(data = dsl_mean) +
scale_colour_manual(name = 'Type', values = colour_scales) +
scale_size_manual(name = 'Type', values = size_scales)
This gives the same result.
