Legend for scatter-line graph in ggplot2 (without colour) - r

I'm very new to this, so apologies if there's an obvious answer. I'm trying to add a legend to a scatter-line graph with 2 y variables. I'm aware this can be done using colour; however, I'd ideally like to keep the plot black and white and distinguish the variables in the legend by linetype/point shape instead. Is there any way to do this?
ggplot(birds, aes(distance)) +
  geom_point(aes(y = individuals_AC)) +
  geom_point(aes(y = species_AC, shape = 17)) +
  geom_line(aes(y = individuals_AC)) +
  geom_line(aes(y = species_AC, linetype = "dashed")) +
  scale_shape_identity() +
  scale_linetype_identity() +
  theme_classic()

library(tidyverse)
# create some dummy data
df <- tibble(
  x = runif(10),
  y = runif(10),
  type = rep(c("a", "b"), 5)
)
# plot it with a different shape for each type
df %>%
  ggplot(aes(x, y, shape = type)) +
  geom_point()
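For the original two-variable plot, one way to get a legend without colour is to reshape the data to long format and map both shape and linetype to the variable name. The sketch below assumes the column names from the question (distance, individuals_AC, species_AC). The birds values are made up for illustration, and the names birds_long, measure and count are hypothetical.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for the asker's `birds` data (values are invented)
birds <- data.frame(
  distance       = 1:5,
  individuals_AC = c(2, 5, 9, 12, 14),
  species_AC     = c(1, 3, 4, 6, 7)
)

# Wide to long: one row per (distance, measure) pair
birds_long <- pivot_longer(
  birds,
  cols      = c(individuals_AC, species_AC),
  names_to  = "measure",
  values_to = "count"
)

# Mapping shape and linetype to the same variable yields a single merged legend
p <- ggplot(birds_long,
            aes(distance, count, shape = measure, linetype = measure)) +
  geom_point() +
  geom_line() +
  theme_classic()
# print(p) draws the plot
```

Because both aesthetics are mapped inside aes() rather than set as constants, ggplot2 builds the legend itself and no identity scales are needed.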

Related

r ggplot when two colors overlap

I have some code to generate a plot; the only problem is that there are many overlapping colors.
When two colors overlap, how do I specify the dominant color?
For example, there are 4 black points where indicator = threshold, one at each of the 4 x-axis categories. However, the black points at the "Wire" and "ACH" scales do not show up because they overlap with blue points, and the black point at the "RDFI" scale barely shows up. How can I make black the dominant color when two colors overlap? Thanks in advance!
ggplot(df, aes(a-axis, y-axis), color=indicator)) +
geom_quasirandom(groupOnX=TRUE, na.rm = TRUE) +
labs(title= 'chart', x='x-axis', y= 'y-axis') +
scale_color_manual(name = 'indicator', values=c("#99ccff","#000000" ))
To specify the dominant color, you can use the function new_scale() and its aliases new_scale_color() and new_scale_fill() from the ggnewscale package. Each layer then gets its own color scale, and the layer added last is drawn on top.
As an example, let's overlay some measurements on a contour map of topography using the beloved volcano dataset.
library(ggplot2)
library(ggnewscale)
# Equivalent to melt(volcano)
topography <- expand.grid(x = 1:nrow(volcano),
                          y = 1:ncol(volcano))
topography$z <- c(volcano)
# point measurements of something at a few locations
set.seed(42)
measurements <- data.frame(x = runif(30, 1, 80),
                           y = runif(30, 1, 60),
                           thing = rnorm(30))
Dominant point (points are added last, so they are drawn on top):
ggplot(mapping = aes(x, y)) +
  # after_stat() replaces the deprecated stat() (ggplot2 >= 3.3.0)
  geom_contour(data = topography, aes(z = z, color = after_stat(level))) +
  # Color scale for topography
  scale_color_viridis_c(option = "D") +
  # geoms below will use another color scale
  new_scale_color() +
  geom_point(data = measurements, size = 3, aes(color = thing)) +
  # Color scale applied to geoms added after new_scale_color()
  scale_color_viridis_c(option = "A")
Dominant contour (contours are added last, so they are drawn on top):
ggplot(mapping = aes(x, y)) +
  geom_point(data = measurements, size = 3, aes(color = thing)) +
  scale_color_viridis_c(option = "A") +
  new_scale_color() +
  # after_stat() replaces the deprecated stat() (ggplot2 >= 3.3.0)
  geom_contour(data = topography, aes(z = z, color = after_stat(level))) +
  scale_color_viridis_c(option = "D")
Your problem may not lie with which color is dominant. You have selected colors that will show up often. You may be losing the bottom of your y-axis. The code in your example cannot possibly have produced that plot; it has errors.
Here is a simple example that shows one way to overcome your problem: simply overplot the threshold points after you have plotted the beeswarm.
library(dplyr)
library(ggplot2)  # loaded explicitly for scale_color_manual() and geom_point()
library(ggbeeswarm)
distro <- data.frame(
  'variable' = rep(c('runif', 'rnorm'), each = 1000),
  'value' = c(runif(2000, min = -3, max = 3))
)
distro$indicator <- "NA"
distro[3, 3] <- "Threshhold"
distro[163, 3] <- "Threshhold"
ggplot2::ggplot(distro, aes(variable, value, color = indicator)) +
  geom_quasirandom(groupOnX = TRUE, na.rm = TRUE, width = 0.1) +
  scale_color_manual(name = 'indicator', values = c("#99ccff", "#000000")) +
  # draw the threshold points again so they sit on top of the swarm
  geom_point(data = distro %>% filter(indicator == "Threshhold"))
You can sort your data by the color variable (your indicator).
Basically, you want your black dots to be plotted last, i.e. on top of the other ones.
# Base R: reorder whole rows; sorting only the indicator column would scramble the data
df <- df[order(df$indicator, decreasing = TRUE), ]
#Tidyverse solution
df <- df %>% arrange(desc(indicator))
Depending on your levels, you may or may not have to reverse the sort.
Then you just plot.
pd <- tibble(x=rnorm(1000), y=1, indicator=sample(c("A","B"), replace=T, size = 1000))
ggplot(pd, aes(x=x,y=y,color=indicator)) + geom_point()
pd <- pd %>% arrange(indicator)
ggplot(pd, aes(x=x,y=y,color=indicator)) + geom_point()
pd <- pd %>% arrange(desc(indicator))
ggplot(pd, aes(x=x,y=y,color=indicator)) + geom_point()

How to draw a barplot from counts data in R?

I have a data-frame 'x'
I want barplot like this
I tried
barplot(x$Value, names.arg = x$'Categorical variable')
ggplot(as.data.frame(x$Value), aes(x$'Categorical variable')
Nothing seems to work properly. In barplot, all axis labels (freq values) are different. ggplot is filling all bars to 100%.
You can try plotting with geom_bar(). The following code generates what you are looking for.
library(ggplot2)
df = data.frame(X = c("A", "B C", "D"), Y = c(23, 12, 43))
ggplot(df, aes(x = X, y = Y)) + geom_bar(stat = 'identity') + coord_flip()
It helps to read the ggplot documentation. ggplot requires a few things, including data and aes(). You've got both of those statements there but you're not using them correctly.
library(ggplot2)
library(dplyr)  # provides the %>% pipe used below
set.seed(256)
dat <-
  data.frame(variable = c("a", "b", "c"),
             value = rnorm(3, 10))
dat %>%
  ggplot(aes(x = variable, y = value)) +
  geom_bar(stat = "identity", fill = "blue") +
  coord_flip()
Here, I'm piping my dat to ggplot as the data argument and using the names of the x and y variables rather than passing a data$... value. Next, I add the geom_bar() statement, and I have to use stat = "identity" to tell ggplot to use the actual values in my value column rather than counting the number of rows for each variable.
You have to use stat = "identity" in geom_bar().
dat <- data.frame("cat" = c("A", "BC", "D"),
"val" = c(23, 12, 43))
ggplot(dat, aes(as.factor(cat), val)) +
geom_bar(stat = "identity") +
coord_flip()
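As a side note to the answers above, geom_col() is shorthand for geom_bar(stat = "identity"), so the same chart can be written a little more briefly. A minimal sketch, reusing the example values from the answers (the data frame name dat and its column names are illustrative):

```r
library(ggplot2)

# Pre-computed counts: one row per category
dat <- data.frame(cat = c("A", "B C", "D"),
                  val = c(23, 12, 43))

# geom_col() plots the values as given, with no counting step
p <- ggplot(dat, aes(x = cat, y = val)) +
  geom_col() +
  coord_flip()
# print(p) draws the plot
```

Plain geom_bar() is still the right choice when the data has one row per observation and ggplot should do the counting.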

Add labels above top axis in ggplot2 graph while keeping original x axis on bottom

I'm trying to add some labels to a ggplot2 boxplot to indicate the number of observations, and I'd like that annotation to appear above the top axis of the graph. I can add them inside the graph pretty easily, and I suspect there's an application of ggplot_gtable that might do this, but I don't understand how to use that (a point in the direction of a good tutorial would be much appreciated). Here's some example data with labels:
Count <- sample(100:500, 3)
MyData <- data.frame(Category = c(rep("A", Count[1]), rep("B", Count[2]),
                                  rep("C", Count[3])),
                     Value = c(rnorm(Count[1], 10),
                               rnorm(Count[2], 20),
                               rnorm(Count[3], 30)))
MyCounts <- data.frame(Category = c("A", "B", "C"),
                       Count = Count)
MyCounts$Label <- paste("n =", MyCounts$Count)
ggplot(MyData, aes(x = Category, y = Value)) +
  geom_boxplot() +
  annotate("text", x = MyCounts$Category, y = 35,
           label = MyCounts$Label)
What I'd love is for the "n = 441" and other labels to appear above the graph rather than just inside the upper boundary. Any suggestions?
Rather than separately calculating the counts, you can add the counts with geom_text and the original data frame (MyData). The key is that we need to add stat="count" inside geom_text so that counts will be calculated and can be used as the text labels.
theme_set(theme_classic())
ggplot(MyData, aes(x = Category, y = Value)) +
  geom_boxplot() +
  # after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(stat = "count", aes(label = paste0("n=", after_stat(count))),
            y = 1.05 * max(MyData$Value)) +
  expand_limits(y = 1.05 * max(MyData$Value))
To put the labels above the plot, add some space above the plot area for the text labels and then use the code in the answer linked by @aosmith to override clipping:
library(grid)
theme_set(theme_bw())
p = ggplot(MyData, aes(x = Category, y = Value)) +
  geom_boxplot() +
  # after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(stat = "count", aes(label = paste0("n=", after_stat(count))),
            y = 1.06 * max(MyData$Value), size = 5) +
  theme(plot.margin = margin(t = 20))
# Override clipping
gt <- ggplot_gtable(ggplot_build(p))
gt$layout$clip[gt$layout$name == "panel"] <- "off"
grid.draw(gt)
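In ggplot2 3.0.0 and later, clipping can also be switched off directly with coord_cartesian(clip = "off"), which avoids editing the gtable by hand. The sketch below is one possible variant, not the answer's method: it uses fixed group sizes instead of the random Count from the question, and anchors the labels at y = Inf (the top edge of the panel), nudging them upward with vjust.

```r
library(ggplot2)

set.seed(1)
# Fixed group sizes for reproducibility (the question samples them randomly)
MyData <- data.frame(
  Category = rep(c("A", "B", "C"), times = c(100, 150, 200)),
  Value    = c(rnorm(100, 10), rnorm(150, 20), rnorm(200, 30))
)

# One row per category with its count and label
MyCounts <- as.data.frame(table(Category = MyData$Category))
MyCounts$Label <- paste("n =", MyCounts$Freq)

p <- ggplot(MyData, aes(x = Category, y = Value)) +
  geom_boxplot() +
  # y = Inf pins the text to the top of the panel; negative vjust lifts it above
  geom_text(data = MyCounts, aes(y = Inf, label = Label), vjust = -0.5) +
  # allow drawing outside the panel area
  coord_cartesian(clip = "off") +
  # leave room in the top margin for the labels
  theme(plot.margin = margin(t = 20))
# print(p) draws the plot
```

The top margin still has to be large enough for the text, so the margin(t = 20) value may need adjusting for bigger font sizes.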

How to create a heatmap with continuous scale using ggplot2 in R

I have got a data frame with several 1000 rows in the form of
group = c("gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3")
pos = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
color = c(2,2,2,2,3,3,2,2,3,2,1,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,2,2)
df = data.frame(group, pos, color)
and would like to make a kind of heatmap in which one axis has a continuous scale (position). The color column is categorical. However, due to the large number of data points I want to use binning, i.e. treat it as a continuous variable.
This is more or less what the plot should look like:
I can't think of a way to create such a plot using ggplot2/R. I have tried several geometries, e.g. geom_point()
ggplot(data=df, aes(x=strain, y=pos, color=color)) +
geom_point() +
scale_colour_gradientn(colors=c("yellow", "black", "orange"))
Thanks for your help in advance.
Does this help you?
library(ggplot2)
group = c("gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3")
pos = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
color = c(2,2,2,2,3,3,2,2,3,2,1,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,2,2)
df = data.frame(group, pos, color)
ggplot(data = df, aes(x = group, y = pos)) + geom_tile(aes(fill = color))
Looks like this
Improved version with 3 color gradient if you like
library(scales)
ggplot(data = df, aes(x = group, y = pos)) +
  geom_tile(aes(fill = color)) +
  scale_fill_gradientn(colours = c("orange", "black", "yellow"),
                       values = rescale(c(1, 2, 3)),
                       guide = "colorbar")
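The question also mentions binning because of the large number of rows. One possible approach, sketched here under assumptions, is to average the color value within fixed-width position bins before drawing the tiles. The simulated data, the bin width of 50 and the names bin_width, bin and binned are all hypothetical choices for illustration.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for the real data: 3 groups x 1000 positions
df <- data.frame(
  group = rep(c("gr1", "gr2", "gr3"), each = 1000),
  pos   = rep(1:1000, times = 3),
  color = sample(1:3, 3000, replace = TRUE)
)

bin_width <- 50
# Assign each position to the centre of its bin (1-50 -> 25, 51-100 -> 75, ...)
df$bin <- (df$pos - 1) %/% bin_width * bin_width + bin_width / 2

# Mean color value per group and bin turns the categories into a continuous score
binned <- aggregate(color ~ group + bin, data = df, FUN = mean)

p <- ggplot(binned, aes(x = group, y = bin, fill = color)) +
  geom_tile(height = bin_width) +
  scale_fill_gradientn(colours = c("orange", "black", "yellow"))
# print(p) draws the plot
```

Averaging only makes sense if the category codes are ordered; for unordered categories, the share of one category per bin would be a more meaningful fill value.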

Plot discrete values with different color

Given a dataframe with discrete values,
d=data.frame(id=1:6, a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
I want to make a plot like
However, I want different colors for each layer, say red and green for "a" and yellow/blue for "b".
The idea is to reshape your data (define coordinates to draw the rectangles) in order to use geom_rect from ggplot:
library(ggplot2)
library(reshape2)
i = setNames(expand.grid(1:nrow(d),1:ncol(d[-1])),c('x1','y1'))
ggplot(cbind(i, melt(d, id.vars = 'id')),
       aes(xmin = x1, xmax = x1 + 1, ymin = y1, ymax = y1 + 1, color = variable, fill = value)) +
  geom_rect()
Try geom_tile(). But you need to reshape your data to get exactly the same figure as you presented.
df <- data.frame(id=factor(c(1:6)), a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
library(reshape2)
df <- melt(df, id.vars = "id")  # the melt() argument is id.vars and takes column names
library(ggplot2)
ggplot(aes(x = id, y = variable, fill = value), data = df) + geom_tile()
require("dplyr")
require("tidyr")
require("ggplot2")
d=data.frame(id=1:6, a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
ggplot(d %>% gather(type, value, a, b, c) %>% mutate(value = paste0(type, value)),
       aes(x = id, y = type)) +
  geom_tile(aes(fill = value), color = "white") +
  scale_fill_manual(values = c("forestgreen", "indianred", "lightgoldenrod1",
                               "royalblue", "plum1", "plum2", "plum3"))
First we use reshape2 to transform the data from wide to long. Then to get discrete values we use as.factor(value) and finally we use scale_fill_manual to assign the 5 different colours we need. In geom_tile we specify the colour of the tile borders.
library(reshape2)
library(ggplot2)
df <- data.frame(id=1:6, a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
df <- melt(df, id.vars=c("id"))
ggplot(df, aes(id, variable, fill = as.factor(value))) +
  geom_tile(colour = "white") +
  scale_fill_manual(values = c("lightblue", "steelblue2", "steelblue3", "steelblue4", "darkblue"), name = "Values") +
  # id is numeric, so use a continuous scale with one break per id
  scale_x_continuous(breaks = 1:6)
