ggplot2 non-linearly adjust scale with color - r

I have two questions:
Although this data ranges from -1 to 5, most values fall between -1 and 1. I was therefore wondering whether I could make the relationship between the result and the color non-linear (that is, more color change between -1 and 1, and less between 2 and 5). Is this possible in ggplot2?
How can I move the scale bar into the maps, say, in the bottom right position?
Here is my code:
library(ggplot2)
library(ggmap)
library(data.table)
map<-get_map(location='united states', zoom=4, maptype = "terrain",
source='google',color='bw')
ggmap(map) + geom_point(
  aes(x = longitude, y = latitude, colour = V1),
  data = plot.data, alpha = 0.3, na.rm = TRUE, show.legend = TRUE) +
  scale_color_gradient(low = "red", high = "red4", name = "Level")

You can use scale_colour_gradientn() and set its values argument. From ?scale_colour_gradientn: "if colours should not be evenly positioned along the gradient this vector gives the position (between 0 and 1) for each colour in the colours vector." For example:
ggplot(data = iris, aes(x = Species, y = Sepal.Width, colour = Sepal.Length)) +
geom_point() +
scale_colour_gradientn(colours = c("blue", "red", "orange"),
values = c(0, 0.1, 1))
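For data like that in the question (range -1 to 5, most values between -1 and 1), the positions can be computed rather than guessed. The following is a minimal sketch, assuming simulated data and an arbitrary four-colour palette; the data frame `d` and the anchor values in `anchors` are illustrative and not part of the original answer. scales::rescale() converts the anchors to the 0-1 positions that `values` expects.

```r
library(ggplot2)
library(scales)

set.seed(42)
# Simulated stand-in for the question's data: mostly between -1 and 1
d <- data.frame(x = runif(200), y = runif(200),
                v = c(runif(180, -1, 1), runif(20, 1, 5)))

# Anchors on the data scale; three of the four sit in [-1, 1], so most of
# the colour change happens in that interval
anchors <- c(-1, 0, 1, 5)
positions <- rescale(anchors, to = c(0, 1), from = c(-1, 5))

# One colour per anchor position
p <- ggplot(d, aes(x, y, colour = v)) +
  geom_point() +
  scale_colour_gradientn(colours = c("yellow", "orange", "red", "red4"),
                         values = positions,
                         limits = c(-1, 5))
```

Printing `p` draws the plot. Fixing `limits` to the full range keeps the colour positions stable even if a subset of the data does not reach -1 or 5.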
To change the position of the legend, have a look at legend.position and legend.justification in ?theme.

You could try this command:
+theme(legend.position = c(xxx, xxx))
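As a hedged illustration of placing the legend inside the panel at the bottom right, the sketch below uses the iris data rather than the map from the question. The coordinates are relative to the plot area, so c(1, 0) is the bottom-right corner. In ggplot2 3.5.0 and later a numeric legend.position is deprecated in favour of legend.position = "inside" together with legend.position.inside, so the code branches on the installed version.

```r
library(ggplot2)

p <- ggplot(iris, aes(Species, Sepal.Width, colour = Sepal.Length)) +
  geom_point()

# legend.justification = c(1, 0) anchors the legend's own bottom-right
# corner at the chosen position
if (utils::packageVersion("ggplot2") >= "3.5.0") {
  p_inside <- p + theme(legend.position = "inside",
                        legend.position.inside = c(1, 0),
                        legend.justification = c(1, 0))
} else {
  p_inside <- p + theme(legend.position = c(1, 0),
                        legend.justification = c(1, 0))
}
```

The same theme() call can be added to a ggmap() plot, since ggmap() returns a ggplot object.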

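Building upon beetroot's answer, one can write a short helper function that transforms values on the original scale to the 0-1 scale used for plotting.
gradient_setter <- function(x, low = NULL, mid = 0, high = NULL) {
  # Anchor points on the data scale: min, optional low, mid, optional high, max
  rn <- range(x, na.rm = TRUE)
  # Rescale the anchors to the 0-1 positions expected by `values`
  (c(rn[1], low, mid, high, rn[2]) - rn[1]) / (rn[2] - rn[1])
}
ggplot(data = iris, aes(x = Species, y = Sepal.Width, colour = Sepal.Length)) +
  geom_point() +
  # Four anchors (min, mid, high, max) need four colours; "yellow" is an
  # arbitrary choice added here so that the lengths match
  scale_colour_gradientn(colours = c("blue", "red", "orange", "yellow"),
                         values = gradient_setter(iris$Sepal.Length, mid = 5, high = 5.8))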

Related

How to fill shapes with color gradient in ggplot2

I am trying to fill shapes with a color gradient corresponding to a continuous variable. I don't get any errors, but the fill does not show up on the graph.
I tried as.numeric to make sure the variable is continuous.
The NMDS is already calculated and works. Everything works except the color gradient.
ggplot()+
geom_point(data = NMDS.all.taxa, aes(y = NMDS2, x = NMDS1, fill = env$elev.num), shape = env$saison, size = 4)+ #this is the points
geom_path(data = df_ell.all.taxa, aes(x = NMDS1, y = NMDS2, colour = group))+ ##Elipses
scale_fill_gradient2(low = "green", mid = "blue", high = "red", midpoint = 1800)+ ##fill
theme_bw()+
theme(panel.background = element_blank(),
panel.grid.major = element_blank(), #remove major-grid labels
panel.grid.minor = element_blank(), #remove minor-grid labels
plot.background = element_blank()
)
It gives me an NMDS plot with everything in place, except that the shapes are empty. There are no error messages.
To recreate aspects of the data:
library(vegan)
Macro_AUG_2019_rep.<- matrix(0:10, ncol = 20, nrow = 50)
env <- data.frame(Traitement = sample(c("n","r"),50, replace = TRUE),
saison = sample(c("S","A"),50, replace = TRUE),
Elevation = sample(1000:1049),
Site = sample(c("lake1","lake2","lake3","lake4","lake5","lake6","lake7"),
50, replace = TRUE))
spe.nmds <- metaMDS(Macro_AUG_2019_rep., distance='bray', k=2, try=999, maxit=500)
NMDS.all.taxa <- data.frame(NMDS1 = spe.nmds$points[,1],
NMDS2 = spe.nmds$points[,2],
group = env$Traitement,
sites = env$Site)
veganCovEllipse <- function(cov, center = c(0, 0), scale = 1.75, npoints = 100) {
theta <- (0:npoints) * 2 * pi/npoints
Circle <- cbind(cos(theta), sin(theta))
t(center + scale * t(Circle %*% chol(cov)))
}
df_ell.all.taxa <- data.frame()
for(g in unique(NMDS.all.taxa$group)){ # unique() also works when group is a character vector
df_ell.all.taxa <- rbind(
df_ell.all.taxa,
cbind(as.data.frame(with(NMDS.all.taxa[NMDS.all.taxa$group == g, ],
veganCovEllipse(cov.wt(cbind(NMDS1, NMDS2),
wt=rep(1 / length(NMDS1),
length(NMDS1)))$cov,
center=c(mean(NMDS1), mean(NMDS2))))),
group = g)
)
}
NMDS.mean.all.taxa = aggregate(NMDS.all.taxa[ ,c("NMDS1", "NMDS2")],
list(group = NMDS.all.taxa$group),
mean)
ggplot2 doesn't play nicely with the $ operator inside aes(), so it is good practice to avoid it by appending env$Elevation and env$saison to your main data.frame:
df <- cbind(NMDS.all.taxa, Elevation = env$Elevation, Saison = env$saison)
Now, if I understood correctly, the problem was that the geom_point shapes aren't filled. The shape aesthetic in ggplot2 is (I think) identical to the pch argument in base R plots, so its values mean the same as the pch values documented in ?points.
Factor variables are integers dressed up with a level-label, so the env$saison that you were using would pass down 1s and 2s as shapes. These shapes are line-only, without any fill associated with them.
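To check which shapes accept a fill, one can draw all of them with a fill colour set. This is a small sketch, not part of the original answer; the grid layout and the red fill are arbitrary choices. Only shapes 21 to 25 should show the fill, while the others ignore it.

```r
library(ggplot2)

# One row per shape code, laid out on a 9-column grid
shapes <- data.frame(shape = 0:25,
                     x = (0:25) %% 9,
                     y = -((0:25) %/% 9))

p_shapes <- ggplot(shapes, aes(x, y)) +
  # fill only has an effect on shapes 21-25
  geom_point(aes(shape = shape), fill = "red", colour = "black", size = 5) +
  geom_text(aes(label = shape), nudge_y = -0.3) +
  scale_shape_identity() +  # use the numbers directly as shape codes
  theme_void()
```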
Thus, to fix the problem we need to let ggplot know what shapes we actually want, 21 and 24 for example. To map your factor variable to these shapes, we'll define the shapes inside the aes() function. Then, we can use scale_shape_manual() to set the correct shapes.
# I defined the xy mapping in the main `ggplot()` call so that we
# don't need to do this separately for the path and points
ggplot(data = df, aes(NMDS1, NMDS2))+
geom_point(aes(fill = Elevation, shape = Saison),
size = 4) +
geom_path(data = df_ell.all.taxa,
aes(colour = group)) +
scale_shape_manual(values = c(21, 24)) +
scale_fill_gradient2(low = "green", mid = "blue", high = "red",
# I adjusted the midpoint to match example
midpoint = mean(df$Elevation))
This gave me a plot with filled point shapes.
As an aside, your problem could easily have been illustrated with a built-in dataset. That is more in line with what a minimal reproducible example is, and it would have saved you the time of copying your data analysis code. Example below:
ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
geom_point(aes(fill = Petal.Width), shape = iris$Species) +
scale_fill_gradient()
To which the answer would have been:
ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
geom_point(aes(fill = Petal.Width, shape = Species)) +
scale_fill_gradient() +
scale_shape_manual(values = c(21, 22, 24))

How to modify and add an extra legend in a ggplot2 figure

I have data that looks like this:
example.df <- as.data.frame(matrix( c("height","fruit",0.2,0.4,0.7,
"height","veggies",0.3,0.6,0.8,
"height","exercise",0.1,0.2,0.5,
"bmi","fruit",0.2,0.4,0.6,
"bmi","veggies",0.1,0.5,0.7,
"bmi","exercise",0.4,0.7,0.8,
"IQ","fruit",0.4,0.5,0.6,
"IQ","veggies",0.3,0.5,0.7,
"IQ","exercise",0.1,0.4,0.6),
nrow=9, ncol=5, byrow = TRUE))
colnames(example.df) <- c("phenotype","predictor","corr1","corr2","corr3")
So basically three different correlations between 3x3 variables. I want to visualize the increase in correlations as follows:
ggplot(example.df, aes(x=phenotype, y=corr1, yend=corr3, colour = predictor)) +
geom_linerange(aes(x = phenotype,
ymin = corr1, ymax = corr3,
colour = predictor),
position = position_dodge(width = 0.5))+
geom_point(size = 3,
aes(x = phenotype, y = corr1, colour = predictor),
position = position_dodge(width = 0.5), shape=4)+
geom_point(size = 3,
aes(x = phenotype, y = corr2, colour = predictor),
position = position_dodge(width = 0.5), shape=18)+
geom_point(size = 3,
aes(x = phenotype, y = corr3, colour = predictor),
position = position_dodge(width = 0.5))+
labs(x=NULL, y=NULL,
title="Stackoverflow Example Plot")+
scale_colour_manual(name="", values=c("#4682B4", "#698B69", "#FF6347"))+
theme_minimal()
This gives me the following plot:
Problems:
There is something wrong with the way the geom_point shapes are dodged for BMI and IQ. They should all sit on the line of the same colour, as they do for height.
How do I get an extra legend that can show what the circle, cross, and square represent? (i.e., the three different correlations shown on the line: cross = correlation 1, square = correlation 2, circle = correlation 3).
The legend now shows a line, a circle, and a cross on top of each other, whereas just a line for the predictors (exercise, fruit, veggies) would suffice.
Sorry for the multiple issues, but adding the extra legend (problem #2) is the most important one, and I would be already very satisfied if that could be solved, the rest is bonus! :)
See if the following works for you. The main idea is to convert the data frame from wide to long format for the geom_point layer, and to map the correlation to the shape aesthetic:
library(ggplot2)
library(magrittr) # provides %>%
# The matrix() call in the question stores every column as text, so the
# correlation columns are converted to numeric first
example.df[3:5] <- lapply(example.df[3:5], function(col) as.numeric(as.character(col)))
example.df %>%
  ggplot(aes(x = phenotype, color = predictor, group = predictor)) +
  geom_linerange(aes(ymin = corr1, ymax = corr3),
                 position = position_dodge(width = 0.5)) +
  # `. %>% ...` builds a function that reshapes the plot data for this layer only
  geom_point(data = . %>% tidyr::gather(corr, value, -phenotype, -predictor),
             aes(y = value, shape = corr),
             size = 3,
             position = position_dodge(width = 0.5)) +
  scale_color_manual(values = c("#4682B4", "#698B69", "#FF6347")) +
  scale_shape_manual(values = c(4, 18, 16),
                     labels = paste("correlation", 1:3)) +
  labs(x = NULL, y = NULL, color = "", shape = "") +
  theme_minimal()
Note: The colour legend is based on both geom_linerange and geom_point, hence the legend keys include both a line and a point shape. While it's possible to get rid of the second one, it does take some more convoluted code, and I don't think the plot would be much improved as a result...
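The wide-to-long step can also be looked at on its own. The tidyr documentation describes gather() as superseded by pivot_longer(), so the sketch below uses the latter on a two-row stand-in for the question's data; the data frame `wide` is made up for illustration.

```r
library(tidyr)

# Two-row stand-in for example.df, with numeric correlation columns
wide <- data.frame(phenotype = c("height", "bmi"),
                   predictor = c("fruit", "fruit"),
                   corr1 = c(0.2, 0.2),
                   corr2 = c(0.4, 0.4),
                   corr3 = c(0.7, 0.6))

# One row per phenotype/predictor/correlation combination
long <- pivot_longer(wide,
                     cols = starts_with("corr"),
                     names_to = "corr",
                     values_to = "value")
```

The resulting `corr` column is what gets mapped to shape, which is why a single geom_point layer can produce a shape legend.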

Change transparency, shape and size of a categorical variables

I am trying to plot using ggplot and trying to set the transparency, size and shape for geom_point using a binary variable in my dataset.
For example, if binary_variable == 1 then set the size to 1, shape = triangle, transparency = 0.2, if binary_variable == 0 set the size to 0.5 etc.
I have been able to make the colour change as follows:
library(ggplot2)
df <- data.frame(variable1 = 1:5,
variable2 = 1:5,
binary = c(0,0,0,1,1))
ggplot(df, aes(x = variable1, y = variable2, colour = as.factor(binary))) +
geom_point(size = 2, alpha = 0.3) +
scale_colour_manual(values = c("grey", "black"), labels = c("cat1", "cat2")) +
theme_bw()
You can control shape, colour, and other aesthetics in the same way using the scale_X_manual functions (scale_shape_manual, scale_alpha_manual, and so on). See the help page for all the different ways these can be controlled.
The key to making this work is to map the variable you want to control inside the aes() part of the ggplot call.
Here is an example:
df$binary <- as.factor(df$binary)
ggplot(df, aes(x = variable1, y = variable2, colour = binary, shape = binary, alpha = binary)) +
geom_point(size = 2) +
scale_colour_manual(values = c("blue", "red")) +
scale_shape_manual(values=c(16,17)) +
scale_alpha_manual(values=c(1, 0.5)) +
theme_bw()
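The question also asks about size, which the example above leaves fixed. A possible extension, assuming the values stated in the question (size 1 and transparency 0.2 for binary == 1, size 0.5 for binary == 0), could look like this; the remaining values are arbitrary. Naming the entries of `values` by factor level avoids relying on level order.

```r
library(ggplot2)

df <- data.frame(variable1 = 1:5,
                 variable2 = 1:5,
                 binary = factor(c(0, 0, 0, 1, 1)))

# Every aesthetic that should vary by group is mapped inside aes()
p_bin <- ggplot(df, aes(variable1, variable2,
                        shape = binary, size = binary, alpha = binary)) +
  geom_point() +
  scale_shape_manual(values = c("0" = 16, "1" = 17)) +  # circle, triangle
  scale_size_manual(values = c("0" = 0.5, "1" = 1)) +
  scale_alpha_manual(values = c("0" = 1, "1" = 0.2)) +
  theme_bw()
```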

How do I add a legend identifying the different geom_points in RStudio

I need help: I cannot seem to add a legend to the following ggplot code in RStudio:
ggplot(Report_Data,
aes(x=Report_Data$Transect Point), show.legend = TRUE) +
geom_point(aes(y=Report_Data$Q1North),
shape = 6, size = 5, colour = label , show.legend = TRUE) +
geom_point(aes(y=Report_Data$Q1South),
shape = 4, size = 5, colour = label, show.legend = TRUE)+
labs(title="Density of Trees Species found North & South of the creek using two sampling methods",
y="Density in Tree Species Found", x="Transect Points",caption = "n7180853")+
geom_line(aes(y=Report_Data$Q1North, colour = Q1North),
colour = "green", size = 1, show.legend = TRUE)+
geom_line(aes(y=Report_Data$Q1South, color = Q1South),
colour = "pink4", size = 1, show.legend = TRUE)+ theme(legend.position = "right")
This is hard to replicate without your data, but if you want a legend for point shape, you need to map shape inside aes(...) for that layer. The same goes for colour or size. Then add scale_shape_manual(...) or scale_colour_manual(...) as needed to specify the values.
Here's a toy example using the default diamonds dataset.
library(ggplot2)
data("diamonds")
ggplot(diamonds, aes(x = carat)) +
geom_point(data = diamonds[diamonds$cut == "Ideal",],
aes(y= price, shape = cut)) +
geom_point(data = diamonds[diamonds$cut == "Good",],
aes(y = price, shape = cut)) +
scale_shape_manual(values = c(6,4))
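Closer to the question's setup, the two measurement columns can be stacked into one column with a grouping variable, so that a single mapping produces the legend. This is a sketch under assumptions: the data frame `report` and its column names are hypothetical stand-ins for Report_Data, and the shapes and colours are the ones used in the question.

```r
library(ggplot2)

# Hypothetical long-format stand-in for Report_Data: one row per
# transect point and side of the creek
report <- data.frame(point = rep(1:5, 2),
                     side = rep(c("North", "South"), each = 5),
                     density = c(3, 5, 4, 6, 7, 2, 4, 3, 5, 4))

# Mapping colour and shape to `side` inside aes() creates the legend
p_report <- ggplot(report, aes(point, density, colour = side, shape = side)) +
  geom_line() +
  geom_point(size = 5) +
  scale_shape_manual(values = c(North = 6, South = 4)) +
  scale_colour_manual(values = c(North = "green", South = "pink4")) +
  labs(x = "Transect Points", y = "Density in Tree Species Found")
```

Because colour and shape are mapped to the same variable, ggplot2 combines them into one legend.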

How to merge legends for color and shape when geom_hline has a separate (additional) entry in the color legend?

I have the following code, which produces the following plot:
cols <- brewer.pal(n = 3, name = 'Dark2')
p4 <- ggplot(all.m, aes(x=xval, y=yval, colour = Approach, ymax = 0.95)) + theme_bw() +
geom_errorbar(aes(ymin= yval - se, ymax = yval + se), width=5, position=pd) +
geom_line(position=pd) +
geom_point(aes(shape=Approach, colour = Approach), size = 4) +
geom_hline(aes(yintercept = cp.best$slope, colour = "C2P"), show_guide = FALSE) +
scale_color_manual(name="Approach", breaks=c("C2P", "P2P", "CP2P"), values = cols[c(1,3,2)]) +
scale_y_continuous(breaks = seq(0.4, 0.95, 0.05), "Test AUROC") +
scale_x_continuous(breaks = seq(10, 150, by = 20), "# Number of Patient Samples in Training")
p4 <- p4 + theme(legend.direction = 'horizontal',
legend.position = 'top',
plot.margin = unit(c(5.1, 7, 4.5, 3.5)/2, "lines"),
text = element_text(size=15), axis.title.x=element_text(vjust=-1.5), axis.title.y=element_text(vjust=2))
p4 <- p4 + guides(colour=guide_legend(override.aes=list(shape=c(NA,17,16))))
p4
When I try show_guide = FALSE in geom_point, the point shapes in the upper legend are all reset to the default solid circles.
How can I make the lower legend disappear without affecting the upper legend?
This is a solution, complete with reproducible data:
library("ggplot2")
library("grid")
library("RColorBrewer")
cp2p <- data.frame(xval = 10 * 2:15, yval = cumsum(c(0.55, rnorm(13, 0.01, 0.005))), Approach = "CP2P", stringsAsFactors = FALSE)
p2p <- data.frame(xval = 10 * 1:15, yval = cumsum(c(0.7, rnorm(14, 0.01, 0.005))), Approach = "P2P", stringsAsFactors = FALSE)
pd <- position_dodge(0.1)
cp.best <- list(slope = 0.65)
all.m <- rbind(p2p, cp2p)
all.m$Approach <- factor(all.m$Approach, levels = c("C2P", "P2P", "CP2P"))
all.m$se <- rnorm(29, 0.1, 0.02)
all.m[nrow(all.m) + 1, ] <- all.m[nrow(all.m) + 1, ] # Creates a new row filled with NAs
all.m$Approach[nrow(all.m)] <- "C2P"
cols <- brewer.pal(n = 3, name = 'Dark2')
p4 <- ggplot(all.m, aes(x=xval, y=yval, colour = Approach, ymax = 0.95)) + theme_bw() +
geom_errorbar(aes(ymin= yval - se, ymax = yval + se), width=5, position=pd) +
geom_line(position=pd) +
geom_point(aes(shape=Approach, colour = Approach), size = 4, na.rm = TRUE) +
geom_hline(aes(yintercept = cp.best$slope, colour = "C2P")) +
scale_color_manual(values = c(C2P = cols[1], P2P = cols[2], CP2P = cols[3])) +
scale_shape_manual(values = c(C2P = NA, P2P = 16, CP2P = 17)) +
scale_y_continuous(breaks = seq(0.4, 0.95, 0.05), "Test AUROC") +
scale_x_continuous(breaks = seq(10, 150, by = 20), "# Number of Patient Samples in Training")
p4 <- p4 + theme(legend.direction = 'horizontal',
legend.position = 'top',
plot.margin = unit(c(5.1, 7, 4.5, 3.5)/2, "lines"),
text = element_text(size=15), axis.title.x=element_text(vjust=-1.5), axis.title.y=element_text(vjust=2))
p4
The trick is to make sure that all of the desired levels of all.m$Approach appear in all.m, even if one of them gets dropped out of the graph. The warning about the omitted point is suppressed by the na.rm = TRUE argument to geom_point.
Short answer:
Just add a dummy geom_point layer (transparent points) where shape is mapped to the same level as in geom_hline.
geom_point(aes(shape = "int"), alpha = 0)
Longer answer:
Whenever possible, ggplot merges / combines legends of different aesthetics. For example, if colour and shape are mapped to the same variable, then the two legends are combined into one.
I illustrate this using a simple data set with 'x', 'y' and a grouping variable 'grp' with two levels:
df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))
First we map both color and shape to 'grp'
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4)
Fine, the legends for the aesthetics, color and shape, are merged into one.
Then we add a geom_hline. We want it to have a separate color from the geom_lines and to appear in the legend. Thus, we map color to a variable, i.e. put color inside aes of geom_hline. In this case we do not map the color to a variable in the data set, but to a constant. We may give the constant a desired name, so we don't need to rename the legend entries afterwards.
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4) +
geom_hline(aes(yintercept = 2.5, color = "int"))
Now two legends appear, one for the color aesthetic of geom_line and geom_hline, and one for the shape of the geom_points. The reason is that the "variable" that color is mapped to now contains three levels: the two levels of 'grp' in the original data, plus the level 'int' which was introduced in the geom_hline aes. Thus, the levels in the color scale differ from those in the shape scale, and by default ggplot can't merge the two scales into one legend.
How to combine the two legends?
One possibility is to introduce the same, additional level for shape as for color by using a dummy geom_point layer with transparent points (alpha = 0) so that the two aesthetics contain the same levels:
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4) +
geom_hline(aes(yintercept = 2.5, color = "int")) +
geom_point(aes(shape = "int"), alpha = 0) # <~~~~ a blank geom_point
Another possibility is to convert the original grouping variable to a factor, and add the "geom_hline level" to the original levels. Then use drop = FALSE in scale_shape_discrete to include "unused factor levels from the scale":
df$grp <- factor(df$grp, levels = c(unique(as.character(df$grp)), "int"))
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4) +
geom_hline(aes(yintercept = 2.5, color = "int")) +
scale_shape_discrete(drop = FALSE)
Then, as you already know, you may use the guides function to "override" the shape aesthetics in the legend, and remove the shape from the geom_hline entry by setting it to NA:
guides(colour = guide_legend(override.aes = list(shape = c(16, 17, NA))))
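Putting the pieces of the second approach together gives the following self-contained sketch. It only combines the steps described above (extra factor level, drop = FALSE, and the override in guides()); the shape codes 16 and 17 are the ones from the answer's guides() call.

```r
library(ggplot2)

df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))

# Add the level used by geom_hline to the grouping factor
df$grp <- factor(df$grp, levels = c("a", "b", "int"))

p_merged <- ggplot(df, aes(x = x, y = y, color = grp, shape = grp)) +
  geom_line() +
  geom_point(size = 4) +
  geom_hline(aes(yintercept = 2.5, color = "int")) +
  # Keep the unused "int" level in the shape scale so both scales share levels
  scale_shape_discrete(drop = FALSE) +
  # Remove the point symbol from the "int" legend entry
  guides(colour = guide_legend(override.aes = list(shape = c(16, 17, NA))))
```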
