ggplot red-white gradient appears orange - r

I've created a heat map in ggplot and I'm using a default red to white scale_fill_gradient. When I plot the gradient however, it appears to have an orange tint. I was wondering if there was a way to edit how ggplot calculates the gradient so that I get pink intermediates so that it matches the other figures I'm making. Related to this graph, though I can post separately, I want to make my NA values have a striped pattern (rather than grey fill). I haven't found a way to do that in ggplot.
p <- ggplot(heat.data, aes(x = MutResFac, y = mutAA, fill = disruption)) +
geom_tile(color="grey50") +
scale_fill_gradient(name = "Disruption", limits = c(0,3),
labels = c("WT-like", "Mild", "Moderate", "Severe"),
low = "#FFFFFF", high = "#FF0000",
guide = "legend") +
theme(axis.text.x = element_text(size = 25, hjust = 1),
text = element_text(size = 25),
plot.title = element_text(hjust = 0.5)) +
coord_fixed() +
labs(x = "Position", y = "Mutant Amino Acid") +
ggtitle(title) +
scale_x_discrete(labels = wildTypeSeqLabels)

The simple answer as to why your middle range of colors from red to white comes out orange is that the in-between color value from #FF0000 and white is orange. The better practical color theory answer is that #FF0000 is a yellowish or orange-ish red, and not a blueish red, so the middle color will be orangeish. The most technical answer is explained in the technical documentation for the gradient color scale functions. The scale_..._gradient functions create the gradient based on keeping hue constant and varying chroma and luminance.
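To see the effect in isolation, here is a minimal sketch that compares the midpoint of a white-to-red gradient under two interpolation schemes. It assumes the scales package is installed (it comes with ggplot2) and that seq_gradient_pal() with space = "Lab" reflects what scale_fill_gradient() does by default; the variable names are made up for illustration.

```r
library(scales)

low  <- "#FFFFFF"
high <- "#FF0000"

# Midpoint from the Lab-space interpolation (assumed to match the
# default behaviour of scale_fill_gradient())
mid_lab <- seq_gradient_pal(low, high, space = "Lab")(0.5)

# Midpoint from a plain RGB interpolation, for comparison
mid_rgb <- colorRampPalette(c(low, high), space = "rgb")(3)[2]

# Draw the endpoints and both midpoints as swatches
show_col(c(low, mid_lab, mid_rgb, high))
```

The two middle swatches should differ visibly, with the RGB one reading as pinker.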
Options to fix scale
For this, I'll use an example dataset and plot with the base code here:
set.seed(1111)
df <- data.frame(x=1:10, y=rnorm(10,1,0.2), values=1:10)
p <- ggplot(df, aes(x,y, fill=values)) + geom_col(width=0.7)
You can consider one of the following:
Select a more blue-ish red to start with:
p + scale_fill_gradient(low='#FFFFFF', high='#FE43AF')
Select your scale more discretely using scale_fill_gradientn:
pal <- c('#F8F6F6', '#F4AABC', '#E83D7D')
p + scale_fill_gradientn(colours = pal)
Incidentally, the function choose_color() from the colorspace package is a pretty useful little tool (requires shiny and shinyjs to be installed, as it's a shiny app) that can be helpful for choosing colors. This guide may also come in handy. scale_fill_gradientn and use of the choose_color() to set your palette gives you complete control, but it has a tendency to make the transition kind of odd: this is why I like the first gradient shown above vs. the second.
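A further variation, offered only as a sketch and not part of the options above: if the goal is specifically the pink intermediates of a white-to-red blend, you could generate the stops in RGB space with colorRampPalette() and hand them to scale_fill_gradientn(). Between adjacent stops ggplot2 still does its own interpolation, so the result is an approximation. The number of stops (11) is an arbitrary choice.

```r
library(ggplot2)

set.seed(1111)
df <- data.frame(x = 1:10, y = rnorm(10, 1, 0.2), values = 1:10)

# 11 stops interpolated in RGB space between white and pure red
pal_rgb <- colorRampPalette(c("#FFFFFF", "#FF0000"), space = "rgb")(11)

p_rgb <- ggplot(df, aes(x, y, fill = values)) +
  geom_col(width = 0.7) +
  scale_fill_gradientn(colours = pal_rgb)
p_rgb
```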
Patterned fills
Currently, there are no good, simple methods to create patterned fill in ggplot to my knowledge. Could be useful to ask a question, but the best seems to be this answer from SO. Not really too satisfactory, IMO, but it seems at the moment to be the best that can be done.
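As a rough workaround using only ggplot2, you can draw the stripes yourself: fill the NA tiles with a plain color and overlay diagonal line segments on them. This is a sketch on made-up data (tiles, na_tiles and hatch are hypothetical names), and it assumes numeric tile positions with tiles of width and height 1. With a discrete axis you would first need the integer position of each level.

```r
library(ggplot2)

set.seed(1)
tiles <- expand.grid(x = 1:5, y = 1:4)
tiles$value <- runif(nrow(tiles), 0, 3)
tiles$value[c(3, 8, 14)] <- NA  # pretend these cells are missing

na_tiles <- tiles[is.na(tiles$value), ]

# Three parallel diagonal lines per NA tile, clipped to the unit square
hatch <- rbind(
  data.frame(x = na_tiles$x - 0.5, y = na_tiles$y - 0.5,
             xend = na_tiles$x + 0.5, yend = na_tiles$y + 0.5),
  data.frame(x = na_tiles$x - 0.5, y = na_tiles$y,
             xend = na_tiles$x, yend = na_tiles$y + 0.5),
  data.frame(x = na_tiles$x, y = na_tiles$y - 0.5,
             xend = na_tiles$x + 0.5, yend = na_tiles$y)
)

p_hatch <- ggplot(tiles, aes(x, y)) +
  geom_tile(aes(fill = value), color = "grey50") +
  geom_segment(data = hatch,
               aes(x = x, y = y, xend = xend, yend = yend),
               color = "grey40", inherit.aes = FALSE) +
  scale_fill_gradient(low = "#FFFFFF", high = "#FE43AF", na.value = "white") +
  coord_fixed()
p_hatch
```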

Related

How to add legend to ggplot loadings plot?

So I created a loadings plot via arrow style using ggplot command. In order to make things easier for graphing, I added a column into the dataframe of my rr.pr$rotation code with colours so that it graphs those arrows based on the colour I specified. The colours that match the arrows are important which is why I did it that way. I am having trouble now adding a legend as ggplot isn't adding a legend.
Is there a way to add one or do I have to do something to the dataframe?
I was thinking of adding the colours manually, but I am getting stuck.
Green represents Sulfated, Orange represents Sialylated, and Brown represents Neutral. And I would like the legend to show that.
Here is the code:
Dataframe
rrload<-data.frame(rr.pr$rotation[c(2,15,17,24,52),c(1:5)])
rrload$class<-c('orange','springgreen3','bisque3','bisque3','bisque3')
rrload1<-rrload[,c(1:5)]
rrload1<-as.numeric(as.matrix(rrload1))
rrload1<-matrix(rrload1,nrow=5,ncol=5,byrow = F)
rrload[,c(1:5)]<-rrload1
Code for plotting it:
ggplot(rrload)+geom_segment(aes(xend=PC1,yend=PC2),x=0,y=0,arrow = arrowstyle2,color=rrload$class)+
geom_text(aes(x=PC1,y=PC2,label=row.names(rrload)),hjust=0,nudge_x = -0.05,vjust=1,nudge_y = 0.025,size=3.5,color='black')+xlim(-0.3,0.3)+ylim(-0.3,0.3)+theme_light()+
theme_minimal()+theme(legend.title = element_text("Class"),axis.text.x = element_text(colour = "black",size = 10),axis.text.y = element_text(colour = "black",size = 10),axis.title.x = element_text(colour = "black",size = 10),axis.title.y = element_text(colour = "black",size = 10),axis.ticks = element_line(color = "black"),panel.grid = element_blank(), panel.border = element_rect(colour = "black",fill = NA,size = 1))+geom_hline(yintercept = 0,linetype="dashed",color="gray69")+geom_vline(xintercept = 0,linetype="dashed",color="gray69")
This is the graph:
Loadings plot
Without access to your full data (your code is unable to recreate the dataframe, rrload properly), it's hard to help. I managed to estimate the numbers based on the plot you shared. Here's the dataframe I used - note the naming conventions for the columns:
d <- data.frame(
PC1=c(-0.2,-0.2,0.1,0.15,-0.08),
PC2=c(0.13,-0.1,0.2,0.1,-0.2300),
class=c('Neutral','Neutral','Neutral','Sulfated','Silylated'),
name=c('o53','o18','o25','o15','o2')
)
To prepare the data for plotting, I included d$name and d$class. d$class is similar to the column you had, although instead of the color, I'm using the actual name. d$name is the name that I'm using to plot your labels.
Here's the code I used and resulting plot. Explanation will come after:
library(ggplot2)
library(ggrepel)
ggplot(d) + theme_classic() +
geom_vline(xintercept=0, linetype=2, color='gray60') +
geom_hline(yintercept=0, linetype=2, color='gray60') +
geom_segment(
aes(xend=PC1,yend=PC2, color=class), x=0,y=0,
arrow=arrow(type='closed', angle=20, length=unit(0.02,'npc'))
) +
geom_text_repel(
aes(x=PC1, y=PC2, label=name), force=6, min.segment.length = 10, seed=123
) +
ylim(-0.3,0.3) + xlim(-0.3,0.3) +
scale_color_manual(
name='Legend Title',
values=c('Neutral'='bisque3','Sulfated'='springgreen3','Silylated'='orange'))
ggplot2 will create a legend for certain aesthetics, but they must be placed within aes(). Once you do that, ggplot2 will create the legend and automatically assign colors. This means that if we want to create a legend for color=, you need to put it within aes(). The interesting part is that you can put it within aes() anywhere in the call, or just apply to specific geom/geoms. This allows a lot of flexibility in creating your plot. In this case, I only want to color the arrows, so you include color=class within the geom_segment() call. If you put it within the ggplot() call, it would color both the line segment as well as the text geom.
I'm also paying attention to the ordering. We want to make sure the background dotted lines for the central axis at 0,0 are "behind" everything, so they go first. Then the segments, and then the text geom.
The scale_color_manual() function is used to specify the colors for the different d$class values explicitly and the name of the legend. You can also just let ggplot2 find a palette by default, or you can specify via a palette (there are a ton of other methods to specify color). BTW - you can also specify the name of the legend via labs(color = ...).
Finally, I decided to use geom_text_repel() rather than geom_text(). Since the lines go out in every direction, the "nudge" values for each text item are not going to work going in the same direction. In other words, if you plot the text at x=PC1, y=PC2, it will overlap the arrowheads. You noticed this too and applied nudge_ values, which happens to work, but if your data was a bit different, it would not have worked. geom_text_repel from the ggrepel package can work to do this by kind of "pushing" the text away from your points.
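To make the aes() point concrete, here is a stripped-down sketch on a few made-up rows (the values are illustrative only). The first plot sets the color outside aes(), so the arrows are colored but no legend appears. The second maps color to class inside aes(), which is what produces the legend.

```r
library(ggplot2)

d_small <- data.frame(
  PC1 = c(-0.2, 0.1, 0.15),
  PC2 = c(0.13, 0.2, 0.1),
  class = c("Neutral", "Neutral", "Sulfated")
)

# Colour set as a constant outside aes(): drawn, but no legend
no_legend <- ggplot(d_small) +
  geom_segment(aes(xend = PC1, yend = PC2), x = 0, y = 0,
               color = c("bisque3", "bisque3", "springgreen3"))

# Colour mapped inside aes(): ggplot2 builds a legend for it
with_legend <- ggplot(d_small) +
  geom_segment(aes(xend = PC1, yend = PC2, color = class), x = 0, y = 0) +
  scale_color_manual(values = c(Neutral = "bisque3", Sulfated = "springgreen3")) +
  labs(color = "Class")

no_legend
with_legend
```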

How to specify a single colour for rasters with tmap?

I would like to adjust the basic colour of a raster plotted with tmap, when there is only one value in the raster.
Here is a very simple reproducible example:
library(raster)
library(tmap)
a <- raster(matrix(sample(c(1, NA), 10000, replace = TRUE, prob = c(0.01, 0.99)), nrow = 100, ncol = 100))
tm_shape(a) +
tm_raster()
You can see that the default yellow colour is barely visible to the human eye. Hence, when drawing a map where you only have a few pixels, it is extremely difficult to locate the pixels that have values.
Unfortunately, I was not able to change this colour after multiple attempts. I think this issue may be encountered by other users, thus if a simple answer arises here it might be very helpful.
Unsuccessful attempts:
tm_shape(a) +
tm_raster(col = "black")
tm_shape(a) +
tm_raster(palette = "RdBu")
Note: for this one, I expected either a Red or a Blue colour to show up. Not grey... I tried to adjust midpoints as well but nothing changed.
tm_shape(a) +
tm_raster() +
tm_layout(aes.color= c(fill = "black"))
Apparently, when you specify only col =, it colors the whole raster in one color. So I guess you have to choose the layer that holds the values, and then provide an argument to palette = as explained in the documentation.
This is how I got it to work:
tm_shape(a) +
tm_raster(col = "layer", palette = "black")

How do I change the fill color for a computed variable in geom_bar

I am trying to change the default fill color from blue to green or red.
Here is the code I am using
Top_pos<- ggplot(Top_10, aes(x=reorder(Term,Cs), y=Cs, fill=pvalue)) +
geom_bar(stat = "identity", colour="black") + coord_flip()
Using the above code, I get the following image. I have no problem with this data but I do not know how to change the fill color.
It's easy to confuse scaling the color and scaling the fill. In the case of geom_bar/geom_col, color changes the borders around the bars while fill changes the colors inside the bars.
You already have the code that's necessary to scale fill color by value: aes(fill = pvalue). The part you're missing is a scale_fill_* command. There are several options; some of the more common for continuous scales are scale_fill_gradient or scale_fill_distiller. Some packages also export palettes and scale functions to make it easy to use them, such as the last example which uses a scale from the rcartocolor package.
scale_fill_gradient lets you set endpoints for a gradient; scale_fill_gradient2 adds a midpoint color for a diverging gradient, and scale_fill_gradientn lets you supply any number of colors along the gradient.
scale_fill_distiller interpolates ColorBrewer palettes, which were designed for discrete data, into a continuous scale.
library(tidyverse)
set.seed(1234)
Top_10 <- tibble(
Term = letters[1:10],
Cs = runif(10),
pvalue = rnorm(10, mean = 0.05, sd = 0.005)
)
plt <- ggplot(Top_10, aes(x = reorder(Term, Cs), y = Cs, fill = pvalue)) +
geom_col(color = "black") +
coord_flip()
plt + scale_fill_gradient(low = "white", high = "purple")
plt + scale_fill_distiller(palette = "Greens")
plt + rcartocolor::scale_fill_carto_c(palette = "Sunset")
Created on 2018-05-05 by the reprex package (v0.2.0).
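For the two gradient variants not shown in the reprex, a short sketch on the same kind of simulated data follows. The colors and the midpoint of 0.05 are arbitrary choices for illustration, and a plain data.frame is used so that only ggplot2 is needed.

```r
library(ggplot2)

set.seed(1234)
Top_10 <- data.frame(
  Term = letters[1:10],
  Cs = runif(10),
  pvalue = rnorm(10, mean = 0.05, sd = 0.005)
)

plt <- ggplot(Top_10, aes(x = reorder(Term, Cs), y = Cs, fill = pvalue)) +
  geom_col(color = "black") +
  coord_flip()

# Diverging gradient: green below the midpoint, red above it
plt_div <- plt + scale_fill_gradient2(low = "darkgreen", mid = "white",
                                      high = "red", midpoint = 0.05)

# Gradient through an arbitrary list of colours
plt_n <- plt + scale_fill_gradientn(colours = c("white", "lightgreen", "darkgreen"))

plt_div
plt_n
```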
Personally, I'm a fan of R Color Brewer. It's got a set of built-in palettes that play well together for qualitative, sequential or diverging data types. Check out colorbrewer2.org for some examples on real-ish data
More generally, and for how to actually code it, you can add a scale_fill_manual() layer when the fill variable is discrete. For a continuous variable such as pvalue, there are some built-ins in ggplot2 for gradients (examples here)
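Since scale_fill_manual() expects a discrete fill variable, one way to use it with p-values is to bin them first. The sketch below does that on simulated data; the 0.05 cut-off, the signif column and the colors are assumptions made up for the example.

```r
library(ggplot2)

set.seed(1234)
Top_10 <- data.frame(
  Term = letters[1:10],
  Cs = runif(10),
  pvalue = rnorm(10, mean = 0.05, sd = 0.005)
)

# Turn the continuous p-value into a two-level label
Top_10$signif <- ifelse(Top_10$pvalue < 0.05, "p < 0.05", "p >= 0.05")

plt_manual <- ggplot(Top_10, aes(x = reorder(Term, Cs), y = Cs, fill = signif)) +
  geom_col(color = "black") +
  coord_flip() +
  scale_fill_manual(values = c("p < 0.05" = "red", "p >= 0.05" = "green"))
plt_manual
```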

ggplot: combining size and color in legend

I've only very recently started learning R. Now what I'm trying to do is to integrate two legends for the same plot. In other words, I want the default size legend to change color depending on its size.
I have been Googling several solutions that apparently all don't seem to work, but again, I'm new to R so maybe I'm just doing something wrong.
My code:
ggplot(Caschool, aes(x=testscr, y=avginc), colour="green") +
geom_point(aes(size=enrltot, color=enrltot)) +
geom_smooth(colour="blue") +
labs(x="Test Score", y="Average Income", title="California Test Score Data", color="Number of Students\nPer District") +
theme(
panel.grid.minor = element_blank(),
panel.grid.major=element_line(colour="grey", size=0.4),
panel.background=element_rect(fill="beige"),
axis.line=element_line(size = 1.2, colour = "black"),
plot.title = element_text(size = rel(2))) +
scale_color_continuous(limits=c(0, 30000), breaks=seq(0,30000, by=2500)) +
guides(color= guide_legend(), size=guide_legend())
Apparently, I'm not allowed to post pictures, or I would have shown what this looks like so far.
ggplot2 can indeed combine size and colour legends into one, however, this only works, if they are compatible: they need to have exactly the same breaks, otherwise they can not be combined.
Let me give an example: Assume you have values between 0 and 10 that you want to map to size and colour. You tell ggplot2 to use small points for values below 5 and large points for larger values. It will then plot a legend with a small and a large point, as expected. Now, you also want to add colour and you require points below 3 to be green and points above to be blue. ggplot2 will also draw a legend for this, but it is impossible to combine the two legends. The small point would have to be both green and blue. The problem can be solved by using the same breaks for colour and size.
In your example, you manually change the breaks of the colour scale, but not those of the size scale. This results in incompatible legends that can not be combined.
I can not demonstrate this using your data, because I don't have it. So I will create an example with mtcars. The variant with incompatible legends is constructed as follows:
p <- ggplot(mtcars, aes(x=mpg, y=drat)) +
geom_point(aes(size=gear, color=gear)) +
scale_color_continuous(limits=c(2, 5), breaks=seq(2, 5, by=0.5)) +
guides(color= guide_legend(), size=guide_legend())
which gives the following plot:
If I now add the same breaks for size,
p + scale_size_continuous(limits=c(2, 5), breaks=seq(2, 5, by=0.5))
I get a plot with only one legend:
For your code, this means that you should add the following to your plot:
+ scale_size_continuous(limits=c(0, 30000), breaks=seq(0,30000, by=2500))
A little side remark: What do you intend by using colour = "green" in your call to ggplot? I don't see that this has any effect at all, because you set the colour again in both geoms that you use later. Maybe a relic from an older variant of the plot?

Gradient of n colors ranging from color 1 and color 2

I often work with ggplot2, which makes nice gradients (click here for an example). I have a need to work in base and I think scales can be used there to create color gradients as well, but I'm severely off the mark on how. The basic goal is to generate a palette of n colors that ranges from x color to y color. The solution needs to work in base though. This was a starting point but there's no place to input an n.
scale_colour_gradientn(colours=c("red", "blue"))
I am well aware of:
brewer.pal(8, "Spectral")
from RColorBrewer. I'm looking more for the approach similar to how ggplot2 handles gradients that says I have these two colors and I want 15 colors along the way. How can I do that?
colorRampPalette could be your friend here:
colfunc <- colorRampPalette(c("black", "white"))
colfunc(10)
# [1] "#000000" "#1C1C1C" "#383838" "#555555" "#717171" "#8D8D8D" "#AAAAAA"
# [8] "#C6C6C6" "#E2E2E2" "#FFFFFF"
And just to show it works:
plot(rep(1,10),col=colfunc(10),pch=19,cex=3)
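Since the question mentions scales, here is a sketch of the same idea using that package, which should give colors interpolated the way ggplot2's two-color gradients are (this assumes scales is installed and that seq_gradient_pal() is the helper behind those scales). The palette function takes positions between 0 and 1, so n evenly spaced positions give n colors.

```r
library(scales)

n <- 15

# Palette function from red to blue; call it on positions in [0, 1]
grad_fun <- seq_gradient_pal("red", "blue")
cols15 <- grad_fun(seq(0, 1, length.out = n))

# Use the colours in a base plot
plot(rep(1, n), col = cols15, pch = 19, cex = 3)
```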
Just to expand on the previous answer, colorRampPalette can handle more than two colors.
So for a more expanded "heat map" type look you can....
colfunc<-colorRampPalette(c("red","yellow","springgreen","royalblue"))
plot(rep(1,50),col=(colfunc(50)), pch=19,cex=2)
The resulting image:
Try the following:
color.gradient <- function(x, colors=c("red","yellow","green"), colsteps=100) {
return( colorRampPalette(colors) (colsteps) [ findInterval(x, seq(min(x),max(x), length.out=colsteps)) ] )
}
x <- c((1:100)^2, (100:1)^2)
plot(x,col=color.gradient(x), pch=19,cex=2)
Edit
Let me try to explain why I think this function is superior to the other suggested solutions.
Let's apply the function suggested by jsol for the exponential data I used for my plot. I try two variations using range and length in the call to colfunc.
Result: It simply does not work as intended.
colfunc <- colorRampPalette(c("red","yellow","springgreen","royalblue"))
x <- c((1:100)^2, (100:1)^2)
plot(x, col=colfunc(range(x)), pch=19,cex=2)
plot(x, col=colfunc(length(x)), pch=19,cex=2)
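To see what color.gradient does differently, it may help to run it on a tiny input. The function splits the range of x into colsteps equally spaced cut points, looks up which interval each value falls in, and picks the palette color for that interval. The colors therefore follow the values rather than the positions in the vector. The three-value input below is made up for illustration.

```r
color.gradient <- function(x, colors = c("red", "yellow", "green"), colsteps = 100) {
  # Build the palette, then index it by the interval each value falls into
  pal <- colorRampPalette(colors)(colsteps)
  pal[findInterval(x, seq(min(x), max(x), length.out = colsteps))]
}

vals <- c(1, 50, 100)

# Cut points used for three steps: 1, 50.5, 100
cuts <- seq(min(vals), max(vals), length.out = 3)

# Interval index for each value
idx <- findInterval(vals, cuts)

# Colours assigned to each value
cols <- color.gradient(vals, colsteps = 3)
print(data.frame(vals, idx, cols))
```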
The above answer is useful but in graphs, it is difficult to distinguish between darker gradients of black. One alternative I found is to use gradients of gray colors as follows
palette(gray.colors(10, 0.9, 0.4))
plot(rep(1,10),col=1:10,pch=19,cex=3)
More info on gray scale here.
Added
When I used the code above for different colours like blue and black, the gradients were not that clear.
heat.colors() seems more useful.
This document has more detailed information and options. pdf
An alternative approach (not necessarily better than the previous answers!) is to use the viridis package. As explained here, it allows for a variety of color gradients that are based on more than two colors.
The package is pretty easy to use - you just need to replace the ggplot2 colour or fill scale function (e.g., scale_colour_gradient(low = "skyblue", high = "dodgerblue4")) with the equivalent viridis function.
So, change the code for this plot:
ggplot(mtcars, aes(wt*1000, mpg)) +
geom_point(size = 4, aes(colour = hp)) +
xlab("Weight (pounds)") + ylab("Miles per gallon (MPG)") + labs(color='Horse power') +
scale_x_continuous(limits = c(1000, 6000),
breaks = c(seq(1000,6000,1000)),
labels = c("1,000", "2,000", "3,000", "4,000", "5,000", "6,000")) +
scale_colour_gradient(low = "skyblue", high = "dodgerblue4") +
theme_classic()
Which produces:
To this, which uses viridis:
library(ggplot2)
library(viridis)  # provides scale_color_viridis()
ggplot(mtcars, aes(wt*1000, mpg)) +
geom_point(size = 4, aes(colour = factor(cyl))) +
xlab("Weight (pounds)") + ylab("Miles per gallon (MPG)") + labs(color='Number\nof cylinders') +
scale_x_continuous(limits = c(1000, 6000),
breaks = c(seq(1000,6000,1000)),
labels = c("1,000", "2,000", "3,000", "4,000", "5,000", "6,000")) +
scale_color_viridis(discrete = TRUE) +
theme_classic()
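The key change is in the second to last line: scale_color_viridis(discrete = TRUE). Note that this version also maps colour to factor(cyl) instead of hp and changes the legend title, which is why discrete = TRUE is needed.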
This is the plot that is produced using viridis:
Hoping someone finds this useful, as it's the solution I ended up using after coming to this question.
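Because the original question asked for something that works in base graphics, it may be worth noting that recent versions of R (3.6.0 and later, as far as I know) ship a viridis-like palette in grDevices through hcl.colors(), so no extra package is needed for a palette of n colors. A minimal sketch:

```r
n <- 15

# n colours from the built-in "viridis" HCL palette (base R, grDevices)
vir_cols <- hcl.colors(n, palette = "viridis")

# Use them in a base plot
plot(rep(1, n), col = vir_cols, pch = 19, cex = 3)
```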

Resources