Ifelse conditional colours ggplot2 [duplicate] - r

I am trying to have ggplot2 show one line of a histogram as a different color than the rest. In this I have been successful; however, ggplot is using the default colors when a different set are specified. I am sure there is an error in my code, but I am unable to determine where it is. The data and code are below:
# create data
library(ggplot2)
set.seed(71185)
dist.x <- as.data.frame(round(runif(100000, min= 1.275, max= 1.725), digits=2))
colnames(dist.x) <- 'sim_con'
# start histogram
ggplot(dist.x, aes(x = sim_con)) +
geom_histogram(colour = "black", aes(fill = ifelse(dist.x$sim_con==1.55, "darkgreen", "firebrick")), binwidth = .01) +
theme(legend.position="none")
Which results in the following image:
I do not want to use the default colors, but instead want to use 'darkgreen' and 'firebrick'. Where is the error in the code? Thanks for any help you can provide.

You're so close!
In your code above, ggplot treats the values you pass to fill as levels of a categorical variable in your data set (one level labelled darkgreen, one labelled firebrick). It has no way of knowing that those labels are colors and not, say, names of animal species.
If you add scale_fill_identity() to the end of your plot, as below, it will interpret those strings as colors (the identity), not as features of the data.
One benefit of this approach vs @marat's excellent answer: if you have a complex plot (say, using geom_segment(), with a starting value and an ending value for each observation) and you want to apply two fill scales on your data (one scale for the start value and a different scale for the end value), you can do the conditional logic in the data processing step, then use scale_fill_identity() to color each observation accordingly.
ggplot(
  data = dist.x,
  aes(
    x = sim_con,
    # columns of `data` can be referenced directly inside aes()
    fill = ifelse(sim_con == 1.55, "darkgreen", "firebrick")
  )
) +
  geom_histogram(
    colour = "black",
    binwidth = .01
  ) +
  theme(legend.position = "none") +
  scale_fill_identity()
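The "conditional logic in the data processing step" mentioned above might look like the following minimal sketch. The column name bar_fill is a hypothetical choice, the sample is cut to 1,000 draws to keep it quick, and the comparison uses a small tolerance instead of == as a precaution against floating-point surprises. None of these come from the original answer.

```r
library(ggplot2)

set.seed(71185)
dist.x <- data.frame(
  sim_con = round(runif(1000, min = 1.275, max = 1.725), digits = 2)
)

# Hypothetical helper column: store the literal colour name for each row.
# A tolerance is used instead of `==` (an assumption, not in the original).
dist.x$bar_fill <- ifelse(
  abs(dist.x$sim_con - 1.55) < 1e-8, "darkgreen", "firebrick"
)

p <- ggplot(dist.x, aes(x = sim_con, fill = bar_fill)) +
  geom_histogram(colour = "black", binwidth = 0.01) +
  scale_fill_identity()  # use the strings in bar_fill as colours directly
```

Printing p draws the plot. Because the colour decision lives in the data frame, the same column can be reused by other layers without repeating the ifelse().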

I don't think you can explicitly set colors in aes; you need to do it in scale_fill_manual, as in the example below:
ggplot(dist.x, aes(x = sim_con)) +
geom_histogram(colour = "black", binwidth = .01,aes(fill=(sim_con==1.55))) +
scale_fill_manual(values=c('TRUE'='darkgreen','FALSE'='firebrick')) +
theme(legend.position="none")
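One way to check which colours a plot actually ends up using is to inspect the computed layer data with layer_data(). The sketch below compares the question's approach with the scale_fill_manual() approach. The object names (p_default, p_manual) and the smaller sample size are assumptions made for illustration.

```r
library(ggplot2)

set.seed(71185)
dist.x <- data.frame(
  sim_con = round(runif(1000, min = 1.275, max = 1.725), digits = 2)
)

# Colour names mapped inside aes() with no fill scale: treated as categories.
p_default <- ggplot(dist.x, aes(x = sim_con)) +
  geom_histogram(
    aes(fill = ifelse(sim_con == 1.55, "darkgreen", "firebrick")),
    colour = "black", binwidth = 0.01
  )

# A logical mapped to fill, with colours assigned by scale_fill_manual().
p_manual <- ggplot(dist.x, aes(x = sim_con)) +
  geom_histogram(aes(fill = sim_con == 1.55), colour = "black", binwidth = 0.01) +
  scale_fill_manual(values = c("TRUE" = "darkgreen", "FALSE" = "firebrick"))

# layer_data() returns the computed data for a layer, including final fills.
default_fills <- unique(layer_data(p_default, 1)$fill)
manual_fills  <- unique(layer_data(p_manual, 1)$fill)

print(default_fills)  # ggplot2's default palette, not the names supplied
print(manual_fills)   # only colours given in scale_fill_manual()
```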

Related

How to plot two histograms of different variables in one GGPlot, with legend and colours

This is my first post on Stack Overflow, my first reproducible example, and I'm new to R, so please be gentle!
I am trying to display two histograms on one plot. Each histogram is a different variable (column) in my dataframe. I can't figure out how to both colour in the bars and have the legend displayed. If I use scale_fill_manual the colours are ignored, but if I use scale_colour_manual the colours are just the outlines of the bars. If I map the colours to each histogram separately (and don't use scale_xxx_manual at all) the colours work great but I then don't get the legend.
Here is my code:
TwoHistos <- ggplot (cars) +
labs(color="Variable name",x="XX",y="Count")+
geom_histogram(aes(x=speed, color= "Speed"), alpha = 0.2 ) +
geom_histogram(aes(x=dist, color= "Dist"), alpha = 0.2) +
scale_colour_manual(values = c("yellow","green"))
TwoHistos
Here is my result in an image (I pasted it but I don't know why it isn't showing up. I'm sorry!):
Two histograms with outlines for colours
I think (if I understand you correctly) what you want is to give a fill argument within the geom_histogram() call.
(I've used the built-in mtcars data here, as you did not give any data to work with.)
TwoHistos <- ggplot (mtcars) +
labs(fill="Variable name",x="XX",y="Count")+
geom_histogram(aes(x=hp, fill= "Speed", color = "yellow"), alpha = 0.2 ) +
geom_histogram(aes(x=disp, fill= "Dist", color = "green"), alpha = 0.2) +
scale_fill_manual(values = c("yellow","green"))+
scale_colour_manual(values = c("yellow","green"), guide="none")
TwoHistos
Edit: just to make really clear that I've changed the x in the geom_histogram() so it works with mtcars
Use fill instead of color and use scale_fill_manual
TwoHistos <- ggplot (cars) +
labs(fill="Variable name",x="XX",y="Count")+
geom_histogram(aes(x=speed, fill= "Speed"), alpha = 0.2 ) +
geom_histogram(aes(x=dist, fill= "Dist"), alpha = 0.2) +
scale_fill_manual(values = c("yellow","green"))
TwoHistos
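With an unnamed values vector, colours are assigned to the labels in the scale's own order (alphabetical for character labels), which may not be the order you had in mind. A minimal sketch using a named vector to pin each label to a colour follows. The pairing of Speed with yellow and Dist with green is an assumption about the intent, and bins = 30 is set explicitly only to silence the default-bins message.

```r
library(ggplot2)

# Named vector: each legend label is tied to a colour explicitly.
fill_values <- c(Speed = "yellow", Dist = "green")

two_histos <- ggplot(cars) +
  geom_histogram(aes(x = speed, fill = "Speed"), alpha = 0.2, bins = 30) +
  geom_histogram(aes(x = dist, fill = "Dist"), alpha = 0.2, bins = 30) +
  scale_fill_manual(values = fill_values) +
  labs(fill = "Variable name", x = "XX", y = "Count")
```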

How do I change the fill color for a computed variable in geom_bar

I am trying to change the default fill color from blue to green or red.
Here is the code I am using
Top_pos<- ggplot(Top_10, aes(x=reorder(Term,Cs), y=Cs, fill=pvalue)) +
geom_bar(stat = "identity", colour="black") + coord_flip()
Using the above code, I get the following image. I have no problem with this data but I do not know how to change the fill color.
It's easy to confuse scaling the color and scaling the fill. In the case of geom_bar/geom_col, color changes the borders around the bars while fill changes the colors inside the bars.
You already have the code that's necessary to scale fill color by value: aes(fill = pvalue). The part you're missing is a scale_fill_* command. There are several options; some of the more common for continuous scales are scale_fill_gradient or scale_fill_distiller. Some packages also export palettes and scale functions to make it easy to use them, such as the last example which uses a scale from the rcartocolor package.
scale_fill_gradient lets you set the two endpoints of a gradient; scale_fill_gradient2 adds a midpoint colour for a diverging gradient, and scale_fill_gradientn takes a vector of any number of colours.
scale_fill_distiller interpolates ColorBrewer palettes, which were designed for discrete data, into a continuous scale.
library(tidyverse)
set.seed(1234)
Top_10 <- tibble(
Term = letters[1:10],
Cs = runif(10),
pvalue = rnorm(10, mean = 0.05, sd = 0.005)
)
plt <- ggplot(Top_10, aes(x = reorder(Term, Cs), y = Cs, fill = pvalue)) +
geom_col(color = "black") +
coord_flip()
plt + scale_fill_gradient(low = "white", high = "purple")
plt + scale_fill_distiller(palette = "Greens")
plt + rcartocolor::scale_fill_carto_c(palette = "Sunset")
Created on 2018-05-05 by the reprex package (v0.2.0).
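The other gradient variants mentioned above could be used along these lines. This is a sketch, not part of the original reprex: the colours are arbitrary, the midpoint of 0.05 is chosen only because it is the mean used to simulate pvalue, and a plain data.frame replaces the tibble to avoid loading the tidyverse.

```r
library(ggplot2)

set.seed(1234)
top_10 <- data.frame(
  Term = letters[1:10],
  Cs = runif(10),
  pvalue = rnorm(10, mean = 0.05, sd = 0.005)
)

plt <- ggplot(top_10, aes(x = reorder(Term, Cs), y = Cs, fill = pvalue)) +
  geom_col(colour = "black") +
  coord_flip()

# Diverging gradient around an assumed midpoint of 0.05
plt_div <- plt +
  scale_fill_gradient2(low = "red", mid = "white", high = "darkgreen",
                       midpoint = 0.05)

# Gradient through an arbitrary sequence of colours
plt_n <- plt +
  scale_fill_gradientn(colours = c("white", "green", "darkgreen"))
```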
Personally, I'm a fan of RColorBrewer. It has a set of built-in palettes that work well for qualitative, sequential, or diverging data. Check out colorbrewer2.org for some examples on real-ish data.
More generally, and for how to actually code it, you can always add a scale_fill_manual() layer to set the colors of a discrete fill by hand. There are also some built-ins in ggplot2 for gradients.

R, ggplot2: creating a single legend in a bubble chart with positive and negative values

I want to create a single legend for a bubble chart with positive and negative values like in plot below, generated using sp::bubble().
But, for various reasons, I want to duplicate this in ggplot2. The closest I have gotten is to generate a single legend with scaled symbols, but the actual bubbles themselves aren't scaled.
The above plot was created using the code below
# create data frame
x=sample(seq(1,50),50,T)
y=sample(seq(1,50),50,T)
plot_dat=data.frame(x=x,y=y,value=rnorm(50,0,25))
# plot
library(ggplot2)
ggplot(data=plot_dat, aes(x=x, y=y,colour=factor(sign(value)), size=value)) +
geom_point() +
scale_size(breaks = c(-40,-30,-20,-10,0,10,20,30,40,50), range = c(0.5,4)) +
scale_colour_manual(values = c("orange", "blue"), guide=F) +
guides(size = guide_legend(override.aes = list(colour = list("orange","orange","orange","orange","blue","blue","blue","blue","blue","blue"),size=c(3,2.5,2,1,0.5,1,2,2.5,3,4))))
Continue using abs(value) for size and sign(value) for color.
Pass the breaks= argument of scale_size_continuous() each required break twice (e.g. c(10,10,20,20,...)), once for the negative and once for the positive value. Next, provide labels= with the values you want shown. Finally, use guides() and override.aes to set your own order of sizes and colours.
ggplot(data=plot_dat, aes(x=x, y=y,colour=factor(sign(value)), size=abs(value))) +
geom_point() +
scale_color_manual(values=c("orange","blue"),guide="none")+
scale_size_continuous(breaks=c(10,10,20,20,30,30,40,40,50,50),labels=c(-50,-40,-30,-20,-10,10,20,30,40,50),range = c(1,5))+
guides(size = guide_legend(override.aes = list(colour = list("orange","orange","orange","orange","orange","blue","blue","blue","blue","blue"),
size=c(4.92,4.14,3.50,2.56,1.78,1.78,2.56,3.50,4.14,4.92))))
To get exact values for the size= entry in guides(), you could use the rescale() function from the scales package: rescale the full set of plotted values together with the break points, to the range given to the range= argument of scale_size_continuous().
set.seed(1234)
x=sample(seq(1,50),50,T)
y=sample(seq(1,50),50,T)
plot_dat=data.frame(x=x,y=y,value=rnorm(50,0,20))
library(scales)
rescale(c(abs(plot_dat$value),10,20,30,40,50),to=c(1,5))[51:55]
[1] 1.775906 2.562657 3.349409 4.136161 4.922912
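The rescaling step above can be wrapped in a small helper. legend_sizes() is a hypothetical function written for this sketch (it is not part of ggplot2 or scales), and it mirrors the linear rescaling used in the answer.

```r
library(scales)

# Hypothetical helper: rescale the break values together with the plotted
# values so that both share the same input range.
legend_sizes <- function(values, breaks, size_range = c(1, 5)) {
  combined <- c(abs(values), breaks)
  rescaled <- rescale(combined, to = size_range)
  # keep only the entries that correspond to the breaks
  tail(rescaled, length(breaks))
}

# Toy input chosen so the arithmetic is easy to check by hand:
# the combined range is 0..50, so 10 maps to 1 + 4 * 0.2 and 20 to 1 + 4 * 0.4
legend_sizes(values = c(0, -50), breaks = c(10, 20))
# [1] 1.8 2.6
```

The returned vector can then be mirrored (largest to smallest, then smallest to largest) and passed as size= inside override.aes.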

Adding shaded target region to ggplot2 barchart

I have two data frames: one I am using to create the bars in a barchart and a second that I am using to create a shaded "target region" behind the bars using geom_rect.
Here is example data:
test.data <- data.frame(crop=c("A","B","C"), mean=c(6,4,12))
target.data <- data.frame(crop=c("ONE","TWO"), mean=c(31,12), min=c(24,9), max=c(36,14))
I start with the means of test.data for the bars and means of target.data for the line in the target region:
library(ggplot2)
a <- ggplot(test.data, aes(y=mean, x=crop)) + geom_hline(aes(yintercept = mean, color = crop), target.data) + geom_bar(stat="identity")
a
So far so good, but then when I try to add a shaded region to display the min-max range of target.data, there is an issue. The shaded region appears just fine, but somehow, the crops from target.data are getting added to the x-axis. I'm not sure why this is happening.
b <- a + geom_rect(aes(xmin=-Inf, xmax=Inf, ymin=min, ymax=max, fill = crop), data = target.data, alpha = 0.5)
b
How can I add the geom_rect shapes without adding those extra names to the x-axis of the bar-chart?
This is a solution to your question, but I'd like to better understand your problem because we might be able to make a more interpretable plot. All you have to do is add x = NULL to the aes() of your geom_rect() call. I took the liberty of renaming the variable 'crop' in the second data frame (add.data here) to 'brop' to minimize any confusion.
test.data <- data.frame(crop=c("A","B","C"), mean=c(6,4,12))
add.data <- data.frame(brop=c("ONE","TWO"), mean=c(31,12), min=c(24,9), max=c(36,14))
ggplot(test.data, aes(y=mean, x=crop)) +
geom_hline(data = add.data, aes(yintercept = mean, color = brop)) +
geom_bar(stat="identity") +
geom_rect(data = add.data, aes(xmin=-Inf, xmax=Inf, x = NULL, ymin=min, ymax=max, fill = brop),
alpha = 0.5, show.legend = FALSE)
In ggplot, every layer inherits the aesthetics set in the aes() of the initial call:
ggplot(data, aes(x=foo, y=bar)).
That means that regardless of which layers you add (geom_rect(), geom_hline(), etc.), ggplot looks for 'foo' to assign to x and 'bar' to assign to y, unless you specifically tell it otherwise. So, as aeosmith pointed out, you can clear all inherited aesthetics for a layer with inherit.aes = FALSE, or you can knock out single variables one at a time by reassigning them as NULL.
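The inherit.aes = FALSE alternative mentioned above might look like this minimal sketch. It reuses the data from the question and keeps the question's layer order; only the geom_rect() call differs from the x = NULL version.

```r
library(ggplot2)

test.data <- data.frame(crop = c("A", "B", "C"), mean = c(6, 4, 12))
target.data <- data.frame(crop = c("ONE", "TWO"), mean = c(31, 12),
                          min = c(24, 9), max = c(36, 14))

p <- ggplot(test.data, aes(x = crop, y = mean)) +
  geom_hline(data = target.data, aes(yintercept = mean, colour = crop)) +
  geom_bar(stat = "identity") +
  # inherit.aes = FALSE drops x and y from the ggplot() call for this layer,
  # so target.data$crop is never mapped to the x-axis
  geom_rect(data = target.data,
            aes(xmin = -Inf, xmax = Inf, ymin = min, ymax = max, fill = crop),
            alpha = 0.5, inherit.aes = FALSE)

# The x-axis should now only contain the crops from test.data
layer_scales(p)$x$get_limits()
```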
