ggplot2: use different colors in different facets

I have what seems to be a very basic problem, but I cannot solve it, as I have barely used ggplot2. I just want the plot on the left to use the colors in the variable color1 and the plot on the right to use the colors in the variable color2. This is an MWE:
library(reshape2)
library(ggplot2)
a.df <- data.frame(
id=c("a","b","c","d","e","f","g","h"),
var1=c(1,2,3,4,5,6,7,8), var2=c(21,22,23,24,25,26,27,28),
var3=c(56,57,58,59,60,61,62,63),
color1=c(1,2,"NONE","NONE",1,2,2,1),
color2=c(1,"NONE",1,1,2,2,"NONE",2)
)
a.dfm <- melt(a.df, measure.vars=c("var2","var3"))
ggplot(a.dfm, aes(x=value, y=var1, color=color1)) +
geom_point(shape=1) +
facet_grid(. ~ variable)
Thanks a lot!

I think the easiest approach with your data is to create an additional column which has the color defined appropriately based on the value of variable. Since there are just two possible values that variable can take on, this isn't that hard.
a.dfm2 <- transform(a.dfm,
color.use = ifelse(variable=="var2",
as.character(color1),
as.character(color2)))
ggplot(a.dfm2, aes(x=value, y=var1, color=color.use)) +
geom_point(shape=1) +
facet_grid(. ~ variable)

Related

How to supply named colors to the scale_x_manual in ggplot2 >=3.3.4

I have a code generating power point presentations from weekly data.
My table of raw materials contains the materials used in that week. Additionally, I have a named vector with all the materials ever used and the colors assigned to them. This ensures that Mat1 is always the same color whether or not it is used that week.
This is how it is programmed:
ggplot(df, aes(fill=Mat, y=pct, x=Date)) +
geom_bar(position="fill", stat="identity")+
scale_fill_manual(values = colors)
Up to ggplot2 3.3.3 this worked fine. The new 3.3.4 version changes the scale_xx_manual functions so that all the names from values are applied to limits. This produces a very long legend showing every possible color value, even if only 3-4 are used that week.
I can work around it by supplying
values = colors[unique(df$Mat)]
but that does not seem to be an elegant solution.
Here is a minimal reproducible example of the issue, using the ggplot2::mpg dataset:
library(ggplot2)
library(scales)
colors <- scales::hue_pal()(length(unique(mpg$class)))
colors <- setNames(colors, unique(mpg$class))
x <- subset(mpg, mpg$class %in% c("compact", "subcompact", "2seater"))
ggplot(x, aes(x = class, fill=class)) +
geom_bar() +
scale_fill_manual(values = colors)
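The workaround mentioned in the question can be applied to the mpg example roughly as follows. This is only a sketch: `used` is a name introduced here for illustration, and subsetting the named vector is the question's own workaround, not a confirmed recommended fix.

```r
library(ggplot2)
library(scales)

# Named vector covering every class that could ever appear
colors <- setNames(hue_pal()(length(unique(mpg$class))), unique(mpg$class))

# Only three of the classes are present in this subset
x <- subset(mpg, class %in% c("compact", "subcompact", "2seater"))

# Keep only the entries whose names occur in the data being plotted
# ('used' is a hypothetical helper name, not part of the original code)
used <- colors[unique(x$class)]

p <- ggplot(x, aes(x = class, fill = class)) +
  geom_bar() +
  scale_fill_manual(values = used)
p
```

Because `used` is still indexed by name, each class keeps the color it was assigned in the full vector.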

Weird geom_path behavior

I have noticed an odd behavior in geom_path() in ggplot2. I am not sure whether I am doing something wrong or whether it's a bug.
Here's my data set:
x <- abs(rnorm(10))
y <- abs(rnorm(10)/10)
categs <- c("a","b","c","d","e","f","g","h","i","j")
df <- data.frame(x,y,categs)
I make a plot with points and I join them using geom_path. Works well:
ggplot(df, aes(categs, x, group=1)) + geom_point() + geom_errorbar(aes(ymin=x-y, ymax=x+y)) + geom_path()
However, if I reorder my levels, for instance like this:
df$categs <- factor(df$categs, levels = c("f","i","c","g","e","a","d","h","b","j"))
then geom_path still keeps the original order (although the order of the factor levels has been updated on the x axis).
Any guesses at what I am doing wrong? Thanks.
Order the rows of df by df$categs, because geom_path plots row by row:
ggplot(df[ order(df$categs), ], aes(categs, x, group=1)) +
geom_point() +
geom_errorbar(aes(ymin=x-y, ymax=x+y)) +
geom_path()
From the ?geom_path manual:
geom_path() connects the observations in the order in which they appear in the data.
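A quick base R check of why order(df$categs) gives the row order that geom_path needs: order() on a factor sorts by level position, not alphabetically. The seed and the names `new_levels` and `sorted` are assumptions added here for illustration.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
df <- data.frame(x = abs(rnorm(10)),
                 y = abs(rnorm(10) / 10),
                 categs = letters[1:10])

new_levels <- c("f", "i", "c", "g", "e", "a", "d", "h", "b", "j")
df$categs <- factor(df$categs, levels = new_levels)

# Rows are still stored in a..j order; only the level order has changed
as.character(df$categs)

# order() on a factor uses the integer level codes, so the rows
# now follow the new level order
sorted <- df[order(df$categs), ]
as.character(sorted$categs)
```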

Adding text to facetted histogram

Using ggplot2 I have made facetted histograms using the following code.
library(ggplot2)
library(plyr)
df1 <- data.frame(monthNo = rep(month.abb[1:5],20),
classifier = c(rep("a",50),rep("b",50)),
values = c(seq(1,10,length.out=50),seq(11,20,length.out=50))
)
means <- ddply (df1,
c(.(monthNo),.(classifier)),
summarize,
Mean=mean(values)
)
ggplot(df1,
aes(x=values, colour=as.factor(classifier))) +
geom_histogram() +
facet_wrap(~monthNo,ncol=1) +
geom_vline(data=means, aes(xintercept=Mean, colour=as.factor(classifier)),
linetype="dashed", size=1)
The vertical line showing means per month is to stay.
But I want to also add text over these vertical lines displaying the mean values for each month. These means are from the 'means' data frame.
I have looked at geom_text and I can add text to plots, but my case seems a little different and not so easy. Adding text is simple when the labels are just the values of the plotted data points. Here I want to label the mean rather than the histogram values, and I can't find a solution.
Please help. Thanks.
Having noted the possible duplicate (another answer of mine), the solution here might not be as (initially/intuitively) obvious. You can do what you need if you split the geom_text call into two (for each classifier):
ggplot(df1, aes(x=values, fill=as.factor(classifier))) +
geom_histogram() +
facet_wrap(~monthNo, ncol=1) +
geom_vline(data=means, aes(xintercept=Mean, colour=as.factor(classifier)),
linetype="dashed", size=1) +
geom_text(y=0.5, aes(x=Mean, label=Mean),
data=means[means$classifier=="a",]) +
geom_text(y=0.5, aes(x=Mean, label=Mean),
data=means[means$classifier=="b",])
I'm assuming you can format the numbers to the appropriate precision and place them on the y-axis where you need to with this code.
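One possible way to handle the number formatting is to add a pre-formatted label column to the means data frame. This is a sketch under assumptions: base R aggregate() stands in for the ddply() call, the `label` column and the one-decimal format are choices made here, and the size argument is left at its default.

```r
library(ggplot2)

df1 <- data.frame(monthNo = rep(month.abb[1:5], 20),
                  classifier = c(rep("a", 50), rep("b", 50)),
                  values = c(seq(1, 10, length.out = 50),
                             seq(11, 20, length.out = 50)))

# Assumption: aggregate() replaces ddply() to avoid the plyr dependency
means <- aggregate(values ~ monthNo + classifier, data = df1, FUN = mean)
names(means)[names(means) == "values"] <- "Mean"

# Hypothetical 'label' column: means formatted to one decimal place
means$label <- sprintf("%.1f", means$Mean)

p <- ggplot(df1, aes(x = values, fill = classifier)) +
  geom_histogram(bins = 30) +
  facet_wrap(~monthNo, ncol = 1) +
  geom_vline(data = means, aes(xintercept = Mean, colour = classifier),
             linetype = "dashed") +
  geom_text(y = 0.5, aes(x = Mean, label = label),
            data = means[means$classifier == "a", ]) +
  geom_text(y = 0.5, aes(x = Mean, label = label),
            data = means[means$classifier == "b", ])
p
```

Changing the fixed y value moves the labels up or down within each facet.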

want to layer aes in ggplot2

I would like to plot another series of data on top of a current graph. The additional data only contains information for 3 (out of 6) spp, which are used in the facet_wrap.
The other series of data is currently a column (in the same data file).
Current graph:
ped.num <- ggplot(data, aes(ped.length, seeds.inflorstem))
ped.num + geom_point(size=2) + theme_bw() + facet_wrap(~spp, scales = "free_y")
Additional layer would be:
aes(ped.length, seeds.filled)
I feel I should be able to plot them using the same y-axis, because they have just slightly smaller values. How do I go about adding this layer?
@ialm's solution should work fine, but I recommend calling the aes function separately in each geom_* because it makes the code easier to read.
ped.num <- ggplot(data) +
geom_point(aes(x=ped.length, y=seeds.inflorstem), size=2) +
theme_bw() +
facet_wrap(~spp, scales="free_y") +
geom_point(aes(x=ped.length, y=seeds.filled))
(You'll always get better answers if you include example data, but I'll take a shot in the dark)
Since you want to plot two variables that are in the same data.frame, it's probably easiest to reshape the data before feeding it into ggplot:
library(reshape2)
# Melting data gives you exactly one observation per row - ggplot likes that
dat.melt <- melt(dat,
id.vars = c("spp", "ped.length"),
measure.vars = c("seeds.inflorstem", "seeds.filled")
)
# Plotting is slightly different - instead of explicitly naming each variable,
# you'll refer to "variable" and "value"
ggplot(dat.melt, aes(x = ped.length, y = value, color = variable)) +
geom_point(size=2) +
theme_bw() +
facet_wrap(~spp, scales = "free_y")
The seeds.filled values should plot only on the facets for the corresponding species.
I prefer this to Drew's (totally valid) approach of explicitly mapping different layers because you only need a single geom_point() whether you have two variables or twenty and it's easy to map a variety of aesthetics to variable.
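Since the question did not include data, here is a small sketch with invented values showing what the melted data looks like when one variable is missing for some species. The data frame contents are hypothetical, and passing na.rm = TRUE to melt is a choice made here, not part of the answer above.

```r
library(reshape2)

# Hypothetical toy data: seeds.filled is recorded for only one species
dat <- data.frame(
  spp = rep(c("sp1", "sp2"), each = 3),
  ped.length = c(1.0, 1.5, 2.0, 1.2, 1.8, 2.4),
  seeds.inflorstem = c(10, 12, 15, 8, 9, 11),
  seeds.filled = c(7, 9, 12, NA, NA, NA)
)

# na.rm = TRUE drops the rows where a measurement is missing
dat.melt <- melt(dat,
                 id.vars = c("spp", "ped.length"),
                 measure.vars = c("seeds.inflorstem", "seeds.filled"),
                 na.rm = TRUE)

# sp2 contributes no seeds.filled rows, so its facet shows one series only
table(dat.melt$spp, dat.melt$variable)
```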

Connecting means in ggplot2

I'm trying to build some kind of profile diagram with ggplot2. I therefore want a line which connects the means in the plot. As you see, geom_line doesn't work here because it only connects the points within each factor level but not the means between factor levels.
Here's a small example:
df <- data.frame(variable=rep(1:3,each=10),value=rnorm(30))
p <- ggplot(df,aes(factor(variable),value))
p + stat_summary(fun.y=mean, geom="point")+coord_flip()+geom_line()
Does anyone have an idea how to achieve that?
Thank you in advance!
It is often easier to summarize the data before you plot. Something like
library(plyr)
summarydf <- ddply(df,.(variable),summarize, value = mean(value))
The next trick is to use group within the call to geom_line to override the default grouping by factor(variable):
p <- ggplot(summarydf,aes(factor(variable),value)) +
geom_point() + geom_line(aes(group=1)) + coord_flip()
p
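As an alternative sketch, the same group = 1 idea can be applied without summarizing first, by letting stat_summary draw the line as well. This assumes ggplot2 3.3.0 or later, where the argument is named fun (it replaced fun.y); the seed is added here only for reproducibility.

```r
library(ggplot2)

set.seed(42)  # assumption: seed added for reproducibility
df <- data.frame(variable = rep(1:3, each = 10), value = rnorm(30))

# group = 1 puts all factor levels into a single group, so the summary
# line runs across levels instead of staying within each one
p <- ggplot(df, aes(factor(variable), value)) +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +
  coord_flip()
p
```

Summarizing beforehand, as in the answer above, keeps the computed means available as a data frame; this variant only draws them.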
