ggplot, geom_segment, mapping or setting of xend - r

I am currently working on the code of "R graphics cookbook", Chapter 3, as provided by user "gaorongchao" on github:
A) code as given
install.packages("gcookbook")
library(gcookbook)
library(ggplot2)
tophit <- tophitters2001[1:25, ]
nameorder <- tophit$name[order(tophit$lg, tophit$avg)]
tophit$name <- factor(tophit$name, levels=nameorder)
ggplot(tophit, aes(x=avg, y=name)) +
geom_segment(aes(yend=name), xend=0, colour="grey50") +
geom_point(aes(colour=lg), size=3) +
scale_colour_brewer(palette="Set1", limits=c("NL","AL")) +
theme_bw() +
theme(panel.grid.major.y = element_blank(),
legend.position=c(1, 0.55),
legend.justification=c(1, 0.5))
B) Then I tried a variation with
ggplot(tophit, aes(x=avg, y=name)) +
geom_segment(aes(xend=0, yend=name), colour="grey50") +
geom_point(aes(colour=lg), size=3) +
scale_colour_brewer(palette="Set1", limits=c("NL","AL")) +
theme_bw() +
theme(panel.grid.major.y = element_blank(),
legend.position=c(1, 0.55),
legend.justification=c(1, 0.5))
where xend is part of the aes mapping in geom_segment(). B) produces a different graphic with a different scale, where xend=0 is explicitly part of the x-scale. Can someone explain the logic behind this difference between A) and B), that is, xend being inside aes() versus outside it? Thanks

Aesthetic mappings to columns of your data frame must be made inside aes(): only inside aes() will ggplot know to look in your data frame for a column name.
Constants, like your xend = 0, or color = "red" if you wanted to color all points red, can be set inside aes(), but it is generally preferred to set them outside of aes(). For something like color, setting it outside will not automatically create a legend (you normally don't want a color legend if there is only one color). Similarly, in your example, you saw that putting xend = 0 inside aes() makes it "explicitly part of the scale".
Setting a constant inside aes() is equivalent to adding that column to your data frame and then mapping it. Setting it outside aes() tells ggplot "hey, just do this, but don't worry about adding it to the data frame or legends or anything".
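You can check the difference without looking at the picture. The following is a minimal sketch on a made-up three-row data frame (toy is hypothetical, not the gcookbook data); it builds both variants and asks each plot for the limits its x scale was trained on.

```r
library(ggplot2)

# Made-up stand-in for the tophit data (hypothetical values)
toy <- data.frame(avg = c(0.31, 0.33, 0.35), name = c("a", "b", "c"))

# xend mapped inside aes(): 0 is treated as data, so the x scale is trained on it
p_mapped <- ggplot(toy, aes(x = avg, y = name)) +
  geom_segment(aes(xend = 0, yend = name))

# xend set outside aes(): 0 is a fixed parameter of the layer, the scale never sees it
p_set <- ggplot(toy, aes(x = avg, y = name)) +
  geom_segment(aes(yend = name), xend = 0)

# Limits of the trained x scale for each variant
lims_mapped <- layer_scales(p_mapped)$x$get_limits()
lims_set <- layer_scales(p_set)$x$get_limits()
print(lims_mapped)
print(lims_set)
```

In the mapped variant the limits should start at 0, while in the set variant they should only span the avg values, which matches what you saw in B) and A).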

Related

I'm using ggplot in R version 3.5.3 and trying to add annotation text to a faceted plot, but I keep getting an error about the aesthetics length.

Tr<-c("Sorghum Male \n Sorghum Female","Sorghum Male \n Wheat Female","Wheat Male \n Sorghum Female","Wheat Male \n Wheat Female")
Treatment<-c(rep(Tr,3))
Matingdiet<-c(rep(c("Same diet","Cross diet","Cross diet", "Same diet"),3))
Rejection<-c(0.05, 0.00, 0.10, 0.00, 0.00, 0.05, 0.05, 0.00, 0.05, 0.05, 0.05, 0.05)
d<-as.data.frame(cbind(Treatment,Rejection, Matingdiet))
d$pop<-c(rep("JN200A-OBL",4),rep("JN200B-OBL",4),rep("JN200C-OBL",4))
d$Rejection<-as.numeric(as.character(d$Rejection))
d$pop<-as.factor(d$pop)
datatxt<-as.data.frame(cbind(labels = rep("N = 20 per treatment",3)),pop=c("JN200A-OBL","JN200B-OBL","JN200C-OBL"))
pl<-ggplot(data = d, aes(x=Treatment, y=Rejection, fill=Matingdiet))+geom_col()+facet_wrap(~pop)
pl<-pl+labs(fill="Mating pair type", y = "Proportion of mates rejected")+ylim(0,1)+theme(axis.text.x = element_text(angle = -60, hjust = 1, vjust = -1))
pl<-pl+theme(plot.background = element_blank(),panel.grid.major = element_blank(), panel.grid.minor = element_blank())
pl+geom_text(data=datatxt,aes(label = labels))
Which gives this error
Error: Aesthetics must be either length 1 or the same as the data (9): x, y and fill
When I run it without adding the geom_text() function I get my desired graph but I want to annotate it.
Well, first of all, it looks like there is a misplaced parenthesis in your code where you define datatxt, which results in that data frame having only one column called labels. You're also using as.data.frame() when it makes much more sense to simply use data.frame(), where you do not need cbind() but just list the columns as named vectors, separated by commas.
datatxt <- data.frame(
labels = rep("N = 20 per treatment",3),
pop=c("JN200A-OBL","JN200B-OBL","JN200C-OBL")
)
As for placing the text on your plot: geom_text maps data the same way ggplot does for all the other geoms. That is to say, what is drawn on the plot is based on the data itself and the mapping you define in aes(), which is linked to the columns of that data. geom_text goes through each observation in the dataset you give it (in this case, datatxt) and looks for the values pertaining to x, y, and also fill, because these were defined in the overall call to ggplot() in your plot. The error message is due to not finding those columns in the dataset: datatxt has no columns that can be mapped to x, y, or fill at all.
The first fix is to remove the fill aesthetic from the overall call to ggplot(). If an aesthetic is used in all geoms, it makes sense to put it there, but I like to define the aesthetics that are used only for particular geoms inside the geom call itself. Hence, I'm moving fill=Matingdiet inside the aes() for geom_col(), where it is used. We could work around this another way, but this is the simplest.
Second, you presumably want the text to appear in the same location for each facet, right? Since it's not going to move with the data, we should define where it goes outside the mapping= specification of geom_text(), in other words, outside aes(). I also set a few other aesthetics so you can see what else you may want to specify here.
Here's the result:
pl<-
ggplot(data = d, aes(x=Treatment, y=Rejection)) +
geom_col(aes(fill=Matingdiet)) +
facet_wrap(~pop) +
labs(fill="Mating pair type", y = "Proportion of mates rejected") +
ylim(0,1) +
theme(
axis.text.x = element_text(angle = -60, hjust = 1, vjust = -1),
plot.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
)
pl +
geom_text(
data=datatxt, aes(label = labels),
x=1, y=0.95, hjust=0, fontface='italic', color='red')
Remember that geom_text is really suited to the case where the value of whatever you assign inside aes() changes with the data. For example, you might have N=20 for two of the facets, but N=30 for another one. If that's the case, use the approach above. If the text should stay the same regardless of the data, an easier approach might be to use annotate() instead:
pl +
annotate(
geom='text', x=1, y=0.95, color='red', fontface='italic',
label='N = 20 per treatment', hjust=0
)
Ultimately, it's up to you, as both work here. The above code gives you the same plot as using geom_text.
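For completeness, here is a sketch of the other workaround hinted at earlier: keep fill in the overall ggplot() call and tell the text layer not to inherit the plot-level mapping with inherit.aes = FALSE. The small d and datatxt below are made-up stand-ins for the question's data, so treat the values as assumptions.

```r
library(ggplot2)

# Made-up, reduced version of the question's data (hypothetical values)
d <- data.frame(
  Treatment  = rep(c("T1", "T2"), 2),
  Rejection  = c(0.05, 0.10, 0.00, 0.05),
  Matingdiet = rep(c("Same diet", "Cross diet"), 2),
  pop        = rep(c("A", "B"), each = 2)
)
datatxt <- data.frame(labels = "N = 20 per treatment", pop = c("A", "B"))

pl2 <- ggplot(d, aes(x = Treatment, y = Rejection, fill = Matingdiet)) +
  geom_col() +
  facet_wrap(~pop) +
  ylim(0, 1) +
  geom_text(
    data = datatxt, aes(label = labels),
    x = 1, y = 0.95, hjust = 0,
    inherit.aes = FALSE  # do not look for Treatment, Rejection or Matingdiet in datatxt
  )

# One text row per facet is expected in the built layer data
text_rows <- layer_data(pl2, 2)
print(text_rows$label)
```

The text layer still needs the pop column so that facet_wrap() knows which panel each label belongs to.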

How to set background color for each panel in grouped boxplot?

I plotted a grouped boxplot and am trying to change the background color for each panel. I can use panel.background in theme() to change the whole plot background, but how can this be done for an individual panel? I found a similar question here, but I failed to adapt the code to my plot.
Top few lines of my input data look like
Code
p <- ggplot(df, aes(x=Genotype, y=Length, fill=Treatment)) +
scale_fill_manual(values=c("#69b3a2", "#CF7737")) +
geom_boxplot(width=2.5) +
theme(text = element_text(size=20),
panel.spacing.x = unit(0.4, "lines"),
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.text.y = element_text(angle=90, hjust=1, colour="black")) +
labs(x = "Genotype", y = "Petal length (cm)") +
facet_grid(~divide, scales = "free", space = "free")
p + theme(panel.background = element_rect(fill = "#F6F8F9", colour = "#E7ECF1"))
Unfortunately, like the other theme elements, the fill aesthetic of element_rect() cannot be mapped to data. You cannot just send a vector of colors to fill either (create your own mapping of sorts). In the end, the simplest solution probably is going to be very similar to the answer you linked to in your question... with a bit of a twist here.
I'll use mtcars as an example. Note that I'm converting some of the continuous variables in the dataset to factors so that we can create some more discrete values.
It's important to note, the rect geom is drawn before the boxplot geom, to ensure the boxplot appears on top of the rect.
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
All done... but not quite. Something is wrong and you might notice this if you pay attention to the boxes on the legend and the gridlines in the plot panels. It looks like the alpha value is incorrect for some facets and okay for others. What's going on here?
Well, this has to do with how geom_rect works. It's drawing a box on each plot panel, but just like the other geoms, it's mapped to the data. Even though the x and y aesthetics for the geom_rect are actually not used to draw the rectangle, they are used to indicate how many of each rectangle are drawn. This means that the number of rectangles drawn in each facet corresponds to the number of lines in the dataset which exist for that facet. If 3 observations exist, 3 rectangles are drawn. If 20 observations exist for one facet, 20 rectangles are drawn, etc.
So, the fix is to supply a dataframe that contains one observation each for every facet. We have to then make sure that we supply any and all other aesthetics (x and y here) that are included in the ggplot call, or we will get an error indicating ggplot cannot "find" that particular column. Remember, even if geom_rect doesn't use these for drawing, they are used to determine how many observations exist (and therefore how many to draw).
rect_df <- data.frame(carb=unique(mtcars$carb)) # supply one of each type of carb
# have to give something to disp
rect_df$disp <- 0
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
data=rect_df,
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
That's better.
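If you want to confirm the "one rectangle per row" explanation rather than judge it by eye, you can count the rows each rect layer ends up with. This is a small sketch using layer_data() on the two variants above, with the boxplot layer left out since only the rect layer matters here.

```r
library(ggplot2)

# Variant 1: geom_rect inherits mtcars, so it gets one rectangle per row of mtcars
p_many <- ggplot(mtcars, aes(factor(carb), disp)) +
  geom_rect(
    aes(fill = factor(carb)), alpha = 0.5,
    xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  facet_grid(~factor(carb), scales = "free_x")

# Variant 2: geom_rect gets its own data frame with one row per carb value
rect_df <- data.frame(carb = unique(mtcars$carb), disp = 0)
p_one <- ggplot(mtcars, aes(factor(carb), disp)) +
  geom_rect(
    data = rect_df,
    aes(fill = factor(carb)), alpha = 0.5,
    xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  facet_grid(~factor(carb), scales = "free_x")

# Number of rectangles drawn by the first layer of each plot
n_many <- nrow(layer_data(p_many, 1))
n_one <- nrow(layer_data(p_one, 1))
print(c(many = n_many, one = n_one))
```

Stacking many semi-transparent rectangles on top of each other is what makes some panels look more opaque than others in the first version.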

Adding a legend to a combined line and bargraph ggplot

So I know many people have asked similar questions but the code others have used does not seem to be working for my graph hence why I'm wondering if I have done something wrong.
I have this code:
ggplot(dfMonth) +
geom_col(aes(x=Month, y=NumberMO), size=.7, colour="black", fill="white") +
geom_line(aes(x=Month, y=NumberME), size=1, colour="black", group=1) +
xlab("Month") +
ylab("No. of birds observed") +
theme_bw() +
geom_point(aes(x=Month, y=NumberME)) +
scale_colour_manual("", values=c("NumberME"="black"), labels=c("Expected No. of birds")) +
theme(legend.key=element_blank(), legend.title=element_blank(), legend.box="horizontal") +
theme(axis.title.x = element_text(margin = unit(c(5, 0, 0, 0), "mm")),
axis.title.y = element_text(margin = unit(c(0, 3, 0, 0), "mm")))
Which produces this graph:
As you can see, the legend showing what the black line with the points means has not been added to my graph, even though I have included the code for it. No error comes up, so I'm lost on what's wrong. Any ideas on what I've failed to include?
Thanks
In order for ggplot to know to draw a legend, you need to include one of the aesthetics for a geom within aes(). In this case, if you want a legend to be drawn for your line, you need to include within the aes() in the geom_line() call one of the aesthetics that you have identified for the line: linetype or color works. We'll use color here.
Oh... and in the absence of OP sharing their dataset, here's a made-up example:
set.seed(1234)
dfMonth <- data.frame(
Month=month.name,
NumberMO=sample(50:380, 12),
NumberME=sample(50:380, 12)
)
Now the code to make the plot and ensure the legend is created.
p <- ggplot(dfMonth, aes(x=Month)) +
geom_col(aes(y=NumberMO), size=0.7, color="black", fill="white") +
geom_line(aes(y=NumberME, color='black'), size=1, group=1)
p
We have a legend, but there are some problems. You get the default legend title (which is the name of the aesthetic) and the default label (which is whatever text you put inside aes(color=...)). Since we put "black" there, it is used as the label, not as the actual color. The line itself defaults to the first color of the standard discrete palette used by ggplot2, which in this case is that light red color.
To set the color, the legend title, and the label, we add scale_color_manual(). There's only one item in the legend, so an unnamed values vector would do, but if you use a named vector to tie the color to our single line explicitly, you end up with the somewhat strange-looking c('black'='black'). I also included a line break in the label to make it look a bit better. The months were running into each other too, so I changed the angle of the x axis labels.
Finally, you might notice the months were out of order. That's because ggplot2 by default treats a column of discrete values as a factor, which orders the levels alphabetically. To fix that, convert the column to a factor with the levels in the correct order before building the plot.
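# Set the month order first; p keeps the data it was built with, so rebuild p afterwards
dfMonth$Month <- factor(dfMonth$Month, levels=month.name)
p <- ggplot(dfMonth, aes(x=Month)) +
geom_col(aes(y=NumberMO), size=0.7, color="black", fill="white") +
geom_line(aes(y=NumberME, color='black'), size=1, group=1)
p + scale_color_manual(
name=NULL, values=c('black'='black'),
labels='Expected No.\nof birds') +
theme(axis.text.x=element_text(angle=30, hjust=1))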
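If c('black'='black') looks too odd, one variation is to put the legend text itself inside aes() and map that text to black in the scale. This is only a sketch on the same made-up dfMonth as above; the label string is repeated because the name in values has to match the string used inside aes().

```r
library(ggplot2)

set.seed(1234)
dfMonth <- data.frame(
  Month = factor(month.name, levels = month.name),  # keep calendar order
  NumberMO = sample(50:380, 12),
  NumberME = sample(50:380, 12)
)

p2 <- ggplot(dfMonth, aes(x = Month)) +
  geom_col(aes(y = NumberMO), colour = "black", fill = "white") +
  # The constant string inside aes() becomes the legend label
  geom_line(aes(y = NumberME, colour = "Expected No.\nof birds"), group = 1) +
  geom_point(aes(y = NumberME, colour = "Expected No.\nof birds")) +
  # Map that label to the colour we actually want
  scale_colour_manual(name = NULL, values = c("Expected No.\nof birds" = "black")) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

# Colours actually used by the line layer after the scale is applied
line_colours <- unique(layer_data(p2, 2)$colour)
print(line_colours)
```

Because the line and the points share the same colour mapping, they should be merged into a single legend entry.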

`fill` scale is not shown in the legend

Here is my dummy code:
set.seed(1)
df <- data.frame(xx=sample(10,6),
yy=sample(10,6),
type2=c('a','b','a','a','b','b'),
type3=c('A','C','B','A','B','C')
)
ggplot(data=df, mapping = aes(x=xx, y=yy)) +
geom_point(aes(shape=type3, fill=type2), size=5) +
scale_shape_manual(values=c(24,25,21)) +
scale_fill_manual(values=c('green', 'red'))
The resulting plot has a legend, but its 'type2' section doesn't reflect the fill scale values. Is that by design?
I know this is an old thread, but I ran into this exact problem and want to post this here for others like me. While the accepted answer works, the less risky, cleaner method is:
library(ggplot2)
ggplot(data=df, mapping = aes(x=xx, y=yy)) +
geom_point(aes(shape=type3, fill=type2), size=5) +
scale_shape_manual(values=c(24,25,21)) +
scale_fill_manual(values=c(a='green',b='red'))+
guides(fill=guide_legend(override.aes=list(shape=21)))
The key is to change the shape in the legend to one of those that can have a 'fill'.
Here's a different workaround.
library(ggplot2)
ggplot(data=df, mapping = aes(x=xx, y=yy)) +
geom_point(aes(shape=type3, fill=type2), size=5) +
scale_shape_manual(values=c(24,25,21)) +
scale_fill_manual(values=c(a='green',b='red'))+
guides(fill=guide_legend(override.aes=list(colour=c(a="green",b="red"))))
Using guide_legend(...) with override.aes is a way to influence the appearance of the guide (the legend). The hack here is that we are "overriding" the colours of the legend keys with the colours they should have had in the first place.
I played with the data and came up with this idea. I first assigned shape in the first geom_point. Then, I made the shapes empty. In this way, outlines stayed in black colour. Third, I manually assigned specific shape. Finally, I filled in the symbols.
ggplot(data=df, aes(x=xx, y=yy)) +
geom_point(aes(shape = type3), size = 5.1) + # Plot with three types of shape first
scale_shape(solid = FALSE) + # Make the shapes empty
scale_shape_manual(values=c(24,25,21)) + # Assign specific types of shape
geom_point(aes(color = type2, fill = type2, shape = type3), size = 4.5)
I'm not sure if what you want looks like this?
ggplot(df,aes(x=xx,y=yy))+
geom_point(aes(shape=type3,color=type2,fill=type2),size=5)+
scale_shape_manual(values=c(24,25,21))

Plot thick line with dark dots at data points in ggplot2

I want to plot a path and show where the datapoints are.
The question "Combine Points with lines with ggplot2" uses geom_point() + geom_line(), but I do not like that the dots are much thicker than the line and that the lines look discontinuous (- x - x ----- x ---), so I decided to create my own dotted line:
mya <- data.frame(a=1:20)
ggplot() +
geom_path(data=mya, aes(x=a, y=a, colour=2, size=1)) +
geom_point(data=mya, aes(x=a, y=a, colour=1, size=1)) +
theme_bw() +
theme(text=element_text(size=11))
I like that the dots and the line have the same size. I did not use the alpha channel because I fear trouble with the alpha channel when I include the files in other programs.
open problems:
R should not create those legends
can R calculate the "darker colour" itself? darker(FF0000) = AA0000
how can I manipulate the line thickness? The size= parameter did not work as expected in R 2.15
Aesthetics can be either set or mapped within a ggplot call.
An aesthetic defined within aes(...) is mapped from the data, and a legend is created.
An aesthetic may also be set to a single value by defining it outside aes().
In your case it appears you want to set the size to a single value. For an aesthetic that is mapped, you can use scale_..._manual(values = ..., guide = 'none') to suppress the creation of a legend.
This appears to be what you want with colour.
You can then use named colours such as lightblue and darkblue (see ?colors for more details).
ggplot() +
geom_line(data=mya, aes(x=a, y=a, colour='light'), size = 2) +
geom_point(data=mya, aes(x=a, y=a, colour='dark'), size = 2) +
scale_colour_manual(values = setNames(c('darkblue','lightblue'),
c('dark','light')), guide = 'none') +
theme_bw()
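On the "can R calculate the darker colour itself" point: there is no darker() function in base R or ggplot2 that I know of, but a small helper is easy to sketch with col2rgb() and rgb() from grDevices. The name darker and the default factor of 2/3 are my own choices here, picked so that FF0000 becomes AA0000 as in the question. Both colours are then set outside aes(), so no legends are created, and the line thickness is set with linewidth, assuming ggplot2 3.4.0 or later (older versions use size for lines).

```r
library(ggplot2)

# Hypothetical helper: scale the RGB channels of a colour towards black
darker <- function(col, factor = 2/3) {
  channels <- round(grDevices::col2rgb(col) * factor)
  grDevices::rgb(channels[1, ], channels[2, ], channels[3, ], maxColorValue = 255)
}

line_col <- "#FF0000"
dot_col <- darker(line_col)
print(dot_col)

mya <- data.frame(a = 1:20)
p_path <- ggplot(mya, aes(x = a, y = a)) +
  geom_path(colour = line_col, linewidth = 2) +  # set, not mapped: no legend
  geom_point(colour = dot_col, size = 2) +       # darker dots on top of the line
  theme_bw()
```

Scaling the channels is only one simple notion of "darker"; other colour spaces would give slightly different results.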

Resources