geom_text positions per group - r

I am using geom_line, geom_point, and geom_text to plot something like the picture below:
I am grouping, and coloring my data frame, but I want the geom_text not to be so close to each other.
I want to put one label on top and the other on the bottom, or at least hide one of the two. Is there any way I can do this?

You can specify custom aesthetics in different geom_text() calls. You can include only a subset of the data (such as just one group) in each call, and give each geom_text() a custom hjust or vjust value for each subset.
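ggplot(dat, aes(x, y, group=mygroups, color=mygroups, label=mylabel)) +
  geom_point() +
  geom_line() +
  # vjust is a constant here, so it is set outside aes(); vjust=1 puts the label below its point
  geom_text(data=dat[dat$mygroups=='group1',], vjust=1) +
  # a negative vjust puts the label above its point
  geom_text(data=dat[dat$mygroups=='group2',], vjust=-1)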
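As a minimal sketch of this approach, assuming a toy data frame (the values in dat and the offsets 1.5 and -0.5 are illustrative and not taken from the question), the per-group label layers might look like this. The second plot covers the "hide one of the two" case by labelling a single group only.

```r
library(ggplot2)

# Hypothetical toy data: two groups whose lines run close together
dat <- data.frame(
  x = rep(1:4, times = 2),
  y = c(10, 12, 11, 13, 10.5, 12.4, 11.3, 13.6),
  mygroups = rep(c("group1", "group2"), each = 4)
)
dat$mylabel <- as.character(dat$y)

p <- ggplot(dat, aes(x, y, group = mygroups, color = mygroups, label = mylabel)) +
  geom_point() +
  geom_line() +
  # group1 labels are pushed below their points
  geom_text(data = dat[dat$mygroups == "group1", ], vjust = 1.5) +
  # group2 labels are pushed above their points
  geom_text(data = dat[dat$mygroups == "group2", ], vjust = -0.5)

# Variant: label only one group, which hides the other group's text entirely
p_one <- ggplot(dat, aes(x, y, group = mygroups, color = mygroups)) +
  geom_point() +
  geom_line() +
  geom_text(data = dat[dat$mygroups == "group1", ], aes(label = mylabel), vjust = 1.5)
```

Printing p or p_one draws the plot. The offsets usually need some tuning to the font size and the spacing of the data.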

Related

Redistribute columns along x axis using ggplot2

Using this code:
ggplot(total_reads, aes(x=Week, y=Reads)) +
  geom_bar(position = "dodge", stat = "identity") +
  scale_y_log10(breaks=breaks, minor_breaks=minor_breaks) +
  scale_x_continuous() +
  facet_grid(~PEDIS, scales="free_x", space = "free_x") +
  theme_classic() +
  ylab("Total Bacterial Reads")
I produced this graph:
How do I remove the empty spaces in the first facet (pedis1) and make sure only the relevant labels are on the x axis (ie 0,3,6,12,13)?
The quick answer is that your x axis values (total_reads$Week) are numeric. ggplot2 therefore uses a continuous scale, and the bars are spaced according to their distance on that scale, as on any numeric axis. If you want the bars right next to one another with no white space, you need to treat the x axis as a discrete variable when plotting. The easiest way is to map factor(Week) directly in the aes() declaration.
Here's an example with that modification as well as some other suggestions described below:
total_reads <- data.frame(
Week=c(0,3,6,12,13),
Reads=c(100,110,100,129,135),
PEDIS=c(rep('PEDIS1', 3), rep('PEDIS2',2))
)
ggplot(total_reads, aes(x=factor(Week), y=Reads)) +
  geom_col() +
  facet_grid(~PEDIS, scales="free_x", space="free_x") +
  theme_classic()
A few other notes on what you see changed here:
Use geom_col(), not geom_bar(). If you check out the documentation associated with the geom_bar() function, you can see it mentions that geom_bar() is used for showing counts of observations along a single axis, whereas if you want to show value, you should use geom_col(). You get the same effect with geom_col() as if you use geom_bar(stat="identity").
Remove scale_x_continuous(). Not sure why you have this there anyway: if your column Week is numeric, ggplot would default to this scale regardless. If you do use the scale, you are asking ggplot to force a continuous scale, which is apparently not what you want here.
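One thing that may be worth checking when switching to factor(Week) is the order of the resulting categories. The short sketch below, using the Week values from the example data, suggests that factor() applied to a numeric vector keeps the numeric order, whereas converting to character first would sort the labels as text.

```r
# Week values taken from the example data above
week <- c(0, 3, 6, 12, 13)

# factor() on a numeric vector sorts the unique values numerically,
# so the bars keep their natural week order on the discrete axis
week_f <- factor(week)
levels(week_f)

# Converting to character first sorts the labels as text instead
week_chr <- factor(as.character(week))
levels(week_chr)
```

If Week is stored as text in your own data, converting it to numeric before calling factor() should keep the bars in week order.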

Why does this ggplot only plot the grid without the values?

I am trying to plot a bar chart in ggplot but I am continuously getting only the grid. This is apparently a demonstration about the draw nothing here but I would like to understand how to get the values visible in the simplest way.
library(ggplot2)
testData<-data.frame(x=c("a","b","c","d","e","f"), y=c(10,6,9,28,10,17))
bar <- ggplot(data=testData, aes(x=c("a","b","c","d","e","f"), y=c(10,6,9,28,10,17), fill = "#FFCC00"))
One way I can get the plots is the geom_bar
bar <- ggplot(data=testData, aes(x=c("a","b","c","d","e","f"), y=c(10,6,9,28,10,17), fill = "#FFCC00")) + geom_bar(stat="identity")
Why are the values not plotted in the first bar chart, and what is the simplest way to fix it? What is the idea behind this way of plotting with +, and what is it called?
With the ggplot2 package, calling ggplot() is only meant to call the basic grid; it's like taking out a piece of graph paper before drawing a graph. In either case, having the grid ready has nothing to do with plotting the graph. That's why running the following command will result in the empty grid in your first example:
ggplot(data=testData, aes(x=x, y=y, fill = "#FFCC00"))
It's not the same as using a function like plot() or hist(), which prep the grid and plot the data at the same time:
# base graphics functions take the vectors themselves rather than bare column names
plot(testData$y)
hist(testData$y)
The "+" in ggplot adds further components to the plot object created by ggplot(), on top of that first blank grid. Each geom added with a "+" is typically called a layer, and this way of building a plot up from layers follows the layered grammar of graphics that ggplot2 is named after.
So, if we want to make a simple scatterplot, we add points on top of a grid:
testData<-data.frame(x=c(1:6), y=c(10,6,9,28,10,17))
ggplot(data=testData,aes(x=x,y=y)) +
  geom_point()
Output:
If we want to add lines to that scatterplot, we can just add one line of code:
ggplot(data=testData,aes(x=x,y=y)) +
  geom_point() +
  geom_line()
Output:
We can keep adding layers like this if we want. Just note that they are drawn in the order that you add them (i.e. earlier layers end up beneath later ones):
ggplot(data=testData,aes(x=x,y=y)) +
  geom_bar(stat="identity",fill="#00BFC4") +
  geom_point() +
  geom_line()
Output:
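The layering described above can also be inspected without drawing anything. The following is a small sketch that assumes the plot object exposes its layers through $layers, as ggplot2 objects have traditionally done; the names base and layered are made up for the illustration.

```r
library(ggplot2)

testData <- data.frame(x = 1:6, y = c(10, 6, 9, 28, 10, 17))

# ggplot() alone creates a plot object with no layers, hence the empty grid
base <- ggplot(data = testData, aes(x = x, y = y))
length(base$layers)

# Each geom added with "+" appends one layer; later layers are drawn on top
layered <- base +
  geom_col(fill = "#00BFC4") +
  geom_point() +
  geom_line()
length(layered$layers)

# The geom behind each layer, in drawing order (bottom to top)
layer_geoms <- sapply(layered$layers, function(l) class(l$geom)[1])
layer_geoms
```

Because the plot is an ordinary R object, it can be stored, extended with further "+" calls, and printed only when needed.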
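Also, note that it's recommended not to repeat your data inside aes(), either as literal vectors or with the $ operator; refer to the columns of the data frame by name instead. Repeating the data can lead to errors or wrong results, notably when faceting.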
Don't use:
ggplot(data=testData, aes(x=c("a","b","c","d","e","f"),
                          y=c(10,6,9,28,10,17), fill = "#FFCC00")) +
  geom_bar(stat="identity")
#or
ggplot(data=testData, aes(x=testData$x, y=testData$y, fill = "#FFCC00")) +
  geom_bar(stat="identity")
Instead use:
ggplot(data=testData, aes(x=x, y=y)) +
  # a fixed colour goes outside aes(); inside aes() it would be treated as a data value
  geom_bar(stat="identity", fill="#FFCC00")
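One detail worth checking in snippets like these is where a fixed fill colour is specified. The sketch below compares a constant placed inside aes() with the same constant passed directly to the geom, using layer_data() to look at the fill values ggplot2 computes; the object names mapped and fixed are made up for the comparison.

```r
library(ggplot2)

testData <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                       y = c(10, 6, 9, 28, 10, 17))

# Constant inside aes(): treated as a one-level variable and coloured by the default scale
mapped <- ggplot(testData, aes(x = x, y = y, fill = "#FFCC00")) +
  geom_col()

# Constant outside aes(): used directly as the bar colour, with no legend
fixed <- ggplot(testData, aes(x = x, y = y)) +
  geom_col(fill = "#FFCC00")

# layer_data() returns the values ggplot2 computed for drawing each bar
unique(layer_data(mapped)$fill)
unique(layer_data(fixed)$fill)
```

Only the second plot should come out in the requested yellow; the first uses the default palette and adds a legend entry labelled with the hex string.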
If you want to plot data from a data frame(s) not called within the first ggplot() line, then simply add a data argument to the "layers" that use that different data frame, like this:
ggplot(data=testData,aes(x=x,y=y)) +
  geom_bar(stat="identity",fill="#00BFC4") +
  geom_point(data=differentDf, aes(x=x,y=y)) +
  geom_line(data=differentDf, aes(x=x,y=y))
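For a runnable version of this idea, here is a sketch in which differentDf is a made-up second data frame (its values are purely illustrative). Since it reuses the column names x and y, the extra layers can inherit the mapping from ggplot() and only need their own data argument.

```r
library(ggplot2)

testData <- data.frame(x = 1:6, y = c(10, 6, 9, 28, 10, 17))

# Hypothetical second data frame; it reuses the column names x and y,
# so the layers below inherit the aes(x, y) mapping from ggplot()
differentDf <- data.frame(x = 1:6, y = c(12, 8, 14, 20, 15, 11))

p <- ggplot(data = testData, aes(x = x, y = y)) +
  geom_col(fill = "#00BFC4") +          # drawn from testData
  geom_point(data = differentDf) +      # drawn from differentDf
  geom_line(data = differentDf)         # drawn from differentDf

# Check which y values each layer received
bars_y <- layer_data(p, 1)$y
points_y <- layer_data(p, 2)$y
```

If the second data frame used different column names, each of those layers would need its own aes() as well.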

Adding a legend to a ggplot2 plot which contains multiple elements

This is similar to an earlier question, but I could not get its solution to work for me. I want to add a legend to a ggplot2 plot when using more than one independent data frame to generate the plot.
Here is an example based on the data sets available in R:
a=longley
b=iris
a$scaled=scale(a$Unemployed,center=TRUE,scale=TRUE)
b$scaled=scale(b$Sepal.Length,center=TRUE,scale=TRUE)
ggplot() +
  geom_density(data=a,aes(x=scaled),fill="red",alpha=0.25) +
  geom_density(data=b,aes(x=scaled),fill="blue",alpha=0.25) +
  scale_colour_manual("",breaks=c("a","b"),values=c("red","blue"))
The plot produced looks like this:
ie. no legend.
How would I add a legend to this?
Very minor syntactic change required. Move the fill= part into the aes() statement in each geom.
a=longley
b=iris
a$scaled=scale(a$Unemployed,center=TRUE,scale=TRUE)
b$scaled=scale(b$Sepal.Length,center=TRUE,scale=TRUE)
ggplot() +
  geom_density(data=a,aes(x=scaled,fill="red"),alpha=0.25) +
  geom_density(data=b,aes(x=scaled,fill="blue"),alpha=0.25)
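This should work on its own and will give you the default ggplot2 color scheme, with the strings "red" and "blue" used only as legend labels. If you really want to change the colors from the defaults, you can add a manual scale. Since the scale applies to the fill aesthetic, make sure to use scale_fill_manual rather than scale_colour_manual. The breaks must match the labels used in aes(), and naming the values ties each label to the intended color (unnamed values are assigned to the labels in alphabetical order, which would swap them here).
ggplot() +
  geom_density(data=a,aes(x=scaled,fill="red"),alpha=0.25) +
  geom_density(data=b,aes(x=scaled,fill="blue"),alpha=0.25) +
  scale_fill_manual("",breaks=c("red","blue"),values=c(red="red",blue="blue"))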
If you wanted to change the colors of the lines you would do that with the color aesthetic and would then be able to use the scale_color_manual or scale_colour_manual (same thing) option.
ggplot() +
  geom_density(data=a, aes(x=scaled, fill="red", color="yellow"), alpha=0.25) +
  geom_density(data=b, aes(x=scaled, fill="blue", color="green"), alpha=0.25) +
  # named values, so each label gets its own color instead of the alphabetical default
  scale_fill_manual(values=c(red="red", blue="blue")) +
  scale_color_manual(values=c(yellow="yellow", green="green"))
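An alternative that is not part of the answer above, but is a common way to get a legend from several data frames, is to stack them into one data frame with a column saying where each row came from. The sketch below assumes the same longley and iris variables; the column name source and the object name both are made up for the example.

```r
library(ggplot2)

# Stack both scaled variables into one data frame with a column naming the source
a <- data.frame(scaled = as.numeric(scale(longley$Unemployed)), source = "a")
b <- data.frame(scaled = as.numeric(scale(iris$Sepal.Length)), source = "b")
both <- rbind(a, b)

# Mapping fill to the source column produces the legend automatically;
# named values tie each colour to its label explicitly
p <- ggplot(both, aes(x = scaled, fill = source)) +
  geom_density(alpha = 0.25) +
  scale_fill_manual("", values = c(a = "red", b = "blue"))
```

With this layout a single geom_density() call handles both groups, and adding a third data set only means binding more rows and adding one more named colour.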

ggplot2 position='dodge' producing bars that are too wide

I'm interested in producing a histogram with position='dodge' and fill=some factor (i.e. side-by-side bars for different subgroups within each bar/group), but ggplot2 gives me something like the first plot here, which has a rightmost bar that's too wide and reserves no space for the empty group, which I would like.
Here's a simple case:
df = data.frame(a=c('o','x','o','o'), b=c('a','b','a','b'))
qplot(a, data=df, fill=b, position='dodge')
From "ggplot geom_bar - bars too wide" I got this idea, which technically produces bars of the same width but preserves no space for the empty group:
ggplot(df, aes(x=a, fill=a))+
  geom_bar(aes(y=..count../sum(..count..))) +
  facet_grid(~b,scales="free",space="free")
How do I achieve what I want? Thanks in advance.
The default options in ggplot produce what I think you describe. The scales="free" and space="free" options do the opposite of what you want, so simply remove them from the code. Also, the default stat for geom_bar is to aggregate by counting, so you don't have to specify your stat explicitly.
ggplot(df, aes(x=a, fill=a)) + geom_bar() + facet_grid(~b)
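If side-by-side bars are preferred over facets, one option that is not part of the answer above is the preserve argument of position_dodge(). The sketch below uses the data frame from the question and assumes a ggplot2 version that supports preserve = "single"; how the lone bar is positioned within its slot may differ between versions, and position_dodge2() is a related alternative.

```r
library(ggplot2)

df <- data.frame(a = c("o", "x", "o", "o"), b = c("a", "b", "a", "b"))

# preserve = "single" keeps every bar at the width of one dodged bar,
# so the lone bar at "x" is not stretched to fill the whole slot
p <- ggplot(df, aes(x = a, fill = b)) +
  geom_bar(position = position_dodge(preserve = "single"))

# Bar widths as computed by ggplot2, one entry per bar
bars <- layer_data(p)
bar_widths <- bars$xmax - bars$xmin
```

All entries of bar_widths should come out equal, which is the "same width" behaviour asked for in the question.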

Adding stat_smooth in to only 1 facet in ggplot2

I have some data for which, at one level of a factor, there is a significant correlation. At the other level, there is none. Plotting these side-by-side is simple. Adding a line to both of them with stat_smooth, also straightforward. However, I do not want the line or its fill displayed in one of the two facets. Is there a simple way to do this? Perhaps specifying a blank color for the fill and colour of one of the lines somehow?
Don't think about picking a facet; think about supplying a subset of your data to stat_smooth:
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_smooth(data = subset(df, z =="a")) +
  facet_wrap(~ z)
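To try the subsetting idea end to end, here is a sketch with made-up data in which y depends on x only at level "a" of z. The method = "lm" and formula arguments are choices made for the example, not something the question specifies.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: y depends on x only at level "a" of z
df <- data.frame(
  x = rep(1:20, times = 2),
  z = rep(c("a", "b"), each = 20)
)
df$y <- ifelse(df$z == "a", 2 * df$x, 20) + rnorm(40, sd = 3)

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  # Only rows with z == "a" reach this layer, so only that panel gets a line
  geom_smooth(data = subset(df, z == "a"), method = "lm", formula = y ~ x) +
  facet_wrap(~ z)

# Panels in which the smooth layer draws anything
smooth_panels <- unique(layer_data(p, 2)$PANEL)
```

The points layer still uses the full data frame, so both panels are drawn; only the smooth is restricted.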
Of course, I later answered my own question. Although, is there a less hack-y way to do this? I wonder if one could even fit different functions to different panels.
One technique is to use + scale_fill_manual and scale_colour_manual. They allow one to specify what colors will be used. So, in this case, let's say you have
a<-qplot(x, y, facets=~z)+stat_smooth(method="lm", aes(colour=z, fill=z))
You can specify colors for the fill and colour using the following. Note that the second color is fully transparent: it is an eight-digit hex value whose final two digits give the alpha (opacity), so 00 means clear.
a+scale_fill_manual(values=c("grey", "#11111100"))+scale_colour_manual(values=c("blue", "#11111100"))
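On the question of fitting different functions in different panels, one possible approach is to add one smoothing layer per subset, each with its own model. The following is a sketch with made-up data; the linear and quadratic formulas are arbitrary choices for the illustration.

```r
library(ggplot2)

set.seed(2)
# Hypothetical data: a linear trend at level "a", a curved one at level "b"
df <- data.frame(
  x = rep(seq(0, 10, length.out = 30), times = 2),
  z = rep(c("a", "b"), each = 30)
)
df$y <- ifelse(df$z == "a", 3 * df$x, (df$x - 5)^2) + rnorm(60, sd = 2)

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  # Straight-line fit, shown only in panel "a"
  geom_smooth(data = subset(df, z == "a"), method = "lm", formula = y ~ x) +
  # Quadratic fit, shown only in panel "b"
  geom_smooth(data = subset(df, z == "b"), method = "lm", formula = y ~ poly(x, 2)) +
  facet_wrap(~ z)
```

Each layer only sees the rows of its own subset, so no transparent colours are needed to hide a fit.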
