ggplot2: line connecting axis to point - r

data:
df<-data.frame(grp=letters[1:4],perc=runif(4))
First option:
First, create a second dataset that contains zeros for each group
df2<-rbind(df,data.frame(grp=df[,1],perc=c(0,0,0,0)))
Then plot with geom_point and geom_line:
ggplot(df,aes(y=perc,x=grp))+
geom_point()+
geom_line(data=df2, aes(y=perc, x=grp))+
coord_flip()
This looks fine, but creating a second dataset is too much extra work.
The other option is using geom_bar and making the width tiny:
ggplot(df,aes(y=perc,x=grp))+
geom_point()+
geom_bar(stat="identity",width=.01)+
coord_flip()
But this is also weird, and when I save to .pdf, not all of the bars have the same width.
There clearly has to be an easier way to do this. Any suggestions?

Use geom_segment with fixed yend = 0. You'll also need expand_limits to adjust the plotting area:
ggplot(df, aes(y=perc, x=grp)) +
geom_point() +
geom_segment(aes(xend=grp), yend=0) +
expand_limits(y=0) +
coord_flip()
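A variant sketch of the same idea, in case you would rather avoid coord_flip(): put perc on the x axis directly and let each segment start at x = 0. Mapping the constant inside aes() is my own choice here (not part of the answer above); it means the scale sees the 0, so expand_limits() should not be needed. The set.seed() call is only there to make the random data reproducible.

```r
library(ggplot2)

set.seed(1)  # only so the example data is reproducible
df <- data.frame(grp = letters[1:4], perc = runif(4))

# perc goes on x and grp on y, so no coord_flip() is required.
# Each segment runs from x = 0 to the point, at the height of its group.
p <- ggplot(df, aes(x = perc, y = grp)) +
  geom_segment(aes(x = 0, xend = perc, yend = grp)) +
  geom_point()
p
```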

Related

How to set background color for each panel in grouped boxplot?

I plotted a grouped boxplot and am trying to change the background color for each panel. I can use the panel.background theme element to change the background of the whole plot, but how can this be done for an individual panel? I found a similar question here, but I failed to adapt the code to my plot.
Top few lines of my input data look like
Code
p <- ggplot(df, aes(x=Genotype, y=Length, fill=Treatment)) +
  scale_fill_manual(values=c("#69b3a2", "#CF7737")) +
  geom_boxplot(width=2.5) +
  theme(text = element_text(size=20),
        panel.spacing.x = unit(0.4, "lines"),
        axis.title.x = element_blank(), axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.y = element_text(angle=90, hjust=1, colour="black")) +
  labs(x = "Genotype", y = "Petal length (cm)") +
  facet_grid(~divide, scales = "free", space = "free")
p + theme(panel.background = element_rect(fill = "#F6F8F9", colour = "#E7ECF1"))
Unfortunately, as with the other theme elements, the fill argument of element_rect() cannot be mapped to data. You also cannot pass a vector of colors to fill to create your own mapping of sorts. In the end, the simplest solution is probably going to be very similar to the answer you linked to in your question, with a bit of a twist.
I'll use mtcars as an example. Note that I'm converting some of the continuous variables in the dataset to factors so that we can create some more discrete values.
It's important to note, the rect geom is drawn before the boxplot geom, to ensure the boxplot appears on top of the rect.
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
All done... but not quite. Something is wrong and you might notice this if you pay attention to the boxes on the legend and the gridlines in the plot panels. It looks like the alpha value is incorrect for some facets and okay for others. What's going on here?
Well, this has to do with how geom_rect works. It's drawing a box on each plot panel, but just like the other geoms, it's mapped to the data. Even though the x and y aesthetics for the geom_rect are not used to draw the rectangle, they determine how many rectangles are drawn. This means that the number of rectangles drawn in each facet equals the number of rows in the dataset that belong to that facet. If 3 observations exist, 3 rectangles are drawn. If 20 observations exist for one facet, 20 rectangles are drawn, and so on.
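To see the effect in numbers, here is a small sketch that counts the rows per carb value in mtcars, which is how many rectangles get stacked in each panel. The opacity formula is my own addition and assumes idealised alpha compositing of identical layers, so treat the result as a rough guide rather than what a given graphics device renders exactly.

```r
# Rows per facet in mtcars: each row draws one rectangle in its carb panel.
n_rect <- table(mtcars$carb)
n_rect

# Stacking n identical layers of alpha = 0.5 gives an effective opacity of
# 1 - 0.5^n (idealised compositing), so panels with many rows look nearly solid.
effective_alpha <- 1 - (1 - 0.5)^as.vector(n_rect)
round(setNames(effective_alpha, names(n_rect)), 3)
```

Panels with a single row keep the intended alpha of 0.5, which is why only some facets look wrong.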
So, the fix is to supply a dataframe that contains one observation each for every facet. We have to then make sure that we supply any and all other aesthetics (x and y here) that are included in the ggplot call, or we will get an error indicating ggplot cannot "find" that particular column. Remember, even if geom_rect doesn't use these for drawing, they are used to determine how many observations exist (and therefore how many to draw).
rect_df <- data.frame(carb=unique(mtcars$carb)) # supply one of each type of carb
# have to give something to disp
rect_df$disp <- 0
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
data=rect_df,
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
That's better.

Conditionally circling around data plots using ggplot2

I have a couple of questions regarding plotting with ggplot2.
I have already used the commands below to colour data points in R.
library(ggplot2)
df <- read.csv(file="c:\\query2.csv")
ggplot(df, aes(x = Time, y = users, colour = users > 40)) + geom_point()
My question is: how should I draw a continuous line connecting data points and how do I circle around data points for users >40?
To connect the points, use geom_line (if that doesn't give you what you need, please explain what you're trying to accomplish).
I haven't used geom_encircle, but another option is to use a filled marker with the fill deleted to create the circles. Here's an example, using the built-in mtcars data frame for illustration:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_point(data=mtcars[mtcars$mpg>30,],
pch=21, fill=NA, size=4, colour="red", stroke=1) +
theme_bw()
pch=21 is one of the filled markers (see ?pch for more info on other available point markers). We set fill=NA to remove the fill. stroke sets the thickness of the circle border.
UPDATE: To add a line to this chart, using the example above:
ggplot(mtcars, aes(wt, mpg)) +
geom_line() +
geom_point() +
geom_point(data=mtcars[mtcars$mpg>30,],
pch=21, fill=NA, size=4, colour="red", stroke=1) +
theme_bw()
However, if (as in my original code for this graph) you put the aes statement inside the geom, rather than in the initial call to ggplot, then you need to include an aes statement inside geom_line as well.
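A minimal sketch of that second arrangement, with nothing mapped in the initial ggplot() call, so every layer has to carry its own aes():

```r
library(ggplot2)

# No aesthetics in ggplot(): each layer, geom_line() included, needs its own aes().
p <- ggplot(mtcars) +
  geom_line(aes(wt, mpg)) +
  geom_point(aes(wt, mpg)) +
  geom_point(data = mtcars[mtcars$mpg > 30, ], aes(wt, mpg),
             pch = 21, fill = NA, size = 4, colour = "red", stroke = 1) +
  theme_bw()
p
```

Leaving aes() out of geom_line() in this form stops with an error about missing x and y aesthetics.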

How to change style settings in stacked barchart overlaid with density line (ggplot2)

I am trying to change the style settings of this kind of chart and hope you can help me.
R code:
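theme_set(theme_bw())
cglac$pred2 <- as.factor(cglac$pred)
# "+" has to end a line rather than start the next one, or R stops parsing early.
# Note: since ggplot2 2.0.0, geom_bar() no longer bins; use geom_histogram() there.
ggplot(cglac, aes(x=depth, colour=pred2)) +
  geom_bar(aes(y=..density..), binwidth=3, alpha=.5, position="stack") +
  geom_density(alpha=.2) +
  xlab("Depth (m)") +
  ylab("Counts & Density") +
  coord_flip() +
  scale_x_reverse() +
  theme_bw()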
which produces this graph:
Here are some points:
What I want is to have the density line as black and white lines separated by symbols rather than colour (dashed line, dotted line etc).
The other thing is the histogram itself. How do I get rid of the grey background in the bars?
Can I change the bars also to black and white symbol lines (shaded etc)? So that they would match the density lines?
Last but not least, I want to add a second x axis, or in this case a second y axis because of coord_flip(). The one I see right now is for the density. The other one I need would be for the count data from the pred2 variable.
Thanks for helping.
Best,
Moritz
Different line types: inside aes(), put linetype = pred2. To make the line color black, add the argument color = "black" inside geom_density.
The "background" of the bars is called "fill". Inside geom_bar, you can set fill = NA for no fill. A more common approach is to fill in the bars with the colors, inside aes() specify fill = pred2. You might consider faceting by your variable, + facet_wrap(~ pred2, nrow = 1) might look very nice.
Shaded bars in ggplot? No, you can't do that easily. See the answers to this question for other options and hacks.
Second y-axis: as with the shaded bars, the ggplot creator thinks a second y-axis is a terrible design choice, so you can't do it at all easily. Here's a related question, including Hadley's point of view:
I believe plots with separate y scales (not y-scales that are transformations of each other) are fundamentally flawed.
It's definitely worth considering his point of view, and asking yourself if those design choices are really what you want.
Different linetypes for densities
Here's my built-in data version of what you're trying to do:
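# cyl is numeric in mtcars; linetype needs a discrete variable, hence factor()
ggplot(mtcars, aes(x = hp,
                   linetype = factor(cyl),
                   group = factor(cyl),
                   color = factor(cyl))) +
  geom_histogram(aes(y=..density.., fill = factor(cyl)),
                 alpha=.5, position="stack") +
  geom_density(color = "black") +
  coord_flip() +
  theme_bw()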
And what I think you should do instead. This version uses facets instead of stacking/colors/linetypes. You seem to be aiming for black and white, which isn't a problem at all in this version.
ggplot(mtcars, aes(x = hp,
group = cyl)) +
geom_histogram(aes(y=..density..),
alpha=.5) +
geom_density() +
facet_wrap(~ cyl, nrow = 1) +
coord_flip() +
theme_bw()
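For completeness, a hedged sketch of a black-and-white take on the stacked version, combining the fill = NA and linetype suggestions from the list above. The bins = 10 value is an arbitrary choice of mine, and after_stat(density) assumes ggplot2 3.3.0 or later (on older versions, ..density.. does the same job).

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = hp, linetype = factor(cyl))) +
  # fill = NA leaves the bars empty; the black outlines differ only by linetype
  geom_histogram(aes(y = after_stat(density)), bins = 10,
                 fill = NA, colour = "black", position = "stack") +
  # the density curves inherit the same linetype mapping
  geom_density(colour = "black") +
  labs(linetype = "cyl") +
  coord_flip() +
  theme_bw()
p
```

With three overlapping groups this can still be hard to read, which is the argument for the faceted version.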

How to get scales to appear under each facet of ggplot2?

I would like the scales to appear under each facet in ggplot2. The graph produced by the following code shows the scales only under the last row of facets. It is probably simple, but I can't seem to get a handle on this. Below is my R code.
ggplot(aes(x=month),data=tot.d.pred ) +
geom_line(aes(y=tot.diff.pred, colour="tot.diff.pred")) +
geom_point(aes(y=tot.diff.pred, colour="tot.diff.pred")) +
facet_wrap(~state_code, nrow=3) + ylab("diff_pred") + xlab("Month") +
scale_colour_manual("",breaks=c("tot.diff.pred"), values=c("red"))
Many thanks in advance!
Setting scales = "free_x" inside facet_wrap will print x-scales on every facet. If your data already has the same xmin and xmax in each facet, nothing else will change. Otherwise, you can specify limits with scale_x_continuous().
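A small sketch of what that looks like. The data frame d is a made-up stand-in for tot.d.pred (six hypothetical states, twelve months each), and the limits and breaks are illustrative values, not something taken from the question.

```r
library(ggplot2)

# Hypothetical stand-in for tot.d.pred: 6 states, 12 months each
set.seed(42)
d <- expand.grid(month = 1:12, state_code = paste0("S", 1:6))
d$tot.diff.pred <- rnorm(nrow(d))

p <- ggplot(d, aes(x = month, y = tot.diff.pred)) +
  geom_line(colour = "red") +
  geom_point(colour = "red") +
  # scales = "free_x" draws an x axis under every panel
  facet_wrap(~state_code, nrow = 3, scales = "free_x") +
  # fixed limits keep the panels comparable if the data ranges differ
  scale_x_continuous(limits = c(1, 12), breaks = seq(1, 12, by = 2)) +
  xlab("Month") + ylab("diff_pred")
p
```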

making line legends for geom_density in ggplot2 in R

With ggplot2, I make the following density plot:
ggplot(iris) + geom_density(aes(x=Sepal.Width, colour=Species))
The colour legend (for each Species value) appears as a box with a line through it, but the density plotted is a line. Is there a way to make the legend appear as just a colored line for each entry of Species, rather than a box with a line through it?
One possibility is to use stat_density() with geom="line". In this case, however, only the upper lines are drawn.
ggplot(iris)+
stat_density(aes(x=Sepal.Width, colour=Species),
geom="line",position="identity")
If you also need the whole area (all lines), you can combine geom_density() with show_guide=FALSE (to remove its legend) and stat_density(), which will add a legend with just horizontal lines.
ggplot(iris) +
geom_density(aes(x=Sepal.Width, colour=Species),show_guide=FALSE)+
stat_density(aes(x=Sepal.Width, colour=Species),
geom="line",position="identity")
The show_guide argument used in the answer by @liesb is deprecated under ggplot2 3.0.0; it has been renamed to show.legend:
ggplot(iris) +
geom_density(aes(x=Sepal.Width, colour=Species),show.legend=FALSE) +
stat_density(aes(x=Sepal.Width, colour=Species),
geom="line",position="identity", size = 0) +
guides(colour = guide_legend(override.aes=list(size=1)))
ggplot(iris) +
stat_density(aes(x=Sepal.Width, colour=Species),
geom="line",position="identity")
will do what you want.
You can get around plotting the lines twice with:
ggplot(iris) +
geom_density(aes(x=Sepal.Width, colour=Species),show_guide=FALSE) +
stat_density(aes(x=Sepal.Width, colour=Species),
geom="line",position="identity", size = 0) +
guides(colour = guide_legend(override.aes=list(size=1)))
ps: sorry for not commenting on the obviously correct answer -- lack of rep issues :)
pps: I realise the thread is quite old but it helped me today, so it might help someone else sometime...
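For readers on a newer ggplot2, a shorter route may be the key_glyph argument, which swaps the legend key drawn for a layer. This is a sketch on the assumption that ggplot2 3.2.0 or later is installed (as far as I know, key_glyph was added in that release); it is not part of any of the answers above.

```r
library(ggplot2)

# key_glyph = "path" asks for a plain line as the legend key
# instead of the default box with a line through it.
p <- ggplot(iris) +
  geom_density(aes(x = Sepal.Width, colour = Species), key_glyph = "path")
p
```

This keeps the full geom_density() outline and needs only one layer.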
