R, how to add one break to the default breaks in ggplot?

Suppose I have the following issue: having a set of data, generate a chart indicating how many datapoints are below any given threshold.
This is fairly easy to achieve:
library(ggplot2)
n.data <- 215
set.seed(0)
dt <- rnorm(n.data) ^ 2
x <- seq(0, 5, by=.2)
y <- sapply(x, function(i) length(which(dt < i)))
ggplot() +
geom_point(aes(x=x,y=y)) +
geom_hline(yintercept = n.data)
The question is: suppose I want to add a label to indicate what the total number of observations was (n.data). How do I do that while keeping the other breaks at their defaults?
The outcome I'd like looks something like the image below, generated with the code
ggplot() +
geom_point(aes(x=x,y=y)) +
geom_hline(yintercept = n.data) +
scale_y_continuous(breaks = c(seq(0,200,50),n.data))
However, I'd like this to work even when I change the value of n.data, just by adding it to the default breaks.
(bonus points if you also get rid of the grid line between the last default break and the n.data one!)

Three years and some more knowledge of ggplot later, here's how I would do this today.
ggplot() +
geom_point(aes(x=x,y=y)) +
geom_hline(yintercept = n.data) +
scale_y_continuous(breaks = c(pretty(y), n.data))

Here is how you can get rid of the grid line between the last auto break and the manual one:
theme_update(panel.grid.minor=element_blank())
For the rest, I can't quite understand your question, as when you change n.data, your break is updated.
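One way to keep the automatically chosen breaks and simply append n.data is to pass a function to breaks. The sketch below assumes the scales package is available (it is installed alongside ggplot2); with_extra_break is a hypothetical helper name, not part of ggplot2. Setting minor_breaks = NULL drops all minor grid lines on the y axis, which also removes the one between the last default break and n.data.

```r
library(ggplot2)

n.data <- 215
set.seed(0)
dt <- rnorm(n.data)^2
x <- seq(0, 5, by = 0.2)
y <- sapply(x, function(i) sum(dt < i))

# Hypothetical helper: builds a breaks function that computes the usual
# breaks for the axis limits and appends `extra` to them.
with_extra_break <- function(extra) {
  function(limits) {
    sort(unique(c(scales::extended_breaks()(limits), extra)))
  }
}

p <- ggplot(data.frame(x = x, y = y), aes(x, y)) +
  geom_point() +
  geom_hline(yintercept = n.data) +
  scale_y_continuous(breaks = with_extra_break(n.data),
                     minor_breaks = NULL)  # no minor grid lines on y
p
```

Because the breaks are computed from the limits at draw time, changing n.data and rerunning the code moves the extra break without touching the scale call.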

Related

big dotted geom_line, but dots closer together

I am curious if it is possible to increase the size of dots on a dotted line within geom_line but keep the dots closer together. The R code below produces a basic reproducible example of what I see, and then I include another chart showing what I would like to see.
library(dplyr)
library(ggplot2)
set.seed(5223)
myDF <- data.frame(x=rnorm(20,0,1),
y=runif(20,0,20))
myDF <- myDF %>%
mutate(From8to12 = y>=8 & y<=12)
ggplot(myDF,aes(x=x,y=y,col=From8to12)) +
geom_point() +
geom_hline(yintercept=8,lty="dotted") +
geom_hline(yintercept=12,lty="dotted",size=1.5)
Unedited Image
(Manually) Edited Image (In Paint)
I would like to make the dots bigger, but closer together. Is this possible? I've found nothing online.
This works like lty in base R plot. One way is to specify explicitly how long the dash and the gap should be:
ggplot(myDF,aes(x=x,y=y,col=From8to12)) +
geom_point() +
geom_hline(yintercept=8,lty="dotted") +
geom_hline(yintercept=12,lty="11",size=1.5)
The "11" means a dash of length 1 followed by a gap of length 1, and this pattern repeats. You can see more about it here.
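To get a feel for these strings, it can help to draw a few patterns side by side. The sketch below is only an illustration: the patterns chosen are arbitrary, and it assumes ggplot2 3.4.0 or later for the linewidth argument (older versions use size, as in the code above). Each string is a run of hex digits giving alternating dash and gap lengths, and the lengths scale with the line width.

```r
library(ggplot2)

# Alternating dash/gap lengths written as hex digits:
# "13" = dash of 1, gap of 3; "1343" = dash 1, gap 3, dash 4, gap 3.
patterns <- data.frame(
  y = 1:4,
  pattern = c("11", "13", "31", "1343"),
  stringsAsFactors = FALSE
)

p <- ggplot(patterns) +
  geom_hline(aes(yintercept = y, linetype = pattern), linewidth = 1.5) +
  scale_linetype_identity() +  # use the strings in the data as linetypes
  scale_y_continuous(breaks = patterns$y, labels = patterns$pattern,
                     limits = c(0.5, 4.5))
p
```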
Edit
I thought it wasn't possible without some severe hacking of the underlying draw functions; @StupidWolf's answer proved me wrong. My suggestion was to draw every single point with geom_point and shape = 15 (filled square). The result then depends on your final plot size and on the parameters you choose (i.e., the distance between the 'dots' and their size).
P.S. It's impressive that you have actually managed to produce your image in paint.
library(tidyverse)
set.seed(5223)
myDF <- data.frame(x = rnorm(20, 0, 1), y = runif(20, 0, 20))
dot_dis <- 0.05
x_line <- seq(min(myDF$x), max(myDF$x), dot_dis)
y_line <- 12
ggplot() +
geom_point(aes(x, y), data = myDF) +
geom_point(aes(x_line, y_line), shape = 15, size = 1.5)
Created on 2020-02-18 by the reprex package (v0.3.0)

Customize linetype in ggplot2 OR add automatic arrows/symbols below a line

I would like to use customized linetypes in ggplot. If that is impossible (which I believe to be true), then I am looking for a smart hack to plot arrowlike symbols above, or below, my line.
Some background:
I want to plot some water quality data and compare it to the standard (set by the European Water Framework Directive) in a red line. Here's some reproducible data and my plot:
datum <- seq.Date(as.Date("2014-01-01"), as.Date("2014-12-31"), by = "week")
df <- data.frame(datum = datum, y = rnorm(length(datum), mean = 100, sd = 40))
(plot1 <-
ggplot(df, aes(x=datum,y=y)) +
geom_line() +
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
However, in this plot it is completely unclear if the Standard is a maximum value (as it would be for example Chloride) or a minimum value (as it would be for Oxygen). So I would like to make this clear by adding small pointers/arrows Up or Down. The best way would be to customize the linetype so that it consists of these arrows, but I couldn't find a way.
Q1: Is this at all possible, defining custom linetypes?
All I could think of was adding extra points below the line:
extrapoints <- data.frame(datum2 = seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y2=68)
plot1 + geom_point(data=extrapoints, aes(x=datum2,y=y2),
shape=">",size=5,colour="red",rotate=90)
However, I can't seem to rotate these symbols pointing downward. Furthermore, this requires calculating the right spacing of X and distance to the line (Y) every time, which is rather inconvenient.
Q2: Is there any way to achieve this, preferably as automated as possible?
I'm not sure what is requested, but it sounds as though you want arrows that point up or down depending on whether the y-value is greater or less than some expected value. If that's the case, then geom_segment does the job:
require(grid) # as noted by ?geom_segment
(plot1 <-
ggplot(df, aes(x=datum,y=y)) + geom_line()+
geom_segment(data = data.frame(datum = df$datum, y = 70, up = df$y > 70),
aes(xend = datum , yend =70 + c(-1,1)[1+up]*5), #select up/down based on 'up'
arrow = arrow(length = unit(0.1,"cm"))
) + # adjust units to modify size or arrow-heads
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
If I'm wrong about what was desired and you only wanted a bunch of down arrows, then just take out the stuff about creating and using "up" and use a minus-sign.
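The down-arrows-only variant could look roughly like the sketch below. It is an illustration under assumptions: the names standard and arrows_df are made up, the arrows are placed at every fourth date, and the arrow length of 5 y-units is arbitrary. It relies on ggplot2 re-exporting arrow() and unit() from grid.

```r
library(ggplot2)

set.seed(1)
datum <- seq.Date(as.Date("2014-01-01"), as.Date("2014-12-31"), by = "week")
df <- data.frame(datum = datum, y = rnorm(length(datum), mean = 100, sd = 40))

standard <- 70  # assumed threshold, drawn as the red line

# One arrow every fourth week, starting on the standard line.
arrows_df <- data.frame(datum = datum[seq(1, length(datum), by = 4)],
                        y = standard)

p <- ggplot(df, aes(x = datum, y = y)) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = standard, colour = "red") +
  geom_segment(data = arrows_df,
               aes(xend = datum, yend = y - 5),  # minus sign: point down
               colour = "red",
               arrow = arrow(length = unit(0.1, "cm"))) +
  theme_classic()
p
```

Using y + 5 instead of y - 5 makes the arrows point up, which would suit a standard that is a minimum value.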

Get rid of second legend in ggplot2

I've got some problems with ggplot2 again.
I want to plot at least two datasets with two different colors and two different shapes.
This works, but when I try to set the names for the legend, the legend is automatically doubled.
The number of datasets can change, and so can the legend names.
I need code that works for more than just this example:
library(ggplot2)
library(reshape2) # provides melt()
xdata=1:5
ydata=c(3.45,4.67,7.8,8.98,10)
ydata2=c(12.4,13.5,14.6,15.8,16)
p <-data.frame(matrix(NA,nrow=5,ncol=3))
p$X1 <- xdata
p$X2 <- ydata
p$X3 <- ydata2
shps <-c(1,2)
colp <-c("navy","red3")
p <- melt(p,id="X1")
px <-ggplot(p,aes(X1,value))
legendnames <- c("name1","name2")
px <- px +aes(shape = factor(variable))+
geom_point(aes(colour =factor(variable)))+
theme_bw()+
scale_shape_manual(labels=legendnames,values =shps )+
scale_color_manual(values = colp)
px
This gives me this:
But I want this, with my legendnames as the labels:
(For the second image I just deleted labels=legendnames from scale_shape_manual.)
So how do I solve this problem?
Please help.
I think this is just a matter of providing the same labels parameter to the scale_color_manual, otherwise it doesn't know how to consolidate the legends together.
So
px <- px + aes(shape = factor(variable)) +
geom_point(aes(colour = factor(variable))) +
theme_bw()+
scale_shape_manual(labels=legendnames, values = shps)+
scale_color_manual(labels=legendnames, values = colp)
px
It's not really a problem: you programmed it in yourself by using legendnames (which it then adds, even though those names are not in your data). If you remove them, the plot behaves as you want:
shps <-c(X2=1,X3=2)
colp <-c(X2="navy",X3="red3")
# Use a new name instead of overwriting p, so the code is easy to rerun
p2 <- melt(p,id="X1")
px <- ggplot(data=p2) + geom_point(aes(x=X1, y=value,shape=variable,colour=variable)) +
scale_shape_manual(values=shps)+
scale_color_manual(values=colp)
px
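Both answers come down to the same rule: the colour and shape legends are merged only when the two scales agree on their title and labels. A self-contained sketch of that, building the long-format data directly instead of calling melt (the legend title "Dataset" is an assumption added for illustration):

```r
library(ggplot2)

# Long-format data equivalent to the melted data frame in the question.
long <- data.frame(
  X1 = rep(1:5, times = 2),
  variable = factor(rep(c("X2", "X3"), each = 5)),
  value = c(3.45, 4.67, 7.8, 8.98, 10, 12.4, 13.5, 14.6, 15.8, 16)
)

legendnames <- c("name1", "name2")  # one label per level of `variable`
shps <- c(1, 2)
colp <- c("navy", "red3")

# Same name and same labels on both scales, so ggplot2 draws one legend.
p <- ggplot(long, aes(X1, value, colour = variable, shape = variable)) +
  geom_point() +
  scale_shape_manual(name = "Dataset", labels = legendnames, values = shps) +
  scale_colour_manual(name = "Dataset", labels = legendnames, values = colp) +
  theme_bw()
p
```

For a different number of datasets, legendnames, shps and colp each need one entry per level of variable.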

ggplot2: Changing which values are annotated on the x-axis

My question is certainly a duplicate, but I can't find the answer.
On the x-axis the values that have a tick in my plot are: 2.5,5,7.5,10,12.5.
I want to modify which values have a tick in order to see the following values: 2,4,6,8,10,12
To make sure I am understood: I do not want to change my axes to something that is not Cartesian, I just want to change which positions on the x-axis are annotated.
How can I achieve this?
Here is my current code:
ggplot(data.and.factors.prov,aes(x=number.of.traits,y=FP,colour=factor(Corr))) +
stat_summary(fun.data=mean_cl_normal,position=position_dodge(width=0.2)) +
geom_blank() +
geom_smooth(method='lm',se=F,formula=y~I(x)) +
labs(x='Number of traits') +
scale_colour_manual(values=c(1:6),name='Correlation Coefficient') +
xlim(c(1,12))
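Use scale_x_continuous(breaks = seq(2, 12, by = 2), limits = c(1, 12)) in place of xlim(c(1, 12)). The x variable is numeric, so the continuous scale is the one to adjust, and a plot can only hold one x scale, so the limits go in the same call.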
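If number.of.traits is numeric, the tick positions can be set on the continuous x scale. The sketch below uses made-up data (toy and its values are assumptions, since the original data.and.factors.prov is not shown) and sets the limits in the same scale call rather than with xlim():

```r
library(ggplot2)

set.seed(1)
# Made-up stand-in for data.and.factors.prov.
toy <- data.frame(number.of.traits = rep(1:12, each = 3))
toy$FP <- 0.1 * toy$number.of.traits + rnorm(nrow(toy), sd = 0.2)

x_breaks <- seq(2, 12, by = 2)  # ticks at 2, 4, 6, 8, 10, 12

p <- ggplot(toy, aes(number.of.traits, FP)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  scale_x_continuous(breaks = x_breaks, limits = c(1, 12)) +
  labs(x = "Number of traits")
p
```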

Add rectangles around common values in ggplot

When I make an experimental design, I use ggplot to show the layout. Here's a simple example:
df <- data.frame(Block=rep(1:2, each=18),
Row=rep(1:9, 4),
Col=rep(1:4, each=9),
Treat=sample(c(1:6),replace=F))
Which I'll plot like:
df.p <- ggplot(df, aes(Row, Col)) + geom_tile(aes(fill=as.factor(Treat)))
to give:
Sometimes I have a structure within the design I would like to highlight by putting a box around it, for example a mainplot. In this case:
df$Mainplot <- ceiling(df$Row/3) + 3*(ceiling(df$Col/2) - 1)
I then use geom_rect and some messy code that needs adjusting for each design to generate something like:
Question: How do I add the rectangles around the mainplots in a simple way? It seems like a simple enough problem, but I haven't found an obvious way. I can map colour or some other aesthetic to mainplot, but I can't seem to surround them with a box. Any pointers greatly appreciated.
Here is a possible solution where I create an auxiliary data.frame for plotting borders with geom_rect(). I'm not sure if this is as simple as you would like! I hope the code that computes the rectangle coordinates will be reusable/generalizable with just a bit of additional effort.
library(ggplot2)
# Load example data.
df = data.frame(Block=rep(1:2, each=18),
Row=rep(1:9, 4),
Col=rep(1:4, each=9),
Treat=sample(c(1:6),replace=F))
df$Mainplot = ceiling(df$Row/3) + 3*(ceiling(df$Col/2) - 1)
# Create an auxiliary data.frame for plotting borders.
group_dat = data.frame(Mainplot=sort(unique(df$Mainplot)),
xmin=0, xmax=0, ymin=0, ymax=0)
# Fill data.frame with appropriate values.
for(i in 1:nrow(group_dat)) {
item = group_dat$Mainplot[i]
tmp = df[df$Mainplot == item, ]
group_dat[i, "xmin"] = min(tmp$Row) - 0.5
group_dat[i, "xmax"] = max(tmp$Row) + 0.5
group_dat[i, "ymin"] = min(tmp$Col) - 0.5
group_dat[i, "ymax"] = max(tmp$Col) + 0.5
}
p2 = ggplot() +
geom_tile(data=df, aes(x=Row, y=Col, fill=factor(Treat)),
colour="grey30", size=0.35) +
geom_rect(data=group_dat, aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax),
size=1.4, colour="grey30", fill=NA)
ggsave(filename="plot_2.png", plot=p2, height=3, width=6.5)
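The loop that fills in the rectangle coordinates can be wrapped in a small function so that it works for any grouping column. This is a sketch only: group_bounds is a hypothetical helper, not a ggplot2 function, and it assumes ggplot2 3.4.0 or later for linewidth (older versions use size, as above).

```r
library(ggplot2)

set.seed(42)
df <- data.frame(Row = rep(1:9, 4),
                 Col = rep(1:4, each = 9),
                 Treat = sample(1:6, 36, replace = TRUE))
df$Mainplot <- ceiling(df$Row / 3) + 3 * (ceiling(df$Col / 2) - 1)

# Hypothetical helper: one bounding box per level of `group`,
# padded by half a tile so the border sits on the tile edges.
group_bounds <- function(data, group, x = "Row", y = "Col", pad = 0.5) {
  pieces <- split(data, data[[group]])
  bounds <- lapply(pieces, function(d) {
    data.frame(group = d[[group]][1],
               xmin = min(d[[x]]) - pad, xmax = max(d[[x]]) + pad,
               ymin = min(d[[y]]) - pad, ymax = max(d[[y]]) + pad)
  })
  do.call(rbind, bounds)
}

rects <- group_bounds(df, "Mainplot")

p <- ggplot() +
  geom_tile(data = df, aes(x = Row, y = Col, fill = factor(Treat)),
            colour = "grey30") +
  geom_rect(data = rects,
            aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            colour = "black", fill = NA, linewidth = 1.4)
p
```

This only draws sensible borders when each group occupies a full rectangle of tiles, as the mainplots in this design do.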
Here's a solution that might be easier. Just use geom_tile with alpha set to 0. I didn't take the time to give you an exact solution, but here's an example. To achieve what you want, I'm guessing you'll need to actually create a new data frame, which should be easy enough.
df <- data.frame(Block=rep(1:2, each=18),Row=rep(1:9, 4),Col=rep(1:4, each=9),Treat=sample(c(1:6),replace=F))
df$blocking <- rep(sort(rep(1:3,3)),4)
df.p <- ggplot(df, aes(Row, Col)) + geom_tile(aes(fill=as.factor(Treat)))
df.p+ geom_tile(data=df,aes(x=Row,y=blocking),colour="black",fill="white",alpha=0,lwd=1.4)
The alpha=0 will create a blank tile, and then you can set the line width using lwd. That's probably easier than specifying all the rectangles. Hope it helps.
I thought it would be worth posting my own (non-ideal) solution, since it seems there's nothing obvious I'm missing. I'm going to leave the question unanswered in the hope someone will come up with something.
At present, I use geom_rect in a fashion that would probably be able to be made general (perhaps into a geom_border addition to ggplot??). For the example in my question, the essential information is that each mainplot is 3 x 2.
Adding onto df.p from the original question, this is what I do currently:
df.p1 <- df.p + geom_rect(aes(xmin=((Mainplot- 3*(ceiling(Col/2)-1) )-1)*3 + 0.5,
xmax=((Mainplot - 3*(ceiling(Col/2)-1))-1)*3 + 3.5,
ymin=ceiling(ceiling(Col/2)/2 + 2*(ceiling(Col/2)-1))-0.5,
ymax=2*ceiling(Col/2)+0.5),
colour="black", fill="transparent",size=1)
Ugly, I know - hence the question. That code generates the second plot from the question. Maybe the best option is building this all into a function.
