group by two columns in ggplot2 - r

Is it possible to group by two columns? So the cross product is drawn
by geom_point() and geom_smooth()?
As example:
frame <- data.frame(
  series = rep(c('a', 'b'), 6),
  sample = rep(c('glass', 'water', 'metal'), 4),
  data = 1:12)
ggplot(frame, aes()) # ...
Such that the points 6 and 12 share a group, but not with 3.

Taking the example from this question, using interaction to combine two columns into a new factor:
# Data frame with two continuous variables and two factors
set.seed(0)
x <- rep(1:10, 4)
y <- c(rep(1:10, 2)+rnorm(20)/5, rep(6:15, 2) + rnorm(20)/5)
treatment <- gl(2, 20, 40, labels=letters[1:2])
replicate <- gl(2, 10, 40)
d <- data.frame(x=x, y=y, treatment=treatment, replicate=replicate)
ggplot(d, aes(x=x, y=y, colour=treatment, shape = replicate,
group=interaction(treatment, replicate))) +
geom_point() + geom_line()

For example, a single layer can use its own grouping:
qplot(round, price, data=firm, group=id, color=id, geom='line') +
geom_smooth(aes(group=interaction(size, type)))

Why not just paste those two columns together and use that variable as groups?
frame$grp <- paste(frame[,1],frame[,2])
A somewhat more formal way to do this would be to use the function interaction.
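As a rough sketch of the difference, applied to the data from the question: paste() returns a character vector, while interaction() returns a factor with one level per combination. The column names idx, grp_paste and grp_inter are made up for illustration, and idx is only added so there is something to put on the x axis.

```r
library(ggplot2)

frame <- data.frame(
  series = rep(c("a", "b"), 6),
  sample = rep(c("glass", "water", "metal"), 4),
  data   = 1:12
)
frame$idx <- seq_len(nrow(frame))  # illustrative x position, not in the question

# Two ways to build one grouping variable from two columns
frame$grp_paste <- paste(frame$series, frame$sample)        # character, e.g. "b metal"
frame$grp_inter <- interaction(frame$series, frame$sample)  # factor, e.g. b.metal

# Rows 6 and 12 are both (b, metal); row 3 is (a, metal)
frame[c(3, 6, 12), c("series", "sample", "grp_inter")]

p <- ggplot(frame, aes(x = idx, y = data,
                       group = grp_inter, colour = grp_inter)) +
  geom_point() +
  geom_line()
p
```

Either column can be passed to group; the factor version also keeps a defined level order, which can matter for legends.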

Related

Plot two functions in ggplot2 with different x range limits

I have plotted linear functions with ggplot as follows:
ggplot(data.frame(x=c(0,320)), aes(x)) +
stat_function(fun=function(x)60.762126*x-549.98, geom="line", colour="black") +
stat_function(fun=function(x)-0.431181333*x+2.378735e+02, geom="line", colour="black")+
ylim(-600,600)
However, I want the 1st function to be plotted for x ranging from 0 to 12 and the 2nd function to be plotted for x ranging from 12 to max(x).
Does anyone know how to do it?
It's easiest to just calculate the data you need outside of the ggplot call first.
fun1 <- function(x) 60.762126 * x - 549.98
dat1 <- data.frame(x = c(0, 12), y = NA)
dat1$y <- fun1(dat1$x)
fun2 <- function(x) -0.431181333 * x + 2.378735e+02
dat2 <- data.frame(x = c(12, 320), y = NA)
dat2$y <- fun2(dat2$x)
ggplot(mapping = aes(x, y)) +
geom_line(data = dat1) +
geom_line(data = dat2)
Or you can join the data for the lines first (as suggested by @Heroka), resulting in an identical plot:
dat.com <- rbind(dat1, dat2)
dat.com$gr <- rep(1:2, c(nrow(dat1), nrow(dat2)))
ggplot(dat.com, aes(x, y, group = gr)) +
geom_line()
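Another option, sketched here on the assumption that your ggplot2 version supports the xlim argument of stat_function(), is to keep the original stat_function() calls and restrict each one to its own x range:

```r
library(ggplot2)

fun1 <- function(x) 60.762126 * x - 549.98
fun2 <- function(x) -0.431181333 * x + 2.378735e+02

# xlim limits the range over which each function is evaluated and drawn
p <- ggplot(data.frame(x = c(0, 320)), aes(x)) +
  stat_function(fun = fun1, xlim = c(0, 12)) +
  stat_function(fun = fun2, xlim = c(12, 320)) +
  ylim(-600, 600)
p
```

This avoids building the helper data frames, at the cost of hard-coding the break point at x = 12 in two places.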

Create part-fixed, part-free axis limits on facets with ggplot?

I'd like to create a faceted plot using ggplot2 in which the minimum limit of the y axis will be fixed (say at 0) and the maximum limit will be determined by the data in the facet (as it is when scales="free_y"). I was hoping that something like the following would work, but no such luck:
library(plyr)
library(ggplot2)
#Create the underlying data
l <- gl(2, 10, 20, labels=letters[1:2])
x <- rep(1:10, 2)
y <- c(runif(10), runif(10)*100)
df <- data.frame(l=l, x=x, y=y)
#Create a separate data frame to define axis limits
dfLim <- ddply(df, .(l), function(y) max(y$y))
names(dfLim)[2] <- "yMax"
dfLim$yMin <- 0
#Create a plot that works, but has totally free scales
p <- ggplot(df, aes(x=x, y=y)) + geom_point() + facet_wrap(~l, scales="free_y")
#Add y limits defined by the limits dataframe
p + ylim(dfLim$yMin, dfLim$yMax)
It's not too surprising to me that this throws an error (length(lims) == 2 is not TRUE) but I can't think of a strategy to get started on this problem.
In your case, either of the following will work:
p + expand_limits(y=0)
p + aes(ymin=0)
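For reference, here is a self-contained sketch of the expand_limits() variant using the data from the question (the seed is arbitrary and only there for reproducibility). expand_limits() makes sure 0 is included when each facet's y scale is computed, while scales="free_y" still lets the upper end follow the data in that facet:

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only so the example is reproducible
df <- data.frame(
  l = gl(2, 10, 20, labels = letters[1:2]),
  x = rep(1:10, 2),
  y = c(runif(10), runif(10) * 100)
)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~l, scales = "free_y") +
  expand_limits(y = 0)  # every facet's y range now includes 0
p
```

Note that the default scale expansion still adds a little padding, so the axis will start slightly below 0 rather than exactly at it.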

Plotting two variables using ggplot2 - same x axis

I have two graphs with the same x axis - the range of x is 0-5 in both of them.
I would like to combine both of them to one graph and I didn't find a previous example.
Here is what I got:
c <- ggplot(survey, aes(often_post,often_privacy)) + stat_smooth(method="loess")
c <- ggplot(survey, aes(frequent_read,often_privacy)) + stat_smooth(method="loess")
How can I combine them?
The y axis is "often privacy" and in each graph the x axis is "often post" or "frequent read".
I thought I can combine them easily (somehow) because the range is 0-5 in both of them.
Many thanks!
Example code for Ben's solution.
library(ggplot2)
library(reshape2) # provides melt()
#Sample data
survey <- data.frame(
often_post = runif(10, 0, 5),
frequent_read = 5 * rbeta(10, 1, 1),
often_privacy = sample(10, replace = TRUE)
)
#Reshape the data frame
survey2 <- melt(survey, measure.vars = c("often_post", "frequent_read"))
#Plot using colour as an aesthetic to distinguish lines
(p <- ggplot(survey2, aes(value, often_privacy, colour = variable)) +
geom_point() +
geom_smooth()
)
You can use + to combine other plots on the same ggplot object. For example, to plot points and smoothed lines for both pairs of columns:
ggplot(survey, aes(often_post,often_privacy)) +
geom_point() +
geom_smooth() +
geom_point(aes(frequent_read,often_privacy)) +
geom_smooth(aes(frequent_read,often_privacy))
Try this (x_var, y1_var and y2_var stand for your own vectors):
df <- data.frame(x=x_var, y=y1_var, type='y1')
df <- rbind(df, data.frame(x=x_var, y=y2_var, type='y2'))
ggplot(df, aes(x, y, group=type, col=type)) + geom_line()

ggplot2 Scatter Plot Labels

I'm trying to use ggplot2 to create and label a scatterplot. Both variables are scaled, so the horizontal and vertical axes are in units of standard deviation (1, 2, 3, 4, etc. from the mean). What I would like to do is label ONLY those elements that lie beyond a certain number of standard deviations from the mean. Ideally, the labels would come from another column of data.
Is there a way to do this?
I've looked through the online manual, but I haven't been able to find anything about defining labels for plotted data.
Help is appreciated!
Thanks!
BEB
Use subsetting:
library(ggplot2)
x <- data.frame(a=1:10, b=rnorm(10))
x$lab <- letters[1:10]
ggplot(data=x, aes(a, b, label=lab)) +
geom_point() +
geom_text(data = subset(x, abs(b) > 0.2), vjust=0)
The labeling can be done in the following way:
library("ggplot2")
x <- data.frame(a=1:10, b=rnorm(10))
x$lab <- rep("", 10) # create empty labels
x$lab[c(1,3,4,5)] <- LETTERS[1:4] # some labels
ggplot(data=x, aes(x=a, y=b, label=lab)) + geom_point() + geom_text(vjust=0)
Subsetting outside of the ggplot function:
library(ggplot2)
set.seed(1)
x <- data.frame(a = 1:10, b = rnorm(10))
x$lab <- letters[1:10]
x$lab[!(abs(x$b) > 0.5)] <- NA
ggplot(data = x, aes(a, b, label = lab)) +
geom_point() +
geom_text(vjust = 0)
Using qplot:
qplot(a, b, data = x, label = lab, geom = c('point','text'))
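If you would rather not modify or subset the data at all, the condition can also go directly into the label aesthetic. This is only a sketch: threshold is a hypothetical cutoff, and points inside it simply get an empty label.

```r
library(ggplot2)

set.seed(1)
x <- data.frame(a = 1:10, b = rnorm(10))
x$lab <- letters[1:10]
threshold <- 0.5  # hypothetical cutoff, in the same units as b

p <- ggplot(x, aes(a, b)) +
  geom_point() +
  # label only the points whose |b| exceeds the cutoff
  geom_text(aes(label = ifelse(abs(b) > threshold, lab, "")), vjust = 0)
p
```

The lab column stays intact here, so the same data frame can be reused with a different cutoff.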
