Is it possible to group by two columns, so that geom_point() and geom_smooth()
draw one group for each combination (the cross product) of the two?
For example:
frame <- data.frame(
series = rep(c('a', 'b'), 6),
sample = rep(c('glass', 'water', 'metal'), 4),
data = 1:12)
ggplot(frame, aes()) # ...
Such that points 6 and 12 share a group, but point 3 is in a different group.
Taking the example from this question, you can use interaction() to combine the two columns into a new factor:
library(ggplot2)
# Data frame with two continuous variables and two factors
set.seed(0)
x <- rep(1:10, 4)
y <- c(rep(1:10, 2)+rnorm(20)/5, rep(6:15, 2) + rnorm(20)/5)
treatment <- gl(2, 20, 40, labels=letters[1:2])
replicate <- gl(2, 10, 40)
d <- data.frame(x=x, y=y, treatment=treatment, replicate=replicate)
ggplot(d, aes(x=x, y=y, colour=treatment, shape = replicate,
group=interaction(treatment, replicate))) +
geom_point() + geom_line()
For example, the grouping can also be set for a single layer:
qplot(round, price, data=firm, group=id, color=id, geom='line') +
geom_smooth(aes(group=interaction(size, type)))
Why not just paste those two columns together and use that variable as groups?
frame$grp <- paste(frame[,1],frame[,2])
A somewhat more formal way to do this would be to use the function interaction.
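As a minimal sketch of the two suggestions side by side, assuming the question's data frame with its columns named series, sample and data (the names grp_paste and grp_int are made up for illustration): paste() returns a character vector, while interaction() returns a factor with one level per combination of the two columns.

```r
library(ggplot2)

# The question's data, with the columns named via `=` (assumed names)
frame <- data.frame(
  series = rep(c('a', 'b'), 6),
  sample = rep(c('glass', 'water', 'metal'), 4),
  data   = 1:12
)

# Two ways of building one grouping variable from two columns
frame$grp_paste <- paste(frame$series, frame$sample)        # character vector
frame$grp_int   <- interaction(frame$series, frame$sample)  # factor

# Rows 6 and 12 end up in the same group; row 3 does not
frame[c(3, 6, 12), ]

# interaction() can also be used directly inside aes()
p <- ggplot(frame, aes(x = seq_along(data), y = data,
                       group = interaction(series, sample))) +
  geom_point() +
  geom_line()
```

Either variable can be passed to group; the factor version is handy if the combined levels should also drive colour or facets.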
I have plotted linear functions with ggplot as follows:
ggplot(data.frame(x=c(0,320)), aes(x)) +
stat_function(fun=function(x)60.762126*x-549.98, geom="line", colour="black") +
stat_function(fun=function(x)-0.431181333*x+2.378735e+02, geom="line", colour="black")+
ylim(-600,600)
However, I want the 1st function to be plotted for x ranging from 0 to 12 and the 2nd function to be plotted for x ranging from 12 to max(x).
Does anyone know how to do it?
It's easiest to just calculate the data you need outside of the ggplot call first.
fun1 <- function(x) 60.762126 * x - 549.98
dat1 <- data.frame(x = c(0, 12), y = NA)
dat1$y <- fun1(dat1$x)
fun2 <- function(x) -0.431181333 * x + 2.378735e+02
dat2 <- data.frame(x = c(12, 320), y = NA)
dat2$y <- fun2(dat2$x)
ggplot(mapping = aes(x, y)) +
geom_line(data = dat1) +
geom_line(data = dat2)
Or you can join the data for the lines first (as suggested by @Heroka), resulting in an identical plot:
dat.com <- rbind(dat1, dat2)
dat.com$gr <- rep(1:2, c(nrow(dat1), nrow(dat2)))
ggplot(dat.com, aes(x, y, group = gr)) +
geom_line()
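Another option, sketched here on the assumption that your ggplot2 version's stat_function() accepts an xlim argument, is to keep the original stat_function() layers and restrict the range over which each function is evaluated. The names fun1 and fun2 are reused from the answer above.

```r
library(ggplot2)

fun1 <- function(x) 60.762126 * x - 549.98
fun2 <- function(x) -0.431181333 * x + 2.378735e+02

# Each stat_function() layer evaluates its function only over its own xlim;
# the x scale itself is still trained on the 0 to 320 range of the data
p <- ggplot(data.frame(x = c(0, 320)), aes(x)) +
  stat_function(fun = fun1, xlim = c(0, 12)) +
  stat_function(fun = fun2, xlim = c(12, 320))
```

With these coefficients the two segments do not meet at x = 12 (about 179.2 versus 232.7), so the plot shows a jump there.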
I'd like to create a faceted plot using ggplot2 in which the minimum limit of the y axis is fixed (say at 0) and the maximum limit is determined by the data in the facet (as it is when scales="free_y"). I was hoping that something like the following would work, but no such luck:
library(plyr)
library(ggplot2)
#Create the underlying data
l <- gl(2, 10, 20, labels=letters[1:2])
x <- rep(1:10, 2)
y <- c(runif(10), runif(10)*100)
df <- data.frame(l=l, x=x, y=y)
#Create a separate data frame to define axis limits
dfLim <- ddply(df, .(l), function(y) max(y$y))
names(dfLim)[2] <- "yMax"
dfLim$yMin <- 0
#Create a plot that works, but has totally free scales
p <- ggplot(df, aes(x=x, y=y)) + geom_point() + facet_wrap(~l, scales="free_y")
#Add y limits defined by the limits dataframe
p + ylim(dfLim$yMin, dfLim$yMax)
It's not too surprising to me that this throws an error (length(lims) == 2 is not TRUE) but I can't think of a strategy to get started on this problem.
In your case, either of the following will work:
p + expand_limits(y=0)
p + aes(ymin=0)
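A self-contained sketch of the expand_limits() approach, rebuilding the question's data without plyr (the set.seed() call and the name p0 are additions for illustration):

```r
library(ggplot2)

set.seed(1)  # seed added so the sketch is reproducible
df <- data.frame(
  l = gl(2, 10, 20, labels = letters[1:2]),
  x = rep(1:10, 2),
  y = c(runif(10), runif(10) * 100)
)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~l, scales = "free_y")

# expand_limits() asks every panel to include y = 0; the upper limit
# is still taken from the data in each facet
p0 <- p + expand_limits(y = 0)
```

Printing p0 should show both panels starting at (or just below) zero, while each keeps its own upper limit.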
I have two graphs with the same x axis - the range of x is 0-5 in both of them.
I would like to combine them into one graph, but I could not find a previous example.
Here is what I got:
c <- ggplot(survey, aes(often_post,often_privacy)) + stat_smooth(method="loess")
c <- ggplot(survey, aes(frequent_read,often_privacy)) + stat_smooth(method="loess")
How can I combine them?
The y axis is "often privacy" and in each graph the x axis is "often post" or "frequent read".
I thought I could combine them easily (somehow) because the range is 0-5 in both of them.
Many thanks!
Example code for Ben's solution.
library(ggplot2)
library(reshape2) # provides melt()
#Sample data
survey <- data.frame(
often_post = runif(10, 0, 5),
frequent_read = 5 * rbeta(10, 1, 1),
often_privacy = sample(10, replace = TRUE)
)
#Reshape the data frame
survey2 <- melt(survey, measure.vars = c("often_post", "frequent_read"))
#Plot using colour as an aesthetic to distinguish lines
(p <- ggplot(survey2, aes(value, often_privacy, colour = variable)) +
geom_point() +
geom_smooth()
)
You can use + to add further layers to the same ggplot object, each with its own aesthetic mapping. For example, to plot points and smoothed lines for both pairs of columns:
ggplot(survey, aes(often_post,often_privacy)) +
geom_point() +
geom_smooth() +
geom_point(aes(frequent_read,often_privacy)) +
geom_smooth(aes(frequent_read,often_privacy))
Try this, where x_var, y1_var and y2_var stand for your own vectors:
df <- data.frame(x=x_var, y=y1_var, type='y1')
df <- rbind(df, data.frame(x=x_var, y=y2_var, type='y2'))
ggplot(df, aes(x, y, group=type, col=type)) + geom_line()
I'm trying to use ggplot2 to create and label a scatterplot. Both variables are scaled so that the horizontal and vertical axes are in units of standard deviation from the mean (1, 2, 3, 4, etc.). I would like to label ONLY those points that lie beyond a certain number of standard deviations from the mean. Ideally, the labels would come from another column of data.
Is there a way to do this?
I've looked through the online manual, but I haven't been able to find anything about defining labels for plotted data.
Help is appreciated!
Thanks!
BEB
Use subsetting:
library(ggplot2)
x <- data.frame(a=1:10, b=rnorm(10))
x$lab <- letters[1:10]
ggplot(data=x, aes(a, b, label=lab)) +
geom_point() +
geom_text(data = subset(x, abs(b) > 0.2), vjust=0)
The labeling can be done in the following way:
library("ggplot2")
x <- data.frame(a=1:10, b=rnorm(10))
x$lab <- rep("", 10) # create empty labels
x$lab[c(1,3,4,5)] <- LETTERS[1:4] # some labels
ggplot(data=x, aes(x=a, y=b, label=lab)) + geom_point() + geom_text(vjust=0)
Subsetting outside of the ggplot function:
library(ggplot2)
set.seed(1)
x <- data.frame(a = 1:10, b = rnorm(10))
x$lab <- letters[1:10]
x$lab[!(abs(x$b) > 0.5)] <- NA
ggplot(data = x, aes(a, b, label = lab)) +
geom_point() +
geom_text(vjust = 0)
Using qplot:
qplot(a, b, data = x, label = lab, geom = c('point','text'))
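To tie the answers back to the standard-deviation cut-off in the question, here is a rough sketch that standardises both variables and labels only the points beyond a chosen limit on either axis. The data, the column names (name, a_z, b_z) and the cut-off of 1.5 are all made up for illustration.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: `name` holds the labels, `a` and `b` are the raw variables
x <- data.frame(name = letters[1:20], a = rnorm(20), b = rnorm(20))

# Standardise both variables so the axes are in units of standard deviation
x$a_z <- as.numeric(scale(x$a))
x$b_z <- as.numeric(scale(x$b))

limit <- 1.5  # assumed cut-off, in standard deviations

# Keep a label only if the point lies beyond the cut-off on either axis
outlying <- abs(x$a_z) > limit | abs(x$b_z) > limit
x$lab <- ifelse(outlying, as.character(x$name), NA)

p <- ggplot(x, aes(a_z, b_z, label = lab)) +
  geom_point() +
  geom_text(vjust = 0, na.rm = TRUE)  # na.rm drops unlabelled rows quietly
```

Because the labels come from their own column, any other column of the data frame can be substituted for name.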