I want to create a faceted line plot. In each subplot, one y-value (y1 or y2) is compared to the baseline. The y-value and the baseline should be drawn in different colours, and this colour scheme should be the same in every subplot. The legend only needs two entries, "y-value" and "baseline", since the header of each subplot already names the y-value being compared.
So far I only have this (sample code):
library(ggplot2)
library(reshape)
df = data.frame(c(10,20,40),c(0.1,0.2,0.3),c(0.1,0.4,0.5),c(0.05,0.1,0.2))
names(df)[1]="classes"
names(df)[2]="y1"
names(df)[3]="y2"
names(df)[4]="baseline"
df$classes <- factor(df$classes,levels=c(10,20,40), labels=c("10m","20m","40m"))
dfMelted <- melt(df)
diagram <- ggplot()
diagram <- diagram + theme_bw(base_size=16)
diagram <- diagram + geom_point(data=dfMelted, size=4, aes(x=factor(classes),y=value, colour=variable, shape=variable))
diagram <- diagram + geom_line(data=dfMelted, aes(x=factor(classes),y=value, group=variable, colour=variable))
diagram <- diagram + facet_wrap(~ variable, ncol=1)
diagram
And this is how it looks so far:
I tried to create groups, each comprising one y-dataset and a duplicate of the baseline data, and then faceted on the group column. Unfortunately, this produces many different colours and a huge legend. Is there a better way to do this?
Is this what you had in mind?
df = data.frame(classes=c(10,20,40), y1=c(0.1,0.2,0.3), y2=c(0.1,0.4,0.5),
baseline=c(0.05,0.1,0.2))
df$classes <- factor(df$classes, levels=c(10,20,40),
labels=c("10m","20m","40m"))
# Two melts to create a grouping variable for baseline vs. new value (y1 or y2)
# and another grouping variable for faceting on y1/y2
dfm=melt(df, id.var=c(1,4))
names(dfm)[3] = "y_value"
dfm=melt(dfm, id.var=c(1,3))
ggplot(dfm, aes(x=classes, y=value, group=variable, colour=variable)) +
geom_point() + geom_line() +
theme_bw(base_size=16) +
facet_grid(. ~ y_value)
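To get exactly the two legend entries the question asks for, one option is to build the long table by hand in base R and name the colour groups directly. This is a minimal sketch, not the answerer's code: the names `long` and `series` and the two colours are illustrative assumptions.

```r
library(ggplot2)

df <- data.frame(classes  = factor(c("10m", "20m", "40m")),
                 y1       = c(0.1, 0.2, 0.3),
                 y2       = c(0.1, 0.4, 0.5),
                 baseline = c(0.05, 0.1, 0.2))

# One block of rows per facet (y1, y2); each block holds the y-series
# followed by a copy of the baseline.
long <- do.call(rbind, lapply(c("y1", "y2"), function(v) {
  data.frame(classes = rep(df$classes, 2),
             y_value = v,
             series  = rep(c("y-value", "baseline"), each = nrow(df)),
             value   = c(df[[v]], df$baseline))
}))

# Colour depends only on `series`, so the legend has two entries and the
# colours are identical in every panel.
p <- ggplot(long, aes(x = classes, y = value, group = series, colour = series)) +
  geom_point(size = 4) +
  geom_line() +
  scale_colour_manual(values = c("y-value" = "red", "baseline" = "blue")) +
  facet_wrap(~ y_value, ncol = 1) +
  theme_bw(base_size = 16)
p
```

The facet header then carries the y1/y2 information, which is why the legend no longer needs it.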
You are probably looking for the scale_colour_manual function:
ggplot() +
geom_point(data=dfMelted, size=4, aes(x=factor(classes),y=value, colour=variable, shape=variable)) +
geom_line(data=dfMelted, aes(x=factor(classes),y=value, group=variable, colour=variable)) +
scale_colour_manual(values = c("y1" = "red","baseline" = "blue","y2" = "green")) +
theme_bw(base_size=16) +
facet_grid(variable ~.)
which results in:
I have the following dataset:
subj <- c(rep(11,3),rep(12,3),rep(14,3),rep(15,3),rep(17,3),rep(18,3),rep(20,3))
group <- c(rep("u",3),rep("t",6),rep("u",6),rep("t",6))
time <- rep(1:3,7)
mean <- c(0.7352941, 0.8059701, 0.8823529, 0.9264706, 0.9852941, 0.9558824, 0.7941176, 0.8676471, 0.7910448, 0.7058824, 0.8382353, 0.7941176, 0.9411765, 0.9558824, 0.9852941, 0.7647059, 0.8088235, 0.7968750, 0.8088235, 0.8500000, 0.8412698)
df <- data.frame(subj,group,time,mean)
df$subj <- as.factor(df$subj)
df$time <- as.factor(df$time)
And now I create a barplot with ggplot2:
library(ggplot2)
qplot(x=subj, y=mean*100, fill=time, data=df, geom="bar",stat="identity",position="dodge") +
facet_wrap(~ group)
How do I make it so that the x-axis labels that are not present in each facet are not shown? How do I get equal distances between each subj (i.e. get rid of the bigger gaps)?
You can use scales="free":
ggplot(df, aes(x=subj, y=mean*100, fill=time)) +
geom_bar(stat="identity", position="dodge") +
facet_wrap(~ group, scales="free")
Another option, with slightly different aesthetics, uses facet_grid. In contrast to the plots above, the panels aren't the same width here, but because of space="free_x", the bars are all the same width.
ggplot(df, aes(x=subj, y=mean*100, fill=time)) +
geom_bar(stat="identity", position="dodge") +
facet_grid(~ group, scales="free", space="free_x")
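If only the x-axis should differ between panels, `scales = "free_x"` keeps a shared y-axis, so bar heights stay comparable across groups. This is a minimal sketch that rebuilds the question's data so it runs on its own; `subj_by_group` is a helper name introduced here for illustration.

```r
library(ggplot2)

df <- data.frame(
  subj  = factor(rep(c(11, 12, 14, 15, 17, 18, 20), each = 3)),
  group = c(rep("u", 3), rep("t", 6), rep("u", 6), rep("t", 6)),
  time  = factor(rep(1:3, 7)),
  mean  = c(0.7352941, 0.8059701, 0.8823529, 0.9264706, 0.9852941, 0.9558824,
            0.7941176, 0.8676471, 0.7910448, 0.7058824, 0.8382353, 0.7941176,
            0.9411765, 0.9558824, 0.9852941, 0.7647059, 0.8088235, 0.7968750,
            0.8088235, 0.8500000, 0.8412698)
)

# Subjects that actually occur in each group; with a free x scale each
# panel should show only these on its axis.
subj_by_group <- lapply(split(as.character(df$subj), df$group), unique)

p <- ggplot(df, aes(x = subj, y = mean * 100, fill = time)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~ group, scales = "free_x")  # x free per panel, y shared
p
```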
I have two data frames, dataf1 and dataf2, with the same structure and columns.
The three columns are named A, B and C, and both data frames have 50 rows.
I would like to plot the histogram of column B for dataf1 and for dataf2. I can plot the two histograms separately, but they are not on the same scale. How can I either put them on the same histogram using different colours, or plot two histograms on the same scale?
ggplot() + aes(dataf1$B)+ geom_histogram(binwidth=1, colour="black",fill="white")
ggplot() + aes(dataf2$B)+ geom_histogram(binwidth=1, colour="black", fill="white")
Combine your data into a single data frame, with a new column marking which data frame each row originally came from. Then use that new column for the fill aesthetic of your plot.
dataf1$source="Data 1"
dataf2$source="Data 2"
dat_combined = rbind(dataf1, dataf2)
You haven't provided sample data, so here are a few examples of possible plots, using the built-in iris data frame. In the plots below, dat is analogous to dat_combined, Petal.Width is analogous to B, and Species is analogous to source.
dat = subset(iris, Species != "setosa") # We want just two species
ggplot(dat, aes(Petal.Width, fill=Species)) +
geom_histogram(position="identity", colour="grey40", alpha=0.5, binwidth=0.1)
ggplot(dat, aes(Petal.Width, fill=Species)) +
geom_histogram(position="dodge", binwidth=0.1)
ggplot(dat, aes(Petal.Width, fill=Species)) +
geom_histogram(position="identity", colour="grey40", binwidth=0.1) +
facet_grid(Species ~ .)
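Applied to the question's own column names, the same idea could look like the sketch below. The real dataf1 and dataf2 were never posted, so the data here are simulated stand-ins (an assumption), with 50 rows and columns A, B and C as described.

```r
library(ggplot2)

set.seed(1)  # simulated stand-ins for the unposted data frames
dataf1 <- data.frame(A = 1:50, B = rnorm(50, mean = 10, sd = 2), C = runif(50))
dataf2 <- data.frame(A = 1:50, B = rnorm(50, mean = 12, sd = 3), C = runif(50))

# Tag each row with its origin, then stack the two frames
dataf1$source <- "dataf1"
dataf2$source <- "dataf2"
dat_combined <- rbind(dataf1, dataf2)

# Same plot: overlaid, semi-transparent histograms
p_overlay <- ggplot(dat_combined, aes(B, fill = source)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 1,
                 colour = "black")

# Same scale: one panel per source; facets share both axes by default
p_facets <- ggplot(dat_combined, aes(B)) +
  geom_histogram(binwidth = 1, colour = "black", fill = "white") +
  facet_wrap(~ source, ncol = 1)

p_overlay
p_facets
```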
As Zheyuan says, you just need to set the y limits for each plot to get them on the same scale. With ggplot2, one way to do this is with the lims command (though scale_y_continuous and coord_cartesian also work, albeit slightly differently). You also should never use data$column inside aes(). Instead, use the data argument for the data frame and unquoted column names inside aes(). Here's an example with some built-in data.
p1 = ggplot(mtcars, aes(x = mpg)) + geom_histogram() + lims(y = c(0, 13))
p2 = ggplot(iris, aes(x = Sepal.Length)) + geom_histogram() + lims(y = c(0, 13))
gridExtra::grid.arrange(p1, p2, nrow = 1)
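The upper limit of 13 above is hard-coded. A hedged alternative is to derive a shared limit from the data by counting on explicit breaks. The break values below are illustrative choices that cover each variable's range, and the 5% headroom is an assumption added so that the tallest bar is never clipped.

```r
library(ggplot2)

# Explicit breaks so hist() and geom_histogram() use the same intervals
breaks_mpg   <- seq(10, 35, by = 2.5)
breaks_sepal <- seq(4, 8, by = 0.25)

counts_mpg   <- hist(mtcars$mpg, breaks = breaks_mpg, plot = FALSE)$counts
counts_sepal <- hist(iris$Sepal.Length, breaks = breaks_sepal, plot = FALSE)$counts

# Shared upper y limit, with 5% headroom above the tallest bin
top <- max(counts_mpg, counts_sepal) * 1.05

p1 <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(breaks = breaks_mpg) +
  lims(y = c(0, top))
p2 <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(breaks = breaks_sepal) +
  lims(y = c(0, top))

# Arrange side by side as in the answer above, if gridExtra is installed:
# gridExtra::grid.arrange(p1, p2, nrow = 1)
```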
To get two histograms on the same plot, the best way is to combine your data frames. A guess, without seeing what your data looks like:
dataf = rbind(dataf1["B"], dataf2["B"])
dataf$source = c(rep("f1", nrow(dataf1)), rep("f2", nrow(dataf2)))
ggplot(dataf, aes(x = B, fill = source)) +
geom_histogram(position = "identity", alpha = 0.7)
I would like to plot two data series (same type and number of measures, but measured at two timepoints) in the same barplot. Preferably the first series is plotted in grey, with the second series plotted in colours with transparency such that the series 1 data is still visible.
The data I have is of the following format:
MyData = data.frame(
method=rep(c("A","B","C","D","E"),times=3),
time1=rnorm(30,10,3),
time2=rnorm(30,8,2),
lab=rep(rep(c(1,2,3),each=5),times=2),
cat=rep(c(1,2),each=15)
)
To show the type of plot I'm looking for I have added the code for plotting data series 1 below:
p <- ggplot(data = MyData,
aes(x=lab,
y=time1,
fill=method))
p + geom_bar(stat="identity",
position="dodge",
alpha=.3) +
facet_grid(. ~ cat)
In the end it doesn't really matter which one of the data series is in grey and which is in colour, as long as they are plotted on top of each other, and both are visible.
All suggestions are welcome!
There can only be one active fill scale, so in the second layer we need to map the variable method to something else, either group or colour.
library(ggplot2)
MyData = data.frame(
method=rep(c("A","B","C","D","E"),times=3),
time1=rnorm(30,10,3),
time2=rnorm(30,8,2),
lab=rep(rep(c(1,2,3),each=5),times=2),
cat=rep(c(1,2),each=15)
)
p <- ggplot(data = MyData,
aes(x=lab)) +
geom_bar(aes(y=time2,fill=method),
stat="identity",
position="dodge",
alpha=.3
) +
geom_bar(aes(y=time1,group=method),
stat="identity",
position="dodge",
alpha=.3) +
scale_fill_discrete() +
facet_grid(. ~ cat)
p
I have been thinking about a different way to add the second data series. I can add the second series using geom_point instead of geom_bar, as this gives less clutter. However, how do I position the points on the corresponding bar? (i.e. right now the points are all on the same x-axis position).
library(ggplot2)
MyData = data.frame(
method=rep(c("A","B","C","D","E"),times=3),
time1=rnorm(30,10,3),
time2=rnorm(30,8,2),
lab=rep(rep(c(1,2,3),each=5),times=2),
cat=rep(c(1,2),each=15)
)
p <- ggplot(data = MyData,
aes(x=lab)) +
geom_bar(aes(y=time1,fill=method),
stat="identity",
position="dodge",
alpha=.7
) +
geom_point(aes(y=time2,group=method),
stat="identity",
position="dodge",
alpha=.8,
size=3) +
scale_fill_brewer(palette=3) +
facet_grid(. ~ cat)
p
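One way to line the points up with their bars is to give both layers the same explicit `position_dodge()` width. A width of 0.9 is used here because that is the default bar width. This is a sketch under that assumption, with a seed added only to make the random data reproducible.

```r
library(ggplot2)

set.seed(42)  # added for reproducibility; not in the original
MyData <- data.frame(
  method = rep(c("A", "B", "C", "D", "E"), times = 6),
  time1  = rnorm(30, 10, 3),
  time2  = rnorm(30, 8, 2),
  lab    = rep(rep(c(1, 2, 3), each = 5), times = 2),
  cat    = rep(c(1, 2), each = 15)
)

# A single dodge specification shared by bars and points
dodge <- position_dodge(width = 0.9)

p <- ggplot(MyData, aes(x = lab)) +
  geom_bar(aes(y = time1, fill = method),
           stat = "identity", position = dodge, alpha = 0.7) +
  geom_point(aes(y = time2, group = method),
             position = dodge, alpha = 0.8, size = 3) +
  scale_fill_brewer(palette = 3) +
  facet_grid(. ~ cat)
p
```

The `group = method` mapping on the point layer matters: without a grouping variable there is nothing for the points to be dodged by.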
I have this simple data frame holding three replicates (value) for each factor (CT). I would like to plot the replicates with geom_point and then the means of the points with geom_line.
gene <- c("Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5")
value <- c(0.86443, 0.79032, 0.86517, 0.79782, 0.79439, 0.89221, 0.93071, 0.87170, 0.86488, 0.91133, 0.87202, 0.84028, 0.83242, 0.74016, 0.86656)
CT <- c("ET","ET","ET", "HP","HP","HP","HT","HT","HT", "LT","LT","LT","P","P","P")
# Build the data frame directly: cbind() would coerce value to character
df <- data.frame(gene, value, CT, stringsAsFactors = TRUE)
So, I can make the scatter plot.
ggplot(df, aes(x=CT, y=value)) + geom_point()
How do I get a geom_line representing the means for each factor? I have tried stat_summary:
ggplot(df, aes(x=CT, y=value)) + geom_point() +
stat_summary(aes(y = value,group = CT), fun.y=mean, colour="red", geom="line")
But it does not work.
"geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?"
But each group has three observations, what is wrong?
Ps. I am also interested in a smooth line.
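You should set the group aesthetic to 1. Because CT is discrete, ggplot2 otherwise treats every level as its own group, so each group's mean is a single point and there is nothing for the line to connect:
# fun replaces the deprecated fun.y (ggplot2 >= 3.3.0)
ggplot(df, aes(x=CT, y=value)) + geom_point() +
stat_summary(aes(group=1), fun=mean, colour="red", geom="line")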
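To see why a single group is needed, it helps to look at what the summary produces: one mean per CT level, so five values that have to be joined by one line. Below is a self-contained sketch. `ct_means` is a name introduced here, and the `fun` argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

df <- data.frame(
  CT    = factor(rep(c("ET", "HP", "HT", "LT", "P"), each = 3)),
  value = c(0.86443, 0.79032, 0.86517, 0.79782, 0.79439, 0.89221,
            0.93071, 0.87170, 0.86488, 0.91133, 0.87202, 0.84028,
            0.83242, 0.74016, 0.86656)
)

# One mean per factor level: these are the points the red line joins
ct_means <- tapply(df$value, df$CT, mean)

p <- ggplot(df, aes(x = CT, y = value)) +
  geom_point() +
  # group = 1 puts all five means into one group, so one line is drawn
  stat_summary(aes(group = 1), fun = mean, geom = "line", colour = "red")
p
```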
You can use the dplyr package to get the means of each factor.
library(dplyr)
group_means <- df %>%
group_by(CT) %>%
summarise(mean = mean(value))
Then you will need to convert the factors to numeric to let you plot lines on the graph using the geom_segment function. In addition, the scale_x_continuous function will let you set the labels for the x axis.
ggplot(df, aes(x=as.numeric(CT), y=value)) + geom_point() +
geom_segment(aes(x=as.numeric(CT)-0.4, xend=as.numeric(CT)+0.4, y=mean, yend=mean),
data=group_means, colour="red") +
scale_x_continuous("name", labels=levels(df$CT), breaks=seq_along(levels(df$CT)))
Following on from hrbrmstr's comment you can add the smooth line using the following:
ggplot(df, aes(x=as.numeric(CT), y=value, group=1)) + geom_point() +
geom_segment(aes(x=as.numeric(CT)-0.4, xend=as.numeric(CT)+0.4, y=mean, yend=mean),
data=group_means, colour="red") +
scale_x_continuous("name", labels=levels(df$CT), breaks=seq_along(levels(df$CT))) +
geom_smooth()
I want to compare one level of a variable against the combined influence of all other variables. I would like to do this with a facet plot.
For instance:
ggplot(diamonds, aes(price, colour = cut)) + geom_density() + facet_grid(~clarity)
This provides a faceted plot of all the factor levels in clarity. However, what I would like to have is a density plot of I1 in the first facet and a density plot of ~(I1) in the second facet.
So I would like to produce a comparison of the following using the facet feature of ggplot2:
ggplot(subset(diamonds, (clarity == "I1")) , aes(price, colour = cut)) + geom_density()
ggplot(subset(diamonds, !(clarity == "I1")) , aes(price, colour = cut)) + geom_density()
I can see how I could define a new column in the dataframe and use that as the factor in facet_grid, but I suspect there are much better ways to do this.
You can create a new column (the better solution) or use the gridExtra package:
library(gridExtra)
p1 <- ggplot(subset(diamonds, (clarity == "I1")) , aes(price, colour = cut)) + geom_density()
p2 <- ggplot(subset(diamonds, !(clarity == "I1")) , aes(price, colour = cut)) + geom_density()
grid.arrange(p1,p2)
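The new-column route mentioned above could be sketched as follows. The column name `clarity_group` and its two labels are illustrative choices, not part of the original answer.

```r
library(ggplot2)

d <- diamonds

# Two-level helper column: I1 versus every other clarity level
d$clarity_group <- factor(ifelse(d$clarity == "I1", "I1", "not I1"))

p <- ggplot(d, aes(price, colour = cut)) +
  geom_density() +
  facet_grid(~ clarity_group)
p
```

Unlike the two separate plots combined with grid.arrange, the facets share a single legend and, by default, the same axes.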