I have the following dataset:
subj <- c(rep(11,3),rep(12,3),rep(14,3),rep(15,3),rep(17,3),rep(18,3),rep(20,3))
group <- c(rep("u",3),rep("t",6),rep("u",6),rep("t",6))
time <- rep(1:3,7)
mean <- c(0.7352941, 0.8059701, 0.8823529, 0.9264706, 0.9852941, 0.9558824, 0.7941176, 0.8676471, 0.7910448, 0.7058824, 0.8382353, 0.7941176, 0.9411765, 0.9558824, 0.9852941, 0.7647059, 0.8088235, 0.7968750, 0.8088235, 0.8500000, 0.8412698)
df <- data.frame(subj,group,time,mean)
df$subj <- as.factor(df$subj)
df$time <- as.factor(df$time)
And now I create a barplot with ggplot2:
library(ggplot2)
qplot(x=subj, y=mean*100, fill=time, data=df, geom="bar",stat="identity",position="dodge") +
facet_wrap(~ group)
How do I make it so that the x-axis labels that are not present in each facet are not shown? How do I get equal distances between each subj (i.e. get rid of the bigger gaps)?
You can use scales="free" (the argument is spelled scales; scale only works through R's partial argument matching):
ggplot(df, aes(x=subj, y=mean*100, fill=time)) +
geom_bar(stat="identity", position="dodge") +
facet_wrap(~ group, scales="free")
Another option, with slightly different aesthetics, uses facet_grid. In contrast to the plots above, the panels are not the same width here, but because of space="free_x" the bars all have the same width.
ggplot(df, aes(x=subj, y=mean*100, fill=time)) +
geom_bar(stat="identity", position="dodge") +
facet_grid(~ group, scales="free", space="free_x")
Related
My data is in the long format (as required to do the grouped barplot), so that the values for different categories are in one single column. The data is here.
Now, a standard barplot with ggplot2 orders the bars alphabetically (in my case by country name, from Argentina to Uganda). I want to keep the order of countries as it is in the data frame. Using the suggestion here (i.e. using the limits= option inside the scale_x_discrete function) I get the following graph:
My code is this:
mydata <- read_excel("WDR2016Fig215.xls", col_names = TRUE)
y <- mydata$value
x <- mydata$country
z <- mydata$Skill
ggplot(data=mydata, aes(x=x, y=y, fill=z)) +
geom_bar(stat="identity", position=position_dodge(), colour="black") +
scale_x_discrete(limits=x)
The graph is nicely sorted as I want but the x axis is for some reason expanded. Any idea what is the problem?
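Do you mean this? Make country a factor whose levels follow the order in the data frame, and map the columns by name inside aes():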
mydata$country <- factor(mydata$country, levels=unique(mydata$country)[1:30])
ggplot(data=mydata, aes(x=country, y=value, fill=Skill)) +
geom_bar(stat="identity", position=position_dodge(), colour="black")
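The reason this works can be seen without plotting. A likely cause of the expanded axis is that limits=x received the whole country column, which in long format repeats every country once per Skill level. The sketch below uses a small made-up data frame (the toy values and the names toy, alphabetical and in_data_order are assumptions for illustration, not the original Excel file) to show how factor() with levels=unique(...) keeps the data frame order with one level per country:

```r
# Toy long-format data (hypothetical values, not the WDR2016 file)
toy <- data.frame(
  country = rep(c("Uganda", "Argentina", "Kenya"), each = 2),
  Skill   = rep(c("High", "Low"), times = 3),
  value   = c(3, 5, 2, 4, 6, 1)
)

# By default, factor() sorts the levels alphabetically
alphabetical <- levels(factor(toy$country))

# unique() keeps first-appearance order and drops the repeats,
# so the levels follow the order of the data frame
toy$country <- factor(toy$country, levels = unique(toy$country))
in_data_order <- levels(toy$country)

print(alphabetical)
print(in_data_order)
```

Because ggplot2 orders a discrete axis by factor levels, no scale_x_discrete(limits=...) call is needed once the column is a factor with the desired level order.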
How do I draw a horizontal line indicating the Highest (Posterior) Density interval for faceted density plots in ggplot2? This is what I have tried:
# Functions to calculate lower and upper part of HPD.
# as.mcmc is namespaced too, so the helpers work without attaching coda
hpd_lower = function(x) coda::HPDinterval(coda::as.mcmc(x))[1]
hpd_upper = function(x) coda::HPDinterval(coda::as.mcmc(x))[2]
# Data: two groups with different means
df = data.frame(value=c(rnorm(500), rnorm(500, mean=5)), group=rep(c('A', 'B'), each=500))
# Plot it
ggplot(df, aes(x=value)) +
geom_density() +
facet_wrap(~group) +
geom_segment(aes(x=hpd_lower(value), xend=hpd_upper(value), y=0, yend=0), size=3)
As you can see, geom_segment computes on all data for both facets whereas I would like it to respect the faceting. I would also like a solution where HPDinterval is only run once per facet.
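Pre-calculate the HPD intervals. ggplot evaluates the expressions inside aes() on the layer's entire data frame, not separately per facet, so the intervals have to be computed per group beforehand and passed to geom_segment as their own data frame.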
# Plot it
library(dplyr)
df_hpd <- group_by(df, group) %>% summarize(x=hpd_lower(value), xend=hpd_upper(value))
ggplot(df, aes(x=value)) +
geom_density() +
facet_wrap(~group) +
geom_segment(data = df_hpd, aes(x=x, xend=xend, y=0, yend=0), size=3)
I would like to plot two data series (same type and number of measures, but measured at two timepoints) in the same barplot. Preferably the first series is plotted in grey, with the second series plotted in colours with transparency such that the series 1 data is still visible.
The data I have is of the following format:
MyData = data.frame(
method=rep(c("A","B","C","D","E"),times=3),
time1=rnorm(30,10,3),
time2=rnorm(30,8,2),
lab=rep(rep(c(1,2,3),each=5),times=2),
cat=rep(c(1,2),each=15)
)
To show the type of plot I'm looking for I have added the code for plotting data series 1 below:
p <- ggplot(data = MyData,
aes(x=lab,
y=time1,
fill=method))
p + geom_bar(stat="identity",
position="dodge",
alpha=.3) +
facet_grid(. ~ cat)
In the end it doesn't really matter which one of the data series is in grey and which is in colour, as long as they are plotted on top of each other, and both are visible.
All suggestions are welcome!
There can only be one active fill scale per plot, so for the second layer we need to map the variable method to something else, either group or colour.
library(ggplot2)
MyData = data.frame(
method=rep(c("A","B","C","D","E"),times=3),
time1=rnorm(30,10,3),
time2=rnorm(30,8,2),
lab=rep(rep(c(1,2,3),each=5),times=2),
cat=rep(c(1,2),each=15)
)
p <- ggplot(data = MyData,
aes(x=lab)) +
geom_bar(aes(y=time2,fill=method),
stat="identity",
position="dodge",
alpha=.3
) +
geom_bar(aes(y=time1,group=method),
stat="identity",
position="dodge",
alpha=.3) +
scale_fill_discrete() +
facet_grid(. ~ cat)
p
I have been thinking about a different way to add the second data series. I can add the second series using geom_point instead of geom_bar, as this gives less clutter. However, how do I position the points on the corresponding bar? (i.e. right now the points are all on the same x-axis position).
library(ggplot2)
MyData = data.frame(
method=rep(c("A","B","C","D","E"),times=3),
time1=rnorm(30,10,3),
time2=rnorm(30,8,2),
lab=rep(rep(c(1,2,3),each=5),times=2),
cat=rep(c(1,2),each=15)
)
p <- ggplot(data = MyData,
aes(x=lab)) +
geom_bar(aes(y=time1,fill=method),
stat="identity",
position="dodge",
alpha=.7
) +
geom_point(aes(y=time2,group=method),
stat="identity",
position="dodge",
alpha=.8,
size=3) +
scale_fill_brewer(palette=3) +
facet_grid(. ~ cat)
p
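One possible way to line the points up with the bars is to give both layers the same explicit dodge width instead of position="dodge". The sketch below assumes the default bar width of 0.9 and reuses the data layout from the question; the set.seed() call and the dodge object are additions for illustration only.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only to make the random example data reproducible
MyData <- data.frame(
  method = rep(c("A", "B", "C", "D", "E"), times = 6),
  time1  = rnorm(30, 10, 3),
  time2  = rnorm(30, 8, 2),
  lab    = rep(rep(c(1, 2, 3), each = 5), times = 2),
  cat    = rep(c(1, 2), each = 15)
)

# One shared dodge specification; 0.9 matches the default bar width
dodge <- position_dodge(width = 0.9)

p <- ggplot(MyData, aes(x = lab)) +
  geom_bar(aes(y = time1, fill = method),
           stat = "identity", position = dodge, alpha = 0.7) +
  # group = method tells the point layer which points to dodge apart
  geom_point(aes(y = time2, group = method),
             position = dodge, alpha = 0.8, size = 3) +
  facet_grid(. ~ cat)
p
```

Points have no width of their own, so position="dodge" has nothing to dodge by; passing the width explicitly gives the points the same offsets as the bars.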
I have this simple data frame holding three replicates (value) for each factor level (CT). I would like to plot the replicates with geom_point and then the means of the points with geom_line.
gene <- c("Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5")
value <- c(0.86443, 0.79032, 0.86517, 0.79782, 0.79439, 0.89221, 0.93071, 0.87170, 0.86488, 0.91133, 0.87202, 0.84028, 0.83242, 0.74016, 0.86656)
CT <- c("ET","ET","ET", "HP","HP","HP","HT","HT","HT", "LT","LT","LT","P","P","P")
# Build the data frame directly: cbind() would coerce value to character
df <- data.frame(gene, value, CT, stringsAsFactors=TRUE)
So, I can make the scatter plot.
ggplot(df, aes(x=CT, y=value)) + geom_point()
How do I get a geom_line representing the means for each factor. I have tried the stat_summary:
ggplot(df, aes(x=CT, y=value)) + geom_point() +
stat_summary(aes(y = value,group = CT), fun.y=mean, colour="red", geom="line")
But it does not work.
"geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?"
But each group has three observations, what is wrong?
Ps. I am also interested in a smooth line.
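You should set the group aesthetic to 1, so that all the means belong to one group and are joined by a single line:
# In ggplot2 versions before 3.3.0, use fun.y=mean instead of fun=mean
ggplot(df, aes(x=CT, y=value)) + geom_point() +
stat_summary(aes(group=1), fun=mean, colour="red", geom="line")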
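The error arises because, with a discrete x, ggplot2 puts each CT level into its own group by default, so after summarising every group holds a single point and there is nothing to connect. As a rough sketch of the same idea without stat_summary, the means can be precomputed with base R's aggregate() and drawn with geom_line and group=1; the means data frame is a name introduced here for illustration.

```r
library(ggplot2)

df <- data.frame(
  CT    = rep(c("ET", "HP", "HT", "LT", "P"), each = 3),
  value = c(0.86443, 0.79032, 0.86517, 0.79782, 0.79439,
            0.89221, 0.93071, 0.87170, 0.86488, 0.91133,
            0.87202, 0.84028, 0.83242, 0.74016, 0.86656)
)

# One row per CT level holding the mean of its three replicates
means <- aggregate(value ~ CT, data = df, FUN = mean)

p <- ggplot(df, aes(x = CT, y = value)) +
  geom_point() +
  # group = 1 joins the five means into a single line
  geom_line(data = means, aes(group = 1), colour = "red")
p
```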
You can use the dplyr package to get the means of each factor.
library(dplyr)
group_means <- df %>%
group_by(CT) %>%
summarise(mean = mean(value))
Then you will need to convert the factors to numeric to let you plot lines on the graph using the geom_segment function. In addition, the scale_x_continuous function will let you set the labels for the x axis.
ggplot(df, aes(x=as.numeric(CT), y=value)) + geom_point() +
geom_segment(aes(x=as.numeric(CT)-0.4, xend=as.numeric(CT)+0.4, y=mean, yend=mean),
data=group_means, colour="red") +
scale_x_continuous("name", labels=as.character(df$CT), breaks=as.numeric(df$CT))
Following on from hrbrmstr's comment you can add the smooth line using the following:
ggplot(df, aes(x=as.numeric(CT), y=value, group=1)) + geom_point() +
geom_segment(aes(x=as.numeric(CT)-0.4, xend=as.numeric(CT)+0.4, y=mean, yend=mean),
data=group_means, colour="red") +
scale_x_continuous("name", labels=as.character(df$CT), breaks=as.numeric(df$CT)) +
geom_smooth()
I want to create a facetted lineplot. In each subplot, one y-value (y1 or y2) is compared to the baseline. The y-value and baseline should be visualized by different colours, but this colour scheme should stay consistent within each subplot. As a legend, I only require 2 entries: "y-value" and "baseline", since the header of each subplot names the y-value to be compared.
So far, I have only got this (sample code):
library(ggplot2)
library(reshape)
df = data.frame(c(10,20,40),c(0.1,0.2,0.3),c(0.1,0.4,0.5),c(0.05,0.1,0.2))
names(df)[1]="classes"
names(df)[2]="y1"
names(df)[3]="y2"
names(df)[4]="baseline"
df$classes <- factor(df$classes,levels=c(10,20,40), labels=c("10m","20m","40m"))
dfMelted <- melt(df)
diagram <- ggplot()
diagram <- diagram + theme_bw(base_size=16)
diagram <- diagram + geom_point(data=dfMelted, size=4, aes(x=factor(classes),y=value, colour=variable, shape=variable))
diagram <- diagram + geom_line(data=dfMelted, aes(x=factor(classes),y=value, group=variable, colour=variable))
diagram <- diagram + facet_wrap(~ variable, ncol=1)
diagram
And this is how it looks so far:
I tried to create groups, each comprising one y-dataset and the duplicated baseline data. Then, I used facetting in terms of the group-column. Unfortunately, this results in the use of many different colours and a huge legend. Is there any better way to do this?
Is this what you had in mind?
df = data.frame(classes=c(10,20,40), y1=c(0.1,0.2,0.3), y2=c(0.1,0.4,0.5),
baseline=c(0.05,0.1,0.2))
df$classes <- factor(df$classes, levels=c(10,20,40),
labels=c("10m","20m","40m"))
# Two melts to create a grouping variable for baseline vs. new value (y1 or y2)
# and another grouping variable for faceting on y1/y2
dfm=melt(df, id.var=c(1,4))
names(dfm)[3] = "y_value"
dfm=melt(dfm, id.var=c(1,3))
ggplot(dfm, aes(x=classes, y=value, group=variable, colour=variable)) +
geom_point() + geom_line() +
theme_bw(base_size=16) +
facet_grid(. ~ y_value)
You are probably looking for the scale_colour_manual function:
ggplot() +
geom_point(data=dfMelted, size=4, aes(x=factor(classes),y=value, colour=variable, shape=variable)) +
geom_line(data=dfMelted, aes(x=factor(classes),y=value, group=variable, colour=variable)) +
scale_colour_manual(values = c("y1" = "red","baseline" = "blue","y2" = "green")) +
theme_bw(base_size=16) +
facet_grid(variable ~.)
which results in: