I want to group the bars in a stacked barplot according to the values in another factor-variable. However, I want to do this without using facets.
My data is in long format.
I want to group the stacked bars according to the afk variable. The normal stacked bar plot can be made with:
ggplot(nl.melt, aes(x=naam, y=perc, fill=stemmen)) +
geom_bar(stat="identity", width=.7) +
scale_x_discrete(expand=c(0,0)) +
scale_y_continuous(expand=c(0,0)) +
coord_flip() +
theme_bw()
which gives an alphabetically ordered barplot:
I tried to group them by using x=reorder(naam,afk) in the aes. But that didn't work. Also using group=afk does not have the desired effect.
Any ideas how to do this?
reorder should work but the problem is you're trying to re-order by a factor. You need to be explicit on how you want to use that information. You can either use
nl.melt$naam <- reorder(nl.melt$naam, as.numeric(nl.melt$afk))
or
nl.melt$naam <- reorder(nl.melt$naam, as.character(nl.melt$afk), FUN=min)
depending on whether you want to sort by the existing levels of afk or if you want to sort alphabetically by the levels of afk.
After running that and re-running the ggplot code, I get the bars grouped by afk.
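To see the difference between the two reorder calls without the original data, here is a minimal sketch on a made-up data frame. The name toy and its values are hypothetical stand-ins for nl.melt, and afk is assumed to be a factor whose level order (Z before A) differs from alphabetical order.

```r
# Hypothetical stand-in for nl.melt: four names in two afk groups.
toy <- data.frame(
  naam = factor(c("Alpha", "Bravo", "Charlie", "Delta")),
  afk  = factor(c("Z", "A", "Z", "A"), levels = c("Z", "A"))
)

# Sort naam by the existing level order of afk (Z before A).
by_levels <- reorder(toy$naam, as.numeric(toy$afk))

# Sort naam alphabetically by the afk labels (A before Z).
by_alpha <- reorder(toy$naam, as.character(toy$afk), FUN = min)

levels(by_levels)  # the names in group Z come first
levels(by_alpha)   # the names in group A come first
```

Either result can be assigned back to the naam column before plotting, as in the two calls above.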
An alternative to @MrFlick's approach (based on the answer @CarlosCinelli linked to) is:
ggplot(nl.melt, aes(x=interaction(naam,afk), y=perc, fill=stemmen)) +
geom_bar(stat="identity", width=.7) +
scale_x_discrete(expand=c(0,0)) +
scale_y_continuous(expand=c(0,0)) +
coord_flip() +
theme_bw()
which gives:
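Why this groups the bars can be checked in base R. The sketch below uses a made-up data frame (toy is a hypothetical stand-in for nl.melt). interaction() varies its first argument fastest, so the combined levels are ordered by afk first and naam second; droplevels() here mimics the way a discrete ggplot2 scale drops unused levels by default.

```r
# Hypothetical stand-in for nl.melt.
toy <- data.frame(
  naam = factor(c("Alpha", "Bravo", "Charlie", "Delta")),
  afk  = factor(c("Z", "A", "Z", "A"), levels = c("Z", "A"))
)

# Combine both factors, then drop the combinations that never occur.
combined <- droplevels(interaction(toy$naam, toy$afk))

levels(combined)  # "Alpha.Z" "Charlie.Z" "Bravo.A" "Delta.A"
```

Note that the axis labels then read naam.afk rather than naam alone, which may or may not be what you want.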
R tends to see the order of levels as a property of the data rather than a property of the graph. Try reordering the data itself before calling the plotting commands. Because afk is a factor, convert it to numeric codes first so that reorder can average them:
nl.melt$naam <- reorder(nl.melt$naam, as.numeric(as.factor(nl.melt$afk)))
Then run your ggplot code. Or use other ways of reordering the factor levels in naam.
Related
I am wondering whether I can graph separate lines for 2 variables without using facet_grid. I would prefer 4 lines on one graph to 2 lines in each of 2 panels. It's OK if I can't, but I thought I would ask.
My data is as follows:
nd<-data.frame(Machine = c(2,2,3,3,2,2,3,3),
Source = c("tube", "machine","tube", "machine","tube", "machine","tube", "machine"),
Time=c(0,0,0,0,2,2,2,2),
Count=c(224000, 107000, 850000, 940000, 610000,116000, 1160000, 1100000))
and this code gives me what I want with a facet...
ggplot(data=nd, aes(x=Time, y=Count, group=Machine, color=Machine)) +
geom_line(aes(group=Machine))+ geom_point()+facet_grid(~Source)
Is there an alternative to this?
P.S. even though Machine is a factor variable why is my legend showing it as continuous?
One quick way is to use the interaction function, which pastes your two variables together with a "." between them:
ggplot(data=nd, aes(x=Time, y=Count, color=interaction(Machine,Source))) +
geom_line() + geom_point() +
scale_color_manual("groups",
values=c("#61d4b3","#fdd365","#fb8d62","#fd2eb3"))
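On the P.S.: in the data frame as posted, Machine is built from numbers, so it is numeric rather than a factor, and ggplot2 gives numeric colour variables a continuous legend. A possible variation, sketched below under the assumption that you would like Machine and Source in separate legends, wraps Machine in factor() and maps Source to linetype; the choice of linetype is only an illustration.

```r
library(ggplot2)

nd <- data.frame(
  Machine = c(2, 2, 3, 3, 2, 2, 3, 3),
  Source  = c("tube", "machine", "tube", "machine",
              "tube", "machine", "tube", "machine"),
  Time    = c(0, 0, 0, 0, 2, 2, 2, 2),
  Count   = c(224000, 107000, 850000, 940000,
              610000, 116000, 1160000, 1100000)
)

# Machine is numeric here, which is why the legend looks continuous.
is.numeric(nd$Machine)

# factor(Machine) gives a discrete colour legend; the interaction keeps
# one line per Machine/Source combination (four lines in total).
p <- ggplot(nd, aes(x = Time, y = Count,
                    color = factor(Machine), linetype = Source,
                    group = interaction(Machine, Source))) +
  geom_line() +
  geom_point() +
  labs(color = "Machine")
p
```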
Using ggplot2 I have made facetted histograms using the following code.
library(ggplot2)
library(plyr)
df1 <- data.frame(monthNo = rep(month.abb[1:5],20),
classifier = c(rep("a",50),rep("b",50)),
values = c(seq(1,10,length.out=50),seq(11,20,length.out=50))
)
means <- ddply (df1,
c(.(monthNo),.(classifier)),
summarize,
Mean=mean(values)
)
ggplot(df1,
aes(x=values, colour=as.factor(classifier))) +
geom_histogram() +
facet_wrap(~monthNo,ncol=1) +
geom_vline(data=means, aes(xintercept=Mean, colour=as.factor(classifier)),
linetype="dashed", size=1)
The vertical line showing means per month is to stay.
But I want to also add text over these vertical lines displaying the mean values for each month. These means are from the 'means' data frame.
I have looked at geom_text and I can add text to plots. But it appears my circumstance is a little different and not so easy. It's a lot simpler to add text in some cases where you just add values of the plotted data points. But cases like this when you want to add the mean and not the value of the histograms I just can't find the solution.
Please help. Thanks.
Having noted the possible duplicate (another answer of mine), the solution here might not be as (initially/intuitively) obvious. You can do what you need if you split the geom_text call into two (for each classifier):
ggplot(df1, aes(x=values, fill=as.factor(classifier))) +
geom_histogram() +
facet_wrap(~monthNo, ncol=1) +
geom_vline(data=means, aes(xintercept=Mean, colour=as.factor(classifier)),
linetype="dashed", size=1) +
geom_text(y=0.5, aes(x=Mean, label=Mean),
data=means[means$classifier=="a",]) +
geom_text(y=0.5, aes(x=Mean, label=Mean),
data=means[means$classifier=="b",])
I'm assuming you can format the numbers to the appropriate precision and place them on the y-axis where you need to with this code.
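One way the formatting step might look is sketched below. It is not the code from the answer above: it uses a single geom_text layer with a preformatted label column, computes the means with base R's aggregate() instead of ddply, and the label column name and the one-decimal format are assumptions made for illustration.

```r
library(ggplot2)

df1 <- data.frame(
  monthNo    = rep(month.abb[1:5], 20),
  classifier = c(rep("a", 50), rep("b", 50)),
  values     = c(seq(1, 10, length.out = 50), seq(11, 20, length.out = 50))
)

# Base-R equivalent of the ddply() summary: one mean per month and classifier.
means <- aggregate(values ~ monthNo + classifier, data = df1, FUN = mean)
names(means)[names(means) == "values"] <- "Mean"

# Hypothetical helper column: the mean formatted to one decimal place.
means$label <- sprintf("%.1f", means$Mean)

p <- ggplot(df1, aes(x = values, fill = classifier)) +
  geom_histogram(bins = 30) +
  facet_wrap(~monthNo, ncol = 1) +
  geom_vline(data = means, aes(xintercept = Mean, colour = classifier),
             linetype = "dashed") +
  # inherit.aes = FALSE keeps the text layer independent of the fill mapping.
  geom_text(data = means, aes(x = Mean, y = 0.5, label = label),
            inherit.aes = FALSE)
p
```

Because means contains the monthNo column, each label lands in the matching facet; adjust y to move the text up or down.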
In the data that I am attempting to plot, each sample belongs in one of several groups, that will be plotted on their own grids. I am plotting stacked bar plots for each sample that will be ordered in increasing number of sequences, which is an id attribute of each sample.
Currently, the plot (with some random data) looks like this:
(Since I don't have the required 10 rep for images, I am linking it here)
There are a couple of things I need to accomplish, and I don't know where to start.
I would like the bars not to be placed at their corresponding nseqs values, but rather placed next to each other in ascending nseqs order.
I don't want each grid to have the same scale. Everything needs to fit snugly.
I have tried to set scales and size for facet_grid to free_x, but this results in an unused argument error. I think this is related to the fact that I have not been able to get the scales library loaded properly (it keeps saying not available).
Code that deals with plotting:
ggfdata <- melt(fdata, id.var=c('group','nseqs','sample'))
p <- ggplot(ggfdata, aes(x=nseqs, y=value, fill = variable)) +
geom_bar(stat='identity') +
facet_grid(~group) +
scale_y_continuous() +
opts(title=paste('Taxonomic Distribution - grouped by',colnames(meta.frame)[i]))
Try this:
update.packages()
## I'm assuming your ggplot2 is out of date because you use opts()
## If the scales library is unavailable, you might need to update R
ggfdata <- melt(fdata, id.var=c('group','nseqs','sample'))
ggfdata$nseqs <- factor(ggfdata$nseqs)
## Making nseqs a factor will stop ggplot from treating it as a numeric,
## which sounds like what you want
p <- ggplot(ggfdata, aes(x=nseqs, y=value, fill = variable)) +
geom_bar(stat='identity') +
facet_wrap(~group, scales="free_x") + ## No need for facet_grid with only one variable
labs(title = paste('Taxonomic Distribution - grouped by',colnames(meta.frame)[i]))
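The effect of the factor() call can be checked in base R. This small sketch uses made-up sequence counts (the nseqs vector is hypothetical, not taken from fdata):

```r
nseqs <- c(300, 12, 45, 12)  # made-up sequence counts

f <- factor(nseqs)

levels(f)      # "12" "45" "300": ascending numeric order, one slot per value
as.integer(f)  # 3 1 2 1: the evenly spaced slot each observation falls in
```

If two samples share an nseqs value they end up in the same slot, so mapping x to something like reorder(sample, nseqs) might be a safer choice in that case.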
I have an existing dataset with three factors. I would like to plot these three factors using facet_grid() and have them ordered based on how they are ordered in the dataset instead of alphabetical order. Is this possible to do somehow without modifying my data structure?
Here's the data:
https://dl.dropboxusercontent.com/u/22681355/data.csv
data<-read.csv("data.csv", head=T)
ggplot(data, aes(time,a, color="one")) +
geom_line(linetype=1, size=0.3) +
scale_y_continuous(breaks=seq(0,1,0.2)) +
scale_x_continuous(breaks=seq(100,300,50)) +
theme_bw() +
geom_line(aes(time,b)) +
geom_line(aes(time,c)) +
geom_line(aes(time,d))+facet_wrap(~X.1)
This question comes up quite often on SO. You need to convert the column you are faceting by into a factor whose levels are in the order you want, as follows:
data$X.1 <- factor(data$X.1, levels=unique(data$X.1))
Now, plot it and you'll get the facetted plot in the desired order.
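A minimal illustration of what levels=unique(...) does, using a made-up character vector rather than the linked data:

```r
x <- c("mid", "low", "high", "low")  # made-up values in their data order

levels(factor(x))                      # "high" "low" "mid" (alphabetical)
levels(factor(x, levels = unique(x)))  # "mid" "low" "high" (order of appearance)
```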
I'm trying to build some kind of profile diagram with ggplot2. I therefore want a line which connects the means in the plot. As you see, geom_line doesn't work here because it only connects the points within each factor level but not the means between factor levels.
Here's a small example:
df <- data.frame(variable=rep(1:3,each=10),value=rnorm(30))
p <- ggplot(df,aes(factor(variable),value))
p + stat_summary(fun.y=mean, geom="point")+coord_flip()+geom_line()
Does anyone have an idea how to achieve that?
Thank you in advance!
It is often easier to summarize the data before you plot. Something like
summarydf <- ddply(df,.(variable),summarize, value = mean(value))
The next trick is to use group within the call to geom_line to override the default grouping by factor(variable):
p <- ggplot(summarydf,aes(factor(variable),value)) +
geom_point() + geom_line(aes(group=1)) + coord_flip()
p
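If you would rather not depend on plyr, a similar result can be sketched with base R's aggregate(), or by letting stat_summary draw both the points and the line. This assumes ggplot2 3.3.0 or later, where the argument is called fun instead of fun.y; the seed is arbitrary and only makes the sketch reproducible.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the example is reproducible
df <- data.frame(variable = rep(1:3, each = 10), value = rnorm(30))

# Base-R equivalent of the ddply() summary: one mean per level of variable.
summarydf <- aggregate(value ~ variable, data = df, FUN = mean)

# Let stat_summary() compute the means; group = 1 joins them across levels.
p <- ggplot(df, aes(factor(variable), value)) +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +
  coord_flip()
p
```

Here summarydf can be dropped into the plotting code above in place of the ddply result.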