Grouped Bar Plot Species and plots - r

I'd like to make a grouped barplot that has two groups, one named Exotic Species and the second Native Species, and then compare them by the plot in which they are found. Three columns are therefore involved in the graph. Y would be "Species Richness", the number of species, either native or exotic. X would be the "Plot name". How do I write the code for the bar graph described above? If you Google "European Parliament Elections R grouped barplot", the orange and purple plot is what I want.

If I understand your question correctly, you want a grouped bar plot with the bars side by side.
Here are two solutions:
1. Using barplot (let's say mtcars is the data frame):
# Grouped Bar Plot
counts <- table(mtcars$vs, mtcars$gear)
barplot(counts, main="Car Distribution by Gears and VS",
xlab="Number of Gears", col=c("darkblue","red"),
legend = rownames(counts), beside=TRUE)
2. Using lattice:
library(lattice)
barchart(Species~Reason,data=Reasonstats,groups=Catergory,
scales=list(x=list(rot=90,cex=0.8)))
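Applied to the layout described in the question, the barplot approach might look like the sketch below. The data frame `richness`, its column names (`Plot`, `Origin`, `Count`), and all the values are hypothetical, since the original data was not posted; the orange and purple colours simply follow the description in the question.

```r
# Hypothetical data: species richness per plot, split by origin.
# Column names and values are assumptions made for illustration.
richness <- data.frame(
  Plot   = rep(c("Plot A", "Plot B", "Plot C"), each = 2),
  Origin = rep(c("Exotic Species", "Native Species"), times = 3),
  Count  = c(4, 12, 7, 9, 2, 15)
)

# xtabs() sums Count within each Origin x Plot cell and returns a
# two-way table: one row per group, one column per plot.
counts <- xtabs(Count ~ Origin + Plot, data = richness)

# beside = TRUE draws the two groups next to each other instead of stacked.
barplot(counts, beside = TRUE, col = c("orange", "purple"),
        xlab = "Plot name", ylab = "Species Richness",
        legend.text = rownames(counts))
```

If the real data has one row per species rather than a precomputed count, `table(df$Origin, df$Plot)` would probably be the simpler way to build the same matrix.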

Related

add extra labels to x axis barplot [duplicate]

Imagine that I have the following bar plot
counts <- table(mtcars$gear)
barplot(counts, main="Car Distribution",
xlab="Number of Gears")
What I would like to do is to add extra categories, for example 2 and 6 gears. This would be, of course, reflected as 0 in the plot.
Any idea?
You need to make it a factor and declare the levels:
counts <- table(factor(mtcars$gear,levels=2:6))
barplot(counts, main="Car Distribution",
xlab="Number of Gears")
To add an explanation: factors are meant for categorical variables. Setting the levels as above achieves two things. First, you declare which levels to expect, including ones missing from the data; this is useful when, say, you subset and then tabulate. Second, you set the order of the categories. You can see the bars are plotted from 2 to 6. Try this:
counts <- table(factor(mtcars$gear,levels=6:2))
barplot(counts, main="Car Distribution",
xlab="Number of Gears")
The plot order is now reversed. See also this R chapter on factors.
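Both effects of declaring levels can be checked without drawing anything. The sketch below only inspects the tables that would be passed to barplot; the variable names `gears`, `counts`, `eight_cyl` and `reversed` are introduced here for illustration.

```r
# Declare levels 2..6 even though mtcars only contains 3, 4 and 5 gears.
gears  <- factor(mtcars$gear, levels = 2:6)
counts <- table(gears)
print(counts)  # levels 2 and 6 appear with a count of 0

# The declared levels survive subsetting, so the table keeps all five entries.
eight_cyl <- table(gears[mtcars$cyl == 8])
print(eight_cyl)

# Reversing the levels reverses the order of the table, and so of the bars.
reversed <- table(factor(mtcars$gear, levels = 6:2))
print(names(reversed))
```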

Adding group mean lines to geom_bar plot and including in legend

I want to create a bar graph that also shows the mean value for the bars in each group, and shows the mean line in the legend.
I have been able to get this graph (bar chart with means) using the code below, which is fine, but I would like to see the mean lines in the legend.
##The data to be graphed is the proportion of persons receiving a treatment
## (num=numerator) in each population (denom=denominator). The population is
##grouped by two age groups and (Age) and further divided by a categorical
##variable V1
###SET UP DATAFRAME###
require(ggplot2)
df <- data.frame(V1 = c(rep(c("S1","S2","S3","S4","S5"),2)),
Age= c(rep(70,5),rep(80,5)),
num=c(5280,6570,5307,4894,4119,3377,4244,2999,2971,2322),
denom=c(9984,12600,9425,8206,7227,7290,8808,6386,6206,5227))
df$prop<-df$num/df$denom*100
PopMean<-sum(df$num)/sum(df$denom)*100
df70<-df[df$Age==70,]
group70mean<-sum(df70$num)/sum(df70$denom)*100
df80<-df[df$Age==80,]
group80mean<-sum(df80$num)/sum(df80$denom)*100
df$PopMean<-c(rep(PopMean,10))
df$groupmeans<-c(rep(group70mean,5),rep(group80mean,5))
I want the plot to look like this, but want the lines in the legend too, to be labelled as 'mean of group' or similar.
#basic plot
P<-ggplot(df, aes(x=factor(Age), y=prop, fill=factor(V1))) +
geom_bar(position=position_dodge(), colour='black',stat="identity")
P
####add mean lines
P+geom_errorbar(aes(y=df$groupmeans, ymax=df$groupmeans,
ymin=df$groupmeans), col="red", lwd=2)
Adding show.legend=TRUE overlays the error bars onto the factor legend, rather than separately. If there is a way of showing geom_errorbar separately in the legend this is probably the simplest solution.
I have also tried various things with geom_line
The syntax below produces a line for the population mean value, but the line runs from the centre of each point rather than covering the full width of the bars.
It does produce a legend, but one showing a bar of colour rather than a line.
P+geom_line(aes(y=df$PopMean, group=df$PopMean, color=df$PopMean),lwd=1)
If I try to draw lines for the group means, the lines are not visible (because each one consists of only a single point).
P+geom_line(aes(y=df$groupmeans, group=df$groupmeans, color=df$groupmeans))
I also tried to get round this with facet plot, although this requires me to pretend my categorical variable is numeric to get it to work.
###set up new df
df2<-df
df2$V1<-c(rep(c(1,2,3,4,5),2))
P<-ggplot(df2, aes(x=factor(V1), y=prop, fill=factor(V1))) +
geom_bar(position=position_dodge(),
colour='black',stat="identity",width=1)
P+facet_grid(.~factor(df2$Age))
P+facet_grid(.~factor(df2$Age))+geom_line(aes(y=df$groupmeans,
group=df$groupmeans, color=df$groupmeans))
Facetplot
This allows me to show the mean lines using geom_line, so a legend does appear (although it doesn't look right, showing a colour gradient rather than coloured lines). However, the lines still do not span the full width of the bars. Also, my x-axis now needs relabelling to show S1, S2, etc. rather than the numeric 1, 2, 3.
To sum up: is there a way of showing error bar lines separately in the legend?
If not, and I use faceting instead, how do I correct the legend appearance and relabel the axes with my categorical variable, and is it possible to get the line to span the full width of the plot?
Or is there an alternative solution that I am missing?
Thanks
To get a legend for geom_errorbar, you need to map the colour aesthetic inside aes().
As you want only one category (here red), I've created a dummy variable first:
df$mean <- "Mean"
ggplot(df, aes(x=factor(Age), y=prop, fill=factor(V1))) +
geom_bar(position=position_dodge(), colour='black',stat="identity") +
geom_errorbar(aes (ymax=groupmeans,
ymin=groupmeans, colour=mean), lwd=2) +
scale_colour_manual(name="",values = "#ff0000")
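For the faceting variant raised in the question, one possible sketch (not part of the answer above) is to keep V1 categorical on the x-axis and draw each group mean with geom_hline from a separate one-row-per-group data frame. The data frame `means` and the legend label "Mean of group" are names introduced here; the data values are copied from the question.

```r
library(ggplot2)

# Data as defined in the question.
df <- data.frame(V1 = rep(c("S1", "S2", "S3", "S4", "S5"), 2),
                 Age = rep(c(70, 80), each = 5),
                 num = c(5280, 6570, 5307, 4894, 4119, 3377, 4244, 2999, 2971, 2322),
                 denom = c(9984, 12600, 9425, 8206, 7227, 7290, 8808, 6386, 6206, 5227))
df$prop <- df$num / df$denom * 100

# One row per age group holding the pooled group mean (hypothetical helper frame).
means <- aggregate(cbind(num, denom) ~ Age, data = df, FUN = sum)
means$groupmean <- means$num / means$denom * 100

# Mapping colour to a constant string inside aes() creates a separate
# legend entry for the line; geom_hline spans the full panel width.
p <- ggplot(df, aes(x = V1, y = prop, fill = V1)) +
  geom_col(colour = "black") +
  geom_hline(data = means,
             aes(yintercept = groupmean, colour = "Mean of group")) +
  facet_grid(. ~ Age) +
  scale_colour_manual(name = "", values = c("Mean of group" = "red"))
p
```

Because `means` carries an Age column, each facet should receive only its own line, and the x-axis keeps the S1 to S5 labels without any recoding to numbers.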

ggplot2 adding stacked barchart to heatmap

I would like to add functional information to a HeatMap (geom_tile). I've got the following simplified DataFrame and R code producing a HeatMap and a separate stacked BarPlot (in the right order, corresponding to the HeatMap).
Question:
How can I add the BarPlot to the right edge of the HeatMap? It shouldn't overlap with any of the tiles, and the tiles of the BarPlot should align with the tiles of the HeatMap.
Data:
AccessionNumber <- c('A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0','A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0','A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0','A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0')
Bincode <- c(13,25,29,19,1,1,35,16,4,1,13,25,29,19,1,1,35,16,4,1,13,25,29,19,1,1,35,16,4,1,13,25,29,19,1,1,35,16,4,1)
MMName <- c('amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS','amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS','amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS','amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS')
cluster <- c(1,2,2,2,3,3,4,4,4,4,1,2,2,2,3,3,4,4,4,4,1,2,2,2,3,3,4,4,4,4,1,2,2,2,3,3,4,4,4,4)
variable <- c('rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96')
value <- c(2.15724042939,1.48366099919,1.29388509992,1.59969471112,1.82681962192,2.13347487296,1.08298157478,1.20709456306,1.02011775131,0.88018823632,1.41435923375,1.31680079684,1.32041325076,1.23402873856,2.04977975574,1.90651971106,0.911615352178,1.05021352328,1.18437303394,1.05620421143,1.02132613918,1.22080237755,1.40759491365,1.43131574695,1.65848581311,1.91886008221,0.639581269674,1.11779720968,1.09406554542,1.02259316617,1.00529867534,1.30885290475,1.39376458384,1.35503544429,1.81418617518,1.92505106722,0.862870707741,1.0832577668,1.03118887309,1.21310404226)
df <- data.frame(AccessionNumber, Bincode, MMName, cluster, variable, value)
HeatMap plot:
hm <- ggplot(df, aes(x=variable, y=AccessionNumber))
hm + geom_tile(aes(fill=value), colour = 'white') + scale_fill_gradient2(low='blue', midpoint=1, high='red')
stacked BarPlot:
bp <- ggplot(df, aes(x=sum(df$Bincode), fill=MMName))
bp + stat_bin(aes(ymax = ..count..), binwidth = 1, geom='bar')
Thank you very much for your help/support!!
The variables of the y-axis are sorted first by increasing "cluster" then alphabetically by "AccessionNumber". This is true for both the HeatMap as well as the BarPlot. The values appear in the same order in both plots, but show two different variables (same amount of rows and in the same order, but different content). The HeatMap displays a continuous variable in contrast to the BarPlot which displays a categorical variable. Therefore, the plots could be combined, displaying additional information.
Please help!
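The ordering described above (by cluster, then alphabetically by accession number) can be made explicit by turning AccessionNumber into a factor with those levels, which is one way to get the same row order in both plots. The sketch below is an assumption about how this could be approached, not an answer from the thread: `ann` is a hypothetical four-row subset of the data with one row per accession, and the categorical information is drawn as a single-column tile strip rather than a stacked bar.

```r
library(ggplot2)

# Hypothetical subset: one row per accession, values taken from the data above.
ann <- data.frame(
  AccessionNumber = c("G7I6Q7", "A4PU48", "B7FGU7", "A9YWS0"),
  MMName  = c("secondary metabolism", "amino acid metabolism", "PS", "C1-metabolism"),
  cluster = c(4, 1, 3, 2)
)

# Fix the y-axis order: cluster first, then accession number.
ord <- ann$AccessionNumber[order(ann$cluster, ann$AccessionNumber)]
ann$AccessionNumber <- factor(ann$AccessionNumber, levels = ord)

# One tile per accession, filled by the categorical variable.
strip <- ggplot(ann, aes(x = "MMName", y = AccessionNumber, fill = MMName)) +
  geom_tile(colour = "white") +
  labs(x = NULL, y = NULL)
strip

# Placing the strip next to the heatmap needs a layout package. Assuming
# patchwork is installed and hm_plot is the finished heatmap, something like
#   patchwork::wrap_plots(hm_plot, strip, widths = c(4, 1))
# may work, provided the heatmap uses the same factor levels on its y-axis.
```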

How to create a faceted distribution plot with titles on the x and y axes in ggplot

I have data that looks like this
df <- data.frame(c("Cell1","Cell2","NK-Cell"),c("K","L","S","L","K","S","S","L","K"),abs(log2(abs(rnorm(180)))))
colnames(df) <- c("ctype","tissue","exp")
I'm trying to create a plot that looks like this.
General idea in this situation would be to use faceting to get this kind of plot.
ggplot(df,aes(exp))+geom_density()+facet_grid(tissue~ctype)
There are two empty plots because there are no Cell2 values for the tissue S and no NK-Cell values for tissue L.
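The empty panels follow from how data.frame recycles the short ctype and tissue vectors against the 180 values. A quick cross-tabulation, sketched below, shows which combinations never occur; `set.seed(1)` and the name `combos` are additions made here for reproducibility and illustration.

```r
set.seed(1)  # assumption: any seed works, the counts do not depend on it
df <- data.frame(ctype  = c("Cell1", "Cell2", "NK-Cell"),
                 tissue = c("K", "L", "S", "L", "K", "S", "S", "L", "K"),
                 exp    = abs(log2(abs(rnorm(180)))))

# Rows are cell types, columns are tissues; a zero marks an empty facet.
combos <- table(df$ctype, df$tissue)
print(combos)
```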

Calculate pairwise wilcox.test for several categories and plot significance in boxplot with ggplot2

I have a dataset that looks pretty much like this one, derived from diamonds:
diamonds2 = subset(diamonds, cut!='Good' & cut!='Very Good', -c(table, x, y, z, clarity, depth, price))
I want to make a boxplot like this one:
ggplot(diamonds2, aes(x=color, y=carat, col=cut))+geom_boxplot()
Here comes the hard part. My idea is to perform a pairwise wilcox.test on the distribution of the y variable (carat) between groups (cut), separately for each of the columns (color).
library(plyr)
ddply(diamonds2,"color",
function(x) {
w <- wilcox.test(carat~cut,data=diamonds2)
with(w,data.frame(statistic,p.value))
})
The code fails because it is asking for 2 levels (obviously). I can subset the data before applying the function (to remove one of the "cut" levels), but that doesn't give me what I want and I can't understand why.
Additionally, I would like to plot the results as asterisks, coloured according to the two distributions being compared.
In the first boxplot (D), I would like to plot 3 asterisks: a purple one (red and blue are significantly different), a yellow one and a cyan one.
As for the asterisk colours, I've been playing a bit with geom_text from ggplot2, but I can't figure out how to plot below the x axis or how to plot text in different colours.
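One possible reading of the failure is that the formula interface of wilcox.test needs a grouping factor with exactly two levels, while cut still has three used levels here, and that the function body refers to diamonds2 rather than to its per-colour argument x. A minimal sketch of the pairwise part, assuming base R's pairwise.wilcox.test is an acceptable substitute for the ddply call, is shown below; the name `pvals` and the choice of Holm adjustment are assumptions, and the plotting of asterisks is not covered.

```r
library(ggplot2)  # only needed here for the diamonds data

diamonds2 <- subset(diamonds, cut != "Good" & cut != "Very Good",
                    select = c(carat, cut, color))
diamonds2$cut <- droplevels(diamonds2$cut)  # leaves Fair, Premium, Ideal

# For each colour, run all pairwise Wilcoxon tests between the cut levels
# on that colour's rows only, and keep the matrix of adjusted p-values.
pvals <- lapply(split(diamonds2, diamonds2$color), function(d) {
  pairwise.wilcox.test(d$carat, d$cut, p.adjust.method = "holm")$p.value
})

pvals$D  # lower-triangular matrix: rows Premium/Ideal, columns Fair/Premium
```

Each element of `pvals` holds the three comparisons for one colour, which could then be turned into a small data frame of labels and positions for geom_text.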

Resources