Clustered bar plot in R using ggplot2

Snip of my data frame is
Basically, I want to display a bar plot grouped by country: the number of suicides for every country as one cluster, and similarly for accidents and stabbings. I am using ggplot2 for this, but I have no idea how to do it.
Any help is appreciated.
Thanks in advance.

Edit to update for newer (2017) package versions
library(tidyr)
library(ggplot2)
dat.g <- gather(dat, type, value, -country)
ggplot(dat.g, aes(type, value)) +
geom_bar(aes(fill = country), stat = "identity", position = "dodge")
Original Answer
library(reshape2)
dat <- data.frame(country=c('USA','Brazil','Ghana','England','Australia'), Stabbing=c(15,10,9,6,7), Accidents=c(20,25,21,28,15), Suicide=c(3,10,7,8,6))
dat.m <- melt(dat, id.vars='country')
I guess this is the format you're after?
ggplot(dat.m, aes(variable, value)) +
geom_bar(aes(fill = country), stat = "identity", position = "dodge")

library(ggplot2)
library(reshape2)
df <- data.frame(country=c('USA','Brazil','Ghana','England','Australia'), Stabbing=c(15,10,9,6,7), Accidents=c(20,25,21,28,15), Suicide=c(3,10,7,8,6))
mm <- melt(df, id.vars='country')
ggplot(mm, aes(x=country, y=value)) + geom_bar(stat='identity') + facet_grid(.~variable) + coord_flip() + labs(x='',y='')
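
As a further sketch (not part of the answers above): in tidyr 1.0.0 and later, pivot_longer() supersedes gather(), and geom_col() is shorthand for geom_bar(stat = "identity"). Assuming the same example data, the clustered plot could be written as follows; the name dat.long is only illustrative.

```r
library(tidyr)
library(ggplot2)

# Same example data as in the original answer
dat <- data.frame(
  country   = c("USA", "Brazil", "Ghana", "England", "Australia"),
  Stabbing  = c(15, 10, 9, 6, 7),
  Accidents = c(20, 25, 21, 28, 15),
  Suicide   = c(3, 10, 7, 8, 6)
)

# Wide -> long: one row per country/type pair
dat.long <- pivot_longer(dat, cols = -country,
                         names_to = "type", values_to = "value")

# geom_col() plots the values as given; "dodge" puts the countries side by side
p <- ggplot(dat.long, aes(x = type, y = value, fill = country)) +
  geom_col(position = "dodge")
p
```

Swapping type and country in aes() would cluster by country instead, with one bar per incident type.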

Related

Maintain order of data frame for a stacked barplot using ggplot2

Using the following dataframe and ggplot...
sample ="BC04"
df<- data.frame(Name=c("Pseudomonas veronii", "Pseudomonas stutzeri", "Janthinobacterium lividum", "Pseudomonas viridiflava"),
Abundance=c(7.17, 4.72, 3.44, 3.33))
ggplot(data=df, aes(x=sample, y=Abundance, fill=Name)) +
geom_bar(stat="identity")
... creates the following graph
[image: barplot]
Although geom_bar() is called with stat="identity", it still ignores the order of the rows in the data frame. I would like the stack order to follow the Abundance percentage (highest percentage at the top).
Previously, strings passed to ggplot were evaluated with aes_string (which is now deprecated). Now we convert the string to a symbol and unquote it with !!. This works when the string names a column of the data frame.
library(ggplot2)
ggplot(data=df, aes(x= !! rlang::sym(sample), y=Abundance, fill=Name)) +
geom_bar(stat="identity")
Or another option is the .data pronoun
ggplot(data=df, aes(x= .data[[sample]], y=Abundance, fill=Name)) +
geom_bar(stat="identity")
Update
On checking the plot, it may be that the OP created a column named 'sample'. In that case, we reorder 'Name' by descending 'Abundance' (desc() comes from dplyr)
library(dplyr)
df$sample <- "BC04"
ggplot(data = df, aes(x = sample, y = Abundance,
fill = reorder(Name, desc(Abundance)))) +
geom_bar(stat = 'identity')+
guides(fill = guide_legend(title = "Name"))
Or another option is to convert the 'Name' to factor with levels mentioned as the unique elements of 'Name' (as the data is already arranged in descending order of 'Abundance')
library(dplyr)
df %>%
mutate(Name = factor(Name, levels = unique(Name))) %>%
ggplot(aes(x = sample, y = Abundance, fill = Name)) +
geom_bar(stat = 'identity')
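
A minimal base R sketch of the same idea, assuming 'sample' is meant to be a column of the data frame: reorder() sorts the factor levels by a score, so negating Abundance puts the most abundant name first. In current ggplot2 versions the first level should end up at the top of the stack, matching the legend order.

```r
library(ggplot2)

df <- data.frame(
  Name = c("Pseudomonas veronii", "Pseudomonas stutzeri",
           "Janthinobacterium lividum", "Pseudomonas viridiflava"),
  Abundance = c(7.17, 4.72, 3.44, 3.33)
)
df$sample <- "BC04"

# Order the factor levels by decreasing Abundance (no dplyr needed)
df$Name <- reorder(df$Name, -df$Abundance)
levels(df$Name)

p <- ggplot(df, aes(x = sample, y = Abundance, fill = Name)) +
  geom_bar(stat = "identity")
p
```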

ggplot ordering a clustered barplot

Code to reproduce the issue I have:
library("data.table")
library("ggplot2")
DT<-data.table(team=c("Q1","Q2","Q3"), mon=c(3,5,2), tues=c(4,2,1), weds=c(4,2,5))
DT<-melt(DT,id.vars = "team", measure.vars = c("mon","tues","weds"))
chartdata<-DT[,.(team, day=variable, score=value)]
ggplot(chartdata, aes(fill=day, y=score, x=team)) +#reorder(data3$Insurer, if(thisdir=="asc") {value} else {-value}))) +
geom_bar(position="dodge", stat="identity")
This produces a clustered barplot. I need to set the order by Monday's score (descending), but can't see a way of doing this. I have tried:
ggplot(chartdata, aes(fill=day, y=score, x=reorder(team, {-score}))) +
geom_bar(position="dodge", stat="identity")
but this appears to sort the teams by their combined scores for Monday to Wednesday, not by Monday alone as I want.
Is this possible? Many thanks!
You can sort your data frame before passing it to ggplot2 and fix the factor levels of the variable used for the x axis:
library(dplyr)
library(ggplot2)
chartdata %>%
arrange(day, -score) %>%
mutate(team = factor(team, unique(team))) %>%
ggplot(aes(x = team, y = score, fill = day))+
geom_col(position = position_dodge())
Is this what you are looking for?
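
The same ordering can also be made explicit without dplyr. This is only a sketch on a plain data frame holding the question's scores; team_order and mon are illustrative names. It takes the team order from Monday's rows alone and uses it as the factor levels.

```r
library(ggplot2)

# Long-format scores, same values as in the question
chartdata <- data.frame(
  team  = rep(c("Q1", "Q2", "Q3"), times = 3),
  day   = rep(c("mon", "tues", "weds"), each = 3),
  score = c(3, 5, 2, 4, 2, 1, 4, 2, 5)
)

# Team order taken from Monday's scores only, highest first
mon <- chartdata[chartdata$day == "mon", ]
team_order <- as.character(mon$team[order(-mon$score)])
chartdata$team <- factor(chartdata$team, levels = team_order)

p <- ggplot(chartdata, aes(x = team, y = score, fill = day)) +
  geom_col(position = "dodge")
p
```

Because the levels are fixed before plotting, the other days' scores cannot influence the x-axis order.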

Plotting the means in ggplot, without using stat_summary()

In ggplot, I want to compute the means (per group) and plot them as points. I would like to do that with geom_point(), and not stat_summary().
Here are my data.
group = rep(c('a', 'b'), each = 3)
grade = 1:6
df = data.frame(group, grade)
# this does the job
ggplot(df, aes(group, grade)) +
stat_summary(fun = 'mean', geom = 'point')
# but this does not
ggplot(df, aes(group, grade)) +
geom_point(stat = 'mean')
What values can the stat argument above take?
Is it possible to compute the means, using geom_point(), without computing a new data frame?
You could do
ggplot(df, aes(group, grade)) +
geom_point(stat = 'summary', fun = "mean")
But in general it's really not a great idea to rely on ggplot to do your data manipulation for you. Just let ggplot take care of the plotting. You can use packages like dplyr to help with the summarizing:
library(dplyr)
df %>% group_by(group) %>%
summarize(grade=mean(grade)) %>%
ggplot(aes(group, grade)) +
geom_point()
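
If dplyr is not available, the summary step can be sketched in base R with aggregate(). This still builds a small data frame of means first (here called means, an illustrative name), which is the approach recommended above.

```r
library(ggplot2)

group <- rep(c("a", "b"), each = 3)
grade <- 1:6
df <- data.frame(group, grade)

# One row per group holding the mean grade
means <- aggregate(grade ~ group, data = df, FUN = mean)

p <- ggplot(means, aes(group, grade)) +
  geom_point()
p
```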

Ordering alphanumeric variables for plotting

How do I order a set of variable names along the x-axis that contain letters and numbers? These come from a survey where the variables are formatted like var1, below. But when plotted, they appear as out_1, out_10, out_11...
But what I would like is for it to be plotted out_1, out_2...
library(tidyverse)
var1<-rep(paste0('out','_', seq(1,12,1)), 100)
var2<-rnorm(n=length(var1) ,mean=2)
df<-data.frame(var1, var2)
ggplot(df, aes(x=var1, y=var2))+geom_boxplot()
I tried this:
df %>%
separate(var1, into=c('A', 'B'), sep='_') %>%
arrange(B) %>%
ggplot(., aes(x=B, y=var2))+geom_boxplot()
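You can set the order of the levels of var1 before plotting by converting it to a factor (the values already appear in the desired order in the data):
df$var1 <- factor(df$var1, levels = unique(df$var1))
ggplot(df, aes(var1,var2)) + geom_boxplot()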
Or you can specify the order in the ggplot scale options, using limits (labels alone would only rename the ticks, not reorder them):
ggplot(df, aes(var1,var2)) +
geom_boxplot() +
scale_x_discrete(limits = as.character(unique(df$var1)))
Both approaches give the same result.
You can also use the scale to give personalized labels; there's no need to create a new variable:
ggplot(df, aes(var1, var2)) +
geom_boxplot() +
scale_x_discrete('output', limits = as.character(unique(df$var1)), labels = gsub('out_', '', unique(df$var1)))
Check ?discrete_scale for details. You can use breaks and labels in different combinations, including the use of labels that came from outside your data.frame:
pers.labels <- paste('Output', 1:12)
ggplot(df, aes(var1, var2)) +
geom_boxplot() +
scale_x_discrete(NULL, limits = as.character(unique(df$var1)), labels = pers.labels)
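
The approaches above rely on the names first appearing in the data in the desired order. As a hedged sketch for the case where they do not, the levels can be sorted by their numeric suffix instead; this assumes every name has the form out_<number>, and nm is just an illustrative name.

```r
library(ggplot2)

set.seed(1)  # only to make the simulated data reproducible
var1 <- rep(paste0("out_", 1:12), 100)
var2 <- rnorm(n = length(var1), mean = 2)
df <- data.frame(var1, var2)

# Sort the distinct names by their numeric suffix, then use them as levels
nm <- unique(as.character(df$var1))
nm <- nm[order(as.integer(sub("^out_", "", nm)))]
df$var1 <- factor(df$var1, levels = nm)

p <- ggplot(df, aes(var1, var2)) + geom_boxplot()
p
```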

ggplot geom_boxplot and plotting last value with geom_point

I'm new to R. I was trying to plot the last value of each variable in a data frame on top of a boxplot. I tried the following without success:
ggplot(iris, aes(x=Species,y=Sepal.Length)) +
geom_boxplot() +
geom_point(iris, aes(x=unique(iris$Species), y=tail(iris,n=1)))
Thanks, Bill
One approach is
library(tidyverse)
iris1 <- iris %>%
group_by(Species) %>%
summarise(LastVal = last(Sepal.Length))
ggplot(iris, aes(x=Species,y=Sepal.Length)) +
geom_boxplot() +
geom_point(data = iris1, aes(x = Species, y = LastVal))
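
A base R variant of the same idea, as a sketch: aggregate() with tail() picks the last Sepal.Length per species in row order. Keeping the column name Sepal.Length lets the point layer inherit the aesthetics of the main plot; last_vals, the colour and the size are illustrative choices.

```r
library(ggplot2)

# Last Sepal.Length per species, in the row order of iris
last_vals <- aggregate(Sepal.Length ~ Species, data = iris,
                       FUN = function(x) tail(x, 1))

# The point layer reuses x = Species and y = Sepal.Length from ggplot()
p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_point(data = last_vals, colour = "red", size = 3)
p
```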
