Stacked bar chart - r

I would like to create a stacked chart using ggplot2 and geom_bar.
Here is my source data:
Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10
I want a stacked chart where x is the rank and y is the values in F1, F2, F3.
# Getting Source Data
sample.data <- read.csv('sample.data.csv')
# Plot Chart
c <- ggplot(sample.data, aes(x = sample.data$Rank, y = sample.data$F1))
c + geom_bar(stat = "identity")
This is as far as I can get. I'm not sure how to stack the remaining field values.
Maybe my data.frame is not in a good format?

You said:
Maybe my data.frame is not in a good format?
Yes, this is true. Your data is in wide format; you need to put it in long format. Generally speaking, long format is better for comparing variables.
Using reshape2, for example, you can do this with melt:
dat.m <- melt(dat, id.vars = "Rank") ## id.vars is required here: Rank is numeric, so a bare melt(dat) would melt it as well
Then you get your barplot:
ggplot(dat.m, aes(x = Rank, y = value,fill=variable)) +
geom_bar(stat='identity')
But with lattice and barchart's formula notation, you don't need to reshape your data; just do this:
barchart(F1 + F2 + F3 ~ factor(Rank), data = dat, stack = TRUE)

You need to transform your data to long format and shouldn't use $ inside aes:
DF <- read.table(text="Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header=TRUE)
library(reshape2)
DF1 <- melt(DF, id.vars = "Rank")
library(ggplot2)
ggplot(DF1, aes(x = Rank, y = value, fill = variable)) +
geom_bar(stat = "identity")
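To make the wide/long distinction concrete, here is a minimal base R sketch of the reshaping step, assuming the question's data. It is not how melt is implemented; the names wide, stacked and long are illustrative only.

```r
# Wide data from the question: one column per field
wide <- read.table(text = "Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header = TRUE)

# stack() puts all field values into one column ("values")
# and records the originating column name in a factor ("ind")
stacked <- stack(wide[, c("F1", "F2", "F3")])

# Long format: one row per (Rank, variable) pair
long <- data.frame(
  Rank     = rep(wide$Rank, times = 3),  # Rank repeats once per field
  variable = stacked$ind,
  value    = stacked$values
)

head(long)
```

The resulting long data frame has the same Rank, variable and value columns that the ggplot call above maps to x, fill and y.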

Building on Roland's answer, using tidyr to reshape the data from wide to long:
library(tidyr)
library(ggplot2)
df <- read.table(text="Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header=TRUE)
df %>%
gather(variable, value, F1:F3) %>%
ggplot(aes(x = Rank, y = value, fill = variable)) +
geom_bar(stat = "identity")
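In recent tidyr versions (1.0.0 and later), the same reshaping can also be written with pivot_longer. The following is a sketch under that assumption; the column names variable and value are chosen here only to match the plotting code above.

```r
library(tidyr)

df <- read.table(text = "Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header = TRUE)

# Collect F1..F3 into a name column ("variable") and a value column ("value")
long <- pivot_longer(df, cols = F1:F3,
                     names_to = "variable", values_to = "value")

head(long)
```

The long data frame can then be passed to ggplot with the same aes(x = Rank, y = value, fill = variable) mapping.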

You will need to melt your dataframe to get it into the so-called long format:
require(reshape2)
sample.data.M <- melt(sample.data, id.vars = "Rank")
Now your field values are represented by their own rows and identified through the variable column. This can now be leveraged within the ggplot aesthetics:
require(ggplot2)
c <- ggplot(sample.data.M, aes(x = Rank, y = value, fill = variable))
c + geom_bar(stat = "identity")
Instead of stacking you may also be interested in showing multiple plots using facets:
c <- ggplot(sample.data.M, aes(x = Rank, y = value))
c + facet_wrap(~ variable) + geom_bar(stat = "identity")

Related

Plot divergent stacked bar chart with ggplot2

Is there a way to use ggplot2 to create divergent stacked bar charts like the one on the right-hand side of the image below?
Data for reproducible example
library(ggplot2)
library(scales)
library(reshape)
dat <- read.table(text = " ONE TWO THREE
1 23 234 324
2 34 534 12
3 56 324 124
4 34 234 124
5 123 534 654",sep = "",header = TRUE)
# reshape data
datm <- melt(cbind(dat, ind = rownames(dat)), id.vars = c('ind'))
# plot
ggplot(datm,aes(x = variable, y = value,fill = ind)) +
geom_bar(position = "fill",stat = "identity") +
coord_flip()
Sure, positive values stack positively, negative values stack negatively. Don't use position fill. Just define what you want as negative values, and actually make them negative. Your example only has positive scores. E.g.
ggplot(datm, aes(x = variable, y = ifelse(ind %in% 1:2, -value, value), fill = ind)) +
geom_col() +
coord_flip()
If you want to also scale to 1, you need some preprocessing:
library(dplyr)
datm %>%
group_by(variable) %>%
mutate(value = value / sum(value)) %>%
ggplot(aes(x = variable, y = ifelse(ind %in% 1:2, -value, value), fill = ind)) +
geom_col() +
coord_flip()
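The preprocessing can be checked without plotting. This is a base R sketch of the same idea (scale within each variable, then flip the sign for ind 1 and 2); the long table is built by hand here, and the columns share and signed are illustrative names.

```r
dat <- read.table(text = " ONE TWO THREE
1 23 234 324
2 34 534 12
3 56 324 124
4 34 234 124
5 123 534 654", header = TRUE)

# Long format built by hand: one row per (ind, variable) pair
datm <- data.frame(
  ind      = factor(rep(rownames(dat), times = ncol(dat))),
  variable = factor(rep(names(dat), each = nrow(dat)), levels = names(dat)),
  value    = unlist(dat, use.names = FALSE)
)

# Share of each value within its variable
datm$share <- datm$value / ave(datm$value, datm$variable, FUN = sum)

# Values that should stack to the left of zero get a negative sign
datm$signed <- ifelse(datm$ind %in% c("1", "2"), -datm$share, datm$share)

# The absolute shares add up to 1 within each variable
tapply(abs(datm$signed), datm$variable, sum)
```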
An extreme approach might be to calculate the boxes yourself. Here's one method
dd <- datm %>% group_by(variable) %>%
arrange(desc(ind)) %>%
mutate(pct = value/sum(value), right = cumsum(pct), left=lag(right, default=0))
then you can plot with
ggplot(dd) +
geom_rect(aes(xmin=right, xmax=left, ymin=as.numeric(variable)-.4, ymax=as.numeric(variable)+.4, fill=ind)) +
scale_y_continuous(labels=levels(dd$variable), breaks=1:nlevels(dd$variable))
to get the left plot. To get the right plot, you just shift the boxes a bit. This will line up all the right edges of the ind 3 boxes:
ggplot(dd %>% group_by(variable) %>% mutate(left=left-right[ind==3], right=right-right[ind==3])) +
geom_rect(aes(xmin=right, xmax=left, ymin=as.numeric(variable)-.4, ymax=as.numeric(variable)+.4, fill=ind)) +
scale_y_continuous(labels=levels(dd$variable), breaks=1:nlevels(dd$variable))
So maybe overkill here, but you have a lot of control this way.
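For a single bar, the edge arithmetic can be worked through by hand. This is a small base R sketch for the ONE category only, assuming the boxes are laid out in the order ind = 5, 4, 3, 2, 1 as produced by arrange(desc(ind)); the names shift and edges are illustrative.

```r
# Values of ONE in plotting order (ind = 5, 4, 3, 2, 1)
ind   <- c(5, 4, 3, 2, 1)
value <- c(123, 34, 56, 34, 23)

pct   <- value / sum(value)     # width of each box
right <- cumsum(pct)            # right edge of each box
left  <- c(0, head(right, -1))  # left edge = previous box's right edge

# Shift everything so the right edge of the ind 3 box sits at zero
shift <- right[ind == 3]
edges <- data.frame(ind = ind, left = left - shift, right = right - shift)

edges
```

Boxes for ind 5, 4 and 3 end up at or below zero and boxes for ind 2 and 1 above it, which is what makes the chart diverge around a common baseline.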

R barplot with grouping

I checked the ggplot docs, and it looks like I need to transpose my data to build a bar plot. Or is there still a way to use this df as is, so that I can use the column names as groupings? I mocked up an image for the demo below. In ggplot I still can't see where the groupings are specified; can we list them separated by commas? Thanks, all.
df1 <- data.frame(yy=2017, F1=23, F2=40, F3=4)
df2 <- data.frame(yy=2018, F1=16, F2=90, F3=8)
df <- rbind(df1,df2)
df
yy F1 F2 F3
1 2017 23 40 4
2 2018 16 90 8
ggplot(df, aes(F1, yy)) + ## this is just bad sample
geom_bar(aes(fill = yy), stat = "identity", position = "dodge")
library(tidyverse)
df1 <- data.frame(yy=2017, F1=23, F2=40, F3=4)
df2 <- data.frame(yy=2018, F1=16, F2=90, F3=8)
df <- rbind(df1,df2)
df %>%
gather(type,value,-yy) %>% # reshape data
mutate(yy = factor(yy)) %>% # update variable to a factor
ggplot(aes(type, value, fill=yy)) +
geom_bar(stat = "identity", position = "dodge")

Reordering columns by y-value in R?

I have a dataframe structured like this:
> head(df)
Zip Crimes Population CPC
1 78701 2103 6841 0.3074
2 78719 186 1764 0.1054
3 78702 1668 21334 0.0782
4 78723 2124 28330 0.0750
5 78753 3472 49301 0.0704
6 78741 2973 44935 0.0662
And I'm plotting it using this function:
p = ggplot(df, aes(x=Zip, y=CPC)) + geom_col() + theme(axis.text.x = element_text(angle = 90))
And this is the graph I get:
How can I order the plot by CPC, so that the Zip codes with the highest CPC are on the left?
Convert Zip to a factor ordered by negative CPC. E.g., try df$Zip <- reorder(df$Zip, -df$CPC) before plotting. Here's a small example:
d <- data.frame(
x = c('a', 'b', 'c'),
y = c(5, 15, 10)
)
library(ggplot2)
# Without reordering
ggplot(d, aes(x, y)) + geom_col()
# With reordering
d$x <- reorder(d$x, -d$y)
ggplot(d, aes(x, y)) + geom_col()
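To see why this works, it can help to look at the factor levels directly, since ggplot2 draws a discrete x axis in level order. A short base R check on the same toy data (the names before and after are illustrative):

```r
d <- data.frame(
  x = c("a", "b", "c"),
  y = c(5, 15, 10)
)

# Default factor levels are alphabetical
before <- levels(factor(d$x))
before
# [1] "a" "b" "c"

# reorder() sorts the levels by the second argument, in ascending order;
# negating y therefore puts the largest y first
after <- levels(reorder(d$x, -d$y))
after
# [1] "b" "c" "a"
```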
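Sort your data frame in descending order, fix the factor levels of Zip in that order, and then plot it (sorting the rows alone does not change the order of the bars):
library(dplyr)
df <- arrange(df,desc(CPC))
df$Zip <- factor(df$Zip, levels = df$Zip)
ggplot...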

Drawing a multiple line ggplot figure

I am working on a figure which should contain 3 different lines on the same graph. The data frame I am working on is the following:
I would like to use ind (my data point) on the x axis and then draw 3 different lines using the data from the columns med, b and c.
I only managed to draw one line.
Could you please help me? The code I am using now is:
ggplot(data=f, aes(x=ind, y=med, group=1)) +
geom_line(aes())+ geom_line(colour = "darkGrey", size = 3) +
theme_bw() +
theme(plot.background = element_blank(),panel.grid.major = element_blank(),panel.grid.minor = element_blank())
The key is to gather the columns in question into a single pair of key and value columns. This happens in the gather() step in the code below. The rest is pretty much boilerplate ggplot2.
library(ggplot2)
library(tidyr)
xy <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10),
ind = 1:10)
# we gather a and b into one key column and one value column (add c to the list for a third line)
xy <- gather(xy, key = myvariable, value = myvalue, a, b)
ggplot(xy, aes(x = ind, y = myvalue, color = myvariable)) +
theme_bw() +
geom_line()
With melt and ggplot:
df$ind <- 1:nrow(df)
head(df)
a b med c ind
1 -87.21893 -84.72439 -75.78069 -70.87261 1
2 -107.29747 -70.38214 -84.96422 -73.87297 2
3 -106.13149 -105.12869 -75.09039 -62.61283 3
4 -93.66255 -97.55444 -85.01982 -56.49110 4
5 -88.73919 -95.80307 -77.11830 -47.72991 5
6 -86.27068 -83.24604 -86.86626 -91.32508 6
df <- melt(df, id='ind')
ggplot(df, aes(ind, value, group=variable, col=variable)) + geom_line(lwd=2)

Sort stacked bar plot by cumulative value in R

I am pretty new to R and I'm trying to get a stacked bar plot. My data looks like this:
name value1 value2
1 A 1118 239
2 B 647 31
3 C 316 1275
4 D 2064 230
5 E 231 85
I need a horizontal bar graph with stacked values. This is as far as I can get with my limited R skills (and most of that is also copy-pasted):
melted <- melt(data, id.vars=c("name"))
melted$name <- factor(
melted$name,
levels=rev(sort(unique(melted$name))),
ordered=TRUE
)
melted2 <- melted[order(melted$value),]
ggplot(melted2, aes(x= name, y = value, fill = variable)) +
geom_bar(stat = "identity") +
coord_flip()
It took me several hours to get even to this point, with which I am pretty content as far as looks go. This is the produced output:
What I now want to do is to get the bars ordered by summed up value (D is first, followed by C, A, B, E). I googled and tried some reorder and order stuff, but I simply can't get it to behave like I want it to. I'm sure the solution has to be pretty simple, so I hope you guys can help me with this.
Thanks in advance!
Well, I am not keeping up with all the latest changes in ggplot, but here is one way you could remedy this.
I used your idea of setting the factor levels of name, but based on the grouped sums. You might also find order = variable useful at some point, which orders the bar colors based on the variable, but it is not needed here.
data <- read.table(header = TRUE, text = "name value1 value2
1 A 1118 239
2 B 647 31
3 C 316 1275
4 D 2064 230
5 E 231 85")
library('reshape2')
library('ggplot2')
melted <- melt(data, id.vars=c("name"))
melted <- within(melted, {
name <- factor(name, levels = names(sort(tapply(value, name, sum))))
})
levels(melted$name)
# [1] "E" "B" "A" "C" "D"
ggplot(melted, aes(x= name, y = value, fill = variable, order = variable)) +
geom_bar(stat = "identity") +
coord_flip()
Another option would be to use the dplyr package to set up a total column in your data frame and use that to sort.
The approach would look something like this:
library(dplyr)
m <- melted %>% group_by(name) %>%
mutate(total = sum(value) ) %>%
ungroup() %>%
arrange(total) %>%
mutate(name = factor(name, levels = unique(as.character(name))) )
ggplot(m, aes(x = name, y = value, fill = variable)) + geom_bar(stat = 'identity') + coord_flip()
Alternatively, try the code below, which uses the tidyr package instead of reshape2:
library(ggplot2)
library(dplyr)
library(tidyr)
data <- read.table(text = "
class value1 value2
A 1118 239
B 647 31
C 316 1275
D 2064 230
E 231 85", header = TRUE)
pd <- gather(data, key, value, -class) %>%
mutate(class = factor(class, levels = tapply(value, class, sum) %>% sort %>% names))
pd %>% ggplot(aes(x = class, y = value, fill = key, order = class)) +
geom_bar(stat = "identity") +
coord_flip()
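The level-ordering step in the answers above can also be sketched with base R's reorder() and FUN = sum, which sorts the levels by each group's total. This is an alternative formulation rather than what the answers use, and the long table is built by hand here to keep the example free of package dependencies.

```r
data <- read.table(header = TRUE, text = "name value1 value2
1 A 1118 239
2 B 647 31
3 C 316 1275
4 D 2064 230
5 E 231 85")

# Long format built by hand: one row per (name, variable) pair
long <- data.frame(
  name     = rep(data$name, times = 2),
  variable = rep(c("value1", "value2"), each = nrow(data)),
  value    = c(data$value1, data$value2)
)

# Order the levels of name by the summed value per name (ascending)
long$name <- reorder(long$name, long$value, FUN = sum)

levels(long$name)
# [1] "E" "B" "A" "C" "D"
```

With coord_flip(), the last level is drawn at the top, so D (the largest total) appears first.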

Resources