R barplot with grouping - r

I checked ggplot specs and looks like I need transpose my data to build bar plot, or there is still an option to use with that df, so I actually can use column names in groupings, I mocked up image for demo below, on ggplot still can't get where we do groupings, or we can list them with comma ? Tx all
df1 <- data.frame(yy=2017, F1=23, F2=40, F3=4)
df2 <- data.frame(yy=2018, F1=16, F2=90, F3=8)
df <- rbind(df1,df2)
df
yy F1 F2 F3
1 2017 23 40 4
2 2018 16 90 8
ggplot(df, aes(F1, yy)) + ## this is just bad sample
geom_bar(aes(fill = yy), stat = "identity", position = "dodge")

library(tidyverse)
df1 <- data.frame(yy=2017, F1=23, F2=40, F3=4)
df2 <- data.frame(yy=2018, F1=16, F2=90, F3=8)
df <- rbind(df1,df2)
df %>%
gather(type,value,-yy) %>% # reshape data
mutate(yy = factor(yy)) %>% # update variable to a factor
ggplot(aes(type, value, fill=yy)) +
geom_bar(stat = "identity", position = "dodge")

Related

summarise column and add it's common values in R

I have a dataframe something like this and my end goal is to make a bar chart.
Here is the data frame.
a 5
a 7
b 23
b 12
c 21
c 21
c 27
I want to summarize the dataframe with the first column but want to add the values of the 2nd column and make a bar chart for the values of 2nd column. The resulting data frame should be :
a 12
b 35
c 69
I tried something like this but it does not work:
d %>%
group_by(V1) %>%
summarise(V2) %>%
ggplot(aes(x = V1, y = V2)) + geom_col()+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
A simple base R option using barplot + aggregate
barplot(SumValue ~ ., aggregate(cbind(SumValue = Value) ~ ., df, sum))
Seems to be pretty straightforward. Let me know if this helps.
library(dplyr)
library(ggplot2)
#Converting your values into a dataframe
data <- data.frame("Key" = c("a","a","b","b","c","c","c"), "Value" = c(5,7,23,12,21,21,27))
data <- data %>%
group_by(Key) %>%
summarise(Value = sum(Value))
#Plot
ggplot(data, aes(x=Key, y=Value))+
geom_bar(stat="identity")

Adding character values of a column in R

I have two columns i.e. square_id & Smart_Nsmart as given below.
I want to count(add) N's and S's against each square_id and ggplot the data i.e. plot square_id vs Smart_Nsmart.
square_id 1
1
2 2 2 2 3 3 3 3
Smart_Nsmart
S N N N S S N S S S
We can use count and then use ggplot to plot the frequency. Here, we are plotting it with geom_bar (as it is not clear from the OP's post)
library(dplyr)
library(ggplot2)
df %>%
count(square_id, Smart_Nsmart) %>%
ggplot(., aes(x= square_id, y = n, fill = Smart_Nsmart)) +
geom_bar(stat = 'identity')
The above answer is very smart. However, instead of count function, you can implement group_by and summarise just in case in future you want to apply some other functions to your code.
library(dplyr)
library(ggplot2)
dff <- data.frame(a=c(1,1,1,1,2,1,2),b=c("C","C","N","N","N","C","N"))
dff %>%
group_by(a,b) %>%
summarise(n = length(b) ) %>%
ggplot(., aes(x= a, y = n, fill = b)) +
geom_bar(stat = 'identity')

Plot divergent stacked bar chart with ggplot2

Is there a way to use ggplot2 to create divergent stacked bar charts like the one on the right-hand side of the image below?
Data for reproducible example
library(ggplot2)
library(scales)
library(reshape)
dat <- read.table(text = " ONE TWO THREE
1 23 234 324
2 34 534 12
3 56 324 124
4 34 234 124
5 123 534 654",sep = "",header = TRUE)
# reshape data
datm <- melt(cbind(dat, ind = rownames(dat)), id.vars = c('ind'))
# plot
ggplot(datm,aes(x = variable, y = value,fill = ind)) +
geom_bar(position = "fill",stat = "identity") +
coord_flip()
Sure, positive values stack positively, negative values stack negatively. Don't use position fill. Just define what you want as negative values, and actually make them negative. Your example only has positive scores. E.g.
ggplot(datm, aes(x = variable, y = ifelse(ind %in% 1:2, -value, value), fill = ind)) +
geom_col() +
coord_flip()
If you want to also scale to 1, you need some preprocessing:
library(dplyr)
datm %>%
group_by(variable) %>%
mutate(value = value / sum(value)) %>%
ggplot(aes(x = variable, y = ifelse(ind %in% 1:2, -value, value), fill = ind)) +
geom_col() +
coord_flip()
An extreme approach might be to calculate the boxes yourself. Here's one method
dd <- datm %>% group_by(variable) %>%
arrange(desc(ind)) %>%
mutate(pct = value/sum(value), right = cumsum(pct), left=lag(right, default=0))
then you can plot with
ggplot(dd) +
geom_rect(aes(xmin=right, xmax=left, ymin=as.numeric(variable)-.4, ymax=as.numeric(variable)+.4, fill=ind)) +
scale_y_continuous(labels=levels(dd$variable), breaks=1:nlevels(dd$variable))
to get the left plot. and to get the right, you just shift the boxes a bit. This will line up all the right edges of the ind 3 boxes.
ggplot(dd %>% group_by(variable) %>% mutate(left=left-right[ind==3], right=right-right[ind==3])) +
geom_rect(aes(xmin=right, xmax=left, ymin=as.numeric(variable)-.4, ymax=as.numeric(variable)+.4, fill=ind)) +
scale_y_continuous(labels=levels(dd$variable), breaks=1:nlevels(dd$variable))
So maybe overkill here, but you have a lot of control this way.

Sort stacked bar plot by cumulative value in R

I am pretty new to R and i'm trying to get a stacked bar plot. My data looks like this:
name value1 value2
1 A 1118 239
2 B 647 31
3 C 316 1275
4 D 2064 230
5 E 231 85
I need a horizontal bar graph with stacked values, this is as far as i can get with my limited R skills (and most of that is also copy-pasted):
melted <- melt(data, id.vars=c("name"))
melted$name <- factor(
melted$name,
levels=rev(sort(unique(melted$name))),
ordered=TRUE
)
melted2 <- melted[order(melted$value),]
ggplot(melted2, aes(x= name, y = value, fill = variable)) +
geom_bar(stat = "identity") +
coord_flip()
It even took me several hours to get to this point, with witch I am pretty content as far as looks go, this is the produced output
What I now want to do is to get the bars ordered by summed up value (D is first, followed by C, A, B, E). I googled and tried some reorder and order stuff, but I simply can't get it to behave like I want it to. I'm sure the solution has to be pretty simple, so I hope you guys can help me with this.
Thanks in advance!
Well, I am not down or keeping up with all the latest changes in ggplot, but here is one way you could remedy this
I used your idea to set up the factor levels of name but based on the grouped sums. You might also find order = variable useful at some point, which will order the bar colors based on the variable, but not needed here
data <- read.table(header = TRUE, text = "name value1 value2
1 A 1118 239
2 B 647 31
3 C 316 1275
4 D 2064 230
5 E 231 85")
library('reshape2')
library('ggplot2')
melted <- melt(data, id.vars=c("name"))
melted <- within(melted, {
name <- factor(name, levels = names(sort(tapply(value, name, sum))))
})
levels(melted$name)
# [1] "E" "B" "A" "C" "D"
ggplot(melted, aes(x= name, y = value, fill = variable, order = variable)) +
geom_bar(stat = "identity") +
coord_flip()
Another option would be to use the dplyr package to set up a total column in your data frame and use that to sort.
The approach would look something like this.
m <- melted %>% group_by(name) %>%
mutate(total = sum(value) ) %>%
ungroup() %>%
arrange(total) %>%
mutate(name = factor(name, levels = unique(as.character(name))) )
ggplot(m, aes(x = name, y = value, fill = variable)) + geom_bar(stat = 'identity') + coord_flip()
Note that trying below code.
using tidyr package instead to reshape2 package
library(ggplot2)
library(dplyr)
library(tidyr)
data <- read.table(text = "
class value1 value2
A 1118 239
B 647 31
C 316 1275
D 2064 230
E 231 85", header = TRUE)
pd <- gather(data, key, value, -class) %>%
mutate(class = factor(class, levels = tapply(value, class, sum) %>% sort %>% names))
pd %>% ggplot(aes(x = class, y = value, fill = key, order = class)) +
geom_bar(stat = "identity") +
coord_flip()

Stacked bar chart

I would like to create a stacked chart using ggplot2 and geom_bar.
Here is my source data:
Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10
I want a stacked chart where x is the rank and y is the values in F1, F2, F3.
# Getting Source Data
sample.data <- read.csv('sample.data.csv')
# Plot Chart
c <- ggplot(sample.data, aes(x = sample.data$Rank, y = sample.data$F1))
c + geom_bar(stat = "identity")
This is as far as i can get. I'm not sure of how I can stack the rest of the field values.
Maybe my data.frame is not in a good format?
You said :
Maybe my data.frame is not in a good format?
Yes this is true. Your data is in the wide format You need to put it in the long format. Generally speaking, long format is better for variables comparison.
Using reshape2 for example , you do this using melt:
dat.m <- melt(dat,id.vars = "Rank") ## just melt(dat) should work
Then you get your barplot:
ggplot(dat.m, aes(x = Rank, y = value,fill=variable)) +
geom_bar(stat='identity')
But using lattice and barchart smart formula notation , you don't need to reshape your data , just do this:
barchart(F1+F2+F3~Rank,data=dat)
You need to transform your data to long format and shouldn't use $ inside aes:
DF <- read.table(text="Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header=TRUE)
library(reshape2)
DF1 <- melt(DF, id.var="Rank")
library(ggplot2)
ggplot(DF1, aes(x = Rank, y = value, fill = variable)) +
geom_bar(stat = "identity")
Building on Roland's answer, using tidyr to reshape the data from wide to long:
library(tidyr)
library(ggplot2)
df <- read.table(text="Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header=TRUE)
df %>%
gather(variable, value, F1:F3) %>%
ggplot(aes(x = Rank, y = value, fill = variable)) +
geom_bar(stat = "identity")
You will need to melt your dataframe to get it into the so-called long format:
require(reshape2)
sample.data.M <- melt(sample.data)
Now your field values are represented by their own rows and identified through the variable column. This can now be leveraged within the ggplot aesthetics:
require(ggplot2)
c <- ggplot(sample.data.M, aes(x = Rank, y = value, fill = variable))
c + geom_bar(stat = "identity")
Instead of stacking you may also be interested in showing multiple plots using facets:
c <- ggplot(sample.data.M, aes(x = Rank, y = value))
c + facet_wrap(~ variable) + geom_bar(stat = "identity")

Resources