ggplot2: geom_area with ordered character variable as x axis

I have a dataset like the following :
dat <- data.frame(sp = c("a", "a", "b", "b", "b", "c", "c"),
nb = c(5, 44, 32, 56, 10, 1, 43),
gp = c("ds1", "ds2", "ds1", "ds2", "ds3", "ds1", "ds3"))
where sp = species, nb = number of occurrences, gp = sampling group.
I want to make a geom_area graph where values for species (sp) are displayed on y axis, with species grouped on x axis and ordered by descending order based on their total sum.
Up to now I only managed to do that :
ggplot(dat, aes(x=as.numeric(factor(sp)), y=nb, fill=gp, colour = gp)) +
geom_area()
Which gives this output (please don't laugh ;))
Could you help me sort the x axis in descending order of the sum of the stacked values? And fill the empty area?
E.g. I am trying to do something like this (here in ascending order, but that doesn't matter):

Try this. The gaps in your plot can be filled by completing the data frame with the missing combinations of gp and sp using tidyr::complete. To reorder the levels of sp I use forcats::fct_reorder:
library(ggplot2)
library(dplyr)
library(tidyr)
library(forcats)
dat <- data.frame(sp = c("a", "a", "b", "b", "b", "c", "c"),
nb = c(5, 44, 32, 56, 10, 1, 43),
gp = c("ds1", "ds2", "ds1", "ds2", "ds3", "ds1", "ds3"))
dat1 <- dat %>%
# Fill with missing combinations of gp and sp
tidyr::complete(gp, sp, fill = list(nb = 0)) %>%
# Reorder according to sum of nb
mutate(sp = forcats::fct_reorder(sp, nb, sum, .desc = TRUE),
sp_num = as.numeric(sp))
ggplot(dat1, aes(x=sp_num, y=nb, fill=gp, colour = gp)) +
geom_area()
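Because the x aesthetic is numeric here, the axis shows 1, 2, 3 rather than the species names. A minimal sketch, assuming the same dat as above, of how the species labels could be put back with scale_x_continuous (lab_levels is only an illustrative name):

```r
library(ggplot2)
library(dplyr)
library(tidyr)
library(forcats)

dat <- data.frame(sp = c("a", "a", "b", "b", "b", "c", "c"),
                  nb = c(5, 44, 32, 56, 10, 1, 43),
                  gp = c("ds1", "ds2", "ds1", "ds2", "ds3", "ds1", "ds3"))

dat1 <- dat %>%
  # Add the missing gp/sp combinations with nb = 0
  complete(gp, sp, fill = list(nb = 0)) %>%
  # Order species by their total nb, largest first
  mutate(sp = fct_reorder(sp, nb, sum, .desc = TRUE),
         sp_num = as.numeric(sp))

# Illustrative name: species in plotting order
lab_levels <- levels(dat1$sp)

p <- ggplot(dat1, aes(x = sp_num, y = nb, fill = gp, colour = gp)) +
  geom_area() +
  # One tick per species, labelled with the species name
  scale_x_continuous(breaks = seq_along(lab_levels), labels = lab_levels)
p
```

With this data the totals are b = 98, a = 49 and c = 44, so the ticks read b, a, c from left to right.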

Related

ggplot2 bar chart with two bars for each x value of data and two y-axis

I am struggling to create a bar chart with two different y-axes and two bars for each x-value (category).
I have different types of categories of data (see below) for each I have two values that I want to plot side by side (price and number). However, the values for each category are far apart, which makes the bars of the number category become almost invisible. Thus, I want to add a second y-axis (one for the price one for the number) to allow a comparison between the two categories.
Example data:
Cat Type Value
1 A price 12745
2 A number 5
3 B price 34874368
4 B number 143
5 C price 84526
6 C number 11
I use the following R code (ggplot2) to create the plot:
plot = ggplot(df ,aes(x=Cat, fill=Type, y=Value))+
geom_bar(stat="identity", position="dodge")+
theme_bw() +
labs_pubr() +
scale_fill_grey() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
plot
I tried scale_y_continuous and sec.axis, but I did not manage to assign each y-axis to a type of data.
scale_y_continuous(
"price",
sec.axis = sec_axis(~., name = "number")
) +
I am happy for every hint :)
Is that what you mean?
library(tidyverse)
df=tribble(
~Id, ~Cat, ~Type, ~Value,
1, "A", "price", 13,
2, "A", "number", 5,
3, "B", "price", 19,
4, "B", "number", 12,
5, "C", "price", 8,
6, "C", "number", 11)
df %>% ggplot(aes(x=Type, fill=Type, y=Value))+
geom_col()+
facet_grid(~Cat)
P.S.
I changed your values a bit because you could not see much when the differences were of the order of 10 ^ 7!
With your original numbers, a logarithmic scale is better suited:
df=tribble(
~Id, ~Cat, ~Type, ~Value,
1, "A", "price", 12745,
2, "A", "number", 5,
3, "B", "price", 34874368,
4, "B", "number", 143,
5, "C", "price", 84526,
6, "C", "number", 11)
df %>% ggplot(aes(x=Type, fill=Type, y=Value))+
geom_col()+
scale_y_continuous(trans='log10')+
facet_grid(~Cat)
The idea, as I understand it, is to split the graphs by Type, and you can do this using ggplot2's facet_wrap() function. Then use the scales package to fix the number formatting along the y-axis.
library(scales)
library(ggplot2)
library(dplyr)
tbl <- tibble(Cat = c("A", "A", "B", "B", "C", "C"), Type = c("price", "number", "price", "number","price", "number")
, Value = c(12745, 5, 34874368, 143, 84526, 11))
tbl %>%
ggplot(aes(Cat, Value, fill = Cat)) +
geom_col(position = "dodge") +
facet_wrap(~Type, scales = "free") +
scale_y_continuous(labels = scales::number_format())
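The question asked about sec.axis. A secondary axis in ggplot2 is only a one-to-one transformation of the primary axis, so one possible approach (a sketch, not what the answers above do) is to rescale the number values onto the price scale and undo that scaling in sec_axis. The scaling factor k and the column Scaled are assumptions made for illustration:

```r
library(ggplot2)

df <- data.frame(
  Cat   = c("A", "A", "B", "B", "C", "C"),
  Type  = c("price", "number", "price", "number", "price", "number"),
  Value = c(12745, 5, 34874368, 143, 84526, 11)
)

# Assumed scaling factor: stretches "number" so its maximum matches the maximum price
k <- max(df$Value[df$Type == "price"]) / max(df$Value[df$Type == "number"])

# Hypothetical helper column: prices unchanged, numbers multiplied by k
df$Scaled <- ifelse(df$Type == "number", df$Value * k, df$Value)

p <- ggplot(df, aes(x = Cat, y = Scaled, fill = Type)) +
  geom_col(position = "dodge") +
  scale_y_continuous(
    name = "price",
    # The secondary axis divides by k, so it reads in original "number" units
    sec.axis = sec_axis(~ . / k, name = "number")
  )
p
```

On a linear scale the small prices for A and C are still hard to see next to B, which is why the log scale or free facets suggested above may be the more readable choice for this data.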

Assign colours in geom_bar in different plots

I have a dataset with different species and year of their observation (in 3 categories). I would now like to make one plot per species with their observations per year and assign each year-category (each bar) a different color, but it should be the same colour in every plot.
I tried to use lapply and then build a ggplot with geom_bar (see code below). I know I can assign the colours with geom_bar(fill = c("#e31a1c", "#ff7f00", "#33a02c")), but the problem is that some species were only observed in some of the years, so I get an error that the aesthetics are not the same length as the data. Is there another way to assign the colours to the bars?
Species = c(rep("X", 7), rep("Y", 3), rep("Z", 4), "V", rep("W", 3))
Year = c("A", "A", "A", "B", "B", "C", "C","A", "A", "C","B", "B", "C", "C","A", "A", "B", "C")
df <- data.frame(Species, Year)
mylist = lapply(split(df, as.factor(df$Species)), function(memefin){
ggplot(memefin, aes(x = Year, fill = Year))+
geom_bar(fill = c("#e31a1c", "#ff7f00", "#33a02c"))+
ggtitle(memefin$Species)+
scale_x_discrete(breaks=c("A","B","C"),labels=c("2000-2004", "2005-2009", "2010-2014"), drop = F)
})
You are on the right track:
mylist = lapply(split(df, as.factor(df$Species)), function(memefin){
ggplot(memefin, aes(x = Year, fill = Year))+
geom_bar()+
ggtitle(memefin$Species)+
scale_x_discrete(breaks=c("A","B","C"),labels=c("2000-2004", "2005-2009", "2010-2014"), drop = F)+
scale_fill_manual(values=c("A"="#e31a1c","B"= "#ff7f00","C" ="#33a02c")) # just apply the colours to specific "Years"
})
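A named vector works because scale_fill_manual matches colours to the Year values by name, so a species that lacks one of the year categories still gets the right colours for the categories it has. A minimal sketch of that idea, with the palette defined once outside the loop (year_cols and plots are illustrative names):

```r
library(ggplot2)

Species <- c(rep("X", 7), rep("Y", 3), rep("Z", 4), "V", rep("W", 3))
Year <- c("A", "A", "A", "B", "B", "C", "C", "A", "A", "C",
          "B", "B", "C", "C", "A", "A", "B", "C")
df <- data.frame(Species, Year)

# Illustrative name: one colour per year category, defined once
year_cols <- c(A = "#e31a1c", B = "#ff7f00", C = "#33a02c")

plots <- lapply(split(df, df$Species), function(d) {
  ggplot(d, aes(x = Year, fill = Year)) +
    geom_bar() +
    ggtitle(d$Species[1]) +
    # limits keeps all three categories on the x axis of every plot
    scale_x_discrete(limits = names(year_cols)) +
    # Colours are looked up by name, not by position
    scale_fill_manual(values = year_cols)
})

plots[["Z"]]  # species Z has no "A" observations; B and C keep their colours
```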

Values in gganimate col chart differs from original data values

I'm starting with animated charts and using gganimate package. I've found that when generating a col chart animation over time, values of variables change from original. Let me show you an example:
Data <- as.data.frame(cbind(c(1,1,1,2,2,2,3,3,3),
c("A","B","C","A","B","C","A","B","C"),
c(20,10,15,20,20,20,30,25,35)))
colnames(Data) <- c("Time","Object","Value")
Data$Time <- as.integer(Data$Time)
Data$Value <- as.numeric(Data$Value)
Data$Object <- as.character(Data$Object)
p <- ggplot(Data,aes(Object,Value)) +
stat_identity() +
geom_col() +
coord_cartesian(ylim = c(0,40)) +
transition_time(Time)
p
The chart obtained looks like this:
Values obtained in the Y-axis are between 1 and 6. It seems that the original value of 10 corresponds to a value of 1 in the Y-axis. 15 is 2, 20 is 3 and so on...
Is there a way for keeping the original values in the chart?
Thanks in advance
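Your values changed because a factor variable was coerced to numeric: as.numeric() on a factor returns the internal level codes (1, 2, 3, ...) rather than the original numbers. cbind() of numbers and strings yields a character matrix, which as.data.frame() converted to factors by default before R 4.0. (See the Data section below for a cleaner way to define the data.frame.)
You were also missing position = "identity", which keeps the bars in the same place. I added fill = Time for illustration.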
Code
library(ggplot2)
library(gganimate)
p <- ggplot(Data, aes(Object, Value, fill = Time)) +
geom_col(position = "identity") +
coord_cartesian(ylim = c(0, 40)) +
transition_time(Time)
p
Data
Data <- data.frame(Time = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
Object = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
Value = c(20, 10, 15, 20, 20, 20, 30, 25, 35))
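The factor coercion mentioned above can be reproduced without any plotting. A small sketch, assuming the values ended up stored as a factor: as.numeric() returns the position of each level, not the number the label spells.

```r
# Values as strings, which is what cbind() produces when numbers are mixed with text
value_chr <- c("20", "10", "15", "20", "20", "20", "30", "25", "35")
value_fct <- factor(value_chr)

# Levels are the sorted unique labels: "10" "15" "20" "25" "30" "35"
levels(value_fct)

# Level positions, not the original numbers: 3 1 2 3 3 3 5 4 6
codes <- as.numeric(value_fct)

# Going through character first recovers the original numbers
fixed <- as.numeric(as.character(value_fct))
```

This matches what the question observed: 10 is plotted as 1, 15 as 2, 20 as 3, and the y-axis runs from 1 to 6.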

Plot order in ggplot by colour and not alphabetical

I have the following code, which colours the bars of a ggplot by group. Instead of plotting the x axis alphabetically, is there an easy way to group the bars together so that, for example, the red bars are next to each other? Manually moving them would not be an option as I have many more variables - cheers.
mydata <- data.frame(x = c("a", "d", "c", "q", "e", "s", "r", "b"),
n = c("UK","EUR","UK", "UK", "EUR", "GLB", "GLB", "EUR"),
F = c(-6, 17, 26, -37, 44, -22, 15, 12))
ggplot(mydata, aes(x = x, y = F, colour = n, fill =n)) + geom_bar(stat = "Identity")
You can try:
library(tidyverse)
mydata %>%
mutate(x1 = factor(x, levels=x[order(n,F)])) %>%
ggplot(aes(x = x1, y = F, colour = n, fill =n)) +
geom_col()
Not sure if it is what you want.
I guess you are plotting bar charts, and the bars are currently in alphabetical order like the following example,
library(ggplot2)
library(dplyr)
sample_data <- data.frame(
city = letters[1:5],
value = c(10, 11, 17, 12, 13),
country = c("c1", "c2", "c1", "c1", "c2")
)
ggplot(sample_data) +
geom_col(aes(x=city, y=value, colour=country, fill=country))
The order of the bars (left to right) is a, b, c, d, e. However, you want the bars ordered by country (the variable that determines the colours/fill), i.e. a (c1), c (c1), d (c1), b (c2), e (c2).
To do this, you can set the 'correct' order of city using factor(city, levels=...). Since you want to sort city by country, the levels would be city[order(country)].
sample_data <- sample_data %>%
mutate(city2 = factor(city, levels=city[order(country)]))
ggplot(sample_data) +
geom_col(aes(x=city2, y=value, colour=country, fill=country))

How to create a bar plot and show average Y values

I want to create a bar plot based on the following data:
Station Delay
A 5
B 6
A 4
A 3
B 8
The X axis should contain stations "A" and "B", while the bars (Y axis) should show the average delay per station.
I tried this, but it does not give a correct result:
barplot(c(data$Station, data$Delay),
main="BARPLOT", xlab="Stations", ylab="Delays",
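names.arg=data$Station)
You can compute the mean delay per station with dplyr and then plot it:
df <- data.frame(Station = c("A", "B", "A", "A", "B"), Delay= c(5, 6, 4, 3, 8))
library(dplyr)
# Keep the raw data in df and store the per-station means separately
df_mean <- df %>% group_by(Station) %>% summarise(me = mean(Delay))
library(ggplot2)
ggplot(aes(x = Station, y = me), data = df_mean) + geom_bar(stat = "identity")
or directly with stat_summary on the raw data (recent ggplot2 versions use fun; fun.y is deprecated):
ggplot(aes(x = Station, y = Delay), data = df) + stat_summary(fun = "mean", geom = "bar")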
In base R, you can do:
m_data <- data.frame(data$Station, m_del=ave(data$Delay, data$Station), stringsAsFactors=F)
barplot(unique(m_data)$m_del, names=unique(m_data)$Station, main="BARPLOT", xlab="Stations", ylab="Delays")
Or with the package data.table, you can do:
library(data.table)
m_data <- setDT(data)[, mean(Delay), by=Station]
m_data[, barplot(V1, names=Station, main="BARPLOT", xlab="Stations", ylab="Delays")]
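Another base R option, sketched here on the example data from the question, is tapply(), which returns a named vector of means that barplot() can draw directly:

```r
df <- data.frame(Station = c("A", "B", "A", "A", "B"),
                 Delay = c(5, 6, 4, 3, 8))

# Mean delay per station, named by station (A = 4, B = 7)
means <- tapply(df$Delay, df$Station, mean)

# barplot() uses the names of the vector as bar labels
barplot(means, main = "BARPLOT", xlab = "Stations", ylab = "Delays")
```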
