Plot order in ggplot by colour and not alphabetical - r

I have the following code, which colours the bars of a ggplot by group. Instead of the x axis being ordered alphabetically, is there an easy way to group the bars so that, for example, the red bars sit next to each other? Moving them manually is not an option as I have many more variables. Cheers.
library(ggplot2)
mydata <- data.frame(x = c("a", "d", "c", "q", "e", "s", "r", "b"),
n = c("UK","EUR","UK", "UK", "EUR", "GLB", "GLB", "EUR"),
F = c(-6, 17, 26, -37, 44, -22, 15, 12))
ggplot(mydata, aes(x = x, y = F, colour = n, fill = n)) + geom_bar(stat = "identity")

You can try:
library(tidyverse)
mydata %>%
mutate(x1 = factor(x, levels=x[order(n,F)])) %>%
ggplot(aes(x = x1, y = F, colour = n, fill =n)) +
geom_col()

Not sure if it is what you want.
I guess you are plotting bar charts, and the bars are currently in alphabetical order like the following example,
library(ggplot2)
library(dplyr)
sample_data <- data.frame(
city = letters[1:5],
value = c(10, 11, 17, 12, 13),
country = c("c1", "c2", "c1", "c1", "c2")
)
ggplot(sample_data) +
geom_col(aes(x=city, y=value, colour=country, fill=country))
The order of the bars (left to right) is a, b, c, d, e. However, you want the bars ordered by country (the variable that determines the colour/fill), i.e. a (c1), c (c1), d (c1), b (c2), e (c2).
To do this, you can set the 'correct' order of city using factor(city, levels=...). Since you want to sort city by country, the levels would be city[order(country)].
sample_data <- sample_data %>%
mutate(city2 = factor(city, levels=city[order(country)]))
ggplot(sample_data) +
geom_col(aes(x=city2, y=value, colour=country, fill=country))
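As a quick sanity check of the reordering above, the factor levels can be inspected in base R before plotting. This is only a sketch that reuses sample_data from this answer; the names lvls and lvls_by_value are made up for illustration, and the second ordering (by value within country) is an optional variation rather than something the question asked for.

```r
sample_data <- data.frame(
  city = letters[1:5],
  value = c(10, 11, 17, 12, 13),
  country = c("c1", "c2", "c1", "c1", "c2")
)

# order() sorts by country; ties keep their original row order
lvls <- sample_data$city[order(sample_data$country)]
sample_data$city2 <- factor(sample_data$city, levels = lvls)
levels(sample_data$city2)
# "a" "c" "d" "b" "e"

# optional: also sort by value within each country by adding a second key
lvls_by_value <- sample_data$city[order(sample_data$country, sample_data$value)]
```

If the printed levels are in the order you expect on the x axis, the plot should follow them.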

Related

Barplot with sorted dots

I want to plot a 4 group barplot from a first data-frame called df1 and display dots from another data-frame called df2. The idea is to check how many dots from df2 lie outside of df1.
So I made the following graph which works well.
### 0- Import packages
library(dplyr)
library(ggplot2)
### 1- Data simulation
set.seed(4)
df1 <- data.frame(var=c("a", "b", "c", "d"), value=c(15, 19, 18, 17))
df2 <- data.frame(var1=rep(c("a", "b", "c", "d"), each=20), value=rnorm(80, 15, 2), color=NA, fill=NA)
### 2- Coloring data (outside=red, inside=blue)
df2$fill <- case_when(
(df2$var1=="a" & df2$value>subset(df1, var=='a')$value) ~ "#e18b8b",
(df2$var1=="b" & df2$value>subset(df1, var=='b')$value) ~ "#e18b8b",
(df2$var1=="c" & df2$value>subset(df1, var=='c')$value) ~ "#e18b8b",
(df2$var1=="d" & df2$value>subset(df1, var=='d')$value) ~ "#e18b8b",
TRUE ~ "#8cbee2")
df2$color <- case_when(
(df2$var1=="a" & df2$value>subset(df1, var=='a')$value) ~ "#ca0d0d",
(df2$var1=="b" & df2$value>subset(df1, var=='b')$value) ~ "#ca0d0d",
(df2$var1=="c" & df2$value>subset(df1, var=='c')$value) ~ "#ca0d0d",
(df2$var1=="d" & df2$value>subset(df1, var=='d')$value) ~ "#ca0d0d",
TRUE ~ "#0c78ca")
### 3- Display plot
ggplot(aes(x=var, y=value), data=df1) + geom_bar(stat="identity", fill='#8cbee2', width=0.6) +
geom_point(data=df2, aes(x=var1, y=value), colour=df2$color, fill=df2$fill, position=position_jitter(width=0.05, height=0), shape=21, size=2)
To improve this graph, I would like to sort the dots from df2 within each bar group, so that they form a kind of qqplot shape.
- First, this would show whether the number of dots outside is large compared to the bars.
- Second, this would show the distribution of the inside and outside dots.
I have found the following link but it only deals with one data-frame and I am working with 2.
How to plot boxplots superimposed with sorted points using ggplot2
Do you have any clue on how to sort these dots?
EDIT
Result following Stephan's answer
One option to achieve your desired result would be to use position_dodge and a helper column. To this end first order your data by var1 and value, then add the helper column as an interaction of var1 and the row index or number. This helper column could then be mapped on the group aes to ensure that points are plotted in ascending order where the dodge gives the qqplot-like shape:
Note: I also used a different approach for the colors which uses a left_join and maps on the color and fill aes.
library(dplyr)
set.seed(4)
df1 <- data.frame(var = c("a", "b", "c", "d"), value = c(15, 19, 18, 17))
df2 <- data.frame(var1 = rep(c("a", "b", "c", "d"), each = 20), value = rnorm(80, 15, 2), color = NA, fill = NA)
df2 <- df2 %>%
left_join(df1, by = c("var1" = "var"), suffix = c("", "_df1")) %>%
arrange(var1, value) %>%
mutate(
var_dodge = interaction(var1, row_number()),
color = value > value_df1
)
library(ggplot2)
ggplot(aes(x = var, y = value), data = df1) +
geom_bar(stat = "identity", fill = "#8cbee2", width = 0.6) +
geom_point(
data = df2, aes(x = var1, y = value, group = var_dodge, color = color, fill = color),
position = position_dodge(width = .4), shape = 21, size = 2
) +
scale_color_manual(values = c("TRUE" = "#ca0d0d", "FALSE" = "#0c78ca")) +
scale_fill_manual(values = c("TRUE" = "#e18b8b", "FALSE" = "#8cbee2")) +
guides(fill = "none", color = "none")
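To make the sorting idea more concrete, here is a rough base-R sketch that computes the horizontal offsets by hand instead of relying on position_dodge. It is a different technique from the answer above, not a description of how position_dodge works internally; the column names rank, n and x_sorted are hypothetical, and the width of 0.4 simply mirrors the dodge width used above.

```r
set.seed(4)
df2 <- data.frame(var1 = rep(c("a", "b", "c", "d"), each = 20),
                  value = rnorm(80, 15, 2))

width <- 0.4  # total horizontal spread per group (assumed, mirrors the dodge width)

# rank of each point within its group (1 = smallest value)
df2$rank <- ave(df2$value, df2$var1,
                FUN = function(v) rank(v, ties.method = "first"))
# number of points in each group
df2$n <- ave(df2$value, df2$var1, FUN = length)
# numeric x position: group index plus an offset that grows with the rank
df2$x_sorted <- as.integer(factor(df2$var1)) +
  ((df2$rank - 0.5) / df2$n - 0.5) * width
```

Within each group, x_sorted increases with value and stays inside half the chosen width on either side of the group's position, so it could be tried as the x aesthetic of the points to get the same left-to-right ordering.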

ggplot2 bar chart with two bars for each x value of data and two y-axis

I am struggling to create a bar chart with two different y-axes and two bars for each x-value (category).
I have different types of categories of data (see below) for each I have two values that I want to plot side by side (price and number). However, the values for each category are far apart, which makes the bars of the number category become almost invisible. Thus, I want to add a second y-axis (one for the price one for the number) to allow a comparison between the two categories.
Example data:
Cat Type Value
1 A price 12745
2 A number 5
3 B price 34874368
4 B number 143
5 C price 84526
6 C number 11
I use the following R code (ggplot2) to create the plot:
library(ggplot2)
library(ggpubr)  # provides labs_pubr()
plot = ggplot(df, aes(x=Cat, fill=Type, y=Value))+
geom_bar(stat="identity", position="dodge")+
theme_bw() +
labs_pubr() +
scale_fill_grey() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
plot
I tried scale_y_continuous with sec.axis, but I did not manage to assign each y-axis to a type of data:
scale_y_continuous(
"price",
sec.axis = sec_axis(~., name = "number")
) +
I am happy for every hint :)
Is that what you mean?
library(tidyverse)  # tribble(), %>% and ggplot()
df=tribble(
~Id, ~Cat, ~Type, ~Value,
1, "A", "price", 13,
2, "A", "number", 5,
3, "B", "price", 19,
4, "B", "number", 12,
5, "C", "price", 8,
6, "C", "number", 11)
# one panel per category, with price and number side by side
df %>% ggplot(aes(x=Type, fill=Type, y=Value))+
geom_col()+
facet_grid(~Cat)
P.S.
I changed your values a bit because you could not see much when the differences were on the order of 10^7!
With the original numbers, a logarithmic scale is better suited:
df=tribble(
~Id, ~Cat, ~Type, ~Value,
1, "A", "price", 12745,
2, "A", "number", 5,
3, "B", "price", 34874368,
4, "B", "number", 143,
5, "C", "price", 84526,
6, "C", "number", 11)
df %>% ggplot(aes(x=Type, fill=Type, y=Value))+
geom_col()+
scale_y_continuous(trans='log10')+
facet_grid(~Cat)
The idea as I understand is to split the graphs by Type, and you can do this using the helpful ggplot facet_wrap() verb. Then use the scales package to fix the rounding along the y-axis.
library(scales)
library(ggplot2)
library(dplyr)
tbl <- tibble(Cat = c("A", "A", "B", "B", "C", "C"),
Type = c("price", "number", "price", "number", "price", "number"),
Value = c(12745, 5, 34874368, 143, 84526, 11))
tbl %>%
ggplot(aes(Cat, Value, fill = Cat)) +
geom_col(position = "dodge") +
facet_wrap(~Type, scales = "free") +
scale_y_continuous(labels = scales::number_format())

GGPLOT2 : geom_area with ordered character variable as x axis

I have a dataset like the following :
dat <- data.frame(sp = c("a", "a", "b", "b", "b", "c", "c"),
nb = c(5, 44, 32, 56, 10, 1, 43),
gp = c("ds1", "ds2", "ds1", "ds2", "ds3", "ds1", "ds3"))
where sp = species, nb = number of occurrences and gp = sampling group.
I want to make a geom_area graph where the values for each species (sp) are displayed on the y axis, with species on the x axis in descending order of their total sum.
So far I have only managed this:
ggplot(dat, aes(x=as.numeric(factor(sp)), y=nb, fill=gp, colour = gp)) +
geom_area()
Which gives this output (please don't laugh ;))
Could you help me sort the x axis in descending order of the sum of the stacked values? And fill the empty area?
E.g. I am trying to do something like this (here in ascending order, but it doesn't matter):
Try this. The gaps in your plot could be filled by filling the df with the missing combinations of gp and sp using tidyr::complete. To reorder the levels of sp I make use of forcats::fct_reorder:
library(ggplot2)
library(dplyr)
library(tidyr)
library(forcats)
dat <- data.frame(sp = c("a", "a", "b", "b", "b", "c", "c"),
nb = c(5, 44, 32, 56, 10, 1, 43),
gp = c("ds1", "ds2", "ds1", "ds2", "ds3", "ds1", "ds3"))
dat1 <- dat %>%
# Fill with missing combinations of gp and sp
tidyr::complete(gp, sp, fill = list(nb = 0)) %>%
# Reorder according to sum of nb
mutate(sp = forcats::fct_reorder(sp, nb, sum, .desc = TRUE),
sp_num = as.numeric(sp))
ggplot(dat1, aes(x=sp_num, y=nb, fill=gp, colour = gp)) +
geom_area()
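For readers without tidyr and forcats, the same two steps (filling in missing gp/sp combinations with zeros, then ordering sp by its total) can be sketched in base R. This is an illustrative equivalent under that assumption, not the answer's own method; the names full and totals are made up here.

```r
dat <- data.frame(sp = c("a", "a", "b", "b", "b", "c", "c"),
                  nb = c(5, 44, 32, 56, 10, 1, 43),
                  gp = c("ds1", "ds2", "ds1", "ds2", "ds3", "ds1", "ds3"))

# every combination of gp and sp, joined back onto the observed rows
grid <- expand.grid(gp = unique(dat$gp), sp = unique(dat$sp),
                    stringsAsFactors = FALSE)
full <- merge(grid, dat, all.x = TRUE)
# combinations that were never observed get a count of zero
full$nb[is.na(full$nb)] <- 0

# total per species, then levels in descending order of that total
totals <- tapply(full$nb, full$sp, sum)
full$sp <- factor(full$sp, levels = names(sort(totals, decreasing = TRUE)))
full$sp_num <- as.numeric(full$sp)

levels(full$sp)
# "b" "a" "c"
```

The totals here are 98 for b, 49 for a and 44 for c, so b gets x position 1, a gets 2 and c gets 3.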

Add multiple shape legends in ggplot and overlaying shapes

I am trying to create an easily understandable ggplot graph with 3 subgroups delineated by geom_point's 1) color (3 colors; for A, B, and C variables), 2) overall shape (3 colored shapes with borders; for c, d, and e criteria), and 3) a cross shape overlaid over the points (2 groups; some with shape overlaid and some without based on df$Subscale = 1 vs. 0).
I am having difficulty figuring out how to incorporate the aesthetics and a separate legend for no. 3, since that would represent a second shape-based aesthetic.
Here's what I have so far:
It looks okay with these subgroups (other than the color legend not working yet). Next I want to overlay a shape for all the points with variable names (y-axis) that have underscores in their names using df$Subscale (e.g., A_1, C2_1, B_2 rather than A, C, B2). Since I am already using the shape aesthetic, I don't know how to re-apply a shape conditionally.
Example of the shape I'd like applied:
Here is the code for the sample dataset df:
#The way my data is currently structured
a<- c("A", "A_1", "A_2", "A_3", "A2", "A2_1", "A2_2",
"B", "B_1", "B_2", "B2", "B2_1",
"C", "C_1", "C_2", "C2", "C2_1")
b<- c(rep(1, times=4),
rep(2, times = 3),
rep(1, times = 3),
rep(2, times = 2),
rep(1, times = 3),
rep(2, times = 2))
col<- c(rep(1, times=7),
rep(2, times = 5),
rep(3, times = 5))
u <- c(0, rep(1, times=3),
0, rep(1, times = 2),
0, rep(1, times = 2),
0, rep(1, times = 1),
0, rep(1, times = 2),
0, rep(1, times=1))
set.seed(12)
c <- round(rnorm(17, .5, 1),2)
d <- round(rnorm(17, .0, .5),2)
e <- round(rnorm(17, -.2, .5),2)
dat<-data.frame(cbind(a, b, col, u, c, d, e))
#Restructuring for graphing
library(reshape)
df <- melt(dat, id.vars = c("a", "b", "col", "u"))
colnames(df) <- c("Name", "Type", "Color", "Subscale", "Criteria", "Value")
df$Value<- as.numeric(as.character(df$Value))
df$Name_order <- factor(df$Name, levels=df$Name[order(df$Value[df$Criteria == "c"])], ordered=TRUE)
Here is the code to create the graphs:
library(ggplot2)
palette <- c("#56B4E9", "#D55E00","#009E73")
graph_test <- ggplot(df, aes(x=df$Value, y = df$Name_order,
colour = df$Color, shape = df$Criteria)) +
geom_point(size = 6, aes(#colour=factor(df$Color),
fill=factor(df$Color),
shape=factor(df$Criteria))) +
scale_shape_manual(values=c(21, 24, 22),
labels=c("Criteria1", "Criteria2", "Criteria3")) +
scale_fill_manual(values=palette,
labels = "c", "d", "e") +
scale_color_manual(values=c(rep("black", times = 3))) +
labs(fill = "ABC", shape = "Criteria")
#First graph
graph_test
#Second graph
graph_test + geom_point(size = 5, shape=3)
I considered 6 categories for the shape aes, but I would still need to overlay the cross shape conditionally, and I would prefer 3 legends (3 colors, 3 shapes, 2 with overlay vs. not).
df$CriteriabySub <- paste0(df$Criteria, df$Subscale)
Any ideas/tips for correctly applying the cross shape to some of the points and creating a third legend for it?

ggplot2 draw graph with respect to a specific order

If duplicated, please point me to the original question.
I would like to draw a figure in R using ggplot2, and the following code shows what I would like to achieve.
require(ggplot2)
require(data.table)
set.seed(1)
dat <- data.table(time = rep(c(1:40), times = 5),
value = runif(200),
team = rep(c("A","B","C","D","E"), each = 40))
dat[, value := value / sum(value), by = .(time)]
ggplot(dat, aes(x = time, y = value, group=team, fill=team)) +
geom_area(position = "fill") +
scale_fill_manual(values = c("red","blue","green","pink","yellow"),
breaks = c("D", "B", "E", "A", "C"),
labels = c("D", "B", "E", "A", "C"))
ggplot2 output:
As you can see, the order of the figure does not match the order of the legend. It is the order of A, B, C, D, E, but not D, B, E, A, C. I would like to draw the figure with pink at the top, then blue, then yellow, then red, then green (DBEAC). How can I achieve this?
Thanks in advance!
This is pretty much a duplicate of ggplot2: Changing the order of stacks on a bar graph.
geom_area appears to stack the areas in the order in which they first appear in the data.
Sorting dat into the appropriate order appears to solve your problem:
# create order you want
my_order <- c("D", "B", "E", "A", "C")
# reversed to get correct ordering in data table
dat[, order := match(team, rev(my_order))]
# sort the data.table
setorder(dat, time, order)
# make the plot
ggplot(dat, aes(x = time, y = value, fill=team))+
geom_area(position = "fill") +
scale_fill_manual(values = c("red","blue","green","pink","yellow"),
breaks = my_order ,
labels = my_order )
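In more recent ggplot2 releases the stacking order is, as far as I can tell, driven by the factor levels of the fill variable rather than by row order, so sorting the rows may have no visible effect there. A minimal sketch under that assumption sets the levels explicitly; it uses a plain data.frame instead of data.table to stay self-contained, and names the colour vector so that each colour stays tied to its team (pink for D, blue for B, yellow for E, red for A, green for C, as described in the question).

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(time = rep(1:40, times = 5),
                  value = runif(200),
                  team = rep(c("A", "B", "C", "D", "E"), each = 40))

my_order <- c("D", "B", "E", "A", "C")
# the factor levels, not the row order, carry the desired stacking order
dat$team <- factor(dat$team, levels = my_order)

p <- ggplot(dat, aes(x = time, y = value, fill = team)) +
  geom_area(position = "fill") +
  # named values pin each colour to a team regardless of level order
  scale_fill_manual(values = c(A = "red", B = "blue", C = "green",
                               D = "pink", E = "yellow"))
p
```

With this setup the first level (D) should end up at the top of the stack, and the legend follows the same order without separate breaks or labels.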
