Lollipop chart with repeated elements in different groups - r

I am trying to plot a lollipop chart with 5 groups and repeated elements in those groups. If all elements have different names it works as expected:
Intended behavior:
The problem is that I want to plot the same 5 algorithms in each group, and when I actually name them Algorithm 1 to Algorithm 5 the plot breaks:
Unexpected behavior:
This is my snippet that produces the correct behavior of the lollipop chart (except for the wrong labels):
library(ggpubr)
# Create dataset
data <- data.frame(
algorithm=paste( "Algorithm ", seq(1,25), sep=""),
category=as.factor(c( rep('A', 5), rep('B', 5), rep('C', 5), rep('D', 5), rep('E', 5))),
metric=c(rep(rev(96:100), 5))
)
ggdotchart(data, x = "algorithm", y = "metric",
color = "category", # Color by groups
palette = c("#264653", "#2a9d8f", "#e9c46a", "#f4a261", "#e76f51"), # Custom color palette
sorting = "descending", # Sort value in descending order
add = "segments", # Add segments from y = 0 to dots
rotate = TRUE, # Rotate vertically
group = "category", # Order by groups
dot.size = 7, # Large dot size
label = round(data$metric), # Add metric values as dot labels
font.label = list(color = "white", size = 8,
vjust = 0.5), # Adjust label parameters
ggtheme = theme_pubr() # ggplot2 theme
) +
labs(y = "Metric (%)", color="")
This is the new data snippet that causes this behavior:
# Create dataset
data <- data.frame(
algorithm=rep(paste( "Algorithm ", seq(1,5), sep=""), 5),
category=as.factor(c( rep('A', 5), rep('B', 5), rep('C', 5), rep('D', 5), rep('E', 5))),
metric=c(rep(rev(96:100), 5))
)
How can I possibly solve this issue?

Once produced, the chart can be edited like any other ggplot object. We can use scale_x_discrete() to change the axis labels after the fact, which avoids interfering with how ggdotchart() builds and orders the plot under the hood. Storing your first plot (the one with 25 unique names) as p, we can do:
alg_labels <- rep(paste( "Algorithm ", seq(1,5), sep=""), 5)
p +
scale_x_discrete(
labels = alg_labels
)
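A positional label vector like the one above depends on the order in which ggdotchart() arranges the rows. A minimal sketch of a variant that does not, assuming it is acceptable to add a helper column: id and strip_group are hypothetical names, not part of ggpubr. Each x value stays unique, and a label function removes the group prefix for display.

```r
# Same data as in the question, with repeated algorithm names
data <- data.frame(
  algorithm = rep(paste0("Algorithm ", 1:5), 5),
  category  = factor(rep(c("A", "B", "C", "D", "E"), each = 5)),
  metric    = rep(rev(96:100), 5)
)

# Hypothetical helper column: unique per (category, algorithm) pair
data$id <- paste(data$category, data$algorithm, sep = "_")

# Label function: drops everything up to and including the first underscore
strip_group <- function(x) sub("^[^_]*_", "", x)

# Only draw the chart if ggpubr is installed
if (requireNamespace("ggpubr", quietly = TRUE)) {
  library(ggpubr)
  p <- ggdotchart(data, x = "id", y = "metric",
                  color = "category", sorting = "descending",
                  add = "segments", rotate = TRUE,
                  group = "category", dot.size = 7) +
    scale_x_discrete(labels = strip_group)
  print(p)
}
```

Because the labels are derived from the values themselves, they should stay correct if the sorting changes.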


How do you change the order of explanatory and response variables in a mosaic plot? [duplicate]

My current plot:
My desired plot (never mind the variables):
Specifically: explanatory variables on the bottom with an x-axis, response variables on the right, relative frequency and the y-axis on the left. I'll attach my R code below.
mosaictable <- matrix (c (3, 9, 22, 21), byrow = TRUE, ncol = 2)
rownames (mosaictable) = c ("White", "Blue")
colnames (mosaictable) = c ("Captured", "Not Captured")
mosaicplot ((mosaictable), sub = "Pigeon Color", ylab = "Relative frequency",
col = c ("firebrick", "goldenrod1"), font = 2, main = "Mosaic Plot of Pigeon Color and Their Capture Rate"
)
axis (1)
axis (4)
This particular flavor of mosaic display, where you have a "dependent" variable on the y-axis and want to add corresponding annotation, is sometimes also called a "spine plot". R implements this in the spineplot() function. Also, plot(y ~ x) internally calls spineplot() when both y and x are categorical.
In your case, spineplot() does almost everything you want automatically provided that you supply it with a nicely formatted "table" object:
tab <- as.table(matrix(c(3, 22, 9, 21), ncol = 2))
dimnames(tab) <- list(
"Pigeon Color" = c("White", "Blue"),
"Relative Frequency" = c("Captured", "Not Captured")
)
tab
## Relative Frequency
## Pigeon Color Captured Not Captured
## White 3 9
## Blue 22 21
And then you get:
spineplot(tab)
Personally, I would leave it at that. But if it is really important to switch the axis labels from left to right and vice versa, you can do so by first suppressing the default axes with axes = FALSE and then adding them manually afterwards. The coordinates for that need to be obtained from the marginal distribution of the first variable and the conditional distribution of the second variable given the first, respectively:
x <- prop.table(margin.table(tab, 1))
y <- prop.table(tab, 1)[2, ]
spineplot(tab, col = c("firebrick", "goldenrod1"), axes = FALSE)
axis(1, at = c(0, x[1]) + x/2, labels = rownames(tab), tick = FALSE)
axis(2)
axis(4, at = c(0, y[1]) + y/2, labels = colnames(tab), tick = FALSE)
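To make the label positions less opaque, here is a small worked sketch of the same midpoint arithmetic. The helper name mid is hypothetical. It computes the centre of each segment as the cumulative proportion minus half the segment width, which is what c(0, x[1]) + x/2 evaluates to for two categories.

```r
# Rebuild the table from the answer
tab <- as.table(matrix(c(3, 22, 9, 21), ncol = 2))
dimnames(tab) <- list(
  "Pigeon Color" = c("White", "Blue"),
  "Relative Frequency" = c("Captured", "Not Captured")
)

x <- prop.table(margin.table(tab, 1))  # marginal: 12/55 White, 43/55 Blue
y <- prop.table(tab, 1)[2, ]           # conditional given Blue: 22/43, 21/43

# Hypothetical helper: midpoint of each segment on a 0-1 axis
mid <- function(p) {
  p <- as.numeric(p)
  cumsum(p) - p / 2
}

mid_x <- mid(x)  # label positions along the bottom axis
mid_y <- mid(y)  # label positions along the right axis
```

These positions ignore any small gap that spineplot() draws between the bars, so treat them as approximate.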

How to make geom_ribbon have gradation color in ggplot2

I would like to fill a geom_ribbon with a color gradient.
For example, I have a data.frame as below:
df <-data.frame(Day = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)), # create random data
Depth = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
group = c(rep('A', 300), rep('B', 150))) # add two groups
With this data.frame, I make a plot using geom_ribbon as below:
gg <-
ggplot(data=df,aes(x=Day))+
geom_ribbon(aes(ymin=Depth,ymax=max(Depth)),alpha = 0.25)+
ylim(max(df$Depth),0)+
facet_wrap(~group,scales = "free_x",ncol=2)+
labs(x="Days(d)",y="Depth (m)")
gg
which produces the following plot:
Here, I would like the ribbon to have a color gradient that follows the value on the y-axis (i.e. df$Depth, in this case). However, I do not know how to do it.
I can do it with geom_point as below:
gg <- gg +
geom_point(aes(y=Depth,color=Depth),alpha = 1, shape = 20, size=5)+
scale_color_gradient2(midpoint = 5,
low = "red", mid="gray37", high = "black",
space ="Lab")
gg
But I want the color gradient to fill the ribbon area, not to sit on each point.
Do you have any suggestions for doing this with geom_ribbon?
I do not know if this is perfect, but I found a solution for what I want, as follows.
First, I prepare the data.frame:
df <-data.frame(Day = c(rnorm(300, 7, 2), rnorm(150, 5, 1)), # create random data
Depth = c(rnorm(300, 10, 2.5), rnorm(150, 7, 2)),
group = c(rep('A', 300), rep('B', 150))) # add two groups
Second, prepare the gradient background, following the approach in the linked question "log background gradient ggplot":
xlength <- ceiling(max(df$Day))
yseq <- seq(0,max(df$Depth), length=100)
bg <- expand.grid(x=0:xlength, y=yseq) # dataframe for all combinations
Third, plot using ggplot2:
gg <- ggplot() +
geom_tile(data=bg,
aes(x=x, y=y, fill=y),
alpha = 0.75)+ # plot the gradation
scale_fill_gradient2(low='red', mid="gray37", high = "black",
space ="Lab",midpoint = mean(df$Depth)/2)+ #set the color
geom_ribbon(data=df,
aes(x=Day,ymin=0,ymax=Depth),
fill = "gray92")+ #default ggplot2 background color
ylim(max(df$Depth),0)+
scale_x_continuous()+
facet_wrap(~group,scales = "free_x",ncol=2)+
labs(x="Days(d)",y="Depth (m)")+
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
gg
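The background grid in the second step starts at x = 0, so any Day value below zero would fall outside the gradient. A minimal sketch of a variant, assuming you want the grid to span the observed range of Day; set.seed() and the name x_rng are additions for reproducibility and illustration, not part of the original solution.

```r
set.seed(1)  # assumption: fixed seed so the random data is reproducible
df <- data.frame(
  Day   = c(rnorm(300, 7, 2), rnorm(150, 5, 1)),
  Depth = c(rnorm(300, 10, 2.5), rnorm(150, 7, 2)),
  group = c(rep("A", 300), rep("B", 150))
)

# Span the whole observed x range instead of starting at 0
x_rng <- range(df$Day)
bg <- expand.grid(
  x = seq(floor(x_rng[1]), ceiling(x_rng[2]), by = 1),
  y = seq(0, max(df$Depth), length.out = 100)
)
```

The resulting bg can be passed to geom_tile() exactly as in the third step.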

How to make three different bar charts of similar type clustered in the same plot?

I need to plot my Erosion values for different levels of tillage (columns) at three levels of soil depth (rows A1, A2, A3). I want all of this shown as a bar chart in a single plot.
Here is a reproducible example:
a<- matrix(c(1:36), byrow = T, ncol = 4)
rownames(a)<-(c("A1","B1","C1","A2","B2","C2","A3","B3","C3"))
colnames(a)<-(c("Int_till", "Redu_till", "mulch_till", "no_till"))
barplot(a[1,], xlab = "A1", ylab = "Erosion")
barplot(a[4,], xlab = "A2", ylab = "Erosion")
barplot(a[7,], xlab = "A3", ylab = "Erosion")
##I want these three barchart side by side in a single plot
## for comparison
### and need similar plots for all the "Bs" and "Cs"
### Lastly, i want these three plots in the same page.
I have seen people do similar things using "fill" in ggplot (for lines), specifying a factor that nicely separates the chart into categories. I tried doing the same but always run into an error, maybe because my data is continuous.
If anybody could help me with these two things, it would be a great help. I will appreciate it.
Thank you!
We can use ggplot
library(reshape2)
library(ggplot2)
library(dplyr)
melt(a) %>%
ggplot(., aes(x = Var2, y = value, fill = Var1)) +
geom_bar(stat = 'identity',
position = position_dodge2(preserve = "single")) +
facet_wrap(~ Var1)
Set mfcol to specify a 3x3 grid and then for each row generate its bar plot. Also, you could consider adding the barplot argument ylim = c(0, max(a)) so that all graphs use the same Y axis. title and mtext can be used to set the overall title and various margin text as we do below. See ?par, ?title and ?mtext for more information.
opar <- par(mfcol = c(3, 3), oma = c(0, 3, 0, 0))
for(r in rownames(a)) barplot(a[r, ], xlab = r, ylab = "Erosion")
par(opar)
title("My Plots", outer = TRUE, line = -1)
mtext(LETTERS[1:3], side = 2, outer = TRUE, line = -1,
at = c(0.85, 0.5, 0.17), las = 2)
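The shared y-axis suggested above is not shown in the code, so here is a minimal sketch of it using the matrix from the question. The name y_lim is hypothetical, and the outer title and margin text are left out to keep the example short.

```r
a <- matrix(1:36, byrow = TRUE, ncol = 4)
rownames(a) <- c("A1", "B1", "C1", "A2", "B2", "C2", "A3", "B3", "C3")
colnames(a) <- c("Int_till", "Redu_till", "mulch_till", "no_till")

# One common y range so bar heights are comparable across panels
y_lim <- c(0, max(a))

opar <- par(mfcol = c(3, 3))  # 3x3 grid, filled column by column
for (r in rownames(a)) {
  barplot(a[r, ], xlab = r, ylab = "Erosion", ylim = y_lim)
}
par(opar)                     # restore the previous settings
```

Because mfcol fills columns first, each row of the grid holds one letter (A, B, C) and each column one depth level (1, 2, 3).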

Multiple intraday time series on the same chart [duplicate]

I am struggling with something that, I believe, should be pretty straightforward in R.
Please consider the following example:
library(dplyr)
library(tidyverse)
time = c('2013-01-03 22:04:21.549', '2013-01-03 22:04:22.349', '2013-01-03 22:04:23.559', '2013-01-03 22:04:25.559' )
value1 = c(1,2,3,4)
value2 = c(400,500,444,210)
data <- tibble(time, value1, value2) # tibble() replaces the deprecated data_frame()
data <-data %>% mutate(time = as.POSIXct(time))
> data
# A tibble: 4 × 3
time value1 value2
<dttm> <dbl> <dbl>
1 2013-01-03 22:04:21 1 400
2 2013-01-03 22:04:22 2 500
3 2013-01-03 22:04:23 3 444
4 2013-01-03 22:04:25 4 210
My problem is simple:
I want to plot value1 AND value2 on the SAME chart with TWO different Y axis.
Indeed, as you can see in the example, the units are largely different between the two variables so using just one axis would compress one of the time series.
Surprisingly, getting a nice looking chart for this problem has proven to be very difficult. I am mad (of course, not really mad. Just puzzled ;)).
In Python Pandas, one could simply use:
data.set_index('time', inplace = True)
data[['value1', 'value2']].plot(secondary_y = 'value2')
In Stata, one could simply say:
twoway (line value1 time, sort ) (line value2 time, sort)
In R, I don't know how to do it. Am I missing something here? Base R, ggplot2, some weird package, any working solution with decent customization options would be fine here.
A base R hack that may answer your need. I'll go out of my way to make it clear which color (blue vs red) belongs to which series and axis. It's ugly, but it demonstrates the requisite points. Using your data:
# making sure the left and right sides have the same space
par(mar = c(4,4,1,4) + 0.1)
# first plot
plot(value1 ~ time, data = data, pch = 16, col = "blue", las = 1,
col.axis = "blue", col.lab = "blue")
grid(lty = 1, col = "blue")
# "reset" the whole plot for an overlay
par(fig = c(0,1,0,1), new = TRUE)
# second plot, sans axes and other annotation
plot(value2 ~ time, data = data, pch = 16, col = "red",
axes = FALSE, ann = FALSE)
grid(lty = 3, col = "red")
# add the right-axis and label
axis(side = 4, las = 1, col.axis = "red")
mtext("value2", side = 4, line = 3, col = "red")
I added the grids to highlight an aesthetic issue: they don't align "neatly". If you're okay with that, feel free to stop now.
Here's one method (which has not been tested with significantly-different data ranges). (There are most certainly other methods depending on your data and your preferences.)
# one way that may "normalize" the y-axes for you, so that the grid should be identical
y1 <- pretty(data$value1)
y1n <- length(y1)
y2 <- pretty(data$value2)
y2n <- length(y2)
if (y1n < y2n) {
y1 <- c(y1, y1[y1n] + diff(y1)[1])
} else if (y1n > y2n) {
y2 <- c(y2, y2[y2n] + diff(y2)[1])
}
And the ensuing plot, adding ylim=range(...):
# making sure the left and right sides have the same space
par(mar = c(4,4,1,4) + 0.1)
# first plot
plot(value1 ~ time, data = data, pch = 16, col = "blue", las = 1, ylim = range(y1),
col.axis = "blue", col.lab = "blue")
grid(lty = 1, col = "blue")
# "reset" the whole plot for an overlay
par(fig = c(0,1,0,1), new = TRUE)
# second plot, sans axes and other annotation
plot(value2 ~ time, data = data, pch = 16, col = "red", ylim = range(y2),
axes = FALSE, ann = FALSE)
grid(lty = 3, col = "red")
# add the right-axis and label
axis(side = 4, las = 1, col.axis = "red")
mtext("value2", side = 4, line = 3, col = "red")
(Though the red-blue alternating grid lines are atrocious, they demonstrate that the grids do in fact align well.)
NB: the use of par(fig = c(0,1,0,1), new = TRUE) is a bit fragile. Doing things like changing margins or other significant changes between plots can easily break the overlay, and you won't really know unless you do some manual work to see how the additive process actually pans out. In this "check" process, you will likely want to remove axes=F, ann=F from the second plot in order to confirm that at least the boxes and x-axis are aligning as intended.
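The padding step above extends the shorter set of breaks by one value, which is enough when the two pretty() results differ in length by one. A hedged sketch of a generalization, assuming evenly spaced breaks; pad_breaks is a hypothetical helper that keeps appending steps until both vectors have the same length.

```r
# Hypothetical helper: extend evenly spaced breaks upward to length n
pad_breaks <- function(b, n) {
  step <- diff(b)[1]
  while (length(b) < n) {
    b <- c(b, b[length(b)] + step)
  }
  b
}

value1 <- c(1, 2, 3, 4)
value2 <- c(400, 500, 444, 210)

y1 <- pretty(value1)
y2 <- pretty(value2)

# Pad whichever vector is shorter; the longer one is returned unchanged
n  <- max(length(y1), length(y2))
y1 <- pad_breaks(y1, n)
y2 <- pad_breaks(y2, n)
```

range(y1) and range(y2) can then be used as the ylim values in the two plot() calls, as before.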
Version 2.2.0 of ggplot2 added support for defining a secondary axis. The second time series can now be rescaled to the range of the first and displayed in the same chart, with the secondary axis undoing the rescaling:
data %>%
mutate(value2 = value2 / 100) %>% # scale value2
gather(variable, value, -time) %>% # reshape wide to long
ggplot(aes(time, value, colour = variable)) +
geom_point() + geom_line() +
scale_y_continuous(name = "value1", sec.axis = sec_axis(~ . * 100, name = "value2"))
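The factor of 100 above is hard-coded. A minimal sketch, assuming you would rather derive the factor from the data: k is a hypothetical name, and the ratio of the two maxima is just one reasonable choice. This version also avoids reshaping by drawing one layer per series.

```r
time <- as.POSIXct(c("2013-01-03 22:04:21.549", "2013-01-03 22:04:22.349",
                     "2013-01-03 22:04:23.559", "2013-01-03 22:04:25.559"),
                   tz = "UTC")
data <- data.frame(time,
                   value1 = c(1, 2, 3, 4),
                   value2 = c(400, 500, 444, 210))

# Scale factor mapping value2 onto the range of value1 (assumed choice)
k <- max(data$value2) / max(data$value1)

# Only draw the chart if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data, aes(x = time)) +
    geom_line(aes(y = value1, colour = "value1")) +
    geom_line(aes(y = value2 / k, colour = "value2")) +  # shrink to left axis
    scale_y_continuous(name = "value1",
                       sec.axis = sec_axis(~ . * k, name = "value2"))  # undo
  print(p)
}
```

The secondary axis only relabels the left axis through the inverse transformation, so the division in the layer and the multiplication in sec_axis() must use the same factor.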
