I am trying to place two plots side-by-side in R and have the below example.
library(vioplot)
x <- rnorm(100)
y <- rpois(100,1)
plot(x, y, xlim=c(-5,5), ylim=c(-5,5),type='n')
vioplot(x, col="tomato", horizontal=TRUE, at=-4, add=TRUE,lty=2, rectCol="gray")
vioplot(y, col="cyan", horizontal=TRUE, at=-3, add=TRUE,lty=2)
vioplot(y, col="cyan", horizontal=TRUE, at=-2, add=TRUE,lty=2)
With this data, I'm able to make a vioplot of my x and y variables. Now, for example, I want to develop bar plots of separate count data that relates to each vioplot on the left-hand side.
counts <- c(10, 20, 30)
barplot(counts, main="Car Distribution", horiz=TRUE)
I've used the mtcars example but it could be any count data. I'm wondering if it is possible to generate these plots side-by-side so that the count plot lines up with the vioplot correctly. I do not need any y-axis labels for the count plot.
Based on your specifications, I recommend ggplot2:
library(tidyverse)
p1 <- lst(x, y, y1=y) %>%
bind_cols() %>%
pivot_longer(1:3) %>%
ggplot(aes(name, value)) +
geom_violin(trim = FALSE)+
geom_boxplot(width=0.15) +
coord_flip()
p2 <- mtcars %>%
count(gear) %>%
ggplot(aes(gear, n)) +
geom_col()+
coord_flip()
cowplot::plot_grid(p1, p2)
In base R you can do the following (please note that I used boxplot, but it should work with vioplot as well):
par(mfrow=c(1,2))
counts <- table(mtcars$gear)
boxplot(cbind(x,y,y), col="tomato", horizontal=TRUE, lty=2)  # rectCol is a vioplot argument, not a boxplot one
barplot(counts, main="Car Distribution", horiz=TRUE,
names.arg=c("3 Gears", "4 Gears", "5 Gears"))
Another option, if you want to use ggplot, is the ggarrange() function from ggpubr.
library(dplyr)
library(ggplot2)
library(ggpubr)
# Create a sample dataset
dt <- tibble(group = rep(c("x", "y"), each = 100)) %>%
mutate(value = if_else(group == "x", rnorm(200),
as.double(rpois(200, 1))))
# Combined violin/Box plot
violins <- dt %>%
ggplot(aes(value, group)) +
geom_violin(width = 0.5) +
geom_boxplot(width = 0.1)
# Bar chart
bars <- dt %>%
ggplot(aes(group)) +
geom_bar(width = 0.1) +
coord_flip()
# Combine
ggpubr::ggarrange(violins, bars + rremove("ylab") + rremove("y.text"), ncol = 2)
Output:
You can use this code:
library(vioplot)
x <- rnorm(100)
y <- rpois(100,1)
par(mfrow=c(1,2))
plot(x, y, xlim=c(-5,5), ylim=c(-5,-1),type='n')
vioplot(x, col="tomato", horizontal=TRUE, at=-4, add=TRUE,lty=2, rectCol="gray")
vioplot(y, col="cyan", horizontal=TRUE, at=-3, add=TRUE,lty=2)
vioplot(y, col="cyan", horizontal=TRUE, at=-2, add=TRUE,lty=2)
counts <- table(mtcars$gear)
barplot(counts, main="Car Distribution", horiz=TRUE,
names.arg=c("3 Gears", "4 Gears", "5 Gears"))
Output:
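One caveat with par(mfrow = c(1, 2)): barplot() chooses its own bar coordinates, so the bars do not automatically sit at the heights used for the violins. Below is a minimal sketch of one way around that, assuming the `at` values and `counts` from the question. The helper name `bar_rects` and the half-height of 0.3 are illustrative choices of mine, not part of any package. The bars are drawn with rect() in a second panel that reuses the ylim of the violin panel.

```r
pdf(NULL)  # null device so the sketch runs non-interactively; remove it to see the plot

# Positions used for the violins and the counts from the question.
at     <- c(-4, -3, -2)
counts <- c(10, 20, 30)

# barplot() places horizontal bars at its own midpoints, not at `at`.
default_mids <- barplot(counts, horiz = TRUE, plot = FALSE)

# Hypothetical helper: rectangle corners for bars centred on `at`.
bar_rects <- function(counts, at, half_height = 0.3) {
  data.frame(xleft = 0, xright = counts,
             ybottom = at - half_height, ytop = at + half_height)
}
r <- bar_rects(counts, at)

op <- par(mfrow = c(1, 2))
# Left panel: empty frame where the vioplot(..., add = TRUE) calls would go.
plot(NULL, xlim = c(-5, 5), ylim = c(-5, -1), xlab = "", ylab = "")
# Right panel: same ylim, no y-axis labels, bars drawn as rectangles.
plot(NULL, xlim = c(0, max(counts)), ylim = c(-5, -1),
     yaxt = "n", xlab = "Counts", ylab = "")
rect(r$xleft, r$ybottom, r$xright, r$ytop, col = "grey")
par(op)
dev.off()
```

Because both panels use the same ylim and the same margins, a bar centred on -4 sits at the same height as the violin drawn with at = -4.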
Thank you for your interesting question, which motivated me to explore base R graphics features. I tried to find a case where putting the violin plot and the bar plot side by side shows a meaningful relationship. Here I have a subset of the iris data with a different count for each species, and I want to show three statistics:
the count of each sampled species, shown as bars;
the spread of sepal lengths in each sampled species, shown as violin plots; and
the median petal length of each sampled species, shown by the vertical position of the violin plots.
I followed #GW5's idea here to draw bars whose positions on the axes can be controlled, and #IRTFM's idea here to adjust the origins of the axes.
Here is the full code:
library(vioplot)
some_iris <- iris[c(1:90, 110:139), ]
ir_counts <- some_iris |> with(Species) |> table()
ir_counts
# setosa versicolor virginica
# 50 40 30
ir_names <- names(ir_counts)
ir_colors <- c("cyan", "green", "pink")
x_vio1 <- some_iris |> subset(Species == ir_names[1]) |> with(Sepal.Length)
x_vio2 <- some_iris |> subset(Species == ir_names[2]) |> with(Sepal.Length)
x_vio3 <- some_iris |> subset(Species == ir_names[3]) |> with(Sepal.Length)
y_vio1 <- some_iris |> subset(Species == ir_names[1]) |> with(Petal.Length) |> median()
y_vio2 <- some_iris |> subset(Species == ir_names[2]) |> with(Petal.Length) |> median()
y_vio3 <- some_iris |> subset(Species == ir_names[3]) |> with(Petal.Length) |> median()
# `xpd = FALSE` to keep the grid inside the plotting boxes.
par(mfrow = c(1, 2), xpd = FALSE)
# The violin plots, put on the left side.
plot(NULL,
xlim = c(0, 10), ylim = c(0, 10), type = "n", las = 1, xaxs = "i", yaxs = "i",
xlab = "Sepal Length (cm)", ylab = "Median Petal Length (cm)")
vioplot(x_vio1, col = ir_colors[1], horizontal = TRUE, at = y_vio1, add = TRUE, lty = 2)
vioplot(x_vio2, col = ir_colors[2], horizontal = TRUE, at = y_vio2, add = TRUE, lty = 2)
vioplot(x_vio3, col = ir_colors[3], horizontal = TRUE, at = y_vio3, add = TRUE, lty = 2)
grid()
# Text labels giving the species names
text(labels = ir_names, y = c(y_vio1, y_vio2, y_vio3),
x = c(min(x_vio1), min(x_vio2), min(x_vio3)) - 1)
# The barplots, put on the right side.
plot(NULL,
xlim = c(0, 60), ylim = c(0, 10), yaxt = "n", type = "n",
las = 1, xlab = "Counts", ylab = "", xaxs = "i", yaxs = "i"
)
rect(xleft = 0, xright = ir_counts[1],
ybottom = y_vio1 - 0.3, ytop = y_vio1 + 0.3, col = ir_colors[1])
rect(xleft = 0, xright = ir_counts[2],
ybottom = y_vio2 - 0.3, ytop = y_vio2 + 0.3, col = ir_colors[2])
rect(xleft = 0, xright = ir_counts[3],
ybottom = y_vio3 - 0.3, ytop = y_vio3 + 0.3, col = ir_colors[3])
grid()
Here is the result:
In case you want to put labels on the barplots (on the right side), you can use mtext as follows:
# ... (The same code above)
mtext(text = ir_names, side = 2, at = c(y_vio1, y_vio2, y_vio3),
line = 0.2, las = 1 )
The resulting labels:
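As a side note, the per-species values do not have to be computed one by one. Here is a compact sketch of the right-hand panel, assuming the same `some_iris` subset and the same variables as in the code above; `y_vio` is a name I introduce here for the vector of medians. It relies on tapply() for the grouped medians and on rect() being vectorised over its arguments.

```r
pdf(NULL)  # null device so the sketch runs non-interactively; remove it to see the plot

some_iris <- iris[c(1:90, 110:139), ]
ir_counts <- table(some_iris$Species)
ir_colors <- c("cyan", "green", "pink")

# One median per species, in the same order as the counts.
y_vio <- tapply(some_iris$Petal.Length, some_iris$Species, median)

plot(NULL, xlim = c(0, 60), ylim = c(0, 10), yaxt = "n", las = 1,
     xlab = "Counts", ylab = "", xaxs = "i", yaxs = "i")
# rect() recycles over vectors, so all three bars are drawn in one call.
rect(xleft = 0, xright = as.vector(ir_counts),
     ybottom = y_vio - 0.3, ytop = y_vio + 0.3, col = ir_colors)
mtext(text = names(ir_counts), side = 2, at = y_vio, line = 0.2, las = 1)
grid()
dev.off()
```

The same `y_vio` vector could be used for the `at` positions of the violins, which keeps both panels tied to a single source of positions.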
I am comparing two treatments, A and B. The objective is to show that A is not inferior to B. The non-inferiority margin is delta = -2.
After comparing Treatment A - Treatment B I have these results
Mean difference and 95% CI = -0.7 [-2.1, 0.8]
I would like to plot this either with a package or manually. I have no idea how to do it.
Welch Two Sample t-test
data: mydata$outcome[mydata$traitement == "Bras S"] and mydata$outcome[mydata$traitement == "B"]
t = 0.88938, df = 258.81, p-value = 0.3746
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.133224 0.805804
sample estimates:
mean of x mean of y
8.390977 9.054688
I want to create this kind of plot:
You could extract the relevant values from the t.test results and then plot in base R, using segments and points to draw the data and abline to add the relevant vertical lines. Since no reproducible data were provided, I made some up, but the process is generally the same.
#sample data
set.seed(123)
tres <- t.test(runif(10), runif(10))
# get values to plot from t test results
ci <- tres$conf.int
ests <- tres$estimate[1] - tres$estimate[2]
# plot
plot(x = ci, ylim = c(0,2), xlim = c(-4, 4), type = "n", # blank plot
bty = "n", xlab = "Treatment A - Treatment B", ylab = "",
axes = FALSE)
points(x = ests, y = 1, pch = 20) # dot for point estimate
segments(x0 = ci[1], x1 = ci[2], y0 = 1) #CI line
abline(v = 0, lty = 2) # vertical line, dashed
abline(v = 2, lty = 1, col = "darkblue") # vertical line, solid, blue
axis(1, col = "darkblue") # add in x axis, blue
EDIT:
If you wanted to more accurately recreate your figure with the x axis in descending order and using your statement "Mean difference and 95% CI = -0.7 [-2.1, 0.8]", you can do the following manipulations to the above approach:
diff <- -0.7
ci <- c(-2.1, 0.8)
# plot
plot(1, xlim = c(-4, 4), type = "n",
bty = "n", xlab = "Treatment A - Treatment B", ylab = "",
axes = FALSE)
points(x = -diff, y = 1, pch = 20)
segments(x0 = -ci[2], x1 = -ci[1], y0 = 1)
abline(v = 0, lty = 2)
abline(v = 2, lty = 1, col = "darkblue")
axis(1, at = seq(-4,4,1), labels = seq(4, -4, -1), col = "darkblue")
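The steps above can also be wrapped in a small function that draws the interval and reports how it compares with the margin. This is only a sketch: the function name `plot_noninferiority` is hypothetical, the axis is left in its natural (ascending) order, the margin line is drawn at the margin value itself, and I assume that higher outcomes are better, so that non-inferiority is read as "the lower confidence limit lies above the margin".

```r
pdf(NULL)  # null device so the sketch runs non-interactively; remove it to see the plot

# Hypothetical helper: draws estimate, CI and margin, and invisibly returns
# TRUE when the lower CI limit is above the margin (assumed decision rule).
plot_noninferiority <- function(diff, ci, margin, xlim = c(-4, 4)) {
  plot(NULL, xlim = xlim, ylim = c(0, 2), bty = "n", axes = FALSE,
       xlab = "Treatment A - Treatment B", ylab = "")
  points(x = diff, y = 1, pch = 20)               # point estimate
  segments(x0 = ci[1], x1 = ci[2], y0 = 1)        # confidence interval
  abline(v = 0, lty = 2)                          # no difference
  abline(v = margin, lty = 1, col = "darkblue")   # non-inferiority margin
  axis(1, col = "darkblue")
  invisible(ci[1] > margin)
}

# Values quoted in the question.
ok <- plot_noninferiority(diff = -0.7, ci = c(-2.1, 0.8), margin = -2)
ok
dev.off()
```

With the numbers from the question the lower limit (-2.1) falls just below the margin (-2), so under the assumed rule the function returns FALSE.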
I want to draw a CDF in R, but I am having some problems. I want it to look like this:
But I get lines between the open and closed points by using the command plot(x,y,type="s")
So how do I get rid of those lines?
This isn't a general purpose example, but it will show you how to build the plot you desire in a couple of steps.
First, let's create some data (notice the zero at the beginning):
x <- 0:6
fx <- c(0, 0.19, 0.21, 0.4, 0.12, 0.05, 0.03)
Fx <- cumsum(fx)
n <- length(x)
Then let's make an empty plot
plot(x = NA, y = NA, pch = NA,
xlim = c(0, max(x)),
ylim = c(0, 1),
xlab = "X label",
ylab = "Y label",
main = "Title")
Add closed circles, open circles, and finally the horizontal lines
points(x = x[-n], y = Fx[-1], pch=19)
points(x = x[-1], y = Fx[-1], pch=1)
for(i in 1:(n-1)) points(x=x[i+0:1], y=Fx[c(i,i)+1], type="l")
Voilà!
If you insist on not seeing the line "inside" of the white points, do this instead:
points(x = x[-n], y = Fx[-1], pch=19)
for(i in 1:(n-1)) points(x=x[i+0:1], y=Fx[c(i,i)+1], type="l")
points(x = x[-1], y = Fx[-1], pch=19, col="white")
points(x = x[-1], y = Fx[-1], pch=1)
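If you start from raw observations rather than a probability table, base R's ecdf() may save some of the manual work. A minimal sketch, assuming made-up Poisson data: plotting an ecdf object with verticals = FALSE draws the treads and the closed points without the risers, and the open circles are then added by hand at the right end of each tread.

```r
pdf(NULL)  # null device so the sketch runs non-interactively; remove it to see the plot

set.seed(1)
x  <- rpois(30, 3)   # made-up sample
Fn <- ecdf(x)        # empirical CDF as a step function
k  <- knots(Fn)      # the jump locations (sorted unique values)

# Treads and closed points only; verticals = FALSE leaves out the risers.
plot(Fn, verticals = FALSE, do.points = TRUE, pch = 19,
     xlab = "X label", ylab = "Y label", main = "Title")

# Open circles: at each jump, the value the function had just before it.
open_y <- c(0, Fn(k)[-length(k)])
points(k, open_y, pch = 1)
dev.off()
```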
You can construct this plot using:
plot(x, y, pch = 16, ylim = c(-0.03, 1.03), ylab = "CDF") # solid points/graphic settings
points(x[-1], y[-length(y)]) # open points
abline(h = c(0, 1), col = "grey", lty = 2) # horizontal lines
Note: plot(x, y, type = "s") does not produce a plot like the one in your question, but rather a step function with both treads (horizontal lines) and risers (vertical lines):
Data
library(dplyr)
set.seed(1)
df <- data.frame(x = rpois(30, 3)) %>%
dplyr::arrange(x) %>%
dplyr::add_count(x) %>%
dplyr::distinct(x, .keep_all = TRUE) %>%
mutate(y = cumsum(n) / sum(n))
x <- df$x
y <- df$y
I am trying to combine two graphs so that they are stacked vertically and share one x-axis.
I already tried ggplot but didn't succeed: I could not rewrite the barplot() and plot() calls as ggplot() calls so that the graphs still come out right.
I would be very grateful for any help!
That's the first plot:
plot(as.factor(DauerK_mcpM$Kulturkategorie),
DauerK_mcpM$Electivity,
ylim = c(-1,1),
ylab="Electivity index",
col = DauerK_mcpM$Farbe, xaxt = "n",
main = "Electivity, males, with duration")
abline(h = 0, lty = 2)
x.labels <- gsub("^.*?\\)","",levels(as.factor(DauerK_mcpM$Kulturkategorie)))
breaks <- seq(1,length(x.labels), 1)
axis(1, labels = x.labels, at = breaks, las = 2, cex.axis = 1)
dev.off()
That's the second plot:
barplot(Dauer_pro_Kultur_prozentM,
beside = TRUE,
xaxt = "n", ylab="Time spent [%]",
main = "Males", col = Dauer_pro_KulturW$Farbe)
x.labels <- gsub("^.*?\\)", "", levels(as.factor(Dauer_pro_KulturW$Kulturkategorie)))
length <- length(x.labels)*1.2
breaks <- seq(from = 0.7, to = length, 1.2)
axis(1, labels = x.labels, at = breaks, las = 2, cex.axis = 1)
dev.off()
This can be done in ggplot by adding an indicator column for the plot type and then faceting by that indicator:
library(tidyverse)
#create some data
set.seed(20181022)
data <- data.frame(x = letters[ceiling(runif(100, 0, 10))],
y = runif(100),
stringsAsFactors = FALSE)
#duplicate the data and add an indicator for the Plot Type
data <- data %>%
bind_rows(data) %>%
mutate(PlotType = rep(1:2, each = nrow(data)))
#Facet by the plot type and subset each geom
data %>%
ggplot(aes(x, y)) +
facet_grid(PlotType~., scales = "free")+
geom_boxplot(data = filter(data, PlotType == 1)) +
geom_bar(data = filter(data, PlotType == 2), stat = "identity")
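If you would rather stay in base R, stacking the two panels with a shared x-axis is also possible. The following is only a sketch on made-up data (`grp`, `val` and `share` are stand-ins I invented for the category, the electivity index and the percentage of time spent). The idea is to ask barplot() for its bar midpoints first, then place the boxes at those positions and give both panels the same xlim.

```r
pdf(NULL)  # null device so the sketch runs non-interactively; remove it to see the plot

set.seed(20181022)
grp   <- factor(sample(letters[1:4], 80, replace = TRUE), levels = letters[1:4])
val   <- runif(80, -1, 1)                 # stand-in for the index shown as boxes
share <- 100 * prop.table(table(grp))     # stand-in for the percentages shown as bars

mids <- barplot(share, plot = FALSE)      # x positions barplot() will use
xlim <- range(mids) + c(-0.7, 0.7)        # common x range for both panels

op <- par(mfrow = c(2, 1), mar = c(3, 4, 1, 1))
# Top panel: boxes placed at the bar midpoints, x-axis suppressed.
boxplot(val ~ grp, at = mids, xlim = xlim, ylim = c(-1, 1), boxwex = 0.8,
        xaxt = "n", xlab = "", ylab = "index")
abline(h = 0, lty = 2)
# Bottom panel: bars with the category labels, same xlim.
barplot(share, xlim = xlim, ylab = "share [%]")
par(op)
dev.off()
```

Since both panels use identical margins and the same xlim, each box sits directly above its bar.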
I tried to label the x-axis correctly.
hist(InsectSprays$count, col='pink', xlab='Sprays', labels=levels(InsectSprays$spray), xaxt='n')
axis(1, at=unique(InsectSprays$spray), labels=levels(InsectSprays$spray))
But this produces
I want the letters below the bars and not on top.
You have to plot the labels at the histogram bin midpoints. If you remove the axis line and ticks and keep only the letters, padj moves the letters closer to where the axis would have been.
h <- hist(InsectSprays$count, plot = FALSE)
plot(h, xaxt = "n", xlab = "Insect Sprays", ylab = "Counts",
main = "", col = "pink")
axis(1, h$mids, labels = LETTERS[1:6], tick = FALSE, padj= -1.5)
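To see what the labels are being attached to, it can help to inspect the object that hist() returns. A small sketch (the data frame `bins` is just a name I use here for display): each midpoint is the centre of a range of count values, so the letters above label the bins by position only and do not tie a bin to a particular spray.

```r
# Compute the histogram without drawing it.
h <- hist(InsectSprays$count, plot = FALSE)

# One row per bin: its range, its midpoint, and how many observations fall in it.
bins <- data.frame(from = head(h$breaks, -1),
                   to   = h$breaks[-1],
                   mid  = h$mids,
                   n    = h$counts)
bins
```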
I generally think bar plots are better suited to categorical variables. A solution in base R, with some rearrangement of the data, could be:
d <- aggregate(InsectSprays$count, by=list(spray=InsectSprays$spray), FUN=sum)
d <- d[order(d$x, decreasing = T),]
t <- d$x
names(t) <- d$spray
barplot(t, las = 1, space = 0, col = "pink", xlab = "Sprays", ylab = "Count")
The output is the following:
Since you mentioned a ggplot solution would be nice:
library(ggplot2)
library(dplyr)
InsectSprays %>%
group_by(spray) %>%
summarise(count = sum(count)) %>%
ggplot(aes(reorder(spray, -count),count)) +
geom_bar(stat = "identity", fill = "pink2") +
xlab("Sprays")
The output being:
I know this question has been asked before, but I couldn't solve my problem.
The graph is unreadable when I use text() to label the points, and it is no better with identify().
This is what I get with this script:
VehiculeFunction <- function(data, gamme, absciss, ordinate, label, xlim, ylim){
my.data <- data[data$GAMME == gamme,]
ma.col = rgb(red = 0.1,blue = 1,green = 0.1, alpha = 0.2)
X <- my.data[[absciss]]
Y <- my.data[[ordinate]]
Z <- my.data[[label]]
X11()
plot(X, Y, pch=20, las = 1, col = ma.col, xlab = absciss, ylab = ordinate, xlim = xlim, ylim = ylim)
text(X, Y, labels = Z, pos=3, cex = 0.7, col = ma.col)
#identify(X, Y, labels = Z, cex = 0.7)
}
VehiculeFunction(data.vehicule, "I", "GMF.24", "Cout.24", "NITG", c(0,0.2), c(0,0.2))
I tried iplot, but I couldn't add identify or text to it...
I have never used ggplot, so I don't know whether it could solve my problem.
Thank you for your help.
A tool that might help with this is facet_zoom from the ggforce package.
I don't have access to the data.vehicule object, so I will use the mtcars data.frame for an example of zooming in on a region of the graphic.
library(ggplot2)
library(ggforce)
library(dplyr)
mtcars2 <- mtcars %>% mutate(nm = rownames(mtcars))
ggplot(mtcars2) +
aes(x = wt, y = mpg, label = nm) +
geom_text()
last_plot() +
theme_bw() +
facet_zoom(x = dplyr::between(wt, 3, 4),
y = dplyr::between(mpg, 12, 17))
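For comparison, roughly the same idea can be sketched in base R without extra packages, again on mtcars and with the same zoom window as above (wt between 3 and 4, mpg between 12 and 17). This is not what facet_zoom does internally, just a hand-made approximation: an overview panel with the region outlined, and a second panel that shows and labels only the points inside it.

```r
pdf(NULL)  # null device so the sketch runs non-interactively; remove it to see the plot

# Logical index of the points inside the zoom window.
zoom <- with(mtcars, wt >= 3 & wt <= 4 & mpg >= 12 & mpg <= 17)

op <- par(mfrow = c(1, 2))
# Overview: all points, zoom window outlined, no labels.
plot(mpg ~ wt, data = mtcars, pch = 20)
rect(3, 12, 4, 17, border = "red")
# Detail: only the points inside the window, with their labels.
plot(mpg ~ wt, data = mtcars[zoom, ], pch = 20,
     xlim = c(3, 4), ylim = c(12, 17))
text(mtcars$wt[zoom], mtcars$mpg[zoom],
     labels = rownames(mtcars)[zoom], pos = 3, cex = 0.7)
par(op)
dev.off()
```

Labelling only the zoomed subset keeps the overview readable, which is the main problem described in the question.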