Annotation logticks along ggplot x axis not having uniform gaps - r

I have created an R data frame as follows:
A<-data.frame("Col1"= c(21.5 ,22.5 ,15.5, 20.5 ,17.5 ,14.5 ,23.5, 11.5, 16.5, 25.5 ,18.5, 24.5 ,10.5 , 9.5, 19.5, 26.5, 13.5, 12.5 ,27.5, 4.5 , 5.5, 8.5, 6.5, 7.5))
A$Col2=c(0.619219548, 0.723265668,0.122833055, 0.536849680, 0.257225692 ,0.081648474, 0.794797325 ,0.023125359, 0.194364553, 0.909681117, 0.343930779, 0.857658382, 0.018791029 ,0.014457257, 0.467485576 ,0.950865217, 0.062140165, 0.040464671, 0.989875246, 0.001502443,0.003637989 ,0.012290763, 0.005796326, 0.007959621)
I have created the following plot on a log scale using the ggplot2 package:
library(scales)
library(ggplot2)
chart_1 <- ggplot(A, aes(x = Col1, y = Col2)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_x_log10(minor_breaks = seq(0, max(A$Col1) * 10, 0.1), breaks = pretty_breaks()) +
  scale_y_log10(minor_breaks = seq(0, 100, 0.1)) +
  annotation_logticks(sides = "lb", outside = FALSE,
                      short = unit(1, "mm"), mid = unit(3, "mm"), long = unit(6, "mm")) +
  theme(panel.grid.major = element_line(colour = "red", size = 0.5),
        panel.grid.minor = element_line(colour = "green", size = 0.2))
With this I get a y axis with nine regularly arranged log ticks between each pair of major gridlines, i.e. each of the intervals 0.001 to 0.01, 0.01 to 0.1 and 0.1 to 1 is divided into 10 divisions. I would like the x axis to be divided the same way, dynamically, but I have not been able to achieve this. Could someone guide me? Many thanks in advance.

I believe your code is working just fine.
annotation_logticks draws its marks at the integer multiples of each power of ten on the log10 scale, whatever the range of the data.
So there are marks between 0.01 and 0.1, between 0.1 and 1, between 1 and 10 (on your x axis you can see the marks at 5, 6, 7, 8, 9 and 10) and between 10 and 100 (20, 30, 40, ..., 100; on your x axis you can see the marks at 20 and 30). Your x values cover only part of the decades 1 to 10 and 10 to 100, so only some of those marks appear.
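To see why the x-axis marks look unevenly spaced, it may help to compute where log ticks fall. This is a minimal base R sketch, assuming the marks sit at integer multiples of each power of ten as described above; the variable names are illustrative only.

```r
# Data range of Col1 in the question
x_range <- c(4.5, 27.5)

# Candidate tick positions: 1..9 times 1, 10 and 100
ticks <- sort(as.vector(outer(1:9, 10^(0:2))))

# Ticks that fall inside the data range
visible <- ticks[ticks >= x_range[1] & ticks <= x_range[2]]
print(visible)

# Distance between neighbouring ticks on a log10 axis, within one decade
gaps <- diff(log10(1:10))
print(round(gaps, 3))
```

The gaps shrink steadily within a decade, and only a partial decade on each side of 10 lies inside the x range, so the x axis cannot show the complete, repeating pattern that the y axis shows over its three decades.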

Related

Plotting unequal error bars as bubbles on a scatterplot in ggplot2

I have a set of 10 density estimates, obtained from 5 sites using two different methods (REM and DS). Each density estimate has its own confidence interval, and the intervals are asymmetric around the estimate.
I want a scatter plot with the x-axis showing the density estimate from REM and the y-axis showing the density estimate from DS. I then want to draw a bubble around each point, representing the confidence intervals.
At the moment I can only seem to set a single height and width for each bubble, which would be fine if the intervals were symmetric. Since they are not, the bubbles should not be circles but egg-shaped ellipses, off-centre from the point estimate.
This is the code I've used, in which you can see the respective confidence intervals. The plot shows what this produces, as if the confidence intervals were symmetric. How would I adapt this to make the confidence intervals asymmetric?
Thank you.
library(ggplot2)
library(ggforce)  # provides geom_ellipse()
# sample data
df <- data.frame(site=c(1, 2, 3, 4, 5),
rem=c(17.7, 14.1, 10.6, 13.2, 1.0),
rem_lower=c(8.2, 6.6, 4.2, 3.2, 0.2),
rem_upper=c(27.1, 21.5, 17.0, 23.1, 1.7),
ds=c(16.6, 18.5, 5.2, 21.8, 2.4),
ds_lower=c(6.3, 5.1, 2.7, 4.5, 0.5),
ds_upper=c(40.4, 39.9, 10.9, 44.7, 8.3))
# calculate the width and height of each ellipse
width <- df$rem_upper - df$rem_lower
height <- df$ds_upper - df$ds_lower
# plot the data with ellipses
ggplot(df, aes(x = rem, y = ds, color = factor(site))) +
geom_point(size = 5) +
geom_ellipse(aes(x0 = rem, y0 = ds, a = width, b = height, fill = factor(site),
angle = 45), alpha = 0.3) +
scale_fill_manual(values = c("#1f78b4", "#33a02c", "#e31a1c", "#ff7f00", "#6a3d9a")) +
labs(x = "rem", y = "DS") +
theme_classic()
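One possible way to get off-centre, egg-shaped bubbles is to build each outline from four quarter-ellipses and draw it with geom_polygon. The sketch below is only an illustration under that assumption: egg_outline is a hypothetical helper written for this example, not a ggplot2 or ggforce function, and the shape is a visual device rather than a joint confidence region.

```r
# Sample data from the question
df <- data.frame(site = c(1, 2, 3, 4, 5),
                 rem = c(17.7, 14.1, 10.6, 13.2, 1.0),
                 rem_lower = c(8.2, 6.6, 4.2, 3.2, 0.2),
                 rem_upper = c(27.1, 21.5, 17.0, 23.1, 1.7),
                 ds = c(16.6, 18.5, 5.2, 21.8, 2.4),
                 ds_lower = c(6.3, 5.1, 2.7, 4.5, 0.5),
                 ds_upper = c(40.4, 39.9, 10.9, 44.7, 8.3))

# Hypothetical helper: outline made of four quarter-ellipses around (x0, y0)
egg_outline <- function(x0, y0, x_lo, x_hi, y_lo, y_hi, n = 101) {
  t <- seq(0, 2 * pi, length.out = n)
  # Use the right or left half-width, and the upper or lower half-height
  rx <- ifelse(cos(t) >= 0, x_hi - x0, x0 - x_lo)
  ry <- ifelse(sin(t) >= 0, y_hi - y0, y0 - y_lo)
  data.frame(x = x0 + rx * cos(t), y = y0 + ry * sin(t))
}

# One outline per site, stacked into a single data frame
eggs <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
  out <- egg_outline(df$rem[i], df$ds[i],
                     df$rem_lower[i], df$rem_upper[i],
                     df$ds_lower[i], df$ds_upper[i])
  out$site <- df$site[i]
  out
}))

# Plot only if ggplot2 is installed; print(p) draws the figure
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = rem, y = ds, colour = factor(site))) +
    geom_polygon(data = eggs,
                 aes(x = x, y = y, fill = factor(site), group = site),
                 alpha = 0.3, colour = NA) +
    geom_point(size = 3) +
    theme_classic()
}
```

Each outline touches the lower and upper limits in both directions, so the point estimate sits off-centre whenever the interval is asymmetric.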

Plotting log normal density in R has wrong height

I have a log-normal density with a mean of -0.4 and standard deviation of 2.5.
At x = 0.001 the height is over 5 (I double checked this value with the formula for the log-normal PDF):
dlnorm(0.001, -0.4, 2.5)
5.389517
When I plot it using the curve function over the input range 0-6, the peak appears to be just over 1.5:
curve(dlnorm(x, -.4, 2.5), xlim = c(0, 6), ylim = c(0, 6))
When I adjust the input range to 0-1 the height is nearly 4:
curve(dlnorm(x, -.4, 2.5), xlim = c(0, 1), ylim = c(0, 6))
Similarly with ggplot2 (output not shown, but looks like the curve plots above):
library(ggplot2)
ggplot(data = data.frame(x = 0), mapping = aes(x = x)) +
stat_function(fun = function(x) dlnorm(x, -0.4, 2.5)) +
xlim(0, 6) +
ylim(0, 6)
ggplot(data = data.frame(x = 0), mapping = aes(x = x)) +
stat_function(fun = function(x) dlnorm(x, -0.4, 2.5)) +
xlim(0, 1) +
ylim(0, 6)
Does someone know why the density height is changing when the x-axis scale is adjusted? And why neither attempt above seems to reach the correct height? I tried this with just the normal density and this doesn't happen.
curve evaluates the function at a set of discrete points in the range you give it. By default it uses n = 101 equally spaced points, so the narrow peak near zero falls between grid points and is missed. If you increase the number of points you will get almost the correct value:
curve(dlnorm(x, -.4, 2.5), xlim = c(0, 1), ylim = c(0, 6), n = 1000)
In your first case curve places 101 points over the interval [0, 6], while in the second case it places 101 points over [0, 1], so the grid is six times denser and lands closer to the peak.
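The effect of the grid can be checked without plotting. This is a small sketch in base R that rebuilds the equally spaced grids described above and compares the largest density value found on each; the variable names are illustrative.

```r
f <- function(x) dlnorm(x, meanlog = -0.4, sdlog = 2.5)

# Equally spaced grids: 101 points on [0, 6] and [0, 1], 1000 points on [0, 1]
grid_wide   <- seq(0, 6, length.out = 101)
grid_narrow <- seq(0, 1, length.out = 101)
grid_fine   <- seq(0, 1, length.out = 1000)

peak_wide   <- max(f(grid_wide))    # first positive grid point is x = 0.06
peak_narrow <- max(f(grid_narrow))  # first positive grid point is x = 0.01
peak_fine   <- max(f(grid_fine))    # first positive grid point is about 0.001

# The mode of a log-normal density is exp(meanlog - sdlog^2)
mode_x    <- exp(-0.4 - 2.5^2)
peak_true <- f(mode_x)

print(c(wide = peak_wide, narrow = peak_narrow, fine = peak_fine, true = peak_true))
```

The mode sits at roughly x = 0.0013, far below the first positive grid point of either default grid, which is why both plots understate the height.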

How to annotate different values for each facet with dodged geom_boxplot on R?

I am trying to add significance asterisks to my ggplot boxplot, using groups (fill) and facets.
Using geom_signif() I can add bars such as:
I am trying to do the same for the dodged boxplots too, similar to
(Imagine there were significance values above the smaller lines...)
The code for the former graph:
data:
library(ggplot2)
library(ggsignif)
df <- data.frame(iris,petal.colour=c("red","blue"), country=c("UK","France","France"))
First plot:
ggplot(df, aes(country,Sepal.Length))+
geom_boxplot(position="dodge",aes(fill=petal.colour))+
facet_wrap(~Species, ncol=3)+
geom_signif(comparisons = list(c("France", "UK")), map_signif_level=TRUE,
tip_length=0,y_position = 9, textsize = 4)
and for the smaller bars
+geom_signif(annotations = c("", ""),
y_position = 8.5,
xmin=c(0.75,1.75), xmax=c(1.25,2.25),tip_length=0)
It would be great to let R do the work, but if it's easier to manually add text above these smaller lines then that's fine with me.
I can't figure out how to get them to work for that group using geom_signif. See the first part for my attempt. I was able to get it to work using ggpubr and stat_compare_means, which I believe is an extension of geom_signif.
ggplot(df, aes(country,Sepal.Length)) +
geom_boxplot(position="dodge",aes(fill=petal.colour)) +
facet_wrap(~Species, ncol=3) +
geom_signif(comparisons = list(c("France", "UK")), map_signif_level=TRUE,
tip_length=0,y_position = 9, textsize = 4) +
geom_signif(y_position = 8.5,
xmin=c(0.75,1.75), xmax=c(1.25,2.25), tip_length=0, map_signif_level = c("***" = 0.001, "**" = 0.01, "*" = 0.05))
Warning messages:
1: In wilcox.test.default(c(4.9, 4.7, 5, 5.4, 5, 4.4, 5.4, 4.8, 4.3, :
cannot compute exact p-value with ties
2: In wilcox.test.default(c(7, 6.9, 5.5, 5.7, 6.3, 6.6, 5.2, 5.9, 6, :
cannot compute exact p-value with ties
3: In wilcox.test.default(c(6.3, 5.8, 6.3, 6.5, 4.9, 7.3, 7.2, 6.5, :
cannot compute exact p-value with ties
4: Computation failed in `stat_signif()`:
arguments imply differing number of rows: 6, 0
5: Computation failed in `stat_signif()`:
arguments imply differing number of rows: 6, 0
6: Computation failed in `stat_signif()`:
arguments imply differing number of rows: 6, 0
Using ggpubr and stat_compare_means. Note you can use different labels, and tests, etc. See ?stat_compare_means.
library(ggpubr)
ggplot(df, aes(country,Sepal.Length)) +
geom_boxplot(position="dodge",aes(fill=petal.colour)) +
facet_wrap(~Species, ncol=3) +
stat_compare_means(aes(group = country), label = "p.signif", label.y = 10, label.x = 1.5) +
stat_compare_means(aes(group = petal.colour), label = "p.format", label.y = 8.5)
Alternatively, you could save the plot as a PDF file and add whatever you want manually in a vector editor such as Adobe Illustrator; PDF output from R is vector-based, so it can be edited there.
Or you can try setting
map_signif_level = c("***"=0.001, "**"=0.01, "*"=0.05)
in geom_signif.
Hope that helps
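If the labels end up being added by hand, the per-facet significance levels can be computed first. The following is a rough sketch, assuming a Wilcoxon test per Species and the thresholds quoted above; the "NS" label and the names p_by_species and labels_df are choices made for this example.

```r
# Same data as in the question: iris plus two recycled grouping columns
df <- data.frame(iris, petal.colour = c("red", "blue"),
                 country = c("UK", "France", "France"))

# One Wilcoxon test per facet (Species); exact = FALSE avoids the ties warning
p_by_species <- sapply(split(df, df$Species), function(d) {
  wilcox.test(Sepal.Length ~ country, data = d, exact = FALSE)$p.value
})

# Convert p-values to star labels using the thresholds 0.001, 0.01 and 0.05
stars <- as.character(cut(p_by_species,
                          breaks = c(0, 0.001, 0.01, 0.05, 1),
                          labels = c("***", "**", "*", "NS"),
                          include.lowest = TRUE))

# One row per facet, ready to be used as the data of a text layer
labels_df <- data.frame(Species = names(p_by_species),
                        p = unname(p_by_species),
                        label = stars)
print(labels_df)
```

Because labels_df carries the Species column, a layer such as geom_text(data = labels_df, aes(x = 1.5, y = 9, label = label), inherit.aes = FALSE) should place one label in each facet.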

Axis breaks with integer and non-integer numbers: how to suppress the zero decimals of the integer numbers without excluding the non-integer numbers?

I’m working in a graph where the axis breaks include integer and non-integer numbers.
For illustration purposes consider the following example:
library(ggplot2)
ggplot() + geom_point(aes(x = 0:10, y = 0:10)) +
scale_x_continuous(breaks = seq(0, 10, 2.5)) +
scale_y_continuous(breaks = seq(0, 10, 2.5))
ggplot2 labels the breaks as: 0.0, 2.5, 5.0, 7.5, 10.0
However, I would like the integer values (0, 5 and 10) to appear without the trailing zero decimal, while still keeping the non-integer values (2.5 and 7.5).
For the example above, the axis labels should therefore read: 0, 2.5, 5, 7.5, 10
Is it possible to do this?
Thanks in advance for any suggestion.
Try this:
ggplot()+geom_point(aes(x=0:10,y=0:10))+
scale_x_continuous(breaks=seq(0,10,2.5), labels=c(0,2.5,5,7.5,10))+
scale_y_continuous(breaks=seq(0,10,2.5), labels=c(0,2.5,5,7.5,10))
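To avoid typing the labels by hand, a labelling function can be passed to labels instead of a fixed vector. The sketch below assumes that converting each break to a string on its own is enough for this case; drop_zero is a hypothetical helper name, not part of ggplot2.

```r
# Hypothetical helper: convert each break separately, so 5 stays "5"
# while 2.5 stays "2.5"
drop_zero <- function(x) as.character(x)

breaks <- seq(0, 10, 2.5)
print(drop_zero(breaks))

# Use the helper as the labels function; print(p) draws the figure
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_point(aes(x = 0:10, y = 0:10)) +
    scale_x_continuous(breaks = breaks, labels = drop_zero) +
    scale_y_continuous(breaks = breaks, labels = drop_zero)
}
```

The function form keeps working if the breaks change later, whereas a hard-coded label vector has to be edited together with the breaks.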

R: Bar plot on a continuous x-axis (time-scaled)

I'm fairly new to R so please comment on anything you see.
I have data taken at different timepoints, under two conditions for one of the timepoints, and I want to plot this as a bar plot with error bars, with each bar placed at its actual timepoint on the x axis.
I currently have this (stolen from another question on this site):
library(ggplot2)
example <- data.frame(tp = factor(c(0, "14a", "14b", 24, 48, 72)), means = c(1, 2.1, 1.9, 1.8, 1.7, 1.2), std = c(0.3, 0.4, 0.2, 0.6, 0.2, 0.3))
ggplot(example, aes(x = tp, y = means)) +
geom_bar(position = position_dodge()) +
geom_errorbar(aes(ymin=means-std, ymax=means+std))
At the moment my timepoints are a factor, but because the measurements are unevenly spaced in time, the equally spaced bars make the plot misleading.
This is how I imagine the graph :
I find the ggplot2 package can give you very nice graphs, but I have a lot more difficulty understanding it than I have with other R stuff.
Before we get into R, you have to realize that even in a bar plot the x axis needs a numeric value. If you treat them as factors then the software assumes equal spacing between the bars by default. What would be the x-values for each of the bars in this case? It can be (0, 14, 14, 24, 48, 72) but then it will plot two bars at point 14 which you don't seem to want. So you have to come up with the x-values.
Joran provides an elegant solution by modifying the width of the bars at position 14. Modifying the code given by joran to make the bars fall at the right position in the x-axis, the final solution is:
library(ggplot2)
example <- data.frame(tp = factor(c(0, "14a", "14b", 24, 48, 72)), means = c(1, 2.1, 1.9, 1.8, 1.7, 1.2), std = c(0.3, 0.4, 0.2, 0.6, 0.2, 0.3))
example$tp1 <- gsub("a|b","",example$tp)
example$grp <- c('a','a','b','a','a','a')
example$tp2 <- as.numeric(example$tp1)
ggplot(example, aes(x = tp2, y = means,fill = grp)) +
geom_bar(position = "dodge",stat = "identity") +
geom_errorbar(aes(ymin=means-std, ymax=means+std),position = "dodge")
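The group column above is typed in by hand. As a hedged alternative, the numeric time and the group could both be derived from the labels themselves; this base R sketch assumes every label is a number optionally followed by a single lowercase letter, and that unlabelled timepoints belong to group "a".

```r
# Timepoint labels from the question
tp <- c("0", "14a", "14b", "24", "48", "72")

# Numeric position on the x axis: strip the letter suffix
time <- as.numeric(gsub("[a-z]", "", tp))

# Condition: the letter suffix, with "a" assumed where there is none
suffix <- gsub("[0-9]", "", tp)
grp <- ifelse(suffix == "", "a", suffix)

example <- data.frame(tp = tp, tp2 = time, grp = grp,
                      means = c(1, 2.1, 1.9, 1.8, 1.7, 1.2),
                      std = c(0.3, 0.4, 0.2, 0.6, 0.2, 0.3))
print(example)
```

The resulting tp2 and grp columns match the ones built manually above, so the same ggplot call can be reused.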
