ggplot, data space according to sampling time?

I need to space the dates on the x-axis according to the number of days between samplings. Some samplings are 5 days apart and others 4 days.
The data look like this (I also need to add the BBCH stage to the labels):
structure(list(Time = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L), .Label = c("06.05.2016 BBCH 50–51", "09.05.2016 BBCH 51–53",
"13.05.2016 BBCH 55–59", "16.05.2016 BBCH 59–61", "20.05.2016 BBCH 61–64"
), class = "factor"), Mean1 = c(0.9133333, 0.4366667, 0.313333,
0.176, 0.4, 0.1533333, 0.2066667, 0.29, 0.4633333, 0.4833333),
sd = c(2.704973, 1.639598, 0.8780997, 0.5158375, 1.1213943,
0.5203121, 0.5461531, 0.6587969, 0.823153, 0.9965101), n = c(300L,
300L, 300L, 250L, 300L, 300L, 300L, 300L, 300L, 300L), Mean2 = c(0.15617168,
0.09466226, 0.05069711, 0.03262443, 0.06474373, 0.03004023,
0.03153216, 0.03803566, 0.04752476, 0.05753354), SNH = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("OC", "OF"
), class = "factor"), Round = structure(c(1L, 1L, 2L, 2L,
3L, 3L, 4L, 4L, 5L, 5L), .Label = c("Round 1", "Round 2",
"Round 3", "Round 4", "Round 5"), class = "factor")), class = "data.frame", row.names = c(NA,
-10L))
and my script:
Pan_16<-qplot(x= Time,
y= Mean1,
group= SNH,
data = Plant) +
geom_errorbar(aes(ymin = Mean1- Mean2,
ymax = Mean1 + Mean2),
width=0.2, size=1)+
coord_cartesian(xlim=c(), ylim=c(0,2))+
geom_line(size=1,aes(linetype = SNH)) +
scale_x_discrete(labels=function(x){sub("\\s", "\n", x)})+
scale_color_manual("Field type", values=c("#gray20", "#gray46"))+
labs(title = "", x = "", y = "")+
annotate("text", x = 1 , y = 1.3, label = c("* * * "), color="black", size=5 , fontface="bold")+
annotate("text", x = 2 , y = 0.8, label = c(" * * ") , color="black", size=5 , fontface="bold")+
annotate("text", x = 3 , y = 0.8, label = c("* * * "), color="black", size=5 , fontface="bold")+
theme(axis.line = element_line(size = 1, colour = "grey80"))+
theme( panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.text = element_text(colour = "black"))+
theme(
plot.background = element_rect(fill = "white"),
panel.background = element_rect(fill = "white", colour="white"))

Sisi, to get you going: also check whether your Time variable is a factor. Always check the data types if you get errors or unexpected results.
The praise goes to @Rui, who basically gave you the answer.
I stripped the superfluous layers from your plot to help you see the major building blocks. You can add them back for your desired end result.
library(dplyr)
library(ggplot2) # needed for qplot(), geom_errorbar() and scale_x_date()
df <- structure(list(Time = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L), .Label = c("06.05.2016 BBCH 50–51", "09.05.2016 BBCH 51–53",
"13.05.2016 BBCH 55–59", "16.05.2016 BBCH 59–61", "20.05.2016 BBCH 61–64"
), class = "factor"), Mean1 = c(0.9133333, 0.4366667, 0.313333,
0.176, 0.4, 0.1533333, 0.2066667, 0.29, 0.4633333, 0.4833333),
sd = c(2.704973, 1.639598, 0.8780997, 0.5158375, 1.1213943,
0.5203121, 0.5461531, 0.6587969, 0.823153, 0.9965101), n = c(300L,
300L, 300L, 250L, 300L, 300L, 300L, 300L, 300L, 300L), Mean2 = c(0.15617168,
0.09466226, 0.05069711, 0.03262443, 0.06474373, 0.03004023,
0.03153216, 0.03803566, 0.04752476, 0.05753354), SNH = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("OC", "OF"
), class = "factor"), Round = structure(c(1L, 1L, 2L, 2L,
3L, 3L, 4L, 4L, 5L, 5L), .Label = c("Round 1", "Round 2",
"Round 3", "Round 4", "Round 5"), class = "factor")), class = "data.frame", row.names = c(NA,
-10L))
# ---------- coerce Time to character
df <- df %>% mutate(Time = as.character(Time))
# ---------- now make a Date column
df$Date <- as.Date(df$Time, "%d.%m.%Y")
# with the given data frame plot and set time axis
qplot(x= Date, y= Mean1, group= SNH, data = df) +
geom_errorbar(aes(ymin = Mean1- Mean2,
ymax = Mean1 + Mean2),
width=0.2, size=1) +
# ------------- set a date scale and "configure" to your liking
scale_x_date( date_labels = "%d %b" # show day and month
, date_breaks = "2 days" # have a major break every 2 days
,date_minor_breaks = "1 day" # show minor breaks in between
)
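To see why a Date axis produces uneven spacing, it can help to inspect the gaps between the sampling dates directly. The following is a minimal base-R sketch, not part of the original answer: the names `time_labels`, `sampling_dates` and `gaps` are assumptions for illustration, the labels are copied from the question with plain hyphens instead of en dashes, and it relies on `as.Date()` ignoring trailing characters after the part matched by the format.

```r
# Factor levels of Time from the question (ASCII hyphens used here)
time_labels <- c("06.05.2016 BBCH 50-51", "09.05.2016 BBCH 51-53",
                 "13.05.2016 BBCH 55-59", "16.05.2016 BBCH 59-61",
                 "20.05.2016 BBCH 61-64")

# Only the leading "dd.mm.yyyy" part is parsed; the BBCH suffix is ignored
sampling_dates <- as.Date(time_labels, format = "%d.%m.%Y")

# Days between consecutive samplings
gaps <- as.integer(diff(sampling_dates))
print(gaps)
```

For these five dates the gaps come out as 3 and 4 days. On a date scale they become the horizontal distances between the points, which a discrete (factor) axis cannot show.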
Amendment to showcase user-defined axis breaks
Scales support setting breaks explicitly: you can supply a vector of values or a function that returns the desired breaks.
Below we replace the regular, preconfigured break setting of date_breaks with an explicit breaks argument.
# ---------- coerce Time to character
df <- df %>% mutate(Time = as.character(Time))
# ---------- now make a Date column
df$Date <- as.Date(df$Time, "%d.%m.%Y")
# with the given data frame plot and set time axis
qplot(x= Date, y= Mean1, group= SNH, data = df) +
geom_errorbar(aes(ymin = Mean1- Mean2,
ymax = Mean1 + Mean2),
width=0.2, size=1) +
# ------------- set a date scale and "configure" to your liking
scale_x_date( breaks = unique(df$Date) # setting user defined breaks
,minor_breaks = "1 day" # keep minor breaks evenly spaced
,date_labels = "%d %b" # show day and month
)
This yields a plot whose major breaks sit on the actual sampling dates.
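The question also asked for the BBCH stage in the axis labels. One possible way to get there, sketched under assumptions, is a labelling function that looks up each break among the sampling dates. The helper name `bbch_label` is hypothetical, the labels are copied from the question with plain hyphens, and the sketch assumes the labels function of a date scale receives its breaks as Date values.

```r
# Factor levels of Time from the question (ASCII hyphens used here)
time_labels <- c("06.05.2016 BBCH 50-51", "09.05.2016 BBCH 51-53",
                 "13.05.2016 BBCH 55-59", "16.05.2016 BBCH 59-61",
                 "20.05.2016 BBCH 61-64")
sampling_dates <- as.Date(time_labels, format = "%d.%m.%Y")

# Hypothetical helper: map each break (a Date) to "date<newline>BBCH stage";
# breaks that are not sampling dates fall back to a plain "day month" label
bbch_label <- function(breaks) {
  idx <- match(as.character(breaks), as.character(sampling_dates))
  out <- format(breaks, "%d %b")
  found <- !is.na(idx)
  out[found] <- sub(" ", "\n", time_labels[idx[found]], fixed = TRUE)
  out
}

print(bbch_label(sampling_dates[1:2]))
```

It could then be tried as `scale_x_date(breaks = unique(df$Date), labels = bbch_label)` in place of `date_labels`.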

Related

Why doesn't the x axis add value to the existing values? [duplicate]

I have the following plot:
library(reshape)
library(ggplot2)
library(gridExtra)
require(ggplot2)
data2<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(15L, 11L, 29L, 42L, 0L, 5L, 21L,
22L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
p <- ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15))
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
q<- ggplot(data3, aes(x =factor(IR), y = value, fill = Legend, width=.15))
##the plot##
# opts() and theme_blank() are defunct; theme() and element_blank() replace them,
# and geom_col() draws bars from the y values
q + geom_col(position='dodge', colour='black') + ylab('Frequency') + xlab('IR') + scale_fill_grey() +
theme(axis.text.x=element_text(colour="black"), axis.text.y=element_text(colour="black"),
panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank(),
panel.background = element_blank(), axis.ticks.x = element_blank())
I want the y-axis to display only integers. Whether this is accomplished through rounding or through a more elegant method isn't really important to me.

If you have the scales package, you can use pretty_breaks() without having to manually specify the breaks.
library(scales) # provides pretty_breaks()
q + geom_col(position='dodge', colour='black') +
scale_y_continuous(breaks= pretty_breaks())

This is what I use:
ggplot(data3, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = function(x) unique(floor(pretty(seq(0, (max(x) + 1) * 1.1)))))

With scale_y_continuous() and argument breaks= you can set the breaking points for y axis to integers you want to display.
ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15)) +
geom_col(position='dodge', colour='black')+
scale_y_continuous(breaks=c(1,3,7,10))

You can use a custom breaks function. For example, this function is guaranteed to produce only integer breaks:
int_breaks <- function(x, n = 5) {
l <- pretty(x, n)
l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}
Use as
+ scale_y_continuous(breaks = int_breaks)
It works by taking the default breaks, and only keeping those that are integers. If it is showing too few breaks for your data, increase n, e.g.:
+ scale_y_continuous(breaks = function(x) int_breaks(x, n = 10))
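To see what the filter does outside of a plot, the helper can be called directly on a pair of axis limits. This is only an illustrative sketch: the `limits` value is a made-up example, and `default_breaks` and `integer_only` are names introduced here.

```r
# Breaks helper as given in the answer above
int_breaks <- function(x, n = 5) {
  l <- pretty(x, n)
  l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}

# Hypothetical axis limits, e.g. data from 0 to about 3
limits <- c(0, 3.3)

default_breaks <- pretty(limits, 5)   # candidate breaks, may include halves
integer_only   <- int_breaks(limits)  # only the whole-number candidates remain

print(default_breaks)
print(integer_only)
```

Because the helper only filters the candidates from `pretty()`, it never invents new break positions; that is why a larger `n` can be needed when few candidates are whole numbers.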

The other answers did not work for me, and they did not explain how they work.
The breaks argument to the scale_*_continuous functions can be used with a custom function that takes the limits as input and returns breaks as output. By default, the axis limits will be expanded by 5% on each side for continuous data (relative to the range of data). The axis limits will likely not be integer values due to this expansion.
The solution I was looking for was to simply round the lower limit up to the nearest integer, round the upper limit down to the nearest integer, and then have breaks at integer values between these endpoints. Therefore, I used the breaks function:
brk <- function(x) seq(ceiling(x[1]), floor(x[2]), by = 1)
The required code snippet is:
scale_y_continuous(breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1))
The reproducible example from original question is:
data3 <-
structure(
list(
IR = structure(
c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L),
.Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"),
class = "factor"
),
variable = structure(
c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L),
.Label = c("Real queens", "Simulated individuals"),
class = "factor"
),
value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L),
Legend = structure(
c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
.Label = c("Real queens",
"Simulated individuals"),
class = "factor"
)
),
row.names = c(NA,-8L),
class = "data.frame"
)
ggplot(data3, aes(
x = factor(IR),
y = value,
fill = Legend,
width = .15
)) +
geom_col(position = 'dodge', colour = 'black') + ylab('Frequency') + xlab('IR') +
scale_fill_grey() +
scale_y_continuous(
breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1),
expand = expansion(mult = c(0, 0.05)) # expand_scale() is deprecated since ggplot2 3.3.0
) +
theme(axis.text.x=element_text(colour="black", angle = 45, hjust = 1),
axis.text.y=element_text(colour="Black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.ticks.x = element_blank())
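The effect of the default 5% expansion on the limits, and what the limits-based breaks function returns for them, can be checked without drawing anything. This is a small sketch under assumptions: `data_range` is a made-up range, and `brk_by` is a hypothetical variant (not from the answer) for ranges where one break per integer would be too dense.

```r
# Breaks function from the answer: every integer inside the limits
brk <- function(x) seq(ceiling(x[1]), floor(x[2]), by = 1)

# Hypothetical data range, expanded by 5% of its width on each side
data_range <- c(0, 10)
limits <- data_range + c(-1, 1) * 0.05 * diff(data_range)
print(limits)        # non-integer limits
print(brk(limits))   # integers between the rounded-in endpoints

# Hypothetical variant: same idea, but with a step wider than 1
brk_by <- function(by) {
  function(x) seq(ceiling(x[1] / by) * by, floor(x[2] / by) * by, by = by)
}
print(brk_by(5)(c(-0.5, 22.3)))
```

A variant like `brk_by(5)` would be passed the same way, for example `scale_y_continuous(breaks = brk_by(5))`.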

I found this solution from Joshua Cook, and it worked pretty well.
integer_breaks <- function(n = 5, ...) {
fxn <- function(x) {
breaks <- floor(pretty(x, n, ...))
names(breaks) <- attr(breaks, "labels")
breaks
}
return(fxn)
}
q + geom_col(position='dodge', colour='black') +
scale_y_continuous(breaks = integer_breaks())
The source is:
https://joshuacook.netlify.app/post/integer-values-ggplot-axis/

You can use the accuracy argument of scales::label_number() or scales::label_comma() for this:
fakedata <- data.frame(
x = 1:5,
y = c(0.1, 1.2, 2.4, 2.9, 2.2)
)
library(ggplot2)
# without the accuracy argument, you see .0 decimals
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = scales::comma)
# with the accuracy argument, all displayed numbers are integers
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = ~ scales::comma(.x, accuracy = 1))
# equivalent
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = scales::label_comma(accuracy = 1))
# this works with scales::label_number() as well
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = scales::label_number(accuracy = 1))
Created on 2021-08-27 by the reprex package (v2.0.0.9000)

All of the existing answers seem to require custom functions or fail in some cases.
This line makes integer breaks:
bad_scale_plot +
scale_y_continuous(breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))
For more info, see the documentation ?labeling::extended (which is a function called by scales::breaks_extended).
Basically, the argument Q is a set of nice numbers that the algorithm tries to use for scale breaks. The original plot produces non-integer breaks (0, 2.5, 5, and 7.5) because the default value for Q includes 2.5: Q = c(1,5,2,2.5,4,3).
EDIT: as pointed out in a comment, non-integer breaks can occur when the y-axis has a small range. By default, breaks_extended() tries to make about n = 5 breaks, which is impossible when the range is too small. Quick testing shows that ranges wider than 0 < y < 2.5 give integer breaks (n can also be decreased manually).

This answer builds on @Axeman's answer to address the comment by kory that if the data only go from 0 to 1, no break is shown at 1. This seems to be caused by floating-point inaccuracy in pretty(): an output that prints as 1 is not identical to 1 (see the example at the end).
Therefore if you use
int_breaks_rounded <- function(x, n = 5) pretty(x, n)[round(pretty(x, n),1) %% 1 == 0]
with
+ scale_y_continuous(breaks = int_breaks_rounded)
both 0 and 1 are shown as breaks.
Example to illustrate difference from Axeman's
testdata <- data.frame(x = 1:5, y = c(0,1,0,1,1))
p1 <- ggplot(testdata, aes(x = x, y = y))+
geom_point()
p1 + scale_y_continuous(breaks = int_breaks)
p1 + scale_y_continuous(breaks = int_breaks_rounded)
Both will work with the data provided in the initial question.
Illustration of why rounding is required
pretty(c(0,1.05),5)
#> [1] 0.0 0.2 0.4 0.6 0.8 1.0 1.2
identical(pretty(c(0,1.05),5)[6],1)
#> [1] FALSE
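A likely reason the modulo filter drops such a value is that a number slightly below 1 has a remainder close to 1 rather than close to 0. The sketch below illustrates this with a synthetic value; `is_whole` and `int_breaks_tol` are hypothetical names introduced here, and measuring the distance to the nearest integer is offered as one alternative to rounding to one decimal, not as the answer's own method.

```r
tol <- sqrt(.Machine$double.eps)
x <- 1 - 1e-12   # prints as 1, but is slightly below 1

# Modulo test: the remainder is close to 1, so the value is rejected
passes_modulo <- abs(x %% 1) < tol

# Distance-to-nearest-integer test: accepts values just below or just above
is_whole <- function(v, tol = sqrt(.Machine$double.eps)) abs(v - round(v)) < tol

# Hypothetical breaks helper built on that test; round() cleans up the result
int_breaks_tol <- function(x, n = 5) {
  l <- pretty(x, n)
  round(l[is_whole(l)])
}

print(passes_modulo)
print(is_whole(x))
print(int_breaks_tol(c(0, 1.05)))
```

It can be used like the other helpers, for example `scale_y_continuous(breaks = int_breaks_tol)`.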

Google brought me to this question. I'm trying to use real numbers in a y scale. The y scale numbers are in Millions.
The scales package comma method introduces a comma to my large numbers. This post on R-Bloggers explains a simple approach using the comma method:
library(ggplot2)
library(scales)
big_numbers <- data.frame(x = 1:5, y = c(1000000:1000004))
big_numbers_plot <- ggplot(big_numbers, aes(x = x, y = y))+
geom_point()
big_numbers_plot + scale_y_continuous(labels = comma)
Enjoy R :)

One answer is indeed inside the documentation of the pretty() function. As pointed out in "Setting axes to integer values in 'ggplot2'", the function already contains the solution; you just have to make it work for small values. One possibility is to write a new function, as the author does; for me, a lambda function inside the breaks argument just works:
... + scale_y_continuous(breaks = ~round(unique(pretty(.))))
It rounds the unique set of values generated by pretty(), creating only integer labels, no matter the scale of the values.
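One detail worth checking is the order of `unique()` and `round()`: in the one-liner, `unique()` runs before `round()`, so rounding can map distinct candidates onto the same integer again. The sketch below shows this on made-up limits; `limits`, `rounded_first` and `deduplicated` are names introduced for illustration.

```r
# Hypothetical narrow axis limits
limits <- c(0, 2.2)

raw <- pretty(limits)                  # candidates, including halves
rounded_first <- round(unique(raw))    # order used in the one-liner
deduplicated  <- unique(round(raw))    # round first, then drop repeats

print(raw)
print(rounded_first)
print(deduplicated)
```

If repeated breaks turn out to be a problem, swapping the order, as in `breaks = ~unique(round(pretty(.)))`, is a small change to try.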

If your values are integers, here is another way of doing this with group = 1 and as.factor(value):
library(tidyverse)
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
data3 %>%
mutate(value = as.factor(value)) %>%
ggplot(aes(x =factor(IR), y = value, fill = Legend, width=.15)) +
geom_col(position = 'dodge', colour='black', group = 1)
Created on 2022-04-05 by the reprex package (v2.0.1)

This is what I did
scale_x_continuous(labels = function(x) round(as.numeric(x)))

R - reformat P value in ggplot using 'stat_compare_means'

I want to add the p values to each panel of a faceted ggplot. If the p value is larger than 0.05, I want to display it as it is. If it is smaller than 0.05, I want to display it in scientific notation (e.g., 0.0032 -> 3.20e-3; 0.0000425 -> 4.25e-5).
The code I wrote to do this is:
p1 <- ggplot(data = CD3, aes(location, value, color = factor(location),
fill = factor(location))) +
theme_bw(base_rect_size = 1) +
geom_boxplot(alpha = 0.3, size = 1.5, show.legend = FALSE) +
geom_jitter(width = 0.2, size = 2, show.legend = FALSE) +
scale_color_manual(values=c("#4cdee6", "#e47267", "#13ec87")) +
scale_fill_manual(values=c("#4cdee6", "#e47267", "#13ec87")) +
ylab(expression(paste("Density of clusters, ", mm^{-2}))) +
xlab(NULL) +
stat_compare_means(comparisons = list(c("CT", 'N'), c("IF","N")),
aes(label = ifelse(..p.format.. < 0.05, formatC(..p.format.., format = "e", digits = 2),
..p.format..)),
method = 'wilcox.test', show.legend = FALSE, size = 10) +
#ylab(expression(paste('Density, /', mm^2, )))+
theme(axis.text = element_text(size = 10),
axis.title = element_text(size = 20),
legend.text = element_text(size = 38),
legend.title = element_text(size = 40),
strip.background = element_rect(colour="black", fill="white", size = 2),
strip.text = element_text(margin = margin(10, 10, 10, 10), size = 40),
panel.grid = element_line(size = 1.5))
plot(p1)
This code runs without error; however, the format of the numbers isn't changed. What am I doing wrong?
I attached the data to reproduce the plot: download data here
EDIT
structure(list(value = c(0.931966449207829, 3.24210526315789,
3.88811650210901, 0.626860993574675, 4.62085308056872, 0.477508650519031,
0.111900110501359, 3.2495164410058, 4.06626506024096, 0.21684918139434,
1.10365086026018, 4.66666666666667, 0.174109967855698, 0.597625869832174,
2.3758865248227, 0.360751947840548, 1.00441501103753, 3.65168539325843
), Criteria = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Density", "Density of cluster",
"nodular count", "Elongated count"), class = "factor"), Case = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L,
6L), .Label = c("Case 1A", "Case 1B", "Case 2", "Case 3", "Case 4",
"Case 5"), class = "factor"), Mark = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CD3",
"CD4", "CD8", "CD20", "FoxP3"), class = "factor"), location = structure(c(3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L), .Label = c("CT", "IF", "N"), class = "factor")), row.names = c(91L,
92L, 93L, 106L, 107L, 108L, 121L, 122L, 123L, 136L, 137L, 138L,
151L, 152L, 153L, 166L, 167L, 168L), class = "data.frame")

I think your issue comes from stat_compare_means and its use of comparisons.
I'm not totally sure, but my guess is that the p value output of stat_compare_means differs from that of compare_means, so you can't use it in the aes for the label.
Let me explain. With your example, you can modify the display of the p value like this:
library(ggplot2)
library(ggpubr)
ggplot(df, aes(x = location, y = value, color = location))+
geom_boxplot()+
stat_compare_means(ref.group = "N", aes(label = ifelse(p < 0.05,sprintf("p = %2.1e", as.numeric(..p.format..)), ..p.format..)))
You get the correct display of the p value, but you lose the bars. If you use the comparisons argument instead, you get:
library(ggplot2)
library(ggpubr)
ggplot(df, aes(x = location, y = value, color = location))+
geom_boxplot()+
stat_compare_means(comparisons = list(c("CT","N"), c("IF","N")), aes(label = ifelse(p < 0.05,sprintf("p = %2.1e", as.numeric(..p.format..)), ..p.format..)))
So now you get the bars, but not the correct display.
To circumvent this issue, you can perform the statistics outside of ggplot2 using the compare_means function and use the ggsignif package to display the result.
Here, I'm using dplyr and its mutate function to create new columns, but you can easily do it in base R.
library(dplyr)
library(magrittr)
c <- compare_means(value~location, data = df, ref.group = "N")
c %<>% mutate(y_pos = c(5,5.5), labels = ifelse(p < 0.05, sprintf("%2.1e",p),p))
# A tibble: 2 x 10
  .y.   group1 group2       p p.adj p.format p.signif method   y_pos labels
  <chr> <chr>  <chr>    <dbl> <dbl> <chr>    <chr>    <chr>    <dbl> <chr>
1 value N      CT     0.00866 0.017 0.0087   **       Wilcoxon   5   8.7e-03
2 value N      IF     0.00866 0.017 0.0087   **       Wilcoxon   5.5 8.7e-03
Then, you can plot it:
library(ggplot2)
library(ggpubr)
library(ggsignif)
ggplot(df, aes(x = location, y = value))+
geom_boxplot(aes(colour = location))+
ylim(0,6)+
geom_signif(data = as.data.frame(c), aes(xmin=group1, xmax=group2, annotations=labels, y_position=y_pos),
manual = TRUE)
Does it look like what you are trying to plot?
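The formatting rule from the question can also be isolated in a small helper and applied to the numeric p column before plotting. This is a sketch under assumptions: `format_p` is a hypothetical name, two decimals for non-significant values is a choice made here, and the `sub()` call only strips leading zeros from the exponent to match the notation asked for (3.20e-3 rather than 3.20e-03).

```r
# Hypothetical helper: scientific notation below 0.05, fixed notation otherwise
format_p <- function(p) {
  # formatC() works element-wise, e.g. 0.0032 becomes "3.20e-03"
  sci <- formatC(p, format = "e", digits = 2)
  # drop leading zeros of the exponent: "3.20e-03" becomes "3.20e-3"
  sci <- sub("e([+-])0+", "e\\1", sci)
  plain <- formatC(p, format = "f", digits = 2)
  ifelse(p < 0.05, sci, plain)
}

print(format_p(c(0.0032, 0.0000425, 0.2)))
```

With the compare_means result above, this could replace the sprintf() call, for example `mutate(labels = format_p(p))`. The comparison is made on the numeric `p` column, not on the character column `p.format`.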

How to add comparison bars to a plot to denote which comparison a p value corresponds to

I'm using the following data frame:
df1 <- structure(list(Genotype = structure(c(1L, 1L, 1L, 1L, 1L,
2L,2L,2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L,1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
.Label= c("miR-15/16 FL", "miR-15/16 cKO"), class = "factor"),
Tissue = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L), .Label = c("iLN", "Spleen", "Skin", "Colon"), class = "factor"),
`Cells/SC/Live/CD8—,, CD4+/Foxp3+,Median,<BV421-A>,CD127` = c(518L,
715L, 572L, 599L, 614L, 881L, 743L, 722L, 779L, 843L, 494L,
610L, 613L, 624L, 631L, 925L, 880L, 932L, 876L, 926L, 1786L,
2079L, 2199L, 2345L, 2360L, 2408L, 2509L, 3129L, 3263L, 3714L,
917L, NA, 1066L, 1059L, 939L, 1269L, 1047L, 974L, 1048L,
1084L)),
.Names = c("Genotype", "Tissue", "Cells/SC/Live/CD8—,,CD4+/Foxp3+,Median,<BV421-A>,CD127"),
row.names = c(NA, -40L), class = c("tbl_df", "tbl", "data.frame"))
and trying to make a plot using ggplot2 where box plots and points are displayed grouped by "Tissue" and interleaved by "Genotype". The significance values are displaying properly but I would like to add lines to denote the comparisons being made and have them start at the center of each "miR-15/16 FL" box plot and end at the center of each "miR-15/16 cKO" box plot and sit directly below the significance values. Below is the code I am using to generate the plot:
library(ggplot2)
library(ggpubr)
color.groups <- c("black","red")
names(color.groups) <- unique(df1$Genotype)
shape.groups <- c(16, 1)
names(shape.groups) <- unique(df1$Genotype)
ggplot(df1, aes(x = Tissue, y = df1[3], color = Genotype, shape = Genotype)) +
geom_boxplot(position = position_dodge(), outlier.shape = NA) +
geom_point(position=position_dodge(width=0.75)) +
ylim(0,1.2*max(df1[3], na.rm = TRUE)) +
ylab('MFI CD127 (of CD4+ Foxp3+ T cells') +
scale_color_manual(values=color.groups) +
scale_shape_manual(values=shape.groups) +
theme_bw() + theme(panel.border = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x=element_blank(), aspect.ratio = 1,
text = element_text(size = 9)) +
stat_compare_means(show.legend = FALSE, label = 'p.format', method = 't.test',
label.y = c(0.1*max(df1[3], na.rm = TRUE) + max(df1[3][c(1:10),], na.rm = TRUE),
0.1*max(df1[3], na.rm = TRUE) + max(df1[3][c(11:20),], na.rm = TRUE),
0.1*max(df1[3], na.rm = TRUE) + max(df1[3][c(21:30),], na.rm = TRUE),
0.1*max(df1[3], na.rm = TRUE) + max(df1[3][c(31:40),], na.rm = TRUE)))
Thanks for any help!

I've created the brackets with three calls to geom_segment. These calls use a new dmax data frame created to provide the reference y-values for positioning the brackets and the p-value labels. The values e and r are for tweaking these positions.
I've made a few other changes to your code.
Change the name of the third column to temp and use this name y=temp in the call to ggplot. Your original code uses y=df1[3], which essentially reaches outside the plot environment to the df1 object in the parent environment, which can cause problems. Also, having a short name to refer to makes it easier to generate the dmax data frame and refer to its columns.
Use the dmax data frame for label.y positions in stat_compare_means, which reduces the amount of code needed. (Incidentally, stat_compare_means seems to require hard-coded label.y positions, rather than getting them from an aes mapping of the data.)
Position the p-value labels an absolute distance above each pair of box plots (using the value e), rather than a multiplicative distance. This makes it easier to keep spacing consistent between p-value labels, brackets, and box plots.
library(dplyr) # for %>%, group_by() and summarise()
# Use a short column name for the third column
names(df1)[3] = "temp"
# Generate data frame of reference y-values for p-value labels and bracket positions
dmax = df1 %>% group_by(Tissue) %>%
summarise(temp=max(temp, na.rm=TRUE),
Genotype=NA)
# For tweaking position of brackets
e = 350
r = 0.6
w = 0.19
bcol = "grey30"
ggplot(df1, aes(x = Tissue, y = temp, color = Genotype, shape = Genotype)) +
geom_boxplot(position = position_dodge(), outlier.shape = NA) +
geom_point(position=position_dodge(width=0.75)) +
ylim(0,1.2*max(df1[3], na.rm = TRUE)) +
ylab('MFI CD127 (of CD4+ Foxp3+ T cells') +
scale_color_manual(values=color.groups) +
scale_shape_manual(values=shape.groups) +
theme_bw() + theme(panel.border = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x=element_blank(), aspect.ratio = 1,
text = element_text(size = 9)) +
stat_compare_means(show.legend = FALSE, label = 'p.format', method = 't.test',
label.y = e + dmax$temp) +
geom_segment(data=dmax,
aes(x=as.numeric(Tissue)-w, xend=as.numeric(Tissue)+w,
y=temp + r*e, yend=temp + r*e), size=0.3, color=bcol, inherit.aes=FALSE) +
geom_segment(data=dmax,
aes(x=as.numeric(Tissue) + w, xend=as.numeric(Tissue) + w,
y=temp + r*e, yend=temp + r*e - 60), size=0.3, color=bcol, inherit.aes=FALSE) +
geom_segment(data=dmax,
aes(x=as.numeric(Tissue) - w, xend=as.numeric(Tissue) - w,
y=temp + r*e, yend=temp + r*e - 60), size=0.3, color=bcol, inherit.aes=FALSE)
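The bracket geometry can be checked without plotting by computing the coordinates that the geom_segment calls use. The sketch below is illustrative only: the `brackets` data frame is a name introduced here, and the per-tissue maxima are typed in from the question's data rather than computed with dplyr.

```r
# Per-tissue maxima of the measured column, taken from the question's data
dmax <- data.frame(
  Tissue = factor(c("iLN", "Spleen", "Skin", "Colon"),
                  levels = c("iLN", "Spleen", "Skin", "Colon")),
  temp = c(881, 932, 3714, 1269)
)

# Same tweaking values as in the answer
e <- 350   # distance of the p-value label above the tallest point
r <- 0.6   # bracket height as a fraction of e
w <- 0.19  # half-width of the bracket around the category position

brackets <- data.frame(
  Tissue  = dmax$Tissue,
  x       = as.numeric(dmax$Tissue) - w,  # left end of the horizontal bar
  xend    = as.numeric(dmax$Tissue) + w,  # right end of the horizontal bar
  y       = dmax$temp + r * e,            # height of the horizontal bar
  label_y = dmax$temp + e                 # height of the p-value label
)
print(brackets)
```

With position_dodge(width = 0.75) and two genotypes, the box centres sit 0.75 / 4 = 0.1875 to either side of the category position, which is presumably why w is about 0.19.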
To address your comment, here's an example to show that the method above inherently adjusts to any number of x-categories.
Let's begin by adding two new tissue categories:
library(forcats)
df1$Tissue = fct_expand(df1$Tissue, "Tissue 5", "Tissue 6")
df1$Tissue[seq(1,20,4)] = "Tissue 5"
df1$Tissue[seq(21,40,4)] = "Tissue 6"
dmax = df1 %>% group_by(Tissue) %>%
summarise(temp=max(temp, na.rm=TRUE),
Genotype=NA)
Now run exactly the same plot code listed above; the brackets and p-value labels are then drawn for all six tissue categories.

How to fix the following output plot by R? [duplicate]

I have the following plot:
library(reshape)
library(ggplot2)
library(gridExtra)
require(ggplot2)
data2<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(15L, 11L, 29L, 42L, 0L, 5L, 21L,
22L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
p <- ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15))
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
q<- ggplot(data3, aes(x =factor(IR), y = value, fill = Legend, width=.15))
##the plot##
q + geom_bar(position='dodge', colour='black') + ylab('Frequency') + xlab('IR')+scale_fill_grey() +theme(axis.text.x=element_text(colour="black"), axis.text.y=element_text(colour="Black"))+ opts(title='', panel.grid.major = theme_blank(),panel.grid.minor = theme_blank(),panel.border = theme_blank(),panel.background = theme_blank(), axis.ticks.x = theme_blank())
I want the y-axis to display only integers. Whether this is accomplished through rounding or through a more elegant method isn't really important to me.

If you have the scales package, you can use pretty_breaks() without having to manually specify the breaks.
q + geom_bar(position='dodge', colour='black') +
scale_y_continuous(breaks= pretty_breaks())

This is what I use:
ggplot(data3, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = function(x) unique(floor(pretty(seq(0, (max(x) + 1) * 1.1)))))

With scale_y_continuous() and argument breaks= you can set the breaking points for y axis to integers you want to display.
ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15)) +
geom_bar(position='dodge', colour='black')+
scale_y_continuous(breaks=c(1,3,7,10))

You can use a custom labeller. For example, this function guarantees to only produce integer breaks:
int_breaks <- function(x, n = 5) {
l <- pretty(x, n)
l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}
Use as
+ scale_y_continuous(breaks = int_breaks)
It works by taking the default breaks, and only keeping those that are integers. If it is showing too few breaks for your data, increase n, e.g.:
+ scale_y_continuous(breaks = function(x) int_breaks(x, n = 10))

These solutions did not work for me and did not explain the solutions.
The breaks argument to the scale_*_continuous functions can be used with a custom function that takes the limits as input and returns breaks as output. By default, the axis limits will be expanded by 5% on each side for continuous data (relative to the range of data). The axis limits will likely not be integer values due to this expansion.
The solution I was looking for was to simply round the lower limit up to the nearest integer, round the upper limit down to the nearest integer, and then have breaks at integer values between these endpoints. Therefore, I used the breaks function:
brk <- function(x) seq(ceiling(x[1]), floor(x[2]), by = 1)
The required code snippet is:
scale_y_continuous(breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1))
The reproducible example from original question is:
data3 <-
structure(
list(
IR = structure(
c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L),
.Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"),
class = "factor"
),
variable = structure(
c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L),
.Label = c("Real queens", "Simulated individuals"),
class = "factor"
),
value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L),
Legend = structure(
c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
.Label = c("Real queens",
"Simulated individuals"),
class = "factor"
)
),
row.names = c(NA,-8L),
class = "data.frame"
)
ggplot(data3, aes(
x = factor(IR),
y = value,
fill = Legend,
width = .15
)) +
geom_col(position = 'dodge', colour = 'black') + ylab('Frequency') + xlab('IR') +
scale_fill_grey() +
scale_y_continuous(
breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1),
expand = expand_scale(mult = c(0, 0.05))
) +
theme(axis.text.x=element_text(colour="black", angle = 45, hjust = 1),
axis.text.y=element_text(colour="Black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.ticks.x = element_blank())

I found this solution from Joshua Cook and worked pretty well.
integer_breaks <- function(n = 5, ...) {
fxn <- function(x) {
breaks <- floor(pretty(x, n, ...))
names(breaks) <- attr(breaks, "labels")
breaks
}
return(fxn)
}
q + geom_bar(position='dodge', colour='black') +
scale_y_continuous(breaks = integer_breaks())
The source is:
https://joshuacook.netlify.app/post/integer-values-ggplot-axis/

You can use the accuracy argument of scales::label_number() or scales::label_comma() for this:
fakedata <- data.frame(
x = 1:5,
y = c(0.1, 1.2, 2.4, 2.9, 2.2)
)
library(ggplot2)
# without the accuracy argument, you see .0 decimals
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = scales::comma)
# with the accuracy argument, all displayed numbers are integers
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = ~ scales::comma(.x, accuracy = 1))
# equivalent
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = scales::label_comma(accuracy = 1))
# this works with scales::label_number() as well
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = scales::label_number(accuracy = 1))
Created on 2021-08-27 by the reprex package (v2.0.0.9000)
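Note that accuracy changes only how the labels are printed; it does not move the breaks. A minimal sketch, assuming the scales package is installed and using made-up break positions:

```r
# label_number() returns a function that turns numbers into label strings.
lab_int <- scales::label_number(accuracy = 1)

# Hypothetical break positions, some of them not whole numbers.
brks <- c(0, 2.5, 5, 7.5)

# The labels come back without decimals, but the ticks themselves
# would still sit at 2.5 and 7.5.
lab_int(brks)
```

If the ticks themselves must fall on whole numbers, this can be combined with one of the breaks-based answers.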

All of the existing answers seem to require custom functions or fail in some cases.
This line makes integer breaks:
bad_scale_plot +
scale_y_continuous(breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))
For more info, see the documentation ?labeling::extended (which is a function called by scales::breaks_extended).
Basically, the argument Q is a set of nice numbers that the algorithm tries to use for scale breaks. The original plot produces non-integer breaks (0, 2.5, 5, and 7.5) because the default value for Q includes 2.5: Q = c(1,5,2,2.5,4,3).
EDIT: as pointed out in a comment, non-integer breaks can occur when the y-axis has a small range. By default, breaks_extended() tries to make about n = 5 breaks, which is impossible when the range is too small. Quick testing shows that ranges wider than 0 < y < 2.5 give integer breaks (n can also be decreased manually).
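The function returned by breaks_extended() can be called directly to check what it proposes for a given range. A small sketch, assuming the scales package is installed; the 0 to 10 range is only an example value:

```r
# breaks_extended() returns a function of the axis limits.
int_q <- scales::breaks_extended(Q = c(1, 5, 2, 4, 3))

# Breaks proposed for a hypothetical axis running from 0 to 10.
b <- int_q(c(0, 10))
b

# TRUE when every proposed break is a whole number.
all(b %% 1 == 0)
```

Running the same check on a narrow range, such as c(0, 1), is a quick way to see the small-range caveat from the edit above.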

This answer builds on @Axeman's answer to address the comment by kory that if the data only goes from 0 to 1, no break is shown at 1. This seems to be caused by floating-point inaccuracy in pretty(): an output that prints as 1 is not identical to 1 (see the example at the end).
Therefore if you use
int_breaks_rounded <- function(x, n = 5) pretty(x, n)[round(pretty(x, n),1) %% 1 == 0]
with
+ scale_y_continuous(breaks = int_breaks_rounded)
both 0 and 1 are shown as breaks.
Example to illustrate difference from Axeman's
testdata <- data.frame(x = 1:5, y = c(0,1,0,1,1))
p1 <- ggplot(testdata, aes(x = x, y = y))+
geom_point()
p1 + scale_y_continuous(breaks = int_breaks)
p1 + scale_y_continuous(breaks = int_breaks_rounded)
Both will work with the data provided in the initial question.
Illustration of why rounding is required
pretty(c(0,1.05),5)
#> [1] 0.0 0.2 0.4 0.6 0.8 1.0 1.2
identical(pretty(c(0,1.05),5)[6],1)
#> [1] FALSE
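The same problem can also be handled without rounding to one decimal, by measuring the distance to the nearest integer. This is a sketch of an alternative, not code from either answer; is_whole and int_breaks_tol are hypothetical names, and the tolerance is the same square root of machine epsilon used in Axeman's function.

```r
# Hypothetical helper: TRUE where a value is within a small tolerance
# of the nearest whole number, whether slightly above or slightly below it.
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

# A value just below 1: x %% 1 is close to 1 rather than close to 0,
# so a check on x %% 1 alone rejects it.
x <- 1 - 1e-15
x %% 1 < sqrt(.Machine$double.eps)
is_whole(x)

# Hypothetical breaks function built on the helper; round() snaps the
# kept values onto exact integers.
int_breaks_tol <- function(x, n = 5) {
  l <- pretty(x, n)
  round(l[is_whole(l)])
}

int_breaks_tol(c(0, 1.05))
```

It is used like the other breaks functions: + scale_y_continuous(breaks = int_breaks_tol).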

Google brought me to this question. I'm trying to use real numbers in a y scale; the numbers are in the millions.
The comma() function from the scales package adds thousands separators to large numbers. This post on R-Bloggers explains a simple approach using it:
library(ggplot2)
library(scales)
big_numbers <- data.frame(x = 1:5, y = c(1000000:1000004))
big_numbers_plot <- ggplot(big_numbers, aes(x = x, y = y)) +
  geom_point()
big_numbers_plot + scale_y_continuous(labels = comma)
Enjoy R :)

One answer is already in the documentation of the pretty() function. As pointed out in "Setting axes to integer values in 'ggplot2'", the function already contains the solution; you just have to make it work for small values. One possibility is to write a new function, as the author does. For me, a lambda function inside the breaks argument works just as well:
... + scale_y_continuous(breaks = ~round(unique(pretty(.))))
It rounds the unique set of values generated by pretty(), creating only integer labels, no matter the scale of the values.
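One detail worth checking is the order of round() and unique(): rounding after unique() can put repeated values back into the vector. A small base R sketch with a made-up 0 to 1 range:

```r
# pretty() proposes fractional breaks for a narrow range.
p <- pretty(c(0, 1))
p

# Rounding after unique() leaves repeated values in the result.
round(unique(p))

# Taking unique() last keeps each integer once.
unique(round(p))
```

With the lambda form, this ordering would be written as breaks = ~unique(round(pretty(.))).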

If your values are integers, here is another way of doing this with group = 1 and as.factor(value):
library(tidyverse)
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
data3 %>%
mutate(value = as.factor(value)) %>%
ggplot(aes(x =factor(IR), y = value, fill = Legend, width=.15)) +
geom_col(position = 'dodge', colour='black', group = 1)
Created on 2022-04-05 by the reprex package (v2.0.1)

This is what I did
scale_x_continuous(labels = function(x) round(as.numeric(x)))

How to display only integer values on an axis using ggplot2

I have the following plot:
library(reshape)
library(ggplot2)
library(gridExtra)
require(ggplot2)
data2<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(15L, 11L, 29L, 42L, 0L, 5L, 21L,
22L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
p <- ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15))
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
q<- ggplot(data3, aes(x =factor(IR), y = value, fill = Legend, width=.15))
## the plot ##
q + geom_col(position = 'dodge', colour = 'black') +
  ylab('Frequency') + xlab('IR') +
  scale_fill_grey() +
  labs(title = '') +
  theme(axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(),
        axis.ticks.x = element_blank())
I want the y-axis to display only integers. Whether this is accomplished through rounding or through a more elegant method isn't really important to me.

If you have the scales package, you can use pretty_breaks() without having to specify the breaks manually.
library(scales)
q + geom_col(position = 'dodge', colour = 'black') +
  scale_y_continuous(breaks = pretty_breaks())

This is what I use:
ggplot(data3, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = function(x) unique(floor(pretty(seq(0, (max(x) + 1) * 1.1)))))

With scale_y_continuous() and the breaks argument, you can set the break points of the y axis to the integers you want to display.
ggplot(data2, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
  geom_col(position = 'dodge', colour = 'black') +
  scale_y_continuous(breaks = c(1, 3, 7, 10))

You can use a custom breaks function. For example, this function is guaranteed to produce only integer breaks:
int_breaks <- function(x, n = 5) {
  l <- pretty(x, n)
  l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}
Use as
+ scale_y_continuous(breaks = int_breaks)
It works by taking the default breaks, and only keeping those that are integers. If it is showing too few breaks for your data, increase n, e.g.:
+ scale_y_continuous(breaks = function(x) int_breaks(x, n = 10))

