I have the following plot:
library(reshape)
library(ggplot2)
library(gridExtra)
require(ggplot2)
data2<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(15L, 11L, 29L, 42L, 0L, 5L, 21L,
22L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
p <- ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15))
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
q<- ggplot(data3, aes(x =factor(IR), y = value, fill = Legend, width=.15))
##the plot##
q + geom_col(position = 'dodge', colour = 'black') +
ylab('Frequency') + xlab('IR') +
scale_fill_grey() +
labs(title = '') +
# opts() and theme_blank() were removed from ggplot2; theme() and element_blank() replace them
theme(axis.text.x = element_text(colour = "black"),
axis.text.y = element_text(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.ticks.x = element_blank())
I want the y-axis to display only integers. Whether this is accomplished through rounding or through a more elegant method isn't really important to me.
If you have the scales package, you can use pretty_breaks() without having to manually specify the breaks.
q + geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = scales::pretty_breaks())
This is what I use:
ggplot(data3, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = function(x) unique(floor(pretty(seq(0, (max(x) + 1) * 1.1)))))
With scale_y_continuous() and the breaks argument you can set the break points of the y axis to the integers you want to display.
ggplot(data2, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = c(1, 3, 7, 10))
You can use a custom breaks function. For example, this function is guaranteed to produce only integer breaks:
int_breaks <- function(x, n = 5) {
l <- pretty(x, n)
l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}
Use as
+ scale_y_continuous(breaks = int_breaks)
It works by taking the default breaks, and only keeping those that are integers. If it is showing too few breaks for your data, increase n, e.g.:
+ scale_y_continuous(breaks = function(x) int_breaks(x, n = 10))
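To see what the filter does without drawing a plot, the function can be called directly on a pair of limits. The sketch below repeats int_breaks from above and applies it to made-up limits (c(0, 3.15) is an assumption chosen for illustration, not taken from the question's data).

```r
# int_breaks as defined in the answer above
int_breaks <- function(x, n = 5) {
  l <- pretty(x, n)
  l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}

# Hypothetical axis limits, chosen only for illustration
limits <- c(0, 3.15)

all_breaks <- pretty(limits, 5)     # candidate breaks, may include fractions
kept_breaks <- int_breaks(limits)   # only the whole-number candidates survive

print(all_breaks)
print(kept_breaks)
```

Comparing the two printed vectors shows which candidates were dropped. If too few remain, a larger n gives pretty() more candidates to choose from.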
The other answers did not work for me, and they did not explain how their solutions work.
The breaks argument to the scale_*_continuous functions can be used with a custom function that takes the limits as input and returns breaks as output. By default, the axis limits will be expanded by 5% on each side for continuous data (relative to the range of data). The axis limits will likely not be integer values due to this expansion.
The solution I was looking for was to simply round the lower limit up to the nearest integer, round the upper limit down to the nearest integer, and then have breaks at integer values between these endpoints. Therefore, I used the breaks function:
brk <- function(x) seq(ceiling(x[1]), floor(x[2]), by = 1)
The required code snippet is:
scale_y_continuous(breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1))
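As a quick check, the break function can be evaluated on limits like those ggplot2 would pass in. The limits below are hypothetical: data from 0 to 10 with the default 5% expansion on each side gives -0.5 and 10.5. The guarded variant brk_safe is a hypothetical addition, not part of the answer. It returns no breaks when the limits contain no integer, a case in which the plain seq() call stops with an error.

```r
brk <- function(x) seq(ceiling(x[1]), floor(x[2]), by = 1)

# Hypothetical limits: data range 0..10 expanded by 5% on each side
brk(c(-0.5, 10.5))
#> [1]  0  1  2  3  4  5  6  7  8  9 10

# Hypothetical guard: if no integer lies inside the limits, ceiling() exceeds
# floor() and seq() would fail, so return an empty vector instead
brk_safe <- function(x) {
  lo <- ceiling(x[1])
  hi <- floor(x[2])
  if (lo > hi) numeric(0) else seq(lo, hi, by = 1)
}

brk_safe(c(0.2, 0.8))
#> numeric(0)
```

Because this places a break at every integer, wide ranges produce many ticks. A larger by value may be preferable there.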
The reproducible example from the original question is:
data3 <-
structure(
list(
IR = structure(
c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L),
.Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"),
class = "factor"
),
variable = structure(
c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L),
.Label = c("Real queens", "Simulated individuals"),
class = "factor"
),
value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L),
Legend = structure(
c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
.Label = c("Real queens",
"Simulated individuals"),
class = "factor"
)
),
row.names = c(NA,-8L),
class = "data.frame"
)
ggplot(data3, aes(
x = factor(IR),
y = value,
fill = Legend,
width = .15
)) +
geom_col(position = 'dodge', colour = 'black') + ylab('Frequency') + xlab('IR') +
scale_fill_grey() +
scale_y_continuous(
breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1),
expand = expansion(mult = c(0, 0.05))
) +
theme(axis.text.x=element_text(colour="black", angle = 45, hjust = 1),
axis.text.y=element_text(colour="Black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.ticks.x = element_blank())
I found this solution from Joshua Cook, and it worked pretty well.
integer_breaks <- function(n = 5, ...) {
fxn <- function(x) {
breaks <- floor(pretty(x, n, ...))
names(breaks) <- attr(breaks, "labels")
breaks
}
return(fxn)
}
q + geom_col(position = 'dodge', colour = 'black') +
scale_y_continuous(breaks = integer_breaks())
The source is:
https://joshuacook.netlify.app/post/integer-values-ggplot-axis/
You can use the accuracy argument of scales::label_number() or scales::label_comma() for this:
fakedata <- data.frame(
x = 1:5,
y = c(0.1, 1.2, 2.4, 2.9, 2.2)
)
library(ggplot2)
# without the accuracy argument, you see .0 decimals
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = scales::comma)
# with the accuracy argument, all displayed numbers are integers
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = ~ scales::comma(.x, accuracy = 1))
# equivalent
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = scales::label_comma(accuracy = 1))
# this works with scales::label_number() as well
ggplot(fakedata, aes(x = x, y = y)) +
geom_point() +
scale_y_continuous(label = scales::label_number(accuracy = 1))
Created on 2021-08-27 by the reprex package (v2.0.0.9000)
All of the existing answers seem to require custom functions or fail in some cases.
This line makes integer breaks:
bad_scale_plot +
scale_y_continuous(breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))
For more info, see the documentation ?labeling::extended (which is a function called by scales::breaks_extended).
Basically, the argument Q is a set of nice numbers that the algorithm tries to use for scale breaks. The original plot produces non-integer breaks (0, 2.5, 5, and 7.5) because the default value for Q includes 2.5: Q = c(1,5,2,2.5,4,3).
EDIT: as pointed out in a comment, non-integer breaks can occur when the y-axis has a small range. By default, breaks_extended() tries to make about n = 5 breaks, which is impossible when the range is too small. Quick testing shows that ranges wider than 0 < y < 2.5 give integer breaks (n can also be decreased manually).
This answer builds on @Axeman's answer to address the comment by kory that if the data only goes from 0 to 1, no break is shown at 1. This seems to be caused by floating-point imprecision in pretty(): an output that prints as 1 is not identical to 1 (see the example at the end).
Therefore if you use
int_breaks_rounded <- function(x, n = 5) pretty(x, n)[round(pretty(x, n),1) %% 1 == 0]
with
+ scale_y_continuous(breaks = int_breaks_rounded)
both 0 and 1 are shown as breaks.
Example to illustrate difference from Axeman's
testdata <- data.frame(x = 1:5, y = c(0,1,0,1,1))
p1 <- ggplot(testdata, aes(x = x, y = y))+
geom_point()
p1 + scale_y_continuous(breaks = int_breaks)
p1 + scale_y_continuous(breaks = int_breaks_rounded)
Both will work with the data provided in the initial question.
Illustration of why rounding is required
pretty(c(0,1.05),5)
#> [1] 0.0 0.2 0.4 0.6 0.8 1.0 1.2
identical(pretty(c(0,1.05),5)[6],1)
#> [1] FALSE
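Another way to make the integer test robust to this kind of floating-point noise is to compare each break with its nearest integer rather than taking the remainder. The helper is_whole and the function int_breaks_tol below are hypothetical names for this sketch, and the tolerance is an assumption.

```r
# TRUE when a value is within `tol` of the nearest integer (hypothetical helper)
is_whole <- function(v, tol = sqrt(.Machine$double.eps)) {
  abs(v - round(v)) < tol
}

# A value just below 1 has a remainder close to 1, so a remainder test rejects it
almost_one <- 1 - 1e-12
remainder_test <- abs(almost_one %% 1) < sqrt(.Machine$double.eps)  # FALSE
distance_test <- is_whole(almost_one)                               # TRUE

# Breaks function built on the distance test (hypothetical name);
# round() snaps the kept values onto exact integers
int_breaks_tol <- function(x, n = 5) {
  l <- pretty(x, n)
  round(l[is_whole(l)])
}

int_breaks_tol(c(0, 1.05))
```

It can be passed to the scale in the same way as the other break functions, via the breaks argument.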
Google brought me to this question. I'm trying to use real numbers in a y scale. The y scale numbers are in millions.
The scales package comma method introduces a comma to my large numbers. This post on R-Bloggers explains a simple approach using the comma method:
library(scales)
big_numbers <- data.frame(x = 1:5, y = c(1000000:1000004))
big_numbers_plot <- ggplot(big_numbers, aes(x = x, y = y))+
geom_point()
big_numbers_plot + scale_y_continuous(labels = comma)
Enjoy R :)
One answer is indeed in the documentation of the pretty() function. As pointed out in Setting axes to integer values in 'ggplot2', the function already contains the solution; you just have to make it work for small values. One possibility is to write a new function, as the author does. For me, a lambda function inside the breaks argument just works:
... + scale_y_continuous(breaks = ~ round(unique(pretty(.))))
It will round the unique set of values generated by pretty() creating only integer labels, no matter the scale of values.
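One detail worth checking with this approach is the order of round() and unique(). Rounding after de-duplicating can reintroduce repeated values when several fractional breaks round to the same integer. The vector below is a made-up set of candidate breaks, not output from the question's data.

```r
# Hypothetical candidate breaks for a small range
candidates <- c(0, 0.2, 0.4, 0.6, 0.8, 1)

round(unique(candidates))   # rounding last leaves repeats
#> [1] 0 0 0 1 1 1

unique(round(candidates))   # de-duplicating last gives each integer once
#> [1] 0 1
```

In a scale, the second ordering would read breaks = ~ unique(round(pretty(.))).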
If your values are integers, here is another way of doing this with group = 1 and as.factor(value):
library(tidyverse)
data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens",
"Simulated individuals"), class = "factor")), .Names = c("IR",
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
data3 %>%
mutate(value = as.factor(value)) %>%
ggplot(aes(x =factor(IR), y = value, fill = Legend, width=.15)) +
geom_col(position = 'dodge', colour='black', group = 1)
Created on 2022-04-05 by the reprex package (v2.0.1)
This is what I did
scale_x_continuous(labels = function(x) round(as.numeric(x)))
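Note that rounding the labels changes only the text, not the tick positions. A small sketch with made-up break positions shows the effect. R rounds exact halves to the even neighbour, so 2.5 becomes 2 while 7.5 becomes 8.

```r
# Hypothetical tick positions that a continuous scale might choose
tick_positions <- c(0, 2.5, 5, 7.5)

# The labelling function from the answer above
round_labels <- function(x) round(as.numeric(x))

tick_labels <- round_labels(tick_positions)
data.frame(position = tick_positions, label = tick_labels)
```

The tick labelled 2 still sits at 2.5, so setting integer breaks is the safer route when the positions matter.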
I have this data:
razao1 <- structure(list(Sexo = structure(c(1L, 2L, 1L, 2L), .Label = c("Female",
"Male"), class = "factor"), Status = structure(c(1L, 1L, 2L,
2L), .Label = c("Active", "Terminated"), class = "factor"), Freq = c(1367L,
7030L, 394L, 940L)), class = "data.frame", row.names = c(NA,
-4L))
When I try to plot it, the result is all wrong. What am I doing wrong?
razao1%>%
ggplot(aes(x=as.factor(Status), fill = factor(Status)))+
geom_bar(position = "fill")+
facet_wrap(~Sexo)
If by "it shows all wrong" you mean that you're only getting y-values of 1, then it's likely due to not having a y-value in the original ggplot call.
Try this:
razao1%>%
ggplot(aes(x=Status, y = Freq, fill = Status))+
geom_bar(stat="identity")+
facet_wrap(~Sexo)
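If the goal of position = "fill" was to show proportions rather than counts, the shares can also be computed up front. This is a sketch under that assumption. The column name Prop is hypothetical, and the data frame repeats the values from the question.

```r
razao1 <- data.frame(
  Sexo   = factor(c("Female", "Male", "Female", "Male")),
  Status = factor(c("Active", "Active", "Terminated", "Terminated")),
  Freq   = c(1367L, 7030L, 394L, 940L)
)

# Share of each Status within each Sexo (Prop is a hypothetical column name)
razao1$Prop <- razao1$Freq / ave(razao1$Freq, razao1$Sexo, FUN = sum)

razao1
```

Mapping y = Prop in place of y = Freq would then plot the shares, which sum to 1 within each facet.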
Which will generate this plot:
I am working on the dataset reported here below (pre.sss)
pre.sss <- pre.sss <- structure(list(Pretest.num = c(63, 62, 61, 60, 59, 58, 57, 4,2, 1), stress = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L,1L), .Label = c("[0,6]", "(6,9]"), class = "factor"), time = c(1L,1L, 1L, 1L, 1L, 1L, 1L, 8L, 8L, 8L), after = structure(c(2L,2L, 2L, 2L, 2L, 2L, 1L, 1L, NA, 1L), .Label = c("no", "yes"), class = "factor"),id = c("call_fam", "call_fam", "call_fam", "call_fam", "call_fam","call_fam", "call_fam", "counselor", "counselor", "counselor")), .Names = c("Pretest.num", "stress", "time", "after","id"), reshapeLong = structure(list(varying = structure(list(after = c("after.call.fam", "after.speak", "after.send.email","after.send.card", "after.attend", "after.fam.mtg", "after.sup.grp","after.counselor")), .Names = "after", v.names = "after", times = 1:8),v.names = "after", idvar = "Pretest.num", timevar = "time"), .Names = c("varying","v.names", "idvar", "timevar")), row.names = c("63.1", "62.1","61.1", "60.1", "59.1", "58.1", "57.1", "4.8", "2.8", "1.8"), class = "data.frame")
and I need to plot the counts of several categorical variables according to a specific level of another categorical variable ('stress'), so a faceted bubble plot would do the job in my case.
So what I do is the following:
ylabels = c('call_fam' = "call fam.member for condolences",
'speak' = "speak to fam.member in person",
'send.email' = "send condolence email to fam.member",
'send.card' = "send condolence card/letter to fam.member",
'attend' = "attend funeral/wake",
'fam.mtg' = "provide fam.meeting",
'sup.grp' = "suggest attending support grp.",
'counselor' = "make referral to bereavement counselor" )
p = ggplot(pre.sss, aes(x = after, y = id)) +
geom_count(alpha = 0.5, col = 'darkblue') +
scale_size(range = c(1,30)) +
theme(legend.position = 'none') +
xlab("Response") +
ylab("What did you do after learning about death?") +
scale_y_discrete(labels = ylabels) +
facet_grid(.~ pre.sss$stress, labeller = as_labeller(stress.labels))
and I obtain the following image, exactly as I want.
Now I would like to label each bubble with the count with which the corresponding data appear in the dataset.
dat = data.frame(ggplot_build(p)$data[[1]][, c('x', 'y', 'PANEL', 'n')])
dat$PANEL = ifelse(dat$PANEL==1, "[0,6]", "(6-9]")
colnames(dat) = c('x', 'y', 'stress', 'n')
p + geom_text(aes(x, y, label = n, group = NULL), data = dat)
This gives me the following error I really can't understand.
> p + geom_text(aes(x, y, label=n, group=NULL), data=dat)
Error in `$<-.data.frame`(`*tmp*`, "PANEL", value = c(1L, 1L, 1L, 1L, :
replacement has 504 rows, data has 46
Can anybody help me with this?
Thanks!
EM
The labeller function you refer to (stress.labels) is still missing from this example. geom_count uses stat_sum, which calculates a variable n, the number of observations at each point. Because you can use this computed variable directly, you don't have to assign the plot to a variable and pull out its data, as you did with ggplot_build.
This should do what you're looking for:
ggplot(pre.sss, aes(x = after, y = id)) +
geom_count(alpha = 0.5, col = 'darkblue') +
# note the following line
stat_sum(mapping = aes(label = after_stat(n)), geom = "text") +
scale_size(range = c(1,30)) +
theme(legend.position = 'none') +
xlab("Response") +
ylab("What did you do after learning about death?") +
scale_y_discrete(labels = ylabels) +
facet_grid(.~ stress)
The line I added computes the same thing as what's behind the scenes in geom_count, but gives it a text geom instead, with the label mapped to that computed parameter n.
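The count that stat_sum attaches to each point is the number of rows sharing the same x, y, and facet values. A rough base R equivalent on a small made-up data frame (not the pre.sss data) looks like this. The column n is created by hand here to mirror the computed variable.

```r
# Hypothetical miniature of the data: one row per response
responses <- data.frame(
  after  = c("yes", "yes", "no", "no", "yes"),
  id     = c("call_fam", "call_fam", "call_fam", "counselor", "counselor"),
  stress = c("[0,6]", "[0,6]", "[0,6]", "(6,9]", "(6,9]")
)

# Give every row a weight of 1, then sum the weights per x/y/facet combination
responses$n <- 1L
counts <- aggregate(n ~ after + id + stress, data = responses, FUN = sum)

counts
```

Each row of counts corresponds to one bubble, and its n is the number that the text layer would print on it.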