Count and axis labels on stat_bin2d with ggplot - r

I am trying to make a 2D histogram with the individual bins showing both the bin contents and a gradient. The data are integers ranging from 0 to 4 (only) in both axes.
I tried working with this answer but ran into a few issues. First, a few bins end up with no gradient at all. In the MWE below, the bottom-left bins with counts 130 and 60 seem to be blank. Second, the bins are shifted below 0 on both axes. For this axis issue, I found I could simply add 0.5 to both x and y. In the end, though, I would also like the axis labels to be centered within each bin, and adding 0.5 does not address that.
library(ggplot2)
# Construct the data to be plotted
x <- c(rep(0,190),rep(1,50),rep(2,10),rep(3,40))
y <- c(rep(0,130),rep(1,80),rep(2,30),rep(3,10),rep(4,40))
data <- data.frame(x,y)
# Taken from the example
ggplot(data, aes(x = x, y = y)) +
geom_bin2d(binwidth=1) +
stat_bin2d(geom = "text", aes(label = ..count..), binwidth=1) +
scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
xlim(-1, 5) +
ylim(-1, 5) +
coord_equal()
Is there something obvious I am doing wrong in both the color gradients and axis labels? I am also not married to ggplot or stat_bin2d if there is a better way to do it with some other package/command. Thanks in advance!

stat_bin2d uses the cut function to create the bins. By default, cut creates bins that are open on the left and closed on the right. stat_bin2d also sets include.lowest=TRUE so that the lowest interval will be closed on the left also. I haven't looked through the code for stat_bin2d to try and figure out exactly what's going wrong, but it seems like it has to do with how the breaks in cut are being chosen. In any case, you can get the desired behavior by setting the bin breaks explicitly to start at -1. For example:
ggplot(data, aes(x = x, y = y)) +
geom_bin2d(breaks=c(-1:4)) +
stat_bin2d(geom = "text", aes(label = ..count..), breaks=c(-1:4)) +
scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
xlim(-1, 5) +
ylim(-1, 5) +
coord_equal()
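The interval convention described above can be checked directly with base R's cut(). This is only a sketch of the convention (right-closed bins plus include.lowest = TRUE), not a reproduction of what stat_bin2d does internally; the vector `vals` is made up for illustration.

```r
# Hypothetical values on the same 0-4 integer grid as the question's data
vals <- 0:4

# Right-closed bins, with the lowest interval also closed on the left
bins <- cut(vals, breaks = -1:4, right = TRUE, include.lowest = TRUE)

# Inspect the interval labels and how many values fall in each
levels(bins)
table(bins)
```

With breaks starting at -1, the value 0 lands in the first interval and every later integer sits on the right edge of its own interval, so each integer gets a bin to itself.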
To center the tiles on the integer lattice points, set the breaks to half-integer values:
ggplot(data, aes(x = x, y = y)) +
geom_bin2d(breaks=seq(-0.5,4.5,1)) +
stat_bin2d(geom = "text", aes(label = ..count..), breaks=seq(-0.5,4.5,1)) +
scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
scale_x_continuous(breaks=0:4, limits=c(-0.5,4.5)) +
scale_y_continuous(breaks=0:4, limits=c(-0.5,4.5)) +
coord_equal()
Or, to emphasize that the values are discrete, set the bins to be half a unit wide:
ggplot(data, aes(x = x, y = y)) +
geom_bin2d(breaks=seq(-0.25,4.25,0.5)) +
stat_bin2d(geom = "text", aes(label = ..count..), breaks=seq(-0.25,4.25,0.5)) +
scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
scale_x_continuous(breaks=0:4, limits=c(-0.25,4.25)) +
scale_y_continuous(breaks=0:4, limits=c(-0.25,4.25)) +
coord_equal()
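Since the question is open to alternatives, another option is to count the (x, y) pairs yourself and draw one tile per pair, which sidesteps bin edges entirely. The following is a sketch under that assumption, reusing the question's data; the names `counts` and `p` are introduced here for illustration only.

```r
library(ggplot2)

# Same data as in the question
x <- c(rep(0, 190), rep(1, 50), rep(2, 10), rep(3, 40))
y <- c(rep(0, 130), rep(1, 80), rep(2, 30), rep(3, 10), rep(4, 40))

# Count each (x, y) combination and keep only the non-empty cells
counts <- as.data.frame(table(x = x, y = y), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$x <- as.integer(counts$x)
counts$y <- as.integer(counts$y)

# geom_tile() centers each tile on its (x, y) value
p <- ggplot(counts, aes(x = x, y = y, fill = Freq)) +
  geom_tile() +
  geom_text(aes(label = Freq)) +
  scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
  scale_x_continuous(breaks = 0:4) +
  scale_y_continuous(breaks = 0:4) +
  coord_equal()
p
```

Because the counting happens before plotting, the `counts` data frame can also be inspected on its own to confirm that the 130 and 60 cells are present.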

Related

Create plot with only x axis

I want to create a bubble plot without the y axis, meaning the x axis represents a range between certain values and the size of the bubbles corresponds to a "number" variable.
Since geom_point() requires a y variable, I created a new column with only zero values and assigned it to the y axis.
ggplot(df, aes(x=range, y=new, size = numberPoints)) +
geom_point(alpha=0.5, shape=19) +
scale_size(range = c(.1, 24)) +
scale_y_continuous(breaks = NULL)
However, it gave the following result (the y axis is too large):
I only wanted the bubbles above the x axis (without too much space), but I can't find a way to do it.
You can use coord_fixed() to shrink the y axis. It fixes the ratio between y units and x units, so the panel no longer stretches vertically:
library(dplyr)
library(ggplot2)
data.frame(x = c(1,2,3,4), size = c(1,1,4,8)) %>%
ggplot(aes(x=x, y=1, size = size)) +
geom_point(alpha=0.5, shape=19) +
scale_size(range = c(.1, 24)) +
scale_y_continuous(breaks = NULL)+
coord_fixed(6)

ground geom_text to x axis (e.g. y =0)

I have a graph made in ggplot that looks like this:
I want the numeric label on each bar to be anchored to the x axis, at y <= 0.
This is the code to generate the graph as such:
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=numofpics, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels = as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
I've tried vjust and experimented with position_nudge for the geom_text element, but every solution I can find moves each label relative to its current position. As a result, everything I try ends up in a situation like this one:
How can I make ggplot anchor the text to the x axis at y <= 0, ideally also with angle = 45?
Link to dataframe = https://drive.google.com/file/d/1b-5AfBECap3TZjlpLhl1m3v74Lept2em/view?usp=sharing
As I said in the comments, just set the y-coordinate of the text to 0 or below, and specify the angle: geom_text(aes(x=row, y=-100, label=bbch), angle=45)
I'm behind a proxy server that blocks connections to Google Drive, so I can't access your data and can't test this. That said, I would add a new label field to the dataset that is set to 0 wherever numofpics is below 0:
library(dplyr)
df <- df %>%
mutate(labelField = if_else(numofpics < 0, 0, numofpics))
I would then use this label field in my geom_text call:
geom_text(aes(x=row, y=labelField, label=bbch), angle = 45)
Hope that helps.
You can simply define the y-value in geom_text (e.g. -50)
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(aes(x = row, y=numofparcels*50, group = 2), alpha = 0.25) +
geom_line(aes(x = row, y=numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=-50, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(), breaks = unique(df$crop), labels = as.character(unique(df$crop))) +
labs(x=c(), y="Number of Pictures")
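Because the asker's data is not available, here is a self-contained sketch with a made-up data frame (the column names mirror the question, the values are invented). It assumes the goal is simply to pin every label to one fixed height, and it uses hjust = 1 so the end of each rotated label is anchored at that height.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data frame
df <- data.frame(
  row       = factor(1:5),
  numofpics = c(120, 340, 80, 260, 410),
  bbch      = c(11, 23, 35, 51, 65)
)

p <- ggplot(df, aes(x = row)) +
  geom_col(aes(y = numofpics)) +
  # A constant y pins every label to the same height, whatever the bar height
  geom_text(aes(y = 0, label = bbch), angle = 45, hjust = 1, vjust = 1)
p
```

The key point is that y inside geom_text() is a constant rather than numofpics, so nudging or vjust is no longer needed to compensate for bar height.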

How to switch y axis labels in ggplot without reversing plot?

I'm trying to plot mean values for species, but the mean values are all negative. I want the smaller (more negative) values to sit towards the bottom of the y axis and the larger (less negative) values to sit higher up.
I've tried changing coord_cartesian and ylim, and neither works.
ggplot(meanWUE, aes(x = Species, y = mean, fill = Species)) +
coord_cartesian(ylim = c(-0.8, -0.7)) +
scale_fill_manual(values = c("EUCCHR" = "darkolivegreen2","ESCCAL" = "darkgoldenrod2", "ARTCAL" = "darkcyan", "DEIFAS" = "darkred", "ENCCAL" = "darkorchid2", "SALMEL" = "deepskyblue1", "ERIFAS" = "blue3", "BRANIG" = "azure3", "PHAPAR"= "palevioletred" )) +
scale_y_reverse() +
geom_bar(position = position_dodge(), stat="identity") +
geom_errorbar(aes(ymin=mean-se, ymax=mean+se),width=.3) +
labs(x="Species", y="WUE")+
theme_bw() +
theme(panel.grid.major = element_blank(), legend.position = "none")
I want ESCCAL and EUCCHR to be the shortest bars essentially, but currently they're being shown as the tallest.
Species vs water use efficiency
If I don't do scale_y_reverse, I get a plot that looks like this second image
One approach is to shift all the numbers to show their value over a baseline, and then adjust the labeling the same way:
df <- data.frame(Species = LETTERS[1:10],
mean = -80:-71/100)
ggplot(df, aes(x = Species, y = mean, fill = Species)) +
geom_bar(position = position_dodge(), stat="identity")
Here we shift the values to show them against a new baseline. Then we can show larger numbers as larger bars the way we'd normally expect for positive numbers. At the same time, we change the labels on the y axis so they correspond to the original values. So -0.8 becomes +0.1 vs. a baseline of -0.9. But we adjust the labels too, so that adjusted 0 has a label of -0.9, and adjusted +0.1 has a label of -0.8, its original value.
baseline <- -0.9
ggplot(df, aes(x = Species, y = mean - baseline, fill = Species)) +
geom_bar(position = position_dodge(), stat="identity") +
scale_y_continuous(breaks = 0:100*0.02,
labels = 0:100*0.02 + baseline, minor_breaks = NULL)
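The arithmetic behind the shift can be checked on its own in base R. This is a small sketch using the same baseline of -0.9; the three values in `original` are hypothetical and only chosen to lie in the example's range.

```r
baseline <- -0.9

# Hypothetical means in the same range as the example data
original <- c(-0.80, -0.75, -0.71)

# Bar heights that are actually drawn: distance above the baseline
shifted <- original - baseline

# Axis breaks in shifted units, and the labels shown for them
breaks <- seq(0, 0.2, by = 0.05)
labels <- breaks + baseline

data.frame(original, shifted)
data.frame(breaks, labels)
```

The less negative a value is, the larger its shifted height, while the labels map each break back to the original scale.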

Binning not correct? Different amount of counts

I have two vectors of values, both with the same number of entries. When these vectors are plotted as histograms, the distributions should show counts against values. I'm not sure whether I am misinterpreting something or plotting it wrong, but in my understanding the red values should not exceed the green values everywhere. If both vectors have the same number of entries, then wherever one distribution is higher, the other must be lower somewhere else. Or not?
The plot command:
number_ticks<- function(n) {function(limits) pretty(limits, n)}
ggplot(data, aes(x = value, fill = Parameter)) +
geom_histogram(
binwidth = 0.25,
color = "black",
alpha = 0.75) +
theme_classic() +
theme(legend.position = c(0.21, 0.85)) +
labs(title = "",
x = TeX("$\\Delta U_{bias} / V$")) +
scale_x_continuous(breaks = number_ticks(20)) +
guides(fill=guide_legend(title="Parameter"))
Currently the red histogram goes on top of the green one: they are stacked. That is, position = "stack" is the default option in geom_histogram, while you want to use position = "identity".
For instance, compare
ggplot(diamonds, aes(price, fill = cut)) +
geom_histogram(binwidth = 500)
with
ggplot(diamonds, aes(price, fill = cut)) +
geom_histogram(binwidth = 500, position = "identity", alpha = 0.5)
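The effect of stacking can also be seen from the bin counts alone. The sketch below uses base R's hist() on two simulated samples (hypothetical data, not the asker's) with shared bin edges; geom_histogram may align its bins slightly differently, so treat this as an illustration of the idea rather than of ggplot2's exact binning.

```r
set.seed(42)

# Two hypothetical samples with the same number of entries
a <- rnorm(200, mean = 0)
b <- rnorm(200, mean = 0.5)

# Shared bin edges that cover both samples
both   <- c(a, b)
breaks <- seq(floor(min(both)), ceiling(max(both)), by = 0.25)

counts_a <- hist(a, breaks = breaks, plot = FALSE)$counts
counts_b <- hist(b, breaks = breaks, plot = FALSE)$counts

# position = "stack" draws the second group on top of the first,
# so its upper edge is the sum of both counts
stacked_top <- counts_a + counts_b

# position = "identity" draws each group from zero instead
cbind(counts_a, counts_b, stacked_top)
```

Both samples have the same total count, yet the stacked upper edge is never below either individual count, which is why one color appears to exceed the other everywhere.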

Dual "y" axis in ggplot2 plot [duplicate]

I know this topic has come up several times in other threads on this site, but even after following their instructions I have not managed to solve it. I have been working on this seemingly trivial problem for a week and cannot find a way.
I do not know whether it is down to differences in the plots or whether I am doing something wrong. The case is as follows. I have two plots made with the ggplot2 package:
library(ggplot2)
data<-data.frame(Age=0,var2=0,var1=0,inf=0,sup=0,ppv=0)
data[1,]<-c(1,1,0.857,0.793,0.904,0.03)
data[2,]<-c(1,2,0.771 ,0.74,0.799,0.056)
data[3,]<-c(1,3,0.763 ,0.717,0.804,0.06)
data[4,]<-c(1,4,0.724 ,0.653,0.785,0.09)
data[5,]<-c(2,1,0.906,0.866,0.934,0.055)
data[6,]<-c(2,2,0.785 ,0.754,0.813,0.067)
data[7,]<-c(2,3,0.660,0.593,0.722,0.089)
data[8,]<-c(2,4,0.544,0.425,0.658,0.123)
pd <- position_dodge(0.2) #
names(data)<-c("Age","var2","var1","inf","sup","ppv")
data$Age<-as.character(data$Age)
data$var2<-as.character(data$var2)
p<- ggplot(data, aes(x=var2, y=var1, colour=Age)) +
geom_errorbar(aes(ymin=inf, ymax=sup), width=.1 , position=pd) +
geom_line(position=pd,aes(group=Age),linetype=c("dashed")) +
geom_point(position=pd,size=3) +
theme_light()+
ylim(0,1) +
scale_color_manual(values=c("1"="grey55","2"="grey15")) +
guides(fill=guide_legend(nrow=2,byrow=TRUE))
s<- ggplot(data, aes(x=var2, y=ppv, colour=Age)) +
geom_line(position=pd,aes(group=Age),linetype=c("dashed")) +
geom_point(position=pd,size=3) +
theme_light()+
ylim(0,0.2) +
scale_color_manual(values=c("1"="grey55","2"="grey15")) +
guides(fill=guide_legend(nrow=2,byrow=TRUE))
They look like this:
Image of p
Image of s
I was wondering if someone knows a way to combine them in a single graph that keeps both current scales, for example with the y axis of plot p on the left and the y axis of plot s on the right. I cannot simply draw both series on one axis because their scales differ so much.
Thank you very much for your time,
Best regards,
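Try this code. Add the second series as new layers with their own aes(), multiply ppv by 5 so it fits the primary axis, and use sec_axis(~./5) so the right-hand axis is labelled in the original ppv units.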
ggplot(data, aes(x = var2, y = var1, colour=Age)) +
geom_errorbar(aes(ymin = inf, ymax = sup), width = .1, position = pd) +
geom_line(position = pd, aes(group = Age), linetype = c("dashed")) +
geom_point(position = pd, size = 3) +
geom_line(position = pd, aes(x = var2, y = ppv * 5, colour = Age, group = Age), linetype = c("dashed"), data = data) +
geom_point(aes(x = var2, y = ppv * 5, colour = Age, group = Age), position = pd, size = 5) +
theme_light() +
scale_color_manual(values = c("1" = "grey55", "2" = "grey15")) +
scale_y_continuous(sec.axis = sec_axis(~./5)) +
guides(fill = guide_legend(nrow = 2, byrow = TRUE))
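The factor of 5 is consistent with the limits of the two original plots, ylim(0, 1) and ylim(0, 0.2). The following base R sketch shows that reasoning and the round trip between the two scales; the variable names are introduced here for illustration, and deriving the factor from the axis limits is an assumption about how it was chosen.

```r
# Axis limits taken from the two original plots: ylim(0, 1) and ylim(0, 0.2)
primary_max   <- 1
secondary_max <- 0.2
scale_factor  <- primary_max / secondary_max

# ppv values from the question's data
ppv <- c(0.03, 0.056, 0.06, 0.09, 0.055, 0.067, 0.089, 0.123)

# Forward transform: what is drawn against the primary (left) axis
plotted <- ppv * scale_factor

# Inverse transform: what the secondary (right) axis shows for those positions
relabelled <- plotted / scale_factor

data.frame(ppv, plotted, relabelled)
```

The multiplication in aes() and the division in sec_axis() must be exact inverses of each other; otherwise the right-hand axis would no longer describe the ppv points.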
