geom_bar not showing every value - R

I want to draw a bar plot with ggplot and geom_bar, but geom_bar seems to behave inconsistently and I don't understand why.
My data is an hourly time series of precipitation:
library(ggplot2)
library(data.table)
library(lubridate)
set.seed(42)
dt1 <- data.table(dateHeure=seq(ymd_hms("2014-06-04 13:30:00"),
                                ymd_hms("2014-10-20 08:30:00"), by='1 hour'),
                  rain=sample(c(rep(5,15), rep(10,15), rep(20,10),
                                rep(30, 5), 40, rep(0, 3262))))
Then I plot it, and not all of the data appears. Why is some data missing?
ggplot(data=dt1)+
  geom_bar(aes(x=dateHeure, y=rain),
           stat="identity",
           fill="blue") # doesn't work!
But if I add the variable color in aes, then the plot is correct!
ggplot(data=dt1)+
  geom_bar(aes(x=dateHeure, y=rain, color="rain"),
           stat="identity",
           width=0.2) # works properly
Does anyone know why geom_bar doesn't work properly without color? I can't rely on it if sometimes not all of the data is plotted correctly.
Thanks!
Edit: in response to @eipi10, I added the plots. The strange thing is that when I resize the plot window in the first case, the data that is plotted changes!

Based on the edit to your question, I think I know what's happening. In the first plot, you use fill="blue", but the bar widths are very small compared to the overall range of the x-axis. This results in very, very thin vertical bars: so thin that some of them are not drawn on your screen, but they appear when you expand the physical width of the plot.
In your second plot, on the other hand, you used colour="rain", which adds a border to each bar. The border makes each bar thicker, so the bars are visible even when the physical width of the plot is relatively small.
Try adding colour="blue" (or "red" or whatever) to your first plot and I think you'll see all the bars, even without resizing. Conversely, try changing colour="rain" to fill="rain" in your second plot and see whether that creates the "disappearing data" effect there.
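A rough back-of-the-envelope calculation illustrates how thin the bars are. This is only a sketch: the panel width of 600 pixels is an assumed figure that does not come from the question, and the calculation ignores the small padding ggplot2 adds at both ends of the x-axis. The 0.9 factor is geom_bar's default width (90% of the data resolution).

```r
# Number of hourly observations in dt1 (one bar per row)
n_bars <- length(seq(as.POSIXct("2014-06-04 13:30:00", tz = "UTC"),
                     as.POSIXct("2014-10-20 08:30:00", tz = "UTC"),
                     by = "1 hour"))

panel_px     <- 600  # assumed panel width in pixels (hypothetical)
bar_fraction <- 0.9  # default geom_bar width: 90% of the data resolution

# Approximate on-screen width of a single bar, in pixels
px_per_bar <- panel_px / n_bars * bar_fraction
px_per_bar
```

Under these assumptions each bar is well under one pixel wide, so whether a given bar is drawn depends on how it happens to line up with the pixel grid, which is why resizing the window changes which bars show up.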
UPDATE: In response to your comment, you can use the colour parameter and then set the line width to get exactly the bar thickness you want, so you don't really need fill. For example:
ggplot(data=dt1)+
  geom_bar(aes(x=dateHeure, y=rain),
           stat="identity",
           colour="blue", lwd=0.5)
Just set lwd (line width) to a value that gives you the bar width you want. And, of course, you can change the colour as well.

Related

Change space between bars in histogram - R

I created a histogram in RStudio with the following code:
ggplot(data_csv, aes(x=Phasenew, fill=Success)) +
geom_histogram(binwidth = 1, position = "dodge", color="white")
What I want to do now is add more space between the bars of the histogram. I already tried the "width" parameter, but that obviously does not work for a histogram. I also tried making the white outline thicker, but then the bars no longer show the correct length. Does anyone have an idea how to do that?
As two commenters already noted, I think the attempt to change the space between the 'bars' of a histogram rests on a misunderstanding of what a histogram is. In a histogram, the frequency of your events is represented by the area of each cell. Or, to quote Wikipedia:
the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval
A priori, these cells do not even need to have the same width (for example, when your class widths differ).
Perhaps what you are looking for is geom_bar (https://ggplot2.tidyverse.org/reference/geom_bar.html):
ggplot(data_csv, aes(x=Phasenew, fill=Success)) +
  geom_bar()
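Since the question was about spacing, here is a minimal sketch of how geom_bar can leave gaps between groups of bars. The data frame below is a made-up stand-in for data_csv (only the column names are taken from the question), and the width of 0.6 is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical stand-in for data_csv; column names follow the question
data_csv <- data.frame(
  Phasenew = c(1, 1, 2, 2, 2, 3, 3, 4),
  Success  = c("yes", "no", "yes", "yes", "no", "no", "yes", "no")
)

p <- ggplot(data_csv, aes(x = Phasenew, fill = Success)) +
  # width < 1 leaves a gap between neighbouring groups of bars;
  # position_dodge() places the Success levels side by side
  geom_bar(position = position_dodge(), width = 0.6)
p
```

Lowering width widens the gaps and raising it towards 1 closes them; unlike geom_histogram, geom_bar counts the rows at each distinct x value rather than binning a continuous range.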

Can I place my ggplot2 legend right beside the horizontal axis label?

In order to save as much white space as I can, I would like to place the legend entry at the same height as the horizontal axis label. Can this be done and, if so, how?
Here's a plot illustrating what I am hoping to achieve, with current and hoped-for legend position illustrated using a (manually added) green box.
I currently place the legend using theme(legend.position=c(0.87,0.1)) (noting that the exact coordinates are not relevant). Ideally, this route would allow for values outside of the [0,1] domain but it appears not to allow for that.
theme(legend.position="bottom") places the legend well outside of the plotting area, thus taking up more white space than I am willing to spare.
You may just have to play around with negative values for the y-coordinate of your legend.position vector.
Here's an example:
library(ggplot2)
ggplot(iris, aes(Sepal.Width, Sepal.Length, color=Species))+
  geom_line()+
  facet_wrap(~Species)+
  theme(legend.position=c(0.87,-0.01))
Note the -0.01 value. Is this what you're looking for?

How to preserve plot sizes when stacking with common x-axes in ggarrange?

I am trying to stack plots with common x- and y-axes in ggplot. What I want is for only the bottom plot to show the x-axis labels and title. But I've never been able to figure out how to do this cleanly in ggplot2 without the bottom plot being squished by virtue of carrying the x-axis labels/title. There must be an easy way to do this; everyone wants to stack graphs, right?!
I'm currently trying with ggarrange. Example code is below. Note that the bottom plot gets compressed vertically because it has the tick and axis labels. I could give the top two plots white-font labels/titles, but that hack leaves an unseemly amount of margin space between the three.
I'm definitely open to packages other than ggpubr, but I am hoping for something not too elaborate that I can reuse in subsequent situations, as I'm sure I'll encounter this again...
Help, please!! -Ryan
library(ggplot2); library(ggpubr)
# One year of daily dates with three random series
X <- data.frame(date=seq(as.Date("2001-01-01"), as.Date("2001-12-31"), by='days'))
X$Y1 <- sample(80:100, size=nrow(X), replace=TRUE)
X$Y2 <- sample(100:120, size=nrow(X), replace=TRUE)
X$Y3 <- sample(50:70, size=nrow(X), replace=TRUE)
# Top two plots: hide the x-axis title and tick labels
plot.Y1 <- ggplot(X, aes(x=date, y=Y1))+
  geom_line()+lims(y=c(50,150))+
  theme(axis.title.x=element_blank(), axis.text.x=element_blank())
plot.Y2 <- ggplot(X, aes(x=date, y=Y2))+
  geom_line()+lims(y=c(50,150))+
  theme(axis.title.x=element_blank(), axis.text.x=element_blank())
# Bottom plot: keeps its x-axis title and tick labels
plot.Y3 <- ggplot(X, aes(x=date, y=Y3))+
  geom_line()+lims(y=c(50,150))
x11(10,8) # open a new graphics window (platform-dependent)
ggarrange(plot.Y1, plot.Y2, plot.Y3, nrow=3, ncol=1)
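The bottom plot is squished!
Try ggarrange from the egg package instead, which aligns the plot panels so that they all get the same size:
egg::ggarrange(plot.Y1,plot.Y2,plot.Y3,ncol=1)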

How to enlarge the size of facet in R

I'm working on the following dataset, where each facet shows the bleaching for one kind of coral at one site across the time period. My problem is how to enlarge each facet so that the trend is easier to see; in the current facets the trend is hard to see because the change in bleaching is small.
Here is my code:
cb1<-aggregate(cb$latitude, list(Site=cb$site), mean)
cb$site=factor(cb$site, levels=cb1$Site[order(cb1$x)])
ggplot(cb,aes(year,bleaching)) +
  geom_point() +
  facet_grid(site~kind) +
  geom_smooth(method="lm",color="grey") +
  coord_cartesian(ylim=c(0,1))
Because of the current size of the grid of facets, some lines seem flat when they actually are not.
You cannot really increase the size of the facets unless you increase the size of the plot overall. One option would be to save a large version of the plot:
p<-ggplot(cb,aes(year,bleaching))+
  geom_point()+
  facet_grid(site~kind)+
  geom_smooth(method="lm",color="grey")+
  coord_cartesian(ylim=c(0,1))
ggsave("file_name.jpg", plot = p, width = 24, height = 24, units = "in")
If you have limited space (e.g. the plot has to go on an A4 sheet), then the facet_grid_paginate function from ggforce would be a good option. It allows you to split faceted plots over multiple pages, and you can define the number of rows and columns per page. See the ggforce documentation for details.
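A minimal sketch of the pagination approach follows. The cb data frame here is randomly generated as a stand-in for the real data (6 sites, 4 kinds and 10 years are assumptions), and the 3 x 2 layout per page is an arbitrary choice.

```r
library(ggplot2)
library(ggforce)

# Hypothetical stand-in for cb: 6 sites x 4 kinds of coral over 10 years
set.seed(1)
cb <- expand.grid(site = paste0("site", 1:6),
                  kind = paste0("kind", 1:4),
                  year = 2001:2010)
cb$bleaching <- runif(nrow(cb))

p <- ggplot(cb, aes(year, bleaching)) +
  geom_point() +
  geom_smooth(method = "lm", color = "grey") +
  # 3 rows x 2 columns of facets per page; page = 1 draws the first page
  facet_grid_paginate(site ~ kind, nrow = 3, ncol = 2, page = 1)

n_pages(p)  # number of pages needed to show every facet
```

To produce the remaining pages, rebuild the plot with page = 2, 3, and so on (for example in a loop that calls ggsave once per page).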
Alternatively, if you want to show more clearly that the lines are not flat, you can try toying with a couple of the arguments to facet_grid. facet_grid allows you to set scales to "free", "free_x" or "free_y". Setting scales="free_y" means that each row of facets gets its own y-axis, not necessarily between 0 and 1 (assuming you also remove the coord_cartesian(ylim=c(0,1)) call). This would, however, make the facets more difficult to compare with each other.
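A sketch of the free-scale variant, again with randomly generated stand-in data rather than the real cb (the value ranges are invented so that sites differ in level):

```r
library(ggplot2)

# Hypothetical stand-in for cb; each site gets a different bleaching level
set.seed(1)
cb <- expand.grid(site = paste0("site", 1:3),
                  kind = paste0("kind", 1:2),
                  year = 2001:2010)
cb$bleaching <- runif(nrow(cb), 0, 0.1) + 0.2 * as.integer(cb$site)

p_free <- ggplot(cb, aes(year, bleaching)) +
  geom_point() +
  geom_smooth(method = "lm", color = "grey") +
  # free_y: the y-axis range is fitted separately for each row of facets
  facet_grid(site ~ kind, scales = "free_y")
# coord_cartesian(ylim = c(0, 1)) is left out on purpose:
# it would force the same 0-1 window onto every panel
p_free
```

With facet_grid the y-axis is shared within a row, so the scales differ between sites but not between kinds. If every panel should get its own axis, facet_wrap(~ site + kind, scales = "free_y") is one possible alternative.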

ggplot legend list is larger than page

I have a plot in R which has a very large number of sample groups, and therefore the legend is larger than the page size and is cut off. I understand that this is not publication quality, but I need to know the colours to be able to make the legend in Illustrator.
Is there a way to make the page size much bigger or somehow change the legend format so that I can include all the keys? The reason for this is so that I can open the PDF in Illustrator and get the colours for each sample to create a new legend that will be for publication. I thought that maybe there was a clipping mask and that the actual legend would be preserved, but when I opened the file in Illustrator, the legend was actually cut off at the page edges.
As was suggested in the comments below, I gave nrow a try, which helped break the legend up, but now the entire page is just legend.
# Each line must end with "+": a line that starts with "+" is parsed as a separate statement
ggplot(purine.n, aes(x=variable, y=value, colour=metabolite_gene, shape=variable)) +
  geom_abline(slope=0) +
  geom_point(size=4, position=position_dodge(width=0.08)) +
  scale_y_continuous(limits=c(-3.5,5.5), breaks=c(-3,-2,-1,0,1,2,3,4,5)) +
  scale_shape_manual(values=c(16,17,17), guide="none") +
  theme_bw() +
  theme(legend.key=element_blank(), legend.key.size=unit(1,"point")) +
  guides(colour=guide_legend(nrow=16))
As was suggested in the comments, nrow was the answer to my problem. I had to adjust the value to get the right number of rows to fit my legend. Below is the completed code that worked. There's more tweaking I need to do, like change page size to help make things look better, but that is out of the scope of this question.
ggplot(data.n, aes(x=variable, y=value, colour=metabolite_gene, shape=variable)) +
  geom_abline(slope=0) +
  geom_point(size=4, position=position_dodge(width=0.08)) +
  scale_y_continuous(limits=c(-3.5,5.5), breaks=c(-3,-2,-1,0,1,2,3,4,5)) +
  scale_shape_manual(values=c(16,17,17), guide="none") +
  theme_bw() +
  theme(legend.key=element_blank(), legend.key.size=unit(1,"point")) +
  guides(colour=guide_legend(nrow=30))

Resources