Reduce space between groups of bars in ggplot2 - r

I haven't been able to remove the extra white space flanking groups of bars drawn with geom_bar.
I'd like to do what Roland achieves here: Remove space between bars ggplot2. But when I try to implement his solution, I get the warning "Warning message:
geom_bar() no longer has a binwidth parameter. Please use geom_histogram() instead."
I added this line of code to my plot (trying different widths):
geom_histogram(binwidth = 0.5) +
which returns "Error: stat_bin() must not be used with a y aesthetic." and no plot.
Data:
mydf<- data.frame(Treatment = c("Con", "Con", "Ex", "Ex"),
Response = rep(c("Alive", "Dead"), times=2),
Count = c(259,10,290,21))
aPalette<-c("#009E73", "#D55E00")
Plot:
example<-ggplot(mydf, aes(factor(Response), Count, fill = Treatment)) +
geom_bar(stat="identity",position = position_dodge(width = 0.55), width =
0.5) +
scale_fill_manual(values = aPalette, name = "Treatment") + #legend title
theme_classic() +
labs(x = "Response",
y = "Count") +
scale_y_continuous(breaks = c(0,50,100,150,200,250,275), expand = c(0,0),
limits = c(0, 260)) +
theme(legend.position = c(0.7, 0.3)) +
theme(text = element_text(size = 15)) #change all text size
example
Returns:
Note: I don't know why I'm getting "Warning message: Removed 1 rows containing missing values (geom_bar)." but I'm not concerned about it because that doesn't happen using my actual data.
Edit re: note - this is happening because I set the upper limit of the y-axis lower than the height of the tallest bar, so that bar was removed. I'm not going to change the code, so that I don't have to redraw my figure, but changing
limits = c(0, 260)
to
limits = c(0, 300)
will show all the bars, in case someone else has a similar problem. I'm going to find a post related to this issue and will make this edit more concise when I can link an answer.

Forgive me if I completely missed what you're trying to accomplish here, but the only reason ggplot has included so much white space is that you constrained the bars to a particular width and increased the size of the graph.
The white space within the graph is a product of the width of the bars and the width of the graph.
Using your original graph...
We notice a lot of whitespace, but you made both the bins small and your graph wide. Think of the space as a compromise between bins and whitespace. It's illogical to expect a wide graph with small bins and no whitespace. To fix this we can either decrease the graph size or increase the bin size.
First we increase the bin size back to normal by removing your constraints.
Which looks ridiculous....
But looking at the Remove space between bars ggplot2 link that you included above, all he did was remove constraints and limit the width. Doing so would result in a similar graph...
Including the graph from your link above....
And removing all of your constraints....
example<-ggplot(mydf, aes(factor(Response), Count, fill = Treatment)) +
geom_bar(stat="identity",position = position_dodge()) +
scale_fill_manual(values = aPalette, name = "Treatment") +
theme_bw() +
labs(x = "Response", y = "Count")
example
If your goal was not to make your graph similar to the one in the link by removing whitespace, let me know. Other than that, I hope this helped.
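The trade-off between bar width and whitespace described above can be sketched as follows. This is only an illustration under stated assumptions: the 0.9 widths are arbitrary choices, geom_col() is used as shorthand for geom_bar(stat = "identity"), and the file name and size in the commented ggsave() call are hypothetical.

```r
library(ggplot2)

mydf <- data.frame(
  Treatment = c("Con", "Con", "Ex", "Ex"),
  Response  = rep(c("Alive", "Dead"), times = 2),
  Count     = c(259, 10, 290, 21)
)

# Each group of bars fills 90% of its slot on the x-axis. Matching the dodge
# width to the bar width makes the two bars in a group touch each other.
p <- ggplot(mydf, aes(Response, Count, fill = Treatment)) +
  geom_col(width = 0.9, position = position_dodge(width = 0.9)) +
  theme_classic()

# A narrower output device shrinks the remaining gaps in absolute terms.
# ggsave("example.png", p, width = 3, height = 4)  # hypothetical name and size
```

With two treatments per group, each bar ends up 0.45 units wide, leaving a gap of 0.1 units between neighbouring groups.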

Related

X axis labels tied to histogram bars instead of following separate rules

When using a histogram with x as a POSIXct value, I'm not sure how you're supposed to line the ticks up with the bin size of the graph.
Setting the tick spacing to the same as the bin size leaves the ticks slightly off, and the offsets add up until the labels are no longer accurate.
bymonth <- ggplot() +
scale_x_datetime("", breaks = date_breaks("60 days"), labels = date_format("%m-%y")) +
# ... lots of geom_rects for background colors ...
theme(legend.title = element_blank()) +
geom_histogram(data=dat, aes(x = iso, fill = name), binwidth = 30*24*60*60, position = 'dodge')
I tried using annotate() as well as experimenting with the spacing of the ticks, but I think my whole approach here might be wrong.
This leads to a graph looking something like this
Which is quite annoying
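One possible source of the accumulating offset is that a fixed 30-day bin is not the same length as a calendar month. The base R sketch below compares the two. It uses Date values and an arbitrary start date as assumptions for illustration, not the POSIXct data from the question.

```r
# Hypothetical start date, chosen only for illustration.
start <- as.Date("2015-01-01")

# Bin edges spaced a fixed 30 days apart, like binwidth = 30*24*60*60 seconds.
fixed_edges <- start + 30 * 0:12

# True calendar month starts over the same span.
month_starts <- seq(start, by = "month", length.out = 13)

# Difference in days between each calendar month start and its fixed edge.
drift_days <- as.integer(month_starts - fixed_edges)
drift_days
```

The first edge lines up exactly, but after twelve bins the fixed edges are five days short of the calendar, which matches the kind of gradual misalignment described in the question.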

How to adjust the following ggplot2 graph?

I have recently been doing research on global education, and the following graph is an important plot in that research.
ggplot(sam_data,aes(JOY,PV)) +
geom_line(aes(colour = Individualism))+
facet_grid(occupation~as.factor(Gender)) +
theme(legend.key.height = unit(2.0,"cm"),legend.text = element_text(size = 5,face = "plain")) +
scale_color_continuous("Individualism",labels=sam_data$country,breaks =sam_data$Individualism)+
geom_smooth()
And the problems are obvious:
1) The correlation lines of the different countries are all combined into one line, instead of being separate lines within each gender and occupation panel.
2) The legend is a mess. I want it to show clearly which countries correspond to which individualism level, but adjusting many of the legend parameters did not help much.
3) Also, I do not know how to remove the white gap produced by the breaks parameter. Any thoughts would be great!
I have solved the second problem by adjusting the aes parameter in the ggplot function. My new code is as follows:
ggplot(sam_data,aes(JOYSCIE,PV1SCIE,group = CNTRYID)) +
geom_point(aes(color = Individualism.comp4))+
facet_grid(recode.OCOD3~as.factor(Gender0women1men)) +
theme(legend.key.height = unit(3.0,"cm"),legend.text = element_text(size = 5,face = "plain")) +
scale_color_gradientn("Individualism",labels=sam_data$CNTRYID,breaks =sam_data$Individualism.comp4,colors = rainbow(4))+
scale_x_continuous(limits = c(-2,2))
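The group aesthetic in the revised call is what tells ggplot2 to treat each country as its own series. Below is a minimal sketch with made-up data. The data frame toy and its columns are hypothetical stand-ins for sam_data, and geom_line() is used only to make the grouping visible.

```r
library(ggplot2)

# Hypothetical stand-in for sam_data: three countries, one continuous score each.
toy <- data.frame(
  country = rep(c("A", "B", "C"), each = 4),
  x       = rep(1:4, times = 3),
  y       = c(1, 2, 3, 4, 2, 2, 3, 3, 4, 3, 2, 1),
  score   = rep(c(10, 50, 90), each = 4)
)

# A continuous colour does not split the data, so all points form one series.
p_single <- ggplot(toy, aes(x, y)) +
  geom_line(aes(colour = score))

# Adding group = country gives one series per country.
p_grouped <- ggplot(toy, aes(x, y, group = country)) +
  geom_line(aes(colour = score))

# Count the series each plot would draw.
n_single  <- length(unique(layer_data(p_single)$group))
n_grouped <- length(unique(layer_data(p_grouped)$group))
```

Because group is set in the top-level aes(), it is inherited by every layer, including a geom_smooth() added later.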

Unexpected behaviour: italic() causing cut-off of two-line axis label in ggplot2

Use of italics (italic()) in a y-axis label that goes over two lines in ggplot is causing the first line to be partly cut off.
E.g.
ggplot() +
geom_hline(aes(yintercept = 1)) +
labs(y = expression(paste("Something\nsomething", italic(x))))
There's no apparent reason this should be happening. The same thing doesn't happen with very similar code not using italic(), e.g. using hat() instead:
ggplot() +
geom_hline(aes(yintercept = 1)) +
labs(y = expression(paste("Something\nsomething", hat(x))))
Does anyone know why this would occur or what to do about it, other than tediously altering the plot and margin sizes by hand?
Not sure why this happens but you can increase the plot margins within ggplot2...
ggplot() +
geom_hline(aes(yintercept = 1)) +
labs(y = expression(paste("Something\nsomething", hat(x)))) +
theme(plot.margin=unit(c(1,1,1,1), "cm"))

adjust a legend position in a barplot

I need to move the legend for the following barplot to a proper position somewhere outside the plot.
COLORS=rainbow(18)
barplot(sort(task3_result$respondents_share,decreasing = TRUE), main="Share of respondents that mentioned brand among top 3 choices ", names.arg=task3_result$brand, col = COLORS)
legend("right", tolower(as.character(task3_result$brand)), yjust=1,col = COLORS, lty=c(1,1) )
Thanks guys, I couldn't solve the problem but I reached my goal using ggplot:
windows(width = 500, height= 700)
ggplot(data = task3_result, aes(x = factor(brand), y = respondents_share, fill = brand)) +
geom_bar(colour = 'black', stat = 'identity') + scale_fill_discrete(name = 'brands') + coord_flip()+
ggtitle('Share of respondents that mentioned brand among top 3 choices') +xlab("Brands") + ylab("Share of respondents")
As DatamineR pointed out, your code is not reproducible as-is (we don't have task3_result), but you can probably accomplish what you're talking about by playing with the x and y arguments to legend() - you can just set the x coordinate to something beyond the edges of the bars, for example. See the documentation: https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/legend.html. Also note there the cex argument, because that legend might be bulkier than you want.
Note that you will have to specify a larger plot window in order to leave space for the legend; the relevant help file for that is plot.window: https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/plot.window.html
Though you won't want to call plot.window directly - better to pass the relevant arguments to it through the barplot() function. If that doesn't make sense, I recommend you read up on R's base plotting package more generally.
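A minimal base R sketch of placing the legend beyond the bars is shown below. Note the substitutions: the shares vector is made-up data standing in for task3_result, and the extra room is made with par(mar = ..., xpd = TRUE) rather than through plot.window arguments, which is a different route to the same goal.

```r
# Hypothetical data standing in for task3_result.
shares <- c(BrandA = 0.5, BrandB = 0.3, BrandC = 0.2)
cols   <- rainbow(length(shares))

# Widen the right margin and allow drawing outside the plot region.
old_par <- par(mar = c(5, 4, 4, 8), xpd = TRUE)

# barplot() returns the x positions of the bar midpoints.
mids <- barplot(shares, col = cols, main = "Share of respondents")

# Put the legend to the right of the last bar; cex shrinks it a little.
legend(x = max(mids) + 1, y = max(shares),
       legend = tolower(names(shares)), fill = cols, cex = 0.8)

par(old_par)  # restore the previous graphics settings
```

Using fill rather than col and lty gives coloured boxes in the legend, which match bars better than line samples do.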

geom_boxplot behaving oddly?

I'm currently plotting some data (response times in ms) in geom_boxplot.
I have a question:
When you adjust the limits on the y-axis does it disregard any values above that in the plotting & error bar calculations?
The data itself comprises over 20k entries, and I'm not sure providing a sample would be of much use as this is more of a functionality question.
Here is the code I use:
f <- function(x) {ans <- boxplot.stats(x)
data.frame(ymin = ans$conf[1], ymax = ans$conf[2], y = ans$stats[3])}
RTs.box = ggplot(mean.vis.aud.long, aes(x = Report, y = RTs, fill =Report)) + theme_bw() + facet_grid(Audio~Visual)
RTs.box +
geom_boxplot(alpha = .8) + geom_hline(yintercept = .333, linetype = 3, alpha = .8) + theme(legend.position = "none") + ylab("Response Times ms") + scale_fill_grey(start=.4) +
labs(title = expression("Visual Condition")) + theme(plot.title = element_text(size = rel(1)))+
theme(panel.background = element_rect())+
#line below for shaded confidence intervals
stat_summary(fun.data = f, geom = "crossbar",
colour = NA, fill = "skyblue", width = 0.75, alpha = .9)+
ylim(0,1000)#this is the value that I change that results in different plots and shaded confidence intervals
Here is the plot with
ylim(0,1000)
And using the same data but changing the limit to
ylim(0,3000)
results in this plot:
As you can see the values in the boxplots appear to be adjusted according to the limit used. Instead of plotting to the edge of the limit the percentiles are reduced. This is apparent when you compare the middle boxplot in the top-left panel of both grids.
There are differences in the confidence intervals also as can be seen.
Does this mean geom_boxplot is discarding the data above the limit or is there something I'm missing?
I want to include all the data when plotting the boxplot & confidence intervals but limit the scale so it can be seen clearly. It means not seeing some major outliers in the data but for my purposes that is fine.
Has anyone got any suggestions as to what is going on here & how to get around it without potentially dropping the values from the data outside the visual range chosen for my calculation?
Thanks as always.
From ?ylim: "Observations not in this range will be dropped completely and not passed to any other layers. If a NA value is substituted for one of the limits that limit is automatically calculated."
If you want to adjust the limits without affecting the data, use coord_cartesian instead.
The function ylim determines which data points are used for plotting: values outside the limits are dropped before the statistics are computed.
To avoid this, you may want to use coord_cartesian, which will not change the underlying data.
Try to replace ylim(0,1000) with:
coord_cartesian(ylim = c(0,1000))
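The difference can be checked on a small example. The sketch below uses hypothetical data (the data frame toy, with nine ordinary values and one large outlier) and compares the median that geom_boxplot computes under each approach.

```r
library(ggplot2)

# Hypothetical data: nine ordinary values plus one large outlier.
toy <- data.frame(g = "a", rt = c(1:9, 100))

base <- ggplot(toy, aes(g, rt)) + geom_boxplot()

# ylim() sets scale limits: the outlier is dropped before the stats are computed.
p_ylim <- base + ylim(0, 50)

# coord_cartesian() only crops the view: the stats still use all ten values.
p_coord <- base + coord_cartesian(ylim = c(0, 50))

# Extract the computed medians (ylim() warns about the removed row).
median_ylim  <- suppressWarnings(layer_data(p_ylim)$middle)
median_coord <- layer_data(p_coord)$middle
```

Here median_ylim is 5 (the median of 1 to 9) while median_coord is 5.5 (the median of all ten values), so the two plots show different boxes for the same data.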
