Adding extra space above graph with facet_wrap - r

I'm trying to use facet_wrap in R with 4 plots that have different x-axes. I used scales = "free", which was helpful, but most of the bars touch the top of the plot panel. I would like there to be space above the tallest bar in each panel, so it doesn't look like it's running past the upper limit of the graph. I hope that makes sense - this is my first time posting a question here. I have provided the code that I have and a screenshot of one of the graphs. I haven't been able to find any sort of fix without using cowplot, which is a bit more advanced than I would like.
Code:
wrap_plot <- df %>% ggplot(aes(x=Race, y = Rate, fill = Sex))+
geom_col(position = "dodge")+
labs(x = "Race/Ethnicity",
y = "Rate per 100,000 population")+
theme_classic()+
facet_wrap(~Disease, scales = "free")+
scale_y_continuous(expand = c(0, 0))
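The bars touch the top because expand = c(0, 0) removes all padding at both ends of the y-axis. A minimal sketch of one way to keep the bars sitting on the x-axis while leaving headroom above them is to use expansion() (ggplot2 3.3.0 or later) so that only the upper end is padded. The df_demo data frame below is made up for illustration, since the original df is not available, and the 10% headroom is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's df (not the real data)
df_demo <- data.frame(
  Disease = rep(c("Disease A", "Disease B"), each = 4),
  Race    = rep(c("Group 1", "Group 2"), times = 4),
  Sex     = rep(rep(c("Female", "Male"), each = 2), times = 2),
  Rate    = c(12, 30, 18, 25, 150, 90, 210, 120)
)

p <- ggplot(df_demo, aes(x = Race, y = Rate, fill = Sex)) +
  geom_col(position = "dodge") +
  labs(x = "Race/Ethnicity", y = "Rate per 100,000 population") +
  theme_classic() +
  facet_wrap(~Disease, scales = "free") +
  # no padding below zero, 10% of each panel's data range added on top
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))

p
```

Because the padding is multiplicative, each free-scaled panel gets headroom proportional to its own tallest bar.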

Related

How to remove or not show any of the data points above and below the error bar in boxplot and violin plot?

I'm working on a very large dataset containing around 1.6M data points. I'm using the violin plot along with the boxplot to represent the data from each category (there are multiple categories and each has its own set of values).
The problem I'm facing is that there are a lot of data points (outliers) above the error bar, and because of that the focus of the plot is lost.
At first I thought that removing all the data points above a specific value would help me show what I wanted. But it didn't work: the error bar range is different for each category, so a single cutoff removed the majority of the data from other categories.
So now I'm thinking of removing, or not showing, the data points above the error bar for each category individually, for both the box and the violin plot. I added outlier.shape=NA to geom_boxplot, and it worked fine for the boxplot. Similarly, I want to remove from the violin plot all the data points that lie above the error bar in the boxplot.
[Figures: the plot before and after adding outlier.shape=NA]
Here is my code :
med_violin <- data %>%
left_join(sample_size) %>%
mutate(myaxis = fct_reorder(paste0(Country), Diff, .fun='median')) %>%
ggplot( aes(x=myaxis, y=Diff, fill=Country)) +
geom_violin(width=1.5, color = "black", position = position_dodge(width=1.8), trim = TRUE) +
geom_boxplot(width=0.2, color="white", alpha=0.01, outlier.colour="red", outlier.size=0.1, outlier.shape = NA) +
scale_y_continuous(breaks = c(0,25,50,75,100,125,150,525,550))+
coord_trans(y = squash_axis(150, 525, 15)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))+
theme(axis.text.x = element_text(size = 8))+
theme(legend.position ="none")+
scale_fill_viridis(discrete = TRUE) +
xlab("")
med_violin
How can I implement the same thing in geom_violin, so that it also does not show the data points above the error bar?
I even tried the approach from "Ignore outliers in ggplot2 geom_violin", but it did not work for me.
Thank you.
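geom_violin has no counterpart to outlier.shape, so one possible approach is to filter the data per category before plotting. The sketch below keeps only the values inside the usual 1.5 x IQR fences of each group, which is the rule geom_boxplot uses by default to decide what counts as an outlier. The demo data frame and its values are invented for illustration; only the column names Country and Diff are taken from the question.

```r
library(dplyr)
library(ggplot2)

set.seed(42)
# Hypothetical skewed data, three categories with different spreads
demo <- data.frame(
  Country = rep(c("A", "B", "C"), each = 200),
  Diff    = c(rexp(200, 1 / 10), rexp(200, 1 / 30), rexp(200, 1 / 60))
)

# Keep, per category, only the points inside that category's own fences
within_whiskers <- demo %>%
  group_by(Country) %>%
  filter(
    Diff >= quantile(Diff, 0.25) - 1.5 * IQR(Diff),
    Diff <= quantile(Diff, 0.75) + 1.5 * IQR(Diff)
  ) %>%
  ungroup()

p_violin <- ggplot(within_whiskers, aes(x = Country, y = Diff, fill = Country)) +
  geom_violin(trim = TRUE) +
  geom_boxplot(width = 0.2, outlier.shape = NA) +
  theme(legend.position = "none")

p_violin
```

Note that the boxplot here is drawn from the filtered data too, so its quartiles are recomputed and can differ slightly from those of the full data. The filtering also changes what the violin represents, which is worth stating in the figure caption.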

Overlapping data on columns in a ggplot facet grid

Thanks in advance for humoring a complete newbie to R. I'm working with some data from the GSS for an online class, and I've created a ggplot facet grid. I'm sure I've done this a super awkward, long way, but I'm trying to get these data points to not overlap each other, but be centered on the columns.
Here's what I've got so far:
I've created a new dataset from the GSS with the variables 'conpress', 'sex', and 'news' -- which refer to confidence in the press, gender, and how often someone reads a newspaper. I wanted to get the percentages, not the counts, which is why I did the ..count.. stuff.
gss_press_full <- gss %>% select (conpress, news, sex)
gss_press_clean <-na.omit(gss_press_full)
ggplot(gss_press_clean, aes(x = conpress, y = (..count..)/sum(..count..), fill = sex)) +
geom_bar(aes(y = (..count..)/sum(..count..)), position = position_dodge()) + facet_grid(news~.) +
geom_text(aes(y = ((..count..)/sum(..count..)), label = round((..count..)/sum(..count..), 2)), stat = "count", vjust = -0.25) +
labs(title = "Newspaper readership and Press confidence", y = "Percent", x = "Levels of confidence in the Press")
I have been googling for far too long and can't seem to find a way to adjust these labels atop the columns. It seems to be especially tricky since my y variable is being calculated in my ggplot creation, but again, like a complete novice, that was how I cobbled my way to the output. If someone has help on how to streamline this process, I'd appreciate that too!
( I hope I've included enough code to be helpful!)
Again, thanks for any help!
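The labels pile up because the bars are dodged but the text is not: geom_text defaults to position = "identity", so both labels land at the centre of each x category. A minimal sketch of one fix is to give geom_text the same dodge as the bars. The gss_demo data below is randomly generated for illustration (the real GSS extract is not available), and after_stat() is the current spelling of the ..count.. notation (ggplot2 3.3.0 or later).

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for gss_press_clean
gss_demo <- data.frame(
  conpress = sample(c("A Great Deal", "Only Some", "Hardly Any"), 300, replace = TRUE),
  sex      = sample(c("Male", "Female"), 300, replace = TRUE),
  news     = sample(c("Everyday", "Never"), 300, replace = TRUE)
)

# One dodge object shared by bars and labels so they line up
dodge <- position_dodge(width = 0.9)

p_press <- ggplot(gss_demo, aes(x = conpress, fill = sex)) +
  geom_bar(aes(y = after_stat(count / sum(count))), position = dodge) +
  geom_text(
    aes(y = after_stat(count / sum(count)),
        label = after_stat(round(count / sum(count), 2)),
        group = sex),
    stat = "count", position = dodge, vjust = -0.25
  ) +
  facet_grid(news ~ .) +
  labs(title = "Newspaper readership and Press confidence",
       y = "Percent", x = "Levels of confidence in the Press")

p_press
```

The width of 0.9 matches the default bar width, which is what centres each label over its own bar.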

Introduce explicit line break in ggplot2 on the Y-axis (boxplot)

I'm attempting to write some code that can be used to make boxplots of the temperatures at which proteins melt. I'm 99% there, except I need to introduce a line break on the y-axis of my boxplot.
Essentially, my current y-axis scale goes from 45 to 60. I want the y-axis to start at 0, then have a line break, then continue with 45-60. See the picture for an example.
I've tried using scale_y_continuous to set a break, but that didn't work as I'd hoped.
df %>%
group_by(Protein) %>%
ggplot(., aes(x = factor(Protein), y = Melting_Temperature)) +
geom_boxplot() +
theme_classic() +
geom_point(aes(x = as.numeric(df$Protein) + 0.5, colour = Protein),
alpha=0.7)+
xlab("Protein Type")+
ylab("Melting Temperature") +
stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18,
size = 3, show.legend = FALSE) +
geom_text(data = means, aes(label = round(Melting_Temperature, 1),
y = Melting_Temperature + 0.5))
IMHO, tick marks and axis labels should be sufficient to indicate the range of data on display. So, there is no need to start an axis at 0 (except for bar charts and the like).
However, the package ggthemes offers Tufte style axes which might be an alternative to the solution the OP is asking for:
library(ggplot2)
library(ggthemes)
ggplot(iris) +
aes(x = Species, y = Sepal.Length) +
geom_boxplot() +
geom_rangeframe() +
theme_tufte(base_family = "")
Note that the iris dataset is used here in place of OP's data which are not available.
geom_rangeframe() plots axis lines which extend to the maximum and minimum of the plotted data. As the plot area is usually somewhat larger this creates a kind of gap.
theme_tufte() is a theme based on Chapter 6 "Data-Ink Maximization and Graphical Design" of Edward Tufte's The Visual Display of Quantitative Information with no border, no axis lines, and no grids.
This is not supported in ggplot2 as built. In this discussion from 2010, Hadley Wickham (author of ggplot2 and many other R packages) explains that axis breaks are questionable practice in his view.
Those comments by Hadley are linked, and other options discussed, in this prior SO discussion.

Preserving text size with `grid.arrange`

Problem
I have 4 graphs that I want to display using grid.arrange(). When I display them individually, they look like this:
But when I use grid.arrange(), they become distorted
with them individually looking like
Specific Issues:
The x-axis labels do not scale and overlap, making them unreadable.
The subtitles get cut off.
Goal
I want to reproduce each plot exactly like the first ideal case in a grid with grid.arrange(). One possible way might be to convert each plot to an image and then use grid.arrange() but I don't know how to do this.
Reproducible Example
Below is an example reproducible code that shows the problem I am having.
library(ggplot2)
library(gridExtra)
p1 <- ggplot(subset(mtcars, cyl == 4), aes(wt, mpg, colour = cyl)) + geom_point() + labs(title = "TITLE-TITLE-TITLE-TITLE-TITLE-TITLE", subtitle = "-subtitle-subtitle-subtitle-subtitle-subtitle-subtitle-subtitle-") + theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
p2 <- ggplot(subset(mtcars, cyl == 4), aes(wt, mpg, colour = cyl)) + geom_point() + labs(title = "TITLE-TITLE-TITLE-TITLE-TITLE-TITLE", subtitle = "-subtitle-subtitle-subtitle-subtitle-subtitle-subtitle-subtitle-") + theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
grid.arrange(p1, p2, ncol = 2)
When you display those graphs individually they simply have more space. So, those are natural distortions and there are perhaps only three ways to solve that.
When exporting the combined graph, make it big enough. If the individual one looks good in 6x5 inches, then surely the combined one will look good in 12x10 inches.
Give correspondingly less space for the problematic parts: x-axis labels and the subtitle. For instance, use something like element_text(size = 6) for plot.subtitle and axis.title.x, add \n to the subtitles and even x-axis labels, try something like element_text(angle = 30) for the latter as well.
Get rid of something unnecessary. As @Richard Telford suggests in the comments, using facet_wrap should work better. That would be due to, e.g., not repeating the y-axis labels and, hence, giving more horizontal space.
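As a rough sketch of the first option (exporting at a larger size), the combined layout can be built with arrangeGrob() and written out with ggsave() at a width that gives each panel about as much room as it had on its own. The make_plot() helper, the file name, and the 12 x 5 inch size are assumptions chosen for illustration; the titles are taken from the example in the question.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical helper that rebuilds the example plot from the question
make_plot <- function() {
  ggplot(mtcars, aes(wt, mpg, colour = cyl)) +
    geom_point() +
    labs(title = "TITLE-TITLE-TITLE-TITLE-TITLE-TITLE",
         subtitle = "-subtitle-subtitle-subtitle-subtitle-subtitle-subtitle-subtitle-") +
    theme(plot.title = element_text(hjust = 0.5),
          plot.subtitle = element_text(hjust = 0.5))
}

# arrangeGrob() builds the layout without drawing it to the screen
combined <- arrangeGrob(make_plot(), make_plot(), ncol = 2)

# Two panels side by side: roughly double the width of a single 6 x 5 inch plot
out_file <- file.path(tempdir(), "combined.pdf")
ggsave(out_file, combined, width = 12, height = 5, units = "in")
```

Text sizes in ggplot2 are given in points, so they stay fixed while the panels grow with the output size. That is why a larger canvas leaves room for long subtitles.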

adjust a legend position in a barplot

I need to adjust the legend for the following barplot in a proper position somewhere outside the plot
COLORS=rainbow(18)
barplot(sort(task3_result$respondents_share,decreasing = TRUE), main="Share of respondents that mentioned brand among top 3 choices ", names.arg=task3_result$brand, col = COLORS)
legend("right", tolower(as.character(task3_result$brand)), yjust=1,col = COLORS, lty=c(1,1) )
Thanks guys, I couldn't solve the problem in base graphics, but I reached my goal using ggplot:
windows(width = 500, height= 700)
ggplot(data = task3_result, aes(x = factor(brand), y = respondents_share, fill = brand)) +
geom_bar(colour = 'black', stat = 'identity') + scale_fill_discrete(name = 'brands') + coord_flip()+
ggtitle('Share of respondents that mentioned brand among top 3 choices') +xlab("Brands") + ylab("Share of respondents")
As DatamineR pointed out, your code is not reproducible as-is (we don't have task3_result), but you can probably accomplish what you're talking about by playing with the x and y arguments to legend() - you can just set the x coordinate to something beyond the edges of the bars, for example. See the documentation: https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/legend.html. Also note there the cex argument, because that legend might be bulkier than you want.
Note that you will have to specify a larger plot window in order to leave space for the legend; the relevant help file for that is plot.window: https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/plot.window.html
Though you won't want to call plot.window directly - better to pass the relevant arguments to it through the barplot() function. If that doesn't make sense, I recommend you read up on R's base plotting package more generally.
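For what it's worth, here is a minimal base-graphics sketch of that idea, assuming a made-up task3_demo data frame in place of task3_result. It widens the right margin with par(mar = ...), allows drawing outside the plot region with xpd = TRUE, and pushes the legend into the margin with a negative inset. The margin and inset values are guesses that usually need tuning for the device size.

```r
# Hypothetical stand-in for task3_result
task3_demo <- data.frame(
  brand             = c("Alpha", "Beta", "Gamma", "Delta"),
  respondents_share = c(0.42, 0.18, 0.31, 0.09)
)

# Sort the whole data frame so bars, names and colours stay aligned
task3_demo <- task3_demo[order(task3_demo$respondents_share, decreasing = TRUE), ]
bar_colors <- rainbow(nrow(task3_demo))

# Wider right margin; xpd = TRUE permits drawing in the margin
old_par <- par(mar = c(5, 4, 4, 8), xpd = TRUE)

bar_mid <- barplot(task3_demo$respondents_share,
                   names.arg = task3_demo$brand,
                   col = bar_colors,
                   main = "Share of respondents that mentioned brand among top 3 choices")

# A negative horizontal inset moves the legend right of the plot region
legend("topright", inset = c(-0.3, 0),
       legend = tolower(task3_demo$brand),
       fill = bar_colors, cex = 0.8)

par(old_par)  # restore the previous graphics settings
```

Sorting the data frame first also avoids a mismatch in the original call, where the shares were sorted but names.arg was not. Using fill rather than lty gives coloured boxes, which suit a bar chart better than line samples.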
