Spacing of discrete axis by a categorical variable - r

I have a categorical axis where I'd like to visually separate groups within that categorical variable. I don't want to facet because it takes up too much space and is not as visually clean.
Here's a visual example of what I want that involves some tedious hacking (setting alpha to 0 for non-data entries used for spacing).
library(ggplot2)
dd <- data.frame(x=factor(c(1,-1,2:10),levels=c(1,-1,2:10)), y=c(1,2,2:10), hidden=as.factor(c(0,1,rep(0,9))))
ggplot(data=dd,aes(x=x,y=y,alpha=hidden)) +
geom_point() + scale_alpha_manual(values=c("1"=0,"0"=1)) +
scale_x_discrete(breaks=c(1:10))
I'd like to be able to create this plot without having to hack an extra category in (which wouldn't be feasible with the amount of data/number of groups I'm trying to plot) using the following data structure (where the variable "groups" determines where the spacing occurs):
dd2 <- data.frame(x=factor(1:10), y=c(1:10), groups=c("A",rep("B",9)))

You can get the result you are looking for via the breaks and limits arguments to scale_x_discrete. Set the breaks to the levels of the factor on the x-axis, and set the limits to the factor levels with spacers where you want or need them.
Here is an example:
library(ggplot2)
dd <- data.frame(x = factor(letters[1:10]), y = 1:10)
ggplot(dd) +
aes(x = x, y = y) +
geom_point() +
scale_x_discrete(breaks = levels(dd$x),
limits = c(levels(dd$x)[1], "skip", levels(dd$x)[-1]))
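With many groups, writing the limits vector by hand becomes tedious. The following is a minimal sketch, not part of the original answer, that builds the limits from the groups column of dd2 in the question. The helper name limits_with_spacers and the "spacer_" labels are made up for illustration, and the sketch assumes that the rows are already ordered by group and that no real level starts with "spacer_".

```r
library(ggplot2)

dd2 <- data.frame(x = factor(1:10), y = 1:10, groups = c("A", rep("B", 9)))

# Hypothetical helper: return the x values with one spacer entry inserted
# wherever the group changes. Assumes the rows are already ordered by group.
limits_with_spacers <- function(x_values, groups) {
  pieces <- split(x_values, factor(groups, levels = unique(groups)))
  out <- character(0)
  for (i in seq_along(pieces)) {
    # Add a uniquely named spacer before every group except the first
    if (i > 1) out <- c(out, paste0("spacer_", i - 1))
    out <- c(out, pieces[[i]])
  }
  out
}

x_limits <- limits_with_spacers(as.character(dd2$x), dd2$groups)

# breaks only lists the real levels, so the spacers get no tick or label
ggplot(dd2, aes(x = x, y = y)) +
  geom_point() +
  scale_x_discrete(breaks = levels(dd2$x), limits = x_limits)
```

Because each spacer gets its own name, the same helper should also work when there are more than two groups.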

Related

Showing discrete variable in a single geom_sina/violin plot

I'm trying to create a plot with geom_sina and geom_violin where all data points are plotted together (as one violin shape) and are coloured by a factor.
However, when I specify ggplot(mtcars, aes(x = "", y = mpg, fill = am)), the plot is split according to the factor, which is what I'd like to avoid (plot 1). The closest I've come is treating the factor as a continuous variable (plot 2). But then the legend displays a "fill" bar and not the discrete factor levels I'd like.
So, if possible, I'd like the plot to stop splitting by colour when using a factor, or to override the legend with discrete values if going with numerics.
Any help is much appreciated : )
plot 1
plot 2
Maybe this is what you are looking for. Using the group aesthetic, you can override the default grouping by fill, color, and so on.
Note: As you want the points to be colored, I switched to the color aesthetic.
library(ggplot2)
library(ggforce)
ggplot(mtcars, aes(x = "", y = mpg)) +
geom_violin() +
geom_sina(aes(color = factor(am), group = 1))

How do I add a separate legend for each variable in geom_tile?

I would like to have a separate scale bar for each variable.
I have measurements taken throughout the water column for which the means have been calculated into 50cm bins. I would like to use geom_tile to show the variation of each variable in each bin throughout the water column, so the plot has the variable (categorical) on the x-axis, the depth on the y-axis and a different colour scale for each variable representing the value. I am able to do this for one variable using
ggplot(data, aes(x=var, y=depth, fill=value, color=value)) +
geom_tile(size=0.6)+ theme_classic()+scale_y_continuous(limits = c(0,11), expand = c(0, 0))
But if I put all variables onto one plot, the legend is scaled to the min and max of all values so the variation between bins is lost.
To provide a reproducible example, I have used the mtcars dataset and mapped value to alpha, which, of course, doesn't help much because the scale of each variable is so different:
library(ggplot2)
data("mtcars")
# STACKS DATA
library(reshape2)
dat2b <- melt(mtcars, id.vars=1:2)
dat2b
ggplot(dat2b) +
  geom_tile(aes(x=variable , y=cyl, fill=variable, alpha = value))
Which produces
Is there a way I can add a scale bar for each variable on the plot?
This question is similar to others (e.g. here and here), but they do not use a categorical variable on the x-axis, so I have not been able to modify them to produce the desired plot.
Here is a mock-up of the plot I have in mind using just four of the variables, except I would have all legends horizontal at the bottom of the plot using theme(legend.position="bottom")
Hope this helps:
The function myfun was originally posted by Duck here: R ggplot heatmap with multiple rows having separate legends on the same graph
library(purrr)
library(ggplot2)
library(patchwork)
data("mtcars")
# STACKS DATA
library(reshape2)
dat2b <- melt(mtcars, id.vars=1:2)
dat2b
#Split into list
List <- split(dat2b,dat2b$variable)
#Function for plots
myfun <- function(x) {
  G <- ggplot(x, aes(x = variable, y = cyl, fill = value)) +
    geom_tile() +
    theme(legend.direction = "vertical", legend.position = "bottom")
  return(G)
}
#Apply
List2 <- lapply(List,myfun)
#Plot
reduce(List2, `+`)+plot_annotation(title = 'My plot')
patchwork::wrap_plots(List2)
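If a single shared legend is acceptable, a different technique from the one above is to rescale each variable to the 0 to 1 range before plotting, so that the variation within every column stays visible. This is only a sketch under assumptions: rescale01 is a hypothetical helper, the values are first averaged per cyl to mimic the binned means described in the question, and the legend then shows relative rather than absolute values.

```r
library(ggplot2)
library(reshape2)

dat2b <- melt(mtcars, id.vars = 1:2)

# One mean per variable and cyl, analogous to the depth-bin means in the question
means <- aggregate(value ~ variable + cyl, data = dat2b, FUN = mean)

# Hypothetical helper: rescale a numeric vector to the range [0, 1]
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# Rescale within each variable so every column uses the full colour range
means$scaled <- ave(means$value, means$variable, FUN = rescale01)

ggplot(means, aes(x = variable, y = factor(cyl), fill = scaled)) +
  geom_tile() +
  labs(y = "cyl", fill = "Relative value") +
  theme(legend.position = "bottom")
```

The trade-off is that absolute values can no longer be read off the legend, which the separate-legend approach with patchwork preserves.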

ggplot2 scale_x_discrete value causing uneven axis spacing

I have a data frame with rows containing the titles of journal publications, values, and indicating whether it is a normal or a highlight data point. I want the plot to preserve the order of the data frame. The following code produces an unevenly spaced y-axis.
require(ggplot2)
title <- c("COGNITION","MUTAT RES-DNA REPAIR","AM J PHYSIOL-CELL PH","AM J PHYSIOL-CELL PH","BLOOD",
"PNAS","BIOCHEM BIOPH RES CO","CLIN CANCER RES","BIOCHEM BIOPH RES CO","MOL THER" )
value <- c(-0.428, -0.637, -0.740, -0.782, -0.880, -1.974, -1.988, -2.029, -2.217, -2.242)
indicator <- c(rep("highlight",5), rep("normal",5))
df <- data.frame(title, value, indicator)
mycolors <- c("highlight" = "blue", "normal" = "red")
x_axis_range <- c((min(df$value)), (max(df$value)))
p <- ggplot(df, aes(x = title, y = value)) +
geom_point(aes(size=3, color=indicator)) +
coord_flip() +
scale_color_manual(values=mycolors) +
scale_y_continuous(limit=x_axis_range) +
# produces uneven spacing
scale_x_discrete(limits=df$title) +
theme(legend.position="none")
show(p)
I don't know why ggplot is adding extra space between the MOL THER and CLIN CANCER RES and between the BLOOD and AM J PHYSIOL-CELL PH data points. When I change the scale_x_discrete() line to:
scale_x_discrete(limits=df$title.1) +
This spacing becomes even, but the order of the data changes to alphabetical by title from bottom to top.
Why does adding the .1 to the end of limits=df$title even out the spacing? How can I preserve this evenness, and still be able to control the order the data along the y-axis with the order() function?
You get uneven spacing on the discrete scale because df$title supplies 10 limit values, while the plot contains only 8 unique titles, so the two repeated titles leave two extra positions on the axis.
When you use scale_x_discrete(limits=df$title.1), the limits are actually ignored: there is no title.1 column in your data, so df$title.1 is NULL and the default limits are used.
To get the order you need, provide the unique() values of df$title, converted to character so that the original order is kept:
ggplot(df, aes(x = title, y = value)) +
  # size is a constant, so it is set outside aes()
  geom_point(aes(color = indicator), size = 3) +
  coord_flip() +
  scale_color_manual(values = mycolors) +
  scale_y_continuous(limits = x_axis_range) +
  scale_x_discrete(limits = unique(as.character(df$title))) +
  theme(legend.position = "none")
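The counts behind this explanation can be checked directly in the console. This short check uses only the title vector from the question.

```r
title <- c("COGNITION", "MUTAT RES-DNA REPAIR", "AM J PHYSIOL-CELL PH",
           "AM J PHYSIOL-CELL PH", "BLOOD", "PNAS", "BIOCHEM BIOPH RES CO",
           "CLIN CANCER RES", "BIOCHEM BIOPH RES CO", "MOL THER")
df <- data.frame(title = title)

length(df$title)                        # 10 values passed as limits
length(unique(as.character(df$title)))  # 8 distinct titles
title[duplicated(title)]                # the two repeated titles
is.null(df$title.1)                     # TRUE: no such column, so limits = NULL
```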

How to change origin line position in ggplot bar graph?

Say I'm measuring 10 personality traits and I know the population baseline. I would like to create a chart for individual test-takers to show them their individual percentile ranking on each trait. Thus, the numbers go from 1 (percentile) to 99 (percentile). Given that a 50 is perfectly average, I'd like the graph to show bars going to the left or right from 50 as the origin line. In bar graphs in ggplot, it seems that the origin line defaults to 0. Is there a way to change the origin line to be at 50?
Here's some fake data and default graphing:
df <- data.frame(
names = LETTERS[1:10],
factor = round(rnorm(10, mean = 50, sd = 20), 1)
)
library(ggplot2)
ggplot(data = df, aes(x=names, y=factor)) +
geom_bar(stat="identity") +
coord_flip()
Picking up on @nongkrong's comment, here's some code that will do what I think you want while relabeling the ticks to match the original range and relabeling the axis to avoid showing the math:
library(ggplot2)
ggplot(data = df, aes(x=names, y=factor - 50)) +
geom_bar(stat="identity") +
scale_y_continuous(breaks=seq(-50,50,10), labels=seq(0,100,10)) + ylab("Percentile") +
coord_flip()
This post was really helpful for me - thanks @ulfelder and @nongkrong. However, I wanted to re-use the code on different data without having to manually adjust the tick labels to fit the new data. To do this in a way that retained ggplot's tick placement, I defined a tiny function and passed it to the labels argument:
fix.labels <- function(x){
x + 50
}
ggplot(data = df, aes(x=names, y=factor - 50)) +
geom_bar(stat="identity") +
scale_y_continuous(labels = fix.labels) + ylab("Percentile") +
coord_flip()
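If the origin is not always 50, the same idea can be parameterised. The following is a minimal sketch rather than part of the answers above: shift_labels is a hypothetical helper that returns a labelling function for a given baseline, and set.seed is only there to make the fake data reproducible.

```r
library(ggplot2)

set.seed(1)  # only so that the fake data is reproducible
df <- data.frame(
  names = LETTERS[1:10],
  factor = round(rnorm(10, mean = 50, sd = 20), 1)
)

baseline <- 50

# Hypothetical helper: build a labelling function that adds the baseline back
shift_labels <- function(baseline) {
  function(x) x + baseline
}

# Bars are drawn relative to the baseline; labels show the original scale
ggplot(df, aes(x = names, y = factor - baseline)) +
  geom_col() +
  scale_y_continuous(labels = shift_labels(baseline)) +
  ylab("Percentile") +
  coord_flip()
```

Changing baseline in one place then moves both the origin line and the tick labels together.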

Adding space between bars in ggplot2

I'd like to add spaces between bars in ggplot2. This page offers one solution: http://www.streamreader.org/stats/questions/6204/how-to-increase-the-space-between-the-bars-in-a-bar-plot-in-ggplot2. Instead of using factor levels for the x-axis groupings, however, this solution creates a numeric sequence, x.seq, to place the bars manually and then scales them using the width argument. The width argument doesn't work, however, when I use factor level groupings for the x-axis, as in the example below.
library(ggplot2)
Treatment <- rep(c('T','C'),each=2)
Gender <- rep(c('M','F'),2)
Response <- sample(1:100,4)
df <- data.frame(Treatment, Gender, Response)
hist <- ggplot(df, aes(x=Gender, y=Response, fill=Treatment))
# stat is an argument of geom_bar, not an aesthetic
hist + geom_bar(stat="identity", position = "dodge") +
  scale_y_continuous(limits = c(0, 100), name = "")
Does anyone know how to get the same effect as in the linked example, but while using factor level groupings?
Is this what you want?
hist + geom_bar(stat="identity", width=0.4, position = position_dodge(width=0.5))
width in geom_bar sets the width of each bar.
width in position_dodge sets how far apart the bars within a group are placed; when it is larger than the bar width, a gap appears between them.
Their behavior becomes clear once you experiment with a few values.
