Adjusting position on discrete x-axis of two groups in ggplot - r

I have a plot with a continuous y-axis and a discrete x-axis.
For the data I have a group factor with 3 levels and 2 measurement points, so 6 geoms are created.
I would like to keep the width of the individual geoms but add space between the two measurement points, i.e. between the two groups of geoms. Like: 3 geoms - gap - 3 geoms. Is there any way to adjust the position of a group of geoms on the x-axis in ggplot?
preferences %>%
pivot_longer(c(F1_life_satisfaction_pre, F1_life_satisfaction_current), names_to = "variables", values_to = "ratings")%>%
ggplot( aes(y=ratings, x=fct_inorder(variables), fill=fct_inorder(playing_preference))) +
geom_violin(scale="width", adjust=0.5, width=0.8, alpha= 0.2, position = position_dodge(1)) +
stat_summary(fun=mean, geom="point", shape=23, size=2, position = position_dodge(1)) +
stat_summary(aes(group=fct_inorder(playing_preference)), fun=mean, geom = "line", size= 0.5, position = position_dodge(1)) +
stat_summary(fun.data=mean_cl_normal, fun.args=list(mult=1),aes(x=fct_inorder(variables), y=ratings), geom="errorbar",
width=0.05, position = position_dodge(1)) +
scale_x_discrete(labels = c("pre-Pokemon-Go", "current"),expand = c(0, 0.3)) +
theme(axis.text.x = element_text(color = "black", size=10)) +
scale_y_continuous(breaks = c(1, 2, 3, 4, 5, 6, 7), limits=c(1,7)) +
geom_segment(aes(x = 0, y=4, yend=4, xend=3), color="grey") +
theme(axis.ticks.x = element_blank()) +
labs(fill = "playing preference") +
labs(x="life satisfaction") +
theme(axis.title = element_text(size = 10))+
theme(legend.text = element_text(size = 10)) +
theme(legend.title = element_text(size = 10)) +
labs(y = "mean ratings") +
geom_boxplot(width=0.1,color="black", alpha=0.2, position = position_dodge(1)) +
scale_fill_viridis(discrete=T)

TL;DR - play with width= and position_dodge(width=...) within your lines for geom_violin and geom_boxplot to adjust the positions, along with scale_x_discrete(expand=expansion(...)).
The first point is that how close together or far apart things appear on your plot depends on the size of your window. With that being said, the positioning of the plot elements relative to one another can be controlled via ggplot. In particular, you want to change the values of width= and position_dodge(width=...) in your geom_violin call (and your geom_boxplot call).
Example Dataset
I'll use an example dataset to illustrate the idea, where I'll plot boxplots... but the idea is identical. The example dataset contains two x values ("Group1" and "Group2"), and each of those has subdivisions that are either "A", "B", or "C", containing a separate normal distribution of 50 datapoints for every x and x.subdiv.
set.seed(8675309)
df <- data.frame(
x=c(rep('Group1', 150), rep('Group2', 150)),
x.subdiv=rep(c(rep('A', 50), rep('B',50), rep('C',50)), 2),
y=unlist(lapply(1:6, function(x){rnorm(50, runif(1,10,15), runif(1,0,7))}))
)
Width of position_dodge
Here's the simple boxplot, where I'll use 0.5 as the value for both width= and position_dodge(width=...). Note that the first argument in position_dodge is width=, so you can just supply that number directly to that function without explicitly assigning to the width argument.
p <- ggplot(df, aes(x=x, y=y)) + theme_bw()
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.5))
The rule to note here is:
geom_boxplot(width=...) controls how wide the boxes are: it is the combined width of all the box plots drawn around each x= value.
position_dodge(width=...) controls the amount of spread (the amount of "dodging") of the groups around each x= value.
So this is what happens when you change position_dodge(width=1), but leave geom_boxplot(width=0.5):
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(1))
The width of each box remains the same as before, but the positioning of each box around x= is more "spread out". In effect, each is "dodged" more. If you set position_dodge(width=0.2), you'll see the opposite effect, where the boxes become squished together (because they are not spread out as much around x=):
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.2))
The interesting thing is how geom_boxplot(width=) and position_dodge(width=) are related:
If geom_boxplot(width=) is equal to position_dodge(width=), the boxes will be touching
If geom_boxplot(width=) is less than position_dodge(width=), the boxes will be separated from one another
If geom_boxplot(width=) is greater than position_dodge(width=), the boxes will be overlapping one another
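The three cases above can be checked with a little arithmetic. The following is a minimal sketch in base R that mimics how I understand the dodging to be computed; the helper names dodge_layout() and dodge_gap() are hypothetical and not part of ggplot2, and the formula is an assumption about its behaviour rather than a copy of its source.

```r
# Hypothetical helpers (not part of ggplot2): a sketch of the dodge arithmetic
# for n boxes that share one position x on a discrete axis.
dodge_layout <- function(x, n, geom_width, dodge_width) {
  i <- seq_len(n)
  # Centres are spread over dodge_width around x
  centre <- x + dodge_width * ((i - 0.5) / n - 0.5)
  # Each box gets an equal share of geom_width
  half <- geom_width / n / 2
  data.frame(box = i, xmin = centre - half, centre = centre, xmax = centre + half)
}

# Gap between neighbouring boxes:
# positive = separated, zero = touching, negative = overlapping
dodge_gap <- function(n, geom_width, dodge_width) {
  (dodge_width - geom_width) / n
}

layout_sep  <- dodge_layout(x = 1, n = 3, geom_width = 0.5, dodge_width = 0.55)
gap_touch   <- dodge_gap(3, geom_width = 0.5, dodge_width = 0.5)
gap_sep     <- dodge_gap(3, geom_width = 0.5, dodge_width = 0.55)
gap_overlap <- dodge_gap(3, geom_width = 1,   dodge_width = 0.8)

print(layout_sep)
```

Under these assumptions only the difference between the two widths decides whether neighbouring boxes touch, while the dodge width alone decides where the box centres sit.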
Width of the geom
The width= of the geom itself relates to how wide the boxplots are. There are two points to keep in mind:
The width= is the sum of all the widths of the individual dodged geoms for that particular x= aesthetic.
width=1 is the width between two values on a discrete axis, meaning when you set width=1, the boxes will be wide enough to touch
That means that if we set geom_boxplot(width=1), the combined total of all the boxes for "Group1" will be wide enough to touch the boxes of "Group2"... but you would only see that if there were no overlap among the boxes (meaning that position_dodge(width=) would be equal to geom_boxplot(width=)).
So this makes the boxes wide enough to be touching, but position_dodge(width) is less than geom_boxplot(width)... so the boxes overlap, but "Group1" boxes are separated from "Group2" boxes:
p + geom_boxplot(aes(fill=x.subdiv), width=1, position=position_dodge(0.8))
If we want everything to touch, you have to set them equal, and both equal to 1:
p + geom_boxplot(aes(fill=x.subdiv), width=1, position=position_dodge(1))
Control both widths
In the end, it's probably best to control both. If we go from the previous plot, you probably want the plots to have separation between "Group1" and "Group2". That means you need to make the width of all boxes smaller (which we control by geom_boxplot(width)). However, you probably still want the dodging to leave a bit of space between the boxes, so we'll have to set position_dodge(width) to be greater than geom_boxplot(width), but not too large so that we lose the separation between "Group1" and "Group2". Something like this works pretty well:
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.55))
In your case, you have both geom_violin and geom_boxplot, so you'll need to adjust those together and work out the proper look.
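As a rough sketch of that last point, one option is to give both layers the same position_dodge() width so that their centres should line up, and to vary only the width= of each geom. The numbers here (0.55, 0.1, 0.6) are illustrative guesses to be tuned by eye, not recommended values, and the data is the example dataset from above rebuilt so the block runs on its own.

```r
library(ggplot2)

# Same example data as above: 2 groups x 3 subdivisions x 50 points
set.seed(8675309)
df <- data.frame(
  x = rep(c("Group1", "Group2"), each = 150),
  x.subdiv = rep(rep(c("A", "B", "C"), each = 50), times = 2),
  y = unlist(lapply(1:6, function(i) rnorm(50, runif(1, 10, 15), runif(1, 0, 7))))
)

# One dodge width shared by both layers (illustrative value)
shared_dodge <- position_dodge(width = 0.6)

p_combined <- ggplot(df, aes(x = x, y = y, fill = x.subdiv)) +
  # violins: total width a bit smaller than the dodge width, so they separate
  geom_violin(width = 0.55, alpha = 0.2, position = shared_dodge) +
  # boxes: much narrower, drawn inside the violins
  geom_boxplot(width = 0.1, alpha = 0.2, position = shared_dodge) +
  theme_bw()

p_combined
```

Keeping the dodge width identical across layers and changing only width= means the gap between "Group1" and "Group2" is driven by the widest geom, here the violins.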
EDIT: "Shift Left and Right" and "Squish"
If the width= and position_dodge(width= arguments are just not quite getting you what you need, there is another parameter that can work in concert with them to move things around. This would be to use scale_x_discrete(expand=... to control the amount of space to the left and right of your x axis items. Used together with width= and position_dodge(width=, this actually gives you precise control of where to position your data along the x axis while still respecting the automated plotting that ggplot2 provides.
width= controls the whitespace between the groups of data at neighboring positions along the x axis
position_dodge(width= controls the amount of whitespace between subgroups in the data positioned along the x axis
scale_x_discrete(expand=... controls white space to the left and right sides of the panel.
I'll demonstrate the functionality using the same dataset as before. Note that proper use of the expand= argument for scale_x_discrete should call expansion() and you will need to provide a 1 or 2 length vector to either add= or mult=. Play around with both and numbers to see the effect, but here's kind of what to expect.
The expansion() function takes either mult= or add= as arguments, each of which can be a vector of length 2 (where the first value is applied to the left side and the second to the right side) or of length 1 (where the number is applied to both sides). Numbers sent to mult= are multiplied by the range of the scale to give the amount of padding, while numbers sent to add= are added directly in axis units. The code below therefore pads the left and the right side by 30% (0.3 * range) each:
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.55)) +
scale_x_discrete(expand=expansion(mult=0.3))
Sending two values, you can adjust the sides separately. This pads the left side by 100% of the scale range and the right side by only 50%:
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.55)) +
scale_x_discrete(expand=expansion(mult=c(1,0.5)))
Bottom Line: By using all three together (width=, position_dodge(width=...) and scale_x_discrete(expand=expansion(...))), you can in theory place your x groupings anywhere along your plot. Just keep in mind that the resolution and aspect ratio of your graphics device will change how things are laid out a bit, so resizing the graphics window gives additional control.
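To see what expansion() actually hands to the scale, it can help to print its return value. This is a small sketch; the variable names are mine, and the layout of the returned vector (lower mult, lower add, upper mult, upper add) is how I understand ggplot2 to store it.

```r
library(ggplot2)

# One value: applied to both sides
sym <- expansion(mult = 0.3)

# Two values: first = left/lower side, second = right/upper side
asym <- expansion(mult = c(1, 0.5))

# add= works in axis units; on a discrete axis one unit is the
# distance between two neighbouring categories
add_units <- expansion(add = c(0.2, 1))

print(sym)
print(asym)
print(add_units)
```

Any of these objects can be passed as scale_x_discrete(expand = ...), and mult= and add= can also be combined in a single call.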

Related

How to remove empty spaces between tiles in geom_tile and change tile size

I have a df with the following structure:
#   id col1 col2 col3
# 1  A    1    3    3
# 2  B    2    2    3
# 3  C    1    2    3
# 4  D    3    1    1
I wanted to create a "heatmap-like" figure where col1-col3 are treated as a factor variable (with five levels 1-5, not all shown here) and depending on their value they receive a different color. I've gotten relatively far with the following code:
library(reshape2)  # provides melt()
df <- melt(df, id.vars="id")
p <- ggplot(df, aes(x=variable, y=id, label=value, fill=as.factor(value))) +
geom_tile(colour="white", alpha=0.2, aes(width=0.4)) +
scale_fill_manual(values=c("yellow", "orange", "red", "green", "grey")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "Value", y="id") +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))
However, for some reason my tiles have large grey empty spaces between them on the x-axis (i.e. between each factor level).
The output image looks something like this
Additionally, I have these thin white lines in the middle of each tile that
So what I'd like to do is:
1- change the tile size and shape (would like it to be a square & smaller than now)
2- remove white line in the middle of tile.
Thank you!
OP. I noticed that in your response to another answer, you've refined your question a bit. I would recommend you edit your original question to reflect some of what you were looking to do, but here's the overall picture to summarize what you wanted to know:
How to remove the gray space between tiles
How to make the tiles smaller
How to make the tiles more square
Here's how to address each one in turn.
How to remove gray space between tiles
This was already answered in a comment and in the other answer from @dy_by. The tile geom has the attribute width, which determines how big the tile is relative to the coordinate system, where width=1 means the tiles "touch" one another. This part is important, because the size of the tile on screen is a different thing from its size relative to the coordinate system. If you set width=0.4, then each tile takes up 40% of the distance between one discrete x value and the next. This means that with any value smaller than width=1 you will have "space" between the tiles.
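A quick way to confirm this is to look at the tile extents that ggplot2 computes. The sketch below rebuilds the small example table by hand (the object name tiles is mine) and compares the drawn tile widths for width=0.4 and width=1 using layer_data().

```r
library(ggplot2)

# The example data from the question, already in long format
tiles <- expand.grid(variable = c("col1", "col2", "col3"),
                     id = c("A", "B", "C", "D"))
tiles$value <- factor(c(1, 3, 3,   # id A
                        2, 2, 3,   # id B
                        1, 2, 3,   # id C
                        3, 1, 1))  # id D

p_gap  <- ggplot(tiles, aes(variable, id, fill = value)) + geom_tile(width = 0.4)
p_full <- ggplot(tiles, aes(variable, id, fill = value)) + geom_tile(width = 1)

# Width of each drawn tile in axis units (1 = distance between categories)
w_gap  <- as.numeric(with(layer_data(p_gap),  xmax - xmin))
w_full <- as.numeric(with(layer_data(p_full), xmax - xmin))

range(w_gap)
range(w_full)
```

With width=0.4 each tile covers less than half of the slot between neighbouring x values, which is where the gaps come from.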
How to make the tiles square
The tile geom draws a square tile, so the reason that your tiles are not square in the output has nothing to do with the geom - it has to do with your coordinate system and the graphics device drawing it in your program. By default, ggplot2 will draw your coordinate system in an aspect ratio to match that of your graphics device. Change the size of the device viewport (the window), and the aspect ratio of your coordinate system (and tiles) will change. There is an easy way to fix this to be "square", which is to use coord_fixed(). You can set any aspect ratio you want, but by default, it will be set to 1 (square).
How to make the tiles smaller
Again, the size of your tiles is not controlled by the geom_tile() function... or the coordinate system. It's controlled by the viewport you set in your graphics device. Note that the coordinate system and geoms will resize, but the text will remain constant. This means that if you scale down a viewport or window, your tiles will become smaller, but the size of the text will (relatively-speaking) seem larger. Try this out by calling ggsave() with different arguments for width= with your plot.
Putting it together
Therefore, here's my suggestion for how to change your code to fix all of that. Note I'm also suggesting you change the theme to theme_classic() or something similar, which removes the gridlines by default and the background color is set to white. It works well for tile maps like this.
p <- ggplot(df, aes(x=variable, y=id, label=value, fill=as.factor(value))) +
geom_tile(colour="white", alpha=0.2, width=1) +
scale_fill_manual(values=c("yellow", "orange", "red", "green", "grey")) +
labs(x = "Value", y="id") +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0)) +
coord_fixed() +
theme_classic() +
# theme() tweaks go after theme_classic(); a complete theme added later would override them
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
p
Now for saving that plot with different width= settings to show you how things change for sizing. You don't have to specify height=, since the aspect ratio is fixed at 1.
ggsave("example_big.png", plot=p, width=12)
ggsave("example_small.png", plot=p, width=3)
gaps between tiles: change width=0.4 to width=1 or remove it.
white lines between tiles: they come from the parameter colour="white" - remove it if you want
lines on tiles are background grid lines, visible because of the transparency parameter alpha=0.2 - change it to a higher value or remove the lines with + theme(panel.grid.major = element_blank()) at the end
Summary:
ggplot(df, aes(x=variable, y=id, label=value, fill=as.factor(value))) +
geom_tile(alpha=0.2) +
scale_fill_manual(values=c("yellow", "orange", "red", "green", "grey")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "Value", y="id") +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))+
theme(panel.grid.major = element_blank())

Create one tornado diagram with multiple factors ggplot

I am working with data that I am turning into a tornado diagram that in the end I want to look something like this:
Currently my data looks like this:
Using facet_wrap I get a result like this but want it all on one graph to look like the sample at the start:
ggplot(data = df, aes(x = Person)) +
geom_crossbar(aes(y=Mean, ymin= Min, ymax = Max)) +
coord_flip() +
facet_wrap(~Letter, strip.position = "left", scales = "free_x") +
theme(panel.spacing = unit(0, "lines"), strip.background = element_blank(), strip.placement = "outside")
Is there any way to do this in ggplot?
Sorry I had to include the images as links!
How about this? See the code and result below with some explanations after.
# reorder levels in order of appearance
df$Person <- factor(df$Person, levels=unique(df$Person))
ggplot(df, aes(x=Letter)) +
geom_crossbar(
aes(y=Mean, ymin=Min, ymax=Max),
fill='dodgerblue2', width=0.8
) +
facet_grid(Person~., scales='free', space='free', switch = 'y') +
scale_y_continuous(expand=expansion(mult=c(0.3,0.3))) +
labs(x=NULL, y='Axis Label Here') +
theme_classic() +
theme(
strip.placement = 'outside',
strip.background = element_rect(color=NA, fill=NA),
strip.text.y=element_text(angle=90, size=12),
panel.spacing = unit(0,'pt'),
panel.background = element_rect(color='black')
) +
coord_flip()
Now, for some explanation, where I'll step through the code from top to bottom, calling out specific changes/adjustments as they come up.
Re-leveling Person Factor. The purpose here is to ensure that the order we see the listing of "Persons" matches the order in which they are listed in the dataset. You can list them in any order, of course, but the default for characters/strings is: If it is a factor, then the order = the ordering of the levels. If it is not a factor, the order = alphabetical.
Overall Adjustment for facets. Given that each df$Person has one or more df$Letters associated, and given your example plot, it seems that you actually want to have facets be df$Person, with each having an x aesthetic for df$Letter.
Facet_grid. I use facet_grid() instead of facet_wrap(), since it offers more control. If you use the . ~ facet or facet ~ . notation, it acts just like facet_wrap(), except it will not "wrap" around. There are three critical arguments that are only all available via facet_grid():
scales=. This will remove the extra space in each facet that is not used. Since not every df$Person has the same amount of df$Letter associated, this is very important.
space=. By default, the space occupied by each facet is kept constant. This means that if one facet has 3 letters and another has only 1, the bars in the facet with 3 will be narrower than the single bar in the facet with only 1. Setting space="free" allows for all widths to be constant: it's the facet size that is "free" to adjust to the bar - not the other way around.
switch=. This allows for the strip placement (facet label) to be on the opposite side. It doesn't place it outside though...
Expanding the y scale. This is purely aesthetic. I'm trying to match what you show, which has extra space around the bars.
Theme Elements. There's a decent amount going on here, but basically I'm putting the facet label outside (strip.placement) and removing the box that goes around it usually (strip.background). I also smoosh the facets together (panel.spacing) and decided it was easier to view when you drew some lines between the facets (panel.background).
Some other things would be purely aesthetic, but I think this gets you close to your desired result. If you want to include the information / text... that's a different matter.
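Since the original data was only shown as an image, here is a minimal sketch with an entirely hypothetical data frame that has the column names the code above expects (Person, Letter, Min, Mean, Max). The names and numbers are made up purely so the approach can be tried out.

```r
library(ggplot2)

# Entirely hypothetical data: each Person has one to three Letters
df <- data.frame(
  Person = c("Ann", "Ann", "Ann", "Bob", "Cy", "Cy"),
  Letter = c("A",   "B",   "C",   "A",   "B",  "C"),
  Min    = c(2,     4,     1,     3,     5,    2),
  Mean   = c(5,     6,     4,     6,     7,    5),
  Max    = c(8,     9,     7,     8,     9,    8)
)

# Keep the order of appearance for the facets
df$Person <- factor(df$Person, levels = unique(df$Person))

p_tornado <- ggplot(df, aes(x = Letter)) +
  geom_crossbar(aes(y = Mean, ymin = Min, ymax = Max),
                fill = "dodgerblue2", width = 0.8) +
  # free scales and free space keep the bar thickness constant across facets
  facet_grid(Person ~ ., scales = "free", space = "free", switch = "y") +
  coord_flip()

p_tornado
```

With this toy data the facet for "Ann" should come out roughly three times as tall as the one for "Bob", which is the effect of space="free".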

ggplot2: geom_bar and position_dodge: not centered to x axis (factor)

I have two issue on this plot. I want to make the bars wider (and less spacing between the groups) and I want each group of bars to be centered to each x factor values.
I have a continuous variable on the y-axis and a factor value on the x-axis. I have three groups for each factor.
Here is an example of my issue with the Iris data:
d <- iris
ggplot(d) +
geom_col(aes(x=as.factor(Sepal.Length), y=Petal.Width, fill=as.factor(Species)),position = position_dodge(preserve = "single"), width=1) +
theme(axis.text.x = element_text(angle = 90))
This gets you something probably closer to what you're looking for (I assume d in your example is iris):
ggplot(iris) +
geom_col(
aes(
x=as.factor(Sepal.Length), y=Petal.Width, fill=Species
),
position = position_dodge(0.5),
width=0.5) +
theme(axis.text.x = element_text(angle = 90, vjust=0.5))
Now, for the explanation of what I changed and why:
Text Positioning on X-axis. You used element_text(angle=90) to change the direction of the text. This is correct, but it only changes the angle and not the positioning/alignments. By default, horizontal text is vertically aligned to be "at the top". If you run the code above and use vjust=1 in place of vjust=0.5, you'll see it goes back to the way it appears for you, with the tick marks being aligned to the "top" of the value on the x axis text.
as.factor(Species). No need to declare Species a factor. Run str(iris) and you'll see that iris$Species is already a factor. It doesn't change the result, except that it messes with the title of the legend.
Position_dodge width and width. This one is best explained by you messing with the values in the two terms position_dodge(0.5) and width=0.5. Play with it yourself and you'll see what they each do, but here's the general explanation:
Total column width for each position on the x-axis is determined by width=0.5 that is the argument for geom_col(). So, for every Sepal.Length factor in this graph, it means that "0.5" is used as the total width of the column (or columns) that are in that space. "1.0" would mean "I want all columns to touch each other" and something like "0.2" means "I want skinny columns". "0" means... I don't want columns - give them a width of zero!
The spread of the "sub-columns" (each Species column for each Sepal.Length in this example) is controlled by the position_dodge(width=0.5) term. Because 0.5 here equals the width= given to geom_col(), the sub-columns touch each other exactly. Higher values will split them apart and lower values will squish them together, where 0 means they are on top of one another. Making the value really large, you get sub-columns running into neighboring columns...
Again - play around with those terms and you should get how they work together.
Another possible solution is to use position_dodge2, which will center each bar on its x value:
ggplot(iris, aes(x = as.factor(Sepal.Length), y = Petal.Width, fill = Species))+
geom_col(position = position_dodge2(preserve = "single", width = 1))+
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

ggplot facetted geom_boxplot: reduce space between x-axis categories

I am creating a boxplot using ggplot. When I reduce the width of the boxplot the space between the x-axis categories increases. I would like to be able to reduce the space between the x-axis categories and bring the box plots closer to each other.
p<-ggplot(data.plot1, aes(time2, Count))
p+geom_boxplot(outlier.shape = NA, width=0.3)+
ggtitle("")+ylab("Cell Count (cells/mL) ")+ xlab("Time") +
theme_bw()+ coord_cartesian(ylim = c(0, 850))+
geom_hline(data=normal1, aes(yintercept = val), linetype="dashed")+
facet_grid(.~CellType1)
So, basically, reduce the space between Day 0, Day30, Day 100 and bring the boxplots closer to each other.
As mentioned in the comments, narrowing the graphics device is one way of doing it. Another way to do it without changing the size of the graphics device is to add space between your boxes and the sides of the panels. Note: Since your question is not reproducible, I have used the built-in infert dataset, which serves demonstrative purposes. Assuming that this is your original faceted side-by-side boxplot:
p<-ggplot(infert, aes(as.factor(education), stratum))
p+geom_boxplot(outlier.shape = NA, width=0.3)+
ggtitle("")+ylab("Cell Count (cells/mL) ")+ xlab("Time") +
theme_bw()+ coord_cartesian(ylim = c(0, 80))+
# geom_hline(data=normal1, aes(yintercept = val), linetype="dashed")+
facet_grid(.~induced)
This brings the categories together by adding white space on both ends of each panel:
p+geom_boxplot(outlier.shape = NA, width=0.6)+
ggtitle("")+ylab("Cell Count (cells/mL) ")+ xlab("Time") +
theme_bw()+ coord_cartesian(ylim = c(0, 80))+
# geom_hline(data=normal1, aes(yintercept = val), linetype="dashed")+
facet_grid(.~induced) +
scale_x_discrete(expand=c(0.8,0))
The two numbers in scale_x_discrete(expand=c(0.8,0)) indicate the multiplicative and additive expansion constant that "places some distance away from the axes". See ?scale_x_discrete. This effectively "squishes" the boxplots in each panel together, which also reduces the width of each boxplot. To compensate for that, I increased the width to width=0.6 in geom_boxplot. Notice that the x-axis labels are now overlapping. You will have to experiment with different expansion factors and width sizes to get exactly how you want it.
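For comparison, the same padding can be written with the newer expansion() helper. This is a sketch that assumes the legacy two-number form c(0.8, 0) corresponds to mult = 0.8 and add = 0 on both sides, which is how I read the documentation; the object names are mine.

```r
library(ggplot2)

# Legacy form used above: c(multiplicative, additive), applied to both sides
legacy <- c(0.8, 0)

# Explicit equivalent (assumption: same meaning as the legacy vector)
explicit <- expansion(mult = 0.8, add = 0)

p_squeezed <- ggplot(infert, aes(as.factor(education), stratum)) +
  geom_boxplot(outlier.shape = NA, width = 0.6) +
  facet_grid(. ~ induced) +
  scale_x_discrete(expand = explicit) +
  theme_bw()

p_squeezed
```

Passing a length-2 vector to mult=, for example expansion(mult = c(0.8, 0.4)), would pad the two sides of each panel by different amounts.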
Also see this question for a related issue: Remove space between bars within a grid

In ggplot2, can borders of bars be changed on only one side? (color, thickness)

I know, 3D barcharts are a sin. But I'm asked to do them, and as a trade-off I suggested only adding a border, in a slightly darker color than the bar's, on the top and the right side of each bar. That way the bars would have some kind of "shadow" (urgh) but at least you would still be able to compare them.
Is there any way to do this?
ggplot(diamonds, aes(clarity)) + geom_bar()
Another possibility, using two sets of geom_bar. The first set, the green ones, are made slightly higher and offset to the right. I borrow the data from @Didzis Elferts.
ggplot(data = df2) +
geom_bar(aes(x = as.numeric(clarity) + 0.1, y = V1 + 100),
width = 0.8, fill = "green", stat = "identity") +
geom_bar(aes(x = as.numeric(clarity), y = V1),
width = 0.8, stat = "identity") +
scale_x_continuous(name = "clarity",
breaks = as.numeric(df2$clarity),
labels = levels(df2$clarity))+
ylab("count")
As you already said - 3D barcharts are "bad". You can't do it directly in ggplot2 but here is a possible workaround for this.
First, make new data frame that contains levels of clarity and corresponding count for each level.
library(plyr)
df2<-ddply(diamonds,.(clarity),nrow)
Then, in the ggplot() call, use the new data frame with clarity as x values and V1 (counts) as y values, and add geom_blank() - this creates an x axis with the levels we need. Then add geom_rect() to produce the shading for the bars - here the xmin and xmax values are made with as.numeric() from clarity plus a constant - for xmin the constant should be less than half of the bar width and for xmax larger than half of the bar width. ymin is 0 and ymax is V1 (counts) plus some constant. Finally, add geom_bar(stat="identity") on top of this shadow to draw the actual barplot.
ggplot(df2,aes(clarity,V1)) + geom_blank()+
geom_rect(aes(xmin=as.numeric(clarity)-0.38,
xmax=as.numeric(clarity)+.5,
ymin=0,
ymax=V1+250),fill="green")+
geom_bar(width=0.8,stat="identity")
