Reduce distance in plot X labels (R: ggplot2) - r

This is my dataframe:
df = data.frame(info=1:30, type=c(replicate(5,'A'), replicate(5,'B')), group= c(replicate(10,'D1'), replicate(10,'D2'), replicate(10,'D3')))
I want to make a jitter plot of my data distinguished by group (X-label) and type (colour):
ggplot()+
theme(panel.background=element_rect(colour="grey", size=0.2, fill='grey100'))+
geom_jitter(data=df, aes(x=group, y=info, color=type, shape=type), position=position_dodge(0.2), cex=2)+
scale_shape_manual(values=c(17,15,19))+
scale_color_manual(values=c(A="mediumvioletred", B="blue"))
How can I reduce the distance between the X-labels (D1, D2, D3) in the representation?
P.D. I want to do it even if I left a blank space in the graphic

Here are a few options.
# Setting up the plot
library(ggplot2)
df <- data.frame(
info=1:30,
type=c(replicate(5,'A'), replicate(5,'B')),
group= c(replicate(10,'D1'), replicate(10,'D2'), replicate(10,'D3'))
)
p <- ggplot(df, aes(group, info, colour = type, shape = type))
Option 1: increase the dodge distance. This won't put the labels closer, but it makes better use of the space available so that the labels appear less isolated.
p +
geom_point(position = position_dodge(width = 0.9))
Option 2: Expand the x-axis. Increasing the expansion factor from the default 0.5 to >0.5 increases the space at the ends of the axis, putting the labels closer.
p +
geom_point(position = position_dodge(0.2)) +
scale_x_discrete(expand = c(2, 0))
Option 3: change the aspect ratio. Depending on the plotting window size, this also visually puts the x-axis labels closer together.
p +
geom_point(position = position_dodge(0.2)) +
theme(aspect.ratio = 2)
Created on 2021-06-25 by the reprex package (v1.0.0)

Try adding coord_fixed(ratio = 0.2) and play around with the ratio.
ggplot()+
theme(panel.background=element_rect(colour="grey", size=0.2, fill='grey100'))+
geom_jitter(data=df, aes(x=group, y=info, color=type, shape=type), position=position_dodge(0.2))+
scale_shape_manual(values=c(17,15,19))+
scale_color_manual(values=c(A="mediumvioletred", B="blue")) + coord_fixed(ratio = 0.2)

The simplest solution is to resize the plot. For example if you follow your command with ggsave("my_plot.pdf", width = 3, height = 4.5) it looks like this:
Or in an Rmd file you can control the dimensions by various means: see this link.

Related

How to draw circles inside each other with ggplot2?

I want to draw two circles inside each other with ggplot2.
So far my effort is:
make a fake data and plot it with geom_line(). If I convert this with coord_polar() then I will not be able to see two different circles the one inside each other
library(ggplot2)
library(tidyverse)
x1=seq(0,6000000,1000)
y1=rep(1,length(x1))
y2=rep(2,length(x1))
data=as.data.frame(cbind(x1,y1,y2))
Created on 2021-12-25 by the reprex package (v2.0.1)
# plot the data
ggplot(data) +
geom_line(aes(x1,y1)) +
geom_line(aes(x1,y2))
#coord_polar()
I would avoid the geom_circle option and use the coord_polar option if possible.
The reason is that these two circles have some differences in the x-axis, which I would indicate after drawing the circles.
I would like my plot to look like this
The code you have with coord_polar() is correct, just the plot limits need adjusting to see both the circles, e.g.
ggplot(data) +
geom_line(aes(x1,y1)) +
geom_line(aes(x1,y2)) +
coord_polar() + ylim(c(0,NA))
The reason for using ylim is that this is the direction getting transformed to the radius by the coord_polar()
Why not use two geom_point() with different sizes and pch = 21?
library(ggplot2)
df <- tibble(x = 0, y = 0)
ggplot(df, aes(x, y)) +
geom_point(pch = 21, size = 50) +
geom_point(pch = 21, size = 40) +
theme_void()

plotting geom_text at individual positions in each facet of facet_wrap

I have this dataset
df <- data.frame(groups=factor(c(rep("A",6), rep("B",6), rep("C",6))),
types=factor(c(rep(c(rep("Z",2), rep("Y",2), rep("X",2)),3))),
values=c(10,11,1,2,0.1, 0.2, 12,13, 2,2.5, 0.2, 0.01,
12,14, 2,3,0.1,0.2))
library(ggplot2)
px <- ggplot(df, aes(groups, values)) + facet_wrap(~types, scale="free") + geom_point()
px
i conducted post-hoc analysis (values~groups for each level of types) and created a dataset, which contains significance groups (as letters: a, b, c and so on):
df.text <- data.frame(groups=factor(c(rep("A",3), rep("B",3), rep("C",3))),
label=rep("a", 9))
i proceeded to plot the labels on px:
px + geom_text(data=df.text, aes(x=groups, y=0.1, label=label), size=4, col="red", stat="identity") +theme_bw()
which doesnt look great.
My problem is to define a aes(y) in geom_text, which plots the labels at at a fixed position (e.g. above the x-axis or on top of the panel), without shifting the limits of the y axis too much. With previous datasets, the y-values were quite homogeneous among groups, so i could get away with a very low y-value. This time however the range of y is quite high, so its not easily getting done.
So, the question is how to plot the labels inside df.text at a fixed position in facet_wrap, while keeping scale="free". Best would be above the top panel.border.
You could define a new variable height which determines the height to plot the labels:
library(tidyverse)
df %>%
group_by(types) %>%
mutate(height = max(values) + .3 * sd(values)) %>%
left_join(df.text, by = "groups") %>%
ggplot(aes(groups, values)) +
facet_wrap(~types, scale = "free") +
geom_point() +
geom_text(aes(x = groups, y = height, label = label), size = 4, col = "red", stat = "identity") +
theme_bw()
Here I used the max value plus .3 times the standard deviation but you could change that to whatever you wanted obviously. Not sure how to get the labels on top of the panel strips though.

How to set automatic label position based on box height

In a previous question, I asked about moving the label position of a barplot outside of the bar if the bar was too small. I was provided this following example:
library(ggplot2)
options(scipen=2)
dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
Anno = 1:10)
ggplot(data = dataset,
aes(x = Anno,
y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity",
width=0.8,
position="dodge") +
geom_text(aes( y = Riserva_Riv_Fine_Periodo,
label = round(Riserva_Riv_Fine_Periodo, 0),
angle=90,
hjust= ifelse(Riserva_Riv_Fine_Periodo < 3000000, -0.1, 1.2)),
col="red",
size=4,
position = position_dodge(0.9))
And I obtain this graph:
The problem with the example is that the value at which the label is moved must be hard-coded into the plot, and an ifelse statement is used to reposition the label. Is there a way to automatically extract the value to cut?
A slightly better option might be to base the test and the positioning of the labels on the height of the bar relative to the height of the highest bar. That way, the cutoff value and label-shift are scaled to the actual vertical range of the plot. For example:
ydiff = max(dataset$Riserva_Riv_Fine_Periodo)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity", width=0.8) +
geom_text(aes(label = round(Riserva_Riv_Fine_Periodo, 0), angle=90,
y = ifelse(Riserva_Riv_Fine_Periodo < 0.3*ydiff,
Riserva_Riv_Fine_Periodo + 0.1*ydiff,
Riserva_Riv_Fine_Periodo - 0.1*ydiff)),
col="red", size=4)
You would still need to tweak the fractional cutoff in the test condition (I've used 0.3 in this case), depending on the physical size at which you render the plot. But you could package the code into a function to make the any manual adjustments a bit easier.
It's probably possible to automate this by determining the actual sizes of the various grobs that make up the plot and setting the condition and the positioning based on those sizes, but I'm not sure how to do that.
Just as an editorial comment, a plot with labels inside some bars and above others risks confusing the visual mapping of magnitudes to bar heights. I think it would be better to find a way to shrink, abbreviate, recode, or otherwise tweak the labels so that they contain the information you want to convey while being able to have all the labels inside the bars. Maybe something like this:
library(scales)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo/1000)) +
geom_col(width=0.8, fill="grey30") +
geom_text(aes(label = format(Riserva_Riv_Fine_Periodo/1000, big.mark=",", digits=0),
y = 0.5*Riserva_Riv_Fine_Periodo/1000),
col="white", size=3) +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
theme_classic() +
labs(y="Riserva (thousands)")
Or maybe go with a line plot instead of bars:
ggplot(dataset, aes(Anno, Riserva_Riv_Fine_Periodo/1e3)) +
geom_line(linetype="11", size=0.3, colour="grey50") +
geom_text(aes(label=format(Riserva_Riv_Fine_Periodo/1e3, big.mark=",", digits=0)),
size=3) +
theme_classic() +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
expand_limits(y=0) +
labs(y="Riserva (thousands)")

Whisker plots to compare mean and variance between clusters [duplicate]

I am trying to recreate a figure from a GGplot2 seminar http://dl.dropbox.com/u/42707925/ggplot2/ggplot2slides.pdf.
In this case, I am trying to generate Example 5, with jittered data points subject to a dodge. When I run the code, the points are centered around the correct line, but have no jitter.
Here is the code directly from the presentation.
set.seed(12345)
hillest<-c(rep(1.1,100*4*3)+rnorm(100*4*3,sd=0.2),
rep(1.9,100*4*3)+rnorm(100*4*3,sd=0.2))
rep<-rep(1:100,4*3*2)
process<-rep(rep(c("Process 1","Process 2","Process 3","Process 4"),each=100),3*2)
memorypar<-rep(rep(c("0.1","0.2","0.3"),each=4*100),2)
tailindex<-rep(c("1.1","1.9"),each=3*4*100)
ex5<-data.frame(hillest=hillest,rep=rep,process=process,memorypar=memorypar, tailindex=tailindex)
stat_sum_df <- function(fun, geom="crossbar", ...) {stat_summary(fun.data=fun, geom=geom, ...) }
dodge <- position_dodge(width=0.9)
p<- ggplot(ex5,aes(x=tailindex ,y=hillest,color=memorypar))
p<- p + facet_wrap(~process,nrow=2) + geom_jitter(position=dodge) +geom_boxplot(position=dodge)
p
In ggplot2 version 1.0.0 there is new position named position_jitterdodge() that is made for such situation. This postion should be used inside the geom_point() and there should be fill= used inside the aes() to show by which variable to dodge your data. To control the width of dodging argument dodge.width= should be used.
ggplot(ex5, aes(x=tailindex, y=hillest, color=memorypar, fill=memorypar)) +
facet_wrap(~process, nrow=2) +
geom_point(position=position_jitterdodge(dodge.width=0.9)) +
geom_boxplot(fill="white", outlier.colour=NA, position=position_dodge(width=0.9))
EDIT: There is a better solution with ggplot2 version 1.0.0 using position_jitterdodge. See #Didzis Elferts' answer. Note that dodge.width controls the width of the dodging and jitter.width controls the width of the jittering.
I'm not sure how the code produced the graph in the pdf.
But does something like this get you close to what you're after?
I convert tailindex and memorypar to numeric; add them together; and the result is the x coordinate for the geom_jitter layer. There's probably a more effective way to do it. Also, I'd like to see how dodging geom_boxplot and geom_jitter, and with no jittering, will produce the graph in the pdf.
library(ggplot2)
dodge <- position_dodge(width = 0.9)
ex5$memorypar2 <- as.numeric(ex5$tailindex) +
3 * (as.numeric(as.character(ex5$memorypar)) - 0.2)
p <- ggplot(ex5,aes(x=tailindex , y=hillest)) +
scale_x_discrete() +
geom_jitter(aes(colour = memorypar, x = memorypar2),
position = position_jitter(width = .05), alpha = 0.5) +
geom_boxplot(aes(colour = memorypar), outlier.colour = NA, position = dodge) +
facet_wrap(~ process, nrow = 2)
p

How to increase the space between the bars in a bar plot in ggplot2?

How can I increase the space between the bars in a bar plot in ggplot2 ?
You can always play with the width parameter, as shown below:
df <- data.frame(x=factor(LETTERS[1:4]), y=sample(1:100, 4))
library(ggplot2)
ggplot(data=df, aes(x=x, y=y, width=.5)) +
geom_bar(stat="identity", position="identity") +
opts(title="width = .5") + labs(x="", y="") +
theme_bw()
Compare with the following other settings for width:
So far, so good. Now, suppose we have two factors. In case you would like to play with evenly spaced juxtaposed bars (like when using space together with beside=TRUE in barplot()), it's not so easy using geom_bar(position="dodge"): you can change bar width, but not add space in between adjacent bars (and I didn't find a convenient solution on Google). I ended up with something like that:
df <- data.frame(g=gl(2, 1, labels=letters[1:2]), y=sample(1:100, 4))
x.seq <- c(1,2,4,5)
ggplot(data=transform(df, x=x.seq), aes(x=x, y=y, width=.85)) +
geom_bar(stat="identity", aes(fill=g)) + labs(x="", y="") +
scale_x_discrete(breaks = NA) +
geom_text(aes(x=c(sum(x.seq[1:2])/2, sum(x.seq[3:4])/2), y=0,
label=c("X","Y")), vjust=1.2, size=8)
The vector used for the $x$-axis is "injected" in the data.frame, so that so you change the outer spacing if you want, while width allows to control for inner spacing. Labels for the $x$-axis might be enhanced by using scale_x_discrete().
For space between factor bars use
ggplot(data = d, aes(x=X, y=Y, fill=F))
+ geom_bar(width = 0.8, position = position_dodge(width = 0.9))
The width in geom_bar controls the bar width in relation to the x-axis while the width in position_dodge control the width of the space given to both bars also in relation to the x-axis. Play around with the width to find one that you like.
Thank you very much chl.! I had the same problem and you helped me solving it. Instead of using geom_text to add the X-labels I used scale_x_continuous (see below)
geom_text(aes(x=c(sum(x.seq[1:2])/2, sum(x.seq[3:4])/2), y=0,
label=c("X","Y")), vjust=1.2, size=8)
replaced by
scale_x_continuous(breaks=c(mean(x.seq[1:2]), mean(x.seq[3:4])), labels=c("X", "Y"))
For space between POSIXlt bars you need adjust the width from the number of seconds in a day
# POSIXlt example: full & half width
d <- data.frame(dates = strptime(paste(2016, "01", 1:10, sep = "-"), "%Y-%m-%d"),
values = 1:10)
ggplot(d, aes(dates, values)) +
geom_bar(stat = "identity", width = 60*60*24)
ggplot(d, aes(dates, values)) +
geom_bar(stat = "identity", width = 60*60*24*0.5)

Resources