ggplot2: axis does not show all ticks/breaks - r

I am currently plotting data using the ggpubr package in R (based on ggplot2). When I plot the means of two conditions including standard errors, the y-axis should be limited from 1 to 7, which I indicate using:
p <- ggline(data, x = "condition", y = "measure",
add = c("mean_se"),
ylab = "Measure")
ggpar(p, ylim = c(1, 7), ticks = TRUE, yticks.by = 1)
In the final plot, however, the y-axis only shows values from 1 to 6.
I tried to plot the same data using plain ggplot2, but the problem persists as soon as I change the layout.
For ggplot2 I used:
p <- ggplot(data, aes(x=condition, y=measure)) +
geom_line() +
geom_point()+
geom_errorbar(aes(ymin=measure-se, ymax=measure+se), width=.2, position=position_dodge(0.05)) +
ylab("measure") +
xlab("Condition")
p + scale_y_continuous(name="measure", limits=c(1, 7), breaks=c(1:7))
p + theme_classic()
It would be great if someone could help me with this issue.
Edit:
as suggested in the comments, here is the data I am trying to plot using ggplot2:
structure(list(condition = structure(3:4, .Label = c("IC", "SC",
"ILC", "SLC"), class = "factor"), measure = c(4.10233918128655, 3.83040935672515
), se = c(0.235026318386523, 0.216811675834834)), class = "data.frame", row.names = c(NA,
-2L))

I think I got something resembling your plot with correct y-axes with the following code:
ggplot(data, aes(x = condition, y = measure)) +
geom_point() +
geom_errorbar(aes(ymin = measure-se, ymax = measure+se),
width = .2, position = position_dodge(0.05)) +
# Group prevents geom_line from interpreting each x-axis point as its own group
geom_line(aes(group = rep(1, nrow(data)))) +
xlab("Condition") +
# Expand is optional, it prevents padding beyond 1 and 7
scale_y_continuous(name = "measure",
limits = c(1, 7),
breaks = 1:7,
expand = c(0,0)) +
theme_classic()

The solution is much simpler. You were doing everything right, except for one clerical error. Here is what was happening:
First, you generate your initial plot, fine.
p <- ggplot(data, aes(x=condition, y=measure)) +
geom_line() + geom_point() +
geom_errorbar(aes(ymin=measure-se, ymax=measure+se),
width=.2, position=position_dodge(0.05)) +
ylab("measure") +
xlab("Condition")
This plot does not have the limits. When you add the limits and display it, the scales are correct:
p + scale_y_continuous(name="measure", limits=c(1, 7), breaks=c(1:7))
However, note that p did not change! You did not store the result of adding the limits to p. Therefore, p is still without the scale_y_continuous. No wonder then that when you type
p + theme_classic()
...the limits are gone. However, if you try
p <- p + scale_y_continuous(name="measure", limits=c(1, 7), breaks=c(1:7))
p + theme_classic()
everything will be correct.
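To make the point concrete, here is a minimal sketch showing that `+` returns a new plot object and leaves `p` untouched. The data frame uses rounded values from the question, and `y_limits()` is a hypothetical helper written only for this illustration; it relies on `layer_scales()` from ggplot2 to read the y scale of a built plot.

```r
library(ggplot2)

# Rounded copy of the data from the question
data <- data.frame(
  condition = factor(c("ILC", "SLC")),
  measure = c(4.10, 3.83),
  se = c(0.235, 0.217)
)

p <- ggplot(data, aes(x = condition, y = measure)) +
  geom_point() +
  geom_errorbar(aes(ymin = measure - se, ymax = measure + se), width = 0.2)

# Adding a scale creates a NEW plot object; p itself is not modified
p_scaled <- p + scale_y_continuous(limits = c(1, 7), breaks = 1:7)

# Hypothetical helper: y-axis limits of the first panel
y_limits <- function(plot) layer_scales(plot)$y$get_limits()

y_limits(p)         # limits derived from the data
y_limits(p_scaled)  # the limits set explicitly above
```

Only the object you assign keeps the scale, so any later `+ theme_classic()` has to be added to `p_scaled` (or to a reassigned `p`).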

Related

How to add percentages on top of a histogram when data is grouped

This is not my data (for confidentiality reasons), but I have tried to create a reproducible example using a dataset included in the ggplot2 library. I have a histogram summarizing the value of some variable by group (a factor with 2 levels). First, I wanted proportions of the total rather than counts, so I used this code:
library(ggplot2)
library(dplyr)
df_example <- diamonds %>% as.data.frame() %>% filter(cut=="Premium" | cut=="Ideal")
ggplot(df_example,aes(x=z,fill=cut)) +
geom_histogram(aes(y=after_stat(width*density)),binwidth=1,center=0.5,col="black") +
facet_wrap(~cut) +
scale_x_continuous(breaks=seq(0,9,by=1)) +
scale_y_continuous(labels=scales::percent_format(accuracy=2,suffix="")) +
scale_fill_manual(values=c("#CC79A7","#009E73")) +
labs(x="Depth (mm)",y="Count") +
theme_bw() + theme(legend.position="none")
It gave me this as a result.
The issue is that I would like to print the numeric percentages on top of the bins and haven't found a way to do so.
As I saw it done for printing counts elsewhere, I attempted to print them using stat_bin(), including the same y and label values as the y in geom_histogram, thinking it would print the right numbers:
ggplot(df_example,aes(x=z,fill=cut)) +
geom_histogram(aes(y=after_stat(width*density)),binwidth=1,center=0.5,col="black") +
stat_bin(aes(y=after_stat(width*density),label=after_stat(width*density*100)),geom="text",vjust=-.5) +
facet_wrap(~cut) +
scale_x_continuous(breaks=seq(0,9,by=1)) +
scale_y_continuous(labels=scales::percent_format(accuracy=2,suffix="")) +
scale_fill_manual(values=c("#CC79A7","#009E73")) +
labs(x="Depth (mm)",y="%") +
theme_bw() + theme(legend.position="none")
However, this prints far more values than there are bins, the values are not consistent with the bar heights, and they ignore vjust=-.5, which should place them slightly above the bars.
What am I missing here? I know that if there was no grouping variable/facet_wrap, I could use after_stat(count/sum(count)) instead of after_stat(width*density) and it seems that it would have fixed my issue. But I need the histograms for both groups to appear next to each other. Thanks in advance!
You have to use the same arguments in stat_bin as for the histogram when adding your labels, so that both layers use the same binning and the labels align with the bars:
library(ggplot2)
library(dplyr)
df_example <- diamonds %>%
as.data.frame() %>%
filter(cut == "Premium" | cut == "Ideal")
ggplot(df_example, aes(x = z, fill = cut)) +
geom_histogram(aes(y = after_stat(width * density)),
binwidth = 1, center = 0.5, col = "black"
) +
stat_bin(
aes(
y = after_stat(width * density),
label = scales::number(after_stat(width * density), scale = 100, accuracy = 1)
),
geom = "text", binwidth = 1, center = 0.5, vjust = -.25
) +
facet_wrap(~cut) +
scale_x_continuous(breaks = seq(0, 9, by = 1)) +
scale_y_continuous(labels = scales::number_format(scale = 100)) +
scale_fill_manual(values = c("#CC79A7", "#009E73")) +
labs(x = "Depth (mm)", y = "%") +
theme_bw() +
theme(legend.position = "none")
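As a side note on why `after_stat(width * density)` is a within-group proportion: a density is the count divided by (n times bin width), so multiplying by the bin width leaves count / n. A minimal sketch with base R's `hist()` on made-up data (`z_sim` is a hypothetical stand-in for one facet's values, not the diamonds data):

```r
set.seed(42)
z_sim <- runif(200, min = 0, max = 9)  # hypothetical values for a single group

# Same bin width as in the plot above (binwidth = 1, bins starting at 0)
h <- hist(z_sim, breaks = seq(0, 9, by = 1), plot = FALSE)
bin_width <- diff(h$breaks)

# Same quantity as after_stat(width * density), computed by hand
prop <- h$density * bin_width

all.equal(prop, h$counts / sum(h$counts))  # proportion of the group per bin
all.equal(sum(prop), 1)                    # proportions add up to 1 per group
```

Because the density is computed per group, this works with facets, whereas `count / sum(count)` would divide by the total over all groups.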

Need help on customizing my Odds Ratio (ggplot)!

I've been assigned to create an odds ratio plot with ggplot in R. The plot I'm supposed to create is given below.
Given plot
My job is to figure out codes which creates the exact plots in R. I've done most parts. Here is my work.
My work
Before jumping into my code, it is important to note that I am not using the correct values for boxOdds, boxCILow, and boxCIHigh, since I have not figured them out yet. I wanted to get the ggplot code right first so I can enter the right values as soon as I find them.
This is the code I used:
library(ggplot2)
boxLabels = c("Females/Males", "Student-Centered Prac. (+1)", "Instructor Quality (+1)", "Undecided / STM",
"non-STEM / STM", "Pre-med / STM", "Engineering / STM", "Std. test percentile (+10)",
"No previous calc / HS calc", "College calc / HS calc")
df <- data.frame(yAxis = length(boxLabels):1,
boxOdds =
c(2.23189, 1.315737, 1.22866, 0.8197413, 0.9802449, 0.9786673, 0.6559005, 0.5929812, 0.6923759, 1.3958275),
boxCILow =
c(.7543566,1.016,.9674772,.6463458,.9643047,.864922,.4965308,.3572142, 0.4523759, 1.2023275),
boxCIHigh =
c(6.603418,1.703902,1.560353,1.039654,.9964486,1.107371,.8664225,.9843584, 0.9323759, 1.5893275)
)
(p <- ggplot(df, aes(x = boxOdds, y = boxLabels)) +
geom_vline(aes(xintercept = 1), size = 0.75, linetype = 'dashed') +
geom_errorbarh(aes(xmax = boxCIHigh, xmin = boxCILow), size = .5, height =
0, color = 'gray50') +
geom_point(size = 3.5, color = 'orange') +
theme_bw() +
theme(panel.grid.minor = element_blank()) +
scale_x_continuous(breaks = seq(0,7,1) ) +
ylab('') +
xlab('Odds Ratio') +
annotate(geom = 'text', y =1.1, x = 3.5, label ='',
size = 3.5, hjust = 0) + ggtitle('Estimated Odds of Switching') +
theme(plot.title = element_text(hjust = 0.5, size = 30),
axis.title.x = (element_text(size = 15))) +
theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
)
p
Where I'm stuck at:
Removing the small vertical lines at the beginning and end of each row's CI. I was not sure what they are called, so I had a hard time looking it up. SOLVED
I'm also stuck at coloring specific rows in different colors.
The last part I'm stuck at is assigning the proper order of the variables on the y-axis. As you can see in my code (the "boxLabels" part), I listed the variables in the order of the given plot, but R ignored that order. So the variable at the very top is "Undecided / STM" instead of "Females/Males".
How do I decrease the space from 0 to 1? SOLVED
Any help would be appreciated!
First, you probably want ggstance::geom_pointrangeh. Second, you can define the colors by yAxis right at the beginning; to give several rows the same color, create a new variable group. Third, the ordering issue is related to your data, where you can assign factor labels. Fourth, remove coord_trans as suggested by @beetroot.
Assign factor labels
dat$yAxis <- factor(dat$yAxis, levels=1:10, labels=rev(boxLabels))
Create groups
dat$group <- 1
dat$group[which(dat$yAxis %in% c("Females/Males", "Undecided / STM", "non-STEM / STM",
"Pre-med / STM"))] <- 2
dat$group[which(dat$yAxis %in% c("Student-Centered Prac. (+1)",
"No previous calc / HS calc",
"College calc / HS calc"))] <- 3
Colors
colors <- c("#860fc2", "#fc691d", "black")
Plot
library(ggplot2)
library(ggstance)
ggplot(dat, aes(x=boxOdds, y=yAxis, color=as.factor(group))) +
geom_vline(aes(xintercept=1), size=0.75, linetype='dashed') +
geom_pointrangeh(aes(xmax=boxCIHigh, xmin=boxCILow), size=.5,
show.legend=FALSE) +
geom_point(size=3.5, show.legend=FALSE) +
theme_bw() +
scale_color_manual(values=colors)+
theme(panel.grid.minor=element_blank()) +
scale_x_continuous(breaks=seq(0,7,1), limits=c(0, max(dat[2:4]))) +
ylab('') +
xlab('Odds Ratio') +
annotate(geom='text', y =1.1, x=3.5, label ='',
size=3.5, hjust=0) + ggtitle('Estimated Odds of Switching') +
theme(plot.title=element_text(hjust=.5, size=20)) +
theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank())
Gives
Data
dat <- structure(list(yAxis = 10:1, boxOdds = c(2.23189, 1.315737, 1.22866,
0.8197413, 0.9802449, 0.9786673, 0.6559005, 0.5929812, 0.6923759,
1.3958275), boxCILow = c(0.7543566, 1.016, 0.9674772, 0.6463458,
0.9643047, 0.864922, 0.4965308, 0.3572142, 0.4523759, 1.2023275
), boxCIHigh = c(6.603418, 1.703902, 1.560353, 1.039654, 0.9964486,
1.107371, 0.8664225, 0.9843584, 0.9323759, 1.5893275)), class = "data.frame", row.names = c(NA,
-10L))
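The ordering part can be checked in isolation. ggplot2 draws the levels of a discrete y variable from bottom to top in level order, so the label that should appear at the top has to be the last level. A minimal sketch on a hypothetical three-row subset (`toy` and its rounded odds values are made up for illustration):

```r
# Labels in the order they should appear from top to bottom
labels_top_to_bottom <- c("Females/Males", "Instructor Quality (+1)", "Undecided / STM")

toy <- data.frame(
  label = labels_top_to_bottom,
  odds = c(2.2, 1.2, 0.8)  # illustrative values only
)

# Reverse the level order so the first label ends up as the last (topmost) level;
# the values stay attached to their own labels
toy$label <- factor(toy$label, levels = rev(labels_top_to_bottom))

levels(toy$label)
as.character(toy$label)
```

Passing the labels themselves as `levels` avoids having to line up a separate numeric index with a `labels` vector, which is easy to get backwards.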

How can I add an annotation like this?

How can I add annotation in ggplot like this? I need to add the text (CV, Network), the bars, and also those stars.
I hope you can transfer this example to your dataset:
ggplot() +
geom_point(aes(x = 1:10, y = 1:10)) +
# bracket 1
geom_segment(aes(x = 0, y = 11, xend = 5, yend = 11)) +
geom_segment(aes(x = 0, y = 11, xend = 0, yend = 10.5)) +
geom_segment(aes(x = 5, y = 11, xend = 5, yend = 10.5)) +
# bracket 2
geom_segment(aes(x = 5.5, y = 11, xend = 10, yend = 11)) +
geom_segment(aes(x = 5.5, y = 11, xend = 5.5, yend = 10.5)) +
geom_segment(aes(x = 10, y = 11, xend = 10, yend = 10.5)) +
# add labels
geom_text(aes(x = 2.5, y = 11.5, label = "Group 1")) +
geom_text(aes(x = 7.75, y = 11.5, label = "Group 2")) +
# draw outside the panel and enlarge the top margin to make room
coord_cartesian(ylim = c(0, 10), clip = "off") +
theme(plot.margin = unit(c(4, 1, 1, 0), "lines"))
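If you need several brackets, the repeated segments can be wrapped in a small function. This is only a sketch: `bracket()` is a hypothetical helper (not part of ggplot2) that returns a list of `annotate()` layers, which ggplot2 accepts on the right-hand side of `+`.

```r
library(ggplot2)

# Hypothetical helper: one horizontal bar, two end ticks, and a centered label
bracket <- function(x0, x1, y, label, drop = 0.5) {
  list(
    annotate("segment", x = x0, xend = x1, y = y, yend = y),
    annotate("segment", x = x0, xend = x0, y = y, yend = y - drop),
    annotate("segment", x = x1, xend = x1, y = y, yend = y - drop),
    annotate("text", x = (x0 + x1) / 2, y = y + drop, label = label)
  )
}

p <- ggplot(data.frame(x = 1:10, y = 1:10), aes(x, y)) +
  geom_point() +
  bracket(0, 5, 11, "Group 1") +
  bracket(5.5, 10, 11, "Group 2") +
  # clip = "off" lets the brackets be drawn above the panel
  coord_cartesian(ylim = c(0, 10), clip = "off") +
  theme(plot.margin = unit(c(4, 1, 1, 0), "lines"))

p
```

Stars or other markers can be added the same way with another `annotate("text", ...)` call at the desired coordinates.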

Moving x or y axis together with tick labels to the middle of a single ggplot (no facets)

I made the following plot in Excel:
But then I thought I would make it prettier by using ggplot. I got this far:
If you're curious, the data is based on my answer here, although it doesn't really matter. The plot is a standard ggplot2 construct with some prettification, and the thick line for the x-axis through the middle is achieved with p + geom_hline(aes(yintercept=0)) (p is the ggplot object).
I feel that the axis configuration in the Excel plot is better. It emphasizes the 0 line (important when the data is money) and finding intercepts is much easier since you don't have to follow lines from all the way at the bottom. This is also how people draw axes when plotting on paper or boards.
Can the axis be moved like this in ggplot as well? I want not just the line, but the tick labels as well moved. If yes, how? If no, is the reason technical or by design? If by design, why was the decision made?
Try this:
shift_axis <- function(p, y=0){
g <- ggplotGrob(p)
dummy <- data.frame(y=y)
ax <- g[["grobs"]][g$layout$name == "axis-b"][[1]]
p + annotation_custom(grid::grobTree(ax, vp = grid::viewport(y=1, height=sum(ax$height))),
ymax=y, ymin=y) +
geom_hline(aes(yintercept=y), data = dummy) +
theme(axis.text.x = element_blank(),
axis.ticks.x=element_blank())
}
p <- qplot(1:10, 1:10) + theme_bw()
shift_axis(p, 5)
I tried to change the theme's axis.text.x, but I could only change hjust.
So I think you can remove axis.text.x and then add the labels with geom_text().
For example:
test <- data.frame(x=seq(1,5), y=seq(-1,3))
ggplot(data=test, aes(x,y)) +
geom_line() +
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()) +
geom_text(data=data.frame(x=seq(1,5), y=rep(0,5)), label=seq(1,5), vjust=1.5)
Maybe this code is useful.
Just to complete baptiste's excellent answer, here is the equivalent for moving the y axis:
shift_axis_x <- function(p, x=0){
g <- ggplotGrob(p)
dummy <- data.frame(x=x)
ax <- g[["grobs"]][g$layout$name == "axis-l"][[1]]
p + annotation_custom(grid::grobTree(ax, vp = grid::viewport(x=1, width = sum(ax$height))),
xmax=x, xmin=x) +
geom_vline(aes(xintercept=x), data = dummy) +
theme(axis.text.y = element_blank(),
axis.ticks.y=element_blank())
}
As alistaire commented it can be done using geom_hline and geom_text as shown below.
df <- data.frame(YearMonth = c(200606,200606,200608,200701,200703,200605),
person1 = c('Alice','Bob','Alice','Alice','Bob','Alice'),
person2 = c('Bob','Alice','Bob','Bob','Alice','Bob'),
Event = c('event1','event2','event3','event3','event2','event4')
)
df$YM <- as.Date(paste0("01",df$YearMonth), format="%d%Y%m")
rangeYM <- range(df$YM)
ggplot()+geom_blank(aes(x= rangeYM, y = c(-1,1))) + labs(x = "", y = "") +
theme(axis.ticks = element_blank()) +
geom_hline(yintercept = 0, col = 'maroon') +
scale_x_date(date_labels = '%b-%y', date_breaks = "month", minor_breaks = NULL) +
scale_y_continuous(minor_breaks = NULL) +
geom_text(aes(x = df$YM, y = 0, label = paste(format(df$YM, "%b-%y")), vjust = 1.5), colour = "#5B7FA3", size = 3.5, fontface = "bold")

R: How to spread (jitter) points with respect to the x axis?

I have the following code snippet in R:
dat <- data.frame(cond = factor(rep("A",10)),
rating = c(1,2,3,4,6,6,7,8,9,10))
ggplot(dat, aes(x=cond, y=rating)) +
geom_boxplot() +
guides(fill=FALSE) +
geom_point(aes(y=3)) +
geom_point(aes(y=3)) +
geom_point(aes(y=5))
This particular snippet of code produces a boxplot where one point goes over another (in the above case one point 3 goes over another point 3).
How can I move the point 3 so that the point remains in the same position on the y axis, but it is slightly moved left or right on the x axis?
This can be achieved by using the position_jitter function:
geom_point(aes(y=3), position = position_jitter(w = 0.1, h = 0))
Update:
To only plot the three supplied points you can construct a new dataset and plot that:
points_dat <- data.frame(cond = factor(rep("A", 3)), rating = c(3, 3, 5))
ggplot(dat, aes(x=cond, y=rating)) +
geom_boxplot() +
guides(fill=FALSE) +
geom_point(aes(x=cond, y=rating), data = points_dat, position = position_jitter(w = 0.05, h = 0))
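To confirm that only the horizontal position changes, you can inspect the computed positions with `layer_data()`. A minimal sketch, assuming the same three points; the `seed` argument is set only to make the jitter reproducible:

```r
library(ggplot2)

points_dat <- data.frame(cond = factor(rep("A", 3)), rating = c(3, 3, 5))

p_check <- ggplot(points_dat, aes(x = cond, y = rating)) +
  geom_point(position = position_jitter(width = 0.1, height = 0, seed = 1))

# Positions after jittering: x is shifted, y is left as supplied
jittered <- layer_data(p_check)
jittered[, c("x", "y")]
```

With `height = 0` the y column still holds the original ratings, while x is spread by at most `width` around the category position.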
ggplot2 now includes position_dodge(). From the help's description: "Dodging preserves the vertical position of an geom while adjusting the horizontal position."
Thus you can either use it as geom_point(position = position_dodge(0.5)) or, if you want to dodge points that are connected by lines and need the dodge to be the same across both geoms, you can use something like:
dat <- data.frame(cond = rep(c("A", "B"), each=10), x=rep(1:10, 2), y=rnorm(20))
dodge <- position_dodge(.3) # how much jitter on the x-axis?
ggplot(dat, aes(x, y, group=cond, color=cond)) +
geom_line(position = dodge) +
geom_point(position = dodge)
ggplot2 now has a separate geom for this called geom_jitter, a shortcut for geom_point(position = "jitter"), so you don't need to pass a position argument. Here applied to OP's example:
dat <- data.frame(cond = factor(rep("A",10)),
rating = c(1,2,3,4,6,6,7,8,9,10))
ggplot(dat, aes(x=cond, y=rating)) +
geom_boxplot() +
guides(fill=FALSE) +
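# The three extra points need their own data: a length-3 y cannot be mapped onto the 10-row dat
geom_jitter(data = data.frame(cond = factor("A"), rating = c(3, 3, 5)), width = 0.1, height = 0)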

Resources