Change axes label and scale using ggplot and patchwork in R - r

(I am trying to make this question as short and concise as possible, as other related answers may be tough for the non-savvy like myself.)
With the following code in mind, is it possible to have both y-axes on the same scale (that of the graph with the highest y-limit), and to have independent labels for each of the axes (namely the y-axes)? I tried to use facet_wrap but haven't succeeded so far, as Layer 1 is missing.
library(ggplot2)
library(patchwork)
d <- cars
d$Obs <- c(1:50)
f1 <- function(a) {
ggplot(data=d, aes_string(x="Obs", y=a)) +
geom_line() +
labs(x="Observation",y="Speed/Distance")
}
f1("speed") + f1("dist")

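You could add an additional argument to your function for the axis label and set your desired limits with scale_y_continuous(). Here the limits are the combined range of both variables, so both plots share the same y-scale.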
library(ggplot2)
library(patchwork)
d <- cars
d$Obs <- c(1:50)
f1 <- function(a, y_lab) {
ggplot(data = d, aes_string(x = "Obs", y = a)) +
geom_line() +
scale_y_continuous(limits = range(c(d$speed, d$dist))) +
labs(x = "Observation", y = y_lab)
}
f1("speed", "Speed") + f1("dist", "Distance")
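If you would rather pass the limits in explicitly as a second extra argument, a minimal sketch could look like the following. The names `y_lims` and `y_lim` are assumptions made for illustration, and `.data[[a]]` is used in place of `aes_string()`, which newer ggplot2 releases mark as deprecated.

```r
library(ggplot2)
library(patchwork)

d <- cars
d$Obs <- seq_len(nrow(d))

# Shared limits: the combined range of both y variables
y_lims <- range(c(d$speed, d$dist))

# a: column name as a string, y_lab: axis label,
# y_lim: numeric vector of length 2 (hypothetical extra argument)
f1 <- function(a, y_lab, y_lim) {
  ggplot(d, aes(x = Obs, y = .data[[a]])) +
    geom_line() +
    scale_y_continuous(limits = y_lim) +
    labs(x = "Observation", y = y_lab)
}

p <- f1("speed", "Speed", y_lims) + f1("dist", "Distance", y_lims)
p
```

Computing the limits once outside the function keeps the function reusable for other column pairs.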

Reshape wide-to-long, then use facet. Instead of having different y-axis labels we will have facet labels:
library(ggplot2)
library(tidyr)
pivot_longer(d, 1:2, names_to = "grp") %>%
ggplot(aes(x = Obs, y = value)) +
geom_line() +
facet_wrap(vars(grp))
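To make the facet strips read like per-panel y-axis labels, one option is to move the strips to the left and relabel them. This is only a sketch and not part of the answer above; the label text passed to `as_labeller()` is an assumption. The default fixed scales of facet_wrap keep both panels on the same y-scale.

```r
library(ggplot2)
library(tidyr)

d <- cars
d$Obs <- seq_len(nrow(d))

# Wide to long: one row per observation and variable
long <- pivot_longer(d, c(speed, dist), names_to = "grp")

p <- ggplot(long, aes(x = Obs, y = value)) +
  geom_line() +
  # Strips on the left act as per-panel y-axis titles
  facet_wrap(vars(grp), strip.position = "left",
             labeller = as_labeller(c(speed = "Speed", dist = "Distance"))) +
  labs(x = "Observation", y = NULL) +
  theme(strip.placement = "outside", strip.background = element_blank())
p
```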

Related

Represent dataset in column bar in R using ggplot [duplicate]

I have a csv file which looks like the following:
Name,Count1,Count2,Count3
application_name1,x1,x2,x3
application_name2,x4,x5,x6
The x variables represent numbers and the applications_name variables represent names of different applications.
Now I would like to make a barplot for each row by using ggplot2. The barplot should have the application_name as title. The x axis should show Count1, Count2, Count3 and the y axis should show the corresponding values (x1, x2, x3).
I would like to have a single barplot for each row, because I have to store the different plots in different files. So I guess I cannot use "melt".
I would like to have something like:
for each row in rows {
print barplot in file
}
Thanks for your help.
You can use melt to rearrange your data and then use either facet_wrap or facet_grid to get a separate plot for each application name
library(ggplot2)
library(reshape2)
# example data
mydf <- data.frame(name = paste0("name",1:4), replicate(5,rpois(4,30)))
names(mydf)[2:6] <- paste0("count",1:5)
# rearrange data
m <- melt(mydf, id.vars = "name")
# if you are wanting to export each plot separately
# I used facet_wrap as a quick way to add the application name as a plot title
# unique(as.character()) works whether name is a character or a factor column
for(i in unique(as.character(m$name))) {
p <- ggplot(subset(m, name==i), aes(variable, value, fill = variable)) +
facet_wrap(~ name) +
geom_bar(stat="identity", show.legend=FALSE)
ggsave(paste0("figure_",i,".pdf"), p)
}
# or all plots in one window
ggplot(m, aes(variable, value, fill = variable)) +
facet_wrap(~ name) +
geom_bar(stat="identity", show.legend=FALSE)
I didn't see @user20650's nice answer before preparing this. It's almost identical, except that I use plyr::d_ply to save things instead of a loop. I believe dplyr::do() is another good option (you'd group_by(Name) first).
yourData <- data.frame(Name = letters[1:10], # fixed names, so "a" is always present
Count1 = rpois(10, 20),
Count2 = rpois(10, 10),
Count3 = rpois(10, 8))
library(reshape2)
yourMelt <- melt(yourData, id.vars = "Name")
library(ggplot2)
# Test a function on one piece to develop the graph
ggplot(subset(yourMelt, Name == "a"), aes(x = variable, y = value)) +
geom_bar(stat = "identity") +
labs(title = "a")
# Wrap it up, with saving to file
bp <- function(dat) {
# dat holds the rows for a single Name, so use its first value as the label
myPlot <- ggplot(dat, aes(x = variable, y = value)) +
geom_bar(stat = "identity") +
labs(title = dat$Name[1])
ggsave(filename = paste0("path/to/save/", dat$Name[1], "_plot.pdf"),
myPlot)
}
library(plyr)
d_ply(yourMelt, .variables = "Name", .fun = bp)
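The same one-file-per-row idea can be sketched without plyr, using split() and a plain loop. The example data, the column names `variable` and `value`, and the use of tempdir() as the output folder are assumptions for illustration only.

```r
library(ggplot2)
library(tidyr)

# Small made-up data set in the shape described in the question
apps <- data.frame(
  Name = c("app_a", "app_b"),
  Count1 = c(5, 2),
  Count2 = c(3, 7),
  Count3 = c(8, 1)
)

# One row per application and count column
long <- pivot_longer(apps, -Name, names_to = "variable", values_to = "value")

# One plot per application, titled with the application name
plots <- lapply(split(long, long$Name), function(dat) {
  ggplot(dat, aes(x = variable, y = value)) +
    geom_col() +
    labs(title = dat$Name[1])
})

# Save each plot to its own file (tempdir() stands in for a real folder)
files <- file.path(tempdir(), paste0(names(plots), "_plot.pdf"))
for (i in seq_along(plots)) {
  ggsave(files[i], plots[[i]], width = 5, height = 4)
}
```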

How to graph "before and after" measures using ggplot with connecting lines and subsets?

I’m totally new to ggplot, relatively fresh with R and want to make a smashing "before-and-after" scatterplot with connecting lines to illustrate the movement in percentages of different subgroups before and after a special training initiative. I’ve tried some options, but have yet to:
show each individual observation separately (now same values are overlapping)
connect the related before and after measures (x=0 and x=1) with lines to more clearly illustrate the direction of variation
subset the data along class and id using shape and colors
How can I best create a scatter plot using ggplot (or other) fulfilling the above demands?
Main alternative: geom_point()
Here is some sample data and example code using geom_point:
x <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1) # 0=before, 1=after
y <- c(45,30,10,40,10,NA,30,80,80,NA,95,NA,90,NA,90,70,10,80,98,95) # percentage of "feelings of peace"
class <- c(0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1) # 0=multiple days 1=one day
id <- c(1,1,2,3,4,4,4,4,5,6,1,1,2,3,4,4,4,4,5,6) # id = per individual
df <- data.frame(x,y,class,id)
ggplot(df, aes(x=x, y=y), fill=id, shape=class) + geom_point()
Alternative: scale_size()
I have explored stat_sum() to summarize the frequencies of overlapping observations, but then I cannot subset using colors and shapes because of the overlap.
ggplot(df, aes(x=x, y=y)) +
stat_sum()
Alternative: geom_dotplot()
I have also explored geom_dotplot() to clarify the overlapping observations that arise from using geom_point(), as in the example below. However, I have yet to understand how to combine the before and after measures into the same plot.
df1 <- df[1:10,] # data before
df2 <- df[11:20,] # data after
p1 <- ggplot(df1, aes(x=x, y=y)) +
geom_dotplot(binaxis = "y", stackdir = "center",stackratio=2,
binwidth=(1/0.3))
p2 <- ggplot(df2, aes(x=x, y=y)) +
geom_dotplot(binaxis = "y", stackdir = "center",stackratio=2,
binwidth=(1/0.3))
grid.arrange(p1,p2, nrow=1) # from the gridExtra package
Or maybe it is better to summarize the data by x, id and class as the mean/median of y, filter out the ids that produce NAs (e.g. ids 3 and 6), and connect the points with lines? If you don't really need to show the variability within each id (which could be the case if the plot only illustrates tendencies), you can do it this way:
library(ggplot2)
library(dplyr)
#library(ggthemes)
df <- df %>%
group_by(x, id, class) %>%
summarize(y = median(y, na.rm = TRUE)) %>%
ungroup() %>%
mutate(
id = factor(id),
x = factor(x, labels = c("before", "after")),
class = factor(class, labels = c("multiple days", "one day")), # 0 = multiple days, 1 = one day
) %>%
group_by(id) %>%
mutate(nas = any(is.na(y))) %>%
ungroup() %>%
filter(!nas) %>%
select(-nas)
ggplot(df, aes(x = x, y = y, col = id, group = id)) +
geom_point(aes(shape = class)) +
geom_line(show.legend = FALSE) +
#theme_few() +
#theme(legend.position = "none") +
ylab("Feelings of peace, %") +
xlab("")
Here's one possible solution for you.
First - to get the color and shapes determined by variables, you need to put these into the aes function. I turned several into factors, so the labs function fixes the labels so they don't appear as "factor(x)" but just "x".
To address multiple points, one solution is to use geom_smooth with method = "lm". This plots the regression line, instead of connecting all the dots.
The option se = FALSE prevents confidence intervals from being plotted - I don't think they add a lot to your plot, but play with it.
Connecting the dots is done by geom_line - feel free to try that as well.
Within geom_point, the option position = position_jitter(width = .1) adds random noise to the x-axis so points do not overlap.
ggplot(df, aes(x=factor(x), y=y, color=factor(id), shape=factor(class), group = id)) +
geom_point(position = position_jitter(width = .1)) +
geom_smooth(method = 'lm', se = FALSE) +
labs(
x = "x",
color = "ID",
shape = 'Class'
)
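As a sketch of the geom_line variant mentioned above, the points can first be reduced to one median per id and time point, then dodged so that ids with similar values do not sit on top of each other. Using base aggregate() and a dodge width of 0.2 are assumptions chosen for illustration, and ids without both a before and an after value are dropped.

```r
library(ggplot2)

x <- rep(c(0, 1), each = 10)  # 0 = before, 1 = after
y <- c(45,30,10,40,10,NA,30,80,80,NA,95,NA,90,NA,90,70,10,80,98,95)
class <- c(0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1)  # 0 = multiple days, 1 = one day
id <- c(1,1,2,3,4,4,4,4,5,6,1,1,2,3,4,4,4,4,5,6)
df <- data.frame(x, y, class, id)

# Median per id and time point; the formula interface drops rows with NA in y
agg <- aggregate(y ~ x + id + class, data = df, FUN = median)
# Keep only ids that still have both a before and an after value
agg <- agg[ave(agg$y, agg$id, FUN = length) == 2, ]

# Same dodge for lines and points so the lines end at the points
dodge <- position_dodge(width = 0.2)
p <- ggplot(agg, aes(x = factor(x, labels = c("before", "after")), y = y,
                     group = id, colour = factor(id))) +
  geom_line(position = dodge) +
  geom_point(aes(shape = factor(class, levels = c(0, 1),
                                labels = c("multiple days", "one day"))),
             position = dodge, size = 3) +
  labs(x = NULL, y = "Feelings of peace, %", colour = "ID", shape = "Class")
p
```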

Handle ggplot2 axis text face programmatically

(x-posted to community.rstudio.com)
I'm wondering if it's possible to change the axis text in ggplot2 programmatically or if there is some native way to do this in ggplot2. In this reprex, the idea is that I want to bold the axis text of a variable y that has an absolute value of x over 1.5. I can add it in manually via theme(), and that works fine:
library(ggplot2)
library(dplyr)
library(forcats)
set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5))
df <- mutate(df, big_number = abs(x) > 1.5, face = ifelse(big_number, "bold",
"plain"))
p <- ggplot(df, aes(x = x, y = fct_inorder(y), col = big_number)) + geom_point() +
theme(axis.text.y = element_text(face = df$face))
p
Plot 1 with no facets
But if I facet it by group, y gets reordered and ggplot2 has no idea how face is connected to df and thus y, so it just bolds in the same order as the first plot.
p + facet_grid(group ~ .)
Plot 2 with facets
And it's worse if I use a different scale for each.
p + facet_grid(group ~ ., scales = "free")
Plot 3 with facets and different scales
What do you think? Is there a general way to handle this that would work consistently here?
Idea: Don't change theme, change y-axis labels. Create a call for every y with if/else condition and parse it with parse.
Not the most elegant solution (using for loop), but works (need loop as bquote doesn't work with ifelse). I always get confused when trying to work with multiple expressions (more on that here).
Code:
# Create data
library(tidyverse)
set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5)) %>%
mutate(yF = fct_inorder(y),
big_number = abs(x) > 1.5)
# Expressions for y-axis
# ifelse doesn't work
# ifelse(df$big_number, bquote(bold(1)), bquote(plain(2)))
yExp <- c() # Ignore terrible way of concatenating
for(i in 1:nrow(df)) {
if (df$big_number[i]) {
yExp <- c(yExp, bquote(bold(.(as.character(df$yF[i])))))
} else {
yExp <- c(yExp, bquote(plain(.(as.character(df$yF[i])))))
}
}
# Plot with facets
ggplot(df, aes(x, yF, col = big_number)) +
geom_point() +
scale_y_discrete(breaks = levels(df$yF),
labels = parse(text = yExp)) +
facet_grid(group ~ ., scales = "free")
Result:
Inspired by @PoGibas, I also used a function in scale_y_discrete(), which works, too.
bold_labels <- function(breaks) {
big_nums <- filter(df, y %in% breaks) %>%
pull(big_number)
labels <- purrr::map2(
breaks, big_nums,
~ if (.y) bquote(bold(.(.x))) else bquote(plain(.(.x)))
)
parse(text = labels)
}
ggplot(df, aes(x, fct_inorder(y), col = big_number)) +
geom_point() +
scale_y_discrete(labels = bold_labels) +
facet_grid(group ~ ., scales = "free")
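A variation on the labelling function, sketched under the assumption that each y label is unique: look the flag up by name rather than by row position, so the result does not depend on the order in which the breaks arrive. The small data set and the name `is_big` are made up for illustration.

```r
library(ggplot2)

# Small deterministic example data (not the reprex data above)
df <- data.frame(
  x = c(-2, 0.3, 1.7, 0.1, -0.5, 2.1),
  y = paste0("y", 1:6),
  group = rep(1:3, 2)
)
df$big_number <- abs(df$x) > 1.5

# Named lookup: label -> TRUE/FALSE, so matching is by name, not by row order
is_big <- setNames(df$big_number, df$y)

bold_labels <- function(breaks) {
  face <- ifelse(is_big[breaks], "bold", "plain")
  # Build plotmath calls such as bold("y1") and parse them into expressions
  parse(text = sprintf('%s("%s")', face, breaks))
}

p <- ggplot(df, aes(x, factor(y, levels = y), colour = big_number)) +
  geom_point() +
  scale_y_discrete(labels = bold_labels) +
  facet_grid(group ~ ., scales = "free") +
  labs(y = "y")
p
```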

Multiple curves in ggplot2 with same independent variable

I have a sequence of points in the x-axis for each of which there are two points in the y-axis.
x<-seq(8.5,10,by=0.1)
y<-c(0.9990276914, 0.9973015358, 0.9931704801, 0.9842176288, 0.9666471511, 0.9354201700, 0.8851624615, 0.8119131899, 0.7152339504, 0.5996777045, 0.4745986612, 0.3519940258, 0.2431610835, 0.1556738744, 0.0919857178, 0.0500000000, 0.0249347645, 0.0113838852, 0.0047497169, 0.0018085048, 0.0006276833)
y1<-c(9.999998e-01,9.999980e-01,9.999847e-01,9.999011e-01,9.994707e-01,9.976528e-01,9.913453e-01, 9.733730e-01, 9.313130e-01, 8.504646e-01, 7.228116e-01, 5.572501e-01,3.808638e-01,2.264990e-01, 1.155286e-01, 5.000000e-02, 1.821625e-02, 5.554031e-03, 1.410980e-03, 2.976926e-04, 5.203069e-05)
I would now like to create two curves in ggplot2. This is quite easy to accomplish in the normal way in R. The result is in the plot below. I am not sure, however, how to do that in ggplot2. For just one curve, I can use
library(ggplot2)
p<-qplot(x,y,geom="line")
Could you please help me generalise the above? Any help is greatly appreciated, thank you.
Note that the lengths of your x and y values don't match. Combine your data and use a grouping variable:
x<-seq(8.5,10, length.out = 21)
DF <- data.frame(x=rep(x, 2), y=c(y, y1), g=c(y^0, y1^0*2))
library(ggplot2)
ggplot(DF, aes(x=x, y=y, colour=factor(g), linetype=factor(g))) +
geom_line()
As @Roland also pointed out, you should first fix the length of x. A possible solution using the reshape2 package:
library(reshape2)
library(ggplot2)
x<-seq(8.5,10,length.out = 21)
y<-c(0.9990276914, 0.9973015358, 0.9931704801, 0.9842176288, 0.9666471511, 0.9354201700, 0.8851624615, 0.8119131899, 0.7152339504, 0.5996777045, 0.4745986612, 0.3519940258, 0.2431610835, 0.1556738744, 0.0919857178, 0.0500000000, 0.0249347645, 0.0113838852, 0.0047497169, 0.0018085048, 0.0006276833)
y1<-c(9.999998e-01,9.999980e-01,9.999847e-01,9.999011e-01,9.994707e-01,9.976528e-01,9.913453e-01, 9.733730e-01, 9.313130e-01, 8.504646e-01, 7.228116e-01, 5.572501e-01,3.808638e-01,2.264990e-01, 1.155286e-01, 5.000000e-02, 1.821625e-02, 5.554031e-03, 1.410980e-03, 2.976926e-04, 5.203069e-05)
df <- data.frame(x, y, y1)
df <- melt(df, id.vars='x')
ggplot(df, aes(x = x, y = value, color = variable))+geom_line()
EDIT:
Changing the linetype and legend:
g <- ggplot(df, aes(x = x, y = value, color = variable, linetype=variable)) + geom_line()
g <- g + scale_linetype_discrete(name="Custom legend name",
labels=c("Curve1", "Curve2"))
g <- g + guides(color="none")
print(g)
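The same reshaping can be sketched with tidyr::pivot_longer instead of reshape2. The two curves below are illustrative stand-ins generated with pnorm(), not the values from the question, and the column name `curve` is an assumption. Note that seq(8.5, 10, by = 0.1) yields only 16 values, which is why length.out = 21 is used to match the 21 y values.

```r
library(ggplot2)
library(tidyr)

x <- seq(8.5, 10, length.out = 21)

# Illustrative curves only (not the question's data)
curves <- data.frame(
  x = x,
  y = pnorm(x, mean = 9.5, sd = 0.3, lower.tail = FALSE),
  y1 = pnorm(x, mean = 9.5, sd = 0.15, lower.tail = FALSE)
)

# One row per x value and curve
long <- pivot_longer(curves, c(y, y1), names_to = "curve", values_to = "value")

p <- ggplot(long, aes(x, value, colour = curve, linetype = curve)) +
  geom_line()
p
```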

Plotting two variables using ggplot2 - same x axis

I have two graphs with the same x axis - the range of x is 0-5 in both of them.
I would like to combine both of them to one graph and I didn't find a previous example.
Here is what I got:
c <- ggplot(survey, aes(often_post,often_privacy)) + stat_smooth(method="loess")
c <- ggplot(survey, aes(frequent_read,often_privacy)) + stat_smooth(method="loess")
How can I combine them?
The y axis is "often privacy" and in each graph the x axis is "often post" or "frequent read".
I thought I can combine them easily (somehow) because the range is 0-5 in both of them.
Many thanks!
Example code for Ben's solution.
library(ggplot2)
library(reshape2)
#Sample data
survey <- data.frame(
often_post = runif(10, 0, 5),
frequent_read = 5 * rbeta(10, 1, 1),
often_privacy = sample(10, replace = TRUE)
)
#Reshape the data frame
survey2 <- melt(survey, measure.vars = c("often_post", "frequent_read"))
#Plot using colour as an aesthetic to distinguish lines
(p <- ggplot(survey2, aes(value, often_privacy, colour = variable)) +
geom_point() +
geom_smooth()
)
You can use + to combine other plots on the same ggplot object. For example, to plot points and smoothed lines for both pairs of columns:
ggplot(survey, aes(often_post,often_privacy)) +
geom_point() +
geom_smooth() +
geom_point(aes(frequent_read,often_privacy)) +
geom_smooth(aes(frequent_read,often_privacy))
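When layers are stacked like this, both sets of points and lines share the same default colour. One way to tell them apart, sketched here with made-up data, is to map colour to a constant string inside aes() for each layer, which also produces a legend. The seed, sample size and legend title are assumptions for illustration.

```r
library(ggplot2)

set.seed(1)
# Made-up survey data on a 0-5 scale
survey <- data.frame(
  often_post = runif(30, 0, 5),
  frequent_read = runif(30, 0, 5),
  often_privacy = sample(10, 30, replace = TRUE)
)

# A constant string inside aes() gives each pair of layers its own colour
p <- ggplot(survey, aes(y = often_privacy)) +
  geom_point(aes(x = often_post, colour = "often_post")) +
  geom_smooth(aes(x = often_post, colour = "often_post"),
              method = "loess", formula = y ~ x, se = FALSE) +
  geom_point(aes(x = frequent_read, colour = "frequent_read")) +
  geom_smooth(aes(x = frequent_read, colour = "frequent_read"),
              method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "Response (0-5)", colour = "x variable")
p
```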
Try this:
df <- data.frame(x=x_var, y=y1_var, type='y1')
df <- rbind(df, data.frame(x=x_var, y=y2_var, type='y2'))
ggplot(df, aes(x, y, group=type, col=type)) + geom_line()
