r - make plotting function with ggplot2, aes_string and reorder - r

I'm trying to write a function that uses ggplot2 inside, together with aes_string and reorder, but with no luck so far.
Basically if we have a sample dataset like the following:
library(ggplot2)
library(dplyr)
set.seed(123)
dt <- data.frame(
id = c(1,1,1,2,2),
a = c("b", "d", "c", "a", "b"),
b = sample(1:10, 5, replace = F),
cat = c(1,1,2,2,2)) %>%
mutate(a = as.factor(a)) %>%
as_tibble()
I want the function to accept the following arguments: the dataset, a filtering variable, and two variables for plotting.
This is what I managed to do:
myplot <- function(df, filtval, var1, var2) {
data <- df %>% filter(id == filtval)
ggplot(data) +
geom_point(
aes_string(
x = reorder(var1, var2),
y = var2)
)
}
Unfortunately, running it gives the following warning:
myplot(dt, 1, "a", "b")
Warning message:
In mean.default(X[[i]], ...) :
argument is not numeric or logical: returning NA
This is what I want the function to do:
data <- dt %>% filter(id == 1)
ggplot(data) +
geom_col(aes(x = reorder(a, - b), y = b))

With the latest version of ggplot2, you should use aes() with !! and sym() to turn your strings into symbols.
myplot <- function(df, filtval, var1, var2) {
data <- df %>% filter(id == filtval)
ggplot(data) +
geom_point(
aes(
x = reorder(!!sym(var1), !!sym(var2)),
y = !!sym(var2))
)
}
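As an alternative sketch, assuming ggplot2 3.0.0 or later, the .data pronoun accepts a column name held in a string, so the strings do not have to be converted to symbols first. The function name myplot_data is hypothetical; the data is the sample from the question.

```r
library(ggplot2)
library(dplyr)

set.seed(123)
dt <- data.frame(
  id = c(1, 1, 1, 2, 2),
  a = factor(c("b", "d", "c", "a", "b")),
  b = sample(1:10, 5, replace = FALSE),
  cat = c(1, 1, 2, 2, 2)
)

# myplot_data is a hypothetical name; var1 and var2 are column names as strings
myplot_data <- function(df, filtval, var1, var2) {
  data <- df %>% filter(id == filtval)
  ggplot(data) +
    geom_point(
      aes(
        # .data[[...]] looks the column up by name when the plot is built
        x = reorder(.data[[var1]], .data[[var2]]),
        y = .data[[var2]]
      )
    )
}

p <- myplot_data(dt, 1, "a", "b")
```

Printing p draws the plot. To sort in decreasing order, as in the geom_col target above, negate the second argument of reorder.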

After discussing with Mr. Flick (see below), this should NOT be used:
myplot <- function(df, filtval, var1, var2) {
data <- df %>% filter(id == filtval)
data$new_order <- reorder(data[[var1]], data[[var2]])
ggplot(data) +
geom_point(mapping=
aes_string(
x = "new_order",
y = var2)
)
}
Take his solution instead :)

Related

Plotting ggplot with for loop. How should variable be referenced?

I am trying to create scatterplots where a single plot displays the relationship between each predictor and a single outcome. However, the outcome is not displaying normally. I assume this is because the ggplot function is not recognizing the outcome as a column name. Any advice on how to properly refer to the outcome?
# data
data <- data.frame(o1=rnorm(100, 3, sd=1.2),
o2=rnorm(100, 3.5, sd=1.4),
p1=rnorm(100, 2, sd=1.9),
p2=rnorm(100, 1, sd=1.2),
p3=rnorm(100, 7, sd=1.6)
)
func <- function(data, outcomes, predictors) {
for(i in seq_along(outcomes)){
print(data %>% select(outcomes[[i]], predictors[[i]]) %>%
gather(var, value, -outcomes[[i]]) %>%
ggplot(aes(x=value, y=outcomes[[i]])) + geom_point() + facet_wrap(~var))
}
}
func(data, outcomes=c("o1", "o2"), predictors=list(c("p1", "p2"), c("p2","p3")))
You could try this without the loop:
data <- data.frame(o1=rnorm(100, 3, sd=1.2),
o2=rnorm(100, 3.5, sd=1.4),
p1=rnorm(100, 2, sd=1.9),
p2=rnorm(100, 1, sd=1.2),
p3=rnorm(100, 7, sd=1.6)
)
tidy_data <- data %>%
pivot_longer(c(p1:p3), names_to = "predictor", values_to = "x") %>%
pivot_longer(c(o1:o2), names_to = "outcome", values_to = "y")
ggplot(tidy_data) +
geom_point(aes(x = x,
y = y)) +
facet_grid(outcome~predictor)
In the function you can:
create a list to hold all the plots;
replace gather with pivot_longer, since gather is superseded;
use the .data pronoun to specify the y-axis column from a string.
library(tidyverse)
func <- function(data, outcomes, predictors) {
plot_list <- vector('list', length(outcomes))
for(i in seq_along(outcomes)){
plot_list[[i]] <- data %>%
select(outcomes[i], predictors[[i]]) %>%
pivot_longer(cols = -outcomes[i]) %>%
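# !! evaluates outcomes[i] now; without it, aes() would look up i lazily after the loop has finished
ggplot(aes(x=value, y=.data[[!!outcomes[i]]])) +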
geom_point() + facet_wrap(~name)
}
return(plot_list)
}
and call it as:
result <- func(data, outcomes=c("o1", "o2"),
predictors=list(c("p1", "p2"), c("p2","p3")))
where result is a list of plots, and each individual plot can be accessed as result[[1]], result[[2]] and so on.
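A loop-free variant can be sketched with Map, assuming the same data layout. Each plot is built inside its own function call, so the outcome name that aes() captures is the one passed to that call and does not depend on a loop counter. The names plot_one and func_map are hypothetical helpers, not part of any package.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(1)
data <- data.frame(
  o1 = rnorm(100, 3, sd = 1.2),
  o2 = rnorm(100, 3.5, sd = 1.4),
  p1 = rnorm(100, 2, sd = 1.9),
  p2 = rnorm(100, 1, sd = 1.2),
  p3 = rnorm(100, 7, sd = 1.6)
)

# plot_one (hypothetical helper): one outcome against several predictors
plot_one <- function(data, outcome, preds) {
  data %>%
    select(all_of(c(outcome, preds))) %>%
    pivot_longer(cols = all_of(preds)) %>%
    ggplot(aes(x = value, y = .data[[outcome]])) +
    geom_point() +
    facet_wrap(~name)
}

# func_map (hypothetical helper): pair each outcome with its predictor set
func_map <- function(data, outcomes, predictors) {
  Map(function(o, p) plot_one(data, o, p), outcomes, predictors)
}

result <- func_map(
  data,
  outcomes = c("o1", "o2"),
  predictors = list(c("p1", "p2"), c("p2", "p3"))
)
```

Because outcomes is a character vector, Map names the list elements after it, so a plot can be retrieved either by position (result[[1]]) or by name (result[["o1"]]).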

How to pass an expression to a geom_text label in ggplot? (Continued)

This is a follow-up to my original question about how to pass an expression with a subscript to a geom_text label in ggplot.
Duck provided a great solution using parse = T within the geom_text() command. However, I am now running into a problem because the variable I wish to pass the expression to also contains other values that cannot be parsed when parse = T.
Here is my current code (again, thank you to Duck for this solution):
library(ggplot2)
library(tidyverse)
#Data
my_exp <- as.character(expression('my_exp'[s][u][b]))
my_data <-
data.frame(
var_1 = c("9R", "14M", "17C"),
var_2 = c(1, 2, 3),stringsAsFactors = F
)
#Mutate
my_data$label <- ifelse(my_data$var_1=='9R',my_exp,my_data$var_1)
#Plot
my_data %>%
ggplot(aes(x = var_1, y = var_2))+
geom_text(aes(label = label),parse = T)
And here is the error output that appears when I try to render the ggplot:
> library(ggplot2)
> library(tidyverse)
> #Data
> my_exp <- as.character(expression('my_exp'[s][u][b]))
> my_data <-
+ data.frame(
+ var_1 = c("9R", "14M", "17C"),
+ var_2 = c(1, 2, 3),stringsAsFactors = F
+ )
> #Mutate
> my_data$label <- ifelse(my_data$var_1=='9R',my_exp,my_data$var_1)
> #Plot
> my_data %>%
+ ggplot(aes(x = var_1, y = var_2))+
+ geom_text(aes(label = label),parse = T)
Error in parse(text = text[[i]]) : <text>:1:3: unexpected symbol
1: 14M
^
>
It appears R is having a hard time reading the cells where I have not passed the expression. Is there a way to have R only parse the relevant cell(s)?
Thanks!
As an alternative, you can use geom_richtext() from the ggtext package and create super- or subscripts with <sup>...</sup> or <sub>...</sub>.
library(ggplot2)
library(ggtext)
#Data
my_exp <- "my_exp<sub>sub</sub>"
my_data <-
data.frame(
var_1 = c("9R", "14M", "17C"),
var_2 = c(1, 2, 3), stringsAsFactors = F
)
#Mutate
my_data$label <- ifelse(my_data$var_1=='9R', my_exp, my_data$var_1)
#Plot
ggplot(my_data, aes(x = var_1, y = var_2)) +
geom_richtext(
aes(label = label),
# customization to remove background and border around labels
fill = NA,
label.colour = NA
)
Created on 2020-09-09 by the reprex package (v0.3.0)
This might not be optimal, but you can create one label column for your expressions and another for your plain text. Here is the code:
library(ggplot2)
library(tidyverse)
#Data
my_exp <- as.character(expression('my_exp'[s][u][b]))
my_data <-
data.frame(
var_1 = c("9R", "14M", "17C"),
var_2 = c(1, 2, 3),stringsAsFactors = F
)
#Mutate label 1
my_data$label <- ifelse(my_data$var_1=='9R',my_exp,NA)
my_data$label2 <- ifelse(my_data$var_1=='9R',NA,my_data$var_1)
#Plot
my_data %>%
ggplot(aes(x = var_1, y = var_2))+
geom_text(aes(label = label),parse = T)+
geom_text(aes(label = label2))
Using geom_text() twice you can hack the plot.
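Another option, sketched here under the assumption that the plain labels contain no single quotes, is to keep a single geom_text() layer and wrap every non-expression label in quotes. With parse = TRUE, a quoted label is read as a string literal and drawn as ordinary text, so "14M" no longer triggers the parse error.

```r
library(ggplot2)

my_exp <- as.character(expression('my_exp'[s][u][b]))
my_data <- data.frame(
  var_1 = c("9R", "14M", "17C"),
  var_2 = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Wrap plain labels in single quotes so parse() reads them as string literals
my_data$label <- ifelse(
  my_data$var_1 == "9R",
  my_exp,
  paste0("'", my_data$var_1, "'")
)

p <- ggplot(my_data, aes(x = var_1, y = var_2)) +
  geom_text(aes(label = label), parse = TRUE)
```

Labels that may themselves contain quote characters would need escaping before being wrapped this way.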

Variable column names in the pipe

I have the following code:
install.packages('tidyverse')
library(tidyverse)
x <- 1:10
y <- x^2
df <- data.frame(first_column = x, second_column = y)
tibble <- as_tibble(df)
tibble %>%
filter(second_column != 16) %>%
ggplot(aes(x = first_column, y = second_column)) +
geom_line()
Now I would like to create the following function
test <- function(colname) {
tibble %>%
filter(colname != 16) %>%
ggplot(aes(x = first_column, y = colname)) +
geom_line()
}
test('second_column')
But running it creates a vertical line instead of the function. How can I make this function work?
Edit: My focus is on getting the pipe to work, not ggplot.
In order to pass character strings for variable names, you have to use the standard evaluation version of each function. It is aes_string for aes, and filter_ for filter. See the NSE vignette for more details.
Your function could look like:
test <- function(colname) {
tibble %>%
filter_(.dots= paste0(colname, "!= 16")) %>%
ggplot(aes_string(x = "first_column", y = colname)) +
geom_line()
}
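Both aes_string and filter_ are deprecated in current ggplot2 and dplyr releases. A minimal sketch of the same function with the .data pronoun, assuming dplyr 0.7 or later and ggplot2 3.0.0 or later, is shown here. The names tbl and test_data are hypothetical, and the data frame is passed as an argument instead of being read from the global environment.

```r
library(ggplot2)
library(dplyr)

tbl <- tibble(first_column = 1:10, second_column = (1:10)^2)

# test_data (hypothetical name): colname is a column name given as a string
test_data <- function(data, colname) {
  data %>%
    # .data[[colname]] refers to the column whose name is stored in colname
    filter(.data[[colname]] != 16) %>%
    ggplot(aes(x = first_column, y = .data[[colname]])) +
    geom_line()
}

p <- test_data(tbl, "second_column")
```

The same pronoun works in both the dplyr step and the ggplot2 mapping, so the string never has to be pasted into code.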

Facet skip value x-axis

I'm working on this df:
library("ggplot2")
library("reshape2")
library("tidyr")
library("scales")
library("dplyr")
Col0 <- c("AA", "BB", "CC", "DD","EE","FF")
D01012015 <- c(2,2,2,6,1,NA)
D02012015 <- c(2,2,2,1,3,1)
D03012015 <- c(2,2,3,4,6,4)
D04012015 <- c(2,2,3,1,2,4)
D05012015 <- c(2,1,1,1,1,0)
D06012015 <- c(2,4,2,5,4,9)
D07012015 <- c(2,4,2,5,4,1)
D08012015 <- c(2,2,3,4,5,3)
D09012015 <- c(1,3,3,2,2,1)
D10012015 <- c(1,3,3,2,2,1)
D11012015 <- c(1,3,3,2,4,1)
D12012015 <- c(1,3,3,4,2,1)
D13012015 <- c(1,3,5,2,2,1)
D14012015 <- c(1,3,3,7,2,1)
D15012015 <- c(1,3,3,7,2,7)
df<-data.frame(Col0,D01012015,D02012015,D03012015,D04012015,D05012015,D06012015,D07012015,D08012015,D09012015,D10012015,D11012015,
D12012015,D13012015,D14012015,D15012015)
I know that normally, if I'd like to show one value per week on the x axis, I should build the plot like this:
f<-melt(df,id =c("Col0"))
f$date<-as.Date(f$variable, format="D%d%m%Y")
pl<- ggplot(f, aes(date, value, fill=Col0))+ geom_line(aes(color=Col0,group=Col0))+ scale_x_date(breaks = date_breaks("1 week"))
My problem is that I have to produce the same x-axis values using this function:
plotfun = function(data) {
xval<-"dates"
column<- names(data)[1]
data %>%
gather_(xval, "Val", select_vars_(names(.),
names(.),
exclude = column)) %>%
ggplot(aes_string(xval, "Val", group = column, col = column)) +
facet_grid(as.formula(paste(column, "~."))) +
geom_line()
}
plotfun(df)
I don't know how to convert the x values to dates after gather, or how to skip values on the axis as in the previous ggplot call.
Can you not just put in a mutate statement?
plotfun <- function(data) {
xval <- "dates"
column <- names(data)[1]
data %>%
gather_(xval, "Val", select_vars_(names(.),
names(.),
exclude = column)) %>%
# convert the gathered column names (e.g. "D01012015") to Date; use the piped column, not the global f
mutate(dates = as.Date(dates, format = "D%d%m%Y")) %>%
ggplot(aes_string(xval, "Val", group = column, col = column)) +
facet_grid(as.formula(paste(column, "~."))) +
geom_line()
}
plotfun(df)
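The weekly axis breaks asked about in the question can be added with scale_x_date once the column is a Date. The following is a sketch with current tidyr and ggplot2 functions, assuming the first column is the identifier and every other column name follows the D%d%m%Y pattern. The names plotfun_dates and df_small are hypothetical, and df_small is a reduced version of the question's data.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Reduced example data (assumption: same layout as df in the question)
df_small <- data.frame(
  Col0 = c("AA", "BB"),
  D01012015 = c(2, 2),
  D08012015 = c(2, 4),
  D15012015 = c(1, 3)
)

plotfun_dates <- function(data) {
  column <- names(data)[1]
  data %>%
    # reshape every column except the identifier into (dates, Val) pairs
    pivot_longer(cols = -all_of(column), names_to = "dates", values_to = "Val") %>%
    mutate(dates = as.Date(dates, format = "D%d%m%Y")) %>%
    ggplot(aes(x = dates, y = Val,
               group = .data[[column]], colour = .data[[column]])) +
    facet_grid(rows = vars(.data[[column]])) +
    geom_line() +
    # one axis break per week, as in the non-function version
    scale_x_date(date_breaks = "1 week")
}

p <- plotfun_dates(df_small)
```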

dplyr and ggplot in a function: use reorder in aes function

I'm struggling to reorder my data for plotting with ggplot in a function that also uses dplyr:
# example data
library(ggplot2)
library(dplyr)
dat <- data.frame(a = c(rep("l", 10), rep("m", 5), rep("o", 15)),
b = sample(100, 30),
c= c(rep("q", 10), rep("r", 5), rep("s", 15)))
Here are my steps outside of a function:
# set a variable
colm <- "a"
# make a table
dat1 <- dat %>%
group_by_(colm) %>%
tally(sort = TRUE)
# put in order and plot
ggplot(dat1, aes(x = reorder(a, n), y = n)) +
geom_bar(stat = "identity")
But when I try to make that into a function, I can't seem to use reorder:
f <- function(the_data, the_column){
the_data %>% group_by_(the_column) %>%
tally(sort = TRUE) %>%
ggplot(aes_string(x = reorder(the_column, 'n'), y = 'n')) +
geom_bar(stat = "identity")
}
f(dat, "a")
Warning message:
In mean.default(X[[i]], ...) :
argument is not numeric or logical: returning NA
The function will work without reorder:
f <- function(the_data, the_column){
the_data %>% group_by_(the_column) %>%
tally(sort = TRUE) %>%
ggplot(aes_string(x = the_column, y = 'n')) +
geom_bar(stat = "identity")
}
f(dat, "a")
And I can get what I want without dplyr, but I'd prefer to use dplyr because it's more efficient in my actual use case:
# without dplyr
ff = function(the_data, the_column) {
data.frame(table(the_data[the_column])) %>%
ggplot(aes(x = reorder(Var1, Freq), y = Freq)) +
geom_bar(stat = "identity") +
ylab("n") +
xlab(the_column)
}
ff(dat, "a")
I see that others have struggled with this (1, 2), but it seems there must be a more efficient dplyr/pipe idiom for this reordering-in-a-function task.
If you are going to use aes_string, then the whole value must be a string, not just part of it. You can use paste() or paste0() to build the expression you want to use for x. For example:
f <- function(the_data, the_column){
the_data %>% group_by_(the_column) %>%
tally(sort = TRUE) %>%
ggplot(aes_string(x = paste0("reorder(",the_column,", n)"), y = 'n')) +
geom_bar(stat = "identity")
}
Or you could use expressions rather than strings:
f <- function(the_data, the_column){
the_data %>% group_by_(the_column) %>%
tally(sort = TRUE) %>%
ggplot(aes_q(x = substitute(reorder(x, n),list(x=as.name(the_column))), y = quote(n))) +
geom_bar(stat = "identity")
}
but the general idea is that you need to be careful when mixing strings and raw language elements (like names or expressions).
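The warning in the question can be reproduced without ggplot2 at all, which shows where the mix-up happens. In this small base R sketch, reorder() is called immediately on two one-element character vectors, before aes_string ever sees them, so it tries to take the mean of the string "n". Pasting the call into a single string delays evaluation until ggplot2 can resolve the names as columns. The variable names msg and x_expr are only for illustration.

```r
# reorder() receives two length-one character vectors here, not data columns,
# so the warning is raised by this call itself
msg <- tryCatch(
  reorder("a", "n"),
  warning = function(w) conditionMessage(w)
)
msg

# Building the whole call as one string defers evaluation to ggplot2
the_column <- "a"
x_expr <- paste0("reorder(", the_column, ", n)")
x_expr
# → "reorder(a, n)"
```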
