Pass name through `map()` to be evaluated lazily - r

I'm making a series of plots programmatically, and I want to pass the name of the tibble (or dataframe) into the title of my ggplot2 plot, so I know which is which.
deparse(substitute(x)) works when making a single plot from a tibble, but returns an uninformative placeholder (such as "." or ".x[[i]]") when the function is called via purrr::map() on a list of tibbles.
#initialize data frame
myDf <- tibble(x = LETTERS[1:5], y = sample(1:10, 5))
#initialize function
myPlot <- function(df) {
title = deparse(substitute(df))
ggplot(df, aes(x, y)) +
geom_col() +
ggtitle(title)
}
#call function
myPlot(myDf)
This gives me a plot with the title myDf.
Now I want to do the same thing with a list of data frames:
#initialize list of data frames
myDFs <- vector("list", 0)
myDFs$first <- tibble(x = LETTERS[1:5], y = sample(1:10, 5))
myDFs$second <- tibble(x = LETTERS[1:5], y = sample(1:10, 5))
myDFs$third <- tibble(x = LETTERS[1:5], y = sample(1:10, 5))
#initialize same function
myPlot <- function(df) {
title = deparse(substitute(df))
ggplot(df, aes(x, y)) +
geom_col() +
ggtitle(title)
}
#call function with purrr::map
map(myDFs, myPlot)
Now each is titled with the same title: .x[[i]]
I'd love to know how to pass a more informative title through map. It doesn't have to be pretty, but it does have to be unique. Thank you in advance!

We could use imap, which passes each element's name (or its index, if the list is unnamed) to the function as a second argument:
myPlot <- function(df, names) {
ggplot(df, aes(x, y)) +
geom_col() +
ggtitle(names)
}
purrr::imap(myDFs, myPlot)
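As a quick check of the imap approach, here is a minimal sketch that builds the plots and reads the titles back. The names my_dfs, my_plot and titles are made up for this illustration, and reading the title via `$labels$title` is assumed to work on the installed ggplot2 version.

```r
library(ggplot2)
library(purrr)
library(tibble)

# Illustrative named list of tibbles (values are arbitrary)
my_dfs <- list(
  first  = tibble(x = LETTERS[1:5], y = c(3, 1, 4, 1, 5)),
  second = tibble(x = LETTERS[1:5], y = c(2, 7, 1, 8, 2))
)

# The second argument receives the element's name from imap()
my_plot <- function(df, name) {
  ggplot(df, aes(x, y)) +
    geom_col() +
    ggtitle(name)
}

plots <- imap(my_dfs, my_plot)

# Pull the titles back out to confirm each plot got its own name
titles <- map_chr(plots, ~ .x$labels$title)
print(titles)
```

The returned list keeps the names of the input list, so plots$first and plots$second can be picked out individually.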

We could use Map from base R
Map(myPlot, myDFs, names(myDFs))
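Or using iwalk. Because iwalk is called for its side effects and returns its input invisibly, each plot has to be printed explicitly:
purrr::iwalk(myDFs, ~ print(myPlot(.x, .y)))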
where
myPlot <- function(data, nameVec){
ggplot(data, aes(x, y)) +
geom_col() +
ggtitle(nameVec)
}
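Since iwalk is meant for side effects, one natural use is writing each plot to a file named after its list element. The following is only a sketch under assumptions: the output directory is a temporary folder, PDF is chosen arbitrarily as the file type, and my_dfs and my_plot are illustrative stand-ins for the objects above.

```r
library(ggplot2)
library(purrr)

# Illustrative named list of data frames
my_dfs <- list(
  first  = data.frame(x = LETTERS[1:5], y = c(3, 1, 4, 1, 5)),
  second = data.frame(x = LETTERS[1:5], y = c(2, 7, 1, 8, 2))
)

my_plot <- function(data, name) {
  ggplot(data, aes(x, y)) +
    geom_col() +
    ggtitle(name)
}

# Temporary output folder (an assumption made for this sketch)
out_dir <- tempfile("plots_")
dir.create(out_dir)

# .x is the data frame, .y is its name in the list
iwalk(my_dfs, function(df, name) {
  ggsave(
    filename = file.path(out_dir, paste0(name, ".pdf")),
    plot = my_plot(df, name),
    width = 4, height = 3
  )
})

saved <- sort(list.files(out_dir))
print(saved)
```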

Related

how to change titles in ggplot for series of plots?

I created a function that splits a dataframe by group variable 'gear' and makes plots for each new df. How do I change the title for each plot then?
rank20 <- function(df){
sdf <- df
clusterName <- paste0("cluster",sdf$gear)
splitData <- split(sdf,clusterName)
plot <- lapply(splitData, function (x) {ggplot(x, aes(mpg, cyl)) + geom_point()+
labs(x="mpg", y="cyl",
title="This Needs To Be Changed") +
theme_minimal()})
do.call(grid.arrange,plot)
}
rank20(mtcars)
I want the following titles: gear3, gear4, etc. (corresponding to their gear value)
UPD: both results are right if using mtcars. But in my real case, I transform my initial df. So, I should have put my question in the different way. I need to take the titles from splitData df name itself rather than from the column.
With lapply, I find it easier to iterate over the indexes rather than the elements. This gives more flexibility when you need to link each list element to something else, e.g. its name in the list:
rank20 <- function(df, col="gear"){
sdf <- df
clusterName <- paste0("cluster", sdf[[col]])
splitData <- split(sdf, clusterName)
# do the apply across the splitData indexes instead and pull out cluster
# name from the column
plot <- lapply(seq_along(splitData), function(i) {
X <- splitData[[i]]
i_title <- paste0(col, X[[col]][1])
## to use clusterName instead eg cluster3 instead of gear3:
#i_title <- names(splitData)[i]
ggplot(X, aes(mpg, cyl)) +
geom_point() +
labs(x="mpg", y="cyl", title=i_title) +
theme_minimal()
})
do.call(grid.arrange,plot)
}
rank20(mtcars)
Since you wanted titles like gear3 and gear4, I've gone with that, but left a commented-out alternative i_title that uses the clusterName value instead.
The col argument of the function lets you switch to a different column, so gear isn't hard-coded.
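For the case raised in the question's update, where the title should come from the name of each split data frame rather than from a column, a minimal sketch with base R's Map could look like this. The names split_data, plots and titles are illustrative, and the arrangement step (grid.arrange) is left out so the example needs only ggplot2.

```r
library(ggplot2)

# Split on a derived label, as in the question
split_data <- split(mtcars, paste0("cluster", mtcars$gear))

# Map() walks over the data frames and their names in parallel
plots <- Map(
  function(d, nm) {
    ggplot(d, aes(mpg, cyl)) +
      geom_point() +
      labs(x = "mpg", y = "cyl", title = nm) +
      theme_minimal()
  },
  split_data,
  names(split_data)
)

# Read the titles back to confirm they follow the list names
titles <- vapply(plots, function(p) p$labels$title, character(1))
print(titles)
```

The resulting list can then be handed to do.call(grid.arrange, plots) or patchwork::wrap_plots(plots) as in the answers here.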
May I suggest a slightly different approach (still looping over indices, as suggested in Jonny Phelps's answer). I create a list of plots and then use patchwork::wrap_plots to arrange them, which I find smoother.
library(tidyverse)
library(patchwork)
len_ind <- length(unique(mtcars$cyl))
ls_plot <-
mtcars %>%
split(., .$cyl) %>%
map2(1:len_ind, ., function(x, y) {
ggplot(y, aes(mpg, cyl)) +
geom_point() +
labs(x = "mpg", y = "cyl",
title = names(.)[x]
) +
theme_minimal()
})
wrap_plots(ls_plot) + plot_layout(ncol = 1)
Just noticed this was the wrong column - using cyl instead of gear. Oops. It was kind of fun to wrap this into a function:
plot_col <- function(x, col, plotx, ploty){
# use the data frame passed in as `x` rather than hard-coding mtcars
len_ind <- length(unique(x[[col]]))
x_name <- deparse(substitute(plotx))
y_name <- deparse(substitute(ploty))
ls_plot <-
x %>%
split(., .[[col]]) %>%
map2(1:len_ind, ., function(x, y) {
ggplot(y, aes({{plotx}}, {{ploty}})) +
geom_point() +
labs(x = x_name, y = y_name,
title = names(.)[x]
) +
theme_minimal()
})
wrap_plots(ls_plot) + plot_layout(ncol = 1)
}
plot_col(mtcars, "gear", mpg, cyl)

ggplot: adding a label to a geom_line aes_string

I have a for loop plotting 3 geom_lines, how do I add a label/legend so they won't all be 3 indiscernible black lines?
methods.list <- list(rwf,snaive,meanf)
cv.list <- lapply(methods.list, function(method) {
taylor%>% tsCV(forecastfunction = method, h=48)
})
gg <- ggplot(NULL, aes(x))
for (i in seq(1,3)){
gg <- gg + geom_line(aes_string( y=sqrt(colMeans(cv.list[[i]]^2, na.rm=TRUE))))
}
gg + guides(colour=guide_legend(title="Forecast"))
If I don't use a loop, I can use aes instead of that horrible aes_string and then everything works, but I have to write the same code 3 times and replace the loop with this:
gg <- gg + geom_line(aes(y=sqrt(colMeans(cv.list[[1]]^2, na.rm=TRUE)), colour=names(cv.list)[1]))
gg <- gg + geom_line(aes(y=sqrt(colMeans(cv.list[[2]]^2, na.rm=TRUE)), colour=names(cv.list)[2]))
gg <- gg + geom_line(aes(y=sqrt(colMeans(cv.list[[3]]^2, na.rm=TRUE)), colour=names(cv.list)[3]))
and then there are nice automatic colors and legend. What am I missing? Why is r being so noob-unfriendly?
The example is not reproducible (there is no data!), but it seems you have a list cv.list containing multiple data.frames, and you want to plot a summary statistic of each against a common variable stored in x.
The simplest method is simply to create a data.frame and plot using the data.frame.
#Create 3 data.frames with data (forecast?)
df <- lapply(1:3, function(group){
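# index with the function argument `group` (there is no `i` in this scope)
summ_stat <- sqrt(colMeans(cv.list[[group]]^2, na.rm=TRUE))
# store the group as a factor so each line gets its own discrete colour
group <- factor(group)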
data.frame(summ_stat, group, x = x)
})
#bind the data.frames into a single data.frame
df <- do.call(rbind, df)
#Create the plot
ggplot(data = df, aes(x = x, y = summ_stat, colour = group)) +
geom_line() +
labs(colour = "Forecast")
Note the colour entry in labs(): it sets the legend title for the colour aesthetic mapped in aes().
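Because the original example has no data, here is a self-contained sketch of the same idea with made-up numbers. Everything in it is an assumption for illustration: cv_list is a stand-in for cv.list filled with random values, and the list is given names explicitly, since list(rwf, snaive, meanf) as written in the question is unnamed and names(cv.list) would be NULL.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for cv.list: one error matrix per method
# (rows = forecast origins, columns = horizons); values are random
cv_list <- list(
  rwf    = matrix(rnorm(200, sd = 1.0), nrow = 20, ncol = 10),
  snaive = matrix(rnorm(200, sd = 1.5), nrow = 20, ncol = 10),
  meanf  = matrix(rnorm(200, sd = 2.0), nrow = 20, ncol = 10)
)

# One long data frame: RMSE per horizon, labelled by method name
rmse_long <- do.call(rbind, lapply(names(cv_list), function(nm) {
  data.frame(
    h = seq_len(ncol(cv_list[[nm]])),
    rmse = sqrt(colMeans(cv_list[[nm]]^2, na.rm = TRUE)),
    method = nm
  )
}))

# Mapping colour to the method column gives one coloured line per method
p <- ggplot(rmse_long, aes(x = h, y = rmse, colour = method)) +
  geom_line() +
  labs(colour = "Forecast")
print(p)
```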

plotly tooltip names returned from ggplot function

I have defined a function that takes a data.frame and returns a plot, which I later on pass to plotly. I need this function to be flexible and it's going to be called a number of times (that's why I wrote a function). A simple reproducible example:
a <- data.frame(x = 1:3, y = c(2, 6, 3))
library(ggplot2)
library(plotly)
plotTrend <- function(x){
var1 <- names(x)[1]
var2 <- names(x)[2]
p <- ggplot(x, aes(x = get(var1), y = get(var2)))+
geom_point()+
geom_smooth(method = "lm")
return(p)
}
Of course I can call plotTrend on a and I'll get the plot I'm expecting.
But when I call ggplotly on it, the tooltip reads an ugly get(var1) instead of the name of the column ("x" in this example).
plotTrend(a)
ggplotly()
I'm aware I could create a text column for the data.frame inside the function, and call ggplotly(tooltip = "text") (I read plenty of questions in SO about that), but I wondered if there's another way to get the right names in the tooltips, either by modifying the function or by using some special argument in ggplotly.
My expected output is a plotly plot with tooltips that accurately read the values and whose names are "x" and "y".
We can use aes_string to display the evaluated column names in the ggplotly tooltips:
library(ggplot2)
library(plotly)
a <- data.frame(x = 1:3, y = c(2, 6, 3))
var1 <- names(a)[1]
var2 <- names(a)[2]
p <- ggplot(a, aes_string(x = var1, y = var2)) +
geom_point()+
geom_smooth(method = "lm")
ggplotly(p)
NB: this works the same inside the plotTrend function call.
Alternatively, use tidy evaluation to pass the column names as function arguments in the plotTrend function:
plotTrend <- function(data, x, y) {
x_var <- enquo(x)
y_var <- enquo(y)
p <- ggplot(data, aes(x = !!x_var, y = !!y_var)) +
geom_point()+
geom_smooth(method = "lm")
return(p)
}
plotTrend(a, x = x, y = y) %>%
ggplotly()
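aes_string() is soft-deprecated in recent ggplot2 releases, so a third option is to turn the column names into symbols and unquote them inside aes(). The sketch below assumes the rlang package is available; plot_trend and mapped are illustrative names. The mapping then holds the bare column symbol, as it does with aes_string(), so the tooltip should show the column name, although that has not been checked here against a specific plotly version.

```r
library(ggplot2)
library(rlang)

plot_trend <- function(data) {
  # Column names as strings, taken from the data frame itself
  var1 <- names(data)[1]
  var2 <- names(data)[2]
  # sym() turns each string into a symbol; !! unquotes it inside aes()
  ggplot(data, aes(x = !!sym(var1), y = !!sym(var2))) +
    geom_point() +
    geom_smooth(method = "lm")
}

a <- data.frame(x = 1:3, y = c(2, 6, 3))
p <- plot_trend(a)

# The mapping refers to the bare column names rather than get(var1)
mapped <- c(as_label(p$mapping$x), as_label(p$mapping$y))
print(mapped)
```

The plot can then be passed on with plotly::ggplotly(p).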

How to plot three point lines using ggplot2 instead of the default plot in R

I have three matrices and I want to plot them in one graph using ggplot2. I have the data below.
library(cluster)
require(ggplot2)
require(scales)
require(reshape2)
data(ruspini)
x <- as.matrix(ruspini[-1])
w <- matrix(W[4,])
df <- melt(data.frame(max_Wmk, min_Wmk, w, my_time = 1:10), id.var = 'my_time')
ggplot(df, aes(colour = variable, x = my_time, y = value)) +
geom_point(size = 3) +
geom_line() +
scale_y_continuous(labels = comma) +
theme_minimal()
I want to add the three plots into one plot using a beautiful ggplot2.
Moreover, I want to make the points with different values have different colors.
I'm not quite sure what you're after, here's a guess
Your data...
max <- c(175523.9, 33026.97, 21823.36, 12607.78, 9577.648, 9474.148, 4553.296, 3876.221, 2646.405, 2295.504)
min <- c(175523.9, 33026.97, 13098.45, 5246.146, 3251.847, 2282.869, 1695.64, 1204.969, 852.1595, 653.7845)
w <- c(175523.947, 33026.971, 21823.364, 5246.146, 3354.839, 2767.610, 2748.689, 1593.822, 1101.469, 1850.013)
Slight modification to your base plot code to make it work...
plot(1:10,max,type='b',xlab='Number',ylab='groups',col=3)
points(1:10,min,type='b', col=2)
points(1:10,w,type='b',col=1)
Is this what you meant?
If you want to reproduce this with ggplot2, you might do something like this...
# ggplot likes a long table, rather than a wide one, so reshape the data, and add the 'time' variable explicitly (ie. my_time = 1:10)
require(reshape2)
df <- melt(data.frame(max, min, w, my_time = 1:10), id.var = 'my_time')
# now plot, with some minor customisations...
require(ggplot2); require(scales)
ggplot(df, aes(colour = variable, x = my_time, y = value)) +
geom_point(size = 3) +
geom_line() +
scale_y_continuous(labels = comma) +
theme_minimal()
UPDATE after the question was edited and the example data changed, here's an edit to suit the new example data:
Here's your example data (there's scope for simplification and speed gains here, but that's another question):
library(cluster)
require(ggplot2)
require(scales)
require(reshape2)
data(ruspini)
x <- as.matrix(ruspini[-1])
wss <- NULL
W=matrix(data=NA,ncol=10,nrow=100)
for(j in 1:100){
k=10
for(i in 1: k){
wss[i]=kmeans(x,i)$tot.withinss
}
W[j,]=as.matrix(wss)
}
max_Wmk <- matrix(data=NA, nrow=1,ncol=10)
for(i in 1:10){
max_Wmk[,i]=max(W[,i],na.rm=TRUE)
}
min_Wmk <- matrix(data=NA, nrow=1,ncol=10)
for(i in 1:10){
min_Wmk[,i]=min(W[,i],na.rm=TRUE)
}
w <- matrix(W[4,])
Here's what you need to do to make the three objects into vectors so you can make the data frame as expected:
max_Wmk <- as.numeric(max_Wmk)
min_Wmk <- as.numeric(min_Wmk)
w <- as.numeric(w)
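As one possible simplification of the max/min loops and the as.numeric conversions, apply() over columns returns plain numeric vectors directly. This is a sketch under assumptions: W here is a random stand-in matrix with the same 100 x 10 shape, not the k-means output from the example.

```r
set.seed(42)
# Hypothetical stand-in for W: 100 runs (rows) by 10 values of k (columns)
W <- matrix(runif(100 * 10), nrow = 100, ncol = 10)

# Column-wise maximum and minimum as plain numeric vectors
max_Wmk <- apply(W, 2, max, na.rm = TRUE)
min_Wmk <- apply(W, 2, min, na.rm = TRUE)

# Indexing a single row already yields a numeric vector
w <- W[4, ]

# Wide data frame, ready to be reshaped to long format
wide <- data.frame(max_Wmk, min_Wmk, w, my_time = 1:10)
str(wide)
```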
Now reshape and plot as before...
df <- melt(data.frame(max_Wmk, min_Wmk, w, my_time = 1:10), id.var = 'my_time')
ggplot(df, aes(colour = variable, x = my_time, y = value)) +
geom_point(size = 3) +
geom_line() +
scale_y_continuous(labels = comma) +
theme_minimal()

How to plot a list of vectors with different lengths?

I have a list containing 9 different vectors. I want to plot them (as dotted lines) in one figure, with a different colour for each vector according to its name. How can I do that in R?
Using a made up example:
# example data:
dat <- list(a=1:5,b=2:7,c=3:10)
# get plotting:
plot(unlist(dat),type="n",xlim=c(1,max(sapply(dat,length))))
mapply(lines,dat,col=seq_along(dat),lty=2)
legend("topleft",names(dat),lty=2,col=seq_along(dat))
No question would be complete without a ggplot answer.
dat <- list(a=1:5,b=2:7,c=3:10)
dat <- lapply(dat, function(x) cbind(x = seq_along(x), y = x))
list.names <- names(dat)
lns <- sapply(dat, nrow)
dat <- as.data.frame(do.call("rbind", dat))
dat$group <- rep(list.names, lns)
library(ggplot2)
ggplot(dat, aes(x = x, y = y, colour = group)) +
theme_bw() +
geom_line(linetype = "dotted")
To plot each line in a separate plot, use
ggplot(dat, aes(x = x, y = y, colour = group)) +
theme_bw() +
geom_line(linetype = "dotted") +
facet_wrap(~ group)
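The reshaping step in the ggplot answer can also be written more compactly with base R's stack(), which turns a named list of vectors into a two-column data frame. This is a sketch of an alternative, not the answer's own method; the name long is made up here.

```r
# Same example data as above
dat <- list(a = 1:5, b = 2:7, c = 3:10)

# stack() gives a `values` column and an `ind` factor holding the list names
long <- stack(dat)
names(long) <- c("y", "group")

# Position of each value within its own vector: 1..5, 1..6, 1..8
long$x <- sequence(lengths(dat))

head(long)
```

The result has the same x, y and group columns as the data frame built above, so it can be passed to the same ggplot call.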
