I'm having trouble with creating a list of plots using for loops, and I don't know why.
Here's the code that doesn't work (i.e. returns aplotfinal as an empty list)
aplotfinal <- list()
for(i in 1:length(Rlist)){
a <- Rlist[[i]] %>%
select(Frame_times, average)
del <- 0.016667
x.spec <- spectrum(a$average, log = "no", plot = FALSE)
spx <- x.spec$freq/del
spy <- 2*x.spec$spec
aplotfinal[[i]] <- plot(spy~spx, main = names(Rlist)[i], xlab = "frequency", ylab = "spectral density", type = "l")
}
The plot function works, I just want to apply it for a list of dataframes that I have (i.e. Rlist). Thank you!
The base R plot() does not return a plot object (it returns NULL); it just draws on the current device. So you have to draw a multi-panel plot or save to a PDF to keep a record of the plots.
To store it in a list, I guess you need something like ggplot, for example:
library(ggplot2)
library(gridExtra)
library(dplyr) # provides %>% and select()
Rlist = lapply(1:5,function(i){
data.frame(Frame_times = seq(0,1,length.out=100),average=runif(100))
})
names(Rlist) = letters[1:5]
aplotfinal <- lapply(1:length(Rlist),function(i){
a <- Rlist[[i]] %>% select(Frame_times, average)
del <- 0.016667
x.spec <- spectrum(a$average, log = "no", plot = FALSE)
spx <- x.spec$freq/del
spy <- 2*x.spec$spec
# return the plot; lapply() collects the return values into the list
qplot(y = spy,x=spx,geom="line") +
ggtitle(names(Rlist)[i]) +
xlab("frequency")+ylab("spectral density")
})
grid.arrange(grobs=aplotfinal,ncol=5)
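If you would rather stay with base graphics, one option is to capture each plot with recordPlot() and replay it later. The following is only a sketch under assumptions: series_list is a made-up stand-in for Rlist, and the display list is switched on explicitly because the plots are drawn on an off-screen pdf(NULL) device.

```r
set.seed(1)
# Hypothetical stand-in for Rlist: three named numeric series
series_list <- list(a = runif(100), b = runif(100), c = runif(100))
del <- 0.016667

pdf(NULL)                            # off-screen device; nothing is written to disk
dev.control(displaylist = "enable")  # needed so recordPlot() has something to capture

recorded <- lapply(names(series_list), function(nm) {
  x.spec <- spectrum(series_list[[nm]], log = "no", plot = FALSE)
  spx <- x.spec$freq / del
  spy <- 2 * x.spec$spec
  plot(spx, spy, type = "l", main = nm,
       xlab = "frequency", ylab = "spectral density")
  recordPlot()                       # snapshot of the plot just drawn
})
names(recorded) <- names(series_list)

invisible(dev.off())
```

Each element can later be redrawn on whatever device is open with replayPlot(), for example replayPlot(recorded$a).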
Here is an example you can use to tweak your code. Plots get saved to the plot_list variable and then to a PDF at path/to/pdf. Note that you have to open a graphics device first (in my case, pdf).
library(ggplot2)
library(dplyr)
df <- data.frame(country = c(rep('USA',20), rep('Canada',20), rep('Mexico',20)),
wave = c(1:20, 1:20, 1:20),
par = c(1:20 + 5*runif(20), 21:40 + 10*runif(20), 1:20 + 15*runif(20)))
countries <- unique(df$country)
plot_list <- list()
i <- 1
for (c in countries){
pl <- ggplot(data = df %>% filter(country == c)) +
geom_point(aes(wave, par), size = 3, color = 'red') +
labs(title = as.character(c), x = 'wave', y = 'value') +
theme_bw(base_size = 16)
plot_list[[i]] <- pl
i <- i + 1
}
pdf.options(width = 9, height = 7) # set the defaults before opening the device, or they have no effect on it
pdf('path/to/pdf')
for (i in 1:length(plot_list)){
print(plot_list[[i]])
}
dev.off()
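The same pattern can be written without the manual counter. This is a sketch rather than a drop-in replacement: it uses split() and lapply() instead of the for loop, invented toy data, and a temporary file in place of path/to/pdf.

```r
library(ggplot2)

set.seed(42)
# Toy data, made up for illustration
df <- data.frame(country = rep(c("USA", "Canada", "Mexico"), each = 20),
                 wave = rep(1:20, times = 3),
                 par = runif(60))

# One data frame per country; lapply() returns one ggplot per data frame
plot_list <- lapply(split(df, df$country), function(d) {
  ggplot(d, aes(wave, par)) +
    geom_point(size = 3, color = "red") +
    labs(title = d$country[1], x = "wave", y = "value") +
    theme_bw(base_size = 16)
})

out_file <- tempfile(fileext = ".pdf")  # stand-in for a real output path
pdf(out_file, width = 9, height = 7)    # open the device with the page size
for (p in plot_list) print(p)           # each print() adds one page
invisible(dev.off())                    # close the device to finish the file
```

Passing width and height directly to pdf() avoids any dependence on when pdf.options() is called.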
I'm creating three plots in a loop over i and using assign to save each plot. The y variable is scaled by the loop index. The scaling should ensure that each plot in the final panel has y going from 0 to 1. This isn't happening, and the plots seem to change as the loop runs. I'd be grateful if someone could explain this apparently odd behaviour.
library(dplyr)
library(ggplot2)
library(gridExtra)
loci = c(1,2,3)
x <- seq(0,1,0.01)
df <- expand.grid(x = x, loci = loci)
df <- df %>% mutate(y = loci * x)
cols = c("red", "blue", "green")
for (i in loci){
plot_this <- df %>% filter(loci == i)
my_plot = ggplot(plot_this) +
geom_point( aes( x = x, y = y/i), colour = cols[i]) +
ylim(0,3) + ggtitle(paste0("i = ", i))
assign(paste0("plot_", i), my_plot)
print(plot_1)
}
grid.arrange(plot_1, plot_2, plot_3, ncol = 3)
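This is due to ggplot2's lazy evaluation: aes() stores the expression y/i unevaluated, and i is only looked up when the plot is drawn, by which time it holds whatever value the loop last assigned. More explanation can be found in this post.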
"Looping" with lapply avoids the problem.
Data
library(ggplot2)
library(gridExtra)
library(dplyr)
loci = c(1,2,3)
x <- seq(0,1,0.01)
df <- expand.grid(x = x, loci = loci)
df <- df %>% mutate(y = loci * x)
cols = c("red", "blue", "green")
Code
my_plot <- lapply(loci, function(i) {
df %>%
filter(loci == i) %>%
ggplot() +
geom_point(aes(x = x, y = y/i), colour = cols[i]) +
ylim(0,3) +
ggtitle(paste0("i = ", i))
})
grid.arrange(my_plot[[1]], my_plot[[2]], my_plot[[3]], ncol = 3)
Created on 2022-04-26 by the reprex package (v2.0.1)
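If you want to keep the for loop, another option is to inline the current value of i into the aesthetic with the !! (unquote) operator, which aes() supports in ggplot2 3.0.0 and later. A minimal sketch under that assumption, storing the plots in a list (plots is a made-up name) rather than using assign():

```r
library(ggplot2)

loci <- c(1, 2, 3)
df <- expand.grid(x = seq(0, 1, 0.01), loci = loci)
df$y <- df$loci * df$x
cols <- c("red", "blue", "green")

plots <- list()
for (i in loci) {
  plot_this <- df[df$loci == i, ]
  plots[[i]] <- ggplot(plot_this) +
    # (!!i) substitutes the value of i now instead of looking i up at draw time
    geom_point(aes(x = x, y = y / (!!i)), colour = cols[i]) +
    ylim(0, 3) +
    ggtitle(paste0("i = ", i))
}

# Largest y value actually drawn in each plot
max_y <- vapply(plots, function(p) max(layer_data(p)$y), numeric(1))
```

Here max_y should come out as 1 for every plot, which is what the scaling was meant to achieve.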
I generate a series of plots stored in a matrix as part of a for loop much like in the MWE below. This same matrix also stores two other columns of information (Colour and Animal in this example). I then want to be able to create a grid of plots, where I identify the plot based on the corresponding Colour and Animal.
I tried creating a data frame and then using row names to pick out the plots I needed, but hit the common error Cannot convert object of class list into a grob. If I index the matrix directly, this works. However, I want to avoid doing that in case the order of the data changes in the input files. Is it possible to work directly from the data frame? I've seen similar examples but couldn't apply them to my case. I want to stick with cowplot and change as little as possible in the data generation stage.
MWE
library(cowplot)
library(ggplot2) # ggplot() is used below; recent cowplot versions do not attach ggplot2
p <- vector('list', 15)
p <-
matrix(
p,
nrow = 5,
ncol = 3
)
myColours = c("Yellow", "Red", "Blue", "Green", "Orange")
myAnimals = c("Kangaroo", "Emu", "Echidna", "Platypus", "Cassowary")
x = seq(1,10)
it = 1
for (i in seq(0,4)){ # generate example data and plots
y = x^i
t = runif(5)
df <- data.frame("X" = x, "Y" = y, "T" = t)
theanimal = myAnimals[i+1]
thecolour = myColours[i+1]
p[[it,1]] = thecolour
p[[it,2]] = theanimal
p[[it,3]] = ggplot(data = df, mapping = aes(x = X, y = Y)) +
geom_point(aes(color = T)) +
ggtitle(paste(thecolour, theanimal, sep = " "))
it = it+ 1
}
# turn into df
pltdf<- as.data.frame(p)
colnames(pltdf) <- c("Colour", "Animal", "plot")
rownames(pltdf) <- do.call(paste, c(pltdf[c("Colour", "Animal")], sep="-"))
pltdf[[1,3]] # this is what I expect for a single plot
plot1 = vector('list', 4)
plot1 <-
matrix(
plot1,
nrow = 2,
ncol = 2
)
plot1[[1,1]] = pltdf["Red-Emu", "plot"] # also tried with just plot[[1]] = etc.
plot1[[1,2]] = pltdf["Blue-Echidna", "plot"]
plot1[[2,1]] = pltdf["Orange-Cassowary", "plot"]
plot1[[2,2]] = pltdf["Green-Platypus", "plot"]
plot_grid(plotlist = t(plot1), ncol = 2)
plot_grid(plotlist = list(plot1), ncol = 2) # suggested solution on a different problem
plot2 = vector('list', 4) # what I want plots to look like in the end
plot2[[1]] = p[[1,3]]
plot2[[2]] = p[[4,3]]
plot2[[3]] = p[[2, 3]]
plot2[[4]] = p[[5, 3]]
plot_grid(plotlist = t(plot2), ncol = 2)
You can specify the order you want the plots in and subset the data frame accordingly; the result can be passed to plot_grid.
library(cowplot)
order <- c("Red-Emu", "Blue-Echidna", "Orange-Cassowary", "Green-Platypus")
plot_grid(plotlist = pltdf[order, 'plot'], ncol = 2)
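A related idea, sketched here with made-up names, is to keep the plots in a plain list named by "Colour-Animal" so that lookup never depends on row order. This does change the data generation step slightly, so treat it as an alternative rather than a drop-in fix.

```r
library(ggplot2)

myColours <- c("Yellow", "Red", "Blue", "Green", "Orange")
myAnimals <- c("Kangaroo", "Emu", "Echidna", "Platypus", "Cassowary")
x <- seq(1, 10)

# plots_by_key is a hypothetical name: one ggplot per "Colour-Animal" key
keys <- paste(myColours, myAnimals, sep = "-")
plots_by_key <- lapply(seq_along(keys), function(k) {
  d <- data.frame(X = x, Y = x^(k - 1))
  ggplot(d, aes(x = X, y = Y)) +
    geom_point() +
    ggtitle(sub("-", " ", keys[k]))
})
names(plots_by_key) <- keys

wanted <- c("Red-Emu", "Blue-Echidna", "Orange-Cassowary", "Green-Platypus")
selected <- plots_by_key[wanted]  # subsetting by name keeps the requested order
```

The selected list can then be handed to plot_grid(plotlist = selected, ncol = 2).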
Here is the data that I will be using to give context to my question:
library(dplyr)
library(tidyr)
library(ggplot2)
set.seed(1)
f1 <- sample(c(letters[1:3],NA),100, prob = c(rep((0.9/3),times = 3),0.1),replace = T)
f2 <- sample(c(letters[1:3],NA),100, prob = c(rep((0.8/3),times = 3),0.2),replace = T)
f3 <- sample(c(letters[1:3],NA),100, prob = c(rep((0.95/3),times = 3),0.01),replace = T)
sample_dat <- tibble(
x1 = factor(f1, level=letters[1:3]),
x2 = factor(f2, level=letters[1:3]),
x3 = factor(f3, level=letters[1:3]),
grpA = factor(sample(c("grp1","grp2"),100, prob=c(0.3, 0.7) ,replace=T),
levels = c("grp1", "grp2"))
)
sample_dat
Here is a function that I created to prepare the data for plotting:
plot_data_prepr <- function(dat, groupvar, mainvar){
groupvar <- sym(groupvar)
mainvar <- sym(mainvar)
plot_data <- dat %>%
group_by(!!groupvar) %>%
count(!!mainvar, .drop = F) %>% drop_na() %>%
mutate(pct = n/sum(n),
pct2 = ifelse(n == 0, 0.005, n/sum(n)),
grp_tot = sum(n),
pct_lab = paste0(format(pct*100, digits = 1),'%'),
pct_pos = pct2 + .02)
return(plot_data)
}
Here is the application of the function to produce the data sets I will use for plotting:
plot_data_prepr(dat = sample_dat, groupvar = "grpA", mainvar = "x1")
plot_data_prepr(dat = sample_dat, groupvar = "grpA", mainvar = "x2")
plot_data_prepr(dat = sample_dat, groupvar = "grpA", mainvar = "x3")
Here I use a for loop to plot the data and dynamically change the labels of the facets. If one runs this in RStudio as an R Markdown file, one can see that the plots are produced and that the facet labels are each distinct, as they should be given the different degrees of missingness and sampling densities for the 'grpA' variable.
plot_list <- vector('list', length = 0)
for (fct in names(sample_dat)[1:3]){
mvar <- fct
smvar <- sym(mvar)
gvar <- "grpA"
sgvar <- sym(gvar)
dd <- plot_data_prepr(dat = sample_dat, groupvar = gvar, mainvar = mvar)
pre_lookup <- dd %>%
select(!!sgvar, grp_tot) %>%
group_by(!!sgvar) %>%
summarise(lookup = mean(grp_tot))
lookup <- pre_lookup$lookup
my_label <- function(x) {
var <- names(x)[1]
list(paste0(x[[var]], " (N = ", lookup, ")"))
}
plot <- ggplot(dd,
mapping = aes(x=!!smvar, y = pct2, fill = !!smvar)) +
geom_bar(stat = 'identity') +
ylim(0,1.3) +
geom_text(aes(x=!!smvar, label=pct_lab, y = pct_pos + .02)) +
facet_grid(as.formula(paste0(".~", gvar)), labeller = my_label) +
ggtitle(paste(gvar,"by",mvar))
plot_list[[fct]] <- plot
print(plot)
}
Here's my problem -- when I print the plots which are stored in the list,
they all seem to retain the facet label from the last plot, instead of retaining
the distinct facet-labels they displayed when they were originally generated.
for (name in names(sample_dat)[1:3]){
print(plot_list[[name]])
}
Basically, I would like to be able to print the plots from the list
when I need them and have them display their distinct facet labels
as they had been displayed when the plots were originally produced.
Perhaps someone in the community could help me?
I would suggest avoiding the loop when building the plots. Loops tend to cause the kind of issue you are having with labels, and sometimes with data. Here, I have packaged the body of your loop in a function. You can then use lapply() with the names of your variables to create the list of plots directly. Here is the code:
#Function for plot
myplotfun <- function(fct)
{
mvar <- fct
smvar <- sym(mvar)
gvar <- "grpA"
sgvar <- sym(gvar)
dd <- plot_data_prepr(dat = sample_dat, groupvar = gvar, mainvar = mvar)
pre_lookup <- dd %>%
select(!!sgvar, grp_tot) %>%
group_by(!!sgvar) %>%
summarise(lookup = mean(grp_tot))
lookup <- pre_lookup$lookup
my_label <- function(x) {
var <- names(x)[1]
list(paste0(x[[var]], " (N = ", lookup, ")"))
}
plot <- ggplot(dd,
mapping = aes(x=!!smvar, y = pct2, fill = !!smvar)) +
geom_bar(stat = 'identity') +
ylim(0,1.3) +
geom_text(aes(x=!!smvar, label=pct_lab, y = pct_pos + .02)) +
facet_grid(as.formula(paste0(".~", gvar)), labeller = my_label) +
ggtitle(paste(gvar,"by",mvar))
return(plot)
}
Now, we create a list:
#Create a list
plot_list <- lapply(names(sample_dat)[1:3],myplotfun)
Finally, the plots as you used in the last loop:
#Loop
for (i in 1:length(plot_list)){
plot(plot_list[[i]])
}
The problem is that your my_label function has a free variable, lookup, that is only resolved when the plot is actually drawn. After your for loop has run, lookup only contains the value from the last iteration. To capture the current loop value, you can wrap the function in a closure. So you could change the my_label function to
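my_labeler <- function(lookup) {
force(lookup) # evaluate the argument now; a lazy argument could otherwise be resolved after the loop has moved on
function(x) {
var <- names(x)[1]
list(paste0(x[[var]], " (N = ", lookup, ")"))
}
}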
and then call facet_grid with
facet_grid(as.formula(paste0(".~", gvar)), labeller = my_labeler(lookup))
But I agree with @Duck that avoiding the for loop in this case would be easier.
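The capture mechanism can be seen without ggplot2 at all. The sketch below uses only base R and hypothetical names (make_labeler, n_obs); it builds one labelling function per loop iteration and calls them only after the loop has finished.

```r
# Function factory: each call returns a labeller that remembers its own n
make_labeler <- function(n) {
  force(n)  # evaluate n now, not when the labeller is first called
  function(group) paste0(group, " (N = ", n, ")")
}

labelers <- list()
for (n_obs in c(10, 20, 30)) {
  labelers[[length(labelers) + 1]] <- make_labeler(n_obs)
}

# The loop is over, yet every labeller still reports its own value
labels <- vapply(labelers, function(f) f("grp1"), character(1))
print(labels)
```

The force() call matters here because R evaluates function arguments lazily: without it, an argument first used after the loop would be looked up at that point and every labeller could end up with the last value.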
Facebook's Prophet in R (there's also a Python version) is used to generate time series forecasts.
A model m is created by:
m <- prophet(df)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)
plot(m, forecast)
Which returns a very nicely formatted graph, like:
I would like to change the line type, to get not dots but a usual thin line.
I had tried this
lines(m$history$y,lty=1)
but got an error
In doTryCatch(return(expr), name, parentenv, handler)
Are there any suggestions for how to convert those dots into a line?
The plot method for prophet objects uses ggplot2, so base R graphics functions like lines() won't work. You can use ggplot2::geom_line() to add lines, but at the moment I don't see an easy way to replace the points by lines ...
Example from ?prophet:
history <- data.frame(ds = seq(as.Date('2015-01-01'), as.Date('2016-01-01'), by = 'd'),
y = sin(1:366/200) + rnorm(366)/10)
m <- prophet(history)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)
pp <- plot(m,forecast)
Add lines:
library(ggplot2)
pp + geom_line()
This question provides a (hacky) way forward:
pp2 <- pp + geom_line()
qq2 <- ggplot_build(pp2)
qq2$data[[2]]$colour <- NA
plot(ggplot_gtable(qq2))
But obviously something went wrong with the hack. A better bet would be to look at the plot method (prophet:::plot.prophet) and modify it to behave as you want ... Here is the bare-bones version:
df <- prophet:::df_for_plotting(m, forecast)
gg <-ggplot(df, aes(x = ds, y = y)) + labs(x = "ds", y = "y")
gg <- gg + geom_ribbon(ggplot2::aes(ymin = yhat_lower,
ymax = yhat_upper), alpha = 0.2, fill = "#0072B2",
na.rm = TRUE)
## replace first geom_point() with geom_line() in next line ...
gg <- gg + geom_line(na.rm = TRUE) + geom_line(aes(y = yhat),
color = "#0072B2", na.rm = TRUE) + theme(aspect.ratio = 3/5)
I may have stripped out some components that exist in your data/forecast, though ...
Such changes are possible with dyplot.prophet(m, forecast), the HTML version of the plot. First, redefine the function as below (it uses %>%, so dplyr or magrittr must be loaded):
dyplot.prophet <- function(x, fcst, uncertainty=TRUE,
...)
{
forecast.label='Predicted'
actual.label='Actual'
# create data.frame for plotting
df <- prophet:::df_for_plotting(x, fcst)
# build variables to include, or not, the uncertainty data
if(uncertainty && exists("yhat_lower", where = df))
{
colsToKeep <- c('y', 'yhat', 'yhat_lower', 'yhat_upper')
forecastCols <- c('yhat_lower', 'yhat', 'yhat_upper')
} else
{
colsToKeep <- c('y', 'yhat')
forecastCols <- c('yhat')
}
# convert to xts for easier date handling by dygraph
dfTS <- xts::xts(dplyr::select(df, dplyr::all_of(colsToKeep)), order.by = df$ds) # select_() is deprecated
# base plot
dyBase <- dygraphs::dygraph(dfTS)
presAnnotation <- function(dygraph, x, text) {
dygraph %>%
dygraphs::dyAnnotation(x, text, text, attachAtBottom = TRUE)
}
dyBase <- dyBase %>%
# plot actual values
dygraphs::dySeries(
'y', label=actual.label, color='black',stepPlot = TRUE, strokeWidth=1
) %>%
# plot forecast and ribbon
dygraphs::dySeries(forecastCols, label=forecast.label, color='blue') %>%
# allow zooming
dygraphs::dyRangeSelector() %>%
# make unzoom button
dygraphs::dyUnzoom()
if (!is.null(x$holidays)) {
for (i in 1:nrow(x$holidays)) {
# make a gray line
dyBase <- dyBase %>% dygraphs::dyEvent(
x$holidays$ds[i],color = "rgb(200,200,200)", strokePattern = "solid")
dyBase <- dyBase %>% dygraphs::dyAnnotation(
x$holidays$ds[i], x$holidays$holiday[i], x$holidays$holiday[i],
attachAtBottom = TRUE)
}
}
return(dyBase)
}
Previously the actual series used strokeWidth=0; we have changed it to strokeWidth=1 and added stepPlot = TRUE.
The full original code is here: https://rdrr.io/cran/prophet/src/R/plot.R
I'm still learning R (clearly), and cannot figure out where my problem might be when trying to save ggplot2 output into a pdf file. I have been able to create code using a loop to save ggplot output, but want to force myself to avoid loops and take advantage of R's ability to do so.
I have looked at other posts regarding saving pdf files, but none seemed to address my issue.
Here is a reproducible example:
# Create example data frame for reproducible example
amount <- c(rep(5, 25), rep(10, 50), rep(15, 25))
value <- c(rep(100, 20), rep(200, 30), rep(300, 50))
fund <- I(c(rep("FundA", 50), rep("FundB", 50)))
example_df <- data.frame(amount, value, fund)
#==============================================================
# Weighted histogram function for plotting
histogram_wt_fx <- function(my_df, xvar, my_weight,
chart_title = "title",
chart_xlabel = "x",
chart_ylabel = "y") {
library(ggplot2)
histogram <- ggplot(my_df, aes(x = xvar, weight = my_weight)) +
geom_histogram(binwidth=0.25, colour="black", fill="white")
# add another layer showing weighted avg amount
histogram <- histogram + geom_vline(aes(xintercept = sum (xvar*my_weight)),
color="red", linetype="dashed", size=1) +
labs(title = chart_title , x = chart_xlabel, y = chart_ylabel)
}
#===============================================================
# Function to weight data and plot histogram
# Note: fund_wtd_fx in turn calls histogram_wt_fx
fund_wtd_fx <- function(my_df, my_title) {
my_df <- my_df %>%
mutate(pct_amount = amount/sum(amount))
my_df %>%
histogram_wt_fx (xvar = my_df$value,
my_weight = my_df$pct_amount,
chart_title = my_title,
chart_xlabel = "Amount",
chart_ylabel = "Percent") %>%
plot() #%>%
#*** This is where the problem code is ****
#pdf() %>%
#plot()
}
#=====================================
# Extract fund lists from larger data set and run the functions on this list
fund_names <- unique(example_df$fund) # List of funds in the data frame
fund_dfs <- list() # Initialize list of data frames
# Create list of fund data frames
for (myfund in fund_names) {
myfund <- example_df %>%
filter(fund == myfund)
fund_dfs[[length(fund_dfs)+1]] <- myfund
}
rm(myfund)
names(fund_dfs) <- fund_names
# Assign list of fund names to the list of data frames
for (i in 1:length(fund_names)) {
assign(fund_names[[i]], fund_dfs[[i]])
}
# Run histogram function on each fund
my_title <- as.list(paste0("Some title for ", (names(fund_dfs))))
mapply(FUN = fund_wtd_fx, fund_dfs, my_title)
#dev.off()
My problem:
This code runs like I want it to, but if you uncomment lines 39, 41, 42, and 68 (assuming you pasted the code starting in line 1), then the plots do not get saved and a plot.window error is thrown.
I would have thought that pipe operator on uncommented line 39, would feed into the pdf function to save the plot output as the mapply function cycles through the data frames. Ultimately that is what I am trying to do--save the plots generated into a pdf file with this code.
Many thanks for any help or suggestions.
histogram_wt_fx() now returns the plot object to fund_wtd_fx() which now also returns the plot object.
Switched to purrr::map2() from mapply and did the plotting at the end.
Take a look, give it a go and let me know if I can/should explain a bit more.
library(dplyr)
library(ggplot2)
library(purrr)
amount <- c(rep(5, 25), rep(10, 50), rep(15, 25))
value <- c(rep(100, 20), rep(200, 30), rep(300, 50))
fund <- I(c(rep("FundA", 50), rep("FundB", 50)))
example_df <- data.frame(amount, value, fund)
histogram_wt_fx <- function(my_df, xvar, my_weight,
chart_title = "title",
chart_xlabel = "x",
chart_ylabel = "y") {
histogram <- ggplot(my_df, aes(x = xvar, weight = my_weight)) +
geom_histogram(binwidth=0.25, colour="black", fill="white")
histogram <- histogram + geom_vline(aes(xintercept = sum (xvar*my_weight)),
color="red", linetype="dashed", size=1) +
labs(title = chart_title , x = chart_xlabel, y = chart_ylabel)
histogram
}
fund_wtd_fx <- function(my_df, my_title) {
my_df <- my_df %>%
mutate(pct_amount = amount/sum(amount))
my_df %>%
histogram_wt_fx(xvar = my_df$value,
my_weight = my_df$pct_amount,
chart_title = my_title,
chart_xlabel = "Amount",
chart_ylabel = "Percent")
}
fund_names <- unique(example_df$fund) # List of funds in the data frame
fund_dfs <- list() # Initialize list of data frames
for (myfund in fund_names) {
myfund <- example_df %>%
filter(fund == myfund)
fund_dfs[[length(fund_dfs)+1]] <- myfund
}
rm(myfund)
names(fund_dfs) <- fund_names
for (i in 1:length(fund_names)) {
assign(fund_names[[i]], fund_dfs[[i]])
}
my_title <- as.list(paste0("Some title for ", (names(fund_dfs))))
plots <- map2(fund_dfs, my_title, fund_wtd_fx)
pdf() # no file name given, so output goes to Rplots.pdf in the working directory
walk(plots, print)
dev.off()
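If one file per fund is more convenient than a single multi-page PDF, ggsave() can write each plot separately. A sketch with invented toy plots and a temporary directory standing in for the real output location:

```r
library(ggplot2)

# Toy plots, invented for illustration
plots <- list(
  FundA = ggplot(data.frame(v = c(100, 200, 200)), aes(v)) +
    geom_histogram(binwidth = 50),
  FundB = ggplot(data.frame(v = c(200, 300, 300)), aes(v)) +
    geom_histogram(binwidth = 50)
)

out_dir <- tempfile("fund_plots_")  # stand-in for a real output directory
dir.create(out_dir)

# One PDF per plot, named after the list element
files <- file.path(out_dir, paste0(names(plots), ".pdf"))
for (k in seq_along(plots)) {
  ggsave(files[k], plot = plots[[k]], width = 7, height = 5)
}
```

Because ggsave() opens and closes its own device, no explicit pdf() or dev.off() call is needed in this variant.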