How to create a matrix of plots with R and ggplot2

I am trying to arrange n consecutive plots into a single matrix of plots. I generate the plots in a for loop, but I can't figure out how to arrange them into a 'plot of plots'. I have tried par(mfrow=c(num.row,num.col)), but it does not work. I have also tried multiplot(plotlist = p, cols = 4) and plot_grid(plotlist = p), without success.
#import dataset
Survey<-read_excel('datasets/Survey_Key_and_Complete_Responses_excel.xlsx',
sheet = 2)
#Investigate how the dataset looks like
glimpse(Survey)#library dplyr
#change data types
Survey$brand <- as.factor(Survey$brand)
Survey$zipcode <- as.factor(Survey$zipcode)
Survey$elevel <- as.factor(Survey$elevel)
Survey$car <- as.numeric(Survey$car)
#Relation brand-variables
p = list()
for(i in 1:ncol(Survey)) {
if ((names(Survey[i])) == "brand"){
p[[i]]<-ggplot(Survey, aes(x = brand)) + geom_bar() +
labs(x="Brand")
} else if (is.numeric(Survey[[i]]) == "TRUE"){
p[[i]]<-ggplot(Survey, aes(x = Survey[[i]], fill=brand)) + geom_histogram() +
labs(x=colnames(Survey[i]))
} else {
p[[i]]<-ggplot(Survey, aes(x = Survey[[i]], fill = brand)) + geom_bar() +
labs(x=colnames(Survey[i]))
}
}
I think plots are appended correctly to the list but I can not plot them in a matrix form.

The problem does not appear to be with your multiple plots, but how you are calling the variable into your plot.
You've already put "Survey" into ggplot as the first argument (the data slot). In the mapping argument (the second slot), you put in aes(...) and inside that you should be specifying variable names, not data itself. So try this:
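Where you have aes(x = Survey[[i]], fill=brand) in two places,
put aes(x = !!rlang::sym(names(Survey)[i]), fill = brand) instead. Here rlang::sym() turns the column name into a symbol and !! inserts it into the mapping immediately, so each plot keeps its own column instead of whatever i happens to be when the plot is finally drawn.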
Regarding plotting multiple plots, par(mfrow... is for base R plots and cannot be used for ggplots. grid.arrange, multiplot, and plot_grid should all work once you fix the error in your plot.
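As a rough sketch of that last step, assuming the gridExtra package is installed and using the built-in mtcars data as a stand-in for Survey (the variable names below are illustrative, not the asker's), a list of ggplots can be laid out as a grid like this. Building the plots inside lapply() gives each plot its own copy of the column name.

```r
library(ggplot2)
library(gridExtra)

# Stand-in for the asker's data: the first four columns of mtcars
vars <- names(mtcars)[1:4]

# One histogram per column; .data[[v]] looks the column up by its name
plots <- lapply(vars, function(v) {
  ggplot(mtcars, aes(x = .data[[v]])) +
    geom_histogram(bins = 10) +
    labs(x = v)
})

# Combine the list into a 2 x 2 layout (a gtable), then draw it
plot_matrix <- arrangeGrob(grobs = plots, ncol = 2)
grid::grid.newpage()
grid::grid.draw(plot_matrix)
```

arrangeGrob() only builds the layout; grid.arrange(grobs = plots, ncol = 2) builds and draws it in one call.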

Related

drawing multiple plots, 2 per page using ggplot

I have a list of dataframes and I would like to plot them all in an R Markdown document, with 2 plots per page. However, I have not been able to find a source for doing this. Is it possible to do this via a for loop?
What I would like to achieve is something with the following idea:
listOfDataframes <- list(df1, df2, df3, ..., dfn)
for(i in seq_along(listOfDataframes)){
plot <- ggplot(listOfDataframes[[i]], aes(x = aData, y = bData)) + geom_point(color = "steelblue", shape = 19)
#if two plots have been ploted break to a new page.
}
Is this possible to achieve with ggplot in rmarkdown? I need to print out a PDF document.
If you just need to output plots with two per page, then I would use gridExtra. You could do something like this if you put your ggplot objects into a list.
library(ggplot2)
library(shinipsum) # Just used to create random ggplot objects.
library(purrr)
library(gridExtra)
# Create some random ggplot objects.
ggplot_objects <- list(random_ggplot("line"), random_ggplot("line"))
# Create a list of names for the plots.
ggplot_objects_names <- c("This is Graph 1", "This is Graph 2")
# Use map2 to pair each ggplot object with its name and set that name as the plot title.
ggplot_objects_new <-
purrr::map2(
.x = ggplot_objects,
.y = ggplot_objects_names,
.f = function(x, y) {
x + ggtitle(y)
}
)
# Arrange each ggplot object to be 2 per page. Use marrangeGrob so that you can save two ggplot objects per page.
ggplot_arranged <-
gridExtra::marrangeGrob(ggplot_objects_new, nrow = 2, ncol = 1)
# Save as one pdf. Use scale here in order for the multi-plots to fit on each page.
ggsave("ggplot_arranged.pdf",
ggplot_arranged, scale = 1.5)
If you have a list of dataframes that you want to create ggplots for, then you can use purrr::map to do that. You could do something like this:
purrr::map(df_list, function(x) {
ggplot(data = x, aes(x = aData, y = bData)) +
geom_point(color = "steelblue", shape = 19)
})

How can I make a loop with tickers created in Excel in R?

I am trying to make a loop in R with tickers I have created in Excel.
I am trying to collect stock data from Yahoo finance, and the idea is that R is going to read the name of the stock from the Excel file, and then collect it from Yahoo Finance and create a graph.
In the data frame "Stocks" there are 10 different stocks listed in different columns, and I would like to run a loop so that I can get 10 different graphs. Here is the formula I have used to create a graph out of the first stock in the dataset.
Stocks %>%
ggplot(aes(x=Date, y = NOVO.B.CO.Close)) +
geom_line(col = "darkgreen") +
geom_point(col = "darkgreen") +
theme_gray() +
ggtitle("Novo Nordiske B")
Wrapping my comments into a proper answer. It is not clear whether you want to make this a single plot (using facet_wrap) or multiple plots combined into a single window. When using ggplot2 it is beneficial to have a single data.frame in long format, as one can let ggplot handle all of the colouring and grouping based on a grouping column similar to the example below
library(ggplot2)
data(mtcars)
ggplot(mtcars, aes(x = mpg, y = hp, col = factor(cyl))) +
geom_point() +
geom_smooth() +
labs(col = 'Nmb. Cylinder')
From here the guide gives names for each colour, and scale_*_(manual/discrete/continuous) can be used to change specific colour palettes (eg scale_colour_discrete can be used to change the palette for factors).
When it comes to combining ggplots, the patchwork package provides a simple interface. If we assume you have three vectors, tickers, titles and colors, we can create a list of plots and combine them using plain addition (+).
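library(ggplot2)
library(purrr)
library(patchwork)
plots <- vector('list', n <- length(tickers))
base <- ggplot(Stocks, aes(x = Date)) +
theme_gray()
for(i in seq_len(n)){
# aes_string() takes the column name as a string
plots[[i]] <- base +
geom_point(aes_string(y = tickers[i]), col = colors[i]) +
geom_line(aes_string(y = tickers[i]), col = colors[i]) +
ggtitle(titles[i])
}
# patchwork overloads `+` so that adding plots places them side by side
reduce(plots, `+`)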
However for stocks the first option is likely going to give a better result.
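For illustration, the long-format route could look roughly like the sketch below. It assumes the tidyr package is installed, and the stocks_wide data frame with its AAA.Close and BBB.Close columns is a made-up stand-in for the asker's Stocks data, not real prices.

```r
library(ggplot2)
library(tidyr)

# Hypothetical wide data: one Date column plus one column per ticker
stocks_wide <- data.frame(
  Date = as.Date("2021-01-01") + 0:4,
  AAA.Close = c(10, 11, 12, 11, 13),
  BBB.Close = c(20, 19, 21, 22, 21)
)

# Reshape to long format: one row per date and ticker
stocks_long <- pivot_longer(
  stocks_wide,
  cols = -Date,
  names_to = "ticker",
  values_to = "close"
)

# One panel per ticker; ggplot handles the grouping and colouring
p <- ggplot(stocks_long, aes(x = Date, y = close, colour = ticker)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ ticker, scales = "free_y") +
  theme_gray()
print(p)
```

With ten tickers the same code yields ten panels, since facet_wrap() creates one panel per distinct value of ticker.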

Changing title of plots in a loop with colnames() in R

I am creating a for loop which creates a ggplot2 plot for each of the first six columns in a dataframe. Everything works except for the looping of the title names. I have been trying to use title = colnames(df[,i]) and title = paste0(colnames(df[,i])) to create the proper title but it simply ends up repeating the 2nd column name. The plots themselves show the correct data for each column, but the title is for some reason not looping. For the first plot it produces the correct title, but then for the second plot and beyond it just keeps on repeating the third column name, completely skipping over the second column name. I even tried creating a variable within the loop to store the respective title name and then using it in the ggplot2 title labels: changetitle <- colnames(df[,i]) followed by title = changetitle, but that also loops incorrectly.
Here is an example of what I have so far:
plot_6 <- list()
for(i in df[1:6]){
plot_6[i] <- print(ggplot(df, aes(x = i, ...) ...) +
... +
labs(title = colnames(df[,i]),
x = ...) +
...)
}
Thank you very much.
df[1:6] is a data frame with six columns. When used as a loop variable, this results in i being a vector of values each time through the loop. This might "work" in the sense that ggplot will produce a plot, but it breaks the link between the data frame provided to ggplot (df in this case) and the mapping of df's columns to ggplot's aesthetics.
Here are a few options, using the built-in mtcars data frame:
library(tidyverse)
library(patchwork)
plot_6 <- list()
for(i in 1:6) {
var = names(mtcars)[i]
plot_6[[i]] <- ggplot(mtcars, aes(x = !!sym(var))) +
geom_density() +
labs(title = var)
}
# Use column names directly as loop variable
plot_6 <- list() # start from an empty list again
for(i in names(mtcars)[1:6]) {
plot_6[[i]] <- ggplot(mtcars, aes(x = !!sym(i))) +
geom_density() +
labs(title = i)
}
# Use map, which directly generates a list of plots
plot_6 = map(names(mtcars)[1:6],
~ggplot(mtcars, aes(x = !!sym(.x))) +
geom_density() +
labs(title = .x)
)
Any of these produces the same list of plots:
wrap_plots(plot_6)

How do I loop a ggplot2 function to export and save about 40 plots?

I am trying to loop a ggplot2 plot with a linear regression line over it. It works when I type the y column name manually, but the loop method I am trying does not work. It is definitely not a dataset issue.
I've tried many solutions from various websites on how to loop a ggplot and the one I've attempted is the simplest I could find that almost does the job.
The code that works is the following:
plots <- ggplot(Everything.any, mapping = aes(x = stock_VWRETD, y = stock_10065)) +
geom_point() +
labs(x = 'Market Returns', y = 'Stock Returns', title ='Stock vs Market Returns') +
geom_smooth(method='lm',formula=y~x)
But I do not want to do this another 40 times (and then 5 times more for other reasons). The code that I've found on-line and have tried to modify it for my means is the following:
plotRegression <- function(z,na.rm=TRUE,...){
nm <- colnames(z)
for (i in seq_along(nm)){
plots <- ggplot(z, mapping = aes(x = stock_VWRETD, y = nm[i])) +
geom_point() +
labs(x = 'Market Returns', y = 'Stock Returns', title ='Stock vs Market Returns') +
geom_smooth(method='lm',formula=y~x)
ggsave(plots,filename=paste("regression1",nm[i],".png",sep=" "))
}
}
plotRegression(Everything.any)
I expect it to be the nice graph that I'd expect to get, a Stock returns vs Market returns graph, but instead on the y-axis, I get one value which is the name of the respective column, and the Market value plotted as normally, but as if on a straight number-line across the one y-axis value. Please let me know what I am doing wrong.
Desired Plot:
Actual Plot:
Sample Data is available on Google Drive here:
https://drive.google.com/open?id=1Xa1RQQaDm0pGSf3Y-h5ZR0uTWE-NqHtt
The problem is that when you assign variables to aesthetics in aes, you mix bare names and strings. In this example, both X and Y are supposed to be variables in z:
aes(x = stock_VWRETD, y = nm[i])
You refer to stock_VWRETD using a bare name (as required with aes), however for y=, you provide the name as a character vector produced by colnames. See what happens when we replicate this with the iris dataset:
ggplot(iris, aes(Petal.Length, 'Sepal.Length')) + geom_point()
Since aes expects variable names to be given as bare names, it doesn't interpret 'Sepal.Length' as a variable in iris but as a separate vector (consisting of a single character value) which holds the y-values for each point.
What can you do? Here are 2 options that both give the proper plot
1) Use aes_string and change both variable names to character:
ggplot(iris, aes_string('Petal.Length', 'Sepal.Length')) + geom_point()
2) Use square bracket subsetting to manually extract the appropriate variable:
ggplot(iris, aes(Petal.Length, .data[['Sepal.Length']])) + geom_point()
You need to use aes_string instead of aes, with double quotes around your x variable; then you can use your i variable directly. You can also simplify your for loop call. Here is an example using iris.
library(ggplot2)
plotRegression <- function(z,na.rm=TRUE,...){
nm <- colnames(z)
for (i in nm){
plots <- ggplot(z, mapping = aes_string(x = "Sepal.Length", y = i)) +
geom_point()+
geom_smooth(method='lm',formula=y~x)
ggsave(plots,filename=paste("regression1_",i,".png",sep=""))
}
}
myiris<-iris
plotRegression(myiris)
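Since aes_string is soft-deprecated in recent ggplot2 releases, the same loop can also be sketched with the .data pronoun. This is only one possible variant: the function name save_regressions, its arguments, and the choice to write into tempdir() are assumptions made for the example, not part of the original code.

```r
library(ggplot2)

# Hypothetical helper: regress every other numeric column on x_var
save_regressions <- function(data, x_var, out_dir = tempdir()) {
  numeric_cols <- names(data)[sapply(data, is.numeric)]
  y_vars <- setdiff(numeric_cols, x_var)
  files <- character(0)
  for (y_var in y_vars) {
    # .data[[name]] looks up a column from its name given as a string
    p <- ggplot(data, aes(x = .data[[x_var]], y = .data[[y_var]])) +
      geom_point() +
      geom_smooth(method = "lm", formula = y ~ x) +
      labs(x = x_var, y = y_var)
    file <- file.path(out_dir, paste0("regression_", y_var, ".png"))
    ggsave(file, plot = p, width = 6, height = 4)
    files <- c(files, file)
  }
  invisible(files)
}

saved <- save_regressions(iris, "Sepal.Length")
```

Because each plot is saved inside the loop iteration that creates it, the column names are still the right ones when the plot is rendered.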

How to pass an argument to a function to get a line plot using ggplot2?

I am trying to write a function to create time series plot (line graph). How do I pass an argument to function so that the plot is created? I tried different ways like using aes_string etc. but no success.
lineplotfun <- function(feature){
ggplot(aes(x = 1:length(feature), y = feature), data = mtcars) +
geom_line()
}
lineplotfun(mpg)
I want to pass mpg as string or name.
There are numerous problems with the code in the question.
1) y is not in aes()
2) if ggplot2 is loaded, mpg is a tibble
3) y = feature with data = mtcars is meaningless
4) 1:length(feature) only makes sense if feature is a vector
One way of achieving what you want is by setting data = NULL and passing a vector to the function:
lineplotfun <- function(feature){
require(ggplot2)
ggplot2::ggplot(data = NULL, aes(x = seq_along(feature), y = feature)) +
ggplot2::geom_line()
}
lineplotfun(mtcars$mpg)
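If the goal is to pass the column name as a string, as the question asks, one possible sketch uses the .data pronoun. The function name lineplot_by_name and its data argument are assumptions made for this example.

```r
library(ggplot2)

# Hypothetical variant: take the data frame and the column name as a string
lineplot_by_name <- function(data, feature) {
  ggplot(data, aes(x = seq_len(nrow(data)), y = .data[[feature]])) +
    geom_line() +
    labs(x = "Index", y = feature)
}

p <- lineplot_by_name(mtcars, "mpg")
print(p)
```

Passing the data frame explicitly avoids the clash with ggplot2's own mpg dataset mentioned in point 2.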