Changing title of plots in a loop with colnames() in R

I am creating a for loop which creates a ggplot2 plot for each of the first six columns in a dataframe. Everything works except for the looping of the title names. I have been trying to use title = colnames(df[,i]) and title = paste0(colnames(df[,i])) to create the proper title but it simply ends up repeating the 2nd column name. The plots themselves show the correct data for each column, but the title is for some reason not looping. For the first plot it produces the correct title, but then for the second plot and beyond it just keeps on repeating the third column name, completely skipping over the second column name. I even tried creating a variable within the loop to store the respective title name to then use within the ggplot2 title labels: changetitle <- colnames(df[,i]) and then using title = changetitle but that also loops incorrectly.
Here is an example of what I have so far:
plot_6 <- list()
for(i in df[1:6]){
plot_6[i] <- print(ggplot(df, aes(x = i, ...) ...) +
... +
labs(title = colnames(df[,i]),
x = ...) +
...)
}
Thank you very much.

df[1:6] is a data frame with six columns. When used as a loop variable, this results in i being a vector of values each time through the loop. This might "work" in the sense that ggplot will produce a plot, but it breaks the link between the data frame provided to ggplot (df in this case) and the mapping of df's columns to ggplot's aesthetics.
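To see this concretely, here is a small sketch with a made-up two-column data frame (toy is hypothetical, not the asker's data). Looping over the data frame yields bare column vectors, while looping over names(toy) yields the strings needed for titles.

```r
# Hypothetical toy data, for illustration only
toy <- data.frame(a = 1:3, b = c(10, 20, 30))

# Looping over a data frame iterates over its columns as plain vectors
seen_values <- list()
for (i in toy) {
  seen_values[[length(seen_values) + 1]] <- i
}

# Looping over the names gives strings usable for titles and column lookups
seen_names <- character()
for (nm in names(toy)) {
  seen_names <- c(seen_names, nm)
}

str(seen_values)     # two vectors, with no column names attached
seen_names
colnames(toy[, 1])   # NULL: a single extracted column carries no column name
```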
Here are a few options, using the built-in mtcars data frame:
library(tidyverse)
library(patchwork)
plot_6 <- list()
for(i in 1:6) {
var = names(mtcars)[i]
plot_6[[i]] <- ggplot(mtcars, aes(x = !!sym(var))) +
geom_density() +
labs(title = var)
}
# Use column names directly as loop variable
plot_6 <- list()
for(i in names(mtcars)[1:6]) {
  plot_6[[i]] <- ggplot(mtcars, aes(x = !!sym(i))) +
    geom_density() +
    labs(title = i)
}
# Use map, which directly generates a list of plots
plot_6 = map(names(mtcars)[1:6],
~ggplot(mtcars, aes(x = !!sym(.x))) +
geom_density() +
labs(title = .x)
)
Any of these produces the same list of plots:
wrap_plots(plot_6)
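Another possible variant, sketched here assuming ggplot2 3.0.0 or later, wraps the plot in a small helper and uses the .data pronoun instead of !!sym(). The helper name density_plot is hypothetical. Because each call to the helper has its own environment, the column name stays fixed for each plot.

```r
library(ggplot2)

# Hypothetical helper: one density plot for the column named `var`
density_plot <- function(data, var) {
  ggplot(data, aes(x = .data[[var]])) +
    geom_density() +
    labs(title = var, x = var)
}

vars <- names(mtcars)[1:6]

# `data` is matched by name, so each element of `vars` is passed as `var`
plot_6 <- lapply(vars, density_plot, data = mtcars)
names(plot_6) <- vars
```

Individual plots can then be pulled out by name, for example plot_6[["hp"]].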

Related

How can I make a loop with tickers created in Excel in R?

I am trying to make a loop in R with tickers I have created in Excel.
I am trying to collect stock data from Yahoo finance, and the idea is that R is going to read the name of the stock from the Excel file, and then collect it from Yahoo Finance and create a graph.
In the data frame "Stocks" there are 10 different stocks listed in different columns, and I would like to run a loop so that I can get 10 different graphs. Here is the formula I have used to create a graph out of the first stock in the dataset.
Stocks %>%
ggplot(aes(x=Date, y = NOVO.B.CO.Close)) +
geom_line(col = "darkgreen") +
geom_point(col = "darkgreen") +
theme_gray() +
ggtitle("Novo Nordiske B")
Wrapping my comments into a proper answer. It is not clear whether you want to make this a single plot (using facet_wrap) or multiple plots combined into a single window. When using ggplot2 it is beneficial to have a single data.frame in long format, as one can let ggplot handle all of the colouring and grouping based on a grouping column similar to the example below
library(ggplot2)
data(mtcars)
ggplot(mtcars, aes(x = mpg, y = hp, col = factor(cyl))) +
geom_point() +
geom_smooth() +
labs(col = 'Nmb. Cylinder')
From here the guide gives names for each colour, and scale_*_(manual/discrete/continuous) can be used to change specific colour palettes (eg scale_colour_discrete can be used to change the palette for factors).
When it comes to combining ggplots, the patchwork package provides a simple interface. If we assume you have vectors tickers, titles and colors, we can create a list of plots and combine them using simple addition (+).
library(purrr)
library(patchwork) # provides `+` for combining ggplots
plots <- vector('list', n <- length(tickers))
base <- ggplot(Stocks, aes(x = Date)) +
  theme_gray()
for(i in seq_len(n)){
  # Each layer must be joined with `+`, otherwise only the first line is assigned
  plots[[i]] <- base +
    geom_point(aes_string(y = tickers[i]), col = colors[i]) +
    geom_line(aes_string(y = tickers[i]), col = colors[i]) +
    ggtitle(titles[i])
}
reduce(plots, `+`)
However for stocks the first option is likely going to give a better result.
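As a rough sketch of that first option applied to stock data: the Stocks data frame below is a made-up stand-in (random prices, and OTHER.CO.Close is an invented second ticker), since the real data is not shown. It is reshaped to long format with tidyr and drawn with one panel per ticker.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for the asker's `Stocks` data frame:
# one Date column plus one closing-price column per ticker.
set.seed(1)
Stocks <- data.frame(
  Date = seq(as.Date("2020-01-01"), by = "day", length.out = 30),
  NOVO.B.CO.Close = 400 + cumsum(rnorm(30)),
  OTHER.CO.Close = 100 + cumsum(rnorm(30))  # made-up second ticker
)

# Wide to long: one row per Date/ticker pair
stocks_long <- pivot_longer(Stocks, cols = -Date,
                            names_to = "ticker", values_to = "close")

# One panel per ticker; the facet strip serves as the per-plot title
p <- ggplot(stocks_long, aes(x = Date, y = close)) +
  geom_line(col = "darkgreen") +
  geom_point(col = "darkgreen") +
  facet_wrap(~ ticker, scales = "free_y") +
  theme_gray()
```

With ten tickers the same code yields ten panels, with no loop needed.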

How to assign unique title and text labels to ggplots created in lapply loop?

I've tried about every iteration I can find on Stack Exchange of for loops and lapply loops to create ggplots and this code has worked well for me. My only problem is that I can't assign unique titles and labels. From what I can tell in the function i takes the values of my response variable so I can't index the title I want as the ith entry in a character string of titles.
The example I've supplied creates plots with the correct values but the 2nd and 3rd plots in the plot lists don't have the correct titles or labels.
Mock dataset:
library(ggplot2)
nms=c("SampleA","SampleB","SampleC")
measr1=c(0.6,0.6,10)
measr2=c(0.6,10,0.8)
measr3=c(0.7,10,10)
qual1=c("U","U","")
qual2=c("U","","J")
qual3=c("J","","")
df=data.frame(nms,measr1,qual1,measr2,qual2,measr3,qual3,stringsAsFactors = FALSE)
identify columns in dataset that contain response variable
measrsindex=c(2,4,6)
Create list of plots that show all samples for each measurement
plotlist=list()
plotlist=lapply(df[,measrsindex], function(i) ggplot(df,aes_string(x="nms",y=i))+
geom_col()+
ggtitle("measr1")+
geom_text(aes(label=df$qual1)))
Create list of plots that show all measurements for each sample
plotlist2=list()
plotlist2=lapply(df[,measrsindex],function(i)ggplot(df,aes_string(x=measrsindex, y=i))+
geom_col()+
ggtitle("SampleA")+
geom_text(aes(label=df$qual1)))
The problem is that I can't create a unique title for each plot. (All plots in the example have the title "measr1" or "SampleA".)
Additionally I can't apply unique labels (from the qual columns) for each bar. (E.g. the letter for qual2 should appear on top of the column for measr2 for each sample.)
Additionally in the second plot list the x-values aren't "measr1","measr2","measr3" they're the index values for those columns which isn't ideal.
I'm relatively new to R and have never posted on Stack Overflow before so any feedback about my problem or posting questions is welcomed.
I've found lots of questions and answers about this sort of topic but none that have a data structure or desired plot quite like mine. I apologize if this is a redundant question but I have tried to find the solution in previous answers and have been unable.
This is where I got the original code to make my loops, however this example doesn't include titles or labels:
Looping over ggplot2 with columns
You could loop over the names of the columns instead of the column itself and then use some non-standard evaluation to get column values from the names. Also, I have included label in aes.
library(ggplot2)
library(rlang)
plotlist3 <- purrr::map(names(df)[measrsindex],
~ggplot(df, aes(nms, !!sym(.x), label = qual1)) +
geom_col() + ggtitle(.x) + geom_text(vjust = -1))
plotlist3[[1]]
plotlist3[[2]]
The same can be achieved with lapply as well
plotlist4 <- lapply(names(df)[measrsindex], function(x)
ggplot(df, aes(nms, !!sym(x), label = qual1)) +
geom_col() + ggtitle(x) + geom_text(vjust = -1))
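Both versions above label every plot with qual1. If each measure should use its own qual column, one possible sketch is to walk the measure names and the qual names in parallel with Map(). The helper name plot_measure and the result name plotlist5 are made up for this example, and the .data pronoun assumes ggplot2 3.0.0 or later.

```r
library(ggplot2)

# Mock data from the question
df <- data.frame(
  nms = c("SampleA", "SampleB", "SampleC"),
  measr1 = c(0.6, 0.6, 10), qual1 = c("U", "U", ""),
  measr2 = c(0.6, 10, 0.8), qual2 = c("U", "", "J"),
  measr3 = c(0.7, 10, 10),  qual3 = c("J", "", ""),
  stringsAsFactors = FALSE
)

measr_cols <- c("measr1", "measr2", "measr3")
qual_cols  <- c("qual1", "qual2", "qual3")

# Hypothetical helper: bar chart of one measure, labelled by its own qualifier
plot_measure <- function(measr, qual) {
  # Force the arguments so each plot keeps its own column names
  force(measr)
  force(qual)
  ggplot(df, aes(x = nms, y = .data[[measr]], label = .data[[qual]])) +
    geom_col() +
    geom_text(vjust = -1) +
    ggtitle(measr)
}

# Map() pairs measr1 with qual1, measr2 with qual2, and so on
plotlist5 <- Map(plot_measure, measr_cols, qual_cols)
```

The resulting list is named after the measure columns, so plotlist5[["measr2"]] is the plot titled "measr2" with labels from qual2.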
I would recommend putting your data in long format prior to using ggplot2, it makes plotting a much simpler task. I also recoded some variables to facilitate constructing the plot. Here is the code to construct the plots with lapply.
library(tidyverse)
#Change from wide to long format
df1<-df %>%
pivot_longer(cols = -nms,
names_to = c(".value", "obs"),
names_sep = "[rl]") %>% # a single regex: split "measr1" at "r" and "qual1" at "l"
#Separate Sample column into letters
separate(col = nms,
sep = "Sample",
into = c("fill","Sample"))
#Change measures index to 1-3
measrsindex=c(1,2,3)
plotlist=list()
plotlist=lapply(measrsindex, function(i){
#Subset by measrsindex (numbers) and plot
df1 %>%
filter(obs == i) %>%
ggplot(aes_string(x="Sample", y="meas", label="qua"))+
geom_col()+
labs(x = "Sample") +
ggtitle(paste("Measure",i, collapse = " "))+
geom_text()})
#Get the letters A : C
samplesvec<-unique(df1$Sample)
plotlist2=list()
plotlist2=lapply(samplesvec, function(i){
#Subset by samplesvec (letters) and plot
df1 %>%
filter(Sample == i) %>%
ggplot(aes_string(x="obs", y = "meas",label="qua"))+
geom_col()+
labs(x = "Measure") +
ggtitle(paste("Sample",i,collapse = ", "))+
geom_text()})
Looking at the final plots, I think it might be useful to use facet_wrap to make these plots. I added the code to use it with your plots.
#Plot for Measures
ggplot(df1, aes(x = Sample,
y = meas,
label = qua)) +
geom_col()+
facet_wrap(~ obs) +
ggtitle("Measures")+
labs(x="Samples")+
geom_text()
#Plot for Samples
ggplot(df1, aes(x = obs,
y = meas,
label = qua)) +
geom_col()+
facet_wrap(~ Sample) +
ggtitle("Samples")+
labs(x="Measures")+
geom_text()
Here is a sample of the plots using facet_wrap.

How do I loop a ggplot2 function to export and save about 40 plots?

I am trying to loop a ggplot2 plot with a linear regression line over it. It works when I type the y column name manually, but the loop method I am trying does not work. It is definitely not a dataset issue.
I've tried many solutions from various websites on how to loop a ggplot and the one I've attempted is the simplest I could find that almost does the job.
The code that works is the following:
plots <- ggplot(Everything.any, mapping = aes(x = stock_VWRETD, y = stock_10065)) +
geom_point() +
labs(x = 'Market Returns', y = 'Stock Returns', title ='Stock vs Market Returns') +
geom_smooth(method='lm',formula=y~x)
But I do not want to do this another 40 times (and then 5 times more for other reasons). The code that I've found on-line and have tried to modify it for my means is the following:
plotRegression <- function(z,na.rm=TRUE,...){
nm <- colnames(z)
for (i in seq_along(nm)){
plots <- ggplot(z, mapping = aes(x = stock_VWRETD, y = nm[i])) +
geom_point() +
labs(x = 'Market Returns', y = 'Stock Returns', title ='Stock vs Market Returns') +
geom_smooth(method='lm',formula=y~x)
ggsave(plots,filename=paste("regression1",nm[i],".png",sep=" "))
}
}
plotRegression(Everything.any)
I expect it to be the nice graph that I'd expect to get, a Stock returns vs Market returns graph, but instead on the y-axis, I get one value which is the name of the respective column, and the Market value plotted as normally, but as if on a straight number-line across the one y-axis value. Please let me know what I am doing wrong.
Desired Plot:
Actual Plot:
Sample Data is available on Google Drive here:
https://drive.google.com/open?id=1Xa1RQQaDm0pGSf3Y-h5ZR0uTWE-NqHtt
The problem is that when you assign variables to aesthetics in aes, you mix bare names and strings. In this example, both X and Y are supposed to be variables in z:
aes(x = stock_VWRETD, y = nm[i])
You refer to stock_VWRETD using a bare name (as required with aes), however for y=, you provide the name as a character vector produced by colnames. See what happens when we replicate this with the iris dataset:
ggplot(iris, aes(Petal.Length, 'Sepal.Length')) + geom_point()
Since aes expects variable names to be given as bare names, it doesn't interpret 'Sepal.Length' as a variable in iris but as a separate vector (consisting of a single character value) which holds the y-values for each point.
What can you do? Here are 2 options that both give the proper plot
1) Use aes_string and change both variable names to character:
ggplot(iris, aes_string('Petal.Length', 'Sepal.Length')) + geom_point()
2) Use square bracket subsetting to manually extract the appropriate variable:
ggplot(iris, aes(Petal.Length, .data[['Sepal.Length']])) + geom_point()
You need to use aes_string instead of aes, with double quotes around your x variable; then you can use your i variable directly. You can also simplify your for loop call. Here is an example using iris.
library(ggplot2)
plotRegression <- function(z,na.rm=TRUE,...){
nm <- colnames(z)
for (i in nm){
plots <- ggplot(z, mapping = aes_string(x = "Sepal.Length", y = i)) +
geom_point()+
geom_smooth(method='lm',formula=y~x)
ggsave(plots,filename=paste("regression1_",i,".png",sep=""))
}
}
myiris<-iris
plotRegression(myiris)
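Note that aes_string is deprecated in recent ggplot2 releases. A possible alternative, sketched below with iris standing in for the real data, uses the .data pronoun inside a small helper and restricts the loop to numeric columns other than the x variable. The names regression_plot, x_col and y_cols are hypothetical.

```r
library(ggplot2)

# Hypothetical helper: scatter plot with a linear fit of `y_col` against `x_col`
regression_plot <- function(data, x_col, y_col) {
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]])) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x) +
    labs(x = x_col, y = y_col, title = paste(y_col, "vs", x_col))
}

# Only numeric columns other than the x variable are used as y
x_col  <- "Sepal.Length"
y_cols <- setdiff(names(iris)[sapply(iris, is.numeric)], x_col)

plots <- lapply(y_cols, function(y) regression_plot(iris, x_col, y))
names(plots) <- y_cols

# Saving is left as a separate step, for example:
# for (y in y_cols) {
#   ggsave(file.path(tempdir(), paste0("regression1_", y, ".png")), plots[[y]])
# }
```

Keeping plot construction separate from ggsave makes it easy to inspect a single plot, such as plots[["Petal.Width"]], before writing 40 files.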

How to create a matrix of plots with R and ggplot2

I am trying to arrange n consecutive plots into one single matrix of plots. I get the plots in the first place by running a for-loop, but I can't figure out how to arrange those into a 'plot of plots'. I have used par(mfrow=c(num.row,num.col)) but it does not work. I have also tried multiplot(plotlist = p, cols = 4) and plot_grid(plotlist = p).
#import dataset
Survey<-read_excel('datasets/Survey_Key_and_Complete_Responses_excel.xlsx',
sheet = 2)
#Investigate how the dataset looks like
glimpse(Survey)#library dplyr
#change data types
Survey$brand <- as.factor(Survey$brand)
Survey$zipcode <- as.factor(Survey$zipcode)
Survey$elevel <- as.factor(Survey$elevel)
Survey$car <- as.numeric(Survey$car)
#Relation brand-variables
p = list()
for(i in 1:ncol(Survey)) {
if ((names(Survey[i])) == "brand"){
p[[i]]<-ggplot(Survey, aes(x = brand)) + geom_bar() +
labs(x="Brand")
} else if (is.numeric(Survey[[i]]) == "TRUE"){
p[[i]]<-ggplot(Survey, aes(x = Survey[[i]], fill=brand)) + geom_histogram() +
labs(x=colnames(Survey[i]))
} else {
p[[i]]<-ggplot(Survey, aes(x = Survey[[i]], fill = brand)) + geom_bar() +
labs(x=colnames(Survey[i]))
}
}
I think plots are appended correctly to the list but I can not plot them in a matrix form.
The problem does not appear to be with your multiple plots, but how you are calling the variable into your plot.
You've already put "Survey" into ggplot as the first argument (the data slot). In the mapping argument (the second slot), you put in aes(...) and inside that you should be specifying variable names, not data itself. So try this:
Where you have aes(x = Survey[[i]], fill=brand) in two places,
put aes(x = !!sym(names(Survey)[i]), fill = brand) instead, so that the column is mapped by name (sym() comes from the rlang package).
Regarding plotting multiple plots, par(mfrow... is for base R plots and cannot be used for ggplots. grid.arrange, multiplot, and plot_grid should all work once you fix the error in your plot.
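As a minimal sketch of the arranging step, here is one way to do it with the patchwork package. mtcars stands in for the Survey data, which is not available, and the column selection and the name plot_matrix are made up for the example.

```r
library(ggplot2)
library(patchwork)

# Build a few plots by column name (mtcars stands in for the Survey data)
cols <- c("mpg", "hp", "wt", "qsec")
p <- lapply(cols, function(col) {
  ggplot(mtcars, aes(x = .data[[col]], fill = factor(cyl))) +
    geom_histogram(bins = 10) +
    labs(x = col, fill = "cyl")
})

# Arrange the list of plots in a 2 x 2 matrix
plot_matrix <- wrap_plots(p, ncol = 2)
# print(plot_matrix)  # draws the grid
```

Because the plots are built inside a function, each one keeps its own column, which avoids every panel showing the last column of the loop.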

How to use column names starting with numbers in ggplot functions

I have a huge dataframe whose variable/column names start with a number, such as `1_variable`. Now I am trying to create a function that can take these column names as arguments and then plot a few boxplots using ggplot. However, I need the string but also need to wrap it in `` to use the arguments in ggplot, and I am not sure how to escape a character string such as "1_variable" to give ggplot an input that is `1_variable`.
small reproducible example:
dfx = data.frame(`1ev`=c(rep(1,5), rep(2,5)), `2ev`=sample(10:99, 10),
`3ev`=10:1, check.names = FALSE)
If I were to plot the figure manually, the input would look like this:
dfx$`1ev` <- as.factor(dfx$`1ev`)
ggplot(dfx, aes(x = `1ev`, y = `2ev`))+
geom_boxplot()
the function I'd like to be able to run for the dataframe is this one:
plot_boxplot <- function(data, group, value){
data = data[c(group, value)]
data[,group] = as.factor(data[,group])
plot <- ggplot(data, aes(x = group, y = value))+
geom_boxplot()
return(plot)
}
1. Try
plot_boxplot(dfx, `1ev`, `2ev`)
which gives me an error saying Error in [.data.frame(data, c(group, value)) : object '1ev' not found
2. Try
entering the arguments with double quotes "" gives me unexpectedly this:
plot_boxplot(dfx, "1ev", "2ev")
3. Try
I also tried to replace the double quotes of the string with gsub in the function
gsub('\"', '`', group)
but that does not change anything about its output.
4. Try
finally, I also tried to make use of aes_string , but that just gives me the same errors.
plot_boxplot <- function(data, group, value){
data = data[c(as.character(group), as.character(value))]
data[,group] = as.factor(data[,group])
plot <- ggplot(data, aes_string(x= group, y=value))+
geom_boxplot()
return(plot)
}
plot_boxplot(dfx, `1ev`, `2ev`)
plot_boxplot(dfx, "1ev", "2ev")
Ideally I would like to run the function to produce this output:
plot_boxplot(dfx, group = "1ev", value = "2ev")
[can be produced with this code manually]
ggplot(dfx, aes(x= `1ev`, y=`2ev`)) +
geom_boxplot()
Any help would be greatly appreciated.
One way to do this is a combination of aes_ and as.name():
plot_boxplot <- function(data, group, value){
data = data[c(group, value)]
data[,group] = as.factor(data[,group])
plot <- ggplot(data, aes_(x= as.name(group), y=as.name(value))) +
geom_boxplot()
return(plot)
}
And passing in strings for group and value:
plot_boxplot(dfx, "1ev", "2ev")
It's not the same plot you show above, but it looks to align with the data.
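Since aes_ is deprecated in recent ggplot2 releases, here is a sketch of the same idea with the .data pronoun (assuming ggplot2 3.0.0 or later). The function name plot_boxplot2 is made up, and the `2ev` column uses fixed values instead of sample() so the result is reproducible.

```r
library(ggplot2)

dfx <- data.frame(`1ev` = c(rep(1, 5), rep(2, 5)),
                  `2ev` = 10:19,   # fixed values instead of sample(), for reproducibility
                  `3ev` = 10:1,
                  check.names = FALSE)

# Same idea as above, written with the .data pronoun
plot_boxplot2 <- function(data, group, value) {
  # Convert the grouping column to a factor so each level gets its own box
  data[[group]] <- as.factor(data[[group]])
  ggplot(data, aes(x = .data[[group]], y = .data[[value]])) +
    geom_boxplot() +
    labs(x = group, y = value)
}

p <- plot_boxplot2(dfx, group = "1ev", value = "2ev")
```

The column names are passed as ordinary strings, so no backticks are needed at the call site even though the names start with a digit.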
