df <- data.frame(id = rep(1:6, each = 50), a = rnorm(50*6, mean = 10, sd = 5),
b = rnorm(50*6, mean = 20, sd = 10),
c = rnorm(50*6, mean = 30, sd = 15))
I have three variables: a, b and c. To plot one variable for every id, I can do:
ggplot(df, aes(a)) + geom_histogram() + facet_wrap(~id)
I have a loop for which I have to plot a, b and c.
var.names <- c("a","b","c")
for(v in seq_along(var.names)){
variable <- var.names[v]
ggplot(df, aes(x = paste0(variable))) + geom_histogram() + facet_wrap(~id)
}
This loop does not work. How do I refer to a column by its name in the command above? My actual data
has many variables, which is why I am using a loop.
We can use aes_string to pass strings (note that aes_string is soft-deprecated as of ggplot2 3.0.0):
l1 <- vector("list", length(var.names))
for(v in seq_along(var.names)){
variable <- var.names[v]
l1[[v]] <- ggplot(df, aes_string(x = variable)) +
geom_histogram() +
facet_wrap(~id)
}
Another option, available in ggplot2 3.0.0 and later, is to convert the string to a symbol (rlang::sym) and unquote it (!!) within aes:
for(v in seq_along(var.names)){
variable <- rlang::sym(var.names[v])
l1[[v]] <- ggplot(df, aes(x = !!variable)) +
geom_histogram() +
facet_wrap(~id)
}
The plots stored in the list can be saved in a .pdf file
library(gridExtra)
# lapply avoids the dependency on purrr::map, which was never loaded
l2 <- lapply(l1, ggplotGrob)
ggsave(filename = 'plots.pdf', plot = marrangeGrob(grobs = l2, nrow = 1, ncol = 1))
If we need to overlay the three plots in a single page, use gather to convert to 'long' format
library(tidyr)
library(dplyr)
gather(df, key, val, var.names) %>%
ggplot(., aes(x = val, fill = key)) +
geom_histogram() +
facet_wrap(~id)
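In current tidyr, gather is superseded by pivot_longer, so the same reshaping can also be sketched as below. This assumes a recent tidyr that provides pivot_longer and all_of; the name df_long, the fixed seed, bins = 30 and the alpha/position settings are illustrative choices, not part of the original answer.

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # only so the sketch is reproducible
df <- data.frame(id = rep(1:6, each = 50),
                 a = rnorm(50 * 6, mean = 10, sd = 5),
                 b = rnorm(50 * 6, mean = 20, sd = 10),
                 c = rnorm(50 * 6, mean = 30, sd = 15))
var.names <- c("a", "b", "c")

# Stack the columns named in var.names into key/val pairs
df_long <- pivot_longer(df, cols = all_of(var.names),
                        names_to = "key", values_to = "val")

# One panel per id, with the three variables overlaid in each panel
p <- ggplot(df_long, aes(x = val, fill = key)) +
  geom_histogram(bins = 30, alpha = 0.6, position = "identity") +
  facet_wrap(~id)
```

Wrapping the character vector in all_of() makes it explicit that var.names is an external vector of column names rather than a column of df.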
I have a csv file which looks like the following:
Name,Count1,Count2,Count3
application_name1,x1,x2,x3
application_name2,x4,x5,x6
The x variables represent numbers and the application_name variables represent the names of different applications.
Now I would like to make a barplot for each row by using ggplot2. The barplot should have the application_name as title. The x axis should show Count1, Count2, Count3 and the y axis should show the corresponding values (x1, x2, x3).
I would like to have a single barplot for each row, because I have to store the different plots in different files. So I guess I cannot use "melt".
I would like to have something like:
for each row in rows {
print barplot in file
}
Thanks for your help.
You can use melt to rearrange your data and then use either facet_wrap or facet_grid to get a separate plot for each application name
library(ggplot2)
library(reshape2)
# example data
mydf <- data.frame(name = paste0("name",1:4), replicate(5,rpois(4,30)))
names(mydf)[2:6] <- paste0("count",1:5)
# rearrange data
m <- melt(mydf, id.vars = "name")
# if you are wanting to export each plot separately
# I used facet_wrap as a quick way to add the application name as a plot title
# unique() rather than levels(): name is a character column in R >= 4.0
for(i in unique(m$name)) {
  p <- ggplot(subset(m, name==i), aes(variable, value, fill = variable)) +
    facet_wrap(~ name) +
    geom_bar(stat="identity", show.legend=FALSE)
  ggsave(paste0("figure_",i,".pdf"), p)
}
# or all plots in one window
ggplot(m, aes(variable, value, fill = variable)) +
  facet_wrap(~ name) +
  geom_bar(stat="identity", show.legend=FALSE)
I didn't see @user20650's nice answer before preparing this. It's almost identical, except that I use plyr::d_ply to save things instead of a loop. I believe dplyr::do() is another good option (you'd group_by(Name) first).
yourData <- data.frame(Name = sample(letters, 10),
Count1 = rpois(10, 20),
Count2 = rpois(10, 10),
Count3 = rpois(10, 8))
library(reshape2)
yourMelt <- melt(yourData, id.vars = "Name")
library(ggplot2)
# Test on one piece to develop the graph
first_name <- as.character(yourMelt$Name[1])
ggplot(subset(yourMelt, Name == first_name), aes(x = variable, y = value)) +
  geom_bar(stat = "identity") +
  labs(title = first_name)
# Wrap it up, with saving to file
bp <- function(dat) {
  # dat holds the rows for a single Name, so reduce it to that one value
  app_name <- as.character(unique(dat$Name))
  myPlot <- ggplot(dat, aes(x = variable, y = value)) +
    geom_bar(stat = "identity") +
    labs(title = app_name)
  ggsave(filename = paste0("path/to/save/", app_name, "_plot.pdf"),
         myPlot)
}
library(plyr)
d_ply(yourMelt, .variables = "Name", .fun = bp)
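For comparison, the same "one file per row" idea can be sketched without reshape2 or plyr. This is only a sketch under stated assumptions: the apps data frame is a made-up stand-in for the CSV in the question, tempdir() is a placeholder output directory, and the names long, plots and files are illustrative.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for the CSV described in the question
apps <- data.frame(
  Name   = c("application_name1", "application_name2"),
  Count1 = c(3, 7), Count2 = c(5, 2), Count3 = c(4, 9)
)

# Long format: one row per (Name, variable) pair
long <- pivot_longer(apps, cols = -Name,
                     names_to = "variable", values_to = "value")

# One bar plot per application, titled with the application name
plots <- lapply(split(long, long$Name), function(d) {
  ggplot(d, aes(x = variable, y = value)) +
    geom_col() +
    labs(title = d$Name[1])
})

# Write each plot to its own file; tempdir() is a placeholder location
out_dir <- tempdir()
files <- file.path(out_dir, paste0(names(plots), "_plot.pdf"))
for (k in seq_along(plots)) {
  ggsave(filename = files[k], plot = plots[[k]], width = 5, height = 4)
}
```

Keeping the plots in a named list before saving makes it easy to inspect a single plot, for example plots[["application_name1"]], before anything is written to disk.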
I want to use ggplot to loop over several columns to create multiple plots, but using the placeholder in the for loop changes the behavior of ggplot.
If I have this:
t <- data.frame(w = c(1, 2, 3, 4), x = c(23,45,23, 34),
y = c(23,34,54, 23), z = c(23,12,54, 32))
This works fine:
ggplot(data=t, aes(w, x)) + geom_line()
But this does not:
i <- 'x'
ggplot(data=t, aes(w, i)) + geom_line()
Which is a problem if I want to eventually loop over x, y and z.
Any help?
You just need to use aes_string instead of aes, like this:
ggplot(data=t, aes_string(x = "w", y = i)) + geom_line()
Note that w then needs to be specified as a string, too.
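To see why aes(w, i) misbehaves, it can help to compare the values each plot would actually draw. The sketch below uses the .data pronoun (assuming ggplot2 3.0.0 or later) as the string-aware mapping; p_bad, p_good, y_bad and y_good are illustrative names.

```r
library(ggplot2)

t <- data.frame(w = c(1, 2, 3, 4), x = c(23, 45, 23, 34),
                y = c(23, 34, 54, 23), z = c(23, 12, 54, 32))
i <- "x"

# aes(w, i) maps y to the constant string "x", not to the column x
p_bad <- ggplot(t, aes(w, i)) + geom_point()

# .data[[i]] looks up the column whose name is stored in i
p_good <- ggplot(t, aes(w, .data[[i]])) + geom_point()

# layer_data() returns the values each layer would draw
y_bad <- layer_data(p_bad)$y
y_good <- layer_data(p_good)$y
```

Here y_good holds the values of t$x, while y_bad is a single repeated value, because every row was given the same constant.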
ggplot2 3.0.0 and later support the tidy evaluation pronoun .data, so we can do the following:
Build a function that takes x- and y-column names as inputs. Note the use of .data[[]].
Then loop through every column using purrr::map.
library(rlang)
library(tidyverse)
dt <- data.frame(
w = c(1, 2, 3, 4), x = c(23, 45, 23, 34),
y = c(23, 34, 54, 23), z = c(23, 12, 54, 32)
)
Define a function that accepts strings as input
plot_for_loop <- function(df, x_var, y_var) {
ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]])) +
geom_point() +
geom_line() +
labs(x = x_var, y = y_var) +
theme_classic(base_size = 12)
}
Loop through every column
plot_list <- colnames(dt)[-1] %>%
map( ~ plot_for_loop(dt, colnames(dt)[1], .x))
# view all plots individually (not shown)
plot_list
# Combine all plots
library(cowplot)
plot_grid(plotlist = plot_list,
ncol = 3)
Edit: the above function can also be written w/ rlang::sym & !! (bang bang).
plot_for_loop2 <- function(df, .x_var, .y_var) {
# convert strings to symbols
x_var <- sym(.x_var)
y_var <- sym(.y_var)
# unquote the symbols using !!
ggplot(df, aes(x = !! x_var, y = !! y_var)) +
geom_point() +
geom_line() +
# label the axes with the original strings rather than the symbols
labs(x = .x_var, y = .y_var) +
theme_classic(base_size = 12)
}
Or we can just use facet_grid/facet_wrap after converting the data frame from wide to long format (tidyr::gather)
dt_long <- dt %>%
tidyr::gather(key, value, -w)
dt_long
#>    w key value
#> 1  1   x    23
#> 2  2   x    45
#> 3  3   x    23
#> 4  4   x    34
#> 5  1   y    23
#> 6  2   y    34
#> 7  3   y    54
#> 8  4   y    23
#> 9  1   z    23
#> 10 2   z    12
#> 11 3   z    54
#> 12 4   z    32
### facet_grid
ggp1 <- ggplot(dt_long,
aes(x = w, y = value, color = key, group = key)) +
facet_grid(. ~ key, scales = "free", space = "free") +
geom_point() +
geom_line() +
theme_bw(base_size = 14)
ggp1
### facet_wrap
ggp2 <- ggplot(dt_long,
aes(x = w, y = value, color = key, group = key)) +
facet_wrap(. ~ key, nrow = 2, ncol = 2) +
geom_point() +
geom_line() +
theme_bw(base_size = 14)
ggp2
### bonus: reposition legend
# https://cran.r-project.org/web/packages/lemon/vignettes/legends.html
library(lemon)
reposition_legend(ggp2 + theme(legend.direction = 'horizontal'),
'center', panel = 'panel-2-2')
The problem is how you access the columns of the data frame t. As you probably know, there are several ways of doing so, but unfortunately passing a character string to aes is not one of them in ggplot.
One way that might work is to use the numerical position of the column, e.g., you could try i <- 2. Whether this works depends on ggplot, which I have never used (but I know other work by Hadley, and I guess it should work).
Another way of circumventing this is by creating a new temporary data frame every time you call ggplot. e.g.:
tmp <- data.frame(a = t[['w']], b = t[[i]])
ggplot(data=tmp, aes(a, b)) + geom_line()
Depending on what you are trying to do, I find facet_wrap or facet_grid to work well for creating multiple plots with the same basic structure. Something like this should get you in the right ballpark:
t.m = melt(t, id="w")
ggplot(t.m, aes(w, value)) + facet_wrap(~ variable) + geom_line()
data <- data.frame(a=1:10, b=1:10 * 2, c=1:10 * 3)
library(ggplot2)
p <- ggplot(NULL, aes(x = 1:10))
# Using for loop will cause the plot only to draw the last line.
for (i in names(data)){
p <- p + geom_line(aes(y = data[[i]], colour = i))
}
# Lines below works fine.
# p <- p + geom_line(aes(y = data[["a"]], colour = "a"))
# p <- p + geom_line(aes(y = data[["b"]], colour = "b"))
# p <- p + geom_line(aes(y = data[["c"]], colour = "c"))
print(p)
Why doesn't plotting in a loop work as expected?
Is this a lazy plotting method?
You don't actually have to loop to get your lines. You just need to reshape your data and actually include x in your data frame. Your data is wide, and ggplot2 likes long data. This is how you can easily make multiple lines in a single plot.
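As an aside, your method doesn't work because aes() captures its expressions without evaluating them: data[[i]] and i are only evaluated when the plot is printed, by which point i holds its final value. All three layers are added to p, but they all draw the last column.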
library(ggplot2)
library(tidyr)
data <- data.frame(x = 1:10, a=1:10, b=1:10 * 2, c=1:10 * 3)
df <- gather(data, name, value, -x)
ggplot(df, aes(x = x, y = value, color = name)) +
geom_line()
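If the layers really do need to be added in a loop, the current value of i has to be fixed into each mapping when the layer is created, because aes() does not evaluate its arguments until the plot is drawn. A minimal sketch, assuming ggplot2 3.0.0 or later (which supports !! inside aes); the x column is added here for illustration:

```r
library(ggplot2)

data <- data.frame(x = 1:10, a = 1:10, b = 1:10 * 2, c = 1:10 * 3)
p <- ggplot(data, aes(x = x))

for (i in c("a", "b", "c")) {
  # !!i injects the current value of i into the mapping right away,
  # so each layer keeps its own column name and colour label
  p <- p + geom_line(aes(y = .data[[!!i]], colour = !!i))
}

print(p)
```

Each layer now refers to its own column, so the three lines are drawn separately. The reshaping approach above is still the more idiomatic route when all lines share one data frame.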