Extracting the column names in R with lapply - r

So here is my code
h <- lapply(select(winedata, -quality), function(variable){
return(ggplot(aes(x = variable), data = winedata) +
geom_histogram(bins = 30) + xlab(variable))})
There is one problem: xlab(variable) displays the value of the first column as the x-axis title, and if I choose variable[2] it displays the value of the second column. How do I get it to use the column names as the x-axis title? names(variable) does not seem to work.

You can use Map:
Map(function(var, names){
return(ggplot(iris, aes(x = var)) +
geom_histogram(bins = 30) + xlab(names))},
select(iris, -Species), names(iris)[1:4])
Map is essentially mapply with SIMPLIFY=FALSE, which takes multiple inputs and returns a list.
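As an alternative sketch (not part of the answer above), you can iterate over the column names instead of the column values, so the name is available for both the aesthetic and the axis title. This assumes a ggplot2 version that supports the .data pronoun inside aes(); the variable names numeric_cols and plots are made up for illustration.

```r
library(ggplot2)

# Loop over column *names*; the values are looked up inside aes().
numeric_cols <- setdiff(names(iris), "Species")

plots <- lapply(numeric_cols, function(col_name) {
  ggplot(iris, aes(x = .data[[col_name]])) +
    geom_histogram(bins = 30) +
    xlab(col_name)  # col_name is a string, so it works as the axis title
})
names(plots) <- numeric_cols
```

plots[["Sepal.Length"]] then prints the histogram for that column.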


R: Programmatically changing ggplot scale labels to Greek letters with expressions

I am trying to change the labels in a ggplot object to Greek symbols for an arbitrary number of labels. Thanks to this post, I can do this manually when I know the number of labels in advance and the number is not too large:
# Simulate data
df <- data.frame(name = rep(c("alpha1","alpha2"), 50),
value = rnorm(100))
# Create a plot with greek letters for labels
ggplot(df, aes(x = value, y = name)) + geom_density() +
scale_y_discrete(labels = c("alpha1" = expression(alpha[1]),
"alpha2" = expression(alpha[2])))
For our purposes, assume I need to change k default labels, where each of the k labels is the pre-fix "alpha" followed by a number 1:k. Their corresponding updated labels would substitute the greek letter for "alpha" and use a subscript. An example of this is below:
# default labels
paste0("alpha", 1:k)
# desired labels
for (i in 1:k) { expression(alpha[i]) }
I was able to hack together the below programmatic solution that appears to produce the desired result thanks to this post:
ggplot(df, aes(x = value, y = name)) + geom_density() +
scale_y_discrete(labels = parse(text = paste("alpha[", 1:length(unique(df$name)), "]")))
However, I do not understand this code and am seeking clarification about:
What is parse() doing here that expression() otherwise would do?
While I understand everything to the right-hand side of =, what is text doing on the left-hand side of the =?
Another option to achieve your desired result would be to add a new column to your data which contains the ?plotmath expression as a string and map this new column on y. Afterwards you could use scales::label_parse() to parse the expressions:
df <- data.frame(name = rep(c("alpha1","alpha2"), 50),
value = rnorm(100))
df$label <- gsub("^(.*?)(\\d+)$", "\\1[\\2]", df$name)
ggplot(df, aes(x = value, y = label)) + geom_density() +
scale_y_discrete(labels = scales::label_parse())
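On the two questions about parse(): the first argument of parse() is file, so text has to be named to tell it to parse a character vector rather than read from a file. The result is an expression vector, the same kind of object that expression() builds by hand. A small sketch, where k = 3 is an arbitrary value chosen for illustration:

```r
k <- 3  # arbitrary number of labels, for illustration only

# Build the plotmath strings, then turn them into an expression vector.
label_strings <- paste0("alpha[", seq_len(k), "]")
parsed <- parse(text = label_strings, keep.source = FALSE)

# The hand-written equivalent for k = 3.
by_hand <- expression(alpha[1], alpha[2], alpha[3])

# Each parsed element is the same unevaluated call as its hand-written twin.
same <- identical(parsed[[2]], by_hand[[2]])
```

Because the strings are generated from k, this scales to any number of labels, which expression() with literal arguments cannot do.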

how to loop a geographic mapping function over a list of dataframes (or a subsetted dataframe)

I have a data frame consisting of species names and longitude and latitude coordinates. There are 115 different species with 25000 lat/long coordinates. I need to make individual maps that show the observations for each species.
First, I created a function, called platmaps, that generates the kind of map I want. When I call the function on my full dataset (platmaps(df1)), it creates a map displaying all lat/long observations.
Then I constructed a for loop that was supposed to subset my df by species name and pass that subset to my platmaps function. It runs for a couple of minutes and then nothing happens.
So I then split the data frame by species name, created a list of data frames (out1), and used lapply(out1, platmaps), but it only returned a list of the names of my dfs.
Then I tried a variation of an example that I saw here, but it also did not work.
wm <- wm <- borders("world", colour="gray50", fill="gray50")
wm +
geom_point(data =df1 , aes(x = decimalLongitude, y = decimalLatitude),
colour = "pink", size = 0.5)
for(i in 1:nrow(PP)){
p<-subset(df1, df1$species== query))
for (i in 1:length(out1)){
applied example
p =
wm <- wm <- borders("world", colour="gray50", fill="gray50")
wm +
geom_point(data =df1 , aes(x = decimalLongitude, y = decimalLatitude),
colour = "pink", size = 0.5)
plots = df1 %>%
group_by(species) %>%
do(plots = p %+% . + facet_wrap(~species))
the error for the applied example is:
Error: Cannot add ggproto objects together. Did you forget to add this
object to a ggplot object?
As I'm new to R (and coding), I assume I'm getting the syntax wrong, or am not applying my function correctly to/within either of my loops, or I fundamentally misunderstand the way looping works.
data frame sample
species                  decimalLongitude  decimalLatitude
Platanthera lacera              -71.90000         42.80000
Platanthera lacera              -90.54861         40.12083
Platanthera lacera              -71.00889         42.15500
Platanthera lacera              -93.20833         45.20028
Platanthera lacera              -72.45833         41.91666
Platanthera bifolia               5.19800         59.64310
Platanthera sparsiflora        -117.67472         34.36278
fixed platmaps function
ggplot(data=df1 %>% filter(species == s))+
borders("world", colour="gray50", fill="gray50")+
geom_point(aes(x = decimalLongitude, y = decimalLatitude),
colour = "pink", size = 0.5)+
Because you didn't provide a test data set, let me give you a general idea how to make multiple plots you can inspect later. The code below will plot a parameter for a number of countries and save plot pdfs to a given path. You can replace the code behind the pl variable in the loop with your function.
library(ggplot2)
library(dplyr)
df <- data.frame(country = c(rep('USA',20), rep('Canada',20), rep('Mexico',20)),
                 wave = c(1:20, 1:20, 1:20),
                 par = c(1:20 + 5*runif(20), 21:40 + 10*runif(20), 1:20 + 15*runif(20)))
countries <- unique(df$country)
plot_list <- list()
i <- 1
for (c in countries){
  # one plot per country; replace this with your own plotting function
  pl <- ggplot(data = df %>% filter(country == c)) +
    geom_point(aes(wave, par), size = 3, color = 'red') +
    labs(title = as.character(c), x = 'wave', y = 'value') +
    theme_bw(base_size = 16)
  plot_list[[i]] <- pl
  i <- i + 1
}
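pdf.options(width = 9, height = 7)
pdf("plots.pdf") # placeholder file name; the original output path is not shown
for (i in seq_along(plot_list)){
  print(plot_list[[i]])
}
dev.off()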
After the plots are obtained (the plot_list variable), we turn on the pdf terminal and print them. In the end, we turn off the pdf terminal.
There is a neat way to apply any function to a list of items. I have outlined a way to do this with the data you added. I could not get platmaps to work, so I have just made a scatter plot.
The method is to split your data frame into individual subsets using split() and then apply the plotting function to the resulting list using lapply(). Since lapply() returns a list, this can be passed directly to a function such as ggpubr::ggarrange() for visualizing.
plot_function <- function(x){
  p <- ggplot(x, aes(x = decimalLongitude, y = decimalLatitude)) + geom_point()
  p # return the plot
}
plot_list <-
df %>%
split(.$species) %>% # Separate df into subset dfs based on species column
lapply(., plot_function) # map plot_function to list
# Display on a grid (many ways to do this - I just find this package simple)
ggpubr::ggarrange(plotlist = plot_list)
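On the "Cannot add ggproto objects together" error in the question: borders() returns a layer, not a plot, so wm + geom_point(...) tries to add one layer to another. Starting from ggplot() avoids that. Below is a rough sketch of a per-species map function combined with split() and lapply(). The name platmaps_sketch and its background argument are made up for illustration, and the data is just the sample rows from the question.

```r
library(ggplot2)

# Sample rows copied from the question.
df1 <- data.frame(
  species = c(rep("Platanthera lacera", 5),
              "Platanthera bifolia", "Platanthera sparsiflora"),
  decimalLongitude = c(-71.90000, -90.54861, -71.00889, -93.20833,
                       -72.45833, 5.19800, -117.67472),
  decimalLatitude = c(42.80000, 40.12083, 42.15500, 45.20028,
                      41.91666, 59.64310, 34.36278)
)

# Hypothetical helper: takes one species' rows and returns a ggplot object.
# `background` is an optional layer; adding NULL to a plot is a no-op.
platmaps_sketch <- function(d, background = NULL) {
  ggplot(d, aes(x = decimalLongitude, y = decimalLatitude)) +
    background +
    geom_point(colour = "pink", size = 0.5) +
    ggtitle(unique(d$species))
}

# One plot per species, named by species.
species_maps <- lapply(split(df1, df1$species), platmaps_sketch)
```

To draw the world outline, pass background = borders("world", colour = "gray50", fill = "gray50"), which needs the maps package installed. Inside a loop or function a plot is only drawn when it is printed explicitly, for example print(species_maps[["Platanthera lacera"]]).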

R - Reorder a bar plot in a function using ggplot2

I have the following plot function using ggplot2.
Function_Plot <- function(Fun_Data, Fun_Color)
{
  MyPlot <- ggplot(data = na.omit(Fun_Data), aes_string(x = colnames(Fun_Data[2]), fill = colnames(Fun_Data[1]))) +
    geom_bar(stat = "count") +
    coord_flip() +
    scale_fill_manual(values = Fun_Color)
  return(MyPlot)
}
The result is :
I need to upgrade my function to reorder the bars according to the frequencies of the words (in descending order). Following the answer to another question about reordering, I tried to use the reorder function inside aes_string, but it doesn't work.
A reproducible example :
a <- c("G1","G1","G1","G1","G1","G1","G1","G1","G1","G1","G2","G2","G2","G2","G2","G2","G2","G2")
b <- c("happy","sad","happy","bravery","bravery","God","sad","happy","freedom","happy","freedom",
MyData <- data.frame(Cluster = a, Word = b)
MyColor <- c("red","blue")
Function_Plot(Fun_Data = MyData, Fun_Color = MyColor)
Well, if reordering doesn't work inside aes_string, let's try it beforehand.
Function_Plot <- function(Fun_Data, Fun_Color)
{
  # order the levels of the second column by how often each value occurs
  Fun_Data[[2]] <- reorder(Fun_Data[[2]], Fun_Data[[2]], length)
  MyPlot <- ggplot(data = na.omit(Fun_Data), aes_string(x = colnames(Fun_Data[2]), fill = colnames(Fun_Data[1]))) +
    geom_bar(stat = "count") +
    coord_flip() +
    scale_fill_manual(values = Fun_Color)
  return(MyPlot)
}
A couple of other notes: I'd recommend a more consistent style. Mixing whether or not you use _ to separate words in variable names is confusing and invites bugs.
It won't matter much unless your data is really big, but extracting names from a data frame is very efficient, whereas subsetting a data frame is less efficient. Your code subsets a data frame and then extracts the column names remaining, e.g., colnames(Fun_Data[1]). It will be cleaner to extract the names and then subset that vector: colnames(Fun_Data)[1]
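To see what the reorder() call does on its own, here is a small base R sketch. The words vector is made-up data. Passing length as the summary function orders the levels by ascending count, and negating it gives descending order.

```r
# Made-up data: "happy" x3, "sad" x2, "bravery" x1.
words <- c("happy", "sad", "happy", "bravery", "happy", "sad")

# Levels ordered by ascending frequency.
ascending <- reorder(words, words, length)

# Levels ordered by descending frequency.
descending <- reorder(words, words, function(v) -length(v))

levels(ascending)   # "bravery" "sad" "happy"
levels(descending)  # "happy" "sad" "bravery"
```

With coord_flip(), the first level is drawn at the bottom, so the ascending version puts the most frequent word at the top of the chart.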

Loops, dataframes and ggplot

I would like to display multiple plots on the same page using ggplot, and the multiplot function described here: http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/. My data is stored in a large dataframe with the first column corresponding to period. I want to visualize columns 2:26. My issue is reproducible using:
rawdata1 <- data.frame("Period" = 1:34, "Sample" = sample(x = c(1,2),34, replace = TRUE),"Runif" = runif(n = 34))
Intuitively, I would use the following code: (With 2:3 replaced with 2:26)
out <- NULL
for (i in 2:3){
out[[i-1]] <- ggplot(rawdata1, aes("Period", y = value)) + geom_line(aes(x = Period, y = rawdata1[[i]])) + ggtitle(label = colnames(rawdata1)[i])
}
multiplot(plotlist = out, cols = 2)
This succeeds in plotting multiple graphs, however my problem is that each graph that is plotted uses data from the same column (column 3 in the above example, column 26 in my dataset). I've puzzled out that this is because my "out" list stores the ggplot list with the y values stored dynamically.
i's final value is 26, and when I call an item from "out", it uses the current value of i to create the graph, so every graph displays the same column. As I am new to R, my guess is that I am not managing my variables correctly. Any help would be appreciated.
Below you find an alternative: using the melt function from reshape2 and then faceting with facet_wrap.
data.melt <- melt(rawdata1, id.var='Period')
ggplot(data.melt, aes(Period, value)) +
geom_line() +
facet_wrap(~variable, scales='free_y')
If you want to use multiplot instead, you could do the following:
out <- lapply(names(rawdata1)[-1],
function(index) ggplot(rawdata1) +
geom_line(aes_string(x = 'Period', y = index)) +
ggtitle(label = index))
multiplot(plotlist = out, cols = 2)
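The lapply() version works because each function call gets its own environment, so every plot keeps its own column name, whereas aes() in the for loop only stores the expression rawdata1[[i]] and evaluates it when the plot is drawn. Since aes_string() is deprecated in recent ggplot2 releases, the same idea can be sketched with the .data pronoun instead (assuming a ggplot2 version that supports it; value_cols is a name made up for this example):

```r
library(ggplot2)

set.seed(1)
rawdata1 <- data.frame(Period = 1:34,
                       Sample = sample(x = c(1, 2), 34, replace = TRUE),
                       Runif = runif(n = 34))

value_cols <- names(rawdata1)[-1]  # every column except Period

# Each call to the anonymous function has its own col_name,
# so the plots do not all end up pointing at the last column.
out <- lapply(value_cols, function(col_name) {
  ggplot(rawdata1, aes(x = Period, y = .data[[col_name]])) +
    geom_line() +
    ggtitle(label = col_name)
})
names(out) <- value_cols
```

The resulting list can be passed to multiplot(plotlist = out, cols = 2) exactly as before.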

How to specify columns in facet_grid OR how to change labels in facet_wrap

I have a large number of data series that I want to plot using small multiples. A combination of ggplot2 and facet_wrap does what I want, typically resulting in a nice little block of 6 x 6 facets. Here's a simpler version:
The problem is that I don't have adequate control over the labels in facet strips. The names of the columns in the data frame are short and I want to keep them that way, but I want the labels in the facets to be more descriptive. I can use facet_grid so that I can take advantage of the labeller function but then there seems to be no straightforward way to specify the number of columns and a long row of facets just doesn't work for this particular task. Am I missing something obvious?
Q. How can I change the facet labels when using facet_wrap without changing the column names? Alternatively, how can I specify the number of columns and rows when using facet_grid?
Code for a simplified example follows. In real life I am dealing with multiple groups each containing dozens of data series, each of which changes frequently, so any solution would have to be automated rather than relying on manually assigning values.
# Random data with short column names
myrows <- 30
mydf <- data.frame(date = seq(as.Date('2012-01-01'), by = "day", length.out = myrows),
aa = runif(myrows, min=1, max=2),
bb = runif(myrows, min=1, max=2),
cc = runif(myrows, min=1, max=2),
dd = runif(myrows, min=1, max=2),
ee = runif(myrows, min=1, max=2),
ff = runif(myrows, min=1, max=2))
# Plot using facet wrap - we want to specify the columns
# and the rows and this works just fine, we have a little block
# of 2 columns and 3 rows
mydf <- melt(mydf, id = c('date'))
p1 <- ggplot(mydf, aes(y = value, x = date, group = variable)) +
geom_line() +
facet_wrap( ~ variable, ncol = 2)
print (p1)
# Problem: we want more descriptive labels without changing column names.
# We can change the labels, but doing so requires us to
# switch from facet_wrap to facet_grid
# However, in facet_grid we can't specify the columns and rows...
mf_labeller <- function(var, value){ # lifted bodily from the R Cookbook
  value <- as.character(value)
  if (var=="variable") {
    value[value=="aa"] <- "A long label"
    value[value=="bb"] <- "B Partners"
    value[value=="cc"] <- "CC Inc."
    value[value=="dd"] <- "DD Company"
    value[value=="ee"] <- "Eeeeeek!"
    value[value=="ff"] <- "Final"
  }
  return(value)
}
p2 <- ggplot(mydf, aes(y = value, x = date, group = variable)) +
geom_line() +
facet_grid( ~ variable, labeller = mf_labeller)
print (p2)
I don't quite understand. You've already written a function that converts your short labels to long, descriptive labels. What is wrong with simply adding a new column and using facet_wrap on that column instead?
mydf <- melt(mydf, id = c('date'))
mydf$variableLab <- mf_labeller('variable',mydf$variable)
p1 <- ggplot(mydf, aes(y = value, x = date, group = variable)) +
geom_line() +
facet_wrap( ~ variableLab, ncol = 2)
print (p1)
To change the label names, just change the factor levels of the factor you use in facet_wrap. These will be used in facet_wrap on the strips. You can use a similar setup as you would using the labeller function in facet_grid. Just do something like:
new_labels = sapply(levels(df$factor_variable), custom_labeller_function)
df$factor_variable = factor(df$factor_variable, labels = new_labels)
Now you can use factor_variable in facet_wrap.
Just add labeller = label_wrap_gen(width = 25, multi_line = TRUE) to the facet_wrap() arguments.
E.g.: ... + facet_wrap( ~ variable, labeller = label_wrap_gen(width = 25, multi_line = TRUE))
More info: ?ggplot2::label_wrap_gen
Simply add labeller = label_both to the facet_wrap() arguments.
... + facet_wrap( ~ variable, labeller = label_both)
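Another option, sketched here under the assumption that your ggplot2 version provides as_labeller() and accepts a labeller argument in facet_wrap(), is to keep the short column names and supply a named lookup vector for the strip labels. The label text is taken from the question; mydf_long and long_labels are names made up for this example, and the long-format data is built directly instead of with melt().

```r
library(ggplot2)

set.seed(42)
myrows <- 30
short_names <- c("aa", "bb", "cc", "dd", "ee", "ff")

# Long-format data, equivalent in shape to the melted data frame.
mydf_long <- data.frame(
  date = rep(seq(as.Date("2012-01-01"), by = "day", length.out = myrows),
             times = length(short_names)),
  variable = rep(short_names, each = myrows),
  value = runif(myrows * length(short_names), min = 1, max = 2)
)

# Lookup from short column name to descriptive strip label.
long_labels <- c(aa = "A long label", bb = "B Partners", cc = "CC Inc.",
                 dd = "DD Company", ee = "Eeeeeek!", ff = "Final")

p <- ggplot(mydf_long, aes(x = date, y = value, group = variable)) +
  geom_line() +
  facet_wrap(~ variable, ncol = 2, labeller = as_labeller(long_labels))
```

Because the lookup is an ordinary named character vector, it can be generated programmatically, which fits the requirement that nothing is assigned by hand.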
