In my ggplot below, I'm trying to change the 10 facet labels of facet_wrap using labeller(sch.id=paste0("sch.id:", unique(ten$sch.id))).
However, the plot shows NA instead of the correct facet labels. What is the fix?
library(ggplot2)
library(dplyr)  # provides the %>% pipe used below
hsb <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
ten <- subset(hsb, sch.id %in% unique(sch.id)[1:10])
p <- ten %>% ggplot() + aes(ses, math) + geom_point() +
facet_wrap(~sch.id) + geom_smooth(method = "lm", se = FALSE)
p + facet_wrap(~sch.id, labeller = labeller(sch.id=paste0("sch.id:", unique(ten$sch.id)))) ## HERE ##
The problem is that labeller() expects each named argument to be either a labelling function or a named character vector that acts as a lookup table (names are the original facet values, values are the new labels). paste0() returns an unnamed vector, so looking up each sch.id value finds nothing and the result is NA.
One solution is to write the labeller as a function of a variable x (the argument name is arbitrary) and then coerce it to a labeller with as_labeller.
Note that there is no need for unique, just like there is no need for it in the facet_wrap formula.
p <- ten %>% ggplot() + aes(ses, math) + geom_point() +
geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
cust_labeller <- function(x) paste0("sch.id:", x)
p + facet_wrap(~ sch.id,
labeller = as_labeller(cust_labeller)) ## HERE ##
I think the easiest way would be to change sch.id before plotting.
library(ggplot2)
ten$sch.id <- paste0("sch.id:", ten$sch.id)
ggplot(ten) + aes(ses, math) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
facet_wrap(~sch.id)
If you don't want to modify your data and want to use the labeller argument, you can create a named vector (names are the original ids, values are the new labels) and pass it to as_labeller.
cust_label <- setNames(paste0("sch.id:", unique(ten$sch.id)), unique(ten$sch.id))
ggplot(ten) + aes(ses, math) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
facet_wrap(~sch.id, labeller = as_labeller(cust_label))
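Why the names matter can be seen without ggplot2 at all. The sketch below assumes that a character vector handed to as_labeller is used as a lookup table indexed by the facet values; the ids are made-up stand-ins for unique(ten$sch.id).

```r
# Hypothetical ids standing in for unique(ten$sch.id)
ids <- c(1224, 1288, 1296)

unnamed <- paste0("sch.id:", ids)   # what the question passed to labeller()
named   <- setNames(unnamed, ids)   # what the named-vector answer builds

# Look up each facet value by name, as a lookup table would
lookup_unnamed <- unname(unnamed[as.character(ids)])  # no names to match, so all NA
lookup_named   <- unname(named[as.character(ids)])    # the intended labels
```

Printing lookup_unnamed and lookup_named side by side shows the same contrast as the NA panels versus the labelled panels in the plot.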
I would like to create shorthand notations or functions that combine multiple geoms for ggplot.
For example, instead of
mtcars %>%
ggplot(aes(x = cyl, y = mpg)) +
geom_point() +
geom_smooth(method = "lm") +
ggpubr::stat_cor()
I would like to be able to create a function to combine the geoms like so
lm_and_cor <- function() {
geom_smooth(method = "lm", se = FALSE) +
stat_cor()
}
mtcars %>%
ggplot(aes(x = cyl, y = mpg)) +
geom_point() +
lm_and_cor()
I am aware that I can create functions that do all of the plotting, basically
plot_data <- function(x) {
x %>%
ggplot(aes(x = cyl, y = mpg)) +
geom_point() +
geom_smooth(method = "lm") +
ggpubr::stat_cor()
}
which to be fair does what I want, to some degree. However, I would instead like to combine multiple geoms in a single function, as the underlying geom (e.g. point, lines, etc.) will not always be the same. Is this doable, and is it feasible?
With ggplot2 you can add a list of elements to a plot:
lm_and_cor <- function()
list(geom_smooth(method = "lm", se = FALSE),
ggpubr::stat_cor()
)
mtcars %>%
ggplot(aes(x = cyl, y = mpg)) +
geom_point() +
lm_and_cor()
Do you mean something like this?
You can store multiple geoms in a list object.
Edit: I misunderstood the question. This should meet the expectation.
data(iris)
library(ggplot2)
x <- list(geom_point(), geom_line())
ggplot(iris, aes(Sepal.Length, Sepal.Width)) + x
Or, if you want a function that plots arbitrary columns, wrap the column arguments in {{ }} (the embrace operator):
library(dplyr)
plotting <- function(data, x, y){
data %>%
ggplot(aes({{x}}, {{y}})) +
geom_point() +
geom_smooth(method = "lm")}
plotting(iris, Sepal.Length, Sepal.Width)
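The list approach also works with arguments, which covers the case where the underlying geom varies. The following is only a sketch: lm_layers and its points and se parameters are hypothetical names, and it relies on ggplot2 skipping NULL entries when a list is added to a plot.

```r
library(ggplot2)

# Hypothetical helper: returns a list of layers.
# `if (points) ...` evaluates to NULL when points is FALSE, and NULL entries are skipped.
lm_layers <- function(points = TRUE, se = FALSE) {
  list(
    if (points) geom_point(),
    geom_smooth(method = "lm", formula = y ~ x, se = se)
  )
}

p_with    <- ggplot(mtcars, aes(cyl, mpg)) + lm_layers()
p_without <- ggplot(mtcars, aes(cyl, mpg)) + lm_layers(points = FALSE)
```

Because the helper returns ordinary layers, it can be mixed freely with other geoms, scales, and themes added with +.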
I know how to modify titles in ggplot without altering the original data. Suppose I have the following data frame and I want to change the labels. Then, I would do so in the following way
df <- data.frame(x = 1:4, y = 1:4, label = c("params[1]", "params[2]", "params[3]",
"params[4]"))
params_names <- list(
'params[1]'= "beta[11]",
'params[2]'= "beta[22]",
'params[3]'= "beta[33]",
'params[4]'= "beta[44]"
)
param_labeller <- function(variable, value){
params_names[value]
}
ggplot(df, aes(x=x,y=y)) +
geom_point() +
facet_grid(~label, labeller = param_labeller)
If I wanted to display the subscripts, I would just do this
ggplot(df, aes(x=x,y=y)) +
geom_point() +
facet_grid(~label, labeller = label_parsed)
How do I apply both operations at the same time?
I don't know exactly if this conflicts with you not wanting to "alter" the original data, but you can add the labelling information to the factor itself:
df$label2 <- factor(df$label,
labels = c("beta[4]", "beta[24]", "beta[42]", "beta[43]"))
ggplot(df, aes(x = x, y = y)) +
geom_point() +
facet_grid( ~ label2, labeller = label_parsed)
This produces the following plot:
[Figure: plot with formatted facet labels]
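To rename and parse in one step without adding a column, one option is as_labeller with a named lookup vector and default = label_parsed. This is a sketch under the assumption that the lookup is applied first and the default labeller is then applied to the result; params_names is rewritten here as a named character vector rather than the list from the question.

```r
library(ggplot2)

df <- data.frame(x = 1:4, y = 1:4,
                 label = c("params[1]", "params[2]", "params[3]", "params[4]"))

# Lookup table: original facet value -> plotmath expression to display
params_names <- c(
  "params[1]" = "beta[11]",
  "params[2]" = "beta[22]",
  "params[3]" = "beta[33]",
  "params[4]" = "beta[44]"
)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # rename via the lookup, then parse the new labels so subscripts render
  facet_grid(~label, labeller = as_labeller(params_names, default = label_parsed))
```

The original label column is left untouched; only the strip text changes.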
Specifically, this is in a facet_grid. I have googled extensively for similar questions but am not clear on the syntax or where it goes. What I want is for every number on the y-axis to have two digits after the decimal point, even if the trailing one is 0. Is this a parameter in scale_y_continuous or element_text or...?
row1 <- ggplot(sector_data[sector_data$sector %in% pages[[x]],], aes(date,price)) + geom_line() +
geom_hline(yintercept=0,size=0.3,color="gray50") +
facet_grid( ~ sector) +
scale_x_date( breaks='1 year', minor_breaks = '1 month') +
scale_y_continuous( labels = ???) +
theme(panel.grid.major.x = element_line(size=1.5),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_text(size=8),
axis.ticks=element_blank()
)
From the help for ?scale_y_continuous, the argument 'labels' can be a function:
labels One of:
NULL for no labels
waiver() for the default labels computed by the transformation object
A character vector giving labels (must be same length as breaks)
A function that takes the breaks as input and returns labels as output
We will use the last option, a function that takes the breaks as an argument and returns them as character strings formatted with 2 decimal places.
#Our transformation function
scaleFUN <- function(x) sprintf("%.2f", x)
#Plot
library(ggplot2)
p <- ggplot(mpg, aes(displ, cty)) + geom_point()
p <- p + facet_grid(. ~ cyl)
p + scale_y_continuous(labels=scaleFUN)
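To see what the scale receives, the function can be called directly on a few values. The breaks below are made up for illustration; ggplot2 would pass the actual axis breaks.

```r
scaleFUN <- function(x) sprintf("%.2f", x)

# Hypothetical break values
example_breaks <- c(10, 12.5, 15)
example_labels <- scaleFUN(example_breaks)
print(example_labels)  # "10.00" "12.50" "15.00"
```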
The "scales" package has some nice functions for formatting the axes. One of these functions is number_format(). So you don't have to define your function first.
library(ggplot2)
# building on Pierre's answer
p <- ggplot(mpg, aes(displ, cty)) + geom_point()
p <- p + facet_grid(. ~ cyl)
# here comes the difference
p + scale_y_continuous(
labels = scales::number_format(accuracy = 0.01))
# the function offers some other nice possibilities, such as controlling your decimal
# mark, here ',' instead of '.'
p + scale_y_continuous(
labels = scales::number_format(accuracy = 0.01,
decimal.mark = ','))
The scales package has been updated, and number_format() has been retired. Use label_number(). This can also be applied to percentages and other continuous scales (ex: label_percent(); https://scales.r-lib.org/reference/label_percent.html).
#updating Rtists answer with latest syntax from scales
library(ggplot2); library(scales)
p <- ggplot(mpg, aes(displ, cty)) + geom_point()
p <- p + facet_grid(. ~ cyl)
# number_format() is retired; use label_number() instead
p + scale_y_continuous(
labels = label_number(accuracy = 0.01)
)
# for whole numbers use accuracy = 1
p + scale_y_continuous(
labels = label_number(accuracy = 1)
)
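label_number() returns a function, so its effect can be checked outside a plot. A small sketch with made-up break values:

```r
library(scales)

# label_number() builds a labelling function; call it on some example breaks
two_dp <- label_number(accuracy = 0.01)
two_dp_labels <- two_dp(c(10, 12.5, 15))
print(two_dp_labels)  # "10.00" "12.50" "15.00"
```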
Several people have suggested the scales package, but you could just do pretty much the same with base R as well, here by using the format() function.
library(ggplot2)
ggplot(iris, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_point() +
scale_y_continuous(labels = function(x) format(x, nsmall = 2)) +
facet_wrap(~Species)
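One caveat with format(): nsmall is a minimum, so values that need more digits widen every label. A quick sketch with made-up values, plus a rounding step that pins the output to exactly two decimals:

```r
x <- c(1, 2.5, 3.14159)

at_least_two <- format(x, nsmall = 2)            # "1.00000" "2.50000" "3.14159"
exactly_two  <- format(round(x, 2), nsmall = 2)  # "1.00" "2.50" "3.14"
```

For axis breaks this rarely matters because breaks are usually round numbers, but rounding first makes the result predictable.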
In the code below I build a 40x1000 data frame where in each column I have the cumulative means for successive random draws from an exponential distribution with parameter lambda = 0.2.
I add an additional column to host the specific number of the "draw".
I also calculate the rowmeans as df_means.
How do I add df_means (as a black line) on top of all my simulated RVs? I don't understand ggplot well enough to do this.
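library(ggplot2)
library(reshape2)  # for melt()
lambda <- 0.2
df <- data.frame(replicate(1000,cumsum(rexp(40,lambda))/(1:40)))
df_means <- rowMeans(df)  # computed before adding draw, so draw is not averaged in
df$draw <- seq(1,40)
Molten <- melt(df, id.vars="draw")
ggplot(Molten, aes(x = draw, y = value, colour = variable)) + geom_line() + theme(legend.position = "none") + geom_line(df_means)  # the last layer is the part that fails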
How would I add plot(df_means, type="l") to my ggplot above?
Thank you,
You can make another data.frame with the means and ids and use that to draw the line:
df_means <- rowMeans(df[, names(df) != "draw"])  # exclude the draw column from the average
means <- data.frame(id=1:40, mu=df_means)
ggplot(Molten, aes(x=draw, y=value, colour=variable)) +
geom_line() +
theme(legend.position = "none") +
geom_line(data=means, aes(x=id, y=mu), color="black")
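One thing to watch when computing the row means is whether the draw column is part of the data frame at that point. The sketch below uses a smaller simulation (5 series instead of 1000, an arbitrary seed) to show how including draw shifts the means.

```r
set.seed(42)
lambda <- 0.2

# 5 simulated series of cumulative means, 40 draws each
sims <- data.frame(replicate(5, cumsum(rexp(40, lambda)) / (1:40)))
with_draw <- cbind(sims, draw = 1:40)

means_ok  <- rowMeans(sims)       # average of the simulated series only
means_bad <- rowMeans(with_draw)  # the draw index is averaged in as a sixth column
```

With 1000 series the distortion is small, but it grows with the draw number, so it is safer to drop the id column before averaging.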
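Alternatively, you can let stat_summary compute the mean at each draw, so no separate data frame is needed:
stat_sum_single <- function(fun, geom="line", ...) {
# group = 1 pools all simulated series, so one mean is computed per draw
# (the fun argument requires ggplot2 >= 3.3.0; older versions use fun.y)
stat_summary(aes(group=1), fun=fun, colour="black", geom=geom, ...)
}
k<-ggplot(Molten, aes(x = draw, y = value, colour = variable)) + geom_line() + theme(legend.position = "none")
k+stat_sum_single(mean) #gives you the required plot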
I am trying to create a Cleveland dot plot for two categories, in this case J and K. The problem is that the elements A, B, C are in both categories, so R keeps choking on them. I have made a simple example:
x <- c(LETTERS[1:10],LETTERS[1:3],LETTERS[11:17])
type <- c(rep("J",10),rep("K",10))
y <- rnorm(n=20,10,2)
data <- data.frame(x,y,type)
data
data$type <- as.factor(data$type)
nameorder <- data$x[order(data$type,data$y)]
data$x <- factor(data$x,levels=nameorder)
ggplot(data, aes(x=y, y=x)) +
geom_segment(aes(yend=x), xend=0, colour="grey50") +
geom_point(size=3, aes(colour=type)) +
scale_colour_brewer(palette="Set1", limits=c("J","K"), guide="none") +
theme_bw() +
theme(panel.grid.major.y = element_blank()) +
facet_grid(type ~ ., scales="free_y", space="free_y")
Ideally, I would want a dot plot for both categories(J,K) individually with each factor(vector x) decreasing with respect to the y vector. What ends up happening is that both categories aren't going from biggest to smallest and are erratic at the end instead. Please help!
Unfortunately, a factor can only have one set of levels. The only way I've found to do this is to create two separate data.frames from your data and re-level the factor in each. For example
data <- data.frame(
x = c(LETTERS[1:10],LETTERS[1:3],LETTERS[11:17]),
y = rnorm(n=20,10,2),
type= c(rep("J",10),rep("K",10))
)
data$type <- as.factor(data$type)
J<-subset(data, type=="J")
J$x <- reorder(J$x, J$y, max)
K<-subset(data, type=="K")
K$x <- reorder(K$x, K$y, max)
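What reorder() does to the levels can be checked on a tiny example. The data frame toy below is made up for illustration and is separate from J and K.

```r
# Hypothetical three-row example
toy <- data.frame(x = c("A", "B", "C"), y = c(3, 1, 2))

# Levels are sorted by the summary (here max) of y within each x
toy$x <- reorder(toy$x, toy$y, max)
toy_levels <- levels(toy$x)
print(toy_levels)  # "B" "C" "A"
```

Since each subset gets its own factor, A, B and C can sit at different positions in J and in K.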
Now we can plot them with
ggplot(mapping = aes(x=y, y=x, xend=0, yend=x)) +
geom_segment(data=J, colour="grey50") +
geom_point(data=J, size=3, aes(colour=type)) +
geom_segment(data=K, colour="grey50") +
geom_point(data=K, size=3, aes(colour=type)) +
theme_bw() +
theme(panel.grid.major.y = element_blank()) +
facet_grid(type ~ ., scales="free_y", space="free_y")