I am trying to write a function that uses ggplot but allows user specification of several of the plotting variables. However I'm having trouble getting it to work as a function (receiving an error message: see below).
A small example dataset and working implementation are provided below, together with my attempt at the function and the associated error. I'm sure it is to do with non-standard evaluation (NSE), but I'm unsure how to get around it given my use of filter within the function, and my various attempts have been in vain.
library(dplyr)
library(ggplot2)
df<-data.frame(Date=c(seq(1:50),seq(1:50)), SRI=runif(100,-2,2), SITE=(c(rep("A",50), rep("B", 50))))
ggplot() +
geom_linerange(aes(x = Date, ymin = 0, ymax = SRI), colour = I('blue'), data = filter(df, SRI>0)) +
geom_linerange(aes(x = Date, ymin = SRI, ymax = 0), colour = I('red'), data = filter(df, SRI<=0)) +
facet_wrap(~SITE) +
labs(x = 'Date', y = "yvar", title = "Plot title")
The above works, but when implemented as a function:
plot_fun <- function(df, x, y, ylab="y-lab", plot_title="Title", facets) {
ggplot() +
geom_linerange(aes(x = x, ymin = 0, ymax = y), colour = I('blue'), data = filter(df, y > 0)) +
geom_linerange(aes(x = x, ymin = y, ymax = 0), colour = I('red'), data = filter(df, y <= 0)) +
facet_wrap(~ facets) +
labs(x = 'Date', y = ylab, title = plot_title)
return(p)
}
plot_fun(df, x="Date", y="SRI", ylab="y-lab", plot_title="Title", facets="SITE")
I get the following error: "Error: Aesthetics must be either length 1 or the same as the data (1): x, ymin, ymax".
I've tried various approaches using aes_string and filter_, but all have been unsuccessful.
Any help much appreciated.
Regards
Nick
You'll need to switch to aes_string as you expected and change your facet_wrap code to either take the facets argument as a formula or remove the tilde as in the answers to this question. You'll also need to switch to using filter_, which can be used along with interp from package lazyeval.
library(lazyeval)
Here is your function with the changes I outlined and the resulting plot:
plot_fun <- function(df, x, y, ylab = "y-lab", plot_title = "Title", facets) {
ggplot() +
geom_linerange(aes_string(x = x, ymin = 0, ymax = y), colour = I('blue'),
data = filter_(df, interp(~var > 0, var = as.name(y)))) +
geom_linerange(aes_string(x = x, ymin = y, ymax = 0), colour = I('red'),
data = filter_(df, interp(~var <= 0, var = as.name(y)))) +
facet_wrap(facets) +
labs(x = 'Date', y = ylab, title = plot_title)
}
plot_fun(df, x="Date", y="SRI", facets="SITE")
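In current releases of ggplot2 and dplyr, aes_string() and filter_() are deprecated. Below is a minimal sketch of the same function written with the .data pronoun instead, assuming ggplot2 3.0.0 or later and a recent dplyr. The name plot_fun2 is hypothetical, and the data frame mirrors the one in the question (with a seed added for reproducibility).

```r
library(dplyr)
library(ggplot2)

# Same shape as the question's data; set.seed() is added only for reproducibility.
set.seed(1)
df <- data.frame(Date = rep(1:50, 2),
                 SRI  = runif(100, -2, 2),
                 SITE = rep(c("A", "B"), each = 50))

# Hypothetical rewrite: column names arrive as strings and are looked up
# through the .data pronoun, both in aes() and in filter().
plot_fun2 <- function(df, x, y, ylab = "y-lab", plot_title = "Title", facets) {
  ggplot() +
    geom_linerange(aes(x = .data[[x]], ymin = 0, ymax = .data[[y]]),
                   colour = "blue", data = filter(df, .data[[y]] > 0)) +
    geom_linerange(aes(x = .data[[x]], ymin = .data[[y]], ymax = 0),
                   colour = "red", data = filter(df, .data[[y]] <= 0)) +
    facet_wrap(facets) +
    labs(x = x, y = ylab, title = plot_title)
}

p <- plot_fun2(df, x = "Date", y = "SRI", facets = "SITE")
p
```

The call signature is unchanged from the answer above, so existing calls that pass column names as strings should keep working.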
I am trying to create a custom function that extends ggplot2. The goal of the function is to superimpose a mean with horizontal and vertical standard errors. The code below does the entire thing.
library(plyr)
library(tidyverse)
summ <- ddply(mtcars,.(),summarise,
dratSE = sqrt(var(drat))/length(drat),
mpgSE = sqrt(var(mpg))/length(mpg),
drat = mean(drat),
mpg = mean(mpg))
ggplot(data = mtcars, mapping = aes(x = drat, y = mpg)) +
geom_point(shape = 21, fill = 'black', color = 'white', size = 3) +
geom_errorbarh(data = summ, aes(xmin = drat - dratSE, xmax = drat + dratSE)) +
geom_errorbar(data = summ, aes(ymin = mpg - mpgSE, ymax = mpg+mpgSE), width = .1) +
geom_point(data = summ, color='red',size=4)
Ideally, it would only take a function such as geom_scattermeans() to do this whole thing. But I am not sure how the aesthetics get transferred into subsequent geom functions from ggplot().
I've also had difficulty writing a function that receives column names as arguments and works with ddply().
I think plyr is pretty defunct at this point. I would recommend the dplyr package. When programming with dplyr you can use {{ (curly-curly, or embracing as the documentation says) to properly quote expressions.
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
geom_point_error <- function(data, x, y, color = 'red', size = 4) {
data <- dplyr::summarise(
data,
x_se = sqrt(var({{x}}))/length({{x}}),
y_se = sqrt(var({{y}}))/length({{y}}),
x = mean({{x}}),
y = mean({{y}})
)
list(
geom_errorbarh(data = data,
mapping = aes(y = y,
xmin = x - x_se, xmax = x + x_se), inherit.aes = F),
geom_errorbar(data = data,
mapping = aes(x = x,
ymin = y - y_se, ymax = y + y_se), width = .1,inherit.aes = F),
geom_point(data = data,
mapping = aes(x = x, y = y),
color = color, size = size)
)
}
ggplot(data = mtcars, mapping = aes(x = drat, y = mpg)) +
geom_point(shape = 21, fill = 'black', color = 'white', size = 3) +
geom_point_error(mtcars, x = drat, y = mpg)
Created on 2021-05-17 by the reprex package (v1.0.0)
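To see the embracing operator on its own, here is a small sketch outside of any plot. The helper name col_summary is hypothetical. Note one assumption that differs from the code in this thread: the sketch uses the conventional standard error, sd(x) / sqrt(n), whereas the thread's code divides sqrt(var(x)) by length(x).

```r
library(dplyr)

# Hypothetical helper: `col` is passed unquoted and forwarded with {{ }}.
col_summary <- function(data, col) {
  summarise(
    data,
    mean = mean({{ col }}),
    se   = sd({{ col }}) / sqrt(n())  # conventional standard error of the mean
  )
}

res <- col_summary(mtcars, mpg)
res
```

The caller writes a bare column name (mpg), just as in the geom_point_error(mtcars, x = drat, y = mpg) call above.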
The second option would be to build your own ggproto Geom to handle these calculations inside ggplot2, but that is a bit much for right now.
Since my first answer is still the easier solution, I decided to keep it. This answer should get OP closer to their goal.
Building a ggproto object can be cumbersome depending on what you are trying to do. In your case you are combining 3 ggproto Geoms classes together with the possibility of a new Stat.
The three Geoms are:
GeomErrorbar
GeomErrorbarh
GeomPoint
To get started, sometimes you just need to inherit from one of the classes and overwrite the method, but to pool the three together you will need to do more work.
Let's first consider how each of these Geoms draws its grid objects. Depending on the Geom, this happens in one of the functions draw_layer(), draw_panel(), or draw_group(). Fortunately, each of the geoms we want to use only uses draw_panel(), which means a bit less work for us: we will just call these methods directly and build a new grobTree object. We just need to be careful that all the correct parameters make it to our new Geom's draw_panel() method.
Before we get to writing our own draw_panel(), we first have to consider the setup_params() and setup_data() functions. Occasionally, these will modify the data right out of the gate. They are a convenient place for automatic processing and are often used to standardize or transform the data. A good example is GeomTile and GeomRect: they are essentially the same Geom, but their setup_data() functions differ because they are parameterized differently.
Let's assume you only want to assign an x and a y aesthetic, and leave the calculations of xmin, ymin, xmax, and ymax to the geoms/stats.
Fortunately, GeomPoint just returns the data with no modifications, so we only need to incorporate GeomErrorbar and GeomErrorbarh's setup_data(). To skip some steps, I am just going to make a new Stat that will take care of transforming those values for us within a compute_group() method.
A note here: GeomErrorbar and GeomErrorbarh each allow another parameter, width and height respectively, which controls how wide the flat portions of the error bars are.
Also, within these functions, each will make its own xmin, xmax, ymin, and ymax, so we will need to distinguish these parameters.
First, load the required objects into the namespace:
library(ggplot2)
library(grid)
"%||%" <- ggplot2:::`%||%`
Start with the new Stat. I've decided to call it PointError:
StatPointError <- ggproto(
"StatPointError",
Stat,
#having `width` and `height` as named parameters here ensures
#that they will be available to the `Stat` ggproto object.
compute_group = function(data, scales, width = NULL, height = NULL){
data$width <- data$width %||% width %||% (resolution(data$x, FALSE)*0.9)
data$height <- data$height %||% height %||% (resolution(data$y, FALSE)*0.9)
data <- transform(
data,
x = mean(x),
y = mean(y),
# positions for flat parts of vertical error bars
xmin = mean(x) - width /2,
xmax = mean(x) + width / 2,
width = NULL,
# y positions of vertical error bars
ymin = mean(y) - sqrt(var(y))/length(y),
ymax = mean(y) + sqrt(var(y))/length(y),
#positions for flat parts of horizontal error bars
ymin_h = mean(y) - height /2,
ymax_h = mean(y) + height /2,
height = NULL,
# x positions of horizontal error bars
xmin_h = mean(x) - sqrt(var(x))/length(x),
xmax_h = mean(x) + sqrt(var(x))/length(x)
)
unique(data)
}
)
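To see what a compute_group() method does in isolation, here is a much smaller sketch than StatPointError. StatMeanPoint is a hypothetical name, and the stat only collapses each group to its mean; it is an illustration of the mechanism under those assumptions, not part of the answer's geom.

```r
library(ggplot2)

# Hypothetical minimal Stat: reduce each group to a single (mean x, mean y) row.
StatMeanPoint <- ggproto(
  "StatMeanPoint", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales) {
    data.frame(x = mean(data$x), y = mean(data$y))
  }
)

# Any geom can use the stat through its `stat` argument.
p <- ggplot(mtcars, aes(drat, mpg)) +
  geom_point(stat = StatMeanPoint, colour = "red", size = 4)

# layer_data() shows the data frame the geom actually receives.
stat_out <- layer_data(p, 1)
stat_out
```

Inspecting layer_data() like this is a convenient way to check that a custom Stat produces the columns the Geom expects before writing draw_panel().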
Now for the fun part, the Geom. Again, I'm going for PointError as a consistent name.
GeomPointError <- ggproto(
"GeomPointError",
GeomPoint,
#include some additional defaults
default_aes = aes(
shape = 19,
colour = "black",
size = 1.5, # error bars have defaults of 0.5 - you may want to add another parameter?
fill = NA,
alpha = NA,
linetype = 1,
stroke = 0.5, # for GeomPoint
width = 0.5, # for GeomErrorbar
height = 0.5, # for GeomErrorbarh
),
draw_panel = function(data, panel_params, coord, width = NULL, height = NULL, na.rm = FALSE) {
#make errorbar grobs
data_errbar <- data
data_errbar[["size"]] <- 0.5
errorbar_grob <- GeomErrorbar$draw_panel(data = data_errbar,
panel_params = panel_params, coord = coord,
width = width, flipped_aes = FALSE)
#re-parameterize errbarh data
data_errbarh <- transform(data,
xmin = xmin_h, xmax = xmax_h, ymin = ymin_h, ymax = ymax_h,
xmin_h = NULL, xmax_h = NULL, ymin_h = NULL, ymax_h = NULL,
size = 0.5)
#make errorbarh grobs
errorbarh_grob <- GeomErrorbarh$draw_panel(data = data_errbarh,
panel_params = panel_params, coord = coord,
height = height)
point_grob <- GeomPoint$draw_panel(data = data, panel_params = panel_params,
coord = coord, na.rm = na.rm)
gt <- grobTree(
errorbar_grob,
errorbarh_grob,
point_grob, name = 'geom_point_error')
gt
}
)
Last, we need a function for the user to call that will make a Layer object.
geom_point_error <- function(mapping = NULL, data = NULL,
position = "identity",
...,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE) {
layer(
data = data,
mapping = mapping,
stat = StatPointError,
geom = GeomPointError,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
na.rm = na.rm,
...
)
)
}
Now we can test if this is working properly
ggplot(data = mtcars, mapping = aes(x = drat, y = mpg)) +
geom_point(shape = 21, fill = 'black', color = 'white', size = 3) +
geom_point_error(color = "red", width = .1, height = .3)
ggplot(data = mtcars, mapping = aes(x = drat, y = mpg)) +
geom_point(shape = 21, fill = 'black', color = 'white', size = 3) +
geom_point_error(aes(color = hp>100))
Created on 2021-05-18 by the reprex package (v1.0.0)
There is obviously much more you could do with this, such as including additional default aesthetics so that you could control the color and size of the lines and points separately (you may want to override GeomPointError$setup_data() to ensure everything maps correctly).
Finally, this geom is pretty naive in that it assumes the x and y data mappings are continuous. It still works when mixing continuous and discrete, but looks a bit funky:
ggplot(mpg, aes(cty, model)) +
geom_point() +
geom_point_error(color = 'red')
I am trying to add labels (letters) above a barplot using the ggplot2 function geom_text. My bars are separated using position=position_dodge(), so I need to apply the same to the new labels. However, I would also like to use nudge_y to separate the labels from the bars. If I try to use both together, R complains that I can use only one of the two options. I'd like to do something like this:
Tukey.labels <- geom_text(data=stats,
aes(x=factor(Treatment2), y=x.mean,
label=Tukey.dif),
size=4, nudge_y=3, # move letters in Y
position=position_dodge(0.5)) # move letters in X
To create something like this image. Does anybody know a way to shift all my labels the same distance in y while using position_dodge at the same time? I could not find an answer to this in other posts.
Hard to troubleshoot without a reproducible example. Hopefully this helps:
library(dplyr); library(tibble); library(ggplot2)  # tibble provides rownames_to_column()
ggplot(mtcars %>% rownames_to_column("car") ,
aes(as.factor(cyl), mpg, group = car)) +
geom_col(position = position_dodge(0.9)) +
geom_errorbar(aes(ymin = mpg - wt,
ymax = mpg + wt),
position = position_dodge(0.9)) +
geom_text(aes(label = gear, y = mpg + wt), vjust = -0.5,
position = position_dodge(0.9))
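The same idea, shifting the label's y aesthetic instead of using nudge_y, can be sketched on data shaped like the question's stats object. The values and the grouping column grp below are made up for illustration; only the column names Treatment2, x.mean and Tukey.dif come from the question.

```r
library(ggplot2)

# Made-up summary data; grp is a hypothetical dodging variable.
stats <- data.frame(
  Treatment2 = rep(c("a", "b"), each = 2),
  grp        = rep(c("g1", "g2"), times = 2),
  x.mean     = c(10, 12, 8, 15),
  Tukey.dif  = c("a", "b", "a", "c")
)

# One shared position object keeps bars and labels aligned.
dodge <- position_dodge(width = 0.9)

p <- ggplot(stats, aes(factor(Treatment2), x.mean, fill = grp, group = grp)) +
  geom_col(position = dodge) +
  # Adding 3 to y moves every label up by the same amount, like nudge_y = 3.
  geom_text(aes(y = x.mean + 3, label = Tukey.dif), position = dodge)
p
```

Because the offset is part of the mapping rather than the position, it combines with position_dodge() without triggering the "only one of" complaint.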
In the spirit of the original question, one can easily combine ggplot's position_nudge and position_dodge like this:
position_nudgedodge <- function(x = 0, y = 0, width = 0.75) {
ggproto(NULL, PositionNudgedodge,
x = x,
y = y,
width = width
)
}
PositionNudgedodge <- ggproto("PositionNudgedodge", PositionDodge,
x = 0,
y = 0,
width = 0.3,
setup_params = function(self, data) {
l <- ggproto_parent(PositionDodge,self)$setup_params(data)
append(l, list(x = self$x, y = self$y))
},
compute_layer = function(self, data, params, layout) {
d <- ggproto_parent(PositionNudge,self)$compute_layer(data,params,layout)
d <- ggproto_parent(PositionDodge,self)$compute_layer(d,params,layout)
d
}
)
Then you can use it like this:
Tukey.labels <- geom_text(data=stats,
aes(x=factor(Treatment2), y=x.mean, label=Tukey.dif),
size=4,
position=position_nudgedodge(y=3,width=0.5)
)
I am creating an animated plotly graph for my assignment in R, where I am comparing several models with various numbers of observations. I would like to add an annotation showing the RMSE of the current model, which means I would like text that changes together with the slider. Is there an easy way to do that?
Here is my dataset stored on GitHub. It already contains a variable with the RMSE: data
The base ggplot graphic is as follows:
library(tidyverse)
library(plotly)
p <- ggplot(values_predictions, aes(x = x)) +
geom_line(aes(y = preds_BLR, frame = n, colour = "BLR")) +
geom_line(aes(y = preds_RLS, frame = n, colour = "RLS")) +
geom_point(aes(x = x, y = target, frame = n, colour = "target"), alpha = 0.3) +
geom_line(aes(x = x, y = sin(2 * pi * x), colour = "sin(2*pi*x)"), alpha = 0.3) +
ggtitle("Comparison of performance") +
labs(y = "predictions and targets", colour = "colours")
This is converted to plotly, and I have added an animation to the Plotly graph:
plot <- ggplotly(p) %>%
animation_opts(easing = "linear",redraw = FALSE)
plot
Thanks!
You can add annotations to a ggplot graph using the annotate function: http://ggplot2.tidyverse.org/reference/annotate.html
df <- data.frame(x = rnorm(100, mean = 10), y = rnorm(100, mean = 10))
# Build model
fit <- lm(x ~ y, data = df)
# function finds RMSE
RMSE <- function(error) { sqrt(mean(error^2)) }
library(ggplot2)
# store the plot so it can be passed to ggplotly() later
plot <- ggplot(df, aes(x, y)) +
  geom_point() +
  annotate("text", x = Inf, y = Inf, hjust = 1.1, vjust = 2,
           label = paste("RMSE", RMSE(fit$residuals)))
plot
There seems to be a bit of a problem converting annotations between ggplot and plotly. However, the workaround shown here can be used:
library(plotly)
ggplotly(plot) %>%
  layout(annotations = list(x = 12, y = 13,
                            text = paste("RMSE", RMSE(fit$residuals)),
                            showarrow = FALSE))
Here's an example of adding data dependent text using the built in iris dataset with correlation as text to ggplotly.
library(plotly)
library(ggplot2)
library(dplyr)
mydata = iris %>% rename(variable1=Sepal.Length, variable2= Sepal.Width)
shift_right = 0.1 # number from 0-1 where higher = more right
shift_down = 0.02 # number from 0-1 where higher = more down
p = ggplot(mydata, aes(variable1,variable2))+
annotate(geom = "text",
label = paste0("Cor = ",as.character(round(cor.test(mydata$variable1,mydata$variable2)$estimate,2))),
x = min(mydata$variable1)+abs(shift_right*(min(mydata$variable1)-max(mydata$variable1))),
y = max(mydata$variable2)-abs(shift_down*(min(mydata$variable2)-max(mydata$variable2))), size=4)+
geom_point()
ggplotly(p) %>% style(hoverinfo = "none", traces = 1) # remove hover on text
I am trying to create a chart like this one produced in the NYTimes using ggplot:
I think I'm getting close, but I'm not quite sure how to separate out some of my data so I get the right view. My data is political office holders that appear something like this:
name,year_elected,year_left,years_in_office,type,party
Person 1,1969,1969,1,Candidate,Unknown
Person 2,1969,1971,2,Candidate,Unknown
Person 3,1969,1973,4,Candidate,Unknown
Person 4,1969,1973,4,Candidate,Unknown
Person 5,1971,1974,3,Candidate,Unknown
Person 1,1971,1976,5,Candidate,Unknown
Person 2,1971,1980,9,Candidate,Unknown
Person 6,1973,1978,5,Candidate,Unknown
Person 7,1973,1980,7,Candidate,Unknown
Person 8,1975,1980,5,Candidate,Unknown
Person 9,1977,1978,1,Candidate,Unknown
And I've used the below code to get very close to this view, but I think an issue I'm running into is either drawing segments incorrectly (e.g., I don't seem to have a single segment for each candidate), or segments are overlapping/stacking. The key issue I'm running into is my list of office holders is around 60, but my chart is only drawing around 28 lines.
library(googlesheets)
library(tidyverse)
# I'm reading from a Google Spreadsheet
data <- gs_title("Council Members")
data_sj <- gs_read(ss = data, ws = "Sheet1")
ggplot(data_sj, aes(year_elected, years_in_office)) +
geom_segment(aes(x = year_elected, y = 0,
xend = year_left, yend = years_in_office)) +
theme_minimal()
The above code gives me:
Thanks ahead of time for any pointers!
If your data frame is called d, then:
Convert it to a data.table
Add jitter to year_elected
Add equivalent jitter to year_left
Add group (as an example) to color your samples
Use ggrepel to add text if there are many points.
Code:
library(data.table)
library(ggplot2)
library(ggrepel)
setDT(d)  # convert d to a data.table by reference so that := works
d[, year_elected2 := jitter(year_elected)]
d[, year_left2 := year_left + year_elected2 - year_elected + 0.01]
d[, group := TRUE]
d[factor(years_in_office %/% 9) == 1, group := FALSE]
ggplot(d, aes(year_elected2, years_in_office)) +
geom_segment(aes(x = year_elected2, xend = year_left2,
y = 0, yend = years_in_office, linetype = group),
alpha = 0.8, size = 1, color = "grey") +
geom_point(aes(year_left2), color = "black", size = 3.3) +
geom_point(aes(year_left2, color = group), size = 2.3) +
geom_text_repel(aes(year_left2, label = name)) +
scale_colour_brewer(guide = "none", palette = "Dark2") +
scale_linetype_manual(guide = "none", values = c(2, 1)) +
labs(x = "Year elected",
y = "Years in office") +
theme_minimal(base_size = 10)
Result:
For the record, and to address my comment on @PoGibas's answer above, here's my tidyverse version:
data_transform <- data_sj %>%
mutate(year_elected_jitter = jitter(year_elected)) %>%
mutate(year_left_jitter = year_left + year_elected_jitter - year_elected + 0.01)
ggplot(data_transform, aes(year_elected, years_in_office, label = name)) +
geom_segment(aes(x = year_elected_jitter, y = 0, xend = year_left_jitter, yend = years_in_office, color = gender), size = 0.3) +
geom_text_repel(aes(year_left_jitter, label = name)) +
theme_minimal()
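To check whether overlapping segments explain the missing lines, one can count office holders that share the same start and end years. This is a small diagnostic sketch using the sample rows from the question; the object names office and overlaps are hypothetical.

```r
library(dplyr)

# Sample rows copied from the question (type and party columns omitted).
office <- read.csv(text = "name,year_elected,year_left,years_in_office
Person 1,1969,1969,1
Person 2,1969,1971,2
Person 3,1969,1973,4
Person 4,1969,1973,4
Person 5,1971,1974,3
Person 1,1971,1976,5
Person 2,1971,1980,9
Person 6,1973,1978,5
Person 7,1973,1980,7
Person 8,1975,1980,5
Person 9,1977,1978,1")

# Segments with identical endpoints are drawn on top of each other,
# so every row returned here hides at least one office holder.
overlaps <- office %>%
  count(year_elected, year_left, name = "n_segments") %>%
  filter(n_segments > 1)
overlaps
```

In the sample, Person 3 and Person 4 share the 1969 to 1973 segment, which is the kind of exact overlap that jittering separates.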
I want to annotate a contour plot with particular points that I want to highlight (where these points are stored in a different data set). When I try, I get an error:
Error: Aesthetics must either be length one, or the same length as the dataProblems:z
However, when I tried to make a reproducible example, I get a different error:
Error in eval(expr, envir, enclos) : object 'z' not found
The code for the reproducible example is below:
library(mnormt)
library(dplyr)
library(ggplot2)
f <- function(x, y) {
dmnorm(x = c(x, y),
mean = c(0, 0),
varcov = diag(2))
}
f <- Vectorize(f)
xmesh <- seq(from = -3, to = 3, length.out = 100)
ymesh <- seq(from = -3, to = 3, length.out = 100)
dummy <- expand.grid(x = xmesh, y = ymesh)
dummy$z <- f(dummy$x, dummy$y)
stuff <- data_frame(x = c(0, 0, 1),
y = c(0, -1, -1),
point = c("O", "P", "Q"))
dummy %>%
ggplot(aes(x = x, y = y, z = z)) +
stat_contour(aes(color = ..level..)) +
labs(color = "density") +
geom_point(data = stuff, mapping = aes(x = x, y = y, color = point))
ggplot passes the aes from the first ggplot call to the rest of the geoms, unless told otherwise. So the error is telling you that it cannot find z inside stuff, and it still thinks that the z should be z from the initial call.
There are a range of ways to fix this. I think the easiest is to give each geom its data separately:
ggplot() +
stat_contour(data = dummy, aes(x = x, y = y, z = z, color = ..level..)) +
labs(color = "density") +
geom_point(data = stuff, aes(x = x, y = y, fill = factor(point)), pch = 21)
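NB: you also have a problem because colour is mapped to a continuous variable (the contour level) in one geom and to a discrete variable (point) in the other, and the two cannot share one colour scale, so I've worked around it using pch and fill.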
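Another way to keep the plot-level mapping is to map z only in the contour layer, so that the point layer never looks for it. The sketch below assumes ggplot2 3.3.0 or later (for after_stat()) and replaces the mnormt call with a product of two dnorm() densities, which is the same standard bivariate normal with identity covariance. The object names grid and pts are hypothetical.

```r
library(ggplot2)

# Density of a standard bivariate normal on a 100 x 100 grid.
grid <- expand.grid(x = seq(-3, 3, length.out = 100),
                    y = seq(-3, 3, length.out = 100))
grid$z <- dnorm(grid$x) * dnorm(grid$y)

# Points to highlight, as in the question.
pts <- data.frame(x = c(0, 0, 1),
                  y = c(0, -1, -1),
                  point = c("O", "P", "Q"))

p <- ggplot(grid, aes(x, y)) +
  # z is mapped here only, so geom_point() does not inherit it.
  stat_contour(aes(z = z, colour = after_stat(level))) +
  # fill with shape 21 avoids mixing a discrete variable into the colour scale.
  geom_point(data = pts, aes(fill = point), shape = 21, size = 3) +
  labs(colour = "density")
p
```

Here x and y are still inherited from ggplot(), which works because both data frames use the same column names.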