So my example data are:
x <- runif(1000, min = 0, max = 5)
y <- (2 / pi) * atan(x)
z <- floor(x)
df <- data.frame(x, y, z)
I draw boxplots of y, binned by z:
library(ggplot2)
g <- ggplot(df, aes(x = x, y = y, group = z)) +
geom_boxplot()
g
But the thing is, in my real-life data, I'm not completely sure that the y-values follow (2 / pi) * atan(x). There's a random element there. So, how do I draw the function on top of my graph to see for myself? As per the ggplot2 documentation, I tried...
g + stat_function(fun = (2 / pi) * atan(x), colour = "red")
...but I get this warning:
Warning message:
Computation failed in 'stat_function()':
'what' must be a function or character string.
The warning says:
'what' must be a function or character string
so stat_function() is asking you to pass it a function.
Define one, for example func:
func<-function(x){ (2 / pi) * atan(x)}
and then call it in ggplot
library(ggplot2)
g <- ggplot(df, aes(x = x, y = y, group = z)) +
geom_boxplot()
g+stat_function(fun = func, colour = "red")
The fun parameter must be a function:
g + stat_function(fun = function(x){(2 / pi) * atan(x)}, colour = "red")
You can solve this by defining a new function and then passing it as the fun argument of stat_function.
Here it is:
myfun <- function(x){(2 / pi) * atan(x)}
and then
g + stat_function(fun = myfun, colour = "red")
would do it.
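To see why the original call fails, it can help to check what is actually being passed as fun. The sketch below is only an illustration: the seed is an assumption added for reproducibility, and group is mapped inside geom_boxplot() (a choice made here, not in the question) so that the curve is drawn as a single group.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
x <- runif(1000, min = 0, max = 5)
df <- data.frame(x = x, y = (2 / pi) * atan(x), z = floor(x))

# A bare expression is evaluated immediately and yields numbers, not a function
is.function((2 / pi) * atan(x))   # FALSE

# Wrapping the same expression in function(x) gives stat_function something to call
func <- function(x) (2 / pi) * atan(x)
is.function(func)                 # TRUE

# group is mapped in the boxplot layer only, so the curve is not split by z
g <- ggplot(df, aes(x = x, y = y)) +
  geom_boxplot(aes(group = z)) +
  stat_function(fun = func, colour = "red")
g
```

If the real data deviate from the curve, the boxes will visibly drift away from the red line.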
I am trying to make a graph of the periodic function -23.5*cos(pi/4*x)+23.5
It gives the results I want in Desmos for reference.
However, when I code it as a function and plot it in ggplot, I get a weird-looking periodic function.
x <- 0:32
test <- data.frame(x, y=-23.5*cos(pi/4*x)+23.5)
test2 <- function(x) -23.5*cos(pi/4*x)+23.5
ggplot(data = test, mapping = aes(x,y))+
geom_point()+
stat_smooth(se = FALSE)
The points show as normal, but the stat_smooth gives this lumpy hunk of garbage.
Since you want to graph a function, I would suggest using geom_function:
Note: For smoothness I doubled the default number of interpolation points n (from 101 to 202).
x <- 0:32
test <- data.frame(x, y = -23.5 * cos(pi / 4 * x) + 23.5)
test2 <- function(x) -23.5 * cos(pi / 4 * x) + 23.5
library(ggplot2)
ggplot() +
geom_point(data = test, mapping = aes(x, y)) +
geom_function(fun = test2, n = 202)
Context
Reading the vignette Programming with dplyr I tried to use the ... and !!! operators to implement a function that would wrap around ggplot functions and would accept an arbitrary number of arguments that would define which variables in a dataframe were to be mapped to each aesthetic.
My goal
I wanted to define a function plot_points2() such that
plot_points2(df, x = x, y = y, color = z) would be equivalent to df %>% ggplot( mapping = aes(x = x, y = y, color = z) ) + geom_point(alpha = 0.1)
plot_points2(df, x = x, y = z, color = y) would be equivalent to df %>% ggplot( mapping = aes(x = x, y = z, color = y) ) + geom_point(alpha = 0.1)
plot_points2(df, x = x, y = z) would be equivalent to df %>% ggplot( mapping = aes(x = x, y = z) ) + geom_point(alpha = 0.1)
What failed
packages
require(tidyverse)
require(rlang)
reduced example dataset
df <- tibble(g1= sample(x = c(1,2,3), replace = T, size = 10000),
g2= sample(x = c("a","b","c"), replace = T, size = 10000),
x = rnorm(10000, 50, 10),
y = rnorm(10000, 0, 20) + x*2,
z = rnorm(10000, 10, 5))
df
my attempt
plot_points2 <- function(d, ...){
args <- quos(...)
print(args)
ggplot(data = d, mapping = aes(!!!args)) + geom_point(alpha = 0.1)
}
plot_points2(df, x = x, y = y, color = z)
the error
Error: Can't use `!!!` at top level
Call `rlang::last_error()` to see a backtrace
Why I think it should work
I figured what I wanted to accomplish isn't much different from an example in the vignette that uses these operators to make a function that wraps around mutate() and passes multiple arguments defining the grouping variables. Indeed, I was able to implement a function that does that for the example dataset above. But somehow the grouping wrapper works and the plotting wrapper doesn't:
this works
add_dif_to_group_mean <- function(df, ...) {
groups <- quos(...)
df %>% group_by(!!!groups) %>% mutate(x_dif = x-mean(x),
y_dif = y-mean(y),
z_dif = z-mean(z))
}
df %>% add_dif_to_group_mean(g1)
df %>% add_dif_to_group_mean(g1, g2)
this doesn't
plot_points2 <- function(d, ...){
args <- quos(...)
print(args)
ggplot(data = d, mapping = aes(!!!args)) + geom_point(alpha = 0.1)
}
plot_points2(df, x = x, y = y, color = z)
I also read that the problem could be related to aes() being evaluated only when the plot is printed. In that case, though, I would expect using !! and unpacking manually to raise the same error, and it doesn't:
plot_points2b <- function(d, ...){
args <- quos(...)
print(args)
ggplot(data = d, mapping = aes(x = !!args[[1]],
y = !!args[[2]],
color = !!args[[3]])) +
geom_point(alpha = 0.1)
}
plot_points2b(df, x = x, y = y, color = z)
Indeed, this last example works fine if you plot 3 variables, but it doesn't let you plot any other number of variables.
E.g. plot_points2b(df, x = x, y = z) is not equivalent to
df %>% ggplot( mapping = aes(x = x, y = z) ) + geom_point(alpha = 0.1)
Instead it raises the error:
Error in args[[3]] : subscript out of bounds
Does anyone know what concept I am missing here? Thank you in advance!
Your specific use case is an example in ?aes. aes automatically quotes its arguments. One can simply directly pass the dots. Try:
plot_points3 <- function(d, ...){
print(aes(...))
ggplot(d, aes(...)) + geom_point(alpha = 0.1)
}
plot_points3(df, x = x, y = y, color = z)
This nicely prints:
Aesthetic mapping:
* `x` -> `x`
* `y` -> `y`
* `colour` -> `z`
And yields the required plot.
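As a quick check that passing the dots straight through also covers the two-aesthetic case from the question, here is a minimal sketch. The smaller data frame and the seed are assumptions made for illustration, and the wrapper is the same as plot_points3 above minus the print() call.

```r
library(ggplot2)

set.seed(42)  # assumption: seed and smaller sample size chosen for illustration
df <- data.frame(x = rnorm(100, 50, 10), z = rnorm(100, 10, 5))
df$y <- rnorm(100, 0, 20) + 2 * df$x

# aes() quotes whatever arrives through ..., so no quos() or !!! is needed
plot_points3 <- function(d, ...) {
  ggplot(d, aes(...)) + geom_point(alpha = 0.1)
}

p3 <- plot_points3(df, x = x, y = y, color = z)  # three aesthetics
p2 <- plot_points3(df, x = x, y = z)             # two aesthetics, no subscript error

names(p3$mapping)  # "x" "y" "colour"
names(p2$mapping)  # "x" "y"
```

Because the number of aesthetics is never hard-coded, the same wrapper handles any subset of mappings.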
As mentioned in my comment, I think you may already have x and y in your environment and that is why some of your code is working. I'm not totally sure what you are trying to achieve but I think you are doing too much rlang for getting your code to run without error.
For example:
plot_points <- function(d, ...){
ggplot(data = d, mapping = aes(x = x, y = y)) +
geom_point(alpha = 0.1)
}
plot_points (df, x, y)
will make your plot without any reason to add the overhead and complexity of !!! or enquo().
You were on this path here too, where this much simpler code works fine:
add_dif_to_group_mean <- function(., ...) {
df %>% group_by(g1) %>% mutate(x_dif = x-mean(x),
y_dif = y-mean(y),
z_dif = z-mean(z))
}
df %>% add_dif_to_group_mean(g1)
Likewise:
plot_points2 <- function(d, ...){
ggplot(data = d, mapping = aes(x=x, y=y, color=z)) +
geom_point(alpha = 0.1)
}
plot_points2(df, x = x, y = y, color = z)
works fine from what I can tell.
So I understand that maybe you are working through the examples in the book, which is great. But I think there is a missing issue somewhere that would make it so you have to do all the extra stuff in a real world function. For example, maybe you want to pass in strings like "x" and "y" instead of x and y?
For reasons I won't go into I need to plot a vertical normal curve on a blank ggplot2 graph. The following code gets it done as a series of points with x,y coordinates
dfBlank <- data.frame()
g <- ggplot(dfBlank) + xlim(0.58,1) + ylim(-0.2,113.2)
hdiLo <- 31.88
hdiHi <- 73.43
yComb <- seq(hdiLo, hdiHi, length = 75)
xVals <- 0.79 - (0.06*dnorm(yComb, 52.65, 10.67))/0.05
dfVertCurve <- data.frame(x = xVals, y = yComb)
g + geom_point(data = dfVertCurve, aes(x = x, y = y), size = 0.01)
The curve is clearly discernible but is a series of points. The lines() function in base graphics would turn these points into a smooth line.
Is there a ggplot2 equivalent?
I see two different ways to do it.
geom_segment
The first uses geom_segment to 'link' each point with its next one.
hdiLo <- 31.88
hdiHi <- 73.43
yComb <- seq(hdiLo, hdiHi, length = 75)
xVals <- 0.79 - (0.06*dnorm(yComb, 52.65, 10.67))/0.05
dfVertCurve <- data.frame(x = xVals, y = yComb)
library(ggplot2)
ggplot() +
xlim(0.58, 1) +
ylim(-0.2, 113.2) +
geom_segment(data = dfVertCurve, aes(x = x, xend = dplyr::lead(x), y = y, yend = dplyr::lead(y)), size = 0.01)
#> Warning: Removed 1 rows containing missing values (geom_segment).
As you can see, it just links the points you created. The last point does not have a next one, so the last segment is removed (see the warning).
stat_function
The second one, which I think is better and more ggplot-ish, uses stat_function().
library(ggplot2)
f = function(x) .79 - (.06 * dnorm(x, 52.65, 10.67)) / .05
hdiLo <- 31.88
hdiHi <- 73.43
yComb <- seq(hdiLo, hdiHi, length = 75)
ggplot() +
xlim(-0.2, 113.2) +
ylim(0.58, 1) +
stat_function(data = data.frame(yComb), fun = f) +
coord_flip()
This builds a proper function (y = f(x)) and plots it. Note that it is built along the x axis and then flipped, which is why the xlim and ylim values are swapped.
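A third option, not covered by the two approaches above, is geom_path(), which connects observations in the order they appear in the data and so behaves much like lines() in base graphics. A minimal sketch, reusing the question's values:

```r
library(ggplot2)

hdiLo <- 31.88
hdiHi <- 73.43
yComb <- seq(hdiLo, hdiHi, length.out = 75)
xVals <- 0.79 - (0.06 * dnorm(yComb, 52.65, 10.67)) / 0.05
dfVertCurve <- data.frame(x = xVals, y = yComb)

# geom_path() joins the rows in data order, so the vertical curve needs no flipping
p <- ggplot(dfVertCurve, aes(x = x, y = y)) +
  geom_path() +
  xlim(0.58, 1) +
  ylim(-0.2, 113.2)
p
```

geom_line() would not work here without flipping, because it sorts by x before connecting the points.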
I want to add a stat_function layer to a plot with an aesthetic mapped to the state of some variable that identifies a set of parameters. I have manually created the two stat_function lines in the minimal working example below. That's generally what the result should look like.
p <- ggplot(data.frame(x = -1:1), aes(x = x))
p + stat_function(fun = function (x) 0 + 1 * x, linetype = 'dotted') +
stat_function(fun = function (x) 0.5 + -1 * x, linetype = 'solid')
My best guess at how to accomplish this is
params <- data.frame(
type = c('true', 'estimate'),
b = c(0, 0.5),
m = c(1, -1),
x = 0
)
linear_function <- function (x, b, m) b + m * x
p + stat_function(data = params,
aes(linetype = type, x = x),
fun = linear_function,
args = list(b = b, m = m))
This form works if I use constants like args = list(b = 0, m = 1), but when I try to get the values for the parameters out of the params data frame it's unable to find those columns. Why is this not working like I expect and what's a solution?
Unfortunately, nothing positive to add here; the fact stands: stat_function does not support this functionality.
The alternative is to either use for loops to make layers, as demonstrated in this question, or generate the grid data yourself, which is what was suggested as a closing comment to a feature request discussion about adding this functionality.
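For the second alternative, generating the grid data yourself might look like the sketch below. The grid resolution (x_grid with 50 points) is a hypothetical choice, and the x column of the original params is dropped because the grid supplies x.

```r
library(ggplot2)

params <- data.frame(
  type = c("true", "estimate"),
  b = c(0, 0.5),
  m = c(1, -1)
)
linear_function <- function(x, b, m) b + m * x

x_grid <- seq(-1, 1, length.out = 50)  # hypothetical grid resolution

# One block of rows per parameter set, stacked into a single data frame
grid <- do.call(rbind, lapply(seq_len(nrow(params)), function(i) {
  data.frame(
    type = params$type[i],
    x = x_grid,
    y = linear_function(x_grid, params$b[i], params$m[i])
  )
}))

# type is now an ordinary column, so it can be mapped to linetype as usual
p <- ggplot(grid, aes(x = x, y = y, linetype = type)) +
  geom_line()
p
```

This trades stat_function's convenience for an ordinary data frame, which lets the usual aesthetic mapping and legend machinery apply.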
I realize similar questions have already been asked. For example, consider the one here: Equivalent of curve() for ggplot. Is it possible to plot the functions below using group somehow, or would I have to write stat_function for each instance of a and b?
myfun <- function(x, i) {
sin(a[i] * x) + log(b[i] * x)
}
ggplot(data.frame(x=c(0, 10)), aes(x)) +
stat_function(fun = myfun(x, 1)) +
stat_function(fun = myfun(x, 2)) ...
What if a and b are big? The above seems inelegant.
I would not use stat_function for anything non-trivial; it's usually easier to create the values explicitly:
myfun <- function(a, b, x) {
data.frame(x = x, y = sin(a * x) + log(b * x))
}
ab <- expand.grid(a=1:3, b=2:6)
d <- plyr::mdply(ab, myfun, x=seq(0,10, length=100))
ggplot(d, aes(x, y, colour=factor(a))) +
facet_wrap(~b) +
geom_line()