How to map stat_function aesthetics to data in ggplot2? - r

I want to add a stat_function layer to a plot with an aesthetic mapped to the state of some variable that identifies a set of parameters. I have manually created the two stat_function lines in the minimal working example below. That's generally what the result should look like.
p <- ggplot(data.frame(x = -1:1), aes(x = x))
p + stat_function(fun = function (x) 0 + 1 * x, linetype = 'dotted') +
  stat_function(fun = function (x) 0.5 + -1 * x, linetype = 'solid')
My best guess at how to accomplish this is
params <- data.frame(
  type = c('true', 'estimate'),
  b = c(0, 0.5),
  m = c(1, -1),
  x = 0
)
linear_function <- function (x, b, m) b + m * x
p + stat_function(data = params,
                  aes(linetype = type, x = x),
                  fun = linear_function,
                  args = list(b = b, m = m))
This form works if I use constants like args = list(b = 0, m = 1), but when I try to get the values for the parameters out of the params data frame it's unable to find those columns. Why is this not working like I expect and what's a solution?

Unfortunately, stat_function does not support this. args is an ordinary argument that is evaluated in the calling environment, so b and m are looked up in your workspace rather than in the columns of params; only expressions inside aes() are evaluated against the data.
The alternative is to either use for loops to make layers, as demonstrated in this question, or generate the grid data yourself, which is what was suggested as a closing comment to a feature request discussion about adding this functionality.
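The "generate the grid data yourself" route could look roughly like the sketch below. It assumes the params data frame and linear_function from the question; the helper name make_grid and the 101-point grid over [-1, 1] are illustrative choices, not anything stat_function prescribes. The plotting step is skipped if ggplot2 is not installed.

```r
# Parameters from the question: one row per line to draw.
params <- data.frame(
  type = c("true", "estimate"),
  b = c(0, 0.5),
  m = c(1, -1)
)
linear_function <- function(x, b, m) b + m * x

# make_grid() is a hypothetical helper: it evaluates the function on a
# grid of x values for a single row of params.
make_grid <- function(i, x = seq(-1, 1, length.out = 101)) {
  data.frame(
    type = params$type[i],
    x = x,
    y = linear_function(x, params$b[i], params$m[i])
  )
}

# Stack the per-row grids into one long data frame.
grid <- do.call(rbind, lapply(seq_len(nrow(params)), make_grid))

# In long form, linetype can be mapped like any other aesthetic.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plt <- ggplot(grid, aes(x = x, y = y, linetype = type)) +
    geom_line() +
    scale_linetype_manual(values = c(true = "dotted", estimate = "solid"))
  # print(plt) draws the plot
}
```

Because the data frame carries the type column, the legend comes for free, which the hand-written pair of stat_function layers does not provide.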

Related

Adding various functions to the same plot with a loop ggplot2

I have the following equation: y = 1 - cx, where c is a real number.
I'm trying to make something where I can pick the range of values for c and plot all the graphs of every function with the corresponding c.
Here's what I got as of now:
p <- ggplot(data = data.frame(x = 0), mapping = aes(x = x))
statfun1 <- c()
for (i in 1:3){
  c <- i
  fun1.i <- function(x){1 - c*x}
  fun1.i.plot <- stat_function(fun = fun1.i, color="red")
  statfun1 <- statfun1 + fun1.i.plot
}
p + statfun1 + xlim(-5, 5)
The p is basically what you need in ggplot2 to plot a function, then I go over in this case the values 1, 2 and 3 for c and I try to add them all at the end but this does not seem to work. Anyone maybe can help me out or put me on the right track?
Define your function
fun1.i <- function(x, c){1 - c*x}
Now from ?`+.gg`
You can add any of the following types of objects:
...
You can also supply a list, in which case each element of the list will be added in turn.
So you might use lapply
p + xlim(-5, 5) + lapply(1:3, function(c) {
  stat_function(fun = fun1.i, args = list(c = c), geom = "line", color="red")
})
Result: three red lines, y = 1 - c*x for c = 1, 2, 3, plotted over x from -5 to 5.
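One reason to pass c through args rather than rely on a function that reads c from the workspace: such a function looks up c only when it is called, so a loop that redefines it and adds a layer each time would likely draw the same line three times. A small base-R sketch of the difference (the names c_val, closures, late and with_args are only for illustration):

```r
# Functions created in a for loop share one environment, so c_val is
# looked up when the function is called, after the loop has finished.
closures <- list()
for (i in 1:3) {
  c_val <- i
  closures[[i]] <- function(x) 1 - c_val * x
}
late <- sapply(closures, function(f) f(1))

# Passing the parameter explicitly fixes its value per call, which is
# what args = list(c = c) does for stat_function.
fun1.i <- function(x, c) 1 - c * x
with_args <- sapply(1:3, function(c) fun1.i(1, c = c))

late       # every element is 1 - 3 * 1
with_args  # 1 - 1, 1 - 2, 1 - 3
```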

Does stat_function() in ggplot2 work with args other than vectors?

I am trying to print some values (geom_point) and on top of that draw some function (stat_function) with ggplot2, however I can't plot the function because it has an argument of type list.
I want to print the function create.new.func(x,W) which gets two parameters (x,W) where x is a numeric value and W a list containing two matrices of different dimensions. I tried using the line
stat_function(fun= create.new.func,aes(colour="sep1"),args = list(W=superW))
However, I keep getting the following error:
Computation failed in `stat_function()`: non-conformable arguments
Of course create.new.func(x,W=superW) works perfectly for any x.
All the code snippets I have seen so far seem to use only vectors for the args parameter, hence my question.
Example:
W <- list(matrix(c(1, -1, -1, 1), nrow = 2), matrix(c(1, 2)))
func <- function(x, W){
  sum(W[[2]] * (W[[1]] %*% c(1, x)))
}
ggplot() +
  geom_point(aes(x = 0, y = 0)) +
  theme_bw()+
  stat_function(fun = func, args = list(W), aes(colour = "black")) +
  scale_colour_manual("data", values = c("blue"))
Per ?stat_function, fun must be vectorized. stat_function makes a vector of x values of length n (101 by default) between the range of x values, passes it into the function, and plots the created x values with the resulting y values. For instance,
library(ggplot2)
ggplot() + stat_function(aes(x = 0:1), fun = sqrt)
Note that x must have a range; if x = 0, the result will just be a point, even though stat_function will still make a vector of x values (that will all be the same), i.e. seq(0, 0, length.out = 101).
A quick way to get your code working, then, is to add a useful domain for x and iterate over x in func:
W <- list(matrix(c(1, -1, -1, 1), nrow = 2), matrix(c(1, 2)))
func <- function(x, W){
  sapply(x, function(x_i){
    sum(W[[2]] * (W[[1]] %*% c(1, x_i)))
  })
}
# it's vectorized now
func(1:10, W)
#> [1] 0 1 2 3 4 5 6 7 8 9
ggplot() +
  geom_point(aes(x = 0, y = 0)) +
  stat_function(aes(x = 0:1), fun = func, args = list(W = W))
Ultimately this is not a great way to vectorize func because it just loops instead of writing better code/math, so it is not terribly efficient. In this case, 101 iterations of a simple function will still be very fast, though, so it's not necessarily worth the effort of optimizing it further. For slower, more complicated functions, it may be.
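For this particular func, the "better math" is plain matrix algebra: sum(W[[2]] * (W[[1]] %*% v)) equals t(v) %*% t(W[[1]]) %*% W[[2]], so all x values can be handled in one product. A sketch, assuming W has the shapes used in the example (a 2 x 2 matrix and a 2 x 1 matrix); func_mat is a name introduced here for illustration:

```r
W <- list(matrix(c(1, -1, -1, 1), nrow = 2), matrix(c(1, 2)))

# Loop-based version from the answer, kept for comparison.
func <- function(x, W) {
  sapply(x, function(x_i) sum(W[[2]] * (W[[1]] %*% c(1, x_i))))
}

# Matrix version: row i of cbind(1, x) is c(1, x[i]), so
# cbind(1, x) %*% t(W[[1]]) holds every W[[1]] %*% c(1, x[i]) as a row,
# and multiplying by W[[2]] performs the weighted sum.
func_mat <- function(x, W) {
  as.vector(cbind(1, x) %*% t(W[[1]]) %*% W[[2]])
}

x <- seq(0, 1, length.out = 101)
all.equal(func(x, W), func_mat(x, W))
```

func_mat can be passed to stat_function exactly like func, with args = list(W = W).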

Plotting multiple variable function in R

I'm needing help with the following question:
Consider the following R function, named negloglike that has two input arguments: lam and x, in that order.
Use this function to produce a plot of the log-likelihood function over a range of values λ ∈ (0, 2).
negloglike <- function(lam, x) {
  l = -sum(log(dexp(x, lam)))
  return(l)
}
Can anyone please help? Is it possible to do something like this with ggplot? I've been trying to do it with a set value of lam (like 0.2 here for example) using stat_function:
ggplot(data = data.frame(x = 0), mapping = aes(x = x)) +
  stat_function(fun = negloglike, args = list(lam = 0.2)) +
  xlim(0,10)
but the plot always returns a horizontal line at some y-value instead of returning a curve.
Should I be possibly using a different geom? Or even a different package altogether?
Much appreciated!
The trick is to Vectorize the function over the argument of interest.
Credit for the tip goes to the most-voted answer to this question. It uses base graphics only, so here is a ggplot2 equivalent.
First I will define the negative log-likelihood using dexp with log = TRUE.
library(ggplot2)
negloglike <- function(lam, x) {
  l = -sum(dexp(x, lam, log = TRUE))
  return(l)
}
nllv <- Vectorize(negloglike, "lam")
But it's better to use the analytic form, which is easy to establish by hand.
negloglike2 <- function(lam, x) {
  l = lam*sum(x) - length(x)*log(lam)
  return(l)
}
nllv2 <- Vectorize(negloglike2, "lam")
ggplot(data = data.frame(lam = seq(0, 2, by = 0.2)), mapping = aes(x = lam)) +
  stat_function(fun = nllv2, args = list(x = 0:10))
Both nllv and nllv2 give the same graph.
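The horizontal line in the question most likely comes from sum(): stat_function hands the function a whole grid of values at once, and sum() reduces everything to a single number. The base-R check below illustrates that collapse and confirms numerically that the two vectorized versions agree. The names lams, obs and collapsed are illustrative, and obs reuses the 0:10 sample from the plot call above.

```r
negloglike <- function(lam, x) -sum(dexp(x, lam, log = TRUE))
negloglike2 <- function(lam, x) lam * sum(x) - length(x) * log(lam)

lams <- seq(0.2, 2, by = 0.2)  # grid of rate values, excluding lam = 0
obs <- 0:10                    # illustrative sample, as in args = list(x = 0:10)

# Unvectorized: ten values of lam go in, a single number comes out.
collapsed <- negloglike(lams, obs)
length(collapsed)

# Vectorized over lam: one value per element of lams.
nllv <- Vectorize(negloglike, "lam")
nllv2 <- Vectorize(negloglike2, "lam")
all.equal(nllv(lams, obs), nllv2(lams, obs))
```

Note that the analytic form only applies sum() to x, so negloglike2 already returns one value per lam even without Vectorize.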

Using stat_function to draw partially shaded normal curve in ggplot2

I'm a big beginner in R and am very confused as to how ggplot is using variable "x" when creating normal curves.
My situation is this. I'm trying to plot normal curves given specific means and standard deviations and in the absence of data the most common way I've seen to do this is as follows:
score = 1800
m = 1500
std = 300
p <- ggplot(data.frame(x = c(300, 2700)), aes(x = x)) +
  stat_function(fun = dnorm, args = list(mean = m, sd = std)) +
  scale_x_continuous(name = "Score", breaks = seq(300, 2700, std))
p
I wanted to shade a specific area of the curve so using the Internet I created a function as follows:
funcShaded <- function(x) {
  y = dnorm(x, mean = m, sd = std)
  y[x < score] <- NA
  return(y)
}
And then added a layer to my curve with
p + stat_function(fun = funcShaded, geom="area", fill="#84CA72", alpha=.2)
This works to create the graph I desire. However, I have 2 questions about this. First, when I break the code down
data.frame(x = c(300, 2700))
creates a two-item data frame, as you would expect, so how can it be used to create the x-axis values and, further, be passed to the function and used appropriately (read: as if it were a list of values)?
Second, I now want to re-use this function later to fill in other area under the curve based on a different score (e.g. score2 = 1630) and was thinking I could just add another variable to funcShaded to pass score (i.e. funcShaded <- function(x, score)) and then call my stat_function as follows: p + stat_function(fun = funcShaded(x, score2), ...) but:
I'm not sure this syntax will work
It seems like the x variable is never explicitly "created" with this code because it doesn't show up in my Environment and when I try this code I get Error: object 'x' not found
So I guess I'm just curious as to how 'x' is working in this situation and if I should be creating it differently given what I want to do.
The function passed to stat_function must be uncalled (unless it returns another function; an adverb like purrr::partial or the like is another approach here), because stat_function needs to pass it a vector of x values.
You've already done with dnorm what you need to do with funcShaded: pass additional fixed parameters through args:
library(ggplot2)
score = 1800
m = 1500
std = 300
funcShaded <- function(x, lower_bound) {
  y = dnorm(x, mean = m, sd = std)
  y[x < lower_bound] <- NA
  return(y)
}
ggplot(data.frame(x = c(300, 2700)), aes(x = x)) +
  stat_function(fun = dnorm, args = list(mean = m, sd = std)) +
  stat_function(fun = funcShaded, args = list(lower_bound = score),
                geom = "area", fill = "#84CA72", alpha = .2) +
  scale_x_continuous(name = "Score", breaks = seq(300, 2700, std))
Alternately, without writing your own function, you can do the same thing with stat_function's xlim parameter:
ggplot(data.frame(x = c(300, 2700)), aes(x = x)) +
  stat_function(fun = dnorm, args = list(mean = m, sd = std)) +
  stat_function(fun = dnorm, args = list(mean = m, sd = std), xlim = c(score, 2700),
                geom = "area", fill = "#84CA72", alpha = .2) +
  scale_x_continuous(name = "Score", breaks = seq(300, 2700, std))
As for how stat_function uses the values passed into its x aesthetic, it uses them as limits between which to interpolate a grid of values, the number of which is set by its n parameter, which defaults to 101. It's decidedly a different usage than most stats, but it's a very useful function.
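The "unless it returns another function" case mentioned above can be sketched with a function factory. make_shaded is a hypothetical name, and the grid is built by hand here only to imitate what stat_function does with the limits 300 and 2700 and n = 101:

```r
m <- 1500
std <- 300

# make_shaded() is a hypothetical function factory: calling it returns
# the function that stat_function should receive.
make_shaded <- function(lower_bound, mean, sd) {
  # Evaluate the arguments now so later changes to m or std do not leak in.
  force(lower_bound); force(mean); force(sd)
  function(x) {
    y <- dnorm(x, mean = mean, sd = sd)
    y[x < lower_bound] <- NA
    y
  }
}

# A hand-built stand-in for the grid stat_function interpolates between
# the limits of the x aesthetic.
grid <- seq(300, 2700, length.out = 101)
shaded <- make_shaded(1630, m, std)(grid)
sum(is.na(shaded))  # grid points below the lower bound
```

In the plot this would be used as stat_function(fun = make_shaded(1630, m, std), geom = "area", ...): the factory is called, but the function it returns is passed uncalled.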

Plotting family of functions with qplot without duplicating data

Given a family of functions f(x; q) (x is the argument and q is a parameter), I'd like to visualize this family for x taken from the interval [0, 1] and for 9 values of q (from 0.1 to 0.9). So far my solution is:
f = function(p,q=0.9) {1-(1-(p*q)^3)^1024}
x = seq(0.0,0.99,by=0.01)
q = seq(0.1,0.9,by=0.1)
qplot(rep(x,9), f(rep(x,9),rep(q,each=100)), colour=factor(rep(q,each=100)),
geom="line", size=I(0.9), xlab="x", ylab=expression("y=f(x)"))
I get a quick and easy visual with qplot.
My concern is that this method is rather memory hungry, as I need to duplicate x for each parameter value and duplicate each parameter value over the whole x range. What would be an alternative way to produce the same graph without these duplications?
At some point ggplot will need to have the data available to plot it and the way that package works prohibits simply doing what you want. I suppose you could set up a blank plot if you know the x and y axis limits, and then loop over the 9 values of q, generating the data for that q, and adding a geom_line layer to the existing plot object. However, you'll have to produce the colours for each layer yourself.
If this is representative of the size of problem you have, I wouldn't worry too much about the memory footprint. We're only talking about two vectors of length 900
> object.size(rnorm(900))
7240 bytes
and the 100 values over the range of x appear sufficient to give a smooth plot.
for loop to add layers to ggplot
require("ggplot2")
## something to replicate ggplot's colour palette, sure there is something
## to do this already in ggplot now...
ggHueColours <- function(n, h = c(0, 360) + 15, l = 65, c = 100,
                         direction = 1, h.start = 0) {
  turn <- function(x, h.start, direction) {
    (x + h.start) %% 360 * direction
  }
  if ((diff(h) %% 360) < 1) {
    h[2] <- h[2] - 360 / n
  }
  hcl(h = turn(seq(h[1], h[2], length = n), h.start = h.start,
               direction = direction), c = c, l = l)
}
f = function(p,q=0.9) {1-(1-(p*q)^3)^1024}
x = seq(0.0,0.99,by=0.01)
q = seq(0.1,0.9,by=0.1)
cols <- ggHueColours(n = length(q))
for(i in seq_along(q)) {
  df <- data.frame(y = f(x, q[i]), x = x)
  if(i == 1) {
    plt <- ggplot(df, aes(x = x, y = y)) + geom_line(colour = cols[i])
  } else {
    plt <- plt + geom_line(data = df, colour = cols[i])
  }
}
plt
which gives the same nine curves, one colour per value of q.
I'll leave the rest to you - I'm not familiar enough with ggplot to draw a legend manually.
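If a legend is wanted, one option is to let ggplot2 draw it by mapping colour inside aes(), which requires the long-format data the question hoped to avoid (at 900 rows the cost is small, as argued above). A sketch under that assumption; expand.grid builds the same x/q pairs as the rep() calls in the question, and the plotting step is skipped if ggplot2 is not installed:

```r
f <- function(p, q = 0.9) 1 - (1 - (p * q)^3)^1024
x <- seq(0.0, 0.99, by = 0.01)
q <- seq(0.1, 0.9, by = 0.1)

# One row per (x, q) pair; x varies fastest, matching rep(x, 9).
grid <- expand.grid(x = x, q = q)
grid$y <- f(grid$x, grid$q)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Mapping colour to factor(q) gives one line and one legend entry per q.
  plt <- ggplot(grid, aes(x = x, y = y, colour = factor(q))) +
    geom_line() +
    labs(colour = "q", y = "y = f(x)")
  # print(plt) draws the plot
}
```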
