How can you use ggplot to superimpose many plots of related functions in an automatic way? - r

I have a family of functions that are all the same except for one adjustable parameter, and I want to plot all of them superimposed on one set of axes. For instance, this could be sin(n*x) for various values of n, say 1:30. I don't want to type out each command individually -- I figure there should be some way to do it programmatically.
library(ggplot2)
# Define trig functions as a function of frequency: sin(x), sin(2x), sin(3x), etc.
trigf <- function(i)(function(x)(sin(i*x)))
# Superimpose two function plots -- this works manually, of course
ggplot(data.frame(x=c(0,pi)), aes(x)) + stat_function(fun=trigf(1)) + stat_function(fun=trigf(2))
# Now try to generalize -- my idea was to make a list of the stat_functions using lapply
plotTrigf <- lapply(1:5, function(i)(stat_function(fun=function(x)(sin(i*x))) ))
# Try using the elements of the list manually, but it doesn't really work -- only the i=5 plot is shown, and I'm not sure why, since that's not what I referenced
ggplot(data.frame(x=c(0,pi)), aes(x)) + plotTrigf[[1]] + plotTrigf[[2]]
# I thought Reduce might handle the 'generalized sum' to add to a ggplot(), but it doesn't work -- it complains of a non-numeric argument to binary operator
Reduce("+", plotTrigf)
So I'm kind of stuck both in executing this strategy, or perhaps there's some other way to do this.

Are you using an R version older than 3.2? The problem is that you need to evaluate the i parameter inside your lapply call. Right now it is left as a promise and is not evaluated until you try to plot, and at that point i has the last value it had in the lapply loop, which is 5. Use:
plotTrigf <- lapply(1:5, function(i) {force(i);stat_function(fun=function(x)(sin(i*x))) })
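To see the promise behaviour in isolation, here is a minimal base R sketch. The names make_lazy, make_forced and n are made up for this illustration and are not part of the question's code. It changes a variable between creating a closure and calling it, which is the same situation the lapply loop creates in older R versions.

```r
# Lazy version: `i` stays an unevaluated promise until the inner function first runs.
make_lazy <- function(i) function(x) sin(i * x)

# Forced version: `i` is evaluated as soon as make_forced() is called.
make_forced <- function(i) {
  force(i)
  function(x) sin(i * x)
}

n <- 1
f_lazy <- make_lazy(n)
f_forced <- make_forced(n)
n <- 2  # change the variable before either closure is called

lazy_value <- f_lazy(pi / 4)      # promise is evaluated now, so it sees n = 2
forced_value <- f_forced(pi / 4)  # uses the value captured at creation time, n = 1
```

The lazy closure ends up computing sin(2 * pi/4), while the forced one computes sin(1 * pi/4), even though both were created while n was 1.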
You can't just add stat_function calls together. Even without Reduce() you get the error:
stat_function(fun=sin) + stat_function(fun=cos)
# Error in stat_function(fun = sin) + stat_function(fun = cos) :
# non-numeric argument to binary operator
You need to add them to a ggplot object. You can do this with Reduce() if you just specify the init= parameter
Reduce("+", plotTrigf, ggplot(data.frame(x=c(0,pi)), aes(x)))
And actually the special + operator for ggplot objects allows you to add a list of objects so you don't even need the Reduce at all (see code for ggplot2:::add_ggplot)
ggplot(data.frame(x=c(0,pi)), aes(x)) + plotTrigf
The final result is a single plot with all five curves superimposed (image not included here).

You need to use force to make sure the parameter is evaluated at the right time. It's a very useful technique, and lazy evaluation is a common source of confusion in loops. You should read about it in Hadley's book: http://adv-r.had.co.nz/Functions.html
To solve your question: you just need to add force(i) when defining all the plots, inside the lapply function, before making the call to stat_function. Then you can use Reduce or any other method to combine them. Here's a way to combine the plots using lapply (note that I'm using the <<- operator which is discouraged)
p <- ggplot(data.frame(x=c(0,pi)), aes(x))
lapply(plotTrigf, function(x) {
p <<- p + x
return()
})
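As for "some other way to do this": a different technique from the stat_function approaches above is to compute the values yourself in one long data frame and draw them with geom_line. This is only a sketch; the names grid, n and y and the 201-point resolution are choices made for this example.

```r
library(ggplot2)

# One row per (x, n) pair: 201 x values on [0, pi] for each frequency n = 1..5.
grid <- expand.grid(x = seq(0, pi, length.out = 201), n = 1:5)
grid$y <- sin(grid$n * grid$x)

# Grouping by n draws one line per frequency; factor(n) gives each a discrete colour.
p <- ggplot(grid, aes(x, y, group = n, colour = factor(n))) +
  geom_line()
```

This avoids closures entirely, at the cost of storing the evaluated points, and it gives you a legend for n without extra work.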

Related

Plot a normal distribution in R with specific parameters

I'd like to plot something like this:
plot(dnorm(mean=2),from=-3,to=3)
But it doesn't work, whereas this does:
plot(dnorm,from=-3,to=3)
What is the problem?
The answer you received from @r2evans is excellent. You might also want to consider learning ggplot, as in the long run it will likely make your life much easier. In that case, you can use stat_function, which plots the results of an arbitrary function along a grid of the x variable. It accepts additional arguments to the function as a list.
library(ggplot2)
ggplot(data = data.frame(x=c(-3,3)), aes(x = x)) +
stat_function(fun = dnorm, args = list(mean = 2))
curve(dnorm(x, mean = 2), from = -3, to = 3)
The curve function looks for the xname= variable (defaults to x) in the function call, so in dnorm(x, mean=2), it is not referencing an x in the calling environment, it is a placeholder for curve to use for iterated values.
The reason plot(dnorm, ...) works as it does is because there exists graphics::plot.function, since dnorm in that case is a function. When you try plot(dnorm(mean=2)), the dnorm(mean=2) is no longer a function, it is a call ... that happens to fail because it requires x (its first argument) be provided.
Incidentally, plot.function calls curve(...), so other than being a convenience function, there is very little reason to use plot(dnorm, ...) over curve(dnorm(x), ...) other than perhaps a little code-golf. The biggest advantage to curve is that it lets you control arbitrary arguments to the dnorm() function, whereas plot.function does not.
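A small sketch of both routes, assuming you only need base graphics. The wrapper name shifted is made up here. Wrapping dnorm in a one-argument function is one way to give plot() something it can dispatch on while still fixing mean = 2.

```r
pdf(NULL)  # draw to a null device so the example needs no window or output file

# curve() substitutes a grid of values for x in the expression and
# invisibly returns the evaluated points as a list with elements x and y.
res_curve <- curve(dnorm(x, mean = 2), from = -3, to = 3)

# plot() dispatches to plot.function() when given a function,
# so a wrapper with the parameter fixed works as well.
shifted <- function(x) dnorm(x, mean = 2)
plot(shifted, from = -3, to = 3)

dev.off()
```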

passing arguments to geom_point2 with mapply

My objective is to pass lists as arguments to the function geom_point2 using lapply or, analogously, mapply. In similar situations, I had success passing a list (or lists) to geom_cladelabel, as in:
mapply(function (x,y,z,w,v,u,t,s) geom_cladelabel(node=x, label=y,
align=F, etc. # Where x y z etc are lists.
Problem is related to the use of aes inside geom_point2. (not in geom_cladelabel):
In the case of geom_point2, the node info is inside aes, and I couldn't make it work. Normally I do not get any error message, but it does not work.
The objective is to make this example work using mapply instead of writing geom_point2 two times.
# source("https://bioconductor.org/biocLite.R")
# biocLite("ggtree")
library(ggtree)
library(ape)
#standard code
newstree<-rtree(10)
node1<-getMRCA(newstree,c(1,2))
node2<-getMRCA(newstree,c(3,4))
ggtree(newstree)+
geom_point2(aes(subset=(node ==node1) ), fill="black", size=3, shape=23)+
geom_point2(aes(subset=(node ==node2) ), fill="green", size=3, shape=23)
#desire to substitute the geom_point2 layers with mapply or lapply:
#lapply(c(node1,node2), function (x) geom_point2(aes(subset=(node=x)))))
Here is a solution calling geom_point2 using mapply:
library(ggtree)
ggtree(rtree(10)) +
mapply(function(x, y, z)
geom_point2(
aes_string(subset=paste("node ==", x)),
fill=y,
size=10,
shape=z
),
x=c(11,12),
y=c("green", "firebrick"),
z=c(23,24)
) +
geom_text2(aes(subset=!isTip, label=node))
The solution is in the aes_string(), which writes the value of x directly in the aesthetics. The default aes() does not pass on the value of x, but just the string "x". When plotting, ggtree then looks for a node called "x", and ends with an empty node list.
I guess this has to do with the variable being stored in the mapply-environment and not being passed on to the plot.
PS: Sorry for my too quick answer with do.call() earlier. It is useful, but off-topic here.

How to programmatically overlap arbitrary stat_functions in ggplot?

I am looking for a way to automatically plot an arbitrary number of stat_function objects in a single ggplot, each one with a different set of parameters, and coloring them.
Initially I thought of having one big data.table with a large number of samples from each distribution, each set associated with an index, and using geom_density, grouping and coloring by the index.
This is, however, very inefficient. There is, in my opinion, no need to spend time and memory to produce and keep large sets of values if we already have parameters that perfectly describe each distribution.
I present my initial solution below, but is there a more elegant and/or practical way of doing this?
library(data.table)
library(ggplot2)
distrData.dt <- data.table( Shape = c(2.1,2.2,2.3), Scale = c(1.1,1.2,1.3), time = c(1,2,3) )
ggplot(data.table(x=c(0:15)), aes(x)) +
apply(distrData.dt,1, FUN = function(x) stat_function(fun = dgamma,args = list(shape=as.numeric(x[1]),scale=as.numeric(x[2])), mapping = aes_string(color=x[3]) ) ) +
scale_colour_gradient("Time Step", low="blue", high="red", space="Lab")
This is the current result:
It produces the main result, that is, it will plot as many "perfect" densities as the number of parameter sets you give it. However, I am not using aesthetics to pass parameters from the column names ("Shape" and "Scale") or to get the color of each line. As far as I understand, that is not possible, but is there another way?
First of all, your solution is absolutely fine to me: it does the job, and it does it elegantly. I just wanted to expand on @joran's comment and show a useful trick called a "function factory", which is well suited to a case like yours.
So I'm building a function that returns a function with fixed parameters. Note that using force prevents shape and scale from being lazily evaluated, which is necessary since we'll be using a for loop.
I'm using data.frame instead of data.table, but there shouldn't be a significant difference. The vector("list", n) construction preallocates space for a list, as seen in ?list. I don't think it's obligatory in this particular case (significant overhead would only appear for lengths of, say, >100, which is unlikely here), but it's always better to avoid growing objects iteratively, which is bad practice.
As a last remark, check the stat_function call: it seems reasonably readable, at least you can see what's the mapping and what's related to dgamma parameters.
dgamma_factory <- function(shape, scale) {
force(shape)
force(scale)
function(x) dgamma(x, shape = shape, scale = scale)
}
l <- vector("list", nrow(distrData.dt))
for (i in seq.int(nrow(distrData.dt))) {
params <- distrData.dt[i, ]
l[[i]] <- stat_function(
fun = dgamma_factory(params$Shape, params$Scale),
mapping = aes_string(color = params$time))
}
ggplot(data.frame(x=c(0:15)), aes(x)) +
l +
scale_colour_gradient("Time Step", low="blue", high="red", space="Lab")
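For comparison, here is a loop-free sketch of the same idea that iterates over the parameter columns with Map. It assumes ggplot2 3.0.0 or later, where aes() supports `!!` for inlining a value. The names distr and layers are made up for this example.

```r
library(ggplot2)

# Same parameter sets as in the question, as a plain data.frame.
distr <- data.frame(Shape = c(2.1, 2.2, 2.3),
                    Scale = c(1.1, 1.2, 1.3),
                    time  = c(1, 2, 3))

# Map() walks the three columns in parallel and returns one layer per row.
# `!!time` inlines the current value into the aesthetic instead of
# leaving a reference to a variable that must be looked up later.
layers <- Map(function(shape, scale, time) {
  stat_function(fun = dgamma,
                args = list(shape = shape, scale = scale),
                mapping = aes(colour = !!time))
}, distr$Shape, distr$Scale, distr$time)

p <- ggplot(data.frame(x = c(0, 15)), aes(x)) +
  layers +
  scale_colour_gradient("Time Step", low = "blue", high = "red")
```

Passing args to stat_function fixes the dgamma parameters without a separate factory, so which version reads better is largely a matter of taste.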

Adding points after the fact with ggplot2; user defined function

I believe the answer to this is that I cannot, but rather than give in utterly to depraved desperation, I will turn to this lovely community.
How can I add points (or any additional layer) to a ggplot after already plotting it? Generally I would save the plot to a variable and then just tack on + geom_point(...), but I am trying to include this in a function I am writing. I would like the function to make a new plot if plot=T, and add points to the existing plot if plot=F. I can do this with the basic plotting package:
fun <- function(df,plot=TRUE,...) {
...
if (!plot) { points(dYdX~Time.Dec,data=df2,col=col) }
else { plot(dYdX~Time.Dec,data=df2,...) }}
I would like to run this function numerous times with different dataframes, resulting in a plot with multiple series plotted.
For example,
fun(df.a,plot=T)
fun(df.b,plot=F)
fun(df.c,plot=F)
fun(df.d,plot=F)
The problem is that because functions in R don't have side-effects, I cannot access the plot made in the first command. I cannot save the plot to -> p, and then recall p in the later functions. At least, I don't think I can.
Have your function return a ggplot plot object that you can feed to your next function call, like this:
ggfun = function(df, oldplot, plot=T){
...
if(plot){
outplot = ggplot(df, ...) + geom_point(...)
}else{
outplot = oldplot + geom_point(data=df, ...)
}
print(outplot)
return(outplot)
}
Remember to assign the returned plot object to a variable:
cur.plot = ggfun(...)
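A runnable sketch of that pattern, under some assumptions: the function name add_series and the toy data are invented here, the column names Time.Dec and dYdX are taken from the question, and a NULL default replaces the plot=T/F flag to decide whether a new plot is started.

```r
library(ggplot2)

# Returns a new plot when oldplot is NULL, otherwise adds a layer to oldplot.
add_series <- function(df, oldplot = NULL, colour = "black") {
  layer <- geom_point(data = df,
                      mapping = aes(x = Time.Dec, y = dYdX),
                      colour = colour)
  if (is.null(oldplot)) ggplot() + layer else oldplot + layer
}

# Toy data standing in for df.a and df.b from the question.
df.a <- data.frame(Time.Dec = 1:5, dYdX = c(0.1, 0.3, 0.2, 0.5, 0.4))
df.b <- data.frame(Time.Dec = 1:5, dYdX = c(0.2, 0.1, 0.4, 0.3, 0.6))

p <- add_series(df.a)                               # starts a new plot
p <- add_series(df.b, oldplot = p, colour = "red")  # adds to the existing one
```

Printing p (or calling print(p) inside a script) then draws both series on the same axes.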

How can I plot multiple functions in R?

Using ggplot, is there a way of graphing several functions on the same plot? I want to use parameters from a text file as arguments for my functions and overlay these on the same plot.
I understand this but I do not know how to add the visualized function together if I loop through.
Here is an implementation of Hadley's idea.
library(ggplot2)
funcs <- list(log,function(x) x,function(x) x*log(x),function(x) x^2, exp)
cols <-heat.colors(5,1)
p <-ggplot()+xlim(c(1,10))+ylim(c(1,10))
for(i in 1:length(funcs))
p <- p + stat_function(aes(y=0),fun = funcs[[i]], colour=cols[i])
print(p)
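The question also mentions reading parameters from a text file, which the loop above does not cover. Here is a sketch under invented assumptions: the column names slope and intercept, the family f(x) = slope * x + intercept, and the inline text standing in for the file are all hypothetical. With a real file you would pass its path to read.table instead of text=.

```r
library(ggplot2)

# Inline stand-in for the parameter file; one row per function to draw.
params <- read.table(text = "slope intercept
1 0
0.5 2
2 -1", header = TRUE)

# Factory that fixes the parameters of one member of the family.
line_factory <- function(slope, intercept) {
  force(slope)
  force(intercept)
  function(x) slope * x + intercept
}

# One stat_function layer per row of the parameter table.
layers <- lapply(seq_len(nrow(params)), function(i) {
  stat_function(fun = line_factory(params$slope[i], params$intercept[i]))
})

# A list of layers can be added to a ggplot object in one step.
p <- ggplot(data.frame(x = c(1, 10)), aes(x)) + layers
```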
