ggplot2: easy way to plot integral over independent variable? - r

I'm integrating a function f(t) = 2t (just an example) using:
awesome_thing <- function(t) {2*t}
integrate(awesome_thing, lower=0, upper=10)
However, I would like to plot the integral as a function of time in ggplot2, so for this example the plotted points would be (1,1), (2,4), (3,9), ..., (10,100).
Is there an easy way to do this in ggplot2 (e.g., something similar to how functions are plotted)? I understand I can "manually" evaluate and plot the data for each t, but I thought I'd see if anyone could recommend a simpler way.

Here is a ggplot2 solution using stat_function:
# create a function that is vectorized over the "upper" limit of your
# integral
int_f <- Vectorize(function(f = awesome_thing, lower=0,upper,...){
integrate(f,lower,upper,...)[['value']] },'upper')
library(ggplot2)
ggplot(data.frame(x = c(0, 10)), aes(x = x)) +
  stat_function(fun = int_f, args = list(f = awesome_thing, lower = 0))
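As a quick sanity check on the vectorized wrapper, the sketch below evaluates it at t = 1, ..., 10 without any plotting. It assumes the example integrand f(t) = 2t from the question, so the values should be close to t^2; the name `vals` is only illustrative.

```r
# Example integrand from the question: f(t) = 2t
awesome_thing <- function(t) 2 * t

# Vectorize over the upper limit only, as in the answer
int_f <- Vectorize(function(f = awesome_thing, lower = 0, upper, ...) {
  integrate(f, lower, upper, ...)[["value"]]
}, "upper")

# A single call now returns the integral for every upper limit
vals <- int_f(upper = 1:10)
print(vals)
```

Without Vectorize, integrate would receive the whole vector as its upper limit and stop with an error, which is why stat_function needs the wrapper.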

Not ggplot2, but it shouldn't be difficult to adapt by creating a data frame to pass to that paradigm:
plot(x=seq(0.1,10, by=0.1),
y= sapply(seq(0.1,10, by=0.1) ,
function(x) integrate(awesome_thing, lower=0, upper=x)$value ) ,
type="l")
The trick with the integrate function is that it returns a list, so you need to extract the 'value' element for each upper limit.
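One way the data-frame adaptation might look is sketched below. The names `t_grid` and `int_df` are illustrative, and the plotting step assumes ggplot2 is installed (it is skipped otherwise).

```r
awesome_thing <- function(t) 2 * t

# Evaluate the running integral on a grid of upper limits
t_grid <- seq(0.1, 10, by = 0.1)
int_df <- data.frame(
  t = t_grid,
  integral = sapply(t_grid, function(u) {
    integrate(awesome_thing, lower = 0, upper = u)$value
  })
)

# Draw the curve with ggplot2 if the package is available (assumption)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(int_df, ggplot2::aes(x = t, y = integral)) +
    ggplot2::geom_line()
  print(p)
}
```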

Related

Basic Calculations with stat_functions -- Plotting hazard functions

I am currently trying to plot some density functions with R's ggplot2. I have the following code:
f <- stat_function(fun="dweibull",
args=list("shape"=1),
"x" = c(0,10))
stat_F <- stat_function(fun="pweibull",
args=list("shape"=1),
"x" = c(0,10))
S <- function() 1 - stat_F
h <- function() f / S
wei_h <- ggplot(data.frame(x=c(0,10))) +
stat_function(fun=h) +
...
Basically I want to plot hazard functions based on a Weibull distribution with varying parameters, meaning I want to plot h(x) = f(x) / S(x), where S(x) = 1 - F(x).
The above code gives me this error:
Computation failed in stat_function():
unused argument (x_trans)
I also tried to directly use
S <- 1 - stat_function(fun="pweibull", ...)
instead of the above "workaround" with the custom function construction. This threw another error, since I was trying to do arithmetic on a non-numeric object:
non-numeric argument for binary operator
I understand that error, but I have no idea how to solve it.
I have done some research, but without success. I feel like this should be straightforward. I would also like to do it "manually" as much as possible, but if there is no simple way to do this, then a packaged solution is fine as well.
Thanks in advance for any suggestions!
PS: I basically want to recreate the graph you can find in Kiefer, 1988 on page 10 of the linked PDF file.
Three comments:
1. stat_function is a function statistic for ggplot2; you cannot divide two stat_function expressions by each other or otherwise use them in mathematical expressions, as in S <- 1 - stat_function(fun="pweibull", ...). That's a fundamental misunderstanding of what stat_function is. stat_function always needs to be added to a ggplot2 plot, as in the example below.
2. The fun argument for stat_function takes a function as an argument, not a string. You can define functions on the fly if you need ones that don't exist already.
3. You need to set up an aesthetic mapping, via the aes function.
This code works:
library(ggplot2)
args <- list(shape = 1.2)
ggplot(data.frame(x = seq(0, 10, length.out = 100)), aes(x)) +
stat_function(fun = dweibull, args = args, color = "red") +
stat_function(fun = function(...){1-pweibull(...)}, args = args, color = "green") +
stat_function(fun = function(...){dweibull(...)/(1-pweibull(...))},
args = args, color = "blue")
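The same quantities can also be written as ordinary named R functions, which makes them easy to check numerically before plotting. The sketch below is only illustrative: `weibull_survival` and `weibull_hazard` are hypothetical helper names, not part of ggplot2 or base R. Note that both take x as their first argument; the question's `h <- function()` takes no arguments at all, which is presumably why stat_function reported an unused argument.

```r
# Survival function S(x) = 1 - F(x); lower.tail = FALSE computes it directly
weibull_survival <- function(x, shape, scale = 1) {
  pweibull(x, shape = shape, scale = scale, lower.tail = FALSE)
}

# Hazard function h(x) = f(x) / S(x)
weibull_hazard <- function(x, shape, scale = 1) {
  dweibull(x, shape = shape, scale = scale) /
    weibull_survival(x, shape = shape, scale = scale)
}

x <- c(0.5, 1, 2, 3)
h1 <- weibull_hazard(x, shape = 1)  # shape = 1 is the exponential case: constant hazard
h2 <- weibull_hazard(x, shape = 2)  # shape = 2 with scale = 1: hazard equals 2 * x
print(rbind(h1, h2))
```

Either helper can then be handed to a layer, for example stat_function(fun = weibull_hazard, args = list(shape = 1.2)).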

Using user-defined functions within "curve" function in R graphics

I need to produce normal density plots with different total areas (summing to 1). Using the following function, I can specify lambda, which gives the relative area:
sdnorm <- function(x, mean=0, sd=1, lambda=1){lambda*dnorm(x, mean=mean, sd=sd)}
I then want to plot up the function using different parameters. Using ggplot2, this code works:
require(ggplot2)
qplot(x, geom="blank") + stat_function(fun=sdnorm,args=list(mean=8,sd=2,lambda=0.7)) +
stat_function(fun=sdnorm,args=list(mean=18,sd=4,lambda=0.30))
but I really want to do this in base R graphics, for which I think I need to use the "curve" function. However, I am struggling to get this to work.
If you take a look at the help file for ?curve, you'll see that the first argument can be a number of different things:
The name of a function, or a call or an expression written as a function of x which will evaluate to an object of the same length as x.
This means you can specify the first argument as either a function name or an expression, so you could just do:
curve(sdnorm)
to get a plot of the function with its default arguments. Otherwise, to recreate your ggplot2 representation you would want to do:
curve(sdnorm(x, mean=8,sd=2,lambda=0.7), from = 0, to = 30)
curve(sdnorm(x, mean=18,sd=4,lambda=0.30), add = TRUE)
You can do the following in base R
x <- seq(0, 50, 1)
plot(x, sdnorm(x, mean = 8, sd = 2, lambda = 0.7), type = 'l', ylab = 'y')
lines(x, sdnorm(x, mean = 18, sd = 4, lambda = 0.30))
EDIT I added ylab = 'y' and updated the picture to have the y-axis re-labeled.
This should get you started.
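To confirm that lambda really sets the area under each curve, the areas can be checked numerically. This is a minimal sketch using base R's integrate with the parameter values from the question; `area1` and `area2` are illustrative names.

```r
sdnorm <- function(x, mean = 0, sd = 1, lambda = 1) {
  lambda * dnorm(x, mean = mean, sd = sd)
}

# Area under each scaled density; the extra arguments are passed on to sdnorm
area1 <- integrate(sdnorm, -Inf, Inf, mean = 8, sd = 2, lambda = 0.7)$value
area2 <- integrate(sdnorm, -Inf, Inf, mean = 18, sd = 4, lambda = 0.3)$value
print(c(area1 = area1, area2 = area2, total = area1 + area2))
```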

Setting equal xlim and ylim in plot function

Is there a way to get the plot function to generate equal xlim and ylim automatically?
I do not want to define a fixed range beforehand; I want the plot function to decide on the range itself. However, I expect it to pick the same range for x and y.
A possible solution is to define a wrapper to the plot function:
plot.Custom <- function(x, y, ...) {
.limits <- range(x, y)
plot(x, y, xlim = .limits, ylim = .limits, ...)
}
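A usage sketch of the same idea on the built-in cars data is given below. The helper name `equal_limits` is hypothetical; a plain name is used here because a name of the form plot.Something can be mistaken for an S3 plot method for a class called "Something".

```r
# Shared axis limits computed from both variables (hypothetical helper)
equal_limits <- function(x, y) {
  range(x, y, na.rm = TRUE, finite = TRUE)
}

lims <- equal_limits(cars$speed, cars$dist)
print(lims)

# Same limits on both axes; the range is still derived from the data itself
plot(cars$speed, cars$dist, xlim = lims, ylim = lims,
     xlab = "speed", ylab = "dist")
```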
One way is to manipulate interactively and then choose the right one. A slider will appear once you run the following code.
library(manipulate)
manipulate(
plot(cars, xlim=c(x.min,x.max)),
x.min=slider(0,15),
x.max=slider(15,30))
I'm not aware of any way to do this using plot (that doesn't mean there isn't one). ggplot2 might be the way to go; it lends itself more to being changed retroactively, since it is designed around a layer system.
library(ggplot2)
#Creating our ggplot object
loop_plot <- ggplot(cars, aes(x = speed, y = dist)) +
geom_point()
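#pulling out the 'auto' x & y axis limits
#(ggplot2 >= 3.0 keeps them under $layout$panel_params; older versions used $panel$ranges)
built <- ggplot_build(loop_plot)
rangepull <- t(cbind(
built$layout$panel_params[[1]]$x.range,
built$layout$panel_params[[1]]$y.range))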
#taking the max and min(so we don't cut out data points)
newrange <- list(cor.min = min(rangepull[,1]), cor.max = max(rangepull[,2]))
#changing our plot size to be nice and symmetric
loop_plot <- loop_plot +
xlim(newrange$cor.min, newrange$cor.max) +
ylim(newrange$cor.min, newrange$cor.max)
Note that the loop_plot object is of class ggplot and won't actually be drawn until it is printed.
I used the cars dataset in the code above to show what's going on, but just sub in your data set[s] and then do whatever post-processing your end goal requires.
You'll also be able to add in titles and the like based off of the dataset name et cetera which will likely end up producing a clearer visualization out of your loop.
Hopefully this works for your needs.

Graphing a polynomial output of poly.calc

I apologize first for bringing what I imagine to be a ridiculously simple problem here, but I have been unable to glean from the help file for package 'polynom' how to solve this problem. For one out of several years, I have two vectors of x (d for day of year) and y (e for an index of egg production) data:
d=c(169,176,183,190,197,204,211,218,225,232,239,246)
e=c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,0.016599262,0.002810977,0.005603878,0,0.002810977,0.002810977)
I want to, for each year, use the poly.calc function to create a polynomial function that I can use to interpolate the timing of maximum egg production. I want then to superimpose the function on a plot of the data. To begin, I have no problem with the poly.calc function:
egg1996<-poly.calc(d,e)
egg1996
3216904000 - 173356400*x + 4239900*x^2 - 62124.17*x^3 + 605.9178*x^4 - 4.13053*x^5 +
0.02008226*x^6 - 6.963636e-05*x^7 + 1.687736e-07*x^8
I can then simply plot the data:
plot(d,e)
But when I try to use the lines function to superimpose the function on the plot, I get confused. The help file states that the output of poly.calc is an object of class polynomial, and so I assume that "egg1996" will be the "x" in:
lines(x, len = 100, xlim = NULL, ylim = NULL, ...)
But I cannot seem to, based on the example listed:
lines (poly.calc( 2:4), lty = 2)
Or based on the arguments:
x an object of class "polynomial".
len size of vector at which evaluations are to be made.
xlim, ylim the range of x and y values with sensible defaults
Come up with a command that successfully graphs the polynomial "egg1996" onto the raw data.
I understand that this question is beneath you folks, but I would be very grateful for a little help. Many thanks.
I don't work with the polynom package, but the resultant data set is on a completely different scale (both X & Y axes) than the first plot() call. If you don't mind having it in two separate panels, this provides both plots for comparison:
library(polynom)
d <- c(169,176,183,190,197,204,211,218,225,232,239,246)
e <- c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,
0.016599262,0.002810977,0.005603878,0,0.002810977,0.002810977)
egg1996 <- poly.calc(d,e)
par(mfrow=c(1,2))
plot(d, e)
plot(egg1996)
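If the goal is mainly to overlay a smooth interpolant on the data and read off the day of maximum egg production, one alternative is sketched below. It deliberately swaps in base R's splinefun (an interpolating cubic spline) for the single high-degree polynomial from poly.calc, so it is not the polynom approach; the names `egg_spline`, `day_grid` and `peak_day` are illustrative.

```r
d <- c(169, 176, 183, 190, 197, 204, 211, 218, 225, 232, 239, 246)
e <- c(0, 0, 0.006839425, 0.027323127, 0.024666883, 0.005603878,
       0.016599262, 0.002810977, 0.005603878, 0, 0.002810977, 0.002810977)

# Interpolating cubic spline through every (d, e) pair
egg_spline <- splinefun(d, e)

# Evaluate on a fine grid of days and locate the maximum
day_grid <- seq(min(d), max(d), by = 0.1)
spline_vals <- egg_spline(day_grid)
peak_day <- day_grid[which.max(spline_vals)]

# Raw data with the interpolant and the estimated peak superimposed
plot(d, e)
lines(day_grid, spline_vals, lty = 2)
abline(v = peak_day, col = "grey")
print(peak_day)
```

Because the curve is evaluated on the same x-scale as the data, it lands on the existing plot without changing the axes.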

contour plot of a custom function in R

I'm working with some custom functions and I need to draw contours for them based on multiple values for the parameters.
Here is an example function:
I need to draw such a contour plot:
Any idea?
Thanks.
First you construct a function, fourvar that takes those four parameters as arguments. In this case you could have done it with 3 variables one of which was lambda_2 over lambda_1. Alpha1 is fixed at 2 so alpha_1/alpha_2 will vary over 0-10.
fourvar <- function(a1,a2,l1,l2){
a1* integrate( function(x) {(1-x)^(a1-1)*(1-x^(l2/l1) )^a2} , 0 , 1)$value }
The trick is to realize that the integrate function returns a list and you only want the 'value' part of that list so it can be Vectorize()-ed.
Second you construct a matrix using that function:
mat <- outer( seq(.01, 10, length=100),
seq(.01, 10, length=100),
Vectorize( function(x,y) fourvar(a1=2, x/2, l1=2, l2=y/2) ) )
Then the task of creating the plot with labels in those positions can only be done easily with lattice::contourplot. After doing a reasonable amount of searching it does appear that the solution to geom_contour labeling is still a work in progress in ggplot2. The only labeling strategy I found is in an external package. However, the 'directlabels' package's function directlabel does not seem to have sufficient control to spread the labels out correctly in this case. In other examples that I have seen, it does spread the labels around the plot area. I suppose I could look at the code, but since it depends on the 'proto'-package, it will probably be weirdly encapsulated so I haven't looked.
require(reshape2)
mmat <- melt(mat)
str(mmat) # to see the names in the melted matrix
g <- ggplot(mmat, aes(x=Var1, y=Var2, z=value) )
g <- g+stat_contour(aes(col = ..level..), breaks=seq(.1, .9, .1) )
g <- g + scale_colour_continuous(low = "#000000", high = "#000000") # make black
install.packages("directlabels", repos="http://r-forge.r-project.org", type="source")
require(directlabels)
direct.label(g)
Note that these are the index positions from the matrix rather than the ratios of parameters, but that should be pretty easy to fix.
This, on the other hand, is how easily one can construct it in lattice (and I think it looks "cleaner"):
require(lattice)
contourplot(mat, at=seq(.1,.9,.1))
Since I think the question is still relevant: there have been some developments in contour plot labeling in the metR package. Adding the following to the previous example gives nice contour labels with ggplot2 as well:
require(metR)
g + geom_text_contour(rotate = TRUE, nudge_x = 3, nudge_y = 5)
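Regarding the earlier note that the ggplot2 axes show matrix index positions: one way to get parameter values on the axes is to pass the grids explicitly. The sketch below uses base R's contour and a three-argument variant of the function, with the ratio lambda_2 / lambda_1 as a single parameter, as the answer suggests is possible. `threevar`, the grid ranges, and the axis labels are assumptions made for illustration, not values taken from the question.

```r
# Three-parameter variant: r stands for lambda_2 / lambda_1 (hypothetical helper)
threevar <- function(a1, a2, r) {
  a1 * integrate(function(x) (1 - x)^(a1 - 1) * (1 - x^r)^a2, 0, 1)$value
}

# Assumed parameter grids; adjust to the ranges of interest
a2_grid <- seq(0.5, 5, length.out = 20)
r_grid  <- seq(0.5, 5, length.out = 20)

# integrate is not vectorized, so wrap the function before calling outer
mat <- outer(a2_grid, r_grid,
             Vectorize(function(a2, r) threevar(a1 = 2, a2 = a2, r = r)))

# Passing x and y puts parameter values, not matrix indices, on the axes
contour(x = a2_grid, y = r_grid, z = mat, levels = seq(0.1, 0.9, 0.1),
        xlab = "alpha_2", ylab = "lambda_2 / lambda_1")
```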
