Integrand Syntax in R - r

I'm trying to complete a straightforward integration, but I'm running into an issue that (I think) is due to the form in which I'm writing the integrand.
Suppose I want to find the area bounded by f(x) = 3x and g(x) = x^2, that is, the region between the two curves from x = 0 to x = 3.
Ok, so not a big deal to do analytically: the integral of 3x - x^2 from 0 to 3 is 27/2 - 9 = 4.5.
But I'd like to accomplish this with R, of course.
So I enter my function and there's a problem:
> g <- function(x) {3x-x^2}
Error: unexpected symbol in "g <- function(x) {3x"
This frustrated me, so I started playing around with things. Interestingly, I found that if I factor x out of the integrand and write it as x(3 - x), everything works smoothly:
> f <- function(x) {x*(3-x)}
> integrate(f, 0, 3)
4.5 with absolute error < 5e-14
My next step was to check ?integrate, part of which is attached below:
integrate(f, lower, upper, ..., subdivisions = 100L,
rel.tol = .Machine$double.eps^0.25, abs.tol = rel.tol,
stop.on.error = TRUE, keep.xy = FALSE, aux = NULL)
Arguments
f
an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.
lower, upper
the limits of integration. Can be infinite.
Am I somehow not taking a numeric first argument in my first attempt to integrate? Thanks in advance.

Change 3x to 3*x.
(This may be the smallest answer-length-to-question-length ratio I've seen in a long time ;-)
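To spell the fix out: R has no implicit multiplication, so 3x is read as the number 3 followed by a stray symbol, which is why the parser stops before integrate is ever called. A minimal sketch of the unfactored integrand with the operator written explicitly (the name h is only for illustration):

```r
# Same integrand as the first attempt, with the multiplication made explicit
h <- function(x) 3 * x - x^2

res <- integrate(h, lower = 0, upper = 3)
res$value  # numeric estimate of the area between the curves
```

The factored form x*(3-x) worked for the same reason: it already contains an explicit *.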

Related

R - how to define a "symbolic sequence"

Let's say that we have this sequence of numbers:
1/2, 1/3, 1/4, ..., 1/N
Let's say that we define a function which sums all the elements of this sequence:
\sum_{n=1}^{N} \frac{1}{n}
That is, this function would calculate the sum of all 1/n elements starting from n=1 to N.
I now want to find the limit of this function when N tends to infinity using R. For this, I was planning to use the lim function from the Ryacas package as suggested in this question.
However, I can't seem to find a way to make my script work. My idea was this:
x <- ysym("x")
sq <- 1:x
fun <- sum(1/sq)
lim(fun, x, Inf)
However, I am already getting an error at the second step of this process. I can't seem to run the
sq <- 1:x:
Error in 1:x : NA/NaN argument
In addition: Warning message:
In 1:x : numerical expression has 3 elements: only the first used
So it seems that it's not possible to define a "symbolic sequence" (don't know what else I would call it) in this way.
What would be the proper way to calculate what I want?
This series goes to infinity:
library(Ryacas)
yac_str("Sum(n, 1, Infinity, 1/n)")
# "Infinity"
Let's try a convergent series:
yac_str("Sum(n, 1, Infinity, 1/n^2)")
# "Pi^2/6"
If you really want to use Limit:
yac_str("Limit(n, Infinity) Sum(i, 1, n, 1/i^2)")
Use yac_str to execute a Yacas command.
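If you only want a numerical sanity check of those symbolic results, plain base R is enough. This is just a sketch and not part of the Ryacas approach: it computes partial sums for a few values of N (the variable names are made up for illustration) so you can watch one series settle and the other keep growing.

```r
# Partial sums up to N = 10, 100, ..., 10^6
N <- 10^(1:6)

# Convergent series: partial sums of 1/n^2 approach pi^2/6
partial_sq <- sapply(N, function(n) sum(1 / (1:n)^2))

# Harmonic series: partial sums of 1/n keep growing, roughly like log(N)
partial_harm <- sapply(N, function(n) sum(1 / (1:n)))

data.frame(N = N, sum_inv_sq = partial_sq, sum_inv = partial_harm, log_N = log(N))
```

The 1:n here works because n is an ordinary number; 1:x fails in the question because x is a symbolic object, not a numeric value.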

Optimize within for loop cannot find function

I've got a function, KozakTaper, that returns the diameter of a tree trunk at a given height (DHT). There's no algebraic way to rearrange the original taper equation to return DHT at a given diameter (4 inches, for my purposes)...enter R! (using 3.4.3 on Windows 10)
My approach was to use a for loop to iterate likely values of DHT (25-100% of total tree height, HT), and then use optimize to choose the one that returns a diameter closest to 4". Too bad I get the error message Error in f(arg, ...) : could not find function "f".
Here's a shortened definition of KozakTaper along with my best attempt so far.
KozakTaper=function(Bark,SPP,DHT,DBH,HT,Planted){
if(Bark=='ob' & SPP=='AB'){
a0_tap=1.0693567631
a1_tap=0.9975021951
a2_tap=-0.01282775
b1_tap=0.3921013594
b2_tap=-1.054622304
b3_tap=0.7758393514
b4_tap=4.1034897617
b5_tap=0.1185960455
b6_tap=-1.080697381
b7_tap=0}
else if(Bark=='ob' & SPP=='RS'){
a0_tap=0.8758
a1_tap=0.992
a2_tap=0.0633
b1_tap=0.4128
b2_tap=-0.6877
b3_tap=0.4413
b4_tap=1.1818
b5_tap=0.1131
b6_tap=-0.4356
b7_tap=0.1042}
else{
a0_tap=1.1263776728
a1_tap=0.9485083275
a2_tap=0.0371321602
b1_tap=0.7662525552
b2_tap=-0.028147685
b3_tap=0.2334044323
b4_tap=4.8569609081
b5_tap=0.0753180483
b6_tap=-0.205052535
b7_tap=0}
p = 1.3/HT
z = DHT/HT
Xi = (1 - z^(1/3))/(1 - p^(1/3))
Qi = 1 - z^(1/3)
y = (a0_tap * (DBH^a1_tap) * (HT^a2_tap)) * Xi^(b1_tap * z^4 + b2_tap * (exp(-DBH/HT)) +
b3_tap * Xi^0.1 + b4_tap * (1/DBH) + b5_tap * HT^Qi + b6_tap * Xi + b7_tap*Planted)
return(y=round(y,4))}
HT <- .3048*85 #converting from english to metric (sorry, it's forestry)
for (i in c((HT*.25):(HT+1))) {
d <- KozakTaper(Bark='ob',SPP='RS',DHT=i,DBH=2.54*19,HT=.3048*85,Planted=0)
frame <- na.omit(d)
optimize(f=abs(10.16-d), interval=frame, lower=1, upper=90,
maximum = FALSE,
tol = .Machine$double.eps^0.25)
}
Eventually I would like this code to iterate through a csv and return i for the best d, which will require some rearranging, but I figured I should make it work for one tree first.
When I print d I get multiple values, so it is iterating through i, but it gets held up at the optimize function.
Defining frame was my most recent tactic, because d returns one NaN at the end, but it may not be the best input for interval. I've tried interval=c((HT*.25):(HT+1)), defining KozakTaper within the for loop, and defining f prior to the optimize, but I get the same error. Suggestions for what part I should target (or other approaches) are appreciated!
-KB
Forestry Research Fellow, Appalachian Mountain Club.
MS, University of Maine
Edit with a follow-up question:
I'm now trying to run this script for each row of a csv, "Input." The row contains the values for KozakTaper, and I've called them with this:
Input=read.csv...
Input$Opt=0
o <- optimize(f = function(x) abs(10.16 - KozakTaper(Bark='ob',
SPP='Input$Species',
DHT=x,
DBH=(2.54*Input$DBH),
HT=(.3048*Input$Ht),
Planted=0)),
lower=Input$Ht*.25, upper=Input$Ht+1,
maximum = FALSE, tol = .Machine$double.eps^0.25)
Input$Opt <- o$minimum
Input$Mht <- Input$Opt/.3048 # converting back to English
Input$Ht and Input$DBH are numeric; Input$Species is factor.
However, I get the error invalid function value in 'optimize'. I get it whether I define "o" or just run optimize. Oddly, when I don't call values from the row but instead use the code from the answer, it tells me object 'HT' not found. I have the awful feeling this is due to some obvious/careless error on my part, but I'm not finding posts about this error with optimize. If you notice what I've done wrong, your explanation will be appreciated!
I'm not an expert on optimize, but I see three issues: 1) your call to KozakTaper does not iterate through the range you specify in the loop; 2) KozakTaper returns a single number, not a vector; 3) you haven't given optimize a function but an expression. abs(10.16-d) is evaluated to a number before optimize ever sees it, which is why R reports could not find function "f".
So what is happening is that you are not giving optimize anything to iterate over.
All you should need is this:
optimize(f = function(x) abs(10.16 - KozakTaper(Bark='ob',
SPP='RS',
DHT=x,
DBH=2.54*19,
HT=.3048*85,
Planted=0)),
lower=HT*.25, upper=HT+1,
maximum = FALSE, tol = .Machine$double.eps^0.25)
$minimum
[1] 22.67713 ##Hopefully this is the right answer
$objective
[1] 0
optimize will now search over x between lower and upper, trying to minimize the difference.
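For the follow-up with a csv: optimize needs an objective that returns a single number, and passing whole columns such as Input$DBH makes it return a vector, which is a likely cause of invalid function value (SPP='Input$Species' is also a literal string rather than the column). One way around it is to run optimize once per row. The sketch below uses a made-up toy_taper function and a made-up two-row data frame in place of KozakTaper and Input, so treat the names and numbers as placeholders only.

```r
# Hypothetical stand-in for KozakTaper: diameter (cm) at height h (m)
toy_taper <- function(h, dbh, ht) dbh * (1 - h / ht)^0.8

# Height at which the diameter equals `target`, for ONE tree (scalar inputs)
height_at_diameter <- function(target, dbh, ht) {
  optimize(function(h) abs(target - toy_taper(h, dbh, ht)),
           lower = 0, upper = ht)$minimum
}

# Made-up data in metric units; a real script would read these from the csv
trees <- data.frame(DBH = c(40, 48.26), Ht = c(20, 25.908))

# One optimize call per row, so every call sees scalar dbh and ht
trees$Opt <- mapply(height_at_diameter, dbh = trees$DBH, ht = trees$Ht,
                    MoreArgs = list(target = 10.16))
trees
```

With the real function you would pass the row's species the same way as dbh and ht, converted with as.character() if the column is a factor.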

Find minimums with R (1 Variable X, n times a fixed parameter U)

I'm trying to minimize a function f(X,U) = (X*log(X)-1/(1-U))^2
where U=(U_1,...,U_n) ~ U(0,1), that means I have n amount of fixed U's and want to find the min of:
(x_1*ln(x_1)-1/(1-u_1))^2
(x_2*ln(x_2)-1/(1-u_2))^2
......
(x_n*ln(x_n)-1/(1-u_n))^2
For that, I wanted to use the optim function.
I have defined:
n <- 10^3
U <- sort(runif(n,min=0,max=1))
X <- c()
Xsolution<- c()
f <- function(X,U){
return(-(X*log(X)-(1/(1-U)))^2)
} #-, because min(f) = max(-f)
Now I have no idea how to do this with optim(). I always get the following error for the following code:
for(i in 1:n){
Xsolution[i] <- optim(f(X,U[i]))
}
Error in log(X) : non-numeric argument to mathematical function
Sidenote: I would welcome a method without a for-loop, since for great n, it will take too long. Maybe you can help me get it work with sapply? Or an alternative way?
Alternatively, I thought I got it working with optimize(..., maximum = FALSE, ...):
f <- function (X, a) ((X*log(X)-(1/(1-a)))^2)
for (i in 1:n){
xmin[i] <- optimize(f, c(0, 10000), tol = 0.0001, a = U[i])
}
This doesn't work properly either...
Also, the problem may be that it will take tooooo long. I want to do it with n=10^6. But I'm quite sure there has to be a way doing it without a for-loop? I think the for-loop is the problem that makes this take ages. Please help me, I've been sitting on this problem for ages and it's quite frustrating.
Since X * log(X) = 1 / (1 - U[i]) can be solved numerically for any U[i], each (X*log(X) - 1/(1-U[i]))^2 can be driven to zero, so there is one solution for each distinct U[i]. If the U[i] are all distinct, as is typical, that means there are length(U) solutions. They are given by (you can omit the unique if the U[i] are all distinct):
f <- function (X, a) ((X*log(X)-(1/(1-a)))^2)
unique(sapply(U, function(a) optimize(f, c(0, 1000000), a = a)$minimum))
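If speed at n = 10^6 is the main worry, one alternative (not what the code above does) is to solve X*log(X) = 1/(1-U) for all U at once with a vectorized Newton iteration, so the only loop is over a handful of iterations rather than over the elements. This is a sketch under the assumption that every target value is at least 1, which holds for U in [0, 1); solve_xlogx and its arguments are names invented here.

```r
# Solve x * log(x) = target elementwise by Newton's method.
# Assumes all(target >= 1). Starting at target + 1 puts every element to the
# right of its root, and x*log(x) is increasing and convex there, so the
# iterates decrease toward the root and log(x) + 1 stays positive.
solve_xlogx <- function(target, tol = 1e-12, max_iter = 100) {
  x <- target + 1
  for (k in seq_len(max_iter)) {
    step <- (x * log(x) - target) / (log(x) + 1)
    x <- x - step
    if (all(abs(step) <= tol * abs(x))) break
  }
  x
}

U <- c(0, 0.25, 0.5, 0.9, 0.999)
target <- 1 / (1 - U)
X <- solve_xlogx(target)

# Relative residuals; these should be at rounding-error level
abs(X * log(X) - target) / target
```

The same call works unchanged on a vector produced by runif(10^6).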

Compute multiple Integral and plot them (with R)

I'm having trouble computing and then plotting multiple integrals. It would be great if you could help me.
So I have this function
> f = function(x, mu = 30, s = 12){dnorm(x, mu, s)}
which I want to integrate multiple times, from each z in 1:100 to +Inf, and then plot with x = z and y = auc:
> auc = integrate(f, z, Inf)
R returns:
Warning message:
In if (is.finite(lower)) { :
the condition has length > 1 and only the first element will be used
I have tested to do a loop :
while(z < 100){
z = 1
auc = integrate(f,z,Inf)
z = z+1}
Doesn't work either ... don't know what to do
(I'm new to R , so I'm already sorry if it is really easy .. )
Thanks for your help :) !
There is no need to do the integrating by hand. pnorm gives the integral from negative infinity to the input for the normal density. You can get the upper tail instead by modifying the lower.tail parameter
z <- 1:100
y <- pnorm(z, mean = 30, sd = 12, lower.tail = FALSE)
plot(z, y)
If you're looking to integrate more complex functions then using integrate will be necessary - but if you're just looking to find probabilities for distributions then there will most likely be a function built in that does the integration for you directly.
Your problem is actually somewhat subtle, and in a certain sense gets to the core of how R works, so here is a slightly longer explanation.
R is a "vectorized" language, which means that just about everything works on vectors. If I have 2 vectors A and B, then A+B is the element-by-element sum of A and B. Nearly all R functions work this way also. If X is a vector, then Y <- exp(X) is also a vector, where each element of Y is the exponential of the corresponding element of X.
The function integrate(...) is one of the few functions in R that is not vectorized. So when you write:
f <- function(x, mu = 30, s = 12){dnorm(x, mu, s)}
auc <- integrate(f, z, Inf)
the integrate(...) function does not know what to do with z when it is a vector. So it takes the first element and complains. Hence the warning message.
There is a special function in R, Vectorize(...) that turns scalar functions into vectorized functions. You would use it this way:
f <- function(x, mu = 30, s = 12){dnorm(x, mu, s)}
auc <- Vectorize(function(z) integrate(f,z,Inf)$value)
z <- 1:100
plot(z,auc(z), type="l") # plot lines
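As a quick cross-check of the two answers, here is a small sketch that does the integration one lower limit at a time with sapply (an alternative to Vectorize) and compares the result with pnorm. The names auc_numeric, auc_exact and max_diff are made up for this example.

```r
f <- function(x, mu = 30, s = 12) dnorm(x, mu, s)
z <- 1:100

# One integrate() call per lower limit, since integrate() expects scalar limits
auc_numeric <- sapply(z, function(lo) integrate(f, lo, Inf)$value)

# Upper-tail probabilities straight from the normal distribution function
auc_exact <- pnorm(z, mean = 30, sd = 12, lower.tail = FALSE)

# Largest disagreement between the two approaches
max_diff <- max(abs(auc_numeric - auc_exact))
max_diff
```

Both vectors can be passed to plot(z, ...) in the same way as above.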

computing an intergral with multiple variables in R

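Hi, I have an equation like the following that I want to calculate.
The equation, as implemented in the code below, is: c(t, x) = \pi^{-1/2} e^{2x} \int_0^t \tau^{-1/2} e^{-x^2/\tau - \tau} \, d\tau
In this equation x is an array from 0 to 500.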
The value of t = 500 i.e upper limit of the integration.
Now I want to compute c as c(500,x).
The code that I have written so far is as follows:
x <- seq(from=0,by=0.5,length=1000)
t=500
integrand <- function(t)t^(-0.5)*exp((-x^2/t)-t)
integrated <- integrate(integrand, lower=0, upper=t)
final <- pi^(-0.5)*exp(2*x)*integrated
The error I get is as follows:
Error in integrate(integrand, lower = 0, upper = t) :
evaluation of function gave a result of wrong length
In addition: Warning messages:
1: In -x^2/t :
longer object length is not a multiple of shorter object length
2: In -x^2/t - t :
longer object length is not a multiple of shorter object length
3: In t^(-0.5) * exp(-x^2/t - t) :
longer object length is not a multiple of shorter object length
But it doesn't work because there is a variable x inside the integrand which is an array. Can anyone suggest how I can compute the integration first and then calculate the total expression for each value of x? If I change the value of x in the integrand to a constant I can compute the integration, but I want to compute it for all the values of x from 0 to 500.
Thank you so much.
Well, here is some code, but it blows up once x goes past about 353:
Cfun <- function(XX, upper){
integrand <- function(x)x^(-0.5)*exp((-XX^2/x)-x)
integrated <- integrate(integrand, lower=0, upper=upper)$value
(final <- pi^(-0.5)*exp(2*XX)*integrated) }
sapply(1:400, Cfun, upper=500)
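The blow-up for large x is probably a floating-point issue: exp(2*XX) grows toward overflow while the integral shrinks toward zero. One possible workaround, sketched below and not taken from the code above, is to move the exp(2*XX) factor inside the integrand, using 2*XX - XX^2/s - s = -(XX - s)^2/s, so nothing huge or tiny is ever formed. Cfun_stable is a made-up name and s is just the integration variable.

```r
# Same quantity as Cfun, with exp(2*XX) folded into the integrand.
# Identity used: 2*XX - XX^2/s - s == -(XX - s)^2 / s
Cfun_stable <- function(XX, upper) {
  integrand <- function(s) s^(-0.5) * exp(-(XX - s)^2 / s)
  pi^(-0.5) * integrate(integrand, lower = 0, upper = upper)$value
}

x_vals <- c(1, 100, 400)
c_vals <- sapply(x_vals, Cfun_stable, upper = 500)
c_vals
```

As a sanity check, with an infinite upper limit the original integral has the closed form sqrt(pi)*exp(-2*x) for x > 0, so these values should sit close to 1 whenever the upper limit is well past x. Keeping the integrand at order one also means integrate's default tolerances are measured against a sensible scale.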
I'd put the loop over values for x outside the integration. Iterate over the x-values and perform the integration for each one inside. Then you'll have C(x) as a function of x suitable for plotting.
You realize, of course, that the indefinite integral can be evaluated:
http://www.wolframalpha.com/input/?i=integrate+exp%28-%28c%2Bt%5E2%29%2Ft%29%2Fsqrt%28t%29
Maybe that will help you see what the answer looks like before you get started.
