Which trigonometric function is most appropriate? - r

I'm trying to figure out which trigonometric function is most appropriate for image below.
What I have tried:
function(x,theta){
theta*acos(theta+1)*x
}
However, I'm not sure if this takes it to the 'power of' like in the image:

This is just an oridinary cosine being raised to a power. It is written that way by convention to save parentheses and to make it clear that it is the result of applying the cosine which is raised to the power (rather than raising the argument to a power prior to taking its cosine). Use:
f <- function(x,theta){
theta*cos(x)^(theta + 1)
}

Related

Constructing Taylor Series from a Recursive function in Pari-GP

This is a continuation of my questions:
Declaring a functional recursive sequence in Matlab
Is there a more efficient way of nesting logarithms?
Nesting a specific recursion in Pari-GP
But I'll keep this question self contained. I have made a coding project for myself; which is to program a working simple calculator for a tetration function I've constructed. This tetration function is holomorphic, and stated not to be Kneser's solution (as to all the jargon, ignore); long story short, I need to run the numbers; to win over the nay-sayers.
As to this, I have to use Pari-GP; as this is a fantastic language for handling large numbers and algebraic expressions. As we are dealing with tetration (think numbers of the order e^e^e^e^e^e); this language is, of the few that exist, the best for such affairs. It is the favourite when doing iterated exponential computations.
Now, the trouble I am facing is odd. It is not so much that my code doesn't work; it's that it's overflowing because it should over flow (think, we're getting inputs like e^e^e^e^e^e; and no computer can handle it properly). I'll post the first batch of code, before I dive deeper.
The following code works perfectly; and does everything I want. The trouble is with the next batch of code. This produces all the numbers I want.
\\This is the asymptotic solution to tetration. z is the variable, l is the multiplier, and n is the depth of recursion
\\Warning: z with large real part looks like tetration; and therefore overflows very fast. Additionally there are singularities which occur where l*(z-j) = (2k+1)*Pi*I.
\\j,k are integers
beta_function(z,l,n) =
{
my(out = 0);
for(i=0,n-1,
out = exp(out)/(exp(l*(n-i-z)) +1));
out;
}
\\This is the error between the asymptotic tetration and the tetration. This is pretty much good for 200 digit accuracy if you need.
\\modify the 0.000000001 to a bigger number to make this go faster and receive less precision. When graphing 0.0001 is enough
\\Warning: This will blow up at some points. This is part of the math; these functions have singularities/branch cuts.
tau(z,l,n)={
if(1/real(beta_function(z,l,n)) <= 0.000000001, //this is where we'll have problems; if I try to grab a taylor series with this condition we error out
-log(1+exp(-l*z)),
log(1 + tau(z+1,l,n)/beta_function(z+1,l,n)) - log(1+exp(-l*z))
)
}
\\This is the sum function. I occasionally modify it; to make better graphs, but the basis is this.
Abl(z,l,n) = {
beta_function(z,l,n) + tau(z,l,n)
}
Plugging this in, you get the following expressions:
Abl(1,log(2),100)
realprecision = 28 significant digits (20 digits displayed)
%109 = 0.15201551563214167060
exp(Abl(0,log(2),100))
%110 = 0.15201551563214167060
Abl(1+I,2+0.5*I,100)
%111 = 0.28416643148885326261 + 0.80115283113944703984*I
exp(Abl(0+I,2+0.5*I,100))
%112 = 0.28416643148885326261 + 0.80115283113944703984*I
And so on and so forth; where Abl(z,l,n) = exp(Abl(z-1,l,n)). There's no problem with this code. Absolutely none at all; we can set this to 200 precision and it'll still produce correct results. The graphs behave exactly as the math says they should behave. The problem is, in my construction of tetration (the one we actually want); we have to sort of paste together the solutions of Abl(z,l,n) across the value l. Now, you don't have to worry about any of that at all; but, mathematically, this is what we're doing.
This is the second batch of code; which is designed to "paste together" all these Abl(z,l,n) into one function.
//This is the modified asymptotic solution to the Tetration equation.
beta(z,n) = {
beta_function(z,1/sqrt(1+z),n);
}
//This is the Tetration function.
Tet(z,n) ={
if(1/abs(beta_function(z,1/sqrt(1+z),n)) <= 0.00000001,//Again, we see here this if statement; and we can't have this.
beta_function(z,1/sqrt(1+z),n),
log(Tet(z+1,n))
)
}
This code works perfectly for real-values; and for complex values. Some sample values,
Tet(1+I,100)
%113 = 0.12572857262453957030 - 0.96147559586703141524*I
exp(Tet(0+I,100))
%114 = 0.12572857262453957030 - 0.96147559586703141524*I
Tet(0.5,100)
%115 = -0.64593666417664607364
exp(Tet(0.5,100))
%116 = 0.52417133958039107545
Tet(1.5,100)
%117 = 0.52417133958039107545
We can also effectively graph this object on the real-line. Which just looks like the following,
ploth(X=0,4,Tet(X,100))
Now, you may be asking; What's the problem then?
If you try and plot this function in the complex plane, it's doomed to fail. The nested logarithms produce too many singularities near the real line. For imaginary arguments away from the real-line, there's no problem. And I've produced some nice graphs; but the closer you get to the real line; the more it misbehaves and just short circuits. You may be thinking; well then, the math is wrong! But, no, the reason this is happening is because Kneser's tetration is the only tetration that is stable about the principal branch of the logarithm. Since this tetration IS NOT Kneser's tetration, it's inherently unstable about the principal branch of the logarithm. Of course, Pari just chooses the principal branch. So when I do log(log(log(log(log(beta(z+5,100)))))); the math already says this will diverge. But on the real line; it's perfectly adequate. And for values of z with an imaginary argument away from zero, we're fine too.
So, how I want to solve this, is to grab the Taylor series at Tet(1+z,100); which Pari-GP is perfect for. The trouble?
Tet(1+z,100)
*** at top-level: Tet(1+z,100)
*** ^------------
*** in function Tet: ...unction(z,1/sqrt(1+z),n))<=0.00000001,beta_fun
*** ^---------------------
*** _<=_: forbidden comparison t_SER , t_REAL.
The numerical comparison I've done doesn't translate to a comparison between t_SER and t_REAL.
So, my question, at long last: what is an effective strategy at getting the Taylor series of Tet(1+z,100) using only real inputs. The complex inputs near z=0 are erroneous; the real values are not. And if my math is right; we can take the derivatives along the real-line and get the right result. Then, we can construct a Tet_taylor(z,n) which is just the Taylor Series expansion. Which; will most definitely have no errors when trying to graph.
Any help, questions, comments, suggestions--anything, is greatly appreciated! I really need some outside eyes on this.
Thanks so much if you got to the bottom of this post. This one is bugging me.
Regards, James
EDIT:
I should add that a Tet(z+c,100) for some number c is the actual tetration function we want. There is a shifting constant I haven't talked about yet. Nonetheless; this is spurious to the question, and is more a mathematical point.
This is definitely not an answer - I have absolutely no clue what you are trying to do. However, I see no harm in offering suggestions. PARI has a built in type for power series (essentially Taylor series) - and is very good at working with them (many operations are supported). I was originally going to offer some suggestions on how to get a Taylor series out of a recursive definition using your functions as an example - but in this case, I'm thinking that you are trying to expand around a singularity which might be doomed to failure. (On your plot it seems as x->0, the result goes to -infinity???)
In particular if I compute:
log(beta(z+1, 100))
log(log(beta(z+2, 100)))
log(log(log(beta(z+3, 100))))
log(log(log(log(beta(z+4, 100)))))
...
The different series are not converging to anything. Even the constant term of the series is getting smaller with each iteration, so I am not entirely sure there is even a Taylor series expansion about x = 0.
Questions/suggestions:
Should you be expanding about a different point? (say where the curve
crosses the x-axis).
Does the Taylor series satisfy some recursive relation? For example: A(z) = log(A(z+1)). [This doesn't work, but perhaps there is another way to write it].
I suspect my answer is unlikely to be satisfactory - but then again your question is more mathematical than a practical programming problem.
So I've successfully answered my question. I haven't programmed in so long; I'm kind of shoddy. But I figured it out after enough coffee. I created 3 new functions, which allow me to grab the Taylor series.
\\This function attempts to find the number of iterations we need.
Tet_GRAB_k(A,n) ={
my(k=0);
while( 1/real(beta(A+k,n)) >= 0.0001, k++);
return(k);
}
\\This function will run and produce the same results as Tet; but it's slower; but it let's us estimate Taylor coefficients.
\\You have to guess which k to use for whatever accuracy before overflowing; which is what the last function is good for.
Tet_taylor(z,n,k) = {
my(val = beta(z+k,n));
for(i=1,k,val = log(val));
return(val);
}
\\This function produces an array of all the coefficients about a value A.
TAYLOR_SERIES(A,n) = {
my(ser = vector(40,i,0));
for(i=1,40, ser[i] = polcoeff(Tet_taylor(A+z,n,Tet_GRAB_k(A,n)),i-1,z));
return(ser);
}
After running the numbers, I'm confident this works. The Taylor series is converging; albeit rather slowly and slightly less accurately than desired; but this will have to do.
Thanks to anyone who read this. I'm just answering this question for completeness.

Recursive arc-length reparameterization of an arbitrary curve

I have a 3D parametric curve defined as P(t) = [x(t), y(t), z(t)].
I'm looking for a function to reparametrize this curve in terms of arc-length. I'm using OpenSCAD, which is a declarative language with no variables (constants only), so the solution needs to work recursively (and with no variables aside from global constants and function arguments).
More precisely, I need to write a function Q(s) that gives the point on P that is (approximately) distance s along the arc from the point where t=0. I already have functions for numeric integration and derivation that can be incorporated into the answer.
Any suggestions would be greatly appreciated!
p.s It's not possible to pass functions as a parameter in OpenSCAD, I usually get around this by just using global declarations.
The length of an arc sigma between parameter values t=0 and t=T can be computed by solving the following integral:
sigma(T) = Integral[ sqrt[ x'(t)^2 + y'(t)^2 + z'(t)^2 ],{t,0,T}]
If you want to parametrize your curve with the arc-length, you have to invert this formula. This is unfortunately rather difficult from a mathematics point of view. The simplest method is to implement a simple bisection method as a numeric solver. The computation method quickly becomes heavy so reusing previous results is ideal. The secant method is also useful as the derivative of sigma(t) is already known and equals
sigma'(t) = sqrt[ x'(t)^2 + y'(t)^2 + z'(t)^2]
Maybe not really the most helpful answer, but I hope it gives you some ideas. I cannot help you with the OpenSCad implementation.

R Optimization: Pass value from function to gradient with each iteration

I have a function that I am optimizing using the optimx function in R (I'm also open to using optim, since I'm not sure it will make a difference for what I'm trying to do). I have a gradient that I am passing to optimx for (hopefully) faster convergence compared to not using a gradient. Both the function and the gradient use many of the same quantities that are computed from each new parameter set. One of these quantities in particular is very computationally costly, and it's redundant to have to compute this quantity twice for each iteration - once for the function, and again for the gradient. I'm trying to find a way to compute this quantity once, then pass it to the function and the gradient.
So here is what I am doing. So far this works, but it is inefficient:
optfunc<-function(paramvec){
quant1<-costlyfunction(paramvec)
#costlyfunction is a separate function that takes a while to run
loglikelihood<-sum(quant1)**2
#not really squared, but the log likelihood uses quant1 in its calculation
return(loglikelihood)
}
optgr<-function(paramvec){
quant1<-costlyfunction(paramvec)
mygrad<-sum(quant1) #again not the real formula, just for illustration
return(mygrad)
}
optimx(par=paramvec,fn=optfunc,gr=optgr,method="BFGS")
I am trying to find a way to calculate quant1 only once with each iteration of optimx. It seems the first step would be to combine fn and gr into a single function. I thought the answer to this question may help me, and so I recoded the optimization as:
optfngr<-function(){
quant1<-costlyfunction(paramvec)
optfunc<-function(paramvec){
loglikelihood<-sum(quant1)**2
return(loglikelihood)
}
optgr<-function(paramvec){
mygrad<-sum(quant1)
return(mygrad)
}
return(list(fn = optfunc, gr = optgr))
}
do.call(optimx, c(list(par=paramvec,method="BFGS",optfngr() )))
Here, I receive the error: "Error in optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower, : Cannot evaluate function at initial parameters." Of course, there are obvious problems with my code here. So, I'm thinking answering any or all of the following questions may shed some light:
I passed paramvec as the only arguments to optfunc and optgr so that optimx knows that paramvec is what needs to be iterated over. However, I don't know how to pass quant1 to optfunc and optgr. Is it true that if I try to pass quant1, then optimx will not properly identify the parameter vector?
I wrapped optfunc and optgr into one function, so that the quantity quant1 will exist in the same function space as both functions. Perhaps I can avoid this if I can find a way to return quant1 from optfunc, and then pass it to optgr. Is this possible? I'm thinking it's not, since the documentation for optimx is pretty clear that the function needs to return a scalar.
I'm aware that I might be able to use the dots arguments to optimx as extra parameter arguments, but I understand that these are for fixed parameters, and not arguments that will change with each iteration. Unless there is also a way to manipulate this?
Thanks in advance!
Your approach is close to what you want, but not quite right. You want to call costlyfunction(paramvec) from within optfn(paramvec) or optgr(paramvec), but only when paramvec has changed. Then you want to save its value in the enclosing frame, as well as the value of paramvec that was used to do it. That is, something like this:
optfngr<-function(){
quant1 <- NULL
prevparam <- NULL
updatecostly <- function(paramvec) {
if (!identical(paramvec, prevparam)) {
quant1 <<- costlyfunction(paramvec)
prevparam <<- paramvec
}
}
optfunc<-function(paramvec){
updatecostly(paramvec)
loglikelihood<-sum(quant1)**2
return(loglikelihood)
}
optgr<-function(paramvec){
updatecostly(paramvec)
mygrad<-sum(quant1)
return(mygrad)
}
return(list(fn = optfunc, gr = optgr))
}
do.call(optimx, c(list(par=paramvec,method="BFGS"),optfngr() ))
I used <<- to make assignments to the enclosing frame, and fixed up your do.call second argument.
Doing this is called "memoization" (or "memoisation" in some locales; see http://en.wikipedia.org/wiki/Memoization), and there's a package called memoise that does it. It keeps track of lots of (or all of?) the previous results of calls to costlyfunction, so would be especially good if paramvec only takes on a small number of values. But I think it won't be so good in your situation because you'll likely only make a small number of repeated calls to costlyfunction and then never use the same paramvec again.

R optim same function for fn and gr

I would like to use optim() to optimize a cost function (fn argument), and I will be providing a gradient (gr argument). I can write separate functions for fn and gr. However, they have a lot of code in common and I don't want the optimizer to waste time repeating those calculations. So is it possible to provide one function that computes both the cost and the gradient? If so, what would be the calling syntax to optim()?
As an example, suppose the function I want to minimize is
cost <- function(x) {
x*exp(x)
}
Obviously, this is not the function I'm trying to minimize. That's too complicated to list here, but the example serves to illustrate the issue. Now, the gradient would be
grad <- function(x) {
(x+1)*exp(x)
}
So as you can see, the two functions, if called separately, would repeat some of the work (in this case, the exponential function). However, since optim() takes two separate arguments (fn and gr), it appears there is no way to avoid this inefficiency, unless there is a way to define a function like
costAndGrad <- function(x) {
ex <- exp(x)
list(cost=x*ex, grad=(x+1)*ex)
}
and then pass that function to optim(), which would need to know how to extract the cost and gradient.
Hope that explains the problem. Like I said my function is much more complicated, but the idea is the same: there is considerable code that goes into both calculations (cost and gradient), which I don't want to repeat unnecessarily.
By the way, I am an R novice, so there might be something simple that I'm missing!
Thanks very much
The nlm function does optimization and it expects the gradient information to be returned as an attribute to the value returned as the original function value. That is similar to what you show above. See the examples in the help for nlm.

Is this a correct way to find the derivative of the sigmoid function in python?

I came up with this code:
def DSigmoid(value):
return (math.exp(float(value))/((1+math.exp(float(value)))**2))
a.) Will this return the correct derivative?
b.) Is this an efficient method?
Friendly regards,
Daquicker
Looks correct to me. In general, two good ways of checking such a derivative computation are:
Wolfram Alpha. Inputting the sigmoid function 1/(1+e^(-t)), we are given an explicit formula for the derivative, which matches yours. To be a little more direct, you can input D[1/(1+e^(-t)), t] to get the derivative without all the additional information.
Compare it to a numerical approximation. In your case, I will assume you already have a function Sigmoid(value). Taking
Dapprox = (Sigmoid(value+epsilon) - Sigmoid(value)) / epsilon
for some small epsilon and comparing it to the output of your function DSigmoid(value) should catch all but the tiniest errors. In general, estimating the derivative numerically is the best way to double check that you've actually coded the derivative correctly, even if you're already sure about the formula, and it takes almost no effort.
In case numerical stability is an issue, there is another possibility: provided that you have a good implementation of the sigmoid available (such as in scipy) you can implement it as:
from scipy.special import expit as sigmoid
def sigmoid_grad(x):
fx = sigmoid(x)
return fx * (1 - fx)
Note that this is mathematically equivalent to the other expression.
In my case this solution worked, while the direct implementation caused floating point overflows when computing exp(-x).

Resources