I am new to R and trying to understand this function call below:
mle_mos <- function (n, m1, m2, x, x1, x2, initial, iters)
  mvnewton (function (p) log_likelihood_gradient(p,n,m1,m2,x,x1,x2),
            function (p) -fisher_information(p,n,m1,m2,x,x1,x2),
            initial, iters)
mvnewton <- function (f, g, x, n)
{
  print(f)
  if (n < 1)
    stop("invalid number of iterations")
  for (i in 1:n) {
    cat("\nAt iteration",i,":\n\n")
    cat("x =",x,"\n")
    cat("f(x) =",f(x),"\n")
    cat("g(x) =\n")
    print(g(x))
    x <- x - solve(g(x),f(x))  # computes inverse of g(x) times f(x)
  }
  x
}
mvnewton takes
function (p) log_likelihood_gradient(p,n,m1,m2,x,x1,x2)
as its first argument. Where is p getting its value from? Where is it defined? Can someone explain where p gets a value to be passed into log_likelihood_gradient as a parameter? I have been googling for the past two days and reading up a lot of stuff, but I am still not sure I understand this properly.
Arguments
If we have a function fun
fun <- function(z) z+1
and then we call it
fun(1)
1 is said to be the actual argument and z is said to be the formal argument. The formal argument takes on the value of the actual argument so z takes on the value 1. (There are some technicalities which we have glossed over but these do not affect the situation in the question.)
mvnewton
When mvnewton is called the caller passes to it four actual arguments:
function (p) log_likelihood_gradient(p,n,m1,m2,x,x1,x2)
function (p) -fisher_information(p,n,m1,m2,x,x1,x2)
initial
iters
Note that the first two arguments are functions. It does not pass the result of calling the functions but passes the functions themselves.
Now within mvnewton these 4 actual arguments correspond to the formal arguments f, g, x and n so when mvnewton refers to f it is really referring to
function (p) log_likelihood_gradient(p,n,m1,m2,x,x1,x2)
Also within mvnewton the function f is called with actual argument x and x corresponds to formal argument p within f (and similarly for g).
Example
For example, suppose we call
f <- function(x) x^2
g <- function(x) 2*x
mvnewton(f, g, 1, 20)
Then x is 1 and the first time f(x) and g(x) are called within mvnewton they are called as f(1) and g(1) and within f and g the formal argument p takes the value 1. mvnewton then updates x to a new value, calls f and g again with the new value of x and so on.
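To see the whole mechanism run end to end, here is a minimal sketch of that example with mvnewton's diagnostic printing stripped out (the update rule is unchanged):

```r
# Minimal Newton iteration: same update rule as mvnewton above,
# without the diagnostic printing.
mvnewton_quiet <- function(f, g, x, n) {
  for (i in 1:n) {
    x <- x - solve(g(x), f(x))  # inverse of g(x) times f(x)
  }
  x
}

f <- function(x) x^2   # plays the role of the gradient function
g <- function(x) 2*x   # plays the role of its derivative
mvnewton_quiet(f, g, 1, 20)
# [1] 9.536743e-07
```

Each step computes x - x^2/(2*x) = x/2, so after 20 iterations x is 1/2^20, converging toward the root at 0. Inside f and g, the formal argument is bound to whatever value x holds at that iteration.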
Related
What I am about to explain is kinda tricky, but I hope I can explain it clearly.
Suppose you have a function that does the Hodrick-Prescott detrending, which is pretty much this:
The user picks the λ value, and so for every λ there exists a trend series τ(λ).
Suppose you pick a number near 0 (on the positive side).
This number is V; for this case suppose V = 0.0001278846.
Then you want to compute this:
(I have the function that does this.)
But you want to find a λ so that F(λ) = V.
How can I accomplish this?
I was trying to write a while statement but could not state it correctly, so I made a for loop with an if statement to break the for loop when F(λ) - V = 0.
This is what my for loop looks like:
for(L in 1:3500){
  F_ <- find_v(dataa, L)
  if((F_-V)==0){
    print(paste("The λ value for this series following Rule 1 is:", L))
    break
  }
  cat(paste("The λ =",L,"has a (F-V) difference of:", (F_-V),"\n"))
}
where dataa is my data composed of 89 obs.
Using this for loop I see that (F-V) turns negative between L = 3276 and L = 3277.
Is there a better way to do it? Like optimization?
Because using the for loop it feels like I'm obtaining the optimal λ by the force.
Sorry for not including my data or my code for the Hodrick-Prescott detrending or the find_v function; they are way too long.
Since you are doing double optimization, consider the following:
The data
set.seed(0)
y <- rnorm(89)
The function to be optimized:
lfun <- function(tau, y, lambda){
  n <- length(tau)
  tt <- tau[-(1:2)] - 2 * tau[-c(1, n)] + head(tau, -2)
  sum((y-tau)^2) + lambda * sum(tt^2)
}
The F function:
f_lambda <- function(lambda, y, V = 0){
  tau <- optim(y, lfun, y = y, lambda = lambda, method = 'BFGS')$par
  tt <- tail(tau,-2) - 2 * head(tau[-1], -1) + head(tau, -2)
  sqrt((sum((y-tau)^2)/sum(tt^2) - V)^2)
}
Optimizing the F function:
optim(0.1, f_lambda, y = y, V=0.0001278846, method="Brent",lower=0, upper=100)
$par
[1] 0.003412824
$value
[1] 2.633131e-10
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
Now lambda = 0.003412824 gives the desired V, i.e.:
f_lambda(0.003412824, y)
[1] 0.0001278843
Which is very close to the V=0.0001278846 you started with.
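Since the question notes that (F - V) changes sign between two consecutive λ values, root-finding is another option. Here is a minimal sketch (an alternative to the double optimization, assuming F(λ) - V is monotone across the bracketing interval), self-contained with the same simulated data and lfun as above, but returning the signed difference so uniroot can bracket the root:

```r
set.seed(0)
y <- rnorm(89)

lfun <- function(tau, y, lambda){
  n <- length(tau)
  tt <- tau[-(1:2)] - 2 * tau[-c(1, n)] + head(tau, -2)
  sum((y-tau)^2) + lambda * sum(tt^2)
}

# Signed version of the F function: negative below the root, positive above.
f_lambda_signed <- function(lambda, y, V){
  tau <- optim(y, lfun, y = y, lambda = lambda, method = 'BFGS')$par
  tt <- tail(tau,-2) - 2 * head(tau[-1], -1) + head(tau, -2)
  sum((y-tau)^2)/sum(tt^2) - V
}

# uniroot needs an interval over which the sign changes.
uniroot(f_lambda_signed, interval = c(1e-3, 0.1),
        y = y, V = 0.0001278846)$root
```

This finds the λ where F(λ) = V directly, instead of minimizing the squared difference.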
I have been reading Hadley Wickham's Advanced R in order to gain a better understanding of the mechanisms of R and how it works behind the scene. I have so far enjoyed it and everything is quite clear. There is one question that occupies my mind for which I have not yet found an explanation.
I am quite familiar with the scoping rules of R which determine how values are assigned to FREE VARIABLES. However, I have been grappling with the question of why R cannot find the value of a formal argument through lexical scoping in the first case. Consider the following example:
y <- 4
f1 <- function(x = 2, y) {
  x*2 + y
}
f1(x = 3)
It normally throws an error because I didn't assign a default value for the argument y. However, if I create a local variable y in the body of the function, it won't throw any error:
f1 <- function(x = 2, y) {
  y <- 4
  x*2 + y
}
f1(x = 3)
I also read in Professor Matloff's book that arguments act like local variables, which is why this question remains a mystery to me.
Here, too, there is no error, and it is quite clear why:
y <- 2
f2 <- function(x = 2) {
  x*2 + y
}
f2()
Thank you very much in advance.
Note that R will only throw the error when you go to use the variable. If you had
f1 <- function(x = 2, y) {
  x*2 + 5
}
f1(x = 3)
# [1] 11
everything would be fine. That's because the parameter is a "promise" which isn't resolved till you actually use it. This allows you to do things like
f1 <- function(x = 2, y=x+5) {
  x*2 + y
}
f1(x = 3)
# [1] 14
Where the y value will actually use the value of x that's passed to the function when the promise is evaluated. Furthermore you can also do
f1 <- function(x = 2, y=z+2) {
  z <- x + 10
  x*2 + y
}
f1(x = 3)
# [1] 21
Where y is able to take the value of z that didn't even exist when the function was called. Again this is because the parameter values are promises and are only evaluated when they are actually used. They have access to all the values in the environment when they are evaluated. But note that this only works because default parameter values are evaluated in the context of the function body. This is different than when you pass in a value to a function. In that case the value is evaluated in the calling environment, not the local function body. So you can't do
f1(x = 3, y=z+2)
# Error in f1(x = 3, y = z + 2) : object 'z' not found
The reason you get the error in your first function is that a value for y does not exist when you try to use it in x*2 + y. Since you've defined y as a parameter, it is no longer a "free" variable and will not be looked up in the parent scope. You don't get an error in your second function because you've re-bound y to a local function variable, so you never use the parameter value at all.
If you ran
f1 <- function(x = 2, y) {
  y <- 4
  x*2 + y
}
f1(x = 3, y=200)
# [1] 10
The 200 basically disappears. You no longer have access to that value after you reassign y. R does not check whether a variable already exists before redefining it, so nothing ever tries to evaluate the promise value y of the function parameter.
Arguments will act like local variables once the promise has been evaluated.
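A small sketch of that last point: a default value is evaluated at most once, after which the argument behaves like any local variable (the message call here is just for illustration, so you can see when the promise is forced):

```r
f <- function(y = {message("promise evaluated"); 4}) {
  a <- y   # the promise is forced here; the message prints once
  b <- y   # the cached value is reused; no second message
  y <- 99  # after evaluation, y can be rebound like a local variable
  a + b + y
}
f()
# promise evaluated
# [1] 107
```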
I'd argue that R is quite consistent here.
As long as y is not called, R does not bother with this thing called a promise.
f1 <- function(x = 2, y) {
  x*2 + 5
}
f1(x = 3)
# [1] 11
In your first example, y cannot be derived from the global environment since y is defined locally inside the function environment. Since the value is missing, an error is thrown:
y <- 4
f2 <- function(x = 2, y) {
  x*2 + y
}
f2(x = 3)
#> Error in f2(x = 3) : argument "y" is missing, with no default
This can lead to situations like this which can "hide" an error-prone setting until a user accidentally stumbles over it.
f3 <- function(x = 2, y) {
  if (x >= 2) {
    y <- 4
  }
  x * 2 + y
}
f3(x = 2)
#> 8
Such cases are hard to cover with unit tests, which is why you often see functions with force() at the top of their body.
f4 <- function(x = 2, y) {
  force(y)
  if (x > 2) {
    y <- 4
  }
  x * 2 + y
}
f4(x = 2)
#> Error in force(y) : argument "y" is missing, with no default
There is a nice blog post that covers lazy evaluation quite extensively.
I have to use recursion to produce pseudo random numbers. For fixed values a, b and c, I need to calculate:
x_{n+1} = (a * x_n + c) mod 2^b. Random numbers are obtained by the function R_n = x_n / 2^b. I need to save these R_n values to make a histogram. How can I make a function in R that uses its previous value x_n to produce x_{n+1}? I have made a start with my code; it's listed below.
a=5
b=4
c=3
k=10000
random <- function(x) {
  if(x<k){
    x = (a*x+c)%%2^b
    k++
  }
}
Here's a thought for starters,
random <- function(a = 5, b = 4, c = 3, k = 10000, x0 = 1) {
  x <- x0 # or some other sane default
  function(n = 1) {
    newx <- Reduce(function(oldx, ign) (a*oldx + c) %% (2^b), seq_len(n),
                   init = x, accumulate = TRUE)[-1]
    # if (x >= k)? do something else
    if (length(newx)) {
      x <<- newx[length(newx)]
      k <<- k + n
    }
    newx
  }
}
The premise is that the random function is a setup function that returns a function. This inner function has its a, b, c, k, and previous x variables stored within it.
thisran <- random()
thisran()
# [1] 8
thisran(3)
# [1] 11 10 5
I haven't studied creating PRNG in depth, but I'm inferring that x0 here is effectively your seed. I'm not certain why you had a if (x<k) conditional in your function; since k was never used otherwise, just incremented, I'm thinking it only serves as a termination indicator for your PRNG (so it is not infinite).
If need be, the current k value (and other variables, for that matter) can be peeked-at with
get("k", environment(thisran))
# [1] 10003
BTW: the use of Reduce might seem like an unnecessary complication, but it enables the ran(n) functionality, similar to other PRNGs in R. That is, one can do runif(7) for seven random numbers, and I thought it would be useful to do that here. The use of Reduce is required in that case since each calculation depends on the results from the previous calculation, so a sample replicate or sapply would not work (without some contrived coding that I wanted to avoid).
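For comparison, the same recurrence written as a plain loop (no closure) produces the identical sequence, which may be easier to follow at first and also yields the R_n values for the histogram directly:

```r
# Same linear congruential recurrence as above, written as a plain loop.
lcg <- function(n, a = 5, b = 4, c = 3, x0 = 1) {
  x <- numeric(n)
  prev <- x0
  for (i in seq_len(n)) {
    prev <- (a * prev + c) %% 2^b  # x_{i} = (a * x_{i-1} + c) mod 2^b
    x[i] <- prev
  }
  x
}

lcg(4)
# [1]  8 11 10  5

R_n <- lcg(10000) / 2^4  # the values to histogram, e.g. hist(R_n)
```

The first four values (8, 11, 10, 5) match the output of thisran() followed by thisran(3) above.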
evaluating an integral depending on two parameters
A <- 0.0004
B <- 0.0013
c <- 1.0780
f1 <- function(x) {
  f2 <- function(s) {
    A + B * c^(x+s)
  }
  return(f2)
}
tpx <- function(t,x) {
  exp(-integrate(f1(x), lower=0, upper=t)$value)
}
I get
> tpx(1,1)
[1] 0.9981463
for the ordered pair (1,1). When using the colon operator
> tpx(1,1:2)
[1] 0.9980945
Warning messages:
1: In x + s :
longer object length is not a multiple of shorter object length
2: In x + s :
longer object length is not a multiple of shorter object length
3: In x + s :
longer object length is not a multiple of shorter object length
Some warning messages occur. Is it possible to adjust the integration variable s? What causes the output right before the warning messages? It is neither
> tpx(1,1)
[1] 0.9981463
nor
> tpx(1,2)
[1] 0.998033
I suppose I am getting something wrong :S
You can use the following
sapply(1:2, tpx, t = 1)
Why?
The warnings are caused because integrate expects f to be (quoting from ?integrate):
an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.
But
> s = 1
> A + B * c^(1:2+s)
[1] 0.001910709 0.002028545
is of length 2 whereas s is of length 1.
sapply supplies one element at a time to f1 and combines the results afterwards. That's why it works.
Comment:
You can also simplify your f1, f2 and tpx function as follows:
f1 <- function(s, x, A, B, c){
  A + B * c^(x+s)
}
tpx <- function(x,t){
  exp(-integrate(f1, lower=0, upper=t, x, A, B, c)$value)
}
Again quoting from ?integrate
... - additional arguments to be passed to f.
Meaning that the parameters x, A, B, and c will be passed through to f1, and only the first argument (the integration variable s) will be varied by the integration.
Just do tpx = Vectorize(tpx); much easier than apply functions.
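A quick sketch of that suggestion, using the definitions from the question:

```r
A <- 0.0004
B <- 0.0013
c <- 1.0780

tpx <- function(t, x) {
  exp(-integrate(function(s) A + B * c^(x+s), lower = 0, upper = t)$value)
}

tpx_vec <- Vectorize(tpx)  # wraps tpx so vector arguments are handled element-wise
tpx_vec(1, 1:2)
# [1] 0.9981463 0.9980330
```

Vectorize builds a mapply wrapper around tpx, so each (t, x) pair is integrated separately and the length-recycling warnings disappear.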
This is my code. The kum.loglik function returns the negative log-likelihood and takes two arguments a and b. I need to find the a and b that minimize this function using the optim function. (n1, n2, n3 are pre-specified and passed to the optim function.)
kum.loglik = function(a, b, n1, n2, n3) {
  loglik = n1*log(b*beta(1+2/a,b)) + n2 * log(b*beta(1+2/a,b)-2*b*beta(1+1/a,b)+1) +
    n3 * log(b*beta(1+1/a,b)-b*beta(1+2/a,b))
  return(-loglik)
}
optim(par=c(1,1), kum.loglik, method="L-BFGS-B",
      n1=n1, n2=n2, n3=n3,
      control=list(ndeps=c(5e-4,5e-4)))
This code should work, but it gives the error message
Error in b * beta(1 + 2/a, b) : 'b' is missing
What is wrong in this code?
The problem is (straight from the optim help):
fn: A function to be minimized (or maximized), with first
argument the vector of parameters over which minimization is
to take place.
Your kum.loglik function needs to take a vector v which you pull the parameters out of, e.g.:
kum.loglik=function(v) { a = v[1]; b = v[2]; ...}
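A minimal runnable sketch of that fix, wrapping the original function so that optim sees a single parameter vector. The counts n1, n2, n3 and the box bounds here are made-up values for illustration only:

```r
kum.loglik <- function(a, b, n1, n2, n3) {
  loglik <- n1*log(b*beta(1+2/a,b)) + n2*log(b*beta(1+2/a,b)-2*b*beta(1+1/a,b)+1) +
    n3*log(b*beta(1+1/a,b)-b*beta(1+2/a,b))
  -loglik
}

# Hypothetical example counts; replace with the real data.
n1 <- 30; n2 <- 50; n3 <- 20

# The wrapper's first argument is the parameter vector, as optim requires;
# n1, n2, n3 still flow through optim's '...' to the wrapper.
fit <- optim(par = c(1, 1),
             fn = function(v, n1, n2, n3) kum.loglik(v[1], v[2], n1, n2, n3),
             n1 = n1, n2 = n2, n3 = n3,
             method = "L-BFGS-B", lower = c(0.05, 0.05), upper = c(50, 50),
             control = list(ndeps = c(5e-4, 5e-4)))
fit$par
```

The bounds are an assumption to keep L-BFGS-B in a numerically safe region; adjust them for your data.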
I always use the following; it gives you the best results:
p0 <- c(a,b) # example vector of starting values
m <- optim(p0, loglik, method="BFGS", control=list(fnscale=-1, trace=10),
           hessian=TRUE, x=data.frame)
# for table of results
rbind(m$par, sqrt(diag(solve(-m$hessian))))
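As a concrete, runnable sketch of that pattern (with a made-up normal log-likelihood and simulated data, since the original loglik and data are not shown; fnscale=-1 tells optim to maximize):

```r
set.seed(1)
x <- rnorm(200, mean = 5, sd = 2)

# Log-likelihood of a normal model; first argument is the parameter vector.
# Parametrized as (mean, log sd) so the search is unconstrained.
loglik <- function(p, x) sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))

p0 <- c(0, 0)  # starting values for (mean, log sd)
m <- optim(p0, loglik, method = "BFGS", control = list(fnscale = -1),
           hessian = TRUE, x = x)

# Table: estimates on top, standard errors from the inverse negative Hessian below.
rbind(m$par, sqrt(diag(solve(-m$hessian))))
```

The estimates should land near mean = 5 and log sd = log(2), and the second row gives the asymptotic standard errors.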