Trying to write a gradient descent algorithm in R - r

Implement the gradient descent algorithm in this question. Let
$\{X_1, \dots, X_n\}$ be a dataset and $g(x) = n^{-1} \sum_{i=1}^{n} (x - X_i)^2$. It is known that the
mean of the dataset is the solution to the following minimization
problem: $\min_{x \in \mathbb{R}} g(x)$.
To minimize $g(x)$, you are going to use a while loop to implement the
gradient descent algorithm, as follows.
Step 0. Initialize $x_1 = 0$.
Step 1. In the $k$th step, where $k = 1, 2, \dots$, set
$x_{k+1} = x_k - 0.99^k \times g'(x_k)$.
Step 2. Repeat Step 1 until $|g'(x_k)|$ is smaller than a small tolerance
level tol (e.g., set it to 1e-5) or if $k$ exceeds the maximum number of
iterations $K_{\max}$ (e.g., set it to 1000).
You are going to implement the gradient descent algorithm to find the
mean. Use the dataset cars$speed for $\{X_1, \dots, X_n\}$. You don't have to write
the algorithm into a function in this question; you are going to do
this in the next.
Could someone help me with this?
Here is what I have so far:
data(cars)
x1 <- 0
k <- 1
toleranceLevel <-0.00005
X <- cars$speed
kmax <- 10000
while(x1 > toleranceLevel){
gxprime <- 2 * mean(x1 - X)
gxprime
x1 <-(((x1)-(.99^k))*gxprime)
if(x1 < toleranceLevel){
k <- k + 1
} else {
}
if(k == kmax){
break
}
print(k)
}

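data(cars)
X <- cars$speed
x_old <- 0                 # Step 0: starting value
k <- 1
toleranceLevel <- 0.00005
kmax <- 10000
err <- 1                   # size of the latest update; starts above the tolerance
# Iterate until the update is smaller than the tolerance or kmax is reached
while (err > toleranceLevel && k < kmax) {
  # g'(x) = 2 * mean(x - X); the step size in iteration k is 0.99^k
  x_new <- x_old - 0.99^k * 2 * mean(x_old - X)
  err <- abs(x_new - x_old)
  x_old <- x_new
  k <- k + 1
}
x_new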
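For comparison, here is a minimal sketch that stops on the rule as the assignment states it, i.e. when |g'(x_k)| drops below tol or k exceeds Kmax, rather than on the size of the update. It assumes g'(x) = 2 * mean(x - X), which comes from differentiating g term by term. The names `g_prime`, `tol` and `k_max` are chosen here for illustration and do not appear in the original posts.

```r
X <- cars$speed                  # dataset from the assignment (ships with base R)

# Derivative of g(x) = mean((x - X)^2)
g_prime <- function(x) 2 * mean(x - X)

tol   <- 1e-5                    # tolerance suggested in the assignment
k_max <- 1000                    # maximum number of iterations suggested there

x <- 0                           # Step 0: x_1 = 0
k <- 1
# Step 2: continue while the gradient is still large and k is within budget
while (abs(g_prime(x)) >= tol && k <= k_max) {
  x <- x - 0.99^k * g_prime(x)   # Step 1: x_{k+1} = x_k - 0.99^k * g'(x_k)
  k <- k + 1
}

x          # gradient descent estimate
mean(X)    # sample mean, which the estimate should match closely
```

Because the step size 0.99^k shrinks with k, the early iterations overshoot the mean and oscillate around it before settling.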

Related

Derivative of a function with matrices and vectors for Newton-Raphson method

I've tried to find the roots of a nonlinear equation using the Newton-Raphson method.
The problem I'm stuck on is that I can't get the correct first derivative of the following equation:
y is the target variable for my prediction, X is a matrix of the predictor variables, Theta is a smoothing parameter.
I need to get the arg min of Q.
For that I want to use this Newton-Raphson implementation:
newton.raphson <- function(f, a, b, tol = 1e-5, n = 1000) {
require(numDeriv) # Package for computing f'(x)
x0 <- a # Set start value to supplied lower bound
k <- n # Initialize for iteration results
# Check the upper and lower bounds to see if approximations result in 0
fa <- f(a)
if (fa == 0.0) {
return(a)
}
fb <- f(b)
if (fb == 0.0) {
return(b)
}
for (i in 1:n) {
dx <- genD(func = f, x = x0)$D[1] # First-order derivative f'(x0)
x1 <- x0 - (f(x0) / dx) # Calculate next value x1
k[i] <- x1 # Store x1
if (abs(x1 - x0) < tol) {
root.approx <- tail(k, n=1)
res <- list('root approximation' = root.approx, 'iterations' = k)
return(res)
}
# If Newton-Raphson has not yet reached convergence set x1 as x0 and continue
x0 <- x1
}
print('Too many iterations in method')
}
Thanks in advance for your help!

Looping through functions in R

I'm trying to write a loop in R that should do the following:
Calculate the square root of a given positive number using Newton's method. My idea is something like this:
delta <- 0.0000001
x <- input_value
#DO:
x.new = 0.5*(x + mu/x)
x = x.new
#UNTIL:
abs(x^2 - mu) < delta
It's meant as a quick way to find the root(s) of a given number.
Does anyone have any ideas as to how to write a loop that does this in R?
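One way to express the DO ... UNTIL idea is R's `repeat` with a `break`. The sketch below is only an illustration under a few assumptions: the function name `newton_sqrt`, the default starting value `x = mu`, and the `max_iter` safety cap are not from the question.

```r
# Newton's iteration for sqrt(mu): x_new = 0.5 * (x + mu / x)
newton_sqrt <- function(mu, x = mu, delta = 1e-7, max_iter = 100) {
  stopifnot(mu > 0, x > 0)       # the iteration needs positive inputs
  iter <- 0
  repeat {
    x <- 0.5 * (x + mu / x)      # DO: one Newton update
    iter <- iter + 1
    # UNTIL: x^2 is close enough to mu (or the safety cap is hit)
    if (abs(x^2 - mu) < delta || iter >= max_iter) break
  }
  x
}

newton_sqrt(2)   # compare with sqrt(2)
sqrt(2)
```

`repeat` runs the body at least once, which matches the do-until shape of the pseudocode; a `while` loop would need the condition to be checked before the first update.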
This is how I ended up solving my issue:
# PROGRAM
find_roots <- function(f, a, b, delta = 0.00005, n = 1000) {
require(numDeriv) # Package for calculating f'(x)
x_0 <- a # Set start value to supplied lower bound
k <- n # Initialize for iteration results
for (i in 1:n) {
dx <- genD(func = f, x = x_0)$D[1] # First-order derivative f'(x0)
x_1 <- x_0 - (f(x_0) / dx) # Calculate next value x_1
k[i] <- x_1 # Store x_1
# Once the difference between x0 and x1 becomes sufficiently small, output the results.
if (abs(x_1 - x_0) < delta) {
root.approx <- tail(k, n=1)
res <- list('root approximation' = root.approx, 'iterations' = k)
return(res)
}
# If Newton-Raphson has not yet reached convergence set x1 as x0 and continue
x_0 <- x_1
}
print('Too many iterations in method')
}
#Example of it working:
func1 <- function(x) {
x^2 + 3*x + 1
}
# Check out the magic
find_roots(func1, 2, 3)
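If numDeriv is not available, the same iteration can be checked with a hand-rolled derivative. This is a rough sketch, not the code above: `newton_fd`, the central-difference step `h`, and the default tolerances are assumptions made for illustration. The result is compared with the closed-form root of x^2 + 3x + 1 from the quadratic formula.

```r
# Newton-Raphson using a central-difference approximation of f'(x)
newton_fd <- function(f, x0, tol = 1e-8, max_iter = 100, h = 1e-6) {
  for (i in seq_len(max_iter)) {
    dfx <- (f(x0 + h) - f(x0 - h)) / (2 * h)   # approximate f'(x0)
    x1 <- x0 - f(x0) / dfx                     # Newton update
    if (abs(x1 - x0) < tol) return(x1)         # converged
    x0 <- x1
  }
  warning("no convergence within max_iter iterations")
  x1
}

func1 <- function(x) x^2 + 3 * x + 1

root <- newton_fd(func1, 2)
root
(-3 + sqrt(5)) / 2   # closed-form root that the iteration from x0 = 2 approaches
```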

R - finding roots for a cartesian product of function parameters

Given a function f(x,c,d) of x that also depends on parameters c and d, I would like to find the zeroes for a Cartesian product of certain values c_1,...,c_n and d_1,...,d_m of the parameters, i.e. an x_ij such that f(x_ij,c_i,d_j)=0 for i=1,...,n and j=1,...,m. Although it is not crucial, I am using a Newton-Raphson algorithm for the root finding:
newton.raphson <- function(f, a, b, tol = 1e-5, n = 1000){
require(numDeriv) # Package for computing f'(x)
x0 <- a # Set start value to supplied lower bound
k <- n # Initialize for iteration results
# Check the upper and lower bounds to see if approximations result in 0
fa <- f(a)
if (fa == 0.0){
return(a)
}
fb <- f(b)
if (fb == 0.0) {
return(b)
}
for (i in 1:n) {
dx <- genD(func = f, x = x0)$D[1] # First-order derivative f'(x0)
x1 <- x0 - (f(x0) / dx) # Calculate next value x1
k[i] <- x1 # Store x1
# Once the difference between x0 and x1 becomes sufficiently small, output the results.
if (abs(x1 - x0) < tol) {
root.approx <- tail(k, n=1)
res <- list('root approximation' = root.approx, 'iterations' = k)
return(res)
}
# If Newton-Raphson has not yet reached convergence set x1 as x0 and continue
x0 <- x1
}
print('Too many iterations in method')
}
The actual function that I am interested in is more complicated, but the following example illustrates my problem.
test.function <- function(x=1,c=1,d=1){
return(c*d-x)
}
Then for any given c_i and d_j I can easily calculate the zero by
newton.raphson(function(x) test.function(x,c=c_i,d=d_j),0,1)[1]
which here is obviously just the product c_i*d_j.
Now I tried to define a function that finds for two given vectors (c_1,...,c_n) and (d_1,...,d_m) the zeroes for all combinations. For this, I tried to define
zeroes <- function(ci=1,dj=1){
x<-newton.raphson(function(x) test.function(x,c=ci,d=dj),0,1)[1]
return(as.numeric(x))
}
and then use the outer function, e.g.
outer(c(1,2),c(1,2,3),FUN=zeroes)
Unfortunately, this did not work. I got an error message
Error during wrapup: dims [product 6] do not match the length of object [1]
There might also be a much better solution to my problem. I am happy for any input.
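The error arises because `outer` calls FUN once with the two fully expanded argument vectors and expects a result of the same length, while `zeroes` returns a single number. One possible fix is to wrap the scalar function in `Vectorize()`. The sketch below keeps the example self-contained by using `uniroot()` from base R in place of `newton.raphson`; that swap, the bracket `c(-100, 100)`, and the name `zeroes_vec` are assumptions made for illustration.

```r
test.function <- function(x = 1, c = 1, d = 1) c * d - x

# Scalar solver: one root for one (ci, dj) pair.
# uniroot() stands in for newton.raphson(); the search bracket is an assumption.
zeroes <- function(ci = 1, dj = 1) {
  uniroot(function(x) test.function(x, c = ci, d = dj),
          interval = c(-100, 100), tol = 1e-10)$root
}

# Vectorize() returns a wrapper that applies zeroes() to each (ci, dj) pair
zeroes_vec <- Vectorize(zeroes)

res <- outer(c(1, 2), c(1, 2, 3), FUN = zeroes_vec)
res   # 2 x 3 matrix of roots, one per combination of c_i and d_j
```

The same wrapper should work with the original `zeroes` built on `newton.raphson`, since `Vectorize()` only changes how the function is called, not what it computes for each pair.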

Vectorizing a simulation

While trying to wrap my mind around vectorizing and to make some simulations faster, I found this very basic epidemic simulation. The code is from the book http://www.amazon.com/Introduction-Scientific-Programming-Simulation-Using/dp/1420068725/ref=sr_1_1?ie=UTF8&qid=1338069156&sr=8-1
#program spuRs/resources/scripts/SIRsim.r
SIRsim <- function(a, b, N, T) {
# Simulate an SIR epidemic
# a is infection rate, b is removal rate
# N initial susceptibles, 1 initial infected, simulation length T
# returns a matrix size (T+1)*3 with columns S, I, R respectively
S <- rep(0, T+1)
I <- rep(0, T+1)
R <- rep(0, T+1)
S[1] <- N
I[1] <- 1
R[1] <- 0
for (i in 1:T) {
S[i+1] <- rbinom(1, S[i], (1 - a)^I[i])
R[i+1] <- R[i] + rbinom(1, I[i], b)
I[i+1] <- N + 1 - R[i+1] - S[i+1]
}
return(matrix(c(S, I, R), ncol = 3))
}
The core of the simulation is the for loop. My question is: since the code produces the S[i+1] and R[i+1] values from the S[i] and R[i] values, is it possible to vectorize it with an apply function?
Many thanks
It's hard to 'vectorize' iterative calculations, but this is a simulation, and simulations are likely to be run many times. So write this to do all the simulations at the same time by adding an argument M (the number of simulations to perform), allocating an M x (T + 1) matrix, and then filling in successive columns (times) of each simulation. The changes seem to be remarkably straightforward (so I've probably made a mistake; I'm particularly concerned about the use of vectors in the second and third arguments to rbinom, though this is consistent with the documentation).
SIRsim <- function(a, b, N, T, M) {
## Simulate an SIR epidemic
## a is infection rate, b is removal rate
## N initial susceptibles, 1 initial infected, simulation length T
## M is the number of simulations to run
## returns a list of S, I, R matrices, each holding M simulations (rows)
## across T + 1 time points
S <- I <- R <- matrix(0, M, T + 1)
S[,1] <- N
I[,1] <- 1
for (i in seq_len(T)) {
S[,i+1] <- rbinom(M, S[,i], (1 - a)^I[,i])
R[,i+1] <- R[,i] + rbinom(M, I[,i], b)
I[,i+1] <- N + 1 - R[,i+1] - S[,i+1]
}
list(S=S, I=I, R=R)
}
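A quick sanity check of the many-simulations idea is sketched below. It restates the function under the hypothetical name `SIRsim_many`, assumes the loop runs over all T time steps with `seq_len(T)`, and uses arbitrary illustrative parameter values and seed. The check relies on the model's bookkeeping: S + I + R should equal N + 1 at every time point in every simulation.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# M simulations at once: rows are simulations, columns are time points
SIRsim_many <- function(a, b, N, T, M) {
  S <- I <- R <- matrix(0, M, T + 1)
  S[, 1] <- N
  I[, 1] <- 1
  for (i in seq_len(T)) {
    # rbinom() is vectorized over its size and prob arguments
    S[, i + 1] <- rbinom(M, S[, i], (1 - a)^I[, i])
    R[, i + 1] <- R[, i] + rbinom(M, I[, i], b)
    I[, i + 1] <- N + 1 - R[, i + 1] - S[, i + 1]
  }
  list(S = S, I = I, R = R)
}

# Illustrative parameter values, not taken from the book
sim <- SIRsim_many(a = 0.001, b = 0.1, N = 100, T = 50, M = 20)

dim(sim$S)                              # M rows, T + 1 columns
all(sim$S + sim$I + sim$R == 100 + 1)   # population is conserved
```

The time loop remains, since each column depends on the previous one; the vectorization is across the M independent runs.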

Vectorizing code and stuck but good

Here are some sample starting values for variables in the code below.
sd <- 2
sdtheory <- 1.5
meanoftheory <- 0.6
obtained <- 0.8
tails <- 2
I'm trying to vectorize the following code. It is a component of a Bayes factor calculator that was originally written by Dienes and adapted to R by Danny Kaye & Thom Baguley. This part is for calculating the likelihood for the theory. I've got the thing massively sped up by vectorizing but I can't match output of the bit below.
area <- 0
theta <- meanoftheory - 5 * sdtheory
incr <- sdtheory / 200
for (A in -1000:1000){
theta <- theta + incr
dist_theta <- dnorm(theta, meanoftheory, sdtheory)
if(identical(tails, 1)){
if (theta <= 0){
dist_theta <- 0
} else {
dist_theta <- dist_theta * 2
}
}
height <- dist_theta * dnorm(obtained, theta, sd)
area <- area + height * incr
}
area
And below is the vectorized version.
incr <- sdtheory / 200
newLower <- meanoftheory - 5 * sdtheory + incr
theta <- seq(newLower, by = incr, length.out = 2001)
dist_theta <- dnorm(theta, meanoftheory, sdtheory)
if (tails == 1){
dist_theta <- dist_theta[theta > 0] * 2
theta <- theta[theta > 0]
}
height <- dist_theta * dnorm(obtained, theta, sd)
area <- sum(height * incr)
area
This code exactly copies the results of the original if tails <- 2. Everything I've got here so far should just copy and paste and give the exact same results. However, once tails <- 1 the second function no longer matches exactly. But as near as I can tell I'm doing the equivalent in the new if statement to what is happening in the original. Any help would be appreciated.
(I did try to create a more minimal example, stripping it down to just the loop and if statements and a tiny amount of slices, and I just couldn't get the code to fail.)
You're dropping observations where theta==0. That's a problem because the output of dnorm is not zero when theta==0. You need those observations in your output.
Rather than drop observations, a better solution would be to set those elements to zero.
incr <- sdtheory / 200
newLower <- meanoftheory - 5 * sdtheory + incr
theta <- seq(newLower, by = incr, length.out = 2001)
dist_theta <- dnorm(theta, meanoftheory, sdtheory)
if (tails == 1){
dist_theta <- ifelse(theta < 0, 0, dist_theta) * 2
theta[theta < 0] <- 0
}
height <- dist_theta * dnorm(obtained, theta, sd)
area <- sum(height * incr)
area
The original calculation has an error due to floating point arithmetic; adding incr each time causes theta to actually equal 7.204654e-14 when it should equal zero. So it's not actually doing the right thing on that pass through the loop; it's not running the theta <= 0 branch when it should. Your code is doing the right thing (at least, it did with these starting values on my machine).
Your code isn't necessarily guaranteed to do the right thing every time either; what seq does is better than adding an increment over and over again, but it's still floating point arithmetic. You really should probably be checking to within machine tolerance of zero, perhaps using all.equal or something similar.
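A tolerance-based check could look something like the sketch below. It rebuilds theta by repeated addition, as the original loop does, and then snaps any value within a small tolerance of zero to exactly zero before applying the one-tailed rule. The tolerance `sqrt(.Machine$double.eps)` and the names `theta_loop`, `near_zero` and `theta_clean` are assumptions chosen for illustration.

```r
sdtheory <- 1.5
meanoftheory <- 0.6
incr <- sdtheory / 200
start <- meanoftheory - 5 * sdtheory

# Build theta by repeated addition, as in the original for loop
theta_loop <- numeric(2001)
theta <- start
for (j in seq_along(theta_loop)) {
  theta <- theta + incr
  theta_loop[j] <- theta
}

# Treat anything within a small tolerance of zero as zero (tolerance is an assumption)
tol <- sqrt(.Machine$double.eps)
near_zero <- abs(theta_loop) < tol
which(near_zero)                       # index of the grid point that should be 0
theta_clean <- ifelse(near_zero, 0, theta_loop)

# One-tailed prior: zero density at or below 0, doubled density above 0
dist_theta <- ifelse(theta_clean <= 0, 0,
                     2 * dnorm(theta_clean, meanoftheory, sdtheory))
```

The tolerance only needs to be much smaller than incr and much larger than the accumulated rounding error, so the exact value is not critical here.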
