Julia: Why is ridge regression not working (Optim)?

I am trying to implement ridge-regression from scratch in Julia but something is going wrong.
# Imports
using CSV
using DataFrames
using LinearAlgebra: norm, I
using Optim: optimize, LBFGS, minimizer
# Read Data
out = CSV.read(download("https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv"), DataFrame, header=0)
# Separate features and response
y = Vector(out[:, end])
X = Matrix(out[:, 1:(end-1)])
λ = 0.1
# Functions
loss(beta) = norm(y - X * beta)^2 + λ*norm(beta)^2
function grad!(G, beta)
    G = -2*transpose(X) * (y - X * beta) + 2*λ*beta
end
function hessian!(H, beta)
    H = X'X + λ*I
end
# Optimization
start = randn(13)
out = optimize(loss, grad!, hessian!, start, LBFGS())
However, the result of this is terrible: we essentially get back start, since the optimizer is not moving. Of course, I know I could simply use (X'X + λ*I) \ X'y or IterativeSolvers.lsmr(X, y), but I would like to implement this myself.

The problem is with the implementation of the grad! and hessian! functions: you should use dot assignment to change the content of the G and H matrices:
G .= -2*transpose(X) * (y - X * beta) + 2*λ*beta
H .= X'X + λ*I
Without the dot you replace the array that the function's local parameter refers to, but the array actually passed to the function (which the optimizer then uses) remains unchanged (presumably a zero matrix, which is why you got back the start vector).
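Putting the fix together, here is a minimal sketch of the corrected call, reusing loss, X, y, λ and start from above; the closed-form comparison at the end is only a sanity check I added, not part of the original answer:
function grad!(G, beta)
    G .= -2*transpose(X) * (y - X * beta) + 2*λ*beta
end
function hessian!(H, beta)
    H .= X'X + λ*I
end
out = optimize(loss, grad!, hessian!, start, LBFGS())
beta_lbfgs = minimizer(out)
beta_exact = (X'X + λ*I) \ X'y           # closed-form ridge solution, for comparison
println(norm(beta_lbfgs - beta_exact))   # should be small once grad! mutates G in place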

Related

Reassign function and avoid recursive definition in Julia

I need to operate on a sequence of functions
h_k(x) = ((I + f_k)^k g)(x), where I is the identity operator,
for each k=1,...,N.
A basic example (N=2, f_k=f) is the following:
f(x) = x^2
g(x) = x
h1(x) = g(x) + f(g(x))
h2(x) = g(x) + f(g(x)) + f(g(x) + f(g(x)))
println(h1(1)) # returns 2
println(h2(1)) # returns 6
I need to write this in a loop and it would be best to redefine g(x) at each iteration. Unfortunately, I do not know how to do this in Julia without conflicting with the syntax for a recursive definition of g(x). Indeed,
f(x) = x^2
g(x) = x
for i=1:2
    global g(x) = g(x) + f(g(x))
    println(g(1))
end
results in a StackOverflowError.
In Julia, what is the proper way to redefine g(x), using its previous definition?
P.S. For those who would suggest that this problem could be solved with recursion: I want to use a for loop because of how the functions f_k(x) (in the above, each f_k = f) are computed in the real problem that this derives from.
I am not sure if it is best, but a natural approach is to use anonymous functions here like this:
let
    f(x) = x^2
    g = x -> x
    for i=1:2
        l = g
        g = x -> l(x) + f(l(x))
        println(g(1))
    end
end
or like this:
f(x) = x^2
g = x -> x
for i=1:4
    l = g
    global g = x -> l(x) + f(l(x))
    println(g(1))
end
(I prefer the former option, using let, as it avoids global variables.)
The issue is that l is a loop local variable that gets a fresh binding at each iteration, while g is external to the loop.
You might also check out this section of the Julia manual.
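For the asker's general setting, where each iteration uses a different f_k, the same rebinding trick works if the functions are stored in a vector. A sketch under that assumption (fs is a hypothetical name, here just two copies of the same f for illustration):
let
    fs = [x -> x^2, x -> x^2]    # hypothetical vector holding f_1, ..., f_N
    g = x -> x
    for fk in fs
        l = g                    # fresh loop-local binding captured by the closure
        g = x -> l(x) + fk(l(x))
        println(g(1))            # prints 2, then 6, matching h1(1) and h2(1) above
    end
end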

How to work with the result of the wild sympy

I have the following code:
f=tan(x)*x**2
q=Wild('q')
s=f.match(tan(q))
s={q_ : x}
How do I work with the result of the Wild match? How do I address it? For example, s[0] and s{0} do not work.
Wild can be used when you have an expression which is the result of some complicated calculation, but you know it has to be of the form tan(something) times something else. Then s[q] will be the sympy expression for the "something", and s[p] for the "something else". This can be used to investigate both p and q, or to further work with a simplified version of f, substituting p and q with new variables, especially if p and q are complex expressions involving multiple variables.
Many more use cases are possible.
Here is an example:
from sympy import *
from sympy.abc import x, y, z
p = Wild('p')
q = Wild('q')
f = tan(x) * x**2
s = f.match(p*tan(q))
print(f'f is the tangent of "{s[q]}" multiplied by "{s[p]}"')
g = f.xreplace({s[q]: y, s[p]:z})
print(f'f rewritten in simplified form as a function of y and z: "{g}"')
h = s[p] * s[q]
print(f'a new function h, combining parts of f: "{h}"')
Output:
f is the tangent of "x" multiplied by "x**2"
f rewritten in simplified form as a function of y and z: "z*tan(y)"
a new function h, combining parts of f: "x**3"
If you're interested in all the arguments of tan that appear in f when f is written as a product, you might try:
from sympy import *
from sympy.abc import x
f = tan(x+2)*tan(x*x+1)*7*(x+1)*tan(1/x)
if f.func == Mul:
    all_tan_args = [a.args[0] for a in f.args if a.func == tan]
    # note: the [0] is needed because args gives a tuple of arguments, and
    # in the case of tan you'd want the first (there is only one)
elif f.func == tan:
    all_tan_args = [f.args[0]]
else:
    all_tan_args = []
prod = 1
for a in all_tan_args:
    prod *= a
print(f'All the tangent arguments are: {all_tan_args}')
print(f'Their product is: {prod}')
Output:
All the tangent arguments are: [1/x, x**2 + 1, x + 2]
Their product is: (x + 2)*(x**2 + 1)/x
Note that neither method would work for f = tan(x)**2. For that, you'd need to write another match and decide whether you'd want to take the same power of the arguments.

Julia & Functions - NoMethodError

I'm not understanding why the following snippet of code is returning a NoMethodError in Julia
using Calculus
nx = 101
nt = 101
dx = 2*pi / (nx - 1)
nu = 0.07
dt = dx*nu
function init(x, nu, t)
    phi = exp( -x^2 / 4.0*nu ) + exp( -(x - 2.0*pi)^2 / 4.0*nu )
    dphi_dx = derivative(phi)
    u = ( 2.0*nu /phi )*dphi_dx + 4.0
    return u
end
x = range(0.0,stop=2*pi,length=nx)
t = 0.0
u = [init(x0,nu,t) for x0 in x]
My aim here is to populate the elements of an array named u with values as calculated by my function init. The u array should have nx elements with u calculated at every x value in the range between 0.0 and 2*pi.
Next time please also post the error message and take a detailed look at it first, so you can try to spot the mistake yourself.
I don't really know the Calculus package, but it seems you are using it wrong. Your phi is a number and not a function; you can't take a derivative of a single number. Change it to
phi = x -> exp( -x^2 / 4.0*nu ) + exp( -(x - 2.0*pi)^2 / 4.0*nu )
and then call phi and derivative at the argument x, i.e. phi(x) and derivative(phi, x), or dphi_dx(x). As I don't know much about the Calculus package, you should take a look at its documentation again to verify that the derivative command is doing exactly what you want.
Little extra: there are also element-wise (broadcast) operations in Julia (similar to Matlab, for example) that apply a function to a whole array. Instead of [init(x0,nu,t) for x0 in x], you can also write init.(x, nu, t).
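Putting both suggestions together, here is a minimal corrected sketch. It keeps the formula for phi exactly as posted and uses the two-argument derivative(phi, x) from Calculus; whether that numerical derivative is exactly what you want is still worth double-checking in the package docs:
using Calculus

nx = 101
nu = 0.07

function init(x, nu, t)
    phi = z -> exp( -z^2 / 4.0*nu ) + exp( -(z - 2.0*pi)^2 / 4.0*nu )   # phi is now a function
    dphi_dx = derivative(phi, x)            # numerical derivative of phi evaluated at x
    u = ( 2.0*nu / phi(x) )*dphi_dx + 4.0
    return u
end

x = range(0.0, stop=2*pi, length=nx)
u = init.(x, nu, 0.0)                       # broadcast init over the whole range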

How do I display a math function in Julia?

I'm new to Julia and I'm trying to learn how to do calculus with it. If I calculate the gradient of a function with ForwardDiff, as in the code below, how can I see the resulting function?
I know that if I plug in some values it gives me the value of the gradient at that point, but I just want to see the function itself (the gradient of f1).
julia> gradf1(x1, x2) = ForwardDiff.gradient(z -> f1(z[1], z[2]), [x1, x2])
gradf1 (generic function with 1 method)
To elaborate on Felipe Lema's comment, here are some examples using SymPy.jl for various tasks:
using SymPy
using LinearAlgebra: diag
@vars x y z
f(x,y,z) = x^2 * y * z
VF(x,y,z) = [x*y, y*z, z*x]
diff(f(x,y,z), x) # ∂f/∂x
diff.(f(x,y,z), [x,y,z]) # ∇f, gradient
diff.(VF(x,y,z), [x,y,z]) |> sum # ∇⋅VF, divergence
J = VF(x,y,z).jacobian([x,y,z])
sum(diag(J)) # ∇⋅VF, divergence
Mx, Nx, Px, My, Ny, Py, Mz, Nz, Pz = J
[Py-Nz, Mz-Px, Nx-My] # ∇×VF
The divergence and gradient are also part of SymPy, but not exposed. Their use is more general, but cumbersome for this task. For example, this finds the curl:
import PyCall
PyCall.pyimport_conda("sympy.physics.vector", "sympy")
RF = sympy.physics.vector.ReferenceFrame("R")
v1 = get(RF,0)*get(RF,1)*RF.x + get(RF,1)*get(RF,2)*RF.y + get(RF,2)*get(RF,0)*RF.z
sympy.physics.vector.curl(v1, RF)
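To connect this back to the original question: f1 is never shown there, so as a hedged illustration assume some f1, define the variables symbolically, and the gradient can then be displayed as an expression rather than evaluated at a point:
using SymPy
@vars x1 x2
f1(x1, x2) = x1^2 * sin(x2)               # hypothetical f1, just for illustration
grad_f1 = [diff(f1(x1, x2), v) for v in (x1, x2)]
println(grad_f1)                          # shows the symbolic ∇f1, e.g. [2*x1*sin(x2), x1^2*cos(x2)]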

Get derivative in R

I'm trying to take the derivative of an expression:
x = read.csv("export.csv", header=F)$V1
f = expression(-7645/2* log(pi) - 1/2 * sum(log(w+a*x[1:7644]^2)) + (x[2:7645]^2/(w + a*x[1:7644]^2)),'a')
D(f,'a')
x is simply an integer vector, a and w are the variables I'm trying to find by deriving. However, I get the error
"Function '[' is not in Table of Derivatives"
Since this is my first time using R, I'm rather clueless about what to do now. I'm assuming R has some problem with the sum function inside the expression?
After following the advice I now did the following:
y <- x[1:7644]
z <- x[2:7645]
f = expression(-7645/2* log(pi) - 1/2 * sum(log(w+a*y^2)) + (z^2/(w + a*y^2)),'a')
Deriving this gives me the error "sum is not in the table of derivatives". How can I make sure the expression considers each value of y and z?
Another Update:
y <- x[1:7644]
z <- x[2:7645]
f = expression(-7645/2* log(pi) - 1/2 * log(w+a*y^2) + (z^2/(w + a*y^2)))
d = D(f,'a')
uniroot(eval(d),c(0,1000))
I've eliminated the "sum" function and just entered y and z. Now, 2 questions:
a) How can I be sure that this is still the expected behaviour?
b) Uniroot doesn't seem to like "w" and "a" since they're just symbolic. How would I go about fixing this issue? The error I get is "object 'w' not found"
This should work:
Since you have two terms being added, f + g, and the derivative satisfies D(f+g) = D(f) + D(g), let's separate the two like this:
g = expression((z^2/(w + a*y^2)))
f = expression(- 1/2 * log(w+a*y^2))
Note that sum() was removed from expression f: the multiplying constant can be moved inside the sum(), and D(sum(...)) = sum(D(...)). The first term was also dropped because it is a constant, so its derivative is 0.
So:
D(-7645/2*log(pi) - 1/2*sum(log(w + a*y^2)) + z^2/(w + a*y^2)) = D(constant + sum(f) + g) = sum(D(f)) + D(g)
Which should give:
sum(-(1/2 * (y^2/(w + a * y^2)))) + -(z^2 * y^2/(w + a * y^2)^2)
Note that expression() takes only a single expr input, not a vector, and it is beyond R's abilities to vectorize that.
You can also do this with a for loop:
foo <- c("1+2","3+4","5*6","7/8")
result <- numeric(length(foo))
foo <- parse(text=foo)
for(i in seq_along(foo))
    result[i] <- eval(foo[[i]])
