Say I have prices of a stock and I want to find the slope of the regression line in a rolling manner with a given window size. How can I do this in Julia? I want it to be really fast, hence I don't want to use a for loop.
You should not, in general, be worried about for loops in Julia, as they do not have the overhead of R or Python for loops. Thus, you only need to worry about asymptotic complexity and not the potentially large constant factor introduced by interpreter overhead.
Nevertheless, this operation can be done much more (asymptotically) efficiently with convolutions than with the naïve O(n²) slice-and-regress approach. The DSP.jl package provides convolution functionality. The following is an example with no intercept (it computes the rolling betas); support for an intercept should be possible by modifying the formulas.
using DSP, LinearAlgebra # LinearAlgebra provides dot, used in the check below
# Create some example x (signal) and y (stock prices)
# such that strength of signal goes up over time
const x = randn(100)
const y = (1:100) .* x .+ 100 .* randn(100)
# Create the rolling window
const window = DSP.Windows.rect(20)
# Compute linear least squares estimate (X^T X)^-1 X^T Y
const xᵗx = conv(x .* x, window)[length(window):end-length(window)+1]
const xᵗy = conv(x .* y, window)[length(window):end-length(window)+1]
const lls = xᵗy ./ xᵗx # desired beta
# Check result against naïve for loop
const βref = [dot(x[i:i+19], y[i:i+19]) / dot(x[i:i+19], x[i:i+19]) for i = 1:81]
@assert isapprox(βref, lls)
Edit to add: To support an intercept, i.e. X = [x 1], so X^T X = [dot(x, x) sum(x); sum(x) w] where w is the window size, the formula for the inverse of a 2×2 matrix can be used to get (X^T X)^-1 = [w -sum(x); -sum(x) dot(x, x)]/(w * dot(x, x) - sum(x)^2). Thus, [β, α] = [w dot(x, y) - sum(x) * sum(y), dot(x, x) * sum(y) - sum(x) * dot(x, y)] / (w * dot(x, x) - sum(x)^2). This can be translated to the following convolution code:
# Compute linear least squares estimate with intercept
const w = length(window)
const xᵗx = conv(x .* x, window)[w:end-w+1]
const xᵗy = conv(x .* y, window)[w:end-w+1]
const 𝟙ᵗx = conv(x, window)[w:end-w+1]
const 𝟙ᵗy = conv(y, window)[w:end-w+1]
const denom = w .* xᵗx - 𝟙ᵗx .^ 2
const α = (xᵗx .* 𝟙ᵗy .- 𝟙ᵗx .* xᵗy) ./ denom
const β = (w .* xᵗy .- 𝟙ᵗx .* 𝟙ᵗy) ./ denom
# Check vs. naive solution
const ref = vcat([([x[i:i+19] ones(20)] \ y[i:i+19])' for i = 1:81]...)
@assert isapprox([β α], ref)
Note that, for weighted least squares with a different window shape, some minor modifications will be needed to disentangle length(window) and sum(window) which are used interchangeably in the code above.
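For readers who want to check the closed-form expressions for β and α without installing DSP.jl, here is a minimal sketch that swaps the convolution for cumulative sums (a different technique from the one in the answer, using only the standard library). The names rolling_sum and rolling_slope_intercept and the synthetic data are assumptions made for illustration. Differences of cumulative sums can lose precision on very long or large-valued series, so treat this as a sketch rather than a drop-in replacement.

```julia
using LinearAlgebra

# Sum over each length-w window of v, via differences of a cumulative sum.
# (hypothetical helper, not part of any package)
function rolling_sum(v::AbstractVector, w::Integer)
    c = cumsum(vcat(zero(eltype(v)), v))   # c[k+1] == sum(v[1:k])
    return c[w+1:end] .- c[1:end-w]        # entry i == sum(v[i:i+w-1])
end

# Rolling slope β and intercept α of y regressed on x, window size w.
function rolling_slope_intercept(x, y, w)
    sxx = rolling_sum(x .* x, w)
    sxy = rolling_sum(x .* y, w)
    sx  = rolling_sum(x, w)
    sy  = rolling_sum(y, w)
    denom = w .* sxx .- sx .^ 2
    β = (w .* sxy .- sx .* sy) ./ denom
    α = (sxx .* sy .- sx .* sxy) ./ denom
    return β, α
end

# Deterministic example data (made up for this sketch)
x = collect(1.0:50.0) .+ 0.1 .* sin.(1:50)
y = 3.0 .* x .+ 2.0 .+ cos.(1:50)

β, α = rolling_slope_intercept(x, y, 10)

# Naive reference: solve each window separately
ref = vcat([([x[i:i+9] ones(10)] \ y[i:i+9])' for i in 1:41]...)
```

Each rolling quantity costs O(n) here regardless of the window size, which is the same idea as the convolution approach: compute the four rolling sums once and combine them elementwise.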
Since I don't need an x variable, I used the numeric series 1:windowsize as x. Using the RollingFunctions package, I was able to get rolling regressions with the function below.
using RollingFunctions
function rolling_regression(price, windowsize)
    sum_x = sum(collect(1:windowsize))
    sum_x_squared = sum(collect(1:windowsize).^2)
    sum_xy = rolling(sum, price, windowsize, collect(1:windowsize))
    sum_y = rolling(sum, price, windowsize)
    b = ((windowsize*sum_xy) - (sum_x*sum_y))/(windowsize*sum_x_squared - sum_x^2)
    c = [repeat([missing], windowsize-1); b]
end
I am trying to implement ridge-regression from scratch in Julia but something is going wrong.
# Imports
using CSV, DataFrames
using LinearAlgebra: norm, I
using Optim: optimize, LBFGS, minimizer
# Read Data
out = CSV.read(download("https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv"), DataFrame, header=0)
# Separate features and response
y = Vector(out[:, end])
X = Matrix(out[:, 1:(end-1)])
λ = 0.1
# Functions
loss(beta) = norm(y - X * beta)^2 + λ*norm(beta)^2
function grad!(G, beta)
G = -2*transpose(X) * (y - X * beta) + 2*λ*beta
end
function hessian!(H, beta)
H = X'X + λ*I
end
# Optimization
start = randn(13)
out = optimize(loss, grad!, hessian!, start, LBFGS())
However, the result of this is terrible: we essentially get back start, since the optimizer does not move. Of course, I know I could simply use (X'X + λ*I) \ X'y or IterativeSolvers.lsmr(X, y), but I would like to implement this myself.
The problem is with the implementation of the grad! and hessian! functions: you should use dot assignment to change the content of the G and H matrices:
G .= -2*transpose(X) * (y - X * beta) + 2*λ*beta
H .= X'X + λ*I
Without the dot, the assignment only rebinds the local name (G or H) to a newly allocated array; the array that was passed to the function, which is the one the optimizer reads afterwards, remains unchanged (presumably still all zeros, which is why you got back the start vector).
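The difference between rebinding and in-place assignment can be seen in isolation, without any optimizer. The function names grad_rebind! and grad_inplace! and the toy gradient 2 .* beta are made up for this sketch:

```julia
# Rebinding: `G = ...` makes the local name G point at a new array.
# The caller's array is left untouched.
function grad_rebind!(G, beta)
    G = 2 .* beta
    return nothing
end

# In-place: `G .= ...` writes into the array the caller passed in.
function grad_inplace!(G, beta)
    G .= 2 .* beta
    return nothing
end

beta = [1.0, 2.0, 3.0]

G1 = zeros(3)
grad_rebind!(G1, beta)    # G1 is still all zeros

G2 = zeros(3)
grad_inplace!(G2, beta)   # G2 now holds 2 .* beta
```

An optimizer that passes in a preallocated gradient buffer behaves like the caller here: it only ever sees what was written into its own array.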
I don't understand why the following snippet of code throws a MethodError in Julia:
using Calculus
nx = 101
nt = 101
dx = 2*pi / (nx - 1)
nu = 0.07
dt = dx*nu
function init(x, nu, t)
phi = exp( -x^2 / 4.0*nu ) + exp( -(x - 2.0*pi)^2 / 4.0*nu )
dphi_dx = derivative(phi)
u = ( 2.0*nu /phi )*dphi_dx + 4.0
return u
end
x = range(0.0,stop=2*pi,length=nx)
t = 0.0
u = [init(x0,nu,t) for x0 in x]
My aim here is to populate the elements of an array named u with values as calculated by my function init. The u array should have nx elements with u calculated at every x value in the range between 0.0 and 2*pi.
Next time, please also post the error message, and take a detailed look at it first so you can try to spot the mistake yourself.
I don't really know the Calculus package but it seems you are using it wrong. Your phi is a number and not a function. You can't take a derivative from just a single number. Change it to
phi = x -> exp( -x^2 / 4.0*nu ) + exp( -(x - 2.0*pi)^2 / 4.0*nu )
and then call phi and its derivative at the argument x, i.e. phi(x) and derivative(phi, x), or dphi_dx(x) if you keep dphi_dx = derivative(phi). As I don't know much about the Calculus package, you should check its documentation to verify that derivative does exactly what you want when used like that.
Little extra: there are also element-wise operations in Julia (similar to Matlab for example) that apply functions to the whole array. Instead of [init(x0,nu,t) for x0 in x], you can also write init.(x,nu,t).
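Putting both points together, here is a minimal self-contained sketch. To avoid depending on Calculus.jl it uses a hand-written central finite difference, central_diff, which is a hypothetical stand-in for derivative(phi, x) and not the package's implementation. The formula for phi is copied from the question as written.

```julia
# Hypothetical stand-in for Calculus.derivative(f, x): central finite difference.
central_diff(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

function init(x, nu, t)
    # phi is a function of its argument, not a number
    phi(s) = exp(-s^2 / 4.0 * nu) + exp(-(s - 2.0 * pi)^2 / 4.0 * nu)
    dphi_dx = central_diff(phi, x)          # derivative of phi evaluated at x
    return (2.0 * nu / phi(x)) * dphi_dx + 4.0
end

nx = 101
nu = 0.07
x = range(0.0, stop=2 * pi, length=nx)

# Broadcast init over every element of x; nu and t are treated as scalars.
u = init.(x, nu, 0.0)
```

The broadcast call returns a Vector with one entry per grid point, the same result as the comprehension in the question.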
I'm new to Julia and I'm trying to learn how to do calculus in it. If I compute the gradient of a function with "ForwardDiff", as in the code below, how can I see the resulting function?
I know if I input some values it gives me the gradient value in that point but I just want to see the function (the gradient of f1).
julia> gradf1(x1, x2) = ForwardDiff.gradient(z -> f1(z[1], z[2]), [x1, x2])
gradf1 (generic function with 1 method)
To elaborate on Felipe Lema's comment, here are some examples using SymPy.jl for various tasks:
using SymPy, LinearAlgebra # LinearAlgebra provides diag
@vars x y z
f(x,y,z) = x^2 * y * z
VF(x,y,z) = [x*y, y*z, z*x]
diff(f(x,y,z), x) # ∂f/∂x
diff.(f(x,y,z), [x,y,z]) # ∇f, gradient
diff.(VF(x,y,z), [x,y,z]) |> sum # ∇⋅VF, divergence
J = VF(x,y,z).jacobian([x,y,z])
sum(diag(J)) # ∇⋅VF, divergence
Mx,Nx, Px, My,Ny,Py, Mz, Nz, Pz = J
[Py-Nz, Mz-Px, Nx-My] # ∇×VF
The divergence and gradient are also part of SymPy, but not exposed. Their use is more general, but cumbersome for this task. For example, this finds the curl:
import PyCall
PyCall.pyimport_conda("sympy.physics.vector", "sympy")
RF = sympy.physics.vector.ReferenceFrame("R")
v1 = get(RF,0)*get(RF,1)*RF.x + get(RF,1)*get(RF,2)*RF.y + get(RF,2)*get(RF,0)*RF.z
sympy.physics.vector.curl(v1, RF)
I have an array Z in Julia which represents an image of a 2D Gaussian function. I.e. Z[i,j] is the height of the Gaussian at pixel i,j. I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (randomly selected points). Then I tried the LsqFit package, but it seems to be tailored for 1D fitting, as it is throwing errors when I try to fit 2D data, and there is no documentation I can find to lead me to a solution.
How can I fit a Gaussian to a 2D array in Julia?
The simplest approach is to use Optim.jl. Here is some example code (it is not optimized for speed, but it should show you how to handle the problem):
using Distributions, Optim
# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])
function objective(x)
    mu, sig, s = decode(x)
    try # sig might be infeasible so we have to handle this case
        est_d = MvNormal(mu, sig)
        ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
        sum((a-b)^2 for (a,b) in zip(ref_m, m))
    catch
        sum(m)
    end
end
# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively you could use constrained optimization e.g. like this:
using Distributions, JuMP, NLopt
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
@variable(m, m1)
@variable(m, m2)
@variable(m, sig11 >= 0.001)
@variable(m, sig12)
@variable(m, sig22 >= 0.001)
@variable(m, sc >= 0.001)
function obj(m1, m2, sig11, sig12, sig22, sc)
est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_Z, Z))
end
JuMP.register(m, :obj, 6, obj, autodiff=true)
@NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
@NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])
In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid spacing and offsets), and Ri and Rj are the ranges of the indices. dist is the distance measure you use, e.g. the squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = @views v[1:N], reshape(v[N+1:end], N, N)
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidefinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently, and optimize over some lower triangular matrix L, such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
loss(μ, L * L')
end
(This is of course open to further optimization, such as expanding the multiplication directly into loss.) A different way is to pass the condition to the optimizer as a constraint.
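As a quick check that the Σ = L * L' parametrization does what it promises, the following sketch unpacks an arbitrary parameter vector and confirms that the resulting Σ is symmetric with non-negative eigenvalues. The vector v is made up for illustration.

```julia
using LinearAlgebra

# Same unpacking as above: first two entries are μ, the remaining three fill L.
unpack(v) = (v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]]))

v = [1.0, 0.0, 1.5, -0.7, 0.3]   # arbitrary, unconstrained parameters
μ, L = unpack(v)
Σ = Matrix(L * L')               # symmetric positive semidefinite by construction

λmin = eigmin(Symmetric(Σ))      # smallest eigenvalue, should not be negative
```

Because any real v gives a valid Σ, an unconstrained optimizer can search over v freely without a try/catch around infeasible covariance matrices.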
For the optimizer to work, you will probably need the derivative of loss_packed. You can either calculate it manually (made easier by a good choice of dist), or perhaps more easily use a log transformation (if you're lucky, you will find a way to reduce it to a linear problem). Alternatively, you could try an optimizer that does automatic differentiation.
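One way the log transformation can reduce the fit to a linear problem, sketched under the assumption that the image is a noise-free, strictly positive scaled Gaussian: log Z is then a quadratic polynomial in (x, y), so its six coefficients can be found by ordinary least squares and converted back to μ and Σ. The synthetic data and all variable names are assumptions made for illustration; on noisy data the log amplifies errors in the tails, so weighting or masking small values would likely be needed.

```julia
using LinearAlgebra

# Synthetic image of a scaled 2D Gaussian (made-up test data)
μ_true = [1.0, 0.0]
Σ_true = [2.0 1.0; 1.0 3.0]
P_true = inv(Σ_true)
xr = -3:0.1:3
yr = -3:0.1:3
gauss(p) = 5.0 * exp(-0.5 * dot(p - μ_true, P_true * (p - μ_true)))
Z = [gauss([x, y]) for x in xr, y in yr]

# log Z = a + b1*x + b2*y + c11*x^2 + c12*x*y + c22*y^2
# One design-matrix row per pixel, in the same (column-major) order as vec(Z).
A = reduce(vcat, vec([[1.0 x y x^2 x*y y^2] for x in xr, y in yr]))
coef = A \ vec(log.(Z))
a, b1, b2, c11, c12, c22 = coef

# Matching terms with -0.5*(p-μ)'P(p-μ): quadratic part gives P = inv(Σ),
# linear part gives P*μ = [b1, b2].
P = [-2c11 -c12; -c12 -2c22]
μ_est = P \ [b1, b2]
Σ_est = inv(P)
```

The result can also serve as a starting point for one of the nonlinear fits if the data are noisy.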
I’m trying to optimize a function using one of the algorithms that require a gradient. Basically I’m trying to learn how to optimize a function using a gradient in Julia. I’m fairly confident that my gradient is specified correctly. I know this because the similarly defined Matlab function for the gradient gives me the same values as in Julia for some test values of the arguments. Also, the Matlab version using fminunc with the gradient seems to optimize the function fine.
However when I run the Julia script, I seem to get the following error:
julia> include("ex2b.jl")
ERROR: `g!` has no method matching g!(::Array{Float64,1}, ::Array{Float64,1})
while loading ...\ex2b.jl, in expression starting on line 64
I'm running Julia 0.3.2 on a Windows 7 32-bit machine. Here is the code (basically a translation of some Matlab to Julia):
using Optim
function mapFeature(X1, X2)
degrees = 5
out = ones(size(X1)[1])
for i in range(1, degrees+1)
for j in range(0, i+1)
term = reshape( (X1.^(i-j) .* X2.^(j)), size(X1.^(i-j))[1], 1)
out = hcat(out, term)
end
end
return out
end
function sigmoid(z)
return 1 ./ (1 + exp(-z))
end
function costFunc_logistic(theta, X, y, lam)
m = length(y)
regularization = sum(theta[2:end].^2) * lam / (2 * m)
return sum( (-y .* log(sigmoid(X * theta)) - (1 - y) .* log(1 - sigmoid(X * theta))) ) ./ m + regularization
end
function costFunc_logistic_gradient!(theta, X, y, lam, m)
grad= X' * ( sigmoid(X * theta) .- y ) ./ m
grad[2:end] = grad[2:end] + theta[2:end] .* lam / m
return grad
end
data = readcsv("ex2data2.txt")
X = mapFeature(data[:,1], data[:,2])
m, n = size(data)
y = data[:, end]
theta = zeros(size(X)[2])
lam = 1.0
f(theta::Array) = costFunc_logistic(theta, X, y, lam)
g!(theta::Array) = costFunc_logistic_gradient!(theta, X, y, lam, m)
optimize(f, g!, theta, method = :l_bfgs)
And here is some of the data:
0.051267,0.69956,1
-0.092742,0.68494,1
-0.21371,0.69225,1
-0.375,0.50219,1
-0.51325,0.46564,1
-0.52477,0.2098,1
-0.39804,0.034357,1
-0.30588,-0.19225,1
0.016705,-0.40424,1
0.13191,-0.51389,1
0.38537,-0.56506,1
0.52938,-0.5212,1
0.63882,-0.24342,1
0.73675,-0.18494,1
0.54666,0.48757,1
0.322,0.5826,1
0.16647,0.53874,1
-0.046659,0.81652,1
-0.17339,0.69956,1
-0.47869,0.63377,1
-0.60541,0.59722,1
-0.62846,0.33406,1
-0.59389,0.005117,1
-0.42108,-0.27266,1
-0.11578,-0.39693,1
0.20104,-0.60161,1
0.46601,-0.53582,1
0.67339,-0.53582,1
-0.13882,0.54605,1
-0.29435,0.77997,1
-0.26555,0.96272,1
-0.16187,0.8019,1
-0.17339,0.64839,1
-0.28283,0.47295,1
-0.36348,0.31213,1
-0.30012,0.027047,1
-0.23675,-0.21418,1
-0.06394,-0.18494,1
0.062788,-0.16301,1
0.22984,-0.41155,1
0.2932,-0.2288,1
0.48329,-0.18494,1
0.64459,-0.14108,1
0.46025,0.012427,1
0.6273,0.15863,1
0.57546,0.26827,1
0.72523,0.44371,1
0.22408,0.52412,1
0.44297,0.67032,1
0.322,0.69225,1
0.13767,0.57529,1
-0.0063364,0.39985,1
-0.092742,0.55336,1
-0.20795,0.35599,1
-0.20795,0.17325,1
-0.43836,0.21711,1
-0.21947,-0.016813,1
-0.13882,-0.27266,1
0.18376,0.93348,0
0.22408,0.77997,0
Let me know if you guys need additional details. Btw, this relates to a coursera machine learning course if curious.
The gradient function should not return the gradient, but store it in an array that is passed to it (hence the exclamation mark in the function name, and the second argument in the error message).
The following seems to work.
function g!(theta::Array, storage::Array)
storage[:] = costFunc_logistic_gradient!(theta, X, y, lam, m)
end
optimize(f, g!, theta, method = :l_bfgs)
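For readers on a recent Julia (1.x), here is a minimal sketch of the same cost and a storing gradient, with the explicit broadcasting dots that newer versions require. It does not call Optim at all; the function names cost and gradient! and the four-row data set are assumptions made for illustration. The argument order of the gradient callback differs between Optim.jl versions (the answer above uses (theta, storage), while recent documentation puts the storage array first), so check the version you have installed.

```julia
sigmoid(z) = 1 / (1 + exp(-z))   # scalar; broadcast with sigmoid.(...)

# Regularized logistic cost (the first coefficient is not penalized).
function cost(theta, X, y, lam)
    m = length(y)
    h = sigmoid.(X * theta)
    reg = lam / (2m) * sum(abs2, theta[2:end])
    return sum(@. -y * log(h) - (1 - y) * log(1 - h)) / m + reg
end

# Writes the gradient into `storage` instead of returning a fresh array.
function gradient!(storage, theta, X, y, lam)
    m = length(y)
    storage .= X' * (sigmoid.(X * theta) .- y) ./ m
    storage[2:end] .+= lam / m .* theta[2:end]
    return storage
end

# Tiny example: intercept column plus two features (rows taken from the data above)
X = [1.0  0.051267 0.69956;
     1.0 -0.092742 0.68494;
     1.0  0.18376  0.93348;
     1.0  0.22408  0.77997]
y = [1.0, 1.0, 0.0, 0.0]
lam = 1.0
theta = [0.1, -0.2, 0.3]

g = zeros(3)
gradient!(g, theta, X, y, lam)   # g now holds the gradient at theta
```

Comparing g against a finite-difference approximation of cost is a cheap way to confirm the gradient before handing both functions to an optimizer.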
The same using closures and currying (version for those who got used to a function that returns the cost and gradient):
function cost_gradient(θ, X, y, λ)
m = length(y);
return (θ::Array) -> begin
h = sigmoid(X * θ); #(m,n+1)*(n+1,1) -> (m,1)
J = (1 / m) * sum(-y .* log(h) .- (1 - y) .* log(1 - h)) + λ / (2 * m) * sum(θ[2:end] .^ 2);
end, (θ::Array, storage::Array) -> begin
h = sigmoid(X * θ); #(m,n+1)*(n+1,1) -> (m,1)
storage[:] = (1 / m) * (X' * (h .- y)) + (λ / m) * [0; θ[2:end]];
end
end
Then, somewhere in the code:
initialθ = zeros(n,1);
f, g! = cost_gradient(initialθ, X, y, λ);
res = optimize(f, g!, initialθ, method = :cg, iterations = your_iterations);
θ = res.minimum;