I am trying to use maximum likelihood to estimate the normal linear model in Julia. I have used the following code to simulate the process with just an intercept, and to set up the objective with an anonymous function, as the Optim documentation suggests for values that do not change:
using Optim
nobs = 500
nvar = 1
β = ones(nvar)*3.0
x = [ones(nobs) randn(nobs,nvar-1)]
ε = randn(nobs)*0.5
y = x*β + ε
function LL_anon(X, Y, β, σ)
    -(-length(X)*log(2π)/2 - length(X)*log(σ) - (sum((Y - X*β).^2) / (2σ^2)))
end
LL_anon(X,Y, pars) = LL_anon(X,Y, pars...)
res2 = optimize(vars -> LL_anon(x,y, vars...), [1.0,1.0]) # Start values: β=1.0, σ=1.0
This actually recovered the parameters and I received the following output:
* Algorithm: Nelder-Mead
* Starting Point: [1.0,1.0]
* Minimizer: [2.980587812647935,0.5108406803949835]
* Minimum: 3.736217e+02
* Iterations: 47
* Convergence: true
* √(Σ(yᵢ-ȳ)²)/n < 1.0e-08: true
* Reached Maximum Number of Iterations: false
* Objective Calls: 92
However, when I try and set nvar = 2, i.e. an intercept plus an additional covariate, I get the following error message:
MethodError: no method matching LL_anon(::Array{Float64,2}, ::Array{Float64,1}, ::Float64, ::Float64, ::Float64)
Closest candidates are:
LL_anon(::Any, ::Any, ::Any, ::Any) at In[297]:2
LL_anon(::Array{Float64,1}, ::Array{Float64,1}, ::Any, ::Any) at In[113]:2
LL_anon(::Any, ::Any, ::Any) at In[297]:4
...
Stacktrace:
[1] (::##245#246)(::Array{Float64,1}) at .\In[299]:1
[2] value!!(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\NLSolversBase\src\interface.jl:9
[3] initial_state(::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}, ::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/solvers/zeroth_order\nelder_mead.jl:136
[4] optimize(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\optimize.jl:25
[5] #optimize#151(::Array{Any,1}, ::Function, ::Tuple{##245#246}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:62
[6] #optimize#148(::Array{Any,1}, ::Function, ::Function, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:52
[7] optimize(::Function, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:52
I'm not sure why adding an additional variable is causing this issue, but it seems like a type instability problem.
The second issue is that when I use my original working example and set the starting values to [2.0,2.0], I get the following error message:
log will only return a complex result if called with a complex argument. Try log(complex(x)).
Stacktrace:
[1] nan_dom_err at .\math.jl:300 [inlined]
[2] log at .\math.jl:419 [inlined]
[3] LL_anon(::Array{Float64,2}, ::Array{Float64,1}, ::Float64, ::Float64) at .\In[302]:2
[4] (::##251#252)(::Array{Float64,1}) at .\In[304]:1
[5] value(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\NLSolversBase\src\interface.jl:19
[6] update_state!(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Optim.NelderMeadState{Array{Float64,1},Float64,Array{Float64,1}}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/solvers/zeroth_order\nelder_mead.jl:193
[7] optimize(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}, ::Optim.NelderMeadState{Array{Float64,1},Float64,Array{Float64,1}}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\optimize.jl:51
[8] optimize(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\optimize.jl:25
[9] #optimize#151(::Array{Any,1}, ::Function, ::Tuple{##251#252}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:62
Again, I'm not sure why this is happening. Since start values are very important, and these are not too far off from the true values, I'd like to know how to overcome this issue.
Any help would be greatly appreciated!
Splatting causes the problem: it transforms [1, 2, 3] into three separate arguments, while your function accepts only two parameters after X and Y.
Use the following call:
res2 = optimize(vars -> LL_anon(x,y, vars[1:end-1], vars[end]), [1.0,1.0,1.0])
and you can remove the following line from your code
LL_anon(X,Y, pars) = LL_anon(X,Y, pars...)
or replace it with:
LL_anon(X,Y, pars) = LL_anon(X,Y, pars[1:end-1], pars[end])
but it would not be called by the optimization routine unless you change the call to:
res2 = optimize(vars -> LL_anon(x,y, vars), [1.0,1.0,1.0])
Finally, to get good performance from this code, I would recommend wrapping it all in a function.
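For instance, a minimal sketch of such a wrapper (the function name is just illustrative, not part of the original code):
# Hypothetical wrapper: keeping x, y and the closure inside a function avoids
# non-constant globals, so Julia can compile a specialized, faster version.
function fit_normal_mle(x, y, start)
    obj = vars -> LL_anon(x, y, vars[1:end-1], vars[end])
    optimize(obj, start)
end

res2 = fit_normal_mle(x, y, [1.0, 1.0, 1.0])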
EDIT: Now I see the second question. The reason is that σ can become negative during the optimization, and then log(σ) fails. The simplest thing to do in this case is to take log(abs(σ)) like this:
function LL_anon(X, Y, β, σ)
    -(-length(X)*log(2π)/2 - length(X)*log(abs(σ)) - (sum((Y - X*β).^2) / (2σ^2)))
end
Of course, you then have to take the absolute value of the σ returned by the optimization routine, since it might come back negative.
A cleaner way would be to optimize over log(σ) rather than σ, like this:
function LL_anon(X, Y, β, logσ)
    -(-length(X)*log(2π)/2 - length(X)*logσ - (sum((Y - X*β).^2) / (2(exp(logσ))^2)))
end
but then you have to use exp(logσ) to recover σ after optimization finishes.
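For instance, a short sketch of that recovery step, assuming the log(σ) version of LL_anon above:
res = optimize(vars -> LL_anon(x, y, vars[1:end-1], vars[end]), ones(nvar + 1))
β_hat = Optim.minimizer(res)[1:end-1]
σ_hat = exp(Optim.minimizer(res)[end])   # recover σ from the optimized log(σ)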
I have asked around regarding this and have another option. My main reasons for looking at this problem are twofold: one, to learn how to use the optimization routines in Julia in a canonical situation, and two, to expand this to spatial econometric models. With that in mind, I'm posting the other suggested code from the Julia message board so that others may see another solution.
using Optim
nobs = 500
nvar = 2
β = ones(nvar) * 3.0
x = [ones(nobs) randn(nobs, nvar - 1)]
ε = randn(nobs) * 0.5
y = x * β + ε
function LL_anon(X, Y, β, log_σ)
    σ = exp(log_σ)
    -(-length(X) * log(2π)/2 - length(X) * log(σ) - (sum((Y - X * β).^2) / (2σ^2)))
end
opt = optimize(vars -> LL_anon(x,y, vars[1:nvar], vars[nvar + 1]),
ones(nvar+1))
# Use forward autodiff to get first derivative, then optimize
fun1 = OnceDifferentiable(vars -> LL_anon(x, y, vars[1:nvar], vars[nvar + 1]),
ones(nvar+1))
opt1 = optimize(fun1, ones(nvar+1))
Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [1.0,1.0,1.0]
 * Minimizer: [2.994204150985705,2.9900626550063305, …]
 * Minimum: 3.538340e+02
 * Iterations: 12
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false
     |x - x'| = 8.92e-12
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 9.64e-16 |f(x)|
   * |g(x)| ≤ 1.0e-08: true
     |g(x)| = 6.27e-09
   * Stopped by an increasing objective: true
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 50
 * Gradient Calls: 50
opt1.minimizer
3-element Array{Float64,1}:
2.9942
2.99006
-1.0651 #Note: needs to be exponentiated
# Get Hessian, use Newton!
fun2 = TwiceDifferentiable(vars -> LL_anon(x, y, vars[1:nvar], vars[nvar + 1]),
ones(nvar+1))
opt2 = optimize(fun2, ones(nvar+1))
Results of Optimization Algorithm
 * Algorithm: Newton's Method
 * Starting Point: [1.0,1.0,1.0]
 * Minimizer: [2.99420415098702,2.9900626550079026, …]
 * Minimum: 3.538340e+02
 * Iterations: 9
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false
     |x - x'| = 1.36e-11
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 1.61e-16 |f(x)|
   * |g(x)| ≤ 1.0e-08: true
     |g(x)| = 6.27e-09
   * Stopped by an increasing objective: true
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 45
 * Gradient Calls: 45
 * Hessian Calls: 9
fieldnames(fun2)
13-element Array{Symbol,1}:
:f
:df
:fdf
:h
:F
:DF
:H
:x_f
:x_df
:x_h
:f_calls
:df_calls
:h_calls
opt2.minimizer
3-element Array{Float64,1}:
2.98627
3.00654
-1.11313
numerical_hessian = (fun2.H) #.H is the numerical Hessian
3×3 Array{Float64,2}:
64.8715 -9.45045 0.000121521
-0.14568 66.4507 0.0
1.87326e-6 4.10675e-9 44.7214
From here, one can use the numerical Hessian to obtain the standard errors for the estimates and form t-statistics, etc. for their own functions.
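For example, a rough sketch of that step (not part of the original answer; note that the third parameter is log σ, so its standard error refers to the log scale):
# Asymptotic variance-covariance matrix: inverse of the Hessian of the
# negative log-likelihood at the optimum.
vcov = inv(numerical_hessian)
standard_errors = sqrt.(diag(vcov))   # `diag` needs `using LinearAlgebra` on Julia ≥ 0.7
t_stats = opt2.minimizer[1:nvar] ./ standard_errors[1:nvar]   # t-statistics for the β's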
Again, thank you for providing an answer and I hope people find this information useful.
I am interested in calculating the quantity
(x_i - μ_k)' * W_k * (x_i - μ_k)
where x_i is a 1×D vector (one out of my N data points of dimension D), μ is a D×K matrix (with k-th column μ_k), and W is a list of K D×D matrices.
For each data point this should result in a 1×K vector. I try it for all N and K in the following way, which works:
res = zeros(N, K);
for i in 1:N
    for k in 1:K
        res[i, k] = (x_matrix[i,:] - mus_matrix[:,k])' *
                    w_matrix[k] * (x_matrix[i,:] - mus_matrix[:,k])
    end
end
If I try to vectorize it, using the following:
res = zeros(N, K);
for i in 1:N
    res[i,:] = (x_matrix[i,:].-mus_matrix)'.*w_matrix.*(x_matrix[i,:].-mus_matrix)
end
I get the following error:
ERROR: DimensionMismatch("arrays could not be broadcast to a common size")
Stacktrace:
[1] _bcs1(::Base.OneTo{Int64}, ::Base.OneTo{Int64}) at ./broadcast.jl:70
[2] _bcs at ./broadcast.jl:63 [inlined]
[3] broadcast_shape(::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}, ::Tuple{Base.OneTo{Int64}}, ::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}, ::Vararg{Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},N} where N) at ./broadcast.jl:57 (repeats 3 times)
[4] broadcast_indices(::Array{Float64,2}, ::Array{Any,1}, ::Array{Float64,1}, ::Vararg{Any,N} where N) at ./broadcast.jl:53
[5] broadcast_c(::Function, ::Type{Array}, ::Array{Float64,2}, ::Array{Any,1}, ::Vararg{Any,N} where N) at ./broadcast.jl:311
[6] broadcast(::Function, ::Array{Float64,2}, ::Array{Any,1}, ::Array{Float64,1}, ::Vararg{Any,N} where N) at ./broadcast.jl:434
Here is an example:
julia> N = 5
5
julia> D=2
2
julia> K = 4
4
julia> W=[]
0-element Array{Any,1}
julia> x = rand(N,D)
5×2 Array{Float64,2}:
0.576477 0.9575
0.184454 0.660436
0.470267 0.729649
0.648879 0.782561
0.626453 0.111332
julia> mu = rand(K,D)
4×2 Array{Float64,2}:
0.989281 0.00126782
0.659106 0.66136
0.50843 0.289442
0.327962 0.523229
julia> for i in 1:K
push!(W,rand(D,D))
end
And then run
julia> (x_matrix[i,:]-mus_matrix[:,k])'*
w_matrix[k]*(x_matrix[i,:]-mus_matrix[:,k])
34649.850360744866
But with the second code
julia> (x_matrix[i,:].-mus_matrix)'.*w_matrix.*(x_matrix[i,:].-mus_matrix)
ERROR: DimensionMismatch("arrays could not be broadcast to a common size")
Stacktrace:
[1] _bcs1(::Base.OneTo{Int64}, ::Base.OneTo{Int64}) at ./broadcast.jl:70
[2] _bcs at ./broadcast.jl:63 [inlined]
[3] broadcast_shape(::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}, ::Tuple{Base.OneTo{Int64}}, ::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}, ::Vararg{Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},N} where N) at ./broadcast.jl:57 (repeats 3 times)
[4] broadcast_indices(::Array{Float64,2}, ::Array{Any,1}, ::Array{Float64,1}, ::Vararg{Any,N} where N) at ./broadcast.jl:53
[5] broadcast_c(::Function, ::Type{Array}, ::Array{Float64,2}, ::Array{Any,1}, ::Vararg{Any,N} where N) at ./broadcast.jl:311
[6] broadcast(::Function, ::Array{Float64,2}, ::Array{Any,1}, ::Array{Float64,1}, ::Vararg{Any,N} where N) at ./broadcast.jl:434
TL/DR: optimized variant below, but Einsum looks nicer, IMHO.
Looks like a case for using Einstein summation notation. In Julia, Einsum.jl can do this:
julia> N = 5
5
julia> D = 3
3
julia> K = 10
10
julia> x = rand(N, D)
5×3 Array{Float64,2}:
0.587436 0.210529 0.261725
0.527269 0.457477 0.482939
0.52726 0.411209 0.138872
0.89107 0.464789 0.758392
0.885267 0.931014 0.672959
julia> μ = rand(D, K)
3×10 Array{Float64,2}:
0.280792 0.265066 0.81437 0.503377 0.0717916 … 0.275872 0.609961 0.0820088 0.0042564
0.0177643 0.0959438 0.563948 0.332433 0.088527 0.691971 0.0296638 0.604488 0.956057
0.668128 0.444816 0.74203 0.518232 0.48689 0.465067 0.117469 0.729514 0.109973
julia> W = rand(K, D, D)
10×3×3 Array{Float64,3}:
[:, :, 1] =
0.320861 0.662103 0.219234
0.780944 0.769377 0.566203
0.466207 0.428527 0.330901
0.15534 0.035435 0.346737
0.810676 0.328116 0.469505
0.676575 0.668204 0.285334
0.455551 0.211295 0.85295
0.229995 0.741487 0.783361
0.0937583 0.401419 0.47032
0.956335 0.434213 0.967791
[:, :, 2] =
0.275903 0.130298 0.184485
0.941648 0.940107 0.439454
0.425292 0.252654 0.797115
0.0203406 0.594075 0.484809
0.164309 0.941597 0.455314
0.73628 0.109502 0.920664
0.906305 0.177235 0.540193
0.360038 0.0486971 0.20626
0.914357 0.699901 0.295872
0.284143 0.659117 0.291479
[:, :, 3] =
0.138311 0.921371 0.353719
0.345247 0.70865 0.246736
0.361364 0.636543 0.343837
0.752149 0.581561 0.346399
0.705888 0.24765 0.703952
0.992327 0.369668 0.109407
0.341624 0.223715 0.970667
0.762169 0.94248 0.917569
0.0367128 0.589345 0.121106
0.826602 0.692111 0.229499
julia> using Einsum
julia> @einsum r[n,k] := (x[n,i] - μ[i,k]) * W[k,i,j] * (x[n,j] - μ[j,k])
julia> r
5×10 Array{Float64,2}:
0.0176889 0.087092 0.522184 0.0417967 … -0.0430999 0.041266 -0.0596579 0.432076
0.0521066 0.364059 0.181515 0.00434307 -0.0248712 0.226976 -0.0686294 0.437169
-0.0472136 0.127803 0.458812 0.0119074 0.0391649 -0.0190299 -0.0585371 0.264379
0.468634 1.16498 -0.00263205 0.192809 0.273537 1.13787 -0.0653081 1.41321
0.749655 2.20266 0.0205068 0.420249 0.573358 1.42499 0.441232 1.67574
Which, as @macroexpand shows, turns into essentially the following loops (plus preparation and bounds checking):
begin
    local k
    for k = 1:size(μ, 2)
        begin
            local n
            for n = 1:size(x, 1)
                begin
                    local s = zero(T)
                    begin
                        local j
                        for j = 1:size(W, 3)
                            begin
                                local i
                                for i = 1:size(x, 2)
                                    s += (x[n, i] - μ[i, k]) * W[k, i, j] * (x[n, j] - μ[j, k])
                                end
                            end
                        end
                    end
                    r[n, k] = s
                end
            end
        end
    end
end
Now, to find something more performant, I compared a couple of variants using BenchmarkTools.jl. You can see the full code and results on my laptop here. It shows that the Einsum variant already is in fact better than the original:
# Original:
# memory estimate: 1017.73 MiB
# allocs estimate: 3429967
# median time: 361.982 ms (15.94% GC)
# Einsum:
# memory estimate: 2.64 MiB
# allocs estimate: 76
# median time: 127.536 ms (0.00% GC)
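For reference, a minimal sketch of how such a timing might be set up with BenchmarkTools.jl (the wrapper below is illustrative, not the exact benchmark script):
using BenchmarkTools

# Original double-loop version, wrapped in a function so the benchmark measures
# the computation itself rather than access to non-constant globals.
function test_original(x, μ, W)
    N, K = size(x, 1), size(μ, 2)
    res = zeros(N, K)
    for i in 1:N, k in 1:K
        d = x[i, :] - μ[:, k]
        res[i, k] = d' * W[k, :, :] * d
    end
    res
end

@btime test_original($x, $μ, $W);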
By far the most efficient and least allocating variant is the following, which requires x = x' and W = permutedims(W, [2, 3, 1]) (assuming you can change your representation easily):
function test_optimized!(res, x, μ, W)
    z = zero(eltype(x))
    for k = 1:size(μ, 2)        # K columns of μ
        for n = 1:size(x, 2)    # N columns of the transposed x
            res[n, k] = z
            for i = 1:size(W, 1)
                for j = 1:size(W, 2)
                    @inbounds res[n, k] += (x[i, n] - μ[i, k]) * W[i, j, k] * (x[j, n] - μ[j, k])
                end
            end
        end
    end
end

function test_optimized(x, μ, W)
    res = zeros(N, K)
    test_optimized!(res, x, μ, W)
    res
end
This brings us down to
# memory estimate: 2.63 MiB
# allocs estimate: 2
# median time: 521.215 μs (0.00% GC)
It uses a couple of "tricks" that can be found in the docs: filling a preallocated matrix in a separate method, accessing strides in column-major order, and using @inbounds (although that only improves things on the order of a microsecond).
There is also TensorOperations.jl, which I think does more intelligent things under the hood, but it fails on this:
julia> @tensor r[n,k] := (x[n,i] - μ[i,k]) * W[k,i,j] * (x[n,j] - μ[j,k])
ERROR: TensorOperations.IndexError{String}("invalid index specification: (:n, :i) to (:i, :k)")
Stacktrace:
[1] add_indices(::Tuple{Symbol,Symbol}, ::Tuple{Symbol,Symbol}) at /home/philipp/.julia/v0.6/TensorOperations/src/implementation/indices.jl:22
[2] + at /home/philipp/.julia/v0.6/TensorOperations/src/indexnotation/sum.jl:40 [inlined]
[3] -(::TensorOperations.IndexedObject{(:n, :i),:N,Array{Float64,2},Int64}, ::TensorOperations.IndexedObject{(:i, :k),:N,Array{Float64,2},Int64}) at /home/philipp/.julia/v0.6/TensorOperations/src/indexnotation/sum.jl:44
I guess that's deliberate and has to do with efficiency, see this issue.
I was curious how quick and accurate the algorithm from Rosetta Code (https://rosettacode.org/wiki/Ackermann_function) could be for the parameters (4, 2), but I got a StackOverflowError.
julia> using Memoize

julia> @memoize ack3(m, n) =
           m == 0 ? n + 1 :
           n == 0 ? ack3(m-1, 1) :
           ack3(m-1, ack3(m, n-1))
# WARNING! Next line has to calculate and print number with 19729 digits!
julia> ack3(4,2) # -> StackOverflowError
# has to be -> 2003529930406846464979072351560255750447825475569751419265016973710894059556311
# ...
# 4717124577965048175856395072895337539755822087777506072339445587895905719156733
EDIT:
Oscar Smith is right that trying ack3(4,2) this way is unrealistic. This is a version translated from Rosetta Code's C++:
module Ackermann
function ackermann(m::UInt, n::UInt)
    function ack(m::UInt, n::BigInt)
        if m == 0
            return n + 1
        elseif m == 1
            return n + 2
        elseif m == 2
            return 3 + 2 * n
        elseif m == 3
            return 5 + 8 * (BigInt(2) ^ n - 1)
        else
            if n == 0
                return ack(m - 1, BigInt(1))
            else
                return ack(m - 1, ack(m, n - 1))
            end
        end
    end
    return ack(m, BigInt(n))
end
end
julia> import Ackermann; Ackermann.ackermann(UInt(1),UInt(1)); @time(a4_2 = Ackermann.ackermann(UInt(4),UInt(2))); t = "$a4_2"; println("len = $(length(t)) first_digits=$(t[1:20]) last digits=$(t[end-20:end])")
0.000041 seconds (57 allocations: 33.344 KiB)
len = 19729 first_digits=20035299304068464649 last digits=445587895905719156733
Julia itself does not have an internal limit to the stack size, but your operating system does. The exact limits here (and how to change them) will be system dependent. On my Mac (and I assume other POSIX-y systems), I can check and change the stack size of programs that get called by my shell with ulimit:
$ ulimit -s
8192
$ julia -q
julia> f(x) = x > 0 ? f(x-1) : 0 # a simpler recursive function
f (generic function with 1 method)
julia> f(523918)
0
julia> f(523919)
ERROR: StackOverflowError:
Stacktrace:
[1] f(::Int64) at ./REPL[1]:1 (repeats 80000 times)
$ ulimit -s 16384
$ julia -q
julia> f(x) = x > 0 ? f(x-1) : 0
f (generic function with 1 method)
julia> f(1048206)
0
julia> f(1048207)
ERROR: StackOverflowError:
Stacktrace:
[1] f(::Int64) at ./REPL[1]:1 (repeats 80000 times)
I believe the exact number of recursive calls that will fit on your stack will depend upon both your system and the complexity of the function itself (that is, how much each recursive call needs to store on the stack); this simple f is about the bare minimum. I have no idea how big you'd need to make the stack limit in order to compute that Ackermann function.
Note that I doubled the stack size and it more than doubled the number of recursive calls — this is because of a constant overhead:
julia> log2(523918)
18.998981503278365
julia> 2^19 - 523918
370
julia> log2(1048206)
19.99949084151746
julia> 2^20 - 1048206
370
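So each recursive call of this simple f seems to occupy 16 bytes of stack, and the constant overhead corresponds to those 370 frames (illustrative arithmetic based on the numbers above):
julia> 8192 * 1024 / (523918 + 370)    # stack bytes per frame at ulimit -s 8192
16.0

julia> 16384 * 1024 / (1048206 + 370)  # and at ulimit -s 16384
16.0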
Just FYI, even if you change the max recursion depth, you won't get the right answer, because Julia uses 64-bit integers by default, so integer overflow will make it fail. To get the right answer you will have to use BigInts. The next problem is that you probably don't want to memoize: almost none of the computations are repeated, and you would be evaluating the function at more than 10^19729 different inputs, which you really do not want to store.
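To get a feel for the sizes involved: Ackermann(4, 2) equals 2^65536 - 3, which is far too large for an Int64 but easy to represent as a BigInt (illustrative REPL check, not from the original answers):
julia> typemax(Int64)             # largest 64-bit signed integer
9223372036854775807

julia> a = BigInt(2)^65536 - 3;   # Ackermann(4, 2) in closed form

julia> length(string(a))          # number of decimal digits
19729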
I have the following Fortran subroutine:
subroutine test(d, i, nMCd, DF, X)
integer, intent(in) :: d, i, nMCd
double precision, intent(in), dimension(i,nMCd) :: DF
double precision, intent(out), dimension(i) :: X
X = DF(:,d)+DF(:,d)
end subroutine test
I am able to compile it, load it into R, and run it. But instead of getting an array, I'm getting a single number.
system("R CMD SHLIB ./Fortran/mytest.f90")
dyn.load("./Fortran/mytest.so")
input <- data.frame(A=c(11,12), B=c(21, 22))
.Fortran("test", d = as.integer(1), i = nrow(input), nMCd = ncol(input), DF = unlist(input), X = as.numeric(1))
What am I doing wrong?!
My output looks like
$d
[1] 1
$i
[1] 2
$nMCd
NULL
$DF
A1 A2 B1 B2
11 12 21 22
$X
[1] 22
The R version of this is:
input[,1]+input[,1]
I haven't figured out what this was supposed to do, because I don't program in Fortran (and you didn't say what you expected in a language that I do read), but this experiment delivers the sum of the items in the first columns of the input object, which might make some sense when I look at the code together with the inputs. It seems possible that sending 1 for d, to extract DF(:,d) + DF(:,d), means you wanted the sums of the first columns. The reason you got back a single number is that .Fortran returns each argument with the shape you passed in, and you passed X = as.numeric(1), a length-1 vector. Notice that I just supplied an empty 4-element vector for X and made its Fortran dimensions the same as DF:
Source in file:
subroutine test(d, i, nMCd, DF, X)
integer, intent(in) :: d, i, nMCd
double precision, intent(in), dimension(i,nMCd) :: DF
double precision, intent(out), dimension(i,nMCd) :: X(i)
X = DF(:,d)+DF(:,d)
end subroutine test
R code:
input <- data.frame(A=c(11,12), B=c(21, 22))
.Fortran("test", d = as.integer(1), i = nrow(input), nMCd = ncol(input),
DF = unlist(input), X = numeric(4))
#--- result------
$d
[1] 1
$i
[1] 2
$nMCd
[1] 2
$DF
A1 A2 B1 B2
11 12 21 22
$X
[1] 22 24 0 0
Further experiment, still without any knowledge of Fortran, trying to add the items in the first row together:
X = DF(d,:)+DF(d,:)
Produced:
$X
[1] 22 42 0 0
The regular recursive approach for pow(x,n) is as follows:
pow (x,n):
= 1 ...n=0
= 0 ...x=0
= x ...n=1
= x * pow (x, n-1) ...n>0
With this approach 2^(37) will require 37 multiplications. How do I modify this to reduce the number of multiplications to fewer than 10? I think this is possible only if the function avoids recomputing results it has already calculated.
With this approach you can compute 2^(37) with only 7 multiplications.
pow(x,n):
= 1 ... n=0
= 0 ... x=0
= x ... n=1
= pow(x,n/2) * pow (x,n/2) ... n = even
= x * pow(x,n/2) * pow(x,n.2) ... n = odd
Now let's calculate 2^(37) with this approach, expanding each sub-power in turn:
2^(37) = 2 * 2^(18) * 2^(18)
2^(18) = 2^(9) * 2^(9)
2^(9)  = 2 * 2^(4) * 2^(4)
2^(4)  = 2^(2) * 2^(2)
2^(2)  = 2 * 2
Because the function reuses values once they have been calculated (pow(x, n/2) is computed once and its result squared rather than being recomputed), only 7 multiplications are required to calculate 2^(37).
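A small runnable sketch of this scheme in Julia (hypothetical names, not from the answer above), with a counter to verify the multiplication count:
mults = Ref(0)   # global multiplication counter, for verification only

function fastpow(x, n)
    n == 0 && return one(x)
    x == 0 && return zero(x)
    n == 1 && return x
    half = fastpow(x, n ÷ 2)   # computed once and reused below
    if iseven(n)
        mults[] += 1
        return half * half
    else
        mults[] += 2
        return x * half * half
    end
end

fastpow(2, 37)   # 137438953472
mults[]          # 7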
You can calculate the power of a number in O(log n) time instead of linear time.
int cnt = 0;

// calculate a^b (returning long long so results like 2^37 don't overflow a 32-bit int)
long long pow(long long a, int b){
    if(b == 0) return 1;
    if(b % 2 == 0){
        long long v = pow(a, b/2);
        cnt += 1;            // one multiplication: v*v
        return v*v;
    }else{
        long long v = pow(a, b/2);
        cnt += 2;            // two multiplications: v*v and then *a
        return v*v*a;
    }
}
Number of multiplications will be 9 for the above code as verified by this program.
Doing it slightly differently than invin did, I come up with 8 multiplications. Here's a Ruby implementation. Be aware that Ruby methods return the result of the last expression evaluated. With that understanding, it reads pretty much like pseudo-code except you can actually run it:
$count = 0

def pow(a, b)
  if b > 0
    $count += 1          # note only one multiplication in both of the following cases
    if b.even?
      x = pow(a, b/2)
      x * x
    else
      a * pow(a, b-1)
    end
  else                   # no multiplication for the base case
    1
  end
end

p pow(2, 37) # 137438953472
p $count     # 8
Note that the sequence of powers with which the method gets invoked is
37 -> 36 -> 18 -> 9 -> 8 -> 4 -> 2 -> 1 -> 0
and that each arrow represents one multiplication. Calculating the zeroth power always yields 1, with no multiplication, and there are 8 arrows.
Since x^n = (x^(n/2))^2 = (x^2)^(n/2) for even values of n, we can derive this subtly different implementation:
$count = 0

def pow(a, b)
  if b > 1
    if b.even?
      $count += 1
      pow(a * a, b/2)
    else
      $count += 2
      a * pow(a * a, b/2)
    end
  elsif b > 0
    a
  else
    1
  end
end

p pow(2, 37) # 137438953472
p $count     # 7
This version includes all of the base cases from the original question; it's easy to run and confirm that it calculates 2^37 in 7 multiplications, and it doesn't require any allocation of local variables. For production use you would, of course, comment out or remove the references to $count.