I'm new to Julia, so I would welcome some advice on improving the following function:
using SpecialFunctions
function rb(x, nu_max)
    bj = Array{Complex64}(length(x), nu_max)
    nu = 0.5 + (0:nu_max)
    # somehow dot broadcast isn't happy
    # bj .= [ besselj(_nu,_x)*sqrt(pi/2*_x) for _nu in nu, _x in x]
    bj = [ besselj(_nu,_x)*sqrt(pi/2*_x) for _nu in nu, _x in x]
end

rb(1.0:0.1:2.0, 500)
Basically, I'm not quite sure what the recommended way is to get a matrix over these two parameters (x and nu). The documentation doesn't offer much information, but I understand that the underlying Fortran routine internally loops over nu, so I'd rather not do it again, in the interest of performance.
Edit:
I'm asked about the goal; it's to compute the Riccati-Bessel functions $j_1(x,\nu)$ and $h_1(x,\nu)$ for multiple values of $x$ and $\nu$.
I've stripped down stylistic questions from the original version to focus on this core issue.
This is a great example where you can take full advantage of broadcasting. It looks like you want the Cartesian product of x and nu, where the rows are populated by the values of nu and the columns by the values of x. This is exactly what broadcasting can do; you just need to reshape x so that it's a single row across many columns:
julia> using SpecialFunctions
julia> x = 1.0:0.1:2.0
1.0:0.1:2.0
julia> nu = 0.5 + (0:500)
0.5:1.0:500.5
# this shows how broadcast works — these are the arguments and their location in the matrix
julia> tuple.(nu, reshape(x, 1, :))
501×11 Array{Tuple{Float64,Float64},2}:
(0.5, 1.0) (0.5, 1.1) … (0.5, 1.9) (0.5, 2.0)
(1.5, 1.0) (1.5, 1.1) (1.5, 1.9) (1.5, 2.0)
(2.5, 1.0) (2.5, 1.1) (2.5, 1.9) (2.5, 2.0)
(3.5, 1.0) (3.5, 1.1) (3.5, 1.9) (3.5, 2.0)
⋮ ⋱ ⋮
(497.5, 1.0) (497.5, 1.1) (497.5, 1.9) (497.5, 2.0)
(498.5, 1.0) (498.5, 1.1) (498.5, 1.9) (498.5, 2.0)
(499.5, 1.0) (499.5, 1.1) (499.5, 1.9) (499.5, 2.0)
(500.5, 1.0) (500.5, 1.1) … (500.5, 1.9) (500.5, 2.0)
julia> bj = besselj.(nu,reshape(x, 1, :)).*sqrt.(pi/2*reshape(x, 1, :))
501×11 Array{Float64,2}:
0.841471 0.891207 0.932039 … 0.9463 0.909297
0.301169 0.356592 0.414341 0.821342 0.870796
0.0620351 0.0813173 0.103815 0.350556 0.396896
0.00900658 0.0130319 0.0182194 0.101174 0.121444
⋮ ⋱ ⋮
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 … 0.0 0.0
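Folding that broadcast back into the original function gives a compact replacement for rb (a sketch in up-to-date syntax; it returns the (nu_max+1) × length(x) matrix directly):

using SpecialFunctions

function rb(x, nu_max)
    nu = 0.5 .+ (0:nu_max)            # half-integer orders 0.5, 1.5, ...
    xr = reshape(collect(x), 1, :)    # one row: nu varies down, x varies across
    return besselj.(nu, xr) .* sqrt.(pi / 2 .* xr)
end

rb(1.0:0.1:2.0, 500)                  # 501×11 matrix, as above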
Elaborating on my comment above: at first glance, and in general, try to avoid temporary allocations by preallocating arrays and filling them in-place (e.g. using dot broadcasting). Also, maybe use @inbounds.
To give you an impression, after
using SpecialFunctions
x = 1.0
nu_max = 3
nu = 0.5 + (0:nu_max)
f(nu,x) = besselj.(nu,x).*sqrt.(pi/2*x)
compare (using BenchmarkTools) performance (and allocations) of
bj = hcat([ besselj.(_nu,x).*sqrt.(pi/2*x) for _nu in nu]...)
and
f.(nu,x)
(Technically the output is not identical; you would have to use vcat above, but anyway.)
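Concretely, with BenchmarkTools the comparison is (the $ interpolation keeps the global variables from skewing the timings):

julia> using BenchmarkTools

julia> @btime hcat([ besselj.(_nu, $x) .* sqrt.(pi/2 * $x) for _nu in $nu ]...);

julia> @btime f.($nu, $x);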
UPDATE (after the OP stripped his code down):
Ok, I think I (finally) see your real question now (sorry for that). What I said above was about optimizing your original code with respect to how it calls besselj and how efficiently it processes the output (see @Matt B.'s post for the nice full broadcast solution).
IIUC, you want to exploit the fact (I don't know and didn't check whether it is actually true) that the calculation of besselj for given nu and x internally involves a summation over nu. In other words, you want to reuse intermediate results of this internal summation to avoid redundant calculations.
As SpecialFunctions's besselj just seems to call the Fortran routine (probably here), I doubt that you can access any of this information. Unfortunately I can't help you along here (I'd probably look for a pure Julia implementation of besselj).
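To illustrate what a pure-Julia route could exploit: consecutive orders of J are linked by the three-term recurrence J_{nu+1}(x) = (2 nu / x) J_nu(x) - J_{nu-1}(x), so in principle two seed evaluations give you the whole nu column. A sketch (besselj_orders is a hypothetical helper; note that upward recurrence for J is numerically unstable once nu exceeds x, so a serious implementation would recur downward instead):

using SpecialFunctions

function besselj_orders(nu0, nu_max, x)
    out = Vector{Float64}(undef, nu_max + 1)
    out[1] = besselj(nu0, x)        # two direct calls seed the recurrence
    out[2] = besselj(nu0 + 1, x)
    for k in 2:nu_max
        nu = nu0 + (k - 1)          # order stored in out[k]
        out[k + 1] = (2nu / x) * out[k] - out[k - 1]
    end
    return out
end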
I am using a modified Weibull CDF to predict the height of trees in a stand at a given age: Y = b0 * {1.0 + EXP[-b1 * (X**b2)]}, where Y = height of dominant trees, X = stand age, b0 is the asymptote parameter to be predicted, b1 is the rate parameter to be predicted, and b2 is the shape parameter to be predicted. Each of b0, b1, b2 is a linear function of one or more bio-geo-climatic and/or physiographic variables. I have a data set of N ~ 2000 stands. I get an initial fit at a higher tolerance, and then sample the data with replacement (i.e. bootstrap) and re-fit at a lower tolerance. With ~200 iterations, I generate a distribution of parameter estimates from which I can identify the least-significant element across b0, b1, b2. Eliminate (or add), and repeat.
I have a working version of this process in R using OPTIM, where the function being minimized evaluates Z = the sum of squares (SSQ) rather than the Y values directly. My problem is computing time: the initial fit requires about 1 day, and the 200 bootstraps require an additional 2-3 days. The 40 or so additions/reductions of variables have been spinning around continuously on a server since August 2022, so I am attempting to port this into a FORTRAN 90 .dll called from R.
Here is what my data look like:
Y  <- c(50.1, 80.3, 60.4, ... , 75.5, 90.2)   # len(Y) = 2000
X  <- c(21, 38, 27, ... , 34, 37)             # len(X) = 2000
b0 <- f(P1, P2, P3, P4)                       # len(b0) = 4
b1 <- f(P3, P5, P7, P8, P12)                  # len(b1) = 5
b2 <- f(P6, P2, P8, P9, P10)                  # len(b2) = 5
where P is the set of bio-geo-climatic and physiographic variables, with values at each X. Note that some of the same predictors are repeated across the parameters, but since they act on different parts of the equation (asymptote, rate, shape), their sign and magnitude will vary, and they are therefore treated as separate variables. My data matrix has 2000 rows by (len(b0) + len(b1) + len(b2) + 3) columns, one for each predictor in each parameter, plus an intercept term for each of b0, b1, b2. The number of columns may vary over time as I add/subtract terms. So my fitting data is a matrix with 2000 rows and a column structure that looks like this:
(icpt0, P1, P2, P3, P4, icpt1, P3, P5, P7, P8, P12, icpt2, P6, P2, P8, P9, P10), cols = 17, rows = 2000
When evaluating the function I grab the appropriate columns to get the parameters:
Y.hat    = (icpt0 + P1 + P2 + P3 + P4) * {1.0 + EXP[-(icpt1 + P3 + P5 + P7 + P8 + P12) * (X**(icpt2 + P6 + P2 + P8 + P9 + P10))]}
residual = Y(X) - Y.hat(X)
squares  = residual**2
SSQ      = sum of squares across D
Z(i)     = SSQ   (this is what I'm actually minimizing, not Y across D)
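In code, that objective looks something like the following sketch (Julia, purely for concreteness; the ssq name and the hard-coded column blocks are mine, matching the 17-column layout above):

function ssq(coef, D, X, Y)
    # column blocks of D: 1:5 -> b0 terms, 6:11 -> b1 terms, 12:17 -> b2 terms
    b0 = D[:, 1:5]   * coef[1:5]      # asymptote, one value per stand
    b1 = D[:, 6:11]  * coef[6:11]     # rate
    b2 = D[:, 12:17] * coef[12:17]    # shape
    Yhat = b0 .* (1.0 .+ exp.(-b1 .* X .^ b2))
    return sum((Y .- Yhat) .^ 2)      # Z = SSQ, the value Nelder-Mead minimizes
end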
I need help constructing the initial coefficient simplex S, a (cols+1) x cols matrix of vertices, to pass to the FORTRAN implementation of Nelder-Mead. Within R and OPTIM, I only need to pass a single point (b0, b1, b2), where the estimates are first applied only to the intercepts. I'm not sure how to construct the appropriate unit vectors to build a robust initial matrix.
If my initial point estimate is b0 = 200.0, b1 = 0.001, b2 = 1.5, then the first rows of S (S(1) through S(5)) look like:
        icpt0   P1    P2    P3     P4    icpt1   P3    P5    P7    P8    P12   icpt2   P6    P2    P8    P9    P10
S(1):   200.0   0.0   0.0   0.0    0.0   0.001   0.0   0.0   0.0   0.0   0.0   1.5     0.0   0.0   0.0   0.0   0.0
S(2):   0.0     4.0   0.0   0.0    0.0   0.001   0.0   0.0   0.0   0.0   0.0   1.5     0.0   0.0   0.0   0.0   0.0
S(3):   0.0     0.0   0.7   0.0    0.0   0.001   0.0   0.0   0.0   0.0   0.0   1.5     0.0   0.0   0.0   0.0   0.0
S(4):   0.0     0.0   0.0   12.0   0.0   0.001   0.0   0.0   0.0   0.0   0.0   1.5     0.0   0.0   0.0   0.0   0.0
S(5):   0.0     0.0   0.0   0.0    2.3   0.001   0.0   0.0   0.0   0.0   0.0   1.5     0.0   0.0   0.0   0.0   0.0
If the average of P1 across D is 50, then S(2) is row 2 above, with the expected parameter = 200/50 = 4.0. I repeated this process for each row, so that I have a diagonal of positive coefficient values across S, but zeroes or intercept terms otherwise. I see that I could input 0.00025 for all zeroes and perturb the positive values by 0.05, but I'm not sure what that really changes.
My FORTRAN .dll appears to work; however, the results do not match the output of the R/OPTIM version (using the same data/predictors). The FORTRAN results yield Z values that are at least an order of magnitude larger than R's, and my re-starts don't improve anything. I figure that R/OPTIM is constructing a different version of S than the one above. What would a better initial simplex S look like? What would the unit vectors look like? I'm struggling because my initial points are (b0, b1, b2), but each is a linear function, so I don't know how to construct unit vectors to get a full array of positive values, which I suspect is what's needed.
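For reference, the 0.00025 and 0.05 constants mentioned above belong to the standard axis-step construction (the fminsearch-style rule): vertex 1 is the starting point x0, and vertex j+1 copies x0 but perturbs coordinate j alone, by 5% of its value, or to 0.00025 when that coordinate is zero. A sketch of that rule (Julia, purely for concreteness; initial_simplex is a hypothetical helper):

function initial_simplex(x0)
    n = length(x0)
    S = repeat(reshape(x0, 1, :), n + 1, 1)   # (n+1) x n, every row starts as x0
    for j in 1:n
        S[j + 1, j] = x0[j] == 0 ? 0.00025 : 1.05 * x0[j]
    end
    return S
end

# one entry per column of the data matrix: the intercepts get the point
# estimates, all other coefficients start at zero
x0 = [200.0; zeros(4); 0.001; zeros(5); 1.5; zeros(5)]
S = initial_simplex(x0)   # 18 x 17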
I am an absolute newbie, trying to plot y = 1/x for x = (1.0, 2.0, 3.0, 4.0, 5.0), and I am getting MethodError: no method matching /(::Int64, ::NTuple{10,Float64}).
I have tried y = x^(-1), which seems to give the same result. The documentation didn't help... or I can't find the right page.
If you want to apply the operation element-wise, you need to use broadcasting in Julia, e.g. with the "dot notation":
julia> x=(1.0, 2.0, 3.0, 4.0, 5.0)
(1.0, 2.0, 3.0, 4.0, 5.0)
julia> y = 1 ./ x
(1.0, 0.5, 0.3333333333333333, 0.25, 0.2)
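The x^(-1) attempt fails for the same reason, and the dotted power fixes it the same way:

julia> y = x .^ -1
(1.0, 0.5, 0.3333333333333333, 0.25, 0.2)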
See https://docs.julialang.org/en/v1/manual/arrays/#Broadcasting-1
I'm interested in the fastest way to linearly interpolate a 1D function on regularly spaced data.
I don't quite understand how to use the scale function in Interpolations.jl:
using Interpolations
v = [x^2 for x in 0:0.1:1]
itp=interpolate(v,BSpline(Linear()),OnGrid())
itp[1]
# 0.0
itp[11]
# 1.0
scale(itp,0:0.1:1)
itp[0]
# -0.010000000000000002
# why is this not equal to 0.0, i.e. the value at the lowest index?
The scale function does not alter the object in place (a mutating version would be named scale!); it returns a new, scaled interpolation object, which you need to capture:
julia> sitp = scale(itp,0:0.1:1)
11-element Interpolations.ScaledInterpolation{Float64,1,Interpolations.BSplineInterpolation{Float64,1,Array{Float64,1},Interpolations.BSpline{Interpolations.Linear},Interpolations.OnGrid,0},Interpolations.BSpline{Interpolations.Linear},Interpolations.OnGrid,Tuple{FloatRange{Float64}}}:
julia> sitp[0]
0.0
Thanks to spencerlyon for pointing that out.
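Putting it all together (same Interpolations API as in the question; the expected values are in the comments):

using Interpolations
v = [x^2 for x in 0:0.1:1]
itp = interpolate(v, BSpline(Linear()), OnGrid())
sitp = scale(itp, 0:0.1:1)   # capture the returned scaled object
sitp[0.5]
# ≈ 0.25: evaluated at the coordinate x = 0.5, not at index 5
sitp[0.55]
# ≈ 0.305: linear interpolation between 0.5^2 and 0.6^2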
I'm trying to plot a parametric equation that was partially obtained using NSolve. Here's my attempted code:
VolumeDiff[v_] = 1.7 - v
SolveR[v_] = Re[NSolve[16 v^2 - 16 v*(r^3) + 3 (r^2) + 1 == 0, r, Reals]]
EnergyPos[r_] = r/2 (r + Sqrt[r^2 - 1])
EnergyNet[r_] = EnergyPos[SolveR[r]] + EnergyPos[SolveR[VolumeDiff[r]]]
ParametricPlot[{Re[EnergyNet[x]], 1.7 - 2. x}, {x, .1, 1.6}]
Basically, I have a cubic with two variables; I solve for one given the other and try to plot two parametric equations based on that original given variable. This is supposed to be a graph of energy vs. volume difference of two bubbles attached together. However, my axes are blank. I used NSolve to isolate the real root of the cubic equation, and I guess Mathematica has a problem graphing with NSolve involved. I looked all over the internet but I couldn't find any answers. Thanks for any help!
David
Several errors corrected.
You should read about how SetDelayed (:=) and NSolve[] work. In particular, NSolve returns a list of replacement rules rather than a bare number, so the root has to be extracted with r /. rules:
VolumeDiff[v_] := 1.7 - v
SolveR[v_] := NSolve[16 v^2 - 16 v*(r^3) + 3 (r^2) + 1 == 0, r, Reals][[1]]
EnergyPos[r_] := r/2 (r + Sqrt[r^2 - 1])
EnergyNet[r_] := EnergyPos[r /. SolveR[r]]+EnergyPos[r /. SolveR[VolumeDiff[r]]]
ParametricPlot[{EnergyNet[x], 1.7 - 2. x}, {x, .1, 2}]
With naming variables, I'd like to be as clear as possible.
A percentage can range from 0 to 100. My public variable only accepts values between 0.0 and 1.0, so naming it a "percentage" can lead to confusion, and simply naming it a "value" will not clarify the range limit.
Is there a "percent" equivalent or naming convention for variables representing values that range from 0.0 to 1.0?
0.0 to 1.0 is a percentage as well. You didn't get your definition right: a percentage ranges from 0% to 100%, or equivalently from 0.0 to 1.0. It means the same thing; the % sign means percent = per cent = per hundred.
The range 0.0 to 1.0 is normally used in statistics, while the range 0% to 100% is found more in everyday life, as people can get their heads around it more easily.
OpenGL uses the term "normalized" for values in the [0, 1] range.
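For example, making the constraint explicit next to such a name (Julia; the variable name is hypothetical):

normalized_opacity = 0.75   # "normalized" in the name signals the [0.0, 1.0] range
@assert 0.0 <= normalized_opacity <= 1.0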