Difference between Unicode cdot (⋅) and * in Julia

I've started using the unicode cdot in place of * in my Julia code because I find it easier to read. I thought they were the same, but apparently there is a difference I don't understand. Is there documentation on this?
julia> 2pi⋅(0:1)
ERROR: MethodError: no method matching dot(::Float64, ::UnitRange{Int64})
Closest candidates are:
dot(::Number, ::Number) at linalg\generic.jl:301
dot{T<:Union{Float32,Float64},TI<:Integer}(::Array{T<:Union{Float32,Float64},1}, ::Union{Range{TI<:Integer},UnitRange{TI<:Integer}}, ::Array{T<:Union{Float32,Float64},1}, ::Union{Range{TI<:Integer},UnitRange{TI<:Integer}}) at linalg\matmul.jl:48
dot{T<:Union{Complex{Float32},Complex{Float64}},TI<:Integer}(::Array{T<:Union{Complex{Float32},Complex{Float64}},1}, ::Union{Range{TI<:Integer},UnitRange{TI<:Integer}}, ::Array{T<:Union{Complex{Float32},Complex{Float64}},1}, ::Union{Range{TI<:Integer},UnitRange{TI<:Integer}}) at linalg\matmul.jl:61
...
julia> 2pi*(0:1)
0.0:6.283185307179586:6.283185307179586

dot or ⋅ is not the same as multiplication (*). You can find out what it's for by typing ?dot:
help?> ⋅
search: ⋅
dot(x, y)
⋅(x,y)
Compute the dot product. For complex vectors, the first vector is conjugated. [...]
For more info about the dot product, see e.g. here.

It seems like you are conflating two different operators. The cdot aliases the dot function, while the asterisk * aliases multiplication routines.
I suspect that you want to do a dot product. The error that you see tells you that Julia does not know how to compute the dot product of a scalar floating point number (Float64) with an integer unit range (UnitRange{Int}). If you think about it, using dot here makes little sense.
In contrast, the second command 2pi*(0:1) computes the product of a scalar against the same UnitRange object. That simply rescales the range, and Julia has a method to do that.
A few options for you, depending on what you want to do:
Use * instead of dot here (easiest)
Code your own dot method to handle rescaling of UnitRange objects (probably not helpful)
Use elementwise multiplication .* (careful, not equal to dot!)
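To make the three operations concrete, here is a minimal sketch. It assumes Julia 1.x, where dot and ⋅ live in the LinearAlgebra standard library (the session above is from an older release where they were in Base). The vectors a and b are made-up illustration values.

```julia
using LinearAlgebra  # provides dot and the ⋅ operator in Julia 1.x

a = [1, 2, 3]
b = [4, 5, 6]

inner = a ⋅ b            # dot product, same as dot(a, b): 1*4 + 2*5 + 3*6 = 32
elementwise = a .* b     # elementwise product: [4, 10, 18]
scaled = 2pi * (0:1)     # scalar times range: a rescaled range, as in the question
```

The dot product collapses two vectors into one number, while .* keeps the shape; only * (or .*) does what the question intended with 2pi and a range.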

Related

Failure to report number that is too small

I did the following calculations in Julia
z = LinRange(-0.09025000000000001,0.19025000000000003,5)
d = Normal.(0.05*(1-0.95) .+ 0.95.*z .- 0.0051^2/2, 0.0051 .* (similar(z) .*0 .+1))
minimum(cdf.(d, (z[3]+z[2])/2))
The problem I have is that the last code sometimes gives me the correct result 4.418051841202834e-239, sometimes reports the error DomainError with NaN: Normal: the condition σ >= zero(σ) is not satisfied. I think this is because 4.418051841202834e-239 is too small. But I was wondering why my code can give me different results.
In addition to points mentioned by others, here are a few more:
Firstly, don't use LinRange when numerical accuracy is of importance. This is what the range function is for. LinRange can be used when numerical precision is of lesser importance, since it is faster. From the docstring of range:
Special care is taken to ensure intermediate values are computed rationally. To avoid this induced overhead, see the LinRange constructor.
Example:
julia> LinRange(-0.09025000000000001,0.19025000000000003,5) .- range(-0.09025000000000001,0.19025000000000003,5)
0.0:-3.469446951953614e-18:-1.3877787807814457e-17
Secondly, this is a pretty terrible way to create a vector of a certain value:
0.0051 .* (similar(z) .*0 .+1)
Others have mentioned ones, etc., but I think it's better to use fill
fill(0.0051, size(z))
which directly fills the array with the right value. Perhaps one should use convert(eltype(z), 0.0051) inside fill.
Thirdly, don't create this vector at all! You use broadcasting, so just use the scalar value:
d = Normal.(0.05*(1-0.95) .+ 0.95.*z .- 0.0051^2/2, 0.0051) # look! just a scalar!
This is how broadcasting works, it expands singleton dimensions implicitly to match other arguments (without actually wasting that memory).
Much of the point of broadcasting is that you don't need to create that sort of 'dummy arrays' anymore. If you find yourself doing that, give it another think; constant-valued arrays are inherently wasteful, and you shouldn't need to create them.
There are two problems:
Noted by @Dan Getz: similar does not initialize the values, and quite often unused areas of memory hold bit patterns corresponding to NaN. In that case multiplication by 0 does not help, since NaN * 0 is still NaN. Instead you want to have ones(eltype(z),size(z))
You need to use higher precision than Float64. BigFloat is one way to go; you just need to remember to call setprecision(BigFloat, 128) so you actually control how many bits you use. However, a much more time-efficient solution (if you run computations at scale) is to use a dedicated package such as DoubleFloats.
Sample corrected code using DoubleFloats below (this assumes DoubleFloats and Distributions have been loaded with using):
julia> z = LinRange(df64"-0.09025000000000001",df64"0.19025000000000003",5)
5-element LinRange{Double64, Int64}:
-0.09025000000000001,-0.020125,0.05000000000000001,0.12012500000000002,0.19025000000000003
julia> d = Normal.(0.05*(1-0.95) .+ 0.95.*z .- 0.0051^2/2, 0.0051 .* ones(eltype(z),size(z)))
5-element Vector{Normal{Double64}}:
Normal{Double64}(μ=-0.083250505, σ=0.0051)
Normal{Double64}(μ=-0.016631754999999998, σ=0.0051)
Normal{Double64}(μ=0.049986995000000006, σ=0.0051)
Normal{Double64}(μ=0.11660574500000001, σ=0.0051)
Normal{Double64}(μ=0.18322449500000001, σ=0.0051)
julia> minimum(cdf.(d, (z[3]+z[2])/2))
4.418051841203009e-239
The problem in the code is similar(z) which produces a vector with undefined entries and is used without initialization. Use ones(length(z)) instead.
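The uninitialized-memory issue can be reproduced without Distributions. The following is a minimal sketch using only Base; the variable names garbage, sigma_ones and sigma_fill are invented for illustration.

```julia
z = range(-0.09025, 0.19025, length=5)

# similar(z) only allocates a Vector{Float64}; its contents are whatever
# happened to be in memory and may differ from run to run.
garbage = similar(z)

# Multiplying by zero does not clean up a NaN.
nan_times_zero = NaN * 0          # still NaN

# Two safe ways to build a constant vector of standard deviations.
sigma_ones = 0.0051 .* ones(eltype(z), size(z))
sigma_fill = fill(0.0051, size(z))
```

Because garbage may or may not contain NaN, code built on it can succeed on one run and throw the DomainError on the next, which matches the behaviour described in the question.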

Reading parameter descriptions in Julia Documentation

Compared to the Python documentation, I find that the Julia documentation is much harder to read.
For example, the rand function:
rand([rng=GLOBAL_RNG], [S], [dims...])
How should I interpret this? What do the brackets mean? Which parameters are optional, and which are not?
Also, in Flux's documentation for Dense:
Dense(in, out, σ=identity; bias=true, init=glorot_uniform)
Why are some parameters separated by commas and others by semicolons?
The parameters in square brackets [] are optional. This is a documentation convention used across many programming languages; it is not part of the language syntax. Hence all parameters of rand are optional, and you can call just rand().
Actually it is a good idea to try to type methods(rand) in the console to see the huge number of methods required to cover all such use cases:
julia> methods(rand)
# 80 methods for generic function "rand":
[1] rand() in Random at c:\Julia-1.7.2\share\julia\stdlib\v1.7\Random\src\Random.jl:257
.....
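As a rough illustration of how the optional slots in rand([rng], [S], [dims...]) combine, here are a few calls. This is a sketch, not an exhaustive list; the seed 42 and the ranges are arbitrary choices.

```julia
using Random  # for MersenneTwister

rng = MersenneTwister(42)   # an explicit generator for the optional rng slot

rand()                      # no arguments: one Float64 in [0, 1)
rand(3)                     # only dims: a 3-element Vector{Float64}
rand(1:10)                  # only S: one element drawn from 1:10
rand(Float32, 2, 3)         # S and dims: a 2×3 Matrix{Float32}
rand(rng, 1:10, 4)          # rng, S and dims together
```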
The semicolon is part of the syntax: in a Julia function signature it separates positional parameters from keyword (named) parameters.
As an example consider a function:
function foo(a, b=4; c, d=8)
return a+b+c+d
end
Then you could do:
julia> foo(1,c=100)
113
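A slightly longer sketch of the same function shows which arguments can be left out. Here a is a required positional argument, b is an optional positional argument, c is a required keyword argument (it has no default), and d is an optional keyword argument.

```julia
function foo(a, b=4; c, d=8)
    return a + b + c + d
end

r1 = foo(1, c=100)            # b and d take their defaults: 1 + 4 + 100 + 8
r2 = foo(1, 2; c=3, d=4)      # everything given explicitly: 1 + 2 + 3 + 4

# Omitting the required keyword c raises an UndefKeywordError.
missing_c = try
    foo(1)
catch err
    err
end
```

At the call site the semicolon is optional, so foo(1, c=100) and foo(1; c=100) are equivalent.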

What's the concise notation for "option values"?

Is there a mathematical symbol or otherwise concise notation to represent option values (OCaml's option type, Haskell's Maybe...)?
It appears so often in functional programming that I would expect to find a concise syntax for this type, the same way lists have a somewhat standard [] notation, functions have the -> notation, and so on.
I know that in a more formal context one might use a partial function notation, but in most cases it doesn't fit as nicely as some explicit symbols for Some/None (or Just/Nothing).
Ideally, I'd like to write something like:
This function returns #42 if the input is valid, # otherwise.
Where #42 represents Some 42 and # represents None, but in a standard way, easily understandable by most readers (or at least those with some mathematical background).
I haven't seen any such specific notation. The closest I know is the use of mathematical symbols to express the type: α ⊕ 1. Here ⊕ represents direct sum (disjoint union) of types and 1 represents the unit type.
This notation is used in category theory and in type systems.
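For readers who think in code rather than in types, the sum α ⊕ 1 can be read as "either a wrapped value or the single unit value". As a loose analogy only, Julia's Some wrapper and nothing play those two roles; parse_valid below is a hypothetical function invented for this sketch.

```julia
# Hypothetical example function: returns Some(n) for a string holding a
# non-negative integer, and nothing otherwise.
function parse_valid(s::AbstractString)
    n = tryparse(Int, s)
    return (n === nothing || n < 0) ? nothing : Some(n)
end

parse_valid("42")                  # Some(42), the "α" side of the sum
parse_valid("oops")                # nothing, the "1" side of the sum
something(parse_valid("42"), 0)    # unwraps to 42
something(parse_valid("oops"), 0)  # falls back to the default 0
```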

Derivative Calculator

I'm interested in building a derivative calculator. I've racked my brains over the problem, but I haven't found a workable solution. Do you have a hint on how to start? Thanks
Sorry, to be clear: I want to do symbolic differentiation.
Let's say you have the function f(x) = x^3 + 2x^2 + x
I want to display the derivative, in this case f'(x) = 3x^2 + 4x + 1
I'd like to implement it in objective-c for the iPhone.
I assume that you're trying to find the exact derivative of a function. (Symbolic differentiation)
You need to parse the mathematical expression and store the individual operations in the function in a tree structure.
For example, x + sin²(x) would be stored as a + operation, applied to the expression x and a ^ (exponentiation) operation of sin(x) and 2.
You can then recursively differentiate the tree by applying the rules of differentiation to each node. For example, a + node would become u' + v', and a * node would become uv' + vu'.
You need to remember your calculus. Basically you need two things: a table of derivatives of basic functions, and rules for differentiating compound expressions (like d(f + g)/dx = df/dx + dg/dx). Then take an expression parser and recursively go over the tree. (http://www.sosmath.com/tables/derivative/derivative.html)
Parse your string into an S-expression (even though this is usually taken in Lisp context, you can do an equivalent thing in pretty much any language), easiest with lex/yacc or equivalent, then write a recursive "derive" function. In OCaml-ish dialect, something like this:
let rec derive var = function
| Const(_) -> Const(0)
| Var(x) -> if x = var then Const(1) else Deriv(Var(x), Var(var))
| Add(x, y) -> Add(derive var x, derive var y)
| Mul(a, b) -> Add(Mul(a, derive var b), Mul(derive var a, b))
...
(If you don't know OCaml syntax: derive is a two-parameter recursive function, with the first parameter being the variable name and the second being matched against the patterns on successive lines. For example, if this parameter is a structure of the form Add(x, y), it returns the structure Add built from two fields, with the values of derived x and derived y; and similarly for the other cases of what derive might receive as a parameter. The _ in the first pattern means "match anything".)
After this you might have some clean-up function to tidy up the resultant expression (reducing fractions etc.) but this gets complicated, and is not necessary for derivation itself (i.e. what you get without it is still a correct answer).
When your transformation of the S-expression is done, convert the result back into string form, again with a recursive function.
SLaks already described the procedure for symbolic differentiation. I'd just like to add a few things:
Symbolic math is mostly parsing and tree transformations. ANTLR is a great tool for both. I'd suggest starting with this great book Language implementation patterns
There are open-source programs that do what you want (e.g. Maxima). Dissecting such a program might be interesting, too (but it's probably easier to understand what's going on if you tried to write it yourself, first)
Probably, you also want some kind of simplification for the output. For example, just applying the basic derivative rules to the expression 2 * x would yield 2*1 + 0*x. This can also be done by tree processing (e.g. by transforming 0 * [...] to 0, 1 * [...] to [...], [...] + 0 to [...], and so on)
For what kinds of operations do you want to compute a derivative? If you allow trigonometric functions like sine, cosine and tangent, these are probably best stored in a table, while others like polynomials may be much easier to do. Are you allowing for functions to have multiple inputs, e.g. f(x,y) rather than just f(x)?
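The tree-rewriting idea for simplification can be sketched on Julia's built-in Expr trees. This is a minimal illustration that only handles the identities 0 * u = 0, 1 * u = u and u + 0 = u; the function name simplify is invented here, and a real simplifier needs many more rules.

```julia
# Leaves (numbers and symbols) are already as simple as they get.
simplify(e) = e

function simplify(e::Expr)
    e.head == :call || return e
    op = e.args[1]
    args = map(simplify, e.args[2:end])    # simplify children first
    if op == :*
        any(a -> a == 0, args) && return 0 # 0 * [...] becomes 0
        args = filter(a -> a != 1, args)   # drop factors of 1
        isempty(args) && return 1
    elseif op == :+
        args = filter(a -> a != 0, args)   # drop zero terms
        isempty(args) && return 0
    else
        return Expr(:call, op, args...)    # other operators: keep as-is
    end
    return length(args) == 1 ? args[1] : Expr(:call, op, args...)
end

simplify(:(2 * 1 + 0 * x))   # 2
simplify(:(x + 0))           # :x
```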
Polynomials in a single variable would be my suggestion and then consider adding in trigonometric, logarithmic, exponential and other advanced functions to compute derivatives which may be harder to do.
Symbolic differentiation over common functions (+, -, *, /, ^, sin, cos, etc.) ignoring regions where the function or its derivative is undefined is easy. What's difficult, perhaps counterintuitively, is simplifying the result afterward.
To do the differentiation, store the operations in a tree (or even just in Polish notation) and make a table of the derivative of each of the elementary operations. Then repeatedly apply the chain rule and the elementary derivatives, together with setting the derivative of a constant to 0. This is fast and easy to implement.
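The recursive procedure described in these answers can be sketched in a few lines on Julia's Expr trees, which spares us writing a parser. This is a minimal sketch under stated assumptions: the name deriv is invented, only +, binary * and powers with a numeric exponent are supported, and the result is not simplified.

```julia
# Base cases: constants differentiate to 0, the variable itself to 1.
deriv(e::Number, v::Symbol) = 0
deriv(e::Symbol, v::Symbol) = e == v ? 1 : 0

function deriv(e::Expr, v::Symbol)
    e.head == :call || error("unsupported expression: $e")
    op, args = e.args[1], e.args[2:end]
    if op == :+
        # Sum rule: differentiate every term.
        return Expr(:call, :+, map(a -> deriv(a, v), args)...)
    elseif op == :* && length(args) == 2
        # Product rule: (u*w)' = u*w' + u'*w
        u, w = args
        return :($u * $(deriv(w, v)) + $(deriv(u, v)) * $w)
    elseif op == :^ && args[2] isa Number
        # Power rule with chain rule: (u^n)' = n * u^(n-1) * u'
        u, n = args
        return :($n * $u ^ $(n - 1) * $(deriv(u, v)))
    end
    error("no rule for operator $op")
end

d = deriv(:(x^3 + 2x^2 + x), :x)   # unsimplified tree for 3x^2 + 4x + 1
eval(:(let x = 2; $d end))         # 21, which equals 3*2^2 + 4*2 + 1
```

The result is correct but cluttered with factors of 1 and terms multiplied by 0, which is where a separate simplification pass comes in.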

Why are the arguments to atan2 Y,X rather than X,Y?

In C the atan2 function has the following signature:
double atan2( double y, double x );
Other languages do this as well. This is the only function I know of that takes its arguments in Y,X order rather than X,Y order, and it screws me up regularly because when I think coordinates, I think (X,Y).
Does anyone know why atan2's argument order convention is this way?
Because I believe it is related to arctan(y/x), so y appears on top.
Here's a nice link talking about it a bit: Angles and Directions
My assumption has always been that this is because of the trig definition, ie that
tan(theta) = opposite / adjacent
When working with the canonical angle from the origin, opposite is always Y and adjacent is always X, so:
atan2(opposite, adjacent) = theta
Ie, it was done that way so there's no ordering confusion with respect to the mathematical definition.
Suppose a right triangle with its opposite side called y and its adjacent side called x:
tan(angle) = y/x
arctan(tan(angle)) = arctan(y/x)
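The relation to arctan(y/x) also shows why a two-argument version exists at all: the quotient discards the individual signs of x and y. A small sketch in Julia, where the two-argument form is spelled atan(y, x); the point (-1, 1) is an arbitrary example.

```julia
x, y = -1.0, 1.0            # a point in the second quadrant

one_arg = atan(y / x)       # -π/4: the quotient has lost the quadrant
two_arg = atan(y, x)        # 3π/4: y first, x second, quadrant preserved
```

Both calls put y first, mirroring the position of y in the fraction y/x.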
It's because in school, the mnemonic for calculating the gradient is rise over run, or in other words dy/dx, or more briefly y/x. And this order has snuck into the arguments of arctangent functions. So it's a historical artefact. For me it depends on what I'm thinking about when I use atan2. If I'm thinking about differentials, I get it right, and if I'm thinking about coordinate pairs, I get it wrong.
The order is ATAN2(X,Y) in Excel, so I think the reverse order is a programming thing. atan(Y/X) can easily be changed to atan2(Y,X) by putting a '2' between the 'n' and the '(' and replacing the '/' with a ',': only 2 operations. The opposite order would take 4 operations, and some of them would be more complex (cut and paste).
I often work out my math in Excel then port it to .NET, so will get hung up on atan2 sometimes. It would be best if atan2 could be standardized one way or the other.
It would be more convenient if atan2 had its arguments reversed. Then you wouldn't need to worry about flipping the arguments when computing polar angles. The Mathematica equivalent does just that: https://reference.wolfram.com/language/ref/ArcTan.html
Way back in the dawn of time, FORTRAN had an ATAN2 function with the less convenient argument order that, in this reference manual, is (somewhat inaccurately) described as arctan(arg1 / arg2).
It is plausible that the initial creator was fixated on atan2(arg1, arg2) being (more or less) arctan(arg1 / arg2), and that the decision was blindly copied from FORTRAN to C to C++ and Python and Java and JavaScript.
