Compared to the Python documentation, I find that Julia documentation are much harder to read.
For example, the rand function:
rand([rng=GLOBAL_RNG], [S], [dims...])
How should I interpret this? What do the brackets mean? Which parameters are optional, and which are not?
Also, in Flux's documentation for Dense:
Dense(in, out, σ=identity; bias=true, init=glorot_uniform)
Why are some parameters separated by commas and others by semicolons?
The parameters is square brackets [] are optional - this is a convention for documentation across many programming languages - this is not a part of language syntax though. Hence all parameters for rand are optional and you can do just rand.
Actually it is a good idea to try to type methods(rand) in the console to see the huge number of methods required to cover all such use cases:
julia> methods(rand)
# 80 methods for generic function "rand":
[1] rand() in Random at c:\Julia-1.7.2\share\julia\stdlib\v1.7\Random\src\Random.jl:257
.....
Semicolon is a part of syntax used for separating positional parameters from named parameters in Julia functions.
As an example consider a function:
function foo(a, b=4; c, d=8)
return a+b+c+d
end
Than you could do:
julia> foo(1,c=100)
113
Related
I'm starting to learn Julia, and I was wondering if there is an equivalent function in Julia that is doing the same as the epsilon function in Fortran.
In Fortran, the epsilon function of variable x gives the smallest number of the same kind of x that satisfies 1+epsilon(x)>1
I thought the the function eps() in Julia would be something similar, so I tried
eps(typeof(x))
but I got the error:
MethodError: no method matching eps(::Type{Int64})
Is there any other function that resembles the Fortran one that can be used on the different variables of the code?
If you really need eps to work for both Float and Int types, you can overload a method of eps for Int by writing Base.eps(Int) = one(Int). Then your code will work. This is okay for personal projects, but not a good idea for code you intend to share.
As the docstring for eps says:
help?> eps
eps(::Type{T}) where T<:AbstractFloat
eps()
eps is only defined for subtypes of AbstractFloat i.e. floating point numbers. It seems that your variable x is an integer variable, as the error message says no method matching eps(::Type{Int64}). It doesn't really make sense to define an eps for integers since the "the smallest number of the same kind of x that satisfies 1 + epsilon(x) > 1" is always going to be 1 for integers.
If you did want to get a 1 of the specific type of integer you have, you can use the one function instead:
julia> x = UInt8(42)
0x2a
julia> one(typeof(x))
0x01
It seems like what you actually want is
eps(x)
not
eps(typeof(x))
The former is exactly equivalent to your Fortran function.
I am new to Julia. Have a quick question.
In Fortran, we can do
implicit none
Therefore whenever we use a variable, we need to define it first. Otherwise it will give error.
In Julia, I want to do the same time thing. I want to define the type of each variable first, like Float64, Int64, etc. So that I wish that Julia no longer need to automatically do type conversion, which may slow down the code.
Because I know if Julia code looks like Fortran it usually runs fast.
So, in short, is there a way in Julia such that, I can force it to only use variables whose types has been defined, and otherwise give me an error message? I just want to be strict.
I just wanted to define all the variables first, then use them. Just like Fortran does.
Thanks!
[Edit]
As suggested by the experts who answers the questions, thank you very much! I know that perhaps in Julia there is no need to manually define each variables one by one. Because for one thing, if I do that, the code will become just like Fortran, and it can be very very long especially if I have many variables.
But if I do not define the type of the each variables, is Julia smart enough to know the type? Will it do some conversions which may slow down the code?
Also, even if there is no need to define variables one by one, is there in some situations we may have to manually define the type manually?
No, this is not possible* as such. You can, however, add type annotations at any point in your code, which will raise an error if the variable is not of the expected type:
julia> i = 1
1
julia> i::Int64
1
julia> i = 1.0
1.0
julia> i::Int64
ERROR: TypeError: in typeassert, expected Int64, got a value of type Float64
Stacktrace:
[1] top-level scope
# REPL[4]:1
julia> i=0x01::UInt8
0x01
*Julia is a dynamically-typed language, though there are packages such as https://github.com/aviatesk/JET.jl for static type-checking (i.e., no type annotations required) and https://github.com/JuliaDebug/Cthulhu.jl for exploring Julia's type inference system.
Strict type declaration is not how you achieve performance in Julia.
You achieve the performance by type stability.
Moreover declaring the types is usually not recommended because it decreases interoperability.
Consider the following function:
f(x::Int) = rand() < 4+x ? 1 : "0"
The types of everything seem to be known. However, this is a terrible (from performance point of view) function because the type of output can not be calcuated by looking types of input. And this is exactly how you write the performant code with regard to types.
So how to check your code? There is a special macro #code_warntype to detect such cases so you can correct your code:
Another type related issue are the containers that should not be specified by abstract elements. So you never want to have neither Vector{Any} nor Vector{Real} - rather than that you want Vector{Int} or Vector{Float64}.
See also https://docs.julialang.org/en/v1/manual/performance-tips/ for further discussion.
I am puzzled by the following results of typeof in the Julia 1.0.0 REPL:
# This makes sense.
julia> typeof(10)
Int64
# This surprised me.
julia> typeof(function)
ERROR: syntax: unexpected ")"
# No answer at all for return example and no error either.
julia> typeof(return)
# In the next two examples the REPL returns the input code.
julia> typeof(in)
typeof(in)
julia> typeof(typeof)
typeof(typeof)
# The "for" word returns an error like the "function" word.
julia> typeof(for)
ERROR: syntax: unexpected ")"
The Julia 1.0.0 documentation says for typeof
"Get the concrete type of x."
The typeof(function) example is the one that really surprised me. I expected a function to be a first-class object in Julia and have a type. I guess I need to understand types in Julia.
Any suggestions?
Edit
Per some comment questions below, here is an example based on a small function:
julia> function test() return "test"; end
test (generic function with 1 method)
julia> test()
"test"
julia> typeof(test)
typeof(test)
Based on this example, I would have expected typeof(test) to return generic function, not typeof(test).
To be clear, I am not a hardcore user of the Julia internals. What follows is an answer designed to be (hopefully) an intuitive explanation of what functions are in Julia for the non-hardcore user. I do think this (very good) question could also benefit from a more technical answer provided by one of the more core developers of the language. Also, this answer is longer than I'd like, but I've used multiple examples to try and make things as intuitive as possible.
As has been pointed out in the comments, function itself is a reserved keyword, and is not an actual function istself per se, and so is orthogonal to the actual question. This answer is intended to address your edit to the question.
Since Julia v0.6+, Function is an abstract supertype, much in the same way that Number is an abstract supertype. All functions, e.g. mean, user-defined functions, and anonymous functions, are subtypes of Function, in the same way that Float64 and Int are subtypes of Number.
This structure is deliberate and has several advantages.
Firstly, for reasons I don't fully understand, structuring functions in this way was the key to allowing anonymous functions in Julia to run just as fast as in-built functions from Base. See here and here as starting points if you want to learn more about this.
Secondly, because each function is its own subtype, you can now dispatch on specific functions. For example:
f1(f::T, x) where {T<:typeof(mean)} = f(x)
and:
f1(f::T, x) where {T<:typeof(sum)} = f(x) + 1
are different dispatch methods for the function f1
So, given all this, why does, e.g. typeof(sum) return typeof(sum), especially given that typeof(Float64) returns DataType? The issue here is that, roughly speaking, from a syntactical perspective, sum needs to serves two purposes simultaneously. It needs to be both a value, like e.g. 1.0, albeit one that is used to call the sum function on some input. But, it is also needs to be a type name, like Float64.
Obviously, it can't do both at the same time. So sum on its own behaves like a value. You can write f = sum ; f(randn(5)) to see how it behaves like a value. But we also need some way of representing the type of sum that will work not just for sum, but for any user-defined function, and any anonymous function. The developers decided to go with the (arguably) simplest option and have the type of sum print literally as typeof(sum), hence the behaviour you observe. Similarly if I write f1(x) = x ; typeof(f1), that will also return typeof(f1).
Anonymous functions are a bit more tricky, since they are not named as such. What should we do for typeof(x -> x^2)? What actually happens is that when you build an anonymous function, it is stored as a temporary global variable in the module Main, and given a number that serves as its type for lookup purposes. So if you write f = (x -> x^2), you'll get something back like #3 (generic function with 1 method), and typeof(f) will return something like getfield(Main, Symbol("##3#4")), where you can see that Symbol("##3#4") is the temporary type of this anonymous function stored in Main. (a side effect of this is that if you write code that keeps arbitrarily generating the same anonymous function over and over you will eventually overflow memory, since they are all actually being stored as separate global variables of their own type - however, this does not prevent you from doing something like this for n = 1:largenumber ; findall(y -> y > 1.0, x) ; end inside a function, since in this case the anonymous function is only compiled once at compile-time).
Relating all of this back to the Function supertype, you'll note that typeof(sum) <: Function returns true, showing that the type of sum, aka typeof(sum) is indeed a subtype of Function. And note also that typeof(typeof(sum)) returns DataType, in much the same way that typeof(typeof(1.0)) returns DataType, which shows how sum actually behaves like a value.
Now, given everything I've said, all the examples in your question now make sense. typeof(function) and typeof(for) return errors as they should, since function and for are reserved syntax. typeof(typeof) and typeof(in) correctly return (respectively) typeof(typeof), and typeof(in), since typeof and in are both functions. Note of course that typeof(typeof(typeof)) returns DataType.
In various web pages, I see references to jq functions with a slash and a number following them. For example:
walk/1
I found the above notation used on a stackoverflow page.
I could not find in the jq Manual page a definition as to what this notation means. I'm guessing it might indicate that the walk function that takes 1 argument. If so, I wonder why a more meaningful notation isn't used such as is used with signatures in C++, Java, and other languages:
<function>(type1, type2, ..., typeN)
Can anyone confirm what the notation <function>/<number> means? Are other variants used?
The notation name/arity gives the name and arity of the function. "arity" is the number of arguments (i.e., parameters), so for example explode/0 means you'd just write explode without any arguments, and map/1 means you'd write something like map(f).
The fact that 0-arity functions are invoked by name, without any parentheses, makes the notation especially handy. The fact that a function name can have multiple definitions at any one time (each definition having a distinct arity) makes it easy to distinguish between them.
This notation is not used in jq programs, but it is used in the output of the (new) built-in filter, builtins/0.
By contrast, in some other programming languages, it (or some close variant, e.g. module:name/arity in Erlang) is also part of the language.
Why?
There are various difficulties which typically arise when attempting to graft a notation that's suitable for languages in which method-dispatch is based on types onto ones in which dispatch is based solely on arity.
The first, as already noted, has to do with 0-arity functions. This is especially problematic for jq as 0-arity functions are invoked in jq without parentheses.
The second is that, in general, jq functions do not require their arguments to be any one jq type. Having to write something like nth(string+number) rather than just nth/1 would be tedious at best.
This is why the manual strenuously avoids using "name(type)"-style notation. Thus we see, for example, startswith(str), rather than startswith(string). That is, the parameter names in the documentation are clearly just names, though of course they often give strong type hints.
If you're wondering why the 'name/arity' convention isn't documented in the manual, it's probably largely because the documentation was mostly written before jq supported multi-arity functions.
In summary -- any notational scheme can be made to work, but name/arity is (1) concise; (2) precise in the jq context; (3) easy-to-learn; and (4) widely in use for arity-oriented languages, at least on this planet.
In Julia, the syntax to print a formatted string is as follows:
#printf("Hello %d\n", 5)
Why is #printf a macro instead of a function? Is it so that it can accept a varying number of arguments?
Taking a variable number of arguments is not a problem for normal Julia functions [1]. #printf is a macro so that it can parse and interpret the format string at compile time and generate custom code for that specific format string. People may not realize that C's printf function re-parses and re-interprets the format string each time you call printf. The fact that it's as fast as it is represents a minor miracle of insane pointer programming. Seriously, just look at your nearest libc's printf implementation. It's completely nuts.
Julia uses a different approach: #printf is a macro which translates format strings into efficient code specific to that format specification. If you think about it, a printf-style format string is really just a way to express a function that takes a fixed number and type of arguments and prints them in a particular way. Note that I said that the format string is a function, not printf itself, which is conceptually a function generator, turning formats into formatters. The fact that this is all crammed into a run-time function in C is a bit of a mismatch due to that being the only reasonable option in C. In fact, because of this, until very recently, it was rather easy to shoot yourself in the foot by passing the wrong number or type of arguments to C's printf. This is only better now because compilers have been special-cased to understand the semantics of printf formats.
In theory, Julia's #printf can be made faster than C since it generate's custom code, but in practice, I had a hard enough time matching C, let alone beating it. But I think that's due to the current design of our I/O system and how I'm using it, not an inherent limitation. The I/O stuff is due for an overhaul though, and when that happens, we might actually be able to beat C at formatted printing by leveraging the fact that #printf is a macro.
It is for performance. The printf macro takes a constant format string (eg. "Hello %d\n") and generates optimized code for that string.