I have been trying to build a type system with the Hindley-Milner Algorithm and ran into the following challenge and was curious if there are any resources or papers out there I can take a look at.
Suppose I have a programming language that has some form of property accessor (javascript-like) that works for both arrays and objects, s.t. the property within the brackets for an array needs to be a number and the property within the brackets for an object needs to be a string.
For example
const arr = [0, 1, 2]
arr[0]
const obj = { hello: "World" }
obj["hello"]
Suppose we want to use hindley-milner on the following code snippet
A[B]
Then if later we realized that B is a number, then we instantly deduce that A is an array. Similarly, if we deduce that A is an object, then B is instantly a number.
Are there any papers or type systems out there that have this sort of conditional substitution concept?
Related
In Pascal, I understand that one could create a function returning a pointer which can be dereferenced and then assign a value to that, such as in the following (obnoxiously useless) example:
type ptr = ^integer;
var d: integer;
function f(x: integer): ptr;
begin
f := #x;
end;
begin
f(d)^ := 4;
end.
And now d is 4.
(The actual usage is to access part of a quite complicated array of records data structure. I know that a class would be better than an array of nested records, but it isn't my code (it's TeX: The Program) and was written before Pascal implementations supported object-orientation. The code was written using essentially a language built on top of Pascal that added macros which expand before the compiler sees them. Thus you could define some macro m that takes an argument x and expands into thearray[x + 1].f1.f2 instead of writing that every time; the usage would be m(x) := somevalue. I want to replicate this functionality with a function instead of a macro.)
However, is it possible to achieve this functionality without the ^ operator? Can a function f be written such that f(x) := y (no caret) assigns the value y to x? I know that this is stupid and the answer is probably no, but I just (a) don't really like the look of it and (b) am trying to mimic exactly the form of the macro I mentioned above.
References are not first class objects in Pascal, unlike languages such as C++ or D. So the simple answer is that you cannot directly achieve what you want.
Using a pointer as you illustrated is one way to achieve the same effect although in real code you'd need to return the address of an object whose lifetime extends beyond that of the function. In your code that is not the case because the argument x is only valid until the function returns.
You could use an enhanced record with operator overloading to encapsulate the pointer, and so encapsulate the pointer dereferencing code. That may be a good option, but it very much depends on your overall problem, of which we do not have sight.
Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified? This requires, if we want to use it anyway, to go through a very artificial process, by representing these data in the form of a ridiculous single element table.
Ada, which had the same kind of limitation, abandoned it in its 2012 redesign to the great satisfaction of its users. A small keyword (like out in Ada) could very well indicate that the possibility of keeping the modifications of a parameter at the output is required.
From my experience in Julia it is useful to understand the difference between a value and a binding.
Values
Each value in Julia has a concrete type and location in memory. Value can be mutable or immutable. In particular when you define your own composite type you can decide if objects of this type should be mutable (mutable struct) or immutable (struct).
Of course Julia has in-built types and some of them are mutable (e.g. arrays) and other are immutable (e.g. numbers, strings). Of course there are design trade-offs between them. From my perspective two major benefits of immutable values are:
if a compiler works with immutable values it can perform many optimizations to speed up code;
a user is can be sure that passing an immutable to a function will not change it and such encapsulation can simplify code analysis.
However, in particular, if you want to wrap an immutable value in a mutable wrapper a standard way to do it is to use Ref like this:
julia> x = Ref(1)
Base.RefValue{Int64}(1)
julia> x[]
1
julia> x[] = 10
10
julia> x
Base.RefValue{Int64}(10)
julia> x[]
10
You can pass such values to a function and modify them inside. Of course Ref introduces a different type so method implementation has to be a bit different.
Variables
A variable is a name bound to a value. In general, except for some special cases like:
rebinding a variable from module A in module B;
redefining some constants, e.g. trying to reassign a function name with a non-function value;
rebinding a variable that has a specified type of allowed values with a value that cannot be converted to this type;
you can rebind a variable to point to any value you wish. Rebinding is performed most of the time using = or some special constructs (like in for, let or catch statements).
Now - getting to the point - function is passed a value not a binding. You can modify a binding of a function parameter (in other words: you can rebind a value that a parameter is pointing to), but this parameter is a fresh variable whose scope lies inside a function.
If, for instance, we wanted a call like:
x = 10
f(x)
change a binding of variable x it is impossible because f does not even know of existence of x. It only gets passed its value. In particular - as I have noted above - adding such a functionality would break the rule that module A cannot rebind variables form module B, as f might be defined in a module different than where x is defined.
What to do
Actually it is easy enough to work without this feature from my experience:
What I typically do is simply return a value from a function that I assign to a variable. In Julia it is very easy because of tuple unpacking syntax like e.g. x,y,z = f(x,y,z), where f can be defined e.g. as f(x,y,z) = 2x,3y,4z;
You can use macros which get expanded before code execution and thus can have an effect modifying a binding of a variable, e.g. macro plusone(x) return esc(:($x = $x+1)) end and now writing y=100; #plusone(y) will change the binding of y;
Finally you can use Ref as discussed above (or any other mutable wrapper - as you have noted in your question).
"Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified?" asked by Schemer
Your question is wrong because you assume the wrong things.
Parameters are variables
When you pass things to a function, often those things are values and not variables.
for example:
function double(x::Int64)
2 * x
end
Now what happens when you call it using
double(4)
What is the point of the function modifying it's parameter x , it's pointless. Furthermore the function has no idea how it is called.
Furthermore, Julia is built for speed.
A function that modifies its parameter will be hard to optimise because it causes side effects. A side effect is when a procedure/function changes objects/things outside of it's scope.
If a function does not modifies a variable that is part of its calling parameter then you can be safe knowing.
the variable will not have its value changed
the result of the function can be optimised to a constant
not calling the function will not break the program's behaviour
Those above three factors are what makes FUNCTIONAL language fast and NON FUNCTIONAL language slow.
Furthermore when you move into Parallel programming or Multi Threaded programming, you absolutely DO NOT WANT a variable having it's value changed without you (The programmer) knowing about it.
"How would you implement with your proposed macro, the function F(x) which returns a boolean value and modifies c by c:= c + 1. F can be used in the following piece of Ada code : c:= 0; While F(c) Loop ... End Loop;" asked by Schemer
I would write
function F(x)
boolean_result = perform_some_logic()
return (boolean_result,x+1)
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end
"Unfortunately no, because, and I should have said that, c has to take again the value 0 when F return the value False (c increases as long the Loop lives and return to 0 when it dies). " said Schemer
Then I would write
function F(x)
boolean_result = perform_some_logic()
if boolean_result == true
return (true,x+1)
else
return (false,0)
end
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end
I would like to check if a variable is scalar in julia, such as Integer, String, Number, but not AstractArray, Tuple, type, struct, etc. Is there a simple method to do this (i.e. isscalar(x))
The notion of what is, or is not a scalar is under-defined without more context.
Mathematically, a scalar is defined; (Wikipedia)
A scalar is an element of a field which is used to define a vector space.
That is to say, you need to define a vector space, based on a field, before you can determine if something is, or is not a scalar (relative to that vector space.).
For the right vector space, tuples could be a scalar.
Of-course we are not looking for a mathematically rigorous definition.
Just a pragmatic one.
Base it off what Broadcasting considers to be scalar
I suggest that the only meaningful way in which a scalar can be defined in julia, is of the behavior of broadcast.
As of Julia 1:
using Base.Broadcast
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = BroadcastStyle(T) isa Broadcast.DefaultArrayStyle{0}
See the docs for Broadcast.
In julia 0.7, Scalar is the default. So it is basically anything that doesn't have specific broadcasting behavior, i.e. it knocks out things like array and tuples etc.:
using Base.Broadcast
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = BroadcastStyle(T) isa Broadcast.Scalar
In julia 0.6 this is a bit more messy, but similar:
isscalar(x::T) where T = isscalar(T)
isscalar(::Type{T}) where T = Base.Broadcast._containertype(T)===Any
The advantage of using the methods for Broadcast to determine if something is scalar, over using your own methods, is that anyone making a new type that is going to act in a scalar way must make sure it works with those methods correctly
(or actually nonscalar since scalar is the default.)
Structs are not not scalar
That is to say: sometimes structs are scalar and sometimes they are not and it depends on the struct.
Note however that these methods do not consider struct to be non-scalar.
I think you are mistaken in your desire to.
Julia structs are not (necessarily or usually) a collection type.
Consider that: BigInteger, BigFloat, Complex128 etc etc
are all defined using structs
I was tempted to say that having a start method makes a type nonscalar, but that would be incorrect as start(::Number) is defined.
(This has been debated a few times)
For completeness, I am copying Tasos Papastylianou's answer from the comments to here. If all you want to do is distinguish scalars from arrays you can use:
isa(x, Number)
This will output true if x is a Number (like a float or an int), and output false if x is an Array (vector, matrix, etc.)
I found myself needing to capture the notion of if something was scalar or not recently in MultiResolutionIterators.jl.
I found the boardcasting based rules from the other answer,
did not meet my needs.
In particular I wanted to consider strings as nonscalar.
I defined a trait,
bases on method_exists(start, (T,)),
with some exceptions as mentioned e.g. for Number.
abstract type Scalarness end
struct Scalar <: Scalarness end
struct NotScalar <: Scalarness end
isscalar(::Type{Any}) = NotScalar() # if we don't know the type we can't really know if scalar or not
isscalar(::Type{<:AbstractString}) = NotScalar() # We consider strings to be nonscalar
isscalar(::Type{<:Number}) = Scalar() # We consider Numbers to be scalar
isscalar(::Type{Char}) = Scalar() # We consider Sharacter to be scalar
isscalar(::Type{T}) where T = method_exists(start, (T,)) ? NotScalar() : Scalar()
Something similar is also done by AbstractTrees.jl
isscalar(x) == applicable(start, x) && !isa(x, Integer) && !isa(x, Char) && !isa(x, Task)
So here is the setting. I have multiple composite types defined with their own fields and constructors. Lets show two simplified components here:
type component1
x
y
end
type component2
x
y
z
end
Now I want to define a new type such that It can save an array of size K of previously defined composite types in it. So it is a parametric composite type with two fields: one is an integer K, and the other is an array of size K of the type passed.
type mixture{T}
components::Array{T, 1}
K::Int64
function mixture(qq::T, K::Int64)
components = Array{typeof(qq), K}
for k in 1:K
components[k] = qq
end
new(components, K)
end
end
But this is not the correct way to do it. Because all the K components are referring to one object and manipulating mixture.components[k] will affect all K components. In python I can remedy this with deepcopy. But the deepcopy in Julia is not defined for composite types. How do I solve this problem?
An answer to your specific question:
When you define a new type in Julia, it is common to extend some of the standard methods in Base to your new type, including deepcopy. For example:
type MyType
x::Vector
y::Vector
end
import Base.deepcopy
Base.deepcopy(m::MyType) = MyType(deepcopy(m.x), deepcopy(m.y))
Now you can call deepcopy over an instance of MyType and you will get a new, truly independent, copy of MyType as the output.
Note, my import Base.deepcopy is actually redundant, since I've referenced Base in my function definition, e.g. Base.deepcopy(m::MyType). However, I did both of these to show you the two ways of extending a method from Base.
Second note, if your type has lots of fields, you might instead iterate over the fields using deepcopy as follows:
Base.deepcopy(m::MyType) = MyType([ deepcopy(getfield(m, k)) for k = 1:length(names(m)) ]...)
A comment on your code:
First, it is standard practice in Julia to capitalize type names, e.g. Component1 instead of component1. Of course, you don't have to do this, but...
Second, from the Julia docs performance tips: Declare specific types for fields of composite types. Note, you can parameterize these declarations, e.g.
type Component1{T1, T2}
x::T1
y::T2
end
Third, here is how I would have defined your new type:
type Mixture{T}
components::Vector{T}
Mixture{T}(c::Vector{T}) = new(c)
end
Mixture{T}(c::Vector{T}) = Mixture{eltype(c)}(c)
Mixture(x, K::Int) = Mixture([ deepcopy(x) for k = 1:K ])
There are several important differences here between my code and yours. I'll go through them one at a time.
Your K field was redundant (I think) because it appears to just be the length of components. So it might be simpler to just extend the length method to your new type as follows:
Base.length(m::Mixture) = length(m.components)
and now you can use length(m), where m is an instance of Mixture to get what was previously stored in the K field.
The inner constructor in your type mixture was unusual. Standard practice is for the inner constructor to take arguments that correspond one-to-one (in sequence) to the fields of your type, and then the remainder of the inner constructor just performs any error checks you would like to be done whenever initialising your type. You deviated from this since qq was not (necessarily) an array. This kind of behaviour is better reserved for outer constructors. So, what have I done with constructors?
The inner constructor of Mixture doesn't really do anything other than pass the argument into the field via new. This is because currently there aren't any error checks I need to perform (but I can easily add some in the future).
If I want to call this inner constructor, I need to write something like m = Mixture{MyType}(x), where x is Vector{MyType}. This is a bit annoying. So my first outer constructor automatically infers the contents of {...} using eltype(x). Because of my first outer constructor, I can now initialise Mixture using m = Mixture(x) instead of m = Mixture{MyType}(x)
My second outer constructor corresponds to your inner constructor. It seems to me the behaviour you are after here is to initialise Mixture with the same component in every field of components, repeated K times. So I do this with a loop comprehension over x, which will work as long as the deepcopy method has been defined for x. If no deepcopy method exists, you'll get a No Method Exists error. This kind of programming is called duck-typing, and in Julia there is typically no performance penalty to using it.
Note, my second outer constructor will call my first outer constructor K times, and each time, my first outer constructor will call my inner constructor. Nesting functionality in this way will, in more complicated scenarios, massively cut-down on code duplication.
Sorry, this is a lot to take in I know. Hope it helps.
I am trying to create a function
import Language.Reflection
foo : Type -> TT
I tried it by using the reflect tactic:
foo = proof
{
intro t
reflect t
}
but this reflects on the variable t itself:
*SOQuestion> foo
\t => P Bound (UN "t") (TType (UVar 41)) : Type -> TT
Reflection in Idris is a purely syntactic, compile-time only feature. To predict how it will work, you need to know about how Idris converts your program to its core language. Importantly, you won't be able to get ahold of reflected terms at runtime and reconstruct them like you would with Lisp. Here's how your program is compiled:
Internally, Idris creates a hole that will expect something of type Type -> TT.
It runs the proof script for foo in this state. We start with no assumptions and a goal of type Type -> TT. That is, there's a term being constructed which looks like ?rhs : Type => TT . rhs. The ?foo : ty => body syntax shows that there's a hole called foo whose eventual value will be available inside of body.
The step intro t creates a function whose argument is t : Type - this means that we now have a term like ?foo_body : TT . \t : Type => foo_body.
The reflect t step then fills the current hole by taking the term on its right-hand side and converting it to a TT. That term is in fact just a reference to the argument of the function, so you get the variable t. reflect, like all other proof script steps, only has access to the information that is available directly at compile time. Thus, the result of filling in foo_body with the reflection of the term t is P Bound (UN "t") (TType (UVar (-1))).
If you could do what you are wanting here, it would have major consequences both for understanding Idris code and for running it efficiently.
The loss in understanding would come from the inability to use parametricity to reason about the behavior of functions based on their types. All functions would effectively become potentially ad-hoc polymorphic, because they could (say) run differently on lists of strings than on lists of ints.
The loss in performance would come from representing enough type information to do the reflection. After Idris code is compiled, there is no type information left in it (unlike in a system such as the JVM or .NET or a dynamically typed system such as Python, where types have a runtime representation that code can access). In Idris, types can be very large, because they can contain arbitrary programs - this means that far more information would have to be maintained, and computation occurring at the type level would also have to be preserved and repeated at runtime.
If you're wanting to reflect on the structure of a type for further proof automation at compile time, take a look at the applyTactic tactic. Its argument should be a function that takes a reflected context and goal and gives back a new reflected tactic script. An example can be seen in the Data.Vect source.
So I suppose the summary is that Idris can't do what you want, and it probably never will be able to, but you might be able to make progress another way.