Idris vectors vs linked lists

Does Idris do any kind of optimization of vectors under the hood? From the looks of it, an Idris vector is just a linked list whose size is known at compile time. In fact, in general it seems like you could express the following equivalence (I'm guessing a bit at the syntax):
Vector : Nat -> Type -> Type
Vector n t = (l : List t ** length l = n)
So while this is nice in the sense of preventing range errors, the real advantage of vectors (in the traditional usage of the term) is performance; in particular, O(1) random access. It seems that the Idris vector would not support this (how would you write the indexing function so that it has this performance?).
Assuming that there's no wizardry going on under the hood (as happens with Nat) to reconfigure Vectors, is there a random-access data type in Idris?
How would such a thing be defined (or how is it defined) in an algebraic type system? Certainly it seems like it would be impossible to define inductively.
Is it possible, within a type system like that of Idris, to create a data type which supports O(1) random access, and is aware of its length such that all access is provably valid? (Haskell has array-style Vectors, but their concrete implementation is opaque to the average user, including me)
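For comparison, here is a minimal sketch (my own, not taken from any Idris or Haskell library) of the Haskell side of that parenthetical: a length-indexed wrapper over the O(1) vectors from the vector package, where the bounds check becomes a compile-time constraint. A real library would keep the Vec constructor abstract so the stored length can be trusted.
{-# LANGUAGE DataKinds, KindSignatures, TypeOperators #-}
import qualified Data.Vector as V
import Data.Proxy (Proxy (..))
import GHC.TypeNats (KnownNat, Nat, natVal, type (+), type (<=))

-- Invariant (enforced by hiding the constructor): the stored vector has
-- exactly n elements.
newtype Vec (n :: Nat) a = Vec (V.Vector a)

-- The constraint i + 1 <= n is the compile-time "proof" that the access is
-- in bounds; the runtime lookup is a plain O(1) unchecked array access.
index :: (KnownNat i, i + 1 <= n) => Proxy i -> Vec n a -> a
index p (Vec v) = V.unsafeIndex v (fromIntegral (natVal p))
For example, index (Proxy :: Proxy 1) is accepted on a Vec 3 Char but rejected by the type checker on a Vec 1 Char.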

Idris doesn't do anything to optimise Vector lookups (at the time of writing this answer, at least).
This isn't because of any difficulty in doing it, really, but more because I would rather have some kind of general framework for writing this kind of optimisation than hard coding lots of them. Admittedly, we already have hard coded optimisations for Nat, but I still would prefer not to add loads more in an ad-hoc fashion.
Depending on what you actually want it for, it might be that the experimental uniqueness type system will help, in that you could have a low level mutable thing under the hood and still have safe and efficient access and update in a pure style in the high level language. We'll see...

Edwin has the definitive answers on what Idris currently does. However, if you are looking for something which might be natural to optimize into constant-time lookup in some cases, the following might be a step in the right direction.
For compile-time fixed-size vectors (i.e., not under a lambda, not parameterized by length at top-level), the following structure gives you vectors and lookup functions that, for any fixed concrete length, can be compile-time normalized to functions that should be somewhat straightforwardly optimizable into constant-time functions. (Sorry, code is in Coq; I don't have a working version of Idris at the moment, and don't know it well. I'm happy to replace this with Idris code if someone suggests the right syntax, e.g., in a comment.)
Fixpoint vector (n : nat) (A : Type) :=
  match n return Type with
  | 0 => unit
  | S n' => (A * vector n' A)%type
  end.

Definition nil {A} : vector 0 A := tt.

Definition cons {n} {A : Type} (x : A) (xs : vector n A) : vector (S n) A
  := (x, xs).

(* Recursion is on the (implicit) length [n], so for any concrete [n] the
   definition unfolds completely; out-of-range indices return [default]. *)
Fixpoint get {n} {A : Type} (m : nat) (default : A) (v : vector n A) {struct n} : A
  := match n as n return vector n A -> A with
     | 0 => fun _ => default
     | S n' => match m with
               | 0 => fun v => fst v
               | S m' => fun v => @get n' A m' default (snd v)
               end
     end v.

(* For instance, [Eval compute in fun v : vector 3 nat => get 1 0 v.]
   normalizes to [fun v => fst (snd v)]: a fixed chain of projections,
   with no recursion left over. *)
The idea is that, for any fixed n, the normal form of get is non-recursive, so the compiler could, hypothetically, compile it into a function whose runtime is independent of what n happens to be.

Related

Math-operations (or hash functions) without order dependency

The problem I am thinking about is hash functions, although I'm mainly interested in the mathematical terms and background that describe the property I'm after.
Consider the case where I have a hash-function taking a secret (S) and a number (X) which creates another number (Y):
Hash : S, X → Y
I then define two different hash-functions with their own secrets (a and b):
H1(X) := Hash(a, X)
H2(X) := Hash(b, X)
The property I want is that:
H1(H2(X)) = H2(H1(X))
(I believe this is what it means for the functions to commute?)
Taking a step back from programming and thinking about the math, we can look at different operations. If the function consists of only one operation, then I'm quite sure the property is always satisfied when that operation is both associative and commutative. However, there are operations that are order-insensitive in this sense without being commutative, e.g. division: dividing by a and then by b gives the same result as dividing by b and then by a. How do I know if my choice of hash function will make it commute?
Some examples that seem to work:
Simple addition:
Hash(S, X) := S + X
Bitwise xor:
Hash(S, X) := S xor X
Modular exponentiation:
Hash(S, X) := X^S mod p
if S ∈ N and X ∈ Z
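To spell out why the third example works: iterating modular exponentiation multiplies the exponents, and multiplication is commutative, so
H1(H2(X)) = (X^b)^a = X^(b·a) = X^(a·b) = (X^a)^b = H2(H1(X)) (mod p)
The first two examples commute for the same underlying reason: a + (b + X) = b + (a + X), and likewise for xor.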
How do I know if my choice of hash function will make it commute?
Commutativity under composition is an unusual property. It's not typical unless the functions apply a commutative operation of some underlying algebraic structure, such as "multiply by x". That is the form of all three of your examples.
The practical answer is "if you don't have a proof that it's commutative, assume it's not commutative". There's no general algorithm that will provide that proof for you.
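In practice, a cheap complement to that advice is a counterexample search before trusting a pair of functions to commute. A minimal sketch using the QuickCheck library (hash here is the simple-addition example from the question, standing in for a real hash):
import Test.QuickCheck (quickCheck)

-- A toy keyed "hash": the simple-addition example from the question.
hash :: Integer -> Integer -> Integer
hash s x = s + x

-- Does H1 . H2 equal H2 . H1 for randomly chosen secrets and inputs?
prop_commute :: Integer -> Integer -> Integer -> Bool
prop_commute a b x = h1 (h2 x) == h2 (h1 x)
  where
    h1 = hash a
    h2 = hash b

main :: IO ()
main = quickCheck prop_commute
A passing run is evidence, not a proof; a single failing case, however, settles the question in the negative.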

In Idris, why do interface parameters have to be type or data constructors?

To get some practice with Idris, I've been trying to represent various basic algebraic structures as interfaces. The way I thought of organizing things at first was to make the parameters of a given interface be the set and the various operations over it, and the methods/fields be proofs of the various axioms. For example, I was thinking of defining Group like so:
interface Group (G : Type) (op : G -> G -> G) (e : G) (inv : G -> G) where
  assoc : {x, y, z : G} -> (x `op` y) `op` z = x `op` (y `op` z)
  id_l  : {x : G} -> e `op` x = x
  id_r  : {x : G} -> x `op` e = x
  inv_l : {x : G} -> x `op` (inv x) = e
  inv_r : {x : G} -> (inv x) `op` x = e
My reasoning for doing it this way instead of just making op, e, and inv methods was that it would be easier to talk about the same set being a group in different ways. Like, mathematically, it doesn't make sense to talk about a set being a group; it only makes sense to talk about a set with a specified operation being a group. The same set can correspond to two completely different groups by defining the operation differently. On the other hand, the proofs of the various interface laws don't affect the group. While the inhabitants (proofs) of the laws may be different, it doesn't result in a different group. Thus, one would have no use for declaring multiple implementations.
More fundamentally, this approach seems like a better representation of the mathematical concepts. It's a category error to talk about a set being a group, so the mathematician in me isn't thrilled about asserting as much by making the group operation an interface method.
This scheme isn't possible, however. The interface declaration itself typechecks, but as soon as I try to define an implementation, Idris complains, e.g.:
(+) cannot be a parameter of Algebra.Group
(Implementation arguments must be type or data constructors)
My question is: why this restriction? I assume there's a good reason, but for the life of me I can't see it. Like, I thought Idris collapses the value/type/kind hierarchy, so there's no real difference between types and values, so why do implementations treat types specially? And why are data constructors treated specially? It seems arbitrary to me.
Now, I could just achieve the same thing using named implementations, which I guess I'll end up doing now. I guess I'm just used to Haskell, where you can only have one instance of a typeclass for a given datatype. But it still feels rather arbitrary.... In particular, I would like to be able to define, e.g., a semiring as a tuple (R,+,*,0,1) where (R,+,0) is a monoid and (R,*,1) is a monoid (with the distributivity laws tacked on). But I don't think I can do that very easily without the above scheme, even with named implementations. I could only say whether or not R is a monoid---but for semirings, it needs to be a monoid in two distinct ways! I'm sure there are workarounds with some boilerplate type synonyms or something (which, again I'll probably end up doing), but I don't really see why that should be necessary.
$ idris --version
1.2.0
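For comparison, Haskell's stock workaround for exactly this "monoid in two distinct ways" problem is newtype wrappers: Sum and Product in Data.Monoid give the same carrier type two different Monoid instances, playing roughly the role named implementations play in Idris. A small illustration:
import Data.Monoid (Product (..), Sum (..))

-- The same carrier (Integer) as two different monoids, chosen by wrapper.
five :: Integer
five = getSum (Sum 2 <> Sum 3)                -- the (R, +, 0) structure

six :: Integer
six = getProduct (Product 2 <> Product 3)     -- the (R, *, 1) structure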

Decidable equality in Agda with less than n^2 cases?

I've got a datatype for the AST of a programming language that I'd like to reason about, but there are about 10 different constructors for the AST.
data Term : Set where
  UnitTerm : Term
  VarTerm  : Var -> Term
  ...
  SeqTerm  : Term -> Term -> Term
I'm trying to write a function that has decidable equality for syntax trees of this language. In theory this is straightforward: there's nothing too complicated, it's just simple data being stored in the AST.
The problem is that writing such a function seems to require about 100 cases: for each of the 10 constructors, there are 10 cases.
eqDecide : (x : Term) -> (y : Term) -> Dec (x ≡ y)
eqDecide UnitTerm UnitTerm = yes refl
eqDecide UnitTerm (VarTerm x) = Generic.no (λ ())
...
eqDecide UnitTerm (SeqTerm t1 t2) = Generic.no (λ ())
eqDecide (VarTerm x) UnitTerm = Generic.no (λ ())
...
The problem is, there are a bunch of redundant cases. After the first pattern match where the constructors match, ideally I could match with underscore, since no possible other constructor could unify, but it doesn't appear that I can do that.
I've tried and failed to use this library to derive the equality: I'm running into problems with strict positivity, as well as getting some general errors that are pretty hard to debug. The Agda Prelude also has some facility for this, but it looks pretty immature and is lacking some things that I need from the standard library.
How do people do decidable equality in practice? Do they suck it up and just write all 100 cases, or is there a trick I'm missing? Is this just a place where the newness of the language is showing through?
If you want to avoid using reflection and still prove decidable equality in a linear number of cases, you could try the following approach:
Define a function embed : Term → Nat (or to some other type for which decidable equality is easier to prove such as labelled trees).
Prove that your function is indeed injective.
Make use of the fact that your function is injective together with decidable equality on the result type to conclude decidable equality on Terms (see for example via-injection in the module Relation.Nullary.Decidable in the standard library).
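To give a feel for the shape of such an embedding, here is a sketch in Haskell (illustrative names only; the Agda version would additionally prove embed injective and then apply via-injection to obtain a genuine Dec (x ≡ y)):
-- Labelled trees, whose equality is derivable mechanically.
data Tree = Node Int [Tree] | StrLeaf String
  deriving (Eq)

data Term = UnitTerm | VarTerm String | SeqTerm Term Term

-- One clause per constructor, so the Term-specific work is linear in the
-- number of constructors, unlike the quadratic direct definition.
embed :: Term -> Tree
embed UnitTerm      = Node 0 []
embed (VarTerm x)   = Node 1 [StrLeaf x]
embed (SeqTerm s t) = Node 2 [embed s, embed t]

-- Equality via the injection; each constructor gets a distinct label, so
-- embed is injective by construction.
eqTerm :: Term -> Term -> Bool
eqTerm s t = embed s == embed t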

Convergence and vector theories

Is there a convergence theory in Isabelle/HOL? I need to define ∥x(t)∥ ⟶ 0 as t ⟶ ∞.
Also, I'm looking for a theory of vectors; I found a matrix theory, but I couldn't find one for vectors. Does such a theory exist in Isabelle/HOL?
Cheers.
Convergence etc. are expressed with filters in Isabelle. (See the corresponding documentation)
In your case, that would be something like
filterlim (λt. norm (x t)) (nhds 0) at_top
or, using the tendsto abbreviation,
((λt. norm (x t)) ⤏ 0) at_top
where ⤏ is the Isabelle symbol \<longlongrightarrow>, which can be input using the abbreviation --->.
As a side note, I am wondering why you are writing it that way in the first place, seeing as it is equivalent to
filterlim x (nhds 0) at_top
or, with the tendsto syntax:
(x ⤏ 0) at_top
Reasoning with these filters can be tricky at first, but it has the advantage of providing a unified framework for limits and other topological concepts, and once you get the hang of it, it is very elegant.
As for vectors, just import ~~/src/HOL/Analysis/Analysis. That should have everything you need. Ideally, build the HOL-Analysis session image by starting Isabelle/jEdit with isabelle jedit -l HOL-Analysis. Then you won't have to process all of Isabelle's analysis library every time you start the system.
I assume that by ‘vectors’ you mean concrete finite-dimensional real vector spaces like ℝⁿ. This is provided by ~~/src/HOL/Analysis/Finite_Cartesian_Product.thy, which is part of HOL-Analysis. This provides the vec type, which takes two parameters: the component type (probably real in your case) and the index type, which specifies the dimension of the vector space.
There is also a pre-defined type n for every positive integer n, so that you can write e.g. (real, 3) vec for the vector space ℝ³. There is also type syntax so that you can write 'a ^ 'n for ('a, 'n) vec.

In pure functional languages, is data (strings, ints, floats, ...) also just functions?

I was thinking about pure object-oriented languages like Ruby, where everything, including numbers, ints, floats, and strings, are themselves objects. Is it the same in pure functional languages? For example, in Haskell, are numbers and strings also functions?
I know Haskell is based on the lambda calculus, which represents everything, including data and operations, as functions. It would seem logical to me that a "purely functional language" would model everything as a function, as well as keep the rule that a function must always return the same output for the same inputs and has no state.
It's okay to think about that theoretically, but...
Just like in Ruby not everything is an object (argument lists, for instance, are not objects), not everything in Haskell is a function.
For more reference, check out this neat post: http://conal.net/blog/posts/everything-is-a-function-in-haskell
@wrhall gives a good answer. However, you are somewhat correct that in the pure lambda calculus it is consistent for everything to be a function, and the language is Turing-complete (capable of expressing any pure computation that Haskell, etc. is).
That gives you some very strange things, since the only thing you can do to anything is apply it to something else. When do you ever get to observe something? If you have some value f and want to know something about it, your only choice is to apply it to some value x to get f x, which is another function, and your only choice then is to apply it to another value y, to get f x y, and so on.
Often I interpret the pure lambda calculus as talking about transformations on things that are not functions, but only capable of expressing functions itself. That is, I can make a function (with a bit of Haskelly syntax sugar for recursion & let):
purePlus = \zero succ natCase ->
    let plus = \m n -> natCase m n (\m' -> succ (plus m' n))
    in plus (succ (succ zero)) (succ (succ zero))
Here I have expressed the computation 2+2 without needing to know that there are such things as non-functions. I simply took what I needed as arguments to the function I was defining, and the values of those arguments could be Church encodings or they could be "real" numbers (whatever that means) -- my definition does not care.
And you could think the same thing of Haskell. There is no particular reason to think that there are things which are not functions, nor is there a particular reason to think that everything is a function. But Haskell's type system at least prevents you from applying an argument to a number (anybody thinking about fromInteger right now needs to hold their tongue! :-). In the above interpretation, it is because numbers are not necessarily modeled as functions, so you can't necessarily apply arguments to them.
In case it isn't clear by now, this whole answer has been somewhat of a technical/philosophical digression, and the easy answer to your question is "no, not everything is a function in functional languages". Functions are the things you can apply arguments to, that's all.
The "pure" in "pure functional" refers to the "freedom from side effects" kind of purity. It has little relation to the meaning of "pure" being used when people talk about a "pure object-oriented language", which simply means that the language manipulates purely (only) in objects.
The reason is that pure-as-in-only is a reasonable distinction to use to classify object-oriented languages, because there are languages like Java and C++, which clearly have values that don't have all that much in common with objects, and there are also languages like Python and Ruby, for which it can be argued that every value is an object1
Whereas for functional languages, there are no practical languages which are "pure functional" in the sense that every value the language can manipulate is a function. It's certainly possible to program in such a language. The most basic versions of the lambda calculus don't have any notion of things that are not functions, but you can still do arbitrary computation with them by coming up with ways of representing the things you want to compute on as functions.2
But while the simplicity and minimalism of the lambda calculus tends to be great for proving things about programming, actually writing substantial programs in such a "raw" programming language is awkward. The function representation of basic things like numbers also tends to be very inefficient to implement on actual physical machines.
But there is a very important distinction between languages that encourage a functional style but allow untracked side effects anywhere, and ones that actually enforce that your functions are "pure" functions (similar to mathematical functions). Object-oriented programming is very strongly wed to the use of impure computations3, so there are no practical object-oriented programming languages that are pure in this sense.
So the "pure" in "pure functional language" means something very different from the "pure" in "pure object-oriented language".4 In each case the "pure vs not pure" distinction is one that is completely uninteresting applied to the other kind of language, so there's no very strong motive to standardise the use of the term.
1 There are corner cases to pick at in all "pure object-oriented" languages that I know of, but that's not really very interesting. It's clear that the object metaphor goes much further in languages in which 1 is an instance of some class, and that class can be subclassed, than it does in languages in which 1 is something other than an object.
2 All computation is about representation anyway. Computers don't know anything about numbers or anything else. They just have bit-patterns that we use to represent numbers, and operations on bit-patterns that happen to correspond to operations on numbers (because we designed them so that they would).
3 This isn't fundamental either. You could design a "pure" object-oriented language that was pure in this sense. I tend to write most of my OO code to be pure anyway.
4 If this seems obtuse, you might reflect that the terms "functional", "object", and "language" have vastly different meanings in other contexts also.
A very different angle on this question: all sorts of data in Haskell can be represented as functions, using a technique called Church encodings. This is a form of inversion of control: instead of passing data to functions that consume it, you hide the data inside a set of closures, and to consume it you pass in callbacks describing what to do with this data.
Any program that uses lists, for example, can be translated into a program that uses functions instead of lists:
-- | A list corresponds to a function of this type:
type ChurchList a r = (a -> r -> r)  -- ^ how to handle a cons cell
                   -> r              -- ^ how to handle the empty list
                   -> r              -- ^ result of processing the list

listToCPS :: [a] -> ChurchList a r
listToCPS xs = \f z -> foldr f z xs
That function is taking a concrete list as its starting point, but that's not necessary. You can build up ChurchList functions out of just pure functions:
-- | The empty 'ChurchList'.
nil :: ChurchList a r
nil = \f z -> z

-- | Add an element at the front of a 'ChurchList'.
cons :: a -> ChurchList a r -> ChurchList a r
cons x xs = \f z -> f x (xs f z)

foldChurchList :: (a -> r -> r) -> r -> ChurchList a r -> r
foldChurchList f z xs = xs f z

-- Note: with ChurchList as a plain type synonym, map and filter are easiest
-- to write directly; defining them via foldChurchList would require the
-- input list's result type to be instantiated at ChurchList b r, not r.
mapChurchList :: (a -> b) -> ChurchList a r -> ChurchList b r
mapChurchList f xs = \g z -> xs (g . f) z

filterChurchList :: (a -> Bool) -> ChurchList a r -> ChurchList a r
filterChurchList pred xs = \f z -> xs (\x r -> if pred x then f x r else r) z
That last function uses Bool, but of course we can replace Bool with functions as well:
-- | A Bool can be represented as a function that chooses between two
-- given alternatives.
type ChurchBool r = r -> r -> r

true, false :: ChurchBool r
true  a _ = a
false _ b = b

filterChurchList' :: (a -> ChurchBool r) -> ChurchList a r -> ChurchList a r
filterChurchList' pred xs = \f z -> xs (\x r -> pred x (f x r) r) z
This sort of transformation can be done for basically any type, so in theory, you could get rid of all "value" types in Haskell, and keep only the () type, the (->) and IO type constructors, return and >>= for IO, and a suitable set of IO primitives. This would obviously be hella impractical—and it would perform worse (try writing tailChurchList :: ChurchList a r -> ChurchList a r for a taste).
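To round this off with a usage check (churchToList is a helper I'm adding here, not part of the code above):
-- Run a ChurchList back into an ordinary list by handing it (:) and [].
churchToList :: ChurchList a [a] -> [a]
churchToList xs = xs (:) []

demo :: [Int]
demo = churchToList (mapChurchList (* 2) (cons 1 (cons 2 (cons 3 nil))))
-- demo == [2,4,6]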
Is getChar :: IO Char a function or not? The Haskell Report doesn't provide us with a definition of "function". But it does state that getChar is a function (see here). (Well, at least we can say that it is a function.)
So I think the answer is YES.
I don't think there can be a correct definition of "function" except "everything is a function". (What is a "correct definition"? Good question...) Consider the following example:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Applicative
f :: Applicative f => f Int
f = pure 1
g1 :: Maybe Int
g1 = f
g2 :: Int -> Int
g2 = f
Is f a function or a piece of data? It depends.
