Reasoning about the entirety of a codatatype in Isabelle/HOL

I'd like to write down some definitions (and prove some lemmas!) about paths in a graph. Let's say that the graph is given implicitly by a relation of type 'a => 'a => bool. To talk about a possibly infinite path in the graph, I thought a sensible thing was to use a lazy list codatatype like 'a llist as given in "Defining (Co)datatypes and Primitively (Co)recursive Functions in Isabelle/HOL" (datatypes.pdf in the Isabelle distribution).
This works well enough, but then I'd like to define a predicate that takes such a list and a graph relation and evaluates to true iff the list defines a valid path in the graph: any pair of adjacent entries in the list must be an edge.
If I were using 'a list as a type to represent the paths, this would be easy: I'd just define the predicate using primrec. However, the co-inductive definitions I can find all seem to generate or consume the data one element at a time, rather than being able to make a statement about the whole thing. Obviously, I realise that the resulting predicate won't be computable (because it makes a statement about infinite streams), so it might have a ∀ in there somewhere, but that's fine - I want to use it for developing a theory, not generating code.
How can I define such a predicate? (And how could I prove the obvious associated introduction and elimination lemmas that make it useful?)
Thanks!

I suppose the most idiomatic way to do this is to use a coinductive predicate. Intuitively, this is like a normal inductive predicate except that you also allow ‘infinite derivation trees’:
type_synonym 'a graph = "'a ⇒ 'a ⇒ bool"
codatatype 'a llist = LNil | LCons 'a "'a llist"
coinductive is_path :: "'a graph ⇒ 'a llist ⇒ bool" for g :: "'a graph" where
is_path_LNil:
"is_path g LNil"
| is_path_singleton:
"is_path g (LCons x LNil)"
| is_path_LCons:
"g x y ⟹ is_path g (LCons y path) ⟹ is_path g (LCons x (LCons y path))"
This gives you introduction rules is_path.intros and an elimination rule is_path.cases.
When you want to show that an inductive predicate holds, you just use its introduction rules; when you want to show that an inductive predicate implies something else, you use induction with its induction rule.
With coinductive predicates, it is typically the other way round: When you want to show that a coinductive predicate implies something else, you just use its elimination rules. When you want to show that a coinductive predicate holds, you have to use coinduction.
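To illustrate both directions on the definition above, here is a small sketch (the constant repeat and the lemma names are mine, and the automation calls, in particular the final metis, are a guess that may need adjustment):

primcorec repeat :: "'a ⇒ 'a llist" where
  "repeat x = LCons x (repeat x)"

(* Elimination: peel the first edge off a path with at least two nodes. *)
lemma is_path_LConsD: "is_path g (LCons x (LCons y path)) ⟹ g x y"
  by (erule is_path.cases) auto

(* Coinduction: a self-loop gives rise to an infinite valid path. *)
lemma is_path_repeat:
  assumes "g x x"
  shows "is_path g (repeat x)"
  using assms
  by (coinduction arbitrary: x) (metis repeat.code)

The elimination proof just performs case analysis on the derivation; the coinduction proof exhibits "g x x holds" as the invariant that is maintained along the infinite path.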

Related

What is the `ind` type in Isabelle?

In Isabelle, natural numbers are defined as follows:
typedecl ind
axiomatization Zero_Rep :: ind and Suc_Rep :: "ind ⇒ ind"
― ‹The axiom of infinity in 2 parts:›
where Suc_Rep_inject: "Suc_Rep x = Suc_Rep y ⟹ x = y"
and Suc_Rep_not_Zero_Rep: "Suc_Rep x ≠ Zero_Rep"
subsection ‹Type nat›
text ‹Type definition›
inductive Nat :: "ind ⇒ bool"
where
Zero_RepI: "Nat Zero_Rep"
| Suc_RepI: "Nat i ⟹ Nat (Suc_Rep i)"
That's a lot of code to write what's effectively just
datatype nat = Zero | Suc nat
Is there some greater purpose to ind or maybe it is there just for historical reasons?
The datatype package needs a whole lot of maths to do all those internal constructions that are required to give you the datatype you want in the end. In particular, it needs natural numbers.
So the reason why the datatype package is not used to define the naturals is that it simply isn't available yet at that point.
One could of course just axiomatise the nat type directly. But the idea to instead axiomatise some infinite type and then carve the naturals out of it is a standard one; something similar is done in Zermelo–Fraenkel set theory, where the axiom of infinity provides an infinite set from which the naturals are extracted.
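For illustration, the carving-out step in Nat.thy is essentially a typedef on the inductively defined set (quoted from memory, so the exact form in the distribution may differ slightly):

typedef nat = "{n. Nat n}"
  morphisms Rep_Nat Abs_Nat
  using Nat.Zero_RepI by auto

Zero and Suc are then defined through Abs_Nat, and induction and case analysis for nat are derived manually rather than generated by the datatype package.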
Side note: In fact one could even make the datatype package itself axiomatic. But the common philosophy in interactive theorem provers, especially in the LCF family, is to work from a small set of axioms and that everything that can be constructed should be constructed instead of axiomatised. This reduces the amount of "trusted code".
A direct axiomatisation of the natural numbers would not be very controversial, I think, but for something as complicated as the datatype package, an axiomatic implementation would introduce lots of possibilities for subtle soundness bugs.

Is there a way to get a complete list of all kinds of operators/constructors of Isabelle?

For example, Isabelle can represent "and" as "∧" and "or" as "∨". Is there a way to get a complete list of all of these kinds of operators/constructors?
There is the print_syntax command, but its output might look a bit intimidating. But, for instance, the following line
logic(55) = logic(55) "∘" logic(56) ⇒ "\<^const>Fun.comp"
tells you that the symbol ∘ is an infix operator with precedence 55 that maps to the constant Fun.comp. The corresponding declaration is this:
definition comp :: "('b ⇒ 'c) ⇒ ('a ⇒ 'b) ⇒ 'a ⇒ 'c" (infixl "∘" 55)
where "f ∘ g = (λx. f (g x))"
The more usual way to discover these notations is to either try the obvious notation (for many things, it is just what one would expect, as is the case for the function composition above), or to find out what the constant is called and then look around the place where it is defined to see what notation is set up for it.
Not that I know of.
A good starting point is the list of all symbols in Main. However, that does not contain all symbols and does not provide a mapping from symbols to definitions.
In general, it is not that useful to find how things are abbreviated:
If you have the symbol, you can click on it to find the definition.
If you have the symbol and don't know how to type it, you can go the symbol panel of Isabelle/jEdit to see how to type it.
If you have the definition, just type something like term "conj a b" and the output shows you the version with the symbol (here "a ∧ b").
So the real question is how to find the right definition, not how to find the right symbol. But that is a lot harder.

Proving that the set of reachable states of a semantics function is finite in Isabelle

Consider the following property:
lemma "finite {t. (c,s) ⇒ t}"
Which refers to the following big step semantics:
inductive gbig_step :: "com × state ⇒ state ⇒ bool" (infix "⇒" 55)
where
Skip: "(SKIP, s) ⇒ s"
| Assign: "(x ::= a, s) ⇒ s(x := aval a s)"
| Seq: "⟦(c1, s1) ⇒ s2; (c2, s2) ⇒ s3⟧ ⟹ (c1;;c2, s1) ⇒ s3"
| IfBlock: "⟦(b,c) ∈ set gcs; bval b s; (c,s) ⇒ s'⟧ ⟹ (IF gcs FI, s) ⇒ s'"
| DoTrue: "⟦(b,c) ∈ set gcs; bval b s1; (c,s1) ⇒ s2;(DO gcs OD,s2) ⇒ s3⟧
⟹ (DO gcs OD, s1) ⇒ s3"
| DoFalse: "⟦(∀ (b,c) ∈ set gcs. ¬ bval b s)⟧ ⟹ (DO gcs OD, s) ⇒ s"
To me it is obvious that the property holds by induction on the big step relation. However, I cannot get it out of the set, so I cannot effectively induct on it.
How could I do this?
Finiteness is not something you can prove directly with the induction rule of an inductive predicate. The problem is that looking at an individual run (as the induction rule does) does not say anything about the branching behaviour, which must also be finite for the statement to hold.
I see two approaches to proving finiteness:
1. Model the derivation tree explicitly as a datatype in Isabelle/HOL and prove that it adequately represents the derivation trees behind inductive. Then prove that the tree has finitely many leaves (by induction on the tree). If you design the datatype such that the states in the leaves are a type parameter, then the corresponding set function generated by the datatype package is what you want to prove to be finite. (Note that you cannot prove finiteness by the induction rule of the set function, because that would again be just a single run.)
2. Look at the internal construction of the inductive definition. It is defined as the least fixpoint of a functional. You can get access to these internals by putting the inductive definition into a context in which [[inductive_internals]] is declared. Then you can prove that the functional preserves finiteness in a single step and lift that through the induction.
The proof argument in both approaches is similar. The explicit datatype in #1 simply reifies the fixpoint argument of #2. So you can think of #1 as a deep embedding of #2. Of course, you can also re-derive the internal construction (in a more suitable format) just from the introduction and induction theorems and then follow approach #2.
I would try to do precisely this, as your semantics is small. For a large real-world semantics, it might make sense to spend the effort to automate step #2 in ML.
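To give an idea of what approach #2 involves, here is a toy version of exposing the internals (a sketch with a different, much smaller predicate; the exact names of the generated facts can vary between Isabelle versions):

context
  notes [[inductive_internals]]
begin

(* Even numbers, as a stand-in for the big-step semantics. *)
inductive ev :: "nat ⇒ bool" where
  "ev 0"
| "ev n ⟹ ev (Suc (Suc n))"

end

thm ev_def  (* roughly: ev = lfp (λp n. n = 0 ∨ (∃m. n = Suc (Suc m) ∧ p m)) *)

The exposed fixpoint equation is what you would then combine with a preservation-of-finiteness argument about the underlying functional.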

How to define a data type with constraints?

For example I need to define a data type for pairs of list, both of which must have the same length:
type_synonym list2 = "nat list × nat list"
definition good_list :: "list2" where
"good_list ≡ ([1,2],[3,4])"
definition bad_list :: "list2" where
"bad_list ≡ ([1,2],[3,4,5])"
I can define a separate predicate, which checks whether a pair of lists is ok:
definition list2_is_good :: "list2 ⇒ bool" where
"list2_is_good x ≡ length (fst x) = length (snd x)"
value "list2_is_good good_list"
value "list2_is_good bad_list"
Is it possible to combine the datatype and the predicate? I've tried to use inductive_set, but I have no idea how to use it:
inductive_set ind_list2 :: "(nat list × nat list) set" where
"length (fst x) = length (snd x) ⟹
x ∈ ind_list2"
You can create a new type that is constrained by some predicate via typedef, though the result will just be a type and not a datatype.
typedef good_lists2 = "{xy :: list2. list2_is_good xy}"
by (intro exI[of _ "([],[])"], auto simp: list2_is_good_def)
Working with such a newly created type is best done via the Lifting package.
setup_lifting type_definition_good_lists2
Now, for every operation on this new lifted type good_lists2, you first have to lift the operation from the raw type list2. For instance, below we define an extraction function and a Cons-function. In the latter, you have to prove that the newly generated pair indeed satisfies the invariant.
lift_definition get_lists :: "good_lists2 ⇒ list2" is "λ x. x" .
lift_definition Cons_good_lists2 :: "nat ⇒ nat ⇒ good_lists2 ⇒ good_lists2"
is "λ x y (xs,ys). (x # xs, y # ys)"
by (auto simp: list2_is_good_def)
Of course, it is also possible to access the invariant of the lifted type.
lemma get_lists: "get_lists xy = (x,y) ⟹ length x = length y"
by (transfer, auto simp: list2_is_good_def)
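Computing with the lifted type then also goes through transfer; for example (empty_good is a hypothetical further lifted constant):

lift_definition empty_good :: "good_lists2" is "([], [])"
  by (simp add: list2_is_good_def)

lemma "get_lists (Cons_good_lists2 x y empty_good) = ([x], [y])"
  by transfer simp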
I hope this helps.
René's answer covers what you asked for, but just for the sake of completeness, I would like to add two things:
First, stating the obvious here: It seems like it would be much easier if you just worked with lists of pairs instead of pairs of lists. Your proposed new type is clearly isomorphic to a list of pairs. Then you don't have to introduce an extra type.
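To make the isomorphism concrete (a sketch, with hypothetical names to_pairs/from_pairs):

definition to_pairs :: "list2 ⇒ (nat × nat) list" where
  "to_pairs = (λ(xs, ys). zip xs ys)"

definition from_pairs :: "(nat × nat) list ⇒ list2" where
  "from_pairs ps = (map fst ps, map snd ps)"

(* The round trip is the identity exactly on pairs of equal-length lists. *)
lemma "length xs = length ys ⟹ from_pairs (to_pairs (xs, ys)) = (xs, ys)"
  by (simp add: to_pairs_def from_pairs_def)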
Also, on a more general note, just because you can introduce new types with type definitions in Isabelle that capture certain invariants does not mean that this is always the best idea. It may be easier to just carry around the invariants separately. It depends very much on what those invariants look like and what you actually do with the values of that type. In many cases, I would argue that the additional boilerplate for setting up the new type (in particular class instantiations if you need those) and converting between the base type and the new type is not worth whatever abstraction benefit you get from it.
A good heuristic, I think, is to ask yourself whether the type you are introducing is more of a ‘throw-away’ thing that you need in one specific place – then don't introduce a new type for it – or whether it is something that you can prove nice general facts about and build a good abstract theory on – then do introduce a new type for it. Good examples from the distribution for the latter are things like multisets, finite sets, and probability mass functions.

Using the ordering locale with partial maps

The following code doesn't typecheck:
type_synonym env = "char list ⇀ val"
interpretation map: order "op ⊆⇩m :: (env ⇒ env ⇒ bool)" "(λa b. a ≠ b ∧ a ⊆⇩m b)"
by unfold_locales (auto intro: map_le_trans simp: map_le_antisym)
lemma
assumes "mono (f :: env ⇒ env)"
shows "True"
by simp
Isabelle complains with the following error at the lemma:
Type unification failed: No type arity option :: order
Type error in application: incompatible operand type
Operator: mono :: (??'a ⇒ ??'b) ⇒ bool
Operand: f :: (char list ⇒ val option) ⇒ char list ⇒ val option
Why so? Did I miss something to use the interpretation? I suspect I need something like a newtype wrapper here...
When you interpret a locale like order which corresponds to a type class, you only get the theorems proved inside the context of the locale. However, the constant mono is only defined on the type class. The reason is that mono's type contains two type variables, whereas only one is available inside locales from type classes. You can notice this because there is no map.mono stemming from your interpretation.
If you instantiate the type class order for the option type with None being less than Some x, then you can use mono for maps, because the function space instantiates order with the pointwise order. However, the ordering <= on maps will only be semantically equivalent to ⊆⇩m, not syntactically, so none of the existing theorems about ⊆⇩m will work for <= and vice versa. Moreover, your theories will be incompatible with other people's that instantiate order for option differently.
Therefore, I recommend going without type classes here. The predicate monotone explicitly takes the order to be used. This is a bit more writing, but in the end, you are more flexible than with type classes. For example, you can write monotone (op ⊆⇩m) (op ⊆⇩m) f to express that f is a monotone transformation of environments.
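Concrete monotonicity facts can then be stated and proved directly against the explicit order; for instance, for domain restriction (a sketch; the restriction operator |` is from Map.thy, and the exact simp lemmas needed may differ):

lemma "monotone (op ⊆⇩m) (op ⊆⇩m) (λe. e |` S)"
  by (auto simp: monotone_def map_le_def restrict_map_def split: if_splits)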
