TL;DR: Are there any coding conventions for the Isar language? Is it necessary to respect jEdit's folding strategy?
My team is working on the formalization of mathematics, so one of our main purposes is to obtain readable proofs. Looking into that, we tried to code proofs in a way that intermediate facts (and labels, if any) stand out:
from fact1 have
1: "Foo"
using Thm1 Thm2 by auto
then have
2: "Bar = FooBar"
by simp
also from 1 have
" ... = BarFoo"
by blast
etc. Apart from the fact that sometimes this produces a proliferation of "short lines" (btw, I don't know if this is really a problem), it is somehow not compatible with jEdit folding strategy; after folding, the previous code block would look like this:
from fact1 have
then have
also from 1 have
completely obscuring the argument. The following format perhaps is better:
from fact1
have 1: "Foo"
using Thm1 Thm2 by auto
then
have 2: "Bar = FooBar"
by simp
also from 1
have " ... = BarFoo"
by blast
And, after folding,
from fact1
have 1: "Foo"
then
have 2: "Bar = FooBar"
also from 1
have " ... = BarFoo"
which makes the flow of the argument explicit.
In any case, before I come up with 5 new formatting conventions, I'd definitely like to know if there is some de facto standard, or at least if someone thought about this.
Related
Can I somehow refer to the name of a lemma (or theorem or corollary) inside Isabelle text?
For example:
theory Scratch
imports Main
begin
lemma lemma_name: "stuff = stuff" by simp
text‹As we have proven in fact #{thm lemma_name}, stuff is stuff.›
end
When compiling this to pdf, I see
As we have proven in fact stuff=stuff, stuff is stuff.
I would like to see
As we have proven in fact lemma_name, stuff is stuff.
Is there some document antiqotation which just prints the name of a lemma?
I could just type the lemma name verbatim, but this neither gives me control-click in the IDE nor does it make sure the text still refers to a true fact, even if I rename lemmas.
The output of antiquotations can be changed by options explained in 4.2.1 and 4.2.2 of the Isabelle/Isar Reference Manual. One (rather hidden) option [source=true] sets the output to print what you have entered as argument to the antiquotation instead of its output.
text ‹As we have proven in fact #{thm [source] lemma_name}, stuff is stuff.
...will thus result in the document output:
As we have proven in fact lemma_name, stuff is stuff.
The checking of the validity of the reference will still take place during document preparation.
In Isabelle/HOL, how do I find where a given type was instantiated for a given class? For the sake of this post for example, where real was instantiated as a conditionally_complete_linorder. To justify the question: I might want to know this for inspiration for a similar instantiation, for showing it to someone(s), for Isabelle/HOL practice reading, for curiosity, and so on. My process at the moment:
First, check it actually is: type instantiation real :: conditionally_complete_linorder begin end and see if I get the error message "No parameters and no pending instance proof obligations in instantiation."
Next, ideally before where I'd need to know how i.e. whether it was direct, or implicit via classes C_1[, C_2, C_3, etc]. Then, I would need to find where those instantiations are, either an explicit instantiation real :: conditionally_complete_linorder or the implicit ones for the C_i (same process for either case ofc). I don't know how to find out how, so I have to check for an explicit instantiation, then all possible implicit instantiations.
For explicit, I can do a grep -Ern ~/.local/src/Isabelle2019 -e 'instantiation real :: conditionally_complete_linorder' (and hope the whitespace isn't weird, or do a more robust search :)). Repeat for AFP location. Alternatively, to stay within the jEdit window:
I can find where the class itself was defined by typing term "x::'a::conditionally_complete_linorder" then Ctrl-clicking the class name, and then check if real is directly instantiated in that file with Ctrl-f.
I could then check if it's instantiated where the type real is defined by typing term "x::real" and Ctrl-clicking real, then Ctrl-f for conditionally_complete_linorder in that file.
(If it is in either place it'll be whichever is further down in the import hierarchy, but I find just going through those two steps simpler.) However, if neither two places turn it up then either, for whatever reason, it is explicitly instantiated somewhere else or is implicitly instantiated. For that reason grep is more robust.
If explicit turns nothing up then I check implicit. Looking at the class_deps graph I can see that conditionally_complete_linorder can follow from either complete_linorder or linear_continuum. I can then continue the search by seeing if real is instantiated as either of them (disregarding any I happen to know real can't be instantiated as). I can also check to see if it's instantiated as both conditioanlly_complete_lattice and linorder, which is what I can see conditionally_complete_linorder is a simple (no additional assumptions) combination of*. Repeat for all of these classes recursively until the instantiations are found. In this case, I can see that linear_continuum_topology implies linear_continuum, so kill two birds with one stone with grep -Ern ~/.local/src/Isabelle2019 -e "instantiation.*real" | grep continuum and find /path/to/.local/src/Isabelle2019/src/HOL/Real.thy:897:instantiation real :: linear_continuum.
This process is quite tedious. Less but still quite tedious** would be to get the class_deps graph up and Ctrl-f for "instantiation real" in Real.thy and look for instantiations of: the original class, the superclasses of it, or the classes which imply it. Then in the files each those classes are defined search for "instantiation real". Do this recursively till done. In this case I would have found what I needed in Real.thy.
Is there an easier way? Hope I just missed something obvious.
* I can't Ctrl-click in Conditionally_Complete_Lattices.thy to jump to linorder directly, I guess because of something to do with it being pre-built, so I have to do the term "x::'a::linorder" thing again.
** And also less robust, as it is minus grep-ing which can turn up weirder instantiation locations, then again I'm not sure if this ever comes up in practice.
Thanks
You can import the theory in the code listing below and then use the command find_instantiations. I will leave the code without further explanation, but please feel free to ask further questions in the comments if you need further details or suspect that something is not quite right.
section ‹Auxiliary commands›
theory aux_cmd
imports Complex_Main
keywords "find_instantiations" :: thy_decl
begin
subsection ‹Commands›
ML ‹
fun find_instantiations ctxt c =
let
val {classes, ...} = ctxt |> Proof_Context.tsig_of |> Type.rep_tsig;
val algebra = classes |> #2
val arities = algebra |> Sorts.arities_of;
in
Symtab.lookup arities c
|> the
|> map #1
|> Sorts.minimize_sort algebra
end
fun find_instantiations_cmd tc st =
let
val ctxt = Toplevel.context_of st;
val _ = tc
|> Syntax.parse_typ ctxt
|> dest_Type
|> fst
|> find_instantiations ctxt
|> map Pretty.str
|> Pretty.writeln_chunks
in () end
val q = Outer_Syntax.command
\<^command_keyword>‹find_instantiations›
"find all instantiations of a given type constructor"
(Parse.type_const >> (fn tc => Toplevel.keep (find_instantiations_cmd tc)));
›
subsection ‹Examples›
find_instantiations filter
find_instantiations nat
find_instantiations real
end
Remarks
I would be happy to provide amendments if you find any problems with it, but do expect a reasonable delay in further replies.
The command finds both explicit and implicit instantiations, i.e. it also finds the ones that were achieved by means other than the use of the commands instance or instantiation, e.g. inside an ML function.
Unfortunately, the command does not give you the location of the file where the instantiation was performed - this is something that would be more difficult to achieve, especially, given that instantiations can also be performed programmatically. Nevertheless, given a list of all instantiations, I believe, it is nearly always easy to use the in-built search functionality on the imported theories to narrow down the exact place where the instantiation was performed.
This is a derivative question of Existing constants (e.g. constructors) in type class instantiations.
The short question is this: How can I prevent the error that occurs due to free_constructors, so that I can combine the two theories that I include below.
I've been sitting on this for months. The other question helped me move forward (it appears). Thanks to the person who deserves thanks.
The real issue here is about overloading notation, though it looks like I now just have a namespace problem.
At this point, it's not a necessity, just an inconvenience that two theories have to be used. If the system allows, all this will disappear, but I ask anyway to make it possible to get a little extra information.
The big explanation here comes in explaining the motivation, which may lead to getting some extra information. I explain some, then include S1.thy, make a few comments, and then include S2.thy.
Motivation: using syntactic type classes for overloading notation of multiple binary datatypes
The basic idea is that I might have 5 different forms of binary words that have been defined with datatype, and I want to define some binary and hexadecimal notation that's overloaded for all 5 types.
I don't know what all is possible, but the past tells me (by others telling me things) that if I want code generation, then I should use type classes, to get the magic that comes with type classes.
The first theory, S1
Next is the theory S1.thy. What I do is instantiate bool for the type classes zero and one, and then use free_constructors to set up the notation 0 and 1 for use as the bool constructors True and False. It seems to work. This in itself is something I specifically wanted, but didn't know how to do.
I then try to do the same thing with an example datatype, BitA. It doesn't work because constant case_BitA is created when BitA is defined with datatype. It causes a conflict.
Further comments of mine are in the THY.
theory S1
imports Complex_Main
begin
declare[[show_sorts]]
(*---EXAMPLE, NAT 0: IT CAN BE USED AS A CONSTRUCTOR.--------------------*)
fun foo_nat :: "nat => nat" where
"foo_nat 0 = 0"
(*---SETTING UP BOOL TRUE & FALSE AS 0 AND 1.----------------------------*)
(*
I guess it works, because 'free_constructors' was used for 'bool' in
Product_Type.thy, instead of in this theory, like I try to do with 'BitA'.
*)
instantiation bool :: "{zero,one}"
begin
definition "zero_bool = False"
definition "one_bool = True"
instance ..
end
(*Non-constructor pattern error at this point.*)
fun foo1_bool :: "bool => bool" where
"foo1_bool 0 = False"
find_consts name: "case_bool"
free_constructors case_bool for "0::bool" | "1::bool"
by(auto simp add: zero_bool_def one_bool_def)
find_consts name: "case_bool"
(*found 2 constant(s):
Product_Type.bool.case_bool :: "'a∷type => 'a∷type => bool => 'a∷type"
S1.bool.case_bool :: "'a∷type => 'a∷type => bool => 'a∷type" *)
fun foo2_bool :: "bool => bool" where
"foo2_bool 0 = False"
|"foo2_bool 1 = True"
thm foo2_bool.simps
(*---TRYING TO WORK A DATATYPE LIKE I DID WITH BOOL.---------------------*)
(*
There will be 'S1.BitA.case_BitA', so I can't do it here.
*)
datatype BitA = A0 | A1
instantiation BitA :: "{zero,one}"
begin
definition "0 = A0"
definition "1 = A1"
instance ..
end
find_consts name: "case_BitA"
(*---ERROR NEXT: because there's already S1.BitA.case_BitA.---*)
free_constructors case_BitA for "0::BitA" | "1::BitA"
(*ERROR: Duplicate constant declaration "S1.BitA.case_BitA" vs.
"S1.BitA.case_BitA" *)
end
The second theory, S2
It seems that case_BitA is necessary for free_constructors to set things up, and it occurred to me that maybe I could get it to work by using datatype in one theory, and use free_constructors in another theory.
It seems to work. Is there a way I can combine these two theories?
theory S2
imports S1
begin
(*---HERE'S THE WORKAROUND. IT WORKS BECAUSE BitA IS IN S1.THY.----------*)
(*
I end up with 'S1.BitA.case_BitA' and 'S2.BitA.case_BitA'.
*)
declare[[show_sorts]]
find_consts name: "BitA"
free_constructors case_BitA for "0::BitA" | "1::BitA"
unfolding zero_BitA_def one_BitA_def
using BitA.exhaust
by(auto)
find_consts name: "BitA"
fun foo_BitA :: "BitA => BitA" where
"foo_BitA 0 = A0"
|"foo_BitA 1 = A1"
thm foo_BitA.simps
end
The command free_constructors always creates a new constant of the given name for the case expression and names the generated theorems in the same way as datatype does, because datatype internaly calls free_constructors.
Thus, you have to issue the command free_constructors in a context that changes the name space. For example, use a locale:
locale BitA_locale begin
free_constructors case_BitA for "0::BitA" | "1::BitA" ...
end
interpretation BitA!: BitA_locale .
After that, you can use both A0 and A1 as constructors in pattern matching equations and 0 and 1, but you should not mix them in a single equation. Yet, A0 and 0 are still different constants to Isabelle. This means that you may have to manually convert the one into the other during proofs and code generation works only for one of them. You would have to set up the code generator to replace A0 with 0 and A1 with 1 (or vice versa) in the code equations. To that end, you want to declare the equations A0 = 0 and A1 = 1 as [code_unfold], but you also probably want to write your own preprocessor function in ML that replaces A0 and A1 in left-hand sides of code equations, see the code generator tutorial for details.
Note that if BitA was a polymorphic datatype, packages such as BNF and lifting would continue to use the old set of constructors.
Given these problems, I would really go for the manual definition of the type as described in my answer to another question. This saves you a lot of potential issues later on. Also, if you are really only interested in notation, you might want to consider adhoc_overloading. It works perfectly well with code generation and is more flexible than type classes. However, you cannot talk about the overloaded notation abstractly, i.e., every occurrence of the overloaded constant must be disambiguated to a single use case. In terms of proving, this should not be a restriction, as you assume nothing about the overloaded constant. In terms of definitions over the abstract notation, you would have to repeat the overloading there as well (or abstract over the overloaded definitions in a locale and interpret the locale several times).
I know how to make "term abbreviations" in Isabelle, but can I make "type abbreviations" that behave in the same way?
I can define a "term abbreviation" using
abbreviation "foo == True"
Henceforth all appearances of True in the output will be printed as foo. For instance, the command
term "True ⟶ False"
outputs "foo ⟶ False". I would like to define a "type abbreviation" that has this same behaviour. I know about the type_synonym command, but when I type
type_synonym baz = "int list"
then appearances of int list in future output are not replaced with baz as I would like them to be. If it doesn't already exist in some form, I think a type_abbreviation command could be quite handy when the right-hand side of the definition is rather unwieldy.
You can declare syntax translations for types just as it had to be done for terms before abbreviation was introduced. For example, the following makes Isabelle pretty-print char list as string. More examples of this kind can be found in the Isabelle distribution in MicroJava.
translations
(type) "string" <= (type) "char list"
The command translations works for type abbreviations where each type variable occurs exactly once on each side. If you have multiple occurrences of a type variable on the right hand side, you have to write a parse translation in ML. Examples of this can be found in JinjaThreads in the AFP (search for print_translation).
in Python I'm used to having my code "style-checked" by an automatic but configurable tool, called pep8, after the 8th Python enhancement proposal.
in R I don't know. Google has a style guide, but:
what do most R programmers actually use?
I still didn't find any program that performs those checks.
Dirk, Alex, in your answers you pointed me at pretty printers, but in my opinion that would overdo one thing and not do another: code would be automatically edited to follow the style, while no warnings are issued for poorly chosen identifiers.
There's a formatR package with tidy.source function. I use Emacs with ESS, and follow Hadley's style recommendations. It's hard to compare R with Python, since style is kind of mandatory in Python, unlike R. =)
EDITa simple demonstration:
code <- "fn <- function(x, y) { paste(x, '+', y, '-', x+y) }"
tidy.source(text = code)
## not run
fn <- function(x, y) {
paste(x, "+", y, "-", x + y)
}
I think if you want such a tool, you may have to write it yourself. The reason is that R does not have an equivalent to Python's PEP8; that is, an "official style guide" that has been handed down from on high and is universally followed by the majority of R programmers.
In addition there are a lot of stylistic inconsistencies in the R core itself; this is a consequence of the way in which R evolved as a language. For example, many functions in R core follow the form of foo.bar and were written before the S3 object system came along and used that notation for method dispatch. In hindsight, the naming of these functions should probably be changed in the interests of consistency and clarity, but it is too late to consider that now.
In summary, there is no official "style lint" tool for R because the R Core itself contains enough style lint, of which nothing can be done about, that writing one would be very difficult. For every rule--- "don't do this" ---there would have to be a long list of exceptions--- "except in this case, and this case, and this one, and ..., where it was done for historical purposes".
As for
what do most R programmers actually use
I suspect that quite a few people follow R Core who have a
R Coding standards section in the R Internals manual.
Which in a large sense falls back to these sensible Emacs defaults to be used along with ESS. Here is what I use and it is only minimally changed:
;;; C
(add-hook 'c-mode-hook
;;(lambda () (c-set-style "bsd")))
;;(lambda () (c-set-style "user"))) ; edd or maybe c++ ?
(lambda () (c-set-style "c++"))) ; edd or maybe c++ ?
;;;; ESS
(add-hook 'ess-mode-hook
(lambda ()
(ess-set-style 'C++)
;; Because
;; DEF GNU BSD K&R C++
;; ess-indent-level 2 2 8 5 4
;; ess-continued-statement-offset 2 2 8 5 4
;; ess-brace-offset 0 0 -8 -5 -4
;; ess-arg-function-offset 2 4 0 0 0
;; ess-expression-offset 4 2 8 5 4
;; ess-else-offset 0 0 0 0 0
;; ess-close-brace-offset 0 0 0 0 0
(add-hook 'local-write-file-hooks
(lambda ()
(ess-nuke-trailing-whitespace)))))
(setq ess-nuke-trailing-whitespace-p t)
As for a general, tool Xihui's formatR pretty-printer may indeed be the closest. Or just use ESS :)
lintr - highlights possible syntax and style issues/errors
CRAN Task View: Reproducible Research - Formatting Tools section contains other useful tools, particularly formatR which can automatically formt code.
The lint package gives warnings about stylistic problems, without correcting those.
Running the lint() command (using the default parameter values) gives you a list of warnings for all R files in the current directory.
I use styler and then lintr before I check anything into version control.
styler converts your code base to match a given style - the default matches the tidyverse style described here. It modifies alignments, and some syntax (<- over =). But, it doesn't rename variables or anything like that.
lintr is non-modifying. It just identifies lines of code that are inconsistent with your style guide. I use this within vim when I'm working on a package or a project to identify things that need a bit more human input to fix (renaming variables/functions etc)
RStudio has added a style checker at some point in the past. For instance, in version 1.1.463 you can enable the feature under General Options. Here's a screenshot: