Does Lean expose itself as a C/C++ or Python library? - math

I am interested in doing a project relying on automated proofs, in great dimension as a learning exercise. So far my online search suggests Lean is the way to go, in theory.
However, everything I read about it describes using it as a proof assistant in VS Code or Emacs. That's not what I need: I need a system I can communicate with fully programmatically, i.e. a string of assumptions goes in -> a string specifying deducibility comes out, or something like that.
To be more precise, I need to be able to call parsing functions on strings that do the heavy work of determining whether a set of results is deducible from the input assumptions.
I can't find documentation about Lean being able to do this.

Related

Java compatible simple expression language

I am building a system in Scala for feature engineering where the end user API receives aggregations on list of objects/events.
For example, a client of this tool might pass in a function that, given an array of past pageviews for a specific web user, filters for the ones coming from a specific country and counts them. The output of this call will then be a number.
It can be thought of as a very simple reduce operation.
I am figuring out how to build the API for such a system. I could write a simple custom language to perform counts and filters, but I am sure that it is not the best approach, especially because it will not be expressive enough unless designed with care.
Are you aware of something like an expression language that could be used to express simple functions without the need for me to build one from scratch?
The other option would be to allow end users to pass custom code to this library, which might be dangerous at runtime.
I am aware of Apache Calcite for plugging SQL into different data structures and databases. It is a good option; however, it forces me to think in a "columnar" SQL way, while here I am looking for something more row-based, similar to the map-reduce way of programming.
You mention Apache Calcite, but did you know that you can use it without writing SQL? If you use Calcite's RelBuilder class you can build algebra directly, which is very similar to the algebraic approach of MapReduce. See algebra in Calcite for more details.

What is more in the spirit of the Julia language and philosophy?

I recently started programming in Julia for research purposes. Going through it I started loving the syntax, I had a positive experience with the community here on SO, and now I am thinking about porting some code from other programming languages.
Since I work with highly computationally expensive forecasting models, it would be nice to have them all in a powerful modern language such as Julia.
I would like to create a project and I am wondering how I should design it. I am concerned both from a performance and a language perspective (i.e.: Would it be better to create modules – submodules – functions or something else would be preferred? Is it better off to use dictionaries or custom types?).
I have looked at different GitHub projects in my field, but I haven't really found a common standard. Therefore I am wondering: what is more in the spirit of the Julia language and philosophy?
EDIT:
It has been pointed out that this question might be too generic. Therefore, I would like to focus it on how best to structure modules (i.e. separate modules for main functions and subroutines versus modules and submodules, etc.). I believe this would be enough to give me a feel for what might be considered in the spirit of the Julia language and philosophy. Of course, additional examples and references are more than welcome.
The most you'll find is that there is an "official" style-guide. The rest of the "Julian" style is ill-defined, but there are some ways to heuristically define it.
First of all, it means designing the software around multiple dispatch and the type system. Software that follows a Julian design philosophy usually won't define a bunch of functions like test_pumpkin and test_pineapple; instead it will use dispatches on test for types Pumpkin and Pineapple. This allows for clean, understandable code. It will break tasks up into small type-stable functions, which allows for good performance. It will likely also be written very generically, allowing the user to use items that are subtypes of AbstractArray or Number, and using the power of dispatch to make the software work on number types its authors have never even heard of. (In this respect, custom types are recommended over dictionaries when you need performance. However, for a type you have to know all of the fields at the beginning, which means some things require dictionaries.)
Software that follows a Julian design philosophy may also implement a DSL (Domain-Specific Language) to give the user a simpler interface. Instead of requiring the user to conform to archaic standards derived from C/Fortran, or write large repetitive items and inputs, the package may provide macros to allow the user to more heuristically define the problem for the software to solve.
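As a rough illustration of the test_pumpkin / test_pineapple point, here is a sketch in Python rather than Julia. Python's functools.singledispatch dispatches on the first argument only, so it is a weaker analog of Julia's multiple dispatch. The function is named check here (standing in for the test function mentioned above), and the Pumpkin and Pineapple fields and the returned strings are made up for the example.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Pumpkin:
    weight_kg: float  # hypothetical field, for illustration only


@dataclass
class Pineapple:
    weight_kg: float  # hypothetical field, for illustration only


@singledispatch
def check(item):
    """Fallback used when no method is registered for the argument's type."""
    raise TypeError(f"no check defined for {type(item).__name__}")


@check.register
def _(item: Pumpkin):
    # Method chosen when the first argument is a Pumpkin.
    return f"pumpkin of {item.weight_kg} kg"


@check.register
def _(item: Pineapple):
    # Method chosen when the first argument is a Pineapple.
    return f"pineapple of {item.weight_kg} kg"
```

Callers write check(x) for any x; supporting a new type means registering one more method instead of inventing another function name.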
Other items which are part of the Julian design philosophy are up for much debate. Is proper Julia code devectorized? I would say no, and the loop-fusing broadcast . is a powerful way to write MATLAB-style "vectorized" code and have it perform like a devectorized loop. However, I have seen others prefer devectorized styles.
Also note that Julia is very different from something like Python: in Julia, you can essentially "build your own standard way of doing something". Since there's no performance penalty for functions/types declared in packages rather than Base, you can build your own Julia world if you want, using macros to define your own "function-like" objects, etc. I mean, you could re-create Java styles in Julia if you wanted.

Can I execute untrusted Common Lisp code in a restricted environment?

Suppose I wanted to take advantage of Common Lisp's ability to read and execute Common Lisp code so that my program can execute external code written in Lisp, but I don't trust that code, so I don't want it to have access to the full power of Common Lisp. Is it possible for me to restrict its environment so that it can only see the packages/symbols to which I explicitly give it access, effectively creating a DSL?
To read the code, start by disabling *read-eval* (that stops people injecting execution during parsing, using something like #.(do-evil-stuff)). You probably want to do the reading using a custom read-table that disables most (if not all) read-macros. You probably want to do the reading with a custom, one-off package, importing only symbols you allow.
Once you've read the user-provided code, you still need to validate that there's no unexpected function/macro references in the code. If you have used a custom package, you should be able to confirm that each symbol falls in either of the two classes "belongs to the custom one-off package" (this is user-supplied stuff) or "explicitly allowed from elsewhere" (you would need this list to construct the custom package).
Once that's been done, you can then evaluate it.
However, doing this correctly would take a fair bit of care and you really should have someone else have a look at the code and actively try to break out of the sandbox.
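The validation step described above can be sketched roughly as follows. This is Python, not Common Lisp, and it is only a toy: the tokenizer is nowhere near the real Lisp reader (no strings, packages, or read-macros), and it simplifies the two-class check to a single allow-list. The names parse and disallowed_symbols are made up for the example.

```python
import re

TOKEN = re.compile(r"[()]|[^\s()]+")
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")


def parse(text):
    """Parse one S-expression into nested lists of token strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume ")"
            return items
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        return tok

    form = read()
    if pos != len(tokens):
        raise ValueError("trailing input after the first form")
    return form


def disallowed_symbols(form, allowed):
    """Return every symbol in form that is not in the allow-list.

    allowed must hold lowercase names; symbols are lowercased before the
    lookup so that FOO and foo are treated alike. Numeric literals are
    always accepted.
    """
    if isinstance(form, list):
        bad = set()
        for sub in form:
            bad |= disallowed_symbols(sub, allowed)
        return bad
    if NUMBER.fullmatch(form) or form.lower() in allowed:
        return set()
    return {form}
```

Only a form for which disallowed_symbols returns an empty set would be passed on to evaluation; anything else is rejected before it runs.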
Take a look at the section 'Reader security' in chapter 4 of Let Over Lambda, which discusses this topic in some depth. In particular, you probably want to set *read-eval* to nil. To address your question regarding restricting access to the environment, this is generally difficult in Common Lisp, as it is designed to allow access to most pieces of the system in the first place. Maybe you can elaborate on the ideas of Let Over Lambda in the direction of whitelisting symbols (in comparison to the blacklisting of macro characters in the linked chapter). I don't think there are any ready-made solutions.

'make'-like dependency-tracking library?

There are many nice things to like about Makefiles, and many pains in the butt.
In the course of doing various projects (I'm a research scientist, "data scientist", or whatever) I often find myself starting out with a few data objects on disk, generating various artifacts from those, generating artifacts from those artifacts, and so on.
It would be nice if I could just say "this object depends on these other objects", and "this object is created in the following manner from these objects", and then ask a Make-like framework to handle the details of actually building them, figuring out which objects need to be updated, farming out work to multiple processors (like Make's -j option), and so on. Makefiles can do all this - but the huge problem is that all the actions have to be written as shell commands. This is not convenient if I'm working in R or Perl or another similar environment. Furthermore, a strong assumption in Make is that all targets are files - there are some exceptions and workarounds, but if my targets are e.g. rows in a database, that would be pretty painful.
To be clear, I'm not after a software-build system. I'm interested in something that (more generally?) deals with dependency webs of artifacts.
Anyone know of a framework for these kinds of dependency webs? Seems like it could be a nice tool for doing data science, & visually showing how results were generated, etc.
One extremely interesting example I saw recently was IncPy, but it looks like it hasn't been touched in quite a while, and it's very closely coupled with Python. It's probably also much more ambitious than I'm hoping for, which is why it has to be so closely coupled with Python.
Sorry for the vague question, let me know if some clarification would be helpful.
A new system called "Drake" was announced today that targets this exact situation: http://blog.factual.com/introducing-drake-a-kind-of-make-for-data . Looks very promising, though I haven't actually tried it yet.
This question is several years old, but I thought adding a link to remake here would be relevant.
From the GitHub repository:
The idea here is to re-imagine a set of ideas from make but built for R. Rather than having a series of calls to different instances of R (as happens if you run make on R scripts), the idea is to define pieces of a pipeline within an R session. Rather than being language agnostic (like make must be), remake is unapologetically R focussed.
It is not on CRAN yet, and I haven't tried it, but it looks very interesting.
I would give Bazel a try for this. It is primarily a software build system, but with its genrule type of artifacts it can perform pretty arbitrary file generation, too.
Bazel is very extendable, using its Python-like Starlark language which should be far easier to use for complicated tasks than make. You can start by writing simple genrule steps by hand, then refactor common patterns into macros, and if things become more complicated even write your own rules. So you should be able to express your individual transformations at a high level that models how you think about them, then turn that representation into lower level constructs using something that feels like a proper programming language.
Where make depends on timestamps, Bazel checks fingerprints. So if any one step produces the same output even though one of its inputs changed, then subsequent steps won't need to be recomputed. If some of your data processing steps project or filter data, there might be a high probability of this kind of thing happening.
I see your question is tagged for R, even though it doesn't mention it much. Under the hood, R computations would in Bazel still boil down to R CMD invocations on the shell. But you could have complicated multi-line commands assembled in complicated ways, to read your inputs, process them and store the outputs. If the cost of initializing the R binary is a concern, Rserve might help, although I believe using it would make the setup depend on a locally accessible Rserve instance. Even with that, I see nothing that would avoid the cost of storing the data to file and loading it back from file. If you want something that avoids that cost by keeping things in memory between steps, then you'd be looking at a very R-specific tool, not a generic tool like you requested.
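To make the fingerprint idea concrete, here is a small in-memory sketch in Python. It is not how Bazel is implemented; the Graph class, its methods, and the use of repr() as the thing being hashed are all assumptions made for the example. A target is rebuilt only when the fingerprints of its inputs differ from those seen at its last build, so a step whose output is unchanged does not trigger the steps after it.

```python
import hashlib


def fingerprint(value):
    """Content hash of a value (assumes repr() is stable for the data used)."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


class Graph:
    """Toy dependency graph; no cycle detection, no parallelism."""

    def __init__(self):
        self.rules = {}   # target -> (dependency names, build function)
        self.values = {}  # name -> current value (sources and built targets)
        self.seen = {}    # target -> input fingerprints used for its last build
        self.log = []     # targets that were actually rebuilt, in order

    def source(self, name, value):
        """Register or update a raw input."""
        self.values[name] = value

    def rule(self, target, deps, func):
        """Declare that target is produced by func(*deps)."""
        self.rules[target] = (tuple(deps), func)

    def build(self, target):
        if target not in self.rules:
            return self.values[target]  # plain source, nothing to build
        deps, func = self.rules[target]
        inputs = [self.build(dep) for dep in deps]
        prints = [fingerprint(value) for value in inputs]
        if self.seen.get(target) != prints:
            # Inputs changed (or first build): recompute and remember why.
            self.values[target] = func(*inputs)
            self.seen[target] = prints
            self.log.append(target)
        return self.values[target]
```

For instance, with a "sorted" target derived from a raw list and a "total" target derived from "sorted", reordering the raw list rebuilds "sorted" but leaves "total" alone, because the sorted output has the same fingerprint as before.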
In terms of “visually showing how results were generated”, bazel query --output graph can be used to generate a graphviz dot file of the dependency graph.
Disclaimer: I'm currently working at Google, which internally uses a variant of Bazel called Blaze. Actually Bazel is the open-source released version of Blaze. I'm very familiar with using Blaze, but not with setting up Bazel from scratch.
Red-R has a concept of data flow programming. I have not tried it yet.

Interactive math proof system

I'm looking for a tool (GUI preferred but CLI would work) that allows me to input math expressions and then perform manipulations of them but restricts me to only mathematically valid operations. Also, the tool must be able to save a session and later prove that the given set of saved operations is valid.
Note: I am not looking for a system to generate proofs, only one that checks that the steps I manually specify are valid.
I have used ACL2 for similar operations and it does well for some cases but it is very hard to use for everything else.
This little project is my motivation. It is a D template type that allows for equation solving. Given this equation:
(A * B) = C + D / F;
Any one of the symbols can be set as unknown and evaluating that expression will result in an assignment to that variable. It works by building expression trees into the type and then using rewrite rules to convert it to something that can be evaluated for the unknown type.
What I need is some way to validate the rewrite rules. They can be validated by testing the assertion that, given that some relation is true, another one is also true.
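Short of a proof, one cheap sanity check for such a rule is randomized testing with exact arithmetic. The sketch below is in Python, not D, and is only a test, not a proof of validity. It takes the rewrite "solve (A * B) = C + D / F for A" and checks that the candidate solution satisfies the original relation on random rational inputs. The function names are made up for the example, and the inputs are kept nonzero because the rule needs B and F to be nonzero.

```python
import random
from fractions import Fraction


def original(a, b, c, d, f):
    """The relation from the example: (A * B) = C + D / F."""
    return a * b == c + d / f


def solve_for_a(b, c, d, f):
    """Candidate rewrite: A = (C + D / F) / B, valid only when B != 0, F != 0."""
    return (c + d / f) / b


def rule_holds(candidate, trials=200, seed=0):
    """Check a candidate rewrite against the original on random rationals."""
    rng = random.Random(seed)
    numerators = [n for n in range(-9, 10) if n != 0]

    def nonzero():
        return Fraction(rng.choice(numerators), rng.randint(1, 9))

    for _ in range(trials):
        b, c, d, f = nonzero(), nonzero(), nonzero(), nonzero()
        if not original(candidate(b, c, d, f), b, c, d, f):
            return False  # found a counterexample
    return True
```

A passing run only means no counterexample was found among the sampled inputs; a failing run does refute the rule, which is often enough to catch a mistyped rewrite early.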
Several American proof assistants were mentioned already (usually with LISP syntax), so here is a Europe-centric list to complement that:
Coq
Isabelle
HOL4
HOL-Light
Mizar
All of them are notorious for TTY interfaces, but Coq and Isabelle provide good support for the Proof General / Emacs interface. Moreover, Coq comes with CoqIDE, which is based on OCaml/GTK and the on-board text widget. Recent Isabelle includes the Isabelle/jEdit Prover IDE, which is based on jEdit and augmented by semantic markup provided by the prover in real-time as the user types.
ACL2 is notorious -- we used to say it was an expert system, and so could only be used by experts, who had to learn from Warren Hunt, J Moore, or Bob Boyer. The thing you need to do in ACL2 is really really understand how the proof system itself works; then you can "hint" it in directions that reduce the search space.
There are several other systems that can help with this kind of thing, though, depending on what you're trying to do.
If you want to work with continuous math or number theory, the ideal is Mathematica. Problem is you can buy a used car for the same amount of money (unless you can qualify for an academic license, a far better deal.)
Something similar, and free, is Open Maxima, which is an extension of Macsyma. That page also points to several others like Axiom, that I've got no experience with.
For mathematical logic operations, there's PVS from SRI. They've got some other cool stuff like model-checking in the same framework.
There's ongoing research in this area, it's called "Theorem proving in computer algebra".
People are trying to merge the ease of use and power of computer algebra systems like Mathematica, Maple, ... with the logical rigor of proof systems. The problems are:
Computer algebra systems are not rigorous. They tend to forget side conditions such as that a divisor must not be 0.
The proof systems are hard and tedious to use (as you have discovered).
In addition to Charlie Martin's links, you may also want to check out Maple. My experience with such software is about 5 years old, but I recall at the time finding Maple to be much more intuitive than Mathematica.
The Lean prover is interactive through a JS GUI.
An old and unmaintained system is 'Ontic':
http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/kr/systems/ontic/0.html

Resources