String Parsing - Recursively - recursion

I have an string/expression like this:
(((p1 == 1) && (p2 != 2)) || p3 > 3) || (p4 < 5)
I want to parse this expression recursively in order to build a binary expression tree.
So, for this expression, the root would be || operator.
How can I build that algorithm?
Thanks in advance,

Take a look at the Shunting-Yard Algorithm.
In computer science, the shunting-yard algorithm is a method for
parsing mathematical expressions specified in infix notation. It can
be used to produce output in Reverse Polish notation (RPN) or as an
abstract syntax tree (AST). The algorithm was invented by Edsger
Dijkstra and named the "shunting yard" algorithm because its operation
resembles that of a railroad shunting yard.

Related

Examples of recursive predicates

In Stone's Algorithms for Functional Programming, he gives a design pattern for recursively defined predicates, which in Scheme is
(define (check stop? continue? step)
(rec (checker . arguments)
(or (apply stop? arguments)
(and (apply continue? arguments)
(apply (pipe step checker) arguments)))))
where pipe is the author's function to compose two functions in diagrammatic order, ((pipe f g) x = (g (f x)).
So, for instance, to test whether a function is a power of two, you could define
(define power-of-two? (check (sect = <> 1) even? halve))
where (sect = <> 1) is the author's notation for currying, equivalent to lambda x: x == 1.
Clearly a lot of predicates could be implemented recursively but it would not be useful. And clearly there are some recursive predicates that wouldn't use this pattern, like predicates on trees.
What are some classic predicates that fit this pattern? I guess testing if something is in the Cantor set, but that's almost the same as the above.
It is not clear what you are asking but your example is a classical example of programming with combinators.
The combinators are functions that take as input functions and return functions.
Combinators are fundamental in functional programming. Using them you can implement everything. For instance, if you define a data structure for an object as a function you can compose objects using some combinator and get a new object.
The combinator from your exemple seems to be useful to check some predicate about the composition of some monads.

Convergence and vectors theories

Is there a convergence theory in Isabelle/HOL? I need to define ∥x(t)∥ ⟶ 0 as t ⟶ ∞.
Also, I'm looking for vectors theory, I found a matrix theory but I couldn't find the vectors one, Is there exist such theory in Isabelle/HOL?
Cheers.
Convergence etc. are expressed with filters in Isabelle. (See the corresponding documentation)
In your case, that would be something like
filterlim (λt. norm (x t)) (nhds 0) at_top
or, using the tendsto abbreviation,
((λt. norm (x t)) ⤏ 0) at_top
where ⤏ is the Isabelle symbol \<longlongrightarrow>, which can be input using the abbreviation --->.
As a side note, I am wondering why you are writing it that way in the first place, seeing as it is equivalent to
filterlim x (nhds 0) at_top
or, with the tendsto syntax:
(x ⤏ 0) at_top
Reasoning with these filters can be tricky at first, but it has the advantage of providing a unified framework for limits and other topological concepts, and once you get the hang of it, it is very elegant.
As for vectors, just import ~~/src/HOL/Analysis/Analysis. That should have everything you need. Ideally, build the HOL-Analysis session image by starting Isabelle/jEdit with isabelle jedit -l HOL-Analysis. Then you won't have to process all of Isabelle's analysis library every time you start the system.
I assume that by ‘vectors’ you mean concrete finite-dimensional real vector spaces like ℝn. This is provided by ~~/src/HOL/Analysis/Finite_Cartesian_Product.thy, which is part of HOL-Analysis. This provides the vec type, which takes two parameters: the component type (probably real in your case) and the index type, which specifies the dimension of the vector space.
There is also a pre-defined type n for every positive integer n, so that you can write e.g. (real, 3) vec for the vector space ℝ³. There is also type syntax so that you can write 'a ^ 'n for ('a, 'n) vec.

Questions about Abstract Syntax Tree (AST)

I am finding difficulties to understand
1) AST matching, how two AST's are similar? Are types included in the comparison/matching or only the operations like +, -, ++,...etc inlcuded?
2) Two statements are syntactically similar (This term I read it somewhere in a paper), can we say the below example that the two statement are syntactically similar?
int x = 1 + 2
String y = "1" + "2"
Java - Eclipse is what am using right now and trying to understand the AST for.
Best Regards,
What ASTs are:
An AST is a data structure representing program source text that consists of nodes that contain a node type, and possibly a literal value, and a list of children nodes. The node type corresponds to what OP calls "operations" (+, -, ...) but also includes language commands (do, if, assignment, call, ...) , declarations (int, struct, ...) and literals (number, string, boolean). [It is unclear what OP means by "type"]. (ASTs often have additional information in each node referring back to the point of origin in the source text file).
What ASTs are for:
OP seems puzzled by the presence of ASTs in Eclipse.
ASTs are used to represent program text in a form which is easier to interpret than the raw text. They provide a means to reason about the program structure or content; sometimes they are used to enable the modification of program ("refactoring") by modifying the AST for a program and then regenerating the text from the AST.
Comparing ASTs for similarity is not a really common use in my experience, except in clone detection and/or pattern matching.
Comparing ASTs:
Comparing ASTs for equality is easy: compare the root node type/literal value for equality; if not equal, the comparision is complete, else (recursively) compare the children nodes).
Comparing ASTs of similarity is harder because you must decide how to relax the equality comparision. In particular, you must decide on a precise definition of similarity. There are many ways to define this, some rather shallow syntactically, some more semantically sophisticated.
My paper Clone Detection Using Abstract Syntax Trees describes one way to do this, using similarity defined as the ratio of the number of nodes shared divided by the number of nodes total in both trees. Shared nodes are computed by comparing the trees top down to the point where some child is different. (The actual comparision is to compute an anti-unifier). This similary measure is rather shallow, but it works remarkably well in finding code clones in big software systems.
From that perspective, OPs's examples:
int x = 1 + 2
String y = "1" + "2"
have trees written as S-expressions:
(declaration_with_assignment (int x) (+ 1 2))
(declaration_with_assignment (String y) (+ "1" "2"))
These are not very similar; they only share a root node whose type is "declaration-with-assignment" and the top of the + subtree. (Here the node count is 12 with only 2 matching nodes for a similarity of 2/12).
These would be more similar:
int x = 1 + 2
float x = 1.0 + 2
(S-expressions)
(declaration_with_assignment (int x) (+ 1 2))
(declaration_with_assignment (float x) (+ 1.0 2))
which share the declaration with assignment, the add node, the literal leaf node 2, and arguably the literal nodes for integer 1 and float 1.0, depending on whether you wish to define them as "equal" or not, for a similarity of 4/12.
If you change one of the trees to be a pattern tree, in which some "leaves" are pattern variables, you can then use such pattern trees to find code that has certain structure.
The surface syntax pattern:
?type ?variable = 1 + ?expression
with S-expression
((declaration_with_assignment (?type ?varaible)) (+ 1 ?expression))
matches the first of OP's examples but not the second.
As far as I know, Eclipse doesn't offer any pattern-based matching abilities.
But these are very useful in program analysis and/or program transformation tools. For some specific examples, too long to include here, see DMS Rewrite Rules
(Full disclosure: DMS is a product of my company. I'm the architect).

Derivative Calculator

I'm interested in building a derivative calculator. I've racked my brains over solving the problem, but I haven't found a right solution at all. May you have a hint how to start? Thanks
I'm sorry! I clearly want to make symbolic differentiation.
Let's say you have the function f(x) = x^3 + 2x^2 + x
I want to display the derivative, in this case f'(x) = 3x^2 + 4x + 1
I'd like to implement it in objective-c for the iPhone.
I assume that you're trying to find the exact derivative of a function. (Symbolic differentiation)
You need to parse the mathematical expression and store the individual operations in the function in a tree structure.
For example, x + sin²(x) would be stored as a + operation, applied to the expression x and a ^ (exponentiation) operation of sin(x) and 2.
You can then recursively differentiate the tree by applying the rules of differentiation to each node. For example, a + node would become the u' + v', and a * node would become uv' + vu'.
you need to remember your calculus. basically you need two things: table of derivatives of basic functions and rules of how to derivate compound expressions (like d(f + g)/dx = df/dx + dg/dx). Then take expressions parser and recursively go other the tree. (http://www.sosmath.com/tables/derivative/derivative.html)
Parse your string into an S-expression (even though this is usually taken in Lisp context, you can do an equivalent thing in pretty much any language), easiest with lex/yacc or equivalent, then write a recursive "derive" function. In OCaml-ish dialect, something like this:
let rec derive var = function
| Const(_) -> Const(0)
| Var(x) -> if x = var then Const(1) else Deriv(Var(x), Var(var))
| Add(x, y) -> Add(derive var x, derive var y)
| Mul(a, b) -> Add(Mul(a, derive var b), Mul(derive var a, b))
...
(If you don't know OCaml syntax - derive is two-parameter recursive function, with first parameter the variable name, and the second being mathched in successive lines; for example, if this parameter is a structure of form Add(x, y), return the structure Add built from two fields, with values of derived x and derived y; and similarly for other cases of what derive might receive as a parameter; _ in the first pattern means "match anything")
After this you might have some clean-up function to tidy up the resultant expression (reducing fractions etc.) but this gets complicated, and is not necessary for derivation itself (i.e. what you get without it is still a correct answer).
When your transformation of the s-exp is done, reconvert the resultant s-exp into string form, again with a recursive function
SLaks already described the procedure for symbolic differentiation. I'd just like to add a few things:
Symbolic math is mostly parsing and tree transformations. ANTLR is a great tool for both. I'd suggest starting with this great book Language implementation patterns
There are open-source programs that do what you want (e.g. Maxima). Dissecting such a program might be interesting, too (but it's probably easier to understand what's going on if you tried to write it yourself, first)
Probably, you also want some kind of simplification for the output. For example, just applying the basic derivative rules to the expression 2 * x would yield 2 + 0*x. This can also be done by tree processing (e.g. by transforming 0 * [...] to 0 and [...] + 0 to [...] and so on)
For what kinds of operations are you wanting to compute a derivative? If you allow trigonometric functions like sine, cosine and tangent, these are probably best stored in a table while others like polynomials may be much easier to do. Are you allowing for functions to have multiple inputs,e.g. f(x,y) rather than just f(x)?
Polynomials in a single variable would be my suggestion and then consider adding in trigonometric, logarithmic, exponential and other advanced functions to compute derivatives which may be harder to do.
Symbolic differentiation over common functions (+, -, *, /, ^, sin, cos, etc.) ignoring regions where the function or its derivative is undefined is easy. What's difficult, perhaps counterintuitively, is simplifying the result afterward.
To do the differentiation, store the operations in a tree (or even just in Polish notation) and make a table of the derivative of each of the elementary operations. Then repeatedly apply the chain rule and the elementary derivatives, together with setting the derivative of a constant to 0. This is fast and easy to implement.

algorithm to find derivative

I'm writing program in Python and I need to find the derivative of a function (a function expressed as string).
For example: x^2+3*x
Its derivative is: 2*x+3
Are there any scripts available, or is there something helpful you can tell me?
If you are limited to polynomials (which appears to be the case), there would basically be three steps:
Parse the input string into a list of coefficients to x^n
Take that list of coefficients and convert them into a new list of coefficients according to the rules for deriving a polynomial.
Take the list of coefficients for the derivative and create a nice string describing the derivative polynomial function.
If you need to handle polynomials like a*x^15125 + x^2 + c, using a dict for the list of coefficients may make sense, but require a little more attention when doing the iterations through this list.
sympy does it well.
You may find what you are looking for in the answers already provided. I, however, would like to give a short explanation on how to compute symbolic derivatives.
The business is based on operator overloading and the chain rule of derivatives. For instance, the derivative of v^n is n*v^(n-1)dv/dx, right? So, if you have v=3*x and n=3, what would the derivative be? The answer: if f(x)=(3*x)^3, then the derivative is:
f'(x)=3*(3*x)^2*(d/dx(3*x))=3*(3*x)^2*(3)=3^4*x^2
The chain rule allows you to "chain" the operation: each individual derivative is simple, and you just "chain" the complexity. Another example, the derivative of u*v is v*du/dx+u*dv/dx, right? If you get a complicated function, you just chain it, say:
d/dx(x^3*sin(x))
u=x^3; v=sin(x)
du/dx=3*x^2; dv/dx=cos(x)
d/dx=v*du+u*dv
As you can see, differentiation is only a chain of simple operations.
Now, operator overloading.
If you can write a parser (try Pyparsing) then you can request it to evaluate both the function and derivative! I've done this (using Flex/Bison) just for fun, and it is quite powerful. For you to get the idea, the derivative is computed recursively by overloading the corresponding operator, and recursively applying the chain rule, so the evaluation of "*" would correspond to u*v for function value and u*der(v)+v*der(u) for derivative value (try it in C++, it is also fun).
So there you go, I know you don't mean to write your own parser - by all means use existing code (visit www.autodiff.org for automatic differentiation of Fortran and C/C++ code). But it is always interesting to know how this stuff works.
Cheers,
Juan
Better late than never?
I've always done symbolic differentiation in whatever language by working with a parse tree.
But I also recently became aware of another method using complex numbers.
The parse tree approach consists of translating the following tiny Lisp code into whatever language you like:
(defun diff (s x)(cond
((eq s x) 1)
((atom s) 0)
((or (eq (car s) '+)(eq (car s) '-))(list (car s)
(diff (cadr s) x)
(diff (caddr s) x)
))
; ... and so on for multiplication, division, and basic functions
))
and following it with an appropriate simplifier, so you get rid of additions of 0, multiplying by 1, etc.
But the complex method, while completely numeric, has a certain magical quality. Instead of programming your computation F in double precision, do it in double precision complex.
Then, if you need the derivative of the computation with respect to variable X, set the imaginary part of X to a very small number h, like 1e-100.
Then do the calculation and get the result R.
Now real(R) is the result you would normally get, and imag(R)/h = dF/dX
to very high accuracy!
How does it work? Take the case of multiplying complex numbers:
(a+bi)(c+di) = ac + i(ad+bc) - bd
Now suppose the imaginary parts are all zero, except we want the derivative with respect to a.
We set b to a very small number h. Now what do we get?
(a+hi)(c) = ac + hci
So the real part of this is ac, as you would expect, and the imaginary part, divided by h, is c, which is the derivative of ac with respect to a.
The same sort of reasoning seems to apply to all the differentiation rules.
Symbolic Differentiation is an impressive introduction to the subject-at least for non-specialist like me :) The code is written in C++ btw.
Look up automatic differentiation. There are tools for Python. Also, this.
If you are thinking of writing the differentiation program from scratch, without utilizing other libraries as help, then the algorithm/approach of computing the derivative of any algebraic equation I described in my blog will be helpful.
You can try creating a class that will represent a limit rigorously and then evaluate it for (f(x)-f(a))/(x-a) as x approaches a. That should give a pretty accurate value of the limit.
if you're using string as an input, you can separate individual terms using + or - char as a delimiter, which will give you individual terms. Now you can use power rule to solve for each term, say you have x^3 which using power rule will give you 3x^2, or suppose you have a more complicated term like a/(x^3) or a(x^-3), again you can single out other variables as a constant and now solving for x^-3 will give you -3a/(x^2). power rule alone should be enough, however it will require extensive use of the factorization.
Unless any already made library deriving it's quite complex because you need to parse and handle functions and expressions.
Deriving by itself it's an easy task, since it's mechanical and can be done algorithmically but you need a basic structure to store a function.

Resources