I am trying to implement a basic parser, scanner, and a minimal language in OCaml. I believe the issue is that I'm trying to maintain a map between variables in this toy language and their values, and the language should be able to handle an expression like a=2;a and return 2. The name seems to successfully store the number 2, but by the time the program moves on to evaluating the second expression, it does not find the name in the map. And I can't understand why.
Below is the abstract syntax tree.
Ast.ml
type operator = Add (* for now just one operator *)
type expr =
  | Binop of expr * operator * expr
  | Lit of int (* a number *)
  | Seq of expr * expr (* a sequence, to behave like ";" *)
  | Asn of string * expr (* assignment, to behave like "=" in an imperative language *)
  | Var of string (* a variable *)
Here are the parser and scanner.
parser.mly
%{
open Ast
%}
%token SEQ PLUS ASSIGN EOF
%token <int> LITERAL
%token <string> VARIABLE
%left SEQ PLUS
%start expr
%type <Ast.expr> expr
%%
expr:
| expr SEQ expr { Seq($1, $3) }
| expr PLUS expr { Binop($1, Add, $3) }
| LITERAL { Lit($1) }
| VARIABLE { Var($1) }
| VARIABLE ASSIGN expr { Asn($1, $3) }
scanner.mll
{
open Parser
}
rule tokenize = parse
[' ' '\t' '\r' '\n'] { tokenize lexbuf }
| '+' { PLUS }
| ['0'-'9']+ as lit { LITERAL(int_of_string lit) }
| ['a'-'z']+ as id { VARIABLE(id) }
| '=' { ASSIGN }
| ';' { SEQ }
| eof { EOF }
And here's where I tried to implement a sort of name-space in a basic calculator.
calc.ml
open Ast
module StringMap = Map.Make(String)
let namespace = ref StringMap.empty
let rec eval exp = match exp with
  | Lit(n) -> n
  | Binop(e1, op, e2) ->
    let v1 = eval e1 in
    let v2 = eval e2 in
    v1+v2
  | Asn (name, e) ->
    let v = eval e in
    (namespace := StringMap.add name v !namespace; v)
  | Var(name) -> StringMap.find name !namespace
  | Seq(e1, e2) ->
    (let _ = eval e1 in
     eval e2)
let _ =
  let lexbuf = Lexing.from_channel stdin in
  let expr = Parser.expr Scanner.tokenize lexbuf in
  let result = eval expr in
  print_endline (string_of_int result)
To test it, I compile (which succeeds), run $ ./calc in a terminal, enter a=2;a and press Ctrl+D. It should print 2, but it raises a Not_found exception. Presumably this comes from the StringMap.find line, which is not finding the name in the namespace. I've tried adding print statements, and I think I can confirm that the sequence is being processed correctly and that the first evaluation succeeds, with the name and value being entered into the string map. But for some reason the binding seems not to be there when the program moves on to the second expression in the sequence.
I'd appreciate any clarification, thanks.
I cannot reproduce your error.
Feeding the AST directly to eval
let () =
let ast = Seq(Asn ("a", Lit 2),Var "a") in
let result = eval ast in
print_endline (string_of_int result)
prints 2 as expected.
After fixing your parser to recognize the end of the stream:
entry:
| expr EOF { $1 }
using it in
let () =
let s = Lexing.from_string "a=2;a\n" in
let ast = Parser.entry Scanner.tokenize s in
let result = eval ast in
print_endline (string_of_int result)
prints 2 as expected. And without this fix, your code fails with a syntax error.
EDIT:
Rather than using a makefile, I would advise using dune, with the following simple dune file:
(menhir (modules parser))
(ocamllex scanner)
(executable (name calc))
It will at least solve your compilation troubles.
I don't know if you are still looking for a solution, but I was able to reproduce the problem. I solved it by adding ASSIGN to the %left declaration in parser.mly:
%left SEQ ASSIGN PLUS
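A sketch of why the precedence declaration could matter, under one assumption: with no precedence given for ASSIGN, the generated parser resolves the conflict by shifting, so a=2;a is read as a=(2;a) rather than (a=2);a. The snippet below does not run the parser at all. It builds both trees by hand and evaluates them with a trimmed copy of eval from the question; eval_result, mis_parsed and intended are names introduced here for illustration only.

```ocaml
(* Trimmed copy of the question's AST and evaluator (Binop omitted). *)
type expr =
  | Lit of int
  | Seq of expr * expr
  | Asn of string * expr
  | Var of string

module StringMap = Map.Make (String)

let namespace = ref StringMap.empty

let rec eval = function
  | Lit n -> n
  | Asn (name, e) ->
      let v = eval e in
      namespace := StringMap.add name v !namespace;
      v
  | Var name -> StringMap.find name !namespace
  | Seq (e1, e2) ->
      let _ = eval e1 in
      eval e2

(* Hypothetical helper: start from an empty namespace and report a
   failed lookup as None instead of letting Not_found escape. *)
let eval_result ast =
  namespace := StringMap.empty;
  match eval ast with
  | v -> Some v
  | exception Not_found -> None

(* Assumed parse when ASSIGN has no precedence: a = (2; a) *)
let mis_parsed = Asn ("a", Seq (Lit 2, Var "a"))

(* Intended parse: (a = 2); a *)
let intended = Seq (Asn ("a", Lit 2), Var "a")

let () =
  (* The right-hand side is evaluated first, so Var "a" is looked up
     before the assignment has stored anything. *)
  assert (eval_result mis_parsed = None);
  assert (eval_result intended = Some 2)
```

If the assumption holds, this would also explain why feeding the intended AST directly to eval works while the parsed input fails.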
I would like to work with the following type
type RecordPath<'a,'b> = {
    Get: 'a -> 'b
    Path: string
}
Its purpose is to define a getter for going from record type 'a to some field within 'a of type 'b. It also gives the path to that field for the JSON representation of the record.
For example, consider the following fields.
type DateWithoutTimeBecauseWeirdlyDotnetDoesNotHaveThisConcept = {
    Year: uint
    Month: uint
    Day: uint
}
type Person = {
    FullName: string
    PassportNumber: string
    BirthDate: DateWithoutTimeBecauseWeirdlyDotnetDoesNotHaveThisConcept
}
type Team = {
    TeamName: string
    TeamMembers: Person list
}
An example RecordPath might be
let birthYearPath = {
    Get = fun (team:Team) -> team.TeamMembers |> List.map (fun p -> p.BirthDate.Year)
    Path = "$.TeamMember[*].BirthDate.Year" //using mariadb format for json path
}
Is there some way of letting a library user create this record without ever needing to specify the string explicitly? Ideally there is some strongly typed way for the user to specify the fields involved. Maybe some kind of clever use of reflection?
It just occurred to me that with a language that supports macros, this would be possible. But can it be done in F#?
PS: I notice that I left out the s in "TeamMembers" in the path. This is the kind of thing I want to guard against to make it easier on the user.
As you noted in the comments, F# has a quotation mechanism that lets you do this. You can create quotations explicitly using the <@ ... @> notation or implicitly using a somewhat more elegant automatic quoting mechanism. The quotations are fairly close representations of the F# code, so converting them to the desired path format is not going to be easy, but I think it can be done.
I tried to get this to work at least for your small example. First, I needed a helper function that does two transformations on the code and turns:
let x = e1 in e2 into e2[x <- e1] (using the notation e2[x <- e1] to mean a substitution, i.e. expression e2 with all occurrences of x replaced by e1)
e1 |> fun x -> e2 into e2[x <- e1]
This is all I needed for your example, but it's likely you'll need a few more cases:
open Microsoft.FSharp.Quotations
let rec simplify dict e =
    let e' = simplifyOne dict e
    if e' <> e then simplify dict e' else e'
and simplifyOne dict = function
    | Patterns.Call(None, op, [e; Patterns.Lambda(v, body)])
        when op.Name = "op_PipeRight" ->
        simplify (Map.add v e dict) body
    | Patterns.Let(v, e, body) -> simplify (Map.add v e dict) body
    | ExprShape.ShapeVar(v) when Map.containsKey v dict -> dict.[v]
    | ExprShape.ShapeVar(v) -> Expr.Var(v)
    | ExprShape.ShapeLambda(v, e) -> Expr.Lambda(v, simplify dict e)
    | ExprShape.ShapeCombination(o, es) ->
        ExprShape.RebuildShapeCombination(o, List.map (simplify dict) es)
With this pre-processing, I managed to write an extractPath function like this:
let rec extractPath var = function
    | Patterns.Call(None, op, [Patterns.Lambda(v, body); inst]) when op.Name = "Map" ->
        extractPath var inst + "[*]." + extractPath v.Name body
    | Patterns.PropertyGet(Some(Patterns.Var v), p, []) when v.Name = var -> p.Name
    | Patterns.PropertyGet(Some e, p, []) -> extractPath var e + "." + p.Name
    | e -> failwithf "Unexpected expression: %A" e
This looks for (1) a call to map function, (2) a property access on a variable that represents the data source and (3) a property access where the instance has some more property accesses.
The following now works for your small example (but probably for nothing else!)
type Path =
    static member Make([<ReflectedDefinition(true)>] f:Expr<'T -> 'R>) =
        match f with
        | Patterns.WithValue(f, _, Patterns.Lambda(v, body)) ->
            { Get = f :?> 'T -> 'R
              Path = "$." + extractPath v.Name (simplify Map.empty body) }
        | _ -> failwith "Unexpected argument"
Path.Make(fun (team:Team) -> team.TeamMembers |> List.map (fun p -> p.BirthDate.Year))
The way I solved this is
let jsonPath userExpr =
    let rec innerLoop expr state =
        match expr with
        | Patterns.Lambda(_, body) ->
            innerLoop body state
        | Patterns.PropertyGet(Some parent, propInfo, []) ->
            sprintf ".%s%s" propInfo.Name state |> innerLoop parent
        | Patterns.Call (None, _, expr1::[Patterns.Let (v, expr2, _)]) when v.Name = "mapping" ->
            let parentPath = innerLoop expr1 "[*]"
            let childPath = innerLoop expr2 ""
            parentPath + childPath
        | ExprShape.ShapeVar x ->
            state
        | _ ->
            failwithf "Unsupported expression: %A" expr
    innerLoop userExpr "" |> sprintf "$%s"
type Path =
    static member Make([<ReflectedDefinition(true)>] f:Expr<'T -> 'R>) =
        match f with
        | Patterns.WithValue(f, _, expr) ->
            let path = jsonPath expr
            {
                Get = f :?> 'T -> 'R
                Path = path
            }
        | _ -> failwith "Unexpected argument"
Caveat: I don't know enough about these techniques to tell if Tomas' answer performs better in some edge cases than mine.
How can we build a function in F# that outputs the name of the variable passed in? For example:
let someVar1 = "x"
getVarname someVar1 //output would be "someVar1"
let someVar2 = "y"
getVarname someVar2 //output would be "someVar2"
let f toString = fun a -> printfn "%s: %d" (toString a) a
let x = 1
f getVarname x //output would be: "x: 1"
I found a similar question in C# here (get name of a variable or parameter), but I was unable to make it work in F#.
If you use quotations and static methods, you can already capture the name of the variable in F# 4 using the ReflectedDefinition attribute. The Demo.GetVarName static method in the following example returns the name of the variable used as an argument together with the value:
open Microsoft.FSharp.Quotations
type Demo =
    static member GetVarName([<ReflectedDefinition(true)>] x:Expr<int>) =
        match x with
        | Patterns.WithValue(_, _, Patterns.ValueWithName(value, _, name)) ->
            name, value :?> int
        | _ -> failwithf "Argument was not a variable: %A" x
let test () =
    let yadda = 123
    Demo.GetVarName(yadda)
test()
This works for local variables as in the test() function above. For top-level variables (which are actually compiled as properties) you also need to add a case for PropertyGet:
match x with
| Patterns.WithValue(_, _, Patterns.ValueWithName(value, _, name)) ->
    name, value :?> int
| Patterns.WithValue(value, _, Patterns.PropertyGet(_, pi, _)) ->
    pi.Name, value :?> int
| _ -> failwithf "Argument was not a variable: %A" x
The nameof operator is already defined in FSharp.Core, but the F# 5 compiler support for it hasn't shipped yet.
When it does, you can use it to get the name of a symbol.
let someVar1 = None
let name = nameof someVar1 // name = "someVar1"
For now, we can maybe abuse the dynamic operator to get us a shim which you can eventually replace with nameof
let name = ()
let (?) _ name = string name
Usage:
let someVar1 = None
let name = name?someVar1
It doesn't read too bad, and you get some degree of auto-completion.
If you really want to be able to retrieve the local name and value at the call-site, there's quotations.
open Microsoft.FSharp.Quotations.Patterns // brings ValueWithName into scope
let printVar = function
    | ValueWithName(value, _type, name) -> printfn "%s = %A" name value
    | _ -> ()
The usage is a bit noisy, though.
let someVar1 = 12
printVar <@ someVar1 @> //prints someVar1 = 12
open System
open System.Collections.Generic
type Node<'a>(expr:'a, symbol:int) =
    member x.Expression = expr
    member x.Symbol = symbol
    override x.GetHashCode() = symbol
    override x.Equals(y) =
        match y with
        | :? Node<'a> as y -> symbol = y.Symbol
        | _ -> failwith "Invalid equality for Node."
    interface IComparable with
        member x.CompareTo(y) =
            match y with
            | :? Node<'a> as y -> compare symbol y.Symbol
            | _ -> failwith "Invalid comparison for Node."
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Rec of Node<Ty>
    | Union of Ty list
type NodeDict<'a> = Dictionary<'a,Node<'a>>
let get_nodify_tag =
    let mutable i = 0
    fun () -> i <- i+1; i
let nodify (dict: NodeDict<_>) x =
    match dict.TryGetValue x with
    | true, x -> x
    | false, _ ->
        let x' = Node(x,get_nodify_tag())
        dict.[x] <- x'
        x'
let d = Dictionary(HashIdentity.Structural)
let nodify_ty x = nodify d x
let rec int_string_stream =
    Union
        [
            Tuple [Int; Rec (nodify_ty (int_string_stream))]
            Tuple [String; Rec (nodify_ty (int_string_stream))]
        ]
In the above example, int_string_stream gives a type error, but it neatly illustrates what I want to do. Of course, I want both sides to get tagged with the same symbol in nodify_ty. When I tried changing the Rec type to Node<Lazy<Ty>>, I found that it does not compare them correctly and each side gets a new symbol, which is useless to me.
I am working on a language, and the way I've dealt with storing recursive types up to now is by mapping Rec to an int and then substituting that with the related Ty in a dictionary whenever I need it. Currently, I am in the process of cleaning up the language, and would like to have the Rec case be Node<Ty> rather than an int.
At this point, though, I am not sure what else I could try here. Could this be done somehow?
I think you will need to add some form of explicit "delay" to the discriminated union that represents your types. Without an explicit delay, you'll always end up fully evaluating the types and so there is no potential for closing the loop.
Something like this seems to work:
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Rec of Node<Ty>
    | Union of Ty list
    | Delayed of Lazy<Ty>
// (rest is as before)
let rec int_string_stream = Delayed(Lazy.Create(fun () ->
    Union
        [
            Tuple [Int; Rec (nodify_ty (int_string_stream))]
            Tuple [String; Rec (nodify_ty (int_string_stream))]
        ]))
This will mean that when you pattern match on Ty, you'll always need to check for Delayed, evaluate the lazy value and then pattern match again, but that's probably doable!
In my AST file I defined the type stmt with, among others, a case List of stmt list. I then have a function with a parameter of type stmt, but I cannot pass it a stmt list. Can anyone tell me how to make it work?
The error when compiling is:
Error: This expression has type Ast.stmt list
but an expression was expected of type Ast.stmt
Below is my code for function:
let rec eval_stmt (stmt:stmt) (ps:proc_state) (env:environment) (store:store) : (stmtEvalRes*proc_state*store) = match stmt with
  | PrintInt e ->
    let r = eval_expr e ps env store in
    print_int r; (Next, ps, store)
  | PrintStr s ->
    print_string (Str.global_replace (Str.regexp "\\\\n") "\n" s);
    (* Escaping characters here because it's not done in the parser *)
    (Next, ps, store)
  | List (stmt1::stmts) -> eval_stmt stmt1 ps env store;
    eval_stmt stmts ps env store;(Next,ps,store)
Below is my AST file:
type stmt = (* Statements in C-flat language *)
    Empty
  | VarAss of expr * expr
  (* Expressions can be statements in themselves. This is most useful when
     the expression is a function call. Calls to built-in functions are
     just special cases of calls. *)
  | Expr of expr
  | PrintStr of string (* string is only static! *)
  | PrintInt of expr (* expr should be int *)
  (* Control Flow *)
  | IfThen of cond * stmt
  | IfThenElse of cond * stmt * stmt
  | Switch of expr * (int * stmt list) list * stmt list
  (* Switch (cond, cases, default) *)
  | While of cond * stmt
  | For of stmt * cond * stmt * stmt
  | Break (* used in switch, while and for *)
  | Continue (* used in while and for *)
  (* nested statements *)
  | List of stmt list
;;
You define stmt as a variant type, i.e., a finite collection of alternatives. One of the alternatives is List, which indeed contains a stmt list. But this doesn't mean that stmt list is the same type as stmt. What it means is that a value List ... (with a list of statements inside) is of type stmt.
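In practice this means the List case has to hand each element to eval_stmt individually, instead of passing the tail (a stmt list) where a stmt is expected. A minimal sketch follows, using a cut-down stmt type: the int payload of PrintInt, the missing ps/env/store parameters, and the buffer used in place of real printing are all simplifications made for illustration, not the question's actual definitions.

```ocaml
(* Cut-down stand-in for the question's Ast.stmt. *)
type stmt =
  | Empty
  | PrintInt of int
  | PrintStr of string
  | List of stmt list

(* Output is collected in a buffer so the result is easy to inspect. *)
let out = Buffer.create 16

let rec eval_stmt (s : stmt) : unit =
  match s with
  | Empty -> ()
  | PrintInt n -> Buffer.add_string out (string_of_int n)
  | PrintStr str -> Buffer.add_string out str
  | List stmts ->
      (* stmts has type stmt list, so recurse on each element in order.
         List.iter here is the standard library module, which does not
         clash with the List constructor. *)
      List.iter eval_stmt stmts

let () =
  eval_stmt (List [ PrintStr "x="; PrintInt 1; List [ PrintStr ";"; Empty ] ]);
  assert (Buffer.contents out = "x=1;")
```

In the original function, which threads ps and store through every call, List.fold_left could play the role that List.iter plays here. Another option is to keep the head/tail pattern but re-wrap the tail, as in eval_stmt (List stmts), and add a case for List [].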
Idiomatic F# can nicely represent the classic recursive expression data structure:
type Expression =
    | Number of int
    | Add of Expression * Expression
    | Multiply of Expression * Expression
    | Variable of string
together with recursive functions thereon:
let rec simplify_add (exp: Expression): Expression =
    match exp with
    | Add (x, Number 0) -> x
    | Add (Number 0, x) -> x
    | _ -> exp
... oops, that doesn't work as written; simplify_add needs to recur into subexpressions. In this toy example that's easy enough to do, only a couple of extra lines of code, but in a real program there would be dozens of expression types; one would prefer to avoid adding dozens of lines of boilerplate to every function that operates on expressions.
Is there any way to express 'by default, recur on subexpressions'? Something like:
let rec simplify_add (exp: Expression): Expression =
    match exp with
    | Add (x, Number 0) -> x
    | Add (Number 0, x) -> x
    | _ -> recur simplify_add exp
where recur might perhaps be some sort of higher-order function that uses reflection to look up the type definition or somesuch?
Unfortunately, F# does not give you any recursive function for processing your data type "for free". You could probably generate one using reflection - this would be valid if you have a lot of recursive types, but it might not be worth it in normal situations.
There are various patterns that you can use to hide the repetition though. One that I find particularly nice is based on the ExprShape module from standard F# libraries. The idea is to define an active pattern that gives you a view of your type as either leaf (with no nested sub-expressions) or node (with a list of sub-expressions):
type ShapeInfo = Shape of Expression
// View expression as a node or leaf. The 'Shape' just stores
// the original expression to keep its original structure
let (|Leaf|Node|) e =
    match e with
    | Number n -> Leaf(Shape e)
    | Add(e1, e2) -> Node(Shape e, [e1; e2])
    | Multiply(e1, e2) -> Node(Shape e, [e1; e2])
    | Variable s -> Leaf(Shape e)
// Reconstruct an expression from shape, using new list
// of sub-expressions in the node case.
let FromLeaf(Shape e) = e
let FromNode(Shape e, args) =
    match e, args with
    | Add(_, _), [e1; e2] -> Add(e1, e2)
    | Multiply(_, _), [e1; e2] -> Multiply(e1, e2)
    | _ -> failwith "Wrong format"
This is some boilerplate code that you'd have to write. But the nice thing is that we can now write the recursive simplifyAdd function using just your special cases and two additional patterns for leaf and node:
let rec simplifyAdd exp =
    match exp with
    // Special cases for this particular function
    | Add (x, Number 0) -> x
    | Add (Number 0, x) -> x
    // This now captures all other recursive/leaf cases
    | Node (n, exps) -> FromNode(n, List.map simplifyAdd exps)
    | Leaf _ -> exp