How can I determine the json path to a field within a record without actually hard coding the path? - reflection

I would like to work with the following type:
type RecordPath<'a,'b> = {
    Get: 'a -> 'b
    Path: string
}
Its purpose is to define a getter for going from a record type 'a to some field within 'a of type 'b. It also gives the path to that field in the JSON representation of the record.
For example, consider the following record types:
type DateWithoutTimeBecauseWeirdlyDotnetDoesNotHaveThisConcept = {
    Year: uint
    Month: uint
    Day: uint
}
type Person = {
    FullName: string
    PassportNumber: string
    BirthDate: DateWithoutTimeBecauseWeirdlyDotnetDoesNotHaveThisConcept
}
type Team = {
    TeamName: string
    TeamMembers: Person list
}
An example RecordPath might be:
let birthYearPath = {
    Get = fun (team:Team) -> team.TeamMembers |> List.map (fun p -> p.BirthDate.Year)
    Path = "$.TeamMember[*].BirthDate.Year" //using mariadb format for json path
}
Is there some way of letting a library user create this record without ever needing to specify the string explicitly? Ideally there would be some strongly typed way for the user to specify the fields involved. Maybe some kind of clever use of reflection?
It just occurred to me that with a language that supports macros, this would be possible. But can it be done in F#?
PS: I notice that I left out the s in "TeamMembers" in the path. This is the kind of thing I want to guard against to make it easier on the user.

As you noted in the comments, F# has a quotation mechanism that lets you do this. You can create quotations explicitly using the <@ ... @> notation or implicitly using a somewhat more elegant automatic quoting mechanism. The quotations are fairly close representations of the F# code, so converting them to the desired path format is not going to be easy, but I think it can be done.
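As a rough illustration of the two mechanisms, here is a minimal sketch. The Person record, the propertyName function and the Quote.Name helper are made up for this example and are not part of any library. Both forms hand the same kind of expression tree to the code that inspects it:

```fsharp
open Microsoft.FSharp.Quotations

// Hypothetical record used only for this illustration
type Person = { FullName: string }

// Reads the property name out of a quoted lambda of the form `fun p -> p.Prop`
let propertyName (e: Expr) =
    match e with
    | Patterns.Lambda(_, Patterns.PropertyGet(Some _, prop, [])) -> prop.Name
    | _ -> failwithf "Unexpected expression: %A" e

// Explicit quotation: the caller writes <@ ... @>
let explicitName = propertyName <@ fun (p: Person) -> p.FullName @>

// Automatic quotation: the caller passes an ordinary lambda and the
// compiler quotes it because of the ReflectedDefinition attribute
type Quote =
    static member Name([<ReflectedDefinition>] f: Expr<'T -> 'R>) =
        propertyName f

let implicitName = Quote.Name(fun (p: Person) -> p.FullName)
```

With the automatic form the user never sees the quotation syntax, which is probably the nicer option for a library API.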
I tried to get this to work at least for your small example. First, I needed a helper function that performs two transformations on the code. It turns:
let x = e1 in e2 into e2[x <- e1] (using the notation e2[x <- e1] to mean a substitution, i.e. expression e2 with all occurrences of x replaced by e1)
e1 |> fun x -> e2 into e2[x <- e1]
This is all I needed for your example, but it's likely you'll need a few more cases:
open Microsoft.FSharp.Quotations
let rec simplify dict e =
    let e' = simplifyOne dict e
    if e' <> e then simplify dict e' else e'
and simplifyOne dict = function
    | Patterns.Call(None, op, [e; Patterns.Lambda(v, body)])
        when op.Name = "op_PipeRight" ->
        simplify (Map.add v e dict) body
    | Patterns.Let(v, e, body) -> simplify (Map.add v e dict) body
    | ExprShape.ShapeVar(v) when Map.containsKey v dict -> dict.[v]
    | ExprShape.ShapeVar(v) -> Expr.Var(v)
    | ExprShape.ShapeLambda(v, e) -> Expr.Lambda(v, simplify dict e)
    | ExprShape.ShapeCombination(o, es) ->
        ExprShape.RebuildShapeCombination(o, List.map (simplify dict) es)
With this pre-processing, I managed to write an extractPath function like this:
let rec extractPath var = function
    | Patterns.Call(None, op, [Patterns.Lambda(v, body); inst]) when op.Name = "Map" ->
        extractPath var inst + "[*]." + extractPath v.Name body
    | Patterns.PropertyGet(Some(Patterns.Var v), p, []) when v.Name = var -> p.Name
    | Patterns.PropertyGet(Some e, p, []) -> extractPath var e + "." + p.Name
    | e -> failwithf "Unexpected expression: %A" e
This looks for (1) a call to a map function, (2) a property access on the variable that represents the data source, and (3) a property access whose instance is itself a chain of further property accesses.
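To see cases (2) and (3) in isolation, here is a stripped-down sketch that only handles chains of property accesses. The Address and Customer records and the dottedPath name are invented for the illustration; the map case and the simplify pre-processing are deliberately left out.

```fsharp
open Microsoft.FSharp.Quotations

// Hypothetical records used only for this illustration
type Address = { Zip: string }
type Customer = { Home: Address }

// Walks a chain of property accesses from the outside in
let rec dottedPath (e: Expr) =
    match e with
    // case (2): property read directly on the lambda variable
    | Patterns.PropertyGet(Some(Patterns.Var _), p, []) -> p.Name
    // case (3): property read on another property access
    | Patterns.PropertyGet(Some inner, p, []) -> dottedPath inner + "." + p.Name
    | other -> failwithf "Unexpected expression: %A" other

// Strip the outer lambda, then build the path from its body
let zipPath =
    match <@ fun (c: Customer) -> c.Home.Zip @> with
    | Patterns.Lambda(_, body) -> "$." + dottedPath body
    | _ -> failwith "Expected a lambda"
```

The outermost property (Zip) is matched first, so the path is assembled by recursing into the instance expression and appending the current property name on the way back out.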
The following now works for your small example (but probably for nothing else!):
type Path =
    static member Make([<ReflectedDefinition(true)>] f:Expr<'T -> 'R>) =
        match f with
        | Patterns.WithValue(f, _, Patterns.Lambda(v, body)) ->
            { Get = f :?> 'T -> 'R
              Path = "$." + extractPath v.Name (simplify Map.empty body) }
        | _ -> failwith "Unexpected argument"
Path.Make(fun (team:Team) -> team.TeamMembers |> List.map (fun p -> p.BirthDate.Year))

The way I solved this is:
let jsonPath userExpr =
    let rec innerLoop expr state =
        match expr with
        | Patterns.Lambda(_, body) ->
            innerLoop body state
        | Patterns.PropertyGet(Some parent, propInfo, []) ->
            sprintf ".%s%s" propInfo.Name state |> innerLoop parent
        | Patterns.Call (None, _, expr1::[Patterns.Let (v, expr2, _)]) when v.Name = "mapping" ->
            let parentPath = innerLoop expr1 "[*]"
            let childPath = innerLoop expr2 ""
            parentPath + childPath
        | ExprShape.ShapeVar x ->
            state
        | _ ->
            failwithf "Unsupported expression: %A" expr
    innerLoop userExpr "" |> sprintf "$%s"
type Path =
    static member Make([<ReflectedDefinition(true)>] f:Expr<'T -> 'R>) =
        match f with
        | Patterns.WithValue(f, _, expr) ->
            let path = jsonPath expr
            {
                Get = f :?> 'T -> 'R
                Path = path
            }
        | _ -> failwith "Unexpected argument"
Caveat: I don't know enough about these techniques to tell if Tomas' answer performs better in some edge cases than mine.

Related

How to create a cached recursive type?

open System
open System.Collections.Generic
type Node<'a>(expr:'a, symbol:int) =
    member x.Expression = expr
    member x.Symbol = symbol
    override x.GetHashCode() = symbol
    override x.Equals(y) =
        match y with
        | :? Node<'a> as y -> symbol = y.Symbol
        | _ -> failwith "Invalid equality for Node."
    interface IComparable with
        member x.CompareTo(y) =
            match y with
            | :? Node<'a> as y -> compare symbol y.Symbol
            | _ -> failwith "Invalid comparison for Node."
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Rec of Node<Ty>
    | Union of Ty list
type NodeDict<'a> = Dictionary<'a,Node<'a>>
let get_nodify_tag =
    let mutable i = 0
    fun () -> i <- i+1; i
let nodify (dict: NodeDict<_>) x =
    match dict.TryGetValue x with
    | true, x -> x
    | false, _ ->
        let x' = Node(x,get_nodify_tag())
        dict.[x] <- x'
        x'
let d = Dictionary(HashIdentity.Structural)
let nodify_ty x = nodify d x
let rec int_string_stream =
    Union
        [
            Tuple [Int; Rec (nodify_ty (int_string_stream))]
            Tuple [String; Rec (nodify_ty (int_string_stream))]
        ]
In the above example, int_string_stream gives a type error, but it neatly illustrates what I want to do. Of course, I want both sides to get tagged with the same symbol in nodify_ty. When I tried changing the Rec type to Node<Lazy<Ty>>, I found that it does not compare them correctly and each side gets a new symbol, which is useless to me.
I am working on a language, and the way I've dealt with storing recursive types up to now is by mapping Rec to an int and then substituting that with the related Ty in a dictionary whenever I need it. Currently, I am in the process of cleaning up the language, and would like to have the Rec case be Node<Ty> rather than an int.
At this point though, I am not sure what else I could try here. Could this be done somehow?
I think you will need to add some form of explicit "delay" to the discriminated union that represents your types. Without an explicit delay, you'll always end up fully evaluating the types and so there is no potential for closing the loop.
Something like this seems to work:
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Rec of Node<Ty>
    | Union of Ty list
    | Delayed of Lazy<Ty>
// (rest is as before)
let rec int_string_stream = Delayed(Lazy.Create(fun () ->
    Union
        [
            Tuple [Int; Rec (nodify_ty (int_string_stream))]
            Tuple [String; Rec (nodify_ty (int_string_stream))]
        ]))
This will mean that when you pattern match on Ty, you'll always need to check for Delayed, evaluate the lazy value and then pattern match again, but that's probably doable!
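One way to keep that extra step out of the way is a small helper that peels off Delayed before matching. The following is only a sketch: it uses a cut-down Ty without the Node/Rec case so that it is self-contained, and the names force, stream and describe are made up for the example.

```fsharp
// Cut-down version of Ty (no Rec/Node) to keep the sketch self-contained
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Union of Ty list
    | Delayed of Lazy<Ty>

// Evaluates Delayed wrappers until a concrete case is reached
let rec force ty =
    match ty with
    | Delayed l -> force l.Value
    | other -> other

// The reference to `stream` inside `lazy` is not evaluated at definition time.
// The compiler may emit warning FS0040 here because the recursive reference
// is checked at runtime.
let rec stream : Ty =
    Delayed(lazy (Union [ Tuple [Int; stream]; Tuple [String; stream] ]))

// Pattern matching goes through `force` first, so Delayed is handled in one place
let describe ty =
    match force ty with
    | Union cases -> sprintf "union of %d cases" (List.length cases)
    | Tuple items -> sprintf "tuple of %d items" (List.length items)
    | Int -> "int"
    | String -> "string"
    | Delayed _ -> "delayed" // not reached, force never returns Delayed
```

Every function that inspects a Ty would call force on its argument before matching, so the Delayed case only has to be handled in one place.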

How best to memoize based on argument only, not function closure, and inside a class?

(question edited and rewritten to reflect chat discussion results)
In one line: Given a state in a state monad, evaluate monadic function once, cache the results.
I am trying to cache the result of a function evaluation, where the key of the cache is the state of a State monad, and where I do not care about possible side effects: i.e., even if the body of the function may change in theory, I know it will be independent of the state:
let f x = state { return DateTime.Now.AddMinutes(x) }
let g x = state { return DateTime.Now.AddMinutes(x) }
Here, g 10 and f 10 should yield the same result; they may not differ as a result of two separate calls to DateTime.Now, i.e., they must be deterministic. For the sake of argument, the variable state here is x.
By the same token, (g 10) - (f 5) should yield exactly 5 minutes and not a microsecond more or less.
After finding out that caching didn't work, I toned down a more elaborate solution to its bare minimum, using Don Syme's memoization pattern with maps (or dict).
The memoization pattern:
module Cache =
    let cache f =
        let _cache = ref Map.empty
        fun x ->
            match (!_cache).TryFind(x) with
            | Some res -> res
            | None ->
                let res = f x
                _cache := (!_cache).Add(x,res)
                res
The caching is supposed to be used as part of a computation builder, in the Run method:
type someBuilder() =
    member __.Run f =
        Log.time "Calling __.Run"
        let memo_me =
            fun state ->
                let res =
                    match f with
                    | State expr -> expr state
                    | Value v -> state, v
                Log.time ("Cache miss, adding key: %A", state)
                res
        Cache.cache memo_me
This doesn't work: every call to Run builds a new closure and wraps it in a new cache, so every lookup is a cache miss. The cache should be independent of expr above and depend on state only.
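The effect is easy to reproduce outside the builder. The sketch below swaps the Map for a Dictionary purely for brevity; cache, slow and the call counter are stand-ins invented for the illustration, not the code from the builder.

```fsharp
open System.Collections.Generic

// Same shape as Cache.cache: the store lives inside the closure,
// so every call to `cache` allocates a new, empty store
let cache f =
    let store = Dictionary<_, _>(HashIdentity.Structural)
    fun x ->
        match store.TryGetValue x with
        | true, res -> res
        | false, _ ->
            let res = f x
            store.[x] <- res
            res

// Counts how often the underlying function really runs
let calls = ref 0
let slow (x: int) =
    calls.Value <- calls.Value + 1
    x * 2

// Wrapping on every use (what Run does): nothing is ever reused
let wrappedEachTime x = (cache slow) x
wrappedEachTime 10 |> ignore
wrappedEachTime 10 |> ignore
let callsWhenRewrapped = calls.Value

// Wrapping once and reusing the wrapper: the second call is a hit
calls.Value <- 0
let wrappedOnce = cache slow
wrappedOnce 10 |> ignore
wrappedOnce 10 |> ignore
let callsWhenShared = calls.Value
```

In the first variant the underlying function runs on both calls; in the second it runs only once, because both calls go through the same store.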
I tried placing the _cache outside the cache function on module level, but then it hits the problem of generalization:
Value restriction. The value '_cache' has been inferred to have generic type
Either define '_cache' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.
Which I then tried to solve using type annotations, but I ended up not being able to use it in the generic function for the same reason: it required specific type annotations then to be used:
let _cache<'T, 'U when 'T: comparison> ref : Map<'T, 'U> = ref Map.empty
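One possible way around the value restriction, offered only as a sketch, is to park the store in a static field of a generic class, which gives one store per concrete pair of key and result types. SharedCache, GetOrAdd and the run/compute functions are hypothetical names; the sketch uses a Dictionary instead of a Map and is not thread-safe.

```fsharp
open System.Collections.Generic

// One static store per instantiation of 'K and 'V, shared by all callers
type SharedCache<'K, 'V when 'K : equality>() =
    static let store = Dictionary<'K, 'V>(HashIdentity.Structural)
    static member GetOrAdd(key: 'K, compute: 'K -> 'V) : 'V =
        match store.TryGetValue key with
        | true, value -> value
        | false, _ ->
            let value = compute key
            store.[key] <- value
            value

// Counts how often any compute function really runs
let evaluations = ref 0

let computeA (state: int) =
    evaluations.Value <- evaluations.Value + 1
    sprintf "from A: %d" state

let computeB (state: int) =
    evaluations.Value <- evaluations.Value + 1
    sprintf "from B: %d" state

// Two different functions, one shared cache keyed on the state only
let runA state = SharedCache<int, string>.GetOrAdd(state, computeA)
let runB state = SharedCache<int, string>.GetOrAdd(state, computeB)

let first = runA 30
let second = runB 30
```

Because the key is the state alone, the second call returns the value computed by the first one even though a different function was passed in. That is the behaviour asked for here, but it is only safe when all functions are known to agree for a given state.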
Edit, a working version of the whole computation builder
Here's the computation builder as asked in the comments, tested in FSI. The caching should be dependent solely on TState, not on the whole of 'TState -> 'TState * 'TResult.
type State<'TState, 'TResult> = State of ('TState -> 'TState * 'TResult)
type ResultState<'TState, 'TResult> =
    | Expression of State<'TState, 'TResult>
    | Value of 'TResult
type RS<'S, 'T> = ResultState<'S, 'T>
type RS =
    static member run v s =
        match v with
        | Value item -> s, item
        | Expression (State expr) -> expr s
    static member bind k v =
        match v with
        | Expression (State expr) ->
            Expression
            <| State
                (fun initialState ->
                    let updatedState, result = expr initialState
                    RS.run (k result) updatedState
                )
        | Value item -> k item
type MyBuilder() =
    member __.Bind (e, f) = RS.bind f e
    member __.Return v = RS.Value v
    member __.ReturnFrom e = e
    member __.Run f =
        printfn "Running!"
        // add/remove the first following line to see it with caching
        Cache.cache <|
            fun s ->
                match f with
                | RS.Expression (State state) ->
                    printfn "Call me once!"
                    state s
                | RS.Value v -> s, v
module Builders =
    let builder = new MyBuilder()
    // constructing prints "Running!", this is as expected
    let create() = builder {
        let! v = RS.Expression <| (State <| fun i -> (fst i + 12.0, snd i + 3), "my value")
        return "test " + v
    }
    // for seeing the effect, recreating the builder twice,
    // it should be cached once
    let result1() = create()(30.0, 39)
    let result2() = create()(30.0, 39)
Result of running the example in FSI:
Running!
Call me once!
val it : (float * int) * string = ((42.0, 42), "test my value")
Call me once!
val it : (float * int) * string = ((42.0, 42), "test my value")
Just add the Cache into the Run:
member __.Run f =
    printfn "Running!"
    Cache.cache <|
        fun s ->
            match f with
            | RS.Expression (State state) ->
                printfn "Call me once!"
                state s
            | RS.Value v -> s, v
and modify the cache function to see if it really caches:
module Cache =
    let cache f =
        let _cache = ref Map.empty
        fun x ->
            match (!_cache).TryFind(x) with
            | Some res -> printfn "from cache"; res
            | None ->
                let res = f x
                _cache := (!_cache).Add(x,res)
                printfn "to cache"
                res
and the output is
Call me once!
to cache
val it : (float * int) * string = ((42.0, 42), "test my value")
>
from cache
val it : (float * int) * string = ((42.0, 42), "test my value")

Mutually recursive let bindings

I'm trying to implement a parser that looks something like this:
open System
type ParseResult<'a> =
    {
        Result : Option<'a>;
        Rest : string
    }
let Fail = fun input -> { Result = None; Rest = input }
let Return a = fun input -> { Result = Some a; Rest = input }
let ThenBind p f =
    fun input ->
        let r = p input
        match r.Result with
        | None -> { Result = None; Rest = input } // Recreate the result since p returns a ParseResult<'a>
        | _ -> (f r.Result) r.Rest
let Then p1 p2 = ThenBind p1 (fun r -> p2)
let Or p1 p2 =
    fun input ->
        let r = p1 input
        match r.Result with
        | None -> p2 input
        | _ -> r
let rec Chainl1Helper a p op =
    Or
    <| ThenBind op (fun f ->
        ThenBind p (fun y ->
            Chainl1Helper (f.Value a y.Value) p op))
    <| Return a
let Chainl1 p op = ThenBind p (fun x -> Chainl1Helper x.Value p op)
let rec Chainr1 p op =
    ThenBind p (fun x ->
        Or
            (ThenBind op (fun f ->
                ThenBind (Chainr1 p op) (fun y ->
                    Return (f.Value x.Value y.Value))))
            (Return x.Value))
let Next = fun input ->
    match input with
    | null -> { Result = None; Rest = input }
    | "" -> { Result = None; Rest = input }
    // input.[0] is the first character; the slice input.[0..1] would be a two-character string
    | _ -> { Result = Some (input.[0]); Rest = input.[1..] }
let Sat predicate = ThenBind Next (fun n -> if predicate n.Value then Return n.Value else Fail)
// convert the digit character to its numeric value rather than its character code
let Digit = ThenBind (Sat Char.IsDigit) (fun c -> Return <| float (int c.Value - int '0'))
let rec NatHelper i =
    Or
        (ThenBind Digit (fun x ->
            NatHelper (float 10 * i + x.Value) ))
        (Return i)
let Nat = ThenBind Digit (fun d -> NatHelper d.Value)
let LiteralChar c = Sat (fun x -> x = c)
let rec Literal input token =
    match input with
    | "" -> Return token
    | _ -> Then (LiteralChar (input.[0])) (Literal input.[1..] token)
let AddSub =
    Or
    <| ThenBind (LiteralChar '+') (fun c -> Return (+))
    <| ThenBind (LiteralChar '-') (fun c -> Return (-))
let MulDiv =
    Or
    <| ThenBind (LiteralChar '*') (fun c -> Return (*))
    <| ThenBind (LiteralChar '/') (fun c -> Return (/))
let Exp = ThenBind (LiteralChar '^') (fun c -> Return ( ** ))
let rec Expression = Chainl1 Term AddSub
and Term = Chainl1 Factor MulDiv
and Factor = Chainr1 Part Exp
and Part = Or Nat Paren
and Paren =
    Then
    <| LiteralChar '('
    <| ThenBind Expression (fun e ->
        Then (LiteralChar ')') (Return e.Value))
The last functions are mutually recursive in their definitions. Expression's definition depends on Term, which depends on Factor, which depends on Part, which depends on Paren, which depends on Expression.
When I try to compile this, I get an error about mutually recursive definitions with the suggestion to make Expression lazy or a function. I tried both of those, and I get a cryptic InvalidOperationException with both that says something about ValueFactory attempting to access the Value property.
In general, F# lets you use let rec .. and .. not just for defining mutually recursive functions, but also for defining mutually recursive values. This means that you might be able to write something like this:
let rec Expression = Chainl1 Term AddSub
and Paren =
    Then
    <| LiteralChar '('
    <| ThenBind Expression (fun e ->
        Then (LiteralChar ')') (Return e.Value))
and Part = Or Nat Paren
and Factor = Chainr1 Part Exp
and Term = Chainl1 Factor MulDiv
However, this only works if the computation is not evaluated immediately (because then the recursive definition would not make sense). This very much depends on the library you're using here (or on the rest of your code). But you can try the above and see if that works. If not, you'll need to provide more details.
EDIT In the updated example, there is an immediate loop in your recursive definition. You need to delay some part of the definition using fun _ -> ... so that not everything needs to be evaluated at once. In your example, you can do that by replacing Then with ThenBind in the definition of Paren:
let rec Expression = Chainl1 Term AddSub
and Term = Chainl1 Factor MulDiv
and Factor = Chainr1 Part Exp
and Part = Or Nat Paren
and Paren =
    ThenBind
        (LiteralChar '(')
        (fun _ -> ThenBind Expression (fun e ->
            Then (LiteralChar ')') (Return e.Value)))
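The same idea in miniature, as a self-contained sketch with a deliberately tiny parser type. The names Parser, ret, bind, orElse, charP and nested are invented here and differ from the ParseResult-based combinators above; the point is only that the recursive reference sits inside a fun _ -> ..., so it is not touched while the value is being built.

```fsharp
// Minimal parser type for the sketch: returns the value and the remaining input
type Parser<'a> = string -> ('a * string) option

let ret (x: 'a) : Parser<'a> = fun input -> Some (x, input)

let bind (p: Parser<'a>) (f: 'a -> Parser<'b>) : Parser<'b> =
    fun input ->
        match p input with
        | Some (a, rest) -> f a rest
        | None -> None

let orElse (p1: Parser<'a>) (p2: Parser<'a>) : Parser<'a> =
    fun input ->
        match p1 input with
        | Some r -> Some r
        | None -> p2 input

let charP (c: char) : Parser<char> =
    fun input ->
        if not (System.String.IsNullOrEmpty input) && input.[0] = c
        then Some (c, input.Substring 1)
        else None

// nested ::= '(' nested ')' | 'x'   and the result is the nesting depth.
// `nested` is only mentioned inside `fun _ -> ...`, so it is not needed
// while the value itself is being constructed (the compiler may still
// report warning FS0040 for the recursive value).
let rec nested : Parser<int> =
    orElse
        (bind (charP '(') (fun _ ->
            bind nested (fun depth ->
                bind (charP ')') (fun _ -> ret (depth + 1)))))
        (bind (charP 'x') (fun _ -> ret 0))
```

If bind nested ... were passed directly as an argument instead of being wrapped in a lambda, nested would be read during its own initialization, which is the situation that produces the runtime error.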

How to shorten this OCaml code?

I am just wondering how to shorten this code, as I suspect it is too redundant:
let get ename doc =
  try Some (StringMap.find ename doc) with Not_found -> None;;
let get_double ename doc =
  let element = get ename doc in
  match element with
  | None -> None
  | Some (Double v) -> Some v
  | _ -> raise Wrong_bson_type;;
let get_string ename doc =
  let element = get ename doc in
  match element with
  | None -> None
  | Some (String v) -> Some v
  | _ -> raise Wrong_bson_type;;
let get_doc ename doc =
  let element = get ename doc in
  match element with
  | None -> None
  | Some (Document v) -> Some v
  | _ -> raise Wrong_bson_type;;
So, basically, I have different types of values, and I put all those kinds of values into a map.
The code above is for getting values of the appropriate type out of the map. For each type, I have a get function. To get a value of one type, I have to check (a) whether it is there or not, and (b) whether it really is of that type; if not, I raise an exception.
But the code above seems too redundant, as you can see. The only difference between the get functions is the type itself.
How can I shorten this code?
You can do this:
let get_generic extract ename doc =
  let element = get ename doc in
  match element with
  | None -> None
  | Some v -> Some (extract v)
let get_double = get_generic (function Double v -> v | _ -> raise Wrong_bson_type)
let get_string = get_generic (function String v -> v | _ -> raise Wrong_bson_type)
let get_doc = get_generic (function Document v -> v | _ -> raise Wrong_bson_type)
EDIT:
To remove the redundant raise Wrong_bson_type (but it is ugly):
let get_generic extract ename doc =
  try
    let element = get ename doc in
    match element with
    | None -> None
    | Some v -> Some (extract v)
  with Match_failure _ -> raise Wrong_bson_type
let get_double = get_generic (fun (Double v) -> v)
let get_string = get_generic (fun (String v) -> v)
let get_doc = get_generic (fun (Document v) -> v)
You can use GADT to do that:
If you define a type expr like this:
type _ expr =
  | Document: document -> document expr
  | String: string -> string expr
  | Double: float -> float expr
You can write a function get like this:
let get : type v. v expr -> v = function
  | Document doc -> doc
  | String s -> s
  | Double d -> d
With GADTs:
type _ asked =
  | TDouble : float asked
  | TString : string asked
  | TDocument : document asked
let get : type v. v asked -> string -> doc StringMap.t -> v option =
  fun asked ename doc ->
    try
      Some (match asked, StringMap.find ename doc with
            | TDouble, Double f -> f
            | TString, String s -> s
            | TDocument, Document d -> d)
    with Not_found -> None
let get_double = get TDouble
let get_string = get TString
let get_document = get TDocument
If you can live with these extractor functions:
let extract_double = function
  | Double v -> v
  | _ -> raise Wrong_bson_type
let extract_string = function
  | String v -> v
  | _ -> raise Wrong_bson_type
let extract_doc = function
  | Document v -> v
  | _ -> raise Wrong_bson_type
Then you can use monadic style for the higher-order function, which allows you to keep your original definition of get:
let return x = Some x
let (>>=) mx f =
  match mx with
  | Some x -> f x
  | None -> None
let get_with exf ename doc =
  (get ename doc) >>= fun v ->
  return (exf v)
let get_double = get_with extract_double
let get_string = get_with extract_string
let get_doc = get_with extract_doc
Less redundant and abstracts the side effect to generic bind and return operations.

Embed a variable inside an F# quotation

I'm writing an F# dsl for SQL (http://github.com/kolosy/furious).
A select statement would look like this:
type person = {
    personId: string
    firstname: string
    lastname: string
    homeAddress: address
    workAddress: address
    altAddresses: address seq
}
and address = {
    addressId: string
    street1: string
    zip: string
}
let (neighbor: person seq) =
    db.Yield <@ Seq.filter (fun p -> p.homeAddress.zip = "60614") @>
The obvious (and silly) question is... How do I parametrize the quotation?
If I just write something like:
let z = "60614"
let (neighbor: person seq) =
    db.Yield <@ Seq.filter (fun p -> p.homeAddress.zip = z) @>
then z gets resolved into a static property accessor (PropertyGet(None, String z, [])). I need something that will let me retrieve the value of the variable/let binding based solely on the quotation. Ideas?
Quotations are not my forte, but check out the difference here:
let z = "60614"
let foo = <@ List.filter (fun s -> s = z) @>
printfn "%A" foo
let foo2 =
    let z = z
    <@ List.filter (fun s -> s = z) @>
printfn "%A" foo2
I think maybe having 'z' be local to the expression means the value is captured, rather than a property reference.
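Building on that observation, here is a hedged sketch of two ways to get the value itself into the quotation: pass it in as a function parameter (so it is local when the quotation is created), or splice it in explicitly with %% and Expr.Value. The helper names byZipCaptured, byZipSpliced and constants are made up for the illustration.

```fsharp
open Microsoft.FSharp.Quotations

// The parameter is local to the function, so its value is captured
let byZipCaptured (zip: string) : Expr<string -> bool> =
    <@ fun (s: string) -> s = zip @>

// The value is turned into a constant node and spliced in explicitly
let byZipSpliced (zip: string) : Expr<string -> bool> =
    <@ fun (s: string) -> s = %%(Expr.Value(zip)) @>

// Collects every constant embedded in a quotation
let rec constants (e: Expr) : obj list =
    match e with
    | Patterns.Value(v, _) -> [ v ]
    | ExprShape.ShapeVar _ -> []
    | ExprShape.ShapeLambda(_, body) -> constants body
    | ExprShape.ShapeCombination(_, args) -> List.collect constants args

let captured = constants (byZipCaptured "60614")
let spliced = constants (byZipSpliced "60614")
```

In both cases the translator can read the zip code straight out of the expression tree without having to resolve a property.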
In addition to what Brian wrote - I believe that the encoding of access to global let bound values is also pretty stable and they will quite likely continue to be encoded as PropGet in the future.
This means that you could support this case explicitly in your translator and add a simple pre-processing step to get values of these properties. This can be done using ExprShape (which allows you to fully traverse quotation just using 4 cases). This would allow your DSL to support the general case as well.
The following function traverses quotation and replaces access to global lets with their value:
open Microsoft.FSharp.Quotations
let rec expand e =
    match e with
    // Extract value of global 'let' bound symbols
    | Patterns.PropertyGet(None, pi, []) ->
        Expr.Value(pi.GetValue(null, [| |]), e.Type)
    // standard recursive processing of quotations
    | ExprShape.ShapeCombination(a, b) ->
        ExprShape.RebuildShapeCombination(a, b |> List.map expand)
    | ExprShape.ShapeLambda(v, b) -> Expr.Lambda(v, expand b)
    | ExprShape.ShapeVar(v) -> Expr.Var(v)
Then you can write the following to get a quotation that contains value instead of PropGet:
let z = 5
let eOrig = <@ Seq.filter (fun p -> p = z) [ 1 .. 10 ] @>
let eNice = expand eOrig

Resources