Can I suppress F# compiler making copy of function in IL code? - reflection

I want to create a JIT GPU compiler. You give an F# function, and we JIT compile it. The key of JIT compiling is to be able to cache the compiling result. I tried to use the MethodInfo as the caching key, but it won't work. It seems that F# compiler will make a copy of the function instead of referencing the origin function. Is there a way to suppress this behavior?
Here is a test code, ideally, it should be just compiled twice, but it did it 4 times.
let compileGpuCode (m:MethodInfo) =
printfn "JIT compiling..."
printfn "Type : %A" m.ReflectedType
printfn "Method: %A" m
printfn ""
"fake gpu code"
let gpuCodeCache = ConcurrentDictionary<MethodInfo, string>()
let launchGpu (func:int -> int -> int) =
let m = func.GetType().GetMethod("Invoke", [| typeof<int>; typeof<int> |])
let gpuCode = gpuCodeCache.GetOrAdd(m, compileGpuCode)
// launch gpuCode
()
let myGpuCode (a:int) (b:int) = a + 2 * b
[<Test>]
let testFSFuncReflection() =
launchGpu (+)
launchGpu (+)
launchGpu myGpuCode
launchGpu myGpuCode
Here is the output:
JIT compiling...
Type : AleaTest.FS.Lab.Experiments+testFSFuncReflection#50
Method: Int32 Invoke(Int32, Int32)
JIT compiling...
Type : AleaTest.FS.Lab.Experiments+testFSFuncReflection#51-1
Method: Int32 Invoke(Int32, Int32)
JIT compiling...
Type : AleaTest.FS.Lab.Experiments+testFSFuncReflection#52-2
Method: Int32 Invoke(Int32, Int32)
JIT compiling...
Type : AleaTest.FS.Lab.Experiments+testFSFuncReflection#53-3
Method: Int32 Invoke(Int32, Int32)

The F# compiler treats your code more as something like this:
launchGpu (fun a b -> myGpuCode a b)
launchGpu (fun a b -> myGpuCode a b)
When compiling this, it will generate a new class to represent the function on each of the lines. If you wrote your test as follows:
let f = myGpuCode
launchGpu f
launchGpu f
... it would generate just one class (for the one place where the function is referenced) and then share the same type in both of the calls - so this would work.
In this example, the compiler actually inlines myGpuCode because it is too short, but if you make it more complex, then it generates very simple Invoke function in both of the classes:
ldarg.1
ldarg.2
call int32 Test::myGpuCode(int32, int32)
ret
I'm sure there is a plenty of caveats, but you could just check if the body of the generated class contains the same IL and uses that as your key instead. Once you have the Invoke method, you can get the IL body using the following:
let m = func.GetType().GetMethod("Invoke", [| typeof<int>; typeof<int> |])
let body = m.GetMethodBody().GetILAsByteArray()
This will be the same for both of the classes - ideally, you could also analyze this to figure out if the code is just calling some other method.

Related

Railway oriented programming with Async operations

Previously asked similar question but somehow I'm not finding my way out, attempting again with another example.
The code as a starting point (a bit trimmed) is available at https://ideone.com/zkQcIU.
(it has some issue recognizing Microsoft.FSharp.Core.Result type, not sure why)
Essentially all operations have to be pipelined with the previous function feeding the result to the next one. The operations have to be async and they should return error to the caller in case an exception occurred.
The requirement is to give the caller either result or fault. All functions return a Tuple populated with either Success type Article or Failure with type Error object having descriptive code and message returned from the server.
Will appreciate a working example around my code both for the callee and the caller in an answer.
Callee Code
type Article = {
name: string
}
type Error = {
code: string
message: string
}
let create (article: Article) : Result<Article, Error> =
let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
request.Method <- "GET"
try
use response = request.GetResponse() :?> HttpWebResponse
use reader = new StreamReader(response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
Ok ((new DataContractJsonSerializer(typeof<Article>)).ReadObject(memoryStream) :?> Article)
with
| :? WebException as e ->
use reader = new StreamReader(e.Response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
Error ((new DataContractJsonSerializer(typeof<Error>)).ReadObject(memoryStream) :?> Error)
Rest of the chained methods - Same signature and similar bodies. You can actually reuse the body of create for update, upload, and publish to be able to test and compile code.
let update (article: Article) : Result<Article, Error>
// body (same as create, method <- PUT)
let upload (article: Article) : Result<Article, Error>
// body (same as create, method <- PUT)
let publish (article: Article) : Result<Article, Error>
// body (same as create, method < POST)
Caller Code
let chain = create >> Result.bind update >> Result.bind upload >> Result.bind publish
match chain(schemaObject) with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Edit
Based on the answer and matching it with Scott's implementation (https://i.stack.imgur.com/bIxpD.png), to help in comparison and in better understanding.
let bind2 (switchFunction : 'a -> Async<Result<'b, 'c>>) =
fun (asyncTwoTrackInput : Async<Result<'a, 'c>>) -> async {
let! twoTrackInput = asyncTwoTrackInput
match twoTrackInput with
| Ok s -> return! switchFunction s
| Error err -> return Error err
}
Edit 2 Based on F# implementation of bind
let bind3 (binder : 'a -> Async<Result<'b, 'c>>) (asyncResult : Async<Result<'a, 'c>>) = async {
let! result = asyncResult
match result with
| Error e -> return Error e
| Ok x -> return! binder x
}
Take a look at the Suave source code, and specifically the WebPart.bind function. In Suave, a WebPart is a function that takes a context (a "context" is the current request and the response so far) and returns a result of type Async<context option>. The semantics of chaining these together are that if the async returns None, the next step is skipped; if it returns Some value, the next step is called with value as the input. This is pretty much the same semantics as the Result type, so you could almost copy the Suave code and adjust it for Result instead of Option. E.g., something like this:
module AsyncResult
let bind (f : 'a -> Async<Result<'b, 'c>>) (a : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> = async {
let! r = a
match r with
| Ok value ->
let next : Async<Result<'b, 'c>> = f value
return! next
| Error err -> return (Error err)
}
let compose (f : 'a -> Async<Result<'b, 'e>>) (g : 'b -> Async<Result<'c, 'e>>) : 'a -> Async<Result<'c, 'e>> =
fun x -> bind g (f x)
let (>>=) a f = bind f a
let (>=>) f g = compose f g
Now you can write your chain as follows:
let chain = create >=> update >=> upload >=> publish
let result = chain(schemaObject) |> Async.RunSynchronously
match result with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Caution: I haven't been able to verify this code by running it in F# Interactive, since I don't have any examples of your create/update/etc. functions. It should work, in principle — the types all fit together like Lego building blocks, which is how you can tell that F# code is probably correct — but if I've made a typo that the compiler would have caught, I don't yet know about it. Let me know if that works for you.
Update: In a comment, you asked whether you need to have both the >>= and >=> operators defined, and mentioned that you didn't see them used in the chain code. I defined both because they serve different purposes, just like the |> and >> operators serve different purposes. >>= is like |>: it passes a value into a function. While >=> is like >>: it takes two functions and combines them. If you would write the following in a non-AsyncResult context:
let chain = step1 >> step2 >> step3
Then that translates to:
let asyncResultChain = step1AR >=> step2AR >=> step3AR
Where I'm using the "AR" suffix to indicate versions of those functions that return an Async<Result<whatever>> type. On the other hand, if you had written that in a pass-the-data-through-the-pipeline style:
let result = input |> step1 |> step2 |> step3
Then that would translate to:
let asyncResult = input >>= step1AR >>= step2AR >>= step3AR
So that's why you need both the bind and compose functions, and the operators that correspond to them: so that you can have the equivalent of either the |> or the >> operators for your AsyncResult values.
BTW, the operator "names" that I picked (>>= and >=>), I did not pick randomly. These are the standard operators that are used all over the place for the "bind" and "compose" operations on values like Async, or Result, or AsyncResult. So if you're defining your own, stick with the "standard" operator names and other people reading your code won't be confused.
Update 2: Here's how to read those type signatures:
'a -> Async<Result<'b, 'c>>
This is a function that takes type A, and returns an Async wrapped around a Result. The Result has type B as its success case, and type C as its failure case.
Async<Result<'a, 'c>>
This is a value, not a function. It's an Async wrapped around a Result where type A is the success case, and type C is the failure case.
So the bind function takes two parameters:
a function from A to an async of (either B or C)).
a value that's an async of (either A or C)).
And it returns:
a value that's an async of (either B or C).
Looking at those type signatures, you can already start to get an idea of what the bind function will do. It will take that value that's either A or C, and "unwrap" it. If it's C, it will produce an "either B or C" value that's C (and the function won't need to be called). If it's A, then in order to convert it to an "either B or C" value, it will call the f function (which takes an A).
All this happens within an async context, which adds an extra layer of complexity to the types. It might be easier to grasp all this if you look at the basic version of Result.bind, with no async involved:
let bind (f : 'a -> Result<'b, 'c>) (a : Result<'a, 'c>) =
match a with
| Ok val -> f val
| Error err -> Error err
In this snippet, the type of val is 'a, and the type of err is 'c.
Final update: There was one comment from the chat session that I thought was worth preserving in the answer (since people almost never follow chat links). Developer11 asked,
... if I were to ask you what Result.bind in my example code maps to your approach, can we rewrite it as create >> AsyncResult.bind update? It worked though. Just wondering i liked the short form and as you said they have a standard meaning? (in haskell community?)
My reply was:
Yes. If the >=> operator is properly written, then f >=> g will always be equivalent to f >> bind g. In fact, that's precisely the definition of the compose function, though that might not be immediately obvious to you because compose is written as fun x -> bind g (f x) rather than as f >> bind g. But those two ways of writing the compose function would be exactly equivalent. It would probably be very instructive for you to sit down with a piece of paper and draw out the function "shapes" (inputs & outputs) of both ways of writing compose.
Why do you want to use Railway Oriented Programming here? If you just want to run a sequence of operations and return information about the first exception that occurs, then F# already provides a language support for this using exceptions. You do not need Railway Oriented Programming for this. Just define your Error as an exception:
exception Error of code:string * message:string
Modify the code to throw the exception (also note that your create function takes article but does not use it, so I deleted that):
let create () = async {
let ds = new DataContractJsonSerializer(typeof<Error>)
let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
request.Method <- "GET"
try
use response = request.GetResponse() :?> HttpWebResponse
use reader = new StreamReader(response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
return ds.ReadObject(memoryStream) :?> Article
with
| :? WebException as e ->
use reader = new StreamReader(e.Response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
return raise (Error (ds.ReadObject(memoryStream) :?> Error)) }
And then you can compose functions just by sequencing them in async block using let! and add exception handling:
let main () = async {
try
let! created = create ()
let! updated = update created
let! uploaded = upload updated
Debug.WriteLine(uploaded.name)
with Error(code, message) ->
Debug.WriteLine(code + ":" + message) }
If you wanted more sophisticated exception handling, then Railway Oriented Programming might be useful and there is certainly a way of integrating it with async, but if you just want to do what you described in your question, then you can do that much more easily with just standard F#.

How to get the name of a higher order function in F#? [duplicate]

How can I create a function called getFuncName that takes a function of type (unit -> 'a) and returns its name.
I was talking to one of the C# devs and they said you could use the .Method property on a Func type as shown in an example here.
I tried to convert this to F# :
for example convert (unit -> 'a) to a type Func<_> then call the property on it but it always returns the string "Invoke".
let getFuncName f =
let fFunc = System.Func<_>(fun _ -> f())
fFunc.Method.Name
let customFunc() = 1.0
// Returns "Invoke" but I want it to return "customFunc"
getFuncName customFunc
A bit of background to this problem is:
I have created an array of functions of type (unit -> Deedle.Frame). I now want to cycle through those functions invoking them and saving them to csv with the csv name having the same name as the function. Some hypothetical code is below:
let generators : (unit -> Frame<int, string>) array = ...
generators
|> Array.iter (fun generator -> generator().SaveCsv(sprintf "%s\%s.csv" __SOURCE_DIRECTORY__ (getFuncName generator)))
This is being used in a scripting sense rather than as application code.
Not sure how you searched for information, but the first query to the search engine gave me this response:
let getFuncName f =
let type' = f.GetType()
let method' = type'.GetMethods() |> Array.find (fun m -> m.Name="Invoke")
let il = method'.GetMethodBody().GetILAsByteArray()
let methodCodes = [byte OpCodes.Call.Value;byte OpCodes.Callvirt.Value]
let position = il |> Array.findIndex(fun x -> methodCodes |> List.exists ((=)x))
let metadataToken = BitConverter.ToInt32(il, position+1)
let actualMethod = type'.Module.ResolveMethod metadataToken
actualMethod.Name
Unfortunately, this code only works when F# compiler does not inline function body into calling method.
Taken from here
Although there may be a more simple way.

How can I make a retry function tail recursive?

I have a discriminated union that is similar to the Result type used in Scott's Railway Oriented Programming. For simplicity's sake, it's slightly simplified here:
type ErrorMessage = ErrorMessage of string
type ValidationResult<'a> =
| Success of 'a
| Error of ErrorMessage
I have a corresponding module ValidationResult that contains functions that act on these ValidationResults, one of them is a recursive retryable function that allows the parameter, f: unit -> 'a, to be called again (such as reading from stdin) if the ValidationResult is Error:
module ValidationResult
let doubleMap success error = function
| Success x -> success x
| Error e -> error e
let rec retryable errorHandler f =
let result = f ()
let retry e =
errorHandler e
retryable errorHandler f
doubleMap id retry result
But it isn't tail recursive and I would like to convert it to be so. How can I do that?
The F# compiler compiles tail-recursive functions in two different ways.
If the function is simple (calls itself directly), then it is compiled into a loop
If the tail-recursion involves multiple different functions (or even function values), then the compiler uses the .tail IL instruction to do a tail-call. This is also a tail-call, but handled by the .NET runtime rather than eliminated by the F# compiler.
In your case, the retryable function is already tail-recursive, but it is the second kind. Daniel's answer makes it simple enough so that it becomes the first kind.
However, you can keep the function as you have it and it will be tail-recursive. The only thing to note is that the compiler does not generate the .tail instruction by default in Debug mode (as it messes up the call stack) and so you need to enable it explicitly (in project options, check "Generate tail calls").
Just removing the call to doubleMap should do it:
let rec retryable errorHandler f =
match f() with
| Success x -> x
| Error e ->
errorHandler e
retryable errorHandler f

Type extension errors for Dictionary<'K, 'V>

The following type extension
module Dict =
open System.Collections.Generic
type Dictionary<'K, 'V> with
member this.Difference(that:Dictionary<'K, 'T>) =
let dict = Dictionary()
for KeyValue(k, v) in this do
if not (that.ContainsKey(k)) then
dict.Add(k, v)
dict
gives the error:
The signature and implementation are not compatible because the declaration of the type parameter 'TKey' requires a constraint of the form 'TKey : equality
But when I add the constraint it gives the error:
The declared type parameters for this type extension do not match the declared type parameters on the original type 'Dictionary<,>'
This is especially mysterious because the following type extension doesn't have the constraint and works.
type Dictionary<'K, 'V> with
member this.TryGet(key) =
match this.TryGetValue(key) with
| true, v -> Some v
| _ -> None
Now I'm having weird thoughts: is the constraint required only when certain members are accessed?
module Dict =
open System.Collections.Generic
type Dictionary<'K, 'V> with
member this.Difference(that:Dictionary<'K, 'T>) =
let dict = Dictionary(this.Comparer)
for KeyValue(k, v) in this do
if not (that.ContainsKey(k)) then
dict.Add(k, v)
dict
EDIT:
As per F# spec (14.11 Additional Constraints on CLI Methods)
Some specific CLI methods and types are treated specially by F#, because they are common in F# programming and cause extremely difficult-to-find bugs. For each use of the following constructs, the F# compiler imposes additional ad hoc constraints:
x.Equals(yobj) requires type ty : equality for the static type of x
x.GetHashCode() requires type ty : equality for the static type of x
new Dictionary<A,B>() requires A : equality, for any overload that does not take an IEqualityComparer<T>
as far as I can see the following code does the trick:
module Dict =
open System.Collections.Generic
type Dictionary<'K, 'V> with
member this.Difference(that: Dictionary<'K,'V2>) =
let diff =
this
|> Seq.filter (fun x -> not <| that.ContainsKey(x.Key))
|> Seq.map (fun x -> x.Key, x.Value)
System.Linq.Enumerable.ToDictionary(diff, fst, snd)
The problem is your use of the Add method. If you use this method of Dictionary<TKey, TValue> then F# will enforce that TKey has the equality constraint.
After playing around a bit I'm not sure that it's even possible to write this extension method. The F# type system appears to force the declaration type of the extension method to have no additional constraints than the original type (i get an error whenever I add the equality constraint). Additionally the type listed in the individal extension methods cannot differ than the listed type. I've tried a number of ways and can't get this to function correctly.
The closest I've come is the non-extension method as follows
let Difference (this : Dictionary<'K, 'T>) (that:Dictionary<'K, 'T> when 'K : equality) =
let dict = Dictionary()
for KeyValue(k, v) in this do
if not (that.ContainsKey(k)) then
dict.Add(k, v)
dict
Perhaps another F# ninja will be able to prove me wrong
(EDIT: CKoenig has a nice answer.)
Hm, I didn't immediately see a way to do this either.
Here's a non-type-safe solution that might provide some crazy inspiration for others.
open System.Collections.Generic
module Dict =
type Dictionary<'K, 'V> with
member this.Difference<'K2, 'T when 'K2 : equality>(that:Dictionary<'K2, 'T>) =
let dict = Dictionary<'K2,'V>()
for KeyValue(k, v) in this do
if not (that.ContainsKey(k |> box |> unbox)) then
dict.Add(k |> box |> unbox, v)
dict
open Dict
let d1 = Dictionary()
d1.Add(1, "foo")
d1.Add(2, "bar")
let d2 = Dictionary()
d2.Add(1, "cheese")
let show (d:Dictionary<_,_>) =
for (KeyValue(k,v)) in d do
printfn "%A: %A" k v
d1.Difference(d2) |> show
let d3 = Dictionary()
d3.Add(1, 42)
d1.Difference(d3) |> show
let d4 = Dictionary()
d4.Add("uh-oh", 42)
d1.Difference(d4) |> show // blows up at runtime
Overall it seems like there may be no way to unify the types K and K2 without also forcing them to have the same equality constraint though...
(EDIT: seems like calling into .NET which is equality-constraint-agnostic is a good way to create a dictionary in the absence of the extra constraint.)

Collecting the output of an external command using OCaml

What is the right way to call an external command and collect its output in OCaml?
In Python, I can do something like this:
os.popen('cmd').read()
How I can get all of an external program's output in OCaml? Or, better, OCaml with Lwt?
Thanks.
You want Unix.open_process_in, which is described on page 388 of the OCaml system manual, version 3.10.
For Lwt,
val pread : ?env:string array -> command -> string Lwt.t
seems to be a good contender. Documentation here: http://ocsigen.org/docu/1.3.0/Lwt_process.html
let process_output_to_list2 = fun command ->
let chan = Unix.open_process_in command in
let res = ref ([] : string list) in
let rec process_otl_aux () =
let e = input_line chan in
res := e::!res;
process_otl_aux() in
try process_otl_aux ()
with End_of_file ->
let stat = Unix.close_process_in chan in (List.rev !res,stat)
let cmd_to_list command =
let (l,_) = process_output_to_list2 command in l
There are lots of examples on PLEAC.
You can use the third party library Rashell which uses Lwt to define some high-level primitives to read output from processes. These primitives, defined in the module Rashell_Command, are:
exec_utility to read the output of a process as a string;
exec_test to only read the exit status of a process;
exec_query to read the output of a process line by line as a string Lwt_stream.t
exec_filter to use an external program as a string Lwt_stream.t -> string Lwt_stream.t transformation.
The command function is used to create command contexts on which the previous primitives can be applied, it has the signature:
val command : ?workdir:string -> ?env:string array -> string * (string array) -> t
(** [command (program, argv)] prepare a command description with the
given [program] and argument vector [argv]. *)
So for instance
Rashell_Command.(exec_utility ~chomp:true (command("", [| "uname" |])))
is a string Lwt.t which returns the “chomped” string (new line removed) of the “uname” command. As a second example
Rashell_Command.(exec_query (command("", [| "find"; "/home/user"; "-type"; "f"; "-name"; "*.orig" |])))
is a string Lwt_stream.t whose elements are the paths of the file found by the command
find /home/user -type f -name '*.orig'
The Rashell library defines also interfaces to some commonly used commands, and a nice interface to the find command is defined in Rashell_Posix – which by the way guarantees POSIX portability.

Resources