F#: Downloading data asynchronously

I am new to programming and F# is my first language.
Here are the relevant parts of my code:
open System.IO
open System.Net

let downloadHtmlFromUrlAsync (url: string) =
    async {
        let uri = new System.Uri(url)
        let webClient = new WebClient()
        let! html = webClient.AsyncDownloadString(uri)
        return html
    }

let downloadHtmlToDisk (url: string) (directoryPath: string) =
    if isValidUrl url then
        let name = getNameFromRedirectedUrl url
        let id = getIdFromUrl url
        let html = downloadHtmlFromUrlAsync url
        let newTextFile = File.Create(directoryPath + "\\" + id.ToString("00000") + " " + name.TrimEnd([|' '|]) + ".html")
        use file = new StreamWriter(newTextFile)
        file.Write(html)
        file.Close()

let downloadEntireDatabase (baseUrl: string) (totalNumberOfPeople: int) =
    let allIds = [ for i in 1 .. totalNumberOfPeople -> i ]
    allIds
    |> Seq.map (fun id -> baseUrl + string(id))
    |> Seq.filter isValidUrl
    |> Seq.map downloadHtmlToDisk
    |> Async.Parallel
    |> Async.RunSynchronously
I have tested the functions isValidUrl, getNameFromRedirectedUrl, getIdFromUrl in F# interactive. They work fine.
My problem is this: When I try to run the code pasted above, the following error message is produced:
Program.fs(483,8): error FS0193: Type constraint mismatch. The type
seq<(string -> unit)> is not compatible with type
seq<Async<'a>> The type Async<'a> does not match the type string -> unit
What went wrong? What changes should I make?

The problem is probably this line (can you please give us the definition of downloadFighterHtmlToDisk):
allIds
...
|> Seq.map downloadFighterHtmlToDisk
...
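Based on the error message, this function seems to have the signature string -> string -> unit, so mapping it over the URLs only partially applies it and yields a seq<string -> unit>. Async.Parallel needs a seq<Async<'a>>, so what you really need is a function of type string -> Async<'something>.
I guess you used downloadHtmlToDisk or something similar. You can keep using it, but then I would suggest rewriting it so that the directory comes first and the body is an async workflow: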
let downloadHtmlToDisk (directoryPath: string) (url: string) =
    async {
        if isValidUrl url then
            let name = getNameFromRedirectedUrl url
            let id = getIdFromUrl url
            let! html = downloadHtmlFromUrlAsync url
            let newTextFile = File.Create(directoryPath + "\\" + id.ToString("00000") + " " + name.TrimEnd([|' '|]) + ".html")
            use file = new StreamWriter(newTextFile)
            file.Write(html)
    }
and use it like
let downloadEntireDatabase (baseUrl: string) (totalNumberOfPeople: int) =
    let allIds = [ for i in 1 .. totalNumberOfPeople -> i ]
    allIds
    |> Seq.map (fun id -> (id, baseUrl + string(id)))
    |> Seq.filter (fun (_,url) -> isValidUrl url)
    |> Seq.map (fun (id,url) -> downloadHtmlToDisk (getFighterPath id) url)
    |> Async.Parallel
    |> Async.RunSynchronously
Note the let! html = ... line. This is important: it is where the asynchronous download is awaited. If you want, you can find similar operations to write your file asynchronously. Also, you don't need to close the file yourself; the use binding disposes it at the end of the scope.
Remark
I have just seen that you re-extract the id from the URL. You could use that instead of passing tuples as I did, but I think it is better to pass the id along if you still need it. For example, downloadHtmlToDisk really needs the id and could have built the URL from the id there instead, which is a much easier approach in my opinion. I don't want to rewrite everything you have, so just experiment a bit with this.
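The suggestion to write the file asynchronously could look roughly like the sketch below. writeTextAsync is a hypothetical helper name that does not appear in the original code. The sketch assumes a .NET version where StreamWriter.WriteAsync exists and an FSharp.Core where Async.AwaitTask accepts a non-generic Task.

```fsharp
open System.IO

// Hypothetical helper (not from the original post): writes text to a file
// from inside an async workflow instead of calling the blocking Write.
let writeTextAsync (path: string) (text: string) : Async<unit> =
    async {
        // 'use' disposes (and flushes) the writer when the workflow leaves this scope.
        use writer = new StreamWriter(path)
        do! writer.WriteAsync(text) |> Async.AwaitTask
    }
```

Inside downloadHtmlToDisk this would be called as do! writeTextAsync path html, in place of the StreamWriter lines.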

Related

Function with type 'T -> Async<'T> like C#'s Task.FromResult

I'm playing around asynchronous programming and was wondering if there's a function that exists that can take a value of type 'T and transform it to an Async<'T>, similar to C#'s Task.FromResult that can take a value of type TResult and transform it to a Task<TResult> that can then be awaited.
If such a function does not exist in F#, is it possible to create it? I can kind of emulate this by using Async.AwaitTask and Task.FromResult, but can I do this by only using Async?
Essentially, I'd like to be able to do something like this:
let asyncValue = toAsync 3 // toAsync: 'T -> Async<'T>
let foo = async {
    let! value = asyncValue
    return value
}
...or just async.Return
let toAsync = async.Return
let toAsync' x = async.Return x
Moreover, there is async.Bind (in tupled form):
let asyncBind
    (asyncValue: Async<'a>)
    (asyncFun: 'a -> Async<'b>) : Async<'b> =
    async.Bind(asyncValue, asyncFun)
You could use them to build fairly complicated async computations without the builder syntax, as in the linked gist:
let inline (>>-) x f = async.Bind(x, f >> async.Return)

let requestMasterAsync limit urls =
    let results = Array.zeroCreate (List.length urls)
    let chunks =
        urls
        |> Seq.chunkBySize limit
        |> Seq.indexed
    async.For (chunks, fun (i, chunk) ->
        chunk
        |> Seq.map asyncMockup
        |> Async.Parallel
        >>- Seq.iteri (fun j r -> results.[i*limit+j]<-r))
    >>- fun _ -> results
You can use return within your async expression:
let toAsync x = async { return x }
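As a small check of how this behaves, the sketch below wraps a plain value and then awaits it with let!, which is the usage the question asks about. The names foo and result are only illustrative.

```fsharp
// Wrap an already-computed value in an Async, similar in spirit to Task.FromResult.
let toAsync (x: 'T) : Async<'T> = async.Return x

let foo =
    async {
        let! value = toAsync 3   // completes immediately with 3
        return value + 1
    }

let result = Async.RunSynchronously foo   // 4
```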

Railway oriented programming with Async operations

Previously asked similar question but somehow I'm not finding my way out, attempting again with another example.
The code as a starting point (a bit trimmed) is available at https://ideone.com/zkQcIU.
(it has some issue recognizing Microsoft.FSharp.Core.Result type, not sure why)
Essentially all operations have to be pipelined with the previous function feeding the result to the next one. The operations have to be async and they should return error to the caller in case an exception occurred.
The requirement is to give the caller either result or fault. All functions return a Tuple populated with either Success type Article or Failure with type Error object having descriptive code and message returned from the server.
Will appreciate a working example around my code both for the callee and the caller in an answer.
Callee Code
open System.IO
open System.Net
open System.Text
open System.Runtime.Serialization.Json

type Article = {
    name: string
}

type Error = {
    code: string
    message: string
}

let create (article: Article) : Result<Article, Error> =
    let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
    request.Method <- "GET"
    try
        use response = request.GetResponse() :?> HttpWebResponse
        use reader = new StreamReader(response.GetResponseStream())
        use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
        Ok ((new DataContractJsonSerializer(typeof<Article>)).ReadObject(memoryStream) :?> Article)
    with
    | :? WebException as e ->
        use reader = new StreamReader(e.Response.GetResponseStream())
        use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
        Error ((new DataContractJsonSerializer(typeof<Error>)).ReadObject(memoryStream) :?> Error)
Rest of the chained methods - Same signature and similar bodies. You can actually reuse the body of create for update, upload, and publish to be able to test and compile code.
let update (article: Article) : Result<Article, Error>
    // body (same as create, method <- PUT)
let upload (article: Article) : Result<Article, Error>
    // body (same as create, method <- PUT)
let publish (article: Article) : Result<Article, Error>
    // body (same as create, method <- POST)
Caller Code
let chain = create >> Result.bind update >> Result.bind upload >> Result.bind publish
match chain(schemaObject) with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Edit
Based on the answer and matching it with Scott's implementation (https://i.stack.imgur.com/bIxpD.png), to help in comparison and in better understanding.
let bind2 (switchFunction : 'a -> Async<Result<'b, 'c>>) =
    fun (asyncTwoTrackInput : Async<Result<'a, 'c>>) -> async {
        let! twoTrackInput = asyncTwoTrackInput
        match twoTrackInput with
        | Ok s -> return! switchFunction s
        | Error err -> return Error err
    }
Edit 2 Based on F# implementation of bind
let bind3 (binder : 'a -> Async<Result<'b, 'c>>) (asyncResult : Async<Result<'a, 'c>>) = async {
    let! result = asyncResult
    match result with
    | Error e -> return Error e
    | Ok x -> return! binder x
}
Take a look at the Suave source code, and specifically the WebPart.bind function. In Suave, a WebPart is a function that takes a context (a "context" is the current request and the response so far) and returns a result of type Async<context option>. The semantics of chaining these together are that if the async returns None, the next step is skipped; if it returns Some value, the next step is called with value as the input. This is pretty much the same semantics as the Result type, so you could almost copy the Suave code and adjust it for Result instead of Option. E.g., something like this:
module AsyncResult

let bind (f : 'a -> Async<Result<'b, 'c>>) (a : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> = async {
    let! r = a
    match r with
    | Ok value ->
        let next : Async<Result<'b, 'c>> = f value
        return! next
    | Error err -> return (Error err)
}

let compose (f : 'a -> Async<Result<'b, 'e>>) (g : 'b -> Async<Result<'c, 'e>>) : 'a -> Async<Result<'c, 'e>> =
    fun x -> bind g (f x)

let (>>=) a f = bind f a
let (>=>) f g = compose f g
Now you can write your chain as follows:
let chain = create >=> update >=> upload >=> publish
let result = chain(schemaObject) |> Async.RunSynchronously
match result with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Caution: I haven't been able to verify this code by running it in F# Interactive, since I don't have any examples of your create/update/etc. functions. It should work, in principle — the types all fit together like Lego building blocks, which is how you can tell that F# code is probably correct — but if I've made a typo that the compiler would have caught, I don't yet know about it. Let me know if that works for you.
Update: In a comment, you asked whether you need to have both the >>= and >=> operators defined, and mentioned that you didn't see them used in the chain code. I defined both because they serve different purposes, just like the |> and >> operators serve different purposes. >>= is like |>: it passes a value into a function. While >=> is like >>: it takes two functions and combines them. If you would write the following in a non-AsyncResult context:
let chain = step1 >> step2 >> step3
Then that translates to:
let asyncResultChain = step1AR >=> step2AR >=> step3AR
Where I'm using the "AR" suffix to indicate versions of those functions that return an Async<Result<whatever>> type. On the other hand, if you had written that in a pass-the-data-through-the-pipeline style:
let result = input |> step1 |> step2 |> step3
Then that would translate to:
let asyncResult = input >>= step1AR >>= step2AR >>= step3AR
So that's why you need both the bind and compose functions, and the operators that correspond to them: so that you can have the equivalent of either the |> or the >> operators for your AsyncResult values.
BTW, the operator "names" that I picked (>>= and >=>), I did not pick randomly. These are the standard operators that are used all over the place for the "bind" and "compose" operations on values like Async, or Result, or AsyncResult. So if you're defining your own, stick with the "standard" operator names and other people reading your code won't be confused.
Update 2: Here's how to read those type signatures:
'a -> Async<Result<'b, 'c>>
This is a function that takes type A, and returns an Async wrapped around a Result. The Result has type B as its success case, and type C as its failure case.
Async<Result<'a, 'c>>
This is a value, not a function. It's an Async wrapped around a Result where type A is the success case, and type C is the failure case.
So the bind function takes two parameters:
a function from A to an async of (either B or C).
a value that's an async of (either A or C).
And it returns:
a value that's an async of (either B or C).
Looking at those type signatures, you can already start to get an idea of what the bind function will do. It will take that value that's either A or C, and "unwrap" it. If it's C, it will produce an "either B or C" value that's C (and the function won't need to be called). If it's A, then in order to convert it to an "either B or C" value, it will call the f function (which takes an A).
All this happens within an async context, which adds an extra layer of complexity to the types. It might be easier to grasp all this if you look at the basic version of Result.bind, with no async involved:
let bind (f : 'a -> Result<'b, 'c>) (a : Result<'a, 'c>) =
    match a with
    | Ok value -> f value
    | Error err -> Error err
In this snippet, the type of value is 'a, and the type of err is 'c. (The binding is named value rather than val because val is a reserved keyword in F#.)
Final update: There was one comment from the chat session that I thought was worth preserving in the answer (since people almost never follow chat links). Developer11 asked,
... if I were to ask you what Result.bind in my example code maps to your approach, can we rewrite it as create >> AsyncResult.bind update? It worked though. Just wondering i liked the short form and as you said they have a standard meaning? (in haskell community?)
My reply was:
Yes. If the >=> operator is properly written, then f >=> g will always be equivalent to f >> bind g. In fact, that's precisely the definition of the compose function, though that might not be immediately obvious to you because compose is written as fun x -> bind g (f x) rather than as f >> bind g. But those two ways of writing the compose function would be exactly equivalent. It would probably be very instructive for you to sit down with a piece of paper and draw out the function "shapes" (inputs & outputs) of both ways of writing compose.
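The claims about the two operators can be tried out with toy steps, as in the sketch below. parse and halve are hypothetical stand-ins for functions like create and update. They are not from the question; they only share the 'a -> Async<Result<'b, 'e>> shape. bind is repeated so that the snippet is self-contained.

```fsharp
// Same shape as AsyncResult.bind, repeated here so the sketch stands alone.
let bind (f: 'a -> Async<Result<'b, 'e>>) (a: Async<Result<'a, 'e>>) : Async<Result<'b, 'e>> =
    async {
        let! r = a
        match r with
        | Ok value -> return! f value
        | Error err -> return Error err
    }

let (>>=) a f = bind f a
let (>=>) f g = fun x -> bind g (f x)

// Hypothetical steps, for illustration only.
let parse (s: string) : Async<Result<int, string>> =
    async {
        match System.Int32.TryParse s with
        | true, n -> return Ok n
        | false, _ -> return Error ("not a number: " + s)
    }

let halve (n: int) : Async<Result<int, string>> =
    async {
        if n % 2 = 0 then return Ok (n / 2)
        else return Error "odd"
    }

// Function composition: the analogue of >>.
let viaCompose = parse >=> halve
// The equivalent spelling discussed in the chat comment.
let viaBind = parse >> bind halve
// Value pipelining: the analogue of |>, starting from an Async<Result<_, _>> value.
let viaPipeline (input: string) =
    let start : Async<Result<string, string>> = async { return Ok input }
    start >>= parse >>= halve

let run f x = f x |> Async.RunSynchronously

let a = run viaCompose "10"    // Ok 5
let b = run viaBind "10"       // Ok 5
let c = run viaPipeline "10"   // Ok 5
let d = run viaCompose "7"     // Error "odd"
```

All three spellings short-circuit in the same way: once a step returns Error, the later steps are never called.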
Why do you want to use Railway Oriented Programming here? If you just want to run a sequence of operations and return information about the first exception that occurs, then F# already provides language support for this in the form of exceptions. You do not need Railway Oriented Programming for this. Just define your Error as an exception:
exception Error of code:string * message:string
Modify the code to throw the exception (also note that your create function takes article but does not use it, so I deleted that):
let create () = async {
    let ds = new DataContractJsonSerializer(typeof<Error>)
    let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
    request.Method <- "GET"
    try
        use response = request.GetResponse() :?> HttpWebResponse
        use reader = new StreamReader(response.GetResponseStream())
        use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
        return ds.ReadObject(memoryStream) :?> Article
    with
    | :? WebException as e ->
        use reader = new StreamReader(e.Response.GetResponseStream())
        use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
        return raise (Error (ds.ReadObject(memoryStream) :?> Error)) }
And then you can compose functions just by sequencing them in async block using let! and add exception handling:
let main () = async {
    try
        let! created = create ()
        let! updated = update created
        let! uploaded = upload updated
        Debug.WriteLine(uploaded.name)
    with Error(code, message) ->
        Debug.WriteLine(code + ":" + message) }
If you wanted more sophisticated exception handling, then Railway Oriented Programming might be useful and there is certainly a way of integrating it with async, but if you just want to do what you described in your question, then you can do that much more easily with just standard F#.
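The exception-based approach can be tried without any web requests. In the sketch below, StepError, step1, step2 and pipeline are made-up names that stand in for the Error exception and the create/update/upload functions. Only the control flow is meant to match.

```fsharp
// Hypothetical exception carrying a code and a message, like Error above.
exception StepError of code: string * message: string

// Hypothetical steps standing in for create/update/upload.
let step1 (x: int) = async { return x + 1 }

let step2 (x: int) =
    async {
        if x > 5 then raise (StepError ("E1", "too large"))
        return x * 2
    }

let pipeline (x: int) =
    async {
        try
            let! a = step1 x
            let! b = step2 a        // later steps are skipped if this raises
            return sprintf "ok:%d" b
        with StepError (code, message) ->
            return code + ":" + message
    }

let good = pipeline 1 |> Async.RunSynchronously    // "ok:4"
let bad = pipeline 10 |> Async.RunSynchronously    // "E1:too large"
```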

How to get the name of a higher order function in F#? [duplicate]

How can I create a function called getFuncName that takes a function of type (unit -> 'a) and returns its name.
I was talking to one of the C# devs and they said you could use the .Method property on a Func type as shown in an example here.
I tried to convert this to F# :
for example convert (unit -> 'a) to a type Func<_> then call the property on it but it always returns the string "Invoke".
let getFuncName f =
    let fFunc = System.Func<_>(fun _ -> f())
    fFunc.Method.Name

let customFunc() = 1.0

// Returns "Invoke" but I want it to return "customFunc"
getFuncName customFunc
A bit of background to this problem is:
I have created an array of functions of type (unit -> Deedle.Frame). I now want to cycle through those functions invoking them and saving them to csv with the csv name having the same name as the function. Some hypothetical code is below:
let generators : (unit -> Frame<int, string>) array = ...
generators
|> Array.iter (fun generator -> generator().SaveCsv(sprintf "%s\%s.csv" __SOURCE_DIRECTORY__ (getFuncName generator)))
This is being used in a scripting sense rather than as application code.
Not sure how you searched for information, but the first query to the search engine gave me this response:
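open System
open System.Reflection.Emit

let getFuncName f =
    // the closure class the compiler generated for the function value
    let type' = f.GetType()
    let method' = type'.GetMethods() |> Array.find (fun m -> m.Name="Invoke")
    let il = method'.GetMethodBody().GetILAsByteArray()
    // find the first call/callvirt instruction in the body of Invoke
    let methodCodes = [byte OpCodes.Call.Value;byte OpCodes.Callvirt.Value]
    let position = il |> Array.findIndex(fun x -> methodCodes |> List.exists ((=)x))
    // the four bytes after the opcode are the metadata token of the called method
    let metadataToken = BitConverter.ToInt32(il, position+1)
    let actualMethod = type'.Module.ResolveMethod metadataToken
    actualMethod.Name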
Unfortunately, this code only works when the F# compiler does not inline the function body into the calling method.
Taken from here
There may be a simpler way, though.

Akka.net F# stateful actor that awaits multiple FileSystemWatcher Observable events

I'm new to F# as well as Akka.Net and trying to achieve the following with them:
I want to create an actor (Tail) that receives a file location and then listens for events at that location using FileSystemWatcher and some Observables, forwarding them on as messages to some other actor for processing.
The problem I'm having is that the code to listen for the events only picks up one event at a time and ignores all the others. e.g. if I copy 20 files into the directory being watched it only seems to send out the event for 1 of them.
Here's my Actor code:
module Tail

open Akka
open Akka.FSharp
open Akka.Actor
open System
open Model
open ObserveFiles
open ConsoleWriteActor

let handleTailMessages tm =
    match tm with
    | StartTail (f,r) ->
        observeFile f consoleWriteActor |!> consoleWriteActor
        |> ignore

let spawnTail =
    fun (a : Actor<IMessage> ) ->
        let rec l (count : int) = actor{
            let! m = a.Receive()
            handleTailMessages m
            return! l (count + 1)
        }
        l(0)
and here's the code that listens for events:
module ObserveFiles

open System
open System.IO
open System.Threading
open Model
open Utils
open Akka
open Akka.FSharp
open Akka.Actor

let rec observeFile (absolutePath : string) (a : IActorRef ) = async{
    let fsw = new FileSystemWatcher(
                Path = Path.GetDirectoryName(absolutePath),
                Filter = "*.*",
                EnableRaisingEvents = true,
                NotifyFilter = (NotifyFilters.FileName ||| NotifyFilters.LastWrite ||| NotifyFilters.LastAccess ||| NotifyFilters.CreationTime ||| NotifyFilters.DirectoryName)
                )
    let prepareMessage (args: EventArgs) =
        let text =
            match box args with
            | :? FileSystemEventArgs as fsa ->
                match fsa.ChangeType with
                | WatcherChangeTypes.Changed -> "Changed " + fsa.Name
                | WatcherChangeTypes.Created -> "Created " + fsa.Name
                | WatcherChangeTypes.Deleted -> "Deleted " + fsa.Name
                | WatcherChangeTypes.Renamed -> "Renamed " + fsa.Name
                | _ -> "Some other change " + fsa.ChangeType.ToString()
            | :? ErrorEventArgs as ea -> "Error: " + ea.GetException().Message
            | o -> "some other unexpected event occurd" + o.GetType().ToString()
        WriteMessage text
    let sendMessage x = async{
        async.Return(prepareMessage x) |!> a
        return! observeFile absolutePath a }
    let! occurance =
        [
            fsw.Changed |> Observable.map(fun x -> sendMessage (x :> EventArgs));
            fsw.Created |> Observable.map(fun x -> sendMessage (x :> EventArgs));
            fsw.Deleted |> Observable.map(fun x -> sendMessage (x :> EventArgs));
            fsw.Renamed |> Observable.map(fun x -> sendMessage (x :> EventArgs));
            fsw.Error |> Observable.map(fun x -> sendMessage (x :> EventArgs));
        ]
        |> List.reduce Observable.merge
        |> Async.AwaitObservable
    return! occurance
}
It took quite a few hacks to get it to this point, any advice on how I could change it, so that it picks up and processes all the events while the actor is running would be greatly appreciated.
When designing a task like that, we could split it into the following components:
A manager responsible for receiving all messages. Its main role is to respond to incoming directory listening requests. Once a request comes in, it creates a child actor responsible for listening to that specific directory.
A child actor responsible for managing the FileSystemWatcher for a specific path. It should subscribe to incoming events and redirect them as messages to the actor responsible for receiving change events. It should also free disposable resources when it is closed.
An actor responsible for receiving change events, in our case by displaying them on the console.
Example code:
open Akka.FSharp
open System
open System.IO

let system = System.create "observer-system" <| Configuration.defaultConfig()

let observer filePath consoleWriter (mailbox: Actor<_>) =
    let fsw = new FileSystemWatcher(
                Path = filePath,
                Filter = "*.*",
                EnableRaisingEvents = true,
                NotifyFilter = (NotifyFilters.FileName ||| NotifyFilters.LastWrite ||| NotifyFilters.LastAccess ||| NotifyFilters.CreationTime ||| NotifyFilters.DirectoryName)
                )
    // subscribe to all incoming events - send them to consoleWriter
    let subscription =
        [fsw.Changed |> Observable.map(fun x -> x.Name + " " + x.ChangeType.ToString());
         fsw.Created |> Observable.map(fun x -> x.Name + " " + x.ChangeType.ToString());
         fsw.Deleted |> Observable.map(fun x -> x.Name + " " + x.ChangeType.ToString());
         fsw.Renamed |> Observable.map(fun x -> x.Name + " " + x.ChangeType.ToString());]
        |> List.reduce Observable.merge
        |> Observable.subscribe(fun x -> consoleWriter <! x)
    // don't forget to free resources at the end
    mailbox.Defer <| fun () ->
        subscription.Dispose()
        fsw.Dispose()
    let rec loop () = actor {
        let! msg = mailbox.Receive()
        return! loop()
    }
    loop ()

// create actor responsible for printing messages
let writer = spawn system "console-writer" <| actorOf (printfn "%A")

// create manager responsible for serving listeners for provided paths
let manager = spawn system "manager" <| actorOf2 (fun mailbox filePath ->
    spawn mailbox ("observer-" + Uri.EscapeDataString(filePath)) (observer filePath writer) |> ignore)

manager <! "testDir"
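The part of this design that addresses the "only one event" symptom is Observable.subscribe: a subscription keeps delivering every notification until it is disposed. The question's code instead awaits the merged observable once per call, which presumably resumes after a single notification. The sketch below shows the subscribe behaviour without Akka or a real FileSystemWatcher. The created and deleted events are hypothetical stand-ins for fsw.Created and fsw.Deleted, and the received list stands in for the console writer actor.

```fsharp
open System.Collections.Generic

// Hypothetical event sources standing in for fsw.Created / fsw.Deleted.
let created = Event<string>()
let deleted = Event<string>()

// Stand-in for the console writer actor: collects every message it is sent.
let received = List<string>()

let subscription =
    [ created.Publish |> Observable.map (fun name -> name + " Created")
      deleted.Publish |> Observable.map (fun name -> name + " Deleted") ]
    |> List.reduce Observable.merge
    |> Observable.subscribe received.Add

// Three notifications in a row: all of them reach the subscriber.
created.Trigger "a.txt"
deleted.Trigger "a.txt"
created.Trigger "b.txt"

// After disposal (what mailbox.Defer does in the actor) nothing more is delivered.
subscription.Dispose()
created.Trigger "c.txt"
```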

How should I expose a global Dictionary declared in f# that will have items added from different HttpModules?

I have a dictionary (formatters) declared in the following code that will have items added to it inside of multiple HttpModules. Once those are loaded it will not be written to again. What would be the best way to expose this so it can be accessed from any .NET language? I know this seems lame and looks like I should just have them implement ToString() however part of the application requires strings to be in a certain format and I don't want clients having to implement ToString() in a way that is specific to my application.
module MappingFormatters

open System
open System.Collections.Generic

let formatters = new Dictionary<Type, obj -> string>();

let format item =
    let toDateTime (d:DateTime) =
        let mutable date = d;
        if (date.Kind) <> System.DateTimeKind.Utc then
            date <- date.ToUniversalTime()
        date.ToString("yyyy-MM-ddTHH:mm:00Z")
    let stripControlCharacters (str:string) =
        let isControl c = not (Char.IsControl(c))
        System.String( isControl |> Array.filter <| str.ToCharArray())
    let defaultFormat (item:obj) =
        match item with
        | :? string as str-> stripControlCharacters(str)
        | :? DateTime as dte -> toDateTime(dte)
        | _ -> item.ToString()
    let key = item.GetType();
    if formatters.ContainsKey(key) then
        formatters.Item(key) item
    else
        defaultFormat item
If the question is just one about language interoperability, then I think you should just change the type from
Dictionary<Type, obj -> string>
to
Dictionary<Type, Func<obj, string> >
and then you should be in good shape.
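A minimal sketch of that change follows, with the default formatting reduced to ToString() for brevity. The int formatter is a made-up example of what a client might register from C# or VB; here it is written in F#.

```fsharp
open System
open System.Collections.Generic

// Func<obj, string> is an ordinary .NET delegate, so any .NET language can create one.
let formatters = Dictionary<Type, Func<obj, string>>()

// Hypothetical registration, shown only to illustrate the delegate-based API.
formatters.[typeof<int>] <- Func<obj, string>(fun o -> sprintf "#%d" (unbox<int> o))

let format (item: obj) =
    match formatters.TryGetValue(item.GetType()) with
    | true, formatter -> formatter.Invoke item
    | false, _ -> item.ToString()   // simplified default, not the full defaultFormat
```

TryGetValue is used instead of ContainsKey followed by Item so that the dictionary is only looked up once.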
After researching, I have decided to create a type called MappingFormatters to hold the method for adding a formatter. The client does not need to call it, but my F# code will. I believe this lets me use the common F# conventions while exposing a way for other .NET languages to interoperate with the least confusion.
module File1

open System

let mutable formatters = Map.empty<string, obj -> string>

let format (item:obj) =
    let dateToString (d:DateTime) =
        let mutable date = d;
        if (date.Kind) <> System.DateTimeKind.Utc then
            date <- date.ToUniversalTime()
        date.ToString("yyyy-MM-ddTHH:mm:00Z")
    let stripCtrlChars (str:string) =
        let isControl c = not (Char.IsControl(c))
        System.String( isControl |> Array.filter <| str.ToCharArray())
    let key = item.GetType().AssemblyQualifiedName
    if Map.containsKey key formatters then
        Map.find key formatters item
    else
        match item with
        | :? DateTime as d -> dateToString d
        | _ -> stripCtrlChars (item.ToString())

let add (typ:Type) (formatter:obj -> string) =
    let contains = Map.containsKey
    let key = typ.AssemblyQualifiedName
    if not (formatters |> contains key) then
        formatters <- Map.add key formatter formatters

type MappingFormatters() = class
    let addLock = new obj()
    member a.Add (``type``:Type, formatter:Func<obj,string>) =
        lock addLock (fun () ->
            add ``type`` (fun x -> formatter.Invoke(x))
        )
end

Resources