Parallel functions composition in Functional Programming - functional-programming

This question is a sub-problem of a bigger problem and this is the last hard nut:
I have two functions:
getUserFromDB :: Token -> Either String User
getNewUser :: Token -> Either String User
How do I combine them to get a function like this:
getUser :: Token -> Either String User
so that getUserFromDB and getNewUser are tried in sequence one after another, are fed the same input (Token is an api token for OAuth api) and the result of the first succeeded function is returned, or some default value if none succeed?
In other words: if there is a function in FP with signature like:
tryUntilGoodOrFail :: failValue -> (result -> Boolean) -> [fn] -> input -> result
OR
tryUntilGoodOrFail :: a -> (a -> Boolean) -> [(b -> a)] -> b -> a
EDIT
This is just a newbie question about functional programming in general. I just don't know how to put two (or more) functions in parallel sequence, in contrast to consecutive functions pipe (which FP is very good at). But in realworld cases this parallel sequence functions pattern is what often needed. I believe FP has it's own solution to this. I just can't find it anywhere.

Part of the problem is Eithers are right-biased, meaning they are designed to short-circuit on the first Left, but you want it to short-circuit on the first Right. If you made everything an Either User String, then your getUser would be:
getUser token = getUserFromDb token >> getNewUser token
If you want to keep it as Either String User, you have to swap it then bind, or do the short-circuiting yourself using an if expression or something.

This kind of sounds like Applicative Functors (also here). Assuming getNewUser will return a Left (or Failure depending on the language) if the user is already in the system then you can use a list of functions and just apply them in sequence and get a list of Result back. Then you can find the first successful result without caring whether it is a new user or an existing one.
This doesn't short-circuit but it does the next best thing.

The simplest solution I have found (working for any language lazy or not): fold (foldl to be more accurate). Don't know how to express it in Haskell or PureScript. Here is my version in Python:
getUser = lambda token: fold(
lambda e,f:e if e.isRight else f(token),
left('None Worked'),
[getUserFromDB, getNewUser, someOtherAttempt, ... ]
)
Same for JS:
const getUser = token => fold(
(e,f)=>e.isRight?e:f(token),
left('None Worked'),
[getUserFromDB, getNewUser, someOtherAttempt, ... ]
)
PureScript has interesting function called until that also can help with it. But it returns a list of results until the first successful one (which is what I was looking for), and I can feed it to last and maybe and get a first good result or a default value. But fold makes it all in one go and is more general and universal and standard.

Related

F#: How can I wrap a non-async value in Async<'T>?

In F#, I'm implementing an interface that returns Async<'T>, but my implementation does not require any async method calls. Here's an example of my current implementation:
type CrappyIdGeneratorImpl() =
interface IdGenerator with
member this.Generate(): Async<Id> = async {
return System.Guid.NewGuid().ToString()
}
Is this the "best" way to wrap a non-Async<> value in Async<>, or is there a better way?
Put another way, I'm looking for the Async equivalent of System.Threading.Tasks.Task.FromResult().
Yes, that's the best way.
Note that, as stated in the comment, another way would be to do async.Return but it's not exactly the same thing, and in your use case it won't work, due to eager evaluation of the argument.
In your case you're interested in calling the .NewGuid() method each time the async is run, but if you use async.Return alone it will be evaluated once and the async computation will be created with that result as a fixed value.
In fact the equivalent of the CE is async.Delay (fun () -> async.Return (.. your expression ..))
UPDATE
As Tarmil noted, this can be interpreted as intended, given that the code resides in a method.
It might be the case that you created a method because that's the way you do with tasks, but Asyncs can be created and called independently (hot vs cold async models).
The question of whether the intention is delaying the async or creating an async at each method call is not entirely clear to me, but whichever is the case, the explanation above about each approach is still valid.

Add element to immutable vector rust

I am trying to create a user input validation function in rust utilising functional programming and recursion. How can I return an immutable vector with one element concatenated onto the end?
fn get_user_input(output_vec: Vec<String>) -> Vec<String> {
// Some code that has two variables: repeat(bool) and new_element(String)
if !repeat {
return output_vec.add_to_end(new_element); // What function could "add_to_end" be?
}
get_user_input(output_vec.add_to_end(new_element)) // What function could "add_to_end" be?
}
There are functions for everything else:
push adds a mutable vector to a mutable vector
append adds an element to the end of a mutable vector
concat adds an immutable vector to an immutable vector
??? adds an element to the end of a immutable vector
The only solution I have been able to get working is using:
[write_data, vec![new_element]].concat()
but this seems inefficient as I'm making a new vector for just one element (so the size is known at compile time).
You are confusing Rust with a language where you only ever have references to objects. In Rust, code can have exclusive ownership of objects, and so you don't need to be as careful about mutating an object that could be shared, because you know whether or not the object is shared.
For example, this is valid JavaScript code:
const a = [];
a.push(1);
This works because a does not contain an array, it contains a reference to an array.1 The const prevents a from being repointed to a different object, but it does not make the array itself immutable.
So, in these kinds of languages, pure functional programming tries to avoid mutating any state whatsoever, such as pushing an item onto an array that is taken as an argument:
function add_element(arr) {
arr.push(1); // Bad! We mutated the array we have a reference to!
}
Instead, we do things like this:
function add_element(arr) {
return [...arr, 1]; // Good! We leave the original data alone.
}
What you have in Rust, given your function signature, is a totally different scenario! In your case, output_vec is owned by the function itself, and no other entity in the program has access to it. There is therefore no reason to avoid mutating it, if that is your goal:
fn get_user_input(mut output_vec: Vec<String>) -> Vec<String> {
// Add mut ^^^
You have to keep in mind that any non-reference is an owned value. &Vec<String> would be an immutable reference to a vector something else owns, but Vec<String> is a vector this code owns and nobody else has access to.
Don't believe me? Here's a simple example of broken code that demonstrates this:
fn take_my_vec(y: Vec<String>) { }
fn main() {
let mut x = Vec::<String>::new();
x.push("foo".to_string());
take_my_vec(x);
println!("{}", x.len()); // E0382
}
The expression x.len() causes a compile-time error, because the vector x was moved into the function argument and we don't own it anymore.
So why shouldn't the function mutate the vector it owns now? The caller can't use it anymore.
In summary, functional programming looks a bit different in Rust. In other languages that have no way to communicate "I'm giving you this object" you must avoid mutating values you are given because the caller may not expect you to change them. In Rust, who owns a value is clear, and the argument reflects that:
Is the argument a value (Vec<String>)? The function owns the value now, the caller gave it away and can't use it anymore. Mutate it if you need to.
Is the argument an immutable reference (&Vec<String>)? The function doesn't own it, and it can't mutate it anyway because Rust won't allow it. You could clone it and mutate the clone.
Is the argument a mutable reference (&mut Vec<String>)? The caller must explicitly give the function a mutable reference and is therefore giving the function permission to mutate it -- but the function still doesn't own the value. The function can mutate it, clone it, or both -- it depends what the function is supposed to do.
If you take an argument by value, there is very little reason not to make it mut if you need to change it for whatever reason. Note that this detail (mutability of function arguments) isn't even part of the function's public signature simply because it's not the caller's business. They gave the object away.
Note that with types that have type arguments (like Vec) other expressions of ownership are possible. Here are a few examples (this is not an exhaustive list):
Vec<&String>: You now own a vector, but you don't own the String objects that it contains references to.
&Vec<&String>: You are given read-only access to a vector of string references. You could clone this vector, but you still couldn't change the strings, just rearrange them, for example.
&Vec<&mut String>: You are given read-only access to a vector of mutable string references. You can't rearrange the strings, but you can change the strings themselves.
&mut Vec<&String>: Like the above but opposite: you are allowed to rearrange the string references but you can't change the strings.
1 A good way to think of it is that non-primitive values in JavaScript are always a value of Rc<RefCell<T>>, so you're passing around a handle to the object with interior mutability. const only makes the Rc<> immutable.

Can you make a function take more than one type in Elm? Can you have an overloaded function?

I'm trying to port a library from JS, and I've gotten to a function that can either take a string or a list of strings. If given a string, it'll split it into a list of strings and then continue as if it had been passed that in the first place.
I can sort of do it by defining my own type, but it makes the API ugly and require a custom Type prefix your data.
Here's what I've got:
type DocumentBody = Raw String | Words List String
tokenize: DocumentBody -> List String
tokenize s =
case s of
Raw str_body -> String.split " " str_body |> (List.map String.toLower)
Words list_body -> List.map String.toLower list_body
-- Tests
tests =
suite "Tokenizer"
[ test "simple" <| assertEqual ["this", "is", "a", "simple", "string"]
<| tokenize (Raw "this is a simple string")
, test "downcasing tokens: string" <| assertEqual ["foo", "bar"]
<| tokenize (Raw "FOO BAR")
, test "downcasing tokens: list of str" <| assertEqual ["foo", "bar"]
<| tokenize (Words ["Foo", "BAR"])
]
Ultimately, I don't think that the port should support this kind of behavior, but how do you pattern match on just the enumerations of the type instead of needing the prefix Raw or Words in my example?
No, you cannot overload functions in elm the same way you can in other languages. Each function has a single signature. If you want a function to accept a parameter that can be of many types, your solution using a union type works just fine.
You say it makes the API ugly. I wouldn't call it ugly. Perhaps the syntax of the type declaration or case statements is unappealing but I say, give it time. It will grow on you. There's a lot of power and safety there. You're not letting the code make any assumptions, you are forced to handle every scenario, and it's one of the strengths of working in a language like elm.
Your pattern matching code is appropriate. You cannot shorten it beyond that.
At the same time, I can understand the pain of rewriting a javascript library and trying to communicate over ports from raw javascript. You're rewriting things in a much stricter language, and you won't be able to duplicate function signatures that in javascript, accepted anything and everything. But again, that's a strength of elm, not a weakness. Use the opportunity to tighten up the API and remove ambiguity.
When it comes to your specific example, to me it feels like there are a few possible alternatives beyond your solution. I'd argue that the tokenize function is promising too much to begin with; it's too ambiguous. When writing code in functional languages, I prefer to keep things small and composable. To me, it should really be two separate functions, each with a single, specific purpose.

What are the two ways of maintaining state between tail-recursive instances of a function in Erlang?

The equivalent in a procedural language (e.g. in Java) would be local variables (or instance variables) declared outside of a loop whose contents use and update them. How can I do that in Erlang?
You pass the state as parameters in the recursive call. Example loop that receives N Msgs and returns them as a list:
loop(N) ->
loop(N, 0, []).
loop(N, Count, Msgs) when Count < N ->
receive
Msg -> loop(N, Count+1, [Msg|Msgs])
end;
loop(_, _, Msgs)
list:reverse(Msgs).
I hope it wasn't homework question but I'm confused with "two ways" in subject.
The most proper way, of course, is to extend recursive function definition with at least one argument to carry all needed data. But, if you can't use it, and you are sure only one instance of such recursive cycle will be in effect in a moment (or they will be properly stacked), and function invocations are in the same process, then process dictionary will help you. See put() and get() in erlang module, and invent unique terms to be used as keys. But this is definitely a kind of hack.
One could invent more hacks but all them will be ugly.:)

OCaml visitor pattern

I am implementing a simple C-like language in OCaml and, as usual, AST is my intermediate code representation. As I will be doing quite some traversals on the tree, I wanted to implement
a visitor pattern to ease the pain. My AST currently follows the semantics of the language:
type expr = Plus of string*expr*expr | Int of int | ...
type command = While of boolexpr*block | Assign of ...
type block = Commands of command list
...
The problem is now that nodes in a tree are of different type. Ideally, I would pass to the
visiting procedure a single function handling a node; the procedure would switch on type of the node and do the work accordingly. Now, I have to pass a function for each node type, which does not seem like a best solution.
It seems to me that I can (1) really go with this approach or (2) have just a single type above. What is the usual way to approach this? Maybe use OO?
Nobody uses the visitor pattern in functional languages -- and that's a good thing. With pattern matching, you can fortunately implement the same logic much more easily and directly just using (mutually) recursive functions.
For example, assume you wanted to write a simple interpreter for your AST:
let rec run_expr = function
| Plus(_, e1, e2) -> run_expr e1 + run_expr e2
| Int(i) -> i
| ...
and run_command = function
| While(e, b) as c -> if run_expr e <> 0 then (run_block b; run_command c)
| Assign ...
and run_block = function
| Commands(cs) = List.iter run_command cs
The visitor pattern will typically only complicate this, especially when the result types are heterogeneous, like here.
It is indeed possible to define a class with one visiting method per type of the AST (which by default does nothing) and have your visiting functions taking an instance of this class as a parameter. In fact, such a mechanism is used in the OCaml world, albeit not that often.
In particular, the CIL library has a visitor class
(see https://github.com/kerneis/cil/blob/develop/src/cil.mli#L1816 for the interface). Note that CIL's visitors are inherently imperative (transformations are done in place). It is however perfectly possible to define visitors that maps an AST into another one, such as in Frama-C, which is based on CIL and offer in-place and copy visitor. Finally Cαml, an AST generator meant to easily take care of bound variables, generate map and fold visitors together with the datatypes.
If you have to write many different recursive operations over a set of mutually recursive datatypes (such as an AST) then you can use open recursion (in the form of classes) to encode the traversal and save yourself some boiler plate.
There is an example of such a visitor class in Real World OCaml.
The Visitor pattern (and all pattern related to reusable software) has to do with reusability in an inclusion polymorphism context (subtypes and inheritance).
Composite explains a solution in which you can add a new subtype to an existing one without modifying the latter one code.
Visitor explains a solution in which you can add a new function to an existing type (and to all of its subtypes) without modifying the type code.
These solutions belong to object-oriented programming and require message sending (method invocation) with dynamic binding.
You can do this in Ocaml is you use the "O" (the Object layer), with some limitation coming with the advantage of having strong typing.
In OCaml Having a set of related types, deciding whether you will use a class hierarchy and message sending or, as suggested by andreas, a concrete (algebraic) type together with pattern matching and simple function call, is a hard question.
Concrete types are not equivalent. If you choose the latter, you will be unable to define a new node in your AST after your node type will be defined and compiled. Once said that a A is either a A1 or a A2, your cannot say later on that there are also some A3, without modifying the source code.
In your case, if you want to implement a visitor, replace your EXPR concrete type by a class and its subclasses and your functions by methods (which are also functions by the way). Dynamic binding will then do the job.

Resources