Function chaining in Julia

Is it possible to write multiple function calls as a chain?
sum(
    map(parseIgFloat,
        map((row) -> row.PL_Amount,
            filter((row) -> !ismissing(row.Summary) && row.Summary == "Cash In",
                collect(ts)
            )
        )
    )
)
Turn it into something like:
ts
|> collect
|> filter((row) -> !ismissing(row.Summary) && row.Summary == "Cash In")
|> map((row) -> row.PL_Amount)
|> map(parseIgFloat)
|> sum

After searching, this seems to be the best option available
ts |>
collect |>
(list -> filter((row) -> !ismissing(row.Summary) && row.Summary == "Cash In", list)) |>
(list -> map((row) -> row.PL_Amount, list)) |>
(list -> map(parseIgFloat, list)) |>
sum
or, using the @pipe macro from the Pipe package:
@pipe ts |>
collect |>
filter((row) -> !ismissing(row.Summary) && row.Summary == "Cash In", _) |>
map((row) -> row.PL_Amount, _) |>
map(parseIgFloat, _) |>
sum

In general, what you ask is difficult: whatever construct you come up with would have to guarantee that the order in which arguments are passed matches the order each function expects. There is therefore no method general enough to do this without defining explicit types and operations for it.
However, for map and filter specifically, it is trivial to create 'curried' versions of these functions and chain those. E.g.:
# Create curried versions of map and filter for use in 'chaining'
import Base.map; function map(f); return L -> map(f,L); end;
import Base.filter; function filter(f); return L -> filter(f,L); end;
f = x -> x ^ 2;
ts = range(1, stop=10);
( ts
|> collect
|> map(f) # square the collection
|> filter(iseven) # keep only even results
|> sum
)
Output:
220
PS: Chaining is mostly about readability, and is most useful when you have a succession of individually simple and visually straightforward commands, as above. If you're going to have complicated expressions in your 'chain', like the ones in your proposed solution, then it's not really worth it in my opinion. Either wrap your complicated expressions in appropriately named functions that make the chain read like plain English, or avoid chaining in the first place and rely on clear steps using temporary variables instead.
PS2: Also note that the |> operator is a valid target for broadcasting, like any function. Therefore |> map(f) above could also have been written more simply as .|> f instead.

I would probably write it like this:
sum(row -> parseIgFloat(row.PL_Amount),
filter((row) -> !ismissing(row.Summary) && row.Summary == "Cash In",
ts)
)
If and when underscore currying comes about (https://github.com/JuliaLang/julia/pull/24990) you could simplify it like this:
sum(parseIgFloat(_.PL_Amount),
    filter(!ismissing(_.Summary) && _.Summary == "Cash In", # actually not sure if this line would work
           ts)
)
At that point chaining might also become more convenient.
By the way: don't use collect unless you somehow really have to.

Related

flatten a map of map in F#

I have the following:
Map<Instrument, Map<PositionSide, PositionData>>
is there a way to flatten this to a:
PositionData list
without iterating through the 2 maps manually?
If we do not worry about duplicate values from the inner maps, it can be done by treating the maps as sequences. Map implements IEnumerable<KeyValuePair<_,_>>, so enumerating a map yields values of type System.Collections.Generic.KeyValuePair<_,_>.
These can be deconstructed with F#'s predefined KeyValue active pattern.
Map.empty<Instrument, Map<PositionSide, PositionData>>
|> Seq.collect (fun (KeyValue(_, v)) -> v)
|> Seq.map (fun (KeyValue(_, v)) -> v)
|> Seq.toList
// val it : PositionData list = []
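The snippet above starts from an empty map, so here is a minimal sketch of the same idea on concrete data. The type definitions and sample values are hypothetical stand-ins, since the question does not show them, and Map.toSeq is used in place of the KeyValue pattern purely as an alternative spelling.

```fsharp
// Hypothetical stand-ins for the question's types (not from the original post).
type Instrument = Instrument of string
type PositionSide = Long | Short
type PositionData = { Quantity: int }

let positions : Map<Instrument, Map<PositionSide, PositionData>> =
    Map.ofList [
        Instrument "A", Map.ofList [ Long, { Quantity = 1 }; Short, { Quantity = 2 } ]
        Instrument "B", Map.ofList [ Long, { Quantity = 3 } ]
    ]

// Map.toSeq yields (key, value) tuples; keep only the values at both levels.
let flattened : PositionData list =
    positions
    |> Map.toSeq
    |> Seq.collect (fun (_, inner) -> inner |> Map.toSeq |> Seq.map snd)
    |> Seq.toList
```

Because maps enumerate in key order, the resulting list follows the ordering of the outer keys and then the inner keys.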
I'm afraid it is not possible. There are several issues with it:
First, these are not two collections, but potentially many collections nested inside another collection.
Map is not built for iteration; it is dictionary-like. You can, however, convert it to a list of pairs first with Map.toList.
There is no way around iterating over the values if you need all of them.
let toValuesList map =
    map
    |> Map.toList
    |> List.map snd
let ds =
    data
    |> toValuesList
    |> List.collect toValuesList
If you could look up PositionData by some key, you would at most avoid iterating over the inner maps:
let data =
    map
    |> toValuesList
    |> List.map (fun valueMap -> lookupWithin valueMap)

Accidental recursion, blowing up the stack with Seq.append, without using `rec`

I had code that was waiting to blow up something lurking around. Using F# 4.1 Result it is similar to this:
module Result =
    let unwindSeq (sourceSeq: #seq<Result<_, _>>) =
        sourceSeq
        |> Seq.fold (fun state res ->
            match state with
            | Error e -> Error e
            | Ok innerResult ->
                match res with
                | Ok suc ->
                    Seq.singleton suc
                    |> Seq.append innerResult
                    |> Ok
                | Error e -> Error e) (Ok Seq.empty)
The obvious bottleneck here is Seq.singleton added to Seq.append. I understand that this is slow (and badly written), but why does it have to blow up the stack? I don't think that Seq.append is inherently recursive...
// blows up stack, StackOverflowException
Seq.init 1000000 Result.Ok
|> Result.unwindSeq
|> printfn "%A"
And as an aside, to unwind a sequence of Result, I fixed this function by using a simple try-catch-reraise, but that feels sub-par too. Any ideas as to how to do this more idiomatically without force-evaluating the sequence or blowing up the stack?
Not-so-perfect unwinding (it also forces the result-fail type), but at least without pre-evaluation of the sequence:
let unwindSeqWith throwArgument (sourceSeq: #seq<Result<_, 'a -> 'b>>) =
    try
        sourceSeq
        |> Seq.map (throwOrReturnWith throwArgument)
        |> Ok
    with
    | e ->
        (fun _ -> raise e)
        |> Error
I believe the idiomatic way of folding a sequence of Results in the way you suggest would be:
let unwindSeq<'a,'b> =
    Seq.fold<Result<'a,'b>, Result<'a seq, 'b>>
        (fun acc cur -> acc |> Result.bind (fun a -> cur |> Result.bind (Seq.singleton >> Seq.append a >> Ok)))
        (Ok Seq.empty)
Not that this will be any faster than your current implementation; it just leverages Result.bind to do most of the work. I believe the stack is overflowing because of a recursive function somewhere in the F# library, likely in the Seq module. My best evidence for this is that materializing the sequence to a List first seems to make it work, as in the following example:
let results =
    Seq.init 2000000 (fun i -> if i <= 1000000 then Result.Ok i else Error "too big")
    |> Seq.toList
results
|> unwindSeq
|> printfn "%A"
However, this may not work in your production scenario if the sequence is too big to materialize in memory.
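As a possible alternative, here is a minimal sketch that avoids building a long chain of Seq.append calls by accumulating the Ok values in a list instead. The name unwindToList is hypothetical and this is not the approach from either post. It stops at the first Error, but it does hold every Ok value in memory, so the caveat above about very large sequences still applies.

```fsharp
/// Hypothetical helper: collects Ok values into a list, stopping at the first Error.
let unwindToList (source: seq<Result<'a, 'e>>) : Result<'a list, 'e> =
    use en = source.GetEnumerator()
    // The recursive call is in tail position, so no chain of nested
    // enumerators or pending calls is built up per element.
    let rec loop acc =
        if en.MoveNext() then
            match en.Current with
            | Ok value -> loop (value :: acc)
            | Error e -> Error e
        else
            Ok (List.rev acc)
    loop []

// One million Ok values; the annotation pins down the otherwise unconstrained error type.
let allOk : Result<int list, string> = Seq.init 1000000 Ok |> unwindToList

// The element after the Error is never inspected.
let failing = seq { yield Ok 1; yield Error "boom"; yield Ok 2 } |> unwindToList
```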

F# stop Seq.map when a predicate evaluates true

I'm currently generating a sequence in a similar way to:
migrators
|> Seq.map (fun m -> m())
The migrator function is ultimately returning a discriminated union like:
type MigratorResult =
    | Success of string * TimeSpan
    | Error of string * Exception
I want to stop the map once I encounter my first Error but I need to include the Error in the final sequence.
I have something like the following to display a final message to the user
match results |> List.rev with
| [] -> "No results equals no migrators"
| head :: _ ->
    match head with
    | Success (dt, t) -> "All migrators succeeded"
    | Error (dt, ex) -> "Migration halted owing to error"
So I need:
A way to stop the mapping when one of the map steps produces an Error
A way to have that error be the final element added to the sequence
I appreciate there may be a sequence function other than map that will do this; I'm new to F# and searching online hasn't yielded anything yet!
I guess there are multiple approaches here, but one way would be to use unfold:
migrators
|> Seq.unfold (fun ms ->
    match ms with
    | m :: tl ->
        match m () with
        | Success res -> Some (Success res, tl)
        | Error res -> Some (Error res, [])
    | [] -> None)
|> List.ofSeq
Note the List.ofSeq at the end; it is only there to realize the sequence. A different way to go would be to use sequence comprehensions, which some might say results in clearer code.
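A rough sketch of what the sequence-comprehension variant might look like follows. The helper names runUntilError and makeMigrator and the dummy migrators are hypothetical, and the type is repeated only so the snippet stands on its own.

```fsharp
open System

type MigratorResult =
    | Success of string * TimeSpan
    | Error of string * Exception

/// Hypothetical helper: runs migrators lazily and stops after the first Error,
/// which is still yielded as the last element.
let rec runUntilError (migrators: (unit -> MigratorResult) list) : seq<MigratorResult> =
    seq {
        match migrators with
        | [] -> ()
        | migrate :: rest ->
            let result = migrate ()
            yield result
            match result with
            | Success _ -> yield! runUntilError rest
            | Error _ -> ()
    }

// Counts how many migrators were actually invoked.
let calls = ref 0

// Wraps a fixed result in a migrator that records its own invocation.
let makeMigrator (result: MigratorResult) : unit -> MigratorResult =
    fun () ->
        calls.Value <- calls.Value + 1
        result

let results =
    [ makeMigrator (Success ("first", TimeSpan.Zero))
      makeMigrator (Error ("second", Exception("failed")))
      makeMigrator (Success ("third", TimeSpan.Zero)) ]
    |> runUntilError
    |> List.ofSeq
```

Here results holds two elements, the last being the Error, and the third migrator is never invoked.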
The ugly things Tomaš alludes to are 1) mutable state, and 2) manipulation of the underlying enumerator. A higher-order function which returns up to and including when the predicate holds would then look like this:
module Seq =
    let takeUntil pred (xs : _ seq) = seq{
        use en = xs.GetEnumerator()
        let flag = ref true
        while !flag && en.MoveNext() do
            flag := not <| pred en.Current
            yield en.Current }
seq{1..10} |> Seq.takeUntil (fun x -> x % 5 = 0)
|> Seq.toList
// val it : int list = [1; 2; 3; 4; 5]
For your specific application, you'd map the cases of the DU to a boolean.
(migrators : seq<MigratorResult>)
|> Seq.takeUntil (function Success _ -> false | Error _ -> true)
I think the answer from @scrwtp is probably the nicest way to do this if your input is reasonably small (and you can turn it into an F# list to use pattern matching). I'll add one more version, which works when your input is just a sequence and you do not want to turn it into a list.
Essentially, you want to do something that's almost like Seq.takeWhile, but that gives you one additional item at the end (the one for which the predicate fails).
To use a simpler example, the following returns all numbers from a sequence until one that is divisible by 5:
let nums = [ 2 .. 10 ]
nums
|> Seq.map (fun m -> m % 5)
|> Seq.takeWhile (fun n -> n <> 0)
So, you basically just need to look one element ahead. To do this, you could use Seq.pairwise, which gives you the current and the next element in the sequence:
nums
|> Seq.map (fun m -> m % 5)
|> Seq.pairwise                             // Get sequence of pairs with the next value
|> Seq.takeWhile (fun (p, n) -> p <> 0)     // Continue while the earlier value of the pair passes the test
|> Seq.mapi (fun i (p, n) ->                // For the first item, we return both
    if i = 0 then [p;n] else [n])           // for all other, we return the second
|> Seq.concat
The only ugly thing here is that you then need to flatten the sequence again using mapi and concat.
This is not very nice, so a good thing to do would be to define your own higher-order function like Seq.takeUntilAfter that encapsulates the behavior you need (and hides all the ugly things). Then your code could just use the function and look nice & readable (and you can experiment with other ways of implementing this).

Returning object with different value

Is there a way to return an object used in a lambda expression, but with a different value? I've been using the "kind of linq-select" way, but I'd like to do something like this:
let bob= tab
|> Seq.map (fun x -> ignore (x.Value=x.Value+1); x)
|> Seq.iter (fun x -> x.Dump())
making all the x's in my sequence to have their value +1'ed.
instead of doing this:
let bob= tab
|> Seq.map (fun x -> Ville(IdVille= 9, NoVille=x.Value+1, Nom=x.Nom, __RowVersion = x.__RowVersion))
|> Seq.iter (fun x -> x.Dump())
edit:
What do I expect to get from this? A dump of the sequence, hence the iter and Dump.
What do I want the sequence to be? The original sequence, but with a function applied to each element, giving a copy of the result (no side effect on the original sequence).
For example, given a sequence of names, I'd like a copy of the original sequence with the names upper-cased. Now imagine the same, but with a sequence of objects fetched from a database.
Edit2:
I made a test with LinqPad and AdventureWorks database, and I did this:
let dc = new TypedDataContext()
let tab = dc.GetTable<Address>()
let bob = tab
|> Seq.map (fun x -> ignore (x.AddressLine1 <- "Bob"); x)
tab.Dump()
bob.Dump()
The two Dump() results are different. If I swap the two Dump() calls, both results are the same. You were right!
It's hard to tell what you're trying to do, but mutating a value suggests an imperative approach, so why not a for loop?
for x in tab do
    x.Value <- x.Value + 1
    x.Dump()
What value do you expect for bob? Seq.iter returns unit. If you mutate tab within Seq.map it will have the same value as bob.
EDIT
If you modify elements of a sequence within map the result and the original sequence will be one and the same. map is not intended to be used with side effects. An example:
type T(value: int) =
    member val Value = value with get, set
let tab = [T(0); T(1); T(3)]
let bob = tab |> Seq.map (fun x -> x.Value <- x.Value + 1; x)
tab = (Seq.toList bob) //true
You can try using map along with a copy-and-update record expression to update just one field:
let bob = tab
          |> Seq.map (fun x -> {x with Value = x.Value + 1})
          |> Seq.iter (fun x -> x.Dump())
Though bob will not get the results of Dump() assigned to it if you are using iter. You'd need to use map again for that.
Edit
This only works with record types.
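To illustrate the record-only approach, here is a small sketch. The City record and its sample values are hypothetical and merely stand in for the entity from the question.

```fsharp
// Hypothetical record standing in for the database entity in the question.
type City = { Id: int; Number: int; Name: string }

let original =
    [ { Id = 1; Number = 10; Name = "Lyon" }
      { Id = 2; Number = 20; Name = "Nice" } ]

// Each element is copied with one field changed; `original` is left untouched.
let updated =
    original
    |> List.map (fun city -> { city with Number = city.Number + 1 })
```

Since records are immutable by default, the copies share nothing mutable with the originals, which gives the "no side effect on the original sequence" behaviour asked for.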

F# replacing variables with actual values results in endless loop (recursive function)

I recently started with F# and implemented a very basic recursive function that represents the Sieve of Eratosthenes. I came up with the following, working code:
static member internal SieveOfEratosthenesRecursive sequence accumulator =
    match sequence with
    | [] -> accumulator
    | head::tail -> let rest = tail |> List.filter(fun number -> number % head <> 0L)
                    let newAccumulator = head::accumulator
                    Prime.SieveOfEratosthenesRecursive rest newAccumulator
This function is not really memory efficient so I tried to eliminate the variables "rest" and "newAccumulator". I came up with the following code
static member internal SieveOfEratosthenesRecursive sequence accumulator =
    match sequence with
    | [] -> accumulator
    | head::tail -> tail |> List.filter(fun number -> number % head <> 0L)
                    |> Prime.SieveOfEratosthenesRecursive (head::accumulator)
As far as I understand the tutorials I've read, Prime.SieveOfEratosthenesRecursive will be called with the filtered tail as the first parameter and a list consisting of head::accumulator as the second one. However, when I try to run the code with the reduced variable usage, the program gets trapped in an infinite loop. Why is this happening, and what did I do wrong?
"As far as I understand the tutorials I've read Prime.SieveOfEratosthenesRecursive will be called with the filtered tail as first parameter and a list consisting of head::accumulator as second one."
You have this backwards.
In the first version, you're passing rest then newAccumulator; in the second version, you're effectively passing newAccumulator then rest. I.e., you've transposed the arguments.
Prime.SieveOfEratosthenesRecursive (head::accumulator) is a partial function application wherein you're applying (head::accumulator) as the first argument (sequence). This partial function application yields a unary function (expecting accumulator), to which you are passing (via |>) what is called rest in the first version of your code.
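A tiny sketch may make the argument order easier to see. The describe function is hypothetical and exists only to show which parameter each value ends up in.

```fsharp
// Hypothetical two-argument function, used only to make the argument order visible.
let describe first second = sprintf "first=%d second=%d" first second

// `describe 1` is a partial application that fixes `first`;
// the piped value then fills the remaining parameter, `second`.
let piped = 2 |> describe 1      // same as: describe 1 2

// Passing the arguments directly in the other order, for comparison.
let direct = describe 2 1
```

Here piped evaluates to "first=1 second=2", whereas direct evaluates to "first=2 second=1".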
Changing SieveOfEratosthenesRecursive's argument order is the easiest solution, but I would consider something like the following idiomatic as well:
static member internal SieveOfEratosthenesRecursive sequence accumulator =
    match sequence with
    | [] -> accumulator
    | head::tail ->
        tail
        |> List.filter(fun number -> number % head <> 0L)
        |> Prime.SieveOfEratosthenesRecursive <| (head::accumulator)
or
static member internal SieveOfEratosthenesRecursive sequence accumulator =
    let inline flipzip a b = b, a
    match sequence with
    | [] -> accumulator
    | head::tail ->
        tail
        |> List.filter(fun number -> number % head <> 0L)
        |> flipzip (head::accumulator)
        ||> Prime.SieveOfEratosthenesRecursive
FWIW, eliminating rest and newAccumulator as named variables here is not going to impact your memory usage in the slightest.
The last call in your second function is equivalent to:
Prime.SieveOfEratosthenesRecursive newAccumulator rest
where the positions of the two parameters are switched. Since newAccumulator grows with each recursive call, you will never reach the base case of the empty list.
A rule of thumb is to put the most frequently changing parameter last:
let rec sieve acc xs =
    match xs with
    | [] -> acc
    | x::xs' -> xs' |> List.filter (fun y -> y % x <> 0L)
                |> sieve (x::acc)
The above function could be shortened using the function keyword:
let rec sieve acc = function
    | [] -> acc
    | x::xs' -> xs' |> List.filter (fun y -> y % x <> 0L)
                |> sieve (x::acc)
Using the pipe (|>) operator only makes the function more readable; it doesn't affect memory usage at all.

Resources