Futhark dimension mismatch issue - futhark

I am trying to implement groupby functionality. I have reasoned that my code should be correct.
Here is the important section:
let type_func (typ: i32) (v1 : u32) (v2: u32) : u32 =
  match typ
  case 1 -> (*) v1 v2
  case 2 -> (+) v1 v2
  case 3 -> (u32.max) v1 v2
  case 4 -> (u32.min) v1 v2
  case x -> (u32.min) v1 v2 -- TODO change to some panic function
let merge [n][m] (s_cols_t: [n]i32) (a: [m]u32) (b: [m]u32) : [m]u32 =
  map (\i -> if i == 0 then a[i]
             else (type_func s_cols_t[i-1] a[i] b[i])
      ) (iota m)
let main [n][m][t] (db : [n][m]u32) (g_col: i32) (s_cols: [t]i32) (t_cols: [t]i32) : [][]u32 =
  let keep_g = db[:, g_col]
  let keep_s_cols = map (\c -> db[:, c]) s_cols
  let keep_inter = concat [keep_g] (keep_s_cols)
  let keep = transpose keep_inter
  let sorted_rows = rsort keep -- ideally pass groupby col here
  let idxs = mk_flags sorted_rows[:, 0]
  let flag = map (== 1) idxs
  let helper = merge t_cols
  in segmented_reduce helper (replicate (length keep_inter) 0) flag sorted_rows
However the compiler throws the following error.
[0]> :l groupby.fut
Loading groupby.fut
Error at groupby.fut:63:70-80 :
Cannot apply "segmented_reduce" to "sorted_rows" (invalid type).
Expected: [n][argdim₃₅]u32
Actual: *[n][ret₁₃]u32
Dimensions "argdim₃₅" and "ret₁₃" do not match.
Note: "argdim₃₅" is value of argument
length keep_inter
passed to "replicate" at 63:42-58.
Note: "ret₁₃" is unknown size returned by "concat" at 57:20-48.
I have manually checked that the dimensions argdim₃₅ and ret₁₃ match by entering the code line by line in the REPL. Is this simply a compiler limitation I have encountered, or am I doing something stupid?

I have a hacky fix to the problem. You can put most of the main functionality into another function and then explicitly give the shape of the dimension that is used in the function definition.
let groupby [n][m][s][t] (db : [n][m]u32) (cols: [s]i32) (t_cols: [t]i32) : [][]u32 =
  let keep_fun columns row = map (\i -> row[i]) columns
  let keep = map (keep_fun cols) db
  let sorted_rows = rsort keep -- ideally pass groupby col here
  let idxs = mk_flags sorted_rows[:, 0]
  let flag = map (== 1) idxs
  let helper = merge t_cols
  in segmented_reduce helper (replicate s 0) flag sorted_rows
let main db g_col s_cols t_cols =
  let cols = concat [g_col] s_cols
  in groupby db cols t_cols
It's an extremely barbaric and ugly solution, but if it works it works.

Related

Using `should equal` with sequences in F# and FsUnit

I am using FsUnit.Xunit. I am getting a failure for the following test case:
[<Fact>]
let ``Initialization of DFF`` () =
    dff Seq.empty Seq.empty |> should equal (seq {Zero})
The test failure is:
Message: 
FsUnit.Xunit+MatchException : Exception of type 'FsUnit.Xunit+MatchException' was thrown.
Expected: Equals seq [Zero]
Actual: seq [Zero]
Stack Trace: 
That.Static[a](a actual, IMatcher`1 matcher)
Signal.Initialization of DFF() line 11
I get the same error if the test is:
[<Fact>]
let ``Initialization of DFF`` () =
    dff Seq.empty Seq.empty |> should equal (Seq.singleton Zero)
I have never tested equality of sequences using FsUnit.Xunit, so I am confused about what's going on. I'm not even sure what the failure message is telling me, as it seems to say that the expected and actual values are the same. I can get this to work by converting the sequences to lists, but it would be nice not to have to do that.
Could someone explain what's going on here? It seems I'm not understanding the error message and thus probably something about Equals and comparing sequence values (literals?). Thanks.
Source code to be able to reproduce (I think this is everything):
type Bit =
    | Zero
    | One
type Signal = seq<Bit>
let Nand a b =
    match a, b with
    | Zero, Zero -> One
    | Zero, One -> One
    | One, Zero -> One
    | One, One -> Zero
let Not input =
    Nand input input
let And a b =
    Not (Nand a b)
let Or a b =
    Nand (Not a) (Not b)
let private liftToSignal1 op (signal: Signal) : Signal =
    Seq.map op signal
let private liftToSignal2 op (signalA: Signal) (signalB: Signal) : Signal =
    Seq.map2 op signalA signalB
let Not' = liftToSignal1 Not
let And' = liftToSignal2 And
let Or' = liftToSignal2 Or
let rec dff data clock : Signal =
    seq {
        yield Zero
        yield! Or' (And' data clock)
                   (And' (dff data clock) (Not' clock))
    }
This is an issue with structural vs. referential equality.
In F#,
seq { 'a' } = seq { 'a' } // false
but
[ 'a' ] = [ 'a' ] // true
because seq is IEnumerable<'T>, which does not support structural equality (or comparison).
Lists (and other F# container-like types) are much more 'intelligent', i.e. they support structural equality / comparison if the contained objects support it:
[ {| foo = StringComparison.Ordinal; bar = Some(1.23) |} ] =
[ {| foo = StringComparison.Ordinal; bar = Some(1.23) |} ] // true
but don't, if they contain anything that doesn't:
[ box(fun() -> 3) ] = [ box(fun() -> 3) ] // false
So, to make the test work, add a List.ofSeq:
dff Seq.empty Seq.empty |> List.ofSeq |> should equal [ Zero ]
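To see the difference in isolation, here is a minimal sketch that runs outside any test framework. The Bit type is copied from the question; the names s1, s2 and the three result bindings are illustrative only. It also shows Seq.compareWith as one possible way to compare element by element without building lists, assuming both sequences are finite.

```fsharp
type Bit =
    | Zero
    | One

// Two separately constructed sequences with the same single element.
let s1 = seq { yield Zero }
let s2 = seq { yield Zero }

// (=) on seq values falls back to Object.Equals, which is reference equality here.
let seqEqual = (s1 = s2)

// Lists compare structurally, element by element.
let listEqual = (List.ofSeq s1 = List.ofSeq s2)

// Element-wise comparison without materialising lists (finite sequences only).
let elementsEqual = (Seq.compareWith compare s1 s2 = 0)

printfn "seq: %b, list: %b, elements: %b" seqEqual listEqual elementsEqual
```

Only the first comparison is false, which matches the confusing test output: both sides print as seq [Zero], yet they are different objects.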

How to build the dictionary from two other dictionaries by some condition on their values

I am new to functional programming and cannot imagine how to build a new dictionary based on two other dictionaries with similar sets of keys. The new dictionary should have entries for all keys, with values selected or computed based on some condition.
For example, having two dictionaries:
D1: [(1,100);(2,50);(3,150)]
D2: [(1,20);(2,30);(3,0);(4,10)]
and condition to get the average of two values, the resulting dictionary will be
DR: [(1,60);(2,40);(3,75);(4,10)]
I need implementation in F#.
Could you please give me some advice?
Viewing them as two (or more) lists of tuples that we concatenate makes this easier. The code below solves your specific problem. To generalise the process of aggregating a list of values into something specific, you would need to change averageBy to fold and provide a fold function instead of float. This assumes d1 and d2 match your example.
Seq.concat [ d1 ; d2 ]
|> Seq.map (|KeyValue|)
|> Seq.groupBy fst
|> Seq.map (fun (k, c) -> k, Seq.averageBy (snd >> float) c |> int)
|> dict
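The generalisation mentioned above could look roughly like the following sketch. The helper name combineWith and its parameters folder and init are hypothetical, not part of the original answer; the idea is only to swap Seq.averageBy for Seq.fold so that the caller decides how the values of each key are combined.

```fsharp
let d1 = dict [ (1, 100); (2, 50); (3, 150) ]
let d2 = dict [ (1, 20); (2, 30); (3, 0); (4, 10) ]

// Hypothetical helper: group all entries by key, then fold each group's values.
let combineWith folder init dicts =
    dicts
    |> Seq.concat
    |> Seq.map (|KeyValue|)
    |> Seq.groupBy fst
    |> Seq.map (fun (k, entries) -> k, entries |> Seq.map snd |> Seq.fold folder init)
    |> dict

// Sum the values per key: key 1 -> 120, key 4 -> 10.
let sums = combineWith (+) 0 [ d1; d2 ]

// Keep the largest value per key: key 1 -> 100, key 4 -> 10.
let maxima = combineWith max System.Int32.MinValue [ d1; d2 ]
```

An average does not fit a plain fold over single values as neatly, since it needs both a running sum and a count, which is presumably why the answer uses Seq.averageBy directly.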
If you wanted to use an external library, you could do this using Deedle series, which has various operations for working with (time) series of data.
Here, you have two data series that have different keys. Deedle lets you zip series based on keys and handle the cases where one of the values is missing using the opt type:
#r "nuget:Deedle"
open Deedle
let s1 = series [(1,100);(2,50);(3,150)]
let s2 = series [(1,20);(2,30);(3,0);(4,10)]
Series.zip s1 s2
|> Series.mapValues (fun (v1, v2) ->
    ( (OptionalValue.defaultArg 0 v1) +
      (OptionalValue.defaultArg 0 v2) ) / 2)
This may not make sense if this is a thing that you need just in one or two places, but if you're working with key-value series of data more generally, it may be worth checking out.
Solution 1
From a functional perspective I would use a Map data-structure, instead of a dictionary. You can convert a dictionary to a Map like this
let d1 = dict [(1,100);(2,50);(3,150)]
let m1 = Map [for KeyValue (key,value) in d1 -> key, value]
But I wouldn't use a Dictionary and convert it; I would use a Map directly.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
Next, you need a way to get all keys from both Maps. You can get the keys of a map with Map.keys but you need all the keys from both. You could get them by using a Set.
let keys = Set (Map.keys m1) + Set (Map.keys m2)
By adding two Sets you get the Set.union of both sets. Once you have them, you can traverse the keys and try to look up each key in both maps. If you use Map.tryFind then you get an option. You can pattern match on both results at once.
let result = Map [
    for key in keys do
        match Map.tryFind key m1, Map.tryFind key m2 with
        | Some x, Some y -> key, (x + y) / 2
        | Some x, None   -> key, x
        | None  , Some y -> key, y
        | None  , None   -> failwith "Cannot happen"
]
This creates a new Map data-structure and saves it into result. If both cases are Some then you compute the average, otherwise you just keep the value. As you iterate the keys of both Maps the None,None case cannot happen. A Key always must be in either one or the other.
After all of this, result will be:
Map [(1, 60); (2, 40); (3, 75); (4, 10)]
Again, here is the whole code at once:
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let keys = Set (Map.keys m1) + Set (Map.keys m2)
let result = Map [
    for key in keys do
        match Map.tryFind key m1, Map.tryFind key m2 with
        | Some x, Some y -> key, (x + y) / 2
        | Some x, None   -> key, x
        | None  , Some y -> key, y
        | None  , None   -> failwith "Cannot happen"
]
You also can inline the keys variable, if you want.
Solution 2
When you have a Map, you can make use of the fact that adding a value to a map always creates a new Map data-structure. This lets you use Map.fold, which traverses one Map while using the other Map as the starting state.
With Map.change you then can read and change a value in one step. If a key is already available you calculate the average, otherwise just add the value.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let result =
    (m1,m2) ||> Map.fold (fun state key y ->
        state |> Map.change key (function
            | Some x -> Some ((x + y) / 2)
            | None -> Some y
        )
    )
Bonus: Adding Functions to Modules
It's sometimes sad that F# has so few functions on Map. But if you need this a lot, you can always add a union function to the module yourself. For example:
module Map =
    let union f map1 map2 =
        let keys = Set (Map.keys map1) + Set (Map.keys map2)
        Map [
            for key in keys do
                match Map.tryFind key map1, Map.tryFind key map2 with
                | Some x, Some y -> key, (f x y)
                | Some x, None   -> key, x
                | None  , Some y -> key, y
                | None  , None   -> failwith "Cannot happen"
        ]
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let result = Map.union (fun x y -> (x + y) / 2) m1 m2
This way you get a Map.union and you can specify a lambda-function that is executed if both keys are present in both maps, otherwise the value is used unchanged.
There have been a couple of useful suggestions:
Group by keys with standard library functions from the Seq module, by user1981
Use a specialized library for dealing with data series, by Tomas Petricek
Use a map instead (a functional data structure based on comparison), by David Raab
To this I'd like to add
An imperative way, filling a combined dictionary by iterating through the keys of the source data structures, and finally
A query expression
An imperative way
The average calculation is hard-coded with the type int. You can still have generic keys, as their type does not figure in the function, except for the equality constraint required for dictionary keys. You could make the function generic for values too, by marking it inline, but that won't be a pretty sight as it will introduce a host of other constraints onto the type of values.
open System.Collections.Generic
let unionAverage (d1 : IDictionary<_,_>) (d2 : IDictionary<_,_>) =
    let d = Dictionary<_,_>()
    for k in Seq.append d1.Keys d2.Keys |> Seq.distinct do
        match d1.TryGetValue k, d2.TryGetValue k with
        | (true, v1), (true, v2) -> d.Add(k, (v1 + v2) / 2)
        | (true, v), _ | _, (true, v) -> d.Add(k, v)
        | _ -> failwith "Key not found"
    d
let d1 = dict[1, 100; 2, 50; 3, 150]
let d2 = dict[1, 20; 2, 30; 3, 0; 4, 10]
unionAverage d1 d2
A query expression
It operates on the same principle as the answer from user1981, but for re-usability the average function has been factored out. It expects an arbitrary number of #seq<KeyValuePair<_,_>> elements, which is just another way to represent dictionaries that are accessed through their enumerators.
As the query expression uses System.Linq.IGrouping under the hood, this is upcast to a regular sequence to reduce confusion. Then there's the conversion to float for Seq.average to operate on, because the type int does not have the required member DivideByInt.
module Dict =
    let unionByMany f src =
        query {
            for KeyValue(k, v) in Seq.concat src do
            groupValBy v k into group
            select (group.Key, f (group :> seq<_>)) }
        |> dict
Dict.unionByMany (Seq.averageBy float >> int) [d1; d2]
Dict.unionByMany Seq.sum [d1; d2]
Dict.unionByMany Seq.min [d1; d2]

F# Split Function

I'm building a merge sort function and my split method is giving me a value restriction error. I'm using 2 accumulating parameters, the 2 lists resulting from the split, that I package into a tuple in the end for the return. However I'm getting a value restriction error and I can't figure out what the problem is. Does anyone have any ideas?
let split lst =
    let a = []
    let b = []
    let ctr = 0
    let rec helper (lst,l1,l2,ctr) =
        match lst with
        | [] -> []
        | x::xs -> if ctr%2 = 0 then helper(xs, x::l1, l2, ctr+1)
                   else
                       helper(xs, l1, x::l2, ctr+1)
    helper (lst, a, b, ctr)
    (a,b)
Any input is appreciated.
The code, as you have written it, doesn't really make sense. F# uses immutable values by default, therefore your function, as it's currently written, can be simplified to this:
let split lst =
    let a = []
    let b = []
    (a,b)
This is probably not what you want. In fact, due to immutable bindings, there is no value in predeclaring a, b and ctr.
Here is a recursive function that will do the trick:
let split lst =
    let rec helper lst l1 l2 ctr =
        match lst with
        | [] -> l1, l2 // return accumulated lists
        | x::xs ->
            if ctr%2 = 0 then
                helper xs (x::l1) l2 (ctr+1) // prepend x to list 1 and increment
            else
                helper xs l1 (x::l2) (ctr+1) // prepend x to list 2 and increment
    helper lst [] [] 0
Instead of using a recursive function, you could also solve this problem using List.fold. fold is a higher-order function that generalises the accumulation process we described explicitly in the recursive function above.
This approach is a bit more concise but very likely less familiar to someone new to functional programming, so I've tried to describe this process in more detail.
let split2 lst =
    /// Take a running total of each list and a index*value and return a new
    /// pair of lists with the supplied value prepended to the correct list
    let splitFolder (l1, l2) (i, x) =
        match i % 2 = 0 with
        | true -> x :: l1, l2 // return list 1 with x prepended and list2
        | false -> l1, x :: l2 // return list 1 and list 2 with x prepended
    lst
    |> List.mapi (fun i x -> i, x) // map list of values to list of index*values
    |> List.fold (splitFolder) ([],[]) // fold over the list using the splitFolder function
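One detail worth checking when using either version: because elements are prepended, each half comes back in reverse order. For merge sort this does not matter, since the halves are sorted afterwards anyway. The sketch below repeats the recursive split from above and adds a hypothetical splitOrdered variant (not part of the original answer) that reverses both halves for cases where the original order should be kept.

```fsharp
let split lst =
    let rec helper lst l1 l2 ctr =
        match lst with
        | [] -> l1, l2
        | x :: xs ->
            if ctr % 2 = 0 then helper xs (x :: l1) l2 (ctr + 1)
            else helper xs l1 (x :: l2) (ctr + 1)
    helper lst [] [] 0

// Hypothetical order-preserving variant: reverse both halves at the end.
let splitOrdered lst =
    let l1, l2 = split lst
    List.rev l1, List.rev l2

let reversedHalves = split [ 1; 2; 3; 4; 5 ]        // ([5; 3; 1], [4; 2])
let orderedHalves = splitOrdered [ 1; 2; 3; 4; 5 ]  // ([1; 3; 5], [2; 4])
```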

Mutable Data in OCaml

I've created a mutable data structure in OCaml, however when I go to access it, it gives a weird error,
Here is my code
type vector = {a:float;b:float};;
type vec_store = {mutable seq:vector array;mutable size:int};;
let max_seq_length = ref 200;;
exception Out_of_bounds;;
exception Vec_store_full;;
let vec_mag {a=c;b=d} = sqrt( c**2.0 +. d**2.0);;
let make_vec_store() =
  let vecarr = ref ((Array.create (!max_seq_length)) {a=0.0;b=0.0}) in
  {seq= !vecarr;size=0};;
When I do this in ocaml top-level
let x = make_vec_store;;
and then try to do x.size I get this error
Error: This expression has type unit -> vec_store
but an expression was expected of type vec_store
What seems to be the problem? I can't see why this would not work.
Thanks,
Faisal
make_vec_store is a function. When you say let x = make_vec_store, you are setting x to be that function, just like if you'd written let x = 1, that would make x the number 1. What you want is the result of calling that function. According to make_vec_store's definition, it takes () (also known as "unit") as an argument, so you would write let x = make_vec_store ().
Try let x = make_vec_store ()
As a follow-up to the excellent answer provided: you can tell that your example line
# let x = make_vec_store;;
val x : unit -> vec_store = <fun>
returns a function, because the REPL tells you so. You can see from the output that x is a function value (<fun>) of type unit -> vec_store: it takes unit as its only argument and returns a vec_store.
Contrast this to the declaration
# let x = 1;;
val x : int = 1
which tells you that x is of type int and value 1.

Homework help converting an iterative function to recursive

For an assignment, I have written the following code iteratively. It takes a list of a vector data type and a vector, and calculates the closeness of the two vectors. This method works fine, but I don't know how to write the recursive version.
let romulus_iter (x:vector list) (vec:vector) =
  let vector_close_hash = Hashtbl.create 10 in
  let prevkey = ref 10000.0 in (* Define previous key to be a large value since we initially want to set closefactor to prev key *)
  if List.length x = 0 then
    {a=0.;b=0.}
  else
    begin
      Hashtbl.clear vector_close_hash;
      for i = 0 to (List.length x)-1 do
        let vecinquestion = {a=(List.nth x i).a;b=(List.nth x i).b} in
        let closefactor = vec_close vecinquestion vec in
        if (closefactor < !prevkey) then
          begin
            prevkey := closefactor;
            Hashtbl.add vector_close_hash closefactor vecinquestion
          end
      done;
      Hashtbl.find vector_close_hash !prevkey
    end;;
The general recursive equivalent of
for i = 0 to (List.length x)-1 do
  f (List.nth x i)
done
is this:
let rec loop = function
  | x::xs -> f x; loop xs
  | [] -> ()
Note that just like a for-loop, this function only returns unit, though you can define a similar recursive function that returns a meaningful value (and in fact that's what most do). You can also use List.iter, which is meant just for this situation where you're applying an impure function that doesn't return anything meaningful to each item in the list:
List.iter f x

Resources