Making cyclic graphs in F#. Is mutability required?

I'm trying to build a cyclic graph in F#.
My node type looks something like this:
type Node = { Value : int; Edges : Node list }
My question is: Do I need to make Edges mutable in order to have cycles?

F# makes it possible to create immediate recursive object references with cycles, but this really only works on (fairly simple) records. So, if you try this on your definition it won't work:
let rec loop =
    { Value = 0
      Edges = [loop] }
However, you can still avoid mutation - one reasonable alternative is to use lazy values:
type Node = { Value : int; Edges : Lazy<Node list>}
This way, you are giving the compiler "enough time" to create a loop value before it needs to evaluate the edges (and access the loop value again):
let rec loop =
    { Value = 0
      Edges = lazy [loop] }
In practice, you'll probably want to call some functions to create the edges, but that should work too. You should be able to write e.g. Edges = lazy (someFancyFunction loop).
Alternatively, you could also use seq<Node> (as sequences are lazy by default), but that would re-evaluate the edges every time they are enumerated, so you probably don't want to do that.
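The lazy-edges idea carries over to OCaml, F#'s close relative. The following is only a minimal sketch: the lowercase names (node, value, edges, a, b) are illustrative and not from the question. OCaml also accepts some direct recursive value definitions, so lazy is used here purely to mirror the F# approach above.

```ocaml
(* A node whose edge list is computed on demand. *)
type node = { value : int; edges : node list Lazy.t }

(* A self-loop: the lazy thunk is only forced after [loop] exists. *)
let rec loop = { value = 0; edges = lazy [ loop ] }

(* A two-node cycle, a -> b -> a, built without any mutation. *)
let rec a = { value = 1; edges = lazy [ b ] }
and b = { value = 2; edges = lazy [ a ] }

(* Follow the first outgoing edge of a node. *)
let first_neighbour n = List.hd (Lazy.force n.edges)
```

Forcing the edges of a and then of its neighbour leads back to the very same a value, so the cycle is real rather than a copy.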


Removing the first half of the entries in LinkedHashMap other than looping

I was going to use Hashtable, but an existing answer said that only LinkedHashMap preserves the insertion order. So, it seems that I can get the insertion order with the entries or keys properties.
My question is, when the map has n elements, if I want to remove the first n/2 elements, is there a better way than looping through the keys and repeatedly calling remove(key)? That is, something like this
val a = LinkedHashMap<Int, Int>();
val n = 10;
for(i in 1 .. n)
{
    a[i] = i*10;
}
a.removeRange(0,n/2);
instead of
val a = LinkedHashMap<Int, Int>();
val n = 10;
for(i in 1 .. n)
{
    a[i] = i*10;
}
var i = 0;
var keysToRemove= ArrayList<Int>();
for(k in a.keys)
{
    if(i >= n/2)
        break;
    else
        i++
    keysToRemove.add(k);
}
for(k in keysToRemove)
{
    a.remove(k);
}
The purpose of this is that I use the map as a cache, and when the cache is full, I want to purge the oldest half of the entries. I do not have to use LinkedHashMap as long as I can:
Find the value using a key, efficiently.
Remove a range of entries at once.
There's no method in the class that makes this possible. The source code doesn't have any operations for ranges of keys or entries. Since the linking is built on top of the HashMap logic, individual entries still have to be found one by one with a hashed key lookup to remove them, so removing a range could not be done any faster in a LinkedHashMap. In that respect it is unlike a LinkedList compared to an ArrayList.
For simpler code that's equivalent to what you're doing:
a.keys.take(a.size / 2).forEach(a::remove)
If you don't want to use a library for the cache, LinkedHashMap is designed so you can easily build your own by subclassing. For instance, a basic one that simply removes the oldest entry when you add elements above a certain collection size:
class CacheHashMap<K, V>(private val maxSize: Int): LinkedHashMap<K, V>() {
    // Called after each insertion; returning true evicts the eldest entry.
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?): Boolean =
        size > maxSize
}
Also, if you set accessOrder to true in your constructor call, it orders entries from least recently used to most recently used, which might be more apt for your situation than insertion order.
EDIT: Sorry, I missed the part about using this as an LRU cache. For that use case, TreeMap will not be suitable.
If insertion order is just incidental for you, and what you want is in fact the actual order of comparable keys, you should use a TreeMap instead.
However, the specific use case of removing half the keys might not be supported directly. You will rather find methods to remove keys below/above a certain value, and get the highest/lowest keys.

What's a good pattern to manage impossible states in Elm?

Maybe you can help. I'm an Elm beginner and I'm struggling with a rather mundane problem. I'm quite excited with Elm and I've been rather successful with smaller things, so now I tried something more complex but I just can't seem to get my head around it.
I'm trying to build something in Elm that uses a graph-like underlying data structure. I create the graph with a fluent/factory pattern like this:
sample : Result String MyThing
sample =
MyThing.empty
|> addNode 1 "bobble"
|> addNode 2 "why not"
|> addEdge 1 2 "some data here too"
When this code returns Ok MyThing, then the whole graph has been set up in a consistent manner, guaranteed, i.e. all nodes and edges have the required data and the edges for all nodes actually exist.
The actual code has more complex data associated with the nodes and edges but that doesn't matter for the question. Internally, the nodes and edges are stored in the Dict Int element.
type alias MyThing =
    { nodes : Dict Int String
    , edges : Dict Int { from : Int, to : Int, label : String }
    }
Now, in the users of the module, I want to access the various elements of the graph. But whenever I access one of the nodes or edges with Dict.get, I get a Maybe. That's rather inconvenient because by the virtue of my constructor code I know the indexes exist etc. I don't want to clutter upstream code with Maybe and Result when I know the indexes in an edge exist. To give an example:
getNodeTexts : Edge -> MyThing -> Maybe (String, String)
getNodeTexts edge thing =
    case Dict.get edge.from thing.nodes of
        Nothing ->
            --Yeah, actually this can never happen...
            Nothing
        Just fromNode ->
            case Dict.get edge.to thing.nodes of
                Nothing ->
                    --Again, this can never actually happen because the builder code prevents it.
                    Nothing
                Just toNode ->
                    Just ( fromNode.label, toNode.label )
That's just a lot of boilerplate code to handle something I specifically prevented in the factory code. But what's even worse: Now the consumer needs extra boilerplate code to handle the Maybe--potentially not knowing that the Maybe will actually never be Nothing. The API is sort of lying to the consumer. Isn't that something Elm tries to avoid? Compare to the hypothetical but incorrect:
getNodeTexts : Edge -> MyThing -> (String, String)
getNodeTexts edge thing =
    ( Dict.get edge.from thing.nodes |> .label
    , Dict.get edge.to thing.nodes |> .label
    )
An alternative would be not to use Int IDs but use the actual data instead--but then updating things gets very tedious as connectors can have many edges. Managing state without the decoupling through Ints just doesn't seem like a good idea.
I feel there must be a solution to this dilemma using opaque ID types but I just don't see it. I would be very grateful for any pointers.
Note: I've also tried to use both drathier and elm-community elm-graph libraries but they don't address the specific question. They rely on Dict underneath as well, so I end up with the same Maybes.
There is no easy answer to your question. I can offer one comment and a coding suggestion.
You use the magic words "impossible state" but as OOBalance has pointed out, you can create an impossible state in your modelling. The normal meaning of "impossible state" in Elm is precisely in relation to modelling e.g. when you use two Bools to represent 3 possible states. In Elm you can use a custom type for this and not leave one combination of bools in your code.
As for your code, you can reduce its length (and perhaps complexity) with
getNodeTexts : Edge -> MyThing -> Maybe ( String, String )
getNodeTexts edge thing =
    Maybe.map2 (\n1 n2 -> ( n1.label, n2.label ))
        (Dict.get edge.from thing.nodes)
        (Dict.get edge.to thing.nodes)
From your description, it looks to me like those states actually aren't impossible.
Let's start with your definition of MyThing:
type alias MyThing =
    { nodes : Dict Int String
    , edges : Dict Int { from : Int, to : Int, label : String }
    }
This is a type alias, not a type – meaning the compiler will accept MyThing in place of {nodes : Dict Int String, edges : Dict Int {from : Int, to : Int, label : String}} and vice-versa.
So rather than construct a MyThing value safely using your factory functions, I can write:
import Dict
myThing = { nodes = Dict.empty, edges = Dict.fromList [(0, {from = 0, to = 1, label = "Edge 0"})] }
… and then pass myThing to any of your functions expecting MyThing, even though the nodes connected by Edge 0 aren't contained in myThing.nodes.
You can fix this by changing MyThing to be a custom type:
type MyThing
    = MyThing
        { nodes : Dict Int String
        , edges : Dict Int { from : Int, to : Int, label : String }
        }
… and exposing it using exposing (MyThing) rather than exposing (MyThing(..)). That way, no constructor for MyThing is exposed, and code outside of your module must use the factory functions to obtain a value.
The same applies to Edge, which I'm assuming is defined as:
type alias Edge =
    { from : Int, to : Int, label : String }
Unless it is changed to a custom type, it is trivial to construct arbitrary Edge values:
type Edge
    = Edge { from : Int, to : Int, label : String }
Then however, you will need to expose some functions to obtain Edge values to pass to functions like getNodeTexts. Let's assume I have obtained a MyThing and one of its edges:
myThing : MyThing
-- created using factory functions
edge : Edge
-- an edge of myThing
Now I create another MyThing value, and pass it to getNodeTexts along with edge:
myOtherThing : MyThing
-- a different value of type MyThing
nodeTexts = getNodeTexts edge myOtherThing
This should return Maybe.Nothing or Result.Err String, but certainly not (String, String) – the edge does not belong to myOtherThing, so there is no guarantee its nodes are contained in it.
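The opaque-type idea is not specific to Elm. As a rough sketch in OCaml, an abstract type in a module signature plays the role of an unexposed constructor. Everything here is hypothetical: the module name Graph and the functions add_node, add_edge, node_count and edges are illustrative names, not part of the question's code.

```ocaml
module IntMap = Map.Make (Int)

module Graph : sig
  type t (* abstract: callers cannot build a value by hand *)

  val empty : t
  val add_node : int -> string -> t -> t

  (* Fails unless both endpoints are already present. *)
  val add_edge : int -> int -> string -> t -> (t, string) result
  val node_count : t -> int
  val edges : t -> (int * int * string) list
end = struct
  type edge = { from_ : int; to_ : int; label : string }
  type t = { nodes : string IntMap.t; edge_list : edge list }

  let empty = { nodes = IntMap.empty; edge_list = [] }
  let add_node id text g = { g with nodes = IntMap.add id text g.nodes }

  let add_edge from_ to_ label g =
    if IntMap.mem from_ g.nodes && IntMap.mem to_ g.nodes then
      Ok { g with edge_list = { from_; to_; label } :: g.edge_list }
    else Error "edge refers to a node that is not in the graph"

  let node_count g = IntMap.cardinal g.nodes
  let edges g = List.map (fun e -> (e.from_, e.to_, e.label)) g.edge_list
end
```

Because t is abstract, the only way to obtain a graph is through empty, add_node and add_edge. As noted above, this still does not stop a caller from pairing an edge taken from one graph with a different graph, so lookups across graphs can legitimately fail.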

Pointer to a record in OCaml

I am implementing binary search trees in OCaml, trying to use as much imperative programming as possible.
I have the following data type:
type tKey = Key of int;;
type tBST = Null | Pos of node ref
and node = {mutable key : tKey; mutable left : tBST; mutable right : tBST};;
I am having trouble with this function:
let createNode k tree =
  tree := Pos ({key = k; left = Null; right = Null});;
Error: This record expression is expected to have type node ref
The field key does not belong to type ref
A binary search tree can be either Null (means empty tree) or a Pos. A tree Pos is a pointer to a node, and a node is a structure of a key and 2 other trees (left and right).
My main goal here is to have a tree that is modified after functions are over. Passing tree by reference so when createNode is over, the tBST I passed as parameter is modified.
Question: is it actually possible to do what I am trying in OCaml? If so, how could I change my function createNode and/or data type to make this happen?
Thank you very much.
It is possible, but you need to create the Pos node with a reference explicitly:
Pos (ref {key = k; (*...*)})
Whether what you are trying to do is recommended practice in a language like OCaml is a different story, though.
The question has already been answered. I would just like to add a side note: The use of ref seems superfluous in this case.
A value of type tBST is either Null or a mutable pointer. If it is Null it will remain Null. If it is non-Null, it will remain non-Null, but the actual pointer might change. That might well be what you intended, but I have my doubts. In particular, what tBST does not do, is to emulate C-style pointers (which are either null or really point somewhere). I suspect, though, that that was your intention.
The idiomatic way to emulate C-style pointers is to just use the built-in option type, like so:
type tBST = node option
A value of type node option is either None or Some n, where n is a pointer to a value of type node. You use tBST for mutable fields (of the record node), so you would effectively have mutable C-style pointers to nodes.
Here is what you probably had in mind:
type tree = node option ref
and node = {
  mutable left: tree;
  mutable key: int;
  mutable right: tree;
};;
let t0 : tree = ref None;;
let t1 : tree = ref (Some { left = ref None; key = 1; right = ref None; }) ;;
let create_node key tree =
  tree := Some { left = ref None; key; right = ref None; }
There is no need to have a separate type for the key, but you can keep one if you want it, and with the latest OCaml there is no runtime overhead for it.
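Building on the node option ref representation above, an in-place insertion might look like the following minimal sketch. The functions insert and to_list are illustrative additions, not part of the original answers, and duplicate keys are simply ignored here as an assumption.

```ocaml
type tree = node option ref
and node = { mutable left : tree; mutable key : int; mutable right : tree }

(* Insert [key] by mutating the reference that is passed in.
   The caller's tree is changed once the function returns. *)
let rec insert key (tree : tree) =
  match !tree with
  | None -> tree := Some { left = ref None; key; right = ref None }
  | Some n ->
    if key < n.key then insert key n.left
    else if key > n.key then insert key n.right
    (* equal keys: nothing to do *)

(* In-order traversal, handy for checking the result. *)
let rec to_list (tree : tree) =
  match !tree with
  | None -> []
  | Some n -> to_list n.left @ (n.key :: to_list n.right)
```

A caller creates an empty tree with ref None and passes that same reference to insert repeatedly, which gives the "modified after the function is over" behaviour the question asks for.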

How to return a new type with an updated value

If I want to change a value in a list, I return a new list with the new value instead of changing the value in the old list.
Now I have four types. I need to update the location value (l) in varEnd, but instead of changing the value in place, I need to return a new value with the updated location.
type varEnd = {
  v: ctype;
  k: varkind;
  l: location;
}
;;
type varStart = {
  ct: ctype;
  sy: sTable;
  n: int;
  stm: stmt list;
  e: expr
}
and sEntry = Var of varEnd | Fun of varStart
and sTable = (string * sEntry) list
type environment = sTable list;;
(A function where environment is the only parameter I can use:)
let allocateMem (env:environment) : environment =
I tried to use List.iter, but that only works by changing values in place, and these types are not mutable. I think List.fold would be a better option.
The biggest issue I have is that there are four different types.
I think you're saying that you know how to change an element of a list by constructing a new list.
Now you want to do this to an environment, and an environment is a list of quite complicated things. But this doesn't make any difference, the way to change the list is the same. The only difference is that the replacement value will be a complicated thing.
I don't know what you mean when you say you have four types. I see a lot more than four types listed here. But on the other hand, an environment seems to contain things of basically two different types.
Maybe (but possibly not) you're saying you don't know a good way to change just one field of a record while leaving the others the same. This is something for which there's a good answer. Assume that x is something of type varEnd and loc is the new location. Then you can say:
{ x with l = loc }
If, in fact, you don't know how to modify an element of a list by creating a new list, then that's the thing to figure out first. You can do it with a fold, but in fact you can also do it with List.map, which is a little simpler. You can't do it with List.iter.
Update
Assume we have a record type like this:
type r = { a: int; b: float; }
Here's a function that takes r list list and adds 1.0 to the b fields of those records whose a fields are 0.
let incr_ll rll =
  let f r = if r.a = 0 then { r with b = r.b +. 1.0 } else r in
  List.map (List.map f) rll
The type of this function is r list list -> r list list.
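The same nested List.map pattern can be applied to an environment. The sketch below is hedged: the question does not show ctype, varkind or location, so they are replaced with stand-in definitions, the Fun payload is simplified to a string, and the function name relocate is hypothetical.

```ocaml
(* Stand-in definitions; the real ones are not shown in the question. *)
type ctype = CInt
type varkind = Local
type location = int

type varEnd = { v : ctype; k : varkind; l : location }

(* The Fun payload is simplified to a string for this sketch. *)
type sEntry = Var of varEnd | Fun of string
type sTable = (string * sEntry) list
type environment = sTable list

(* Return a new environment in which every variable has location [loc].
   Nothing is mutated; function entries are passed through unchanged. *)
let relocate (loc : location) (env : environment) : environment =
  let update (name, entry) =
    match entry with
    | Var x -> (name, Var { x with l = loc })
    | Fun _ -> (name, entry)
  in
  List.map (List.map update) env
```

The outer List.map walks the list of tables, the inner one walks each table, and the record update builds a fresh varEnd, so the original environment stays intact.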

Weird behaviour with struct constructors

I've written a basic Node struct in D, designed to be used as a part of a tree-like structure. The code is as follows:
import std.algorithm: min;
alias Number = size_t;
struct Node {
    private {
        Node* left, right, parent;
        Number val;
    }
    this(Number n) {val = n;}
    this(ref Node u, ref Node v) {
        this.left = &u;
        this.right = &v;
        val = min(u.val, v.val);
        u.parent = &this;
        v.parent = &this;
    }
}
Now, I wrote a simple function which is supposed to give me a Node (meaning a whole tree) with the argument array providing the leaves, as follows.
alias Number = size_t;
Node make_tree (Number[] nums) {
    if (nums.length == 1) {
        return Node(nums[0]);
    } else {
        Number half = nums.length/2;
        return Node(make_tree(nums[0..half]), make_tree(nums[half..$]));
    }
}
Now, when I try to run it through dmd, I get the following error message:
Error: constructor Node.this (ulong n) is not callable using argument types (Node, Node)
This makes no sense to me - why is it trying to call a one-argument constructor when given two arguments?
The problem has nothing to do with constructors. It has to do with passing by ref. The constructor that you're trying to use
this(ref Node u, ref Node v) {...}
accepts its arguments by ref. That means that they must be lvalues (i.e. something that can be on the left-hand side of an assignment). But you're passing it the result of a function call which does not return by ref (so, it's returning a temporary, which is an rvalue - something that can go on the right-hand side of an assignment but not the left). So, what you're trying to do is illegal. Now, the error message isn't great, since it's giving an error with regards to the first constructor rather than the second, but regardless, you don't have a constructor which matches what you're trying to do. At the moment, I can think of 3 options:
1. Get rid of the ref on the constructor's parameters. If you're only going to be passing it the result of a function call like you're doing now, having it accept ref doesn't help you anyway. The returned value will be moved into the function's parameter, so no copy will take place, and ref isn't buying you anything. Certainly, assigning the return values to local variables so that you can pass them to the constructor as it's currently written would lose you something, since then you'd be making unnecessary copies.
2. Overload the constructor so that it accepts either ref or non-ref. e.g.
void foo(ref Bar b) { ... }
void foo(Bar b) { foo(b); } //this calls the other foo
In general, this works reasonably well when you have one parameter, but it would be a bit annoying here, because you end up with an exponential explosion of function signatures as you add parameters. So, for your constructor, you'd end up with
this(ref Node u, ref Node v) {...}
this(ref Node u, Node v) { this(u, v); }
this(Node u, ref Node v) { this(u, v); }
this(Node u, Node v) { this(u, v); }
And if you added a 3rd parameter, you'd end up with eight overloads. So, it really doesn't scale beyond a single parameter.
3. Templatize the constructor and use auto ref. This essentially does what #2 does, but you only have to write the function once:
this()(auto ref Node u, auto ref Node v) {...}
This will then generate a copy of the function to match the arguments given (up to 4 different versions of it with the full function body in each rather than 3 of them just forwarding to the 4th one), but you only had to write it once. And in this particular case, it's probably reasonable to templatize the function, since you're dealing with a struct. If Node were a class though, it might not make sense, since templated functions can't be virtual.
So, if you really want to be able to pass by ref, then in this particular case, you should probably go with #3 and templatize the constructor and use auto ref. However, personally, I wouldn't bother. I'd just go with #1. Your usage pattern here wouldn't get anything from auto ref, since you're always passing it two rvalues, and your Node struct isn't exactly huge anyway, so while you obviously wouldn't want to copy it if you don't need to, copying an lvalue to pass it to the constructor probably wouldn't matter much unless you were doing it a lot. But again, you're only going to end up with a copy if you pass it an lvalue, since an rvalue can be moved rather than copied, and you're only passing it rvalues right now (at least with the code shown here). So, unless you're doing something different with that constructor which would involve passing it lvalues, there's no point in worrying about lvalues - or about the Nodes being copied when they're returned from a function and passed into the constructor (since that's a move, not a copy). As such, just removing the refs would be the best choice.
