immutable functional data structure to represent graphs - graph

I (almost) fully understand the Zipper data structure for trees. However, in some publications I saw hints that it is also possible to use the Zipper idea to create immutable functional data structure for arbitrary graphs (which might have cycles as well).
What's the way to do it?
As soon as we have cycles, it means that any node can be reached via several paths. Hence, if I focus on a node, do some change to it, and move the focus away, I might later on come back to the same node via a different path, which means that it would be an 'old' version of the node, prior to the change made.
The only solution I came up with is to include to the context the list of changes to any node. Every time before the focus is changed to node X, it should be checked whether X is the member of the list of changes, and if so, it should be taken as the focused node.
If we also track the number of times N node X was copied from the list of changes, we can remove X from the list of changes, as soon as N = number of edges, inward to X.
Is there any better way to do it?

Related

How to "keep track" of user activity with functional programming?

tl;dr
In a program that calls a function onEnterFrame on each frame, how do you store and mutate state? For instance if you are making a level editor or a painting program where keeping track of state and making small incremental changes are tempting / enticing / inviting. What is the most performany way to handle such a thing with minimal global state mutations?
long version:
In a interactive program that accepts input from the user, like mouse clicks and key strokes, we may need to keep track of the state of the data model. For instance:
Are some elements selected?
Is the mouse cursor hovering over an element, which one?
How long is the mouse button held down? Is this a click or a drag?
We also, sometimes need make small changes to a large model:
In a level editor, we may need to add one wall to an existing large set of prefabs. You don't want to recreate the set, no?
Read Prof Frisby's mostly-adequate-guide so far, there are many functional solutions to issues that deal with extracting a piece of data from some source of input, performing computation on that data and passing the result to some output.
Sometimes an app let's the user interact and perform a sequence of mutations on data. For instance, what if a program let's the user draw (like Paint) on a canvas and we need to store the state of the painting as well as the actions that led to that state (for undo and logging/debugging purposes)?
What state is acceptable to store and what should we absolutely avoid?
Currently my conclusions is that we should never store state that we only need temporarily, we should pass it to the function that needs it directly.
But what if there are several functions that need a specific computation? Like the case in which we check if the mouse's cursor is hovering over a specific area, why would we want to recompute that?
Are there ways to further minimize mutations of global state?
Storing state isn't the problem. It is mutating global state that is the problem. There are solutions to handling this. One that comes to mind is the State Monad. However, I am not sure this is ideal for undoing operations. But it is a place to start.
If you just want to look at the problem as an initial state and a set of operations then you can think of the operations as a List that can be traversed (with the head being the latest operation). Undoing a set of n operations could be accomplished by traversing the first n elements of the list and cons-ing the inverse of these operations to the list.
That way you don't modify global state at all.

Storing doubly linked lists in Riak without a race condition?

We want to use Riak's Links to create a doubly linked list.
The algorithm for it is quite simple, I believe:
Let 'N0' be the new element to insert
Get the head of the list, including its 'next' link (N1)
Set the 'previous' of N1 to be the N0.
Set the 'next' of N0 to be N1
Set the 'next' of the head of the list to be N0.
The problem that we have is that there is an obvious race condition here, because if 2 concurrent clients get the head of the list, one of the items will likely be 'lost'. Any way to avoid that?
Riak is an eventually consistent system when talking about CAP theorem.
Provided you set the bucket property allow_multi=true, if two concurrent clients get the head of the list then write, you will have sibling records. On your next read you'll receive multiple values (siblings) and will then have to resolve the conflict and write the result. Given that we don't have any sort of atomicity this will possibly lead to additional conflicts under heavy write concurrency as you attempt to update the linked objects. Not impossible to resolve, but definitely tricky.
You're probably better off simply serializing the entire list into a single object. This makes your conflict resolution much simpler.

Finding end of linked list

I'm looking over interview questions, and I came across "How do you find out if a linked-list has an end? (i.e. the list is not a cycle)." It gives a solution (traverse it one and two nodes at a time, and see if the pointers are ever equal).
Couldn't we just keep the pointer that we start at and see if while traversing it, we ever hit that pointer again? Or will that not work?
That will not work: the linked list may contain a cycle that does not include the first pointer.
Keep in mind that a node in a linked list can be linked to by more than one other node!
Couldn't we just keep the pointer that we start at and see if
while traversing it, we ever hit that pointer again?
No. See the below case. You will just traverse the loop without ever hitting the start node of the list
Another way to find if there is a loop:
If you reverse the list, and remember the inital node, you will know that there is a cycle if you get back to the first node. While efficient, this solution changes the list and not suited for multithreaded applications.
Take two pointers one should traverse the list one node at a time and another should traverse the list 2 nodes at a time.
if at any point of time they meet each other(both the pointers refer to the same node).It means its a circular linked list and does not have an end.

Applying visitor on syntax tree

I am building a c interpreter. My AST uses the composite-pattern. To check semantics and perform actions, I wanna use the visitor-pattern. Now there's one problem. This is an grammar rule of the c-preprocessor: if-section = if-group [ elif-groups ] [ else-group ] endif-line. The visitor of if-section needs information about the child nodes, to know which groups have to be skipped. In the visitor-pattern, every "visit"-method returns void. So I can't get any information about these nodes (only with adding information to the nodes, but that's ugly ...). Are there any opportunities?
You've nailed the problem: you have to have additional information above and beyond the raw data that comprises the AST.
You can associate all of that extra information with just individual tree nodes: if you do that, you'll end up building what is called an attributed tree. In theory (and if you work at), you make this idea work completely. Your visitor may have to inspect/update the information associated with not only the AST node it is visiting, but that of key children and parents.
In practice, it is useful to build auxiliary data structures (e.g., symbol tables) which can consulted by the visitor (and updated) as it walks the tree. You end up with kind of degenerate attributed tree: you associate symbol table chunks with AST nodes that form scopes.
You've artificially constrained your visitor from returning any value; if you didn't do that, a child visitor could pass useful values to a parent visitor, enabling the parent to do less ad hoc reaching down the tree.
In your problem statement, you have not constrained your visitor from passing values down to children, so you can pass down useful values. An extremely useful value to pass is the symbol table associated with the surrounding scope, so that children visitors don't have to climb back up the tree to find the scoping node, and the associated symbol table.

How to realize persistence of a complex graph with an Object Database?

I have several graphs. The breadth and depth of each graph can vary and will undergo changes and alterations during runtime. See example graph.
There is a root node to get a hold on the whole graph (i.e. tree). A node can have several children and each child serves a special purpose. Furthermore a node can access all its direct children in order to retrieve certain informations. On the other hand a child node may not be aware of its own parent node, nor other siblings. Nothing spectacular so far.
Storing each graph and updating it with an object database (in this case DB4O) looks pretty straightforward. I could have used a relational database to accomplish data persistence (including database triggers, etc.) but I wanted to realize it with an object database instead.
There is one peculiar thing with my graphs. See another example graph.
To properly perform calculations some nodes require informations from other nodes. These other nodes may be siblings, children/grandchildren or related in some other kind. In this case a specific node knows the other relevant nodes as well (and thus can get the required informations directly from them). For the sake of simplicity the first image didn't show all potential connections.
If one node has a change of state (e.g. triggered by an internal timer or triggered by some other node) it will inform other nodes (interested obsevers, see also observer pattern) about the change. Each informed node will then take appropriate actions to update its own state (and in turn inform other observers as needed). A root node will not know about every change that occurs, since only the involved nodes will know that something has changed. If such a chain of events is triggered by the root node then of course it's not much of an issue.
The aim is to assure data persistence with an object database. Data in memory should be in sync with data stored within the database. What adds to the complexity is the fact that the graphs don't consist of simple (and stupid) data nodes, but that lots of functionality is integrated in each node (i.e. events that trigger state changes throughout a graph).
I have several rough ideas on how to cope with the presented issue (e.g. (1) stronger separation of data and functionality or (2) stronger integration of the database or (3) set an arbitrary time interval to update data and accept that data may be out of synch for a period of time). I'm looking for some more input and options concerning such a key issue (which will definitely leave significant footprints on a concrete implementation).
(edited)
There is another aspect I forgot to mention. A graph should not reside all the time in memory. Graphs that are not needed will be only present in the database and thus put in a state of suspension. This is another issue which needs consideration. While in suspension the update mechanisms will probably be put to sleep as well and this is not intended.
In the case of db4o check out "transparent activation" to automatically load objects on demand as you traverse the graph (this way the graph doesn't have to be all in memory) and check out "transparent persistence" to allow each node to persist itself after a state change.
http://www.gamlor.info/wordpress/2009/12/db4o-transparent-persistence/
Moreover you can use db4o "callbacks" to trigger custom behavior during db4o operations.
HTH
German
What's the exact question? Here a few comments:
As #German already mentioned: For complex object graphs you probably want to use transparent persistence.
Also as #German mentione: Callback can help you to do additional stuff when objects are read/written etc on the database.
To the Observer-Pattern. Are you on .NET or Java? Usually you don't want to store the observers in the database, since the observers are usually some parts of your business-logic, GUI etc. On .NET events are automatically not stored. On Java make sure that you mark the field holding the observer-references as transient.
In case you actually want to store observers, for example because they are just other elements in your object-graph. On .NET, you cannot store delegates / closures. So you need to introduce a interface for calling the observer. On Java: Often we use anonymous inner classes as listener: While db4o can store those, I would NOT recommend that. Because a anonymous inner class gets generated name which can change. Then db4o will not find that class later if you've changed your code.
Thats it. Ask more detailed questions if you want to know more.

Resources