Load balancing using the heat equation - math

I didn't think this was mathsy enough for mathoverflow, so I thought I would try here. Its for a programming problem though, so its sorta on topic.
I have a graph (as in the maths one), where I have a number of vertices, A,B,C.. each with a load "value", these are connected by some arbitrary topology (there is at least a spanning tree). The objective of the problem is to transfer load between each of the vertices, preferably using the minimal flow possible.
I wish to know the transfer values along each edge.
The solution to the problem I was thinking of was to treat it as a heat transfer problem, and iteratively transfer load, or solve the heat equation in some way, counting the amount of load dissipated along each edge. The amount of heat transfer until the network reaches steady state should thus yield the result.
Whilst I think this will work, it seems like the stupid solution. I am wondering if there is a reference or sample problem that someone can point me to -- I am unsure what keywords to search for.
I could not see how to couple the problem as either a simplex problem or as a network flow problem -- each edge has unlimited capacity, and so does each node. There are two simultaneous minimisation problems to solve, so simplex seems to not apply??

If you have a spanning tree, then there is an O(n) solution.
Calculate the average value. (In the end, every node will have this value.) Then iterate over the leaves: calculate the change in value necessary for that leaf (average - value), add that change to the leaf, assign it (as flow) to the edge that connects that leaf to the tree, and subtract it from the other node. Remove that leaf and edge from the tree (not from the graph of course); the other node to may become a new leaf, if it has only one remaining edge in the tree.
When you reach the last node, if you've done the arithmetic right it will end up with the average value, just like all the other nodes.

Related

Given an unsorted list of edges, how would I detect if they form a cycle (Graphs)

I am given a list of edges (each of which have a 'From' and 'To' properties that specify which vertices they connect).
I want to either return a null if they don't form the cycle in this (undirected) graph, or return the list forming a cycle.
Does anyone know how I would go about such a problem? I am clueless.
The way I've been taught to do this involves storing a list of visited vertices.
Navigate through the graph, storing each vertex and adding it to the list. Each turn, compare the current vertex to the list - if it is present, you've visited it before, and are therefore in a cycle.
Algorithms of this type are called Graph Cycle Detection Algorithms. There are some intricacies about which algorithms to select based on the needs of the application or the context of the problem. For example, do you want to find the first cycle, the shortest cycle, longest cycle, all of the cycles, is the graph unidirectional or bidirectional, etc.?
There are numerous cycle detection algorithms to select from, depending on the need and the allowable computational complexity and the nature of the cycle (e.g. finding first, longest, etc.). Some common algorithms include the following:
Floyd's Algorithm
Brent's Algoritm (see also here)
Tarjan's Strongly Connected Components Algorithm (see also this Stack Overflow post)
The specific algorithm you select will depend on your need and how efficient the algorithm must be. If serious efficiency is needed, I would suggest looking through some scholarly articles on the topic and compare and contrast some of the trade-offs of various algorithms.

Alloy model - how to display cycle in visualization of graph?

We have an Alloy model intended to find possible deadlocks in a system. It is set up such that when a counterexample is found, that implies the system being modeling may have a deadlock, i.e. a circular dependency between nodes in the graph. The problem is the visual model of the graph is so complicated, it's nearly impossible to find the cycle representing the deadlock. If there were a way to highlight the cycle, or at least perhaps highlight arcs in the graph that are directed "up" rather than "down" this would help us visualize things better (since in the model we have, a deadlock-free system has all arcs directed in a downward direction). Is there a way to highlight or selectively plot nodes and arcs that create the counterexample?
The first thing that comes to my mind is that when Alloy shows an instance of a predicate, the various arguments to the predicate can be styled specially. So you might try (1) defining a predicate that is the inverse of your assertion, i.e. one that holds when a deadlock is present and which assigns a named role to the nodes in the cycle, and (2) setting the style to display those nodes in a distinct color or shape. You may be able to hide everything that's not in the cycle, or gray it out.

Finding optimal order of all nodes to be visited in a graph

The following problem comes from geography, but I don't kown of any GIS method to solve it. I think it's solution can be found with graph analysis, but I need some guidance to think in the right direction.
There is a geographical area, say a state. It is subdivided in several quadrants, which are subdivided further, and once again. So its a tree structure with the state as root, and 3 levels of child nodes, each parent having 4 childs. But from the perspective of the underlying process its more like a completed graph, since in theory a node is directly reachable from each other node.
The subdivisions reflect map sheet boundaries at different mapscales. Each mapsheet has to reviewed by a topographer in a time span dependend on the complexity of the map contents.
While reviewing the map, the underlying digital data is locked in the database. And as the objects have topological relationships with objects of neighboring map sheet (eg. roads crossing the map boundaries), all 8 surrounding map sheets are locked also.
The question is, what is the optimal order in which the leafs (on the lowest level) should be visited to satisfy following requirements:
each node has to be visited
we do not deal with travel times but with the timespan a worker spent at each node (map)
the time spent at a node is different
while the worker is at a node, all adjacent nodes cannot be visited; this is true also for other workers too; they cannot work on a map side by side with a map already being processed
if a node has been visited, other nodes having the same parent should be prefered as next node; this is true for all levels of parents
Finally for a given number of nodes/maps and workers we need an ordered series of nodes, each worker visites to minimize the overall time, and the time for each parent also.
After designing the solution the real work begins. We will recognize, that the actual work may need more or less time, than expected. Therefore it is necessary to replay the solution up to a current state, and design a new solution with slightly different conditions, leading to another order of nodes.
Has somebody an idea which data structure and which algorithm to use to find a solution for such kind of problem?
Not havig a ready made algorithm, but may be the following helps devising one:
Your exakt topologie is not cler. I assume from the other remarks,
you are targeting a regular structure. In your case a 4x4 square.
Given the restriction that working on a node blocks any adjacient node can be used to identify a starting condition for the algorithm:
Put a worker to one corner of the total are and then put others
at ditance 2 from this (first in x direction and as soon as the side is "filled" with y direction . This will occupy all (x,y) nodes (x,y in 0,2,..,2n where 2n <= size of grid)
With a 4x4 area this will allow a maximum of 4 workers, and will position a worker per child node of each 2 level grid node)
from this let each worker process (x,y),(x+1),(y+1),(x+1,y). This are the 4 nodes of a small square.
If a worker is done but can not proceed to the next planned node, you may advance it to the next free node from the schedule.
The more workers you will have, the higher the risk for contention will be. If you have any estimates on the expected wokload per node,
then you may prefer starting with the most expensive ones and arrange the processing sequence to continue with the ones that have the highest total expected costs.

How to compute the average(or sum) of node values in a network?

Consider a network(graph) of N nodes and each of them is holding a value, how to design a program/algorithm (for each node) that allows each node to compute the average(or sum) of all the node values in the network?
Assumptions are:
Direct communication between nodes is constrained by the graph topology, which is not a complete graph. Any other assumptions, if necessary for your algorithm, is allowable. The weakest one I assume is that there's a loop in the graph that contains all the nodes.
N is finite.
N is suffiently large such that you can't store all the values and then compute its average (or sum). For the same reason, you can't "remember" whose value you've received (thus you can't just redistributing values you've received and add those you've not seen to the buffer and get a result).
(The Tags may not be right since I don't know which field this kind of problems are in, if it's some kind of a general problem.)
That is an interesting question, here some assumptions I've made, before I present a partial solution:
The graph is connected (in case of a directed graph, strongly connected)
The nodes only communicate with their direct neighbours
It is possible to hold and send the sum of all numbers, this means the sum either won't exceed long or you have a data structure sufficiently large, which it won't exceed
I'd go with depth first search. Node N0 would initiate the algorithm and send it's value + the count to the first neighbour (N0.1). N0.1 would add it's own value + count and forward the message to the next neighbour (N0.1.1). In case the message comes back to either N0 or N0.1 they just forward it to another neighbour of theirs. (N0.2 or N0.1.2).
The problem now is to know, when to terminate the algorithm. Preferably you want to terminate it as soon as you've reached all nodes, and afterwards just broadcast the final message. In case you know how many nodes there are in the graph, just keep on forwarding it to the next node, until every node will be reached eventually. The last node will know that is had been reached (it can compare the count variable with the number of nodes in the graph) and broadcast the message.
If you don't know how many nodes there are, and it's and undirected graph, than it will be just depth first implementation in a graph. This means, if N0.1 gets a message from anyone else than N0.1.1 it will just bounce the message back, as you can't send messages to the parent when performing depth first search. If it is a directed graph and you don't know the number of nodes, well than you either come up with a mathematical model to prove when the algorithm has finished, or you learn the number of nodes.
I've found a paper here, proposing a gossip based algorithm to count the number of nodes in a dynamic network: https://gnunet.org/sites/default/files/Gossipico.pdf maybe that will help. Maybe you can even use it to sum up the nodes.

Can I use SQLite to model arbitrary graphs (i.e. a logical map with cycles)?

I'm new to SQL and learning about Adjacency Lists, Nested Sets, Closure Tables, but from what I understood, these solutions usually apply to acyclic data.
I'm aware that this sort of problem may be better suited to a graphical database engine such as Neo4j, and I am exploring that also. But for this question, I specifically want to know if I can achieve this goal in SQLite.
Before running off with a possible answer for this, please help me understand how to better define or illustrate the problem. Once the problem definition is refined, then point me in the right direction (technique, reference material) and let me try to figure it out.
Objectives:
Maintain a list of areas and how they are connected.
Areas can have different types: Country, Highway, State, City, Neighborhood.
Areas can be connected in cycles (undirected).
Areas can have multiple exits.
Maintain a weighted list from one exit to another, within the area.
Extract optimal path from one area to another (from this neighborhood to nearest highway).
Assumptions:
Will use SQLite 3 (newest version).
Small data set ( < 1,000 areas and connections, < 5s DB creation).
Relatively static ( < 5 inserts or updates/year ).
May be simpler to recreate database from scratch than update?
Highways are areas, not connectors.
Streets are logical connectors, no length, no weight.
Areas and connections are like a house with many rooms with multiple doors. The doors connect the rooms. There is no traversal weight going through a door. The weight in selecting a door comes from the distance between the doors. A hallway is like an extended door, so it has a weight and is considered a type room. A room may have a large size, but if the only two doors are near each other, it may have a small weight. it's not the size of the room that counts for my purposes, but the distance between the doors.
As always, thank you for taking the time to read, and for constructive comments.
Yes, it is possible to use SQLite to store this kind of data.
It is not practical, and you may have performance issues. If you plan to store huge amount of such data and want a well scalable solution, you should go for some graph DB.
If you are gong to store ~1000 nodes, that can work with simple realtions in SQLite.
Especially since you are going to have very little number of updates, you could pre-calculate the distances. So you don't have to actually recalculate it each time, but just load from the DB.
I think you should represent your problem as a graph.
Nodes could be the "doors" and edges the distances between them.
You could store this easily in relational database. (Areas(Id,Name), Doors(Id,Area1,Area2) DoorDoorDistance(Door1, Door2, Distance))
If you have stored these data, you can calculate shortest path from every door to every other. You could store this in a new table. (Distances(Door1,Door2,Path, Distance))
To calculate shortest path you can find different algorithms:
Shortest path algorithms
After this you have the shortest path between each pair of doors.
The only question from now is witch door to take from your starting area to which door in your destination area.
If you don't want to be this precise you just take the one with the shortest path. Otherwise you have to maintain door distances from area starting points.
A; You can can assume that you start from the center of the area, so you can store door distances from the center
B; You can be more precise, by storing exact door locations and calculating door distances from an exact starting point.
In both cases you should select door with the lowest cost, both in the starting area and the destination area:
Total cost: (Walk to door distance) + (starting Door to destination Door Path) + (Walk to destination in the destination area)
I would do this like this. I hope I helped, have fun!

Resources