Graph traversal algorithms are not limited to single computers, they can also be applied to large networks of computers to facilitate parallel processing, networking, and other distributed algorithms. In this article, we'll simulate a network of computers, discuss how to traverse them, and show code examples in multiple languages. Note: this article assumes you have a strong understanding of normal depth-first search and breadth-first search algorithms. If you do not, please visit our introduction to graph traversals page first.
Graphs with Distributed Nodes
In normal graph-traversal code examples, all of the nodes in the graph will be in memory and a single function will orchestrate the traversal. In this article, we will consider each node as if it were on its own computer, and the graph will be a network of these computers. Each node (computer) will only be able to communicate its child and parent nodes. Therefore, using a single function to orchestrate the graph traversal will not be possible. In the sections below, we'll simulate this type of network, then discuss how to traverse it.
Simulating a Distributed Graph
We will be simulating a distributed network of computers using graph nodes and message responders. A node will hold reference to its children node and has the ability to send messages to them. Each node can receive messages from its own parent nodes and respond to them. In our example, we will use message responders to decouple the functionality of responding to messages so we can implement that functionality in seperate classes instead of packing everything into the graph node class. In the code below, we've implemented the graph node class to hold a set of child nodes and a map of message types to message responders. When a child is added, the reverse connection will also be added, as we are working with undirected graphs. When a message is received, the appropriate message responder will be found in the map and invoked:
Our implementation is a simplification of what a real distributed network of computers would be. In real-life scenarios, we would also need to ensure the nodes are synchronized, messages are properly received and acknowledged, and handle network failures, but for this article, we will ignore those concerns.
Distributed Graph Traversal
Our graph traversal algorithm will be similar to a depth-first search algorithm. However, instead of having a single function orchestrate the traversal, each node will handle itself and its direct children. We will initiate the algorithm from some arbitrary start node, and when that node is concluded, the entire algorithm will be concluded. We will cover the implementation in more detail in the next section which outlines the algorithm to count the total number of nodes in the graph. To facilitate that functionality, we will introduce two abstract classes: one for responding to messages, and one for a type of message responder which will send its own types of requests and responses. The AbstractGraphMessageRequestorResponder encapsulates this functionality: the request type message will indicate to a node that it should begin some algorithm, and the response type message indicates that a response has been received. We will allow for a null-like response to be sent in certain situations to indicate that a node has refused to execute the algorithm. In the code below, we'll setup these abstract classes to lay foundation for the algorithms we'll be using in the next two sections:
Algorithm: Count the Nodes in the Graph
In the code below, the system will be counting the total number of nodes in the graph. It must account for the fact that there is no single function running the search, as the algorithm is "distributed" so each node only handles itself and its neighbors/children. It must also ensure that nodes are not counted more than once, or more generally, the algorithm doesn't enter an infinite cycle. To prevent loops, each node's responder will keep a flag alreadyReceivedRequest which will initially be sent to false. Once the node receives its initial request, that flag will be set to true, and if it receives subsequent requests, it will return a null-like response immediately instead of re-executing the algorithm. After the request is received, the node will send its own requests to its own children, keep a sum of their response values, and when all requests are received, this node will send its own response back.
When a request is received:
- If a request has already been received, respond with a null-like response and do not proceed.
- Mark that a request has been received.
- Mark that the total number of nodes is 1 (this current node).
- Mark that the number of responses received is 0 (to be used later).
- Store the requestor as the original requestor (to be used later).
- For each child of the node this requestor is attached to:
- If the child is the same as the original requestor:
- Increment the number of responses received.
- If that number is equal to the number of child nodes, return a response with the node count.
- Otherwise, send a request to this child node to get the number of nodes from that node.
When a response is received (from a child node):
- Increment the number of responses received.
- If the response is not null-like, add the response's number of nodes to this node's count.
- If the number of responses received is equal to the number of child nodes, return a response with the total node count.
Algorithm: Sum all Values of Nodes in the Graph
This algorithm is very similar to the one described above. The only difference is: instead of keeping node counts and incrementing by 1, we will be keeping sub-sums and incrementing by the value of each node. The code below contains a copy of most of the code above; in a system where we had many types of responders, we could further abstract this code so that only the different types of aggregation need to be implemented by each responder.
Running the Algorithm
To run the algorithm, we can send an initial request to any arbitrary start node and set the requestedFrom field to null. In the code below, we've picked aNode as the entry point. Once invoked, each node responder will output its value to the console, and the last value logged to the console for each responder will be the resulting value.
This code will output the text below to the console. Some intermediate results will be logged, then for the count and sum, the last line outputted will be the final answer.
Space and Time Complexity
We'll consider n to be the number of nodes and e to be the number of edges in the graph.
Complexity | Notes | |
---|---|---|
Space | O(n) | Each node keeps its own constant number of properties to record the algorithm's progress. |
Time | O(e) | Each node will be invoked once per each connection to it. It will return a null-like response every time after the first invocation, but the edge to that node will still be traversed a constant number of times. |