The *A** algorithm is an efficient algorithm used for path finding. In this article, we'll set ourselves up with a maze specification where we'll search for paths, then introduce a few simple path finding algorithms before proceeding to *A**. We'll include all relevant code samples along the way in multiple programming languages. Then, we'll look at the space and time complexity of each path finding solution. At the bottom, we have an interactive demo which will generate random mazes and use the different maze solvers to visualize each approach.

### A* Algorithm

The **A* Algorithm** (A Star Algorithm) is a path, or graph, search algorithm which uses a *heuristic* function to decide which possible paths to visit first, thus reducing the number of possibilities which need to be explored before finding the optimal path. In the sections below, we'll create a maze, then find paths with *depth-first* search and *breadth-first* search to set the stage for *A**, which can be considered a modification of *breadth-first* search.

### Maze Specification

We will use a small maze class to test the A* algorithm. The maze will consist of instances of a class *MazeCell* stored in a 2D array. Each cell will initially be marked as *blocked* or *opened*. A blocked cell corresponds to a cell which has a wall and thus cannot be traversed. As a maze solver traverses the maze, it will mark cells as *traversed*. The **Maze** class will hold the cells and provide functionality to get a particular cell, and get the cells which are accessible in one step from the current cell.

We'll also define some interfaces and base classes for maze solutions. In the sections below, we'll create different kinds of maze solvers which will use the interfaces here to provide a uniform structure.

### Solving with Depth-First Search

If we consider the maze as a *graph*, each cell as a *node*, the top-left cell as the entry node, and the bottom right as the target, we can use graph-traversal algorithms to find a path through the maze. Any cell which is blocked can be considered a disconnected node in the graph, or as a node which does not exist at all. In this graph, a path from the start cell to the goal cell will be one or more of the paths from the root node to a leaf node. A *depth-first search* can find such a path in a relatively small number of steps, as it will continue down a single direction until that direction is no longer valid, move in a different direction, and repeat. However, the first path to the goal found will most likely have many more cells than the shortest path. Note that while the path may not be the shortest, *depth-first search* is still guaranteed to yield a path if any exist. In the implementation below, we'll use an iterative depth-first search with a stack. The steps can be summarized as:

- Create a stack to keep track of nodes which need to be visited.
- Create a set to keep track of nodes which we've already processed.
- While the stack is not empty:
- Pop from the stack and set that value as the
*currentNode*. - If the
*currentNode*is the goal / end of map, stop. - Otherwise, for each non-processed next node (maze cell which can be reached in one move):
- Push this node to the nodes which need to be visited in the stack, and mark the
*currentNode*as its parent in the path. - Mark this node as being visited in the set.
- If the path exists, work backwards with the parent nodes starting with the goal node to determine the path from start to end.

The full implementation is given below:

### Solving with Breadth-First Search

We can also find a path using a *breadth-first search*. The implementation itself looks almost exactly the same as that of the *depth-first search*, with one important exception: we will be using a *queue* instead of a *stack* as the data structure to store and retrieve nodes which must be visited. Unlike the *depth-first search*, this approach is guaranteed to find the shortest path. However, it will need to potentially take a large number of steps, and in the worst case scenario, tough every cell before finding the path. The steps can be summarized as:

- Create a queue to keep track of nodes which need to be visited.
- Create a set to keep track of nodes which we've already processed.
- While the stack is not empty:
- Dequeue from the queue and set that value as the
*currentNode*. - If the
*currentNode*is the goal / end of map, stop. - Otherwise, for each non-processed next node (maze cell which can be reached in one move):
- Push this node to the nodes which need to be visited in the queue, and mark the
*currentNode*as its parent in the path. - Mark this node as being visited in the set.
- If the path exists, work backwards with the parent nodes starting with the goal node to determine the path from start to end.

The full implementation is given below:

### Solving with A*

The **A* Algorithm** is similar to the *breadth-first search*, but has a few main differences:

- The traversal keeps track of
*costs*based on the number of steps in the path so far. - Instead of using a
*queue*to store nodes to be visited, it uses a*min priority queue*and uses a*heuristic*+*cost*to determine which nodes to visit first. - Instead of using a
*set*to keep track of previously visited cells, it uses a*map*to also map previously visited nodes to their*costs*. - The algorithm uses a
**heuristic**, or ranking function, to help select nodes which will yield a solution sooner.

The **heuristic** function can be any function which accepts a possible next-state of the program, such as one of the child cells of the current cell being traversed (the cells which can be reached in one step), and returns a number indicating a rank/cost. This heuristic is denoted as *h(n)*. The *cost* is determined by the number of steps taken in the path currently being explored and is denoted as *g(n)*. The variable *n* itself is the child cell in question. The sum value: *f(n) = g(n) + h(n)* will be used by the priority queue to select which nodes should be visited first. Smaller values of *f(n)* will be given favor over larger values; thus, the priority queue will return nodes whose *f(n)* values are smaller before returning those which are larger.

In this application, the *heuristic* function will be a function of distance: it will return a number corresponding to the distance between any particular cell and the exit cell of the maze. This means that we will be making choices about which nodes to visit next based on how far we've travelled already + how close each next node is to the goal state. Moving towards the goal will be favored and moving away will not be. Since our maze only permits left, right, up, and down movements, no diagonals, we will use a function called the **Manhattan distance**. This distance function does not consider diagonal directions, only left-right and up-down. It is named after the fact that there are very few diagnoal streets in Manhattan. The function can be calculated as: for points *a* and *b*: *dist = abs(a.x - b.x) + abs(a.y - b.y)*. In other words, it is thte sum of the horizontal and vertical distances between the two points. This heuristic is considered **admissible**, which means that the heuristic will never over-estimate the cost of reaching the goal.

With the heuristic and priority queue in place, the A* algorithm will effectively perform a breadth-first type search, but because it favors better performing nodes over others, it will reach the optimal solution in fewer steps than breadth-first, at the expense of keeping track of costs and performing heuristic calculations. In cases where there is more than one optimal solution; i.e. there is more than one path with the shortest number of steps, this algorithm may return a different path than breadth-first, but both paths will be of the same length.

**The steps can be summarized as**:

- Create a priority queue to keep track of nodes which need to be visited.
- Create a map to keep track of nodes which we've already processed and their costs.
- While the stack is not empty:
- Dequeue from the priority queye and set that value as the
*currentNode*. - If the
*currentNode*is the goal / end of map, stop. - Otherwise, for each next node (maze cell which can be reached in one move):
- If this node is not yet visited, or the current number of steps is less than that in the map, proceed with the steps below:
- Push this node to the nodes which need to be visited in the priority queue, and mark the
*currentNode*as its parent in the path. - Mark this node as being visited in the map and associate the current number of steps, which is the number of steps associated with
*currentNode*+ 1. - If the path exists, work backwards with the parent nodes starting with the goal node to determine the path from start to end.

The full implementation is given below:

### Space and Time Complexity

In the table below, we'll consider complexity in terms of the width and height of the 2D maze. In all cases, we will have at most *O(n * m)* extra storage used (not including the maze itself): some nodes will be stored in the nodes which are queued (or stacked) for vising, and some nodes to mark nodes which have already been visited. For time complexity, we will visit each node exactly once, except for A*, where a node might be revisited if a more optimal path is found, so the time complexity will also be *O(n * m)*.

Space Complexity | Time Complexity | Notes | |
---|---|---|---|

Depth-First Search | `O(w * h)` | `O(w * h)` | Fewer steps than the others but not likely to find the shortest path. |

Breadth-First Search | `O(w * h)` | `O(w * h)` | More steps than the others but guaranteed to find the shortest path. |

A* Search | `O(w * h)` | `O(w * h)` | Fewer steps than breadth-first search, but more space is used, as intermediate node objects are generated during the search. Guaranteed to find the shortest path. |

### Interactive Demo

In the small application below, we'll showcase the different types of searches, the paths they generate, and the number of steps it took to get there. The caption shows each color and its corresponding strategy. Cells whose path is included in more than one strategy will have more than one color inside of it. If no colors appeared, the generated map does not have any paths from the top left to the bottom right.