Generic Search Algorithms : Generalised steps for any algorithm function Search(graph,start,goal): 0. Initialize agenda [[start]] extended_list = [] while agenda is not empty: 1. Path = agenda.pop(0) 2. If is_path_to_goal(path,goal): return path 3. Otherwise extend current path if not already extended For each connected node: make new path(Don't add path with loop) 4. Add new_path to agenda and reorganize agenda return failHere,
- Agenda : Keeps track of all paths under consideration and the way it is maintained is the key to difference between most of the search algorithms.
- Loops in paths : No path with loop should be created.
- Extended List : List of nodes that has extended in step 3. Use of extended list is optional but it provides optimization. This optimization can be applied to any search algorithm.
- Backtracking : Possibility of going back to other partial path is called backtracking.
- Exiting the Search : Non optimal searches may actually exit once it finds path with goal node. For optimal, search must exit only when path is the first removed from agenda ( which would be best possible solution depending on cost)
Search algorithms can be informed or uninformed search. If there is a function f(x), which is also called evaluation function, which helps in guiding search algorithm to the result, the search is called informed search. Hence if there is no evaluation function used for search, the search algorithm is called uninformed search. Breadth First Search, Depth First Search and British Museum are considered to be uninformed search algorithms while all the remaining search algorithms are informed search algorithms.
If algorithm finds only one path from source to destination, it is incomplete search. This will give the first ever result it encountered. But if algorithm is searching for all paths from source to destination, it is complete search. Complete search algorithms tries to find best possible path available from source to destination. Similarly , if algorithm is returning best solution, it is called optimal solution. This means, only the complete search algorithms can be optimal. Non Complete search can find all solutions but not necessarily returns optimal solution
There are many different search algorithms exist, most common of them mentioned as below:- Breadth First Search :
- Properties : Uninformed, Non Optimal , complete
- Add all new paths to the back of agenda like a queue (FIFO)
- Depth First Search :
- Properties : Uninformed, Non Optimal, Incomplete
- Add all new paths to the front of agenda like a stack (LIFO)
- Best First Search :
- Properties : Depending on definition of f(x), non optimal and incomplete
- Required Parameter : f(x) to sort the entire agenda
- Keep entire agenda sorted by f(x)
- n Best First :
- Properties : Non optimal, Incomplete
- f(x) to sort entire agenda by n = maximum size of the agenda
- Keep entire agenda sorted by f(x) and only keep the top n
- Hill Climbing :
- Properties : Non Optimal, Incomplete with Heuristic
- Keep only newly added paths sorted by f(x)
- Add sorted new path to the front of agenda
- Beam Search :
- Like BFS, but expand search node in f(x) order. Incomplete for small k but complete for k = infinity. Non optimal. If k = 1, Beam Search is like Hill climbing without backtracking.
- Parameter : Beam width K, f(x) to sort top paths by
- Keep only k top paths that are of length of n and sorted by f(x)
- British Museum :
- Properties : Brutally exhaustive, non informed, complete
- Implemented using Breadth first enumeration of all paths.
- Branch and Bound :
- Properties : Optimal
- g(x) = c(s,x) = cost of path from s to node x. f(x) = g(x) + 0. Keep sorting paths by x.
- A* without extended list :
- Properties : Optimal if h is admissible.
- f(x) = g(x) + h (x, g) where h(x, g) is estimated cost from x to g.
- sort paths by f(x)
- A* with extended list:
- Properties : Optimal if h is consistent
- f(x) = g(x) + h(x)
- sort paths by f(x)
- Admissible Heuristics : For all nodes x in graph, h(x) <= c(x, g) that means heuristic is underestimate of actual cost to goal from x.
- Consistent Heuristics :
- For edges in an undirected graph, where m is connected to n |h(m) - h(n)| <= c(m, n)
- For edges in directed graph, n is descendent of m or m -> n, h(m) - h(n) <= c(m, n)
- Thus for each edge, h value of edge <= actual cost of edge.
- Please remember, if h is consistent then it is admissible but if h is admissible, it is not necessarily consistent. An admissible heuristic can be made consistent by using pathmax algorithm.
- Pathmax algorithm :
- When you are extending nodes, if there exist and edge that is not consistent (that means h(m) - h(n) > c(m, n)) make it consistent by setting end h(n) value to h(m). Hence difference h(m) - h(n) becomes zero and consistent.
- Always remember , more closely h(x) approximates the actual cost to goal, the faster A* will find the solution.