Locating the Algorithm You Need

By John Paul Mueller, Luca Massaron

Part of Algorithms For Dummies Cheat Sheet

The following table describes algorithms and algorithm types that you might find useful for various types of data analysis. (You can find discussions of all these algorithms in Algorithms For Dummies.)

Algorithm Description Helpful Link
A * Search The algorithm tracks the cost of nodes as it explores them using the equation: f(n) = g(n) + h(n), where:

n is the node identifier

g(n) is the cost of reaching the node so far

h(n) is the estimated cost to reach the goal from the node

f(n) is the estimated cost of the path from n to the goal

The idea is to search the most promising paths first and avoid expensive paths.

Standford.edu
Balanced Tree A kind of tree that maintains a balanced structure through reorganization so that it can provide reduced access times. The number of elements on the left side differs from the number on the right side by one at most. Webdocs
Bidirectional Search This technique searches simultaneously from the root node and the goal node until the two search paths meet in the middle. An advantage of this approach is that it’s time efficient because it finds the solution faster than many other brute-force solutions. In addition, it uses memory more efficiently than other approaches and always finds a solution. The main disadvantage is complexity of implementation. Planning.cs
Binary Tree This is a type of tree containing nodes that connect to zero (leaf nodes), one, or two (branch nodes) other nodes. Each node defines the three elements that it must include to provide connectivity and store data: data storage, left connection, and right connection. cs.cmu.edu
Breadth-First Search This technique begins at the root node, explores each of the child nodes first, and only then moves down to the next level. It progresses level by level until it finds a solution. The disadvantage of this algorithm is that it must store every node in memory, which means that it uses a considerable amount of memory for a large number of nodes. This technique can check for duplicate nodes, which saves time, and it always comes up with a solution. Khan Academcy
Brute Force This is a technique of problem solving in which someone tries every possible solution, looking for the best problem solution. Brute-force techniques do guarantee a best-fit solution when one exists but are so time consuming to implement that most people avoid them. Igm.univ
Depth-First Search This technique begins at the root node and explores a set of connected child nodes until it reaches a leaf node. It progresses branch by branch until it finds a solution. The disadvantage of this algorithm is that it can’t check for duplicate nodes, which means that it could traverse the same node paths more than once. In fact, this algorithm may not find a solution at all, which means that you must define a cutoff point to keep the algorithm from searching infinitely. An advantage of this approach is that it’s memory efficient. Hacker Earth
Divide and Conquer This is a technique of problem solving in which the problem is broken into the smallest possible pieces and solved using the simplest approach possible. This technique saves considerable time and resources when compared to other approaches, such as brute force. However, it doesn’t always guarantee a best-fit result. Khan Academy
Dijikstra This is an algorithm used for finding the shortest path in a directed, weighted (having positive weights) graph. Geeks for Geeks
Graph A graph is a sort of a tree extension. As with trees, you have nodes that connect to each other to create relationships. However, unlike binary trees, a graph can have more than one or two connections. In fact, graph nodes often have a multitude of connections. You see graphs used in places like maps for GPS and all sorts of other places for which the top-down approach of a tree won’t work. Tutorials
Greedy Algorithms Thistechnique of one of problem solving in which the solution relies on the best answer for every step of the problem-solving process. Greedy algorithms generally make two assumptions:

Making a single optimal choice at a given step is possible.

By choosing the optimal selection at each step, finding an optimal solution for the overall problem is possible.

Tutorials
Greedy Best-First Search (BFS) The algorithm always chooses the path that is closest to the goal using the equation: f(n) = h(n). This particular algorithm can find solutions quite quickly, but it can also get stuck in loops, so many people don’t consider it an optimal approach to finding a solution. Centurion2
Hashing This is a method of predicting the location of a particular data item in the data structure (whatever that structure might be) before actually looking for it. This approach relies on the use of keys placed into an index. A hash function turns the key into a numeric value that the algorithm places into a hash table. A hash table provides the means to create an index that points to elements in a data structure so that an algorithm can easily predict the location of the data. Tutorials
Heap This is a sophisticated tree that allows data insertions into the tree structure. The use of data insertion makes sorting faster. You can further classify these trees as max heaps and min heaps, depending on the tree’s capability to immediately provide the maximum or minimum value present in the tree. Tutorials
Heuristics This is a technique of problem solving that relies on self-discovery and produces sufficiently useful results (not necessarily optimal, but good enough) to address a problem well enough that a better solution isn’t necessary. Self-discovery is the process of allowing the algorithm to show you a potentially useful path to a solution (but you must still count on human intuition and understanding to know whether the solution is the right one). Northwest.edu
MapReduce This is a framework for making algorithms work using computations in parallel (using multiple computers connected together in a network), allowing algorithms to complete their solutions faster. Hadoop Apache
Mergesort Mergesort is a general-purpose, comparison based method of sorting data. It depends on a divide-and-conquer approach to performing its task. Geeks for Geeks
Nash Equilibrium This is a game theory in which the other players know the equilibrium strategy for the other players, so no one has anything to gain by changing his or her personal strategy. This theory sees use in any hostile situation in which the player must account for the decisions made by all of the other players in order to win the game. Khan Academy
PageRank PageRank is an algorithm for measuring the importance of a node in a graph. This algorithm is at the root of the Google’s core algorithms for powering relevant searches to users. Princeton.edu
Pure Heuristic Search This algorithm expands nodes in order of their cost. It maintains two lists. The closed list contains the nodes that it has already explored, and the open list contains the nodes it must yet explore. In each iteration, the algorithm expands the node with the lowest possible cost. All its child nodes are placed in the closed list and the individual child node costs are calculated. The algorithm sends the child nodes with a low cost back to the open list and deletes the child nodes with a high cost. Consequently, the algorithm performs an intelligent, cost-based search for the solution. World of Computing
Quicksort This is a general-purpose sorting strategy based on partitioning arrays of data into smaller arrays. It depends on a divide-and-conquer approach to performing its task. Tutorials
Unbalanced Tree This is a tree that places new data items wherever necessary in the tree without regard to balance. This method of adding items makes building the tree faster but reduces access speed when searching or sorting. Quora