Given a directed graph, how do we detect if it has a cycle?
You probably already know one approach: run a toposort algorithm on the graph and check whether it fails. The toposort fails if and only if the graph contains a cycle, so this correctly solves our problem! But if you were asked to actually exhibit a cycle, it could get tricky, depending on the toposort algorithm used.
But it's still worthwhile to discuss additional approaches to this problem. For instance, a simple algorithm arises from the following insight: there is a cycle if and only if there is a back edge in the DFS forest. (Why?) Thus, we can detect a cycle by performing a DFS (like above) and stopping once we find a back edge! Another advantage of this is that it's easy to actually find a cycle if one is detected. (How?) In fact, by pursuing this idea further, you can use a DFS to actually extract a toposort of a DAG: Just order the nodes by decreasing finishing time! Think about why that works.
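As a concrete sketch of this idea (in Python, assuming the graph is given as a map from each node to a list of its successors), here is a DFS that stops at the first back edge and extracts a cycle by following parent pointers:

```python
def find_cycle(adj):
    """Return a list of nodes forming a cycle, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited / on the current DFS path / finished
    color = {u: WHITE for u in adj}
    parent = {}

    def dfs(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:      # back edge u -> v: v is an ancestor on the path
                cycle = [u]           # walk parent pointers from u back up to v
                while cycle[-1] != v:
                    cycle.append(parent[cycle[-1]])
                cycle.reverse()
                return cycle
            if color[v] == WHITE:
                parent[v] = u
                result = dfs(v)
                if result is not None:
                    return result
        color[u] = BLACK              # u is done; it cannot lie on an undiscovered cycle
        return None

    for u in adj:
        if color[u] == WHITE:
            result = dfs(u)
            if result is not None:
                return result
    return None
```

The parent-pointer walk is what makes it easy to report the cycle itself, not just its existence.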
One can also detect (and find) a cycle using BFS, although we will leave it to you to discover.
Let's now turn our attention to a special kind of graph. Specifically, let's only consider graphs where each node has exactly one outgoing edge. Let's call such graphs function graphs because on such a graph, we can define a function $f : V \rightarrow V$ where $f(x) = y$ if and only if $x \rightarrow y$ is an edge. (We can also consider more general graphs with at most one outgoing edge per node. We can convert such graphs into function graphs by adding a new node, say $\mathrm{trash}$, and pointing all nodes without an outgoing edge to $\mathrm{trash}$ (including $\mathrm{trash}$ itself).) Since there is exactly one outgoing edge from each node, $f$ is well-defined. Conversely, every function $f : S \rightarrow S$ corresponds to a function graph whose node set is $S$ and whose edges are $\{(x, f(x)) : x \in S\}$ (we're implicitly allowing self-loops here, but that's fine).
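Here is a small sketch of the $\mathrm{trash}$-node conversion just described, in Python, representing a graph with at most one outgoing edge per node as a map from each node to its successor (or `None` if it has no outgoing edge):

```python
def to_function_graph(succ):
    """Convert a graph with at most one outgoing edge per node into a
    function graph: nodes without a successor are pointed at a fresh
    'trash' node, which points to itself."""
    f = {}
    trash = object()                  # a fresh node, guaranteed distinct from all others
    for x, y in succ.items():
        f[x] = y if y is not None else trash
    f[trash] = trash                  # trash points to itself
    return f
```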
Now, starting at any node, there's only one path we can follow. In terms of $f$, starting at some node $x$ and following the (only) path corresponds to iteratively applying $f$ on $x$, thus, the sequence of nodes we visit is:
\[ (x, f(x), f(f(x)), \ldots, f^{(n)}(x), \ldots)\]
(Here, $f^{(n)}(x)$ is $f$ applied to $x$ a total of $n$ times.)
Now, if we assume that our node set is finite, then (by the pigeonhole principle) this will eventually repeat, i.e., there will be indices $0 \le i \lt j$ such that $f^{(i)}(x) = f^{(j)}(x)$. (Note that this is no longer true if the graph is infinite. Why?) In fact, once this happens, the subsequence $(f^{(i)}(x), f^{(i+1)}(x), \ldots, f^{(j-1)}(x))$ will repeat forever. This always happens regardless of which node you start at.
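For a concrete illustration, take $f(x) = (x^2 + 1) \bmod 10$ and start at $x = 3$; with only $10$ possible values, the pigeonhole principle forces a repeat quickly:

```python
def iterate(f, x, steps):
    """Return the sequence (x, f(x), f(f(x)), ...) of length steps + 1."""
    seq = [x]
    for _ in range(steps):
        x = f(x)
        seq.append(x)
    return seq

f = lambda x: (x * x + 1) % 10
print(iterate(f, 3, 8))   # -> [3, 0, 1, 2, 5, 6, 7, 0, 1]
```

Here the value $0$ reappears at index $7$, and the block $(0, 1, 2, 5, 6, 7)$ repeats forever.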
This gives rise to a natural question: Given a starting point $x$, when is the first time that a repeat happens? Furthermore, how long is the cycle? To make the question more interesting, suppose we don't know anything about $f$ apart from being able to evaluate it at any given node.
We can formalize the cycle-finding problem as follows: Given a function $f$ with the above properties and a starting point $x$, compute $s_\text{cycle}$, the first repeated node (equivalently, the first node of the cycle); $l_\text{cycle}$, the length of the cycle; and $l_\text{tail}$, the length of the "tail", i.e., the number of steps before the walk enters the cycle.
We can visualize these values as follows: the walk traces a tail of $l_\text{tail}$ nodes that leads into a cycle of $l_\text{cycle}$ nodes, entering the cycle at $s_\text{cycle}$.
A simple approach is to use BFS or DFS, which is equivalent to just following the (only) path and storing the visited nodes until we encounter a node we've already visited.
    // Just-walk algorithm
    function cycle_find(x):
        visit_time = new empty map
        time = 0
        s = x
        while not visit_time.has_key(s):
            visit_time[s] = time++
            s = f(s)
        s_cycle = s
        l_tail = visit_time[s]
        l_cycle = time - l_tail
        return (s_cycle, l_cycle, l_tail)
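The pseudocode above translates almost directly into Python (a sketch; `f` can be any function on a finite set):

```python
def cycle_find(f, x):
    """Walk from x, recording first-visit times, until a node repeats.
    Returns (s_cycle, l_cycle, l_tail)."""
    visit_time = {}
    time = 0
    s = x
    while s not in visit_time:
        visit_time[s] = time
        time += 1
        s = f(s)
    s_cycle = s                    # the first repeated node
    l_tail = visit_time[s]         # when we first saw it
    l_cycle = time - l_tail        # steps between the two visits
    return (s_cycle, l_cycle, l_tail)
```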
Assuming no preprocessing and no specialized knowledge on $f$, this is probably close to the fastest we can do. It needs $O(l_\text{tail} + l_\text{cycle})$ time.
It also needs $O(l_\text{tail} + l_\text{cycle})$ memory, but one might wonder if that can be improved upon. Amazingly, there's actually a way to compute everything using $O(1)$ memory, called Floyd's cycle-finding algorithm!
Now, you might ask, why the need for a fancy algorithm? Surely it's trivial to find an $O(1)$memory solution. Here's one:
    // Bogo cycle-finding algorithm
    function cycle_find(x):
        for time in 1, 2, 3, 4, ...:
            s = x
            for i = 1..time:
                s = f(s)

            s_cycle = s

            s = x
            l_tail = 0
            while s_cycle != s:
                s = f(s)
                l_tail++

            if l_tail < time:
                l_cycle = time - l_tail
                return (s_cycle, l_cycle, l_tail)
Let's call this the bogo cycle-finding algorithm. Although it might not be obvious why it works, it clearly uses $O(1)$ memory! Well, that's certainly true, but it's an incredibly slow solution! The real goal is to use only $O(1)$ memory without sacrificing running time.
Let's discuss Floyd's algorithm. It is also sometimes called the tortoise and hare algorithm, since we will only use two pointers, called the tortoise and the hare, respectively.
The idea is that both the tortoise and the hare begin walking at the starting point, but the hare is twice as fast. This means that at the beginning, the hare might be quite far ahead of the tortoise, but once they both enter the cycle, they will eventually meet. Once they meet, they stop, and the hare teleports back to the starting point. They then proceed walking at the same speed and stop once they meet again. This second meeting point will be $s_\text{cycle}$!
Once we get $s_\text{cycle}$, $l_\text{tail}$ and $l_\text{cycle}$ can easily be computed, e.g., $l_\text{cycle}$ can be computed by going around the cycle once. Here's the pseudocode:
    // Floyd's cycle-finding algorithm
    function cycle_find(x):
        tortoise = hare = x
        do:
            tortoise = f(tortoise)
            hare = f(f(hare))
        while tortoise != hare

        // teleport, and walk at the same speed
        hare = x
        l_tail = 0
        while tortoise != hare:
            tortoise = f(tortoise)
            hare = f(hare)
            l_tail++

        s_cycle = hare

        // compute l_cycle by walking around once.
        l_cycle = 0
        do:
            hare = f(hare)
            l_cycle++
        while tortoise != hare

        return (s_cycle, l_cycle, l_tail)
This clearly uses $O(1)$ memory. We've also mentioned that it runs in $O(l_\text{tail} + l_\text{cycle})$ time, but we didn't provide a convincing proof. It's not even clear why it correctly computes $s_\text{cycle}$. We leave it to you as an exercise to prove the running time and correctness of this algorithm!
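In the meantime, here is a direct Python rendering of the pseudocode above (a sketch; `f` is the iteration function):

```python
def floyd_cycle_find(f, x):
    """Floyd's tortoise-and-hare cycle finding. Returns (s_cycle, l_cycle, l_tail)
    using O(1) memory."""
    # Phase 1: the hare moves twice as fast until they meet inside the cycle.
    tortoise = hare = x
    while True:
        tortoise = f(tortoise)
        hare = f(f(hare))
        if tortoise == hare:
            break

    # Phase 2: teleport the hare back to x; walking at equal speed,
    # they meet exactly at the start of the cycle.
    hare = x
    l_tail = 0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        l_tail += 1
    s_cycle = hare

    # Phase 3: walk around the cycle once to measure its length.
    l_cycle = 0
    while True:
        hare = f(hare)
        l_cycle += 1
        if tortoise == hare:
            break

    return (s_cycle, l_cycle, l_tail)
```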
An interesting application of Floyd's algorithm (or at least its idea) is integer factorization. Pollard's $\rho$ algorithm ($\rho$ is pronounced "rho") is a factorization algorithm that can sometimes factorize numbers faster than trial-and-error division. The name comes from the shape of the path traced when iterating from some starting value: a tail leading into a cycle, which looks like the letter $\rho$.
The algorithm accepts $N$, the number to be factorized, along with two additional parameters, a starting value $s$, and a function $f : \{0, 1, \ldots, N - 1\} \rightarrow \{0, 1, \ldots, N - 1\}$, which must be a polynomial modulo $N$. The algorithm then attempts to find a divisor of $N$. One issue with this algorithm is that it's not guaranteed to succeed, so you may want to run the algorithm multiple times, with differing $s$ (and possibly $f$).
Suppose we want to factorize a large number $N$. Also, suppose $s = 2$ and $f(x) = (x^2 + 1) \bmod N$. Here is the pseudocode of Pollard's $\rho$ algorithm.
    function try_factorize(N):
        x = y = 2 // starting point
        do:
            x = f(x)
            y = f(f(y))
            d = gcd(|x - y|, N)
        while d == 1

        if d == N:
            failure
        else:
            return d
One can clearly see $f$ being used as the iteration function, and $x$ and $y$ are assuming roles that are similar to the tortoise and hare, respectively. If this fails, you could possibly try again with a different starting point, or perhaps a different $f$. (Of course, this will always fail if $N$ is prime.)
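As a sketch, here is a Python version of the pseudocode above, with the starting value $s$ and the constant term of the polynomial exposed as parameters (the parameter `c` is our addition, so that different polynomials $x^2 + c$ can be tried after a failure):

```python
from math import gcd

def try_factorize(N, s=2, c=1):
    """Pollard's rho with f(x) = (x^2 + c) mod N, starting at s.
    Returns a nontrivial divisor of N, or None on failure.
    (s and c are knobs for retrying; the text uses s = 2, c = 1.)"""
    f = lambda x: (x * x + c) % N
    x = y = s
    d = 1
    while d == 1:
        x = f(x)          # tortoise: one step
        y = f(f(y))       # hare: two steps
        d = gcd(abs(x - y), N)
    return None if d == N else d
```

For example, `try_factorize(8051)` finds the factor $97$ of $8051 = 83 \cdot 97$ after only a few iterations.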
A more thorough explanation of why this works, and why cyclefinding appears, can be seen on its Wikipedia page.