Skip to main content

Detecting starting point of loop in a linked list

Detecting the starting point of a loop in a linked list is a very popular problem. In fact most you reading this might already be knowing the algorithm to solve this. However understanding why that algorithm works is a separate challenge altogether.

I was in the same state and decided to find an explanation of why it works. I could not find convincing explanations by simple google searches and hence decided to right this blog.
Hopefully, this will satisfy the curiosity of people trying to understand the reasoning behind this algorithm.

I believe coming up with this algorithm in the first place is an even bigger challenge. You will have to stay with this problem a lot longer if you want to do that. For now let's assume we have the algorithm and let's try to prove why it should work.

The Algorithm

The key insight of the algorithm is that we need to maintain two pointers: Slow and Fast,

Slow pointer: Moves one node at a time.
Fast pointer: Moves two nodes at a time.

The first thing to realise is that these pointers are definitely going to meet at some position in the loop. To convince yourself, consider this:
* Both the pointers are going to enter the loop at some point.
* The fast pointer will catch up with the slow pointer one step at a time (since it moves at double the speed)




Above diagram shows an example of a loop.

Now we know that the pointers are going to meet at some point in the loop.

The second step of the algorithm says that bring one of the pointers to the start of the list. 
Now move both the pointers (one at the start and one at the meeting point in the loop) one step at a time. The next time they meet will be the starting point of the list.

Let's try to prove this algorithm.

Proof

We need to consider certain variables to mathematically formulate the problem.

m - number of links a pointer has to travel to reach the starting point of the loop
l - the number of links a pointer has to travel to complete one cycle of the loop
k - the number of links a pointer has to travel from the starting point of the loop to reach the meeting point of the nodes in the first step of the algorithm.

Now that we have defined these variables, let us consider the following:

Distance travelled by the slow pointer to reach the meeting point = m + p*l + k

m is the distance travelled to reach the starting of the loop. Let's say it makes p rounds of the loop (notice that p can be zero) and finally it travels k distance from the loop start to reach the meeting point.


Distance travelled by the slow pointer to reach the meeting point = m + q*l + k

This is similar to the slow pointer except that the fast pointer makes q rounds of the loop. Since it travels at twice the speed hence q > p.

Now we know that, distance travelled by fast pointer = 2 * distance travelled by slow pointer

So, m + q*l + k = 2*(m + p*l + k) => (m + k) = (q - 2*p)*l

What this means is that (m + k) is an integral multiple of the length of the loop or in other words, (m+k) % l = 0. So if you travel a distance of (m+k) in the loop, you will reach the start of the loop.

Now we can consider the second part of the algorithm. We bring one pointer to the start of the list. 
Now when this pointer takes m steps, it reaches the start of the loop.
How much distance would the second pointer have covered by this time? It would be m+k. 
Why? Since it was already at an offset k after the first step of the algorithm and then it travels m more steps.

Now we know that (m+k) is a multiple of l. That means the second pointer has reached the start of the loop and so has the first pointer. This implies that both the pointer will meet at the beginning of the loop after the second step of the algorithm. Hence the algorithm works.

Comments

Popular posts from this blog

Streaming Algorithms - Lossy Counting

We had a discussion recently in our office on one of the famous  streaming algorithms  known as lossy count. I decided to write down my understanding of it focusing on the space complexity discussion and proving why it works in logarithmic space. This  is the original paper introducing this algorithm and most of the information discussed in this blog is (hopefully) a simplified version of this paper relating to lossy count. Problem Statement There are use cases where you may be receiving a continuous stream of data and you want to know which items in that data stream exceed a certain threshold. The data stream could be really big so you can't store it in memory. Moreover you want to answer queries in real time. So here are the challenges this problem imposes: 1. Need to do it in single pass (time constraint) 2. Limited memory 3. Large volume of data in real-time The above issues warrant the need of a smart counting algorithm. Data stream mining to identify events & pa...

RSA Encryption System

THE NEED OF ENCRYPTION In today's world where a lot of secured information like our credit card number, passwords etc., travel around the web, the presence of a secure encryption system is almost inevitable. We want a method to encrypt a message, send it over the insecure internet connection and allow only the receiver to be able to decrypt and read the original message. This exact problem was solved by R ivest, S hamir and A dleman(RSA) in the year 1978, in their paper A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. In this post, I will try to explain the method they adopted to create a secure encryption system. PUBLIC KEY CRYPTOSYSTEM A public key cryptosystem has the following properties: 1) D(E(M)) = M, if we decrypt an encrypted message, we should get back the message. 2) Both E and D should be easy to compute, if it takes days to encrypt or decrypt the message, then the cryptosystem is not very useful. 3) By publicly revea...

Fermat's little theorem

In my post on RSA encryption system I mentioned the use of Fermat's little theorem. In this post I am going to give a formal proof (and explain in simple terms) the theorem itself. This will also give you a chance to boast among your friends (possibly nerdy) that you know the proof to one of Fermat’s theorem ;) THE STATEMENT Fermat's little theorem states that: For any integer a not divisible by p and any prime p, the following always holds: a (p-1) ≡ 1 (mod p) The reason why the theorem states that a should not be divisible by p is very clear. Let us assume that a was divisible by p, then obviously p will divide a (p-1) . So this means that a (p-1) ≡ 0 (mod p) when p | a. So this is an exception and is separately mentioned in the theorem. PROOF Now coming to the actual proof of the theorem. If we see the left hand side of the equation in the statement, we have the term a (p-1) . So it is clear that this was obtained by multiplying a, p-1 times. Now let us ...