Longest Consecutive Sequence

mosesg1123
May 9
4 min read

Finding the longest run of consecutive days, IDs, or timestamps in a dataset comes up in analytics, event processing, and scheduling systems. This “Longest Consecutive Sequence” problem is a great way to exercise hashing, set membership, and thinking about how to avoid redundant work.

Problem Statement

You’re given an unsorted array of integers nums. Return the length of the longest sequence of consecutive integers that appear in nums. Your algorithm should run in O(n) time.

Example:

Input:  [100, 4, 200, 1, 3, 2]
Output: 4        // because the sequence 1, 2, 3, 4 has length 4

1. Clarify Requirements Before Jumping Into Code

Can nums contain duplicates? (Yes, but duplicates shouldn’t extend the sequence.)
What about negative numbers? (Allowed.)
Empty array input should return 0.
Expected time: O(n). Space: O(n) extra is fine.

2. Identify the Category of the Question

This is a hash‑set and sequence‑detection problem. Common patterns here include:

Sorting followed by linear scan (gives O(n log n)).
Hash‑set membership for O(1) existence checks.
Union‑find grouping (less common for this specific problem).

3. Consider a Brute‑Force Approach to Understand the Problem

Let's start with a basic idea: For each element x, what if I try counting upwards and downwards until I find a number that is missing:

Check x+1, x+2, … until I fail to find a matching element in the array (or I exceed the size of the array).
Do the same with x-1, x-2, …
That costs up to O(n) per element → O(n²) total in the worst case. Too slow.

Another basic idea: I could sort the array in O(n log n) and then scan once to count runs, skipping duplicates. That’s O(n log n) time, O(n) space. But can I do better than O(n log n) time?

4. Brainstorm More Solutions

Let's try breaking the problem down into subproblems:
- How do I test membership quickly?
- How do I avoid re‑counting the same sequence multiple times?
Subproblem – test membership quickly
- Option A: Binary search on a sorted array → O(log n) per check, but I also incur the cost of sorting. Overall, the runtime is O(n log n) to sort + O(log n) for scans.
- Option B: I could create a hash‑set of all values → O(1) per membership check. The constant look up time is huge, so I'll assume I need to use a hash-set in some way in my final solution.
Subproblem – avoid redundant scans
- If I start counting at every element, I end up recounting parts of the same sequence over and over.
- What if I only start counting when the element is the beginning of a sequence? How would I know if I'm at the beginning of a new sequence? Easy: if x - 1 is not present in my hash-set!
Putting it together
- Build a hash‑set of all numbers in O(n).
- For each x in the set:
  - If x - 1 is missing, treat x as the start of a sequence.
  - Then count how many consecutive x, x+1, x+2, … are present.
  - Track the maximum run length.
- This does one membership check per number for starts, plus one check per element in each sequence - overall O(n).

5. Discuss Trade‑Offs Between Your Solutions

Goal: Use a table to compare time/space and pros/cons.

Approach	Time	Space	Pros	Cons
Nested scans per element	O(n²)	O(1)	Conceptually trivial	Unusable for large n
Sort + single scan	O(n log n)	O(n)	Easy to code	Slower than linear, uses sort
Hash‑set + start‑at‑heads (optimal)	O(n)	O(n)	True linear time, simple logic	Needs extra set for membership
Union‑find linking	~O(n α(n))	O(n)	Academically interesting	Overkill for this problem

6. Chosen Solution Recap

Build a hash‑set of all numbers. Then for each number x, if x-1 is missing, scan forward x, x+1, x+2… until a gap, counting the length. Track the maximum length.

7. Write Pseudocode to Structure Your Thoughts

function longestConsecutive(nums):
  set = new HashSet(nums)
  best = 0
  for x in set:
    if (x - 1) not in set:
      length = 1
      while (x + length) in set:
        length += 1
      best = max(best, length)
  return best

8. Consider Edge Cases

Empty array → return 0
Single element → return 1
All duplicates, e.g. [2,2,2] → set reduces to {2}, sequence length 1
Negative and positive mix, e.g. [-1,0,1] → handles negative starts
Large gaps, e.g. [10, 4, 6, 2] → sequences length 1

9. Write Full Code Syntax

import java.util.HashSet;

public class Solution {
    public int longestConsecutive(int[] nums) {
        if (nums == null || nums.length == 0) return 0;
        HashSet<Integer> set = new HashSet<>();
        for (int n : nums) set.add(n);

        int best = 0;
        for (int x : set) {
            // only start from sequence head
            if (!set.contains(x - 1)) {
                int length = 1;
                while (set.contains(x + length)) {
                    length++;
                }
                best = Math.max(best, length);
            }
        }
        return best;
    }
}

10. Test Your Code

public static void main(String[] args) {
    Solution sol = new Solution();

    System.out.println(sol.longestConsecutive(new int[]{}));                 // 0
    System.out.println(sol.longestConsecutive(new int[]{5}));                // 1
    System.out.println(sol.longestConsecutive(new int[]{2,2,2}));            // 1
    System.out.println(sol.longestConsecutive(new int[]{100,4,200,1,3,2}));   // 4
    System.out.println(sol.longestConsecutive(new int[]{-1,0,1}));           // 3
    System.out.println(sol.longestConsecutive(new int[]{10, 5, 6, 7, 1}));    // 3
}

11. Key Lessons to Remember for Future Questions

Using a hash‑set for O(1) membership checks unlocks true linear solutions for sequence problems.
Identifying sequence heads (elements without a predecessor) avoids redundant scans.
Breaking the problem into “how to test membership” and “how to avoid repeats” makes the path clear.
Always consider sorting for O(n log n) if O(n) seems tricky, then refine to linear with hashing.
Testing edge cases—empty input, duplicates, negatives—ensures robustness.

Good Luck and Happy Coding!