top of page

Merge k Sorted Lists

  • mosesg1123
  • 3 days ago
  • 5 min read

Combining multiple sorted streams into one sorted output is a classic challenge faced in database merge operations, log file aggregation, and external sorting. The “merge K sorted lists” problem tests your ability to coordinate multiple pointers, leverage priority data structures, and keep everything running in optimal time.


Problem Statement

You’re given an array of k linked lists, each of which is sorted in ascending order. Merge all the lists into one sorted linked list and return its head.


1. Clarify Requirements Before Jumping Into Code

I want to make sure I’ve got the details:

  • Are all lists non-null? (They might be empty or null.)

  • Do nodes contain only integer values? (Yes.)

  • Total number of lists k could be large, as could total nodes N.

  • What time complexity do we need? Ideally O(N log k).

  • Should we modify the input lists or build a new one? (In place is fine.)


2. Identify the Category of the Question

This is a divide-and-conquer and priority-queue design problem on linked lists. Common patterns:

  • Pairwise merging of two lists (like merge step in mergesort)

  • Recursively divide the array of lists and merge halves

  • Using a min-heap (priority queue) to always select the smallest head among k lists


3. Consider a Brute-Force Approach to Understand the Problem

A naive approach:

  1. Dump all node values into an array - O(n)

  2. Sort the array - O(n log n).

  3. Rebuild a linked list - O(n).

Total: O(n log n) time, O(n) space. This works, but we’re not leveraging the fact that each list is already sorted.


4. Brainstorm More Solutions

Let me start by trying to simplify the problem a bit.

What would I do if I only had two sorted lists that I needed to merge? That's an easier problem - I would just use a two pointer solution comparing and inserting the head of each list to the result. This is O(n) time.


Now what if I wanted to scale that up to 3 or more lists? Two options come to mind:

  • Option 1: What if I expand my two pointer solution to a k-pointer solution? I'm essentially iterating through the head of every list to find the next smallest element. This gives me O(n*k) time, since I'm visiting every element in every list k times.

  • Option 2: Alternatively, I could merge the first two lists using the two pointer method, then merge in the third, and on all the way down through k lists. This way, I'm only ever merging two lists at a time. In the worst case, this also gives me a runtime of O (n*k), since I'm visiting every element in the result k times.


These solution are technically correct and would work, but I'm wondering: Can we do better?


Option 1

One trick for optimizing on a solution is to ask yourself: am I doing duplicate work anywhere? I see some duplicate work in option 1 - I'm essentially revisiting the head of every list every iteration to try and find the next smallest.

So, what if I didn't have to revisit every list every time? Is there some way I can immediately determine which node is the next smallest?


There's a data structure that let's me do that - a min-heap!


If I were maintaining a min-heap of the head of each list:

  • My min heap is going to have k elements. Inserting an element to the min heap is thus O(log k).

  • When I extract the smallest element from the min-heap, I'll add in the next node from the corresponding list to the min heap (or do nothing if the corresponding list has been exhausted).

  • I'm doing these operations n times, so inserting/extracting all elements from my min-heap averages out to O(n log k) time - that's better than O (n*k)!


Option 2

Does Option 2 remind me of any of the standard approaches to solving a coding problem? It works by breaking the problem down into a smaller sub-problem and then building up the result piece by piece. There's another name for this strategy - divide and conquer!


But my approach in Option 2 was a bit different from your typical divide and conquer strategy. I'm building up the solution one-by-one, list-by-list.

Divide and conquer shines when you can break the problem down into similar sized sub-problems. Can I do that here? Yes!

Rather than processing my lists iteratively, I'll recursively divide my lists in half until I only have 2 lists that I'm merging; merge the two lists together in O(n) time; then merge with the result from the other half of the lists!

  • So, each "level" does O(n) work

  • There are log k levels

  • That means this solution runs in O(n log k) time.


Both the divide-and-conquer solution and the min-heap solution give me O(n log k) runtime. I was not given space constraints, but if I were, it would be unlikely to be a concern in either solution.


So how do I pick? In this case, I'll pick the solution that I think will lead to the code that is easiest to implement and understand. For me, that's the min-heap solution.


5. Discuss Trade-Offs Between Your Solutions

Approach

Time

Space

Pros

Cons

Dump & sort

O(N log N)

O(N)

Very simple

Doesn’t leverage sorted inputs

Merge one by one

O(k·N)

O(1)

Straight forward

Degrades when k is large

Divide & conquer

O(N log k)

O(1)

Optimal time, minimal extra space

Slightly more recursive code

Min-heap of heads

O(N log k)

O(k)

Clear iterative code, heap library

Requires heap space O(k)


7. Write Pseudocode to Structure Your Thoughts

function mergeKLists(lists):
  heap = new min-heap
  for head in lists:
    if head != null: heap.push(head)
  dummy = new node(0)
  tail = dummy
  while heap not empty:
    node = heap.pop()
    tail.next = node
    tail = node
    if node.next != null:
      heap.push(node.next)
  return dummy.next

8. Consider Edge Cases

  • All lists empty or null → return null.

  • Some lists empty, others non-empty → skip null heads.

  • Single list only → return that list.

  • Lists containing duplicate values → heap handles ties predictably.


9. Write Full Code Syntax (Java)

import java.util.PriorityQueue;

class ListNode {
    int val;
    ListNode next;
    ListNode(int x) { val = x; }
}

public class Solution {
    public ListNode mergeKLists(ListNode[] lists) {
        PriorityQueue<ListNode> heap = new PriorityQueue<>(
            (a, b) -> Integer.compare(a.val, b.val)
        );
        for (ListNode head : lists) {
            if (head != null) {
                heap.offer(head);
            }
        }
        ListNode dummy = new ListNode(0), tail = dummy;
        while (!heap.isEmpty()) {
            ListNode node = heap.poll();
            tail.next = node;
            tail = node;
            if (node.next != null) {
                heap.offer(node.next);
            }
        }
        return dummy.next;
    }
}

10. Test Your Code

public static void main(String[] args) {
    Solution sol = new Solution();

    // helper to build and print lists omitted for brevity

    // Test 1: empty array
    System.out.println(sol.mergeKLists(new ListNode[]{} ) == null);

    // Test 2: all null lists
    System.out.println(sol.mergeKLists(new ListNode[]{null, null}) == null);

    // Test 3: single list
    ListNode l1 = new ListNode(1);
    l1.next = new ListNode(2);
    System.out.println(sol.mergeKLists(new ListNode[]{l1}).val == 1);

    // Test 4: example case
    ListNode a1 = new ListNode(1);
    a1.next = new ListNode(4);
    a1.next.next = new ListNode(5);
    ListNode b1 = new ListNode(1);
    b1.next = new ListNode(3);
    b1.next.next = new ListNode(4);
    ListNode c1 = new ListNode(2);
    c1.next = new ListNode(6);
    ListNode merged = sol.mergeKLists(new ListNode[]{a1, b1, c1});
    // expected output sequence: 1→1→2→3→4→4→5→6

    // Test 5: lists of varying lengths
    ListNode x1 = new ListNode(0);
    ListNode merged2 = sol.mergeKLists(new ListNode[]{x1, a1});
    // expected: 0→1→4→5
}

11. Key Lessons to Remember for Future Questions

  • A min-heap elegantly merges k sorted streams in O(n log k).

  • Divide-and-conquer merging also runs in O(n log k) with O(1) extra space.

  • Always consider both merging strategies when k is large.

  • Pseudocode first clarifies the core loop before diving into language syntax.

  • When looking to optimize on a solution, think about where you might be doing duplicate work and how you might able to eliminate that duplicate work

  • If you're stuck trying to optimize, don't be afraid to just talk through some of your standard solutions to see if any of them sound similar to your initial ideas


Good Luck and Happy Coding!

Recent Posts

See All
Find the Median from a Data Stream

Computing running medians on a stream of numbers pops up in real-time analytics, financial tickers, and sensor dashboards. You need to support two operations—adding a number and retrieving the current

 
 
 
Design an LRU Cache

Caching is critical in systems design, from web browsers to database engines. An LRU (Least Recently Used) cache evicts the least recently used entry when full, ensuring hot data stays accessible. Imp

 
 
 
Contains Duplicate Within K Distance

Checking for nearby repeated events in a log or sensor stream is a common task in monitoring, fraud detection, and real‑time analytics....

 
 
 

Comments


Drop Me a Line, Let Me Know What You Think

Thanks for submitting!

© 2023 by Train of Thoughts. Proudly created with Wix.com

bottom of page