Reconstruct Itinerary: A Step-by-Step Interview Walkthrough

18 hours ago
12 min read

Reconstruct Itinerary is a problem that looks like a routine graph traversal and turns out to be a subtle trap for the obvious approach. The setup — use all the tickets, start at JFK, return the lexicographically smallest valid itinerary — practically begs for backtracking: try a flight, recurse, undo if you get stuck, try the next. That works, but it's exponential, and the candidates who reach for it spend their time wrestling with backtracking bookkeeping instead of seeing the structure. The candidates who recognize "use every edge exactly once" as the definition of an Eulerian path unlock a clean, near-linear algorithm. The signal here is whether you can identify a classic graph concept hiding behind an applied story, and whether you know the specific traversal trick (Hierholzer's algorithm) that constructs such a path directly instead of searching for it.

Eulerian-path problems show up wherever you need to traverse every connection exactly once. Route planning for mail carriers, snowplows, and street-sweepers (the "Chinese Postman" family) builds on Eulerian paths. DNA fragment assembly in bioinformatics reconstructs sequences by finding Eulerian paths through overlap graphs. Network testing tools traverse every link exactly once to verify connectivity. PCB drilling and CNC toolpath optimization minimize redundant travel the same way. Any time the goal is "cover every edge once," this is the underlying structure.

Problem Statement

You are given a list of airline tickets where tickets[i] = [from, to] represents a flight from airport from to airport to.

All tickets must be used exactly once.
The itinerary must begin with "JFK".
If multiple valid itineraries exist, return the one with the smallest lexical order when read as a single concatenated string.

Return the itinerary as a list of airport codes.

Example:

tickets = [["MUC","LHR"],["JFK","MUC"],["SFO","SJC"],["LHR","SFO"]]

result → ["JFK","MUC","LHR","SFO","SJC"].

1. Clarify Requirements Before Jumping Into Code

Before writing a single line, we should restate the problem in our own words and ask the kinds of questions that would change our approach.

Input: a list of directed edges (tickets), each a [from, to] pair.

Output: an ordered list of airport codes representing the full itinerary.

Details to clarify:

Must every ticket be used, and exactly once? Yes — this is the defining constraint. We're not finding a path to somewhere; we're using up all the edges.
Is a valid itinerary guaranteed to exist? Yes — the problem promises at least one valid itinerary using all tickets.
Where do we start? Always "JFK".
How is "smallest" defined when multiple itineraries work? Lexicographically, treating the whole itinerary as a sequence of airport codes — at each branch point, prefer the alphabetically smaller airport.
Can there be multiple tickets between the same two airports? Yes — duplicate edges are allowed, and each must be used.

The thing to flag is "use every ticket exactly once." That phrase is doing all the work. It's not asking us to reach a destination — it's asking us to consume every edge exactly once. That's a specific, well-studied graph structure, and naming it correctly is what unlocks the efficient solution.

2. Identify the Category of the Question

A few signals jump out:

Airports are nodes, tickets are directed edges.
We must traverse every edge exactly once — not every node, every edge.
Among valid traversals, we want a specific one (lexicographically smallest).

That "use every edge exactly once" requirement is the fingerprint of an Eulerian path — a path through a graph that uses every edge exactly once. This is a classic, named concept, and there's a classic algorithm to construct one: Hierholzer's algorithm. The lexicographical-ordering requirement is a secondary constraint layered on top, which we'll handle by controlling the order in which we explore edges. Recognizing "Eulerian path" is the entire battle; once you name it, the rest is applying a known technique.

3. Brute Force Solution

Let's think about the naive approach to understand the problem. The obvious idea is backtracking: from JFK, we try each outgoing flight (in alphabetical order, to get the smallest result first), recurse to use the remaining tickets, and if we hit a dead end before using all tickets, undo the last choice and try the next.

backtrack(airport, usedCount, path):
    if usedCount == totalTickets:
        return path is complete
    for each unused ticket from airport (in lex order):
        mark ticket used
        if backtrack(destination, usedCount + 1, path + destination):
            return success
        unmark ticket    # undo and try next
    return failure

Because we try destinations in alphabetical order and return the first complete itinerary we find, this does produce the lexicographically smallest answer. But it's exponential in the worst case — at each airport we may branch into many tickets, and dead ends force us to unwind and retry large subtrees. For graphs with many overlapping routes, this times out.

The brute force does teach us the requirement clearly, though: we need to use every ticket, and ordering matters globally (a greedy "always take the smallest next flight" can paint us into a corner where we strand unused tickets). That global-ordering tension is exactly what a smarter algorithm needs to resolve without exponential search.

4. Brainstorm More Solutions

Step 1: Name the structure

Let's model the problem precisely. Airports are nodes. Each ticket is a directed edge. We need a path that uses every edge exactly once, starting from JFK.

"A path that uses every edge exactly once" is the textbook definition of an Eulerian path. The problem guarantees a valid itinerary exists, which means the graph is guaranteed to have an Eulerian path starting at JFK. So our job isn't to search for whether such a path exists; it's to construct the one path that's lexicographically smallest.

Once we've named it, we can reach for the standard tool: Hierholzer's algorithm constructs an Eulerian path efficiently, without the exponential backtracking. The question becomes how Hierholzer's works and why, and how we bolt on the lexicographical requirement.

Step 2: Why naive greedy fails — and what fixes it

Before jumping to the algorithm, let's understand why this is tricky, because the difficulty motivates the trick.

The naive greedy idea: from each airport, always fly to the smallest-lettered available destination. That respects the lexical constraint locally, but it can strand tickets. Consider an airport JFK with a flight A that must be saved for last because it leads to a dead-end spur. If we greedily take the alphabetically smaller flight into the dead-end spur first, we get stuck there with tickets still unused elsewhere.

So we can't naively commit to the smallest flight and march forward — sometimes a flight that leads to a dead end has to be deferred until everything else is explored. The key realization: a dead-end airport (one with no remaining outgoing flights) must be the last thing on the itinerary among the choices available at that branch. When we get stuck at an airport with no outgoing edges left, that airport belongs at the end of the remaining route, not somewhere in the middle.

That observation is exactly what Hierholzer's algorithm exploits.

Step 3: Reasoning our way to the construction trick

We established the problem with naive greedy: from an airport, the alphabetically smallest flight sometimes leads into a dead-end spur that strands other tickets. So we can't always commit to the smallest flight and march forward. Let's reason about what we can say for certain, and see if it leads somewhere.

Here's a question with a clean answer: when we follow flights and eventually arrive at an airport with no remaining outgoing tickets, what do we know about that airport? We know we're stuck — we can't fly anywhere from here. And since we have to use every ticket, and there's no way out of this airport, this airport can't have anything after it. It must be the last stop. That's a fact we can trust completely, regardless of what messy choices led us here.

That's a strong foothold, so let's lean on it. Suppose we keep flying greedily (smallest flight first) until we get stuck. The airport where we get stuck is definitely the end of the itinerary. Now back up one step to the airport we flew in from. We've used the ticket that led to the dead-end; are there other tickets out of this previous airport?

If so, those represent detours we still need to weave in — side-trips that branch off and (because a valid itinerary is promised) must eventually loop back.

If not, this airport is now also stuck, which means it's the second-to-last stop.

There's a pattern forming here. The airports get "finalized" from the end backward. The first airport we get stuck at is the last in the itinerary. The next one we finish is second-to-last. And so on. We're discovering the itinerary's airports in reverse order of where they belong.

So here's the construction that falls out of this reasoning: do the greedy DFS, and whenever we exhaust an airport's flights (get stuck), record it. Keep recording as the recursion unwinds and we finish more airports. Each airport we record is earlier in the itinerary than the previous one we recorded. At the end, we have the full itinerary recorded backward — so all we have to do is reverse it.

What about the detour side-trips? They handle themselves. When we back up to an airport that still has unused tickets, we dive down that branch, follow it until it gets stuck, and record those airports too — and because they were recorded later (during the unwind from the detour), they slot into the reversed itinerary before the airport we detoured from. The recursion's natural order — finish the deepest dead-ends first, then their parents — combined with the reversal, splices every loop into exactly the right place. We never have to plan it; the postorder recording and the final reversal do the bookkeeping for us.

This is genuinely counterintuitive the first time — we're recording airports in the opposite of the order we want, and trusting that a reversal fixes everything. But it follows directly from the one fact we were sure of: a stuck airport is the last airport. Everything else is that fact applied repeatedly as we unwind. (This construction is known as Hierholzer's algorithm, but you don't need to recall the name — you can re-derive it from "a dead-end must come last.")

Step 4: Handling the lexicographical constraint with the right data structure

Now layer on "lexicographically smallest." At each airport, when we choose which outgoing flight to take next, we want the alphabetically smallest destination first. So we need each airport's outgoing edges stored in a way that always hands us the smallest unused one.

A min-heap (priority queue) per airport does exactly this: poll() returns the smallest remaining destination, and removing it also marks that ticket as used (each edge consumed exactly once). So we store the graph as a map from each airport to a min-heap of its destinations.

This is a nice separation of concerns: the ordering requirement lives entirely in the data structure (the min-heap), while the traversal logic (Hierholzer's DFS) stays clean and doesn't have to think about lexicography at all. Putting the ordering in the data structure rather than the algorithm is a recurring design move worth remembering — it keeps the traversal logic simple.

Step 5: Assemble the algorithm

build graph: map of (each airport : min-heap of its destinations)
result = empty list

dfs(airport):
    while airport's heap is non-empty:
        # get smallest unused destination and mark ticket used
        next = heap.poll()      
        dfs(next)
    # add when stuck (postorder)
    result.add(airport)         

dfs("JFK")
reverse(result)
return result

The whole algorithm is short, but every piece earns its place: the min-heap enforces lexical order, the DFS consumes edges, the postorder insertion handles the dead-end-goes-last problem, and the final reversal flips the recorded order into the itinerary order.

Step 6: Walk through an example

Let's trace tickets = [["JFK","SFO"],["JFK","ATL"],["SFO","ATL"],["ATL","JFK"],["ATL","SFO"]].

Build the heaps (each airport's destinations sorted):

JFK → [ATL, SFO]
ATL → [JFK, SFO]
SFO → [ATL]

Now dfs("JFK"):

At JFK, poll smallest → ATL. Recurse into ATL.
- At ATL, poll smallest → JFK. Recurse into JFK.
  - At JFK, poll smallest → SFO (ATL was already used). Recurse into SFO.
    - At SFO, poll → ATL. Recurse into ATL.
      - At ATL, poll → SFO (JFK was already used). Recurse into SFO.
      - At SFO, heap empty. Stuck. Add SFO. result = [SFO].
      - Back at ATL, heap empty. Add ATL. result = [SFO, ATL].
    - Back at SFO, heap empty. Add SFO. result = [SFO, ATL, SFO].
  - Back at JFK, heap empty. Add JFK. result = [SFO, ATL, SFO, JFK].
- Back at ATL, heap empty. Add ATL. result = [SFO, ATL, SFO, JFK, ATL].
Back at JFK, heap empty. Add JFK. result = [SFO, ATL, SFO, JFK, ATL, JFK].

Reverse: [JFK, ATL, JFK, SFO, ATL, SFO]. Every ticket used exactly once, starts at JFK, lexicographically smallest. Correct.

Notice how the first airport to get "stuck" (the final SFO) ended up first in the recorded list and last after reversal — exactly where a dead-end belongs.

Step 7: Complexity

Let e be the number of tickets (edges). Let's derive the cost from the operations.

Building the graph inserts each ticket into a min-heap. With e total tickets distributed across the heaps, each insertion costs O(log e) in the worst case (when many tickets share an airport), so building is O(e log e).

The DFS traverses each edge exactly once — every ticket is polled from a heap exactly once and never revisited. Each poll costs O(log e). So the traversal is O(e log e). The final reversal is O(e). Adding these, the dominant term is O(e log e) time.

For space: the graph stores every edge once across the heaps, O(e). The recursion stack can go as deep as the path length, which is O(e) in the worst case (the itinerary has e + 1 airports). The result list is O(e). So total space is O(e).

The transferable framing, same as the heap-based analyses elsewhere: the cost is "(number of edges) × (log of heap size)," because each edge is inserted once and polled once, and each heap operation carries a log factor.

5. Discuss Trade-Offs Between Solutions

Approach	Time	Space	When I'd use it
Brute-force permutations	O(e!)	High	Never — factorial blowup
Backtracking DFS	Exponential worst case	High	Conceptually simple, but times out and the ordering logic is fiddly
Hierholzer's (DFS + min-heaps)	O(e log e)	O(e)	My default — constructs the answer directly, no search

Hierholzer's algorithm is the right answer. The brute-force and backtracking approaches can produce correct results, but they search for the path; Hierholzer's constructs it deterministically, which is both faster and cleaner. This problem is genuinely won or lost on recognizing the Eulerian path underneath.

6. Pseudocode

build graph: map each airport → min-heap of destinations
result = empty list

dfs(airport):
    while heap[airport] is non-empty:
        next = heap[airport].poll()    # smallest unused destination
        dfs(next)
    result.add(airport)                # postorder: add when stuck

dfs("JFK")
reverse(result)
return result

7. Edge Cases

Things to verify before claiming we're done:

Multiple tickets between the same two airports → the min-heap holds duplicates; each is polled once, so each is used exactly once. ✓
Cycles in the graph → Hierholzer's handles cycles naturally; a loop just gets explored and spliced into the itinerary at the right spot.
Only one valid itinerary → the algorithm finds it; the lexical tiebreaker never has to choose, but the logic is unchanged.
Lexical ties at multiple branch points → the min-heap always hands back the smallest available destination, so the globally smallest valid itinerary is produced.
A single ticket → dfs("JFK") polls the one destination, recurses, both get added, reversed to [JFK, destination]. ✓

The duplicate-tickets case is the one worth mentioning explicitly — using a min-heap (rather than a set) is what lets us store and consume duplicate edges correctly.

8. Full Code

import java.util.*;

public class ReconstructItinerary {

    public static List<String> findItinerary(List<List<String>> tickets) {
        // Each airport maps to a min-heap of destinations.
        // The heap enforces lexical order
        Map<String, PriorityQueue<String>> graph = new HashMap<>();
        for (List<String> ticket : tickets) {
            graph.computeIfAbsent(ticket.get(0), k -> new PriorityQueue<>())
                 .add(ticket.get(1));
        }

        List<String> itinerary = new ArrayList<>();
        dfs("JFK", graph, itinerary);

        // We added airports when we got stuck (postorder), so the 
        // recorded order is the reverse of the itinerary. Flip it.
        Collections.reverse(itinerary);
        return itinerary;
    }

    private static void dfs(
            String airport,
            Map<String, PriorityQueue<String>> graph,
            List<String> itinerary
    ) {
        PriorityQueue<String> destinations = graph.get(airport);

        // Follow the smallest available edge until this airport 
        // is exhausted. Polling removes the ticket, ensuring each 
        // is used exactly once.
        while (destinations != null && !destinations.isEmpty()) {
            String next = destinations.poll();
            dfs(next, graph, itinerary);
        }

        // Stuck here — record the airport. 
        // Dead-ends end up last after reversal.
        itinerary.add(airport);
    }
}

9. Test the Code

// Simple linear-ish itinerary
List<List<String>> t1 = List.of(
    List.of("MUC", "LHR"),
    List.of("JFK", "MUC"),
    List.of("SFO", "SJC"),
    List.of("LHR", "SFO")
);
System.out.println(findItinerary(t1));
// [JFK, MUC, LHR, SFO, SJC]

// Branch point at JFK and ATL — lexical tiebreaks matter
List<List<String>> t2 = List.of(
    List.of("JFK", "SFO"),
    List.of("JFK", "ATL"),
    List.of("SFO", "ATL"),
    List.of("ATL", "JFK"),
    List.of("ATL", "SFO")
);
System.out.println(findItinerary(t2));
// [JFK, ATL, JFK, SFO, ATL, SFO]

// Duplicate tickets between the same airports
List<List<String>> t3 = List.of(
    List.of("JFK", "KUL"),
    List.of("JFK", "NRT"),
    List.of("NRT", "JFK")
);
System.out.println(findItinerary(t3));
// [JFK, NRT, JFK, KUL]  (must defer KUL — it's a dead end)

These hit the meaningful cases: a straightforward itinerary, a graph with branch points where lexical tiebreaks matter, and a case where the alphabetically smaller first choice (KUL) would strand tickets — verifying the algorithm correctly defers the dead-end. That t3 case is the one that proves naive greedy fails: a front-to-back greedy would fly JFK→KUL first (K < N) and get stranded, but Hierholzer's postorder construction handles it correctly.

10. Key Lessons

"Use every edge exactly once" is the signature of an Eulerian path. When you see it, reach for Hierholzer's algorithm rather than backtracking. Recognizing the named concept is what turns an exponential search into a linear construction.
Some path problems are better constructed than searched. Backtracking searches for a valid path by trial and error; Hierholzer's builds the answer directly and deterministically. When a classic structure underlies the problem, look for the construction algorithm.
Building a result in postorder (recording nodes as you get stuck, then reversing) is a powerful trick for path problems where dead-ends must land in specific positions. The reversal converts "order I finished" into "order I should visit."
Put ordering requirements in the data structure, not the traversal logic. Storing each node's neighbors in a min-heap handled the lexicographical constraint automatically, keeping the DFS clean. This separation of concerns is reusable across many ordered-traversal problems.
Removing an edge as you traverse it (here, polling from the heap) is how you enforce "use each edge exactly once" without a separate visited structure. The consumption is the marking.

The thing that makes Reconstruct Itinerary click is recognizing the Eulerian path beneath the airline-ticket story, and then trusting the counterintuitive postorder-and-reverse construction that builds the path without any backtracking. Once you see that "use every edge exactly once" names a specific structure with a specific algorithm, the problem stops being a daunting search and becomes a clean, deterministic walk. Training yourself to match applied problems to their underlying classic structures is the skill that carries across the whole graph category.

Good Luck and Happy Coding!