top of page

Binary Tree Maximum Path Sum: A Step-by-Step Interview Walkthrough

  • May 24
  • 12 min read

Binary Tree Maximum Path Sum is one of the genuinely hard tree problems in the standard interview rotation. Most tree problems are about traversing a structure or computing a single value per node. This one asks something more subtle: find the best path anywhere in the tree, where "anywhere" includes paths that don't start at the root, don't end at a leaf, and bend through any node. The trap is that candidates instinctively reach for a recursive function that returns "the answer for this subtree" — but as we'll see, there isn't a single value that captures everything we need to know at each node. The real signal interviewers are watching for is whether you can recognize that one return value isn't enough, and design a recursion that separates "what I can pass up to my parent" from "what I contribute to the final answer."


Compilers use this kind of tree DP to find expensive expressions in syntax trees for common subexpression elimination. Network routing algorithms compute longest paths through tree-shaped topologies. Game AIs evaluate decision trees by propagating value estimates upward through branching choices. Phylogenetic analysis in bioinformatics finds maximum-weight paths through evolutionary trees. Any time a hierarchical structure has additive weights and we want the best contiguous chain through it, the same "split what you keep from what you compute" pattern applies.


Problem Statement

Given the root of a binary tree, return the maximum path sum.

A path is defined as any sequence of nodes where:

  • Each pair of adjacent nodes has a parent-child relationship.

  • The path does not need to pass through the root.

  • The path can start and end at any node.

  • The path must contain at least one node.

Node values may be negative.


Example:

       -10
       /  \
      9   20
          / \
         15  7

Maximum path sum is 42 (the path 15 → 20 → 7).


1. Clarify Requirements Before Jumping Into Code

Before writing a single line, we should restate the problem in our own words and ask the kinds of questions that would change our approach.


Input: the root of a binary tree.

Output: an integer — the maximum sum of any valid path.


Details to clarify:

  • Does the path have to start at the root? No — and this is the most important clarification. Most tree problems are rooted; this one isn't.

  • Does the path have to end at a leaf? No.

  • Can a path consist of a single node? Yes — a single node is a valid path.

  • Can node values be negative? Yes.

  • Can all values be negative? Yes — in which case the answer is the single largest (least negative) value, since adding more negatives only makes things worse.

  • Can the tree be empty? Usually not for this problem, but worth confirming.

The thing to flag is that "path" here means any sequence of parent-child-connected nodes — including paths that "bend" through some node and go down two different subtrees. This is the trap. If you think of paths only as "root-to-something" or "leaf-to-leaf," you'll miss the right framing.


2. Identify the Category of the Question

A few signals jump out:

  • We're optimizing over a structure (finding a max).

  • The decision at each node depends on its children's subtrees.

  • Subproblems overlap heavily — every node sits at the top of some path and is also part of paths anchored elsewhere.

That combination — optimization, recursive structure, overlapping subproblems — places this in the tree DP family. Specifically, it's a postorder problem: each node's contribution depends on what its children compute first. Same family as Diameter of Binary Tree, House Robber III, and Longest Univalue Path.

But there's a wrinkle that makes this problem harder than its cousins. We'll see in the brainstorm.


3. Brute Force Solution

The naive approach: enumerate every possible path in the tree, compute each path's sum, return the maximum.

How many paths are there? For every pair of nodes (u, v) connected by a parent-child chain, there's a path. In the worst case (a complete tree), this is roughly O(n²) paths, and computing each takes O(n) time to walk it. That's O(n³) total, which is far too slow for any reasonable input.

We could be slightly smarter — enumerate "bent" paths by considering each node as the "peak" (the highest point of the path) and finding the best downward extension on each side. But even that approach, naively implemented, recomputes subtree information many times.

The brute force teaches us the key insight: the work overlaps enormously. The best downward path from a node is the same regardless of which paths it appears in. If we can compute that information once per node and reuse it, we collapse the cubic blowup into a linear traversal.


4. Brainstorm More Solutions

Step 1: Try the obvious recursion

Let's try the most natural-looking approach. Define maxPathSum(node) as the maximum path sum anywhere in the subtree rooted at node. The answer to the whole problem is maxPathSum(root).


For the recurrence, at a given node, the best path is the maximum of:

  • The best path entirely in the left subtree.

  • The best path entirely in the right subtree.

  • The best path that passes through this node.


The first two are direct recursive calls. The third — paths through the current node — is where things get interesting. Let's try breaking down a path through the current node into smaller pieces that are easier to reason about.


The value of a path through node is:

  • node.val

  • + best downward path from left child

  • + best downward path from right child

  • where the downward paths can be empty (zero) if including them would hurt.


But here's the problem: we just used the phrase "best downward path from left child," which is not the same thing as "the maximum path sum anywhere in the left subtree." The first is restricted — it has to start at the left child. The second is unrestricted — it could be anywhere down there.


So the recursive function we defined doesn't give us the information we need. To compute the answer at node, we'd need both:

  • The unrestricted max anywhere in each subtree (to consider paths that don't touch node because they are contained entirely in the left or right subtrees).

  • The best downward chain from each child (to consider paths that pass through this node).

Those are two different values. We can't return both from a single function call without complicating the interface.


Step 2: Separate what flows up from what gets recorded

So we're stuck if we insist on a single return value because our current function definition won't work with only a single return value. So what if we change what the function returns? Instead of returning "the maximum path sum anywhere in this subtree" — which doesn't tell the parent what it needs in order to compute a bent path through itself — what if the function returns the best downward chain starting at this node?


Once we make that swap, look at what a recursive call has on hand. It knows its own node's value. It knows the best downward chain from its left child, because we just redefined the left recursive call to return exactly that. It knows the same for the right child. With those three numbers, the call can compute the best bent path that peaks at this node — and that's a candidate for the final answer. Now we just need to design our function so that it doesn't need to return that candidate to anyone; it just needs to record it somewhere that survives the rest of the traversal. A shared variable does the job.


That leaves only one thing to return: the best downward chain we promised our parent. Since our parent will extend a path through us, it can only use a straight chain going down through one of our children. So the return value is node.val + max(left, right).


Two outputs, two channels: candidate answers get recorded into a shared variable, and the extendable downward chain gets returned to the parent. Each carries the information its consumer needs, and neither has to do the other's job.


So we end up with two separate quantities at each node:

  • The max possible value of a path that bends through this node using both subtrees: node.val + left_contribution + right_contribution. This is computed and recorded into the global max

  • The max possible straight chain using at most one subtree: node.val + max(left_contribution, right_contribution). Returned upward to the parent


This separation is the entire algorithm. Once you see that one return value isn't enough and that the answer must be tracked separately from what flows up, the code writes itself.


Step 3: Negative contributions

Let's not forget that node values can have negative values. So what do we do if a child's downward chain has a negative sum? Then including it makes the path worse.

That's easy: we should just drop it and treat it as if the child contributed zero.

In code, this is one line: left = max(dfs(node.left), 0). If the left child's contribution is negative, we pretend it's zero, which is the same as "don't extend through the left side."


But we have to be careful here: this only applies to what flows up to the parent. It does not mean the global max can be zero. The global max has to start at some actual node and use that node's value, even if all values are negative. That means our global max should start at Integer.MIN_VALUE (or -infinity), and at each node we compute node.val + left + right as a candidate — never dropping the node itself, even if it's negative.


That's why all-negative trees work correctly: the algorithm never "decides not to include any node." It just picks the least-bad single node.


Step 4: Walk through the example

Let's verify on:

       -10
       /  \
      9   20
          / \
         15  7

We're using DFS in our maxPathSum function to visit children before parents (postorder). Trace:

  • maxPathSum(9): left=0, right=0. Compute path-through: 9 + 0 + 0 = 9. Global max: 9. Return: 9 + max(0, 0) = 9.

  • maxPathSum(15): same as 9. Compute 15. Global max: 15. Return: 15.

  • maxPathSum(7): same. Compute 7. Global max stays 15 (since 15 > 7). Return: 7.

  • maxPathSum(20): left=max(15, 0)=15, right=max(7, 0)=7. Compute path-through: 20 + 15 + 7 = 42. Global max: 42. Return: 20 + max(15, 7) = 35.

  • maxPathSum(-10): left=max(9, 0)=9, right=max(35, 0)=35. Compute path-through: -10 + 9 + 35 = 34. Global max stays 42 (since 42 > 34). Return: -10 + max(9, 35) = 25.

Final answer: 42.

Notice how the 42 was computed during the visit to 20, but the algorithm kept recursing up to -10 afterward. The global max captured 42 before the recursion finished. If we had only tracked the return value, we'd have ended up with 25, which is wrong. That's why the global max isn't optional — it's the mechanism that lets us record answers from anywhere in the tree, not just at the root.


Step 5: Complexity

The recursion visits each of the n nodes exactly once and does O(1) work per node (a few arithmetic operations and a comparison against the global max). Total time: O(n).

Space is O(h) for the recursion stack, where h is the tree's height — O(log n) for balanced trees, O(n) for skewed ones. The global max variable is O(1).

Both axes are essentially optimal. We have to look at every node at least once, so O(n) is unavoidable. And tree recursion inherently uses stack space proportional to depth.


5. Discuss Trade-Offs Between Solutions

Approach

Time

Space

When I'd use it

Enumerate all paths

O(n³)

O(n) for path storage

Never — wastes the structure entirely

DFS returning "max in subtree" only

Doesn't work

The natural first attempt that exposes why one return value isn't enough

DFS with global max + downward contribution

O(n)

O(h) stack

My default interview answer — canonical solution

The two-quantity decomposition (return one thing, record another) is the right answer for this problem. There's no better runtime to aim for.


6. Pseudocode

globalMax = -infinity

dfs(node):
    if node is null:
        return 0

    # Drop negative contributions — they only hurt
    left = max(dfs(node.left), 0)
    right = max(dfs(node.right), 0)

    # Path that bends through this node — uses both children
    globalMax = max(globalMax, node.val + left + right)

    # What we pass up to our parent — at most one child can be extended
    return node.val + max(left, right)

7. Edge Cases

Things to verify before claiming we're done:

  • Single-node tree → dfs visits the one node; left and right are 0. Global max becomes node.val. Returns the value. ✓

  • All-negative tree → max(dfs(child), 0) drops every child contribution, so each node's candidate is just node.val + 0 + 0 = node.val. Global max ends up as the largest single value. ✓

  • Tree where the best path doesn't include the root → algorithm still finds it, because the global max is updated at every node, not just the root.

  • Highly unbalanced (skewed) tree → algorithm works, but recursion depth equals tree height. For pathological inputs, an iterative version with an explicit stack would be safer.

  • Tree where both children are positive but the node itself is very negative → the algorithm correctly computes the bent path, even if the node hurts the running total.

The combination of max(child, 0) for what passes up and the global max for what gets recorded handles every shape cleanly. Initializing globalMax to Integer.MIN_VALUE is what makes all-negative trees work — using 0 instead would incorrectly allow "empty paths" with sum 0, which the problem doesn't permit.


8. Full Code

public class BinaryTreeMaximumPathSum {

    static class TreeNode {
        int val;
        TreeNode left;
        TreeNode right;

        TreeNode(int val) {
            this.val = val;
        }
    }

    private static int maxSum;

    public static int maxPathSum(TreeNode root) {
        // Initialize to MIN_VALUE so all-negative trees return 
        // correctly. The path must include at least one node, so 
        // we never settle for 0.
        maxSum = Integer.MIN_VALUE;
        dfs(root);
        return maxSum;
    }

    private static int dfs(TreeNode node) {
        if (node == null) {
            return 0;
        }

        // Drop negative contributions — extending through a child 
        // with a negative sum would only make our path worse.
        int left = Math.max(dfs(node.left), 0);
        int right = Math.max(dfs(node.right), 0);

        // Candidate answer at this node: a path that bends here, 
        // using BOTH children. This is what gets recorded into the 
        // global max.
        int pathThroughNode = node.val + left + right;
        maxSum = Math.max(maxSum, pathThroughNode);

        // What we return to our parent: a STRAIGHT chain going 
        // down through at most one child. Our parent will extend 
        // through us, so we can only contribute a single linear 
        // path.
        return node.val + Math.max(left, right);
    }
}

9. Test the Code

// Standard case — best path bends through 20
TreeNode root = new TreeNode(-10);
root.left = new TreeNode(9);
root.right = new TreeNode(20);
root.right.left = new TreeNode(15);
root.right.right = new TreeNode(7);
System.out.println(maxPathSum(root));  // 42

// Single node
System.out.println(maxPathSum(new TreeNode(5)));   // 5
System.out.println(maxPathSum(new TreeNode(-3)));  // -3

// All-negative tree — answer is the single least-negative node
TreeNode allNeg = new TreeNode(-3);
allNeg.left = new TreeNode(-2);
allNeg.right = new TreeNode(-1);
System.out.println(maxPathSum(allNeg));  // -1

// Best path doesn't include the root
TreeNode skipRoot = new TreeNode(-5);
skipRoot.left = new TreeNode(4);
skipRoot.right = new TreeNode(2);
skipRoot.left.left = new TreeNode(11);
System.out.println(maxPathSum(skipRoot));  // 15 (path: 4 → 11)

// Skewed chain of positives
TreeNode chain = new TreeNode(1);
chain.right = new TreeNode(2);
chain.right.right = new TreeNode(3);
chain.right.right.right = new TreeNode(4);
System.out.println(maxPathSum(chain));  // 10 (the whole chain)

These hit the meaningful cases: a bending path through a subtree, single-node trees, an all-negative tree (testing the MIN_VALUE initialization), a tree where the optimal path avoids the root entirely, and a skewed chain to verify the linear case.


10. Key Lessons

  • When a problem feels too complex to attack head-on, break it down into mutually exclusive cases that together cover every possibility. In Step 1, we asked "what kinds of best paths exist at a given node?" and enumerated three parts. That decomposition didn't solve the problem on its own, but it converted a vague "find the best path anywhere" question into three concrete sub-questions we could analyze one at a time. Each case is a smaller, more tractable version of the problem.

  • When a recursive function needs to return information that isn't the same as the final answer, separate the two. Return what the parent needs to make its decision; track the actual answer in a separate global variable. This pattern shows up across tree DP problems — Diameter of Binary Tree, Longest Univalue Path, and others use the same decomposition.

  • A "path that bends through this node" uses both subtrees, but a "chain that extends from this node up to its parent" can only use one. Recognizing the asymmetry between these two views is the entire problem.

  • When extending a path with non-positive values would hurt, drop them. max(child_contribution, 0) is one line of code that handles all the "don't include this if it's negative" cases without special branches. It does not apply to the current node's own value, though — the path must include at least one actual node.

  • Initialize global accumulators to the most extreme value the problem allows. For "max with potentially all-negative input," that's Integer.MIN_VALUE, not 0. Using 0 would create a phantom "empty path" that isn't actually valid.

  • Postorder DFS (children before parent) is the right pattern whenever a node's answer depends on values computed in its subtrees. Tree DP almost always means postorder.


The thing that makes Binary Tree Maximum Path Sum click is the recognition that what flows up and what gets recorded can be two different quantities, and trying to make a single return value do both jobs is what makes the problem feel impossible. Once you separate them, the algorithm collapses from "I have no idea how to track this" into ten lines of clean code. Every other "find the best path/diameter/chain in a tree" problem follows the same template — once you see the split, you'll see it everywhere.


Good Luck and Happy Coding!

Recent Posts

See All

Comments


Drop Me a Line, Let Me Know What You Think

Thanks for submitting!

© 2026 by WhiteboardReady

bottom of page