Combination Sum II

2 days ago
7 min read

The Duplicate Problem

Combination Sum II has the same goal as Combination Sum - find all combinations that sum to a target - with one crucial difference. Each number can only be used once, and the input may contain duplicates. That single change forces you to be much more deliberate about how you explore the search space. Get the duplicate handling wrong and you'll produce repeated combinations. Get it too aggressive and you'll miss valid ones.

Problem Statement

You are given a collection of candidate numbers candidates and a target number target.

Return all unique combinations in candidates where the chosen numbers sum to target.

Each number in candidates may be used at most once.
The input array may contain duplicate numbers.
The solution set must not contain duplicate combinations.

Examples:

candidates = [10,1,2,7,6,1,5], target = 8 → [[1,1,6], [1,2,5], [1,7], [2,6]]
candidates = [2,5,2,1,2], target = 5 → [[1,2,2], [5]]
candidates = [1,1,1,1], target = 2 → [[1,1]]

1. Clarify Requirements Before Jumping Into Code

Before coding, lock down the constraints carefully, as this problem has more of them than Combination Sum.

Input: An array of positive integers (may contain duplicates) and a target integer.

Output: All unique combinations that sum exactly to target, where each element is used at most once.

Key rules to confirm:

[1, 2, 5] and [2, 1, 5] are the same combination - ordering in paths doesn't matter.
Is candidates sorted? Not necessarily.
Duplicate numbers in the input can produce duplicate combinations if you're careless.
All numbers are positive - sums only grow, so pruning is safe.

This is where paying close attention to the problem statement pays off. Combination Sum allowed reuse and had no duplicates. This problem disallows reuse and introduces duplicates. Both changes need to be handled, and they need to be handled independently.

2. Identify the Category of the Problem

This is backtracking - the same choose → explore → un-choose structure from Combination Sum, Subsets, and Permutations. But this problem has the most constrained version of the pattern so far:

No reuse: advance to i + 1 after choosing element i.
No duplicate combinations: skip candidates at the same recursion depth that have already been tried.

The second constraint is the new one, and it's where most candidates go wrong. Understanding why the deduplication logic works - not just what it is - is the real test.

3. Start With Brute Force to Build Intuition

The naive approach generates every possible subset of candidates, filters the ones that sum to target, then deduplicates the results.

The problems here are compounding. First, there are 2ⁿ subsets - expensive to enumerate. Second, deduplicating afterward requires sorting each combination and comparing it to every other - even more expense. Third, and most importantly, it completely sidesteps the interesting part of the problem: learning to structure the search so duplicates never appear in the first place.

The brute force solution does provide some insights around what we need to focus on for our final solution: a way to avoid generating the same combination twice, without doing extra work after the fact.

4. Brainstorm More Solutions

Step 1: Start From What You Already Know

Compare this problem directly to Combination Sum. In that problem:

Numbers could be reused → we passed i (not i + 1) in the recursive call.
No duplicate inputs → no duplicate combinations could arise.

Here, both of those change. Let's handle them one at a time.

Fixing reuse: Easy. Pass i + 1 instead of i in the recursive call. That moves past the current element, so it can't be picked again in the same combination.

Fixing duplicates: This is harder. Looks dive deeper on this requirement.

Step 2: Understand Why Duplicates Arise

Suppose candidates = [1, 1, 2] and target = 3.

After sorting, the array is [1, 1, 2]. Let's label them 1a, 1b, and 2 to tell them apart. Without any deduplication logic, backtracking would find:

[1a, 2]
[1b, 2]
[1a, 1b]

But [1a, 2] and [1b, 2] are the same combination - both are [1, 2]. We've generated a duplicate because we used the two identical 1s as if they were distinguishable.

The root cause: at the same level of the recursion tree, we tried 1a as our choice and also tried 1b as our choice, but since they have the same value, both choices lead to the same set of combinations downstream.

Step 3: The Fix - Skip Duplicates at the Same Depth

If we're at a given recursion depth and we've already tried a candidate with value v, there's really no point trying other candidates with that same value at our current recursion depth. It will generate exactly the same combinations.

What kind of conditional clause do we need to tell us if the current candidate has the same value as the previous one?

Something like: candidates[i] == candidates[i - 1]

Now, we know we only want to skip duplicates at the same depth, and not across different depths. Otherwise we risk eliminating valid combinations like [1, 1] from [1, 1, 2]. How do we turn that into a conditional clause?

Something like i > start should work, to ensure that when we're looping through our candidates, we don't look back at candidates that we've already passed

The condition that captures this:

if (i > start && candidates[i] == candidates[i - 1]) {
    continue;
}

Together: "if this value is a duplicate of something we already tried at this recursion level, skip it."

So to summarize our solution:

Backtracking to explore all possible combinations
Sorting the candidates to enable early pruning
Skipping duplicate candidates at same recursion levels to ensure no duplicate combinations

5. Trade-Offs at a Glance

Approach	Time	Space	Notes
Brute-force subsets + dedup	O(2ⁿ × n)	Large	Conceptually simple; wasteful
Backtracking	O(2ⁿ)	O(n)	Clean, pruned, correct
Backtracking + sort + early break	O(2ⁿ)	O(n)	Faster in practice

The time complexity is exponential for the same reason as Combination Sum - in the worst case, many combinations may be valid and must be enumerated. Sorting enables the early break when candidates[i] > remaining, which prunes dead-end branches before they're explored.

6. Pseudocode

sort candidates

result = empty list

backtrack(start, remaining, path):
    if remaining == 0:
        add copy of path to result
        return

    for i from start to candidates.length - 1:
        if i > start AND candidates[i] == candidates[i-1]:
            continue    ← skip duplicate values at the same depth

        if candidates[i] > remaining:
            break       ← prune: no point continuing (array is sorted)

        add candidates[i] to path
        backtrack(i + 1, remaining - candidates[i], path)   ← i+1: no reuse
        remove last element from path

Compared to Combination Sum's pseudocode, there are exactly two differences: i + 1 instead of i in the recursive call, and the duplicate-skip condition at the top of the loop. Everything else is identical.

7. Edge Cases

No valid combination: The recursion prunes or exhausts all paths without hitting remaining == 0. Returns [].
All duplicates: [1,1,1,1], target = 2 → [[1,1]]. Only one combination despite four identical elements — the skip condition collapses the duplicates correctly.
Target smaller than all candidates: [3,4,5], target = 2 → []. The break triggers on the first iteration of every call.
Single element: [1], target = 1 → [[1]]. Straightforward - no duplicates to manage.

8. Full Solution

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombinationSumII {

    public static List<List<Integer>> combinationSum2(int[] candidates, int target) {
        Arrays.sort(candidates);  // required for both duplicate skipping and early break
        List<List<Integer>> result = new ArrayList<>();
        backtrack(candidates, 0, target, new ArrayList<>(), result);
        return result;
    }

    private static void backtrack(int[] candidates, int start,
                                   int remaining,
                                   List<Integer> path,
                                   List<List<Integer>> result) {

        if (remaining == 0) {
            // copy — not a reference
            result.add(new ArrayList<>(path));  
            return;
        }

        for (int i = start; i < candidates.length; i++) {
            // Skip duplicate values at the same recursion depth
            if (i > start && candidates[i] == candidates[i - 1])
                continue;

            // Prune subsequent candidates  that are too large
            if (candidates[i] > remaining) 
                break;

            path.add(candidates[i]);
            backtrack(candidates, i + 1, remaining - candidates[i], path, result);
            path.remove(path.size() - 1);
        }
    }
}

9. Test Your Solution

// Standard case with duplicates in input
System.out.println(combinationSum2(new int[]{10,1,2,7,6,1,5}, 8));
// [[1,1,6], [1,2,5], [1,7], [2,6]]

// Multiple duplicates
System.out.println(combinationSum2(new int[]{2,5,2,1,2}, 5));
// [[1,2,2], [5]]

// All identical elements
System.out.println(combinationSum2(new int[]{1,1,1,1}, 2));
// [[1,1]]  ← not [[1,1], [1,1], [1,1], ...] — dedup is working

// No valid combination
System.out.println(combinationSum2(new int[]{3,4,5}, 2));
// []

// Single element
System.out.println(combinationSum2(new int[]{1}, 1));
// [[1]]

The [1,1,1,1] test is the most important one to examine closely. If your deduplication logic is right, you get exactly one [1,1]. If you get multiple copies, the i > start guard is missing or wrong. If you get nothing, the guard is too aggressive and is skipping valid combinations across depths.

10. Key Lessons for Future Problems

Sorting can sometimes be a prerequisite, not just an optimization. The duplicate-skip logic requires that equal values are adjacent. Without sorting, candidates[i] == candidates[i-1] means nothing reliable. Sort first when the input may contain duplicates.
Skip duplicates at the same depth, not across depths. The i > start condition is the heart of this problem. Understand it mechanically: i > start means "we've already tried at least one candidate at this recursion level." If the current candidate has the same value as the previous one, trying it will produce the same results - skip it. But if we're at a new recursion depth (i == start), we haven't tried anything yet at this level, so even a duplicate value gets a fresh chance.
"Use once" and "no duplicate combinations" are different constraints. "Use once" is handled by advancing to i + 1. "No duplicate combinations" is handled by the skip condition. These solve different problems and must both be present. Conflating them is a common source of bugs.

Understand the whole backtracking family together. You've now seen four variations of the same pattern:

Problem	Reuse?	Duplicate inputs?	Record at?
Permutations	No	No	Only when path is full
Subsets	No	No	Every level
Combination Sum	Yes	No	Only at remaining == 0
Combination Sum II	No	Yes	Only at remaining == 0

Seeing them as a family, rather than four separate problems to memorize, means you can reconstruct any of them from first principles. When you encounter a new backtracking problem in an interview, you can ask yourself: does it allow reuse? Does the input have duplicates? When do I record a result? The answers tell you exactly how to adapt the skeleton.

Combination Sum II is a precision test. The algorithm isn't complex - it's one condition added to code you've already written. What it tests is whether you understand that condition deeply enough to get it exactly right under pressure.

Good Luck and Happy Coding!