Davide Spataro

3Sum

medium · Arrays & Hashing · 26 min read
array hashing two-pointers sorting dedup interview-strategy

Introduction

3Sum is Two Sum’s older sibling. Once you internalize the move from Two Sumreplace an inner search with a lookup — 3Sum looks like a trivial extension: fix one element aia_i and ask “is there a Two Sum for target ai-a_i in the rest of the array?” That is the algorithmic content of the problem.

It is also not what the problem is actually testing.

The real grading criterion for 3Sum, at every company that asks it, is deduplication. Almost every candidate can see the reduction to Two Sum inside ninety seconds. A smaller fraction can implement it without bugs. A much smaller fraction produces a correct, deduped output on the first run — without either (a) double-counting triplets like (0,0,0)(0, 0, 0), or (b) silently dropping one of two legitimately distinct triplets that happen to share an element. That delta is where the signal lives.

3Sum is not really about 3Sum. It is about writing the deduplication argument before you write the loop that needs it.

This chapter takes the Two Sum move as given and spends most of its energy on the part the interview actually scores: sorting so dedup becomes a one-liner, the two-pointer skip step, and the handful of edge cases where a naive implementation quietly produces the wrong set.

What the interviewer is actually testing

Read off the checklist out loud; interviewers track it even when they don’t tell you so:

  1. Pattern recognition. Do you see the reduction to Two Sum in under a minute?
  2. Clarification discipline. Do you ask about duplicates in the input and duplicates in the output — two different questions — before you write code?
  3. Dedup strategy. Do you have a defensible plan for deduplication, or are you hoping a hash set will bail you out at the end?
  4. Correct bounds. Do the two inner loops on your sorted array actually stop correctly when the pointers meet, and do you skip duplicates on both sides of an emit?
  5. Complexity articulation. Can you argue O(n2)O(n^2) without hand-waving, and can you argue why O(n2)O(n^2) is the lower bound for this family?

Problem statement

Problem

Given an array AA of nn integers, return every unique triplet (a,b,c)(a, b, c) with abca \le b \le c drawn from AA at three distinct indices ijki \ne j \ne k such that a+b+c=0a + b + c = 0.

The output must not contain duplicate triplets, but the input may contain duplicate values.

Example 1
Input A = [-1, 0, 1, 2, -1, -4]
Output [[-1, -1, 2], [-1, 0, 1]]
Two distinct triplets. The value -1 appears twice in the input, and both occurrences are used in the first triplet — that is legal because they sit at different indices.
Example 2
Input A = [0, 0, 0, 0]
Output [[0, 0, 0]]
The canonical double-counting trap. A naive loop will emit four copies; dedup must knock it down to one.
Example 3
Input A = [1, 2, 3]
Output []
All positive; no triplet can sum to zero. A sorted-array approach should detect this and exit without any inner work.

Clarifying questions — and why each one matters

3Sum arrives under-specified in two very specific ways. If you ask the two questions below in your first minute, you have already passed a signal the interviewer is waiting for.

Clarification questions

Q Can the input contain duplicate values, and if so, can the same value appear in more than one output triplet?
A Two questions masquerading as one. The standard answer: input may contain duplicates; output triplets must be unique as multisets; but a single value can appear across several different triplets. Candidates who conflate these two end up either missing triplets (over-dedup) or emitting the same triplet multiple times (under-dedup).
Q Is the target strictly zero, or a parameter T?
A LeetCode asks for zero. The algorithm is identical for arbitrary T — replace the comparison with A[lo] + A[hi] vs T - A[i]. Ask, because many interviewers quietly generalize mid-interview to test flexibility.
Q Can I modify the input array?
A If yes, sort in place and the two-pointer approach is free. If not, sort a copy; you'll pay O(n) extra space for the copy but the time complexity is unchanged.
Q Must the output be sorted, or does any ordering work?
A LeetCode does not require a specific order of triplets, but within a triplet the values should be sorted. Sorting the input gets both for free.
Q Any guarantees on n or the value range?
A At n ≤ 3000, O(n²) is plenty. At n ≥ 10⁵, the constant factor of the two-pointer approach starts to matter and the per-pivot hash-set approach becomes impractical. Ask; sizes often hide in the problem statement.

A mental framework for k-Sum problems

3Sum is the first step on a ladder. The same reduction applies to 4Sum, 5Sum, k-Sum:

  1. Sort. This is not optional for the recommended solution. Sorting turns deduplication from a data-structure problem into a one-liner (skip equal neighbours).
  2. Fix k2k-2 elements with outer loops. Each outer loop iterates over candidate values, skipping duplicates.
  3. Reduce the innermost pair to Two Sum on a sorted array. Two-pointer walk — the O(n) core from the Two Sum chapter’s two-pointer section.
  4. At the emit, advance past duplicates on both sides. This is the dedup step that makes the output unique without a separate hash set.
  5. Early-exit when the sorted prefix can no longer hit the target. At A[i]>0A[i] > 0 the minimum remaining triplet is already positive; stop.

Total complexity: O(nk1)O(n^{k-1}) time, O(1)O(1) extra space after sorting. For 3Sum, that is O(n2)O(n^2).

Every kk-Sum variant you will ever be asked is this skeleton with a different outer loop count. Internalize the skeleton here; you will never re-derive it.

Discussion

Approach 01

Brute force — three nested loops

Enumerate every ordered triple. Cubic time, correct, useful only as a sanity check.

Time O(n^3) Space O(1)

The interview narration

“The obvious approach is three nested loops: O(n3)O(n^3) time, O(1)O(1) space. Let me name it so we can compare against it, then look for the reduction to Two Sum.”

State this in fifteen seconds; do not write the code unless the interviewer specifically asks. Writing O(n3)O(n^3) in a 3Sum interview and moving on often reads as “I don’t know the faster one yet.”

The code

brute-force.cpp
#include <algorithm>
#include <vector>
// Enumerate every triple (i < j < k). Cubic time, constant extra space.
// Produces a deduplicated set of triplets by sorting each triplet and then
// the output — simple but slow.
std::vector<std::vector<int>> threeSum(std::vector<int> A) {
std::vector<std::vector<int>> out;
const int n = static_cast<int>(A.size());
for (int i = 0; i < n; ++i) {
for (int j = i + 1; j < n; ++j) {
for (int k = j + 1; k < n; ++k) {
if (A[i] + A[j] + A[k] == 0) {
std::vector<int> t = {A[i], A[j], A[k]};
std::sort(t.begin(), t.end());
out.push_back(std::move(t));
}
}
}
}
// Dedup the output; the triplets themselves were already sorted above.
std::sort(out.begin(), out.end());
out.erase(std::unique(out.begin(), out.end()), out.end());
return out;
}

Three nested loops and a set-union for dedup — the baseline you name before you replace it. Don't ship this in an interview unless $n \le 100$ is explicit.

Why it is O(n3)O(n^3) — and what it gets right

The arithmetic is the straightforward triple sum (n3)=Θ(n3)\binom{n}{3} = \Theta(n^3). What this approach does get right is correctness: every unordered triplet is enumerated exactly once (because of the i<j<ki < j < k constraint), the sum check is direct, and the final dedup handles the [0,0,0,0][0, 0, 0, 0] trap by sorting each triplet before insertion.

Keep the correctness argument in mind. The two fast approaches below preserve it with different deduplication tricks; the underlying claim — every unordered triplet is considered exactly once — is what you must maintain.

Approach 02

Hash set per pivot

Fix one element, run a Two Sum hash over the remainder. Expected O(n²), but deduplication is painful.

Time O(n^2) expected Space O(n)

The insight, stated cleanly

Fix the outer index ii. For every j>ij > i, the problem reduces to “is there an index k>jk > j with A[k]=A[i]A[j]A[k] = -A[i] - A[j]?” — a Two Sum lookup on a sliding suffix. A hash set of values seen to the right of jj turns the inner search into O(1)O(1) expected.

Total work per pivot: O(n)O(n) expected. Total across nn pivots: O(n2)O(n^2) expected.

The code

hashset-per-pivot.cpp
#include <algorithm>
#include <set>
#include <unordered_set>
#include <vector>
// Fix one element A[i] and reduce the remaining problem to Two Sum on
// a target of -A[i]. Expected O(n^2) time, O(n) extra space.
//
// Deduplication is awkward: we have to canonicalize each triplet and
// push it through a set, because the hashing pass does not know which
// value it already emitted at the current pivot.
std::vector<std::vector<int>> threeSum(std::vector<int> A) {
std::sort(A.begin(), A.end()); // only to canonicalize triplets for dedup
const int n = static_cast<int>(A.size());
std::set<std::vector<int>> unique;
for (int i = 0; i < n - 2; ++i) {
// Skip duplicates at the pivot, or the inner loop wastes work
// re-finding the same triplets.
if (i > 0 && A[i] == A[i - 1]) continue;
const int target = -A[i];
std::unordered_set<int> seen;
for (int j = i + 1; j < n; ++j) {
const int complement = target - A[j];
if (seen.count(complement)) {
// A[i] <= complement <= A[j] because the array is sorted
// and we scan j left-to-right, so this is already canonical.
unique.insert({A[i], complement, A[j]});
}
seen.insert(A[j]);
}
}
return {unique.begin(), unique.end()};
}

Two Sum with a hash set, repeated once per pivot, with sort + std::set dedup bolted on because the hashing pass cannot dedup on its own.

Why the dedup pain is real

Hashing preserves no ordering between emits. That has three consequences that burn candidates in interviews:

  • Within a single pivot, the hash set can re-find the same pair via either endpoint; we mitigate by only emitting when the complement was seen before A[j], but the logic is fragile.
  • Across pivots, the same triplet can be emitted three times (once per rotation of its elements as pivot). Canonicalizing by sorting the triplet and stuffing it into a std::set<std::vector<int>> fixes this but costs O(logm)O(\log m) per emit and O(m)O(m) space for the m output triplets.
  • Skipping duplicate pivots matters: without the A[i] == A[i-1] guard, [1,1,0,1][-1, -1, 0, 1] re-pays the full inner pass for the second 1-1.

In aggregate, this solution is correct but is almost never what you would ship. The sorted two-pointer approach below is simpler to reason about and has strictly smaller constant factors.

When you’d still use it

  • The input is intrinsically unsortable (e.g., stream of integers you cannot revisit). Then hashing is the only option.
  • You need the count of triplets, not the triplets themselves, and you can afford the dedup bookkeeping. A Bloom-filter approximation is sometimes acceptable.
Approach 03

Sort, then two pointers — the answer Recommended

Sort once, fix A[i], walk a two-pointer on the rest. O(n²) time, O(1) extra space, dedup falls out of the sort.

Time O(n^2) Space O(1) sorted arraytwo pointers

The mechanic

Sort AA in nondecreasing order. For each pivot index ii, set L=i+1L = i + 1 and R=n1R = n - 1 and run the Two Sum two-pointer walk from the Two Sum chapter, targeting A[i]-A[i]. The three moves are:

Sum at i,L,Ri, L, RActionWhy
A[i]+A[L]+A[R]=0A[i] + A[L] + A[R] = 0record the triplet, advance past duplicates on both sides, step LL right and RR leftAny further pair that matches with this pivot will have a strictly greater A[L]A[L] and strictly smaller A[R]A[R].
A[i]+A[L]+A[R]<0A[i] + A[L] + A[R] < 0LL+1L \leftarrow L + 1Need a larger sum; the array is sorted, so only raising LL can grow it.
A[i]+A[L]+A[R]>0A[i] + A[L] + A[R] > 0RR1R \leftarrow R - 1Need a smaller sum; only lowering RR can shrink it.

On the outer loop, skip a pivot when A[i]=A[i1]A[i] = A[i-1] (same work already done), and break entirely when A[i]>0A[i] > 0 (all subsequent sums are strictly positive).

The code

sort-two-pointers.cpp
#include <algorithm>
#include <vector>
// Sort once, then fix A[i] and two-pointer the rest. O(n^2) time,
// O(1) extra space (ignoring the output vector and the sort's own
// stack usage). Deduplication falls out naturally from the sort.
std::vector<std::vector<int>> threeSum(std::vector<int> A) {
std::sort(A.begin(), A.end());
const int n = static_cast<int>(A.size());
std::vector<std::vector<int>> out;
for (int i = 0; i < n - 2; ++i) {
// Early exit: the smallest possible triplet starting here is
// already positive, so no later i can hit zero either.
if (A[i] > 0) break;
// Skip duplicate pivots so we never emit the same triplet twice.
if (i > 0 && A[i] == A[i - 1]) continue;
int lo = i + 1;
int hi = n - 1;
while (lo < hi) {
const int s = A[i] + A[lo] + A[hi];
if (s == 0) {
out.push_back({A[i], A[lo], A[hi]});
// Advance past duplicates on BOTH sides before stepping.
// This is the dedup trick that makes the sort pay off.
while (lo < hi && A[lo] == A[lo + 1]) ++lo;
while (lo < hi && A[hi] == A[hi - 1]) --hi;
++lo;
--hi;
} else if (s < 0) {
++lo; // Sum too small: the only way to grow it is to raise lo.
} else {
--hi; // Sum too large: the only way to shrink it is to lower hi.
}
}
}
return out;
}

The version to ship: sort, fix each pivot, two-pointer the rest. Dedup is three `while` guards, no auxiliary data structures.

The dedup argument, written down

This is the part strong candidates articulate explicitly:

Claim. Every unordered triplet with sum 00 is emitted exactly once.

Invariant. At the start of each outer-loop iteration, no triplet with pivot value strictly less than A[i]A[i] is missing from the output. On entering the inner loop with a fresh pivot, no triplet with pivot A[i]A[i] and left element less than A[L]A[L] is missing.

Why the skips are safe.

  • Outer skip: if A[i]=A[i1]A[i] = A[i-1] and the previous iteration already emitted every valid triplet with pivot A[i1]A[i-1], then the current pivot adds nothing new, because “pivot = A[i1]A[i-1]” is the same constraint as “pivot = A[i]A[i]” on the set of values.
  • Inner skips on emit: having just emitted (A[i],A[L],A[R])(A[i], A[L], A[R]), the only triplet that could legitimately share those three values is… the same triplet. Any position L>LL' > L with A[L]=A[L]A[L'] = A[L] would either re-emit the same triplet (with some RRR' \le R) or require an adjusted RR' — but the sorted order forces us to move both pointers, and skipping to the last duplicate on each side preserves correctness while avoiding emission of duplicates.

Say this out loud when you close. Most candidates at this problem cannot; saying it cleanly is a strong positive signal.

Edge cases to call out explicitly

InputExpectedWhy it works
[] or [0] or [0, 0][]The outer loop runs at most once and the inner loop never enters because L<RL < R fails.
[0, 0, 0, 0][[0, 0, 0]]First pivot emits once; the inner while skips push LL to the last 00 and RR to the first 00 so L<RL < R breaks. Outer skips prevent re-entry on the subsequent zeros.
[0, 0, 0, 0, 0][[0, 0, 0]]Same as above. Stress-test your implementation against this.
[-2, 0, 1, 1, 2][[-2, 0, 2], [-2, 1, 1]]The value 1 appears twice and legitimately pairs with itself. The inner loop finds it because A[L]=A[L+1]=1A[L] = A[L+1] = 1 is allowed — only the advance-past-duplicates runs after the emit.
[1, 2, 3][]Early-exit on A[0]>0A[0] > 0 skips the whole loop. Constant time.
All negative[]The smallest sum is A[n3]+A[n2]+A[n1]A[n-3] + A[n-2] + A[n-1], which is still negative; no iteration emits.

The [-2, 0, 1, 1, 2] case is the one to memorize. It is the trap that catches candidates who skip duplicates before the sum check instead of after the emit.

Complexity, precisely

  • Time: O(n2)O(n^2). The sort is O(nlogn)O(n \log n), strictly dominated. The outer loop runs nn times; each inner two-pointer walk is O(n)O(n). Both pointers move monotonically (never step back), so the inner loop is linear, not quadratic.
  • Space: O(1)O(1) extra, ignoring the output vector itself and the O(logn)O(\log n) stack for std::sort. If the input cannot be mutated, O(n)O(n) for a sorted copy.

Why O(n2)O(n^2) is the natural floor

Any correct algorithm must, in the worst case, consider every pair of array elements — otherwise an adversary can hide a valid triplet in the pair you skipped. 3Sum with an all-distinct input can produce Θ(n2)\Theta(n^2) triplets (construction: take an arithmetic progression centered on 0), so the output itself can be quadratic. You cannot beat O(n2)O(n^2) without assumptions on the values.

Stating this bounds the conversation: when the interviewer asks “can you do better?” the answer is “not without restricting the input.”

Approach comparison

ApproachTimeSpaceProsCons
Brute forceO(n3)O(n^3)O(1)O(1)Obviously correctToo slow beyond n500n \approx 500; dedup needs a post-pass
Hash set per pivotO(n2)O(n^2) expectedO(n)O(n)Works on values that resist orderingDedup is painful; worst-case O(n2)O(n^2) degrades with bad hashes
Sort + two pointersO(n2)O(n^2)O(1)O(1)Smallest constants; dedup is free from the sort; cache-friendlyMutates or copies input; requires orderable values

Default answer: sort + two pointers. State the reduction to Two Sum, state the sort, state the dedup argument, then write the code.

Follow-ups to anticipate

The interviewer asks…What to do
”What about 4Sum?”Add one more outer loop; total O(n3)O(n^3). Same dedup pattern (skip equal pivots, skip equal emits).
”General kk-Sum?”Recurse: fix one element, recurse with k1k - 1 on the suffix; base case k=2k = 2 is the two-pointer walk. O(nk1)O(n^{k-1}).
“3Sum Closest — find the triplet whose sum is closest to TT.”Same skeleton; replace the == 0 emit with abs(sum - T) < best; no dedup needed (single answer).
“3Sum Smaller — count triplets with sum <T< T.”On sum < T, add R - L to the count and advance LL; skip emit and dedup entirely.
”The array is streamed; nn is unknown.”Hashing approach, accepting the dedup bookkeeping cost. Or reservoir-sort and reprocess.
”Values are floats.”Sort + two pointers still works; comparisons are on doubles. Emit on abs(sum) < eps.
”Adversarial hashing input.”Prefer the sort + two pointers version — no hashing surface to exploit.

Pattern recognition — the real lesson

The move you practiced here — fix an outer element, reduce to the (k1)(k-1)-sum problem on the suffix, let sort carry the dedup — reappears in every kk-sum variant and in a surprising number of problems that don’t look like kk-sum at first:

  • 4Sum, k-Sum. Direct application.
  • 3Sum Closest, 3Sum Smaller. Same walk, different emit condition.
  • Triangle count (count triples a,b,ca, b, c with a+b>ca + b > c). Sort, fix the longest side, two-pointer the rest.
  • Subarray Product Less Than K. Not kk-sum, but the same move both pointers monotonically on a sorted (or windowed) range pattern.
  • Minimum Triangle Perimeter. Sort, then consecutive triples — a simpler cousin of this skeleton.

More deeply, this chapter is the second argument in a running theme: whenever an interview problem asks for unordered combinations of array elements satisfying a constraint, sorting is almost always the right first instruction, because it turns enumerate without duplicates from an algorithm design problem into three while guards.

What to say out loud — the delivery script

Rehearse this until it’s automatic:

  1. Restate the problem. “I need every unique triplet of elements from AA that sums to zero. Duplicates in the input are allowed; duplicates in the output are not.”
  2. Ask the two clarifications. Dedup semantics, and whether you can modify the input.
  3. Name the reduction. “If I fix one element A[i]A[i], this becomes Two Sum on the suffix with target A[i]-A[i].”
  4. State the plan. “I’ll sort so dedup is a neighbour check, then for each pivot run a two-pointer walk on the rest. O(n2)O(n^2) time, O(1)O(1) extra space.”
  5. Write it. The three whiles are the part worth writing slowly.
  6. Trace through [-1, 0, 1, 2, -1, -4]. Narrate the pivot skips and the emit-plus-advance on both sides.
  7. State the dedup claim out loud. “Outer skip handles duplicate pivots; inner skips handle duplicate emits; together they guarantee each triplet emits exactly once.”
  8. Call out the trap cases. [0, 0, 0, 0] and [-2, 0, 1, 1, 2].
  9. Summarize. O(n2)O(n^2) is optimal in the general case because the output itself can be quadratic.”

Delivery is scored. The candidate who says “dedup falls out of the sort” is ahead of the candidate who writes the same code silently.

Summary

3Sum is Two Sum with one more pointer — and one much harder editorial problem: how do you not emit the same triplet twice? The answer is almost never “stuff everything in a hash set at the end”; it is “sort first, then skip duplicates on both sides of every emit.” That move generalizes directly to every kk-Sum variant you will ever be asked.

Internalize the skeleton: sort, fix k2k - 2 elements with outer loops, two-pointer the innermost pair, skip neighbours on both sides after each emit, early-exit when the sorted prefix can no longer hit the target. Every k-sum problem on every problem sheet collapses to that skeleton. The point of 3Sum is to build the reflex.