Two Sum
Introduction
Two Sum is the most-asked coding interview question on Earth. That fact alone would earn it a chapter. But Two Sum deserves far more than that: it is where most engineers first learn the single most important performance lever in the entire coding-interview discipline — trading space for time using a hash table.
In the forty-five minutes of a typical on-site, the interviewer is really testing two things. Can you produce correct code? And: can you reason about tradeoffs? Two Sum surfaces both, fast. A candidate who writes only the nested loop reveals a ceiling. One who sees the hashing solution but cannot explain why they look up before they insert reveals another. One who can produce the hashing solution, a sorted two-pointer alternative, and can argue when each wins — that candidate is already distinguishing themselves.
The skill Two Sum actually tests is not “can you do Two Sum”. It is “can you replace a search with a lookup, and know why it is safe.”
This chapter is designed to take you all the way there. If you are new to interview prep, read it linearly: the approaches, the narration, the traps. If you have written the hash solution a dozen times, jump to the Follow-ups and Correctness invariants sections — there is more nuance here than the typical problem sheet reveals.
What the interviewer is actually testing
Before we look at code, know what you are being graded on. The Two Sum rubric at most top companies scores four things:
- Clarification discipline. Did you ask at least two sharp clarifying questions in the first minute?
- Multiple approaches. Did you surface at least two qualitatively different solutions?
- Tradeoff articulation. Can you argue which solution you would ship, and under what constraints you would switch?
- Clean code. Did you write compilable, idiomatic code, and walk through it on a concrete input?
Every section below is organized around delivering those four signals.
Problem statement
Given an array of size and an integer , return true if any two distinct elements of sum to , and false otherwise.
Formally: return true iff there exist indices such that .
A = [9, 4, 17, 42, 36, -3, 15], T = 14 true A = [1, 3, 7], T = 6 false A = [5, 5, 1], T = 10 true Clarifying questions — and why each one matters
In the interview the problem will arrive slightly under-specified. Your clarifications are themselves being evaluated; asking the right ones in the first sixty seconds is one of the highest-value minutes of the whole session. Here is what to ask, and why.
Clarification questions
- Q Can the input be modified?
- A Decides whether the two-pointer approach is free. If yes, sort in place. If not, you either copy the array first or lean on hashing.
- Q Can values be negative?
- A If positive only, certain micro-optimizations open up (skip elements larger than T). If negatives are allowed, several tempting shortcuts quietly break.
- Q Can the same element be used twice?
- A The single most important clarification. Standard answer: no — indices must differ, but two elements with the same value at different indices are a valid pair. Candidates who skip this return wrong answers on inputs like
[5, 5, 1], T=10. - Q Is the input sorted?
- A If yes, the two-pointer walk is O(n) without the sort. Many interviewers present an unsorted array by default; always ask.
- Q Should we return a boolean, or the indices of the pair?
- A The common variant (e.g., LeetCode #1) asks for indices. The algorithms do not change but the data structure does — you use a hash map of value → index instead of a hash set.
- Q Are there multiple valid pairs; return any or all?
- A Affects whether you return on first match or iterate to the end. Most variants ask for any one pair; some ask for the count of pairs.
- Q Any guarantees about n or value range?
- A At n ≥ 10⁶ the hash-table constants start to matter; at values near
INT_MAX, the sum may overflow. Ask.
A mental framework for lookup-shaped array problems
Before diving into approaches, here is the mental sequence I recommend for problems of this shape. It generalizes to 3Sum, 4Sum, Two Difference, Subarray Sum Equals K, Longest Substring Without Repeating Characters, and many more:
- Enumerate and check. Can you brute-force every candidate? Write it as a baseline — even if you throw it away.
- Exploit structure. Is the input sorted? Bounded in range? Non-negative? Structure is almost always the path to a better algorithm.
- Replace search with a hash lookup. If the inner loop is looking for a specific value, a hash table collapses it to .
- Replace search with sort + two pointers. If the inner loop is looking for a range of values, sort + two pointers is often simpler than hashing.
- Replace recomputation with running state. Prefix sums, sliding windows, monotonic stacks.
For Two Sum, steps 1, 3, and 4 all apply. Let’s walk them.
Discussion
Brute force
Check every pair with two nested loops. Correct, quadratic, your starting line.
The interview narration
Say this out loud, almost verbatim:
“The straightforward approach is to check every pair — two nested loops, time, space. I’ll write it as a baseline and then look for a faster one.”
This sentence signals three things at once: you know the obvious approach, you know it is not the answer, and you have a plan.
The code
#include <vector>
// Returns true iff any two distinct indices i, j have A[i] + A[j] == T.// O(n²) time, O(1) extra space. Use only as a baseline.bool twoSum(const std::vector<int>& A, int T) { const int n = static_cast<int>(A.size()); for (int i = 0; i < n - 1; ++i) { for (int j = i + 1; j < n; ++j) { if (A[i] + A[j] == T) return true; } } return false;}Enumerate every index pair; quadratic time, constant extra space — the version you name before you optimize.
Why it is — write this on the whiteboard
It is tempting to wave your hand and say “nested loops, quadratic”. Don’t. Write the arithmetic — it is free points and shows your work:
The inner loop runs times for , for , and so on down to for . That is the triangular number, which is .
Edge cases this handles for free
- : the outer loop body never executes; we return
false. Correct. - All identical values:
A[i] + A[j]withi < jis enumerated; correct. - Negative numbers: the comparison is value-based; no issue.
When brute force is genuinely fine
If the interviewer explicitly says and there will be no follow-up questions, brute force is the answer. Don’t overengineer. But this is rare — usually the follow-up is “now do it faster.”
Hashing — one pass Recommended
Scan once. At each element, ask the hash set: have I already seen T − x? The move that turns quadratic into linear.
The insight, stated cleanly
The inner loop of brute force is searching for a single specific value: for each , is there an with such that ? Rearrange that equation. We’re looking for . A hash set answers that lookup in constant time on average.
So: scan once. At each index, ask the set “have I already seen ?”. If yes, return true. If not, record and move on.
The critical subtlety — look before you insert
This is the detail that separates candidates.
A tempting wrong variant: stuff the whole array into the set first, then in a second loop look up each . This is wrong when is even and appears exactly once in : the lookup will find — itself — and return a false positive.
Concrete failure: . The two-pass version finds in the set, 10 - 5 = 5 also in the set, returns true. But there is no valid pair. The correct answer is false.
Search before you insert is the invariant that keeps the element out of its own pair.
At A[0] = 9, look for T − value = 14 − 9 = 5. Not in the set yet — insert 9.
The code
#include <unordered_set>#include <vector>
bool twoSum(const std::vector<int>& A, int T) { std::unordered_set<int> seen; seen.reserve(A.size());
for (int x : A) { // CRITICAL: look for the complement BEFORE inserting x, // so x can never pair with itself when T == 2·x. if (seen.count(T - x)) return true; seen.insert(x); } return false;}One pass: look up the complement before inserting the current value — the standard interview solution for the boolean variant.
The indices variant
A frequent variant (and the one LeetCode #1 asks) is to return the indices of the pair, not a boolean. One small change — a hash map of value → index instead of a set — handles it:
#include <unordered_map>#include <vector>
// Returns the zero-based indices {i, j} of any valid pair, or {-1, -1} if none.// O(n) time, O(n) space.std::pair<int, int> twoSumIndices(const std::vector<int>& A, int T) { std::unordered_map<int, int> idx; // value -> index of first occurrence idx.reserve(A.size());
for (int i = 0; i < static_cast<int>(A.size()); ++i) { const int complement = T - A[i]; auto it = idx.find(complement); if (it != idx.end()) { return {it->second, i}; } // Insert only after the lookup — never overwrites an earlier occurrence. idx.emplace(A[i], i); } return {-1, -1};}Same idea with a map from value to index, for when the interviewer wants positions back (e.g. LeetCode 1).
Anticipate this variant. When the interviewer says “return the indices” after your boolean solution, the delta should take you twenty seconds.
Correctness argument
Claim. The algorithm returns true iff a valid pair exists.
(⇐) If a valid pair exists with , then when we process index , is already in the set (inserted at iteration ). We look up , find it, return true.
(⇒) If we return true at iteration , the lookup found in the set. Since we only insert after lookups, was inserted at some earlier iteration . Indices differ, sum equals . Valid pair.
Short argument. Memorize it. An interviewer asking “why is this correct?” is a chance to look sharp, not to stall.
Edge cases to call out explicitly
| Input | Expected | Why it works |
|---|---|---|
| Empty array, or a single element | false | You never have two distinct indices, so the complement test never pairs two values. |
[5, 5], | true | First index inserts 5; second index looks for and finds the value already in the set. |
Single [5], | false | One iteration: the set is empty on lookup, then you insert 5 and stop — no pair. |
[5, 1, 4], | false | Hitting 10 with two 5s would re-use the same 5; look-before-insert rules that out. |
| All zeros, | true if | The second index hits complement 0 in the set from the first. |
Complexity, precisely
- Time: expected. Each of elements triggers one lookup and one insert, both expected for a good hash. Worst-case if the hash degrades (e.g., adversarial inputs hitting pathological collisions). In competitive programming you sometimes need a custom hash to prevent this; in an interview, mention it only if asked.
- Space: . Up to elements may be stored before a match is found.
Sort, then two pointers
When you cannot allocate a hash set, or when the input is already sorted, two pointers find the pair in O(1) extra space.
The mechanic
Once is sorted, place one pointer at the start () and one at the end (). Look at and move:
| Sum at , | Action | Why |
|---|---|---|
| return true | A valid pair on the two pointers. | |
| Need a larger sum; the array is sorted, so only moving right can increase. | ||
| Need a smaller sum; only moving left can decrease. |
Stop when meets .
#include <algorithm>#include <vector>
// If the input is already sorted, skip the std::sort and this is O(n) / O(1).// Otherwise O(n log n) / O(1).// If the caller needs A to remain untouched, sort a copy.bool twoSum(std::vector<int> A, int T) { std::sort(A.begin(), A.end());
int lo = 0; int hi = static_cast<int>(A.size()) - 1;
while (lo < hi) { const int s = A[lo] + A[hi]; if (s == T) return true; if (s < T) ++lo; // Sum too small: the only way to grow it is to raise the left. else --hi; // Sum too large: the only way to shrink it is to lower the right. } return false;}After sorting in place, walk $L$ and $R$ inward — linear scan after the sort, $O(1)$ extra space.
L, R, or declare a match);Reset re-runs from both ends. The line below the grid states the sum inequality driving each move — it should read like the two-pointer case split in the chapter.Sum 39 > T 14. Move R left to decrease the sum.
Correctness — the invariant
The two-pointer walk looks like magic; experienced candidates articulate why it is not.
Invariant. At every step of the walk, no pair with or can be the answer.
Proof sketch.
- We move right only when . At that moment, every has because the array is sorted — wait, the opposite: for , . So . No pair with this can sum to . Safe to discard .
- The case is symmetric.
Because we never discard a pointer that could participate in a valid pair, if a valid pair exists, the walk finds it.
State this invariant out loud in the interview. It turns “I’m moving the pointers” into “here is why this is correct”, which is the difference between a mid candidate and a strong one.
When you pick two-pointers over hashing
- The input is already sorted. Now it’s time, space — strictly better than hashing.
- The interviewer constrains memory to . Hashing is disqualified.
- Adversarial inputs are a concern. The worst-case hash degeneration () can matter in security-sensitive code; two-pointers is always .
- Cache behavior matters. Linear scans are cache-friendly; hash tables are the opposite.
When you don’t
- The input is unsorted and the size is large. Then the sort cost dominates and hashing wins. This is the common case.
Approach comparison
| Approach | Time | Space | Pros | Cons |
|---|---|---|---|---|
| Brute force | Simple, correct, reliable | Too slow for large | ||
| Hashing (one pass) | expected | Fastest single-pass solution | Extra memory; worst-case hash collisions | |
| Sort + two pointers | Constant extra space; cache-friendly | Modifies or copies input; slower than hashing |
Default answer for an unsorted input: hashing. Default for a sorted input: two pointers. Default when memory is tight: two pointers. Say this out loud when you close.
Follow-ups to anticipate
| The interviewer asks… | What to do |
|---|---|
| ”Return the indices.” | Swap hash set for hash map (shown above). |
| ”Return all pairs, not just one.” | Iterate to the end instead of returning early; dedup if needed. |
| ”What if the array is sorted?” | Go straight to two pointers; mention you’d skip the sort. |
| ”What is 3Sum?” | Fix one element; do Two Sum on the rest; total . Standard pattern. |
| ”Values are floats.” | Hash-equality on floats is brittle; use epsilon comparison or hash by rounded key. |
| ”The array doesn’t fit in memory.” | Distributed sort + linear sweep, or streaming hash with a bounded Bloom-filter front-end. |
| ”Adversarial input.” | Mention randomized hashing / universal hash families to prevent worst-case collisions. |
| ”Reduce space to on an unsorted array.” | You cannot do better than time with space — sort in place, then two pointers. |
Pattern recognition — the real lesson
The move you learned here — replace an inner search with a hash lookup — is the single most productive pattern in the early-round array portion of coding interviews. It reappears constantly:
- 3Sum, 4Sum. Fix elements, reduce to Two Sum.
- Subarray Sum Equals K. Hash prefix sums; look up
prefix − K. - Contains Duplicate. Hash each element; check before insert.
- Group Anagrams. Hash by a canonical key (sorted string).
- Longest Substring Without Repeating Characters. Hash last-seen position.
- Valid Sudoku. Hash sets per row, column, and box.
Once you see the shape — “at each element, I need to ask a lookup question about previous elements” — you apply the move on reflex. That reflex is what Two Sum is really teaching.
What to say out loud — the delivery script
Delivery is scored. Here is the narrative arc to rehearse:
- Repeat the problem in your own words. “So I have an array of integers and a target; I need to decide whether any two distinct elements sum to the target.”
- Ask the three clarifying questions. (See above.)
- State the brute force + its complexity. “The obvious approach is two nested loops, . Let me write that as a baseline.”
- Announce the insight. “I think I can do better by rearranging the equation. . If I’ve seen before, I can look it up in a hash set in constant time.”
- Write the code. No more than a dozen lines.
- Walk through an example. Pick one of the given examples; trace the hash set filling.
- State the look-before-insert invariant. “One trap here: you have to check the set before inserting , otherwise will false-positive.”
- Call out edge cases. Empty array, single element, some element.
- Mention the alternative. “If the input were sorted or we couldn’t allocate memory, I’d switch to two pointers after a sort.”
- Summarize. “I’d ship the hashing version unless space is required.”
Rehearse this until it’s automatic. The candidates who get offers aren’t the ones with the cleverest code — they’re the ones whose interview feels organized.
Summary
Two Sum looks like one problem. It is really three: the brute force is a warm-up, the hashing solution is the workhorse, the two-pointer solution is the connoisseur’s answer. Know all three, know when to use each, know the invariant that makes each one correct, and know what to say as you are writing them.
The problem is also a pattern detector. Every time an interview problem asks you to find something in an array where the criterion is “relative to some other element”, the Two Sum move — hash the values, look up the complement — is your first instinct. Internalize it here; it will save you on dozens of problems after.