Chapter 5: Arrays
Chapter 5 Overview:
This chapter covers the basic properties of arrays (specifically Python lists), common manipulation techniques, and classic array-based problems. It heavily emphasizes thinking about space complexity and optimizing operations.
Think of arrays like this:
Imagine a row of numbered boxes, right next to each other in memory.
- Contiguous: The boxes are physically adjacent. This is crucial for performance.
- Indexed: You can instantly jump to any box if you know its number (index). This is O(1) access.
- Homogeneous (usually): Typically, all boxes hold the same type of item (like all integers or all strings), though Python lists offer more flexibility.
Why Arrays Are Important (Especially for Interviews):
- Efficiency: Direct access by index is super fast (O(1)).
- Memory Locality: Because elements are stored together, accessing sequential elements is often cache-friendly, leading to good practical performance.
- Building Blocks: Many other data structures (like hash maps, heaps, stacks, queues) are often implemented using arrays underneath.
- In-Place Operations: Interviewers love problems where you modify the array directly without using significant extra memory (O(1) space). This often involves clever use of pointers or swapping elements.
Array Boot Camp: Reordering Array Entries (Page 1)
The boot camp problem is a fantastic introduction to in-place array manipulation using multiple pointers.
The Problem:
Given an array of integers, rearrange it so that all the even numbers appear before all the odd numbers.
The order within the evens or odds doesn’t matter.
Crucially, do this in-place, meaning using only O(1) extra space (you can’t just create two new lists and combine them).
Example: Input [3, 1, 2, 4, 5, 6] could become [2, 4, 6, 1, 5, 3] or [6, 2, 4, 3, 1, 5], etc.
The Boot Camp Code Explained (even_odd):
```python
def even_odd(A):
    next_even, next_odd = 0, len(A) - 1  # Pointers start at opposite ends.
    while next_even < next_odd:  # Stop when the pointers meet or cross.
        if A[next_even] % 2 == 0:
            # A[next_even] is even: it's already in the correct partition.
            # Leave it there and advance the 'even' boundary.
            next_even += 1
        else:
            # A[next_even] is odd: it belongs in the 'odd' section.
            # Swap it with the element at A[next_odd].
            A[next_even], A[next_odd] = A[next_odd], A[next_even]
            # The element now at A[next_odd] is confirmed odd,
            # so shrink the 'odd' boundary.
            next_odd -= 1
    # No return needed: A is modified in-place.
```
Intuition (Partitioning):
Think of the array being divided into three sections as you process it:
- `A[0 ... next_even - 1]`: Even numbers (correctly placed)
- `A[next_even ... next_odd]`: Unclassified numbers (yet to be examined)
- `A[next_odd + 1 ... len(A) - 1]`: Odd numbers (correctly placed)
The while loop processes the Unclassified section.
- The `next_even` pointer expands the Even section.
- The `next_odd` pointer shrinks the Unclassified section from the right (effectively expanding the Odd section).
When next_even finds an odd number, it swaps it with A[next_odd], placing the odd number in the correct final partition, and then shrinks the next_odd boundary. The element that gets swapped into A[next_even] still needs to be checked in the next iteration of the loop, which is why next_even doesn’t advance in the else block.
Complexity:
- Time: O(n). Each element is examined at most once by `next_even` and involved in at most one swap.
- Space: O(1). We only use a couple of pointer variables (Python's tuple swap needs no explicit temporary).
Key Takeaway: This boot camp demonstrates a powerful pattern: using two (or sometimes more) pointers to partition an array in-place based on some property.
Top Tips for Arrays (Page 2)
This section lists crucial strategies for array problems:
- Use the array itself for space: Cleverly reuse parts of the array or use indices/values to store information, avoiding extra lists/dicts.
- Fill from the back: When merging or writing results, starting from the end of the allocated space can avoid overwriting needed data and reduce shifting (e.g., merging two sorted arrays into one).
- Overwrite, don’t delete: If you need to remove elements and order doesn’t matter for the “removed” part, just overwrite them with elements you want to keep. Much faster than shifting everything left.
- Process integer digits: For arrays representing large numbers, process from the LSB (rightmost/last element) for arithmetic. Or reverse the array first if processing LSB-first simplifies logic.
- Subarrays: Be comfortable working with slices or pointer ranges (`A[start:end]`).
- Off-by-one errors: Double-check loop conditions (`<` vs `<=`), index calculations (`i+1`, `i-1`), and empty-array edge cases.
- Don't preserve properties prematurely: It's often fine to temporarily violate sortedness or other properties during intermediate steps, as long as you restore them by the end.
- Arrays as direct maps: If values are in a known, limited range (like 0-99 or ASCII), use a boolean or count array indexed by the value itself (e.g., `seen[value] = True`).
- Parallel logic for 2D arrays: Often, the logic for rows (iterating `i`) and columns (iterating `j`) is very similar; exploit this symmetry.
- Simulation vs. formula: For generating sequences (like spiral order), code that directly simulates the process step by step is often easier and less error-prone than deriving a complex mathematical formula for the k-th element.
Know Your Array Libraries (Python list) (Page 2-3)
Python’s list is your go-to array. It’s dynamic (grows/shrinks) and versatile.
Instantiation:
- `my_list = []`
- `nums = [1, 6, 3]`
- `zeros = [0] * 10`
- `copy_list = list(original_list)`
- List comprehensions are powerful: `squares = [x*x for x in range(10)]`
Basic Ops:
- `len(A)`
- `A.append(val)`
- `A.insert(idx, val)`
- `A.remove(val)` (removes the first occurrence)
- `val in A` (O(n) search)
2D Arrays:
- `matrix = [[0] * cols for _ in range(rows)]`
- BEWARE: `matrix = [[0] * cols] * rows` repeats references to a single inner list, so modifying one row modifies all!
Copying:
- `B = A`: `B` is just another reference to `A`; changes through `B` affect `A`.
- `B = list(A)`, `B = A[:]`, or `B = copy.copy(A)`: shallow copy. Creates a new list `B`, but the elements inside `B` are still references to the same objects as in `A`. If `A` contains mutable objects (like other lists), changing them via `A` will be visible through `B`, and vice versa.
- `B = copy.deepcopy(A)`: deep copy. Creates a fully independent copy, including copies of any nested objects. Use this when you need total separation.
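A short demonstration of these copy semantics, including the 2D-array pitfall:

```python
import copy

# The shallow-reference pitfall: [[0] * cols] * rows repeats references
# to a single inner list, so writing one cell "changes" every row.
bad = [[0] * 3] * 2
bad[0][0] = 9          # both rows now show the 9
good = [[0] * 3 for _ in range(2)]  # independent inner lists
good[0][0] = 9         # only the first row changes

# Shallow vs. deep copy of a nested list.
A = [[1, 2], [3, 4]]
shallow = list(A)      # same effect as A[:] or copy.copy(A)
deep = copy.deepcopy(A)
A[0].append(99)        # mutate an inner list of A
# shallow shares inner lists with A; deep is fully independent.
```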
Key Methods:
- `min(A)`, `max(A)`
- The `bisect` module (`bisect_left`, `bisect_right`, `insort`): efficient searching/insertion in sorted lists (O(log n) search, O(n) insertion).
- `A.reverse()`: reverses in-place. `reversed(A)`: returns an iterator over the elements in reverse order (doesn't modify `A`, doesn't immediately build a new list).
- `A.sort()`: sorts in-place. `sorted(A)`: returns a new sorted list (doesn't modify `A`).
- `del A[i]`: deletes the element at index `i`. `del A[i:j]`: deletes the slice from `i` up to (not including) `j`.
Slicing (`A[start:stop:step]`): Super powerful!
- `A[i:j]` gives a subarray. `A[:j]`: from the start up to `j`. `A[i:]`: from `i` to the end.
- `A[:]`: shallow copy.
- `A[::-1]` reverses the list (returns a new reversed list).
- `A[k:] + A[:k]` rotates left by `k`.
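A quick demonstration of these slicing idioms:

```python
A = [1, 2, 3, 4, 5]

sub = A[1:3]          # subarray: [2, 3]
head = A[:2]          # from start up to index 2: [1, 2]
tail = A[3:]          # from index 3 to end: [4, 5]
rev = A[::-1]         # new reversed list; A itself is unchanged
rot = A[2:] + A[:2]   # rotate left by k = 2: [3, 4, 5, 1, 2]

B = A[:]              # shallow copy
B[0] = 99             # does not affect A
```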
List Comprehensions: [expression for item in iterable if condition]
- Concise way to build lists.
- Often clearer than `map`/`filter`.
- Can nest: `[val for row in matrix for val in row]` flattens a 2D list.
- Avoid excessive nesting (more than two levels often becomes hard to read).
Section 5.1: The Dutch National Flag Problem (Pages 3-6)
This is a classic partitioning problem, essential for understanding quicksort and in-place array manipulation.
The Problem: Given an array A and an index i (whose value A[i] is the “pivot”), rearrange A such that:
- All elements less than the pivot come first.
- Then, all elements equal to the pivot.
- Finally, all elements greater than the pivot.
Do this in-place (O(1) space). The relative order within the <, ==, > groups doesn’t matter.
Example: A = [0, 1, 2, 0, 2, 1, 1], pivot_index = 3 (pivot value = 0).
A valid result: `[0, 0, 1, 2, 2, 1, 1]`; another is `[0, 0, 1, 1, 1, 2, 2]`. Any arrangement where both 0s come first, followed by all the elements greater than 0, is correct.
Approach 1: Two Passes (Less efficient, conceptually simpler)
- First pass (less than): iterate forward through `A`, keeping a pointer `smaller` (initially 0). If `A[current] < pivot`, swap `A[current]` with `A[smaller]` and increment `smaller`. This brings all smaller elements to the front.
- Second pass (greater than): iterate backwards through `A`, keeping a pointer `larger` (initially `n - 1`). If `A[current] > pivot`, swap `A[current]` with `A[larger]` and decrement `larger`. This pushes all larger elements to the end.
After these two passes, elements equal to the pivot will naturally be left in the middle.
- Time: O(n) + O(n) = O(n).
- Space: O(1).
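The two passes above can be sketched as follows (the function name and loop structure here are my own, not the book's):

```python
def dutch_flag_two_pass(pivot_index, A):
    pivot = A[pivot_index]
    # First pass: move every element < pivot to the front.
    smaller = 0
    for i in range(len(A)):
        if A[i] < pivot:
            A[i], A[smaller] = A[smaller], A[i]
            smaller += 1
    # Second pass: move every element > pivot to the back.
    larger = len(A) - 1
    for i in reversed(range(len(A))):
        if A[i] > pivot:
            A[i], A[larger] = A[larger], A[i]
            larger -= 1
    # Elements equal to the pivot are left in the middle.
```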
Approach 2: Single Pass (The Standard, Efficient Solution)
This is the implementation shown in the book, often preferred in interviews.
Intuition: Divide the array into four regions using three pointers:
- `A[0 ... smaller - 1]`: Bottom (known to be `< pivot`)
- `A[smaller ... equal - 1]`: Middle (known to be `== pivot`)
- `A[equal ... larger - 1]`: Unclassified (elements we haven't processed yet)
- `A[larger ... n - 1]`: Top (known to be `> pivot`)
Pointers:
- `smaller`: boundary between Bottom and Middle.
- `equal`: boundary between Middle and Unclassified (the current element being processed).
- `larger`: boundary between Unclassified and Top.
Algorithm:
- Initialize `smaller = 0`, `equal = 0`, `larger = len(A)`.
- Get `pivot = A[pivot_index]`.
- Loop `while equal < larger:` (while there are unclassified elements), examining `A[equal]`:
  - Case 1: `A[equal] < pivot`. Swap `A[equal]` with `A[smaller]`, then increment both `smaller` and `equal`.
  - Case 2: `A[equal] == pivot`. Increment `equal` (the element is already in the right place relative to `smaller`).
  - Case 3: `A[equal] > pivot`. Decrement `larger` (shrinking the unclassified region from the right), then swap `A[equal]` with `A[larger]`. Crucially, do NOT increment `equal` here: the element just swapped into `A[equal]` came from the unprocessed `larger` region and still needs to be classified in the next loop iteration.
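The three cases above translate directly into code; a sketch assembled from that description:

```python
def dutch_flag_partition(pivot_index, A):
    pivot = A[pivot_index]
    # Invariants maintained by the loop:
    #   A[:smaller]       < pivot
    #   A[smaller:equal] == pivot
    #   A[equal:larger]   unclassified
    #   A[larger:]        > pivot
    smaller, equal, larger = 0, 0, len(A)
    while equal < larger:
        if A[equal] < pivot:
            A[smaller], A[equal] = A[equal], A[smaller]
            smaller, equal = smaller + 1, equal + 1
        elif A[equal] == pivot:
            equal += 1
        else:  # A[equal] > pivot
            larger -= 1
            A[equal], A[larger] = A[larger], A[equal]
            # Do NOT advance `equal`: the swapped-in element is unclassified.
```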
Let’s revisit the requirements:
- Input: an array `A` (e.g., `[0, 1, 2, 0, 2, 1, 1]`) and an index `i` (e.g., `pivot_index = 3`).
- Pivot value: the value at that index, `pivot = A[pivot_index]` (so `pivot = 0` in our example).
- Goal: rearrange `A` into `[ < pivot | == pivot | > pivot ]`: all elements less than the pivot come first, then all elements equal to it, then all elements greater than it.
Why three categories? This is extremely useful as a subroutine in Quicksort. If an array has many duplicate elements equal to the pivot, standard Quicksort (which partitions into just < pivot and >= pivot) can perform poorly (O(n^2)). By putting all elements equal to the pivot into their own group in the middle, you exclude them from the recursive calls on the < pivot and > pivot subarrays. This makes Quicksort much more efficient (O(n log n) on average, O(n) if all elements are equal) when duplicates are present.
“Do this in-place (O(1) space)”: This means you cannot create new helper arrays to store the <, ==, and > elements temporarily. You must achieve the final arrangement by only swapping elements within the original array A, using only a constant amount of extra memory (like variables for the pivot value and your pointers smaller, equal, larger). This is the main challenge and why the pointer techniques are important.
“The relative order within the <, ==, > groups doesn’t matter”: This is a relaxation that makes the problem easier. Look at the example A = [0, 1, 2, 0, 2, 1, 1] with pivot = 0.
- The `< pivot` group is empty.
- The `== pivot` group contains `[0, 0]`.
- The `> pivot` group contains `[1, 2, 2, 1, 1]`.
A valid final arrangement could be [0, 0, | 1, 2, 2, 1, 1]. Another could be [0, 0, | 2, 1, 1, 2, 1]. As long as all the 0s come first, followed by all the numbers greater than 0, it’s correct. We don’t need to keep the original relative order of the 1s and 2s within the > pivot group. This freedom allows us to perform swaps more easily. If we had to maintain relative order (a “stable partition”), the O(1) space solution would be much harder.
In essence: The Dutch Flag problem is just a more detailed partitioning than even_odd. Instead of two bins, you have three (<, ==, >). The challenge lies in achieving this three-way split efficiently using only the existing array.
Section 5.2: Increment an Arbitrary-Precision Integer (Page 7 / PDF Page 43)
The Concept: Standard integer types in many languages (like int in C++ or Java) have a fixed size (e.g., 32 or 64 bits) and thus a maximum value. Python handles this automatically with its arbitrary-precision integers, but interview problems often ask you to simulate this behavior, typically by storing the digits of a very large number in an array (or list in Python).
Representation: An array A represents a non-negative integer D. Each element A[i] holds a single digit. The most significant digit is usually at A[0].
- Example: `D = 129` is represented as `A = [1, 2, 9]`.
- Example: `D = 99` is represented as `A = [9, 9]`.
The Problem: Write a function that takes such an array A representing integer D and updates it in-place to represent D + 1. Handle potential carries, including the case where the number of digits increases (like 99 + 1 = 100).
Brute-Force (and why it’s often not allowed/intended):
- Convert the array `[1, 2, 9]` into the integer `129`.
- Add 1: `129 + 1 = 130`.
- Convert `130` back into an array: `[1, 3, 0]`.
Limitation: This fails if the integer D is larger than the maximum value the language’s built-in integer type can hold (this isn’t an issue for Python’s runtime integers, but the problem setup often simulates fixed-precision constraints or asks you to avoid this conversion). It also doesn’t modify the array in-place directly.
The Grade-School Algorithm Approach: Think about how you add 1 to a number on paper, like 129 + 1:
- Start from the rightmost digit (Least Significant Digit - LSB).
- Add 1 to the last digit: 9 + 1 = 10.
- Write down the 0, carry-over the 1.
- Move to the next digit to the left (2). Add the carry: 2 + 1 = 3.
- Write down the 3. No carry-over this time (carry is 0).
- Move to the next digit to the left (1). Add the carry: 1 + 0 = 1.
- Write down the 1. No carry-over.
- No more digits, result is 130.
Now consider 99 + 1:
- Start rightmost: 9 + 1 = 10. Write 0, carry 1.
- Next digit: 9 + 1 (carry) = 10. Write 0, carry 1.
- No more digits, but we still have a carry. This means the result needs an extra digit. The result is 100.
Simulating on the Array:
- We operate directly on the array `A`.
- Increment the last element: `A[n-1] += 1`.
- Iterate from right to left (from `n-1` down to `1`):
  - If `A[i] == 10` (meaning there was a carry into this position), set `A[i] = 0` and increment the digit to the left: `A[i-1] += 1`.
  - If `A[i]` is not `10`, the carry propagation stops, and we can break out of the loop early.
- Special case: after the loop, check whether the first digit `A[0]` became `10`. If it did, there was a carry out of the most significant digit: set `A[0] = 1` and append a `0` to the end of the array, growing the number by one digit (the book's formulation: `A[0] = 1`, then `A.append(0)`).
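Putting the steps above together, a sketch of the increment routine:

```python
def plus_one(A):
    A[-1] += 1  # add 1 to the least significant digit
    # Propagate the carry from right to left.
    for i in reversed(range(1, len(A))):
        if A[i] != 10:
            break  # no carry: propagation stops early
        A[i] = 0
        A[i - 1] += 1
    if A[0] == 10:
        # Carry out of the most significant digit, e.g. 99 + 1 = 100.
        A[0] = 1
        A.append(0)
    return A
```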
Section 5.3: Multiply Two Arbitrary-Precision Integers (Page 7-8 / PDF Pages 43-44)
The Problem: Now, instead of adding 1, multiply two non-negative integers num1 and num2, represented as arrays of digits. Return their product, also as an array of digits. Handle potential negative numbers indicated by a negative sign in the first element (e.g., [-7, 6, 1] represents -761).
Example: [1, 2, 3] (123) * [9, 8, 7] (987) = [1, 2, 1, 4, 0, 1] (121401)
Again, Brute-Force Conversion Fails: Converting to built-in integers and multiplying won’t work for numbers exceeding the language’s fixed precision limits.
Grade-School Multiplication Algorithm: Think about 123 * 987:
```
     123
   x 987
  ------
     861  (= 7 * 123)
    984   (= 8 * 123, shifted left 1 place)
  1107    (= 9 * 123, shifted left 2 places)
  ------
  121401  (sum of the above)
```
Simulating on Arrays:
- Handle sign: determine the sign of the result (-1 if exactly one input is negative, 1 otherwise), then make both input arrays non-negative for the multiplication step (e.g., using `abs()` on the first element of each).
- Result array size: the product of an n-digit number and an m-digit number has at most `n + m` digits, so create a `result` array of size `n + m`, initialized to zeros.
- Core multiplication loop: iterate through `num1` from right to left (index `i`) and `num2` from right to left (index `j`):
  - For each pair, compute the product `num1[i] * num2[j]`.
  - This product contributes at position `result[i + j + 1]` (think about the place values: `10^a * 10^b = 10^(a+b)`).
  - Add the product into `result[i + j + 1]`, push the carry into the next position (`result[i + j] += result[i + j + 1] // 10`), and keep only the digit (`result[i + j + 1] %= 10`).
  - This adds the partial products directly into the correct positions of `result`, implicitly simulating the addition step of grade-school multiplication.
- Remove leading zeros: `result` may have leading zeros (e.g., when multiplying small numbers like `[1] * [1]`); find the first non-zero digit and return the slice from that point on, returning `[0]` when the result is zero.
- Apply sign: if the overall sign from the first step was negative, negate the first element of the final `result` array.
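The steps above can be sketched as follows:

```python
def multiply(num1, num2):
    # Sign of the result; work with non-negative digit arrays afterwards.
    sign = -1 if (num1[0] < 0) ^ (num2[0] < 0) else 1
    num1[0], num2[0] = abs(num1[0]), abs(num2[0])

    result = [0] * (len(num1) + len(num2))
    for i in reversed(range(len(num1))):
        for j in reversed(range(len(num2))):
            result[i + j + 1] += num1[i] * num2[j]
            result[i + j] += result[i + j + 1] // 10  # propagate carry
            result[i + j + 1] %= 10                   # keep the digit

    # Strip leading zeros, but keep at least one digit for zero itself.
    first_nonzero = next(
        (k for k, d in enumerate(result) if d != 0), len(result) - 1)
    result = result[first_nonzero:]
    result[0] *= sign
    return result
```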
Complexity:
- Time: O(n * m), where n and m are the number of digits in the two input numbers. This comes from the nested loops iterating through all pairs of digits.
- Space: O(n + m) for the result array.
Section 5.4: Advancing Through an Array (Page 8 / PDF Page 44)
The Problem: Imagine a board game where you’re on a sequence of positions represented by an array A. Each A[i] contains a non-negative integer telling you the maximum number of steps you can take forward from position i. You start at index 0. Can you reach the last index of the array?
Example 1: A = [3, 3, 1, 0, 2, 0, 1]
- Start at index 0 (`A[0] = 3`): we can move 1, 2, or 3 steps.
- Option 1: move 1 step to index 1 (`A[1] = 3`); from index 1, we can move 1, 2, or 3 steps.
- Move 3 steps to index 4 (`A[4] = 2`); from index 4, we can move 1 or 2 steps.
- Move 2 steps to index 6 (`A[6] = 1`); index 6 is the last index.
- Yes, we can reach the end.
Example 2: A = [3, 2, 0, 0, 2, 0, 1]
- Start at index 0 (`A[0] = 3`): we can move 1, 2, or 3 steps.
  - Move 1 step -> index 1 (`A[1] = 2`): can reach index 2 or 3.
  - Move 2 steps -> index 2 (`A[2] = 0`): stuck.
  - Move 3 steps -> index 3 (`A[3] = 0`): stuck.
- From index 1, the max reach is 1 + 2 = 3. Both indices 2 and 3 hold 0s, so we can never get past index 3.
- No, we cannot reach the end.
Initial Thoughts & Why Simple Greedy Fails:
You might think: “Always take the maximum jump possible from the current position.” This doesn’t work.
Consider A = [2, 4, 1, 1, 0, 2, 3]. Max jump from index 0 takes you to index 2 (A[2]=1). From there you get stuck. But taking just 1 step from index 0 to index 1 (A[1]=4) lets you jump much further and potentially reach the end.
The Efficient Approach: Track Maximum Reach The key idea is to iterate through the array and keep track of the furthest index we know we can possibly reach so far.
Algorithm:
- Initialize `max_reach = 0`: the furthest index we know is reachable from the positions visited so far.
- Iterate through the array with an index `i` from `0` up to `len(A) - 1`.
- Crucial check: at each `i`, first check whether `i > max_reach`. If it is, we could never have reached index `i` from any previous position, so we definitely cannot reach the end: return `False`.
- Update the maximum reach: the furthest we could jump from position `i` is `i + A[i]`, so set `max_reach = max(max_reach, i + A[i])`.
- Check for early success: as soon as `max_reach` reaches or passes the last index (`len(A) - 1`), we know we can reach the end: return `True`.
Note: The book’s code slightly reformulates the loop condition and final check, but the core logic of tracking max_reach is the same.
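A sketch assembled from the steps above, using the loop condition the walkthroughs trace:

```python
def can_reach_end(A):
    max_reach, last_index = 0, len(A) - 1
    i = 0
    # Advance while i is still reachable and the end is not yet in reach.
    while i <= max_reach and max_reach < last_index:
        max_reach = max(max_reach, i + A[i])
        i += 1
    return max_reach >= last_index
```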
Intuition Walkthrough (Example 1: A = [3, 3, 1, 0, 2, 0, 1])
- `i=0`: `max_reach = 0`; `i <= max_reach` holds. `max_reach = max(0, 0+3) = 3`; `i` becomes 1.
- `i=1`: `max_reach = 3`; `i <= max_reach` holds. `max_reach = max(3, 1+3) = 4`; `i` becomes 2.
- `i=2`: `max_reach = max(4, 2+1) = 4`; `i` becomes 3.
- `i=3`: `max_reach = max(4, 3+0) = 4`; `i` becomes 4.
- `i=4`: `max_reach = max(4, 4+2) = 6`; `i` becomes 5.
- Loop condition: `i <= max_reach` (5 <= 6) is true, but `max_reach < last_index` (6 < 6) is false. The loop terminates.
- Return `max_reach >= last_index` (6 >= 6) -> True.
Intuition Walkthrough (Example 2: A = [3, 2, 0, 0, 2, 0, 1])
- `i=0`: `max_reach = max(0, 0+3) = 3`; `i` becomes 1.
- `i=1`: `max_reach = max(3, 1+2) = 3`; `i` becomes 2.
- `i=2`: `max_reach = max(3, 2+0) = 3`; `i` becomes 3.
- `i=3`: `max_reach = max(3, 3+0) = 3`; `i` becomes 4.
- `i=4`: `i <= max_reach` (4 <= 3) is false. The loop terminates early.
- Return `max_reach >= last_index` (3 >= 6) -> False.
Analogy: Hopping Stones Across a River
Imagine the array indices 0, 1, 2, ... n-1 are stones lined up across a river.
- You start at stone `0`.
- The number `A[i]` on each stone is the maximum length of a single jump you can take from that stone: from stone `i`, you can land on any stone `j` where `i < j <= i + A[i]`.
- Goal: reach the last stone (stone `n-1`).
Why simple jumping doesn’t work: If you just stand on stone i and decide which jump to take next, you might make a bad choice. A shorter jump might land you on a stone with a much bigger A[j] value that lets you cross the river, while the longest jump from i might land you somewhere useless.
The max_reach Idea: Tracking Your Potential
Why this “tracking the potential” works:
- It's safe: we only advance our current position `i` while it's within the known `max_reach`. If we ever encounter an `i` beyond `max_reach`, no combination of previous jumps could have gotten us there.
- It's sufficient: we don't need to know the exact path, only whether the potential to reach the end exists. By always updating `max_reach` to `max(max_reach, i + A[i])`, we track the "frontier" of reachable stones; if that frontier ever touches or crosses the last stone, a path must exist.
Think of it like this: max_reach is your “optimistic outlook”. At each step i you can actually reach, you check if the jump from i gives you a more optimistic outlook than you already had. If your current position i ever falls behind your optimistic outlook (i > max_reach), then your optimism was unfounded, and you’re stuck.
Instead of deciding which specific jump to take, we focus on a simpler question: “What is the absolute furthest stone I could possibly reach, given the stones I’ve visited so far?”
Complexity:
- Time: O(n). We iterate through the array at most once.
- Space: O(1). We only use a few variables (`max_reach`, `i`, `last_index`).
Variant: Minimum Steps
The variant asks for the minimum number of steps to reach the end. This smells like a Breadth-First Search (BFS).
Think of array indices as nodes in a graph. There’s an edge from i to j if j is reachable from i (i.e., i < j <= i + A[i]). You want the shortest path from node 0 to node n-1.
A standard BFS exploring reachable indices level by level will find the minimum number of jumps. You'd need a visited set/array to avoid redundant work and a queue to manage the BFS levels; the state in the queue could be (index, num_jumps). This uses O(n) space, and since a naive BFS can rescan overlapping jump ranges, its worst-case time is O(n^2), even though each index is enqueued at most once.
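One way to sketch this BFS (the function name, state layout, and -1 convention for "unreachable" are my own choices, not from the book):

```python
from collections import deque

def min_jumps_bfs(A):
    """Minimum number of jumps from index 0 to the last index,
    or -1 if the end is unreachable."""
    n = len(A)
    if n <= 1:
        return 0
    visited = [False] * n
    visited[0] = True
    queue = deque([(0, 0)])  # (index, jumps taken so far)
    while queue:
        i, jumps = queue.popleft()
        # From i we can land on any j with i < j <= i + A[i].
        for j in range(i + 1, min(i + A[i], n - 1) + 1):
            if j == n - 1:
                return jumps + 1
            if not visited[j]:
                visited[j] = True
                queue.append((j, jumps + 1))
    return -1
```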
Section 5.5: Delete Duplicates from a Sorted Array (Page 9 / PDF Page 45)
The Problem: You are given a sorted array A. Your task is to remove duplicate elements so that each unique element appears only once. You should do this in-place, modifying the original array A. The elements that are kept should be shifted to the beginning of the array. You need to return the number of valid, unique elements remaining in the array. What’s stored in the array beyond the last valid element doesn’t matter.
Example:
- Input: `A = [2, 3, 5, 5, 7, 11, 11, 11, 13]`
- Output: the function should return `6`.
- After the function runs, the beginning of `A` should look like `[2, 3, 5, 7, 11, 13, ...]`. The values after index 5 are irrelevant (e.g., `[2, 3, 5, 7, 11, 13, 11, 11, 13]` would be a valid state for `A`).
Why “Sorted” is Key: Because the array is sorted, all duplicate elements are guaranteed to be grouped together consecutively. This makes identifying them much easier than if the array were unsorted.
Common Pitfalls:
- Using O(n) space: it's easy to iterate through `A`, collect unique elements in a set or a new list, and copy them back. But the problem requires O(1) extra space (in-place modification).
- Shifting elements repeatedly: you could find a duplicate (`A[i] == A[i-1]`) and shift everything from `A[i+1]` onward one step to the left. This works, but with many duplicates (e.g., `[2, 2, 2, 2, 3]`) the shifting adds up to O(n^2) time in the worst case.
The Efficient In-Place Approach: Two Pointers (Read & Write) This is a very common and useful pattern for modifying arrays in-place when you want to keep only certain elements.
The Strategy:
- Use one pointer (the read index; call it `i`) to iterate through the array from the second element onward.
- Use a second pointer, `write_index`, to track the next position where a unique element should be written. Initialize `write_index = 1` (the very first element `A[0]` is always unique by definition, assuming the array isn't empty).
- Compare the element being read, `A[i]`, with the last unique element written, `A[write_index - 1]`:
  - If `A[i]` differs from `A[write_index - 1]`, we've found a new unique element: copy it to the next write position (`A[write_index] = A[i]`) and advance `write_index`.
  - If `A[i]` equals `A[write_index - 1]`, it's a duplicate: ignore it and advance only the read pointer. Duplicates are effectively overwritten later, when a new unique element is found.
- After iterating through the entire array, `write_index` holds the count of unique elements, and the subarray `A[0 ... write_index - 1]` contains them.
Section 5.6: Buy and Sell a Stock Once (Page 10 / PDF Page 46)
The Problem: You’re given an array prices where prices[i] is the price of a given stock on day i. You want to find the maximum profit you can achieve by buying the stock on one day and selling it on a later day. If no profit can be made (prices always decrease), the maximum profit is 0. You must buy before you sell.
Example: prices = [310, 315, 275, 295, 260, 270, 290, 230, 255, 250]
- Buy at 260 (day 4, index 4), Sell at 290 (day 6, index 6). Profit = 290 - 260 = 30.
- This is the maximum possible profit. Note: minimum price is 230, max is 315, but 315 occurs before 230, so you can’t buy at 230 and sell at 315.
Brute-Force: Try every possible pair of (buy_day, sell_day) where sell_day > buy_day. Calculate the profit for each pair and keep track of the maximum.
- Use nested loops: an outer loop for `buy_day` from 0 to n-2, and an inner loop for `sell_day` from `buy_day + 1` to n-1.
- Time: O(n^2). Space: O(1). Too slow for large inputs.
Divide and Conquer (Mentioned in Book Introduction, not here): Split the array in half, find the max profit in the left half, find the max profit in the right half, and find the max profit that crosses the midpoint (buy in left, sell in right). The crossing part requires finding the min price in the left and max price in the right. This leads to O(n log n) time. Better, but we can do even better.
The Efficient Linear Scan Approach:
Key Idea: As you iterate through the prices day by day, what information do you need to know from the past to decide the best potential profit if you sell today? You need to know the minimum price encountered before today.
Algorithm:
- Initialize `min_price_so_far` to infinity (or `prices[0]` if the array is non-empty).
- Initialize `max_profit` to `0.0`.
- Iterate through the `prices` array; let the current price be `price`:
  - Potential profit if selling today: `potential_profit = price - min_price_so_far`.
  - Update the maximum profit: `max_profit = max(max_profit, potential_profit)`.
  - Update the minimum price seen: `min_price_so_far = min(min_price_so_far, price)`.
- After iterating through all prices, `max_profit` holds the maximum achievable profit.
Intuition Walkthrough (prices = [310, 315, 275, 295, 260, 270, 290, 230, 255, 250]):
| price | min_price_so_far (Before Update) | profit_sell_today | max_profit (After Update) | min_price_so_far (After Update) |
|---|---|---|---|---|
| 310 | inf | -inf | 0 | 310 |
| 315 | 310 | 5 | 5 | 310 |
| 275 | 310 | -35 | 5 | 275 |
| 295 | 275 | 20 | 20 | 275 |
| 260 | 275 | -15 | 20 | 260 |
| 270 | 260 | 10 | 20 | 260 |
| 290 | 260 | 30 | 30 | 260 |
| 230 | 260 | -30 | 30 | 230 |
| 255 | 230 | 25 | 30 | 230 |
| 250 | 230 | 20 | 30 | 230 |
Final max_profit = 30.
Complexity:
- Time: O(n). We iterate through the array exactly once.
- Space: O(1). We only use two variables (
min_price_so_far,max_profit).
Variant: Longest Equal Subarray
Find the length of the longest contiguous subarray where all entries are equal.
Iterate through the array, keeping track of the current_length of the equal sequence and the max_length found so far.
- If `A[i] == A[i-1]`, increment `current_length`.
- If `A[i] != A[i-1]`, reset `current_length` to 1.
- Update `max_length = max(max_length, current_length)` in each iteration.
O(n) time, O(1) space.
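A sketch of this variant (the function name is my own):

```python
def longest_equal_subarray(A):
    if not A:
        return 0
    max_length = current_length = 1
    for i in range(1, len(A)):
        # Extend the current run if the value repeats; otherwise restart it.
        current_length = current_length + 1 if A[i] == A[i - 1] else 1
        max_length = max(max_length, current_length)
    return max_length
```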
Section 5.7: Buy and Sell a Stock Twice (Page 11 / PDF Page 47)
The Problem: Same setup as before, but now you can buy and sell at most twice. The second buy must happen after the first sell. Find the maximum profit.
Example: prices = [12, 11, 13, 9, 12, 8, 14, 13, 15]
- First Tx: Buy at 9, Sell at 12 (Profit 3)
- Second Tx: Buy at 8, Sell at 15 (Profit 7)
- Total Profit = 3 + 7 = 10. This is the maximum.
Brute-Force: Try all combinations of two disjoint buy-sell intervals. Choosing the four days gives O(n⁴). Too slow.
Improved Brute-Force: Iterate through all possible split points i (day the first sale happens or second buy happens). Find the max profit for one transaction in prices[0...i] and the max profit for one transaction in prices[i...n-1]. Add them up. Keep track of the maximum sum.
- Using the O(n) algorithm from Section 5.6 for each subarray takes O(n) time. Doing this for `n` possible split points gives O(n²) total time. Space is O(1). Better, but still not optimal.
The Efficient Dynamic Programming Approach:
The Core Difficulty: How do you decide the four days (buy1, sell1, buy2, sell2) optimally? Trying all combinations is too slow (O(n⁴)).
The Key Insight (Decomposition):
Imagine you complete your first transaction by selling on day i.
- This means you bought on some day `b1 <= i` and sold on day `i`.
- The best you could have done for this first transaction ending on day `i` is `MaxProfit(prices[0...i])`. Let's call this `Profit1_ending_at_i`.
Now, you need to make your second transaction using only the days after day i.
- This means you buy on some day `b2 > i` and sell on some day `s2 >= b2`.
- The best you could do for this second transaction happening after day `i` is `MaxProfit(prices[i+1...n-1])`. Let's call this `Profit2_starting_after_i`.
If we split the entire process at day i (meaning the first transaction finishes by day i, and the second starts after day i), the total profit for this specific split point i would be:
TotalProfit(split at i) = Profit1_ending_at_i + Profit2_starting_after_i
The Problem: We don’t know the best day i to split the timeline. So, we need to calculate this TotalProfit(split at i) for every possible split day i (from i=0 to i=n-1) and take the maximum among them.
How to Calculate Efficiently?
Calculating `Profit1_ending_at_i` and `Profit2_starting_after_i` from scratch for every `i` is slow (O(n²)). We need to precompute or compute iteratively.
Step 1: Calculate all Profit1_ending_at_i values.
- Notice that `Profit1_ending_at_i` is exactly the maximum profit you can make with a single transaction within the range `prices[0...i]`.
- We already know how to calculate this efficiently! It's the logic from the "Buy and Sell Once" problem (Section 5.6). We can run that algorithm once, and as we iterate through the `prices` array, store the maximum profit found up to that point in a new array. Let's call this array `F` (for Forward).
- `F[i]` = maximum profit from one transaction using prices from day 0 to day `i`.
- This takes one pass, O(n) time, and O(n) space for the `F` array.
Step 2: Calculate all Profit2_starting_after_i values.
- This is the maximum profit from one transaction using only prices from day `i+1` to day `n-1`.
- How can we calculate this efficiently for all `i`? We can run the "Buy and Sell Once" logic backwards!
- Imagine iterating from the end of the `prices` array (`n-1`) down to `0`. Keep track of the `max_price_seen_so_far` during the backward scan. The best profit if you buy on day `j` is `max_price_seen_so_far - prices[j]`.
- Let `B[j]` be the maximum profit from one transaction using prices from day `j` to day `n-1`. We can compute all `B[j]` values in a single backward pass (O(n) time), storing them in another array `B`.
Step 3: Combine the Results.
- Now we have `F[i]` (best profit ending by day `i`) and `B[j]` (best profit starting from day `j`).
- The total profit for a split right after day `i` is `F[i] + B[i+1]`.
- We need to calculate `F[i] + B[i+1]` for all `i` from 0 to `n-2`.
- The final answer is the maximum value found in these sums.
- We also need to consider the possibility that the best profit comes from only one transaction (e.g., if prices always go down after the first transaction). The best single-transaction profit over the whole period is simply `F[n-1]`. So the overall maximum is `max(F[n-1], max(F[i] + B[i+1] for i in 0..n-2))`.
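Steps 1-3 can be sketched directly with the `F` and `B` arrays described above (O(n) time, O(n) space; the function name is our choice here):

```python
def buy_and_sell_stock_twice_two_arrays(prices):
    """At most two non-overlapping transactions, via forward/backward arrays."""
    n = len(prices)
    if n < 2:
        return 0
    # Forward pass: F[i] = best single-transaction profit within prices[0..i].
    F = [0] * n
    min_price = prices[0]
    for i in range(1, n):
        F[i] = max(F[i - 1], prices[i] - min_price)
        min_price = min(min_price, prices[i])
    # Backward pass: B[j] = best single-transaction profit within prices[j..n-1].
    B = [0] * n
    max_price = prices[-1]
    for j in range(n - 2, -1, -1):
        B[j] = max(B[j + 1], max_price - prices[j])
        max_price = max(max_price, prices[j])
    # Combine: either one transaction overall, or split right after day i.
    best = F[n - 1]
    for i in range(n - 1):
        best = max(best, F[i] + B[i + 1])
    return best
```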
The Book’s Space Optimization:
The book realizes we don’t actually need to store the entire B array from Step 2. We can combine Step 2 and Step 3.
- Calculate and store the `F` array (forward pass) as described in Step 1.
- Perform the backward pass (like Step 2). As you iterate backwards with index `i` (from `n-1` down to `1`):
  - Calculate the maximum profit for a single transaction starting at or after day `i`. Let's call this `profit_second_tx` (conceptually `B[i]`). You do this by keeping track of `max_price_so_far` encountered during the backward scan: `profit_second_tx = max(profit_if_buy_at_i, profit_if_buy_later)`.
  - Immediately combine: now that you know the best profit starting at or after day `i` (the `profit_second_tx` you just calculated), add it to the best profit ending before day `i` (which is `F[i-1]`, already calculated in the forward pass): `total_profit_for_split = F[i-1] + profit_second_tx`.
  - Keep track of the overall maximum total profit seen across all split points considered so far.
Let’s retry the Intuition Walkthrough (prices = [12, 11, 13, 9, 12, 8, 14, 13, 15]):
Forward Pass: Compute `F = [0, 0, 2, 2, 3, 3, 6, 6, 7]`. `max_total_profit` is initialized to the best single-transaction profit, so `max_total_profit = F[8] = 7`.
Backward Pass & Combine:
- `i = 8` (price `prices[8]=15`): `max_price` (for range `[8..8]`) = 15. `profit_second_tx` = `max(0, 15-15) = 0`. `total = F[7] (6) + 0 = 6`. `max_total_profit = max(7, 6) = 7`.
- `i = 7` (price `prices[7]=13`): `max_price` (for range `[7..8]`) = `max(15, 13) = 15`. `profit_second_tx` = `max(profit_starting_at_8 = 0, profit_buy_at_7 = 15-13 = 2) = 2`. `total = F[6] (6) + 2 = 8`. `max_total_profit = max(7, 8) = 8`.
- `i = 6` (price `prices[6]=14`): `max_price` (for range `[6..8]`) = `max(15, 14) = 15`. `profit_second_tx` = `max(profit_starting_at_7 = 2, profit_buy_at_6 = 15-14 = 1) = 2`. `total = F[5] (3) + 2 = 5`. `max_total_profit = max(8, 5) = 8`.
- `i = 5` (price `prices[5]=8`): `max_price` (for range `[5..8]`) = `max(15, 8) = 15`. `profit_second_tx` = `max(profit_starting_at_6 = 2, profit_buy_at_5 = 15-8 = 7) = 7`. `total = F[4] (3) + 7 = 10`. `max_total_profit = max(8, 10) = 10`.
- … and so on.
The backward pass calculates the best profit for the second transaction on the fly and immediately combines it with the pre-calculated best profit for the first transaction ending just before it.
Think of the backward loop index i as defining the start day for the potential second transaction. profit_second_tx calculated at step i represents the best possible profit if your second transaction happens entirely within prices[i...n-1]. We add this to F[i-1] (best profit from first transaction ending before day i) to get the total profit for that specific way of splitting the two transactions. We do this for all possible start days i for the second transaction and find the overall maximum.
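Folding the backward pass and the combine step together, as described above, might look like this sketch (the `F` array from the forward pass is still stored, so overall space remains O(n)):

```python
def buy_and_sell_stock_twice(prices):
    """At most two transactions; backward pass combined on the fly."""
    n = len(prices)
    if n < 2:
        return 0
    # Forward pass: F[i] = best single-transaction profit within prices[0..i].
    F = [0] * n
    min_price = prices[0]
    for i in range(1, n):
        F[i] = max(F[i - 1], prices[i] - min_price)
        min_price = min(min_price, prices[i])
    # Backward pass: compute profit_second_tx on the fly instead of a B array,
    # combining with F[i-1] immediately.
    max_total_profit = F[-1]  # best with a single transaction
    max_price_so_far = float('-inf')
    profit_second_tx = 0
    for i in range(n - 1, 0, -1):
        max_price_so_far = max(max_price_so_far, prices[i])
        profit_second_tx = max(profit_second_tx, max_price_so_far - prices[i])
        max_total_profit = max(max_total_profit, F[i - 1] + profit_second_tx)
    return max_total_profit
```

On the walkthrough's input this returns 10, matching the trace.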
Section 5.8: Computing an Alternation (Page 12 / PDF Page 48)
The Problem: You are given an array A of numbers. Rearrange its elements in-place to create a new array (or modify A directly) such that it follows an alternating pattern:
A[0] <= A[1] >= A[2] <= A[3] >= A[4] <= A[5] ...
(Less-than-or-equal followed by greater-than-or-equal, repeating).
Example:
- Input: `A = [3, 1, 4, 1, 5, 9, 2, 6]`
- A valid output: `[1, 4, 1, 5, 2, 9, 3, 6]` (check: 1<=4>=1<=5>=2<=9>=3<=6 — the final 3<=6 is satisfied).
- Another valid output: `[1, 3, 1, 4, 2, 6, 5, 9]`
Approach 1: Sorting (Simple but Suboptimal)
- Sort the array `A`. Example: `[1, 1, 2, 3, 4, 5, 6, 9]`
- Swap adjacent pairs starting from the second element: swap `A[1]` and `A[2]`, then `A[3]` and `A[4]`, then `A[5]` and `A[6]`, etc.
  - Swap `A[1]` (1) and `A[2]` (2) -> `[1, 2, 1, 3, 4, 5, 6, 9]`
  - Swap `A[3]` (3) and `A[4]` (4) -> `[1, 2, 1, 4, 3, 5, 6, 9]`
  - Swap `A[5]` (5) and `A[6]` (6) -> `[1, 2, 1, 4, 3, 6, 5, 9]`
- Check the result: `1<=2>=1<=4>=3<=6>=5<=9`. It works!
- Complexity: Dominated by the sort step.
- Time: O(n log n).
- Space: O(1) if using an in-place sort, or O(n) otherwise.
Approach 2: Median Finding (Mentioned, but not fully explored)
- Find the median element of the array `A` (using an O(n) median-finding algorithm, like in Problem 11.8).
- Partition the array around the median (like one step of Quicksort, or using Dutch flag partitioning) so elements smaller than the median come first, then elements equal to the median, then elements larger. This takes O(n) time.
- Interleave elements from the lower half and the upper half, similar to the swapping pattern in Approach 1.
- Complexity: Dominated by median finding and partitioning.
- Time: O(n).
- Space: O(1).
- This is theoretically optimal time, but the median-of-medians algorithm is complex to implement.
Approach 3: The Clever Local Swap (Efficient and Simple)
Key Insight: The required property (A[i-1] <= A[i] >= A[i+1] or A[i-1] >= A[i] <= A[i+1]) is very local. It only involves adjacent elements. We don’t necessarily need the global order provided by sorting or median finding.
The Goal: Arrange the array so it goes “up, down, up, down…” like this:
Small <= Big >= Small <= Big >= Small ...
The Rule We Need:
- At every even index `i`, we need `A[i] <= A[i+1]`.
- At every odd index `i`, we need `A[i] >= A[i+1]`.
The Local Swap Algorithm:
We walk through the array, looking at pairs (A[i], A[i+1]). We force the local rule to be true for that pair.
Look at `(A[0], A[1])` (`i=0`, which is even):
- Rule: we need `A[0] <= A[1]`.
- If it's already true, great, do nothing.
- If `A[0] > A[1]`, swap them. Now `A[0] <= A[1]` is true for this pair.

Look at `(A[1], A[2])` (`i=1`, which is odd):
- Rule: we need `A[1] >= A[2]`.
- If it's true, do nothing.
- If `A[1] < A[2]`, swap them. Now `A[1] >= A[2]` is true for this pair.

Look at `(A[2], A[3])` (`i=2`, which is even):
- Rule: we need `A[2] <= A[3]`.
- If `A[2] > A[3]`, swap them. Now `A[2] <= A[3]` is true.

…and so on.
Why Doesn’t Enforcing the Rule at Step i Break the Rule We Just Fixed at Step i-1?
This is the core confusion. Let’s trace carefully. We need to check the condition involving A[i] in both the step i-1 comparison and the step i comparison.
Consider index i. It’s involved in two checks:
- Check at step `i-1`: compares `(A[i-1], A[i])`.
- Check at step `i`: compares `(A[i], A[i+1])`.
Let’s analyze the transition at index i when we process step i.
Case 1: i is EVEN.
- The rule we enforce at step `i`: we ensure `A[i] <= A[i+1]` (by swapping if needed).
- What about the rule from step `i-1`? Step `i-1` was ODD. The rule enforced there was `A[i-1] >= A[i]`.
- Does enforcing `A[i] <= A[i+1]` break `A[i-1] >= A[i]`?
  - If we didn't swap at step `i` (because `A[i] <= A[i+1]` was already true): no problem, the previous condition `A[i-1] >= A[i]` still holds (we didn't change `A[i]` or `A[i-1]`).
  - If we did swap at step `i` (because `A[i] > A[i+1]` initially): let the original values be `Old_A[i]` and `Old_A[i+1]`, with `Old_A[i] > Old_A[i+1]`. The new value at `A[i]` is `New_A[i] = Old_A[i+1]`. The condition from step `i-1` was `A[i-1] >= Old_A[i]`. Since `Old_A[i] > Old_A[i+1] = New_A[i]`, it follows that `A[i-1] > New_A[i]`. So the required condition `A[i-1] >= New_A[i]` still holds! Swapping a smaller value into `A[i]` cannot break the previous `A[i-1] >= A[i]` condition.
Case 2: i is ODD.
- The rule we enforce at step `i`: we ensure `A[i] >= A[i+1]` (by swapping if needed).
- What about the rule from step `i-1`? Step `i-1` was EVEN. The rule enforced there was `A[i-1] <= A[i]`.
- Does enforcing `A[i] >= A[i+1]` break `A[i-1] <= A[i]`?
  - If we didn't swap at step `i` (because `A[i] >= A[i+1]` was already true): no problem.
  - If we did swap at step `i` (because `A[i] < A[i+1]` initially): let the original values be `Old_A[i]` and `Old_A[i+1]`, with `Old_A[i] < Old_A[i+1]`. The new value at `A[i]` is `New_A[i] = Old_A[i+1]`. The condition from step `i-1` was `A[i-1] <= Old_A[i]`. Since `Old_A[i] < Old_A[i+1] = New_A[i]`, it follows that `A[i-1] < New_A[i]`. So the required condition `A[i-1] <= New_A[i]` still holds! Swapping a larger value into `A[i]` cannot break the previous `A[i-1] <= A[i]` condition.
Conclusion: Enforcing the rule for the pair (A[i], A[i+1]) preserves the rule that was already established for the pair (A[i-1], A[i]) in the previous step. Since we process the array from left to right, by the time we reach the end, all adjacent pairs satisfy their respective conditions (<= for even i, >= for odd i), fulfilling the overall alternation requirement.
Analogy: Smoothing Out Bumps
Imagine the numbers are heights. You want a profile like low <= high >= low <= high ...
You walk along the profile, looking at two adjacent points i and i+1.
- If `i` should be a "low" point (`i` is even), but `A[i]` is higher than `A[i+1]`, you swap them to push the high point to `i+1`. This makes `A[i]` lower, which doesn't violate any preceding "high >= low" condition.
- If `i` should be a "high" point (`i` is odd), but `A[i]` is lower than `A[i+1]`, you swap them to pull the high point to `i`. This makes `A[i]` higher, which doesn't violate any preceding "low <= high" condition.
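The local-swap rule fits in a few lines of Python (a sketch; the function name `rearrange` is our choice here):

```python
def rearrange(A):
    """In-place alternation: A[0] <= A[1] >= A[2] <= A[3] >= ..."""
    for i in range(len(A) - 1):
        # Even i wants A[i] <= A[i+1]; odd i wants A[i] >= A[i+1].
        if (i % 2 == 0 and A[i] > A[i + 1]) or (i % 2 == 1 and A[i] < A[i + 1]):
            A[i], A[i + 1] = A[i + 1], A[i]
```

One pass, one comparison and at most one swap per pair: O(n) time, O(1) space, no sorting or median finding needed.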
Section 5.9: Enumerate All Primes to n (Page 13 / PDF Page 49)
The Problem: Given an integer n, return a list of all prime numbers between 1 and n (inclusive).
Prime Number Definition: A natural number p is prime if it is greater than 1 and has no divisors other than 1 and itself. (2, 3, 5, 7, 11, 13, 17, 19… are primes). 1 is not prime.
Example:
- Input: `n = 18`
- Output: `[2, 3, 5, 7, 11, 13, 17]`
Approach 1: Trial Division (Brute-Force)
- Iterate through each number `i` from 2 up to `n`.
- For each `i`, check if it's prime:
  - Try dividing `i` by every number `d` from 2 up to `sqrt(i)`.
  - If you find any `d` that divides `i` evenly (`i % d == 0`), then `i` is not prime. Stop checking this `i` and move to the next.
  - If you check all `d` up to `sqrt(i)` and find no divisors, then `i` is prime.
- If `i` is found to be prime, add it to your results list.
Why only check up to `sqrt(i)`? If `i` has a divisor `d` larger than `sqrt(i)`, then `i = d * k` where `k` must be smaller than `sqrt(i)`. So if `i` has any divisor other than 1 and itself, it must have at least one divisor less than or equal to its square root.

Complexity: Checking whether a single number `i` is prime takes roughly O(sqrt(i)) time. Doing this for all numbers from 2 to `n` gives a total time complexity of about O(n√n) = O(n^1.5). Space is O(p), where `p` is the number of primes found, roughly O(n / log n).
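A direct Python sketch of trial division as described:

```python
def primes_trial_division(n):
    """Brute-force prime enumeration: test each i by dividing up to sqrt(i)."""
    primes = []
    for i in range(2, n + 1):
        # i is prime iff no d in [2, sqrt(i)] divides it evenly.
        if all(i % d != 0 for d in range(2, int(i ** 0.5) + 1)):
            primes.append(i)
    return primes
```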
Approach 2: Sieve of Eratosthenes (Much More Efficient)
This is the classic algorithm for finding all primes up to n. It avoids redundant checks by eliminating multiples.
The Idea:
1. Start with a list (or boolean array) representing all numbers from 0 up to `n`. Initially, mark all numbers from 2 to `n` as potentially prime (e.g., `True`). Mark 0 and 1 as not prime (`False`).
2. Find the first number `p` in the list (starting from 2) that is still marked as potentially prime.
3. This number `p` must be prime (if it had a smaller divisor, that divisor would have been found earlier, and `p` would have already been marked as not prime). Add `p` to your list of primes.
4. Sieve: mark all multiples of `p` (i.e., `2*p`, `3*p`, `4*p`, …) up to `n` as not prime (`False`) in your boolean array. They cannot be prime because they have `p` as a divisor.
5. Repeat from step 2: find the next number in the list greater than `p` that is still marked as potentially prime, and repeat the sieving process.
6. Continue until you have checked numbers up to `n`.
Sieve Optimization 1: When sieving multiples of prime p, you can start marking from p*p (p²). Why? Any smaller multiple k*p where k < p would have already been marked as not prime when you processed the prime factors of k.
Sieve Optimization 2: You only need to find primes p up to sqrt(n) to perform the sieving. Any composite number c <= n must have a prime factor less than or equal to sqrt(n). If c hasn’t been marked off by the time you reach sqrt(n), it must be prime itself.
The Old Way (Trial Division):
- Is 2 prime? Yes.
- Is 3 prime? Yes.
- Is 4 prime? No (divisible by 2).
- Is 5 prime? Yes.
- Is 6 prime? No (divisible by 2).
- Is 7 prime? Yes.
- Is 8 prime? No (divisible by 2).
- Is 9 prime? No (divisible by 3). … This is slow because we check 9 for divisibility by 3 even though we already knew 3 was prime and could have somehow used that information.
The Sieve Idea: Be Proactive!
Instead of checking each number individually, when we find a prime number, let’s immediately cross out all of its multiples because we know they cannot be prime.
Steps for n = 20:
1. List all numbers: write down all numbers from 2 up to 20, each with a little flag (boolean value) initially set to "potentially prime" (True):
   `[ (2,T), (3,T), (4,T), (5,T), (6,T), (7,T), (8,T), (9,T), (10,T), (11,T), (12,T), (13,T), (14,T), (15,T), (16,T), (17,T), (18,T), (19,T), (20,T) ]`
2. Find the first number `p` that is marked True: starting from the beginning, the first number marked True is `p = 2`.
   - Action: we know 2 is prime (it's the first one we found). Add 2 to our prime list: `Primes = [2]`.
   - Sieve action: cross out (mark as False) all multiples of 2 in our list (4, 6, 8, 10, 12, 14, 16, 18, 20).
   - List becomes: `[ (2,T), (3,T), (4,F), (5,T), (6,F), (7,T), (8,F), (9,T), (10,F), (11,T), (12,F), (13,T), (14,F), (15,T), (16,F), (17,T), (18,F), (19,T), (20,F) ]`
3. Find the next number `p` (after `p=2`) that is marked True: the next True is at `p = 3`.
   - Action: 3 must be prime (if it had a smaller prime factor, it would have been crossed out in a previous step; the only smaller prime is 2, and 3 isn't a multiple of 2). Add it: `Primes = [2, 3]`.
   - Sieve action: cross out all multiples of 3 (3*2=6 is already False; next are 3*3=9, 3*4=12(F), 3*5=15, 3*6=18(F)).
   - List becomes: `[ (2,T), (3,T), (4,F), (5,T), (6,F), (7,T), (8,F), (9,F), (10,F), (11,T), (12,F), (13,T), (14,F), (15,F), (16,F), (17,T), (18,F), (19,T), (20,F) ]`
4. Find the next True number: `p = 5`.
   - Action: 5 must be prime. Add it: `Primes = [2, 3, 5]`.
   - Sieve action: multiples of 5 (5*2=10, 5*3=15, 5*4=20) are already False. No changes.
5. Find the next True number: `p = 7`.
   - Action: 7 must be prime. Add it: `Primes = [2, 3, 5, 7]`.
   - Sieve action: 7*2=14 is already False. No changes.
6. Find the next True number: `p = 11`.
   - Action: 11 must be prime. Add it: `Primes = [2, 3, 5, 7, 11]`.
   - Sieve action: 11*2=22 > 20. No changes.
7. Find the next True number: `p = 13`. Add it: `Primes = [2, 3, 5, 7, 11, 13]`. Its multiples are all > 20.
8. Find the next True number: `p = 17`. Add it: `Primes = [2, 3, 5, 7, 11, 13, 17]`. Its multiples are all > 20.
9. Find the next True number: `p = 19`. Add it: `Primes = [2, 3, 5, 7, 11, 13, 17, 19]`. Its multiples are all > 20.
10. Find the next True number after `p=19`: there are none left up to `n=20`. We are done.
The Final Prime List: [2, 3, 5, 7, 11, 13, 17, 19].
Why is this efficient?
- We only “process” (find multiples of) the actual prime numbers (2, 3, 5, 7, etc.).
- We touch each composite number (like 4, 6, 8, 9, 10, 12…) only when we cross it out as a multiple of one of its prime factors. We don’t waste time trying to see if 12 is divisible by 4, 5, 6, etc., because it was already crossed out when we processed prime 2 (and again for prime 3).
The Optimization (Sieving from p*p)
Let’s look at step 3 again, when p=3. We crossed out 6, 9, 12, 15, 18.
Notice that 6 was already crossed out when we processed p=2. Why? Because 6 = 2 * 3. The factor k=2 is less than p=3.
In general, when processing prime p, any multiple k * p where k < p has already been crossed out by the prime factors of k (which are all smaller than p).
So, the first multiple of p that we really need to worry about crossing out is p * p.
- For `p=3`, we start crossing out from `3*3 = 9`.
- For `p=5`, we would start from `5*5 = 25` (which is > 20, so we do nothing).
- For `p=7`, we would start from `7*7 = 49` (> 20).

This optimization saves some redundant work.
The Boolean Array (is_prime)
In the code, instead of a list of pairs like (2, T), we use a simple boolean array (a list of True/False values). The index of the array represents the number.
is_prime = [False, False, True, True, True, ... True] (up to index n)
- When we process `p=2`, we loop `for i in range(p*p, n+1, p): is_prime[i] = False`.
- When we process `p=3`, we run the same loop with `p=3`. And so on.
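Putting the boolean-array sieve together with the `p*p` start optimization (a sketch; details may differ from the book's version):

```python
def generate_primes(n):
    """Sieve of Eratosthenes: all primes in [2, n]."""
    if n < 2:
        return []
    is_prime = [False, False] + [True] * (n - 1)  # indices 0..n
    primes = []
    for p in range(2, n + 1):
        if is_prime[p]:
            primes.append(p)
            # Cross out multiples of p, starting from p*p (smaller multiples
            # were already crossed out by smaller prime factors).
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return primes
```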
Section 5.10: Permute the Elements of an Array (Page 14 / PDF Page 50)
The Problem: You are given an array A of n elements, and a permutation array P of length n. The permutation array P describes how to rearrange A: for each index i, the element currently at A[i] should be moved to position P[i]. You need to apply this permutation to array A in-place (O(1) extra space).
Example:
- `A = [a, b, c, d]`
- `P = [2, 0, 1, 3]`
Interpretation of P:
- Element at index 0 (`a`) moves to index `P[0]=2`.
- Element at index 1 (`b`) moves to index `P[1]=0`.
- Element at index 2 (`c`) moves to index `P[2]=1`.
- Element at index 3 (`d`) moves to index `P[3]=3`.
Desired Result: A should become [b, c, a, d].
Approach 1: Using Extra Space (Simple but Not In-Place)
- Create a new array `B` of the same size as `A`.
- Iterate through `A` from `i = 0` to `n-1`.
- For each element `A[i]`, place it in its target position in `B`: `B[P[i]] = A[i]`.
- Copy the contents of `B` back into `A`.
- Complexity:
  - Time: O(n) (for the iteration and copy).
  - Space: O(n) (for the auxiliary array `B`).
- Issue: doesn't meet the O(1) space requirement.
Approach 2: In-Place Permutation using Cycles
Key Insight: Any permutation can be broken down into one or more disjoint cycles.
- A cycle means the element at index `i0` moves to `i1`, the element at `i1` moves to `i2`, …, the element at `i(k-1)` moves to `ik`, and the element at `ik` moves back to `i0`.
In our example P = [2, 0, 1, 3]:
- Start at index 0: element `a` moves to index 2 (`P[0]=2`).
- Look at index 2: element `c` moves to index 1 (`P[2]=1`).
- Look at index 1: element `b` moves to index 0 (`P[1]=0`).
- We're back at index 0. This forms the cycle `0 -> 2 -> 1 -> 0`.
- Start at index 3 (the next unvisited index): element `d` moves to index 3 (`P[3]=3`). This is a cycle of length 1: `3 -> 3`.

The permutation `P` consists of two cycles: `(0 2 1)` and `(3)`.
Algorithm Idea: If we can process one cycle at a time, we can do it in place.
- Iterate through the array `A` from `i = 0` to `n-1`.
- For each index `i`, check if the element at this position has already been moved as part of a previous cycle. If it has, skip it.
- If the element at `i` hasn't been moved, we've found the start of a new cycle. Process the cycle (remembering the starting index `start = i`):
  - Store the element you're about to move: `value_to_move = A[i]`, and set `curr = i`.
  - Loop:
    - `next_idx = P[curr]`.
    - Save the element at the destination: `next_value = A[next_idx]`.
    - Move the held value to the destination: `A[next_idx] = value_to_move`.
    - Mark position `curr` (or maybe `P[curr]`) as visited so we don't process this cycle again.
    - Update `curr = next_idx` and `value_to_move = next_value`.
    - Stop when `curr` gets back to the starting index `i`.
How to Mark Visited Cycles without O(n) Space?
This is the clever trick. We can modify the permutation array P itself to mark elements that have been processed. The book suggests subtracting n (or using the sign bit if possible) from P[i] after its corresponding element A[i] has been placed correctly.
The Cycle Idea: Follow One Element All the Way
Let’s pick an element and follow its journey until it lands in its final spot, and see where the element it displaced needs to go, and so on.
Let’s trace starting with index i = 0:
1. Element: `A[0]` is 'a'.
2. Destination: where should 'a' go? `P[0]` tells us it goes to index 2.
3. Problem: index 2 currently holds 'c'. We can't just overwrite 'c'.
4. Solution: before moving 'a' to index 2, save 'c' temporarily: `temp = 'c'`.
5. Move 'a': put 'a' into `A[2]`. The array `A` is now `['a', 'b', 'a', 'd']` (with 'c' saved in `temp`).
6. Where does 'c' go? It came from index 2, and the element from index 2 should go to index `P[2] = 1`.
7. Problem: index 1 currently holds 'b'.
8. Solution: save 'b': `temp = 'b'`.
9. Move 'c': put 'c' (saved back in step 4) into `A[1]`. Array `A` is now `['a', 'c', 'a', 'd']` (with 'b' saved in `temp`).
10. Where does 'b' go? It came from index 1, and the element from index 1 should go to index `P[1] = 0`.
11. Problem: index 0 currently holds 'a'.
12. Solution: save 'a': `temp = 'a'`.
13. Move 'b': put 'b' (saved in step 8) into `A[0]`. Array `A` is now `['b', 'c', 'a', 'd']` (with 'a' saved in `temp`).
14. Where does 'a' go? This 'a' originally came from index 0. We followed the path `0 -> 2 -> 1 -> 0` and are back where we started, so the cycle is complete. We don't need to place the 'a' saved in step 12; it's already implicitly handled because the element now at index 0 ('b') is correct.
We’ve just processed the cycle 0 -> 2 -> 1 -> 0! The array A is now ['b', 'c', 'a', 'd'].
How do we know we are done? We need to check other starting points.
- Trace starting with index `i = 1`: we need a way to know that index 1 was already handled. If we followed the cycle correctly above, index 1 was involved, so we need a mechanism to remember this.
- Trace starting with index `i = 3` (assume index 3 hasn't been handled yet):
  - Element: `A[3]` is 'd'.
  - Destination: `P[3]` tells us it goes to index 3.
  - It's already there! This is a cycle of length 1: `3 -> 3`. We don't need to do anything.
Putting it Together & The “Marking” Trick:
The process above shows how a cycle works with temporary variables. The challenge is doing it without explicitly using lots of temp variables (just one is needed at a time) and knowing which cycles have been completed.
This is where modifying the perm array comes in (or using a visited array). We mark perm[k] by making it negative after we’ve moved the element originally at A[k] to its correct spot.
My previous manual trace using the book’s code swap logic was slightly off. The core idea of the book’s code is to complete a cycle correctly.
A Second Trace Attempt (Using the Book's Swap Logic - `A[next_idx], A[target_idx] = A[target_idx], A[next_idx]`):
Initial state: `A = ['a', 'b', 'c', 'd']`, `perm = [2, 0, 1, 3]`.

Outer loop `i = 0`: `next_idx = 0`. `perm[0]` (2) `>= 0`, so enter the `while`.
- `target_idx = perm[0] = 2`. Swap `A[0]` ('a') and `A[2]` ('c'). `A` becomes `['c', 'b', 'a', 'd']`. `temp = perm[0] = 2`; mark `perm[0] = 2 - 4 = -2`, so `perm` is now `[-2, 0, 1, 3]`. `next_idx = temp = 2`.
- `perm[2]` (1) `>= 0`, continue. `target_idx = perm[2] = 1`. Swap `A[2]` ('a') and `A[1]` ('b'). `A` becomes `['c', 'a', 'b', 'd']`. `temp = perm[2] = 1`; mark `perm[2] = 1 - 4 = -3`, so `perm` is now `[-2, 0, -3, 3]`. `next_idx = 1`.
- `perm[1]` (0) `>= 0`, continue. `target_idx = perm[1] = 0`. Swap `A[1]` ('a') and `A[0]` ('c'). `A` becomes `['a', 'c', 'b', 'd']`. `temp = perm[1] = 0`; mark `perm[1] = 0 - 4 = -4`, so `perm` is now `[-2, -4, -3, 3]`. `next_idx = 0`.
- `perm[0]` (-2) `< 0`. Exit the `while` loop.

Outer loop `i = 1`: `perm[1]` (-4) is negative. Skip the `while` loop (this index was part of the cycle we just processed).

Outer loop `i = 2`: `perm[2]` (-3) is negative. Skip.

Outer loop `i = 3`: `next_idx = 3`. `perm[3]` (3) `>= 0`, enter the `while`. `target_idx = perm[3] = 3`. Swap `A[3]` ('d') with itself; `A` is unchanged: `['a', 'c', 'b', 'd']`. Mark `perm[3] = 3 - 4 = -1`, so `perm` is now `[-2, -4, -3, -1]`. `next_idx = 3`. `perm[3]` (-1) `< 0`. Exit.

The outer loop finishes with `A = ['a', 'c', 'b', 'd']`. This trace still gives the incorrect result.
The confusion arises from how the swaps in the book’s code achieve the “following an element” logic. Let’s use the “pick up and carry” mental model which is more direct.
Cycle Following - Simpler View (Corrected to match desired outcome):
This approach correctly gets the desired result by focusing on placing the element currently “held” (value) to its destination, and then picking up the element that was at that destination.
- Outer loop `i = 0` to `n-1`:
  - If `perm[i]` is negative, this element's cycle is already processed. Skip.
  - Otherwise, start a new cycle:
    - `current_cycle_idx = i`
    - `value_to_place = A[i]` (the element we are "holding")
    - Inner loop (a do-while, or `while True` with a break):
      - `next_target_pos = perm[current_cycle_idx]` (where `value_to_place` should go).
      - `value_at_target = A[next_target_pos]` (save the element currently at the target).
      - `A[next_target_pos] = value_to_place` (place our "held" value).
      - Mark `perm[current_cycle_idx]` as processed (e.g., `perm[current_cycle_idx] -= len(perm)`).
      - `current_cycle_idx = next_target_pos` (this is where `value_at_target` came from).
      - `value_to_place = value_at_target` (this is the new value we are "holding").
      - If `current_cycle_idx == i` (we've returned to the start of the cycle), break the inner loop.
Let’s try this view on A = ['a', 'b', 'c', 'd'], P = [2, 0, 1, 3]
Outer loop `i = 0`: `perm[0]` (2) is not negative. Start a cycle with `current_cycle_idx = 0`, `value_to_place = A[0] = 'a'`.
- Inner loop 1: `next_target_pos = perm[0] = 2`. `value_at_target = A[2] = 'c'`. `A[2] = 'a'`, so `A` is `['a', 'b', 'a', 'd']`. `perm[0] -= 4` giving -2; `perm` is `[-2, 0, 1, 3]`. `current_cycle_idx = 2`, `value_to_place = 'c'`. `current_cycle_idx` (2) != `i` (0), continue.
- Inner loop 2: `next_target_pos = perm[2] = 1`. `value_at_target = A[1] = 'b'`. `A[1] = 'c'`, so `A` is `['a', 'c', 'a', 'd']`. `perm[2] -= 4` giving -3; `perm` is `[-2, 0, -3, 3]`. `current_cycle_idx = 1`, `value_to_place = 'b'`. `current_cycle_idx` (1) != `i` (0), continue.
- Inner loop 3: `next_target_pos = perm[1] = 0`. `value_at_target = A[0] = 'a'`. `A[0] = 'b'`, so `A` is `['b', 'c', 'a', 'd']`. `perm[1] -= 4` giving -4; `perm` is `[-2, -4, -3, 3]`. `current_cycle_idx = 0` == `i` (0): break the inner loop. The cycle for `i=0` is complete.

Outer loop `i = 1`: `perm[1]` (-4) is negative. Skip.

Outer loop `i = 2`: `perm[2]` (-3) is negative. Skip.

Outer loop `i = 3`: `perm[3]` (3) is not negative. Start a cycle with `current_cycle_idx = 3`, `value_to_place = A[3] = 'd'`.
- Inner loop 1: `next_target_pos = perm[3] = 3`. `value_at_target = A[3] = 'd'`. `A[3] = 'd'`, so `A` is unchanged: `['b', 'c', 'a', 'd']`. `perm[3] -= 4` giving -1; `perm` is `[-2, -4, -3, -1]`. `current_cycle_idx = 3` == `i` (3): break the inner loop. The cycle for `i=3` is complete.
Done. Final A is ['b', 'c', 'a', 'd']. This matches the desired result!
The key is correctly following the element you “picked up” and placing it, then picking up the displaced element and continuing, while marking the origin index in P as complete for that step of placing its element.
The swap in the book's code, `A[next_idx], A[target_idx] = A[target_idx], A[next_idx]`, performs this "pick up and carry" implicitly. The element at `target_idx` is moved to `next_idx`, and the old `A[next_idx]` is moved to `target_idx`. The `next_idx` for the next iteration then becomes the original `target_idx`. This is a common way to implement cycle sort.
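The walkthrough above can be condensed into a short function. Here is a sketch (the function name and the exact marking convention are ours, following the trace): it follows each cycle, uses `perm` itself to flag processed indices by subtracting `len(perm)`, and restores `perm` before returning.

```python
def apply_permutation(perm, A):
    """Move each A[i] to position perm[i], in place, with O(1) extra space.

    perm doubles as the 'processed' marker: an entry goes negative once
    its cycle step has been applied. perm is restored before returning.
    """
    n = len(perm)
    for i in range(n):
        if perm[i] < 0:
            continue  # already handled as part of an earlier cycle
        current_cycle_idx, value_to_place = i, A[i]
        while True:
            next_target_pos = perm[current_cycle_idx]
            # Drop the carried value at its destination; pick up what was there.
            A[next_target_pos], value_to_place = value_to_place, A[next_target_pos]
            perm[current_cycle_idx] -= n  # mark this index as processed
            current_cycle_idx = next_target_pos
            if current_cycle_idx == i:
                break  # back at the cycle's start
    for i in range(n):
        perm[i] += n  # restore perm to its original state
```

On `A = ['a', 'b', 'c', 'd']` with `perm = [2, 0, 1, 3]`, this reproduces the trace above and ends with `A == ['b', 'c', 'a', 'd']`.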
Section 5.11: Compute the Next Permutation (Page 16 / PDF Page 52)
The Problem: Given an array perm representing a permutation of numbers (e.g., [1, 0, 3, 2]), find the next permutation in dictionary (lexicographical) order. If the input is already the largest possible permutation (i.e., sorted in descending order, like [3, 2, 1, 0]), return an empty array or list.
Dictionary Order: Think of the permutations as words. [0, 1, 2] comes before [0, 2, 1], which comes before [1, 0, 2], etc. We want the permutation that comes immediately after the input one in this sorted list of all permutations.
Example:

- Input: `[1, 0, 3, 2]` -> Output: `[1, 2, 0, 3]`
- Input: `[1, 2, 0, 3]` -> Output: `[1, 2, 3, 0]`
- Input: `[3, 2, 1, 0]` -> Output: `[]` (or some indication that there is none)
The Key Insight: Minimizing the Change
We want to find the next permutation, which means we want to increase the sequence by the smallest possible amount while still making it larger.
Where to Make the Change? To make the smallest possible increase, we want to change the sequence as far to the right as possible.
Consider `[1, 0, 3, 2]`:

- If we only rearrange `[3, 2]`? The only other arrangement is `[2, 3]`, giving `[1, 0, 2, 3]`. Is this larger than the original `[1, 0, 3, 2]`? No — at index 2, 2 < 3, so it is smaller. A suffix that is already in descending order is the largest possible arrangement of those elements, so changing it alone cannot increase the sequence.
- If we rearrange `[0, 3, 2]`? The arrangements that increase the sequence are `[2, 0, 3]`, `[2, 3, 0]`, `[3, 0, 2]`, `[3, 2, 0]`. Combined with the 1 at the start, these give `[1, 2, 0, 3]`, `[1, 2, 3, 0]`, `[1, 3, 0, 2]`, `[1, 3, 2, 0]` — all larger than the original. Which one is the smallest? `[1, 2, 0, 3]`.

Notice that `[1, 2, 0, 3]` is the smallest possible increase we found. It involved changing the 0 at index 1.
Finding the “Pivot” - Where the Increase Must Happen
Let's look at the sequence from right to left. `perm = [1, 0, 3, 2]`:

- `2`: Nothing to compare.
- `3, 2`: `3 > 2`. The sequence is decreasing here.
- `0, 3, 2`: `0 < 3`. The decreasing trend is broken! This `0` at index `k = 1` is important. Why? Any rearrangement involving only the suffix `[3, 2]` keeps the prefix `[1, 0]` the same, and since that suffix is already in descending order (its largest arrangement), no such rearrangement can make the permutation larger. So we must change something at or before index 1, and to make the smallest change we should change the rightmost such element, which is `perm[1] = 0`.
This leads to:
Algorithm Steps:
Find the "pivot" (`k`):

- Scan the permutation `perm` from right to left. Find the first index `k` such that `perm[k] < perm[k+1]`.
- This `perm[k]` is the rightmost element we can increase so that the resulting permutation is larger than the original but as small as possible.
- If no such `k` exists (the entire array is in descending order), it's the last permutation. Return an empty list or signal that there is no next permutation.
- For `[1, 0, 3, 2]`, `k = 1` (where `perm[k] = 0`). The suffix is `[3, 2]`.
Find the "successor" (`l`) to swap with the pivot:

- We need to replace `perm[k]` with something larger than it. To make the overall permutation increase by the least amount, we should replace `perm[k]` with the smallest value from the suffix `perm[k+1:]` that is still greater than `perm[k]`.
- Scan the suffix `perm[k+1:]` from right to left. Find the first element `perm[l]` that is greater than `perm[k]`.
- For `[1, 0, 3, 2]`, `perm[k] = 0`. The suffix is `[3, 2]`. Scan from the right: `perm[3] = 2`. Is `2 > 0`? Yes. Found it! `l = 3`. The successor value is `2`.
- (Why scan from the right for the successor? Because `k` was the first break in a descending sequence from the right, the suffix `perm[k+1:]` must itself be in descending order. Scanning a descending sequence from the right guarantees that the first element we find that's greater than the pivot is also the smallest element in that suffix that's greater than the pivot.)
Swap pivot and successor:

- Swap `perm[k]` and `perm[l]`.
- Swap `perm[1]` (0) and `perm[3]` (2). `perm` becomes `[1, 2, 3, 0]`.
Make the rest of the sequence (the suffix after `k`) as small as possible:

- We've now made the prefix up to index `k` (e.g., `[1, 2]`) the smallest possible larger prefix.
- Now, make the suffix (the part after index `k`, which is now `[3, 0]`) as small as possible in dictionary order, to ensure this is the very next permutation.
- How do you make a sequence of numbers as small as possible in dictionary order? Sort it in ascending order.
- The suffix after index `k` (`perm[k+1:]`) was in descending order before the swap, and remains so after it (since `perm[l]` was replaced by the smaller `perm[k]`). To sort a descending sequence into ascending order efficiently, just reverse it!
- The suffix after `k = 1` is `[3, 0]`. Reversing it gives `[0, 3]`.
- Attach this to the prefix: `[1, 2]` + `[0, 3]` = `[1, 2, 0, 3]`. This is the final answer.
Let's try `perm = [6, 2, 1, 5, 4, 3, 0]`:

- Find `k`: Scan right-to-left: `0`; `3 > 0`; `4 > 3`; `5 > 4`; `1 < 5`. Break! `k = 2` (pivot `perm[k] = 1`). The suffix is `[5, 4, 3, 0]`.
- Find `l`: Scan the suffix `[5, 4, 3, 0]` right-to-left for the first element greater than `perm[k]` (1). `0 > 1`? No. `3 > 1`? Yes. Found it! `l = 5` (index in the original array; successor `perm[l] = 3`).
- Swap: Swap `perm[k]` (1) and `perm[l]` (3). `perm` becomes `[6, 2, 3, 5, 4, 1, 0]`.
- Reverse suffix: Reverse the part after `k = 2`, which is `[5, 4, 1, 0]`. Reversing gives `[0, 1, 4, 5]`.
- Final Result: `[6, 2, 3]` + `[0, 1, 4, 5]` = `[6, 2, 3, 0, 1, 4, 5]`.
This algorithm correctly finds the immediate next permutation by making the smallest necessary change to the prefix and then making the suffix as small as possible.
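The three steps translate directly into code. Here is a sketch (in-place, returning `[]` when the input is the last permutation):

```python
def next_permutation(perm):
    """Return the next permutation of perm in lexicographic order,
    modifying perm in place; return [] if perm is the last one."""
    n = len(perm)
    # Step 1: find the pivot k -- the first index from the right
    # with perm[k] < perm[k + 1].
    k = n - 2
    while k >= 0 and perm[k] >= perm[k + 1]:
        k -= 1
    if k == -1:
        return []  # whole array is descending: no next permutation
    # Step 2: the suffix is descending, so scanning it from the right
    # finds the smallest element greater than the pivot; swap them.
    for l in range(n - 1, k, -1):
        if perm[l] > perm[k]:
            perm[k], perm[l] = perm[l], perm[k]
            break
    # Step 3: the suffix is still descending; reverse it so the
    # result is the smallest permutation larger than the input.
    perm[k + 1:] = reversed(perm[k + 1:])
    return perm
```

Running it on the examples above gives `next_permutation([1, 0, 3, 2]) == [1, 2, 0, 3]` and `next_permutation([6, 2, 1, 5, 4, 3, 0]) == [6, 2, 3, 0, 1, 4, 5]`.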
Sampling Algorithms and Their Implementation
Problem: You have a bag (array A) with n distinct items. You want to pull out exactly k items such that every possible group of k items has the same chance of being selected. You do this by modifying the original array A so the chosen k items end up in the first k positions (A[0] to A[k-1]).
Analogy: Picking Players for a Team

- You have `n` players lined up (`A[0]` to `A[n-1]`).
- You need to form a team of `k` players.
- You want the selection process to be completely fair – any group of `k` players is equally likely.
A Naive (But Fair) Way - Drawing Names from a Hat:

- Write each of the `n` player names on a slip of paper.
- Put all `n` slips into a hat.
- Shake the hat well.
- Draw out `k` slips one by one without putting them back.

This is perfectly fair. Every group of `k` names has an equal chance.
How can we simulate this “drawing from a hat” using the array A directly?
This is what the Offline Sampling algorithm (random_sampling) does. Let’s trace it with A = ['A', 'B', 'C', 'D', 'E'] and k = 3. We want to pick 3 players fairly and put them in A[0], A[1], A[2].
The Algorithm Step-by-Step:
Goal: Fill A[0], then A[1], then A[2] with fairly chosen players.
Step 1: Fill A[0] (i=0)

- Idea: We need to pick one player randomly out of the entire group (`A[0]` to `A[4]`) to put into the first team slot (`A[0]`).
- Action:
  - Choose a random index `r` between `0` and `4` (inclusive). Let's say `random.randrange(0, 5)` gives `r = 3`.
  - Swap the player currently at `A[0]` (which is 'A') with the player at the random index `A[r]` (which is `A[3]`, 'D'). `A` becomes `['D', 'B', 'C', 'A', 'E']`.
- Result: `A[0]` now contains 'D'. 'D' was chosen uniformly from all 5 original players. We have fairly selected the first team member and placed them in the first slot. The remaining players (`['B', 'C', 'A', 'E']`) are now in slots `A[1]` to `A[4]`.
Step 2: Fill A[1] (i=1)

- Idea: We need to pick the second team member, chosen fairly from the players who haven't been picked yet. Where are the unpicked players? They are currently at indices `i=1` to the end (`A[1]` to `A[4]`). There are `n-i = 5-1 = 4` players remaining ('B', 'C', 'A', 'E').
- Action:
  - Choose a random index `r` between `1` and `4` (inclusive). Let's say `random.randrange(1, 5)` gives `r = 1`.
  - Swap the player currently at `A[i]` (which is `A[1]`, 'B') with the player at the random index `A[r]` (which is `A[1]`, 'B'). `A` remains `['D', 'B', 'C', 'A', 'E']` (swapping with itself).
- Result: `A[1]` now contains 'B'. 'B' was chosen uniformly from the 4 players available (B, C, A, E) after 'D' was picked. We have fairly selected the second team member. The remaining 3 players (`['C', 'A', 'E']`) are now in slots `A[2]` to `A[4]`.
Step 3: Fill A[2] (i=2)

- Idea: Pick the third team member fairly from the remaining players, who are at indices `i=2` to the end (`A[2]` to `A[4]`). There are `n-i = 5-2 = 3` players remaining ('C', 'A', 'E').
- Action:
  - Choose a random index `r` between `2` and `4` (inclusive). Let's say `random.randrange(2, 5)` gives `r = 4`.
  - Swap the player currently at `A[i]` (which is `A[2]`, 'C') with the player at the random index `A[r]` (which is `A[4]`, 'E'). `A` becomes `['D', 'B', 'E', 'A', 'C']`.
- Result: `A[2]` now contains 'E'. 'E' was chosen uniformly from the 3 players available (C, A, E) after 'D' and 'B' were picked. We have fairly selected the third team member.
Stop: We needed k=3 players, and we have filled A[0], A[1], A[2].
Final State: The array is ['D', 'B', 'E', 'A', 'C'].
The Sample: The first k=3 elements are ['D', 'B', 'E']. This is our uniformly random sample.
Why is this like drawing from a hat?

- At step `i`, choosing a random index `r` from `i` to `n-1` is like drawing one name randomly from the `n-i` names still left in the hat.
- Swapping `A[i]` with `A[r]` is like taking the drawn name (`A[r]`) and putting it aside as the `i`-th selected player (by placing it at `A[i]`), while the name that was at `A[i]` is effectively put back into the "remaining players" part of the array (at index `r`) to potentially be drawn later.
This process ensures that at each step i, every available element has an equal chance of being selected for position i. Over k steps, this builds up a sample where every possible group of k had an equal shot.
Let's try a different random sequence for `k=2`, `A=['a','b','c','d']`:

- `i=0`: `n=4`. Pick `r` from `[0, 1, 2, 3]`. Say `r=2`. Swap `A[0]` ('a') and `A[2]` ('c'). `A = ['c', 'b', 'a', 'd']`. `A[0]` is fairly chosen.
- `i=1`: `n=4`. Pick `r` from `[1, 2, 3]`. Say `r=3`. Swap `A[1]` ('b') and `A[3]` ('d'). `A = ['c', 'd', 'a', 'b']`. `A[1]` is fairly chosen from the rest.
- Stop: `k=2`. The sample is `A[0..1] = ['c', 'd']`.
Think about filling slots 0 to k-1. For slot i, you randomly pick one element from all elements currently at or after index i and swap it into place.
Section 5.12: Sample Offline Data (Page 18 / PDF Page 54)
The Scenario: Imagine you have a large dataset (like all your customers, or all sensor readings from a day) stored in an array A. You want to select a smaller, random sample of size k from this dataset (e.g., to send a survey to, or to analyze more closely). The key requirement is that every possible subset of size k must have an equal chance of being selected. This is called uniform random sampling. “Offline” means you have the entire dataset A available before you start sampling.
The Problem: Implement an algorithm that takes an array A of n distinct elements and an integer k (where k <= n). It should rearrange A such that the first k elements (A[0] to A[k-1]) hold a uniformly random subset of the original elements of size k.
Example:

- Input: `A = [3, 7, 5, 11]`, `k = 3`
- Possible outputs (in the first `k=3` slots): `[3, 7, 5]`, `[3, 5, 11]`, `[7, 3, 11]`, `[11, 5, 7]`, etc. Each combination of 3 elements should appear with probability `1/(4 choose 3) = 1/4`. The order within the first `k` elements doesn't strictly matter for the subset definition, but the algorithm naturally produces a random permutation within the subset.
Why Not Just Pick Random Indices?
If you just pick k random indices between 0 and n-1, you might pick the same index multiple times. If you discard duplicates and retry, the process becomes complex and potentially slow (related to the Coupon Collector’s problem), especially if k is close to n.
The Efficient In-Place Algorithm (Fisher-Yates Shuffle Adaptation):
This algorithm is elegant and efficient. It builds the random subset in the first k positions of the array iteratively.
Algorithm:

- Iterate with an index `i` from `0` up to `k-1`. This loop runs exactly `k` times.
- In each iteration `i`:
  - We need to select a random element from the portion of the array that hasn't been considered for the sample yet. This "unseen" portion is from index `i` to `n-1` (inclusive).
  - Generate a random integer `r` such that `i <= r < n` (pick a random index from the current position `i` to the end).
  - Swap the element at the current index `i` with the element at the randomly chosen index `r`: `A[i], A[r] = A[r], A[i]`.
- After the loop finishes (after `k` swaps), the subarray `A[0...k-1]` contains the uniformly random sample of size `k`.
Why does this work? (Intuition)
- Iteration `i=0`: We randomly pick an index `r` from `0` to `n-1`. We swap `A[0]` with `A[r]`. Now `A[0]` holds a uniformly random element chosen from the entire original array.
- Iteration `i=1`: We randomly pick an index `r` from `1` to `n-1`. This picks uniformly from all elements except the one we already placed at `A[0]`. We swap `A[1]` with `A[r]`. Now `A[1]` holds a uniformly random element chosen from the remaining `n-1` elements. `A[0]` and `A[1]` together now form a random pair.
- Iteration `i`: We pick a random index `r` from `i` to `n-1`. This selects uniformly from the `n-i` elements that haven't yet been fixed into sample positions `0` to `i-1`. We swap `A[i]` with `A[r]`, placing a uniformly chosen element (from the remaining available ones) into position `i`.
By induction, after k steps, A[0...k-1] holds a uniformly chosen subset of size k. Each element had an equal chance of ending up in each position of the sample.
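The loop above is only a few lines of Python; a sketch:

```python
import random

def random_sampling(k, A):
    """Rearrange A in place so that A[:k] holds a uniformly random
    size-k sample of A's elements (partial Fisher-Yates shuffle).
    O(k) time for the sampling itself, O(1) extra space."""
    for i in range(k):
        # Choose uniformly among the not-yet-selected items A[i:].
        r = random.randrange(i, len(A))
        A[i], A[r] = A[r], A[i]
```

For example, `A = [3, 7, 5, 11]; random_sampling(3, A)` leaves a uniformly random 3-element subset in `A[:3]`, with the unchosen element in `A[3]`.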
Section 5.13: Sample Online Data (Page 19 / PDF Page 55)
The Scenario: Now imagine data isn’t available all at once (“offline”). Instead, you’re receiving a stream of data items (like packets in a network session, or log entries as they happen). You don’t know how many items will be in the stream in total beforehand. You want to maintain a random sample of size k from the items seen so far. As each new item arrives, you update your sample. The sample must always be a uniformly random subset of the items processed up to that point.
The Problem: Design an algorithm that takes an input size k and reads a sequence of data items (an iterator or stream). It should continuously maintain a subset of size k of the items read so far. This subset must be a uniformly random sample.
Why this is Tricky:

- You don't know `n` (the total number of items) in advance.
- When the `(m+1)`-th item arrives, you need to decide whether it should enter your sample of size `k` (replacing one of the existing `k` items) or be discarded. This decision must ensure that after processing `m+1` items, every size-`k` subset of the `m+1` items seen so far has an equal chance of being your sample.
The Algorithm: Reservoir Sampling
This is a classic algorithm for this problem.
Algorithm:

- Initialization: Read the first `k` items from the stream and store them directly into your sample (let's call it `sample_array`, of size `k`). At this point, `sample_array` is trivially a uniform random sample of the first `k` items.
- Process Subsequent Items: For each subsequent item `x` that arrives (from item `k+1` onwards):
  - Let `num_items_seen_so_far` be the total number of items processed so far (including the current item `x`). So, when processing the `(k+1)`-th item, `num_items_seen_so_far = k+1`; when processing the `(k+2)`-th, `num_items_seen_so_far = k+2`; and so on.
  - Generate a random integer `r` between `0` and `num_items_seen_so_far - 1` (inclusive).
  - Decision:
    - If `r < k` (i.e., the random number falls within the range `0` to `k-1`), replace `sample_array[r]` with the current item `x`.
    - If `r >= k`, do nothing; discard the current item `x`.
- After processing all items in the stream, `sample_array` holds a uniformly random sample of size `k` from the entire stream.
Why does this work? (Proof by Induction - High Level)
Base Case: After seeing `k` items, `sample_array` contains all `k` items. This is a uniform random sample (the only possible sample of size `k` from `k` items).

Inductive Hypothesis: Assume that after seeing `m` items (where `m >= k`), `sample_array` (of size `k`) is a uniformly random sample of size `k` from those `m` items. This means any specific item `y` from the first `m` items is in `sample_array` with probability `k/m`.

Inductive Step: Now the `(m+1)`-th item, call it `x_{m+1}`, arrives.

- What is the probability that `x_{m+1}` enters the sample?
  - It enters if the random number `r` (drawn from `0` to `m`) satisfies `r < k`.
  - The probability of this is `k/(m+1)`.
- What is the probability that an old item `y` (which was among the first `m` items and was in the sample) remains in the sample?
  - Prob(`y` was in the sample after `m` items) = `k/m`.
  - Prob(`y` is not kicked out when `x_{m+1}` arrives) = Prob(`x_{m+1}` is not chosen to replace anything) + Prob(`x_{m+1}` is chosen, but `y` is not the one replaced).
  - This is `(1 - k/(m+1)) + (k/(m+1)) * ((k-1)/k) = (m+1-k)/(m+1) + (k-1)/(m+1) = (m+1-k+k-1)/(m+1) = m/(m+1)`.
  - So, Prob(`y` remains in the sample) = Prob(`y` was in the sample) * Prob(`y` not kicked out) = `(k/m) * (m/(m+1)) = k/(m+1)`.

Conclusion: After `m+1` items, both the new item `x_{m+1}` and any old item `y` (that was previously in the sample) have probability `k/(m+1)` of being in the sample. This holds for all items, thus maintaining the uniform random property.
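The algorithm itself is short. Here is a sketch (the function name is ours; the structure follows the steps above):

```python
import itertools
import random

def online_random_sample(stream, k):
    """Return a uniformly random size-k sample of an iterable whose
    total length is unknown in advance (reservoir sampling, O(k) space)."""
    it = iter(stream)
    # Initialization: the first k items form the initial reservoir.
    sample_array = list(itertools.islice(it, k))
    num_items_seen_so_far = k
    for x in it:
        num_items_seen_so_far += 1
        # x enters the reservoir with probability k / num_items_seen_so_far.
        r = random.randrange(num_items_seen_so_far)
        if r < k:
            sample_array[r] = x
    return sample_array
```

Note that the stream is consumed in a single pass and never stored, which is the whole point: only `k` items plus a counter are kept in memory.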
Sampling Algorithms Recap
Recall the Offline Sampling (Fisher-Yates) Logic:
At step `i` (filling the `i`-th slot of our sample, 0-indexed), we do:

- `r = random.randrange(i, n)`
- `A[i], A[r] = A[r], A[i]`

The key principle ensuring fairness here is: at step `i`, we choose uniformly from all items not yet placed in the sample (those in positions `i` through `n-1`) and place the chosen item into position `i`.
Now, the Online/Reservoir Sampling Challenge:
When the m-th item arrives (m > k), we don’t know n. We cannot simply pick a random index r from i to n-1. We only know the items seen so far (m items). We have a reservoir R of size k which (by induction) is a fair sample of the first m-1 items.
Goal: Update the reservoir R using the m-th item (x_m) such that R becomes a fair sample of all m items. Fairness means every item x_1, ..., x_m must have a k/m probability of being in R.
Reservoir Sampling Step for item m:

- Generate `r = random.randrange(m)` (an index from `0` to `m-1`).
- If `r < k`: `R[r] = x_m` (replace the element at index `r` in the reservoir with the new item).
How does this achieve the k/m probability goal?
Let’s connect this to the “drawing from a hat” idea.
- Imagine a Hypothetical Hat: Conceptually, after `m-1` items, our reservoir `R` holds `k` items drawn fairly from a hat containing the first `m-1` items.
- Item `m` Arrives: It's like adding the `m`-th item's name (`x_m`) to the hat. Now the hat contains `m` items.
- Fair Draw Needed: To get a fair sample of size `k` from this hat of `m` items, each item in the hat should have a `k/m` chance of being selected if we were to redraw completely.
Can we achieve this without redrawing?
Let's analyze the chances for the new item `x_m`:

- The algorithm gives `x_m` a chance to enter the reservoir if `r < k`, where `r` is chosen from `0` to `m-1`.
- There are `k` "winning" values for `r` (0 to `k-1`) out of `m` total possibilities.
- So, Prob(`x_m` enters the reservoir) = `k/m`. This is exactly the fairness probability we need for the new item!
Now, let's analyze the chances for an old item `Y` that was already in the reservoir `R` before `x_m` arrived.

- By our inductive hypothesis, `Y` was in the reservoir with probability `k/(m-1)` after `m-1` items.
- For `Y` to still be in the reservoir after item `m` is processed, one of two things must happen:
  - The random number `r` was not less than `k` (i.e., `r >= k`). This happens with probability `(m-k)/m`. In this case, `x_m` is ignored, and `Y` definitely stays.
  - The random number `r` was less than `k` (probability `k/m`), AND the slot chosen for replacement was not `Y`'s slot. Since `r` is uniform over `0` to `k-1` in this case, the chance that `Y`'s specific slot is not picked is `(k-1)/k`. So the probability of this combined event is `(k/m) * ((k-1)/k) = (k-1)/m`.
- Total probability `Y` stays in the reservoir = Prob(Case 1) + Prob(Case 2) = `(m-k)/m + (k-1)/m = (m-1)/m`.
- The final probability that `Y` is in the reservoir after processing item `m` is: Prob(`Y` was in after `m-1`) * Prob(`Y` stays when `m` arrives) = `[k/(m-1)] * [(m-1)/m] = k/m`.
The Connection:
Both algorithms achieve fairness, but through different mechanisms constrained by what information is available:

- Offline (Fisher-Yates): Knows `n`. Can directly simulate picking from the remaining available items at each step `i` by choosing a random index `r` in the future part of the array (`i` to `n-1`) and swapping into position `i`. It ensures fairness by giving every available item an equal chance (`1/(n-i)`) to be put into slot `i`.
- Online (Reservoir): Doesn't know `n`. Cannot pick from future indices. Instead, it uses probabilities based on the count `m` seen so far. It gives the new item `x_m` exactly the target probability `k/m` of being included. If included, the random replacement maintains the correct probabilities for the older items, ensuring they also end up with probability `k/m`. It achieves the same final distribution as offline sampling, but through an incremental, probabilistic update rather than direct selection and placement.
Think of Reservoir Sampling as cleverly calculating how often the new item should replace an old one to maintain the correct k/m probability for everyone in the long run, even without knowing the total number of items m will eventually reach.
Section 5.15: Compute a Random Subset (Page 21 / PDF Page 57)
The Problem: Given a positive integer n and a size k <= n, return a size-k subset of the set {0, 1, 2, ..., n-1}. The subset should be represented as an array. All (n choose k) possible subsets should be equally likely. Additionally, all permutations of elements within the returned array should be equally likely (though this is often a natural byproduct of algorithms that produce uniform subsets).
How is this different from Offline Sampling (Section 5.12)?

- Offline Sampling (5.12): Input is an existing array `A`. Output is a modification of `A` where `A[0...k-1]` holds the sample. It operates on given elements.
- This Problem (5.15): Input is just the integers `n` and `k`. The "elements" are implicitly the integers from `0` to `n-1`. We need to construct an array containing a random subset of these integers.
Solution Approach 1: Mimic Offline Sampling (Fisher-Yates)
- Create an array `elements = [0, 1, ..., n-1]`.
- Apply the Offline Sampling algorithm (from Section 5.12, also known as the Fisher-Yates shuffle) to this `elements` array to pick `k` items.
- Return the first `k` elements of the shuffled array (`elements[0...k-1]`).

Complexity:

- Time: O(n) to create the initial `elements` array + O(k) for sampling = O(n).
- Space: O(n) for the `elements` array.
Can we do better if k is much smaller than n? (e.g., sample 3 items from 1 million)
Creating an array of 1 million elements just to pick 3 seems wasteful.
Solution Approach 2: Using a Hash Table (for k << n)
This approach avoids creating the full array of n elements if k is small. It directly simulates the Fisher-Yates logic but only keeps track of the elements that would have been moved from their original positions.
The Idea:
Imagine you’re running the Fisher-Yates shuffle on the conceptual array [0, 1, ..., n-1].
- The array initially has `A[j] = j`.
- When you iterate for the `i`-th element of your sample (from `i=0` to `k-1`):
  - You pick a random index `rand_idx` from `i` to `n-1`.
  - You swap `A[i]` and `A[rand_idx]`.

If `k` is small, most of the conceptual array `A` will remain `A[j] = j`; only a few elements get swapped. We can use a hash table (a dictionary in Python) to store only those positions whose values are not equal to their indices (i.e., the elements that have been moved).
Algorithm with Hash Table:
- Initialize an empty dictionary `changed_elements`. It will store `index: value` pairs for any position whose value differs from the identity permutation `[0, 1, ..., n-1]` after our simulated swaps.
- Iterate `i` from `0` to `k-1` (to select `k` elements for our subset):
  - Generate a random index `rand_idx` in the range `[i, n-1]`.
  - Get current values:
    - `value_at_i = changed_elements.get(i, i)` (if `i` is in the dictionary, use its stored value; otherwise its value is `i`).
    - `value_at_rand_idx = changed_elements.get(rand_idx, rand_idx)` (similarly for `rand_idx`).
  - Perform the swap (conceptually, by updating the dictionary):
    - Store `value_at_i` at `rand_idx`: `changed_elements[rand_idx] = value_at_i`.
    - Store `value_at_rand_idx` at `i`: `changed_elements[i] = value_at_rand_idx`.
- After the loop, the desired subset consists of the values conceptually in `A[0]...A[k-1]`. Retrieve them from `changed_elements`, defaulting to the index itself if it is not a key: `result_subset = [changed_elements.get(j, j) for j in range(k)]`.
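Put together, this is a sketch of the dictionary-backed variant (the function name is ours):

```python
import random

def random_subset(n, k):
    """Return a uniformly random size-k subset of {0, ..., n-1} in
    O(k) time and space, simulating Fisher-Yates on a virtual
    identity array via a dictionary of displaced entries."""
    changed_elements = {}
    for i in range(k):
        rand_idx = random.randrange(i, n)
        # Values currently at positions i and rand_idx of the
        # conceptual array (default: the index itself).
        value_at_i = changed_elements.get(i, i)
        value_at_rand_idx = changed_elements.get(rand_idx, rand_idx)
        # Record the swap in the dictionary.
        changed_elements[rand_idx] = value_at_i
        changed_elements[i] = value_at_rand_idx
    return [changed_elements.get(j, j) for j in range(k)]
```

For example, `random_subset(10**6, 3)` picks 3 of a million integers without ever materializing the million-element array.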
When is the Hash Table approach better?
When k is significantly smaller than n.
- Fisher-Yates on the full array (Approach 1):
  - Time: O(n)
  - Space: O(n) (for the initial array `[0, ..., n-1]`)
- Hash Table method (Approach 2):
  - Time: O(k) (k iterations, each with dictionary operations that are O(1) on average)
  - Space: O(k) (at most `2k` entries in the dictionary, since each iteration adds at most two keys)
So, if k << n (k is much much less than n), the hash table method wins on both time and space by avoiding the creation and full scan of the n-element array.
If k is close to n (e.g., `k = n/2` or `k = n-1`), the O(n) approach (Approach 1) is fine and may even be faster in practice, because plain array indexing has lower constant-factor overhead than dictionary get/set operations. The constant factors matter.
Section 5.16: Generate Nonuniform Random Numbers (Page 22 / PDF Page 58)
The Problem: You are given:
- A list of `n` numbers (values): `T = [t_0, t_1, ..., t_{n-1}]`.
- A list of `n` corresponding probabilities: `P = [p_0, p_1, ..., p_{n-1}]`. These probabilities sum to 1.

You need to write a function that generates one of the numbers from `T` according to its specified probability `p_i`. You have access to a standard uniform random number generator that produces values in the range `[0.0, 1.0)` (i.e., 0 inclusive, 1 exclusive).

Example:

- Numbers `T = [3, 5, 7, 11]`
- Probabilities `P = [9/18, 6/18, 2/18, 1/18]` (which are `[0.5, 0.333..., 0.111..., 0.055...]`)
If you call your function many times:
- About 50% of the time, it should return 3.
- About 33.3% of the time, it should return 5.
- About 11.1% of the time, it should return 7.
- About 5.5% of the time, it should return 11.
The Core Idea: Mapping the [0,1) Interval
The uniform random number generator gives us a value r in [0.0, 1.0). We need to divide this [0.0, 1.0) interval into n segments, where the length of each segment corresponds to the probability of picking the associated number.
- Segment 0: length `p_0`, corresponds to `t_0`. Range `[0, p_0)`.
- Segment 1: length `p_1`, corresponds to `t_1`. Range `[p_0, p_0 + p_1)`.
- Segment 2: length `p_2`, corresponds to `t_2`. Range `[p_0 + p_1, p_0 + p_1 + p_2)`.
- …and so on.
- Segment n-1: length `p_{n-1}`, corresponds to `t_{n-1}`. Range `[p_0 + ... + p_{n-2}, 1.0)`.
Analogy: A Dartboard with Unequal Slices
Imagine you have a circular dartboard. Instead of all slices being equal (like in a standard dartboard), the slices have different sizes.
- You have `n` numbers you want to pick from: `t_0, t_1, t_2, ..., t_{n-1}`.
- Each number `t_i` has a probability `p_i` of being picked.
- The slice on the dartboard for `t_0` takes up a `p_0` fraction of the dartboard's area, the slice for `t_1` a `p_1` fraction, and so on. All the `p_i` fractions add up to 1 (the whole dartboard).
Example:

- Numbers `T = [Red, Green, Blue]`
- Probabilities `P = [0.5 (Red), 0.3 (Green), 0.2 (Blue)]`
  - The Red slice covers 50% of the dartboard.
  - The Green slice covers 30%.
  - The Blue slice covers 20%.
How to Pick Fairly?
- Throw a dart completely randomly at the dartboard. “Completely randomly” means any point on the board is equally likely to be hit.
- See which slice your dart lands in. That’s the number you pick. Since the Red slice is bigger, you’re more likely to hit Red. This simulates the probabilities.
Connecting to random.random() which gives a number in [0.0, 1.0)
Instead of a circular dartboard, let’s imagine a straight line segment of length 1. Think of it like a ruler from 0.0 to 1.0. We’re going to divide this ruler into sections, where the length of each section is proportional to the probability.
For `T = [Red, Green, Blue]` and `P = [0.5, 0.3, 0.2]`:

- Section for Red: length 0.5, from 0.0 up to 0.5. (Range: `[0.0, 0.5)`)
- Section for Green: length 0.3, starting where Red left off, from 0.5 up to 0.5 + 0.3 = 0.8. (Range: `[0.5, 0.8)`)
- Section for Blue: length 0.2, starting where Green left off, from 0.8 up to 0.8 + 0.2 = 1.0. (Range: `[0.8, 1.0)`)
Visually:
| 0.0 0.5 0.8 1.0 | ||
|---|---|---|
| <— Red —> <– Green –> <– Blue –> | ||
| (length 0.5) (length 0.3) (length 0.2) |
The Algorithm using this “Ruler”:
Prepare the Ruler Divisions (these are the "cumulative probabilities"):

- Red ends at: `0.5`
- Green ends at: `0.5 + 0.3 = 0.8`
- Blue ends at: `0.5 + 0.3 + 0.2 = 1.0`

Let's call these end-points `Endpoints = [0.5, 0.8, 1.0]`.

"Throw a Dart": Get a random number from your computer, `rand_val = random.random()`. This `rand_val` will be somewhere between 0.0 and 1.0 (but not including 1.0).

See Which Section it Lands In:

- If `0.0 <= rand_val < Endpoints[0]` (i.e., `rand_val < 0.5`), it landed in Red's section. Output: Red.
- Else if `Endpoints[0] <= rand_val < Endpoints[1]` (i.e., `0.5 <= rand_val < 0.8`), it landed in Green's section. Output: Green.
- Else if `Endpoints[1] <= rand_val < Endpoints[2]` (i.e., `0.8 <= rand_val < 1.0`), it landed in Blue's section. Output: Blue.
- If
Why does this work?
Because the length of each section on our “ruler” is exactly its probability, a uniformly random rand_val is proportionally more likely to fall into longer sections.
Example from the Book:

- Numbers `T = [3, 5, 7, 11]`
- Probabilities `P = [9/18, 6/18, 2/18, 1/18]`, i.e., `P_decimal = [0.5, 0.333..., 0.111..., 0.055...]`

Ruler Divisions (Cumulative Probabilities):

- `div[0]` (for number 3): `0.5`
- `div[1]` (for number 5): `0.5 + 0.333... = 0.833...`
- `div[2]` (for number 7): `0.833... + 0.111... = 0.944...`
- `div[3]` (for number 11): `0.944... + 0.055... = 1.0`

So `Endpoints = [0.5, 0.833..., 0.944..., 1.0]`.

Throw a Dart: `rand_val = random.random()`. Let's say `rand_val = 0.75`.

See Where it Lands:

- Is `0.75 < Endpoints[0]` (0.5)? No.
- Is `0.75 < Endpoints[1]` (0.833...)? Yes. It landed between `Endpoints[0]` and `Endpoints[1]`, which corresponds to the second number in our original list `T`: 5. So we output 5.

What if `rand_val = 0.2`?

- Is `0.2 < Endpoints[0]` (0.5)? Yes. It landed before the first endpoint, which corresponds to the first number, 3. Output 3.
The “Binary Search” Part (e.g., bisect_left)
When we have many sections (many numbers t_i), checking rand_val against each endpoint one by one (if rand_val < E[0]… else if rand_val < E[1]…) is slow (O(n)).
Since the Endpoints array [0.5, 0.833, 0.944, 1.0] is sorted, we can use binary search to quickly find which “bucket” rand_val falls into.
`bisect_left(Endpoints, rand_val)` efficiently finds the index `idx` such that every `Endpoints[j]` with `j < idx` is less than `rand_val`, and `Endpoints[idx]` is the first endpoint greater than or equal to `rand_val`. This `idx` is precisely the index of the chosen `t_i` in our original list `T`. (Since the segments are half-open intervals `[lo, hi)`, `bisect.bisect_right` is the variant whose boundary behavior matches them exactly when `rand_val` equals an endpoint; for typical floating-point draws the two behave identically.)
Summary in Simple Terms:
- Line up all your probabilities end-to-end on a ruler of length 1. Mark where each probability “ends”.
- Generate a random point on that ruler (a random number between 0 and 1).
- See which probability’s segment your random point landed in.
- Pick the number associated with that segment. (Binary search is just a fast way to do step 3 if you have lots of segments).
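The four steps above fit in a few lines of Python. Here is a minimal sketch (the function name and the use of itertools.accumulate to build the endpoints are my choices for illustration, not necessarily the book's exact code):

```python
import bisect
import itertools
import random

def nonuniform_random_number_generation(values, probabilities):
    """Return one of `values`, where values[i] is picked with probabilities[i]."""
    # Step 1: mark where each probability segment ends on the [0, 1) "ruler".
    # For P = [0.5, 1/3, 1/9, 1/18] this yields roughly [0.5, 0.833, 0.944, 1.0].
    endpoints = list(itertools.accumulate(probabilities))
    # Step 2: throw a dart -- a uniformly random point on the ruler.
    rand_val = random.random()
    # Step 3: binary-search for the first endpoint >= rand_val;
    # its index identifies the segment (and thus the value) the dart hit.
    return values[bisect.bisect_left(endpoints, rand_val)]
```

With T = [3, 5, 7, 11] and P = [9/18, 6/18, 2/18, 1/18], repeated calls should return 3 about half the time and 11 only about 1 time in 18.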
Section 5.17: The Sudoku Checker Problem (Page 24 / PDF Page 60)
The Problem: You are given a 9x9 2D array (a grid) representing a partially or fully completed Sudoku puzzle. The grid contains integers from 0 to 9. A value of 0 indicates an empty cell. Values from 1 to 9 are filled-in digits.
Your task is to determine if the current state of the Sudoku grid is valid according to Sudoku rules:
- Each row must contain unique digits from 1 to 9 (ignoring 0s).
- Each column must contain unique digits from 1 to 9 (ignoring 0s).
- Each of the nine 3x3 subgrids (also called “boxes” or “regions”) must contain unique digits from 1 to 9 (ignoring 0s).
The problem asks only to check for validity (no duplicate 1-9 digits in any row, column, or 3x3 box). It does not ask you to solve the Sudoku.
Example:
- Figure 5.2(a) on page 60 is a partially filled valid Sudoku.
- Figure 5.2(b) is a complete, valid solution.
- An invalid example: the row [5, 3, 4, 6, 7, 8, 9, 1, 5] is invalid because 5 appears twice.
Core Implementation Idea (Directly Testing Constraints):
The most straightforward approach is to check each of the three types of constraints one by one. For checking uniqueness within a row, column, or 3x3 box, a common technique is to use a helper data structure (like a hash set or a boolean array) to keep track of the digits seen so far within that unit.
Algorithm Outline:
Function to Check a Unit (Row, Column, or Box):
- Create a helper function, say has_duplicate(unit_array), that takes a 1D array (representing a row, column, or flattened 3x3 box) and returns True if it contains duplicate digits from 1-9, False otherwise.
- Inside has_duplicate(unit_array):
  - Initialize seen_digits = set() (or a boolean array seen = [False] * 10).
  - Iterate through each digit in unit_array:
    - If digit == 0, ignore it (empty cell).
    - If digit != 0 and digit is already in seen_digits (or seen[digit] is True), a duplicate is found: return True immediately.
    - Otherwise, add digit to seen_digits (or set seen[digit] = True).
  - If the loop finishes without finding duplicates, return False.
Check Row Constraints:
- Iterate i from 0 to 8 (for each row).
- Extract row i from the Sudoku grid: row_data = grid[i].
- If has_duplicate(row_data) is True, the Sudoku is invalid: return False.

Check Column Constraints:
- Iterate j from 0 to 8 (for each column).
- Extract column j: col_data = [grid[i][j] for i in range(9)].
- If has_duplicate(col_data) is True, the Sudoku is invalid: return False.

Check 3x3 Subgrid Constraints:
- Iterate through the 9 subgrids. A common way is to use nested loops for the top-left corner of each subgrid:
  - box_row_start from 0 to 6, with a step of 3 (i.e., 0, 3, 6).
  - box_col_start from 0 to 6, with a step of 3 (i.e., 0, 3, 6).
  - For each (box_row_start, box_col_start), extract the 3x3 subgrid data and call has_duplicate on it. If it returns True, the Sudoku is invalid: return False from the main function.
All Checks Passed: If all the above checks pass without returning False, then the Sudoku grid is valid. Return True.
Simplified Implementation Steps:

Helper function is_unit_valid(list_of_9_cells):
- Takes a list (a row, a column, or the 9 cells of a box).
- Uses a seen_numbers = set().
- For each cell in list_of_9_cells:
  - If cell == 0, continue.
  - If cell is in seen_numbers, return False (duplicate found!).
  - Add cell to seen_numbers.
- If the loop finishes, return True (no duplicates).

Main function is_sudoku_valid(grid):
- Rows: for each row r from 0 to 8, if not is_unit_valid(grid[r]), return False.
- Columns: for each column c from 0 to 8, create column_list = [grid[r][c] for r in range(9)]; if not is_unit_valid(column_list), return False.
- Boxes: for each box (box_row_start in 0, 3, 6 and box_col_start in 0, 3, 6), create box_list by picking its 9 cells; if not is_unit_valid(box_list), return False.
- If you get through all of this, return True.
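These steps translate almost line-for-line into Python. A minimal sketch (the names is_unit_valid and is_sudoku_valid follow the outline above; this is an illustration, not the book's exact listing):

```python
def is_unit_valid(cells):
    """True if the given cells contain no duplicate digits 1-9 (0s ignored)."""
    seen_numbers = set()
    for cell in cells:
        if cell == 0:
            continue          # Empty cell: ignore.
        if cell in seen_numbers:
            return False      # Duplicate found.
        seen_numbers.add(cell)
    return True

def is_sudoku_valid(grid):
    # Rows.
    if not all(is_unit_valid(grid[r]) for r in range(9)):
        return False
    # Columns.
    if not all(is_unit_valid([grid[r][c] for r in range(9)]) for c in range(9)):
        return False
    # 3x3 boxes, identified by their top-left corner (0, 3, 6).
    for box_row_start in range(0, 9, 3):
        for box_col_start in range(0, 9, 3):
            box = [grid[r][c]
                   for r in range(box_row_start, box_row_start + 3)
                   for c in range(box_col_start, box_col_start + 3)]
            if not is_unit_valid(box):
                return False
    return True
```

Each cell is visited a constant number of times (once per row check, column check, and box check), so the whole validation is O(n^2) for an n x n grid with O(n) extra space for the sets.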
- Rows: For each row
Section 5.18: Compute the Spiral Ordering of a 2D Array (Page 25 / PDF Page 61)
What: Given an n x n 2D array, return a 1D list of its elements in “spiral order” (outside-in, clockwise).
Why (Interview): This is a common 2D array traversal problem. It tests:
- Ability to manage 2D array indices carefully.
- Handling boundary conditions.
- Devising a systematic way to cover all elements without repetition.
- State management (current position, current direction, visited cells).
Underlying Pattern: Systematic Traversal with State Management. The two main patterns are “Layer Peeling” or “Path Simulation.”
Simplified Implementation Intuition (Simulation Approach - “Robot Walk”):
Imagine a robot starting at matrix[0][0]. It needs to walk in a spiral and report the numbers it steps on.
Setup:
- result_list = [] (to store the spiral numbers).
- current_row = 0, current_col = 0.
- directions = [[0, 1], [1, 0], [0, -1], [-1, 0]] (these are [row_change, col_change] for Right, Down, Left, Up).
- current_direction_idx = 0 (start by going Right).
- num_elements = n * n.
- A way to mark visited cells: let's assume we can modify the input matrix, setting matrix[row][col] = None (or some other special marker) after visiting. (If not, use a separate visited[n][n] boolean array.)
The Walk (loop num_elements times):
- Collect: add matrix[current_row][current_col] to result_list.
- Mark: matrix[current_row][current_col] = None (or visited[current_row][current_col] = True).
- Try to move in the current direction:
  - next_row_change, next_col_change = directions[current_direction_idx]
  - potential_next_row = current_row + next_row_change
  - potential_next_col = current_col + next_col_change
- Check if a turn is needed:
  - Is (potential_next_row, potential_next_col) OFF the grid (e.g., potential_next_row < 0 or >= n, or potential_next_col < 0 or >= n)?
  - OR is matrix[potential_next_row][potential_next_col] ALREADY visited (e.g., == None)?
- If YES (turn needed):
  - Change direction: current_direction_idx = (current_direction_idx + 1) % 4 (turn clockwise).
  - Recalculate the actual next step with the new direction: next_row_change, next_col_change = directions[current_direction_idx], then current_row += next_row_change and current_col += next_col_change.
- If NO (can continue in the current direction):
  - current_row = potential_next_row, current_col = potential_next_col.
- After the loop, return result_list.
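The robot walk above can be sketched directly in Python. This version mutates the input matrix to mark visited cells with None, as described (the function name is mine; it assumes the matrix does not legitimately contain None):

```python
def matrix_in_spiral_order(matrix):
    """Return the elements of an n x n matrix in clockwise spiral order."""
    n = len(matrix)
    # [row_change, col_change] for Right, Down, Left, Up.
    directions = [[0, 1], [1, 0], [0, -1], [-1, 0]]
    direction_idx = 0          # Start by going Right.
    row = col = 0
    result = []
    for _ in range(n * n):
        result.append(matrix[row][col])
        matrix[row][col] = None  # Mark the cell as visited.
        dr, dc = directions[direction_idx]
        next_row, next_col = row + dr, col + dc
        # Turn clockwise if the next cell is off the grid or already visited.
        # (The bounds check runs first, so we never index out of range.)
        if (not (0 <= next_row < n and 0 <= next_col < n)
                or matrix[next_row][next_col] is None):
            direction_idx = (direction_idx + 1) % 4
            dr, dc = directions[direction_idx]
            next_row, next_col = row + dr, col + dc
        row, col = next_row, next_col
    return result
```

On [[1,2,3],[4,5,6],[7,8,9]] this produces [1, 2, 3, 6, 9, 8, 7, 4, 5], matching the trace below.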
Example Trace Snippet (3x3): [[1,2,3],[4,5,6],[7,8,9]]
res=[], (r,c)=(0,0), dir=0 (Right)
- Add 1. res=[1]. matrix[0][0]=None. Try (0,1) (Right). Valid. (r,c)=(0,1).
- Add 2. res=[1,2]. matrix[0][1]=None. Try (0,2) (Right). Valid. (r,c)=(0,2).
- Add 3. res=[1,2,3]. matrix[0][2]=None. Try (0,3) (Right). INVALID (off grid).
  - Turn! dir=(0+1)%4 = 1 (Down).
  - New try from (0,2) with Down: (1,2). Valid. (r,c)=(1,2).
- Add 6. res=[1,2,3,6]. matrix[1][2]=None. Try (2,2) (Down). Valid. (r,c)=(2,2).
- Add 9. res=[1,2,3,6,9]. matrix[2][2]=None. Try (3,2) (Down). INVALID (off grid).
  - Turn! dir=(1+1)%4 = 2 (Left).
  - New try from (2,2) with Left: (2,1). Valid. (r,c)=(2,1).
- ...and so on.
Managing Direction:
We need two things:
- A way to represent the current direction.
- A way to change the direction systematically (usually clockwise).
1. Representing Direction: We can assign numbers to directions:
- 0: Right
- 1: Down
- 2: Left
- 3: Up
We store the robot’s current direction in a variable, say current_direction_idx (initialized to 0 for Right).
2. Changing Direction (Turning Clockwise): When the robot hits a wall or an already visited cell, it needs to turn right (clockwise).
- If current direction is Right (0), next is Down (1).
- If current direction is Down (1), next is Left (2).
- If current direction is Left (2), next is Up (3).
- If current direction is Up (3), next is Right (0).
This pattern can be achieved with a simple calculation:
current_direction_idx = (current_direction_idx + 1) % 4
(% 4, modulo 4, ensures that if current_direction_idx would become 4, it wraps around back to 0.)
- (0 + 1) % 4 = 1
- (1 + 1) % 4 = 2
- (2 + 1) % 4 = 3
- (3 + 1) % 4 = 0
3. How Direction Affects Movement (Delta Row, Delta Column):
Once we know the current_direction_idx, how do we know how to change the row and col to move one step? We can use a lookup table (an array or list of lists/tuples) that stores the change in row (dr) and change in column (dc) for each direction index.
Let SHIFT = [[0, 1], # Direction 0 (Right): row doesn't change, col increases by 1
[1, 0], # Direction 1 (Down): row increases by 1, col doesn't change
[0, -1], # Direction 2 (Left): row doesn't change, col decreases by 1
[-1, 0]] # Direction 3 (Up): row decreases by 1, col doesn't change
When the robot wants to move:
- Get the changes for the current direction: dr = SHIFT[current_direction_idx][0] and dc = SHIFT[current_direction_idx][1].
- Calculate the potential next position: potential_next_row = current_row + dr and potential_next_col = current_col + dc.
Section 5.19: Rotate a 2D Array (Page 28 / PDF Page 64)
The Problem: You are given an n x n 2D array (a square matrix). You need to rotate this matrix by 90 degrees clockwise, in-place (O(1) additional space).
Example (from Figure 5.4):
Input Matrix A:
[ [ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16] ]
Output Matrix A (after rotation):
[ [13, 9, 5, 1],
[14, 10, 6, 2],
[15, 11, 7, 3],
[16, 12, 8, 4] ]
Observations / How Elements Move:
- The first row of the original ([1,2,3,4]) becomes the last column of the rotated matrix, read from top to bottom ([1,2,3,4]).
- The last column of the original ([4,8,12,16]) becomes the last row of the rotated matrix, reversed ([16,12,8,4]). (It's easier to reason about this in terms of layers.)
- More precisely: the value at A[row][col] moves to A[col][n-1-row].
Approach 1: Using Extra Space (Not In-Place, for understanding)
- Create a new n x n matrix, say rotated_matrix.
- For each element A[row][col] in the original matrix, its new position is rotated_matrix[col][n - 1 - row], so set rotated_matrix[col][n - 1 - row] = A[row][col].
- Copy rotated_matrix back to A.
- Time: O(n^2). Space: O(n^2) for rotated_matrix, which fails the in-place requirement.
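For reference, Approach 1 has a well-known compact form in Python: reversing the row order and then transposing with zip applies exactly the A[row][col] -> [col][n-1-row] mapping. Still O(n^2) extra space, so this is only useful for contrast with the in-place approach (the function name is mine):

```python
def rotate_with_copy(A):
    """Return a new matrix that is A rotated 90 degrees clockwise."""
    # A[::-1] reverses the row order; zip(*...) transposes.
    # Together they send A[row][col] to position [col][n - 1 - row].
    return [list(row) for row in zip(*A[::-1])]
```

For the 4x4 example, the first output row is [13, 9, 5, 1], matching Figure 5.4.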
Approach 2: In-Place Rotation by Layers
This is the standard and efficient way to do it in-place.
Key Idea: The rotation can be performed layer by layer, from the outermost layer inwards. Within each layer, elements move in groups of four.
Consider the outermost layer of the 4x4 example:
- A[0][0] (1) moves to where A[0][3] (4) was.
- A[0][3] (4) moves to where A[3][3] (16) was.
- A[3][3] (16) moves to where A[3][0] (13) was.
- A[3][0] (13) moves to where A[0][0] (1) was.

This is a 4-way swap for the corners: (1) -> (4) -> (16) -> (13) -> (1).
Original positions: A[0][0], A[0][3], A[3][3], A[3][0]
After 90-deg clockwise rotation, their new conceptual positions are:
- The value originally at A[3][0] (13) moves to A[0][0].
- The value originally at A[0][0] (1) moves to A[0][3].
- The value originally at A[0][3] (4) moves to A[3][3].
- The value originally at A[3][3] (16) moves to A[3][0].
So, the values move like this in a cycle (let TL=TopLeft, TR=TopRight, BR=BottomRight, BL=BottomLeft for a specific group of 4):
temp = value_at_TL
value_at_TL = value_at_BL
value_at_BL = value_at_BR
value_at_BR = value_at_TR
value_at_TR = temp
Algorithm for In-Place Layer Rotation:
- The number of layers is n // 2.
- Iterate through layers, from layer = 0 to n // 2 - 1. For each layer:
  - Define first = layer (the first row/col index of this layer).
  - Define last = n - 1 - layer (the last row/col index of this layer).
  - Iterate i from first to last - 1 along the layer's top row (the last element is excluded because it is handled as a corner of another 4-element cycle).
    - Let offset = i - first. This offset identifies the corresponding elements on the other three sides of the current square within the layer.
    - For each i (or offset), perform a 4-way swap:
      - Save the top element: top = matrix[first][i] (which is matrix[first][first + offset])
      - Left moves to Top: matrix[first][i] = matrix[last - offset][first]
      - Bottom moves to Left: matrix[last - offset][first] = matrix[last][last - offset]
      - Right moves to Bottom: matrix[last][last - offset] = matrix[i][last] (which is matrix[first + offset][last])
      - Saved Top moves to Right: matrix[i][last] = top
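The layer-by-layer 4-way swap above, as a short Python sketch (the function name is mine; the index arithmetic follows the outline exactly):

```python
def rotate_in_place(matrix):
    """Rotate an n x n matrix 90 degrees clockwise using O(1) extra space."""
    n = len(matrix)
    for layer in range(n // 2):                    # Peel layers outside-in.
        first, last = layer, n - 1 - layer
        for i in range(first, last):               # Skip the final corner.
            offset = i - first
            top = matrix[first][i]                              # Save Top.
            matrix[first][i] = matrix[last - offset][first]     # Left -> Top.
            matrix[last - offset][first] = matrix[last][last - offset]  # Bottom -> Left.
            matrix[last][last - offset] = matrix[i][last]       # Right -> Bottom.
            matrix[i][last] = top                               # Saved Top -> Right.
```

Each element is touched exactly once, so the time is O(n^2) with O(1) extra space, satisfying the in-place requirement.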
Section 5.20: Compute Rows in Pascal’s Triangle (Page 29-30 / PDF Page 65-66)
The Problem:
You are asked to generate the first n rows of Pascal’s Triangle.
Pascal’s Triangle has a specific structure:
- The first row (Row 0 conceptually) is just [1].
- Each subsequent row has one more element than the previous row.
- The first and last element of every row is 1.
- Every other element in a row is the sum of the two elements directly above it (to its left and right) in the previous row.
Figure 5.5 Example (First 5 rows):
Row 0: 1
Row 1: 1 1
Row 2: 1 2 1
Row 3: 1 3 3 1
Row 4: 1 4 6 4 1
Input: A non-negative integer n (number of rows to generate).
Output: A list of lists, where each inner list represents a row of Pascal’s Triangle.
Example:
If n = 5, output should be:
[
[1],
[1, 1],
[1, 2, 1],
[1, 3, 3, 1],
[1, 4, 6, 4, 1]
]
The Core Rule (from the problem description):
“the j-th entry in the i-th row is 1 if j = 0 or j = i, otherwise it is the sum of the (j - 1)-th and j-th entries in the (i - 1)-th row.”
(Assuming 0-indexed rows i and 0-indexed entries j within a row).
Let Triangle[i][j] be the element at row i, column j.
- Triangle[i][0] = 1
- Triangle[i][i] = 1 (since row i has i+1 elements, index i is the last one)
- Triangle[i][j] = Triangle[i-1][j-1] + Triangle[i-1][j] for 0 < j < i
Implementation Idea (Building Row by Row):
- Initialize an empty list pascal_triangle_result = [].
- Iterate i from 0 to n-1 (to generate n rows, Row 0 through Row n-1). For each i:
  - Create a current_row list. Row i will have i+1 elements.
  - The first element, current_row[0], is always 1.
  - The last element, current_row[i], is always 1 (if i > 0).
  - For the elements in between (from j=1 to j=i-1): current_row[j] = pascal_triangle_result[i-1][j-1] + pascal_triangle_result[i-1][j]. This means we need access to the previous row (pascal_triangle_result[i-1]) that we already computed and stored.
  - Add current_row to pascal_triangle_result.
- Return pascal_triangle_result.
The Code Explained (generate_pascal_triangle - from the book):
def generate_pascal_triangle(n):
    if n == 0:
        return []
    # Initialize the result list.
    # The book's list comprehension initializes every row with all 1s, which
    # is clever because the first and last elements of each row are always 1.
    # result[i] will have i + 1 elements.
    result = [[1] * (i + 1) for i in range(n)]
    # Example for n=3: result will be [[1], [1, 1], [1, 1, 1]]
    for i in range(n):  # Iterate through rows.
        # Calculate elements from index 1 up to index i - 1
        # (index 0 and index i are already 1). The inner loop only
        # runs for i >= 2: Row 0 and Row 1 are already correct as all 1s.
        for j in range(1, i):
            # Sum of the two elements directly above in the previous row.
            result[i][j] = result[i - 1][j - 1] + result[i - 1][j]
    return result
Let’s trace for n=4:
- result = [[1] * (i + 1) for i in range(4)] gives result = [[1], [1,1], [1,1,1], [1,1,1,1]].
- Outer loop for i in range(4):
  - i = 0: inner loop for j in range(1, 0) does not run. result[0] is [1]. Correct.
  - i = 1: inner loop for j in range(1, 1) does not run. result[1] is [1,1]. Correct.
  - i = 2 (generating Row 2, [1,2,1]): inner loop for j in range(1, 2):
    - j = 1: result[2][1] = result[1][0] + result[1][1] = 1 + 1 = 2.
    - result[2] is now [1, 2, 1]. Correct. (It was [1,1,1]; the element at index 1 got updated.)
  - i = 3 (generating Row 3, [1,3,3,1]): inner loop for j in range(1, 3):
    - j = 1: result[3][1] = result[2][0] + result[2][1] = 1 + 2 = 3.
    - j = 2: result[3][2] = result[2][1] + result[2][2] = 2 + 1 = 3.
    - result[3] is now [1, 3, 3, 1]. Correct.
- Return result.
Complexity:
- The number of elements in Pascal’s triangle up to row n-1 is 1 + 2 + ... + n = n*(n+1)/2.
- Each element is computed in O(1) time (one addition).
- Time Complexity: O(n^2).
- Space Complexity: O(n^2) to store the triangle.
Variant: Compute only the n-th row of Pascal’s triangle using O(n) space.
(Here n would mean Row n; e.g., if the input is 3, compute Row 3, [1,3,3,1].)
- Idea: to compute Row i, you only need Row i-1. You don't need Row i-2, Row i-3, etc.
- So you can keep track of just two rows: previous_row and current_row.
- Initialize previous_row = [1].
- Loop k from 1 up to n (to compute up to Row n):
  - current_row will have k+1 elements, with current_row[0] = 1 and current_row[k] = 1.
  - For j from 1 to k-1: current_row[j] = previous_row[j-1] + previous_row[j].
  - Set previous_row = current_row for the next iteration.
- The final current_row (or previous_row after the loop) is the answer.
- This takes O(n) iterations to reach the n-th row; each iteration does O(k) work (where k goes up to n). Total time: O(1 + 2 + ... + n) = O(n^2).
- Space: O(n), because we only store two rows at a time, each of length at most n+1.
- Further optimization (O(n) space, O(n^2) time, with a single array): if row currently stores Row i-1, compute Row i in place by setting row[j] = row[j] + row[j-1] for j from i-1 down to 1 (row[0] remains 1), then append a 1 at the end for row[i].
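The single-array variant described above, sketched in Python (the function name is mine; the in-place right-to-left update is the key trick):

```python
def pascal_row(k):
    """Return Row k of Pascal's triangle using one array of size k + 1."""
    row = [1]  # Row 0.
    for i in range(1, k + 1):
        # row currently holds Row i-1. Updating right-to-left lets each
        # row[j] read the old (Row i-1) value of row[j - 1] before it
        # is overwritten. row[0] stays 1.
        for j in range(i - 1, 0, -1):
            row[j] = row[j] + row[j - 1]
        row.append(1)  # The last element of Row i.
    return row
```

Trace for k = 3: [1] -> [1, 1] -> [1, 2, 1] -> [1, 3, 3, 1], using O(n) space throughout.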
Interview Takeaway:
- Tests basic 2D array/list of lists manipulation.
- Understanding the recursive definition of Pascal’s triangle is key.
- The O(n^2) time and space for generating all n rows is standard.
- The variant tests if you can optimize space when only the last result is needed, a common DP optimization technique (reducing space by noticing limited dependencies).