
Chapter 5: Arrays


Chapter 5 Overview:

This chapter covers the basic properties of arrays (specifically Python lists), common manipulation techniques, and classic array-based problems. It heavily emphasizes thinking about space complexity and optimizing operations.

Think of arrays like this:

Imagine a row of numbered boxes, right next to each other in memory.

  • Contiguous: The boxes are physically adjacent. This is crucial for performance.
  • Indexed: You can instantly jump to any box if you know its number (index). This is O(1) access.
  • Homogeneous (usually): Typically, all boxes hold the same type of item (like all integers or all strings), though Python lists offer more flexibility.

Why Arrays Are Important (Especially for Interviews):

  1. Efficiency: Direct access by index is super fast (O(1)).
  2. Memory Locality: Because elements are stored together, accessing sequential elements is often cache-friendly, leading to good practical performance.
  3. Building Blocks: Many other data structures (like hash maps, heaps, stacks, queues) are often implemented using arrays underneath.
  4. In-Place Operations: Interviewers love problems where you modify the array directly without using significant extra memory (O(1) space). This often involves clever use of pointers or swapping elements.

Array Boot Camp: Reordering Array Entries (Page 1)

The boot camp problem is a fantastic introduction to in-place array manipulation using multiple pointers.

The Problem:

Given an array of integers, rearrange it so that all the even numbers appear before all the odd numbers.
The order within the evens or odds doesn’t matter.
Crucially, do this in-place, meaning using only O(1) extra space (you can’t just create two new lists and combine them).

Example: Input [3, 1, 2, 4, 5, 6] could become [2, 4, 6, 1, 5, 3] or [6, 2, 4, 3, 1, 5], etc.

The Boot Camp Code Explained (even_odd):

def even_odd(A):
  next_even, next_odd = 0, len(A) - 1 # Pointers start at opposite ends
  while next_even < next_odd: # Stop when pointers meet or cross
    if A[next_even] % 2 == 0:
      # A[next_even] is even. It's in the correct partition.
      # Leave it there and advance the 'even' boundary.
      next_even += 1
    else:
      # A[next_even] is odd. It needs to go to the 'odd' section.
      # Swap it with the element at A[next_odd].
      A[next_even], A[next_odd] = A[next_odd], A[next_even]
      # The element now at A[next_odd] is the odd value just moved there,
      # so shrink the 'odd' boundary.
      next_odd -= 1
  # No return needed as A is modified in-place

Intuition (Partitioning):

Think of the array being divided into three sections as you process it:

  • A[0 ... next_even - 1]: Even Numbers (Correctly placed)
  • A[next_even ... next_odd]: Unclassified Numbers (Yet to be examined)
  • A[next_odd + 1 ... len(A) - 1]: Odd Numbers (Correctly placed)

The while loop processes the Unclassified section.

  • The next_even pointer expands the Even section.
  • The next_odd pointer shrinks the Unclassified section from the right (effectively expanding the Odd section).

When next_even finds an odd number, it swaps it with A[next_odd], placing the odd number in the correct final partition, and then shrinks the next_odd boundary. The element that gets swapped into A[next_even] still needs to be checked in the next iteration of the loop, which is why next_even doesn’t advance in the else block.

Complexity:

  • Time: O(n). Each element is examined at most once by next_even, and involved in at most one swap.
  • Space: O(1). We only use a couple of pointer variables and potentially one temporary variable for the swap (though Python handles the tuple swap elegantly without explicit temp).

Key Takeaway: This boot camp demonstrates a powerful pattern: using two (or sometimes more) pointers to partition an array in-place based on some property.

Top Tips for Arrays (Page 2)

This section lists crucial strategies for array problems:

  1. Use the array itself for space: Cleverly reuse parts of the array or use indices/values to store information, avoiding extra lists/dicts.
  2. Fill from the back: When merging or writing results, starting from the end of the allocated space can avoid overwriting needed data and reduce shifting (e.g., merging two sorted arrays into one).
  3. Overwrite, don’t delete: If you need to remove elements and order doesn’t matter for the “removed” part, just overwrite them with elements you want to keep. Much faster than shifting everything left.
  4. Process integer digits: For arrays representing large numbers, process from the LSB (rightmost/last element) for arithmetic. Or reverse the array first if processing LSB-first simplifies logic.
  5. Subarrays: Be comfortable working with slices or pointer ranges (A[start:end]).
  6. Off-by-one errors: Double-check loop conditions (< vs <=), index calculations (i+1, i-1), and empty array edge cases.
  7. Don’t preserve properties prematurely: It’s often fine to temporarily violate sortedness or other properties during intermediate steps if you restore them by the end.
  8. Arrays as direct maps: If values are in a known, limited range (like 0-99 or ASCII), use a boolean or count array indexed by the value itself (e.g., seen[value] = True).
  9. Parallel logic for 2D arrays: Often, the logic for rows (iterating i) and columns (iterating j) is very similar; exploit this symmetry.
  10. Simulation vs. Formula: For generating sequences (like spiral order), sometimes writing code that directly simulates the generation process step-by-step is easier and less error-prone than deriving a complex mathematical formula for the k-th element.

Know Your Array Libraries (Python list) (Page 2-3)

Python’s list is your go-to array. It’s dynamic (grows/shrinks) and versatile.

Instantiation:

  • my_list = []
  • nums = [1, 6, 3]
  • zeros = [0] * 10
  • copy_list = list(original_list)
  • List comprehensions are powerful: squares = [x*x for x in range(10)]

Basic Ops:

  • len(A)
  • A.append(val)
  • A.insert(idx, val)
  • A.remove(val) (removes first occurrence)
  • val in A (O(n) search)

2D Arrays:

  • matrix = [[0] * cols for _ in range(rows)]
  • BEWARE: matrix = [[0] * cols] * rows makes every row a reference to the same single inner list - modifying one row modifies all!
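A quick demonstration of the pitfall (a minimal sketch; variable names are arbitrary):

```python
rows, cols = 3, 2

bad = [[0] * cols] * rows        # three references to ONE inner list
bad[0][0] = 99
# Every "row" reflects the change: [[99, 0], [99, 0], [99, 0]]

good = [[0] * cols for _ in range(rows)]  # a fresh inner list per row
good[0][0] = 99
# Only the first row changes: [[99, 0], [0, 0], [0, 0]]
```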

Copying:

  • B = A: B is just another reference to A. Changes to B affect A.
  • B = list(A) or B = A[:] or B = copy.copy(A): Shallow copy. Creates a new list B, but elements inside B are still references to the same objects as in A. If A contains mutable objects (like other lists), changing them via A will reflect in B, and vice-versa.
  • B = copy.deepcopy(A): Deep copy. Creates a fully independent copy, including copies of any nested objects. Use this when you need total separation.

Key Methods:

  • min(A), max(A)
  • bisect module (bisect_left, bisect_right, insort): For efficient searching/insertion in sorted lists (O(log n) search, O(n) insertion).
  • A.reverse(): Reverses in-place.
  • reversed(A): Returns an iterator for reverse order (doesn’t modify A, doesn’t create a full new list immediately).
  • A.sort(): Sorts in-place.
  • sorted(A): Returns a new sorted list (doesn’t modify A).
  • del A[i]: Deletes element at index i.
  • del A[i:j]: Deletes slice from i up to (not including) j.

Slicing (A[start:stop:step]): Super powerful!

  • A[i:j] gives subarray.
  • A[:j] from start up to j.
  • A[i:] from i to end.
  • A[:] shallow copy.
  • A[::-1] reverses the list (returns a new reversed list).
  • A[k:] + A[:k] rotates left by k.

List Comprehensions: [expression for item in iterable if condition]

  • Concise way to build lists.
  • Often clearer than map/filter.
  • Can nest: [val for row in matrix for val in row] flattens 2D list.
  • Avoid excessive nesting (more than 2 levels often becomes hard to read).

Section 5.1: The Dutch National Flag Problem (Pages 3-6)

This is a classic partitioning problem, essential for understanding quicksort and in-place array manipulation.

The Problem: Given an array A and an index i (whose value A[i] is the “pivot”), rearrange A such that:

  1. All elements less than the pivot come first.
  2. Then, all elements equal to the pivot.
  3. Finally, all elements greater than the pivot.

Do this in-place (O(1) space). The relative order within the <, ==, > groups doesn’t matter.

Example: A = [0, 1, 2, 0, 2, 1, 1], pivot_index = 3 (pivot value = 0). Since no element is less than 0, any arrangement that places both 0s first is valid, e.g. [0, 0, 1, 2, 2, 1, 1] or [0, 0, 2, 1, 1, 2, 1].

Approach 1: Two Passes (Less efficient, conceptually simpler)

  1. First Pass (Less Than): Iterate through A. Keep a pointer smaller (initially 0). If A[current_index] < pivot, swap A[current_index] with A[smaller] and increment smaller. This brings all smaller elements to the front.
  2. Second Pass (Greater Than): Iterate backwards through A. Keep a pointer larger (initially n-1). If A[current_index] > pivot, swap A[current_index] with A[larger] and decrement larger. This pushes all larger elements to the end.

After these two passes, elements equal to the pivot will naturally be left in the middle.

  • Time: O(n) + O(n) = O(n).
  • Space: O(1).
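The two passes described above can be sketched in Python (the function name is an assumption, not the book's code):

```python
def dutch_flag_partition_two_pass(pivot_index, A):
    pivot = A[pivot_index]  # capture the value first; the element may move
    # First pass: move all elements smaller than the pivot to the front.
    smaller = 0
    for i in range(len(A)):
        if A[i] < pivot:
            A[i], A[smaller] = A[smaller], A[i]
            smaller += 1
    # Second pass: move all elements larger than the pivot to the back.
    larger = len(A) - 1
    for i in reversed(range(len(A))):
        if A[i] > pivot:
            A[i], A[larger] = A[larger], A[i]
            larger -= 1
```

On the example above, `dutch_flag_partition_two_pass(3, [0, 1, 2, 0, 2, 1, 1])` leaves the array as `[0, 0, 1, 2, 2, 1, 1]`.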

Approach 2: Single Pass (The Standard, Efficient Solution)

This is the implementation shown in the book, often preferred in interviews.

Intuition: Divide the array into four regions using three pointers:

  • A[0 ... smaller - 1]: Bottom (Known to be < pivot)
  • A[smaller ... equal - 1]: Middle (Known to be == pivot)
  • A[equal ... larger - 1]: Unclassified (Elements we haven’t processed yet)
  • A[larger ... n - 1]: Top (Known to be > pivot)

Pointers:

  • smaller: Boundary between Bottom and Middle.
  • equal: Boundary between Middle and Unclassified (the current element being processed).
  • larger: Boundary between Unclassified and Top.

Algorithm:

  1. Initialize smaller = 0, equal = 0, larger = len(A).
  2. Get pivot = A[pivot_index].
  3. Loop while equal < larger: (While there are unclassified elements)
    • Examine A[equal].
    • Case 1: A[equal] < pivot
      • Swap A[equal] with A[smaller].
      • Increment smaller.
      • Increment equal.
    • Case 2: A[equal] == pivot
      • Increment equal. (Element is in the right place relative to smaller).
    • Case 3: A[equal] > pivot
      • Decrement larger. (Shrink unclassified region from the right).
      • Swap A[equal] with A[larger].
      • Crucially: Do NOT increment equal here. The element just swapped into A[equal] came from the unprocessed larger region and still needs to be classified in the next loop iteration.
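The single-pass algorithm above, sketched in Python (function name assumed; this mirrors the standard three-pointer formulation):

```python
def dutch_flag_partition(pivot_index, A):
    pivot = A[pivot_index]
    # Invariants: A[:smaller] < pivot, A[smaller:equal] == pivot,
    # A[equal:larger] unclassified, A[larger:] > pivot.
    smaller, equal, larger = 0, 0, len(A)
    while equal < larger:
        if A[equal] < pivot:
            A[smaller], A[equal] = A[equal], A[smaller]
            smaller, equal = smaller + 1, equal + 1
        elif A[equal] == pivot:
            equal += 1
        else:  # A[equal] > pivot
            larger -= 1
            A[equal], A[larger] = A[larger], A[equal]
            # Do NOT advance equal: the swapped-in element is unclassified.
```

Time is O(n) (each iteration either advances `equal` or shrinks `larger`), space is O(1).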

Let’s revisit the requirements:

  • Input: An array A (e.g., [0, 1, 2, 0, 2, 1, 1]) and an index i (e.g., pivot_index = 3).
  • Pivot Value: The value at that index, pivot = A[pivot_index] (so pivot = 0 in our example).
  • Goal: Rearrange A so it looks like this: [ <pivot | ==pivot | >pivot ] Specifically:
    • [ All elements LESS THAN pivot ] - These come first.
    • [ All elements EQUAL TO pivot ] - These come next.
    • [ All elements GREATER THAN pivot ] - These come last.

Why three categories? This is extremely useful as a subroutine in Quicksort. If an array has many duplicate elements equal to the pivot, standard Quicksort (which partitions into just < pivot and >= pivot) can perform poorly (O(n^2)). By putting all elements equal to the pivot into their own group in the middle, you exclude them from the recursive calls on the < pivot and > pivot subarrays. This makes Quicksort much more efficient (O(n log n) on average, O(n) if all elements are equal) when duplicates are present.

“Do this in-place (O(1) space)”: This means you cannot create new helper arrays to store the <, ==, and > elements temporarily. You must achieve the final arrangement by only swapping elements within the original array A, using only a constant amount of extra memory (like variables for the pivot value and your pointers smaller, equal, larger). This is the main challenge and why the pointer techniques are important.

“The relative order within the <, ==, > groups doesn’t matter”: This is a relaxation that makes the problem easier. Look at the example A = [0, 1, 2, 0, 2, 1, 1] with pivot = 0.

  • The < pivot group is empty.
  • The == pivot group contains [0, 0].
  • The > pivot group contains [1, 2, 2, 1, 1].

A valid final arrangement could be [0, 0, | 1, 2, 2, 1, 1]. Another could be [0, 0, | 2, 1, 1, 2, 1]. As long as all the 0s come first, followed by all the numbers greater than 0, it’s correct. We don’t need to keep the original relative order of the 1s and 2s within the > pivot group. This freedom allows us to perform swaps more easily. If we had to maintain relative order (a “stable partition”), the O(1) space solution would be much harder.

In essence: The Dutch Flag problem is just a more detailed partitioning than even_odd. Instead of two bins, you have three (<, ==, >). The challenge lies in achieving this three-way split efficiently using only the existing array.


Section 5.2: Increment an Arbitrary-Precision Integer (Page 7 / PDF Page 43)

The Concept: Standard integer types in many languages (like int in C++ or Java) have a fixed size (e.g., 32 or 64 bits) and thus a maximum value. Python handles this automatically with its arbitrary-precision integers, but interview problems often ask you to simulate this behavior, typically by storing the digits of a very large number in an array (or list in Python).

Representation: An array A represents a non-negative integer D. Each element A[i] holds a single digit. The most significant digit is usually at A[0].

  • Example: D = 129 is represented as A = [1, 2, 9].
  • Example: D = 99 is represented as A = [9, 9].

The Problem: Write a function that takes such an array A representing integer D and updates it in-place to represent D + 1. Handle potential carries, including the case where the number of digits increases (like 99 + 1 = 100).

Brute-Force (and why it’s often not allowed/intended):

  1. Convert the array [1, 2, 9] into the integer 129.
  2. Add 1: 129 + 1 = 130.
  3. Convert 130 back into an array [1, 3, 0].

Limitation: This fails if the integer D is larger than the maximum value the language’s built-in integer type can hold (this isn’t an issue for Python’s runtime integers, but the problem setup often simulates fixed-precision constraints or asks you to avoid this conversion). It also doesn’t modify the array in-place directly.

The Grade-School Algorithm Approach: Think about how you add 1 to a number on paper, like 129 + 1:

  1. Start from the rightmost digit (Least Significant Digit - LSB).
  2. Add 1 to the last digit: 9 + 1 = 10.
  3. Write down the 0, carry-over the 1.
  4. Move to the next digit to the left (2). Add the carry: 2 + 1 = 3.
  5. Write down the 3. No carry-over this time (carry is 0).
  6. Move to the next digit to the left (1). Add the carry: 1 + 0 = 1.
  7. Write down the 1. No carry-over.
  8. No more digits, result is 130.

Now consider 99 + 1:

  1. Start rightmost: 9 + 1 = 10. Write 0, carry 1.
  2. Next digit: 9 + 1 (carry) = 10. Write 0, carry 1.
  3. No more digits, but we still have a carry. This means the result needs an extra digit. The result is 100.

Simulating on the Array:

  1. We operate directly on the array A.
  2. Increment the last element: A[n-1] += 1.
  3. Iterate from right-to-left (from n-1 down to 1).
    • If A[i] == 10 (meaning we had a carry into this position):
      • Set A[i] = 0.
      • Increment the digit to the left: A[i-1] += 1.
    • If A[i] is not 10 after adding the potential carry, then the carry propagation stops, and we can break the loop early.
  4. Special Case: After the loop, check if the first digit A[0] became 10. If it did, it means we had a carry out of the most significant digit.
    • Set A[0] = 1.
    • Append a 0 to the end of the array to hold the new low-order digit; combined with setting A[0] = 1, this turns the overflowed [10, 0] into [1, 0, 0].
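The steps above can be sketched as (function name `plus_one` assumed):

```python
def plus_one(A):
    A[-1] += 1
    for i in reversed(range(1, len(A))):
        if A[i] != 10:
            break  # no carry into this position; we're done
        A[i] = 0
        A[i - 1] += 1
    if A[0] == 10:
        # Carry out of the most significant digit, e.g. 99 + 1 = 100:
        # reuse A[0] as the leading 1 and append a 0 for the new digit.
        A[0] = 1
        A.append(0)
    return A
```

For example, `plus_one([1, 2, 9])` gives `[1, 3, 0]` and `plus_one([9, 9])` gives `[1, 0, 0]`. Time is O(n), extra space O(1) (apart from the possible single append).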

Section 5.3: Multiply Two Arbitrary-Precision Integers (Page 7-8 / PDF Pages 43-44)

The Problem: Now, instead of adding 1, multiply two non-negative integers num1 and num2, represented as arrays of digits. Return their product, also as an array of digits. Handle potential negative numbers indicated by a negative sign in the first element (e.g., [-7, 6, 1] represents -761).

Example: [1, 2, 3] (123) * [9, 8, 7] (987) = [1, 2, 1, 4, 0, 1] (121401)

Again, Brute-Force Conversion Fails: Converting to built-in integers and multiplying won’t work for numbers exceeding the language’s fixed precision limits.

Grade-School Multiplication Algorithm: Think about 123 * 987:

123
x 987
------
861   (= 7 * 123)
984   (= 8 * 123, shifted left 1 place)
1107  (= 9 * 123, shifted left 2 places)
------
121401 (Sum of the above)

Simulating on Arrays:

  1. Handle Sign:

    • Determine the sign of the result (-1 if one input is negative, 1 otherwise).
    • Make both input arrays positive for the multiplication step (e.g., using abs() on the first element).
  2. Result Array Size:

    • The product of an n-digit number and an m-digit number can have at most n + m digits.
    • Create a result array of size n + m, initialized to zeros.
  3. Core Multiplication Loop: Iterate through num1 from right-to-left (index i) and num2 from right-to-left (index j).

    • For each pair num1[i] and num2[j], calculate the product num1[i] * num2[j].
    • This product contributes to the position result[i + j + 1] (think about the place values: 10^a * 10^b = 10^(a+b)).
    • Add this product to result[i + j + 1].
    • Handle the carry: The carry from this addition goes into result[i + j].
      • result[i + j] += result[i + j + 1] // 10.
    • Keep the digit:
      • result[i + j + 1] %= 10.
    • This directly adds the partial products into the correct positions in the result array, simulating the addition step of grade-school multiplication implicitly.
  4. Remove Leading Zeros:

    • The result array might have leading zeros (e.g., if multiplying small numbers like [1]*[1]).
    • Find the first non-zero digit and return the slice from that point onwards.
    • Handle the case where the result is 0 (return [0]).
  5. Apply Sign:

    • If the overall sign calculated in step 1 was negative, negate the first element of the final result array.
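A Python sketch of the five steps above (the function name and the exact leading-zero handling are assumptions; note it temporarily rewrites the inputs' first elements to be non-negative):

```python
def multiply(num1, num2):
    # Step 1: determine the sign, then work with non-negative digits.
    sign = -1 if (num1[0] < 0) ^ (num2[0] < 0) else 1
    num1[0], num2[0] = abs(num1[0]), abs(num2[0])

    # Step 2: an n-digit times m-digit product has at most n + m digits.
    result = [0] * (len(num1) + len(num2))

    # Step 3: grade-school multiplication, right-to-left, with carries.
    for i in reversed(range(len(num1))):
        for j in reversed(range(len(num2))):
            result[i + j + 1] += num1[i] * num2[j]
            result[i + j] += result[i + j + 1] // 10  # propagate carry
            result[i + j + 1] %= 10                   # keep single digit

    # Step 4: strip leading zeros, keeping a single 0 for a zero product.
    first_nonzero = next((i for i, x in enumerate(result) if x != 0),
                         len(result) - 1)
    result = result[first_nonzero:]

    # Step 5: apply the sign to the leading digit.
    result[0] *= sign
    return result
```

For example, `multiply([1, 2, 3], [9, 8, 7])` gives `[1, 2, 1, 4, 0, 1]`, matching 123 * 987 = 121401.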

Complexity:

  • Time: O(n * m), where n and m are the number of digits in the two input numbers. This comes from the nested loops iterating through all pairs of digits.
  • Space: O(n + m) for the result array.

Section 5.4: Advancing Through an Array (Page 8 / PDF Page 44)

The Problem: Imagine a board game where you’re on a sequence of positions represented by an array A. Each A[i] contains a non-negative integer telling you the maximum number of steps you can take forward from position i. You start at index 0. Can you reach the last index of the array?

Example 1: A = [3, 3, 1, 0, 2, 0, 1]

  • Start at index 0 (A[0]=3). Can move 1, 2, or 3 steps.
  • Option 1: Move 1 step to index 1 (A[1]=3). From index 1, can move 1, 2, or 3 steps.
  • Let’s move 3 steps to index 4 (A[4]=2). From index 4, can move 1 or 2 steps.
  • Let’s move 2 steps to index 6 (A[6]=1). Index 6 is the last index.
  • Yes, we can reach the end.

Example 2: A = [3, 2, 0, 0, 2, 0, 1]

  • Start at index 0 (A[0]=3). Can move 1, 2, or 3 steps.
    • If we move 1 step -> index 1 (A[1]=2). Can reach index 2 or 3.
    • If we move 2 steps -> index 2 (A[2]=0). Stuck.
    • If we move 3 steps -> index 3 (A[3]=0). Stuck.
  • From index 1, max reach is 1+2=3. Both indices 2 and 3 have 0s. We cannot get past index 3.
  • No, we cannot reach the end.

Initial Thoughts & Why Simple Greedy Fails: You might think: “Always take the maximum jump possible from the current position.” This doesn’t work. Consider A = [2, 4, 1, 1, 0, 2, 3]. The max jump from index 0 takes you to index 2 (A[2]=1); from there the best you can do is crawl to index 3 (A[3]=1) and then index 4 (A[4]=0), where you’re stuck. But taking just 1 step from index 0 to index 1 (A[1]=4) lets you jump much further and reach the end.

The Efficient Approach: Track Maximum Reach The key idea is to iterate through the array and keep track of the furthest index we know we can possibly reach so far.

Algorithm:

  1. Initialize max_reach = 0. This variable will store the furthest index reachable from the positions we’ve visited.
  2. Iterate through the array with an index i from 0 up to len(A) - 1.
  3. Crucial Check: In each iteration i, first check if i > max_reach.
    • If i is greater than max_reach, it means we could never have reached the current index i from any previous position. Therefore, we definitely cannot reach the end. Return False.
  4. Update the maximum reach: Calculate the furthest we could jump from the current position i. This is i + A[i].
    • Update max_reach = max(max_reach, i + A[i]).
  5. Check for early success: If at any point max_reach becomes greater than or equal to the last index (len(A) - 1), we know we can reach the end. Return True.
  6. If the loop completes without returning False, one final check decides the answer: return whether max_reach >= len(A) - 1. A common formulation folds everything into the loop condition (loop while i <= max_reach and max_reach < last_index, then return max_reach >= last_index); that is the form traced in the walkthroughs below.

Note: The book’s code slightly reformulates the loop condition and final check, but the core logic of tracking max_reach is the same.

Intuition Walkthrough (Example 1: A = [3, 3, 1, 0, 2, 0, 1])

  • i=0: max_reach=0. i <= max_reach is true. max_reach = max(0, 0+3) = 3. i becomes 1.
  • i=1: max_reach=3. i <= max_reach is true. max_reach = max(3, 1+3) = 4. i becomes 2.
  • i=2: max_reach=4. i <= max_reach is true. max_reach = max(4, 2+1) = 4. i becomes 3.
  • i=3: max_reach=4. i <= max_reach is true. max_reach = max(4, 3+0) = 4. i becomes 4.
  • i=4: max_reach=4. i <= max_reach is true. max_reach = max(4, 4+2) = 6. i becomes 5.
  • Loop condition: i <= max_reach (5 <= 6) is true, but max_reach < last_index (6 < 6) is false. Loop terminates.
  • Return max_reach >= last_index (6 >= 6) -> True.

Intuition Walkthrough (Example 2: A = [3, 2, 0, 0, 2, 0, 1])

  • i=0: max_reach=0. i <= max_reach is true. max_reach = max(0, 0+3) = 3. i becomes 1.
  • i=1: max_reach=3. i <= max_reach is true. max_reach = max(3, 1+2) = 3. i becomes 2.
  • i=2: max_reach=3. i <= max_reach is true. max_reach = max(3, 2+0) = 3. i becomes 3.
  • i=3: max_reach=3. i <= max_reach is true. max_reach = max(3, 3+0) = 3. i becomes 4.
  • i=4: max_reach=3. i <= max_reach (4 <= 3) is false. Loop terminates earlier.
  • Return max_reach >= last_index (3 >= 6) -> False.

Analogy: Hopping Stones Across a River

Imagine the array indices 0, 1, 2, ... n-1 are stones lined up across a river.

  • You start at stone 0.
  • The number A[i] on each stone tells you the maximum length of a single jump you can take from that stone. So, from stone i, you can land on any stone j where i < j <= i + A[i].
  • Goal: Can you reach the last stone (stone n-1)?

Why simple jumping doesn’t work: If you just stand on stone i and decide which jump to take next, you might make a bad choice. A shorter jump might land you on a stone with a much bigger A[j] value that lets you cross the river, while the longest jump from i might land you somewhere useless.

The max_reach Idea: Tracking Your Potential

Why this “tracking the potential” works:

  • It’s safe: We only advance our current position i if it’s within the known max_reach. If we ever encounter an i that’s beyond max_reach, it means no combination of previous jumps could have gotten us there.
  • It’s sufficient: We don’t need to know the exact path. We only need to know if the potential to reach the end exists. By always updating max_reach to the maximum possible distance (max(current_max_reach, i + A[i])), we are essentially keeping track of the “frontier” of reachable stones. If that frontier ever touches or crosses the last stone, we know a path must exist.

Think of it like this: max_reach is your “optimistic outlook”. At each step i you can actually reach, you check if the jump from i gives you a more optimistic outlook than you already had. If your current position i ever falls behind your optimistic outlook (i > max_reach), then your optimism was unfounded, and you’re stuck.

Instead of deciding which specific jump to take, we focus on a simpler question: “What is the absolute furthest stone I could possibly reach, given the stones I’ve visited so far?”

Complexity:

  • Time: O(n). We iterate through the array at most once.
  • Space: O(1). We only use a few variables (max_reach, i, last_index).

Variant: Minimum Steps

The variant asks for the minimum number of steps to reach the end. This smells like a Breadth-First Search (BFS).

Think of array indices as nodes in a graph. There’s an edge from i to j if j is reachable from i (i.e., i < j <= i + A[i]). You want the shortest path from node 0 to node n-1.

A standard BFS exploring reachable indices level by level will find the minimum number of jumps. You’d need a way to keep track of visited indices to avoid cycles/redundant work and a queue to manage the BFS levels. The state in the queue could be (index, num_jumps). This would likely take O(n) time and O(n) space (for the queue and visited set/array).
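A straightforward BFS sketch under these assumptions (the name min_jumps and the convention of returning -1 for an unreachable end are mine; the naive neighbor scan means this is not tuned for strictly linear time, though each index is enqueued at most once):

```python
from collections import deque

def min_jumps(A):
    # BFS over indices: each queue level corresponds to one more jump.
    n = len(A)
    if n <= 1:
        return 0
    visited = [False] * n
    visited[0] = True
    queue = deque([(0, 0)])  # (index, num_jumps)
    while queue:
        index, jumps = queue.popleft()
        # From `index` we can land on any j with index < j <= index + A[index].
        for j in range(index + 1, min(index + A[index], n - 1) + 1):
            if j == n - 1:
                return jumps + 1
            if not visited[j]:
                visited[j] = True
                queue.append((j, jumps + 1))
    return -1  # the last index is unreachable
```

On Example 1, `min_jumps([3, 3, 1, 0, 2, 0, 1])` returns 3 (e.g. 0 -> 1 -> 4 -> 6).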


Section 5.5: Delete Duplicates from a Sorted Array (Page 9 / PDF Page 45)

The Problem: You are given a sorted array A. Your task is to remove duplicate elements so that each unique element appears only once. You should do this in-place, modifying the original array A. The elements that are kept should be shifted to the beginning of the array. You need to return the number of valid, unique elements remaining in the array. What’s stored in the array beyond the last valid element doesn’t matter.

Example:

  • Input: A = [2, 3, 5, 5, 7, 11, 11, 11, 13]
  • Output: The function should return 6.
  • After the function runs, the array A should be modified so its beginning looks like: [2, 3, 5, 7, 11, 13, ... ]. The values in the positions after index 5 are irrelevant (e.g., [2, 3, 5, 7, 11, 13, 11, 11, 13] would be a valid state for A).

Why “Sorted” is Key: Because the array is sorted, all duplicate elements are guaranteed to be grouped together consecutively. This makes identifying them much easier than if the array were unsorted.

Common Pitfalls:

  • Using O(n) space: It’s easy to iterate through A, store unique elements in a set or a new list, and then copy them back. But the problem often requires O(1) extra space (in-place modification).
  • Shifting elements repeatedly: You could find a duplicate A[i] == A[i-1], and then shift all elements from A[i+1] onwards one step to the left. This works, but if you have many duplicates (e.g., [2, 2, 2, 2, 3]), you end up doing a lot of shifting, leading to O(n^2) time complexity in the worst case.

The Efficient In-Place Approach: Two Pointers (Read & Write) This is a very common and useful pattern for modifying arrays in-place when you want to keep only certain elements.

The Strategy:

  1. Use one pointer (read_index or just the loop index i in the code below) to iterate through the original array from the second element onwards.
  2. Use a second pointer (write_index) to keep track of the next position in the array where a unique element should be written. Initialize write_index = 1 (since the very first element A[0] is always unique by definition, assuming the array isn’t empty).
  3. Compare the element being read (A[i]) with the last unique element written (A[write_index - 1]).
    • If A[i] is different from A[write_index - 1], it means we’ve found a new unique element.
      • Copy this unique element A[i] to the next available write position: A[write_index] = A[i].
      • Advance the write pointer: write_index += 1.
    • If A[i] is the same as A[write_index - 1], it’s a duplicate. We simply ignore it and move the read pointer (i) forward, without advancing write_index. This effectively overwrites duplicates later when a new unique element is found.
  4. After iterating through the entire array, write_index will hold the count of unique elements, and the subarray A[0 ... write_index - 1] will contain those unique elements.
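The read/write-pointer strategy above, sketched in Python (function name assumed):

```python
def delete_duplicates(A):
    if not A:
        return 0
    write_index = 1  # A[0] is always kept
    for i in range(1, len(A)):
        # A new unique element: write it and advance the write pointer.
        if A[i] != A[write_index - 1]:
            A[write_index] = A[i]
            write_index += 1
        # Otherwise A[i] duplicates the last kept element; skip it.
    return write_index
```

On the example, `delete_duplicates([2, 3, 5, 5, 7, 11, 11, 11, 13])` returns 6 and the first six slots become `[2, 3, 5, 7, 11, 13]`. O(n) time, O(1) space.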

Section 5.6: Buy and Sell a Stock Once (Page 10 / PDF Page 46)

The Problem: You’re given an array prices where prices[i] is the price of a given stock on day i. You want to find the maximum profit you can achieve by buying the stock on one day and selling it on a later day. If no profit can be made (prices always decrease), the maximum profit is 0. You must buy before you sell.

Example: prices = [310, 315, 275, 295, 260, 270, 290, 230, 255, 250]

  • Buy at 260 (day 4, index 4), Sell at 290 (day 6, index 6). Profit = 290 - 260 = 30.
  • This is the maximum possible profit. Note: minimum price is 230, max is 315, but 315 occurs before 230, so you can’t buy at 230 and sell at 315.

Brute-Force: Try every possible pair of (buy_day, sell_day) where sell_day > buy_day. Calculate the profit for each pair and keep track of the maximum.

  • Use nested loops: outer loop for buy_day from 0 to n-2, inner loop for sell_day from buy_day + 1 to n-1.
  • Time Complexity: O(n^2).
  • Space Complexity: O(1).
  • (Too slow for large inputs).

Divide and Conquer (Mentioned in Book Introduction, not here): Split the array in half, find the max profit in the left half, find the max profit in the right half, and find the max profit that crosses the midpoint (buy in left, sell in right). The crossing part requires finding the min price in the left and max price in the right. This leads to O(n log n) time. Better, but we can do even better.

The Efficient Linear Scan Approach:

Key Idea: As you iterate through the prices day by day, what information do you need to know from the past to decide the best potential profit if you sell today? You need to know the minimum price encountered before today.

Algorithm:

  1. Initialize min_price_so_far to a very large value (or prices[0] if the array is not empty).
  2. Initialize max_profit to 0.0.
  3. Iterate through the prices array, starting from the first price. Let the current price be price.
    • Calculate potential profit if selling today: potential_profit = price - min_price_so_far.
    • Update maximum profit: max_profit = max(max_profit, potential_profit).
    • Update minimum price seen: min_price_so_far = min(min_price_so_far, price).
  4. After iterating through all prices, max_profit will hold the maximum profit achievable.
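The algorithm above as a Python sketch (function name assumed):

```python
def buy_and_sell_stock_once(prices):
    min_price_so_far, max_profit = float('inf'), 0
    for price in prices:
        # Best profit if we sell today at `price`.
        max_profit = max(max_profit, price - min_price_so_far)
        # Lowest buy price seen up to and including today.
        min_price_so_far = min(min_price_so_far, price)
    return max_profit
```

On the example prices, this returns 30 (buy at 260, sell at 290).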

Intuition Walkthrough (prices = [310, 315, 275, 295, 260, 270, 290, 230, 255, 250]):

| price | min_price_so_far (before update) | profit_sell_today | max_profit (after update) | min_price_so_far (after update) |
|-------|----------------------------------|-------------------|---------------------------|---------------------------------|
| 310   | inf                              | -inf              | 0                         | 310                             |
| 315   | 310                              | 5                 | 5                         | 310                             |
| 275   | 310                              | -35               | 5                         | 275                             |
| 295   | 275                              | 20                | 20                        | 275                             |
| 260   | 275                              | -15               | 20                        | 260                             |
| 270   | 260                              | 10                | 20                        | 260                             |
| 290   | 260                              | 30                | 30                        | 260                             |
| 230   | 260                              | -30               | 30                        | 230                             |
| 255   | 230                              | 25                | 30                        | 230                             |
| 250   | 230                              | 20                | 30                        | 230                             |

Final max_profit = 30.

Complexity:

  • Time: O(n). We iterate through the array exactly once.
  • Space: O(1). We only use two variables (min_price_so_far, max_profit).

Variant: Longest Equal Subarray

Find the length of the longest contiguous subarray where all entries are equal.

Iterate through the array, keeping track of the current_length of the equal sequence and the max_length found so far.

  • If A[i] == A[i-1], increment current_length.
  • If A[i] != A[i-1], reset current_length to 1.
  • Update max_length = max(max_length, current_length) in each iteration.

O(n) time, O(1) space.
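This variant can be sketched as (function name assumed):

```python
def longest_equal_subarray(A):
    max_length = current_length = 0
    for i in range(len(A)):
        if i > 0 and A[i] == A[i - 1]:
            current_length += 1  # the equal run continues
        else:
            current_length = 1   # a new run starts here
        max_length = max(max_length, current_length)
    return max_length
```

For example, `longest_equal_subarray([1, 2, 2, 3, 3, 3, 2])` returns 3.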



Section 5.7: Buy and Sell a Stock Twice (Page 11 / PDF Page 47)

The Problem: Same setup as before, but now you can buy and sell at most twice. The second buy must happen after the first sell. Find the maximum profit.

Example: prices = [12, 11, 13, 9, 12, 8, 14, 13, 15]

  • First Tx: Buy at 9, Sell at 12 (Profit 3)
  • Second Tx: Buy at 8, Sell at 15 (Profit 7)
  • Total Profit = 3 + 7 = 10. This is the maximum.

Brute-Force: Try all combinations of two disjoint buy-sell intervals. O(n^4) (choosing 4 days). Too slow.

Improved Brute-Force: Iterate through all possible split points i (day the first sale happens or second buy happens). Find the max profit for one transaction in prices[0...i] and the max profit for one transaction in prices[i...n-1]. Add them up. Keep track of the maximum sum.

  • Using the O(n) algorithm from Section 5.6 for each subarray takes O(n) time. Doing this for n possible split points gives O(n^2) total time. Space is O(1). Better, but still not optimal.

The Efficient Dynamic Programming Approach:

The Core Difficulty: How do you decide the four days (buy1, sell1, buy2, sell2) optimally? Trying all combinations is too slow (O(n^4)).

The Key Insight (Decomposition):

Imagine you complete your first transaction by selling on day i.

  • This means you bought on some day b1 <= i and sold on day i.
  • The best you could have done for this first transaction ending on day i is MaxProfit(prices[0...i]). Let’s call this Profit1_ending_at_i.

Now, you need to make your second transaction using only the days after day i.

  • This means you buy on some day b2 > i and sell on some day s2 >= b2.
  • The best you could do for this second transaction happening after day i is MaxProfit(prices[i+1...n-1]). Let’s call this Profit2_starting_after_i.

If we split the entire process at day i (meaning the first transaction finishes by day i, and the second starts after day i), the total profit for this specific split point i would be: TotalProfit(split at i) = Profit1_ending_at_i + Profit2_starting_after_i

The Problem: We don’t know the best day i to split the timeline. So, we need to calculate this TotalProfit(split at i) for every possible split day i (from i=0 to i=n-1) and take the maximum among them.

How to Calculate Efficiently?

Calculating Profit1_ending_at_i and Profit2_starting_after_i from scratch for every i is slow (O(n^2)). We need to precompute or compute iteratively.

Step 1: Calculate all Profit1_ending_at_i values.

  • Notice that Profit1_ending_at_i is exactly the maximum profit you can make with a single transaction within the range prices[0...i].
  • We already know how to calculate this efficiently! It’s the logic from the “Buy and Sell Once” problem (Section 5.6). We can run that algorithm once, and as we iterate through the prices array, we store the maximum profit found up to that point in a new array. Let’s call this array F (for Forward).
  • F[i] = Maximum profit from one transaction using prices from day 0 to day i.
  • This takes one pass, O(n) time, and O(n) space for the F array.

Step 2: Calculate all Profit2_starting_after_i values.

  • This is the maximum profit from one transaction using only prices from day i+1 to day n-1.
  • How can we calculate this efficiently for all i? We can run the “Buy and Sell Once” logic backwards!
  • Imagine iterating from the end of the prices array (n-1) down to 0. Keep track of the max_price_seen_so_far (during the backward scan). The best profit if you buy on day j (during this backward scan) is max_price_seen_so_far - prices[j].
  • Let B[j] be the maximum profit from one transaction using prices from day j to day n-1. We can compute all B[j] values in a single backward pass (O(n) time). We could store this in another array B.

Step 3: Combine the Results.

  • Now we have F[i] (best profit ending by day i) and B[j] (best profit starting from day j).
  • The total profit for a split right after day i is F[i] + B[i+1].
  • We need to calculate F[i] + B[i+1] for all i from 0 to n-2.
  • The final answer is the maximum value found in these sums.
  • We also need to consider the possibility that the best profit comes from only one transaction (e.g., if prices always go down after the first transaction). The best single-transaction profit over the whole period is simply F[n-1]. So the overall maximum is max(F[n-1], max(F[i] + B[i+1] for i in 0..n-2)).

The Book’s Space Optimization:

The book realizes we don’t actually need to store the entire B array from Step 2. We can combine Step 2 and Step 3.

  1. Calculate and store the F array (forward pass) as described in Step 1.
  2. Perform the backward pass (like Step 2). As you iterate backwards with index i (from n-1 down to 1):
    • Calculate the maximum profit for a single transaction starting at or after day i. Let’s call this profit_second_tx. (This is conceptually B[i]). You do this by keeping track of max_price_so_far encountered during the backward scan. profit_second_tx = max(profit_if_buy_at_i, profit_if_buy_later).
    • Immediately combine: Now that you know the best profit starting after day i-1 (which is profit_second_tx you just calculated for index i), you can add it to the best profit ending before day i (which is F[i-1], already calculated in the forward pass).
    • total_profit_for_split_at_i-1 = F[i-1] + profit_second_tx.
    • Keep track of the overall maximum total profit seen across all split points considered so far.
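Putting the stored forward pass and the on-the-fly backward pass together, the whole algorithm might look like this (a sketch; variable names follow the discussion above):

```python
def buy_and_sell_stock_twice(prices):
    max_total_profit, min_price_so_far = 0.0, float('inf')
    first_profits = [0.0] * len(prices)  # F[i]: best single tx in prices[0..i]
    # Forward pass: best single-transaction profit ending on or before day i.
    for i, price in enumerate(prices):
        min_price_so_far = min(min_price_so_far, price)
        max_total_profit = max(max_total_profit, price - min_price_so_far)
        first_profits[i] = max_total_profit
    # Backward pass: best second transaction starting on day i,
    # combined with the best first transaction ending on day i-1.
    max_price_so_far = float('-inf')
    for i in reversed(range(1, len(prices))):
        max_price_so_far = max(max_price_so_far, prices[i])
        max_total_profit = max(
            max_total_profit,
            max_price_so_far - prices[i] + first_profits[i - 1])
    return max_total_profit
```

On prices = [12, 11, 13, 9, 12, 8, 14, 13, 15] this returns 10, matching the example.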

Let’s retry the Intuition Walkthrough (prices = [12, 11, 13, 9, 12, 8, 14, 13, 15]):

  • Forward Pass: Compute F = [0, 0, 2, 2, 3, 3, 6, 6, 7].

    • max_total_profit initially is the best single transaction profit, so max_total_profit = F[8] = 7.
  • Backward Pass & Combine:

    • i = 8 (price prices[8]=15):
      • max_price (for range [8..8]) = 15.
      • profit_second_tx (for range [8..8]) is max(0, 15-15) = 0.
      • total = F[7](6) + 0 = 6.
      • max_total_profit = max(7, 6) = 7.
    • i = 7 (price prices[7]=13):
      • max_price (for range [7..8]) = max(15, 13) = 15.
      • profit_second_tx (for range [7..8]) is max(profit_starting_at_8=0, profit_buy_at_7=15-13=2) = 2.
      • total = F[6](6) + 2 = 8.
      • max_total_profit = max(7, 8) = 8.
    • i = 6 (price prices[6]=14):
      • max_price (for range [6..8]) = max(15, 14) = 15.
      • profit_second_tx (for range [6..8]) is max(profit_starting_at_7=2, profit_buy_at_6=15-14=1) = 2.
      • total = F[5](3) + 2 = 5.
      • max_total_profit = max(8, 5) = 8.
    • i = 5 (price prices[5]=8):
      • max_price (for range [5..8]) = max(15, 8) = 15.
      • profit_second_tx (for range [5..8]) is max(profit_starting_at_6=2, profit_buy_at_5=15-8=7) = 7.
      • total = F[4](3) + 7 = 10.
      • max_total_profit = max(8, 10) = 10.
    • … and so on.

The backward pass calculates the best profit for the second transaction on the fly and immediately combines it with the pre-calculated best profit for the first transaction ending just before it.

Think of the backward loop index i as defining the start day for the potential second transaction. profit_second_tx calculated at step i represents the best possible profit if your second transaction happens entirely within prices[i...n-1]. We add this to F[i-1] (best profit from first transaction ending before day i) to get the total profit for that specific way of splitting the two transactions. We do this for all possible start days i for the second transaction and find the overall maximum.


Section 5.8: Computing an Alternation (Page 12 / PDF Page 48)

The Problem: You are given an array A of numbers. Rearrange its elements in-place to create a new array (or modify A directly) such that it follows an alternating pattern: A[0] <= A[1] >= A[2] <= A[3] >= A[4] <= A[5] ... (Less-than-or-equal followed by greater-than-or-equal, repeating).

Example:

  • Input: A = [3, 1, 4, 1, 5, 9, 2, 6]
  • A valid output: [1, 4, 1, 5, 2, 9, 3, 6] (Check: 1<=4>=1<=5>=2<=9>=3<=6) - Note: 3<=6 is satisfied at the end.
  • Another valid output: [1, 3, 1, 4, 2, 6, 5, 9]

Approach 1: Sorting (Simple but Suboptimal)

  1. Sort the array A. Example: [1, 1, 2, 3, 4, 5, 6, 9]
  2. Swap adjacent pairs starting from the second element. Swap A[1] and A[2], then A[3] and A[4], then A[5] and A[6], etc.
    • Swap A[1](1) and A[2](2) -> [1, 2, 1, 3, 4, 5, 6, 9]
    • Swap A[3](3) and A[4](4) -> [1, 2, 1, 4, 3, 5, 6, 9]
    • Swap A[5](5) and A[6](6) -> [1, 2, 1, 4, 3, 6, 5, 9]
  3. Check the result: 1<=2>=1<=4>=3<=6>=5<=9. It works!
  • Complexity: Dominated by the sort step.
    • Time: O(n log n).
    • Space: O(1) if using an in-place sort, or O(n) otherwise.
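A sketch of the sort-then-swap approach (the function name is mine):

```python
def alternation_by_sorting(A):
    A.sort()
    # Swap adjacent pairs starting at index 1: (A[1],A[2]), (A[3],A[4]), ...
    for i in range(1, len(A) - 1, 2):
        A[i], A[i + 1] = A[i + 1], A[i]
    return A
```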

Approach 2: Median Finding (Mentioned, but not fully explored)

  1. Find the median element of the array A (using an O(n) median-of-medians algorithm, like in Problem 11.8).
  2. Partition the array around the median (like one step of Quicksort or using Dutch Flag) so elements smaller than the median are first, then elements equal to the median, then elements larger. This takes O(n) time.
  3. Interleave elements from the lower half and the upper half, similar to the swapping pattern in Approach 1.
  • Complexity: Dominated by median finding and partitioning.
    • Time: O(n).
    • Space: O(1).
  • This is theoretically optimal time, but the median-of-medians algorithm is complex to implement.

Approach 3: The Clever Local Swap (Efficient and Simple)

Key Insight: The required property (A[i-1] <= A[i] >= A[i+1] or A[i-1] >= A[i] <= A[i+1]) is very local. It only involves adjacent elements. We don’t necessarily need the global order provided by sorting or median finding.

The Goal: Arrange the array so it goes “up, down, up, down…” like this: Small <= Big >= Small <= Big >= Small ...

The Rule We Need:

  • At every even index i, we need A[i] <= A[i+1].
  • At every odd index i, we need A[i] >= A[i+1].

The Local Swap Algorithm:

We walk through the array, looking at pairs (A[i], A[i+1]). We force the local rule to be true for that pair.

  1. Look at (A[0], A[1]) (i=0, which is even):

    • Rule: We need A[0] <= A[1].
    • If it’s already true, great, do nothing.
    • If A[0] > A[1], swap them. Now A[0] <= A[1] is true for this pair.
  2. Look at (A[1], A[2]) (i=1, which is odd):

    • Rule: We need A[1] >= A[2].
    • If it’s true, do nothing.
    • If A[1] < A[2], swap them. Now A[1] >= A[2] is true for this pair.
  3. Look at (A[2], A[3]) (i=2, which is even):

    • Rule: We need A[2] <= A[3].
    • If A[2] > A[3], swap them. Now A[2] <= A[3] is true. …and so on.

Why Doesn’t Enforcing the Rule at Step i Break the Rule We Just Fixed at Step i-1?

This is the core confusion. Let’s trace carefully. We need to check the condition involving A[i] in both the step i-1 comparison and the step i comparison.

Consider index i. It’s involved in two checks:

  • Check at step i-1: Compares (A[i-1], A[i]).
  • Check at step i: Compares (A[i], A[i+1]).

Let’s analyze the transition at index i when we process step i.

Case 1: i is EVEN.

  • The rule we enforce at step i: We ensure A[i] <= A[i+1] (by swapping if needed).
  • What about the rule from step i-1? Step i-1 was ODD. The rule enforced there was A[i-1] >= A[i].
  • Does enforcing A[i] <= A[i+1] break A[i-1] >= A[i]?
    • If we didn’t swap at step i (because A[i] <= A[i+1] was already true): No problem, the previous condition A[i-1] >= A[i] still holds (we didn’t change A[i] or A[i-1]).
    • If we did swap at step i (because A[i] > A[i+1] initially): Let the original values be Old_A[i] and Old_A[i+1]. We know Old_A[i] > Old_A[i+1]. The new value at A[i] is New_A[i] = Old_A[i+1]. The condition from step i-1 was A[i-1] >= Old_A[i]. Since Old_A[i] > Old_A[i+1] = New_A[i], it follows that A[i-1] > New_A[i]. So, the required condition A[i-1] >= New_A[i] still holds! Swapping in a smaller value at A[i] cannot break the previous A[i-1] >= A[i] condition.

Case 2: i is ODD.

  • The rule we enforce at step i: We ensure A[i] >= A[i+1] (by swapping if needed).
  • What about the rule from step i-1? Step i-1 was EVEN. The rule enforced there was A[i-1] <= A[i].
  • Does enforcing A[i] >= A[i+1] break A[i-1] <= A[i]?
    • If we didn’t swap at step i (because A[i] >= A[i+1] was already true): No problem.
    • If we did swap at step i (because A[i] < A[i+1] initially): Let the original values be Old_A[i] and Old_A[i+1]. We know Old_A[i] < Old_A[i+1]. The new value at A[i] is New_A[i] = Old_A[i+1]. The condition from step i-1 was A[i-1] <= Old_A[i]. Since Old_A[i] < Old_A[i+1] = New_A[i], it follows that A[i-1] < New_A[i]. So, the required condition A[i-1] <= New_A[i] still holds! Swapping in a larger value at A[i] cannot break the previous A[i-1] <= A[i] condition.

Conclusion: Enforcing the rule for the pair (A[i], A[i+1]) preserves the rule that was already established for the pair (A[i-1], A[i]) in the previous step. Since we process the array from left to right, by the time we reach the end, all adjacent pairs satisfy their respective conditions (<= for even i, >= for odd i), fulfilling the overall alternation requirement.
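The local-swap pass condenses to very little code. A sketch, using a slice assignment to sort each adjacent pair in the required direction:

```python
def rearrange(A):
    for i in range(len(A) - 1):
        # Even i needs A[i] <= A[i+1]; odd i needs A[i] >= A[i+1].
        # Sorting the pair (descending at odd i) enforces exactly that rule.
        A[i:i + 2] = sorted(A[i:i + 2], reverse=bool(i % 2))
    return A
```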

Analogy: Smoothing Out Bumps

Imagine the numbers are heights. You want a profile like low <= high >= low <= high ...

You walk along the profile, looking at two adjacent points i and i+1.

  • If i should be a “low” point (i is even), but A[i] is higher than A[i+1], you swap them to push the high point to i+1. This makes A[i] lower, which doesn’t violate any preceding “high >= low” condition.
  • If i should be a “high” point (i is odd), but A[i] is lower than A[i+1], you swap them to pull the high point to i. This makes A[i] higher, which doesn’t violate any preceding “low <= high” condition.

Section 5.9: Enumerate All Primes to n (Page 13 / PDF Page 49)

The Problem: Given an integer n, return a list of all prime numbers between 1 and n (inclusive).

Prime Number Definition: A natural number p is prime if it is greater than 1 and has no divisors other than 1 and itself. (2, 3, 5, 7, 11, 13, 17, 19… are primes). 1 is not prime.

Example:

  • Input: n = 18
  • Output: [2, 3, 5, 7, 11, 13, 17]

Approach 1: Trial Division (Brute-Force)

  1. Iterate through each number i from 2 up to n.
  2. For each i, check if it’s prime.
    • How to check if i is prime? Try dividing i by every number d from 2 up to sqrt(i).
      • If you find any d that divides i evenly (i % d == 0), then i is not prime. Stop checking for this i and move to the next i.
      • If you check all d up to sqrt(i) and find no divisors, then i is prime.
  3. If i is found to be prime, add it to your results list.
  • Why only check up to sqrt(i)? If i has a divisor d larger than sqrt(i), then i = d * k. For this equation to hold, k must be smaller than sqrt(i). So, if i has any divisor other than 1 and itself, it must have at least one divisor less than or equal to its square root.

  • Complexity: Checking if a single number i is prime takes roughly O(sqrt(i)) time. Doing this for all numbers from 2 to n gives a total time complexity around O(n * sqrt(n)) = O(n^1.5). Space is O(p) where p is the number of primes found, roughly O(n/log n).
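Trial division as a sketch in Python (the function name is mine):

```python
import math

def primes_by_trial_division(n):
    primes = []
    for i in range(2, n + 1):
        # i is prime iff no d in [2, floor(sqrt(i))] divides it evenly.
        if all(i % d for d in range(2, math.isqrt(i) + 1)):
            primes.append(i)
    return primes
```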

Approach 2: Sieve of Eratosthenes (Much More Efficient)

This is the classic algorithm for finding all primes up to n. It avoids redundant checks by eliminating multiples.

The Idea:

  1. Start with a list (or boolean array) representing all numbers from 0 up to n. Initially, mark all numbers from 2 to n as potentially prime (e.g., True). Mark 0 and 1 as not prime (False).
  2. Find the first number p in the list (starting from 2) that is still marked as potentially prime.
  3. This number p must be prime (because if it had a smaller divisor, that divisor would have been found earlier, and p would have already been marked as not prime). Add p to your list of primes.
  4. Sieve: Mark all multiples of p (i.e., 2*p, 3*p, 4*p, …) up to n as not prime (False) in your boolean array. They cannot be prime because they have p as a divisor.
  5. Repeat from step 2: Find the next number in the list greater than p that is still marked as potentially prime, and repeat the sieving process.
  6. Continue until you have checked numbers up to n.

Sieve Optimization 1: When sieving multiples of prime p, you can start marking from p*p (p^2). Why? Any smaller multiple k*p where k < p would have already been marked as not prime when you processed the prime factors of k.

Sieve Optimization 2: You only need to find primes p up to sqrt(n) to perform the sieving. Any composite number c <= n must have a prime factor less than or equal to sqrt(n). If c hasn’t been marked off by the time you reach sqrt(n), it must be prime itself.

The Old Way (Trial Division):

  • Is 2 prime? Yes.
  • Is 3 prime? Yes.
  • Is 4 prime? No (divisible by 2).
  • Is 5 prime? Yes.
  • Is 6 prime? No (divisible by 2).
  • Is 7 prime? Yes.
  • Is 8 prime? No (divisible by 2).
  • Is 9 prime? No (divisible by 3). … This is slow because we check 9 for divisibility even though, once we knew 3 was prime, we could have crossed out all of its multiples in advance.

The Sieve Idea: Be Proactive!

Instead of checking each number individually, when we find a prime number, let’s immediately cross out all of its multiples because we know they cannot be prime.

Steps for n = 20:

  1. List all numbers: Write down all numbers from 2 up to 20. Imagine we have little flags (or boolean values) next to them, initially all set to “potentially prime” (True). [ (2,T), (3,T), (4,T), (5,T), (6,T), (7,T), (8,T), (9,T), (10,T), (11,T), (12,T), (13,T), (14,T), (15,T), (16,T), (17,T), (18,T), (19,T), (20,T) ]

  2. Find the first number p that is marked True: Start from the beginning. The first number marked True is p = 2.

    • Action: We know 2 is prime (it’s the first one we found). Add 2 to our prime list: Primes = [2].
    • Sieve Action: Now, cross out (mark as False) all multiples of 2 in our list (4, 6, 8, 10, 12, 14, 16, 18, 20).
    • List becomes: [ (2,T), (3,T), (4,F), (5,T), (6,F), (7,T), (8,F), (9,T), (10,F), (11,T), (12,F), (13,T), (14,F), (15,T), (16,F), (17,T), (18,F), (19,T), (20,F) ]
  3. Find the next number p (after the previous p=2) that is marked True: Looking through the list, the next True is at p = 3.

    • Action: We know 3 must be prime (because if it had a smaller prime factor, it would have been crossed out in a previous step - the only smaller prime was 2, and 3 isn’t a multiple of 2). Add 3 to our prime list: Primes = [2, 3].
    • Sieve Action: Cross out all multiples of 3 (starting from 3*2=6, though 6 is already False. Next is 3*3=9, then 3*4=12(F), 3*5=15, 3*6=18(F)).
    • List becomes: [ (2,T), (3,T), (4,F), (5,T), (6,F), (7,T), (8,F), (9,F), (10,F), (11,T), (12,F), (13,T), (14,F), (15,F), (16,F), (17,T), (18,F), (19,T), (20,F) ]
  4. Find the next number p (after p=3) that is marked True: The next True is at p = 5.

    • Action: 5 must be prime. Add 5 to list: Primes = [2, 3, 5].
    • Sieve Action: Cross out multiples of 5 (5*2=10(F), 5*3=15(F), 5*4=20(F)). No changes needed as they were already False.
    • List remains the same.
  5. Find the next number p (after p=5) that is marked True: The next True is at p = 7.

    • Action: 7 must be prime. Add 7 to list: Primes = [2, 3, 5, 7].
    • Sieve Action: Cross out multiples of 7 (7*2=14(F)). No changes.
    • List remains the same.
  6. Find the next number p (after p=7) that is marked True: The next True is at p = 11.

    • Action: 11 must be prime. Add 11 to list: Primes = [2, 3, 5, 7, 11].
    • Sieve Action: Cross out multiples of 11 (11*2=22, which is > 20). No changes.
    • List remains the same.
  7. Find the next number p (after p=11) that is marked True: The next True is at p = 13.

    • Action: 13 must be prime. Add 13: Primes = [2, 3, 5, 7, 11, 13].
    • Sieve Action: Multiples are > 20.
  8. Find the next number p (after p=13) that is marked True: The next True is at p = 17.

    • Action: 17 must be prime. Add 17: Primes = [2, 3, 5, 7, 11, 13, 17].
    • Sieve Action: Multiples are > 20.
  9. Find the next number p (after p=17) that is marked True: The next True is at p = 19.

    • Action: 19 must be prime. Add 19: Primes = [2, 3, 5, 7, 11, 13, 17, 19].
    • Sieve Action: Multiples are > 20.
  10. Find the next number p (after p=19) that is marked True: There are none left up to n=20. We are done.

The Final Prime List: [2, 3, 5, 7, 11, 13, 17, 19].

Why is this efficient?

  • We only “process” (find multiples of) the actual prime numbers (2, 3, 5, 7, etc.).
  • We touch each composite number (like 4, 6, 8, 9, 10, 12…) only when we cross it out as a multiple of one of its prime factors. We don’t waste time trying to see if 12 is divisible by 4, 5, 6, etc., because it was already crossed out when we processed prime 2 (and again for prime 3).

The Optimization (Sieving from p*p)

Let’s look at step 3 again, when p=3. We crossed out 6, 9, 12, 15, 18. Notice that 6 was already crossed out when we processed p=2. Why? Because 6 = 2 * 3. The factor k=2 is less than p=3. In general, when processing prime p, any multiple k * p where k < p has already been crossed out by the prime factors of k (which are all smaller than p). So, the first multiple of p that we really need to worry about crossing out is p * p.

  • For p=3, we start crossing out from 3*3 = 9.
  • For p=5, we start crossing out from 5*5 = 25 (which is > 20, so we do nothing).
  • For p=7, we start from 7*7=49 (> 20). This optimization saves some redundant work.

The Boolean Array (is_prime)

In the code, instead of a list of pairs like (2, T), we use a simple boolean array (a list of True/False values). The index of the array represents the number. is_prime = [False, False, True, True, True, ... True] (up to index n)

  • When we process p=2, we loop for i in range(p*p, n+1, p): is_prime[i] = False.
  • When we process p=3, we loop for i in range(p*p, n+1, p): is_prime[i] = False. And so on.
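Putting the sieve and the p*p optimization together, a sketch:

```python
def generate_primes(n):
    # is_prime[k] stays True while k is still a candidate prime.
    is_prime = [False, False] + [True] * (n - 1)
    primes = []
    for p in range(2, n + 1):
        if is_prime[p]:
            primes.append(p)
            # Sieve from p*p: smaller multiples of p were already
            # crossed out by their own smaller prime factors.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return primes
```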

Section 5.10: Permute the Elements of an Array (Page 14 / PDF Page 50)

The Problem: You are given an array A of n elements, and a permutation array P of length n. The permutation array P describes how to rearrange A: for each index i, the element currently at A[i] should be moved to position P[i]. You need to apply this permutation to array A in-place (O(1) extra space).

Example:

  • A = [a, b, c, d]
  • P = [2, 0, 1, 3]

Interpretation of P:

  • Element at index 0 (a) moves to index P[0]=2.
  • Element at index 1 (b) moves to index P[1]=0.
  • Element at index 2 (c) moves to index P[2]=1.
  • Element at index 3 (d) moves to index P[3]=3.

Desired Result: A should become [b, c, a, d].

Approach 1: Using Extra Space (Simple but Not In-Place)

  1. Create a new array B of the same size as A.
  2. Iterate through A from i = 0 to n-1.
  3. For each element A[i], place it in its target position in B: B[P[i]] = A[i].
  4. Copy the contents of B back into A.
  • Complexity:
    • Time: O(n) (for the iteration and copy).
    • Space: O(n) (for the auxiliary array B).
  • Issue: Doesn’t meet the O(1) space requirement.
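Approach 1 as code (a sketch; the function name is mine):

```python
def apply_permutation_with_space(P, A):
    B = [None] * len(A)
    for i in range(len(A)):
        # The element at A[i] belongs at position P[i].
        B[P[i]] = A[i]
    A[:] = B  # copy back so A itself is rearranged
    return A
```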

Approach 2: In-Place Permutation using Cycles

Key Insight: Any permutation can be broken down into one or more disjoint cycles.

  • A cycle means that element at index i0 moves to i1, element at i1 moves to i2, …, element at ik-1 moves to ik, and element at ik moves back to i0.

In our example P = [2, 0, 1, 3]:

  • Start at index 0: Element a moves to index 2 (P[0]=2).
  • Look at index 2: Element c moves to index 1 (P[2]=1).
  • Look at index 1: Element b moves to index 0 (P[1]=0).
  • We’re back at index 0. This forms the cycle 0 -> 2 -> 1 -> 0.
  • Start at index 3 (the next unvisited index): Element d moves to index 3 (P[3]=3). This is a cycle of length 1: 3 -> 3. The permutation P consists of two cycles: (0 2 1) and (3).

Algorithm Idea: If we can process one cycle at a time, we can do it in place.

  1. Iterate through the array A from i = 0 to n-1.
  2. For each index i, check if the element at this position has already been moved as part of a previous cycle. If it has, skip it.
  3. If the element at i hasn’t been moved, we’ve found the start of a new cycle. Process this cycle:
    • Remember the starting index start = i and the value temp = A[start].
    • Easier cycle traversal:
      • Store the element you’re about to overwrite: value_to_move = A[i].
      • Follow the cycle: curr = i.
      • Loop:
        • next_idx = P[curr].
        • Save the element at the destination: next_value = A[next_idx].
        • Move the value_to_move to the destination: A[next_idx] = value_to_move.
        • Update curr = next_idx.
        • Update value_to_move = next_value.
        • Mark the position curr (or maybe P[curr]) as visited so we don’t process this cycle again.
        • Stop when curr gets back to the starting index i.

How to Mark Visited Cycles without O(n) Space? This is the clever trick. We can modify the permutation array P itself to mark elements that have been processed. The book suggests subtracting n (or using the sign bit if possible) from P[i] after its corresponding element A[i] has been placed correctly.

The Cycle Idea: Follow One Element All the Way

Let’s pick an element and follow its journey until it lands in its final spot, and see where the element it displaced needs to go, and so on.

Let’s trace starting with index i = 0:

  1. Element: A[0] is ‘a’.
  2. Destination: Where should ‘a’ go? P[0] tells us it goes to index 2.
  3. Problem: Index 2 currently holds ‘c’. We can’t just overwrite ‘c’.
  4. Solution: Before moving ‘a’ to index 2, let’s save ‘c’ somewhere temporarily. Let temp = 'c'.
  5. Move ‘a’: Now, put ‘a’ into A[2]. The array A is now ['a', 'b', 'a', 'd']. (We still have ‘c’ saved in temp).
  6. Where does ‘c’ go? ‘c’ originally came from index 2. According to P, the element from index 2 should go to index P[2] = 1.
  7. Problem: Index 1 currently holds ‘b’.
  8. Solution: Save ‘b’. Let temp = 'b'.
  9. Move ‘c’: Put ‘c’ (which we saved way back in step 4) into A[1]. Array A is now ['a', 'c', 'a', 'd']. (We have ‘b’ saved in temp).
  10. Where does ‘b’ go? ‘b’ originally came from index 1. According to P, the element from index 1 should go to index P[1] = 0.
  11. Problem: Index 0 currently holds ‘a’.
  12. Solution: Save ‘a’. Let temp = 'a'.
  13. Move ‘b’: Put ‘b’ (saved in step 8) into A[0]. Array A is now ['b', 'c', 'a', 'd']. (We have ‘a’ saved in temp).
  14. Where does ‘a’ go? This ‘a’ originally came from index 0. We followed the path 0 -> 2 -> 1 -> 0. We are back where we started! This ‘a’ we saved in step 12 is the one we started with. Since we’re back at the start, the cycle is complete. We don’t need to place this saved ‘a’; it’s already implicitly handled because the element now at index 0 (‘b’) is correct.

We’ve just processed the cycle 0 -> 2 -> 1 -> 0! The array A is now ['b', 'c', 'a', 'd'].

How do we know we are done? We need to check other starting points.

  • Let’s trace starting with index i = 1:
    • We need a way to know if index 1 was already handled. If we followed the cycle correctly above, index 1 was involved. We need a mechanism to remember this.
  • Let’s trace starting with index i = 3:
    • Assume index 3 hasn’t been handled yet.
    • Element: A[3] is ’d’.
    • Destination: P[3] tells us it goes to index 3.
    • It’s already there! This is a cycle of length 1: 3 -> 3. We don’t need to do anything.

Putting it Together & The “Marking” Trick:

The process above shows how a cycle works with temporary variables. The challenge is doing it without explicitly using lots of temp variables (just one is needed at a time) and knowing which cycles have been completed. This is where modifying the perm array comes in (or using a visited array). We mark perm[k] by making it negative after we’ve moved the element originally at A[k] to its correct spot.

A subtle but crucial detail: in the book’s code, the swap is always between A[i] (the start index of the current cycle) and A[perm[next]] — not between A[next] and A[perm[next]]. A[i] acts as the “hand” that holds whichever value is currently being carried around the cycle; swapping the wrong pair produces an incorrect result.

Corrected Trace (Using the Book’s Swap Logic - A[i], A[perm[next]] = A[perm[next]], A[i]):

  • Initial State: A = ['a', 'b', 'c', 'd'], perm = [2, 0, 1, 3]

  • Outer loop: i = 0

    • next = 0. perm[0] (2) >= 0. Enter while.
      • Swap A[0] (‘a’) with A[perm[0]] (A[2], which is ‘c’). A becomes ['c', 'b', 'a', 'd']. Now ‘a’ sits in its final spot (index 2), and A[0] holds the displaced ‘c’.
      • temp = perm[0] (which is 2).
      • Mark perm[0]: perm[0] = 2 - 4 = -2. perm is now [-2, 0, 1, 3].
      • next = temp = 2.
    • perm[2] (1) >= 0. Continue while.
      • Swap A[0] (‘c’) with A[perm[2]] (A[1], which is ‘b’). A becomes ['b', 'c', 'a', 'd']. Now ‘c’ is at index 1, and A[0] holds ‘b’.
      • temp = perm[2] (which is 1).
      • Mark perm[2]: perm[2] = 1 - 4 = -3. perm is now [-2, 0, -3, 3].
      • next = temp = 1.
    • perm[1] (0) >= 0. Continue while.
      • Swap A[0] (‘b’) with A[perm[1]] (A[0] itself): a self-swap; ‘b’ is already at its destination, index 0.
      • temp = perm[1] (which is 0).
      • Mark perm[1]: perm[1] = 0 - 4 = -4. perm is now [-2, -4, -3, 3].
      • next = temp = 0.
    • perm[0] (-2) < 0. Exit while loop.
  • Outer loop: i = 1

    • perm[1] (-4) is negative. Skip while loop. (This index was part of the cycle we just processed).
  • Outer loop: i = 2

    • perm[2] (-3) is negative. Skip while loop.
  • Outer loop: i = 3

    • next = 3. perm[3] (3) >= 0. Enter while.
      • Swap A[3] (’d’) with A[perm[3]] (A[3] itself). A is unchanged: ['b', 'c', 'a', 'd'].
      • temp = perm[3] (which is 3).
      • Mark perm[3]: perm[3] = 3 - 4 = -1. perm is now [-2, -4, -3, -1].
      • next = temp = 3.
    • perm[3] (-1) < 0. Exit while loop.
  • Outer loop finishes.

  • Final state for A is ['b', 'c', 'a', 'd'] — the desired result.

The swap-with-A[i] formulation can be hard to see at first: A[i] always holds the value currently being carried around the cycle. The “pick up and carry” mental model makes this explicit and more direct.

Cycle Following - Simpler View:

This approach correctly gets the desired result by focusing on placing the element currently “held” (value) to its destination, and then picking up the element that was at that destination.

  1. Outer loop i = 0 to n-1:
    • If perm[i] is negative, this element’s cycle is already processed. Skip.
    • Start a new cycle:
      • current_cycle_idx = i
      • value_to_place = A[i] (the element we are “holding”)
      • Inner do-while loop (or while True with a break):
        1. next_target_pos = perm[current_cycle_idx] (where value_to_place should go).
        2. value_at_target = A[next_target_pos] (save the element currently at the target).
        3. A[next_target_pos] = value_to_place (place our “held” value).
        4. Mark perm[current_cycle_idx] as processed (e.g., perm[current_cycle_idx] -= len(perm)).
        5. current_cycle_idx = next_target_pos (this is where value_at_target came from).
        6. value_to_place = value_at_target (this is the new value we are “holding”).
        7. If current_cycle_idx == i (we’ve returned to the start of the cycle), break the inner loop.

Let’s try this view on A = ['a', 'b', 'c', 'd'], P = [2, 0, 1, 3]

  • Outer loop i = 0:

    • perm[0] (2) is not negative. Start cycle.
    • current_cycle_idx = 0. value_to_place = A[0] = 'a'.
    • Inner Loop 1:
      1. next_target_pos = perm[0] = 2.
      2. value_at_target = A[2] = 'c'.
      3. A[2] = 'a'. A is ['a', 'b', 'a', 'd'].
      4. perm[0] -= 4, giving -2. perm is [-2, 0, 1, 3].
      5. current_cycle_idx = 2.
      6. value_to_place = 'c'.
      7. current_cycle_idx (2) != i (0). Continue.
    • Inner Loop 2:
      1. next_target_pos = perm[2] = 1.
      2. value_at_target = A[1] = 'b'.
      3. A[1] = 'c'. A is ['a', 'c', 'a', 'd'].
      4. perm[2] -= 4, giving -3. perm is [-2, 0, -3, 3].
      5. current_cycle_idx = 1.
      6. value_to_place = 'b'.
      7. current_cycle_idx (1) != i (0). Continue.
    • Inner Loop 3:
      1. next_target_pos = perm[1] = 0.
      2. value_at_target = A[0] = 'a'.
      3. A[0] = 'b'. A is ['b', 'c', 'a', 'd'].
      4. perm[1] -= 4, giving -4. perm is [-2, -4, -3, 3].
      5. current_cycle_idx = 0.
      6. value_to_place = 'a'.
      7. current_cycle_idx (0) == i (0). Break inner loop. Cycle for i=0 is complete.
  • Outer loop i = 1:

    • perm[1] (-4) is negative. Skip.
  • Outer loop i = 2:

    • perm[2] (-3) is negative. Skip.
  • Outer loop i = 3:

    • perm[3] (3) is not negative. Start cycle.
    • current_cycle_idx = 3. value_to_place = A[3] = 'd'.
    • Inner Loop 1:
      1. next_target_pos = perm[3] = 3.
      2. value_at_target = A[3] = 'd'.
      3. A[3] = 'd'. A is ['b', 'c', 'a', 'd'].
      4. perm[3] -= 4, giving -1. perm is [-2, -4, -3, -1].
      5. current_cycle_idx = 3.
      6. value_to_place = 'd'.
      7. current_cycle_idx (3) == i (3). Break inner loop. Cycle for i=3 is complete.

Done. Final A is ['b', 'c', 'a', 'd']. This matches the desired result!

The key is to follow the element you “picked up”: place it at its destination, pick up the element it displaces, and continue, marking the origin index in P as processed at each placement. The book’s swap A[next_idx], A[target_idx] = A[target_idx], A[next_idx] actually performs this “pick up and carry” implicitly: the element at target_idx moves to next_idx, the old A[next_idx] moves to target_idx, and the next_idx for the next iteration becomes the original target_idx. This is a common way to implement cycle following (as in cycle sort).
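The “pick up and carry” steps above can be sketched directly in Python. This is a sketch, not the book’s exact code: the function name apply_permutation and the final restore pass over perm are my additions.

```python
def apply_permutation(perm, A):
    """Rearrange A in place so that the element A[i] moves to index perm[i].

    Uses O(1) extra space by marking processed entries of perm with
    negative values, then restoring perm at the end.
    """
    n = len(perm)
    for i in range(n):
        if perm[i] < 0:           # already part of a processed cycle
            continue
        current = i
        value_to_place = A[i]     # the element we are "holding"
        while True:
            target = perm[current]
            # Place the held value at its destination; pick up what was there.
            A[target], value_to_place = value_to_place, A[target]
            perm[current] -= n    # mark this index as processed
            current = target
            if current == i:      # back at the start: cycle complete
                break
    # Restore perm to its original (non-negative) values.
    for i in range(n):
        perm[i] += n


A = ['a', 'b', 'c', 'd']
P = [2, 0, 1, 3]
apply_permutation(P, A)
print(A)  # ['b', 'c', 'a', 'd']
```

Running this on the example reproduces the trace above: each cycle is walked exactly once, so the whole rearrangement is O(n) time.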


Section 5.11: Compute the Next Permutation (Page 16 / PDF Page 52)

The Problem: Given an array perm representing a permutation of numbers (e.g., [1, 0, 3, 2]), find the next permutation in dictionary (lexicographical) order. If the input is already the largest possible permutation (i.e., sorted in descending order, like [3, 2, 1, 0]), return an empty array or list.

Dictionary Order: Think of the permutations as words. [0, 1, 2] comes before [0, 2, 1], which comes before [1, 0, 2], etc. We want the permutation that comes immediately after the input one in this sorted list of all permutations.

Example:

  • Input: [1, 0, 3, 2] -> Output: [1, 2, 0, 3]
  • Input: [1, 2, 0, 3] -> Output: [1, 2, 3, 0]
  • Input: [3, 2, 1, 0] -> Output: [] (or indication of none)

The Key Insight: Minimizing the Change

We want to find the next permutation, which means we want to increase the sequence by the smallest possible amount while still making it larger.

Where to Make the Change? To make the smallest possible increase, we want to change the sequence as far to the right as possible.

Consider [1, 0, 3, 2]:

  • If we only change [3, 2]? The only other arrangement is [2, 3]. This gives [1, 0, 2, 3]. Is this larger than the original [1, 0, 3, 2]? No (at index 2, 2 < 3, so it is smaller). In fact, the suffix [3, 2] is already in descending order, i.e., it is the largest possible arrangement of those two elements, so rearranging the suffix alone cannot produce a larger permutation.
  • If we only change [0, 3, 2]? We could rearrange these to [2, 0, 3], [2, 3, 0], [3, 0, 2], [3, 2, 0]. Combined with the 1 at the start, these give [1, 2, 0, 3], [1, 2, 3, 0], [1, 3, 0, 2], [1, 3, 2, 0]. All of these are larger than the original. Which one is the smallest? [1, 2, 0, 3].

Notice that [1, 2, 0, 3] seems to be the smallest possible increase we found. It involved changing the 0 at index 1.

Finding the “Pivot” - Where the Increase Must Happen Let’s look at the sequence from right to left. perm = [1, 0, 3, 2]

  • 2: Nothing to compare.
  • 3, 2: 3 > 2. The sequence is decreasing here.
  • 0, 3, 2: 0 < 3. The decreasing trend is broken! This 0 at index k=1 is important. Why? Any rearrangement involving only the suffix [3, 2] keeps the prefix [1, 0] the same, and since that suffix is decreasing, it is already the largest arrangement of its elements: no suffix-only rearrangement can make the permutation larger. To get a larger permutation, we must change something at or before index 1, and to make the smallest change, we change the rightmost such element, which is perm[1]=0.

This leads to:

Algorithm Steps:

  1. Find the “pivot” (k):

    • Scan the permutation perm from right to left. Find the first index k such that perm[k] < perm[k+1].
    • This perm[k] is the rightmost element we can increase while potentially making the resulting permutation larger than the original but as small as possible.
    • If no such k exists (the entire array is in descending order), it’s the last permutation. Return an empty list or signal no next permutation.
    • For [1, 0, 3, 2], k=1 (where perm[k]=0). The suffix is [3, 2].
  2. Find the “successor” (l) to swap with the pivot:

    • We need to replace perm[k] with something larger than it. To make the overall permutation increase by the least amount, we should replace perm[k] with the smallest possible value from the suffix perm[k+1:] that is still greater than perm[k].
    • Scan the suffix perm[k+1:] from right to left. Find the first element perm[l] that is greater than perm[k].
    • For [1, 0, 3, 2], perm[k]=0. Suffix is [3, 2]. Scan from right:
      • perm[3] = 2. Is 2 > 0? Yes. Found it! l=3. The successor value is 2.
    • (Why scan from the right for the successor? Because the suffix perm[k+1:] was identified because k was the first break in a descending sequence from the right. This means the suffix perm[k+1:] itself must be in descending order. Scanning a descending sequence from the right guarantees that the first element we find that’s greater than the pivot is also the smallest element in that suffix that’s greater than the pivot.)
  3. Swap pivot and successor:

    • Swap perm[k] and perm[l].
    • Swap perm[1](0) and perm[3](2). perm becomes [1, 2, 3, 0].
  4. Make the rest of the sequence (suffix after k) as small as possible:

    • We’ve now made the prefix up to index k (e.g., [1, 2]) the smallest possible larger prefix.
    • Now, we need to make the suffix (the part after index k, which is now [3, 0]) as small as possible in dictionary order to ensure this is the very next permutation.
    • How do you make a sequence of numbers as small as possible in dictionary order? You sort it in ascending order.
    • The suffix after index k (perm[k+1:]) was originally in descending order (or became so after the swap, as perm[l] was replaced by perm[k], which is smaller than other elements in the original suffix). To sort a descending sequence into ascending order efficiently, just reverse it!
    • The suffix after k=1 is [3, 0]. Reversing it gives [0, 3].
    • Attach this to the prefix: [1, 2] + [0, 3] = [1, 2, 0, 3]. This is the final answer.

Let’s try perm = [6, 2, 1, 5, 4, 3, 0] again:

  1. Find k: Scan right-to-left.

    • 0
    • 3 > 0
    • 4 > 3
    • 5 > 4
    • 1 < 5. Break! k=2 (pivot perm[k]=1). Suffix is [5, 4, 3, 0].
  2. Find l: Scan suffix [5, 4, 3, 0] right-to-left for first element > perm[k](1).

    • 0 > 1? No.
    • 3 > 1? Yes. Found it! l=5 (index in original array, successor perm[l]=3).
  3. Swap: Swap perm[k](1) and perm[l](3).

    • perm becomes [6, 2, 3, 5, 4, 1, 0].
  4. Reverse suffix: Reverse the part after k=2, which is [5, 4, 1, 0].

    • Reversing gives [0, 1, 4, 5].
  5. Final Result: [6, 2, 3] + [0, 1, 4, 5] = [6, 2, 3, 0, 1, 4, 5].

This algorithm correctly finds the immediate next permutation by making the smallest necessary change to the prefix and then making the suffix as small as possible.
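The four steps translate into a short Python sketch. Treat this as an illustrative implementation of the steps above (the function name next_permutation follows the problem statement; whether to mutate in place or return a copy is a design choice made here for simplicity):

```python
def next_permutation(perm):
    """Advance perm to the next permutation in lexicographic order,
    in place. Returns [] if perm is already the last (descending)
    permutation."""
    # Step 1: find the pivot k, the first index from the right
    # with perm[k] < perm[k + 1].
    k = len(perm) - 2
    while k >= 0 and perm[k] >= perm[k + 1]:
        k -= 1
    if k == -1:
        return []  # entirely descending: no next permutation

    # Step 2: find the rightmost suffix element greater than perm[k].
    # The suffix perm[k+1:] is descending, so this is also the
    # smallest suffix element greater than perm[k].
    l = len(perm) - 1
    while perm[l] <= perm[k]:
        l -= 1

    # Step 3: swap pivot and successor.
    perm[k], perm[l] = perm[l], perm[k]

    # Step 4: the suffix is still descending; reverse it so it
    # becomes ascending, i.e., as small as possible.
    perm[k + 1:] = reversed(perm[k + 1:])
    return perm


print(next_permutation([1, 0, 3, 2]))           # [1, 2, 0, 3]
print(next_permutation([6, 2, 1, 5, 4, 3, 0]))  # [6, 2, 3, 0, 1, 4, 5]
print(next_permutation([3, 2, 1, 0]))           # []
```

Both scans and the reversal are single passes, so the whole thing runs in O(n) time and O(1) extra space.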


Sampling Algorithms and Their Implementation

Problem: You have a bag (array A) with n distinct items. You want to pull out exactly k items such that every possible group of k items has the same chance of being selected. You do this by modifying the original array A so the chosen k items end up in the first k positions (A[0] to A[k-1]).

Analogy: Picking Players for a Team

  • You have n players lined up (A[0] to A[n-1]).
  • You need to form a team of k players.
  • You want the selection process to be completely fair – any group of k players is equally likely.

A Naive (But Fair) Way - Drawing Names from a Hat:

  1. Write each of the n player names on a slip of paper.
  2. Put all n slips into a hat.
  3. Shake the hat well.
  4. Draw out k slips one by one without putting them back. This is perfectly fair. Every group of k names has an equal chance.

How can we simulate this “drawing from a hat” using the array A directly? This is what the Offline Sampling algorithm (random_sampling) does. Let’s trace it with A = ['A', 'B', 'C', 'D', 'E'] and k = 3. We want to pick 3 players fairly and put them in A[0], A[1], A[2].

The Algorithm Step-by-Step:

Goal: Fill A[0], then A[1], then A[2] with fairly chosen players.

Step 1: Fill A[0] (i=0)

  • Idea: We need to pick one player randomly out of the entire group (A[0] to A[4]) to put into the first team slot (A[0]).
  • Action:
    1. Choose a random index r between 0 and 4 (inclusive). Let’s say random.randrange(0, 5) gives r = 3.
    2. Swap the player currently at A[0] (which is ‘A’) with the player at the random index A[r] (which is A[3], ‘D’).
    3. A becomes ['D', 'B', 'C', 'A', 'E'].
  • Result: A[0] now contains ‘D’. ‘D’ was chosen uniformly from all 5 original players. We have fairly selected the first team member and placed them in the first slot. The remaining players (['B', 'C', 'A', 'E']) are now in slots A[1] to A[4].

Step 2: Fill A[1] (i=1)

  • Idea: We need to pick the second team member. This member must be chosen fairly from the players who haven’t been picked yet. Where are the unpicked players? They are currently in the array from index i=1 to the end (A[1] to A[4]). There are n-i = 5-1 = 4 players remaining (‘B’, ‘C’, ‘A’, ‘E’).
  • Action:
    1. Choose a random index r between 1 and 4 (inclusive). Let’s say random.randrange(1, 5) gives r = 1.
    2. Swap the player currently at A[i] (which is A[1], ‘B’) with the player at the random index A[r] (which is A[1], ‘B’).
    3. A remains ['D', 'B', 'C', 'A', 'E']. (Swapping with itself).
  • Result: A[1] now contains ‘B’. ‘B’ was chosen uniformly from the 4 players available (B, C, A, E) after ‘D’ was picked. We have fairly selected the second team member. The remaining 3 players (['C', 'A', 'E']) are now in slots A[2] to A[4].

Step 3: Fill A[2] (i=2)

  • Idea: Pick the third team member fairly from the remaining players. The available players are at indices i=2 to the end (A[2] to A[4]). There are n-i = 5-2 = 3 players remaining (‘C’, ‘A’, ‘E’).
  • Action:
    1. Choose a random index r between 2 and 4 (inclusive). Let’s say random.randrange(2, 5) gives r = 4.
    2. Swap the player currently at A[i] (which is A[2], ‘C’) with the player at the random index A[r] (which is A[4], ‘E’).
    3. A becomes ['D', 'B', 'E', 'A', 'C'].
  • Result: A[2] now contains ‘E’. ‘E’ was chosen uniformly from the 3 players available (C, A, E) after ‘D’ and ‘B’ were picked. We have fairly selected the third team member.

Stop: We needed k=3 players, and we have filled A[0], A[1], A[2].

Final State: The array is ['D', 'B', 'E', 'A', 'C']. The Sample: The first k=3 elements are ['D', 'B', 'E']. This is our uniformly random sample.

Why is this like drawing from a hat?

  • At step i, choosing a random index r from i to n-1 is like drawing one name randomly from the n-i names still left in the hat.
  • Swapping A[i] with A[r] is like taking the drawn name (A[r]) and putting it aside as the i-th selected player (by placing it at A[i]), while the name that was at A[i] is effectively put back into the “remaining players” part of the array (at index r) to potentially be drawn later.

This process ensures that at each step i, every available element has an equal chance of being selected for position i. Over k steps, this builds up a sample where every possible group of k had an equal shot.

Let’s try a different random sequence for k=2, A=['a','b','c','d']:

  • i=0: n=4. Pick r from [0, 1, 2, 3]. Say r=2. Swap A[0](‘a’) and A[2](‘c’). A = ['c', 'b', 'a', 'd']. A[0] is fairly chosen.
  • i=1: n=4. Pick r from [1, 2, 3]. Say r=3. Swap A[1](‘b’) and A[3](’d’). A = ['c', 'd', 'a', 'b']. A[1] is fairly chosen from the rest.
  • Stop. k=2. Sample is A[0..1] = ['c', 'd'].

Think about filling slots 0 to k-1. For slot i, you randomly pick one element from all elements currently at or after index i and swap it into place.
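The whole “fill slots 0 to k-1” procedure is only a few lines of Python. This is a sketch of the offline sampling idea just traced (the name random_sampling mirrors the book’s function):

```python
import random


def random_sampling(k, A):
    """Rearrange A in place so that A[:k] holds a uniformly random
    k-subset of A's elements. O(k) time, O(1) extra space."""
    for i in range(k):
        # Pick uniformly from the not-yet-chosen elements A[i:].
        r = random.randrange(i, len(A))
        A[i], A[r] = A[r], A[i]


players = ['A', 'B', 'C', 'D', 'E']
random_sampling(3, players)
print(players[:3])  # some 3-subset, e.g. ['D', 'B', 'E']; all equally likely
```

Each iteration is one “draw from the hat”: the swap moves the drawn element into the sample region and returns the displaced element to the pool.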


Section 5.12: Sample Offline Data (Page 18 / PDF Page 54)

The Scenario: Imagine you have a large dataset (like all your customers, or all sensor readings from a day) stored in an array A. You want to select a smaller, random sample of size k from this dataset (e.g., to send a survey to, or to analyze more closely). The key requirement is that every possible subset of size k must have an equal chance of being selected. This is called uniform random sampling. “Offline” means you have the entire dataset A available before you start sampling.

The Problem: Implement an algorithm that takes an array A of n distinct elements and an integer k (where k <= n). It should rearrange A such that the first k elements (A[0] to A[k-1]) hold a uniformly random subset of the original elements of size k.

Example:

  • Input: A = [3, 7, 5, 11], k = 3
  • Possible outputs (in the first k=3 slots): [3, 7, 5], [3, 5, 11], [7, 3, 11], [11, 5, 7], etc. Each combination of 3 elements should appear with probability 1/(4 choose 3) = 1/4. The order within the first k elements doesn’t strictly matter for the subset definition, but the algorithm naturally produces a random permutation within the subset.

Why Not Just Pick Random Indices? If you just pick k random indices between 0 and n-1, you might pick the same index multiple times. If you discard duplicates and retry, the process becomes complex and potentially slow (related to the Coupon Collector’s problem), especially if k is close to n.

The Efficient In-Place Algorithm (Fisher-Yates Shuffle Adaptation):

This algorithm is elegant and efficient. It builds the random subset in the first k positions of the array iteratively.

Algorithm:

  1. Iterate with an index i from 0 up to k-1. This loop runs exactly k times.
  2. In each iteration i:
    • We need to select a random element from the portion of the array that hasn’t been considered for the sample yet. This “unseen” portion is from index i to n-1 (inclusive).
    • Generate a random integer r such that i <= r < n. (Pick a random index from the current position i to the end).
    • Swap the element at the current index i with the element at the randomly chosen index r: A[i], A[r] = A[r], A[i].
  3. After the loop finishes (after k swaps), the subarray A[0...k-1] contains the uniformly random sample of size k.

Why does this work? (Intuition)

  • Iteration i=0: We randomly pick an index r from 0 to n-1. We swap A[0] with A[r]. Now A[0] holds a uniformly random element chosen from the entire original array.
  • Iteration i=1: We randomly pick an index r from 1 to n-1. This picks uniformly from all elements except the one we already placed at A[0]. We swap A[1] with A[r]. Now A[1] holds a uniformly random element chosen from the remaining n-1 elements. A[0] and A[1] together now form a random pair.
  • Iteration i: We pick a random index r from i to n-1. This selects uniformly from the n-i elements that haven’t yet been fixed into the sample positions 0 to i-1. We swap A[i] with A[r]. This places a uniformly chosen element (from the remaining available ones) into position i.

By induction, after k steps, A[0...k-1] holds a uniformly chosen subset of size k. Each element had an equal chance of ending up in each position of the sample.
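If the induction argument feels abstract, a quick simulation can make it concrete. This is my own check, not from the book: it draws 2-subsets of {0, 1, 2, 3} many times and counts how often each of the C(4, 2) = 6 subsets appears; each should show up roughly 1/6 of the time. The sampler is included inline so the snippet is self-contained.

```python
import random
from collections import Counter


def random_sampling(k, A):
    """Offline sampling: after the call, A[:k] is a uniform k-subset."""
    for i in range(k):
        r = random.randrange(i, len(A))
        A[i], A[r] = A[r], A[i]


counts = Counter()
trials = 60_000
for _ in range(trials):
    A = [0, 1, 2, 3]
    random_sampling(2, A)
    counts[frozenset(A[:2])] += 1

# 4 choose 2 = 6 subsets, each expected about trials / 6 = 10000 times.
for subset, c in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(subset), c)
```

All six counts should cluster tightly around 10000, which is what the uniformity claim predicts.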


Section 5.13: Sample Online Data (Page 19 / PDF Page 55)

The Scenario: Now imagine data isn’t available all at once (“offline”). Instead, you’re receiving a stream of data items (like packets in a network session, or log entries as they happen). You don’t know how many items will be in the stream in total beforehand. You want to maintain a random sample of size k from the items seen so far. As each new item arrives, you update your sample. The sample must always be a uniformly random subset of the items processed up to that point.

The Problem: Design an algorithm that takes an input size k and reads a sequence of data items (an iterator or stream). It should continuously maintain a subset of size k of the items read so far. This subset must be a uniformly random sample.

Why this is Tricky:

  • You don’t know n (the total number of items) in advance.
  • When the (m+1)-th item arrives, you need to decide if it should be in your sample of size k (replacing one of the existing k items) or if it should be discarded. This decision must ensure that after processing m+1 items, any k items from the m+1 seen so far have an equal chance of being in your sample.

The Algorithm: Reservoir Sampling

This is a classic algorithm for this problem.

Algorithm:

  1. Initialization: Read the first k items from the stream and store them directly into your sample (let’s call it sample_array of size k). At this point, sample_array is trivially a uniform random sample of the first k items.
  2. Process Subsequent Items: For each subsequent item x that arrives (from item k+1 onwards):
    • Let num_items_seen_so_far be the total number of items processed so far (including the current item x). So, when processing the (k+1)-th item, num_items_seen_so_far = k+1. When processing the (k+2)-th, num_items_seen_so_far = k+2, and so on.
    • Generate a random integer r between 0 and num_items_seen_so_far - 1 (inclusive). (Or 1 to num_items_seen_so_far and adjust).
    • Decision:
      • If r < k (i.e., if the random number falls within the range 0 to k-1), then replace the element sample_array[r] with the current item x.
      • If r >= k, do nothing; discard the current item x.
  3. After processing all items in the stream, sample_array will hold a uniformly random sample of size k from the entire stream.
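The reservoir algorithm above can be sketched as follows. This is a sketch under two assumptions: online_random_sample is a hypothetical name, and the stream is any Python iterator yielding at least k items.

```python
import itertools
import random


def online_random_sample(stream, k):
    """Reservoir sampling: return a uniformly random k-subset of the
    items yielded by `stream`, without knowing its length in advance."""
    # Step 1: the first k items fill the reservoir directly.
    sample = list(itertools.islice(stream, k))

    # Step 2: the num_seen-th item replaces a uniformly chosen
    # reservoir slot with probability k / num_seen.
    num_seen = k
    for x in stream:
        num_seen += 1
        r = random.randrange(num_seen)  # uniform in 0 .. num_seen - 1
        if r < k:
            sample[r] = x
    return sample


print(online_random_sample(iter(range(100)), 5))  # a uniform 5-subset of 0..99
```

Note the single random draw does double duty: it decides both whether the new item enters (r < k) and, if so, which slot it replaces (slot r), which is exactly the decision rule analyzed in the proof.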

Why does this work? (Proof by Induction - High Level)

  • Base Case: After seeing k items, the sample_array contains all k items. This is a uniform random sample (the only possible sample of size k from k items).

  • Inductive Hypothesis: Assume that after seeing m items (where m >= k), the sample_array (of size k) is a uniformly random sample of size k from those m items. This means any specific item y from the first m items has a probability of k/m of being in the sample_array.

  • Inductive Step: Now, the (m+1)-th item, let’s call it x_{m+1}, arrives.

    • What is the probability that x_{m+1} enters the sample?
      • It enters if we pick a random number r (from 0 to m) and r < k.
      • The probability of this is k / (m+1).
    • What is the probability that an old item y (which was among the first m items and was in the sample) remains in the sample?
      • Prob(y was in sample after m items) = k/m.
      • Prob(y is not kicked out when x_{m+1} arrives) = Prob(x_{m+1} is not chosen to replace anything) + Prob(x_{m+1} is chosen, but y is not the one picked for replacement).
      • This is (1 - k/(m+1)) (if x_{m+1} doesn’t replace anything) + (k/(m+1)) * ((k-1)/k) (if x_{m+1} replaces, but it replaces one of the other k-1 items).
      • = (m+1-k)/(m+1) + (k-1)/(m+1)
      • = (m+1-k+k-1)/(m+1) = m/(m+1).
    • So, Prob(y remains in sample) = Prob(y was in sample) * Prob(y not kicked out)
      • = (k/m) * (m/(m+1)) = k/(m+1).
  • Conclusion: After m+1 items, both the new item x_{m+1} and any old item y (that was previously in the sample) now have a probability of k/(m+1) of being in the sample. This holds true for all items, thus maintaining the uniform random property.


Sampling Algorithms Recap

Recall the Offline Sampling (Fisher-Yates) Logic:

At step i (filling the i-th slot of our sample, 0-indexed), we do:

  1. r = random.randrange(i, n)
  2. A[i], A[r] = A[r], A[i]

The key principle ensuring fairness here is: At step i, we choose uniformly from all items not yet placed in the sample (those in positions i through n-1) and place the chosen item into position i.

Now, the Online/Reservoir Sampling Challenge:

When the m-th item arrives (m > k), we don’t know n. We cannot simply pick a random index r from i to n-1. We only know the items seen so far (m items). We have a reservoir R of size k which (by induction) is a fair sample of the first m-1 items.

Goal: Update the reservoir R using the m-th item (x_m) such that R becomes a fair sample of all m items. Fairness means every item x_1, ..., x_m must have a k/m probability of being in R.

Reservoir Sampling Step for item m:

  1. Generate r = random.randrange(m) (an index from 0 to m-1).
  2. if r < k:
    • R[r] = x_m (replace element at index r in the reservoir with the new item).

How does this achieve the k/m probability goal?

Let’s connect this to the “drawing from a hat” idea.

  • Imagine a Hypothetical Hat: Conceptually, after m-1 items, our reservoir R holds k items drawn fairly from a hat containing the first m-1 items.
  • Item m Arrives: It’s like adding the m-th item’s name (x_m) to the hat. Now the hat contains m items.
  • Fair Draw Needed: To get a fair sample of size k from this hat of m items, each item in the hat should have a k/m chance of being selected if we were to redraw completely.

Can we achieve this without redrawing?

Let’s analyze the chances for the new item x_m:

  • The algorithm gives x_m a chance to get into the reservoir if r < k, where r is chosen from 0 to m-1.
  • There are k “winning” values for r (0 to k-1) out of m total possibilities.
  • So, Prob(x_m enters the reservoir) = k/m. This is exactly the fairness probability we need for the new item!

Now, let’s analyze the chances for an old item Y that was already in the reservoir R before x_m arrived.

  • By our assumption (inductive hypothesis), Y was in the reservoir with probability k / (m-1) after m-1 items.

  • For Y to still be in the reservoir after item m is processed, one of two things must happen:

    1. The random number r was not less than k (i.e., r >= k). This happens with probability (m-k)/m. In this case, x_m is ignored, and Y definitely stays.
    2. The random number r was less than k (prob k/m), AND the slot chosen for replacement was not Y’s slot. Since r is chosen uniformly from 0 to k-1 in this case to be the replacement index, the chance that Y’s specific slot is not picked is (k-1)/k. So, the probability of this combined event is (k/m) * (k-1)/k = (k-1)/m.
  • Total probability Y stays in the reservoir = Prob(Case 1) + Prob(Case 2) = (m-k)/m + (k-1)/m = (m-1)/m.

  • Now, the final probability that Y is in the reservoir after processing item m is: Prob(Y was in after m-1) * Prob(Y stays when m arrives) = [k / (m-1)] * [(m-1) / m] = k/m.

The Connection:

Both algorithms achieve fairness, but through different mechanisms constrained by what information is available:

  • Offline (Fisher-Yates): Knows n. Can directly simulate picking from the remaining available items at each step i by choosing a random index r in the future part of the array (i to n-1) and swapping into position i. It ensures fairness by giving every available item an equal chance (1/(n-i)) to be put into slot i.
  • Online (Reservoir): Doesn’t know n. Cannot pick from future indices. Instead, it uses probabilities based on the count m seen so far. It gives the new item x_m the exact target probability k/m of being included. If included, the random replacement maintains the correct probabilities for the older items, ensuring they also end up with probability k/m. It achieves the same final probability distribution as Offline Sampling, but through an incremental, probabilistic update rather than direct selection and placement.

Think of Reservoir Sampling as cleverly calculating how often the new item should replace an old one to maintain the correct k/m probability for everyone in the long run, even without knowing the total number of items m will eventually reach.


Section 5.15: Compute a Random Subset (Page 21 / PDF Page 57)

The Problem: Given a positive integer n and a size k <= n, return a size-k subset of the set {0, 1, 2, ..., n-1}. The subset should be represented as an array. All (n choose k) possible subsets should be equally likely. Additionally, all permutations of elements within the returned array should be equally likely (though this is often a natural byproduct of algorithms that produce uniform subsets).

How is this different from Offline Sampling (Section 5.12)?

  • Offline Sampling (5.12): Input is an existing array A. Output is a modification of A where A[0...k-1] holds the sample. It operates on given elements.
  • This Problem (5.15): Input is just integers n and k. The “elements” are implicitly the integers from 0 to n-1. We need to construct an array containing a random subset of these integers.

Solution Approach 1: Mimic Offline Sampling (Fisher-Yates)

  1. Create an array elements = [0, 1, ..., n-1].
  2. Apply the Offline Sampling algorithm (from Section 5.12, also known as Fisher-Yates shuffle) to this elements array to pick k items.
  3. Return the first k elements of the shuffled elements array (elements[0...k-1]).
  • Complexity:
    • Time: O(n) to create the initial elements array + O(k) for sampling = O(n).
    • Space: O(n) for the elements array.

Can we do better if k is much smaller than n? (e.g., sample 3 items from 1 million) Creating an array of 1 million elements just to pick 3 seems wasteful.

Solution Approach 2: Using a Hash Table (for k « n)

This approach avoids creating the full array of n elements if k is small. It directly simulates the Fisher-Yates logic but only keeps track of the elements that would have been moved from their original positions.

The Idea: Imagine you’re running the Fisher-Yates shuffle on the conceptual array [0, 1, ..., n-1].

  • The array initially has A[j] = j.
  • When you iterate for the i-th element of your sample (from i=0 to k-1):
    • You pick a random index rand_idx from i to n-1.
    • You swap A[i] and A[rand_idx].
If k is small, most of the conceptual array A will remain A[j]=j; only a few elements get swapped. We can use a hash table (dictionary in Python) to store only those elements whose values are not equal to their indices (i.e., the elements that have been moved).

Algorithm with Hash Table:

  1. Initialize an empty dictionary changed_elements. This dictionary will store index: value pairs for any position where the value is different from what it would be in the identity permutation [0, 1, ..., n-1] after our simulated swaps.
  2. Iterate i from 0 to k-1 (to select k elements for our subset).
    1. Generate a random index rand_idx in the range [i, n-1].
    2. Get current values:
      • value_at_i = changed_elements.get(i, i) (If i is in the dictionary, use its stored value; otherwise, its value is i).
      • value_at_rand_idx = changed_elements.get(rand_idx, rand_idx) (Similarly for rand_idx).
    3. Perform the swap (conceptually, updating the dictionary):
      • Store value_at_i at rand_idx: changed_elements[rand_idx] = value_at_i.
      • Store value_at_rand_idx at i: changed_elements[i] = value_at_rand_idx.
  3. After the loop, the desired subset consists of the values that are conceptually in A[0]...A[k-1]. We retrieve these from our changed_elements dictionary (defaulting to i if i is not a key).
    • result_subset = [changed_elements.get(j, j) for j in range(k)].
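The hash-table algorithm above might look like this in Python. This is a sketch of Approach 2 (random_subset is an assumed function name), keeping only the displaced entries of the conceptual array [0, 1, ..., n-1] in a dictionary:

```python
import random


def random_subset(n, k):
    """Return a uniformly random k-subset of {0, ..., n-1} in
    O(k) time and O(k) space, without materializing all n values."""
    changed_elements = {}
    for i in range(k):
        rand_idx = random.randrange(i, n)
        # Values currently at positions i and rand_idx; any position
        # absent from the dictionary still holds its identity value.
        value_at_i = changed_elements.get(i, i)
        value_at_rand = changed_elements.get(rand_idx, rand_idx)
        # Record the swap by storing both positions.
        changed_elements[rand_idx] = value_at_i
        changed_elements[i] = value_at_rand
    return [changed_elements.get(j, j) for j in range(k)]


print(random_subset(1_000_000, 3))  # three distinct values below 10**6
```

Even for n = 10**6, only a handful of dictionary entries are ever created, which is the whole point of the approach.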

When is the Hash Table approach better?

When k is significantly smaller than n.

  • Fisher-Yates on full array (Approach 1):
    • Time: O(n)
    • Space: O(n) (for the initial array [0, ..., n-1])
  • Hash Table method (Approach 2):
    • Time: O(k) (k iterations, each with dictionary operations which are O(1) on average)
    • Space: O(k) (at most 2k entries in the dictionary if i and rand_idx are always different and neither was previously in the dictionary)

So, if k << n (k is much smaller than n), the hash table method wins on both time and space by avoiding the creation and full scan of the n-element array.

If k is close to n (e.g., k = n/2 or k = n-1), O(k) and O(n) are the same asymptotically, and the plain array approach (Approach 1) may even be slightly faster in practice, because array indexing has lower constant factors than dictionary get/set operations. The constant factors matter.


Section 5.16: Generate Nonuniform Random Numbers (Page 22 / PDF Page 58)

The Problem: You are given:

  • A list of n numbers (values): T = [t_0, t_1, ..., t_{n-1}].
  • A list of n corresponding probabilities: P = [p_0, p_1, ..., p_{n-1}]. These probabilities sum up to 1.

You need to write a function that generates one of the numbers from T according to its specified probability p_i. You have access to a standard uniform random number generator that produces values in the range [0.0, 1.0) (i.e., 0 inclusive, 1 exclusive).

Example:

  • Numbers T = [3, 5, 7, 11]
  • Probabilities P = [9/18, 6/18, 2/18, 1/18] (which are [0.5, 0.333..., 0.111..., 0.055...])

If you call your function many times:

  • About 50% of the time, it should return 3.
  • About 33.3% of the time, it should return 5.
  • About 11.1% of the time, it should return 7.
  • About 5.5% of the time, it should return 11.

The Core Idea: Mapping the [0,1) Interval

The uniform random number generator gives us a value r in [0.0, 1.0). We need to divide this [0.0, 1.0) interval into n segments, where the length of each segment corresponds to the probability of picking the associated number.

  • Segment 0: Length p_0. Corresponds to t_0. Range [0, p_0).
  • Segment 1: Length p_1. Corresponds to t_1. Range [p_0, p_0 + p_1).
  • Segment 2: Length p_2. Corresponds to t_2. Range [p_0 + p_1, p_0 + p_1 + p_2).
  • …and so on.
  • Segment n-1: Length p_{n-1}. Corresponds to t_{n-1}. Range [sum(p_0 to p_{n-2}), sum(p_0 to p_{n-1}) = 1.0)

Analogy: A Dartboard with Unequal Slices

Imagine you have a circular dartboard. Instead of all slices being equal (like in a standard dartboard), the slices have different sizes.

  • You have n numbers you want to pick from: t_0, t_1, t_2, ..., t_{n-1}.
  • Each number t_i has a probability p_i of being picked.
  • The slice on the dartboard for t_0 takes up p_0 fraction of the dartboard’s area.
  • The slice for t_1 takes up p_1 fraction of the area.
  • And so on. All p_i fractions add up to 1 (the whole dartboard).

Example:

  • Numbers T = [Red, Green, Blue]
  • Probabilities P = [0.5 (for Red), 0.3 (for Green), 0.2 (for Blue)]
    • Red slice covers 50% of the dartboard.
    • Green slice covers 30%.
    • Blue slice covers 20%.

How to Pick Fairly?

  1. Throw a dart completely randomly at the dartboard. “Completely randomly” means any point on the board is equally likely to be hit.
  2. See which slice your dart lands in. That’s the number you pick. Since the Red slice is bigger, you’re more likely to hit Red. This simulates the probabilities.

Connecting to random.random() which gives a number in [0.0, 1.0)

Instead of a circular dartboard, let’s imagine a straight line segment of length 1. Think of it like a ruler from 0.0 to 1.0. We’re going to divide this ruler into sections, where the length of each section is proportional to the probability.

For T = [Red, Green, Blue] and P = [0.5, 0.3, 0.2]:

  • Section for Red: Length 0.5. It goes from 0.0 up to 0.5. (Range: [0.0, 0.5))
  • Section for Green: Length 0.3. It starts where Red left off, so it goes from 0.5 up to 0.5 + 0.3 = 0.8. (Range: [0.5, 0.8))
  • Section for Blue: Length 0.2. It starts where Green left off, so it goes from 0.8 up to 0.8 + 0.2 = 1.0. (Range: [0.8, 1.0))

Visually:

0.0 <—— Red ——> 0.5 <—— Green ——> 0.8 <—— Blue ——> 1.0
(lengths: Red 0.5, Green 0.3, Blue 0.2)

The Algorithm using this “Ruler”:

  1. Prepare the Ruler Divisions (these are the “cumulative probabilities”):

    • Red ends at: 0.5
    • Green ends at: 0.5 + 0.3 = 0.8
    • Blue ends at: 0.5 + 0.3 + 0.2 = 1.0 Let’s call these end-points: Endpoints = [0.5, 0.8, 1.0]
  2. “Throw a Dart”: Get a random number from your computer, rand_val = random.random(). This rand_val will be somewhere between 0.0 and 1.0 (but not including 1.0).

  3. See Which Section it Lands In:

    • If 0.0 <= rand_val < Endpoints[0] (i.e., rand_val < 0.5), it landed in Red’s section. Output: Red.
    • Else if Endpoints[0] <= rand_val < Endpoints[1] (i.e., 0.5 <= rand_val < 0.8), it landed in Green’s section. Output: Green.
    • Else if Endpoints[1] <= rand_val < Endpoints[2] (i.e., 0.8 <= rand_val < 1.0), it landed in Blue’s section. Output: Blue.

Why does this work? Because the length of each section on our “ruler” is exactly its probability, a uniformly random rand_val is proportionally more likely to fall into longer sections.

Example from the Book:

  • Numbers T = [3, 5, 7, 11]
  • Probabilities P = [9/18, 6/18, 2/18, 1/18]
  • P_decimal = [0.5, 0.333..., 0.111..., 0.055...]

Ruler Divisions (Cumulative Probabilities):

  • div[0] (for number 3): 0.5
  • div[1] (for number 5): 0.5 + 0.333... = 0.833...
  • div[2] (for number 7): 0.833... + 0.111... = 0.944...
  • div[3] (for number 11): 0.944... + 0.055... = 1.0 So, Endpoints = [0.5, 0.833..., 0.944..., 1.0]

Throw a Dart: rand_val = random.random(). Let’s say rand_val = 0.75.

See Where it Lands:

  • Is 0.75 < Endpoints[0] (0.5)? No.
  • Is 0.75 < Endpoints[1] (0.833…)? Yes.
    • Aha! It landed between Endpoints[0] and Endpoints[1]. This corresponds to the second number in our original list T, which is 5.
    • So, we output 5.

What if rand_val = 0.2?

  • Is 0.2 < Endpoints[0] (0.5)? Yes.
    • It landed before the first endpoint. This corresponds to the first number, 3. Output 3.

The “Binary Search” Part (e.g., bisect_left)

When we have many sections (many numbers t_i), checking rand_val against each endpoint one by one (if rand_val < E[0]… else if rand_val < E[1]…) is slow: O(n). Since the Endpoints array [0.5, 0.833, 0.944, 1.0] is sorted, we can use binary search to find the right "bucket" in O(log n). bisect_left(Endpoints, rand_val) finds the index idx such that every Endpoints[j] with j < idx is less than rand_val, and Endpoints[idx] is the first endpoint greater than or equal to rand_val. This idx is precisely the index of the chosen t_i in our original list T. (A small boundary note: bisect.bisect, i.e. bisect_right, is often preferred here, since it sends a rand_val exactly equal to an endpoint into the next segment, matching the half-open ranges [a, b); for a continuous random value the two behave the same in practice.)

Summary in Simple Terms:

  1. Line up all your probabilities end-to-end on a ruler of length 1. Mark where each probability “ends”.
  2. Generate a random point on that ruler (a random number between 0 and 1).
  3. See which probability’s segment your random point landed in.
  4. Pick the number associated with that segment. (Binary search is just a fast way to do step 3 if you have lots of segments).
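Putting the whole recipe together, here is a minimal Python sketch of the ruler-plus-binary-search idea (not the book's exact listing; the function name nonuniform_random_pick and the injectable rand_val parameter are ours, the latter purely so the function is testable with a fixed "dart"):

```python
import bisect
import itertools
import random

def nonuniform_random_pick(values, probabilities, rand_val=None):
    """Pick one of `values`, where values[i] has probability probabilities[i].

    `rand_val` can be supplied for testing; by default a fresh uniform
    number in [0.0, 1.0) is drawn.
    """
    if rand_val is None:
        rand_val = random.random()
    # Build the "ruler" endpoints: cumulative sums of the probabilities.
    endpoints = list(itertools.accumulate(probabilities))
    # Find which segment rand_val landed in. bisect.bisect (bisect_right)
    # maps a value equal to an endpoint into the *next* segment, matching
    # the half-open ranges [a, b).
    return values[bisect.bisect(endpoints, rand_val)]
```

With the book's example T = [3, 5, 7, 11] and P = [9/18, 6/18, 2/18, 1/18], a dart at 0.75 lands in the second segment and returns 5, and a dart at 0.2 lands in the first segment and returns 3, exactly as traced above.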

Section 5.17: The Sudoku Checker Problem (Page 24 / PDF Page 60)

The Problem: You are given a 9x9 2D array (a grid) representing a partially or fully completed Sudoku puzzle. The grid contains integers from 0 to 9. A value of 0 indicates an empty cell. Values from 1 to 9 are filled-in digits.

Your task is to determine if the current state of the Sudoku grid is valid according to Sudoku rules:

  1. Each row must contain unique digits from 1 to 9 (ignoring 0s).
  2. Each column must contain unique digits from 1 to 9 (ignoring 0s).
  3. Each of the nine 3x3 subgrids (also called “boxes” or “regions”) must contain unique digits from 1 to 9 (ignoring 0s).

The problem asks only to check for validity (no duplicate 1-9 digits in any row, column, or 3x3 box). It does not ask you to solve the Sudoku.

Example:

  • Figure 5.2(a) on page 60 is a partially filled valid Sudoku.
  • Figure 5.2(b) is a complete, valid solution.
  • An invalid example: A row [5, 3, 4, 6, 7, 8, 9, 1, 5] is invalid because 5 appears twice.

Core Implementation Idea (Directly Testing Constraints):

The most straightforward approach is to check each of the three types of constraints one by one. For checking uniqueness within a row, column, or 3x3 box, a common technique is to use a helper data structure (like a hash set or a boolean array) to keep track of the digits seen so far within that unit.

Algorithm Outline:

  1. Function to Check a Unit (Row, Column, or Box):

    • Create a helper function, say has_duplicate(unit_array), that takes a 1D array (representing a row, column, or flattened 3x3 box) and returns True if it contains duplicate digits from 1-9, False otherwise.
    • Inside has_duplicate(unit_array):
      • Initialize a seen_digits = set() (or a boolean array seen = [False]*10).
      • Iterate through each digit in unit_array:
        • If digit == 0, ignore it (empty cell).
        • If digit != 0:
          • If digit is already in seen_digits (or seen[digit] is True), then a duplicate is found. Return True immediately.
          • Otherwise, add digit to seen_digits (or set seen[digit] = True).
      • If the loop finishes without finding duplicates, return False.
  2. Check Row Constraints:

    • Iterate i from 0 to 8 (for each row).
    • Extract row i from the Sudoku grid: row_data = grid[i].
    • If has_duplicate(row_data) is True, then the Sudoku is invalid. Return False.
  3. Check Column Constraints:

    • Iterate j from 0 to 8 (for each column).
    • Extract column j: col_data = [grid[i][j] for i in range(9)].
    • If has_duplicate(col_data) is True, then the Sudoku is invalid. Return False.
  4. Check 3x3 Subgrid Constraints:

    • Iterate through the 9 subgrids. A common way is to use nested loops for the top-left corner of each subgrid:
      • Iterate box_row_start from 0 to 6, with a step of 3 (i.e., 0, 3, 6).
      • Iterate box_col_start from 0 to 6, with a step of 3 (i.e., 0, 3, 6).
      • For each (box_row_start, box_col_start):
        • (The next step would be to extract the 3x3 subgrid data and call has_duplicate on it. If it returns True, the Sudoku is invalid and we return False from the main function.)
  5. All Checks Passed: If all the above checks pass without returning False, then the Sudoku grid is valid. Return True.

Simplified Implementation Steps:

  1. Helper function is_unit_valid(list_of_9_cells):

    • Takes a list (a row, a column, or the 9 cells of a box).
    • Uses a seen_numbers = set().
    • For each cell in list_of_9_cells:
      • If cell == 0, continue.
      • If cell is in seen_numbers, return False (duplicate found!).
      • Add cell to seen_numbers.
    • If loop finishes, return True (no duplicates).
  2. Main function is_sudoku_valid(grid):

    • Rows: For each row r from 0 to 8:
      • If not is_unit_valid(grid[r]), then return False.
    • Columns: For each column c from 0 to 8:
      • Create column_list = [grid[r][c] for r in range(9)].
      • If not is_unit_valid(column_list), then return False.
    • Boxes: For each box (e.g., box_row_start from 0,3,6 and box_col_start from 0,3,6):
      • Create box_list by picking the 9 cells.
      • If not is_unit_valid(box_list), then return False.
    • If you get through all of this, return True.
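The two functions above translate almost line-for-line into Python. A sketch (function names follow the outline above, not necessarily the book's listing):

```python
def is_unit_valid(unit):
    # A unit is any 9 cells: a row, a column, or a flattened 3x3 box.
    seen = set()
    for cell in unit:
        if cell == 0:        # empty cell: ignore
            continue
        if cell in seen:     # duplicate 1-9 digit found
            return False
        seen.add(cell)
    return True

def is_sudoku_valid(grid):
    n = 9
    # Rows.
    if not all(is_unit_valid(grid[r]) for r in range(n)):
        return False
    # Columns.
    if not all(is_unit_valid([grid[r][c] for r in range(n)])
               for c in range(n)):
        return False
    # 3x3 boxes, identified by their top-left corner (0, 3, 6).
    for box_r in range(0, n, 3):
        for box_c in range(0, n, 3):
            box = [grid[r][c]
                   for r in range(box_r, box_r + 3)
                   for c in range(box_c, box_c + 3)]
            if not is_unit_valid(box):
                return False
    return True
```

Each of the 27 units is scanned once, so the total work is O(n^2) for an n x n grid with constant-size sets.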

Section 5.18: Compute the Spiral Ordering of a 2D Array (Page 25 / PDF Page 61)

What: Given an n x n 2D array, return a 1D list of its elements in “spiral order” (outside-in, clockwise).

Why (Interview): This is a common 2D array traversal problem. It tests:

  • Ability to manage 2D array indices carefully.
  • Handling boundary conditions.
  • Devising a systematic way to cover all elements without repetition.
  • State management (current position, current direction, visited cells).

Underlying Pattern: Systematic Traversal with State Management. The two main patterns are “Layer Peeling” or “Path Simulation.”

Simplified Implementation Intuition (Simulation Approach - “Robot Walk”):

Imagine a robot starting at matrix[0][0]. It needs to walk in a spiral and report the numbers it steps on.

Setup:

  • result_list = [] (to store the spiral numbers).
  • current_row = 0, current_col = 0.
  • directions = [[0, 1], [1, 0], [0, -1], [-1, 0]] (These are [row_change, col_change] for Right, Down, Left, Up).
  • current_direction_idx = 0 (Start by going Right).
  • num_elements = n * n.
  • A way to mark visited cells: Let’s assume we can modify the input matrix. We’ll set matrix[row][col] = None (or some other special marker) after visiting. (If not, use a separate visited[n][n] boolean array).

The Walk (Loop num_elements times):

  1. Collect: Add matrix[current_row][current_col] to result_list.
  2. Mark: matrix[current_row][current_col] = None (or visited[current_row][current_col] = True).
  3. Try to move in current direction:
    • next_row_change, next_col_change = directions[current_direction_idx].
    • potential_next_row = current_row + next_row_change.
    • potential_next_col = current_col + next_col_change.
  4. Check if turn is needed:
    • Is (potential_next_row, potential_next_col) OFF the grid (e.g., potential_next_row < 0 or >= n, or potential_next_col < 0 or >= n)?
    • OR, is matrix[potential_next_row][potential_next_col] ALREADY visited (e.g., == None)?
  5. If YES (turn needed):
    • Change direction: current_direction_idx = (current_direction_idx + 1) % 4 (turn clockwise).
    • Recalculate actual next step with new direction:
      • next_row_change, next_col_change = directions[current_direction_idx].
      • current_row += next_row_change.
      • current_col += next_col_change.
  6. If NO (can continue in current direction):
    • current_row = potential_next_row.
    • current_col = potential_next_col.
  7. Return result_list.

Example Trace Snippet (3x3): [[1,2,3],[4,5,6],[7,8,9]] res=[], (r,c)=(0,0), dir=0 (Right)

  1. Add 1. res=[1]. matrix[0][0]=None. Try (0,1) (Right). Valid. (r,c)=(0,1).
  2. Add 2. res=[1,2]. matrix[0][1]=None. Try (0,2) (Right). Valid. (r,c)=(0,2).
  3. Add 3. res=[1,2,3]. matrix[0][2]=None. Try (0,3) (Right). INVALID (off grid).
    • Turn! dir=(0+1)%4 = 1 (Down).
    • New try from (0,2) with Down: (1,2). Valid. (r,c)=(1,2).
  4. Add 6. res=[1,2,3,6]. matrix[1][2]=None. Try (2,2) (Down). Valid. (r,c)=(2,2).
  5. Add 9. res=[1,2,3,6,9]. matrix[2][2]=None. Try (3,2) (Down). INVALID (off grid).
    • Turn! dir=(1+1)%4 = 2 (Left).
    • New try from (2,2) with Left: (2,1). Valid. (r,c)=(2,1).
    • …and so on.

Managing Direction:

We need two things:

  1. A way to represent the current direction.
  2. A way to change the direction systematically (usually clockwise).

1. Representing Direction: We can assign numbers to directions:

  • 0: Right
  • 1: Down
  • 2: Left
  • 3: Up We store the robot’s current direction in a variable, say current_direction_idx (initialized to 0 for Right).

2. Changing Direction (Turning Clockwise): When the robot hits a wall or an already visited cell, it needs to turn right (clockwise).

  • If current direction is Right (0), next is Down (1).
  • If current direction is Down (1), next is Left (2).
  • If current direction is Left (2), next is Up (3).
  • If current direction is Up (3), next is Right (0). This pattern is captured by a single calculation: current_direction_idx = (current_direction_idx + 1) % 4. The modulo 4 ensures that when current_direction_idx would become 4, it wraps back around to 0:
  • (0 + 1) % 4 = 1
  • (1 + 1) % 4 = 2
  • (2 + 1) % 4 = 3
  • (3 + 1) % 4 = 0

3. How Direction Affects Movement (Delta Row, Delta Column): Once we know the current_direction_idx, how do we know how to change the row and col to move one step? We can use a lookup table (an array or list of lists/tuples) that stores the change in row (dr) and change in column (dc) for each direction index.

Let:

SHIFT = [[ 0,  1],   # Direction 0 (Right): row doesn't change, col increases by 1
         [ 1,  0],   # Direction 1 (Down):  row increases by 1, col doesn't change
         [ 0, -1],   # Direction 2 (Left):  row doesn't change, col decreases by 1
         [-1,  0]]   # Direction 3 (Up):    row decreases by 1, col doesn't change

When the robot wants to move:

  • Get the changes for the current direction:
    • dr = SHIFT[current_direction_idx][0]
    • dc = SHIFT[current_direction_idx][1]
  • Calculate the potential next position:
    • potential_next_row = current_row + dr
    • potential_next_col = current_col + dc
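Putting the robot walk, the SHIFT table, and the clockwise turning rule together, a sketch of the simulation approach (note it marks visited cells by overwriting the input matrix with None, as assumed in the setup above):

```python
def matrix_in_spiral_order(matrix):
    # "Robot walk": collect elements one at a time, turning clockwise
    # whenever the next step would leave the grid or revisit a cell.
    n = len(matrix)
    SHIFT = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # Right, Down, Left, Up
    direction = 0          # start walking Right
    row = col = 0
    result = []
    for _ in range(n * n):
        result.append(matrix[row][col])
        matrix[row][col] = None  # mark visited (mutates the input)
        next_row = row + SHIFT[direction][0]
        next_col = col + SHIFT[direction][1]
        # Turn needed if the tentative step is off the grid or visited.
        if (not 0 <= next_row < n or not 0 <= next_col < n
                or matrix[next_row][next_col] is None):
            direction = (direction + 1) % 4  # turn clockwise
            next_row = row + SHIFT[direction][0]
            next_col = col + SHIFT[direction][1]
        row, col = next_row, next_col
    return result
```

On the 3x3 example [[1,2,3],[4,5,6],[7,8,9]] this reproduces the trace above and yields [1, 2, 3, 6, 9, 8, 7, 4, 5]. If the input must not be modified, a separate visited boolean grid works identically at the cost of O(n^2) extra space.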

Section 5.19: Rotate a 2D Array (Page 28 / PDF Page 64)

The Problem: You are given an n x n 2D array (a square matrix). You need to rotate this matrix by 90 degrees clockwise, in-place (O(1) additional space).

Example (from Figure 5.4):

Input Matrix A:

[ [ 1,  2,  3,  4],
  [ 5,  6,  7,  8],
  [ 9, 10, 11, 12],
  [13, 14, 15, 16] ]

Output Matrix A (after rotation):

[ [13,  9,  5,  1],
  [14, 10,  6,  2],
  [15, 11,  7,  3],
  [16, 12,  8,  4] ]

Observations / How Elements Move:

  • The first row of the original ([1,2,3,4]) becomes the last column of the rotated matrix, read from top to bottom ([1,2,3,4]).
  • The last column of the original ([4,8,12,16]) becomes the last row of the rotated matrix, reversed ([16,12,8,4]).
    • More precisely: the value at A[row][col] moves to A[col][n-1-row].

Approach 1: Using Extra Space (Not In-Place, for understanding)

  1. Create a new n x n matrix, say rotated_matrix.
  2. For each element A[row][col] in the original matrix:
    • The new position in rotated_matrix will be rotated_matrix[col][n - 1 - row].
    • rotated_matrix[col][n - 1 - row] = A[row][col]
  3. Copy rotated_matrix back to A.
  • Time: O(n^2)
  • Space: O(n^2) for rotated_matrix. Fails the in-place requirement.
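As a sketch, the extra-space version is just a direct application of the mapping A[row][col] → rotated[col][n-1-row] (the function name rotate_with_copy is ours):

```python
def rotate_with_copy(A):
    # Not in-place: builds a brand-new matrix, O(n^2) extra space.
    n = len(A)
    rotated = [[0] * n for _ in range(n)]
    for row in range(n):
        for col in range(n):
            # 90-degree clockwise mapping.
            rotated[col][n - 1 - row] = A[row][col]
    return rotated
```

Useful as a correctness reference when testing the in-place version.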

Approach 2: In-Place Rotation by Layers

This is the standard and efficient way to do it in-place.

Key Idea: The rotation can be performed layer by layer, from the outermost layer inwards. Within each layer, elements move in groups of four.

Consider the outermost layer of the 4x4 example:

  • A[0][0] (1) moves to where A[0][3] (4) was.
  • A[0][3] (4) moves to where A[3][3] (16) was.
  • A[3][3] (16) moves to where A[3][0] (13) was.
  • A[3][0] (13) moves to where A[0][0] (1) was. This is a 4-way swap for the corners: (1) -> (4) -> (16) -> (13) -> (1)

Original positions: A[0][0], A[0][3], A[3][3], A[3][0] After 90-deg clockwise rotation, their new conceptual positions are:

  • The value originally at A[3][0] (13) moves to A[0][0].
  • The value originally at A[0][0] (1) moves to A[0][3].
  • The value originally at A[0][3] (4) moves to A[3][3].
  • The value originally at A[3][3] (16) moves to A[3][0].

So, the values move like this in a cycle (let TL=TopLeft, TR=TopRight, BR=BottomRight, BL=BottomLeft for a specific group of 4):

temp = value_at_TL
value_at_TL = value_at_BL
value_at_BL = value_at_BR
value_at_BR = value_at_TR
value_at_TR = temp

Algorithm for In-Place Layer Rotation:

  1. The number of layers is n // 2.
  2. Iterate through layers, from layer = 0 to n // 2 - 1.
    • For each layer:
      • Define first = layer (first row/col index of this layer).
      • Define last = n - 1 - layer (last row/col index of this layer).
      • Now, iterate through the elements within this layer’s top row (excluding the last one which is handled as a corner of another 4-element cycle). Iterate i from first to last - 1.
        • An offset = i - first. This offset helps identify the corresponding elements in the other three sides of the current square within the layer.
        • For each i (or offset), perform a 4-way swap:
          1. Save top-left element: top = matrix[first][i] (which is matrix[first][first + offset])
          2. Left element moves to Top-Left: matrix[first][i] = matrix[last - offset][first]
          3. Bottom element moves to Left: matrix[last - offset][first] = matrix[last][last - offset]
          4. Right element moves to Bottom: matrix[last][last - offset] = matrix[i][last] (which is matrix[first + offset][last])
          5. Saved Top element moves to Right: matrix[i][last] = top

Section 5.20: Compute Rows in Pascal’s Triangle (Page 29-30 / PDF Page 65-66)

The Problem: You are asked to generate the first n rows of Pascal’s Triangle. Pascal’s Triangle has a specific structure:

  • The first row (Row 0 conceptually) is just [1].
  • Each subsequent row has one more element than the previous row.
  • The first and last element of every row is 1.
  • Every other element in a row is the sum of the two elements directly above it (to its left and right) in the previous row.

Figure 5.5 Example (First 5 rows):

Row 0:         1
Row 1:        1 1
Row 2:       1 2 1
Row 3:      1 3 3 1
Row 4:     1 4 6 4 1

Input: A non-negative integer n (number of rows to generate). Output: A list of lists, where each inner list represents a row of Pascal’s Triangle.

Example: If n = 5, output should be:

[
  [1],
  [1, 1],
  [1, 2, 1],
  [1, 3, 3, 1],
  [1, 4, 6, 4, 1]
]

The Core Rule (from the problem description): “the j-th entry in the i-th row is 1 if j = 0 or j = i, otherwise it is the sum of the (j - 1)-th and j-th entries in the (i - 1)-th row.” (Assuming 0-indexed rows i and 0-indexed entries j within a row).

Let Triangle[i][j] be the element at row i, column j.

  • Triangle[i][0] = 1
  • Triangle[i][i] = 1 (since row i has i+1 elements, so index i is the last one)
  • Triangle[i][j] = Triangle[i-1][j-1] + Triangle[i-1][j] for 0 < j < i.

Implementation Idea (Building Row by Row):

  1. Initialize an empty list pascal_triangle_result = [].
  2. Iterate i from 0 to n-1 (to generate n rows, which will be Row 0 to Row n-1).
    • For each i, create a current_row list. Row i will have i+1 elements.
    • The first element of current_row (current_row[0]) is always 1.
    • The last element of current_row (current_row[i]) is always 1 (if i > 0).
    • For the elements in between (from j=1 to j=i-1):
      • current_row[j] = pascal_triangle_result[i-1][j-1] + pascal_triangle_result[i-1][j].
      • This means we need access to the previous row (pascal_triangle_result[i-1]) that we already computed and stored.
    • Add current_row to pascal_triangle_result.
  3. Return pascal_triangle_result.

The Code Explained (generate_pascal_triangle - from the book):

def generate_pascal_triangle(n):
    if n == 0:
        return []

    # Initialize the result list.
    # The book's list comprehension initializes with all 1s, which is clever
    # because the first and last elements of each row are always 1.
    # Triangle[i] will have i+1 elements.
    result = [[1] * (i + 1) for i in range(n)]
    # Example for n=3: result will be [[1], [1,1], [1,1,1]]

    # Loop over every row. Row 0 and Row 1 are already correct
    # (all 1s from the initialization); the inner loop below only
    # does real work once i >= 2.
    for i in range(n): # Iterate through rows
        # Iterate through elements in the current row 'i'
        # We need to calculate elements from index 1 up to index i-1
        # (since index 0 and index i are already 1).
        for j in range(1, i): # This loop only runs if i >= 2
            # Sum of the two elements directly above in the previous row
            result[i][j] = result[i - 1][j - 1] + result[i - 1][j]

    return result

Let’s trace for n=4:

  1. result = [[1] * (i + 1) for i in range(4)] result = [[1], [1,1], [1,1,1], [1,1,1,1]]

  2. Outer loop for i in range(4):

    • i = 0: Inner loop for j in range(1, 0) does not run. result[0] is [1]. Correct.
    • i = 1: Inner loop for j in range(1, 1) does not run. result[1] is [1,1]. Correct.
    • i = 2: (Generating Row 2: [1,2,1])
      • Inner loop for j in range(1, 2):
        • j = 1:
          • result[2][1] = result[2-1][1-1] + result[2-1][1]
          • result[2][1] = result[1][0] + result[1][1]
          • result[2][1] = 1 + 1 = 2.
      • result[2] is now [1, 2, 1]. Correct. (It was [1,1,1], element at index 1 got updated).
    • i = 3: (Generating Row 3: [1,3,3,1])
      • Inner loop for j in range(1, 3):
        • j = 1:
          • result[3][1] = result[3-1][1-1] + result[3-1][1]
          • result[3][1] = result[2][0] + result[2][1]
          • result[3][1] = 1 + 2 = 3.
        • j = 2:
          • result[3][2] = result[3-1][2-1] + result[3-1][2]
          • result[3][2] = result[2][1] + result[2][2]
          • result[3][2] = 2 + 1 = 3.
      • result[3] is now [1, 3, 3, 1]. Correct.
  3. Return result.

Complexity:

  • The number of elements in Pascal’s triangle up to row n-1 is 1 + 2 + ... + n = n*(n+1)/2.
  • Each element is computed in O(1) time (one addition).
  • Time Complexity: O(n^2).
  • Space Complexity: O(n^2) to store the triangle.

Variant: Compute only the n-th row of Pascal’s triangle using O(n) space. (Here n would mean Row n, e.g., if input is 3, compute Row 3 [1,3,3,1]).

  • Idea: To compute Row i, you only need Row i-1. You don’t need Row i-2, Row i-3, etc.
  • So, you can keep track of just two rows: previous_row and current_row.
  • Initialize previous_row = [1].
  • Loop k from 1 up to n (to compute up to Row n):
    • current_row will have k+1 elements.
    • current_row[0] = 1, current_row[k] = 1.
    • For j from 1 to k-1: current_row[j] = previous_row[j-1] + previous_row[j].
    • Set previous_row = current_row for the next iteration.
  • The final current_row (or previous_row after the loop) is the answer.
  • This takes O(n) iterations to get to the n-th row. Each iteration does O(k) work (where k goes up to n). Total time: O(1+2+...+n) = O(n^2).
  • Space: O(n) because we only store two rows at a time, each of length at most n+1.
  • Further Optimization (single array, still O(n) space, O(n^2) time): You can even do it with one array by computing the next row’s values in reverse order within that array. If row currently stores Row i-1, compute Row i in place with row[j] = row[j] + row[j-1] for j from i-1 down to 1; going right to left means each old value is read before it is overwritten. row[0] stays 1, and you append a 1 at the end for the new last entry row[i].
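A sketch of that single-array variant (the right-to-left update is the key trick; the name pascal_row is ours):

```python
def pascal_row(n):
    # Compute Row n (0-indexed) of Pascal's triangle using one list.
    row = [1]  # Row 0
    for _ in range(n):
        # Update in place, right to left, so each old value is still
        # available when its right neighbor needs it.
        for j in range(len(row) - 1, 0, -1):
            row[j] += row[j - 1]
        row.append(1)  # every row ends (and starts) with 1
    return row
```

For example, pascal_row(4) builds up [1] → [1,1] → [1,2,1] → [1,3,3,1] → [1,4,6,4,1], using only O(n) space.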

Interview Takeaway:

  • Tests basic 2D array/list of lists manipulation.
  • Understanding the recursive definition of Pascal’s triangle is key.
  • The O(n^2) time and space for generating all n rows is standard.
  • The variant tests if you can optimize space when only the last result is needed, a common DP optimization technique (reducing space by noticing limited dependencies).