Longest Palindromic Substring
[[Dynamic Programming Primer]]
Problem
You are given one string, s. Your goal is to find the longest substring (a contiguous block) that is a palindrome.
Example: s = "babads". The answer is "bab" (or "aba"). Length is 3.
Brain storming
lets start with the basics a substring of string s is defined as s[start:end] start and end being the index inside this string that represent this substring. Why do we care about this? its because for a palindrome this play a very important role.
A palindrome is defined as a string where its reverse is also a valid string. for eg racecar is a palindrome since its reverse racecar is also a valid string.
So whatever logic we come up with we need both the start index and the end index of the subsequence to determine palindrome. Just like the previous problem [[Longest Common Subsequence]] if we need 2 variables to define a subproblem then its a 2D dynamic programming problem
Imagining “Substrings” and the DP Grid
Okay, we’ve established it’s 2D. But the meaning of the grid is different now. It’s not about two different strings; it’s about one string.
Let s = "babad". Our dp grid will be 5 x 5.
The grid is a lookup table for the answer to the question:DP[i][j] “Is this substring a palindrome?”
b a b a d (end index, j)
+---+---+---+---+---+
b 0 | . | . | . | . | . |
+---+---+---+---+---+
a 1 | | . | . | . | . |
+---+---+---+---+---+
b 2 | | | . | . | . |
+---+---+---+---+---+
a 3 | | | | . | . |
+---+---+---+---+---+
d 4 | | | | | . |
+---+---+---+---+---+
^
|
(start index, i)
(Note: We only care about the upper half of the grid where i <= j) why ? since i is the start index and j is the end index a string for eg s[3][1] doesn’t make any sense as it will be empty
Let’s pick an arbitrary cell: dp[1][3].
What does it mean?
i=1is the start index.j=3is the end index.- This cell corresponds to the substring
s[1...3], which is"aba". dp[1][3]will hold the boolean valueTruebecause"aba"is a palindrome.
How do we calculate it? (The Story of a Single Cell) To solve for
dp[i][j], we need a rule. For any string to be a palindrome, two things must be true.- Prompt 2 (The Recurrence):
Let’s figure out
dp[1][3]("aba").- Condition 1 (The Outsides): What must be true about the first and last characters of the substring
s[i...j]? - Condition 2 (The Insides): If the outer characters match, what must be true about the string between them? For
"aba", the inner string is"b". For"racecar", it’s"aceca". How do we know if this inner string is a palindrome? (Hint: We can look up the answer for the inner substrings[i+1...j-1]in ourdptable. Which cell is that?) Combine these two to form the recurrence:dp[i][j] = (condition_1) AND (condition_2).
- Condition 1 (The Outsides): What must be true about the first and last characters of the substring
- Prompt 2 (The Recurrence):
Let’s figure out
The “First Move” Test (Base Cases): Our recurrence works for strings of length 3 or more. But what about the simplest cases?
- Prompt 3 (The Base Cases):
- What about
dp[i][i]? (A substring of length 1, like"b"). Is it a palindrome? - What about
dp[i][i+1]? (A substring of length 2, like"ba"or"bb"). Under what one condition is it a palindrome?
- What about
- Prompt 3 (The Base Cases):
The Visualisation (Table Fill Order): Look at your recurrence. To compute
dp[i][j], you needdp[i+1][j-1].i+1is the row below you.j-1is the column to the left of you.- Prompt 4 (The Order): If you need the answer from a cell “below-left” of you, can you fill the table in the standard top-to-bottom, left-to-right order? What order must you fill it in?
This is not your typical tabulation approach now its your memoization form of DP since dp[i][j] depends upon dp[i+1] a future value. So what we need to do is calculate values and then write down the answers as we build them. Also we would not be able to build this in the standard flow of DP grid (left-> right) (top->bottom)
Lets try to define the algorithm
- Get the string length lets call it
n - initialise a dp grid of
nxnwithfalsevalue as default - lets first calculate for base cases values
- string of length 1 , by definition they are palindrome
- string of length 2, if the first and the last characters are same then they are palindrome
- Store both these strings values in your DP grid hence getting are base cases jotted down
- now for string of length 3 , we know for a fact we cannot go the standard way then how should we do it
- One way to do it try to fill the table in reverse bottom to top , right to left but we might face challenges writing that code
- The other way is to
solve for lengthsomething we are already doing for strings of length 1 and 2 , we will try to solve for length of 3 and greater (till n+1) we can compute the start index as(n - length +1)- lets take “babads” as example
n= 6length = 3bthen the substring isabawith its start index asbat 0 and end indexdis 4 then we will iterate for each string of this length inside the string asfor i in range(n-length +1). here we will check for start index0 , 1, 2, 3. with these start indexes we will calculate the end indexjasi + length -1so the j becomes2, 3,4,5hence the substrings to check becomes[0:2] = babs[1:3] = abas[2:4] = bads[3:5] = ads
- Now we will have our substring to check with the
outerandinnerlogic- the outer logic will be simple
s[i] == s[j] - the inner logic would be
dp[i+1][j-1] - hence the final logic is
dp[i][j] = (s[i]==s[j] and dp[i+1][j-1]
- the outer logic will be simple
- lets take “babads” as example
Below is the python implementation
s = "babads"
n = len(s)
if n < 2:
# return s # Handle empty or single-character strings
# 1. Initialization
dp = [[False for _ in range(n)] for _ in range(n)]
longest_start = 0
max_len = 1 # Default answer is a single character
# 2. Base Case 1: Substrings of length 1
for i in range(n):
dp[i][i] = True
# 3. Base Case 2: Substrings of length 2
for i in range(n - 1):
if s[i] == s[i+1]:
dp[i][i+1] = True
# If we find a length-2 palindrome, update our answer
longest_start = i
max_len = 2
# 4. Main Logic: Substrings of length 3 and greater
for length in range(3, n + 1):
for i in range(n - length + 1):
j = i + length - 1
# --- The Core Recurrence ---
# First, check if the inner substring is a palindrome AND
# if the outer characters match.
is_inner_palindrome = dp[i+1][j-1]
if is_inner_palindrome and s[i] == s[j]:
# If both are true, then the whole substring is a palindrome
dp[i][j] = True
# --- The Tracking Logic ---
# Now that we've confirmed s[i..j] is a palindrome,
# check if it's the new longest one.
if length > max_len:
max_len = length
longest_start = i
# 5. Return the result
print(s[longest_start : longest_start + max_len])