Merge K Sorted Lists
[[Heaps]]
Problem
You are given an array of k linked-lists lists, each linked-list is sorted in ascending order.
Merge all the linked-lists into one sorted linked-list and return it.
Example 1:
Input: lists = [[1,4,5],[1,3,4],[2,6]]
Output: [1,1,2,3,4,4,5,6]
Explanation: The linked-lists are:
[
1->4->5,
1->3->4,
2->6
]
merging them into one sorted linked list:
1->1->2->3->4->4->5->6
Example 2:
Input: lists = [] Output: []
Example 3:
Input: lists = [[]] Output: []
Brainstorming
Brute force
# class ListNode:
# def __init__(self, val=0, next=None):
# self.val = val
# self.next = next
we would need to have a way to to iterate over all lists at once or we can do it in passes too where in we try to store the list in the final output
so what i am saying is lets take the example [[1,2], [1,2,3]
we would need 2 things
- at any given point what is the current largest item in our final list so lets say in the first pass we added
[1,2]so the largest element in the final list is 2 - so when we go for the second list when we iterate we just check if the element in that list is smaller or equal to the largest element then they come before the largest element in the list that is
[1,2]largest element 2- iterating over
[1,2,3]check if the current node val is smaller or equal to 2- 1 yes we so we prepend it
[1,1,2] - 2 yes so we prepend it
[1,1,2,2] - 3 no , so now we have to do 3 things
- 1 yes we so we prepend it
- we add the node 3 after the largest element i.e
[1,1,2,2,3] - we then update the largest element in the final list to 3
- we continue this till we are done with all the lists are done
from typing import List, Optional
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
def mergeKLists(self, lists: List[Optional[ListNode]]) -> Optional[ListNode]:
final_list = []
number_of_lists = len(lists) -1 # this will help us define the number of passes
current_list_index = 0
while current_list_index <= number_of_lists: # this loop will run till the time we have lists to explore
current_list = lists[current_list_index]
current_list_max = - float('inf')
current_node = current_list[0]
while current_node : # this while loop will run
if curren_node.val >= current_list_max:
current_list_max = max(curren_node.val, current_list_max)
the only problem here is how do we append left in the final list? also is there a way to do the max
- 🍅 (pomodoro::WORK) (duration:: 20m) (begin:: 2025-12-13 14:29) - (end:: 2025-12-13 14:49)
Optimal solution
You’ve started with a very creative idea, thinking about merging lists sequentially and inserting elements based on the max value seen so far. This is good brainstorming.
However, let’s analyze the approach. Appending to the front of a list (or inserting into the middle) is a very slow O(N) operation. Doing this repeatedly will lead to a very inefficient algorithm, likely O(N*K) or worse, where N is the total number of nodes and K is the number of lists.
Let’s step back and think about the core task. We are trying to build one single sorted list.
The Self-Nudge Process:
Question 1: What is the absolute simplest version of this problem?
If you weren’t given K lists, but just two sorted lists, how would you merge them? Describe the standard “merge two sorted lists” algorithm. What data structure(s) or pointers would you use?
we use :
class ListNode:
def __init__(self, data=None, next_node=None):
self.data = data
self.next = next_node
# L1 = 2 -> 5 -> 7
l1_node7 = ListNode(7)
l1_node5 = ListNode(5, l1_node7)
L1 = ListNode(2, l1_node5)
# L2 = 3 -> 11
l2_node11 = ListNode(11)
L2 = ListNode(3, l2_node11)
def merge_two_sorted_list(l1:ListNode, l2:ListNode):
dummy_head = ListNode()
tail = dummy_head
while l1 is not None and l2 is not None:
if l1.data <= l2.data:
tail.next = l1
l1 = l1.next
else:
tail.next = l2
l2 = l2.next
tail = tail.next
tail.next = l1 or l2
return dummy_head.next
final_list = merge_two_sorted_list(L1,L2)
while final_list:
print("->", final_list.data)
final_list = final_list.next
Now, let’s use this as our building block.
The original problem is “Merge K Sorted Lists.”
We now have a perfect function that can merge two.
Question (The Self-Nudge):
How can you use your merge_two_sorted_list function to solve the K-list problem? Describe a simple, high-level strategy. What would be the time complexity of this approach?
we can just expand this problem for k lists
while l1 is not None or l2 is not None ...... lk is Not None
and inside that also we can expand the if else tree for the checks
if l1.data <= l2.data elif l2.data <= l3,data
but the core problem still remains
we would need to know before hand how many lists are there
That’s a very creative thought! You’re trying to generalise the while loop logic directly to K lists. It’s a great instinct, but let’s analyse where it gets tricky, as you correctly pointed out.
- The while loop condition would become complex.
- The if/elif/else chain to find the minimum of K different nodes would be very long and inefficient. For each node you append, you’d have to do K-1 comparisons to find the next minimum.
This feels like it’s getting complicated. Let’s step back.
You have a perfect tool: merge_two_sorted_list(l1, l2).
You have a list of K lists: [list1, list2, list3, ..., listK]
Let’s try a simpler strategy.
What if you just merge them one by one?
- Take list1 and list2. Merge them into a merged_list_1_2.
- Take merged_list_1_2 and list3. Merge them into merged_list_1_2_3.
- Take merged_list_1_2_3 and list4. Merge them.
- …and so on, until you’ve merged all K lists.
This is a valid, brute-force approach that directly uses the tool you just built.
from typing import List, Optional
class ListNode:
def __init__(self, data=0, next=None):
self.data = data
self.next = next
def build_list(arr):
dummy = ListNode(0)
curr = dummy
for x in arr:
curr.next = ListNode(x)
curr = curr.next
return dummy.next
lists = [
build_list([1, 4, 7]),
build_list([2, 5, 8]),
build_list([3, 6, 9])
]
def helper_merge_k_sorted_list(l1:ListNode, l2:ListNode):
dummy_node = ListNode()
tail = dummy_node
while l1 is not None and l2 is not None:
if l1.data <= l2.data:
tail.next = l1
l1 = l1.next
else:
tail.next = l2
l2 = l2.next
tail = tail.next
tail.next = l1 or l2
return dummy_node.next
# The brute force
def mergeKLists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
if not lists: # Handle empty list case
return None
# Start with the first list as our initial merged result
merged_list = lists[0]
# Loop from the *second* list to the end
for i in range(1, len(lists)):
# At each step, merge the current 'merged_list' with the next list in the input
merged_list = helper_merge_k_sorted_list(merged_list, lists[i])
return merged_list
k_sorted_list = mergeKLists(lists)
while k_sorted_list:
print("->",k_sorted_list.data)
k_sorted_list = k_sorted_list.next
- 🍅 (pomodoro::WORK) (duration:: 30m) (begin:: 2025-12-13 14:49) - (end:: 2025-12-13 15:21)
- 🥤 (pomodoro::BREAK) (duration:: 15m) (begin:: 2025-12-13 15:22) - (end:: 2025-12-13 15:37)
This is a complete, correct, and robust O(N*K) solution. You have successfully built it from the ground up, starting with the two-list merge and scaling it to handle K lists.
Now, let’s go back to the question I asked before we started coding the brute-force.
The “Self-Nudge” towards the Optimal Solution:
The O(N*K) approach is inefficient because we keep re-traversing the beginning of our merged_list.
The core task at every single step of building the final list is to answer the question:
“Out of the K nodes currently at the front of each of the K lists, which one is the smallest?”
We need a data structure that is excellent at managing a collection of K items and repeatedly giving us the smallest one.
- We will insert up to K nodes into it at the start.
- We will repeatedly extract_min (K-1 comparisons in a simple list, but we can do better).
- When we extract a node, we will insert its next node into the structure.
That is the perfect question to ask. “Heap” is the right answer, but “how does it work?” is the key to the algorithm.
Let’s be very precise. The heap will not store nodes from the final_list. The final_list is what we are building.
The heap will only ever store at most K nodes (where K is the number of input lists). Specifically, it will store the “current” node from each of the K lists.
Let’s visualise the process:
Input: lists = [ [1,4,7], [2,5,8], [3,6,9] ] (K=3)
Output: A dummy_head and a tail pointer, just like in the two-list merge.
Algorithm:
Initialisation:
- Create an empty min-heap. A min-heap is a data structure that always has the smallest element at the top, ready to be popped in
O(log K)time. - Create
dummy_head = ListNode()andtail = dummy_head.
- Create an empty min-heap. A min-heap is a data structure that always has the smallest element at the top, ready to be popped in
Seed the Heap:
- Go through each of the
Kinput lists. Take the very first node from each non-empty list and push it onto the min-heap. - After this step, the heap contains:
[ (1, node_1), (2, node_2), (3, node_3) ].- (Note: To store nodes in a Python
heapq, which doesn’t know how to compareListNodeobjects, we store them as tuples:(node.data, node)or(node.data, id(node), node). The heap will sort based on the first element of the tuple, the value). Let’s just think of it as storing the nodes for now.
- (Note: To store nodes in a Python
- Go through each of the
The Main
whileLoop:- The loop continues as long as the heap is not empty.
Inside the Loop (The Core “Extract -> Append -> Refill” Logic):
- a. Extract Minimum: Pop the smallest node from the heap. This is guaranteed to be the overall smallest node currently available from any list.
popped_node = heap.pop()-> This isnode_1(value 1).
- b. Append to Result: Attach this
popped_nodeto the end of your final list.tail.next = popped_nodetail = tail.next
- c. Refill the Heap: Look at the node you just popped (
node_1). Does it have anextnode? (node_1.nextpoints tonode_4).- If it does, push that next node onto the heap.
heap.push(node_4).- The heap now contains
[ (2, node_2), (3, node_3), (4, node_4) ].
- a. Extract Minimum: Pop the smallest node from the heap. This is guaranteed to be the overall smallest node currently available from any list.
Repeat the Loop:
while heap is not empty:- a. Extract Min: Pop
node_2(value 2). - b. Append:
tail.next = node_2,tail = tail.next. - c. Refill: Push
node_2.next(which isnode_5) onto the heap. - Heap now:
[ (3, node_3), (4, node_4), (5, node_5) ]
- a. Extract Min: Pop
This process continues until all nodes from all lists have been pushed onto and popped off the heap, and the heap becomes empty.
from typing import List, Optional
import heapq
class ListNode:
def __init__(self, data=0, next=None):
self.data = data
self.next = next
def build_list(arr):
dummy = ListNode(0)
curr = dummy
for x in arr:
curr.next = ListNode(x)
curr = curr.next
return dummy.next
lists = [
build_list([1, 4, 7]),
build_list([2, 5, 8]),
build_list([3, 6, 9])
]
print(lists)
def merge_k_sorted_list(lists):
K = len(lists)
min_heap = []
dummyNode = ListNode()
tail = dummyNode
for i in range(K):
current_head = lists[i]
heapq.heappush(min_heap, (current_head.data, current_head))
while min_heap is not None:
# extract
popped_node = heapq.heappop(min_heap)
# append
tail.next = popped_node[1]
tail = tail.next
if popped_node[1].next:
heapq.heappush(min_heap, (popped_node[1].data, popped_node[1]))
return dummyNode.next
k_sorted_list = merge_k_sorted_list(lists)
while k_sorted_list:
print("->")
print(k_sorted_list.data)
k_sorted_list = k_sorted_list.next
This is an excellent implementation. You’ve correctly translated the heap-based logic into Python code. You are just a few small refinements away from a perfectly robust solution.
Refinement #1: The Heap Tuples (A classic Python heap gotcha)
You are pushing (current_head.data, current_head) onto the heap. This works great as long as all the node data values are unique.
But what if you have two lists like [1, 5] and [1, 6]?
- You would push
(1, <ListNode 1a>)and(1, <ListNode 1b>). - If
heapqneeds to compare these two tuples (which it will), it will see that the first elements (1and1) are equal. It will then try to compare the second elements to break the tie:<ListNode 1a>and<ListNode 1b>. - Python’s
heapqdoes not know how to compare twoListNodeobjects (<ListNode> < <ListNode>), so it will crash with aTypeError.
The Fix: We need a tie-breaker that is always unique. A simple counter or an object’s id() works perfectly. We can add it as the second element of the tuple.
- Revised Tuple:
(node.data, unique_id, node)
Refinement #2: The while Loop Condition
You wrote: while min_heap is not None:
In Python, an empty list [] is not None. It’s just a list with a length of 0. The loop should continue as long as the heap has elements in it.
- Question: What is the Pythonic way to check if a list is not empty?
Refinement #3: The “Refill” Logic
You have:
if popped_node[1].next:
heapq.heappush(min_heap, (popped_node[1].data, popped_node[1]))
You are pushing the same node you just popped back onto the heap. You need to push its next node.
- Question: What should be pushed onto the heap in the refill step?
Refinement #4: Initial Seeding Edge Case
What if one of the lists in the input lists is empty (i.e., None)?
Your current_head = lists[i] line will be None, and current_head.data will crash.
- Question: What
ifcheck should you add inside your initialforloop before you try to push a node onto the heap?
(Prompt): Address these four refinements. The tuple tie-breaker is a new concept, but the other three are logical fixes to your existing code. What does the final, robust version look like?
from typing import List, Optional
import heapq
class ListNode:
def __init__(self, data=0, next=None):
self.data = data
self.next = next
def build_list(arr):
dummy = ListNode(0)
curr = dummy
for x in arr:
curr.next = ListNode(x)
curr = curr.next
return dummy.next
lists = [
build_list([1, 4, 7]),
build_list([2, 5, 8]),
build_list([3, 6, 9])
]
def merge_k_sorted_lists(lists):
K = len(lists)
dummy_node = ListNode()
tail = dummy_node
min_heap = []
for i in range(K):
if lists[i]:
heapq.heappush(min_heap, (lists[i].data, id(lists[i]) ,lists[i]))
while min_heap:
# extract
popped_node = heapq.heappop(min_heap)
# append
tail.next = popped_node[2]
tail = tail.next
if popped_node[2].next:
heapq.heappush(min_heap, (popped_node[2].next.data, id(popped_node[2].next), popped_node[2].next))
return dummy_node.next
k_sorted_list = merge_k_sorted_lists(lists)
while k_sorted_list:
print("->", k_sorted_list.data)
k_sorted_list = k_sorted_list.next
Complexity Analysis:
- Let
Nbe the total number of nodes in all lists combined. - Let
Kbe the number of lists. - Seeding the heap: You push
Knodes.K * O(log K). - Main Loop: The loop will run
Ntimes in total (once for every node).- Inside the loop, you do one
pop(O(log K)) and onepush(O(log K)).
- Inside the loop, you do one
- Total Time Complexity:
O(K * log K + N * log K), which simplifies toO(N * log K).
- 🍅 (pomodoro::WORK) (duration:: 30m) (begin:: 2025-12-13 15:38) - (end:: 2025-12-13 17:44)
🍅 (pomodoro::WORK) (duration:: 30m) (begin:: 2025-12-13 17:54) - (end:: 2025-12-13 21:22)