How Can I Implement A Custom Iterator Class In Python That Efficiently Handles Lazy Evaluation Of A Complex, Nested Data Structure (e.g. A Tree Or Graph), While Also Providing Support For Multiple Iteration Protocols (e.g. `__iter__`, `__reversed__`, `__getitem__`) And Properly Handling Edge Cases Such As Iteration Over An Empty Structure Or A Structure With Cyclic References?
To implement a custom iterator class in Python that efficiently handles lazy evaluation of a complex, nested data structure (like a tree or graph), while supporting multiple iteration protocols and handling edge cases, follow these steps:
Approach
- Node Class: Define a simple
Node
class to represent each element in the data structure, containing a value and a list of children. - Iterator Class: Create an
Iterator
class that supports lazy evaluation using depth-first search (DFS) with a stack. Track visited nodes to prevent infinite loops in cyclic structures. - Lazy Evaluation: Implement the
__iter__
and__next__
methods to yield values on-the-fly without preloading all data. - Reversed Iteration: Support
__reversed__
by collecting all nodes in a list during forward iteration and then reversing the list. - Indexing and Slicing: Implement
__getitem__
to allow accessing elements by index or slice, using a cache to store already processed nodes for efficient access.
Solution Code
class Node:
def __init__(self, value, children=None):
self.value = value
self.children = children if children is not None else []
class Iterator:
def init(self, root):
self.root = root
self.stack = [root] if root is not None else []
self.visited = set()
self.cache = []
def __iter__(self):
return self
def __next__(self):
if not self.stack:
raise StopIteration
node = self.stack.pop()
if node in self.visited:
return self.__next__()
self.visited.add(node)
self.cache.append(node.value)
for child in reversed(node.children):
if child not in self.visited:
self.stack.append(child)
return node.value
def __reversed__(self):
new_iterator = Iterator(self.root)
nodes = list(new_iterator)
return iter(reversed(nodes))
def __getitem__(self, index):
if isinstance(index, slice):
start, stop, step = index.indices(len(self.cache))
result = []
for i in range(start, stop, step):
if i >= len(self.cache):
while len(self.cache) <= i:
try:
next(self)
except StopIteration:
break
if i < len(self.cache):
result.append(self.cache[i])
return result
else:
if index < 0:
index += len(self.cache)
if index < 0:
raise IndexError("Index out of range")
while len(self.cache) <= index:
try:
next(self)
except StopIteration:
break
if index >= len(self.cache):
raise IndexError("Index out of range")
return self.cache[index]

if name == "main":
# Create a sample tree
root = Node(1, [
Node(2),
Node(3, [
Node(4),
Node(5)
])
])
# Forward iteration
print("Forward iteration:")
it = Iterator(root)
for val in it:
print(val)
# Reversed iteration
print("\nReversed iteration:")
for val in reversed(it):
print(val)
# Indexing
print("\nAccessing index 2:")
print(it[2])
# Slicing
print("\nSlicing [1:4:2]:")
print(it[1:4:2])
Explanation
- Node Class: Each
Node
has a value and a list of children, allowing the creation of tree or graph structures. - Iterator Initialization: The
Iterator
class initializes with a root node, setting up a stack for DFS traversal and a set to track visited nodes to avoid cycles. - Lazy Evaluation: The
__next__
method processes each node, adding its value to the cache and pushing its children onto the stack in reverse order to maintain the correct traversal order. - Reversed Iteration: The
__reversed__
method creates a new iterator, collects all nodes, and returns them in reverse order, ensuring the original iterator's state isn't affected. - Indexing and Slicing: The
__getitem__
method uses the cache to provide efficient access to elements by index or slice, iterating as needed to populate the cache up to the requested position.
This approach efficiently handles lazy evaluation, supports multiple iteration protocols, and manages edge cases like empty structures and cyclic references.