Is it possible to lazily traverse a recursive data-structure with O(1)
memory usage, tail-call optimized?
Let's say that we have a recursive data-structure, like a binary tree.
There are many ways to traverse it, and they have different memory-usage
profiles. For instance, if we were to simply print the value of each node,
using pseudo-code like the following in-order traversal...
visitNode(node) {
if (node == null) return;
visitNode(node.leftChild);
print(node.value);
visitNode(node.rightChild);
}
...our memory usage would be constant, but due to the recursive calls, we
would increase the size of the call stack. On very large trees, this could
potentially overflow it.
Let's say that we decided to optimize for call-stack size; assuming that
this language is capable of proper tailcalls, we could rewrite this as the
following pre-order traversal...
visitNode(node, nodes = []) {
if (node != null) {
print(node.value);
visitNode(nodes + [node.left, node.right]);
} else if (node == null && nodes.length != 0 ) {
visitNode(nodes.head, nodes.tail);
} else return;
}
While we would never blow the stack, we would now see heap usage increase
linearly with respect to the size of the tree.
Let's say we were then to attempt to lazily traverse the tree - here is
where my reasoning gets fuzzy. I think that even using a basic lazy
evaluation strategy, we would grow memory at the same rate as the tailcall
optimized version. Here is a concrete example using Scala's Stream class,
which provides lazy evaluation:
sealed abstract class Node[A] {
def toStream: Stream[Node[A]]
def value: A
}
case class Fork[A](value: A, left: Node[A], right: Node[A]) extends Node[A] {
def toStream: Stream[Node[A]] = this #::
left.toStream.append(right.toStream)
}
case class Leaf[A](value: A) extends Node[A] {
def toStream: Stream[Node[A]] = this #:: Stream.empty
}
Although only the head of the stream is strictly evaluated, anytime the
left.toStream.append(right.toStream) is evaluated, I think this would
actually evaluate the head of both the left and right streams. Even if it
doesn't (due to append cleverness), I think that recursively building this
thunk (to borrow a term from Haskell) would essentially grow memory at the
same rate. Rather than saying, "put this node in the list of nodes to
traverse", we're basically saying, "here's another value to evaluate that
will tell you what to traverse next", but the outcome is the same; linear
memory growth.
The only strategy I can think of that would avoid this is having mutable
state in each node declaring which paths have been traversed. This would
allow us to have a referentially transparent function that says, "Given a
node, I will tell you which single node you should traverse next", and we
could use that to build an O(1) iterator.
Is there another way to accomplish O(1), tailcall optimized traversal of a
binary tree, possibly without mutable state?
No comments:
Post a Comment