Difference between revisions of "Edge Interface For Forest Iterators"

From Adobe Open Source Wiki
Jump to: navigation, search
(initial population)
 
 
Line 8: Line 8:
  
 
* <code>v</code>, a vertex descriptor (in the forest implementation this is the pointer to the node).
 
* <code>v</code>, a vertex descriptor (in the forest implementation this is the pointer to the node).
* <code>e</code>, an <code>in_edge</code> index - this specifies how we are pointing at the node (this was the "leading bool" and is now the <code>edge()</code> and the topic of this document).
+
* <code>e</code>, an <code>in_edge</code> index - this specifies how we are pointing at the node.
 
* <code>p</code>, a property (this is fixed to a single property of the <code>value_type</code> of the iterator).
 
* <code>p</code>, a property (this is fixed to a single property of the <code>value_type</code> of the iterator).
 
* <code>f</code>, a function to move from an <code>in_edge</code> to a corresponding <code>out_edge</code> (for a fullorder iterator on a forest this is the identity function).
 
* <code>f</code>, a function to move from an <code>in_edge</code> to a corresponding <code>out_edge</code> (for a fullorder iterator on a forest this is the identity function).
Line 33: Line 33:
 
==Child iterators==
 
==Child iterators==
  
I think the same description holds for <code>child_iterator</code> except <code>f()</code> is replaced with <code>trailing_of(*v)</code> and it no longer depends on which <code>in_edge</code> you arrive on.
+
The same description holds for <code>child_iterator</code> except <code>f()</code> is replaced with <code>trailing_of(*v)</code> and it no longer depends on which <code>in_edge</code> you arrive on.
  
 
==The reverse iterator problem with edge()==
 
==The reverse iterator problem with edge()==
  
The standard notion of a reverse iterator transforms a range <code>[first, last)</code> to the range <code>(first, last]</code> without introducing any extra position by having the base iterator point one after the element. Reverse iterators for a strict sequence work fine with a forest, but a reverse iterator which supports <code>edge()</code> is trickier.
+
The standard notion of a reverse iterator transforms a range <code>[first, last)</code> to the range <code>(first, last]</code> without introducing any extra positions by having the base iterator point one after the element. Reverse iterators for a strict sequence work fine with a forest, but a reverse iterator which supports <code>edge()</code> is trickier.
  
The next (or prior) edge is dependent on where you came from and it is a function of a vertex. So to read the edge when your iterator is after the current mark you need to <code>decrement</code> - but this means you are reading a vertex which is _before_ your offset range - this is because the edge is defined even on an end iterator (the vertex exists even if the property doesn't).
+
The next (or prior) edge is dependent on where you came from and it is a function of a vertex. So to read the edge when the iterator is after the current mark you need to <code>decrement</code> - but this means you are reading a vertex which is _before_ your offset range - this is because the edge is defined even on an end iterator (the vertex exists even if the property doesn't).
  
Right away then we violate our range just to read an <code>edge()</code> - to set an edge you defer the actual assignment of the edge until you reach the node - then set the edge to determine where to go next - so incrementing a reverse iterator looks like:
+
This violates our range just to read an <code>edge()</code> - to set an edge you defer the actual assignment of the edge until you reach the node - then set the edge to determine where to go next - so incrementing a reverse iterator looks like:
  
 
* Decrement the base iterator
 
* Decrement the base iterator
Line 58: Line 58:
 
This all works on a forest, despite the range violations, because a forest is implemented as a loop structure and the prior vertex of any node is guaranteed to exist (that is, we have a <code>root()</code> as well as an <code>end()</code> and <code>--begin()</code> will yield <code>root()</code>).
 
This all works on a forest, despite the range violations, because a forest is implemented as a loop structure and the prior vertex of any node is guaranteed to exist (that is, we have a <code>root()</code> as well as an <code>end()</code> and <code>--begin()</code> will yield <code>root()</code>).
  
A simpler form of a reverse iterator could simply rely on the fact that the forest ranges are symmetrical. In that case <code>rend()</code> could return <code>root()</code> and we would scrap the whole off by one. This does not change the requirements of our reverse iterator at all, but for anyone familiar with STL reverse iterators it does change the behavior of <code>base()</code>.
+
A simpler form of a reverse iterator relies on the fact that the forest ranges are symmetrical. In that case <code>rend()</code> could return <code>root()</code> and we would scrap the whole off by one. This does not change the requirements of our reverse iterator at all, but for anyone familiar with STL reverse iterators it does change the behavior of <code>base()</code>.
  
 
===Final Resolution===
 
===Final Resolution===
  
The simpler form mentioned above was implemented and I've decided the simplifications are well worth it. The off-by-one issue of <code>base()</code> is resolved by simply having <code>base()</code> return <code>next(base_m)</code>. The requirement imposed by the reverse iterator is actually the same requirement imposed by <code>set_next()</code>. For any valid range <code>[first, last)</code> there must exist a valid range <code>(prior(first), last)</code>.
+
The simpler form mentioned above was implemented and I've decided the simplifications are well worth it. The off-by-one issue of <code>base()</code> is resolved by simply having <code>base()</code> return <code>next(base_m)</code>. The requirement imposed by the reverse iterator is actually the same requirement imposed by <code>set_next()</code>. For any valid range <code>[first, last)</code> there must exist a valid range <code>(prior, last)</code> such that next(prior) == first.

Latest revision as of 22:21, 13 March 2007

Contents

Introduction

We can define the concept of the forest iterator in terms of a general graph using the terminology from Boost graph - this is useful because at some point we may wish to generalize these constructs to apply to other graph structures.

A forest is (loosely) a Bidirectional Graph.

A forest iterator consists of the following:

  • v, a vertex descriptor (in the forest implementation this is the pointer to the node).
  • e, an in_edge index - this specifies how we are pointing at the node.
  • p, a property (this is fixed to a single property of the value_type of the iterator).
  • f, a function to move from an in_edge to a corresponding out_edge (for a fullorder iterator on a forest this is the identity function).

An edge is a directed link (with a from() and to() function) - and because it is bidirectional you can find the corresponding input link.

The increment operation can be defined as follows in pseudo code (once I'm more fluent in Boost Graph this could be described more concretely):

current_edge = out_edge(*v, e);
v = to(curent_edge);
e = f(find_in_edge(*v, current_edge));

In this way we can think of a forest iterator as being a traversal defined over a Bidirectional Graph. The edge() interface on the iterator allows you to manipulate what in_edge is being used to reach the vertex, and correspondingly what out_edge will be followed next. The type of the result is the index for the edge sequence. In the case of forest this is an array, so std::size_t is the correct type, even though it will only be a value of 0 or 1.

The signature for edge() is:

std::size_t edge() const; // read access to the edge
std::size_t& edge(); // write access to the edge

Child iterators

The same description holds for child_iterator except f() is replaced with trailing_of(*v) and it no longer depends on which in_edge you arrive on.

The reverse iterator problem with edge()

The standard notion of a reverse iterator transforms a range [first, last) to the range (first, last] without introducing any extra positions by having the base iterator point one after the element. Reverse iterators for a strict sequence work fine with a forest, but a reverse iterator which supports edge() is trickier.

The next (or prior) edge is dependent on where you came from and it is a function of a vertex. So to read the edge when the iterator is after the current mark you need to decrement - but this means you are reading a vertex which is _before_ your offset range - this is because the edge is defined even on an end iterator (the vertex exists even if the property doesn't).

This violates our range just to read an edge() - to set an edge you defer the actual assignment of the edge until you reach the node - then set the edge to determine where to go next - so incrementing a reverse iterator looks like:

  • Decrement the base iterator
  • Set the edge to our next edge
  • remember what the next edge should be

We run into a big problem if we set the edge and then decrement the reverse iterator - setting the edge may have pivoted on the previous node and we should land on a node which isn't reachable in one step from the current base!

The algorithm for decrementing the reverse iterator is

  • Decrement the base iterator
  • Set the edge to our next edge
  • increment the base iterator twice
  • remember what the next edge should be

This all works on a forest, despite the range violations, because a forest is implemented as a loop structure and the prior vertex of any node is guaranteed to exist (that is, we have a root() as well as an end() and --begin() will yield root()).

A simpler form of a reverse iterator relies on the fact that the forest ranges are symmetrical. In that case rend() could return root() and we would scrap the whole off by one. This does not change the requirements of our reverse iterator at all, but for anyone familiar with STL reverse iterators it does change the behavior of base().

Final Resolution

The simpler form mentioned above was implemented and I've decided the simplifications are well worth it. The off-by-one issue of base() is resolved by simply having base() return next(base_m). The requirement imposed by the reverse iterator is actually the same requirement imposed by set_next(). For any valid range [first, last) there must exist a valid range (prior, last) such that next(prior) == first.