andyp at osdl.org
Thu Feb 13 12:28:54 PST 2003
On Thu, 2003-02-13 at 11:13, Rod Van Meter wrote:
> First off, this looks like very useful functionality. I'm happy to
> see it. And it comes with documentation, too!
I've made a first pass through the PDF file that describes what it is
and how it does it.
Cold-medicine-induced rambling follows: ;^)
The first thing that strikes me is that it is similar in routing and
neighbor discovery to several distributed memory message-passing systems
developed during the mid-'80s. They were characterized as systems
composed of nodes connected only by point-to-point networks, and all
routing was performed by store-and-forward of messages by the nodes
within the system. Several platforms were built this way, including
those based upon INMOS Transputers, the early nCUBE, the Intel iPSC/1,
and a few other early hypercubes.
If I understand the document correctly, the "Hello" mechanism and
routing table discovery/maintenance used by TIPC can have scaling
complications on very large systems (100s to 1000s of communicating
agents) when configured with insufficient internal connectivity. A
spanning tree-based algorithm for maintaining the routing tables looks
like the ideal solution (nearly all of the needed adjacency information
is present) to apply here, rather than hardware-based broadcast on a
subnet or software-based "replicast."
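To make the spanning-tree point concrete, here is a minimal sketch in Python. It is not TIPC's algorithm; it only assumes each node already knows its direct neighbors (the adjacency information a "Hello" exchange would yield). The names `hypercube`, `bfs_spanning_tree`, `tree_links` and `count_links` are hypothetical helpers made up for this illustration. It builds a breadth-first spanning tree and compares how many links a routing update would cross on the tree versus on every link.

```python
from collections import deque


def hypercube(dim):
    """Adjacency map of a dim-dimensional hypercube (illustrative topology only)."""
    return {n: {n ^ (1 << b) for b in range(dim)} for n in range(2 ** dim)}


def bfs_spanning_tree(adjacency, root):
    """Return {node: parent} for a breadth-first spanning tree rooted at root."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def tree_links(parent):
    """Undirected links that belong to the spanning tree."""
    return {frozenset((child, par)) for child, par in parent.items()
            if par is not None}


def count_links(adjacency):
    """Total number of undirected links in the topology."""
    return sum(len(neighbors) for neighbors in adjacency.values()) // 2


cube = hypercube(3)
tree = bfs_spanning_tree(cube, root=0)
print("links in topology:", count_links(cube))
print("links in spanning tree:", len(tree_links(tree)))
```

An update sent along the tree crosses N-1 links once each, while flooding every link of a d-dimensional hypercube touches d * 2^(d-1) links, so the gap widens as the system grows.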
The "zone" abstraction is also similar to techniques developed for
buffer management and flow control in the high-performance
message passing on systems like the Intel iPSC/2 and the Intel
Paragon. In those systems, all-to-all communication needed to be
supported, but the O(N^2) time and space requirements rapidly became
prohibitive with 100s of nodes. Internally, NX message passing
maintained an LRU of "nearest logical neighbors", and transparently
handled the attach/detach dynamically between one node and a set of
other nodes. TIPC appears, at least from the description, to behave
in a similar way.
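The LRU idea can be sketched as follows. This is a generic least-recently-used cache written in Python, not a reconstruction of NX internals; the class name `NeighborCache`, its `capacity` parameter and the per-neighbor `state` dictionary are all assumptions made for illustration. Each node keeps state only for a bounded set of recently used peers, so per-node space is O(capacity) rather than O(N).

```python
from collections import OrderedDict


class NeighborCache:
    """Bounded set of attached peers, evicting the least recently used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._attached = OrderedDict()  # node id -> per-neighbor state (hypothetical)
        self.evictions = []             # detached peers, oldest eviction first

    def touch(self, node_id):
        """Return state for node_id, attaching it (and detaching a victim) if needed."""
        if node_id in self._attached:
            # Already attached: mark it as most recently used.
            self._attached.move_to_end(node_id)
            return self._attached[node_id]
        if len(self._attached) >= self.capacity:
            # Transparently detach the least recently used peer.
            victim, _ = self._attached.popitem(last=False)
            self.evictions.append(victim)
        state = {"buffers": []}
        self._attached[node_id] = state
        return state

    def attached(self):
        """Attached peers, least recently used first."""
        return list(self._attached)


cache = NeighborCache(capacity=2)
for peer in (1, 2, 1, 3):
    cache.touch(peer)
print(cache.attached(), cache.evictions)
```

With a capacity of 2, talking to peers 1, 2, 1, 3 in that order leaves 1 and 3 attached and detaches 2, since 2 was the least recently used when 3 arrived.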
I'm curious about the behavior of the protocol in some of the strange
boundary conditions. For example, when the reroute counter of a
message has expired and the system is attempting to return it to the
sender, what happens if all routes to the original sender are cut, or if
the sender has been removed?