Internet Experiment Note # 19
Notebook Section 2.3.3.5



                                  A note on

                Inter-Network Naming, Addressing, and Routing


                                John F. Shoch

                                January 1978

                        Xerox Palo Alto Research Center
                          Palo Alto, California 94305



Introduction

Taxonomies and terminology will not, by themselves, solve some of the
difficult problems associated with the inter-connection of computer networks;
but carefully choosing our words can help us to avoid misunderstanding and
refine our perceptions of the task.

In 'Through the Looking Glass', the White Knight tries to elucidate (for an
imprecise Alice) the important differences between what a song *is*, what it
*is called*, what it *is named*, and what *the name is called*; perhaps we
need to be equally vigilant with our use of the words 'name', 'address', and
'route'.

Let me offer one scheme which has proven useful in analyzing this domain, and
begin by asserting that 'names', 'addresses', and 'routes' are different
entities. [Even one of my favorite papers introduces part of this topic by
merging two of these characteristics: "The question of addressing. or how to
name all the participants in an interconnected communication system...."]


The General Model

We can first construct an extremely general definition:

  The 'name' of a resource indicates *what* we seek,

    an 'address' indicates *where* it is, and

        a 'route' tells us *how to get there*.

Some further detail to flesh this out:

I. A 'name' is a symbol -- usually a human-readable string -- identifying
some resource, or set of resources.

    The string need not be meaningful to all users, and need not be drawn
    from a uniform name space.

    The strings can identitify processes, places, people, machines,
    functions, or anything else the user chooses.


2


    To be useful, however, there will probably be some mechanism available to
    the user that will map names into addresses.

    The name (what we seek) need not be bound to the address (where it is)
    until this mapping takes place; the address (or addresses) associated
    with a particular name may change over time.

II. An 'address', however, is the data structure whose format can be
recognized by all elements in the domain, and which defines the fundamental
addressable object.

    Addresses must, therefore, be meaningful throughout the domain, and must
    be drawn from some uniform address space.

    This address space may be a "flat" one which spans the entire domain
    (such as Social Security numbers), or it may be a hierarchical address
    space (such as telephone numbers).

    If a name has produced several different addresses for a particular
    resource, some form of 'a priori' information may prove useful in
    selecting a preferred address.

    At the time one wishes to communicate with a particular address, there
    will be some mechanism that will map an address into an appropriate
    route.

    The address (where something is) need not be bound to the route (how to
    get there) until this mapping takes place; the choice of an "appropriate"
    route may change over time.

III. A 'route' is the specific information needed to forward a piece of
information to its specified address.

    The routing action may only require one step to reach a destination (a
    direct route), or it may require a series of steps in order to forward
    the information on its way.

    Where there is merely one hop in a simple topology, the routing decision
    is usually straightforward.

    When the path to an address requires several steps (as in a store and
    forward system), the route defines a path through intermediate switching
    points.

    In categorizing such routing mechanisms, one dimension is the *place* at
    which each intermediate routing decision is specified:

        a. The source may specificy all of the intermediate routing decisions,
        and include this information along with the data being sent ('source
        routing'). The source must have fairly comprehensive information about
        the environment, but the switching points need not maintain routing
        tables.

        b. Alternatively, the source may only specify the destination
        address, and the intermediate switching points choose the next
        portion of the route ('hop-by-hop routing', sometimes called
        'intermediate routing' or 'incremental routing'). In this case, the
        source only needs enough information to reach the next switching
        point, but each of these points must then have a routing table of
        some sort.

        c. There may be a hybrid routing mechanism combining these two, in
        which the source specifies certain major intermediate points, but
        allows the underlying system to choose routes between those points.

    A second dimension of the routing mechanims [sic] is the 'time constant'
    of the routing decisions -- when are they specified, and how frequently
    they may be changed:

        a. Routing tables may be set once, left unchanged for relatively long
        periods


3


        of time, and only changed to reflect major modifications to the
        system ('fixed routing', or 'deterministic routing').

        b. Alternatively, routing tables may be updated relatively
        frequently, reflecting shorter term changes in the environment
        ('dynanamic routing', or 'adaptive routing').
        [This measure is, in some sense, relative to the time constants of
        the communication system itself.]

    If there is a dynamic routing system providing periodic updates of
    appropriate information, a third dimension of the routing process is the
    'control' mechanism:

        a. Individual switching points may try to update their information in
        an 'isolated' manner, perhaps by trying various routes and observing
        performance.

        b. Changes in the environment or connectivity may all be reported
        back to one 'centralized' point, which is then responsible for
        promulgating this inforniation to the sources (if we use source
        routing) or to the switches (if we use hop-by-hop routing).

        c. Alternatively. control of the dynamic update process may be
        'distributed' among all of the sources or switches, which individualy
        [sic] obtain information about the state of the system.

    It is meaningful to talk about possible combinations of these
    dimensions: 'fixed source routing' may be quite simple to implement,
    while 'dynamic source routing' may be beneficial if the source can learn
    about changes occurring throughout the system. 'Fixed hop-by-hop routing'
    requires setting the routing tab1es at the switching points, while
    'dynamic hop-by-hop routing' then requires updates to these routing
    tables. 'Distributed, dynamic, hop-by-hop routing' might obtain these
    periodic updates from neighboring switches, while a 'centralized,
    dynamic, hop-by-hop' system would depend on some form of network coutrol
    center to provide new information.

Thus, a *name* may be used to derive an *address*, which may then be used to
derive a *route*.

    [There is an interesting similarity between this structure and mechanisms
    used in programming languages (where one must bind a value to a
    variable), or in operating systems (where one must link a particular
    piece of code into a module). In all these cases, there is a certain
    amount of added flexibility gained by deferring the binding process as
    long as possible: MULTICS defers the binding of pieces of code until
    run-time so that new instantiations of code will be used automatically,
    while the Arpanet tries to defer routing decisions as long as possible,
    in order to make the most informed choice.]

If one fails to deliver a piece of information to the resource originally
specified, an error recovery procedure can systematically try to undo each of
these binding decisions: first try an *alternate route*, then an *alternate
address*, then an *alternate name*.


Example #1: Naming/Addressing/Routing in the Telephone System

Before attempting to apply this scheme to a connected set of computer
networks, let's see how well it serves to understand the workings of another
domain: placing a telephone call.

[The telephone system, of course, uses circuit switching, and dedicates
resources for the lifetime of a call. We are here discussing the naming,
addressing, and routing which take place during call set-up.]

I. We can look up a 'name' in the phone book.

    The name may be a specific person ("D. B. Cooper"), an institution
    ("Stanford University"), the local representative of a nation-wide
    service ("the nearest TWA


4


    office"), or a generic function ("Get me the police!", or the time
    server).

    The names of different individuals may yield the same phone number.

    One name may yield several phone numbers, if there are two telephones in
    one home or if a business provides multiple incoming lines. In addition, a
    considerate business might provide several incoming lines vith
    different area codes.

    Should a name yield more than one number, the caller may use some
    additional information to aid in the choice of which to dial: is one
    number less likely to be in use, or is one address in our own area code,
    and thus cheaper.

    There can be a two-level name-to-address mapping process: look up the
    name "information" in the phone book, get that number (411), place a
    call, and then present a new name to be looked-up by the information
    operator.

II. The 'address' which we eventually find is just the telephone number.

    The phone system did not have to understand the subscribers' names, but
    it certainly must be able to understand their addresses (numbers).

    Addressing in the phone network is based on a hierarchical structure --
    the major contribution of area codes, as specified in the North American
    Numbering Plan.

    There are certain conventions for defaulting parts of the address when it
    is found: if the area code is not listed, we generally assume it is the
    same as our own.

    Every resource (person) accessible through the phone system is actually
    represented by a telephone instrument, which we can address. Actual
    signals are intended for the person, but they are always sent via the
    instrument.

III. Vhen we dial the number (in fact, each time we dial the number) the
phone network attempts to build an appropriate route to the destination,
reflecting current conditions.

    Two successive calls to the same number need not be routed through the
    same path, and the telephone plant provides alternate routes if a trunk
    is busy. The distributed, adaptive nature of this routing is one of the
    major factors which contributes to the reliability of the phone system.

Now consider the error recovery procedure if we are not able to reach the
person we seek: we move back through the hierarchy, trying alternate routes,
alternate addresses, and alternate names.

1. If we get a busy trunk indication, we know that the route chosen by the
network turned out to be inadequate. We can, in effect, ask the system to try
picking a route again, by re-dialing (re-transmitting). If we are
particularly concerned, we can ask an operator to override some of the
automatic routing decisions, and attempt an alternate route.

2. If all of the routes to this address are inadequate, or if the address is
unavailable (perhaps busy, or broken), we must fall back and find an
alternate address: perhaps there were two phone numbers listed, and we can
try the second. (How many of us have second phones for a terminal, which we
call in on when the regular line is busy?) This will lead to the selection of
a completely new route.

3. If the alternate address fails, we can fall back one more step, and try to
find an alternate name: perhaps the phone is listed in the name of one's
spouse, or maybe we should look up the name of the organization where one
works.


5


The Implications of Hierarchical vs. Flat Address Spaces

It should be apparent that the structure of the address space is of central
importance: it is the major element of commonality in such an environment,
and can have a profound influence on the naming mechanism "above" and the
routing mechanism "below".

An address space can be partitioned in a hierarchical manner, or left as a
single uniform space. Use of a flat address space implies that:

    1. The address given to any resource must be unique over the whole
    domain; and,
    2. There is no structure to the address which might aid the routing
    process; instead, the routing mechanism must be able to handle all
    addresses, without segmenting them into parts (i.e., there is no area
    code).

There are some advantages to this approach: it is logically simple, and has
the potential to optimize the particular route between any two resources. But
there are certain kinds of costs: upon adding a new set of resources to an
existing system, one must insure [sic] that none of the new addresses
conflict with those already used, and the routing process must now be able to
recognize all of the legal addresses (in the phone system, 10^10).

The alternative is to provide a hierarchical structure for addresses, as has
been done with area codes. Local addresses can then be changed within areas,
but need not be made known to all of the other parts.

Yet the choice of hierarchical addressing can have important ramifications
for the levels "below".

We would certainly not want to allow a strictly hierarchical address scheme
to serve as the justification for a strictly hierarchical physical topology:
that would allow only one path to each resource, and severely diminish
reliability. Thus, the underlying communications should provide alternate
paths to a destination.

Furthermore, the routing mechanism must now be flexible enough to take
advantage of the alternate paths available -- it would make litle sense to
lay a strictly hierarchical routing algorithm on top of a richly-connected
network. Given a hierarchical address, this form of area routing should be
able to pick out an appropriate field, and route the data towards that area,
through one of the available paths, but be able to identify an alternate path
when necessary.

Though the hierarchical address space is often viewed as preferable, there is
a cost here too: if the routing decision does in fact use the hierarchical
structure of the address and route to an area, then some information has been
lost: the ability to specify precise routes between two resources. This can
lead to several kinds of difficulties:

    1. Two resources may be geographically close, but on distant branches of
    the hierarchical tree, If the routing strictly parallels the addressing
    hierarchy, it may require many hops to reach a nearby node.

        The telephone system provides a routing hierarchy which matches the
        addressing scheme, but then provides additional routes (high usage
        trunks) across the tree, as warranted by the traffic; switching
        centers then have the choice of routing through the specially
        allocated trunk, or back up the regular hierarchy, which then becomes
        the alternate route.
        This means, however, that some central offices within an area code
        are reachable through direct lines, while other less-heavily used
        offices must be reached through the standard hierarchy. To make that
        routing decision, the foreign switch must be provided with additional
        information about the internal structure of the system within the
        destination area code -- i.e., which central office prefixes are
        reachable through a "short-cut". To take advantage of a direct
        high-usage trunk from downtown New York to a


6


        central office in downtown San Francisco, without going through the
        California area code hierarchy, the switch in New York must be able
        to perform what is called "6-digit translation" on the phone number,
        examining both the area code and the central office prefix.

    2. If a sub-area (netvork) is unintentionally partitioned so that half of
    its internal addresses are now unreachable, a distant routing process
    won't be able to recognize that, given only the sub-area number.

    3. Should there be alternate paths into the partitioned sub-area, now
    making those lost nodes reachable, the distant routing process will not
    be able to make an informed choice about the proper route into the
    sub-area.

    4. Two resources may be in the same sub-area, but perhaps connected by an
    inefficient route (many hops or poor quality). Provided there are two
    paths in to the sub-area, it may make more sense for a resource to route
    its traffic out of the area aud back in through the other path, but area
    routing will indicate that the source is already in the area of the
    destination, and may not allow a route going outside of that area.

In these unusual circumstances, we see that there are clear weaknesses at the
routing level which evolve from the choice of a hierarchical address space;
but in the most common cases this approach will work well.

This choice of hierarchical addresses may also impact the naming mechanism,
although that ought not be necessary. The name-to-address mapping can be
supplied by a process which is external to the system itself, and there is no
reason to enforce hierarchical names. A user may offer a name from which one
can directly derive the address, but it should also be acceptable to provide
a symbolic name, independent of the address format. An example may show the
different kinds of names which might be converted into adresses (telephone
numbers):

    1. The syntax of the string "(415) 767-2676" explicitly indicates the
    three fields of the address, and their values.

    2. The string "(SanFrancisco) Pop-Corn" has the syntactic structure of a
    proper hierarchical address, but will require a symbolic look-up of the
    component parts.

    3. Yet the string "GetMeTheTimeLady" -- when looked-up in the context of
    my name dictionary -- should produce this same address.

The virtue of the third approach is that the names may remain the same, even
if the resource is moved to a different address, or if I should move to a
different area; only the name-lookup process must be made aware of the change.


Example #2: Inter-connected Computer Netvorks

It seems that this model of naming/addressing/routing provides a useful
framework for discussing the phone system, and that system provides a rich
set of examples for comparison with the domain of computer networks.

I. A name is merely a human-readable string meant to denote some resource:
"BBN-TenexA" (a host), "SAIL" and "SU-AI" (two names for the same host), "the
SRI mobile van", "a time server", "the Telnet server at ISI", "the Telnet
server on any Tenex", "the nearest FORTRAN compiler", "the error-logging
process in IMP32", "the inter-network routing process in Gateway2l" .... Note
that these names may identify resources outside of the communication system,
or processes within it.

Virtually any object can have a name. In practice, of course, any such
resource is usually represented by some sort of process which can act on its
behalf.

    The Arpanet system for handling text messages describes a slightly
    different kind of name, in the form "User@Host", but that would not
    really resolve down to a proper internetwork address. Instead, the "host"
    name is used to derive an address for the


7


    mail server process, and the "user" name is sent along as data, to be
    interpreted by a higher-level protocol (the mail server) at the
    destination.

Some local process (either local or part of some distant resource) may offer
to convert names into appropriate addesses, although a single name may yield
many addresses. The name-lookup process may know about the precise address of
certain well-known severs [sic], and may provide certain default conventions
if some portions of the address are not fully specified.

    I have not yet mentioned the notion of names and addresses for
    multi-destination pieces of information. The name
    "Internet-Working-Group" could produce a whole set of names, which in
    turn could be transformed into individual addressess [sic], and
    individual copies of the information could be sent to each place. Most
    communications subsystems do not yet have protocols which would provide
    more congenial support for multi-destination packets.

If a single name does yield many addresses, one of them may be chosen on the
basis of some other information: bandwidth and delay characteristics, cost,
etc.

II. An address, in this context, really means an internetwork address: a
collection of bits in a standard format, recognizable by any object cognizant
of the inter-network protocols.

In most inter-network efforts we have adopted a hierarchical address space,
usually partitioned into (net, host, socket) tuples.

    Note that this model is not quite adequate to describe the dynamic naming
    and addessing conventions planned for the Distributed Computing System
    (DCS) at UC Irvine. They had proposed that processes not be bound to
    particular machines, but would be free to migrate around the ring; 16-bit
    process ID's on each packet would be dynamically checked in each host,
    and the packet given to the appropriate process, if it happened to be in
    that host. The actual address of a destination process would remain
    hidden from the sending program,

    I would argue that the 16-bit process ID (although not human-readable) is
    most accurately called a process 'name', since it may be used to describe
    a generic resource without specifying a particular address. Thus, this
    mechanism pushes "down" and distributes the task of mapping from names to
    addresses, thereby deferring the binding until even later.

    The mechamsm depended heavily on two aspects of the DCS ring:
    -- the ability of the communications subsystem to broadcast a packet to
    all nodes on the ring, and,
    -- the ability of each interface to quickly look up a process name in a
    content addressable memory (associative memory) used to identify those
    processes resident in the node. (In the original proposal, this memory
    could hold the names of 16 resident processes) [sic]
    In a different communications enviromment -- or if there were more than
    16 processes in one host -- this mechanism would not function properly.
    In those cases, however, one can provide the same effect (allowing users
    to communicate with resources stirctly [sic] by name, without specifying
    an address) merely by providing an extra layer of protocol. I would
    conclude that the different naming/addressing conventions in DCS are more
    of a difference in style, rather than a difference in substance.

We usually think of these three components as physical objects -- networks
and machines -- but they are really only logical groupings used to form the
address. In a particular environment, for example, the "host" might be a
front-end processor, and the "sockets" might be specialized processors,
perfoming the function which the user sought at that


8


address.

If a machine has only one network interface, processes in that machine will
all have the same <Net><Host> address, although from many other addresses
there may be multiple paths to this address.

If a machine has two network interfaces, to the same network or to two
different networks, it will have two alternate internetwork addresses for
each process resident there. Another process seeking to communicate might
select either address.

III. The underlying communications system provides the means for routing
packets to their destination.

Individual networks use different routing strategies for forwarding their
internal traffic: the DCS ring has a simple topology with no alternate
routes, while the Alohanet and the Etherent [sic] use a broadcast
communications media, requiring no routing. The Arpanet uses dynamic
hop-by-hop routing, while the Packet Radionet uses dynamic source routing.

Internetwork packets, however, may travel through various networks, routed
among internetwork gateways. The gateway routing processes make use of the
information specified in the internetwork address, and can route to the areas
specified. Upon arrival at a gateway entering the destination network, the
network-specific routing mechanism can be used to route it to the specified
"host".

The gateway processes may implement a form of dynamic hop-by-hop routing,
using complete networks as the communications channel to "adjacent" gateways.
These gateways can exchange routing tables and perform dynamic hop-by-hop
routing, or one could simplify the gateway by requiring users to perform
source routing.

In one sense, these two methods can be combined: while the gateways normally
do hop-by-hop routing, there could be a special process resident in the
gateway which would take incoming packets, examine the data field, extract
the next address indicated there, and forward the packet to that destination.
(Danny Cohen has called this process a "forwarder", and I believe Steve
Crocker once called it an "oasis".) The effect is to provide source routing
to the user, by simply incorporating the list of intermediate addresses in
the data field of a packet, rather than in the header.

We have spoken earlier on the manner in which area routing using hierarchical
addresses cannot properly identify resources which may have been stuck in
different parts of a partitioned network -- it requires some additional
information to guide the selection of an appropriate route. If we are using
some form of source routing, and if some information about the internal
structure of the destination net can propogate back to the source, it may be
possible to then choose an alternate route which will lead into one part of a
partitioned network.

In addition, a source routing or hybrid scheme which allows a user to specify
certain intermediate addresses allows a process to force traffic over a
particular network, perhaps in response to political, economic, or
performance considerations: "take any reasonable path across the country, but
I must use the Packet Satellite system across the Atlantic". More difficult,
however, is selectively eliminating some paths from the routing decision:
"take me to Europe, but don't ever go through Spain".


Naming/Addressing/Routing and Internetwork Gateways

The gateway function is. in general, performed in a machine which is
connected to at least two networks, and which therefore has at least two
addresses. Incoming packets which need to be forwarded may enter on any
attached net, using any of the legal addresses.


9


Yet there is a single gateway process which handles all of this traffic, and
which is responsible for maintaining relevent [sic] routing tables.

It has been suggested that a gateway might be viewed as two completely
separate "halves" actually residing on different machines, connected by a
"thin wire" communications channel. We do not find this abstraction
particularly useful, since it does not easily account for the shared gateway
processes which actually perform the routing function.

In this model, the gateway processes maintain routing tables to reach network
areas through other gateways, but they do not attempt to maintain any other
state about networks, hosts, or connections. Thus, these intermediate
gateways -- using dynamic hop-by-hop routing -- will not be able to deal with
partitioned networks.


Material not covered in this note

Distribution lists, multi-destination packets, and broadcasting.
Ways we might modify area routing to more gracefully deal with partitioned
networks.
Internetwork fragmentation.
Internetwork flow control and congestion control.
Flow control between gateways.


Acknowledgements

The material in this note has evolved from many conversations with David
Boggs, Ed Taft, Bob Metcalfe, and Yogen Dalal; any clarity which has emerged
is due to their willingness to struggle with the semantic demons. Recent
conversations with Danny Cohen have been particularly useful in understanding
his proposal for hybrid routing schemes (where the source routing process
specifies only some of the intermediate stops), and the manner in which area
routing may obscure a preferred off-net route. Much of the commentary on
routing has evolved from [McQuillan 1974] and [Sunshine 1977]; the latter
paper is a partictilarly useful one, touching on many of the issues raised
here, although we tend to differ on some of the fine points.


Bibliography

[AT&T 1975]
        AT&T, 'Notes on Distance Dialing', 1975.
[McQuillan 1974]
        J. M. McQuillan, 'Adaptive routing algorithms for distributed
        computer networks', BBN Report No. 2831, May 1974.
[Sunshine 1977]
        Carl A. Sunshine, 'Interconnection of computer networks', Computer
        Networks, 1977.








file: names.b
January 29, 1978 8:02 PM