IEN 188
ISSUES IN INTERNETTING
PART 3: ADDRESSING
Eric C. Rosen
Bolt Beranek and Newman Inc.
June 1981
3. Addressing
This is the third in a series of papers that discuss the
issues involved in the design of an internet. The initial paper
was IEN 184, familiarity with which is presupposed.
In this paper, we will deal with two basic issues. The
first has to do with the Network Access Protocol. It is
concerned with the sort of addressing information which a source
Host has to supply, along with its data, to a source Switch
(gateway, in the Catenet context), in order to enable the Switch
to get the data delivered to the proper destination Host. The
second issue has to do with the question of how the Switches
(both source Switch and the intermediate Switches) are to
interpret and act upon the addressing information supplied by the
source Host. We begin by stating generally the sort of
addressing scheme we envision (which is by no means original),
and by comparing it to the very different sort of addressing
currently in use in the Catenet. Next we will discuss some of
the issues and details that arise in considering how to make such
a scheme work reliably. We will then show how this scheme lends
itself quite naturally to the solution of certain problems which
are very difficult to handle in the current Catenet architecture.
Although addressing and routing are rather intimately bound up,
we will avoid routing considerations here whenever possible.
Routing in the internet will be the topic of a longer paper which
will be the next to appear in this series.
3.1 Logical Addressing / Flat Addressing
For maximum flexibility and robustness of operation, a
source Host should be able to simply "name" the destination Host
it wants to reach, where a "name" is just an arbitrary identifier
for a Host. That is, the source Host should not need to know
anything about the physical location of the destination Host, NOT
EVEN WHAT NETWORK IT IS ON. In other words, the internet should
have logical addressing. The advantages of logical addressing
are thoroughly discussed in IEN 183, and that discussion shall
not be repeated here. IEN 183 presents a logical addressing
scheme which was designed with the ARPANET in mind. However,
since we regard the internet as a Network Structure whose
Switches are gateways and whose Hosts are generally multi-homed
to the gateways, most of the ideas presented in IEN 183 can be
carried over directly to the internet environment. The present
IEN will emphasize those aspects of the logical addressing scheme
which are specific to the internet environment, but the proposed
scheme is basically the same as the one discussed in IEN 183.
Anyone with a real interest in these issues will want to become
familiar with that document.
The basic idea of logical addressing is that a source Host
should name the destination Host, and the Switches should map
that name into a physical address that is meaningful within the
Network Structure of the Switches. The mapping between names and
(physical) addresses will, in general, be many-many. That is,
one name may refer indeterminately to several distinct physical
addresses, either because some one physical machine is
multi-homed, or because the user does not care which of several
physical machines he reaches. Similarly, one physical machine
may have several names, which may either be synonyms, or may be
used for further multiplexing within the destination Host. (This
may be particularly important when a Host within one Network
Structure is really a Switch, e.g., a port expander or local
network, within another.)
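
To make the many-many mapping concrete, the following is a minimal sketch in Python. It is an illustration only: the table layout, the function name bind, and all Host names and physical addresses are hypothetical and are not taken from IEN 183 or from any existing implementation.

```python
from collections import defaultdict

# Hypothetical tables; every name and address below is invented.
name_to_addrs = defaultdict(set)   # logical name -> set of physical addresses
addr_to_names = defaultdict(set)   # physical address -> set of logical names

def bind(name, addr):
    """Record that `name` may be delivered to physical address `addr`."""
    name_to_addrs[name].add(addr)
    addr_to_names[addr].add(name)

# One multi-homed machine: a single name reachable at two physical addresses.
bind("HOST-A", ("switch-1", "port-3"))
bind("HOST-A", ("switch-7", "port-0"))

# One machine with two names (synonyms, or multiplexing within the Host).
bind("PRINT-SERVICE", ("switch-2", "port-5"))
bind("FILE-SERVICE", ("switch-2", "port-5"))
```

Keeping both directions of the mapping is a design choice of the sketch, not a requirement of the scheme; a Switch that only translates names to addresses would need only the first table.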
Logical addressing tends to result in a flat addressing
space, rather than a hierarchical one. This may seem surprising
in the context of the internet, since an internet is a
hierarchical structure, and internet routing is almost certainly
going to be some form of hierarchical routing. However, it
simply does not follow that the addressing space used in the
internet Network Access Protocol must be a hierarchical
addressing space. In fact, since the form of the addressing
space has an effect on the Network Access Protocol, and hence on
Host-level software, whereas the routing algorithm is a purely
internal matter to the Network Structure, proper protocol
layering would seem to require that the form of the addressing
and the form of the routing be independent. We would like to be
able to change the internal routing algorithm of the Network
Structure without requiring corresponding changes in Host
software, i.e., without changing the form of the addressing.
What we are proposing is quite different from the way
addressing is done in the current Catenet Network Access
Protocol, IP. IP uses both physical addressing and hierarchical
addressing. (Note that physical addressing within a hierarchical
Network Structure will almost certainly be hierarchical
addressing, whereas logical addressing allows the internal
structure of the Network Structure to be better hidden from the
users. This is one of its main advantages.) The first component
of the address is a network number, and the second component is a
physical address which is meaningful within that network. In IEN
183, we discuss a number of reasons for the superiority of
logical over physical addressing. Other criticisms of the
Catenet's current addressing scheme have been voiced by other
authors. For example, the way in which hierarchical addressing
is incorporated into Catenet addressing mechanisms has recently
come under criticism in IEN 177 by Danny Cohen, who focuses his
criticism on the particular case of the ARPANET. His main
criticism is that it does not allow enough hierarchical levels.
That is, with the presence of local nets or port expanders which
appear to the ARPANET as Hosts, there is really another level of
hierarchy after the ARPANET. He suggests, therefore, that
ARPANET addressing (1822-level) be changed to provide this
additional hierarchical level, and that end-users (or at least
Host software modules) fill in this additional level.
It is not obvious, though, that a single additional level of
addressing will do for all applications. If we are sending data
not just to a local net, but to an internet of local nets, maybe
several additional levels of hierarchy are needed. We may also
need more hierarchy on the "front end" of the address. A
protocol which begins the internet address with a field which is
supposed to identify the destination network (e.g., IP) assumes
that there is no need to establish a hierarchy among the networks
themselves. (This is equivalent to assuming that all Switches
can "know about" all networks.) As long as we have only a small
number of networks, it may be reasonable enough to assume that
destination network addresses need not themselves be
hierarchical. However, it is not difficult to imagine a very
large internet composed of thousands of networks, where before
specifying a network, we must first specify, say, a continent.
So maybe our protocol for hierarchical addressing needs a
"continent address" field before the network address field. It
begins to look as if the addressing structure needs to be
INFINITELY EXTENSIBLE in both directions. In fact, in IEN 179
Cohen proposes a scheme which seems intended to provide this sort
of infinite extensibility. That seems both an inevitable
consequence of hierarchical addressing, and a reductio ad
absurdum of it.
It is also worth noting that a given number of Hosts can
generally be addressed with fewer bits in a flat addressing
scheme than in a hierarchical addressing scheme. Given, say, 32
bits of addressing, flat addressing can represent 2**32 Hosts.
However, if these 32 bits are broken into four 8-bit fields,
hierarchically, fewer Hosts can be represented, since in general,
not every one of the four fields will actually take on the full
256 values. Inevitably, one finds that at least one field must
take on 257 values, while at least one other turns out to have a
smaller number of values than expected. This tends to lead to
the feeling that the address field needs "just one more level" of
hierarchy. It also tends to lead to the use of funny escape
values and multiplexing protocols so that different fields can be
divided up in different ways by different applications. The same
problems usually reappear, however, in a few years, as the need
for "just one more level" is proclaimed yet again. Yet the
alternative of making the address fields arbitrarily long, hence
infinitely extensible, is rather infeasible, if bandwidth
considerations are taken into account.
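
A small worked example may help show the size of the effect. The per-field utilization figures below are invented purely for illustration; only the 32-bit width and the four 8-bit fields come from the discussion above.

```python
# Flat addressing: every one of the 32 bits distinguishes Hosts.
flat_capacity = 2 ** 32

# Hierarchical addressing: four 8-bit fields, each able to hold 256 values.
# Suppose (hypothetically) that only some values of each field are assigned.
values_in_use = [40, 256, 12, 200]   # invented utilization per field

hier_capacity = 1
for v in values_in_use:
    hier_capacity *= v

print(flat_capacity)   # 4294967296
print(hier_capacity)   # 24576000
```

Under these assumed figures the hierarchical layout can name fewer than one percent of the Hosts that the flat layout can, even though one field is completely full and would need "just one more" value to grow.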
The need for infinite extensibility at the Host interface
can be avoided by using logical addressing (although this is only
one of its many advantages). We can then identify a single Host
by using a single, structure-less, unique name which is
meaningful at each level of internet hierarchy. That is, the
Switches at each level of the hierarchy would be able to
recognize the name, and to map it into a physical address that is
meaningful at that level of hierarchy. Neither the end-user nor
the source Host would be responsible for determining the physical
addresses at each level of a never-ending hierarchy. Of course,
neither these arguments, nor those of IEN 183, can be regarded as
finally settling the "flat vs. hierarchical" issue. In
networking, no one issue can ever be settled in isolation, and
attempts to do so result only in endless and unproductive
arguments. A network (or internet) is a whole whose performance
and functionality result from the combination of its protocols,
addressing schemes, routing algorithm, hardware and software
architecture, etc. Particular addressing schemes can only be
judged when it is seen how they actually fit into particular
designs. The only real argument in favor of a particular
addressing scheme is that it fits naturally into a network
architecture which provides the needed functionality and
performance. It is hoped that the addressing scheme we propose
will be judged as part of the architecture we are developing in
this series of papers, rather than in isolation.
3.2 Model of Operation: An Overview
The model of operation we are proposing is as follows. A
source Host submits a packet to a source Switch, naming (not
addressing) the destination Host. THE SOURCE SWITCH THEN
TRANSLATES (OR MAPS) THAT NAME INTO A PHYSICAL SWITCH ADDRESS
WHICH IS MEANINGFUL WITHIN ITS OWN NETWORK STRUCTURE; THAT WILL
BE THE ADDRESS OF THE DESTINATION SWITCH WITHIN THAT NETWORK
STRUCTURE. The data is then routed through the Network Structure
to the destination Switch so addressed. The name (logical
address) of the destination Host is also carried through the
Network Structure along with the data and the physical address of
the destination Switch. When the destination Switch receives the
data, it forwards it to the destination Host over (one of) its
Pathway(s) to that Host. If the Pathway is itself a network or
internet configuration with logical addressing, the name of the
destination Host is passed on via the Pathway Access Protocol.
If logical addresses or names are not unique across all component
networks of an internet, translation from the internet logical
address to the Pathway logical address would have to be done at
this point. If the network or internet underlying the Pathway
does not even have logical addressing, the Host name will have to
be translated into a Pathway physical address by the destination
Switch.
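
The overview above can be sketched roughly as follows. This is a simplified illustration under our own assumptions: the Packet fields, the two tables, and all Switch, Host, and Pathway identifiers are hypothetical, and routing between Switches is omitted entirely.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    dest_name: str                      # logical address, supplied by the source Host
    data: bytes
    dest_switch: Optional[str] = None   # physical address, filled in by the source Switch

# Hypothetical translation table held by the source Switch.
NAME_TO_SWITCH = {"HOST-B": "switch-9"}

def source_switch_accept(pkt: Packet) -> Packet:
    """Translate the name to a destination Switch address; the name stays in the packet."""
    pkt.dest_switch = NAME_TO_SWITCH[pkt.dest_name]
    return pkt

# Hypothetical per-Switch table of Pathways to directly homed Hosts.
PATHWAYS = {"switch-9": {"HOST-B": "pathway-2"}}

def destination_switch_deliver(pkt: Packet) -> str:
    """Use the name carried with the data to pick the Pathway to the Host."""
    return PATHWAYS[pkt.dest_switch][pkt.dest_name]
```

For example, `destination_switch_deliver(source_switch_accept(Packet("HOST-B", b"hello")))` selects "pathway-2". The point of the sketch is that the packet carries both the Host name and the destination Switch address, so the destination Switch never has to recover the name from a physical address.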
Note that, at any particular hierarchical level (i.e.,
within any particular Network Structure), the ADDRESSABLE
ENTITIES are the Switches at that level (which are physically
addressed), and all the Hosts (which are logically addressed, or
named). Component networks of the internet are treated as
structure-less Pathways, AND NEITHER THE COMPONENT NETWORKS
THEMSELVES NOR THE SWITCHES OF THE COMPONENT NETWORKS ARE
INDEPENDENTLY ADDRESSABLE. Furthermore, a name (logical address)
which adequately identifies the destination Host is present at
each level of the hierarchy. Of course, a particular name only
needs to be unique at a single level of the internet hierarchy,
within a particular Network Structure. The names can change as
we travel up and down the hierarchy of Network Structures that
make up the internet.
3.3 Some Issues in Address Translation
In order to do the sort of translation from logical to
physical address that we have been discussing above, the Switches
must have translation tables. Many of the issues involved in the
design of a robust translation table mechanism are discussed in
IEN 183, and much of that discussion applies without change to
the internet. We will confine our discussion here, therefore, to
issues which are not considered in that note, or which are more
specific to the internet environment.
The main problem with the model of operation we have
proposed is a very mundane one, but unfortunately a very
important one. If there may be thousands of Hosts on an
internet, each one with an unlimited number of different names,
and if a source Switch must be able to map any name to the
address of a destination Switch, then each Switch will have to
have a very large table of names to drive this translation
function. By itself, this is not much of a problem. To be sure,
in the past, it has been considered important to keep the
gateways as small as possible. It now seems to be more generally
accepted that the current Catenet gateways provide inadequate
performance, and that building a robust operational internet
system requires us to build Switches that are large enough to
handle the required functionality at a reasonably high level of
performance. We would expect Switches built in the future to be
much larger than the current gateways are. However, it is one
thing to require large tables, and quite another thing to require
tables which may grow without bound. Since the number of Hosts
on the internet may grow without bound, it does not seem feasible
to require the Switches to have tables with one or more entries
for each and every Host in the internet.
If we cannot fit the complete set of translation tables into
each Switch, a natural alternative is to turn the tables into a
DISTRIBUTED DATA BASE, with each Switch having only a subset of
the complete set of tables. For each Switch, there would be a
subset of logical addresses for which the Switch would have
complete physical addressing information. These logical
addresses would fall into one of two classes:
1) Those logical addresses which refer to Hosts which are
homed (in some Network Structure) directly to that
Switch.
2) Those logical addresses which refer to distant Hosts
which are in FREQUENT communication with the Hosts which
are directly homed to that Switch.
The logical addresses in these two classes are the ones for which
the Switch will be most often called upon to do
logical-to-physical address translation, and for best efficiency,
the information needed to do the translation ought to be present
in the Switches. For other logical addresses, which are less
often seen, all that is needed is for the Switch to know where
the address translation information can be found. Then if a
packet with an infrequently-seen logical address is encountered,
it can be forwarded to a place where the proper information is
known to reside, or else the packet can be held while the
information is obtained. (We may want to have a scheme which is
a hybrid of these two alternatives. For example, packets with
logical addresses that are not contained in the resident tables
can be forwarded to a place with more addressing information, and
this can in turn cause the needed addressing information to be
sent back to the source Switch, so that additional packets with
the same address can be handled directly by the source Switch.
That is, the source Switch might maintain, in addition to its
permanently resident tables, a cache of the most recently needed
addressing information.)
It is important to note that the two classes defined above
may vary dynamically, and we may want a procedure for altering
the members of those classes in some specific Switch depending
upon the traffic that the Switch is actually seeing in real time.
Unfortunately, any such scheme would seem to require the
inclusion of at least one additional level of hierarchy in the
addressing structure, since when a Switch sees a logical address
for which it does not have complete information, it must be able
to determine how to get that complete information. The scheme
would be self-defeating if it meant that we had to have a table
of all the logical addresses, with an indication for each one of
which other Switch has the complete information. Rather, we need
to be able to group the logical addresses into "areas", of which
there will be a bounded number. Then each Switch will be able to
keep a table indicating which other Switches contain the complete
translation information for each area. This table of areas would
then be the only part of the complete set of translation tables
that had to be resident at ALL Switches. While this is much more
feasible than requiring each Switch to keep a table containing
all the logical addresses, it does mean that the destination
address provided by the source Host must include not only a
destination Host identifier, but also an "area code" for that
logical address.
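
As a sketch of what the universally resident table might look like, consider the following. The representation of an address as an area code plus a Host name, the area numbers, and the Switch identifiers are all assumptions made for the example.

```python
# Hypothetical destination address as supplied by the source Host.
dest = {"area": 17, "host": "HOST-B"}

# The only table that must be resident at ALL Switches: one entry per area,
# listing the Switches that hold complete translation information for it.
# Its size is bounded by the number of areas, not by the number of Hosts.
AREA_TABLE = {
    17: ["switch-4", "switch-11"],
    18: ["switch-2"],
}

def switches_with_full_info(address):
    """Return the Switches that can translate any logical address in this area."""
    return AREA_TABLE[address["area"]]
```

Here `switches_with_full_info(dest)` yields the two Switches listed for area 17. Adding a Host to an existing area changes the tables at those Switches only, and leaves AREA_TABLE untouched everywhere else.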
If we are going to organize the logical addresses of all
internet Hosts into a relatively small set of "areas", we would
like to find some means of organization which is fairly optimal.
Unfortunately, there are a number of fairly subtle considerations
which make this quite tricky to do. Certain intuitively
attractive ways of organizing the internet into these areas will
result in various sorts of significant and quite annoying
sub-optimalities. Suppose, for example, we treated "area" as
meaning "home network", much as in the present Catenet IP (where
network number is part of the address that the Hosts must
specify). Then we would require all and only the ARPANET gateways
to contain the logical-to-physical addressing information for the
ARPANET Hosts, all and only the SATNET gateways to contain the
tables for the logical addresses of the SATNET Hosts, etc. The
user, in addressing a particular Host, would not only name it,
but also name its "home network", and the source Switch would
choose some Switch which interfaces directly to the home network
of the destination Host from which to obtain the translation
information. This method of organization, however, has several
unsatisfactory consequences. One problem is that if any Host is
on two "home networks", we want the Switches, not the Hosts, to
choose which "destination network" to use. This is necessary if
we want the routing algorithm to be able to choose the "best"
path to some destination Host, and is really the only way of
ensuring that packets can be delivered to a Host over some path,
if one of the Host's home networks is down but the other is up.
(This is jumping ahead a bit, since a full discussion of the
"partitioned net" problem will not appear until section 3.4. The
point, though, is that the choice of "home network" to use when
sending traffic to a particular destination Host is a ROUTING
PROBLEM, NOT AN ADDRESSING PROBLEM. Therefore it ought to be
totally in the province of the Switches, which are responsible
for routing, and not at all in the province of the Hosts, which
must participate in the addressing, but not the routing.)
Another problem arises as follows. Suppose we have adopted
the scheme of sending packets for a certain area to a Switch in
that area, depending on that Switch to do the further
logical-to-physical translation. It is possible that when this
further translation is done, we will find that the route which
the packet travels from that Switch takes it back through the
source Switch. This could mean a very lengthy and
delay-producing "detour" for the packet. It might at first
appear that this is not very likely. If a packet is going to
some ARPANET Host, and we send it to some Switch which is
directly connected to the ARPANET, surely we have sent it closer
to its final destination, not further away. Unfortunately, that
just is not necessarily true. Network partition or congestion
may force a packet for an ARPANET Host to travel from an ARPANET
gateway to a gateway (or series of gateways) outside the ARPANET,
back around (through a potentially long route) to another ARPANET
gateway. (Consider the partitioned net and the expressway
problems.) In such cases, the Network Structure may already be
in a condition of stress which is likely to result in below par
performance. We do not want to make things even worse by adding
any further unnecessary but lengthy detours just because we
cannot keep all the addressing information at the source Switch.
One way of helping to avoid these sorts of problems is to
separate the notion of "area" from any physical meaning. The
purpose of adding the notion of area to the logical addressing
scheme is just to enable us to distribute the data base needed to
do logical-to-physical address translation. There is no reason
to suppose that the addressing information needed for some
particular Host ought to be contained only in Switches that are
"near" that Host. That would be a mistake. Rather, the
addressing information ought to be somewhere which is "near" the
SOURCE Host, not somewhere which is near the destination Host.
This maximizes the chances that the necessary address translation
will be done as soon as possible after the packet enters the
Network Structure. The sooner we do the address translation, the
more information we have which we can make use of to improve the
routing of the packet, and the less likely any unnecessary
detours will be.
One might think that at least Hosts which are on the same
home network should be grouped into the same area. This will
work until the first time a Host is moved from one network to
another. Since the area codes are given by the individual Host
or user as part of the address in the Network Access Protocol of
the internet, changing a Host's area code would involve changing
Host-level software or tables, which has to be avoided.
(Avoiding the need to make such changes when Hosts move
physically is one of the main reasons for using logical
addressing.) So we really have to think of "areas" as random
collections of Hosts.
What we are proposing is a truly distributed logical address
translation table, rather than a scheme where each Switch
maintains only local information. To make this more concrete,
consider how this might be done in the Catenet. All the
information about logical addresses which refer to Hosts on the
ARPANET would be contained not only in all the gateways which are
directly connected to the ARPANET, but also in a set of
additional gateways which are uniformly scattered around the
internet. Then, although the addressing information would not be
in every potential source Switch, it would be somewhere close to
every potential source Switch, and packets would not have to
travel a long distance only to find out that they are going in
the wrong direction.
3.4 Model of Operation: More Detail
Let's assume that a source Host has given a message to a
source Switch, with a logical address and an "area code"
indicating the destination Host. If the source Switch does not
have the complete address translation information in its tables,
it will look in its table of area codes. The given area code
will be associated in the latter table with some set of Switches
(within the same Network Structure). The sequence of operations
that we envisage is the following:
1) The source Switch picks one of these Switches, and sends
the message to it. There must be enough protocol between
these two Switches so that the chosen Switch knows that
it is not the final destination Switch, but only an
intermediate Switch, and that it is expected to complete
the address translation and then to forward the message
further.
2) The chosen Switch must be able to recognize the logical
address of the destination Host, and associate it with
one or more possible destination Switches. The message
will be forwarded to one of these Switches. Furthermore,
the addressing information can be sent back to the source
Switch where it can be held in a cache in case the
message is followed by a flood of additional messages for
the same logical address.
In the case where the source Switch does contain complete
address translation information for the destination logical
address, that logical address will be associated with some set of
potential destination Switches. The source Switch will choose
one, and send the message directly to it.
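
The two cases just described can be sketched as follows. This is an illustration under assumptions of our own: the Switch class, its three tables, and the rule of always taking the first candidate in a list are hypothetical stand-ins, and the sketch assumes that the Switch chosen for an area really does hold the complete translation.

```python
class Switch:
    def __init__(self, name, full_table, area_table):
        self.name = name
        self.full_table = full_table    # logical name -> list of destination Switches
        self.area_table = area_table    # area code -> Switches with full information
        self.cache = {}                 # translations learned from other Switches

    def translate(self, host_name):
        """Return candidate destination Switches, or None if unknown here."""
        return self.full_table.get(host_name) or self.cache.get(host_name)

def submit(source, switches, area, host_name):
    """Return the Switches a message visits after leaving the source Switch."""
    candidates = source.translate(host_name)
    if candidates is not None:
        return [candidates[0]]                      # sent directly
    helper = switches[source.area_table[area][0]]   # step 1: pick a Switch for the area
    candidates = helper.translate(host_name)        # step 2: it completes the translation
    source.cache[host_name] = candidates            # information returned to the source
    return [helper.name, candidates[0]]
```

With a source Switch that knows only the area table, the first message for a Host travels by way of the chosen Switch, and a second message for the same Host goes directly, because the translation is by then in the source Switch's cache.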
Logical-to-physical address translation should be done by
only one Switch; either the source Switch or the Switch chosen by
the source Switch on the basis of the area code. There is no
need to allow intermediate Switches to do any logical-to-physical
address translation. (There is only one exception to this,
namely the case where a message arrives at an intermediate Switch
only to discover that the destination Switch chosen by the source
Switch is no longer accessible. In this case, re-translation is
the alternative to dropping the message entirely.) Remember that
many Hosts will be multi-homed (in the internet, virtually every
Host is multi-homed, since most networks will have at least two
internet gateways connected to them), so that there will in
general be more than one possible destination Switch. By
prohibiting re-translation at intermediate Switches, we avoid the
problems of looping that might arise if different intermediate
Switches make different choices of destination Switch. As we
shall see, this also simplifies our approach to the partitioned
net problem, and at any rate, there is no great advantage to
allowing intermediate Switch translation (cf. IEN 183).
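
The rule and its single exception might be sketched like this. The function name, its parameters, and the assumption that an intermediate Switch can obtain the list of alternative destination Switches are all hypothetical; the text above does not say how that list would be obtained.

```python
def forward(translate, reachable, dest_switch, host_name):
    """Decide where an intermediate Switch sends a message in transit.

    translate: function mapping a logical name to candidate Switches
    reachable: set of Switches currently accessible from here
    """
    if dest_switch in reachable:
        return dest_switch              # normal case: never re-translate
    # Exception: the chosen destination Switch is no longer accessible.
    for alternative in translate(host_name):
        if alternative in reachable:
            return alternative
    return None                         # no usable alternative: drop the message
```

Because the original choice is honored whenever it is still reachable, two intermediate Switches cannot pass a message back and forth by preferring different destination Switches.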
We suggested above that if a source Switch does not
recognize a particular logical address, and hence must send a
message to another Switch (as determined by the area code), the
latter Switch should send the addressing information back to the
source Switch, to be kept temporarily in a cache. We have to
emphasize "temporarily." The source Switch should time out the
addressing information which it keeps in the cache, and then
discard it. If it later receives from any of its source Hosts
any subsequent messages for the same destination logical address,
it will have to reobtain the information. The reason for this is
that it will be necessary, from time to time, to change the
translation tables. It is not that hard to develop an updating
procedure which ensures consistent updating of all Switches where
the information about a logical address normally resides. But it
might be more difficult to develop a procedure which ensures
consistent updating of all the temporary (cached) copies of that
information. Timing out the temporary copies of the addressing
information will prevent out-of-date information from being
preserved in inappropriate places. (Though the use of an
out-of-date translation is not so terrible, since it would elicit
a DNA message, rather than causing mis-delivery of data. See IEN
183 for details. In this sense, out-of-date information is
self-correcting.)
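
A timed-out cache of this kind could be sketched as below. The class name, the fixed time-to-live, and the injectable clock are assumptions made for the example; no particular timeout value is implied.

```python
import time

class TranslationCache:
    """Temporary copies of translations, each discarded after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}     # logical name -> (candidate Switches, time stored)

    def put(self, name, switches):
        self.entries[name] = (switches, self.clock())

    def get(self, name):
        """Return the cached translation, or None if absent or timed out."""
        item = self.entries.get(name)
        if item is None:
            return None
        switches, stored = item
        if self.clock() - stored >= self.ttl:
            del self.entries[name]    # timed out: must be reobtained
            return None
        return switches
```

Note that the sketch times an entry out from the moment it was stored, not from its last use, so a stale translation cannot be kept alive indefinitely by steady traffic.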
When either a destination Host name (logical address) or an
area code maps into several Switches, the source Switch must
apply some criterion to choose one from among them, since in
general we will want to send only one copy of the message to its
destination. (Though there may indeed be cases in which we want
to send a copy of the message to each possible destination
Switch, in order to increase the reliability of the system, or to
be sure that we get the message to its destination Host as fast
as possible.) There are several possible criteria that we might
consider using:
a) We might always choose the "closest" Switch, according to
some particular distance metric (which might or might not
be the same distance metric used by the routing
algorithm).
b) The list of potential destination Switches might have a
"built-in" ordering, so that the first one is always used
unless it is down, in which case the second one is always
used, unless it is down, in which case the third one is
used, etc.
c) If the set of potential destination Switches has the
right sort of topological distribution, we might try to
round-robin them in order to achieve some sort of
load-splitting.
d) If we can obtain some information about the relative
loadings of the various Switches, we can try to choose
the one with the smallest load (to try to avoid causing
congestion within the destination Switches), or we might
try to trade off the increase in load that we will cause
at the destination Switch with the distance we have to
travel to get there.
e) Certain possible destination Switches might be favored
for certain classes of traffic (as determined by the
"type of service" field, or by access control
considerations). That is, certain destination Switches
might be favored for interactive traffic, and certain
others (with more capacity?) for bulk traffic. Or there
might be administrative access control restrictions which
prohibit certain classes of traffic from being sent to
certain Switches. (This may be particularly applicable
in an internet context where different Switches are under
the control of different administrations. It is
possible, though, to imagine applications of this sort of
access control even in a single-administration Network
Structure. For example, we might want to prohibit
military traffic from entering certain Switches, in order
to preserve capacity for important university traffic.)
f) It is possible to combine some of the above criteria,
e.g., choose the closest (i.e., shortest delay) Switch
for interactive traffic and the most lightly loaded one
for bulk traffic.
Remember that in the internet case, all the Hosts on some network
are considered to be homed to all the gateways on that network,
so that in general most Hosts will be multi-homed, and the way we
select the destination Switch could have a significant effect on
internet performance.
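
Several of the criteria listed above can be written down as small selection functions. The sketch below covers (a), (b), (c), (d), and the combination suggested in (f); the function names and the dictionaries of distances, loads, and up/down status are hypothetical inputs, and how a Switch would obtain them is not addressed here.

```python
import itertools

def closest(candidates, distance):
    """(a) Choose the candidate with the smallest distance."""
    return min(candidates, key=lambda s: distance[s])

def first_up(candidates, is_up):
    """(b) Use the built-in ordering, skipping Switches that are down."""
    for s in candidates:
        if is_up[s]:
            return s
    return None

def round_robin(candidates):
    """(c) Return a function that cycles through the candidates."""
    cycle = itertools.cycle(candidates)
    return lambda: next(cycle)

def least_loaded(candidates, load):
    """(d) Choose the candidate reporting the smallest load."""
    return min(candidates, key=lambda s: load[s])

def by_traffic_class(candidates, traffic_class, distance, load):
    """(f) Closest for interactive traffic, least loaded for bulk."""
    if traffic_class == "interactive":
        return closest(candidates, distance)
    return least_loaded(candidates, load)
```

Writing each criterion as a separate function makes it easy to see that they can disagree: with the same three candidates, the closest Switch and the least loaded Switch need not be the same one.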
Of course, a destination Switch might itself have two or
more Pathways to a particular destination Host. Perhaps the
Switch is a gateway on two networks, and the Host is also on
those two networks. Or perhaps the Switch is multi-homed onto
the network of the Host. In such cases, a further choice
remains -- the destination Switch must choose which of several
possible Pathways to the destination Host it should use for
sending some particular packet. Each (destination) Switch,
therefore, will have to have a second logical-to-physical address
translation table, which it accesses in order to choose the
proper Pathway to a destination Host. This second translation
table, however, contains information which is only useful
locally. In addition to containing information needed to map the
logical address onto one of the Switch's access lines, it must
also contain any information needed in order to specify the
address of the destination Host in the Pathway Access Protocol.
In some cases, the logical address of the Host in its "home
network" may be the same as its logical address in the internet,
in which case no additional information is needed. If this is
not the case, or if the "home network" does not have logical
addressing, the local translation tables must contain information
for mapping the internet logical address to an address (logical
or physical) which is meaningful in the "home network." The
issues of choosing one from among a set of possible Pathways
according to some criteria are basically the same as those we
have been discussing from the perspective of the source Switch,
however.
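One way to picture this second, purely local table is sketched
below in Python. The table layout, the line names, and the address
strings are hypothetical; the only point is that each entry names
an access line and, where the "home network" needs something other
than the internet logical address, the address to present there.

```python
from typing import NamedTuple, Optional


class LocalPathway(NamedTuple):
    access_line: str              # which of the Switch's access lines to use
    local_address: Optional[str]  # None: home network accepts the logical address


# Hypothetical local table at one destination Switch.
LOCAL_TABLE = {
    "H": [LocalPathway("line-1", None),
          LocalPathway("line-2", "net2-port-7")],
}


def pathway_addresses(logical_address):
    """Return (access line, address to present on that line) pairs."""
    return [
        (p.access_line, p.local_address or logical_address)
        for p in LOCAL_TABLE.get(logical_address, [])
    ]
```

Choosing among the returned Pathways could then reuse whatever
selection criteria the source Switch applies.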
An interesting little issue: suppose that traffic for Host H
can be sent to either Switch A or B, but that the route to Switch
B contains Switch A as an intermediate Switch. Does this mean
that the traffic should always be sent to A, rather than B? Not
necessarily. Perhaps A has plenty of bandwidth available for
forwarding traffic to other Switches, but only a little available
for sending traffic directly to a Host. Or the Pathway from
Switch A to Host H may itself have such a long delay that it is
quicker to send the traffic through A to B and then on B's
Pathway to H. While it may turn out to be very difficult to
take account of such factors, we ought not to rule them out by a
priori considerations, and we ought not to design a system in
which such factors cannot be considered.
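A small worked example, with delay figures invented purely for
illustration, shows how the longer route can win. Suppose the
route from the source Switch to A takes 20 ms, the hop from A to B
takes 30 ms, A's Pathway to H takes 500 ms, and B's Pathway to H
takes 50 ms.

```python
def total_delay(route_delay_ms, pathway_delay_ms):
    """Delay to the destination Switch plus delay on its Pathway to the Host."""
    return route_delay_ms + pathway_delay_ms


via_a = total_delay(route_delay_ms=20, pathway_delay_ms=500)      # S -> A -> H
via_b = total_delay(route_delay_ms=20 + 30, pathway_delay_ms=50)  # S -> A -> B -> H

# Pick the destination Switch with the smaller end-to-end delay.
best = min([("A", via_a), ("B", via_b)], key=lambda pair: pair[1])[0]
```

Here the total via A is 520 ms and the total via B is 100 ms, so B
is the better destination Switch even though the route to B passes
through A.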
A variant on this issue can arise as follows. Suppose Host
H1 wants to send some data to Host H2, and H1 puts this data into
the internet by submitting it to source Switch S. Now S will
look in its address translation table to find the possible
destination Switches for H2. Let's suppose that there are two
such possible destination Switches, one of which is D, and the
other of which is S itself. That is, S has a choice of sending
the data directly to H2 (over a Pathway with no intermediate
Switches), or of sending it to D, so D can transmit it directly
to H2. Nothing in the proposed scheme constrains S to choose
itself as the destination Switch. If we want, we can have S make
the choice of destination Switch without taking any special
cognizance of the fact that it itself is a possible destination
Switch. Or we might even require that S not choose itself as the
destination Switch. That is, when a gateway on the ARPANET, for
example, gets some data from an ARPANET Host which is destined
for another ARPANET Host, maybe we want the data to be sent
through another gateway, rather than just sending it right back
into the ARPANET. This possibility might be crucial to solving
the "expressway" problem. While we are not at present making any
proposals for allowing the internet to be used as an "expressway"
between two Hosts on a common, but very slow, network, we are
trying to ensure that nothing in our proposed addressing scheme
will make this impossible. This is a very important difference
between our proposed scheme and the scheme presently implemented
in the Catenet, where a source Switch which is also a potential
destination Switch is highly constrained to pick itself as the
actual destination Switch. Of course, for this to work, there
must be enough protocol so that a Switch which receives some data
can know whether it is getting it directly from a source Host, or
whether it is getting it from another Switch.
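The sketch below illustrates one way this could look. It assumes,
hypothetically, that the "enough protocol" takes the form of a
destination Switch field which is empty when data arrives from a
source Host and filled in once the source Switch has made its
choice. The allow_self flag and the trivial "take the first
option" rule are placeholders of ours, not a proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    dest_host: str
    dest_switch: Optional[str] = None  # None until a source Switch sets it


def destination_for(this_switch, packet, candidates, allow_self=True):
    """Translate the address only for data received directly from a Host."""
    if packet.dest_switch is not None:
        # Arrived from another Switch: the choice has already been made.
        return packet.dest_switch
    options = [s for s in candidates if allow_self or s != this_switch]
    if not options:
        return None
    packet.dest_switch = options[0]  # placeholder for a real selection rule
    return packet.dest_switch
```

Setting allow_self to False models the policy in which S is
required not to choose itself, as in the "expressway" case.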
When we say that a particular Host name maps onto a set of
possible Switches, what we are really saying is that each member
of that set of Switches has a Pathway to the Host. Remember the
definition of "Pathway" -- a Pathway in Network Structure N
between two Switches of Network Structure N or between a Switch
and a Host of Network Structure N is a communications path
between the two entities which does not contain any Switches of
Network Structure N. The logical-to-physical address translation
tables will not map a Host name to a particular set of
destination Switches unless each of those Switches has a Pathway
to that Host. But we must remember that at any particular time,
one or more of these Pathways may be down. Before we apply the
above criteria (or others) to the set of possible destination
Switches in order to choose a particular one, we must first
eliminate from the set any Switches whose Pathway to the
destination Host is down. This is a non-trivial task which
breaks down naturally into two sub-tasks. First, the destination
Switch must be able to determine which of the Hosts that are
normally homed to it is reachable at some particular time.
Second, this information must be fed back to the source Switch.
Each of these sub-tasks raises a number of interesting issues.
In IEN 187, we discussed the importance of having a Pathway
up/down protocol run between each Host and each Switch to which
it is homed, so that a source Host can know which source Switches
it has a currently operational Pathway to. Now we see the other
side of the coin -- each destination Switch must be able to
determine which Hosts it currently has an operational Pathway to.
Many of the considerations discussed in IEN 187 apply here too,
and need not be mentioned again. Basically, the Switch will have
to run a low-level up/down protocol which relies on the network
which underlies the Pathway to tell it whether a particular Host
is reachable (e.g., the ARPANET returns an 1822 DEAD reply to any
ARPANET source Host which attempts to send a non-datagram message
to an unreachable destination Host), and the Switch will also
have to run a higher-level up/down protocol whereby it queries
the Host and infers that the Host is unreachable if no replies to
the queries are received. Of course, if some Pathway consists of
a simple datagram-oriented network that provides no feedback to
the source, then a higher-level protocol will have to be used
alone.
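The way the two levels might be combined can be sketched as
follows. The parameter names and the threshold of three
unanswered queries are assumptions made for illustration only.

```python
def pathway_is_up(network_reports_dead, queries_sent, replies_received,
                  min_queries=3):
    """Combine low-level network feedback with higher-level querying.

    network_reports_dead is True or False when the underlying network
    gives feedback, and None when it gives none (a pure datagram net).
    """
    if network_reports_dead:
        return False  # low-level indication, such as a DEAD reply
    if queries_sent >= min_queries and replies_received == 0:
        return False  # higher-level inference from silence
    return True
```

When the network gives no feedback, the first test never fires and
the decision rests on the higher-level protocol alone.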
Assuming that the Switches have some way of determining
whether their Pathways to particular Hosts are operational, we
have the following subsidiary issue -- should these
determinations be made on a regular basis, for all Hosts that
might be reachable, or should they be made on an exception basis,
with the information obtained only as needed? Let's consider the
analogous operation in the ARPANET. In the ARPANET, the up/down
status of each Host is maintained continuously, as a matter of
course, by the IMP to which that Host is homed. This
information, however, is not generally maintained at other IMPs.
If a packet for a dead Host (on a live IMP) is submitted to some
source IMP, the packet will always be sent to the destination
IMP, which will (unless the packet is a datagram) return an 1822
DEAD reply. The source IMP receives the DEAD reply, signals it
to the source Host, and then discards the information. IMPs do
not maintain status information about remote Hosts, but the
information is available to them as they need it (i.e., on an
exception basis). On the other hand, each IMP always maintains
complete, accurate, and up-to-date information about the
reachability of each other IMP. Whenever any IMP goes down or
comes up, this information is broadcast to all other IMPs in an
extremely quick and reliable manner. If a source Host attempts
to send a packet to a Host on an unreachable IMP, no data is sent
across the network at all; the source IMP already knows that the
destination IMP cannot be reached, and tells the source Host
immediately.
Why don't IMPs maintain regular status information about all
ARPANET Hosts? It's not as if this is against the law, and under
certain conditions, it might be advantageous to do so. However,
the more entities about which regular status information is
maintained, the more bandwidth (trunk and CPU) and memory must be
devoted to handling the information. With a potentially
unbounded number of Hosts being able to connect to the ARPANET,
it does not seem feasible for all IMPs to maintain this status
information for every Host. Fortunately, it just is not as
important to maintain status information for Hosts as it is for
IMPs. Status information about the IMPs is necessary in order to
do routing, so failure to maintain this information regularly
would degrade the routing capability, with a consequent global
degradation in network service. Since Hosts, on the other hand,
are not used for storing-and-forwarding packets, routing does not
have to be so aware of Host status, and global degradations due
to incorrect assumptions about Host status are less likely.
If we can't expect ARPANET IMPs to maintain regular status
information for each Host, we certainly can't expect internet
gateways to maintain regular status information for each and
every Host in the internet. In fact, in the internet, the
situation is even worse. In the ARPANET, each IMP at least
maintains regular status information about the few Hosts to which
it is directly connected. This is simple enough to do, since the
number of Hosts on an IMP is bounded (barring the introduction of
local nets or port expanders) and there are machine instructions
to detect the state of the Ready Line. However, we can hardly
expect a gateway to maintain regular status information about all
the Hosts on all the networks to which the gateway is directly
connected. So we will suppose that in general, status
information about the Hosts which are homed to a particular
Switch will be obtained by that Switch on an exception basis, as
needed. Of course, saying that this will be true in general does
not mean that it must be universally true. If there are a few
Hosts somewhere that are major servers with many important
users scattered around the internet, there is no reason why the
Switches to which those servers are homed cannot maintain regular
status information about those few Hosts. If the number of such
special Hosts is kept small, this would not be prohibitively
expensive, and if these Hosts really do handle a large portion of
the internet traffic, this might be an important efficiency
savings.
If a source Switch knows that a particular destination Host
logical address can be mapped to any of a number of destination
Switches, then, as we have pointed out, it must be able to tell
when, due to some sort of failure or network partition, the
destination Host is (temporarily) unreachable via some particular
Switch. It must have that information in order to be able to
avoid choosing a destination Switch whose Pathway to the Host is
non-operational. If we agree that the Pathway up/down status
between a particular destination Switch and a particular
destination Host which is ordinarily homed to it can only be
obtained, on an exception basis, by that destination Switch
itself, it follows that this information can also only be
obtained by the source Switch on an exception basis. That is,
the only way for a source Switch to find out that a particular
Host temporarily cannot be reached through a particular
destination Switch is to send a message for that Host to that
Switch. The destination Switch must then determine that it has
no operational Pathway to that Host, and it must send back a
control message to the source Switch informing it of this fact.
(In IEN 183, we christened these messages "DNA messages", for
"Destination Not Accessible.") The source Switch will store this
information in its address translation tables, so that from then
on it does not choose a destination Switch whose Pathway to the
Host is down. (Of course, in addition to sending this control
information back to the source Switch, the putative destination
Switch should also try to forward the message it received to one
of the other Switches to which the destination Host is homed.)
This works well for as long as the Pathway between the
original destination Switch and the destination Host stays down.
Once that Pathway comes back up, however, we need some way of
informing the source Switch that the destination Switch is once
again usable for that Host. A simple and robust way to handle
this is
as follows. When a source Switch is informed, according to the
mechanism of the previous paragraph, that a particular
destination Switch cannot reach a particular destination Host
(without forwarding traffic through additional intermediate
Switches), it marks (in its address translation tables) that
Switch as UNUSABLE as a destination for that Host. However, this
information is reset periodically, say, every few minutes. In
effect, this approach would cause a source Switch which is
handling traffic for that destination Host to query the
destination Switch periodically to see if it has become usable
again. Note that no special control message is needed for the
querying. The querying is done simply by sending data addressed
to the destination Host to the destination Switch. If the
destination Switch is still unusable, no data is lost, since the
data can be readdressed by the destination Switch and sent to
some other destination Switch which does have an operational
Pathway to that destination Host. Note also that with this
scheme, not all source Switches will be in agreement as to which
destination Switches can be used to reach which destination Hosts
at some particular time. But this is not much of a problem, as
long as address translation is done only once, and not re-done at
each intermediate Switch. Further, any source Switch which tries
to use the wrong destination Switch will be told, via a DNA
message, to use another one.
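A minimal sketch of the source Switch's bookkeeping, in Python,
might look like the following. The class and method names are
hypothetical, and the reset is modeled here as a per-mark timeout
of 180 seconds, which is only one possible reading of "every few
minutes."

```python
class TranslationEntry:
    """Source-Switch state for one destination Host (illustrative)."""

    def __init__(self, switches, reset_interval=180.0):
        self.switches = list(switches)   # installed administratively
        self.reset_interval = reset_interval
        self.unusable_since = {}         # Switch name -> time DNA arrived

    def on_dna(self, switch, now):
        """Mark a destination Switch UNUSABLE for this Host."""
        self.unusable_since[switch] = now

    def usable(self, now):
        """Drop expired UNUSABLE marks, then list the Switches still usable."""
        expired = [s for s, t in self.unusable_since.items()
                   if now - t >= self.reset_interval]
        for s in expired:
            del self.unusable_since[s]
        return [s for s in self.switches if s not in self.unusable_since]
```

Once a mark expires, the next data sent for the Host may again go
to that Switch, which serves as the implicit query described above.
An empty list from usable() corresponds to the case in which every
potential destination Switch has been marked unusable.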
Lest there be any misunderstanding, we should emphasize that
we are not proposing this as a general mechanism for determining
which Hosts are homed to which Switches. That information is not
to be obtained dynamically at all, but rather is to be installed
in the translation tables at each Switch by the Network Control
Center (or whatever equivalent of the Network Control Center we
devise for the internet). This mechanism is only used to
determine that a Pathway which ORDINARILY exists between some
Switch and some Host is TEMPORARILY out of operation.
If a destination Host happens to be unreachable from EACH
potential destination Switch (which will happen if the Host is
down), this procedure will eventually result in the source Switch
marking all potential destination Switches unusable. Once this
happens, the source Switch should discard any data it receives
which is destined for that destination Host, and should return
some sort of negative acknowledgment to the source Host. The
source Host can then try again, every few minutes, to send more
data to the destination Host. Since the information marking a
destination Switch as unusable (for a particular destination
Host) is reset every few minutes, the source Host will be able to
establish communication with the destination Host soon after it
becomes reachable again. Strictly speaking, a negative
acknowledgment from the source Switch is not required, and the
current IP makes no provision for such a thing. Yet the
information contained in the negative acknowledgment might well
help the source Host to choose a suitable retransmission
interval. If a destination Host is unreachable, it makes sense
for a TCP to retransmit more infrequently than if the TCP has no
information at all about why it is not getting any
acknowledgments from the destination Host. Also, this
information would be useful to the end-user (if the various
protocol layers in his Host succeed in passing it back to him).
A user who is not getting any response from the system may want
to take a different action if he knows his destination cannot be
reached than if he thinks that the network (or internet) is just
slow.
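The two halves of this behavior can be sketched as below. The
result labels, the function names, and the factor of ten are all
invented for illustration; as noted, the current IP defines no
such negative acknowledgment.

```python
def dispatch(usable_switches):
    """Source Switch: forward if any destination Switch is usable, else NAK."""
    if usable_switches:
        return ("FORWARD", usable_switches[0])
    return ("NAK_UNREACHABLE", None)


def next_retransmit_delay(normal_delay, got_unreachable_nak, slow_factor=10):
    """Source Host: retransmit less often once told the Host is unreachable."""
    if got_unreachable_nak:
        return normal_delay * slow_factor
    return normal_delay
```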
This procedure, which is basically the same as the one we
recommended (in IEN 183) for use with logically addressed
multi-homed Hosts on the ARPANET, should resolve the partitioned
net problem. Our approach is not dissimilar to one proposed by
Sunshine and Postel in IEN 135. To quote them:
     A simpler solution to the partitioning problem follows the
     spirit of querying a database when things go wrong. Suppose
     there were another database listing networks and all the
     gateways attached to each net (whether up or down). This
     database would change slowly only as new equipment was added
     to the internet system. Further suppose that the gateways
     and internet routing are totally unaware of network
     partitions, except that gateways to partitioned nets find
     out when they cannot reach some Host on their own net. In
     this case, the gateway would return a Host Unreachable
     (through me) advisory message to the source. The source
     could then query the global database to get a list of all
     gateways to the destination net, and construct explicit
     source routes to the destination going through each of these
     gateways, trying each one in turn until it succeeded.
Note, however, that our proposal does not require any source
routing, because it is Switches (i.e., gateways) themselves which
are the addressable entities in our scheme, rather than networks
(though the authors quoted above were considering how to handle
the problem in the current Catenet environment, rather than how
to design a new environment). The database they propose can be
identified with the translation tables we have spoken of. Also,
our proposal handles the situation where a Pathway that was down
becomes usable again, a case they don't seem to mention.
It is sometimes claimed that hierarchical addressing
requires less table space than flat addressing, since there is no
need to have an entry in a translation table for each address.
We can see now that this is not true. If we wish to be able to
handle multi-homing, and in particular to handle the "partitioned
net" problem, we need to maintain table space for the Hosts with
which we are in communication. This is true no matter what kind
of addressing scheme we adopt.
Let's look now at how our scheme would handle the problem of
mobile Hosts, i.e., Hosts which move from one network to another.
We distinguish the case of "rapidly mobile" Hosts from the case
of "slowly mobile" Hosts. A Host is slowly mobile if its move
from one net to another can be made with enough lead time to
allow manual intervention to update the logical-to-physical
address translation tables. This case is handled simply by the
presence of the logical addressing. When the Host moves to
another network, it can still be addressed by the same name, but
the translation tables are changed so that the logical address is
now mapped to a different set of Switches. This creates some
work for the internet administration and control center, but is
completely transparent to higher level protocols, since the
logical address does not change. On the other hand, we consider
a Host to be rapidly mobile if it moves from one net to another
too quickly or too frequently to allow the procedure of modifying
the address translation tables to be feasible. If we can know in
advance that there is some limited set of networks to which that
Host might connect, we can map the logical address of that Host
onto the set of all gateways which connect to any of those
networks. Our procedure for choosing one gateway to use as the
destination gateway might be as follows. Try the first gateway
on the list. If a DNA message is received, try the second, and
so on. Once a source gateway begins sending traffic for a mobile
Host to a particular destination gateway, it should always
continue to use that gateway, until it receives a DNA message, in
which case it should try the next one. You will note that this
procedure is very similar to that used for non-mobile Hosts. In
fact, it might be entirely identical. The only possible
difference is that we might want to be much more reluctant to
switch from one destination gateway to another in the case of
mobile Hosts than in the case of non-mobile Hosts, since we
expect that a mobile Host will not generally be reachable through
all of the potential destination gateways at any given time.
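The "stick with one gateway until a DNA arrives" rule for rapidly
mobile Hosts can be sketched as follows. The class name is
hypothetical, and wrapping around to the first gateway after the
last one has been tried is our assumption; the text says only to
try the next one.

```python
class MobileHostEntry:
    """Source-gateway state for one rapidly mobile Host (illustrative)."""

    def __init__(self, gateways):
        self.gateways = list(gateways)  # all gateways on the candidate nets
        self.current = 0                # index of the gateway now in use

    def destination(self):
        """Keep using the same gateway until told otherwise."""
        return self.gateways[self.current]

    def on_dna(self, gateway):
        """Advance only if the DNA refers to the gateway currently in use."""
        if gateway == self.gateways[self.current]:
            self.current = (self.current + 1) % len(self.gateways)
```

Ignoring a DNA message that names a gateway no longer in use is
one simple way to express the reluctance to switch that mobile
Hosts seem to call for.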