RFC 9378 | IOAM Deployment | April 2023 |
Brockners, et al. | Informational | [Page] |
In situ Operations, Administration, and Maintenance (IOAM) collects operational and telemetry information in the packet while the packet traverses a path between two points in the network. This document provides a framework for IOAM deployment and provides IOAM deployment considerations and guidance.¶
This document is not an Internet Standards Track specification; it is published for informational purposes.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841.¶
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9378.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
In situ Operations, Administration, and Maintenance (IOAM) collects OAM information within the packet while the packet traverses a particular network domain. The term "in situ" refers to the fact that the OAM data is added to the data packets rather than being sent within packets specifically dedicated to OAM. IOAM complements mechanisms such as Ping, Traceroute, or other active probing mechanisms. In terms of "active" or "passive" OAM, IOAM can be considered a hybrid OAM type. In situ mechanisms do not require extra packets to be sent. IOAM adds information to the already available data packets and, therefore, cannot be considered passive. In terms of the classification given in [RFC7799], IOAM could be portrayed as Hybrid Type I. IOAM mechanisms can be leveraged where mechanisms using, e.g., ICMP do not apply or do not offer the desired results. These situations could include:¶
Abbreviations used in this document:¶
[RFC9197] defines the scope of IOAM as well as the different types of IOAM nodes. For improved readability, this section provides a brief overview of where IOAM applies, along with explaining the main roles of nodes that employ IOAM. Please refer to [RFC9197] for further details.¶
IOAM is focused on "limited domains", as defined in [RFC8799]. IOAM is not targeted for a deployment on the global Internet. The part of the network that employs IOAM is referred to as the "IOAM-Domain". For example, an IOAM-Domain can include an enterprise campus using physical connections between devices or an overlay network using virtual connections or tunnels for connectivity between said devices. An IOAM-Domain is defined by its perimeter or edge. The operator of an IOAM-Domain is expected to put provisions in place to ensure that packets that contain IOAM data fields do not leak beyond the edge of an IOAM-Domain, e.g., using packet filtering methods. The operator should consider the potential operational impact of IOAM on mechanisms such as ECMP load-balancing schemes (e.g., load-balancing schemes based on packet length could be impacted by the increased packet size due to IOAM), path MTU (i.e., ensure that the MTU of all links within a domain is large enough to support the increased packet size due to IOAM), and ICMP message handling.¶
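The path MTU consideration above comes down to simple arithmetic. The following is a minimal sketch, assuming a trace that records a fixed amount of data per hop; the function names and all byte counts are hypothetical operator-supplied inputs and are not taken from [RFC9197].

```python
def trace_overhead(option_header_len: int, node_data_len: int, max_hops: int) -> int:
    """Worst-case bytes added by IOAM (all three inputs are assumed values)."""
    return option_header_len + node_data_len * max_hops


def fits_min_link_mtu(payload_len: int, encap_len: int,
                      ioam_overhead: int, min_link_mtu: int) -> bool:
    """True if the largest expected packet still fits the smallest link MTU."""
    return payload_len + encap_len + ioam_overhead <= min_link_mtu


# Hypothetical numbers: 8-byte option header, 12 bytes per hop, 10 hops.
overhead = trace_overhead(8, 12, 10)
print(overhead, fits_min_link_mtu(1300, 40, overhead, 1500))
```

A check of this kind could be run at planning time for every encapsulation in use, since the overhead differs per protocol.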
An IOAM-Domain consists of "IOAM encapsulating nodes", "IOAM decapsulating nodes", and "IOAM transit nodes". The role of a node (i.e., encapsulating, transit, decapsulating) is defined within an IOAM-Namespace (see below). A node can have different roles in different IOAM-Namespaces.¶
An IOAM encapsulating node incorporates one or more IOAM Option-Types into packets that IOAM is enabled for. If IOAM is enabled for a selected subset of the traffic, the IOAM encapsulating node is responsible for applying the IOAM functionality to the selected subset.¶
An IOAM transit node updates one or more of the IOAM-Data-Fields. If both the Pre-allocated and the Incremental Trace Option-Types are present in the packet, each IOAM transit node will update at most one of these Option-Types. Note that in case both Trace Option-Types are present in a packet, it is up to the IOAM data processing systems (see Section 6) to integrate the data from the two Trace Option-Types to form a view of the entire journey of the packet. A transit node does not add new IOAM Option-Types to a packet and does not change the IOAM-Data-Fields of an IOAM Edge-to-Edge (E2E) Option-Type.¶
An IOAM decapsulating node removes any IOAM Option-Types from packets.¶
The role of an IOAM encapsulating, IOAM transit, or IOAM decapsulating node is always performed within a specific IOAM-Namespace. This means that an IOAM node that is, e.g., an IOAM decapsulating node for IOAM-Namespace "A" but not for IOAM-Namespace "B" will only remove the IOAM Option-Types for IOAM-Namespace "A" from the packet. An IOAM decapsulating node situated at the edge of an IOAM-Domain removes all IOAM Option-Types and associated encapsulation headers for all IOAM-Namespaces from the packet.¶
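The namespace-scoped behavior of a decapsulating node can be sketched as a simple filter. This is an illustrative model only: the `IoamOption` type, the integer namespace identifiers, and the `at_domain_edge` parameter are assumptions made for the example and do not reflect any wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoamOption:
    namespace_id: int  # hypothetical identifier of the IOAM-Namespace
    option_type: str   # illustrative label, e.g. "trace" or "e2e"


def decapsulate(options, decap_namespaces, at_domain_edge=False):
    """Return the IOAM options that remain in the packet after this node."""
    if at_domain_edge:
        # A node at the edge of the IOAM-Domain removes everything.
        return []
    # Otherwise only options of namespaces this node decapsulates are removed.
    return [o for o in options if o.namespace_id not in decap_namespaces]


opts = [IoamOption(1, "trace"), IoamOption(2, "e2e")]
print(decapsulate(opts, {1}))
print(decapsulate(opts, {1}, at_domain_edge=True))
```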
IOAM-Namespaces allow for a namespace-specific definition and interpretation of IOAM-Data-Fields. Please refer to Section 7.1 for a discussion of IOAM-Namespaces.¶
IOAM nodes that add or remove the IOAM-Data-Fields can also update the IOAM-Data-Fields at the same time. Or, in other words, IOAM encapsulating or decapsulating nodes can also serve as IOAM transit nodes at the same time. Note that not every node in an IOAM-Domain needs to be an IOAM transit node. For example, a deployment might require that packets traverse a set of firewalls that support IOAM. In that case, only the set of firewall nodes would be IOAM transit nodes rather than all nodes.¶
IOAM supports different modes of operation. These modes are differentiated by the type of IOAM data fields that are being carried in the packet, the data being collected, the type of nodes that collect or update data, and if and how nodes export IOAM information.¶
OAM information about each IOAM node a packet traverses is collected and stored within the user data packet as the packet progresses through the IOAM-Domain. Potential uses of IOAM per-hop tracing include:¶
"IOAM tracing data" is expected to be collected at every IOAM transit node that a packet traverses to ensure visibility into the entire path that a packet takes within an IOAM-Domain. In other words, in a typical deployment, all nodes in an IOAM-Domain would participate in IOAM and, thus, be IOAM transit nodes, IOAM encapsulating nodes, or IOAM decapsulating nodes. If not all nodes within a domain are IOAM capable, IOAM tracing information (i.e., node data, see below) will only be collected on those nodes that are IOAM capable. Nodes that are not IOAM capable will forward the packet without any changes to the IOAM-Data-Fields. The maximum number of hops and the minimum path MTU of the IOAM-Domain are assumed to be known.¶
IOAM offers two different Trace Option-Types: the "Incremental" Trace Option-Type and the "Pre-allocated" Trace Option-Type. For a discussion about which of the two option types is the most suitable for an implementation and/or deployment, see Section 7.3.¶
Every node data entry holds information for a particular IOAM transit node that is traversed by a packet. The IOAM decapsulating node removes any IOAM Option-Types and processes and/or exports the associated data. All IOAM-Data-Fields are defined in the context of an IOAM-Namespace.¶
IOAM tracing can, for example, collect the following types of information:¶
The Incremental Trace Option-Type and Pre-allocated Trace Option-Type are defined in [RFC9197].¶
The IOAM Proof of Transit Option-Type is to support path or service function chain [RFC7665] verification use cases. Proof of transit could use methods like nested hashing or nested encryption of the IOAM data.¶
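To illustrate what "nested hashing" could look like, the sketch below folds a per-node secret into a cumulative value at every hop and lets a verifier recompute the expected result. It is an assumption-laden toy, not the mechanism specified for the Proof of Transit Option-Type: the use of HMAC-SHA-256, the per-packet seed, and all function names are hypothetical.

```python
import hashlib
import hmac


def pot_update(cumulative: bytes, node_key: bytes) -> bytes:
    """Fold one node's secret into the value carried in the packet."""
    return hmac.new(node_key, cumulative, hashlib.sha256).digest()


def pot_expected(seed: bytes, path_keys) -> bytes:
    """Value the verifier expects if the packet visited path_keys in order."""
    value = seed
    for key in path_keys:
        value = pot_update(value, key)
    return value


def pot_verify(seed: bytes, received: bytes, path_keys) -> bool:
    """Constant-time comparison of the received value with the expected one."""
    return hmac.compare_digest(received, pot_expected(seed, path_keys))


keys = [b"node-a", b"node-b", b"node-c"]  # hypothetical per-node secrets
seed = b"per-packet-nonce"                # hypothetical seed
print(pot_verify(seed, pot_expected(seed, keys), keys))
print(pot_verify(seed, pot_expected(seed, [b"node-a", b"node-c"]), keys))
```

Because each step depends on the previous output, a packet that skips a node or visits nodes in a different order yields a different final value.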
The IOAM Proof of Transit Option-Type consists of a fixed-size "IOAM Proof of Transit Option header" and "IOAM Proof of Transit Option data fields". For details, see [RFC9197].¶
The IOAM E2E Option-Type carries the data that is added by the IOAM encapsulating node and interpreted by the IOAM decapsulating node. The IOAM transit nodes may process the data but must not modify it.¶
The IOAM E2E Option-Type consists of a fixed-size "IOAM Edge-to-Edge Option-Type header" and "IOAM Edge-to-Edge Option-Type data fields". For details, see [RFC9197].¶
Direct Export is an IOAM mode of operation within which IOAM data are to be directly exported to a collector rather than be collected within the data packets. The IOAM Direct Export Option-Type consists of a fixed-size "IOAM direct export option header". Direct Export for IOAM is defined in [RFC9326].¶
IOAM data fields and associated data types for IOAM are defined in [RFC9197]. The IOAM data field can be transported by a variety of transport protocols, including NSH, Segment Routing, Geneve, BIER, IPv6, etc.¶
IOAM encapsulation for IPv6 is defined in [IOAM-IPV6-OPTIONS], which also discusses IOAM deployment considerations for IPv6 networks.¶
IOAM encapsulation for GRE is outlined as part of the "EtherType Protocol Identification of In-situ OAM Data" in [IOAM-ETH].¶
IOAM encapsulation for Geneve is defined in [IOAM-GENEVE].¶
IOAM encapsulation for Segment Routing is defined in [MPLS-IOAM].¶
IOAM encapsulation for Segment Routing over IPv6 is defined in [IOAM-SRV6].¶
IOAM encapsulation for VXLAN-GPE is defined in [IOAM-VXLAN-GPE].¶
IOAM nodes collect information for packets traversing a domain that supports IOAM. IOAM decapsulating nodes, as well as IOAM transit nodes, can choose to retrieve IOAM information from the packet, process the information further, and export the information using, e.g., IP Flow Information Export (IPFIX).¶
Raw data export of IOAM data using IPFIX is discussed in [IOAM-RAWEXPORT]. "Raw export of IOAM data" refers to a mode of operation where a node exports the IOAM data as it is received in the packet. The exporting node does not interpret, aggregate, or reformat the IOAM data before it is exported. Raw export of IOAM data is to support an operational model where the processing and interpretation of IOAM data is decoupled from the operation of encapsulating/updating/decapsulating IOAM data, which is also referred to as "IOAM data-plane operation". Figure 2 shows the separation of concerns for IOAM export. Exporting IOAM data is performed by the "IOAM node", which performs IOAM data-plane operation, whereas the interpretation of IOAM data is performed by one or several IOAM data processing systems. The separation of concerns is to offload interpretation, aggregation, and formatting of IOAM data from the node that performs data-plane operations. In other words, a node that is focused on data-plane operations, i.e., forwarding of packets and handling IOAM data, will not be tasked to also interpret the IOAM data. Instead, that node can leave this task to another system or a set of systems. For scalability reasons, a single IOAM node could choose to export IOAM data to several systems that process IOAM data. Similarly, several monitoring systems or analytics systems can be used to further process the data received from the IOAM preprocessing systems. Figure 2 shows an overview of IOAM export, including IOAM data processing systems and monitoring and analytics systems.¶
This section describes several concepts of IOAM and provides considerations that need to be taken into account when implementing IOAM in a network domain. This includes concepts like IOAM-Namespaces, IOAM Layering, traffic-sets that IOAM is applied to, and IOAM Loopback. For a definition of IOAM-Namespaces and IOAM Layering, please refer to [RFC9197]. IOAM Loopback is defined in [RFC9322].¶
IOAM-Namespaces add further context to IOAM Option-Types and associated IOAM-Data-Fields. IOAM-Namespaces are defined in Section 4.3 of [RFC9197]. The Namespace-ID is part of the IOAM Option-Type definition. See Section 4.4 of [RFC9197] for IOAM Trace Option-Types or Section 4.6 of [RFC9197] for the IOAM E2E Option-Type. IOAM-Namespaces support several uses:¶
IOAM-Namespaces can be used to identify different sets of devices (e.g., different types of devices) in a deployment. If an operator desires to insert different IOAM-Data-Fields based on the device, the devices could be grouped into multiple IOAM-Namespaces. This could be due to the fact that the IOAM feature set differs between different sets of devices, or it could be for reasons of optimized space usage in the packet header. It could also stem from hardware or operational limitations on the size of the trace data that can be added and processed, preventing collection of a full trace for a flow.¶
If several encapsulation protocols (e.g., in case of tunneling) are stacked on top of each other, IOAM-Data-Fields could be present in different protocol fields at different layers. Layering allows operators to instrument the protocol layer they want to measure. The behavior follows the ships-in-the-night model, i.e., IOAM-Data-Fields in one layer are independent of IOAM-Data-Fields in another layer. In other words, even though the term "layering" often implies some form of hierarchy and relationship, in IOAM, layers are independent of each other and do not assume any relationship among them. The different layers could, but do not have to, share the same IOAM encapsulation mechanisms. Similarly, the semantics of the IOAM-Data-Fields can, but do not have to, be associated across different layers. For example, a node that inserts node-id information into two different layers could use "node-id=10" for one layer and "node-id=1000" for the second layer.¶
Figure 3 shows an example of IOAM Layering. The figure shows a Geneve tunnel carried over IPv6, which starts at node A and ends at node D. IOAM information is encapsulated in IPv6 as well as in Geneve. At the IPv6 layer, node A is the IOAM encapsulating node (into IPv6), node D is the IOAM decapsulating node, and nodes B and C are IOAM transit nodes. At the Geneve layer, node A is the IOAM encapsulating node (into Geneve), and node D is the IOAM decapsulating node (from Geneve). The use of IOAM at both layers, as shown in the example here, could be used to reveal which nodes of an underlay (here the IPv6 network) are traversed by a tunneled packet in an overlay (here the Geneve network). This assumes that the IOAM information that nodes A and D encapsulate into Geneve and into IPv6 can be associated with each other.¶
IOAM offers two different IOAM Option-Types for tracing: "Incremental" Trace Option-Type and "Pre-allocated" Trace Option-Type. "Incremental" refers to a mode of operation where the packet is expanded at every IOAM node that adds IOAM-Data-Fields. "Pre-allocated" describes a mode of operation where the IOAM encapsulating node allocates room for all IOAM-Data-Fields in the entire IOAM-Domain. More specifically:¶
Which IOAM Trace Option-Types can be supported is not only a function of operator-defined configuration but may also be limited by protocol constraints unique to a given encapsulating protocol. For encapsulating protocols that support both IOAM Trace Option-Types, the operator decides, by means of configuration, which Trace Option-Type(s) will be used for a particular domain. For example, if different types of packet forwarders and associated different types of IOAM implementations exist in a deployment and the encapsulating protocol supports both IOAM Trace Option-Types, a deployment can mix devices that include either the Incremental Trace Option-Type or the Pre-allocated Trace Option-Type. As a result, both Option-Types can be present in a packet. IOAM decapsulating nodes remove both types of Trace Option-Types from the packet.¶
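The difference between the two Trace Option-Types can be pictured with a small model. The sketch below is a simplification under stated assumptions: the class names, the slot-based layout, and the `max_hops` growth limit are hypothetical, and no field layout from [RFC9197] is reproduced.

```python
class PreallocatedTrace:
    """Encapsulating node reserves room for every hop up front."""

    def __init__(self, max_hops: int):
        self.slots = [None] * max_hops
        self.remaining = max_hops

    def record(self, node_data) -> bool:
        if self.remaining == 0:
            return False  # no free slot left; nothing is written
        self.slots[len(self.slots) - self.remaining] = node_data
        self.remaining -= 1
        return True

    def size(self) -> int:
        return len(self.slots)  # constant along the whole path


class IncrementalTrace:
    """Each node that adds data expands the packet."""

    def __init__(self, max_hops: int):
        self.entries = []
        self.max_hops = max_hops  # assumed growth limit

    def record(self, node_data) -> bool:
        if len(self.entries) >= self.max_hops:
            return False
        self.entries.append(node_data)
        return True

    def size(self) -> int:
        return len(self.entries)  # grows hop by hop


pre, inc = PreallocatedTrace(3), IncrementalTrace(3)
for node in ("A", "B"):
    pre.record(node)
    inc.record(node)
print(pre.size(), inc.size())
```

In this model the pre-allocated trace keeps the packet length constant after encapsulation, while the incremental trace only consumes space for nodes that actually record data.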
The two different Option-Types cater to different packet-forwarding infrastructures and allow an optimized implementation of IOAM tracing:¶
IOAM can be deployed on all or only on subsets of the live user traffic, e.g., per interface, based on an access control list or flow specification defining a specific set of traffic, etc.¶
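Selecting the traffic subset at an IOAM encapsulating node could be modeled as a match against a list of rules. The sketch below is illustrative only: the `FlowRule` type, its two fields, and the interface names are assumptions, and a real access control list or flow specification would typically match on more fields.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRule:
    interface: Optional[str]  # None matches any ingress interface
    dst_prefix: str           # destination prefix in CIDR notation


def ioam_enabled(rules, interface: str, dst_addr: str) -> bool:
    """True if IOAM should be applied to a packet with these properties."""
    addr = ipaddress.ip_address(dst_addr)
    for rule in rules:
        if rule.interface not in (None, interface):
            continue
        # Membership is False when address and prefix families differ.
        if addr in ipaddress.ip_network(rule.dst_prefix):
            return True
    return False


rules = [FlowRule("eth0", "2001:db8::/32"), FlowRule(None, "192.0.2.0/24")]
print(ioam_enabled(rules, "eth0", "2001:db8::1"))
print(ioam_enabled(rules, "eth1", "2001:db8::1"))
```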
IOAM Loopback is used to trigger each transit device along the path of a packet to send a copy of the data packet back to the source. Loopback allows an IOAM encapsulating node to trace the path to a given destination and to receive per-hop data about both the forward and the return path. Loopback is enabled by the encapsulating node setting the Loopback flag. Looped-back packets use the source address of the original packet as a destination address and the address of the node that performs the Loopback operation as source address. Nodes that loop back a packet clear the Loopback flag before sending the copy back towards the source. Loopback applies to IOAM deployments where the encapsulating node is either a host or the start of a tunnel. For details on IOAM Loopback, please refer to [RFC9322].¶
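The address handling described above can be sketched as follows. The `Packet` type and the `loop_back` function are hypothetical names used for illustration; the model covers only the address swap and the clearing of the Loopback flag, not the full processing defined in [RFC9322].

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    loopback: bool  # state of the Loopback flag


def loop_back(pkt: Packet, node_addr: str) -> Optional[Packet]:
    """Return the copy a transit node sends toward the source, if any."""
    if not pkt.loopback:
        return None
    # Copy goes to the original source, from this node, with the flag cleared.
    return replace(pkt, src=node_addr, dst=pkt.src, loopback=False)


original = Packet(src="2001:db8::a", dst="2001:db8::d", loopback=True)
copy = loop_back(original, "2001:db8::b")
print(copy)
```

Clearing the flag in the copy means that nodes on the return path do not loop the copy back again, while the original packet continues toward its destination unchanged.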
The Active flag indicates that a packet is an active OAM packet as opposed to regular user data traffic. The Active flag is expected to be used for active measurement using IOAM. For details on the Active flag, please refer to [RFC9322].¶
Example use cases for the Active flag include:¶
A network can consist of a mix of IOAM-aware and IOAM-unaware nodes. The encapsulation of IOAM-Data-Fields into different protocols (see also Section 5) is defined such that data packets that include IOAM-Data-Fields do not get dropped by IOAM-unaware nodes. For example, packets that contain the IOAM Trace Option-Types in IPv6 Hop-by-Hop extension headers are defined with bits to indicate "00 - skip over this option and continue processing the header". This will ensure that when an IOAM-unaware node receives a packet with IOAM-Data-Fields included, it does not drop the packet.¶
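The "00" bits mentioned above are the two highest-order bits of the IPv6 Option Type octet, which tell a node what to do with an option it does not recognize; the third-highest bit indicates whether the option data may change en route. The sketch below decodes these bits. The function names are hypothetical, and the octet 0x31 is used only as an example value whose two highest-order bits are 00.

```python
# Meaning of the two highest-order bits of an IPv6 Option Type octet.
ACTIONS = {
    0b00: "skip over this option and continue processing the header",
    0b01: "discard the packet",
    0b10: "discard the packet and send an ICMP Parameter Problem message",
    0b11: "discard the packet and send ICMP unless the destination is multicast",
}


def unrecognized_action(option_type: int) -> str:
    """Action a node takes if it does not recognize this option type."""
    return ACTIONS[(option_type >> 6) & 0b11]


def may_change_en_route(option_type: int) -> bool:
    """True if the third-highest-order bit marks the data as mutable."""
    return bool(option_type & 0x20)


print(unrecognized_action(0x31), may_change_en_route(0x31))
```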
Deployments that leverage the IOAM Trace Option-Type(s) could benefit from the ability to detect the presence of IOAM-unaware nodes, i.e., nodes that forward the packet but do not update or add IOAM-Data-Fields in IOAM Trace Option-Types. The node data that is defined as part of the IOAM Trace Option-Type(s) includes a Hop_Lim field associated with the node identifier to detect missed nodes, i.e., "holes" in the trace. Monitoring/Analytics systems could utilize this information to account for the presence of IOAM-unaware nodes in the network.¶
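A monitoring or analytics system could locate such holes by comparing the Hop_Lim values of consecutive node data entries. The following is a minimal sketch, assuming that entries are available in path order and that every forwarding node, IOAM-aware or not, decrements the hop limit by exactly one; the function name and the tuple representation are hypothetical.

```python
def find_holes(trace):
    """Find gaps in a trace given as [(node_id, hop_lim), ...] in path order.

    Returns (previous_node, next_node, missed_hops) for every gap.
    """
    holes = []
    for (prev_id, prev_lim), (next_id, next_lim) in zip(trace, trace[1:]):
        missed = prev_lim - next_lim - 1
        if missed > 0:
            holes.append((prev_id, next_id, missed))
    return holes


# Nodes 1 and 2 are adjacent; two unrecorded hops lie between nodes 2 and 5.
print(find_holes([(1, 64), (2, 63), (5, 60)]))
```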
The YANG model for configuring IOAM in network nodes that support IOAM is defined in [IOAM-YANG].¶
A deployment can leverage IOAM profiles to limit the scope of IOAM features, allowing simpler implementation, verification, and interoperability testing in the context of specific use cases that do not require the full functionality of IOAM. An IOAM profile defines a use case or a set of use cases for IOAM and an associated set of rules that restrict the scope and features of the IOAM specification, thereby limiting it to a subset of the full functionality. IOAM profiles are defined in [IOAM-PROFILES].¶
For deployments where the IOAM capabilities of a node are unknown, [RFC9359] could be used to discover the enabled IOAM capabilities of nodes.¶
This document has no IANA actions.¶
As discussed in [RFC7276], a successful attack on an OAM protocol in general and, specifically, on IOAM can prevent the detection of failures or anomalies or can create a false illusion of nonexistent ones.¶
The Proof of Transit Option-Type (Section 4.2) is used for verifying the path of data packets. The security considerations of POT are further discussed in [PROOF-OF-TRANSIT].¶
Security considerations related to the use of IOAM flags, particularly the Loopback flag, are found in [RFC9322].¶
IOAM data can be subject to eavesdropping. Although the confidentiality of the user data is not at risk in this context, the IOAM data elements can be used for network reconnaissance, allowing attackers to collect information about network paths, performance, queue states, buffer occupancy, and other information. Reconnaissance is an improbable security threat in an IOAM deployment that is within a confined physical domain. However, in deployments that are not confined to a single LAN but span multiple interconnected sites (for example, using an overlay network), the inter-site links are expected to be secured (e.g., by IPsec) in order to avoid external eavesdropping and introduction of malicious or false data. Another possible mitigation approach is to use "Direct Exporting" [RFC9326]. In this case, the IOAM-related trace information would not be available in the customer data packets but would trigger the exporting of (secured) packet-related IOAM information at every node. IOAM data export and securing IOAM data export are outside the scope of this document.¶
IOAM can be used as a means for implementing or amplifying Denial-of-Service (DoS) attacks. For example, a malicious attacker can add an IOAM header to packets or modify an IOAM header in en route packets in order to consume the resources of network devices that take part in IOAM or collectors that analyze the IOAM data. Another example is a packet-length attack, in which an attacker pushes headers associated with IOAM Option-Types into data packets, causing these packets to be increased beyond the MTU size, resulting in fragmentation or in packet drops. Such DoS attacks can be mitigated by deploying IOAM in confined administrative domains and by limiting the rate and/or the percentage of packets that an IOAM encapsulating node adds IOAM information to as well as limiting rate and/or percentage of packets that an IOAM transit or an IOAM decapsulating node creates to export IOAM information extracted from the data packets that carry IOAM information.¶
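One way to limit the rate at which an IOAM encapsulating node adds IOAM information, or at which a transit or decapsulating node exports it, is a token bucket. The sketch below is one possible approach and not a mechanism defined by IOAM; the class name, the parameters, and the caller-supplied clock are assumptions made for the example.

```python
class TokenBucket:
    """Allow at most `rate_per_s` packets per second with bursts up to `burst`."""

    def __init__(self, rate_per_s: float, burst: float):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = burst
        self.last = 0.0  # time of the previous decision, in seconds

    def allow(self, now: float) -> bool:
        """Decide whether IOAM may be applied to a packet seen at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: forward the packet without IOAM


bucket = TokenBucket(rate_per_s=1.0, burst=2.0)
print([bucket.allow(0.0) for _ in range(3)], bucket.allow(1.0))
```

Packets that exceed the limit are still forwarded; they are simply not instrumented or exported, which bounds the resources an attacker can consume.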
Even though IOAM is focused on limited domains [RFC8799], there might be deployments for which it is important for IOAM transit nodes and IOAM decapsulating nodes to know that the data received have not been tampered with. In those cases, the IOAM data should be integrity protected. Integrity protection of IOAM data fields is described in [IOAM-DATA-INTEGRITY]. In addition, since IOAM options may include timestamps, if network devices use synchronization protocols, then any attack on the time protocol [RFC7384] can compromise the integrity of the timestamp-related data fields. Synchronization attacks can be mitigated by combining a secured time distribution scheme, e.g., [RFC8915], with redundant clock sources [RFC5905] and/or redundant network paths for the time distribution protocol [RFC8039].¶
At the management plane, attacks may be implemented by misconfiguring or by maliciously configuring IOAM-enabled nodes in a way that enables other attacks. Thus, IOAM configuration should be secured in a way that authenticates authorized users and verifies the integrity of configuration procedures.¶
Notably, IOAM is expected to be deployed in limited network domains [RFC8799], thus, confining the potential attack vectors within the limited domain. Indeed, in order to limit the scope of threats within the current network domain, the network operator is expected to enforce policies that prevent IOAM traffic from leaking outside the IOAM-Domain and prevent an attacker from introducing malicious or false IOAM data to be processed and used within the IOAM-Domain. IOAM data leakage could lead to privacy issues. Consider an IOAM encapsulating node that is a home gateway in an operator's network. A home gateway is often identified with an individual. Revealing IOAM data, such as "IOAM node identifier" or geolocation information outside of the limited domain, could be harmful for that user. Note that Direct Exporting [RFC9326] can mitigate the potential threat of IOAM data leaking through data packets.¶
The authors would like to thank Tal Mizrahi, Eric Vyncke, Nalini Elkins, Srihari Raghavan, Ranganathan T S, Barak Gafni, Karthik Babu Harichandra Babu, Akshaya Nadahalli, LJ Wobker, Erik Nordmark, Vengada Prasad Govindan, Andrew Yourtchenko, Aviv Kfir, Tianran Zhou, Zhenbin (Robin), Joe Clarke, Al Morton, Tom Herbert, Haoyu Song, and Mickey Spiegel for the comments and advice on IOAM.¶