RFC 9414 | Predictable Transient Numeric IDs | July 2023 |
Gont & Arce | Informational | [Page] |
This document analyzes the timeline of the specification and implementation of different types of "transient numeric identifiers" used in IETF protocols and how the security and privacy properties of such protocols have been affected as a result of it. It provides empirical evidence that advice in this area is warranted. This document is a product of the Privacy Enhancements and Assessments Research Group (PEARG) in the IRTF.¶
This document is not an Internet Standards Track specification; it is published for informational purposes.¶
This document is a product of the Internet Research Task Force (IRTF). The IRTF publishes the results of Internet-related research and development activities. These results might not be suitable for deployment. This RFC represents the consensus of the Privacy Enhancements and Assessments Research Group of the Internet Research Task Force (IRTF). Documents approved for publication by the IRSG are not candidates for any level of Internet Standard; see Section 2 of RFC 7841.¶
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9414.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.¶
Networking protocols employ a variety of transient numeric identifiers for different protocol objects, such as IPv4 and IPv6 Identification values [RFC0791] [RFC8200], IPv6 Interface Identifiers (IIDs) [RFC4291], transport-protocol ephemeral port numbers [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC9293], NTP Reference IDs (REFIDs) [RFC5905], and DNS IDs [RFC1035]. These identifiers typically have specific requirements (e.g., uniqueness during a specified period of time) that must be satisfied such that they do not result in negative interoperability implications and an associated failure severity when such requirements are not met [RFC9415].¶
For more than 30 years, a large number of implementations of IETF protocols have been subject to a variety of attacks, with effects ranging from Denial of Service (DoS) or data injection to information leakages that could be exploited for pervasive monitoring [RFC7258]. The root cause of these issues has been, in many cases, the poor selection of transient numeric identifiers in such protocols, usually as a result of insufficient or misleading specifications. While it is generally trivial to identify an algorithm that can satisfy the interoperability requirements of a given transient numeric identifier, empirical evidence exists that doing so without negatively affecting the security and/or privacy properties of the aforementioned protocols is prone to error.¶
For example, implementations have been subject to security and/or privacy issues resulting from:¶
Recent history indicates that, when new protocols are standardized or new protocol implementations are produced, the security and privacy properties of the associated transient numeric identifiers tend to be overlooked, and inappropriate algorithms to generate such identifiers are either suggested in the specifications or selected by implementers. As a result, advice in this area is warranted.¶
This document contains a non-exhaustive timeline of the specification and vulnerability disclosures related to some sample transient numeric identifiers, including other work that has led to advances in this area. This analysis indicates that:¶
While it is generally possible to identify an algorithm that can satisfy the interoperability requirements for a given transient numeric identifier, this document provides empirical evidence that doing so without negatively affecting the security and/or privacy properties of the corresponding protocols is nontrivial. Other related documents ([RFC9415] and [RFC9416]) provide guidance in this area, as motivated by the present document.¶
This document represents the consensus of the Privacy Enhancements and Assessments Research Group (PEARG).¶
The terms "constant IID", "stable IID", and "temporary IID" are to be interpreted as defined in [RFC7721].¶
Throughout this document, we do not consider on-path attacks. That is, we assume the attacker does not have physical or logical access to the system(s) being attacked and that the attacker can only observe traffic explicitly directed to the attacker. Similarly, an attacker cannot observe traffic transferred between the sender and the receiver(s) of a target protocol but may be able to interact with any of these entities, including by, e.g., sending any traffic to them to sample transient numeric identifiers employed by the target hosts when communicating with the attacker.¶
For example, when analyzing vulnerabilities associated with TCP Initial Sequence Numbers (ISNs), we consider the attacker is unable to capture network traffic corresponding to a TCP connection between two other hosts. However, we consider the attacker is able to communicate with any of these hosts (e.g., establish a TCP connection with any of them) to, e.g., sample the TCP ISNs employed by these hosts when communicating with the attacker.¶
Similarly, when considering host-tracking attacks based on IPv6 Interface Identifiers, we consider an attacker may learn the IPv6 address employed by a victim host if, e.g., the address becomes exposed as a result of the victim host communicating with an attacker-operated server. Subsequently, an attacker may perform host-tracking by probing a set of target addresses composed by a set of target prefixes and the IPv6 Interface Identifier originally learned by the attacker. Alternatively, an attacker may perform host-tracking if, e.g., the victim host communicates with an attacker-operated server as it moves from one location to another, thereby exposing its configured addresses. We note that none of these scenarios require the attacker observe traffic not explicitly directed to the attacker.¶
While assessing IETF protocol specifications regarding the use of transient numeric identifiers, we have found that most of the issues discussed in this document arise as a result of one of the following conditions:¶
A number of IETF protocol specifications under specified their transient numeric identifiers, thus leading to implementations that were vulnerable to numerous off-path attacks. Examples of them are the specification of TCP local ports in [RFC0793] or the specification of the DNS ID in [RFC1035].¶
On the other hand, there are a number of IETF protocol specifications that over specify some of their associated transient numeric identifiers. For example, [RFC4291] essentially overloads the semantics of IPv6 Interface Identifiers (IIDs) by embedding link-layer addresses in the IPv6 IIDs when the interoperability requirement of uniqueness could be achieved in other ways that do not result in negative security and privacy implications [RFC7721]. Similarly, [RFC2460] suggests the use of a global counter for the generation of Identification values when the interoperability requirement of uniqueness per {IPv6 Source Address, IPv6 Destination Address} could be achieved with other algorithms that do not result in negative security and privacy implications [RFC7739].¶
Finally, there are protocol implementations that simply fail to comply with existing protocol specifications. For example, some popular operating systems still fail to implement transport-protocol ephemeral port randomization, as recommended in [RFC6056], or TCP Initial Sequence Number randomization, as recommended in [RFC9293].¶
The following subsections document the timelines for a number of sample transient numeric identifiers that illustrate how the problem discussed in this document has affected protocols from different layers over time. These sample transient numeric identifiers have different interoperability requirements and failure severities (see Section 6 of [RFC9415]), and thus are considered to be representative of the problem being analyzed in this document.¶
This section presents the timeline of the Identification field employed by IPv4 (in the base header) and IPv6 (in Fragment Headers). The reason for presenting both cases in the same section is to make it evident that, while the Identification value serves the same purpose in both protocols, the work and research done for the IPv4 case did not influence IPv6 specifications or implementations.¶
The IPv4 Identification is specified in [RFC0791], which specifies the interoperability requirements for the Identification field, i.e., the sender must choose the Identification field to be unique for a given {Source Address, Destination Address, Protocol} for the time the datagram (or any fragment of it) could be alive in the Internet. It suggests that a sending protocol module may keep "a table of Identifiers, one entry for each destination it has communicated with in the last maximum packet lifetime for the [I]nternet", and it also suggests that "since the Identifier field allows 65,536 different values, hosts may be able to simply use unique identifiers independent of destination". The above has been interpreted numerous times as a suggestion to employ per-destination or global counters for the generation of Identification values. While [RFC0791] does not suggest any flawed algorithm for the generation of Identification values, the specification omits a discussion of the security and privacy implications of predictable Identification values. This resulted in many IPv4 implementations generating predictable Identification values by means of a global counter, at least at some point in time.¶
The IPv6 Identification was originally specified in [RFC1883]. It serves the same purpose as its IPv4 counterpart, but rather than being part of the base header (as in the IPv4 case), it is part of the Fragment Header (which may or may not be present in an IPv6 packet). Section 4.5 of [RFC1883] states that the Identification must be different than that of any other fragmented packet sent recently (within the maximum likely lifetime of a packet) with the same Source Address and Destination Address. Subsequently, it notes that this requirement can be met by means of a wrap-around 32-bit counter that is incremented each time a packet must be fragmented and that it is an implementation choice whether to use a global or a per-destination counter. Thus, the specification of the IPv6 Identification is similar to that of the IPv4 case, with the only difference that, in the IPv6 case, the suggestions to use simple counters is more explicit. [RFC2460] is the first revision of the core IPv6 specification and maintains the same text for the specification of the IPv6 Identification field. [RFC8200], the second revision of the core IPv6 specification, removes the suggestion from [RFC2460] to use a counter for the generation of IPv6 Identification values and points to [RFC7739] for sample algorithms for their generation.¶
[RFC0793] suggests that the choice of the ISN of a connection is not arbitrary but aims to reduce the chances of a stale segment from being accepted by a new incarnation of a previous connection. [RFC0793] suggests the use of a global 32-bit ISN generator that is incremented by 1 roughly every 4 microseconds. However, as a matter of fact, protection against stale segments from a previous incarnation of the connection is enforced by preventing the creation of a new incarnation of a previous connection before 2*MSL has passed since a segment corresponding to the old incarnation was last seen (where "MSL" is the "Maximum Segment Lifetime" [RFC0793]). This is accomplished by the TIME-WAIT state and TCP's "quiet time" concept (see Appendix B of [RFC1323]). Based on the assumption that ISNs are monotonically increasing across connections, many stacks (e.g., 4.2BSD-derived) use the ISN of an incoming SYN segment to perform "heuristics" that enable the creation of a new incarnation of a connection while the previous incarnation is still in the TIME-WAIT state (see p. 945 of [Wright1994]). This avoids an interoperability problem that may arise when a node establishes connections to a specific TCP end-point at a high rate [Silbersack2005].¶
The interoperability requirements for TCP ISNs are probably not as clearly spelled out as one would expect. Furthermore, the suggestion of employing a global counter in [RFC0793] negatively affects the security and privacy properties of the protocol.¶
[Zalewski2002] updates and complements [Zalewski2001]. It concludes that "while some vendors [...] reacted promptly and tested their solutions properly, many still either ignored the issue and never evaluated their implementations, or implemented a flawed solution that apparently was not tested using a known approach" [Zalewski2002].¶
IPv6 Interface Identifiers can be generated as a result of different mechanisms, including Stateless Address Autoconfiguration (SLAAC) [RFC4862], DHCPv6 [RFC8415], and manual configuration. This section focuses on Interface Identifiers resulting from SLAAC.¶
The Interface Identifier of stable IPv6 addresses resulting from SLAAC originally resulted in the underlying link-layer address being embedded in the IID. At the time, employing the underlying link-layer address for the IID was seen as a convenient way to obtain a unique address. However, recent awareness about the security and privacy properties of this approach [RFC7707] [RFC7721] has led to the replacement of this flawed scheme with an alternative one [RFC7217] [RFC8064] that does not negatively affect the security and privacy properties of the protocol.¶
The NTP [RFC5905] Reference ID is a 32-bit code identifying the particular server or reference clock. Above stratum 1 (secondary servers and clients), this value can be employed to avoid degree-one timing loops, that is, scenarios where two NTP peers are (mutually) the time source of each other. If using the IPv4 address family, the identifier is the four-octet IPv4 address. If using the IPv6 address family, it is the first four octets of the MD5 hash of the IPv6 address.¶
Most (if not all) transport protocols employ "port numbers" to demultiplex packets to the corresponding transport-protocol instances. "Ephemeral ports" refer to the local ports employed in active OPEN requests, that is, typically the local port numbers employed on the side initiating the communication.¶
The DNS ID [RFC1035] can be employed to match DNS replies to outstanding DNS queries.¶
For more than 30 years, a large number of implementations of IETF protocols have been subject to a variety of attacks, with effects ranging from Denial of Service (DoS) or data injection to information leakages that could be exploited for pervasive monitoring [RFC7258]. The root cause of these issues has been, in many cases, the poor selection of transient numeric identifiers in such protocols, usually as a result of insufficient or misleading specifications.¶
While it is generally possible to identify an algorithm that can satisfy the interoperability requirements for a given transient numeric identifier, this document provides empirical evidence that doing so without negatively affecting the security and/or privacy properties of the aforementioned protocols is nontrivial. It is thus evident that advice in this area is warranted.¶
[RFC9416] aims at requiring future IETF protocol specifications to contain analysis of the security and privacy properties of any transient numeric identifiers specified by the protocol and to recommend an algorithm for the generation of such transient numeric identifiers. [RFC9415] specifies a number of sample algorithms for generating transient numeric identifiers with specific interoperability requirements and failure severities.¶
This document has no IANA actions.¶
This document analyzes the timeline of the specification and implementation of the transient numeric identifiers of some sample IETF protocols and how the security and privacy properties of such protocols have been affected as a result of it. It provides concrete evidence that advice in this area is warranted.¶
[RFC9415] analyzes and categorizes transient numeric identifiers based on their interoperability requirements and their associated failure severities and recommends possible algorithms that can be employed to comply with those requirements without negatively affecting the security and privacy properties of the corresponding protocols.¶
[RFC9416] formally requires IETF protocol specifications to specify the interoperability requirements for their transient numeric identifiers, to do a warranted vulnerability assessment of such transient numeric identifiers, and to recommend possible algorithms for their generation, such that the interoperability requirements are complied with, while any negative security or privacy properties of these transient numeric identifiers are mitigated.¶
The authors would like to thank (in alphabetical order) Bernard Aboba, Dave Crocker, Spencer Dawkins, Theo de Raadt, Sara Dickinson, Guillermo Gont, Christian Huitema, Colin Perkins, Vincent Roca, Kris Shrishak, Joe Touch, Brian Trammell, and Christopher Wood for providing valuable comments on earlier versions of this document.¶
The authors would like to thank (in alphabetical order) Steven Bellovin, Joseph Lorenzo Hall, Gre Norcie, and Martin Thomson for providing valuable comments on [draft-gont-predictable-numeric-ids-03], on which this document is based.¶
Section 4.2 of this document borrows text from [RFC6528], authored by Fernando Gont and Steven Bellovin.¶
The authors would like to thank Sara Dickinson and Christopher Wood for their guidance during the publication process of this document.¶
The authors would like to thank Diego Armando Maradona for his magic and inspiration.¶