Internet Engineering Task Force (IETF) R. Barnes Request for Comments: 9420 Cisco Category: Standards Track B. Beurdouche ISSN: 2070-1721 Inria & Mozilla R. Robert Phoenix R&D J. Millican Meta Platforms E. Omara K. Cohn-Gordon University of Oxford July 2023 The Messaging Layer Security (MLS) Protocol Abstract Messaging applications are increasingly making use of end-to-end security mechanisms to ensure that messages are only accessible to the communicating endpoints, and not to any servers involved in delivering messages. Establishing keys to provide such protections is challenging for group chat settings, in which more than two clients need to agree on a key but may not be online at the same time. In this document, we specify a key establishment protocol that provides efficient asynchronous group key establishment with forward secrecy (FS) and post-compromise security (PCS) for groups in size ranging from two to thousands. Status of This Memo This is an Internet Standards Track document. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9420. Copyright Notice Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License. Table of Contents 1. Introduction 2. Terminology 2.1. Presentation Language 2.1.1. Optional Value 2.1.2. Variable-Size Vector Length Headers 3. Protocol Overview 3.1. Cryptographic State and Evolution 3.2. Example Protocol Execution 3.3. External Joins 3.4. Relationships between Epochs 4. Ratchet Tree Concepts 4.1. Ratchet Tree Terminology 4.1.1. Ratchet Tree Nodes 4.1.2. Paths through a Ratchet Tree 4.2. Views of a Ratchet Tree 5. Cryptographic Objects 5.1. Cipher Suites 5.1.1. Public Keys 5.1.2. Signing 5.1.3. Public Key Encryption 5.2. Hash-Based Identifiers 5.3. Credentials 5.3.1. Credential Validation 5.3.2. Credential Expiry and Revocation 5.3.3. Uniquely Identifying Clients 6. Message Framing 6.1. Content Authentication 6.2. Encoding and Decoding a Public Message 6.3. Encoding and Decoding a Private Message 6.3.1. Content Encryption 6.3.2. Sender Data Encryption 7. Ratchet Tree Operations 7.1. Parent Node Contents 7.2. Leaf Node Contents 7.3. Leaf Node Validation 7.4. Ratchet Tree Evolution 7.5. Synchronizing Views of the Tree 7.6. Update Paths 7.7. Adding and Removing Leaves 7.8. Tree Hashes 7.9. Parent Hashes 7.9.1. Using Parent Hashes 7.9.2. Verifying Parent Hashes 8. Key Schedule 8.1. Group Context 8.2. Transcript Hashes 8.3. External Initialization 8.4. Pre-Shared Keys 8.5. Exporters 8.6. Resumption PSK 8.7. Epoch Authenticators 9. Secret Tree 9.1. Encryption Keys 9.2. Deletion Schedule 10. Key Packages 10.1. KeyPackage Validation 11. Group Creation 11.1. Required Capabilities 11.2. Reinitialization 11.3. Subgroup Branching 12. Group Evolution 12.1. Proposals 12.1.1. Add 12.1.2. Update 12.1.3. Remove 12.1.4. PreSharedKey 12.1.5. ReInit 12.1.6. ExternalInit 12.1.7. GroupContextExtensions 12.1.8. External Proposals 12.2. Proposal List Validation 12.3. Applying a Proposal List 12.4. Commit 12.4.1. Creating a Commit 12.4.2. Processing a Commit 12.4.3. Adding Members to the Group 13. Extensibility 13.1. Additional Cipher Suites 13.2. Proposals 13.3. Credential Extensibility 13.4. Extensions 13.5. GREASE 14. Sequencing of State Changes 15. Application Messages 15.1. Padding 15.2. Restrictions 15.3. Delayed and Reordered Application Messages 16. Security Considerations 16.1. Transport Security 16.2. Confidentiality of Group Secrets 16.3. Confidentiality of Sender Data 16.4. Confidentiality of Group Metadata 16.4.1. GroupID, Epoch, and Message Frequency 16.4.2. Group Extensions 16.4.3. Group Membership 16.5. Authentication 16.6. Forward Secrecy and Post-Compromise Security 16.7. Uniqueness of Ratchet Tree Key Pairs 16.8. KeyPackage Reuse 16.9. Delivery Service Compromise 16.10. Authentication Service Compromise 16.11. Additional Policy Enforcement 16.12. Group Fragmentation by Malicious Insiders 17. IANA Considerations 17.1. MLS Cipher Suites 17.2. MLS Wire Formats 17.3. MLS Extension Types 17.4. MLS Proposal Types 17.5. MLS Credential Types 17.6. MLS Signature Labels 17.7. MLS Public Key Encryption Labels 17.8. MLS Exporter Labels 17.9. MLS Designated Expert Pool 17.10. The "message/mls" Media Type 18. References 18.1. Normative References 18.2. Informative References Appendix A. Protocol Origins of Example Trees Appendix B. Evolution of Parent Hashes Appendix C. Array-Based Trees Appendix D. Link-Based Trees Contributors Authors' Addresses 1. Introduction A group of users who want to send each other encrypted messages needs a way to derive shared symmetric encryption keys. For two parties, this problem has been studied thoroughly, with the Double Ratchet emerging as a common solution [DoubleRatchet] [Signal]. Channels implementing the Double Ratchet enjoy fine-grained forward secrecy as well as post-compromise security, but are nonetheless efficient enough for heavy use over low-bandwidth networks. For a group of size greater than two, a common strategy is to distribute symmetric "sender keys" over existing 1:1 secure channels, and then for each member to send messages to the group encrypted with their own sender key. On the one hand, using sender keys improves efficiency relative to pairwise transmission of individual messages, and it provides forward secrecy (with the addition of a hash ratchet). On the other hand, it is difficult to achieve post- compromise security with sender keys, requiring a number of key update messages that scales as the square of the group size. An adversary who learns a sender key can often indefinitely and passively eavesdrop on that member's messages. Generating and distributing a new sender key provides a form of post-compromise security with regard to that sender. However, it requires computation and communications resources that scale linearly with the size of the group. In this document, we describe a protocol based on tree structures that enables asynchronous group keying with forward secrecy and post- compromise security. Based on earlier work on "asynchronous ratcheting trees" [ART], the protocol presented here uses an asynchronous key-encapsulation mechanism for tree structures. This mechanism allows the members of the group to derive and update shared keys with costs that scale as the log of the group size. 2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. Client: An agent that uses this protocol to establish shared cryptographic state with other clients. A client is defined by the cryptographic keys it holds. Group: A group represents a logical collection of clients that share a common secret value at any given time. Its state is represented as a linear sequence of epochs in which each epoch depends on its predecessor. Epoch: A state of a group in which a specific set of authenticated clients hold shared cryptographic state. Member: A client that is included in the shared state of a group and hence has access to the group's secrets. Key Package: A signed object describing a client's identity and capabilities, including a hybrid public key encryption (HPKE) [RFC9180] public key that can be used to encrypt to that client. Other clients can use a client's KeyPackage to introduce the client to a new group. Group Context: An object that summarizes the shared, public state of the group. The group context is typically distributed in a signed GroupInfo message, which is provided to new members to help them join a group. Signature Key: A signing key pair used to authenticate the sender of a message. Proposal: A message that proposes a change to the group, e.g., adding or removing a member. Commit: A message that implements the changes to the group proposed in a set of Proposals. PublicMessage: An MLS protocol message that is signed by its sender and authenticated as coming from a member of the group in a particular epoch, but not encrypted. PrivateMessage: An MLS protocol message that is signed by its sender, authenticated as coming from a member of the group in a particular epoch, and encrypted so that it is confidential to the members of the group in that epoch. Handshake Message: A PublicMessage or PrivateMessage carrying an MLS Proposal or Commit object, as opposed to application data. Application Message: A PrivateMessage carrying application data. Terminology specific to tree computations is described in Section 4.1. In general, symmetric values are referred to as "keys" or "secrets" interchangeably. Either term denotes a value that MUST be kept confidential to a client. When labeling individual values, we typically use "secret" to refer to a value that is used to derive further secret values and "key" to refer to a value that is used with an algorithm such as Hashed Message Authentication Code (HMAC) or an Authenticated Encryption with Associated Data (AEAD) algorithm. The PublicMessage and PrivateMessage formats are defined in Section 6. Security notions such as forward secrecy and post- compromise security are defined in Section 16. As detailed in Section 13.5, MLS uses the "Generate Random Extensions And Sustain Extensibility" (GREASE) approach to maintaining extensibility, where senders insert random values into fields in which receivers are required to ignore unknown values. Specific "GREASE values" for this purpose are registered in the appropriate IANA registries. 2.1. Presentation Language We use the TLS presentation language [RFC8446] to describe the structure of protocol messages. In addition to the base syntax, we add two additional features: the ability for fields to be optional and the ability for vectors to have variable-size length headers. 2.1.1. Optional Value An optional value is encoded with a presence-signaling octet, followed by the value itself if present. When decoding, a presence octet with a value other than 0 or 1 MUST be rejected as malformed. struct { uint8 present; select (present) { case 0: struct{}; case 1: T value; }; } optional; 2.1.2. Variable-Size Vector Length Headers In the TLS presentation language, vectors are encoded as a sequence of encoded elements prefixed with a length. The length field has a fixed size set by specifying the minimum and maximum lengths of the encoded sequence of elements. In MLS, there are several vectors whose sizes vary over significant ranges. So instead of using a fixed-size length field, we use a variable-size length using a variable-length integer encoding based on the one described in Section 16 of [RFC9000]. They differ only in that the one here requires a minimum-size encoding. Instead of presenting min and max values, the vector description simply includes a V. For example: struct { uint32 fixed<0..255>; opaque variable; } StructWithVectors; Such a vector can represent values with length from 0 bytes to 2^30 bytes. The variable-length integer encoding reserves the two most significant bits of the first byte to encode the base 2 logarithm of the integer encoding length in bytes. The integer value is encoded on the remaining bits, so that the overall value is in network byte order. The encoded value MUST use the smallest number of bits required to represent the value. When decoding, values using more bits than necessary MUST be treated as malformed. This means that integers are encoded in 1, 2, or 4 bytes and can encode 6-, 14-, or 30-bit values, respectively. +========+=========+=============+=======+============+ | Prefix | Length | Usable Bits | Min | Max | +========+=========+=============+=======+============+ | 00 | 1 | 6 | 0 | 63 | +--------+---------+-------------+-------+------------+ | 01 | 2 | 14 | 64 | 16383 | +--------+---------+-------------+-------+------------+ | 10 | 4 | 30 | 16384 | 1073741823 | +--------+---------+-------------+-------+------------+ | 11 | invalid | - | - | - | +--------+---------+-------------+-------+------------+ Table 1: Summary of Integer Encodings Vectors that start with the prefix "11" are invalid and MUST be rejected. For example: * The four-byte length value 0x9d7f3e7d decodes to 494878333. * The two-byte length value 0x7bbd decodes to 15293. * The single-byte length value 0x25 decodes to 37. The following figure adapts the pseudocode provided in [RFC9000] to add a check for minimum-length encoding: ReadVarint(data): // The length of variable-length integers is encoded in the // first two bits of the first byte. v = data.next_byte() prefix = v >> 6 if prefix == 3: raise Exception('invalid variable length integer prefix') length = 1 << prefix // Once the length is known, remove these bits and read any // remaining bytes. v = v & 0x3f repeat length-1 times: v = (v << 8) + data.next_byte() // Check if the value would fit in half the provided length. if prefix >= 1 && v < (1 << (8*(length/2) - 2)): raise Exception('minimum encoding was not used') return v The use of variable-size integers for vector lengths allows vectors to grow very large, up to 2^30 bytes. Implementations should take care not to allow vectors to overflow available storage. To facilitate debugging of potential interoperability problems, implementations SHOULD provide a clear error when such an overflow condition occurs. 3. Protocol Overview MLS is designed to operate in the context described in [MLS-ARCH]. In particular, we assume that the following services are provided: * An Authentication Service (AS) that enables group members to authenticate the credentials presented by other group members. * A Delivery Service (DS) that routes MLS messages among the participants in the protocol. MLS assumes a trusted AS but a largely untrusted DS. Section 16.10 describes the impact of compromise or misbehavior of an AS. MLS is designed to protect the confidentiality and integrity of the group data even in the face of a compromised DS; in general, the DS is only expected to reliably deliver messages. Section 16.9 describes the impact of compromise or misbehavior of a DS. The core functionality of MLS is continuous group authenticated key exchange (AKE). As with other authenticated key exchange protocols (such as TLS), the participants in the protocol agree on a common secret value, and each participant can verify the identity of the other participants. That secret can then be used to protect messages sent from one participant in the group to the other participants using the MLS framing layer or can be exported for use with other protocols. MLS provides group AKE in the sense that there can be more than two participants in the protocol, and continuous group AKE in the sense that the set of participants in the protocol can change over time. The core organizing principles of MLS are _groups_ and _epochs_. A group represents a logical collection of clients that share a common secret value at any given time. The history of a group is divided into a linear sequence of epochs. In each epoch, a set of authenticated _members_ agree on an _epoch secret_ that is known only to the members of the group in that epoch. The set of members involved in the group can change from one epoch to the next, and MLS ensures that only the members in the current epoch have access to the epoch secret. From the epoch secret, members derive further shared secrets for message encryption, group membership authentication, and so on. The creator of an MLS group creates the group's first epoch unilaterally, with no protocol interactions. Thereafter, the members of the group advance their shared cryptographic state from one epoch to another by exchanging MLS messages. * A _KeyPackage_ object describes a client's capabilities and provides keys that can be used to add the client to a group. * A _Proposal_ message proposes a change to be made in the next epoch, such as adding or removing a member. * A _Commit_ message initiates a new epoch by instructing members of the group to implement a collection of proposals. * A _Welcome_ message provides a new member to the group with the information to initialize their state for the epoch in which they were added or in which they want to add themselves to the group. KeyPackage and Welcome messages are used to initiate a group or introduce new members, so they are exchanged between group members and clients not yet in the group. A client publishes a KeyPackage via the DS, thus enabling other clients to add it to groups. When a group member wants to add a new member to a group, it uses the new member's KeyPackage to add them and constructs a Welcome message with which the new member can initialize their local state. Proposal and Commit messages are sent from one member of a group to the others. MLS provides a common framing layer for sending messages within a group: A _PublicMessage_ provides sender authentication for unencrypted Proposal and Commit messages. A _PrivateMessage_ provides encryption and authentication for both Proposal/Commit messages as well as any application data. 3.1. Cryptographic State and Evolution The cryptographic state at the core of MLS is divided into three areas of responsibility: .- ... -. | | | | | | | | Key Schedule | V | | epoch_secret | . | | | . |\ Ratchet | | | Secret /| | \ Tree | | | Tree / | | \ | | | / | | \ | V | / | | +--> commit_secret --> epoch_secret --> encryption_secret -->+ | | / | | | \ | | / | | | \ | | / | | | \ | |/ | | | \| ' | V | ' | epoch_secret | | | | | | | | V | | | '- ... -' Figure 1: Overview of MLS Group Evolution * A _ratchet tree_ that represents the membership of the group, providing group members a way to authenticate each other and efficiently encrypt messages to subsets of the group. Each epoch has a distinct ratchet tree. It seeds the _key schedule_. * A _key schedule_ that describes the chain of key derivations used to progress from epoch to epoch (mainly using the _init_secret_ and _epoch_secret_), as well as the derivation of a variety of other secrets (see Table 4). For example: - An _encryption secret_ that is used to initialize the secret tree for the epoch. - An _exporter secret_ that allows other protocols to leverage MLS as a generic authenticated group key exchange. - A _resumption secret_ that members can use to prove their membership in the group, e.g., when creating a subgroup or a successor group. * A _secret tree_ derived from the key schedule that represents shared secrets used by the members of the group for encrypting and authenticating messages. Each epoch has a distinct secret tree. Each member of the group maintains a partial view of these components of the group's state. MLS messages are used to initialize these views and keep them in sync as the group transitions between epochs. Each new epoch is initiated with a Commit message. The Commit instructs existing members of the group to update their views of the ratchet tree by applying a set of Proposals, and uses the updated ratchet tree to distribute fresh entropy to the group. This fresh entropy is provided only to members in the new epoch and not to members who have been removed. Commits thus maintain the property that the epoch secret is confidential to the members in the current epoch. For each Commit that adds one or more members to the group, there are one or more corresponding Welcome messages. Each Welcome message provides new members with the information they need to initialize their views of the key schedule and ratchet tree, so that these views align with the views held by other members of the group in this epoch. 3.2. Example Protocol Execution There are three major operations in the life of a group: * Adding a member, initiated by a current member; * Updating the keys that represent a member in the tree; and * Removing a member. Each of these operations is "proposed" by sending a message of the corresponding type (Add / Update / Remove). The state of the group is not changed, however, until a Commit message is sent to provide the group with fresh entropy. In this section, we show each proposal being committed immediately, but in more advanced deployment cases, an application might gather several proposals before committing them all at once. In the illustrations below, we show the Proposal and Commit messages directly, while in reality they would be sent encapsulated in PublicMessage or PrivateMessage objects. Before the initialization of a group, clients publish KeyPackages to a directory provided by the DS (see Figure 2). Delivery Service | .--------' '--------. | | Group A B C Directory Channel | | | | | | KeyPackageA | | | | +------------------------------------------------->| | | | | | | | | KeyPackageB | | | | +-------------------------------->| | | | | | | | | | KeyPackageC | | | | +--------------->| | | | | | | Figure 2: Clients A, B, and C publish KeyPackages to the directory Figure 3 shows how these pre-published KeyPackages are used to create a group. When client A wants to establish a group with clients B and C, it first initializes a group state containing only itself and downloads KeyPackages for B and C. For each member, A generates an Add proposal and a Commit message to add that member and then broadcasts the two messages to the group. Client A also generates a Welcome message and sends it directly to the new member (there's no need to send it to the group). Only after A has received its Commit message back from the Delivery Service does it update its state to reflect the new member's addition. Once A has updated its state, the new member has processed the Welcome, and any other group members have processed the Commit, they will all have consistent representations of the group state, including a group secret that is known only to the members the group. The new member will be able to read and send new messages to the group, but messages sent before they were added to the group will not be accessible. Group A B C Directory Channel | | | | | | KeyPackageB, KeyPackageC | | |<-------------------------------------------+ | | | | | | | | | | Add(A->AB) | | | | | Commit(Add) | +--------------------------------------------------------------->| | | | | | | Welcome(B) | | | | +------------->| | | | | | | | | | | | | Add(A->AB) | | | | | Commit(Add) | |<---------------------------------------------------------------+ | | | | | | | | | | | | | | Add(AB->ABC) | | | | | Commit(Add) | +--------------------------------------------------------------->| | | | | | | | Welcome(C) | | | +---------------------------->| | | | | | | | | | | | Add(AB->ABC) | | | | | Commit(Add) | |<---------------------------------------------------------------+ | |<------------------------------------------------+ | | | | | Figure 3: Client A creates a group with clients B and C Subsequent additions of group members proceed in the same way. Any member of the group can download a KeyPackage for a new client, broadcast Add and Commit messages that the current group will use to update their state, and send a Welcome message that the new client can use to initialize its state and join the group. To enforce the forward secrecy and post-compromise security of messages, each member periodically updates the keys that represent them to the group. A member does this by sending a Commit (possibly with no proposals) or by sending an Update message that is committed by another member (see Figure 4). Once the other members of the group have processed these messages, the group's secrets will be unknown to an attacker that had compromised the secrets corresponding to the sender's leaf in the tree. At the end of the scenario shown in Figure 4, the group has post-compromise security with respect to both A and B. Update messages SHOULD be sent at regular intervals of time as long as the group is active, and members that don't update SHOULD eventually be removed from the group. It's left to the application to determine an appropriate amount of time between Updates. Since the purpose of sending an Update is to proactively constrain a compromise window, the right frequency is usually on the order of hours or days, not milliseconds. For example, an application might send an Update each time a member sends an application message after receiving any message from another member, or daily if no application messages are sent. The MLS architecture recommends that MLS be operated over a secure transport (see Section 7.1 of [MLS-ARCH]). Such transport protocols will typically provide functions such as congestion control that manage the impact of an MLS-using application on other applications sharing the same network. Applications should take care that they do not send MLS messages at a rate that will cause problems such as network congestion, especially if they are not following the above recommendation (e.g., sending MLS directly over UDP instead). Group A B ... Z Directory Channel | | | | | | | Update(B) | | | | +------------------------------------------->| | | | | Update(B) | |<----------------------------------------------------------+ | |<-------------------------------------------+ | | |<----------------------------+ | | | | | | Commit(Upd) | | | | +---------------------------------------------------------->| | | | | Commit(Upd) | |<----------------------------------------------------------+ | |<-------------------------------------------+ | | |<----------------------------+ | | | | | Figure 4: Client B proposes to update its key, and client A commits the proposal Members are removed from the group in a similar way, as shown in Figure 5. Any member of the group can send a Remove proposal followed by a Commit message. The Commit message provides new entropy to all members of the group except the removed member. This new entropy is added to the epoch secret for the new epoch so that it is not known to the removed member. Note that this does not necessarily imply that any member is actually allowed to evict other members; groups can enforce access control policies on top of these basic mechanisms. Group A B ... Z Directory Channel | | | | | | | | Remove(B) | | | | | Commit(Rem) | | | | +---------------------------->| | | | | | | | | | Remove(B) | | | | | Commit(Rem) | |<----------------------------------------------------------+ | |<-------------------------------------------+ | | |<----------------------------+ | | | | | Figure 5: Client Z removes client B from the group Note that the flows in this section are examples; applications can arrange message flows in other ways. For example: * Welcome messages don't necessarily need to be sent directly to new joiners. Since they are encrypted to new joiners, they could be distributed more broadly, say if the application only had access to a broadcast channel for the group. * Proposal messages don't need to be immediately sent to all group members. They need to be available to the committer before generating a Commit, and to other members before processing the Commit. * The sender of a Commit doesn't necessarily have to wait to receive its own Commit back before advancing its state. It only needs to know that its Commit will be the next one applied by the group, say based on a promise from an orchestration server. 3.3. External Joins In addition to the Welcome-based flow for adding a new member to the group, it is also possible for a new member to join by means of an "external Commit". This mechanism can be used when the existing members don't have a KeyPackage for the new member, for example, in the case of an "open" group that can be joined by new members without asking permission from existing members. Figure 6 shows a typical message flow for an external join. To enable a new member to join the group in this way, a member of the group (A, B) publishes a GroupInfo object that includes the GroupContext for the group as well as a public key that can be used to encrypt a secret to the existing members of the group. When the new member Z wishes to join, they download the GroupInfo object and use it to form a Commit of a special form that adds Z to the group (as detailed in Section 12.4.3.2). The existing members of the group process this external Commit in a similar way to a normal Commit, advancing to a new epoch in which Z is now a member of the group. Group A B Z Directory Channel | | | | | | GroupInfo | | | | +------------------------------------------->| | | | | GroupInfo | | | | |<-------------+ | | | | | | | | | Commit(ExtZ) | | | | +---------------------------->| | | | | Commit(ExtZ) | |<----------------------------------------------------------+ | |<-------------------------------------------+ | | |<----------------------------+ | | | | | Figure 6: Client A publishes a GroupInfo object, and Client Z uses it to join the group 3.4. Relationships between Epochs A group has a single linear sequence of epochs. Groups and epochs are generally independent of one another. However, it can sometimes be useful to link epochs cryptographically, either within a group or across groups. MLS derives a resumption pre-shared key (PSK) from each epoch to allow entropy extracted from one epoch to be injected into a future epoch. A group member that wishes to inject a PSK issues a PreSharedKey proposal (Section 12.1.4) describing the PSK to be injected. When this proposal is committed, the corresponding PSK will be incorporated into the key schedule as described in Section 8.4. Linking epochs in this way guarantees that members entering the new epoch agree on a key if and only if they were members of the group during the epoch from which the resumption key was extracted. MLS supports two ways to tie a new group to an existing group, which are illustrated in Figures 7 and 8. Reinitialization closes one group and creates a new group comprising the same members with different parameters. Branching starts a new group with a subset of the original group's participants (with no effect on the original group). In both cases, the new group is linked to the old group via a resumption PSK. epoch_A_[n-1] | | |<-- ReInit | V epoch_A_[n] epoch_B_[0] . | . PSK(usage=reinit) | .....................>| | V epoch_B_[1] Figure 7: Reinitializing a Group epoch_A_[n] epoch_B_[0] | | | PSK(usage=branch) | |....................>| | | V V epoch_A_[n+1] epoch_B_[1] Figure 8: Branching a Group Applications may also choose to use resumption PSKs to link epochs in other ways. For example, Figure 9 shows a case where a resumption PSK from epoch n is injected into epoch n+k. This demonstrates that the members of the group at epoch n+k were also members at epoch n, irrespective of any changes to these members' keys due to Updates or Commits. epoch_A_[n] | | PSK(usage=application) |..................... | . | . ... ... | . | . V . epoch_A_[n+k-1] . | . | . |<.................... | V epoch_A_[n+k] Figure 9: Reinjecting Entropy from an Earlier Epoch 4. Ratchet Tree Concepts The protocol uses "ratchet trees" for deriving shared secrets among a group of clients. A ratchet tree is an arrangement of secrets and key pairs among the members of a group in a way that allows for secrets to be efficiently updated to reflect changes in the group. Ratchet trees allow a group to efficiently remove any member by encrypting new entropy to a subset of the group. A ratchet tree assigns shared keys to subgroups of the overall group, so that, for example, encrypting to all but one member of the group requires only log(N) encryptions to subtrees, instead of the N-1 encryptions that would be needed to encrypt to each participant individually (where N is the number of members in the group). This remove operation allows MLS to efficiently achieve post- compromise security. In an Update proposal or a full Commit message, an old (possibly compromised) representation of a member is efficiently removed from the group and replaced with a freshly generated instance. 4.1. Ratchet Tree Terminology Trees consist of _nodes_. A node is a _leaf_ if it has no children; otherwise, it is a _parent_. All parents in our trees have precisely two children, a _left_ child and a _right_ child. A node is the _root_ of a tree if it has no parent, and _intermediate_ if it has both children and a parent. The _descendants_ of a node are that node's children, and the descendants of its children. We say a tree _contains_ a node if that node is a descendant of the root of the tree, or if the node itself is the root of the tree. Nodes are _siblings_ if they share the same parent. A _subtree_ of a tree is the tree given by any node (the _head_ of the subtree) and its descendants. The _size_ of a tree or subtree is the number of leaf nodes it contains. For a given parent node, its _left subtree_ is the subtree with its left child as head and its _right subtree_ is the subtree with its right child as head. Every tree used in this protocol is a perfect binary tree, that is, a complete balanced binary tree with 2^d leaves all at the same depth d. This structure is unique for a given depth d. There are multiple ways that an implementation might represent a ratchet tree in memory. A convenient property of left-balanced binary trees (including the complete trees used here) is that they can be represented as an array of nodes, with node relationships computed based on the nodes' indices in the array. A more traditional representation based on linked node objects may also be used. Appendices C and D provide some details on how to implement the tree operations required for MLS in these representations. MLS places no requirements on implementations' internal representations of ratchet trees. An implementation may use any tree representation and associated algorithms, as long as they produce correct protocol messages. 4.1.1. Ratchet Tree Nodes Each leaf node in a ratchet tree is given an _index_ (or _leaf index_), starting at 0 from the left to 2^d - 1 at the right (for a tree with 2^d leaves). A tree with 2^d leaves has 2^(d+1) - 1 nodes, including parent nodes. Each node in a ratchet tree is either _blank_ (containing no value) or it holds an HPKE public key with some associated data: * A public key (for the HPKE scheme in use; see Section 5.1) * A credential (only for leaf nodes; see Section 5.3) * An ordered list of "unmerged" leaves (see Section 4.2) * A hash of certain information about the node's parent, as of the last time the node was changed (see Section 7.9). As described in Section 4.2, different members know different subsets of the set of private keys corresponding to the public keys in nodes in the tree. The private key corresponding to a parent node is known only to members at leaf nodes that are descendants of that node. The private key corresponding to a leaf node is known only to the member at that leaf node. A leaf node is _unmerged_ relative to one of its ancestor nodes if the member at the leaf node does not know the private key corresponding to the ancestor node. Every node, regardless of whether the node is blank or populated, has a corresponding _hash_ that summarizes the contents of the subtree below that node. The rules for computing these hashes are described in Section 7.8. The _resolution_ of a node is an ordered list of non-blank nodes that collectively cover all non-blank descendants of the node. The resolution of the root contains the set of keys that are collectively necessary to encrypt to every node in the group. The resolution of a node is effectively a depth-first, left-first enumeration of the nearest non-blank nodes below the node: * The resolution of a non-blank node comprises the node itself, followed by its list of unmerged leaves, if any. * The resolution of a blank leaf node is the empty list. * The resolution of a blank intermediate node is the result of concatenating the resolution of its left child with the resolution of its right child, in that order. For example, consider the following subtree, where the _ character represents a blank node and unmerged leaves are indicated in square brackets: ... / _ ______|______ / \ X[B] _ __|__ __|__ / \ / \ _ _ Y _ / \ / \ / \ / \ A B _ D E F _ H 0 1 2 3 4 5 6 7 Figure 10: A Tree with Blanks and Unmerged Leaves In this tree, we can see all of the above rules in play: * The resolution of node X is the list [X, B]. * The resolution of leaf 2 or leaf 6 is the empty list []. * The resolution of top node is the list [X, B, Y, H]. 4.1.2. Paths through a Ratchet Tree The _direct path_ of a root is the empty list. The direct path of any other node is the concatenation of that node's parent along with the parent's direct path. The _copath_ of a node is the node's sibling concatenated with the list of siblings of all the nodes in its direct path, excluding the root. The _filtered direct path_ of a leaf node L is the node's direct path, with any node removed whose child on the copath of L has an empty resolution (keeping in mind that any unmerged leaves of the copath child count toward its resolution). The removed nodes do not need their own key pairs because encrypting to the node's key pair would be equivalent to encrypting to its non-copath child. For example, consider the following tree (where blank nodes are indicated with _, but also assigned a label for reference): W = root | .-----+-----. / \ _=U Y | | .-+-. .-+-. / \ / \ T _=V X _=Z / \ / \ / \ / \ A B _ _ E F G _=H 0 1 2 3 4 5 6 7 Figure 11: A Complete Tree with Five Members, with Labels for Blank Parent Nodes In this tree, the direct paths, copaths, and filtered direct paths for the leaf nodes are as follows: +======+=============+=========+======================+ | Node | Direct path | Copath | Filtered Direct Path | +======+=============+=========+======================+ | A | T, U, W | B, V, Y | T, W | +------+-------------+---------+----------------------+ | B | T, U, W | A, V, Y | T, W | +------+-------------+---------+----------------------+ | E | X, Y, W | F, Z, U | X, Y, W | +------+-------------+---------+----------------------+ | F | X, Y, W | E, Z, U | X, Y, W | +------+-------------+---------+----------------------+ | G | Z, Y, W | H, X, U | Y, W | +------+-------------+---------+----------------------+ Table 2 4.2. Views of a Ratchet Tree We generally assume that each participant maintains a complete and up-to-date view of the public state of the group's ratchet tree, including the public keys for all nodes and the credentials associated with the leaf nodes. No participant in an MLS group knows the private key associated with every node in the tree. Instead, each member is assigned to a leaf of the tree, which determines the subset of private keys it knows. The credential stored at that leaf is one provided by the member. In particular, MLS maintains the members' views of the tree in such a way as to maintain the _tree invariant_: | The private key for a node in the tree is known to a member of the | group only if the node's subtree contains that member's leaf. In other words, if a node is not blank, then it holds a public key. The corresponding private key is known only to members occupying leaves below that node. The reverse implication is not true: A member may not know the private key of an intermediate node above them. Such a member has an _unmerged_ leaf at the intermediate node. Encrypting to an intermediate node requires encrypting to the node's public key, as well as the public keys of all the unmerged leaves below it. A leaf is unmerged with regard to all of its ancestors when it is first added, because the process of adding the leaf does not give it access to the private keys for all of the nodes above it in the tree. Leaves are "merged" as they receive the private keys for nodes, as described in Section 7.4. For example, consider a four-member group (A, B, C, D) where the node above the right two members is blank. (This is what it would look like if A created a group with B, C, and D.) Then the public state of the tree and the views of the private keys of the tree held by each participant would be as follows, where _ represents a blank node, ? represents an unknown private key, and pk(X) represents the public key corresponding to the private key X: Public Tree ============================ pk(ABCD) / \ pk(AB) _ / \ / \ pk(A) pk(B) pk(C) pk(D) Private @ A Private @ B Private @ C Private @ D ============= ============= ============= ============= ABCD ABCD ABCD ABCD / \ / \ / \ / \ AB _ AB _ ? _ ? _ / \ / \ / \ / \ / \ / \ / \ / \ A ? ? ? ? B ? ? ? ? C ? ? ? ? D Note how the tree invariant applies: Each member knows only their own leaf, the private key AB is known only to A and B, and the private key ABCD is known to all four members. This also illustrates another important point: it is possible for there to be "holes" on the path from a member's leaf to the root in which the member knows the key both above and below a given node, but not for that node, as in the case with D. 5. Cryptographic Objects 5.1. Cipher Suites Each MLS session uses a single cipher suite that specifies the following primitives to be used in group key computations: * HPKE parameters: - A Key Encapsulation Mechanism (KEM) - A Key Derivation Function (KDF) - An Authenticated Encryption with Associated Data (AEAD) encryption algorithm * A hash algorithm * A Message Authentication Code (MAC) algorithm * A signature algorithm MLS uses HPKE for public key encryption [RFC9180]. The DeriveKeyPair function associated to the KEM for the cipher suite maps octet strings to HPKE key pairs. As in HPKE, MLS assumes that an AEAD algorithm produces a single ciphertext output from AEAD encryption (aligning with [RFC5116]), as opposed to a separate ciphertext and tag. Cipher suites are represented with the CipherSuite type. The cipher suites are defined in Section 17.1. 5.1.1. Public Keys HPKE public keys are opaque values in a format defined by the underlying protocol (see Section 4 of [RFC9180] for more information). opaque HPKEPublicKey; Signature public keys are likewise represented as opaque values in a format defined by the cipher suite's signature scheme. opaque SignaturePublicKey; For cipher suites using the Edwards-curve Digital Signature Algorithm (EdDSA) signature schemes (Ed25519 or Ed448), the public key is in the format specified in [RFC8032]. For cipher suites using the Elliptic Curve Digital Signature Algorithm (ECDSA) with the NIST curves (P-256, P-384, or P-521), the public key is represented as an encoded UncompressedPointRepresentation struct, as defined in [RFC8446]. 5.1.2. Signing The signature algorithm specified in a group's cipher suite is the mandatory algorithm to be used for signing messages within the group. It MUST be the same as the signature algorithm specified in the credentials in the leaves of the tree (including the leaf node information in KeyPackages used to add new members). The signatures used in this document are encoded as specified in [RFC8446]. In particular, ECDSA signatures are DER encoded, and EdDSA signatures are defined as the concatenation of R and S, as specified in [RFC8032]. To disambiguate different signatures used in MLS, each signed value is prefixed by a label as shown below: SignWithLabel(SignatureKey, Label, Content) = Signature.Sign(SignatureKey, SignContent) VerifyWithLabel(VerificationKey, Label, Content, SignatureValue) = Signature.Verify(VerificationKey, SignContent, SignatureValue) Where SignContent is specified as: struct { opaque label; opaque content; } SignContent; And its fields are set to: label = "MLS 1.0 " + Label; content = Content; The functions Signature.Sign and Signature.Verify are defined by the signature algorithm. If MLS extensions require signatures by group members, they should reuse the SignWithLabel construction, using a distinct label. To avoid collisions in these labels, an IANA registry is defined in Section 17.6. 5.1.3. Public Key Encryption As with signing, MLS includes a label and context in encryption operations to avoid confusion between ciphertexts produced for different purposes. Encryption and decryption including this label and context are done as follows: EncryptWithLabel(PublicKey, Label, Context, Plaintext) = SealBase(PublicKey, EncryptContext, "", Plaintext) DecryptWithLabel(PrivateKey, Label, Context, KEMOutput, Ciphertext) = OpenBase(KEMOutput, PrivateKey, EncryptContext, "", Ciphertext) Where EncryptContext is specified as: struct { opaque label; opaque context; } EncryptContext; And its fields are set to: label = "MLS 1.0 " + Label; context = Context; The functions SealBase and OpenBase are defined in Section 6.1 of [RFC9180] (with "Base" as the MODE), using the HPKE algorithms specified by the group's cipher suite. If MLS extensions require HPKE encryption operations, they should reuse the EncryptWithLabel construction, using a distinct label. To avoid collisions in these labels, an IANA registry is defined in Section 17.7. 5.2. Hash-Based Identifiers Some MLS messages refer to other MLS objects by hash. For example, Welcome messages refer to KeyPackages for the members being welcomed, and Commits refer to Proposals they cover. These identifiers are computed as follows: opaque HashReference; HashReference KeyPackageRef; HashReference ProposalRef; MakeKeyPackageRef(value) = RefHash("MLS 1.0 KeyPackage Reference", value) MakeProposalRef(value) = RefHash("MLS 1.0 Proposal Reference", value) RefHash(label, value) = Hash(RefHashInput) Where RefHashInput is defined as: struct { opaque label; opaque value; } RefHashInput; And its fields are set to: label = label; value = value; For a KeyPackageRef, the value input is the encoded KeyPackage, and the cipher suite specified in the KeyPackage determines the KDF used. For a ProposalRef, the value input is the AuthenticatedContent carrying the Proposal. In the latter two cases, the KDF is determined by the group's cipher suite. 5.3. Credentials Each member of a group presents a credential that provides one or more identities for the member and associates them with the member's signing key. The identities and signing key are verified by the Authentication Service in use for a group. It is up to the application to decide which identifiers to use at the application level. For example, a certificate in an X509Credential may attest to several domain names or email addresses in its subjectAltName extension. An application may decide to present all of these to a user, or if it knows a "desired" domain name or email address, it can check that the desired identifier is among those attested. Using the terminology from [RFC6125], a credential provides "presented identifiers", and it is up to the application to supply a "reference identifier" for the authenticated client, if any. // See the "MLS Credential Types" IANA registry for values uint16 CredentialType; struct { opaque cert_data; } Certificate; struct { CredentialType credential_type; select (Credential.credential_type) { case basic: opaque identity; case x509: Certificate certificates; }; } Credential; A "basic" credential is a bare assertion of an identity, without any additional information. The format of the encoded identity is defined by the application. For an X.509 credential, each entry in the certificates field represents a single DER-encoded X.509 certificate. The chain is ordered such that the first entry (certificates[0]) is the end-entity certificate. The public key encoded in the subjectPublicKeyInfo of the end-entity certificate MUST be identical to the signature_key in the LeafNode containing this credential. A chain MAY omit any non- leaf certificates that supported peers are known to already possess. 5.3.1. Credential Validation The application using MLS is responsible for specifying which identifiers it finds acceptable for each member in a group. In other words, following the model that [RFC6125] describes for TLS, the application maintains a list of "reference identifiers" for the members of a group, and the credentials provide "presented identifiers". A member of a group is authenticated by first validating that the member's credential legitimately represents some presented identifiers, and then ensuring that the reference identifiers for the member are authenticated by those presented identifiers. The parts of the system that perform these functions are collectively referred to as the Authentication Service (AS) [MLS-ARCH]. A member's credential is said to be _validated with the AS_ when the AS verifies that the credential's presented identifiers are correctly associated with the signature_key field in the member's LeafNode, and that those identifiers match the reference identifiers for the member. Whenever a new credential is introduced in the group, it MUST be validated with the AS. In particular, at the following events in the protocol: * When a member receives a KeyPackage that it will use in an Add proposal to add a new member to the group * When a member receives a GroupInfo object that it will use to join a group, either via a Welcome or via an external Commit * When a member receives an Add proposal adding a member to the group * When a member receives an Update proposal whose LeafNode has a new credential for the member * When a member receives a Commit with an UpdatePath whose LeafNode has a new credential for the committer * When an external_senders extension is added to the group * When an existing external_senders extension is updated In cases where a member's credential is being replaced, such as the Update and Commit cases above, the AS MUST also verify that the set of presented identifiers in the new credential is valid as a successor to the set of presented identifiers in the old credential, according to the application's policy. 5.3.2. Credential Expiry and Revocation In some credential schemes, a valid credential can "expire" or become invalid after a certain point in time. For example, each X.509 certificate has a notAfter field, expressing a time after which the certificate is not valid. Expired credentials can cause operational problems in light of the validation requirements of Section 5.3.1. Applications can apply some operational practices and adaptations to Authentication Service policies to moderate these impacts. In general, to avoid operational problems such as new joiners rejecting expired credentials in a group, applications that use such credentials should ensure to the extent practical that all of the credentials in use in a group are valid at all times. If a member finds that its credential has expired (or will soon), it should issue an Update or Commit that replaces it with a valid credential. For this reason, members SHOULD accept Update proposals and Commits issued by members with expired credentials, if the credential in the Update or Commit is valid. Similarly, when a client is processing messages sent some time in the past (e.g., syncing up with a group after being offline), the client SHOULD accept signatures from members with expired credentials, since the credential may have been valid at the time the message was sent. If a member finds that another member's credential has expired, they may issue a Remove that removes that member. For example, an application could require a member preparing to issue a Commit to check the tree for expired credentials and include Remove proposals for those members in its Commit. In situations where the group tree is known to the DS, the DS could also monitor the tree for expired credentials and issue external Remove proposals. Some credential schemes also allow credentials to be revoked. Revocation is similar to expiry in that a previously valid credential becomes invalid. As such, most of the considerations above also apply to revoked credentials. However, applications may want to treat revoked credentials differently, e.g., by removing members with revoked credentials while allowing members with expired credentials time to update. 5.3.3. Uniquely Identifying Clients MLS implementations will presumably provide applications with a way to request protocol operations with regard to other clients (e.g., removing clients). Such functions will need to refer to the other clients using some identifier. MLS clients have a few types of identifiers, with different operational properties. Internally to the protocol, group members are uniquely identified by their leaf index. However, a leaf index is only valid for referring to members in a given epoch. The same leaf index may represent a different member, or no member at all, in a subsequent epoch. The Credentials presented by the clients in a group authenticate application-level identifiers for the clients. However, these identifiers may not uniquely identify clients. For example, if a user has multiple devices that are all present in an MLS group, then those devices' clients could all present the user's application-layer identifiers. If needed, applications may add application-specific identifiers to the extensions field of a LeafNode object with the application_id extension. opaque application_id; However, applications MUST NOT rely on the data in an application_id extension as if it were authenticated by the Authentication Service, and SHOULD gracefully handle cases where the identifier presented is not unique. 6. Message Framing Handshake and application messages use a common framing structure. This framing provides encryption to ensure confidentiality within the group, as well as signing to authenticate the sender. In most of the protocol, messages are handled in the form of AuthenticatedContent objects. These structures contain the content of the message itself as well as information to authenticate the sender (see Section 6.1). The additional protections required to transmit these messages over an untrusted channel (group membership authentication or AEAD encryption) are added by encoding the AuthenticatedContent as a PublicMessage or PrivateMessage message, which can then be sent as an MLSMessage. Likewise, these protections are enforced (via membership verification or AEAD decryption) when decoding a PublicMessage or PrivateMessage into an AuthenticatedContent object. PrivateMessage represents a signed and encrypted message, with protections for both the content of the message and related metadata. PublicMessage represents a message that is only signed, and not encrypted. Applications MUST use PrivateMessage to encrypt application messages and SHOULD use PrivateMessage to encode handshake messages, but they MAY transmit handshake messages encoded as PublicMessage objects in cases where it is necessary for the Delivery Service to examine such messages. enum { reserved(0), mls10(1), (65535) } ProtocolVersion; enum { reserved(0), application(1), proposal(2), commit(3), (255) } ContentType; enum { reserved(0), member(1), external(2), new_member_proposal(3), new_member_commit(4), (255) } SenderType; struct { SenderType sender_type; select (Sender.sender_type) { case member: uint32 leaf_index; case external: uint32 sender_index; case new_member_commit: case new_member_proposal: struct{}; }; } Sender; // See the "MLS Wire Formats" IANA registry for values uint16 WireFormat; struct { opaque group_id; uint64 epoch; Sender sender; opaque authenticated_data; ContentType content_type; select (FramedContent.content_type) { case application: opaque application_data; case proposal: Proposal proposal; case commit: Commit commit; }; } FramedContent; struct { ProtocolVersion version = mls10; WireFormat wire_format; select (MLSMessage.wire_format) { case mls_public_message: PublicMessage public_message; case mls_private_message: PrivateMessage private_message; case mls_welcome: Welcome welcome; case mls_group_info: GroupInfo group_info; case mls_key_package: KeyPackage key_package; }; } MLSMessage; Messages from senders that aren't in the group are sent as PublicMessage. See Sections 12.1.8 and 12.4.3.2 for more details. The following structure is used to fully describe the data transmitted in plaintexts or ciphertexts. struct { WireFormat wire_format; FramedContent content; FramedContentAuthData auth; } AuthenticatedContent; The following figure illustrates how the various structures described in this section relate to each other, and the high-level operations used to produce and consume them: Proposal Commit Application Data | | | +--------------+--------------+ | V FramedContent | | -. +--------+ | | | | | V | +-- Asymmetric FramedContentAuthData | | Sign / Verify | | | +--------+ | | | | | V V -' AuthenticatedContent | -. +--------+--------+ | | | +-- Symmetric V V | Protect / Unprotect PublicMessage PrivateMessage -' | | | | Welcome KeyPackage GroupInfo | | | | | +-----------------+-----+----------+----------+ | V MLSMessage Figure 12: Relationships among MLS Objects 6.1. Content Authentication FramedContent is authenticated using the FramedContentAuthData structure. struct { ProtocolVersion version = mls10; WireFormat wire_format; FramedContent content; select (FramedContentTBS.content.sender.sender_type) { case member: case new_member_commit: GroupContext context; case external: case new_member_proposal: struct{}; }; } FramedContentTBS; opaque MAC; struct { /* SignWithLabel(., "FramedContentTBS", FramedContentTBS) */ opaque signature; select (FramedContent.content_type) { case commit: /* MAC(confirmation_key, GroupContext.confirmed_transcript_hash) */ MAC confirmation_tag; case application: case proposal: struct{}; }; } FramedContentAuthData; The signature is computed using SignWithLabel with label "FramedContentTBS" and with a content that covers the message content and the wire format that will be used for this message. If the sender's sender_type is member, the content also covers the GroupContext for the current epoch so that signatures are specific to a given group and epoch. The sender MUST use the private key corresponding to the following signature key depending on the sender's sender_type: * member: The signature key contained in the LeafNode at the index indicated by leaf_index in the ratchet tree. * external: The signature key at the index indicated by sender_index in the external_senders group context extension (see Section 12.1.8.1). The content_type of the message MUST be proposal and the proposal_type MUST be a value that is allowed for external senders. * new_member_commit: The signature key in the LeafNode in the Commit's path (see Section 12.4.3.2). The content_type of the message MUST be commit. * new_member_proposal: The signature key in the LeafNode in the KeyPackage embedded in an external Add proposal. The content_type of the message MUST be proposal and the proposal_type of the Proposal MUST be add. Recipients of an MLSMessage MUST verify the signature with the key depending on the sender_type of the sender as described above. The confirmation tag value confirms that the members of the group have arrived at the same state of the group. A FramedContentAuthData is said to be valid when both the signature and confirmation_tag fields are valid. 6.2. Encoding and Decoding a Public Message Messages that are authenticated but not encrypted are encoded using the PublicMessage structure. struct { FramedContent content; FramedContentAuthData auth; select (PublicMessage.content.sender.sender_type) { case member: MAC membership_tag; case external: case new_member_commit: case new_member_proposal: struct{}; }; } PublicMessage; The membership_tag field in the PublicMessage object authenticates the sender's membership in the group. For messages sent by members, it MUST be set to the following value: struct { FramedContentTBS content_tbs; FramedContentAuthData auth; } AuthenticatedContentTBM; membership_tag = MAC(membership_key, AuthenticatedContentTBM) When decoding a PublicMessage into an AuthenticatedContent, the application MUST check membership_tag and MUST check that the FramedContentAuthData is valid. 6.3. Encoding and Decoding a Private Message Authenticated and encrypted messages are encoded using the PrivateMessage structure. struct { opaque group_id; uint64 epoch; ContentType content_type; opaque authenticated_data; opaque encrypted_sender_data; opaque ciphertext; } PrivateMessage; encrypted_sender_data and ciphertext are encrypted using the AEAD function specified by the cipher suite in use, using the SenderData and PrivateMessageContent structures as input. 6.3.1. Content Encryption Content to be encrypted is encoded in a PrivateMessageContent structure. struct { select (PrivateMessage.content_type) { case application: opaque application_data; case proposal: Proposal proposal; case commit: Commit commit; }; FramedContentAuthData auth; opaque padding[length_of_padding]; } PrivateMessageContent; The padding field is set by the sender, by first encoding the content (via the select) and the auth field, and then appending the chosen number of zero bytes. A receiver identifies the padding field in a plaintext decoded from PrivateMessage.ciphertext by first decoding the content and the auth field; then the padding field comprises any remaining octets of plaintext. The padding field MUST be filled with all zero bytes. A receiver MUST verify that there are no non-zero bytes in the padding field, and if this check fails, the enclosing PrivateMessage MUST be rejected as malformed. This check ensures that the padding process is deterministic, so that, for example, padding cannot be used as a covert channel. In the MLS key schedule, the sender creates two distinct key ratchets for handshake and application messages for each member of the group. When encrypting a message, the sender looks at the ratchets it derived for its own member and chooses an unused generation from either the handshake ratchet or the application ratchet, depending on the content type of the message. This generation of the ratchet is used to derive a provisional nonce and key. Before use in the encryption operation, the nonce is XORed with a fresh random value to guard against reuse. Because the key schedule generates nonces deterministically, a client MUST keep persistent state as to where in the key schedule it is; if this persistent state is lost or corrupted, a client might reuse a generation that has already been used, causing reuse of a key/nonce pair. To avoid this situation, the sender of a message MUST generate a fresh random four-byte "reuse guard" value and XOR it with the first four bytes of the nonce from the key schedule before using the nonce for encryption. The sender MUST include the reuse guard in the reuse_guard field of the sender data object, so that the recipient of the message can use it to compute the nonce to be used for decryption. +-+-+-+-+---------...---+ | Key Schedule Nonce | +-+-+-+-+---------...---+ XOR +-+-+-+-+---------...---+ | Guard | 0 | +-+-+-+-+---------...---+ === +-+-+-+-+---------...---+ | Encrypt/Decrypt Nonce | +-+-+-+-+---------...---+ The Additional Authenticated Data (AAD) input to the encryption contains an object of the following form, with the values used to identify the key and nonce: struct { opaque group_id; uint64 epoch; ContentType content_type; opaque authenticated_data; } PrivateContentAAD; When decoding a PrivateMessageContent, the application MUST check that the FramedContentAuthData is valid. It is up to the application to decide what authenticated_data to provide and how much padding to add to a given message (if any). The overall size of the AAD and ciphertext MUST fit within the limits established for the group's AEAD algorithm in [CFRG-AEAD-LIMITS]. 6.3.2. Sender Data Encryption The "sender data" used to look up the key for content encryption is encrypted with the cipher suite's AEAD with a key and nonce derived from both the sender_data_secret and a sample of the encrypted content. Before being encrypted, the sender data is encoded as an object of the following form: struct { uint32 leaf_index; uint32 generation; opaque reuse_guard[4]; } SenderData; When constructing a SenderData object from a Sender object, the sender MUST verify Sender.sender_type is member and use Sender.leaf_index for SenderData.leaf_index. The reuse_guard field contains a fresh random value used to avoid nonce reuse in the case of state loss or corruption, as described in Section 6.3.1. The key and nonce provided to the AEAD are computed as the KDF of the first KDF.Nh bytes of the ciphertext generated in the previous section. If the length of the ciphertext is less than KDF.Nh, the whole ciphertext is used. In pseudocode, the key and nonce are derived as: ciphertext_sample = ciphertext[0..KDF.Nh-1] sender_data_key = ExpandWithLabel(sender_data_secret, "key", ciphertext_sample, AEAD.Nk) sender_data_nonce = ExpandWithLabel(sender_data_secret, "nonce", ciphertext_sample, AEAD.Nn) The AAD for the SenderData ciphertext is the first three fields of PrivateMessage: struct { opaque group_id; uint64 epoch; ContentType content_type; } SenderDataAAD; When parsing a SenderData struct as part of message decryption, the recipient MUST verify that the leaf index indicated in the leaf_index field identifies a non-blank node. 7. Ratchet Tree Operations The ratchet tree for an epoch describes the membership of a group in that epoch, providing public key encryption (HPKE) keys that can be used to encrypt to subsets of the group as well as information to authenticate the members. In order to reflect changes to the membership of the group from one epoch to the next, corresponding changes are made to the ratchet tree. In this section, we describe the content of the tree and the required operations. 7.1. Parent Node Contents As discussed in Section 4.1.1, the nodes of a ratchet tree contain several types of data describing individual members (for leaf nodes) or subgroups of the group (for parent nodes). Parent nodes are simpler: struct { HPKEPublicKey encryption_key; opaque parent_hash; uint32 unmerged_leaves; } ParentNode; The encryption_key field contains an HPKE public key whose private key is held only by the members at the leaves among its descendants. The parent_hash field contains a hash of this node's parent node, as described in Section 7.9. The unmerged_leaves field lists the leaves under this parent node that are unmerged, according to their indices among all the leaves in the tree. The entries in the unmerged_leaves vector MUST be sorted in increasing order. 7.2. Leaf Node Contents A leaf node in the tree describes all the details of an individual client's appearance in the group, signed by that client. It is also used in client KeyPackage objects to store the information that will be needed to add a client to a group. enum { reserved(0), key_package(1), update(2), commit(3), (255) } LeafNodeSource; struct { ProtocolVersion versions; CipherSuite cipher_suites; ExtensionType extensions; ProposalType proposals; CredentialType credentials; } Capabilities; struct { uint64 not_before; uint64 not_after; } Lifetime; // See the "MLS Extension Types" IANA registry for values uint16 ExtensionType; struct { ExtensionType extension_type; opaque extension_data; } Extension; struct { HPKEPublicKey encryption_key; SignaturePublicKey signature_key; Credential credential; Capabilities capabilities; LeafNodeSource leaf_node_source; select (LeafNode.leaf_node_source) { case key_package: Lifetime lifetime; case update: struct{}; case commit: opaque parent_hash; }; Extension extensions; /* SignWithLabel(., "LeafNodeTBS", LeafNodeTBS) */ opaque signature; } LeafNode; struct { HPKEPublicKey encryption_key; SignaturePublicKey signature_key; Credential credential; Capabilities capabilities; LeafNodeSource leaf_node_source; select (LeafNodeTBS.leaf_node_source) { case key_package: Lifetime lifetime; case update: struct{}; case commit: opaque parent_hash; }; Extension extensions; select (LeafNodeTBS.leaf_node_source) { case key_package: struct{}; case update: opaque group_id; uint32 leaf_index; case commit: opaque group_id; uint32 leaf_index; }; } LeafNodeTBS; The encryption_key field contains an HPKE public key whose private key is held only by the member occupying this leaf (or in the case of a LeafNode in a KeyPackage object, the issuer of the KeyPackage). The signature_key field contains the member's public signing key. The credential field contains information authenticating both the member's identity and the provided signing key, as described in Section 5.3. The capabilities field indicates the protocol features that the client supports, including protocol versions, cipher suites, credential types, non-default proposal types, and non-default extension types. The following proposal and extension types are considered "default" and MUST NOT be listed: * Proposal types: - 0x0001 - add - 0x0002 - update - 0x0003 - remove - 0x0004 - psk - 0x0005 - reinit - 0x0006 - external_init - 0x0007 - group_context_extensions * Extension types: - 0x0001 - application_id - 0x0002 - ratchet_tree - 0x0003 - required_capabilities - 0x0004 - external_pub - 0x0005 - external_senders There are no default values for the other fields of a capabilities object. The client MUST list all values for the respective parameters that it supports. The types of any non-default extensions that appear in the extensions field of a LeafNode MUST be included in the extensions field of the capabilities field, and the credential type used in the LeafNode MUST be included in the credentials field of the capabilities field. As discussed in Section 13, unknown values in capabilities MUST be ignored, and the creator of a capabilities field SHOULD include some random GREASE values to help ensure that other clients correctly ignore unknown values. The leaf_node_source field indicates how this LeafNode came to be added to the tree. This signal tells other members of the group whether the leaf node is required to have a lifetime or parent_hash, and whether the group_id is added as context to the signature. These fields are included selectively because the client creating a LeafNode is not always able to compute all of them. For example, a KeyPackage is created before the client knows which group it will be used with, so its signature can't bind to a group_id. In the case where the leaf was added to the tree based on a pre- published KeyPackage, the lifetime field represents the times between which clients will consider a LeafNode valid. These times are represented as absolute times, measured in seconds since the Unix epoch (1970-01-01T00:00:00Z). Applications MUST define a maximum total lifetime that is acceptable for a LeafNode, and reject any LeafNode where the total lifetime is longer than this duration. In order to avoid disagreements about whether a LeafNode has a valid lifetime, the clients in a group SHOULD maintain time synchronization (e.g., using the Network Time Protocol [RFC5905]). In the case where the leaf node was inserted into the tree via a Commit message, the parent_hash field contains the parent hash for this leaf node (see Section 7.9). The LeafNodeTBS structure covers the fields above the signature in the LeafNode. In addition, when the leaf node was created in the context of a group (the update and commit cases), the group ID of the group is added as context to the signature. LeafNode objects stored in the group's ratchet tree are updated according to the evolution of the tree. Each modification of LeafNode content MUST be reflected by a change in its signature. This allows other members to verify the validity of the LeafNode at any time, particularly in the case of a newcomer joining the group. 7.3. Leaf Node Validation The validity of a LeafNode needs to be verified at the following stages: * When a LeafNode is downloaded in a KeyPackage, before it is used to add the client to the group * When a LeafNode is received by a group member in an Add, Update, or Commit message * When a client validates a ratchet tree, e.g., when joining a group or after processing a Commit The client verifies the validity of a LeafNode using the following steps: * Verify that the credential in the LeafNode is valid, as described in Section 5.3.1. * Verify that the signature on the LeafNode is valid using signature_key. * Verify that the LeafNode is compatible with the group's parameters. If the GroupContext has a required_capabilities extension, then the required extensions, proposals, and credential types MUST be listed in the LeafNode's capabilities field. * Verify that the credential type is supported by all members of the group, as specified by the capabilities field of each member's LeafNode, and that the capabilities field of this LeafNode indicates support for all the credential types currently in use by other members. * Verify the lifetime field: - If the LeafNode appears in a message being sent by the client, e.g., a Proposal or a Commit, then the client MUST verify that the current time is within the range of the lifetime field. - If instead the LeafNode appears in a message being received by the client, e.g., a Proposal, a Commit, or a ratchet tree of the group the client is joining, it is RECOMMENDED that the client verifies that the current time is within the range of the lifetime field. (This check is not mandatory because the LeafNode might have expired in the time between when the message was sent and when it was received.) * Verify that the extensions in the LeafNode are supported by checking that the ID for each extension in the extensions field is listed in the capabilities.extensions field of the LeafNode. * Verify the leaf_node_source field: - If the LeafNode appears in a KeyPackage, verify that leaf_node_source is set to key_package. - If the LeafNode appears in an Update proposal, verify that leaf_node_source is set to update and that encryption_key represents a different public key than the encryption_key in the leaf node being replaced by the Update proposal. - If the LeafNode appears in the leaf_node value of the UpdatePath in a Commit, verify that leaf_node_source is set to commit. * Verify that the following fields are unique among the members of the group: - signature_key - encryption_key 7.4. Ratchet Tree Evolution Whenever a member initiates an epoch change (i.e., commits; see Section 12.4), they may need to refresh the key pairs of their leaf and of the nodes on their leaf's direct path in order to maintain forward secrecy and post-compromise security. The member initiating the epoch change generates the fresh key pairs using the following procedure. The procedure is designed in a way that allows group members to efficiently communicate the fresh secret keys to other group members, as described in Section 7.6. A member updates the nodes along its direct path as follows: * Blank all the nodes on the direct path from the leaf to the root. * Generate a fresh HPKE key pair for the leaf. * Generate a sequence of path secrets, one for each node on the leaf's filtered direct path, as follows. In this setting, path_secret[0] refers to the first parent node in the filtered direct path, path_secret[1] to the second parent node, and so on. path_secret[0] is sampled at random path_secret[n] = DeriveSecret(path_secret[n-1], "path") * Compute the sequence of HPKE key pairs (node_priv,node_pub), one for each node on the leaf's direct path, as follows. node_secret[n] = DeriveSecret(path_secret[n], "node") node_priv[n], node_pub[n] = KEM.DeriveKeyPair(node_secret[n]) The node secret is derived as a temporary intermediate secret so that each secret is only used with one algorithm: The path secret is used as an input to DeriveSecret, and the node secret is used as an input to DeriveKeyPair. For example, suppose there is a group with four members, with C an unmerged leaf at Z: Y | .-+-. / \ X Z[C] / \ / \ A B C D 0 1 2 3 Figure 13: A Full Tree with One Unmerged Leaf If member B subsequently generates an UpdatePath based on a secret "leaf_secret", then it would generate the following sequence of path secrets: path_secret[1] ---> node_secret[1] -------> node_priv[1], node_pub[1] ^ | | path_secret[0] ---> node_secret[0] -------> node_priv[0], node_pub[0] ^ | | leaf_secret ------> leaf_node_secret --+--> leaf_priv, leaf_pub | | '-------. .-------' | leaf_node Figure 14: Derivation of Ratchet Tree Keys along a Direct Path After applying the UpdatePath, the tree will have the following structure: node_priv[1] --------> Y' | .-+-. / \ node_priv[0] ----> X' Z[C] / \ / \ A B C D ^ leaf_priv -----------+ 0 1 2 3 Figure 15: Placement of Keys in a Ratchet Tree 7.5. Synchronizing Views of the Tree After generating fresh key material and applying it to update their local tree state as described in Section 7.4, the generator broadcasts this update to other members of the group in a Commit message, who apply it to keep their local views of the tree in sync with the sender's. More specifically, when a member commits a change to the tree (e.g., to add or remove a member), it transmits an UpdatePath containing a set of public keys and encrypted path secrets for intermediate nodes in the filtered direct path of its leaf. The other members of the group use these values to update their view of the tree, aligning their copy of the tree to the sender's. An UpdatePath contains the following information for each node in the filtered direct path of the sender's leaf, including the root: * The public key for the node * One or more encrypted copies of the path secret corresponding to the node The path secret value for a given node is encrypted to the subtree rooted at the parent's non-updated child, i.e., the child on the copath of the sender's leaf node. There is one encryption of the path secret to each public key in the resolution of the non-updated child. A member of the group _updates their direct path_ by computing new values for their leaf node and the nodes along their filtered direct path as follows: 1. Blank all nodes along the direct path of the sender's leaf. 2. Compute updated path secrets and public keys for the nodes on the sender's filtered direct path. * Generate a sequence of path secrets of the same length as the filtered direct path, as defined in Section 7.4. * For each node in the filtered direct path, replace the node's public key with the node_pub[n] value derived from the corresponding path secret path_secret[n]. 3. Compute the new parent hashes for the nodes along the filtered direct path and the sender's leaf node. 4. Update the leaf node for the sender. * Set the leaf_node_source to commit. * Set the encryption_key to the public key of a freshly sampled key pair. * Set the parent hash to the parent hash for the leaf. * Re-sign the leaf node with its new contents. Since the new leaf node effectively updates an existing leaf node in the group, it MUST adhere to the same restrictions as LeafNodes used in Update proposals (aside from leaf_node_source). The application MAY specify other changes to the leaf node, e.g., providing a new signature key, updated capabilities, or different extensions. The member then _encrypts path secrets to the group_. For each node in the member's filtered direct path, the member takes the following steps: 1. Compute the resolution of the node's child that is on the copath of the sender (the child that is not in the direct path of the sender). Any new member (from an Add proposal) added in the same Commit MUST be excluded from this resolution. 2. For each node in the resolution, encrypt the path secret for the direct path node using the public key of the resolution node, as defined in Section 7.6. The recipient of an UpdatePath performs the corresponding steps. First, the recipient _merges UpdatePath into the tree_: 1. Blank all nodes on the direct path of the sender's leaf. 2. For all nodes on the filtered direct path of the sender's leaf, * Set the public key to the public key in the UpdatePath. * Set the list of unmerged leaves to the empty list. 3. Compute parent hashes for the nodes in the sender's filtered direct path, and verify that the parent_hash field of the leaf node matches the parent hash for the first node in its filtered direct path. * Note that these hashes are computed from root to leaf, so that each hash incorporates all the non-blank nodes above it. The root node always has a zero-length hash for its parent hash. Second, the recipient _decrypts the path secrets_: 1. Identify a node in the filtered direct path for which the recipient is in the subtree of the non-updated child. 2. Identify a node in the resolution of the copath node for which the recipient has a private key. 3. Decrypt the path secret for the parent of the copath node using the private key from the resolution node. 4. Derive path secrets for ancestors of that node in the sender's filtered direct path using the algorithm described above. 5. Derive the node secrets and node key pairs from the path secrets. 6. Verify that the derived public keys are the same as the corresponding public keys sent in the UpdatePath. 7. Store the derived private keys in the corresponding ratchet tree nodes. For example, in order to communicate the example update described in Section 7.4, the member at node B would transmit the following values: +=============+====================================================+ | Public Key | Ciphertext(s) | +=============+====================================================+ | node_pub[1] | E(pk(Z), path_secret[1]), E(pk(C), path_secret[1]) | +-------------+----------------------------------------------------+ | node_pub[0] | E(pk(A), path_secret[0]) | +-------------+----------------------------------------------------+ Table 3 In this table, the value node_pub[i] represents the public key derived from node_secret[i], pk(X) represents the current public key of node X, and E(K, S) represents the public key encryption of the path secret S to the public key K (using HPKE). A recipient at node A would decrypt E(pk(A), path_secret\[0\]) to obtain path_secret\[0\], then use it to derive path_secret[1] and the resulting node secrets and key pairs. Thus, A would have the private keys to nodes X' and Y', in accordance with the tree invariant. Similarly, a recipient at node D would decrypt E(pk(Z), path_secret[1]) to obtain path_secret[1], then use it to derive the node secret and key pair for the node Y'. As required to maintain the tree invariant, node D does not receive the private key for the node X', since X' is not an ancestor of D. After processing the update, each recipient MUST delete outdated key material, specifically: * The path secrets and node secrets used to derive each updated node key pair. * Each outdated node key pair that was replaced by the update. 7.6. Update Paths As described in Section 12.4, each Commit message may optionally contain an UpdatePath, with a new LeafNode and set of parent nodes for the sender's filtered direct path. For each parent node, the UpdatePath contains a new public key and encrypted path secret. The parent nodes are kept in the same order as the filtered direct path. struct { opaque kem_output; opaque ciphertext; } HPKECiphertext; struct { HPKEPublicKey encryption_key; HPKECiphertext encrypted_path_secret; } UpdatePathNode; struct { LeafNode leaf_node; UpdatePathNode nodes; } UpdatePath; For each UpdatePathNode, the resolution of the corresponding copath node MUST exclude all new leaf nodes added as part of the current Commit. The length of the encrypted_path_secret vector MUST be equal to the length of the resolution of the copath node (excluding new leaf nodes), with each ciphertext being the encryption to the respective resolution node. The HPKECiphertext values are encrypted and decrypted as follows: (kem_output, ciphertext) = EncryptWithLabel(node_public_key, "UpdatePathNode", group_context, path_secret) path_secret = DecryptWithLabel(node_private_key, "UpdatePathNode", group_context, kem_output, ciphertext) Here node_public_key is the public key of the node for which the path secret is encrypted, group_context is the provisional GroupContext object for the group, and the EncryptWithLabel function is as defined in Section 5.1.3. 7.7. Adding and Removing Leaves In addition to the path-based updates to the tree described above, it is also necessary to add and remove leaves of the tree in order to reflect changes to the membership of the group (see Sections 12.1.1 and 12.1.3). Since the tree is always full, adding or removing leaves corresponds to increasing or decreasing the depth of the tree, resulting in the number of leaves being doubled or halved. These operations are also known as _extending_ and _truncating_ the tree. Leaves are always added and removed at the right edge of the tree. When the size of the tree needs to be increased, a new blank root node is added, whose left subtree is the existing tree and right subtree is a new all-blank subtree. This operation is typically done when adding a member to the group. _ <-- new blank root _ __|__ __|__ / \ / \ X ===> X _ <-- new blank subtree ===> X _ / \ / \ / \ / \ / \ A B A B _ _ A B C _ ^ | new member --+ Figure 16: Extending the Tree to Make Room for a Third Member When the right subtree of the tree no longer has any non-blank nodes, it can be safely removed. The root of the tree and the right subtree are discarded (whether or not the root node is blank). The left child of the root becomes the new root node, and the left subtree becomes the new tree. This operation is typically done after removing a member from the group. Y Y __|__ __|__ / \ / \ X _ ===> X _ ==> X <-- new root / \ / \ / \ / \ / \ A B C _ A B _ _ A B ^ | removed member --+ Figure 17: Cleaning Up after Removing Member C Concrete algorithms for these operations on array-based and link- based trees are provided in Appendices C and D. The concrete algorithms are non-normative. An implementation may use any algorithm that produces the correct tree in its internal representation. 7.8. Tree Hashes MLS hashes the contents of the tree in two ways to authenticate different properties of the tree. _Tree hashes_ are defined in this section, and _parent hashes_ are defined in Section 7.9. Each node in a ratchet tree has a tree hash that summarizes the subtree below that node. The tree hash of the root is used in the GroupContext to confirm that the group agrees on the whole tree. Tree hashes are computed recursively from the leaves up to the root. P --> th(P) ^ ^ / \ / \ th(L) th(R) Figure 18: Composition of the Tree Hash The tree hash of an individual node is the hash of the node's TreeHashInput object, which may contain either a LeafNodeHashInput or a ParentNodeHashInput depending on the type of node. LeafNodeHashInput objects contain the leaf_index and the LeafNode (if any). ParentNodeHashInput objects contain the ParentNode (if any) and the tree hash of the node's left and right children. For both parent and leaf nodes, the optional node value MUST be absent if the node is blank and present if the node contains a value. enum { reserved(0), leaf(1), parent(2), (255) } NodeType; struct { NodeType node_type; select (TreeHashInput.node_type) { case leaf: LeafNodeHashInput leaf_node; case parent: ParentNodeHashInput parent_node; }; } TreeHashInput; struct { uint32 leaf_index; optional leaf_node; } LeafNodeHashInput; struct { optional parent_node; opaque left_hash; opaque right_hash; } ParentNodeHashInput; The tree hash of an entire tree corresponds to the tree hash of the root node, which is computed recursively by starting at the leaf nodes and building up. 7.9. Parent Hashes While tree hashes summarize the state of a tree at point in time, parent hashes capture information about how keys in the tree were populated. When a client sends a Commit to change a group, it can include an UpdatePath to assign new keys to the nodes along its filtered direct path. When a client computes an UpdatePath (as defined in Section 7.5), it computes and signs a parent hash that summarizes the state of the tree after the UpdatePath has been applied. These summaries are constructed in a chain from the root to the member's leaf so that the part of the chain closer to the root can be overwritten as nodes set in one UpdatePath are reset by a later UpdatePath. ph(Q) / / V P.public_key --> ph(P) / ^ / \ V \ N.parent_hash th(S) Figure 19: Inputs to a Parent Hash As a result, the signature over the parent hash in each member's leaf effectively signs the subtree of the tree that hasn't been changed since that leaf was last changed in an UpdatePath. A new member joining the group uses these parent hashes to verify that the parent nodes in the tree were set by members of the group, not chosen by an external attacker. For an example of how this works, see Appendix B. Consider a ratchet tree with a non-blank parent node P and children D and S (for "parent", "direct path", and "sibling"), with D and P in the direct path of a leaf node L (for "leaf"): ... / P __|__ / \ D S / \ / \ ... ... ... ... / L Figure 20: Nodes Involved in a Parent Hash Computation The parent hash of P changes whenever an UpdatePath object is applied to the ratchet tree along a path from a leaf L traversing node D (and hence also P). The new "Parent hash of P (with copath child S)" is obtained by hashing P's ParentHashInput struct. struct { HPKEPublicKey encryption_key; opaque parent_hash; opaque original_sibling_tree_hash; } ParentHashInput; The field encryption_key contains the HPKE public key of P. If P is the root, then the parent_hash field is set to a zero-length octet string. Otherwise, parent_hash is the parent hash of the next node after P on the filtered direct path of the leaf L. This way, P's parent hash fixes the new HPKE public key of each non-blank node on the path from P to the root. Note that the path from P to the root may contain some blank nodes that are not fixed by P's parent hash. However, for each node that has an HPKE key, this key is fixed by P's parent hash. Finally, original_sibling_tree_hash is the tree hash of S in the ratchet tree modified as follows: For each leaf L in P.unmerged_leaves, blank L and remove it from the unmerged_leaves sets of all parent nodes. Observe that original_sibling_tree_hash does not change between updates of P. This property is crucial for the correctness of the protocol. Note that original_sibling_tree_hash is the tree hash of S, not the parent hash. The parent_hash field in ParentHashInput captures information about the nodes above P. the original_sibling_tree_hash captures information about the subtree under S that is not being updated (and thus the subtree to which a path secret for P would be encrypted according to Section 7.5). For example, in the following tree: W [F] ______|_____ / \ U Y [F] __|__ __|__ / \ / \ T _ _ _ / \ / \ / \ / \ A B C D E F G _ Figure 21: A Tree Illustrating Parent Hash Computations With P = W and S = Y, original_sibling_tree_hash is the tree hash of the following tree: Y __|__ / \ _ _ / \ / \ E _ G _ Because W.unmerged_leaves includes F, F is blanked and removed from Y.unmerged_leaves. Note that no recomputation is needed if the tree hash of S is unchanged since the last time P was updated. This is the case for computing or processing a Commit whose UpdatePath traverses P, since the Commit itself resets P. (In other words, it is only necessary to recompute the original sibling tree hash when validating a group's tree on joining.) More generally, if none of the entries in P.unmerged_leaves are in the subtree under S (and thus no leaves were blanked), then the original tree hash at S is the tree hash of S in the current tree. If it is necessary to recompute the original tree hash of a node, the efficiency of recomputation can be improved by caching intermediate tree hashes, to avoid recomputing over the subtree when the subtree is included in multiple parent hashes. A subtree hash can be reused as long as the intersection of the parent's unmerged leaves with the subtree is the same as in the earlier computation. 7.9.1. Using Parent Hashes In ParentNode objects and LeafNode objects with leaf_node_source set to commit, the value of the parent_hash field is the parent hash of the next non-blank parent node above the node in question (the next node in the filtered direct path). Using the node labels in Figure 20, the parent_hash field of D is equal to the parent hash of P with copath child S. This is the case even when the node D is a leaf node. The parent_hash field of a LeafNode is signed by the member. The signature of such a LeafNode thus attests to which keys the group member introduced into the ratchet tree and to whom the corresponding secret keys were sent, in addition to the other contents of the LeafNode. This prevents malicious insiders from constructing artificial ratchet trees with a node D whose HPKE secret key is known to the insider, yet where the insider isn't assigned a leaf in the subtree rooted at D. Indeed, such a ratchet tree would violate the tree invariant. 7.9.2. Verifying Parent Hashes Parent hashes are verified at two points in the protocol: When joining a group and when processing a Commit. The parent hash in a node D is valid with respect to a parent node P if the following criteria hold. Here C and S are the children of P (for "child" and "sibling"), with C being the child that is on the direct path of D (possibly D itself) and S being the other child: * D is a descendant of P in the tree. * The parent_hash field of D is equal to the parent hash of P with copath child S. * D is in the resolution of C, and the intersection of P's unmerged_leaves with the subtree under C is equal to the resolution of C with D removed. These checks verify that D and P were updated at the same time (in the same UpdatePath), and that they were neighbors in the UpdatePath because the nodes in between them would have omitted from the filtered direct path. A parent node P is "parent-hash valid" if it can be chained back to a leaf node in this way. That is, if there is leaf node L and a sequence of parent nodes P_1, ..., P_N such that P_N = P and each step in the chain is authenticated by a parent hash, then L's parent hash is valid with respect to P_1, P_1's parent hash is valid with respect to P_2, and so on. When joining a group, the new member MUST authenticate that each non- blank parent node P is parent-hash valid. This can be done "bottom up" by building chains up from leaves and verifying that all non- blank parent nodes are covered by exactly one such chain, or "top down" by verifying that there is exactly one descendant of each non- blank parent node for which the parent node is parent-hash valid. When processing a Commit message that includes an UpdatePath, clients MUST recompute the expected value of parent_hash for the committer's new leaf and verify that it matches the parent_hash value in the supplied leaf_node. After being merged into the tree, the nodes in the UpdatePath form a parent-hash chain from the committer's leaf to the root. 8. Key Schedule Group keys are derived using the Extract and Expand functions from the KDF for the group's cipher suite, as well as the functions defined below: ExpandWithLabel(Secret, Label, Context, Length) = KDF.Expand(Secret, KDFLabel, Length) DeriveSecret(Secret, Label) = ExpandWithLabel(Secret, Label, "", KDF.Nh) Where KDFLabel is specified as: struct { uint16 length; opaque label; opaque context; } KDFLabel; And its fields are set to: length = Length; label = "MLS 1.0 " + Label; context = Context; The value KDF.Nh is the size of an output from KDF.Extract, in bytes. In the below diagram: * KDF.Extract takes its salt argument from the top and its Input Keying Material (IKM) argument from the left. * DeriveSecret takes its Secret argument from the incoming arrow. * 0 represents an all-zero byte string of length KDF.Nh. When processing a handshake message, a client combines the following information to derive new epoch secrets: * The init secret from the previous epoch * The commit secret for the current epoch * The GroupContext object for current epoch Given these inputs, the derivation of secrets for an epoch proceeds as shown in the following diagram: init_secret_[n-1] | | V commit_secret --> KDF.Extract | | V ExpandWithLabel(., "joiner", GroupContext_[n], KDF.Nh) | | V joiner_secret | | V psk_secret (or 0) --> KDF.Extract | | +--> DeriveSecret(., "welcome") | = welcome_secret | V ExpandWithLabel(., "epoch", GroupContext_[n], KDF.Nh) | | V epoch_secret | | +--> DeriveSecret(.,