# Authenticity of room messages

## Problem

Can the recipient be confident that the message originated from the sender?

*Why could a message not originate from the sender?*

- A compromised homeserver injecting a spurious device into the user's account and sending messages from it.
- A compromised homeserver spoofing senders.
- Someone impersonating a contact with the same avatar and name.

## Proposal

A message may be authenticated to a certain level (or not at all). Message authenticity ultimately relies on the trust level of the key used to decrypt the message.

### Megolm key authenticity level

The trust levels of a Megolm key's authenticity are the following:

- None(claimed sender_key)
- Olm(sender_key)
- Device(fingerprint_key)
- Identity(msk)

`None`: We have no guarantee of the authenticity of this key.

`Olm`: The device that sent this key is unknown or has been deleted, and no authenticity can be established.

`Device`: The key is guaranteed to be owned by a given device, identified by the device signing key (Ed25519). However, there is no evidence that the device is controlled by a particular person.

`Identity`: The key is guaranteed to be owned by a given user identity, that is, a particular Master Signing Key (MSK).

### Message authenticity

Message authenticity then only depends on whether the user trusts the device or identity that owns the key, which is strongest when that device or identity has been verified.

Messages decrypted by a Megolm key with a trust level of `None` or `Olm` are not trusted. Messages decrypted by a Megolm key with a trust level of `Device` or `Identity` are authentic if the given device or identity has been verified by the user.

## Anatomy of the message authenticity chain

Megolm "room keys" are received out-of-band (not via the room, but via pairwise E2EE Olm channels). End-to-end encryption of messages sent to a room uses the established pairwise encrypted Olm sessions to distribute the "room key" (a.k.a. Megolm key, message key). The room message content is encrypted using the "room key", achieving an efficient and secure server fan-out for the messages that are sent to rooms (groups).

In order to audit the authenticity of a message, a set of properties/links must be considered:

1. Megolm key source
2. Megolm key ↔ device link
3. device ↔ user identity link
4. verification state

### 1. Megolm key source

Megolm keys can be received in several ways: via `m.room_key`, via `m.forwarded_room_key`, from server-side key backup, but also by importing manually from a file. To simplify, keys received via `m.room_key` are trusted and other sources are mostly untrusted. In the end, if a room key is received from an untrusted source, the authenticity of the message cannot be guaranteed.

### 2. Megolm key ↔ device link

The effect we want to achieve is for an `m.room_key` message to commit to a particular Olm device *(identity|fingerprint)* at the point of `m.room_key` reception.

At the point of reception, by decrypting the Olm message we know that the sender owns the `sender_key` (a.k.a. identity key, Curve25519). But a device is identified by its signing key (i.e. the Ed25519 key). To link the identity key back to the signing key, we need to download the device info from the server (`/keys/query`): the device info (identity key, signing key, device ID, user ID) is signed by the signing key, and this signature allows us to commit the received key to the Olm device, as sketched below.
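The following is a minimal, illustrative sketch of that check, assuming a simplified view of the `/keys/query` response. The `QueriedDevice` type, the `MegolmDeviceLink` enum and the `verify_device_self_signature` helper are hypothetical names for this document, not the actual matrix-sdk-crypto API.

```rust
/// Simplified view of one device entry returned by `/keys/query`
/// (hypothetical type, for illustration only).
struct QueriedDevice {
    device_id: String,
    /// Published Curve25519 identity key of the device.
    curve25519_key: String,
    /// Published Ed25519 signing (fingerprint) key of the device.
    ed25519_key: String,
}

/// Outcome of trying to commit a received room key to an Olm device.
enum MegolmDeviceLink {
    /// The Olm `sender_key` matches a correctly self-signed device:
    /// the Megolm key is committed to that device.
    Established { device_id: String },
    /// No valid device found for this `sender_key` (unknown or deleted).
    Unknown,
}

fn link_room_key_to_device(
    olm_sender_key: &str,
    devices: &[QueriedDevice],
) -> MegolmDeviceLink {
    for device in devices {
        // The device keys object is signed by the device's own Ed25519 key;
        // a valid self-signature binds the Curve25519 identity key and the
        // Ed25519 fingerprint key together.
        if verify_device_self_signature(device) && device.curve25519_key == olm_sender_key {
            return MegolmDeviceLink::Established {
                device_id: device.device_id.clone(),
            };
        }
    }
    MegolmDeviceLink::Unknown
}

/// Placeholder for the Ed25519 signature check over the canonical JSON of the
/// device keys (a real implementation would use `device.ed25519_key` here).
fn verify_device_self_signature(_device: &QueriedDevice) -> bool {
    unimplemented!()
}
```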
/!\ It is possible that we are not able to establish the device ↔ Megolm key link, for example if the `/keys/query` call doesn't return a device for the given `sender_key`. The device may have been deleted before our client got a chance to download its keys, the `/keys/query` call may fail (network or server error), or there may be federation lag. For now we cannot tell whether the device is deleted, unknown, or simply not yet known to us. We might consider modifying the protocol in the future to include the signed device content in the to-device key contents.

If the device ↔ Megolm key link cannot be resolved, the message authenticity cannot be established. Such a message should be marked with a warning, `sent from an unknown or deleted device`, as this could be an attack vector where a compromised homeserver injects short-lived devices to send messages.

### 3. device ↔ user identity link

A user identity is defined by its **master signing key** (MSK). User identities (public keys) are published to the homeserver, and can be retrieved via a `/keys/query` call (the same call used to get the devices).

The effect we want to achieve is that the device commits to the published user identity. In order to establish the authenticity of that link, we need to check whether the device is cross-signed by the published user identity, i.e. we need to check the following signatures: the MSK signs the **self-signing key** (SSK), and the SSK signs the device.

The (device ↔ user) link can be `cross_signed` | `claimed` | `unknown`:

- `cross_signed`: the device is signed by the published keys.
- `unknown`: the user has not published cross-signing keys.
- `claimed`: the device is not signed by the published identity.

### 4. verification state

This is the last level needed to ensure the authenticity of a message. If you have verified the given user, you know for sure that they own this user identity.

It is also possible to directly verify devices (if you don't have cross-signing, or if you manually verified the device). This might get deprecated in the future in favour of cross-signing verification.

## Deep dive

## Megolm key source

Goal: establish whether a `sender_key` is the legitimate owner of a Megolm key, that is, whether the trust level is `Olm`.

A Megolm key's trust level is at least `Olm` if **any** of the following are true:

- It was created on the same device (i.e. the device has an outbound session).
- It is received via an initial key share (`m.room_key`).
- It is received via a safe key forward. (See [Forwarded Key Safety](#Forwarded-Key-Safety).)
- It is retrieved from a symmetric (v2) Megolm key backup, and the encrypted payload indicates that the key is safe (in `session_data`; the unencrypted `is_verified` flag should not be used). MSC TBD.

Otherwise, the key's trust level is `None`.

### Forwarded Key Safety

From the perspective of a device D1 (of cross-signing identity U), a key forward (`m.forwarded_room_key`) received from device D2 is safe iff **both** of the following conditions are met:

- D2 is another verified (cross-verified) device of the same identity U.
- **AND** D2 [indicates it considers](https://github.com/matrix-org/matrix-spec-proposals/pull/3879) the key to be safe via the safe flag. (A new property, `trusted`, is added to the `m.forwarded_room_key` event.)

In all other situations, a key forward is unsafe and therefore the key received via such a forward is unsafe as well.
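As a rough illustration of this rule, here is a minimal sketch; `ForwardInfo` and its field names are hypothetical, not types from the actual crypto crate.

```rust
/// Hypothetical inputs to the safety decision for an `m.forwarded_room_key`
/// received by device D1 from device D2.
struct ForwardInfo {
    /// Is D2 a cross-signing verified device of our own identity U?
    from_own_verified_device: bool,
    /// The `trusted` flag D2 attached to the forwarded key (per MSC3879).
    marked_safe_by_forwarder: bool,
}

/// A forward is safe only when both conditions hold; otherwise it is unsafe,
/// and so is the key received through it.
fn is_forward_safe(info: &ForwardInfo) -> bool {
    info.from_own_verified_device && info.marked_safe_by_forwarder
}
```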
---

# ARCHIVES

## Room Message Decryption state

Applications using the crypto crate can get the E2EE properties of a message by accessing the `encryptionInfo` field of the `DecryptedRoomEvent` result.

(Changes from existing code) `decrypt_room_event(...): Result<responses::DecryptedRoomEvent>`

```rust
pub struct EncryptionInfo {
    // ...
    /// The user ID of the event sender. Note this is untrusted data unless
    /// the `safety` is safe and `device_trust_state` is Trusted.
    pub sender: OwnedUserId,
    /// Safe or Unsafe as per the key safety definition.
    /// Reflects the fact that we are sure that this group key
    /// is owned by the sending device.
    pub safety: Safety,
    /// Reflects the fact that this device is signed by the user identity,
    /// independently of whether this identity is trusted or not.
    /// One of TRUSTED_BY_OWNER, UNTRUSTED_BY_OWNER,
    /// UNKNOWN_OR_DELETED_DEVICE or NO_CROSS_SIGNING.
    /// Note this is the state of the device at the time of decryption.
    /// A new decryption attempt could give a different result
    /// (device gets deleted, or user keys updated).
    pub device_trust_state: DeviceVerificationState,
    /// The user identity (public MSK, base64-encoded) of the sender
    /// of the message at the time of decryption.
    pub user_identity: Option<String>,
}
```

## Recommendation for decoration of messages with trust/safety issues

|        | Trusted | Untrusted | Unknown/Deleted | No XSigning |
| ------ | ------- | --------- | --------------- | ----------- |
| Safe   | ✓       | Red       | Red             | Grey        |
| Unsafe | Grey    | Red       | Red             | Grey        |

**No decoration**: if the key `safety` is `Safe`, and `device_trust_state` is `TRUSTED_BY_OWNER`.

**Red level warning**: if the `device_trust_state` is `UNTRUSTED_BY_OWNER` **or** `UNKNOWN_OR_DELETED_DEVICE`.

**Grey level warning**: all other cases.
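The table above can be read as a simple mapping. Here is a minimal sketch of it, assuming `Safety` and `DeviceVerificationState` have the shapes described in `EncryptionInfo`; the `Decoration` enum and the CamelCase variant names are illustrative, not the actual API.

```rust
enum Safety {
    Safe,
    Unsafe,
}

enum DeviceVerificationState {
    TrustedByOwner,
    UntrustedByOwner,
    UnknownOrDeletedDevice,
    NoCrossSigning,
}

enum Decoration {
    None,
    RedWarning,
    GreyWarning,
}

fn decoration(safety: Safety, device: DeviceVerificationState) -> Decoration {
    use DeviceVerificationState::*;
    match (safety, device) {
        // No decoration: safe key from a device trusted by its owner.
        (Safety::Safe, TrustedByOwner) => Decoration::None,
        // Red: the device is explicitly untrusted, or unknown/deleted.
        (_, UntrustedByOwner) | (_, UnknownOrDeletedDevice) => Decoration::RedWarning,
        // Grey: everything else (unsafe key, or no cross-signing published).
        _ => Decoration::GreyWarning,
    }
}
```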
## Why don't we just reject unsafe keys now?

- The current Megolm backup uses the asymmetric algorithm and produces only unsafe keys. If these keys were rejected, users would lose access to their history.
- Sharing keys on invite ([MSC3061](https://github.com/matrix-org/matrix-spec-proposals/pull/3061)): in rooms that support it, invited users will receive forwarded keys that will all be unsafe.

## What is needed to reject unsafe keys?

- Migrating to symmetric backup, and deprecating asymmetric backup.
- Replacing the use of forwards in **MSC3061** with a new type of key gossip, code name `Certified transcript`, or `history as presented by INVITER`.

## When is key safety computed?

=> At the time of reception of the key (room_key, forwarded_room_key, backup/file import) or of its creation (outbound).

## When is the Olm -> MXID link computed?

Definition: the Olm (`sender_key`) to MXID link (enforced via the `/keys/query` signatures).

Currently at the time of decryption, because on reception of an Olm message we might not yet have downloaded the user's keys.

Some problems/context:

- If the device is deleted, a new computation will fail, as there will be no way to link the `sender_key` to the user.
- It's important to warn when the link cannot be resolved, as a malicious homeserver could inject a device, send a message, then delete the device. Currently we can't tell whether this device was owned/verified at some point in time or never was.
- We don't propagate the (Olm -> MXID) link when the key is forwarded or backed up. So on a long-standing device, a key sent by a now-deleted device would be marked as verified (as the link was resolved at the time of decryption), but a new login that requests the key (via forward or backup) won't be able to link the sender key to the MXID, and will mark messages decrypted with it with a warning.

## Various links/ownership in the Olm/Megolm world

**Megolm to Olm link:** Tells us whether a given Megolm session is owned by a given device (identified by `sender_key`). When we receive an encrypted room key, the fact that we can decrypt it tells us that it was really sent by the device with the matching `sender_key`; moreover, the `m.room_key` contains a signature that assures us that the sender actually knows the private part of the Megolm session (outbound). So a key received via an `m.room_key` allows us to say that the device identified by `sender_key` owns the Megolm session; in this case it's a trusted ownership. If we receive the key via a forward, the ownership link is only claimed (not trusted). It is possible, if the forward comes from one of our own trusted devices, to trust the claimed ownership.

**sender_key to MXID bond:** When receiving to-device messages, we might not yet have downloaded the keys. This is not needed, as the pre-key messages contain the public part of the keys needed to establish the Olm session. So upon decryption of a pre-key message, the sender_key (a.k.a. identity key) to MXID link is unknown (we only have a claimed sender MXID from the wire payload of the event). When downloading the device keys (`/keys/query`), we will be able to check the signature made with the device fingerprint key. This signature binds the sender_key, fingerprint_key and MXID together.

Note: We will not always be able to say whether a `sender_key` is bound to an MXID. Imagine someone creating a device, sending a message, then logging out before we have time to open our session. In this case we will get the Megolm key, but we won't be able to get the device from a `/keys/query` as it has been deleted. Any device with an invalid fingerprint signature will be rejected. That means that this bond is either present and valid, or unknown (deleted device).

**User Identity to sender_key bond:** Similar to the `sender_key`/`MXID` bond, the `/keys/query` response can contain an identity signature, as the SSK signs and binds the sender_key, fingerprint_key and MXID together. When such a signature is present, the device is said to be trusted by the given identity.

**Relation chain needed to establish trust in a message**

- A message is decrypted by a `megolm session`.
- The `megolm session` is owned by a `sender_key` (the ownership is trusted or claimed).
- The `sender_key` is bound to a `Device` (the bond is valid, or unknown for a deleted device).
- The `Device` is bound to an `MXID`.
- The `Device` is bound (optionally) to a `User Identity`.
- The `Device` is trusted or not trusted by us, whether locally or via cross-signing (i.e. we trust the `User Identity` bound to the `Device`).

So a **message is trusted** if and only if it is *decrypted* by a `megolm session` that is *truthfully owned* by a `sender_key` that *is bound* to a `Device` that is itself *trusted by us* (locally or via cross-signing), as sketched below.

Table of [possible link values](https://docs.google.com/spreadsheets/d/1B42HCTD_TRE7MGAAEkcoHdM4JaNtP-KDlzd41fqe58Y/edit#gid=1456597672).
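The relation chain above can be collapsed into a small boolean check. This minimal sketch uses illustrative enums rather than the actual crate types; `MegolmOwnership`, `DeviceBond` and `is_message_trusted` are hypothetical names.

```rust
enum MegolmOwnership {
    /// Proven at `m.room_key` reception (trusted ownership).
    Trusted,
    /// Only claimed, e.g. learned through a forward or an unsafe backup.
    Claimed,
}

enum DeviceBond {
    /// `sender_key` is bound to a device via a valid fingerprint signature;
    /// the flag records whether we trust that device (locally or via
    /// cross-signing of its `User Identity`).
    Valid { trusted_by_us: bool },
    /// Device unknown or deleted: the bond cannot be checked.
    Unknown,
}

/// A message is trusted iff it is decrypted by a Megolm session that is
/// truthfully owned by a `sender_key` bound to a device we trust.
fn is_message_trusted(ownership: MegolmOwnership, bond: DeviceBond) -> bool {
    matches!(
        (ownership, bond),
        (MegolmOwnership::Trusted, DeviceBond::Valid { trusted_by_us: true })
    )
}
```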
**Questions:**

*If we haven't verified the sender User/Device, what can we tell regarding the message trust?*

Basically, we then rely on the homeservers (ours and the sender's) to tell the truth regarding the (sender_key <-> Device <-> User) bonds, because an evil user couldn't lie about these bonds without the homeserver's participation. The Megolm -> sender_key link is different, as a user could lie about such a link on their own:

- A forwarded room key on invite, for example, where the sender spoofs the `sender_key`.
- Backup is different, because you would need HS cooperation to inject keys into the backup.

*Does it make sense to check if a device is trusted by an identity if we don't trust this identity?*

If we don't trust the identity, in the end we always rely on the homeserver not to lie. But still, it's possible that an attacker got access to the user's account password and is thus able to add devices to the account without HS cooperation. So it's fishy if a device is not trusted by the claimed user identity.

*What does TOFU bring?*

TOFU (Trust On First Use, also called [Opportunistic User Key Pinning](https://github.com/matrix-org/matrix-spec-proposals/blob/fayed/tofu/proposals/3834-tofu.md)) adds significant value, because we only have to trust the homeserver at one point in time. We are protected against a later compromise of the homeserver.

## Glossary

Device signing key
: The key representing the cryptographic identity of a particular Matrix device. In the current standard, it is implemented as an Ed25519 key pair.

HS
: A Matrix homeserver.

Megolm key
: A cryptographic key used for encrypting messages in an encrypted Matrix room. Synonyms include Megolm session, group session and room key.

MSK
: Master Signing Key. One of the three cross-signing keys, and the root of a cryptographic user identity.

USK
: User Signing Key. One of the three cross-signing keys, used for signing others' MSKs to indicate you've verified them.

SSK
: Self Signing Key. One of the three cross-signing keys, used for signing your own devices.

MXID
: Matrix ID, also called a user ID in various places. Ex. `@foo:example.com`

to-device message
: A message delivered directly to a Matrix device (rather than in a Matrix room) using the [Send-to-Device API](https://spec.matrix.org/v1.5/client-server-api/#send-to-device-messaging). Encrypted to-device messages are encrypted using Olm.

TOFU
: Trust On First Use, also called Opportunistic User Key Pinning.

`sender_key`
: A term used to refer to a device's Curve25519 cryptographic key, used for establishing Olm sessions.

User identity
: The identity of a particular Matrix user (for example, a real person or a particular bot), cryptographically reified in the form of a Master Signing Key (MSK). The MSK is verified to belong to a Matrix user via out-of-band verification. A given user identity is also bound to a particular MXID by including it in the MSK struct and signing it with the MSK.