Define new backup algorithm, and other improvements

- define new HMAC method that covers other fields
- indicate key source for unauthenticated keys
- define migration
pull/4048/head
Hubert Chathi 8 months ago
parent 1403cb4460
commit 77401a0d94

@ -33,42 +33,99 @@ parameter of `"MATRIX_BACKUP_MAC_KEY"` and generating 32 bytes (256 bits):
backup_mac_key = HKDF("", decryption_key, "MATRIX_BACKUP_MAC_KEY", 32)
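As a rough illustration, the derivation above can be sketched in Python using only the standard library. This is a sketch under assumptions: it takes HKDF to be instantiated with SHA-256, and treats the empty salt as described in RFC 5869 (a string of zero bytes the length of the hash output). The function names `hkdf_sha256` and `derive_backup_mac_key` are chosen for illustration and are not part of the proposal.

```python
import hashlib
import hmac


def hkdf_sha256(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256; an empty salt means 32 zero bytes."""
    if not salt:
        salt = bytes(hashlib.sha256().digest_size)
    # Extract step: PRK = HMAC(salt, IKM)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand step: T(n) = HMAC(PRK, T(n-1) || info || n)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_backup_mac_key(decryption_key: bytes) -> bytes:
    # Assumption: SHA-256 is the HKDF hash; the salt is empty, as in the formula.
    return hkdf_sha256(b"", decryption_key, b"MATRIX_BACKUP_MAC_KEY", 32)
```

Because the MAC key is a deterministic function of the backup decryption key, any client holding the decryption key can recompute it rather than fetching it separately.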
The backup MAC key can be shared/stored using [the Secrets
The backup MAC key can be shared using [the Secrets
module](https://spec.matrix.org/unstable/client-server-api/#secrets) using the
name `m.megolm_backup.v1.mac`. Note that if the backup decryption key (the
secret using the name `m.megolm_backup.v1`) is stored/shared, then the backup
MAC key does not need to be stored/shared as it can be derived from the backup
decryption key.
The `SessionData` object for the [`m.megolm_backup.v1.curve25519-aes-sha2` key
backup
algorithm](https://spec.matrix.org/unstable/client-server-api/#backup-algorithm-mmegolm_backupv1curve25519-aes-sha2)
has a new optional `authenticated` property, defaulting to
`false`. This property indicates whether the device that uploaded the key to
the backup believes that the key belongs to the given `sender_key`. This is
true if: a) the key was received via an Olm-encrypted `m.room_key` event from
the `sender_key`, b) the key was received via a trusted key forward
([MSC3879](https://github.com/matrix-org/matrix-spec-proposals/pull/3879)), or
c) the key was downloaded from the key backup, with the `authenticated`
property set to `true` and was authenticated (for example using the method from
this proposal). If the `session_data` does not have a `mac2` property (see
below), then this flag must be treated as being `false`.
No changes are made to the `AuthData` for the
`m.megolm_backup.v1.curve25519-aes-sha2` key backup algorithm.
The following changes are made to the cleartext `session_data` property of the
`KeyBackupData` object:
- a new `mac2` [FIXME: get a better name. suggestions?] property is added,
which is a MAC of the `SessionData` ciphertext (prior to base64-encoding),
using HMAC-SHA-256 with the backup MAC key derived above.
- the current `mac` property is deprecated. Clients should continue to produce
it for compatibility with older clients, but should no longer use it to
verify the contents of the backup if the `mac2` property is present. Clients
should also accept `session_data` that does not have the `mac` property if
the `mac2` property is present, as the `mac` property may become optional in
the future.
secret using the name `m.megolm_backup.v1`) is shared, then the backup MAC key
does not need to be shared as it can be derived from the backup decryption
key. Since the backup decryption key is usually stored in Secret Storage, the
backup MAC key does not need to be stored.
### `m.backup.v2.curve25519-aes-sha2`
A new backup algorithm is defined, identified by the name
"`m.backup.v2.curve25519-aes-sha2`". In addition to incrementing the version
number, this name drops the "megolm", as it is expected that other types of
keys may be stored in it, for example [MLS
groups](https://github.com/matrix-org/matrix-spec-proposals/pull/4038).
The intention of creating a new backup algorithm is to prevent an attacker from
uploading additional keys that cannot be authenticated.
The `auth_data` is the same as with `m.megolm_backup.v1.curve25519-aes-sha2`.
The `session_data` is constructed as follows:
1. Encode the session key to be backed up as a JSON object using the
`SessionDataV2` format defined below.
2. Generate an ephemeral curve25519 key, and perform an ECDH with the ephemeral
key and the backup's public key to generate a shared secret. The public half
of the ephemeral key, encoded using unpadded base64, becomes the `ephemeral`
property of the `session_data`.
3. Using the shared secret, generate 80 bytes by performing an HKDF using
SHA-256 as the hash, with a salt of 32 bytes of 0, and with the empty string
as the info. The first 32 bytes are used as the AES key, the next 32 bytes
are discarded, and the last 16 bytes are used as the AES initialization
vector. (This is the same as the key generation for
`m.megolm_backup.v1.curve25519-aes-sha2`, except that the generated MAC key
is discarded since it is unused.)
4. Stringify the JSON object, and encrypt it using AES-CBC-256 with PKCS#7
padding. This encrypted data, encoded using unpadded base64, becomes the
`ciphertext` property of the `session_data`.
5. Encode the `session_data` as canonical JSON, as would be done when [signing
JSON](https://spec.matrix.org/unstable/appendices/#signing-details), and
calculate the HMAC-SHA-256 MAC using the backup MAC key. The MAC is
base64-encoded (unpadded), and becomes the `hmac-sha-256:1` property of the
`signatures` property of `session_data`.
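The key schedule in step 3 can be sketched as follows. This is a minimal illustration using only the Python standard library. The names `hkdf_sha256` and `derive_aes_key_and_iv` are hypothetical, and the ECDH and AES-CBC steps are left out because they need a cryptography library.

```python
import hashlib
import hmac


def hkdf_sha256(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_aes_key_and_iv(shared_secret: bytes) -> tuple[bytes, bytes]:
    # Salt is 32 zero bytes, info is the empty string, output is 80 bytes.
    material = hkdf_sha256(bytes(32), shared_secret, b"", 80)
    aes_key = material[:32]
    # material[32:64] would be the v1 MAC key; it is discarded here.
    aes_iv = material[64:80]
    return aes_key, aes_iv
```

Deriving all 80 bytes and discarding the middle 32 keeps the AES key and initialization vector identical to those produced by the `m.megolm_backup.v1.curve25519-aes-sha2` derivation for the same shared secret.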
Thus the `session_data` property has `ephemeral`, `ciphertext`, and
`signatures` properties, with the `signatures` property having a
`hmac-sha-256:1` property. Keys without a `signatures`.`hmac-sha-256:1`
property must be ignored.
When verifying the MAC, the `session_data` is encoded as canonical JSON,
following the same procedure as when signing JSON. That is, the `signatures`
and `unsigned` properties are removed before encoding, and all other
properties, including any additional ones, are covered by the MAC.
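A minimal sketch of computing and checking the `hmac-sha-256:1` MAC might look like the following. The helper names (`canonical_json`, `compute_mac`, `add_mac`, `verify_mac`) are hypothetical. The canonical JSON encoding uses the `json.dumps` settings commonly used for Matrix canonical JSON, which is assumed to be sufficient for the simple string-valued objects involved here.

```python
import base64
import hashlib
import hmac
import json

MAC_NAME = "hmac-sha-256:1"


def canonical_json(value: dict) -> bytes:
    # Sorted keys, no insignificant whitespace, UTF-8 output.
    return json.dumps(
        value, ensure_ascii=False, separators=(",", ":"), sort_keys=True
    ).encode("utf-8")


def compute_mac(session_data: dict, backup_mac_key: bytes) -> str:
    # As when signing JSON, `signatures` and `unsigned` are not covered.
    signable = {
        k: v for k, v in session_data.items() if k not in ("signatures", "unsigned")
    }
    digest = hmac.new(backup_mac_key, canonical_json(signable), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")  # unpadded base64


def add_mac(session_data: dict, backup_mac_key: bytes) -> dict:
    signed = dict(session_data)
    signatures = dict(signed.get("signatures", {}))
    signatures[MAC_NAME] = compute_mac(session_data, backup_mac_key)
    signed["signatures"] = signatures
    return signed


def verify_mac(session_data: dict, backup_mac_key: bytes) -> bool:
    mac = session_data.get("signatures", {}).get(MAC_NAME)
    if not isinstance(mac, str):
        return False  # keys without the MAC must be ignored
    expected = compute_mac(session_data, backup_mac_key)
    # Constant-time comparison on bytes.
    return hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8"))
```

Since every property other than `signatures` and `unsigned` is covered, changing `ciphertext` or `ephemeral`, or adding an extra property after the MAC is computed, makes verification fail.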
The `SessionDataV2` has algorithm-dependent and algorithm-independent
properties. The algorithm-independent properties are:
- `algorithm`: (required string) the end-to-end message encryption algorithm that the
key is for. The values are the same as for the `algorithm` property in the
`m.room_key` event. For example, for Megolm keys, this is
`m.megolm.v1.aes-sha2`.
- `unauthenticated`: (optional string) if not present, the key is considered to
be authenticated, that is, the device that uploaded the key to the backup
believes that the key belongs to the recorded sender, as defined by the key
algorithm (with `m.megolm.v1.aes-sha2`, the sender is given in the
`sender_key` property). A key is considered to be authenticated if: a) the
key was received via an Olm-encrypted `m.room_key` event from the
`sender_key`, b) the key was received via a trusted key forward
([MSC3879](https://github.com/matrix-org/matrix-spec-proposals/pull/3879)),
or c) the key was downloaded from the key backup where it is marked as
authenticated, and the data can be authenticated (for example using the
method from this proposal). If the key is not considered to be
authenticated, this property indicates the source of the key. Currently
defined values are: `m.undefined`, which indicates that the source is not
specified; `m.legacy-v1`, which indicates that the key was an unauthenticated
key from a `m.megolm_backup.v1.curve25519-aes-sha2` backup ([see
below](#migrating-keys)); and `m.forwarded_room_key`, which indicates that
the key came from an untrusted key forward. (FIXME: do we also want to
encode the source of the key forward?) Clients may create other values to
specify other sources, using the Java package naming convention; clients
should consider unknown values to be `m.undefined`.
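One way a client might interpret the `unauthenticated` property is sketched below. The function name `unauthenticated_source` and the `KNOWN_SOURCES` set are illustrative assumptions; a real client would extend the set with any additional sources it understands.

```python
from typing import Optional

# Sources defined by the proposal; other values are treated as `m.undefined`.
KNOWN_SOURCES = {"m.undefined", "m.legacy-v1", "m.forwarded_room_key"}


def unauthenticated_source(session_data_v2: dict) -> Optional[str]:
    """Return None for an authenticated key, otherwise the key's source."""
    if "unauthenticated" not in session_data_v2:
        return None  # property absent: the key is considered authenticated
    source = session_data_v2["unauthenticated"]
    if isinstance(source, str) and source in KNOWN_SOURCES:
        return source
    # Unknown or malformed values are treated as `m.undefined`.
    return "m.undefined"
```

For example, `unauthenticated_source({"algorithm": "m.megolm.v1.aes-sha2"})` returns `None`, while a value such as `com.example.import` that this client does not recognise is reported as `m.undefined`.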
For the `m.megolm.v1.aes-sha2` algorithm, the algorithm-dependent properties
are the `forwarding_curve25519_key_chain`, `sender_claimed_keys`, `sender_key`,
and `session_key` properties defined for
`m.megolm_backup.v1.curve25519-aes-sha2`.
### `m.megolm_backup.v1.curve25519-aes-sha2`
Megolm keys may be uploaded to a `m.megolm_backup.v1.curve25519-aes-sha2`
backup using the `m.backup.v2.curve25519-aes-sha2` format, provided the
`session_data` also contains the `mac` property as required for the
`m.megolm_backup.v1.curve25519-aes-sha2` algorithm.
The [construction of the `session_data`
property](https://spec.matrix.org/unstable/client-server-api/#backup-algorithm-mmegolm_backupv1curve25519-aes-sha2)
@ -91,13 +148,37 @@ thus becomes:
5. Pass the raw encrypted data (prior to base64 encoding) through HMAC-SHA-256
using the MAC key generated above. The first 8 bytes of the resulting MAC
are base64-encoded, and become the `mac` property of the `session_data`.
6. Pass the raw encrypted data (prior to base64 encoding) through HMAC-SHA-256
using the backup MAC key. The MAC is base64-encoded (unpadded), and becomes
the `mac2` property of the `session_data`.
6. Encode the `session_data` as canonical JSON, as would be done when [signing
JSON](https://spec.matrix.org/unstable/appendices/#signing-details), and
calculate the HMAC-SHA-256 MAC using the backup MAC key. The MAC is
base64-encoded (unpadded), and becomes the `hmac-sha-256:1` property of the
`signatures` property of `session_data`.
FIXME: should the server compare the `signatures`.`hmac-sha-256:1` property
when a client uploads a key to the backup, when deciding whether to keep the
existing key or replace it with a new key?
To simplify logic, clients may treat `m.backup.v2.curve25519-aes-sha2`-format
keys with the same semantics as `m.megolm_backup.v1.curve25519-aes-sha2` keys
when they are in a `m.megolm_backup.v1.curve25519-aes-sha2` backup. That is,
clients may treat all keys in a `m.megolm_backup.v1.curve25519-aes-sha2` backup
as being unauthenticated, regardless of the presence or absence of the
`signatures`.`hmac-sha-256:1` property in the cleartext `session_data`
property.
#### Migrating keys
FIXME: should the server compare the `mac2` when a client uploads a key to the
backup, when deciding whether to keep the existing key or replace it with a new
key?
When migrating keys from a `m.megolm_backup.v1.curve25519-aes-sha2` backup to a
`m.backup.v2.curve25519-aes-sha2` backup, keys without a
`signatures`.`hmac-sha-256:1` property in the cleartext `session_data` property
must have the `unauthenticated` property set to `m.legacy-v1` in the encrypted
`SessionData`, regardless of whether the key originally had an
`unauthenticated` property, and a `signatures`.`hmac-sha-256:1` property added
to the cleartext `session_data`. If the same backup decryption key is used for
the old and new backups, keys that have an existing
`signatures`.`hmac-sha-256:1` property may be uploaded to the new backup
unchanged, as they will be valid `m.backup.v2.curve25519-aes-sha2`-format
keys.
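The migration rule can be sketched as below. The `decrypt` and `encrypt_and_sign` callbacks are hypothetical stand-ins for the client's own backup encryption code. How to handle a key that carries a MAC when the new backup uses a different decryption key is not covered by the text above, so re-encrypting it unchanged is an assumption, as is the expectation that the caller has already verified any existing MAC.

```python
from typing import Callable

MAC_NAME = "hmac-sha-256:1"


def migrate_key(
    cleartext: dict,
    decrypt: Callable[[dict], dict],
    encrypt_and_sign: Callable[[dict], dict],
    same_decryption_key: bool,
) -> dict:
    """Convert one v1 backup entry into a v2-format `session_data`."""
    has_mac = MAC_NAME in cleartext.get("signatures", {})
    if has_mac and same_decryption_key:
        # Already a valid v2-format key: upload unchanged.
        return cleartext
    session_data = dict(decrypt(cleartext))
    if not has_mac:
        # Overrides any `unauthenticated` value the key originally had.
        session_data["unauthenticated"] = "m.legacy-v1"
    # Assumption: a key with a MAC but a new decryption key is re-encrypted as-is.
    return encrypt_and_sign(session_data)
```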
## Potential issues
@ -165,13 +246,27 @@ message.
Until this MSC is accepted, the following unstable names should be used:
- the property name `org.matrix.msc4048.authenticated` should be used in place
of `authenticated` in the `SessionData` object,
- the property name `org.matrix.msc4048.mac2` should be used in place of `mac2`
in the `session_data` property,
- the algorithm name `org.matrix.msc4048.curve25519-aes-sha2` should
be used in place of the name `m.backup.v2.curve25519-aes-sha2`,
- the property name `org.matrix.msc4048.unauthenticated` should be used in place
of `unauthenticated` in the `SessionData` object,
- the property name `org.matrix.msc4048.hmac-sha-256:1` should be used in place
of `hmac-sha-256:1` in the `signatures` property,
- the SSSS identifier `org.matrix.msc4048.mac` should be used in place of
`m.megolm_backup.v1.mac`.
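A client could keep the stable-to-unstable mapping in one place so that switching to the stable names later is a one-line change. The sketch below covers only the names introduced by this revision; `UNSTABLE_NAMES` and `wire_name` are hypothetical names.

```python
# Stable name -> unstable name to use until the MSC is accepted.
UNSTABLE_NAMES = {
    "m.backup.v2.curve25519-aes-sha2": "org.matrix.msc4048.curve25519-aes-sha2",
    "unauthenticated": "org.matrix.msc4048.unauthenticated",
    "hmac-sha-256:1": "org.matrix.msc4048.hmac-sha-256:1",
    "m.megolm_backup.v1.mac": "org.matrix.msc4048.mac",
}


def wire_name(stable_name: str, msc_accepted: bool = False) -> str:
    """Return the name to send on the wire for a given stable name."""
    if msc_accepted:
        return stable_name
    # Names without an unstable variant are passed through unchanged.
    return UNSTABLE_NAMES.get(stable_name, stable_name)
```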
### Migration to stable names
After this MSC is accepted, clients that understand the
`org.matrix.msc4048.curve25519-aes-sha2` algorithm name should
migrate the user to a backup using the accepted version of the
`m.backup.v2.curve25519-aes-sha2` algorithm. Keys that use the unstable
property names should be re-uploaded using the stable names.
This includes migrating
`org.matrix.msc4048.curve25519-aes-sha2`-format keys uploaded to
`m.megolm_backup.v1.curve25519-aes-sha2` backups.
## Dependencies
None
