MSC4225: Specification of an order in which one-time-keys should be issued (#4225)
* Initial proposal * edits * typos * Add impl note about dropping old keyspull/4241/head
parent
aa7a3fc035
commit
5370f489cd
@ -0,0 +1,155 @@
|
||||
# MSC4225: Specification of an order in which one-time-keys should be issued
|
||||
|
||||
The specification for the
|
||||
[`/keys/claim`](https://spec.matrix.org/v1.12/client-server-api/#post_matrixclientv3keysclaim)
|
||||
endpoint does not specify any particular order in which one-time keys (OTKs)
|
||||
should be returned, leading to the possibility that old keys can remain on the
|
||||
server a long time. Clients may discard the private parts of such old keys,
|
||||
leading to [unable-to-decrypt
|
||||
errors](https://github.com/element-hq/element-meta/issues/2356).
|
||||
|
||||
See the Appendix below for a more detailed discussion of the problem.
|
||||
|
||||
## Proposal
|
||||
|
||||
[`/_matrix/client/v3/keys/claim`](https://spec.matrix.org/v1.12/client-server-api/#post_matrixclientv3keysclaim)
|
||||
and [`/_matrix/federation/v1/user/keys/claim`](https://spec.matrix.org/v1.12/server-server-api/#post_matrixfederationv1userkeysclaim)
|
||||
should return one-time keys in the order that they were uploaded via
|
||||
[`/keys/upload`](https://spec.matrix.org/v1.12/client-server-api/#post_matrixclientv3keysupload). All
|
||||
keys uploaded in a given call to `/keys/upload` are considered equivalent in
|
||||
this regard; no ordering is specified within them.
|
||||
|
||||
This means that the server will retain only the most recently-uploaded one-time
|
||||
keys, therefore significantly reducing the chance that clients will discard
|
||||
private one-time keys that are later used.
|
||||
|
||||
### Implementation note
|
||||
|
||||
When servers first implement this proposal, they may still have old one-time
|
||||
keys in their datastore, and this may take many years to resolve.
|
||||
|
||||
It is suggested that one solution to this is, as a one-time job, to drop all
|
||||
one-time keys older than, say, a week, from the database. Having done so,
|
||||
clients will upload new one-time keys as soon as they come online; in the
|
||||
meantime, fallback keys are available to allow conversation to continue.
|
||||
|
||||
Servers might consider applying further heuristics: for example, keys produced
|
||||
by libolm are far more likely to have been dropped by the client than those
|
||||
produced by Vodozemac, because libolm only retained 100 keys, whereas Vodozemac
|
||||
retains 5000. The two can be identified by the different lengths of key ID they
|
||||
produce (6 characters vs 11 characters).
|
||||
|
||||
## Potential issues
|
||||
|
||||
This proposal is not a complete solution to the problem of premature discarding
|
||||
of one-time keys: even if the server issues a recent one-time key, it is still
|
||||
possible for a to-device message to be delayed so long that the recipient has
|
||||
discarded the private part of the one-time key. It is, however, a significant
|
||||
improvement. A possible future solution is for clients that expect to be used
|
||||
in conditions of poor connectivity to keep old OTKs for longer.
|
||||
|
||||
There are other ways in which the server and client can get out of sync with
|
||||
respect to one-time keys, including by a [database
|
||||
rollback](https://github.com/element-hq/element-meta/issues/2155), or
|
||||
implementation defects. It is anticipated that other solutions will be found
|
||||
for those situations.
|
||||
|
||||
## Alternatives
|
||||
|
||||
1. Mandate that clients keep the private parts of all uploaded but unused OTKs
|
||||
indefinitely.
|
||||
|
||||
Given that each one-time key is only 32 bytes (plus the ID, say 8 bytes), it
|
||||
would certainly be possible for a client implementation to hold a very large
|
||||
number of private OTKs without significant concerns about resource usage. In
|
||||
particular, a OTK is much smaller than a Megolm key, and there is no limit
|
||||
to the number of Megolm keys that a client has to retain.
|
||||
|
||||
In short though, this just seems inefficient, compared to specifying that
|
||||
OTKs should be issued in upload order.
|
||||
|
||||
2. [MSC4162](https://github.com/matrix-org/matrix-spec-proposals/pull/4162)
|
||||
proposes a mechanism by which a client can inform the server that it is
|
||||
discarding certain OTKs, so that the server can also remove the public
|
||||
keys. This seems a heavier-weight solution to the problem.
|
||||
|
||||
3. A more invasive alternative would be to design an encryption stack which
|
||||
does not rely on one-time keys. Setting aside whether the security
|
||||
properties of such a protocol would be sufficient, this is considered well
|
||||
out-of-scope for this proposal.
|
||||
|
||||
## Security considerations
|
||||
|
||||
Requiring keys to be allocated in upload order might leak information about the
|
||||
user's traffic level, since key IDs are typically allocated sequentially: if I
|
||||
issue two claims for Alice's OTKs a week apart, and I get sequential key IDs, I
|
||||
know that nobody else has opened a conversation with her in that time.
|
||||
|
||||
To mitigate this, we might consider having clients create key IDs
|
||||
non-sequentially (whilst remaining unique). This is considered out-of-scope for
|
||||
this MSC.
|
||||
|
||||
## Unstable prefix
|
||||
|
||||
None required.
|
||||
|
||||
## Dependencies
|
||||
|
||||
None.
|
||||
|
||||
## Appendix: detailed explanation of the failure mode
|
||||
|
||||
### Background
|
||||
|
||||
End-to-end encryption in Matrix relies on individual devices sharing
|
||||
[Megolm](https://gitlab.matrix.org/matrix-org/olm/blob/master/docs/megolm.md)
|
||||
message keys via [to-device
|
||||
messages](https://spec.matrix.org/v1.12/client-server-api/#send-to-device-messaging)
|
||||
which are themselves encrypted using the
|
||||
[Olm](https://gitlab.matrix.org/matrix-org/olm/blob/master/docs/olm.md)
|
||||
ratchet.
|
||||
|
||||
Suppose Alice wishes to send an Olm-encrypted message to Bob (with whom she
|
||||
has not previously established an Olm session). When Bob logged in, his device
|
||||
will have created a long-term identity key, as well as a number (typically 50)
|
||||
of one-time keys. Each of these keys has a private part, which Bob's device
|
||||
retains locally, and a public part, which Bob's device publishes to his
|
||||
homeserver via the
|
||||
[`/keys/upload`](https://spec.matrix.org/v1.12/client-server-api/#post_matrixclientv3keysupload) endpoint.
|
||||
|
||||
Now, to establish an Olm session, Alice needs to claim one of Bob's one-time
|
||||
keys. She does this by calling the
|
||||
[`/keys/claim`](https://spec.matrix.org/v1.12/client-server-api/#post_matrixclientv3keysclaim)
|
||||
endpoint. Together with Bob's identity key, this allows Alice to encrypt a
|
||||
message that only Bob will be able to decrypt.
|
||||
|
||||
Over time, Bob's supply of one-time keys will be depleted. The `/sync` endpoint
|
||||
informs clients how many one-time keys remain unclaimed on the server, so that
|
||||
it can generate new ones when required.
|
||||
|
||||
See [One-time and fallback
|
||||
keys](https://spec.matrix.org/v1.12/client-server-api/#one-time-and-fallback-keys)
|
||||
which specifies much of this behaviour.
|
||||
|
||||
### Problem
|
||||
|
||||
Clearly, a device must retain the private part of each one-time key until that
|
||||
key is used to establish an Olm session. However, for a number of reasons,
|
||||
ranging from network errors to malicious activity, it is possible for a claimed
|
||||
one-time key never to be used to establish an Olm session.
|
||||
|
||||
This presents a problem: there is a limit to the number of private one-time
|
||||
keys that a client can retain. Over time, as keys are repeatedly claimed,
|
||||
replaced with newly-generated keys, but not actually used, the client must
|
||||
start to discard older keys.
|
||||
|
||||
Unfortunately, the Matrix specification does not currently specify any order in
|
||||
which keys should be returned by `/keys/claim`. This means that homeservers can
|
||||
legitimately issue one-time keys effectively at random. Over time, then, it is
|
||||
easy to get into a situation where the server still holds some very old
|
||||
one-time keys, for which the client has discarded the private parts.
|
||||
|
||||
Suppose, when Alice claims one of Bob's one-time keys, she is issued one whose
|
||||
private part Bob has discarded. Then, Bob will be unable to decrypt the Olm
|
||||
message from Alice, and (assuming the Olm message contained Megolm keys), will
|
||||
be unable to decrypt any room messages that Alice sends.
|
||||
Loading…
Reference in New Issue