|
|
# Changing Event IDs to be Hashes
|
|
|
|
|
|
## Motivation
|
|
|
|
|
|
Having event IDs separate from the hashes leads to issues when a server receives
|
|
|
multiple events with the same event ID but different reference hashes. While
|
|
|
APIs could be changed to better support dealing with this situation, it is
|
|
|
easier and nicer to simply drop the idea of a separate event ID entirely, and
|
|
|
instead use the reference hash of an event as its ID.
|
|
|
|
|
|
## Identifier Format
|
|
|
|
|
|
Currently hashes in our event format include the hash name, allowing servers to
|
|
|
choose which hash functions to use. The idea here was to allow a gradual change
|
|
|
between hash functions without the need to globally coordinate shifting from one
|
|
|
hash function to another.
|
|
|
|
|
|
However now that room versions exist, changing hash functions can be achieved by
|
|
|
bumping the room version. Using this method would allow using a simple string as
|
|
|
the event ID rather than a full structure, significantly easing their usage.
|
|
|
|
|
|
One side effect of this would be that there would be no indication about which
|
|
|
hash function was actually used, and it would need to be inferred from the room
|
|
|
version. To aid debuggability it may be worth encoding the hash function into
|
|
|
the ID format.
|
|
|
|
|
|
**Conclusion:** Don't encode the hash function, since the hash will depend on
|
|
|
the version specific redaction algorithm anyway.
|
|
|
|
|
|
The proposal is therefore that the event IDs are a sha256 hash, encoded using
|
|
|
[unpadded
|
|
|
Base64](https://matrix.org/docs/spec/appendices.html#unpadded-base64), and
|
|
|
prefixed with `$` (to aid distinguishing different types of identifiers). For
|
|
|
example, an event ID might be: `$CD66HAED5npg6074c6pDtLKalHjVfYb2q4Q3LZgrW6o`.
|
|
|
|
|
|
The hash is calculated in the same way as previous event reference hashes were,
|
|
|
which is:
|
|
|
|
|
|
1. Redact the event
|
|
|
2. Remove `signatures` field from the event
|
|
|
3. Serialize the event to canonical JSON
|
|
|
4. Compute the hash of the JSON bytes
|
|
|
|
|
|
Event IDs will no longer be included as part of the event, and so must be
|
|
|
calculated by servers receiving the event.
|
|
|
|
|
|
|
|
|
## Changes to Event Formats
|
|
|
|
|
|
As well as changing the format of event IDs, we also change the format of the
|
|
|
`auth_events` and `prev_events` keys in events to simply be lists of event IDs
|
|
|
(rather than being lists of tuples).
|
|
|
|
|
|
A full event would therefore look something like (note that this is just an
|
|
|
illustrative example, and that the hashes are not correct):
|
|
|
|
|
|
```json
|
|
|
{
|
|
|
"auth_events": [
|
|
|
"$5hdALbO+xIhzcLTxCkspx5uqry9wO8322h/OI9ApnHE",
|
|
|
"$Ga0DBIICBsWIZbN292ATv8fTHIGGimwjb++w+zcHLRo",
|
|
|
"$zc4ip/DpPI9FZVLM1wN9RLqN19vuVBURmIqAohZ1HXg",
|
|
|
],
|
|
|
"content": {
|
|
|
"body": "Here is the message content",
|
|
|
"msgtype": "m.message"
|
|
|
},
|
|
|
"depth": 6,
|
|
|
"hashes": {
|
|
|
"sha256": "M6/LmcMMJKc1AZnNHsuzmf0PfwladVGK2Xbz+sUTN9k"
|
|
|
},
|
|
|
"origin": "localhost:8800",
|
|
|
"origin_server_ts": 1548094046693,
|
|
|
"prev_events": [
|
|
|
"$MoOzCuB/sacqHAvgBNOLICiGLZqGT4zB16MSFOuiO0s",
|
|
|
],
|
|
|
"room_id": "!eBrhCHJWOgqrOizwwW:localhost:8800",
|
|
|
"sender": "@anon-20190121_180719-33:localhost:8800",
|
|
|
"signatures": {
|
|
|
"localhost:8800": {
|
|
|
"ed25519:a_iIHH": "N7hwZjvHyH6r811ebZ4wwLzofKhJuIAtrQzaD3NZbf4WQNijXl5Z2BNB047aWIQCS1JyFOQKPVom4et0q9UOAA"
|
|
|
}
|
|
|
},
|
|
|
"type": "m.room.message"
|
|
|
}
|
|
|
```
|
|
|
|
|
|
## Changes to existing APIs
|
|
|
|
|
|
All APIs that accept event IDs must accept event IDs in the new format.
|
|
|
|
|
|
For S2S API, whenever a server needs to parse an event from a request or
|
|
|
response they must either already know the room version *or* be told the room
|
|
|
version in the request/response. There are separate MSCs to update APIs where
|
|
|
necessary.
|
|
|
|
|
|
For C2S API, the only change clients will see is that the event IDs have changed
|
|
|
format. Clients should already be treating event IDs as opaque strings, so no
|
|
|
changes should be required. Servers must add the `event_id` when sending the
|
|
|
event to clients, however.
|
|
|
|
|
|
Note that the `auth_events` and `prev_events` fields aren't sent to clients, and
|
|
|
so the changes proposed above won't affect clients.
|
|
|
|
|
|
|
|
|
## Protocol Changes
|
|
|
|
|
|
The `auth_events` and `prev_events` fields on an event need to be changed from a
|
|
|
list of tuples to a list of strings, i.e. remove the old event ID and simply
|
|
|
have the list of hashes.
|
|
|
|
|
|
The auth rules also need to change:
|
|
|
|
|
|
- The event no longer needs to be signed by the domain of the event ID (but
|
|
|
still needs to be signed by the sender’s domain)
|
|
|
|
|
|
- We currently allow redactions if the domain of the redaction event ID
|
|
|
matches the domain of the event ID it is redacting; which allows self
|
|
|
redaction. This check is removed and redaction events are always accepted.
|
|
|
Instead, the redaction event only takes effect and is sent down to clients
|
|
|
if/when the original event is received, and the domain of the events'
|
|
|
senders match. (While this is clearly suboptimal, it is the only practical
|
|
|
suggestion)
|
|
|
|
|
|
|
|
|
## Room Version
|
|
|
|
|
|
There will be a new room version v3 that is the same as v2 except uses the new
|
|
|
event format proposed above. v3 will be marked as 'stable' as defined in [MSC1804](https://github.com/matrix-org/matrix-doc/blob/travis/msc/room-version-client-advertising/proposals/1804-advertising-capable-room-versions.md)
|
|
|
|