# MSC3871: Gappy timeline

`/messages` returns a linearized version of the event DAG. From any given
homeserver's perspective of the room, the DAG can have gaps where it is missing
events. This could be because the homeserver hasn't fetched them yet, or because
it failed to fetch them: the homeservers that have the events are unreachable and
no one else knows about them.

Currently, there is an unwritten rule between the server and client that the
server will always return all contiguous events in that part of the timeline.
But the server has to break this rule when it doesn't have an event
and is unable to get it from anyone else. This MSC aims to change the
dynamic so the server can give the client feedback and an indication of where
the gaps are.

This way, clients know where they are missing events and can even retry fetching,
perhaps by adding some UI to the timeline like "We failed to get some messages
in this gap, try again."

This can also make servers faster to respond to `/messages`. For example,
Synapse currently always tries to backfill and fill in the gap (even when it
has enough messages locally to respond). In big rooms like `#matrix:matrix.org`
(Matrix HQ), almost every place you ask for has gaps in it (thousands of
backwards extremities) and lots of those events are unreachable, so we try the
same thing over and over hoping the response will be different this time.
All this does is make the `/messages` response slow. With this MSC, we can
instead be more intelligent about backfilling in the background and just tell
the client about the gap, which it can retry fetching a little later.

## Proposal

Add a `gaps` field to the response of
[`GET /_matrix/client/v3/rooms/{roomId}/messages`](https://spec.matrix.org/v1.1/client-server-api/#get_matrixclientv3roomsroomidmessages).
This field is an array of `GapEntry` objects indicating where events are missing
from the timeline, as defined below.


### 200 response

This describes the new `gaps` response field being added to the `200` response
of `/messages`:

Name | Type | Description | Required
--- | --- | --- | ---
`gaps` | `[GapEntry]` | A list of gaps indicating where events are missing in the `chunk` | no


#### `GapEntry`

Key | Type | Value | Description | Required
--- | --- | --- | --- | ---
`event_id` | string | Event ID | The event ID indicating the position in the `/messages` `chunk` response | yes
`prev_pagination_token` | string | Pagination token | A pagination token that represents the spot in the DAG before the given `event_id` in the `chunk`. Omitting this field means there is no gap on this side. | no
`next_pagination_token` | string | Pagination token | A pagination token that represents the spot in the DAG after the given `event_id` in the `chunk`. Omitting this field means there is no gap on this side. | no
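
As a rough illustration of the table above, a client could model `GapEntry` and read the optional `gaps` array as in the following sketch. The `GapEntry` dataclass and the `parse_gaps` helper are hypothetical names chosen for this example, not part of the proposal; the fallback to `org.matrix.msc3871.gaps` assumes the server is still using the unstable prefix.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class GapEntry:
    """One entry of the `gaps` array (illustrative model, not normative)."""

    event_id: str
    # Both tokens are optional: a missing token means no gap on that side.
    prev_pagination_token: Optional[str] = None
    next_pagination_token: Optional[str] = None


def parse_gaps(response: dict[str, Any]) -> list[GapEntry]:
    """Read the optional `gaps` array from a `/messages` response body."""
    # Prefer the stable field name, then fall back to the unstable prefix.
    raw = response.get("gaps", response.get("org.matrix.msc3871.gaps", []))
    return [
        GapEntry(
            event_id=entry["event_id"],
            prev_pagination_token=entry.get("prev_pagination_token"),
            next_pagination_token=entry.get("next_pagination_token"),
        )
        for entry in raw
    ]
```

Because `gaps` is not required, a response without the field simply yields an empty list here.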


### `/messages` response examples

The following mermaid diagram represents the room DAG snapshot used for the
`/messages` responses below. The slightly transparent events with no background are events
that the homeserver does not have and are in a gap.

Pagination tokens are positions between events. This is already an established concept, but
to illustrate it better, see the `tX` pagination tokens in the following
diagram.

```mermaid
flowchart RL
    after[newest events...]:::gap-event -->|t10| fred -->|t9| waldo:::gap-event -->|t8| garply -->|t7| grault:::gap-event -->|t6| corge -->|t5| qux:::gap-event -->|t4| baz -->|t3| bar:::gap-event -->|t2| foo -->|t1| before[oldest events...]:::gap-event

    classDef gap-event opacity:0.8,fill:transparent;
```

The idea is to be able to keep paginating from
`prev_pagination_token`/`next_pagination_token` in the respective direction to fill in
the gap.

#### `/messages?dir=b`

`/messages?dir=b` response example with gaps (`chunk` has events in
reverse-chronological order since we're paginating backwards):

`/messages?dir=b&from=t6`
```json5
{
  "chunk": [
    {
      "event_id": "$corge",
      "type": "m.room.message",
      "content": {
        "body": "corge",
      }
    },
    {
      "event_id": "$baz",
      "type": "m.room.message",
      "content": {
        "body": "baz",
      }
    },
    {
      "event_id": "$foo",
      "type": "m.room.message",
      "content": {
        "body": "foo",
      }
    }
  ],
  "gaps": [
    {
      "prev_pagination_token": "t6",
      "event_id": "$corge",
      "next_pagination_token": "t5",
    },
    {
      "prev_pagination_token": "t4",
      "event_id": "$baz",
      "next_pagination_token": "t3",
    },
    {
      "prev_pagination_token": "t2",
      "event_id": "$foo",
      "next_pagination_token": "t1",
    }
  ]
}
```

#### `/messages?dir=f`

`/messages?dir=f` response example with gaps (`chunk` has events in
chronological order since we're paginating forwards):

`/messages?dir=f&from=t6`
```json5
{
  "chunk": [
    {
      "event_id": "$garply",
      "type": "m.room.message",
      "content": {
        "body": "garply",
      }
    },
    {
      "event_id": "$fred",
      "type": "m.room.message",
      "content": {
        "body": "fred",
      }
    },
  ],
  "gaps": [
    {
      "prev_pagination_token": "t7",
      "event_id": "$garply",
      "next_pagination_token": "t8",
    },
    {
      "prev_pagination_token": "t9",
      "event_id": "$fred",
      "next_pagination_token": "t10",
    }
  ]
}
```
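
To make the "respective direction" idea more concrete, here is a minimal sketch of how a client might turn a `gaps` array into follow-up `/messages` requests. It rests on one reading of the two examples above, which is an assumption rather than something the proposal states outright: `next_pagination_token` lies on the side the original request was heading towards, so the client continues in the same `dir`, while `prev_pagination_token` lies on the opposite side, so the client paginates in the other `dir`. The `gap_fill_requests` helper and its return shape are hypothetical.

```python
from typing import Any

# Flip a `/messages` direction: backwards <-> forwards.
OPPOSITE = {"b": "f", "f": "b"}


def gap_fill_requests(response: dict[str, Any], direction: str) -> list[dict[str, str]]:
    """List `from`/`dir` query parameters a client could use to retry each gap.

    `direction` is the `dir` of the request that produced `response`.
    """
    requests = []
    for entry in response.get("gaps", []):
        # Assumption: the "next" token sits after the event in chunk order,
        # so continuing in the request's own direction walks into that gap.
        if "next_pagination_token" in entry:
            requests.append({
                "event_id": entry["event_id"],
                "from": entry["next_pagination_token"],
                "dir": direction,
            })
        # Assumption: the "prev" token sits before the event in chunk order,
        # so that gap is reached by paginating the other way.
        if "prev_pagination_token" in entry:
            requests.append({
                "event_id": entry["event_id"],
                "from": entry["prev_pagination_token"],
                "dir": OPPOSITE[direction],
            })
    return requests
```

With the `dir=b` example, the `$baz` entry would produce `from=t3&dir=b` and `from=t4&dir=f`, which in the diagram are the two requests that head towards the missing `bar` and `qux` events respectively.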


## Potential issues

Lots of gaps/extremities are generated when a spam attack occurs and federation
falls behind. If clients start showing gaps with retry links, we might just be
exposing the spam more.


## Alternatives

As an alternative, we can continue to do nothing, as we do today, and not worry
about the occasional missing events. People don't seem to notice missing
messages anyway, but they probably do notice our slow `/messages` pagination.


### Expose `prev_events` to the client

One alternative is including `prev_events` in the events that the client sees so
it can work out the DAG chain itself and see whether there is a missing event in the
middle.

There is an [unspecced `/messages?raw=true` query parameter in
Synapse](https://github.com/matrix-org/synapse/blob/20c76cecb9eb84dadfa7b2d25b436d3ab9218a1a/synapse/rest/client/room.py#L653)
that returns the full raw event as seen over federation, which means it will include the
`prev_events`.

You can also specify `event_format: federation` directly in the JSON `filter` parameter
of `/messages`:
`/_matrix/client/v3/rooms/{roomId}/messages?dir=b&filter=%7B%22event_format%22%3A%20%22federation%22%7D`

Related to:

- https://github.com/matrix-org/matrix-spec/issues/859
- https://github.com/matrix-org/matrix-spec/issues/1047
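
For illustration, the `prev_events` alternative could look roughly like the sketch below on the client side. The helper names `federation_format_path` and `missing_prev_events` are hypothetical. The sketch also assumes that each returned event carries an `event_id` and a `prev_events` list of plain event ID strings; older room versions use a different `prev_events` shape, which this sketch does not handle.

```python
import json
from urllib.parse import quote


def federation_format_path(room_id: str, direction: str = "b") -> str:
    """Build the `/messages` path with the `event_format: federation` filter."""
    filter_json = json.dumps({"event_format": "federation"})
    return (
        f"/_matrix/client/v3/rooms/{quote(room_id, safe='')}/messages"
        f"?dir={direction}&filter={quote(filter_json)}"
    )


def missing_prev_events(chunk: list[dict]) -> set[str]:
    """Return event IDs referenced by `prev_events` but absent from `chunk`."""
    known = {event["event_id"] for event in chunk}
    referenced = {
        prev_id
        for event in chunk
        for prev_id in event.get("prev_events", [])
    }
    return referenced - known
```

The result is only a set of candidates: the event at the edge of the `chunk` will normally reference an event outside the page, which is not a gap by itself. This is part of why the client has to do more work under this alternative.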


### Synthetic `m.timeline.gap` event alternative

Another alternative is using synthetic events (things that look like events but
have no `event_id`), which the server inserts alongside other events in the
`chunk` to indicate where the gap is. But this has detractors since it's harder
to implement in strongly typed SDKs and makes it easy for a client to naively display
every "event" in the `chunk`.

`/messages` response example with a gap:

```json
{
  "chunk": [
    {
      "type": "m.room.message",
      "content": {
        "body": "foo"
      }
    },
    {
      "type": "m.timeline.gap",
      "content": {
        "gap_start_event_id": "$12345",
        "pagination_token": "t47409-4357353_219380_26003_2265"
      }
    },
    {
      "type": "m.room.message",
      "content": {
        "body": "baz"
      }
    }
  ]
}
```
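
The concern about naive clients can be seen in a small sketch: under this alternative, every consumer of `chunk` would need a filtering step like the hypothetical `split_chunk` below before rendering, otherwise the synthetic entry is treated as a message. The function name and return shape are assumptions made for this example only.

```python
from typing import Any

GAP_TYPE = "m.timeline.gap"


def split_chunk(chunk: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Separate real events from synthetic gap markers in a `chunk`."""
    events = []
    gap_markers = []
    for item in chunk:
        if item.get("type") == GAP_TYPE:
            # Synthetic entry: keep only its content (start event and token).
            gap_markers.append(item["content"])
        else:
            events.append(item)
    return events, gap_markers
```

A client that skips this step and iterates over `chunk` directly would try to display the `m.timeline.gap` entry as if it were a regular event.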


### `GapEntry` alternative only indicating a gap `next_to_event_id` (only one side)

Same concept as the proposed `GapEntry`, but we only indicate the gap on one
side of an event, `next_to_event_id`, according to the direction that `/messages` is
already going.

The problem with this alternative is that clients store events differently and it's
valid to want to paginate in either direction from a given event. This alternative works
fine in the Element Web case, where you always paginate backwards in the scrollback and
store events as a whole timeline list. But in another client like the [Trixnity
SDK](https://github.com/benkuly/trixnity), where events are stored individually in a
linked list, where each event could have a gap before and after, and where a gap could
be hundreds or thousands of events wide, it would be useful to paginate from both ends
to fill the gap faster.

<details>
<summary>
Details for the <code>GapEntry</code> alternative only indicating a gap <code>next_to_event_id</code>
</summary>

#### `GapEntry`

Key | Type | Value | Description | Required
--- | --- | --- | --- | ---
`next_to_event_id` | string | Event ID | The event ID indicating the position in the `/messages` `"chunk"` response where the gap starts after that position. This field can be `null` or completely omitted to indicate that the gap is at the start of the `/messages` `"chunk"` | no
`pagination_token` | string | Pagination token | A pagination token that represents the spot in the DAG to be able to continue paginating in the same direction as the request and fill in the gap from `next_to_event_id` to the next known event. | yes


### `/messages` response examples

The following mermaid diagram represents the room DAG snapshot used for the
`/messages` responses below. The slightly transparent events with no background are events
that the homeserver does not have and are in a gap.

Pagination tokens are positions between events. This is already an established concept, but
to illustrate it better, see the `tX` pagination tokens in the following
diagram.

```mermaid
flowchart RL
    after[newest events...]:::gap-event -->|t10| fred -->|t9| waldo:::gap-event -->|t8| garply -->|t7| grault:::gap-event -->|t6| corge -->|t5| qux:::gap-event -->|t4| baz -->|t3| bar:::gap-event -->|t2| foo -->|t1| before[oldest events...]:::gap-event

    classDef gap-event opacity:0.8,fill:transparent;
```

The idea is to be able to keep paginating from `pagination_token` in the same
direction as the request to fill in the gap.

#### `/messages?dir=b`

`/messages?dir=b` response example with gaps (`chunk` has events in
reverse-chronological order since we're paginating backwards):

`/messages?dir=b&from=t6`
```json5
{
  "chunk": [
    // there is no gap from `t6` to `$corge` as expected
    {
      "event_id": "$corge",
      "type": "m.room.message",
      "content": {
        "body": "corge",
      }
    },
    // <the first `GapEntry` indicates a gap here>
    {
      "event_id": "$baz",
      "type": "m.room.message",
      "content": {
        "body": "baz",
      }
    },
    // <the second `GapEntry` indicates a gap here>
    {
      "event_id": "$foo",
      "type": "m.room.message",
      "content": {
        "body": "foo",
      }
    }
    // <the third `GapEntry` indicates a gap here>
  ],
  "gaps": [
    {
      "next_to_event_id": "$corge",
      "pagination_token": "t5",
    },
    {
      "next_to_event_id": "$baz",
      "pagination_token": "t3",
    },
    {
      "next_to_event_id": "$foo",
      "pagination_token": "t1",
    }
  ]
}
```

#### `/messages?dir=f`

`/messages?dir=f` response example with gaps (`chunk` has events in
chronological order since we're paginating forwards):

`/messages?dir=f&from=t6`
```json5
{
  "chunk": [
    // <the first `GapEntry` indicates a gap here>
    {
      "event_id": "$garply",
      "type": "m.room.message",
      "content": {
        "body": "garply",
      }
    },
    // <the second `GapEntry` indicates a gap here>
    {
      "event_id": "$fred",
      "type": "m.room.message",
      "content": {
        "body": "fred",
      }
    },
    // <the third `GapEntry` indicates a gap here>
  ],
  "gaps": [
    {
      "next_to_event_id": null,
      "pagination_token": "t6",
    },
    {
      "next_to_event_id": "$garply",
      "pagination_token": "t8",
    },
    {
      "next_to_event_id": "$fred",
      "pagination_token": "t10",
    }
  ]
}
```

</details>


## Future considerations

In the future, we should consider adding the same `gaps` field to `/context` because
it's another endpoint that returns a linearized version of the DAG.

It could make sense to roll this into this MSC, but it might make the proposal less clear
if we have to bulk it up by specifying the same details for `/context`. Leaving it to a
follow-up MSC for now.


## Security considerations

Only your own homeserver controls whether a gap is added to the `/messages`
response, so there shouldn't be any weird edge case where someone else can
control whether you try to fetch something.

## Unstable prefix

While this feature is in development, the `gaps` field can be used as
`org.matrix.msc3871.gaps`.

### While the MSC is unstable

During this period, to detect server support, clients should check for the
presence of the `org.matrix.msc3871` flag in `unstable_features` on `/versions`.
Clients are also required to use the unstable prefixes (see [unstable
prefix](#unstable-prefix)) during this time.

### Once the MSC is merged but not in a spec version

Once this MSC is merged, but is not yet part of the spec, clients should rely on
the presence of the `org.matrix.msc3871.stable` flag in `unstable_features` to
determine server support. If the flag is present, clients are required to use
stable prefixes (see [unstable prefix](#unstable-prefix)).

### Once the MSC is in a spec version

Once this MSC becomes a part of a spec version, clients should rely on the
presence of a spec version that supports the MSC in `versions` on
`/versions` to determine support. Servers are encouraged to keep the
`org.matrix.msc3871.stable` flag around for a reasonable amount of time
to help smooth over the transition for clients. "Reasonable" is intentionally
left as an implementation detail; however, the MSC process currently recommends
*at most* 2 months from the date of spec release.
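
The three phases above could be folded into a single client-side check, sketched below under stated assumptions. The `gaps_field_name` helper is hypothetical. The set of spec versions that include this MSC is not known yet, so it is passed in as a parameter and left empty by default rather than guessed.

```python
from typing import Any, Optional

UNSTABLE_FLAG = "org.matrix.msc3871"
STABLE_FLAG = "org.matrix.msc3871.stable"


def gaps_field_name(
    versions_response: dict[str, Any],
    spec_versions_with_gaps: frozenset[str] = frozenset(),
) -> Optional[str]:
    """Pick the `/messages` field to read, or `None` if gaps are unsupported.

    `versions_response` is the JSON body of `GET /_matrix/client/versions`.
    `spec_versions_with_gaps` is a caller-supplied placeholder for the spec
    versions that include this MSC (unknown at the time of writing).
    """
    features = versions_response.get("unstable_features", {})
    supported = set(versions_response.get("versions", []))

    # Phases 2 and 3: merged, or released in a spec version -> stable name.
    if supported & spec_versions_with_gaps or features.get(STABLE_FLAG):
        return "gaps"
    # Phase 1: still unstable -> prefixed name.
    if features.get(UNSTABLE_FLAG):
        return "org.matrix.msc3871.gaps"
    return None
```

A client would call this once per homeserver and then read the returned field name from each `/messages` response, treating `None` as "this server will not report gaps".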