MSC2675: Serverside aggregations of message relationships (#2675)

* initial version of serverside aggregations proposal

* fix MSC numbers

* clarification

* add e2ee section from 2674 here, as it is only needed for server-side aggregations

* move edge case wrt to calling /context on a relation here from 2674

* fix typo

* clarify which APIs should bundle relations

* move stale_events over to future extensions section

* summarize stale_events and make tone conditional to mark that is not part of the MSC

* casing and wording

* clarify in summary an API for requesting relations is also proposed

* remove proposal for batch get event api as is unused and unimplemented

* attempt to clarify relations vs aggregations

* clarify pagination and align it with synapse impl already in the wild

* conciseness

* better headers

* clarify that relations are always returned, contrary to aggregations

* document the limitation of the event type not being known in e2ee rooms

* specify that redacted relations are not aggregated

* remove type in (non-binding) example as synapse doesn't do this

* mention that these are just examples

* clarify that this is a non-normative example

* Update proposals/2675-aggregations-server.md

Co-authored-by: David Baker <dbkr@users.noreply.github.com>

* add http method for endpoint list

* line break

* remove "unbundled relations" term, it's just confusing

instead use relation events,
with the bundled form now called aggregation

also restructure the headings so we have on section about aggregations
and another one about querying relation events

* some more restructuring of text after changing doc structure

* mention original_event for m.replace relations

* remove dir param as it is unused and unimplemented

* clarify that relating pending events should happen by transaction_id

* remove unimplemented /aggregations/{eventID}//{eventType}/{key}
 endpoint

* Update proposals/2675-aggregations-server.md

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* mention that the server might not be aware of all the relations

* clarify that redacted events should still return their relations and aggregations respectively

* remove /context edge case, it should not be special-cased

* Update proposals/2675-aggregations-server.md

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* bad example, replies doesn't use relations

* clarify that we dont bundle discrete events

* clarify that we dont bundle discrete events, again

* improve example

* clarify this MSC does not use a prefix

* better english

* clarify pagination in example

* better english

* remove contradication: m.reference doesn't support pagination but example mentions it

* double punctuation

* clarify that only the bundled aggregation limit for truncation can't be set by the client, /aggregations does have a limit param

* move e2ee limitation to limitations section

* clarify prefixes

* mention that state events never bundle aggregations

* Update proposals/2675-aggregations-server.md

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* add that the visibility of relations can derive from that of the target

* typsos

* be more explicit

* moar rewording

* keep related parts together

* don't make a relation invisible because the target event isn't

also clarify what to do with relations for which the target is invisible

* Update proposals/2675-aggregations-server.md

Co-authored-by: David Baker <dbkr@users.noreply.github.com>

* better words

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* be more precise when clients should ensure the key is shared

* mention that ignored users can cause different aggregations for users

* move visibility rule changes to MSC3570

* don't overspecify visibility limitation, allow for unspecified behaviour

as synapse includes the invisible events in the aggregation

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* move non-normative note to below example

* make rel_type mandatory as the response structure doesn't allow for mixing types

* fix typo/thinko

* make pagination forward only as there is no use case for backwards

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* add non-normative aggregation examples

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* use relation type rather than rel_type

the former is already define as the latter in MSC 2674

* change trailing slashes remark to event_type, rel_type is mandatory now

* reword and split out client-side aggregation section

* rename parent event to target event, the term used elsewhere

* apply suggestion

* apply suggestion

* remove pagination

* remove mentions of /aggregations endpoint after removing pagination

* add note about not bundling into state events

* restructure headers so more of the aggregations stuff is under section

* make rel_type mandatory for /relations and better wording

* remove confusion that aggregations contain more info than relations

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>

* mention that tokens from /sync, /messages can be used on /relations

* try not to be overly prescriptive

* remove edge case of ignoring events without target event, as ignoring is not always safe

* clarify limitation for encrypted rooms

* make rel_type optional again for /relations

* Update proposals/2675-aggregations-server.md

Co-authored-by: Hubert Chathi <hubert@uhoreg.ca>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Hubert Chathi <hubert@uhoreg.ca>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Hubert Chathi <hubert@uhoreg.ca>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Hubert Chathi <hubert@uhoreg.ca>

* Update proposals/2675-aggregations-server.md

Co-authored-by: Matthew Hodgson <matthew@arasphere.net>

* mention requires auth and rate-limited on /relations

* replace hypothetical examples for bundled aggregations with non-normative ones

* move to MSC 2676 as it's specific to edits

* dont repeat how local echo using transaction_id works

Co-authored-by: Bruno Windels <bruno@windels.cloud>
Co-authored-by: David Baker <dbkr@users.noreply.github.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Matthew Hodgson <matthew@arasphere.net>
pull/3778/head
Hubert Chathi 3 years ago committed by GitHub
parent a786fa980d
commit aa0350ae33
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -0,0 +1,320 @@
# MSC2675: Serverside aggregations of message relationships
It's common to want to send events in Matrix which relate to existing events -
for instance, reactions, edits and even replies/threads.
Clients typically need to track the related events alongside the original
event they relate to, in order to correctly display them. For instance,
reaction events need to be aggregated together by summing and be shown next to
the event they react to; edits need to be aggregated together by replacing the
original event and subsequent edits, etc.
It is possible to treat relations as normal events and aggregate them
clientside, but to do so comprehensively could be very resource intensive, as
the client would need to spider all possible events in a room to find
relationships and maintain a correct view.
Instead, this proposal seeks to solve this problem by defining APIs to let the
server calculate the aggregations on behalf of the client, and so bundle the
aggregated data with the original event where appropriate. It also proposes an
API to let clients paginate through all relations of an event.
This proposal is one in a series of proposals that defines a mechanism for
events to relate to each other. Together, these proposals replace
[MSC1849](https://github.com/matrix-org/matrix-doc/pull/1849).
* [MSC2674](https://github.com/matrix-org/matrix-doc/pull/2674) defines a
standard shape for indicating events which relate to other events.
* This proposal defines APIs to let the server calculate the aggregations on
behalf of the client, and so bundle the aggregated data with the original
event where appropriate.
* [MSC2676](https://github.com/matrix-org/matrix-doc/pull/2676) defines how
users can edit messages using this mechanism.
* [MSC2677](https://github.com/matrix-org/matrix-doc/pull/2677) defines how
users can annotate events, such as reacting to events with emoji, using this
mechanism.
## Proposal
### Aggregations
Relation events can be aggregated per relation type by the server.
The format of the aggregated value (hereafter called "aggregation")
depends on the relation type.
Some relation types might group the aggregations by the `key` property
in the relation and aggregate to an array, while
others might aggregate to a single object or any other value really.
Here are some non-normative examples of what aggregations can look like:
Example aggregation for [`m.thread`](https://github.com/matrix-org/matrix-doc/pull/3440) (which
aggregates all relations into a single object):
```
{
"latest_event": {
"content": { ... },
...
},
"count": 7,
"current_user_participated": true
}
```
Example aggregation for [`m.annotation`](https://github.com/matrix-org/matrix-doc/pull/2677) (which
aggregates relations into a list of objects, grouped by `key`).
```
[
{
"key": "👍",
"origin_server_ts": 1562763768320,
"count": 3
},
{
"key": "👎",
"origin_server_ts": 1562763768320,
"count": 2
}
]
```
#### Bundling
Other than during non-gappy incremental syncs, timeline events that have other
events relate to them should include the aggregation of those related events
in the `m.relations` property of their unsigned data. This process is
referred to as "bundling", and the aggregated relations included via
this mechanism are called "bundled aggregations".
By sending a summary of the relations, bundling
avoids us having to always send lots of individual relation events
to the client.
Aggregations are never bundled into state events. This is a current
implementation detail that could be revisited later,
rather than a specific design decision.
The following client-server APIs should bundle aggregations
with events they return:
- `GET /rooms/{roomId}/messages`
- `GET /rooms/{roomId}/context/{eventId}`
- `GET /rooms/{roomId}/event/{eventId}`
- `GET /sync`, only for room sections in the response where `limited` field
is `true`; this amounts to all rooms in the response if
the `since` request parameter was not passed, also known as an initial sync.
- `GET /relations`, as proposed in this MSC.
Deprecated APIs like `/initialSync` and `/events/{eventId}` are *not* required
to bundle aggregations.
The bundled aggregations are grouped according to their relation type.
The format of `m.relations` (here with *non-normative* examples of
the [`m.replace`](https://github.com/matrix-org/matrix-doc/pull/2676) and
[`m.annotation`](https://github.com/matrix-org/matrix-doc/pull/2677) relation
types) is as follows:
```json
{
"event_id": "abc",
"unsigned": {
"m.relations": {
"m.annotation": {
"key": "👍",
"origin_server_ts": 1562763768320,
"count": 3
},
"m.replace": {
"event_id": "$edit_event_id",
"origin_server_ts": 1562763768320,
"sender": "@alice:localhost"
},
}
}
}
```
#### Client-side aggregation
Bundled aggregations on an event give a snapshot of what relations were known
at the time the event was received. When relations are received through `/sync`,
clients should locally aggregate (as they might have done already before
supporting this MSC) the relation on top of any bundled aggregation the server
might have sent along previously with the target event, to get an up to date
view of the aggregations for the target event. The aggregation algorithm is the
same as the one described here for the server.
### Querying relations
A single event can have lots of associated relations, and we do not want to
overload the client by, for example, including them all bundled with the
related-to event. Instead, we also provide a new `/relations` API in
order to paginate over the relations, which behaves in a similar way to
`/messages`, except using `next_batch` and `prev_batch` names
(in line with `/sync` API). Tokens from `/sync` or `/messages` can be
passed to `/relations` to only get relating events from a section of
the timeline.
The `/relations` API returns the discrete relation events
associated with an event that the server is aware of
in standard topological order. Note that events may be missing,
see [limitations](#servers-might-not-be-aware-of-all-relations-of-an-event).
You can optionally filter by a given relation type and the event type of the
relating event:
```
GET /_matrix/client/v1/rooms/{roomID}/relations/{event_id}[/{rel_type}[/{event_type}]][?from=token][&to=token][&limit=amount]
```
```
{
"chunk": [
{
"type": "m.reaction",
"sender": "...",
"content": {
"m.relates_to": {
"rel_type": "m.annotation",
...
}
}
}
],
"prev_batch": "some_token",
"next_batch": "some_token",
}
```
The endpoint does not have any trailing slashes. It requires authentication
and is not rate-limited.
The `from` and `limit` query parameters are used for pagination, and work
just like described for the `/messages` endpoint.
Note that [MSC2676](https://github.com/matrix-org/matrix-doc/pull/2676)
adds the related-to event in `original_event` property of the response.
This way the full history (e.g. also the first, original event) of the event
is obtained without further requests. See that MSC for further details.
### End to end encryption
Since the server has to be able to aggregate relation events, structural
information about relations must be visible to the server, and so the
`m.relates_to` field must be included in the plaintext.
A future MSC may define a method for encrypting certain parts of the
`m.relates_to` field that may contain sensitive information.
### Redactions
Redacted relations should not be taken into consideration in
bundled aggregations, nor should they be returned from `/relations`.
Requesting `/relations` on a redacted event should
still return any existing relation events.
This is in line with other APIs like `/context` and `/messages`.
### Local echo
For the best possible user experience, clients should also include unsent
relations into the client-side aggregation. When adding a relation to the send
queue, clients should locally aggregate it into the relations of the target
event, ideally regardless of the target event having received an `event_id`
already or still being pending. If the client gives up on sending the relation
for some reason, the relation should be de-aggregated from the relations of
the target event. If the client offers the user a possibility of manually
retrying to send the relation, it should be re-aggregated when the user does so.
De-aggregating a relation refers to rerunning the aggregation for a given
target event while not considering the de-aggregated event any more.
Upon receiving the remote echo for any relations, a client is likely to remove
the pending event from the send queue. Here, it should also de-aggregate the
pending event from the target event's relations, and re-aggregate the received
remote event from `/sync` to make sure the client-side aggregation happens with
the same event data as on the server.
When adding a redaction for a relation to the send queue, the relation
referred to should be de-aggregated from the relations of the target of the
relation. Similar to a relation, when the sending of the redaction fails or
is cancelled, the relation should be aggregated again.
If the target event is still pending and hasn't received its `event_id` yet,
clients can locally relate relation events to their target by using
`transaction_id` like they already do for detecting remote echos when sending
events.
## Edge cases
### How do you handle ignored users?
* Information about relations sent from ignored users must never be sent to
the client, either in aggregations or discrete relation events.
This is to let you block someone from harassing you with emoji reactions
(or using edits as a side-channel to harass you). Therefore, it is possible
that different users will see different aggregations (a different last edit,
or a different reaction count) on an event.
## Limitations
### Relations can be missed while not being in the room
Relation events behave no different from other events in terms of room history visibility,
which means that some relations might not be visible to a user while they are not invited
or have not joined the room. This can cause a user to see an incomplete edit history or reaction count
based on discrete relation events upon (re)joining a room.
Ideally the server would not include these events in aggregations, as it would mean breaking
the room history visibility rules, but this MSC defers addressing this limitation and
specifying the exact server behaviour to [MSC3570](https://github.com/matrix-org/matrix-doc/pull/3570).
### Servers might not be aware of all relations of an event
The response of `/relations` might be incomplete because the homeserver
potentially doesn't have the full DAG of the room. The federation API doens't
have an equivalent of the `/relations` API, so has no way but to fetch the
full DAG over federation to assure itself that it is aware of all relations.
[MSC2836](https://github.com/matrix-org/matrix-doc/pull/2836) provided a proposal for following relationships over federation in the "Querying relationships over federation" section via a `/_matrix/federation/v1/event_relationships` API
### Event type based aggregation and filtering won't work well in encrypted rooms
The `/relations` endpoint allows filtering by event type,
which for encrypted rooms will be `m.room.encrypted`, rendering this filtering
less useful for encrypted rooms. Aggregation algorithms that take the type of
the relating events they aggregate into account will suffer from the same
limitation.
## Future extensions
### Handling limited (gappy) syncs
For the special case of a gappy incremental sync, many relations (particularly
reactions) may have occurred during the gap. It would be inefficient to send
each one individually to the client, but it would also be inefficient to send
all possible bundled aggregations to the client.
The server could tell the client the event IDs of events which
predate the gap which received relations during the gap. This means that the
client could invalidate its copy of those events (if any) and then requery them
(including their bundled relations) from the server if/when needed,
for example using an extension of the `/event` API for batch requests.
The server could do this with a new `stale_events` field of each room object
in the sync response. The `stale_events` field would list all the event IDs
prior to the gap which had updated relations during the gap. The event IDs
would be grouped by relation type,
and paginated as per the normal Matrix pagination model.
This was originally part of this MSC but left out to limit the scope
to what is implemented at the time of writing.
## Prefix
While this MSC is not considered stable, the endpoints become:
- `GET /_matrix/client/unstable/rooms/{roomID}/relations/{eventID}[/{relationType}[/{eventType}]]`
None of the newly introduced identifiers should use a prefix though, as this MSC
tries to document relation support already being used in
the wider matrix ecosystem.
Loading…
Cancel
Save