From 16ec511e9c858b9fd49456922132f4fc389e4ddd Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Sat, 15 Jan 2022 17:07:08 -0500
Subject: [PATCH] MSC2675: Serverside aggregations of message relationships (#2675)

* initial version of serverside aggregations proposal
* fix MSC numbers
* clarification
* add e2ee section from 2674 here, as it is only needed for server-side aggregations
* move edge case wrt to calling /context on a relation here from 2674
* fix typo
* clarify which APIs should bundle relations
* move stale_events over to future extensions section
* summarize stale_events and make tone conditional to mark that is not part of the MSC
* casing and wording
* clarify in summary an API for requesting relations is also proposed
* remove proposal for batch get event api as is unused and unimplemented
* attempt to clarify relations vs aggregations
* clarify pagination and align it with synapse impl already in the wild
* conciseness
* better headers
* clarify that relations are always returned, contrary to aggregations
* document the limitation of the event type not being known in e2ee rooms
* specify that redacted relations are not aggregated
* remove type in (non-binding) example as synapse doesn't do this
* mention that these are just examples
* clarify that this is a non-normative example
* Update proposals/2675-aggregations-server.md Co-authored-by: David Baker
* add http method for endpoint list
* line break
* remove "unbundled relations" term, it's just confusing instead use relation events, with the bundled form now called aggregation also restructure the headings so we have on section about aggregations and another one about querying relation events
* some more restructuring of text after changing doc structure
* mention original_event for m.replace relations
* remove dir param as it is unused and unimplemented
* clarify that relating pending events should happen by transaction_id
* remove unimplemented /aggregations/{eventID}//{eventType}/{key} endpoint
* Update proposals/2675-aggregations-server.md Co-authored-by: Patrick Cloke
* mention that the server might not be aware of all the relations
* clarify that redacted events should still return their relations and aggregations respectively
* remove /context edge case, it should not be special-cased
* Update proposals/2675-aggregations-server.md Co-authored-by: Patrick Cloke
* Update proposals/2675-aggregations-server.md Co-authored-by: Patrick Cloke
* Update proposals/2675-aggregations-server.md Co-authored-by: Patrick Cloke
* bad example, replies doesn't use relations
* clarify that we dont bundle discrete events
* clarify that we dont bundle discrete events, again
* improve example
* clarify this MSC does not use a prefix
* better english
* clarify pagination in example
* better english
* remove contradication: m.reference doesn't support pagination but example mentions it
* double punctuation
* clarify that only the bundled aggregation limit for truncation can't be set by the client, /aggregations does have a limit param
* move e2ee limitation to limitations section
* clarify prefixes
* mention that state events never bundle aggregations
* Update proposals/2675-aggregations-server.md Co-authored-by: Patrick Cloke
* add that the visibility of relations can derive from that of the target
* typsos
* be more explicit
* moar rewording
* keep related parts together
* don't make a relation invisible because the target event isn't also clarify what to do with relations for which the target is invisible
* Update proposals/2675-aggregations-server.md Co-authored-by: David Baker
* better words
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* be more precise when clients should ensure the key is shared
* mention that ignored users can cause different aggregations for users
* move visibility rule changes to MSC3570
* don't overspecify visibility limitation, allow for unspecified behaviour as synapse includes the invisible events in the aggregation
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* move non-normative note to below example
* make rel_type mandatory as the response structure doesn't allow for mixing types
* fix typo/thinko
* make pagination forward only as there is no use case for backwards
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* add non-normative aggregation examples
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* use relation type rather than rel_type the former is already define as the latter in MSC 2674
* change trailing slashes remark to event_type, rel_type is mandatory now
* reword and split out client-side aggregation section
* rename parent event to target event, the term used elsewhere
* apply suggestion
* apply suggestion
* remove pagination
* remove mentions of /aggregations endpoint after removing pagination
* add note about not bundling into state events
* restructure headers so more of the aggregations stuff is under section
* make rel_type mandatory for /relations and better wording
* remove confusion that aggregations contain more info than relations
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Update proposals/2675-aggregations-server.md Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* mention that tokens from /sync, /messages can be used on /relations
* try not to be overly prescriptive
* remove edge case of ignoring events without target event, as ignoring is not always safe
* clarify limitation for encrypted rooms
* make rel_type optional again for /relations
* Update proposals/2675-aggregations-server.md Co-authored-by: Hubert Chathi
* Update proposals/2675-aggregations-server.md Co-authored-by: Hubert Chathi
* Update proposals/2675-aggregations-server.md Co-authored-by: Hubert Chathi
* Update proposals/2675-aggregations-server.md Co-authored-by: Hubert Chathi
* Update proposals/2675-aggregations-server.md Co-authored-by: Matthew Hodgson
* mention requires auth and rate-limited on /relations
* replace hypothetical examples for bundled aggregations with non-normative ones
* move to MSC 2676 as it's specific to edits
* dont repeat how local echo using transaction_id works

Co-authored-by: Bruno Windels
Co-authored-by: David Baker
Co-authored-by: Patrick Cloke
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Matthew Hodgson
---
 proposals/2675-aggregations-server.md | 320 ++++++++++++++++++++++++++
 1 file changed, 320 insertions(+)
 create mode 100644 proposals/2675-aggregations-server.md

diff --git a/proposals/2675-aggregations-server.md b/proposals/2675-aggregations-server.md
new file mode 100644
index 00000000..5808dbec
--- /dev/null
+++ b/proposals/2675-aggregations-server.md
@@ -0,0 +1,320 @@
+# MSC2675: Serverside aggregations of message relationships
+
+It's common to want to send events in Matrix which relate to existing events -
+for instance, reactions, edits and even replies/threads.
+
+Clients typically need to track the related events alongside the original
+event they relate to, in order to correctly display them. For instance,
+reaction events need to be aggregated together by summing and be shown next to
+the event they react to; edits need to be aggregated together by replacing the
+original event and subsequent edits, etc.
+
+It is possible to treat relations as normal events and aggregate them
+clientside, but to do so comprehensively could be very resource intensive, as
+the client would need to spider all possible events in a room to find
+relationships and maintain a correct view.
+
+Instead, this proposal seeks to solve this problem by defining APIs to let the
+server calculate the aggregations on behalf of the client, and so bundle the
+aggregated data with the original event where appropriate. It also proposes an
+API to let clients paginate through all relations of an event.
+
+This proposal is one in a series of proposals that defines a mechanism for
+events to relate to each other. Together, these proposals replace
+[MSC1849](https://github.com/matrix-org/matrix-doc/pull/1849).
+
+* [MSC2674](https://github.com/matrix-org/matrix-doc/pull/2674) defines a
+  standard shape for indicating events which relate to other events.
+* This proposal defines APIs to let the server calculate the aggregations on
+  behalf of the client, and so bundle the aggregated data with the original
+  event where appropriate.
+* [MSC2676](https://github.com/matrix-org/matrix-doc/pull/2676) defines how
+  users can edit messages using this mechanism.
+* [MSC2677](https://github.com/matrix-org/matrix-doc/pull/2677) defines how
+  users can annotate events, such as reacting to events with emoji, using this
+  mechanism.
+
+## Proposal
+
+### Aggregations
+
+Relation events can be aggregated per relation type by the server.
+The format of the aggregated value (hereafter called "aggregation")
+depends on the relation type.
+
+Some relation types might group the aggregations by the `key` property
+in the relation and aggregate to an array, while
+others might aggregate to a single object or any other value.
+
+Here are some non-normative examples of what aggregations can look like:
+
+Example aggregation for [`m.thread`](https://github.com/matrix-org/matrix-doc/pull/3440) (which
+aggregates all relations into a single object):
+```
+{
+  "latest_event": {
+    "content": { ... },
+    ...
+  },
+  "count": 7,
+  "current_user_participated": true
+}
+```
+
+Example aggregation for [`m.annotation`](https://github.com/matrix-org/matrix-doc/pull/2677) (which
+aggregates relations into a list of objects, grouped by `key`):
+```
+[
+  {
+    "key": "👍",
+    "origin_server_ts": 1562763768320,
+    "count": 3
+  },
+  {
+    "key": "👎",
+    "origin_server_ts": 1562763768320,
+    "count": 2
+  }
+]
+```
+
+#### Bundling
+
+Other than during non-gappy incremental syncs, timeline events that have other
+events relate to them should include the aggregation of those related events
+in the `m.relations` property of their unsigned data. This process is
+referred to as "bundling", and the aggregated relations included via
+this mechanism are called "bundled aggregations".
+
+By sending a summary of the relations, bundling
+avoids us having to always send lots of individual relation events
+to the client.
+
+Aggregations are never bundled into state events. This is a current
+implementation detail that could be revisited later,
+rather than a specific design decision.
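
As a rough illustration of the grouping described above for relation types that aggregate by `key`, the following Python sketch counts `m.annotation` relations per key. The function name `aggregate_annotations` and the reduced output shape (only `key` and `count`) are assumptions made for this example; this proposal does not define them, and the actual aggregation format is left to the relation type.

```python
def aggregate_annotations(relation_events):
    """Group m.annotation relation events by key and count them.

    Hypothetical helper: the output keeps only "key" and "count",
    which is a simplification of the non-normative example above.
    """
    groups = {}
    for event in relation_events:
        relates_to = event.get("content", {}).get("m.relates_to", {})
        # Skip anything that is not an annotation with a key.
        if relates_to.get("rel_type") != "m.annotation" or "key" not in relates_to:
            continue
        key = relates_to["key"]
        group = groups.setdefault(key, {"key": key, "count": 0})
        group["count"] += 1
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return list(groups.values())


def annotation(key):
    """Build a minimal relation event for the example (placeholder target ID)."""
    return {
        "type": "m.reaction",
        "content": {
            "m.relates_to": {"rel_type": "m.annotation", "event_id": "$target", "key": key}
        },
    }


example = aggregate_annotations([annotation("👍"), annotation("👎"), annotation("👍")])
```

A relation type that aggregates to a single object, such as the thread example, would need a different reducer; only the grouping-by-`key` case is sketched here.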
+
+The following client-server APIs should bundle aggregations
+with events they return:
+
+  - `GET /rooms/{roomId}/messages`
+  - `GET /rooms/{roomId}/context/{eventId}`
+  - `GET /rooms/{roomId}/event/{eventId}`
+  - `GET /sync`, only for room sections in the response where the `limited` field
+    is `true`; this amounts to all rooms in the response if
+    the `since` request parameter was not passed, also known as an initial sync.
+  - `GET /relations`, as proposed in this MSC.
+
+Deprecated APIs like `/initialSync` and `/events/{eventId}` are *not* required
+to bundle aggregations.
+
+The bundled aggregations are grouped according to their relation type.
+The format of `m.relations` (here with *non-normative* examples of
+the [`m.replace`](https://github.com/matrix-org/matrix-doc/pull/2676) and
+[`m.annotation`](https://github.com/matrix-org/matrix-doc/pull/2677) relation
+types) is as follows:
+
+```json
+{
+  "event_id": "abc",
+  "unsigned": {
+    "m.relations": {
+      "m.annotation": {
+        "key": "👍",
+        "origin_server_ts": 1562763768320,
+        "count": 3
+      },
+      "m.replace": {
+        "event_id": "$edit_event_id",
+        "origin_server_ts": 1562763768320,
+        "sender": "@alice:localhost"
+      }
+    }
+  }
+}
+```
+
+#### Client-side aggregation
+
+Bundled aggregations on an event give a snapshot of what relations were known
+at the time the event was received. When relations are received through `/sync`,
+clients should locally aggregate (as they might have done already before
+supporting this MSC) the relation on top of any bundled aggregation the server
+might have sent along previously with the target event, to get an up-to-date
+view of the aggregations for the target event. The aggregation algorithm is the
+same as the one described here for the server.
+
+### Querying relations
+
+A single event can have lots of associated relations, and we do not want to
+overload the client by, for example, including them all bundled with the
+related-to event. Instead, we also provide a new `/relations` API in
+order to paginate over the relations, which behaves in a similar way to
+`/messages`, except using `next_batch` and `prev_batch` names
+(in line with the `/sync` API). Tokens from `/sync` or `/messages` can be
+passed to `/relations` to only get relating events from a section of
+the timeline.
+
+The `/relations` API returns the discrete relation events
+associated with an event that the server is aware of
+in standard topological order. Note that events may be missing,
+see [limitations](#servers-might-not-be-aware-of-all-relations-of-an-event).
+You can optionally filter by a given relation type and the event type of the
+relating event:
+
+```
+GET /_matrix/client/v1/rooms/{roomID}/relations/{event_id}[/{rel_type}[/{event_type}]][?from=token][&to=token][&limit=amount]
+```
+
+```
+{
+  "chunk": [
+    {
+      "type": "m.reaction",
+      "sender": "...",
+      "content": {
+        "m.relates_to": {
+          "rel_type": "m.annotation",
+          ...
+        }
+      }
+    }
+  ],
+  "prev_batch": "some_token",
+  "next_batch": "some_token"
+}
+```
+
+The endpoint does not have any trailing slashes. It requires authentication
+and is not rate-limited.
+
+The `from` and `limit` query parameters are used for pagination, and work
+as described for the `/messages` endpoint.
+
+Note that [MSC2676](https://github.com/matrix-org/matrix-doc/pull/2676)
+adds the related-to event in the `original_event` property of the response.
+This way the full history (e.g. also the first, original event) of the event
+is obtained without further requests. See that MSC for further details.
+
+### End to end encryption
+
+Since the server has to be able to aggregate relation events, structural
+information about relations must be visible to the server, and so the
+`m.relates_to` field must be included in the plaintext.
+
+A future MSC may define a method for encrypting certain parts of the
+`m.relates_to` field that may contain sensitive information.
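
To make the plaintext requirement concrete, here is a minimal Python sketch of how a client might assemble the content of an encrypted relation event, leaving `m.relates_to` outside the ciphertext. The helper name `build_encrypted_content` is hypothetical, and the `ciphertext` and `session_id` values are placeholders rather than real encrypted data.

```python
def build_encrypted_content(encrypted_fields, relates_to):
    """Return event content with m.relates_to kept in plaintext.

    Hypothetical helper: `encrypted_fields` stands in for whatever the
    client's encryption layer produced for the event payload.
    """
    content = dict(encrypted_fields)
    # The relation is placed next to the ciphertext, unencrypted,
    # so that the server can see it and aggregate on it.
    content["m.relates_to"] = dict(relates_to)
    return content


content = build_encrypted_content(
    {
        "algorithm": "m.megolm.v1.aes-sha2",
        "ciphertext": "<placeholder ciphertext>",
        "session_id": "<placeholder session id>",
    },
    {"rel_type": "m.annotation", "event_id": "$target_event_id", "key": "👍"},
)
```

In this sketch the server can read `rel_type`, `event_id` and `key`, which is exactly the structural information the section says must stay visible.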
+
+### Redactions
+
+Redacted relations should not be taken into consideration in
+bundled aggregations, nor should they be returned from `/relations`.
+
+Requesting `/relations` on a redacted event should
+still return any existing relation events.
+This is in line with other APIs like `/context` and `/messages`.
+
+### Local echo
+
+For the best possible user experience, clients should also include unsent
+relations in the client-side aggregation. When adding a relation to the send
+queue, clients should locally aggregate it into the relations of the target
+event, ideally regardless of the target event having received an `event_id`
+already or still being pending. If the client gives up on sending the relation
+for some reason, the relation should be de-aggregated from the relations of
+the target event. If the client offers the user a possibility of manually
+retrying to send the relation, it should be re-aggregated when the user does so.
+
+De-aggregating a relation refers to rerunning the aggregation for a given
+target event while not considering the de-aggregated event any more.
+
+Upon receiving the remote echo for any relations, a client is likely to remove
+the pending event from the send queue. Here, it should also de-aggregate the
+pending event from the target event's relations, and re-aggregate the received
+remote event from `/sync` to make sure the client-side aggregation happens with
+the same event data as on the server.
+
+When adding a redaction for a relation to the send queue, the relation
+referred to should be de-aggregated from the relations of the target of the
+relation. Similar to a relation, when the sending of the redaction fails or
+is cancelled, the relation should be aggregated again.
+
+If the target event is still pending and hasn't received its `event_id` yet,
+clients can locally relate relation events to their target by using
+`transaction_id` like they already do for detecting remote echoes when sending
+events.
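
The local echo steps above can be sketched as a small client-side structure. This is only one possible shape, shown for annotation counts: the class `TargetRelations`, its method names, and the representation of the bundled aggregation as a key-to-count mapping are all assumptions of this example. De-aggregation is modelled as recomputing the counts without the dropped relation.

```python
class TargetRelations:
    """Annotation counts for one target event, including local echo (sketch)."""

    def __init__(self, bundled_counts=None):
        # Snapshot from the server's bundled aggregation: annotation key -> count.
        self.bundled = dict(bundled_counts or {})
        # Unsent relations in the send queue: transaction_id -> annotation key.
        self.pending = {}
        # Relations received through /sync: event_id -> annotation key.
        self.synced = {}

    def add_pending(self, txn_id, key):
        """Aggregate a relation as soon as it enters the send queue."""
        self.pending[txn_id] = key

    def drop_pending(self, txn_id):
        """De-aggregate a relation the client gave up on sending."""
        self.pending.pop(txn_id, None)

    def remote_echo(self, txn_id, event_id, key):
        """Swap the pending relation for the event data received from /sync."""
        self.pending.pop(txn_id, None)
        self.synced[event_id] = key

    def counts(self):
        """Rerun the aggregation over everything currently considered."""
        totals = dict(self.bundled)
        for key in list(self.synced.values()) + list(self.pending.values()):
            totals[key] = totals.get(key, 0) + 1
        return totals


relations = TargetRelations({"👍": 3})
relations.add_pending("txn1", "👍")
with_local_echo = relations.counts()
relations.remote_echo("txn1", "$reaction_event_id", "👍")
after_remote_echo = relations.counts()
```

Keying pending relations by `transaction_id` mirrors the last paragraph above; a retry after a failed send would simply call `add_pending` again with the same or a new transaction ID.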
+
+## Edge cases
+
+### How do you handle ignored users?
+
+ * Information about relations sent from ignored users must never be sent to
+   the client, either in aggregations or discrete relation events.
+   This is to let you block someone from harassing you with emoji reactions
+   (or using edits as a side-channel to harass you). Therefore, it is possible
+   that different users will see different aggregations (a different last edit,
+   or a different reaction count) on an event.
+
+## Limitations
+
+### Relations can be missed while not being in the room
+
+Relation events behave no differently from other events in terms of room history visibility,
+which means that some relations might not be visible to a user while they are not invited
+or have not joined the room. This can cause a user to see an incomplete edit history or reaction count
+based on discrete relation events upon (re)joining a room.
+
+Ideally the server would not include these events in aggregations, as it would mean breaking
+the room history visibility rules, but this MSC defers addressing this limitation and
+specifying the exact server behaviour to [MSC3570](https://github.com/matrix-org/matrix-doc/pull/3570).
+
+### Servers might not be aware of all relations of an event
+
+The response of `/relations` might be incomplete because the homeserver
+potentially doesn't have the full DAG of the room. The federation API doesn't
+have an equivalent of the `/relations` API, so a homeserver has no way other
+than fetching the full DAG over federation to assure itself that it is aware
+of all relations.
+
+[MSC2836](https://github.com/matrix-org/matrix-doc/pull/2836) provided a proposal for following relationships over federation in the "Querying relationships over federation" section via a `/_matrix/federation/v1/event_relationships` API.
+
+### Event type based aggregation and filtering won't work well in encrypted rooms
+
+The `/relations` endpoint allows filtering by event type,
+which for encrypted rooms will be `m.room.encrypted`, rendering this filtering
+less useful for encrypted rooms. Aggregation algorithms that take the type of
+the relating events they aggregate into account will suffer from the same
+limitation.
+
+## Future extensions
+
+### Handling limited (gappy) syncs
+
+For the special case of a gappy incremental sync, many relations (particularly
+reactions) may have occurred during the gap. It would be inefficient to send
+each one individually to the client, but it would also be inefficient to send
+all possible bundled aggregations to the client.
+
+The server could tell the client the event IDs of events which
+predate the gap which received relations during the gap. This means that the
+client could invalidate its copy of those events (if any) and then requery them
+(including their bundled relations) from the server if/when needed,
+for example using an extension of the `/event` API for batch requests.
+
+The server could do this with a new `stale_events` field of each room object
+in the sync response. The `stale_events` field would list all the event IDs
+prior to the gap which had updated relations during the gap. The event IDs
+would be grouped by relation type,
+and paginated as per the normal Matrix pagination model.
+
+This was originally part of this MSC but left out to limit the scope
+to what is implemented at the time of writing.
+
+## Prefix
+
+While this MSC is not considered stable, the endpoints become:
+
+  - `GET /_matrix/client/unstable/rooms/{roomID}/relations/{eventID}[/{relationType}[/{eventType}]]`
+
+None of the newly introduced identifiers should use a prefix though, as this MSC
+tries to document relation support already being used in
+the wider Matrix ecosystem.
\ No newline at end of file