From 46440b0e75829b5c5eb99e763d5e7de8a7a71f18 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Thu, 1 Oct 2020 23:33:17 +0100 Subject: [PATCH 1/4] On why you should never trust an event. --- proposals/2801-untrusted-event-data.md | 185 +++++++++++++++++++++++++ 1 file changed, 185 insertions(+) create mode 100644 proposals/2801-untrusted-event-data.md diff --git a/proposals/2801-untrusted-event-data.md b/proposals/2801-untrusted-event-data.md new file mode 100644 index 00000000..fcf5b958 --- /dev/null +++ b/proposals/2801-untrusted-event-data.md @@ -0,0 +1,185 @@ +# MSC2801: Make it explicit that event bodies are untrusted data + +As the Matrix Specification stands today, it is easy for client developers to +assume that all events will be structured as documented. For example, the +`displayname` key of an [`m.room.member` +event](https://matrix.org/docs/spec/client_server/r0.6.1#m-room-member) is +specified to be "`string` or `null`"; and the `info` key of an [`m.image` +message event](https://matrix.org/docs/spec/client_server/r0.6.1#m-image) must +be a Javascript object. + +In reality, these are not safe assumptions. This MSC proposes that the +specification be updated to make it clear that developers should treat all +event data as untrusted. + +## Reasons why events may not match the specification + +Firstly, let's examine the reasons why such malformed events can exist. Some of +these reasons may appear to have trivial solutions; these are discussed below. + + 1. Most obviously, Synapse as it currently stands does very little validation + of event bodies sent over the Client-Server API. There is nothing stopping + any user sending a malformed `m.room.message` event with a simple `curl` + command or Element-Web's `/devtools`. + + Any event sent in this way will be returned to the clients of all the other + users in the room. + + 2. In order to allow the Matrix protocol to be extensible, server + implementations must tolerate unknown event types, and allow them to be + passed between clients. It is obvious that a pair of custom clients + implementing a `com.example.special.event` event type cannot rely on a + standard server implementation to do any validation for them. + + However, this problem extends to event types and keys which have been added + to the specification. For example, the [`m.room.encrypted` event + type](https://matrix.org/docs/spec/client_server/r0.6.1#m-room-encrypted) + was added in Client-Server API r0.4.0. It therefore follows that a server + implementing CS-API r0.3.0 would have no way to validate an + `m.room.encrypted` event, so if a client is connected to such a server, it + could receive malformed events. + + 3. To extend from point 2, the problem is not even resolved by upgrading the + server. There may now be rooms which contain historical events which would + no longer be accepted, but these will still be served by the server. + + This problem also applies to non-room data such as account data. For + example, Client-Server API r0.6.0 added the [`m.identity_server` account + data event + type](https://matrix.org/docs/spec/client_server/r0.6.1#m-identity-server). + It is possible, if unlikely, that a client could have uploaded an + `m.identity_server` event before the administrator upgraded the server. + + 3. Event redaction removes certain keys from an event. This is a bit of a + trivial case, though it is worth noting that the rules for event redaction + vary between room versions, so it's possible to see a variety of "partial" + events. + + 4. All the cases above can occur without federation. Federation adds + additional complexities due to the structure of Matrix rooms. In + particular, a server implementation cannot simply ignore any malformed + events since such events may either be critical to the structure of the + room (for example, they may be `m.room.membership` events), or at the very + least ignoring them would leave "holes" in the event graph which would + prevent correct back-pagination. + +## Ideas for mitigating the problem + +The problems above appear to have some easy solutions. Let's describe some of +those ideas, before considering the fundamental problem that none of them can +solve. + +### Validate all events when they are submitted + +In this idea, we would require all server implementations to strictly validate +any data which is sent over the Client-Server API, to ensure that it complied +with the specified formats. + +This evidently solves problem 1 above, in that it would prevent local users from +creating malformed events of any event types that the server supports; however, +it would do nothing to address any of the other problems. + +### Validate events at the point they are served to a client + +We could require that server implementations validate any data that they are +about to serve to a client; we might recommend that malformed data be stripped +out of the response, or redacted, or similar. + +It is worth mentioning that this would be tricky to implement efficiently on +the server side, but it at least helps address most of the problems above, such +as historical data, or malformed events received over federation. + +### Have servers re-validate data on upgrade + +Similar to the previous idea, but rather than validating data each time it is +served to a client, any stored data could be re-validated to check that it +complies with new validation requirements. + +This could be more efficient in the case that changes to validation rules are +rare, but it could still be a huge amount of data processing on a large server. + +### Create new room versions which require events to be well-formed + +As outlined above, one of the big problems in this area is how we deal with +events sent over federation; in particular, if subsets of the servers in a room +have different ideas as to which events are "valid", then their concepts of the +room state can begin to drift, and the room can eventually become +"split-brained". This makes it hard to simply say, for example, +"`m.room.member` events with a non-string `displayname` are invald and should +not form part of the room state": we have a risk that some servers will accept +the event, and some will not. + +One approach to solving this is via [room +versions](https://matrix.org/docs/spec/index#room-versions). By specifying that +a change of rules only applies for a future room version, we can eliminate this +potential disagreement. + +The process of changing a room from one version to another is intrusive, not +least because it requires that all servers in a room support the new room +version (or risk being locked out). For that reason, it is extremely +undesirable that any new feature require a new room version: whenever possible, +it should be possible to use new features in existing rooms. It therefore +follows that we cannot rely on room versions to provide validation of event +data. + +### Create a single new room version which applies all event validation rules + +This idea is included for completeness, though it is unclear how it would work +in practice. + +It has been suggested that we create a new room version which explicitly states +that events which fail the current event schema, whatever that is at that +moment in time, should be rejected. + +Let's imagine that in future, the `m.room.member` event schema is extended to +include an optional `location` key, which, if given, must be a string. The +implication of this idea is that servers should reject any `m.room.member` +event whose `location` is not a string. We now have a problem: any servers in +the room which are updated to the latest spec will reject such malformed +events, but any other servers yet to be upgraded will allow it. So is that user +in the room or not? + +Even if all the servers in the room can be upgraded at once, what about any +`m.room.member` events which were sent before the rule change? + +## The fundamental problem + +The ideas above all mitigate the problems discussed earlier to a greater or +lesser extent, and may indeed be worth doing on their own merits. However, none +of them can address the problem of outdated server implementations. + +For example, consider the case of a new key being added to an event body, say +`m.relates_to`. Now, we may have decided as above that all new specced keys +must be validated by the server, so `vN+1` of Synapse dutifully implements such +validation and refuses to accept events with a malformed `m.relates_to`. + +The problem comes for users whose server is still Synapse `vN`. It knows +nothing of `m.relates_to`, so accepts and passes it through even if +malformed. The only potential solution is for clients seeking to implement +`m.relates_to` to refuse to talk to servers which do not declare support for +it. + +However, this is an entirely client-side feature: it is illogical to require +that servers must be upgraded before it can be used. Consider that the hosted +element-web at `https://app.element.io` is upgraded to support the new feature; +in this scenario, that would lock out any user whose homeserver had not yet +been upgraded. This is not an acceptable user experience. + +In short, we are left with the reality that clients must still handle the +unvalidated data. + +## Conclusions + +Short of closely coupling server and client versions - which violates the +fundamental ethos of the Matrix project - there is nothing that can completely +prevent clients from having to handle untrusted data. + +With that in mind, the advantages of the ideas above are diminished. If clients +must handle untrusted data in some circumstances, why not in all? "You can +trust the content of this data structure, provided you have checked that the +server knows how to validate it, in which case you need to treat it as +untrusted" is not a useful message for a client developer. + +It may be possible to assert that specific, known cases can be treated as +trusted data, but these should be called out as specific cases. The default +message should be that clients must treat all event data as untrusted. From 12da19f927fdcc872f6e6fb985c1b7bb490d34d0 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Fri, 2 Oct 2020 09:26:59 +0100 Subject: [PATCH 2/4] Words on e2ee events --- proposals/2801-untrusted-event-data.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/proposals/2801-untrusted-event-data.md b/proposals/2801-untrusted-event-data.md index fcf5b958..4a95170b 100644 --- a/proposals/2801-untrusted-event-data.md +++ b/proposals/2801-untrusted-event-data.md @@ -25,7 +25,11 @@ these reasons may appear to have trivial solutions; these are discussed below. Any event sent in this way will be returned to the clients of all the other users in the room. - 2. In order to allow the Matrix protocol to be extensible, server + 2. Events may be encrypted. Any client implementing E2EE must be prepared to + deal with any encrypted content, since by definitioon a server cannot + validate it. + + 3. In order to allow the Matrix protocol to be extensible, server implementations must tolerate unknown event types, and allow them to be passed between clients. It is obvious that a pair of custom clients implementing a `com.example.special.event` event type cannot rely on a @@ -39,7 +43,7 @@ these reasons may appear to have trivial solutions; these are discussed below. `m.room.encrypted` event, so if a client is connected to such a server, it could receive malformed events. - 3. To extend from point 2, the problem is not even resolved by upgrading the + 4. To extend from point 3, the problem is not even resolved by upgrading the server. There may now be rooms which contain historical events which would no longer be accepted, but these will still be served by the server. @@ -50,12 +54,12 @@ these reasons may appear to have trivial solutions; these are discussed below. It is possible, if unlikely, that a client could have uploaded an `m.identity_server` event before the administrator upgraded the server. - 3. Event redaction removes certain keys from an event. This is a bit of a + 5. Event redaction removes certain keys from an event. This is a bit of a trivial case, though it is worth noting that the rules for event redaction vary between room versions, so it's possible to see a variety of "partial" events. - 4. All the cases above can occur without federation. Federation adds + 6. All the cases above can occur without federation. Federation adds additional complexities due to the structure of Matrix rooms. In particular, a server implementation cannot simply ignore any malformed events since such events may either be critical to the structure of the @@ -172,7 +176,8 @@ unvalidated data. Short of closely coupling server and client versions - which violates the fundamental ethos of the Matrix project - there is nothing that can completely -prevent clients from having to handle untrusted data. +prevent clients from having to handle untrusted data. In addition, encrypted +events eliminate any possibility of server-side validation. With that in mind, the advantages of the ideas above are diminished. If clients must handle untrusted data in some circumstances, why not in all? "You can From 5db56391f30621db3a7e36824417fc57ed5f169f Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Thu, 29 Oct 2020 12:21:46 +0000 Subject: [PATCH 3/4] spelling --- proposals/2801-untrusted-event-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2801-untrusted-event-data.md b/proposals/2801-untrusted-event-data.md index 4a95170b..31e6d539 100644 --- a/proposals/2801-untrusted-event-data.md +++ b/proposals/2801-untrusted-event-data.md @@ -26,7 +26,7 @@ these reasons may appear to have trivial solutions; these are discussed below. users in the room. 2. Events may be encrypted. Any client implementing E2EE must be prepared to - deal with any encrypted content, since by definitioon a server cannot + deal with any encrypted content, since by definition a server cannot validate it. 3. In order to allow the Matrix protocol to be extensible, server From 52dfaaabd416f8fb9ca98361a31aa74cf7abf76c Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Tue, 23 Mar 2021 11:40:56 +0000 Subject: [PATCH 4/4] Update proposals/2801-untrusted-event-data.md Co-authored-by: Jonathan de Jong --- proposals/2801-untrusted-event-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2801-untrusted-event-data.md b/proposals/2801-untrusted-event-data.md index 31e6d539..2b0d16aa 100644 --- a/proposals/2801-untrusted-event-data.md +++ b/proposals/2801-untrusted-event-data.md @@ -109,7 +109,7 @@ events sent over federation; in particular, if subsets of the servers in a room have different ideas as to which events are "valid", then their concepts of the room state can begin to drift, and the room can eventually become "split-brained". This makes it hard to simply say, for example, -"`m.room.member` events with a non-string `displayname` are invald and should +"`m.room.member` events with a non-string `displayname` are invalid and should not form part of the room state": we have a risk that some servers will accept the event, and some will not.