diff --git a/proposals/3819-to-device-messages-for-widgets.md b/proposals/3819-to-device-messages-for-widgets.md new file mode 100644 index 000000000..c4cc5f28e --- /dev/null +++ b/proposals/3819-to-device-messages-for-widgets.md @@ -0,0 +1,247 @@ +# MSC3819: Allowing widgets to send/receive to-device messages + +Widgets (embedded HTML applications in Matrix) currently have a relatively large surface area +they can use for interacting with their attached client, primarily in the context of a room. They +can [send/receive events with MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762), +[navigate to rooms with MSC2931](https://github.com/matrix-org/matrix-spec-proposals/pull/2931), +and even [open dialogs with MSC2790](https://github.com/matrix-org/matrix-spec-proposals/pull/2790), +but they can't act as a whole other Matrix client just yet. + +This MSC forms part of a larger, ongoing, question about how to embed other Matrix clients into another +client or room for access. An increasingly more popular client development option is to build out an +entirely new Matrix client and want to embed that within another client (as a widget) to avoid the +user needing to switch apps. To support this, we need to consider both long term and short term impact +of the changes we propose. This MSC aims closer to the short term. + +A longer term solution to the problem of clients wanting to be embedded in other clients might still +be widgets, though with a system like [MSC3008](https://github.com/matrix-org/matrix-spec-proposals/pull/3008) +to restrict access to the client-server API more effectively. For this MSC's purpose though, we're +aiming to cover a specific subset of the client-server API: to-device messages. + +While we could expose the entire client-server API over `postMessage` (or similar) for embedded +clients to access, the permissions model gets hairy and difficult to secure on the client side. Instead, +we're exploring what it would look like to special case what is needed for specific applications, as +needed, starting with to-device messages. + +To-device messaging is described [here](https://spec.matrix.org/v1.2/client-server-api/#send-to-device-messaging) +with practical applications for widget-ized clients being implementations of +[MSC3401 - Native group VoIP](https://github.com/matrix-org/matrix-spec-proposals/pull/3401) for now. + +## Prerequisite background + +*Author's note: This is copied from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762).* + +Widgets are relatively new to Matrix and so the terminology and behaviour might not be known to all +readers. This section should clarify the components of widgets that are applicable to this MSC without +going on a deep dive into widgets in general. + +Widgets are embedded HTML/JS/CSS applications in a client which use the `postMessage` API to talk +to the client. This communication allows widgets to provide enhanced functionality such as sticker +pickers (when applied to a user) or performance dashboards (in rooms). + +One of the first things that happens over this communication channel is a "capabilities negotiation" +where the client asks the widget what permissions it wants, and the widget replies with its ideal +set. The client then either decides or asks the user if the permissions requested are okay. + +All communication over the channel is done in a simple request/response flow, using actions to +describe the request. For the capabilities negotiation, this would be the client sending the widget +a request with an `action` of `capabilities`, and the widget would respond to that request with a +response object. + +The channel in which communication occurs is called a "session", where the session is "established" +after the capabilities negotiation. Sessions can only be terminated by the client. + +The Widget API is split into two parts: `toWidget` (client->widget) and `fromWidget` (widget->client). +They are differentiated by where the request originates. + +## Proposal + +Inspired heavily from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762), we +introduce new capabilities for to-device messages: + +* `m.send.to_device:` (eg: `m.send.to_device:m.call.invite`) - Used for sending to-device + messages of a given type. +* `m.receive.to_device:` (eg: `m.receive.to_device:m.call.invite`) - Used for receiving + to-device messages of a given type. + +These capabilities open up access to the following respective actions, when approved: + +**`fromWidget` action of `send_to_device`** + +```json5 +{ + // This is a standardized widget API request. + "api": "fromWidget", + "widgetId": "20200827_WidgetExample", + "requestid": "generated-id-1234", + "action": "send_to_device", // value defined by this proposal + "data": { + // Same structure as the `/sendToDevice` HTTP API request body + "@target:example.org": { + "DEVICEID": { // can also be a `*` to denote "all of the user's devices" + "example_content": "put your real message here" + } + } + } +} +``` + +The client upon receipt of this will validate that the widget has an appropriate capability to send +the to-device message. If the widget is approved for such a capability, the client **MUST** encrypt +the message by default unless the event is already encrypted by the widget (this MSC doesn't provide +enough API surface for a widget to do this, but in future it might be possible for the widget to +gain some context of the encryption state for the client and use that to make/manage Olm sessions). +The encrypted message is then sent as requested to the users/devices using +[`/sendToDevice`](https://spec.matrix.org/v1.2/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid). + +If the widget doesn't have appropriate permission, or an error occurs anywhere along the send path, +a standardized widget error response is returned. + +Under the widget API, a response to all actions is required and takes the shape of repeating the +request with an added top-level `response` field. This `response` field is empty for this action, +as shown: + +```json5 +{ + "api": "fromWidget", + "widgetId": "20200827_WidgetExample", + "requestid": "generated-id-1234", + "action": "send_to_device", + "data": { + "@target:example.org": { + "DEVICEID": { + "example_content": "put your real message here" + } + } + }, + "response": {} +} +``` + +The client *should not* send a response to the action until the server has returned 200 OK itself, +which might take longer than the default widget API timeout of 10 seconds. Widgets should raise their +maximum timeout to 60 seconds or more for this action. + +**`toWidget` action of `send_to_device`** + +*Note*: It is common practice to name the action in favour of the direction of travel rather than try +and determine an alternative name. This does mean that there are two `send_to_device` actions: one +for widget->client and one for client->widget. This section is talking about client->widget. + +After the client has decrypted all to-device messages it receives, it determines if any widgets should +be made aware of the contents within. The decrypted event type for the message is used to determine +if the widget has appropriate capability to see the message. + +The client should process all to-device messages it can before sending them off to the widget. Even if +the client does process a message though, it should still send it to the appropriate widgets for +potential re-processing. This is to avoid a scenario where the host client can no longer reliably +function, such as if Olm sessions get corrupted or similar. + +The client should be aware that to-device messages might be seen which the client *could* handle, but +might not have context on, such as VoIP signaling. The client should not error out if it can't locate +a matching call, for example. + +The client SHOULD only send events which were received by the client *after* the session has been +established with the widget (after the widget's capabilities are negotiated). + +The request itself looks as follows: + +```json5 +{ + // This is a standardized widget API request. + "api": "toWidget", // note that we're sending *to* the widget here + "widgetId": "20200827_WidgetExample", + "requestid": "generated-id-1234", + "action": "send_to_device", // value defined by this proposal + "data": { + "type": "m.call.invite", + "sender": "@source:example.org", + "content": { + // ... as required for the event schema + } + } +} +``` + +Note that the action only supports a single to-device message at a time. This is for symmetry with +[MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762). + +Under the widget API, a response is required from the widget. The widget simply acknowledges the request +with an empty response object: + +```json5 +{ + "api": "toWidget", + "widgetId": "20200827_WidgetExample", + "requestid": "generated-id-1234", + "action": "send_to_device", + "data": { + "type": "m.call.invite", + "sender": "@source:example.org", + "content": { + // ... as required for the event schema + } + }, + "response": {} +} +``` + +## Potential issues + +Due to lack of documentation/spec, conventions for the widget API and its security principles could +be misunderstood or confusing. This MSC attempts to overly describe these cases where they are at +risk of being a potential misunderstanding, however readers of the proposal are still encouraged to +gather as much information as they can before reviewing this proposal. + +This MSC further pushes forward an idea that the `postMessage` transport for the widget API is the +way to go, however MSCs like [MSC3009](https://github.com/matrix-org/matrix-spec-proposals/pull/3009) +explore what it could mean to have a different transport mechanism. This MSC is not tied directly +to `postMessage` and is instead describing the request/responses used over the widget API - whatever +transport that might be. + +## Alternatives + +As discussed in the introduction of this proposal and on other MSCs, we could expose the client-server +API more generically to the widget. This causes issues where the client is either forced to parse +requests like a webserver would to validate that the widget is allowed to make the request, or +require such a generic capability that widgets would excessively request full read/write access +from the user without consideration for the impact that might have. As such, we continue to describe +special-cased actions for the widget API on a case-by-case basis. + +On other related proposals there's discussion about how a bot could achieve the same function as +the proposal. While also partially true here, the intent is not to have a game or similar publishing +events into a room but rather to have a second Matrix client (for all intents and purposes) embedded +either as a room widget or account widget. A bot precludes the second client from acting on behalf +of the user who has it open. + +## Security considerations + +Because the widget can implicitly decrypt events, it is absolutely imperative that clients +prompt for permission to use these capabilities even though the capabilities negotation does not +require this to be done. Strictly speaking, clients which do not prompt for confirmation from the +user are frowned upon, however given the intended usecase of VoIP signaling it is reasonable to +auto-approve some capabilities if the client can verifiably trust the widget is running safe code. +In general, verifiable trust only comes from the client locking widgets down to specific domains +or rewriting the widget URL before rendering to something the client controls. + +This MSC allows widgets to access sensitive parts of the client-server API, and the encryption +module specifically. If granted permission, a widget could feasibly harvest decryption keys *in clear +text*. It is strongly encouraged that clients do not auto-approve capabilities for key exchanges +or similar. In fact, it might even be reasonable for the client to auto-deny instead. + +This MSC allows a room widget to act at the account level rather than the traditional room level. +Normally these events would be scoped to the currently active room, however to-device messages are +not tied to a room. Therefore, the events are exposed as-is to the widget and can be interacted with +as such. + +## Unstable prefix + +While this MSC is not present in the spec, clients and widgets should: + +* Use `org.matrix.msc3819.` in place of `m.` in all new identifiers of this MSC. +* Only call/support the `action`s if a widget API version of `org.matrix.msc3819` is advertised. + +## Dependencies + +None applicable - this MSC's dependencies have either been approved or are used simply as reference +material. In practice, widgets should probably be formally in the spec before this MSC gets included.