diff --git a/drafts/websockets.rst b/drafts/websockets.rst new file mode 100644 index 000000000..bd0ff081d --- /dev/null +++ b/drafts/websockets.rst @@ -0,0 +1,285 @@ +WebSockets API +============== + +Introduction +------------ +This document is a proposal for a WebSockets-based client-server API. It is not +intended to replace the REST API, but rather to complement it and provide an +alternative interface for certain operations. + +The primary goal is to offer a more efficient interface than the REST API: by +using a bidirectional protocol such as WebSockets we can avoid the overheads +involved in long-polling (SSL negotiation, HTTP headers, etc). In doing so we +will reduce the latency between server and client by allowing the server to +send events as soon as they arrive, rather than having to wait for a poll from +the client. + +Handshake +--------- +1. Instead of calling ``/sync``, the client makes a websocket request to + ``/_matrix/client/rN/stream``, passing the query parameters ``access_token`` + and ``since``, and optionally ``filter`` - all of which have the same + meaning as for ``/sync``. + + * The client sets the ``Sec-WebSocket-Protocol`` to ``m.json``. (Servers may + offer alternative encodings; at present only the JSON encoding is + specified but in future we will specify alternative encodings.) + +#. The server returns the websocket handshake; the socket is then connected. + +If the server does not return a valid websocket handshake, this indicates that +the server or an intermediate proxy does not support WebSockets. In this case, +the client should fall back to polling the ``/sync`` REST endpoint. + +Example +~~~~~~~ + +Client request: + +.. code:: http + + GET /_matrix/client/v2_alpha/stream?access_token=123456&since=s72594_4483_1934 HTTP/1.1 + Host: matrix.org + Upgrade: websocket + Connection: Upgrade + Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw== + Sec-WebSocket-Protocol: m.json + Sec-WebSocket-Version: 13 + Origin: https://matrix.org + +Server response: + +.. code:: http + + HTTP/1.1 101 Switching Protocols + Upgrade: websocket + Connection: Upgrade + Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk= + Sec-WebSocket-Protocol: m.json + + +Update Notifications +-------------------- +Once the socket is connected, the server begins streaming updates over the +websocket. The server sends Update notifications about new messages or state +changes. To make it easy for clients to parse, Update notifications have the +same structure as the response to ``/sync``: an object with the following +members: + +============= ========== =================================================== +Key Type Description +============= ========== =================================================== +next_batch string The batch token to supply in the ``since`` param of + the next /sync request. This is not required for + streaming of events over the WebSocket, but is + provided so that clients can reconnect if the + socket is disconnected. +presence Presence The updates to the presence status of other users. +rooms Rooms Updates to rooms. +============= ========== =================================================== + +Example +~~~~~~~ +Message from the server: + +.. code:: json + + { + "next_batch": "s72595_4483_1934", + "presence": { + "events": [] + }, + "rooms": { + "join": {}, + "invite": {}, + "leave": {} + } + } + + +Client-initiated operations +--------------------------- + +The client can perform certain operations by sending a websocket message to +the server. Such a "Request" message should be a JSON-encoded object with +the following members: + +============= ========== =================================================== +Key Type Description +============= ========== =================================================== +id string A unique identifier for this request +method string Specifies the name of the operation to be + performed; see below for available operations +param object The parameters for the requested operation. +============= ========== =================================================== + +The server responds to a client Request with a Response message. This is a +JSON-encoded object with the following members: + +============= ========== =================================================== +Key Type Description +============= ========== =================================================== +id string The same as the value in the corresponding Request + object. The presence of the ``id`` field + distinguishes a Response message from an Update + notification. +result object On success, the results of the request. +error object On error, an object giving the resons for the + error. This has the same structure as the "standard + error response" for the Matrix API: an object with + the fields ``errcode`` and ``error``. +============= ========== =================================================== + +Request methods +~~~~~~~~~~~~~~~ +It is not intended that all operations which are available via the REST API +will be available via the WebSockets API, but a few simple, common operations +will be exposed. The initial operations will be as follows. + +``ping`` +^^^^^^^^ +This is a no-op which clients may use to keep their connection alive. + +The request ``params`` and the response ``result`` should be empty. + +``send`` +^^^^^^^^ +Send a message event to a room. The parameters are as follows: + +============= ========== =================================================== +Parameter Type Description +============= ========== =================================================== +room_id string **Required.** The room to send the event to +event_type string **Required.** The type of event to send. +content object **Required.** The content of the event. +============= ========== =================================================== + +The result is as follows: + +============= ========== =================================================== +Key Type Description +============= ========== =================================================== +event_id string A unique identifier for the event. +============= ========== =================================================== + +The ``id`` from the Request message is used as the transaction ID by the +server. + +``state`` +^^^^^^^^^ +Update the state on a room. + +============= ========== =================================================== +Parameter Type Description +============= ========== =================================================== +room_id string **Required.** The room to set the state in +event_type string **Required.** The type of event to send. +state_key string **Required.** The state_key for the state to send. +content object **Required.** The content of the event. +============= ========== =================================================== + +The result is as follows: + +============= ========== =================================================== +Key Type Description +============= ========== =================================================== +event_id string A unique identifier for the event. +============= ========== =================================================== + + +Example +~~~~~~~ +Client request: + +.. code:: json + + { + "id": "12345", + "method": "send", + "params": { + "room_id": "!d41d8cd:matrix.org", + "event_type": "m.room.message", + "content": { + "msgtype": "m.text", + "body": "hello" + } + } + } + +Server response: + +.. code:: json + + { + "id": "12345", + "result": { + "event_id": "$66697273743031:matrix.org" + } + } + +Alternative server response, in case of error: + +.. code:: json + + { + "id": "12345", + "error": { + "errcode": "M_MISSING_PARAM", + "error": "Missing parameter: event_type" + } + } + + +Rationale +--------- +Alternatives to WebSockets include HTTP/2, CoAP, and simply rolling our own +protocol over raw TCP sockets. However, the need to implement browser-based +clients essentially reduces our choice to WebSockets. HTTP/2 streams will +probably provide an interesting alternative in the future, but current browsers +do not appear to give javascript applications low-level access to the protocol. + +Concerning the continued use of the JSON encoding: we prefer to focus on the +transition to WebSockets initially. Replacing JSON with a compact +representation such as CBOR, MessagePack, or even just compressed JSON will be +a likely extension for the future. The support for negotiation of subprotocols +within WebSockets should make this a simple transition once time permits. + +The number of methods available for client requests is deliberately limited, as +each method requires code to be written to map it onto the equivalent REST +implementation. Some REST methods - for instance, user registration and login - +would be pointless to expose via WebSockets. It is likely, however, that we +will increate the number of methods available via the WebSockets API as it +becomes clear which would be most useful. + +Open questions +-------------- + +Throttling +~~~~~~~~~~ +At least in v2 sync, clients are inherently self-throttling - if they do not +poll quickly enough, events will be dropped from the next result. This proposal +raises the possibility that events will be produced more quickly than they can +be sent to the client; backlogs will build up on the server and/or in the +intermediate network, which will not only lead to high latency on events being +delivered, but will lead to responses to client requests also being delayed. + +We may need to implement some sort of throttling mechanism by which the server +can start to drop events. The difficulty is in knowing when to start dropping +events. A few ideas: + +* Use websocket pings to measure the RTT; if it starts to increase, start + dropping events. But this requires knowledge of the base RTT, and a useful + model of what constitutes an excessive increase. + +* Have the client acknowledge each batch of events, and use a window to ensure + the number of outstanding batches is limited. This is annoying as it requires + the client to have to acknowledge batches - and it's not clear what the right + window size is: we want a big window for long fat networks (think of mobile + clients), but a small one for one with lower latency. + +* Start dropping events if the server's TCP buffer starts filling up. This has + the advantage of delegating the congestion-detection to TCP (which already + has a number of algorithms to deal with it, to greater or lesser + effectiveness), but relies on homeservers being hosted on OSes which use + sensible TCP congestion-avoidance algorithms, and more critically, an ability + to read the fill level of the TCP send buffer.