Merge branch 'rav/e2e_guide'

Land the E2E implementation guide
9 years ago · 66a50855ff
parent 2ec43a5948 e77dc0bd4c
commit 66a50855ff
1 changed files with 658 additions and 0 deletions
--- a/supporting-docs/guides/2016-10-18-e2e_implementation.rst
+++ b/supporting-docs/guides/2016-10-18-e2e_implementation.rst
@ -0,0 +1,658 @@
+Implementing End-to-End Encryption in Matrix clients
+====================================================
+
+This guide is intended for authors of Matrix clients who wish to add
+support for end-to-end encryption. It is highly recommended that readers
+be familiar with the Matrix protocol and the use of access tokens before
+proceeding.
+
+The libolm library
+------------------
+
+End-to-end encryption in Matrix is based on the Olm and Megolm
+cryptographic ratchets. The recommended starting point for any client
+authors is with the `libolm <http://matrix.org/git/olm>`__ library,
+which contains implementations of all of the cryptographic primitives
+required. The library itself is written in C/C++, but is architected in
+a way which makes it easy to write wrappers for higher-level languages.
+
+Devices
+-------
+
+We have a particular meaning for “device”. As a user, I might have
+several devices (a desktop client, some web browsers, an Android device,
+an iPhone, etc). When I first use a client, it should register itself as
+a new device. If I log out and log in again as a different user, the
+client must register as a new device. Critically, the client must create
+a new set of keys (see below) for each “device”.
+
+The longevity of devices will depend on the client. In the web client,
+we create a new device every single time you log in. In a mobile client,
+it might be acceptable to reuse the device if a login session expires,
+**provided** the user is the same. **Never** share keys between
+different users.
+
+Devices are identified by their ``device_id`` (which is unique within
+the scope of a given user). By default, the ``/login`` and ``/register``
+endpoints will auto-generate a ``device_id`` and return it in the
+response; a client is also free to generate its own ``device_id`` or, as
+above, reuse a device, in which case the client should pass the
+``device_id`` in the request body.
+
+The lifetime of devices and ``access_token``\ s (technically: chains of
+``refresh_token``\ s and ``access_token``\ s), are closely related. In
+the simple case where a new device is created each time you log in,
+there is a one-to-one mapping between a ``device_id`` and an
+``access_token`` chain. If a client reuses a ``device_id`` when logging
+in, there will be several ``access_token`` chains associated with a
+given ``device_id`` - but still, we would expect only one of these to be
+active at once (though we do not currently enforce that in Synapse).
+
+Keys used in End-to-End encryption
+----------------------------------
+
+There are a number of keys involved in encrypted communication: a
+summary of them follows.
+
+Ed25519 fingerprint key pair
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Ed25519 is a public-key cryptographic system for signing messages. In
+Matrix, each device has an Ed25519 key pair which serves to identify
+that device. The private part of the key pair should never leave the
+device, but the public part is published to the Matrix network.
+
+Curve25519 identity key pair
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Curve25519 is a public-key cryptographic system which can be used to
+establish a shared secret. In Matrix, each device has a long-lived
+Curve25519 identity key which is used to establish Olm sessions with
+that device. Again, the private key should never leave the device, but
+the public part is signed with the Ed25519 fingerprint key and published
+to the network.
+
+Theoretically we should rotate the Curve25519 identity key from time to
+time, but we haven't implemented this yet.
+
+Curve25519 one-time keys
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+As well as the identity key, each device creates a number of Curve25519
+key pairs which are also used to establish Olm sessions, but can only be
+used once. Once again, the private part remains on the device.
+
+At startup, Alice creates a number of one-time key pairs, and publishes
+them to her homeserver. If Bob wants to establish an Olm session with
+Alice, he needs to claim one of Alice’s one-time keys, and creates a new
+one of his own. Those two keys, along with Alice’s and Bob’s identity
+keys, are used in establishing an Olm session between Alice and Bob.
+
+Megolm encryption keys
+~~~~~~~~~~~~~~~~~~~~~~
+
+The Megolm key is used to encrypt group messages (in fact it is used to
+derive an AES-256 key, and an HMAC-SHA-256 key). It is initialised with
+random data. Each time a message is sent, a hash calculation is done on
+the Megolm key to derive the key for the next message. It is therefore
+possible to share the current state of the Megolm key with a user,
+allowing them to decrypt future messages but not past messages.
+
+Ed25519 Megolm signing key pair
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When a sender creates a Megolm session, he also creates another Ed25519
+signing key pair. This is used to sign messages sent via that Megolm
+session, to authenticate the sender. Once again, the private part of the
+key remains on the device. The public part is shared with other devices
+in the room alongside the encryption key.
+
+Creating and registering device keys
+------------------------------------
+
+This process only happens once, when a device first starts.
+
+It must create the Ed25519 fingerprint key pair and the Curve25519
+identity key pair. This is done by calling ``olm_create_account`` in
+libolm. The (base64-encoded) keys are retrieved by calling
+``olm_account_identity_keys``. The account should be stored for future
+use.
+
+It should then publish these keys to the homeserver. To do this, it
+should construct a JSON object as follows:
+
+.. code:: json
+
+  {
+    "algorithms": ["m.olm.v1.curve25519-aes-sha2", "m.megolm.v1.aes-sha2"],
+    "device_id": "<deviceId>",
+    "keys": {
+      "curve25519:<deviceId>": "<curve25519_key>",
+      "ed25519:<deviceId>": "<ed25519_key>"
+    },
+    "user_id: <userId>"
+  }
+
+The object should be formatted as `Canonical
+JSON <http://matrix.org/docs/spec/server_server/unstable.html#canonical-json>`__,
+then signed with ``olm_account_sign``; the signature should be added to
+the JSON as ``signatures.<userId>.ed25519:<deviceId>``.
+
+The signed JSON is then uploaded via
+``POST /_matrix/client/unstable/keys/upload``.
+
+Creating and registering one-time keys
+--------------------------------------
+
+At first start, and at regular intervals
+thereafter\ [#]_, the client should check how
+many one-time keys the homeserver has stored for it, and, if necessary,
+generate and upload some more.
+
+.. [#] Every 10 minutes is suggested.
+
+The number of one-time keys currently stored is returned by
+``POST /_matrix/client/unstable/keys/upload``. (Post an empty JSON object
+``{}`` if you don’t want to upload the device keys.)
+
+The maximum number of active keys supported by libolm is returned by
+``olm_account_max_number_of_one_time_keys``. The client should try to
+maintain about half this number on the homeserver.
+
+To generate new one-time keys:
+
+* Call ``olm_account_generate_one_time_keys`` to generate new keys
+
+* Call ``olm_account_one_time_keys`` to retrieve the unpublished keys. This
+  returns a JSON-formatted object with the single property ``curve25519``,
+  which is itself an object mapping key id to base64-encoded Curve25519
+  key. For example:
+
+  .. code:: json
+
+    {
+      "curve25519": {
+        "AAAAAA": "wo76WcYtb0Vk/pBOdmduiGJ0wIEjW4IBMbbQn7aSnTo",
+        "AAAAAB": "LRvjo46L1X2vx69sS9QNFD29HWulxrmW11Up5AfAjgU"
+      }
+    }
+
+* Construct a JSON object as follows:
+
+  .. code:: json
+
+    {
+      "one_time_keys": {
+        "curve25519:<keyId>": "<curve25519_key>",
+        ...
+      }
+    }
+
+* Upload the object via ``POST /_matrix/client/unstable/keys/upload``. (Unlike
+  the device keys, the one-time keys are **not** signed.
+
+* Call ``olm_account_mark_keys_as_published`` to tell the olm library not to
+  return the same keys from a future call to ``olm_account_one_time_keys``\.
+
+Configuring a room to use encryption
+------------------------------------
+
+To enable encryption in a room, a client should send a state event of
+type ``m.room.encryption``, and content ``{ "algorithm":
+"m.megolm.v1.aes-sha2" }``.
+
+Handling an ``m.room.encryption`` state event
+---------------------------------------------
+
+When a client receives an ``m.room.encryption`` event as above, it
+should set a flag to indicate that messages sent in the room should be
+encrypted.
+
+This flag should **not** be cleared if a later ``m.room.encryption``
+event changes the configuration. This is to avoid a situation where a
+MITM can simply ask participants to disable encryption. In short: once
+encryption is enabled in a room, it can never be disabled.
+
+Handling an ``m.room.encrypted`` event
+--------------------------------------
+
+Encrypted events have a type of ``m.room.encrypted``. They have a
+content property ``algorithm`` which gives the encryption algorithm in
+use, as well as other properties specific to the algorithm.
+
+The encrypted payload is a JSON object with the properties ``type``
+(giving the decrypted event type), and ``content`` (giving the decrypted
+content). Depending on the algorithm in use, the payload may contain
+additional keys.
+
+There are currently two defined algorithms:
+
+``m.olm.v1.curve25519-aes-sha2``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encrypted events using this algorithm should have a ``sender_key`` and a
+``ciphertext`` property.
+
+The ``sender_key`` property of the event content gives the Curve25519
+identity key of the sender. Clients should maintain a list of known Olm
+sessions for each device they speak to; it is recommended to index them
+by Curve25519 identity key.
+
+Olm messages are encrypted separately for each recipient device.
+``ciphertext`` is an object mapping from the Curve25519 identity key for
+the recipient device. The receiving client should, of course, look for
+its own identity key in this object. (If it isn't listed, the message
+wasn't sent for it, and the client can't decrypt it; it should show an
+error instead, or similar).
+
+This should result in an object with the properties ``type`` and
+``body``. Messages of type '0' are 'prekey' messages which are used to
+establish a new Olm session between two devices; type '1' are normal
+messages which are used once a message has been received on the session.
+
+When a message (of either type) is received, a client should first
+attempt to decrypt it with each of the known sessions for that sender.
+There are two steps to this:
+
+-  If (and only if) ``type==0``, the client should call
+   ``olm_matches_inbound_session`` with the session and ``body``. This
+   returns a flag indicating whether the message was encrypted using
+   that session.
+
+-  The client calls ``olm_decrypt``, with the session, ``type``, and
+   ``body``. If this is successful, it returns the plaintext of the
+   event.
+
+If the client was unable to decrypt the message using any known sessions
+(or if there are no known sessions yet), **and** the message had type 0,
+**and** ``olm_matches_inbound_session`` wasn't true for any existing
+sessions, then the client can try establishing a new session. This is
+done as follows:
+
+-  Call ``olm_create_inbound_session_from`` using the olm account, and
+   the ``sender_key`` and ``body`` of the message.
+
+-  If the session was established successfully:
+
+   -  call ``olm_remove_one_time_keys`` to ensure that the same
+      one-time-key cannot be reused.
+
+   -  Call ``olm_decrypt`` with the new session
+
+   -  Store the session for future use
+
+At the end of this, the client will hopefully have successfully
+decrypted the payload.
+
+As well as the ``type`` and ``content`` properties, the payload should
+contain a ``keys`` property, which should be an object with a property
+ed25519. The client should check that the value of this property matches
+the sender's fingerprint key when `marking the event as verified`_ [#]_.
+
+.. [#] This prevents an attacker publishing someone else's curve25519 keys as
+   their own and subsequently claiming to have sent messages which they didn't
+   (see
+   https://github.com/vector-im/vector-web/issues/2215#issuecomment-247630155).
+
+
+``m.megolm.v1.aes-sha2``
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encrypted events using this algorithm should have ``sender_key``,
+``session_id`` and ``ciphertext`` content properties. If the
+``room_id``, ``sender_key`` and ``session_id`` correspond to a known
+Megolm session (see `below`__), the ciphertext can be
+decrypted by passing the ciphertext into ``olm_group_decrypt``.
+
+__ `m.room_key`_
+
+The client should check that the sender's fingerprint key matches the
+``keys.ed25519`` property of the event which established the Megolm session
+when `marking the event as verified`_.
+
+.. _`m.room_key`:
+
+Handling an ``m.room_key`` event
+--------------------------------
+
+These events contain key data to allow decryption of other messages.
+They are sent to specific devices, so they appear in the ``to_device``
+section of the response to ``GET /_matrix/client/r0/sync``. They will
+also be encrypted, so will need decrypting as above before they can be
+seen.
+
+The event content will contain an 'algorithm' property, indicating the
+encryption algorithm the key data is to be used for. Currently, this
+will always be ``m.megolm.v1.aes-sha2``.
+
+Room key events for Megolm will also have ``room_id``, ``session_id``, and
+``session_key`` keys. They are used to establish a Megolm session.  The
+``room_id`` identifies which room the session will be used in. The ``room_id``,
+together with the ``sender_key`` of the ``room_key`` event before it was
+decrypted, and the ``session_id``, uniquely identify a Megolm session. If they
+do not represent a known session, the client should start a new inbound Megolm
+session by calling ``olm_init_inbound_group_session`` with the ``session_key``.
+
+The client should remember the value of the keys property of the payload
+of the encrypted ``m.room_key`` event and store it with the inbound
+session. This is used as above when marking the event as verified.
+
+.. _`download the device list`:
+
+Downloading the device list for users in the room
+-------------------------------------------------
+
+Before an encrypted message can be sent, it is necessary to retrieve the
+list of devices for each user in the room. This can be done proactively,
+or deferred until the first message is sent. The information is also
+required to allow users to `verify or block devices`__.
+
+__ `blocking`_
+
+The client should build a JSON query object as follows:
+
+.. code:: json
+
+  {
+    "<user_id>": {},
+    ...
+  }
+
+Each member in the room should be included in the query. This is then
+sent via ``POST /_matrix/client/unstable/keys/query.``
+
+The result includes, for each listed user id, a map from device ID to an
+object containing information on the device, as follows:
+
+.. code:: json
+
+  {
+    "algorithms": [...],
+    "device_id": "<deviceId>",
+    "keys": {
+      "curve25519:<deviceId>": "<curve25519\_key>",
+      "ed25519:<deviceId>": "<ed25519\_key>"
+    },
+    "signatures": {
+      "<userId>": {
+        "ed25519:<deviceId>": "<signature>"
+      },
+    },
+    "unsigned": {
+      "device_display_name": "<display name>"
+    },
+    "user_id: <userId>"
+  }
+
+The client should first check the signature on this object. To do this,
+it should remove the ``signatures`` and ``unsigned`` properties, format
+the remainder as Canonical JSON, and pass the result into
+``olm_ed25519_verify``, using the Ed25519 key for the ``key`` parameter,
+and the corresponding signature for the ``signature`` parameter. If the
+signature check fails, no further processing should be done on the
+device.
+
+The client should check if the ``user_id``/``device_ie`` correspond to a device
+it had seen previously. If it did, the client **must** check that the Ed25519
+key hasn't changed. Again, if it has changed, no further processing should be
+done on the device.
+
+Otherwise the client stores the information about this device.
+
+Sending an encrypted event
+--------------------------
+
+When sending a message in a room `configured to use
+encryption`__, a client first checks to see if it has
+an active outbound Megolm session. If not, it first `creates one
+as per below`__.
+
+__ `Configuring a room to use encryption`_
+__ `Starting a Megolm session`_
+
+It then builds an encryption payload as follows:
+
+.. code:: json
+
+  {
+    "type": "<event type>",
+    "content": "<event content>",
+    "room_id": "<id of destination room>"
+  }
+
+and calls ``olm_group_encrypt`` to encrypt the payload. This is then packaged
+into event content as follows:
+
+.. code:: json
+
+  {
+    "algorithm": "m.megolm.v1.aes-sha2",
+    "sender_key": "<our curve25519 device key>",
+    "ciphertext": "<encrypted payload>",
+    "session_id": "<outbound group session id>",
+    "device_id": "<our device ID>"
+  }
+
+Finally, the encrypted event is sent to the room with ``POST
+/_matrix/client/r0/rooms/<room_id>/send/m.room.encrypted/<txn_id>``.
+
+Starting a Megolm session
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When a message is first sent in an encrypted room, the client should
+start a new outbound Megolm session. This should **not** be done
+proactively, to avoid proliferation of unnecessary Megolm sessions.
+
+To create the session, the client should call
+``olm_init_outbound_group_session``, and store the details of the
+outbound session for future use.
+
+The client should then call ``olm_outbound_group_session_id`` to get the
+unique ID of the new session, and ``olm_outbound_group_session_key`` to
+retrieve the current ratchet key and index. It should store these
+details as an inbound session, just as it would when `receiving them via
+an m.room_key event`__.
+
+__ `m.room_key`_
+
+The client must then share the keys for this session with each device in the
+room. It must therefore `download the device list`_ if it hasn't already done
+so, and for each device in the room which has not been `blocked`__, the client
+should:
+
+__ `blocking`_
+
+* Build a content object as follows:
+
+  .. code:: json
+
+    {
+      "algorithm": "m.megolm.v1.aes-sha2",
+      "room_id": "<id of destination room>",
+      "session_id": "<session id>",
+      "session_key": "<session_key>"
+    }
+
+-  Encrypt the content as an ``m.room_key`` event using Olm, as below.
+
+Once all of the key-sharing event contents have been assembled, the
+events should be sent to the corresponding devices via
+``PUT /_matrix/client/unstable/sendToDevice/m.room.encrypted/<txnId>``.
+
+Encrypting an event with Olm
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Olm is not used for encrypting room events, as it requires a separate
+copy of the ciphertext for each device, and because the receiving device
+can only decrypt received messages once. However, it is used for
+encrypting key-sharing events for Megolm.
+
+When encrypting an event using Olm, the client should:
+
+-  Build an encryption payload as follows:
+
+   .. code:: json
+
+     {
+       "type": "<event type>",
+       "content": "<event content>",
+       "sender_device": "<our device ID>",
+       "keys": {
+         "ed25519": "<our ed25519 fingerprint key>"
+       }
+     }
+
+-  Check if it has an existing Olm session; if it does not, `start a new
+   one`__. If it has several (as may happen due to
+   races when establishing sessions), it should use the one with the
+   first session_id when sorted by their ASCII codepoints (ie, 'A'
+   would be before 'Z', which would be before 'a').
+
+   __ `Starting an Olm session`_
+
+-  Encrypt the payload by calling ``olm_encrypt``.
+
+-  Package the payload into event content as follows:
+
+   .. code:: json
+
+     {
+       "algorithm": "m.olm.v1.curve25519-aes-sha2",
+       "sender_key": "<our curve25519 identity key>",
+       "ciphertext": "<encrypted payload>"
+     }
+
+Starting an Olm session
+~~~~~~~~~~~~~~~~~~~~~~~
+
+To start a new Olm session with another device, a client must first
+claim one of the other device's one-time keys. To do this, it should
+create a query object as follows:
+
+.. code:: json
+
+  {
+    "<user id>": {
+      "<device_id>": "curve25519",
+      ...
+    },
+    ...
+  }
+
+and send this via ``POST /_matrix/client/unstable/keys/claim``. Claims
+for multiple devices should be aggregated into a single request.
+
+This will return a result as follows:
+
+.. code:: json
+
+  {
+    "<user id>": {
+      "<device_id>": {
+        "curve25519:<key_id>": "<one-time key>"
+      },
+      ...
+    },
+    ...
+  }
+
+The client should then pass this key, along with the Curve25519 Identity
+key for the remote device, into ``olm_create_outbound_session``.
+
+Handling membership changes
+---------------------------
+
+The client should monitor rooms which are configured to use encryption for
+membership changes.
+
+When a member leaves a room, the client should invalidate any active outbound
+Megolm session, to ensure that a new session is used next time the user sends a
+message.
+
+When a new member joins a room, the client should first `download the device
+list`_ for the new member, if it doesn't already have it.
+
+After giving the user an opportunity to `block`__ any suspicious devices, the
+client should share the keys for the outbound Megolm session with all the new
+member's devices. This is done in the same way as `creating a new session`__,
+except that there is no need to start a new Megolm session: due to the design
+of the Megolm ratchet, the new user will only be able to decrypt messages
+starting from the current state. The recommended method is to maintain a list
+of members who are waiting for the session keys, and share them when the user
+next sends a message.
+
+__ `blocking`_
+__ `Starting a Megolm session`_
+
+Sending New Device announcements
+--------------------------------
+
+When a user logs in on a new device, it is necessary to make sure that
+other devices in any rooms with encryption enabled are aware of the new
+device. This is done as follows.
+
+Once the initial call to the ``/sync`` API completes, the client should
+iterate through each room where encryption is enabled. For each user
+(including the client's own user), it should build a content object as
+follows:
+
+.. code:: json
+
+  {
+    "device_id": "<our device ID>",
+    "rooms": ["<shared room id 1>", "<room id 2>", ... ]
+  }
+
+Once all of these have been constructed, they should be sent to all of the
+relevant user's devices (using the wildcard ``*`` in place of the
+``device_id``) via ``PUT
+/_matrix/client/unstable/sendToDevice/m.new_device/<txnId>.``
+
+Handling an m.new_device event
+-------------------------------
+
+As with ``m.room_key`` events, these will appear in the ``to_device``
+section of the ``/sync`` response.
+
+The client should `download the device list`_ of the sender, to get the details
+of the new device.
+
+The event content will contain a ``rooms`` property, as well as the
+``device_id`` of the new device. For each room in the list, the client
+should check if encryption is enabled, and if the sender of the event is
+a member of that room. If so, the client should share the keys for the
+outbound Megolm session with the new device, in the same way as
+`handling a new user in the room`__.
+
+__ `Handling membership changes`_
+
+.. _`blocking`:
+
+Blocking / Verifying devices
+----------------------------
+
+It should be possible for a user to mark each device belonging to
+another user as 'Blocked' or 'Verified'.
+
+When a user chooses to block a device, this means that no further
+encrypted messages should be shared with that device. In short, it
+should be excluded when sharing room keys when `starting a new Megolm
+session <#_p5d1esx6gkrc>`__. Any active outbound Megolm sessions whose
+keys have been shared with the device should also be invalidated so that
+no further messages are sent over them.
+
+Verifying a device involves ensuring that the device belongs to the
+claimed user. Currently this must be done by showing the user the
+Ed25519 fingerprint key for the device, and prompting the user to verify
+out-of-band that it matches the key shown on the other user's device.
+
+.. _`marking the event as verified`:
+
+Marking events as 'verified'
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Once a device has been verified, it is possible to verify that events
+have been sent from a particular device. See the section on `Handling an
+m.room.encrypted event`_ for notes on how to do this
+for each algorithm. Events sent from a verified device can be decorated
+in the UI to show that they have been sent from a verified device.