From 55c4307f1291cb6aab353da8cf2e403eb2bc9a6c Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Thu, 30 Aug 2018 14:37:24 +0100 Subject: [PATCH] Rewrite the section on signing events ... for clarity and de-duplication. And to say a bit about validating the signatures. --- api/server-server/definitions/pdu.yaml | 3 +- .../definitions/unsigned_pdu.yaml | 6 +- specification/server_server_api.rst | 185 ++++++++---------- 3 files changed, 85 insertions(+), 109 deletions(-) diff --git a/api/server-server/definitions/pdu.yaml b/api/server-server/definitions/pdu.yaml index bb14ede27..d86b85384 100644 --- a/api/server-server/definitions/pdu.yaml +++ b/api/server-server/definitions/pdu.yaml @@ -23,7 +23,8 @@ allOf: hashes: type: object title: Event Hash - description: Hashes of the PDU, following the algorithm specified in `Signing Events`_. + description: |- + Content hashes of the PDU, following the algorithm specified in `Signing Events`_. example: { "sha256": "thishashcoversallfieldsincasethisisredacted" } diff --git a/api/server-server/definitions/unsigned_pdu.yaml b/api/server-server/definitions/unsigned_pdu.yaml index 64991d227..446973ed0 100644 --- a/api/server-server/definitions/unsigned_pdu.yaml +++ b/api/server-server/definitions/unsigned_pdu.yaml @@ -55,8 +55,8 @@ properties: prev_events: type: array description: |- - Event IDs and hashes of the most recent events in the room that the homeserver was aware - of when it made this event. + Event IDs and reference hashes for the most recent events in the room + that the homeserver was aware of when it made this event. items: type: array maxItems: 2 @@ -86,7 +86,7 @@ properties: auth_events: type: array description: |- - An event reference list containing the authorization events that would + Event IDs and reference hashes for the authorization events that would allow this event to be in the room. items: type: array diff --git a/specification/server_server_api.rst b/specification/server_server_api.rst index 244a8b82e..e55b8113e 100644 --- a/specification/server_server_api.rst +++ b/specification/server_server_api.rst @@ -112,7 +112,7 @@ Server implementation {{version_ss_http_api}} -Retrieving Server Keys +Retrieving server keys ~~~~~~~~~~~~~~~~~~~~~~ .. NOTE:: @@ -965,142 +965,114 @@ Signing Events Signing events is complicated by the fact that servers can choose to redact non-essential parts of an event. -Before signing the event, the ``unsigned`` and ``signature`` members are -removed, it is encoded as `Canonical JSON`_, and then hashed using SHA-256. The -resulting hash is then stored in the event JSON in a ``hash`` object under a -``sha256`` key. +Adding hashes and signatures to outgoing events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. code:: python - - def hash_event(event_json_object): - - # Keys under "unsigned" can be modified by other servers. - # They are useful for conveying information like the age of an - # event that will change in transit. - # Since they can be modifed we need to exclude them from the hash. - unsigned = event_json_object.pop("unsigned", None) - - # Signatures will depend on the current value of the "hashes" key. - # We cannot add new hashes without invalidating existing signatures. - signatures = event_json_object.pop("signatures", None) +Before signing the event, the *content hash* of the event is calculated as +described below. The hash is encoded using `Unpadded Base64`_ and stored in the +event object, in a ``hashes`` object, under a ``sha256`` key. - # The "hashes" key might contain multiple algorithms if we decide to - # migrate away from SHA-2. We don't want to include an existing hash - # output in our hash so we exclude the "hashes" dict from the hash. - hashes = event_json_object.pop("hashes", {}) - - # Encode the JSON using a canonical encoding so that we get the same - # bytes on every server for the same JSON object. - event_json_bytes = encode_canonical_json(event_json_bytes) +The event object is then *redacted*, following the `redaction +algorithm`_. Finally it is signed as described in `Signing JSON`_, using the +server's signing key (see also `Retrieving server keys`_). - # Add the base64 encoded bytes of the hash to the "hashes" dict. - hashes["sha256"] = encode_base64(sha256(event_json_bytes).digest()) +The signature is then copied back to the original event object. - # Add the "hashes" dict back the event JSON under a "hashes" key. - event_json_object["hashes"] = hashes - if unsigned is not None: - event_json_object["unsigned"] = unsigned - return event_json_object +See `Persistent Data Unit schema`_ for an example of a signed event. -The event is then stripped of all non-essential keys both at the top level and -within the ``content`` object. Any top-level keys not in the following list -MUST be removed: -.. code:: +Validating hashes and signatures on received events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When a server receives an event over federation from another server, the +receiving server should check the hashes and signatures on that event. - auth_events - depth - event_id - hashes - membership - origin - origin_server_ts - prev_events - prev_state - room_id - sender - signatures - state_key - type - -A new ``content`` object is constructed for the resulting event that contains -only the essential keys of the original ``content`` object. If the original -event lacked a ``content`` object at all, a new empty JSON object is created -for it. - -The keys that are considered essential for the ``content`` object depend on the -the ``type`` of the event. These are: +First the signature is checked. The event is redacted following the `redaction +algorithm`_, and the resultant object is checked for a signature from the +originating server, following the algorithm described in `Checking for a signature`_. +Note that this step should succeed whether we have been sent the full event or +a redacted copy. -.. code:: +If the signature is found to be valid, the expected content hash is calculated +as described below. The content hash in the ``hashes`` property of the received +event is base64-decoded, and the two are compared for equality. - type is "m.room.aliases": - aliases +If the hash check fails, then it is assumed that this is because we have only +been given a redacted version of the event. To enforce this, the receiving +server should use the redacted copy it calculated rather than the full copy it +received. - type is "m.room.create": - creator +Calculating the content hash for an event +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - type is "m.room.history_visibility": - history_visibility +The *content hash* of an event covers the complete event including the +*unredacted* contents. It is calculated as follows. - type is "m.room.join_rules": - join_rule +First, any existing ``unsigned``, ``signature``, and ``hashes`` members are +removed. The resulting object is then encoded as `Canonical JSON`_, and the +JSON is hashed using SHA-256. - type is "m.room.member": - membership - type is "m.room.power_levels": - ban - events - events_default - kick - redact - state_default - users - users_default - -The resulting stripped object with the new ``content`` object and the original -``hashes`` key is then signed using the JSON signing algorithm outlined below: +Example code +~~~~~~~~~~~~ .. code:: python - def sign_event(event_json_object, name, key): - - # Make sure the event has a "hashes" key. - if "hashes" not in event_json_object: - event_json_object = hash_event(event_json_object) + def hash_and_sign_event(event_object, signing_key, signing_name): + # First we need to hash the event object. + content_hash = compute_content_hash(event_object) + event_object["hashes"] = {"sha256": encode_unpadded_base64(content_hash)} # Strip all the keys that would be removed if the event was redacted. # The hashes are not stripped and cover all the keys in the event. # This means that we can tell if any of the non-essential keys are # modified or removed. - stripped_json_object = strip_non_essential_keys(event_json_object) + stripped_object = strip_non_essential_keys(event_object) # Sign the stripped JSON object. The signature only covers the # essential keys and the hashes. This means that we can check the # signature even if the event is redacted. - signed_json_object = sign_json(stripped_json_object) + signed_object = sign_json(stripped_object, signing_key, signing_name) # Copy the signatures from the stripped event to the original event. - event_json_object["signatures"] = signed_json_oject["signatures"] - return event_json_object + event_object["signatures"] = signed_object["signatures"] -Servers can then transmit the entire event or the event with the non-essential -keys removed. If the entire event is present, receiving servers can then check -the event by computing the SHA-256 of the event, excluding the ``hash`` object. -If the keys have been redacted, then the ``hash`` object is included when -calculating the SHA-256 hash instead. + def compute_content_hash(event_object): + # take a copy of the event before we remove any keys. + event_object = dict(event_object) -New hash functions can be introduced by adding additional keys to the ``hash`` -object. Since the ``hash`` object cannot be redacted a server shouldn't allow -too many hashes to be listed, otherwise a server might embed illict data within -the ``hash`` object. For similar reasons a server shouldn't allow hash values -that are too long. + # Keys under "unsigned" can be modified by other servers. + # They are useful for conveying information like the age of an + # event that will change in transit. + # Since they can be modifed we need to exclude them from the hash. + event_object.pop("unsigned", None) + + # Signatures will depend on the current value of the "hashes" key. + # We cannot add new hashes without invalidating existing signatures. + event_object.pop("signatures", None) + + # The "hashes" key might contain multiple algorithms if we decide to + # migrate away from SHA-2. We don't want to include an existing hash + # output in our hash so we exclude the "hashes" dict from the hash. + event_object.pop("hashes", None) + + # Encode the JSON using a canonical encoding so that we get the same + # bytes on every server for the same JSON object. + event_json_bytes = encode_canonical_json(event_object) + + return hashlib.sha256(event_json_bytes) .. TODO - [[TODO(markjh): We might want to specify a maximum number of keys for the - ``hash`` and we might want to specify the maximum output size of a hash]] - [[TODO(markjh) We might want to allow the server to omit the output of well - known hash functions like SHA-256 when none of the keys have been redacted]] + + [[TODO(markjh): Since the ``hash`` object cannot be redacted a server + shouldn't allow too many hashes to be listed, otherwise a server might embed + illict data within the ``hash`` object. + + We might want to specify a maximum number of keys for the + ``hash`` and we might want to specify the maximum output size of a hash]] + + [[TODO(markjh) We might want to allow the server to omit the output of well + known hash functions like SHA-256 when none of the keys have been redacted]] + .. |/query/directory| replace:: ``/query/directory`` .. _/query/directory: #get-matrix-federation-v1-query-directory @@ -1111,3 +1083,6 @@ that are too long. .. _`Inviting to a room`: #inviting-to-a-room .. _`Canonical JSON`: ../appendices.html#canonical-json .. _`Unpadded Base64`: ../appendices.html#unpadded-base64 +.. _`redaction algorithm`: ../client_server/unstable.html#redactions +.. _`Signing JSON`: ../appendices.html#signing-json +.. _`Checking for a signature`: ../appendices.html#checking-for-a-signature