diff --git a/api/identity/lookup.yaml b/api/identity/lookup.yaml index 166b09628..78fa3e3e7 100644 --- a/api/identity/lookup.yaml +++ b/api/identity/lookup.yaml @@ -16,7 +16,7 @@ # limitations under the License. swagger: '2.0' info: - title: "Matrix Identity Service Lookup API" + title: "Matrix Identity Service Lookup API" version: "1.0.0" host: localhost:8090 schemes: diff --git a/api/identity/v2_lookup.yaml b/api/identity/v2_lookup.yaml new file mode 100644 index 000000000..561700f4b --- /dev/null +++ b/api/identity/v2_lookup.yaml @@ -0,0 +1,145 @@ +# Copyright 2016 OpenMarket Ltd +# Copyright 2017 Kamax.io +# Copyright 2017 New Vector Ltd +# Copyright 2018 New Vector Ltd +# Copyright 2019 The Matrix.org Foundation C.I.C. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +swagger: '2.0' +info: + title: "Matrix Identity Service Lookup API" + version: "2.0.0" +host: localhost:8090 +schemes: + - https +basePath: /_matrix/identity/v2 +consumes: + - application/json +produces: + - application/json +securityDefinitions: + $ref: definitions/security.yaml +paths: + "/hash_details": + get: + summary: Gets hash function information from the server. + description: |- + Gets parameters for hashing identifiers from the server. This can include + any of the algorithms defined in this specification. + operationId: getHashDetails + security: + - accessToken: [] + parameters: [] + responses: + 200: + description: The hash function information. 
+ examples: + application/json: { + "lookup_pepper": "matrixrocks", + "algorithms": ["none", "sha256"] + } + schema: + type: object + properties: + lookup_pepper: + type: string + description: |- + The pepper the client MUST use in hashing identifiers, and MUST + supply to the ``/lookup`` endpoint when performing lookups. + + Servers SHOULD rotate this string often. + algorithms: + type: array + items: + type: string + description: |- + The algorithms the server supports. Must contain at least ``sha256``. + required: ['lookup_pepper', 'algorithms'] + "/lookup": + post: + summary: Look up Matrix User IDs for a set of 3PIDs. + description: |- + Looks up the set of Matrix User IDs which have bound the 3PIDs given, if + bindings are available. Note that the format of the addresses is defined + later in this specification. + operationId: lookupUsersV2 + security: + - accessToken: [] + parameters: + - in: body + name: body + schema: + type: object + properties: + algorithm: + type: string + description: |- + The algorithm the client is using to encode the ``addresses``. This + should be one of the available options from ``/hash_details``. + example: "sha256" + pepper: + type: string + description: |- + The pepper from ``/hash_details``. This is required even when the + ``algorithm`` does not make use of it. + example: "matrixrocks" + addresses: + type: array + items: + type: string + description: |- + The addresses to look up. The format of the entries here depends on + the ``algorithm`` used. Note that queries which have been incorrectly + hashed or formatted will lead to no matches. + example: [ + "4kenr7N9drpCJ4AfalmlGQVsOn3o2RHjkADUpXJWZUc", + "nlo35_T5fzSGZzJApqu8lgIudJvmOQtDaHtr-I4rU7I" + ] + required: ['algorithm', 'pepper', 'addresses'] + responses: + 200: + description: + The associations for any matched ``addresses``. 
+ examples: + application/json: { + "mappings": { + "4kenr7N9drpCJ4AfalmlGQVsOn3o2RHjkADUpXJWZUc": "@alice:example.org" + } + } + schema: + type: object + properties: + mappings: + type: object + description: |- + Any applicable mappings of ``addresses`` to Matrix User IDs. Addresses + which do not have associations will not be included, which can make + this property be an empty object. + title: AssociatedMappings + additionalProperties: + type: string + required: ['mappings'] + 400: + description: + The client's request was invalid in some way. One possible problem could + be the ``pepper`` being invalid after the server has rotated it - this is + presented with the ``M_INVALID_PEPPER`` error code. Clients SHOULD make + a call to ``/hash_details`` to get a new pepper in this scenario, being + careful to avoid retry loops. + examples: + application/json: { + "errcode": "M_INVALID_PEPPER", + "error": "Unknown or invalid pepper - has it been rotated?" + } + schema: + $ref: "../client-server/definitions/errors/error.yaml" diff --git a/specification/identity_service_api.rst b/specification/identity_service_api.rst index c92f737b4..f389cbc7a 100644 --- a/specification/identity_service_api.rst +++ b/specification/identity_service_api.rst @@ -155,6 +155,23 @@ should allow a 3PID to be mapped to a Matrix user identity, but not in the other direction (i.e. one should not be able to get all 3PIDs associated with a Matrix user ID, or get all 3PIDs associated with a 3PID). +Version 1 API deprecation +------------------------- + +.. TODO: Remove this section when the v1 API is removed. + +As described on each of the version 1 endpoints, the v1 API is deprecated in +favour of the v2 API described here. The major difference, with the exception +of a few isolated cases, is that the v2 API requires authentication to ensure +the user has given permission for the identity server to operate on their data. 
+ +The v1 API is planned to be removed from the specification in a future version. + +Clients SHOULD attempt the v2 endpoints first, and if they receive a ``404``, +``400``, or similar error they should try the v1 endpoint or fail the operation. +Clients are strongly encouraged to warn the user of the risks in using the v1 API, +if they are planning on using it. + Web browser clients ------------------- @@ -258,7 +275,134 @@ Association lookup {{lookup_is_http_api}} -.. TODO: TravisR - Add v2 lookup API in future PR +{{v2_lookup_is_http_api}} + +Client behaviour +~~~~~~~~~~~~~~~~ + +.. TODO: Remove this note when v1 is removed completely +.. Note:: + This section only covers the v2 lookup endpoint. The v1 endpoint is described + in isolation above. + +Prior to performing a lookup, clients SHOULD make a request to the ``/hash_details`` +endpoint to determine what algorithms the server supports (described in more detail +below). The client then uses this information to form a ``/lookup`` request and +receive known bindings from the server. + +Clients MUST support at least the ``sha256`` algorithm. + +Server behaviour +~~~~~~~~~~~~~~~~ + +.. TODO: Remove this note when v1 is removed completely +.. Note:: + This section only covers the v2 lookup endpoint. The v1 endpoint is described + in isolation above. + +Servers, upon receipt of a ``/lookup`` request, will compare the query against +known bindings they have, hashing the identifiers they know about as needed to +verify exact matches to the request. + +Servers MUST support at least the ``sha256`` algorithm. + +Algorithms +~~~~~~~~~~ + +Some algorithms are defined as part of the specification; however, other formats +can be negotiated between the client and server using ``/hash_details``. + +``sha256`` +++++++++++ + +This algorithm MUST be supported by clients and servers at a minimum. It is +additionally the preferred algorithm for lookups. 
+ +When using this algorithm, the client first converts each query into a string +of space-separated values in the format ``<address> <medium> <pepper>``. The ``<pepper>`` +is retrieved from ``/hash_details``, the ``<medium>`` is typically ``email`` or +``msisdn`` (both lowercase), and the ``<address>`` is the 3PID to search for. +For example, if the client wanted to know about ``alice@example.org``'s bindings, +it would first format the query as ``alice@example.org email ThePepperGoesHere``. + +.. admonition:: Rationale + + Mediums and peppers are appended to the address to prevent a common prefix + for each 3PID, helping prevent attackers from pre-computing the internal state + of the hash function. 
+ +After formatting each query, the string is run through SHA-256 as defined by +`RFC 4634 `_. The resulting bytes are then +encoded using URL-Safe `Unpadded Base64`_ (similar to `room version 4's +event ID format <../../rooms/v4.html#event-ids>`_). + +An example set of queries when using the pepper ``matrixrocks`` would be:: + + "alice@example.com email matrixrocks" -> "4kenr7N9drpCJ4AfalmlGQVsOn3o2RHjkADUpXJWZUc" + "bob@example.com email matrixrocks" -> "LJwSazmv46n0hlMlsb_iYxI0_HXEqy_yj6Jm636cdT8" + "18005552067 msisdn matrixrocks" -> "nlo35_T5fzSGZzJApqu8lgIudJvmOQtDaHtr-I4rU7I" + + +The set of hashes is then given as the ``addresses`` array in ``/lookup``. Note +that the pepper used MUST be supplied as ``pepper`` in the ``/lookup`` request. 
+ +``none`` +++++++++ + +This algorithm performs plaintext lookups on the identity server. Typically this +algorithm should not be used due to the security concerns of unhashed identifiers; +however, some scenarios (such as LDAP-backed identity servers) prevent the use of +hashed identifiers. Identity servers (and optionally clients) can use this algorithm +to perform those kinds of lookups. + +Similar to the ``sha256`` algorithm, the client converts the queries into strings +separated by spaces in the format ``<address> <medium>`` - note the lack of ``<pepper>``. +For example, if the client wanted to know about ``alice@example.org``'s bindings, +it would format the query as ``alice@example.org email``. + +The formatted strings are then given as the ``addresses`` in ``/lookup``. Note that +the ``pepper`` is still required, and must be provided to ensure the client has made +an appropriate request to ``/hash_details`` first. 
+ +Security considerations +~~~~~~~~~~~~~~~~~~~~~~~ + +.. Note:: + `MSC2134 `_ has much more + information about the security considerations made for this section of the + specification. This section covers the high-level details for why the specification + is the way it is. + +Typically the lookup endpoint is used when a client has an unknown 3PID it wants to +find a Matrix User ID for. Clients normally do this kind of lookup when inviting new +users to a room or searching a user's address book to find any Matrix users they may +not have discovered yet. Rogue or malicious identity servers could harvest this +unknown information and do nefarious things with it if it were sent in plain text. +In order to protect the privacy of users who might not have a Matrix identifier bound +to their 3PID addresses, the specification attempts to make it difficult to harvest +3PIDs. + +.. admonition:: Rationale + + Hashing identifiers, while not perfect, helps make the effort required to harvest + identifiers significantly higher. Phone numbers in particular are still difficult + to protect with hashing, however hashing is objectively better than not hashing. + + An alternative to hashing would be using bcrypt or similar with many rounds, however + by nature of needing to serve mobile clients and clients on limited hardware the + solution needs to be kept relatively lightweight. 
+ +Clients should be cautious of servers not rotating their pepper very often, and +potentially of servers which use a weak pepper - these servers may be attempting to +brute force the identifiers or use rainbow tables to mine the addresses. Similarly, +clients which support the ``none`` algorithm should consider at least warning the user +of the risks in sending identifiers in plain text to the identity server. + +Addresses are still potentially reversible using a calculated rainbow table, given +that some identifiers (such as phone numbers, common email address domains, and leaked +addresses) are easily calculated. For example, phone numbers can have roughly 12 +digits, making them an easier target for attack than email addresses. + Establishing associations -------------------------
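As a reading aid for the ``sha256`` and ``none`` algorithms this diff adds to the v2 ``/lookup`` documentation, the following is a minimal client-side sketch in Python. It only builds the request body and performs no network calls. The function names (``format_query``, ``sha256_lookup_hash``, ``build_lookup_body``) are hypothetical and not part of the specification, and encoding the query string as UTF-8 before hashing is an assumption, since the text above does not name an encoding.

```python
import base64
import hashlib
from typing import Iterable, Optional, Tuple


def format_query(address: str, medium: str, pepper: Optional[str] = None) -> str:
    """Join address, medium and (for sha256 only) pepper with single spaces."""
    parts = [address, medium]
    if pepper is not None:
        parts.append(pepper)
    return " ".join(parts)


def sha256_lookup_hash(address: str, medium: str, pepper: str) -> str:
    """SHA-256 the formatted query, then encode as URL-safe unpadded base64."""
    query = format_query(address, medium, pepper)
    # Assumption: the query string is encoded as UTF-8 before hashing.
    digest = hashlib.sha256(query.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_lookup_body(
    threepids: Iterable[Tuple[str, str]], algorithm: str, pepper: str
) -> dict:
    """Build a JSON-serialisable body for a v2 /lookup request."""
    if algorithm == "sha256":
        addresses = [sha256_lookup_hash(a, m, pepper) for a, m in threepids]
    elif algorithm == "none":
        # Plaintext lookups leave the pepper out of each address string.
        addresses = [format_query(a, m) for a, m in threepids]
    else:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    # The pepper is always sent, even when the algorithm does not use it.
    return {"algorithm": algorithm, "pepper": pepper, "addresses": addresses}


if __name__ == "__main__":
    body = build_lookup_body(
        [("alice@example.com", "email")], "sha256", "matrixrocks"
    )
    print(body)
```

A SHA-256 digest is 32 bytes, so each hashed entry is 43 characters long once the base64 padding is stripped. If the server rejects the request with ``M_INVALID_PEPPER``, a client following the text above would fetch ``/hash_details`` again and rebuild the body with the new pepper, with a limit on retries to avoid loops.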