From 87330b9b9b01dc089a8781d9775e9252c207b4cb Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Mon, 5 Nov 2018 18:23:23 +0000 Subject: [PATCH 01/12] Proposal for .well-known for server discovery --- proposals/1708-well-known-for-federation.md | 126 ++++++++++++++++++++ 1 file changed, 126 insertions(+) create mode 100644 proposals/1708-well-known-for-federation.md diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md new file mode 100644 index 00000000..2b8a3711 --- /dev/null +++ b/proposals/1708-well-known-for-federation.md @@ -0,0 +1,126 @@ +# .well-known support for server name resolution + +Currently, mapping from a server name to a hostname for federation is done via +`SRV` records. This presents two principal difficulties: + + * SRV records are not widely used, and administrators may be unfamiliar with + them, and there may be other practical difficulties in their deployment such + as poor support from hosting providers. [^1] + + * It is likely that we will soon require valid X.509 certificates on the + federation endpoint. It will then be necessary for the homeserver to present + a certificate which is valid for the server name. This presents difficulties + for hosted server offerings: BigCorp may be reluctant to hand over the + keys for `bigcorp.com` to the administrators of the `bigcorp.com` matrix + homeserver. + +Here we propose to solve these problems by augmenting the current `SRV` record +with a `.well-known` lookup. + +## Proposal + +For reference, the current [specification for resolving server +names](https://matrix.org/docs/spec/server_server/unstable.html#resolving-server-names) +is as follows: + +* If the hostname is an IP literal, then that IP address should be used, + together with the given port number, or 8448 if no port is given. 
+ +* Otherwise, if the port is present, then an IP address is discovered by + looking up an AAAA or A record for the hostname, and the specified port is + used. + +* If the hostname is not an IP literal and no port is given, the server is + discovered by first looking up a `_matrix._tcp` SRV record for the + hostname, which may give a hostname (to be looked up using AAAA or A queries) + and port. If the SRV record does not exist, then the server is discovered by + looking up an AAAA or A record on the hostname and taking the default + fallback port number of 8448. + + Homeservers may use SRV records to load balance requests between multiple TLS + endpoints or to failover to another endpoint if an endpoint fails. + +The first two points remain unchanged: if the server name is an IP literal, or +contains a port, then requests will be made directly as before. + +If the hostname is neither an IP literal, nor does it have an explicit port, +then the requesting server should continue to make an SRV lookup as before, and +use the result if one is found. + +If *no* result is found, the requesting server should make a `GET` request to +`https://\/.well-known/matrix/server`, with normal X.509 +certificate validation. If the request fails in any way, then we fall back as +before to using using port 8448 on the hostname. + +Rationale: Falling back to port 8448 (rather than aborting the request) is +necessary to maintain compatibility with existing deployments, which may not +present valid certificates on port 443, or may return 4xx or 5xx errors. + +If the GET request succeeds, it should result in a JSON response, with contents +structured as shown: + +```json +{ + "server": "[:]" +} +``` + +The `server` property has the same format as a [server +name](https://matrix.org/docs/spec/appendices.html#server-name): a hostname +followed by an optional port. 
+ +If the response cannot be parsed as JSON, or lacks a valid `server` property, +the request is considered to have failed, and no fallback to port 8448 takes +place. + +Otherwise, the requesting server performs an `AAAA/A` lookup on the hostname, +and connects to the resultant address and the specifed port. The port defaults +to 8448, if unspecified. + +### Caching + +Servers should not look up the `.well-known` file for every request, as this +would impose an unacceptable overhead on both sides. Instead, the results of +the `.well-known` request should be cached according to the HTTP response +headers, as per [RFC7234](https://tools.ietf.org/html/rfc7234). If the response +does not include an explicit expiry time, the requesting server should use a +sensible default: 24 hours is suggested. + +Because there is no way to request a revalidation, it is also recommended that +requesting servers cap the expiry time. 48 hours is suggested. + +Similarly, a failure to retrieve the `.well-known` file should be cached for +a reasonable period. 24 hours is suggested again. + +### The future of SRV records + +It's worth noting that this proposal is very clear in that we will maintain +support for SRV records for the immediate future; there are no current plans to +deprecate them. + +However, clearly a `.well-known` file can provide much of the functionality of +an SRV record, and having to support both may be undesirable. Accordingly, we +may consider sunsetting SRV record support at some point in the future. + +### Outstanding questions + +Should we follow 30x redirects for the .well-known file? On the one hand, there +is no obvious usecase and they add complexity (for example: how do they +interact with caches?). On the other hand, we'll presumably be using an HTTP +client library to handle some of the caching stuff, and they might be useful +for something? 
+ +## Security considerations + +The `.well-known` file potentially broadens the attack surface for an attacker +wishing to intercept federation traffic to a particular server. + +## Conclusion + +This proposal adds a new mechanism, alongside the existing `SRV` record lookup +for finding the server responsible for a particular matrix server_name, which +will allow greater flexibility in deploying homeservers. + + +[^1] For example, Cloudflare automatically "flattens" SRV record responses. + From e3f10a4fd2049bcd71f4e77fe4192946498059f2 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Mon, 5 Nov 2018 18:37:25 +0000 Subject: [PATCH 02/12] Update 1708-well-known-for-federation.md fix title --- proposals/1708-well-known-for-federation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 2b8a3711..beec384f 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -1,4 +1,4 @@ -# .well-known support for server name resolution +# MSC1708: .well-known support for server name resolution Currently, mapping from a server name to a hostname for federation is done via `SRV` records. 
This presents two principal difficulties: From c4e1949cf8f81b137d9fedebb4d5bf2c5116b112 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 6 Nov 2018 11:38:23 +0000 Subject: [PATCH 03/12] Clarifications about what `server` means --- proposals/1708-well-known-for-federation.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index beec384f..8854eb9f 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -47,8 +47,8 @@ If the hostname is neither an IP literal, nor does it have an explicit port, then the requesting server should continue to make an SRV lookup as before, and use the result if one is found. -If *no* result is found, the requesting server should make a `GET` request to -`https://\/.well-known/matrix/server`, with normal X.509 +If *no* SRV result is found, the requesting server should make a `GET` request +to `https://\/.well-known/matrix/server`, with normal X.509 certificate validation. If the request fails in any way, then we fall back as before to using using port 8448 on the hostname. @@ -65,17 +65,19 @@ structured as shown: } ``` -The `server` property has the same format as a [server -name](https://matrix.org/docs/spec/appendices.html#server-name): a hostname -followed by an optional port. +The `server` property should be a hostname or IP address, followed by an +optional port. If the response cannot be parsed as JSON, or lacks a valid `server` property, the request is considered to have failed, and no fallback to port 8448 takes place. -Otherwise, the requesting server performs an `AAAA/A` lookup on the hostname, -and connects to the resultant address and the specifed port. The port defaults -to 8448, if unspecified. 
+Otherwise, the requesting server performs an `AAAA/A` lookup on the hostname +(if necessary), and connects to the resultant address and the specifed +port. The port defaults to 8448, if unspecified. + +(The formal grammar for the `server` property is identical to that of a [server +name](https://matrix.org/docs/spec/appendices.html#server-name).) ### Caching From 09d41464e7b3d03eb07fe066b4fae4153f79b3c2 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 6 Nov 2018 11:38:46 +0000 Subject: [PATCH 04/12] Add problems section xs --- proposals/1708-well-known-for-federation.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 8854eb9f..21365be1 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -112,6 +112,21 @@ interact with caches?). On the other hand, we'll presumably be using an HTTP client library to handle some of the caching stuff, and they might be useful for something? +## Problems + +It will take a while for `.well-known` to be supported across the ecosystem; +until it is, it will be difficult to deploy homeservers which rely on it for +their routing: if Alice is using a current homeserver implementation, and Bob +deploys a new implementation which relies on `.well-known` for routing, then +Alice will be unable to send messages to Bob. (This is the same problem we have with +[SNI](https://github.com/matrix-org/synapse/issues/1491#issuecomment-415153428).) + +The main defence against this seems to be to release support for `.well-known` +as soom as possible, to maximise uptake in the ecosystem. It is likely that, as +we approach Matrix 1.0, there will be sufficient other new features (such as +new Room versions) that upgading will be necessary anyway. 
+ + ## Security considerations The `.well-known` file potentially broadens the attack surface for an attacker From b1e79ac7ab37302c2c90f2f031a8c6f8d4d2d34b Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Fri, 9 Nov 2018 11:22:22 +0000 Subject: [PATCH 05/12] Update 1708-well-known-for-federation.md --- proposals/1708-well-known-for-federation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 21365be1..bd2f32b8 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -122,9 +122,9 @@ Alice will be unable to send messages to Bob. (This is the same problem we have [SNI](https://github.com/matrix-org/synapse/issues/1491#issuecomment-415153428).) The main defence against this seems to be to release support for `.well-known` -as soom as possible, to maximise uptake in the ecosystem. It is likely that, as +as soon as possible, to maximise uptake in the ecosystem. It is likely that, as we approach Matrix 1.0, there will be sufficient other new features (such as -new Room versions) that upgading will be necessary anyway. +new Room versions) that upgrading will be necessary anyway. 
## Security considerations From e789eb186a8f12fe2517d1b30c8e38989d3c3b5f Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Mon, 12 Nov 2018 14:02:44 +0000 Subject: [PATCH 06/12] link to MSC1711 --- proposals/1708-well-known-for-federation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index bd2f32b8..8fa66c7a 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -7,7 +7,8 @@ Currently, mapping from a server name to a hostname for federation is done via them, and there may be other practical difficulties in their deployment such as poor support from hosting providers. [^1] - * It is likely that we will soon require valid X.509 certificates on the + * [MSC1711](https://github.com/matrix-org/matrix-doc/pull/1711) proposes + requiring valid X.509 certificates on the federation endpoint. It will then be necessary for the homeserver to present a certificate which is valid for the server name. This presents difficulties for hosted server offerings: BigCorp may be reluctant to hand over the @@ -140,4 +141,3 @@ will allow greater flexibility in deploying homeservers. [^1] For example, Cloudflare automatically "flattens" SRV record responses. - From f33a540e6dbc6287b03c3633f09144a3266c26b4 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 8 Jan 2019 00:25:06 +0000 Subject: [PATCH 07/12] Do a SRV lookup before .well-known lookup also other clarifications and corrections. 
--- proposals/1708-well-known-for-federation.md | 133 ++++++++++---------- 1 file changed, 69 insertions(+), 64 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 8fa66c7a..6e857b85 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -1,21 +1,17 @@ # MSC1708: .well-known support for server name resolution Currently, mapping from a server name to a hostname for federation is done via -`SRV` records. This presents two principal difficulties: - - * SRV records are not widely used, and administrators may be unfamiliar with - them, and there may be other practical difficulties in their deployment such - as poor support from hosting providers. [^1] - - * [MSC1711](https://github.com/matrix-org/matrix-doc/pull/1711) proposes - requiring valid X.509 certificates on the - federation endpoint. It will then be necessary for the homeserver to present - a certificate which is valid for the server name. This presents difficulties - for hosted server offerings: BigCorp may be reluctant to hand over the - keys for `bigcorp.com` to the administrators of the `bigcorp.com` matrix - homeserver. - -Here we propose to solve these problems by augmenting the current `SRV` record +`SRV` records. However, +[MSC1711](https://github.com/matrix-org/matrix-doc/pull/1711) proposes +requiring valid X.509 certificates on the federation endpoint. It will then be +necessary for the homeserver to present a certificate which is valid for the +server name. This presents difficulties for hosted server offerings: BigCorp +may want to delegate responsibility for running its Matrix homeserver to an +outside supplier, but it may be difficult for that supplier to obtain a TLS +certificate for `bigcorp.com` (and BigCorp may be reluctant to let them have +one). + +This MSC proposes to solve this problem by augmenting the current `SRV` record with a `.well-known` lookup. 
## Proposal @@ -24,59 +20,80 @@ For reference, the current [specification for resolving server names](https://matrix.org/docs/spec/server_server/unstable.html#resolving-server-names) is as follows: -* If the hostname is an IP literal, then that IP address should be used, - together with the given port number, or 8448 if no port is given. +1. If the hostname is an IP literal, then that IP address should be used, + together with the given port number, or 8448 if no port is given. + +2. Otherwise, if the port is present, then an IP address is discovered by + looking up an AAAA or A record for the hostname, and the specified port is + used. + +3. If the hostname is not an IP literal and no port is given, the server is + discovered by first looking up a `_matrix._tcp` SRV record for the + hostname, which may give a hostname (to be looked up using AAAA or A queries) + and port. + +4. Finally, the server is discovered by looking up an AAAA or A record on the + hostname, and taking the default fallback port number of 8448. + +We insert the following between Steps 3 and 4: + +If the SRV record does not exist, the requesting server should make a `GET` +request to `https:///.well-known/matrix/server`, with normal +X.509 certificate validation. If the request does not return a 200, continue +to step 4, otherwise: + +XXX: should we follow redirects? -* Otherwise, if the port is present, then an IP address is discovered by - looking up an AAAA or A record for the hostname, and the specified port is - used. +The response must have a `Content-Type` of `application/json`, and must be +valid JSON which follows the structure documented below. Otherwise, the +request is aborted. -* If the hostname is not an IP literal and no port is given, the server is - discovered by first looking up a `_matrix._tcp` SRV record for the - hostname, which may give a hostname (to be looked up using AAAA or A queries) - and port. 
If the SRV record does not exist, then the server is discovered by - looking up an AAAA or A record on the hostname and taking the default - fallback port number of 8448. +If the response is valid, the `m.server` property is parsed as +`[:]`, and processed as follows: - Homeservers may use SRV records to load balance requests between multiple TLS - endpoints or to failover to another endpoint if an endpoint fails. + a. If `` is an IP literal, then that IP address should + be used, together with ``, or 8448 if no port is + given. The server should present a valid TLS certificate for + ``. -The first two points remain unchanged: if the server name is an IP literal, or -contains a port, then requests will be made directly as before. + b. Otherwise, if the port is present, then an IP address is discovered by + looking up an AAAA or A record for ``, and the + specified port is used. The server should present a valid TLS certificate + for ``. -If the hostname is neither an IP literal, nor does it have an explicit port, -then the requesting server should continue to make an SRV lookup as before, and -use the result if one is found. + (In other words, the federation connection is made to + `https://:`). -If *no* SRV result is found, the requesting server should make a `GET` request -to `https://\/.well-known/matrix/server`, with normal X.509 -certificate validation. If the request fails in any way, then we fall back as -before to using using port 8448 on the hostname. + c. If the hostname is not an IP literal and no port is given, a second SRV + record is looked up; this time for `_matrix._tcp.`, + which may give yet another hostname (to be looked up using A/AAAA queries) + and port. The server must present a TLS cert for the + `` from the .well-known. 
-Rationale: Falling back to port 8448 (rather than aborting the request) is -necessary to maintain compatibility with existing deployments, which may not -present valid certificates on port 443, or may return 4xx or 5xx errors. + d. If no SRV record is found, the server is discovered by looking up an AAAA + or A record on ``, and taking the default fallback + port number of 8448. -If the GET request succeeds, it should result in a JSON response, with contents -structured as shown: + (In other words, the federation connection is made to + `https://:8448`). + +### Structure of the `.well-known` response + +The contents of the `.well-known` response should be structured as shown: ```json { - "server": "[:]" + "m.server": "[:]" } ``` -The `server` property should be a hostname or IP address, followed by an +The `m.server` property should be a hostname or IP address, followed by an optional port. If the response cannot be parsed as JSON, or lacks a valid `server` property, the request is considered to have failed, and no fallback to port 8448 takes place. -Otherwise, the requesting server performs an `AAAA/A` lookup on the hostname -(if necessary), and connects to the resultant address and the specifed -port. The port defaults to 8448, if unspecified. - (The formal grammar for the `server` property is identical to that of a [server name](https://matrix.org/docs/spec/appendices.html#server-name).) @@ -92,18 +109,10 @@ sensible default: 24 hours is suggested. Because there is no way to request a revalidation, it is also recommended that requesting servers cap the expiry time. 48 hours is suggested. -Similarly, a failure to retrieve the `.well-known` file should be cached for -a reasonable period. 24 hours is suggested again. - -### The future of SRV records - -It's worth noting that this proposal is very clear in that we will maintain -support for SRV records for the immediate future; there are no current plans to -deprecate them. 
- -However, clearly a `.well-known` file can provide much of the functionality of -an SRV record, and having to support both may be undesirable. Accordingly, we -may consider sunsetting SRV record support at some point in the future. +A failure to retrieve the `.well-known` file should also be cached, though care +must be taken that a single 500 error or connection failure should not break +federation for an extended period. A short cache time of about an hour might be +appropriate; alternatively, servers might use an exponential backoff. ### Outstanding questions @@ -127,7 +136,6 @@ as soon as possible, to maximise uptake in the ecosystem. It is likely that, as we approach Matrix 1.0, there will be sufficient other new features (such as new Room versions) that upgrading will be necessary anyway. - ## Security considerations The `.well-known` file potentially broadens the attack surface for an attacker @@ -138,6 +146,3 @@ wishing to intercept federation traffic to a particular server. This proposal adds a new mechanism, alongside the existing `SRV` record lookup for finding the server responsible for a particular matrix server_name, which will allow greater flexibility in deploying homeservers. - - -[^1] For example, Cloudflare automatically "flattens" SRV record responses. From fb171cadf48def72258433d2f19fd7d17df03c9d Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 8 Jan 2019 12:51:02 +0000 Subject: [PATCH 08/12] formatting fix --- proposals/1708-well-known-for-federation.md | 50 ++++++++++----------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 6e857b85..14ba3bf6 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -51,31 +51,31 @@ request is aborted. If the response is valid, the `m.server` property is parsed as `[:]`, and processed as follows: - a. 
If `` is an IP literal, then that IP address should - be used, together with ``, or 8448 if no port is - given. The server should present a valid TLS certificate for - ``. - - b. Otherwise, if the port is present, then an IP address is discovered by - looking up an AAAA or A record for ``, and the - specified port is used. The server should present a valid TLS certificate - for ``. - - (In other words, the federation connection is made to - `https://:`). - - c. If the hostname is not an IP literal and no port is given, a second SRV - record is looked up; this time for `_matrix._tcp.`, - which may give yet another hostname (to be looked up using A/AAAA queries) - and port. The server must present a TLS cert for the - `` from the .well-known. - - d. If no SRV record is found, the server is discovered by looking up an AAAA - or A record on ``, and taking the default fallback - port number of 8448. - - (In other words, the federation connection is made to - `https://:8448`). +a. If `` is an IP literal, then that IP address should + be used, together with ``, or 8448 if no port is + given. The server should present a valid TLS certificate for + ``. + +b. Otherwise, if the port is present, then an IP address is discovered by + looking up an AAAA or A record for ``, and the + specified port is used. The server should present a valid TLS certificate + for ``. + + (In other words, the federation connection is made to + `https://:`). + +c. If the hostname is not an IP literal and no port is given, a second SRV + record is looked up; this time for `_matrix._tcp.`, + which may give yet another hostname (to be looked up using A/AAAA queries) + and port. The server must present a TLS cert for the + `` from the .well-known. + +d. If no SRV record is found, the server is discovered by looking up an AAAA + or A record on ``, and taking the default fallback + port number of 8448. + + (In other words, the federation connection is made to + `https://:8448`). 
### Structure of the `.well-known` response From f1ebbc358bd873988b2e987e32c53105a0e55293 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 8 Jan 2019 12:44:22 +0000 Subject: [PATCH 09/12] document dismissed options --- proposals/1708-well-known-for-federation.md | 67 +++++++++++++++++++++ 1 file changed, 67 insertions(+) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 14ba3bf6..b100ef38 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -141,6 +141,73 @@ new Room versions) that upgrading will be necessary anyway. The `.well-known` file potentially broadens the attack surface for an attacker wishing to intercept federation traffic to a particular server. +## Dismissed alternatives + +For future reference, here are the alternative solutions which have been +considered and dismissed. + +### Look up the `.well-known` file before the SRV record + +We could make the request for `.well-known` before looking up the `SRV` +record. On the one hand this is maybe marginally simpler (and avoids the +overhead of having to make *two* `SRV` lookups in the case that a `.well-known` +is found. It might also open a future path for using `.well-known` for +information other than delegation. + +Ultimately we decided to include the initial `SRV` lookup so that deployments +have a mechanism to avoid the `.well-known` overhead in the common case that it +is not required. + +### Subdomain hack + +As well as accepting TLS certs for `example.com`, we could also accept them for +`delegated--matrix.example.com`. This would allow `example.com` to delegate its +matrix hosting by (a) setting up the SRV record at `_matrix._tcp.example.com` +and (b) setting up a CNAME at `delegated--matrix.example.com`. The latter would +enable the delegatee to obtain an acceptable TLS certificate. 
+ +This was certainly an interesting idea, but we dismissed it for the following +reasons: + +* There's a security trap for anybody who lets people sign up for subdomains + (which is certainly not an uncommon business model): if you can register for + delegated--matrix.example.com, you get to intercept all the matrix traffic + for example.com. + +* Generally it feels quite unintuitive and violates the principle of least + surprise. + +* The fact that we can't find any prior art for this sets off alarm bells too. + +### Rely on DNS/DNSSEC + +If we could trust SRV records, we would be able to accept TLS certs for the +*target* of the SRV record, which avoids this whole problem. + +Such trust could come from assuming that plain DNS is "good enough". However, +DNS cache poisoning attacks are a real thing, and the fact that the designers +of TLS chose to implement a server-name check specifically to deal with this +case suggests we would be foolish to make this assumption. + +The alternative is to rely on DNSSEC to provide security for SRV records. The +problem here is simply that DNSSEC is not that widely deployed currently. A +number of large organisations are actively avoiding enabling it on their +domains, so requiring DNSSEC would be a direct impediment to the uptake of +Matrix. Furthermore, if we required DNSSEC-authenticated SRV records for +domains doing delegation, we would end up with a significant number of +homeservers unable to talk to such domains, because their local DNS +infrastructure may not implement DNSSEC. + +Finally, if we're expecting servers to present the cert for the *target* of the +SRV record, then we'll have to change the Host and SNI fields, and that will +break backwards compat everywhere (and it's hard to see how to mitigate that). + +### Stick with perspectives + +The final option is to double-down on the Perspectives approach, ie to skip +[MSC1711](https://github.com/matrix-org/matrix-doc/pull/1711). 
MSC1711 +discusses the reasons we do not believe this to be a viable option. + ## Conclusion This proposal adds a new mechanism, alongside the existing `SRV` record lookup From 5812450299fc88c58acf284e9f911152e2484812 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 8 Jan 2019 12:50:42 +0000 Subject: [PATCH 10/12] spec that we follow redirects --- proposals/1708-well-known-for-federation.md | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index b100ef38..5981ab9f 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -35,14 +35,12 @@ is as follows: 4. Finally, the server is discovered by looking up an AAAA or A record on the hostname, and taking the default fallback port number of 8448. -We insert the following between Steps 3 and 4: +We insert the following between Steps 3 and 4. If the SRV record does not exist, the requesting server should make a `GET` -request to `https:///.well-known/matrix/server`, with normal -X.509 certificate validation. If the request does not return a 200, continue -to step 4, otherwise: - -XXX: should we follow redirects? +request to `https:///.well-known/matrix/server`, with normal X.509 +certificate validation, and following 30x redirects. If the request does not +return a 200, continue to step 4, otherwise: The response must have a `Content-Type` of `application/json`, and must be valid JSON which follows the structure documented below. Otherwise, the @@ -114,14 +112,6 @@ must be taken that a single 500 error or connection failure should not break federation for an extended period. A short cache time of about an hour might be appropriate; alternatively, servers might use an exponential backoff. -### Outstanding questions - -Should we follow 30x redirects for the .well-known file? 
On the one hand, there -is no obvious usecase and they add complexity (for example: how do they -interact with caches?). On the other hand, we'll presumably be using an HTTP -client library to handle some of the caching stuff, and they might be useful -for something? - ## Problems It will take a while for `.well-known` to be supported across the ecosystem; From b541c2a247d6b849031574e3151ca44427c4cdb0 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Tue, 8 Jan 2019 12:54:52 +0000 Subject: [PATCH 11/12] more formatting --- proposals/1708-well-known-for-federation.md | 49 ++++++++++----------- 1 file changed, 24 insertions(+), 25 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 5981ab9f..98cfe8d8 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -49,31 +49,30 @@ request is aborted. If the response is valid, the `m.server` property is parsed as `[:]`, and processed as follows: -a. If `` is an IP literal, then that IP address should - be used, together with ``, or 8448 if no port is - given. The server should present a valid TLS certificate for - ``. - -b. Otherwise, if the port is present, then an IP address is discovered by - looking up an AAAA or A record for ``, and the - specified port is used. The server should present a valid TLS certificate - for ``. - - (In other words, the federation connection is made to - `https://:`). - -c. If the hostname is not an IP literal and no port is given, a second SRV - record is looked up; this time for `_matrix._tcp.`, - which may give yet another hostname (to be looked up using A/AAAA queries) - and port. The server must present a TLS cert for the - `` from the .well-known. - -d. If no SRV record is found, the server is discovered by looking up an AAAA - or A record on ``, and taking the default fallback - port number of 8448. 
- - (In other words, the federation connection is made to - `https://:8448`). +* If `` is an IP literal, then that IP address should be + used, together with ``, or 8448 if no port is given. The + server should present a valid TLS certificate for ``. + +* Otherwise, if the port is present, then an IP address is discovered by + looking up an AAAA or A record for ``, and the + specified port is used. The server should present a valid TLS certificate + for ``. + + (In other words, the federation connection is made to + `https://:`). + +* If the hostname is not an IP literal and no port is given, a second SRV + record is looked up; this time for `_matrix._tcp.`, + which may give yet another hostname (to be looked up using A/AAAA queries) + and port. The server must present a TLS cert for the + `` from the .well-known. + +* If no SRV record is found, the server is discovered by looking up an AAAA + or A record on ``, and taking the default fallback + port number of 8448. + + (In other words, the federation connection is made to + `https://:8448`). ### Structure of the `.well-known` response From c10394d03f9f82533c8db0a823052df6ba23453d Mon Sep 17 00:00:00 2001 From: Richard van der Hoff Date: Wed, 9 Jan 2019 11:26:14 +0000 Subject: [PATCH 12/12] Clarifications thanks to @uhoreg --- proposals/1708-well-known-for-federation.md | 23 ++++++++++----------- 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/proposals/1708-well-known-for-federation.md b/proposals/1708-well-known-for-federation.md index 98cfe8d8..8105a638 100644 --- a/proposals/1708-well-known-for-federation.md +++ b/proposals/1708-well-known-for-federation.md @@ -39,8 +39,9 @@ We insert the following between Steps 3 and 4. If the SRV record does not exist, the requesting server should make a `GET` request to `https:///.well-known/matrix/server`, with normal X.509 -certificate validation, and following 30x redirects. 
If the request does not -return a 200, continue to step 4, otherwise: +certificate validation, and following 30x redirects (being careful to avoid +redirect loops). If the request does not return a 200, continue to step 4, +otherwise: The response must have a `Content-Type` of `application/json`, and must be valid JSON which follows the structure documented below. Otherwise, the @@ -53,10 +54,10 @@ If the response is valid, the `m.server` property is parsed as used, together with ``, or 8448 if no port is given. The server should present a valid TLS certificate for ``. -* Otherwise, if the port is present, then an IP address is discovered by - looking up an AAAA or A record for ``, and the - specified port is used. The server should present a valid TLS certificate - for ``. +* If `` is not an IP literal, and `` is + present, then an IP address is discovered by looking up an AAAA or A record + for ``, and the specified port is used. The server + should present a valid TLS certificate for ``. (In other words, the federation connection is made to `https://:`). @@ -84,15 +85,13 @@ The contents of the `.well-known` response should be structured as shown: } ``` -The `m.server` property should be a hostname or IP address, followed by an -optional port. - -If the response cannot be parsed as JSON, or lacks a valid `server` property, +If the response cannot be parsed as JSON, or lacks a valid `m.server` property, the request is considered to have failed, and no fallback to port 8448 takes place. -(The formal grammar for the `server` property is identical to that of a [server +name](https://matrix.org/docs/spec/appendices.html#server-name).) +The formal grammar for the `m.server` property is the same as that of a [server +name](https://matrix.org/docs/spec/appendices.html#server-name): it is a +hostname or IP address, followed by an optional port. ### Caching
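
The `m.server` handling described in the patches above can be sketched roughly as follows. This is an illustrative sketch, not part of the proposal: the names `Delegation`, `parse_well_known` and `next_step` are hypothetical, the regular expression only loosely approximates the server-name grammar, and the HTTP request, `Content-Type` check, redirect handling and TLS validation are left out entirely.

```python
import ipaddress
import json
import re
from typing import NamedTuple, Optional

DEFAULT_PORT = 8448  # default fallback port named in the proposal

# Loose approximation of a server name: a bracketed IPv6 literal, or an IPv4
# literal / DNS name, followed by an optional ":port" of up to five digits.
_SERVER_RE = re.compile(r"(\[[0-9A-Fa-f:.]+\]|[A-Za-z0-9.-]+)(?::([0-9]{1,5}))?")


class Delegation(NamedTuple):
    host: str            # hostname or IP literal taken from m.server
    port: Optional[int]  # explicit port, or None if none was given
    is_ip_literal: bool


def parse_well_known(body: str) -> Delegation:
    """Parse a .well-known body; raise ValueError if m.server is unusable."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("response is not valid JSON") from exc
    value = doc.get("m.server") if isinstance(doc, dict) else None
    if not isinstance(value, str):
        raise ValueError("missing or non-string m.server")
    match = _SERVER_RE.fullmatch(value)
    if match is None:
        raise ValueError("m.server does not look like a server name")
    host, port = match.group(1), match.group(2)
    try:
        ipaddress.ip_address(host.strip("[]"))
        is_ip = True
    except ValueError:
        is_ip = False
    if host.startswith("[") and not is_ip:
        raise ValueError("bracketed host is not an IP literal")
    return Delegation(host, int(port) if port is not None else None, is_ip)


def next_step(d: Delegation) -> str:
    """Describe which branch of the processing rules applies."""
    if d.is_ip_literal:
        port = d.port if d.port is not None else DEFAULT_PORT
        return f"connect to {d.host}:{port}"
    if d.port is not None:
        return f"A/AAAA lookup for {d.host}, then connect on port {d.port}"
    return (f"SRV lookup for _matrix._tcp.{d.host}, "
            f"else A/AAAA lookup and port {DEFAULT_PORT}")
```

A `ValueError` here corresponds to the case where the request is considered to have failed and no fallback to port 8448 takes place. For instance, `next_step(parse_well_known('{"m.server": "delegated.example.org:443"}'))` (with a made-up hostname) selects the A/AAAA branch with the explicit port.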