From f2850c7f6ad9002256bb7055c67b019d9721f25f Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Fri, 4 Oct 2019 15:50:19 +0900 Subject: [PATCH 01/29] Initial draft of the Matrix URI scheme proposal A remake of MSC455. --- proposals/2312-matrix-uri.md | 496 +++++++++++++++++++++++++++++++++++ 1 file changed, 496 insertions(+) create mode 100644 proposals/2312-matrix-uri.md diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md new file mode 100644 index 00000000..21898007 --- /dev/null +++ b/proposals/2312-matrix-uri.md @@ -0,0 +1,496 @@ +# URI scheme for Matrix + +This is a proposal of a URI scheme to identify Matrix resources in a wide +range of applications (web, desktop, or mobile) both throughout Matrix software +and (especially) outside of it. It supersedes +[MSC455](https://github.com/matrix-org/matrix-doc/issues/455) in order +to continue the discussion in the modern GFM style. + +While Matrix has its own resource naming system that allows it to identify +resources without resolving them, there is a common need to provide URIs +to Matrix resources (e.g., rooms, users, PDUs) that could be transferred +outside of Matrix and then resolved in a uniform way - matching URLs +in World Wide Web. + +Specific use cases include: +1. Representation in UI: as a Matrix user I want to refer to Matrix entities + in the same way as for web pages, so that others could “follow the link” + I sent them (not necessarily through Matrix, it can be, e.g., a web page or + email) in order to access the referred resource. +1. Inbound integration: as an author of Matrix software, I want to have a way + to invoke my software from the operating environment to resolve a Matrix URI + passed from another program. This is a case of, e.g., + opening a Matrix client by clicking on a link in an email program. +1. 
Outbound integration: as an author of Matrix software, I want to have a way + to export identifiers of Matrix resources as URIs to a non-Matrix environment + so that they could be resolved in another time-place in a uniform way. + This is the case of a "Share via…" action in a mobile Matrix client. + +https://matrix.to somehow compensates for the lack of dedicated URIs; however: +* it addresses use case (1) in a somewhat clumsy way (resolving a link needs + two interactions with the user instead of one), and +* it can only deal with (2) within a web browser (basically limiting + first-class support to browser-based clients). + +To cover the use cases above, the following scheme is proposed for Matrix URIs +(`[]` enclose optional parts, `{}` enclose variables): +```text +matrix:[//{authority}/]{type}/{id without sigil}[/{more type/id pairs}][?{query}] +``` +with `type` being one of: `user`, `room`, `roomid`, `event` and +`group`; and the only `query` defined for now is `action=join`. The Matrix +identifier (or identifiers) can be reconstructed from the URI by taking +the sigil that corresponds to `type` and appending `id without sigil` to it. +The query may alter the kind of request with respect to the Matrix resource; +and Matrix resources can be contained in each other, in which case the +`more type/id pairs` series is used to reconstruct inner identifiers. + +This proposal defines an initial mapping of URIs to Matrix identifiers and actions +on corresponding resources; the scheme and mapping are subject +to further extension. 
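The reconstruction rule above (take the sigil that corresponds to `type`, then append `id without sigil`) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `matrix_ids_from_uri` is hypothetical, the sigil table is the one given in the Path section of this proposal, and the query, fragment and authority parts are simply skipped rather than interpreted.

```python
from typing import List
from urllib.parse import unquote

# Type specifier -> sigil, as listed in the Path section of the proposal
# (`group` is reserved for future use there).
SIGILS = {"user": "@", "roomid": "!", "room": "#", "event": "$", "group": "+"}


def matrix_ids_from_uri(uri: str) -> List[str]:
    """Hypothetical helper: rebuild Matrix identifiers from a `matrix:` URI."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "matrix":
        raise ValueError("not a Matrix URI: the scheme name must not be omitted")
    # The fragment and query do not contribute to the identifiers.
    rest = rest.split("#", 1)[0].split("?", 1)[0]
    # Skip the optional `//authority/` prefix without interpreting it.
    if rest.startswith("//"):
        _authority, _, rest = rest[2:].partition("/")
    segments = rest.split("/")
    if not rest or len(segments) % 2 != 0:
        raise ValueError("the path must be a series of type/id pairs")
    identifiers = []
    for type_, id_without_sigil in zip(segments[0::2], segments[1::2]):
        if type_ not in SIGILS or not id_without_sigil:
            raise ValueError("unsupported type/id pair: %s/%s" % (type_, id_without_sigil))
        # Percent-decoding is applied per segment, after splitting on "/".
        identifiers.append(SIGILS[type_] + unquote(id_without_sigil))
    return identifiers
```

For instance, `matrix_ids_from_uri("matrix:room/someroom:example.org/event/Arbitrary_Event_Id")` yields the room alias followed by the event id, outer identifier first.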
+ +Examples: +* Room `#someroom:example.org`: + `matrix:room/someroom:example.org` +* Unfederated room `#internalroom:internal.example.org`: + `matrix://internal.example.org/room/internalroom:internal.example.org` +* Event in a room: + `matrix:room/someroom:example.org/event/Arbitrary_Event_Id` +* [A commit like this](https://github.com/her001/steamlug.org/commit/2bd69441e1cf21f626e699f0957193f45a1d560f) + could make use of a Matrix URI in the form of + `{Matrix identifier}`. + +This MSC does not introduce new Matrix entities, nor API endpoints - +it merely defines a mapping from a URI with the scheme name `matrix:` +to Matrix identifiers and actions on them. It is deemed sufficient to +produce an implementation that would convert Matrix URIs to a series +of CS API calls, entirely on the client side. It is recognised, +however, that most of URI processing logic can and should (eventually) +be on the server side in order to facilitate adoption of Matrix URIs; +further MSCs are needed to define details for that. + +## Proposal + +Further text uses “Matrix identifier” with a meaning of identifiers +as described by [Matrix Specification](https://matrix.org/docs/spec/), +and “Matrix URI” with a meaning of an identifier following +the RFC-compliant URI format proposed hereby. + +### Requirements + +The following considerations drive the requirements for Matrix URIs: +1. Follow existing standards and practices. +1. Endorse the principle of least surprise. +1. Humans first, machines second. +1. Cover as many entities as practical. +1. URIs are expected to be extremely portable and stable; + you cannot rewrite them once they are released to the world. +1. Ease of implementation, allowing reuse of existing codes. + +The following requirements resulted from these drivers +(see [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) for conventions +around MUST/SHOULD/MAY): +1. 
Matrix URI MUST comply with + [RFC 3986](https://tools.ietf.org/html/rfc3986) and + [RFC 7595](https://tools.ietf.org/html/rfc7595). +1. By definition, Matrix URI MUST unambiguously identify a resource + in a Matrix network, across servers and types of resources. + This means, in particular, that two Matrix identifiers distinct by + [Matrix Specification](https://matrix.org/docs/spec/appendices.html#identifier-grammar) + MUST NOT have Matrix URIs that are equal in + [RFC 3986](https://tools.ietf.org/html/rfc3986) sense + (but two distinct Matrix URIs MAY map to the same Matrix identifier). +1. The following classes MUST be supported: + 1. User IDs (`@user:example.org`) + 1. Room IDs (`!roomid:example.org`) + 1. Room aliases (`#roomalias:example.org`) + 1. Event IDs (`$eventid:example.org`) +1. The mapping MUST take into account that some identifiers + (e.g. aliases) can have non-ASCII characters - reusing + [RFC 3987](https://tools.ietf.org/html/rfc3987) is RECOMMENDED + but an alternative encoding can be used if there are reasons + for that. +1. The mapping between Matrix identifiers and Matrix URIs MUST + be extensible (without invalidating previous URIs) to: + 1. new classes of identifiers (there MUST be a meta-rule to produce + a new mapping for IDs following the `&somethingnew:example.org` + pattern assumed for Matrix identifiers); + 1. new ways to navigate to and interact with objects in Matrix + (e.g., we might eventually want to have a mapping for + room-specific user profiles). +1. The mapping MUST support decentralised as well as centralised IDs. + This basically means that the URI scheme MUST have provisions + for mapping of `:<serverpart>` but it MUST NOT require + `:<serverpart>` to be there. +1. Matrix URI SHOULD allow encoding of action requests such as joining + a room. +1. Matrix URI SHOULD have a human-readable, if not necessarily + human-friendly, representation - to allow visual sanity-checks. 
+ In particular, characters escaping/encoding should be reduced + to bare minimum in that representation. As a food for thought, see + [Wikipedia: Clean URL, aka SEF URL](https://en.wikipedia.org/wiki/Clean_URL) and + [a very relevant use case from RFC 3986](https://tools.ietf.org/html/rfc3986#section-1.2.1). +1. It SHOULD be easy to parse Matrix URI in popular programming + languages: e.g., one should be able to use `parseUri()` + to dissect a Matrix URI in JavaScript. +1. The mapping SHOULD be consistent across different classes of + Matrix identifiers. +1. The mapping SHOULD support linking to unfederated servers/networks + (see also + [matrix-doc#2309](https://github.com/matrix-org/matrix-doc/issues/2309) + that calls for such linking). + +The syntax and mapping discussed below meet all these requirements. +Further extensions MUST comply to them as well. + +### Syntax and high-level processing + +The proposed generic Matrix URI syntax is a subset of the generic +URI syntax +[defined by RFC 3986](https://tools.ietf.org/html/rfc3986#section-3): +```text +MatrixURI = “matrix:” hier-part [ “?” query ] [ “#” fragment ] +hier-part = [ “//” authority “/” ] path +``` +As mentioned above, this MSC assumes client-side URI processing +(i.e. mapping to Matrix identifiers and CS API requests). +However, even when URI processing is shifted to the server side +the client will still have to parse the URI at least to find +the authority part (or lack of it) and remove the fragment part +before sending the request to the server (more on that below). + +#### Scheme name +The proposed scheme name is `matrix`. +[RFC 7595](https://tools.ietf.org/html/rfc7595) states: + + if there’s one-to-one correspondence between a service name and + a scheme name then the scheme name should be the same as + the service name. 
+ +Other considered options were `mx` and `web+matrix`; +[comments to MSC455](https://github.com/matrix-org/matrix-doc/issues/455) +mention two scheme names proposed and one more has been mentioned +in `#matrix-core:matrix.org`). + +The scheme name is a definitive indication of a Matrix URI and MUST NOT +be omitted. As can be seen below, Matrix URI rely heavily on [relative +references](https://tools.ietf.org/html/rfc3986#section-4.2) and +omitting the scheme name makes them indistinguishable from a local path +that might have nothing to do with Matrix. Clients MUST NOT try to +parse pieces like `room/MyRoom:example.org` as Matrix URIs; instead, +users should be encouraged to use Matrix IDs for in-text references +(`#MyRoom:example.org`) and client applications should do +the heavy-lifting of turning them into hyperlinks to Matrix URIs. + +#### Authority + +The authority part is used for the specific case of getting access +to a Matrix resource (such as a room or a user) through a given server. +```text +authority = host [ “:” port ] +``` + +Here's an example of a Matrix URI with an authority part +(the authority part is `example.org:682` here): +`matrix://example.org:682/roomid/Internal_Room_Id:example2.org`. + +The main purpose of the authority part, +[as per RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), +is to identify the authority governing the namespace for the rest +of the URI. This MSC reuses the RFC definitions for +[`host`](https://tools.ietf.org/html/rfc3986#section-3.2.2) and +[`port`](https://tools.ietf.org/html/rfc3986#section-3.2.3). +RFC 3986 also includes provisions for user information - +this MSC explicitly omits them. If providing a user identity +in the authority part is found to be of value in some case, +this can be addressed in a separate MSC. + +When a Matrix client needs to obtain the resource specified by a URI +with an authority part, the client MUST, +_instead of engaging the current user's homeserver connection_: +1. 
resolve the homeserver (further referred to as authority server) using + [the standard process for server discovery](https://matrix.org/docs/spec/client_server/latest#server-discovery) + from the authority part; +1. interrogate the resolved authority server using Matrix APIs in order + to get or interact with the resource encoded in the URI. + +Importantly, the authority part is _not_ intended for usage in routing +over federation; rather, it is for cases when a given Matrix +entity is not expected to be reachable through federation (such as +unfederated rooms or non-public Matrix networks). Sending requests +to the server resolved from the authority part means that the client +should be, as the name implies, _authorised_ by the authority server +to access the requested resource. That implies that the resource +is either available to guests on the authority server, or the end user +must be authenticated (and their access rights checked) +on that server in order to access the resource. + +#### Path +Unlike the very wide definition of path in RFC 3986, this MSC +restricts the path component of a Matrix URI to a simple +pattern that allows to easily reconstruct a Matrix identifier or +a chain of identifiers: +```text +path = type “/” id-without-sigil “/” path +type = “user” / “roomid” / “room” / “event” / “group” +id-without-sigil = string ; see below +``` +The path component consists of 1 or more type-id pairs separated +by slash character both inside the pair and between pairs. While most +of the URIs considered in this MSC do not need any sort of hierarchy, +one case is standing out: as of now, events require rooms to be +resolved so an event URI for `$eventid` in the room +`!roomid:example2.org` would be +`matrix:room/roomid:example2.org/event/eventid`. + +This MSC defines the following type specifiers: +`user` (user id, sigil `@`), `roomid` (room id, sigil `!`), +`room` (room alias, sigil `#`), and `event` (event id, sigil `$`). 
+The type `group` (group/community id, sigil `+`) is reserved for future use. + +`id-without-sigil` is defined as the `string` part of Matrix +[Common identifier format](https://matrix.org/docs/spec/appendices#common-identifier-format) with percent-encoded characters that are NEITHER +unreserved, sub-delimiters, `:` nor `@`, +[as per RFC 3986 rule for pchar](https://tools.ietf.org/html/rfc3986#appendix-A). +This notably exempts `:` from percent-encoding but includes `/`. + +See subsections of the same section for details on each type of +identifier. + +The rationale behind dropping sigils is two-fold: +* (most important) Semantic and grammatic clashes with RFC. + `#` delimits a fragment; `@` delimits user information. Percent-encoding + these is not a reasonable option, given that the two are the most used sigils. +* The namespace for sigils is extremely limited. Not that Matrix showed + any tendency to produce plenty of various classes of objects but + that cannot be ruled out; so rather then hit the wall together + with the institute of sigils, this MSC proposes + a more extensible solution from the outset. + +Clients MUST resolve all Matrix entities that they support. + +Further MSCs may introduce navigation to more top-level as well as +non-top-level objects; e.g., one could conceive a URI mapping of avatars +in the form of +`matrix:user/uid:matrix.org/avatar/room:matrix.org` +(a user’s avatar for a given room). + +#### Query + +Matrix URI can optionally have +[the query part](https://tools.ietf.org/html/rfc3986#section-3.4). +This MSC defines only two specific forms for the query; further MSCs +may add to this as long as RFC 3986 is followed. +```text +query = query-element *( “&” query-element ) +query-element = action / routing +action = “action=join” +routing = “via=” authority +``` +The join action can only be used with a URI resolving to a room; applications +MUST ignore it if found and MUST NOT generate it for other Matrix resources. 
+This action means that rather than just navigate to the room +client applications SHOULD attempt to join it using the standard CS API means. +Client applications SHOULD ask for user confirmation before joining if the user +is not a joined member in the room nor in its immediate +[successor or predecessor](https://matrix.org/docs/spec/client_server/latest#module-room-upgrades). + +The routing query is only allowed with URIs that have path starting with +`roomid/` to indicate servers that are likely involved in the room (cf. +[the feature of matrix.to](https://matrix.org/docs/spec/appendices#routing)). +For other URIs it SHOULD be ignored and clients SHOULD NOT generate it. Note +that `authority` here only represents the server grammar as defined in +the respective section above; it doesn't need to coincide with the actual +authority server if it's also supplied in the URI. + + +### URI semantics + +The main purpose of a Matrix URI is accessing the resource specified by the +identifier, and the primary action is loading the contents of a document +corresponding to a given resource. This MSC defines the "default" action +that a client application SHOULD perform when the user activates +(e.g. clicks on) a URI; further MSCs may introduce additional actions enabled +either by passing an `action` value in the query part, or by other means. 
+The classes of URIs and corresponding default actions (along with relevant +CS API calls) are collected as follows: + +* User ID: + - URI example: `matrix:user/me:example.org` or + (decentralised user id, future) `matrix:user/me_in_matrix` + - Default action: Show user profile (`GET /profile/@me:example.org/...`, + `GET /profile/@me_in_matrix/...`) +* Room ID: + - URI example: `matrix:roomid/rid:example.org` or + (decentralised id, future) `!lol823y4bcp3qo4` + - Default action: attempt to "open" the room (usually means the client + at least displays the room timeline at the latest or + the last remembered position - `GET /rooms/!rid:example.org/...`, + `GET /rooms/!lol823y4bcp3qo4/...`) +* Joining by Room ID: + - URI example: `matrix:roomid/rid:example.org?action=join&via=example2.org` + - Default action: if needed (see the section about the query part) ask + the user to confirm the intention; then join the room + (`POST /join/!rid:example.org?server_name=example2.org`) +* Room alias: + - URI example: `matrix:room/us:example.org` + - Default action: resolve the alias to room id + (`GET /directory/room/#us:example.org` if needed) and attempt to "open" + the room (same as above) +* Joining by Room alias: + - URI example: `matrix:room/us:example.org?action=join` + - Default action: if needed (see the section about the query part) ask + the user to confirm the intention; then join the room + (`POST /join/#us:example.org`) +* Event ID (as used in + [room version 3](https://matrix.org/docs/spec/rooms/v3) and later): + - URI example (aka "event permalink"): + `matrix:room/us:example.org/event/UnpaddedBase64` or + (additional routing is needed for room id) + `matrix:roomid/rid:example.org/event/UnpaddedBase64?via=example2.org` + - Default action: + 1. For room aliases, resolve an alias to a room id (see above) + 1. 
If the event is in the room that the user has joined, retrieve + the event contents (`GET /rooms/!rid:example.org/event/UnpaddedBase64` or + `GET /rooms/!rid:example.org/context/UnpaddedBase64`) and display them + to the user + 1. Otherwise try to retrieve the event in the same way but in case of + access failure the client MAY offer the user to join the room; if + the user agrees and joining succeeds, retry the step above. +* Group ID: + - URI example: `matrix:group/them:matrix.org` + - Default action: reserved for future use + + +## Discussion points and tradeoffs + +The below documents the discussion and outcomes in various prior forums; +further discussion should happen in GitHub comments. +1. _Why no double-slashes in a typical URI?_ + Because `//` is used to mark the beginning of an authority + part. RFC 3986 explicitly forbids to start the path component with + `//` if the URI doesn't have an authority component. In other words, + `//` implies a centre of authority, and the (public) Matrix + federation is not supposed to have one; hence no `//` in most URIs. +1. _Why type specifiers use singular rather than plural + as is common in RESTful APIs?_ + Unlike in actual RESTful APIs, this MSC does not see `rooms/` or + `users/` as collections to browse. The type specifier completes + the id specification in the URI, defining a very specific and + easy to parse syntax for that. Future MSCs may certainly add + collection URIs, but it is recommended to use more distinct naming + for such collections. In particular, `rooms/` is ambiguous, as + different sets of rooms are available to any user at any time + (e.g., all rooms known to the user; or all routable rooms; or + public rooms known to the user's homeserver). +1. (still open) + _Should we advise using the query part for collections then?_ +1. (still open) + _Why not Reddit-style single-letter type specifiers? 
That's almost + as compact as a sigil, still pretty clearly conveys the type, + and nicely avoids the confusion described in the previous question._ + Readability? +1. _Why event URI cannot use the fragment part for the event id?_ + Because fragment is a part processed exclusively by the client + in order to navigate within a larger document, and room cannot + be considered a "document". Each event can be retrieved from the server + individually, so each event can be viewed as a self-contained document. + When URI processing is shifted to the server-side, servers are not even + going to receive fragments (as per RFC 3986). +1. _Interoperability with + [Linked Data](https://en.wikipedia.org/wiki/Linked_data)_ is out of + scope of this MSC but worth being considered separately. +1. _How does this MSC work with closed federations?_ If you need to + communicate a URI to the bigger world where you cannot expect + the consumer to know in advance which federation they should use - + supply any server of the closed federation in the authority part. + Users inside the closed federation can omit the authority part if + they know the URI is not going to be used outside this federation. + Clients can facilitate that by having an option to always add or omit + the authority part in generated URIs for a given user account. + + +## Further evolution + +This MSC is obviously just the first step, opening the door for a range of +extensions, especially to the query part. Possible actions may include +opening the +[canonical direct chat](https://github.com/matrix-org/matrix-doc/pull/2199) +(`action=chat`, following the existing convention in Riot), +leaving a room (`action=leave`); bounds for a segment of the room timeline +(`from=$evtid1&to=$evtid2`) also look worthwhile. 
+ + +## Alternatives + +The discussion in +[MSC455](https://github.com/matrix-org/matrix-doc/issues/455) +mentions an option to standardise URNs rather than URLs/URIs, +with the list of resolvers being user-specific. While a URN namespace +such as `urn:matrix:`, along with a URN scheme, might be deemed useful +once we shift to an (even) more decentralised structure of the network, +`urn:` URIs must be managed entities (see +[RFC 8141](https://tools.ietf.org/html/rfc8141)), which is not always +the case in Matrix (consider room aliases, e.g.). + +With that said, a URN-styled (`matrix:room:example.org:roomalias`) +option was considered. However, Matrix already uses colon (`:`) as +a delimiter of id parts and, as can be seen above, reversing the parts +to meet the URN's hierarchical order would look confusing for Matrix +users. + +Yet another alternative considered was to go "full REST" and build +a more traditional looking URL structure with serverparts coming first +followed by type grouping (sic - not specifiers) and then by localparts, +i.e. `matrix://example.org/rooms/roomalias`. This is even more +difficult to comprehend for a Matrix user than the previous alternative +and besides it conflates the notion of an authority server with +that of a namespace (`example.org` above is a server part of an alias, +not the name of a homeserver that should be used to resolve the URI). + + +## Potential issues + +Despite the limited functionality of URIs as proposed in this MSC, +Matrix authors are advised to use tools that would process URIs just +like an http(s) URI instead of writing quick home-baked +parsers/emitters. Even with that in mind, not all tools normalise and +sanitise all cases in a fully RFC-compliant way. This MSC tries to keep +the required transformations to the minimum and will likely not bring +much grief even with naive implementations; however, as the functionality +of Matrix URIs grows, the number of corner cases will increase. 
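As an illustration of the advice above to rely on generic URI tooling, the sketch below splits a Matrix URI with Python's standard `urllib.parse` module instead of hand-written string handling. The function name `split_matrix_uri` and the shape of the returned dictionary are assumptions made for this example only; the code does not validate the type/id pairs and does not claim to be fully RFC-compliant.

```python
from urllib.parse import parse_qs, urlsplit


def split_matrix_uri(uri):
    """Hypothetical helper: dissect a `matrix:` URI with the generic parser."""
    parts = urlsplit(uri)  # the scheme is lower-cased by urlsplit
    if parts.scheme != "matrix":
        raise ValueError("not a Matrix URI")
    return {
        # Empty when the URI has no `//authority/` prefix.
        "authority": parts.netloc or None,
        # With an authority the generic parser keeps a leading "/"; drop it so
        # both forms expose the same series of type/id pairs.
        "path": parts.path.lstrip("/"),
        # Each query key maps to a list because keys may repeat.
        "query": parse_qs(parts.query),
        # The fragment is available as parts.fragment and is left out here.
    }


joined = split_matrix_uri("matrix:roomid/rid:example.org?action=join&via=example2.org")
routed = split_matrix_uri("matrix://example.org:682/roomid/Internal_Room_Id:example2.org")
```

Here `joined["query"]` holds the `action` and `via` items, and `routed["authority"]` holds `example.org:682`; `urlsplit` also exposes `hostname` and `port` on its result if the two need to be separated.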
+ + +## Security considerations + +This MSC mostly builds on RFC 3986 but tries to reduce the scope +as much as possible. Notably, it avoids introducing complex traversable +structures and defines the URI grammar quite strictly. In particular, +dot path segments (`.` and `..`), while potentially useful when URIs +become richer, would be premature for now. Care is taken +to not make essential parts of the URI omittable to avoid +even accidental misrepresentation of a local resource for a remote one +in Matrix and vice versa. One part still omittable - authority - can +result in leaks if a non-fully-qualified URI from one closed federation +is sent to another closed federation; this is considered an extreme +corner case. + +The MSC intentionally doesn't support conveying any kind of user +information in URIs. + + +## Conclusion + +A dedicated URI scheme is well overdue for Matrix. Many other networks +already have one for themselves, benefiting both in terms of +branding (compare `matrix:room/weruletheworld:example.org` vs. +`#weruletheworld:example.org` from the standpoint of someone who +hasn't been to Matrix) and interoperability (`matrix.to` requires +opening a browser while clicking a `tg:` link dumped to the terminal +application will open the correct application for Telegram without +user intervention or can even offer to install one, if needed). +The proposed syntax makes conversion between Matrix URIs +and Matrix identifiers as easy as a bunch of regular expressions; so +even though client-side processing of URIs might not be optimal +longer-term, it's a very simple and quick way that allows plenty +of experimentation early on. 
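To give a feel for how small the identifier-to-URI direction of that conversion can be, here is a hedged Python sketch. It uses `urllib.parse.quote` rather than regular expressions, the name `matrix_uri_from_ids` is hypothetical, and the set of characters left unescaped follows the `pchar` rule quoted in the Path section (unreserved, sub-delimiters, `:` and `@`). Identifiers are expected outermost first, e.g. a room followed by an event in it.

```python
from urllib.parse import quote

# Sigil -> type specifier for the identifier classes this MSC maps.
TYPES = {"@": "user", "!": "roomid", "#": "room", "$": "event"}

# Sub-delimiters plus ":" and "@"; quote() already leaves unreserved
# characters alone, and "/" is deliberately absent so it gets percent-encoded.
PCHAR_SAFE = "!$&'()*+,;=:@"


def matrix_uri_from_ids(*identifiers):
    """Hypothetical helper: build a `matrix:` URI from Matrix identifiers."""
    if not identifiers:
        raise ValueError("at least one Matrix identifier is required")
    segments = []
    for identifier in identifiers:
        sigil, id_without_sigil = identifier[:1], identifier[1:]
        if sigil not in TYPES or not id_without_sigil:
            raise ValueError("unsupported Matrix identifier: %r" % (identifier,))
        segments.append(TYPES[sigil])
        # Non-ASCII characters are percent-encoded as UTF-8 by quote().
        segments.append(quote(id_without_sigil, safe=PCHAR_SAFE))
    return "matrix:" + "/".join(segments)
```

Percent-encoding `/` inside an id keeps the type/id pairs unambiguous when the path is split again on the receiving side.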
From 008185dfcbf45195cc6614914b623e905157097d Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sun, 6 Oct 2019 08:10:53 +0900 Subject: [PATCH 02/29] Fix path grammar Co-Authored-By: David Vo --- proposals/2312-matrix-uri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 21898007..f0d49b94 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -231,7 +231,7 @@ restricts the path component of a Matrix URI to a simple pattern that allows to easily reconstruct a Matrix identifier or a chain of identifiers: ```text -path = type “/” id-without-sigil “/” path +path = type “/” id-without-sigil [“/” path] type = “user” / “roomid” / “room” / “event” / “group” id-without-sigil = string ; see below ``` From 644bb3e4d4efa7d627965473c824fe63fff0735f Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 15 Oct 2019 20:10:20 +0900 Subject: [PATCH 03/29] Follow-ups on review comments --- proposals/2312-matrix-uri.md | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index f0d49b94..2cedd390 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -295,6 +295,19 @@ Client applications SHOULD ask for user confirmation before joining if the user is not a joined member in the room nor in its immediate [successor or predecessor](https://matrix.org/docs/spec/client_server/latest#module-room-upgrades). +Client applications SHOULD NOT append `action=join` to room URIs and generally +SHOULD NOT push consumers into joining rooms unless the user's intention +to join the room can be assumed from the context of the URI. 
A link +`Join our room` is +a poorer choice for a webpage compared to +`Our room`; on the other hand, +a message in an obsolete, especially private, room saying +"We have moved our room elsewhere, +`click to join`" +is safe to assume that, being in the current room, a user would want to join +the new room and that there's not much sense in exploring the new room +without joining it anyway. + The routing query is only allowed with URIs that have path starting with `roomid/` to indicate servers that are likely involved in the room (cf. [the feature of matrix.to](https://matrix.org/docs/spec/appendices#routing)). @@ -322,7 +335,7 @@ CS API calls) are collected as follows: `GET /profile/@me_in_matrix/...`) * Room ID: - URI example: `matrix:roomid/rid:example.org` or - (decentralised id, future) `!lol823y4bcp3qo4` + (decentralised id, future) `matrix:roomid/lol823y4bcp3qo4` - Default action: attempt to "open" the room (usually means the client at least displays the room timeline at the latest or the last remembered position - `GET /rooms/!rid:example.org/...`, From 812df4c9c18ae820c607c390b521a7b1c7005918 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sun, 10 Nov 2019 19:04:41 +0900 Subject: [PATCH 04/29] Out of WIP; early feedback incorporated - Authority part semantics are no more prescribed; authority part has defined syntax but reserved for future use. - Moved away non-normative parts to "Discussion" and/or "Alternatives" - Added `action=chat` - Extended `via=` applicability to non-roomid cases to compensate dropping the authority part semantics. - Added a reference algorithm to parse a URI. - Closed outstanding questions/discussion points. - Added more cases for future evolution. 
- Added "minimal syntax" options to the discussion of possible alternatives --- proposals/2312-matrix-uri.md | 361 +++++++++++++++++++++++------------ 1 file changed, 241 insertions(+), 120 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 2cedd390..cac300a7 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -104,7 +104,7 @@ around MUST/SHOULD/MAY): 1. User IDs (`@user:example.org`) 1. Room IDs (`!roomid:example.org`) 1. Room aliases (`#roomalias:example.org`) - 1. Event IDs (`$eventid:example.org`) + 1. Event IDs (`$arbitrary_eventid_with_or_without_serverpart`) 1. The mapping MUST take into account that some identifiers (e.g. aliases) can have non-ASCII characters - reusing [RFC 3987](https://tools.ietf.org/html/rfc3987) is RECOMMENDED @@ -194,36 +194,9 @@ Here's an example of a Matrix URI with an authority part (the authority part is `example.org:682` here): `matrix://example.org:682/roomid/Internal_Room_Id:example2.org`. -The main purpose of the authority part, -[as per RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), -is to identify the authority governing the namespace for the rest -of the URI. This MSC reuses the RFC definitions for -[`host`](https://tools.ietf.org/html/rfc3986#section-3.2.2) and -[`port`](https://tools.ietf.org/html/rfc3986#section-3.2.3). -RFC 3986 also includes provisions for user information - -this MSC explicitly omits them. If providing a user identity -in the authority part is found to be of value in some case, -this can be addressed in a separate MSC. - -When a Matrix client needs to obtain the resource specified by a URI -with an authority part, the client MUST, -_instead of engaging the current user's homeserver connection_: -1. resolve the homeserver (further referred to as authority server) using - [the standard process for server discovery](https://matrix.org/docs/spec/client_server/latest#server-discovery) - from the authority part; -1. 
interrogate the resolved authority server using Matrix APIs in order - to get or interact with the resource encoded in the URI. - -Importantly, the authority part is _not_ intended for usage in routing -over federation; rather, it is for cases when a given Matrix -entity is not expected to be reachable through federation (such as -unfederated rooms or non-public Matrix networks). Sending requests -to the server resolved from the authority part means that the client -should be, as the name implies, _authorised_ by the authority server -to access the requested resource. That implies that the resource -is either available to guests on the authority server, or the end user -must be authenticated (and their access rights checked) -on that server in order to access the resource. +The authority part, as defined above, is reserved for future MSCs. +Clients SHOULD NOT use data from the authority part other than for +experimental or further research purposes. #### Path Unlike the very wide definition of path in RFC 3986, this MSC @@ -241,7 +214,7 @@ of the URIs considered in this MSC do not need any sort of hierarchy, one case is standing out: as of now, events require rooms to be resolved so an event URI for `$eventid` in the room `!roomid:example2.org` would be -`matrix:room/roomid:example2.org/event/eventid`. +`matrix:roomid/roomid:example2.org/event/eventid`. This MSC defines the following type specifiers: `user` (user id, sigil `@`), `roomid` (room id, sigil `!`), @@ -249,31 +222,16 @@ This MSC defines the following type specifiers: The type `group` (group/community id, sigil `+`) is reserved for future use. 
`id-without-sigil` is defined as the `string` part of Matrix -[Common identifier format](https://matrix.org/docs/spec/appendices#common-identifier-format) with percent-encoded characters that are NEITHER -unreserved, sub-delimiters, `:` nor `@`, +[Common identifier format](https://matrix.org/docs/spec/appendices#common-identifier-format) +with percent-encoded characters that are NEITHER unreserved, sub-delimiters, `:` nor `@`, [as per RFC 3986 rule for pchar](https://tools.ietf.org/html/rfc3986#appendix-A). This notably exempts `:` from percent-encoding but includes `/`. -See subsections of the same section for details on each type of -identifier. - -The rationale behind dropping sigils is two-fold: -* (most important) Semantic and grammatic clashes with RFC. - `#` delimits a fragment; `@` delimits user information. Percent-encoding - these is not a reasonable option, given that the two are the most used sigils. -* The namespace for sigils is extremely limited. Not that Matrix showed - any tendency to produce plenty of various classes of objects but - that cannot be ruled out; so rather then hit the wall together - with the institute of sigils, this MSC proposes - a more extensible solution from the outset. +See the rationale behind dropping sigils and the respective up/downsides in +"Discussion points and tradeoffs" below. -Clients MUST resolve all Matrix entities that they support. - Further MSCs may introduce navigation to more top-level as well as -non-top-level objects; e.g., one could conceive a URI mapping of avatars -in the form of -`matrix:user/uid:matrix.org/avatar/room:matrix.org` -(a user’s avatar for a given room). +non-top-level objects; see "Further evolution" for some ideas. #### Query @@ -284,40 +242,43 @@ may add to this as long as RFC 3986 is followed. 
```text query = query-element *( “&” query-element ) query-element = action / routing -action = “action=join” +action = “action=" ( "join" / "chat" ) routing = “via=” authority ``` -The join action can only be used with a URI resolving to a room; applications -MUST ignore it if found and MUST NOT generate it for other Matrix resources. -This action means that rather than just navigate to the room -client applications SHOULD attempt to join it using the standard CS API means. -Client applications SHOULD ask for user confirmation before joining if the user -is not a joined member in the room nor in its immediate -[successor or predecessor](https://matrix.org/docs/spec/client_server/latest#module-room-upgrades). - -Client applications SHOULD NOT append `action=join` to room URIs and generally -SHOULD NOT push consumers into joining rooms unless the user's intention -to join the room can be assumed from the context of the URI. A link -`Join our room` is -a poorer choice for a webpage compared to -`Our room`; on the other hand, -a message in an obsolete, especially private, room saying -"We have moved our room elsewhere, -`click to join`" -is safe to assume that, being in the current room, a user would want to join -the new room and that there's not much sense in exploring the new room -without joining it anyway. - -The routing query is only allowed with URIs that have path starting with -`roomid/` to indicate servers that are likely involved in the room (cf. + +The `action` query item is used in contexts where, on top of identifying +the Matrix entity, a certain action is requested on it. This proposal +describes two possible actions: +* `action=join` is only valid in a URI resolving to a Matrix room; + applications MUST ignore it if found in other contexts and MUST NOT generate + it for other Matrix resources. This action means that client applications + SHOULD attempt to join it using the standard CS API means. 
Client + applications SHOULD ask for user confirmation before joining if the user + is neither a joined member of the room nor of its immediate + [successor or predecessor](https://matrix.org/docs/spec/client_server/latest#module-room-upgrades). +* `action=chat` is only valid in a URI resolving to a Matrix user; + applications MUST ignore it if found in other contexts and MUST NOT generate + it for other Matrix resources. A URI with this action means that a client application + SHOULD open a direct chat window with the user; clients supporting + [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) + SHOULD open the canonical direct chat. + +For both actions, where applicable, client applications SHOULD ask for user +confirmation or at least make the user aware if the action leads +to joining or creating a new room rather than switching to a prior one. + +The routing query (`via=`) indicates servers that are likely involved in +the room (see also [the feature of matrix.to](https://matrix.org/docs/spec/appendices#routing)). -For other URIs it SHOULD be ignored and clients SHOULD NOT generate it. Note -that `authority` here only represents the server grammar as defined in -the respective section above; it doesn't need to coincide with the actual -authority server if it's also supplied in the URI. +It is proposed to use the routing query not only for resolving +room ids in a public federation but also when a URI refers to a resource in +a non-public Matrix network (see the question about closed federations in +"Discussion points and tradeoffs"). Note that `authority` in the definition +above is only a part of the grammar as defined in the respective section; +it is not proposed here to generate or read the authority part of the URI. 
-### URI semantics +### URI semantics and parsing algorithm The main purpose of a Matrix URI is accessing the resource specified by the identifier, and the primary action is loading the contents of a document @@ -325,14 +286,79 @@ corresponding to a given resource. This MSC defines the "default" action that a client application SHOULD perform when the user activates (e.g. clicks on) a URI; further MSCs may introduce additional actions enabled either by passing an `action` value in the query part, or by other means. + +The reference algorithm of parsing a Matrix URI follows. Note that, although +clients are encouraged to use lower-case strings in their URIs, all string +comparisons are case-INsensitive. + +1. Parse the URI into main components (`scheme name`, `authority`, `path`, + `query`, and `fragment`), decoding special or international characters + as directed by [RFC 3986](https://tools.ietf.org/html/rfc3986) and + (for IRIs) [RFC 3987](https://tools.ietf.org/html/rfc3987). Authors are + strongly RECOMMENDED to find an existing implementation of that step for + their language and SDK, rather than implement it from scratch based on RFCs. + +1. Check that `scheme name` is exactly `matrix`, case-insensitive. If + the scheme name doesn't match, exit parsing: this is not Matrix URI. + +1. Split the `path` into segments separated by `/` character; several + subsequent `/` characters delimit empty segments, as advised by RFC 3986. + +1. Check that the URI contains either 2 or 4 segments; if it's not the case, + fail parsing; the Matrix URI is invalid. + +1. To construct the top-level (primary) Matrix identifier: + + a. Pick the leftmost segment of `path` until `/` (path segment) and match + it against the following list to produce `sigil-1`: + - `user` -> `@` + - `roomid` -> `!` + - `room` -> `#` + - `group` -> `+` + - any other string, including an empty one -> fail parsing: + the Matrix URI is invalid. + + b. 
Pick the next (2nd) leftmost path segment: + - if the segment is empty, fail parsing; + - otherwise, percent-decode the segment and make `mxid-1` by + concatenating `sigil-1` and the result of percent-decoding. + +1. If `sigil-1` is `!` or `#` and the URI path has exactly 4 segments, + it may be possible to construct the 2nd-level Matrix identifier to + point to an event inside the room identified by `mxid-1`: + + a. Pick the next (3rd) path segment: + - if the segment is exactly `event`, proceed; + - otherwise, including the case of an empty segment, fail parsing. + + b. Pick the next (4th) leftmost path segment: + - if the segment is empty, fail parsing; + - otherwise, percent-decode the segment and make `mxid-2` by + prepending `$` to the result of percent-decoding. + +1. Split the `query` into items separated by `&` character; several subsequent + `&` characters delimit empty items, ignored by this algorithm. + + a. If `query` contains one or more items starting with `via=` - treat + the rest of each this item after `via=` as a percent-encoded homeserver + name to be used in + [routing](https://matrix.org/docs/spec/appendices#routing). + + b. If `query` contains one or more items starting with `action=` - treat + _the last_ such item as an instruction for joining or opening a direct + chat, as per this proposal. + The classes of URIs and corresponding default actions (along with relevant -CS API calls) are collected as follows: +CS API calls) are collected as follows. This is non-normative and just provides +a reference and examples of handling various kinds of URIs. 
* User ID: - URI example: `matrix:user/me:example.org` or (decentralised user id, future) `matrix:user/me_in_matrix` - - Default action: Show user profile (`GET /profile/@me:example.org/...`, - `GET /profile/@me_in_matrix/...`) + - Possible default actions: + - Show user profile + (`GET /profile/@me:example.org/...`, `GET /profile/@me_in_matrix/...`); + - Mention the user in the current room (client-local operation) * Room ID: - URI example: `matrix:roomid/rid:example.org` or (decentralised id, future) `matrix:roomid/lol823y4bcp3qo4` @@ -396,46 +422,88 @@ further discussion should happen in GitHub comments. different sets of rooms are available to any user at any time (e.g., all rooms known to the user; or all routable rooms; or public rooms known to the user's homeserver). -1. (still open) - _Should we advise using the query part for collections then?_ -1. (still open) - _Why not Reddit-style single-letter type specifiers? That's almost +1. _Should we advise using the query part for collections then?_ + Not in this MSC but that can be considered in the future. +1. _Why not Reddit-style single-letter type specifiers? That's almost as compact as a sigil, still pretty clearly conveys the type, and nicely avoids the confusion described in the previous question._ - Readability? + Reddit-style prefixes would eventually produce bigger ambiguity as + primary notation; but they can be handy as shortcuts. As discussed + further below, the current proposal provides enough space to define + synonyms; this may need some canonicalisation service from homeservers + so that we don't have to enable synonyms at each client individually. 1. _Why event URI cannot use the fragment part for the event id?_ Because fragment is a part processed exclusively by the client in order to navigate within a larger document, and room cannot be considered a "document". Each event can be retrieved from the server individually, so each event can be viewed as a self-contained document. 
- When URI processing is shifted to the server-side, servers are not even + When/if URI processing is shifted to the server-side, servers are not even going to receive fragments (as per RFC 3986). 1. _Interoperability with [Linked Data](https://en.wikipedia.org/wiki/Linked_data)_ is out of scope of this MSC but worth being considered separately. -1. _How does this MSC work with closed federations?_ If you need to +1. _How does this MSC work with closed federations?_ ~~If you need to communicate a URI to the bigger world where you cannot expect the consumer to know in advance which federation they should use - supply any server of the closed federation in the authority part. Users inside the closed federation can omit the authority part if they know the URI is not going to be used outside this federation. Clients can facilitate that by having an option to always add or omit - the authority part in generated URIs for a given user account. + the authority part in generated URIs for a given user account.~~ + Use `via=` in order to point to a homeserver in the closed federation. + The authority part may eventually be used for that but further discussion + is needed on how clients should support without compromising privacy + (see https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282 + for the original concern). ## Further evolution -This MSC is obviously just the first step, opening the door for a range of -extensions, especially to the query part. Possible actions may include -opening the -[canonical direct chat](https://github.com/matrix-org/matrix-doc/pull/2199) -(`action=chat`, following the existing convention in Riot), -leaving a room (`action=leave`); bounds for a segment of the room timeline -(`from=$evtid1&to=$evtid2`) also look worthwhile. +This section is non-normative. + +This MSC is obviously just the first step, keeping the door open for +extensions. Here are a few ideas: + +* Add new actions; e.g. leaving a room (`action=leave`). 
+* Add specifying a segment of the room timeline (`from=$evtid1&to=$evtid2`). +* Unlock bare event ids (`matrix:event/$event_id`) - subject to changes in + other areas of the specification. +* One area of possible evolution is bringing tangible semantics to + the authority part. The main purpose of the authority part, + [as per RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), + is to identify the authority governing the namespace for the rest + of the URI. This MSC reuses the RFC definitions for + [`host`](https://tools.ietf.org/html/rfc3986#section-3.2.2) and + [`port`](https://tools.ietf.org/html/rfc3986#section-3.2.3). + RFC 3986 also includes provisions for user information - + this MSC explicitly excludes them. If providing a user identity + in the authority part is found to be of value in some case, + this should be addressed in a separate MSC. + + Importantly, the authority part is _not_ intended for usage in routing + over federation; rather, it is for cases when a given Matrix + entity is not expected to be reachable through federation (such as + unfederated rooms or non-public Matrix networks). Sending requests + to the server resolved from the authority part means that the client + should be, as the name implies, _authorised_ by the authority server + to access the requested resource. That, in turn, implies that the resource + is either available to guests on the authority server, or the end user + must be authenticated (and their access rights checked) + on that server in order to access the resource. While being a part + of the original proposal, the semantics for the authority part have + been dropped from the normative part as a result of MSC discussion. +* One could conceive a URI mapping of avatars in the form of + `matrix:user/uid:matrix.org/avatar/room:matrix.org` + (a user’s avatar for a given room). 
+* As described below in "Alternatives", one can introduce a synonymous + system that uses Matrix identifiers with sigils by adding another path + prefix (`matrix:id/%23matrix:matrix.org`). ## Alternatives +### URNs + The discussion in [MSC455](https://github.com/matrix-org/matrix-doc/issues/455) mentions an option to standardise URNs rather than URLs/URIs, @@ -452,45 +520,98 @@ a delimiter of id parts and, as can be seen above, reversing the parts to meet the URN's hierarchical order would look confusing for Matrix users. + +### "Full REST" + Yet another alternative considered was to go "full REST" and build -a more traditional looking URL structure with serverparts coming first +a more traditionally looking URL structure with serverparts coming first followed by type grouping (sic - not specifiers) and then by localparts, i.e. `matrix://example.org/rooms/roomalias`. This is even more difficult to comprehend for a Matrix user than the previous alternative and besides it conflates the notion of an authority server with that of a namespace (`example.org` above is a server part of an alias, -not the name of a homeserver that should be used to resolve the URI). +not the name of a hypothetical homeserver that should be used to resolve +the URI). + + +### Minimal syntax + +One early but still viable proposal was to simply prepend `matrix:` to +a Matrix identifier (without encoding it), assuming that it will only be +processed on the client side. The massive downside of this option is that +such strings are not actual URIs even though they look like ones: most +URI parsers won't handle them correctly. Bare Matrix identifiers have +the same applicable range without deceptive looks. + + +### Minimal syntax based on path and percent-encoding + +A simple modification of the previous option is much more viable: +proper percent-encoding of the Matrix identifier allows to use it as +a URI path part. 
A single identifier packed in a URI could look like +`matrix:/encoded_id_with_sigil`; an event-in-a-room URI would be something +like `matrix:/roomid_or_alias/$event_id` (NB: RFC3986 doesn't require `$` +to be encoded). This is considerably more concise and encoding is only +needed for `#` - quite unfortunately, this is one of the most used sigils +in Matrix. E.g., `matrix:/%23matrix:matrix.org` would be a URI for +Matrix HQ chat room. + +Putting the whole id to the URI fragment (`matrix:#id_with_sigil` or, +following on the `matrix.to` tradition, `matrix:#/id_with_sigil` for +readability) allows to use `#` without encoding on many URI parsers. It is +still not fully RFC3986-compliant but the bigger problem is that putting +the identifying part to the fragment rules out using URIs in client-server +communication. Effectively all clients will have to implement full URI +processing with no chance to offload that to the server. + +Regardless of the placement (the fragment or the path), one more consideration +is that the character space for sigils is extremely limited and +Matrix identifiers are generally less expressive than full-blown URI paths. +Not that Matrix showed a tendency to produce many classes of objects that would +warrant a dedicated sigil but that cannot be ruled out. Rather than rely +on the institute of sigils, this proposal gives an alternative more +extensible syntax that can be used for more advanced cases - as a uniform way +to represent arbitrary sub-objects (with or without Matrix identifier) such as +user profiles or a notifications feed for the room - and also, if ever needed, +as an escape hatch to a bigger namespace if we hit shortage of sigils. + +The current proposal is also flexible enough to even incorporate the minimal +syntax of this option as an alternative to its own notation - e.g., a further +MSC could enable `matrix:id/%23matrix:matrix.org` as a synonym for +`matrix:room/matrix:matrix.org`. 
## Potential issues Despite the limited functionality of URIs as proposed in this MSC, Matrix authors are advised to use tools that would process URIs just -like an http(s) URI instead of making a quick home-baked -parsers/emitters. Even with that in mind, not all tools normalise and -sanitise all cases in a fully RFC-compliant way. This MSC tries to keep -the required transformations to the minimum and will likely not bring -much grief even with naive implementations; however, as functionality -of Matrix URI grows, the number of corner cases will increase. +like an http(s) URI instead of making home-baked parsers/emitters. +Even with that in mind, not all tools normalise and sanitise all cases +in a fully RFC-compliant way. This MSC tries to keep the required +transformations to the minimum and will likely not bring much grief even +with naive implementations; however, as functionality of Matrix URI grows, +the number of corner cases will increase. -## Security considerations +## Security/privacy considerations This MSC mostly builds on RFC 3986 but tries to reduce the scope as much as possible. Notably, it avoids introducing complex traversable -structures and defines the URI grammar quite strictly. Notably, -dot path segments (`.` and `..`), while potentially useful when URIs -become richer, would come too much ahead of time for now. Care is taken -to not make essential parts of the URI omittable to avoid +structures and further restricts the URI grammar to the necessary subset. +In particular, dot path segments (`.` and `..`), while potentially useful +when URIs become richer, would come too much ahead of time for now. Care +is taken to not make essential parts of the URI omittable to avoid even accidental misrepresentation of a local resource for a remote one -in Matrix and vice versa. 
One part still omittable - authority - can -result in leaks if a non-fully-qualified URI from one closed federation -is sent to another closed federation; this is considered an extremely -corner case. +in Matrix and vice versa. The MSC intentionally doesn't support conveying any kind of user information in URIs. +The MSC strives to not be prescriptive in treating URIs except the `action` +query parameter. Actions without user confirmation may lead to unintended +leaks of certain metadata so this MSC recommends to ask for a user consent - +recognising that not all clients are in position for that. + ## Conclusion @@ -503,7 +624,7 @@ opening a browser while clicking a `tg:` link dumped to the terminal application will open the correct application for Telegram without user intervention or can even offer to install one, if needed). The proposed syntax makes conversion between Matrix URIs -and Matrix identifiers as easy as a bunch of regular expressions; so -even though client-side processing of URIs might not be optimal -longer-term, it's a very simple and quick way that allows plenty -of experimentation early on. +and Matrix identifiers as easy as a bunch of string comparisons or +regular expressions; so even though client-side processing of URIs +might not be optimal longer-term, it's a very simple and quick way +that allows plenty of experimentation early on. 
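A minimal Python sketch of the reference parsing algorithm introduced in the patch above, for illustration only and not part of the proposal. The names `parse_matrix_uri`, `ParsedMatrixUri` and `SIGILS` are hypothetical. Three choices are assumptions that the algorithm text does not spell out: "exit parsing" is modelled as returning `None` and "fail parsing" as raising `ValueError`; an authority part, if present, is skipped (the patch reserves it for future MSCs); and a 4-segment path under a `user` or `group` type is rejected.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import unquote, urlsplit

# Type specifier -> sigil, as listed in the parsing algorithm.
SIGILS = {"user": "@", "roomid": "!", "room": "#", "group": "+"}


@dataclass
class ParsedMatrixUri:  # hypothetical result type, not defined by the MSC
    mxid: str                        # top-level identifier, e.g. "#room:example.org"
    event_id: Optional[str] = None   # "$..." if the URI points to an event in a room
    via: List[str] = field(default_factory=list)  # routing hints from via= items
    action: Optional[str] = None     # value of the last action= item, lower-cased


def parse_matrix_uri(uri: str) -> Optional[ParsedMatrixUri]:
    """Return None if `uri` is not a Matrix URI; raise ValueError if it is invalid."""
    parts = urlsplit(uri)  # RFC 3986 component split, left to the standard library
    if parts.scheme.lower() != "matrix":
        return None

    path = parts.path
    # Assumption: the authority part is ignored; drop the "/" that follows it.
    if parts.netloc and path.startswith("/"):
        path = path[1:]

    # Split before percent-decoding so an encoded "/" stays inside its segment.
    segments = path.split("/")
    if len(segments) not in (2, 4):
        raise ValueError("expected 2 or 4 path segments")

    sigil = SIGILS.get(segments[0].lower())  # comparisons are case-insensitive
    if sigil is None:
        raise ValueError("unknown type specifier")
    if not segments[1]:
        raise ValueError("empty identifier")
    result = ParsedMatrixUri(mxid=sigil + unquote(segments[1]))

    if len(segments) == 4:
        # Assumption: only room ids and aliases may carry an inner event.
        if sigil not in ("!", "#"):
            raise ValueError("only rooms can contain events")
        if segments[2].lower() != "event" or not segments[3]:
            raise ValueError("malformed event part")
        result.event_id = "$" + unquote(segments[3])

    for item in parts.query.split("&"):
        key, sep, value = item.partition("=")
        if not sep:
            continue  # empty items and items without "=" are ignored
        if key.lower() == "via":
            result.via.append(unquote(value))
        elif key.lower() == "action":
            result.action = unquote(value).lower()  # the last action= item wins
    return result
```

For example, `parse_matrix_uri("matrix:roomid/rid:example.org?action=join&via=example2.org")` yields the identifier `!rid:example.org` with `via` set to `["example2.org"]` and `action` set to `"join"`. Acting on the result (joining, opening a chat, asking the user for confirmation) is left to the client.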
From bcf5585e5594f39c953571399bfe2f9dbd79978d Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Thu, 23 Jul 2020 15:51:52 +0200 Subject: [PATCH 05/29] Remove authority from the example for unfederated room Co-authored-by: Mayeul Cantan --- proposals/2312-matrix-uri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index cac300a7..98a3b3a7 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -53,7 +53,7 @@ Examples: * Room `#someroom:example.org`: `matrix:room/someroom:example.org` * Unfederated room `#internalroom:internal.example.org`: - `matrix://internal.example.org/room/internalroom:internal.example.org` + `matrix:room/internalroom:internal.example.org`. This can only be resolved by a client connected to the appropriate server, likely internal.example.org in this case, so it can be hinted at: `matrix:room/internalroom:internal.example.org&via=internal.example.org`. See the [Discussion points and tradeoffs](#Discussion points and tradeoffs) section. * Event in a room: `matrix:room/someroom:example.org/event/Arbitrary_Event_Id` * [A commit like this](https://github.com/her001/steamlug.org/commit/2bd69441e1cf21f626e699f0957193f45a1d560f) From 294f30f3abbf5835bdfd1e2311de252a39311ca3 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sat, 25 Jul 2020 11:55:18 +0200 Subject: [PATCH 06/29] Reword, as per review Co-authored-by: Mayeul Cantan --- proposals/2312-matrix-uri.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 98a3b3a7..068fb10d 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -339,12 +339,11 @@ comparisons are case-INsensitive. 1. Split the `query` into items separated by `&` character; several subsequent `&` characters delimit empty items, ignored by this algorithm. - a. 
If `query` contains one or more items starting with `via=` - treat - the rest of each this item after `via=` as a percent-encoded homeserver - name to be used in + a. If `query` contains one or more items starting with `via=`: for each item, treat + the rest of the string as a percent-encoded homeserver name to be used in [routing](https://matrix.org/docs/spec/appendices#routing). - b. If `query` contains one or more items starting with `action=` - treat + b. If `query` contains one or more items starting with `action=`: treat _the last_ such item as an instruction for joining or opening a direct chat, as per this proposal. From 7b574448db39502b44b155049b61b89e3bad95ad Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sat, 25 Jul 2020 12:42:52 +0200 Subject: [PATCH 07/29] Fix a broken link Co-authored-by: Denis Kasak --- proposals/2312-matrix-uri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 068fb10d..ff5cfe4e 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -53,7 +53,7 @@ Examples: * Room `#someroom:example.org`: `matrix:room/someroom:example.org` * Unfederated room `#internalroom:internal.example.org`: - `matrix:room/internalroom:internal.example.org`. This can only be resolved by a client connected to the appropriate server, likely internal.example.org in this case, so it can be hinted at: `matrix:room/internalroom:internal.example.org&via=internal.example.org`. See the [Discussion points and tradeoffs](#Discussion points and tradeoffs) section. + `matrix:room/internalroom:internal.example.org`. This can only be resolved by a client connected to the appropriate server, likely internal.example.org in this case, so it can be hinted at: `matrix:room/internalroom:internal.example.org?via=internal.example.org`. See the [Discussion points and tradeoffs](#discussion-points-and-tradeoffs) section. 
* Event in a room: `matrix:room/someroom:example.org/event/Arbitrary_Event_Id` * [A commit like this](https://github.com/her001/steamlug.org/commit/2bd69441e1cf21f626e699f0957193f45a1d560f) From c60368338c346d787b22841f446297561790ad48 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sat, 25 Jul 2020 12:44:23 +0200 Subject: [PATCH 08/29] Apply suggestions from code review Co-authored-by: Mayeul Cantan --- proposals/2312-matrix-uri.md | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index ff5cfe4e..fcc8a200 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -252,10 +252,7 @@ describes two possible actions: * `action=join` is only valid in a URI resolving to a Matrix room; applications MUST ignore it if found in other contexts and MUST NOT generate it for other Matrix resources. This action means that client applications - SHOULD attempt to join it using the standard CS API means. Client - applications SHOULD ask for user confirmation before joining if the user - is neither a joined member of the room nor of its immediate - [successor or predecessor](https://matrix.org/docs/spec/client_server/latest#module-room-upgrades). + SHOULD attempt to join it using the standard CS API means. * `action=chat` is only valid in a URI resolving to a Matrix user; applications MUST ignore it if found in other contexts and MUST NOT generate it for other Matrix resources. A URI with this action means that a client application @@ -344,8 +341,7 @@ comparisons are case-INsensitive. [routing](https://matrix.org/docs/spec/appendices#routing). b. If `query` contains one or more items starting with `action=`: treat - _the last_ such item as an instruction for joining or opening a direct - chat, as per this proposal. + _the last_ such item as an instruction, as this proposal defines in [query](#query). 
The classes of URIs and corresponding default actions (along with relevant CS API calls) are collected as follows. This is non-normative and just provides From eeb5ce222cc4fd5bdd82054b1bc527be4c7366ed Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sat, 25 Jul 2020 17:00:30 +0200 Subject: [PATCH 09/29] Intro: rephrase a paragraph as per review --- proposals/2312-matrix-uri.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index fcc8a200..63e79921 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -37,13 +37,14 @@ To cover the use cases above, the following scheme is proposed for Matrix URIs ```text matrix:[//{authority}/]{type}/{id without sigil}[/{more type/id pairs}][?{query}] ``` -with `type` being one of: `user`, `room`, `roomid`, `event` and -`group`; and the only `query` defined for now is `action=join`. The Matrix -identifier (or identifiers) can be reconstructed from the URI by taking -the sigil that corresponds to `type` and appending `id without sigil` to it. -The query may alter the kind of request with respect to the Matrix resource; -and Matrix resources can be contained in each other, in which care the -`more type/id pairs` series is used to reconstruct inner identifiers. +with `type` defining the resource type (such as `user` or `roomid` - see +the "Path" section in the proposal) and `query` containing additional hints +or request details on the Matrix entity (see "Query" in the proposal). +The Matrix identifier (or identifiers) can be reconstructed from the URI by +taking the sigil that corresponds to `type` and appending `id without sigil` +to it. To support a hierarchy of Matrix resources, `more type/id pairs` series +is used to reconstruct inner identifiers (as of now, there can be only one +inner identifier, pointing to an event in a room). 
This proposal defines initial mapping of URIs to Matrix identifiers and actions on corresponding resources; the scheme and mapping are subject From c3329fe64413374e520b99b62633ed0fbe260685 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sat, 25 Jul 2020 18:59:06 +0200 Subject: [PATCH 10/29] Wording/grammar fixes from code review Co-authored-by: Denis Kasak --- proposals/2312-matrix-uri.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 63e79921..b3a4727e 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -101,7 +101,7 @@ around MUST/SHOULD/MAY): MUST NOT have Matrix URIs that are equal in [RFC 3986](https://tools.ietf.org/html/rfc3986) sense (but two distinct Matrix URIs MAY map to the same Matrix identifier). -1. The following classes MUST be supported: +1. References to the following entities MUST be supported: 1. User IDs (`@user:example.org`) 1. Room IDs (`!roomid:example.org`) 1. Room aliases (`#roomalias:example.org`) @@ -407,7 +407,7 @@ further discussion should happen in GitHub comments. `//` if the URI doesn't have an authority component. In other words, `//` implies a centre of authority, and the (public) Matrix federation is not supposed to have one; hence no `//` in most URIs. -1. _Why type specifiers use singular rather than plural +1. _Why do type specifiers use singular rather than plural as is common in RESTful APIs?_ Unlike in actual RESTful APIs, this MSC does not see `rooms/` or `users/` as collections to browse. The type specifier completes @@ -428,7 +428,7 @@ further discussion should happen in GitHub comments. further below, the current proposal provides enough space to define synonyms; this may need some canonicalisation service from homeservers so that we don't have to enable synonyms at each client individually. -1. _Why event URI cannot use the fragment part for the event id?_ +1. 
_Why can an event URI not use the fragment part for the event ID?_ Because fragment is a part processed exclusively by the client in order to navigate within a larger document, and room cannot be considered a "document". Each event can be retrieved from the server From b4b917c6c89f2986e3850aae58eca9bee1ba6598 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Sat, 25 Jul 2020 19:00:56 +0200 Subject: [PATCH 11/29] Another wording fix --- proposals/2312-matrix-uri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index b3a4727e..9539b89e 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -422,7 +422,7 @@ further discussion should happen in GitHub comments. Not in this MSC but that can be considered in the future. 1. _Why not Reddit-style single-letter type specifiers? That's almost as compact as a sigil, still pretty clearly conveys the type, - and nicely avoids the confusion described in the previous question._ + and nicely avoids the singular vs. plural confusion._ Reddit-style prefixes would eventually produce bigger ambiguity as primary notation; but they can be handy as shortcuts. As discussed further below, the current proposal provides enough space to define From 438e5a368f4f6647353bc025b599d82e9a35049a Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 29 Sep 2020 21:11:45 +0200 Subject: [PATCH 12/29] Add URI operations and construction algorithm --- proposals/2312-matrix-uri.md | 103 +++++++++++++++++++---------------- 1 file changed, 57 insertions(+), 46 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 9539b89e..dbd80ef1 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -276,14 +276,9 @@ above is only a part of the grammar as defined in the respective section; it is not proposed here to generate or read the authority part of the URI. 
-### URI semantics and parsing algorithm +### Recommended implementation -The main purpose of a Matrix URI is accessing the resource specified by the -identifier, and the primary action is loading the contents of a document -corresponding to a given resource. This MSC defines the "default" action -that a client application SHOULD perform when the user activates -(e.g. clicks on) a URI; further MSCs may introduce additional actions enabled -either by passing an `action` value in the query part, or by other means. +#### URI parsing algorithm The reference algorithm of parsing a Matrix URI follows. Note that, although clients are encouraged to use lower-case strings in their URIs, all string @@ -297,7 +292,7 @@ comparisons are case-INsensitive. their language and SDK, rather than implement it from scratch based on RFCs. 1. Check that `scheme name` is exactly `matrix`, case-insensitive. If - the scheme name doesn't match, exit parsing: this is not Matrix URI. + the scheme name doesn't match, exit parsing: this is not a Matrix URI. 1. Split the `path` into segments separated by `/` character; several subsequent `/` characters delimit empty segments, as advised by RFC 3986. @@ -327,7 +322,8 @@ comparisons are case-INsensitive. a. Pick the next (3rd) path segment: - if the segment is exactly `event`, proceed; - - otherwise, including the case of an empty segment, fail parsing. + - otherwise, including the case of an empty segment (trailing `/`, e.g.), + fail parsing. b. Pick the next (4th) leftmost path segment: - if the segment is empty, fail parsing; @@ -338,45 +334,35 @@ comparisons are case-INsensitive. `&` characters delimit empty items, ignored by this algorithm. a. If `query` contains one or more items starting with `via=`: for each item, treat - the rest of the string as a percent-encoded homeserver name to be used in + the rest of the item as a percent-encoded homeserver name to be used in [routing](https://matrix.org/docs/spec/appendices#routing). b. 
If `query` contains one or more items starting with `action=`: treat _the last_ such item as an instruction, as this proposal defines in [query](#query). -The classes of URIs and corresponding default actions (along with relevant -CS API calls) are collected as follows. This is non-normative and just provides -a reference and examples of handling various kinds of URIs. - -* User ID: - - URI example: `matrix:user/me:example.org` or - (decentralised user id, future) `matrix:user/me_in_matrix` - - Possible default actions: - - Show user profile - (`GET /profile/@me:example.org/...`, `GET /profile/@me_in_matrix/...`); - - Mention the user in the current room (client-local operation) -* Room ID: - - URI example: `matrix:roomid/rid:example.org` or - (decentralised id, future) `matrix:roomid/lol823y4bcp3qo4` - - Default action: attempt to "open" the room (usually means the client - at least displays the room timeline at the latest or - the last remembered position - `GET /rooms/!rid:example.org/...`, - `GET /rooms/!lol823y4bcp3qo4/...`) -* Joining by Room ID: - - URI example: `matrix:roomid/rid:example.org?action=join&via=example2.org` - - Default action: if needed (see the section about the query part) ask - the user to confirm the intention; then join the room - (`POST /join/!rid:example.org?server_name=example2.org`) -* Room alias: - - URI example: `matrix:room/us:example.org` - - Default action: resolve the alias to room id - (`GET /directory/room/#us:example.org` if needed) and attempt to "open" - the room (same as above) -* Joining by Room alias: - - URI example: `matrix:room/us:example.org?action=join` - - Default action: if needed (see the section about the query part) ask - the user to confirm the intention; then join the room - (`POST /join/#us:example.org`) +#### Operations on Matrix URIs + +The main purpose of a Matrix URI is accessing the resource specified by the +identifier. 
This MSC defines the "default" operation
+([in the sense of RFC 7595](https://tools.ietf.org/html/rfc7595#section-3.4))
+that a client application SHOULD perform when the user activates
+(e.g. clicks on) a URI; further MSCs may introduce additional operations
+enabled either by passing an `action` value in the query part, or by other
+means.
+
+The classes of URIs and corresponding default operations (along with relevant
+CS API calls) are collected below. The table assumes that the operations are
+performed on behalf (using the access token) of the user `@me:example.org`:
+
+| URI class/example | Interactive operation | Non-interactive operation / Involved CS API |
+| ----------------- | --------------------- | --------------------------------------------- |
+| User Id (no `action` in URI):<br/>`matrix:user/her:example.org` | _Outside the room context:_ show user profile<br/>_Inside the room context:_ mention the user in the current room (client-local operation) | No default non-interactive operation<br/>`GET /profile/@her:example.org/displayname`<br/>`GET /profile/@her:example.org/avatar_url` |
+| User Id (`action=chat`):<br/>`matrix:user/her:example.org?action=chat` | Open a direct chat with the user (see the next column on identifying the room) | If [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) are supported: `GET /_matrix/client/r0/user/@me:example.org/dm?involves=@her:example.org`<br/>Without canonical direct chats:<br/>1. `GET /user/@me:example.org/account_data/m.direct`<br/>2. Find the room id for `@her:example.org` in the event content<br/>3. If found, return this room id; if not, `POST /createRoom` with `"is_direct": true` and return the id of the created room |
+| Room (no `action` in URI):<br/>`matrix:roomid/rid:example.org`<br/>`matrix:room/us:example.org` | Attempt to "open" the room (usually: display the timeline at the latest or last remembered position) | No default non-interactive operation<br/>`GET /rooms/!rid:example.org/...`<br/>the respective room section of `GET /sync` |
+| Room (`action=join`):<br/>`matrix:roomid/rid:example.org?action=join&via=example2.org`<br/>`matrix:room/us:example.org?action=join` | Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`<br/>`POST /join/#us:example.org` |
+| Event:<br/>`matrix:room/us:example.org/event/lol823y4bcp3qo4`<br/>`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)<br/>2. Attempt to retrieve (see the next column) and display the event;<br/>3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user the option to join the room and then try to open the event again | `GET `<br/>`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org` |
+| Group:<br/>`matrix:group/them:matrix.org` | Reserved for future use |
+
 * Event ID (as used in [room version 3](https://matrix.org/docs/spec/rooms/v3) and later):
   - URI example (aka "event permalink"):
@@ -392,9 +378,34 @@ a reference and examples of handling various kinds of URIs.
 1. Otherwise try to retrieve the event in the same way but in case of access failure the client MAY offer the user to join the room; if the user agrees and joining succeeds, retry the step above.
-* Group ID:
-  - URI example: `matrix:group/them:matrix.org`
-  - Default action: reserved for future use
+
+#### URI construction algorithm
+
+The following algorithm assumes a Matrix identifier that follows
+the high-level grammar described in the specification. Clients MUST ensure
+compliance of identifiers passed to this algorithm.
+
+For room and user identifiers (including room aliases):
+1. Remove the sigil character from the identifier and match it against
+   the following list to produce `prefix-1`:
+   - `@` -> `user/`
+   - `!` -> `roomid/`
+   - `#` -> `room/`
+   - `+` -> `group/`
+2. Build the Matrix URI as a concatenation of:
+   - literal `matrix:`
+   - `prefix-1`
+   - the remainder of identifier (`id without sigil`).
+
+For event identifiers (assuming they need the room context, see
+[MSC2695](https://github.com/matrix-org/matrix-doc/pull/2695) and
+[MSC 2779](https://github.com/matrix-org/matrix-doc/issues/2779) that
+may change this):
+1. Take the event's room id or canonical alias and build a Matrix URI for them
+   as described above.
+2. Append to the result of previous step:
+   - literal `event/`
+   - the event id with the sigil (`$`) removed.
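The construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `build_matrix_uri` and `_encode` are hypothetical, the `/` separator before `event/` is inferred from the path grammar, and percent-encoding of the identifier remainder (keeping RFC 3986 unreserved characters, sub-delimiters, `:` and `@` literal) is added here as a precaution rather than taken from the steps themselves.

```python
from urllib.parse import quote

# Sigil-to-prefix mapping, copied from step 1 of the algorithm.
SIGIL_TO_PREFIX = {"@": "user/", "!": "roomid/", "#": "room/", "+": "group/"}


def _encode(id_without_sigil):
    # Assumption of this sketch: leave RFC 3986 sub-delimiters, ":" and "@"
    # literal (unreserved characters are never quoted by quote());
    # everything else, notably "/", gets percent-encoded.
    return quote(id_without_sigil, safe="!$&'()*+,;=:@")


def build_matrix_uri(mxid, event_id=None):
    """Build a Matrix URI for a user/room identifier, optionally with an event."""
    sigil, rest = mxid[:1], mxid[1:]
    if sigil not in SIGIL_TO_PREFIX or not rest:
        raise ValueError("not a supported Matrix identifier: %r" % mxid)
    uri = "matrix:" + SIGIL_TO_PREFIX[sigil] + _encode(rest)
    if event_id is not None:
        # Events need the room context: only room ids and aliases qualify.
        if sigil not in ("!", "#"):
            raise ValueError("an event URI needs a room id or alias")
        if not event_id.startswith("$") or len(event_id) < 2:
            raise ValueError("not an event id: %r" % event_id)
        # The "/" separator is implied by the path grammar.
        uri += "/event/" + _encode(event_id[1:])
    return uri
```

For instance, `build_matrix_uri("#someroom:example.org")` gives `matrix:room/someroom:example.org`, matching the example in the introduction of the proposal.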
## Discussion points and tradeoffs From 907bd2e2a2748fac2bdad2dc0b18c643fbcad053 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 29 Sep 2020 21:29:44 +0200 Subject: [PATCH 13/29] Rewording, clarifications, minor fixes --- proposals/2312-matrix-uri.md | 146 +++++++++++++++++++---------------- 1 file changed, 81 insertions(+), 65 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index dbd80ef1..75c56592 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -13,69 +13,80 @@ outside of Matrix and then resolved in a uniform way - matching URLs in World Wide Web. Specific use cases include: -1. Representation in UI: as a Matrix user I want to refer to Matrix entities - in the same way as for web pages, so that others could “follow the link” - I sent them (not necessarily through Matrix, it can be, e.g., a web page or - email) in order to access the referred resource. +1. Representation: as a Matrix user I want to refer to Matrix entities + in the same way as for web pages, so that others could unambiguously identify + the resource, regardless of the context or used medium to identify it to them + (within or outside Matrix, e.g., in a web page or an email message). 1. Inbound integration: as an author of Matrix software, I want to have a way to invoke my software from the operating environment to resolve a Matrix URI passed from another program. This is a case of, e.g., - opening a Matrix client by clicking on a link in an email program. + opening a Matrix client by clicking on a link from an email message. 1. Outbound integration: as an author of Matrix software, I want to have a way - to export identifiers of Matrix resources as URIs to non-Matrix environment + to export identifiers of Matrix resources to non-Matrix environment so that they could be resolved in another time-place in a uniform way. - This is a case of "Share via…" action in a mobile Matrix client. 
+ An example of this case is the "Share via…" action in a mobile Matrix client. -https://matrix.to somehow compensates for the lack of dedicated URIs; however: -* it addresses use case (1) in a somewhat clumsy way (resolving a link needs - two interactions with the user instead of one), and -* it can only deal with (2) within a web browser (basically limiting - first-class support to browser-based clients). +Matrix identifiers as defined by the current specification have a form distinct +enough from other identifiers to mostly fulfil the representation use case. +Since they are not URIs they can not cover the two integration use cases. +https://matrix.to somehow compensates for this; however: +* it requires a web browser to run JavaScript code that resolves identifiers + (basically limiting first-class support to browser-based clients), and +* it relies on matrix.to as an intermediary that provides that JavaScript code. To cover the use cases above, the following scheme is proposed for Matrix URIs (`[]` enclose optional parts, `{}` enclose variables): ```text -matrix:[//{authority}/]{type}/{id without sigil}[/{more type/id pairs}][?{query}] +matrix:[//{authority}/]{type}/{id without sigil}[/{type}/{id without sigil}...][?{query}][#{fragment}] ``` -with `type` defining the resource type (such as `user` or `roomid` - see -the "Path" section in the proposal) and `query` containing additional hints +with `{type}` defining the resource type (such as `user` or `roomid` - see +the "Path" section in the proposal) and `{query}` containing additional hints or request details on the Matrix entity (see "Query" in the proposal). -The Matrix identifier (or identifiers) can be reconstructed from the URI by -taking the sigil that corresponds to `type` and appending `id without sigil` -to it. 
To support a hierarchy of Matrix resources, `more type/id pairs` series -is used to reconstruct inner identifiers (as of now, there can be only one -inner identifier, pointing to an event in a room). +`{authority}` and `{fragment}` parts are reserved for future use; this proposal +does not define them and implementations SHOULD ignore them for now. -This proposal defines initial mapping of URIs to Matrix identifiers and actions -on corresponding resources; the scheme and mapping are subject -to further extension. +This MSC does not introduce new Matrix entities, nor API endpoints - +it merely defines a mapping between URIs with the scheme name `matrix:` +and Matrix identifiers, as well as operations on them. The MSC should be +sufficient to produce an implementation that would convert Matrix URIs to +a series of CS API calls, entirely on the client side. It is recognised, +however, that most of URI processing logic can and should (eventually) +be on the server side in order to facilitate adoption of Matrix URIs; +further MSCs are needed to define details for that, as well as to extend +the mapping to more resources (including those without equivalent +Matrix identifiers, such as room state or user profile data). + +The Matrix identifier (or identifiers) can be reconstructed from +`{id without sigil}` by prepending a sigil character corresponding to `{type}`. +To support a hierarchy of Matrix resources, more `/{type}/{id without sigil}` +pairs can be appended, identifying resources inside of other resources. +As of now, there's only one such case, with exactly one additional pair - +pointing to an event in a room. Examples: * Room `#someroom:example.org`: `matrix:room/someroom:example.org` -* Unfederated room `#internalroom:internal.example.org`: - `matrix:room/internalroom:internal.example.org`. 
This can only be resolved by a client connected to the appropriate server, likely internal.example.org in this case, so it can be hinted at: `matrix:room/internalroom:internal.example.org&via=internal.example.org`. See the [Discussion points and tradeoffs](#discussion-points-and-tradeoffs) section. +* User `@me:example.org`: + `matrix:user/me:example.org` * Event in a room: `matrix:room/someroom:example.org/event/Arbitrary_Event_Id` * [A commit like this](https://github.com/her001/steamlug.org/commit/2bd69441e1cf21f626e699f0957193f45a1d560f) could make use of a Matrix URI in the form of `{Matrix identifier}`. - -This MSC does not introduce new Matrix entities, nor API endpoints - -it merely defines a mapping from a URI with the scheme name `matrix:` -to Matrix identifiers and actions on them. It is deemed sufficient to -produce an implementation that would convert Matrix URIs to a series -of CS API calls, entirely on the client side. It is recognised, -however, that most of URI processing logic can and should (eventually) -be on the server side in order to facilitate adoption of Matrix URIs; -further MSCs are needed to define details for that. + ## Proposal -Further text uses “Matrix identifier” with a meaning of identifiers -as described by [Matrix Specification](https://matrix.org/docs/spec/), -and “Matrix URI” with a meaning of an identifier following -the RFC-compliant URI format proposed hereby. +### Definitions + +Further text uses the following terms: +- Matrix identifier - one of identifiers defined by the current +[Matrix Specification](https://matrix.org/docs/spec/appendices.html#identifier-grammar), +- Matrix URI - a uniform resource identifier proposed hereby, following + the RFC-compliant URI format. +- MUST/SHOULD/MAY etc. follow the conventions of + [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). 
+ ### Requirements @@ -88,9 +99,7 @@ The following considerations drive the requirements for Matrix URIs: you cannot rewrite them once they are released to the world. 1. Ease of implementation, allowing reuse of existing codes. -The following requirements resulted from these drivers -(see [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) for conventions -around MUST/SHOULD/MAY): +The following requirements resulted from these drivers: 1. Matrix URI MUST comply with [RFC 3986](https://tools.ietf.org/html/rfc3986) and [RFC 7595](https://tools.ietf.org/html/rfc7595). @@ -121,19 +130,18 @@ around MUST/SHOULD/MAY): room-specific user profiles). 1. The mapping MUST support decentralised as well as centralised IDs. This basically means that the URI scheme MUST have provisions - for mapping of `:` but it MUST NOT require + for mapping of identifiers with `:` but it MUST NOT require `:` to be there. -1. Matrix URI SHOULD allow encoding of action requests such as joining - a room. +1. Matrix URI SHOULD allow encoding of action requests such as joining a room. 1. Matrix URI SHOULD have a human-readable, if not necessarily human-friendly, representation - to allow visual sanity-checks. In particular, characters escaping/encoding should be reduced to bare minimum in that representation. As a food for thought, see [Wikipedia: Clean URL, aka SEF URL](https://en.wikipedia.org/wiki/Clean_URL) and - [a very relevant use case from RFC 3986](https://tools.ietf.org/html/rfc3986#section-1.2.1). + [a use case from RFC 3986](https://tools.ietf.org/html/rfc3986#section-1.2.1). 1. It SHOULD be easy to parse Matrix URI in popular programming languages: e.g., one should be able to use `parseUri()` - to dissect a Matrix URI in JavaScript. + to dissect a Matrix URI into components in JavaScript. 1. The mapping SHOULD be consistent across different classes of Matrix identifiers. 1. 
The mapping SHOULD support linking to unfederated servers/networks @@ -141,8 +149,10 @@ around MUST/SHOULD/MAY): [matrix-doc#2309](https://github.com/matrix-org/matrix-doc/issues/2309) that calls for such linking). -The syntax and mapping discussed below meet all these requirements. -Further extensions MUST comply to them as well. +The syntax and mapping discussed below meet all these requirements except +the last one that will be addressed separately. +Further extensions MUST NOT reduce the supported set of requirements. + ### Syntax and high-level processing @@ -156,8 +166,8 @@ hier-part = [ “//” authority “/” ] path As mentioned above, this MSC assumes client-side URI processing (i.e. mapping to Matrix identifiers and CS API requests). However, even when URI processing is shifted to the server side -the client will still have to parse the URI at least to find -the authority part (or lack of it) and remove the fragment part +the client will still have to parse the URI at least to remove +the authority and fragment parts (if either exists) before sending the request to the server (more on that below). #### Scheme name @@ -179,14 +189,16 @@ references](https://tools.ietf.org/html/rfc3986#section-4.2) and omitting the scheme name makes them indistinguishable from a local path that might have nothing to do with Matrix. Clients MUST NOT try to parse pieces like `room/MyRoom:example.org` as Matrix URIs; instead, -users should be encouraged to use Matrix IDs for in-text references -(`#MyRoom:example.org`) and client applications should do -the heavy-lifting of turning them into hyperlinks to Matrix URIs. +users should be encouraged to use Matrix identifiers for in-text references +(`#MyRoom:example.org`) and client applications SHOULD turn them into +hyperlinks to Matrix URIs. #### Authority -The authority part is used for the specific case of getting access -to a Matrix resource (such as a room or a user) through a given server. 
+The authority part will eventually be used to indicate access to a Matrix +resource (such as a room or a user) specifically through a given server, +addressing a case described in +[matrix-org/matrix-doc#2309](https://github.com/matrix-org/matrix-doc/issues/2309). ```text authority = host [ “:” port ] ``` @@ -200,10 +212,11 @@ Clients SHOULD NOT use data from the authority part other than for experimental or further research purposes. #### Path -Unlike the very wide definition of path in RFC 3986, this MSC -restricts the path component of a Matrix URI to a simple -pattern that allows to easily reconstruct a Matrix identifier or -a chain of identifiers: +This MSC restricts +[the very wide definition of path in RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.3), +to a simple pattern that allows to easily reconstruct a Matrix identifier or +a chain of identifiers and also to locate a certain sub-resource in the scope +of a given Matrix entity: ```text path = type “/” id-without-sigil [“/” path] type = “user” / “roomid” / “room” / “event” / “group” @@ -252,18 +265,21 @@ the Matrix entity, a certain action is requested on it. This proposal describes two possible actions: * `action=join` is only valid in a URI resolving to a Matrix room; applications MUST ignore it if found in other contexts and MUST NOT generate - it for other Matrix resources. This action means that client applications - SHOULD attempt to join it using the standard CS API means. + it for other Matrix resources. This action means that a client application + SHOULD attempt to join the room specified by the URI path using the standard + CS API means. * `action=chat` is only valid in a URI resolving to a Matrix user; applications MUST ignore it if found in other contexts and MUST NOT generate - it for other Matrix resources. A URI with this action that a client application - SHOULD open a direct chat window with the user; clients supporting + it for other Matrix resources. 
This action means that a client application + SHOULD open a direct chat window with the user specified by the URI path; + clients supporting [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) SHOULD open the canonical direct chat. For both actions, where applicable, client applications SHOULD ask for user confirmation or at least make the user aware if the action leads -to joining or creating a new room rather than switching to a prior one. +to joining or creating a new room rather than to opening a room that the user +already has in the room list. The routing query (`via=`) indicates servers that are likely involved in the room (see also @@ -459,7 +475,7 @@ further discussion should happen in GitHub comments. the authority part in generated URIs for a given user account.~~ Use `via=` in order to point to a homeserver in the closed federation. The authority part may eventually be used for that but further discussion - is needed on how clients should support without compromising privacy + is needed on how clients should support it without compromising privacy (see https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282 for the original concern). 
From bbc8bfc1c096638ead486c4bd41a4c4bc3c42db4 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 29 Sep 2020 21:30:03 +0200 Subject: [PATCH 14/29] Rework the path grammar for better clarity --- proposals/2312-matrix-uri.md | 46 +++++++++++++++++++++++------------- 1 file changed, 30 insertions(+), 16 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 75c56592..aab1a670 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -218,23 +218,40 @@ to a simple pattern that allows to easily reconstruct a Matrix identifier or a chain of identifiers and also to locate a certain sub-resource in the scope of a given Matrix entity: ```text -path = type “/” id-without-sigil [“/” path] -type = “user” / “roomid” / “room” / “event” / “group” -id-without-sigil = string ; see below +path = entity-descriptor [“/” entity-descriptor] +entity-descriptor = nonid-segment / type-qualifier id-without-sigil +non-id-segment = segment-nz ; as defined in RFC3986, see also below +type-qualifier = segment-nz “/” ; as defined in RFC3986, see also below +id-without-sigil = string ; as defined in Matrix identifier spec, see below ``` -The path component consists of 1 or more type-id pairs separated -by slash character both inside the pair and between pairs. While most -of the URIs considered in this MSC do not need any sort of hierarchy, -one case is standing out: as of now, events require rooms to be -resolved so an event URI for `$eventid` in the room -`!roomid:example2.org` would be -`matrix:roomid/roomid:example2.org/event/eventid`. - -This MSC defines the following type specifiers: +The path component consists of 1 or more descriptors separated by a slash +(`/`) character. This is a generic pattern intended for reuse in future +extensions. + +This MSC only proposes mappings along `type-qualifier id-without-sigil` syntax; +`nonid-segment` is unused and reserved for future use. 
+For the sake of integrity future `nonid-segment` extensions must follow +[the ABNF for `segment-nz` as defined in RFC3986](https://tools.ietf.org/html/rfc3986#appendix-A). + +This MSC defines the following `type` specifiers: `user` (user id, sigil `@`), `roomid` (room id, sigil `!`), `room` (room alias, sigil `#`), and `event` (event id, sigil `$`). The type `group` (group/community id, sigil `+`) is reserved for future use. +As of this MSC, `user`, `roomid`, `room`, and `group` can only be at the top +level. The type `event` can only be used on the 2nd level and only under `room` +or `roomid`; this is driven by the current shape of Client-Server API that +does not provide a non-deprecated way to retrieve an event without knowing +the room (see [MSC2695](https://github.com/matrix-org/matrix-doc/pull/2695) and +[MSC 2779](https://github.com/matrix-org/matrix-doc/issues/2779) that may +change this). + +Further MSCs may introduce navigation to more top-level as well as +non-top-level objects; see "Further evolution" for some ideas. These new +proposals SHOULD follow the generic grammar laid out above, adding new `type` +and `nonid-segment` specifiers and/or allowing them in other levels, rather +than introduce a new grammar. + `id-without-sigil` is defined as the `string` part of Matrix [Common identifier format](https://matrix.org/docs/spec/appendices#common-identifier-format) with percent-encoded characters that are NEITHER unreserved, sub-delimiters, `:` nor `@`, @@ -242,11 +259,8 @@ with percent-encoded characters that are NEITHER unreserved, sub-delimiters, `:` This notably exempts `:` from percent-encoding but includes `/`. See the rationale behind dropping sigils and the respective up/downsides in -"Discussion points and tradeoffs" below. +"Discussion points and tradeoffs" as well as "Alternatives" below. -Further MSCs may introduce navigation to more top-level as well as -non-top-level objects; see "Further evolution" for some ideas. 
- #### Query Matrix URI can optionally have From 6f082bbe125eb5902213bef728ebfad82acca958 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Wed, 30 Sep 2020 08:04:02 +0200 Subject: [PATCH 15/29] Be clearer about percent-en/decoding --- proposals/2312-matrix-uri.md | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index aab1a670..fa65e1d4 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -243,7 +243,7 @@ level. The type `event` can only be used on the 2nd level and only under `room` or `roomid`; this is driven by the current shape of Client-Server API that does not provide a non-deprecated way to retrieve an event without knowing the room (see [MSC2695](https://github.com/matrix-org/matrix-doc/pull/2695) and -[MSC 2779](https://github.com/matrix-org/matrix-doc/issues/2779) that may +[MSC2779](https://github.com/matrix-org/matrix-doc/issues/2779) that may change this). Further MSCs may introduce navigation to more top-level as well as @@ -343,8 +343,8 @@ comparisons are case-INsensitive. b. Pick the next (2nd) leftmost path segment: - if the segment is empty, fail parsing; - - otherwise, percent-decode the segment and make `mxid-1` by - concatenating `sigil-1` and the result of percent-decoding. + - otherwise, percent-decode the segment (unless the initial URI parse + has already done that) and make `mxid-1` by prepending `sigil-1`. 1. If `sigil-1` is `!` or `#` and the URI path has exactly 4 segments, it may be possible to construct the 2nd-level Matrix identifier to @@ -357,8 +357,8 @@ comparisons are case-INsensitive. b. Pick the next (4th) leftmost path segment: - if the segment is empty, fail parsing; - - otherwise, percent-decode the segment and make `mxid-2` by - prepending `$` to the result of percent-decoding. 
+ - otherwise, percent-decode the segment (unless the initial URI parse + has already done that) and make `mxid-2` by prepending `$`. 1. Split the `query` into items separated by `&` character; several subsequent `&` characters delimit empty items, ignored by this algorithm. @@ -370,6 +370,9 @@ comparisons are case-INsensitive. b. If `query` contains one or more items starting with `action=`: treat _the last_ such item as an instruction, as this proposal defines in [query](#query). +Clients MUST implement proper percent-decoding of the identifiers; there's no +liberty similar to that of matrix.to. + #### Operations on Matrix URIs The main purpose of a Matrix URI is accessing the resource specified by the @@ -423,19 +426,23 @@ For room and user identifiers (including room aliases): - `#` -> `room/` - `+` -> `group/` 2. Build the Matrix URI as a concatenation of: - - literal `matrix:` - - `prefix-1` - - the remainder of identifier (`id without sigil`). + - literal `matrix:`; + - `prefix-1`; + - the remainder of identifier (`id without sigil`), percent-encoded as per + [RFC 3986](https://tools.ietf.org/html/rfc3986). For event identifiers (assuming they need the room context, see [MSC2695](https://github.com/matrix-org/matrix-doc/pull/2695) and -[MSC 2779](https://github.com/matrix-org/matrix-doc/issues/2779) that +[MSC2779](https://github.com/matrix-org/matrix-doc/issues/2779) that may change this): 1. Take the event's room id or canonical alias and build a Matrix URI for them as described above. 2. Append to the result of previous step: - - literal `event/` - - the event id with the sigil (`$`) removed. + - literal `event/`; + - the event id after removing the sigil (`$`) and percent-encoding. + +Clients MUST implement proper percent-encoding of the identifiers; there's no +liberty similar to that of matrix.to. 
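As a rough illustration of the parsing steps with the percent-decoding rule amended above, here is a minimal Python sketch. It relies on the standard library's `urllib.parse.urlsplit` for the generic RFC 3986 split, which does not percent-decode, so decoding happens explicitly per segment. The function name `parse_matrix_uri`, the layout of the returned dictionary, rejecting paths that do not have exactly 2 or 4 segments, and rejecting the reserved `group` type are all assumptions of this sketch, not requirements of the proposal.

```python
from urllib.parse import unquote, urlsplit

# Type-to-sigil mapping; "group" is reserved, so this sketch rejects it
# (an assumption, see the lead-in).
TYPE_TO_SIGIL = {"user": "@", "roomid": "!", "room": "#"}


def parse_matrix_uri(uri):
    """Return a dict with mxid/event_id/via/action, or None if not a Matrix URI."""
    parts = urlsplit(uri)  # urlsplit lower-cases the scheme for us
    if parts.scheme != "matrix":
        return None
    path = parts.path
    # The authority part is reserved; drop the "/" that follows it, if any.
    if parts.netloc and path.startswith("/"):
        path = path[1:]
    segments = path.split("/")
    if len(segments) not in (2, 4):
        raise ValueError("unsupported number of path segments")
    sigil = TYPE_TO_SIGIL.get(segments[0].lower())
    if sigil is None or not segments[1]:
        raise ValueError("unknown type or empty identifier")
    result = {
        "mxid": sigil + unquote(segments[1]),
        "event_id": None,
        "via": [],
        "action": None,
    }
    if len(segments) == 4:
        if sigil not in ("!", "#") or segments[2].lower() != "event" or not segments[3]:
            raise ValueError("malformed event part")
        result["event_id"] = "$" + unquote(segments[3])
    for item in parts.query.split("&"):
        key, sep, value = item.partition("=")
        if not sep:
            continue  # empty items and items without "=" are ignored
        if key.lower() == "via":
            result["via"].append(unquote(value))
        elif key.lower() == "action":
            result["action"] = value.lower()  # the last action item wins
    return result
```

A caller would typically treat `None` as "not a Matrix URI, pass it on" and a `ValueError` as a parsing failure to report to the user.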
## Discussion points and tradeoffs From 758c57b0214c9823306db0dcfa41ad002cdbd5dd Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Thu, 1 Oct 2020 14:33:03 +0200 Subject: [PATCH 16/29] More cleanup --- proposals/2312-matrix-uri.md | 50 +++++++++++++++++++----------------- 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index fa65e1d4..f9620701 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -160,8 +160,8 @@ The proposed generic Matrix URI syntax is a subset of the generic URI syntax [defined by RFC 3986](https://tools.ietf.org/html/rfc3986#section-3): ```text -MatrixURI = “matrix:” hier-part [ “?” query ] [ “#” fragment ] -hier-part = [ “//” authority “/” ] path +MatrixURI = "matrix:" hier-part [ "?" query ] [ "#" fragment ] +hier-part = [ "//" authority "/" ] path ``` As mentioned above, this MSC assumes client-side URI processing (i.e. mapping to Matrix identifiers and CS API requests). @@ -200,7 +200,7 @@ resource (such as a room or a user) specifically through a given server, addressing a case described in [matrix-org/matrix-doc#2309](https://github.com/matrix-org/matrix-doc/issues/2309). 
```text -authority = host [ “:” port ] +authority = host [ ":" port ] ``` Here's an example of a Matrix URI with an authority part @@ -218,10 +218,10 @@ to a simple pattern that allows to easily reconstruct a Matrix identifier or a chain of identifiers and also to locate a certain sub-resource in the scope of a given Matrix entity: ```text -path = entity-descriptor [“/” entity-descriptor] +path = entity-descriptor ["/" entity-descriptor] entity-descriptor = nonid-segment / type-qualifier id-without-sigil -non-id-segment = segment-nz ; as defined in RFC3986, see also below -type-qualifier = segment-nz “/” ; as defined in RFC3986, see also below +non-id-segment = segment-nz ; as defined in RFC 3986, see also below +type-qualifier = segment-nz "/" ; as defined in RFC 3986, see also below id-without-sigil = string ; as defined in Matrix identifier spec, see below ``` The path component consists of 1 or more descriptors separated by a slash @@ -231,7 +231,7 @@ extensions. This MSC only proposes mappings along `type-qualifier id-without-sigil` syntax; `nonid-segment` is unused and reserved for future use. For the sake of integrity future `nonid-segment` extensions must follow -[the ABNF for `segment-nz` as defined in RFC3986](https://tools.ietf.org/html/rfc3986#appendix-A). +[the ABNF for `segment-nz` as defined in RFC 3986](https://tools.ietf.org/html/rfc3986#appendix-A). This MSC defines the following `type` specifiers: `user` (user id, sigil `@`), `roomid` (room id, sigil `!`), @@ -265,13 +265,13 @@ See the rationale behind dropping sigils and the respective up/downsides in Matrix URI can optionally have [the query part](https://tools.ietf.org/html/rfc3986#section-3.4). -This MSC defines only two specific forms for the query; further MSCs -may add to this as long as RFC 3986 is followed. +This MSC defines the general form for the query and two "standard" query items; +further MSCs may add to this as long as RFC 3986 is followed. 
```text -query = query-element *( “&” query-element ) +query = query-element *( "&" query-element ) query-element = action / routing -action = “action=" ( "join" / "chat" ) -routing = “via=” authority +action = "action=" ( "join" / "chat" ) +routing = "via=” authority ``` The `action` query item is used in contexts where, on top of identifying @@ -291,9 +291,9 @@ describes two possible actions: SHOULD open the canonical direct chat. For both actions, where applicable, client applications SHOULD ask for user -confirmation or at least make the user aware if the action leads -to joining or creating a new room rather than to opening a room that the user -already has in the room list. +confirmation or at least notify the user before joining or creating a new room. +Conversely, no additional confirmation/notification is necessary when +the action leads to opening a room the user is already a member of. The routing query (`via=`) indicates servers that are likely involved in the room (see also @@ -562,7 +562,8 @@ With that said, a URN-styled (`matrix:room:example.org:roomalias`) option was considered. However, Matrix already uses colon (`:`) as a delimiter of id parts and, as can be seen above, reversing the parts to meet the URN's hierarchical order would look confusing for Matrix -users. +users (as in example above - is `room` a part of the identifier or +the type signifier?). ### "Full REST" @@ -580,12 +581,13 @@ the URI). ### Minimal syntax -One early but still viable proposal was to simply prepend `matrix:` to -a Matrix identifier (without encoding it), assuming that it will only be -processed on the client side. The massive downside of this option is that -such strings are not actual URIs even though they look like ones: most -URI parsers won't handle them correctly. Bare Matrix identifiers have -the same applicable range without deceptive looks. 
+One early proposal was to simply prepend `matrix:` to a Matrix identifier +(without encoding it), assuming that it will only be processed on the client +side. The massive downside of this option is that such strings are not actual +URIs even though they look like ones: most URI parsers won't handle them +correctly. As laid out in the beginning of this proposal, Matrix URIs are +not striving to preempt Matrix identifiers; instead of trying to produce +an equally readable string, one should just use identifiers where they work. ### Minimal syntax based on path and percent-encoding @@ -594,7 +596,7 @@ A simple modification of the previous option is much more viable: proper percent-encoding of the Matrix identifier allows to use it as a URI path part. A single identifier packed in a URI could look like `matrix:/encoded_id_with_sigil`; an event-in-a-room URI would be something -like `matrix:/roomid_or_alias/$event_id` (NB: RFC3986 doesn't require `$` +like `matrix:/roomid_or_alias/$event_id` (NB: RFC 3986 doesn't require `$` to be encoded). This is considerably more concise and encoding is only needed for `#` - quite unfortunately, this is one of the most used sigils in Matrix. E.g., `matrix:/%23matrix:matrix.org` would be a URI for @@ -619,7 +621,7 @@ to represent arbitrary sub-objects (with or without Matrix identifier) such as user profiles or a notifications feed for the room - and also, if ever needed, as an escape hatch to a bigger namespace if we hit shortage of sigils. -The current proposal is also flexible enough to even incorporate the minimal +The current proposal is also flexible enough to incorporate the minimal syntax of this option as an alternative to its own notation - e.g., a further MSC could enable `matrix:id/%23matrix:matrix.org` as a synonym for `matrix:room/matrix:matrix.org`. 
From a2fa637396d30ceab4d19ffaaaffb4ff3f2239bc Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Thu, 1 Oct 2020 14:41:10 +0200 Subject: [PATCH 17/29] Refactor of non-normative sections Including a few more words regarding the "minimal encoded" (aka "keep sigils") alternative. --- proposals/2312-matrix-uri.md | 172 ++++++++++++++++++++--------------- 1 file changed, 100 insertions(+), 72 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index f9620701..f7741aca 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -445,7 +445,61 @@ Clients MUST implement proper percent-encoding of the identifiers; there's no liberty similar to that of matrix.to. -## Discussion points and tradeoffs +## Discussion and non-normative statements + +### Further evolution + +This MSC is obviously just the first step, keeping the door open for +extensions. Here are a few ideas: + +* Add new actions; e.g. leaving a room (`action=leave`). + +* Add specifying a segment of the room timeline (`from=$evtid1&to=$evtid2`). + +* Unlock bare event ids (`matrix:event/$event_id`) - subject to changes in + other areas of the specification. + +* Bring tangible semantics to the authority part. The main purpose of + the authority part, + [as per RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), + is to identify the authority governing the namespace for the rest + of the URI. This MSC restates the RFC definitions for + [`host`](https://tools.ietf.org/html/rfc3986#section-3.2.2) and + [`port`](https://tools.ietf.org/html/rfc3986#section-3.2.3) but + doesn't go further, calling for a separate MSC that would define semantics of + the `host:port` pair. RFC 3986 also includes provisions for user + information but this MSC explicitly excludes them from the authority grammar, + on the grounds that user information has historically been a vector of + widespread abuse. 
If providing a user identity via the authority part is + found to be of value (with alleviated security concerns) in some case, + a separate MSC should both re-add it to the grammar of the authority part + and define how to construct, parse, and use it. + + Importantly, future MSCs are advised against using the authority part for + _routing over federation_ (the case for `via=` query items), as it would be + against the spirit of RFC 3986. The authority part can be used in cases when + a given Matrix entity is only available from certain servers (the case of + closed federations or non-federating servers). A request to the server + resolved from the authority part means that the client should be, as the name + implies, _authorised_ by the authority server to access the requested + resource. That, in turn, implies that the resource is either available + to guests on the authority server, or the end user must be authenticated + (and their access rights checked) by (or on behalf of) _that server_ in order + to access the resource. While being a part of the original proposal, + the definition of the authority semantics has been dropped as a result of + [the discussion](https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282) + (also referred to in the previous section). + +* One could conceive a URI mapping of avatars in the form of + `matrix:user/uid:matrix.org/avatar/room:matrix.org` + (a user’s avatar for a given room). + +* As described in "Alternatives" and "Discussion points", respectively, one can introduce a synonymous + system that uses Matrix identifiers with sigils by adding another path + prefix (e.g., `matrix:id/%23matrix:matrix.org`). + + +### Past discussion points and tradeoffs The below documents the discussion and outcomes in various prior forums; further discussion should happen in GitHub comments. @@ -468,21 +522,19 @@ further discussion should happen in GitHub comments. public rooms known to the user's homeserver). 1. 
_Should we advise using the query part for collections then?_ Not in this MSC but that can be considered in the future. -1. _Why not Reddit-style single-letter type specifiers? That's almost - as compact as a sigil, still pretty clearly conveys the type, - and nicely avoids the singular vs. plural confusion._ - Reddit-style prefixes would eventually produce bigger ambiguity as - primary notation; but they can be handy as shortcuts. As discussed - further below, the current proposal provides enough space to define - synonyms; this may need some canonicalisation service from homeservers - so that we don't have to enable synonyms at each client individually. -1. _Why can an event URI not use the fragment part for the event ID?_ +1. _Why can't event URIs use the fragment part for the event ID?_ Because fragment is a part processed exclusively by the client in order to navigate within a larger document, and room cannot be considered a "document". Each event can be retrieved from the server individually, so each event can be viewed as a self-contained document. When/if URI processing is shifted to the server-side, servers are not even - going to receive fragments (as per RFC 3986). + going to receive fragments (as per RFC 3986), which is why usage of + fragments to remove the need for percent-encoding in other identifiers + would lead to URIs that cannot be resolved on servers. Effectively, all + clients would have to implement full URI processing with no chance + to offload that to the server. For that reason fragments, if/when ever + employed in Matrix, only should be used to pinpoint a position within events + and for similar strictly client-side operations. 1. _Interoperability with [Linked Data](https://en.wikipedia.org/wiki/Linked_data)_ is out of scope of this MSC but worth being considered separately. @@ -497,56 +549,25 @@ further discussion should happen in GitHub comments. Use `via=` in order to point to a homeserver in the closed federation. 
The authority part may eventually be used for that but further discussion is needed on how clients should support it without compromising privacy - (see https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282 - for the original concern). + (see [the discussion on the issue](https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282)). -## Further evolution +### Alternatives -This section is non-normative. +#### Reddit-style URLs -This MSC is obviously just the first step, keeping the door open for -extensions. Here are a few ideas: - -* Add new actions; e.g. leaving a room (`action=leave`). -* Add specifying a segment of the room timeline (`from=$evtid1&to=$evtid2`). -* Unlock bare event ids (`matrix:event/$event_id`) - subject to changes in - other areas of the specification. -* One area of possible evolution is bringing tangible semantics to - the authority part. The main purpose of the authority part, - [as per RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), - is to identify the authority governing the namespace for the rest - of the URI. This MSC reuses the RFC definitions for - [`host`](https://tools.ietf.org/html/rfc3986#section-3.2.2) and - [`port`](https://tools.ietf.org/html/rfc3986#section-3.2.3). - RFC 3986 also includes provisions for user information - - this MSC explicitly excludes them. If providing a user identity - in the authority part is found to be of value in some case, - this should be addressed in a separate MSC. - - Importantly, the authority part is _not_ intended for usage in routing - over federation; rather, it is for cases when a given Matrix - entity is not expected to be reachable through federation (such as - unfederated rooms or non-public Matrix networks). Sending requests - to the server resolved from the authority part means that the client - should be, as the name implies, _authorised_ by the authority server - to access the requested resource. 
That, in turn, implies that the resource - is either available to guests on the authority server, or the end user - must be authenticated (and their access rights checked) - on that server in order to access the resource. While being a part - of the original proposal, the semantics for the authority part have - been dropped from the normative part as a result of MSC discussion. -* One could conceive a URI mapping of avatars in the form of - `matrix:user/uid:matrix.org/avatar/room:matrix.org` - (a user’s avatar for a given room). -* As described below in "Alternatives", one can introduce a synonymous - system that uses Matrix identifiers with sigils by adding another path - prefix (`matrix:id/%23matrix:matrix.org`). +Reddit style (`matrix:r/matrix:matrix.org`, `matrix:u/me:example.org` etc.) +is almost as compact as original Matrix identifiers, while still rather +clearly conveys the type and nicely avoids the singular vs. plural confusion +described in the previos section. However, in the context of high requirements +to URL grammar stability, Reddit-style prefixes would eventually produce +bigger ambiguity as a primary notation; but they can be handy as shortcuts. +As discussed in "Future evolution", the current proposal provides enough space +to define synonyms; this may need some canonicalisation service from +homeservers so that we don't have to enable synonyms at each client +individually. - -## Alternatives - -### URNs +#### URNs The discussion in [MSC455](https://github.com/matrix-org/matrix-doc/issues/455) @@ -565,8 +586,7 @@ to meet the URN's hierarchical order would look confusing for Matrix users (as in example above - is `room` a part of the identifier or the type signifier?). - -### "Full REST" +#### "Full REST" Yet another alternative considered was to go "full REST" and build a more traditionally looking URL structure with serverparts coming first @@ -574,12 +594,11 @@ followed by type grouping (sic - not specifiers) and then by localparts, i.e. 
`matrix://example.org/rooms/roomalias`. This is even more difficult to comprehend for a Matrix user than the previous alternative and besides it conflates the notion of an authority server with -that of a namespace (`example.org` above is a server part of an alias, -not the name of a hypothetical homeserver that should be used to resolve -the URI). +that of a namespace (quite confusingly, `example.org` above is +the _domain name_ - aka server part - of an alias, not a _host name_ +of a hypothetical homeserver that should be used to resolve the URI). - -### Minimal syntax +#### Minimal syntax One early proposal was to simply prepend `matrix:` to a Matrix identifier (without encoding it), assuming that it will only be processed on the client @@ -589,8 +608,7 @@ correctly. As laid out in the beginning of this proposal, Matrix URIs are not striving to preempt Matrix identifiers; instead of trying to produce an equally readable string, one should just use identifiers where they work. - -### Minimal syntax based on path and percent-encoding +#### Minimal syntax based on path and percent-encoding A simple modification of the previous option is much more viable: proper percent-encoding of the Matrix identifier allows to use it as @@ -598,17 +616,27 @@ a URI path part. A single identifier packed in a URI could look like `matrix:/encoded_id_with_sigil`; an event-in-a-room URI would be something like `matrix:/roomid_or_alias/$event_id` (NB: RFC 3986 doesn't require `$` to be encoded). This is considerably more concise and encoding is only -needed for `#` - quite unfortunately, this is one of the most used sigils -in Matrix. E.g., `matrix:/%23matrix:matrix.org` would be a URI for -Matrix HQ chat room. +needed for `#`. + +Quite unfortunately, `#` is one of the two sigils in Matrix most relevant +to integration cases. 
The other one is `@`; it doesn't need encoding outside +of the authority part - which is why the form above uses a leading `/` that +puts the identifier in the path part instead of what parsers treat as +the authority part. `#` has to be encoded wherever it appears, making a URI +for Matrix HQ, the first chat room many new users join, look like +`matrix:/%23matrix:matrix.org`. Beyond first-time usage, this generally impacts +[the "napkin" case](https://tools.ietf.org/html/rfc3986#section-1.2.1) from +RFC 3986 that the Requirements section of this MSC mentions. Until we have +applications generally recognising Matrix identifiers in the same way e-mail +addresses are recognised without prefixing `mailto:`, we should live with +the fact that people will have to produce Matrix URIs by hand in various +instances, from pen-and-paper to other instant messengers. Putting the whole id to the URI fragment (`matrix:#id_with_sigil` or, following on the `matrix.to` tradition, `matrix:#/id_with_sigil` for readability) allows to use `#` without encoding on many URI parsers. It is -still not fully RFC3986-compliant but the bigger problem is that putting -the identifying part to the fragment rules out using URIs in client-server -communication. Effectively all clients will have to implement full URI -processing with no chance to offload that to the server. +still not fully RFC-compliant and rules out using URIs by homeservers +(see also "Past discussion points"). 
Regardless of the placement (the fragment or the path), one more consideration is that the character space for sigils is extremely limited and From 11d2529a5e925dc276ffb8ae75ce709d473f0408 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Thu, 1 Oct 2020 18:37:39 +0200 Subject: [PATCH 18/29] Allow custom query items --- proposals/2312-matrix-uri.md | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index f7741aca..792726f3 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -268,10 +268,13 @@ Matrix URI can optionally have This MSC defines the general form for the query and two "standard" query items; further MSCs may add to this as long as RFC 3986 is followed. ```text -query = query-element *( "&" query-element ) -query-element = action / routing +query = query-element *( "&" query-item ) +query-item = action / routing / custom-query-item action = "action=" ( "join" / "chat" ) routing = "via=” authority +custom-query-item = custom-item-name "=" custom-item-value +custom-item-name = 1*unreserved ; reverse-DNS name; see below +custom-item-value = ; see below ``` The `action` query item is used in contexts where, on top of identifying @@ -305,6 +308,25 @@ a non-public Matrix network (see the question about closed federations in above is only a part of the grammar as defined in the respective section; it is not proposed here to generate or read the authority part of the URI. +Clients MAY introduce and recognise custom query items, according to +the following rules: +- the name of the item MUST follow the reverse-DNS (aka "Java package") + naming convention - e.g., a custom action item for Element clients would be + named `io.element.action`, for Quaternion - `com.github.quaternion.action`, + etc. 
+- the value of the item can be any content but its representation in the URI + MUST follow the general RFC requirements for the query part; on top of that, + if the raw value contains `&` it MUST be percent-encoded. +- clients SHOULD respect standard query items over their own ones; e.g., + if a URI contains both `action` and the custom client action, the standard + action should be respected as much as possible. Client authors SHOULD strive + for consistent experience across their and 3rd party clients, anticipating + that the same user may happen to have both their client and a 3rd party one. + +Client authors are strongly encouraged to standardise custom query elements +that gain adoption by submitting an MSC defining them in a way compatible +across the client ecosystem. + ### Recommended implementation From a32d7f5027a4e9dba05a3d86be153d94ce20ccd7 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 19 Jan 2021 19:44:15 +0100 Subject: [PATCH 19/29] Apply suggestions from code review --- proposals/2312-matrix-uri.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 792726f3..e5a5d60e 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -220,7 +220,7 @@ of a given Matrix entity: ```text path = entity-descriptor ["/" entity-descriptor] entity-descriptor = nonid-segment / type-qualifier id-without-sigil -non-id-segment = segment-nz ; as defined in RFC 3986, see also below +nonid-segment = segment-nz ; as defined in RFC 3986, see also below type-qualifier = segment-nz "/" ; as defined in RFC 3986, see also below id-without-sigil = string ; as defined in Matrix identifier spec, see below ``` @@ -310,10 +310,11 @@ it is not proposed here to generate or read the authority part of the URI. 
Clients MAY introduce and recognise custom query items, according to the following rules: -- the name of the item MUST follow the reverse-DNS (aka "Java package") - naming convention - e.g., a custom action item for Element clients would be - named `io.element.action`, for Quaternion - `com.github.quaternion.action`, - etc. +- the name of a custom item MUST follow the reverse-DNS (aka "Java package") + naming convention, as per + [MSC2758](https://github.com/matrix-org/matrix-doc/pull/2758) - e.g., + a custom action item for Element clients would be named `io.element.action`, + for Quaternion - `com.github.quaternion.action`, etc. - the value of the item can be any content but its representation in the URI MUST follow the general RFC requirements for the query part; on top of that, if the raw value contains `&` it MUST be percent-encoded. From 43e6470acae6f43c09f05dda239f61ccee3efbed Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 19 Jan 2021 19:59:03 +0100 Subject: [PATCH 20/29] Drop 'group' in anticipation of #1772 --- proposals/2312-matrix-uri.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index e5a5d60e..58b2ea12 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -235,10 +235,15 @@ For the sake of integrity future `nonid-segment` extensions must follow This MSC defines the following `type` specifiers: `user` (user id, sigil `@`), `roomid` (room id, sigil `!`), -`room` (room alias, sigil `#`), and `event` (event id, sigil `$`). -The type `group` (group/community id, sigil `+`) is reserved for future use. - -As of this MSC, `user`, `roomid`, `room`, and `group` can only be at the top +`room` (room alias, sigil `#`), and `event` (event id, sigil `$`). 
This MSC +does not define a type specifier for sigil `+` +([groups](https://github.com/matrix-org/matrix-doc/issues/1513) aka communities +or, in the more recent incarnation, +[spaces](https://github.com/matrix-org/matrix-doc/pull/1772)); a separate MSC +can introduce the specifier, along with the parsing/construction logic and +relevant CS API invocations, following the framework of this proposal. + +As of this MSC, `user`, `roomid`, and `room` can only be at the top level. The type `event` can only be used on the 2nd level and only under `room` or `roomid`; this is driven by the current shape of Client-Server API that does not provide a non-deprecated way to retrieve an event without knowing @@ -360,7 +365,6 @@ comparisons are case-INsensitive. - `user` -> `@` - `roomid` -> `!` - `room` -> `#` - - `group` -> `+` - any other string, including an empty one -> fail parsing: the Matrix URI is invalid. @@ -417,7 +421,6 @@ performed on behalf (using the access token) of the user `@me:example.org`: | Room (no `action` in URI):
`matrix:roomid/rid:example.org`
`matrix:room/us:example.org` | Attempt to "open" (usually: display the timeline at the latest or last remembered position) the room | No default non-interactive operation
`GET /rooms/!rid:example.org/...`
the respective room section of `GET /sync` | | Room (`action=join`):
`matrix:roomid/rid:example.org?action=join&via=example2.org`
`matrix:room/us:example.org?action=join` | Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`
`POST /join/%23us:example.org` | | Event:
`matrix:room/us:example.org/event/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user to join the room and try to open the event again | `GET `
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org` | -| Group:
`matrix:group/them:matrix.org` | Reserved for future use | * Event ID (as used in [room version 3](https://matrix.org/docs/spec/rooms/v3) and later): @@ -447,7 +450,6 @@ For room and user identifiers (including room aliases): - `@` -> `user/` - `!` -> `roomid/` - `#` -> `room/` - - `+` -> `group/` 2. Build the Matrix URI as a concatenation of: - literal `matrix:`; - `prefix-1`; From 3988979f825f782df7dbcb9eaf37d0083349e880 Mon Sep 17 00:00:00 2001 From: Kitsune Ral Date: Tue, 19 Jan 2021 22:02:58 +0100 Subject: [PATCH 21/29] Fix leftovers --- proposals/2312-matrix-uri.md | 21 +++------------------ 1 file changed, 3 insertions(+), 18 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 58b2ea12..b2fe5eb8 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -418,25 +418,10 @@ performed on behalf (using the access token) of the user `@me:example.org`: | ----------------- | --------------------- | --------------------------------------------- | | User Id (no `action` in URI):
`matrix:user/her:example.org` | _Outside the room context_: show user profile
_Inside the room context:_ mention the user in the current room (client-local operation) | No default non-interactive operation
`GET /profile/@her:example.org/display_name`
`GET /profile/@her:example.org/avatar_url` | | User Id (`action=chat`):
`matrix:user/her:example.org?action=chat` | Open a direct chat with the user (see the next column on identifying the room) | If [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) are supported: `GET /_matrix/client/r0/user/@me:example.org/dm?involves=@her:example.org`
Without canonical direct chats:
1. `GET /user/@me:example.org/account_data/m.direct`
2. Find the room id for `@her:example.org` in the event content
3. if found, return this room id; if not, `POST /createRoom` with `"is_direct": true` and return id of the created room | -| Room (no `action` in URI):
`matrix:roomid/rid:example.org`
`matrix:room/us:example.org` | Attempt to "open" (usually: display the timeline at the latest or last remembered position) the room | No default non-interactive operation
`GET /rooms/!rid:example.org/...`
the respective room section of `GET /sync` | +| Room (no `action` in URI):
`matrix:roomid/rid:example.org`
`matrix:room/us:example.org` | Attempt to "open" (usually: display the timeline at the latest or last remembered position) the room | No default non-interactive operation
API: Find the respective room in the local `/sync` cache or
`GET /rooms/!rid:example.org/...`
| | Room (`action=join`):
`matrix:roomid/rid:example.org?action=join&via=example2.org`
`matrix:room/us:example.org?action=join` | Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`
`POST /join/%23us:example.org` | -| Event:
`matrix:room/us:example.org/event/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user to join the room and try to open the event again | `GET `
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org` | - -* Event ID (as used in - [room version 3](https://matrix.org/docs/spec/rooms/v3) and later): - - URI example (aka "event permalink"): - `matrix:room/us:example.org/event/UnpaddedBase64` or - (additional routing is needed for room id) - `matrix:roomid/rid:example.org/event/UnpaddedBase64?via=example2.org` - - Default action: - 1. For room aliases, resolve an alias to a room id (see above) - 1. If the event is in the room that the user has joined, retrieve - the event contents (`GET /rooms/!rid:example.org/event/UnpaddedBase64` or - `GET /rooms/!rid:example.org/context/UnpaddedBase64`) and display them - to the user - 1. Otherwise try to retrieve the event in the same way but in case of - access failure the client MAY offer the user to join the room; if - the user agrees and joining succeeds, retry the step above. +| Event:
`matrix:room/us:example.org/event/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user to join the room and try to open the event again | Non-interactive operation: return event or event content, depending on context
API: find the event in the local `/sync` cache or
`GET /directory/room/%23us:example.org` (to resolve alias to id)
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org`
| + #### URI construction algorithm From eca999382d5db521095db1787f8f2abc0144b139 Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Fri, 22 Jan 2021 18:39:33 +0100 Subject: [PATCH 22/29] Grammar --- proposals/2312-matrix-uri.md | 50 ++++++++++++++++++------------------ 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index b2fe5eb8..fc2b65e0 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -2,7 +2,7 @@ This is a proposal of a URI scheme to identify Matrix resources in a wide range of applications (web, desktop, or mobile) both throughout Matrix software -and (especially) outside of it. It supersedes +and (especially) outside it. It supersedes [MSC455](https://github.com/matrix-org/matrix-doc/issues/455) in order to continue the discussion in the modern GFM style. @@ -28,7 +28,7 @@ Specific use cases include: Matrix identifiers as defined by the current specification have a form distinct enough from other identifiers to mostly fulfil the representation use case. -Since they are not URIs they can not cover the two integration use cases. +Since they are not URIs, they can not cover the two integration use cases. https://matrix.to somehow compensates for this; however: * it requires a web browser to run JavaScript code that resolves identifiers (basically limiting first-class support to browser-based clients), and @@ -50,7 +50,7 @@ it merely defines a mapping between URIs with the scheme name `matrix:` and Matrix identifiers, as well as operations on them. The MSC should be sufficient to produce an implementation that would convert Matrix URIs to a series of CS API calls, entirely on the client side. 
It is recognised, -however, that most of URI processing logic can and should (eventually) +however, that most of the URI processing logic can and should (eventually) be on the server side in order to facilitate adoption of Matrix URIs; further MSCs are needed to define details for that, as well as to extend the mapping to more resources (including those without equivalent @@ -59,7 +59,7 @@ Matrix identifiers, such as room state or user profile data). The Matrix identifier (or identifiers) can be reconstructed from `{id without sigil}` by prepending a sigil character corresponding to `{type}`. To support a hierarchy of Matrix resources, more `/{type}/{id without sigil}` -pairs can be appended, identifying resources inside of other resources. +pairs can be appended, identifying resources within other resources. As of now, there's only one such case, with exactly one additional pair - pointing to an event in a room. @@ -92,7 +92,7 @@ Further text uses the following terms: The following considerations drive the requirements for Matrix URIs: 1. Follow existing standards and practices. -1. Endorse the principle of least surprise. +1. Endorse the principle of the least surprise. 1. Humans first, machines second. 1. Cover as many entities as practical. 1. URIs are expected to be extremely portable and stable; @@ -117,9 +117,8 @@ The following requirements resulted from these drivers: 1. Event IDs (`$arbitrary_eventid_with_or_without_serverpart`) 1. The mapping MUST take into account that some identifiers (e.g. aliases) can have non-ASCII characters - reusing - [RFC 3987](https://tools.ietf.org/html/rfc3987) is RECOMMENDED - but an alternative encoding can be used if there are reasons - for that. + [RFC 3987](https://tools.ietf.org/html/rfc3987) is RECOMMENDED, + but an alternative encoding can be used if there are reasons for that. 1. The mapping between Matrix identifiers and Matrix URIs MUST be extensible (without invalidating previous URIs) to: 1. 
new classes of identifiers (there MUST be a meta-rule to produce @@ -136,7 +135,7 @@ The following requirements resulted from these drivers: 1. Matrix URI SHOULD have a human-readable, if not necessarily human-friendly, representation - to allow visual sanity-checks. In particular, characters escaping/encoding should be reduced - to bare minimum in that representation. As a food for thought, see + to bare minimum in that representation. As food for thought, see [Wikipedia: Clean URL, aka SEF URL](https://en.wikipedia.org/wiki/Clean_URL) and [a use case from RFC 3986](https://tools.ietf.org/html/rfc3986#section-1.2.1). 1. It SHOULD be easy to parse Matrix URI in popular programming @@ -181,7 +180,7 @@ The proposed scheme name is `matrix`. Other considered options were `mx` and `web+matrix`; [comments to MSC455](https://github.com/matrix-org/matrix-doc/issues/455) mention two scheme names proposed and one more has been mentioned -in `#matrix-core:matrix.org`). +in `#matrix-core:matrix.org`. The scheme name is a definitive indication of a Matrix URI and MUST NOT be omitted. As can be seen below, Matrix URI rely heavily on [relative @@ -225,7 +224,7 @@ type-qualifier = segment-nz "/" ; as defined in RFC 3986, see also below id-without-sigil = string ; as defined in Matrix identifier spec, see below ``` The path component consists of 1 or more descriptors separated by a slash -(`/`) character. This is a generic pattern intended for reuse in future +(`/`) character. This is a generic pattern intended for reusing in future extensions. This MSC only proposes mappings along `type-qualifier id-without-sigil` syntax; @@ -346,8 +345,9 @@ comparisons are case-INsensitive. `query`, and `fragment`), decoding special or international characters as directed by [RFC 3986](https://tools.ietf.org/html/rfc3986) and (for IRIs) [RFC 3987](https://tools.ietf.org/html/rfc3987). 
Authors are - strongly RECOMMENDED to find an existing implementation of that step for - their language and SDK, rather than implement it from scratch based on RFCs. + strongly RECOMMENDED that they find an existing implementation of that step + for their language and SDK, rather than implement it from scratch based + on RFCs. 1. Check that `scheme name` is exactly `matrix`, case-insensitive. If the scheme name doesn't match, exit parsing: this is not a Matrix URI. @@ -466,7 +466,7 @@ extensions. Here are a few ideas: * Add specifying a segment of the room timeline (`from=$evtid1&to=$evtid2`). -* Unlock bare event ids (`matrix:event/$event_id`) - subject to changes in +* Unlock bare event ids (`matrix:event/$event_id`) - subject to change in other areas of the specification. * Bring tangible semantics to the authority part. The main purpose of @@ -569,7 +569,7 @@ further discussion should happen in GitHub comments. Reddit style (`matrix:r/matrix:matrix.org`, `matrix:u/me:example.org` etc.) is almost as compact as original Matrix identifiers, while still rather clearly conveys the type and nicely avoids the singular vs. plural confusion -described in the previos section. However, in the context of high requirements +described in the previous section. However, in the context of high requirements to URL grammar stability, Reddit-style prefixes would eventually produce bigger ambiguity as a primary notation; but they can be handy as shortcuts. As discussed in "Future evolution", the current proposal provides enough space @@ -598,9 +598,9 @@ the type signifier?). 
#### "Full REST" -Yet another alternative considered was to go "full REST" and build -a more traditionally looking URL structure with serverparts coming first -followed by type grouping (sic - not specifiers) and then by localparts, +Yet another alternative considered was to go "full REST" and structure +URLs in a more traditional way with serverparts coming first, followed +by type grouping (sic - not specifiers), and then by localparts, i.e. `matrix://example.org/rooms/roomalias`. This is even more difficult to comprehend for a Matrix user than the previous alternative and besides it conflates the notion of an authority server with @@ -618,7 +618,7 @@ correctly. As laid out in the beginning of this proposal, Matrix URIs are not striving to preempt Matrix identifiers; instead of trying to produce an equally readable string, one should just use identifiers where they work. -#### Minimal syntax based on path and percent-encoding +#### Minimal syntax based on the path component and percent-encoding A simple modification of the previous option is much more viable: proper percent-encoding of the Matrix identifier allows to use it as @@ -629,8 +629,8 @@ to be encoded). This is considerably more concise and encoding is only needed for `#`. Quite unfortunately, `#` is one of the two sigils in Matrix most relevant -to integration cases. The other one is `@`; it doesn't need encoding outside -of the authority part - which is why the form above uses a leading `/` that +to integration cases. The other one is `@`; it doesn't need encoding except +in the authority part - which is why the form above uses a leading `/` that puts the identifier in the path part instead of what parsers treat as the authority part. `#` has to be encoded wherever it appears, making a URI for Matrix HQ, the first chat room many new users join, look like @@ -644,7 +644,7 @@ instances, from pen-and-paper to other instant messengers. 
Putting the whole id to the URI fragment (`matrix:#id_with_sigil` or, following on the `matrix.to` tradition, `matrix:#/id_with_sigil` for -readability) allows to use `#` without encoding on many URI parsers. It is +readability) allows using `#` without encoding on many URI parsers. It is still not fully RFC-compliant and rules out using URIs by homeservers (see also "Past discussion points"). @@ -656,7 +656,7 @@ warrant a dedicated sigil but that cannot be ruled out. Rather than rely on the institute of sigils, this proposal gives an alternative more extensible syntax that can be used for more advanced cases - as a uniform way to represent arbitrary sub-objects (with or without Matrix identifier) such as -user profiles or a notifications feed for the room - and also, if ever needed, +user profiles, or a notifications feed for the room - and also, if ever needed, as an escape hatch to a bigger namespace if we hit shortage of sigils. The current proposal is also flexible enough to incorporate the minimal @@ -669,7 +669,7 @@ MSC could enable `matrix:id/%23matrix:matrix.org` as a synonym for Despite the limited functionality of URIs as proposed in this MSC, Matrix authors are advised to use tools that would process URIs just -like an http(s) URI instead of making home-baked parsers/emitters. +like an HTTP(S) URI instead of making home-baked parsers/emitters. Even with that in mind, not all tools normalise and sanitise all cases in a fully RFC-compliant way. This MSC tries to keep the required transformations to the minimum and will likely not bring much grief even @@ -693,7 +693,7 @@ information in URIs. The MSC strives to not be prescriptive in treating URIs except the `action` query parameter. Actions without user confirmation may lead to unintended -leaks of certain metadata so this MSC recommends to ask for a user consent - +leaks of certain metadata so this MSC recommends asking for a user consent - recognising that not all clients are in position for that. 
From 246a97e2be95446ccc4f4c0f6d83dea0e0773a28 Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Fri, 22 Jan 2021 19:59:41 +0100 Subject: [PATCH 23/29] Add clarification as per review --- proposals/2312-matrix-uri.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index fc2b65e0..a7a1f9f7 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -302,6 +302,21 @@ confirmation or at least notify the user before joining or creating a new room. Conversely, no additional confirmation/notification is necessary when the action leads to opening a room the user is already a member of. +It is worth reiterating the (blurry) distinction between URIs with `action` +and those without: +- a URI with no `action` simply _identifies_ the resource; if the context + implies an operation, it is usually focused on the retrieval of the resource, + in line with RFC 3986 (see also the next paragraph); +- a URI with `action` in the query means that a client application should (but + is not obliged to) perform that action, with precautions as described above. + +In some cases a client application may have no meaningful way to immediately +perform the default operation suggested by this MSC (see below); e.g., +the client may be unable to display a room before joining it, while the URI +doesn't have `action=join`. In these cases client applications are free to do +what's best for user experience (e.g., suggest joining the room), even if that +means performing an action on a URI with no `action` in the query. + The routing query (`via=`) indicates servers that are likely involved in the room (see also [the feature of matrix.to](https://matrix.org/docs/spec/appendices#routing)).
From b4269a54e87b668c40f4dacd73444031988efd6c Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Wed, 17 Feb 2021 21:14:02 +0100 Subject: [PATCH 24/29] Use abbreviated type specifiers As per the review, this commit introduces Reddit-style type specifiers for user ids (u/), room aliases (r/), and event ids (e/). --- proposals/2312-matrix-uri.md | 123 ++++++++++++++++++++++------------- 1 file changed, 78 insertions(+), 45 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index a7a1f9f7..7b01cff0 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -39,7 +39,7 @@ To cover the use cases above, the following scheme is proposed for Matrix URIs ```text matrix:[//{authority}/]{type}/{id without sigil}[/{type}/{id without sigil}...][?{query}][#{fragment}] ``` -with `{type}` defining the resource type (such as `user` or `roomid` - see +with `{type}` defining the resource type (such as `r`, `u` or `roomid` - see the "Path" section in the proposal) and `{query}` containing additional hints or request details on the Matrix entity (see "Query" in the proposal). `{authority}` and `{fragment}` parts are reserved for future use; this proposal @@ -65,11 +65,11 @@ pointing to an event in a room. Examples: * Room `#someroom:example.org`: - `matrix:room/someroom:example.org` + `matrix:r/someroom:example.org` * User `@me:example.org`: - `matrix:user/me:example.org` + `matrix:u/me:example.org` * Event in a room: - `matrix:room/someroom:example.org/event/Arbitrary_Event_Id` + `matrix:r/someroom:example.org/e/Arbitrary_Event_Id` * [A commit like this](https://github.com/her001/steamlug.org/commit/2bd69441e1cf21f626e699f0957193f45a1d560f) could make use of a Matrix URI in the form of `{Matrix identifier}`. @@ -187,7 +187,7 @@ be omitted. 
As can be seen below, Matrix URI rely heavily on [relative references](https://tools.ietf.org/html/rfc3986#section-4.2) and omitting the scheme name makes them indistinguishable from a local path that might have nothing to do with Matrix. Clients MUST NOT try to -parse pieces like `room/MyRoom:example.org` as Matrix URIs; instead, +parse pieces like `r/MyRoom:example.org` as Matrix URIs; instead, users should be encouraged to use Matrix identifiers for in-text references (`#MyRoom:example.org`) and client applications SHOULD turn them into hyperlinks to Matrix URIs. @@ -232,29 +232,38 @@ This MSC only proposes mappings along `type-qualifier id-without-sigil` syntax; For the sake of integrity future `nonid-segment` extensions must follow [the ABNF for `segment-nz` as defined in RFC 3986](https://tools.ietf.org/html/rfc3986#appendix-A). -This MSC defines the following `type` specifiers: -`user` (user id, sigil `@`), `roomid` (room id, sigil `!`), -`room` (room alias, sigil `#`), and `event` (event id, sigil `$`). This MSC -does not define a type specifier for sigil `+` +This MSC defines the following `type` specifiers: `u` (user id, sigil `@`), +`r` (room alias, sigil `#`), `roomid` (room id, sigil `!`), and +`e` (event id, sigil `$`). This MSC does not define a type specifier for sigil `+` ([groups](https://github.com/matrix-org/matrix-doc/issues/1513) aka communities or, in the more recent incarnation, [spaces](https://github.com/matrix-org/matrix-doc/pull/1772)); a separate MSC can introduce the specifier, along with the parsing/construction logic and relevant CS API invocations, following the framework of this proposal. -As of this MSC, `user`, `roomid`, and `room` can only be at the top -level. 
The type `event` can only be used on the 2nd level and only under `room` -or `roomid`; this is driven by the current shape of Client-Server API that -does not provide a non-deprecated way to retrieve an event without knowing +The following type specifiers proposed in earlier editions of this MSC and +already in use in several implementations, are deprecated: `user`, `room`, and +`event`. Client applications MAY parse these specifiers as if they were +`u`, `r`, and `e` respectively; they MUST NOT emit URIs with the deprecated +specifiers. The rationale behind the switch is laid out in "Alternatives". + +As of this MSC, `u`, `r`, and `roomid` can only be at the top +level. The type `e` (event) can only be used on the 2nd level and only under +`r` or `roomid`; this is driven by the current shape of Client-Server API +that does not provide a non-deprecated way to retrieve an event without knowing the room (see [MSC2695](https://github.com/matrix-org/matrix-doc/pull/2695) and [MSC2779](https://github.com/matrix-org/matrix-doc/issues/2779) that may -change this). +change this). Further MSCs may introduce navigation to more top-level as well as non-top-level objects; see "Further evolution" for some ideas. These new proposals SHOULD follow the generic grammar laid out above, adding new `type` and `nonid-segment` specifiers and/or allowing them in other levels, rather -than introduce a new grammar. +than introduce a new grammar. It is recommended to only use abbreviated +single-letter specifiers if they are expected to be user visible and convenient +for type-in; if a URI for a given resource type is usually generated +(e.g. because the corresponding identifier is not human-friendly), it's +RECOMMENDED to use full (though short) words to avoid ambiguity and confusion. `id-without-sigil` is defined as the `string` part of Matrix [Common identifier format](https://matrix.org/docs/spec/appendices#common-identifier-format) @@ -377,9 +386,9 @@ comparisons are case-INsensitive. 
a. Pick the leftmost segment of `path` until `/` (path segment) and match it against the following list to produce `sigil-1`: - - `user` -> `@` + - `u` (or, optionally, `user` - see "Path") -> `@` + - `r` (or, optionally, `room`) -> `#` - `roomid` -> `!` - - `room` -> `#` - any other string, including an empty one -> fail parsing: the Matrix URI is invalid. @@ -393,7 +402,7 @@ comparisons are case-INsensitive. point to an event inside the room identified by `mxid-1`: a. Pick the next (3rd) path segment: - - if the segment is exactly `event`, proceed; + - if the segment is exactly `e` (or, optionally, `event`), proceed; - otherwise, including the case of an empty segment (trailing `/`, e.g.), fail parsing. @@ -431,11 +440,11 @@ performed on behalf (using the access token) of the user `@me:example.org`: | URI class/example | Interactive operation | Non-interactive operation / Involved CS API | | ----------------- | --------------------- | --------------------------------------------- | -| User Id (no `action` in URI):
`matrix:user/her:example.org` | _Outside the room context_: show user profile
_Inside the room context:_ mention the user in the current room (client-local operation) | No default non-interactive operation
`GET /profile/@her:example.org/display_name`
`GET /profile/@her:example.org/avatar_url` | -| User Id (`action=chat`):
`matrix:user/her:example.org?action=chat` | Open a direct chat with the user (see the next column on identifying the room) | If [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) are supported: `GET /_matrix/client/r0/user/@me:example.org/dm?involves=@her:example.org`
Without canonical direct chats:
1. `GET /user/@me:example.org/account_data/m.direct`
2. Find the room id for `@her:example.org` in the event content
3. if found, return this room id; if not, `POST /createRoom` with `"is_direct": true` and return id of the created room | -| Room (no `action` in URI):
`matrix:roomid/rid:example.org`
`matrix:room/us:example.org` | Attempt to "open" (usually: display the timeline at the latest or last remembered position) the room | No default non-interactive operation
API: Find the respective room in the local `/sync` cache or
`GET /rooms/!rid:example.org/...`
| -| Room (`action=join`):
`matrix:roomid/rid:example.org?action=join&via=example2.org`
`matrix:room/us:example.org?action=join` | Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`
`POST /join/#us:example.org` | -| Event:
`matrix:room/us:example.org/event/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user to join the room and try to open the event again | Non-interactive operation: return event or event content, depending on context
API: find the event in the local `/sync` cache or
`GET /directory/room/%23us:example.org` (to resolve alias to id)
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org`
| +| User Id (no `action` in URI):
`matrix:u/her:example.org` | _Outside the room context_: show user profile
_Inside the room context:_ mention the user in the current room (client-local operation) | No default non-interactive operation
`GET /profile/@her:example.org/display_name`
`GET /profile/@her:example.org/avatar_url` | +| User Id (`action=chat`):
`matrix:u/her:example.org?action=chat` | Open a direct chat with the user (see the next column on identifying the room) | If [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) are supported: `GET /_matrix/client/r0/user/@me:example.org/dm?involves=@her:example.org`
Without canonical direct chats:
1. `GET /user/@me:example.org/account_data/m.direct`
2. Find the room id for `@her:example.org` in the event content
3. If found, return this room id; if not, `POST /createRoom` with `"is_direct": true` and return the id of the created room |
`matrix:roomid/rid:example.org`
`matrix:r/us:example.org` | Attempt to "open" (usually: display the timeline at the latest or last remembered position) the room | No default non-interactive operation
API: Find the respective room in the local `/sync` cache or
`GET /rooms/!rid:example.org/...`
| +| Room (`action=join`):
`matrix:roomid/rid:example.org?action=join&via=example2.org`
`matrix:r/us:example.org?action=join` | Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`
`POST /join/%23us:example.org` |
`matrix:r/us:example.org/e/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/e/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer to let the user join the room and try to open the event again | Non-interactive operation: return event or event content, depending on context
API: find the event in the local `/sync` cache or
`GET /directory/room/%23us:example.org` (to resolve alias to id)
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org`
| #### URI construction algorithm @@ -447,9 +456,9 @@ compliance of identifiers passed to this algorithm. For room and user identifiers (including room aliases): 1. Remove the sigil character from the identifier and match it against the following list to produce `prefix-1`: - - `@` -> `user/` + - `@` -> `u/` + - `#` -> `r/` - `!` -> `roomid/` - - `#` -> `room/` 2. Build the Matrix URI as a concatenation of: - literal `matrix:`; - `prefix-1`; @@ -463,7 +472,7 @@ may change this): 1. Take the event's room id or canonical alias and build a Matrix URI for them as described above. 2. Append to the result of previous step: - - literal `event/`; + - literal `e/`; - the event id after removing the sigil (`$`) and percent-encoding. Clients MUST implement proper percent-encoding of the identifiers; there's no @@ -481,7 +490,7 @@ extensions. Here are a few ideas: * Add specifying a segment of the room timeline (`from=$evtid1&to=$evtid2`). -* Unlock bare event ids (`matrix:event/$event_id`) - subject to change in +* Unlock bare event ids (`matrix:e/$event_id`) - subject to change in other areas of the specification. * Bring tangible semantics to the authority part. The main purpose of @@ -516,12 +525,12 @@ extensions. Here are a few ideas: (also referred to in the previous section). * One could conceive a URI mapping of avatars in the form of - `matrix:user/uid:matrix.org/avatar/room:matrix.org` + `matrix:u/uid:matrix.org/avatar/room:matrix.org` (a user’s avatar for a given room). -* As described in "Alternatives" and "Discussion points", respectively, one can introduce a synonymous - system that uses Matrix identifiers with sigils by adding another path - prefix (e.g., `matrix:id/%23matrix:matrix.org`). +* As described in "Alternatives", a synonymous system can be introduced that + uses Matrix identifiers with sigils by adding another path prefix (e.g., + `matrix:id/%23matrix:matrix.org`). 
### Past discussion points and tradeoffs @@ -534,8 +543,10 @@ further discussion should happen in GitHub comments. `//` if the URI doesn't have an authority component. In other words, `//` implies a centre of authority, and the (public) Matrix federation is not supposed to have one; hence no `//` in most URIs. -1. _Why do type specifiers use singular rather than plural - as is common in RESTful APIs?_ +1. ~~_Why do type specifiers use singular rather than plural + as is common in RESTful APIs?_~~ + This is no more relevant with single-letter type specifiers. The answer + below is provided for history only. Unlike in actual RESTful APIs, this MSC does not see `rooms/` or `users/` as collections to browse. The type specifier completes the id specification in the URI, defining a very specific and @@ -579,18 +590,40 @@ further discussion should happen in GitHub comments. ### Alternatives -#### Reddit-style URLs - -Reddit style (`matrix:r/matrix:matrix.org`, `matrix:u/me:example.org` etc.) -is almost as compact as original Matrix identifiers, while still rather -clearly conveys the type and nicely avoids the singular vs. plural confusion -described in the previous section. However, in the context of high requirements -to URL grammar stability, Reddit-style prefixes would eventually produce -bigger ambiguity as a primary notation; but they can be handy as shortcuts. -As discussed in "Future evolution", the current proposal provides enough space -to define synonyms; this may need some canonicalisation service from -homeservers so that we don't have to enable synonyms at each client -individually. +#### Using full words for all types + +During its draft state, this MSC was proposing type specifiers using full words +(`user`, `room`, `event` etc.), arguing that abbreviations can be introduced +separately as synonyms. Full words have several shortcomings pointed out in +discussions across the whole period of preparation, namely: +- The singular vs. 
plural choice (see also "Past discussion points") +- Using English words raises a question about eventual support of localised + URI variants (`matrix:benutzer/...`, `matrix:usuario/...` etc.) catering to + international audience, that would add complication to the Matrix technology. +- Abbreviated forms are popularised by Reddit and make URIs shorter which is + crucial for the outbound integration case (see the introduction). + +Meanwhile, using `u`/`r`/`e` for users, rooms and events has the following +advantages: +1. there's a strong Reddit legacy, with users across the world quite familiar + with the abbreviated forms (and `r/` coincidentally standing for sub-Reddits + links to which have basically the same place in the Reddit ecosystem as + Matrix room aliases have in the Matrix ecosystem); +2. matrix.to links to users and room aliases are heavily used throughout Matrix, + specifically in end-user-facing contexts (see also use cases in the + introductory section of this MSC); +3. the singular vs. plural (`room` or `rooms`?) confusion is avoided; +4. it's shorter, which is crucial for typing the URI in an external medium. + +The rationale behind not abbreviating `roomid/` is a better distinction between +room aliases and room ids; also, since room ids are almost never typed in +manually, the advantages (3) and (4) above don't hold. + +For these reasons, it was decided in the end to use the single-letter style +for types most used in the outbound integration case. It's still possible to +reinstate full words as synonyms some time down the road, with the caveat that +a canonicalisation service from homeservers may be needed to avoid having +to enable synonyms at each client individually. 
#### URNs From 9dd0854aee812a142f49237b7022ebeef38eec8b Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Tue, 30 Mar 2021 19:34:28 +0200 Subject: [PATCH 25/29] Add a link to CS API --- proposals/2312-matrix-uri.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 7b01cff0..81a66c46 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -49,12 +49,13 @@ This MSC does not introduce new Matrix entities, nor API endpoints - it merely defines a mapping between URIs with the scheme name `matrix:` and Matrix identifiers, as well as operations on them. The MSC should be sufficient to produce an implementation that would convert Matrix URIs to -a series of CS API calls, entirely on the client side. It is recognised, -however, that most of the URI processing logic can and should (eventually) -be on the server side in order to facilitate adoption of Matrix URIs; -further MSCs are needed to define details for that, as well as to extend -the mapping to more resources (including those without equivalent -Matrix identifiers, such as room state or user profile data). +a series of [CS API](https://matrix.org/docs/spec/client_server/latest) calls, +entirely on the client side. It is recognised, however, that most of +the URI processing logic can and should (eventually) be on the server side +in order to facilitate adoption of Matrix URIs; further MSCs are needed +to define details for that, as well as to extend the mapping to more resources +(including those without equivalent Matrix identifiers, such as room state or +user profile data). The Matrix identifier (or identifiers) can be reconstructed from `{id without sigil}` by prepending a sigil character corresponding to `{type}`. 
From 037ebbf1caca36ad4a68f57532a4f80dcc0fc064 Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Tue, 30 Mar 2021 19:40:34 +0200 Subject: [PATCH 26/29] Be even clearer about the authority part --- proposals/2312-matrix-uri.md | 85 ++++++++++++++++++------------------ 1 file changed, 43 insertions(+), 42 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 81a66c46..5e339b0a 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -195,21 +195,23 @@ hyperlinks to Matrix URIs. #### Authority -The authority part will eventually be used to indicate access to a Matrix -resource (such as a room or a user) specifically through a given server, -addressing a case described in -[matrix-org/matrix-doc#2309](https://github.com/matrix-org/matrix-doc/issues/2309). +Based on +[the definition in RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), +this MSC restricts the authority part to never have a userinfo component, +partly to prevent confusion with the `@` character that has its +own meaning in Matrix, but also because this component has historically been +a popular target of abuse. ```text authority = host [ ":" port ] ``` +Further definition of syntax or semantics for the authority part is left for +future MSCs. Clients MUST parse the authority part as per RFC 3986 (i.e. +the presence of an authority part MUST NOT break URI parsing) but SHOULD NOT +use data from the authority part other than for experiments or research. -Here's an example of a Matrix URI with an authority part -(the authority part is `example.org:682` here): -`matrix://example.org:682/roomid/Internal_Room_Id:example2.org`. - -The authority part, as defined above, is reserved for future MSCs. -Clients SHOULD NOT use data from the authority part other than for -experimental or further research purposes.
+The authority part may eventually be used to indicate access to a Matrix +resource (such as a room or a user) specifically through a given entity. +See "Ideas for further evolution". #### Path This MSC restricts @@ -330,12 +332,12 @@ means performing an action on a URI with no `action` in the query. The routing query (`via=`) indicates servers that are likely involved in the room (see also [the feature of matrix.to](https://matrix.org/docs/spec/appendices#routing)). -It is proposed to use the routing query to be used not only for resolving +In the meantime, it is proposed that this routing query be used not only with room ids in a public federation but also when a URI refers to a resource in a non-public Matrix network (see the question about closed federations in "Discussion points and tradeoffs"). Note that `authority` in the definition -above is only a part of the grammar as defined in the respective section; -it is not proposed here to generate or read the authority part of the URI. +above is only a part of the _query parameter_ grammar; it is not proposed here +to generate or interpret the _authority part_ of the URI. Clients MAY introduce and recognise custom query items, according to the following rules: @@ -352,7 +354,7 @@ the following rules: action should be respected as much as possible. Client authors SHOULD strive for consistent experience across their and 3rd party clients, anticipating that the same user may happen to have both their client and a 3rd party one. - + Client authors are strongly encouraged to standardise custom query elements that gain adoption by submitting an MSC defining them in a way compatible across the client ecosystem. @@ -497,33 +499,33 @@ extensions. Here are a few ideas: * Bring tangible semantics to the authority part. The main purpose of the authority part, [as per RFC 3986](https://tools.ietf.org/html/rfc3986#section-3.2), - is to identify the authority governing the namespace for the rest - of the URI. 
This MSC restates the RFC definitions for - [`host`](https://tools.ietf.org/html/rfc3986#section-3.2.2) and - [`port`](https://tools.ietf.org/html/rfc3986#section-3.2.3) but - doesn't go further, calling for a separate MSC that would define semantics of - the `host:port` pair. RFC 3986 also includes provisions for user - information but this MSC explicitly excludes them from the authority grammar, - on the grounds that user information has historically been a vector of - widespread abuse. If providing a user identity via the authority part is - found to be of value (with alleviated security concerns) in some case, - a separate MSC should both re-add it to the grammar of the authority part - and define how to construct, parse, and use it. + is to identify the entity governing the namespace for the rest of the URI. + The current MSC rules out the userinfo component but leaves it to a separate + MSC to define semantics of the remaining `host[:port]` piece. Importantly, future MSCs are advised against using the authority part for _routing over federation_ (the case for `via=` query items), as it would be against the spirit of RFC 3986. The authority part can be used in cases when a given Matrix entity is only available from certain servers (the case of - closed federations or non-federating servers). A request to the server - resolved from the authority part means that the client should be, as the name - implies, _authorised_ by the authority server to access the requested - resource. That, in turn, implies that the resource is either available - to guests on the authority server, or the end user must be authenticated - (and their access rights checked) by (or on behalf of) _that server_ in order - to access the resource. While being a part of the original proposal, + closed federations or non-federating servers).
+ + While being a part of the original proposal in an attempt to address + [the respective case](https://github.com/matrix-org/matrix-doc/issues/2309), the definition of the authority semantics has been dropped as a result of - [the discussion](https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282) - (also referred to in the previous section). + [the subsequent discussion](https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282). + A further MSC may approach the same case (and/or others) and define the + meaning of the authority part (either on the client- or even on + the server-side - provided that using Matrix URIs on the server-side brings + some other value along the way). This might not necessarily be actual DNS + hostnames even - one (quite far-fetched for now) idea to entertain might be + introducing some decentralised system of "network names" in order to equalise + "public" and "non-public" federations. + + Along the same lines, if providing any part of user credentials via + the authority part is found to be of considerable value in some case, + a separate MSC could both reinstate it in the grammar and define how + to construct, parse, and use it - provided that the same MSC addresses + the security concerns associated with such URIs. * One could conceive a URI mapping of avatars in the form of `matrix:u/uid:matrix.org/avatar/room:matrix.org` @@ -583,10 +585,9 @@ further discussion should happen in GitHub comments. they know the URI is not going to be used outside this federation. Clients can facilitate that by having an option to always add or omit the authority part in generated URIs for a given user account.~~ - Use `via=` in order to point to a homeserver in the closed federation. 
- The authority part may eventually be used for that but further discussion - is needed on how clients should support it without compromising privacy - (see [the discussion on the issue](https://github.com/matrix-org/matrix-doc/pull/2312#discussion_r348960282)). + As of now, use `via=` in order to point to a homeserver in the closed + federation. The authority part may eventually be used for that (or for some + other case - see the previous section). ### Alternatives @@ -737,8 +738,8 @@ is taken to not make essential parts of the URI omittable to avoid even accidental misrepresentation of a local resource for a remote one in Matrix and vice versa. -The MSC intentionally doesn't support conveying any kind of user -information in URIs. +As mentioned in the authority part section, the MSC intentionally doesn't +support conveying any kind of user information in URIs. The MSC strives to not be prescriptive in treating URIs except the `action` query parameter. Actions without user confirmation may lead to unintended From 67635b0dd7f947c3d3d10da2ff526c83b29ddebc Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Tue, 30 Mar 2021 19:41:49 +0200 Subject: [PATCH 27/29] Be even clearer about user confirmations --- proposals/2312-matrix-uri.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 5e339b0a..305ce1b1 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -444,10 +444,10 @@ performed on behalf (using the access token) of the user `@me:example.org`: | URI class/example | Interactive operation | Non-interactive operation / Involved CS API | | ----------------- | --------------------- | --------------------------------------------- | | User Id (no `action` in URI):
`matrix:u/her:example.org` | _Outside the room context_: show user profile
_Inside the room context:_ mention the user in the current room (client-local operation) | No default non-interactive operation
`GET /profile/@her:example.org/displayname`
`GET /profile/@her:example.org/avatar_url` | -| User Id (`action=chat`):
`matrix:u/her:example.org?action=chat` | Open a direct chat with the user (see the next column on identifying the room) | If [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) are supported: `GET /_matrix/client/r0/user/@me:example.org/dm?involves=@her:example.org`
Without canonical direct chats:
1. `GET /user/@me:example.org/account_data/m.direct`
2. Find the room id for `@her:example.org` in the event content
3. if found, return this room id; if not, `POST /createRoom` with `"is_direct": true` and return id of the created room | +| User Id (`action=chat`):
`matrix:u/her:example.org?action=chat` | 1. Confirm with the local user if needed (see "Query")
2. Open the room as defined in the next column | If [canonical direct chats](https://github.com/matrix-org/matrix-doc/pull/2199) are supported: `GET /_matrix/client/r0/user/@me:example.org/dm?involves=@her:example.org`
Without canonical direct chats:
1. `GET /user/@me:example.org/account_data/m.direct`
2. Find the room id for `@her:example.org` in the event content
3. if found, return this room id; if not, `POST /createRoom` with `"is_direct": true` and return id of the created room | | Room (no `action` in URI):
`matrix:roomid/rid:example.org`
`matrix:r/us:example.org` | Attempt to "open" (usually: display the timeline at the latest or last remembered position) the room | No default non-interactive operation
API: Find the respective room in the local `/sync` cache or
`GET /rooms/!rid:example.org/...`
| -| Room (`action=join`):
`matrix:roomid/rid:example.org?action=join&via=example2.org`
`matrix:r/us:example.org?action=join` | Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`
`POST /join/#us:example.org` | -| Event:
`matrix:r/us:example.org/e/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (HOW?)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user to join the room and try to open the event again | Non-interactive operation: return event or event content, depending on context
API: find the event in the local `/sync` cache or
`GET /directory/room/%23us:example.org` (to resolve alias to id)
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org`
| +| Room (`action=join`):
`matrix:roomid/rid:example.org?action=join&via=example2.org`
`matrix:r/us:example.org?action=join` | 1. Confirm with the local user if needed (see "Query")
2. Attempt to join the room | `POST /join/!rid:example.org?server_name=example2.org`
`POST /join/#us:example.org` | +| Event:
`matrix:r/us:example.org/e/lol823y4bcp3qo4`
`matrix:roomid/rid:example.org/event/lol823y4bcp3qo4?via=example2.org` | 1. For room aliases, resolve an alias to a room id (see the next column)
2. Attempt to retrieve (see the next column) and display the event;
3. If the event could not be retrieved due to access denial and the current user is not a member of the room, the client MAY offer the user to join the room and try to open the event again | Non-interactive operation: return event or event content, depending on context
API: find the event in the local `/sync` cache or
`GET /directory/room/%23us:example.org` (to resolve alias to id)
`GET /rooms/!rid:example.org/event/lol823y4bcp3qo4?server_name=example2.org`
| #### URI construction algorithm @@ -743,8 +743,10 @@ support conveying any kind of user information in URIs. The MSC strives to not be prescriptive in treating URIs except the `action` query parameter. Actions without user confirmation may lead to unintended -leaks of certain metadata so this MSC recommends asking for a user consent - -recognising that not all clients are in position for that. +leaks of certain metadata and/or changes in the account state with respect +to Matrix. To reiterate, clients SHOULD ask for a user consent if/when they +can unless applying the action doesn't lead to sending persistent (message +or state) events on user's behalf. ## Conclusion From d27ea07bcb9131cd87f4687158f8408bcb2ac981 Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Tue, 30 Mar 2021 19:43:52 +0200 Subject: [PATCH 28/29] Minor brush-ups and cleanup --- proposals/2312-matrix-uri.md | 35 +++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 305ce1b1..77086ff6 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -259,10 +259,10 @@ the room (see [MSC2695](https://github.com/matrix-org/matrix-doc/pull/2695) and change this). Further MSCs may introduce navigation to more top-level as well as -non-top-level objects; see "Further evolution" for some ideas. These new -proposals SHOULD follow the generic grammar laid out above, adding new `type` -and `nonid-segment` specifiers and/or allowing them in other levels, rather -than introduce a new grammar. It is recommended to only use abbreviated +non-top-level objects; see "Ideas for further evolution" to get inspired. These +new proposals SHOULD follow the generic grammar laid out above, adding new +`type` and `nonid-segment` specifiers and/or allowing them in other levels, +rather than introduce a new grammar. 
It is recommended to only use abbreviated single-letter specifiers if they are expected to be user visible and convenient for type-in; if a URI for a given resource type is usually generated (e.g. because the corresponding identifier is not human-friendly), it's @@ -484,7 +484,7 @@ liberty similar to that of matrix.to. ## Discussion and non-normative statements -### Further evolution +### Ideas for further evolution This MSC is obviously just the first step, keeping the door open for extensions. Here are a few ideas: @@ -533,7 +533,12 @@ extensions. Here are a few ideas: * As described in "Alternatives", a synonymous system can be introduced that uses Matrix identifiers with sigils by adding another path prefix (e.g., - `matrix:id/%23matrix:matrix.org`). + `matrix:id/%23matrix:matrix.org`). However, such MSC would have to address + the concerns of possible confusion arising from having two similar but + distinct notations. + +* Interoperability of Matrix URIs with + [Linked Data](https://en.wikipedia.org/wiki/Linked_data). ### Past discussion points and tradeoffs @@ -574,9 +579,6 @@ further discussion should happen in GitHub comments. to offload that to the server. For that reason fragments, if/when ever employed in Matrix, only should be used to pinpoint a position within events and for similar strictly client-side operations. -1. _Interoperability with - [Linked Data](https://en.wikipedia.org/wiki/Linked_data)_ is out of - scope of this MSC but worth being considered separately. 1. _How does this MSC work with closed federations?_ ~~If you need to communicate a URI to the bigger world where you cannot expect the consumer to know in advance which federation they should use - @@ -651,12 +653,11 @@ the type signifier?). Yet another alternative considered was to go "full REST" and structure URLs in a more traditional way with serverparts coming first, followed by type grouping (sic - not specifiers), and then by localparts, -i.e. 
`matrix://example.org/rooms/roomalias`. This is even more -difficult to comprehend for a Matrix user than the previous alternative -and besides it conflates the notion of an authority server with -that of a namespace (quite confusingly, `example.org` above is -the _domain name_ - aka server part - of an alias, not a _host name_ -of a hypothetical homeserver that should be used to resolve the URI). +i.e. `matrix://example.org/rooms/roomalias`. This is even more difficult +to comprehend for a Matrix user than the previous alternative and besides it +conflates the notion of an authority server with that of a namespace +discriminator: clients would not connect to `example.org` to resolve the alias +above, they would still connect to their own homeserver. #### Minimal syntax @@ -667,6 +668,8 @@ URIs even though they look like ones: most URI parsers won't handle them correctly. As laid out in the beginning of this proposal, Matrix URIs are not striving to preempt Matrix identifiers; instead of trying to produce an equally readable string, one should just use identifiers where they work. +Why Matrix identifiers look the way they look is way out of the MSC scope +to discuss here. #### Minimal syntax based on the path component and percent-encoding @@ -696,7 +699,7 @@ Putting the whole id to the URI fragment (`matrix:#id_with_sigil` or, following on the `matrix.to` tradition, `matrix:#/id_with_sigil` for readability) allows using `#` without encoding on many URI parsers. It is still not fully RFC-compliant and rules out using URIs by homeservers -(see also "Past discussion points"). +(see also "Past discussion points" on using fragments to address events). 
Regardless of the placement (the fragment or the path), one more consideration is that the character space for sigils is extremely limited and From 89355034700bc79a7789a01e45b8c1cfe12df8a4 Mon Sep 17 00:00:00 2001 From: Alexey Rusakov Date: Sun, 4 Apr 2021 21:40:18 +0200 Subject: [PATCH 29/29] Fix a left-over spotted in the last moment --- proposals/2312-matrix-uri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2312-matrix-uri.md b/proposals/2312-matrix-uri.md index 77086ff6..45bb2187 100644 --- a/proposals/2312-matrix-uri.md +++ b/proposals/2312-matrix-uri.md @@ -756,7 +756,7 @@ or state) events on user's behalf. A dedicated URI scheme is well overdue for Matrix. Many other networks already have got one for themselves, benefiting both in terms of -branding (compare `matrix:room/weruletheworld:example.org` vs. +branding (compare `matrix:r/weruletheworld:example.org` vs. `#weruletheworld:example.org` from the standpoint of someone who hasn't been to Matrix) and interoperability (`matrix.to` requires opening a browser while clicking a `tg:` link dumped to the terminal