From 1688f12d083e2163752fa77f2978288b7ea2b59c Mon Sep 17 00:00:00 2001
From: Richard van der Hoff
Date: Tue, 1 Sep 2020 18:19:56 +0100
Subject: [PATCH 1/5] Proposal for a common identifier grammar

---
 proposals/2758-textual-id-grammar.md | 54 ++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 proposals/2758-textual-id-grammar.md

diff --git a/proposals/2758-textual-id-grammar.md b/proposals/2758-textual-id-grammar.md
new file mode 100644
index 00000000..35c43f65
--- /dev/null
+++ b/proposals/2758-textual-id-grammar.md
@@ -0,0 +1,54 @@
+# MSC2758: Common grammar for textual identifiers
+
+The matrix specification uses textual identifiers for a wide range of
+concepts. Examples include "event types" and "room versions".
+
+In the past, these identifiers have often lacked a formal grammar, leaving
+servers and clients to make assumptions about questions such as which
+characters are permitted, minimum and maximum lengths, etc.
+
+This proposal suggests a common grammar which can be used as a basis for
+*future* identifier types, to reduce the work involved in future specification
+work.
+
+No attempt is made here to bring existing identifiers into line; however
+examples of identifiers which might have benefitted from such a grammar in the
+past include:
+
+ * [`capabilities`](https://matrix.org/docs/spec/client_server/r0.6.0#get-matrix-client-r0-capabilities)
+   identifiers.
+ * authentication types for the [User-Interactive Authentication mechanism](https://matrix.org/docs/spec/client_server/r0.6.0#user-interactive-authentication-api).
+ * login types for [`/_matrix/client/r0/login`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-login).
+ * event types
+ * [`m.room.message` `msgtypes`](https://matrix.org/docs/spec/client_server/r0.6.0#m-room-message-msgtypes)
+ * `app_id` for [`POST /_matrix/client/r0/pushers/set`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-pushers-set).
+ * `rule_id`s, `actions` and `tweak`s for [push rules](https://matrix.org/docs/spec/client_server/r0.6.0#push-rules).
+ * [E2E messaging algoritm names](https://matrix.org/docs/spec/client_server/r0.6.0#messaging-algorithm-names).
+
+## Proposal
+
+We define a "common namespaced identifier grammar". This can then be referenced
+by other parts of the grammar, in much the same way as [Unpadded
+Base64](https://matrix.org/docs/spec/appendices#unpadded-base64) is defined
+today.
+
+The grammar is defined as follows:
+
+ * The length of an identifier may not be less than one character or more than
+   255 characters.
+ * Identifiers must start with one of the characters `[a-z]`, and be entirely
+   composed of the characters `[a-z]`, `[0-9]`, `-`, `_` and `.`.
+ * Identifiers starting with the characters `m.` are reserved for use by the
+   formal matrix specification.
+ * Implementations wishing to implement unspecified identifiers should follow
+   the Java Package Naming conventions of using a reversed domain name, for
+   example `com.example.identifier`.
+
+This grammar is indented for use entirely by internal identifiers, and *not*
+for user-visible strings.
+
+### Rationale
+
+ * Avoiding non-ascii characters sidesteps any issues with homoglyphs or
+   altenative encodings of the same characters.
+ * Avoiding upper-case character sidesteps any concerns over case-sensistivity.
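Review note on the grammar introduced by PATCH 1/5: the three structural rules (length between 1 and 255 characters, first character in `[a-z]`, remaining characters drawn from `[a-z]`, `[0-9]`, `-`, `_` and `.`) could be checked roughly as follows. This is only a minimal sketch in Python; the names `MAX_LENGTH`, `is_valid_identifier` and `is_reserved` are hypothetical and are not defined anywhere in the proposal.

```python
import re

# Hypothetical constant: the proposal caps identifiers at 255 characters.
MAX_LENGTH = 255

# First character must be in [a-z]; the rest may be [a-z], [0-9], '-', '_' or '.'.
_IDENTIFIER_RE = re.compile(r"[a-z][a-z0-9._-]*")


def is_valid_identifier(identifier: str) -> bool:
    """Return True if `identifier` satisfies the length and character rules."""
    if not 1 <= len(identifier) <= MAX_LENGTH:
        return False
    # fullmatch anchors at both ends, so a trailing newline is rejected too.
    return _IDENTIFIER_RE.fullmatch(identifier) is not None


def is_reserved(identifier: str) -> bool:
    """Return True for identifiers in the `m.` namespace reserved for the spec."""
    return identifier.startswith("m.")
```

Because the permitted alphabet is pure ASCII, counting characters and counting bytes give the same length here. The reversed-domain-name recommendation is a naming convention rather than a syntactic rule, so the sketch does not try to enforce it.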
From b3f40fb553ca4a6e285ccd28be0c52c6b2463be3 Mon Sep 17 00:00:00 2001
From: Richard van der Hoff
Date: Tue, 1 Sep 2020 18:23:46 +0100
Subject: [PATCH 2/5] fix a couple of typos

---
 proposals/2758-textual-id-grammar.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/2758-textual-id-grammar.md b/proposals/2758-textual-id-grammar.md
index 35c43f65..7f31e554 100644
--- a/proposals/2758-textual-id-grammar.md
+++ b/proposals/2758-textual-id-grammar.md
@@ -22,7 +22,7 @@ past include:
  * event types
  * [`m.room.message` `msgtypes`](https://matrix.org/docs/spec/client_server/r0.6.0#m-room-message-msgtypes)
  * `app_id` for [`POST /_matrix/client/r0/pushers/set`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-pushers-set).
- * `rule_id`s, `actions` and `tweak`s for [push rules](https://matrix.org/docs/spec/client_server/r0.6.0#push-rules).
+ * `rule_ids`, `actions` and `tweaks` for [push rules](https://matrix.org/docs/spec/client_server/r0.6.0#push-rules).
  * [E2E messaging algoritm names](https://matrix.org/docs/spec/client_server/r0.6.0#messaging-algorithm-names).
 
 ## Proposal
@@ -44,7 +44,7 @@ The grammar is defined as follows:
    the Java Package Naming conventions of using a reversed domain name, for
    example `com.example.identifier`.
 
-This grammar is indented for use entirely by internal identifiers, and *not*
+This grammar is intended for use entirely by internal identifiers, and *not*
 for user-visible strings.
 
 ### Rationale

From 49e4e61e805ddc5ef1905e214d7e2b3de1d2d487 Mon Sep 17 00:00:00 2001
From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Date: Tue, 1 Sep 2020 22:34:31 +0100
Subject: [PATCH 3/5] fix typos

Co-authored-by: Matthew Hodgson
---
 proposals/2758-textual-id-grammar.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/2758-textual-id-grammar.md b/proposals/2758-textual-id-grammar.md
index 7f31e554..8fa702a1 100644
--- a/proposals/2758-textual-id-grammar.md
+++ b/proposals/2758-textual-id-grammar.md
@@ -23,7 +23,7 @@ past include:
  * [`m.room.message` `msgtypes`](https://matrix.org/docs/spec/client_server/r0.6.0#m-room-message-msgtypes)
  * `app_id` for [`POST /_matrix/client/r0/pushers/set`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-pushers-set).
  * `rule_ids`, `actions` and `tweaks` for [push rules](https://matrix.org/docs/spec/client_server/r0.6.0#push-rules).
- * [E2E messaging algoritm names](https://matrix.org/docs/spec/client_server/r0.6.0#messaging-algorithm-names).
+ * [E2E messaging algorithm names](https://matrix.org/docs/spec/client_server/r0.6.0#messaging-algorithm-names).
 
 ## Proposal
 
@@ -51,4 +51,4 @@ for user-visible strings.
 
  * Avoiding non-ascii characters sidesteps any issues with homoglyphs or
    altenative encodings of the same characters.
- * Avoiding upper-case character sidesteps any concerns over case-sensistivity.
+ * Avoiding upper-case character sidesteps any concerns over case-sensitivity.
From 998dec2980ad9637d197cff127f6fb8472864238 Mon Sep 17 00:00:00 2001
From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Date: Wed, 2 Sep 2020 17:50:15 +0100
Subject: [PATCH 4/5] Update proposals/2758-textual-id-grammar.md

---
 proposals/2758-textual-id-grammar.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/proposals/2758-textual-id-grammar.md b/proposals/2758-textual-id-grammar.md
index 8fa702a1..4c7bc699 100644
--- a/proposals/2758-textual-id-grammar.md
+++ b/proposals/2758-textual-id-grammar.md
@@ -41,8 +41,10 @@ The grammar is defined as follows:
  * Identifiers starting with the characters `m.` are reserved for use by the
    formal matrix specification.
  * Implementations wishing to implement unspecified identifiers should follow
-   the Java Package Naming conventions of using a reversed domain name, for
-   example `com.example.identifier`.
+   the Java Package Naming convention of starting with a reversed domain
+   name (with a dot after the domain name part). For example, for the
+   organisation `example.com`, a valid identifier would be
+   `com.example.identifier`.
 
 This grammar is intended for use entirely by internal identifiers, and *not*
 for user-visible strings.

From 49ce93f3e2e24bf37eeb88c560ba0d046e7a8453 Mon Sep 17 00:00:00 2001
From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Date: Tue, 20 Oct 2020 14:41:37 +0100
Subject: [PATCH 5/5] Update 2758-textual-id-grammar.md

---
 proposals/2758-textual-id-grammar.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/2758-textual-id-grammar.md b/proposals/2758-textual-id-grammar.md
index 4c7bc699..626f84ae 100644
--- a/proposals/2758-textual-id-grammar.md
+++ b/proposals/2758-textual-id-grammar.md
@@ -34,8 +34,8 @@ today.
 
 The grammar is defined as follows:
 
- * The length of an identifier may not be less than one character or more than
-   255 characters.
+ * An identifier may not be less than one character or more than 255 characters
+   in length.
  * Identifiers must start with one of the characters `[a-z]`, and be entirely
    composed of the characters `[a-z]`, `[0-9]`, `-`, `_` and `.`.
  * Identifiers starting with the characters `m.` are reserved for use by the