From 5c29e03a62b0a689033b5dfa7542b3a1e5475241 Mon Sep 17 00:00:00 2001
From: Tulir Asokan
Date: Fri, 27 Jun 2025 20:46:37 +0200
Subject: [PATCH 1/6] Proposal to disallow non-compliant user IDs in rooms

---
 ...lightly-strict-user-id-grammar-in-rooms.md | 62 +++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 proposals/4303-slightly-strict-user-id-grammar-in-rooms.md

diff --git a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
new file mode 100644
index 000000000..344ff3880
--- /dev/null
+++ b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
@@ -0,0 +1,62 @@
+# MSC4303: Disallowing non-compliant user IDs in rooms
+
+The specification presently allows for a "historical" grammar for user IDs, described
+[here](https://spec.matrix.org/v1.15/appendices/#historical-user-ids). The use of this historical
+character set has allowed for "weird" and generally not-great user IDs to appear in the wild,
+causing a variety of issues for software projects.
+
+The historical grammar can be split into two classes:
+
+1. printable ASCII (U+0021 to U+007E), aka. historical-but-compliant
+2. arbitrary unicode, aka. non-compliant
+
+This proposal uses a future room version to prohibit the second class, while still allowing the first.
+Users with non-ASCII user IDs will not be able to join or participating in those room versions.
+
+## Proposal
+
+In a future room version, the historical-but-compliant grammar for user IDs is strictly enforced on
+all places a user ID can appear in an event.
+
+* The `sender` of all events.
+* The `state_key` of `m.room.member` events.
+* The `join_authorised_via_users_server` field in `m.room.member` events.
+* Keys of the `users` object in `m.room.power_levels`.
+
+User IDs in the new room version must only consist of ASCII characters between U+0021 and U+007E.
+
+By making this change in a room version, non-compliant user IDs are slowly removed from the public
+federation. Several rooms will naturally upgrade to a room version which includes this MSC's change
+after it becomes available in servers, and eventually the specification will update the
+[recommended default room version](https://spec.matrix.org/v1.15/rooms/#complete-list-of-room-versions)
+to include this MSC's change as well to affect new room creation.
+
+In short, this MSC starts a multi-year process to formally phase out non-compliant user IDs as new
+(and existing through upgrades) rooms use a room version which bans such user IDs.
+
+## Potential issues
+
+Unlike [MSC4044](https://github.com/matrix-org/matrix-spec-proposals/pull/4044), this MSC does
+not define all historical user IDs as non-compliant. Therefore, real users who registered historical
+user IDs should not be affected. Only accounts registered by modified servers will be prohibited
+from participating in new rooms.
+
+## Alternatives
+
+[MSC4044](https://github.com/matrix-org/matrix-spec-proposals/pull/4044)
+
+We don't do this, or hope that pseudo IDs fix the problem.
+
+## Security considerations
+
+This MSC improves the security posture of the protocol by reducing the likelihood of strange characters
+appearing in user IDs.
+
+## Unstable prefix
+
+This MSC can be implemented in a room version identified as `fi.mau.msc4303`. A future MSC will
+be required to incorporate this MSC into a stable room version.
+
+## Dependencies
+
+None.

From dacb6f002f50dc18de883249b0483f591cca9290 Mon Sep 17 00:00:00 2001
From: Tulir Asokan
Date: Fri, 27 Jun 2025 20:56:17 +0200
Subject: [PATCH 2/6] Mention that the spec already recommends dropping non-room-scoped data from non-compliant ids

---
 proposals/4303-slightly-strict-user-id-grammar-in-rooms.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
index 344ff3880..30d468a9c 100644
--- a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
+++ b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
@@ -39,7 +39,9 @@ In short, this MSC starts a multi-year process to formally phase out non-complia
 Unlike [MSC4044](https://github.com/matrix-org/matrix-spec-proposals/pull/4044), this MSC does
 not define all historical user IDs as non-compliant. Therefore, real users who registered historical
 user IDs should not be affected. Only accounts registered by modified servers will be prohibited
-from participating in new rooms.
+from participating in new rooms. Additionally, the existing spec already recommends servers drop
+traffic from such non-compliant user IDs outside of rooms (e.g. in to-device events), which means
+such users may already not be able to participate in encrypted rooms.
 
 ## Alternatives
 

From d97f844e7a5fac833763ea251521a2758ca53802 Mon Sep 17 00:00:00 2001
From: Tulir Asokan
Date: Fri, 27 Jun 2025 21:10:57 +0200
Subject: [PATCH 3/6] Also forbid empty localparts :(

---
 proposals/4303-slightly-strict-user-id-grammar-in-rooms.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
index 30d468a9c..21ce03384 100644
--- a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
+++ b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
@@ -8,7 +8,7 @@ causing a variety of issues for software projects.
 The historical grammar can be split into two classes:
 
 1. printable ASCII (U+0021 to U+007E), aka. historical-but-compliant
-2. arbitrary unicode, aka. non-compliant
+2. arbitrary unicode and blank localparts, aka. non-compliant
 
 This proposal uses a future room version to prohibit the second class, while still allowing the first.
 Users with non-ASCII user IDs will not be able to join or participating in those room versions.
@@ -23,7 +23,8 @@ all places a user ID can appear in an event.
 * The `join_authorised_via_users_server` field in `m.room.member` events.
 * Keys of the `users` object in `m.room.power_levels`.
 
-User IDs in the new room version must only consist of ASCII characters between U+0021 and U+007E.
+User IDs in the new room version must only consist of ASCII characters between U+0021 and U+007E
+and the localpart must not be empty.
 
 By making this change in a room version, non-compliant user IDs are slowly removed from the public
 federation. Several rooms will naturally upgrade to a room version which includes this MSC's change

From 46c28b7b41846a469beb16b928c9fa4d395e0d4d Mon Sep 17 00:00:00 2001
From: Tulir Asokan
Date: Fri, 27 Jun 2025 21:11:22 +0200
Subject: [PATCH 4/6] Clarify that room ID grammar is handled by other MSCs

---
 proposals/4303-slightly-strict-user-id-grammar-in-rooms.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
index 21ce03384..a9ae0d9a8 100644
--- a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
+++ b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
@@ -13,6 +13,9 @@ The historical grammar can be split into two classes:
 This proposal uses a future room version to prohibit the second class, while still allowing the first.
 Users with non-ASCII user IDs will not be able to join or participating in those room versions.
 
+This proposal does not cover room ID grammar, as that will be handled by another MSC,
+such as [MSC4051](https://github.com/matrix-org/matrix-spec-proposals/pull/4051).
+
 ## Proposal
 
 In a future room version, the historical-but-compliant grammar for user IDs is strictly enforced on

From 11cfb274c257b65fe5c54a8ba2f7fb1e33eb6756 Mon Sep 17 00:00:00 2001
From: Tulir Asokan
Date: Sat, 28 Jun 2025 09:39:18 +0200
Subject: [PATCH 5/6] Fix word

Co-authored-by: Hubert Chathi
---
 proposals/4303-slightly-strict-user-id-grammar-in-rooms.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
index a9ae0d9a8..e85c59bea 100644
--- a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
+++ b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
@@ -11,7 +11,7 @@ The historical grammar can be split into two classes:
 2. arbitrary unicode and blank localparts, aka. non-compliant
 
 This proposal uses a future room version to prohibit the second class, while still allowing the first.
-Users with non-ASCII user IDs will not be able to join or participating in those room versions.
+Users with non-ASCII user IDs will not be able to join or participate in those room versions.
 
 This proposal does not cover room ID grammar, as that will be handled by another MSC,
 such as [MSC4051](https://github.com/matrix-org/matrix-spec-proposals/pull/4051).

From 512566b03ff549140388264d0be7a9457b8f7a0e Mon Sep 17 00:00:00 2001
From: Tulir Asokan
Date: Tue, 8 Jul 2025 16:34:57 +0300
Subject: [PATCH 6/6] Specify base room version for unstable prefix

---
 proposals/4303-slightly-strict-user-id-grammar-in-rooms.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
index e85c59bea..925d90670 100644
--- a/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
+++ b/proposals/4303-slightly-strict-user-id-grammar-in-rooms.md
@@ -60,8 +60,8 @@ appearing in user IDs.
 
 ## Unstable prefix
 
-This MSC can be implemented in a room version identified as `fi.mau.msc4303`. A future MSC will
-be required to incorporate this MSC into a stable room version.
+This MSC can be implemented in a room version identified as `fi.mau.msc4303` building on top of
+room version 11. A future MSC will be required to incorporate this MSC into a stable room version.
 
 ## Dependencies
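
Reviewer illustration, not part of the patch series: the rule this series converges on (only ASCII characters from U+0021 to U+007E, a non-empty localpart, enforced in four places of an event) could be checked roughly as in the sketch below. The function names are hypothetical and do not come from the proposal or any homeserver. The sketch also assumes that a user ID starts with `@`, that the localpart ends at the first `:`, and that the server name after it is non-empty; the proposal text itself only states the character range and the non-empty localpart.

```python
from typing import Any, Iterator


def is_compliant_user_id(user_id: Any) -> bool:
    """Return True if user_id passes the checks sketched from the proposal."""
    # Assumption: user IDs are strings with the '@' sigil in front.
    if not isinstance(user_id, str) or not user_id.startswith("@"):
        return False
    # From the proposal: only ASCII characters U+0021 to U+007E inclusive.
    if any(not ("\u0021" <= ch <= "\u007e") for ch in user_id):
        return False
    # Assumption: the localpart ends at the first ':'; the rest is the server name.
    localpart, sep, server_name = user_id[1:].partition(":")
    # From the proposal: the localpart must not be empty.
    # Assumption: a missing or empty server name is rejected as well.
    return bool(localpart) and sep == ":" and bool(server_name)


def user_ids_in_event(event: dict) -> Iterator[Any]:
    """Yield the values found in the four places the proposal lists."""
    content = event.get("content")
    if not isinstance(content, dict):
        content = {}
    # The `sender` of all events.
    yield event.get("sender")
    if event.get("type") == "m.room.member":
        # The `state_key` of `m.room.member` events.
        yield event.get("state_key")
        # The optional `join_authorised_via_users_server` field.
        if "join_authorised_via_users_server" in content:
            yield content["join_authorised_via_users_server"]
    elif event.get("type") == "m.room.power_levels":
        # Keys of the `users` object.
        users = content.get("users")
        if isinstance(users, dict):
            yield from users


def event_passes_user_id_check(event: dict) -> bool:
    """True only if every user ID position in the event is compliant."""
    return all(is_compliant_user_id(uid) for uid in user_ids_in_event(event))
```

Under these assumptions, `@alice:example.org` is accepted, while an ID with an empty localpart such as `@:example.org` or one containing a non-ASCII character is rejected. Whether a failing event is rejected during authorisation or at some other stage is not stated in the patches, so the sketch deliberately stops at a boolean result.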