MSC4009: Expanding the Matrix ID grammar to enable E.164 IDs (#4009)

* E.164 Matrix IDs.

* Fix typo.

* small typo fixes

* Expand examples.

Another reasonable way around this is to use a prefix,
instead of character mapping.

The downsides are essentially the same.

* Updates from feedback.

* Fix typos.

Co-authored-by: Hubert Chathi <hubertc@matrix.org>

* Add note about re-using IDs.

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Hubert Chathi <hubertc@matrix.org>
pull/4031/head
Patrick Cloke 11 months ago committed by GitHub
parent a3778b3f82
commit 9466833900
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -0,0 +1,103 @@
# MSC4009: Expanding the Matrix ID grammar to enable E.164 IDs
[E.164](https://www.itu.int/rec/T-REC-E.164) is a set of international standards
for telephone numbering. This defines a phone number as a `+` followed by a country
code (e.g. `1` for the US, `44` for the UK) followed by a local phone number.
For example, a E.164 phone number would look like: `+15558675309`.
It is somewhat common[^1] for social networks or messaging applications to use phone
numbers as identifiers instead of relying on users remembering separate usernames
(and needing to tell other users how to find them).
[Matrix user identifiers](https://spec.matrix.org/v1.6/appendices/#user-identifiers)
are *almost* compatible with E.164:
> The `localpart` of a user ID is an opaque identifier for that user. It MUST NOT
> be empty, and MUST contain only the characters `a-z`, `0-9`, `.`, `_`, `=`, `-`,
> and `/`.
## Proposal
Add `+` to the list of allowed characters in a Matrix user identifier. This would
allow a full E.164 phone number as a user's username on a server, which is common
practice for many applications. This should hopefully ease bridging of those
services to Matrix or help them natively speak Matrix in the future. Users would
not need to learn a new Matrix ID, but can continue using their phone number, as
today.
Although E.164 IDs are meant to be globally unique they are still namespaced under
a domain under this proposal, e.g. `@+15558675309:example.com`, as the same user may
already be registered on multiple service providers.
## Potential issues
Homeservers and clients must already be
[more tolerant of Matrix user IDs](https://spec.matrix.org/v1.6/appendices/#historical-user-ids):
> Older versions of this specification were more tolerant of the characters
> permitted in user ID localparts. [...] clients and servers MUST accept user IDs
> with localparts from the expanded character set:
>
> `extended_user_id_char = %x21-39 / %x3B-7E ; all ASCII printing chars except :`
Thus, it is expected to not cause any issues, although clients will need to identify
that the `+` character is valid for a homeserver. This could be from the supported
Matrix versions which the homeserver advertises.
----
A user having two accounts with the same identifier on different services is not
ideal, but is not different than today where a user may log into both WhatsApp
and Signal with their phone number. This MSC does *not* attempt to help with mapping
of an E.164 identifier to an actual Matrix ID, that's best left to the current
[identity service `/lookup` endpoint](https://spec.matrix.org/v1.6/identity-service-api/#post_matrixidentityv2lookup)
or future discovery services.
----
If a service uses E.164 identifiers as Matrix IDs then it must be careful to avoid
leaking history when reassigning IDs to new users. This is related to [#234](https://github.com/matrix-org/matrix-spec/issues/234),
but only applies to the localpart of the Matrix ID, not the entire domain. The
solution remains the same: using portable identifiers ([MSC1228](https://github.com/matrix-org/matrix-spec-proposals/pull/1228)
or [MSC4014](https://github.com/matrix-org/matrix-spec-proposals/pull/4014)).
## Alternatives
The `+` could continue to be disallowed and left off the beginning of the Matrix
IDs. Note that Synapse reserves digit-only usernames for guest users, although this
seems to be an implementation detail and not mandated by the specification.
Another option would be to [map from other character sets](https://spec.matrix.org/v1.6/appendices/#mapping-from-other-character-sets)
or prefix the Matrix ID with something (e.g. `msisdn`).
This would generate a Matrix ID like `@=2b15558675309:example.com` or
`@msisdn_5558675309:example.com`, which would dramatically impact usability
for the users on homeservers which use phone numbers as identifiers.
----
Although E.164 limits the `+` character to the initial character there seems to
be no reason to limit that for Matrix identifiers.
## Security considerations
E.164 IDs are globally unique, but the proposed change only enforces per-homeserver
uniqueness. If a homeserver does not diligently check that the phone number belongs
to a user then this may allow additional spoofing attacks. The author does not consider
this to be much worse than today's situation:
1. Many current applications need to verify that a phone number truly is owned
by an account. Any current best practices should be followed by a service
taking advantage of this MSC.
2. It is already quite easy today to spoof a user's ID by registering the same
user localpart on a different homeserver. The same issue occurs with email or
other distributed system (Mastodon, etc.).
## Unstable prefix
N/A
## Dependencies
N/A
[^1]: E.g. Signal, WhatsApp, etc.
Loading…
Cancel
Save