|
|
@@ -1,4 +1,4 @@
|
|
|
|
# Proposal for mandating lowercasing when processing e-mail address localparts
|
|
|
|
# Proposal for mandating case folding when processing e-mail address localparts
|
|
|
|
|
|
|
|
|
|
|
|
[RFC822](https://tools.ietf.org/html/rfc822#section-3.4.7) mandates that
|
|
|
|
[RFC822](https://tools.ietf.org/html/rfc822#section-3.4.7) mandates that
|
|
|
|
localparts in e-mail addresses must be processed with the original case
|
|
|
|
localparts in e-mail addresses must be processed with the original case
|
|
|
@@ -22,8 +22,13 @@ Sydent.
|
|
|
|
|
|
|
|
|
|
|
|
This proposal suggests changing the specification of the e-mail 3PID type in
|
|
|
|
This proposal suggests changing the specification of the e-mail 3PID type in
|
|
|
|
[the Matrix spec appendices](https://matrix.org/docs/spec/appendices#pid-types)
|
|
|
|
[the Matrix spec appendices](https://matrix.org/docs/spec/appendices#pid-types)
|
|
|
|
to mandate that any e-mail address must be entirely converted to lowercase
|
|
|
|
to mandate that, before any processing, e-mail address localparts must go
|
|
|
|
before any processing, instead of only its domain.
|
|
|
|
through a full case folding based on [the Unicode mapping
|
|
|
|
|
|
|
|
file](https://www.unicode.org/Public/8.0.0/ucd/CaseFolding.txt), on top of
|
|
|
|
|
|
|
|
having their domain lowercased.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This means that `Strauß@Example.com` must be considered as being the same e-mail
|
|
|
|
|
|
|
|
address as `strauss@example.com`.
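
As a minimal sketch of the rule above, assuming Python 3 (whose `str.casefold` applies full case folding), the normalisation could look like the following. The function name `normalise_email` is hypothetical and not part of the proposal.

```python
def normalise_email(address: str) -> str:
    """Case fold the localpart and lowercase the domain of an e-mail address.

    Hypothetical helper, for illustration only.
    """
    # Split on the last "@" so a localpart that itself contains "@" stays intact.
    localpart, sep, domain = address.rpartition("@")
    if not sep or not localpart or not domain:
        raise ValueError(f"not a valid e-mail address: {address!r}")
    # Full case folding for the localpart, plain lowercasing for the domain.
    return localpart.casefold() + "@" + domain.lower()


print(normalise_email("Strauß@Example.com"))  # strauss@example.com
```

Both `Strauß@Example.com` and `strauss@example.com` map to the same string, so they compare (and hash) as the same address.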
|
|
|
|
|
|
|
|
|
|
|
|
## Other considered solutions
|
|
|
|
## Other considered solutions
|
|
|
|
|
|
|
|
|
|
|
@@ -33,17 +38,24 @@ However, [MSC2134](https://github.com/matrix-org/matrix-doc/pull/2134) changes
|
|
|
|
this: because hashing functions are case sensitive, we need both clients and
|
|
|
|
this: because hashing functions are case sensitive, we need both clients and
|
|
|
|
identity servers to follow the same policy regarding case sensitivity.
|
|
|
|
identity servers to follow the same policy regarding case sensitivity.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
An initial version of this proposal proposed to mandate lowercasing e-mail
|
|
|
|
|
|
|
|
addresses instead of case folding them; however, it was pointed out that this
|
|
|
|
|
|
|
|
solution might not be the best and most future-proof one.
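
To illustrate the difference between the two approaches, here is a short sketch assuming Python 3. The two helper functions are hypothetical and only compare localparts.

```python
def same_under_lowercasing(a: str, b: str) -> bool:
    # Lowercasing leaves "ß" unchanged, so "Strauß" and "strauss" stay distinct.
    return a.lower() == b.lower()


def same_under_case_folding(a: str, b: str) -> bool:
    # Full case folding maps "ß" to "ss", so both spellings collapse together.
    return a.casefold() == b.casefold()


print(same_under_lowercasing("Strauß", "STRAUSS"))   # False
print(same_under_case_folding("Strauß", "STRAUSS"))  # True
```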
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Unicode normalisation was also looked at but judged unnecessary.
|
|
|
|
|
|
|
|
|
|
|
|
## Tradeoffs
|
|
|
|
## Tradeoffs
|
|
|
|
|
|
|
|
|
|
|
|
Implementing this MSC in identity servers and homeservers might require the
|
|
|
|
Implementing this MSC in identity servers and homeservers might require the
|
|
|
|
databases of existing instances to be updated in a large part to convert the
|
|
|
|
databases of existing instances to be updated in a large part to case fold the
|
|
|
|
email addresses of existing associations to lowercase, in order to avoid
|
|
|
|
email addresses of existing associations, in order to avoid conflicts. However,
|
|
|
|
conflicts. However, most of this update can usually be done by a single database
|
|
|
|
most of this update can usually be done by a background job running at startup,
|
|
|
|
query (or a background job running at startup), so the UX improvement outweighs
|
|
|
|
so the UX improvement outweighs this trouble.
|
|
|
|
this trouble.
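
One possible shape for such a background job is sketched below, assuming Python 3 and assuming the stored addresses can be read as a list of distinct strings. The function name and return format are hypothetical. The job only plans the work: it separates addresses that can be rewritten safely from those that collide once case folded.

```python
from collections import defaultdict


def plan_case_fold_migration(addresses):
    """Group stored addresses by their case-folded form (hypothetical helper).

    Returns (updates, conflicts):
      updates   maps an original address to its case-folded form
      conflicts maps a case-folded form to the originals that collide on it
    """
    groups = defaultdict(list)
    for address in addresses:
        localpart, _, domain = address.rpartition("@")
        groups[localpart.casefold() + "@" + domain.lower()].append(address)

    updates, conflicts = {}, {}
    for folded, originals in groups.items():
        if len(originals) > 1:
            # Several stored addresses fold to the same value: needs manual handling.
            conflicts[folded] = originals
        elif originals[0] != folded:
            # Exactly one stored address, and it is not yet in folded form.
            updates[originals[0]] = folded
    return updates, conflicts
```

Entries in `updates` can be rewritten directly, while entries in `conflicts` need the kind of handling described under "Potential issues".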
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Potential issues
|
|
|
|
## Potential issues
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Conflicts with existing associations
|
|
|
|
|
|
|
|
|
|
|
|
Some users might already have two different accounts associated with the same
|
|
|
|
Some users might already have two different accounts associated with the same
|
|
|
|
e-mail address but with different cases. This appears to happen in a small
|
|
|
|
e-mail address but with different cases. This appears to happen in a small
|
|
|
|
number of cases, however, and can be dealt with by the identity server's or the
|
|
|
|
number of cases, however, and can be dealt with by the identity server's or the
|
|
|
@@ -58,6 +70,29 @@ like:
|
|
|
|
3. inform the user of the deletion by sending them an email notice to the email
|
|
|
|
3. inform the user of the deletion by sending them an email notice to the email
|
|
|
|
address
|
|
|
|
address
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Storing and querying
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Most database engines don't support case folding, therefore querying all
|
|
|
|
|
|
|
|
e-mail addresses matching a case folded e-mail address might not be trivial,
|
|
|
|
|
|
|
|
e.g. an identity server querying all associations for `strauss@example.com` when
|
|
|
|
|
|
|
|
processing a `/lookup` request would be expected to also get associations for
|
|
|
|
|
|
|
|
`Strauß@Example.com`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
To address this issue, implementation maintainers are strongly encouraged to
|
|
|
|
|
|
|
|
make e-mail addresses go through a full case folding before storing them.
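
A minimal sketch of this recommendation, assuming Python 3 and an in-memory SQLite database: the address is case folded in application code both when storing and when querying, so the database only ever performs an exact match. The table, column and function names are hypothetical.

```python
import sqlite3


def fold(address: str) -> str:
    # Case fold the localpart and lowercase the domain (hypothetical helper).
    localpart, _, domain = address.rpartition("@")
    return localpart.casefold() + "@" + domain.lower()


def store(conn, address, mxid):
    # Only the case-folded form is ever written to the database.
    conn.execute(
        "INSERT INTO associations (address, mxid) VALUES (?, ?)",
        (fold(address), mxid),
    )


def lookup(conn, address):
    # Fold the queried address the same way, then do an exact match.
    rows = conn.execute(
        "SELECT mxid FROM associations WHERE address = ?", (fold(address),)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE associations (address TEXT NOT NULL, mxid TEXT NOT NULL)")
store(conn, "Strauß@Example.com", "@strauss:example.com")
print(lookup(conn, "strauss@example.com"))  # ['@strauss:example.com']
```

With this approach the database engine needs no case folding support of its own.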
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Implementing case folding
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The need for case folding in services on the Internet doesn't seem to be very
|
|
|
|
|
|
|
|
large currently (probably due to its young age), therefore there seem to be only
|
|
|
|
|
|
|
|
a few third-party implementation libraries out there. However,
|
|
|
|
|
|
|
|
[Go](https://godoc.org/golang.org/x/text/cases#Fold), [Python
|
|
|
|
|
|
|
|
2](https://docs.python.org/2/library/stringprep.html#stringprep.map_table_b3)
|
|
|
|
|
|
|
|
and [Python 3](https://docs.python.org/3/library/stdtypes.html#str.casefold)
|
|
|
|
|
|
|
|
support it natively, and [a third-party JavaScript
|
|
|
|
|
|
|
|
implementation](https://github.com/ar-nelson/foldcase) exists which, although
|
|
|
|
|
|
|
|
young, seems to be working.
|
|
|
|
|
|
|
|
|
|
|
|
## Footnotes
|
|
|
|
## Footnotes
|
|
|
|
|
|
|
|
|
|
|
|
[0]: This is specific to Sydent because of a bug it has where v1 lookups are
|
|
|
|
[0]: This is specific to Sydent because of a bug it has where v1 lookups are
|
|
|
|