From 4cb667ca27d66b2b83cb21dee9249b213a1f814b Mon Sep 17 00:00:00 2001
From: Travis Ralston
Date: Tue, 4 May 2021 09:34:41 -0600
Subject: [PATCH 1/4] Case fold instead of lowercase

Fixes https://github.com/matrix-org/matrix-doc/issues/3175
---
 content/appendices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/appendices.md b/content/appendices.md
index 9132dd6d..38bf6397 100644
--- a/content/appendices.md
+++ b/content/appendices.md
@@ -773,7 +773,7 @@ Represents E-Mail addresses. The `address` is the raw email address in
 other text such as real name, angle brackets or a mailto: prefix.
 
 In addition to lowercasing the domain component of an email address,
-implementations are expected to lowercase the localpart as described
+implementations are expected to case fold the localpart as described
 in [the unicode mapping file](https://www.unicode.org/Public/8.0.0/ucd/CaseFolding.txt)
 prior to any processing. For example, `Strauß@Example.com` must be
 considered to be `strauss@example.com` while processing the email

From 7a013a53e5fd111a6ea663837975c348eaf64603 Mon Sep 17 00:00:00 2001
From: Travis Ralston
Date: Tue, 4 May 2021 09:35:54 -0600
Subject: [PATCH 2/4] Changelogs

---
 changelogs/identity_service/newsfragments/3167.clarification | 2 +-
 changelogs/identity_service/newsfragments/3176.clarification | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)
 create mode 100644 changelogs/identity_service/newsfragments/3176.clarification

diff --git a/changelogs/identity_service/newsfragments/3167.clarification b/changelogs/identity_service/newsfragments/3167.clarification
index cc70fd71..9f68bbf2 100644
--- a/changelogs/identity_service/newsfragments/3167.clarification
+++ b/changelogs/identity_service/newsfragments/3167.clarification
@@ -1 +1 @@
-Clarify that some identifiers may be lowercase prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
\ No newline at end of file
+Clarify that some identifiers may be case-folded to lowercase prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
\ No newline at end of file
diff --git a/changelogs/identity_service/newsfragments/3176.clarification b/changelogs/identity_service/newsfragments/3176.clarification
new file mode 100644
index 00000000..9f68bbf2
--- /dev/null
+++ b/changelogs/identity_service/newsfragments/3176.clarification
@@ -0,0 +1 @@
+Clarify that some identifiers may be case-folded to lowercase prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
\ No newline at end of file

From 8b40972872a3baeb8031f15da086463ff1c7975b Mon Sep 17 00:00:00 2001
From: Travis Ralston
Date: Tue, 18 May 2021 09:58:19 -0600
Subject: [PATCH 3/4] iterate

---
 .../identity_service/newsfragments/3176.clarification | 2 +-
 content/appendices.md | 10 +++++-----
 proposals/2265-email-lowercase.md | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/changelogs/identity_service/newsfragments/3176.clarification b/changelogs/identity_service/newsfragments/3176.clarification
index 9f68bbf2..50fb7cd1 100644
--- a/changelogs/identity_service/newsfragments/3176.clarification
+++ b/changelogs/identity_service/newsfragments/3176.clarification
@@ -1 +1 @@
-Clarify that some identifiers may be case-folded to lowercase prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
\ No newline at end of file
+Clarify that some identifiers must be case folded prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
\ No newline at end of file
diff --git a/content/appendices.md b/content/appendices.md
index 67c41f98..f3be0cc6 100644
--- a/content/appendices.md
+++ b/content/appendices.md
@@ -756,11 +756,11 @@ Represents E-Mail addresses. The `address` is the raw email address in
 other text such as real name, angle brackets or a mailto: prefix.
 
 In addition to lowercasing the domain component of an email address,
-implementations are expected to case fold the localpart as described
-in [the unicode mapping file](https://www.unicode.org/Public/8.0.0/ucd/CaseFolding.txt)
-prior to any processing. For example, `Strauß@Example.com` must be
-considered to be `strauss@example.com` while processing the email
-address.
+implementations are expected to apply the unicode case-folding algorithm
+as described under "Caseless Matching" in
+[chapter 5 of the unicode standard](https://www.unicode.org/versions/Unicode13.0.0/ch05.pdf#G21790).
+For example, `Strauß@Example.com` must be considered to be `strauss@example.com`
+while processing the email address.
 
 ### PSTN Phone numbers
 
diff --git a/proposals/2265-email-lowercase.md b/proposals/2265-email-lowercase.md
index 5a1db682..e4fe5313 100644
--- a/proposals/2265-email-lowercase.md
+++ b/proposals/2265-email-lowercase.md
@@ -23,8 +23,8 @@ Sydent.
 This proposal suggests changing the specification of the e-mail 3PID type in
 [the Matrix spec appendices](https://matrix.org/docs/spec/appendices#pid-types)
 to mandate that, before any processing, e-mail addresses must go through a full
-case folding based on [the unicode mapping
-file](https://www.unicode.org/Public/8.0.0/ucd/CaseFolding.txt), on top of
+case folding as described under "Caseless Matching" in
+[chapter 5 of the unicode standard](https://www.unicode.org/versions/Unicode13.0.0/ch05.pdf#G21790), on top of
 having their domain lowercased.
 
 This means that `Strauß@Example.com` must be considered as being the same e-mail

From 81da7ba4800bfab25b5e2994b105f3f322f5d194 Mon Sep 17 00:00:00 2001
From: Travis Ralston
Date: Wed, 26 May 2021 11:04:50 -0600
Subject: [PATCH 4/4] Update 3167.clarification

---
 changelogs/identity_service/newsfragments/3167.clarification | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/changelogs/identity_service/newsfragments/3167.clarification b/changelogs/identity_service/newsfragments/3167.clarification
index 9f68bbf2..16ed8dcd 100644
--- a/changelogs/identity_service/newsfragments/3167.clarification
+++ b/changelogs/identity_service/newsfragments/3167.clarification
@@ -1 +1 @@
-Clarify that some identifiers may be case-folded to lowercase prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
\ No newline at end of file
+Clarify that some identifiers must be case folded prior to processing, as per [MSC2265](https://github.com/matrix-org/matrix-doc/pull/2265).
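
Reviewer note: the rule this series settles on (lowercase the domain, case fold the localpart) can be illustrated with a short Python sketch. This is an illustration only and not part of the patch or of any Matrix implementation; the function name `normalise_email` is hypothetical, and it assumes Python's built-in `str.casefold()`, which performs full Unicode case folding, is an acceptable stand-in for the "Caseless Matching" procedure the spec text points to.

```python
def normalise_email(address: str) -> str:
    """Hypothetical helper: case fold the localpart, lowercase the domain."""
    # Split on the last "@" so that any "@" inside a quoted localpart stays there.
    localpart, sep, domain = address.rpartition("@")
    if not sep or not localpart or not domain:
        raise ValueError(f"not a user@domain address: {address!r}")
    # casefold() maps "ß" to "ss"; lower() would leave "ß" unchanged.
    return f"{localpart.casefold()}@{domain.lower()}"


if __name__ == "__main__":
    print(normalise_email("Strauß@Example.com"))  # strauss@example.com
    print("Strauß".lower())                       # strauß (why lowercasing is not enough)
```

The second print shows the motivation for the wording change from "lowercase" to "case fold": plain lowercasing would treat `Strauß@example.com` and `strauss@example.com` as different addresses.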