MSC4124: Simple server authorization
This MSC proposes simple authorization rules that consider the origin server of a given event, with the aim of replacing m.room.server_acl.
This is a compromise MSC based on MSC4099 that tries to use concepts even more in line with current authorization events.
This MSC was also created in reaction to MSC2870, which describes itself as a stop gap to cover what that MSC describes as shortcomings of m.room.server_acl. We also agree that MSC2870 is a stop gap and that m.room.server_acl has severe shortcomings, but we take the view that after 4 years of proposed stop-gapping, there has been enough time to introduce a more complete solution.
Related issues:
Context
NOTE: This context was added retroactively
Server Access Control Lists and their associated event, m.room.server_acl, are intended to control which servers can interact with a Matrix room.
Problem: ambient room access
Overwhelmingly, server ACLs are used in the federation as a reactive measure against servers that are known to proliferate abuse. These servers are often added to the deny list, and it is the author's subjective understanding that the majority of public Matrix rooms use an allow list of ["*"]. For any room with an allow literal of ["*"], servers are given unrestricted authority to interact with the room, even to poison the DAG, before a room administrator is even aware of the attacking server's existence[1]. As the room admin is unaware of the existence of the attacking server, they or their tooling will be unable to research the reputation of the server and determine whether to deny it until an attack is already underway, which is too late.
This is why we are introducing the m.server.knock_rule later in this proposal. DO NOT assume this is comparable to membership knocking until you have read the proposal.
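To make the ambient access problem concrete, the following is a rough Python sketch of how an existing m.room.server_acl event is evaluated against a server name. It is an illustration under assumptions, not a reference implementation: the helper names glob_to_regex and acl_allows are hypothetical, and the check for IP literals (allow_ip_literals) is deliberately left out. The point it shows is that with an allow list of ["*"], a server nobody has ever heard of passes the check.

```python
import re


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate an ACL glob ('*' = any run of characters, '?' = exactly one)."""
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def acl_allows(server_name: str, acl_content: dict) -> bool:
    """Deny entries win; otherwise the server must match an allow entry.

    Hypothetical helper: the allow_ip_literals check is omitted for brevity.
    """
    deny = acl_content.get("deny", [])
    allow = acl_content.get("allow", [])
    if any(glob_to_regex(g).fullmatch(server_name) for g in deny):
        return False
    return any(glob_to_regex(g).fullmatch(server_name) for g in allow)


# A typical reactive ACL: everyone is allowed except servers already known to be bad.
acl = {"allow": ["*"], "deny": ["known-bad.example"]}
print(acl_allows("never-seen-before.example", acl))  # unknown server is allowed
print(acl_allows("known-bad.example", acl))          # only known servers are denied
```

In this sketch the only way to stop never-seen-before.example is to learn its name first, which is exactly the information a room admin lacks before an attack begins.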
Problem: leaky servers
Server ACLs are specified as restricting which requests a server can make in relation to a room from the server-server API.
The ACLs are applied to servers when they make requests.
Note: Server ACLs do not restrict the events relative to the room DAG via authorisation rules, but instead act purely at the network layer to determine which servers are allowed to connect and interact with a given room.
There is one notable exception to this, which is currently poorly specified, in that when a denied server uses /federation/v1/send, any PDUs that originate from the denied server are failed. Note, these PDUs are only failed when the denied server is the sender and origin of the transaction.
Server ACL is currently effective even when the denied server is not cooperative. However, if there is a distinct uncooperative server in the room, the uncooperative server can leak events from the denied server to all other servers, either by using leaked events as forward extremities or including them in responses to /get_missing_events.
Mitigating this problem is hard: because the m.room.server_acl event is specified to restrict servers at the "request" level, there is no restriction on events themselves. If servers were to naively apply server ACL to events, they would likely soft fail all the events that the denied server sent throughout the room's history. This is because m.room.server_acl is intentionally not specified with any consideration of DAG semantics. We believe the intention of this is to stop a malicious server from being able to add a boundless number of soft failed events to the DAG, which we believe would be possible with current soft fail checks and a DAG based server access system like the one within this proposal.
It should be clear that servers can leak events to conforming implementations unintentionally, depending on whether the implementers have implemented server access control in a conforming way.
Problem: size limit
It is theorized by me, completely from thin air, that the server ACL event can contain roughly 512 entries before the event size limit is reached. A survey of #matrix-org-coc-bl:matrix.org suggests that most ACL events contain ~150 deny entries.
However, there have been unstable periods in Matrix's history, due to vulnerabilities in servers, where the number of deny entries has been much greater. For example, Synapse previously had weak registration requirements by default that were exploited in a major incident. There could be a need to protect rooms from vulnerable servers again in the future.
Workaround for ambient room access: explicit allow list
An explicit allow list can be used to work around the issue of ambient room access. This means that only servers that are known to room admins can interact with the room.
Bootstrapping
This workaround has a bootstrapping problem, because any server, whether well established or completely new, will be unable to join a community unless it can contact room admins or their tooling out of band to request access.
For large public rooms with reputation, this is problematic as these out of band channels to request access are now the next target of abuse. Typically requesting access in this way requires manual intervention from the joining user and is slow.
Public Rooms
This workaround represents a serious user experience issue for both the users and the room administrators of public rooms, since vetting new users is time consuming for both parties.
Workaround for leaky servers: banning, kicking, redacting
A workaround for leaky servers is to fall back to the existing room level user controls such as banning, kicking and redacting, though it is possible for room administrators to become overwhelmed if the leaking is part of an intentional attack. They could then change the join rule for the room, but this would cause a disruption to their service that outlives the attack.
Theorized workaround for leaky servers: leak detection
It has been suggested that it is possible to detect a leak by forensically analyzing the DAG in order to find which servers are accepting a denied server's events by analyzing which servers have used them as forward extremities.
The author does not consider this to be an acceptable workaround, because it is possible for a malicious leaking server to poison /get_missing_events and other endpoints to get spec conforming servers to accept events from denied servers without the malicious leaking server generating any suspicious PDUs themselves. In this instance, a naive leak detection tool would likely incriminate the wrong server.
Proposal
The m.server.knock authorization rule
This rule is to be inserted after rule 3 in version 11, the check for m.room.create's content field m.federate.
1. If the type is m.server.knock:
   1. If the state_key does not contain the server name for the origin server, reject.
   2. If there is any current state for the origin server's m.server.knock, reject.
   3. If the origin server's current participation is permitted, allow.
   4. If the m.server.knock_rule is deny, reject.
   5. If the origin server's current participation is deny, reject.
   6. Otherwise allow.
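The checks above can be read as a short decision function. The following is a minimal Python sketch, not a normative implementation. The function name check_server_knock and its parameters are hypothetical, stable identifiers are used instead of the unstable prefix, and it assumes that rule 1.1 means the state_key must equal the origin server's name.

```python
from typing import Optional


def check_server_knock(
    event: dict,
    origin_server: str,
    existing_knock: Optional[dict],
    participation: Optional[str],
    knock_rule: Optional[str],
) -> Optional[bool]:
    """Return True to allow, False to reject, None if the rule does not apply.

    existing_knock: current m.server.knock state for the origin server, if any.
    participation:  current participation of the origin server, if any.
    knock_rule:     current value of the rule field of m.server.knock_rule, if any.
    """
    if event["type"] != "m.server.knock":
        return None
    if event["state_key"] != origin_server:  # rule 1.1 (assumed to mean equality)
        return False
    if existing_knock is not None:           # rule 1.2: a knock can be sent once
        return False
    if participation == "permitted":         # rule 1.3
        return True
    if knock_rule == "deny":                 # rule 1.4
        return False
    if participation == "deny":              # rule 1.5
        return False
    return True                              # rule 1.6


knock = {"type": "m.server.knock", "state_key": "example.com"}
# First knock from an unknown server in an "active" room is allowed.
print(check_server_knock(knock, "example.com", None, None, "active"))
# A second knock from the same server is rejected by rule 1.2.
print(check_server_knock(knock, "example.com", knock, None, "active"))
```

Note that the order of the checks matters in this reading: a permitted server may knock even when the knock rule is deny, because rule 1.3 is evaluated before rule 1.4.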
The purpose of this rule is to allow a server to send a knock event, even if the sender has no membership event.
The purpose of rule 1.2 is to prevent denied servers from ever being given the ability to craft any event whatsoever in a room that has always had the active m.server.knock_rule.
This is because an m.server.participation event set to deny will usually be topologically older than an m.server.knock, due to the m.server.participation usually referencing a recent m.room.power_levels event. Without rule 1.2, malicious servers could therefore craft m.server.knock events without restriction.
The m.server.participation authorization rule
This rule is to be inserted before rule 4 in version 11, the check for m.room.member, and after the m.server.knock rule described in this proposal.
1. If the origin server's current participation state is not permitted:
   1. If the participation state is deny, reject.
   2. If the type is m.server.participation and the sender's origin server matches the state_key of the considered event:
      1. If the participation field of the considered event is not permitted, reject.
      2. If the sender is the same sender of m.room.create, then allow.
   3. If the m.server.knock_rule is deny, reject.
   4. If the m.server.knock_rule is anything other than passive, reject.
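As an aid to reading the nested checks above, here is a minimal Python sketch of the same logic. It is an interpretation under assumptions rather than a definitive implementation: check_server_participation and server_of are hypothetical names, stable identifiers are used instead of the unstable prefix, and a missing m.server.knock_rule is treated literally as "anything other than passive".

```python
from typing import Optional


def server_of(user_id: str) -> str:
    """Server name part of a user ID such as '@alice:matrix.org'."""
    return user_id.split(":", 1)[1]


def check_server_participation(
    event: dict,
    participation: Optional[str],
    knock_rule: Optional[str],
    room_creator: str,
) -> Optional[bool]:
    """Return True to allow, False to reject, None to continue to later rules.

    participation: current participation of the event's origin server, if any.
    knock_rule:    current value of the rule field of m.server.knock_rule, if any.
    """
    if participation == "permitted":
        return None  # rule 1 does not apply to permitted servers
    if participation == "deny":  # rule 1.1
        return False
    if (event["type"] == "m.server.participation"
            and server_of(event["sender"]) == event.get("state_key")):  # rule 1.2
        if event["content"].get("participation") != "permitted":  # rule 1.2.1
            return False
        if event["sender"] == room_creator:  # rule 1.2.2
            return True
    if knock_rule == "deny":  # rule 1.3
        return False
    if knock_rule != "passive":  # rule 1.4 (a missing rule is read as not passive)
        return False
    return None


creator = "@alice:matrix.org"
own_permit = {
    "type": "m.server.participation",
    "sender": creator,
    "state_key": "matrix.org",
    "content": {"participation": "permitted"},
}
# The creator can permit their own server before any knock rule exists.
print(check_server_participation(own_permit, None, None, creator))
```

Under this reading, a message from a server with no participation state is rejected in an active room and passed on to the remaining authorization rules in a passive room.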
We allow the room creator to set their own participation and bypass the m.server.knock_rule provided their server has not been explicitly denied. This is because we want them to be able to set their participation at room creation without being unable to do so when the m.server.knock_rule is active.
We allow senders to add the participation of their own server, provided that they only do so to permit their own server (and not deny themselves as a foot gun). This is useful in cases where a room has a passive m.server.knock_rule and the room admins need to explicitly permit their own servers before changing the knock rule to active.
The m.server.participation authorization event, state_key: ${origin_server_name}
This is an authorization event that is used to authorize events originating from the server named in the state_key.
participation can be one of permitted or deny.
participation is protected from redaction.
A denied server must not be sent an m.server.participation event unless the targeted server is already present within the room, or it has an existing m.server.knock event. This is to prevent malicious servers being made aware of rooms that they have not yet discovered.
A reason field can be present alongside participation in order to explain why a server has been denied. This reason is to be shown to a joining, or previously present, server so that the server's users can understand why they are not being allowed to participate.
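The content shape described above can be sketched as a small validator and redaction helper in Python. This is illustrative only: validate_participation_content and redact_participation_content are hypothetical names, and the sketch assumes that reason is stripped on redaction, since only participation is stated to be protected.

```python
ALLOWED_PARTICIPATION = {"permitted", "deny"}
PROTECTED_KEYS = {"participation"}  # reason is assumed NOT to survive redaction


def validate_participation_content(content: dict) -> None:
    """Raise ValueError if the content does not match the description above."""
    if content.get("participation") not in ALLOWED_PARTICIPATION:
        raise ValueError("participation must be 'permitted' or 'deny'")
    if "reason" in content and not isinstance(content["reason"], str):
        raise ValueError("reason, when present, must be a string")


def redact_participation_content(content: dict) -> dict:
    """Keep only the keys that are protected from redaction."""
    return {k: v for k, v in content.items() if k in PROTECTED_KEYS}


content = {"participation": "deny", "reason": "Repeated spam from this server"}
validate_participation_content(content)
print(redact_participation_content(content))
```

Keeping participation through redaction means the authorization decision stays readable even if the human-facing reason is removed.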
The m.server.knock_rule event, state_key: ''
This event has one field, rule, which can be one of the following:
- deny: Users are unable to send the m.server.knock event unless there is an existing m.server.participation event for the server.
- passive: Users can send the m.server.knock event without corresponding membership or server participation.
- active: Users can send the m.server.knock event but cannot send any other event without a corresponding participation of permitted.
rule is protected from redaction.
The passive state allows rooms to operate as they do today: new servers can freely join a room and start sending events without prior approval from the administrators.
The active state allows for a much safer way to run public Matrix rooms: new servers can join a room and send the m.server.knock event, but cannot do more until a room administrator permits the new joiner with an m.server.participation event. We expect that in practice automated tooling will perform a simple reputation check and immediately permit a new server to participate. This is an essential part of the proposal, as the active mechanism eliminates a current shortfall: m.room.server_acl is a purely reactive tool in a join wave attack.
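The automated tooling mentioned above could look roughly like the following Python sketch. Everything here is hypothetical: handle_server_knock, the is_reputable callback and the blocklist are invented for illustration, stable identifiers are used instead of the unstable prefix, and how the resulting event is actually sent is out of scope.

```python
from typing import Callable


def handle_server_knock(
    knock_event: dict,
    moderator: str,
    is_reputable: Callable[[str], bool],
) -> dict:
    """Turn an incoming m.server.knock into an m.server.participation decision."""
    server = knock_event["state_key"]
    decision = "permitted" if is_reputable(server) else "deny"
    return {
        "type": "m.server.participation",
        "sender": moderator,
        "state_key": server,
        "content": {"participation": decision},
    }


# Hypothetical reputation source: a static blocklist of server names.
blocklist = {"spam.example"}
knock = {"type": "m.server.knock", "sender": "@bob:example.com", "state_key": "example.com"}
event = handle_server_knock(knock, "@modbot:matrix.org", lambda s: s not in blocklist)
print(event["content"]["participation"])
```

The design point is that the decision is made once the knock makes the server's name known, and before that server can send anything else into an active room.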
The m.server.knock event, state_key: ${origin_server_name}
This event has no fields, because it can only be sent once and therefore could not be corrected if wrong or malicious information were provided.
The intent of the event is only to make the room administrators explicitly aware of the server's existence.
The make_server_knock handshake
This MSC requires a very simple clone of the make_knock handshake for the purpose of signing and creating the m.server.knock event.
The details of this handshake are left outside the scope of the MSC, as it may be decided that an API providing an agnostic unification of make_knock and make_join should be used instead that signs both the membership event and the m.server.knock event templates.
We believe that the open choice here should not alone be a reason to block this MSC from consideration. But we will follow up with a clone of the make_knock handshake if requested.
Diagrams
These diagrams do not specify any behaviour and are provided only to help explain the proposal. Do not write any part of the specification based upon the diagram, only what you have read from the authorization rules.
Room creation flow
---
title: Room creation as of v1.10
---
stateDiagram-v2
create: Alice creates the room
aliceMembership: Alice joins the room
create --> aliceMembership
alicePL: Alice sends the default power levels event from the createRoom template
aliceMembership --> alicePL
aliceJoinRules: Alice sets the join rules
alicePL --> aliceJoinRules
---
title: Room creation with simple server authorization
---
stateDiagram-v2
create: Alice creates the room
aliceServerParticipation: Alice permits her server to participate
aliceMembership: Alice joins the room
create --> aliceServerParticipation
aliceServerParticipation --> aliceMembership
alicePL: Alice sends the default power levels event from the createRoom template
aliceMembership --> alicePL
aliceJoinRules: Alice sets the join rules
alicePL --> aliceJoinRules
aliceKnockRule: Alice sets the server knock rule
alicePL --> aliceKnockRule
Join flow
This explains the order of events for joining the room; it does not explain how the make_join handshake is amended.
---
title: Joining the room with "passive" knock rule
---
stateDiagram-v2
create: Alice creates the room and sets the knock rule to "passive"
bobJoin: Bob sends a membership event with membership join
create --> bobJoin
bobHey: Bob sends m.room.message "Hello!"
bobJoin --> bobHey
state aliceChoice <<choice>>
aliceChoiceText: Alice decides whether to explicitly set Bob's participation
bobHey --> aliceChoiceText
aliceChoiceText --> aliceChoice
aliceDeny: Alice sets Bob's participation to "deny"
alicePermit: Alice sets Bob's participation to "permitted"
aliceImplicit: Alice does not make a decision
aliceChoice --> aliceDeny
aliceChoice --> alicePermit
aliceChoice --> aliceImplicit
bobCoolRoom: Bob sends m.room.message "This is a cool room!"
alicePermit --> bobCoolRoom
aliceImplicit --> bobCoolRoom
---
title: Joining the room with "active" knock rule
---
stateDiagram-v2
create: @alice#colon;matrix.org creates the room and sets the knock rule to "active"
bobKnock: @bob#colon;example.com sends a knock event for his server#colon; example.com
create --> bobKnock
state aliceChoice <<choice>>
aliceChoiceText: @alice#colon;matrix.org decides whether to explicitly set example.com's participation
bobKnock --> aliceChoiceText
aliceChoiceText --> aliceChoice
aliceDeny: @alice#colon;matrix.org sets example.com's participation to "deny"
alicePermit: @alice#colon;matrix.org sets example.com's participation to "permitted"
aliceImplicit: @alice#colon;matrix.org does not make a decision
aliceChoice --> aliceDeny
aliceChoice --> alicePermit
aliceChoice --> aliceImplicit
bobJoin: @bob#colon;example.com sends a membership event with membership join
bobHello: @bob#colon;example.com sends m.room.message "Hello!"
bobCannotParticipate: @bob#colon;example.com cannot participate and example.com <br> cannot craft any authorizable event anywhere in the DAG
aliceDeny --> bobCannotParticipate
aliceImplicit --> bobCannotParticipate
alicePermit --> bobJoin
bobJoin --> bobHello
Potential issues
Racing with m.server.knock_rule?
We will embed m.server.knock_rule in m.room.create if someone raises concerns about a potential race condition or other issue with it conflicting with m.server.participation. However, stating that there might be one without elaboration is not helpful; I'd need to know how the race works. If there is insistence, then we will embed it within the m.room.create event.
Changing the m.server.knock_rule from passive to active or deny
Server admins can unintentionally lock themselves out of their room unless they are the room creator under the current proposal.
Soft failure of messages
Servers that had a participation of permitted and are later denied via deny can have some of their messages soft failed while the forks synchronise, similar to https://github.com/matrix-org/synapse/issues/9329.
This could be addressed with MSC4104.
Mismatch with m.room.power_levels
There is an argument to be made that the ability to manage m.server.participation should not be flat in the way that m.room.server_acl is. Consider Alice being the room creator and Bob being an admin. Bob could create an m.server.participation event that denies Alice's server from participating, even if Alice is the same or a higher power level.
Alternatives
- MSC4099 Participation based authorization for servers in the Matrix DAG
- MSC3953 Server capability DAG
Security considerations
None considered.
Unstable prefix
me.marewolf.msc4124.*
Dependencies
No direct dependencies
See make_server_knock handshake.
[1] It is possible to craft and send events initially to servers that room admins do not reside on. Most rooms, though, are probably vulnerable to less than this, including somewhat simple spam join attacks.