|
|
|
@ -0,0 +1,227 @@
|
|
|
|
|
# MSC3970: Scope transaction IDs to devices
|
|
|
|
|
|
|
|
|
|
Transaction identifiers in the Client-Server API are currently scoped to the
|
|
|
|
|
concept of a "client session" which, when refresh tokens are used, can span a
|
|
|
|
|
sequence of access tokens.
|
|
|
|
|
|
|
|
|
|
The spec [reads](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers):
|
|
|
|
|
|
|
|
|
|
> The scope of a transaction ID is a “client session”, where that session is
|
|
|
|
|
> identified by a particular access token. When refreshing an access token, the
|
|
|
|
|
> transaction ID’s scope is retained. This means that if a client with token `A`
|
|
|
|
|
> uses `TXN1` as their transaction ID, refreshes the token to `B`, and uses
|
|
|
|
|
> `TXN1` again it’ll be assumed to be a duplicate request and ignored. If the
|
|
|
|
|
> client logs out and back in between the `A` and `B` tokens, `TXN1` could be used
|
|
|
|
|
> once for each.
|
|
|
|
|
|
|
|
|
|
The "client session" scope concept described can be complicated to implement.
|
|
|
|
|
|
|
|
|
|
This MSC proposes that the scope of a transaction identifier is changed to something
|
|
|
|
|
that is easier to implement whilst maintaining required transaction semantics.
|
|
|
|
|
|
|
|
|
|
The transaction IDs appear in two parts of the Client-Server API spec:
|
|
|
|
|
|
|
|
|
|
1. As a identifier to allow the homeserver to make some `PUT` endpoints
|
|
|
|
|
[idempotent](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers)
|
|
|
|
|
2. An unsigned field in the event data model for a client to tell if it sent an
|
|
|
|
|
event or not. a.k.a. solving the
|
|
|
|
|
["local echo"](https://spec.matrix.org/v1.6/client-server-api/#local-echo) problem
|
|
|
|
|
|
|
|
|
|
For reference, the `PUT` endpoints that have the a `{txnId}` param are:
|
|
|
|
|
|
|
|
|
|
- [`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid)
|
|
|
|
|
- [`PUT /_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid)
|
|
|
|
|
- [`PUT /_matrix/client/v3/sendToDevice/{eventType}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid)
|
|
|
|
|
|
|
|
|
|
## Proposal
|
|
|
|
|
|
|
|
|
|
It is proposed that the scope of transaction identifiers be changed from a
|
|
|
|
|
"client session" to a "device".
|
|
|
|
|
|
|
|
|
|
A "device" is typically represented by a `device_id` elsewhere in the spec.
|
|
|
|
|
|
|
|
|
|
For idempotency, this means the homeserver changing the method of identifying a
|
|
|
|
|
request from:
|
|
|
|
|
|
|
|
|
|
- (`client session`, `HTTP path of request which includes the transaction ID`)
|
|
|
|
|
|
|
|
|
|
to:
|
|
|
|
|
|
|
|
|
|
- (`device_id`, `HTTP path of request which includes the transaction ID`)
|
|
|
|
|
|
|
|
|
|
For local echo, the homeserver would now include the `transaction_id` in the
|
|
|
|
|
event data when it is serving a sync request from the same `device_id` as
|
|
|
|
|
determined from the access token.
|
|
|
|
|
|
|
|
|
|
## Potential issues
|
|
|
|
|
|
|
|
|
|
### This is technically a breaking change to the spec
|
|
|
|
|
|
|
|
|
|
The main "issue" I see with this proposal is that this is technically a breaking
|
|
|
|
|
change to the spec.
|
|
|
|
|
|
|
|
|
|
A device ID could have multiple sequences of access tokens associated
|
|
|
|
|
with it (since device ID can be specified as a parameter to
|
|
|
|
|
[`/login`](https://spec.matrix.org/v1.6/client-server-api/#post_matrixclientv3login)).
|
|
|
|
|
|
|
|
|
|
For example, a "bot" implementation might masquerade as the same "device" despite
|
|
|
|
|
calling `/login` on every run by passing the same device ID. Such a device might
|
|
|
|
|
also use the same transaction ID on each run.
|
|
|
|
|
|
|
|
|
|
Therefore it could potentially lead to a request being treated as a duplicate
|
|
|
|
|
where previously it would have been treated as a distinct request.
|
|
|
|
|
|
|
|
|
|
However, some evidence has been collated which suggests that nothing would be impacted
|
|
|
|
|
in reality:
|
|
|
|
|
|
|
|
|
|
#### 1. Data from the matrix.org homeserver suggests the change would have no impact
|
|
|
|
|
|
|
|
|
|
The `matrix.org` homeserver is a reasonable size deployment and could be considered
|
|
|
|
|
reasonably representative of the diversity of Matrix clients.
|
|
|
|
|
|
|
|
|
|
Synapse maintains a `event_txn_id` table
|
|
|
|
|
that contains a rolling 24 hour window of
|
|
|
|
|
(`user_id`, `token_id`, `room_id`, `txn_id`) tuples.
|
|
|
|
|
|
|
|
|
|
Having analysed the contents of the table, it appears that there are no repeated
|
|
|
|
|
transaction IDs for a given user, token and room.
|
|
|
|
|
|
|
|
|
|
n.b. not all `PUT` endpoints contribute to the table but I think the high-volume
|
|
|
|
|
ones do
|
|
|
|
|
|
|
|
|
|
This suggests that the widening of the scope from token to device would not have caused
|
|
|
|
|
any issues during the periods sampled.
|
|
|
|
|
|
|
|
|
|
For reference the following is the schema of the `event_txn_id` table:
|
|
|
|
|
|
|
|
|
|
```sql
|
|
|
|
|
Table "matrix.event_txn_id"
|
|
|
|
|
Column | Type | Collation | Nullable | Default
|
|
|
|
|
-------------+--------+-----------+----------+---------
|
|
|
|
|
event_id | text | | not null |
|
|
|
|
|
room_id | text | | not null |
|
|
|
|
|
user_id | text | | not null |
|
|
|
|
|
token_id | bigint | | not null |
|
|
|
|
|
txn_id | text | | not null |
|
|
|
|
|
inserted_ts | bigint | | not null |
|
|
|
|
|
Indexes:
|
|
|
|
|
"event_txn_id_token_id" btree (token_id)
|
|
|
|
|
"event_txn_id_event_id" UNIQUE, btree (event_id)
|
|
|
|
|
"event_txn_id_ts" btree (inserted_ts)
|
|
|
|
|
"event_txn_id_txn_id" UNIQUE, btree (room_id, user_id, token_id, txn_id)
|
|
|
|
|
Foreign-key constraints:
|
|
|
|
|
"event_txn_id_event_id_fkey" FOREIGN KEY (event_id) REFERENCES matrix.events(event_id) ON DELETE CASCADE
|
|
|
|
|
"event_txn_id_token_id_fkey" FOREIGN KEY (token_id) REFERENCES matrix.access_tokens(id) ON DELETE CASCADE
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
And the query to look for repeated transaction IDs:
|
|
|
|
|
|
|
|
|
|
```sql
|
|
|
|
|
SELECT e1.txn_id, LEFT(e1.user_id, 5) AS user_id, e1.token_id, e2.token_id, e1.inserted_ts, e2.inserted_ts FROM matrix.event_txn_id e1, matrix.event_txn_id e2 WHERE e1.txn_id = e2.txn_id AND e1.event_id <> e2.event_id AND e1.event_id < e2.event_id AND e1.user_id = e2.user_id AND e1.room_id = e2.room_id ORDER BY e1.token_id;
|
|
|
|
|
txn_id | user_id | token_id | token_id | inserted_ts | inserted_ts
|
|
|
|
|
--------+---------+----------+----------+-------------+-------------
|
|
|
|
|
(0 rows)
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
#### 2. Conduit homeserver already scopes transaction IDs to devices
|
|
|
|
|
|
|
|
|
|
As highlighted by the new Complement
|
|
|
|
|
[tests](https://github.com/matrix-org/complement/pull/613) the Conduit homeserver
|
|
|
|
|
is already scoping transaction IDs to devices.
|
|
|
|
|
|
|
|
|
|
I can't find a related issue [listed](https://gitlab.com/famedly/conduit/-/issues),
|
|
|
|
|
so presumably this non-compliant behaviour isn't causing a known issue for
|
|
|
|
|
admins and users of the Conduit homeserver?
|
|
|
|
|
|
|
|
|
|
### Is the "device" concept the right level of abstraction to use?
|
|
|
|
|
|
|
|
|
|
One way to look at it is that device is already widely used in the end-to-end
|
|
|
|
|
encryption parts of the spec and so why isn't it suitable for this use case too?
|
|
|
|
|
|
|
|
|
|
### What about two clients masquerading as a single device ID?
|
|
|
|
|
|
|
|
|
|
I don't know if this actually works in practice. If this was a concern then it
|
|
|
|
|
could be mitigated by clarifying in the spec that if a client wishes to submit
|
|
|
|
|
requests using the same `device_id` as another client session that it should
|
|
|
|
|
choose transaction identifiers that are unique to that client session.
|
|
|
|
|
|
|
|
|
|
## Alternatives
|
|
|
|
|
|
|
|
|
|
### Do nothing
|
|
|
|
|
|
|
|
|
|
We could leave the transaction ID scope as is.
|
|
|
|
|
|
|
|
|
|
However, it makes it difficult to implement a features like
|
|
|
|
|
[MSC3861: Matrix architecture change to delegate authentication via OIDC](https://github.com/matrix-org/matrix-spec-proposals/pull/3861)
|
|
|
|
|
as the concept of a "client session" doesn't really exist in OIDC.
|
|
|
|
|
|
|
|
|
|
As noted above, at least one homeserver implementation is also not implementing
|
|
|
|
|
the spec as it is today.
|
|
|
|
|
|
|
|
|
|
It also turns out that the current implementation of refresh tokens in Synapse
|
|
|
|
|
breaks the transaction ID semantics already and needs to be
|
|
|
|
|
[fixed](https://github.com/matrix-org/synapse/issues/15141).
|
|
|
|
|
|
|
|
|
|
### Make a backwards compatible change
|
|
|
|
|
|
|
|
|
|
A backwards compatible alternative could be something like:
|
|
|
|
|
|
|
|
|
|
1. For idempotency have clients opt-in to a new scope of transaction ID, but
|
|
|
|
|
support the current semantics too for compatibility
|
|
|
|
|
2. Have clients opt-in (e.g. request param on the sync endpoint) to receiving
|
|
|
|
|
transaction ID for all events in the sync response and make the client
|
|
|
|
|
responsible for identifying which messages they sent
|
|
|
|
|
|
|
|
|
|
The disadvantage of this is that we create a load of extra maintenance work to
|
|
|
|
|
support both semantics for a period of time for (empirically) no gain in return.
|
|
|
|
|
|
|
|
|
|
## Security considerations
|
|
|
|
|
|
|
|
|
|
A malicious client can adopt an existing device ID of a user. This could
|
|
|
|
|
possibly allow some kind of denial of service attack.
|
|
|
|
|
|
|
|
|
|
However, if such an attack where possible it would be possible to do so without
|
|
|
|
|
this MSC as device IDs are crucial to the implementation of end-to-end encryption.
|
|
|
|
|
|
|
|
|
|
## Other recommendations
|
|
|
|
|
|
|
|
|
|
I'm not suggesting that these recommendations are addressed in this proposal, but
|
|
|
|
|
more notes for future proposals or spec clarifications.
|
|
|
|
|
|
|
|
|
|
### Clarification on idempotency semantics
|
|
|
|
|
|
|
|
|
|
I have separately prepared a [spec PR](https://github.com/matrix-org/matrix-spec/pull/1449)
|
|
|
|
|
to clarify some of the idempotency semantics that doesn't modify the spec but is
|
|
|
|
|
useful to understand the context of this proposal.
|
|
|
|
|
|
|
|
|
|
### Clarification on transaction ID time scope
|
|
|
|
|
|
|
|
|
|
I also suggest that the spec be clarified over what time periods the transaction
|
|
|
|
|
ID is scoped for such that clients can be aware. This cloud simply be to say
|
|
|
|
|
that the time period is not defined and so may vary by implementation.
|
|
|
|
|
|
|
|
|
|
### Recommend a less naive transaction ID format
|
|
|
|
|
|
|
|
|
|
Currently the format of a transaction ID is not specified, but a recommendation
|
|
|
|
|
is [given](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers):
|
|
|
|
|
|
|
|
|
|
> After the [previous] request has finished, the {txnId} value should be changed
|
|
|
|
|
> (how is not specified; **a monotonically increasing integer is recommended**).
|
|
|
|
|
|
|
|
|
|
I think this is an unnecessarily naive recommendation.
|
|
|
|
|
|
|
|
|
|
In most clients environments a pseudo-random number generator will be available and so
|
|
|
|
|
could be used to generate a UUID/ULID or similar.
|
|
|
|
|
|
|
|
|
|
As an aside, in my research I have found some clients use a "seconds since epoch" as a
|
|
|
|
|
transaction ID which introduces a limit on the maximum possible event transmission rate
|
|
|
|
|
per room to once per second. Perhaps a better recommendation could help prevent the
|
|
|
|
|
such behaviour being introduced in future.
|
|
|
|
|
|
|
|
|
|
## Unstable prefix
|
|
|
|
|
|
|
|
|
|
None needed.
|
|
|
|
|
|
|
|
|
|
## Dependencies
|
|
|
|
|
|
|
|
|
|
None.
|