This is a standalone description of the data architecture of Synapse. There is a lot of overlap with the currennt specification, so it has been split out here for posterity. Hopefully all the important bits have been merged into the relevant places in the main spec.
This is a standalone description of the data architecture of Synapse. There is a
lot of overlap with the current specification, so it has been split out here for
posterity. Hopefully all the important bits have been merged into the relevant
@ -200,7 +232,7 @@ the room ``!qporfwt:matrix.org``::
| matrix.org | | domain.com |
| matrix.org | | domain.com |
+------------------+ +------------------+
+------------------+ +------------------+
| ^
| ^
| |
| [HTTP PUT] |
| Room ID: !qporfwt:matrix.org |
| Room ID: !qporfwt:matrix.org |
| Event type: m.room.message |
| Event type: m.room.message |
| Content: { JSON object } |
| Content: { JSON object } |
@ -222,7 +254,7 @@ the room ``!qporfwt:matrix.org``::
| Content: { JSON object } |
| Content: { JSON object } |
|...................................|
|...................................|
Federation maintains shared data structures per-room between multiple home
Federation maintains *shared data structures* per-room between multiple home
servers. The data is split into ``message events`` and ``state events``.
servers. The data is split into ``message events`` and ``state events``.
``Message events`` describe transient 'once-off' activity in a room such as an
``Message events`` describe transient 'once-off' activity in a room such as an
@ -252,7 +284,7 @@ participating in a room.
Room Aliases
Room Aliases
++++++++++++
++++++++++++
Each room can also have multiple "Room Aliases", which looks like::
Each room can also have multiple "Room Aliases", which look like::
#room_alias:domain
#room_alias:domain
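As an illustration of the alias form shown above, the following is a minimal sketch of splitting an alias into its two parts. The function name ``parse_room_alias`` is hypothetical, and the only validation assumed is the leading ``#`` and the first ``:`` separator; a real implementation may apply stricter rules.

```python
def parse_room_alias(alias):
    """Split '#room_alias:domain' into (room_alias, domain).

    Hypothetical helper: only the '#' sigil and the first ':' are checked.
    """
    if not alias.startswith("#"):
        raise ValueError("room alias must start with '#'")
    # Split on the first ':' only, so a domain carrying a port stays intact.
    localpart, sep, domain = alias[1:].partition(":")
    if not sep or not localpart or not domain:
        raise ValueError("room alias must look like #room_alias:domain")
    return localpart, domain
```

For example, ``parse_room_alias("#matrix:domain.com")`` yields the alias name ``matrix`` and the domain ``domain.com``.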
@ -272,7 +304,7 @@ that are in the room that can be used to join via.
::
::
GET
HTTP GET
#matrix:domain.com !aaabaa:matrix.org
#matrix:domain.com !aaabaa:matrix.org
| ^
| ^
| |
| |
@ -285,7 +317,7 @@ that are in the room that can be used to join via.
|________________________________|
|________________________________|
Identity
Identity
++++++++
~~~~~~~~
Users in Matrix are identified via their matrix user ID (MXID). However,
Users in Matrix are identified via their matrix user ID (MXID). However,
existing 3rd party ID namespaces can also be used in order to identify Matrix
existing 3rd party ID namespaces can also be used in order to identify Matrix
@ -306,47 +338,39 @@ the Matrix ecosystem. However, without one clients will not be able to look up
user IDs using 3PIDs.
user IDs using 3PIDs.
Presence
Presence
++++++++
~~~~~~~~
Each user has the concept of presence information. This encodes the
"availability" of that user, suitable for display on other users' clients. This
is transmitted as an ``m.presence`` event and is one of the few events which
are sent *outside the context of a room*. The basic piece of presence
information is represented by the ``presence`` key, which is an enum of one of
the following:
- ``online`` : The default state when the user is connected to an event
stream.
- ``unavailable`` : The user is not reachable at this time.
- ``offline`` : The user is not connected to an event stream.
- ``free_for_chat`` : The user is generally willing to receive messages
more so than by default.
- ``hidden`` : Behaves as offline, but allows the user to see the client
state anyway and generally interact with client features. (Not yet
implemented in synapse).
.. TODO-spec
Each user has the concept of presence information. This encodes:
This seems like a very constrained list of states - surely presence states
should be extensible, with us providing a baseline, and possibly a scale of
availability? For instance, do-not-disturb is missing here, as well as a
distinction between 'away' and 'busy'.
This basic ``presence`` field applies to the user as a whole, regardless of how
* Whether the user is currently online
many client devices they have connected. The presence state is pushed by the homeserver to all connected clients for a user to ensure a consistent experience for the user.
* How recently the user was last active (as seen by the server)
* Whether a given client considers the user to be currently idle
* Arbitrary information about the user's current status (e.g. "in a meeting").
.. TODO-spec
This information is collated from both per-device (online; idle; last_active) and
We need per-device presence in order to handle push notification semantics and similar.
per-user (status) data, aggregated by the user's homeserver and transmitted as
an ``m.presence`` event. This is one of the few events which are sent *outside
the context of a room*. Presence events are sent to all users who subscribe to
this user's presence through a presence list or by sharing membership of a room.
.. TODO
How do we let users hide their presence information?
.. TODO
The last_active specifics should be moved to the detailed presence event section
Last activity is tracked by the server maintaining a timestamp of the last time
it saw a pro-active event from the user. Any event which could be triggered by a
human using the application is considered pro-active (e.g. sending an event to a
room). An example of a non-proactive client activity would be a client setting
'idle' presence status, or polling for events. This timestamp is presented via a
key called ``last_active_ago``, which gives the number of milliseconds, measured
back from the time the message was generated/emitted, since the user was last
seen active.
In addition, the server maintains a timestamp of the last time it saw a
N.B. in v1 API, status/online/idle state are muxed into a single 'presence' field on the m.presence event.
pro-active event from the user; either sending a message to a room, or changing
presence state from a lower to a higher level of availability (thus: changing
state from ``unavailable`` to ``online`` counts as a proactive event, whereas in
the other direction it will not). This timestamp is presented via a key called
``last_active_ago``, which gives the number of milliseconds, measured back from
the time the message was generated/emitted, since the user was last seen active.
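The last-activity tracking described above can be sketched as follows. This is only an illustration under stated assumptions: the class ``PresenceTracker``, the action names in ``PROACTIVE_ACTIONS``, and the injectable clock are all hypothetical and are not taken from Synapse.

```python
import time

# Hypothetical action names; only actions a human could trigger count.
PROACTIVE_ACTIONS = {"send_event"}


class PresenceTracker:
    """Tracks the last pro-active event seen for each user (sketch only)."""

    def __init__(self, clock=time.time):
        self._clock = clock          # returns seconds; injectable for testing
        self._last_active_ms = {}    # user ID -> timestamp in milliseconds

    def record(self, user_id, action):
        # Polling for events or setting idle status does not update the timestamp.
        if action in PROACTIVE_ACTIONS:
            self._last_active_ms[user_id] = int(self._clock() * 1000)

    def last_active_ago(self, user_id):
        """Milliseconds since the user was last active, relative to now."""
        last = self._last_active_ms.get(user_id)
        if last is None:
            return None
        return int(self._clock() * 1000) - last
```

Because the value is relative to the moment the message is generated, a receiver does not need its clock synchronised with the sending server to interpret it.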
Presence List
Presence Lists
~~~~~~~~~~~~~
~~~~~~~~~~~~~~
Each user's home server stores a "presence list". This stores a list of user IDs
Each user's home server stores a "presence list". This stores a list of user IDs
whose presence the user wants to follow.
whose presence the user wants to follow.
@ -355,38 +379,31 @@ To be added to this list, the user being added must be invited by the list owner
and accept the invitation. Once accepted, both users' HSes track the
and accept the invitation. Once accepted, both users' HSes track the
subscription.
subscription.
Presence and Permissions
~~~~~~~~~~~~~~~~~~~~~~~~
For a viewing user to be allowed to see the presence information of a target
user, either:
- The target user has allowed the viewing user to add them to their presence
Profiles
list, or
~~~~~~~~
- The two users share at least one room in common
In the latter case, this allows for clients to display some minimal sense of
presence information in a user list for a room.
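The two visibility rules above could be checked roughly as follows. This is a minimal sketch: the function name and both data structures (a set of accepted presence subscriptions and a mapping from room ID to members) are hypothetical and are not part of Synapse.

```python
def may_see_presence(viewer, target, accepted_subscriptions, room_members):
    """Return True if `viewer` may see the presence of `target`.

    accepted_subscriptions: set of (viewer, target) pairs where the target
        accepted being added to the viewer's presence list (assumed shape).
    room_members: dict mapping room ID -> set of member user IDs (assumed shape).
    """
    # Rule 1: the target allowed the viewer to add them to their presence list.
    if (viewer, target) in accepted_subscriptions:
        return True
    # Rule 2: the two users share at least one room.
    return any(
        viewer in members and target in members
        for members in room_members.values()
    )
```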
Users may publish arbitrary key/value data associated with their account - such
as a human readable ``display name``, a profile photo URL, contact information
(email address, phone numbers, website URLs etc).
Profiles
In Client-Server API v2, profile data is typed using namespaced keys for
++++++++
interoperability, much like events - e.g. ``m.profile.display_name``.
.. TODO-spec
.. TODO
- Metadata extensibility
Actually specify the different types of data - e.g. what format are display
names allowed to be?
Internally within Matrix users are referred to by their user ID, which is
Private User Data
typically a compact unique identifier. Profiles grant users the ability to see
~~~~~~~~~~~~~~~~~
human-readable names for other users that are in some way meaningful to them.
Additionally, profiles can publish additional information, such as the user's
age or location.
A Profile consists of a display name, an avatar picture, and a set of other
Users may also store arbitrary private key/value data in their account - such as
metadata fields that the user may wish to publish (email address, phone
client preferences, or server configuration settings which lack any other
numbers, website URLs, etc...). This specification puts no requirements on the
dedicated API. The API is symmetrical to managing Profile data.
display name other than it being a valid unicode string. Avatar images are not
stored directly; instead the home server stores an ``http``-scheme URL from which clients may fetch the image.
.. TODO
Would it really be overengineered to use the same API for both profile &
private user data, but with different ACLs?
API Standards
API Standards
-------------
-------------
@ -424,8 +441,9 @@ response". This is a JSON object which looks like::
The ``error`` string will be a human-readable error message, usually a sentence
The ``error`` string will be a human-readable error message, usually a sentence
explaining what went wrong. The ``errcode`` string will be a unique string
explaining what went wrong. The ``errcode`` string will be a unique string
which can be used to handle an error message e.g. ``M_FORBIDDEN``. These error
which can be used to handle an error message e.g. ``M_FORBIDDEN``. These error
codes should have their namespace first in ALL CAPS, followed by a single _.
codes should have their namespace first in ALL CAPS, followed by a single _ to
For example, if there was a custom namespace ``com.mydomain.here``, and a
ease separating the namespace from the error code. For example, if there was a
custom namespace ``com.mydomain.here``, and a
``FORBIDDEN`` code, the error code should look like
``FORBIDDEN`` code, the error code should look like
``COM.MYDOMAIN.HERE_FORBIDDEN``. There may be additional keys depending on the
``COM.MYDOMAIN.HERE_FORBIDDEN``. There may be additional keys depending on the
error, but the keys ``error`` and ``errcode`` MUST always be present.
error, but the keys ``error`` and ``errcode`` MUST always be present.
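The naming rule above can be illustrated with a short sketch. The helper names ``make_errcode`` and ``error_response`` are hypothetical, and the extra key used in the test is made up for illustration; the only facts taken from the text are the upper-cased namespace, the single ``_`` separator, and the two mandatory keys.

```python
def make_errcode(namespace, code):
    """Build an error code: namespace in ALL CAPS, a single '_', then the code."""
    return f"{namespace.upper()}_{code.upper()}"


def error_response(errcode, error, **extra):
    """Build a standard error response body (sketch).

    ``errcode`` and ``error`` are always present; ``extra`` carries any
    additional keys that a particular error may define.
    """
    body = dict(extra)
    body["errcode"] = errcode
    body["error"] = error
    return body
```

With these helpers, ``make_errcode("com.mydomain.here", "FORBIDDEN")`` produces ``COM.MYDOMAIN.HERE_FORBIDDEN``, matching the example in the text.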
@ -502,81 +520,3 @@ In contrast, these are invalid requests::
"key": "This is a put but it is missing a txnId."
"key": "This is a put but it is missing a txnId."
}
}
Glossary
--------
Backfilling:
The process of synchronising historic state from one home server to another,
to backfill the event storage so that scrollback can be presented to the
client(s). Not to be confused with pagination.
Context:
A single human-level entity of interest (currently, a chat room)
EDU (Ephemeral Data Unit):
A message that relates directly to a given pair of home servers that are
exchanging it. EDUs are short-lived messages that relate only to a single
pair of servers; they are not persisted for a long time and are not forwarded
on to other servers. Because of this, they have no internal ID and no
reference chain to previous EDUs.
Event:
A record of a single thing that happened to a context (currently, a chat
room). These are the "chat messages" that Synapse makes available.
PDU (Persistent Data Unit):
A message that relates to a single context, irrespective of the server that
is communicating it. PDUs either encode a single Event, or a single State
change. A PDU is referred to by its PDU ID; the pair of its origin server
and local reference from that server.
PDU ID:
The pair of PDU Origin and PDU Reference, which together uniquely identify a
specific PDU globally.
PDU Origin:
The name of the origin server that generated a given PDU. This may not be the
server from which it has been received, due to the way they are copied around
from server to server. The origin always records the original server that
created it.
PDU Reference:
A local ID used to refer to a specific PDU from a given origin server. These
references are opaque at the protocol level, but may optionally have some
structured meaning within a given origin server or implementation.
Presence:
The concept of whether a user is currently online, how available they declare
they are, and so on. See also: doc/model/presence
Profile:
A set of metadata about a user, such as a display name, provided for the
benefit of other users. See also: doc/model/profiles
Room ID:
An opaque string (of as-yet undecided format) that identifies a particular
room and is used in PDUs referring to it.
Room Alias:
A human-readable string of the form #name:some.domain that users can use as a
pointer to identify a room; a Directory Server will map this to its Room ID.
State:
A set of metadata maintained about a Context, which is replicated among the
servers in addition to the history of Events.
User ID:
A string of the form @localpart:domain.name that identifies a user for
wire-protocol purposes. The localpart is meaningless outside of a particular
home server. This takes a human-readable form that end-users can use directly
if they so wish, avoiding the 3PIDs.
Transaction:
A message which relates to the communication between a given pair of servers.
A transaction contains possibly-empty lists of PDUs and EDUs.
.. TODO
This glossary contradicts the terms used above - especially on State Events v. "State"
and Non-State Events v. "Events". We need better consistent names.