Merge b638a9efbf into e9f0f31d27
commit
e342aabaa5
@ -0,0 +1,319 @@
# Bundled URL previews
Currently, URL previews in Matrix are generated on the server when requested by
a client using the [`/_matrix/media/v3/preview_url`](https://spec.matrix.org/v1.9/client-server-api/#get_matrixmediav3preview_url)
endpoint. This is a relatively good approach, but a major downside is that the
user's homeserver gets all links the user's client wants to show a preview for,
which means using it in encrypted rooms will effectively leak parts of messages.

## Proposal
The proposed solution is to allow clients to bundle URL preview metadata inside
events.

A new field called `m.url_previews` is added to the event content. The field is
an array of objects, where each object contains OpenGraph data representing a
single URL to preview, similar to what the `/preview_url` endpoint currently
returns:

* `matrix:matched_url` - The URL that is present in `body` and triggered this preview
  to be generated. This is optional and should be omitted if the link isn't
  present in the body.
* `matrix:image:encryption` - An [EncryptedFile](https://spec.matrix.org/v1.9/client-server-api/#extensions-to-mroommessage-msgtypes)
  object for encrypted thumbnail images. Similar to encrypted image messages,
  the URL is inside this object, and not in `og:image`.
* `matrix:image:size` - The byte size of the image, like in `/preview_url`.
* `og:image` - An `mxc://` URI for unencrypted images, like in `/preview_url`.
* `og:url` - Standard OpenGraph tag for the canonical URL of the previewed page.
* Any other standard OpenGraph tags.

At least one of `matrix:matched_url` and `og:url` MUST be present. All other
fields are optional.

URL previews are primarily meant for text-based message types (`m.text`,
`m.notice`, `m.emote`), but they may be used with any message type, as even
media messages may have captions in the future.

Allowing the omission of `matrix:matched_url` is effectively a new feature to
send URL previews without a link in the message text.
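The field list and the MUST requirement above could be checked on the receiving
side roughly as follows. This is a minimal Python sketch, not part of the
proposal: the names `is_valid_preview` and `valid_previews` are hypothetical,
and both treating an empty string as missing and silently dropping invalid
entries are assumptions that the proposal does not specify.

```python
from typing import Any


def is_valid_preview(entry: Any) -> bool:
    """Return True if `entry` carries at least one of the two URL fields."""
    if not isinstance(entry, dict):
        return False
    # Assumption: a URL field only counts if it is a non-empty string.
    for key in ("matrix:matched_url", "og:url"):
        value = entry.get(key)
        if isinstance(value, str) and value:
            return True
    return False


def valid_previews(content: dict) -> list[dict]:
    """Return the entries of `m.url_previews` that pass the check above."""
    previews = content.get("m.url_previews")
    if not isinstance(previews, list):
        return []
    # Assumption: invalid entries are dropped rather than rejecting the event.
    return [p for p in previews if is_valid_preview(p)]
```

An entry that only has `og:title`, for example, would be filtered out, while an
entry that only has `og:url` would be kept.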

### Extensible events
The definition of `matrix:matched_url` changes from "present in `body`" to
"present in `m.text`", but otherwise the proposal is directly compatible with
extensible events.

### Client behavior
#### Sending preview data
When sending previews to encrypted rooms, clients should encrypt preview images
and put them in the `matrix:image:encryption` field. Other `og:image:*` fields
and the `matrix:image:size` field can still be used for image metadata, but the
`og:image` field should be omitted for encrypted thumbnails.

If clients use the `/preview_url` endpoint as a helper for generating preview
data, they should reupload the thumbnail image (if there is one) to create a
persistent `mxc://` URI, as well as encrypt it if applicable. A future MSC
could also extend `/preview_url` with a parameter to request a persistent URI.
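The sending rules above can be sketched as a pure function that turns a
`/preview_url` response into a bundled entry. This is only an illustration under
stated assumptions: `bundle_preview` and its parameters are hypothetical names,
the download, reupload and encryption steps are assumed to have happened
elsewhere, and dropping image metadata when no image is available is a choice
of this sketch rather than something the proposal requires.

```python
from typing import Optional


def bundle_preview(
    og: dict,
    matched_url: Optional[str],
    *,
    image_mxc: Optional[str] = None,
    encrypted_file: Optional[dict] = None,
) -> dict:
    """Build one `m.url_previews` entry from a `/preview_url` response.

    `image_mxc` is the persistent URI of a reuploaded unencrypted thumbnail.
    `encrypted_file` is an EncryptedFile object for an encrypted thumbnail.
    """
    # Never reuse the server's temporary `og:image` URI.
    entry = {k: v for k, v in og.items() if k != "og:image"}
    if matched_url is not None:
        entry["matrix:matched_url"] = matched_url
    if encrypted_file is not None:
        # Encrypted rooms: the URL lives inside the EncryptedFile object.
        entry["matrix:image:encryption"] = encrypted_file
    elif image_mxc is not None:
        entry["og:image"] = image_mxc
    else:
        # Assumption: without an image, image metadata is dropped too.
        for key in list(entry):
            if key.startswith("og:image:") or key == "matrix:image:size":
                del entry[key]
    return entry
```

Passing `encrypted_file` takes precedence over `image_mxc`, so an entry never
carries both `og:image` and `matrix:image:encryption`.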

#### Receiving messages with `m.url_previews`
If an object in the list contains only `matrix:matched_url` or `og:url` (but
not both) and no other fields, receiving clients should fall back to the old
behavior of requesting a preview using `/preview_url`.

Clients may choose to ignore bundled data and ask the homeserver for a preview
even if bundled data is present, as a security measure against faking preview
data.

Clients may also choose to verify that the `matrix:matched_url` is present in
the `body` field before displaying a full preview. However, in order to avoid
losing data, clients SHOULD still display ignored entries somehow, e.g. just
rendering the link (either `og:url` or `matrix:matched_url`) instead of a full
preview.

Note: ignoring bundled data does not mean ignoring the `m.url_previews` field:
even when ignoring bundled data and/or verifying that `matrix:matched_url` is
present in `body`, clients should only display previews for URLs that are
present in the list, and should never display previews for URLs that aren't
present in the list.

If the `m.url_previews` field is not present at all, clients should fall back
to the old behavior of searching `body`.

The above points effectively make this an alternative for
[MSC2385](https://github.com/matrix-org/matrix-spec-proposals/pull/2385).
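The receiving rules in this section can be summarised as a small decision
function. The sketch below is hypothetical (the names `classify_entry` and
`plan_previews` and the action labels are not defined by the proposal), and it
assumes that entries without a usable URL are reported as `invalid` and that a
non-list field is treated like an empty list.

```python
def classify_entry(entry: dict, body: str, verify_body: bool = False) -> tuple[str, str]:
    """Return (action, url) for one entry of `m.url_previews`."""
    url = entry.get("matrix:matched_url") or entry.get("og:url")
    if not isinstance(url, str):
        return ("invalid", "")
    if len(entry) == 1:
        # Exactly one URL field and nothing else: ask the homeserver.
        return ("fetch", url)
    matched = entry.get("matrix:matched_url")
    if verify_body and matched and matched not in body:
        # Optional check failed: show only the link, not a full preview.
        return ("link_only", url)
    return ("bundled", url)


def plan_previews(content: dict, verify_body: bool = False) -> tuple[str, list]:
    """Decide how a client could render previews for one event's content."""
    if "m.url_previews" not in content:
        # Field absent: old behavior of searching `body` for links.
        return ("search_body", [])
    entries = content["m.url_previews"]
    if not isinstance(entries, list):
        return ("use_list", [])
    body = content.get("body", "")
    plan = [classify_entry(e, body, verify_body) for e in entries if isinstance(e, dict)]
    return ("use_list", plan)
```

An empty list yields `("use_list", [])`, which means no previews at all, whereas
a missing field yields `("search_body", [])`.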

### Examples
<details>
<summary>Normal preview</summary>

```json
{
  "type": "m.room.message",
  "content": {
    "msgtype": "m.text",
    "body": "https://matrix.org",
    "m.url_previews": [
      {
        "matrix:matched_url": "https://matrix.org",
        "matrix:image:size": 16588,
        "og:description": "Matrix, the open protocol for secure decentralised communications",
        "og:image": "mxc://maunium.net/zeHhTqqUtUSUTUDxQisPdwZO",
        "og:image:height": 400,
        "og:image:type": "image/jpeg",
        "og:image:width": 800,
        "og:title": "Matrix.org",
        "og:url": "https://matrix.org/"
      }
    ],
    "m.mentions": {}
  }
}
```

</details>
<details>
<summary>Preview with encrypted thumbnail image</summary>

```json
{
  "type": "m.room.message",
  "content": {
    "msgtype": "m.text",
    "body": "https://matrix.org",
    "m.url_previews": [
      {
        "matrix:matched_url": "https://matrix.org",
        "og:url": "https://matrix.org/",
        "og:title": "Matrix.org",
        "og:description": "Matrix, the open protocol for secure decentralised communications",
        "matrix:image:size": 16588,
        "og:image:width": 800,
        "og:image:height": 400,
        "og:image:type": "image/jpeg",
        "matrix:image:encryption": {
          "key": {
            "k": "GRAgOUnbbkcd-UWoX5kTiIXJII81qwpSCnxLd5X6pxU",
            "alg": "A256CTR",
            "ext": true,
            "kty": "oct",
            "key_ops": [
              "encrypt",
              "decrypt"
            ]
          },
          "iv": "kZeoJfx4ehoAAAAAAAAAAA",
          "hashes": {
            "sha256": "WDOJYFegjAHNlaJmOhEPpE/3reYeD1pRvPVcta4Tgbg"
          },
          "v": "v2",
          "url": "mxc://beeper.com/53207ac52ce3e2c722bb638987064bfdc0cc257b"
        }
      }
    ],
    "m.mentions": {}
  }
}
```

</details>
<details>
<summary>Message indicating it should not have any previews</summary>

```json
{
  "type": "m.room.message",
  "content": {
    "msgtype": "m.text",
    "body": "https://matrix.org",
    "m.url_previews": [],
    "m.mentions": {}
  }
}
```

</details>
<details>
<summary>Message indicating a preview should be fetched from the homeserver</summary>

```json
{
  "type": "m.room.message",
  "content": {
    "msgtype": "m.text",
    "body": "https://matrix.org",
    "m.url_previews": [
      {
        "matrix:matched_url": "https://matrix.org"
      }
    ],
    "m.mentions": {}
  }
}
```

</details>
<details>
<summary>Preview in extensible event</summary>

```json
{
  "type": "m.message",
  "content": {
    "m.text": [
      {"body": "matrix.org/support"}
    ],
    "m.url_previews": [
      {
        "matrix:matched_url": "matrix.org/support",
        "matrix:image:size": 16588,
        "og:description": "Matrix, the open protocol for secure decentralised communications",
        "og:image": "mxc://maunium.net/zeHhTqqUtUSUTUDxQisPdwZO",
        "og:image:height": 400,
        "og:image:type": "image/jpeg",
        "og:image:width": 800,
        "og:title": "Support Matrix",
        "og:url": "https://matrix.org/support/"
      }
    ],
    "m.mentions": {}
  }
}
```

</details>

## Potential issues
### Fake preview data
The message sender can fake previews quite trivially. This is considered an
acceptable compromise to achieve non-leaking URL previews in encrypted rooms.

As mentioned in the client behavior section, clients may choose to ignore
embedded preview data in unencrypted rooms and always use the `/preview_url`
endpoint, effectively only using `m.url_previews` as a whitelist of URLs to
preview.

### More image uploads
Currently previews are generated by the server, which lets the server apply
caching and delete thumbnail images quickly. If the data were embedded in events
instead, the server would not be able to clean up images the same way.

### Web clients
Web clients likely can't generate previews themselves due to CORS and other
such protections.

Clients could use the existing URL preview endpoint to generate a preview and
bundle that data in events, which has the benefit of only leaking the link to
one homeserver (the sender's) instead of all servers. When doing this, clients
would have to download the preview image and reupload it to get a persistent
`mxc://` URI, and possibly encrypt it before uploading.

Alternatively, clients could simply not include preview data at all and have
receiving clients fall back to the old behavior (meaning no previews in
encrypted rooms unless the receiver opts in).

## Security considerations
Fake preview data is covered in the potential issues section above.

### Visibility in old clients (T&S)
Clients that don't support this MSC will not display any of the data in the
preview field, which could be abused by spammers if all moderators in a room
are using old clients.

### Generating previews will leak IPs
The sender's client will leak its IP when it fetches previews for URLs typed by
the user. This is generally an acceptable tradeoff, as long as clients take
care never to generate previews for links the user did not type.

For example, if a client generates reply fallbacks, it MUST NOT generate
previews for links in the fallback. Clients should also be careful with links
when starting to edit a message, possibly by not generating new previews at
all.

Clients may also provide extra safeguards, such as only offering a button to
generate previews, rather than generating them immediately after the user types
a URL. However, this is a UX decision and is therefore ultimately up to the
client to decide.

Clients could also use a privacy-preserving TCP relay to proxy all URL preview
requests [like Signal does](https://signal.org/blog/i-link-therefore-i-am/).
That way the client wouldn't leak its IP, and the relay wouldn't see previewed
URLs. However, running such a proxy has several potential security issues for
the server administrators, so it is out of scope for this MSC.
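The rule about reply fallbacks earlier in this section could be enforced by
stripping the fallback before looking for links. The following is a rough
sketch, not a normative algorithm: `strip_reply_fallback` and
`previewable_urls` are hypothetical names, the URL regex is a deliberately
simple assumption, and the fallback is assumed to be the usual block of leading
`> ` lines followed by one empty line.

```python
import re

# Assumption: a simple pattern is good enough to find candidate links.
URL_RE = re.compile(r"https?://[^\s<>]+")


def strip_reply_fallback(body: str) -> str:
    """Remove the leading quoted block of a reply fallback from `body`."""
    lines = body.split("\n")
    i = 0
    while i < len(lines) and lines[i].startswith("> "):
        i += 1
    # The fallback is separated from the actual reply by one empty line.
    if 0 < i < len(lines) and lines[i] == "":
        i += 1
    return "\n".join(lines[i:])


def previewable_urls(body: str, is_reply: bool) -> list[str]:
    """Return links typed by the user, ignoring links in a reply fallback."""
    text = strip_reply_fallback(body) if is_reply else body
    return URL_RE.findall(text)
```

The `is_reply` flag matters because a user may start an ordinary message with a
quote of their own, which should not be stripped.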

### Previewing code must be implemented carefully
When generating URL previews, clients are parsing completely untrusted data.
Parsing responses must be done with care to prevent content-based attacks, such
as the billion laughs attack.

### Local IPs should not be previewed by default
Clients should prevent previewing non-public IP addresses by default. To do
this, clients must check the DNS records of a domain before connecting to the
resolved IP, as public domains may point to private IPs. For web clients, these
limits are generally handled by the browser (see the [Private Network Access
spec](https://wicg.github.io/private-network-access/)).
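For non-web clients, the check described above might look like the following
sketch, which uses Python's standard `ipaddress` and `socket` modules. The
function names are hypothetical, and rejecting a host when any of its resolved
addresses is non-public is a conservative assumption rather than a requirement
of this proposal.

```python
import ipaddress
import socket


def is_public_address(ip: str) -> bool:
    """Return True only for globally routable IPv4/IPv6 addresses."""
    # Drop an IPv6 zone identifier such as "%eth0" before parsing.
    return ipaddress.ip_address(ip.split("%", 1)[0]).is_global


def resolve_public_addresses(host: str, port: int = 443) -> list[str]:
    """Resolve `host` and refuse it if any resolved address is non-public."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses or not all(is_public_address(a) for a in addresses):
        raise ValueError(f"refusing to preview {host!r}: non-public address")
    return addresses
```

A client would then connect to one of the returned addresses directly instead
of resolving the name a second time, since the DNS answer could otherwise
change between the check and the connection.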

## Alternatives
### Different generation methods
Previews could be generated by the receiving client, which would both avoid
leaking links to the user's homeserver and prevent fake previews. However, this
would leak the user's IP address to all links they receive, so it is not an
acceptable solution.

The original design notes for URL previews from 2016 also have a list of options
that were considered at the time: <https://github.com/matrix-org/matrix-spec/blob/main/attic/drafts/url_previews.md>.
Option 2 is what was implemented then, and this proposal adds option 4.
The combination of options 2 and 4 is also mentioned as probably the best
solution in that document.

The document also mentions the possibility of an AS or HS scanning messages and
injecting preview data, but that naturally won't function with encryption at all,
and is therefore not an alternative.

The fifth option mentioned in the document, a centralized previewing service
which is configured per-room, could technically work, but would likely be worse
than HS-generated previews in practice: users wouldn't know to configure a
different previewing service, so clients would probably have to automatically
pick one.

## Unstable prefix
Until this MSC is accepted, implementations should apply the following renames:

* `com.beeper.linkpreviews` instead of `m.url_previews`
* `beeper:image:encryption` instead of `matrix:image:encryption`
* `matched_url` instead of `matrix:matched_url`
  * note: this was implemented without a prefix before the MSC was made, which
    is why the "unstable prefix" is no prefix in this case.
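As an illustration, the renames above could be applied to outgoing event content
with a small helper. This is a sketch only: `to_unstable` and
`UNSTABLE_ENTRY_KEYS` are hypothetical names, and the helper assumes that every
entry in `m.url_previews` is a JSON object.

```python
# Stable key -> unstable key, as listed above.
UNSTABLE_ENTRY_KEYS = {
    "matrix:image:encryption": "beeper:image:encryption",
    "matrix:matched_url": "matched_url",
}


def to_unstable(content: dict) -> dict:
    """Return a copy of `content` using the unstable field names."""
    out = {k: v for k, v in content.items() if k != "m.url_previews"}
    if "m.url_previews" in content:
        out["com.beeper.linkpreviews"] = [
            # Keys without an unstable name (e.g. `og:title`) stay unchanged.
            {UNSTABLE_ENTRY_KEYS.get(k, k): v for k, v in entry.items()}
            for entry in content["m.url_previews"]
        ]
    return out
```

Standard OpenGraph keys and `matrix:image:size` are passed through as-is, since
the list above does not define unstable names for them.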