diff --git a/proposals/0000-head-method-download.md b/proposals/0000-head-method-download.md
new file mode 100644
index 00000000..4e502c79
--- /dev/null
+++ b/proposals/0000-head-method-download.md
@@ -0,0 +1,71 @@
+# MSC0000: `HEAD` HTTP method on `/download`
+
+Most servers have a media upload size limit in place which gets applied to remote downloads as well,
+ideally preventing "excessively" large media from transiting through the server. Unfortunately, the
+best way to prevent large media from being downloaded is to try downloading it.
+
+Many HTTP client libraries support reading headers before the remaining response body, though this is
+time consuming and prone to issues. Some libraries do not offer the functionality at all, and require
+the body to be processed by the caller. Other libraries buffer the response body while the caller
+determines if it should continue with the request, though this buffer is typically minimal.
+
+To prevent this exact issue, HTTP has the [`HEAD`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/HEAD)
+request method which acts like a `GET` request, except the response body is omitted. This proposal
+introduces `HEAD` as a legal method for download requests, allowing a requesting server to make
+decisions about whether to download the entire file in a subsequent request.
+
+## Proposal
+
+`HEAD` becomes a legal request method on the following endpoints:
+
+* [`/_matrix/media/v3/download/:serverName/:mediaId`](https://spec.matrix.org/v1.9/client-server-api/#get_matrixmediav3downloadservernamemediaid)
+* [`/_matrix/media/v3/download/:serverName/:mediaId/:fileName`](https://spec.matrix.org/v1.9/client-server-api/#get_matrixmediav3downloadservernamemediaidfilename)
+
+`HEAD` behaves as described by the HTTP specification.
+
+Servers which do not support the `HEAD` method on the endpoints would respond with a 405 `M_UNRECOGNIZED`
+error code, as per the [common error codes spec](https://spec.matrix.org/v1.9/client-server-api/#common-error-codes).
+In this case, requesting servers will likely have to take a risk and call the `GET` endpoint without
+knowing how much data there is to download.
+
+In the future, when the media download endpoint is split into client and federation versions, like in
+[MSC3916](https://github.com/matrix-org/matrix-spec-proposals/pull/3916), it is suggested that *both*
+APIs get the same `HEAD` method support. This will allow clients to check cache headers, and still
+provide servers with information about file size.
+
+**Note**: `HEAD` is not supported on `/thumbnail` as the thumbnail may be generated at the time of
+request and have unknown size. `/download` does not typically have this issue, unless a form of streaming
+file transfer is used, like [MSC4016](https://github.com/matrix-org/matrix-spec-proposals/pull/4016).
+
+## Potential issues
+
+Adding a round trip to the already-expensive download sequence isn't great and may be an over-optimization
+for what is usually a rare problem.
+
+## Alternatives
+
+As mentioned in the introduction, requesting servers could abort their request after receiving headers
+and possibly part of the body. This may be difficult to do with some libraries/languages, and can still
+result in higher-than-ideal bandwidth usage.
+
+## Security considerations
+
+Servers should note that while the HTTP spec [suggests](https://www.rfc-editor.org/rfc/rfc9110.html#name-head)
+that a `HEAD` response have the same headers as a `GET` response, the `HEAD` response is notably capable
+of lacking useful headers like `Content-Length`. Additionally, a malicious server *could* lie about
+the download size on `HEAD` and return a larger file on `GET`. Servers should continue to limit `GET`
+requests as best they can to stay within their size limits and bandwidth requirements, particularly
+when the `HEAD` response doesn't contain a `Content-Length` header.
+
+## Unstable prefix
+
+This proposal *could* have an unstable prefix by versioning the endpoints themselves, however as the
+HTTP feature is well defined and no servers appear to be using `HEAD` requests currently, this proposal
+does not include an unstable prefix. Servers should implement `HEAD` as described by the HTTP specification,
+but only call other servers with `HEAD` if in an experimental or unstable mode of operation. For example,
+if the Synapse configuration has the `HEAD` feature flag *disabled*, then no `HEAD` request should be
+generated by that Synapse instance.
+
+## Dependencies
+
+This proposal does not have any direct dependencies.
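
Reviewer note (not part of the patch): the requesting-server behaviour described under "Proposal" and "Security considerations" could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the proposal's reference implementation. The names `declared_size`, `should_attempt_get`, `read_capped`, and `MediaTooLargeError` are hypothetical, and the HTTP transport is left out: the functions only take a status code, a header mapping, and a file-like response body.

```python
import io
from typing import BinaryIO, Mapping, Optional


class MediaTooLargeError(Exception):
    """Hypothetical error: remote media exceeded the local size limit."""


def declared_size(status: int, headers: Mapping[str, str]) -> Optional[int]:
    """Return the size a HEAD response advertises, or None if unknown.

    Assumption for this sketch: every non-200 status (including the 405
    sent by servers without HEAD support) is treated as "size unknown".
    A real implementation would likely handle errors such as 404 separately.
    """
    if status != 200:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("content-length", "").strip()
    if value.isascii() and value.isdigit():
        return int(value)
    return None  # missing or malformed Content-Length


def should_attempt_get(status: int, headers: Mapping[str, str], max_bytes: int) -> bool:
    """Skip the GET only when HEAD advertised a size above the limit."""
    size = declared_size(status, headers)
    return size is None or size <= max_bytes


def read_capped(body: BinaryIO, max_bytes: int, chunk_size: int = 65536) -> bytes:
    """Read a GET body in chunks, aborting once max_bytes is exceeded.

    The cap applies regardless of what HEAD said, because the remote
    server may omit Content-Length or misreport the size.
    """
    received = bytearray()
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            return bytes(received)
        received.extend(chunk)
        if len(received) > max_bytes:
            raise MediaTooLargeError(f"body exceeded {max_bytes} bytes")


if __name__ == "__main__":
    limit = 1024
    # HEAD unsupported (405): fall back to a size-capped GET.
    if should_attempt_get(405, {}, limit):
        data = read_capped(io.BytesIO(b"x" * 100), limit)
        print(len(data))
```

The design point is that the `HEAD` result only ever avoids a download; it never relaxes the cap applied while reading the `GET` body.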