Initial PDF markup proposal

pull/3592/head
Graham Leach-Krouse 4 years ago
parent 87fcb7c8b0
commit 34897a5e6f

@ -0,0 +1,110 @@
# PDF annotation locations for markup
[MSC3574](https://github.com/opentower/matrix-doc/blob/main/proposals/3574-resource-markup.md)
proposes a mechanism for marking up resources (webpages, documents, videos, and
other files) using Matrix. The proposed mechanism requires an
`m.markup.location` schema for representing the location of annotations within
different kinds of resources. MSC3574 punts on what standard location types
might be available, deferring that large family of questions to other MSCs.
This MSC aims to provide two basic location types for marking up PDFs.
## Proposal
Markup locations for PDFs should approximately follow the format of embedded
annotations provided in the PDF standard, for more straightforward integration
with PDF rendering and editing libraries that clients may wish to make use of.
The PDF 1.4 standard includes 19 different kinds of annotations (see [p499
here](https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf)).
This proposal provides events for two of these: *Text Annotations*, which
represent "sticky notes" at a certain point in the text, and *Highlights*,
which represent a certain range of text that should be highlighted.
PDF 1.4 annotations all accept a very large set of attributes. Of
these, only two are mandatory: `Subtype` and `Rect`, where `Subtype` gives the
annotation type, and `Rect` gives the position of the annotation on the PDF
page as a rectangle represented by an array of the form
`[lower-left-x, lower-left-y, upper-right-x, upper-right-y]`,
where each item is a number of "user space units" (1/72 of an inch each,
sometimes called *points*) measured from the bottom left corner of the page.
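
As a minimal sketch of how a client might turn a PDF `Rect` array into the `rect` object with `left`, `right`, `top` and `bottom` keys used later in this proposal, consider the following. The function names are hypothetical, and rounding to whole units is an assumption made here because Matrix events cannot carry floats.

```python
from typing import Dict, Sequence

POINTS_PER_INCH = 72  # one user space unit is 1/72 of an inch


def rect_array_to_object(rect: Sequence[float]) -> Dict[str, int]:
    """Convert [llx, lly, urx, ury] into a left/right/top/bottom mapping.

    Hypothetical helper: rounding to integers is an assumption, made
    because Matrix event content cannot contain floats.
    """
    if len(rect) != 4:
        raise ValueError("Rect must contain exactly four numbers")
    x1, y1, x2, y2 = rect
    # Normalise in case the two corners were supplied in another order.
    return {
        "left": round(min(x1, x2)),
        "right": round(max(x1, x2)),
        "bottom": round(min(y1, y2)),
        "top": round(max(y1, y2)),
    }


def points_to_inches(points: float) -> float:
    """Express a length in user space units as inches."""
    return points / POINTS_PER_INCH
```

For example, `rect_array_to_object([72, 144, 216, 288])` describes a rectangle whose lower-left corner is one inch from the left edge and two inches from the bottom edge of the page.
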
This MSC does not propose to include any of the optional attributes. The
`Subtype` attribute will be indicated by a key of the `m.markup.location`
object. So only `Rect`, and the attributes specific to each annotation type,
need to be provided for.
### Text Annotations
Text annotations will be represented within an `m.markup.location` as follows:
```
m.markup.location: {
    m.markup.pdf.text: {
        rect: {left: ..., right: ..., top: ..., bottom: ...}
        contents: ...
    }
    ...
}
```
The `contents` value is a string giving the text of the annotation. Precisely
how to set it is left as an implementation detail for clients.
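
To make the schematic above concrete, here is a hedged sketch of how a client might assemble the value of `m.markup.location` for a text annotation and serialize it as JSON. The helper name and the sample coordinates and text are illustrative assumptions, not part of the proposal.

```python
import json
from typing import Any, Dict

RECT_KEYS = ("left", "right", "top", "bottom")


def text_annotation_location(rect: Dict[str, int], contents: str) -> Dict[str, Any]:
    """Build the m.markup.location value for a PDF text annotation.

    Hypothetical helper: it only checks that all four rect keys exist.
    """
    missing = [key for key in RECT_KEYS if key not in rect]
    if missing:
        raise ValueError(f"rect is missing keys: {missing}")
    return {
        "m.markup.pdf.text": {
            "rect": {key: rect[key] for key in RECT_KEYS},
            "contents": contents,
        }
    }


# Illustrative values only: a 20 by 20 unit note near the top of a page.
location = text_annotation_location(
    {"left": 72, "right": 92, "top": 720, "bottom": 700},
    "Check this citation.",
)
print(json.dumps({"m.markup.location": location}, indent=2))
```
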
Optionally, `m.markup.pdf.text` may also contain a `name` value, which should
be a string that names an icon to be used in displaying the annotation.
The standard recognized values are "Comment", "Key", "Note", "Help",
"NewParagraph", "Paragraph" and "Insert".
### Highlight Annotations
Highlight Annotations will be represented within an `m.markup.location` as
follows:
```
m.markup.location: {
    m.markup.pdf.highlight: {
        rect: {left: ..., right: ..., top: ..., bottom: ...}
        contents: ...
        quadPoints: [...]
    }
    ...
}
```
The `contents` value is as above. `quadPoints` is an array of arrays of the form
`[x_1, y_1, x_2, y_2, x_3, y_3, x_4, y_4]`,
each of which gives the vertices (in counterclockwise order) of an
oriented quadrilateral region of the PDF page. Each quadrilateral is meant to
encompass a word or group of contiguous words in the highlighted text.

Optionally, the `m.markup.location` may also include a `textContent` value,
which should be a string containing the highlighted text. The `textContent`
value is not part of the PDF standard, but is included as a convenience for
clients.
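
The relationship between `quadPoints` and `rect` can be sketched as follows. This is only an illustration: it assumes that a highlight's `rect` is meant to be the bounding box of all its quadrilaterals, which this proposal does not state, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence

Quad = Sequence[float]  # [x_1, y_1, x_2, y_2, x_3, y_3, x_4, y_4]


def validate_quad(quad: Quad) -> None:
    """Check that one quadPoints entry holds four x/y pairs."""
    if len(quad) != 8:
        raise ValueError("each quadPoints entry needs 8 numbers")


def bounding_rect(quad_points: Sequence[Quad]) -> Dict[str, float]:
    """Smallest axis-aligned rectangle that contains every quadrilateral.

    Assumption: rect is taken to be the bounding box of quadPoints.
    """
    if not quad_points:
        raise ValueError("quadPoints must contain at least one quadrilateral")
    xs: List[float] = []
    ys: List[float] = []
    for quad in quad_points:
        validate_quad(quad)
        xs.extend(quad[0::2])  # even indices are x coordinates
        ys.extend(quad[1::2])  # odd indices are y coordinates
    return {"left": min(xs), "right": max(xs), "top": max(ys), "bottom": min(ys)}
```

A highlight that spans two lines of text would typically carry one quadrilateral per line, and `bounding_rect` would then cover both lines.
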
## Alternatives
Rather than accepting this MSC, we could wait for an MSC that
comprehensively specifies a complete set of location types for PDFs.
However, it seems best to work iteratively, starting with the PDF location
types that can most easily be implemented, rather than waiting until something
truly comprehensive can be implemented.

Rather than using user space units, we could use a more fine-grained
coordinate system, for example milli-units. The PDF 1.4 standard lets
coordinates take on "real number values", so precision greater than one unit
is possible. Since we can't have float values in Matrix events, the present
proposal can't capture this greater precision. However, a finer-grained unit
would probably create confusion, and precision greater than 1/72 of an inch is
probably excessive.
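
To put rough numbers on this trade-off, the sketch below compares whole units with the milli-unit alternative. The helper names are hypothetical, and round-to-nearest is an assumed quantization strategy rather than anything this proposal mandates.

```python
MM_PER_INCH = 25.4
UNITS_PER_INCH = 72


def to_whole_units(value: float) -> int:
    """Quantize a real-valued PDF coordinate to integer user space units."""
    return round(value)


def to_milli_units(value: float) -> int:
    """Quantize to thousandths of a unit (the alternative discussed above)."""
    return round(value * 1000)


def max_rounding_error_mm(step_in_units: float) -> float:
    """Worst-case rounding error, in millimetres, for a given step size."""
    return 0.5 * step_in_units / UNITS_PER_INCH * MM_PER_INCH
```

Under these assumptions, rounding to whole units misplaces a coordinate by at most half a unit, which is 1/144 of an inch or about 0.18 mm, while milli-units would shrink that bound by a factor of 1000.
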
## Security considerations
None.
## Unstable prefix
TBD