33 Commits (8c907e3fe4b3d0ecdb7d2acbbd206ab280aa7ee8)

Author SHA1 Message Date
Andrew Dolgov 304d3a0b88 tag-related fixes
1. move tag sanitization to feedparser common item class
2. enforce length limit on tags when parsing
3. support multiple tags passed via one dc:subject and other such elements, parse them as a comma-separated list
4. sort resulting tag list to prevent different order between feed updates
5. remove some duplicate code related to tag validation
6. allow + symbol in tags
5 years ago
Andrew Dolgov 55ef85adc0 parser: clean() attribute values by default (except content) 6 years ago
Andrew Dolgov 54727f9534 parser: move media:element handling to feeditem_common; use media:content @media attribute to generate placeholder content-type if not specified 6 years ago
Tobias Kappé 22a866edb5 Store language of entries as indicated by the feed. 6 years ago
Andrew Dolgov ea79a0e033 remove some redundant php closing tags 8 years ago
Andrew Dolgov 7d1e15c396 parser: properly support tag subtrees instead of text content for article content 9 years ago
Andrew Dolgov d2bb392bae Revert "parser: use node->c14n() instead of expecting html in nodeValue"
This reverts commit 1383514ad9.
9 years ago
Andrew Dolgov 1383514ad9 parser: use node->c14n() instead of expecting html in nodeValue 9 years ago
Andrew Dolgov 206326c219 feedparser: xpath doesn't properly query for title element if there's a default namespace so let's add a separate ugly hack for rdf:RDF feeds, thanks for that xml dipshits 10 years ago
zaikos 2b4853f515 Reverts most of be60340. Implements a simpler solution using XPath to get the proper title tag from a feed item. 10 years ago
zaikos be60340c29 Made FeedItem_RSS::get_title() more aggressive in finding an article title. 10 years ago
Felix Eckhofer 523bd90baf Store size of enclosure to database 11 years ago
Andrew Dolgov 31bd6f7643 parser: trim some feed-extracted data link titles and links 11 years ago
Andrew Dolgov 2ab7ccb695 parser: fix failing on empty media:group tags 11 years ago
Andrew Dolgov f6c61b2d55 rss: choose between description and content:encoded based on which one is longer because publishers are idiots and can't use tags properly 11 years ago
Andrew Dolgov e23aedd402 parser: add basic support for media:thumbnail 11 years ago
Jeffrey Tolar ed449a9aaa Follow the spec for <media:group>s
Each <media:group> section specifies multiple representations of the
same content.
11 years ago
Andrew Dolgov 5c54e68388 support media:description for media: enclosures 11 years ago
Andrew Dolgov 6bf61bdc63 simplify media:content xpath 11 years ago
Andrew Dolgov 4289b68f0d parser: support media:content elements within media:group 11 years ago
Andrew Dolgov ce5d234d63 support dc:date elements in rss and atom feeds 12 years ago
Andrew Dolgov df2655e015 better support for atom:link elements in rss feeds, support rel=standout (fuck you google and your nonstandard shit) 12 years ago
Andrew Dolgov 042003d55e parser/rss: try to get link from guid isPermaLink=true 12 years ago
Andrew Dolgov 2f6b75d574 fix atom:link not supported in rss feeds (fucking fuck) (2) 12 years ago
Andrew Dolgov f7d64d03fc fix atom:link not supported in rss feeds (fucking fuck) 12 years ago
Andrew Dolgov 99b8256794 feedparser: make content:encoded take precedence over description 12 years ago
Andrew Dolgov 8a95d630a9 fix rss content:encoded not used 12 years ago
Andrew Dolgov b4d1690097 move common methods to feeditem_common 12 years ago
Andrew Dolgov f11015058d support dc:creator 12 years ago
Andrew Dolgov d4992d6b48 add support for dc:subject and slash:comments 12 years ago
Andrew Dolgov 4c00e15b5d pass xpath object to feeditem, support media-rss objects 12 years ago
Andrew Dolgov b09a4cdccc feeditem_rss: use guid element 12 years ago
Andrew Dolgov 04d2f9c831 add basic rss support 12 years ago
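
The tag handling listed in commit 304d3a0b88 (sanitize, enforce a length limit, split comma-separated values, sort, allow "+") can be sketched roughly as follows. This is an illustrative Python sketch, not the project's code: the function name `normalize_tags`, the 250-character limit, the lowercasing, the exact set of allowed characters, and the choice to drop over-long tags rather than truncate them are all assumptions made for the example.

```python
import re

# Assumed limit; the commit message does not state the actual value.
MAX_TAG_LENGTH = 250


def normalize_tags(raw_values, max_length=MAX_TAG_LENGTH):
    """Turn raw category strings into a sorted, de-duplicated tag list.

    Hypothetical helper: name, limit and sanitization rules are assumptions.
    """
    tags = set()
    for raw in raw_values:
        # One element (e.g. a single dc:subject) may carry several tags
        # as a comma-separated list.
        for piece in raw.split(","):
            tag = piece.strip().lower()
            # Keep word characters, spaces, hyphens and "+"; drop the rest.
            tag = re.sub(r"[^\w \-+]", "", tag).strip()
            # Skip empty tags and tags over the length limit.
            if tag and len(tag) <= max_length:
                tags.add(tag)
    # Sorting keeps the order stable between feed updates.
    return sorted(tags)
```

Calling `normalize_tags(["C++, Linux", "linux"])` would yield one entry per distinct tag in a stable order, so an unchanged feed does not appear to change its tags between updates.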