Andrew Dolgov
304d3a0b88
tag-related fixes
1. move tag sanitization to feedparser common item class
2. enforce length limit on tags when parsing
3. support multiple tags passed via one dc:subject and other such elements, parse them as a comma-separated list
4. sort resulting tag list to prevent different order between feed updates
5. remove some duplicate code related to tag validation
6. allow + symbol in tags
5 years ago
Andrew Dolgov
55ef85adc0
parser: clean() attribute values by default (except content)
6 years ago
Andrew Dolgov
54727f9534
parser: move media:element handling to feeditem_common; use media:content @media attribute to generate placeholder content-type if not specified
6 years ago
Tobias Kappé
22a866edb5
Store language of entries as indicated by the feed.
6 years ago
Andrew Dolgov
ea79a0e033
remove some redundant php closing tags
8 years ago
Andrew Dolgov
7d1e15c396
parser: properly support tag subtrees instead of text content for article content
9 years ago
Andrew Dolgov
d2bb392bae
Revert "parser: use node->c14n() instead of expecting html in nodeValue"
This reverts commit 1383514ad9.
9 years ago
Andrew Dolgov
1383514ad9
parser: use node->c14n() instead of expecting html in nodeValue
9 years ago
Andrew Dolgov
206326c219
feedparser: xpath doesn't properly query for title element if there's a default namespace so let's add a separate ugly hack for rdf:RDF feeds, thanks for that xml dipshits
10 years ago
zaikos
2b4853f515
Reverts most of be60340. Implements a simpler solution using XPath to get the proper title tag from a feed item.
10 years ago
zaikos
be60340c29
Made FeedItem_RSS::get_title() more aggressive in finding an article title.
10 years ago
Felix Eckhofer
523bd90baf
Store size of enclosure to database
11 years ago
Andrew Dolgov
31bd6f7643
parser: trim some feed-extracted data (link titles and links)
11 years ago
Andrew Dolgov
2ab7ccb695
parser: fix failing on empty media:group tags
11 years ago
Andrew Dolgov
f6c61b2d55
rss: choose between description and content:encoded based on which one is longer because publishers are idiots and can't use tags properly
11 years ago
Andrew Dolgov
e23aedd402
parser: add basic support for media:thumbnail
11 years ago
Jeffrey Tolar
ed449a9aaa
Follow the spec for <media:group>s
Each <media:group> section specifies multiple representations of the
same content.
11 years ago
Andrew Dolgov
5c54e68388
support media:description for media: enclosures
11 years ago
Andrew Dolgov
6bf61bdc63
simplify media:content xpath
11 years ago
Andrew Dolgov
4289b68f0d
parser: support media:content elements within media:group
11 years ago
Andrew Dolgov
ce5d234d63
support dc:date elements in rss and atom feeds
12 years ago
Andrew Dolgov
df2655e015
better support for atom:link elements in rss feeds, support rel=standout (fuck you google and your nonstandard shit)
12 years ago
Andrew Dolgov
042003d55e
parser/rss: try to get link from guid isPermaLink=true
12 years ago
Andrew Dolgov
2f6b75d574
fix atom:link not supported in rss feeds (fucking fuck) (2)
12 years ago
Andrew Dolgov
f7d64d03fc
fix atom:link not supported in rss feeds (fucking fuck)
12 years ago
Andrew Dolgov
99b8256794
feedparser: make content:encoded take precedence over description
12 years ago
Andrew Dolgov
8a95d630a9
fix rss content:encoded not used
12 years ago
Andrew Dolgov
b4d1690097
move common methods to feeditem_common
12 years ago
Andrew Dolgov
f11015058d
support dc:creator
12 years ago
Andrew Dolgov
d4992d6b48
add support for dc:subject and slash:comments
12 years ago
Andrew Dolgov
4c00e15b5d
pass xpath object to feeditem, support media-rss objects
12 years ago
Andrew Dolgov
b09a4cdccc
feeditem_rss: use guid element
12 years ago
Andrew Dolgov
04d2f9c831
add basic rss support
12 years ago