150 Commits (ba86c64d38d9995d38af163ae4c51a42b21d5de7)

Author SHA1 Message Date
Andrew Dolgov 528b387563 update individual feed in a separate process to prevent PHP fatal errors
(for example, OOM) from stopping the entire batch
this should also slightly increase memory budget for update processes
4 years ago
Andrew Dolgov 05744bb474 fix updater never scheduling feeds for update if they have never been updated before while having default update interval set 4 years ago
Andrew Dolgov 6811d0bde2 use self:: in some places to invoke static methods from the same class 4 years ago
Andrew Dolgov 74568df4ff remove a lot of stuff from global context (functions.php), add a few helper classes instead 4 years ago
Andrew Dolgov 3dd4169b5f clarify some URL validation-related error messages 4 years ago
Andrew Dolgov 4785f21316 update_rss_feed: log effective URL after fetching
validate_url: treat scheme as case-insensitive
4 years ago
Andrew Dolgov a4525d31b2 replace FALSE with false so that static analyzer shuts up about it 4 years ago
Andrew Dolgov afa0023c51 don't try to update manually disabled feeds even if they haven't been updated before or are marked for a manual update 4 years ago
Andrew Dolgov c352e872e9 core: pass found enclosures to HOOK_ARTICLE_FILTER
af_redditimgur: remove enclosures if we found something to embed because it's going to be a low-res thumbnail
4 years ago
Andrew Dolgov 6eb94f1e13 better support for image srcset attributes as discussed in https://community.tt-rss.org/t/problem-with-img-srcset/3519 5 years ago
Andrew Dolgov 06d2c65193 calculate_article_hash: don't die() on previous, woops 5 years ago
Andrew Dolgov 3a142cbf58 calculate_article_hash: ignore some useless or read-only fields (i.e. GUID) when calculating hash 5 years ago
Andrew Dolgov cd1f3cb8cc * store UID in article hashed GUID separately so it could be migrated cleanly to a different instance
* store resulting GUID as a JSON object so it could be extended easier if needed
5 years ago
Andrew Dolgov 3a4b9249a9 DiskCache: properly deal with srcset attributes 5 years ago
Andrew Dolgov 4a00f96733 remove unneeded var_dump() 5 years ago
Andrew Dolgov 6573541873 * add HOOK_ENCLOSURE_IMPORTED
* pass feed id to HOOK_FEED_PARSED
5 years ago
lllusion3418 ec1b0befc7 add support for video[@src] in media cache
it's a valid alternative to a source[@src] child element:
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/video
5 years ago
lllusion3418 cdde23b4dc actually download <video> posters to media cache
video[@poster] is already supported in the rewriting logic but never
actually downloaded
5 years ago
Andrew Dolgov f24ece85a6 add validationtextarea control, use it for filter match editor 5 years ago
Andrew Dolgov 6080cca9ca scrap counter cache system; rework counters to sum() booleans instead 5 years ago
Andrew Dolgov e5b7b145e5 cache media: set referrer to source URL when fetching images 5 years ago
Andrew Dolgov 304d3a0b88 tag-related fixes
1. move tag sanitization to feedparser common item class
2. enforce length limit on tags when parsing
3. support multiple tags passed via one dc:subject and other such elements, parse them as a comma-separated list
4. sort resulting tag list to prevent different order between feed updates
5. remove some duplicate code related to tag validation
6. allow + symbol in tags
5 years ago
Andrew Dolgov 8c3efd51ec reset domain hit quota on feed update start 5 years ago
Andrew Dolgov 0d7b10469b update_rss_feed: add specific logging for HOOK_FETCH_FEED, HOOK_FEED_FETCHED, HOOK_FEED_PARSED handlers 5 years ago
Andrew Dolgov 5bb8dad631 is_gzipped: don't try to strpos() over entire buffer 5 years ago
Andrew Dolgov 647c7c45eb allow article filters to modify num_comments 5 years ago
Andrew Dolgov 4e05008aac update_rss_feed: force cast initial timestamp value to integer 5 years ago
Andrew Dolgov b0d67cd3d0 rework previous to pass unformatted timestamp to plugin, and deal with formatting later
also, move timestamp-related debugging output after plugin handler
5 years ago
Andrew Dolgov 94a12b9674 pass formatted entry timestamp to article filters and allow them to modify it 5 years ago
Andrew Dolgov 6914ad1f74 retire MIN_CACHE_FILE_SIZE 5 years ago
Andrew Dolgov 84974c60a7 RSSUtils::cache_media, cache_enclosures: use DiskCache 5 years ago
Andrew Dolgov fdb6066bf6 * HOOK_ENCLOSURE_ENTRY: pass article_id to handler
* DiskCache: multiple fixes; support isWritable() for cache entries, set content-disposition for send()
* public/cached_url: allow selecting files from sub-caches other than images
* plugins/Cache_Starred_Images: rework to use DiskCache, can be enabled per-user, properly handles article enclosures, etc
5 years ago
Andrew Dolgov 19b9b27662 expire_cached_files to DiskCache::expire() 5 years ago
Andrew Dolgov 088fcf8131 move more globals to more appropriate places
set libxml to always use internal errors
6 years ago
Andrew Dolgov 4fa9aee4e7 move several more global functions to more appropriate classes 6 years ago
Andrew Dolgov 9423d72f6c parser: force libxml error messages to valid utf8 6 years ago
Andrew Dolgov c936cc3a1f use DEFAULT_SEARCH_LANGUAGE to generate tsvector index if per-feed language is not specified, also use it as default value on search form for convenience 6 years ago
Andrew Dolgov 671f4cee65 domdocument: remove old meta charset unicode hacks, replace with shorter xml preamble utf8 hack (on loadhtml where it makes sense)
af_readability: better (?) charset hack for non-unicode pages
6 years ago
Andrew Dolgov 33a2d5f8e4 update_rss_feed: set basic feed info if site_url is blank 6 years ago
Andrew Dolgov 69a691f4e1 cleanup old feed browser cache 6 years ago
Andrew Dolgov 0b74db5ad7 remove feedbrowser (other feeds) 6 years ago
Andrew Dolgov 38e01270d8 archived feeds: expire old entries (schema bump) 6 years ago
Andrew Dolgov 13e7e775a3 update_rss_feed: mark_unread_on_update should take into account catchup filter action and entry_force_catchup 6 years ago
Andrew Dolgov 949bfa3457 add minor clean()-ing on some rss feed values 6 years ago
Andrew Dolgov eedd402807 rssutils: don't gzdecode() stuff 6 years ago
Andrew Dolgov a5517fe857 fetch_file_contents: decompress gzipped data
af_readability: remove utf8 preamble hack
6 years ago
Andrew Dolgov 958fbfedb6 rssutils: check if returned data is in gzip format before trying to decode it 6 years ago
JustAMacUser 4b2f3039d2 Properly report filter plugin time (re-fixes PR 98). 6 years ago
JustAMacUser 53602096b9 Fixed misplaced bracket. 6 years ago
Andrew Dolgov f3737c0b24 update_rss_feed: add log message if article is filtered out
combine filters: fix crash on missing global function
6 years ago
Andrew Dolgov 1e3a53c037 do not try to update filter triggers if nothing was triggered (properly this time) 6 years ago
Andrew Dolgov 5780a5d501 do not try to update filter triggers if nothing was triggered 6 years ago
Andrew Dolgov 3e4326e34d add ttrss_filters2.last_triggered (bump schema version) 6 years ago
Andrew Dolgov a01c33d654 add HOOK_FILTER_TRIGGERED (for filter debugging) 6 years ago
Andrew Dolgov 3ad9944d5e fix missing sprintf() argument 6 years ago
Andrew Dolgov c10a43069e debug logging system rework:
* support various logging levels per-message
* remove hacks like debug_suppress, DAEMON_EXTENDED_DEBUG, etc
* _debug() is kept as a compatibility shim for plugins
6 years ago
Andrew Dolgov 2d54eb1a87 remove cache/simplepie 6 years ago
Andrew Dolgov 2c940c4861 better handle PDOExceptions during open transaction in feed update 6 years ago
Andrew Dolgov 665495b94b cache_media: only touch() local file if it's writable 6 years ago
Andrew Dolgov 62d0060aa1 update_daemon_common: do not abort entire batch if PDOException happens when processing individual feeds 6 years ago
fox 8ab77d19ef Merge branch 'pullreq-enclosure-content-type' of tkappe/tt-rss into master 6 years ago
Tobias Kappé ac8a0e7dc6 Differentiate enclosures based on content type.
Some RSS feeds contain multiple enclosures with the same URL. When the first of
these is not recognized as an image, later entries are not added to the
database as rows in ttrss_enclosures. This change differentiates enclosures
based on their content type, so an entry can have multiple enclosure types with
the same URL (but possibly a different content type).
6 years ago
Andrew Dolgov 163b50b15f cache_media: only show downloading debug message when actually downloading 6 years ago
Andrew Dolgov 069aea5989 remove FEED_CRYPT_KEY and everything related to it
always assume auth_pass_encrypted is false
6 years ago
Tobias Kappé 3bbaf902ab Sanitize language obtained for an entry. 6 years ago
Tobias Kappé 22a866edb5 Store language of entries as indicated by the feed. 6 years ago
BtbN 2b8afd4942 Only strip utf8mb4 if mysql_charset != utf8mb4
If a user has fixed their database properly utf8mb4 works just fine allowing emoji and other 4 byte unicode characters to work.
6 years ago
Andrew Dolgov 6e6c3a878d update_rss_feed: limit maximum length of tsvector data because of pgsql limitations 6 years ago
Andrew Dolgov 66fe33e769 bump date_updated when updated article data is saved to exclude it from purging (because it is still present in the originating feed) 7 years ago
Andrew Dolgov 963c22646b pass tsvector data as a named parameter on article update, remove escaping hacks 7 years ago
Andrew Dolgov 5edf4b73a4 add a workaround to support numeric tags 7 years ago
Andrew Dolgov 7f4a404566 include: convert some spaces to tabs 7 years ago
Andrew Dolgov 102a01354b strip utf8mb4 characters in enclosures on mysql 7 years ago
jsoares 26ad257de5 Fixed time stamping of new unmarked/unpublished articles 7 years ago
Andrew Dolgov d4c05d0be2 update_rss_feed: don't try to use quoted NOW() in query 7 years ago
Richard Mortimer aa16334f1f Include NOW() in prepared SQL for rssutils.php 7 years ago
Andrew Dolgov e6532439d6 force strip_tags() on all user input unless explicitly allowed 7 years ago
Andrew Dolgov 7c6f7bb0aa fix some minor issues found by code analyzer 7 years ago
Andrew Dolgov 342e8a9eeb move feeds cache directory to cache/feeds 7 years ago
Andrew Dolgov 93e70e36c2 force article content/etc to string when updating to avoid failing null constraint check 7 years ago
Andrew Dolgov 49a888ecce rssutils: forbid question marks in tsvector data, PDO gets confused sometimes even by quoted ?s 7 years ago
Andrew Dolgov 187abfe732 main classes: remove sql_bool_to_bool() kludge 7 years ago
Andrew Dolgov 0500e14cc2 update_rss_feed: transaction lock article processing 7 years ago
Andrew Dolgov 0567016b40 rssutils: PDO 7 years ago
Andrew Dolgov afcb105f4e rssutils: start PDO switch 7 years ago
Andrew Dolgov e50c8eaa4e enforce unconditional requests every 6 hours even if server claims data is not modified 7 years ago
Andrew Dolgov 9d930af9e1 fetch_file_contents: improve error handling
1. if request fails get error string from http response status line
2. do not override http error with possible CURL/php specific last error
3. fix silent php error generated while processing response headers to get last modified value
7 years ago
Gilles Grandou f9ad33c2d8 allows favicons to be in Windows PC BMP format 7 years ago
wn_ 3476690cbf Only require an array of basic info from 'HOOK_FEED_BASIC_INFO'.
Removes the need for the plugin to provide feed content.

Gives plugins a chance to provide 'title' and 'site_url' basic info.
Falls back to attempting retrieval+parsing of the fetch URL if needed.
7 years ago
wn_ bec5ba93e2 Add 'HOOK_FEED_BASIC_INFO' to enable plugins to provide basic feed info.
It's expected the plugin will return content parsable by FeedParser, which
will act as an interface to the basic feed info.  In the case of a plugin
that also uses 'HOOK_FETCH_FEED', both might return the same content.

The hook signature was made somewhat similar to 'HOOK_FETCH_FEED'.
7 years ago
Andrew Dolgov 153cb6d305 add support for http 304 not modified (no timestamp calculation bullshit like last time) 7 years ago
Andrew Dolgov 20d2195f13 rssutils: include comment count when calculating article hash 7 years ago
Andrew Dolgov 02f3992a5a Revert "Revert "filters: support matching on multiple feeds/categories""
This reverts commit f5d174bda9.
8 years ago
Andrew Dolgov f5d174bda9 Revert "filters: support matching on multiple feeds/categories"
This reverts commit 0bf7e007bb.
8 years ago
Andrew Dolgov 0bf7e007bb filters: support matching on multiple feeds/categories
opml: update filter export/import for new format
8 years ago
Andrew Dolgov 93af11cb7a update_daemon_common: do not escape feed_url twice, remove some comments and stuff 8 years ago
Andrew Dolgov 6fd0399694 tunables:
* add CACHE_MAX_DAYS as a tunable generic expiry interval for various cached files
* add some comments to tunables in functions.php
* rename _MIN_CACHE_FILE_SIZE to MIN_CACHE_FILE_SIZE
* respect MIN_CACHE_FILE_SIZE setting in a few more places where content is cached
8 years ago
Andrew Dolgov 5b6ea1ef91 remove pubsubhubbub: dead 8 years ago
Andrew Dolgov 4fd0790804 fix DAEMON_SLEEP_INTERVAL not being defined when used
enforce minimum 60 sec spawn/sleep interval in update processes
8 years ago
Andrew Dolgov e6c886bf66 wrap rssfuncs into rssutils class 8 years ago