269 Commits (54007a45f11ed730352324289b714baefd2901eb)

Author SHA1 Message Date
Sergey M․ 824fa51165
[utils] Improve subtitles_filename (closes #22753) 5 years ago
Sergey M․ 28cc2241e4
[utils] Restrict parse_codecs and add theora as known vcodec (#21381) 5 years ago
Sergey M․ 53cd37bac5
[utils] Improve strip_or_none 5 years ago
Jakub Wilk fd35d8cdfd [utils] Transliterate "þ" as "th" (#20897)
Despite visual similarity "þ" is unrelated to "p".
It is normally transliterated as "th":

    $ echo þ-Þ | iconv -t ASCII//TRANSLIT
    th-TH
5 years ago
Sergey M․ 5e1271c56d
[utils] Improve int_or_none and float_or_none (#20403) 5 years ago
Sergey M․ 0dc41787af
[utils] Introduce parse_bitrate 5 years ago
Sergey M․ fad4ceb534
[utils] Fix urljoin for paths with non-http(s) schemes 5 years ago
Sergey M․ 25d110be30
[utils] Properly recognize AV1 codec (closes #17506) 6 years ago
Sergey M․ af03000ad5
[utils] Introduce url_or_none 6 years ago
Sergey M․ e9c671d5e8
[utils] Allow JSONP with empty func name (closes #17028) 6 years ago
Enes 85750f8972 [openload] Improve ext extraction 6 years ago
Remita Amine 3bb3ff38a1 [test_utils] add tests for b836118724 6 years ago
Sergey M․ 6cc622327f
[utils] Introduce merge_dicts 6 years ago
Sergey M․ 1cc47c6674
[utils] Fix match_str for boolean meta fields 6 years ago
Philipp Hagemeister f226880c6d [tennistv] Add support for tennistv.com 6 years ago
Sergey M․ b871d7e954
[utils] Add parse_resolution 6 years ago
Sergey M․ befa4708fd
[utils] Fixup some common URL's typos in sanitize_url (closes #15649) 6 years ago
Sergey M․ c707b1d828
[test_utils] Add tests for malformed JSON handling in js_to_json 6 years ago
Mike Fährmann c384d537f8 [util] Improve scientific notation handling in js_to_json (closes #14789) 6 years ago
Sergey M․ b555ae9bf1
[utils] Add another date format pattern (#14999) 7 years ago
Sergey M․ 056653bbb1
[utils] Add support for zero years and months in parse_duration 7 years ago
Yen Chi Hsuan 3869028ffb [utils] Use bytes-like objects in dfxp2srt
This fixes handling of non-UTF8 TTML subtitles

Closes #14191
7 years ago
Yen Chi Hsuan 95f3f7c20a
[utils] Fix unescapeHTML for misformed string like "&a"" (#13935) 7 years ago
Sergey M․ 5b232f46dc
[utils] Skip missing params in cli_bool_option (closes #13865) 7 years ago
Sergey M․ dee2ff1d81
[test_utils] Fix tests under Windows 7 years ago
Yen Chi Hsuan 609ff8ca19 [utils] Support attributes with no values in get_elements_by_attribute() 7 years ago
Sergey M․ b4a3d461e4
[utils] Handle HTMLParseError in extract_attributes (closes #13349) 7 years ago
Sergey M․ 2ae2ffda5e
[utils] Improve unified_timestamp 7 years ago
Yen Chi Hsuan 5552c9eb0f
[utils] Recognize more patterns in strip_jsonp()
Used in Youku Show pages
7 years ago
Yen Chi Hsuan 0c26548601
[cda] Implement birthday verification (closes #12789) 7 years ago
Sergey M․ deef31955b
[utils] Improve unified_timestamp
Seen at http://zaq1.pl/video/xev0e
7 years ago
Tithen-Firion 9222d94510 [test_utils] Add one more clean_html test 7 years ago
Remita Amine 5b995f713b [utils] add support for ttml styles 7 years ago
Sergey M․ a426ef6d78
[test_utils] Do not use dash in env variables' names 7 years ago
Sergey M․ 41c5e60dd5
[test_utils] Fix expand_path tests 7 years ago
Sergey M․ 51098426b8
[utils] Introduce expand_path 7 years ago
Sergey M․ 4b5de77bdb
[utils] Process bytestrings in urljoin (closes #12369) 7 years ago
Yen Chi Hsuan f48409c7ac [utils] Add pkcs1pad
Used in daisuki.net (#4738)
7 years ago
Thomas Christlieb 2af12ad9d2 Introduce get_elements_by_class and get_elements_by_attribute utility functions 7 years ago
Sergey M․ 4195096ea8
[utils] Improve comments processing in js_to_json (closes #11947) 7 years ago
Michal Čihař b3ee552e4b
[utils] Handle single-line comments in js_to_json 7 years ago
Sergey M․ 15846398ca
[utils] Improve parse_duration 7 years ago
Sergey M․ cb655f34fb
[utils] Add more date formats 7 years ago
Remita Amine 7fe1592073 [common] fix dash codec information for mixed videos and fragment url construction(#11490) 8 years ago
Sergey M․ b0c65c677f
[utils] Improve urljoin 8 years ago
Sergey M․ e34c33614d
[utils] Add convenience urljoin 8 years ago
Yen Chi Hsuan 582be35847
Update coding style after pycodestyle 2.1.0
In pycodestyle 2.1.0, E305 was introduced, which requires two blank
lines after top level declarations, too.

See https://github.com/PyCQA/pycodestyle/issues/400

See also #10689; thanks @stepshal for first mentioning this issue and
initial patches
8 years ago
Sergey M․ 02dc0a36b7
[utils] Introduce base_url 8 years ago
Sergey M․ c6eed6b8c0
[utils] Lower priority for rare date formats and add tests 8 years ago
Sergey M․ 3e4185c396
[utils] Use native french month names 8 years ago
Sergey M․ f6717dec8a
[utils] Improve month_by_name and add tests 8 years ago
Sergey M․ 6562d34a8c
[utils] Improve mimetype2ext 8 years ago
Yen Chi Hsuan 70852b47ca
[utils] Recognize units with full names in parse_filename
Reference: https://en.wikipedia.org/wiki/Template:Quantities_of_bytes
8 years ago
Yen Chi Hsuan e4659b4547
[utils] Correct octal/hexadecimal number detection in js_to_json 8 years ago
Sergey M․ 13585d7682
[utils] Recognize lowercase units in parse_filesize 8 years ago
Remita Amine 5f2c2b7936 [test_utils] add test for option with not str value 8 years ago
Sergey M․ a8795327ca
[utils] Add support TV Parental Guidelines ratings in parse_age_limit 8 years ago
Yen Chi Hsuan 7dc2a74e0a
[utils] Fix unified_timestamp for formats parsed by parsedate_tz() 8 years ago
Yen Chi Hsuan 0b68de3cc1 Merge pull request #8876 from remitamine/html5_media
[extractor/common] add helper method to extract html5 media entries
8 years ago
Yen Chi Hsuan 84c237fb8a
[utils] Add get_element_by_class
For #9950
8 years ago
Remita Amine dfaa86b75e [test_utils] add test for smuggling a smuggled url 8 years ago
remitamine 4f3c5e0627 [utils] add helper function for parsing codecs 8 years ago
Yen Chi Hsuan 1143535d76
[utils] Add urshift()
Used in IqiyiIE and LeIE
8 years ago
Sergey M․ 46f59e89ea
[utils] Add unified_timestamp 8 years ago
Yen Chi Hsuan 47212f7bcb
[utils] Don't transform numbers not starting with a zero
Fix test_Viidea and maybe others
8 years ago
Yen Chi Hsuan 55b2f099c0
[utils] Decode HTML5 entities
Used in test_Vporn_1. Also related to #9270
8 years ago
bzc6p b96f007eeb Added sanitization support for Hungarian letters Ő and Ű 8 years ago
Sergey M․ 46bc9b7d7c
[utils] Allow None in remove_{start,end} 8 years ago
Sergey M․ 364cf465dd
[test_utils] PEP 8 8 years ago
Sergey M․ 89ac4a19e6
[utils] Process non-base 10 integers in js_to_json 8 years ago
felix bd1e484448
[utils] js_to_json: various improvements
now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
8 years ago
Yen Chi Hsuan 778a1ccca7
[utils] Add Œ and œ found in French to ACCENT_CHARS
Fixes #9463
8 years ago
Yen Chi Hsuan dab0daeeb0
[utils,compat] Move struct_pack and struct_unpack to compat.py 8 years ago
Adam Thalhammer 31c4448f6e Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes #9347 8 years ago
Adam Thalhammer 79a2e94e79 Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes #9347 8 years ago
Sergey M b6c0d4f431 Merge pull request #9110 from remitamine/parse_duration
[utils] imporove parse_duration to handle more formats
8 years ago
remitamine acaff49575 [utils] imporove parse_duration to handle more formats 8 years ago
Jaime Marquínez Ferrándiz eb9c3edd5e [test/utils] Add test for date_from_str 8 years ago
Yen Chi Hsuan 81f36eba88 [test/test_utils] Update for escape_url change (again) 8 years ago
Yen Chi Hsuan 2d60465e44 [test/test_utils] Update for escape_url change 8 years ago
Jaime Marquínez Ferrándiz 782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string 8 years ago
Sergey M․ c5229f3926 [utils] PEP 8 8 years ago
remitamine 83548824c2 Merge pull request #8092 from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
8 years ago
Sergey M․ fb47597b09 [bbc] Generalize unit table lookup and add parse_count 8 years ago
remitamine 3201a67f61 [test/test_utils] add more tests for update_url_query 8 years ago
remitamine fb640d0a3d [test/test_utils] add tests for update_url_query 8 years ago
Brian Foley 8bb56eeeea [utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
8 years ago
Yen Chi Hsuan 5eb6bdced4 [utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
8 years ago
Sergey M․ f160785c5c [utils] Remove AM/PM from unified_strdate patterns 8 years ago
Yen Chi Hsuan 5bc880b988 [utils] Add OHDave's RSA encryption function 8 years ago
Sergey M․ 8411229bd5 [utils] Allow dot in strip_jsonp 8 years ago
Sergey M․ 86296ad2cd [utils] Add ability to control skipping false values in dict_get 8 years ago
Sergey M․ cbecc9b903 [utils] Add dict_get convenience method 8 years ago
Sergey M․ 6b77d52b1f [test_utils] Add tests for encode_compat_str 9 years ago
Yen Chi Hsuan db2fe38b55 [utils] Support alternative timestamp format in TTML
Fixes #7608
9 years ago
Yen Chi Hsuan d631d5f9f2 [utils] Fix TTML conversion
Tolerate invalid timestamps (closes #7909)
9 years ago
Sergey M․ 31b2051e21 [utils] Add remove_quotes 9 years ago
Sergey M․ 9cb9a5df77 [utils] Check ext with trailing slash against the list of known extensions 9 years ago
Sergey M․ 5035536e3f [test_utils] Add tests for determine_ext 9 years ago
Sergey M․ 7aefc49c40 [utils] Skip invalid/non HTML entities (Closes #7518) 9 years ago
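
Note on 5eb6bdced4: the five changes listed for encode_base_n() can be illustrated with a minimal sketch. This is not the code from that commit. The signature, the default 62-character table, and the error messages are assumptions made for illustration; only the behaviours named in the commit message (tables longer than 62 characters, ValueError for invalid input, first table character for 0) come from the log.

```python
# Hypothetical sketch of the behaviour described in commit 5eb6bdced4.
# The signature and DEFAULT_TABLE are assumptions, not the project's code.
DEFAULT_TABLE = (
    '0123456789'
    'abcdefghijklmnopqrstuvwxyz'
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
)


def encode_base_n(num, n, table=None):
    """Encode the non-negative integer num in base n using table's characters."""
    if table is None:
        # Without a custom table, at most 62 symbols are available.
        table = DEFAULT_TABLE[:n]

    # Invalid input raises ValueError rather than tripping an assertion.
    if n < 2:
        raise ValueError('base must be at least 2, got %d' % n)
    if n > len(table):
        raise ValueError('base %d exceeds table length %d' % (n, len(table)))
    if num < 0:
        raise ValueError('cannot encode negative number %d' % num)

    # Zero maps to the first character of the table, which need not be '0'.
    if num == 0:
        return table[0]

    digits = []
    while num:
        digits.append(table[num % n])
        num //= n
    # Digits were collected least significant first.
    return ''.join(reversed(digits))
```

Passing a custom table is what allows bases above 62 in this sketch: a caller supplying a 64-character alphabet can encode in base 64, while asking for base 70 with the default table raises ValueError.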