Compare commits

...

121 Commits

Author SHA1 Message Date
dirkf 956b8c5855 [YouTube] Bug-fix for `c1f5c3274a` 1 week ago
dirkf d5f561166b [core] Re-work format_note display in format list with abbreviated codec name 1 week ago
dirkf d0283f5385 [YouTube] Revert forcing player JS by default
* still leaving the parameters in place

thx bashonly for confirming this suggestion
2 weeks ago
dirkf 6315f4b1df [utils] Support additional codecs and dynamic_range 2 weeks ago
dirkf aeb1254fcf [YouTube] Fix playlist thumbnail extraction
Thx seproDev, yt-dlp/yt-dlp#11615
2 weeks ago
dirkf 25890f2ad1 [YouTube] Improve detection of geo-restriction
Thx yt-dlp
2 weeks ago
dirkf d65882a022 [YouTube] Improve mark_watched()
Thx: Brett824, yt-dlp/yt-dlp#4146
2 weeks ago
dirkf 39378f7b5c [YouTube] Fix incorrect chapter extraction
* align `_get_text()` with yt-dlp (thx, passim) at last
2 weeks ago
dirkf 6f5d4c3289 [YouTube] Improve targeting of pre-roll wait
Experimental for now.
Thx: yt-dlp/yt-dlp#14646
2 weeks ago
dirkf 5d445f8c5f [YouTube] Re-work client selection
* use `android_sdkless` by default
* use `web_safari` (HLS only) if logged in
* skip any non-HLS format with n-challenge
2 weeks ago
dirkf a1e2c7d90b [YouTube] Add further InnerTube clients
FWIW: android-sdkless, tv_downgraded, web_creator
Thx yt-dlp passim
2 weeks ago
dirkf c55ace3c50 [YouTube] Use insertion-order-preserving dict for InnerTube client data 2 weeks ago
dirkf 43e3121020 [utils] Align `parse_duration()` behaviour with yt-dlp
* handle comma-separated long-form durations
* support : as millisecond separator.
2 weeks ago
dirkf 7a488f7fae [utils] Stabilise traversal results using `compat_dict`
In `traverse_obj()`, use `compat_dict` to construct dicts,
ensuring insertion order sort, but`compat_builtin_dict`
to test for dict-iness...
2 weeks ago
dirkf 5585d76da6 [compat] Add `compat_dict`
A dict that preserves insertion order and otherwise resembles the
dict builtin (if it isn't it) rather than `collections.OrderedDict`.

Also:
* compat_builtins_dict: the built-in definition in case `compat_dict`
  was imported as `dict`
* compat_dict_items: use instead of `dict.items` to get items from
  a `compat_dict` in insertion order, if you didn't define `dict` as
  `compat_dict`.
2 weeks ago
dirkf 931e15621c [compat] Add `compat_abc_ABC`
Base class for abstract classes
2 weeks ago
dirkf 27867cc814 [compat] Add `compat_thread` 2 weeks ago
dirkf 70b40dd1ef [utils] Add `subs_list_to_dict()` traversal helper
Thx: yt-dlp/yt-dlp#10653, etc
2 weeks ago
dirkf a9b4649d92 [utils] Apply `partial_application` decorator to existing functions
Thx: yt-dlp/yt-dlp#10653 (etc)
2 weeks ago
dirkf 23a848c314 [utils] Add `partial_application` decorator function
Thx: yt-dlp/yt-dlp#10653
2 weeks ago
dirkf a96a778750 [core] Fix housekeeping for `available_at` 2 weeks ago
dirkf 68fe8c1781 [utils] Support traversal helper functions `require`, `value`, `unpack`
Thx: yt-dlp/yt-dlp#10653
2 weeks ago
dirkf 96419fa706 [utils] Support `filter` traversal key
Thx yt-dlp/yt-dlp#10653
2 weeks ago
dirkf cca41c9d2c [test] Move dict_get() traversal test to its own class
Matches yt-dlp/yt-dlp#9426
2 weeks ago
dirkf bc39e5e678 [test] Fix test_traversal_morsel for Py 3.14+
Thx: yt-dlp/yt-dlp#13471
2 weeks ago
dirkf 014ae63a11 [test] Support additional args and kwargs in report_warning() mocks 2 weeks ago
dirkf 1e109aaee1 [workflows/ci] Avoid installing wheel and setuptools with pip
Works around dependent wheel installation failure with Py 3.4 from 2025-10
2 months ago
dirkf efb4011211 [YouTube] Introduce `_extract_and_report_alerts()` per yt-dlp
Fixes #33196.

Also removing previous `_extract_alerts()` method.
2 months ago
dirkf c1f5c3274a [YouTube] Improve some traversals
Pending full alignment with yt-dlp ...
2 months ago
dirkf e21ff28f6f [YouTube] Misc clean-ups from linter, etc 2 months ago
dirkf 82552faba6 [workflows/ci] Update to windows-2022 runner
FFS
2 months ago
dirkf 617d4e6466 [core] Support explicit `--no-list-formats` option 2 months ago
dirkf 9223fcc48a [YouTube] Support `LOCKUP_CONTENT_TYPE_VIDEO` in subscriptions feed extraction
From yt-dlp/yt-dlp#13665), thx bashonly
2 months ago
dirkf 4222c6d78b [YouTube] Extract fallback title and description from initial data
Based on yt-dlp/yt-dlp#14078, thx bashonly
2 months ago
dirkf 2735d1bf1d [YouTube] Extract srt subtitles
From yt-dlp/yt-dlp#13411, thx gamer191
2 months ago
dirkf f2a774cb9d [YouTube] Fix subtitles extraction
From yt-dlp/yt-dlp#13659, thx bashonly
2 months ago
dirkf 92680b127f [YouTube] Handle required preroll waiting period
* Based on yt-dlp/yt-dlp#14081, thx bashonly
* Uses internal `youtube_preroll_sleep` param, default 6s
2 months ago
dirkf 40ab920354 [downloader] Delay download according to `available_at` format key 2 months ago
dirkf 0739f58f90 [YouTube] Implement player JS override for player `0004de42`
* based on yt-dlp/yt-dlp#14398, thx seproDev
* adds --youtube-player-js-variant option
* adds --youtube-player-js-version option
* sets defaults to main variant of player `0004de42`
* fixes #33187, for now
2 months ago
dirkf aac0148b89 [YouTube] Force `WEB` user agent for video page download
Fixes #33142, until default UAs work.
2 months ago
dirkf 7f7b3881aa [YouTube] Handle Web Safari formats
From yt-dlp/yt-dlp#14168, thx bashonly.
2 months ago
dirkf 0c41b03114 [YouTube] Update player client details 2 months ago
dirkf 7c6630bfdd [YouTube] Miscellaneous clean-ups 2 months ago
dirkf a084c80f7b [YouTube] Fix 680069a, excess `min_ver`
Resolves #33125.
7 months ago
dirkf e102b9993a [workflows/ci.yml] Move pinned Ubuntu runner images from withdrawn 20.4 to 22.04
* fix consequent missing `python-is-python2` package
7 months ago
dirkf 680069a149 [YouTube] Improve n-sig function extraction for player `aa3fc80b`
Resolves #33123
7 months ago
dirkf 4a31290ae1 [YouTube] Delete cached problem nsig cache data on descrambling error
* inspired by yt-dlp/yt-dlp#12750
7 months ago
dirkf 3a42f6ad37 [YouTube] Cache signature timestamp from player JS
* if the YT webpage can't be loaded, getting the `sts` requires loading the
player JS: this caches it
* based on yt-dlp/yt-dlp#13047, thx bashonly
7 months ago
dirkf ec75141bf0 [Cache] Add `clear` function 7 months ago
dirkf c052a16f72 [JSInterp] Add tests and relevant functionality from yt-dlp
* thx seproDev, bashonly: yt-dlp/yt-dlp#12760, yt-dlp/yt-dlp#12761:
  - Improve nested attribute support
  - Pass global stack when extracting objects
  - interpret_statement: Match attribute before indexing
  - Fix assignment to array elements with nested brackets
  - Add new signature tests
  - Invalidate JS function cache
  - Avoid testdata dupes now that we cache by URL

* rework nsig function name search
* fully fixes #33102
* update cache required versions
* update program version
8 months ago
dirkf bd2ded59f2 [JSInterp] Improve unary operators; add `!` 8 months ago
dirkf 16b7e97afa [JSInterp] Add `_separate_at_op()` 8 months ago
dirkf d21717978c [JSInterp] Improve JS classes, etc 8 months ago
dirkf 7513413794 [JSInterp] Reorganise some declarations to align better with yt-dlp 8 months ago
dirkf 67dbfa65f2 [InfoExtractor] Fix merging subtitles to empty target 8 months ago
dirkf 6eb6d6dff5 [InfoExtractor] Use local variants for remaining parent method calls
* ... where defined
8 months ago
dirkf 6c40d9f847 [YouTube] Remove remaining hard-coded API keys
* no longer required for these cases
8 months ago
dirkf 1b08d3281d [YouTube] Fix playlist continuation extraction
* thx coletdjnz, bashonly: yt-dlp/yt-dlp#12777
8 months ago
dirkf 32b8d31780 [YouTube] Support shorts playlist
* only 1..100: yt-dlp/yt-dlp#11130
8 months ago
dirkf 570b868078 [cache] Use `esc_rfc3986` to encode cache key 8 months ago
dirkf 2190e89260 [utils] Support optional `safe` argument for `escape_rfc3986()` 8 months ago
dirkf 7e136639db [compat] Improve Py2 compatibility for URL Quoting 8 months ago
dirkf cedeeed56f [cache] Align further with yt-dlp
* use compat_os_makedirs
* support non-ASCII characters in cache key
* improve logging
8 months ago
dirkf add4622870 [compat] Add compat_os_makedirs
* support exists_ok parameter in Py < 3.2
8 months ago
dirkf 9a6ddece4d [core] Refactor message routines to align better with yt-dlp
* in particular, support `only_once` in the same methods
8 months ago
dirkf 3eb8d22ddb
[JSInterp] Temporary fix for #33102 8 months ago
dirkf 4e714f9df1 [Misc] Correct [_]IE_DESC/NAME in a few IEs
* thx seproDev, yt-dlp/yt-dlp/pull/12694/commits/ae69e3c
* also add documenting comment in `InfoExtractor`
8 months ago
dirkf c1ea7f5a24 [ITV] Mark ITVX not working
* update old shim
* correct [_]IE_DESC
8 months ago
dirkf 2b4fbfce25 [YouTube] Support player `4fcd6e4a`
thx seproDev, bashonly: yt-dlp/yt-dlp#12748
8 months ago
dirkf 1bc45b8b6c [JSInterp] Use `,` for join() with null/undefined argument
Eg: [1,2,3].join(null) -> '1,2,3'
8 months ago
dirkf b982d77d0b [YouTube] Align signature tests with yt-dlp
thx bashonly, yt-dlp/yt-dlp#12725
8 months ago
dirkf c55dbf4838 [YouTube] Update signature extraction for players `643afba4`, `363db69b` 8 months ago
dirkf 087d865230 [YouTube] Support new player URL patterns 8 months ago
dirkf a4fc1151f1 [JSInterp] Improve indexing
* catch invalid list index with `ValueError` (eg [1, 2]['ab'] -> undefined)
* allow assignment outside existing list (eg var l = [1,2]; l[9] = 0;)
8 months ago
dirkf a464c159e6 [YouTube] Make `_extract_player_info()` use `_search_regex()` 8 months ago
dirkf 7dca08eff0 [YouTube] Also get original of translated automatic captions 8 months ago
dirkf 2239ee7965 [YouTube] Get subtitles/automatic captions from both web and API responses 8 months ago
dirkf da7223d4aa [YouTube] Improve support for tce-style player JS
* improve extraction of global "useful data" Array from player JS
* also handle tv-player and add tests: thx seproDev (yt-dlp/yt-dlp#12684)

Co-Authored-By: sepro <sepro@sepr0.com>
9 months ago
dirkf 37c2440d6a [YouTube] Update player client data
thx seproDev (yt-dlp/yt-dlp#12603)

Co-authored-by: sepro <sepro@sepr0.com>
9 months ago
dirkf 420d53387c [JSInterp] Improve tests
* from yt-dlp/yt-dlp#12313
* also fix d7c2708
9 months ago
dirkf 32f89de92b [YouTube] Update TVHTML5 client parameters
* resolves #33078
9 months ago
dirkf 283dca56fe [YouTube] Initially support tce-style player JS
* resolves #33079
9 months ago
dirkf 422b1b31cf [YouTube] Temporarily redirect from tce-style player JS 9 months ago
dirkf 1dc27e1c3b [JSInterp] Make indexing error handling more conformant
* by default TypeError -> undefined, else raise
* set allow_undefined=True/False to override
9 months ago
dirkf af049e309b [JSInterp] Handle undefined, etc, passed to JS_RegExp and Exception 9 months ago
dirkf 94849bc997 [JSInterp] Improve Date processing
* add JS_Date class implementing JS Date
* support constructor args other than date string
* support static methods of Date
* Date objects are still automatically coerced to timestamp before using in JS.
9 months ago
dirkf 974c7d7f34 [compat] Fix inheriting from compat_collections_chain_map
* see ytdl-org/youtube-dl#33079#issuecomment-2704038049
9 months ago
dirkf 8738407d77 [compat] Support zstd Content-Encoding
* see RFC 8878 7.2
9 months ago
dirkf cecaa18b80 [compat] Clean-up
* make workaround_optparse_bug9161 private
* add comments
* avoid leaving test objects behind
9 months ago
dirkf 673277e510
[YouTube] Fix 91b1569 9 months ago
dirkf 91b1569f68
[YouTube] Fix channel playlist extraction (#33074)
* [YouTube] Extract playlist items from LOCKUP_VIEW_MODEL_...
* resolves #33073
* thx seproDev (yt-dlp/yt-dlp#11615)

Co-authored-by: sepro <sepro@sepr0.com>
9 months ago
dirkf 711e72c292 [JSInterp] Fix bit-shift coercion for player 9c6dfc4a 10 months ago
dirkf 26b6f15d14 [compat] Make casefold private
* if required, not supported:
`from youtube_dl.casefold import _casefold as casefold`
10 months ago
dirkf 5975d7bb96 [YouTube] Use X-Goog-Visitor-Id
* required with tv player client
* resolves #33030
11 months ago
dirkf 63fb0fc415 [YouTube] Retain .videoDetails members from all player responses 11 months ago
dirkf b09442a2f4 [YouTube] Also use ios client when is_live 11 months ago
dirkf 55ad8a24ca [YouTube] Support `... /feeds/videos.xml?playlist_id={pl_id}` 11 months ago
dirkf 21fff05121 [YouTube] Switch to TV API client
* thx yt-dlp/yt-dlp#12059
11 months ago
dirkf 1036478d13 [YouTube] Endure subtitle URLs are complete
* WEB URLs are, MWEB not
* resolves #33017
11 months ago
dirkf 00ad2b8ca1 [YouTube] Refactor subtitle processing
* move to internal function
* use `traverse-obj()`
11 months ago
dirkf ab7c61ca29 [YouTube] Apply code style changes, trailing commas, etc 11 months ago
dirkf 176fc2cb00 [YouTube] Avoid early crash if webpage can't be read
* see issue #33013
11 months ago
dirkf d55d1f423d [YouTube] Always extract using MWEB API client
* temporary fix-up for 403 on download
* MWEB parameters from yt-dlp 2024-12-06
12 months ago
dirkf eeafbbc3e5 [YouTube] Fix signature function extraction for `2f1832d2`
* `_` was omitted from patterns
* thx yt-dlp/yt-dlp#11801

Co-authored-by: bashonly
12 months ago
dirkf cd7c7b5edb [YouTube] Simplify pattern for nsig function name extraction 12 months ago
dirkf eed784e15f [YouTube] Pass nsig value as return hook, fixes player `3bb1f723` 12 months ago
dirkf b4469a0f65 [YouTube] Handle player `3bb1f723`
* fix signature code extraction
* raise if n function returns input value
* add new tests from yt-dlp

Co-authored-by: bashonly
12 months ago
dirkf ce1e556b8f [jsinterp] Add return hook for player `3bb1f723`
* set var `_ytdl_do_not_return` to a specific value in the scope of a function
* if an expression to be returned has that value, `return` becomes `void`
12 months ago
dirkf f487b4a02a [jsinterp] Strip /* comments */ when parsing
* NB: _separate() is looking creaky
12 months ago
dirkf 60835ca16c [jsinterp] Fix and improve "methods"
* push, unshift return new length
* impove edge cases for push/pop, shift/unshift, forEach, indexOf, charCodeAt
* increase test coverage
12 months ago
dirkf 94fd774608 [jsinterp] Fix and improve split/join
* improve split/join edge cases
* correctly implement regex split (not like re.split)
12 months ago
dirkf 5dee6213ed [jsinterp] Fix and improve arithmetic operations
* addition becomes concat with a string operand
* improve handling of edgier cases
* arithmetic in float like JS (more places need cast to int?)
* increase test coverage
12 months ago
dirkf 81e64cacf2 [jsinterp] Support multiple indexing (eg a[1][2])
* extend single indexing with improved RE (should probably use/have used _separate_at_paren())
* fix some cases that should have given undefined, not throwing
* standardise RE group names
* support length of objects, like {1: 2, 3: 4, length: 42}
12 months ago
dirkf c1a03b1ac3 [jsinterp] Fix and improve loose and strict equality operations
* reimplement loose equality according to MDN (eg, 1 == "1")
* improve strict equality (eg, "abc" === "abc" but 'abc' is not 'abc')
* add tests for above
12 months ago
dirkf 118c6d7a17 [jsinterp] Implement `typeof` operator 12 months ago
dirkf f28d7178e4 [InfoExtractor] Use kwarg maxsplit for re.split
* May become kw-only in future Pythons
12 months ago
dirkf c5098961b0 [Youtube] Rework n function extraction pattern
Now also succeeds with player b12cc44b
1 year ago
dirkf dbc08fba83 [jsinterp] Improve slice implementation for player b12cc44b
Partly taken from yt-dlp/yt-dlp#10664, thx seproDev
        Fixes #32896
1 year ago
Aiur Adept 71223bff39
[Youtube] Fix nsig extraction for player 20dfca59 (#32891)
* dirkf's patch for nsig extraction
* add generic search per  yt-dlp/yt-dlp/pull/10611 - thx bashonly

---------

Co-authored-by: dirkf <fieldhouse@gmx.net>
1 year ago
dirkf e1b3fa242c [Youtube] Find `n` function name in player `3400486c`
Fixes #32877
1 year ago
dirkf 451046d62a [Youtube] Make n-sig throttling diagnostic up-to-date 1 year ago

@ -116,29 +116,29 @@ jobs:
strategy: strategy:
fail-fast: true fail-fast: true
matrix: matrix:
os: [ubuntu-20.04] os: [ubuntu-22.04]
python-version: ${{ fromJSON(needs.select.outputs.cpython-versions) }} python-version: ${{ fromJSON(needs.select.outputs.cpython-versions) }}
python-impl: [cpython] python-impl: [cpython]
ytdl-test-set: ${{ fromJSON(needs.select.outputs.test-set) }} ytdl-test-set: ${{ fromJSON(needs.select.outputs.test-set) }}
run-tests-ext: [sh] run-tests-ext: [sh]
include: include:
- os: windows-2019 - os: windows-2022
python-version: 3.4 python-version: 3.4
python-impl: cpython python-impl: cpython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'core') && 'core' || 'nocore' }} ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'core') && 'core' || 'nocore' }}
run-tests-ext: bat run-tests-ext: bat
- os: windows-2019 - os: windows-2022
python-version: 3.4 python-version: 3.4
python-impl: cpython python-impl: cpython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'download') && 'download' || 'nodownload' }} ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'download') && 'download' || 'nodownload' }}
run-tests-ext: bat run-tests-ext: bat
# jython # jython
- os: ubuntu-20.04 - os: ubuntu-22.04
python-version: 2.7 python-version: 2.7
python-impl: jython python-impl: jython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'core') && 'core' || 'nocore' }} ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'core') && 'core' || 'nocore' }}
run-tests-ext: sh run-tests-ext: sh
- os: ubuntu-20.04 - os: ubuntu-22.04
python-version: 2.7 python-version: 2.7
python-impl: jython python-impl: jython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'download') && 'download' || 'nodownload' }} ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'download') && 'download' || 'nodownload' }}
@ -160,7 +160,7 @@ jobs:
# NB may run apt-get install in Linux # NB may run apt-get install in Linux
uses: ytdl-org/setup-python@v1 uses: ytdl-org/setup-python@v1
env: env:
# Temporary workaround for Python 3.5 failures - May 2024 # Temporary (?) workaround for Python 3.5 failures - May 2024
PIP_TRUSTED_HOST: "pypi.python.org pypi.org files.pythonhosted.org" PIP_TRUSTED_HOST: "pypi.python.org pypi.org files.pythonhosted.org"
with: with:
python-version: ${{ matrix.python-version }} python-version: ${{ matrix.python-version }}
@ -240,7 +240,10 @@ jobs:
# install 2.7 # install 2.7
shell: bash shell: bash
run: | run: |
sudo apt-get install -y python2 python-is-python2 # Ubuntu 22.04 no longer has python-is-python2: fetch it
curl -L "http://launchpadlibrarian.net/474693132/python-is-python2_2.7.17-4_all.deb" -o python-is-python2.deb
sudo apt-get install -y python2
sudo dpkg --force-breaks -i python-is-python2.deb
echo "PYTHONHOME=/usr" >> "$GITHUB_ENV" echo "PYTHONHOME=/usr" >> "$GITHUB_ENV"
#-------- Python 2.6 -- #-------- Python 2.6 --
- name: Set up Python 2.6 environment - name: Set up Python 2.6 environment
@ -362,7 +365,7 @@ jobs:
python -m ensurepip || python -m pip --version || { \ python -m ensurepip || python -m pip --version || { \
get_pip="${{ contains(needs.select.outputs.own-pip-versions, matrix.python-version) && format('{0}/', matrix.python-version) || '' }}"; \ get_pip="${{ contains(needs.select.outputs.own-pip-versions, matrix.python-version) && format('{0}/', matrix.python-version) || '' }}"; \
curl -L -O "https://bootstrap.pypa.io/pip/${get_pip}get-pip.py"; \ curl -L -O "https://bootstrap.pypa.io/pip/${get_pip}get-pip.py"; \
python get-pip.py; } python get-pip.py --no-setuptools --no-wheel; }
- name: Set up Python 2.6 pip - name: Set up Python 2.6 pip
if: ${{ matrix.python-version == '2.6' }} if: ${{ matrix.python-version == '2.6' }}
shell: bash shell: bash

@ -85,10 +85,10 @@ class FakeYDL(YoutubeDL):
# Silence an expected warning matching a regex # Silence an expected warning matching a regex
old_report_warning = self.report_warning old_report_warning = self.report_warning
def report_warning(self, message): def report_warning(self, message, *args, **kwargs):
if re.match(regex, message): if re.match(regex, message):
return return
old_report_warning(message) old_report_warning(message, *args, **kwargs)
self.report_warning = types.MethodType(report_warning, self) self.report_warning = types.MethodType(report_warning, self)
@ -265,11 +265,11 @@ def assertRegexpMatches(self, text, regexp, msg=None):
def expect_warnings(ydl, warnings_re): def expect_warnings(ydl, warnings_re):
real_warning = ydl.report_warning real_warning = ydl.report_warning
def _report_warning(w): def _report_warning(self, w, *args, **kwargs):
if not any(re.search(w_re, w) for w_re in warnings_re): if not any(re.search(w_re, w) for w_re in warnings_re):
real_warning(w) real_warning(w)
ydl.report_warning = _report_warning ydl.report_warning = types.MethodType(_report_warning, ydl)
def http_server_port(httpd): def http_server_port(httpd):

@ -63,9 +63,21 @@ class TestCache(unittest.TestCase):
obj = {'x': 1, 'y': ['ä', '\\a', True]} obj = {'x': 1, 'y': ['ä', '\\a', True]}
c.store('test_cache', 'k.', obj) c.store('test_cache', 'k.', obj)
self.assertEqual(c.load('test_cache', 'k.', min_ver='1970.01.01'), obj) self.assertEqual(c.load('test_cache', 'k.', min_ver='1970.01.01'), obj)
new_version = '.'.join(('%d' % ((v + 1) if i == 0 else v, )) for i, v in enumerate(version_tuple(__version__))) new_version = '.'.join(('%0.2d' % ((v + 1) if i == 0 else v, )) for i, v in enumerate(version_tuple(__version__)))
self.assertIs(c.load('test_cache', 'k.', min_ver=new_version), None) self.assertIs(c.load('test_cache', 'k.', min_ver=new_version), None)
def test_cache_clear(self):
ydl = FakeYDL({
'cachedir': self.test_dir,
})
c = Cache(ydl)
c.store('test_cache', 'k.', 'kay')
c.store('test_cache', 'l.', 'ell')
self.assertEqual(c.load('test_cache', 'k.'), 'kay')
c.clear('test_cache', 'k.')
self.assertEqual(c.load('test_cache', 'k.'), None)
self.assertEqual(c.load('test_cache', 'l.'), 'ell')
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

@ -1,4 +1,5 @@
#!/usr/bin/env python #!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
@ -6,12 +7,14 @@ from __future__ import unicode_literals
import os import os
import sys import sys
import unittest import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import math import math
import re import re
import time
from youtube_dl.compat import compat_str from youtube_dl.compat import compat_str as str
from youtube_dl.jsinterp import JS_Undefined, JSInterpreter from youtube_dl.jsinterp import JS_Undefined, JSInterpreter
NaN = object() NaN = object()
@ -19,7 +22,7 @@ NaN = object()
class TestJSInterpreter(unittest.TestCase): class TestJSInterpreter(unittest.TestCase):
def _test(self, jsi_or_code, expected, func='f', args=()): def _test(self, jsi_or_code, expected, func='f', args=()):
if isinstance(jsi_or_code, compat_str): if isinstance(jsi_or_code, str):
jsi_or_code = JSInterpreter(jsi_or_code) jsi_or_code = JSInterpreter(jsi_or_code)
got = jsi_or_code.call_function(func, *args) got = jsi_or_code.call_function(func, *args)
if expected is NaN: if expected is NaN:
@ -40,16 +43,27 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){return 42 + 7;}', 49) self._test('function f(){return 42 + 7;}', 49)
self._test('function f(){return 42 + undefined;}', NaN) self._test('function f(){return 42 + undefined;}', NaN)
self._test('function f(){return 42 + null;}', 42) self._test('function f(){return 42 + null;}', 42)
self._test('function f(){return 1 + "";}', '1')
self._test('function f(){return 42 + "7";}', '427')
self._test('function f(){return false + true;}', 1)
self._test('function f(){return "false" + true;}', 'falsetrue')
self._test('function f(){return '
'1 + "2" + [3,4] + {k: 56} + null + undefined + Infinity;}',
'123,4[object Object]nullundefinedInfinity')
def test_sub(self): def test_sub(self):
self._test('function f(){return 42 - 7;}', 35) self._test('function f(){return 42 - 7;}', 35)
self._test('function f(){return 42 - undefined;}', NaN) self._test('function f(){return 42 - undefined;}', NaN)
self._test('function f(){return 42 - null;}', 42) self._test('function f(){return 42 - null;}', 42)
self._test('function f(){return 42 - "7";}', 35)
self._test('function f(){return 42 - "spam";}', NaN)
def test_mul(self): def test_mul(self):
self._test('function f(){return 42 * 7;}', 294) self._test('function f(){return 42 * 7;}', 294)
self._test('function f(){return 42 * undefined;}', NaN) self._test('function f(){return 42 * undefined;}', NaN)
self._test('function f(){return 42 * null;}', 0) self._test('function f(){return 42 * null;}', 0)
self._test('function f(){return 42 * "7";}', 294)
self._test('function f(){return 42 * "eggs";}', NaN)
def test_div(self): def test_div(self):
jsi = JSInterpreter('function f(a, b){return a / b;}') jsi = JSInterpreter('function f(a, b){return a / b;}')
@ -57,17 +71,26 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, NaN, args=(JS_Undefined, 1)) self._test(jsi, NaN, args=(JS_Undefined, 1))
self._test(jsi, float('inf'), args=(2, 0)) self._test(jsi, float('inf'), args=(2, 0))
self._test(jsi, 0, args=(0, 3)) self._test(jsi, 0, args=(0, 3))
self._test(jsi, 6, args=(42, 7))
self._test(jsi, 0, args=(42, float('inf')))
self._test(jsi, 6, args=("42", 7))
self._test(jsi, NaN, args=("spam", 7))
def test_mod(self): def test_mod(self):
self._test('function f(){return 42 % 7;}', 0) self._test('function f(){return 42 % 7;}', 0)
self._test('function f(){return 42 % 0;}', NaN) self._test('function f(){return 42 % 0;}', NaN)
self._test('function f(){return 42 % undefined;}', NaN) self._test('function f(){return 42 % undefined;}', NaN)
self._test('function f(){return 42 % "7";}', 0)
self._test('function f(){return 42 % "beans";}', NaN)
def test_exp(self): def test_exp(self):
self._test('function f(){return 42 ** 2;}', 1764) self._test('function f(){return 42 ** 2;}', 1764)
self._test('function f(){return 42 ** undefined;}', NaN) self._test('function f(){return 42 ** undefined;}', NaN)
self._test('function f(){return 42 ** null;}', 1) self._test('function f(){return 42 ** null;}', 1)
self._test('function f(){return undefined ** 0;}', 1)
self._test('function f(){return undefined ** 42;}', NaN) self._test('function f(){return undefined ** 42;}', NaN)
self._test('function f(){return 42 ** "2";}', 1764)
self._test('function f(){return 42 ** "spam";}', NaN)
def test_calc(self): def test_calc(self):
self._test('function f(a){return 2*a+1;}', 7, args=[3]) self._test('function f(a){return 2*a+1;}', 7, args=[3])
@ -89,13 +112,60 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){return 19 & 21;}', 17) self._test('function f(){return 19 & 21;}', 17)
self._test('function f(){return 11 >> 2;}', 2) self._test('function f(){return 11 >> 2;}', 2)
self._test('function f(){return []? 2+3: 4;}', 5) self._test('function f(){return []? 2+3: 4;}', 5)
# equality
self._test('function f(){return 1 == 1}', True)
self._test('function f(){return 1 == 1.0}', True)
self._test('function f(){return 1 == "1"}', True)
self._test('function f(){return 1 == 2}', False) self._test('function f(){return 1 == 2}', False)
self._test('function f(){return 1 != "1"}', False)
self._test('function f(){return 1 != 2}', True)
self._test('function f(){var x = {a: 1}; var y = x; return x == y}', True)
self._test('function f(){var x = {a: 1}; return x == {a: 1}}', False)
self._test('function f(){return NaN == NaN}', False)
self._test('function f(){return null == undefined}', True)
self._test('function f(){return "spam, eggs" == "spam, eggs"}', True)
# strict equality
self._test('function f(){return 1 === 1}', True)
self._test('function f(){return 1 === 1.0}', True)
self._test('function f(){return 1 === "1"}', False)
self._test('function f(){return 1 === 2}', False)
self._test('function f(){var x = {a: 1}; var y = x; return x === y}', True)
self._test('function f(){var x = {a: 1}; return x === {a: 1}}', False)
self._test('function f(){return NaN === NaN}', False)
self._test('function f(){return null === undefined}', False)
self._test('function f(){return null === null}', True)
self._test('function f(){return undefined === undefined}', True)
self._test('function f(){return "uninterned" === "uninterned"}', True)
self._test('function f(){return 1 === 1}', True)
self._test('function f(){return 1 === "1"}', False)
self._test('function f(){return 1 !== 1}', False)
self._test('function f(){return 1 !== "1"}', True)
# expressions
self._test('function f(){return 0 && 1 || 2;}', 2) self._test('function f(){return 0 && 1 || 2;}', 2)
self._test('function f(){return 0 ?? 42;}', 0) self._test('function f(){return 0 ?? 42;}', 0)
self._test('function f(){return "life, the universe and everything" < 42;}', False) self._test('function f(){return "life, the universe and everything" < 42;}', False)
# https://github.com/ytdl-org/youtube-dl/issues/32815 # https://github.com/ytdl-org/youtube-dl/issues/32815
self._test('function f(){return 0 - 7 * - 6;}', 42) self._test('function f(){return 0 - 7 * - 6;}', 42)
def test_bitwise_operators_typecast(self):
# madness
self._test('function f(){return null << 5}', 0)
self._test('function f(){return undefined >> 5}', 0)
self._test('function f(){return 42 << NaN}', 42)
self._test('function f(){return 42 << Infinity}', 42)
self._test('function f(){return 0.0 << null}', 0)
self._test('function f(){return NaN << 42}', 0)
self._test('function f(){return "21.9" << 1}', 42)
self._test('function f(){return true << "5";}', 32)
self._test('function f(){return true << true;}', 2)
self._test('function f(){return "19" & "21.9";}', 17)
self._test('function f(){return "19" & false;}', 0)
self._test('function f(){return "11.0" >> "2.1";}', 2)
self._test('function f(){return 5 ^ 9;}', 12)
self._test('function f(){return 0.0 << NaN}', 0)
self._test('function f(){return null << undefined}', 0)
self._test('function f(){return 21 << 4294967297}', 42)
def test_array_access(self): def test_array_access(self):
self._test('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}', [5, 2, 7]) self._test('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}', [5, 2, 7])
@ -110,8 +180,8 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31) self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31)
self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51) self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51)
self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11) self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11)
self._test('function f(){var x = 2; var y = ["a", "b"]; y[x%y["length"]]="z"; return y}', ['z', 'b'])
@unittest.skip('Not yet fully implemented')
def test_comments(self): def test_comments(self):
self._test(''' self._test('''
function f() { function f() {
@ -130,6 +200,15 @@ class TestJSInterpreter(unittest.TestCase):
} }
''', 3) ''', 3)
self._test('''
function f() {
var x = ( /* 1 + */ 2 +
/* 30 * 40 */
50);
return x;
}
''', 52)
def test_precedence(self): def test_precedence(self):
self._test(''' self._test('''
function f() { function f() {
@ -151,6 +230,34 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, 86000, args=['12/31/1969 18:01:26 MDT']) self._test(jsi, 86000, args=['12/31/1969 18:01:26 MDT'])
# epoch 0 # epoch 0
self._test(jsi, 0, args=['1 January 1970 00:00:00 UTC']) self._test(jsi, 0, args=['1 January 1970 00:00:00 UTC'])
# undefined
self._test(jsi, NaN, args=[JS_Undefined])
# y,m,d, ... - may fail with older dates lacking DST data
jsi = JSInterpreter(
'function f() { return new Date(%s); }'
% ('2024, 5, 29, 2, 52, 12, 42',))
self._test(jsi, (
1719625932042 # UK value
+ (
+ 3600 # back to GMT
+ (time.altzone if time.daylight # host's DST
else time.timezone)
) * 1000))
# no arg
self.assertAlmostEqual(JSInterpreter(
'function f() { return new Date() - 0; }').call_function('f'),
time.time() * 1000, delta=100)
# Date.now()
self.assertAlmostEqual(JSInterpreter(
'function f() { return Date.now(); }').call_function('f'),
time.time() * 1000, delta=100)
# Date.parse()
jsi = JSInterpreter('function f(dt) { return Date.parse(dt); }')
self._test(jsi, 0, args=['1 January 1970 00:00:00 UTC'])
# Date.UTC()
jsi = JSInterpreter('function f() { return Date.UTC(%s); }'
% ('1970, 0, 1, 0, 0, 0, 0',))
self._test(jsi, 0)
def test_call(self): def test_call(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
@ -265,8 +372,28 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f() { a=5; return (a -= 1, a+=3, a); }', 7) self._test('function f() { a=5; return (a -= 1, a+=3, a); }', 7)
self._test('function f() { return (l=[0,1,2,3], function(a, b){return a+b})((l[1], l[2]), l[3]) }', 5) self._test('function f() { return (l=[0,1,2,3], function(a, b){return a+b})((l[1], l[2]), l[3]) }', 5)
def test_not(self):
self._test('function f() { return ! undefined; }', True)
self._test('function f() { return !0; }', True)
self._test('function f() { return !!0; }', False)
self._test('function f() { return ![]; }', False)
self._test('function f() { return !0 !== false; }', True)
def test_void(self): def test_void(self):
self._test('function f() { return void 42; }', None) self._test('function f() { return void 42; }', JS_Undefined)
def test_typeof(self):
self._test('function f() { return typeof undefined; }', 'undefined')
self._test('function f() { return typeof NaN; }', 'number')
self._test('function f() { return typeof Infinity; }', 'number')
self._test('function f() { return typeof true; }', 'boolean')
self._test('function f() { return typeof null; }', 'object')
self._test('function f() { return typeof "a string"; }', 'string')
self._test('function f() { return typeof 42; }', 'number')
self._test('function f() { return typeof 42.42; }', 'number')
self._test('function f() { var g = function(){}; return typeof g; }', 'function')
self._test('function f() { return typeof {key: "value"}; }', 'object')
# not yet implemented: Symbol, BigInt
def test_return_function(self): def test_return_function(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
@ -283,7 +410,7 @@ class TestJSInterpreter(unittest.TestCase):
def test_undefined(self): def test_undefined(self):
self._test('function f() { return undefined === undefined; }', True) self._test('function f() { return undefined === undefined; }', True)
self._test('function f() { return undefined; }', JS_Undefined) self._test('function f() { return undefined; }', JS_Undefined)
self._test('function f() {return undefined ?? 42; }', 42) self._test('function f() { return undefined ?? 42; }', 42)
self._test('function f() { let v; return v; }', JS_Undefined) self._test('function f() { let v; return v; }', JS_Undefined)
self._test('function f() { let v; return v**0; }', 1) self._test('function f() { let v; return v**0; }', 1)
self._test('function f() { let v; return [v>42, v<=42, v&&42, 42&&v]; }', self._test('function f() { let v; return [v>42, v<=42, v&&42, 42&&v]; }',
@ -324,8 +451,19 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f() { let a; return a?.qq; }', JS_Undefined) self._test('function f() { let a; return a?.qq; }', JS_Undefined)
self._test('function f() { let a = {m1: 42, m2: 0 }; return a?.qq; }', JS_Undefined) self._test('function f() { let a = {m1: 42, m2: 0 }; return a?.qq; }', JS_Undefined)
def test_indexing(self):
self._test('function f() { return [1, 2, 3, 4][3]}', 4)
self._test('function f() { return [1, [2, [3, [4]]]][1][1][1][0]}', 4)
self._test('function f() { var o = {1: 2, 3: 4}; return o[3]}', 4)
self._test('function f() { var o = {1: 2, 3: 4}; return o["3"]}', 4)
self._test('function f() { return [1, [2, {3: [4]}]][1][1]["3"][0]}', 4)
self._test('function f() { return [1, 2, 3, 4].length}', 4)
self._test('function f() { var o = {1: 2, 3: 4}; return o.length}', JS_Undefined)
self._test('function f() { var o = {1: 2, 3: 4}; o["length"] = 42; return o.length}', 42)
def test_regex(self): def test_regex(self):
self._test('function f() { let a=/,,[/,913,/](,)}/; }', None) self._test('function f() { let a=/,,[/,913,/](,)}/; }', None)
self._test('function f() { let a=/,,[/,913,/](,)}/; return a.source; }', ',,[/,913,/](,)}')
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; "".replace(a, ""); return a; } function x() { let a=/,,[/,913,/](,)}/; "".replace(a, ""); return a; }
@ -373,13 +511,6 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){return -524999584 << 5}', 379882496) self._test('function f(){return -524999584 << 5}', 379882496)
self._test('function f(){return 1236566549 << 5}', 915423904) self._test('function f(){return 1236566549 << 5}', 915423904)
def test_bitwise_operators_typecast(self):
# madness
self._test('function f(){return null << 5}', 0)
self._test('function f(){return undefined >> 5}', 0)
self._test('function f(){return 42 << NaN}', 42)
self._test('function f(){return 42 << Infinity}', 42)
def test_negative(self): def test_negative(self):
self._test('function f(){return 2 * -2.0 ;}', -4) self._test('function f(){return 2 * -2.0 ;}', -4)
self._test('function f(){return 2 - - -2 ;}', 0) self._test('function f(){return 2 - - -2 ;}', 0)
@ -411,10 +542,19 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, 't-e-s-t', args=[test_input, '-']) self._test(jsi, 't-e-s-t', args=[test_input, '-'])
self._test(jsi, '', args=[[], '-']) self._test(jsi, '', args=[[], '-'])
self._test('function f(){return '
'[1, 1.0, "abc", {a: 1}, null, undefined, Infinity, NaN].join()}',
'1,1,abc,[object Object],,,Infinity,NaN')
self._test('function f(){return '
'[1, 1.0, "abc", {a: 1}, null, undefined, Infinity, NaN].join("~")}',
'1~1~abc~[object Object]~~~Infinity~NaN')
def test_split(self): def test_split(self):
test_result = list('test') test_result = list('test')
tests = [ tests = [
'function f(a, b){return a.split(b)}', 'function f(a, b){return a.split(b)}',
'function f(a, b){return a["split"](b)}',
'function f(a, b){let x = ["split"]; return a[x[0]](b)}',
'function f(a, b){return String.prototype.split.call(a, b)}', 'function f(a, b){return String.prototype.split.call(a, b)}',
'function f(a, b){return String.prototype.split.apply(a, [b])}', 'function f(a, b){return String.prototype.split.apply(a, [b])}',
] ]
@ -424,6 +564,93 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, test_result, args=['t-e-s-t', '-']) self._test(jsi, test_result, args=['t-e-s-t', '-'])
self._test(jsi, [''], args=['', '-']) self._test(jsi, [''], args=['', '-'])
self._test(jsi, [], args=['', '']) self._test(jsi, [], args=['', ''])
# RegExp split
self._test('function f(){return "test".split(/(?:)/)}',
['t', 'e', 's', 't'])
self._test('function f(){return "t-e-s-t".split(/[es-]+/)}',
['t', 't'])
# from MDN: surrogate pairs aren't handled: case 1 fails
# self._test('function f(){return "😄😄".split(/(?:)/)}',
# ['\ud83d', '\ude04', '\ud83d', '\ude04'])
# case 2 beats Py3.2: it gets the case 1 result
if sys.version_info >= (2, 6) and not ((3, 0) <= sys.version_info < (3, 3)):
self._test('function f(){return "😄😄".split(/(?:)/u)}',
['😄', '😄'])
def test_slice(self):
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice()}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0)}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(5)}', [5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(99)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-2)}', [7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-99)}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0, 0)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(1, 0)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0, 1)}', [0])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(3, 6)}', [3, 4, 5])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(1, -1)}', [1, 2, 3, 4, 5, 6, 7])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-1, 1)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-3, -1)}', [6, 7])
self._test('function f(){return "012345678".slice()}', '012345678')
self._test('function f(){return "012345678".slice(0)}', '012345678')
self._test('function f(){return "012345678".slice(5)}', '5678')
self._test('function f(){return "012345678".slice(99)}', '')
self._test('function f(){return "012345678".slice(-2)}', '78')
self._test('function f(){return "012345678".slice(-99)}', '012345678')
self._test('function f(){return "012345678".slice(0, 0)}', '')
self._test('function f(){return "012345678".slice(1, 0)}', '')
self._test('function f(){return "012345678".slice(0, 1)}', '0')
self._test('function f(){return "012345678".slice(3, 6)}', '345')
self._test('function f(){return "012345678".slice(1, -1)}', '1234567')
self._test('function f(){return "012345678".slice(-1, 1)}', '')
self._test('function f(){return "012345678".slice(-3, -1)}', '67')
def test_splice(self):
self._test('function f(){var T = ["0", "1", "2"]; T["splice"](2, 1, "0")[0]; return T }', ['0', '1', '0'])
def test_pop(self):
# pop
self._test('function f(){var a = [0, 1, 2, 3, 4, 5, 6, 7, 8]; return [a.pop(), a]}',
[8, [0, 1, 2, 3, 4, 5, 6, 7]])
self._test('function f(){return [].pop()}', JS_Undefined)
# push
self._test('function f(){var a = [0, 1, 2]; return [a.push(3, 4), a]}',
[5, [0, 1, 2, 3, 4]])
self._test('function f(){var a = [0, 1, 2]; return [a.push(), a]}',
[3, [0, 1, 2]])
def test_shift(self):
# shift
self._test('function f(){var a = [0, 1, 2, 3, 4, 5, 6, 7, 8]; return [a.shift(), a]}',
[0, [1, 2, 3, 4, 5, 6, 7, 8]])
self._test('function f(){return [].shift()}', JS_Undefined)
# unshift
self._test('function f(){var a = [0, 1, 2]; return [a.unshift(3, 4), a]}',
[5, [3, 4, 0, 1, 2]])
self._test('function f(){var a = [0, 1, 2]; return [a.unshift(), a]}',
[3, [0, 1, 2]])
def test_forEach(self):
self._test('function f(){var ret = []; var l = [4, 2]; '
'var log = function(e,i,a){ret.push([e,i,a]);}; '
'l.forEach(log); '
'return [ret.length, ret[0][0], ret[1][1], ret[0][2]]}',
[2, 4, 1, [4, 2]])
self._test('function f(){var ret = []; var l = [4, 2]; '
'var log = function(e,i,a){this.push([e,i,a]);}; '
'l.forEach(log, ret); '
'return [ret.length, ret[0][0], ret[1][1], ret[0][2]]}',
[2, 4, 1, [4, 2]])
def test_extract_function(self):
jsi = JSInterpreter('function a(b) { return b + 1; }')
func = jsi.extract_function('a')
self.assertEqual(func([2]), 3)
def test_extract_function_with_global_stack(self):
jsi = JSInterpreter('function c(d) { return d + e + f + g; }')
func = jsi.extract_function('c', {'e': 10}, {'f': 100, 'g': 1000})
self.assertEqual(func([1]), 1111)
if __name__ == '__main__': if __name__ == '__main__':

@ -9,21 +9,32 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import itertools
import re import re
from youtube_dl.traversal import ( from youtube_dl.traversal import (
dict_get, dict_get,
get_first, get_first,
require,
subs_list_to_dict,
T, T,
traverse_obj, traverse_obj,
unpack,
value,
) )
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_chr as chr,
compat_etree_fromstring, compat_etree_fromstring,
compat_http_cookies, compat_http_cookies,
compat_map as map,
compat_str, compat_str,
compat_zip as zip,
) )
from youtube_dl.utils import ( from youtube_dl.utils import (
determine_ext,
ExtractorError,
int_or_none, int_or_none,
join_nonempty,
str_or_none, str_or_none,
) )
@ -446,42 +457,164 @@ class TestTraversal(_TestCase):
msg='`any` should allow further branching') msg='`any` should allow further branching')
def test_traversal_morsel(self): def test_traversal_morsel(self):
values = {
'expires': 'a',
'path': 'b',
'comment': 'c',
'domain': 'd',
'max-age': 'e',
'secure': 'f',
'httponly': 'g',
'version': 'h',
'samesite': 'i',
}
# SameSite added in Py3.8, breaks .update for 3.5-3.7
if sys.version_info < (3, 8):
del values['samesite']
morsel = compat_http_cookies.Morsel() morsel = compat_http_cookies.Morsel()
# SameSite added in Py3.8, breaks .update for 3.5-3.7
# Similarly Partitioned, Py3.14, thx Grub4k
values = dict(zip(morsel, map(chr, itertools.count(ord('a')))))
morsel.set(str('item_key'), 'item_value', 'coded_value') morsel.set(str('item_key'), 'item_value', 'coded_value')
morsel.update(values) morsel.update(values)
values['key'] = str('item_key') values.update({
values['value'] = 'item_value' 'key': str('item_key'),
'value': 'item_value',
}),
values = dict((str(k), v) for k, v in values.items()) values = dict((str(k), v) for k, v in values.items())
# make test pass even without ordered dict
value_set = set(values.values())
for key, value in values.items(): for key, val in values.items():
self.assertEqual(traverse_obj(morsel, key), value, self.assertEqual(traverse_obj(morsel, key), val,
msg='Morsel should provide access to all values') msg='Morsel should provide access to all values')
self.assertEqual(set(traverse_obj(morsel, Ellipsis)), value_set, values = list(values.values())
msg='`...` should yield all values') self.assertMaybeCountEqual(traverse_obj(morsel, Ellipsis), values,
self.assertEqual(set(traverse_obj(morsel, lambda k, v: True)), value_set, msg='`...` should yield all values')
msg='function key should yield all values') self.assertMaybeCountEqual(traverse_obj(morsel, lambda k, v: True), values,
msg='function key should yield all values')
self.assertIs(traverse_obj(morsel, [(None,), any]), morsel, self.assertIs(traverse_obj(morsel, [(None,), any]), morsel,
msg='Morsel should not be implicitly changed to dict on usage') msg='Morsel should not be implicitly changed to dict on usage')
def test_get_first(self): def test_traversal_filter(self):
self.assertEqual(get_first([{'a': None}, {'a': 'spam'}], 'a'), 'spam') data = [None, False, True, 0, 1, 0.0, 1.1, '', 'str', {}, {0: 0}, [], [1]]
self.assertEqual(
traverse_obj(data, (Ellipsis, filter)),
[True, 1, 1.1, 'str', {0: 0}, [1]],
'`filter` should filter falsy values')
class TestTraversalHelpers(_TestCase):
def test_traversal_require(self):
with self.assertRaises(ExtractorError, msg='Missing `value` should raise'):
traverse_obj(_TEST_DATA, ('None', T(require('value'))))
self.assertEqual(
traverse_obj(_TEST_DATA, ('str', T(require('value')))), 'str',
'`require` should pass through non-`None` values')
def test_subs_list_to_dict(self):
self.assertEqual(traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.vtt'},
{'name': 'en', 'url': 'https://example.com/subs/en1.ass'},
{'name': 'en', 'url': 'https://example.com/subs/en2.ass'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
}, all, T(subs_list_to_dict)]), {
'de': [{'url': 'https://example.com/subs/de.vtt'}],
'en': [
{'url': 'https://example.com/subs/en1.ass'},
{'url': 'https://example.com/subs/en2.ass'},
],
}, 'function should build subtitle dict from list of subtitles')
self.assertEqual(traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.ass'},
{'name': 'de'},
{'name': 'en', 'content': 'content'},
{'url': 'https://example.com/subs/en'},
], [Ellipsis, {
'id': 'name',
'data': 'content',
'url': 'url',
}, all, T(subs_list_to_dict(lang=None))]), {
'de': [{'url': 'https://example.com/subs/de.ass'}],
'en': [{'data': 'content'}],
}, 'subs with mandatory items missing should be filtered')
self.assertEqual(traverse_obj([
{'url': 'https://example.com/subs/de.ass', 'name': 'de'},
{'url': 'https://example.com/subs/en', 'name': 'en'},
], [Ellipsis, {
'id': 'name',
'ext': ['url', T(determine_ext(default_ext=None))],
'url': 'url',
}, all, T(subs_list_to_dict(ext='ext'))]), {
'de': [{'url': 'https://example.com/subs/de.ass', 'ext': 'ass'}],
'en': [{'url': 'https://example.com/subs/en', 'ext': 'ext'}],
}, '`ext` should set default ext but leave existing value untouched')
self.assertEqual(traverse_obj([
{'name': 'en', 'url': 'https://example.com/subs/en2', 'prio': True},
{'name': 'en', 'url': 'https://example.com/subs/en1', 'prio': False},
], [Ellipsis, {
'id': 'name',
'quality': ['prio', T(int)],
'url': 'url',
}, all, T(subs_list_to_dict(ext='ext'))]), {'en': [
{'url': 'https://example.com/subs/en1', 'ext': 'ext'},
{'url': 'https://example.com/subs/en2', 'ext': 'ext'},
]}, '`quality` key should sort subtitle list accordingly')
self.assertEqual(traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.ass'},
{'name': 'de'},
{'name': 'en', 'content': 'content'},
{'url': 'https://example.com/subs/en'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
'data': 'content',
}, all, T(subs_list_to_dict(lang='en'))]), {
'de': [{'url': 'https://example.com/subs/de.ass'}],
'en': [
{'data': 'content'},
{'url': 'https://example.com/subs/en'},
],
}, 'optionally provided lang should be used if no id available')
self.assertEqual(traverse_obj([
{'name': 1, 'url': 'https://example.com/subs/de1'},
{'name': {}, 'url': 'https://example.com/subs/de2'},
{'name': 'de', 'ext': 1, 'url': 'https://example.com/subs/de3'},
{'name': 'de', 'ext': {}, 'url': 'https://example.com/subs/de4'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
'ext': 'ext',
}, all, T(subs_list_to_dict(lang=None))]), {
'de': [
{'url': 'https://example.com/subs/de3'},
{'url': 'https://example.com/subs/de4'},
],
}, 'non str types should be ignored for id and ext')
self.assertEqual(traverse_obj([
{'name': 1, 'url': 'https://example.com/subs/de1'},
{'name': {}, 'url': 'https://example.com/subs/de2'},
{'name': 'de', 'ext': 1, 'url': 'https://example.com/subs/de3'},
{'name': 'de', 'ext': {}, 'url': 'https://example.com/subs/de4'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
'ext': 'ext',
}, all, T(subs_list_to_dict(lang='de'))]), {
'de': [
{'url': 'https://example.com/subs/de1'},
{'url': 'https://example.com/subs/de2'},
{'url': 'https://example.com/subs/de3'},
{'url': 'https://example.com/subs/de4'},
],
}, 'non str types should be replaced by default id')
def test_unpack(self):
self.assertEqual(
unpack(lambda *x: ''.join(map(compat_str, x)))([1, 2, 3]), '123')
self.assertEqual(
unpack(join_nonempty)([1, 2, 3]), '1-2-3')
self.assertEqual(
unpack(join_nonempty, delim=' ')([1, 2, 3]), '1 2 3')
with self.assertRaises(TypeError):
unpack(join_nonempty)()
with self.assertRaises(TypeError):
unpack()
def test_value(self):
self.assertEqual(
traverse_obj(_TEST_DATA, ('str', T(value('other')))), 'other',
'`value` should substitute specified value')
class TestDictGet(_TestCase):
def test_dict_get(self): def test_dict_get(self):
FALSE_VALUES = { FALSE_VALUES = {
'none': None, 'none': None,
@ -504,6 +637,9 @@ class TestTraversal(_TestCase):
self.assertEqual(dict_get(d, ('b', 'c', key, )), None) self.assertEqual(dict_get(d, ('b', 'c', key, )), None)
self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value) self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value)
def test_get_first(self):
self.assertEqual(get_first([{'a': None}, {'a': 'spam'}], 'a'), 'spam')
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

@ -69,6 +69,7 @@ from youtube_dl.utils import (
parse_iso8601, parse_iso8601,
parse_resolution, parse_resolution,
parse_qs, parse_qs,
partial_application,
pkcs1pad, pkcs1pad,
prepend_extension, prepend_extension,
read_batch_urls, read_batch_urls,
@ -664,6 +665,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('3h 11m 53s'), 11513) self.assertEqual(parse_duration('3h 11m 53s'), 11513)
self.assertEqual(parse_duration('3 hours 11 minutes 53 seconds'), 11513) self.assertEqual(parse_duration('3 hours 11 minutes 53 seconds'), 11513)
self.assertEqual(parse_duration('3 hours 11 mins 53 secs'), 11513) self.assertEqual(parse_duration('3 hours 11 mins 53 secs'), 11513)
self.assertEqual(parse_duration('3 hours, 11 minutes, 53 seconds'), 11513)
self.assertEqual(parse_duration('3 hours, 11 mins, 53 secs'), 11513)
self.assertEqual(parse_duration('62m45s'), 3765) self.assertEqual(parse_duration('62m45s'), 3765)
self.assertEqual(parse_duration('6m59s'), 419) self.assertEqual(parse_duration('6m59s'), 419)
self.assertEqual(parse_duration('49s'), 49) self.assertEqual(parse_duration('49s'), 49)
@ -682,6 +685,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('PT1H0.040S'), 3600.04) self.assertEqual(parse_duration('PT1H0.040S'), 3600.04)
self.assertEqual(parse_duration('PT00H03M30SZ'), 210) self.assertEqual(parse_duration('PT00H03M30SZ'), 210)
self.assertEqual(parse_duration('P0Y0M0DT0H4M20.880S'), 260.88) self.assertEqual(parse_duration('P0Y0M0DT0H4M20.880S'), 260.88)
self.assertEqual(parse_duration('01:02:03:050'), 3723.05)
self.assertEqual(parse_duration('103:050'), 103.05)
self.assertEqual(parse_duration('1HR 3MIN'), 3780)
self.assertEqual(parse_duration('2hrs 3mins'), 7380)
def test_fix_xml_ampersands(self): def test_fix_xml_ampersands(self):
self.assertEqual( self.assertEqual(
@ -895,6 +902,30 @@ class TestUtil(unittest.TestCase):
'vcodec': 'av01.0.05M.08', 'vcodec': 'av01.0.05M.08',
'acodec': 'none', 'acodec': 'none',
}) })
self.assertEqual(parse_codecs('vp9.2'), {
'vcodec': 'vp9.2',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('vp09.02.50.10.01.09.18.09.00'), {
'vcodec': 'vp09.02.50.10.01.09.18.09.00',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('av01.0.12M.10.0.110.09.16.09.0'), {
'vcodec': 'av01.0.12M.10.0.110.09.16.09.0',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('dvhe'), {
'vcodec': 'dvhe',
'acodec': 'none',
'dynamic_range': 'DV',
})
self.assertEqual(parse_codecs('fLaC'), {
'vcodec': 'none',
'acodec': 'flac',
})
self.assertEqual(parse_codecs('theora, vorbis'), { self.assertEqual(parse_codecs('theora, vorbis'), {
'vcodec': 'theora', 'vcodec': 'theora',
'acodec': 'vorbis', 'acodec': 'vorbis',
@ -1723,6 +1754,21 @@ Line 1
'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd',
from_dict={'a': 'c', 'c': [], 'b': 'd', 'd': None}), 'c-d') from_dict={'a': 'c', 'c': [], 'b': 'd', 'd': None}), 'c-d')
def test_partial_application(self):
test_fn = partial_application(lambda x, kwarg=None: '{0}, kwarg={1!r}'.format(x, kwarg))
self.assertTrue(
callable(test_fn(kwarg=10)),
'missing positional parameter should apply partially')
self.assertEqual(
test_fn(10, kwarg=42), '10, kwarg=42',
'positionally passed argument should call function')
self.assertEqual(
test_fn(x=10), '10, kwarg=None',
'keyword passed positional should call function')
self.assertEqual(
test_fn(kwarg=42)(10), '10, kwarg=42',
'call after partial application should call the function')
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

@ -1,4 +1,5 @@
#!/usr/bin/env python #!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
@ -12,6 +13,7 @@ import re
import string import string
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_contextlib_suppress,
compat_open as open, compat_open as open,
compat_str, compat_str,
compat_urlretrieve, compat_urlretrieve,
@ -50,23 +52,93 @@ _SIG_TESTS = [
( (
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflBb0OQx.js', 'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflBb0OQx.js',
84, 84,
'123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ0STUVWXYZ!"#$%&\'()*+,@./:;<=>' '123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ0STUVWXYZ!"#$%&\'()*+,@./:;<=>',
), ),
( (
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl9FYC6l.js', 'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl9FYC6l.js',
83, 83,
'123456789abcdefghijklmnopqr0tuvwxyzABCDETGHIJKLMNOPQRS>UVWXYZ!"#$%&\'()*+,-./:;<=F' '123456789abcdefghijklmnopqr0tuvwxyzABCDETGHIJKLMNOPQRS>UVWXYZ!"#$%&\'()*+,-./:;<=F',
), ),
( (
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflCGk6yw/html5player.js', 'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflCGk6yw/html5player.js',
'4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288', '4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
'82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B' '82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B',
), ),
( (
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
'312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12', '312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
'112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3', '112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
) ),
(
'https://www.youtube.com/s/player/6ed0d907/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL2QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'MyOSJXtKI3m-uME_jv7-pT12gOFC02RFkGoqWpzE0Cs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/20830619/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
] ]
_NSIG_TESTS = [ _NSIG_TESTS = [
@ -136,12 +208,16 @@ _NSIG_TESTS = [
), ),
( (
'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js',
'-Txvy6bT5R6LqgnQNx', 'dcklJCnRUHbgSg', 'M92UUMHa8PdvPd3wyM', '3hPqLJsiNZx7yA',
), ),
( (
'https://www.youtube.com/s/player/5a3b6271/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/5a3b6271/player_ias.vflset/en_US/base.js',
'B2j7f_UPT4rfje85Lu_e', 'm5DmNymaGQ5RdQ', 'B2j7f_UPT4rfje85Lu_e', 'm5DmNymaGQ5RdQ',
), ),
(
'https://www.youtube.com/s/player/7a062b77/player_ias.vflset/en_US/base.js',
'NRcE3y3mVtm_cV-W', 'VbsCYUATvqlt5w',
),
( (
'https://www.youtube.com/s/player/dac945fd/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/dac945fd/player_ias.vflset/en_US/base.js',
'o8BkRxXhuYsBCWi6RplPdP', '3Lx32v_hmzTm6A', 'o8BkRxXhuYsBCWi6RplPdP', '3Lx32v_hmzTm6A',
@ -152,7 +228,11 @@ _NSIG_TESTS = [
), ),
( (
'https://www.youtube.com/s/player/cfa9e7cb/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/cfa9e7cb/player_ias.vflset/en_US/base.js',
'qO0NiMtYQ7TeJnfFG2', 'k9cuJDHNS5O7kQ', 'aCi3iElgd2kq0bxVbQ', 'QX1y8jGb2IbZ0w',
),
(
'https://www.youtube.com/s/player/8c7583ff/player_ias.vflset/en_US/base.js',
'1wWCVpRR96eAmMI87L', 'KSkWAVv1ZQxC3A',
), ),
( (
'https://www.youtube.com/s/player/b7910ca8/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/b7910ca8/player_ias.vflset/en_US/base.js',
@ -166,6 +246,110 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js',
'b6HcntHGkvBLk_FRf', 'kNPW6A7FyP2l8A', 'b6HcntHGkvBLk_FRf', 'kNPW6A7FyP2l8A',
), ),
(
'https://www.youtube.com/s/player/3400486c/player_ias.vflset/en_US/base.js',
'lL46g3XifCKUZn1Xfw', 'z767lhet6V2Skl',
),
(
'https://www.youtube.com/s/player/5604538d/player_ias.vflset/en_US/base.js',
'7X-he4jjvMx7BCX', 'sViSydX8IHtdWA',
),
(
'https://www.youtube.com/s/player/20dfca59/player_ias.vflset/en_US/base.js',
'-fLCxedkAk4LUTK2', 'O8kfRq1y1eyHGw',
),
(
'https://www.youtube.com/s/player/b12cc44b/player_ias.vflset/en_US/base.js',
'keLa5R2U00sR9SQK', 'N1OGyujjEwMnLw',
),
(
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'gK15nzVyaXE9RsMP3z', 'ZFFWFLPWx9DEgQ',
),
(
'https://www.youtube.com/s/player/f8f53e1a/player_ias.vflset/en_US/base.js',
'VTQOUOv0mCIeJ7i8kZB', 'kcfD8wy0sNLyNQ',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
),
(
'https://www.youtube.com/s/player/9c6dfc4a/player_ias.vflset/en_US/base.js',
'jbu7ylIosQHyJyJV', 'uwI0ESiynAmhNg',
),
(
'https://www.youtube.com/s/player/f6e09c70/player_ias.vflset/en_US/base.js',
'W9HJZKktxuYoDTqW', 'jHbbkcaxm54',
),
(
'https://www.youtube.com/s/player/f6e09c70/player_ias_tce.vflset/en_US/base.js',
'W9HJZKktxuYoDTqW', 'jHbbkcaxm54',
),
(
'https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js',
'Sy4aDGc0VpYRR9ew_', '5UPOT1VhoZxNLQ',
),
(
'https://www.youtube.com/s/player/d50f54ef/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'XFTb2HoeOE5MHg',
),
(
'https://www.youtube.com/s/player/074a8365/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'ufTsrE0IVYrkl8v',
),
(
'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
'N5uAlLqm0eg1GyHO', 'dCBQOejdq5s-ww',
),
(
'https://www.youtube.com/s/player/69f581a5/tv-player-ias.vflset/tv-player-ias.js',
'-qIP447rVlTTwaZjY', 'KNcGOksBAvwqQg',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
),
(
'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/tv-player-ias.vflset/tv-player-ias.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/20830619/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/aa3fc80b/player_ias.vflset/en_US/base.js',
'0qY9dal2uzOnOGwa-48hha', 'VSh1KDfQMk-eag',
),
] ]
@ -178,6 +362,8 @@ class TestPlayerInfo(unittest.TestCase):
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'), ('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'), ('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'), ('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js', 'e7567ecf'),
('https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js', '643afba4'),
# obsolete # obsolete
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'), ('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'), ('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
@ -187,8 +373,9 @@ class TestPlayerInfo(unittest.TestCase):
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'), ('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'), ('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
) )
ie = YoutubeIE(FakeYDL({'cachedir': False}))
for player_url, expected_player_id in PLAYER_URLS: for player_url, expected_player_id in PLAYER_URLS:
player_id = YoutubeIE._extract_player_info(player_url) player_id = ie._extract_player_info(player_url)
self.assertEqual(player_id, expected_player_id) self.assertEqual(player_id, expected_player_id)
@ -200,21 +387,19 @@ class TestSignature(unittest.TestCase):
os.mkdir(self.TESTDATA_DIR) os.mkdir(self.TESTDATA_DIR)
def tearDown(self): def tearDown(self):
try: with compat_contextlib_suppress(OSError):
for f in os.listdir(self.TESTDATA_DIR): for f in os.listdir(self.TESTDATA_DIR):
os.remove(f) os.remove(f)
except OSError:
pass
def t_factory(name, sig_func, url_pattern): def t_factory(name, sig_func, url_pattern):
def make_tfunc(url, sig_input, expected_sig): def make_tfunc(url, sig_input, expected_sig):
m = url_pattern.match(url) m = url_pattern.match(url)
assert m, '%r should follow URL format' % url assert m, '{0!r} should follow URL format'.format(url)
test_id = m.group('id') test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))
def test_func(self): def test_func(self):
basename = 'player-{0}-{1}.js'.format(name, test_id) basename = 'player-{0}.js'.format(test_id)
fn = os.path.join(self.TESTDATA_DIR, basename) fn = os.path.join(self.TESTDATA_DIR, basename)
if not os.path.exists(fn): if not os.path.exists(fn):
@ -229,7 +414,7 @@ def t_factory(name, sig_func, url_pattern):
def signature(jscode, sig_input): def signature(jscode, sig_input):
func = YoutubeIE(FakeYDL())._parse_sig_js(jscode) func = YoutubeIE(FakeYDL({'cachedir': False}))._parse_sig_js(jscode)
src_sig = ( src_sig = (
compat_str(string.printable[:sig_input]) compat_str(string.printable[:sig_input])
if isinstance(sig_input, int) else sig_input) if isinstance(sig_input, int) else sig_input)
@ -237,17 +422,23 @@ def signature(jscode, sig_input):
def n_sig(jscode, sig_input): def n_sig(jscode, sig_input):
funcname = YoutubeIE(FakeYDL())._extract_n_function_name(jscode) ie = YoutubeIE(FakeYDL({'cachedir': False}))
return JSInterpreter(jscode).call_function(funcname, sig_input) jsi = JSInterpreter(jscode)
jsi, _, func_code = ie._extract_n_function_code_jsi(sig_input, jsi)
return ie._extract_n_function_from_code(jsi, func_code)(sig_input)
make_sig_test = t_factory( make_sig_test = t_factory(
'signature', signature, re.compile(r'.*-(?P<id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?\.[a-z]+$')) 'signature', signature,
re.compile(r'''(?x)
.+/(?P<h5>html5)?player(?(h5)(?:-en_US)?-|/)(?P<id>[a-zA-Z0-9/._-]+)
(?(h5)/(?:watch_as3|html5player))?\.js$
'''))
for test_spec in _SIG_TESTS: for test_spec in _SIG_TESTS:
make_sig_test(*test_spec) make_sig_test(*test_spec)
make_nsig_test = t_factory( make_nsig_test = t_factory(
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_-]+)/.+.js$')) 'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$'))
for test_spec in _NSIG_TESTS: for test_spec in _NSIG_TESTS:
make_nsig_test(*test_spec) make_nsig_test(*test_spec)

@ -357,7 +357,7 @@ class YoutubeDL(object):
_NUMERIC_FIELDS = set(( _NUMERIC_FIELDS = set((
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx', 'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
'timestamp', 'upload_year', 'upload_month', 'upload_day', 'timestamp', 'upload_year', 'upload_month', 'upload_day', 'available_at',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count', 'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit', 'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time', 'start_time', 'end_time',
@ -540,10 +540,14 @@ class YoutubeDL(object):
"""Print message to stdout if not in quiet mode.""" """Print message to stdout if not in quiet mode."""
return self.to_stdout(message, skip_eol, check_quiet=True) return self.to_stdout(message, skip_eol, check_quiet=True)
def _write_string(self, s, out=None): def _write_string(self, s, out=None, only_once=False, _cache=set()):
if only_once and s in _cache:
return
write_string(s, out=out, encoding=self.params.get('encoding')) write_string(s, out=out, encoding=self.params.get('encoding'))
if only_once:
_cache.add(s)
def to_stdout(self, message, skip_eol=False, check_quiet=False): def to_stdout(self, message, skip_eol=False, check_quiet=False, only_once=False):
"""Print message to stdout if not in quiet mode.""" """Print message to stdout if not in quiet mode."""
if self.params.get('logger'): if self.params.get('logger'):
self.params['logger'].debug(message) self.params['logger'].debug(message)
@ -552,9 +556,9 @@ class YoutubeDL(object):
terminator = ['\n', ''][skip_eol] terminator = ['\n', ''][skip_eol]
output = message + terminator output = message + terminator
self._write_string(output, self._screen_file) self._write_string(output, self._screen_file, only_once=only_once)
def to_stderr(self, message): def to_stderr(self, message, only_once=False):
"""Print message to stderr.""" """Print message to stderr."""
assert isinstance(message, compat_str) assert isinstance(message, compat_str)
if self.params.get('logger'): if self.params.get('logger'):
@ -562,7 +566,7 @@ class YoutubeDL(object):
else: else:
message = self._bidi_workaround(message) message = self._bidi_workaround(message)
output = message + '\n' output = message + '\n'
self._write_string(output, self._err_file) self._write_string(output, self._err_file, only_once=only_once)
def to_console_title(self, message): def to_console_title(self, message):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
@ -641,18 +645,11 @@ class YoutubeDL(object):
raise DownloadError(message, exc_info) raise DownloadError(message, exc_info)
self._download_retcode = 1 self._download_retcode = 1
def report_warning(self, message, only_once=False, _cache={}): def report_warning(self, message, only_once=False):
''' '''
Print the message to stderr, it will be prefixed with 'WARNING:' Print the message to stderr, it will be prefixed with 'WARNING:'
If stderr is a tty file the 'WARNING:' will be colored If stderr is a tty file the 'WARNING:' will be colored
''' '''
if only_once:
m_hash = hash((self, message))
m_cnt = _cache.setdefault(m_hash, 0)
_cache[m_hash] = m_cnt + 1
if m_cnt > 0:
return
if self.params.get('logger') is not None: if self.params.get('logger') is not None:
self.params['logger'].warning(message) self.params['logger'].warning(message)
else: else:
@ -663,7 +660,7 @@ class YoutubeDL(object):
else: else:
_msg_header = 'WARNING:' _msg_header = 'WARNING:'
warning_message = '%s %s' % (_msg_header, message) warning_message = '%s %s' % (_msg_header, message)
self.to_stderr(warning_message) self.to_stderr(warning_message, only_once=only_once)
def report_error(self, message, *args, **kwargs): def report_error(self, message, *args, **kwargs):
''' '''
@ -677,6 +674,16 @@ class YoutubeDL(object):
kwargs['message'] = '%s %s' % (_msg_header, message) kwargs['message'] = '%s %s' % (_msg_header, message)
self.trouble(*args, **kwargs) self.trouble(*args, **kwargs)
def write_debug(self, message, only_once=False):
'''Log debug message or Print message to stderr'''
if not self.params.get('verbose', False):
return
message = '[debug] {0}'.format(message)
if self.params.get('logger'):
self.params['logger'].debug(message)
else:
self.to_stderr(message, only_once)
def report_unscoped_cookies(self, *args, **kwargs): def report_unscoped_cookies(self, *args, **kwargs):
# message=None, tb=False, is_error=False # message=None, tb=False, is_error=False
if len(args) <= 2: if len(args) <= 2:
@ -2397,60 +2404,52 @@ class YoutubeDL(object):
return res return res
def _format_note(self, fdict): def _format_note(self, fdict):
res = ''
if fdict.get('ext') in ['f4f', 'f4m']: def simplified_codec(f, field):
res += '(unsupported) ' assert field in ('acodec', 'vcodec')
if fdict.get('language'): codec = f.get(field)
if res: return (
res += ' ' 'unknown' if not codec
res += '[%s] ' % fdict['language'] else '.'.join(codec.split('.')[:4]) if codec != 'none'
if fdict.get('format_note') is not None: else 'images' if field == 'vcodec' and f.get('acodec') == 'none'
res += fdict['format_note'] + ' ' else None if field == 'acodec' and f.get('vcodec') == 'none'
if fdict.get('tbr') is not None: else 'audio only' if field == 'vcodec'
res += '%4dk ' % fdict['tbr'] else 'video only')
res = join_nonempty(
fdict.get('ext') in ('f4f', 'f4m') and '(unsupported)',
fdict.get('language') and ('[%s]' % (fdict['language'],)),
fdict.get('format_note') is not None and fdict['format_note'],
fdict.get('tbr') is not None and ('%4dk' % fdict['tbr']),
delim=' ')
res = [res] if res else []
if fdict.get('container') is not None: if fdict.get('container') is not None:
if res: res.append('%s container' % (fdict['container'],))
res += ', ' if fdict.get('vcodec') not in (None, 'none'):
res += '%s container' % fdict['container'] codec = simplified_codec(fdict, 'vcodec')
if (fdict.get('vcodec') is not None if codec and fdict.get('vbr') is not None:
and fdict.get('vcodec') != 'none'): codec += '@'
if res:
res += ', '
res += fdict['vcodec']
if fdict.get('vbr') is not None:
res += '@'
elif fdict.get('vbr') is not None and fdict.get('abr') is not None: elif fdict.get('vbr') is not None and fdict.get('abr') is not None:
res += 'video@' codec = 'video@'
if fdict.get('vbr') is not None: else:
res += '%4dk' % fdict['vbr'] codec = None
codec = join_nonempty(codec, fdict.get('vbr') is not None and ('%4dk' % fdict['vbr']))
if codec:
res.append(codec)
if fdict.get('fps') is not None: if fdict.get('fps') is not None:
if res: res.append('%sfps' % (fdict['fps'],))
res += ', ' codec = (
res += '%sfps' % fdict['fps'] simplified_codec(fdict, 'acodec') if fdict.get('acodec') is not None
if fdict.get('acodec') is not None: else 'audio' if fdict.get('abr') is not None else None)
if res: if codec:
res += ', ' res.append(join_nonempty(
if fdict['acodec'] == 'none': '%-4s' % (codec + (('@%3dk' % fdict['abr']) if fdict.get('abr') else ''),),
res += 'video only' fdict.get('asr') and '(%5dHz)' % fdict['asr'], delim=' '))
else:
res += '%-5s' % fdict['acodec']
elif fdict.get('abr') is not None:
if res:
res += ', '
res += 'audio'
if fdict.get('abr') is not None:
res += '@%3dk' % fdict['abr']
if fdict.get('asr') is not None:
res += ' (%5dHz)' % fdict['asr']
if fdict.get('filesize') is not None: if fdict.get('filesize') is not None:
if res: res.append(format_bytes(fdict['filesize']))
res += ', '
res += format_bytes(fdict['filesize'])
elif fdict.get('filesize_approx') is not None: elif fdict.get('filesize_approx') is not None:
if res: res.append('~' + format_bytes(fdict['filesize_approx']))
res += ', ' return ', '.join(res)
res += '~' + format_bytes(fdict['filesize_approx'])
return res
def list_formats(self, info_dict): def list_formats(self, info_dict):
formats = info_dict.get('formats', [info_dict]) formats = info_dict.get('formats', [info_dict])
@ -2514,7 +2513,7 @@ class YoutubeDL(object):
self.get_encoding())) self.get_encoding()))
write_string(encoding_str, encoding=None) write_string(encoding_str, encoding=None)
writeln_debug = lambda *s: self._write_string('[debug] %s\n' % (''.join(s), )) writeln_debug = lambda *s: self.write_debug(''.join(s))
writeln_debug('youtube-dl version ', __version__) writeln_debug('youtube-dl version ', __version__)
if _LAZY_LOADER: if _LAZY_LOADER:
writeln_debug('Lazy loading extractors enabled') writeln_debug('Lazy loading extractors enabled')

@ -18,7 +18,7 @@ from .compat import (
compat_getpass, compat_getpass,
compat_register_utf8, compat_register_utf8,
compat_shlex_split, compat_shlex_split,
workaround_optparse_bug9161, _workaround_optparse_bug9161,
) )
from .utils import ( from .utils import (
_UnsafeExtensionError, _UnsafeExtensionError,
@ -50,7 +50,7 @@ def _real_main(argv=None):
# Compatibility fix for Windows # Compatibility fix for Windows
compat_register_utf8() compat_register_utf8()
workaround_optparse_bug9161() _workaround_optparse_bug9161()
setproctitle('youtube-dl') setproctitle('youtube-dl')
@ -409,6 +409,8 @@ def _real_main(argv=None):
'include_ads': opts.include_ads, 'include_ads': opts.include_ads,
'default_search': opts.default_search, 'default_search': opts.default_search,
'youtube_include_dash_manifest': opts.youtube_include_dash_manifest, 'youtube_include_dash_manifest': opts.youtube_include_dash_manifest,
'youtube_player_js_version': opts.youtube_player_js_version,
'youtube_player_js_variant': opts.youtube_player_js_variant,
'encoding': opts.encoding, 'encoding': opts.encoding,
'extract_flat': opts.extract_flat, 'extract_flat': opts.extract_flat,
'mark_watched': opts.mark_watched, 'mark_watched': opts.mark_watched,

@ -1,3 +1,4 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import errno import errno
@ -10,12 +11,14 @@ import traceback
from .compat import ( from .compat import (
compat_getenv, compat_getenv,
compat_open as open, compat_open as open,
compat_os_makedirs,
) )
from .utils import ( from .utils import (
error_to_compat_str, error_to_compat_str,
escape_rfc3986,
expand_path, expand_path,
is_outdated_version, is_outdated_version,
try_get, traverse_obj,
write_json_file, write_json_file,
) )
from .version import __version__ from .version import __version__
@ -30,23 +33,35 @@ class Cache(object):
def __init__(self, ydl): def __init__(self, ydl):
self._ydl = ydl self._ydl = ydl
def _write_debug(self, *args, **kwargs):
self._ydl.write_debug(*args, **kwargs)
def _report_warning(self, *args, **kwargs):
self._ydl.report_warning(*args, **kwargs)
def _to_screen(self, *args, **kwargs):
self._ydl.to_screen(*args, **kwargs)
def _get_param(self, k, default=None):
return self._ydl.params.get(k, default)
def _get_root_dir(self): def _get_root_dir(self):
res = self._ydl.params.get('cachedir') res = self._get_param('cachedir')
if res is None: if res is None:
cache_root = compat_getenv('XDG_CACHE_HOME', '~/.cache') cache_root = compat_getenv('XDG_CACHE_HOME', '~/.cache')
res = os.path.join(cache_root, self._YTDL_DIR) res = os.path.join(cache_root, self._YTDL_DIR)
return expand_path(res) return expand_path(res)
def _get_cache_fn(self, section, key, dtype): def _get_cache_fn(self, section, key, dtype):
assert re.match(r'^[a-zA-Z0-9_.-]+$', section), \ assert re.match(r'^[\w.-]+$', section), \
'invalid section %r' % section 'invalid section %r' % section
assert re.match(r'^[a-zA-Z0-9_.-]+$', key), 'invalid key %r' % key key = escape_rfc3986(key, safe='').replace('%', ',') # encode non-ascii characters
return os.path.join( return os.path.join(
self._get_root_dir(), section, '%s.%s' % (key, dtype)) self._get_root_dir(), section, '%s.%s' % (key, dtype))
@property @property
def enabled(self): def enabled(self):
return self._ydl.params.get('cachedir') is not False return self._get_param('cachedir') is not False
def store(self, section, key, data, dtype='json'): def store(self, section, key, data, dtype='json'):
assert dtype in ('json',) assert dtype in ('json',)
@ -56,61 +71,75 @@ class Cache(object):
fn = self._get_cache_fn(section, key, dtype) fn = self._get_cache_fn(section, key, dtype)
try: try:
try: compat_os_makedirs(os.path.dirname(fn), exist_ok=True)
os.makedirs(os.path.dirname(fn)) self._write_debug('Saving {section}.{key} to cache'.format(section=section, key=key))
except OSError as ose:
if ose.errno != errno.EEXIST:
raise
write_json_file({self._VERSION_KEY: __version__, 'data': data}, fn) write_json_file({self._VERSION_KEY: __version__, 'data': data}, fn)
except Exception: except Exception:
tb = traceback.format_exc() tb = traceback.format_exc()
self._ydl.report_warning( self._report_warning('Writing cache to {fn!r} failed: {tb}'.format(fn=fn, tb=tb))
'Writing cache to %r failed: %s' % (fn, tb))
def clear(self, section, key, dtype='json'):
if not self.enabled:
return
fn = self._get_cache_fn(section, key, dtype)
self._write_debug('Clearing {section}.{key} from cache'.format(section=section, key=key))
try:
os.remove(fn)
except Exception as e:
if getattr(e, 'errno') == errno.ENOENT:
# file not found
return
tb = traceback.format_exc()
self._report_warning('Clearing cache from {fn!r} failed: {tb}'.format(fn=fn, tb=tb))
def _validate(self, data, min_ver): def _validate(self, data, min_ver):
version = try_get(data, lambda x: x[self._VERSION_KEY]) version = traverse_obj(data, self._VERSION_KEY)
if not version: # Backward compatibility if not version: # Backward compatibility
data, version = {'data': data}, self._DEFAULT_VERSION data, version = {'data': data}, self._DEFAULT_VERSION
if not is_outdated_version(version, min_ver or '0', assume_new=False): if not is_outdated_version(version, min_ver or '0', assume_new=False):
return data['data'] return data['data']
self._ydl.to_screen( self._write_debug('Discarding old cache from version {version} (needs {min_ver})'.format(version=version, min_ver=min_ver))
'Discarding old cache from version {version} (needs {min_ver})'.format(**locals()))
def load(self, section, key, dtype='json', default=None, min_ver=None): def load(self, section, key, dtype='json', default=None, **kw_min_ver):
assert dtype in ('json',) assert dtype in ('json',)
min_ver = kw_min_ver.get('min_ver')
if not self.enabled: if not self.enabled:
return default return default
cache_fn = self._get_cache_fn(section, key, dtype) cache_fn = self._get_cache_fn(section, key, dtype)
try: try:
with open(cache_fn, encoding='utf-8') as cachef:
self._write_debug('Loading {section}.{key} from cache'.format(section=section, key=key), only_once=True)
return self._validate(json.load(cachef), min_ver)
except (ValueError, KeyError):
try: try:
with open(cache_fn, 'r', encoding='utf-8') as cachef: file_size = 'size: %d' % os.path.getsize(cache_fn)
return self._validate(json.load(cachef), min_ver) except (OSError, IOError) as oe:
except ValueError: file_size = error_to_compat_str(oe)
try: self._report_warning('Cache retrieval from %s failed (%s)' % (cache_fn, file_size))
file_size = os.path.getsize(cache_fn) except Exception as e:
except (OSError, IOError) as oe: if getattr(e, 'errno') == errno.ENOENT:
file_size = error_to_compat_str(oe) # no cache available
self._ydl.report_warning( return
'Cache retrieval from %s failed (%s)' % (cache_fn, file_size)) self._report_warning('Cache retrieval from %s failed' % (cache_fn,))
except IOError:
pass # No cache available
return default return default
def remove(self): def remove(self):
if not self.enabled: if not self.enabled:
self._ydl.to_screen('Cache is disabled (Did you combine --no-cache-dir and --rm-cache-dir?)') self._to_screen('Cache is disabled (Did you combine --no-cache-dir and --rm-cache-dir?)')
return return
cachedir = self._get_root_dir() cachedir = self._get_root_dir()
if not any((term in cachedir) for term in ('cache', 'tmp')): if not any((term in cachedir) for term in ('cache', 'tmp')):
raise Exception('Not removing directory %s - this does not look like a cache dir' % cachedir) raise Exception('Not removing directory %s - this does not look like a cache dir' % (cachedir,))
self._ydl.to_screen( self._to_screen(
'Removing cache dir %s .' % cachedir, skip_eol=True) 'Removing cache dir %s .' % (cachedir,), skip_eol=True, ),
if os.path.exists(cachedir): if os.path.exists(cachedir):
self._ydl.to_screen('.', skip_eol=True) self._to_screen('.', skip_eol=True)
shutil.rmtree(cachedir) shutil.rmtree(cachedir)
self._ydl.to_screen('.') self._to_screen('.')

@ -10,9 +10,10 @@ from .compat import (
# https://github.com/unicode-org/icu/blob/main/icu4c/source/data/unidata/CaseFolding.txt # https://github.com/unicode-org/icu/blob/main/icu4c/source/data/unidata/CaseFolding.txt
# In case newly foldable Unicode characters are defined, paste the new version # In case newly foldable Unicode characters are defined, paste the new version
# of the text inside the ''' marks. # of the text inside the ''' marks.
# The text is expected to have only blank lines andlines with 1st character #, # The text is expected to have only blank lines and lines with 1st character #,
# all ignored, and fold definitions like this: # all ignored, and fold definitions like this:
# `from_hex_code; space_separated_to_hex_code_list; comment` # `from_hex_code; status; space_separated_to_hex_code_list; comment`
# Only `status` C/F are used.
_map_str = ''' _map_str = '''
# CaseFolding-15.0.0.txt # CaseFolding-15.0.0.txt
@ -1657,11 +1658,6 @@ _map = dict(
del _map_str del _map_str
def casefold(s): def _casefold(s):
assert isinstance(s, compat_str) assert isinstance(s, compat_str)
return ''.join((_map.get(c, c) for c in s)) return ''.join((_map.get(c, c) for c in s))
__all__ = [
'casefold',
]

@ -16,7 +16,6 @@ import os
import platform import platform
import re import re
import shlex import shlex
import shutil
import socket import socket
import struct import struct
import subprocess import subprocess
@ -24,11 +23,15 @@ import sys
import types import types
import xml.etree.ElementTree import xml.etree.ElementTree
_IDENTITY = lambda x: x
# naming convention # naming convention
# 'compat_' + Python3_name.replace('.', '_') # 'compat_' + Python3_name.replace('.', '_')
# other aliases exist for convenience and/or legacy # other aliases exist for convenience and/or legacy
# wrap disposable test values in type() to reclaim storage
# deal with critical unicode/str things first # deal with critical unicode/str things first:
# compat_str, compat_basestring, compat_chr
try: try:
# Python 2 # Python 2
compat_str, compat_basestring, compat_chr = ( compat_str, compat_basestring, compat_chr = (
@ -39,18 +42,23 @@ except NameError:
str, (str, bytes), chr str, (str, bytes), chr
) )
# casefold
# compat_casefold
try: try:
compat_str.casefold compat_str.casefold
compat_casefold = lambda s: s.casefold() compat_casefold = lambda s: s.casefold()
except AttributeError: except AttributeError:
from .casefold import casefold as compat_casefold from .casefold import _casefold as compat_casefold
# compat_collections_abc
try: try:
import collections.abc as compat_collections_abc import collections.abc as compat_collections_abc
except ImportError: except ImportError:
import collections as compat_collections_abc compat_collections_abc = collections
# compat_urllib_request
try: try:
import urllib.request as compat_urllib_request import urllib.request as compat_urllib_request
except ImportError: # Python 2 except ImportError: # Python 2
@ -79,11 +87,15 @@ except TypeError:
_add_init_method_arg(compat_urllib_request.Request) _add_init_method_arg(compat_urllib_request.Request)
del _add_init_method_arg del _add_init_method_arg
# compat_urllib_error
try: try:
import urllib.error as compat_urllib_error import urllib.error as compat_urllib_error
except ImportError: # Python 2 except ImportError: # Python 2
import urllib2 as compat_urllib_error import urllib2 as compat_urllib_error
# compat_urllib_parse
try: try:
import urllib.parse as compat_urllib_parse import urllib.parse as compat_urllib_parse
except ImportError: # Python 2 except ImportError: # Python 2
@ -98,17 +110,23 @@ except ImportError: # Python 2
compat_urlparse = compat_urllib_parse compat_urlparse = compat_urllib_parse
compat_urllib_parse_urlparse = compat_urllib_parse.urlparse compat_urllib_parse_urlparse = compat_urllib_parse.urlparse
# compat_urllib_response
try: try:
import urllib.response as compat_urllib_response import urllib.response as compat_urllib_response
except ImportError: # Python 2 except ImportError: # Python 2
import urllib as compat_urllib_response import urllib as compat_urllib_response
# compat_urllib_response.addinfourl
try: try:
compat_urllib_response.addinfourl.status compat_urllib_response.addinfourl.status
except AttributeError: except AttributeError:
# .getcode() is deprecated in Py 3. # .getcode() is deprecated in Py 3.
compat_urllib_response.addinfourl.status = property(lambda self: self.getcode()) compat_urllib_response.addinfourl.status = property(lambda self: self.getcode())
# compat_http_cookiejar
try: try:
import http.cookiejar as compat_cookiejar import http.cookiejar as compat_cookiejar
except ImportError: # Python 2 except ImportError: # Python 2
@ -127,12 +145,16 @@ else:
compat_cookiejar_Cookie = compat_cookiejar.Cookie compat_cookiejar_Cookie = compat_cookiejar.Cookie
compat_http_cookiejar_Cookie = compat_cookiejar_Cookie compat_http_cookiejar_Cookie = compat_cookiejar_Cookie
# compat_http_cookies
try: try:
import http.cookies as compat_cookies import http.cookies as compat_cookies
except ImportError: # Python 2 except ImportError: # Python 2
import Cookie as compat_cookies import Cookie as compat_cookies
compat_http_cookies = compat_cookies compat_http_cookies = compat_cookies
# compat_http_cookies_SimpleCookie
if sys.version_info[0] == 2 or sys.version_info < (3, 3): if sys.version_info[0] == 2 or sys.version_info < (3, 3):
class compat_cookies_SimpleCookie(compat_cookies.SimpleCookie): class compat_cookies_SimpleCookie(compat_cookies.SimpleCookie):
def load(self, rawdata): def load(self, rawdata):
@ -155,11 +177,15 @@ else:
compat_cookies_SimpleCookie = compat_cookies.SimpleCookie compat_cookies_SimpleCookie = compat_cookies.SimpleCookie
compat_http_cookies_SimpleCookie = compat_cookies_SimpleCookie compat_http_cookies_SimpleCookie = compat_cookies_SimpleCookie
# compat_html_entities, probably useless now
try: try:
import html.entities as compat_html_entities import html.entities as compat_html_entities
except ImportError: # Python 2 except ImportError: # Python 2
import htmlentitydefs as compat_html_entities import htmlentitydefs as compat_html_entities
# compat_html_entities_html5
try: # Python >= 3.3 try: # Python >= 3.3
compat_html_entities_html5 = compat_html_entities.html5 compat_html_entities_html5 = compat_html_entities.html5
except AttributeError: except AttributeError:
@ -2408,18 +2434,24 @@ except AttributeError:
# Py < 3.1 # Py < 3.1
compat_http_client.HTTPResponse.getcode = lambda self: self.status compat_http_client.HTTPResponse.getcode = lambda self: self.status
# compat_urllib_HTTPError
try: try:
from urllib.error import HTTPError as compat_HTTPError from urllib.error import HTTPError as compat_HTTPError
except ImportError: # Python 2 except ImportError: # Python 2
from urllib2 import HTTPError as compat_HTTPError from urllib2 import HTTPError as compat_HTTPError
compat_urllib_HTTPError = compat_HTTPError compat_urllib_HTTPError = compat_HTTPError
# compat_urllib_request_urlretrieve
try: try:
from urllib.request import urlretrieve as compat_urlretrieve from urllib.request import urlretrieve as compat_urlretrieve
except ImportError: # Python 2 except ImportError: # Python 2
from urllib import urlretrieve as compat_urlretrieve from urllib import urlretrieve as compat_urlretrieve
compat_urllib_request_urlretrieve = compat_urlretrieve compat_urllib_request_urlretrieve = compat_urlretrieve
# compat_html_parser_HTMLParser, compat_html_parser_HTMLParseError
try: try:
from HTMLParser import ( from HTMLParser import (
HTMLParser as compat_HTMLParser, HTMLParser as compat_HTMLParser,
@ -2432,22 +2464,33 @@ except ImportError: # Python 3
# HTMLParseError was deprecated in Python 3.3 and removed in # HTMLParseError was deprecated in Python 3.3 and removed in
# Python 3.5. Introducing dummy exception for Python >3.5 for compatible # Python 3.5. Introducing dummy exception for Python >3.5 for compatible
# and uniform cross-version exception handling # and uniform cross-version exception handling
class compat_HTMLParseError(Exception): class compat_HTMLParseError(Exception):
pass pass
compat_html_parser_HTMLParser = compat_HTMLParser compat_html_parser_HTMLParser = compat_HTMLParser
compat_html_parser_HTMLParseError = compat_HTMLParseError compat_html_parser_HTMLParseError = compat_HTMLParseError
# compat_subprocess_get_DEVNULL
try: try:
_DEVNULL = subprocess.DEVNULL _DEVNULL = subprocess.DEVNULL
compat_subprocess_get_DEVNULL = lambda: _DEVNULL compat_subprocess_get_DEVNULL = lambda: _DEVNULL
except AttributeError: except AttributeError:
compat_subprocess_get_DEVNULL = lambda: open(os.path.devnull, 'w') compat_subprocess_get_DEVNULL = lambda: open(os.path.devnull, 'w')
# compat_http_server
try: try:
import http.server as compat_http_server import http.server as compat_http_server
except ImportError: except ImportError:
import BaseHTTPServer as compat_http_server import BaseHTTPServer as compat_http_server
# compat_urllib_parse_unquote_to_bytes,
# compat_urllib_parse_unquote, compat_urllib_parse_unquote_plus,
# compat_urllib_parse_urlencode,
# compat_urllib_parse_parse_qs
try: try:
from urllib.parse import unquote_to_bytes as compat_urllib_parse_unquote_to_bytes from urllib.parse import unquote_to_bytes as compat_urllib_parse_unquote_to_bytes
from urllib.parse import unquote as compat_urllib_parse_unquote from urllib.parse import unquote as compat_urllib_parse_unquote
@ -2455,8 +2498,7 @@ try:
from urllib.parse import urlencode as compat_urllib_parse_urlencode from urllib.parse import urlencode as compat_urllib_parse_urlencode
from urllib.parse import parse_qs as compat_parse_qs from urllib.parse import parse_qs as compat_parse_qs
except ImportError: # Python 2 except ImportError: # Python 2
_asciire = (compat_urllib_parse._asciire if hasattr(compat_urllib_parse, '_asciire') _asciire = getattr(compat_urllib_parse, '_asciire', None) or re.compile(r'([\x00-\x7f]+)')
else re.compile(r'([\x00-\x7f]+)'))
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus # HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version # implementations from cpython 3.4.3's stdlib. Python 2's version
@ -2524,24 +2566,21 @@ except ImportError: # Python 2
# Possible solutions are to either port it from python 3 with all # Possible solutions are to either port it from python 3 with all
# the friends or manually ensure input query contains only byte strings. # the friends or manually ensure input query contains only byte strings.
# We will stick with latter thus recursively encoding the whole query. # We will stick with latter thus recursively encoding the whole query.
def compat_urllib_parse_urlencode(query, doseq=0, encoding='utf-8'): def compat_urllib_parse_urlencode(query, doseq=0, safe='', encoding='utf-8', errors='strict'):
def encode_elem(e): def encode_elem(e):
if isinstance(e, dict): if isinstance(e, dict):
e = encode_dict(e) e = encode_dict(e)
elif isinstance(e, (list, tuple,)): elif isinstance(e, (list, tuple,)):
list_e = encode_list(e) e = type(e)(encode_elem(el) for el in e)
e = tuple(list_e) if isinstance(e, tuple) else list_e
elif isinstance(e, compat_str): elif isinstance(e, compat_str):
e = e.encode(encoding) e = e.encode(encoding, errors)
return e return e
def encode_dict(d): def encode_dict(d):
return dict((encode_elem(k), encode_elem(v)) for k, v in d.items()) return tuple((encode_elem(k), encode_elem(v)) for k, v in d.items())
def encode_list(l):
return [encode_elem(e) for e in l]
return compat_urllib_parse._urlencode(encode_elem(query), doseq=doseq) return compat_urllib_parse._urlencode(encode_elem(query), doseq=doseq).decode('ascii')
# HACK: The following is the correct parse_qs implementation from cpython 3's stdlib. # HACK: The following is the correct parse_qs implementation from cpython 3's stdlib.
# Python 2's version is apparently totally broken # Python 2's version is apparently totally broken
@ -2596,8 +2635,61 @@ except ImportError: # Python 2
('parse_qs', compat_parse_qs)): ('parse_qs', compat_parse_qs)):
setattr(compat_urllib_parse, name, fix) setattr(compat_urllib_parse, name, fix)
try:
all(chr(i) in b'' for i in range(256))
except TypeError:
# not all chr(i) are str: patch Python2 quote
_safemaps = getattr(compat_urllib_parse, '_safemaps', {})
_always_safe = frozenset(compat_urllib_parse.always_safe)
def _quote(s, safe='/'):
"""quote('abc def') -> 'abc%20def'"""
if not s and s is not None: # fast path
return s
safe = frozenset(safe)
cachekey = (safe, _always_safe)
try:
safe_map = _safemaps[cachekey]
except KeyError:
safe = _always_safe | safe
safe_map = {}
for i in range(256):
c = chr(i)
safe_map[c] = (
c if (i < 128 and c in safe)
else b'%{0:02X}'.format(i))
_safemaps[cachekey] = safe_map
if safe.issuperset(s):
return s
return ''.join(safe_map[c] for c in s)
# linked code
def _quote_plus(s, safe=''):
return (
_quote(s, safe + b' ').replace(b' ', b'+') if b' ' in s
else _quote(s, safe))
# linked code
def _urlcleanup():
if compat_urllib_parse._urlopener:
compat_urllib_parse._urlopener.cleanup()
_safemaps.clear()
compat_urllib_parse.ftpcache.clear()
for name, fix in (
('quote', _quote),
('quote_plus', _quote_plus),
('urlcleanup', _urlcleanup)):
setattr(compat_urllib_parse, '_' + name, getattr(compat_urllib_parse, name))
setattr(compat_urllib_parse, name, fix)
compat_urllib_parse_parse_qs = compat_parse_qs compat_urllib_parse_parse_qs = compat_parse_qs
# compat_urllib_request_DataHandler
try: try:
from urllib.request import DataHandler as compat_urllib_request_DataHandler from urllib.request import DataHandler as compat_urllib_request_DataHandler
except ImportError: # Python < 3.4 except ImportError: # Python < 3.4
@ -2632,16 +2724,20 @@ except ImportError: # Python < 3.4
return compat_urllib_response.addinfourl(io.BytesIO(data), headers, url) return compat_urllib_response.addinfourl(io.BytesIO(data), headers, url)
# compat_xml_etree_ElementTree_ParseError
try: try:
from xml.etree.ElementTree import ParseError as compat_xml_parse_error from xml.etree.ElementTree import ParseError as compat_xml_parse_error
except ImportError: # Python 2.6 except ImportError: # Python 2.6
from xml.parsers.expat import ExpatError as compat_xml_parse_error from xml.parsers.expat import ExpatError as compat_xml_parse_error
compat_xml_etree_ElementTree_ParseError = compat_xml_parse_error compat_xml_etree_ElementTree_ParseError = compat_xml_parse_error
etree = xml.etree.ElementTree
# compat_xml_etree_ElementTree_Element
_etree = xml.etree.ElementTree
class _TreeBuilder(etree.TreeBuilder): class _TreeBuilder(_etree.TreeBuilder):
def doctype(self, name, pubid, system): def doctype(self, name, pubid, system):
pass pass
@ -2650,7 +2746,7 @@ try:
# xml.etree.ElementTree.Element is a method in Python <=2.6 and # xml.etree.ElementTree.Element is a method in Python <=2.6 and
# the following will crash with: # the following will crash with:
# TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types # TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
isinstance(None, etree.Element) isinstance(None, _etree.Element)
from xml.etree.ElementTree import Element as compat_etree_Element from xml.etree.ElementTree import Element as compat_etree_Element
except TypeError: # Python <=2.6 except TypeError: # Python <=2.6
from xml.etree.ElementTree import _ElementInterface as compat_etree_Element from xml.etree.ElementTree import _ElementInterface as compat_etree_Element
@ -2658,12 +2754,12 @@ compat_xml_etree_ElementTree_Element = compat_etree_Element
if sys.version_info[0] >= 3: if sys.version_info[0] >= 3:
def compat_etree_fromstring(text): def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder())) return _etree.XML(text, parser=_etree.XMLParser(target=_TreeBuilder()))
else: else:
# python 2.x tries to encode unicode strings with ascii (see the # python 2.x tries to encode unicode strings with ascii (see the
# XMLParser._fixtext method) # XMLParser._fixtext method)
try: try:
_etree_iter = etree.Element.iter _etree_iter = _etree.Element.iter
except AttributeError: # Python <=2.6 except AttributeError: # Python <=2.6
def _etree_iter(root): def _etree_iter(root):
for el in root.findall('*'): for el in root.findall('*'):
@ -2675,27 +2771,29 @@ else:
# 2.7 source # 2.7 source
def _XML(text, parser=None): def _XML(text, parser=None):
if not parser: if not parser:
parser = etree.XMLParser(target=_TreeBuilder()) parser = _etree.XMLParser(target=_TreeBuilder())
parser.feed(text) parser.feed(text)
return parser.close() return parser.close()
def _element_factory(*args, **kwargs): def _element_factory(*args, **kwargs):
el = etree.Element(*args, **kwargs) el = _etree.Element(*args, **kwargs)
for k, v in el.items(): for k, v in el.items():
if isinstance(v, bytes): if isinstance(v, bytes):
el.set(k, v.decode('utf-8')) el.set(k, v.decode('utf-8'))
return el return el
def compat_etree_fromstring(text): def compat_etree_fromstring(text):
doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory))) doc = _XML(text, parser=_etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
for el in _etree_iter(doc): for el in _etree_iter(doc):
if el.text is not None and isinstance(el.text, bytes): if el.text is not None and isinstance(el.text, bytes):
el.text = el.text.decode('utf-8') el.text = el.text.decode('utf-8')
return doc return doc
if hasattr(etree, 'register_namespace'):
compat_etree_register_namespace = etree.register_namespace # compat_xml_etree_register_namespace
else: try:
compat_etree_register_namespace = _etree.register_namespace
except AttributeError:
def compat_etree_register_namespace(prefix, uri): def compat_etree_register_namespace(prefix, uri):
"""Register a namespace prefix. """Register a namespace prefix.
The registry is global, and any existing mapping for either the The registry is global, and any existing mapping for either the
@ -2704,14 +2802,16 @@ else:
attributes in this namespace will be serialized with prefix if possible. attributes in this namespace will be serialized with prefix if possible.
ValueError is raised if prefix is reserved or is invalid. ValueError is raised if prefix is reserved or is invalid.
""" """
if re.match(r"ns\d+$", prefix): if re.match(r'ns\d+$', prefix):
raise ValueError("Prefix format reserved for internal use") raise ValueError('Prefix format reserved for internal use')
for k, v in list(etree._namespace_map.items()): for k, v in list(_etree._namespace_map.items()):
if k == uri or v == prefix: if k == uri or v == prefix:
del etree._namespace_map[k] del _etree._namespace_map[k]
etree._namespace_map[uri] = prefix _etree._namespace_map[uri] = prefix
compat_xml_etree_register_namespace = compat_etree_register_namespace compat_xml_etree_register_namespace = compat_etree_register_namespace
# compat_xpath, compat_etree_iterfind
if sys.version_info < (2, 7): if sys.version_info < (2, 7):
# Here comes the crazy part: In 2.6, if the xpath is a unicode, # Here comes the crazy part: In 2.6, if the xpath is a unicode,
# .//node does not match if a node is a direct child of . ! # .//node does not match if a node is a direct child of . !
@ -2898,7 +2998,6 @@ if sys.version_info < (2, 7):
def __init__(self, root): def __init__(self, root):
self.root = root self.root = root
##
# Generate all matching objects. # Generate all matching objects.
def compat_etree_iterfind(elem, path, namespaces=None): def compat_etree_iterfind(elem, path, namespaces=None):
@ -2933,13 +3032,15 @@ if sys.version_info < (2, 7):
else: else:
compat_xpath = lambda xpath: xpath
compat_etree_iterfind = lambda element, match: element.iterfind(match) compat_etree_iterfind = lambda element, match: element.iterfind(match)
compat_xpath = _IDENTITY
# compat_os_name
compat_os_name = os._name if os.name == 'java' else os.name compat_os_name = os._name if os.name == 'java' else os.name
# compat_shlex_quote
if compat_os_name == 'nt': if compat_os_name == 'nt':
def compat_shlex_quote(s): def compat_shlex_quote(s):
return s if re.match(r'^[-_\w./]+$', s) else '"%s"' % s.replace('"', '\\"') return s if re.match(r'^[-_\w./]+$', s) else '"%s"' % s.replace('"', '\\"')
@ -2954,6 +3055,7 @@ else:
return "'" + s.replace("'", "'\"'\"'") + "'" return "'" + s.replace("'", "'\"'\"'") + "'"
# compat_shlex.split
try: try:
args = shlex.split('中文') args = shlex.split('中文')
assert (isinstance(args, list) assert (isinstance(args, list)
@ -2969,6 +3071,7 @@ except (AssertionError, UnicodeEncodeError):
return list(map(lambda s: s.decode('utf-8'), shlex.split(s, comments, posix))) return list(map(lambda s: s.decode('utf-8'), shlex.split(s, comments, posix)))
# compat_ord
def compat_ord(c): def compat_ord(c):
if isinstance(c, int): if isinstance(c, int):
return c return c
@ -2976,6 +3079,7 @@ def compat_ord(c):
return ord(c) return ord(c)
# compat_getenv, compat_os_path_expanduser, compat_setenv
if sys.version_info >= (3, 0): if sys.version_info >= (3, 0):
compat_getenv = os.getenv compat_getenv = os.getenv
compat_expanduser = os.path.expanduser compat_expanduser = os.path.expanduser
@ -3063,6 +3167,22 @@ else:
compat_os_path_expanduser = compat_expanduser compat_os_path_expanduser = compat_expanduser
# compat_os_makedirs
try:
os.makedirs('.', exist_ok=True)
compat_os_makedirs = os.makedirs
except TypeError: # < Py3.2
from errno import EEXIST as _errno_EEXIST
def compat_os_makedirs(name, mode=0o777, exist_ok=False):
try:
return os.makedirs(name, mode=mode)
except OSError as ose:
if not (exist_ok and ose.errno == _errno_EEXIST):
raise
# compat_os_path_realpath
if compat_os_name == 'nt' and sys.version_info < (3, 8): if compat_os_name == 'nt' and sys.version_info < (3, 8):
# os.path.realpath on Windows does not follow symbolic links # os.path.realpath on Windows does not follow symbolic links
# prior to Python 3.8 (see https://bugs.python.org/issue9949) # prior to Python 3.8 (see https://bugs.python.org/issue9949)
@ -3076,6 +3196,7 @@ else:
compat_os_path_realpath = compat_realpath compat_os_path_realpath = compat_realpath
# compat_print
if sys.version_info < (3, 0): if sys.version_info < (3, 0):
def compat_print(s): def compat_print(s):
from .utils import preferredencoding from .utils import preferredencoding
@ -3086,6 +3207,7 @@ else:
print(s) print(s)
# compat_getpass_getpass
if sys.version_info < (3, 0) and sys.platform == 'win32': if sys.version_info < (3, 0) and sys.platform == 'win32':
def compat_getpass(prompt, *args, **kwargs): def compat_getpass(prompt, *args, **kwargs):
if isinstance(prompt, compat_str): if isinstance(prompt, compat_str):
@ -3098,36 +3220,42 @@ else:
compat_getpass_getpass = compat_getpass compat_getpass_getpass = compat_getpass
# compat_input
try: try:
compat_input = raw_input compat_input = raw_input
except NameError: # Python 3 except NameError: # Python 3
compat_input = input compat_input = input
# compat_kwargs
# Python < 2.6.5 require kwargs to be bytes # Python < 2.6.5 require kwargs to be bytes
try: try:
def _testfunc(x): (lambda x: x)(**{'x': 0})
pass
_testfunc(**{'x': 0})
except TypeError: except TypeError:
def compat_kwargs(kwargs): def compat_kwargs(kwargs):
return dict((bytes(k), v) for k, v in kwargs.items()) return dict((bytes(k), v) for k, v in kwargs.items())
else: else:
compat_kwargs = lambda kwargs: kwargs compat_kwargs = _IDENTITY
# compat_numeric_types
try: try:
compat_numeric_types = (int, float, long, complex) compat_numeric_types = (int, float, long, complex)
except NameError: # Python 3 except NameError: # Python 3
compat_numeric_types = (int, float, complex) compat_numeric_types = (int, float, complex)
# compat_integer_types
try: try:
compat_integer_types = (int, long) compat_integer_types = (int, long)
except NameError: # Python 3 except NameError: # Python 3
compat_integer_types = (int, ) compat_integer_types = (int, )
# compat_int
compat_int = compat_integer_types[-1]
# compat_socket_create_connection
if sys.version_info < (2, 7): if sys.version_info < (2, 7):
def compat_socket_create_connection(address, timeout, source_address=None): def compat_socket_create_connection(address, timeout, source_address=None):
host, port = address host, port = address
@ -3154,6 +3282,7 @@ else:
compat_socket_create_connection = socket.create_connection compat_socket_create_connection = socket.create_connection
# compat_contextlib_suppress
try: try:
from contextlib import suppress as compat_contextlib_suppress from contextlib import suppress as compat_contextlib_suppress
except ImportError: except ImportError:
@ -3196,12 +3325,12 @@ except AttributeError:
# repeated .close() is OK, but just in case # repeated .close() is OK, but just in case
with compat_contextlib_suppress(EnvironmentError): with compat_contextlib_suppress(EnvironmentError):
f.close() f.close()
popen.wait() popen.wait()
# Fix https://github.com/ytdl-org/youtube-dl/issues/4223 # Fix https://github.com/ytdl-org/youtube-dl/issues/4223
# See http://bugs.python.org/issue9161 for what is broken # See http://bugs.python.org/issue9161 for what is broken
def workaround_optparse_bug9161(): def _workaround_optparse_bug9161():
op = optparse.OptionParser() op = optparse.OptionParser()
og = optparse.OptionGroup(op, 'foo') og = optparse.OptionGroup(op, 'foo')
try: try:
@ -3220,9 +3349,10 @@ def workaround_optparse_bug9161():
optparse.OptionGroup.add_option = _compat_add_option optparse.OptionGroup.add_option = _compat_add_option
if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3 # compat_shutil_get_terminal_size
compat_get_terminal_size = shutil.get_terminal_size try:
else: from shutil import get_terminal_size as compat_get_terminal_size # Python >= 3.3
except ImportError:
_terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines']) _terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines'])
def compat_get_terminal_size(fallback=(80, 24)): def compat_get_terminal_size(fallback=(80, 24)):
@ -3252,27 +3382,33 @@ else:
columns = _columns columns = _columns
if lines is None or lines <= 0: if lines is None or lines <= 0:
lines = _lines lines = _lines
return _terminal_size(columns, lines) return _terminal_size(columns, lines)
compat_shutil_get_terminal_size = compat_get_terminal_size
# compat_itertools_count
try: try:
itertools.count(start=0, step=1) type(itertools.count(start=0, step=1))
compat_itertools_count = itertools.count compat_itertools_count = itertools.count
except TypeError: # Python 2.6 except TypeError: # Python 2.6 lacks step
def compat_itertools_count(start=0, step=1): def compat_itertools_count(start=0, step=1):
while True: while True:
yield start yield start
start += step start += step
# compat_tokenize_tokenize
if sys.version_info >= (3, 0): if sys.version_info >= (3, 0):
from tokenize import tokenize as compat_tokenize_tokenize from tokenize import tokenize as compat_tokenize_tokenize
else: else:
from tokenize import generate_tokens as compat_tokenize_tokenize from tokenize import generate_tokens as compat_tokenize_tokenize
# compat_struct_pack, compat_struct_unpack, compat_Struct
try: try:
struct.pack('!I', 0) type(struct.pack('!I', 0))
except TypeError: except TypeError:
# In Python 2.6 and 2.7.x < 2.7.7, struct requires a bytes argument # In Python 2.6 and 2.7.x < 2.7.7, struct requires a bytes argument
# See https://bugs.python.org/issue19099 # See https://bugs.python.org/issue19099
@ -3304,8 +3440,10 @@ else:
compat_Struct = struct.Struct compat_Struct = struct.Struct
# compat_map/filter() returning an iterator, supposedly the # builtins returning an iterator
# same versioning as for zip below
# compat_map, compat_filter
# supposedly the same versioning as for zip below
try: try:
from future_builtins import map as compat_map from future_builtins import map as compat_map
except ImportError: except ImportError:
@ -3314,6 +3452,8 @@ except ImportError:
except ImportError: except ImportError:
compat_map = map compat_map = map
# compat_filter, compat_filter_fns
try: try:
from future_builtins import filter as compat_filter from future_builtins import filter as compat_filter
except ImportError: except ImportError:
@ -3321,7 +3461,11 @@ except ImportError:
from itertools import ifilter as compat_filter from itertools import ifilter as compat_filter
except ImportError: except ImportError:
compat_filter = filter compat_filter = filter
# "Is this function one or maybe the other filter()?"
compat_filter_fns = tuple(set((filter, compat_filter)))
# compat_zip
try: try:
from future_builtins import zip as compat_zip from future_builtins import zip as compat_zip
except ImportError: # not 2.6+ or is 3.x except ImportError: # not 2.6+ or is 3.x
@ -3331,6 +3475,7 @@ except ImportError: # not 2.6+ or is 3.x
compat_zip = zip compat_zip = zip
# compat_itertools_zip_longest
# method renamed between Py2/3 # method renamed between Py2/3
try: try:
from itertools import zip_longest as compat_itertools_zip_longest from itertools import zip_longest as compat_itertools_zip_longest
@ -3338,7 +3483,42 @@ except ImportError:
from itertools import izip_longest as compat_itertools_zip_longest from itertools import izip_longest as compat_itertools_zip_longest
# new class in collections # compat_abc_ABC
try:
from abc import ABC as compat_abc_ABC
except ImportError:
# Py < 3.4
from abc import ABCMeta as _ABCMeta
compat_abc_ABC = _ABCMeta(str('ABC'), (object,), {})
# dict mixin used here
# like UserDict.DictMixin, without methods created by MutableMapping
class _DictMixin(compat_abc_ABC):
def has_key(self, key):
return key in self
# get(), clear(), setdefault() in MM
def iterkeys(self):
return (k for k in self)
def itervalues(self):
return (self[k] for k in self)
def iteritems(self):
return ((k, self[k]) for k in self)
# pop(), popitem() in MM
def copy(self):
return type(self)(self)
# update() in MM
# compat_collections_chain_map
# collections.ChainMap: new class
try: try:
from collections import ChainMap as compat_collections_chain_map from collections import ChainMap as compat_collections_chain_map
# Py3.3's ChainMap is deficient # Py3.3's ChainMap is deficient
@ -3394,19 +3574,22 @@ except ImportError:
def new_child(self, m=None, **kwargs): def new_child(self, m=None, **kwargs):
m = m or {} m = m or {}
m.update(kwargs) m.update(kwargs)
return compat_collections_chain_map(m, *self.maps) # support inheritance !
return type(self)(m, *self.maps)
@property @property
def parents(self): def parents(self):
return compat_collections_chain_map(*(self.maps[1:])) return type(self)(*(self.maps[1:]))
# compat_re_Pattern, compat_re_Match
# Pythons disagree on the type of a pattern (RegexObject, _sre.SRE_Pattern, Pattern, ...?) # Pythons disagree on the type of a pattern (RegexObject, _sre.SRE_Pattern, Pattern, ...?)
compat_re_Pattern = type(re.compile('')) compat_re_Pattern = type(re.compile(''))
# and on the type of a match # and on the type of a match
compat_re_Match = type(re.match('a', 'a')) compat_re_Match = type(re.match('a', 'a'))
# compat_base64_b64decode
if sys.version_info < (3, 3): if sys.version_info < (3, 3):
def compat_b64decode(s, *args, **kwargs): def compat_b64decode(s, *args, **kwargs):
if isinstance(s, compat_str): if isinstance(s, compat_str):
@ -3418,6 +3601,7 @@ else:
compat_base64_b64decode = compat_b64decode compat_base64_b64decode = compat_b64decode
# compat_ctypes_WINFUNCTYPE
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0): if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function # PyPy2 prior to version 5.4.0 expects byte strings as Windows function
# names, see the original PyPy issue [1] and the youtube-dl one [2]. # names, see the original PyPy issue [1] and the youtube-dl one [2].
@ -3436,6 +3620,7 @@ else:
return ctypes.WINFUNCTYPE(*args, **kwargs) return ctypes.WINFUNCTYPE(*args, **kwargs)
# compat_open
if sys.version_info < (3, 0): if sys.version_info < (3, 0):
# open(file, mode='r', buffering=- 1, encoding=None, errors=None, newline=None, closefd=True) not: opener=None # open(file, mode='r', buffering=- 1, encoding=None, errors=None, newline=None, closefd=True) not: opener=None
def compat_open(file_, *args, **kwargs): def compat_open(file_, *args, **kwargs):
@ -3463,18 +3648,151 @@ except AttributeError:
def compat_datetime_timedelta_total_seconds(td): def compat_datetime_timedelta_total_seconds(td):
return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6 return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6
# optional decompression packages # optional decompression packages
# compat_brotli
# PyPi brotli package implements 'br' Content-Encoding # PyPi brotli package implements 'br' Content-Encoding
try: try:
import brotli as compat_brotli import brotli as compat_brotli
except ImportError: except ImportError:
compat_brotli = None compat_brotli = None
# compat_ncompress
# PyPi ncompress package implements 'compress' Content-Encoding # PyPi ncompress package implements 'compress' Content-Encoding
try: try:
import ncompress as compat_ncompress import ncompress as compat_ncompress
except ImportError: except ImportError:
compat_ncompress = None compat_ncompress = None
# compat_zstandard
# PyPi zstandard package implements 'zstd' Content-Encoding (RFC 8878 7.2)
try:
import zstandard as compat_zstandard
except ImportError:
compat_zstandard = None
# compat_thread
try:
import _thread as compat_thread
except ImportError:
try:
import thread as compat_thread
except ImportError:
import dummy_thread as compat_thread
# compat_dict
# compat_builtins_dict
# compat_dict_items
if sys.version_info >= (3, 6):
compat_dict = compat_builtins_dict = dict
compat_dict_items = dict.items
else:
_get_ident = compat_thread.get_ident
class compat_dict(compat_collections_abc.MutableMapping, _DictMixin, dict):
"""`dict` that preserves insertion order with interface like Py3.7+"""
_order = [] # default that should never be used
def __init__(self, *mappings_or_iterables, **kwargs):
# order an unordered dict using a list of keys: actual Py 2.7+
# OrderedDict uses a doubly linked list for better performance
self._order = []
for arg in mappings_or_iterables:
self.__update(arg)
if kwargs:
self.__update(kwargs)
def __getitem__(self, key):
return dict.__getitem__(self, key)
def __setitem__(self, key, value):
try:
if key not in self._order:
self._order.append(key)
dict.__setitem__(self, key, value)
except Exception:
if key in self._order[-1:] and key not in self:
del self._order[-1]
raise
def __len__(self):
return dict.__len__(self)
def __delitem__(self, key):
dict.__delitem__(self, key)
try:
# expected case, O(len(self)), but who dels anyway?
self._order.remove(key)
except ValueError:
pass
def __iter__(self):
for from_ in self._order:
if from_ in self:
yield from_
def __del__(self):
for attr in ('_order',):
try:
delattr(self, attr)
except Exception:
pass
def __repr__(self, _repr_running={}):
# skip recursive items ...
call_key = id(self), _get_ident()
if _repr_running.get(call_key):
return '...'
_repr_running[call_key] = True
try:
return '%s({%s})' % (
type(self).__name__,
','.join('%r: %r' % k_v for k_v in self.items()))
finally:
del _repr_running[call_key]
# merge/update (PEP 584)
def __or__(self, other):
if not isinstance(other, compat_collections_abc.Mapping):
return NotImplemented
new = type(self)(self)
new.update(other)
return new
def __ror__(self, other):
if not isinstance(other, compat_collections_abc.Mapping):
return NotImplemented
new = type(other)(other)
new.update(self)
return new
def __ior__(self, other):
self.update(other)
return self
# optimisations
def __reversed__(self):
for from_ in reversed(self._order):
if from_ in self:
yield from_
def __contains__(self, item):
return dict.__contains__(self, item)
# allow overriding update without breaking __init__
def __update(self, *args, **kwargs):
super(compat_dict, self).update(*args, **kwargs)
compat_builtins_dict = dict
# Using the object's method, not dict's:
# an ordered dict's items can be returned unstably by unordered
# dict.items as if the method was not ((k, self[k]) for k in self)
compat_dict_items = lambda d: d.items()
legacy = [ legacy = [
'compat_HTMLParseError', 'compat_HTMLParseError',
@ -3491,6 +3809,7 @@ legacy = [
'compat_getpass', 'compat_getpass',
'compat_parse_qs', 'compat_parse_qs',
'compat_realpath', 'compat_realpath',
'compat_shlex_split',
'compat_urllib_parse_parse_qs', 'compat_urllib_parse_parse_qs',
'compat_urllib_parse_unquote', 'compat_urllib_parse_unquote',
'compat_urllib_parse_unquote_plus', 'compat_urllib_parse_unquote_plus',
@ -3504,34 +3823,40 @@ legacy = [
__all__ = [ __all__ = [
'compat_html_parser_HTMLParseError',
'compat_html_parser_HTMLParser',
'compat_Struct', 'compat_Struct',
'compat_abc_ABC',
'compat_base64_b64decode', 'compat_base64_b64decode',
'compat_basestring', 'compat_basestring',
'compat_brotli', 'compat_brotli',
'compat_builtins_dict',
'compat_casefold', 'compat_casefold',
'compat_chr', 'compat_chr',
'compat_collections_abc', 'compat_collections_abc',
'compat_collections_chain_map', 'compat_collections_chain_map',
'compat_datetime_timedelta_total_seconds',
'compat_http_cookiejar',
'compat_http_cookiejar_Cookie',
'compat_http_cookies',
'compat_http_cookies_SimpleCookie',
'compat_contextlib_suppress', 'compat_contextlib_suppress',
'compat_ctypes_WINFUNCTYPE', 'compat_ctypes_WINFUNCTYPE',
'compat_datetime_timedelta_total_seconds',
'compat_dict',
'compat_dict_items',
'compat_etree_fromstring', 'compat_etree_fromstring',
'compat_etree_iterfind', 'compat_etree_iterfind',
'compat_filter', 'compat_filter',
'compat_filter_fns',
'compat_get_terminal_size', 'compat_get_terminal_size',
'compat_getenv', 'compat_getenv',
'compat_getpass_getpass', 'compat_getpass_getpass',
'compat_html_entities', 'compat_html_entities',
'compat_html_entities_html5', 'compat_html_entities_html5',
'compat_html_parser_HTMLParseError',
'compat_html_parser_HTMLParser',
'compat_http_cookiejar',
'compat_http_cookiejar_Cookie',
'compat_http_cookies',
'compat_http_cookies_SimpleCookie',
'compat_http_client', 'compat_http_client',
'compat_http_server', 'compat_http_server',
'compat_input', 'compat_input',
'compat_int',
'compat_integer_types', 'compat_integer_types',
'compat_itertools_count', 'compat_itertools_count',
'compat_itertools_zip_longest', 'compat_itertools_zip_longest',
@ -3541,6 +3866,7 @@ __all__ = [
'compat_numeric_types', 'compat_numeric_types',
'compat_open', 'compat_open',
'compat_ord', 'compat_ord',
'compat_os_makedirs',
'compat_os_name', 'compat_os_name',
'compat_os_path_expanduser', 'compat_os_path_expanduser',
'compat_os_path_realpath', 'compat_os_path_realpath',
@ -3550,13 +3876,14 @@ __all__ = [
'compat_register_utf8', 'compat_register_utf8',
'compat_setenv', 'compat_setenv',
'compat_shlex_quote', 'compat_shlex_quote',
'compat_shlex_split', 'compat_shutil_get_terminal_size',
'compat_socket_create_connection', 'compat_socket_create_connection',
'compat_str', 'compat_str',
'compat_struct_pack', 'compat_struct_pack',
'compat_struct_unpack', 'compat_struct_unpack',
'compat_subprocess_get_DEVNULL', 'compat_subprocess_get_DEVNULL',
'compat_subprocess_Popen', 'compat_subprocess_Popen',
'compat_thread',
'compat_tokenize_tokenize', 'compat_tokenize_tokenize',
'compat_urllib_error', 'compat_urllib_error',
'compat_urllib_parse', 'compat_urllib_parse',
@ -3570,5 +3897,5 @@ __all__ = [
'compat_xml_etree_register_namespace', 'compat_xml_etree_register_namespace',
'compat_xpath', 'compat_xpath',
'compat_zip', 'compat_zip',
'workaround_optparse_bug9161', 'compat_zstandard',
] ]

@ -11,6 +11,7 @@ from ..utils import (
decodeArgument, decodeArgument,
encodeFilename, encodeFilename,
error_to_compat_str, error_to_compat_str,
float_or_none,
format_bytes, format_bytes,
shell_quote, shell_quote,
timeconvert, timeconvert,
@ -367,14 +368,27 @@ class FileDownloader(object):
}) })
return True return True
min_sleep_interval = self.params.get('sleep_interval') min_sleep_interval, max_sleep_interval = (
if min_sleep_interval: float_or_none(self.params.get(interval), default=0)
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval) for interval in ('sleep_interval', 'max_sleep_interval'))
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
sleep_note = ''
available_at = info_dict.get('available_at')
if available_at:
forced_sleep_interval = available_at - int(time.time())
if forced_sleep_interval > min_sleep_interval:
sleep_note = 'as required by the site'
min_sleep_interval = forced_sleep_interval
if forced_sleep_interval > max_sleep_interval:
max_sleep_interval = forced_sleep_interval
sleep_interval = random.uniform(
min_sleep_interval, max_sleep_interval or min_sleep_interval)
if sleep_interval > 0:
self.to_screen( self.to_screen(
'[download] Sleeping %s seconds...' % ( '[download] Sleeping %.2f seconds %s...' % (
int(sleep_interval) if sleep_interval.is_integer() sleep_interval, sleep_note))
else '%.2f' % sleep_interval))
time.sleep(sleep_interval) time.sleep(sleep_interval)
return self.real_download(filename, info_dict) return self.real_download(filename, info_dict)

@ -32,7 +32,7 @@ class BokeCCBaseIE(InfoExtractor):
class BokeCCIE(BokeCCBaseIE): class BokeCCIE(BokeCCBaseIE):
_IE_DESC = 'CC视频' IE_DESC = 'CC视频'
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)' _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{ _TESTS = [{

@ -9,7 +9,7 @@ from ..utils import (
class CloudyIE(InfoExtractor): class CloudyIE(InfoExtractor):
_IE_DESC = 'cloudy.ec' IE_DESC = 'cloudy.ec'
_VALID_URL = r'https?://(?:www\.)?cloudy\.ec/(?:v/|embed\.php\?.*?\bid=)(?P<id>[A-Za-z0-9]+)' _VALID_URL = r'https?://(?:www\.)?cloudy\.ec/(?:v/|embed\.php\?.*?\bid=)(?P<id>[A-Za-z0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.cloudy.ec/v/af511e2527aac', 'url': 'https://www.cloudy.ec/v/af511e2527aac',

@ -214,6 +214,7 @@ class InfoExtractor(object):
width : height ratio as float. width : height ratio as float.
* no_resume The server does not support resuming the * no_resume The server does not support resuming the
(HTTP or RTMP) download. Boolean. (HTTP or RTMP) download. Boolean.
* available_at Unix timestamp of when a format will be available to download
* downloader_options A dictionary of downloader options as * downloader_options A dictionary of downloader options as
described in FileDownloader described in FileDownloader
@ -422,6 +423,8 @@ class InfoExtractor(object):
_GEO_COUNTRIES = None _GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None _GEO_IP_BLOCKS = None
_WORKING = True _WORKING = True
# supply this in public subclasses: used in supported sites list, etc
# IE_DESC = 'short description of IE'
def __init__(self, downloader=None): def __init__(self, downloader=None):
"""Constructor. Receives an optional downloader.""" """Constructor. Receives an optional downloader."""
@ -503,7 +506,7 @@ class InfoExtractor(object):
if not self._x_forwarded_for_ip: if not self._x_forwarded_for_ip:
# Geo bypass mechanism is explicitly disabled by user # Geo bypass mechanism is explicitly disabled by user
if not self._downloader.params.get('geo_bypass', True): if not self.get_param('geo_bypass', True):
return return
if not geo_bypass_context: if not geo_bypass_context:
@ -525,7 +528,7 @@ class InfoExtractor(object):
# Explicit IP block specified by user, use it right away # Explicit IP block specified by user, use it right away
# regardless of whether extractor is geo bypassable or not # regardless of whether extractor is geo bypassable or not
ip_block = self._downloader.params.get('geo_bypass_ip_block', None) ip_block = self.get_param('geo_bypass_ip_block', None)
# Otherwise use random IP block from geo bypass context but only # Otherwise use random IP block from geo bypass context but only
# if extractor is known as geo bypassable # if extractor is known as geo bypassable
@ -536,8 +539,8 @@ class InfoExtractor(object):
if ip_block: if ip_block:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(ip_block) self._x_forwarded_for_ip = GeoUtils.random_ipv4(ip_block)
if self._downloader.params.get('verbose', False): if self.get_param('verbose', False):
self._downloader.to_screen( self.to_screen(
'[debug] Using fake IP %s as X-Forwarded-For.' '[debug] Using fake IP %s as X-Forwarded-For.'
% self._x_forwarded_for_ip) % self._x_forwarded_for_ip)
return return
@ -546,7 +549,7 @@ class InfoExtractor(object):
# Explicit country code specified by user, use it right away # Explicit country code specified by user, use it right away
# regardless of whether extractor is geo bypassable or not # regardless of whether extractor is geo bypassable or not
country = self._downloader.params.get('geo_bypass_country', None) country = self.get_param('geo_bypass_country', None)
# Otherwise use random country code from geo bypass context but # Otherwise use random country code from geo bypass context but
# only if extractor is known as geo bypassable # only if extractor is known as geo bypassable
@ -557,8 +560,8 @@ class InfoExtractor(object):
if country: if country:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country) self._x_forwarded_for_ip = GeoUtils.random_ipv4(country)
if self._downloader.params.get('verbose', False): if self.get_param('verbose', False):
self._downloader.to_screen( self.to_screen(
'[debug] Using fake IP %s (%s) as X-Forwarded-For.' '[debug] Using fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country.upper())) % (self._x_forwarded_for_ip, country.upper()))
@ -584,9 +587,9 @@ class InfoExtractor(object):
raise ExtractorError('An extractor error has occurred.', cause=e) raise ExtractorError('An extractor error has occurred.', cause=e)
def __maybe_fake_ip_and_retry(self, countries): def __maybe_fake_ip_and_retry(self, countries):
if (not self._downloader.params.get('geo_bypass_country', None) if (not self.get_param('geo_bypass_country', None)
and self._GEO_BYPASS and self._GEO_BYPASS
and self._downloader.params.get('geo_bypass', True) and self.get_param('geo_bypass', True)
and not self._x_forwarded_for_ip and not self._x_forwarded_for_ip
and countries): and countries):
country_code = random.choice(countries) country_code = random.choice(countries)
@ -696,7 +699,7 @@ class InfoExtractor(object):
if fatal: if fatal:
raise ExtractorError(errmsg, sys.exc_info()[2], cause=err) raise ExtractorError(errmsg, sys.exc_info()[2], cause=err)
else: else:
self._downloader.report_warning(errmsg) self.report_warning(errmsg)
return False return False
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers={}, query={}, expected_status=None): def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers={}, query={}, expected_status=None):
@ -768,11 +771,11 @@ class InfoExtractor(object):
webpage_bytes = prefix + webpage_bytes webpage_bytes = prefix + webpage_bytes
if not encoding: if not encoding:
encoding = self._guess_encoding_from_content(content_type, webpage_bytes) encoding = self._guess_encoding_from_content(content_type, webpage_bytes)
if self._downloader.params.get('dump_intermediate_pages', False): if self.get_param('dump_intermediate_pages', False):
self.to_screen('Dumping request to ' + urlh.geturl()) self.to_screen('Dumping request to ' + urlh.geturl())
dump = base64.b64encode(webpage_bytes).decode('ascii') dump = base64.b64encode(webpage_bytes).decode('ascii')
self._downloader.to_screen(dump) self.to_screen(dump)
if self._downloader.params.get('write_pages', False): if self.get_param('write_pages', False):
basen = '%s_%s' % (video_id, urlh.geturl()) basen = '%s_%s' % (video_id, urlh.geturl())
if len(basen) > 240: if len(basen) > 240:
h = '___' + hashlib.md5(basen.encode('utf-8')).hexdigest() h = '___' + hashlib.md5(basen.encode('utf-8')).hexdigest()
@ -974,19 +977,9 @@ class InfoExtractor(object):
"""Print msg to screen, prefixing it with '[ie_name]'""" """Print msg to screen, prefixing it with '[ie_name]'"""
self._downloader.to_screen(self.__ie_msg(msg)) self._downloader.to_screen(self.__ie_msg(msg))
def write_debug(self, msg, only_once=False, _cache=[]): def write_debug(self, msg, only_once=False):
'''Log debug message or Print message to stderr''' '''Log debug message or Print message to stderr'''
if not self.get_param('verbose', False): self._downloader.write_debug(self.__ie_msg(msg), only_once=only_once)
return
message = '[debug] ' + self.__ie_msg(msg)
logger = self.get_param('logger')
if logger:
logger.debug(message)
else:
if only_once and hash(message) in _cache:
return
self._downloader.to_stderr(message)
_cache.append(hash(message))
# name, default=None, *args, **kwargs # name, default=None, *args, **kwargs
def get_param(self, name, *args, **kwargs): def get_param(self, name, *args, **kwargs):
@ -1082,7 +1075,7 @@ class InfoExtractor(object):
if mobj: if mobj:
break break
if not self._downloader.params.get('no_color') and compat_os_name != 'nt' and sys.stderr.isatty(): if not self.get_param('no_color') and compat_os_name != 'nt' and sys.stderr.isatty():
_name = '\033[0;34m%s\033[0m' % name _name = '\033[0;34m%s\033[0m' % name
else: else:
_name = name _name = name
@ -1100,7 +1093,7 @@ class InfoExtractor(object):
elif fatal: elif fatal:
raise RegexNotFoundError('Unable to extract %s' % _name) raise RegexNotFoundError('Unable to extract %s' % _name)
else: else:
self._downloader.report_warning('unable to extract %s' % _name + bug_reports_message()) self.report_warning('unable to extract %s' % _name + bug_reports_message())
return None return None
def _search_json(self, start_pattern, string, name, video_id, **kwargs): def _search_json(self, start_pattern, string, name, video_id, **kwargs):
@ -1170,7 +1163,7 @@ class InfoExtractor(object):
username = None username = None
password = None password = None
if self._downloader.params.get('usenetrc', False): if self.get_param('usenetrc', False):
try: try:
netrc_machine = netrc_machine or self._NETRC_MACHINE netrc_machine = netrc_machine or self._NETRC_MACHINE
info = netrc.netrc().authenticators(netrc_machine) info = netrc.netrc().authenticators(netrc_machine)
@ -1181,7 +1174,7 @@ class InfoExtractor(object):
raise netrc.NetrcParseError( raise netrc.NetrcParseError(
'No authenticators for %s' % netrc_machine) 'No authenticators for %s' % netrc_machine)
except (AttributeError, IOError, netrc.NetrcParseError) as err: except (AttributeError, IOError, netrc.NetrcParseError) as err:
self._downloader.report_warning( self.report_warning(
'parsing .netrc: %s' % error_to_compat_str(err)) 'parsing .netrc: %s' % error_to_compat_str(err))
return username, password return username, password
@ -1218,10 +1211,10 @@ class InfoExtractor(object):
""" """
if self._downloader is None: if self._downloader is None:
return None return None
downloader_params = self._downloader.params
if downloader_params.get('twofactor') is not None: twofactor = self.get_param('twofactor')
return downloader_params['twofactor'] if twofactor is not None:
return twofactor
return compat_getpass('Type %s and press [Return]: ' % note) return compat_getpass('Type %s and press [Return]: ' % note)
@ -1356,7 +1349,7 @@ class InfoExtractor(object):
elif fatal: elif fatal:
raise RegexNotFoundError('Unable to extract JSON-LD') raise RegexNotFoundError('Unable to extract JSON-LD')
else: else:
self._downloader.report_warning('unable to extract JSON-LD %s' % bug_reports_message()) self.report_warning('unable to extract JSON-LD %s' % bug_reports_message())
return {} return {}
def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None): def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None):
@ -1587,7 +1580,7 @@ class InfoExtractor(object):
if f.get('vcodec') == 'none': # audio only if f.get('vcodec') == 'none': # audio only
preference -= 50 preference -= 50
if self._downloader.params.get('prefer_free_formats'): if self.get_param('prefer_free_formats'):
ORDER = ['aac', 'mp3', 'm4a', 'webm', 'ogg', 'opus'] ORDER = ['aac', 'mp3', 'm4a', 'webm', 'ogg', 'opus']
else: else:
ORDER = ['webm', 'opus', 'ogg', 'mp3', 'aac', 'm4a'] ORDER = ['webm', 'opus', 'ogg', 'mp3', 'aac', 'm4a']
@ -1599,7 +1592,7 @@ class InfoExtractor(object):
else: else:
if f.get('acodec') == 'none': # video only if f.get('acodec') == 'none': # video only
preference -= 40 preference -= 40
if self._downloader.params.get('prefer_free_formats'): if self.get_param('prefer_free_formats'):
ORDER = ['flv', 'mp4', 'webm'] ORDER = ['flv', 'mp4', 'webm']
else: else:
ORDER = ['webm', 'flv', 'mp4'] ORDER = ['webm', 'flv', 'mp4']
@ -1665,7 +1658,7 @@ class InfoExtractor(object):
""" Either "http:" or "https:", depending on the user's preferences """ """ Either "http:" or "https:", depending on the user's preferences """
return ( return (
'http:' 'http:'
if self._downloader.params.get('prefer_insecure', False) if self.get_param('prefer_insecure', False)
else 'https:') else 'https:')
def _proto_relative_url(self, url, scheme=None): def _proto_relative_url(self, url, scheme=None):
@ -3170,7 +3163,7 @@ class InfoExtractor(object):
# See com/longtailvideo/jwplayer/media/RTMPMediaProvider.as # See com/longtailvideo/jwplayer/media/RTMPMediaProvider.as
# of jwplayer.flash.swf # of jwplayer.flash.swf
rtmp_url_parts = re.split( rtmp_url_parts = re.split(
r'((?:mp4|mp3|flv):)', source_url, 1) r'((?:mp4|mp3|flv):)', source_url, maxsplit=1)
if len(rtmp_url_parts) == 3: if len(rtmp_url_parts) == 3:
rtmp_url, prefix, play_path = rtmp_url_parts rtmp_url, prefix, play_path = rtmp_url_parts
a_format.update({ a_format.update({
@ -3197,7 +3190,7 @@ class InfoExtractor(object):
if fatal: if fatal:
raise ExtractorError(msg) raise ExtractorError(msg)
else: else:
self._downloader.report_warning(msg) self.report_warning(msg)
return res return res
def _float(self, v, name, fatal=False, **kwargs): def _float(self, v, name, fatal=False, **kwargs):
@ -3207,7 +3200,7 @@ class InfoExtractor(object):
if fatal: if fatal:
raise ExtractorError(msg) raise ExtractorError(msg)
else: else:
self._downloader.report_warning(msg) self.report_warning(msg)
return res return res
def _set_cookie(self, domain, name, value, expire_time=None, port=None, def _set_cookie(self, domain, name, value, expire_time=None, port=None,
@ -3216,12 +3209,12 @@ class InfoExtractor(object):
0, name, value, port, port is not None, domain, True, 0, name, value, port, port is not None, domain, True,
domain.startswith('.'), path, True, secure, expire_time, domain.startswith('.'), path, True, secure, expire_time,
discard, None, None, rest) discard, None, None, rest)
self._downloader.cookiejar.set_cookie(cookie) self.cookiejar.set_cookie(cookie)
def _get_cookies(self, url): def _get_cookies(self, url):
""" Return a compat_cookies_SimpleCookie with the cookies for the url """ """ Return a compat_cookies_SimpleCookie with the cookies for the url """
req = sanitized_Request(url) req = sanitized_Request(url)
self._downloader.cookiejar.add_cookie_header(req) self.cookiejar.add_cookie_header(req)
return compat_cookies_SimpleCookie(req.get_header('Cookie')) return compat_cookies_SimpleCookie(req.get_header('Cookie'))
def _apply_first_set_cookie_header(self, url_handle, cookie): def _apply_first_set_cookie_header(self, url_handle, cookie):
@ -3281,8 +3274,8 @@ class InfoExtractor(object):
return not any_restricted return not any_restricted
def extract_subtitles(self, *args, **kwargs): def extract_subtitles(self, *args, **kwargs):
if (self._downloader.params.get('writesubtitles', False) if (self.get_param('writesubtitles', False)
or self._downloader.params.get('listsubtitles')): or self.get_param('listsubtitles')):
return self._get_subtitles(*args, **kwargs) return self._get_subtitles(*args, **kwargs)
return {} return {}
@ -3303,7 +3296,11 @@ class InfoExtractor(object):
""" Merge subtitle dictionaries, language by language. """ """ Merge subtitle dictionaries, language by language. """
# ..., * , target=None # ..., * , target=None
target = kwargs.get('target') or dict(subtitle_dict1) target = kwargs.get('target')
if target is None:
target = dict(subtitle_dict1)
else:
subtitle_dicts = (subtitle_dict1,) + subtitle_dicts
for subtitle_dict in subtitle_dicts: for subtitle_dict in subtitle_dicts:
for lang in subtitle_dict: for lang in subtitle_dict:
@ -3311,8 +3308,8 @@ class InfoExtractor(object):
return target return target
def extract_automatic_captions(self, *args, **kwargs): def extract_automatic_captions(self, *args, **kwargs):
if (self._downloader.params.get('writeautomaticsub', False) if (self.get_param('writeautomaticsub', False)
or self._downloader.params.get('listsubtitles')): or self.get_param('listsubtitles')):
return self._get_automatic_captions(*args, **kwargs) return self._get_automatic_captions(*args, **kwargs)
return {} return {}
@ -3320,9 +3317,9 @@ class InfoExtractor(object):
raise NotImplementedError('This method must be implemented by subclasses') raise NotImplementedError('This method must be implemented by subclasses')
def mark_watched(self, *args, **kwargs): def mark_watched(self, *args, **kwargs):
if (self._downloader.params.get('mark_watched', False) if (self.get_param('mark_watched', False)
and (self._get_login_info()[0] is not None and (self._get_login_info()[0] is not None
or self._downloader.params.get('cookiefile') is not None)): or self.get_param('cookiefile') is not None)):
self._mark_watched(*args, **kwargs) self._mark_watched(*args, **kwargs)
def _mark_watched(self, *args, **kwargs): def _mark_watched(self, *args, **kwargs):
@ -3330,7 +3327,7 @@ class InfoExtractor(object):
def geo_verification_headers(self): def geo_verification_headers(self):
headers = {} headers = {}
geo_verification_proxy = self._downloader.params.get('geo_verification_proxy') geo_verification_proxy = self.get_param('geo_verification_proxy')
if geo_verification_proxy: if geo_verification_proxy:
headers['Ytdl-request-proxy'] = geo_verification_proxy headers['Ytdl-request-proxy'] = geo_verification_proxy
return headers return headers

@ -35,15 +35,6 @@ from ..utils import (
class ITVBaseIE(InfoExtractor): class ITVBaseIE(InfoExtractor):
def _search_nextjs_data(self, webpage, video_id, **kw):
transform_source = kw.pop('transform_source', None)
fatal = kw.pop('fatal', True)
return self._parse_json(
self._search_regex(
r'''<script\b[^>]+\bid=('|")__NEXT_DATA__\1[^>]*>(?P<js>[^<]+)</script>''',
webpage, 'next.js data', group='js', fatal=fatal, **kw),
video_id, transform_source=transform_source, fatal=fatal)
def __handle_request_webpage_error(self, err, video_id=None, errnote=None, fatal=True): def __handle_request_webpage_error(self, err, video_id=None, errnote=None, fatal=True):
if errnote is False: if errnote is False:
return False return False
@ -109,7 +100,9 @@ class ITVBaseIE(InfoExtractor):
class ITVIE(ITVBaseIE): class ITVIE(ITVBaseIE):
_VALID_URL = r'https?://(?:www\.)?itv\.com/(?:(?P<w>watch)|hub)/[^/]+/(?(w)[\w-]+/)(?P<id>\w+)' _VALID_URL = r'https?://(?:www\.)?itv\.com/(?:(?P<w>watch)|hub)/[^/]+/(?(w)[\w-]+/)(?P<id>\w+)'
_IE_DESC = 'ITVX' IE_DESC = 'ITVX'
_WORKING = False
_TESTS = [{ _TESTS = [{
'note': 'Hub URLs redirect to ITVX', 'note': 'Hub URLs redirect to ITVX',
'url': 'https://www.itv.com/hub/liar/2a4547a0012', 'url': 'https://www.itv.com/hub/liar/2a4547a0012',
@ -270,7 +263,7 @@ class ITVIE(ITVBaseIE):
'ext': determine_ext(href, 'vtt'), 'ext': determine_ext(href, 'vtt'),
}) })
next_data = self._search_nextjs_data(webpage, video_id, fatal=False, default='{}') next_data = self._search_nextjs_data(webpage, video_id, fatal=False, default={})
video_data.update(traverse_obj(next_data, ('props', 'pageProps', ('title', 'episode')), expected_type=dict)[0] or {}) video_data.update(traverse_obj(next_data, ('props', 'pageProps', ('title', 'episode')), expected_type=dict)[0] or {})
title = traverse_obj(video_data, 'headerTitle', 'episodeTitle') title = traverse_obj(video_data, 'headerTitle', 'episodeTitle')
info = self._og_extract(webpage, require_title=not title) info = self._og_extract(webpage, require_title=not title)
@ -323,7 +316,7 @@ class ITVIE(ITVBaseIE):
class ITVBTCCIE(ITVBaseIE): class ITVBTCCIE(ITVBaseIE):
_VALID_URL = r'https?://(?:www\.)?itv\.com/(?!(?:watch|hub)/)(?:[^/]+/)+(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?itv\.com/(?!(?:watch|hub)/)(?:[^/]+/)+(?P<id>[^/?#&]+)'
_IE_DESC = 'ITV articles: News, British Touring Car Championship' IE_DESC = 'ITV articles: News, British Touring Car Championship'
_TESTS = [{ _TESTS = [{
'note': 'British Touring Car Championship', 'note': 'British Touring Car Championship',
'url': 'https://www.itv.com/btcc/articles/btcc-2018-all-the-action-from-brands-hatch', 'url': 'https://www.itv.com/btcc/articles/btcc-2018-all-the-action-from-brands-hatch',

@ -47,7 +47,7 @@ class SenateISVPIE(InfoExtractor):
['vetaff', '76462', 'http://vetaff-f.akamaihd.net'], ['vetaff', '76462', 'http://vetaff-f.akamaihd.net'],
['arch', '', 'http://ussenate-f.akamaihd.net/'] ['arch', '', 'http://ussenate-f.akamaihd.net/']
] ]
_IE_NAME = 'senate.gov' IE_NAME = 'senate.gov'
_VALID_URL = r'https?://(?:www\.)?senate\.gov/isvp/?\?(?P<qs>.+)' _VALID_URL = r'https?://(?:www\.)?senate\.gov/isvp/?\?(?P<qs>.+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.senate.gov/isvp/?comm=judiciary&type=live&stt=&filename=judiciary031715&auto_play=false&wmode=transparent&poster=http%3A%2F%2Fwww.judiciary.senate.gov%2Fthemes%2Fjudiciary%2Fimages%2Fvideo-poster-flash-fit.png', 'url': 'http://www.senate.gov/isvp/?comm=judiciary&type=live&stt=&filename=judiciary031715&auto_play=false&wmode=transparent&poster=http%3A%2F%2Fwww.judiciary.senate.gov%2Fthemes%2Fjudiciary%2Fimages%2Fvideo-poster-flash-fit.png',

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

@ -404,6 +404,10 @@ def parseOpts(overrideArguments=None):
'-F', '--list-formats', '-F', '--list-formats',
action='store_true', dest='listformats', action='store_true', dest='listformats',
help='List all available formats of requested videos') help='List all available formats of requested videos')
video_format.add_option(
'--no-list-formats',
action='store_false', dest='listformats',
help='Do not list available formats of requested videos (default)')
video_format.add_option( video_format.add_option(
'--youtube-include-dash-manifest', '--youtube-include-dash-manifest',
action='store_true', dest='youtube_include_dash_manifest', default=True, action='store_true', dest='youtube_include_dash_manifest', default=True,
@ -412,6 +416,17 @@ def parseOpts(overrideArguments=None):
'--youtube-skip-dash-manifest', '--youtube-skip-dash-manifest',
action='store_false', dest='youtube_include_dash_manifest', action='store_false', dest='youtube_include_dash_manifest',
help='Do not download the DASH manifests and related data on YouTube videos') help='Do not download the DASH manifests and related data on YouTube videos')
video_format.add_option(
'--youtube-player-js-variant',
action='store', dest='youtube_player_js_variant',
help='For YouTube, the player javascript variant to use for n/sig deciphering; `actual` to follow the site; default `%default`.',
choices=('actual', 'main', 'tcc', 'tce', 'es5', 'es6', 'tv', 'tv_es6', 'phone', 'tablet'),
default='actual', metavar='VARIANT')
video_format.add_option(
'--youtube-player-js-version',
action='store', dest='youtube_player_js_version',
help='For YouTube, the player javascript version to use for n/sig deciphering, specified as `signature_timestamp@hash`, or `actual` to follow the site; default `%default`',
default='actual', metavar='STS@HASH')
video_format.add_option( video_format.add_option(
'--merge-output-format', '--merge-output-format',
action='store', dest='merge_output_format', metavar='FORMAT', default=None, action='store', dest='merge_output_format', metavar='FORMAT', default=None,

@ -5,6 +5,10 @@
from .utils import ( from .utils import (
dict_get, dict_get,
get_first, get_first,
require,
subs_list_to_dict,
T, T,
traverse_obj, traverse_obj,
unpack,
value,
) )

@ -53,6 +53,8 @@ from .compat import (
compat_etree_fromstring, compat_etree_fromstring,
compat_etree_iterfind, compat_etree_iterfind,
compat_expanduser, compat_expanduser,
compat_filter as filter,
compat_filter_fns,
compat_html_entities, compat_html_entities,
compat_html_entities_html5, compat_html_entities_html5,
compat_http_client, compat_http_client,
@ -1859,6 +1861,39 @@ def write_json_file(obj, fn):
raise raise
class partial_application(object):
"""Allow a function to use pre-set argument values"""
# see _try_bind_args()
try:
inspect.signature
@staticmethod
def required_args(fn):
return [
param.name for param in inspect.signature(fn).parameters.values()
if (param.kind in (inspect.Parameter.POSITIONAL_ONLY, inspect.Parameter.POSITIONAL_OR_KEYWORD)
and param.default is inspect.Parameter.empty)]
except AttributeError:
# Py < 3.3
@staticmethod
def required_args(fn):
fn_args = inspect.getargspec(fn)
n_defaults = len(fn_args.defaults or [])
return (fn_args.args or [])[:-n_defaults if n_defaults > 0 else None]
def __new__(cls, func):
@functools.wraps(func)
def wrapped(*args, **kwargs):
if set(cls.required_args(func)[len(args):]).difference(kwargs):
return functools.partial(func, *args, **kwargs)
return func(*args, **kwargs)
return wrapped
if sys.version_info >= (2, 7): if sys.version_info >= (2, 7):
def find_xpath_attr(node, xpath, key, val=None): def find_xpath_attr(node, xpath, key, val=None):
""" Find the xpath xpath[@key=val] """ """ Find the xpath xpath[@key=val] """
@ -3152,6 +3187,7 @@ def extract_timezone(date_str):
return timezone, date_str return timezone, date_str
@partial_application
def parse_iso8601(date_str, delimiter='T', timezone=None): def parse_iso8601(date_str, delimiter='T', timezone=None):
""" Return a UNIX timestamp from the given date """ """ Return a UNIX timestamp from the given date """
@ -3229,6 +3265,7 @@ def unified_timestamp(date_str, day_first=True):
return calendar.timegm(timetuple) + pm_delta * 3600 - compat_datetime_timedelta_total_seconds(timezone) return calendar.timegm(timetuple) + pm_delta * 3600 - compat_datetime_timedelta_total_seconds(timezone)
@partial_application
def determine_ext(url, default_ext='unknown_video'): def determine_ext(url, default_ext='unknown_video'):
if url is None or '.' not in url: if url is None or '.' not in url:
return default_ext return default_ext
@ -3807,6 +3844,7 @@ def base_url(url):
return re.match(r'https?://[^?#&]+/', url).group() return re.match(r'https?://[^?#&]+/', url).group()
@partial_application
def urljoin(base, path): def urljoin(base, path):
path = _decode_compat_str(path, encoding='utf-8', or_none=True) path = _decode_compat_str(path, encoding='utf-8', or_none=True)
if not path: if not path:
@ -3831,6 +3869,7 @@ class PUTRequest(compat_urllib_request.Request):
return 'PUT' return 'PUT'
@partial_application
def int_or_none(v, scale=1, default=None, get_attr=None, invscale=1, base=None): def int_or_none(v, scale=1, default=None, get_attr=None, invscale=1, base=None):
if get_attr: if get_attr:
if v is not None: if v is not None:
@ -3857,6 +3896,7 @@ def str_to_int(int_str):
return int_or_none(int_str) return int_or_none(int_str)
@partial_application
def float_or_none(v, scale=1, invscale=1, default=None): def float_or_none(v, scale=1, invscale=1, default=None):
if v is None: if v is None:
return default return default
@ -3891,38 +3931,46 @@ def parse_duration(s):
return None return None
s = s.strip() s = s.strip()
if not s:
return None
days, hours, mins, secs, ms = [None] * 5 days, hours, mins, secs, ms = [None] * 5
m = re.match(r'(?:(?:(?:(?P<days>[0-9]+):)?(?P<hours>[0-9]+):)?(?P<mins>[0-9]+):)?(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?Z?$', s) m = re.match(r'''(?x)
(?P<before_secs>
(?:(?:(?P<days>[0-9]+):)?(?P<hours>[0-9]+):)?
(?P<mins>[0-9]+):)?
(?P<secs>(?(before_secs)[0-9]{1,2}|[0-9]+))
(?:[.:](?P<ms>[0-9]+))?Z?$
''', s)
if m: if m:
days, hours, mins, secs, ms = m.groups() days, hours, mins, secs, ms = m.group('days', 'hours', 'mins', 'secs', 'ms')
else: else:
m = re.match( m = re.match(
r'''(?ix)(?:P? r'''(?ix)(?:P?
(?: (?:
[0-9]+\s*y(?:ears?)?\s* [0-9]+\s*y(?:ears?)?,?\s*
)? )?
(?: (?:
[0-9]+\s*m(?:onths?)?\s* [0-9]+\s*m(?:onths?)?,?\s*
)? )?
(?: (?:
[0-9]+\s*w(?:eeks?)?\s* [0-9]+\s*w(?:eeks?)?,?\s*
)? )?
(?: (?:
(?P<days>[0-9]+)\s*d(?:ays?)?\s* (?P<days>[0-9]+)\s*d(?:ays?)?,?\s*
)? )?
T)? T)?
(?: (?:
(?P<hours>[0-9]+)\s*h(?:ours?)?\s* (?P<hours>[0-9]+)\s*h(?:(?:ou)?rs?)?,?\s*
)? )?
(?: (?:
(?P<mins>[0-9]+)\s*m(?:in(?:ute)?s?)?\s* (?P<mins>[0-9]+)\s*m(?:in(?:ute)?s?)?,?\s*
)? )?
(?: (?:
(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*s(?:ec(?:ond)?s?)?\s* (?P<secs>[0-9]+)(?:\.(?P<ms>[0-9]+))?\s*s(?:ec(?:ond)?s?)?\s*
)?Z?$''', s) )?Z?$''', s)
if m: if m:
days, hours, mins, secs, ms = m.groups() days, hours, mins, secs, ms = m.group('days', 'hours', 'mins', 'secs', 'ms')
else: else:
m = re.match(r'(?i)(?:(?P<hours>[0-9.]+)\s*(?:hours?)|(?P<mins>[0-9.]+)\s*(?:mins?\.?|minutes?)\s*)Z?$', s) m = re.match(r'(?i)(?:(?P<hours>[0-9.]+)\s*(?:hours?)|(?P<mins>[0-9.]+)\s*(?:mins?\.?|minutes?)\s*)Z?$', s)
if m: if m:
@ -3930,17 +3978,13 @@ def parse_duration(s):
else: else:
return None return None
duration = 0 duration = (
if secs: ((((float(days) * 24) if days else 0)
duration += float(secs) + (float(hours) if hours else 0)) * 60
if mins: + (float(mins) if mins else 0)) * 60
duration += float(mins) * 60 + (float(secs) if secs else 0)
if hours: + (float(ms) / 10 ** len(ms) if ms else 0))
duration += float(hours) * 60 * 60
if days:
duration += float(days) * 24 * 60 * 60
if ms:
duration += float(ms)
return duration return duration
@ -4204,12 +4248,16 @@ def lowercase_escape(s):
s) s)
def escape_rfc3986(s): def escape_rfc3986(s, safe=None):
"""Escape non-ASCII characters as suggested by RFC 3986""" """Escape non-ASCII characters as suggested by RFC 3986"""
if sys.version_info < (3, 0): if sys.version_info < (3, 0):
s = _encode_compat_str(s, 'utf-8') s = _encode_compat_str(s, 'utf-8')
if safe is not None:
safe = _encode_compat_str(safe, 'utf-8')
if safe is None:
safe = b"%/;:@&=+$,!~*'()?#[]"
# ensure unicode: after quoting, it can always be converted # ensure unicode: after quoting, it can always be converted
return compat_str(compat_urllib_parse.quote(s, b"%/;:@&=+$,!~*'()?#[]")) return compat_str(compat_urllib_parse.quote(s, safe))
def escape_url(url): def escape_url(url):
@ -4247,6 +4295,7 @@ def urlencode_postdata(*args, **kargs):
return compat_urllib_parse_urlencode(*args, **kargs).encode('ascii') return compat_urllib_parse_urlencode(*args, **kargs).encode('ascii')
@partial_application
def update_url(url, **kwargs): def update_url(url, **kwargs):
"""Replace URL components specified by kwargs """Replace URL components specified by kwargs
url: compat_str or parsed URL tuple url: compat_str or parsed URL tuple
@ -4268,6 +4317,7 @@ def update_url(url, **kwargs):
return compat_urllib_parse.urlunparse(url._replace(**kwargs)) return compat_urllib_parse.urlunparse(url._replace(**kwargs))
@partial_application
def update_url_query(url, query): def update_url_query(url, query):
return update_url(url, query_update=query) return update_url(url, query_update=query)
@ -4694,30 +4744,45 @@ def parse_codecs(codecs_str):
if not codecs_str: if not codecs_str:
return {} return {}
split_codecs = list(filter(None, map( split_codecs = list(filter(None, map(
lambda str: str.strip(), codecs_str.strip().strip(',').split(',')))) lambda s: s.strip(), codecs_str.strip().split(','))))
vcodec, acodec = None, None vcodec, acodec, hdr = None, None, None
for full_codec in split_codecs: for full_codec in split_codecs:
codec = full_codec.split('.')[0] codec, rest = full_codec.partition('.')[::2]
if codec in ('avc1', 'avc2', 'avc3', 'avc4', 'vp9', 'vp8', 'hev1', 'hev2', 'h263', 'h264', 'mp4v', 'hvc1', 'av01', 'theora'): codec = codec.lower()
if not vcodec: full_codec = '.'.join((codec, rest)) if rest else codec
vcodec = full_codec codec = re.sub(r'0+(?=\d)', '', codec)
elif codec in ('mp4a', 'opus', 'vorbis', 'mp3', 'aac', 'ac-3', 'ec-3', 'eac3', 'dtsc', 'dtse', 'dtsh', 'dtsl'): if codec in ('avc1', 'avc2', 'avc3', 'avc4', 'vp9', 'vp8', 'hev1', 'hev2',
'h263', 'h264', 'mp4v', 'hvc1', 'av1', 'theora', 'dvh1', 'dvhe'):
if vcodec:
continue
vcodec = full_codec
if codec in ('dvh1', 'dvhe'):
hdr = 'DV'
elif codec in ('av1', 'vp9'):
n, m = {
'av1': (2, '10'),
'vp9': (0, '2'),
}[codec]
if (rest.split('.', n + 1)[n:] or [''])[0].lstrip('0') == m:
hdr = 'HDR10'
elif codec in ('flac', 'mp4a', 'opus', 'vorbis', 'mp3', 'aac', 'ac-4',
'ac-3', 'ec-3', 'eac3', 'dtsc', 'dtse', 'dtsh', 'dtsl'):
if not acodec: if not acodec:
acodec = full_codec acodec = full_codec
else: else:
write_string('WARNING: Unknown codec %s\n' % full_codec, sys.stderr) write_string('WARNING: Unknown codec %s\n' % (full_codec,), sys.stderr)
if not vcodec and not acodec:
if len(split_codecs) == 2: return (
return { filter_dict({
'vcodec': split_codecs[0],
'acodec': split_codecs[1],
}
else:
return {
'vcodec': vcodec or 'none', 'vcodec': vcodec or 'none',
'acodec': acodec or 'none', 'acodec': acodec or 'none',
} 'dynamic_range': hdr,
return {} }) if vcodec or acodec
else {
'vcodec': split_codecs[0],
'acodec': split_codecs[1],
} if len(split_codecs) == 2
else {})
def urlhandle_detect_ext(url_handle): def urlhandle_detect_ext(url_handle):
@ -6279,6 +6344,7 @@ def traverse_obj(obj, *paths, **kwargs):
Read as: `{key: traverse_obj(obj, path) for key, path in dct.items()}`. Read as: `{key: traverse_obj(obj, path) for key, path in dct.items()}`.
- `any`-builtin: Take the first matching object and return it, resetting branching. - `any`-builtin: Take the first matching object and return it, resetting branching.
- `all`-builtin: Take all matching objects and return them as a list, resetting branching. - `all`-builtin: Take all matching objects and return them as a list, resetting branching.
- `filter`-builtin: Return the value if it is truthy, `None` otherwise.
`tuple`, `list`, and `dict` all support nested paths and branches. `tuple`, `list`, and `dict` all support nested paths and branches.
@ -6320,6 +6386,11 @@ def traverse_obj(obj, *paths, **kwargs):
# instant compat # instant compat
str = compat_str str = compat_str
from .compat import (
compat_builtins_dict as dict_, # the basic dict type
compat_dict as dict, # dict preserving imsertion order
)
casefold = lambda k: compat_casefold(k) if isinstance(k, str) else k casefold = lambda k: compat_casefold(k) if isinstance(k, str) else k
if isinstance(expected_type, type): if isinstance(expected_type, type):
@ -6402,7 +6473,7 @@ def traverse_obj(obj, *paths, **kwargs):
if not branching: # string traversal if not branching: # string traversal
result = ''.join(result) result = ''.join(result)
elif isinstance(key, dict): elif isinstance(key, dict_):
iter_obj = ((k, _traverse_obj(obj, v, False, is_last)) for k, v in key.items()) iter_obj = ((k, _traverse_obj(obj, v, False, is_last)) for k, v in key.items())
result = dict((k, v if v is not None else default) for k, v in iter_obj result = dict((k, v if v is not None else default) for k, v in iter_obj
if v is not None or default is not NO_DEFAULT) or None if v is not None or default is not NO_DEFAULT) or None
@ -6480,7 +6551,7 @@ def traverse_obj(obj, *paths, **kwargs):
has_branched = False has_branched = False
key = None key = None
for last, key in lazy_last(variadic(path, (str, bytes, dict, set))): for last, key in lazy_last(variadic(path, (str, bytes, dict_, set))):
if not casesense and isinstance(key, str): if not casesense and isinstance(key, str):
key = compat_casefold(key) key = compat_casefold(key)
@ -6493,6 +6564,11 @@ def traverse_obj(obj, *paths, **kwargs):
objs = (list(filtered_objs),) objs = (list(filtered_objs),)
continue continue
# filter might be from __builtin__, future_builtins, or itertools.ifilter
if key in compat_filter_fns:
objs = filter(None, objs)
continue
if __debug__ and callable(key): if __debug__ and callable(key):
# Verify function signature # Verify function signature
_try_bind_args(key, None, None) _try_bind_args(key, None, None)
@ -6505,10 +6581,10 @@ def traverse_obj(obj, *paths, **kwargs):
objs = from_iterable(new_objs) objs = from_iterable(new_objs)
if test_type and not isinstance(key, (dict, list, tuple)): if test_type and not isinstance(key, (dict_, list, tuple)):
objs = map(type_test, objs) objs = map(type_test, objs)
return objs, has_branched, isinstance(key, dict) return objs, has_branched, isinstance(key, dict_)
def _traverse_obj(obj, path, allow_empty, test_type): def _traverse_obj(obj, path, allow_empty, test_type):
results, has_branched, is_dict = apply_path(obj, path, test_type) results, has_branched, is_dict = apply_path(obj, path, test_type)
@ -6531,6 +6607,76 @@ def traverse_obj(obj, *paths, **kwargs):
return None if default is NO_DEFAULT else default return None if default is NO_DEFAULT else default
def value(value):
return lambda _: value
class require(ExtractorError):
def __init__(self, name, expected=False):
super(require, self).__init__(
'Unable to extract {0}'.format(name), expected=expected)
def __call__(self, value):
if value is None:
raise self
return value
@partial_application
# typing: (subs: list[dict], /, *, lang='und', ext=None) -> dict[str, list[dict]
def subs_list_to_dict(subs, lang='und', ext=None):
"""
Convert subtitles from a traversal into a subtitle dict.
The path should have an `all` immediately before this function.
Arguments:
`lang` The default language tag for subtitle dicts with no
`lang` (`und`: undefined)
`ext` The default value for `ext` in the subtitle dicts
In the dict you can set the following additional items:
`id` The language tag to which the subtitle dict should be added
`quality` The sort order for each subtitle dict
"""
result = collections.defaultdict(list)
for sub in subs:
tn_url = url_or_none(sub.pop('url', None))
if tn_url:
sub['url'] = tn_url
elif not sub.get('data'):
continue
sub_lang = sub.pop('id', None)
if not isinstance(sub_lang, compat_str):
if not lang:
continue
sub_lang = lang
sub_ext = sub.get('ext')
if not isinstance(sub_ext, compat_str):
if not ext:
sub.pop('ext', None)
else:
sub['ext'] = ext
result[sub_lang].append(sub)
result = dict(result)
for subs in result.values():
subs.sort(key=lambda x: x.pop('quality', 0) or 0)
return result
def unpack(func, **kwargs):
"""Make a function that applies `partial(func, **kwargs)` to its argument as *args"""
@functools.wraps(func)
def inner(items):
return func(*items, **kwargs)
return inner
def T(*x): def T(*x):
""" For use in yt-dl instead of {type, ...} or set((type, ...)) """ """ For use in yt-dl instead of {type, ...} or set((type, ...)) """
return set(x) return set(x)

@ -1,3 +1,3 @@
from __future__ import unicode_literals from __future__ import unicode_literals
__version__ = '2021.12.17' __version__ = '2025.04.07'

Loading…
Cancel
Save