Compare commits


121 Commits

Author SHA1 Message Date
dirkf 956b8c5855 [YouTube] Bug-fix for `c1f5c3274a` 1 week ago
dirkf d5f561166b [core] Re-work format_note display in format list with abbreviated codec name 1 week ago
dirkf d0283f5385 [YouTube] Revert forcing player JS by default
* still leaving the parameters in place

thx bashonly for confirming this suggestion
2 weeks ago
dirkf 6315f4b1df [utils] Support additional codecs and dynamic_range 2 weeks ago
dirkf aeb1254fcf [YouTube] Fix playlist thumbnail extraction
Thx seproDev, yt-dlp/yt-dlp#11615
2 weeks ago
dirkf 25890f2ad1 [YouTube] Improve detection of geo-restriction
Thx yt-dlp
2 weeks ago
dirkf d65882a022 [YouTube] Improve mark_watched()
Thx: Brett824, yt-dlp/yt-dlp#4146
2 weeks ago
dirkf 39378f7b5c [YouTube] Fix incorrect chapter extraction
* align `_get_text()` with yt-dlp (thx, passim) at last
2 weeks ago
dirkf 6f5d4c3289 [YouTube] Improve targeting of pre-roll wait
Experimental for now.
Thx: yt-dlp/yt-dlp#14646
2 weeks ago
dirkf 5d445f8c5f [YouTube] Re-work client selection
* use `android_sdkless` by default
* use `web_safari` (HLS only) if logged in
* skip any non-HLS format with n-challenge
2 weeks ago
dirkf a1e2c7d90b [YouTube] Add further InnerTube clients
FWIW: android-sdkless, tv_downgraded, web_creator
Thx yt-dlp passim
2 weeks ago
dirkf c55ace3c50 [YouTube] Use insertion-order-preserving dict for InnerTube client data 2 weeks ago
dirkf 43e3121020 [utils] Align `parse_duration()` behaviour with yt-dlp
* handle comma-separated long-form durations
* support : as millisecond separator.
2 weeks ago
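The long-form handling described above can be sketched as follows (an illustrative reconstruction only; `parse_long_duration` is a hypothetical helper, not the actual `parse_duration()` code):

```python
import re

def parse_long_duration(s):
    # Hypothetical sketch of comma-separated long-form duration parsing,
    # e.g. "1 hour, 2 minutes, 3 seconds" -> 3723.0 seconds.
    units = {'day': 86400, 'hour': 3600, 'minute': 60, 'second': 1}
    total = 0.0
    for num, unit in re.findall(
            r'(\d+(?:\.\d+)?)\s*(day|hour|minute|second)s?', s):
        total += float(num) * units[unit]
    return total
```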
dirkf 7a488f7fae [utils] Stabilise traversal results using `compat_dict`
In `traverse_obj()`, use `compat_dict` to construct dicts,
ensuring insertion order, but `compat_builtin_dict`
to test for dict-iness...
2 weeks ago
dirkf 5585d76da6 [compat] Add `compat_dict`
A dict that preserves insertion order and otherwise resembles the
dict builtin (if it isn't it) rather than `collections.OrderedDict`.

Also:
* compat_builtins_dict: the built-in definition in case `compat_dict`
  was imported as `dict`
* compat_dict_items: use instead of `dict.items` to get items from
  a `compat_dict` in insertion order, if you didn't define `dict` as
  `compat_dict`.
2 weeks ago
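The intent can be sketched like this (an illustrative reconstruction of the idea, not the actual `compat.py` code):

```python
import sys
from collections import OrderedDict

# The builtin dict preserves insertion order from CPython 3.7 (guaranteed;
# 3.6 as an implementation detail); older interpreters need OrderedDict.
if sys.version_info >= (3, 7):
    compat_dict = dict
else:
    compat_dict = OrderedDict

compat_builtins_dict = dict  # the builtin, in case compat_dict was imported as dict

def compat_dict_items(d):
    # items of a compat_dict in insertion order
    return list(d.items())
```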
dirkf 931e15621c [compat] Add `compat_abc_ABC`
Base class for abstract classes
2 weeks ago
dirkf 27867cc814 [compat] Add `compat_thread` 2 weeks ago
dirkf 70b40dd1ef [utils] Add `subs_list_to_dict()` traversal helper
Thx: yt-dlp/yt-dlp#10653, etc
2 weeks ago
dirkf a9b4649d92 [utils] Apply `partial_application` decorator to existing functions
Thx: yt-dlp/yt-dlp#10653 (etc)
2 weeks ago
dirkf 23a848c314 [utils] Add `partial_application` decorator function
Thx: yt-dlp/yt-dlp#10653
2 weeks ago
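A minimal sketch of what such a decorator does (assumed shape based on the linked yt-dlp PR; the real implementation may differ in detail):

```python
import functools
import inspect

def partial_application(func):
    # If the decorated function is called with fewer than its required
    # positional arguments, return a functools.partial instead of raising
    # TypeError, so callers can bind arguments incrementally.
    required = [
        p.name for p in inspect.signature(func).parameters.values()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        supplied = len(args) + sum(1 for name in required if name in kwargs)
        if supplied < len(required):
            return functools.partial(wrapped, *args, **kwargs)
        return func(*args, **kwargs)

    return wrapped
```

With this, a two-argument function called with one argument returns a partial that can be completed later.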
dirkf a96a778750 [core] Fix housekeeping for `available_at` 2 weeks ago
dirkf 68fe8c1781 [utils] Support traversal helper functions `require`, `value`, `unpack`
Thx: yt-dlp/yt-dlp#10653
2 weeks ago
dirkf 96419fa706 [utils] Support `filter` traversal key
Thx yt-dlp/yt-dlp#10653
2 weeks ago
dirkf cca41c9d2c [test] Move dict_get() traversal test to its own class
Matches yt-dlp/yt-dlp#9426
2 weeks ago
dirkf bc39e5e678 [test] Fix test_traversal_morsel for Py 3.14+
Thx: yt-dlp/yt-dlp#13471
2 weeks ago
dirkf 014ae63a11 [test] Support additional args and kwargs in report_warning() mocks 2 weeks ago
dirkf 1e109aaee1 [workflows/ci] Avoid installing wheel and setuptools with pip
Works around dependent wheel installation failure with Py 3.4 from 2025-10
2 months ago
dirkf efb4011211 [YouTube] Introduce `_extract_and_report_alerts()` per yt-dlp
Fixes #33196.

Also removing previous `_extract_alerts()` method.
2 months ago
dirkf c1f5c3274a [YouTube] Improve some traversals
Pending full alignment with yt-dlp ...
2 months ago
dirkf e21ff28f6f [YouTube] Misc clean-ups from linter, etc 2 months ago
dirkf 82552faba6 [workflows/ci] Update to windows-2022 runner
FFS
2 months ago
dirkf 617d4e6466 [core] Support explicit `--no-list-formats` option 2 months ago
dirkf 9223fcc48a [YouTube] Support `LOCKUP_CONTENT_TYPE_VIDEO` in subscriptions feed extraction
From yt-dlp/yt-dlp#13665, thx bashonly
2 months ago
dirkf 4222c6d78b [YouTube] Extract fallback title and description from initial data
Based on yt-dlp/yt-dlp#14078, thx bashonly
2 months ago
dirkf 2735d1bf1d [YouTube] Extract srt subtitles
From yt-dlp/yt-dlp#13411, thx gamer191
2 months ago
dirkf f2a774cb9d [YouTube] Fix subtitles extraction
From yt-dlp/yt-dlp#13659, thx bashonly
2 months ago
dirkf 92680b127f [YouTube] Handle required preroll waiting period
* Based on yt-dlp/yt-dlp#14081, thx bashonly
* Uses internal `youtube_preroll_sleep` param, default 6s
2 months ago
dirkf 40ab920354 [downloader] Delay download according to `available_at` format key 2 months ago
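The delay logic amounts to something like the following sketch (`wait_for_format` is a hypothetical helper for illustration; the real downloader hook differs):

```python
import time

def wait_for_format(fmt, sleep=time.sleep, now=time.time):
    # If the format dict carries a future `available_at` epoch timestamp,
    # sleep until that moment before starting the download.
    available_at = fmt.get('available_at')
    if available_at:
        delay = available_at - now()
        if delay > 0:
            sleep(delay)
```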
dirkf 0739f58f90 [YouTube] Implement player JS override for player `0004de42`
* based on yt-dlp/yt-dlp#14398, thx seproDev
* adds --youtube-player-js-variant option
* adds --youtube-player-js-version option
* sets defaults to main variant of player `0004de42`
* fixes #33187, for now
2 months ago
dirkf aac0148b89 [YouTube] Force `WEB` user agent for video page download
Fixes #33142, until default UAs work.
2 months ago
dirkf 7f7b3881aa [YouTube] Handle Web Safari formats
From yt-dlp/yt-dlp#14168, thx bashonly.
2 months ago
dirkf 0c41b03114 [YouTube] Update player client details 2 months ago
dirkf 7c6630bfdd [YouTube] Miscellaneous clean-ups 2 months ago
dirkf a084c80f7b [YouTube] Fix 680069a, excess `min_ver`
Resolves #33125.
7 months ago
dirkf e102b9993a [workflows/ci.yml] Move pinned Ubuntu runner images from withdrawn 20.04 to 22.04
* fix consequent missing `python-is-python2` package
7 months ago
dirkf 680069a149 [YouTube] Improve n-sig function extraction for player `aa3fc80b`
Resolves #33123
7 months ago
dirkf 4a31290ae1 [YouTube] Delete cached problem nsig cache data on descrambling error
* inspired by yt-dlp/yt-dlp#12750
7 months ago
dirkf 3a42f6ad37 [YouTube] Cache signature timestamp from player JS
* if the YT webpage can't be loaded, getting the `sts` requires loading the
player JS: this caches it
* based on yt-dlp/yt-dlp#13047, thx bashonly
7 months ago
dirkf ec75141bf0 [Cache] Add `clear` function 7 months ago
dirkf c052a16f72 [JSInterp] Add tests and relevant functionality from yt-dlp
* thx seproDev, bashonly: yt-dlp/yt-dlp#12760, yt-dlp/yt-dlp#12761:
  - Improve nested attribute support
  - Pass global stack when extracting objects
  - interpret_statement: Match attribute before indexing
  - Fix assignment to array elements with nested brackets
  - Add new signature tests
  - Invalidate JS function cache
  - Avoid testdata dupes now that we cache by URL

* rework nsig function name search
* fully fixes #33102
* update cache required versions
* update program version
8 months ago
dirkf bd2ded59f2 [JSInterp] Improve unary operators; add `!` 8 months ago
dirkf 16b7e97afa [JSInterp] Add `_separate_at_op()` 8 months ago
dirkf d21717978c [JSInterp] Improve JS classes, etc 8 months ago
dirkf 7513413794 [JSInterp] Reorganise some declarations to align better with yt-dlp 8 months ago
dirkf 67dbfa65f2 [InfoExtractor] Fix merging subtitles to empty target 8 months ago
dirkf 6eb6d6dff5 [InfoExtractor] Use local variants for remaining parent method calls
* ... where defined
8 months ago
dirkf 6c40d9f847 [YouTube] Remove remaining hard-coded API keys
* no longer required for these cases
8 months ago
dirkf 1b08d3281d [YouTube] Fix playlist continuation extraction
* thx coletdjnz, bashonly: yt-dlp/yt-dlp#12777
8 months ago
dirkf 32b8d31780 [YouTube] Support shorts playlist
* only 1..100: yt-dlp/yt-dlp#11130
8 months ago
dirkf 570b868078 [cache] Use `esc_rfc3986` to encode cache key 8 months ago
dirkf 2190e89260 [utils] Support optional `safe` argument for `escape_rfc3986()` 8 months ago
dirkf 7e136639db [compat] Improve Py2 compatibility for URL Quoting 8 months ago
dirkf cedeeed56f [cache] Align further with yt-dlp
* use compat_os_makedirs
* support non-ASCII characters in cache key
* improve logging
8 months ago
dirkf add4622870 [compat] Add compat_os_makedirs
* support exist_ok parameter in Py < 3.2
8 months ago
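The backport boils down to catching EEXIST, roughly as follows (an illustrative sketch; the real compat shim may differ):

```python
import errno
import os

def compat_os_makedirs(path, exist_ok=False):
    # Emulate os.makedirs(path, exist_ok=True) on Python < 3.2, where
    # makedirs() has no exist_ok parameter: swallow the "already exists"
    # error only when the existing path really is a directory.
    try:
        os.makedirs(path)
    except OSError as e:
        if not exist_ok or e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```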
dirkf 9a6ddece4d [core] Refactor message routines to align better with yt-dlp
* in particular, support `only_once` in the same methods
8 months ago
dirkf 3eb8d22ddb [JSInterp] Temporary fix for #33102 8 months ago
dirkf 4e714f9df1 [Misc] Correct [_]IE_DESC/NAME in a few IEs
* thx seproDev, yt-dlp/yt-dlp/pull/12694/commits/ae69e3c
* also add documenting comment in `InfoExtractor`
8 months ago
dirkf c1ea7f5a24 [ITV] Mark ITVX not working
* update old shim
* correct [_]IE_DESC
8 months ago
dirkf 2b4fbfce25 [YouTube] Support player `4fcd6e4a`
thx seproDev, bashonly: yt-dlp/yt-dlp#12748
8 months ago
dirkf 1bc45b8b6c [JSInterp] Use `,` for join() with null/undefined argument
Eg: [1,2,3].join(null) -> '1,2,3'
8 months ago
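In Python terms, the behaviour being matched is roughly this (a simplified sketch, not the interpreter's actual implementation):

```python
def js_join(arr, sep=None):
    # Array.prototype.join: a null/undefined separator falls back to ',',
    # and null/undefined elements stringify to '' (simplified coercion).
    if sep is None:
        sep = ','
    return sep.join('' if x is None else str(x) for x in arr)
```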
dirkf b982d77d0b [YouTube] Align signature tests with yt-dlp
thx bashonly, yt-dlp/yt-dlp#12725
8 months ago
dirkf c55dbf4838 [YouTube] Update signature extraction for players `643afba4`, `363db69b` 8 months ago
dirkf 087d865230 [YouTube] Support new player URL patterns 8 months ago
dirkf a4fc1151f1 [JSInterp] Improve indexing
* catch invalid list index with `ValueError` (eg [1, 2]['ab'] -> undefined)
* allow assignment outside existing list (eg var l = [1,2]; l[9] = 0;)
8 months ago
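Both behaviours can be sketched in plain Python (illustrative only; the interpreter's real indexing code differs):

```python
JS_UNDEFINED = object()  # stand-in for JS undefined

def js_index(lst, key):
    # Invalid or out-of-range list indices yield undefined, not an error:
    # [1, 2]['ab'] -> undefined
    try:
        return lst[int(key)]
    except (TypeError, ValueError, IndexError):
        return JS_UNDEFINED

def js_assign(lst, idx, value):
    # Assignment past the end pads the list, as JS arrays do:
    # var l = [1, 2]; l[9] = 0;  -> length 10
    if idx >= len(lst):
        lst.extend([JS_UNDEFINED] * (idx + 1 - len(lst)))
    lst[idx] = value
```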
dirkf a464c159e6 [YouTube] Make `_extract_player_info()` use `_search_regex()` 8 months ago
dirkf 7dca08eff0 [YouTube] Also get original of translated automatic captions 8 months ago
dirkf 2239ee7965 [YouTube] Get subtitles/automatic captions from both web and API responses 8 months ago
dirkf da7223d4aa [YouTube] Improve support for tce-style player JS
* improve extraction of global "useful data" Array from player JS
* also handle tv-player and add tests: thx seproDev (yt-dlp/yt-dlp#12684)

Co-Authored-By: sepro <sepro@sepr0.com>
9 months ago
dirkf 37c2440d6a [YouTube] Update player client data
thx seproDev (yt-dlp/yt-dlp#12603)

Co-authored-by: sepro <sepro@sepr0.com>
9 months ago
dirkf 420d53387c [JSInterp] Improve tests
* from yt-dlp/yt-dlp#12313
* also fix d7c2708
9 months ago
dirkf 32f89de92b [YouTube] Update TVHTML5 client parameters
* resolves #33078
9 months ago
dirkf 283dca56fe [YouTube] Initially support tce-style player JS
* resolves #33079
9 months ago
dirkf 422b1b31cf [YouTube] Temporarily redirect from tce-style player JS 9 months ago
dirkf 1dc27e1c3b [JSInterp] Make indexing error handling more conformant
* by default TypeError -> undefined, else raise
* set allow_undefined=True/False to override
9 months ago
dirkf af049e309b [JSInterp] Handle undefined, etc, passed to JS_RegExp and Exception 9 months ago
dirkf 94849bc997 [JSInterp] Improve Date processing
* add JS_Date class implementing JS Date
* support constructor args other than date string
* support static methods of Date
* Date objects are still automatically coerced to timestamp before using in JS.
9 months ago
dirkf 974c7d7f34 [compat] Fix inheriting from compat_collections_chain_map
* see ytdl-org/youtube-dl#33079#issuecomment-2704038049
9 months ago
dirkf 8738407d77 [compat] Support zstd Content-Encoding
* see RFC 8878 7.2
9 months ago
dirkf cecaa18b80 [compat] Clean-up
* make workaround_optparse_bug9161 private
* add comments
* avoid leaving test objects behind
9 months ago
dirkf 673277e510 [YouTube] Fix 91b1569 9 months ago
dirkf 91b1569f68 [YouTube] Fix channel playlist extraction (#33074)
* [YouTube] Extract playlist items from LOCKUP_VIEW_MODEL_...
* resolves #33073
* thx seproDev (yt-dlp/yt-dlp#11615)

Co-authored-by: sepro <sepro@sepr0.com>
9 months ago
dirkf 711e72c292 [JSInterp] Fix bit-shift coercion for player 9c6dfc4a 10 months ago
dirkf 26b6f15d14 [compat] Make casefold private
* if required, not supported:
`from youtube_dl.casefold import _casefold as casefold`
10 months ago
dirkf 5975d7bb96 [YouTube] Use X-Goog-Visitor-Id
* required with tv player client
* resolves #33030
11 months ago
dirkf 63fb0fc415 [YouTube] Retain .videoDetails members from all player responses 11 months ago
dirkf b09442a2f4 [YouTube] Also use ios client when is_live 11 months ago
dirkf 55ad8a24ca [YouTube] Support `... /feeds/videos.xml?playlist_id={pl_id}` 11 months ago
dirkf 21fff05121 [YouTube] Switch to TV API client
* thx yt-dlp/yt-dlp#12059
11 months ago
dirkf 1036478d13 [YouTube] Ensure subtitle URLs are complete
* WEB URLs are complete, MWEB URLs are not
* resolves #33017
11 months ago
dirkf 00ad2b8ca1 [YouTube] Refactor subtitle processing
* move to internal function
* use `traverse_obj()`
11 months ago
dirkf ab7c61ca29 [YouTube] Apply code style changes, trailing commas, etc 11 months ago
dirkf 176fc2cb00 [YouTube] Avoid early crash if webpage can't be read
* see issue #33013
11 months ago
dirkf d55d1f423d [YouTube] Always extract using MWEB API client
* temporary fix-up for 403 on download
* MWEB parameters from yt-dlp 2024-12-06
12 months ago
dirkf eeafbbc3e5 [YouTube] Fix signature function extraction for `2f1832d2`
* `_` was omitted from patterns
* thx yt-dlp/yt-dlp#11801

Co-authored-by: bashonly
12 months ago
dirkf cd7c7b5edb [YouTube] Simplify pattern for nsig function name extraction 12 months ago
dirkf eed784e15f [YouTube] Pass nsig value as return hook, fixes player `3bb1f723` 12 months ago
dirkf b4469a0f65 [YouTube] Handle player `3bb1f723`
* fix signature code extraction
* raise if n function returns input value
* add new tests from yt-dlp

Co-authored-by: bashonly
12 months ago
dirkf ce1e556b8f [jsinterp] Add return hook for player `3bb1f723`
* set var `_ytdl_do_not_return` to a specific value in the scope of a function
* if an expression to be returned has that value, `return` becomes `void`
12 months ago
dirkf f487b4a02a [jsinterp] Strip /* comments */ when parsing
* NB: _separate() is looking creaky
12 months ago
dirkf 60835ca16c [jsinterp] Fix and improve "methods"
* push, unshift return new length
* improve edge cases for push/pop, shift/unshift, forEach, indexOf, charCodeAt
* increase test coverage
12 months ago
dirkf 94fd774608 [jsinterp] Fix and improve split/join
* improve split/join edge cases
* correctly implement regex split (not like re.split)
12 months ago
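The regex-split difference can be sketched with the ECMAScript split algorithm (simplified: no capture groups or limit argument; illustrative, not the interpreter's code):

```python
import re

def js_split(s, pattern):
    # ECMAScript String.prototype.split with a RegExp separator: unlike
    # re.split, zero-width matches never produce leading or trailing empty
    # strings, so "test".split(/(?:)/) yields the characters.
    regex = re.compile(pattern)
    out, p, q = [], 0, 0
    while q < len(s):
        m = regex.match(s, q)       # separator match anchored at position q
        if m is None or m.end() == p:
            q += 1
        else:
            out.append(s[p:q])
            p = q = m.end()
    out.append(s[p:])
    return out
```

By contrast, `re.split(r'(?:)', 'test')` on Python 3.7+ keeps the empty edge strings, giving `['', 't', 'e', 's', 't', '']`.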
dirkf 5dee6213ed [jsinterp] Fix and improve arithmetic operations
* addition becomes concat with a string operand
* improve handling of edgier cases
* arithmetic in float like JS (more places need cast to int?)
* increase test coverage
12 months ago
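The addition rule can be illustrated like this (a simplified sketch: `undefined`/NaN handling and object coercion are omitted):

```python
def js_add(a, b):
    # JS +: concatenation if either operand is a string, otherwise numeric
    # addition in float, with null coerced to 0 and booleans to 0/1.
    def to_str(v):
        if isinstance(v, bool):
            return 'true' if v else 'false'
        if v is None:
            return 'null'
        return str(v)
    if isinstance(a, str) or isinstance(b, str):
        return to_str(a) + to_str(b)
    return float(a or 0) + float(b or 0)
```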
dirkf 81e64cacf2 [jsinterp] Support multiple indexing (eg a[1][2])
* extend single indexing with improved RE (should probably use/have used _separate_at_paren())
* fix some cases that should have given undefined, not throwing
* standardise RE group names
* support length of objects, like {1: 2, 3: 4, length: 42}
12 months ago
dirkf c1a03b1ac3 [jsinterp] Fix and improve loose and strict equality operations
* reimplement loose equality according to MDN (eg, 1 == "1")
* improve strict equality (eg, "abc" === "abc" but 'abc' is not 'abc')
* add tests for above
12 months ago
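The distinction being reimplemented can be sketched as follows (a reduced illustration of the MDN rules, not the interpreter's code):

```python
def js_loose_eq(a, b):
    # Loose equality (==), reduced to the number/string coercion case
    # (e.g. 1 == "1"); the full abstract-equality table is larger.
    def to_number(v):
        try:
            return float(v)
        except (TypeError, ValueError):
            return float('nan')
    if isinstance(a, str) != isinstance(b, str):
        a, b = to_number(a), to_number(b)
    return a == b

def js_strict_eq(a, b):
    # Strict equality (===): no coercion. int and float both count as the
    # JS Number type; bool is checked first since Python's bool is an int.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    return type(a) is type(b) and a == b
```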
dirkf 118c6d7a17 [jsinterp] Implement `typeof` operator 12 months ago
dirkf f28d7178e4 [InfoExtractor] Use kwarg maxsplit for re.split
* May become kw-only in future Pythons
12 months ago
dirkf c5098961b0 [Youtube] Rework n function extraction pattern
Now also succeeds with player b12cc44b
1 year ago
dirkf dbc08fba83 [jsinterp] Improve slice implementation for player b12cc44b
Partly taken from yt-dlp/yt-dlp#10664, thx seproDev
Fixes #32896
1 year ago
Aiur Adept 71223bff39 [Youtube] Fix nsig extraction for player 20dfca59 (#32891)
* dirkf's patch for nsig extraction
* add generic search per yt-dlp/yt-dlp/pull/10611 - thx bashonly

---------

Co-authored-by: dirkf <fieldhouse@gmx.net>
1 year ago
dirkf e1b3fa242c [Youtube] Find `n` function name in player `3400486c`
Fixes #32877
1 year ago
dirkf 451046d62a [Youtube] Make n-sig throttling diagnostic up-to-date 1 year ago

@@ -116,29 +116,29 @@ jobs:
strategy:
fail-fast: true
matrix:
-os: [ubuntu-20.04]
+os: [ubuntu-22.04]
python-version: ${{ fromJSON(needs.select.outputs.cpython-versions) }}
python-impl: [cpython]
ytdl-test-set: ${{ fromJSON(needs.select.outputs.test-set) }}
run-tests-ext: [sh]
include:
-- os: windows-2019
+- os: windows-2022
python-version: 3.4
python-impl: cpython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'core') && 'core' || 'nocore' }}
run-tests-ext: bat
-- os: windows-2019
+- os: windows-2022
python-version: 3.4
python-impl: cpython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'download') && 'download' || 'nodownload' }}
run-tests-ext: bat
# jython
-- os: ubuntu-20.04
+- os: ubuntu-22.04
python-version: 2.7
python-impl: jython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'core') && 'core' || 'nocore' }}
run-tests-ext: sh
-- os: ubuntu-20.04
+- os: ubuntu-22.04
python-version: 2.7
python-impl: jython
ytdl-test-set: ${{ contains(needs.select.outputs.test-set, 'download') && 'download' || 'nodownload' }}
@@ -160,7 +160,7 @@ jobs:
# NB may run apt-get install in Linux
uses: ytdl-org/setup-python@v1
env:
-# Temporary workaround for Python 3.5 failures - May 2024
+# Temporary (?) workaround for Python 3.5 failures - May 2024
PIP_TRUSTED_HOST: "pypi.python.org pypi.org files.pythonhosted.org"
with:
python-version: ${{ matrix.python-version }}
@@ -240,7 +240,10 @@ jobs:
# install 2.7
shell: bash
run: |
-sudo apt-get install -y python2 python-is-python2
+# Ubuntu 22.04 no longer has python-is-python2: fetch it
+curl -L "http://launchpadlibrarian.net/474693132/python-is-python2_2.7.17-4_all.deb" -o python-is-python2.deb
+sudo apt-get install -y python2
+sudo dpkg --force-breaks -i python-is-python2.deb
echo "PYTHONHOME=/usr" >> "$GITHUB_ENV"
#-------- Python 2.6 --
- name: Set up Python 2.6 environment
@@ -362,7 +365,7 @@ jobs:
python -m ensurepip || python -m pip --version || { \
get_pip="${{ contains(needs.select.outputs.own-pip-versions, matrix.python-version) && format('{0}/', matrix.python-version) || '' }}"; \
curl -L -O "https://bootstrap.pypa.io/pip/${get_pip}get-pip.py"; \
-python get-pip.py; }
+python get-pip.py --no-setuptools --no-wheel; }
- name: Set up Python 2.6 pip
if: ${{ matrix.python-version == '2.6' }}
shell: bash

@@ -85,10 +85,10 @@ class FakeYDL(YoutubeDL):
# Silence an expected warning matching a regex
old_report_warning = self.report_warning
-def report_warning(self, message):
+def report_warning(self, message, *args, **kwargs):
if re.match(regex, message):
return
-old_report_warning(message)
+old_report_warning(message, *args, **kwargs)
self.report_warning = types.MethodType(report_warning, self)
@@ -265,11 +265,11 @@ def assertRegexpMatches(self, text, regexp, msg=None):
def expect_warnings(ydl, warnings_re):
real_warning = ydl.report_warning
-def _report_warning(w):
+def _report_warning(self, w, *args, **kwargs):
if not any(re.search(w_re, w) for w_re in warnings_re):
real_warning(w)
-ydl.report_warning = _report_warning
+ydl.report_warning = types.MethodType(_report_warning, ydl)
def http_server_port(httpd):

@@ -63,9 +63,21 @@ class TestCache(unittest.TestCase):
obj = {'x': 1, 'y': ['ä', '\\a', True]}
c.store('test_cache', 'k.', obj)
self.assertEqual(c.load('test_cache', 'k.', min_ver='1970.01.01'), obj)
-new_version = '.'.join(('%d' % ((v + 1) if i == 0 else v, )) for i, v in enumerate(version_tuple(__version__)))
+new_version = '.'.join(('%0.2d' % ((v + 1) if i == 0 else v, )) for i, v in enumerate(version_tuple(__version__)))
self.assertIs(c.load('test_cache', 'k.', min_ver=new_version), None)
def test_cache_clear(self):
ydl = FakeYDL({
'cachedir': self.test_dir,
})
c = Cache(ydl)
c.store('test_cache', 'k.', 'kay')
c.store('test_cache', 'l.', 'ell')
self.assertEqual(c.load('test_cache', 'k.'), 'kay')
c.clear('test_cache', 'k.')
self.assertEqual(c.load('test_cache', 'k.'), None)
self.assertEqual(c.load('test_cache', 'l.'), 'ell')
if __name__ == '__main__':
unittest.main()

@@ -1,4 +1,5 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
@@ -6,12 +7,14 @@ from __future__ import unicode_literals
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import math
import re
import time
-from youtube_dl.compat import compat_str
+from youtube_dl.compat import compat_str as str
from youtube_dl.jsinterp import JS_Undefined, JSInterpreter
NaN = object()
@@ -19,7 +22,7 @@ NaN = object()
class TestJSInterpreter(unittest.TestCase):
def _test(self, jsi_or_code, expected, func='f', args=()):
-if isinstance(jsi_or_code, compat_str):
+if isinstance(jsi_or_code, str):
jsi_or_code = JSInterpreter(jsi_or_code)
got = jsi_or_code.call_function(func, *args)
if expected is NaN:
@@ -40,16 +43,27 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){return 42 + 7;}', 49)
self._test('function f(){return 42 + undefined;}', NaN)
self._test('function f(){return 42 + null;}', 42)
self._test('function f(){return 1 + "";}', '1')
self._test('function f(){return 42 + "7";}', '427')
self._test('function f(){return false + true;}', 1)
self._test('function f(){return "false" + true;}', 'falsetrue')
self._test('function f(){return '
'1 + "2" + [3,4] + {k: 56} + null + undefined + Infinity;}',
'123,4[object Object]nullundefinedInfinity')
def test_sub(self):
self._test('function f(){return 42 - 7;}', 35)
self._test('function f(){return 42 - undefined;}', NaN)
self._test('function f(){return 42 - null;}', 42)
self._test('function f(){return 42 - "7";}', 35)
self._test('function f(){return 42 - "spam";}', NaN)
def test_mul(self):
self._test('function f(){return 42 * 7;}', 294)
self._test('function f(){return 42 * undefined;}', NaN)
self._test('function f(){return 42 * null;}', 0)
self._test('function f(){return 42 * "7";}', 294)
self._test('function f(){return 42 * "eggs";}', NaN)
def test_div(self):
jsi = JSInterpreter('function f(a, b){return a / b;}')
@@ -57,17 +71,26 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, NaN, args=(JS_Undefined, 1))
self._test(jsi, float('inf'), args=(2, 0))
self._test(jsi, 0, args=(0, 3))
self._test(jsi, 6, args=(42, 7))
self._test(jsi, 0, args=(42, float('inf')))
self._test(jsi, 6, args=("42", 7))
self._test(jsi, NaN, args=("spam", 7))
def test_mod(self):
self._test('function f(){return 42 % 7;}', 0)
self._test('function f(){return 42 % 0;}', NaN)
self._test('function f(){return 42 % undefined;}', NaN)
self._test('function f(){return 42 % "7";}', 0)
self._test('function f(){return 42 % "beans";}', NaN)
def test_exp(self):
self._test('function f(){return 42 ** 2;}', 1764)
self._test('function f(){return 42 ** undefined;}', NaN)
self._test('function f(){return 42 ** null;}', 1)
self._test('function f(){return undefined ** 0;}', 1)
self._test('function f(){return undefined ** 42;}', NaN)
self._test('function f(){return 42 ** "2";}', 1764)
self._test('function f(){return 42 ** "spam";}', NaN)
def test_calc(self):
self._test('function f(a){return 2*a+1;}', 7, args=[3])
@@ -89,13 +112,60 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){return 19 & 21;}', 17)
self._test('function f(){return 11 >> 2;}', 2)
self._test('function f(){return []? 2+3: 4;}', 5)
# equality
self._test('function f(){return 1 == 1}', True)
self._test('function f(){return 1 == 1.0}', True)
self._test('function f(){return 1 == "1"}', True)
self._test('function f(){return 1 == 2}', False)
self._test('function f(){return 1 != "1"}', False)
self._test('function f(){return 1 != 2}', True)
self._test('function f(){var x = {a: 1}; var y = x; return x == y}', True)
self._test('function f(){var x = {a: 1}; return x == {a: 1}}', False)
self._test('function f(){return NaN == NaN}', False)
self._test('function f(){return null == undefined}', True)
self._test('function f(){return "spam, eggs" == "spam, eggs"}', True)
# strict equality
self._test('function f(){return 1 === 1}', True)
self._test('function f(){return 1 === 1.0}', True)
self._test('function f(){return 1 === "1"}', False)
self._test('function f(){return 1 === 2}', False)
self._test('function f(){var x = {a: 1}; var y = x; return x === y}', True)
self._test('function f(){var x = {a: 1}; return x === {a: 1}}', False)
self._test('function f(){return NaN === NaN}', False)
self._test('function f(){return null === undefined}', False)
self._test('function f(){return null === null}', True)
self._test('function f(){return undefined === undefined}', True)
self._test('function f(){return "uninterned" === "uninterned"}', True)
self._test('function f(){return 1 === 1}', True)
self._test('function f(){return 1 === "1"}', False)
self._test('function f(){return 1 !== 1}', False)
self._test('function f(){return 1 !== "1"}', True)
# expressions
self._test('function f(){return 0 && 1 || 2;}', 2)
self._test('function f(){return 0 ?? 42;}', 0)
self._test('function f(){return "life, the universe and everything" < 42;}', False)
# https://github.com/ytdl-org/youtube-dl/issues/32815
self._test('function f(){return 0 - 7 * - 6;}', 42)
def test_bitwise_operators_typecast(self):
# madness
self._test('function f(){return null << 5}', 0)
self._test('function f(){return undefined >> 5}', 0)
self._test('function f(){return 42 << NaN}', 42)
self._test('function f(){return 42 << Infinity}', 42)
self._test('function f(){return 0.0 << null}', 0)
self._test('function f(){return NaN << 42}', 0)
self._test('function f(){return "21.9" << 1}', 42)
self._test('function f(){return true << "5";}', 32)
self._test('function f(){return true << true;}', 2)
self._test('function f(){return "19" & "21.9";}', 17)
self._test('function f(){return "19" & false;}', 0)
self._test('function f(){return "11.0" >> "2.1";}', 2)
self._test('function f(){return 5 ^ 9;}', 12)
self._test('function f(){return 0.0 << NaN}', 0)
self._test('function f(){return null << undefined}', 0)
self._test('function f(){return 21 << 4294967297}', 42)
def test_array_access(self):
self._test('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}', [5, 2, 7])
@@ -110,8 +180,8 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31)
self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51)
self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11)
self._test('function f(){var x = 2; var y = ["a", "b"]; y[x%y["length"]]="z"; return y}', ['z', 'b'])
@unittest.skip('Not yet fully implemented')
def test_comments(self):
self._test('''
function f() {
@@ -130,6 +200,15 @@ class TestJSInterpreter(unittest.TestCase):
}
''', 3)
self._test('''
function f() {
var x = ( /* 1 + */ 2 +
/* 30 * 40 */
50);
return x;
}
''', 52)
def test_precedence(self):
self._test('''
function f() {
@@ -151,6 +230,34 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, 86000, args=['12/31/1969 18:01:26 MDT'])
# epoch 0
self._test(jsi, 0, args=['1 January 1970 00:00:00 UTC'])
# undefined
self._test(jsi, NaN, args=[JS_Undefined])
# y,m,d, ... - may fail with older dates lacking DST data
jsi = JSInterpreter(
'function f() { return new Date(%s); }'
% ('2024, 5, 29, 2, 52, 12, 42',))
self._test(jsi, (
1719625932042 # UK value
+ (
+ 3600 # back to GMT
+ (time.altzone if time.daylight # host's DST
else time.timezone)
) * 1000))
# no arg
self.assertAlmostEqual(JSInterpreter(
'function f() { return new Date() - 0; }').call_function('f'),
time.time() * 1000, delta=100)
# Date.now()
self.assertAlmostEqual(JSInterpreter(
'function f() { return Date.now(); }').call_function('f'),
time.time() * 1000, delta=100)
# Date.parse()
jsi = JSInterpreter('function f(dt) { return Date.parse(dt); }')
self._test(jsi, 0, args=['1 January 1970 00:00:00 UTC'])
# Date.UTC()
jsi = JSInterpreter('function f() { return Date.UTC(%s); }'
% ('1970, 0, 1, 0, 0, 0, 0',))
self._test(jsi, 0)
def test_call(self):
jsi = JSInterpreter('''
@@ -265,8 +372,28 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f() { a=5; return (a -= 1, a+=3, a); }', 7)
self._test('function f() { return (l=[0,1,2,3], function(a, b){return a+b})((l[1], l[2]), l[3]) }', 5)
def test_not(self):
self._test('function f() { return ! undefined; }', True)
self._test('function f() { return !0; }', True)
self._test('function f() { return !!0; }', False)
self._test('function f() { return ![]; }', False)
self._test('function f() { return !0 !== false; }', True)
def test_void(self):
-self._test('function f() { return void 42; }', None)
+self._test('function f() { return void 42; }', JS_Undefined)
def test_typeof(self):
self._test('function f() { return typeof undefined; }', 'undefined')
self._test('function f() { return typeof NaN; }', 'number')
self._test('function f() { return typeof Infinity; }', 'number')
self._test('function f() { return typeof true; }', 'boolean')
self._test('function f() { return typeof null; }', 'object')
self._test('function f() { return typeof "a string"; }', 'string')
self._test('function f() { return typeof 42; }', 'number')
self._test('function f() { return typeof 42.42; }', 'number')
self._test('function f() { var g = function(){}; return typeof g; }', 'function')
self._test('function f() { return typeof {key: "value"}; }', 'object')
# not yet implemented: Symbol, BigInt
def test_return_function(self):
jsi = JSInterpreter('''
@@ -283,7 +410,7 @@ class TestJSInterpreter(unittest.TestCase):
def test_undefined(self):
self._test('function f() { return undefined === undefined; }', True)
self._test('function f() { return undefined; }', JS_Undefined)
-self._test('function f() {return undefined ?? 42; }', 42)
+self._test('function f() { return undefined ?? 42; }', 42)
self._test('function f() { let v; return v; }', JS_Undefined)
self._test('function f() { let v; return v**0; }', 1)
self._test('function f() { let v; return [v>42, v<=42, v&&42, 42&&v]; }',
@@ -324,8 +451,19 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f() { let a; return a?.qq; }', JS_Undefined)
self._test('function f() { let a = {m1: 42, m2: 0 }; return a?.qq; }', JS_Undefined)
def test_indexing(self):
self._test('function f() { return [1, 2, 3, 4][3]}', 4)
self._test('function f() { return [1, [2, [3, [4]]]][1][1][1][0]}', 4)
self._test('function f() { var o = {1: 2, 3: 4}; return o[3]}', 4)
self._test('function f() { var o = {1: 2, 3: 4}; return o["3"]}', 4)
self._test('function f() { return [1, [2, {3: [4]}]][1][1]["3"][0]}', 4)
self._test('function f() { return [1, 2, 3, 4].length}', 4)
self._test('function f() { var o = {1: 2, 3: 4}; return o.length}', JS_Undefined)
self._test('function f() { var o = {1: 2, 3: 4}; o["length"] = 42; return o.length}', 42)
def test_regex(self):
self._test('function f() { let a=/,,[/,913,/](,)}/; }', None)
self._test('function f() { let a=/,,[/,913,/](,)}/; return a.source; }', ',,[/,913,/](,)}')
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; "".replace(a, ""); return a; }
@@ -373,13 +511,6 @@ class TestJSInterpreter(unittest.TestCase):
self._test('function f(){return -524999584 << 5}', 379882496)
self._test('function f(){return 1236566549 << 5}', 915423904)
-def test_bitwise_operators_typecast(self):
-    # madness
-    self._test('function f(){return null << 5}', 0)
-    self._test('function f(){return undefined >> 5}', 0)
-    self._test('function f(){return 42 << NaN}', 42)
-    self._test('function f(){return 42 << Infinity}', 42)
def test_negative(self):
self._test('function f(){return 2 * -2.0 ;}', -4)
self._test('function f(){return 2 - - -2 ;}', 0)
@@ -411,10 +542,19 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, 't-e-s-t', args=[test_input, '-'])
self._test(jsi, '', args=[[], '-'])
self._test('function f(){return '
'[1, 1.0, "abc", {a: 1}, null, undefined, Infinity, NaN].join()}',
'1,1,abc,[object Object],,,Infinity,NaN')
self._test('function f(){return '
'[1, 1.0, "abc", {a: 1}, null, undefined, Infinity, NaN].join("~")}',
'1~1~abc~[object Object]~~~Infinity~NaN')
def test_split(self):
test_result = list('test')
tests = [
'function f(a, b){return a.split(b)}',
'function f(a, b){return a["split"](b)}',
'function f(a, b){let x = ["split"]; return a[x[0]](b)}',
'function f(a, b){return String.prototype.split.call(a, b)}',
'function f(a, b){return String.prototype.split.apply(a, [b])}',
]
@@ -424,6 +564,93 @@ class TestJSInterpreter(unittest.TestCase):
self._test(jsi, test_result, args=['t-e-s-t', '-'])
self._test(jsi, [''], args=['', '-'])
self._test(jsi, [], args=['', ''])
# RegExp split
self._test('function f(){return "test".split(/(?:)/)}',
['t', 'e', 's', 't'])
self._test('function f(){return "t-e-s-t".split(/[es-]+/)}',
['t', 't'])
# from MDN: surrogate pairs aren't handled: case 1 fails
# self._test('function f(){return "😄😄".split(/(?:)/)}',
# ['\ud83d', '\ude04', '\ud83d', '\ude04'])
# case 2 beats Py3.2: it gets the case 1 result
if sys.version_info >= (2, 6) and not ((3, 0) <= sys.version_info < (3, 3)):
self._test('function f(){return "😄😄".split(/(?:)/u)}',
['😄', '😄'])
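The version guard works because Python 3.3+ (and wide 2.x builds) treats each emoji as a single code point, whereas JavaScript strings are sequences of UTF-16 code units, so each emoji is a surrogate pair there. A quick illustration of the difference:

```python
s = '\U0001F604' * 2  # "😄😄"

# On Python 3.3+ each astral character is one code point ...
assert list(s) == ['\U0001F604', '\U0001F604']

# ... but in UTF-16, as JS stores strings, each one is a surrogate pair
utf16_units = len(s.encode('utf-16-le')) // 2
assert utf16_units == 4
```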
def test_slice(self):
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice()}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0)}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(5)}', [5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(99)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-2)}', [7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-99)}', [0, 1, 2, 3, 4, 5, 6, 7, 8])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0, 0)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(1, 0)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(0, 1)}', [0])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(3, 6)}', [3, 4, 5])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(1, -1)}', [1, 2, 3, 4, 5, 6, 7])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-1, 1)}', [])
self._test('function f(){return [0, 1, 2, 3, 4, 5, 6, 7, 8].slice(-3, -1)}', [6, 7])
self._test('function f(){return "012345678".slice()}', '012345678')
self._test('function f(){return "012345678".slice(0)}', '012345678')
self._test('function f(){return "012345678".slice(5)}', '5678')
self._test('function f(){return "012345678".slice(99)}', '')
self._test('function f(){return "012345678".slice(-2)}', '78')
self._test('function f(){return "012345678".slice(-99)}', '012345678')
self._test('function f(){return "012345678".slice(0, 0)}', '')
self._test('function f(){return "012345678".slice(1, 0)}', '')
self._test('function f(){return "012345678".slice(0, 1)}', '0')
self._test('function f(){return "012345678".slice(3, 6)}', '345')
self._test('function f(){return "012345678".slice(1, -1)}', '1234567')
self._test('function f(){return "012345678".slice(-1, 1)}', '')
self._test('function f(){return "012345678".slice(-3, -1)}', '67')
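These cases line up with Python's own slicing, which already clamps out-of-range indices and interprets negative offsets the same way as JS `slice()`. A minimal sketch of how an interpreter can delegate, assuming integer (or absent) arguments:

```python
def js_slice(seq, start=0, end=None):
    # Python slices clamp out-of-range indices and handle negative
    # offsets exactly like JS slice(), so plain delegation suffices
    return seq[start:end]

assert js_slice([0, 1, 2, 3, 4], -2) == [3, 4]
assert js_slice('012345678', 99) == ''
assert js_slice('012345678', 1, -1) == '1234567'
```

The real interpreter additionally has to coerce non-integer arguments, which this sketch skips.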
def test_splice(self):
self._test('function f(){var T = ["0", "1", "2"]; T["splice"](2, 1, "0")[0]; return T }', ['0', '1', '0'])
def test_pop(self):
# pop
self._test('function f(){var a = [0, 1, 2, 3, 4, 5, 6, 7, 8]; return [a.pop(), a]}',
[8, [0, 1, 2, 3, 4, 5, 6, 7]])
self._test('function f(){return [].pop()}', JS_Undefined)
# push
self._test('function f(){var a = [0, 1, 2]; return [a.push(3, 4), a]}',
[5, [0, 1, 2, 3, 4]])
self._test('function f(){var a = [0, 1, 2]; return [a.push(), a]}',
[3, [0, 1, 2]])
def test_shift(self):
# shift
self._test('function f(){var a = [0, 1, 2, 3, 4, 5, 6, 7, 8]; return [a.shift(), a]}',
[0, [1, 2, 3, 4, 5, 6, 7, 8]])
self._test('function f(){return [].shift()}', JS_Undefined)
# unshift
self._test('function f(){var a = [0, 1, 2]; return [a.unshift(3, 4), a]}',
[5, [3, 4, 0, 1, 2]])
self._test('function f(){var a = [0, 1, 2]; return [a.unshift(), a]}',
[3, [0, 1, 2]])
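The return values being tested mirror the JS spec: `push`/`unshift` return the new array length, while `pop`/`shift` return the removed element, or `undefined` when the array is empty. A sketch of those semantics, with a hypothetical `JS_Undefined` standing in for the interpreter's sentinel:

```python
JS_Undefined = object()  # hypothetical stand-in for the interpreter's sentinel

def js_push(arr, *items):
    arr.extend(items)
    return len(arr)              # push() returns the new length

def js_shift(arr):
    # shift() removes and returns the first element; undefined when empty
    return arr.pop(0) if arr else JS_Undefined

a = [0, 1, 2]
assert js_push(a, 3, 4) == 5 and a == [0, 1, 2, 3, 4]
assert js_shift(a) == 0 and a == [1, 2, 3, 4]
assert js_shift([]) is JS_Undefined
```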
def test_forEach(self):
self._test('function f(){var ret = []; var l = [4, 2]; '
'var log = function(e,i,a){ret.push([e,i,a]);}; '
'l.forEach(log); '
'return [ret.length, ret[0][0], ret[1][1], ret[0][2]]}',
[2, 4, 1, [4, 2]])
self._test('function f(){var ret = []; var l = [4, 2]; '
'var log = function(e,i,a){this.push([e,i,a]);}; '
'l.forEach(log, ret); '
'return [ret.length, ret[0][0], ret[1][1], ret[0][2]]}',
[2, 4, 1, [4, 2]])
def test_extract_function(self):
jsi = JSInterpreter('function a(b) { return b + 1; }')
func = jsi.extract_function('a')
self.assertEqual(func([2]), 3)
def test_extract_function_with_global_stack(self):
jsi = JSInterpreter('function c(d) { return d + e + f + g; }')
func = jsi.extract_function('c', {'e': 10}, {'f': 100, 'g': 1000})
self.assertEqual(func([1]), 1111)
if __name__ == '__main__':

@@ -9,21 +9,32 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import itertools
import re
from youtube_dl.traversal import (
dict_get,
get_first,
require,
subs_list_to_dict,
T,
traverse_obj,
unpack,
value,
)
from youtube_dl.compat import (
compat_chr as chr,
compat_etree_fromstring,
compat_http_cookies,
compat_map as map,
compat_str,
compat_zip as zip,
)
from youtube_dl.utils import (
determine_ext,
ExtractorError,
int_or_none,
join_nonempty,
str_or_none,
)
@@ -446,42 +457,164 @@ class TestTraversal(_TestCase):
msg='`any` should allow further branching')
def test_traversal_morsel(self):
values = {
'expires': 'a',
'path': 'b',
'comment': 'c',
'domain': 'd',
'max-age': 'e',
'secure': 'f',
'httponly': 'g',
'version': 'h',
'samesite': 'i',
}
# SameSite added in Py3.8, breaks .update for 3.5-3.7
if sys.version_info < (3, 8):
del values['samesite']
morsel = compat_http_cookies.Morsel()
# SameSite added in Py3.8, breaks .update for 3.5-3.7
# Similarly Partitioned, Py3.14, thx Grub4k
values = dict(zip(morsel, map(chr, itertools.count(ord('a')))))
morsel.set(str('item_key'), 'item_value', 'coded_value')
morsel.update(values)
values['key'] = str('item_key')
values['value'] = 'item_value'
values.update({
'key': str('item_key'),
'value': 'item_value',
}),
values = dict((str(k), v) for k, v in values.items())
# make test pass even without ordered dict
value_set = set(values.values())
for key, value in values.items():
self.assertEqual(traverse_obj(morsel, key), value,
for key, val in values.items():
self.assertEqual(traverse_obj(morsel, key), val,
msg='Morsel should provide access to all values')
self.assertEqual(set(traverse_obj(morsel, Ellipsis)), value_set,
msg='`...` should yield all values')
self.assertEqual(set(traverse_obj(morsel, lambda k, v: True)), value_set,
msg='function key should yield all values')
values = list(values.values())
self.assertMaybeCountEqual(traverse_obj(morsel, Ellipsis), values,
msg='`...` should yield all values')
self.assertMaybeCountEqual(traverse_obj(morsel, lambda k, v: True), values,
msg='function key should yield all values')
self.assertIs(traverse_obj(morsel, [(None,), any]), morsel,
msg='Morsel should not be implicitly changed to dict on usage')
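The `dict(zip(morsel, map(chr, itertools.count(ord('a')))))` idiom above pairs however many keys the running Python's `Morsel` exposes with successive letters, so the test no longer needs a hard-coded, per-version key list. The trick in isolation:

```python
import itertools

keys = ['expires', 'path', 'comment']  # stand-in for iterating a Morsel
values = dict(zip(keys, map(chr, itertools.count(ord('a')))))
# zip() stops at the shorter iterable, so the infinite count() is safe
assert values == {'expires': 'a', 'path': 'b', 'comment': 'c'}
```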
def test_get_first(self):
self.assertEqual(get_first([{'a': None}, {'a': 'spam'}], 'a'), 'spam')
def test_traversal_filter(self):
data = [None, False, True, 0, 1, 0.0, 1.1, '', 'str', {}, {0: 0}, [], [1]]
self.assertEqual(
traverse_obj(data, (Ellipsis, filter)),
[True, 1, 1.1, 'str', {0: 0}, [1]],
'`filter` should filter falsy values')
class TestTraversalHelpers(_TestCase):
def test_traversal_require(self):
with self.assertRaises(ExtractorError, msg='Missing `value` should raise'):
traverse_obj(_TEST_DATA, ('None', T(require('value'))))
self.assertEqual(
traverse_obj(_TEST_DATA, ('str', T(require('value')))), 'str',
'`require` should pass through non-`None` values')
def test_subs_list_to_dict(self):
self.assertEqual(traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.vtt'},
{'name': 'en', 'url': 'https://example.com/subs/en1.ass'},
{'name': 'en', 'url': 'https://example.com/subs/en2.ass'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
}, all, T(subs_list_to_dict)]), {
'de': [{'url': 'https://example.com/subs/de.vtt'}],
'en': [
{'url': 'https://example.com/subs/en1.ass'},
{'url': 'https://example.com/subs/en2.ass'},
],
}, 'function should build subtitle dict from list of subtitles')
self.assertEqual(traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.ass'},
{'name': 'de'},
{'name': 'en', 'content': 'content'},
{'url': 'https://example.com/subs/en'},
], [Ellipsis, {
'id': 'name',
'data': 'content',
'url': 'url',
}, all, T(subs_list_to_dict(lang=None))]), {
'de': [{'url': 'https://example.com/subs/de.ass'}],
'en': [{'data': 'content'}],
}, 'subs with mandatory items missing should be filtered')
self.assertEqual(traverse_obj([
{'url': 'https://example.com/subs/de.ass', 'name': 'de'},
{'url': 'https://example.com/subs/en', 'name': 'en'},
], [Ellipsis, {
'id': 'name',
'ext': ['url', T(determine_ext(default_ext=None))],
'url': 'url',
}, all, T(subs_list_to_dict(ext='ext'))]), {
'de': [{'url': 'https://example.com/subs/de.ass', 'ext': 'ass'}],
'en': [{'url': 'https://example.com/subs/en', 'ext': 'ext'}],
}, '`ext` should set default ext but leave existing value untouched')
self.assertEqual(traverse_obj([
{'name': 'en', 'url': 'https://example.com/subs/en2', 'prio': True},
{'name': 'en', 'url': 'https://example.com/subs/en1', 'prio': False},
], [Ellipsis, {
'id': 'name',
'quality': ['prio', T(int)],
'url': 'url',
}, all, T(subs_list_to_dict(ext='ext'))]), {'en': [
{'url': 'https://example.com/subs/en1', 'ext': 'ext'},
{'url': 'https://example.com/subs/en2', 'ext': 'ext'},
]}, '`quality` key should sort subtitle list accordingly')
self.assertEqual(traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.ass'},
{'name': 'de'},
{'name': 'en', 'content': 'content'},
{'url': 'https://example.com/subs/en'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
'data': 'content',
}, all, T(subs_list_to_dict(lang='en'))]), {
'de': [{'url': 'https://example.com/subs/de.ass'}],
'en': [
{'data': 'content'},
{'url': 'https://example.com/subs/en'},
],
}, 'optionally provided lang should be used if no id available')
self.assertEqual(traverse_obj([
{'name': 1, 'url': 'https://example.com/subs/de1'},
{'name': {}, 'url': 'https://example.com/subs/de2'},
{'name': 'de', 'ext': 1, 'url': 'https://example.com/subs/de3'},
{'name': 'de', 'ext': {}, 'url': 'https://example.com/subs/de4'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
'ext': 'ext',
}, all, T(subs_list_to_dict(lang=None))]), {
'de': [
{'url': 'https://example.com/subs/de3'},
{'url': 'https://example.com/subs/de4'},
],
}, 'non str types should be ignored for id and ext')
self.assertEqual(traverse_obj([
{'name': 1, 'url': 'https://example.com/subs/de1'},
{'name': {}, 'url': 'https://example.com/subs/de2'},
{'name': 'de', 'ext': 1, 'url': 'https://example.com/subs/de3'},
{'name': 'de', 'ext': {}, 'url': 'https://example.com/subs/de4'},
], [Ellipsis, {
'id': 'name',
'url': 'url',
'ext': 'ext',
}, all, T(subs_list_to_dict(lang='de'))]), {
'de': [
{'url': 'https://example.com/subs/de1'},
{'url': 'https://example.com/subs/de2'},
{'url': 'https://example.com/subs/de3'},
{'url': 'https://example.com/subs/de4'},
],
}, 'non str types should be replaced by default id')
def test_unpack(self):
self.assertEqual(
unpack(lambda *x: ''.join(map(compat_str, x)))([1, 2, 3]), '123')
self.assertEqual(
unpack(join_nonempty)([1, 2, 3]), '1-2-3')
self.assertEqual(
unpack(join_nonempty, delim=' ')([1, 2, 3]), '1 2 3')
with self.assertRaises(TypeError):
unpack(join_nonempty)()
with self.assertRaises(TypeError):
unpack()
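A minimal sketch of the behaviour these tests pin down, assuming `unpack` wraps a callable so that it accepts a single iterable of positional arguments, with keyword arguments optionally pre-bound (this is an illustration, not the actual implementation):

```python
import functools

def unpack(func, **kwargs):
    # the wrapped callable takes one iterable and splats it into func
    @functools.wraps(func)
    def unpacked(items):
        return func(*items, **kwargs)
    return unpacked

assert unpack(lambda *x: sum(x))([1, 2, 3]) == 6
```

Called with no function at all, `unpack()` naturally raises `TypeError`, matching the last test above.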
def test_value(self):
self.assertEqual(
traverse_obj(_TEST_DATA, ('str', T(value('other')))), 'other',
'`value` should substitute specified value')
class TestDictGet(_TestCase):
def test_dict_get(self):
FALSE_VALUES = {
'none': None,
@@ -504,6 +637,9 @@ class TestTraversal(_TestCase):
self.assertEqual(dict_get(d, ('b', 'c', key, )), None)
self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value)
def test_get_first(self):
self.assertEqual(get_first([{'a': None}, {'a': 'spam'}], 'a'), 'spam')
if __name__ == '__main__':
unittest.main()

@@ -69,6 +69,7 @@ from youtube_dl.utils import (
parse_iso8601,
parse_resolution,
parse_qs,
partial_application,
pkcs1pad,
prepend_extension,
read_batch_urls,
@@ -664,6 +665,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('3h 11m 53s'), 11513)
self.assertEqual(parse_duration('3 hours 11 minutes 53 seconds'), 11513)
self.assertEqual(parse_duration('3 hours 11 mins 53 secs'), 11513)
self.assertEqual(parse_duration('3 hours, 11 minutes, 53 seconds'), 11513)
self.assertEqual(parse_duration('3 hours, 11 mins, 53 secs'), 11513)
self.assertEqual(parse_duration('62m45s'), 3765)
self.assertEqual(parse_duration('6m59s'), 419)
self.assertEqual(parse_duration('49s'), 49)
@@ -682,6 +685,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('PT1H0.040S'), 3600.04)
self.assertEqual(parse_duration('PT00H03M30SZ'), 210)
self.assertEqual(parse_duration('P0Y0M0DT0H4M20.880S'), 260.88)
self.assertEqual(parse_duration('01:02:03:050'), 3723.05)
self.assertEqual(parse_duration('103:050'), 103.05)
self.assertEqual(parse_duration('1HR 3MIN'), 3780)
self.assertEqual(parse_duration('2hrs 3mins'), 7380)
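The new comma-separated long forms reduce to the already-supported forms once the commas between components are removed; a hedged sketch of that normalisation step (not the actual `parse_duration` regex):

```python
import re

def strip_component_commas(text):
    # a comma directly after a unit word acts purely as a separator
    return re.sub(r'(?<=[a-z])\s*,\s+', ' ', text, flags=re.IGNORECASE)

assert (strip_component_commas('3 hours, 11 mins, 53 secs')
        == '3 hours 11 mins 53 secs')
```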
def test_fix_xml_ampersands(self):
self.assertEqual(
@@ -895,6 +902,30 @@ class TestUtil(unittest.TestCase):
'vcodec': 'av01.0.05M.08',
'acodec': 'none',
})
self.assertEqual(parse_codecs('vp9.2'), {
'vcodec': 'vp9.2',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('vp09.02.50.10.01.09.18.09.00'), {
'vcodec': 'vp09.02.50.10.01.09.18.09.00',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('av01.0.12M.10.0.110.09.16.09.0'), {
'vcodec': 'av01.0.12M.10.0.110.09.16.09.0',
'acodec': 'none',
'dynamic_range': 'HDR10',
})
self.assertEqual(parse_codecs('dvhe'), {
'vcodec': 'dvhe',
'acodec': 'none',
'dynamic_range': 'DV',
})
self.assertEqual(parse_codecs('fLaC'), {
'vcodec': 'none',
'acodec': 'flac',
})
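The new `dynamic_range` expectations follow from profile fields encoded in the codec string (e.g. VP9 profile 2, Dolby Vision codec identifiers). An illustrative mapping only, covering just the cases shown above; the real `parse_codecs` inspects the profile and transfer-characteristic fields in more detail:

```python
def guess_dynamic_range(vcodec):
    # illustrative only: a few prefixes with a known dynamic range
    if vcodec.startswith(('dvhe', 'dvh1')):
        return 'DV'
    if vcodec.startswith(('vp9.2', 'vp09.02')):
        return 'HDR10'
    return None

assert guess_dynamic_range('dvhe') == 'DV'
assert guess_dynamic_range('vp09.02.50.10.01.09.18.09.00') == 'HDR10'
```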
self.assertEqual(parse_codecs('theora, vorbis'), {
'vcodec': 'theora',
'acodec': 'vorbis',
@@ -1723,6 +1754,21 @@ Line 1
'a', 'b', 'c', 'd',
from_dict={'a': 'c', 'c': [], 'b': 'd', 'd': None}), 'c-d')
def test_partial_application(self):
test_fn = partial_application(lambda x, kwarg=None: '{0}, kwarg={1!r}'.format(x, kwarg))
self.assertTrue(
callable(test_fn(kwarg=10)),
'missing positional parameter should apply partially')
self.assertEqual(
test_fn(10, kwarg=42), '10, kwarg=42',
'positionally passed argument should call function')
self.assertEqual(
test_fn(x=10), '10, kwarg=None',
'keyword passed positional should call function')
self.assertEqual(
test_fn(kwarg=42)(10), '10, kwarg=42',
'call after partial application should call the function')
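One way `partial_application` can behave as these tests require is to return a `functools.partial` only when required positional arguments are missing. A hypothetical sketch, not the actual implementation (it uses `inspect.signature`, so it is Py3-only, unlike youtube-dl proper):

```python
import functools
import inspect

def partial_application(func):
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        try:
            sig.bind(*args, **kwargs)   # all required arguments present?
        except TypeError:
            return functools.partial(wrapped, *args, **kwargs)
        return func(*args, **kwargs)

    return wrapped

fn = partial_application(lambda x, kwarg=None: (x, kwarg))
assert fn(10, kwarg=42) == (10, 42)     # complete call runs the function
assert callable(fn(kwarg=42))           # incomplete call partially applies
assert fn(kwarg=42)(10) == (10, 42)     # and can be completed later
```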
if __name__ == '__main__':
unittest.main()

@@ -1,4 +1,5 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
@@ -12,6 +13,7 @@ import re
import string
from youtube_dl.compat import (
compat_contextlib_suppress,
compat_open as open,
compat_str,
compat_urlretrieve,
@@ -50,23 +52,93 @@ _SIG_TESTS = [
(
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflBb0OQx.js',
84,
'123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ0STUVWXYZ!"#$%&\'()*+,@./:;<=>'
'123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ0STUVWXYZ!"#$%&\'()*+,@./:;<=>',
),
(
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl9FYC6l.js',
83,
'123456789abcdefghijklmnopqr0tuvwxyzABCDETGHIJKLMNOPQRS>UVWXYZ!"#$%&\'()*+,-./:;<=F'
'123456789abcdefghijklmnopqr0tuvwxyzABCDETGHIJKLMNOPQRS>UVWXYZ!"#$%&\'()*+,-./:;<=F',
),
(
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflCGk6yw/html5player.js',
'4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
'82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B'
'82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B',
),
(
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
'312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
'112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
)
),
(
'https://www.youtube.com/s/player/6ed0d907/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL2QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'MyOSJXtKI3m-uME_jv7-pT12gOFC02RFkGoqWpzE0Cs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/20830619/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
]
_NSIG_TESTS = [
@@ -136,12 +208,16 @@ _NSIG_TESTS = [
),
(
'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js',
'-Txvy6bT5R6LqgnQNx', 'dcklJCnRUHbgSg',
'M92UUMHa8PdvPd3wyM', '3hPqLJsiNZx7yA',
),
(
'https://www.youtube.com/s/player/5a3b6271/player_ias.vflset/en_US/base.js',
'B2j7f_UPT4rfje85Lu_e', 'm5DmNymaGQ5RdQ',
),
(
'https://www.youtube.com/s/player/7a062b77/player_ias.vflset/en_US/base.js',
'NRcE3y3mVtm_cV-W', 'VbsCYUATvqlt5w',
),
(
'https://www.youtube.com/s/player/dac945fd/player_ias.vflset/en_US/base.js',
'o8BkRxXhuYsBCWi6RplPdP', '3Lx32v_hmzTm6A',
@@ -152,7 +228,11 @@ _NSIG_TESTS = [
),
(
'https://www.youtube.com/s/player/cfa9e7cb/player_ias.vflset/en_US/base.js',
'qO0NiMtYQ7TeJnfFG2', 'k9cuJDHNS5O7kQ',
'aCi3iElgd2kq0bxVbQ', 'QX1y8jGb2IbZ0w',
),
(
'https://www.youtube.com/s/player/8c7583ff/player_ias.vflset/en_US/base.js',
'1wWCVpRR96eAmMI87L', 'KSkWAVv1ZQxC3A',
),
(
'https://www.youtube.com/s/player/b7910ca8/player_ias.vflset/en_US/base.js',
@@ -166,6 +246,110 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js',
'b6HcntHGkvBLk_FRf', 'kNPW6A7FyP2l8A',
),
(
'https://www.youtube.com/s/player/3400486c/player_ias.vflset/en_US/base.js',
'lL46g3XifCKUZn1Xfw', 'z767lhet6V2Skl',
),
(
'https://www.youtube.com/s/player/5604538d/player_ias.vflset/en_US/base.js',
'7X-he4jjvMx7BCX', 'sViSydX8IHtdWA',
),
(
'https://www.youtube.com/s/player/20dfca59/player_ias.vflset/en_US/base.js',
'-fLCxedkAk4LUTK2', 'O8kfRq1y1eyHGw',
),
(
'https://www.youtube.com/s/player/b12cc44b/player_ias.vflset/en_US/base.js',
'keLa5R2U00sR9SQK', 'N1OGyujjEwMnLw',
),
(
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'gK15nzVyaXE9RsMP3z', 'ZFFWFLPWx9DEgQ',
),
(
'https://www.youtube.com/s/player/f8f53e1a/player_ias.vflset/en_US/base.js',
'VTQOUOv0mCIeJ7i8kZB', 'kcfD8wy0sNLyNQ',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
),
(
'https://www.youtube.com/s/player/9c6dfc4a/player_ias.vflset/en_US/base.js',
'jbu7ylIosQHyJyJV', 'uwI0ESiynAmhNg',
),
(
'https://www.youtube.com/s/player/f6e09c70/player_ias.vflset/en_US/base.js',
'W9HJZKktxuYoDTqW', 'jHbbkcaxm54',
),
(
'https://www.youtube.com/s/player/f6e09c70/player_ias_tce.vflset/en_US/base.js',
'W9HJZKktxuYoDTqW', 'jHbbkcaxm54',
),
(
'https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js',
'Sy4aDGc0VpYRR9ew_', '5UPOT1VhoZxNLQ',
),
(
'https://www.youtube.com/s/player/d50f54ef/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'XFTb2HoeOE5MHg',
),
(
'https://www.youtube.com/s/player/074a8365/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'ufTsrE0IVYrkl8v',
),
(
'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
'N5uAlLqm0eg1GyHO', 'dCBQOejdq5s-ww',
),
(
'https://www.youtube.com/s/player/69f581a5/tv-player-ias.vflset/tv-player-ias.js',
'-qIP447rVlTTwaZjY', 'KNcGOksBAvwqQg',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
),
(
'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/tv-player-ias.vflset/tv-player-ias.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/20830619/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/aa3fc80b/player_ias.vflset/en_US/base.js',
'0qY9dal2uzOnOGwa-48hha', 'VSh1KDfQMk-eag',
),
]
@@ -178,6 +362,8 @@ class TestPlayerInfo(unittest.TestCase):
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js', 'e7567ecf'),
('https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js', '643afba4'),
# obsolete
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
@@ -187,8 +373,9 @@ class TestPlayerInfo(unittest.TestCase):
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
)
ie = YoutubeIE(FakeYDL({'cachedir': False}))
for player_url, expected_player_id in PLAYER_URLS:
player_id = YoutubeIE._extract_player_info(player_url)
player_id = ie._extract_player_info(player_url)
self.assertEqual(player_id, expected_player_id)
@@ -200,21 +387,19 @@ class TestSignature(unittest.TestCase):
os.mkdir(self.TESTDATA_DIR)
def tearDown(self):
try:
with compat_contextlib_suppress(OSError):
for f in os.listdir(self.TESTDATA_DIR):
os.remove(f)
except OSError:
pass
def t_factory(name, sig_func, url_pattern):
def make_tfunc(url, sig_input, expected_sig):
m = url_pattern.match(url)
assert m, '%r should follow URL format' % url
test_id = m.group('id')
assert m, '{0!r} should follow URL format'.format(url)
test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))
def test_func(self):
basename = 'player-{0}-{1}.js'.format(name, test_id)
basename = 'player-{0}.js'.format(test_id)
fn = os.path.join(self.TESTDATA_DIR, basename)
if not os.path.exists(fn):
@@ -229,7 +414,7 @@ def t_factory(name, sig_func, url_pattern):
def signature(jscode, sig_input):
func = YoutubeIE(FakeYDL())._parse_sig_js(jscode)
func = YoutubeIE(FakeYDL({'cachedir': False}))._parse_sig_js(jscode)
src_sig = (
compat_str(string.printable[:sig_input])
if isinstance(sig_input, int) else sig_input)
@@ -237,17 +422,23 @@ def signature(jscode, sig_input):
def n_sig(jscode, sig_input):
funcname = YoutubeIE(FakeYDL())._extract_n_function_name(jscode)
return JSInterpreter(jscode).call_function(funcname, sig_input)
ie = YoutubeIE(FakeYDL({'cachedir': False}))
jsi = JSInterpreter(jscode)
jsi, _, func_code = ie._extract_n_function_code_jsi(sig_input, jsi)
return ie._extract_n_function_from_code(jsi, func_code)(sig_input)
make_sig_test = t_factory(
'signature', signature, re.compile(r'.*-(?P<id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?\.[a-z]+$'))
'signature', signature,
re.compile(r'''(?x)
.+/(?P<h5>html5)?player(?(h5)(?:-en_US)?-|/)(?P<id>[a-zA-Z0-9/._-]+)
(?(h5)/(?:watch_as3|html5player))?\.js$
'''))
for test_spec in _SIG_TESTS:
make_sig_test(*test_spec)
make_nsig_test = t_factory(
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_-]+)/.+.js$'))
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$'))
for test_spec in _NSIG_TESTS:
make_nsig_test(*test_spec)

@@ -357,7 +357,7 @@ class YoutubeDL(object):
_NUMERIC_FIELDS = set((
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
'timestamp', 'upload_year', 'upload_month', 'upload_day',
'timestamp', 'upload_year', 'upload_month', 'upload_day', 'available_at',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time',
@@ -540,10 +540,14 @@ class YoutubeDL(object):
"""Print message to stdout if not in quiet mode."""
return self.to_stdout(message, skip_eol, check_quiet=True)
def _write_string(self, s, out=None):
def _write_string(self, s, out=None, only_once=False, _cache=set()):
if only_once and s in _cache:
return
write_string(s, out=out, encoding=self.params.get('encoding'))
if only_once:
_cache.add(s)
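The `_cache=set()` default above leans on a Python detail: default argument values are created once, at function definition time, and persist across calls, so the set doubles as a process-wide memory of messages already emitted. The idiom in isolation:

```python
def emit(message, only_once=False, _cache=set()):
    # _cache is created once, at def time, and shared by every call
    if only_once and message in _cache:
        return None
    if only_once:
        _cache.add(message)
    return message

assert emit('hello', only_once=True) == 'hello'
assert emit('hello', only_once=True) is None   # suppressed on repeat
assert emit('hello') == 'hello'                # without only_once, never skipped
```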
def to_stdout(self, message, skip_eol=False, check_quiet=False):
def to_stdout(self, message, skip_eol=False, check_quiet=False, only_once=False):
"""Print message to stdout if not in quiet mode."""
if self.params.get('logger'):
self.params['logger'].debug(message)
@@ -552,9 +556,9 @@ class YoutubeDL(object):
terminator = ['\n', ''][skip_eol]
output = message + terminator
self._write_string(output, self._screen_file)
self._write_string(output, self._screen_file, only_once=only_once)
def to_stderr(self, message):
def to_stderr(self, message, only_once=False):
"""Print message to stderr."""
assert isinstance(message, compat_str)
if self.params.get('logger'):
@@ -562,7 +566,7 @@ class YoutubeDL(object):
else:
message = self._bidi_workaround(message)
output = message + '\n'
self._write_string(output, self._err_file)
self._write_string(output, self._err_file, only_once=only_once)
def to_console_title(self, message):
if not self.params.get('consoletitle', False):
@@ -641,18 +645,11 @@ class YoutubeDL(object):
raise DownloadError(message, exc_info)
self._download_retcode = 1
def report_warning(self, message, only_once=False, _cache={}):
def report_warning(self, message, only_once=False):
'''
Print the message to stderr; it will be prefixed with 'WARNING:'.
If stderr is a tty, the 'WARNING:' will be colored
'''
if only_once:
m_hash = hash((self, message))
m_cnt = _cache.setdefault(m_hash, 0)
_cache[m_hash] = m_cnt + 1
if m_cnt > 0:
return
if self.params.get('logger') is not None:
self.params['logger'].warning(message)
else:
@@ -663,7 +660,7 @@ class YoutubeDL(object):
else:
_msg_header = 'WARNING:'
warning_message = '%s %s' % (_msg_header, message)
self.to_stderr(warning_message)
self.to_stderr(warning_message, only_once=only_once)
def report_error(self, message, *args, **kwargs):
'''
@@ -677,6 +674,16 @@ class YoutubeDL(object):
kwargs['message'] = '%s %s' % (_msg_header, message)
self.trouble(*args, **kwargs)
def write_debug(self, message, only_once=False):
'''Log debug message via logger, or print it to stderr'''
if not self.params.get('verbose', False):
return
message = '[debug] {0}'.format(message)
if self.params.get('logger'):
self.params['logger'].debug(message)
else:
self.to_stderr(message, only_once)
def report_unscoped_cookies(self, *args, **kwargs):
# message=None, tb=False, is_error=False
if len(args) <= 2:
@@ -2397,60 +2404,52 @@ class YoutubeDL(object):
return res
def _format_note(self, fdict):
res = ''
if fdict.get('ext') in ['f4f', 'f4m']:
res += '(unsupported) '
if fdict.get('language'):
if res:
res += ' '
res += '[%s] ' % fdict['language']
if fdict.get('format_note') is not None:
res += fdict['format_note'] + ' '
if fdict.get('tbr') is not None:
res += '%4dk ' % fdict['tbr']
def simplified_codec(f, field):
assert field in ('acodec', 'vcodec')
codec = f.get(field)
return (
'unknown' if not codec
else '.'.join(codec.split('.')[:4]) if codec != 'none'
else 'images' if field == 'vcodec' and f.get('acodec') == 'none'
else None if field == 'acodec' and f.get('vcodec') == 'none'
else 'audio only' if field == 'vcodec'
else 'video only')
res = join_nonempty(
fdict.get('ext') in ('f4f', 'f4m') and '(unsupported)',
fdict.get('language') and ('[%s]' % (fdict['language'],)),
fdict.get('format_note') is not None and fdict['format_note'],
fdict.get('tbr') is not None and ('%4dk' % fdict['tbr']),
delim=' ')
res = [res] if res else []
if fdict.get('container') is not None:
if res:
res += ', '
res += '%s container' % fdict['container']
if (fdict.get('vcodec') is not None
and fdict.get('vcodec') != 'none'):
if res:
res += ', '
res += fdict['vcodec']
if fdict.get('vbr') is not None:
res += '@'
res.append('%s container' % (fdict['container'],))
if fdict.get('vcodec') not in (None, 'none'):
codec = simplified_codec(fdict, 'vcodec')
if codec and fdict.get('vbr') is not None:
codec += '@'
elif fdict.get('vbr') is not None and fdict.get('abr') is not None:
res += 'video@'
if fdict.get('vbr') is not None:
res += '%4dk' % fdict['vbr']
codec = 'video@'
else:
codec = None
codec = join_nonempty(codec, fdict.get('vbr') is not None and ('%4dk' % fdict['vbr']))
if codec:
res.append(codec)
if fdict.get('fps') is not None:
if res:
res += ', '
res += '%sfps' % fdict['fps']
if fdict.get('acodec') is not None:
if res:
res += ', '
if fdict['acodec'] == 'none':
res += 'video only'
else:
res += '%-5s' % fdict['acodec']
elif fdict.get('abr') is not None:
if res:
res += ', '
res += 'audio'
if fdict.get('abr') is not None:
res += '@%3dk' % fdict['abr']
if fdict.get('asr') is not None:
res += ' (%5dHz)' % fdict['asr']
res.append('%sfps' % (fdict['fps'],))
codec = (
simplified_codec(fdict, 'acodec') if fdict.get('acodec') is not None
else 'audio' if fdict.get('abr') is not None else None)
if codec:
res.append(join_nonempty(
'%-4s' % (codec + (('@%3dk' % fdict['abr']) if fdict.get('abr') else ''),),
fdict.get('asr') and '(%5dHz)' % fdict['asr'], delim=' '))
if fdict.get('filesize') is not None:
if res:
res += ', '
res += format_bytes(fdict['filesize'])
res.append(format_bytes(fdict['filesize']))
elif fdict.get('filesize_approx') is not None:
if res:
res += ', '
res += '~' + format_bytes(fdict['filesize_approx'])
return res
res.append('~' + format_bytes(fdict['filesize_approx']))
return ', '.join(res)
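Restated outside the diff, the abbreviation rule of the new `simplified_codec` helper can be exercised like this (a standalone copy for illustration, not the module itself):

```python
def simplified_codec(f, field):
    # Abbreviate a codec string to at most four dot-separated components;
    # 'none' is mapped to a human-readable tag depending on the other track.
    assert field in ('acodec', 'vcodec')
    codec = f.get(field)
    return (
        'unknown' if not codec
        else '.'.join(codec.split('.')[:4]) if codec != 'none'
        else 'images' if field == 'vcodec' and f.get('acodec') == 'none'
        else None if field == 'acodec' and f.get('vcodec') == 'none'
        else 'audio only' if field == 'vcodec'
        else 'video only')

print(simplified_codec({'vcodec': 'av01.0.12M.10.0.110.09.16.09.0'}, 'vcodec'))
# av01.0.12M.10
print(simplified_codec({'vcodec': 'none', 'acodec': 'mp4a.40.2'}, 'vcodec'))
# audio only
```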
def list_formats(self, info_dict):
formats = info_dict.get('formats', [info_dict])
@ -2514,7 +2513,7 @@ class YoutubeDL(object):
self.get_encoding()))
write_string(encoding_str, encoding=None)
writeln_debug = lambda *s: self._write_string('[debug] %s\n' % (''.join(s), ))
writeln_debug = lambda *s: self.write_debug(''.join(s))
writeln_debug('youtube-dl version ', __version__)
if _LAZY_LOADER:
writeln_debug('Lazy loading extractors enabled')

@ -18,7 +18,7 @@ from .compat import (
compat_getpass,
compat_register_utf8,
compat_shlex_split,
workaround_optparse_bug9161,
_workaround_optparse_bug9161,
)
from .utils import (
_UnsafeExtensionError,
@ -50,7 +50,7 @@ def _real_main(argv=None):
# Compatibility fix for Windows
compat_register_utf8()
workaround_optparse_bug9161()
_workaround_optparse_bug9161()
setproctitle('youtube-dl')
@ -409,6 +409,8 @@ def _real_main(argv=None):
'include_ads': opts.include_ads,
'default_search': opts.default_search,
'youtube_include_dash_manifest': opts.youtube_include_dash_manifest,
'youtube_player_js_version': opts.youtube_player_js_version,
'youtube_player_js_variant': opts.youtube_player_js_variant,
'encoding': opts.encoding,
'extract_flat': opts.extract_flat,
'mark_watched': opts.mark_watched,

@ -1,3 +1,4 @@
# coding: utf-8
from __future__ import unicode_literals
import errno
@ -10,12 +11,14 @@ import traceback
from .compat import (
compat_getenv,
compat_open as open,
compat_os_makedirs,
)
from .utils import (
error_to_compat_str,
escape_rfc3986,
expand_path,
is_outdated_version,
try_get,
traverse_obj,
write_json_file,
)
from .version import __version__
@ -30,23 +33,35 @@ class Cache(object):
def __init__(self, ydl):
self._ydl = ydl
def _write_debug(self, *args, **kwargs):
self._ydl.write_debug(*args, **kwargs)
def _report_warning(self, *args, **kwargs):
self._ydl.report_warning(*args, **kwargs)
def _to_screen(self, *args, **kwargs):
self._ydl.to_screen(*args, **kwargs)
def _get_param(self, k, default=None):
return self._ydl.params.get(k, default)
def _get_root_dir(self):
res = self._ydl.params.get('cachedir')
res = self._get_param('cachedir')
if res is None:
cache_root = compat_getenv('XDG_CACHE_HOME', '~/.cache')
res = os.path.join(cache_root, self._YTDL_DIR)
return expand_path(res)
def _get_cache_fn(self, section, key, dtype):
assert re.match(r'^[a-zA-Z0-9_.-]+$', section), \
assert re.match(r'^[\w.-]+$', section), \
'invalid section %r' % section
assert re.match(r'^[a-zA-Z0-9_.-]+$', key), 'invalid key %r' % key
key = escape_rfc3986(key, safe='').replace('%', ',') # encode non-ascii characters
return os.path.join(
self._get_root_dir(), section, '%s.%s' % (key, dtype))
@property
def enabled(self):
return self._ydl.params.get('cachedir') is not False
return self._get_param('cachedir') is not False
def store(self, section, key, data, dtype='json'):
assert dtype in ('json',)
@ -56,61 +71,75 @@ class Cache(object):
fn = self._get_cache_fn(section, key, dtype)
try:
try:
os.makedirs(os.path.dirname(fn))
except OSError as ose:
if ose.errno != errno.EEXIST:
raise
compat_os_makedirs(os.path.dirname(fn), exist_ok=True)
self._write_debug('Saving {section}.{key} to cache'.format(section=section, key=key))
write_json_file({self._VERSION_KEY: __version__, 'data': data}, fn)
except Exception:
tb = traceback.format_exc()
self._ydl.report_warning(
'Writing cache to %r failed: %s' % (fn, tb))
self._report_warning('Writing cache to {fn!r} failed: {tb}'.format(fn=fn, tb=tb))
def clear(self, section, key, dtype='json'):
if not self.enabled:
return
fn = self._get_cache_fn(section, key, dtype)
self._write_debug('Clearing {section}.{key} from cache'.format(section=section, key=key))
try:
os.remove(fn)
except Exception as e:
if getattr(e, 'errno') == errno.ENOENT:
# file not found
return
tb = traceback.format_exc()
self._report_warning('Clearing cache from {fn!r} failed: {tb}'.format(fn=fn, tb=tb))
def _validate(self, data, min_ver):
version = try_get(data, lambda x: x[self._VERSION_KEY])
version = traverse_obj(data, self._VERSION_KEY)
if not version: # Backward compatibility
data, version = {'data': data}, self._DEFAULT_VERSION
if not is_outdated_version(version, min_ver or '0', assume_new=False):
return data['data']
self._ydl.to_screen(
'Discarding old cache from version {version} (needs {min_ver})'.format(**locals()))
self._write_debug('Discarding old cache from version {version} (needs {min_ver})'.format(version=version, min_ver=min_ver))
def load(self, section, key, dtype='json', default=None, min_ver=None):
def load(self, section, key, dtype='json', default=None, **kw_min_ver):
assert dtype in ('json',)
min_ver = kw_min_ver.get('min_ver')
if not self.enabled:
return default
cache_fn = self._get_cache_fn(section, key, dtype)
try:
with open(cache_fn, encoding='utf-8') as cachef:
self._write_debug('Loading {section}.{key} from cache'.format(section=section, key=key), only_once=True)
return self._validate(json.load(cachef), min_ver)
except (ValueError, KeyError):
try:
with open(cache_fn, 'r', encoding='utf-8') as cachef:
return self._validate(json.load(cachef), min_ver)
except ValueError:
try:
file_size = os.path.getsize(cache_fn)
except (OSError, IOError) as oe:
file_size = error_to_compat_str(oe)
self._ydl.report_warning(
'Cache retrieval from %s failed (%s)' % (cache_fn, file_size))
except IOError:
pass # No cache available
file_size = 'size: %d' % os.path.getsize(cache_fn)
except (OSError, IOError) as oe:
file_size = error_to_compat_str(oe)
self._report_warning('Cache retrieval from %s failed (%s)' % (cache_fn, file_size))
except Exception as e:
if getattr(e, 'errno') == errno.ENOENT:
# no cache available
return
self._report_warning('Cache retrieval from %s failed' % (cache_fn,))
return default
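The changed `load()` signature replaces the positional `min_ver` with `**kw_min_ver`, emulating a keyword-only argument on Python 2 (where `def load(..., *, min_ver=None)` is a syntax error). A minimal sketch of the pattern, with hypothetical names and an extra unknown-keyword check that the real method does not perform:

```python
def load(section, key, default=None, **kw_min_ver):
    # Emulated keyword-only argument: callers must pass min_ver by name.
    unexpected = set(kw_min_ver) - set(('min_ver',))
    if unexpected:
        raise TypeError(
            'unexpected keyword argument(s): %s' % ', '.join(sorted(unexpected)))
    min_ver = kw_min_ver.get('min_ver')
    return section, key, default, min_ver

print(load('youtube', 'player', min_ver='2021.12.17'))
# ('youtube', 'player', None, '2021.12.17')
```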
def remove(self):
if not self.enabled:
self._ydl.to_screen('Cache is disabled (Did you combine --no-cache-dir and --rm-cache-dir?)')
self._to_screen('Cache is disabled (Did you combine --no-cache-dir and --rm-cache-dir?)')
return
cachedir = self._get_root_dir()
if not any((term in cachedir) for term in ('cache', 'tmp')):
raise Exception('Not removing directory %s - this does not look like a cache dir' % cachedir)
raise Exception('Not removing directory %s - this does not look like a cache dir' % (cachedir,))
self._ydl.to_screen(
'Removing cache dir %s .' % cachedir, skip_eol=True)
self._to_screen(
'Removing cache dir %s .' % (cachedir,), skip_eol=True)
if os.path.exists(cachedir):
self._ydl.to_screen('.', skip_eol=True)
self._to_screen('.', skip_eol=True)
shutil.rmtree(cachedir)
self._ydl.to_screen('.')
self._to_screen('.')

@ -10,9 +10,10 @@ from .compat import (
# https://github.com/unicode-org/icu/blob/main/icu4c/source/data/unidata/CaseFolding.txt
# In case newly foldable Unicode characters are defined, paste the new version
# of the text inside the ''' marks.
# The text is expected to have only blank lines andlines with 1st character #,
# The text is expected to have only blank lines and lines with 1st character #,
# all ignored, and fold definitions like this:
# `from_hex_code; space_separated_to_hex_code_list; comment`
# `from_hex_code; status; space_separated_to_hex_code_list; comment`
# Only `status` C/F are used.
_map_str = '''
# CaseFolding-15.0.0.txt
@ -1657,11 +1658,6 @@ _map = dict(
del _map_str
def casefold(s):
def _casefold(s):
assert isinstance(s, compat_str)
return ''.join((_map.get(c, c) for c in s))
__all__ = [
'casefold',
]
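On Python 3 the builtin already provides this behaviour; the module above only back-fills it for Python 2. For instance:

```python
# str.casefold performs full Unicode case folding, which is stronger
# than lower(): the German sharp s expands to 'ss'.
print('Straße'.casefold())  # strasse
print('Straße'.lower())     # straße
```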

@ -16,7 +16,6 @@ import os
import platform
import re
import shlex
import shutil
import socket
import struct
import subprocess
@ -24,11 +23,15 @@ import sys
import types
import xml.etree.ElementTree
_IDENTITY = lambda x: x
# naming convention
# 'compat_' + Python3_name.replace('.', '_')
# other aliases exist for convenience and/or legacy
# wrap disposable test values in type() to reclaim storage
# deal with critical unicode/str things first
# deal with critical unicode/str things first:
# compat_str, compat_basestring, compat_chr
try:
# Python 2
compat_str, compat_basestring, compat_chr = (
@ -39,18 +42,23 @@ except NameError:
str, (str, bytes), chr
)
# casefold
# compat_casefold
try:
compat_str.casefold
compat_casefold = lambda s: s.casefold()
except AttributeError:
from .casefold import casefold as compat_casefold
from .casefold import _casefold as compat_casefold
# compat_collections_abc
try:
import collections.abc as compat_collections_abc
except ImportError:
import collections as compat_collections_abc
compat_collections_abc = collections
# compat_urllib_request
try:
import urllib.request as compat_urllib_request
except ImportError: # Python 2
@ -79,11 +87,15 @@ except TypeError:
_add_init_method_arg(compat_urllib_request.Request)
del _add_init_method_arg
# compat_urllib_error
try:
import urllib.error as compat_urllib_error
except ImportError: # Python 2
import urllib2 as compat_urllib_error
# compat_urllib_parse
try:
import urllib.parse as compat_urllib_parse
except ImportError: # Python 2
@ -98,17 +110,23 @@ except ImportError: # Python 2
compat_urlparse = compat_urllib_parse
compat_urllib_parse_urlparse = compat_urllib_parse.urlparse
# compat_urllib_response
try:
import urllib.response as compat_urllib_response
except ImportError: # Python 2
import urllib as compat_urllib_response
# compat_urllib_response.addinfourl
try:
compat_urllib_response.addinfourl.status
except AttributeError:
# .getcode() is deprecated in Py 3.
compat_urllib_response.addinfourl.status = property(lambda self: self.getcode())
# compat_http_cookiejar
try:
import http.cookiejar as compat_cookiejar
except ImportError: # Python 2
@ -127,12 +145,16 @@ else:
compat_cookiejar_Cookie = compat_cookiejar.Cookie
compat_http_cookiejar_Cookie = compat_cookiejar_Cookie
# compat_http_cookies
try:
import http.cookies as compat_cookies
except ImportError: # Python 2
import Cookie as compat_cookies
compat_http_cookies = compat_cookies
# compat_http_cookies_SimpleCookie
if sys.version_info[0] == 2 or sys.version_info < (3, 3):
class compat_cookies_SimpleCookie(compat_cookies.SimpleCookie):
def load(self, rawdata):
@ -155,11 +177,15 @@ else:
compat_cookies_SimpleCookie = compat_cookies.SimpleCookie
compat_http_cookies_SimpleCookie = compat_cookies_SimpleCookie
# compat_html_entities, probably useless now
try:
import html.entities as compat_html_entities
except ImportError: # Python 2
import htmlentitydefs as compat_html_entities
# compat_html_entities_html5
try: # Python >= 3.3
compat_html_entities_html5 = compat_html_entities.html5
except AttributeError:
@ -2408,18 +2434,24 @@ except AttributeError:
# Py < 3.1
compat_http_client.HTTPResponse.getcode = lambda self: self.status
# compat_urllib_HTTPError
try:
from urllib.error import HTTPError as compat_HTTPError
except ImportError: # Python 2
from urllib2 import HTTPError as compat_HTTPError
compat_urllib_HTTPError = compat_HTTPError
# compat_urllib_request_urlretrieve
try:
from urllib.request import urlretrieve as compat_urlretrieve
except ImportError: # Python 2
from urllib import urlretrieve as compat_urlretrieve
compat_urllib_request_urlretrieve = compat_urlretrieve
# compat_html_parser_HTMLParser, compat_html_parser_HTMLParseError
try:
from HTMLParser import (
HTMLParser as compat_HTMLParser,
@ -2432,22 +2464,33 @@ except ImportError: # Python 3
# HTMLParseError was deprecated in Python 3.3 and removed in
# Python 3.5. Introducing dummy exception for Python >3.5 for compatible
# and uniform cross-version exception handling
class compat_HTMLParseError(Exception):
pass
compat_html_parser_HTMLParser = compat_HTMLParser
compat_html_parser_HTMLParseError = compat_HTMLParseError
# compat_subprocess_get_DEVNULL
try:
_DEVNULL = subprocess.DEVNULL
compat_subprocess_get_DEVNULL = lambda: _DEVNULL
except AttributeError:
compat_subprocess_get_DEVNULL = lambda: open(os.path.devnull, 'w')
# compat_http_server
try:
import http.server as compat_http_server
except ImportError:
import BaseHTTPServer as compat_http_server
# compat_urllib_parse_unquote_to_bytes,
# compat_urllib_parse_unquote, compat_urllib_parse_unquote_plus,
# compat_urllib_parse_urlencode,
# compat_urllib_parse_parse_qs
try:
from urllib.parse import unquote_to_bytes as compat_urllib_parse_unquote_to_bytes
from urllib.parse import unquote as compat_urllib_parse_unquote
@ -2455,8 +2498,7 @@ try:
from urllib.parse import urlencode as compat_urllib_parse_urlencode
from urllib.parse import parse_qs as compat_parse_qs
except ImportError: # Python 2
_asciire = (compat_urllib_parse._asciire if hasattr(compat_urllib_parse, '_asciire')
else re.compile(r'([\x00-\x7f]+)'))
_asciire = getattr(compat_urllib_parse, '_asciire', None) or re.compile(r'([\x00-\x7f]+)')
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version
@ -2524,24 +2566,21 @@ except ImportError: # Python 2
# Possible solutions are to either port it from python 3 with all
# the friends or manually ensure input query contains only byte strings.
# We will stick with latter thus recursively encoding the whole query.
def compat_urllib_parse_urlencode(query, doseq=0, encoding='utf-8'):
def compat_urllib_parse_urlencode(query, doseq=0, safe='', encoding='utf-8', errors='strict'):
def encode_elem(e):
if isinstance(e, dict):
e = encode_dict(e)
elif isinstance(e, (list, tuple,)):
list_e = encode_list(e)
e = tuple(list_e) if isinstance(e, tuple) else list_e
e = type(e)(encode_elem(el) for el in e)
elif isinstance(e, compat_str):
e = e.encode(encoding)
e = e.encode(encoding, errors)
return e
def encode_dict(d):
return dict((encode_elem(k), encode_elem(v)) for k, v in d.items())
def encode_list(l):
return [encode_elem(e) for e in l]
return tuple((encode_elem(k), encode_elem(v)) for k, v in d.items())
return compat_urllib_parse._urlencode(encode_elem(query), doseq=doseq)
return compat_urllib_parse._urlencode(encode_elem(query), doseq=doseq).decode('ascii')
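On Python 3 the shim is unnecessary; the behaviour being reproduced is that of the stdlib `urlencode`, which the patched Python 2 version ultimately delegates to:

```python
from urllib.parse import urlencode

# doseq=1 expands sequence values into repeated key=value pairs
print(urlencode({'id': 'abc', 'fmt': ['18', '22']}, doseq=1))
# id=abc&fmt=18&fmt=22
```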
# HACK: The following is the correct parse_qs implementation from cpython 3's stdlib.
# Python 2's version is apparently totally broken
@ -2596,8 +2635,61 @@ except ImportError: # Python 2
('parse_qs', compat_parse_qs)):
setattr(compat_urllib_parse, name, fix)
try:
all(chr(i) in b'' for i in range(256))
except TypeError:
# not all chr(i) are str: patch Python2 quote
_safemaps = getattr(compat_urllib_parse, '_safemaps', {})
_always_safe = frozenset(compat_urllib_parse.always_safe)
def _quote(s, safe='/'):
"""quote('abc def') -> 'abc%20def'"""
if not s and s is not None: # fast path
return s
safe = frozenset(safe)
cachekey = (safe, _always_safe)
try:
safe_map = _safemaps[cachekey]
except KeyError:
safe = _always_safe | safe
safe_map = {}
for i in range(256):
c = chr(i)
safe_map[c] = (
c if (i < 128 and c in safe)
else b'%{0:02X}'.format(i))
_safemaps[cachekey] = safe_map
if safe.issuperset(s):
return s
return ''.join(safe_map[c] for c in s)
# linked code
def _quote_plus(s, safe=''):
return (
_quote(s, safe + b' ').replace(b' ', b'+') if b' ' in s
else _quote(s, safe))
# linked code
def _urlcleanup():
if compat_urllib_parse._urlopener:
compat_urllib_parse._urlopener.cleanup()
_safemaps.clear()
compat_urllib_parse.ftpcache.clear()
for name, fix in (
('quote', _quote),
('quote_plus', _quote_plus),
('urlcleanup', _urlcleanup)):
setattr(compat_urllib_parse, '_' + name, getattr(compat_urllib_parse, name))
setattr(compat_urllib_parse, name, fix)
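The patched `quote()` above precomputes and caches one escape table per distinct `safe` set. The same caching idea, restated in Python 3 string terms (the function name and the exact always-safe set here are illustrative, not the stdlib's):

```python
_ALWAYS_SAFE = frozenset(
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_.-')
_safe_maps = {}  # cache: frozenset(safe) -> {char: escaped form}

def quote_ascii(s, safe='/'):
    key = frozenset(safe)
    table = _safe_maps.get(key)
    if table is None:
        allowed = _ALWAYS_SAFE | key
        # build the char->escape table once per distinct safe set
        table = {
            chr(i): (chr(i) if i < 128 and chr(i) in allowed else '%%%02X' % i)
            for i in range(256)}
        _safe_maps[key] = table
    return ''.join(table[c] for c in s)

print(quote_ascii('abc def'))  # abc%20def
```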
compat_urllib_parse_parse_qs = compat_parse_qs
# compat_urllib_request_DataHandler
try:
from urllib.request import DataHandler as compat_urllib_request_DataHandler
except ImportError: # Python < 3.4
@ -2632,16 +2724,20 @@ except ImportError: # Python < 3.4
return compat_urllib_response.addinfourl(io.BytesIO(data), headers, url)
# compat_xml_etree_ElementTree_ParseError
try:
from xml.etree.ElementTree import ParseError as compat_xml_parse_error
except ImportError: # Python 2.6
from xml.parsers.expat import ExpatError as compat_xml_parse_error
compat_xml_etree_ElementTree_ParseError = compat_xml_parse_error
etree = xml.etree.ElementTree
# compat_xml_etree_ElementTree_Element
_etree = xml.etree.ElementTree
class _TreeBuilder(etree.TreeBuilder):
class _TreeBuilder(_etree.TreeBuilder):
def doctype(self, name, pubid, system):
pass
@ -2650,7 +2746,7 @@ try:
# xml.etree.ElementTree.Element is a method in Python <=2.6 and
# the following will crash with:
# TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
isinstance(None, etree.Element)
isinstance(None, _etree.Element)
from xml.etree.ElementTree import Element as compat_etree_Element
except TypeError: # Python <=2.6
from xml.etree.ElementTree import _ElementInterface as compat_etree_Element
@ -2658,12 +2754,12 @@ compat_xml_etree_ElementTree_Element = compat_etree_Element
if sys.version_info[0] >= 3:
def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
return _etree.XML(text, parser=_etree.XMLParser(target=_TreeBuilder()))
else:
# python 2.x tries to encode unicode strings with ascii (see the
# XMLParser._fixtext method)
try:
_etree_iter = etree.Element.iter
_etree_iter = _etree.Element.iter
except AttributeError: # Python <=2.6
def _etree_iter(root):
for el in root.findall('*'):
@ -2675,27 +2771,29 @@ else:
# 2.7 source
def _XML(text, parser=None):
if not parser:
parser = etree.XMLParser(target=_TreeBuilder())
parser = _etree.XMLParser(target=_TreeBuilder())
parser.feed(text)
return parser.close()
def _element_factory(*args, **kwargs):
el = etree.Element(*args, **kwargs)
el = _etree.Element(*args, **kwargs)
for k, v in el.items():
if isinstance(v, bytes):
el.set(k, v.decode('utf-8'))
return el
def compat_etree_fromstring(text):
doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
doc = _XML(text, parser=_etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
for el in _etree_iter(doc):
if el.text is not None and isinstance(el.text, bytes):
el.text = el.text.decode('utf-8')
return doc
if hasattr(etree, 'register_namespace'):
compat_etree_register_namespace = etree.register_namespace
else:
# compat_xml_etree_register_namespace
try:
compat_etree_register_namespace = _etree.register_namespace
except AttributeError:
def compat_etree_register_namespace(prefix, uri):
"""Register a namespace prefix.
The registry is global, and any existing mapping for either the
@ -2704,14 +2802,16 @@ else:
attributes in this namespace will be serialized with prefix if possible.
ValueError is raised if prefix is reserved or is invalid.
"""
if re.match(r"ns\d+$", prefix):
raise ValueError("Prefix format reserved for internal use")
for k, v in list(etree._namespace_map.items()):
if re.match(r'ns\d+$', prefix):
raise ValueError('Prefix format reserved for internal use')
for k, v in list(_etree._namespace_map.items()):
if k == uri or v == prefix:
del etree._namespace_map[k]
etree._namespace_map[uri] = prefix
del _etree._namespace_map[k]
_etree._namespace_map[uri] = prefix
compat_xml_etree_register_namespace = compat_etree_register_namespace
# compat_xpath, compat_etree_iterfind
if sys.version_info < (2, 7):
# Here comes the crazy part: In 2.6, if the xpath is a unicode,
# .//node does not match if a node is a direct child of . !
@ -2898,7 +2998,6 @@ if sys.version_info < (2, 7):
def __init__(self, root):
self.root = root
##
# Generate all matching objects.
def compat_etree_iterfind(elem, path, namespaces=None):
@ -2933,13 +3032,15 @@ if sys.version_info < (2, 7):
else:
compat_xpath = lambda xpath: xpath
compat_etree_iterfind = lambda element, match: element.iterfind(match)
compat_xpath = _IDENTITY
# compat_os_name
compat_os_name = os._name if os.name == 'java' else os.name
# compat_shlex_quote
if compat_os_name == 'nt':
def compat_shlex_quote(s):
return s if re.match(r'^[-_\w./]+$', s) else '"%s"' % s.replace('"', '\\"')
@ -2954,6 +3055,7 @@ else:
return "'" + s.replace("'", "'\"'\"'") + "'"
# compat_shlex.split
try:
args = shlex.split('中文')
assert (isinstance(args, list)
@ -2969,6 +3071,7 @@ except (AssertionError, UnicodeEncodeError):
return list(map(lambda s: s.decode('utf-8'), shlex.split(s, comments, posix)))
# compat_ord
def compat_ord(c):
if isinstance(c, int):
return c
@ -2976,6 +3079,7 @@ def compat_ord(c):
return ord(c)
# compat_getenv, compat_os_path_expanduser, compat_setenv
if sys.version_info >= (3, 0):
compat_getenv = os.getenv
compat_expanduser = os.path.expanduser
@ -3063,6 +3167,22 @@ else:
compat_os_path_expanduser = compat_expanduser
# compat_os_makedirs
try:
os.makedirs('.', exist_ok=True)
compat_os_makedirs = os.makedirs
except TypeError: # < Py3.2
from errno import EEXIST as _errno_EEXIST
def compat_os_makedirs(name, mode=0o777, exist_ok=False):
try:
return os.makedirs(name, mode=mode)
except OSError as ose:
if not (exist_ok and ose.errno == _errno_EEXIST):
raise
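A standalone sketch of the fallback path (run here on Python 3 purely for illustration; on 3.2+ the shim is simply `os.makedirs` itself):

```python
import os
import tempfile
from errno import EEXIST

def compat_os_makedirs(name, mode=0o777, exist_ok=False):
    # pre-3.2 fallback: swallow EEXIST only when exist_ok is requested
    try:
        return os.makedirs(name, mode=mode)
    except OSError as ose:
        if not (exist_ok and ose.errno == EEXIST):
            raise

d = tempfile.mkdtemp()
compat_os_makedirs(d, exist_ok=True)            # directory exists: no error
compat_os_makedirs(os.path.join(d, 'a', 'b'))   # nested creation works
```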
# compat_os_path_realpath
if compat_os_name == 'nt' and sys.version_info < (3, 8):
# os.path.realpath on Windows does not follow symbolic links
# prior to Python 3.8 (see https://bugs.python.org/issue9949)
@ -3076,6 +3196,7 @@ else:
compat_os_path_realpath = compat_realpath
# compat_print
if sys.version_info < (3, 0):
def compat_print(s):
from .utils import preferredencoding
@ -3086,6 +3207,7 @@ else:
print(s)
# compat_getpass_getpass
if sys.version_info < (3, 0) and sys.platform == 'win32':
def compat_getpass(prompt, *args, **kwargs):
if isinstance(prompt, compat_str):
@ -3098,36 +3220,42 @@ else:
compat_getpass_getpass = compat_getpass
# compat_input
try:
compat_input = raw_input
except NameError: # Python 3
compat_input = input
# compat_kwargs
# Python < 2.6.5 require kwargs to be bytes
try:
def _testfunc(x):
pass
_testfunc(**{'x': 0})
(lambda x: x)(**{'x': 0})
except TypeError:
def compat_kwargs(kwargs):
return dict((bytes(k), v) for k, v in kwargs.items())
else:
compat_kwargs = lambda kwargs: kwargs
compat_kwargs = _IDENTITY
# compat_numeric_types
try:
compat_numeric_types = (int, float, long, complex)
except NameError: # Python 3
compat_numeric_types = (int, float, complex)
# compat_integer_types
try:
compat_integer_types = (int, long)
except NameError: # Python 3
compat_integer_types = (int, )
# compat_int
compat_int = compat_integer_types[-1]
# compat_socket_create_connection
if sys.version_info < (2, 7):
def compat_socket_create_connection(address, timeout, source_address=None):
host, port = address
@ -3154,6 +3282,7 @@ else:
compat_socket_create_connection = socket.create_connection
# compat_contextlib_suppress
try:
from contextlib import suppress as compat_contextlib_suppress
except ImportError:
@ -3196,12 +3325,12 @@ except AttributeError:
# repeated .close() is OK, but just in case
with compat_contextlib_suppress(EnvironmentError):
f.close()
popen.wait()
popen.wait()
# Fix https://github.com/ytdl-org/youtube-dl/issues/4223
# See http://bugs.python.org/issue9161 for what is broken
def workaround_optparse_bug9161():
def _workaround_optparse_bug9161():
op = optparse.OptionParser()
og = optparse.OptionGroup(op, 'foo')
try:
@ -3220,9 +3349,10 @@ def workaround_optparse_bug9161():
optparse.OptionGroup.add_option = _compat_add_option
if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3
compat_get_terminal_size = shutil.get_terminal_size
else:
# compat_shutil_get_terminal_size
try:
from shutil import get_terminal_size as compat_get_terminal_size # Python >= 3.3
except ImportError:
_terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines'])
def compat_get_terminal_size(fallback=(80, 24)):
@ -3252,27 +3382,33 @@ else:
columns = _columns
if lines is None or lines <= 0:
lines = _lines
return _terminal_size(columns, lines)
compat_shutil_get_terminal_size = compat_get_terminal_size
# compat_itertools_count
try:
itertools.count(start=0, step=1)
type(itertools.count(start=0, step=1))
compat_itertools_count = itertools.count
except TypeError: # Python 2.6
except TypeError: # Python 2.6 lacks step
def compat_itertools_count(start=0, step=1):
while True:
yield start
start += step
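The wrapped probe `type(itertools.count(start=0, step=1))` tests for the `step` keyword (absent on 2.6) while letting the throwaway iterator be reclaimed. The fallback generator, copied here and exercised for illustration:

```python
from itertools import islice

def compat_itertools_count(start=0, step=1):
    # pure-Python fallback for Python 2.6, where count() lacks `step`
    while True:
        yield start
        start += step

print(list(islice(compat_itertools_count(10, 5), 4)))
# [10, 15, 20, 25]
```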
# compat_tokenize_tokenize
if sys.version_info >= (3, 0):
from tokenize import tokenize as compat_tokenize_tokenize
else:
from tokenize import generate_tokens as compat_tokenize_tokenize
# compat_struct_pack, compat_struct_unpack, compat_Struct
try:
struct.pack('!I', 0)
type(struct.pack('!I', 0))
except TypeError:
# In Python 2.6 and 2.7.x < 2.7.7, struct requires a bytes argument
# See https://bugs.python.org/issue19099
@ -3304,8 +3440,10 @@ else:
compat_Struct = struct.Struct
# compat_map/filter() returning an iterator, supposedly the
# same versioning as for zip below
# builtins returning an iterator
# compat_map, compat_filter
# supposedly the same versioning as for zip below
try:
from future_builtins import map as compat_map
except ImportError:
@ -3314,6 +3452,8 @@ except ImportError:
except ImportError:
compat_map = map
# compat_filter, compat_filter_fns
try:
from future_builtins import filter as compat_filter
except ImportError:
@ -3321,7 +3461,11 @@ except ImportError:
from itertools import ifilter as compat_filter
except ImportError:
compat_filter = filter
# "Is this function one or maybe the other filter()?"
compat_filter_fns = tuple(set((filter, compat_filter)))
# compat_zip
try:
from future_builtins import zip as compat_zip
except ImportError: # not 2.6+ or is 3.x
@ -3331,6 +3475,7 @@ except ImportError: # not 2.6+ or is 3.x
compat_zip = zip
# compat_itertools_zip_longest
# method renamed between Py2/3
try:
from itertools import zip_longest as compat_itertools_zip_longest
@ -3338,7 +3483,42 @@ except ImportError:
from itertools import izip_longest as compat_itertools_zip_longest
# new class in collections
# compat_abc_ABC
try:
from abc import ABC as compat_abc_ABC
except ImportError:
# Py < 3.4
from abc import ABCMeta as _ABCMeta
compat_abc_ABC = _ABCMeta(str('ABC'), (object,), {})
# dict mixin used here
# like UserDict.DictMixin, without methods created by MutableMapping
class _DictMixin(compat_abc_ABC):
def has_key(self, key):
return key in self
# get(), clear(), setdefault() in MM
def iterkeys(self):
return (k for k in self)
def itervalues(self):
return (self[k] for k in self)
def iteritems(self):
return ((k, self[k]) for k in self)
# pop(), popitem() in MM
def copy(self):
return type(self)(self)
# update() in MM
# compat_collections_chain_map
# collections.ChainMap: new class
try:
from collections import ChainMap as compat_collections_chain_map
# Py3.3's ChainMap is deficient
@ -3394,19 +3574,22 @@ except ImportError:
def new_child(self, m=None, **kwargs):
m = m or {}
m.update(kwargs)
return compat_collections_chain_map(m, *self.maps)
# support inheritance !
return type(self)(m, *self.maps)
@property
def parents(self):
return compat_collections_chain_map(*(self.maps[1:]))
return type(self)(*(self.maps[1:]))
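The switch from the hard-coded class to `type(self)` matches what CPython's own `ChainMap` does (it constructs children via `self.__class__`), so subclasses survive `new_child()` and `parents`. A quick illustration with the stdlib class; the subclass name is made up:

```python
from collections import ChainMap

class ScopedMap(ChainMap):
    # toy subclass, only to show that the class is preserved
    pass

root = ScopedMap({'a': 1})
child = root.new_child({'b': 2})
print(type(child).__name__, type(child.parents).__name__)
# ScopedMap ScopedMap
```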
# compat_re_Pattern, compat_re_Match
# Pythons disagree on the type of a pattern (RegexObject, _sre.SRE_Pattern, Pattern, ...?)
compat_re_Pattern = type(re.compile(''))
# and on the type of a match
compat_re_Match = type(re.match('a', 'a'))
# compat_base64_b64decode
if sys.version_info < (3, 3):
def compat_b64decode(s, *args, **kwargs):
if isinstance(s, compat_str):
@ -3418,6 +3601,7 @@ else:
compat_base64_b64decode = compat_b64decode
# compat_ctypes_WINFUNCTYPE
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function
# names, see the original PyPy issue [1] and the youtube-dl one [2].
@ -3436,6 +3620,7 @@ else:
return ctypes.WINFUNCTYPE(*args, **kwargs)
# compat_open
if sys.version_info < (3, 0):
# open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True) not: opener=None
def compat_open(file_, *args, **kwargs):
@ -3463,18 +3648,151 @@ except AttributeError:
def compat_datetime_timedelta_total_seconds(td):
return (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6
# optional decompression packages
# compat_brotli
# PyPi brotli package implements 'br' Content-Encoding
try:
import brotli as compat_brotli
except ImportError:
compat_brotli = None
# compat_ncompress
# PyPi ncompress package implements 'compress' Content-Encoding
try:
import ncompress as compat_ncompress
except ImportError:
compat_ncompress = None
# compat_zstandard
# PyPi zstandard package implements 'zstd' Content-Encoding (RFC 8878 7.2)
try:
import zstandard as compat_zstandard
except ImportError:
compat_zstandard = None
# compat_thread
try:
import _thread as compat_thread
except ImportError:
try:
import thread as compat_thread
except ImportError:
import dummy_thread as compat_thread
# compat_dict
# compat_builtins_dict
# compat_dict_items
if sys.version_info >= (3, 6):
compat_dict = compat_builtins_dict = dict
compat_dict_items = dict.items
else:
_get_ident = compat_thread.get_ident
class compat_dict(compat_collections_abc.MutableMapping, _DictMixin, dict):
"""`dict` that preserves insertion order with interface like Py3.7+"""
_order = [] # default that should never be used
def __init__(self, *mappings_or_iterables, **kwargs):
# order an unordered dict using a list of keys: actual Py 2.7+
# OrderedDict uses a doubly linked list for better performance
self._order = []
for arg in mappings_or_iterables:
self.__update(arg)
if kwargs:
self.__update(kwargs)
def __getitem__(self, key):
return dict.__getitem__(self, key)
def __setitem__(self, key, value):
try:
if key not in self._order:
self._order.append(key)
dict.__setitem__(self, key, value)
except Exception:
if key in self._order[-1:] and key not in self:
del self._order[-1]
raise
def __len__(self):
return dict.__len__(self)
def __delitem__(self, key):
dict.__delitem__(self, key)
try:
# expected case, O(len(self)), but who dels anyway?
self._order.remove(key)
except ValueError:
pass
def __iter__(self):
for from_ in self._order:
if from_ in self:
yield from_
def __del__(self):
for attr in ('_order',):
try:
delattr(self, attr)
except Exception:
pass
def __repr__(self, _repr_running={}):
# skip recursive items ...
call_key = id(self), _get_ident()
if _repr_running.get(call_key):
return '...'
_repr_running[call_key] = True
try:
return '%s({%s})' % (
type(self).__name__,
','.join('%r: %r' % k_v for k_v in self.items()))
finally:
del _repr_running[call_key]
# merge/update (PEP 584)
def __or__(self, other):
if not isinstance(other, compat_collections_abc.Mapping):
return NotImplemented
new = type(self)(self)
new.update(other)
return new
def __ror__(self, other):
if not isinstance(other, compat_collections_abc.Mapping):
return NotImplemented
new = type(other)(other)
new.update(self)
return new
def __ior__(self, other):
self.update(other)
return self
# optimisations
def __reversed__(self):
for from_ in reversed(self._order):
if from_ in self:
yield from_
def __contains__(self, item):
return dict.__contains__(self, item)
# allow overriding update without breaking __init__
def __update(self, *args, **kwargs):
super(compat_dict, self).update(*args, **kwargs)
compat_builtins_dict = dict
# Use the object's own items() method, not dict's: calling unordered
# dict.items on an ordered dict can return its items unstably, as if
# the method were not ((k, self[k]) for k in self)
compat_dict_items = lambda d: d.items()
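On Python 3.7+ the builtin dict already satisfies everything `compat_dict` emulates, so the shim's observable behaviour can be sketched against plain dicts (a minimal sketch of the guarantees, not part of the codebase):

```python
# Sketch of the behaviour compat_dict guarantees; on modern Pythons the
# builtin dict is aliased directly, so it must match this.
d = {}
d['b'] = 1
d['a'] = 2
assert list(d) == ['b', 'a']        # insertion order, not key order

# PEP 584 merge semantics, as implemented by __or__/__ior__ above
merged = dict(d)
merged.update({'c': 3})             # equivalent of `d | {'c': 3}`
assert list(merged) == ['b', 'a', 'c']

# deleting and re-inserting a key moves it to the end
del d['b']
d['b'] = 1
assert list(d) == ['a', 'b']
```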
legacy = [
'compat_HTMLParseError',
@ -3491,6 +3809,7 @@ legacy = [
'compat_getpass',
'compat_parse_qs',
'compat_realpath',
'compat_shlex_split',
'compat_urllib_parse_parse_qs',
'compat_urllib_parse_unquote',
'compat_urllib_parse_unquote_plus',
@ -3504,34 +3823,40 @@ legacy = [
__all__ = [
'compat_html_parser_HTMLParseError',
'compat_html_parser_HTMLParser',
'compat_Struct',
'compat_abc_ABC',
'compat_base64_b64decode',
'compat_basestring',
'compat_brotli',
'compat_builtins_dict',
'compat_casefold',
'compat_chr',
'compat_collections_abc',
'compat_collections_chain_map',
'compat_datetime_timedelta_total_seconds',
'compat_http_cookiejar',
'compat_http_cookiejar_Cookie',
'compat_http_cookies',
'compat_http_cookies_SimpleCookie',
'compat_contextlib_suppress',
'compat_ctypes_WINFUNCTYPE',
'compat_datetime_timedelta_total_seconds',
'compat_dict',
'compat_dict_items',
'compat_etree_fromstring',
'compat_etree_iterfind',
'compat_filter',
'compat_filter_fns',
'compat_get_terminal_size',
'compat_getenv',
'compat_getpass_getpass',
'compat_html_entities',
'compat_html_entities_html5',
'compat_html_parser_HTMLParseError',
'compat_html_parser_HTMLParser',
'compat_http_cookiejar',
'compat_http_cookiejar_Cookie',
'compat_http_cookies',
'compat_http_cookies_SimpleCookie',
'compat_http_client',
'compat_http_server',
'compat_input',
'compat_int',
'compat_integer_types',
'compat_itertools_count',
'compat_itertools_zip_longest',
@ -3541,6 +3866,7 @@ __all__ = [
'compat_numeric_types',
'compat_open',
'compat_ord',
'compat_os_makedirs',
'compat_os_name',
'compat_os_path_expanduser',
'compat_os_path_realpath',
@ -3550,13 +3876,14 @@ __all__ = [
'compat_register_utf8',
'compat_setenv',
'compat_shlex_quote',
'compat_shlex_split',
'compat_shutil_get_terminal_size',
'compat_socket_create_connection',
'compat_str',
'compat_struct_pack',
'compat_struct_unpack',
'compat_subprocess_get_DEVNULL',
'compat_subprocess_Popen',
'compat_thread',
'compat_tokenize_tokenize',
'compat_urllib_error',
'compat_urllib_parse',
@ -3570,5 +3897,5 @@ __all__ = [
'compat_xml_etree_register_namespace',
'compat_xpath',
'compat_zip',
'workaround_optparse_bug9161',
'compat_zstandard',
]

@ -11,6 +11,7 @@ from ..utils import (
decodeArgument,
encodeFilename,
error_to_compat_str,
float_or_none,
format_bytes,
shell_quote,
timeconvert,
@ -367,14 +368,27 @@ class FileDownloader(object):
})
return True
min_sleep_interval = self.params.get('sleep_interval')
if min_sleep_interval:
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
min_sleep_interval, max_sleep_interval = (
float_or_none(self.params.get(interval), default=0)
for interval in ('sleep_interval', 'max_sleep_interval'))
sleep_note = ''
available_at = info_dict.get('available_at')
if available_at:
forced_sleep_interval = available_at - int(time.time())
if forced_sleep_interval > min_sleep_interval:
sleep_note = 'as required by the site'
min_sleep_interval = forced_sleep_interval
if forced_sleep_interval > max_sleep_interval:
max_sleep_interval = forced_sleep_interval
sleep_interval = random.uniform(
min_sleep_interval, max_sleep_interval or min_sleep_interval)
if sleep_interval > 0:
self.to_screen(
'[download] Sleeping %s seconds...' % (
int(sleep_interval) if sleep_interval.is_integer()
else '%.2f' % sleep_interval))
'[download] Sleeping %.2f seconds %s...' % (
sleep_interval, sleep_note))
time.sleep(sleep_interval)
return self.real_download(filename, info_dict)
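The reworked interval logic above can be condensed into a standalone helper (the function name `compute_sleep` is illustrative, not from the codebase): a site-mandated `available_at` timestamp raises both bounds of the random sleep when it exceeds them.

```python
import random
import time

# Illustrative helper (not in the codebase) mirroring the hunk above:
# `available_at` forces a minimum wait "as required by the site".
def compute_sleep(min_interval, max_interval, available_at=None, now=None):
    now = time.time() if now is None else now
    sleep_note = ''
    if available_at:
        forced = available_at - int(now)
        if forced > min_interval:
            sleep_note = 'as required by the site'
            min_interval = forced
            if forced > max_interval:
                max_interval = forced
    return random.uniform(min_interval, max_interval or min_interval), sleep_note

# forced interval overrides both bounds when it exceeds them
interval, note = compute_sleep(1, 3, available_at=110, now=100)
assert interval == 10.0 and note == 'as required by the site'
```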

@ -32,7 +32,7 @@ class BokeCCBaseIE(InfoExtractor):
class BokeCCIE(BokeCCBaseIE):
_IE_DESC = 'CC视频'
IE_DESC = 'CC视频'
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{

@ -9,7 +9,7 @@ from ..utils import (
class CloudyIE(InfoExtractor):
_IE_DESC = 'cloudy.ec'
IE_DESC = 'cloudy.ec'
_VALID_URL = r'https?://(?:www\.)?cloudy\.ec/(?:v/|embed\.php\?.*?\bid=)(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'https://www.cloudy.ec/v/af511e2527aac',

@ -214,6 +214,7 @@ class InfoExtractor(object):
width : height ratio as float.
* no_resume The server does not support resuming the
(HTTP or RTMP) download. Boolean.
* available_at Unix timestamp of when a format will be available to download
* downloader_options A dictionary of downloader options as
described in FileDownloader
@ -422,6 +423,8 @@ class InfoExtractor(object):
_GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None
_WORKING = True
# supply this in public subclasses: used in supported sites list, etc
# IE_DESC = 'short description of IE'
def __init__(self, downloader=None):
"""Constructor. Receives an optional downloader."""
@ -503,7 +506,7 @@ class InfoExtractor(object):
if not self._x_forwarded_for_ip:
# Geo bypass mechanism is explicitly disabled by user
if not self._downloader.params.get('geo_bypass', True):
if not self.get_param('geo_bypass', True):
return
if not geo_bypass_context:
@ -525,7 +528,7 @@ class InfoExtractor(object):
# Explicit IP block specified by user, use it right away
# regardless of whether extractor is geo bypassable or not
ip_block = self._downloader.params.get('geo_bypass_ip_block', None)
ip_block = self.get_param('geo_bypass_ip_block', None)
# Otherwise use random IP block from geo bypass context but only
# if extractor is known as geo bypassable
@ -536,8 +539,8 @@ class InfoExtractor(object):
if ip_block:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(ip_block)
if self._downloader.params.get('verbose', False):
self._downloader.to_screen(
if self.get_param('verbose', False):
self.to_screen(
'[debug] Using fake IP %s as X-Forwarded-For.'
% self._x_forwarded_for_ip)
return
@ -546,7 +549,7 @@ class InfoExtractor(object):
# Explicit country code specified by user, use it right away
# regardless of whether extractor is geo bypassable or not
country = self._downloader.params.get('geo_bypass_country', None)
country = self.get_param('geo_bypass_country', None)
# Otherwise use random country code from geo bypass context but
# only if extractor is known as geo bypassable
@ -557,8 +560,8 @@ class InfoExtractor(object):
if country:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country)
if self._downloader.params.get('verbose', False):
self._downloader.to_screen(
if self.get_param('verbose', False):
self.to_screen(
'[debug] Using fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country.upper()))
@ -584,9 +587,9 @@ class InfoExtractor(object):
raise ExtractorError('An extractor error has occurred.', cause=e)
def __maybe_fake_ip_and_retry(self, countries):
if (not self._downloader.params.get('geo_bypass_country', None)
if (not self.get_param('geo_bypass_country', None)
and self._GEO_BYPASS
and self._downloader.params.get('geo_bypass', True)
and self.get_param('geo_bypass', True)
and not self._x_forwarded_for_ip
and countries):
country_code = random.choice(countries)
@ -696,7 +699,7 @@ class InfoExtractor(object):
if fatal:
raise ExtractorError(errmsg, sys.exc_info()[2], cause=err)
else:
self._downloader.report_warning(errmsg)
self.report_warning(errmsg)
return False
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers={}, query={}, expected_status=None):
@ -768,11 +771,11 @@ class InfoExtractor(object):
webpage_bytes = prefix + webpage_bytes
if not encoding:
encoding = self._guess_encoding_from_content(content_type, webpage_bytes)
if self._downloader.params.get('dump_intermediate_pages', False):
if self.get_param('dump_intermediate_pages', False):
self.to_screen('Dumping request to ' + urlh.geturl())
dump = base64.b64encode(webpage_bytes).decode('ascii')
self._downloader.to_screen(dump)
if self._downloader.params.get('write_pages', False):
self.to_screen(dump)
if self.get_param('write_pages', False):
basen = '%s_%s' % (video_id, urlh.geturl())
if len(basen) > 240:
h = '___' + hashlib.md5(basen.encode('utf-8')).hexdigest()
@ -974,19 +977,9 @@ class InfoExtractor(object):
"""Print msg to screen, prefixing it with '[ie_name]'"""
self._downloader.to_screen(self.__ie_msg(msg))
def write_debug(self, msg, only_once=False, _cache=[]):
def write_debug(self, msg, only_once=False):
'''Log debug message or Print message to stderr'''
if not self.get_param('verbose', False):
return
message = '[debug] ' + self.__ie_msg(msg)
logger = self.get_param('logger')
if logger:
logger.debug(message)
else:
if only_once and hash(message) in _cache:
return
self._downloader.to_stderr(message)
_cache.append(hash(message))
self._downloader.write_debug(self.__ie_msg(msg), only_once=only_once)
# name, default=None, *args, **kwargs
def get_param(self, name, *args, **kwargs):
@ -1082,7 +1075,7 @@ class InfoExtractor(object):
if mobj:
break
if not self._downloader.params.get('no_color') and compat_os_name != 'nt' and sys.stderr.isatty():
if not self.get_param('no_color') and compat_os_name != 'nt' and sys.stderr.isatty():
_name = '\033[0;34m%s\033[0m' % name
else:
_name = name
@ -1100,7 +1093,7 @@ class InfoExtractor(object):
elif fatal:
raise RegexNotFoundError('Unable to extract %s' % _name)
else:
self._downloader.report_warning('unable to extract %s' % _name + bug_reports_message())
self.report_warning('unable to extract %s' % _name + bug_reports_message())
return None
def _search_json(self, start_pattern, string, name, video_id, **kwargs):
@ -1170,7 +1163,7 @@ class InfoExtractor(object):
username = None
password = None
if self._downloader.params.get('usenetrc', False):
if self.get_param('usenetrc', False):
try:
netrc_machine = netrc_machine or self._NETRC_MACHINE
info = netrc.netrc().authenticators(netrc_machine)
@ -1181,7 +1174,7 @@ class InfoExtractor(object):
raise netrc.NetrcParseError(
'No authenticators for %s' % netrc_machine)
except (AttributeError, IOError, netrc.NetrcParseError) as err:
self._downloader.report_warning(
self.report_warning(
'parsing .netrc: %s' % error_to_compat_str(err))
return username, password
@ -1218,10 +1211,10 @@ class InfoExtractor(object):
"""
if self._downloader is None:
return None
downloader_params = self._downloader.params
if downloader_params.get('twofactor') is not None:
return downloader_params['twofactor']
twofactor = self.get_param('twofactor')
if twofactor is not None:
return twofactor
return compat_getpass('Type %s and press [Return]: ' % note)
@ -1356,7 +1349,7 @@ class InfoExtractor(object):
elif fatal:
raise RegexNotFoundError('Unable to extract JSON-LD')
else:
self._downloader.report_warning('unable to extract JSON-LD %s' % bug_reports_message())
self.report_warning('unable to extract JSON-LD %s' % bug_reports_message())
return {}
def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None):
@ -1587,7 +1580,7 @@ class InfoExtractor(object):
if f.get('vcodec') == 'none': # audio only
preference -= 50
if self._downloader.params.get('prefer_free_formats'):
if self.get_param('prefer_free_formats'):
ORDER = ['aac', 'mp3', 'm4a', 'webm', 'ogg', 'opus']
else:
ORDER = ['webm', 'opus', 'ogg', 'mp3', 'aac', 'm4a']
@ -1599,7 +1592,7 @@ class InfoExtractor(object):
else:
if f.get('acodec') == 'none': # video only
preference -= 40
if self._downloader.params.get('prefer_free_formats'):
if self.get_param('prefer_free_formats'):
ORDER = ['flv', 'mp4', 'webm']
else:
ORDER = ['webm', 'flv', 'mp4']
@ -1665,7 +1658,7 @@ class InfoExtractor(object):
""" Either "http:" or "https:", depending on the user's preferences """
return (
'http:'
if self._downloader.params.get('prefer_insecure', False)
if self.get_param('prefer_insecure', False)
else 'https:')
def _proto_relative_url(self, url, scheme=None):
@ -3170,7 +3163,7 @@ class InfoExtractor(object):
# See com/longtailvideo/jwplayer/media/RTMPMediaProvider.as
# of jwplayer.flash.swf
rtmp_url_parts = re.split(
r'((?:mp4|mp3|flv):)', source_url, 1)
r'((?:mp4|mp3|flv):)', source_url, maxsplit=1)
if len(rtmp_url_parts) == 3:
rtmp_url, prefix, play_path = rtmp_url_parts
a_format.update({
@ -3197,7 +3190,7 @@ class InfoExtractor(object):
if fatal:
raise ExtractorError(msg)
else:
self._downloader.report_warning(msg)
self.report_warning(msg)
return res
def _float(self, v, name, fatal=False, **kwargs):
@ -3207,7 +3200,7 @@ class InfoExtractor(object):
if fatal:
raise ExtractorError(msg)
else:
self._downloader.report_warning(msg)
self.report_warning(msg)
return res
def _set_cookie(self, domain, name, value, expire_time=None, port=None,
@ -3216,12 +3209,12 @@ class InfoExtractor(object):
0, name, value, port, port is not None, domain, True,
domain.startswith('.'), path, True, secure, expire_time,
discard, None, None, rest)
self._downloader.cookiejar.set_cookie(cookie)
self.cookiejar.set_cookie(cookie)
def _get_cookies(self, url):
""" Return a compat_cookies_SimpleCookie with the cookies for the url """
req = sanitized_Request(url)
self._downloader.cookiejar.add_cookie_header(req)
self.cookiejar.add_cookie_header(req)
return compat_cookies_SimpleCookie(req.get_header('Cookie'))
def _apply_first_set_cookie_header(self, url_handle, cookie):
@ -3281,8 +3274,8 @@ class InfoExtractor(object):
return not any_restricted
def extract_subtitles(self, *args, **kwargs):
if (self._downloader.params.get('writesubtitles', False)
or self._downloader.params.get('listsubtitles')):
if (self.get_param('writesubtitles', False)
or self.get_param('listsubtitles')):
return self._get_subtitles(*args, **kwargs)
return {}
@ -3303,7 +3296,11 @@ class InfoExtractor(object):
""" Merge subtitle dictionaries, language by language. """
# ..., *, target=None
target = kwargs.get('target') or dict(subtitle_dict1)
target = kwargs.get('target')
if target is None:
target = dict(subtitle_dict1)
else:
subtitle_dicts = (subtitle_dict1,) + subtitle_dicts
for subtitle_dict in subtitle_dicts:
for lang in subtitle_dict:
@ -3311,8 +3308,8 @@ class InfoExtractor(object):
return target
def extract_automatic_captions(self, *args, **kwargs):
if (self._downloader.params.get('writeautomaticsub', False)
or self._downloader.params.get('listsubtitles')):
if (self.get_param('writeautomaticsub', False)
or self.get_param('listsubtitles')):
return self._get_automatic_captions(*args, **kwargs)
return {}
@ -3320,9 +3317,9 @@ class InfoExtractor(object):
raise NotImplementedError('This method must be implemented by subclasses')
def mark_watched(self, *args, **kwargs):
if (self._downloader.params.get('mark_watched', False)
if (self.get_param('mark_watched', False)
and (self._get_login_info()[0] is not None
or self._downloader.params.get('cookiefile') is not None)):
or self.get_param('cookiefile') is not None)):
self._mark_watched(*args, **kwargs)
def _mark_watched(self, *args, **kwargs):
@ -3330,7 +3327,7 @@ class InfoExtractor(object):
def geo_verification_headers(self):
headers = {}
geo_verification_proxy = self._downloader.params.get('geo_verification_proxy')
geo_verification_proxy = self.get_param('geo_verification_proxy')
if geo_verification_proxy:
headers['Ytdl-request-proxy'] = geo_verification_proxy
return headers

@ -35,15 +35,6 @@ from ..utils import (
class ITVBaseIE(InfoExtractor):
def _search_nextjs_data(self, webpage, video_id, **kw):
transform_source = kw.pop('transform_source', None)
fatal = kw.pop('fatal', True)
return self._parse_json(
self._search_regex(
r'''<script\b[^>]+\bid=('|")__NEXT_DATA__\1[^>]*>(?P<js>[^<]+)</script>''',
webpage, 'next.js data', group='js', fatal=fatal, **kw),
video_id, transform_source=transform_source, fatal=fatal)
def __handle_request_webpage_error(self, err, video_id=None, errnote=None, fatal=True):
if errnote is False:
return False
@ -109,7 +100,9 @@ class ITVBaseIE(InfoExtractor):
class ITVIE(ITVBaseIE):
_VALID_URL = r'https?://(?:www\.)?itv\.com/(?:(?P<w>watch)|hub)/[^/]+/(?(w)[\w-]+/)(?P<id>\w+)'
_IE_DESC = 'ITVX'
IE_DESC = 'ITVX'
_WORKING = False
_TESTS = [{
'note': 'Hub URLs redirect to ITVX',
'url': 'https://www.itv.com/hub/liar/2a4547a0012',
@ -270,7 +263,7 @@ class ITVIE(ITVBaseIE):
'ext': determine_ext(href, 'vtt'),
})
next_data = self._search_nextjs_data(webpage, video_id, fatal=False, default='{}')
next_data = self._search_nextjs_data(webpage, video_id, fatal=False, default={})
video_data.update(traverse_obj(next_data, ('props', 'pageProps', ('title', 'episode')), expected_type=dict)[0] or {})
title = traverse_obj(video_data, 'headerTitle', 'episodeTitle')
info = self._og_extract(webpage, require_title=not title)
@ -323,7 +316,7 @@ class ITVIE(ITVBaseIE):
class ITVBTCCIE(ITVBaseIE):
_VALID_URL = r'https?://(?:www\.)?itv\.com/(?!(?:watch|hub)/)(?:[^/]+/)+(?P<id>[^/?#&]+)'
_IE_DESC = 'ITV articles: News, British Touring Car Championship'
IE_DESC = 'ITV articles: News, British Touring Car Championship'
_TESTS = [{
'note': 'British Touring Car Championship',
'url': 'https://www.itv.com/btcc/articles/btcc-2018-all-the-action-from-brands-hatch',

@ -47,7 +47,7 @@ class SenateISVPIE(InfoExtractor):
['vetaff', '76462', 'http://vetaff-f.akamaihd.net'],
['arch', '', 'http://ussenate-f.akamaihd.net/']
]
_IE_NAME = 'senate.gov'
IE_NAME = 'senate.gov'
_VALID_URL = r'https?://(?:www\.)?senate\.gov/isvp/?\?(?P<qs>.+)'
_TESTS = [{
'url': 'http://www.senate.gov/isvp/?comm=judiciary&type=live&stt=&filename=judiciary031715&auto_play=false&wmode=transparent&poster=http%3A%2F%2Fwww.judiciary.senate.gov%2Fthemes%2Fjudiciary%2Fimages%2Fvideo-poster-flash-fit.png',

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -404,6 +404,10 @@ def parseOpts(overrideArguments=None):
'-F', '--list-formats',
action='store_true', dest='listformats',
help='List all available formats of requested videos')
video_format.add_option(
'--no-list-formats',
action='store_false', dest='listformats',
help='Do not list available formats of requested videos (default)')
video_format.add_option(
'--youtube-include-dash-manifest',
action='store_true', dest='youtube_include_dash_manifest', default=True,
@ -412,6 +416,17 @@ def parseOpts(overrideArguments=None):
'--youtube-skip-dash-manifest',
action='store_false', dest='youtube_include_dash_manifest',
help='Do not download the DASH manifests and related data on YouTube videos')
video_format.add_option(
'--youtube-player-js-variant',
action='store', dest='youtube_player_js_variant',
help='For YouTube, the player javascript variant to use for n/sig deciphering; `actual` to follow the site; default `%default`.',
choices=('actual', 'main', 'tcc', 'tce', 'es5', 'es6', 'tv', 'tv_es6', 'phone', 'tablet'),
default='actual', metavar='VARIANT')
video_format.add_option(
'--youtube-player-js-version',
action='store', dest='youtube_player_js_version',
help='For YouTube, the player javascript version to use for n/sig deciphering, specified as `signature_timestamp@hash`, or `actual` to follow the site; default `%default`',
default='actual', metavar='STS@HASH')
video_format.add_option(
'--merge-output-format',
action='store', dest='merge_output_format', metavar='FORMAT', default=None,

@ -5,6 +5,10 @@
from .utils import (
dict_get,
get_first,
require,
subs_list_to_dict,
T,
traverse_obj,
unpack,
value,
)

@ -53,6 +53,8 @@ from .compat import (
compat_etree_fromstring,
compat_etree_iterfind,
compat_expanduser,
compat_filter as filter,
compat_filter_fns,
compat_html_entities,
compat_html_entities_html5,
compat_http_client,
@ -1859,6 +1861,39 @@ def write_json_file(obj, fn):
raise
class partial_application(object):
"""Allow a function to use pre-set argument values"""
# see _try_bind_args()
try:
inspect.signature
@staticmethod
def required_args(fn):
return [
param.name for param in inspect.signature(fn).parameters.values()
if (param.kind in (inspect.Parameter.POSITIONAL_ONLY, inspect.Parameter.POSITIONAL_OR_KEYWORD)
and param.default is inspect.Parameter.empty)]
except AttributeError:
# Py < 3.3
@staticmethod
def required_args(fn):
fn_args = inspect.getargspec(fn)
n_defaults = len(fn_args.defaults or [])
return (fn_args.args or [])[:-n_defaults if n_defaults > 0 else None]
def __new__(cls, func):
@functools.wraps(func)
def wrapped(*args, **kwargs):
if set(cls.required_args(func)[len(args):]).difference(kwargs):
return functools.partial(func, *args, **kwargs)
return func(*args, **kwargs)
return wrapped
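A Python 3-only condensation of the decorator above (using a cut-down `int_or_none` purely for illustration): when any required positional argument is still missing, the call returns a `functools.partial` instead of failing.

```python
import functools
import inspect

# Py3-only condensation of partial_application from the hunk above.
def partial_application(func):
    def required_args(fn):
        return [p.name for p in inspect.signature(fn).parameters.values()
                if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
                and p.default is p.empty]

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        # required positionals not yet bound -> defer via partial
        if set(required_args(func)[len(args):]).difference(kwargs):
            return functools.partial(func, *args, **kwargs)
        return func(*args, **kwargs)
    return wrapped

@partial_application
def int_or_none(v, scale=1):        # illustrative, cut-down signature
    return None if v is None else int(v) // scale

assert int_or_none(10) == 10
scale_kb = int_or_none(scale=1024)  # `v` missing -> returns a partial
assert scale_kb(2048) == 2
```

This is what lets the later hunks decorate `parse_iso8601`, `determine_ext`, `urljoin`, `int_or_none`, `float_or_none`, `update_url` and `update_url_query` so they can be used directly inside `traverse_obj` paths.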
if sys.version_info >= (2, 7):
def find_xpath_attr(node, xpath, key, val=None):
""" Find the xpath xpath[@key=val] """
@ -3152,6 +3187,7 @@ def extract_timezone(date_str):
return timezone, date_str
@partial_application
def parse_iso8601(date_str, delimiter='T', timezone=None):
""" Return a UNIX timestamp from the given date """
@ -3229,6 +3265,7 @@ def unified_timestamp(date_str, day_first=True):
return calendar.timegm(timetuple) + pm_delta * 3600 - compat_datetime_timedelta_total_seconds(timezone)
@partial_application
def determine_ext(url, default_ext='unknown_video'):
if url is None or '.' not in url:
return default_ext
@ -3807,6 +3844,7 @@ def base_url(url):
return re.match(r'https?://[^?#&]+/', url).group()
@partial_application
def urljoin(base, path):
path = _decode_compat_str(path, encoding='utf-8', or_none=True)
if not path:
@ -3831,6 +3869,7 @@ class PUTRequest(compat_urllib_request.Request):
return 'PUT'
@partial_application
def int_or_none(v, scale=1, default=None, get_attr=None, invscale=1, base=None):
if get_attr:
if v is not None:
@ -3857,6 +3896,7 @@ def str_to_int(int_str):
return int_or_none(int_str)
@partial_application
def float_or_none(v, scale=1, invscale=1, default=None):
if v is None:
return default
@ -3891,38 +3931,46 @@ def parse_duration(s):
return None
s = s.strip()
if not s:
return None
days, hours, mins, secs, ms = [None] * 5
m = re.match(r'(?:(?:(?:(?P<days>[0-9]+):)?(?P<hours>[0-9]+):)?(?P<mins>[0-9]+):)?(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?Z?$', s)
m = re.match(r'''(?x)
(?P<before_secs>
(?:(?:(?P<days>[0-9]+):)?(?P<hours>[0-9]+):)?
(?P<mins>[0-9]+):)?
(?P<secs>(?(before_secs)[0-9]{1,2}|[0-9]+))
(?:[.:](?P<ms>[0-9]+))?Z?$
''', s)
if m:
days, hours, mins, secs, ms = m.groups()
days, hours, mins, secs, ms = m.group('days', 'hours', 'mins', 'secs', 'ms')
else:
m = re.match(
r'''(?ix)(?:P?
(?:
[0-9]+\s*y(?:ears?)?\s*
[0-9]+\s*y(?:ears?)?,?\s*
)?
(?:
[0-9]+\s*m(?:onths?)?\s*
[0-9]+\s*m(?:onths?)?,?\s*
)?
(?:
[0-9]+\s*w(?:eeks?)?\s*
[0-9]+\s*w(?:eeks?)?,?\s*
)?
(?:
(?P<days>[0-9]+)\s*d(?:ays?)?\s*
(?P<days>[0-9]+)\s*d(?:ays?)?,?\s*
)?
T)?
(?:
(?P<hours>[0-9]+)\s*h(?:ours?)?\s*
(?P<hours>[0-9]+)\s*h(?:(?:ou)?rs?)?,?\s*
)?
(?:
(?P<mins>[0-9]+)\s*m(?:in(?:ute)?s?)?\s*
(?P<mins>[0-9]+)\s*m(?:in(?:ute)?s?)?,?\s*
)?
(?:
(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*s(?:ec(?:ond)?s?)?\s*
(?P<secs>[0-9]+)(?:\.(?P<ms>[0-9]+))?\s*s(?:ec(?:ond)?s?)?\s*
)?Z?$''', s)
if m:
days, hours, mins, secs, ms = m.groups()
days, hours, mins, secs, ms = m.group('days', 'hours', 'mins', 'secs', 'ms')
else:
m = re.match(r'(?i)(?:(?P<hours>[0-9.]+)\s*(?:hours?)|(?P<mins>[0-9.]+)\s*(?:mins?\.?|minutes?)\s*)Z?$', s)
if m:
@ -3930,17 +3978,13 @@ def parse_duration(s):
else:
return None
duration = 0
if secs:
duration += float(secs)
if mins:
duration += float(mins) * 60
if hours:
duration += float(hours) * 60 * 60
if days:
duration += float(days) * 24 * 60 * 60
if ms:
duration += float(ms)
duration = (
((((float(days) * 24) if days else 0)
+ (float(hours) if hours else 0)) * 60
+ (float(mins) if mins else 0)) * 60
+ (float(secs) if secs else 0)
+ (float(ms) / 10 ** len(ms) if ms else 0))
return duration
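One consequence of capturing `ms` without its leading separator is the digit-count scaling in the final expression; isolated as a sketch (not the full parser):

```python
# Millisecond scaling as in the rewritten expression above: the captured
# `ms` group no longer includes the '.' or ':', so it is scaled by its
# own digit count.
def scale_ms(ms):
    return float(ms) / 10 ** len(ms) if ms else 0.0

assert scale_ms('5') == 0.5     # fractional part of e.g. '1:23.5'
assert scale_ms('05') == 0.05
assert scale_ms('500') == 0.5   # e.g. '1:23:500', ':' as ms separator
```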
@ -4204,12 +4248,16 @@ def lowercase_escape(s):
s)
def escape_rfc3986(s):
def escape_rfc3986(s, safe=None):
"""Escape non-ASCII characters as suggested by RFC 3986"""
if sys.version_info < (3, 0):
s = _encode_compat_str(s, 'utf-8')
if safe is not None:
safe = _encode_compat_str(safe, 'utf-8')
if safe is None:
safe = b"%/;:@&=+$,!~*'()?#[]"
# ensure unicode: after quoting, it can always be converted
return compat_str(compat_urllib_parse.quote(s, b"%/;:@&=+$,!~*'()?#[]"))
return compat_str(compat_urllib_parse.quote(s, safe))
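A Python 3-only sketch of the function with its new optional `safe` parameter (the Py2 byte-encoding branch is elided):

```python
from urllib.parse import quote

# Py3-only sketch of escape_rfc3986 with the new `safe` set; the default
# keeps the same reserved characters as before the change.
def escape_rfc3986(s, safe=None):
    if safe is None:
        safe = "%/;:@&=+$,!~*'()?#[]"
    return quote(s, safe)

assert escape_rfc3986('http://example.com/a b') == 'http://example.com/a%20b'
assert escape_rfc3986('a&b=c', safe='') == 'a%26b%3Dc'
```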
def escape_url(url):
@ -4247,6 +4295,7 @@ def urlencode_postdata(*args, **kargs):
return compat_urllib_parse_urlencode(*args, **kargs).encode('ascii')
@partial_application
def update_url(url, **kwargs):
"""Replace URL components specified by kwargs
url: compat_str or parsed URL tuple
@ -4268,6 +4317,7 @@ def update_url(url, **kwargs):
return compat_urllib_parse.urlunparse(url._replace(**kwargs))
@partial_application
def update_url_query(url, query):
return update_url(url, query_update=query)
@ -4694,30 +4744,45 @@ def parse_codecs(codecs_str):
if not codecs_str:
return {}
split_codecs = list(filter(None, map(
lambda str: str.strip(), codecs_str.strip().strip(',').split(','))))
vcodec, acodec = None, None
lambda s: s.strip(), codecs_str.strip().split(','))))
vcodec, acodec, hdr = None, None, None
for full_codec in split_codecs:
codec = full_codec.split('.')[0]
if codec in ('avc1', 'avc2', 'avc3', 'avc4', 'vp9', 'vp8', 'hev1', 'hev2', 'h263', 'h264', 'mp4v', 'hvc1', 'av01', 'theora'):
if not vcodec:
vcodec = full_codec
elif codec in ('mp4a', 'opus', 'vorbis', 'mp3', 'aac', 'ac-3', 'ec-3', 'eac3', 'dtsc', 'dtse', 'dtsh', 'dtsl'):
codec, rest = full_codec.partition('.')[::2]
codec = codec.lower()
full_codec = '.'.join((codec, rest)) if rest else codec
codec = re.sub(r'0+(?=\d)', '', codec)
if codec in ('avc1', 'avc2', 'avc3', 'avc4', 'vp9', 'vp8', 'hev1', 'hev2',
'h263', 'h264', 'mp4v', 'hvc1', 'av1', 'theora', 'dvh1', 'dvhe'):
if vcodec:
continue
vcodec = full_codec
if codec in ('dvh1', 'dvhe'):
hdr = 'DV'
elif codec in ('av1', 'vp9'):
n, m = {
'av1': (2, '10'),
'vp9': (0, '2'),
}[codec]
if (rest.split('.', n + 1)[n:] or [''])[0].lstrip('0') == m:
hdr = 'HDR10'
elif codec in ('flac', 'mp4a', 'opus', 'vorbis', 'mp3', 'aac', 'ac-4',
'ac-3', 'ec-3', 'eac3', 'dtsc', 'dtse', 'dtsh', 'dtsl'):
if not acodec:
acodec = full_codec
else:
write_string('WARNING: Unknown codec %s\n' % full_codec, sys.stderr)
if not vcodec and not acodec:
if len(split_codecs) == 2:
return {
'vcodec': split_codecs[0],
'acodec': split_codecs[1],
}
else:
return {
write_string('WARNING: Unknown codec %s\n' % (full_codec,), sys.stderr)
return (
filter_dict({
'vcodec': vcodec or 'none',
'acodec': acodec or 'none',
}
return {}
'dynamic_range': hdr,
}) if vcodec or acodec
else {
'vcodec': split_codecs[0],
'acodec': split_codecs[1],
} if len(split_codecs) == 2
else {})
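The new normalisation and HDR probing can be isolated as a sketch (helper name `split_codec` is illustrative; field positions follow the `{'av1': (2, '10'), 'vp9': (0, '2')}` table above):

```python
import re

# Sketch of the codec normalisation added above: lower-case the codec id
# and strip leading zeros before digits, so 'AV01'/'vp09' match 'av1'/'vp9'.
def split_codec(full_codec):
    codec, _, rest = full_codec.lower().partition('.')
    return re.sub(r'0+(?=\d)', '', codec), rest

assert split_codec('AV01.0.12M.10')[0] == 'av1'
assert split_codec('vp09.02.10.10')[0] == 'vp9'

# HDR10 probe: the (n+1)-th dotted field of `rest` must equal m
codec, rest = split_codec('av01.0.12M.10')
n, m = 2, '10'   # av1 row of the table
assert (rest.split('.', n + 1)[n:] or [''])[0].lstrip('0') == m
```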
def urlhandle_detect_ext(url_handle):
@ -6279,6 +6344,7 @@ def traverse_obj(obj, *paths, **kwargs):
Read as: `{key: traverse_obj(obj, path) for key, path in dct.items()}`.
- `any`-builtin: Take the first matching object and return it, resetting branching.
- `all`-builtin: Take all matching objects and return them as a list, resetting branching.
- `filter`-builtin: Return the value if it is truthy, `None` otherwise.
`tuple`, `list`, and `dict` all support nested paths and branches.
@ -6320,6 +6386,11 @@ def traverse_obj(obj, *paths, **kwargs):
# instant compat
str = compat_str
from .compat import (
compat_builtins_dict as dict_, # the basic dict type
compat_dict as dict, # dict preserving insertion order
)
casefold = lambda k: compat_casefold(k) if isinstance(k, str) else k
if isinstance(expected_type, type):
@ -6402,7 +6473,7 @@ def traverse_obj(obj, *paths, **kwargs):
if not branching: # string traversal
result = ''.join(result)
elif isinstance(key, dict):
elif isinstance(key, dict_):
iter_obj = ((k, _traverse_obj(obj, v, False, is_last)) for k, v in key.items())
result = dict((k, v if v is not None else default) for k, v in iter_obj
if v is not None or default is not NO_DEFAULT) or None
@ -6480,7 +6551,7 @@ def traverse_obj(obj, *paths, **kwargs):
has_branched = False
key = None
for last, key in lazy_last(variadic(path, (str, bytes, dict, set))):
for last, key in lazy_last(variadic(path, (str, bytes, dict_, set))):
if not casesense and isinstance(key, str):
key = compat_casefold(key)
@ -6493,6 +6564,11 @@ def traverse_obj(obj, *paths, **kwargs):
objs = (list(filtered_objs),)
continue
# filter might be from __builtin__, future_builtins, or itertools.ifilter
if key in compat_filter_fns:
objs = filter(None, objs)
continue
if __debug__ and callable(key):
# Verify function signature
_try_bind_args(key, None, None)
@ -6505,10 +6581,10 @@ def traverse_obj(obj, *paths, **kwargs):
objs = from_iterable(new_objs)
if test_type and not isinstance(key, (dict, list, tuple)):
if test_type and not isinstance(key, (dict_, list, tuple)):
objs = map(type_test, objs)
return objs, has_branched, isinstance(key, dict)
return objs, has_branched, isinstance(key, dict_)
def _traverse_obj(obj, path, allow_empty, test_type):
results, has_branched, is_dict = apply_path(obj, path, test_type)
@ -6531,6 +6607,76 @@ def traverse_obj(obj, *paths, **kwargs):
return None if default is NO_DEFAULT else default
def value(value):
return lambda _: value
class require(ExtractorError):
def __init__(self, name, expected=False):
super(require, self).__init__(
'Unable to extract {0}'.format(name), expected=expected)
def __call__(self, value):
if value is None:
raise self
return value
@partial_application
# typing: (subs: list[dict], /, *, lang='und', ext=None) -> dict[str, list[dict]]
def subs_list_to_dict(subs, lang='und', ext=None):
"""
Convert subtitles from a traversal into a subtitle dict.
The path should have an `all` immediately before this function.
Arguments:
`lang` The default language tag for subtitle dicts with no
`lang` (`und`: undefined)
`ext` The default value for `ext` in the subtitle dicts
In the dict you can set the following additional items:
`id` The language tag to which the subtitle dict should be added
`quality` The sort order for each subtitle dict
"""
result = collections.defaultdict(list)
for sub in subs:
tn_url = url_or_none(sub.pop('url', None))
if tn_url:
sub['url'] = tn_url
elif not sub.get('data'):
continue
sub_lang = sub.pop('id', None)
if not isinstance(sub_lang, compat_str):
if not lang:
continue
sub_lang = lang
sub_ext = sub.get('ext')
if not isinstance(sub_ext, compat_str):
if not ext:
sub.pop('ext', None)
else:
sub['ext'] = ext
result[sub_lang].append(sub)
result = dict(result)
for subs in result.values():
subs.sort(key=lambda x: x.pop('quality', 0) or 0)
return result
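The final ordering step above sorts each language's list by a `quality` key that is also removed as a side effect of the sort key; sketched in isolation:

```python
# Isolated sketch of the quality-ordering step in subs_list_to_dict:
# each subtitle dict is sorted by, and stripped of, its 'quality' key
# (sort keys are computed once per element, so the pop is safe).
subs = [
    {'url': 'low.vtt', 'quality': 1},
    {'url': 'high.vtt', 'quality': -1},
    {'url': 'mid.vtt'},                 # missing quality sorts as 0
]
subs.sort(key=lambda x: x.pop('quality', 0) or 0)
assert [s['url'] for s in subs] == ['high.vtt', 'mid.vtt', 'low.vtt']
assert all('quality' not in s for s in subs)
```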
def unpack(func, **kwargs):
"""Make a function that applies `partial(func, **kwargs)` to its argument as *args"""
@functools.wraps(func)
def inner(items):
return func(*items, **kwargs)
return inner
def T(*x):
""" For use in yt-dl instead of {type, ...} or set((type, ...)) """
return set(x)

@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2021.12.17'
__version__ = '2025.04.07'
