Since Python 3.6, invalid escape sequences are deprecated. It's likely
that there are invalid escape sequences somewhere on the webpage, so
instead of unescaping the whole webpage, just unescape the URL.
See https://bugs.python.org/issue27364. That change was designed for
string literals, but it affects the 'unicode_escape' codec as well.
The code path is:
str.decode('unicode_escape')
codecs.unicode_escape_decode()
PyUnicode_DecodeUnicodeEscape()
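A minimal sketch of the idea, not the actual extractor code: pull the URL
out of the page first, then decode only that substring. The `"url"` key,
the sample page and the helper name `unescape_url_only` are assumptions
made for illustration, and the URL is assumed to be ASCII.

```python
import re


def unescape_url_only(webpage):
    """Return the first "url" value with its escapes decoded, or None.

    Hypothetical helper: only the captured URL goes through the
    'unicode_escape' codec, so invalid escapes elsewhere on the page
    are never decoded.
    """
    match = re.search(r'"url"\s*:\s*"([^"]+)"', webpage)
    if match is None:
        return None
    # Assumes an ASCII-only URL; the codec operates on bytes in Python 3.
    return match.group(1).encode('ascii').decode('unicode_escape')


# The title contains the invalid escape \d; the URL only uses \u0026.
page = r'{"title": "a\d b", "url": "https://example.com/v?id=1\u0026t=2"}'
url = unescape_url_only(page)
```

Decoding the whole `page` instead would also run the codec over `\d`,
which is the case the deprecation covers.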
Closes #11924
The API with `page` is no longer used in browsers, and YouTube always
returns {'reload': 'now'} when cookies are provided.
See http://youtube.github.io/spfjs/documentation/start/ for how SPF
works. Basically, appending a `spf` parameter to a static link yields
the corresponding dynamic link.
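The static-to-dynamic mapping can be sketched as below. This is an
illustration only: the helper name `to_spf_url` and the default value
`navigate` are assumptions, not something this change defines.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_spf_url(static_url, spf_value='navigate'):
    """Append a `spf` query parameter to a static link.

    Hypothetical helper; `spf_value` is an assumed default.
    """
    parts = urlsplit(static_url)
    # Keep existing query parameters and add `spf` at the end.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(('spf', spf_value))
    return urlunsplit(parts._replace(query=urlencode(query)))


dynamic = to_spf_url('https://www.youtube.com/watch?v=abc')
```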
To reduce complexity, I don't support old Bangumi URLs directly via
_VALID_URL. Instead, I let them go through generic redirection. An
example can be found in #10190:
http://bangumi.bilibili.com/anime/v/40062
HTMLParser, which is used by extract_attributes, already unescapes
attribute values with HTMLParser.unescape. They shouldn't be unescaped
again, or there may be parsing errors.
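A small demonstration of the double-unescape problem with the standard
library parser. `AttrCollector` and `extract_attrs` are hypothetical
stand-ins written for this sketch, not the project's extract_attributes.

```python
from html import unescape
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records the attributes of the last start tag seen."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive here already unescaped by HTMLParser.
        self.attrs = dict(attrs)


def extract_attrs(html_element):
    parser = AttrCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs


# The intended attribute value is the literal text "R&amp;D".
attrs = extract_attrs('<a href="/watch?title=R&amp;amp;D">')
once = attrs['href']    # unescaped once by the parser
twice = unescape(once)  # a second pass corrupts the value
```

Here `once` holds the intended value, while `twice` has lost one level
of escaping.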
Ref: #11219, #11522