Yen Chi Hsuan
efbed08dc2
[utils] Encode hostnames before passing to urllib
...
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError
Fixes #8890
9 years ago
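A minimal sketch of the idea in the commit above, assuming Python 3 and the standard `idna` codec: convert the hostname to its ASCII form before the URL reaches urllib. The helper name `encode_hostname` is hypothetical and is not the function added by the commit; non-ASCII paths and queries are out of scope here.

```python
from urllib.parse import urlsplit, urlunsplit


def encode_hostname(url):
    """Return url with its hostname converted to ASCII via the IDNA codec."""
    parts = urlsplit(url)
    if not parts.hostname:
        return url
    ascii_host = parts.hostname.encode('idna').decode('ascii')
    # Keep any "user:password@" prefix and re-attach the port if present.
    userinfo, at, _ = parts.netloc.rpartition('@')
    netloc = userinfo + at + ascii_host
    if parts.port is not None:
        netloc += ':%d' % parts.port
    return urlunsplit(parts._replace(netloc=netloc))


print(encode_hostname('http://bücher.example/path'))
```

ASCII-only URLs pass through unchanged, so the helper can be applied unconditionally.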
Jaime Marquínez Ferrándiz
782b1b5bd1
[utils] lookup_unit_table: Match word boundary instead of end of string
9 years ago
Jaime Marquínez Ferrándiz
09fc33198a
utils: lookup_unit_table: Use a stricter regex
...
In parse_count multiple units start with the same letter, so it would match different units depending on the order they were sorted when iterating over them.
9 years ago
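The two lookup_unit_table commits above can be illustrated with a short sketch. It assumes a unit table that maps unit strings to multipliers and matches the unit as a whole word, so that 'k' and 'kk' cannot be confused regardless of iteration order. The table contents below are illustrative, not copied from the source.

```python
import re


def lookup_unit_table(unit_table, s):
    """Parse '<number><unit>' and scale the number by the unit's multiplier."""
    units_re = '|'.join(re.escape(u) for u in unit_table)
    # \b forces the whole unit to match: in '1.5kk' the alternative 'k'
    # fails at the word boundary and the regex falls through to 'kk'.
    m = re.match(
        r'(?P<num>[0-9]+(?:[,.][0-9]*)?)\s*(?P<unit>%s)\b' % units_re, s)
    if not m:
        return None
    num = float(m.group('num').replace(',', '.'))
    return int(num * unit_table[m.group('unit')])


# Illustrative table; the real tables live in the callers.
COUNT_UNITS = {'k': 1000, 'kk': 1000 ** 2, 'm': 1000 ** 2}
print(lookup_unit_table(COUNT_UNITS, '1.5kk'))
```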
Sergey M․
810c10baa1
[utils] Use compat_xpath
9 years ago
Sergey M․
c5229f3926
[utils] PEP 8
9 years ago
remitamine
83548824c2
Merge pull request #8092 from bpfoley/twitter-thumbnail
...
[utils] Add extract_attributes for extracting html tag attributes
9 years ago
Sergey M․
2f7ae819ac
[utils] PEP 8
9 years ago
Sergey M․
fb47597b09
[bbc] Generalize unit table lookup and add parse_count
9 years ago
Yen Chi Hsuan
25cb05bda9
[utils] Remove codec2ext
...
This function was originally used for determining file extensions of DASH
formats. Now in DASH, ext is determined by mime_type. See #8766 for more
information.
9 years ago
Yen Chi Hsuan
6d210f2090
[utils] Add more codecs to codec2ext
...
BBC uses avc3. Here's an example (thanks to @remitamine for this example)
http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd
See also https://trac.ffmpeg.org/ticket/5217
9 years ago
Yen Chi Hsuan
19a17d4623
[utils] Add codec2ext
9 years ago
Jaime Marquínez Ferrándiz
3233a68fbb
[utils] update_url_query: Encode the strings in the query dict
...
The test case with {'test': '第二行тест'} was failing on python 2 (the non-ascii characters were replaced with '?').
9 years ago
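A rough Python 3 sketch of what update_url_query does, based only on the commit titles above: merge a dict into the URL's existing query string. On Python 3, `urlencode` percent-encodes non-ASCII text as UTF-8 by default, which is the behaviour the Python 2 fix had to reproduce by hand. This is not the project's actual implementation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def update_url_query(url, query):
    """Return url with the keys in `query` added to or replaced in its query string."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    params.update(query)
    # doseq=True expands the lists produced by parse_qs into repeated keys.
    new_query = urlencode(params, doseq=True)
    return urlunsplit(parts._replace(query=new_query))


print(update_url_query('http://example.com/path?a=1', {'a': '2', 'b': '3'}))
```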
remitamine
1255733945
Merge pull request #8739 from remitamine/update_url_params
...
[utils] add update_url_query function to create or update query string params
9 years ago
remitamine
38f9ef31dc
[utils] add update_url_query function
9 years ago
Yen Chi Hsuan
0cae023b24
Merge branch 'jython-support'
...
Closes #8302
9 years ago
Yen Chi Hsuan
8ee239e921
[utils] Jython support - handle filenames correctly
...
Now test:youtube downloads
9 years ago
Brian Foley
8bb56eeeea
[utils] Add extract_attributes for extracting html tag attributes
...
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
9 years ago
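One way to get the behaviour described above without regexps is to let the standard library's HTML parser handle a single start tag. The sketch below assumes Python 3's `html.parser`; the class name `_AttributeCollector` is hypothetical, and the code is an illustration rather than the committed implementation.

```python
from html.parser import HTMLParser


class _AttributeCollector(HTMLParser):
    """Records the attributes of the last start tag it sees."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases names and decodes entities in values;
        # dict() keeps the last value of a repeated attribute.
        self.attrs = dict(attrs)


def extract_attributes(html_element):
    """Return a dict of attribute name -> value for one HTML start tag."""
    parser = _AttributeCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs


print(extract_attributes('<e x="&amp;" Y=2 z a="" b=1 b=3>'))
```

An attribute without a value maps to `None`, while an explicitly empty value maps to the empty string.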
remitamine
e07237f640
[utils] remove check for val from find_xpath_attr
9 years ago
Yen Chi Hsuan
5eb6bdced4
[utils] Multiple changes to base_n()
...
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
9 years ago
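The five points listed above can be sketched as follows. The default 62-character table and the rejection of negative numbers are assumptions made for this illustration.

```python
DEFAULT_TABLE = (
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ')


def encode_base_n(num, n, table=None):
    """Encode a non-negative integer in base n using the given digit table."""
    if not table:
        table = DEFAULT_TABLE[:n]
    if n > len(table):
        # Point 3: invalid input raises ValueError, not AssertionError.
        raise ValueError('base %d exceeds table length %d' % (n, len(table)))
    if num < 0:
        raise ValueError('negative numbers are not supported')
    if num == 0:
        # Point 4: zero maps to the table's first character, not a literal '0'.
        return table[0]
    digits = ''
    while num:
        digits = table[num % n] + digits
        num //= n
    return digits


print(encode_base_n(80, 30))
```

Because the table is a parameter, callers can pass alphabets longer than 62 characters (point 2).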
Yen Chi Hsuan
680079be39
[utils] Relaxing regex in decode_packed_codes for vidzi
9 years ago
Yen Chi Hsuan
f52354a889
[utils] Move codes for handling eval() from iqiyi.py
9 years ago
Yen Chi Hsuan
59f898b7a7
[utils] Merge base_n functions
9 years ago
Yen Chi Hsuan
481888294d
[utils] Add base36 for use in Vidzi
9 years ago
Yen Chi Hsuan
81bdc8fdf6
[utils] Move base62 to utils
9 years ago
Sergey M․
f160785c5c
[utils] Remove AM/PM from unified_strdate patterns
9 years ago
Yen Chi Hsuan
b95dc034ca
[utils] Implement cache for OnDemandPagedList
9 years ago
remitamine
cafcf657a4
add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction
9 years ago
Yen Chi Hsuan
c1c05c67ea
[utils] Jython support - disable setproctitle() until ctypes is complete
9 years ago
Yen Chi Hsuan
399a76e67b
[utils] Jython support: tolerate missing fcntl module
9 years ago
Jaime Marquínez Ferrándiz
765ac263db
[utils] mimetype2ext: return 'm4a' for 'audio/mp4' ( fixes #8620 )
...
The youtube extractor was using 'mp4' for them, therefore filters like 'bestaudio[ext=m4a]' stopped working (94278f7202 broke it).
9 years ago
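To illustrate the fix above: the full MIME type has to be checked before falling back to the subtype, otherwise 'audio/mp4' and 'video/mp4' both become 'mp4'. The mapping entries below are illustrative assumptions, not the project's actual table.

```python
def mimetype2ext(mt):
    """Map a MIME type to a file extension, preferring full-type overrides."""
    # Overrides keyed on the complete type take precedence.
    full_type_map = {'audio/mp4': 'm4a'}
    if mt in full_type_map:
        return full_type_map[mt]
    # Otherwise look at the subtype only, defaulting to the subtype itself.
    subtype = mt.rpartition('/')[2]
    subtype_map = {'3gpp': '3gp', 'x-flv': 'flv'}
    return subtype_map.get(subtype, subtype)


print(mimetype2ext('audio/mp4'))
```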
Yen Chi Hsuan
5bc880b988
[utils] Add OHDave's RSA encryption function
9 years ago
Sergey M․
611c1dd96e
[refactor] Single quotes consistency
9 years ago
Sergey M․
d800609c62
[refactor] Do not specify redundant None as second argument in dict.get()
9 years ago
Sergey M․
9c7b38981c
[utils] Bump Firefox version in User-Agent
...
Old version number causes Youtube not to serve some formats in ytplayer.config
9 years ago
Sergey M․
8411229bd5
[utils] Allow dot in strip_jsonp
9 years ago
Sergey M․
86296ad2cd
[utils] Add ability to control skipping false values in dict_get
9 years ago
Sergey M․
cbecc9b903
[utils] Add dict_get convenience method
9 years ago
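A sketch of a dict_get helper consistent with the two commit titles above: try several keys in order and optionally skip falsy values. The parameter name `skip_false_values` is an assumption for this illustration.

```python
def dict_get(d, key_or_keys, default=None, skip_false_values=True):
    """Return the first usable value among several candidate keys."""
    if isinstance(key_or_keys, (list, tuple)):
        for key in key_or_keys:
            value = d.get(key)
            # None is always skipped; other falsy values ('' or 0, for
            # example) are skipped only when skip_false_values is set.
            if value is None or (skip_false_values and not value):
                continue
            return value
        return default
    return d.get(key_or_keys, default)


info = {'title': '', 'name': 'clip', 'views': 0}
print(dict_get(info, ('title', 'name')))
```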
Jaime Marquínez Ferrándiz
87de7069b9
[utils] dfxp2srt: make TTMLPElementParser inherit from object
...
For consistency between python 2 and 3.
9 years ago
remitamine
2b14cb566f
[utils] fix dfxp2srt text extraction( fixes #8055 )
9 years ago
Yen Chi Hsuan
a0d8d704df
[utils] Reorder items in mimetype2ext alphabetically
9 years ago
Yen Chi Hsuan
f6861ec96f
[utils] Add more items to mimetype2ext ( #8293 )
...
These are used in Youtube formats
9 years ago
remitamine
6ec6cb4e95
Revert "fix typos"
...
This reverts commit 36a0e46c39.
9 years ago
remitamine
36a0e46c39
fix typos
9 years ago
Jakub Wilk
dfb1b1468c
Fix typos
...
Closes #8200.
9 years ago
Sergey M․
a7aaa39863
[utils] Extract known extensions for reuse
9 years ago
Yen Chi Hsuan
c047270c02
[utils] Remove Content-encoding from headers after decompression
...
With cn_verification_proxy, our http_response() is called twice, once from
PerRequestProxyHandler.proxy_open() and another from normal
YoutubeDL.urlopen(). As a result, for proxies honoring Accept-Encoding, the
following bug occurs:
$ youtube-dl -vs --cn-verification-proxy https://secure.uku.im:993 "test:letv"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vs', '--cn-verification-proxy', 'https://secure.uku.im:993 ', 'test:letv']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.12.23
[debug] Git HEAD: 97f18fa
[debug] Python version 3.5.1 - Linux-4.3.3-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: ffmpeg 2.8.4, ffprobe 2.8.4, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.letv.com/ptv/vplay/22005890.html
[Letv] 22005890: Downloading webpage
[Letv] 22005890: Downloading playJson data
ERROR: Unable to download JSON metadata: Not a gzipped file (b'{"') (caused by OSError('Not a gzipped file (b\'{"\')',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/common.py", line 330, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1886, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "/usr/lib/python3.5/urllib/request.py", line 471, in open
response = meth(req, response)
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 773, in http_response
raise original_ioerror
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 761, in http_response
uncompressed = io.BytesIO(gz.read())
File "/usr/lib/python3.5/gzip.py", line 274, in read
return self._buffer.read(size)
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
9 years ago
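The failure above comes from decompressing the same body twice. A simplified sketch of the remedy, assuming headers are held in a plain dict (the real handler works on urllib response objects), is to drop Content-Encoding once the body has been decoded so that a second pass leaves it alone. The function name `decode_once` is hypothetical.

```python
import gzip
import io


def decode_once(headers, body):
    """Gunzip body if the headers say so, then clear the marker header."""
    if headers.get('Content-Encoding') == 'gzip':
        body = gzip.GzipFile(fileobj=io.BytesIO(body), mode='rb').read()
        # Without this, a second handler pass would try to gunzip plain data.
        del headers['Content-Encoding']
    return headers, body


raw = b'{"ok": true}'
headers, body = decode_once({'Content-Encoding': 'gzip'}, gzip.compress(raw))
headers, body = decode_once(headers, body)  # second pass is a no-op
print(body)
```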
Sergey M․
9b9c5355e4
Rename error_to_str to error_to_compat_str
9 years ago
Sergey M․
8e60dc7526
[utils] Add encode_compat_str
9 years ago
Sergey M․
fdae235858
[utils] Add error_to_str
9 years ago
Yen Chi Hsuan
db2fe38b55
[utils] Support alternative timestamp format in TTML
...
Fixes #7608
9 years ago
Yen Chi Hsuan
d631d5f9f2
[utils] Fix TTML conversion
...
Tolerate invalid timestamps (closes #7909 )
9 years ago
Sergey M․
31b2051e21
[utils] Add remove_quotes
9 years ago
Yen Chi Hsuan
992fc9d6e1
[utils] Refactor handle_youtubedl_headers for future extension
9 years ago
Yen Chi Hsuan
0424ec307b
[utils] Correct docstring of YoutubeDLHandler
9 years ago
Yen Chi Hsuan
87f0e62d94
[utils] Separate codes for handling Youtubedl-* headers
9 years ago
Sergey M․
67dda51722
Rename compat_urllib_request_Request to sanitized_Request and move to utils
9 years ago
Sergey M․
9cb9a5df77
[utils] Check ext with trailing slash against the list of known extensions
9 years ago
Sergey M․
3e12bc583a
[utils] Improve determine_ext ( Closes #7593 )
9 years ago
Sergey M․
7e1f5447e7
[utils] Improve encode_dict
9 years ago
Sergey M․
7a3f0c00ad
[utils] Style
9 years ago
Sergey M․
7aefc49c40
[utils] Skip invalid/non HTML entities ( Closes #7518 )
9 years ago
Jaime Marquínez Ferrándiz
6a75040278
[utils] unified_strdate: Return None if the date format can't be recognized ( fixes #7340 )
...
This issue was introduced with ae12bc3ebb; it returned 'None'.
9 years ago
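The distinction in the fix above is between the value `None` and the string `'None'`, which appears when an unparsed date is passed through a string conversion. A reduced sketch, with a hypothetical and much shorter format list than the real function uses:

```python
import datetime

# Hypothetical subset of formats, for illustration only.
DATE_FORMATS = ('%Y-%m-%d', '%d/%m/%Y', '%Y/%m/%d')


def unified_strdate(date_str):
    """Return the date as 'YYYYMMDD', or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.datetime.strptime(date_str, fmt)
        except ValueError:
            continue
        return parsed.strftime('%Y%m%d')
    # Return None itself; str(None) would yield the truthy string 'None'.
    return None


print(unified_strdate('2015-12-23'))
```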
Sergey M․
c90d16cf36
[utils:sanitize_path] Disallow trailing whitespace in path segment ( Closes #7332 )
9 years ago
Sergey M
30eecc6a04
Merge pull request #7296 from jaimeMF/xml_attrib_unicode
...
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
9 years ago
Sergey M․
ae12bc3ebb
[utils] Make unified_strdate always return unicode string
9 years ago
Sergey M․
578c074575
[utils] Support list of xpath in xpath_element
9 years ago
Sergey M․
52c3a6e49d
[utils] Improve parse_iso8601
9 years ago
Jaime Marquínez Ferrándiz
f78546272c
[compat] compat_etree_fromstring: also decode the text attribute
...
Deletes parse_xml from utils, because it also does it.
9 years ago
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x ( #7178 )
...
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
9 years ago
Sergey M․
d01949dc89
[utils:js_to_json] Fix bad escape in double quoted strings
9 years ago
Yen Chi Hsuan
1e399778ee
[letv] Fix extraction
...
Using data URIs for passing the decrypted M3U8 manifest, which is
supported by ffmpeg only.
9 years ago
Sergey M․
af98f8ff37
[utils] Return default on fail in int_or_none
9 years ago
Sergey M․
caf80631f0
[utils] Do not fail in float_or_none on non-numeric data
9 years ago
Sergey M․
1812afb7b3
[utils] Do not fail in int_or_none on non-numeric data ( Closes #7175 )
9 years ago
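The three fixes above share one idea: a conversion helper should fall back to a default instead of raising on unusable input. A minimal sketch, omitting any scaling parameters the real helpers may take:

```python
def int_or_none(v, default=None):
    """Convert v to int, returning default for None or non-numeric data."""
    if v is None:
        return default
    try:
        return int(v)
    except (ValueError, TypeError):
        return default


def float_or_none(v, default=None):
    """Convert v to float, returning default for None or non-numeric data."""
    if v is None:
        return default
    try:
        return float(v)
    except (ValueError, TypeError):
        return default


print(int_or_none('42'), int_or_none('n/a'), float_or_none('1.5'))
```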
Sergey M․
5a1a2e9454
[utils] Fix kwargs on old python 2 ( Closes #6905 )
9 years ago
Sergey M․
e28034c5ac
[utils] Comment cookie processing until result from travis and some more testing
9 years ago
Sergey M․
266e466ee4
[utils] Simplify cookie processor
9 years ago
Sergey M․
1639282434
[utils] Add encode_dict
9 years ago
Sergey M․
ad72917274
[utils] Add issue URL in comment for #6457
9 years ago
Sergey M․
a6420bf50c
[utils] Add cookie processor for cookie correction ( Closes #6769 )
9 years ago
Sergey M․
66e289bab4
[utils] Generalize cli option converters
9 years ago
Sergey M․
8e636da499
[utils] Improve xpath_text
9 years ago
Sergey M․
5d2354f177
[utils] Relax attribute key assert
9 years ago
Sergey M․
a41fb80ce1
[utils] Add xpath_element and xpath_attr
9 years ago
Sergey M․
e5e78797e6
[utils] Strict HTTP responses ( Closes #6727 )
9 years ago
Sergey M․
5a4d9ddb21
[utils] Percent-encode redirect URL of Location header ( Closes #6457 )
9 years ago
Sergey M․
51f267d9d4
[YoutubeDL:utils] Move percent encode non-ASCII URLs workaround to http_request and simplify ( Closes #6457 )
9 years ago
Sergey M․
ee114368ad
[utils] Make value optional for find_xpath_attr
...
This allows selecting particular attributes by name but without specifying the value and similar to xpath syntax `[@attrib]`
9 years ago
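A sketch of the optional-value behaviour described above, relying on the XPath subset supported by `xml.etree.ElementTree` (`[@attrib]` and `[@attrib='value']`). It assumes `key` and `val` contain no quote characters.

```python
import xml.etree.ElementTree as ET


def find_xpath_attr(node, xpath, key, val=None):
    """Find the first element matching xpath that has attribute key (= val)."""
    if val is None:
        predicate = '[@%s]' % key  # attribute present, any value
    else:
        predicate = "[@%s='%s']" % (key, val)  # attribute equals val
    return node.find(xpath + predicate)


doc = ET.fromstring(
    '<root><node/><node x="a"/><node y="c"/><node x="b" y="d"/></root>')
print(find_xpath_attr(doc, './/node', 'x').attrib)
```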
Raphael Michel
2c7ed24796
Remove redundant (and wrong) class parameters
9 years ago
Yen Chi Hsuan
9c29bc69f7
[utils] Improve parse_duration
...
Now dots are parsed. For example '87 Min.'
9 years ago
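As a narrow illustration of the change above, the sketch below handles only the minutes case and tolerates a trailing dot after the unit. The function name `parse_minutes` and its regex are hypothetical; the real parse_duration covers many more formats.

```python
import re


def parse_minutes(s):
    """Parse strings such as '87 Min.' into a duration in seconds."""
    m = re.match(
        r'\s*(?P<mins>[0-9]+)\s*min(?:ute)?s?\.?\s*$', s, re.IGNORECASE)
    if not m:
        return None
    return int(m.group('mins')) * 60


print(parse_minutes('87 Min.'))
```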
Sergey M․
bf42a9906d
[utils] Add default value for xpath_text
10 years ago
Yen Chi Hsuan
4eb10f6621
[utils] Add ISO3166Utils
10 years ago
Yen Chi Hsuan
4e33577173
[utils] Support ttaf1 namespace in TTML
...
It's found in bbc.co.uk. See #6038
10 years ago
Yen Chi Hsuan
396726244a
[utils/ffmpeg] Move ISO 639 related codes to utils
10 years ago
Yen Chi Hsuan
ecee572411
[yahoo] Add support for closed captions ( closes #5714 )
10 years ago
Yen Chi Hsuan
1b0427e6c4
[utils] Support TTML without default namespace
...
In a strict sense such TTML is invalid, but Yahoo uses it.
10 years ago
Yen Chi Hsuan
c1c924abfe
[utils,common] Merge format_srt_time and _subtitles_timecode
...
format_srt_time uses a comma as the delimiter between seconds and
milliseconds while _subtitles_timecode uses a dot. All .srt examples I
found on the Internet use a comma, so I use a comma in the merged
version. See http://matroska.org/technical/specs/subtitles/srt.html and
http://devel.aegisub.org/wiki/SubtitleFormats/SRT
10 years ago
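A sketch of an SRT timecode formatter that uses a comma between seconds and milliseconds, as the commit above settles on. The function name and the integer-millisecond arithmetic are choices made for this illustration, not taken from the merged code.

```python
def srt_subtitles_timecode(seconds):
    """Format a duration in seconds as an SRT timecode, HH:MM:SS,mmm."""
    # Work in whole milliseconds to avoid floating-point truncation.
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3600 * 1000)
    minutes, rest = divmod(rest, 60 * 1000)
    secs, ms = divmod(rest, 1000)
    return '%02d:%02d:%02d,%03d' % (hours, minutes, secs, ms)


print(srt_subtitles_timecode(3661.25))
```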
Yen Chi Hsuan
7dff03636a
[utils] Support 'dur' field in TTML
10 years ago
Yen Chi Hsuan
d39e0f05db
[utils] Remove sanitize_url_path_consecutive_slashes()
...
This function is used only in SohuIE, which is updated to use a new
extraction logic.
10 years ago
Jaime Marquínez Ferrándiz
541168039d
[utils] get_exe_version: encode executable name ( fixes #5647 )
...
It failed in python 2.x when $PATH contains a directory with non-ascii characters.
10 years ago