314 Commits (47d205a6460a696514f6485a516358513ab880b6)

Author SHA1 Message Date
Sergey M․ 27713812a0 [extractor/common] Add method for extracting form hidden input fields as dict 9 years ago
Yen Chi Hsuan 13af92fdc4 [common] Add 'fatal' to _extract_m3u8_formats 9 years ago
Sergey M․ 5414623791 [extractor/common] Remove superfluous line 9 years ago
Sergey M․ c342041fba [extractor/common] Use NO_DEFAULT from utils 9 years ago
Yen Chi Hsuan 621ed9f5f4 [common] Add note and errnote field for _extract_m3u8_formats 10 years ago
Sergey M․ baa43cbaf0 [extractor/common] Relax valid url check verbosity 10 years ago
Yen Chi Hsuan c1c924abfe [utils,common] Merge format_srt_time and _subtitles_timecode
format_srt_time uses a comma as the delimiter between seconds and
milliseconds while _subtitles_timecode uses a dot. All .srt examples I
found on the Internet use a comma, so I use a comma in the merged
version. See http://matroska.org/technical/specs/subtitles/srt.html and
http://devel.aegisub.org/wiki/SubtitleFormats/SRT
10 years ago
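A minimal sketch of the comma-delimited timecode described in the commit above. The function name `srt_timecode` is hypothetical and is not the helper the commit actually merged; it only illustrates the `HH:MM:SS,mmm` layout that SRT uses.

```python
def srt_timecode(seconds):
    """Format a non-negative number of seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    # Work in whole milliseconds to avoid float artifacts in the last field.
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3600 * 1000)
    minutes, rem = divmod(rem, 60 * 1000)
    secs, millis = divmod(rem, 1000)
    # SRT separates seconds and milliseconds with a comma, not a dot.
    return '%02d:%02d:%02d,%03d' % (hours, minutes, secs, millis)
```

For example, `srt_timecode(3661.5)` yields `01:01:01,500`.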
Yen Chi Hsuan 05d5392cda [common] Ignore subtitles in m3u8 10 years ago
Sergey M․ 74f728249f [extractor/common] Fallback to empty string for (yet) missing `format_id` in `_sort_formats` (Closes #5624) 10 years ago
Jaime Marquínez Ferrándiz 2ddcd88129 Remove code that was only used by the Grooveshark extractor 10 years ago
zouhair cf0649f8b7 Typo: twice "the the" to "the" 10 years ago
Sergey M․ 3ded7bac16 [extractor/common] Add ability to specify custom field preference for `_sort_formats` 10 years ago
Jaime Marquínez Ferrándiz 08f2a92c9c InfoExtractor._search_regex: Suggest updating when the regex is not found (suggested in #5442)
Reuse the same message from ExtractorError
10 years ago
Yen Chi Hsuan c9a779695d [extractor/common] Add the encoding parameter
The QQMusic info extractor needs a forced encoding to work correctly.
10 years ago
Sergey M․ 830d53bfae [utils] Add `video_title` for `url_result` 10 years ago
Sergey M․ e21a55abcc [extractor/common] Remove f4m section
It's now provided by `f4m_id`
10 years ago
Sergey M․ 4a34f69ea6 [extractor/common] Add subtitles timecode formatter 10 years ago
Sergey M․ f207019ce5 [extractor/common] Remove 'm3u8' from quality selection URL 10 years ago
Sergey M․ 8dc9d361c2 [extractor/common] Fix format_id when `last_media` is None and always include `m3u8_id` if present
The rationale behind `m3u8_id` was to resolve duplicates when processing several m3u8 playlists within the same media that give equal resulting `format_id`'s,
e.g. `youtube-dl http://www.rts.ch/play/tv/passe-moi-les-jumelles/video/la-fee-des-bois-mustang-les-chemins-du-vent?id=3854925 -F`
10 years ago
Philipp Hagemeister a0bb7c5593 [extractor/common] Improve m3u format IDs (#5143) 10 years ago
Sergey M․ 2f0f6578c3 [extractor/common] Assume non HTTP(S) URLs valid 10 years ago
Philipp Hagemeister 72a406e7aa [extractor/common] Pass in video_id (#5057) 10 years ago
Antti Ajanki 6f4ba54079 [extractor/common] Extract HTTP (possibly f4m) URLs from a .smil file 10 years ago
Antti Ajanki 637570326b [extractor/common] Extract the first of a seq of videos in a .smil file 10 years ago
Jaime Marquínez Ferrándiz bfc993cc91 Merge branch 'subtitles-rework'
(Closes PR #4964)
10 years ago
Sergey M․ 9fe6ef7ab2 [extractor/common] Fix preference for m3u8 quality selection URL 10 years ago
Philipp Hagemeister 8fb3ac3649 PEP8: W503 10 years ago
Philipp Hagemeister 77b2986b5b [extractor/common] Recognize Indian censorship (#5021) 10 years ago
Jaime Marquínez Ferrándiz 9868ea4936 [extractor/common] Simplify subtitles handling methods
Initially I was going to use a single method for handling both subtitles and automatic captions, that's why I used the 'list_subtitles' and the 'subtitles' variables.
10 years ago
Philipp Hagemeister fa15607773 PEP8 fixes 10 years ago
Jaime Marquínez Ferrándiz 4cd95bcbc3 [twitch:stream] Prefer the 'source' format (fixes #4972) 10 years ago
Sergey M․ 4069766c52 [extractor/common] Test URLs with GET 10 years ago
Jaime Marquínez Ferrándiz 360e1ca5cc [youtube] Convert to new subtitles system
The automatic captions are stored in the 'automatic_captions' field, which is used if no normal subtitles are found for a specific language.
10 years ago
Jaime Marquínez Ferrándiz c84dd8a90d [YoutubeDL] store the subtitles to download in the 'requested_subtitles' field
We need to keep the original subtitles information, so that the '--load-info' option can be used to list or select the subtitles again.
We'll also be able to have a separate field for storing the automatic captions info.
10 years ago
Jaime Marquínez Ferrándiz a504ced097 Improve subtitles support
For each language the extractor builds a list with the available formats sorted (like for video formats), then YoutubeDL selects one of them using the '--sub-format' option which now allows giving the format preferences (for example 'ass/srt/best').
For each format the 'url' field can be set so that we only download the contents if needed, or if the contents need to be processed (like in crunchyroll) the 'data' field can be used.

The reasons for this change are:
* We weren't checking that the format given with '--sub-format' was available, checking it in each extractor would be repetitive.
* It makes it easy to support giving a format preference.
* The subtitles were automatically downloaded in the extractor, but I think that if you use for example the '--dump-json' option you want to finish as fast as possible.

Currently only the ted extractor has been updated, but the old system still works.
10 years ago
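The preference-based selection described in the commit above could look roughly like the sketch below. The function name `select_subtitle_format`, the worst-to-best ordering of the list, and the `None` return when nothing matches are all assumptions made for illustration; the real selection logic in YoutubeDL may differ.

```python
def select_subtitle_format(available, preference='best'):
    """Pick one entry from `available` using a slash-separated preference
    string such as 'ass/srt/best'.

    `available` is assumed to be a list of dicts with an 'ext' key,
    sorted from worst to best (hypothetical layout).
    """
    if not available:
        return None
    for wanted in preference.split('/'):
        if wanted == 'best':
            # 'best' means the last entry of the sorted list.
            return available[-1]
        for fmt in available:
            if fmt['ext'] == wanted:
                return fmt
    # None of the requested formats is available.
    return None
```

With `[{'ext': 'vtt'}, {'ext': 'srt'}]` and the preference `'ass/srt/best'`, the sketch skips the missing `ass` and returns the `srt` entry.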
Philipp Hagemeister 03cd72b007 [extractor/common] Move up filesize
filesize and tbr should correlate, so it doesn't make sense to treat them differently.
10 years ago
Jaime Marquínez Ferrándiz 6ca7732d5e [extractor/common] Fix link to external documentation 10 years ago
Jaime Marquínez Ferrándiz 2d30521ab9 [youtube] Extract average rating (closes #2362) 10 years ago
Philipp Hagemeister 9650885be9 [escapist] Filter video differently (Fixes #4919) 10 years ago
Philipp Hagemeister 7e5db8c930 [options] Add --no-color 10 years ago
Philipp Hagemeister 3a5bcd0326 [extractor/common] Wrap extractor errors (Fixes #1194)
For now, we just wrap some common errors. More may follow. We do not want to catch actual programming errors in the extractors, such as 1 // 0.
10 years ago
Naglis Jonaitis 69319969de [extractor/common] Add new helper method _family_friendly_search 10 years ago
Philipp Hagemeister 1e1896f2de [extractor/common] Correct sort order.
We should look at height and width before ext_preference.
10 years ago
Sergey M․ 3900eec27c [extractor/common] Fix 2.0 manifest extraction (Closes #4830) 10 years ago
Sergey M․ 60ca389c64 [extractor/common] Prefix f4m/m3u8 entries with identifier 10 years ago
Philipp Hagemeister 9bb8e0a3f9 [wsj] Add new extractor (Fixes #4854) 10 years ago
Philipp Hagemeister 1a6373ef39 [sort_formats] Prefer bitrate over video size
720p @ 1000KB/s looks way better than 1080p @ 500KB/s
10 years ago
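As a rough illustration of the ordering argued for in the commit above, the sketch below ranks formats by total bitrate first and by frame size second. The key function `format_sort_key` and the worst-to-best ordering are assumptions for this example; the real `_sort_formats` considers many more fields.

```python
def format_sort_key(fmt):
    """Sort key that ranks bitrate ahead of frame size (illustrative only)."""
    # Missing values are treated as -1 so they sort below any real value.
    return (
        fmt.get('tbr') or -1,
        fmt.get('height') or -1,
        fmt.get('width') or -1,
    )


formats = [
    {'format_id': '1080p-low', 'tbr': 500, 'height': 1080, 'width': 1920},
    {'format_id': '720p-high', 'tbr': 1000, 'height': 720, 'width': 1280},
]
# Worst first, best last: the 720p entry wins because of its higher bitrate.
formats.sort(key=format_sort_key)
```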
Philipp Hagemeister 995029a142 [nerdist] Add new extractor (Fixes #4851) 10 years ago
Philipp Hagemeister b04b885271 [extractor/common] Document all protocol values 10 years ago
Sergey M․ 96a53167fa [common] Generalize URLs' HTTP errors pre-testing 10 years ago
Philipp Hagemeister 3dee7826e7 [rtl2] PEP8, simplify, make rtmp tests run (#470) 10 years ago
Philipp Hagemeister cfb56d1af3 Add --list-thumbnails 10 years ago
Jaime Marquínez Ferrándiz e1554a407d [extractors] Use http_headers for setting the User-Agent and the Referer 10 years ago
Philipp Hagemeister 121c09c7be Merge remote-tracking branch 'Dineshs91/f4m-2.0' 10 years ago
Philipp Hagemeister 6271f1cad9 [youtube|ffmpeg] Automatically correct video with non-square pixels (Fixes #4674) 10 years ago
Philipp Hagemeister ff21a8e0ee Merge remote-tracking branch 'Tithen-Firion/master' 10 years ago
Philipp Hagemeister dd622d7c4e [netzkino] Add new extractor (Fixes #4669) 10 years ago
Philipp Hagemeister bec2248141 [InfoExtractor/common] Correct and test meta tag matching 10 years ago
Philipp Hagemeister 0590062925 Respect age_limit when listing extractors (Fixes #4653) 10 years ago
Philipp Hagemeister e65566a9cc [youtube] Correct handling when DASH manifest is not necessary to find all formats 10 years ago
Sergey M․ 6c6f1408f2 [extractor/common] Allow multiline content tags 10 years ago
Jaime Marquínez Ferrándiz 5d3808524d [extractor/common] Update docstring: replace FileDownloader with YoutubeDL 10 years ago
Philipp Hagemeister bf94e38d3d Merge remote-tracking branch 'Tithen-Firion/hsw-update' 10 years ago
Philipp Hagemeister f5e43bc695 [vine] Provide alt_title (Fixes #4448) 10 years ago
Sergey M․ e89a2aabed [extractor/common] Add generic SMIL formats extraction routine 10 years ago
Philipp Hagemeister f58766ce5c [extractor/common] Document ie_key in url results 10 years ago
Sergey M․ acf5cbfe93 [extractor/common] Add description to playlist_result 10 years ago
Philipp Hagemeister b82f815f37 Allow iterators for playlist result entries 10 years ago
Tithen-Firion ebb6419960 [common] Split _download_json
Add ability for extractor to use _parse_json
10 years ago
Tithen-Firion 995ad69c54 [common] Add new parameters for _download_webpage 10 years ago
Philipp Hagemeister 810fb84d5e pep8 and minor beautification all around 10 years ago
Jaime Marquínez Ferrándiz 42939b6129 [youtube] Use a cookie for setting the language
This way, we don't have to do an additional request
10 years ago
Philipp Hagemeister 4e262a8838 [generic] Detect direct video links (Fixes #4149, #4313) 10 years ago
Jouke Waleson 9e1a5b8455 PEP8: applied even more rules 10 years ago
Jouke Waleson 5f6a1245ff PEP8 applied 10 years ago
Philipp Hagemeister fed5d03260 [extractor/common] Document _type values (Motivated by #4254) 10 years ago
Philipp Hagemeister aff2f4f4f5 [arte] Clean up format sorting mess
We now use our standard sorting facilities. As a side effect, it's finally possible to download German videos from French URLs and vice versa.
10 years ago
Philipp Hagemeister 711ede6e1b [heise] Fix description, thumbnail and format ID 10 years ago
Philipp Hagemeister 8c25f81bee [util] Move compatibility functions out of util
utils is large enough without these compatibility functions.

Everything that is present in newer versions of Python (i.e. with dev Python it's just an import) goes into compat.py .
Everything else (i.e. youtube-dl-specific helpers) goes into utils.py .
10 years ago
Philipp Hagemeister 2c8e03d937 Sort formats by fps as well 10 years ago
Philipp Hagemeister fbb21cf528 [youtube] Add formats 298, 299 (Fixes #4056) 10 years ago
Philipp Hagemeister 81515ad9f6 [extractor/common] Improve m3u8 output 10 years ago
Philipp Hagemeister 23be51d8ce [generic] Handle audio streams that do not implement HEAD (Fixes #4032) 10 years ago
Philipp Hagemeister c64ed2a310 [viddler] Use API 10 years ago
Philipp Hagemeister 1ede5b2481 [glide] Simplify 10 years ago
dinesh 7a47d07c6d [extractor/common] href attribute added 10 years ago
dinesh 34e48bed3b [extractor/common] Added support for f4m manifest Version 2.0 10 years ago
Sergey M․ 5f58165def [extractor/common] Fix dumping requests with long file abspath on Windows 10 years ago
Philipp Hagemeister d838b1bd4a [utils] Default age_limit to None
If we can't parse it, it means we don't have any information, not that the content is unrestricted.
10 years ago
Philipp Hagemeister e7b6d12254 [utils] Improve and test js_to_json 10 years ago
Philipp Hagemeister b14f3a4c1d [golem] Simplify (#3828) 10 years ago
Philipp Hagemeister ed9266db90 [common] Add new helper function _match_id 10 years ago
Philipp Hagemeister f4b1c7adb8 [muenchentv] Move live title generation to common 10 years ago
Philipp Hagemeister f0b5d6af74 [vevo] Support 1080p videos (Fixes #3656) 10 years ago
Philipp Hagemeister 7267bd536f [muenchentv] Add support (Fixes #3507) 10 years ago
Sergey M․ 9ebf22b7d9 [common] Improve codecs extraction from m3u8 10 years ago
Philipp Hagemeister daebaab692 [extractor/common] Correct typo 10 years ago
Philipp Hagemeister 3524cc25ca [sportdeutschland] Add support for more plain videos 10 years ago
Philipp Hagemeister f1a9d64eea [extractor/common] Modernize 10 years ago
Philipp Hagemeister da9ec3b932 [musicvault] Add extractor (Fixes #3593) 10 years ago
Philipp Hagemeister 704df56da7 [sportdeutschland] add new extractor 10 years ago
Philipp Hagemeister b252735910 [extractor/common] Generate better f4m format IDs 10 years ago
Philipp Hagemeister 9480d1a566 Merge remote-tracking branch 'riking/twofactor' 10 years ago
Philipp Hagemeister d769be6c96 [grooveshark,http] Make HTTP POST downloads work 10 years ago
Philipp Hagemeister a36819731b [escapist] Add support for og:video:url (Fixes #3557) 10 years ago
riking 165250ff5e Remove debug prints 10 years ago
riking 83317f6938 [youtube] Add two-factor account signin (TOTP only)
Additional work is required to prompt the user for the SMS or phone call codes, as there is no framework currently to prompt the user during an extraction operation.

Fixes #3533
10 years ago
Jaime Marquínez Ferrándiz f036a6328e [extractor/common] _extract_f4m_formats: Use more specific messages when downloading the manifest 10 years ago
Jaime Marquínez Ferrándiz 31bb8d3f51 [bloomberg] Extract the available formats (closes #2776)
It uses a helper method in the InfoExtractor class.
The downloader will pick the requested formats using the bitrate in the info dict.
10 years ago
Philipp Hagemeister c3415d1bac [extractor/common] PEP8 10 years ago
Philipp Hagemeister b090af5922 [vube] Fix comment count 10 years ago
Philipp Hagemeister 1a30deca50 [teachertube] Fix title and playlist recognition 10 years ago
Philipp Hagemeister 9732d77ed2 [snotr] PEP8 and minor fixes (#3296) 10 years ago
Philipp Hagemeister 40c696e5c6 [screencast] Add support for more video types (#3236) 10 years ago
Philipp Hagemeister 4094b6e36d [vodlocker] PEP8, generalization, and simplification (#3223) 10 years ago
Jaime Marquínez Ferrándiz 78338f71ca [livestream:original] Add support for folder urls (closes #2631)
The webpage only contains shortened links for the videos; since the server
doesn't support HEAD requests, we use a specific extractor for them.
11 years ago
Philipp Hagemeister d551980823 [spiegeltv] Simplify and PEP8 11 years ago
Philipp Hagemeister ad3bc6acd5 Document and test categories (#2923) 11 years ago
Philipp Hagemeister 5afa7f8bee [extractor/common] --write-pages: Correct file name if video_id is None 11 years ago
Philipp Hagemeister 57c7411f46 [mixcloud] Shed API dependency (#2904) 11 years ago
Philipp Hagemeister c1bce22f23 [extractor/common] Protect against long video IDs and URLs 11 years ago
Philipp Hagemeister 2099125333 [soundcloud/generic] Add support for playlists 11 years ago
Philipp Hagemeister 28746fbd59 [bilibili] Add preliminary support (#2174)
The URL http://www.bilibili.tv/video/av636603/index_2.html does not work yet.
11 years ago
Anisse Astier ec0fafbb19 [extractor/common] fallback on utf-8 when charset is not found
fixes #2721
11 years ago
Philipp Hagemeister b6cfde99b7 Only mention websense URL once 11 years ago
Philipp Hagemeister 2410c43d83 Detect Websense censorship (Fixes #2670) 11 years ago
Philipp Hagemeister 38d63d846e [extractor/common] Clarify preference key in formats 11 years ago
Philipp Hagemeister 955c451456 Rename upload_timestamp to timestamp 11 years ago
Philipp Hagemeister 9d2ecdbc71 [vevo] Centralize timestamp handling 11 years ago
Philipp Hagemeister 5a25f39653 Correct extractor documentation 11 years ago
Philipp Hagemeister 9f62eaf4ef [canal13cl] Add test and improve extraction (#2498) 11 years ago
Philipp Hagemeister 0afef30b23 Add display_id field 11 years ago
Philipp Hagemeister 81c2f20b53 [youtube] Correct invalid JSON (Fixes #2353) 11 years ago
dst c1206423c4 Fix extraction of og content in single quotes 11 years ago
Jaime Marquínez Ferrándiz 0c708f11cb [bloomberg] Fix ooyala url extraction
Added a helper method to InfoExtractor for searching the ‘twitter:player’ meta property.
Now the OoyalaIE also recognizes the ‘ec’ parameter in the url as the embed code.
11 years ago
Philipp Hagemeister 7e8caf30c0 Throw an error if no video formats are found 11 years ago
Philipp Hagemeister db1f388878 [huffpost] Add support 11 years ago
Jaime Marquínez Ferrándiz 944d65c762 [extractor/common] Encode the url when calculating the md5 with `--write-pages` option
This doesn’t cause any problem in python 2.*, but on python 3 the `md5` function only accepts bytes.
11 years ago
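The Python 2 versus Python 3 difference mentioned in the commit above comes down to `hashlib.md5` requiring bytes on Python 3. A minimal sketch, with the helper name `url_md5` and the choice of UTF-8 being assumptions for illustration:

```python
import hashlib


def url_md5(url):
    """Return the hex MD5 digest of a text URL."""
    # On Python 3, hashlib.md5() raises TypeError for str input,
    # so the text is encoded to bytes first.
    return hashlib.md5(url.encode('utf-8')).hexdigest()
```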
Philipp Hagemeister 1394ce65b4 [youtube] Add new formats (Fixes #2221) 11 years ago
Philipp Hagemeister 50317b111d Merge branch 'youtube-dash-manifest'
Conflicts:
	youtube_dl/extractor/youtube.py
11 years ago
Philipp Hagemeister 9d4288b2d4 [extractor/common] Clarify when and when not we generate the filename 11 years ago
Philipp Hagemeister b60016e831 Deal with implicitly UTF-16 decoded webpages
These webpages don't specify an encoding and rely on the BOM
11 years ago
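One way to handle pages that signal their encoding only through a byte order mark, as described in the commit above, is to check the first bytes before decoding. The function name `decode_with_bom` and the UTF-8 fallback are assumptions for this sketch, not the code the commit introduced.

```python
import codecs


def decode_with_bom(raw, fallback='utf-8'):
    """Decode raw page bytes, honouring a UTF-16 byte order mark if present."""
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The 'utf-16' codec reads the BOM, picks the byte order and strips it.
        return raw.decode('utf-16')
    # No BOM: fall back to the given encoding, replacing undecodable bytes.
    return raw.decode(fallback, 'replace')
```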
Philipp Hagemeister dd27fd1739 [youtube] Download DASH manifest
If given, download and parse the DASH manifest file, in order to get ultra-HQ formats.
Fixes #2166
11 years ago
Philipp Hagemeister 3ec05685f7 [extractor/common] Limit --write-pages filename to 200 chars
This avoids problems with very long URLs.
11 years ago
Philipp Hagemeister 9933b57430 [pornhub] Use centralized sorting 11 years ago
Philipp Hagemeister 3d3538e422 [khanacademy] Add support (Fixes #2066) 11 years ago
Philipp Hagemeister 5d73273f6f [orf] Use new extraction method (Fixes #2057) 11 years ago
Philipp Hagemeister 9887c9b2d6 [jpopsuki] Simplify 11 years ago
Philipp Hagemeister 08d13955dd [wistia] Prefer original video format above all others
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
11 years ago
Philipp Hagemeister 5d4f3985be Document that format_id field should be present 11 years ago