Philipp Hagemeister
ad3bc6acd5
Document and test categories ( #2923 )
11 years ago
Philipp Hagemeister
5afa7f8bee
[extractor/common] --write-pages: Correct file name if video_id is None
11 years ago
Philipp Hagemeister
57c7411f46
[mixcloud] Shed API dependency ( #2904 )
11 years ago
Philipp Hagemeister
c1bce22f23
[extractor/common] Protect against long video IDs and URLs
11 years ago
Philipp Hagemeister
2099125333
[soundcloud/generic] Add support for playlists
11 years ago
Philipp Hagemeister
28746fbd59
[bilibili] Add preliminary support ( #2174 )
...
The URL http://www.bilibili.tv/video/av636603/index_2.html does not work yet.
11 years ago
Anisse Astier
ec0fafbb19
[extractor/common] fallback on utf-8 when charset is not found
...
fixes #2721
11 years ago
Philipp Hagemeister
b6cfde99b7
Only mention websense URL once
11 years ago
Philipp Hagemeister
2410c43d83
Detect Websense censorship ( Fixes #2670 )
11 years ago
Philipp Hagemeister
38d63d846e
[extractor/common] Clarify preference key in formats
11 years ago
Philipp Hagemeister
955c451456
Rename upload_timestamp to timestamp
11 years ago
Philipp Hagemeister
9d2ecdbc71
[vevo] Centralize timestamp handling
11 years ago
Philipp Hagemeister
5a25f39653
Correct extractor documentation
11 years ago
Philipp Hagemeister
9f62eaf4ef
[canal13cl] Add test and improve extraction ( #2498 )
11 years ago
Philipp Hagemeister
0afef30b23
Add display_id field
11 years ago
Philipp Hagemeister
81c2f20b53
[youtube] Correct invalid JSON ( Fixes #2353 )
11 years ago
dst
c1206423c4
Fix extraction of og content in single quotes
11 years ago
Jaime Marquínez Ferrándiz
0c708f11cb
[bloomberg] Fix ooyala url extraction
...
Added a helper method to InfoExtractor for searching the ‘twitter:player’ meta property.
Now the OoyalaIE also recognizes the ‘ec’ parameter in the url as the embed code.
11 years ago
Philipp Hagemeister
7e8caf30c0
Throw an error if no video formats are found
11 years ago
Philipp Hagemeister
db1f388878
[huffpost] Add support
11 years ago
Jaime Marquínez Ferrándiz
944d65c762
[extractor/common] Encode the url when calculating the md5 with `--write-pages` option
...
This doesn’t cause any problem in python 2.*, but on python 3 the `md5` function only accepts bytes.
11 years ago
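The Python 3 behaviour described in the commit above can be illustrated with a minimal sketch. The helper name `url_digest` and the choice of UTF-8 are assumptions made for illustration; this is not the code from the commit.

```python
import hashlib


def url_digest(url):
    """Return the hex MD5 digest of a URL given as text.

    Hypothetical helper: on Python 3, hashlib.md5() rejects str and
    only accepts bytes, so the URL is encoded first (UTF-8 assumed).
    """
    return hashlib.md5(url.encode("utf-8")).hexdigest()
```

Passing the unencoded string to `hashlib.md5` on Python 3 raises `TypeError`, which is the failure the commit message refers to.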
Philipp Hagemeister
1394ce65b4
[youtube] Add new formats ( Fixes #2221 )
11 years ago
Philipp Hagemeister
50317b111d
Merge branch 'youtube-dash-manifest'
...
Conflicts:
youtube_dl/extractor/youtube.py
11 years ago
Philipp Hagemeister
9d4288b2d4
[extractor/common] Clarify when and when not we generate the filename
11 years ago
Philipp Hagemeister
b60016e831
Deal with implicitly UTF-16 decoded webpages
...
These webpages don't specify an encoding and rely on the BOM
11 years ago
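As a rough illustration of relying on the BOM when no encoding is declared, the sketch below checks the first bytes of a page for a UTF-16 byte order mark. The function name `sniff_bom` is hypothetical, and the commit's actual check may differ.

```python
import codecs


def sniff_bom(data):
    """Return "utf-16" if the bytes start with a UTF-16 BOM, else None.

    Hypothetical helper for illustration only.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None
```

When a BOM is present, Python's `utf-16` codec uses it to pick the byte order and strips it from the decoded text.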
Philipp Hagemeister
dd27fd1739
[youtube] Download DASH manifest
...
If given, download and parse the DASH manifest file, in order to get ultra-HQ formats.
Fixes #2166
11 years ago
Philipp Hagemeister
3ec05685f7
[extractor/common] Limit --write-pages filename to 200 chars
...
This avoids problems with very long URLs.
11 years ago
Philipp Hagemeister
9933b57430
[pornhub] Use centralized sorting
11 years ago
Philipp Hagemeister
3d3538e422
[khanacademy] Add support ( Fixes #2066 )
11 years ago
Philipp Hagemeister
5d73273f6f
[orf] Use new extraction method ( Fixes #2057 )
11 years ago
Philipp Hagemeister
9887c9b2d6
[jpopsuki] Simplify
11 years ago
Philipp Hagemeister
08d13955dd
[wistia] Prefer original video format above all others
...
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1 GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
11 years ago
Philipp Hagemeister
5d4f3985be
Document that format_id field should be present
11 years ago
Philipp Hagemeister
7217e148fb
[yahoo] Use centralized sorting, and add tbr field
11 years ago
Philipp Hagemeister
c7deaa4c74
[zdf] Use centralized sorting
11 years ago
Philipp Hagemeister
e6812ac99d
[spiegel] Use centralized sorting
11 years ago
Philipp Hagemeister
4bcc7bd1f2
Add temporary _sort_formats helper function
11 years ago
Philipp Hagemeister
f49d89ee04
Add a resolution field and improve general --list-formats output
11 years ago
Philipp Hagemeister
f45f96f8f8
[myvideo] Use RTMP instead of RTMPT ( Fixes #2032 )
11 years ago
Philipp Hagemeister
1538eff6d8
[bliptv] Remove support for direct downloads
...
This is now handled by the generic IE
11 years ago
Philipp Hagemeister
aa94a6d315
[aparat] Add support ( Fixes #2012 )
11 years ago
Jaime Marquínez Ferrándiz
c0d0b01f0e
[generic] Detect ooyala videos ( fixes #2013 )
11 years ago
Philipp Hagemeister
46374a56b2
[youtube] Do not warn for videos with allow_rating=0
...
This fixes #1982
Test video: http://www.youtube.com/watch?v=gi2uH3YxohU
11 years ago
Itay Brandes
87a28127d2
_search_regex's "isatty" call fails with Py2exe's
...
_search_regex calls the sys.stderr.isatty() function for unix systems.
Py2exe uses a custom Stderr() stream which doesn't have an `isatty()`
function, leading to a crash.
Fixed by checking that it's a unix system first.
11 years ago
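A minimal sketch of the ordering described in the commit above: test the platform before touching `isatty()`. The function name `supports_color` and its parameters are hypothetical, and the extra `getattr` guard is an addition for illustration that the commit message does not mention.

```python
import os
import sys


def supports_color(stream=None, os_name=None):
    """Decide whether colored output is safe for the given stream.

    Hypothetical helper. The platform is checked first, so isatty()
    is never called on Windows, where a replacement stderr object
    (such as the one described for Py2exe) may not provide it.
    """
    stream = sys.stderr if stream is None else stream
    os_name = os.name if os_name is None else os_name
    if os_name == "nt":
        return False
    # Extra guard (an assumption, not from the commit): tolerate
    # stream objects that have no isatty() at all.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())
```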
Philipp Hagemeister
d67b0b1596
Reorder info_dict documentation
11 years ago
Philipp Hagemeister
c0ba0f4859
Document duration field
11 years ago
Philipp Hagemeister
e2b38da931
[mtv] Fixup incorrectly encoded XML documents
11 years ago
Philipp Hagemeister
7cc3570e53
Add fatal=False parameter to _download_* functions.
...
This allows us to simplify the calls in the youtube extractor even further.
11 years ago
Philipp Hagemeister
19e3dfc9f8
[9gag] Like/dislike count ( #1895 )
11 years ago
Philipp Hagemeister
aaebed13a8
[smotri] Simplify
11 years ago
Philipp Hagemeister
2a275ab007
[zdf] Use _download_xml
11 years ago
Philipp Hagemeister
79d09f47c2
Merge branch 'opener-to-ydl'
11 years ago
Philipp Hagemeister
c059bdd432
Remove quality_name field and improve zdf extractor
11 years ago
Philipp Hagemeister
02dbf93f0e
[zdf/common] Use API in ZDF extractor.
...
This also comes with a lot of extra format fields
Fixes #1518
11 years ago
Philipp Hagemeister
e03db0a077
Merge branch 'master' into opener-to-ydl
11 years ago
Jaime Marquínez Ferrándiz
267ed0c5d3
[collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring ( fixes #1822 )
...
Uses a new helper method in InfoExtractor: _download_xml
11 years ago
Philipp Hagemeister
7012b23c94
Match --download-archive during playlist processing ( Fixes #1745 )
11 years ago
Philipp Hagemeister
dca0872056
Move the opener to the YoutubeDL object.
...
This is the first step towards being able to just import youtube_dl and start using it.
Apart from removing global state, this would fix problems like #1805.
11 years ago
Philipp Hagemeister
5904088811
Add support for tou.tv ( Fixes #1792 )
11 years ago
Philipp Hagemeister
91c7271aab
Add automatic generation of format note based on bitrate and codecs
11 years ago
Jaime Marquínez Ferrándiz
78fb87b283
Don't accept '>' inside the content attribute in OpenGraph regexes
11 years ago
Jaime Marquínez Ferrándiz
ab2d524780
Improve the OpenGraph regex
...
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
11 years ago
Philipp Hagemeister
eb0a839866
[common] Simplify og_search_property
11 years ago
Marcin Cieślak
a8eeb0597b
Fix AssertionError when og property not found
...
On tvp.pl some webpages contain OpenGraph
metadata and some don't.
If og property is not found, _og_search_description
fails with
WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError
The patch allows me to use:
    try:
        info['description'] = self._og_search_description(webpage)
        info['thumbnail'] = self._og_search_thumbnail(webpage)
    except RegexNotFoundError:
        pass
11 years ago
Jaime Marquínez Ferrándiz
9103bbc5cd
Add the 'webpage_url' field to info_dict
...
The URL of the video page; it must allow the result to be reproduced.
It's automatically set by YoutubeDL if it's missing.
11 years ago
Philipp Hagemeister
b5d0d817bc
Remove superfluous space
11 years ago
Philipp Hagemeister
ebc14f251c
Merge remote-tracking branch 'origin/master'
11 years ago
Philipp Hagemeister
d41e6efc85
New debug option --write-pages
11 years ago
Filippo Valsorda
8ffa13e03e
[Instagram] get the non-https link, as they are serving Akamai cert from an instagram.com domain
11 years ago
Jaime Marquínez Ferrándiz
55b3e45bba
[vimeo] Fix pro videos and player.vimeo.com urls
...
The old process can still be used for those videos.
Added RegexNotFoundError, which is raised by _search_regex if it can't extract the info.
11 years ago
Jaime Marquínez Ferrándiz
8c51aa6506
The 'format' field now defaults to '{format_id} - {width}x{height}{format_note}'
...
Following the YoutubeIE format. The 'format_note' gives additional info about the format, for example '3D' or 'DASH video'.
11 years ago
Philipp Hagemeister
416a5efce7
fix typos
11 years ago
Philipp Hagemeister
8dbe9899a9
Allow users to specify an age limit ( fixes #1545 )
...
With these changes, users can now restrict what videos are downloaded by the intended audience, by specifying their age with --age-limit YEARS.
Add rudimentary support in youtube, pornotube, and youporn.
11 years ago
Philipp Hagemeister
2f5865cc6d
Clarify that url and ext are optional when formats is given ( #980 )
11 years ago
Philipp Hagemeister
deefc05b88
Document formats (for #980 )
11 years ago
Jaime Marquínez Ferrándiz
0d75ae2ce3
Fix detection of the webpage charset if it's declared using ' instead of "
...
Like in "<meta charset='utf-8'/>"
11 years ago
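To illustrate the quoting issue from the commit above, here is a minimal sketch of a charset lookup that accepts either quote character, or none. The function name `declared_charset` and the exact pattern are assumptions, not the regex used in the commit.

```python
import re


def declared_charset(head):
    """Return the charset named in a <meta> tag, or None if absent.

    Hypothetical helper: the optional ["'] lets the value be quoted
    with ' or ", or left unquoted inside a content attribute.
    """
    m = re.search(
        r'<meta[^>]+charset=["\']?([A-Za-z0-9_.:-]+)',
        head,
        re.IGNORECASE,
    )
    return m.group(1) if m else None
```

A caller would typically pass only the first part of the page and fall back to a default encoding when the result is `None`.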
Philipp Hagemeister
f143d86ad2
[sohu] Handle encoding, and fix tests
11 years ago
Philipp Hagemeister
6d69d03bac
Merge remote-tracking branch 'origin/reuse_ies'
11 years ago
Philipp Hagemeister
2eabb80254
[addanime] improve
11 years ago
Jaime Marquínez Ferrándiz
9e9c164052
Merge pull request #937 from jaimeMF/subtitles_rework
...
Subtitles rework
11 years ago
Philipp Hagemeister
79cb25776f
Cache suitable regular expressions
...
This speeds up TestAllURLsMatching.test_no_duplicates by about 8000% at the cost of minimal memory overhead.
11 years ago
Jaime Marquínez Ferrándiz
5d51a883c2
Use a dictionary for storing the subtitles
...
Errors while getting the subtitles are reported as warnings; if no subtitles are found, return an empty dict.
11 years ago
Philipp Hagemeister
f38de77f6e
Use unescapeHTML for OpenGraph properties
...
These are attribute values, so we don't need the more complex and whitespace-destroying cleanHTML - we just need to unescape quotes, that's it.
11 years ago
Philipp Hagemeister
b9d3e1635f
Strip hash info from URL when making requests ( Fixes #1038 )
11 years ago
Philipp Hagemeister
3c4e6d8337
Improve OpenGraph property matching
11 years ago
Jaime Marquínez Ferrándiz
44dbe89035
Use re.DOTALL by default when searching OpenGraph properties
11 years ago
Jaime Marquínez Ferrándiz
46720279c2
InfoExtractor: add some helper methods to extract OpenGraph info
11 years ago
Philipp Hagemeister
690e872c51
Remove video_result helper method
...
Calling it was more complex than actually including the type in the video info
11 years ago
Jaime Marquínez Ferrándiz
56c7366547
YoutubeIE: reuse instances of InfoExtractors ( closes #998 )
...
When an IE is added to the list, it's also added to a dictionary. When an IE is requested it first looks in the dictionary and if there's no instance it will create a new one.
That way _real_initialize is only called once for each IE, saving time if it needs to login for example.
11 years ago
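The lookup-then-create behaviour described in the commit above can be sketched as follows. The class name `ExtractorCache`, the method `get`, and keying by class name are all assumptions for illustration; the commit's own data structure may be organised differently.

```python
class ExtractorCache:
    """Hypothetical cache that hands out one instance per extractor class."""

    def __init__(self):
        # Maps a class name to its single shared instance.
        self._instances = {}

    def get(self, ie_class):
        """Return the cached instance, creating it on first request."""
        key = ie_class.__name__
        if key not in self._instances:
            self._instances[key] = ie_class()
        return self._instances[key]
```

Because every caller receives the same object, one-time setup such as a login only has to run once per extractor.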
Philipp Hagemeister
d93e4dcbb7
Merge branch 'master' of github.com:rg3/youtube-dl
11 years ago
Philipp Hagemeister
73e79f2a1b
[3sat] Add support ( Fixes #1001 )
11 years ago
Jaime Marquínez Ferrándiz
fc79158de2
VimeoIE: authentication support ( closes #885 ) and add a method in the base InfoExtractor to get the login info
11 years ago
Philipp Hagemeister
0f81866329
Add --list-extractor-descriptions (human-readable list of IEs)
12 years ago
Philipp Hagemeister
f3d294617f
Document view_count ( Closes #963 )
12 years ago
Filippo Valsorda
98bcd2834a
improve generic and encrypted signature error messages
12 years ago
Philipp Hagemeister
3c25b9abae
Remove useless headers
12 years ago
Philipp Hagemeister
d6983cb460
Fix generic class move (add all files)
12 years ago