#1
request plugin for reddit.com (video download is missing audio)
There are a lot of subreddits and search results from reddit.com that I would like to capture, such as:
- subreddit: **External links are only visible to Support Staff**
- search result: **External links are only visible to Support Staff**
- user posts: **External links are only visible to Support Staff**
Can you add reddit.com account support so it will allow showing/capturing NSFW content by default, and also add scrolling support? Right now, if I add a link it captures 54 images which appear to be mostly related to the website theme, and then about 5 images which are the content I want, but they show as being "offline".
Last edited by asdpoi2k; 09.04.2019 at 14:43. Reason: extra clarification
#2
bump, did this one get missed or did I miss something? I don't think reddit is supported anywhere yet?
#3
I can imagine supporting /user/ and maybe /r/ URLs, but search URLs are unlikely: we rarely add support for those, as they can cause high CPU load and are only really useful for mass crawling.
We first have to check how the site works - maybe they provide a lightweight API that could be used.
__________________
JD-Dev & Server-Admin
#4
Is there a reason Reddit videos aren't supported?
Whether I copy a v.redd.it link or a Reddit post's link, JDownloader doesn't do anything. I found a thread from 2 years ago with the progress still at 0%, but that person at least had the video and audio files even if they weren't merged.
Am I doing something wrong? I don't even see a Reddit plugin within JDownloader.
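(The "missing audio" from the thread title usually comes from v.redd.it serving DASH streams, where video and audio are separate files that have to be muxed together afterwards. Below is a minimal sketch of that workaround, assuming the audio sits next to the video under a name like DASH_audio.mp4 - the exact name has changed over the years - and that ffmpeg is available on the PATH; the URLs are placeholders, not taken from this thread.)
Code:
# Rough sketch: download a v.redd.it DASH video plus its separate audio track
# and mux them into a single playable file. "DASH_audio.mp4" is an assumption;
# reddit has changed the audio file name over time.
import subprocess
import urllib.request

video_url = "https://v.redd.it/abc123/DASH_720.mp4"           # hypothetical example
audio_url = video_url.rsplit("/", 1)[0] + "/DASH_audio.mp4"   # assumed audio file name

for url, filename in ((video_url, "video.mp4"), (audio_url, "audio.mp4")):
    req = urllib.request.Request(url, headers={"User-Agent": "dash-merge-sketch/0.1"})
    with urllib.request.urlopen(req) as resp, open(filename, "wb") as out:
        out.write(resp.read())

# Copy both streams into one container without re-encoding.
subprocess.run(["ffmpeg", "-i", "video.mp4", "-i", "audio.mp4",
                "-c", "copy", "merged.mp4"], check=True)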
#5
@Kaegel
Merged reddit threads. The reasons are simple:
- We cannot add support for every website on the Internet
- A reddit plugin has been requested by several users but has not been created so far
My "workaround" suggestion: please use free browser addons such as "Video DownloadHelper" to download reddit content - you can use such an addon as a helper to find the video download URLs and download them with JD (well, in most cases). If not, you can still download manually via browser.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#6
Was planning on making a thread but noticed this is becoming the thread for Reddit, so I'll just add a reply here.
Not sure if supporting Reddit has ever been in your actual backlog, but maybe this could help. A couple of tools that I used/use to archive Reddit /user and /r pages are RipMe (**External links are only visible to Support Staff**...) and BDFR (**External links are only visible to Support Staff**...). Since the introduction of RedGIFs, RipMe has essentially stopped working for almost all hosts (imgur, gfycat, v.redd.it, etc.), and BDFR has some troubles of its own with very poor dev support. Maybe Reddit plugin support could be built from these two and utilised in JD2? I'm not saying to copy them exactly, but going back to Jiaz's comment from last year about checking how the site works, I figured these applications could help provide some insight. Just spitballing here, as I'm out of options when it comes to software for archiving Reddit content and was kind of hoping the people behind JD2 would be interested in filling this gap. Thank you.
#7
Hi DukeM,
we already know these other tools. Also, there are still plenty of online "reddit downloaders" up and running fine, and you can easily add URLs of ANY video site to JD by using simple helper browser addons such as "Video DownloadHelper".
Regarding reddit: do they provide an official API? Do they mind people downloading their content, or do they not care? I've even seen public reddit downloader bots on reddit itself, so it seems like they at least don't try to stop people from downloading the content ... after all, it's all user-generated anyway.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#8
Hey there, psp!
Ah, I do have VDH, but I don't think its format really works for archiving entire users or subreddits - and I'm not aware of any other proper full-archiving software out there that works as efficiently as these two (or as these two did/do, lol). The most important bit, as far as I'm concerned, is a proper file-naming system among other things, and imo BDFR making it customisable like {DATE}{USERNAME}{POSTID} etc. is a pretty neat feature. If anything though, VDH is really good at fetching HLS links. Haha.
As far as I know, they do. This docs page might be a good start: **External links are only visible to Support Staff** There is also a feature where you can connect an "app" to your reddit account if you go over to **External links are only visible to Support Staff**. Not sure what other benefits it gives yet aside from maybe accessing private/restricted subreddits your account is a member of, and maybe bypassing captchas(?).
Quote:
If you go over to r/datahoarder, you can even sometimes see discussions about archiving certain users or communities. There are also entire reddit archivers like ceddit.com, unreddit.com, removeddit.com, etc. that work like the Wayback Machine and can even show you mod-deleted comments. So I guess it'd be pretty safe to assume that Reddit doesn't care about this stuff, at least for now. And even then, most of the content is uploaded to third-party hosts (imgur, gfycat, redgifs, etc.), which is out of their jurisdiction in the first place anyway. Really hope you lot take an interest in this.
#9
Indeed you're right, and they're really developer-friendly - it seems like they do not even mind crawlers; I actually read through their developer documentation.
I've started working on it, but please do not ask me for any ETA.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#10
Oh wow, that's awesome to hear! Thank you so much, psp.
And don't worry, I don't feel entitled enough to pester you about this, but I'll be sure to keep an eye out for when it gets deployed. Cheers!
#11
The next update will contain the very first version of our reddit crawler.
What it can do:
- Crawl single reddit posts:
  - external embedded (video) content such as e.g. gfycat
  - external embedded URLs, e.g. "embedded" imgur.com images
What it can't do (yet):
- Crawl reddit-selfhosted videos and images (yes, I know a lot of content is hosted by reddit ... I will work on it once I find the time ...)
What it might never be able to do (unsure):
- Crawl complete user profiles
- Crawl complete subreddits
Reason: this requires a LOT of http requests.
What it will definitely never be able to do, as we do not provide such crawlers:
- Crawl "search query URLs", e.g.:
Code:
reddit.com/search?q=nature%20images
We are open source - you can contribute code at any time.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#12
Added simple support for downloading reddit selfhosted content.
Now it is up to you to test our "single post crawler" and e.g. find posts for which JD does not find anything although you might expect it to.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#13
How about adding the TITLE of a single reddit post to the folder name?
That would make all the files much more interesting by giving them a proper name.
Quote:
#14
Quote:
As you can imagine, what I've done so far was just "quick and dirty" - it'll get better every day.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#15
"pattern" : "**External links are only visible to Support Staff**,
"rule" : "DEEPDECRYPT", "packageNamePattern" : "<title>(.+)</title>", |
#16
@verheiratet1952
Your LinkCrawler rule will not work anymore once our crawler is released - which it already is ... Also, we are not parsing the reddit main page, e.g.
Code:
reddit.com/r/de/comments/hvzlf7/falschparken_in_stuttgart/
but its lightweight json endpoint instead, e.g.
Code:
reddit.com/comments/hvzlf7/.json
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
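For illustration, a minimal sketch of reading that json endpoint outside of JD, assuming the usual layout of reddit's public JSON (an array whose first listing holds the post itself); the field names used below (title, url, is_video, media/reddit_video) exist for typical posts but can vary by post type:
Code:
# Rough sketch: fetch a single post via reddit's lightweight json endpoint.
import json
import urllib.request

post_id = "hvzlf7"  # the id from the example URL above
req = urllib.request.Request(
    f"https://www.reddit.com/comments/{post_id}/.json",
    headers={"User-Agent": "reddit-json-sketch/0.1"},  # reddit throttles default UAs
)
with urllib.request.urlopen(req) as resp:
    listings = json.load(resp)

# The endpoint returns two listings: [0] contains the post, [1] the comments.
post = listings[0]["data"]["children"][0]["data"]
print(post["title"])      # used for package/file names
print(post.get("url"))    # external link or reddit-hosted media
if post.get("is_video"):
    # reddit-selfhosted video: the fallback/DASH URL sits under media/reddit_video
    print(post["media"]["reddit_video"]["fallback_url"])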
#17
For the next update:
- Added package names including the date the content was posted and the name of the subreddit
- Added the ability to parse multiple reddit-hosted pictures if present
- Added a parser for the post text itself: if the text of a post contains downloadable URLs which JD can handle, it will now find them as well
I did not yet modify the filenames for reddit-selfhosted content.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#18
Great changes!
Is there any chance to edit/change the settings, to customize them for one's own needs?
Quote:
#19
Thanks for your feedback
Quote:
If you're talking about filenames/packagenames: You can always modify those via Packagizer rules. -psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#20
For the next update:
- v.redd.it and i.redd.it content will now also get displayed as host "reddit.com"
- v.redd.it: always grab only the BEST video quality available
- Improved offline detection
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#21
What about "external-preview.redd.it"?
#22
Hey, @psp!
Thanks so much for working on this. Wasn't expecting it to be this quick. I first saw a related post on r/datahoarder about this update, and the poster brought up a good point about complete crawling of users and subreddits. I don't know how much help this would be, but another user suggested PushShift for pulling the content instead, to avoid a potential DoS. As far as I know, PushShift basically copies the data/content the moment it is submitted to Reddit. Maybe that could be a good way to go about it? Here's some more info about it if you're interested: **External links are only visible to Support Staff** They also have something for pulling Reddit searches, but I haven't tried it. Might be worth a look for the other user who requested this particular feature. Another user also suggested **External links are only visible to Support Staff**, but it's the first time I've heard of it. Again, thanks! I'll be looking out for future improvements.
#23
Hi again,
thanks for the suggestions, but unfortunately we're not going to use external APIs for crawling. JD can now do the basics that will work for most users - if wanted, you could still e.g. use external APIs to crawl all comment-URLs of one subreddit and add those to JD - and yes, this would again cause a lot of reddit http requests!
Today's update includes the following changes: JD will now try to set filenames (basically the same as the package name) for all reddit-selfhosted content. Please keep in mind that this will not apply to e.g. imgur content, as we have a separate plugin for imgur and other services - those will try to grab the original filenames from their sources accordingly.
Now it is up to you guys to test the existing functionality and make improvement suggestions. The reddit ticket linked on the first page of this thread will stay open, as our plugin does not yet have all of the functionality I want it to have.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#24
Hi,
Please post working example URLs. I'm off now - seeya tomorrow ... -psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#25
Quote:
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
#26
I'm unable to open that "external" URL.
The content of that post is hosted on imgur.com and should be downloadable via JD just fine. Does the imgur.com URL get added as offline for you? -psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#27
Quote:
Is there any solution available?
#28
Seems like you changed the default setting of your imgur.com plugin, because otherwise this wouldn't have happened.
See Settings -> Plugins -> imgur.com -> activate "Use API[...]". Afterwards, re-add your reddit/imgur URLs.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#29
Quote:
"Use API[...]" was already activated for months... it does crawl most imgur links as offline, but it also adds them for both offline/online without correct renaming... it adds names like 'LNg3jgC' for package name even if title had also been added... |
#30
It would surprise me if not - it is still the default setting, but I thought you might have turned it off.
Quote:
EDIT1: Maybe your IP/ISP is blocked by the imgur API. You could try to add your imgur.com account to JD and check if it works then.
Quote:
The naming might seem wrong to you, but technically it is absolutely correct. Reddit.com is linking to external websites --> JD will get the name from imgur.com, and if no name is set there, the image ID will be used. It's the same when e.g. adding uploaded.net URLs via services such as filecrypt.cc --> JD will never use the filenames shown there - it will always try to get the filenames from the service where the file is hosted. For reddit.com-selfhosted content, the title of e.g. a comment will be set as the filename. If you wish to use the "source name" as title in such a case, you will have to create a Packagizer rule that sets the title of the package as the filename for all imgur.com URLs.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#31
How can I download/rip whole /user/ or /r/ pages (users and subreddits) using JDownloader 2? I tried it a few times... it's probably not catching all the URLs and is ignoring imgur files. Example: r/tightdresses and user/AshleyWilsonPT/
Last edited by rafikabir85; 17.09.2020 at 10:50.
#32
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#33
The next reddit update will include a crawler to crawl all saved posts of an authenticated user, but all account-related features will be on hold until we have a nicer way to perform OAuth logins.
Again: we're open source - you're free to check out our code/progress HERE.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
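For context, a minimal sketch of one of reddit's documented OAuth flows (the password grant for a "script" type app registered at reddit.com/prefs/apps), here used to list an account's saved posts since that is the feature mentioned above; this only illustrates what such a login involves, not necessarily the flow JD will end up using, and all credential values are placeholders:
Code:
# Rough sketch of reddit's OAuth2 "script app" password grant (uses the
# third-party "requests" library; pip install requests). All credentials
# below are placeholders.
import requests

client_id, client_secret = "CLIENT_ID", "CLIENT_SECRET"   # from reddit.com/prefs/apps
username, password = "USERNAME", "PASSWORD"
ua = {"User-Agent": "oauth-login-sketch/0.1"}

# Exchange the account credentials for a bearer token.
token = requests.post(
    "https://www.reddit.com/api/v1/access_token",
    auth=(client_id, client_secret),
    data={"grant_type": "password", "username": username, "password": password},
    headers=ua,
).json()["access_token"]

# Authenticated API calls then go to oauth.reddit.com with the bearer token,
# e.g. the first page of the account's saved posts.
saved = requests.get(
    f"https://oauth.reddit.com/user/{username}/saved",
    headers={"Authorization": f"bearer {token}", **ua},
).json()
print(len(saved["data"]["children"]), "saved items on the first page")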
#34
Next update will include:
- Subreddit crawler *
- User crawler *
- Superfast crawling for reddit-selfhosted content
* I've limited the crawler to crawl only the first page of a subreddit for now. As said, crawling complete subreddits would cause a lot of http requests, and I don't want reddit to ban our application, so I will leave this disabled until I find a solution.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
Last edited by pspzockerscene; 22.09.2020 at 17:19.
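To illustrate why full subreddit crawling is expensive: reddit's public listing JSON returns at most about 100 posts per request and is paginated through an "after" cursor, so a complete subreddit means one request per page plus further requests per post. A minimal sketch of that paging, assuming the public /new.json listing and unauthenticated access:
Code:
# Rough sketch: walk a few pages of a subreddit listing via the "after" cursor.
import json
import time
import urllib.request

subreddit = "de"      # example subreddit taken from earlier in this thread
after = None          # pagination cursor returned with every page
headers = {"User-Agent": "subreddit-paging-sketch/0.1"}

for page in range(3):  # a complete crawl would loop until "after" comes back empty
    url = f"https://www.reddit.com/r/{subreddit}/new.json?limit=100"
    if after:
        url += f"&after={after}"
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)["data"]
    for child in data["children"]:
        print(child["data"]["permalink"])  # each post would then need its own crawl
    after = data["after"]
    if not after:
        break
    time.sleep(2)     # be polite - unauthenticated clients are rate-limited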
#35
> I've limited the crawler to crawl only the first page of a subreddit for now.
Is there any way to manually post the second/third/... page link and make JDownloader crawl it? Also, the plugin does not work for 'old.reddit.com'; it only crawls from 'www.reddit.com'. Please add support for old.reddit.com as well if it is not too much work.
#36
Quote:
We're open source! Quote:
Please post example-URLs of all existing types for old.reddit.com (e.g. user, subreddit, comment, users' saved posts). -psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#37
Quote:
Examples of some random subreddits, users, comments, etc.:
- subreddit - **External links are only visible to Support Staff**
- user - **External links are only visible to Support Staff**
- comment - **External links are only visible to Support Staff**
- saved posts - **External links are only visible to Support Staff**
#38
Added support for old.reddit.com.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#39
Quote:
#40
Now, crawling reddit links brings up an IMGUR prompt to add API credentials...
I am very sorry, but I don't see or find the entry fields to add both "Client ID" and "Client Secret" ... and where do I activate the API? Please see the screenshots... I registered my very own imgur account and created an App as required, following the steps from the JD prompt...