#1
Problem crawling Twitter/X. Not a JDownloader issue
I will start by explaining the problem. Twitter/X media pages seem to stop loading after a certain number of media posts, so JDownloader can't crawl anything beyond that point. I noticed this when I saw a link on Danbooru that I had not downloaded. One example is the profile linked below.
**External links are only visible to Support Staff**

When I open the above link and scroll all the way to the bottom with the middle mouse button, the page eventually stops loading more posts. It consistently stops at the link below, which is dated June 2021. That matches the posts JDownloader crawled.

**External links are only visible to Support Staff**

However, I have a series of still-active links dated earlier than this. For example, the link below is from 2013.

**External links are only visible to Support Staff**

I have seen this happen on a few other accounts, and I can provide more examples if needed. I assume Twitter either stops showing posts after a certain number or only archives back to a certain point, with variations between accounts.

Now the reason for this post. I put it under General Discussion because I don't think this is an issue JDownloader can solve. However, I want to raise awareness of it and ask whether anyone knows of sites that can show the "un-archived" Twitter posts. I have more links to show, but I felt I was rambling, so I will end here.

I found a thread about a Twitter issue, but I don't think it pertains to this one, so I made a new thread. https://board.jdownloader.org/showthread.php?t=95054

Any additional info on this would be nice, even if the problem isn't fixed. Thanks.
#2
Hi,
First of all, to save time, it would be very helpful to have a debug log of JDownloader crawling that link. Log instructions: https://support.jdownloader.org/Know...d-session-logs

Secondly, while I can't be sure without a log, there was or still is a server-side limit on how far you can scroll down. I was unable to find the old threads about this problem, so you might have more luck using our search.

Also keep in mind that Twitter has rate limits, so the crawler may not have stopped but instead run into one of them (did you turn on Bubble Notifications in JD?). These can cause long crawl delays. See e.g.: https://board.jdownloader.org/showthread.php?t=92696

Finally, there are tools dedicated solely to downloading from Twitter. I recommend using such tools instead of JDownloader. You can find them on github.com. A random example picked by me: github.com/mmpx12/twitter-media-downloader
__________________
JD Supporter, Plugin Dev. & Community Manager
Getting Started & Tutorials || JDownloader 2 Setup Download