#1
Deviantart plugin problem with paywall links
When processing images from a gallery, JDownloader gets "stuck" for a long time on each link that is behind a subscription paywall (I'm not sure about other link types). Going from "Starting..." to "skipping - account required" can take up to several minutes. While processing links it constantly uses around 10% of the CPU (2 cores and their respective threads at 50-90% usage), with spikes up to 20%.
Shortly after it starts trying to download the first images, the cookie suddenly "expires" (maybe the site is logging the account out because it is being hammered with requests).
Interestingly, if I touch nothing, it takes very long to move from one link to the next, BUT if I interact with the UI, for example by constantly switching back and forth between the "Downloads" and "LinkGrabber" tabs or by scrolling up and down repeatedly, it only takes 10-20 seconds to skip each link, or 5-10 seconds to download it when that is possible.
Another strange thing is that the download speed display doesn't register any network usage. It stays flat at 0 kb/s, as if the image were being processed in the background, and then the link jumps from 0% to 100% instantly (in cases where the image IS downloadable).
Note: I tested everything in a clean environment: a fresh JDownloader with only my DeviantArt account and no settings changed. I used the "username/gallery/all" link to scan the entire gallery.
Last edited by sgghostrider; 09.09.2024 at 04:34. Reason: Updating title and message to describe the problem better
#2
Please provide the following information:
Please post your log ID here.
If your report is about a specific website that JD supports via a plugin, please also provide example URLs that can be used to reproduce the issue you are having.
If your report is about a login problem with a plugin-supported website, enable debug mode before creating logs (see the previously linked instructions).
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
First Steps & Tutorials || JDownloader 2 Setup Download
#3
LOG: 09.09.24 20.12.37 <--> 09.09.24 20.31.13 jdlog://6445411370661/
Link (the least NSFW one I found with a subscription): **External links are only visible to Support Staff**
I attached a screenshot of the "About" window, as well as 2 images of JDownloader next to the clock to show the problem.
Note: I noticed that when only a small number of images are active at once (I disabled the other 8 packages and left only the one with just 170 images in it), it started going faster. It still takes around 5-20 seconds per paywalled image and 5-10 seconds per downloadable one, but that is much faster than one every few minutes.
Note 2: The problem is barely noticeable if you stop scanning after a few dozen images and add only those to the queue. To reproduce it, let the link scan fully and start downloading the package once the scan has finished.
#4
I can see that this is not a freeze but the internally set "Request interval", so essentially a wait/sleep.
One request to deviantart.com is allowed every 1.5 seconds. If the crawler and downloads are running at the same time, this may look like a freeze. I can't really do anything about it other than maybe add a setting so you can lower that value. Just keep in mind that if you lower it, you may end up getting your IP banned by deviantart (or getting it banned sooner than you would with the default request interval).
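To make the request interval concrete, here is a minimal sketch of a per-host throttle. It is not JDownloader's actual implementation: the class name `HostThrottle`, its method, and the injectable `clock`/`sleep` parameters are hypothetical, and only the 1.5 second value comes from this thread.

```python
import threading
import time


class HostThrottle:
    """Illustrative only: enforce a minimum gap between requests to one host."""

    def __init__(self, interval_s=1.5, clock=time.monotonic, sleep=time.sleep):
        self.interval_s = interval_s      # 1.5 s is the value quoted in the thread
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = {}           # host -> earliest start of the next request

    def wait_for_slot(self, host):
        """Block until a request to `host` may start; return the seconds waited."""
        with self._lock:
            # Reserve the next free slot while holding the lock, so that
            # crawler and download threads queue up behind each other.
            now = self._clock()
            start = max(now, self._next_allowed.get(host, now))
            self._next_allowed[host] = start + self.interval_s
        waited = start - now
        if waited > 0:
            self._sleep(waited)           # sleep outside the lock
        return waited
```

If the crawler and the downloads share one such throttle, every request to the same host lines up behind all earlier ones, so ten queued requests need at least 9 x 1.5 = 13.5 seconds no matter how many threads issue them. Requests to a different host are not delayed.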
#5
I always start downloading ONLY after the crawling is already done. And if you mean that each link has to be crawled again at download start to find the original image, that 1.5 seconds still doesn't explain the minutes of waiting for each link. If you look at my "AtStart.png" and "AfterSomeTime.png" images, you will see that it took 17 minutes to process fewer than 10 links. That's about 2 minutes per link, roughly 80 times that 1.5 s wait time.
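The arithmetic behind that comparison can be checked with a few lines. This is only a sketch using the figures from this thread (17 minutes, at most 10 links, 1.5 s interval); the function name is made up for illustration.

```python
def slowdown_factor(elapsed_s, links, interval_s=1.5):
    """How many request intervals fit into the observed time per link."""
    per_link_s = elapsed_s / links
    return per_link_s / interval_s


elapsed_s = 17 * 60   # time between the two screenshots
links = 10            # upper bound; the post says fewer than 10 links

print(f"{elapsed_s / links:.0f} s per link")               # 102 s per link
print(f"{slowdown_factor(elapsed_s, links):.0f}x the interval")  # 68x the interval
```

Ten links is the most favourable case. With 8 links the same 17 minutes come to 127.5 s per link, or 85 times the interval, so the request interval alone accounts for only a small fraction of the observed delay.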
Edit: I finally did some digging. I installed a process explorer and Wireshark to check what the heck was happening, and sure enough, the plugin IS NOT the main reason for the slowness. It's the program itself; this plugin just magnifies the core problem enormously. I don't know why deviantart expires the cookie so fast, but I don't think it has to do with JDownloader. It makes a single request, then a huge number of disk writes (a few tens of thousands, probably one for each link), and then it gets stuck in "Thread Create" and "Thread Exit" events for minutes at a time, doing literally nothing but still using 10% of the CPU for "something" (infinite loop bug?). In the background it can stay stuck on a single link like that for over 10 minutes! (image attached)
Edit2: I kept testing, restarting the program and testing again, and it's not even consistent, so I can't help narrow it down any further. Sometimes it gets stuck for 10+ minutes on each link, other times it processes each link in around 1 minute. It changes on every restart of the program. It worked well 3-4 months ago, when I was able to download galleries with thousands of images without problems. I don't know what has changed since then.
Last edited by sgghostrider; 10.09.2024 at 23:16. Reason: More info instead of creating a new post (for the second time)
#6
Each HTTP request to "deviantart.com" needs to be separated from the next by a gap of 1.5 seconds, globally. A download needs two requests in this case, but only one of them goes to "deviantart.com". The second one goes to the CDN where the image is hosted (e.g. "wixmp-ed30a86b8c4ca887773594c2.wixmp.com"), so it is not affected by the request interval.
Yep, agreed, that doesn't explain it.
I am taking your reports seriously. If you really want to look into it, grab our source code and do so: https://support.jdownloader.org/de/k...up-ide-eclipse
I don't know yet either, but as explained in other threads, it is generally in their interest to block bots, so this might just be some kind of bot detection.
I don't have the time to deep dive into every small issue, so if in doubt, go to github.com and look for other open source tools that are dedicated to downloading from deviantart.com. They may be better suited to your use case.
I've checked the deviantart plugin for loops with missing stop conditions and for regular expressions that take too long, but so far I have been unable to find anything.
#7
Thanks for your time.
#8
No worries.
Let's hope you won't find a bug that I missed, otherwise I'll be embarrassed.
Grab the source code and do a full-text search for "deviantart". The class you want to find is this one: jd.plugins.hoster.DeviantArtCom
The crawler can be found here: jd.plugins.crawler.DeviantArtComCrawler
#9
I don't even know whether I should discuss this here or create a new thread, as this has nothing to do with "Plugins" but with a core inefficiency of JDownloader itself. I was not far off with my "infinite loop" guess: the loop is not infinite, it just runs for a very long time, depending on the number of links in the queue.
I think I will create a new thread so as not to mix things up. Once I have written the thread explaining the problem and how I think it can be solved, I will add a link here by editing this message.
Thread link: board.jdownloader.org/showthread.php?p=539620
Last edited by sgghostrider; 19.09.2024 at 22:07. Reason: link addition
#10
Yes, please create a new thread.
And thanks for taking the time to deep dive into this. Do you have large packages, meaning a package with a great many links in it, or many small packages? Do you have many links in the download list, in the linkgrabber, or in both?
Also, please do the following once you encounter the issue: open the log creation dialog as if to create a log, but close it again. Repeat this step several times (3-6 times) and then create a log for real. Each of these steps causes a thread dump to be created, so I can see what each thread is currently doing and whether this is the issue I have in mind (long lists) or something else. Your log already has hints about this; I just want to be sure.
__________________
JD-Dev & Server-Admin
#11
And yes, the problem is exactly with many links (the more links are active in the queue, the longer it takes). That is almost always the case when downloading deviantart galleries, as some artists have over 10k submissions. This leads me to a request I might make some day: to raise or remove the hard limit of 10008 elements per package.