JDownloader Community - Appwork GmbH
 

  #1  
Old 20.04.2020, 19:17
zreenmkr is offline
JD Addict
 
Join Date: Feb 2020
Posts: 174
Cancelling FolderWatch Linkgrabber

Hi, I added a bunch of *.crawljob files to FolderWatch; each file contains anywhere from 30-70 URLs. JD became slow to respond to a simple right click. I checked Task Manager for CPU usage and it was showing above 45%, so I decided to cancel all link grabbing via right-click on the linkgrabber icons at the bottom of the JD window.
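
(For reference, each of those *.crawljob files is just a plain-text job description. A stripped-down example, with made-up URLs and using the key=value properties as I understand the FolderWatch format, looks roughly like this:)
Code:
text=https://example.com/file1.zip
packageName=example-package
autoStart=FALSE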

What I didn't realize was that JD moved all the *.crawljob files that just got cancelled to the 'added' folder, even though not all of their URLs had been added to the Linkgrabber tab.

Now I don't know which *.crawljob files I need to retrieve (the ones I cancelled), because these files were created a couple of days ago. And since FolderWatch didn't modify the *.crawljob files, there is no way to use the modified timestamp to back-reference them.

Is there a log I could look into for the timestamps at which the latest *.crawljob files were moved to the 'added' folder by FolderWatch?
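
(If there isn't such a log, I guess I could watch the 'added' folder myself going forward. A rough Python sketch of what I have in mind, with a made-up path:)
Code:
# log when new *.crawljob files show up in FolderWatch's 'added' folder
import time
from pathlib import Path

ADDED = Path(r"C:\JDownloader\folderwatch\added")  # adjust to your own setup
seen = {p.name for p in ADDED.glob("*.crawljob")}

while True:
    for p in ADDED.glob("*.crawljob"):
        if p.name not in seen:
            seen.add(p.name)
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "moved to 'added':", p.name)
    time.sleep(10)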

Also, why are there so many link-grabbing icons at the bottom of the screen while the URLs from the *.crawljob files are being grabbed? It appears this process alone is using a lot of CPU resources. Could the limit be set down to 5 or so simultaneous grabs? There appeared to be over 10 link-grabbing icons.
  #2  
Old 21.04.2020, 13:11
pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,140

Hi,
1. Every job starts its own crawler, which will create a separate icon.

2. You can set the total allowed number of link crawler threads with this advanced setting:
Code:
link crawler maxthreads
3. Please note that the .crawljobs themselves cannot always "know" whether they have failed or not, as a website could e.g. display an error message after too many requests.
They are not "intelligent" at all.

4. Logs:
EDIT: There are no specific logs you can use to track single crawljobs, but you can enable debug logs and then check your logs.
Every plugin, for example, has its own log.
See the "debug logs" setting description as part of the log posting instructions:
https://support.jdownloader.org/Know...d-session-logs

-psp-
Last edited by pspzockerscene; 21.04.2020 at 14:08.
  #3  
Old 21.04.2020, 13:31
pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,140

Sorry, I got a few things wrong in my initial answer.
Your post was about crawljobs and NOT link crawler rules.
I'm correcting my previous post at the moment.

-psp-
Last edited by pspzockerscene; 21.04.2020 at 13:39.
  #4  
Old 28.04.2020, 04:40
zreenmkr is offline
JD Addict
 
Join Date: Feb 2020
Posts: 174

Quote:
Every job starts its own crawler, which will create a separate icon
Code:
link crawler maxthreads
I'm guessing this is the max threads per *.crawljob file, so if I added multiple crawljob files it would load at most n threads per file. After changing the setting to 2 and restarting JD, it still loaded 22 crawling threads/icons at the bottom of JD with multiple crawljob files added.

Quote:
3. Please note that the .crawljobs themselves cannot always "know" whether they have failed or not, as a website could e.g. display an error message after too many requests.
They are not "intelligent" at all.
Good to know, thanks. On a related topic, here is an extra little feature request: what I've observed is that when JD loads a *.crawljob file to parse for URLs and I intervene mid-crawl to cancel all linkgrabbing, it should rename the crawljob file, appending 'INCOMPLETE' to the end of the filename, and leave that file in the 'folderwatch' folder instead of moving it to the 'added' folder.

Quote:
...you can enable debug logs and then check your logs.
Every plugin, for example, has its own log...
This is something I need to look into. Since I don't know enough about this, I'm a little uncomfortable with the possibility of sensitive info becoming public.
  #5  
Old 28.04.2020, 13:48
pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,140

Quote:
Originally Posted by zreenmkr View Post
Good to know, thanks. On a related topic, here is an extra little feature request: what I've observed is that when JD loads a *.crawljob file to parse for URLs and I intervene mid-crawl to cancel all linkgrabbing, it should rename the crawljob file, appending 'INCOMPLETE' to the end of the filename, and leave that file in the 'folderwatch' folder instead of moving it to the 'added' folder.
You misunderstood me:
It is impossible for JD to know whether the crawl process was successful or not.
JD could only know this if you were able to add e.g. a "number of expected results" to the job.
Most users would never use this --> I would again recommend you try to do this via the EventScripter.
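
As a rough illustration of that "number of expected results" idea, here is a sketch (plain Python, not an EventScripter script; the folder path is made up and it assumes your jobs keep one URL per text= line) that counts how many results each crawljob should produce, so you can compare against what actually shows up in the Linkgrabber:
Code:
# count how many URLs each *.crawljob declares (assumed: one URL per 'text=' line)
from pathlib import Path

ADDED = Path(r"C:\JDownloader\folderwatch\added")  # adjust to your setup

for job in sorted(ADDED.glob("*.crawljob")):
    lines = job.read_text(encoding="utf-8", errors="ignore").splitlines()
    expected = sum(1 for line in lines if line.strip().lower().startswith("text="))
    print(job.name, "-> expected results:", expected)
Wiring the actual comparison into JD, e.g. via the EventScripter, would still be up to you.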

Quote:
Originally Posted by zreenmkr View Post
This is something I need to look into. Since I don't know enough about this, I'm a little uncomfortable with the possibility of sensitive info becoming public.
Log uploads are never public.
Only official JD supporters can view logs.

-psp-
  #6  
Old 28.04.2020, 16:22
Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,342

Quote:
Originally Posted by zreenmkr View Post
Code:
link crawler maxthreads
I'm guessing this is the max threads per *.crawljob file, so if I added multiple crawljob files it would load at most n threads per file. After changing the setting to 2 and restarting JD, it still loaded 22 crawling threads/icons at the bottom of JD with multiple crawljob files added.
There is a global ThreadQueue for ALL crawlers, shared/used by all of them. But when you add a new crawljob, EACH crawler will have its own icon at the bottom, because you can abort them all at once or individually. You can have one or multiple jobs within one crawljob file.
Example: 1 crawljob file with 1 job:
1. Job: parse the crawljob file
2. Job: parse the job itself
3. Job: process the job itself
The indicator icon will disappear once the complete job chain is finished.
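
If it helps, here is a rough analogy in Python (not JD's actual code): every submitted job gets its own handle (the icon), while a small shared pool of worker threads processes them, so a thread limit of 2 does not reduce the number of icons:
Code:
# analogy only: many jobs ("icons"), few shared worker threads ("maxthreads")
from concurrent.futures import ThreadPoolExecutor
import time

def crawl(job_id):
    # stand-in for parsing/processing one crawljob
    time.sleep(0.1)
    return job_id

with ThreadPoolExecutor(max_workers=2) as pool:      # thread limit = 2
    futures = [pool.submit(crawl, i) for i in range(22)]  # 22 jobs -> 22 "icons"
    for f in futures:
        print("job", f.result(), "finished")
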
  #7  
Old 28.04.2020, 16:25
Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,342

Quote:
Originally Posted by zreenmkr View Post
Good to know, thanks. On a related topic, here is an extra little feature request: what I've observed is that when JD loads a *.crawljob file to parse for URLs and I intervene mid-crawl to cancel all linkgrabbing, it should rename the crawljob file, appending 'INCOMPLETE' to the end of the filename, and leave that file in the 'folderwatch' folder instead of moving it to the 'added' folder.
The complete crawling/processing/checking pipeline is heavily multithreaded and queued, and it does NOT process jobs in their input order, because it optimizes to process faster plugins/links first, reordering at runtime.

Also, JDownloader doesn't know whether a job is complete or incomplete, because it doesn't know how many links you expect as a result.
Maybe an unsupported link -> no result -> incomplete or not?
Maybe a supported link but no content found -> no result, but processed -> incomplete or not?
Maybe a supported link, but due to a bug only 1 of 10 links is found -> results and processed -> incomplete or not?
There is no reliable indication of whether the job is finished, incomplete, or found nothing...