JDownloader Community - Appwork GmbH
 

  #201  
Old 07.10.2022, 19:11
georgegalily georgegalily is offline
Super Loader
 
Join Date: Oct 2018
Posts: 27
Default

Very large: 500,000 links, 45 MB.
  #202  
Old 07.10.2022, 19:15
georgegalily georgegalily is offline
Super Loader
 
Join Date: Oct 2018
Posts: 27
Default

It seems that when the error occurred, the script kept running, but not well.

Last edited by georgegalily; 07.10.2022 at 19:18.
  #203  
Old 07.10.2022, 19:33
georgegalily georgegalily is offline
Super Loader
 
Join Date: Oct 2018
Posts: 27
Default

How long does it take to search this file?
JD2 detects duplicates in no time, but keeping the links in the list consumes a lot of memory, CPU and disk space.
  #204  
Old 07.10.2022, 19:41
georgegalily georgegalily is offline
Super Loader
 
Join Date: Oct 2018
Posts: 27
Thumbs up

@Jiaz
Thank you very much
Removing the sleep makes it work well. The removal speed is low, but acceptable.
This is the script I used:
Code:
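// Only act when the trigger reports the "AFTER" state.
if (state == "AFTER") {
    var url = link.getURL();
    var historyFile = getPath(JD_HOME + "/cfg/history.txt");
    var lock = getModifyLock("history");
    var isDuplicate = false;
    // Take the shared "history" lock while the file is being read.
    lock.writeLock();
    try {
        // Note: the whole history file is read again for every single link.
        var history = historyFile.exists() ? readFile(historyFile) : "";
        // Substring search: true if the URL appears anywhere in the history text.
        isDuplicate = history.indexOf(url) > -1;
        // Drop the reference so the large string can be garbage-collected.
        history = null;
    } finally {
        lock.writeUnlock();
    }
    if (isDuplicate) {
        // Tag the link's comment, then remove the link from the list.
        var text = "#duplicatelink";
        var comment = link.getComment();
        comment = comment ? text + " " + comment : text;
        link.setComment(comment);
        link.getCrawledLink().remove();
    }
}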
  #205  
Old 09.10.2022, 00:01
georgegalily georgegalily is offline
Super Loader
 
Join Date: Oct 2018
Posts: 27
Default

Hi, it seems this method is useless because the history file is large, and checking every link against it is slow. It takes forever for a batch of links.
Not to mention that my PC starts to lag, and I cannot tell whether the Event Scripter has finished checking the links or not, because it does not show any progress. I only know that it is running from the laggy behaviour of Windows.

Putting the sleep in the script leads to failure.

If it is possible to implement an archive in the download list that collects only the links, without any other information, in a package called (ArchivedLinks), this may come in handy, because JD checks for duplicates in no time.
Thank you
  #206  
Old 09.10.2022, 12:03
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 78,638
Default

Quote:
Originally Posted by georgegalily View Post
Hi, it seems this method is useless because the history file is large, and checking every link against it is slow. It takes forever for a batch of links.
Yes, because the script (in its current form) reads the whole 50 MB file again and again for each link and thus has to process a huge amount of data. It would be faster to read the file into memory once and only read it from disk again if its content has changed.
The script should be updated so that the file is read into memory once; all script runs then use the in-memory history instead of reading from disk again and again.
I will try to update the script to work that way. Give me some time, as I'm no master in scripting like mgpai is.
Reference for me, https://board.jdownloader.org/showpo...4&postcount=18
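The read-once idea described above can be sketched roughly as follows. This is only a minimal illustration in plain JavaScript, not the updated script promised in this post. The names `createHistoryCache`, `readText` and `getStamp` are hypothetical, the file access is passed in as functions so the sketch makes no assumption about which Event Scripter calls are available, and it assumes the history file holds one URL per line.

```javascript
// Sketch: keep the history in memory and re-read it only when the file changed.
// readText(): hypothetical function returning the file content as a string.
// getStamp(): hypothetical function returning a value that changes whenever
//             the file changes (for example its last-modified time or size).
function createHistoryCache(readText, getStamp) {
    var cachedStamp = null; // stamp of the file version currently in memory
    var cachedUrls = null;  // lookup table: url -> true
    var diskReads = 0;      // counts how often the file was actually read

    function reload(stamp) {
        var lines = readText().split(/\r?\n/);
        cachedUrls = {};
        for (var i = 0; i < lines.length; i++) {
            var line = lines[i].trim();
            if (line) {
                cachedUrls[line] = true;
            }
        }
        cachedStamp = stamp;
        diskReads++;
    }

    return {
        // True if the exact URL is in the history (whole-line match).
        has: function (url) {
            var stamp = getStamp();
            if (cachedUrls === null || stamp !== cachedStamp) {
                reload(stamp);
            }
            return Object.prototype.hasOwnProperty.call(cachedUrls, url);
        },
        diskReads: function () {
            return diskReads;
        }
    };
}

// Usage with an in-memory stand-in for the history file.
var fakeFile = { text: "http://example.com/a\nhttp://example.com/b\n", stamp: 1 };
var cache = createHistoryCache(
    function () { return fakeFile.text; },
    function () { return fakeFile.stamp; }
);
cache.has("http://example.com/a"); // triggers the first (and only) read
cache.has("http://example.com/c"); // answered from memory
```

With a history file of about 50 MB, checking 1,000 links one by one reads about 50 GB from disk in total, whereas a cached variant like this reads the 50 MB once and again only after the file changes. The lookup table also compares whole lines, so a URL that is merely a prefix of a stored URL is not reported as a duplicate.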
__________________
JD-Dev & Server-Admin

Last edited by Jiaz; 09.10.2022 at 12:06.
  #207  
Old 09.10.2022, 12:05
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 78,638
Default

Quote:
Originally Posted by georgegalily View Post
If it is possible to implement an archive in the download list that collects only the links, without any other information, in a package called (ArchivedLinks), this may come in handy, because JD checks for duplicates in no time.
We have an open ticket for such an idea, but it makes no sense to do this within JDownloader, as this is a job for a *real* database with index/memory optimizations and other features. It would make the most sense to use a real database via an HTTP REST API in scripts to achieve this. The current scripting solution works well, but of course, depending on the size of the history, it must be updated/optimized the larger the database gets.
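As a rough sketch of what such a lookup could look like from a script: everything about the service here is an assumption. The endpoint `http://localhost:8080/history/contains`, its `url` query parameter and the `{"known": true}` response shape are invented for illustration, and `httpGet` stands for whatever function the script environment offers for fetching a page.

```javascript
// Hypothetical endpoint of an external history database (assumption, not a real service).
var HISTORY_API = "http://localhost:8080/history/contains";

// Build the lookup URL; the link URL must be encoded to survive as a query value.
function buildLookupUrl(linkUrl) {
    return HISTORY_API + "?url=" + encodeURIComponent(linkUrl);
}

// httpGet(url) is a hypothetical function that returns the response body as a string.
// Assumed response shape: {"known": true} or {"known": false}.
function isKnownLink(linkUrl, httpGet) {
    var body = httpGet(buildLookupUrl(linkUrl));
    var result = JSON.parse(body);
    return result.known === true;
}

// Usage with a stand-in for the HTTP call, so the sketch runs without a server.
var seen = { "http://example.com/file?id=1": true };
function fakeHttpGet(requestUrl) {
    var encoded = requestUrl.split("?url=")[1];
    var linkUrl = decodeURIComponent(encoded);
    return JSON.stringify({ known: seen[linkUrl] === true });
}
isKnownLink("http://example.com/file?id=1", fakeHttpGet);
```

In this arrangement the database does the indexing, and the script only sends one small request per link instead of loading the whole history.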
__________________
JD-Dev & Server-Admin
  #208  
Old 10.10.2022, 01:19
georgegalily georgegalily is offline
Super Loader
 
Join Date: Oct 2018
Posts: 27
Thumbs up

Quote:
Originally Posted by Jiaz View Post
Yes, because the script (in its current form) reads the whole 50 MB file again and again for each link and thus has to process a huge amount of data. It would be faster to read the file into memory once and only read it from disk again if its content has changed.
The script should be updated so that the file is read into memory once; all script runs then use the in-memory history instead of reading from disk again and again.
I will try to update the script to work that way. Give me some time, as I'm no master in scripting like mgpai is.
Reference for me, **External links are only visible to Support Staff**...
Thank you. Please take your time.
All times are GMT +2. The time now is 11:43.