#1081
@mgpai
What he wants is not possible via link crawler rules. He added a detailed description here: https://board.jdownloader.org/showpo...3&postcount=11
To sum it up, what he wants is:
- Search that website for keywords
- Add the last X pages of the results
The website might also display a reCaptchaV2 or a Cloudflare check on search attempts.
I told him that he will probably either need a very customized script, or to edit our official plugin and add the functionality he wants.
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
On Vacation / Im Urlaub: Start 2023-12-09, End TBA
#1082
#1083
You are right, but how would you manage the thing with the "keywords" he wants?
Also via link crawler rule --> only allow it to pick up URLs containing the keywords?
-psp-
#1084
That page lists only the most recent 15 releases, so it will not take long to crawl. A shorter interval can be used if the page is updated frequently.
Linkcrawler rule > deepPattern > create an html/url pattern which contains the keywords (best/most efficient option).
OR
Linkgrabber filter: block URLs which do not contain the keyword.
#1085
Hm, you are right, this could work --> and I am wrong.
Sometimes things are easier than expected at first glance.
-psp-
#1086
In the linkcrawler rule I have tried several combinations for the deepPattern with a single keyword, and none of them worked, much less with two keywords separated by a space. Nothing is filtered, it adds everything, and crawls forever.
The linkgrabber filter will not work properly either, because it shows both the 'filtered' links and the thousands of 'accepted' ones, and continues to add everything else indefinitely, obviously because I'm unable to set a proper deepPattern.

Here are three variations of deepPatterns that I have tried, none of which worked:

Code:
"deepPattern" : "class="RARBG"><a href="([^"]+)""
"deepPattern" : "(http.+\\RARBG)"
"deepPattern" : "(**External links are only visible to Support Staff**

And the full rule:

Code:
[ {
  "enabled" : true,
  "cookies" : null,
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 0,
  "id" : 1582157977984,
  "name" : "rmz.cr",
  "pattern" : "**External links are only visible to Support Staff**,
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "class="RARBG"><a href="([^"]+)"",
  "rewriteReplaceWith" : null
} ]

Last edited by RPNet-user; 20.02.2020 at 06:50.
#1087
How do you expect this to work?
Your rule does not even contain your keywords anywhere. Anyway, here is a blank example:
Code:
[ {
  "enabled" : true,
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "id" : 1422443765154,
  "name" : "rmz.cr example rule",
  "pattern" : "https?://rmz\\.cr/",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(/release/keyword1[a-z0-9\\-]+keyword2[a-z0-9\\-]+keyword3)",
  "rewriteReplaceWith" : null
} ]
Code:
[ {
  "enabled" : true,
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "id" : 1422443765154,
  "name" : "rmz.cr example rule",
  "pattern" : "https?://rmz\\.cr/",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(/release/[a-z0-9\\-]+480p[a-z0-9\\-]+)",
  "rewriteReplaceWith" : null
} ]
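To see how the 480p deepPattern filters links, the same expression can be tested as a plain JavaScript regex (in the rule's JSON, `\\-` collapses to the single escape `\-` used below). The sample hrefs are made up for the demonstration:

```javascript
// The deepPattern from the 480p rule, written as a JavaScript regex.
var deepPattern = /(\/release\/[a-z0-9\-]+480p[a-z0-9\-]+)/;

// Hypothetical hrefs standing in for links found on the listing page:
var sampleLinks = [
    "/release/some-movie-2020-480p-webrip-x264",
    "/release/some-movie-2020-1080p-bluray-x265",
    "/release/another-show-s01e01-480p-hdtv"
];

// Keep only the hrefs the deepPattern would capture:
var matches = sampleLinks.filter(function (href) {
    return deepPattern.test(href);
});
// matches contains only the two 480p links.
```

Testing a pattern this way before pasting it into the rule saves a crawl cycle per attempt.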
#1088
@psp, Thanks
Yes, my sample did contain the keyword, as I was testing with the keyword 'RARBG', but obviously I was not using the proper syntax and regular expressions for the "pattern" : "https?://rmz\\.cr/" and the deepPattern "/release/keyword[a-z0-9\\-]". Not that it matters anyway, since it still will not work.

I tested your 480p sample and it does grab all 480p links, and I also tested 1080p by replacing 480p with 1080p, which also works. However, adding a second keyword does not work, for example RARBG. So, with your 480p sample, I replaced 480p with the keyword RARBG (the same keyword I tested with incorrect syntax before my previous post), and it does not work. There is a single 1080p RARBG release on the front page at the moment which the crawler does not add; yet when I use only the keyword 1080p, it does add that 1080p RARBG along with all the other 1080p releases. This means the linkcrawler rule does not work for keywords like 'RARBG', whether combined with another keyword or by itself. I also tried single keywords using only the release-group names, like VXT, ION10, etc., and none of them worked. BTW, although I assumed a-z would accept either upper or lower case, I also tested with A-Z, since some of these keywords are all upper case, and that still did not work.

Here is what I have found so far: the regex will accept any keyword in the body of the title before the hyphen "-" that separates the trailing word. For example, in the title title.2020.720p.webrip.x264.aac-expresso, the regex will accept any keyword prior to "-expresso", but nothing after the hyphen.

@mgpai How do you create a linkgrabber filter that blocks URLs which do not contain the keyword? So as not to show, not make visible, not include, and not accept anything other than the URLs that contain that keyword?

Last edited by RPNet-user; 21.02.2020 at 06:02.
#1089
regex: (?!keyword|s)
regex101.com is good to help write patterns. Paste in a sample of the content to match against, including some junk, and you can see what's what.
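The `(?!…)` above is a negative lookahead fragment; in a block filter it is usually anchored so the pattern matches every URL that does NOT contain the keyword. A quick sketch, with `rarbg` as an assumed example keyword:

```javascript
// Matches any string that does NOT contain "rarbg" (case-insensitive):
// the anchored negative lookahead rejects the keyword anywhere in the URL.
var blockPattern = /^(?!.*rarbg).+$/i;

// No keyword in the URL, so the block filter matches (link gets blocked):
var blocked = blockPattern.test("/release/title-2020-1080p-rmteam");

// Keyword present, so the block filter does not match (link is kept):
var kept = !blockPattern.test("/release/title-2020-1080p-RARBG");
```

This is the "block everything BUT what you want" approach in a single pattern.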
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170
Don't fight the system, use it to your advantage. :]
#1090
Thanks, but that's not going to help if no pattern will accept those types of keywords.
#1091
Sure, you need some way to identify objects within the URL, for instance. You can either look for what you want, or block everything that is not what you want, which is a negative lookaround: you create a filter that blocks everything BUT what you want. It should work, assuming the information is available within the URL.
#1092
Example (extend the character class to also match the dots in the title):
Code:
[.a-z0-9\\-]+
If you specify the correct 'deepPattern' in the linkcrawler rule, you will not need to create the linkgrabber filter rule. I second that.
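The point of the character class above is that including the dot lets the same pattern cover dotted release names. A quick sketch of the difference, using the title from the earlier post:

```javascript
var dottedTitle = "title.2020.720p.webrip.x264.aac-expresso";

// Without the dot in the class, the full title cannot match:
var withoutDot = /^[a-z0-9\-]+$/.test(dottedTitle);   // false

// With the dot included, it does:
var withDot = /^[.a-z0-9\-]+$/.test(dottedTitle);     // true
```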
#1093
@mgpai The title sample that includes those dots is the name pattern used for the actual downloadable file names, not the titles of the posts. It was just an example to point out that none of the keywords after the hyphen "-" are being accepted by the regex keyword search. And yes, I tried all those release names and several others after the hyphen in lowercase, and none of them worked, so the case does not appear to be affecting the regex keyword search. Although I didn't use the "i" flag, I did test with A-Za-z0-9 to verify that it was not a case issue.

Anyway, it is working now. I had to remove the keyword from the middle and add it after the last + quantifier, so it looks like this:

(/release/[a-z0-9\\-]+[a-z0-9\\-]+rmteam)

So keywords after the hyphen will never work anywhere except in the last keyword placement of the regex, regardless of case.

The site updates regularly with just 15 posts on the first page, which includes both TV shows and movies (unfiltered). However, at the top of the page there is an option to select "movies only", which adds /l/m after the domain name. At the bottom of each page the pages are numbered with links in the format /l/m/2, /l/m/3, and so on, so the second page looks like this: rmz.cr/l/m/2. If I wanted to add just the first five pages from "/l/m/" to my crawl, I assume I would have to change this in the pattern and the deepPattern regex?

Scratch that: I believe mgpai's script --> "Add urls to linkgrabber at user-defined intervals" will handle that. I will test and post back with results.

Last edited by raztoki; 22.02.2020 at 05:26. Reason: insert /quote bbcode
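A likely explanation for the placement issue, sketched as a quick test: group tags such as rmteam are the last token of the URL, so a middle placement followed by `[a-z0-9\-]+` can never match, because the trailing `+` demands at least one more character after the keyword. (The URL below is a made-up example.)

```javascript
var url = "/release/title-2020-720p-webrip-x264-aac-rmteam";

// Keyword at the end of the pattern (the placement that worked):
var trailing = /(\/release\/[a-z0-9\-]+rmteam)/.test(url);

// Keyword in the middle, followed by a mandatory character run:
// fails here because nothing follows "rmteam" at the end of the URL.
var middle = /(\/release\/[a-z0-9\-]+rmteam[a-z0-9\-]+)/.test(url);
```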
#1094
Hi, I have tried the following script.
It worked perfectly when downloading one file only. If I try to download multiple files, it fails. I tried both with and without "Synchronous execution of script", but neither worked for multiple files. This is the error I am getting:
Code:
net.sourceforge.htmlunit.corejs.javascript.EcmaError: SyntaxError: Unterminated object literal (#17)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3629)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3613)
    at net.sourceforge.htmlunit.corejs.javascript.NativeJSON.parse(NativeJSON.java:125)
    at net.sourceforge.htmlunit.corejs.javascript.NativeJSON.execIdCall(NativeJSON.java:97)
    at net.sourceforge.htmlunit.corejs.javascript.IdFunctionObject.call(IdFunctionObject.java:89)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1531)
    at script(:17)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
    at org.jdownloader.scripting.JSHtmlUnitPermissionRestricter$SandboxContextFactory.doTopCall(JSHtmlUnitPermissionRestricter.java:119)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3057)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
    at net.sourceforge.htmlunit.corejs.javascript.Context.evaluateString(Context.java:1212)
    at org.jdownloader.extensions.eventscripter.ScriptThread.evalUNtrusted(ScriptThread.java:288)
    at org.jdownloader.extensions.eventscripter.ScriptThread.executeScipt(ScriptThread.java:180)
    at org.jdownloader.extensions.eventscripter.ScriptThread.run(ScriptThread.java:160)
Code:
// Convert aac/m4a/ogg files to mp3 for youtube.com links
// Trigger required: "A Download Stopped".

var deleteSourceFile = true; // Set this to true to delete the source file after conversion.
var sourceFile = link.getDownloadPath();
var filetype = getPath(link.getDownloadPath()).getExtension();
var filename = link.getName();
var extLength = filetype.length + 1;
var newfilename = filename.substring(0, filename.length - extLength);
var downloadFolder = package.getDownloadFolder();
var destFile = downloadFolder + "\\" + newfilename + ".mp3";

if (link.isFinished()) {
    if (link.getHost() == "youtube.com") {
        if (filetype == "m4a" || filetype == "aac" || filetype == "ogg") {
            callSync(JD_HOME + "\\tools\\Windows\\ffmpeg\\x64\\ffmpeg.exe", "-v", "5", "-y", "-i", sourceFile, destFile);
        }
        if (deleteSourceFile && getPath(destFile).exists()) deleteFile(sourceFile, false);
    }
}

Now I am trying to make the image from the video the cover. I will update when that is done. I also want to detect if there is a square, to cut borders.

Greetings, Germini

Last edited by Germini; 22.02.2020 at 06:40.
#1095
Is there a trigger for shutdown?
I start a 3rd-party app on the "JDownloader started" trigger. Now I need to terminate it when JD stops.
#1096
Hello script master!
Sometimes a download does not complete and the status says "An Error occured!". I want to cycle through the download list, find all links with "An Error occured!" and reset them. I want to use an existing interval script.
Now the questions:
1. Are links with this error included in the running downloads from getRunningDownloadLinks()?
2. Or is it better to check myDownloadLink.getStatus()? Would the value to check be "An Error occured!", as written in the download list?
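A sketch of the status check from question 2, with caveats: the exact status text is an assumption taken from the download list, and in a real interval script you would feed in the actual link objects (e.g. from getAllDownloadLinks(), since errored links are no longer "running") and then reset the matches. Plain objects stand in for DownloadLink instances here:

```javascript
// Collect links whose visible status starts with the error text.
// The exact wording "An Error occured!" is assumed from the download list UI.
function findErrored(links) {
    var errored = [];
    for (var i = 0; i < links.length; i++) {
        var status = links[i].getStatus();
        if (status != null && status.indexOf("An Error occured") === 0) {
            errored.push(links[i]);
        }
    }
    return errored;
}

// Stand-in objects for illustration only:
var links = [
    { getStatus: function () { return "Finished"; } },
    { getStatus: function () { return "An Error occured!"; } },
    { getStatus: function () { return null; } }
];
var errored = findErrored(links);   // only the second link matches
```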
#1097
Code:
var myConditionalSkipReason = myDownloadLink.getConditionalSkipReason(); |
#1098
Trying to get 'yyyy-mm-dd_-_hh-mm-ss'. Is there a better method? This is a little messy. Thanks
Code:
var regex1 = /(\d{4})\-(\d{2})\-(\d{2})/;
var regex2 = /(\d{2}):(\d{2}):(\d{2})/;
var date = new Date().toJSON().substr(0, 10).replace(regex1, "$1-$2-$3");
var time = new Date().toTimeString().substr(0, 9).replace(regex2, "$1-$2-$3");
var dateTime = date + '_-_' + time;
alert(dateTime);
#1099
Code:
var dateTime = new Date().toJSON().substr(0, 19).replace("T", "_-_").replace(/:/g, "-");
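One caveat worth noting: toJSON() returns UTC, while toTimeString() in the previous post is local time, so the two approaches can differ by your timezone offset. Applied to a fixed UTC instant for illustration:

```javascript
// The one-liner applied to a fixed timestamp (UTC):
var d = new Date("2020-02-22T06:40:15Z");
var dateTime = d.toJSON().substr(0, 19).replace("T", "_-_").replace(/:/g, "-");
// dateTime is "2020-02-22_-_06-40-15"
```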
#1100
Merged EventScripter threads.
-psp-