JDownloader Community - Appwork GmbH
 

  #1  
Old 14.04.2022, 05:28
MediaFanatic
Junior Loader
 
Join Date: Mar 2019
Posts: 12
Linkcrawler Passing Cookies

I've worked on and off for over two years, and for 70+ hours, on different attempts to write LinkCrawler rules in JDownloader that pass cookies.

I've never been able to get them to work. I previously opened a thread on this topic. After going into great detail with the syntax and logging, the reply I received was: Cookies in Link Crawler rules are not very good and need to be worked on.

I assumed I would give it time, for these issues to be resolved. Every few months I would try again. To this day, I cannot write a LinkCrawler rule that passes cookies.

Here's a simple example, with sensitive data removed:

Code:
[
 {
  "name"               : "SaNet-SoftArchive",
  "id"                 : 1649901977688,
  "enabled"            : true,
  "pattern"            : "**External links are only visible to Support Staff**",
  "maxDecryptDepth"    : 1,
  "rewriteReplaceWith" : null,
  "passwordPattern"    : null,
  "packageNamePattern" : "<title>(.*?)</title>",
  "rule"               : "DEEPDECRYPT",
  "logging"            : true,
  "formPattern"        : null,
  "deepPattern"        : "(rapidgator\\.net/file/|nitroflare\\.com/view/|nitro\\.download/view/|filefactory\\.com/file/)",
  "updateCookies"      : true,
  "cookies"            : [
                          ["id","123456"],
                          [
                           "sa_remember",
                           "xxxxxd5d26xxxxx0ae03dxxxxxe62170xxxxx359b6xxxxx7517fxxxxx3bxxxxx"
                          ],
                          [
                           "AdskeeperStorage",
                           "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
                          ],
                          [
                           "PHPSESSID",
                           "xxxxx2b1cbaxxxxx46cxxxxx669xxxxx"
                          ]
                         ]
 }
]
As you can see, this is an extremely simple example.

This is for the website "SoftArchive" (sanet), the largest OCH-software indexer as rated by traffic.

I test the exact same settings manually and I am authenticated (links are shown). Through jDownloader, the cookies are not sent.

This is the same situation on every site I've tested. I've confirmed the crawlers are running in the log.
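
For a manual comparison like the one described above, the `cookies` array from the rule can be turned into the `Cookie` header a client would be expected to send. This is a minimal sketch, not JDownloader's own code: the helper `cookie_header` and the URL `https://example.com/` are hypothetical, and the cookie values are placeholders.

```python
from urllib.request import Request

# Cookie pairs in the same [name, value] layout as the rule's "cookies" array.
# The values are placeholders, not real session data.
rule_cookies = [
    ["id", "123456"],
    ["PHPSESSID", "xxxxx2b1cbaxxxxx46cxxxxx669xxxxx"],
]


def cookie_header(pairs):
    """Join [name, value] pairs into a single Cookie request-header value."""
    return "; ".join(f"{name}={value}" for name, value in pairs)


header = cookie_header(rule_cookies)

# Build (but do not send) a request; example.com is a placeholder URL.
request = Request("https://example.com/", headers={"Cookie": header})
print(request.get_header("Cookie"))
```

The printed value can then be compared by eye with the request headers that appear in the log when the rule has `"logging": true`.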
  #2  
Old 14.04.2022, 10:24
Jiaz
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 76,922

@MediaFanatic:
Next time, please ask for help earlier. We can often help you faster, or tell you where to look or what the problem might be.

You can enable logging of your rule, then you can see if the rule matches and what the server response is.
https://support.jdownloader.org/Know...kcrawler-rules
set
Quote:
"logging": true,
What I can tell is that your deepPattern is wrong. Your pattern will not return any links at all. Please know that the deepPattern specifies in what matching group JDownloader should be looking for links. In your case the matching group will just be "host.com/file", no protocol, no fileID, nothing.
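
A minimal sketch of what that means, using Python's `re` module as a stand-in for JDownloader's regex engine (an assumption made only for illustration; the HTML snippet and file IDs are made up):

```python
import re

# deepPattern from post #1, as the regex engine sees it after JSON unescaping.
deep_pattern = (
    r"(rapidgator\.net/file/|nitroflare\.com/view/"
    r"|nitro\.download/view/|filefactory\.com/file/)"
)

# Hypothetical page content; the URLs and file IDs are invented.
html = (
    '<a href="https://rapidgator.net/file/abc123/setup.rar.html">mirror 1</a>'
    '<a href="https://nitroflare.com/view/XYZ789/setup.rar">mirror 2</a>'
)

# With a single capturing group, findall returns only the group contents.
found = re.findall(deep_pattern, html)
print(found)  # ['rapidgator.net/file/', 'nitroflare.com/view/']
```

Neither result carries a protocol or a file ID, so neither is a usable link.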

We can help with the rule, but we need a username/password or cookies. Send them to support@jdownloader.org (including your rule).
__________________
JD-Dev & Server-Admin

Last edited by Jiaz; 14.04.2022 at 10:27.
  #3  
Old 14.04.2022, 10:32
Jiaz

Quote:
Originally Posted by MediaFanatic View Post
I assumed I would give it time, for these issues to be resolved.
Currently there are no known issues with cookies in Linkcrawler rules.

Last edited by Jiaz; 26.04.2022 at 10:23.
  #4  
Old 26.04.2022, 09:15
MediaFanatic

@Jiaz --
Thank you for the reply. I'm sorry I didn't reply earlier (for some reason I didn't receive the forum notification despite being subscribed -- will check spam folder).

I should clarify: I did bring this to your attention much earlier. When I wrote above about "waiting", it was because my earlier forum posts had hit a dead end. In that thread, @raztoki mentioned that there wasn't a solution and that the cookie implementation was not ideal.

I interpreted that to suggest that the cookie implementation would improve over time. In the meantime I continued testing different sites, writing my own demo page to test with it, etc.

Of course I've been using the logging; however, that has not helped. It does not state the reason the rule didn't work, as you suggested it would. When I started about two years ago, the custom-LinkCrawler logging was more helpful. The logging has changed since then and (I'm sure I could be wrong) from my impression, in the area of custom-LinkCrawlers, it has shown fewer details in the newer logging approach.

If you are interested in my original message that resulted in a dead-end, I would be very happy if you could take a look -- starting at the first post and reading chronologically: https://board.jdownloader.org/showthread.php?t=83773

I included the full logging in that message; that was prior to the new logging and it was a bit more helpful (for my unique case).

Thank you in advance for your time and insight

Last edited by MediaFanatic; 26.04.2022 at 09:36.
  #5  
Old 26.04.2022, 09:50
MediaFanatic

Sorry - One more thing to add --

Quote:
Originally Posted by Jiaz View Post
Please know that the deepPattern specifies in what matching group JDownloader should be looking for links. In your case the matching group will just be "host.com/file", no protocol, no fileID, nothing.
The DeepPattern I'm using is in the same format specified in the articles / posts I have been able to find on LinkCrawler Rules.

It's the exact format you mentioned "host.com/file". There is no protocol, no fileID, nothing else -- just as you've written.

Here it is, one more time:
Code:
"(rapidgator\\.net\\/file\\/|nitroflare\\.com\\/view\\/|nitro\\.download\\/view\\/|filefactory\\.com\\/file\\/)"
Because this is regex, I have to escape the dot "." -- however, in simpler terms, I'm just doing what you've said (host.com/file) using "OR" logic, nothing more.
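
The two layers of escaping in that string (JSON first, regex second) can be checked with a short sketch. It uses Python's `json` and `re` modules purely for illustration, with a shortened version of the pattern; JDownloader's own parsing is assumed, not shown, to behave the same way for these simple escapes.

```python
import json
import re

# A shortened deepPattern value exactly as it is typed in the rule file (JSON text).
json_text = r'"(rapidgator\\.net\\/file\\/|filefactory\\.com\\/file\\/)"'

# JSON decoding turns every "\\" into a single backslash.
regex_source = json.loads(json_text)
print(regex_source)

# "\/" and "/" mean the same thing to the regex engine.
escaped_slash = re.compile(regex_source)
plain_slash = re.compile(r"(rapidgator\.net/file/|filefactory\.com/file/)")
sample = "https://rapidgator.net/file/abc123"
same_result = escaped_slash.findall(sample) == plain_slash.findall(sample)

# An unescaped dot matches any character, so it is too permissive.
unescaped_dot_matches = re.search(r"rapidgator.net", "rapidgatorXnet") is not None
escaped_dot_matches = re.search(r"rapidgator\.net", "rapidgatorXnet") is not None
print(same_result, unescaped_dot_matches, escaped_dot_matches)
```

So escaping the dot matters, while escaping the forward slash is harmless but optional.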

There is one mistake, though: unfortunately, it crept in when I typed the sample here in the forum, and it is not in the original rule. I have corrected my mistake and created a sample where you can see very clearly how simply this rule works:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

Thank you again for all of your help!
  #6  
Old 26.04.2022, 15:57
Jiaz

@MediaFanatic: Nothing to be sorry for. Yes, I'm aware of the old thread but I'm currently not aware of any problems with cookie support in linkcrawler rules.

Quote:
Originally Posted by MediaFanatic View Post
The logging has changed since then and (I'm sure I could be wrong) from my impression, in the area of custom-LinkCrawlers, it has shown fewer details in the newer logging approach.
With logging enabled within the rule, the log will contain request/response from the corresponding requests. I'm not aware of any problems with the logging, maybe you can explain what you're missing?
I also took another look at https://board.jdownloader.org/showpo...76&postcount=1 and I wonder about this rule, because it contains fields that are not supported/handled/used at all, like accountPattern/domainPattern. It looks like a Linkcrawler Rule mixed with a DomainRule, because accountPattern/domainPattern are part of DomainRules and not Linkcrawler Rules. It is also missing the essential pattern field, so JDownloader will not make use of that rule at all.
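
The two problems named here can be checked mechanically before a rule is pasted into JDownloader. The following is a minimal sketch, not part of JDownloader: the helper `check_rule` is hypothetical, it only knows about the two field names mentioned in this post, and the sample rule is invented.

```python
# Fields named in this thread as belonging to DomainRules, not Linkcrawler Rules.
DOMAIN_RULE_ONLY = {"accountPattern", "domainPattern"}


def check_rule(rule):
    """Return a list of human-readable problems found in one rule dict."""
    problems = []
    if not rule.get("pattern"):
        problems.append("missing 'pattern': the rule will not be used")
    for field in sorted(DOMAIN_RULE_ONLY.intersection(rule)):
        problems.append(f"'{field}' is a DomainRule field and is not used here")
    return problems


# Invented rule that mixes both rule types and lacks a pattern.
mixed_rule = {
    "name": "example",
    "rule": "DEEPDECRYPT",
    "domainPattern": "example\\.com",
}

for problem in check_rule(mixed_rule):
    print(problem)
```

This is not a complete validator; it only flags the two issues discussed above.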

I guess the problem is a wrong/mixed-up rule in the first place. pspzocker wrote good help articles on it, see
https://support.jdownloader.org/Know...kcrawler-rules
https://support.jdownloader.org/Know...kcrawler-rules

Last edited by Jiaz; 26.04.2022 at 16:06.
  #7  
Old 26.04.2022, 16:07
Jiaz

Quote:
Originally Posted by MediaFanatic View Post
The DeepPattern I'm using is in the same format specified in the articles / posts I have been able to find on LinkCrawler Rules.

It's the exact format you mentioned "host.com/file". There is no protocol, no fileID, nothing else -- just as you've written.

Here it is, one more time:
Code:
"(rapidgator\\.net\\/file\\/|nitroflare\\.com\\/view\\/|nitro\\.download\\/view\\/|filefactory\\.com\\/file\\/)"
This pattern is wrong. JDownloader will just find the matching group! Your pattern must include the complete link. In your case JDownloader will just return, for example, "rapidgator.net/file".
You can see it in your regex101.com/r/x8Z5bo/1 too: see "match information" on the right side. That is what JDownloader will *find*, and those are not valid links at all.

Your deepPattern must match either exactly the link you want to find or the region where the links can be found.
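
As a sketch of the first option, here is one possible pattern whose capturing group covers the complete link. It is an illustration under assumptions, not the rule that was eventually used: the pattern `fixed_pattern` and the HTML are invented, and Python's `re` stands in for JDownloader's regex engine.

```python
import json
import re

# One capturing group around the whole URL; the inner groups are non-capturing.
fixed_pattern = (
    r"(https?://(?:www\.)?"
    r"(?:rapidgator\.net/file/|nitroflare\.com/view/"
    r"|nitro\.download/view/|filefactory\.com/file/)"
    r'[^"<>\s]+)'
)

# Hypothetical page content; the URLs and file IDs are invented.
html = (
    '<a href="https://rapidgator.net/file/abc123/setup.rar.html">mirror 1</a>'
    '<a href="https://nitroflare.com/view/XYZ789/setup.rar">mirror 2</a>'
)

found = re.findall(fixed_pattern, html)
print(found)

# json.dumps shows how the pattern has to be escaped inside the rule file.
rule_fragment = json.dumps({"deepPattern": fixed_pattern})
print(rule_fragment)
```

Each match now includes the protocol, host, and file ID, which is what a downloadable link needs.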

I can help with the rule but I need working cookies. You can send them to support@jdownloader.org

Last edited by Jiaz; 26.04.2022 at 16:13.
  #8  
Old 27.04.2022, 02:54
MediaFanatic

@Jiaz, thank you!

Yes, exactly, @pspzocker's page you linked is one of two that I was using to learn/compose the rule. I also searched posts on this forum.

To your comment re:Invalid Fields --

If you examine my rule above, in the first post, do you see any fields that are invalid? Is there anything unrelated to LinkCrawler? (I didn't see a domainPattern or accountPattern; those may have been in my original post, where I was still trying to test different ideas from the forum.) Hopefully I'm not including any incorrect fields any longer.

Your second post about the "deepPattern" is perfect.

I had no idea that it had to match the entire link; I thought it was a wildcard, where anything that matched would be automatically searched for the full HREF. I also didn't realize that it needed to be an explicit match-group (separate parenthesis to indicate capture group).

Based on your help, I simply wrote a RegEx rule that took everything between the specific HREF HTML tags that included downloads. This worked perfectly!
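
The actual rule is not shown in the thread, so the following is only a sketch of the region-based idea: capture the block of HTML that holds the download links, then look for links inside it. The `<div class="downloads">` container, the URLs, and both patterns are invented, and Python's `re` is used as a stand-in for JDownloader's own link search.

```python
import re

# Hypothetical page: one navigation link and one block of download links.
html = """
<div class="nav"><a href="https://example.com/about">About</a></div>
<div class="downloads">
  <a href="https://rapidgator.net/file/abc123/setup.rar.html">mirror 1</a>
  <a href="https://nitroflare.com/view/XYZ789/setup.rar">mirror 2</a>
</div>
"""

# (?s) lets "." match newlines, since the region spans several lines.
region_pattern = r'(?s)<div class="downloads">(.*?)</div>'
region = re.search(region_pattern, html).group(1)

# Only links inside the captured region are considered.
links = re.findall(r'href="([^"]+)"', region)
print(links)
```

The navigation link is left out because it sits outside the captured region.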

Thanks to your suggestion re: deepPattern, which eliminated that issue, I was also able to play with the JSON and finally get my first cookie to pass properly. This allowed me to see the cookie problems more clearly and resolve them, at least on the simple test website I created to diagnose the issue.

Now that I have a test scenario working, I'll return to my original problem with LinkedIn Learning, to see if I can create a method for jDownloader to download my videos.

Thank you again for your help

Last edited by MediaFanatic; 27.04.2022 at 02:56.
  #9  
Old 27.04.2022, 10:25
Jiaz

Quote:
Originally Posted by MediaFanatic View Post
To your comment re:Invalid Fields --
If you examine my rule above, in the first post, do you see any fields that are invalid?
I'm sorry, my bad. I was referring to your other thread
https://board.jdownloader.org/showthread.php?t=83773
  #10  
Old 27.04.2022, 10:25
Jiaz

Quote:
Originally Posted by MediaFanatic View Post
I had no idea that it had to match the entire link; I thought it was a wildcard, where anything that matched would be automatically searched for the full HREF. I also didn't realize that it needed to be an explicit match-group (separate parenthesis to indicate capture group).
I will ask pspzockerscene to explain this more explicitly.

Quote:
Originally Posted by MediaFanatic View Post
Based on your help, I simply wrote a RegEx rule that took everything between the specific HREF HTML tags that included downloads. This worked perfectly!
You're welcome, and sorry it took so long to get to the root of the issue.

Last edited by Jiaz; 27.04.2022 at 10:30.
  #11  
Old 27.04.2022, 10:32
Jiaz

Quote:
Originally Posted by MediaFanatic View Post
Thank you again for your help
You're welcome, and next time better ask rather than wait an eternity.
Sometimes I also just read over or forget about issues/threads until I read about them again.
All times are GMT +2.