JDownloader Community - Appwork GmbH

#1 - 12.02.2021, 18:45
nathan1 (JD VIP)

[LinkCrawler Rule] request for eurekadll.asia

Hi staff,
can you add support for eurekadll.asia?

example links:

**External links are only visible to Support Staff**

#2 - 12.02.2021, 18:50
pspzockerscene (Community Manager)

Hi,

no.

A plugin for this website is not required, as the URLs are present directly in its HTML code.
If you want JD to auto-crawl URLs from this website, you could accomplish this by creating a LinkCrawler rule (type: DEEPDECRYPT).
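The general shape of such a rule is the following (a minimal skeleton only; the two UPPERCASE values are placeholders you would have to replace with your own regular expressions for this site):
Code:
[ {
  "enabled" : true,
  "maxDecryptDepth" : 1,
  "name" : "example DEEPDECRYPT rule",
  "pattern" : "REGEX_MATCHING_THE_PAGE_URLS_THE_RULE_SHOULD_HANDLE",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "REGEX_MATCHING_THE_LINKS_INSIDE_THE_HTML",
  "rewriteReplaceWith" : null
} ]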

-psp-

#3 - 15.02.2021, 17:30
nathan1 (JD VIP)

Hi psp,

I added this rule:

Code:
[ {
  "enabled" : true,
  "cookies" : null,
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 2,
  "id" : 1613401915616,
  "name" : "eureka example rule",
  "pattern" : "https?://eurekaddl\\.asia\\?s=\\d+",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "Download from <a href=\"(https?://[^\"]+)\"",
  "rewriteReplaceWith" : null
} ]
It works for links such as:

**External links are only visible to Support Staff**

but it also crawls many other spam links (for example facebook, disqus, etc.).

However, this rule doesn't work for links like:

**External links are only visible to Support Staff**

even if I set "maxDecryptDepth" : 2.

LOG
Code:
15.02.21 16.04.38 <--> 15.02.21 16.30.10 jdlog://2123725302851/

#4 - 15.02.2021, 18:10
pspzockerscene (Community Manager)

Quote:
Originally Posted by nathan1
It works for links such as:

**External links are only visible to Support Staff**
No it doesn't!
You need a separate rule for that.
I made one which only grabs filecrypt.cc URLs, as it seems that's the only host they're using.
(See the end of this post.)

Quote:
Originally Posted by nathan1
However, this rule doesn't work for links like:
The rule you've made does not work at all: your regular expression is wrong, and your DEEPDECRYPT pattern is also way too open.
You can use web tools like regex101.com to test your regular expressions.
Please keep in mind that it is not part of our support to teach users how regular expressions work.
Please re-read our LinkCrawler Rules documentation and learn how regular expressions work so you can write your own rules in the future.
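To illustrate one problem with your pattern: it is missing the "/" before the query string and only allows numeric search terms, so it does not match the site's real search URLs:
Code:
Your "pattern":      "https?://eurekaddl\\.asia\\?s=\\d+"
Working "pattern":   "https?://eurekaddl\\.asia/\\?s=.+"
(see rule 2 below)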

I've made two rules for your two different types of URLs for this website, see here:
Code:
[ {
  "enabled" : true,
  "maxDecryptDepth" : 1,
  "name" : "eurekaddl.asia 1: crawl filecrypt.cc URLs",
  "pattern" : "https?://eurekaddl\\.asia/[a-z0-9\\-]+/[a-z0-9\\-]+/",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(https?://filecrypt\\.cc/[^\"]+)",
  "rewriteReplaceWith" : null
},{
  "enabled" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "name" : "eurekaddl.asia 2: crawl search results",
  "pattern" : "https?://eurekaddl\\.asia/\\?s=.+",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "\"(https?://eurekaddl\\.asia/[a-z0-9\\-]+/[a-z0-9\\-]+/)\"",
  "rewriteReplaceWith" : null
} ]
Rules as plaintext, to get around our forum's "http" censoring:

pastebin.com/wecvyRtt

Please keep in mind that this rule should be optimized further, as at the moment it will also process "category URLs", which will slow it down.

-psp-

#5 - 16.02.2021, 13:26
nathan1 (JD VIP)

Thank you very much

#6 - 16.02.2021, 16:00
nathan1 (JD VIP)

@psp

In your JD crawler rule, is it possible to also specify an HTML tag so that the second rule doesn't capture all eurekaddl URLs from the /?s= query?

For example, the Cosey - Jonathan files I want to capture are 12 URLs, but with your second rule JD also captures URLs outside the HTML tag
Code:
class="container mainBg mainContainer"
which I would like to use as a boundary to limit the area in which links are captured.


#7 - 16.02.2021, 16:07
pspzockerscene (Community Manager)

Sure.
Well, you can't just "define an area" to search, as you're limited to that one pattern, but you can of course change the pattern so that it only grabs URLs that follow specific HTML classes representing the search results, e.g.:
Code:
  "deepPattern" : "<div class=\"teaser-box\">\\s*<a href=\"(https?://eurekaddl\\.asia/[a-z0-9\\-]+/[a-z0-9\\-]+/)\"",
As said, you will have to learn RegEx on your own in order to be able to write your own patterns!
See regex101.com
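To illustrate what that pattern is anchored to: it would match search-result markup of roughly this shape (a hypothetical example only; the real page layout may differ):
Code:
<div class="teaser-box">
  <a href="https://eurekaddl.asia/some-category/some-release-name/">Some release name</a>
</div>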


Full rule(s):
Code:
[ {
  "enabled" : true,
  "maxDecryptDepth" : 1,
  "name" : "eurekaddl.asia 1: crawl filecrypt.cc URLs",
  "pattern" : "https?://eurekaddl\\.asia/[a-z0-9\\-]+/[a-z0-9\\-]+/",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(https?://filecrypt\\.cc/[^\"]+)",
  "rewriteReplaceWith" : null
},{
  "enabled" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "name" : "eurekaddl.asia 2: crawl search results",
  "pattern" : "https?://eurekaddl\\.asia/\\?s=.+",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "<div class=\"teaser-box\">\\s*<a href=\"(https?://eurekaddl\\.asia/[a-z0-9\\-]+/[a-z0-9\\-]+/)\"",
  "rewriteReplaceWith" : null
} ]
Plaintext:
pastebin.com/mwHdCcxf

-psp-

#8 - 16.02.2021, 18:08
nathan1 (JD VIP)

I tried your edit, but it doesn't seem to work well: it crawls only 3 URLs instead of the 12 URLs that are inside that query.
You chose teaser-box as the tag and that's fine, but for some strange reason JD fetches only 3 links.

I also tried changing teaser-box to row, but then it fetches nothing at all, although I think I wrote it correctly:

Code:
  "deepPattern" : "<div class=\"row\">\\s*<a href=\"(https?://eurekaddl\\.asia/[a-z0-9\\-]+/[a-z0-9\\-]+/)\"",
LOG
Code:
16.02.21 16.59.12 <--> 16.02.21 16.55.45 jdlog://3943725302851/


#9 - 16.02.2021, 18:31
pspzockerscene (Community Manager)

Works just fine here.
Please make sure to use the exact rule I've posted:
pastebin.com/mwHdCcxf


Again:
You can check all regular expressions here:
regex101.com

Please learn how to use regular expressions on your own.

-psp-

#10 - 16.02.2021, 19:04
nathan1 (JD VIP)

Yes, you're right.
Strange problem. I switched to another Windows account where I have a second JDownloader installed, and there it captures all 12 links correctly.

But in the JDownloader on the other Windows account I still always get only 3 files.
Thanks for everything, psp!

#11 - 18.02.2021, 14:45
nathan1 (JD VIP)

@psp

I tried to test this link:
**External links are only visible to Support Staff**

but it crawls only 12 links, while there are about 45.
Is something wrong?

#12 - 18.02.2021, 14:48
pspzockerscene (Community Manager)

Just scroll down - there are multiple pages of search results.
You'd need to extend the rule so it also accepts the URL format of those pages and grabs them too.
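As a rough sketch, assuming the search result pages use WordPress-style pagination URLs such as eurekaddl.asia/page/2/?s=... (a guess - check the actual URLs in your browser), the "pattern" of the second rule could be widened like this:
Code:
  "pattern" : "https?://eurekaddl\\.asia/(?:page/\\d+/)?\\?s=.+",
In addition, the "deepPattern" would also have to match the pagination links, and "maxDecryptDepth" would possibly need to be raised above 1 so JD actually follows those pages.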

-psp-