JDownloader Community - Appwork GmbH
 

#1 marvelfrozen (Junior Loader), 07.04.2020, 09:55
Link Crawler can get files but not found on download

The website in question is, for example, this:

**External links are only visible to Support Staff**

**External links are only visible to Support Staff**

I use this rule and managed to crawl the list of image files:

Code:
[ {
  "enabled" : true,
  "cookies" : [ ],
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 9,
  "id" : 1586231755944,
  "name" : "my rule",
  "pattern" : "**External links are only visible to Support Staff**",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(?<=<p>| )<a href=\"([^\"]+theweb\\.tv/(?!(category|tag))\\S+(/|\\.jpg))\"",
  "rewriteReplaceWith" : null
} ]
*I use "theweb" there to mask the website; sorry if that is confusing.

But when I tried to download those files, they showed a "File not found" error in the download list.

Am I missing something?

Thank you for the help.

Last edited by marvelfrozen; 07.04.2020 at 10:14. Reason: mask domain

#2 tony2long (English Supporter), 07.04.2020, 10:42

Can you open the crawl result in a browser?

#3 marvelfrozen (Junior Loader), 07.04.2020, 10:49

Right-clicking and choosing "Open in browser" opens the source URL.

Is it supposed to open the image file instead?

Last edited by marvelfrozen; 07.04.2020 at 10:53.

#4 tony2long (English Supporter), 07.04.2020, 11:23

I can't test it because theweb.tv is not in the source of that example link.
In the source, the link after "Full size:" is the link that you should get.

#5 marvelfrozen (Junior Loader), 07.04.2020, 11:42

I sent you a private message. Please check.

This is the source (?) that I get if I crawl using the rule:
**External links are only visible to Support Staff**

And this is the one I get if I don't use the rule but instead block-select and copy the gallery:
**External links are only visible to Support Staff**

Why is it not directing to the image file when I'm using the rule?

#6 tony2long (English Supporter), 07.04.2020, 12:44

It seems that a non-zero "maxDecryptDepth" turns the links upside down.
So try with 2 rules: one for pictures only, with "maxDecryptDepth" : 0 and "deepPattern" : "Full size: <a href=\"([^\"]+)\""
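
For anyone who wants to check such a pattern outside JDownloader first, here is a minimal sketch in Python. It only emulates the regex: JDownloader itself uses Java regular expressions, and the sample HTML below is a made-up stand-in, since the real page source is not shown in this thread. The names `DEEP_PATTERN`, `SAMPLE_HTML` and `full_size_links` are hypothetical.

```python
import re
from typing import List

# The suggested deepPattern, written as a Python raw string
# (so no JSON escaping of the quotes is needed here).
DEEP_PATTERN = re.compile(r'Full size: <a href="([^"]+)"')

# Hypothetical page source; the markup of the real site may differ.
SAMPLE_HTML = (
    '<p>Full size: <a href="https://example.com/uploads/photo.jpg">'
    "photo.jpg</a></p>"
)


def full_size_links(html: str) -> List[str]:
    """Return every URL captured by group 1 of DEEP_PATTERN."""
    return DEEP_PATTERN.findall(html)


print(full_size_links(SAMPLE_HTML))
```

If the list comes back empty for the real page source, the text before the link is probably not exactly "Full size: " in the HTML.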

#7 marvelfrozen (Junior Loader), 07.04.2020, 13:26

I have added a second rule, so the setup looks like this
Code:
[ {
  "enabled" : true,
  "cookies" : [ ],
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 9,
  "name" : "rule 1",
  "pattern" : "**External links are only visible to Support Staff**",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(?<=<p>| )<a href=\"([^\"]+theweb\\.tv/(?!(category|tag))\\S+(/|\\.jpg))\"",
  "rewriteReplaceWith" : null
},{
  "enabled" : true,
  "cookies" : [ ],
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 0,
  "name" : "rule 2",
  "pattern" : "**External links are only visible to Support Staff**",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "Full size: <a href=\"([^\"]+)\"",
  "rewriteReplaceWith" : null
} ]
But the links are still not found on download. It also still puts the image link in the middle of the three URLs when I show the download URLs.

I have also removed the .jpg from the first rule, but that results in the crawler not getting any files at all.

Last edited by marvelfrozen; 07.04.2020 at 13:35.
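
A side note on escaping: every double quote inside a regex has to be written as \" once the regex sits inside a JSON rule, including the one inside a character class such as [^"]. The following is a rough Python sketch of how one could generate the escaped value and check a rule fragment before pasting it. It uses only the standard json module; the helper name `is_valid_json` is hypothetical, and whether JDownloader's own parser is equally strict is an assumption.

```python
import json

# The regex as it should reach the matcher (the pattern suggested in this thread).
regex = r'Full size: <a href="([^"]+)"'

# json.dumps adds the backslashes in front of the embedded double quotes.
rule_fragment = json.dumps({"deepPattern": regex})
print(rule_fragment)


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Same fragment, but with the quote inside [^"] left unescaped.
broken = '{"deepPattern": "Full size: <a href=\\"([^"]+)\\""}'

print(is_valid_json(rule_fragment))
print(is_valid_json(broken))
```

Parsing the generated fragment back gives the original regex again, while the variant with the unescaped quote is rejected by a strict parser.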

#8 pspzockerscene (Community Manager), 07.04.2020, 13:35

Works fine here like this:
Code:
[ {
  "enabled" : true,
  "maxDecryptDepth" : 1,
  "name" : "modelblog.tv replace thumbnail URL to full image URL",
  "pattern" : "(https?://modelblog\\.tv/wp-content/uploads/\\d{4}/\\d{2}/.*)(-\\d+x\\d+)\\.jpg",
  "rule" : "REWRITE",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : null,
  "rewriteReplaceWith" : "$1.jpg"
}, {
  "enabled" : true,
  "updateCookies" : true,
  "maxDecryptDepth" : 1,
  "name" : "modelblog.tv grab thumbnails from overview page",
  "pattern" : "https?://modelblog\\.tv/(?!wp-content)[^/]+/",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "(https?://modelblog\\.tv/wp-content/[^\"]+\\.jpg)",
  "rewriteReplaceWith" : null
} ]
This way you would only have to add the "overview" URL containing the thumbnails and save 1 step.

-psp-
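
To see what the two rules above do to a URL, here is a small sketch in Python. It is an emulation under assumptions, not JDownloader's implementation: JDownloader uses Java regex, where the replacement "$1.jpg" corresponds to r"\1.jpg" in Python. The sample markup and the file name are invented for illustration, and `to_full_size` and `crawl_overview` are hypothetical helper names.

```python
import re
from typing import List

# Patterns copied from the two rules above, with the JSON escaping removed.
REWRITE_PATTERN = re.compile(
    r"(https?://modelblog\.tv/wp-content/uploads/\d{4}/\d{2}/.*)(-\d+x\d+)\.jpg"
)
DEEP_PATTERN = re.compile(r'(https?://modelblog\.tv/wp-content/[^"]+\.jpg)')


def to_full_size(url: str) -> str:
    """Drop the "-<width>x<height>" thumbnail suffix, like the REWRITE rule."""
    return REWRITE_PATTERN.sub(r"\1.jpg", url)


def crawl_overview(html: str) -> List[str]:
    """Collect .jpg URLs like the DEEPDECRYPT rule, then rewrite each one."""
    return [to_full_size(url) for url in DEEP_PATTERN.findall(html)]


# Hypothetical overview-page markup with one thumbnail.
SAMPLE_HTML = (
    '<a href="#"><img src="https://modelblog.tv/wp-content/uploads/'
    '2020/04/photo-01-150x150.jpg"></a>'
)

for link in crawl_overview(SAMPLE_HTML):
    print(link)
```

URLs without a size suffix pass through `to_full_size` unchanged, so already full-size links are not affected.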

#9 marvelfrozen (Junior Loader), 07.04.2020, 14:05

Thank you, that works just like I wanted.

#10 pspzockerscene (Community Manager), 07.04.2020, 14:16

Thanks for your feedback.

-psp-
All times are GMT +2.