JDownloader Community - Appwork GmbH
 

  #1  
04.04.2020, 11:42
BerndBosch
[LinkCrawler Rule] Replacing a substring in the download URL

Hello,

I would like to replace a constant substring in certain URLs with another one (during crawling?).

When I copy the URL of an image gallery on **External links are only visible to Support Staff**www.meinbezirk.at, JDownloader finds all the images. That is already great.

The images there come in several resolutions; the ones ending in "_XXL" are the largest on offer.

However, I now know that the images also exist in an even higher resolution, ending in "_NATIVE" instead of "_XXL".

For example:
Image gallery:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

Image with the "_XXL" substring:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

Image with the "_NATIVE" substring:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

The trailing "?\d+" part of the URL seems to be obsolete: the server delivers the same image in both cases.


How can I get JDownloader to always try to download the images in the highest resolution (with "_NATIVE" at the end)?

PS: I can also program in Java.
  #2  
04.04.2020, 13:11
raztoki (English Supporter)

You can do this with LinkCrawler rules: one for finding the links and one for rewriting them.
LinkCrawler functions: https://board.jdownloader.org/showth...280#post422008
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]
  #3  
04.04.2020, 13:37
BerndBosch

Thanks raztoki,

I guess this would work, but where and how can I add these LinkCrawler rules?

I found "LinkCrawler: Link Crawler Rules" under Advanced Settings. Is this the place for such a rule?

Last edited by BerndBosch; 04.04.2020 at 13:48. Reason: added additional info
  #4  
04.04.2020, 13:56
raztoki (English Supporter)

Yes, sorry, I thought you knew based on your original post.

LinkCrawler rules mean you don't have to write a plugin for simple fetching tasks. For more complicated tasks it is best to create a decrypter plugin. You are still free to do that if you're not scared of some work =]
  #5  
04.04.2020, 17:11
BerndBosch

Thanks raztoki,

I assume that for this task, just replacing a substring, a LinkCrawler rule is sufficient. But how do I use it? Is there a tutorial for the syntax?

I assume it has to look something like this:

[{
  "enabled" : true,
  "cookies" : null,
  "updateCookies" : false,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "id" : 1585564566125,
  "name" : "meinBezirk.at",
  "pattern" : "https?://.*\\.meinbezirk\\.at/.*_XXL.*",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : null,
  "rewriteReplaceWith" : null
}]

Why do I need an id? What number has to be used?
Is the pattern key the URL of the gallery or of the image?
How do I tell it to replace "_XXL" with "_NATIVE"?
  #6  
04.04.2020, 17:40
raztoki (English Supporter)

The id can be left out and it will be generated. It is mostly for internal tracking of which rule triggered what, and probably also to indicate a rule that failed, duplicate rules, etc.

pattern is the pattern that triggers the rule (the URL of the page you add, not the image).
deepPattern matches the image URL (the image).

A second rule handles the image URL as directhttp (this rule is only needed if the image URLs do not already look like, say, prot://domain/file.jpg).

A third rule changes the URL from xxl (or whatever) with listeners: take the component(s) of the URL that were listened for and rebuild the URL with your changes.

This thread https://board.jdownloader.org/showthread.php?t=80070 has more on it, but I am not sure how much of it is visible due to the forum's URL moderation/masking.
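
As a rough illustration of the pattern / deepPattern split described above, here is a sketch in plain Java. It is not JDownloader code and does not use any JDownloader API; it only mimics the idea with java.util.regex. The hosts (`www.example.invalid`, `media.example.invalid`), the URL shapes, the HTML snippet and the class name `DeepPatternSketch` are all made up for illustration, since the real URLs are masked in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: mimics the idea behind "pattern" and "deepPattern".
// All URLs and names are hypothetical, not taken from meinbezirk.at.
class DeepPatternSketch {
    // Plays the role of "pattern": decides whether an added URL triggers the rule.
    static final Pattern PAGE =
            Pattern.compile("https?://www\\.example\\.invalid/gallery/\\d+");
    // Plays the role of "deepPattern": picks image URLs out of the page source.
    static final Pattern IMAGE =
            Pattern.compile("\"(https?://media\\.example\\.invalid/[^\"]+?_XXL\\.jpg)\"");

    static List<String> findImages(String pageUrl, String html) {
        List<String> found = new ArrayList<>();
        if (!PAGE.matcher(pageUrl).matches()) {
            return found; // the rule does not apply to this URL
        }
        Matcher m = IMAGE.matcher(html);
        while (m.find()) {
            found.add(m.group(1)); // group 1 is the URL without the surrounding quotes
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<img src=\"https://media.example.invalid/a/1_XXL.jpg\">"
                + "<script src=\"https://media.example.invalid/app.js\"></script>"
                + "<img src=\"https://media.example.invalid/a/2_XXL.jpg\">";
        // Only the two image URLs are collected; the .js URL is ignored.
        System.out.println(findImages("https://www.example.invalid/gallery/42", html));
    }
}
```

The point of the split is that the first expression is matched against the URL you add, while the second is matched against the content fetched from that URL, which is why a rule whose pattern targets the image URL never gets to see the gallery page.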
  #7  
29.08.2020, 22:53
BerndBosch

Hello raztoki,

I don't understand your answer. I have tried many ways, for hours now, and I am frustrated.

Can you please just show me how to replace one substring with another, so that every time I paste a link *XXL* is replaced by *NATIVE*?

I have worked my way from Assembler through COBOL to C++ and Java, and I also understand the concept of RegEx... but these LinkCrawler rules are really a pain in the ass.

Why don't you take (at least) one day and write a wiki page with at least the most important rules and a few examples, so that everyone can help to improve this software?
  #8  
31.08.2020, 17:37
pspzockerscene (Community Manager)

Hi,
Quote:
Originally Posted by BerndBosch
Hello,

I would like to replace a constant substring in certain URLs with another one (during crawling?).[...]
Here is a simple LinkCrawler rule that turns all "XL" and "XXL" images/URLs into "NATIVE":
Code:
[{
  "enabled" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "name" : "example rule meinbezirk.at replace _XL with _NATIVE",
  "pattern" : "(**External links are only visible to Support Staff**,
  "rule" : "REWRITE",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : null,
  "rewriteReplaceWith" : "$1_NATIVE.jpg"
}]

Here it is again on a Pastebin, to get around our forum's link censoring (in place for privacy reasons):
pastebin.com/EGeVVw31
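
To show how the capture group and the `$1` in "rewriteReplaceWith" fit together, here is a small stand-alone Java sketch. It assumes the rewrite behaves like an ordinary Java regex replacement, which is an assumption and not something confirmed in this thread. The expression below is only a guess at the general shape of such a rule, not the masked original; the host `media.example.invalid` and the class name `RewriteSketch` are hypothetical.

```java
import java.util.regex.Pattern;

// Illustration only: how a capture group and "$1" interact in a Java regex replacement.
// The expression is a guess at the shape of such a rule, NOT the masked original.
class RewriteSketch {
    // Group 1 captures everything before the size suffix; the trailing ".*"
    // swallows an optional query string such as "?1585564566".
    static final Pattern SIZE_SUFFIX =
            Pattern.compile("(https?://media\\.example\\.invalid/.+)_(?:XXL|XL)\\.jpg.*");

    static String toNative(String url) {
        // "$1" is replaced by whatever group 1 captured.
        // URLs that do not match are returned unchanged.
        return SIZE_SUFFIX.matcher(url).replaceFirst("$1_NATIVE.jpg");
    }

    public static void main(String[] args) {
        System.out.println(toNative("https://media.example.invalid/a/1_XXL.jpg?1585564566"));
        System.out.println(toNative("https://media.example.invalid/a/2_XL.jpg"));
        System.out.println(toNative("https://media.example.invalid/app.js"));
    }
}
```

Dropping the query string in this sketch follows the observation in the first post that the server seems to deliver the same image with or without it.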

Quote:
Originally Posted by BerndBosch
PS: I can also program in Java.
That is very good!
We are open source and you are welcome to contribute code.

Also keep in mind that LinkCrawler rules can only do relatively simple things: the more your crawler is supposed to "be able to do", the more sense it makes to write your own crawler plugin (in Java) for JD.
What you want here still seems doable with LinkCrawler rules, though.
You could now, for example, create a second rule that recognizes individual "meinbezirk.at" article URLs and looks for all images there. Then you can also add these links via JD's clipboard monitoring and get exactly the links you want, and not, for example, all the .js URLs from the HTML code of the meinbezirk website as well.

Quote:
Originally Posted by BerndBosch
Hello raztoki,

I don't understand your answer. I have tried many ways, for hours now, and I am frustrated.[...]
Well, with my rule above you should easily accomplish your goal.

Quote:
Originally Posted by BerndBosch
[...]
I have worked my way from Assembler through COBOL to C++ and Java, and I also understand the concept of RegEx... but these LinkCrawler rules are really a pain in the ass.

Why don't you take (at least) one day and write a wiki page with at least the most important rules and a few examples, so that everyone can help to improve this software?
So far I have not found the time for that.
HERE you can find an overview of our support articles so far.

LinkCrawler rules are an advanced feature that has no user interface, which is reason enough for it to be used mostly by more experienced users.
Many people already struggle with the regular expressions, and giving a "how do I learn RegEx" course is really not our job.

Otherwise I do agree with you: explanations and examples for a few simple LinkCrawler rules, and explanations of the properties, are still lacking.

Here too you are welcome to help and/or at least point out which examples/explanations you would like to see.

Regards,
pspzockerscene
EDIT

Sorry for the German-English mix, but that happens in our forum sometimes ^^
__________________
JD Supporter, Plugin Dev. & Community Manager

Getting Started & Tutorials || JDownloader 2 Setup Download

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz
Do you have Nero installed?
All times are GMT +2.