16.06.2019, 13:56
smilies
Junior Loader
 
Join Date: Jan 2019
Posts: 13
LinkCrawler Rules - passing information through DEEPDECRYPT

Hi, I'd like to crawl a webpage A (then B, then C, ...) for links and modify those links. One of the modifications is to insert a custom_string into each link. This custom_string isn't constant; it's different for each webpage A, B, C. The webpage doesn't care what comes after the question mark in the URL, so I can make the URL look like original_url?custom_string if that helps. In addition, the page content contains Custom_string, i.e. a version of custom_string where the first letter is capitalized.
  • To crawl the webpage for links, I probably have to use DEEPDECRYPT. As far as I understand, DEEPDECRYPT can pass on information to the crawled links only from the contents of the webpage, but not from its address/URL. Is that correct?
If that is correct, then the solution seems to be this:
  • Use DEEPDECRYPT to match the links, Custom_string (capital letter!), and all_text_between_them. We have to match all_text_between_them because, as far as I understand, DEEPDECRYPT can only match contiguous strings. Is my understanding correct?
  • Modify the links using REWRITE. This includes my planned modifications mentioned above, plus removing all_text_between_them and replacing Custom_string with custom_string (making the first letter lowercase). To make the first letter lowercase, I'd need 26 REWRITE rules: one to replace A with a, one to replace B with b, and so on. Would having 26 REWRITE rules be slow? Edit: I'll try to use \l instead.
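To make the two steps above concrete, here is a rough sketch in plain Python, outside JDownloader. It only illustrates the regex idea (one contiguous match that captures the link and Custom_string, then a rewrite to original_url?custom_string); it is not LinkCrawler rule syntax. The page fragment, the "by Name" markup and the names PAGE, PATTERN, lower_first and rewrite are all made up for illustration. Python's re module has no \l escape in replacements, so a callback does the lowercasing here; whether \l works in a REWRITE rule is not something this sketch can confirm.

```python
import re

# Hypothetical page fragment (assumption): a link, unrelated markup,
# then Custom_string with a capital first letter.
PAGE = '<a href="https://example.com/file/123">get</a> <span>by Alice</span>'

# One contiguous match: the link (group 1), a lazy gap that stands in for
# all_text_between_them, then Custom_string (group 2).
PATTERN = re.compile(r'href="(https?://[^"]+)".*?by ([A-Z][a-z]+)', re.DOTALL)


def lower_first(text):
    """Lowercase a leading A-Z with a single rule instead of 26 separate ones."""
    return re.sub(r"^[A-Z]", lambda m: m.group(0).lower(), text)


def rewrite(page):
    """Return each matched link in the form original_url?custom_string."""
    return [f"{m.group(1)}?{lower_first(m.group(2))}" for m in PATTERN.finditer(page)]


if __name__ == "__main__":
    print(rewrite(PAGE))
```

The lazy gap (.*?) is what drops all_text_between_them: it is matched but not captured, so only the two groups end up in the rewritten link.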
Do you plan to make the LinkCrawler more powerful? For example, DEEPDECRYPT could pass on info from the URL and not only from the page contents. Or REWRITE (or another command) could allow things like adding 32 to the ASCII code of the first letter of Custom_string, so that it is made lowercase. Or what about adding a text box to the JDownloader settings where you can write scripts directly (in any imperative language), with variables etc., without having to set up an IDE and boilerplate code to write and compile entire plugins? Thanks!
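For reference, the "add 32 to the ASCII code" idea mentioned above looks like this in plain Python. This is only a sketch of the arithmetic, not an existing JDownloader feature, and the function name lower_first_ascii is made up. It relies on the ASCII layout, where 'A' is 65 and 'a' is 97, so every uppercase letter sits exactly 32 below its lowercase form.

```python
def lower_first_ascii(text):
    """Lowercase the first character by adding 32 to its ASCII code.

    Only A-Z is shifted; any other first character is left untouched.
    """
    if text and "A" <= text[0] <= "Z":
        return chr(ord(text[0]) + 32) + text[1:]
    return text


if __name__ == "__main__":
    print(lower_first_ascii("Custom_string"))
```

The range check matters: adding 32 blindly would corrupt digits, punctuation and letters that are already lowercase.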

Last edited by smilies; 16.06.2019 at 16:19.