Your answer shows that, once again, you either haven't read any of the help articles I linked or you completely failed to understand them.
If you do not understand parts of our article, please let us know so we can improve it instead of having to describe something again and again in our forums.
I'll try to explain it in more detail:
So back to the beginning:
You have a website which JD does not have a plugin for and you want JD to do two things:
1. Recognize a specific URL structure.
2. Find something on that website (in this case, only those pictures).
This means we first need to write a regular expression that matches the URLs our rule is supposed to work on, because we usually do not want the rule to process all URLs from that website but only those with a specific structure (our pattern).
In this case it is roughly:
asian-sirens.net/wp/1234/12/bla-bla/
--> So we could use the following as our pattern value:
Code:
**External links are only visible to Support Staff**
Quote:
Originally Posted by verheiratet1952
how to make sure first that deep-parser is working even if cloudflare is active?
You could do this by making a really basic rule that simply crawls everything (basically the one from your first attempt but with the fixed regular expression).
Example:
Code:
[
{
"enabled": true,
"logging": false,
"maxDecryptDepth": 1,
"name": "example rule crawl EVERYTHING from specific URLs from website asian-sirens.net",
"pattern": "**External links are only visible to Support Staff**,
"rule": "DEEPDECRYPT",
"packageNamePattern": null,
"deepPattern": null
}
]
Example as plaintext for easier copy & paste:
pastebin.com/raw/Ppw6CSpf
To test it, add the URL you provided in your original post. If JD finds some or all of these pictures, you can be sure that:
- The pattern is working
- There are no issues with Cloudflare
While we're at it:
You could also use tools like regex101.com to verify beforehand that the pattern we created for our rule matches the URLs we want it to work for.
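The same kind of check can be sketched offline in Python. The pattern below is a hypothetical stand-in built from the rough URL structure shown earlier (asian-sirens.net/wp/1234/12/bla-bla/); it is not the pattern hidden in the code box above, and the helper name rule_would_match is made up for this example. JD uses Java regular expressions, which are close to but not identical with Python's, so treat this as a rough check only.

```python
import re

# Hypothetical pattern for illustration only; NOT the pattern hidden in the post.
# It assumes URLs of the form asian-sirens.net/wp/<4 digits>/<2 digits>/<slug>/
URL_PATTERN = re.compile(
    r"https?://(?:www\.)?asian-sirens\.net/wp/\d{4}/\d{2}/[^/]+/?"
)


def rule_would_match(url: str) -> bool:
    """Return True if the whole URL fits the hypothetical pattern.

    Using fullmatch (a strict whole-URL check) is an assumption of this sketch.
    """
    return URL_PATTERN.fullmatch(url) is not None


if __name__ == "__main__":
    for candidate in (
        "https://asian-sirens.net/wp/1234/12/bla-bla/",  # expected structure
        "https://asian-sirens.net/about/",               # different structure
    ):
        print(candidate, rule_would_match(candidate))
```

If a URL you expect the rule to handle reports False here, fix the pattern before suspecting Cloudflare or the deep parser.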
...so let's continue.
Quote:
Originally Posted by verheiratet1952
but I have no idea where to add 'property' - cannot find any help at google searching for example "linkcrawler property"
Do not google it.
Use examples found in our knowledge base and also ones you can find in our forum.
It's all explained in our help article but I'll add a more detailed explanation here:
Now you have a working rule, but it crawls everything instead of only the pictures you want. So let's add a "deepPattern" to the rule we created before, to tell it what to look for inside the website's HTML code.
That is what we need deepPattern for:
Code:
[
{
"enabled": true,
"logging": false,
"maxDecryptDepth": 1,
"name": "example rule crawl PICTURES from specific URLs from website asian-sirens.net",
"pattern": "**External links are only visible to Support Staff**,
"rule": "DEEPDECRYPT",
"packageNamePattern": null,
"deepPattern": "\"(https?://www\\.asian-sirens\\.net/uploads/[0-9]{4}/[^\"]+)"
}
]
Example as plaintext for easier copy & paste:
pastebin.com/raw/0RPb35i0
(As you can see, I've modified the "deepPattern" once again - it is not the same version that I've shown you in my previous reply.)
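The doubled backslashes and the \" sequences in the "deepPattern" value above are JSON string escaping, not part of the regular expression itself. A minimal Python sketch of how a raw regular expression can be turned into such a JSON string value (the variable names are made up for this example):

```python
import json

# The regular expression as you would type it into a regex tester (unescaped).
RAW_REGEX = r'"(https?://www\.asian-sirens\.net/uploads/[0-9]{4}/[^"]+)'

# json.dumps adds the surrounding quotes and escapes backslashes and quotes,
# giving a value that can be pasted after "deepPattern": in the rule.
escaped = json.dumps(RAW_REGEX)

if __name__ == "__main__":
    print(escaped)
```

Going the other way, json.loads on the escaped value gives back the raw expression, which is the form to paste into a regex tester.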
Now test this one again and you should get only pictures.
If you still get too many, you might have to modify your deepPattern to try to only get the items you want.
For some websites this might be impossible but in this case it should be possible.
Again, you could use web tools like regex101.com to test your deepPattern against the HTML of your target website, so you can see beforehand which results JD would find.
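As a rough offline alternative, the deepPattern from the rule above can be run against a piece of HTML in Python. The HTML snippet and the helper name find_links are invented for this sketch and are not taken from the real website; JD itself uses Java regular expressions, so results may differ in edge cases.

```python
import re

# The deepPattern from the rule, with the JSON escaping removed.
DEEP_PATTERN = re.compile(
    r'"(https?://www\.asian-sirens\.net/uploads/[0-9]{4}/[^"]+)'
)

# Made-up HTML for illustration; the real page will look different.
SAMPLE_HTML = """
<a href="https://www.asian-sirens.net/uploads/2020/example-01.jpg">full size</a>
<img src="https://www.asian-sirens.net/theme/logo.png">
<a href="https://www.asian-sirens.net/uploads/2020/example-02.jpg">full size</a>
"""


def find_links(html: str) -> list[str]:
    """Return every URL captured by the first group of the deepPattern."""
    return DEEP_PATTERN.findall(html)


if __name__ == "__main__":
    for link in find_links(SAMPLE_HTML):
        print(link)
```

In this sample only the two /uploads/ URLs are captured and the logo is skipped, which is the kind of filtering the deepPattern is meant to do.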
...by the way this one works exactly like the other one I've created for you here:
https://board.jdownloader.org/showpo...02&postcount=6
Now if you want nicer package names, for example, you could make use of packageNamePattern with another regular expression, but that's up to you.
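A small sketch of what such a regular expression could look like, tried out in Python. It assumes that packageNamePattern is matched against the page's HTML and that the first capture group becomes the package name; the pattern, the sample HTML and the helper name package_name are all assumptions made for this example, so check the help article for the exact behaviour.

```python
import re
from typing import Optional

# Hypothetical packageNamePattern value; an assumption, not taken from the post.
PACKAGE_NAME_PATTERN = re.compile(r"<title>(.*?)</title>", re.DOTALL)

# Made-up HTML for illustration only.
SAMPLE_HTML = "<html><head><title>bla-bla gallery</title></head><body></body></html>"


def package_name(html: str) -> Optional[str]:
    """Return the text of the first capture group, or None if nothing matches."""
    match = PACKAGE_NAME_PATTERN.search(html)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    print(package_name(SAMPLE_HTML))
```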
-psp-