JDownloader Community - Appwork GmbH
 

  #1  
Old 01.12.2021, 15:23
verheiratet1952 verheiratet1952 is offline
JD VIP
 
Join Date: Jan 2016
Posts: 325
Default asian-sirens.net DOES NOT bring any results at linkcrawler...

asian-sirens.net DOES NOT bring any results at linkcrawler...

(I also tried with new JD install using MULTIOS JAR)


example link
**External links are only visible to Support Staff****External links are only visible to Support Staff**
  #2  
Old 01.12.2021, 19:37
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,922
Default

Hi,
1. This website is not supported by JDownloader so you will have to use the DEEP-parser (= clipboard-observation will not work!) or use custom LinkCrawler rules.
Have you tried that already and did not get any results?

2. This website is using Cloudflare so if you're unlucky it fails for you because of that.
Our deep-parser is working just fine here with this website (= finds all of these images) so if you want us to check this, please provide a debug-log:

Please post your log-ID here.

-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager

First Steps & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?
  #3  
Old 01.12.2021, 21:30
verheiratet1952 verheiratet1952 is offline
JD VIP
 
Join Date: Jan 2016
Posts: 325
Default

Quote:
Originally Posted by pspzockerscene View Post
Hi,
1. This website is not supported by JDownloader so you will have to use the DEEP-parser (= clipboard-observation will not work!) or use custom **External links are only visible to Support Staff**....
Have you tried that already and did not get any results?

2. This website is using **External links are only visible to Support Staff**... so if you're unlucky it fails for you because of that.
Our deep-parser is working just fine here with this website (= finds all of these images) so if you want us to check this, please provide a debug-log:

Please post **External links are only visible to Support Staff**... | bitte poste **External links are only visible to Support Staff**....

-psp-

can you help me with pattern maybe?

[ {
"enabled" : true,
"cookies" : [ ],
"updateCookies" : true,
"logging" : false,
"maxDecryptDepth" : 0,
"name" : null,
"pattern" : "**External links are only visible to Support Staff**,
"rule" : "DEEPDECRYPT",
"packageNamePattern" : "<title>(.+)</title>",
"passwordPattern" : null,
"formPattern" : null,
"deepPattern" : null,
"rewriteReplaceWith" : null
} ]
  #4  
Old 01.12.2021, 22:08
verheiratet1952 verheiratet1952 is offline
JD VIP
 
Join Date: Jan 2016
Posts: 325
Default

Quote:
Originally Posted by verheiratet1952 View Post
can you help me with pattern maybe?

...


url slugs look like those for example

/wp/2021/11/satsuki/
/wp/2021/11/davina-stewart-davinastewart_/
/wp/2020/10/arabella_kat/
/wp/2020/10/putrypoyz/
/wp/2018/10/vidadadada/


but I don't know how to write the pattern so that it works with the LinkCrawler!?

I know about

/.+?/
(.+)
\\d{6}

but it brings no results... I think it comes down to using the right PATTERN...
  #5  
Old 01.12.2021, 22:19
verheiratet1952 verheiratet1952 is offline
JD VIP
 
Join Date: Jan 2016
Posts: 325
Default

Quote:
Originally Posted by verheiratet1952 View Post
url slugs look like those for example

...

by the way: I am using an EventScripter script that adds links from a txt file for DEEPDECRYPT crawling, with a sleep interval of 250
  #6  
Old 02.12.2021, 00:49
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,922
Default

Quote:
Originally Posted by verheiratet1952 View Post
can you help me with pattern maybe?
...
I would like to see some more effort from you.
We've helped you with multiple LinkCrawler Rules yet it seems like you haven't even learned the basics of regular expressions.
Example:
https://board.jdownloader.org/showthread.php?t=87038
Please re-read our LinkCrawler Rules article - it even links multiple websites to learn the basics of regular expressions:
https://support.jdownloader.org/Know...kcrawler-rules

Hints:
- your "pattern" is not escaped (e.g. an unescaped dot means "match any character")
- "deepPattern" is not defined but I'm quite sure you want it to grab only images or only specific images e.g.:
Code:
property=\"og:image\" content=\"(https?://www\\.asian-sirens\\.net/uploads/[0-9]{4}/[^\"]+)
(Untested - only an example!)
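
A minimal sketch of the two hints above, checked in Python rather than inside JDownloader (JDownloader uses Java regular expressions, but the constructs used here behave the same way in both). The lookalike string, the HTML snippet and the file name `sample.jpg` are made up for illustration. The regex is written without the extra JSON escaping that a LinkCrawler rule needs.

```python
import re

# Hint 1: an unescaped dot matches any character,
# an escaped dot matches only a literal ".".
unescaped = re.compile(r"asian-sirens.net")
escaped = re.compile(r"asian-sirens\.net")

lookalike = "asian-sirensXnet"  # made-up string, not a real host
unescaped_hit = unescaped.search(lookalike) is not None
escaped_hit = escaped.search(lookalike) is not None

# Hint 2: the example deepPattern as a plain regex (JSON escaping removed).
deep_pattern = re.compile(
    r'property="og:image" content="(https?://www\.asian-sirens\.net/uploads/[0-9]{4}/[^"]+)'
)

# Hypothetical HTML; the real page markup may differ.
sample_html = (
    '<meta property="og:image" '
    'content="https://www.asian-sirens.net/uploads/2021/11/sample.jpg" />'
)
found = deep_pattern.findall(sample_html)
print(unescaped_hit, escaped_hit, found)
```

The capture group (the part in parentheses) is what ends up in `found`, which is why the surrounding `property=...` text is not part of the result.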

Quote:
Originally Posted by verheiratet1952 View Post
but I don't know how to write the pattern so that it works with the LinkCrawler!?

I know about

/.+?/
(.+)
\\d{6}

but it brings no results... I think it comes down to using the right PATTERN...
Again:
Use the websites/webtools we've linked to test your regular expressions e.g. regex101.com.
Play around with it and you'll soon know the meaning of many more RegEx symbols.

Quote:
Originally Posted by verheiratet1952 View Post
by the way: I am using an EventScripter script that adds links from a txt file for DEEPDECRYPT crawling, with a sleep interval of 250
First you should make the rule work, then you can worry about the rest.

Also please stop double/triple posting!
Our forum lets you edit existing posts, which is what you should do when you want to add information and your post is still the last one in the thread.


-psp-
EDIT

Edit example.

Last edited by pspzockerscene; 02.12.2021 at 00:49. Reason: Added "Edit example."
  #7  
Old 02.12.2021, 14:12
verheiratet1952 verheiratet1952 is offline
JD VIP
 
Join Date: Jan 2016
Posts: 325
Default

Quote:
Originally Posted by pspzockerscene View Post

2. This website is using **External links are only visible to Support Staff**... so if you're unlucky it fails for you because of that.
Our deep-parser is working just fine here with this website (= finds all of these images) so if you want us to check this...
How can I first make sure that the deep-parser works even if Cloudflare is active?
Could ISP issues or country restrictions on my end be the cause because of Cloudflare?




thank you for your untested example...
property="og:image" content="(https?://www\\.asian-sirens\\.net/uploads/[0-9]{4}/[^"]+)

but I have no idea where to add 'property' - cannot find any help at google searching for example "linkcrawler property"
  #8  
Old 02.12.2021, 17:18
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,922
Default

Your answer shows once again that you either haven't read any of the help articles I linked or completely failed to understand them.
If you do not understand parts of our article, please let us know so we can improve it instead of having to describe something again and again in our forums.
I'll try to explain it in more detail:

So back to the beginning:
You have a website which JD does not have a plugin for and you want JD to do two things:
1. Recognize a specific URL structure.
2. Find something on that website (in this case, only those pictures).

This means first we need to make a regular expression matching the URLs our rule is supposed to work on.
...because we usually do not want that rule to just process all URLs from that website but only those with a specific structure (our pattern).
In this case it is roughly:
asian-sirens.net/wp/1234/12/bla-bla/
--> So we could use the following as our pattern value:
Code:
**External links are only visible to Support Staff**
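
The pattern value is hidden in this copy of the thread, so the following is only a hypothetical sketch of what a regular expression for the structure `asian-sirens.net/wp/1234/12/bla-bla/` could look like. It is not the pattern from the post. `CANDIDATE` and the `https://www.` prefix are assumptions; the slugs are of the kind posted earlier in the thread.

```python
import re

# Hypothetical pattern: optional "www.", then /wp/<4-digit year>/<2-digit month>/<slug>/
CANDIDATE = re.compile(r"https?://(?:www\.)?asian-sirens\.net/wp/\d{4}/\d{2}/[^/]+/?")

slugs = [
    "/wp/2021/11/satsuki/",
    "/wp/2021/11/davina-stewart-davinastewart_/",
    "/wp/2020/10/arabella_kat/",
    "/wp/2020/10/putrypoyz/",
    "/wp/2018/10/vidadadada/",
]

# Assumed host prefix in front of each slug.
urls = ["https://www.asian-sirens.net" + slug for slug in slugs]
matches = [CANDIDATE.fullmatch(url) is not None for url in urls]

# A URL with a different structure should not match.
other = CANDIDATE.fullmatch("https://www.asian-sirens.net/about/") is not None
print(matches, other)
```

Inside a LinkCrawler rule the same expression would need JSON escaping, i.e. every backslash doubled (`\\d{4}`, `\\.`).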
Quote:
Originally Posted by verheiratet1952 View Post
How can I first make sure that the deep-parser works even if Cloudflare is active?
You could do this by making a really basic rule that simply crawls everything (basically the one from your first attempt, but with a fixed regular expression).
Example:
Code:
[
  {
    "enabled": true,
    "logging": false,
    "maxDecryptDepth": 1,
    "name": "example rule crawl EVERYTHING from specific URLs from website asian-sirens.net",
    "pattern": "**External links are only visible to Support Staff**,
    "rule": "DEEPDECRYPT",
    "packageNamePattern": null,
    "deepPattern": null
  }
]
Example as plaintext for easier copy & paste:
pastebin.com/raw/Ppw6CSpf

Now to test it, add the URL you provided in your original post and if JD finds some/all of these pictures you can be sure that:
- The pattern is working
- There are no issues with Cloudflare

Now that we're at it:
You could also use tools like regex101.com to verify beforehand that the pattern we created for our rule matches the URLs we want it to work for.

...so let's continue.

Quote:
Originally Posted by verheiratet1952 View Post
but I have no idea where to add 'property' - cannot find any help at google searching for example "linkcrawler property"
Do not google it.
Use examples found in our knowledge base and also ones you can find in our forum.
It's all explained in our help article but I'll add a more detailed explanation here:

Now you have a working rule, but it crawls everything instead of only the pictures you want, so let's add a "deepPattern" to the rule we've created before to tell it what to look for inside the website's HTML code.
That is what we need deepPattern for:
Code:
[
  {
    "enabled": true,
    "logging": false,
    "maxDecryptDepth": 1,
    "name": "example rule crawl PICTURES from specific URLs from website asian-sirens.net",
    "pattern": "**External links are only visible to Support Staff**,
    "rule": "DEEPDECRYPT",
    "packageNamePattern": null,
    "deepPattern": "\"(https?://www\\.asian-sirens\\.net/uploads/[0-9]{4}/[^\"]+)"
  }
]
Example as plaintext for easier copy & paste:
pastebin.com/raw/0RPb35i0
(As you can see, I've modified the "deepPattern" once again - it is not the same version that I've shown you in my previous reply.)
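
To see what the escaped "deepPattern" value actually contains, here is a small sketch that decodes just that one field with Python's `json` module and applies the resulting regex to some made-up HTML. The sample markup and file names are assumptions, and JDownloader itself evaluates the pattern with Java's regex engine, so this only approximates what the deep-parser does.

```python
import json
import re

# Only the deepPattern field of the rule above, as raw JSON text.
rule_fragment = r'''{"deepPattern": "\"(https?://www\\.asian-sirens\\.net/uploads/[0-9]{4}/[^\"]+)"}'''

# json.loads removes the JSON escaping: \" becomes " and \\ becomes \
decoded = json.loads(rule_fragment)["deepPattern"]
print(decoded)

# Hypothetical HTML; file names are made up.
sample_html = (
    '<a href="https://www.asian-sirens.net/uploads/2021/11/a.jpg">'
    '<img src="https://www.asian-sirens.net/uploads/2021/11/a-150x150.jpg"></a>'
    '<a href="https://www.asian-sirens.net/wp/2021/11/satsuki/">post</a>'
)
images = re.findall(decoded, sample_html)
print(images)
```

The leading `"` in the pattern anchors the match to the start of a quoted attribute value, and `[^"]+` stops at the closing quote, so the link to the post itself is not picked up.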

Now test this one again and you should get only pictures.
If you still get too many, you might have to modify your deepPattern to try to only get the items you want.
For some websites this might be impossible but in this case it should be possible.
Again, you could simply use web tools like regex101.com to test your deepPattern against the HTML of your target website, so you can see beforehand which results JD would find, for example:
Spoiler:
[screenshots not included]

...by the way this one works exactly like the other one I've created for you here:
https://board.jdownloader.org/showpo...02&postcount=6
Now if you e.g. want to have nicer package names, you could make use of packageNamePattern with another regular expression but that's up to you.
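
As a rough illustration of a packageNamePattern, the sketch below applies the `<title>(.+)</title>` expression used earlier in this thread to a made-up page source. That JDownloader takes the first capture group as the package name is an assumption here, and the title text is invented.

```python
import re

package_name_pattern = re.compile(r"<title>(.+)</title>")

# Hypothetical page source.
sample_html = (
    "<html><head><title>Satsuki | Example Gallery</title></head>"
    "<body></body></html>"
)

match = package_name_pattern.search(sample_html)
package_name = match.group(1) if match else None
print(package_name)
```

If a page contained more than one `</title>` on the same line, the greedy `.+` would run to the last one; `(.+?)` would stop at the first.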

-psp-

Last edited by pspzockerscene; 02.12.2021 at 17:27. Reason: Added more screenshots
  #9  
Old 02.12.2021, 18:37
verheiratet1952 verheiratet1952 is offline
JD VIP
 
Join Date: Jan 2016
Posts: 325
Default

Quote:
Originally Posted by pspzockerscene View Post
...

I installed a fresh JD again using the MULTIOS file
and added your LinkCrawler rule,
please see **External links are only visible to Support Staff****External links are only visible to Support Staff**
The only thing I added myself is "packageNamePattern" : "<title>(.+)</title>",


I also added eventscripter to crawl a txt file with links from website...

I get NO results using txt-file eventscripter crawling, and I get NO results using 'continue' button using START DEEP link Analyse...

I am sorry :(


by the way:
pattern and deepPattern treat 'https'/'www' differently (the '?')
I think that was your intention !?
  #10  
Old 02.12.2021, 19:07
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,922
Default

Quote:
Originally Posted by verheiratet1952 View Post
I also added eventscripter to crawl a txt file with links from website...
Again:
Please first add the rule and manually add links (without any EventScripter script) and check if it works that way!
Your post makes me think that you've tried it via EventScripter straight away without any testing...
Which URLs did you use for testing?
I only tested it using the URL you provided in your first post.

Quote:
Originally Posted by verheiratet1952 View Post
pattern and deepPattern treat 'https'/'www' differently (the '?')
I think that was your intention !?
No, it wasn't my intention, but it doesn't matter as all of this website's URLs contain 'www.'.
Again, the fact that you're even asking tells me that you didn't invest even 1 minute to think about it/learn how regular expressions work...
Here is a version which makes the 'www.' optional in both regular expressions:
Code:
[
  {
    "enabled": true,
    "logging": false,
    "maxDecryptDepth": 1,
    "name": "example rule crawl PICTURES from specific URLs from website asian-sirens.net",
    "pattern": "**External links are only visible to Support Staff**,
    "rule": "DEEPDECRYPT",
    "packageNamePattern": null,
    "deepPattern": "\"(https?://(?:www\\.)?asian-sirens\\.net/uploads/[0-9]{4}/[^\"]+)"
  }
]
Rule as plaintext for easier copy & paste:
pastebin.com/raw/74QsXN4w
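
A short sketch of what the `(?:www\.)?` group in this version changes, checked in Python with the JSON escaping removed. The two image links are made up for illustration.

```python
import re

# The deepPattern from the rule above, with the JSON escaping removed.
deep_pattern = re.compile(
    r'"(https?://(?:www\.)?asian-sirens\.net/uploads/[0-9]{4}/[^"]+)'
)

# Hypothetical image links; file names are made up.
with_www = '<img src="https://www.asian-sirens.net/uploads/2021/11/a.jpg">'
without_www = '<img src="http://asian-sirens.net/uploads/2021/11/b.jpg">'

hits_with = deep_pattern.findall(with_www)
hits_without = deep_pattern.findall(without_www)
print(hits_with)
print(hits_without)
```

`(?:...)` is a non-capturing group, so it does not change which part ends up in the result, and the trailing `?` makes the whole `www.` optional, just as the `?` after `https` makes the `s` optional.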

Quote:
Originally Posted by verheiratet1952 View Post
I get NO results using txt-file eventscripter crawling, and I get NO results using 'continue' button using START DEEP link Analyse...
Did you also test the first version of the LinkCrawler rule that should grab everything?
Even without any rule you get no results when you let JD deep-analyze the URL you provided in your first post??
Then maybe something is blocking JD or that "asian-sirens" website or Cloudflare (I don't think it's Cloudflare in this case as I don't see any Cloudflare cookies here).

To narrow this down, please do the following:
1. Change the "logging" parameter in that rule from false to true.
2. Enable debug logs (see link down below).
3. Please post your log-ID here.

-psp-
EDIT

If none of this is working, we can look at this together via a Teamviewer session.

Last edited by pspzockerscene; 02.12.2021 at 19:28. Reason: Fixed some typos
All times are GMT +2.