JDownloader Community - Appwork GmbH
 

  #1  
Old 09.06.2021, 12:24
woolf
Wind Gust
 
Join Date: Jun 2021
Posts: 41
Default hitbdsm.com

Hello,
I am trying to make a rule for hitbdsm.com. My regex should work, but JD won't grab the video for some reason. Could you please let me know what is wrong?

Spoiler:

[{
"maxDecryptDepth": 1,
"name": "hitbdsm.com rule",
"pattern": "https?://(?:www\\.)?hitbdsm\\.com/[a-z0-9\\-/]+",
"rule": "DEEPDECRYPT",
"deepPattern": "<\\s*source src\\s*=\\s*'(.*?)'\\s*label"
}

Also tried with second rule:
Spoiler:

{
"pattern": "https?://\\w+\\.\\w+\\.com/\\w+/\\w+\\.mp4",
"rule": "DIRECTHTTP"
}]
  #2  
Old 09.06.2021, 12:44
Jiaz
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,289
Default

You have to take a look at the raw HTML source!
The video links are embedded in a player:
Quote:
<div class="responsive-player"> <iframe src=".....hitbdsm.com/wp-content/plugins/clean-tube-player/....
Your first rule must have a deepPattern for this clean-tube-player URL.
Then the 2nd rule is required, with a deepPattern for
Quote:
<source src=...
__________________
JD-Dev & Server-Admin
  #3  
Old 09.06.2021, 17:13
woolf
Wind Gust
 
Join Date: Jun 2021
Posts: 41
Unhappy

Ok, I give up. I don't know what I did wrong this time. This is what I've got so far:
Spoiler:

[{
"maxDecryptDepth": 1,
"name": "hitbdsm.com rule",
"pattern": "(https?://(?:www\\.)?hitbdsm\\.com/)((?!wp)[\\w-]+/)",
"rule": "DEEPDECRYPT",
"deepPattern": "<\\s*div\\sclass\\s*=\\s*'responsive-player\\s*'>\\s<\\s*iframe\\s*src\\s*=\\s*'(.*?)'"
},
{
"maxDecryptDepth": 1,
"name": "hitbdsm.com rule2",
"pattern": "(https?://(?:www\\.)?hitbdsm\\.com/wp-content/plugins/clean-tube-player.*)",
"rule": "DEEPDECRYPT",
"deepPattern": "<\\s*source src\\s*=\\s*'(.*?)'\\s*label='480p'"
}
]
  #4  
Old 09.06.2021, 17:29
pspzockerscene
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,918
Default

Here are two rules that will do the job:
Code:
[
  {
    "maxDecryptDepth": 1,
    "name": "hitbdsm.com rule",
    "pattern": "https?://(?:www\\.)?hitbdsm\\.com/(?!wp)[\\w-]+/",
    "rule": "DEEPDECRYPT",
    "deepPattern": "(/wp-content/plugins/clean-tube-player/public/[^\"]+)",
    "packageNamePattern": "<title>(.*?)( - HitBDSM)?</title>"
  },
  {
    "maxDecryptDepth": 1,
    "name": "hitbdsm.com rule2",
    "pattern": "https?://(?:www\\.)?hitbdsm\\.com/wp-content/plugins/clean-tube-player.*",
    "rule": "DEEPDECRYPT",
    "deepPattern": "<\\s*source src\\s*=\\s*'(.*?)'\\s*label='\\d+p'"
  }
]
Rule as plaintext for easier copy & paste:
pastebin.com/PxZYPWum
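
For anyone who wants to check the first rule outside of JD, here is a minimal sketch in Python. It is only an illustration: JD uses Java regular expressions, and Python's `re` module is used here as a stand-in on the assumption that these particular patterns behave the same in both. The page URL, the HTML snippet and the `q=abc` value are made up; the snippet assumes the page uses double-quoted attributes, as in the markup quoted in post #2.

```python
import json
import re

# Rule 1 from the post above as JSON text (a raw string keeps the backslashes intact).
RULE_JSON = r'''
{
  "pattern": "https?://(?:www\\.)?hitbdsm\\.com/(?!wp)[\\w-]+/",
  "deepPattern": "(/wp-content/plugins/clean-tube-player/public/[^\"]+)",
  "packageNamePattern": "<title>(.*?)( - HitBDSM)?</title>"
}
'''
rule = json.loads(RULE_JSON)

# Hypothetical page URL and HTML; the real markup and the q value may differ.
PAGE_URL = "https://hitbdsm.com/some-random-video/"
SAMPLE_HTML = (
    "<title>Some Random Video - HitBDSM</title>"
    '<div class="responsive-player"> '
    '<iframe src="https://hitbdsm.com/wp-content/plugins/'
    'clean-tube-player/public/player-x.php?q=abc"></iframe></div>'
)

# "pattern" decides whether the rule applies to the added URL at all.
page_match = re.fullmatch(rule["pattern"], PAGE_URL)
# "deepPattern" is applied to the HTML of that page; group 1 is the link to follow.
player_path = re.search(rule["deepPattern"], SAMPLE_HTML).group(1)
# "packageNamePattern" picks the package name out of the page title.
package_name = re.search(rule["packageNamePattern"], SAMPLE_HTML).group(1)

# The deepPattern from post #3 expects single-quoted attributes, so it finds nothing here.
OLD_DEEP_PATTERN = r"<\s*div\sclass\s*=\s*'responsive-player\s*'>\s<\s*iframe\s*src\s*=\s*'(.*?)'"
old_attempt = re.search(OLD_DEEP_PATTERN, SAMPLE_HTML)

print(page_match is not None, player_path, package_name, old_attempt)
```

With this sample input, the new deepPattern captures the relative player path, while the earlier attempt returns no match because the sample HTML contains no single quotes.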

Please note the following:
- There are always multiple ways to accomplish your goal with these rules/RegExes - there is no "right" or "wrong" as long as it works
- I've noticed you tried to always grab 480p - I've adjusted the rule to grab all qualities - you can of course revert these changes if you want to
- Test your regular expressions with the free web tool regex101.com
- You can easily format and validate JSON with the free web tool jsoneditoronline.org

Ahh and by the way:
At this moment the filenames are bad but the packagename is nice.
If you want to have nice filenames, use a Packagizer rule to set the packagename as filename.

-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager

Getting Started & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?

Last edited by pspzockerscene; 09.06.2021 at 17:32.
  #5  
Old 09.06.2021, 17:43
woolf
Wind Gust
 
Join Date: Jun 2021
Posts: 41
Default

Thank you for this.
I know there are many ways to get the same result, and I've tried many of them.
I was going to make it grab all qualities, but first I had to make it work at all.
Of course I use regex101.com all the time. It is very helpful, but it has a slightly different syntax, or maybe there is just some switch to make it work the same as JD. I don't know.
I've already worked with the Packagizer, so I hope I can make it work by myself.

I am still not sure why it did not work, but I'll compare it to your code later.
  #6  
Old 09.06.2021, 17:52
pspzockerscene
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,918
Default

On regex101.com, you can set the "Flavor" on the left side.
Choose one that does not require escaping the slash (/), e.g. "Java 8". Then all that is probably left to make the above work in regex101 is to replace the double backslashes (\\) from the JSON with single backslashes (\).
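
The double backslashes exist only because the rule is stored as a JSON string. A small Python sketch makes that visible: decoding the JSON yields the single-backslash regex you would paste into regex101. The HTML line is a made-up example, and Python's `re` is only a stand-in for the Java regex engine that JD uses.

```python
import json
import re

# Rule 2's deepPattern exactly as it is written inside the JSON rule (escaped form).
json_text = r'''
{"deepPattern": "<\\s*source src\\s*=\\s*'(.*?)'\\s*label='\\d+p'"}
'''

# JSON decoding turns every \\ into a single \, which is the actual regex.
decoded = json.loads(json_text)["deepPattern"]
print(decoded)

# Hypothetical player markup with single-quoted attributes, as rule 2 expects.
sample_html = "<source src='https://cdn.example.com/v/clip.mp4' label='480p'>"
links = re.findall(decoded, sample_html)
print(links)
```

The printed pattern is the form to test in regex101; the escaped form only belongs in the JSON.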

-psp-
  #7  
Old 10.06.2021, 11:44
woolf
Wind Gust
 
Join Date: Jun 2021
Posts: 41
Default

My mistake was the wrong escaping of the " character: I used ' instead of \".
But still only first rule is necessary to grab video, second one doesn't do anything because there are .mp4 videos found already. If I set to grab only 480p like I did first time it still grabs everything.
I thought these rules go one after another, so the second one gets the output from first one and so on.
Is there any way to grab only one video (480p) from this "second page"?

btw. my rule is "better" because it gets full URL and yours only
/wp-content/plugins/clean-tube-player/public/* part.
Does it mean I don't have to point to full URL? It doesn't make any sense.

Last edited by woolf; 10.06.2021 at 11:48.
  #8  
Old 10.06.2021, 14:30
pspzockerscene
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 70,918
Default

Quote:
Originally Posted by woolf View Post
But still only first rule is necessary to grab video, second one doesn't do anything because there are .mp4 videos found already.
Wrong!
The plain html code of e.g. "hitbdsm.com/some-random-video/" does not contain any .mp4 URLs.
You need the first rule to get from "hitbdsm.com/some-random-video/" to "hitbdsm.com/wp-content/plugins/clean-tube-player/public/player-x.php?q=...".
Before you ask:
Yes, it would indeed work with the 2nd rule only, but then you'd have to add the first links via the deep crawler (Add Links dialog).
The crawler rule also lets the links get picked up by the clipboard observation, and it is faster because it won't parse all of the HTML.
Without a rule, the deep crawler will also add all image/js/css files of a website, which is basically all the stuff you do not want.

Quote:
Originally Posted by woolf View Post
I thought these rules go one after another, so the second one gets the output from first one and so on.
That's basically correct.

Quote:
Originally Posted by woolf View Post
btw. my rule is "better" because it gets full URL and yours only
Again, it always depends.
My rule grabs the relative URL, which later gets pieced together into a full URL by the higher-level handling.
This has one advantage over your attempt:
If this website ever decides to use relative URLs, my rule will still work and yours will fail... but to be honest, if they make changes they will probably change more, so most likely both rule variants would fail.

Quote:
Originally Posted by woolf View Post
Does it mean I don't have to point to full URL?
Well as long as you know that the target URL you want is on the same website (same domain): Yes, you can also RegEx relative URLs.
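
To illustrate what resolving a relative URL against the page it was found on means, here is a small sketch using Python's standard `urljoin`. This is not JD's code, only the general technique; the page URL and the `q=abc` value are placeholders.

```python
from urllib.parse import urljoin

# Page the relative link was found on (hypothetical example URL).
page_url = "https://hitbdsm.com/some-random-video/"

# What a deepPattern capturing only the path part would return.
relative = "/wp-content/plugins/clean-tube-player/public/player-x.php?q=abc"

# The relative path is resolved against the scheme and host of the page URL.
absolute = urljoin(page_url, relative)
print(absolute)

# An already absolute URL passes through unchanged, so both rule styles end up equal.
same = urljoin(page_url, absolute)
print(same == absolute)
```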

Quote:
Originally Posted by woolf View Post
If I set to grab only 480p like I did first time it still grabs everything.
...
Is there any way to grab only one video (480p) from this "second page"?
Indeed this is a special case:
The "clean-tube-player" URL contains a base64 encoded string in the "q" parameter which gets auto processed by one of our generic plugins which you cannot override via LinkCrawler Rules (yet).
... this means two things:
1. You only need one rule.
2. Differentiating between qualities is not possible based on the URLs which means none of our filter capabilities will help you to filter those.
All you could do is to use an EventScripter script to sort added URLs by size and then only keep the highest.
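
To show why a URL-based filter has nothing to match on here, the following Python sketch builds a player URL with a base64 encoded "q" parameter and decodes it again. The payload is purely hypothetical; what the real "q" value contains is not shown in this thread.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical payload; the real content of "q" is an assumption, not taken from the site.
hypothetical_payload = "https://cdn.example.com/clip-480.mp4"

# Build a player URL whose "q" parameter carries the base64 encoded payload.
q_value = base64.b64encode(hypothetical_payload.encode("utf-8")).decode("ascii")
player_url = (
    "https://hitbdsm.com/wp-content/plugins/clean-tube-player/public/player-x.php?"
    + urlencode({"q": q_value})
)
print(player_url)

# Reverse direction: read "q" from the URL and decode it.
query = parse_qs(urlparse(player_url).query)
decoded = base64.b64decode(query["q"][0]).decode("utf-8")
print(decoded)
```

The readable content only appears after decoding, so a pattern applied to the player URL itself cannot tell the qualities apart.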

Sorry - this is an edge case and it took me some time to recognize it.

I've added a simple crawler plugin for this website that will auto handle this after the next update.
LinkCrawler Rules are not required anymore in this case.

Again, sorry for the confusion.

Are you waiting for an announced bugfix or a new feature?
Updates do not necessarily get released immediately!
Please read our Update FAQ!


-psp-
  #9  
Old 10.06.2021, 15:28
Jiaz
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,289
Default

Quote:
Originally Posted by woolf View Post
I thought these rules go one after another, so the second one gets the output from first one and so on.
All rules are checked/processed again and again until maxDepth is reached or an existing Plugin can further process the found link
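
A toy model of that loop, written in Python, may make the behaviour easier to picture. It is a sketch under assumptions and not JD's implementation: the pages, the two rules, the `has_plugin` check and the single global `max_depth` limit are all made up for illustration, and the real maxDecryptDepth semantics may differ.

```python
import re

# Hypothetical pages standing in for HTTP responses (URL -> HTML).
PAGES = {
    "https://example.com/some-video/": '<iframe src="https://example.com/player/x.php?q=1">',
    "https://example.com/player/x.php?q=1": "<source src='https://cdn.example.com/a.mp4' label='480p'>",
}

# Two toy rules: a URL pattern plus a deepPattern applied to the page HTML.
RULES = [
    {"pattern": r"https://example\.com/(?!player)[\w-]+/", "deepPattern": r'src="([^"]+)"'},
    {"pattern": r"https://example\.com/player/.*", "deepPattern": r"src='(.*?)'"},
]


def has_plugin(url):
    # Stand-in for "an existing plugin can further process the found link".
    return url.endswith(".mp4")


def crawl(url, max_depth, depth=0):
    if has_plugin(url):
        return [url]  # a plugin takes over, the rules are done with this link
    if depth >= max_depth:
        return []  # depth limit reached, stop following links
    results = []
    for rule in RULES:  # every rule is checked again for every newly found link
        if re.fullmatch(rule["pattern"], url):
            html = PAGES.get(url, "")
            for found in re.findall(rule["deepPattern"], html):
                results.extend(crawl(found, max_depth, depth + 1))
    return results


final_links = crawl("https://example.com/some-video/", max_depth=2)
too_shallow = crawl("https://example.com/some-video/", max_depth=1)
print(final_links, too_shallow)
```

In this model the output of the first rule is fed back into the full rule list, which is why the second rule picks it up without any explicit ordering.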
All times are GMT +2.