JDownloader Community - Appwork GmbH
 

  #1  
Old 26.01.2022, 15:38
dabrown dabrown is offline
Black Hole
 
Join Date: Jun 2015
Location: North America
Posts: 281
Packagizer regex driving me crazy (put part of URL into filename)

OK, I admit I can't figure out regex. What I'm trying to do is download a large number of files that unfortunately the site decided to name ALL the same. The only difference is the original filenames are buried in the URL under several layers of folder names.

Example URLs:

**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**

Note that the only difference is in the second subdirectory... which is what I want the files to be renamed to.

The issue is that I can't figure out what regex I need to scrape the filename out of the URL and make that the new filename (for example, I want the names to be image00001-16-5-18.jpg, image00002-16-5-18.jpg, etc.).


Every example I find on the board says to set the filename in packagizer rules to <jd:hoster:1>.<jd:orgfiletype>. That's nice... except all that does is rename all the files from nudecollect.com.jpg to _jd_hoster_1_.jpg. Not helpful.

I've tried all sorts of wrong regex in the downloadurl field:

www\.nudeco11ect\.com/*/.+
.*udeco11ect\.com/*/(?:v|d)/*/(.+)
www\.nudeco11ect\.com/(\d+)/.*

Basically what I've been trying to do is copy/paste example regex, modifying the top level domain, hoping I'll hit on one that does... something. Even if it's wrong, if it made the filenames unique that would be a start, but it just makes them all exactly the same.

The issue is I have no idea what the regex actually DOES or why some examples have forward slashes and some have backslashes and what wildcards are actually being used to grab data vs just "match anything". Most of the examples I find on the board have the original URLs hidden so I can't see what they are actually doing to match and replace. I don't know what the v or the d in the examples regex I find does. I think I know * means match all, and ? means match character. Other than that I'm lost.
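For readers with the same questions, here is a minimal sketch of what these symbols do. It uses Python's `re` module purely for illustration; the basics shown are shared by most common regex flavours, but that they behave identically inside Packagizer is an assumption. The helper `full_match` and all sample strings are hypothetical.

```python
import re

def full_match(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# "\." is a literal dot; a bare "." matches any single character.
print(full_match(r"example\.com", "example.com"))   # True
print(full_match(r"example\.com", "exampleXcom"))   # False
print(full_match(r"example.com", "exampleXcom"))    # True

# "/" is an ordinary character. "*" repeats the element before it zero or
# more times, so "/*/" means "any number of slashes, then one slash".
# It does not mean "any folder name".
print(full_match(r"a/*/b", "a////b"))    # True
print(full_match(r"a/*/b", "a/dir/b"))   # False

# "[^/]+" is one way to say "one folder name": one or more non-slash characters.
print(full_match(r"a/[^/]+/b", "a/dir/b"))  # True

# "?" makes the element before it optional.
print(full_match(r"https?", "http"))  # True

# "(?:v|d)" matches "v" or "d" without capturing; "(.+)" captures its match.
m = re.fullmatch(r"(?:v|d)/(.+)", "d/name")
print(m.group(1))  # name
```

So the backslashes are escapes for regex metacharacters, the forward slashes are just the slashes of the URL, and only parenthesised groups without `?:` hand text back for reuse.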

I figured out how to get it to actually match the URL; that was easy. The hard part is matching the exact subfolder I want to use as the source for the new filename. What I want it to do is basically "match any URL starting with **External links are only visible to Support Staff****External links are only visible to Support Staff**, ignore the first subdirectory, take the SECOND subdirectory text, ignore the 3rd and 4th subdirectories, then rename the file using the second subdirectory text and append the original filetype to the end." And if I REALLY wanted to get tricky, I'd use the 3rd subdirectory as the download subfolder name (which I can do manually since it's one copy/paste, whereas the gallery contains hundreds of files).
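The matching described above can be sketched as a single pattern with one path segment per subdirectory. This is an illustration only, written in Python: the real URLs are hidden in this thread, so the host `www.example.com`, the folder names, and the function `plan_download` are all hypothetical, and this is not the rule that was posted as a screenshot.

```python
import re
from typing import Optional, Tuple

# Hypothetical URL layout; the real URLs are not visible in the thread.
URL_PATTERN = re.compile(
    r"^https?://www\.example\.com"  # literal host, dots escaped
    r"/[^/]+"                       # 1st subdirectory: matched, not captured
    r"/([^/]+)"                     # 2nd subdirectory: group 1 (new file name)
    r"/([^/]+)"                     # 3rd subdirectory: group 2 (subfolder name)
    r"/[^/]+"                       # 4th subdirectory: matched, not captured
    r"/[^/]+\.(\w+)$"               # original file name: group 3 is its extension
)

def plan_download(url: str) -> Optional[Tuple[str, str]]:
    """Return (subfolder, new_filename), or None if the URL does not fit."""
    match = URL_PATTERN.match(url)
    if match is None:
        return None
    new_name, subfolder, extension = match.groups()
    return subfolder, f"{new_name}.{extension}"

url = "https://www.example.com/galleries/image00001-16-5-18/some-set/full/picture.jpg"
print(plan_download(url))  # ('some-set', 'image00001-16-5-18.jpg')
```

Using `[^/]+` for each segment keeps every part of the pattern tied to exactly one folder level, which is what makes "the second subdirectory" addressable as its own capture group.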

Thank you in advance if anyone can actually figure out what I'm asking here.
  #2  
Old 26.01.2022, 16:46
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,117

Hi,
while we won't teach individual users how regular expressions work, we do have a nice Packagizer overview which also includes links to resources for learning RegEx and some examples.

In this case I'll just provide a working rule for you. Maybe it helps you to build more custom rules.

Rule as screenshot: [screenshot not preserved]


Now if you wanted, you could even put all images of that collection into one package if that information is in your URL: "nudeco11ect.com/nudecollect-659.../bla" (highlighted part).
It would be easy to modify the regular expression to grab that part and set it as packagename.

-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager

Getting Started & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?

Last edited by pspzockerscene; 26.01.2022 at 16:47. Reason: Added missing hyperlink to Packagizer introduction
  #3  
Old 26.01.2022, 17:05
dabrown dabrown is offline
Black Hole
 
Join Date: Jun 2015
Location: North America
Posts: 281

Thanks! It does what I needed it to do. I now need to modify it since I found another layer I missed, BUT I can now see what the regex is doing and how it's supposed to be formulated.
  #4  
Old 27.01.2022, 13:30
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,117

Thanks for your feedback.
As linked in our Packagizer instructions, there are nice web tools available for RegEx testing, e.g. regex101.com.

-psp-