[Solved] Packagizer regex driving me crazy (put part of URL into filename)
#1
OK, I admit I can't figure out regex. What I'm trying to do is download a large number of files that unfortunately the site decided to name ALL the same. The only difference is the original filenames are buried in the URL under several layers of folder names.
Example URLs:
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**

Note that the only difference is in the second subdirectory, which is what I want the files to be renamed to. The issue is that I can't figure out the regex I need to scrape the filename out of the URL and make that the new filename (for example, I want the names to be image00001-16-5-18.jpg, image00002-16-5-18.jpg, etc.).

Every example I find on the board says to set the filename in the Packagizer rules to <jd:hoster:1>.<jd:orgfiletype>. That's nice, except all that does is rename all the files from nudecollect.com.jpg to _jd_hoster_1_.jpg. Not helpful.

I've tried all sorts of wrong regex in the downloadurl field:
www\.nudeco11ect\.com/*/.+
.*udeco11ect\.com/*/(?:v|d)/*/(.+)
www\.nudeco11ect\.com/(\d+)/.*

Basically, what I've been trying to do is copy/paste example regex, modifying the top-level domain, hoping I'll hit on one that does... something. Even if it's wrong, if it made the filenames unique that would be a start, but it just makes them all exactly the same.

The issue is that I have no idea what the regex actually DOES, or why some examples have forward slashes and some have backslashes, or which wildcards are actually being used to grab data vs. just "match anything". Most of the examples I find on the board have the original URLs hidden, so I can't see what they are actually doing to match and replace. I don't know what the v or the d in the example regex I found does. I think I know * means match all, and ? means match character. Other than that I'm lost.

I figured out how to get it to actually match the URL; that was easy. The hard part is matching the exact subfolder I want to use as the source for the new filename. What I want it to do is basically: match any URL starting with **External links are only visible to Support Staff**, ignore the first subdirectory, take the SECOND subdirectory text, ignore the 3rd and 4th subdirectories, then rename the file using the second subdirectory text and append the original filetype to the end. And if I REALLY wanted to get tricky, use the 3rd subdirectory as the download subfolder name (which I can do manually since it's one copy/paste, whereas the gallery contains hundreds of files).

Thank you in advance if anyone can actually figure out what I'm asking here.
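The matching described above (skip the first subdirectory, capture the second, optionally capture the third) can be tried outside JDownloader. The following is only a sketch in Python, not the rule from this thread: the host example.com, the path layout of four subdirectories followed by a filename, and the helper name rename_from_url are all assumptions made for illustration, since the real URLs are hidden.

```python
import re

# Assumed (hypothetical) layout:
#   https://www.example.com/<dir1>/<dir2>/<dir3>/<dir4>/<file>.<ext>
PATTERN = re.compile(
    r"https?://(?:www\.)?example\.com"  # literal host; "\." means a real dot
    r"/[^/]+"          # 1st subdirectory: matched but not captured
    r"/([^/]+)"        # 2nd subdirectory: capture group 1 (new filename)
    r"/([^/]+)"        # 3rd subdirectory: capture group 2 (subfolder name)
    r"/[^/]+"          # 4th subdirectory: matched but not captured
    r"/[^/]+\.(\w+)$"  # original filename; group 3 is its extension
)


def rename_from_url(url):
    """Return (subfolder, new_filename), or None if the URL does not fit."""
    match = PATTERN.match(url)
    if match is None:
        return None
    second_dir, third_dir, extension = match.groups()
    return third_dir, f"{second_dir}.{extension}"


if __name__ == "__main__":
    sample = "https://www.example.com/gallery/image00001-16-5-18/set-a/full/photo.jpg"
    print(rename_from_url(sample))
```

A few points this illustrates: parentheses are what "grab" text, and groups are numbered by the order of their opening parenthesis; [^/]+ means "one or more characters that are not a slash", i.e. exactly one path segment; a backslash escapes a special character (\. is a literal dot), while a forward slash is just the URL separator. In regex, * is not a standalone wildcard but means "zero or more of the preceding item", so /*/ matches a run of slashes rather than "any folder". How a captured group is then referenced in a Packagizer filename field depends on which rule condition holds the pattern, so check the Packagizer documentation for that part.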
#2
Hi,
while we won't teach individual users how regular expressions work, we do have a nice Packagizer overview, which also includes links for learning RegEx and some examples. In this case I'll just provide a working rule for you; maybe it helps you build more custom rules.

Rule as screenshot: (screenshot not preserved)

Now, if you wanted, you could even put all images of that collection into one package if that information is in your URL: "nudeco11ect.com/nudecollect-659.../bla" (highlighted part). It would be easy to modify the regular expression to grab that part and set it as the package name.

-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
First Steps & Tutorials || JDownloader 2 Setup Download
Last edited by pspzockerscene; 26.01.2022 at 16:47. Reason: Added missing hyperlink to Packagizer introduction
#3
Thanks! It does what I needed it to do, though I now need to modify it since I found another layer I missed. BUT I can now see what the regex is doing and how it's supposed to be formulated.
#4
Thanks for your feedback.
As linked in our Packagizer instructions, there are nice web tools available for RegEx testing, e.g. regex101.com.

-psp-