#1
LinkGrabber stripping URL components?
I'm heavily using userScripts in my browser to handle titles for all kinds of downloads, and at some point, I wanted to integrate that into JDownloader and wrote a script for that.
Originally, I wrote it for single URLs that I would copy from my browser's address bar. My userScripts would get the appropriate title for the download and, depending on how the website was set up, append it to the URL, usually either as a query string ("?title=Good%20Title") or as a hash ("#Good%20Title"). When I copied the URL, JDownloader would grab it, and if I wanted, I could run the script from the context menu after right-clicking the desired link in the list. The script would look at the link (myCrawledLink.getUrl()), extract the title, and rename the file and the package accordingly.
Today, I wanted to expand on this, but I noticed that now, after some update, the URLs are being stripped of their titles. What used to be "һttps://aparatꓸcam/35sfec251822/Fargo.S04E03.1080p.WEB.H264-CAKES.mkv.mp4.html#Fargo%20Season%204%20Episode%203%20-%20Raddoppiarlo"
Spoiler:
⬛=Superfluous, I guess, added by the site to make it more readable/for SEO
⬛=Superfluous, added by me to assign the title I acquired before
The components that aren't strictly necessary to identify the link are gone. For URLs that I copied directly (from the address bar, for example), the original URL is still visible in myCrawledLink.getContentURL(), but no such luck for cases where the "extra info" was in the href property of <a> elements that were selected in the site text flow. Is there a way to change that?
Last edited by svArtist; 15.10.2020 at 03:16. Reason: Link detection broke example highlighting
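For anyone who wants to try the setup described in this post, here is a minimal sketch of the append-and-extract round trip. The helper names appendTitle and extractTitle, the "title" parameter name, and the example.com URL are assumptions for illustration, not svArtist's actual script; in the EventScripter, the URL would come from myCrawledLink.getUrl() as described above.

```javascript
// Hypothetical helpers: names and the "title" parameter are assumptions.

// Append a title to a URL, either as the hash or as a query parameter.
function appendTitle(url, title, mode) {
  const encoded = encodeURIComponent(title);
  const hashIndex = url.indexOf("#");
  const base = hashIndex === -1 ? url : url.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? "" : url.slice(hashIndex);
  if (mode === "hash") {
    // Replace any existing fragment with the encoded title.
    return base + "#" + encoded;
  }
  // Otherwise add a query parameter and keep an existing fragment.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "title=" + encoded + fragment;
}

// Recover the title: prefer a "title" query parameter, else use the hash.
function extractTitle(url) {
  const query = url.match(/[?&]title=([^&#]*)/);
  if (query) return decodeURIComponent(query[1]);
  const hashIndex = url.indexOf("#");
  if (hashIndex !== -1 && hashIndex < url.length - 1) {
    return decodeURIComponent(url.slice(hashIndex + 1));
  }
  return null; // nothing was appended
}

const tagged = appendTitle("https://example.com/abc123/file.html", "Good Title", "hash");
console.log(tagged, "->", extractTitle(tagged));
```

This only works as long as the full URL survives; once the hash or query string is stripped, extractTitle has nothing left to read, which is the problem described above.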
#2
Our plugins look for specific parts of a URL, which is usually the minimum required to make the link work, typically domain + uid. The plugins then determine filename and filesize through online checks.
The easiest way to make this work would be a package customiser (Packagizer) rule that searches for your #whatever, since most plugins do not use the hash (though mega and a few others do), and provides the custom filename you desire. For traceability, you could reference a unique id or set a comment that you can refer to later.
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :]
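To illustrate why the hash disappears, here is a rough sketch of the "domain + uid" idea. This is NOT JDownloader's actual plugin code: the pattern, the example.com host, and the 12-character uid format are all made up for illustration.

```javascript
// Hypothetical plugin-style pattern for an imaginary host: it only cares
// about the domain and a 12-character uid, nothing after that.
const HOST_PATTERN = /^https?:\/\/(?:www\.)?(example\.com)\/([a-z0-9]{12})\b/i;

// Rebuild the URL from the two captured parts; anything else is dropped.
function normalize(url) {
  const m = url.match(HOST_PATTERN);
  return m ? "https://" + m[1].toLowerCase() + "/" + m[2] : null;
}

console.log(normalize("https://example.com/35sfec251822/Some.File.mkv.html#Good%20Title"));
```

Anything that the pattern does not capture, including a title in the hash, never makes it into the rebuilt URL, so it has to be picked up earlier, for example by a Packagizer rule on the source URL.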
#3
@raztoki: Thanks for the explanation. I would also suggest a packagizer rule and the comment field.
@svArtist: There is no way to disable the plugin's internal URL handling; that part removes unnecessary components and rebuilds URLs with the important information only. How exactly do you add those URLs? You should use the packagizer to either parse the hash fields directly and set values, or copy them to the comment field for later use.
__________________
JD-Dev & Server-Admin
#4
I've never used custom packagizer rules before. On one hand, I think it'd be easier to have my userScripts just hand me a text-only list of the modified URLs that I give to the LinkGrabber's parser directly, so that I can simply refer to the content URL, which seems to be set to the full original URL in cases where no context URL was found. On the other hand, I want to get to know the packagizer stuff, so I'll look into that. So the packagizer gets the links before they're handled by the individual plugins?
@Jiaz: I use GreaseMonkey for a lot of things. There's one site, for example, that lists the useful titles for things like episodes on its site-internal links, but the links to the actual contents point to external sites. So I get the titles from the first listing and track them through site navigation, using URL components, until I get to the external links. I add the titles to these links (depending on the host, I'll usually choose a hash or a query string). Usually, a different userscript for the target site then sets the document title, so that I can download already correctly titled videos with a video downloader extension.
Or, for other sites that don't have the actual titles but load the contents dynamically in a page, I implemented a textarea into which I can dump a pre-formatted (JSON) list of all the episode numbers and titles, which I can generate using another userScript on IMDB; this then changes the document title upon episode change.
I just started using JDownloader for some cases where the titles can be appended to several links at the same time, thinking I could just copy them from the modified document. Turns out to be not quite as simple.
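As a rough sketch of the JSON-list idea from this post: the JSON shape (an array of { episode, title } objects), the function names, and the sample titles below are all assumptions, since the actual format isn't shown in the thread.

```javascript
// Hypothetical pasted list; the real format used in the thread is unknown.
const pasted = '[{"episode":1,"title":"Pilot"},{"episode":2,"title":"Second One"}]';

// Turn the pasted JSON into an episode-number -> title lookup.
function buildTitleLookup(json) {
  const lookup = new Map();
  for (const entry of JSON.parse(json)) {
    lookup.set(entry.episode, entry.title);
  }
  return lookup;
}

// Build a title string such as "Show S01E02 - Second One".
function titleFor(lookup, show, season, episode) {
  const pad = (n) => String(n).padStart(2, "0");
  const prefix = `${show} S${pad(season)}E${pad(episode)}`;
  const title = lookup.get(episode);
  return title ? `${prefix} - ${title}` : prefix;
}

const lookup = buildTitleLookup(pasted);
console.log(titleFor(lookup, "Show", 1, 2));
```

In a userscript, the result would be assigned to document.title whenever the page switches to another episode.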
#5
Nice!
Turns out, I can use packagizer to do my job for me for most cases anyway!
If anyone is interested in what I did:
Sourceurl(s) -> contains -> "(#)|(customTitle=)(.+)" -> as RegEx
Link origin -> is -> Clipboard
...
Filename -> "<jd:source:3>.<jd:orgfiletype>"
Comment -> "customTitle=<jd:source:3>"
It even seems to decode the URL Component for you!
This way, if I need to do some custom stuff, I can use my EventScripter script and refer to the title.
Now I just need to come up with a way to exclude some patterns or other ways of filtering cases like mega links which use the hash symbol...
Negative look-aheads in RegEx are so ugly, I'm told... But hey, this seems to work:
Code:
^(?:(?:https?:\/\/(?!mega\.(?:co\.)?nz)[^#]*#)|(?:.*customTitle=))([^\s&\/#]+).*$
Last edited by svArtist; 17.10.2020 at 05:08.
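A quick way to check the look-ahead pattern outside JDownloader is to run it against a few sample URLs in plain JavaScript. The sample URLs and the function name capturedTitle are made up, and this only shows how a standard JavaScript regex engine treats the pattern, which may differ from how the packagizer evaluates it.

```javascript
// The pattern from the post, written as a JavaScript regex literal.
const TITLE_PATTERN =
  /^(?:(?:https?:\/\/(?!mega\.(?:co\.)?nz)[^#]*#)|(?:.*customTitle=))([^\s&\/#]+).*$/;

// Return the decoded first capture group, or null when nothing matches.
function capturedTitle(url) {
  const match = url.match(TITLE_PATTERN);
  return match ? decodeURIComponent(match[1]) : null;
}

// Made-up sample URLs; the mega key is a placeholder.
const samples = [
  "https://example.com/abc123/file.html#Fargo%20Season%204",
  "https://example.com/watch?id=abc123&customTitle=Good%20Title&x=1",
  "https://mega.nz/file/AbCdEf#placeholderKey",
];
for (const url of samples) {
  console.log(url, "->", capturedTitle(url));
}

// The earlier rule, for comparison: in plain JavaScript, group 3 is only
// filled by the customTitle= branch, not by the bare "#" branch.
const FIRST_RULE = /(#)|(customTitle=)(.+)/;
console.log("https://example.com/a#Title".match(FIRST_RULE)[3]);
```

Note that the refined pattern has a single capturing group, so in a rule based on it the title would be group 1 rather than group 3.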
#6
I think you're making your package customiser rule more complicated than it needs to be. If the URL doesn't contain customtitle or uniquereference, the rule won't conflict with JD setting its own filename. As long as your script sets it every time and it's at the end of the URL structure, there should be no problem.
#uniquereference=(your%20name%20here)
#uniquereference=(.+)
if contains ...
Sourceurl(s) -> contains -> "#uniquereference=(.+)" -> as RegEx
and then
Filename -> "<jd:source:1>.<jd:orgfiletype>"
Comment -> "<jd:source:1>"
Do you use mega?
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :]
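The simpler rule can be sketched in plain JavaScript as follows. The function name customName and the sample URLs are hypothetical, and the explicit decodeURIComponent call stands in for the decoding that the packagizer reportedly does on its own.

```javascript
// A dedicated marker means unrelated hashes (e.g. mega keys) never match.
const RULE = /#uniquereference=(.+)/;

// Mimic Filename -> "<jd:source:1>.<jd:orgfiletype>" for a source URL.
function customName(sourceUrl, originalExtension) {
  const m = sourceUrl.match(RULE);
  if (!m) return null; // no marker: leave the default filename alone
  return decodeURIComponent(m[1]) + "." + originalExtension;
}

console.log(customName("https://example.com/abc#uniquereference=Good%20Title", "mp4"));
console.log(customName("https://mega.nz/file/AbCdEf#placeholderKey", "mp4"));
```

Because the marker is specific, no negative look-ahead is needed to keep mega-style hashes out.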