#1
URL Extractor issue, question or feature
Is there a tool similar to URL Extractor, or could this become a feature in JD2?
For background, please read "What is URL Extractor?" on its main page, e.g.
Code:
**External links are only visible to Support Staff** "Anchor Text"
The Cyrillic alphabet is not supported here (the characters are not recognized). I want to add links like this to JD2 later for downloading. The standard "Deep Search or Crawl Search" does not extract the links in this case.
#2
With respect to the topic, this is not an issue: JD works the way we designed it to work. It already returns links to supported content via deep analyse. For example, it only follows links when they are supported, and it only adds items to the LinkGrabber when the content is supported (jpg, png, etc.). If you want to extract URLs and content beyond that, I would recommend using your web browser with a browser extension (plenty exist). Those will also return content derived from JavaScript. Alternatively, use the advanced setting LinkCrawler.linkcrawlerrules to add support for additional content.
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170
Don't fight the system, use it to your advantage. :]
Last edited by raztoki; 04.11.2019 at 23:22.
#3
I used this extension:
https://chrome.google.com/webstore/d...gofnhkkchiekoo
It saves everything it extracts, including titles, to a CSV file. The only problem is regex: patterns that work in other tools do not work here. I have no idea which regex flavor this engine supports, or how to modify my patterns so that they extract only specific links.
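One possible workaround, sketched here in Python rather than inside the extension, is to export everything to CSV first and apply the regex afterwards with an engine whose syntax is documented. This is only a minimal sketch: the helper name `filter_rows` and the column name `url` are assumptions for illustration, and the real CSV produced by the extension may use different headers.

```python
import csv
import io
import re


def filter_rows(csv_text, pattern, column="url"):
    """Return the CSV rows whose `column` value matches the regex `pattern`.

    `column` is an assumed header name; adjust it to the exported file.
    """
    rx = re.compile(pattern)
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep only rows where the regex is found somewhere in the URL column.
    return [row for row in reader if rx.search(row.get(column) or "")]
```

Called with a pattern such as `r"\.zip$"`, this keeps only rows whose URL ends in `.zip` and leaves the title column attached to each kept row.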
#4
@raztoki
The problem is that I want to extract both the link text and the address. That part works. But nowhere on the internet can I find a "Multi-Link Extract" tool; every tool works on a single link only. How can you enter, for example, 20 links? LinkCrawler.linkcrawlerrules cannot save additional data (EXTRA DATA) such as "TextLink".
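To make the request concrete, here is a rough Python sketch of what "link text + address for many pages" could look like outside JD2. It uses only the standard library; `AnchorCollector`, `extract_links` and `links_to_csv` are hypothetical names, and downloading the 20 pages is left out, so the HTML is assumed to be available as strings already.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect (anchor text, absolute URL) pairs from one HTML document."""

    def __init__(self, base_url=""):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the anchor text.
            text = " ".join("".join(self._text).split())
            self.links.append((text, self._href))
            self._href = None


def extract_links(html, base_url=""):
    parser = AnchorCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def links_to_csv(pages):
    """pages: iterable of (base_url, html) pairs; returns CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["text", "url"])
    for base_url, html in pages:
        writer.writerows(extract_links(html, base_url))
    return out.getvalue()
```

Because Python strings are Unicode, Cyrillic anchor text passes through unchanged as long as each page was decoded with its correct encoding before being handed to `extract_links`.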
#5
What do you mean by *additional data*?
__________________
JD-Dev & Server-Admin
#6
I tried the tool. It's quick, but look at the problem with Cyrillic :( I don't know how to solve it.
**External links are only visible to Support Staff**
https://postimg.cc/K4t7hHV5
#7
You can't just open the file as UTF-8 or Cyrillic (Windows-1251) and then save it as Windows-1251 or UTF-8. In that case the text file completely loses the correct encoding.
https://i.postimg.cc/LXy1scNt/Screen...t-11-23-AM.jpg
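For reference, a conversion only keeps Cyrillic text intact when the bytes are first decoded with the encoding the file was actually written in; decoding with the wrong one is what produces garbled characters. A minimal Python sketch, assuming the file really is Windows-1251 (`cp1251`), which this thread does not confirm; `reencode` and `convert_file` are hypothetical helper names.

```python
def reencode(data: bytes, source="cp1251", target="utf-8") -> bytes:
    """Decode bytes using the source encoding, then encode them as target."""
    return data.decode(source).encode(target)


def convert_file(src_path, dst_path, source="cp1251", target="utf-8"):
    """Rewrite a text file from one encoding to another.

    newline="" keeps the original line endings untouched.
    """
    with open(src_path, "r", encoding=source, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=target, newline="") as dst:
        dst.write(text)
```

If the source encoding is guessed wrong, `decode` either raises `UnicodeDecodeError` or silently yields the wrong characters, so it is worth checking a few lines of the result before overwriting anything.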
#8
@Jiaz - This is not intended for downloading entire sites, only for extracting URLs together with their title and description.