JDownloader Community - Appwork GmbH
 

  #1  
Old 02.06.2019, 12:03
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default I am asking for better improvements in link recognition

I am asking for better improvements in link recognition.

See: **External links are only visible to Support Staff**
  #2  
Old 03.06.2019, 09:10
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

How about posting the actual example link, or do you expect me to type the link from that screenshot?
If the link contains a supported/known file extension, it should be picked up by the generic HTTP plugin. Otherwise it should be handled by deep decryption, provided it leads to downloadable content.
__________________
JD-Dev & Server-Admin
  #3  
Old 03.06.2019, 13:02
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

I mean now, because I found a way to extract links from a binary file. But I would like JD to recognize links without percentages.
  #4  
Old 04.06.2019, 02:28
raztoki raztoki is offline
English Supporter
 
Join Date: Apr 2010
Location: Australia
Posts: 16,113
Default

By "percentages" I assume, from his previous queries, that he is referring to urlencoding.
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]
  #5  
Old 04.06.2019, 09:32
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

@djmakinera: as I already explained, urlencoding is normal for URLs and is already supported in the given situations.
__________________
JD-Dev & Server-Admin
  #6  
Old 04.06.2019, 12:00
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

This means that the wrong link will be analyzed, e.g. with a space. This is not normal.
  #7  
Old 04.06.2019, 12:02
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

How about some example links? Spaces in URLs must be urlencoded!
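
A minimal sketch of what "urlencoded" means for a space, using Python's standard `urllib.parse.quote`. The sample URL and the variable names are illustrative assumptions; this is not code that JDownloader runs.

```python
from urllib.parse import quote

# Illustrative URL with a literal space in the path (assumption, not a real link).
raw_url = "https://soundcloud.com/dj-itronix 1"

# Keep ':' and '/' unescaped so that only the space gets percent-encoded.
encoded_url = quote(raw_url, safe=":/")

print(encoded_url)  # https://soundcloud.com/dj-itronix%201
```

The space becomes `%20`, which is the form a URL parser can treat unambiguously as part of the link.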
__________________
JD-Dev & Server-Admin
  #8  
Old 10.06.2019, 02:08
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

It's just that the editor correctly recognizes only the URL; JD does not always do so. In some cases it just does not work, and you cannot even start the normal parsing of the link!

Example: https%3A%2F%2Fsoundcloud.com%2Fdj-itronix%201
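
For reference, a small sketch of what the example above decodes to, using Python's standard `urllib.parse.unquote`. It only illustrates percent-decoding in general and makes no claim about how JDownloader decodes internally.

```python
from urllib.parse import unquote

# The percent-encoded example from the post above.
example = "https%3A%2F%2Fsoundcloud.com%2Fdj-itronix%201"

# %3A -> ':', %2F -> '/', %20 -> ' '
decoded = unquote(example)

print(repr(decoded))  # 'https://soundcloud.com/dj-itronix 1'
```

After decoding, the `%20` has become a plain space sitting between `dj-itronix` and `1`.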
  #9  
Old 11.06.2019, 17:18
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

Will be fixed with the next core update, but the example link is invalid.
__________________
JD-Dev & Server-Admin
  #10  
Old 11.06.2019, 19:51
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

It is correct; the URLs are simply recognized incorrectly.

%3A%2F%2Fsoundcloud.com%2Fdj-itronix%201
soundcloud.com/dj-itronixwhitespace1
Blue - correct urls
Red - plain text
Orange - white space
  #11  
Old 11.06.2019, 19:55
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

How should JDownloader know that the 'whitespace 1' doesn't belong to the URL?
__________________
JD-Dev & Server-Admin
  #12  
Old 11.06.2019, 20:35
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

It still depends on the site, but the soundcloud links do not contain spaces, so you can end the link where the white space begins

(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))
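
A hedged sketch of how the expression posted above behaves on decoded text. It is compiled here with Python's `re` module, which is an assumption, since the post does not say which regex engine it was written for. The sample text and the names `URL_PATTERN` and `found` are illustrative only.

```python
import re

# Pattern copied verbatim from the post above; using Python's `re` is an assumption.
URL_PATTERN = re.compile(
    r'''(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))'''
)

# Illustrative text: a decoded link followed by a space and a trailing "1".
text = "listen at https://soundcloud.com/dj-itronix 1 please"

# Group 1 is the whole URL candidate; the pattern stops at the first whitespace.
found = [m.group(1) for m in URL_PATTERN.finditer(text)]

print(found)  # ['https://soundcloud.com/dj-itronix']
```

Because the pattern excludes whitespace, the match ends where the space begins, so the trailing `1` is dropped.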
  #13  
Old 11.06.2019, 20:50
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

I wrote a regular expression that partially solves this problem: links with a space will work, as long as there is no other word after it.
See screenshot:
**External links are only visible to Support Staff**
  #14  
Old 12.06.2019, 00:28
raztoki raztoki is offline
English Supporter
 
Join Date: Apr 2010
Location: Australia
Posts: 16,113
Default

URLs contain the urlencoding for a space (%20), so it is valid according to your source.
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]

Last edited by raztoki; 14.06.2019 at 07:35. Reason: missing l in urls
  #15  
Old 12.06.2019, 10:19
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

Quote:
Originally Posted by djmakinera View Post
It still depends on the site, but the soundcloud links do not contain spaces, so you can end the link where the white space begins
The processing of the urlencoding happens LONG LONG before any plugin kicks in. During that processing, JDownloader cannot know whether the space is part of the URL, as it is equally valid as part of a URL or as a separator.
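
The ambiguity can be illustrated with a short sketch in Python. It is only an illustration of the general problem, not JDownloader's actual processing; the two input strings and all variable names are assumptions.

```python
from urllib.parse import unquote

# Same link, once with the space written as %20 and once typed as a literal space.
encoded_space = "https%3A%2F%2Fsoundcloud.com%2Fdj-itronix%201"
literal_space = "https%3A%2F%2Fsoundcloud.com%2Fdj-itronix 1"

# Once decoded, both inputs are the identical string.
same_after_decoding = unquote(encoded_space) == unquote(literal_space)

# Splitting on whitespace then yields a URL plus a stray "1".
tokens = unquote(encoded_space).split()

print(same_after_decoding)  # True
print(tokens)               # ['https://soundcloud.com/dj-itronix', '1']
```

After decoding, nothing in the text records whether the space was part of the link or a separator between the link and the next word.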
__________________
JD-Dev & Server-Admin
  #16  
Old 14.06.2019, 07:37
raztoki raztoki is offline
English Supporter
 
Join Date: Apr 2010
Location: Australia
Posts: 16,113
Default

Copy the source code: the URL will be either urlencoded on the website or within quotation marks ('' or ""), which tell you where the URL begins and ends (the JD parser supports this). If you copy the part within quotation marks yourself, it's good practice to urlencode it, as it could contain spaces; otherwise you can have issues.
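
A minimal sketch of that advice in Python, assuming a hypothetical one-line page source. The regex and the names `QUOTED_URL`, `quoted_urls`, and `safe_urls` are illustrative and do not reproduce the JD parser.

```python
import re
from urllib.parse import quote

# Hypothetical page source: the quotation marks delimit the URL, space included.
page_source = '<a href="https://soundcloud.com/dj-itronix 1">track</a>'

# Capture everything between a matching pair of quotes that starts with http(s)://.
QUOTED_URL = re.compile(r'''(["'])(https?://.*?)\1''')

quoted_urls = [m.group(2) for m in QUOTED_URL.finditer(page_source)]

# Percent-encode unsafe characters (here: the space) before passing the URL on.
# '%' stays in the safe set so already-encoded sequences are not encoded twice.
safe_urls = [quote(u, safe=":/?&=#%") for u in quoted_urls]

print(quoted_urls)  # ['https://soundcloud.com/dj-itronix 1']
print(safe_urls)    # ['https://soundcloud.com/dj-itronix%201']
```

The quotation marks give an unambiguous start and end, and encoding afterwards keeps the space from being read as a separator later on.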
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]
  #17  
Old 14.06.2019, 15:25
djmakinera djmakinera is offline
JD Legend
 
Join Date: May 2010
Location: Poland
Posts: 8,171
Default

The solution is to open the file as ASCII (binary): this separates out the invalid characters, so the editor detects only the URLs and treats all other characters as non-URL text. JD2, however, still adds strange Unicode characters.
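
One way to sketch this kind of extraction in Python is to scan the raw bytes for runs of printable ASCII that start with `http://` or `https://`. The byte string and the names `blob`, `ASCII_URL`, and `urls` are hypothetical; the post does not say which editor or method was actually used.

```python
import re

# Hypothetical binary content: a URL surrounded by non-text bytes.
blob = b"\x00\x01\xffhttps://soundcloud.com/dj-itronix\x00\xfe\x10tail"

# A URL candidate is "http(s)://" followed by printable, non-space ASCII bytes.
ASCII_URL = re.compile(rb"https?://[\x21-\x7e]+")

# Decode each match; every matched byte is guaranteed to be ASCII.
urls = [m.decode("ascii") for m in ASCII_URL.findall(blob)]

print(urls)  # ['https://soundcloud.com/dj-itronix']
```

Any byte outside the printable ASCII range ends the match, so control characters and other non-text bytes never become part of the extracted link.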

Example:

Last edited by Jiaz; 14.06.2019 at 17:13.
  #18  
Old 14.06.2019, 17:13
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 63,353
Default

JDownloader doesn't add them. They are there. JDownloader doesn't magically add stuff to your clipboard content
__________________
JD-Dev & Server-Admin
All times are GMT +2. The time now is 19:08.