JDownloader Community - Appwork GmbH
 

  #1  
Old 26.07.2014, 12:28
CuF
Guest
 
Posts: n/a
Link Filtering Q: Removing non-links

I can't figure it out, so I don't know if it's possible.

In JD1 if you select a whole page and copy it to the clipboard, JD catches it, and if links are present, they are added.

In JD2, every single image file is added as well.
I don't want to block the ability to LinkGrab http links or even images... as long as they are explicitly links.

It seems a bit aggressive for JD to parse the page source rather than the content as displayed.

Anyway... is there a filter rule I can put together that can fix this?

Just copying this page to the clipboard adds 33 links!

Last edited by CuF; 26.07.2014 at 12:30.
  #2  
Old 26.07.2014, 14:44
pspzockerscene
Community Manager

Join Date: Mar 2009
Location: Deutschland
Posts: 70,911

Note that for a lot of sites we have plugins which "know" what to look for.
For other, unsupported links, the only way is to parse everything (JD will ask whether to run a deep analysis for such links).
This system has not changed from JD1 to JD2.
Please post some of your links here so we can re-check this in JD1 and JD2.

GreeZ psp
__________________
JD Supporter, Plugin Dev. & Community Manager

Erste Schritte & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?
  #3  
Old 26.07.2014, 19:54
CuF
Guest
 
Posts: n/a

I wasn't joking. Just ctrl-a, ctrl-c this page here.

33 'links' are found by JD2. JD1 finds none.
Something has definitely changed.
  #4  
Old 26.07.2014, 20:26
raztoki
English Supporter

Join Date: Apr 2010
Location: Australia
Posts: 17,611

I guess it depends on your browser? Opera only copies text when you Ctrl+A, Ctrl+C (so if a link is written as href=blah>textlinkhere</a>, it will only copy textlinkhere and not blah). If you copy HTML (say, from the page source, protocol://host/file.extension), JD will only pick up those objects. If you parse via deep decrypt in JD's link analysis, it will find more because of the parsing engine (it supports relative paths etc.). All three methods only pick up link formats that we support; the results just vary depending on the import feature. Most importantly, the results also depend on what your browser puts into the clipboard (re: the first method).
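
To illustrate the difference between a text-only copy and an HTML copy, here is a minimal Python sketch. It is not JDownloader's actual parser: the sample markup, the simplified URL pattern and the names `links_in_text` and `links_in_html` are all assumptions made up for this illustration.

```python
import re
from html.parser import HTMLParser

# Simplified URL pattern, for illustration only.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def links_in_text(text):
    """Return URLs that appear literally in plain text."""
    return URL_RE.findall(text)


class _LinkCollector(HTMLParser):
    """Collect href/src attribute values from every tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def links_in_html(html):
    """Return URLs referenced by href/src attributes in HTML markup."""
    collector = _LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical snippet: one visible link plus an avatar image.
SAMPLE_HTML = (
    '<p>See <a href="http://example.com/file.zip">the download</a> '
    '<img src="http://example.com/avatar.png" alt="avatar"></p>'
)
# What a text-only copy of the same snippet would contain.
SAMPLE_TEXT = "See the download"

print(links_in_text(SAMPLE_TEXT))  # no URL is visible in the text
print(links_in_html(SAMPLE_HTML))  # the link target and the image source
```

The same selection yields nothing when only the displayed text is scanned, but yields both the link target and the image once the markup is scanned.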
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]

Last edited by raztoki; 26.07.2014 at 21:02.
  #5  
Old 26.07.2014, 23:27
CuF
Guest
 
Posts: n/a

I'm using FF. Is there a way to turn 'deep analysis' off for clipboard parsed links?
  #6  
Old 27.07.2014, 00:20
raztoki
English Supporter

Join Date: Apr 2010
Location: Australia
Posts: 17,611

Deep analysis isn't applied to the clipboard by default. You have to select it yourself, in one of several ways:

Either add the links the normal way, and if nothing is found in what you added, you will be prompted with the 'nothing found' dialog;

or choose deep analysis when adding, which will match deep instead of normal;

or do this with the shortcut or the menu.

Attached Images
File Type: png Add_Links_Dialog-Add.png (32.3 KB, 127 views)
File Type: png Add_Links_Dialog-Nothing_found.png (10.1 KB, 115 views)
File Type: png Add_New_Links-Menu.png (15.1 KB, 109 views)
  #7  
Old 27.07.2014, 01:06
CuF
Guest
 
Posts: n/a

Hmm. So it isn't finding the 'nothing found' first. It's examining the clipboard in a different manner than JD1 did; as raw html source, not a rendered web page.

With FF it snags every single picture in this window including the avatars, 'post reply' & 'quote' buttons. It's crazy.

I just tested Internet Explorer and JD2 is finding the links to the JDownloader setup files in pspzockerscene's signature. It shouldn't find them, though, since they are hyperlinks, not text in a link format.

Sounds like JD2 wasn't tested on many browsers.
  #8  
Old 27.07.2014, 01:58
raztoki
English Supporter

Join Date: Apr 2010
Location: Australia
Posts: 17,611

Once again, that is a browser function (though I haven't heard of any browser that does this, other than maybe via extensions?): it is copying the HTML/href and not the text that is displayed. No links are found via clipboard monitoring when I Ctrl+A, Ctrl+C this forum thread, only if you deep analyse via the JD GUI function, or if the browser is doing something messed up.
  #9  
Old 27.07.2014, 07:28
CuF
Guest
 
Posts: n/a

Fixed:
Advanced Settings--
GraphicalUserInterfaceSettings: Clipboard Monitor Process HTMLFlavor: Disable

Probably shouldn't be on by default.

Last edited by CuF; 27.07.2014 at 08:42.
  #10  
Old 27.07.2014, 14:20
raztoki
English Supporter

Join Date: Apr 2010
Location: Australia
Posts: 17,611

Hmm, OK, so Chrome (I use it as a secondary browser, but never experienced this before) and probably some other browsers copy HTML as well. You don't see it via clipboard monitoring or Ctrl+V. You can stop JD2 from picking that up by going to 'Settings > Advanced > GraphicalUserInterfaceSettings: Clipboard Monitor Process HTMLFlavor'.

Edit: I see your new post.
Sorry, for some reason my edit to the previous post was never submitted, but the above is what I wrote 12 hours ago.

Last edited by raztoki; 27.07.2014 at 14:53.
  #11  
Old 27.07.2014, 20:58
CuF
Guest
 
Posts: n/a

Thanks for the help everyone.

When I really want JD to grab every single link on a page, I use 'view source' and copy that to the clipboard.

The advanced setting is hidden too deep to be useful as an option in place of 'view source' (it even requires a restart of JD).

It also certainly shouldn't be enabled by default considering how it responds to different browsers.
  #12  
Old 27.07.2014, 21:06
raztoki
English Supporter

Join Date: Apr 2010
Location: Australia
Posts: 17,611

That decision was made by AppWork and I think they are happy with it, though I'll bring it up with them on Monday and find out why this is our current default setting. The only reason I haven't noticed it before is that it's highly dependent on the browser, and I never really use Chrome/Firefox.
  #13  
Old 27.07.2014, 21:12
pspzockerscene
Community Manager

Join Date: Mar 2009
Location: Deutschland
Posts: 70,911

Anyway, we're still in BETA and are collecting good default settings for the JD2 Stable release (no ETA given).

GreeZ psp
  #14  
Old 27.07.2014, 22:46
CuF
Guest
 
Posts: n/a

Quote:
Originally Posted by raztoki View Post
That decision was made by AppWork and I think they are happy with it, though I'll bring it up with them on Monday and find out why this is our current default setting. The only reason I haven't noticed it before is that it's highly dependent on the browser, and I never really use Chrome/Firefox.
I thought it was stated that it worked properly under Chrome, which I don't use.
The problems do arise in FF and IE.
  #15  
Old 28.07.2014, 00:10
raztoki
English Supporter

Join Date: Apr 2010
Location: Australia
Posts: 17,611

@CuF
I ran a test as well: in my Chrome it behaved like your Firefox.
  #16  
Old 10.05.2020, 01:47
jua
Baby Loader

Join Date: Jun 2019
Posts: 7

(I know this is an old thread, but I think this reply might help future searchers. So let me tag the post with keywords. Jdownloader LinkGrabber linkfilter too many links too many files exclude images favicon logo icon deep analysis)


I just had the same problem, and I have understood the reason. I am moving my computing (including JDownloader2) to a new machine a little bit at a time, and now I found a new problem which wasn't there on the old machine. The problem is the same as reported by the original poster here.

It took me a little experimenting to find out why it happens.

Short answer: clipboard format.

Long answer: read on.

When a program places some information in the clipboard (copy), it is placed there in several formats, depending on the program and the information. In the case of Firefox and web page text, these are the formats that Firefox puts in the clipboard (the sizes are for this page, which I copied just now while writing):

Code:
FORMAT_ID FORMAT_NAME           HANDLE_TYPE SIZE    INDEX
---------------------------------------------------------
49699     text/html             Memory      196,042 1 
49376     HTML Format           Memory      98,347  2 
49751     text/_moz_htmlcontext Memory      206     3 
49750     text/_moz_htmlinfo    Memory      8       4 
13        CF_UNICODETEXT        Memory      22,784  5 
1         CF_TEXT               Memory      11,392  6 
49752     text/x-moz-url-priv   Memory      160     7

Some formats have richer content. When pasting, the receiving program usually picks the format with the richest content it can understand. In the case of an ordinary text processing program (e.g. Notepad), it would probably be Format 13 (CF_UNICODETEXT) or Format 1 (CF_TEXT), or possibly Rich Text Format if Firefox copied an RTF version too (which it doesn't). However, Firefox also puts the HTML source in the clipboard, in 2 different formats (49699, 49376) which Notepad would ignore, but which JDownloader chooses over simple text.

That's why LinkGrabber "sees" the source and shows links that aren't interesting, such as images linked in the text of the web page. It's not a matter of deep analysis at all: just a matter of reading the source HTML rather than the rendered text.

A possible workaround is to paste the text into an editor such as Notepad, and re-copy from there, then paste into LinkGrabber.

A more permanent solution would be an option to let JDownloader discard the HTML source and simply pick simple text from the clipboard.

EDIT - Oops, the option is already there as @raztoki noted above ( https://board.jdownloader.org/showpo...3&postcount=10 ).

SOLUTION Settings > Advanced > GraphicalUserInterfaceSettings: Clipboard Monitor Process HTMLFlavor. Turn it off.
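
As a rough illustration of why that setting matters, the following Python sketch models a receiver choosing among clipboard formats. It is not JDownloader's code: the preference order, the `pick_format` function and the `process_html` flag are hypothetical stand-ins for the behaviour described above. The format names and sizes are the ones from the table.

```python
# Formats and sizes as listed in the clipboard table above.
CLIPBOARD = {
    "text/html": 196_042,
    "HTML Format": 98_347,
    "CF_UNICODETEXT": 22_784,
    "CF_TEXT": 11_392,
}

# Hypothetical preference order: richest format first.
HTML_FORMATS = ["text/html", "HTML Format"]
TEXT_FORMATS = ["CF_UNICODETEXT", "CF_TEXT"]


def pick_format(available, process_html=True):
    """Return the first preferred format present in `available`.

    `process_html` is a stand-in for the HTMLFlavor setting: when False,
    HTML formats are skipped and only plain text is considered.
    """
    preferred = (HTML_FORMATS if process_html else []) + TEXT_FORMATS
    for name in preferred:
        if name in available:
            return name
    return None


print(pick_format(CLIPBOARD))                      # text/html
print(pick_format(CLIPBOARD, process_html=False))  # CF_UNICODETEXT

# After a round trip through a text-only editor, no HTML format is left.
print(pick_format({"CF_UNICODETEXT": 22_784, "CF_TEXT": 11_392}))  # CF_UNICODETEXT

# The two text entries differ by exactly a factor of two in size.
print(CLIPBOARD["CF_UNICODETEXT"] / CLIPBOARD["CF_TEXT"])  # 2.0
```

With the flag off, or after the Notepad round trip, only the plain text is left to scan, so image and button URLs that exist only in the markup never reach the link scanner. The factor of two between the text entries is consistent with CF_UNICODETEXT storing two bytes per character and CF_TEXT one.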

Last edited by jua; 10.05.2020 at 01:55. Reason: Hadn't noticed a solution a few posts above

All times are GMT +2. The time now is 15:31.