#1
Issue: Dupes from LinkGrabber list will be crawled, dupes from Download list won't
The issue is that links which have duplicates in the LinkGrabber list are still verified and online-checked, even though they won't be added by default.
Links which have duplicates in the Downloads list won't be verified or online-checked unless I 'restore' them. This is correct! For the full report with screenshots, please read the attachment. And please verify whether what I'm writing here is correct. If not, I will edit it accordingly, so others can use this as (correct) information.

Preface
JD provides the user with an option to 'Restore x Filtered Links', which is enabled by default. Duplicate handling with default settings works as follows: no duplicate is added to the LinkGrabber list when the same link already exists in the Downloads list or in the LinkGrabber list. However, you can restore filtered duplicates of links which exist in the Downloads list (but only those!). To see restored links, you have to enable 'Already in Downloadlist' in the 'Views' pane.

Issue I found
In my eyes, duplicates in the LinkGrabber list should not be verified or online-checked either. Duplicate crawling is handled differently depending on which list the duplicates were found in. Please see my test cases for details, in particular the red/bold phrases:

Case 1: When I copy a link into the LinkGrabber window (Analyze and Add Links) whose URL is already in the LinkGrabber list, the link is not added. It won't be shown, even with this box checked:
But I see the following in the BubbleNotifier:
The link is obviously being verified and online-checked. But Found Packages stays at zero.

Case 2: If the link is not in the LinkGrabber list but in the Downloads list, it looks like this:
Here too, but only since a more recent update, the link is no longer added to the LinkGrabber list. But in this case 2 it is not verified or online-checked, recognizable by the zeros in the BubbleNotifier. The second difference to case 1 is that I have the option to restore the link that was filtered as a duplicate. In fact, it's probably not a restoration, because if I click the 'Restore 1 filtered links' button, then, and only then, this link is verified and online-checked:
Differences to case 1 are:

Case 3: If I now try to add the link again, the link is once more not added and is not verified or online-checked either:
The behavior is identical to case 2, except that no restore is offered here and it is also not indicated why the link was not entered in the LinkGrabber list.

Last edited by StefanM; 12.06.2022 at 16:31. Reason: Corrections after Developer Feedback and more precise description
#2
I'm sorry but this is not true at all. This button has existed since 31.10.2013! Its functionality has not changed since then.
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 15:03.
#3
Quote:
But in all of my numerous installations it showed up for the first time just a few weeks ago. I still remember this, because I then started to look into it, as it did not work as expected (which was due to another custom filter). Is it possible that a setting in 'advanced settings' was somehow set to, or left as, disabled (without me doing that)? I have thousands of dupes in my LinkGrabber list, which prove that this option was not there or not enabled in my installations for a very long time.
#4
Quote:
In both cases you'll have a custom configuration which does not contain the button.
__________________
JD-Dev & Server-Admin
#5
Quote:
__________________
JD-Dev & Server-Admin
#6
Quote:
The option to disable dupe checks in Linkgrabber was added on 31.03.2017 (default: enabled). The dupe check itself has existed since 2011.
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 15:36.
#7
Quote:
If you look at the PDF, you will see the screenshot with that many dupes. Those are dupes of links in the Downloads list, and I would not have gotten them if the filter/restore option had been enabled. That's what I tried to say.
#8
Quote:
When you use "Already in Download list" as a filter condition, then of course the link will be filtered/not added unless you have a view that matches it, as explained here: https://board.jdownloader.org/showpo...23&postcount=7
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 15:04.
#9
Case 1: you cannot add the same link multiple times to the Linkgrabber list. You can disable this behaviour, see https://board.jdownloader.org/showpo...79&postcount=2
__________________
JD-Dev & Server-Admin
#10
@Stefan: I've already explained it here, https://board.jdownloader.org/showpo...23&postcount=7
__________________
JD-Dev & Server-Admin
#11
Quote:
Quote:
Because you've enabled a filter that has the condition "Already in Downloads list".
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 15:12.
#12
Found Packages stays at zero because no new package was added to the list.
Found links is 1 because the link is NOT filtered; it is simply not added to the list because it's already part of the Linkgrabber list.
__________________
JD-Dev & Server-Admin
#13
Quote:
Well, I tested this at least 5 or 6 times back and forth: I can see the grabbing process, or better: the verification/online check process. And I think it would be better if, in case a duplicate was found, no verification/online check were performed at all. This would also save a lot of time.

Real Life Scenario
Please note: I am talking about adding a list of links, not the contents of a web page, which would have to be crawled for links first. When I add a few hundred links (paste a list of links) to LinkGrabber, and all of them are dupes which already exist in the LinkGrabber list, it would only take JD seconds to figure that out. But instead, all links are verified and online-checked first, which can take a lot of time. And after this process those dupes are not added to the LinkGrabber table. This is my observation! And when I do the same with a list of links that have duplicates in the Downloads list, they won't be verified or online-checked, which is the behavior I am asking to have implemented for dupes in the LinkGrabber list as well.
#14
Quote:
Please do NOT attach a pdf of your post!
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 15:29.
#15
Quote:
This would make things so easy... I create a screenshot (I use FastStone Capture) and then I can paste/embed it where I want to have it. But here I have to
That is 30 (!) actions instead of just pasting 5 pics. But there would also be another solution: a script which would automatically upload my screenshot to a picture hoster, fetch the direct link and copy it to the clipboard.
Last edited by StefanM; 12.06.2022 at 16:10.
#16
Quote:
There are easy-to-use tools that create screenshots, auto-upload them to image hoster xy and copy the URL to the clipboard, for example zscreen.
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 16:11.
#17
Quote:
I do agree there are some people who may post sensitive information, but if they are stupid enough to do that, it is on them. Responsible users should not have to pay for their mistakes.
#18
Quote:
#19
Quote:
Addendum: And if somebody is that stupid, it would be even worse for them: forced by forum policy to do so, they would upload sensitive information to a picture hoster, from which they cannot even delete it anymore. So it would be for their own security to allow embedding screenshots directly, which they (or an admin) can delete at any time.
#20
@StefanM: Just to make it clear. Dupe check is NOT done on filename/filesize but ONLY on link/internal link!
Dupes and Mirrors are different things.
__________________
JD-Dev & Server-Admin
#21
Yes, I'm aware of that and remember that you use the hash value for that, as any other duplicate finder software would do. (I created the German GUIs for SpaceMan99 and DuplicateCleaner.)
#22
@StefanM: finally we come to the *real* topic or *issue* you want to be optimized!
So you would like to have an option/optimization so that the crawling process checks for an existing dupe in the Linkgrabber list, to avoid unnecessary processing of a link that is later *trashed* because it's already part of the Linkgrabber list. So your wish is: when the same link is added again, abort as early as possible, in the best case before it has been processed/online-checked, right?
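A minimal sketch of the requested optimization, written in Python and not taken from JDownloader's code: each incoming link is looked up in the set of links already in the Linkgrabber list before the expensive online check runs. The names `add_links`, `known_links` and `online_check` are hypothetical, and the sketch assumes the dupe key is already known before the check, which is not the case for every plugin.

```python
from __future__ import annotations

from typing import Callable, Iterable


def add_links(
    incoming: Iterable[str],
    known_links: set[str],
    online_check: Callable[[str], bool],
) -> dict[str, list[str]]:
    """Add links, skipping the online check for links already in the list."""
    result: dict[str, list[str]] = {"checked_and_added": [], "skipped_dupes": []}
    for link in incoming:
        if link in known_links:
            # Dupe found: abort before any processing/online check happens.
            result["skipped_dupes"].append(link)
            continue
        online_check(link)  # expensive step, only reached for new links
        known_links.add(link)
        result["checked_and_added"].append(link)
    return result


# Usage: record how often the (hypothetical) online check actually runs.
calls: list[str] = []


def fake_online_check(link: str) -> bool:
    calls.append(link)
    return True


known = {"https://example.com/file/1"}
outcome = add_links(
    ["https://example.com/file/1", "https://example.com/file/2"],
    known,
    fake_online_check,
)
```

With this ordering, pasting a few hundred links that are all dupes costs only set lookups; the online check is never called for them.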
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 15:46.
#23
:P
Quote:
Exactly as you already do it with links in the Downloads list. And for better understanding I created the PDF, where you can see this in the screenshots I was referring to.
#24
Perfect! And I guess you're talking about your vk links, so I can use them for testing, right? As explained, this dupe check works on the link/internal link, so I must check in the plugin which information is available before the link is processed.
__________________
JD-Dev & Server-Admin
#25
Quote:
It was a general suggestion to use the same process for both: dupe checking against the Downloads list and dupe checking against the LinkGrabber list. Links you consider to be dupes shouldn't be verified or online-checked unnecessarily, as you already do it with links that have dupes in the Downloads list. I cannot see any dependence on the hoster whatsoever.
#26
read here
Quote:
Many plugins work with custom internal links like host://fileID but you cannot check a.com/file/supernice/fileID against that.
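To illustrate why the raw URL cannot simply be compared against stored internal links, here is a hedged Python sketch. It assumes a hypothetical host `a.com` whose URLs end in the file ID, and a hypothetical helper `to_internal_link`; real plugins derive their internal links in their own ways, sometimes only after the link has been processed.

```python
from typing import Optional
from urllib.parse import urlparse


def to_internal_link(url: str) -> Optional[str]:
    """Map a URL of the hypothetical host a.com to 'host://fileID'.

    Returns None when the ID cannot be derived from the URL alone,
    i.e. when the link would have to be processed first.
    """
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parsed.path.split("/") if s]
    # Assumption: a.com URLs look like /file/<title>/<fileID>.
    if host == "a.com" and len(segments) == 3 and segments[0] == "file":
        return f"{host}://{segments[2]}"
    return None


# Internal links already stored in the list (hypothetical format).
known_internal = {"a.com://fileID"}

raw = "a.com/file/supernice/fileID"
is_dupe_raw = raw in known_internal                         # plain URL comparison
is_dupe_internal = to_internal_link(raw) in known_internal  # after normalization
```

The plain comparison misses the dupe, while the normalized one finds it. For hosts where `to_internal_link` returns `None`, the dupe can only be detected after processing.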
__________________
JD-Dev & Server-Admin
#27
Correct and I agree. I just want to explain why this won't be possible for all links.
__________________
JD-Dev & Server-Admin
#28
@StefanM: I'm sorry but I don't understand! What is the problem here?
You add Link X to Linkgrabber and move it to Downloads. Then you will be able to add Link X to Linkgrabber again, unless you've added a filter with the condition *already in download list*. That will prevent this link from being added again. BUT in case you also have a matching view rule, then this link will be added to Linkgrabber, because a matching view rule overrides a matching filter rule.
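The precedence described above can be sketched in Python as follows. The names `Link`, `should_add` and `in_download_list_rule` are hypothetical and only model the stated rule; they are not JDownloader's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Link:
    url: str
    already_in_download_list: bool


Rule = Callable[[Link], bool]


def should_add(
    link: Link, filter_rules: Sequence[Rule], view_rules: Sequence[Rule]
) -> bool:
    """A matching view rule overrides a matching filter rule."""
    if any(rule(link) for rule in view_rules):
        return True
    return not any(rule(link) for rule in filter_rules)


def in_download_list_rule(link: Link) -> bool:
    """Condition 'already in download list'."""
    return link.already_in_download_list


link_x = Link("https://example.com/x", already_in_download_list=True)

no_rules = should_add(link_x, [], [])
filtered = should_add(link_x, [in_download_list_rule], [])
restored = should_add(link_x, [in_download_list_rule], [in_download_list_rule])
```

Here `no_rules` and `restored` are `True`, while `filtered` is `False`: the filter alone blocks Link X, and the additional view rule lets it through again.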
__________________
JD-Dev & Server-Admin
#29
@StefanM, @mgpai:
The board software does not support a *paste image from clipboard* feature. Yes, I agree that such a feature would make it easier/more user-friendly to post images, but the images would still be visible to support staff only. There are reasons why we don't make attachments available to the public and use URL blacklisting.
__________________
JD-Dev & Server-Admin
Last edited by Jiaz; 12.06.2022 at 16:44.
#30
For me it's the opposite: when the link already exists in the LinkGrabber list, I immediately get "-> Filtered Dupe" (in the log), with no access to the web.
__________________
FAQ: How to upload a Log
Last edited by tony2long; 14.06.2022 at 10:34.
#31
Quote:
What does it show if and when it has been enabled?
#32
Quote:
Found link(s) 1 -and- Done
__________________
FAQ: How to upload a Log
#33
Hmm, I tested that on June 12 several times, and for LinkGrabber dupes Bubble notifier showed
Found link(s) 1 Found package(s) 0 Online: 1 ...
But I could check the log, too. Which file would that be?
#34
Most likely because fast linkcheck support is enabled in Settings->Plugins->vk.com
__________________
JD-Dev & Server-Admin
#35
Quote:
It's the only log file that contains the string Filtered Dupe. And I cannot find the online check in it (as indicated by the Bubble Notifier). But I can also see from the timing that an online check is being performed, again only when the dupe exists in LinkGrabber. And I also have fast linkcheck enabled. When the Bubble Notifier displays an online check, shouldn't one see this in the log as well? Or would that be a different log file?
#36
@StefanM: and that's exactly why JDownloader uses internal links for dupe checking, as the URL may vary a lot. I just wanted to explain that it depends on the plugin/implementation whether the internal link is available before or after the linkcheck, and thus the linkcheck can't always be avoided.
__________________
JD-Dev & Server-Admin
#37
Sorry, because there is no vk.com in the title, I tried with a simple link like:
tiktok.com/@elinadevia/video/7106693553442655514?lang=en
__________________
FAQ: How to upload a Log
#38
Quote:
I learned that the behavior can differ depending on the hoster. What I still don't know: whether or not there is (or can be) unnecessary online-checking for dupes in LinkGrabber.
#39
Quote:
Examples for that:
- Re-adding a twitter profile
- Adding links to all sorts of video websites for which JD can return multiple results e.g. vimeo.com, pornhub.com
- Adding cloud folders such as google drive
- Adding user profiles e.g. twitter.com, vimeo.com profile, youporn.com profile
Basically everything where a crawl process is involved, because JD cannot know the results beforehand. E.g. user profiles can have more/different results over time, cloud folders can get updated, and so on.
EDIT
And as always, exceptions can exist, e.g. some plugins are highly optimized and/or have extra settings for that behavior, e.g. youtube, vk.com.
__________________
JD Supporter, Plugin Dev. & Community Manager
Getting Started & Tutorials || JDownloader 2 Setup Download
#40
That heavily depends on the hoster/plugin and I can't give a general statement
__________________
JD-Dev & Server-Admin