JDownloader Community - Appwork GmbH
 

  #1  
Old 10.11.2020, 17:27
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Problems decrypting links for metalarea.org

I have added basic authentication in JD2 for metalarea.org.

I paste these links into JD2, but it doesn't crawl anything, even though these pages contain links to several hosters:

**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**

Maybe you can't see the links inside the metalarea pages because you need credentials to see hidden content (the hoster links are also hidden in spoilers).
  #2  
Old 10.11.2020, 17:30
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Hi,

I don't think that basic-authentication will work for this forum style website.
You will probably have to extract the cookies of that website and add a link crawler rule.
Without your login credentials I won't be able to help you with that.
If you need an example, see your older thread HERE.

-psp-
EDIT

I'll be offline soon and back again tomorrow.
__________________
JD Supporter, Plugin Dev. & Community Manager

First Steps & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?

Last edited by pspzockerscene; 10.11.2020 at 17:40.
  #3  
Old 10.11.2020, 17:55
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

I sent you metalarea credentials.

One more thing, if I may.
When I copy information from JD2, I get output like this:

Link;2010 - Crusted.rar;**External links are only visible to Support Staff****External links are only visible to Support Staff**
Link;1996 - Incomplete Minds.7z;**External links are only visible to Support Staff****External links are only visible to Support Staff**

Is it possible to also see (in addition to the download link) the original forum link from which the link was taken? I mean like this:

Link;2010 - Crusted.rar;**External links are only visible to Support Staff****External links are only visible to Support Staff**
Link;1996 - Incomplete Minds.7z;**External links are only visible to Support Staff****External links are only visible to Support Staff**

I need the copied information to also include the forum link each download link was taken from.
  #4  
Old 10.11.2020, 18:00
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,522
Default

You can customize the CopyToClipboard action via right-click -> context menu -> menu editor; there you can modify what is copied to the clipboard.
Additional tags are:
Quote:
{url.container}
{url.origin}
{url.content}
{url.referrer}
__________________
JD-Dev & Server-Admin
  #5  
Old 10.11.2020, 18:01
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,522
Default

Also check right-click -> context menu -> properties -> show URL, and double-click the URL to see all known URLs.
  #6  
Old 10.11.2020, 18:02
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

The procedure is basically the same as described in your older forum thread but this time you need the "masession_id" cookie:
Code:
[ {
  "enabled" : true,
  "cookies" : [ ["masession_id", "CENSORED"] ],
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "name" : "metalarea.org example rule with cookie-login",
  "pattern" : "https?://metalarea\\.org/forum/index\\.php\\?showtopic=\\d+",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : null,
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "Download from <a href=\"(https?://[^\"]+)\"",
  "rewriteReplaceWith" : null
} ]
Rule as plaintext for easier copy & paste:
pastebin.com/JNC85fCH
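A rough way to sanity-check the two regular expressions from the rule above outside of JDownloader is the following Python sketch. It only loads the `pattern` and `deepPattern` fields and applies them to a made-up HTML snippet; the sample markup, the `example.com` links and the helper name `crawl_preview` are assumptions for illustration, and the use of a full-string match for `pattern` is also an assumption about how the rule is applied, not something confirmed in this thread.

```python
import json
import re

# Only the two regex fields of the posted rule; the cookie is left out on purpose.
RULE_JSON = r'''
[ {
  "pattern" : "https?://metalarea\\.org/forum/index\\.php\\?showtopic=\\d+",
  "deepPattern" : "Download from <a href=\"(https?://[^\"]+)\""
} ]
'''

rule = json.loads(RULE_JSON)[0]
url_pattern = re.compile(rule["pattern"])
deep_pattern = re.compile(rule["deepPattern"])

# Hypothetical page snippet; the real forum markup may differ.
SAMPLE_HTML = (
    'Download from <a href="https://example.com/file/abc">Mirror 1</a><br>'
    'Download from <a href="http://example.org/d/xyz">Mirror 2</a>'
)


def crawl_preview(url, html):
    """Return the links deepPattern would capture, or [] if the URL does not match."""
    if not url_pattern.fullmatch(url):
        return []
    return deep_pattern.findall(html)


links = crawl_preview(
    "https://metalarea.org/forum/index.php?showtopic=12345", SAMPLE_HTML
)
print(links)
```

If the list comes back empty for a page that visibly contains hoster links, the usual suspects are an expired cookie (the page is served without the hidden content) or markup that does not contain the literal text `Download from <a href="`.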
EDIT

Please keep in mind that while we always try to help, the creation of custom LinkCrawler Rules is something that you should learn!
We won't provide example rules for another 100 websites for you.
In order to learn how to do this, you need to learn how to use regular expressions first - you can use webtools such as this to practice: regex101.com


Regarding your 2nd question:
Go to Settings -> User Interface -> Downloadlink address display
Move "Data" to the top and/or deselect all others.

This will only work if you add "uncrypted" downloadlinks.
If you add e.g. .DLC containers, JD won't ever display the direct URLs to you.

-psp-
Last edited by pspzockerscene; 10.11.2020 at 18:05.
  #7  
Old 11.11.2020, 01:32
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@Jiaz
@pspzockerscene

Thanks a lot!
  #8  
Old 11.11.2020, 10:22
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,522
Default

@nathan1: also see my comment here https://board.jdownloader.org/showthread.php?t=85914
  #9  
Old 11.11.2020, 14:08
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Thanks for your feedback.

-psp-
  #10  
Old 11.11.2020, 22:50
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@Jiaz
@pspzockerscene

I added them like this:

{type};{name};{url};{url.container};{url.origin};{packagename}

I tested some links, but since the update JD2 has trouble recognizing the URLs, and it either doesn't set the <title> as the package name or doesn't copy it with the Copy Information action.

For example, this URL contains a mediafire link:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

but it isn't crawled.

These links are also problematic:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

LOG
Code:
11.11.20 22.33.02 <--> 11.11.20 22.28.29 jdlog://6033425302851/
  #11  
Old 11.11.2020, 23:58
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

Another example:

This link is not crawled:
**External links are only visible to Support Staff****External links are only visible to Support Staff**

The <title> of this URL is Carach Angren - Franckensteina Strataemontanus (2020), Symphonic Black Metal

But since the update JD2 doesn't work:

1. It doesn't crawl the links inside the page.
2. It doesn't use the <title> of **External links are only visible to Support Staff****External links are only visible to Support Staff** as the name of the package containing the crawled hoster links, or at least the title isn't copied by Copy Information (even with {type};{name};{url};{url.container};{url.origin};{packagename}).
  #12  
Old 12.11.2020, 00:09
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

All working fine here!
1. Working fine.
If it doesn't work for you, log out in your browser and log in again --> grab the new value of the cookie and put that in your rule.
Cookies can expire - yours might have expired.
Mine also expired in my test-rule and I had to renew the cookie to make it work again.
2. If you want the rule to set a package title, you'd have to define that in the rule ("packageNamePattern").
Again I'm urging you to learn how to use regular expressions, but I've modified the rule once again for you to grab and set the title:
Code:
[ {
  "enabled" : true,
  "cookies" : [ [ "masession_id", "CENSORED" ] ],
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "id" : 1605027636498,
  "name" : "metalarea.org example rule with cookie-login",
  "pattern" : "https?://metalarea\\.org/forum/index\\.php\\?showtopic=\\d+",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : "<title>(.*?)</title>",
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "Download from <a href=\"(https?://[^\"]+)\"",
  "rewriteReplaceWith" : null
} ]
Rule on pastebin:
pastebin.com/Cfv2Zgv0
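To see what `packageNamePattern` captures, here is a small Python sketch that applies the same expression to a minimal page. The title text is the one quoted earlier in this thread; the surrounding HTML and the helper name `package_name` are assumptions, and Python's `re` module may differ in details from the regex engine JDownloader uses.

```python
import re

# Same expression as the "packageNamePattern" value in the rule above.
PACKAGE_NAME_PATTERN = re.compile(r"<title>(.*?)</title>")


def package_name(html):
    """Return the first <title> text, or None if the page has no title tag."""
    match = PACKAGE_NAME_PATTERN.search(html)
    return match.group(1).strip() if match else None


# Minimal stand-in for a topic page; only the <title> matters here.
sample_page = (
    "<html><head><title>Carach Angren - Franckensteina Strataemontanus (2020), "
    "Symphonic Black Metal</title></head><body></body></html>"
)

print(package_name(sample_page))
```

The non-greedy `(.*?)` stops at the first `</title>`, so anything the site appends inside the title tag ends up in the package name as well.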

-psp-
  #13  
Old 12.11.2020, 01:12
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

OK, I refreshed the browser and added the new cookie; now it works better, but not perfectly.
Here is an example. I copy these links (I use the "copy selected links" extension for Firefox to copy the URLs):

**External links are only visible to Support Staff****External links are only visible to Support Staff**
javascript:multi_page_jump('**External links are only visible to Support Staff**, 38, 30 );
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**

JD2 crawls these links.

When I copy the information, I see this:

Link;Demoniac 1993 Satanas 666 (Rehearsal 11-1993).rar;**External links are only visible to Support Staff****External links are only visible to Support Staff** 1993 Satanas 666 (Rehearsal 11-1993);
Link;Moonblood-The Winter Falls Over The Land [Remastered](CD 2015).rar;**External links are only visible to Support Staff****External links are only visible to Support Staff** Winter Falls Over The Land [Remastered](CD 2015);
Link;st.rar;**External links are only visible to Support Staff****External links are only visible to Support Staff**

What is the problem?
For st.rar, "packageNamePattern" : "<title>(.*?)</title>" doesn't work.
The information it returns is:

Link;st.rar;**External links are only visible to Support Staff****External links are only visible to Support Staff**

At the end of line I see ;st; and not ;Moonblood-The Winter Falls Over The Land [Remastered](CD 2015);

I also don't understand why JD2 still asks for a metalarea login even though I added the cookie login correctly. When I copy several links, JD2 asks me to log in.

Here is the log:
Code:
11.11.20 23.48.47 <--> 12.11.20 00.11.04 jdlog://5333425302851/
  #14  
Old 12.11.2020, 01:36
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Oh lol we actually had a crawler for this website from 2016.
I'll check that with Jiaz tomorrow.
The old crawler only listens for "http" URLs which is why you triggered it.

Please wait for us to re-check this tomorrow/later ...

-psp-
  #15  
Old 12.11.2020, 16:11
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

I've fixed our old metalarea plugin from 2016.
We're usually not making plugins for such simple websites anymore but you're lucky that this one still exists and didn't require a lot of work to fix.
You do not need the above linkcrawler rule anymore after the next update.

Are you waiting for recently announced changes to get released?
Updates do not necessarily get released immediately!
Please read our Update FAQ!


-psp-
  #16  
Old 12.11.2020, 23:39
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

Quote:
"packageNamePattern" : "<title>(.*?)</title>",
But is it still possible to use the "packageNamePattern" : "<title>(.*?)</title>" rule? Or should I just add {packagename} to the CopyToClipboard action?

Last edited by nathan1; 12.11.2020 at 23:41.
  #17  
Old 13.11.2020, 00:19
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

With your package naming I see this:

Quote:
Amenra - Mass VI (2017) - Metal Area - Extreme Music Portal
Amenra - Prayers 9 10 [ep] (2004) - Metal Area - Extreme Music Portal
Amenra - Mass IIII (2008) - Metal Area - Extreme Music Portal


I see that "Metal Area - Extreme Music Portal" is appended to every package name. Is it possible to retrieve the real package name instead? The real names carry the music genre tag in the title, for example:

Quote:
Amenra - Mass VI (2017) - Doom/Sludge/Hardcore/Drone (D)
Amenra - Prayers 9 10 [ep] (2004) - Doom/Sludge/Hardcore/Drone
Amenra - Mass Iii (2006) - Doom/Sludge/Hardcore
I ask this because when you crawl a metalarea link in JD2, for example
**External links are only visible to Support Staff****External links are only visible to Support Staff**

you can see the real <title> is this

Amenra - Mass Iii (2006), Doom/Sludge/Hardcore



and not Amenra - Mass Iii (2006) - Metal Area - Extreme Music Portal.
This is because the relative URLs are inside the page from which JD2 crawls the links.

I would like the package name to refer to that title.

Last edited by nathan1; 13.11.2020 at 00:30.
  #18  
Old 13.11.2020, 12:10
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Quote:
Originally Posted by nathan1 View Post
But is still possible to use "packageNamePattern" : "<title>(.*?)</title>", rule? Or just add {packagename} for CopyToClipboard Action?
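No. If a plugin for a website is available, LinkCrawler Rules for the same website are ignored.
Also, our plugin already does this automatically to get the package name.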

Quote:
Originally Posted by nathan1 View Post
In your package naming I see this


...

I see that for every package is added Metal Area - Extreme Music Portal. Is possible to retrieve the real name of packet? Real names have its music genre tag in the <title> name. Look why, please
I've updated this for the next plugin-update.
In the future, please use packagizer rules to correct package titles if you need to remove parts of them.
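The kind of clean-up meant here boils down to a regular-expression replacement. The Python sketch below is not a packagizer rule and does not use any JDownloader API; it only illustrates, with the package names quoted in this thread, how the site suffix could be stripped. The helper name `clean_title` is hypothetical.

```python
import re

# Matches the site suffix only when it sits at the very end of the title.
SITE_SUFFIX = re.compile(r"\s+-\s+Metal Area - Extreme Music Portal$")


def clean_title(title):
    """Remove the trailing site name; titles without the suffix are returned unchanged."""
    return SITE_SUFFIX.sub("", title)


titles = [
    "Amenra - Mass VI (2017) - Metal Area - Extreme Music Portal",
    "Amenra - Prayers 9 10 [ep] (2004) - Metal Area - Extreme Music Portal",
    "Amenra - Mass IIII (2008) - Metal Area - Extreme Music Portal",
]

cleaned = [clean_title(t) for t in titles]
for name in cleaned:
    print(name)
```

Anchoring the expression with `$` keeps it from touching a title that merely mentions the site name somewhere in the middle.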



-psp-
  #19  
Old 14.11.2020, 03:31
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

I tried to change the tag used to retrieve the correct title, but something doesn't work.

To fix the package titles as you suggested, instead of

Code:
"packageNamePattern" : "<title>(.*?)</title>",
I use this

Code:
"packageNamePattern" : "<td style=\"word-wrap:break-word;\" width=\"99%\">(.*?)</td>",

Code:
[ {
  "enabled" : true,
  "cookies" : [ [ "masession_id", "CENSORED" ] ],
  "updateCookies" : true,
  "logging" : false,
  "maxDecryptDepth" : 1,
  "id" : 1605027636498,
  "name" : "metalarea.org example rule with cookie-login",
  "pattern" : "https?://metalarea\\.org/forum/index\\.php\\?showtopic=\\d+",
  "rule" : "DEEPDECRYPT",
  "packageNamePattern" : "<td style=\"word-wrap:break-word;\" width=\"99%\">(.*?)</td>",
  "passwordPattern" : null,
  "formPattern" : null,
  "deepPattern" : "Download from <a href=\"(https?://[^\"]+)\"",
  "rewriteReplaceWith" : null
} ]
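A rule like this is plain JSON, so every double quote inside a pattern has to be written as `\"`; otherwise the rule is not valid JSON. One way to avoid escaping by hand is to let a JSON library do it. The Python sketch below takes the two patterns from this post as raw strings and prints the correctly escaped fragment; the helper `is_valid_json` and the variable names are only for illustration.

```python
import json

# The regular expressions exactly as one would type them, without JSON escaping.
package_name_regex = r'<td style="word-wrap:break-word;" width="99%">(.*?)</td>'
deep_regex = r'Download from <a href="(https?://[^"]+)"'

fragment = {
    "packageNamePattern": package_name_regex,
    "deepPattern": deep_regex,
}

# json.dumps adds the backslashes in front of the inner double quotes.
encoded = json.dumps(fragment, indent=2)
print(encoded)


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A value with unescaped inner quotes does not parse.
unescaped = '{"deepPattern" : "Download from <a href="(https?://[^"]+)""}'
print(is_valid_json(unescaped))
print(is_valid_json(encoded))
```

Running the same `is_valid_json` check on a complete rule before pasting it into JDownloader catches this kind of quoting mistake early.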

Your update just removes "Metal Area..", but the real name is inside this tag:
Code:
<td style="word-wrap:break-word;" width="99%"> </td>

For example, in this URL
**External links are only visible to Support Staff****External links are only visible to Support Staff**


This is the text I am trying to set as the package title, following your advice on correcting package titles.
  #20  
Old 16.11.2020, 17:00
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

I've updated it once again in our plugin.

Again:
If a plugin for a website is available, the plugin will be used and LinkCrawler Rules for the same website will be ignored.



-psp-
  #21  
Old 17.11.2020, 02:53
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@pspzockerscene

Great job!

May I ask what you changed to retrieve the text from that tag? What did you write?
  #22  
Old 17.11.2020, 08:59
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,522
Default

@nathan1: he updated the native plugin to fetch the title as you want it
  #23  
Old 18.11.2020, 10:29
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@pspzockerscene

It seems that your last update disabled the title fetching I last requested.
Please see the log:

Code:
18.11.20 09.16.21 <--> 18.11.20 09.15.44 jdlog://9615425302851/
  #24  
Old 18.11.2020, 13:36
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,522
Default

@nathan1: was my fault, wait for next update
  #25  
Old 18.11.2020, 16:47
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@Jiaz

ok,thanks
  #26  
Old 19.11.2020, 15:30
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@Jiaz & psp

I see that the plugin has problems decrypting links and fetching <title> names if the links are from file.karelia.ru.

If you test these URLs

**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**

you get these strange folder names:

Code:
923zjt
rvw36q
wn93q9
qgw44f
gv79df


LOG
Code:
19.11.20 14.03.23 <--> 19.11.20 14.31.10 jdlog://9645425302851/
  #27  
Old 19.11.2020, 16:40
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Updated our file.karelia.ru crawler to only set package names for folders with more than one item.



-psp-
  #28  
Old 21.11.2020, 04:47
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default host url missing from Copy Information (Action)

I think that there are some problems with the CopyToClipboard action via right-click -> context menu.

I added these additional tags to the Copy Information action:

Code:
{type};{name};{url};{url.container};{url.origin};{packagename};{url.referrer}
but JD2 doesn't copy the hoster's URL as well. Let me explain: I copy these links

**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**

Inside these pages are links like

**External links are only visible to Support Staff****External links are only visible to Support Staff**
**External links are only visible to Support Staff****External links are only visible to Support Staff**

but when I copy the information, all external hoster URLs such as yadi.sk or sampo.ru are missing from the Copy Information output.

Which additional tag do I need? These are insufficient:

Code:
{type};{name};{url};{url.container};{url.origin};{packagename};{url.referrer}
LOG
Code:
21.11.20 03.26.47 <--> 21.11.20 03.46.28 jdlog://4695425302851/

Last edited by Jiaz; 21.11.2020 at 12:25.
  #29  
Old 21.11.2020, 07:12
mgpai mgpai is offline
Script Master
 
Join Date: Sep 2013
Posts: 1,545
Default

The data returned by {url} will vary, depending on the url sort order in Settings > User Interface. Use {url.content} instead, to get the download url.
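As a mental model of how such a format string turns into one semicolon-separated line per link, consider the Python sketch below. It is not JDownloader code: the `render` helper, the dictionary keys and all URLs are made up for illustration, and the values for the `{url...}` tags simply stand in for whatever JDownloader knows about a given link.

```python
import re

# Matches placeholders such as {name} or {url.content}.
PLACEHOLDER = re.compile(r"\{([\w.]+)\}")


def render(template, link):
    """Replace every {tag} with the link's value; unknown tags become empty strings."""
    return PLACEHOLDER.sub(lambda m: link.get(m.group(1), ""), template)


# Hypothetical link record; real values come from JDownloader itself.
link = {
    "type": "Link",
    "name": "st.rar",
    "url.content": "https://hoster.example/file/st.rar",
    "url.container": "https://forum.example/index.php?showtopic=1",
    "packagename": "Some Album (2015)",
}

line = render("{type};{name};{url.content};{url.container};{packagename}", link)
print(line)
```

Using `{url.content}` in the format string pins the column to the download URL, independent of the address display order configured in the settings.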
  #30  
Old 21.11.2020, 15:56
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

Quote:
Originally Posted by mgpai View Post
The data returned by {url} will vary, depending on the url sort order in Settings > User Interface. Use {url.content} instead, to get the download url.
Thank you !
  #31  
Old 21.11.2020, 16:37
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,522
Default

@mgpai: Thanks for the fast and correct help
  #32  
Old 02.09.2022, 01:59
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@psp

I followed the tips from your last post about khinsider and applied them to metalarea:

Quote:
1. Go to Settings -> Advanced Settings -> Search for LinkCollector.dolinkcheck --> Disable this
2. Now add some- or all links.
You will notice that they will get added with a blue questionmark and without onlinestatus and with unknown filesize.
but it seems that the rate limit and disabling the LinkCollector check are not sufficient. For example, I copied about 500 metalarea URLs, but JDownloader still crawls only a limited number of them (about 185).
Note: the clipboard observer is enabled.

LOG
Code:
01.09.22 23.15.02 <--> 01.09.22 23.52.12 jdlog://7369211370661/
I noticed that even though I have disabled the LinkCollector check, JD still does a check, even if partial, of the links; for example, it reports a high number of offline files.
In theory, shouldn't it show me only the package name (which is great) without having to check any links in it?

Why does it check whether the files are offline even after disabling the LinkCollector check? I'm only interested in getting the package names, which it does correctly; I can check the online status later, manually or when I download them.
  #33  
Old 02.09.2022, 13:28
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Quote:
Originally Posted by nathan1 View Post
but seems that rate limit and disabling LinkCollector check is not sufficient.
In the case of metalarea it's not even clear if they have any kind of rate-limit...

Quote:
Originally Posted by nathan1 View Post
For example I copy about 500 metalarea urls but JDownloader still crawls a limitated number of urls (about 185)
Please provide the following information for us to check:
- All of those 500 URLs
- Your metalarea username + password

Quote:
Originally Posted by nathan1 View Post
I noticed that even if I have disabled the linkcollector check, JD does a check, even if partial, of the links, for example it gives me a high number of offline files.
That's because some URLs need to be crawled first as they could return multiple URLs.
This can't be turned off as JD would simply do nothing then.
Also, some crawlers will do the linkcheck right away because they've already accessed the URL, so yes, even with linkcheck disabled, some items will be displayed as online/offline with filename/filesize information set.

Quote:
Originally Posted by nathan1 View Post
In theory it shouldn't just show me the package name only (and this is great), but without having to check any links in it?
No.

Quote:
Originally Posted by nathan1 View Post
Why check me anyway if the files are offline even after disabling the linkcollector check? I'm only interested in coming up with the name of the packages, which it does correctly and only later can I check their status online - manually or when I download them.
I'm sorry but without having your testlinks I'm unable to tell what exactly happened but it doesn't look like a bug.
I can imagine that some of your metalarea links go to a 404 error-page which the crawler will detect and display them as offline.
This is not a real "online check" but as explained the crawler needs to access the added link anyways so that is not a bug.

Also, a package in JDownloader cannot be empty; it needs to contain at least one item.
  #34  
Old 02.09.2022, 14:46
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@psp

ok, the problem is that it does not detect many links: if you copy and paste all 500 links in bulk, JDownloader at some point freezes and "runs idle" for 1-2 minutes and in any case does not scan all the links. Instead, if you copy and paste 10-20 links it scans them and detects all 20.

What I don't understand is why the scan depends on the number of links. Maybe there is some setting or timer that could be adjusted?

I sent you the credentials + the 500 metalarea URLs.
  #35  
Old 02.09.2022, 15:36
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Quote:
Originally Posted by nathan1 View Post
ok, the problem is that it does not detect many links: if you copy and paste all 500 links in bulk, JDownloader at some point freezes and "runs idle" for 1-2 minutes and in any case does not scan all the links.
I will take a look at it but as a poweruser you will always run into some kind of limits.
If you're frequently running into rate-limits, consider adding links with a delay of X seconds.
This should be easily possible using external scripts or even an EventScripter script:
https://support.jdownloader.org/Know...event-scripter
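As a sketch of the "external script" variant, the Python snippet below splits a long URL list into small batches and pauses between them. How a batch is actually handed to JDownloader is left open: `add_links` is a hypothetical callback (it could copy the batch to the clipboard, write a file, and so on), and the batch size of 20 and the delay are arbitrary example values, not recommendations from the developers.

```python
import time


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def add_with_delay(urls, add_links, batch_size=20, delay_seconds=5.0, sleep=time.sleep):
    """Pass the URLs to add_links batch by batch, sleeping between batches.

    Returns the number of batches that were handed over.
    """
    chunks = list(batches(urls, batch_size))
    for index, batch in enumerate(chunks):
        add_links(batch)
        # No pause is needed after the final batch.
        if index < len(chunks) - 1:
            sleep(delay_seconds)
    return len(chunks)


if __name__ == "__main__":
    # Placeholder URLs; replace with the real list.
    demo_urls = [f"https://forum.example/index.php?showtopic={n}" for n in range(45)]
    count = add_with_delay(
        demo_urls,
        add_links=lambda batch: print(f"adding {len(batch)} links"),
        delay_seconds=0.1,
    )
    print(count, "batches added")
```

The `sleep` parameter exists only so the delay can be swapped out in a dry run; in normal use the default `time.sleep` is fine.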

Quote:
Originally Posted by nathan1 View Post
What I don't understand is why the scan is conditioned by the number of links. Maybe there is some setting or timer to improve?
I'm sorry, but I didn't understand either of those two sentences.

Quote:
Originally Posted by nathan1 View Post
I sent you credentials + 500 urls metalarea links
I'll check it...
  #36  
Old 02.09.2022, 15:58
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

It was indeed a rate-limit.
Metalarea will return http error-code 400 when that limit is reached.
For the next update I've added some measures to try to prevent running into the rate-limit.
These measures include:
- Wait 1000ms between requests
- Limit max simultaneous crawler instances for metalarea to 1
- Return dummy URLs for retrying in case rate-limit is hit
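The three measures listed above can be pictured with the following Python sketch. It is not the plugin's code: `fetch` is a hypothetical function returning `(status, body)`, the retry limit is an invented parameter, and re-queuing a URL here merely stands in for the "dummy URLs for retrying" idea. Only the 1000 ms interval, the single sequential worker and the HTTP 400 signal are taken from the post.

```python
import time
from collections import deque

RATE_LIMIT_STATUS = 400  # the status metalarea returns when the limit is hit, per the post


def crawl_sequentially(urls, fetch, interval_seconds=1.0, max_attempts=3, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between requests and re-queuing rate-limited ones.

    Returns (pages, failed): pages maps URL to body, failed lists URLs that
    were still rate-limited after max_attempts tries.
    """
    queue = deque((url, 1) for url in urls)
    pages, failed = {}, []
    first_request = True
    while queue:
        url, attempt = queue.popleft()
        if not first_request:
            sleep(interval_seconds)  # wait between consecutive requests
        first_request = False
        status, body = fetch(url)
        if status == RATE_LIMIT_STATUS:
            if attempt < max_attempts:
                queue.append((url, attempt + 1))  # try again later, at the end of the queue
            else:
                failed.append(url)
            continue
        # Any other status is treated as a usable result in this simplified sketch.
        pages[url] = body
    return pages, failed
```

Because there is a single loop and no threads, at most one request is in flight at any time, which corresponds to limiting the crawler to one instance.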

Please wait for the next CORE-Update!


-psp-
  #37  
Old 02.09.2022, 18:29
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@psp
Thank you for this update
  #38  
Old 03.09.2022, 17:38
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

It seems this is neither a classic rate limit nor a sophisticated one:
They're simply blocking your current session after X (about 250) requests, no matter how fast these requests are performed.

I'm still working on it, but I think I should be able to add auto handling and remove the requestInterval so the crawler can crawl at full speed.
  #39  
Old 05.09.2022, 12:30
pspzockerscene's Avatar
pspzockerscene pspzockerscene is online now
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,086
Default

Done. Rate-limit for metalarea links shouldn't be a problem anymore after the next set of plugin updates.



-psp-
  #40  
Old 05.09.2022, 21:47
nathan1 nathan1 is offline
JD VIP
 
Join Date: Apr 2012
Posts: 394
Default

@psp

Thank you very much !
All times are GMT +2.