JDownloader Community - Appwork GmbH
 

  #1  
Old 27.02.2011, 15:58
remi
Guest
 
Posts: n/a
Handling of anti-captcha methods

Let's first try to understand this remarkable post in the reCaptcha thread.

Quote:
Originally Posted by pspzockerscene View Post
@remi
No we do officially support captcha-recognitions but the settings were removed more than half a year ago because we thought no one would use them.
Does this mean that the jD Team officially supports non-official add-ons such as these anti-captcha methods (the 'Spanish' automatic recognition, the decaptcher, CT and any others)?

I must admit I've never seen a function in jD for choosing the method for captcha handling on a per host basis. If it existed why was it removed?

Quote:
Originally Posted by pspzockerscene View Post
Also we haven't received many complaints yet.
Complaints about the removal of the settings I wrote about? Or which settings are you talking about?

I would like to propose the following overall design. If an internal method exists and works, jD should use it; otherwise it should search for an external method using the following priorities:

1) first the automatic anti-captcha methods,

2) then the non-paying or mutual anti-captcha solving methods,

3) and finally the paying methods.

If the customer enables manual captcha recognition, all the other methods are overridden. Note that the term "disable automatic captcha recognition" would be useless, since CT also has a mutual captcha solving method that just shifts manual captcha solving in time. The cost of that shift is that customers have to solve many more captchas than in normal manual mode.

Note that I still make the distinction between internal and external, because I'm still thinking that the jD Team makes a distinction between these two types.
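
A minimal Python sketch of the priority scheme proposed above. Everything in it is an assumption made for illustration: the category constants, the `Method` record, the `works` flag and `pick_method` are hypothetical names, not part of jD, and the sketch returns None when no method is registered for a host because the proposal does not say what should happen then.

```python
from dataclasses import dataclass

# Hypothetical category ranks: a lower number is tried first.
INTERNAL, AUTOMATIC, MUTUAL, PAYING = range(4)


@dataclass(frozen=True)
class Method:
    name: str
    category: int
    hosts: frozenset
    works: bool = True  # assumed to be known; jD exposes no such flag


def pick_method(host, methods, manual_enabled=False):
    """Return the name of the method to use for `host`, or None if none applies."""
    if manual_enabled:
        return "manual"  # manual recognition overrides every other method
    candidates = [m for m in methods if host in m.hosts and m.works]
    if not candidates:
        return None
    # Internal first, then automatic, then mutual, then paying.
    return min(candidates, key=lambda m: m.category).name
```

A broken internal method is simply filtered out, so the search falls through to the external methods in priority order.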
Reply With Quote
  #2  
Old 27.02.2011, 16:49
pspzockerscene's Avatar
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,103
Default

@remi
It's easy.
Our captcha system allows the use of external, "homemade" or "unofficial" captcha recognition methods.
If a method is found for a hoster it will be used, no matter whether or how well it works!

GreeZ pspzockerscene
__________________
JD Supporter, Plugin Dev. & Community Manager

Erste Schritte & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?
Reply With Quote
  #3  
Old 28.02.2011, 01:14
drbits's Avatar
drbits drbits is offline
JD English Support (inactive)
 
Join Date: Sep 2009
Location: Physically in Los Angeles, CA, USA
Posts: 4,434
Default

This is partly a matter of JD being open source. You can write any plugins you want, but you are responsible for supporting them, not the JD staff.

Each JAC (Java Anti-Captcha) plugin has a file listing the hosts to use it with. You can control what method is used to provide JAC answers for a site.

There is currently no way for JDownloader to deal with multiple methods for the same host.
__________________
Please, in each Forum, Read the Rules!.Helpful Links. Read before posting.
Reply With Quote
  #4  
Old 28.02.2011, 11:32
remi
Guest
 
Posts: n/a
Cool

OK, I see we're again singing the same note now.

I think a property for the JAC method based on the classification I made in my first post would help. The customer can then set the priorities when different methods for the same host would clash.

Managing the clashing methods by editing the .xml files is not a good option for most customers.
Reply With Quote
  #5  
Old 28.02.2011, 14:32
pspzockerscene's Avatar
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,103
Default

Quote:
Originally Posted by drbits View Post
There is currently no way for JDownloader to deal with multiple methods for the same host.
There IS a way.
Usually a hoster plugin asks for a captcha and gets it from the method whose jacinfo contains the name of that hoster plugin, BUT hoster plugins can also request captchas directly from captcha methods.

GreeZ pspzockerscene
Reply With Quote
  #6  
Old 28.02.2011, 14:49
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,532
Default

I'm sorry, psp, but JD cannot handle multiple methods for a single plugin.
__________________
JD-Dev & Server-Admin
Reply With Quote
  #7  
Old 28.02.2011, 15:10
pspzockerscene's Avatar
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,103
Default

@Jiaz
But plugins can request captchas directly from methods.
I have never tried it, but I know that it has been designed to also work this way^^
I'm sorry for giving wrong information.

GreeZ pspzockerscene
Reply With Quote
  #8  
Old 28.02.2011, 15:34
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,532
Default

yes but that has nothing to do with using several methods^^
Reply With Quote
  #9  
Old 28.02.2011, 15:49
pspzockerscene's Avatar
pspzockerscene pspzockerscene is offline
Community Manager
 
Join Date: Mar 2009
Location: Deutschland
Posts: 71,103
Default

So do we need a ticket for this ?
There aren't many services using different captchas and for those we can use my method above ?!

GreeZ pspzockerscene
EDIT

Actually not many users are using other methods than the ones that come with JD so i don't know if it's worth adding new settings.
Reply With Quote
  #10  
Old 07.03.2011, 13:18
gab1545
Guest
 
Posts: n/a
Default

I have to say that as a beginner this is at best a little confusing! If one could nominate one method over another in the settings somewhere this would be very useful.
Reply With Quote
  #11  
Old 24.03.2011, 14:37
celerondude
Guest
 
Posts: n/a
Default

Hi, I'd love a setting to choose which captcha-method to use, too.

I'm using the CaptchaTrader method at night or when I'm not at the computer for some other reason, but during the day I'm absolutely fine solving those captchas by hand.
An option to quickly change between manual and the external method would be great!


At the moment, I fiddle with "jacinfo.xml" each time, commenting all the hosters out and restarting jDownloader afterwards, but that's not a good solution.
Reply With Quote
  #12  
Old 25.03.2011, 11:05
remi
Guest
 
Posts: n/a
Cool

Quote:
Originally Posted by celerondude View Post
I'm each time fiddling in "jacinfo.xml" commenting all the hosters out
You don't need to change the contents of that file. I think just renaming that file might be sufficient.
Reply With Quote
  #13  
Old 25.03.2011, 16:07
celerondude
Guest
 
Posts: n/a
Default

Yes, you're right. I even figured out that you don't have to restart JDownloader for the change to take effect.
It's sufficient to rename "jacinfo.xml" when you don't want to use the method and to restore its name when you want to use it again.

I therefore wrote a small batch script renaming the file back and forth to quickly toggle the CaptchaTrader-method on and off.
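
For readers who want something similar, here is a minimal Python sketch of such a toggle. It is not the batch script mentioned above; the function name `toggle_method`, the `.disabled` suffix and the idea of passing the method's directory as an argument are assumptions made for illustration.

```python
from pathlib import Path


def toggle_method(method_dir, filename="jacinfo.xml", suffix=".disabled"):
    """Rename jacinfo.xml out of the way, or restore it.

    Returns True if the method is enabled after the call, False otherwise.
    """
    active = Path(method_dir) / filename
    parked = Path(method_dir) / (filename + suffix)
    if active.exists():
        active.rename(parked)   # method is now hidden from JD
        return False
    if parked.exists():
        parked.rename(active)   # method is visible again
        return True
    raise FileNotFoundError(f"no {filename} or {filename}{suffix} in {method_dir}")
```

Call it with the directory of the external method you want to switch; each call flips the state.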

Nonetheless, it would be nice to have some possibilities to control the usage of captcha methods from inside JD, e.g.
  • Switch specific methods on/off.
  • Switch all external methods on/off.
  • Control the priority of methods, e.g. internal methods have higher priority than external methods, so if there's an internal and an external method use the internal.
  • Explicitly select which method is used for a specific hoster.
Reply With Quote
  #14  
Old 30.03.2011, 08:32
danutz
Guest
 
Posts: n/a
Default

So is there any rule as to the order in which anti-captcha methods are tried, if multiple methods are available for a host -- hopefully it's not random? Alphabetical order of method names at least ?
Reply With Quote
  #15  
Old 30.03.2011, 10:53
remi
Guest
 
Posts: n/a
Default

If you're a java programmer you might have a look at the source code. It's an open source project.
Reply With Quote
  #16  
Old 30.03.2011, 11:47
Jiaz's Avatar
Jiaz Jiaz is offline
JD Manager
 
Join Date: Mar 2009
Location: Germany
Posts: 79,532
Default

There is currently no support for multiple methods for a captcha, and at the moment we don't have plans/time to change this.
Reply With Quote
  #17  
Old 30.03.2011, 13:18
danutz
Guest
 
Posts: n/a
Default

Yeah, I have the code and will soon start on it, time permitting. This will probably be somewhere deep in the guts of JD though.

But it has dawned on me that method preference could be implemented outside JD. We can replace all external anti-recaptcha methods with a single "dispatcher" method/plugin (with an appropriate JAC XML).

That "dispatcher" (a Python, C++ or whatever program) can implement logic to decide which "real" method to call. As it fails to solve the captcha, JD will call it repeatedly. It can save its status in some file and decide what to do. For example, it can decide to try the Spanish anti-captcha 10 times, then cough up some CaptchaTrader credits, then (if no credits are left) pop up a dialog of its own (outside JD!).

The only problem that I foresee is that JD often disables a link with bogus "Plugin out of date" errors. This needs to be worked around in JD -- we need something like "always reset plugin-out-of-date errors".
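
A rough Python sketch of the state-file idea, under loud assumptions: the function name `choose_method`, the JSON layout of the state file and the `ct_credits` parameter are all hypothetical, and the counter is global because the dispatcher is assumed not to know which link a captcha belongs to. It only decides which "real" method to hand the captcha to; calling that method is left out.

```python
import json
from pathlib import Path


def choose_method(state_file, spanish_tries=10, ct_credits=0):
    """Pick a method for this call and remember how often we were called."""
    path = Path(state_file)
    calls = json.loads(path.read_text())["calls"] if path.exists() else 0
    path.write_text(json.dumps({"calls": calls + 1}))  # persist for the next call
    if calls < spanish_tries:
        return "spanish"          # free automatic recognition first
    if ct_credits > 0:
        return "captchatrader"    # then spend credits, if there are any
    return "manual"               # last resort: ask the user outside JD
```

Something would also have to reset the counter (for example by deleting the state file) once a captcha is solved; how the dispatcher would learn about success is not covered here.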
Reply With Quote
  #18  
Old 30.03.2011, 13:38
remi
Guest
 
Posts: n/a
Default

Yes, why not.

If I were you I would also use the Spanish anti-captcha method for obtaining captchatrader credits and call it the Jiaznutz approach to captchas.
Reply With Quote
  #19  
Old 30.03.2011, 13:46
danutz
Guest
 
Posts: n/a
Default

Maybe humor is not my strong point today, remi. What do you mean?

What you proposed (as a joke?) actually does make some sense, because it would consolidate both the gathering and spending of credits: gather via CT (both automatically and manually) and spend via JD (only via the CT plugin). I'm not sure if CT would be happy about such a poor captcha-solver -- they might ban the account.

But I'm feeling that you ACTUALLY meant to make fun of my scheme as being overly complicated, and a poor man's solution for something that JD should handle itself?
Reply With Quote
  #20  
Old 30.03.2011, 14:14
remi
Guest
 
Posts: n/a
Cool

I take your suggestions very seriously and they show very clearly that you're a smart guy.

No, the first part of my sentence is meant to be taken seriously and the second part is meant to be funny. I'm sorry for the confusion. I would rather call it the Spanish Jiaznutz approach.

You're right that CT accounts could be banned because of too many bad recognitions, but if the recognition attempts are good enough, the CT credits could be used in cases like fileserve that allows only 3 tries or in cases where the automatic recognitions don't work at all.

In my post #15 I gave an answer to your question about what jD is actually doing. Does it overwrite successive methods when it's parsing the xml files or does it simply take the first method it finds for a given host?

Note that the Spanish method might only recognise reCaptcha and no other types of captchas. I've never used CT nor the Spanish anti-captcha methods. Is it possible to be selective as to what types of captchas you want to solve on CT and can the different captcha types be differentiated by the Spanish method?
Reply With Quote
  #21  
Old 01.04.2011, 18:03
danutz
Guest
 
Posts: n/a
Default

I've finished the 'dispatcher' plugin... It manages CaptchaTrader, Spanish and manual recognition. The logic is configurable. Unpack the attached ZIP and read the README.txt.

Since JD ONLY gives plugins a captcha image, it's impossible to implement any logic for repetitions (e.g. try Spanish twice, then fork out some CT credits). JD needs to pass external plugins the link for which the captcha was generated (or at least the hoster). Then, more things will become possible.

Maybe this should go to the user scripts section...
Attached Files
File Type: zip jacdispatcher.zip (4.8 KB, 565 views)
Reply With Quote
  #22  
Old 02.04.2011, 09:10
danutz
Guest
 
Posts: n/a
Default

Tested overnight and this morning in the 2-dispatcher configuration (one for fileserve with CT, one for everything else with Spanish). It works perfectly. However JD has some problems:

* bogus plugin out-of-date errors block some links after several failed captchas

* sometimes a hoster completely stops, and I have to manually "Force download". The log shows a thread attempting a download, then immediately stopping with "Download quota has been reached". That link is then never retried, nor are others from the same hoster picked up.
Reply With Quote
  #23  
Old 02.04.2011, 09:48
drbits's Avatar
drbits drbits is offline
JD English Support (inactive)
 
Join Date: Sep 2009
Location: Physically in Los Angeles, CA, USA
Posts: 4,434
Default

The request for supplying the host and the results to the Captcha plugins is already in the BugTracker. These will be necessary for proper operation of a complete Event Manager.
Reply With Quote
  #24  
Old 02.04.2011, 12:14
remi
Guest
 
Posts: n/a
Default

@danutz

Thanks for your contribution. I hope many people will be able to take advantage of it.

If you have problems with jD itself, please create bug reports with example links and/or logs in the proper forums.
Reply With Quote
  #25  
Old 02.04.2011, 12:56
danutz
Guest
 
Posts: n/a
Default

Well, the "plugin out of date" error has been around forever. I've seen tens of updates with the title "restart on plugin out of date errors" go by, yet it still happens (e.g. on hotfile). Manually resetting the link solves it, but that is annoying and requires constant monitoring.

What is the problem? I can't seem to find a relevant thread, yet I'm sure it must have been discussed before.
Reply With Quote
  #26  
Old 02.04.2011, 21:58
drbits's Avatar
drbits drbits is offline
JD English Support (inactive)
 
Join Date: Sep 2009
Location: Physically in Los Angeles, CA, USA
Posts: 4,434
Default

These fixes are in the bugtracker. However, they require changes in the core of JDownloader (status standardisation, control engine update, fixes to the plugin interfaces, and possibly the event manager). This means they will take a while to fix and different parts will be fixed in different major releases.

"Plugin Error (out of date)" is generated by the host plugins whenever the response from a host is not exactly what is expected. This will be improved in the next major release (as demonstrated in the Nightly Test version).

Whenever you get a Plugin Error message, restart JDownloader and reset the links with the plugin error.

There are problems with the update system (it works, but not as one would expect). The way to be sure that JDownloader is fully updated is to start the program with JDupdate.jar instead of jdownloader.exe.

Currently, JDownloader checks for all updates when it starts, but does not ask you to restart the program unless there is an update to a part of the program that is not a plugin (these are considered minor patches). The "Check for Updates" icon does not check for plugin updates.


Last edited by drbits; 02.04.2011 at 22:02.
Reply With Quote
  #27  
Old 03.04.2011, 09:08
danutz
Guest
 
Posts: n/a
Default

You seem to think that "out of date errors" are always solved by updates, and if only the update mechanism worked better, all would be well.

My experience is that these are most often transient errors, that simply retrying the download solves the problem 95% of the time, and that a JD update doesn't even come in some cases (i.e. it's a freak incident not an error that many people see). It's possible for a hung HTTP GET to cause an "unexpected response", isn't it?

So, I think that instead of being fatal, plugin out of date errors should be retried immediately. Maybe after 5 successive out-of-dates JD should wait for 10 minutes. If, for some reason, you don't want to make this the default behavior, at least make it an option.
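
As a tiny Python sketch of that retry policy: retry at once, but pause after every fifth consecutive out-of-date error. The function name `out_of_date_delay` is made up, and repeating the pause after every further block of five errors is an assumption, since the post only describes the first pause.

```python
def out_of_date_delay(consecutive_errors, burst=5, pause_minutes=10):
    """Seconds to wait before retrying after `consecutive_errors` failures in a row."""
    if consecutive_errors > 0 and consecutive_errors % burst == 0:
        return pause_minutes * 60   # back off after each full burst of errors
    return 0                        # otherwise retry immediately
```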

Even without automated recaptchas, the manual intervention required for out-of-dates was quite annoying. Given CaptchaTrader and the Spanish OCR, it becomes much more than that -- a roadblock.
Reply With Quote
  #28  
Old 03.04.2011, 14:11
remi
Guest
 
Posts: n/a
Cool

Quote:
Originally Posted by danutz View Post
You seem to think that "out of date errors" are always solved by updates, and if only the update mechanism worked better, all would be well.

My experience is that these are most often transient errors, that simply retrying the download solves the problem 95% of the time...
I think drbits wrote something different :-

Quote:
Originally Posted by drbits View Post
... they require changes in the core of JDownloader (status standardisation, control engine update, fixes to the plugin interfaces, and possibly the event manager). This means they will take a while to fix and different parts will be fixed in different major releases.
I think drbits understands this issue very well. Evidence of this is the discussion we had about "in error states" in 2009. See the Status Diagram and its surrounding posts.
Reply With Quote
  #29  
Old 03.04.2011, 15:26
danutz
Guest
 
Posts: n/a
Default

@remi: thanks, I will need to read on that (that graph looks confusing -- can we use Graphviz or something to make it more readable?).

I was mostly responding to this bit of advice (and the subsequent explanations about updates):

Quote:
Originally Posted by drbits View Post
Whenever you get a Plugin Error message, restart JDownloader and reset the links with the plugin error.
These are clearly not applicable to the full-automation scenario, so I thought I'd explain the problem as clearly as possible and propose the most obvious fix. Because, if the plugin out-of-date and "hoster hang" were fixed, it would take out some 30% of the work of downloading.

[ If we could also implement your suggestion regarding feeding "Spanish OCR" solutions to Captcha Trader, and if CT didn't block it, we would reach download heaven -- but that's another story. ]
Reply With Quote
  #30  
Old 04.04.2011, 11:54
remi
Guest
 
Posts: n/a
Default

When I made that and other diagrams, I just looked for a simple (free) state/object diagramming tool and I took the first I found. I've never seen coloured lines in those UML diagrams. If you don't understand my diagram please discuss in that thread.

As far as I can see Graphviz is graph visualization software, not a diagramming tool.

Quote:
Originally Posted by drbits View Post
Whenever you get a Plugin Error message, restart JDownloader and reset the links with the plugin error.
IMO it's overkill to restart jD, but wait for drbits' explanation.
Reply With Quote
  #31  
Old 04.04.2011, 13:40
danutz
Guest
 
Posts: n/a
Default

Quote:
Originally Posted by remi View Post
IMO it's overkill to restart jD, but wait for drbits' explanation.
Restarting is not in itself the problem. Anything that can't be automated is a problem -- having to press the "update" button, having to reset a link etc. Fatal errors kill automation. The only solution is to make plugin out-of-date errors non-fatal.

In my opinion the entire approach for handling "unexpected responses" is weak. You can't hope to keep up with every possible variation of error page that hosters throw at you. It is a losing battle. One should only strive to handle "normal" workflows.

I would propose the following logic for JD:

1. replace "Plugin out-of-date" errors with "unexpected response" errors; make these non-fatal: retry them 5 times in a row, then once every 60 minutes only.

2. monitor the available JD updates constantly

3. whenever JD sees an update for hoster H, it should disable all links from H that are waiting to be retried due to "unexpected responses" (as per rule 1)

4. whenever no downloads are active AND updates are queued up, JD should restart itself, with no intervention required.

5. after the restart, re-enable the links that were disabled by rule (3) above

Rule (1) balances the chance that an "unexpected response" might just be a transient incident, with the danger of hammering a hoster that really has changed its interface.

Rules (2) and (3) ensure that un-upgraded JDs don't continually retry links on hosters that are known to have changed.

Rules (4) and (5) ensure that JD can be trusted to do its job even if the owner goes away for a 2-week trip.
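
The rules above can be sketched in Python as follows. This is only an illustration: the `Link` record and all function names are hypothetical and not taken from the JD source, and rule (2), the constant monitoring of updates, is assumed to happen elsewhere and to call `on_update_seen`.

```python
from dataclasses import dataclass


@dataclass
class Link:
    url: str
    hoster: str
    retries: int = 0                  # retries already made after an "unexpected response"
    waiting_retry: bool = False       # True while the link waits for its next retry
    disabled_for_update: bool = False


def retry_delay_seconds(link, burst=5, slow_minutes=60):
    """Rule 1: five retries in a row, then one retry every 60 minutes."""
    return 0 if link.retries < burst else slow_minutes * 60


def on_update_seen(links, hoster):
    """Rule 3: park the links of `hoster` that are waiting to be retried."""
    for link in links:
        if link.hoster == hoster and link.waiting_retry:
            link.disabled_for_update = True


def should_restart(active_downloads, queued_updates):
    """Rule 4: restart only when nothing is downloading and updates are queued."""
    return active_downloads == 0 and queued_updates > 0


def after_restart(links):
    """Rule 5: re-enable the links parked by rule 3."""
    for link in links:
        link.disabled_for_update = False
```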
Reply With Quote
  #32  
Old 04.04.2011, 14:12
remi
Guest
 
Posts: n/a
Default

I prefer to continue this in the proper "Suggested New Download control & UI" thread as this discussion is no longer relevant for anti-captcha handling.
Reply With Quote
  #33  
Old 04.04.2011, 14:34
danutz
Guest
 
Posts: n/a
Default

Sure, this thread is getting too fat (though control & ui *are* relevant to what an anti-captcha plugin can possibly do)
Reply With Quote
  #34  
Old 06.05.2011, 21:59
gfz87
Guest
 
Posts: n/a
Try different methods for captchas

Hi, is there a way to implement a function in JDownloader where one can set an anti-captcha method that runs on X server if the default one was not successful a configurable number of times? For example, try some default method three times, and if it was not successful, try another one. Greetings

Last edited by gfz87; 07.05.2011 at 00:27.
Reply With Quote
  #35  
Old 07.05.2011, 09:35
remi
Guest
 
Posts: n/a
Default

I like your suggestion but it's going one step further than "Handling of anti-captcha methods", which isn't supported yet.

If the captcha handling controller is built in a flexible way, this should be possible, but don't expect it soon.
Reply With Quote
  #36  
Old 16.06.2011, 13:52
lovelove
Guest
 
Posts: n/a
Default

Quote:
Originally Posted by gab1545 View Post
If one could nominate one method over another in the settings somewhere this would be very useful.
Quote:
Originally Posted by remi View Post
I would like to propose the following overall design. If an internal method exists and it works, jD should use this method, otherwise it would search for an external method using the following priorities :-

1) first the automatic anti-captcha methods,

2) then the non-paying or mutual anti-captcha solving methods,

3) and finally the paying methods.

If the customer would enable manual captcha recognition then all the other methods would be overridden. Note that the term "disable automatic captcha recognition" would be useless, since CT also has a mutual captcha solving method that just shifts manual captcha solving in time. The cost for that shift is that customers have to solve many more captchas than in normal manual mode.
Quote:
Originally Posted by celerondude View Post
Hi, I'd love a setting to choose which captcha-method to use, too.
Quote:
Originally Posted by celerondude View Post
it would be nice to have some possibilities to control the usage of captcha methods from inside JD, e.g.
  • Switch specific methods on/off.
  • Switch all external methods on/off.
  • Control the priority of methods, e.g. internal methods have higher priority than external methods, so if there's an internal and an external method use the internal.
  • Explicitly select which method is used for a specific hoster.
+1 from me.

My suggestion would be that captcha methods are displayed to the user in a rearrangeable list (with up/down buttons to change each method's position and thus its priority), with a checkbox for each method to enable/disable it. The more flexible, the better.
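
The data behind such a list could be as simple as the following Python sketch. `MethodEntry`, `move_up`, `move_down` and `active_order` are hypothetical names chosen for illustration; nothing like this exists in JD as far as this thread shows.

```python
from dataclasses import dataclass


@dataclass
class MethodEntry:
    name: str
    enabled: bool = True   # the checkbox next to the method


def move_up(entries, index):
    """'Up' button: swap the entry with its predecessor, if it has one."""
    if 0 < index < len(entries):
        entries[index - 1], entries[index] = entries[index], entries[index - 1]


def move_down(entries, index):
    """'Down' button: swap the entry with its successor, if it has one."""
    if 0 <= index < len(entries) - 1:
        entries[index + 1], entries[index] = entries[index], entries[index + 1]


def active_order(entries):
    """Names of the enabled methods, highest priority first."""
    return [e.name for e in entries if e.enabled]
```

The position in the list is the priority, so no separate priority number has to be stored.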

"Selecting which method should be used for which hoster" (as quoted above) would be desirable, too, but how would this work in practice?
Reply With Quote
  #37  
Old 26.06.2011, 11:51
anarchintosh
Guest
 
Posts: n/a
Default

If I can chip in here, it would also be good for the captcha 'dispatcher', or whatever new captcha handler is implemented, to support Reconnect / IP renewals. This is admittedly not ideal, but it could be used to get multiple retries for Spanish OCR etc. on a host like fileserve (i.e. try 3 captchas, renew the IP, try 3 more, with a setting that lets you specify how many times to repeat this).
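
A minimal Python sketch of that loop, assuming two hypothetical callbacks: `try_captcha()` returns True when a captcha attempt succeeds, and `reconnect()` renews the IP. Neither corresponds to a real JD function.

```python
def solve_with_reconnects(try_captcha, reconnect, tries_per_ip=3, max_rounds=2):
    """Try `tries_per_ip` captchas per IP, for at most `max_rounds` IPs."""
    for round_no in range(max_rounds):
        for _ in range(tries_per_ip):
            if try_captcha():
                return True
        if round_no < max_rounds - 1:
            reconnect()   # renew the IP before the next round of attempts
    return False          # give up after the last round
```

With the defaults this makes at most six attempts and one reconnect; `max_rounds` is the "repeat X times" setting.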
Reply With Quote