JDownloader Community - Appwork GmbH
 

#1
10.02.2012, 20:28
mluser
Guest
Motherless.com missing pages from gallery

There's a problem with the way JD processes the list of pages from motherless.com.
The decrypter only reads the page links shown in the pager of a single page. This means it works when the pager at the end of a gallery looks like this:

« PREV 1 2 3 4 5 6 NEXT »

but it fails on a longer gallery:

« PREV 1 2 3 4 5 6 7 8 9 10 11 ... 23 24 25 NEXT »

(the ... is added by motherless)

In this case, the JD log shows that it detected 14 pages (1 to 11 and 23 to 25), skipping 12 to 22.

The problem is in jd\plugins\decrypter\MotherLessCom.java

Code:
ArrayList<String> pages = new ArrayList<String>();
pages.add("currentPage");
String pagenumbers[] = br.getRegex("page=(\\d+)\"").getColumn(0);
if (!(pagenumbers == null) && !(pagenumbers.length == 0)) {
    for (String aPageNumber : pagenumbers) {
        if (!pages.contains(aPageNumber) && !aPageNumber.equals("1")) pages.add(aPageNumber);
    }
}
logger.info("Found " + pages.size() + " pages, decrypting now...");
progress.setRange(pages.size());
for (String getthepage : pages) {
    if (!getthepage.equals("currentPage")) br.getPage(param + "?page=" + getthepage);
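
One possible way around the gap, as a rough sketch only: instead of collecting the page numbers that happen to be printed in the pager, take the highest one and walk every page from 1 up to it. The class and method names below (PagerSketch, maxPage, pageUrls) are made up for illustration and are not part of the plugin; only the regex is taken from the snippet above, and it assumes the last page number always appears in the pager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the JDownloader source. */
final class PagerSketch {
    // Same pattern as the quoted snippet: page=<digits> followed by a double quote.
    private static final Pattern PAGE = Pattern.compile("page=(\\d+)\"");

    /** Highest page number linked from the pager, or 1 if no pager links are found. */
    static int maxPage(String html) {
        int max = 1;
        Matcher m = PAGE.matcher(html);
        while (m.find()) {
            max = Math.max(max, Integer.parseInt(m.group(1)));
        }
        return max;
    }

    /** URLs for every page from 1 to the highest page, including the elided ones. */
    static List<String> pageUrls(String baseUrl, String html) {
        List<String> urls = new ArrayList<>();
        int last = maxPage(html);
        for (int page = 1; page <= last; page++) {
            // Page 1 is the gallery URL itself, as in the original snippet.
            urls.add(page == 1 ? baseUrl : baseUrl + "?page=" + page);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Pager with a gap, similar to "1 2 3 ... 24 25".
        String html = "<a href=\"?page=2\">2</a><a href=\"?page=3\">3</a> ... "
                + "<a href=\"?page=24\">24</a><a href=\"?page=25\">25</a>";
        System.out.println(pageUrls("http://example.com/G1234567", html).size());
    }
}
```

With the sample pager in main, pages 4 to 23 are never printed in the HTML but are still walked, because only the maximum matters.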

Last edited by Jiaz; 13.02.2012 at 18:54.
#2
11.02.2012, 03:36
raztoki
English Supporter
Join Date: Apr 2010
Location: Australia
Posts: 17,649

Can you please provide links that I can use to test and fix the problem?
kthx
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]

Last edited by raztoki; 11.02.2012 at 03:41.
#3
11.02.2012, 04:40
mluser
Guest

Quote:
Originally Posted by raztoki
Can you please provide links that I can use to test and fix the problem?
kthx

Sure, here's one that works as expected:
**External links are only visible to Support Staff**

and here's one that fails:
**External links are only visible to Support Staff**

Keep in mind it's a porn site...
#4
11.02.2012, 05:18
raztoki
English Supporter

Cheers, fixed. Please wait for the next round of updates.
#5
13.02.2012, 18:23
mluser
Guest

Hi, the new version doesn't seem to fix the issue. It only adds links from the first page. The logs now show that it properly detects and "walks" all the pages, but it only adds links from the first one (maybe the ?page= parameter in the URL is missing?).
#6
13.02.2012, 18:42
djmakinera
Banned
Join Date: May 2010
Location: Poland
Posts: 8,387

mluser - Hehe, I checked it myself. The server is very slow (for me).
Only direct links to a photo or photos work.
Page=1, 2, ... does not work. I only checked the "Groups" section.
------------------------ Thread: 112 -----------------------
112 13.02.12 18:38:11 - WARNING [java_downloader] -> No supported links found -> search for links in source code of all urls

Last edited by djmakinera; 13.02.2012 at 18:49.
#7
13.02.2012, 19:05
mluser
Guest

Quote:
Originally Posted by djmakinera
mluser - Hehe. I checked myself. Server very slow (for me).
Works only "direct link to the photo/photos"
Page=1 .... 2 .... does not work --> Just checked only the "Groups"
------------------------ Thread: 112 -----------------------
112 13.02.12 18:38:11 - WARNING [java_downloader] -> No supported links found -> search for links in source code of all urls

It also works with galleries. Try adding a gallery link and it will decrypt.

I'd be glad to provide a patch and test it. Can someone tell me how to set up the build environment for the source? I've downloaded the sources and dependencies but I can't get them to work.
#8
13.02.2012, 19:20
djmakinera
Banned

mluser - I do not know whether it works with "Gallery" links.
I cannot load the page (the server is overloaded or very, very slow).
#9
14.02.2012, 01:19
raztoki
English Supporter

Spanning pages creates many GET requests, and even when checking with a browser I had trouble before attempting the fix in JD and testing it. They have very high server loads, which will cause issues. I might enforce lower connection limits if the problem persists. During my testing I also noticed that they seem to punish users who make many GET requests over a prolonged period.

Page scraping should happen first and then link checking. If the server or user load becomes too high, the online check will just report the links as temporarily uncheckable. When a link goes to download, the file size data will be updated. It should still add all the links, assuming the server hasn't locked you out during that process.

I limited my connections (max con per host, inside settings > download and connection) to two for this, and that helped heaps. I used the 'max dl' setting, but as you guys are downloading multiple files at once (I know djmakinera would be), limiting by max sim dl per host is a better alternative. Experiment a little to find the best simultaneous connection settings and I'll hard define them. Just keep in mind that more isn't necessarily better if downloads start to fail.
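
For anyone curious what a per-host connection cap looks like in code, here is a minimal sketch using one semaphore per host. This is not how JD implements the setting; HostLimiter and its methods are hypothetical names used only to illustrate the idea of allowing at most two simultaneous connections to a single host.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical per-host connection cap; not JDownloader's actual implementation. */
final class HostLimiter {
    private final int maxPerHost;
    private final Map<String, Semaphore> slots = new ConcurrentHashMap<>();

    HostLimiter(int maxPerHost) {
        this.maxPerHost = maxPerHost;
    }

    /** Tries to claim a connection slot for the host without blocking. */
    boolean tryAcquire(String host) {
        // Each host gets its own semaphore with maxPerHost permits on first use.
        return slots.computeIfAbsent(host, h -> new Semaphore(maxPerHost)).tryAcquire();
    }

    /** Returns a slot previously claimed with tryAcquire. */
    void release(String host) {
        Semaphore s = slots.get(host);
        if (s != null) {
            s.release();
        }
    }

    public static void main(String[] args) {
        HostLimiter limiter = new HostLimiter(2);
        System.out.println(limiter.tryAcquire("example.com")); // first slot
        System.out.println(limiter.tryAcquire("example.com")); // second slot
        System.out.println(limiter.tryAcquire("example.com")); // over the cap
    }
}
```

A caller that fails to get a slot would wait and retry later rather than opening a third connection, which is the behaviour the per-host setting is meant to give.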

Last edited by raztoki; 14.02.2012 at 01:22.
#10
14.02.2012, 02:15
mluser
Guest

The site is usually quite fast. It seems they were having problems earlier today, but it's loading quickly right now. I just tested it and I still get only 40 links (1 page).
#11
14.02.2012, 02:48
raztoki
English Supporter

For which link? I've tested galleries spanning multiple pages and it all worked. I also made sure I didn't break posts with only 1 page.
#12
14.02.2012, 03:14
mluser
Guest

OK, I've tested it more carefully. I discovered that the site does *not* use the gallery URL as the base for its "more" links, which is what the plugin assumes (it just appends ?page=N).

Galleries on this site can contain both images and videos.
If the gallery contains both, it shows the first few images and a link to "more images", then the first few videos and a link to "more videos", and so on. So the "first" gallery page is special.

When the gallery contains *only* images, the first "screenful" (40 images) is displayed, BUT the link at the end changes. So if the gallery is something like:

/G1234567, the links at the bottom don't point to G1234567?page=N but to GI1234567?page=N (where "I" would stand for Images, I suppose; there might be a GV1234567?page=N with videos).

So the workaround is to copy the gallery link starting with GI. The decrypter needs to be a little more complicated: it has to find out whether a gallery has images, videos, or both, and "walk" the pages with the GI and GV URLs.
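
To make the idea concrete, here is a rough sketch of how the GI/GV page URLs could be derived from a plain gallery link. It assumes the gallery ID looks like the G1234567 example above ("G" followed by digits) and that a GV form exists, which is only a guess in this post. GalleryUrlSketch and its methods are made-up names, not plugin code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the GV form and the "G + digits" ID shape are assumptions. */
final class GalleryUrlSketch {
    /** Turns ".../G1234567" into ".../GI1234567" (kind 'I') or ".../GV1234567" (kind 'V'). */
    static String subGalleryUrl(String galleryUrl, char kind) {
        int slash = galleryUrl.lastIndexOf('/');
        String id = galleryUrl.substring(slash + 1);
        if (!id.matches("G\\d+")) {
            throw new IllegalArgumentException("Not a plain gallery URL: " + galleryUrl);
        }
        // Insert the kind letter right after the leading "G".
        return galleryUrl.substring(0, slash + 1) + "G" + kind + id.substring(1);
    }

    /** Page URLs 1..pageCount for a GI or GV link, using the ?page=N parameter. */
    static List<String> pageUrls(String subGalleryUrl, int pageCount) {
        List<String> urls = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            urls.add(subGalleryUrl + "?page=" + page);
        }
        return urls;
    }

    public static void main(String[] args) {
        String images = subGalleryUrl("http://example.com/G1234567", 'I');
        for (String url : pageUrls(images, 3)) {
            System.out.println(url);
        }
    }
}
```

The number of pages would still have to come from the pager on the GI or GV page itself, since the "first" gallery page does not show it.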
#13
14.02.2012, 03:23
raztoki
English Supporter

Forgot to post this earlier; I see you have also discovered why. This is what I wrote after my previous post:

Oh, I think I see why... When you get pages of posts which are not the end album (not more links), but are like the ones which have images / videos / galleries, it tries to find the total number of pages, which it can't. Even with the previous code it 'fails' to grab the lot within those page types, as it only grabs what's shown on the page and not what's in 'more', which is what the plugin needs to be adjusted for. I'll look at it tonight.
#14
14.02.2012, 10:29
djmakinera
Banned

Raztoki - I used this service about a year ago (28/12/2010): http://board.jdownloader.org/showthread.php?t=24220 There is more and more junk on it; it's a waste of time. I stopped using it, so I do not know what has changed or what the situation with downloading files is now. Apparently they introduced a premium option that allows faster browsing and downloading of video files.

Last edited by djmakinera; 14.02.2012 at 10:32.
#15
14.02.2012, 17:51
raztoki
English Supporter

@mluser

Not quite: we look for the uid and strip out the rest.

I can only reproduce my observation in Galleries, where the uid link opens another page with only a few images/videos and processes them normally. All I can do there is make the plugin look for the sub page.
#16
15.02.2012, 21:31
pspzockerscene
Community Manager
Join Date: Mar 2009
Location: Deutschland
Posts: 72,084

raztoki, you told me it's fixed now, is that correct?

GreeZ pspzockerscene
__________________
JD Supporter, Plugin Dev. & Community Manager

Getting Started & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz
Do you have Nero installed?
#17
16.02.2012, 11:24
raztoki
English Supporter

It's fixed, but I haven't committed it yet, as indicated last night. I wanted your opinion on your existing ignore parts (regex) of the site. I noticed some parts of the site have 900-odd pages, and I'd rather not create insane server loads by crawling them all and the associated posts.
#18
16.02.2012, 19:44
raztoki
English Supporter

OK, so I've commented and made some changes. Please note that this plugin isn't designed to grab the entire website's 'desired content'. I've limited the sections the plugin works on. Sections like 'groups', 'videos' and 'images' cannot be processed other than through a manual deep decrypt/parse or via a browser extension, as these sections can contain in excess of 1000 spanning pages.

So what is supported is URLs of the form:
domain.com/uid
domain.com/galID/uid
domain.com/g/some_name_here/uid
where uid = [A-Z0-9]{7}
Spanning pages: galleries with more than 5 pages have their results added with the available status already set (to reduce server load; an additional linkcheck still runs prior to downloading).
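
As an illustration only, the three URL forms above could be matched with a single regex along these lines. This is not the pattern the plugin actually uses; the host is left generic, and treating galID and the group name as "any path segment without a slash" is an assumption. SupportedUrlSketch and extractUid are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical matcher for the three URL forms; not the plugin's real pattern. */
final class SupportedUrlSketch {
    // host / optional "g/<name>/" or "<galID>/" / 7-character uid / optional query string
    private static final Pattern SUPPORTED = Pattern.compile(
            "^https?://[^/]+/(?:g/[^/]+/|[^/]+/)?([A-Z0-9]{7})(?:\\?.*)?$");

    /** Returns the uid if the URL has one of the supported forms, otherwise empty. */
    static Optional<String> extractUid(String url) {
        Matcher m = SUPPORTED.matcher(url);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractUid("http://domain.com/ABC1234"));
        System.out.println(extractUid("http://domain.com/g/some_name_here/ABC1234"));
        System.out.println(extractUid("http://domain.com/groups"));
    }
}
```

Section links such as /groups fall through because their last path segment is not a 7-character uppercase/digit uid, which matches the intent of not crawling those sections.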

Last edited by raztoki; 16.02.2012 at 23:56.
#19
16.02.2012, 21:47
pspzockerscene
Community Manager

Thanks.
And for clarification:
It's no problem for us to make decrypters for sites, BUT we will not allow people to add static links like categories or things like the groups here, which can contain over 1000 pages (40 images on each = DDoS).
Please understand that.

GreeZ pspzockerscene
All times are GMT +2.