JDownloader Community - Appwork GmbH
 

  #1  
27.03.2019, 03:52
smilies
Junior Loader
Join Date: Jan 2019
Posts: 11
Tumblr API v1

Hi, I'd like to crawl high-resolution JPG images from **External links are only visible to Support Staff** **External links are only visible to Support Staff**
and I tried to write a Link Crawler Rule:

Code:
[
  {
    "name": "SOME DOCUMENTATION: https://board.jdownloader.org/showthread.php?t=77280#post422008",
    "enabled": false
  },
  {
    "name": "tumblr: parse links like **External links are only visible to Support Staff**",
    "enabled": true,
    "pattern": "https?://hofd\\.tumblr\\.com/.*",
    "rule": "DEEPDECRYPT",
    "maxDecryptDepth": 0,
    "deepPattern": "<photo-url max-width=\"1280\">(.*?jpg)"
  }
]
But no links are being added. What is going on during parsing? Is there an easy way to check?
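
One rough way to check the deepPattern outside JDownloader is to run the same regular expression against a saved copy of the response. The sketch below is only an illustration: the sample XML is a hypothetical excerpt (a real API v1 response will differ), the function name `extract_links` is made up for this example, and Python's `re` module is used in place of the Java regex engine in JDownloader, which should behave the same for a pattern this simple.

```python
import re

# deepPattern from the rule above, after JSON unescaping (\" becomes ").
DEEP_PATTERN = r'<photo-url max-width="1280">(.*?jpg)'

# Hypothetical excerpt of an API response; the URLs are placeholders.
SAMPLE_RESPONSE = """
<post id="1" type="photo">
  <photo-url max-width="1280">https://example.invalid/a_1280.jpg</photo-url>
  <photo-url max-width="500">https://example.invalid/a_500.jpg</photo-url>
</post>
<post id="2" type="photo">
  <photo-url max-width="1280">https://example.invalid/b_1280.jpg</photo-url>
</post>
"""


def extract_links(body):
    """Return the text captured by group 1 for every match in body."""
    return re.findall(DEEP_PATTERN, body)


if __name__ == "__main__":
    for link in extract_links(SAMPLE_RESPONSE):
        print(link)
```

If the saved response is an HTML consent page rather than XML, `extract_links` returns an empty list, which would match the "no links are being added" symptom.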
  #2  
27.03.2019, 10:17
Jiaz
JD Manager
Join Date: Mar 2009
Location: Germany
Posts: 65,270

Tumblr presents the 'I consent' dialog because no cookies are present.
You can forward the cookies from your browser via the "cookies" field of the rule to avoid this screen; JDownloader will then receive the API response.

Quote:
.......
"cookies" : [ ["pfg","cookieContentSeeDeveloperToolsOfBrowser"] ],
.....
__________________
JD-Dev & Server-Admin
  #3  
27.03.2019, 10:34
Jiaz

You will find the HTTP requests/responses in the log after activating debug log mode.
Enable Settings > Advanced Settings > Log.debugmodeenabled and restart JD.
I will also add a new field to the rule to enable logging (wait for the next core update):
Quote:
"logging": true

Last edited by Jiaz; 27.03.2019 at 10:47.
  #4  
28.03.2019, 17:23
smilies

Makes sense. I tried adding the cookie, but still no links are grabbed. I also tried URL-decoding the cookie content, but still nothing.

I'll look at the log later.

I always use JSON escaping. (After URL-decoding there are quotation marks.)

After URL-decoding, the "pfg" cookie content has the following form:

Last edited by Jiaz; 28.03.2019 at 19:14.
  #5  
28.03.2019, 19:14
Jiaz

Just put in the raw cookie (as seen in the headers section). Don't encode or decode anything; copy and paste it from the Cookie header in your browser's developer tools.
In your case, use the 'Original' cookie. That should work fine. Enable debug mode as explained; you should then see more in the logs.
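
To avoid hand-escaping mistakes, the rule can be generated with a JSON serializer so that backslashes and quotation marks are escaped automatically while the cookie value is passed through untouched. This is a minimal sketch, not an official tool: the helper names `build_rule` and `rules_json` and the rule name are hypothetical, the cookie value is the placeholder from post #2, and the "logging" field is the one announced in post #3, so it may only work after the mentioned core update.

```python
import json

# Placeholder from post #2; replace with the raw value from the Cookie header.
RAW_COOKIE = "cookieContentSeeDeveloperToolsOfBrowser"


def build_rule(raw_cookie):
    """Assemble the rule from post #1 plus the cookies/logging fields."""
    return {
        "name": "tumblr: API v1 photo links",  # hypothetical name
        "enabled": True,
        "pattern": r"https?://hofd\.tumblr\.com/.*",
        "rule": "DEEPDECRYPT",
        "maxDecryptDepth": 0,
        "deepPattern": r'<photo-url max-width="1280">(.*?jpg)',
        # Cookie is passed as copied, without URL-encoding or decoding.
        "cookies": [["pfg", raw_cookie]],
        # Field announced in post #3; assumed to be available.
        "logging": True,
    }


def rules_json(raw_cookie):
    """Serialize a one-rule list; json.dumps handles the JSON escaping."""
    return json.dumps([build_rule(raw_cookie)], indent=2)


if __name__ == "__main__":
    print(rules_json(RAW_COOKIE))
```

The printed text can be pasted into the Link Crawler Rules setting; the regex backslashes appear doubled and the inner quotation marks appear as `\"`, as in the hand-written rule.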
  #6  
26.05.2019, 12:23
smilies

It still didn't work, but I haven't looked at the logs yet.

Another problem is that downloads are sometimes stuck at 100% and finish only after clicking "Resume". Maybe it has something to do with my settings (I don't have time to make the log right now), but I wonder why *any* settings may prevent JDownloader from realizing that the download is finished, and if clicking "Resume" helps why JDownloader doesn't "click" it itself.

Even when not stuck, each download seems to start slowly, so thousands of files take long.

So for now I'm using TumblThree for tumblr, it seems much faster.

Still using JDownloader for other websites, and may try to figure out later how the crawler works. Simple built-in documentation and debugging would be nice. Thanks for the help here so far and for the entire work.
  #7  
26.05.2019, 16:30
mgpai
Script Master
Join Date: Sep 2013
Posts: 606

Rule from post #1 worked fine, as it is, without 'forwarding' any cookies. The request made via API URL does not appear to be redirected to the 'consent' page. @Jiaz should be able to check/confirm.
  #8  
27.05.2019, 17:20
Jiaz

Quote:
Originally Posted by smilies
Another problem is that downloads are sometimes stuck at 100% and finish only after clicking "Resume".
We need a log to tell you more about this. One possible cause is transparent (image) compression: the first connection reports file size x, the second connection reports a different size, and JDownloader then isn't able to *finish* the download because the file size has changed during the download.
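
While waiting for a log, one rough check for this cause is to request the same file twice and compare the reported sizes. The sketch below is an assumption-laden illustration, not how JDownloader itself detects the problem: the helper names `content_length` and `size_changed` are made up, HEAD requests are assumed to be allowed by the server, and a proxy may answer HEAD and GET differently.

```python
from urllib.request import Request, urlopen


def content_length(url, timeout=10.0):
    """Return the Content-Length reported for url, or None if absent."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
    return int(value) if value is not None else None


def size_changed(first, second):
    """True only when both sizes are known and they differ."""
    return first is not None and second is not None and first != second


if __name__ == "__main__":
    import sys

    # Usage: python check_size.py <file URL>
    target = sys.argv[1]
    a = content_length(target)
    b = content_length(target)
    print("first:", a, "second:", b, "changed:", size_changed(a, b))
```

If the two sizes differ for an affected file, that would be consistent with the compression explanation above; if they match, the log is still needed.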
  #9  
27.05.2019, 17:22
Jiaz

Quote:
Originally Posted by smilies
Even when not stuck, each download seems to start slowly, so thousands of files take long.
Again, we need logs. JDownloader limits concurrent downloads to a maximum of 20, and the default is 1-2 concurrent downloads. Many other factors could also be the cause, too many to guess at; the log is the best place to look.
  #10  
27.05.2019, 17:24
Jiaz

Quote:
Originally Posted by smilies
So for now I'm using TumblThree for tumblr, it seems much faster.
Please provide example links so I can check and compare this.
All times are GMT +2.