#1
Tumblr API v1
Hi, I'd like to crawl high-resolution jpgs from **External links are only visible to Support Staff**
and I tried to write a Link Crawler Rule:

Code:
[
  {
    "name": "SOME DOCUMENTATION: https://board.jdownloader.org/showthread.php?t=77280#post422008",
    "enabled": false
  },
  {
    "name": "tumblr: parse links like **External links are only visible to Support Staff**",
    "enabled": true,
    "pattern": "https?://hofd\\.tumblr\\.com/.*",
    "rule": "DEEPDECRYPT",
    "maxDecryptDepth": 0,
    "deepPattern": "<photo-url max-width=\"1280\">(.*?jpg)"
  }
]
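As a rough sketch of what the rule's deepPattern does (an illustration in Python, not JDownloader's actual implementation; the XML fragment below is made up to resemble a Tumblr v1 API photo-post response):

```python
import re

# Made-up fragment resembling the XML that Tumblr's v1 API
# (<blog>.tumblr.com/api/read) returns for a photo post.
sample = (
    '<post><photo-url max-width="1280">'
    'https://64.media.tumblr.com/abc/tumblr_xyz_1280.jpg'
    '</photo-url>'
    '<photo-url max-width="500">'
    'https://64.media.tumblr.com/abc/tumblr_xyz_500.jpg'
    '</photo-url></post>'
)

# The rule's deepPattern applied as a regex: only the group inside
# the parentheses (the 1280px jpg URL) is captured as a link.
deep_pattern = r'<photo-url max-width="1280">(.*?jpg)'
links = re.findall(deep_pattern, sample)
print(links)  # only the max-width="1280" URL matches
```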
#2
Tumblr presents the 'I consent' dialog because no cookies are present.
You can forward the cookies from your browser via the rule to avoid this screen, and then JDownloader will receive the API response.
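For illustration, forwarding cookies inside the rule might look like the fragment below, assuming the rule format supports a `cookies` property holding key/value pairs (the cookie names and values are placeholders; "pfg" comes from a later post, "tmgioct" is hypothetical):

```json
[
  {
    "name": "tumblr: parse links, with cookies forwarded from the browser",
    "enabled": true,
    "pattern": "https?://hofd\\.tumblr\\.com/.*",
    "rule": "DEEPDECRYPT",
    "maxDecryptDepth": 0,
    "deepPattern": "<photo-url max-width=\"1280\">(.*?jpg)",
    "cookies": [
      ["pfg", "RAW_VALUE_COPIED_FROM_BROWSER"],
      ["tmgioct", "RAW_VALUE_COPIED_FROM_BROWSER"]
    ]
  }
]
```

The values would be pasted exactly as they appear in the browser's Cookie header, without any extra encoding or decoding.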
__________________
JD-Dev & Server-Admin
#3
You will find the HTTP requests/responses in the log after activating debug log mode:
enable Settings - Advanced Settings - Log.debugmodeenabled and restart JD. I will also add a new field to the rule to enable logging (wait for the next core update).
__________________
JD-Dev & Server-Admin

Last edited by Jiaz; 27.03.2019 at 11:47.
#4
Makes sense. I tried adding the cookie, but still no links are grabbed. I also tried URL-decoding the cookie content, but still nothing.
I'll look at the log later. I always use JSON escaping (after URL-decoding there are quotation marks). After URL-decoding, the "pfg" cookie content has the following form:

Last edited by Jiaz; 28.03.2019 at 20:14.
#5
Just put in the raw cookie (as seen in the headers section); don't encode/decode anything, just copy/paste the Cookie header from the developer tools in your browser.
In your case, use the 'Original' cookie. That should work fine. Enable debug mode as explained; then you should see more in the logs.
__________________
JD-Dev & Server-Admin
#6
Still didn't work, but I haven't looked at the logs yet.

Another problem is that downloads sometimes get stuck at 100% and finish only after clicking "Resume". Maybe it has something to do with my settings (I don't have time to make the log right now), but I wonder why *any* settings may prevent JDownloader from realizing that a download is finished, and, if clicking "Resume" helps, why JDownloader doesn't "click" it itself. Even when not stuck, each download seems to start slowly, so thousands of files take a long time.

So for now I'm using TumblThree for tumblr; it seems much faster. I'm still using JDownloader for other websites, and may try to figure out later how the crawler works. Simple built-in documentation and debugging would be nice. Thanks for the help here so far, and for the entire work.
#7
The rule from post #1 worked fine as-is, without 'forwarding' any cookies. The request made via the API URL does not appear to be redirected to the 'consent' page. @Jiaz should be able to check/confirm.
#8
We need a log to tell you more about this. One possible cause is transparent (image) compression: the first connection reports file size X, the second connection reports a different size, and JDownloader then isn't able to *finish* the download because the file size has changed during the download.
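A tiny sketch of that failure mode (hypothetical numbers, just to illustrate why a size change mid-download can look like a download that never finishes):

```python
# If "finished" means "bytes downloaded == the size announced on the
# first connection", a server that transparently recompresses images
# between connections makes that condition impossible to satisfy.
def looks_finished(bytes_downloaded: int, announced_size: int) -> bool:
    return bytes_downloaded == announced_size

first_announced = 1_048_576  # size reported on the first connection
second_served = 983_040      # bytes actually served on the second connection

# Everything the second connection served has arrived...
assert looks_finished(second_served, second_served)
# ...but measured against the first announcement, it never "finishes".
print(looks_finished(second_served, first_announced))  # False
```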
__________________
JD-Dev & Server-Admin
#9
Again, we need logs. JDownloader limits the maximum to 20 concurrent downloads, and the default is 1-2 concurrent downloads. There are also many other factors that could cause this; too many to guess, and the log is the best place to look.
__________________
JD-Dev & Server-Admin
#10
Please provide example links, so I can check/compare this.
__________________
JD-Dev & Server-Admin