#1
Twitter, downloading likes doesn't grab links for all of them
02.02.23 11.06.36 <--> 02.02.23 12.27.41 jdlog://5182311370661/
I am downloading my likes and have the account configured using cookies. I've tried this several times; the LinkGrabber seems to stop around 7400 links (probably ~3000 tweets). It's not the exact same number each time, but close.

First try, 12.12.2022: I got 7416 links and downloaded them at a 99.75% success rate (the rest is probably deleted accounts etc., I don't mind it).
Another try, 26.01.2023: it found a similar amount, but of course most of those were already added; it found 2730 new links and I downloaded them.
Today I did it again: again 74xx links in the LinkGrabber, 314 of them new.

The logs attached above contain the LinkGrabber working on the URL to my likes twice, with adding them in between. I'm sorry, I could not tell from the logs what's wrong. I have 29.6k likes, so I'd expect 50k+ links in the grabber. I suspected it was just timing out at some point, but that was in December and I don't remember why. Any help will be greatly appreciated!
#2
Hi,
please enable debug mode and provide a new log, see: https://support.jdownloader.org/Know...d-session-logs

You're most likely running into either a rate limit or a timeout. For further testing I'd recommend doing the following, both in Settings -> Plugins -> twitter.com:
1. "Define global request limit for api.twitter.com" -> set this to 1500
2. "Profile crawler: Wait time between pagination requests in milliseconds" -> set this to 2000

The crawler will be much slower now, but maybe it will crawl all the way to the end.
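To illustrate why those two settings help: a paginated crawl that fires requests back-to-back will hit the site's rate limit partway through a large account, while inserting a delay between cursor fetches keeps it under the limit. This is only a minimal sketch of that idea, not JDownloader's actual crawler code; `fetch_page` is a hypothetical callable standing in for one API pagination request.

```python
import time

def crawl_paginated(fetch_page, wait_ms=2000, max_pages=None):
    """Fetch pages one cursor at a time, sleeping between requests.

    `fetch_page` is a hypothetical callable taking a cursor and
    returning (items, next_cursor); next_cursor is None at the end.
    `wait_ms` plays the role of the "wait time between pagination
    requests" setting: a larger value trades speed for staying
    under the rate limit.
    """
    items, cursor, page = [], None, 0
    while True:
        batch, cursor = fetch_page(cursor)
        items.extend(batch)
        page += 1
        if cursor is None or (max_pages is not None and page >= max_pages):
            return items
        time.sleep(wait_ms / 1000.0)  # throttle before the next request
```

With 29.6k likes at roughly 20 tweets per page, a 2000 ms delay means the full crawl takes on the order of half an hour, which matches the "much slower" warning above.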
__________________
JD Supporter, Plugin Dev. & Community Manager
Erste Schritte & Tutorials || JDownloader 2 Setup Download
#3
Oh, such a dumb mistake, not enabling the debug mode, I apologise.
02.02.23 15.16.37 <--> 02.02.23 16.39.52 jdlog://0282311370661/

1. I changed the API limit from 500 to 1500.
2. The pagination interval was at 3000 ms to begin with.

On the first try I left pagination at 3000; it produced 36xx links, iirc. Then, without removing anything from the LinkGrabber, I set pagination to 2000 as suggested and ran the LinkGrabber again; I still have only 3631 links. I uploaded the log above.

I then cleared the LinkGrabber and, out of curiosity, changed the values to 250 and 1000; I ended up with 2766 links. This will be in a newer log upload, if it has any value: 02.02.23 15.16.37 <--> 02.02.23 16.55.54 jdlog://3282311370661/ (I clicked it twice, in case you find another one within a minute of this one)
#4
I wasn't able to find anything out of the ordinary.
Your log is again very big. Next time please make sure that no other downloads are running, no other accounts are being checked and no links other than Twitter ones are being added during your log session.

Also keep in mind that while we do support crawling from Twitter, JD is not a specialized tool for mass-archiving, so you might be better off using a dedicated Twitter downloader; see github.com for other open-source projects. You could also collect the single post URLs manually, using e.g. this method: https://support.jdownloader.org/Know...orted-websites ...though I doubt that's what you want to do.
#5
I restarted the client, disabled all other accounts, didn't start any downloads, pasted the link once and uploaded the logs within a minute after the failure.

02.02.23 17.21.49 <--> 02.02.23 17.36.44 jdlog://4282311370661/

It stopped scraping a bit after 10 minutes, though I unfortunately had the API at a lower delay than the suggested 1500. The only files modified during the process are twitter.com_jd.plugins.decrypter.TwitterComCrawler.log.[0/2] and twitter.com_jd.plugins.hoster.TwitterCom.log.0. The process started at 17:23:38. This is near the end of the log:

--ID:146TS:1675355588613-02/02/23 17:33:08 - [jd.plugins.decrypter.TwitterComCrawler(crawlUserViaAPI)] -> Crawled page 165 | Tweets crawled so far: 3492/29604 | lastCreatedAtDateStr = Wed Aug 24 20:20:08 +0000 2022 | last nextCursor = HBaugdr+1In0rTAAAA==

It looks as if it just stopped moments later; no errors in the last line, nothing.

I installed JD2 in another location with default settings and just added the Twitter account with the cookie addon. The crawler stopped at exactly 10:28 duration. This is literally the last line in the crawler log (without debugging on):

--ID:135TS:1675357263309-02/02/23 18:01:03 - [jd.plugins.decrypter.TwitterComCrawler(crawlUserViaAPI)] -> Crawled page 178 | Tweets crawled so far: 3765/29604 | lastCreatedAtDateStr = Sun Aug 07 17:54:38 +0000 2022 | last nextCursor = HBbwr/KnmYqiqDAAAA==

I also installed it on another PC and tried; it stopped crawling a bit after 5 minutes, only getting to page 88. I assume this is a lost cause. Thanks for your time, and sorry it didn't lead to anything. I know JD2 is not a Twitter-specific tool, but you're the best; the other options I tried are either limited to 500 tweets or plainly don't work.

I have requested my Twitter user data to see if it contains favourite ids, since then I should be able to just paste tweets into JD bit by bit. The aim is mostly to save memes, tbh. But the API lock in a week sounds like this is my last chance. The other solution would be to delete the 2000 newest favourites, paste into JD, download, and repeat; it deals with duplicates flawlessly.

The browser addons won't work because JD already parses many more tweets than scrolling in the browser can reach; after a while the browser just stops and doesn't load any further.
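Progress lines like the ones quoted in this post can be checked mechanically rather than by scrolling through the whole crawler log. The following is a hypothetical helper (not part of JDownloader) that scans log lines for the "Crawled page N | Tweets crawled so far: X/Y" pattern and reports the last match, i.e. where the crawl silently stopped.

```python
import re

# Matches the crawler's progress lines, e.g.
# "... Crawled page 178 | Tweets crawled so far: 3765/29604 | ..."
PROGRESS = re.compile(r"Crawled page (\d+) \| Tweets crawled so far: (\d+)/(\d+)")

def last_progress(lines):
    """Return (page, crawled, total) from the last progress line, or None."""
    last = None
    for line in lines:
        m = PROGRESS.search(line)
        if m:
            last = tuple(int(g) for g in m.groups())
    return last
```

Run against the two logs above it would report (165, 3492, 29604) and (178, 3765, 29604) respectively, making it easy to compare how far each attempt got.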
#6
Quote:
I've also looked through the code and I was unable to find any break condition without a logger, except maybe if the user aborts. Did you ever do a right-click on the LinkGrabber bottom-right activity -> "Abort all Linkgrabber"?

Quote:
I'm taking that compliment, thanks! No, you possibly misunderstood the "API lock", see this thread: https://board.jdownloader.org/showthread.php?t=92700

Quote:
Quote:
I used to think that the Web-API has this too, but either I was wrong with that assumption or Twitter removed that limit some time ago.
#7
Quote:
Sure did! That's good. I am not, but... I got my archive, it looks like this:

Code:
window.YTD.like.part0 = [ {
    "like" : {
      "tweetId" : "1621148179885625345",
      "fullText" : "Folarin Balogun has already scored more league goals this season than any Arsenal player has managed in either of the last two seasons.",
      "expandedUrl" : "**External links are only visible to Support Staff**
    }
  }, {
    "like" : {
      "tweetId" : "1621128315439403008",
      "fullText" : "Microsoft publishing and supporting, in the same Game Pass, the Russian Atomic Heart and the Ukrainian Stalker 2 is probably the gaming equivalent of \"I don't care who started it - shake hands\" :P",
      "expandedUrl" : "**External links are only visible to Support Staff**
    }
  }, (...) ]

So with that list I should be set, able to paste smaller amounts into JD. I will try using JD itself again too, but only after I'm done with what I'm currently doing, because I suspect I'd otherwise just get blocked for too many requests.
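Turning that archive file into paste-sized batches can be scripted. This is a hypothetical sketch, assuming the file follows the `window.YTD.like.part0 = [...]` shape shown above with valid JSON after the `=` (the forum-redacted sample is not valid JSON as displayed); the `/i/web/status/<id>` URL form resolves a tweet by id without needing the author's handle.

```python
import json

def tweet_urls_from_likes(js_text, chunk_size=500):
    """Extract tweet URLs from a Twitter data-archive like.js file.

    The file is JavaScript of the form `window.YTD.like.part0 = [...]`;
    stripping everything before the first '[' leaves plain JSON.
    Yields lists of at most `chunk_size` URLs, ready to paste into
    JDownloader batch by batch.
    """
    data = json.loads(js_text[js_text.index("["):])
    urls = ["https://twitter.com/i/web/status/" + entry["like"]["tweetId"]
            for entry in data]
    for i in range(0, len(urls), chunk_size):
        yield urls[i:i + chunk_size]
```

For 29.6k likes with the default chunk size this yields about 60 batches, small enough that each paste stays well clear of the crawler limit seen earlier in the thread.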
#8
Did it work in the end?
Do you mind sharing said Selenium script?