JDownloader Community - Appwork GmbH
 

#1 | 02.02.2023, 14:01
D4oS (Registered / Inactive)

Twitter, downloading likes doesn't grab links for all of them

02.02.23 11.06.36 <--> 02.02.23 12.27.41 jdlog://5182311370661/

I am downloading my likes; my account is configured using cookies.

I've tried this several times; the linkgrabber seems to stop around 7400 links (probably ~3000 tweets).
Not the exact same number each time, but close.

On my first try, 12.12.2022, I got 7416 links and downloaded them at a 99.75% success rate (the rest is probably deleted accounts etc., I don't mind that).

On another try, 26.01.2023, it found a similar amount, but of course most of those were already added; it found 2730 new links and I downloaded them.

Today I did it again: again 74xx links in the linkgrabber, 314 of them new.

The logs attached above contain the linkgrabber working on the URL to my likes twice, with the links being added in between.

I am sorry, I could not work out from the logs what's wrong myself.

I have 29.6k likes, so I expect 50k+ links in the grabber.

At some point I suspected it's just timing out, but that was in December and I don't remember why.

Any help will be greatly appreciated!
#2 | 02.02.2023, 15:42
pspzockerscene (Community Manager)

Hi,
please enable debug mode and provide a new log, see:
https://support.jdownloader.org/Know...d-session-logs

You're most likely either running into a rate limit or a timeout.
For further testing I'd recommend doing the following, both in Settings -> Plugins -> twitter.com:
1. Define global request limit for api.twitter.com -> Set this to 1500
2. Profile crawler: Wait time between pagination requests in milliseconds -> Set this to 2000

The crawler will be much slower now but maybe it will crawl all the way to the end.
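For intuition, both settings effectively enforce a pause between successive pagination requests. Schematically it boils down to something like the following sketch (the endpoint, parameters, and response fields are made up for illustration; this is not Twitter's real API):
Code:
# Toy sketch of a paced, cursor-based crawl. Endpoint and field names
# are made up for illustration; this is not Twitter's actual API.
import time
import requests

def crawl_all(base_url, wait_ms=2000):
    items, cursor = [], None
    with requests.Session() as session:
        while True:
            params = {"cursor": cursor} if cursor else {}
            resp = session.get(base_url, params=params, timeout=30)
            resp.raise_for_status()
            page = resp.json()
            items.extend(page["entries"])
            cursor = page.get("next_cursor")
            if not cursor:
                break  # the real end of the list was reached
            time.sleep(wait_ms / 1000.0)  # wait time between pagination requests
    return items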
#3 | 02.02.2023, 17:57
D4oS (Registered / Inactive)

Oh, such a dumb mistake, not enabling the debug mode, I apologise.

02.02.23 15.16.37 <--> 02.02.23 16.39.52 jdlog://0282311370661/

1. I changed the API limit from 500 to 1500
2. The pagination interval was at 3000ms

On the first try I left pagination at 3000; it produced 36xx links, iirc.
Then I did not remove anything from the linkgrabber, but set pagination to 2000 as suggested and ran the linkgrabber again. I still have only 3631 links; I uploaded the log above.

I cleared the linkgrabber and changed the values to 250 and 1000 out of curiosity; I ended up with 2766 links.

This will be in a newer log upload, if it has any value:
02.02.23 15.16.37 <--> 02.02.23 16.55.54 jdlog://3282311370661/
(I clicked it twice, in case you find another one within a minute of this one.)
#4 | 02.02.2023, 18:12
pspzockerscene (Community Manager)

I wasn't able to find anything out of the ordinary.

Your log is again very big.
Next time, please make sure that no other downloads are running, no other accounts are being checked, and no links except Twitter ones are being added during your log session.

Also keep in mind that while we do support crawling from Twitter, JD is not a specialized tool for mass-archiving, so you might be better off using a dedicated Twitter downloader; see github.com for other open source projects.

You could also collect the single post URLs manually using e.g. this method:
https://support.jdownloader.org/Know...orted-websites
...though I doubt that's what you want to do.
#5 | 02.02.2023, 19:24
D4oS (Registered / Inactive)

I restarted the client, disabled all other accounts, didn't start any downloads, pasted the link once, and uploaded the logs within a minute after the failure:
02.02.23 17.21.49 <--> 02.02.23 17.36.44 jdlog://4282311370661/

It stopped scraping a bit after 10 minutes, though unfortunately I had the API delay set lower than the suggested 1500.

The only files modified during the process are:
twitter.com_jd.plugins.decrypter.TwitterComCrawler.log.[0/2]
twitter.com_jd.plugins.hoster.TwitterCom.log.0

The process started at 17:23:38.

--ID:146TS:1675355588613-02/02/23 17:33:08 - [jd.plugins.decrypter.TwitterComCrawler(crawlUserViaAPI)] -> Crawled page 165 | Tweets crawled so far: 3492/29604 | lastCreatedAtDateStr = Wed Aug 24 20:20:08 +0000 2022 | last nextCursor = HBaugdr+1In0rTAAAA==

This is near the end; it looks as if it just stopped moments later. No errors in the last line, nothing.

I installed JD2 in another location with default settings, and just added the Twitter account with the cookie addon.
The crawler stopped after exactly 10:28 of runtime.
This is literally the last line in the crawler log (without debugging on):
--ID:135TS:1675357263309-02/02/23 18:01:03 - [jd.plugins.decrypter.TwitterComCrawler(crawlUserViaAPI)] -> Crawled page 178 | Tweets crawled so far: 3765/29604 | lastCreatedAtDateStr = Sun Aug 07 17:54:38 +0000 2022 | last nextCursor = HBbwr/KnmYqiqDAAAA==

I also installed it on another PC and tried; it stopped crawling a bit after 5 minutes, only getting to page 88.

I assume this is a lost cause. Thanks for your time and sorry it didn't lead to anything.

I know JD2 is not a Twitter-specific tool, but you're the best; the other options I tried are limited to 500 tweets or plainly don't work.
I have requested my Twitter user data to see if it contains the favourite IDs, since then I should be able to just paste tweets into JD bit by bit.
The aim is mostly to save memes, tbh. But the API lock in a week sounds like this is my last chance.

The other solution would be to delete 2000 newest favourites, paste into JD, download, repeat. It deals with duplicates flawlessly.

The browser addons won't work because JD already parses many more tweets than scrolling in the browser can get; after a while it just stops and doesn't go any further.
#6 | 03.02.2023, 13:52
pspzockerscene (Community Manager)

Quote:
Originally Posted by D4oS
This is near the end; it looks as if it just stopped moments later. No errors in the last line, nothing.
I completely agree with this.
I've also looked through the code and I was unable to find any break condition without a log message, except maybe if the user aborts.
Did you ever right-click the activity indicator at the bottom right of the linkgrabber -> Abort all Linkgrabber?

Quote:
Originally Posted by D4oS
I installed JD2 in another location with default settings, and just added the Twitter account with the cookie addon.
Such re-installs won't help with plugin-related issues.

Quote:
Originally Posted by D4oS
I know JD2 is not a Twitter-specific tool, but you're the best
I'm taking that compliment, thanks!

Quote:
Originally Posted by D4oS
But the API lock in a week sounds like this is my last chance.
No, you possibly misunderstood the "API lock"; see this thread:
https://board.jdownloader.org/showthread.php?t=92700

Quote:
Originally Posted by D4oS
The other solution would be to delete 2000 newest favourites, paste into JD, download, repeat. It deals with duplicates flawlessly.
Don't give up yet, we're still looking into it.

Quote:
Originally Posted by D4oS
The browser addons won't work because JD already parses many more tweets than scrolling in the browser can get; after a while it just stops and doesn't go any further.
Ahh yes, I remember there is some kind of limit in place when using the website.
I used to think that the Web-API has this too, but either I was wrong with that assumption or Twitter removed that limit some time ago.
#7 | 04.02.2023, 12:29
D4oS (Registered / Inactive)

Quote:
Originally Posted by pspzockerscene
Did you ever right-click the activity indicator at the bottom right of the linkgrabber -> Abort all Linkgrabber?
I don't think I did. I may have done it once or twice by accident, but just recently I watched the progress popup disappear without any action on my part, and I've repeated this too many times to have accidentally clicked abort on every try.

Quote:
Originally Posted by pspzockerscene
you possibly misunderstood the "API lock"
Sure did! That's good.

Quote:
Originally Posted by pspzockerscene
Don't give up yet, we're still looking into it.
I am not, but...
I got my archive; it looks like this:
Code:
window.YTD.like.part0 = [
  {
    "like" : {
      "tweetId" : "1621148179885625345",
      "fullText" : "Folarin Balogun has already scored more league goals this season than any Arsenal player has managed in either of the last two seasons.",
      "expandedUrl" : "**External links are only visible to Support Staff**
    }
  },
  {
    "like" : {
      "tweetId" : "1621128315439403008",
      "fullText" : "Microsoft wydający i wspierający w tym samym Game Passie ruskiego Atomic Hearta i ukraińskiego Stalkera 2 to jest chyba gierkowy odpowiednik \"nie obchodzi mnie kto zaczął - podajcie sobie ręce\" :P",
      "expandedUrl" : "**External links are only visible to Support Staff**
    }
  },
(...)
]
The twitter.com/i/web/status/<ID> form is not what I want, so I made a little script to get twitter.com/<Username>/status/<ID>, using Selenium to load the page and find the URL I need in the source. After covering some exceptions it's at 9600 URLs done in about half a day, meaning it should be done tomorrow.

It seems like 5-20% of my likes are from deleted/suspended accounts, and some plainly fail (I should get the list of failed URLs at the end, so I can retry them as well as look at those myself). I wonder if that's related.
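The approach boils down to something like this (a simplified sketch, not the exact script; the file name, fixed wait time, and picking the resolved URL off the redirect instead of parsing the page source are illustrative assumptions):
Code:
# Simplified sketch: resolve twitter.com/i/web/status/<ID> to the
# canonical twitter.com/<Username>/status/<ID> URL via Selenium.
# Assumes Selenium 4.x with Chrome; file name, wait time, and the
# redirect-based pickup are illustrative assumptions.
import json
import time

from selenium import webdriver

def load_like_ids(path="like.js"):
    # The archive file is a JS assignment; strip the prefix, parse the JSON.
    with open(path, encoding="utf-8") as f:
        raw = f.read()
    data = json.loads(raw[raw.index("["):])
    return [entry["like"]["tweetId"] for entry in data]

def resolve_urls(tweet_ids, wait_seconds=5):
    driver = webdriver.Chrome()
    resolved, failed = [], []
    try:
        for tweet_id in tweet_ids:
            driver.get(f"https://twitter.com/i/web/status/{tweet_id}")
            time.sleep(wait_seconds)  # crude wait for the redirect to settle
            url = driver.current_url
            if "/status/" in url and "/i/web/" not in url:
                resolved.append(url)  # now contains the real username
            else:
                failed.append(tweet_id)  # deleted/suspended account, or too slow
    finally:
        driver.quit()
    return resolved, failed

if __name__ == "__main__":
    resolved, failed = resolve_urls(load_like_ids())
    print("\n".join(resolved))
    print(f"{len(failed)} IDs failed: {failed}")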

So with that list I should be set, able to paste smaller batches into JD. I will try using JD itself again too, but only after I'm done with what I'm currently doing, because I suspect I'd otherwise just get blocked for too many requests.
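For the batching itself, something as simple as this would do (batch size and file names are arbitrary examples):
Code:
# Split the resolved URLs into batches small enough to paste into the
# linkgrabber one at a time; batch size and file names are arbitrary.
def batches(urls, size=2000):
    for i in range(0, len(urls), size):
        yield urls[i:i + size]

# Example: one text file per batch, for copy-pasting into JD.
# for n, batch in enumerate(batches(resolved)):
#     with open(f"batch_{n:03d}.txt", "w", encoding="utf-8") as f:
#         f.write("\n".join(batch))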
#8 | 10.11.2023, 12:09
pspzockerscene (Community Manager)

Did it work in the end?

Do you mind sharing said Selenium script?