#1
Instagram Reels Crawling - New URL Format
For a long time, whenever I wanted to crawl all reels on a profile, I have used the LinkClump browser extension. This has now become harder, because Instagram has changed the url format for reels. Previously, the default format for reels was: "website/reel/main_content_id". The new format is: "website/username/reel/main_content_id".
Currently, this new url format makes it impossible for JD to crawl individual reels, because each reel url contains the profile url, and JD only detects and crawls the profile url.
To clarify: crawling a profile does NOT grab all reels on that profile. It only grabs a reel if the user chose to "feature" it on the "Posts" tab. This is why I have been using LinkClump on the Reels tab of a profile: to make sure I grab every reel, and not just the reels featured on the Posts tab. Crawling a main profile also grabs any image posts, which are not wanted when crawling reels.
I have considered bringing this up before, but I think now is the time for JD to be updated, so that it can accurately and precisely crawl all reels on a profile, and so that individual reels can be crawled when desired. Unlike the Posts tab of a profile, the Reels tab has a distinct url. The format is "website/username/reels". As stated before, the format for individual reels is "website/username/reel/main_content_id". Note that the Reels tab url uses the plural "reels", while individual reel urls use the singular "reel".
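To illustrate the url shapes described above, here is a minimal Python sketch that tells them apart. It is not how JD detects links; the "instagram.com" host, the character classes, and the names `PATTERNS` and `classify` are assumptions made for this example only.

```python
import re

# Assumed building blocks; only the path shapes come from the post above.
_BASE = r"https?://(?:www\.)?instagram\.com"
_USER = r"(?P<user>[A-Za-z0-9_.]+)"   # assumed username characters
_ID = r"(?P<id>[A-Za-z0-9_-]+)"       # assumed content id characters

# Order matters: the more specific shapes are tried before the bare profile.
PATTERNS = [
    ("reel_old", re.compile(rf"{_BASE}/reel/{_ID}/?$")),          # website/reel/id
    ("reel_new", re.compile(rf"{_BASE}/{_USER}/reel/{_ID}/?$")),  # website/username/reel/id
    ("reels_tab", re.compile(rf"{_BASE}/{_USER}/reels/?$")),      # website/username/reels
    ("profile", re.compile(rf"{_BASE}/{_USER}/?$")),              # website/username
]

def classify(url: str) -> str:
    """Return the name of the first matching url shape, or "unknown"."""
    url = url.split("?", 1)[0]  # ignore any query string
    for kind, pattern in PATTERNS:
        if pattern.match(url):
            return kind
    return "unknown"
```

The singular/plural difference is what separates "reel_new" from "reels_tab" here, which is why a detector that stops at the username segment would treat both as a plain profile link.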
#2
Next time, please provide example URLs.
Fixed.
Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager
Getting Started & Tutorials || JDownloader 2 Setup Download
#3
Thank you. The individual reels now work. But now there seems to be a new bug. When I try crawling an "all reels" url, it causes my login cookies to expire, and it also logs my account out in the browser. Here is an example url.
**External links are only visible to Support Staff**
Also, it seems the problem with individual reel urls is still not fixed. Trying to crawl an individual reel still crawls the entire main profile. Here is an example url.
**External links are only visible to Support Staff**
Last edited by goldensun87; 11.09.2024 at 03:26.
#4
Looks to be typical Instagram bot detection. We can't prevent that. This is what Insta returns in my tests:
Code:
i.instagram.com/api/v1/clips/user/
----------------Request Content-------------
{"message":"login_required","error_title":"Du wurdest abgemeldet","error_body":"Bitte melde dich wieder an.","logout_reason":3,"status":"fail"}
[The German strings in the response mean "You were logged out" and "Please log in again."]
If it keeps failing, I'd recommend manually collecting all links to the individual reels. Instructions: https://support.jdownloader.org/de/k...orted-websites
For the user part of that URL, only "[A-Za-z0-9_-]+" was allowed, so it is working at this moment, but not for your particular example, since it contains a dot character in the username.
Please also keep in mind that there are separate tools dedicated to Instagram downloading which may do a better job than JDownloader. Go to github.com and search for "instagram downloader" to find them.
For all mentioned code changes, the following information applies: Are you waiting for recently announced changes to get released? Updates do not necessarily get released immediately! Please read our Update FAQ!
-psp-
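The effect of the username character class mentioned above can be reproduced with a short Python sketch. `OLD_USER` is the class quoted in the reply; `NEW_USER`, which additionally allows a dot, is only an assumption about what a fix could look like and not the actual JD change. The helper name `is_full_match` is hypothetical.

```python
import re

# Character class quoted in the reply: no "." allowed in the username.
OLD_USER = re.compile(r"[A-Za-z0-9_-]+")
# Assumed variant that also accepts "." (illustration only, not the real fix).
NEW_USER = re.compile(r"[A-Za-z0-9_.-]+")

def is_full_match(pattern: re.Pattern, username: str) -> bool:
    """True only if the whole username consists of allowed characters."""
    return pattern.fullmatch(username) is not None
```

With these patterns, a username such as "some.user" fails the old class and passes the assumed variant, while a username without dots passes both.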
#5
Huh. I was not aware that Instagram added that security to the Reels tab links. Thankfully, main profile crawling still works.
Hopefully, you'll be able to fix the problem with usernames that contain dot characters. If anyone else runs into this problem when crawling these particular usernames, I figured out a workaround. It is most useful for anyone who wants to use LinkClump to download all reels:
1. Turn the LinkGrabber off.
2. Collect all the links.
3. Paste them into the "Open Multiple URLs" browser extension and click "Extract URLs from text".
4. Copy them to a text file in Notepad++.
5. Copy the part of a link which is "username/reel".
6. Press Ctrl+F and click on the "Replace" tab.
7. Make sure the copied text is in the "Find what" box, and in the "Replace with" box, type the letter "p".
8. Click "Replace All".
All regular posts as well as reels still work with the "p" in the url, and JD can still accurately grab reels with the links in this form.
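The find-and-replace workaround above can also be scripted. The following Python sketch rewrites "username/reel/id" links into the "p/id" form; the "instagram.com" host, the id character class, and the function names `to_p_url` and `convert_lines` are assumptions for illustration.

```python
import re

# Matches ".../<username>/reel/<id>"; old-style ".../reel/<id>" links are left alone.
_REEL = re.compile(r"(instagram\.com)/[^/\s]+/reel/([A-Za-z0-9_-]+)")

def to_p_url(url: str) -> str:
    """Replace the "username/reel" part of a link with "p"."""
    return _REEL.sub(r"\1/p/\2", url)

def convert_lines(text: str) -> list:
    """Convert a pasted block of links, one per line, skipping blank lines."""
    return [to_p_url(line.strip()) for line in text.splitlines() if line.strip()]
```

Unlike the manual replace, this does not need the username to be copied first, so it also handles a list that mixes links from several profiles.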
#6
I can also see that Instagram itself no longer seems to use the "i.instagram.com/api..." endpoint for reels, so that may be another reason. The endpoint still works, though maybe just to catch bots, who knows.
Already fixed, see my last reply.