JDownloader Community - Appwork GmbH
 

  #1  
10.09.2024, 03:19
goldensun87
Giga Loader
Join Date: Feb 2012
Posts: 94
Instagram Reels Crawling - New URL Format

For a long time, I have used the LinkClump browser extension to crawl all reels on a profile. This has now become harder, because Instagram has changed the URL format for reels. Previously, the default format for reels was "website/reel/main_content_id". The new format is "website/username/reel/main_content_id".

Currently, this new URL format makes it impossible for JD to crawl individual reels, because each reel URL contains the profile URL, and JD only detects and crawls the profile URL. To clarify, crawling a profile does NOT grab all reels on that profile. It only grabs reels that the user has chosen to "feature" on the "Posts" tab. This is why I have been using LinkClump on the Reels tab of a profile: to ensure I grab every reel, not just the reels featured on the Posts tab. Crawling a main profile also grabs image posts, which are unwanted when crawling reels.

I have considered bringing this up before, but I think now is the time for JD to be updated so that it can accurately crawl all reels on a profile and also crawl individual reels when desired. Unlike the Posts tab of a profile, the Reels tab has a distinct URL. The format is "website/username/reels". As stated before, the format for individual reels is "website/username/reel/main_content_id". Note that in the Reels tab URL, "reels" is plural, while in individual reel URLs, "reel" is singular.
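
To make the difference between these formats concrete, here is a rough Python sketch that tells them apart. It is only an illustration and not JDownloader's plugin code. It assumes that "website" is www.instagram.com and that usernames and content IDs use the characters shown in the patterns. The names `classify`, `PATTERNS`, `some.user` and `ABC123` are made up for this example, and query strings are not handled.

```python
import re

# Assumptions (not from the thread): host name and allowed characters.
_HOST = r"https?://(?:www\.)?instagram\.com"
_NAME = r"[A-Za-z0-9_.-]+"   # username part
_ID = r"[A-Za-z0-9_-]+"      # main_content_id part

# Order matters: the more specific formats are tried before the plain profile.
PATTERNS = [
    ("reels_tab", re.compile(rf"{_HOST}/{_NAME}/reels/?$")),
    ("single_reel_new", re.compile(rf"{_HOST}/{_NAME}/reel/{_ID}/?$")),
    ("single_reel_old", re.compile(rf"{_HOST}/reel/{_ID}/?$")),
    ("profile", re.compile(rf"{_HOST}/{_NAME}/?$")),
]


def classify(url: str) -> str:
    """Return the name of the first URL format that matches, else 'unknown'."""
    for kind, pattern in PATTERNS:
        if pattern.match(url):
            return kind
    return "unknown"


if __name__ == "__main__":
    for example in (
        "https://www.instagram.com/some.user/reels/",
        "https://www.instagram.com/some.user/reel/ABC123/",
        "https://www.instagram.com/reel/ABC123/",
        "https://www.instagram.com/some.user/",
    ):
        print(example, "->", classify(example))
```

The point of the ordering is the one raised above: a check that only looks for "website/username" would treat every new-format reel link as a profile link.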
  #2  
10.09.2024, 12:24
pspzockerscene
Community Manager
Join Date: Mar 2009
Location: Germany
Posts: 73,484

Quote:
Originally Posted by goldensun87
I have considered bringing this up before, but I think now is the time...

There is no reason to wait before reporting a bug.
Next time, please provide example URLs.

Fixed.

Are you waiting for a recently announced bugfix or new feature to get released?
Updates do not necessarily get released immediately!
Please read our Update FAQ!


-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager

Getting Started & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?
  #3  
11.09.2024, 04:16
goldensun87
Giga Loader
Join Date: Feb 2012
Posts: 94

Thank you. The individual reels now work. But now there seems to be a new bug. When I try crawling an "all reels" URL, it causes my login cookies to expire and also logs me out of my account in the browser. Here is an example URL.

**External links are only visible to Support Staff**

Also, it seems the problem with individual reel URLs is still not fixed. Trying to crawl an individual reel still crawls the entire main profile. Here is an example URL.

**External links are only visible to Support Staff**

Last edited by goldensun87; 11.09.2024 at 04:26.
  #4  
11.09.2024, 12:51
pspzockerscene
Community Manager
Join Date: Mar 2009
Location: Germany
Posts: 73,484

Quote:
Originally Posted by goldensun87
When I try crawling an "all reels" URL, it causes my login cookies to expire and also logs me out of my account in the browser. Here is an example URL.

Same here.
This looks like typical Instagram bot detection. We can't prevent that.
This is what Instagram returns in my tests:
Code:
i.instagram.com/api/v1/clips/user/
----------------Request Content-------------
{"message":"login_required","error_title":"Du wurdest abgemeldet","error_body":"Bitte melde dich wieder an.","logout_reason":3,"status":"fail"}
Search for "instagram bot", "instagram ban" or similar in our support forums.
If it keeps failing, I'd recommend manually collecting all links to the individual reels.
Instructions:
https://support.jdownloader.org/de/k...orted-websites
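
For anyone scripting against that endpoint themselves, the failure quoted above can be recognized from its JSON body. The following Python sketch only inspects the two fields "status" and "message" as they appear in the quoted response. The function name `is_forced_logout` is made up for this example, and whether Instagram always answers in this exact shape is an assumption. The German strings in the body mean "You have been logged out" and "Please log in again."

```python
import json


def is_forced_logout(body: str) -> bool:
    """Return True if the body looks like the 'login_required' failure quoted above."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all (e.g. an HTML error page).
        return False
    if not isinstance(data, dict):
        return False
    return data.get("status") == "fail" and data.get("message") == "login_required"


# The response body exactly as quoted in this post.
sample = (
    '{"message":"login_required","error_title":"Du wurdest abgemeldet",'
    '"error_body":"Bitte melde dich wieder an.","logout_reason":3,"status":"fail"}'
)

if __name__ == "__main__":
    print(is_forced_logout(sample))
```

A script that sees this result should stop sending further requests with the same cookies, since the session has already been invalidated.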

Quote:
Originally Posted by goldensun87
Also, it seems the problem with individual reel URLs is still not fixed. Trying to crawl an individual reel still crawls the entire main profile. Here is an example URL.

Fixed.
For the username part of that URL, only "[A-Za-z0-9_-]+" was allowed, so individual reels were already working in general, but not for your particular example, since its username contains a dot character.
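
As an illustration of why the dot mattered, here is a small Python sketch. `OLD_USER` is the character class quoted above. `WIDER_USER` is a hypothetical variant that also allows a dot and is not necessarily the pattern now used in the plugin. The usernames are made up.

```python
import re

OLD_USER = re.compile(r"[A-Za-z0-9_-]+")     # character class quoted in this post
WIDER_USER = re.compile(r"[A-Za-z0-9_.-]+")  # hypothetical: same class plus '.'


def accepts(pattern: re.Pattern, username: str) -> bool:
    """Return True if the whole username consists of allowed characters."""
    return pattern.fullmatch(username) is not None


if __name__ == "__main__":
    for name in ("some_user", "some.user"):
        print(name, accepts(OLD_USER, name), accepts(WIDER_USER, name))
```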

Please also keep in mind that there are separate tools dedicated to Instagram downloading which may do a better job than JDownloader.
Go to github.com and search for "instagram downloader" to find them.

For all mentioned code changes, the following information applies:

Are you waiting for a recently announced bugfix or new feature to get released?
Updates do not necessarily get released immediately!
Please read our Update FAQ!


  #5  
11.09.2024, 13:37
goldensun87
Giga Loader
Join Date: Feb 2012
Posts: 94

Huh. I was not aware that Instagram added that security to the reel tab links. Thankfully, main profile crawling still works.

Hopefully, you'll be able to fix the problem with usernames that have dot characters. If anyone else runs into this problem when crawling these particular usernames, I figured out a workaround. This is most useful if someone wants to use LinkClump to download all reels.

1. Turn the LinkGrabber off.
2. Collect all the links.
3. Paste them into the "Open Multiple URLs" browser extension and click "Extract URLs from text".
4. Copy the extracted URLs into a text file in Notepad++.
5. Copy the "username/reel" part of one of the links.
6. Press Ctrl+F and click on the "Replace" tab.
7. Make sure the copied text is in the "Find what" box, and type the letter "p" in the "Replace with" box.
8. Click "Replace All".

All regular posts as well as reels still work with the "p" in the URL, and JD can still accurately grab reels with the links in this form.
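
The same find-and-replace can be sketched in Python for anyone who prefers a script over Notepad++. This is only a rough illustration. It assumes the collected links look like "https://www.instagram.com/username/reel/main_content_id/". The function name `to_post_links` and the example link are made up.

```python
import re

# Assumption: links have the form https://www.instagram.com/<username>/reel/<id>/
# The host is captured so that it can be kept in the rewritten link.
_REEL = re.compile(r"(https?://(?:www\.)?instagram\.com)/[^/\s]+/reel/")


def to_post_links(text: str) -> str:
    """Replace every '<username>/reel' segment with 'p', as in the manual steps."""
    return _REEL.sub(r"\1/p/", text)


if __name__ == "__main__":
    links = "https://www.instagram.com/some.user/reel/ABC123/"
    print(to_post_links(links))
```

Unlike the manual replace, this handles links from several usernames in one pass, because the username is matched by pattern instead of being copied literally.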
  #6  
11.09.2024, 13:44
pspzockerscene
Community Manager
Join Date: Mar 2009
Location: Germany
Posts: 73,484

Quote:
Originally Posted by goldensun87
I was not aware that Instagram added that security to the reel tab links.

This is just an assumption on my part.
I can also see that Instagram itself no longer seems to use the "i.instagram.com/api..." endpoint for reels, so that may be another reason.
The endpoint still works, but maybe only to catch bots. Who knows.

Quote:
Originally Posted by goldensun87
Hopefully, you'll be able to fix the problem with usernames that have dot characters.

Already fixed, see my last reply.
All times are GMT +2.