JDownloader Community - Appwork GmbH
 

  #81  
Old 05.03.2024, 15:19
pspzockerscene
Community Manager

Join Date: Mar 2009
Location: Deutschland
Posts: 72,961

Internal changelog:
- Sanitize existing Tweet text items RE forum https://board.jdownloader.org/showpo...4&postcount=79
- Updated text crawler to find "super big" Tweet-texts of premium users RE forum https://board.jdownloader.org/showpo...7&postcount=77 THIS WILL ONLY WORK FOR ITEMS ADDED AFTER SAID UPDATE
- Refactored the logic which finds and skips Tweet results before returning items to the linkgrabber

Are you waiting for an announced bug fix or a new feature?
Updates are not necessarily released immediately!
Please read our Update FAQ!

-psp-
__________________
JD Supporter, Plugin Dev. & Community Manager

Getting Started & Tutorials || JDownloader 2 Setup Download
Spoiler:

A user's JD crashes and the first thing to ask is:
Quote:
Originally Posted by Jiaz View Post
Do you have Nero installed?
  #82  
Old 08.03.2024, 08:18
RedViper
Bandwidth Beast

Join Date: Apr 2016
Location: Mexico
Posts: 138

Quote:
Originally Posted by pspzockerscene View Post
Please re-read my last post. It explains how the problem could happen even with disabled HLS video stream downloads.
Sorry, I didn't see that. I then updated and it hasn't happened to me again.

Thank you
__________________
- Ramón V M
  #83  
Old 13.03.2024, 10:04
Marco86
DSL Light User

Join Date: Aug 2022
Posts: 34

I'm getting some inconsistent behaviour with some profiles.
**External links are only visible to Support Staff** gives me 69 crawled posts. One time it crawled the complete 83 posts, only to go back down to 69 again.
I am also not able to get more than 300 items in total when crawling with retweets. I changed accounts and IP addresses many times.

The problem:
There are accounts where the media posts are not all crawled, even if there are only a few media posts.
**External links are only visible to Support Staff**
**External links are only visible to Support Staff** and all older media are not crawled when crawling the media tab URL. This account has 80 media posts, of which 14 are not crawled.
(The media tab does not seem to display the media posts all the way back, but they are displayed when using the Old Twitter Layout (2024) extension.)

It might be related to ads or reposts that JDownloader crawls internally but does not display if the user has retweets disabled.
It could be that these reposts are counted towards the 850 post crawl limit, which results in not all media items being crawled.
But this account does not even load 850 posts when "crawl with retweets" is enabled: with retweets, only a few posts are crawled (around 260 including text files), and ads are crawled as well.
The ads change with every crawl. Crawling without retweets simply removes the ads and reposts and displays around 150 items.
  #84  
Old 13.03.2024, 11:38
pspzockerscene

Quote:
Originally Posted by Marco86 View Post
I'm getting some inconsistent behaviour with some profiles.
Just so that we're talking about the same things:
1. When adding a profile URL ending with "/media", only media items will be crawled, which may be fewer than when adding the "normal" profile URL.

2. I'm counting this by using the number of crawled text items in JD when "Text crawl mode" is set to "Always".
When I'm adding the "/media" URL, I'm getting 69 results consistently (retweets disabled).



Quote:
Originally Posted by Marco86 View Post
There are accounts where the media posts are not all crawled, even if there are only a few media posts.
Fixed a bug where the crawler crashed when a quoted Tweet was deleted (internally called "TweetTombstone").

Quote:
Originally Posted by Marco86 View Post
ads are crawled as well
Please provide examples for crawled ads.

  #85  
Old 15.03.2024, 08:22
Marco86

Thanks, the issues are fixed now.

You can see the ads by adding the links to the download list and then crawling the profile again: all links that are not already in the download list are ads.
twitter.com/WandaFPHOTO1/status/1768059544587079754
twitter.com/27mirror/status/1768212579908800875
twitter.com/karma_shopping/status/1768212579908800875
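
The comparison described above boils down to a set difference between what is already in the download list and what a fresh crawl returns. A minimal Python sketch of that idea follows. It is not JDownloader code: the function name and the `example_user` links are made up for illustration, and only the `27mirror` link is taken from this post.

```python
def find_new_links(download_list, fresh_crawl):
    """Return links from the fresh crawl that are not yet in the download list."""
    known = set(download_list)
    # Keep the crawl order so the result is easy to compare by eye.
    return [url for url in fresh_crawl if url not in known]


# Hypothetical data: the first crawl was added to the download list,
# the second crawl of the same profile returned one extra link.
download_list = [
    "twitter.com/example_user/status/1",
    "twitter.com/example_user/status/2",
]
fresh_crawl = [
    "twitter.com/example_user/status/1",
    "twitter.com/27mirror/status/1768212579908800875",
    "twitter.com/example_user/status/2",
]

candidates = find_new_links(download_list, fresh_crawl)
print(candidates)
```

Note that anything the profile owner posted between the two crawls would show up in the result as well, so the leftover links are only ad candidates and still need a manual check.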

One thing that is missing is the information about the tagged accounts for media posts. This belongs in the textfile.

Example:
twitter.com/Immutable/status/1768441202280649164 There are 2 tagged accounts, displayed by clicking on the text right below the image.
This information could be added to the textfiles, for example:
tagged - Skies Verse @skies_verse
tagged - Polygon | Aggregated @0xPolygon

Last edited by Marco86; 15.03.2024 at 10:39. Reason: Deleted wrong information
  #86  
Old 15.03.2024, 13:07
pspzockerscene

Quote:
Originally Posted by Marco86 View Post
Thanks, the issues are fixed now.
Thanks for your feedback.

Quote:
Originally Posted by Marco86 View Post
You can see the ads by adding the links to the download list and then crawling the profile again: all links that are not already in the download list are ads.
Which profile did you use for testing?
How do the ads look on the Twitter.com website?
Please provide a screenshot of such Twitter ads.
Are the ads account-specific, i.e. are Twitter account owners able to turn them off?

I didn't get any ads with any of the accounts I tested.
In the browser, even after disabling my adblocker, I wasn't able to get any ads either.

Quote:
Originally Posted by Marco86 View Post
One thing that is missing is the information about the tagged accounts for media posts. This belongs in the textfile.
I disagree since that information is not part of the post-text.
If you want to have that information, I can set it as a plugin-property (documentation) on the result items and you can save it into a text-file e.g. via EventScripter script.

EventScripter forum thread:
https://board.jdownloader.org/showthread.php?t=70525
EventScripter help article:
https://support.jdownloader.org/Know...event-scripter
  #87  
Old 15.03.2024, 15:15
Marco86

These accounts have ads:
twitter.com/BallietBran
twitter.com/JustJamad
It only affects a few accounts.

Another issue:
Crawling twitter.com/BallietBran or twitter.com/BallietBran/media crawls 209 items, sometimes 410 items.
The 209-item result is incomplete and only reaches back to 2022-07-25_BallietBran_1551292983646883840_FYdMv3gUYAIch4s
The 410-item result is complete and reaches back to 2022-01-23_BallietBran_1485198445883232257_FJx8EfxUYAEKMt0
All available media that jdownloader should be able to crawl can be displayed with the Old Twitter Layout (2024) extension by scrolling to the bottom of the media tab. Without this extension, not all available media is displayed, at least for this account.
Attached Images
File Type: png Ads.PNG (151.5 KB, 1 views)
  #88  
Old 15.03.2024, 15:40
pspzockerscene

Quote:
Originally Posted by Marco86 View Post
These accounts have ads:
Fixed.

Quote:
Originally Posted by Marco86 View Post
Another issue:
Crawling twitter.com/BallietBran or twitter.com/BallietBran/media crawls 209 items, sometimes 410 items.
I can reproduce that, but I'm not at all sure that this is a bug.
You can search through your own Twitter logs and look for the text "Stopping because:".
Although it says "Stopping because: Failed to find any new Tweets on current page", I can see from the last HTTP response that no Tweets were returned, which means the server-side end has been reached.
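
If you would rather not search the logs by hand, a short script can pull out every "Stopping because:" entry. The following Python sketch is an illustration only: the exact layout of JDownloader log lines is assumed, the sample line "crawling page 3" is invented, and the log file path you pass to `scan_log_file` depends on your own installation.

```python
from pathlib import Path

MARKER = "Stopping because:"


def find_stop_reasons(lines):
    """Return the text following the marker for every line that contains it."""
    reasons = []
    for line in lines:
        _, found, rest = line.partition(MARKER)
        if found:
            reasons.append(rest.strip())
    return reasons


def scan_log_file(path):
    """Read one log file (path is up to the caller) and collect the stop reasons."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_stop_reasons(handle)


# Hypothetical log excerpt; only the second line is quoted from this thread.
sample = [
    "crawling page 3",
    "Stopping because: Failed to find any new Tweets on current page",
]
print(find_stop_reasons(sample))
```

A stop reason alone does not tell you whether the crawl was cut short. As described above, it has to be read together with the last HTTP response.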

Quote:
Originally Posted by Marco86 View Post
All available media that jdownloader should be able to crawl can be displayed with the Old Twitter Layout (2024) extension by scrolling to the bottom of the media tab
I'm not using said browser addon so I have no insights on what it is doing.
The current Twitter plugin is using "the current version of the Twitter.com website".
To be honest, I do not see myself being able to keep putting this much time into this plugin.
I still recommend using other applications which specialize in downloading from Twitter.

  #89  
Old 18.03.2024, 08:33
Marco86

There is an error in how the max_date setting handles reposts: twitter.com/Kawaka08?max_date=2024-03-15 should load the repost twitter.com/ComputerBase/status/1623061771425812480, as the repost was made today.
  #90  
Old 18.03.2024, 15:44
pspzockerscene

Fixed.

For the next update:
- refactored Retweet handling
- prefer date of source Tweet for finding latest Tweet timestamp for 'max_date' handling
- set some plugin properties of the source Tweet on Retweet results as "source_<propertyname>", so from Retweet results you can now e.g. find out what the TweetID of the source Tweet was; the property for this would be 'source_tweetid'
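
To illustrate the "source_<propertyname>" naming, here is a small Python sketch that copies selected properties of a source Tweet onto a Retweet result. It is not the plugin's code. Only the name 'source_tweetid' comes from the changelog above: the plain property name "tweetid" is inferred from that pattern, and "username", the Retweet ID "111" and the function name are made up for the example.

```python
def add_source_properties(retweet_props, source_props, keys):
    """Copy selected source Tweet properties onto a Retweet result, prefixed with 'source_'."""
    result = dict(retweet_props)  # do not modify the caller's dict
    for key in keys:
        if key in source_props:
            result["source_" + key] = source_props[key]
    return result


# Hypothetical property sets for a Retweet and the Tweet it points to.
retweet = {"tweetid": "111"}
source = {"tweetid": "1623061771425812480", "username": "ComputerBase"}

merged = add_source_properties(retweet, source, ["tweetid"])
print(merged["source_tweetid"])
```

Keeping the Retweet's own properties untouched and adding the source values under a prefix means both IDs stay available on the same result item.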

  #91  
Old 26.03.2024, 19:47
Marco86

There is an error with the max_date setting when the poster has quoted his own post and the quoted post has a date older than the max_date: the post will not be crawled and the crawler seems to abort. This happens with and without retweets, and with the media tab.

twitter.com/Kawaka08?max_date=2024-02-25
twitter.com/Kawaka08/media?max_date=2024-02-25

This is the post in question:
twitter.com/Kawaka08/status/1772652445077823998
  #92  
Old 26.03.2024, 20:42
Jiaz
JD Manager

Join Date: Mar 2009
Location: Germany
Posts: 81,021

@Marco86: Thanks, pspzockerscene will look into this as soon as he finds time
__________________
JD-Dev & Server-Admin
  #93  
Old 27.03.2024, 15:01
pspzockerscene

Updated logic which finds the "current last Tweet date":
- If Tweet is a Retweet, date of the source Tweet will be used
- If Tweet contains a quoted Tweet, date of source Tweet will be used

This returns ~50 results for your example URLs (Retweets enabled).

  #94  
Old 10.04.2024, 06:55
DeusExBestia
DSL User

Join Date: Feb 2018
Posts: 37

Thank you for all your hard work pspzockerscene.

Is there a way to prevent the downloader from appending " - media" to the folder of the user I'm liberating uploads from? I don't think it started happening until one of the recent updates.
  #95  
Old 10.04.2024, 10:39
Jiaz

@DeusExBestia: You can use the Packagizer, see https://support.jdownloader.org/de/k...the-packagizer, to remove the " - media" from the package name. likes/media and others also have this suffix in the package name.
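
For clarity, this is the rename the Packagizer rule is meant to achieve, shown as a plain Python sketch. It does not use Packagizer syntax and is not JDownloader code. The package name "Kawaka08 - media" is an assumed example of a profile name followed by the suffix.

```python
SUFFIX = " - media"


def strip_media_suffix(package_name):
    """Remove one trailing ' - media' from a package name, if present."""
    if package_name.endswith(SUFFIX):
        return package_name[: -len(SUFFIX)]
    return package_name


# Assumed package name: profile name followed by the suffix.
print(strip_media_suffix("Kawaka08 - media"))
```

Names without the suffix are returned unchanged, so the same rule can safely be applied to every package.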
  #96  
Old 10.04.2024, 13:32
pspzockerscene

@DeusExBestia
Thanks for your feedback.
The text "media" is applied for good reasons.
If you don't want it, please use the feature Jiaz mentioned to remove it from your package names.
  #97  
Old 27.04.2024, 17:45
RedAero
Modem User

Join Date: Aug 2020
Posts: 4

I'm getting some strange behavior... Most of the time if I paste an account URL (e.g. twitter.com/Kawaka08), nothing happens, nothing is crawled, but *sometimes*, a few posts do get crawled, and not even consistently between different devices (i.e. computer A will crawl nothing, computer B gets 3 posts). On the other hand, copying the /media link seems to work fine.
  #98  
Old 27.04.2024, 20:49
Jiaz

@RedAero: Have you added your Twitter/X account to JDownloader via the cookies method?
Please provide a debug log, see https://support.jdownloader.org/de/k...d-session-logs
Enable debug mode and restart JDownloader. Then reproduce the issue, create a log and post the log ID here.

All times are GMT +2. The time now is 08:53.