#1
I'm having an issue where the results from the Remote API "/linkgrabberv2/queryLinkCrawlerJobs" are empty after adding a link with "/linkgrabberv2/addLinks" and passing the jobUUIDs it returned.
After sending ~9000 links (99% mega.nz) through the API sequentially, around 200 of them were unable to get ANY result from "/linkgrabberv2/queryLinkCrawlerJobs" after the link was added. (Running "/linkgrabberv2/queryLinks" with the jobUUIDs from "/linkgrabberv2/addLinks" does return the expected links, but this does not solve the issue.)

After sending some number of links, or after waiting some hours, a jobUUID will no longer exist in "/linkgrabberv2/queryLinkCrawlerJobs". That is fine, since I normally have the linkUUIDs and packageUUIDs associated with the jobUUIDs. (There could be some sort of internal limit? But the jobs appear to still exist after restarting.)

These ~200 link URLs cannot be handled by my application in its current state, as only their jobUUID was saved. I was expecting "/linkgrabberv2/queryLinkCrawlerJobs" to ALWAYS return a result. I will need to fix these by running "/linkgrabberv2/queryLinks" with the saved jobUUID.

JD Info:
Code:
Build Data: Fri Jun 07 17:51:22 CEST 2024
Java: AdoptOpenJDK - OpenJDK Runtime Environment - 1.8.0_265 (64bit/X86)
OS: WINDOWS (WINDOWS_10_22H2) (64bit)
Core: #48254
Launcher: #5770
AppWork Utilities: #4055
Browser: #48227
Updater: #1061

Log File Password: **External links are only visible to Support Staff** (archive password is contained within this URL)

If you have any questions for me I will be in the IRC #jDownloader as @TheGreenUser; DM me if it's more convenient. I'm unable to provide any more detailed logs at this time, however I do have one example and can find up to ~200 more if required within the 8 JD log folders I have. If the debug logs are truly required, I can reset JD and attempt to re-create the issue.

Example: within log folder "1717954683594_", file "jd.controlling.linkcollector.LinkCollector.log.0", find "CrawlerJob:ID:1718007087379".
Code:
"/linkgrabberv2/addLinks" {"assignJobID": true, "overwritePackagizerRules": false, "packageName": "priority": "DEFAULT", "links": "**External links are only visible to Support Staff**, "sourceUrl":"my source url here"}: {"data" :[[1718007087379]],"rid": my_rid_placeholder_here}" "/linkgrabberv2/queryLinkCrawlerJobs": {"collectorInfo":true,"jobIds":[1718007087379]}: {"data" :[],"rid":my_rid_placeholder_here}" // program loop would normally end here for this response with error("received 0 LinkCrawlerJobs") "/linkgrabberv2/queryLinks" {"collectorInfo":true, jobUUIDs [1718007087379]}: {"data" :[{Availability:ONLINE BytesTotal:911906257 Comment: DownloadPassword: Enabled:true Host:mega.co.nz Name:VR PMV - Mother's Daughter.mp4 PackageUUID:1718007088166 Priority: Url:**External links are only visible to Support Staff**rid":1718047882924969911} basic program flow: (Only 1 link is handled at a time) Code:
for each link in links:
    send 1 link to "/linkgrabberv2/addLinks"
    receive 1 CrawlerJob:ID or return error
    save CrawlerJob:ID to the link's database record
    wait 2 seconds

    var jobs JobLinkCrawlerSortables
    loop:
        // todo: replace "/linkgrabberv2/queryLinkCrawlerJobs" with "/linkgrabberv2/isCollecting",
        // then fetch LinkCrawlerJobs if !Collecting
        jobs, err = "/linkgrabberv2/queryLinkCrawlerJobs" {"collectorInfo": true, "jobIds": [CrawlerJob:ID]}
        if err { return err } // we should never have an error here unless there is an issue with the internet connection or a JD error

        var completedJobs = 0
        for _, job := range jobs {
            if !job.Crawling && !job.Checking { completedJobs++ }
        }
        if completedJobs >= len(jobs) { break }

        if loop takes more than 5 minutes {
            return error("failed to queryLinkCrawlerJobs after 5 minutes") // all +9000 jobs have taken less than 6 seconds
        }
        wait 3 seconds
    }

    if len(jobs) == 0 { // "data": []
        return error("received 0 LinkCrawlerJobs") // <--- Here's the issue
    }

    for job in jobs { // we should ONLY have one job, but just in case
        query links with job.jobId
        extract unique packageUUIDs
        rename each packageUUID with a custom prefix
        if links contains a link which is not "ONLINE" {
            do nothing, just log the issue to the link's database record
        } else {
            send packageUUIDs to download
        }
    }

    save any errors, jobIds, packageUUIDs, linkUUIDs to the link's database record

//TGU

Last edited by TGU; 15.06.2024 at 01:46. Reason: clarified Remote API, updated queryLinkCrawlerJobs to 3sec
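For reference, here is a minimal Go sketch of the polling step above. The endpoint path and request fields ("collectorInfo", "jobIds") follow the calls shown in this post; the plain-HTTP transport (assuming the local deprecated API on port 3128) and the response field casing are assumptions on my part, not the official client library.
Code:
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// query mirrors the request body used above.
type query struct {
	CollectorInfo bool    `json:"collectorInfo"`
	JobIds        []int64 `json:"jobIds"`
}

// crawlerJob holds only the fields this loop needs; the response
// casing ("crawling"/"checking") is an assumption.
type crawlerJob struct {
	JobID    int64 `json:"jobId"`
	Crawling bool  `json:"crawling"`
	Checking bool  `json:"checking"`
}

type response struct {
	Data []crawlerJob `json:"data"`
}

// waitForJob polls "/linkgrabberv2/queryLinkCrawlerJobs" until the job
// is neither crawling nor checking, or the 5-minute deadline passes.
func waitForJob(baseURL string, jobID int64) ([]crawlerJob, error) {
	deadline := time.Now().Add(5 * time.Minute)
	for {
		body, _ := json.Marshal(query{CollectorInfo: true, JobIds: []int64{jobID}})
		resp, err := http.Post(baseURL+"/linkgrabberv2/queryLinkCrawlerJobs",
			"application/json", bytes.NewReader(body))
		if err != nil {
			return nil, err // connection issue to JD
		}
		var out response
		err = json.NewDecoder(resp.Body).Decode(&out)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		if len(out.Data) == 0 {
			// the case reported here: the job was already cleaned up
			// and will never reappear in this list
			return nil, errors.New("received 0 LinkCrawlerJobs")
		}
		done := 0
		for _, job := range out.Data {
			if !job.Crawling && !job.Checking {
				done++
			}
		}
		if done >= len(out.Data) {
			return out.Data, nil
		}
		if time.Now().After(deadline) {
			return nil, errors.New("failed to queryLinkCrawlerJobs after 5 minutes")
		}
		time.Sleep(3 * time.Second)
	}
}

func main() {
	jobs, err := waitForJob("http://127.0.0.1:3128", 1718007087379)
	fmt.Println(jobs, err)
}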
#2
Update: I was able to replicate this issue with Log: Debug Mode and MyJDownloaderSettings: Debug enabled while testing 3 pixeldrain links (it happens with any host) on the current version; the same basic program flow as above still applies. The issue should be quick to find in the logs.
It appears to occur only when downloads are running (possibly): on the 3rd or 4th attempt to reproduce this, I started the downloads first and then added the links, but that could just be luck.

14.06.24 17.12.25 <--> 14.06.24 17.19.46 jdlog://4643411370661/

The Remote API "/linkgrabberv2/queryLinkCrawlerJobs" returned an empty data response for the first link:
Code:
Link 1 @ 2024-06-14 17:16:55.9338301: j_1718403401910 (received 0 link crawler jobs)
Link 2 @ 2024-06-14 17:17:09.9908334: j_1718403415941,l_1718403427244,l_1718403427245,p_1718403427243
Link 3 @ 2024-06-14 17:17:24.0503298: j_1718403429998,l_1718403441283,p_1718403441282

Job, link, and package IDs are prefixed with their first character + "_".

Last edited by TGU; 15.06.2024 at 01:32.
#3
@TGU: JobLinkCrawler entries (via "/linkgrabberv2/queryLinkCrawlerJobs") are only available while the job is still waiting/running and shortly after it has finished. Once it has finished/processed all links, there is no guarantee how long it will remain available.
That is desired behaviour. In short: when "/linkgrabberv2/queryLinkCrawlerJobs" no longer returns a JobLinkCrawler for a specific jobUUID, that means it is finished/cleaned up. You should make use of "assignJobID" in "/linkgrabberv2/addLinks" if you want to get hold of the resulting links via "/linkgrabberv2/queryLinks". If necessary, I can introduce a new API method so you can *extend* the reachability of a JobLinkCrawler for a certain amount of time to prevent early cleanup.
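For example, a rough sketch of the two calls (the payloads follow the examples from post #1; the response shapes are abbreviated and illustrative):
Code:
// 1) add the link with assignJobID so the returned jobUUID can be saved
"/linkgrabberv2/addLinks"
{"assignJobID": true, "links": "<url>", "sourceUrl": "<source url>"}
-> {"data": [[1718007087379]], "rid": ...}

// 2) the collected links stay queryable by jobUUID even after the JobLinkCrawler is cleaned up
"/linkgrabberv2/queryLinks"
{"collectorInfo": true, "jobUUIDs": [1718007087379]}
-> {"data": [{"name": ..., "packageUUID": 1718007088166, "availability": "ONLINE", ...}], "rid": ...}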
__________________
JD-Dev & Server-Admin
#4
The reason for using this endpoint vs "/linkgrabberv2/queryLinks" is that I naturally also want to check the data about the job itself: isChecking, isCrawling, #broken, #crawled, #filtered, #unhandled. It just makes sense; I'm sure some of these could be extracted via queryLinks, but not all of them. It was working well for ~9000 links, but the ~200 caused issues.

Last edited by TGU; 15.06.2024 at 02:04.
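For illustration, here is the job data I care about as a Go struct (the field names mirror the list above, but the exact JSON casing used by the Remote API is an assumption on my part):
Code:
// jobStats mirrors the JobLinkCrawler data referenced above; the JSON
// tags are assumed, not taken from the official API documentation.
type jobStats struct {
	JobID     int64 `json:"jobId"`
	Checking  bool  `json:"checking"`  // isChecking
	Crawling  bool  `json:"crawling"`  // isCrawling
	Broken    int   `json:"broken"`    // #broken
	Crawled   int   `json:"crawled"`   // #crawled
	Filtered  int   `json:"filtered"`  // #filtered
	Unhandled int   `json:"unhandled"` // #unhandled
}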
#5
I think a way to extend reachability would be the best solution for your case then?
__________________
JD-Dev & Server-Admin
#6
:) I'm glad it's just a garbage-collection "issue".

Last edited by TGU; 15.06.2024 at 02:14.
#7
I will think about an easy/fast solution to this.
I agree. Most likely I will add a customizable *keep reachable* timeout via advanced settings.
__________________
JD-Dev & Server-Admin

Last edited by Jiaz; 15.06.2024 at 02:30.
#8
@TGU: To keep it nice & easy & simple, how about a new flag in AddLinksQuery to disable auto-cleanup of JobLinkCrawler entries from the list, plus a new additional cleanup call so you can manually remove/clean up the entry once you no longer need the information? That way nothing changes for existing usage, you can change the behaviour on a per-job basis, and you don't have to change advanced settings all the time.
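Purely to illustrate the shape of that idea (nothing below exists yet; both the flag and the endpoint name are hypothetical):
Code:
// hypothetical flag: keep the JobLinkCrawler reachable until cleaned up manually
"/linkgrabberv2/addLinks"
{"assignJobID": true, "keepJobLinkCrawler": true, "links": "<url>"}

// hypothetical cleanup call: remove the entry once the client has read everything it needs
"/linkgrabberv2/cleanupCrawlerJobs"
{"jobIds": [1718007087379]}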
What's your opinion on this?
__________________
JD-Dev & Server-Admin
#9
I don't suppose you could also add the ability to get the stored job UUIDs via QueryLinks & QueryPackages? These aren't really available anywhere other than the response to "/linkgrabberv2/addLinks". (I would create a new thread, but I've made too many recently.)
#10
With the next update, I'll add:
Quote:
__________________
JD-Dev & Server-Admin

Last edited by Jiaz; 17.06.2024 at 13:51.
#11
:thumbup: Perfect, thanks for that; I've been thinking about it for a few years now.
I'll provide updates for all the API changes on each thread once the core update is live.
#12
I'll ping you once those are live.
__________________
JD-Dev & Server-Admin