#1
link crawler giving false positive on iframe pages
Hi! I'd like to report an issue with the link crawler. During a deep scan it gives false offline results: embedded videos in iframes that are actually online are marked as offline.
Consider this example: **External links are only visible to Support Staff**

I copy all the content of this page and ask the crawler for a deep analysis. It correctly finds the 3 pages containing embedded videos:

**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**

These pages use <iframe> elements to show the video, but it seems that somewhere in the crawl the URL payload is assumed to be 12 characters long. When the iframe loads a URL with a 12-character payload, it is detected correctly and shown as online. Newer posts, however, have 14-character payloads that are only partially detected, so they are marked offline because the truncated file doesn't exist:

**External links are only visible to Support Staff** is crawled as **External links are only visible to Support Staff**
**External links are only visible to Support Staff** is crawled as **External links are only visible to Support Staff**
**External links are only visible to Support Staff** is crawled as **External links are only visible to Support Staff**

Hope this helps to fix the issue. Thanks!!

Edit: After an analysis with old links from the same host, I'm getting the same issue, with the last 2 letters of the payload being ignored.
**External links are only visible to Support Staff** gets **External links are only visible to Support Staff** and the iframe has **External links are only visible to Support Staff**

Last edited by emilio530; 24.04.2020 at 20:49.
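
To illustrate what I suspect is happening, here is a minimal sketch in Python. It is only a guess at the cause, not the crawler's actual code: the host `video.example`, the `/embed/` path, the sample payloads, and the names `FIXED_LEN`, `ANY_LEN`, and `extract_id` are all hypothetical. It shows how a pattern that hard-codes a 12-character payload would silently cut a 14-character one short.

```python
import re

# Hypothetical embed URL shape; the real host and path are hidden in this post.
# Assumption: the payload is extracted with a fixed-length quantifier.
FIXED_LEN = re.compile(r"https?://video\.example/embed/([A-Za-z0-9]{12})")
# Possible fix: accept a payload of any length.
ANY_LEN = re.compile(r"https?://video\.example/embed/([A-Za-z0-9]+)")


def extract_id(pattern, html):
    """Return the first payload captured by `pattern` in `html`, or None."""
    match = pattern.search(html)
    return match.group(1) if match else None


# Made-up sample iframes with a 12-character and a 14-character payload.
old_iframe = '<iframe src="https://video.example/embed/abcdefghijkl"></iframe>'
new_iframe = '<iframe src="https://video.example/embed/abcdefghijklmn"></iframe>'

print(extract_id(FIXED_LEN, old_iframe))  # abcdefghijkl (complete)
print(extract_id(FIXED_LEN, new_iframe))  # abcdefghijkl (last 2 chars lost)
print(extract_id(ANY_LEN, new_iframe))    # abcdefghijklmn (complete)
```

If something like this is the cause, the truncated ID would point to a file that doesn't exist, which would explain the offline status.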