removing protocol and subdomain prefix might improve, but wont entirely as many sites have multiple domains, and or continually add new ones. some plugins set a unique identifier (most sites have uid) to combat that. We also typically also correct all urls into JD to one format protocol://domain/(path/)?uid. maybe also adding feature on checksumming either from advertised (hoster end) and confirmation your end could also assist. Note and many alter small components of files to create unique checksums.
|