Quote:
Originally Posted by pspzockerscene
That looks like it could lead to an "endless" number of results.
We do generally not add crawlers for:
- Search URLs
- Tag-search URLs
- Crawlers for complete websites and/or complete categories of websites
|
But that's how this particular site works: everything on it goes through tags.
By default, this page opens - **External links are only visible to Support Staff**
It displays ALL posts on the site. A post is a file (picture, animation, video or Flash video).
The site offers no way to filter these files other than tags. For example, selecting the "rating:s" tag hides all NSFW posts; in that form the site could be shown even to children. The site address then changes to **External links are only visible to Support Staff**
If I want to display only the posts of a specific author, I again have to add a tag, for example "skuddbutt", and I get this address - **External links are only visible to Support Staff**
The site itself found 474 posts with this tag.
The same goes for a character's name, the title of a work, or less specific tags such as hair color. It is all part of the link.
You can also combine tags, "rating:s" + "skuddbutt" - **External links are only visible to Support Staff**
Found only 8 posts!
I understand the fear of endless results: if a parser for this site is released, it could add almost every file on the site to the download list. But maybe the number of files could simply be limited? Say the program downloads at most the first 100, 500 or 1000 files and then stops.
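To make the idea of a cap concrete, here is a rough Python sketch. It is only an illustration and says nothing about how JDownloader plugins are actually written: `fetch_page`, `crawl_posts`, `crawl_with_limit` and the fake 474-post site are all hypothetical names made up for this example.

```python
from itertools import islice
from typing import Callable, Iterator, List


def crawl_posts(fetch_page: Callable[[int], List[str]]) -> Iterator[str]:
    """Yield post links page by page until a page comes back empty."""
    page = 1
    while True:
        links = fetch_page(page)
        if not links:
            return
        yield from links
        page += 1


def crawl_with_limit(fetch_page: Callable[[int], List[str]], max_files: int) -> List[str]:
    """Collect at most max_files links; no further pages are requested after that."""
    return list(islice(crawl_posts(fetch_page), max_files))


# Hypothetical stand-in for a real site: 474 fake posts, 20 per page.
def fake_fetch_page(page: int) -> List[str]:
    start = (page - 1) * 20
    return [f"post-{i}" for i in range(start, min(start + 20, 474))]


limited = crawl_with_limit(fake_fetch_page, 100)
print(len(limited))  # 100
```

Because the crawl is a lazy generator, the cap also limits how many pages get requested, not just how many links are kept.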
This is not the first time I've dealt with plugins. The Ivara.tv plugin can download all videos of one author (one channel), and there are channels with more than a hundred videos; I did not run into any difficulties.
Quote:
Originally Posted by pspzockerscene
|
"If the website is simple and does not dynamically load content when scrolling down, you can try setting up one or multiple LinkCrawler Rules for that website."
The chan version of the site has a setting for this. You can enable dynamic loading, so that the next pages load into the current page, or disable it and flip through the pages one by one. In the second case, each page has its own address:
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
...
**External links are only visible to Support Staff**
I cannot work out any pattern in these links. I have already tried adding them as they are: the plugin currently cannot handle such links and tries to pull everything out of them. The plugin needs to be taught to find links like **External links are only visible to Support Staff** on these pages and download only those. Only from links like **External links are only visible to Support Staff** does the program download files in the highest quality. If I enter **External links are only visible to Support Staff** it finds all the files, but downloads their reduced versions. For every large picture on the site, a smaller copy is created automatically (not the one being viewed), and it is this copy that the program downloads.
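Roughly, "find the single-post links on a listing page and keep only those" could look like the Python sketch below. It assumes single-post links have the form /post/show/<digits>, as in the beta.sankakucomplex.com/post/show/12345678 example quoted further down; the sample HTML, the function names and the https scheme are made up for illustration, and the real page markup may differ.

```python
import re
from typing import List

# Assumption: single-post links look like /post/show/<digits>.
POST_LINK = re.compile(r"/post/show/(\d+)")


def extract_post_ids(html: str) -> List[str]:
    """Return post ids in order of first appearance, without duplicates."""
    seen = set()
    ids = []
    for match in POST_LINK.finditer(html):
        post_id = match.group(1)
        if post_id not in seen:
            seen.add(post_id)
            ids.append(post_id)
    return ids


def to_post_urls(ids: List[str], host: str) -> List[str]:
    """Build single-post URLs (the https scheme is an assumption)."""
    return [f"https://{host}/post/show/{post_id}" for post_id in ids]


# Made-up listing page: two posts, one repeated link, one unrelated link.
sample_html = """
<a href="/post/show/12345678"><img src="/preview/a.jpg"></a>
<a href="/post/show/12345679"><img src="/preview/b.jpg"></a>
<a href="/post/show/12345678">duplicate</a>
<a href="/wiki/show?title=example">not a post</a>
"""

post_urls = to_post_urls(extract_post_ids(sample_html), "beta.sankakucomplex.com")
print(post_urls)
```

The resulting single-post links are the kind the plugin already handles, so a list like this could be pasted into the program instead of the listing page itself.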
The beta version of the site loads a dynamically expanding page by default. There is no division into a first page and subsequent pages. Of course, I can take the original link and replace chan with beta in it, and the link will even open, but this is not very convenient - **External links are only visible to Support Staff**
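Swapping chan for beta by hand can be automated with a few lines of Python. This is just a sketch: `swap_subdomain` is a made-up helper, the chan.sankakucomplex.com host and https scheme are assumed, and whether the rewritten link really opens still depends on the site.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_subdomain(url: str, old: str = "chan", new: str = "beta") -> str:
    """Replace the leftmost host label `old` with `new`; leave other URLs unchanged."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    if labels[0] == old:
        labels[0] = new
    netloc = ".".join(labels)
    if parts.port:
        netloc = f"{netloc}:{parts.port}"
    # Note: any user:password@ part of the URL is dropped by this sketch.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


swapped = swap_subdomain("https://chan.sankakucomplex.com/post/show/12345678")
print(swapped)
```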
MyJDownloader
It didn't find any links. I don't understand how it is supposed to work at all.
Link Gopher
It selected only 46 links with the filter, not 474 as it should have. It looks like the page unloads the rest from memory, even though I scrolled through all the files.
Linkclump
The same result. I activated the selection and spent a minute dragging it across the entire page from beginning to end. The clipboard ended up with only 48 links, 2 of which are advertising posts.
Quote:
Originally Posted by pspzockerscene
I've just added support for their "beta" subdomain so with our next release of updates, the following links will work too:
beta.sankakucomplex.com/post/show/12345678
|
Great! That's already good.
Quote:
Originally Posted by pspzockerscene
Looks like "books" are internally simply a list of posts(?)
It's definitely possible to add a crawler plugin for such URLs.
|
Well, roughly speaking, yes. When searching for an author tag, there is a corresponding button on the right with a brief description of the tag from the wiki.
Clicking this button, and then the "details" button, opens the full description of the tag, with a link to all posts with this tag and a link to all books with this tag.
**External links are only visible to Support Staff**
Here are all the books with this tag
**External links are only visible to Support Staff**
It's also a dynamic page. The chan version of this page does not open.
**External links are only visible to Support Staff**
Next book page
**External links are only visible to Support Staff**
**External links are only visible to Support Staff** - doesn't work either
This is something like a book preview: cover, title, tags, page numbers and other information. From there you can go directly to reading. Open the list of all pages:
**External links are only visible to Support Staff**
**External links are only visible to Support Staff** - this page does open in the old version of the site, and here the book acts as a "pool:440837" tag. Those tags again.
Or open the first page and flip through the rest in turn:
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
etc.
Quote:
Originally Posted by pspzockerscene
Please provide example URLs to single books.
|
Book preview
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
Books as a tag
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
First pages of books
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
**External links are only visible to Support Staff**
Quote:
Originally Posted by pspzockerscene
|
That is the name constructor; in my translation it is called "container". But does it work with this particular site's tags?
Quote:
Originally Posted by pspzockerscene
|
This is also a good option.
Quote:
Originally Posted by pspzockerscene
I wouldn't say that. As explained in the beginning of this post (see article I linked), you can easily collect thousands of such URLs and then let JD download them.
|
As it turned out, this only works with the old version of the site, and only if you select non-dynamic pages. Even then, the pictures are not downloaded at maximum resolution. The plugin needs to learn to work not only with single posts but also with pages of posts, at least with their non-dynamic versions. In that case, a page of the old site displays 20 or fewer links to posts. In the new version of the site this is hard to count, since new posts from the following pages are loaded quickly...
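As a quick worked example of what 20 posts per page means for the old site, the small Python calculation below estimates how many listing pages a tag needs. It assumes every page except the last one is full, which the site does not guarantee (advertising posts, for example, may take up slots); `pages_needed` is just a name made up for this sketch.

```python
import math

POSTS_PER_PAGE = 20  # an old-site page shows 20 or fewer links to posts


def pages_needed(total_posts: int, per_page: int = POSTS_PER_PAGE) -> int:
    """Minimum number of listing pages, assuming every page but the last is full."""
    return math.ceil(total_posts / per_page)


print(pages_needed(474))  # 24
print(pages_needed(8))    # 1
```

So the 474 posts found for the author tag would span at least 24 non-dynamic pages, while the 8 posts of the combined search fit on one.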
Quote:
Originally Posted by pspzockerscene
You are allowed to mention that 3rd party application here.
|
"Download Master" aka "Internet Download Accelerator". About 10 years ago it was one of the best applications in its class. Its browser extension, at least on sankakucomplex, finds and downloads videos on the open page of the site. The extension can also "find" all the links on the page and add them to the download list. But in this case, a link to a page means literally "download this page". It doesn't know how to dig deeper and look for something on the next page...