#1
Request for a new command/function for EventScripter
Hello, scripts can do a lot of things, and there are many commands and endless combinations, but the main commands for working with the HTML code of a page (i.e. searching for the correct links to the desired files) are currently:
Code:
var myString = getPage(myString/*URL*/);/*Loads a website (Method: GET) and returns the source code*/
var myString = postPage(myString/*URL*/, myString/*PostData*/);/*Loads a website (METHOD: POST) and returns the source code*/
openURL(myString/*URL*/);/*Open a website or path in your default browser/file explorer*/

So I would like to suggest/request a new command (for EventScripter) that returns the HTML of the page AFTER loading (that is, with all the tags and strings modified by the various dynamic elements). Something like:
var myString = getPageMht(myString/*URL*/);/*Loads a website (Method: GET Mht) and returns the source code after loading*/
which is equivalent to "inspect + reload page" in Chrome.

In Chrome (as in other browsers, I believe) there are many extensions that save the whole page as a single file <webpageAAA.mht>, for example "Save As MHT" in the Chrome Web Store:
h##ps://chrome.google.com/webstore/detail/save-as-mht/hfmodljjaibbdndlikgagimhhodmobkc/related?hl=it
On the creator's page there are the various files with the code/strings/commands (JSON, .js and the like):
h##ps://github.com/vsDizzy/SaveAsMHT

I mention <mime html>/<Mht> because MHT files should be "a web page archive format which stands for MIME HTML" and "MHT format files does not save images, it only saves links to the online images".
[[ But the browser extension creates a big file with the encoded (/encrypted?) images inside + the whole page HTML + tags + URLs and the various elements (including dynamic ones). ]]

Maybe going through the MHT (lightened, without images and videos; HTML text only) could be a way to get the web page code once loading is complete [== at <"Ctrl + Shift + I" and reload page>], or maybe there is already a simpler way. I still hope that the idea can at least be considered for a future implementation.
Thanks
#2
The JD browser is very simple by default: it performs standard GET/POST/PUT/HEAD/etc. requests. It has no JavaScript/CSS or related abilities, so it cannot build or change the page the way those would. To do what you would like, you would need a fully functioning browser, maybe along the lines of PhantomJS (no longer in development according to Wikipedia). There are other headless browsers out there.
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :]
#3
@BJN01: Try wget. It is capable of mirroring remote sites locally. It can also be called from EventScripter.
#4
How do I call it in EventScripter? Do I need an *.au3? And what should I write? For example: <wget --save-headers -nd ht#p://www.lycos.com/ more index.html> ??
#5
Code:
callSync("path/to/wget", "-p", "--convert-links", url);
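A slightly fuller sketch of the same call, under some assumptions: the wget path, the output folder and the URL are placeholders, and the helper name `buildWgetCommand` is made up for illustration. `-p`, `--convert-links` and `-P` (directory prefix) are standard wget options. `callSync` only exists inside EventScripter, so the call is skipped anywhere else.

```javascript
// Build the wget argument list: page requisites (-p), converted links,
// and everything saved below targetDir (-P sets the directory prefix).
function buildWgetCommand(wgetPath, targetDir, url) {
    return [wgetPath, "-p", "--convert-links", "-P", targetDir, url];
}

// Placeholder values, adjust them to your system.
var cmd = buildWgetCommand("path/to/wget", "path/to/output", "http://example.com/");

// callSync is provided by EventScripter; skip the call in other environments.
if (typeof callSync === "function") {
    callSync(cmd[0], cmd[1], cmd[2], cmd[3], cmd[4], cmd[5]);
}
```

Keeping the argument list in one helper makes it easy to log or tweak the command before running it.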
#6
I did some tests but I don't get what I hoped for. I have found some AutoIt scripts that perhaps give better results than wget, but either I don't fully understand them or they miss pieces I am looking for.
I found some ideas about "things" written in Python, and since it should be possible to build an *.exe from the code, maybe I could get something out of it. If by pure luck I manage to get a decent .exe that can be called from EventScripter (I'm not a programmer and I've never properly studied any language), can I post it in this topic for discussion?
#7
While the format may differ, the mht and wget content should be pretty much the same. Can you provide details/examples of what exactly is missing in wget output?
#8
@BJN01: I'm sorry, but for what you want to achieve, a *real browser* that evaluates CSS/JS is required.
Wget will only work for static/simple websites where everything is referenced in the HTML, but as soon as dynamic/evaluated JavaScript is involved, it will fail as well. You will have to find/use a *real browser* or at least a *headless browser* that you can control via some sort of API.
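As one hedged illustration of the headless-browser route: Chrome/Chromium can be started with `--headless --dump-dom`, which prints the DOM after the page's scripts have run. The sketch below only assembles that command. The browser path and URL are placeholders, the helper names are made up for illustration, and it assumes that EventScripter's `callSync` returns the process output, which is not confirmed in this thread.

```javascript
// Argument list for a headless Chrome/Chromium run that prints the rendered DOM.
function buildDumpDomCommand(browserPath, url) {
    return [browserPath, "--headless", "--disable-gpu", "--dump-dom", url];
}

// Returns the rendered HTML, or null when not running inside EventScripter.
// ASSUMPTION: callSync hands back what the process wrote to stdout.
function getRenderedPage(browserPath, url) {
    var cmd = buildDumpDomCommand(browserPath, url);
    if (typeof callSync !== "function") {
        return null;
    }
    return callSync(cmd[0], cmd[1], cmd[2], cmd[3], cmd[4]);
}

// Placeholder path and URL, adjust them to your system.
var html = getRenderedPage("path/to/chrome", "http://example.com/");
```

Pages that load content late (after timers or user interaction) may still come back incomplete with this approach; a controllable browser such as Selenium gives more control over when the page counts as "finished".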
__________________
JD-Dev & Server-Admin
#9
E.g. via Selenium.
__________________
JD Supporter, Plugin Dev. & Community Manager
First Steps & Tutorials || JDownloader 2 Setup Download
#10
Yes, that is exactly the example/topic I found on the net...
For the moment I'm just past the <<print("Hello World")>> step... I won't abandon the idea, but the "development" will take a while...
#11
I'll mark this as Solved.
We won't be able to teach you how to code, but you're free to share possible solutions in this thread. At the moment we neither have a built-in "browser emulator/remote control" like Selenium nor do we provide official plugins for the purpose of crawling complete websites. -psp-