  #1093  
Old 21.02.2020, 14:15
RPNet-user

Quote:
Originally Posted by mgpai View Post
While PSP has gone the extra mile and provided you example rules, it is up to you to fine tune it and make it work. The 'release' urls on that site do not seem to contain any 'uppercase' characters. Hence, if you use 'uppercase' characters in keywords, it will obviously not match. Either use the correct case, or include the case-insensitive flag in the regex.



The 'release' urls on that site also do not seem to contain any 'dot' characters. But, even if we take your string just as an example, it will also not match, as it contains '.' character while the regex doesn't. You have to modify it in order to match the string.

@mgpai

The sample title with the dots follows the naming pattern of the actual downloadable files, not the titles of the posts. It was just an example to point out that none of the keywords after the hyphen "-" are being accepted by the regex keyword search. And yes, I tried all of those release names, and several others that follow the hyphen, in lowercase, and none of them worked, so case does not appear to affect the regex keyword search. Although I didn't use the "i" flag, I did test with A-Za-z0-9 to verify that it was not a case issue.
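To make the case and dot points concrete, here is a minimal sketch in Python rather than in JDownloader itself. Both sample strings are made up for illustration and are not taken from the site; the character classes mirror the ones discussed in this thread, with a "." added to the second one.

```python
import re

# Hypothetical strings for illustration only; neither is copied from the site.
url_title = "/release/some-movie-2020-1080p-bluray-x264-rmteam"
file_name = "Some.Movie.2020.1080p.BluRay.x264-RMTeam"

# Lowercase-only class without a dot, as in the rule discussed above.
lowercase_only = re.compile(r"[a-z0-9\-]+rmteam")

# Same idea, but the class also allows "." and matching ignores case.
with_dots_any_case = re.compile(r"[a-z0-9.\-]+rmteam", re.IGNORECASE)

url_hit = bool(lowercase_only.search(url_title))
file_hit_strict = bool(lowercase_only.search(file_name))
file_hit_relaxed = bool(with_dots_any_case.search(file_name))

print(url_hit, file_hit_strict, file_hit_relaxed)
```

The strict pattern only finds the keyword in the lowercase URL-style string; the file-name style string needs the case-insensitive flag before "RMTeam" can match.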

Anyway, it is working now. I had to remove the keyword from the middle and add it after the last "+" quantifier, so it looks like this: (/release/[a-z0-9\\-]+[a-z0-9\\-]+rmteam)
So the keywords after the hyphen will never work anywhere except in the last keyword position of the regex, regardless of case.
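One possible explanation for this, sketched in Python under the assumption that the release group is the last token of the URL: a pattern with the keyword in the middle still requires at least one more character after it, so it cannot match when the URL ends with the keyword. The URL below is hypothetical, and the "middle" pattern is only a guess at what the earlier rule looked like. The single backslash is used because this is a Python raw string; the doubled "\\-" in the rule above is presumably escaping for the rule's storage format.

```python
import re

# Hypothetical URL path; assumes the release group is the final token.
url = "/release/some-movie-2020-1080p-x264-rmteam"

# Keyword placed last, as in the working rule.
keyword_last = re.compile(r"(/release/[a-z0-9\-]+[a-z0-9\-]+rmteam)")

# Keyword placed in the middle: the trailing "+" demands at least one
# more character after "rmteam", which this URL does not have.
keyword_middle = re.compile(r"(/release/[a-z0-9\-]+rmteam[a-z0-9\-]+)")

last_hit = keyword_last.search(url)
middle_hit = keyword_middle.search(url)

print(last_hit is not None, middle_hit is not None)
```

If the real URLs do carry text after the keyword, this explanation would not apply and something else in the rule is responsible.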

The site updates regularly, with just 15 posts on the first page, which includes both TV shows and movies (unfiltered). However, at the top of the page there is an option to select "movies only", which adds /l/m to the URL after the top-level domain name. At the bottom of each page there are numbered links to the other pages in the format /l/m/2, /l/m/3, and so on, so the second page looks like this: rmz.cr/l/m/2.
So if I wanted to add just the first five pages under "/l/m/" to my crawl, I assume I would have to add or change this in the pattern and deepPattern regex?
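As a rough sketch of what such a page filter could look like, here is a Python check that accepts only the first five "movies only" pages. It assumes the first page is plain /l/m and that the pages are served over http or https; it is not a tested JDownloader rule, and the pattern would still need to be adapted to the rule's own escaping.

```python
import re

# Sketch: page 1 is assumed to be /l/m, pages 2-5 are /l/m/2 ... /l/m/5.
first_five_pages = re.compile(r"https?://rmz\.cr/l/m(?:/[2-5])?/?")

candidates = [
    "https://rmz.cr/l/m",
    "https://rmz.cr/l/m/2",
    "https://rmz.cr/l/m/5",
    "https://rmz.cr/l/m/6",
    "https://rmz.cr/l/m/25",
]

# fullmatch() requires the whole URL to fit, so /l/m/25 is not
# mistaken for page 2 followed by extra text.
accepted = [u for u in candidates if first_five_pages.fullmatch(u)]

print(accepted)
```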

Scratch that, I believe that mgpai's script "Add urls to linkgrabber at user-defined intervals" will handle that.
I will test and post back with results.

Last edited by raztoki; 22.02.2020 at 04:26. Reason: insert /quote bbcode