#61
04.05.2019, 08:13
 raztoki English Supporter Join Date: Apr 2010 Location: Australia Posts: 16,200

are you sure you just want to ignore the http or https component of the protocol?
you would be better off with code like jiaz indicated. i would personally recommend a bash-like script, where you can run multiple regular expressions one after another (unlike in most text/word processors).

for example: cat /file/text | grep patternexpression1 | grep patternexpression2
this allows you to pre-filter the text and then apply additional patterns to find what you want. you can even write the findings to files and parse them multiple times if you require different outcomes.
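
A minimal sketch of the same chained-filter idea in Python, for readers without a shell at hand. The function name chain_filters, the sample lines and the two patterns are made up for illustration; they are not from the thread.

```python
import re


def chain_filters(lines, patterns):
    """Keep only the lines that match every pattern, applied one after another.

    This mirrors `grep p1 | grep p2`: each pattern only sees the lines
    that survived the previous one.
    """
    for pattern in patterns:
        regex = re.compile(pattern)
        lines = [line for line in lines if regex.search(line)]
    return lines


# Hypothetical sample input.
sample = [
    "visit https://example.com today.",
    "a plain sentence.",
    "another sentence without a full stop",
]

# First keep lines ending in . ! or ?, then keep those containing a URL scheme.
filtered = chain_filters(sample, [r"[.!?]$", r"https?://"])
```

The intermediate list after each pattern could also be written to a file and parsed again with a different set of patterns, as suggested above.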

raztoki
__________________

Don't fight the system, use it to your advantage. :]
#62
04.05.2019, 13:05
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

The pattern is invalid because it ignores only the protocol; it must be changed to exclude the entire address.
#63
04.05.2019, 15:11
 raztoki English Supporter Join Date: Apr 2010 Location: Australia Posts: 16,200

yes i gathered, hence my question. since you know the answer then you should be able to fix the expression.
__________________

Don't fight the system, use it to your advantage. :]
#64
04.05.2019, 17:10
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

Quote:
 Originally Posted by raztoki yes i gathered, hence my question. since you know the answer then you should be able to fix the expression.
I have changed the regex. It now ignores the links, but the sentence selection is still wrong, and on some lines it does not mark the whole text.

(?!(http|https?://|**External links are only visible to Support Staff**ftp://|www\.|[^\s:=]+@www\.).*?[a-z_\/0-9\-\#=&])(?=(\.|,|;|\?|\!)?("|'|«|»|\[|\s|\r|\n|$))[^.!?0-9]+[.!?]
#65
04.05.2019, 17:21
 raztoki English Supporter Join Date: Apr 2010 Location: Australia Posts: 16,200

you are on the right track with encasing, but you have now introduced more issues. anyway I'm not providing you with any assistance with regular expressions. I'm glad you're learning though!
__________________
raztoki @ jDownloader reporter/developer
http://svn.jdownloader.org/users/170

Don't fight the system, use it to your advantage. :]
#66
05.05.2019, 17:55
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

(?<!")😊(?!")|(?<!"(?=😊"))😊|😊(?!"(?<="😊"))|(?<!")😊|😊(?!")|😊(?!"(?:(?:[^"]*"){2})*[^"]*)|(?:"😊".*?)*\k😊|(?:(?>{[^}]*?})[^{}]*?)*\k😊
#67
05.05.2019, 22:24
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

From what I know, people have asked almost the same question many times on the programming forum stackoverflow.com, but there is no good solution.
"Nothing you do will be perfect."
To reduce the error rate as much as possible, run the program on a large set of texts and add exceptions until you reach an acceptable level of error. However, if you need more than dozens of rules, you'll probably just want to rethink the problem.

Step1: Search for sentences that end with one of the allowed characters . ! ?
Example sentence:
!
Code:
Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie!
-or-
.
Code:
Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie.
-or-
?
Code:
Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie?

Step2: Do NOT treat "." as a sentence end in numbering at the beginning of the line:
0. (ANY NUMBER + DOT)
5. (ANY NUMBER + DOT)
156. (ANY NUMBER + DOT)
This applies only at the beginning of the line; everywhere else it is acceptable.

Step3: All languages of the world are allowed, except for Russian.

Step4: Add a search exception for any links (URLs). Ignore them completely.

Step5: Detect a sentence boundary when a sentence ends with "three dots", "three exclamation marks" or "three question marks" and the next one begins with a capital letter.
Example:
Code:
Jestem w innym świecie... W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
Code:
Jestem w innym świecie!!! W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
Code:
Jestem w innym świecie??? W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
#68
06.05.2019, 01:47
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

Quote:
 Step2: Regex ignores URLs and finds the sentences.
Code:
^(?!\d+\.).*[.!?]$
What still remains is to solve the numbering of some sentences (only at the beginning of the line).
Screenshot text: **External links are only visible to Support Staff**

----------------------------------
At the moment I only have this workaround, but the two expressions only work separately.
... I still need a solution so that the numbering at the beginning of a line is treated as part of the whole sentence.
It should work with numbering and without numbering (in both cases).

The word "KONIEC" ("END") marks the end of the text, and is followed by the separator "=".

Code:
^(?!\d+\.)|(?!KONIEC).*[.!?]$
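
One way to approach Step1, Step2, Step4 and Step5 from earlier in this thread without a single giant pattern is two passes: first hide the URLs, then look for sentences. The following is only a rough Python sketch under stated assumptions. The URL pattern is deliberately simple, the names find_sentences, URL and SENTENCE are made up for illustration, the NUL character is assumed not to occur in the input, the capital-letter check from Step5 and the language rule from Step3 are not implemented, and decimal numbers such as 3.14 would still be split.

```python
import re

# Assumed, simplified URL pattern: a scheme or "www." followed by non-space
# characters, not ending in punctuation (so a final "." stays with the sentence).
URL = re.compile(r"(?:https?://|ftp://|www\.)\S*[^\s.!?,;]")

# Optional numbering at the start of a line ("12. "), a sentence body without
# . ! ? or newlines, then a run of terminators ("." / "..." / "!!!" / "???").
SENTENCE = re.compile(r"(?:^\d+\.\s*)?[^\s.!?][^.!?\n]*[.!?]+", re.MULTILINE)


def find_sentences(text):
    """Return the sentences in text, ignoring . ! ? that occur inside URLs."""
    urls = []

    def mask(match):
        # Replace each URL with a placeholder that contains no . ! ?
        urls.append(match.group(0))
        return "\x00%d\x00" % (len(urls) - 1)

    def unmask(sentence):
        # Put the original URLs back into a found sentence.
        return re.sub(r"\x00(\d+)\x00", lambda m: urls[int(m.group(1))], sentence)

    masked = URL.sub(mask, text)
    return [unmask(m.group(0)) for m in SENTENCE.finditer(masked)]


# Hypothetical sample text with numbering, an ellipsis and a link.
sample_text = (
    "1. I am in another world... A world of a different culture.\n"
    "See http://example.com/a.html now!"
)
sentences = find_sentences(sample_text)
```

Splitting the job into a masking pass and a matching pass keeps each pattern short enough to read, which is the point raztoki made about running several expressions one after another.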
#69
07.05.2019, 11:48
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

Update:
New pattern to ignore spaces before the selection.

Basically, a sentence ends with a ".?!" OR a sentence begins a line with a number + "." and ends with ".?!"

Screenshot:
**External links are only visible to Support Staff**

The regular expression still does not ignore the characters in links. That error is not solved yet.
#70
22.05.2019, 14:58
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

@Jiaz - Can you correct the pattern so that it does not detect the characters
. ! ? in links?

Spoiler:
(\S+\.(com|net|org|edu|gov|ru|pl)(\/\S+)?)|((^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!]))
#71
23.05.2019, 11:43
 Jiaz JD Manager Join Date: Mar 2009 Location: Germany Posts: 65,456

^\d+\..*? -> .*? -> you allow everything, use a negated character class instead:
^\d+\.[^\.!\?]*?

[^\s].*? -> .*? -> you allow everything, use a negated character class instead:
[^\s\.!\?]*?
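
A small Python sketch of why .*? "allows everything" while a negated character class does not. The sample string and the trailing $ anchor are assumptions added for the demonstration; they are not part of the patterns posted in this thread.

```python
import re

# Hypothetical numbered line that actually contains two sentences.
text = "1. First. Second!"

# .*? may consume ANY character, including . ! ?, whenever the rest of the
# pattern needs it to. With the $ anchor it stretches across both sentences.
loose = re.compile(r"^\d+\..*?[.!?]$")

# [^.!?]* cannot cross a sentence terminator, so the body is limited to one
# sentence and the two-sentence line is rejected as a whole.
strict = re.compile(r"^\d+\.[^.!?]*[.!?]$")

loose_match = loose.search(text)
strict_match = strict.search(text)
single_match = strict.search("2. Only one sentence.")
```

Here loose_match covers the whole line, strict_match is None, and single_match covers the one-sentence line.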
#72
23.05.2019, 14:42
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

Quote:
 Originally Posted by Jiaz ^\d+\..*? -> .*? -> you allow everything ^\d+\.[^\.!\?]*? [^\s].*? -> .*? -> you allow everything [^\s\.!\?]*?

(^\d+\..*?|.*?)(\.\.\.|[\.?!])
Basically, a sentence ends with one of . ? ! OR a sentence begins a line with a number + "." and ends with one of . ? !

-or-
This works better, because it ignores spaces before the selection: (^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!])

I've tested. Unfortunately, this pattern is incorrect because it matches the links as sentences.
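
The problem can be reproduced with a short Python check. The pattern is the second one from this post; the sample sentence and the URL in it are made up for illustration.

```python
import re

# Pattern from this post: optional numbering, lazy body, then "..." or . ? !
PATTERN = re.compile(r"(^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!])")

# Hypothetical sentence containing a link.
text = "Visit http://example.com/a.html today."

# Each dot inside the link ends a "sentence", so one sentence becomes three.
matches = [m.group(0) for m in PATTERN.finditer(text)]
```

With this input, matches holds three fragments ("Visit http://example.", "com/a." and "html today.") instead of the single sentence.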

See screenshot:
**External links are only visible to Support Staff**
#73
23.05.2019, 15:45
 Jiaz JD Manager Join Date: Mar 2009 Location: Germany Posts: 65,456

Please understand that I simply don't have the time to help you with your pattern. If you want to do this all within a single pattern, then you have to learn it and learn to write more complex patterns, like
emailregex.com
#74
23.05.2019, 16:11
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

Anyway, thanks for partial help.
This question was surprisingly difficult to find an answer for. The regexes I found were too complicated to understand, and anything more than a regex is overkill and too difficult to implement.
#75
23.05.2019, 16:48
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

Quote:
 Originally Posted by Jiaz like emailregex.com
For a filter, e.g. on a name or package: which regex engine is it using? Because none of the ready-made "E-MAIL" patterns is correct :D
#76
24.05.2019, 17:09
 Jiaz JD Manager Join Date: Mar 2009 Location: Germany Posts: 65,456

Quote:
 Originally Posted by djmakinera For a filter, e.g. on a name or package: which regex engine is it using? Because none of the ready-made "E-MAIL" patterns is correct :D
normal pattern/regex. no *engine*.
#77
17.06.2019, 10:26
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

How do I reverse the order of numbers separated by a comma?

Example: 01,03 -> 03,01
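
For reference, a minimal Python sketch of both routes: plain string handling, and a regex replace with backreferences for the two-number case. The function names reverse_numbers and swap_pair are made up for illustration.

```python
import re


def reverse_numbers(text, sep=","):
    """Reverse the order of all items separated by sep, without any regex."""
    return sep.join(reversed(text.split(sep)))


def swap_pair(text):
    """Swap each pair of comma-separated numbers using backreferences."""
    return re.sub(r"(\d+),(\d+)", r"\2,\1", text)


reversed_pair = reverse_numbers("01,03")
swapped_pair = swap_pair("01,03")
```

Both calls turn 01,03 into 03,01. The split/join version also handles longer lists such as 01,02,03, which the two-group pattern does not reverse as a whole.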
#78
17.06.2019, 11:03
 Jiaz JD Manager Join Date: Mar 2009 Location: Germany Posts: 65,456

There comes a time you should learn some sort of coding language and not try to achieve everything with regex.
#79
17.06.2019, 16:05
 djmakinera JD Legend Join Date: May 2010 Location: Poland Posts: 8,294

There is plenty of this for Unix on the net, but I do not see anything for Windows. These scripts and commands also take time: writing the patterns costs as much effort as changing the text manually. I do not see anything in it that would make the swap easier; to convert 20 numbers I would have to spend a few minutes or more each time, and doing it manually is a lot faster.
#80
17.06.2019, 16:31
 Jiaz JD Manager Join Date: Mar 2009 Location: Germany Posts: 65,456

Ever thought about regex NOT being the answer for all of your *stuff*?