04.05.2019
Are you sure you just want to ignore the http or https component of the protocol?
You would be better off with code like jiaz indicated. I would personally recommend a bash-like script: you can then run multiple regular expressions one after another (unlike in most text/word processors).

For example: cat \file\text | grep patternexpression1 | grep patternexpression2
This lets you pre-filter the text and then apply additional patterns to find what you want. You can even write the findings to files and parse them multiple times if you need different outcomes.

raztoki
Don't fight the system, use it to your advantage. :]
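
The staged filtering described in the post above can also be sketched in Python, for readers without a shell at hand. This is only an illustration: the function name `staged_filter` and the sample lines are made up here and do not come from the thread. Each pattern plays the role of one `grep` stage.

```python
import re

def staged_filter(lines, patterns):
    """Keep only the lines that match every pattern, one stage at a time."""
    for pattern in patterns:
        regex = re.compile(pattern)
        # Each stage only sees what the previous stage let through.
        lines = [line for line in lines if regex.search(line)]
    return lines

# Hypothetical sample input, standing in for the contents of a text file.
sample = [
    "error: disk full on /dev/sda1",
    "info: backup finished",
    "error: network timeout",
]

# Similar in spirit to: grep error | grep disk
result = staged_filter(sample, [r"error", r"disk"])
print(result)  # ['error: disk full on /dev/sda1']
```

Keeping the stages separate means each pattern stays short, and the intermediate list can be saved and reused with a different second stage.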
04.05.2019
The pattern is invalid because, as written, it ignores only the protocol; it must be changed to exclude the entire address.
04.05.2019
Yes, I gathered, hence my question. Since you know the answer, you should be able to fix the expression.
04.05.2019
Quote:
I have changed the regex. It now ignores the links, but the sentence selection is still wrong, and on some lines it does not mark the whole text.

(?!(http|https?://|ftp://|www\.|[^\s:=]+@www\.).*?[a-z_\/0-9\-\#=&])(?=(\.|,|;|\?|\!)?("|'|«|»|\[|\s|\r|\n|$))[^.!?0-9]+[.!?]

04.05.2019

raztoki:

You are on the right track with encasing, but you have now introduced more issues. Anyway, I'm not providing any assistance with regular expressions. I'm glad you're learning, though!


05.05.2019

djmakinera:

(?<!")😊(?!")|(?<!"(?=😊"))😊|😊(?!"(?<="😊"))|(?<!")😊|😊(?!")|😊(?!"(?:(?:[^"]*"){2})*[^"]*)|(?:"😊".*?)*\k😊|(?:(?>{[^}]*?})[^{}]*?)*\k😊

05.05.2019

djmakinera:

As far as I know, almost the same question has been asked many times on the programming forum stackoverflow.com, but there is no good solution. "Nothing you do will be perfect." The aim is to reduce the error rate as much as possible: run the program on a large set of texts and add exceptions until you reach an acceptable level of error. However, if you need more than a few dozen rules, you'll probably want to rethink the problem.

Step1: Match sentences that end with one of the allowed terminators . ! ?
Example sentence ending with !
Code: Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie!
-or- .
Code: Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie.
-or- ?
Code: Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie?

Step2: Do NOT treat the dot in line numbering as the end of a sentence.
At the beginning of a line: 0. (ANY NUMBER + DOT) 5. (ANY NUMBER + DOT) 156. (ANY NUMBER + DOT)
This applies only at the beginning of the line; everywhere else a number followed by a dot is acceptable.

Step3: All languages of the world are allowed, except for Russian.

Step4: Add a search exception for any links (URLs). Ignore them completely.

Step5: Allow sentence detection when one sentence ends with three dots, three exclamation marks, or three question marks and the next begins with a capital letter:
Example:
Code: Jestem w innym świecie... W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
Code: Jestem w innym świecie!!! W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
Code: Jestem w innym świecie??? W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
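
The five steps above could be approximated with several small passes instead of one large pattern. The following is a minimal sketch under stated assumptions, not a tested solution from the thread: the names `URL_RE`, `SENTENCE_RE`, `CYRILLIC_RE` and `split_sentences` are hypothetical, "Russian" is approximated by "contains any Cyrillic letter" (which also excludes other Cyrillic-script languages), the capital-letter condition from Step5 is not checked, and text without a closing . ! ? is dropped.

```python
import re

# Step4: a URL up to (but not including) trailing punctuation.
URL_RE = re.compile(r"(?:https?://|ftp://|www\.)\S*[^\s.,;!?]")
# Step2 + Step1 + Step5: optional "12." numbering at the start of the line,
# then non-terminators, then "...", "!!!", "???" or a single terminator.
SENTENCE_RE = re.compile(r"(?:^\d+\.\s*)?[^.!?]+(?:\.{3}|!{3}|\?{3}|[.!?])")
# Step3 (approximation): any Cyrillic letter disqualifies the sentence.
CYRILLIC_RE = re.compile(r"[\u0400-\u04FF]")

def split_sentences(text):
    urls = []

    def mask(match):
        # Replace the URL with a placeholder that contains no . ! ?
        urls.append(match.group(0))
        return f"\x00{len(urls) - 1}\x00"

    def unmask(sentence):
        return re.sub(r"\x00(\d+)\x00", lambda p: urls[int(p.group(1))], sentence)

    sentences = []
    for line in text.splitlines():
        masked = URL_RE.sub(mask, line.strip())
        for m in SENTENCE_RE.finditer(masked):
            sentence = unmask(m.group(0).strip())
            if not CYRILLIC_RE.search(sentence):
                sentences.append(sentence)
    return sentences

sample = (
    "1. Jestem w innym świecie... W świecie o innej kulturze.\n"
    "Zobacz http://example.com/a.html?x=1 teraz!\n"
    "Это русский текст."
)
for s in split_sentences(sample):
    print(s)
```

Masking the URLs first keeps the sentence pattern itself short, which is the main reason for splitting the work into passes.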

06.05.2019

djmakinera:

Quote: Step2: Regex ignores URLs and finds the sentences.
Code: ^(?!\d+\.).*[.!?]$
The only remaining issue is the numbering of some sentences (only at the beginning of the line).
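
To make the behaviour of the quoted pattern easier to check, here is a small sketch that runs it line by line. The sample text is invented for illustration, and the `re.MULTILINE` flag is an assumption about how the pattern is applied (without it, `^` and `$` would only anchor at the start and end of the whole text).

```python
import re

# The pattern from the post, applied per line via re.MULTILINE (assumption).
LINE_RE = re.compile(r"^(?!\d+\.).*[.!?]$", re.MULTILINE)

# Hypothetical sample text.
text = (
    "First line ends properly.\n"
    "12. Numbered line is skipped.\n"
    "No terminator here\n"
    "Visit http://example.com/a.b now!\n"
    "Is this matched?"
)

matches = LINE_RE.findall(text)
for m in matches:
    print(m)
```

Note that the pattern accepts or rejects a whole line: the dots inside the URL do no harm only because the line is never split, and a line holding two sentences is returned as one match.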

----------------------------------
At the moment there is only this workaround, and the two expressions work only separately.
I still need a solution in which the numbering at the beginning of a line is treated as part of the whole sentence,
so that sentences are matched both with and without numbering.

The word "KONIEC" marks the end of the text and is followed by the separator "=".

Code:
^(?!\d+\.)|(?!KONIEC).*[.!?]\$
07.05.2019
Update:
New pattern to ignore spaces before the selection.

Basically, a sentence ends with a ".?!" OR a sentence begins a line with a number + "." and ends with ".?!"


The regular expression still does not ignore the characters . ! ? inside links. That error is not solved yet.
22.05.2019
@Jiaz - Can you correct the pattern so that it does not detect the characters . ! ? in links?

Spoiler:
(\S+\.(com|net|org|edu|gov|ru|pl)(\/\S+)?)|((^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!]))
23.05.2019
^\d+\..*? -> .*? -> you allow everything
^\d+\.[^\.!\?]*?

[^\s].*? -> .*? -> you allow everything
[^\s\.!\?]*?
23.05.2019
Quote:
(^\d+\..*?|.*?)(\.\.\.|[\.?!]). Basically, a sentence ends with a ".?!" OR a sentence begins a line with a number + "." and ends with ".?!"

-or-
This works better: (^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!]) to ignore spaces before the selection.

I've tested. Unfortunately, this pattern is incorrect because it matches the links as sentences.

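
The failure reported above is easy to reproduce, and one possible single-pattern direction is to let the sentence body consume a URL as one unit. The sketch below is an illustration only: `URL` and `URL_AWARE_RE` are hypothetical names, the URL sub-pattern is a rough assumption rather than a complete URL grammar, and the variant relies on every sentence ending in . ! or ? (on unterminated text the engine can still backtrack into the URL).

```python
import re

# Pattern quoted in the thread.
THREAD_RE = re.compile(r"(^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!])")

# Hypothetical variant: inside a sentence, a URL counts as a single token,
# so the dots in it are never tested as sentence terminators.
URL = r"(?:https?://|ftp://|www\.)\S*[^\s.,;!?]"
URL_AWARE_RE = re.compile(rf"(?:{URL}|[^.!?])+(?:\.\.\.|[.!?])")

text = "See http://example.com/page for details. Done!"

broken = [m.group(0) for m in THREAD_RE.finditer(text)]
fixed = [m.group(0).strip() for m in URL_AWARE_RE.finditer(text)]

print(broken)  # ['See http://example.', 'com/page for details.', 'Done!']
print(fixed)   # ['See http://example.com/page for details.', 'Done!']
```

The lazy `.*?` in the thread's pattern stops at the first dot it meets, which is the one inside the host name; listing the URL alternative before the single-character alternative is what prevents that here.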
23.05.2019
Please understand that I simply don't have the time to help you with your pattern. If you want to do this all within a single pattern, then you have to learn it and learn to write more complex patterns, like
emailregex.com
23.05.2019
Anyway, thanks for partial help.
This question was surprisingly difficult to find an answer for. The regexes I found were too complicated to understand, and anything more than a regex is overkill and too difficult to implement.
23.05.2019
Quote:
A filter, e.g. on a name or package: which engine is it using? Because none of the ready-made "E-MAIL" patterns works correctly :D
24.05.2019
Quote:
normal pattern/regex. no *engine*.
17.06.2019
How do I reverse the order of numbers separated by a comma?

Example: 01,03 -> 03,01
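
For the example above, a substitution with two capture groups is enough, and for longer lists a plain split and join avoids regex altogether. This is a minimal Python sketch; the function names `swap_pair` and `reverse_list` are made up for illustration, and Python is an assumption here since the thread does not name a tool.

```python
import re

def swap_pair(text):
    """Swap two comma-separated numbers using backreferences."""
    return re.sub(r"\b(\d+),(\d+)\b", r"\2,\1", text)

def reverse_list(text):
    """Reverse any number of comma-separated fields without regex."""
    return ",".join(reversed(text.split(",")))

print(swap_pair("01,03"))        # 03,01
print(reverse_list("01,03,05"))  # 05,03,01
```

Many editors with regex search and replace accept the same idea, with the replacement written as `\2,\1` or `$2,$1` depending on the tool.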
17.06.2019
There comes a time you should learn some sort of coding language and not try to achieve everything with regex.
17.06.2019
There is plenty of this on the net for Unix, but I do not see anything for Windows. Besides, these scripts and commands also take time: writing the patterns takes as much effort as making the changes manually. I do not see anything in it that would make the swap easier; to convert 20 numbers I would have to spend a few minutes or more each time, and doing it manually is a lot faster.
17.06.2019
Ever thought about regex NOT being the answer for all of your *stuff*?
