#41
|
|||
|
|||
I tried a regular expression, but I see that it still finds such a word:
Quote:
^(?!www|https?:).*[^\!'(),.:;?="А-Яа-я]$
Example1: Mój człowiek musi najpierw mnie zrozumieć. Wszystkie moje załamania i dziwactwa, które przy okazji mi wystarczają
Example2: Mój człowiek musi najpierw mnie zrozumieć. Wszystkie moje załamania i dziwactwa, które przy okazji mi wystarczają. Zrozumcie moje hobby i zainteresowania. Dla mnie jest to bardzo ważne, gdy dana osoba jest zainteresowana moimi hobby, a przynajmniej stosuje swoją siłę, aby zrozumieć. Potrzebuję kogoś, kto będzie ze mną na tej samej częstotliwości!
Example3: Mój człowiek musi najpierw mnie zrozumieć. Wszystkie moje załamania i dziwactwa, które przy okazji mi wystarczają. Zrozumcie moje hobby i zainteresowania. Dla mnie jest to bardzo ważne, gdy dana osoba jest zainteresowana moimi hobby, a przynajmniej stosuje swoją siłę, aby zrozumieć. Potrzebuję kogoś, kto będzie ze mną na tej samej częstotliwości?
Example4: === Potrzebuję kogoś, kto będzie ze mną na tej samej częstotliwości
Example5: Mój człowiek musi najpierw mnie zrozumieć. |
#42
|
||||
|
||||
And what is wrong with that? I don't see any difference between *such a word* and the rest of your text.
__________________
JD-Dev & Server-Admin |
#43
|
|||
|
|||
^(?!www|https?:).*[^\!'(),.:;?="аАбБвВгГдДеЕёЁжЖзЗиИйЙкКлЛмМнНоОпПрРсСтТуУфФхХцЦчЧшШщЩъЪыЫьЬэЭюЮяЯ]{1}$
Something is wrong: {1} should test only the last character before the end of the line, but the pattern does not behave that way. |
#44
|
|||
|
|||
It is characters, not words, that make the difference.
|
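A minimal sketch of what the pattern from post #41 actually tests, using Python's `re` module (an assumption; the thread never names the tool being used). The sample lines are shortened, hypothetical versions of the examples above. The pattern only looks at the first and the last character of a line, so it reports lines that lack end punctuation:

```python
import re

# Pattern from post #41: the line must not start with www/http(s):
# and its LAST character must not be one of the listed punctuation
# marks or a Cyrillic letter in the ranges А-Я / а-я.
PATTERN = re.compile(r"""^(?!www|https?:).*[^\!'(),.:;?="А-Яа-я]$""")

# Shortened, hypothetical sample lines modelled on the examples above.
lines = [
    "Mój człowiek musi najpierw mnie zrozumieć",      # no end punctuation
    "Mój człowiek musi najpierw mnie zrozumieć.",     # ends with a dot
    "Potrzebuję kogoś na tej samej częstotliwości!",  # ends with "!"
    "www.example.com/page",                           # looks like a link
]


def unterminated(line: str) -> bool:
    """True if the pattern matches, i.e. the line lacks end punctuation."""
    return PATTERN.search(line) is not None


results = [unterminated(line) for line in lines]
```

Note that the ranges `А-Я` and `а-я` do not cover `Ё` and `ё`, which sit outside those code point ranges. That is one reason to list the letters explicitly, as post #43 does.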
#45
|
|||
|
|||
Is there a regex to find all sentences in a text?
A sentence ends with '.', '?' or '!'.
Regex: [^.!?0-9]+[.!?]
[x] Enabled
Count Matches
The regular expression is correct, but it should ignore all links (www, http and https).
Example: **External links are only visible to Support Staff** **External links are only visible to Support Staff** |
#46
|
||||
|
||||
Quote:
__________________
JD-Dev & Server-Admin |
#47
|
|||
|
|||
Unfortunately, that expression is incorrect.
It counts 2 matches. In this case it should count 10 matches. https://i.postimg.cc/DZcfKVhM/Screen...t-07-00-PM.jpg |
#48
|
||||
|
||||
There are only 2 lines in your example (see 1 and 2 on the left), so there are only 2 matches.
__________________
JD-Dev & Server-Admin |
#49
|
|||
|
|||
Quote:
See my regex: here it shows a count of 10 + 2 (the link should not be counted!). So there is still something to improve. https://i.postimg.cc/bNsJZHtV/Screen...t-08-31-PM.jpg |
#50
|
||||
|
||||
But a sentence doesn't end with a comma, does it?
So you already have a working pattern. Why are you asking for help?
__________________
JD-Dev & Server-Admin |
#51
|
|||
|
|||
Quote:
You misunderstood. The comma in this case only separated the items in my sentence. The regex does not work properly because it matches characters inside links, including the dots in them. And links are not sentences! |
#52
|
|||
|
|||
/^(www|https?:\/\/)? -> the letters www, http or https
([\da-z\.-]+)\.([a-z\.]{2,6}) -> any number ... a dot ... two to six
([/\w\.-]*)\/?$/ -> letters, numbers, underscores, dots, or hyphens
[^.!?0-9]+[.!?] -> Count Matches Sentences
?! -> Negative Lookahead
This does not work for me:
[^.!?0-9]+[.!?]/^(?!www|https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/ |
#53
|
||||
|
||||
The pattern is invalid. For example, in
(?!www|https?:\/\/)? you can't use a negative lookahead and then make it optional with ?
__________________
JD-Dev & Server-Admin |
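To illustrate the point about the lookahead, here is a small sketch in Python's `re` module (an assumption; the engine behind the tool in this thread is not named). A negative lookahead is a zero-width test, so it is written without a trailing `?` and placed at the position it should guard. The sample strings and the `example.com` address are made up for the demonstration:

```python
import re

# The lookahead is mandatory (no trailing "?") and sits right after "^",
# so it rejects any string that starts like a link.
NOT_A_LINK = re.compile(r"^(?!www\.|https?://)\S")

# Hypothetical sample strings.
samples = ["www.example.com", "https://example.com/a.b", "Zwykłe zdanie."]

flags = [bool(NOT_A_LINK.match(s)) for s in samples]
```

Only the last sample passes the test; the two link-like strings are rejected at the very first position.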
#54
|
|||
|
|||
I have corrected it, but the regular expression is still not correct.
Code:
ERROR The complexity of matching the regular expression exceeded predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous. This exception is thrown to prevent "eternal" matches that take an indefinite period time to locate. |
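An error of this kind is typically caused by nested quantifiers. The corrected pattern is not shown in the post, so the sketch below assumes it still contains the `([\/\w \.-]*)*` tail from post #52: a starred group inside another star, which gives the engine many ways to split the same text. Flattening it to a single quantifier accepts the same strings. This is written for Python's `re` module (an assumption) and deliberately uses only short, made-up inputs:

```python
import re

# Tail of the URL pattern from post #52: a starred group inside another star.
NESTED = re.compile(r"^([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$")

# Same set of accepted strings with a single quantifier, so there is only
# one way to consume the tail and far less room for backtracking.
FLAT = re.compile(r"^([\da-z\.-]+)\.([a-z\.]{2,6})[\/\w \.-]*\/?$")

# Short, hypothetical inputs; long non-matching inputs are what make the
# nested form slow, so they are avoided here on purpose.
short_inputs = ["example.com/path/file", "example.com", "not a url!"]

agree = all(bool(NESTED.match(s)) == bool(FLAT.match(s)) for s in short_inputs)
```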
#55
|
|||
|
|||
The expression needs to be corrected:
1. It must not match links (because they contain the characters . ! ?). The ?! negative lookahead DOES NOT WORK.
2. It must not treat initials in names as sentence ends, for example: Меары А. С. Одинцой. A sentence must have at least two characters.
3. Ignore Cyrillic.
[^.!?0-9\p{Cyrillic}]+[.!?](?!https?:\/\/(www\.)[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)) |
#56
|
||||
|
||||
Negative Lookahead works fine but you cannot use it in combination with ?
Quote:
__________________
JD-Dev & Server-Admin |
#57
|
||||
|
||||
You should really start learning to code in a language and not try to solve all problems with regex!
__________________
JD-Dev & Server-Admin |
#58
|
||||
|
||||
It's powerful, it just can't solve all your queries ;p
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :] |
#59
|
|||
|
|||
I do not need to know everything, and I see nothing wrong with that. In some cases a regular expression may work, though perhaps not in mine. You can find similar example expressions on other forums where everything works fine, just not in my case. So I have to accept that links are counted as sentences; no other solution exists.
|
#60
|
|||
|
|||
Quote:
(?!http|https):\/\/(www\.)[\w\-_]+(\.[\w]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])
Code:
/ An unescaped delimiter must be escaped with a backslash (\)
This indicates a pattern error. The ?! does not ignore links: instead of being ignored, they are still matched! |
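A short sketch of why the lookahead in this pattern has no effect, again assuming Python's `re` module and a made-up `example.com` address. A lookahead is tested at the position where the engine currently stands. At the `:` the following characters are `://`, not `http`, so the test passes and the rest of the address is matched anyway:

```python
import re

# Core of the pattern from post #60, shortened for the demonstration.
BROKEN = re.compile(r"(?!http|https)://(www\.)[\w\-_]+")

# Hypothetical sample text.
text = "Zobacz https://www.example.com teraz."

# The match starts at the colon, i.e. only the protocol name is skipped.
broken_match = BROKEN.search(text).group(0)

# Consuming the whole address as one token avoids the problem. This
# simple URL pattern is an assumption, not a complete URL grammar.
URL = re.compile(r"(?:https?://|www\.)\S+")
whole_url = URL.search(text).group(0)
```

Here `broken_match` is `://www.example`, while `whole_url` covers the complete address.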
#61
|
||||
|
||||
Are you sure you just want to ignore the http or https component of the protocol?
You would be better off with code, as jiaz indicated. I would personally recommend a bash-like script, where you can run multiple regular expressions one after another (unlike most text/word processors). For example: cat \file\text | grep patternexpression1 | grep patternexpression2. This allows you to pre-filter the text and then apply additional patterns to find what you want. You can even write the findings to files and parse them multiple times if you require different outcomes. raztoki
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :] |
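The multi-stage idea can be sketched in Python instead of a shell pipeline (the choice of Python, the simple URL pattern, and the sample text are all assumptions). Stage 1 removes links; stage 2 applies the sentence pattern from post #45 to what is left:

```python
import re

URL = re.compile(r"(?:https?://|www\.)\S+")   # assumed, simplified URL pattern
SENTENCE = re.compile(r"[^.!?0-9]+[.!?]")     # pattern from post #45


def count_sentences(text: str) -> int:
    """Stage 1 removes links, stage 2 counts sentences in what is left."""
    without_links = URL.sub(" ", text)
    return len(SENTENCE.findall(without_links))


# Hypothetical sample: three sentences and one link full of . and ?
sample = (
    "Pierwsze zdanie. Drugie zdanie? "
    "https://www.example.com/a.b?c=d Trzecie zdanie!"
)

naive = len(SENTENCE.findall(sample))   # link fragments are counted too
filtered = count_sentences(sample)      # only the real sentences remain
```

Splitting the work this way keeps each pattern small, which is the main advantage over one combined expression.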
#62
|
|||
|
|||
The pattern is wrong because it ignores only the protocol; it must be changed to exclude the entire address.
|
#63
|
||||
|
||||
Yes, I gathered that, hence my question. Since you know the answer, you should be able to fix the expression.
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :] |
#64
|
|||
|
|||
Quote:
(?!(http|https?://|**External links are only visible to Support Staff**ftp://|www\.|[^\s:=]+@www\.).*?[a-z_\/0-9\-\#=&])(?=(\.|,|;|\?|\!)?("|'|«|»|\[|\s|\r|\n|$))[^.!?0-9]+[.!?] |
#65
|
||||
|
||||
You are on the right track with encasing, but you have now introduced more issues. Anyway, I'm not providing you with any assistance with regular expressions. I'm glad you're learning, though!
__________________
raztoki @ jDownloader reporter/developer http://svn.jdownloader.org/users/170 Don't fight the system, use it to your advantage. :] |
#66
|
|||
|
|||
(?<!")😊(?!")|(?<!"(?=😊"))😊|😊(?!"(?<="😊"))|(?<!")😊|😊(?!")|😊(?!"(?:(?:[^"]*"){2})*[^"]*)|(?:"😊".*?)*\k😊|(?:(?>{[^}]*?})[^{}]*?)*\k😊
|
#67
|
|||
|
|||
As far as I know, almost the same question has been asked many times on the programming forum stackoverflow.com.
But there is no good solution. "Nothing you do will be perfect." The goal is to reduce the error rate as much as possible: run the program on a large set of texts and add exceptions until you reach an acceptable level of error. However, if you need more than dozens of rules, you'll probably just want to rethink the problem.
Step1: Find sentences that end with one of the allowed characters .!?
Example sentences:
!
Code:
Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie!
.
Code:
Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie.
?
Code:
Gdy patrzę na świat, to jest tak piękne i straszne w tym samym czasie?
Step2: A dot does NOT end a sentence when it follows a number at the beginning of a line:
0. (ANY NUMBER + DOT)
5. (ANY NUMBER + DOT)
156. (ANY NUMBER + DOT)
This applies only at the beginning of the line; everywhere else it is acceptable.
Step3: All languages of the world are allowed, except for Russian.
Step4: Add a search exception for any links (URLs). Ignore them completely.
Step5: Allow sentence detection when a sentence ends with "three dots", "three exclamation marks" or "three question marks" and the next one begins with a capital letter.
Example:
Code:
Jestem w innym świecie... W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
Code:
Jestem w innym świecie!!! W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie.
Code:
Jestem w innym świecie??? W świecie o innej kulturze, języku, tradycjach, architekturze, przyrodzie, kuchni, pogodzie. |
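The five steps above can be sketched as a small Python function rather than a single expression. This is only one possible reading of the requirements, with several assumptions: Python's `re` module is used, "Russian" is approximated as "any line containing a Cyrillic letter", the URL pattern is simplified, and the capital-letter check from Step5 is left out. The sample text is shortened and partly made up:

```python
import re

URL = re.compile(r"(?:https?://|www\.)\S+")       # Step4 (simplified)
NUMBERING = re.compile(r"^\s*\d+\.\s*")           # Step2
CYRILLIC = re.compile(r"[\u0400-\u04FF]")         # Step3 (approximation)
# Steps 1 and 5: text up to "...", "!!!", "???" or a single . ! ?
SENTENCE = re.compile(r"[^.!?]+(?:\.{3}|!{3}|\?{3}|[.!?])")


def find_sentences(text: str) -> list[str]:
    """Return the sentences found line by line, following Steps 1-5."""
    sentences = []
    for line in text.splitlines():
        if CYRILLIC.search(line):          # Step3: skip Cyrillic lines
            continue
        line = URL.sub(" ", line)          # Step4: drop links entirely
        line = NUMBERING.sub("", line)     # Step2: drop leading "5. "
        sentences.extend(m.group(0).strip() for m in SENTENCE.finditer(line))
    return sentences


# Hypothetical sample input.
sample = "\n".join([
    "5. Jestem w innym świecie... W świecie o innej kulturze.",
    "Zobacz https://www.example.com/a.b?c=d teraz!",
    "Это предложение пропускается.",
])
found = find_sentences(sample)
```

Each rule stays a separate, testable line, so adding an exception later does not require rewriting the whole pattern.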
#68
|
|||
|
|||
Quote:
Code:
^(?!\d+\.).*[.!?]$
----------------------------------
At the moment I only have this workaround, and the two expressions work only separately. I still need a solution in which a sentence with numbering at the beginning of the line is treated as one whole sentence. With or without numbering, the word "KONIEC" marks the end of the text and is followed by the separator "=".
Code:
^(?!\d+\.)|(?!KONIEC).*[.!?]$ |
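One likely problem with the second pattern is operator precedence: `|` binds more loosely than everything else, so the first branch is just `^(?!\d+\.)`, which matches the empty string at the start of any line that does not begin with a number and a dot. The sketch below, in Python's `re` module (an assumption), compares it with a version that puts both exclusions into one lookahead. That grouped version assumes the intent is to skip both numbered lines and the KONIEC marker; the sample lines are made up:

```python
import re

# Pattern from post #68, unchanged.
ORIGINAL = re.compile(r"^(?!\d+\.)|(?!KONIEC).*[.!?]$")

# Both exclusions inside one lookahead, so there is a single branch.
GROUPED = re.compile(r"^(?!\d+\.|KONIEC).*[.!?]$")


def first_match(pattern, line):
    """Return the matched text, or None if the pattern does not match."""
    m = pattern.search(line)
    return None if m is None else m.group(0)


# Hypothetical sample lines.
lines = ["Zwykłe zdanie.", "5. Punkt listy.", "KONIEC.", "Bez kropki"]

original_hits = [first_match(ORIGINAL, line) for line in lines]
grouped_hits = [first_match(GROUPED, line) is not None for line in lines]
```

With the original pattern, three of the four lines produce an empty match and the numbered line is matched in full, which is the opposite of the workaround's purpose.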
#69
|
|||
|
|||
Update:
New pattern to ignore spaces before the selection. Basically, a sentence ends with a ".?!" OR a sentence begins a line with a number + "." and ends with ".?!" Screenshot: https://postimg.cc/hzzVcNh2 The regular expression still does not ignore the characters in links. That error remains unsolved. |
#70
|
|||
|
|||
@Jiaz - Can you correct the pattern so that it does not detect the characters
. ! ? in links?
Spoiler:
(\S+\.(com|net|org|edu|gov|ru|pl)(\/\S+)?)|((^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!]))
|
#71
|
||||
|
||||
^\d+\..*? -> .*? -> you allow everything
^\d+\.[^\.!\?]*?
[^\s].*? -> .*? -> you allow everything
[^\s\.!\?]*?
__________________
JD-Dev & Server-Admin |
#72
|
|||
|
|||
Quote:
(^\d+\..*?|.*?)(\.\.\.|[\.?!]).
Basically, a sentence ends with a ".?!" OR a sentence begins a line with a number + "." and ends with ".?!"
-or-
This works better: (^\d+\..*?|[^\s].*?)(\.\.\.|[\.?!]) to ignore spaces before the selection.
I've tested it. Unfortunately, this pattern is incorrect because it matches links as sentences. See screenshot: https://i.postimg.cc/SRqZRvs9/Screen...t-02-37-PM.jpg |
#73
|
||||
|
||||
Please understand that I simply don't have the time to help you with your pattern. If you want to do this all within a single pattern, then you have to learn it and learn to write more complex patterns, like
emailregex.com
__________________
JD-Dev & Server-Admin |
#74
|
|||
|
|||
Anyway, thanks for partial help.
This question was surprisingly difficult to find an answer for. The regexes I found were too complicated to understand, and anything more than a regex is overkill and too difficult to implement. |
#75
|
|||
|
|||
For a filter, e.g. on a name or package, which regex engine is used? Because none of the ready-made "E-MAIL" patterns is correct :D
|
#76
|
||||
|
||||
normal pattern/regex. no *engine*.
__________________
JD-Dev & Server-Admin |
#77
|
|||
|
|||
How do I reverse the order of numbers separated by a comma?
Example: 01,03 -> 03,01 |
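For the example given, two small Python sketches (Python is an assumption; the function names are made up for illustration). The first swaps a pair with capture groups and backreferences, which is the regex approach. The second reverses a list of any length without a regex:

```python
import re


def reverse_pair(text: str) -> str:
    """Swap the two numbers of every 'NN,NN' pair using backreferences."""
    return re.sub(r"(\d+),(\d+)", r"\2,\1", text)


def reverse_list(text: str) -> str:
    """Reverse a comma-separated list of any length without a regex."""
    return ",".join(reversed(text.split(",")))


swapped = reverse_pair("01,03")
reversed_three = reverse_list("01,03,07")
```

In a search-and-replace dialog the same idea is usually written as search `(\d+),(\d+)` with a replacement such as `\2,\1` or `$2,$1`, depending on the tool.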
#78
|
||||
|
||||
There comes a time you should learn some sort of coding language and not try to achieve everything with regex.
__________________
JD-Dev & Server-Admin |
#79
|
|||
|
|||
The net is full of solutions for Unix, but I do not see anything for Windows. Besides, these scripts and commands also take time: writing the patterns costs as much effort as making the change manually. I do not see how they would make the swap easier. To convert 20 numbers I would have to spend several minutes each time, and doing it manually is a lot faster.
|
#80
|
||||
|
||||
Ever thought about regex NOT being the answer for all of your *stuff*?
__________________
JD-Dev & Server-Admin |
|
|