How to refactor regex in Perl


I have the following sentences:

    text <mir-1> ggg-33 <exp-v-3> text text <vaccvirus-prop-1> other.
    text <mir-1> text <assc-phrase-1> text <vaccvirus-prop-1> other <pattern-1> other.

What I want is to create a single regular expression (regex) that can match both sentences above. Note that the only differing pattern in the sentences is the middle factor: <exp-v-3> versus <assc-phrase-1>.

I'm stuck with my current attempt, which matches them with two redundant regexes. What's the right way to do it?

    use Data::Dumper;

    my @sent = (
        "text <mir-1> ggg-33 <exp-v-3> text text <vaccvirus-prop-1> other.",
        " text <mir-1> text <assc-phrase-1> text <vaccvirus-prop-1> other <pattern-1> other.",
    );

    foreach my $sent (@sent) {
        # First regex: middle tag is <exp-v-N>
        if ( $sent =~ /.*<mir-\d+>.*<exp-v-\d+>.*<vaccvirus-prop-\d+>.*/gi ) {
            print "$sent\n";
        }
        # Second, nearly identical regex: middle tag is <assc-phrase-N>
        elsif ( $sent =~ /.*<mir-\d+>.*<assc-phrase-\d+>.*<vaccvirus-prop-\d+>/gi ) {
            print "$sent\n";
        }
    }

live demo

(?:xxx|yyy)\s*<mir-1>\s*(?:xxx|yyy)\s*(?:<exp-v-3>|<assc-phrase-1>)\s*(?:xxxx|yyy)\s*<vaccvirus-prop-1>

Maybe this regexp is not optimized, but it works.

OK, here is the explanation.

First magic:

(?:expr) - a non-capturing group: it groups like (expr) but captures nothing. The <?:> is what turns capturing off.
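A minimal sketch of the difference between a capturing and a non-capturing group. The sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = 'text <mir-1> more';

# Capturing groups: both the tag name and the number are captured.
my @captured = $text =~ /<(mir)-(\d+)>/;

# (?:mir) still groups the tag name but captures nothing,
# so the number is the only capture left.
my @grouped = $text =~ /<(?:mir)-(\d+)>/;

print "with (mir):   ", scalar(@captured), " capture(s): @captured\n";
print "with (?:mir): ", scalar(@grouped),  " capture(s): @grouped\n";
```

Use (?:...) whenever the parentheses are only there for grouping or alternation and the matched text is not needed afterwards.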

Second magic:

(a|b|c) - the alternation metacharacter <|> at work: it chooses between <a>, <b>, or <c>.
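As a small illustration, alternation can be tried on the tags from the question alone. The variable name $middle is an assumption, and <pattern-1> serves as a tag that should not match.

```perl
use strict;
use warnings;

# Either middle tag is accepted; the group is non-capturing.
my $middle = qr/<(?:exp-v|assc-phrase)-\d+>/;

for my $tag ('<exp-v-3>', '<assc-phrase-1>', '<pattern-1>') {
    printf "%-16s %s\n", $tag, ($tag =~ $middle ? 'match' : 'no match');
}
```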

Third magic:

Here it is working on Rubular.

Generalization:

.+?\s*<mir-\d+>\s*.+?\s*(?:<exp-v-\d+>|<assc-phrase-\d+>)\s*.+?\s*<vaccvirus-prop-\d+>.+ 

And an example:

It works on Rubular too.
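For completeness, here is a sketch of how the generalized pattern could replace the two regexes in the question's loop. The sentences are the ones from the question; the names $re and @matched are assumptions.

```perl
use strict;
use warnings;

my @sent = (
    "text <mir-1> ggg-33 <exp-v-3> text text <vaccvirus-prop-1> other.",
    " text <mir-1> text <assc-phrase-1> text <vaccvirus-prop-1> other <pattern-1> other.",
);

# One pattern for both sentences: only the middle tag alternates.
my $re = qr/.+?\s*<mir-\d+>\s*.+?\s*(?:<exp-v-\d+>|<assc-phrase-\d+>)\s*.+?\s*<vaccvirus-prop-\d+>.+/i;

# Keep every sentence the single regex accepts.
my @matched = grep { $_ =~ $re } @sent;
print "$_\n" for @matched;
```

Compiling the pattern once with qr// keeps the loop body short and avoids repeating the shared parts of the expression.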

To reject a string (here, one with <[> before the middle tag or <]> after it):

.+?\s*<mir-\d+>\s*[^\[]+?\s*(?:<exp-v-\d+>|<assc-phrase-\d+>)\s*[^\]]+?\s*<vaccvirus-prop-\d+>.+ 
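A sketch of the rejecting pattern in use. The string in $bracketed is a made-up variant of the first sentence with a <[> inserted before the middle tag; it is not from the question.

```perl
use strict;
use warnings;

# The gaps around the middle tag may not contain "[" (first gap)
# or "]" (second gap).
my $re = qr/.+?\s*<mir-\d+>\s*[^\[]+?\s*(?:<exp-v-\d+>|<assc-phrase-\d+>)\s*[^\]]+?\s*<vaccvirus-prop-\d+>.+/;

my $plain     = 'text <mir-1> ggg-33 <exp-v-3> text text <vaccvirus-prop-1> other.';
my $bracketed = 'text <mir-1> gg[g-33 <exp-v-3> text text <vaccvirus-prop-1> other.';  # hypothetical input

for my $s ($plain, $bracketed) {
    print(($s =~ $re ? 'accepted: ' : 'rejected: '), $s, "\n");
}
```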

Fourth magic:

[^symbols] - a character class. The <^> at the beginning means 'I don't want to match these'.

Here is an example:

[abc]{1} - matches <a> or <b> or <c>
[^abc]{1} - matches any single character except <a>, <b>, or <c>
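The two classes can be compared side by side with a short sketch. The one-letter test strings are arbitrary, and the patterns are anchored so that each string is tested as a whole.

```perl
use strict;
use warnings;

my @letters = qw(a b c d);

# [abc] accepts exactly one of the listed characters.
my @hits = grep { /^[abc]$/ } @letters;

# [^abc] accepts exactly one character that is NOT listed.
my @others = grep { /^[^abc]$/ } @letters;

print "[abc]  matched: @hits\n";
print "[^abc] matched: @others\n";
```

The {1} quantifier in the patterns above is optional, since a character class already matches exactly one character.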

It works on Rubular again.

