05-19-2010 10:08 AM
Hi,
I have a simple string matching need and by experimenting found that the "Match Regular Expression" and "Match Pattern" vi's behave somewhat differently. I'd assume that the regular expression inputs on both would behave the same. A difference I've discovered is that the "|" character (the "vertical bar" character, commonly used as an "or" operator) is recognized as such in the Match Regular Expression vi, but not in the Match Pattern vi (where it is taken literally). Furthermore, I cannot find any documentation in Help (on-line or in LabVIEW) about the "|" character usage in regular expressions. Is this documented anywhere?
For example, suppose I want to match any of the following 4 words: "The" or "quick" or "brown" or "fox". The regular expression "The|quick|brown|fox" (without the quotes) works for the Match Regular Expression vi but not the Match Pattern vi. Below is a picture of the block diagram and the front panel results:
The Help says that the Match Regular Expression vi performs somewhat slower than the Match Pattern vi, so I started with the latter. But since it doesn't work for me, I'll use the former. But does anyone have any idea of the speed difference? I'd assume it is negligible in such a simple example.
Thanks!
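For readers without LabVIEW at hand, the difference described above can be sketched in Python. This is only an analogy: Python's `re` module stands in for a PCRE-style engine such as Match Regular Expression, and a plain substring search stands in for an engine that takes "|" literally. The function names are made up for illustration.

```python
import re

# Python's re module is used only as a stand-in for a PCRE-style engine.
PATTERN = re.compile(r"The|quick|brown|fox")

def first_match(text):
    """Return (offset, matched_text) for the first alternative found, or None."""
    m = PATTERN.search(text)
    return (m.start(), m.group(0)) if m else None

def literal_bar_match(text):
    """Search for the pattern text literally, as an engine without
    alternation support would; returns the offset or -1."""
    return text.find("The|quick|brown|fox")

if __name__ == "__main__":
    sample = "A quick brown fox"
    print(first_match(sample))        # alternation: one of the words is found
    print(literal_bar_match(sample))  # literal "|": nothing is found
```

With alternation, any one of the four words matches; with the bar taken literally, the input would have to contain the exact text `The|quick|brown|fox`.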
05-19-2010 10:28 AM
Speaking only about execution time, I've done a simple test:
So for 1E6 iterations:
Match Regular Expression: about 3400 ms
Match Pattern vi: about 60 ms
Marco
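The shape of such a timing test can be sketched in Python. This is not the LabVIEW benchmark from the post: it assumes Python's `re` as a stand-in for the full regular expression engine and `str.find` as a stand-in for a simpler matcher, so the absolute numbers will differ from the ones above.

```python
import re
import timeit

TEXT = "The quick brown fox jumps over the lazy dog"
ALTERNATION = re.compile(r"quick|brown|fox")

def regex_search():
    """Full regular expression search (stand-in for the slower function)."""
    return ALTERNATION.search(TEXT)

def plain_search():
    """Simple literal search (stand-in for the faster function)."""
    return TEXT.find("quick")

def time_ms(func, iterations=100_000):
    """Total wall-clock time in milliseconds for `iterations` calls."""
    return timeit.timeit(func, number=iterations) * 1000.0

if __name__ == "__main__":
    print(f"regex search: {time_ms(regex_search):.1f} ms")
    print(f"plain search: {time_ms(plain_search):.1f} ms")
```

Timing many iterations and reporting the total, as done here, averages out per-call jitter.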
05-19-2010 10:36 AM
Yep-
You hit a point that's frustrated me a time or two as well (and incidentally, caused some hair-pulling that I can ill afford)
The hint is in the help file:
For Match Regular Expression: "The Match Regular Expression function gives you more options for matching strings but performs more slowly than the Match Pattern function....Use regular expressions in this function to refine searches...."
Characters to Find | Regular Expression |
---|---|
VOLTS | VOLTS |
A plus sign or a minus sign | [+-] |
A sequence of one or more digits | [0-9]+ |
Zero or more spaces | \s* or * (that is, a space followed by an asterisk) |
One or more spaces, tabs, new lines, or carriage returns | [\t \r \n \s]+ |
One or more characters other than digits | [^0-9]+ |
The word Level only if it appears at the beginning of the string | ^Level |
The word Volts only if it appears at the end of the string | Volts$ |
The longest string within parentheses | \(.*\) |
The first string within parentheses but not containing any parentheses within it | \([^()]*\) |
A left bracket | \[ |
A right bracket | \] |
cat, cag, cot, cog, dat, dag, dot, and dog | [cd][ao][tg] |
cat or dog | cat|dog |
dog, cat dog, cat cat dog,cat cat cat dog, and so on | ((cat )*dog) |
One or more of the letter a followed by a space and the same number of the letter a, that is, a a, aa aa, aaa aaa, and so on | (a+) \1 |
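A few rows of this table can be tried out in Python as a rough check. Python's `re` is not the PCRE library that Match Regular Expression uses, but as far as I know these basic constructs (character classes, alternation, grouping, backreferences) behave the same way in both. The helper `found` is a made-up name.

```python
import re

def found(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

if __name__ == "__main__":
    print(found(r"[cd][ao][tg]", "my dog"))        # character classes
    print(found(r"cat|dog", "hot dog"))            # alternation
    print(found(r"((cat )*dog)", "cat cat dog"))   # grouping with repetition
    print(found(r"(a+) \1", "aa aa"))              # backreference
    print(found(r"\(.*\)", "f(a)(b)"))             # longest parenthesized run
    print(found(r"\([^()]*\)", "f(a)(b)"))         # first run without nesting
```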
For Match Pattern: "This function is similar to the Search and Replace Pattern VI. The Match Pattern function gives you fewer options for matching strings but performs more quickly than the Match Regular Expression function. For example, the Match Pattern function does not support the parenthesis or vertical bar (|) characters."
Characters to Find | Regular Expression |
---|---|
VOLTS | VOLTS |
All uppercase and lowercase versions of volts, that is, VOLTS, Volts, volts, and so on | [Vv][Oo][Ll][Tt][Ss] |
A space, a plus sign, or a minus sign | [ +-] |
A sequence of one or more digits | [0-9]+ |
Zero or more spaces | \s* or * (that is, a space followed by an asterisk) |
One or more spaces, tabs, new lines, or carriage returns | [\t \r \n \s]+ |
One or more characters other than digits | [~0-9]+ |
The word Level only if it begins at the offset position in the string | ^Level |
The word Volts only if it appears at the end of the string | Volts$ |
The longest string within parentheses | (.*) |
The longest string within parentheses but not containing any parentheses within it | ([~()]*) |
A left bracket | \[ |
A right bracket | \] |
cat, dog, cot, dot, cog, and so on. | [cd][ao][tg] |
Frustrating, but still manageable.
05-19-2010 10:52 AM
Thanks, Marco. The execution time question in my post was an afterthought, so I didn't try it. So it appears that Match Pattern is about 60 times faster! I wouldn't have suspected that much. But 3400 ms / 1E6 iterations is only 3.4 µs per call, so it is negligible in my case. If it were used in a loop like your example, though, it could be significant!
-Ed
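The arithmetic behind those figures, written out as a small Python check. The inputs are the totals Marco reported for 1E6 iterations; the variable and function names are just for illustration.

```python
ITERATIONS = 1_000_000      # 1E6 iterations, as in the test above
REGEX_TOTAL_MS = 3400.0     # reported total for Match Regular Expression
PATTERN_TOTAL_MS = 60.0     # reported total for Match Pattern

def per_call_us(total_ms, iterations):
    """Convert a total time in milliseconds to microseconds per call."""
    return total_ms * 1000.0 / iterations

regex_us = per_call_us(REGEX_TOTAL_MS, ITERATIONS)      # about 3.4 us per call
pattern_us = per_call_us(PATTERN_TOTAL_MS, ITERATIONS)  # about 0.06 us per call
speed_ratio = REGEX_TOTAL_MS / PATTERN_TOTAL_MS         # roughly 57

if __name__ == "__main__":
    print(f"Match Regular Expression: {regex_us:.2f} us per call")
    print(f"Match Pattern:            {pattern_us:.2f} us per call")
    print(f"Ratio:                    {speed_ratio:.0f}x")
```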
05-19-2010 11:09 AM - edited 05-19-2010 11:11 AM
Thanks, Jeff. That's what I was looking for. BUT my version of LabVIEW, 8.5, does NOT say "For example, the Match Pattern function does not support the parenthesis or vertical bar (|) characters."!
See: http://zone.ni.com/reference/en-XX/help/371361D-01/glang/match_pattern/
and http://zone.ni.com/reference/en-XX/help/371361D-01/glang/match_regular_expression/
Nor is it mentioned in the Special Characters for Match Pattern help: http://zone.ni.com/reference/en-XX/help/371361D-01/lvhowto/specialcharformatchpatt/
The only place | was "mentioned" is in the sentence: "Certain regular expressions that use alternation (such as (.|\s)*) require significant resources to process when applied to large input strings." But I am not processing a large string.
It looks like NI fixed this omission. What version is your help from?
Ed
05-19-2010 11:34 AM
I searched and found that LabVIEW version 8.6 help has this correction.
Ed
02-21-2017 01:16 PM
Hello my friend, how do you use | like The|quick|brown|fox in Match Pattern function?
02-21-2017 01:22 PM
7 years ago this thread was started, and unless you are here to continue the discussion on the differences between Match Regular Expression and Match Pattern, I suggest making your own thread.
02-22-2017 08:32 AM - edited 02-22-2017 08:33 AM
Match Pattern does not support the alternation option of the regular expression grammar. Historically, Match Pattern was introduced with one of the first versions of LabVIEW, and it implemented a simplified version of regular expression matching. There were various forms of regular expression syntax back then, and the LabVIEW developers chose to implement one that was fairly powerful but not too complicated to implement.
Since then, PCRE (Perl Compatible Regular Expressions) has more or less become the de facto standard for regular expression implementations, which is why NI eventually added the Match Regular Expression function. It uses the PCRE library to provide a fully featured regular expression parser.
Changing the Match Pattern function to support the full PCRE syntax was not an option, however, since it uses an incompatible regular expression syntax and doing so would have broken many existing LabVIEW programs. The simpler regular expression syntax of Match Pattern also results in a significant performance difference, which is another reason to keep both functions in LabVIEW.
Use Match Pattern if its regular expression syntax supports your use case, and Match Regular Expression if you need the additional features of that function. And if you despise learning both (which IMHO is a completely impossible thing to do anyhow for the full PCRE syntax), simply use Match Regular Expression only.
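If you want to stay with a matcher that has no alternation, one possible workaround is to search for each alternative separately and keep the earliest hit. The Python sketch below shows the idea with plain literal searches; it is not LabVIEW code. In LabVIEW this would presumably mean calling Match Pattern once per word in a loop, which is my assumption rather than something documented in this thread. `match_any_literal` is a made-up name.

```python
def match_any_literal(text, alternatives, offset=0):
    """Return (start, matched) for the earliest literal alternative found
    at or after `offset`, or None if none of them occurs.

    On a tie, the alternative listed first wins, similar to how regex
    alternation prefers the leftmost alternative."""
    best = None
    for alt in alternatives:
        pos = text.find(alt, offset)
        if pos == -1:
            continue  # this alternative does not occur
        if best is None or pos < best[0]:
            best = (pos, alt)
    return best

if __name__ == "__main__":
    words = ["The", "quick", "brown", "fox"]
    print(match_any_literal("A quick brown fox", words))
```

Whether this is faster than a single Match Regular Expression call depends on the number of alternatives, so it is worth timing both before committing to one.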