07-19-2011 11:11 PM - edited 07-19-2011 11:13 PM
Match Pattern.vi appears to need some hand-holding: it will only find a match when you tell it exactly where to look. The icon implies that the VI can match 'bb' from 'abbc' when given 'b*' as the input regular expression, but it won't unless you tell it to skip the 'a'.
This doesn't appear to be a previous discussion topic or KB entry, nor do I think this is a bug since it's pretty obvious. The LabVIEW help says that this VI "Searches for regular expression in string beginning at offset, and if it finds a match, splits string into three substrings." which would characterize the behavior I'm seeing. So is the icon incorrect, or am I using the VI incorrectly?
I'm using LabVIEW 2010 (without the service pack update).
07-20-2011 05:06 AM
b+ does what the icon says b* should do. What confuses me is that empty strings are returned as 'before substring' and 'match string'.
07-20-2011 05:51 AM - edited 07-20-2011 05:56 AM
I'd say that the function is correct.
The help for the function implies that b* means to match *ZERO* or more instances of 'b'.
With no offset, the checking begins at 'a'. Immediately it has matched zero occurrences of 'b'. There was nothing before the match; it matched an empty string, containing zero occurrences of 'b', followed by 'abbc'.
Offset=1; Checking begins at the first 'b'. It matches on 'bb' (The longest possible match). Thus we have 'a' before a match of 'bb' and 'c' after.
Offset=2: Checking begins at the second 'b'. It matches on the second 'b', which is followed by 'c'. Thus 'ab' before, a match of 'b' and 'c' after.
Offset=3: Checking begins at the 'c'. A match of zero 'b's is immediately found. 'abb' before, match on empty string, followed by 'c'
However as jcarmody says, the match pattern you want is b+. This will look for *ONE* or more 'b' and will match 'bb' if you start the search at either the 'a' or the first 'b'
Rod.
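For readers without LabVIEW at hand, the offset walk-through above can be reproduced with a rough sketch in Python. This is only an analogy: `match_pattern` is a hypothetical helper built on Python's `re` module, which stands in for the VI and does not use LabVIEW's own pattern syntax. Returning `None` when nothing matches is likewise an assumption made for the sketch.

```python
import re

def match_pattern(pattern, text, offset=0):
    """Return (before, match, after) for the first match at or after offset.

    Hypothetical helper: it imitates the three string outputs discussed in
    this thread with Python's re module, not LabVIEW's Match Pattern itself.
    """
    m = re.compile(pattern).search(text, offset)
    if m is None:
        return None  # assumption: signal "no match" with None
    return text[:m.start()], m.group(), text[m.end():]

# Walk the same offsets as the explanation above.
for offset in range(4):
    print(offset, match_pattern("b*", "abbc", offset))

# b+ needs at least one 'b', so it skips the 'a' on its own.
print("b+", match_pattern("b+", "abbc"))
```

With 'b*' the search at offset 0 succeeds immediately on an empty string, while 'b+' at offset 0 yields 'a', 'bb', 'c', which is the result the icon suggests.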
07-20-2011 09:50 PM
Rod wrote:
With no offset, the checking begins at 'a'. Immediately it has matched zero occurrences of 'b'. There was nothing before the match; it matched an empty string, containing zero occurrences of 'b', followed by 'abbc'.
Excellent explanation. Thanks for the detail as well. I was expecting egrep style behavior, where the search scans the whole line:
$ echo 'abbc' | egrep -o 'b*'
bb
$ echo 'abbc' | egrep -o 'b+'
bb
Good to know there's a difference!
07-20-2011 10:25 PM
Should the icon of the function be changed?
b* looks right because in most basic search functions, * means wildcard.
But if b* does not give the outputs that the icon of the function implies, but b+ does, then shouldn't the icon be redrawn to show b+?
07-21-2011 07:58 AM
The difference in the results is because "match pattern" looks for (and reports) a single match, starting from the offset supplied (and as we have already seen, the reported match could be the empty string), whereas egrep will typically scan an entire file reporting every matched line (and with -o every match in every line). In effect, that amounts to many invocations of "match pattern" with differing offsets. As an example, consider:
echo abbcabbbc | egrep -o "b*"
Both bb and bbb will be reported as matches. One presumes that zero length matches are also found, but that these are not reported.
Rod.
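The "many invocations with differing offsets" idea can be sketched as a loop. As a hedged illustration only, the Python below uses the `re` module in place of Match Pattern, and `all_nonempty_matches` is a hypothetical name. It repeats a single-match search from an advancing offset and drops zero-length matches, which is how the egrep -o output above is presumed to come about.

```python
import re

def all_nonempty_matches(pattern, text):
    """Collect every non-empty match by re-searching from a moving offset.

    Hypothetical helper: each loop pass plays the role of one single-match
    search started at a new offset.
    """
    regex = re.compile(pattern)
    matches = []
    offset = 0
    while offset <= len(text):
        m = regex.search(text, offset)
        if m is None:
            break
        if m.end() > m.start():
            # Real match: record it and resume just past it.
            matches.append(m.group())
            offset = m.end()
        else:
            # Zero-length match: report nothing, step one character on.
            offset = m.end() + 1
    return matches

print(all_nonempty_matches("b*", "abbcabbbc"))
```

Stepping one character past each empty match is what keeps the loop from stalling at the 'a' and 'c' positions, where 'b*' always succeeds with nothing.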
07-30-2011 02:20 PM
Ravens Fan wrote:
Should the icon of the function be changed?
b* looks right because in most basic search functions, * means wildcard.
But if b* does not give the outputs that the icon of the function implies, but b+ does, then shouldn't the icon be redrawn to show b+?
I agree with you. The 'out-of-the-box' (or maybe out-of-the-palette) experience was surprising, especially since the VI icon is a little misleading. If you wire it the way it's drawn, it doesn't do what it claims to do.
On the other hand, people less familiar with regular expressions and more comfortable with terminal wildcards might see a 'b+' as confusing. The 'b*' communicates the intent but not the perfect use 😉
Rod wrote:
The difference in the results is because "match pattern" looks for (and reports) a single match, starting from the offset supplied (and as we have already seen, the reported match could be the empty string), whereas egrep will typically scan an entire file reporting every matched line (and with -o every match in every line). In effect, that amounts to many invocations of "match pattern" with differing offsets. As an example, consider:
echo abbcabbbc | egrep -o "b*"
Both bb and bbb will be reported as matches. One presumes that zero length matches are also found, but that these are not reported.
Another excellent characterization. And although Match Regular Expression VI is more suited to the task, here is Match Pattern in a VI that behaves more like egrep 🙂