LabVIEW

Match Pattern Special Characters Confusion

@rolfk wrote:

wiebe@CARYA wrote:

@bienieck wrote:

I switched to Match Pattern for simple patterns after watching some presentation from Darren N. Allegedly this function is more efficient.


For the CPU: probably.

For the programmer: not always.

 

A huge performance difference is caused by the difference in resulting string: Match Pattern returns SubStrings, Match Regular Expression returns Strings... Seems to me Match Regular Expression (an XNode) could be changed so it returns SubStrings too...


Not likely. That would probably require the entire algorithm to be implemented in the LabVIEW manager layer, not imported as external code from another, non-LabVIEW-aware project (PCRE). But a reimplementation of the PCRE package in the LabVIEW C manager is extremely unlikely. Not only would it be an immense effort, likely with a lot of corner cases incompatible with the original PCRE quasi-standard, but it would also make it extremely awkward to backport any PCRE improvements into that code. The C code dealing with subarrays has to be fully aware of the LabVIEW data type model, and that is not something you can force into any existing code.


The PCRE dll returns indices.

 

As it is, these indices are used to copy from the original string, so SubStrings won't be used (the substrings are built into an array and indexed later on):

 

wiebeCARYA_1-1735296079337.png

 

It would be easy to prevent this and make the XNode return substrings to the original string.

 

The code would look silly, but the XNode would return SubStrings for capturing group results.
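As a rough text-language analogy (not LabVIEW code, and not how the XNode is actually built), the difference between copying by index and returning substrings can be sketched in Python. Here Python's `re` module stands in for the PCRE DLL, and the helper names `capture_spans`, `groups_as_copies` and `groups_as_views` are made up for this illustration.

```python
import re


def capture_spans(pattern: str, subject: bytes) -> list[tuple[int, int]]:
    """Return (start, end) offsets for the whole match and each capturing group."""
    match = re.search(pattern.encode(), subject)
    if match is None:
        return []
    return [match.span(i) for i in range(match.re.groups + 1)]


def groups_as_copies(subject: bytes, spans):
    # Slicing bytes allocates a new object per group: one copy per result.
    return [subject[start:end] for start, end in spans]


def groups_as_views(subject: bytes, spans):
    # memoryview slices refer to the subject's buffer; nothing is copied
    # until a caller explicitly converts a view with bytes(view).
    view = memoryview(subject)
    return [view[start:end] for start, end in spans]
```

Both helpers start from the same index pairs; only the last step differs, which is the point being made about capturing group results.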

Message 21 of 29

wiebe@CARYA wrote:

It would be easy to prevent this and make the XNode return substrings to the original string.


There is AFAIK no functionality to create substrings on the diagram. There isn't even exported C manager functionality (accessible through the Call Library Node or in an external DLL) to do that. It's all deep in the various C++ core DLLs that are not accessible from the diagram or a LabVIEW DLL. So, "easy" is a very relative term. And it's not as easy as adding a new "Create Substring" node to the palettes, which is anything but easy in itself (which is why the PCRE is an XNode and not a built-in function). XNodes are a pain in the ass to create, but adding a new node in LabVIEW requires a project proposal of an entirely different scope.

 

Also, nodes accepting substrings as parameters need to have the corresponding flags on the relevant connectors, something which I believe is only possible on built-in nodes, not subVIs, which XNodes ultimately are too. If a connector doesn't have that flag, the LabVIEW compiler will promote any substring or subarray to a real string or array before calling that node.

 

Basically, allowing substrings to be used for something like the PCRE node would require a lot of work: either adding new functionality to support substrings and subarrays at the diagram level (a huge undertaking and very unlikely) or converting the PCRE node into a built-in node (slightly more likely).

Rolf Kalbermatter
Message 22 of 29

I feel that it's also worth noting that Match Pattern is completely agnostic about embedded null chars, whereas Match Regular Expression will totally choke on inputs which contain nulls, since it's a C string implementation.

 

If memory serves, you may get a compile-time error if the pattern string is a BD constant containing a null, but other than that, null char problems are reported at runtime.
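For readers less familiar with the C-string issue being described, here is a small Python sketch using `ctypes`. It only illustrates the general difference between NUL-terminated and length-counted strings; it makes no claim about how LabVIEW or PCRE handle their buffers internally, and `compare_views` is a name invented for this example.

```python
import ctypes


def compare_views(data: bytes) -> tuple[bytes, bytes]:
    """Return (what NUL-terminated code sees, what length-counted code sees)."""
    buf = ctypes.create_string_buffer(data)  # copies data, appends one NUL
    c_view = buf.value               # reading stops at the first NUL byte
    counted = buf.raw[:len(data)]    # explicit length keeps embedded NULs
    return c_view, counted
```

Any code path that relies on the terminator rather than on an explicit length will silently drop everything after the first embedded null.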

 

Dave

David Boyd
Sr. Test Engineer
Abbott Labs
(lapsed) Certified LabVIEW Developer
Message 23 of 29

^[^\s]+$ will match one or more characters that aren't whitespace. Since you have the ^ and $ boundaries, it also makes sure your entire line/string matches, so if it starts or ends with (or contains) a whitespace character ([\r\n\t\f\v ]) it will fail.
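A quick way to try the pattern outside LabVIEW is Python's `re` module, sketched below. This assumes Python's regex flavour is close enough to PCRE for this particular pattern; `is_single_token` is a hypothetical helper name.

```python
import re

# Same pattern as in the post: one or more non-whitespace characters,
# anchored at both ends.
NON_WS_ONLY = re.compile(r"^[^\s]+$")


def is_single_token(text: str) -> bool:
    """True if the pattern matches, i.e. text holds no whitespace."""
    return NON_WS_ONLY.search(text) is not None
```

One caveat worth checking in your own setup: in Python (and in PCRE's default mode) `$` also matches just before a single trailing newline, so a string like "abc\n" can still pass.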

Message 24 of 29

@DavidBoyd wrote:

I feel that it's also worth noting that Match Pattern is completely agnostic about embedded null chars, whereas Match Regular Expression will totally choke on inputs which contain nulls, since it's a C string implementation.

 

If memory serves, you may get a compile-time error if the pattern string is a BD constant containing a null, but other than that, null char problems are reported at runtime.

 

Dave


Seems to work fine in LV24 64 bit:

wiebeCARYA_0-1736161520242.png

returns "a", as expected.

 

wiebeCARYA_1-1736161637897.png

returns "abc\00\00", as expected.

 

Same with a control as input...

Same with a capturing group (e.g. "(a.*)$").

 

I wonder if this is a (recent) fix or if there are specific situations where it fails.

Message 25 of 29

@rolfk wrote:

There is AFAIK no functionality to create substrings on the diagram.


Get String Subset always returns a substring.

 

If an XNode (or .vim or inlined VI) uses Get String Subset there won't be string copies.

 

Same for subarrays. It works well and is easy.

 

*******EDIT***********

Here's an example of a .vim, that does something similar to MRE.

 

LV24, 64 bit.

 

A slow implementation (as MRE works currently)

wiebeCARYA_1-1736165401763.png

A faster implementation (as MRE could easily be made into):

wiebeCARYA_2-1736165434860.png

 

Tested like this:

wiebeCARYA_3-1736165555615.png

 

Slow.vi: 4.25 sec.

Faster.vi: 2.05 sec.

 

But in this test, LV needs the results of String 3 and String 4, so those will still be copied from their SubStrings to normal Strings.

 

If those outputs aren't connected

a) the Faster.vim will remove the String Subsets, in Slow.vim that's not possible. 

b) the two substrings won't need to be copied.

 

Testing like this:

wiebeCARYA_4-1736165743929.png

Slow.vi: 4.25 sec.

Faster.vi: 2 msec.

Message 26 of 29

wiebe,

 

Just checked this in my installation of LV2023Q1 32-bit, and you're right.  Seems to work fine.

 

I hate that I made a pronouncement based on prior behavior.  Now I'm going to be furiously looking through recent release/upgrade notes... this seems like it would have been a pretty substantial fix, and I'd love to know how they implemented it!

 

Dave

 

 

David Boyd
Message 27 of 29

wiebe,

 

Just checked this in my installation of LV2023Q1 32-bit, and you're right.  Seems to work fine.

 

I hate that I made a pronouncement based on prior behavior.  I just went looking through recent release/upgrade notes... this seems like it would have been a pretty substantial retooling, and I'd love to know how they implemented it... modify the open source PCRE library? pre-sanitize the inputs?  Release notes from 2019 forward don't seem to mention any changes to PCRE Match Regular Expression.

 

It would be fortunate if one of the NI developers chimed in here, but at this point it would really only serve to validate my memory of an earlier release.

 

Dave

 

 

David Boyd
Message 28 of 29

Well, there's a documentation bug to be fixed, at least.  The NI Offline Help Viewer entry for MRE (last updated 2022-06-08) still includes this note:

 

Note  The Match Regular Expression function does not support null characters in strings. If you include null characters in strings you wire to this function, LabVIEW returns an error and the function may return unexpected results.

David Boyd
Message 29 of 29