Text file reading in non-English environment.

labmaster · ‎12-01-2024

I am a user of LV2023 Q3 in a non-English Windows 11 environment.

I am encountering an unusual issue when trying to read a text file.

When reading the following text, the result is displayed incorrectly, with strange spaces included:

Time(h)
0 -54.935528 -55.27586 15.103133
0.000278 -54.93565 -55.276031 15.054389

However, when I read it in LV, the result appears as follows:

It seems that this issue may be related to non-English encoding settings or something similar. What should I check to resolve this?

labmaster.

tst · ‎12-01-2024

The file is likely saved with a Unicode encoding. It's not really possible to tell without the file itself, but that's usually what it looks like, and the stuff at the beginning is probably the BOM (byte order mark). If you open it in Notepad, you can see the encoding in the bottom right corner, and you can also change it in the save dialog. The easiest option is just to resave it.

If the file does have to be a Unicode file, you have three options: read all the bytes and then remove the extra bytes (this usually works if your text maps directly to ASCII), set LV to display Unicode (not officially supported), or convert the data to ASCII (search for Unicode on this forum and there should be some options).

___________________
Try to take over the world!

labmaster · ‎12-01-2024

Thank you for your response. This is the first time I’ve encountered an encoding issue with a txt file, and I had never seriously considered the encoding options in Notepad before. I believe it was a valuable learning experience.

The txt file is the output of another commercial program, so it may need to be converted. Is there a function in LabVIEW that can detect the encoding and convert the file to ASCII txt in bulk?

labmaster.

tst · ‎12-01-2024

I believe that there is no standard for checking the encoding and that you just have to apply some heuristics and check (e.g. look for the BOM at the beginning and if it's there, it's probably encoding X).

___________________
Try to take over the world!

rolfk · ‎12-02-2024

Generally you can read the data as binary and then check the first 3 bytes. https://en.wikipedia.org/wiki/Byte_order_mark

If the first 3 bytes are 0xEF 0xBB 0xBF you have 8-bit Unicode (UTF-8). Such a file is usually pretty readable in LabVIEW unless it contains characters beyond the 7-bit ASCII range. In that case each of those characters will show up as multiple strange codes.

If the first two bytes are 0xFE 0xFF you have Unicode 16-bit encoding with Big Endian byte order.

If it is however 0xFF 0xFE you have Unicode 16-bit encoding with Little Endian byte order.

There are other possible BOM markers but they are VERY uncommon in the wild.
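LabVIEW code is graphical, so as a text-only illustration here is a minimal Python sketch of the BOM check described above. The function name `detect_bom` and the table layout are made up for this example; only the three byte sequences come from the post. It is a heuristic, not a full encoding detector: a file without a BOM returns no result, and the rarer BOMs are not handled.

```python
# BOM signatures from the post above, mapped to Python codec names.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),   # 8-bit Unicode
    (b"\xfe\xff", "utf-16-be"),   # 16-bit, Big Endian
    (b"\xff\xfe", "utf-16-le"),   # 16-bit, Little Endian
]


def detect_bom(data: bytes):
    """Return (codec_name, bom_length), or (None, 0) if no known BOM is found.

    Hypothetical helper for illustration. The uncommon BOMs (for example
    the 4-byte UTF-32 ones) are deliberately not checked here.
    """
    for bom, codec in BOMS:
        if data.startswith(bom):
            return codec, len(bom)
    return None, 0
```

The caller would read the whole file as binary, pass the bytes to `detect_bom`, and skip `bom_length` bytes before handing the rest to the matching conversion.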

So if you detect one of the two 16-bit BOMs you have to pass the rest of the data to one of the many Wide Char to Multibyte conversion functions such as listed here: https://forums.ni.com/t5/Reference-Design-Content/LabVIEW-Unicode-Programming-Tools/ta-p/3493021
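If doing the bulk conversion outside LabVIEW is acceptable, the same idea can be sketched in Python, which has the UTF-16 codecs built in. This is only an illustration under assumptions: the function names `utf16_to_ascii` and `convert_folder` and the `*.txt` pattern are invented for the example, and characters that have no ASCII equivalent are replaced with `?` rather than preserved.

```python
from pathlib import Path


def utf16_to_ascii(data: bytes) -> bytes:
    """Convert BOM-marked UTF-16 bytes to ASCII bytes.

    Data without a 16-bit BOM is returned unchanged. Characters outside
    ASCII become '?' (an assumption made for this sketch).
    """
    if data.startswith(b"\xff\xfe"):
        text = data[2:].decode("utf-16-le")   # Little Endian
    elif data.startswith(b"\xfe\xff"):
        text = data[2:].decode("utf-16-be")   # Big Endian
    else:
        return data
    return text.encode("ascii", errors="replace")


def convert_folder(src_dir: str, dst_dir: str) -> None:
    """Write an ASCII copy of every .txt file in src_dir into dst_dir."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob("*.txt"):
        (out / path.name).write_bytes(utf16_to_ascii(path.read_bytes()))
```

Writing the converted copies to a separate folder keeps the original output of the commercial program untouched in case the conversion loses something.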

A quick and dirty way is to throw away every second byte, which is usually 0x00, because the Unicode code point is the same as the 7-bit ASCII code for the first 128 characters. This fails, however, if your text contains anything other than 7-bit ASCII characters, as those use both bytes of the 16-bit code.
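As a text-only illustration of that shortcut, here is a minimal Python sketch for the Little Endian case. The function name `drop_high_bytes_le` is made up for this example, and it assumes the data starts with the 0xFF 0xFE BOM and is otherwise pure 7-bit ASCII text.

```python
def drop_high_bytes_le(data: bytes) -> bytes:
    """Quick-and-dirty UTF-16 LE to single-byte conversion.

    Strips the FF FE BOM if present, then keeps only the low byte of each
    16-bit unit (the even-indexed bytes in Little Endian order). For Big
    Endian data the low bytes would be the odd-indexed ones instead.
    """
    if data.startswith(b"\xff\xfe"):
        data = data[2:]
    return data[0::2]


# Pure ASCII text survives the shortcut.
ok = drop_high_bytes_le(b"\xff\xfe" + "Time(h)".encode("utf-16-le"))

# A character such as U+03A9 (Greek capital omega) is stored as A9 03,
# so dropping the second byte silently leaves the unrelated byte 0xA9.
bad = drop_high_bytes_le(b"\xff\xfe" + "\u03a9".encode("utf-16-le"))
```

The second call shows why this is only safe for files known to contain nothing but ASCII: the result is wrong without any error being raised.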

Rolf Kalbermatter
My Blog

maxnoder1995 · ‎12-02-2024

https://www.vipm.io/package/dataflow_g_lib_g_unicode/
