LabVIEW


Web and unicode


I need to download the contents of web pages. I have the Internet Toolkit and use this function:

Data communication -> Protocols -> HTTP client -> GET.vi

I have problems when the page contains some unicode text; for example, the word "Čadan" is output as "ÄŒadan" by the GET vi.

 

I read about an undocumented "UTF-8 to Text" VI. It partially works in converting the garbage back: for example, it converts "Å½izdra" to "Žizdra". But it doesn't work for many other characters, like the ÄŒ above, which becomes a plain C.

I also read https://decibel.ni.com/content/docs/DOC-10153 but it didn't help me...

0 Kudos
Message 1 of 9
(4,083 Views)

Hello snamprogetti,

 

Could you please tell us which operating system and which version of LabVIEW you are using?

Antonios
0 Kudos
Message 2 of 9
(4,059 Views)

LabVIEW 2011 on Windows 7

0 Kudos
Message 3 of 9
(4,055 Views)

Hello Snamprogetti,

 

Unicode languages can be displayed by modifying the LabVIEW configuration file. The availability of display languages depends on the edition of Windows 7. Only Windows 7 Ultimate allows users to choose a display language with the steps shown here.

Windows 7 Enterprise and Professional are not able to download some of the very common language libraries that are listed in the Microsoft Knowledge Base.

 

To display Unicode languages, modify the LabVIEW .ini configuration file using the following steps:
Navigate to C:\Program Files\National Instruments\<LabVIEW>\LabVIEW.ini. Add UseUnicode=TRUE on a new line anywhere in the LabVIEW .ini file. Save the file and exit LabVIEW. Upon relaunching LabVIEW, Unicode characters should work correctly, whether typed or pasted in.

Have you already tried that?

 

For control/indicator labels, it may be necessary to enter text as a Caption, not a Label. Right-click on the control/indicator and select Visible Items»Caption from the right-click menu. Paste in your text or type it using the steps from here.

 

You should note though that when building an executable that requires Unicode characters, you should insert the UseUnicode=TRUE line into the application.ini file generated by the executable creation.

 

Could you please try the above?

Antonios
0 Kudos
Message 4 of 9
(4,048 Views)

Yes, I already tried that, but it didn't help, probably because I am neither typing nor pasting in. Actually, for now I don't even care about displaying; all I want to do is read data from web URLs and process it, but the data keeps coming out that way from GET.vi.

0 Kudos
Message 5 of 9
(4,044 Views)

Everything is probably working correctly in LabVIEW; it's just not working the way you want (or need) it to.  Can you describe what you're trying to do in a bit more detail?

 

The problem you're running into is that LabVIEW (for the most part, some of the unicode stuff mentioned above is the exception) interprets strings in the system code page.  If the primary language used with your computer is English, Spanish, French, or any number of other "Western European" languages, the system code page is likely Windows-1252.  See http://en.wikipedia.org/wiki/CP1252 for more information on exactly which characters can be represented on these systems.

 

What's happening is that you're receiving a (most likely) valid UTF-8 encoded string, which LabVIEW interprets according to your system code page since there is no way to tell LabVIEW to interpret the string in any other way.  The specific problem you're having (if you're using the Windows-1252 code page) is that the "Č" character does not exist in your code page; therefore, there is no way to encode it in your system code page, and no way for LabVIEW to display it.  The "UTF-8 to Text" VI appears to be kind enough to replace it with a similar character that can be encoded in your system code page, namely a 'C'.
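As an illustration outside LabVIEW, the misinterpretation described above can be reproduced in a few lines of Python. This is only a sketch, assuming the system code page is Windows-1252; the function names are made up for the example. It also suggests why "Žizdra" survives the conversion while "Čadan" does not: Ž has a slot in Windows-1252, Č does not.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair_mojibake(garbled: str) -> str:
    """Undo the misreading: recover the original bytes, decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


def fits_cp1252(text: str) -> bool:
    """Return True if every character of text exists in Windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


print(misread_utf8_as_cp1252("Čadan"))   # ÄŒadan
print(repair_mojibake("ÄŒadan"))         # Čadan
print(fits_cp1252("Ž"), fits_cp1252("Č"))  # True False
```

The repair step only works in an environment that can represent Č at all, which is exactly what a Windows-1252 string cannot do.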

 

How you work around this limitation will depend on what, exactly, you want to do.  If you need to display strings including characters not in your system code page, you may be out of luck.  If you simply need to store strings including such characters (for example: to a file or database), you'll need to transcode the strings from UTF-8 to the proper character set for the file, database, etc.

 

Mark Moss

Electrical Validation Engineer

GHSP

0 Kudos
Message 6 of 9
(4,037 Views)

Here is what I'm doing:

1) download a page containing names, for example Čadan

2) use names to build other URLs, for example http://en.wikipedia.org/wiki/Čadan

3) download pages and do some other text extraction

4) write to file

 

The problem is mainly in step 2, as the URL won't work at all with a ÄŒ or a C. Percent encoding, such as %C4%8C for Č, would work, but I don't know how to get it.

0 Kudos
Message 7 of 9
(4,027 Views)

There is an Escape HTTP URL VI (http://zone.ni.com/reference/en-XX/help/371361H-01/lvcomm/escape_http_url/) that will probably handle the URL-encoding for you (i.e. convert Č to %C4%8C).
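For readers who want to check what the escaped URL should look like, here is a rough Python equivalent of the percent-encoding step. It is a sketch, not a description of how the Escape HTTP URL VI works internally; `article_url` and `BASE` are hypothetical names, and the input is assumed to be the raw UTF-8 bytes as returned by the HTTP GET.

```python
from urllib.parse import quote

BASE = "http://en.wikipedia.org/wiki/"  # URL prefix taken from the example in this thread


def article_url(raw_name: bytes) -> str:
    """Percent-encode the raw UTF-8 bytes of a name and append them to BASE."""
    # quote() accepts bytes and escapes every byte outside the unreserved set.
    return BASE + quote(raw_name)


# The two bytes C4 8C are the UTF-8 encoding of "Č".
print(article_url(b"\xc4\x8cadan"))  # http://en.wikipedia.org/wiki/%C4%8Cadan
```

Because the escaping operates on bytes, it never needs to represent Č in the system code page, which is why the UTF-8 string can be passed through unconverted.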

 

Mark Moss

Electrical Validation Engineer

GHSP

 


Message 8 of 9
(4,021 Views)
Solution
Accepted by Snamprogetti

OK! The Escape HTTP URL VI also works with UTF-8. That solved step 2.

 

As for step 4, I tried to convert to Unicode and prepend a 0xFFFE Byte Order Mark as explained in https://decibel.ni.com/content/docs/DOC-10153, but never succeeded with any conversion VI.

 

Instead, I did no conversion and prepended a 0xEFBBBF Byte Order Mark, which should be appropriate for the UTF-8 encoding used by GET.vi's output. Indeed, the files I write now correctly show Čadan or whatever. Since I stick to UTF-8, I have to convert any search string to UTF-8 (with "Text to UTF-8") before doing string searches in step 3. This solved at least my application. Thanks for helping.
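The same approach (no conversion, UTF-8 Byte Order Mark at the start of the file, search strings converted to UTF-8 first) can be sketched in Python as follows. The function names and the temporary file are assumptions for the example only.

```python
import codecs
import os
import tempfile


def write_utf8_with_bom(path: str, utf8_bytes: bytes) -> None:
    """Write the bytes EF BB BF first, then the untouched UTF-8 payload."""
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8)
        f.write(utf8_bytes)


def contains(utf8_bytes: bytes, needle: str) -> bool:
    """Search UTF-8 data by encoding the search string to UTF-8 first."""
    return needle.encode("utf-8") in utf8_bytes


page = "Čadan is a town".encode("utf-8")  # stands in for the HTTP GET output
path = os.path.join(tempfile.mkdtemp(), "names.txt")
write_utf8_with_bom(path, page)

with open(path, "rb") as f:
    written = f.read()
print(written[:3] == codecs.BOM_UTF8)  # True
print(contains(page, "Čadan"))         # True
```

Editors that honor the BOM will then open the file as UTF-8 instead of guessing the system code page.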

0 Kudos
Message 9 of 9
(4,001 Views)