09-18-2023 07:40 AM
Hi,
I have a web page with a table. I have to download the data into LabVIEW, convert it to XML, and then extract information from the XML file, such as temperature values and other fields. I have never parsed XML files before, and since this file is dense with information and characters that are useless to me, I was wondering if there is a faster way than filtering the string to obtain the data.
I will attach a demo of the file (the real one has more information, but the format is the same) and what I have developed so far.
Thank you!
09-18-2023 08:24 AM
I haven't used it, but the EasyXML Toolkit from JKI seems useful here: https://www.vipm.io/package/jki_lib_easyxml/
Also, here is a link to the NI page on their XML VIs: https://www.ni.com/docs/en-US/bundle/labview/page/parsing-xml-files-in-labview.html
09-18-2023 08:48 AM
Thank you! Those VIs are pretty helpful. I can read the string better now, but I still can't properly filter the data that I want to extract.
09-18-2023 08:57 AM
This is a vastly over-simplified explanation of XML and your XML fragment, but should help you get started in "looking at" the full file.
XML, like HTML, uses <TAG>contents</TAG> to surround text "contents" with begin/end Tags that say "what they are".
The first XML tag is <LVDATA>, and the last one is </LVDATA>. Everything in between is "LabVIEW Data" (whatever that is).
The first (and only!) item in LVDATA is a <String> (until </String>). It has a <Name> (url). It also has a <Val> which appears to be HTML. Of course, HTML uses "<" and ">" as begin/end pairs, so this XML parser has replaced them with "&lt;" and "&gt;", and I'm uncertain if the semicolon is part of the substitute HTML tag or not. I'm not "up" on my HTML, so I'm not sure what a "td" is, but I see something that looks like Temp2, 50.00, 27.39, and °C, which I assume are data columns. Just below that line is something with bgcolor="#00FF00" followed by the string "Basket up".
So this appears to be an HTML string, perhaps named "url", whose value is a lot of what looks like HTML "code".
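As a rough illustration of that structure, here is a minimal Python sketch (not LabVIEW code) that reads the <Name> and <Val> of the <String> and gets the HTML back with its real "<" and ">" characters. The sample string is a hypothetical, shortened stand-in for the attached file; the real file may spell the tags differently or add a namespace, in which case the paths would need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the attached file (tag names as
# described in the post above; the real file may differ).
lvdata = (
    "<LVDATA>"
    "<String>"
    "<Name>url</Name>"
    "<Val>&lt;tr&gt;&lt;td&gt;Temp2&lt;/td&gt;&lt;td&gt;27.39&lt;/td&gt;&lt;/tr&gt;</Val>"
    "</String>"
    "</LVDATA>"
)

# Parse the XML wrapper; the root element is <LVDATA>.
root = ET.fromstring(lvdata)

# The XML parser turns &lt; and &gt; back into < and > when reading text.
name = root.findtext("String/Name")
html_text = root.findtext("String/Val")

print(name)
print(html_text)
```

The point of the sketch is that an XML parser undoes the substitution for you, so there is no need to search and replace the escaped characters by hand.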
Hope this helps.
Bob Schor
09-18-2023 09:06 AM
@Bob_Schor wrote:
so this XML parser has replaced them with "&lt;" and "&gt;", and I'm uncertain if the semicolon is part of the substitute HTML tag or not. I'm not "up" on my HTML, so I'm not sure what a "td" is, but I see something that looks like Temp2,
<td> is Table Data
<tr> is Table Row
Both should be within a <table>
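To make the row/cell relationship concrete, here is a minimal Python sketch that collects the text of every <td> cell, grouped by <tr> row, using only the standard library. The sample markup is a hypothetical stand-in built from the values mentioned above, and the sketch assumes every cell and row has an explicit closing tag.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently open, if any
        self._cell = None  # text pieces of the cell currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html_text):
    """Return the table content as a list of rows (lists of strings)."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample; the real page layout is an assumption.
sample = (
    "<table>"
    "<tr><td>Temp2</td><td>50.00</td><td>27.39</td><td>°C</td></tr>"
    '<tr><td bgcolor="#00FF00">Basket up</td></tr>'
    "</table>"
)
print(table_rows(sample))
```

Attributes such as bgcolor are ignored here; only the cell text is kept.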
09-18-2023 09:15 AM
You downloaded an HTML file and want to process it.
"Flatten to XML" is not going to convert the HTML markup into valid XML tags.
You may take a look at
C:\Program Files\National Instruments\LabVIEW 2020\examples\File IO\XML\Parse XML
09-18-2023 09:59 AM
Saving an HTML document as XML doesn't convert it any more than changing a file extension from JPG to TIF does ...
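A small Python analogy (not LabVIEW's actual implementation) of why this happens: when HTML is stored as the text of an XML element, the markup is escaped into plain characters instead of becoming XML tags. The element name Val is borrowed from the file discussed above; the sample HTML is hypothetical.

```python
import xml.etree.ElementTree as ET

html_text = "<td>Temp2</td>"  # hypothetical HTML fragment

# Store the HTML as the *text* of an XML element.
val = ET.Element("Val")
val.text = html_text
serialized = ET.tostring(val, encoding="unicode")
print(serialized)  # <Val>&lt;td&gt;Temp2&lt;/td&gt;</Val>

# Reading it back gives one element with no children: the <td> never
# became an XML tag, it is still just a string.
reparsed = ET.fromstring(serialized)
print(len(reparsed))               # 0
print(reparsed.text == html_text)  # True
```

So the XML file holds the HTML as one long string, and that string still has to be parsed as HTML.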
09-18-2023 11:09 AM
So you are saying that the "conversion" from HTML to XML is not done properly in LabVIEW, and this makes parsing the XML file more complicated?
09-18-2023 11:59 AM
@DavideBrusamolino02 wrote:
So you are saying that the "conversion" from HTML to XML is not done properly in LabVIEW, and this makes parsing the XML file more complicated?
Is there a reason for this intermediate step, i.e., converting to XML? It seems to me it would be easier to operate on the data as a 2D array. Or, if you need to convert to XML, just convert it and still operate on the original 2D array.
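For what it's worth, here is a minimal Python sketch of what "operating on the 2D array" could look like: find a row by its first column and keep the cells that parse as numbers. The table contents and the helper names find_row and numeric_cells are hypothetical; what each numeric column means in the real page is not something I know.

```python
def find_row(table, label):
    """Return the first row whose first cell equals label, or None."""
    for row in table:
        if row and row[0] == label:
            return row
    return None


def numeric_cells(row):
    """Return the cells of a row that parse as floating-point numbers."""
    values = []
    for cell in row:
        try:
            values.append(float(cell))
        except ValueError:
            pass  # skip labels and units such as "Temp2" or "°C"
    return values


# Hypothetical 2D array of strings, as it might come out of the table.
table = [
    ["Temp2", "50.00", "27.39", "°C"],
    ["Basket up"],
]

temp2 = find_row(table, "Temp2")
print(numeric_cells(temp2))
```

In LabVIEW the same idea would be a search on the first column followed by a string-to-number conversion on the remaining columns.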
09-18-2023 02:05 PM
Did you search the Web for how to convert HTML to XML? The results I found said it is (sometimes) possible and gave some methods using Python. Some of the other suggestions there might apply to your situation ...
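As one possible shape for such a conversion, here is a minimal Python sketch that writes already-extracted table rows out as real XML elements. This is not any specific method from those search results; the element names table, row, and cell are made up for illustration, and the sample row is hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Build <table><row><cell>...</cell></row></table> from a list of rows."""
    table = ET.Element("table")
    for cells in rows:
        row = ET.SubElement(table, "row")
        for text in cells:
            ET.SubElement(row, "cell").text = text
    return table


# Hypothetical rows, e.g. the output of an HTML table parser.
rows = [["Temp2", "50.00", "27.39", "°C"]]

xml_text = ET.tostring(rows_to_xml(rows), encoding="unicode")
print(xml_text)

# Each cell is now a real XML element that an XML parser can address.
cells = [cell.text for cell in ET.fromstring(xml_text).iter("cell")]
print(cells)
```

Unlike the flattened string, this XML has one element per cell, so XML parsing tools can pick out individual values.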
Bob Schor