Get a file from http to XML, parse the XML file and extract the data

DavideBrusamolino02 · ‎09-18-2023

Hi,

I have a web page with a table. I need to download the data into LabVIEW, convert it to XML, and then extract information from the XML file, such as temperature values and other data. I have never parsed XML files before, and since this file is dense with information and characters that are useless to me, I was wondering whether there is a faster way than filtering the string to obtain the data.

I will attach a demo of the file (the real one has more information, but the format is the same) and what I have developed so far.

Thank you!

ebs27 · ‎09-18-2023

I haven't used it, but the EasyXML Toolkit from JKI seems useful here: https://www.vipm.io/package/jki_lib_easyxml/

Also, here is a link to the NI page on their XML VIs: https://www.ni.com/docs/en-US/bundle/labview/page/parsing-xml-files-in-labview.html

DavideBrusamolino02 · ‎09-18-2023

Thank you! Those VIs are pretty helpful. I can read the string better now, but I still can't properly filter out the data that I want to extract.

Bob_Schor · ‎09-18-2023

This is a vastly over-simplified explanation of XML and your XML fragment, but should help you get started in "looking at" the full file.

XML, like HTML, uses <TAG>contents</TAG> to surround text "contents" with begin/end tags that say "what they are".

The first XML tag is <LVDATA>, and the last one is </LVDATA>. Everything in between is "LabVIEW Data" (whatever that is).

The first (and only!) item in LVDATA is a <String> (until </String>). It has a <Name> (url). It also has a <Val> which appears to be HTML. Of course, HTML uses "<" and ">" as begin/end pairs, so this XML parser has replaced them with "&lt" and "&gt", and I'm uncertain if the semicolon is part of the substitute HTML tag or not. I'm not "up" on my HTML, so I'm not sure what a "td" is, but I see something that looks like Temp2, 50.00, 27.39, and °C, which I assume are data columns. Just below that line is something with bgcolor="#00FF00" followed by the string "Basket up".

So this appears to be an HTML string, perhaps named "url", whose value is a lot of what looks like HTML "code".
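To make that structure concrete, here is a minimal Python sketch (not LabVIEW) that pulls the <Val> text out of a fragment shaped like the one described above. The sample string is hypothetical and much shorter than the real file, and the function name is only for illustration. It also settles the semicolon question: the XML entities are "&lt;" and "&gt;", semicolon included, and a standard XML parser turns them back into "<" and ">" when it reads the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the description above; the real file is larger.
SAMPLE = (
    "<LVData>"
    "<String>"
    "<Name>url</Name>"
    "<Val>&lt;tr&gt;&lt;td&gt;Temp2&lt;/td&gt;&lt;td&gt;27.39&lt;/td&gt;&lt;/tr&gt;</Val>"
    "</String>"
    "</LVData>"
)

def extract_html(xml_text: str) -> str:
    """Return the text of the first <Val> element, with entities decoded."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Compare only the local tag name so an XML namespace, if present, is ignored.
        if elem.tag.rsplit("}", 1)[-1] == "Val":
            return elem.text or ""
    return ""

html_text = extract_html(SAMPLE)
print(html_text)
```

The point is that the XML layer only wraps the HTML; once <Val> is read back, what you hold is the original HTML string, and the table still has to be parsed as HTML.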

Hope this helps.

Bob Schor

Yamaeda · ‎09-18-2023

@Bob_Schor wrote:

so this XML parser has replaced them with "&lt" and "&gt", and I'm uncertain if the semicolon is part of the substitute HTML tag or not. I'm not "up" on my HTML, so I'm not sure what a "td" is, but I see something that looks like Temp2,

<td> is Table Data

TR is Table Row

Both should be within a <Table>
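As a rough illustration of how those three tags nest, here is a small Python sketch using the standard library's html.parser to collect every <tr> as a list of its <td> texts. The sample HTML is hypothetical, built only from the values mentioned earlier in the thread, and the class and function names are made up for this example.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect table rows as lists of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def parse_table(html_text):
    parser = TableParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows

# Hypothetical sample; the real page has more rows and attributes.
SAMPLE_HTML = """
<table>
  <tr><td>Temp2</td><td>50.00</td><td>27.39</td><td>°C</td></tr>
  <tr><td bgcolor="#00FF00">Basket up</td></tr>
</table>
"""

for row in parse_table(SAMPLE_HTML):
    print(row)
```

Attributes such as bgcolor are ignored here; if the cell color carries meaning, it is available in the attrs argument of handle_starttag.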

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems

alexderjuengere · ‎09-18-2023

You are downloading an HTML file and want to process that file.

"Flatten to XML" is not going to convert the HTML markup into valid XML tags.
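A short Python sketch of why: flattening a string to XML only escapes the markup characters and wraps the result in a string element, so the HTML tags never become XML elements. The wrapper function below is a hypothetical imitation of that output shape, not LabVIEW's actual implementation.

```python
from xml.sax.saxutils import escape, unescape

def wrap_as_lv_string(name: str, value: str) -> str:
    """Hypothetical imitation: escape the value and wrap it as a named string."""
    return (
        "<String><Name>" + escape(name) + "</Name>"
        "<Val>" + escape(value) + "</Val></String>"
    )

html_fragment = "<td>Temp2</td>"
flattened = wrap_as_lv_string("url", html_fragment)

print(flattened)
# The <td> tag is now plain text inside <Val>, not an XML element.
print(unescape(escape(html_fragment)) == html_fragment)
```

Unescaping simply gives back the original HTML, which is why an XML parser alone cannot reach the individual table cells.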

you may take a look at

C:\Program Files\National Instruments\LabVIEW 2020\examples\File IO\XML\Parse XML

Yamaeda · ‎09-18-2023

Saving an HTML document as XML doesn't convert it any more than changing a file extension from JPG to TIF does ...

DavideBrusamolino02 · ‎09-18-2023

So you are saying that the "conversion" from HTML to XML is not done properly in LabVIEW, and that this makes parsing the XML file more complicated?

billko · ‎09-18-2023

@DavideBrusamolino02 wrote:

So you are saying that the "conversion" from HTML to XML is not done properly in LabVIEW, and that this makes parsing the XML file more complicated?

Is there a reason for this intermediate step - i.e., convert to xml? It seems to me to be easier to operate on it as a 2D array. Or if you need to convert to xml, just convert it and operate on the original 2D array.
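For illustration, here is a minimal Python sketch of working directly on the table as a 2D array, assuming it has already been read into rows of strings. The sample rows, the function name, and the choice of column index are hypothetical; which column holds the value you want depends on the real page.

```python
def find_value(rows, label, column):
    """Return the numeric value in `column` of the first row whose first cell is `label`."""
    for row in rows:
        if row and row[0] == label and len(row) > column:
            return float(row[column])
    return None  # label not found, or the row is too short

# Hypothetical 2D array built from the values mentioned earlier in the thread.
TABLE = [
    ["Temp2", "50.00", "27.39", "°C"],
    ["Basket up"],
]

print(find_value(TABLE, "Temp2", 2))
```

The same lookup in LabVIEW would be a search on the first column of the 2D string array followed by a string-to-number conversion, with no XML step involved.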

Bill

(Mid-Level minion.)
My support system ensures that I don't look totally incompetent.
Proud to say that I've progressed beyond knowing just enough to be dangerous. I now know enough to know that I have no clue about anything at all.
Humble author of the CLAD Nugget.

Bob_Schor · ‎09-18-2023

Did you search the Web for how to convert HTML to XML? The results say it is (sometimes) possible, and give some methods using Python. There are some other suggestions there that might apply to your situation ...

Bob Schor
