LabVIEW

How to get html source from an URL with frames

I am trying to get the event log out of my cable modem by programmatically connecting to its url  ( http://192.168.100.1/eventlog_page.asp ) at regular intervals and parsing out the new entries.
 
Unfortunately, a plain DataSocket read gets me a substitute page, containing the line:
 
"This diagnostic web page requires a browser capable of viewing frames.  Please use Internet Explorer 4.0 or greater."
 
I can correctly display the page using ActiveX, but how do I get the HTML source of the logging table into a string? I imagine this should be easy, but I typically never use any of this. Thanks! 😄
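For the "parsing out the new entries" part, here is a minimal sketch in Python rather than LabVIEW. It assumes the log arrives as an ordinary HTML table with `<tr>` rows and `<td>`/`<th>` cells; the modem's real markup is not shown in this thread, so the class and function names (`TableRowParser`, `new_entries`) and the expected layout are illustrative assumptions only.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect cell text grouped by table row (assumed <tr>/<td>/<th> markup)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a tuple of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def _close_cell(self):
        # Finish the open cell, if any, and add it to the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        # Finish the open row, if any; empty rows are skipped.
        self._close_cell()
        if self._row:
            self.rows.append(tuple(self._row))
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def new_entries(html, seen):
    """Return the rows not yet in `seen` (a set) and remember them."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    fresh = [row for row in parser.rows if row not in seen]
    seen.update(fresh)
    return fresh
```

Calling `new_entries(page_source, seen)` after each poll, with the same `seen` set, returns only rows that have not appeared before. Two log lines with identical text in every column would be treated as one entry, which may or may not matter for a modem log.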

Message Edited by altenbach on 04-27-2006 04:17 PM

Message 1 of 11
It must be Active Server Page (.asp) causing the problem.

I can get the html from my website frame page using a standard datasocket read. Here's the URL if you want to try (http://edodickens.home.mchsi.com/labview/index.html)

Active Server Pages are generally rendered dynamically, and this is probably causing the problem, because the HTML doesn't really exist until you request the page and the server "creates" it.

One hack of a workaround might be to load the page in the ActiveX container. With the VI still running, the IE right-click menus still work, so you could try to position the cursor over the desired frame, simulate a right click, and somehow select "View Source". This opens the HTML in Notepad, so maybe you could get the Notepad window handle, grab the text from it, and do your parsing. No idea if that will actually work or not, but it would be fun (for you 😉) to try.

If I think of a better way, I'll post it.

Ed


Ed Dickens - Certified LabVIEW Architect - DISTek Integration, Inc. - NI Certified Alliance Partner
Using the Abort button to stop your VI is like using a tree to stop your car. It works, but there may be consequences.
Message 2 of 11
You can probably fake the cable modem out with a User-Agent: statement in your HTTP request.

I am sorry I can't create a LV example from this machine but take a look at:

http://web-sniffer.net

Plug in a web address then use the resulting http request header User-Agent: in your request to
the cable modem.  Something like:

GET / HTTP/1.1[CRLF]
Host: www.fastlinx.net[CRLF]
Connection: close[CRLF]
Accept-Encoding: gzip[CRLF]
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]
Accept-Language: en-us,en;q=0.5[CRLF]
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.2) Gecko/20060419 Fedora/1.5.0.2-1.2.fc5 Firefox/1.5.0.2 pango-text Web-Sniffer/1.0.24[CRLF]
Referer: http://web-sniffer.net/[CRLF]
[CRLF]

Except that your User-Agent: string will probably be some version of IE.
The string the modem sends back will have lots of frame stuff but should
be consistent enough to parse.
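
As a rough illustration of the idea (in Python, since a LabVIEW diagram cannot be pasted as text), the sketch below sends a GET request with an IE-style User-Agent header. The URL is the one from the first post; the User-Agent value is a typical IE 6 string picked for illustration and has not been checked against any modem, and the function names are hypothetical.

```python
import urllib.request

# Assumptions: MODEM_URL comes from the first post in this thread;
# IE_USER_AGENT is a typical IE 6 string, not verified against the modem.
MODEM_URL = "http://192.168.100.1/eventlog_page.asp"
IE_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_request(url, user_agent=IE_USER_AGENT):
    """Create a GET request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=IE_USER_AGENT, timeout=10):
    """Send the request and return the response body as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to latin-1 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "latin-1"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html(MODEM_URL)` would return whatever the modem serves to a client presenting that header. Whether the modem actually switches on User-Agent is a guess until it is tried.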

Post back if this doesn't do it and I will take a crack at a .vi when
I can get on an LV computer.

Hope this helps.

Matt
Message 3 of 11
Hi Altenbach,

It sounds like you've already got plenty of good leads to go on, but I just wanted to round out the conversation by addressing your original question about getting HTML text using ActiveX. Basically, you want to get the Document child object from your Web Browser ActiveX object. LabVIEW outputs this as a variant, so you need to convert it to IHTMLDocument2, located under the Microsoft HTML Object Library. From there you can get a reference to the Body and then read its Inner/Outer HTML/text.

Here's an example!

Jarrod S.
National Instruments
Message 4 of 11
Hmmm.... I want to qualify what I just said. This method is pretty easy but I seem to remember a colleague of mine having great difficulty extending this idea to incorporate separate frames. Maybe we missed something easy, but we always got an error type casting a reference to a Frames reference. If anyone has had any luck with this, please post! It's still a nice example, though. 😉
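
One possible way around the Frames cast, sketched in Python rather than LabVIEW and not tested against the modem: read the frameset page's own source, pull out each frame's `src` attribute, and request those URLs directly. The parser class, function name, and the sample frame names in the usage note are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameSrcParser(HTMLParser):
    """Collect the absolute URL of every <frame> or <iframe> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frame_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative src values against the frameset's URL.
                self.frame_urls.append(urljoin(self.base_url, src))


def frame_urls(frameset_html, base_url):
    """Return the frame URLs found in the frameset source, in document order."""
    parser = FrameSrcParser(base_url)
    parser.feed(frameset_html)
    parser.close()
    return parser.frame_urls
```

For a frameset containing `<frame src="menu.asp">`, calling `frame_urls(source, "http://192.168.100.1/eventlog_page.asp")` yields `http://192.168.100.1/menu.asp`, which can then be requested on its own like any other page.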

Jarrod S.
National Instruments
Message 5 of 11
Thanks to all, but I am still stuck. While the ActiveX window is OK, the source is still messed up for some reason (see image).
Probably needs a few tweaks....
 
Fortunately, the cable tech came out yesterday and  replaced some corroded connectors high up on the pole and all's well now.
I was having good signal, but random disconnection problems since mid February (!).
They came out seven times (!) to cluelessly fiddle with the grounding, swap out boxes (twice!), and nobody ever bothered climbing the pole! 😞
 
Still, it would be great to have such a program in the arsenal, but I simply don't have time to work on this at the moment.
 
 
 

Message Edited by altenbach on 04-28-2006 12:20 PM

Message 6 of 11

Jarrod, Finally...

I've been wanting to write a program that lets the regulars know whether threads they've posted to have new posts, without having to subscribe to their own posts. The trick is to recognize the appropriate thread icon. The problem was that getting the HTML source with the Internet Toolkit VI would not allow the user to be logged in (apparently because of JavaScript), and ActiveX didn't help either: I tried to find the HTML file saved on the computer, and there didn't seem to be one.

Now that I have this, I might get back to this when I have the time.


___________________
Try to take over the world!
Message 7 of 11
Hi Altenbach,

This might be an issue with Microsoft's Web Browser ActiveX control. To tell you the truth, I'm not sure if it uses the same technology as IE. They may not have updated it since IE4 (this is a company that hasn't updated its current version in six years!).

In any case, I rewrote the example to use an external browser rather than an embedded control, so this will use whatever version of IE you currently have, which is undoubtedly newer than version 4.

I'm only posting the new main VI; you can still use the same Callback VI from my previous post. By the way, with this version, as long as the VI is running, it will maintain its reference to the browser instance it launches, so you can browse directly from the browser and the VI's data will update in the background. Enjoy!

p.s. Believe me I know about internet difficulties. I've been having non-stop brownouts. I only get service at home about 50% of the time. You'd think it was 1997... 😉

Jarrod S.
National Instruments
Message 8 of 11
That sounds like a cool app! You'll have to post it if you can get it to work. If you aren't currently logged in to the forum when you run it, you could also programmatically fill in the login form with your user ID and password using ActiveX. I know there have been forum posts on this before. Here's one I found. Travis couldn't get this code to work with his bank website, but I know I used it for other pages successfully. Good luck!
Jarrod S.
National Instruments
Message 9 of 11

Actually, we already had a discussion where I posted an initial (messy) example, meant to test whether this would work using the Internet Toolkit VI, here. When I have enough time I will try to get it to work using this (although, if you know of a pure G way, that would be much preferable). Then, of course, if that works, comes the part of writing the program itself, which I don't know when I'll be able to get to. Obviously, if I do it at some point, I will post the result, as this is meant as a public service for the regular users (although I'm not sure how happy the web team will be with that 😉).

The main problem would be determining how far back to go to check for new posts. Since using the user's tracker is not an option (people could reply to old threads), the program would have to resort to looking at the boards themselves, and at that point it could get complicated.


___________________
Try to take over the world!
Message 10 of 11