Website Parsing/Retrieval using XML

I recently posted links to the HTML and Internet Explorer object references. These are excellent tools to get web data into MS Office for further processing.

Lately I have been reading into XML as it appears to be a much faster way to access web pages than instantiating Internet Explorer. My first experiment into XML a few months ago was a disaster; I chalk it up to jumping in too fast without studying, but often that is the way I learn best.

Here is some sample code showing how to retrieve information (in Excel) from a website using XML. First you will need to set a reference to Microsoft XML, in my case (Excel 2003) it is version 6.0. The filename in Windows XP is c:\windows\system32\msxml6.dll.

Sub TestXML()
' check google.com using xml
Dim xSite As XMLHTTP60
Set xSite = New XMLHTTP60
xSite.Open "GET", "http://www.google.com/", False
xSite.Send

Do Until xSite.readyState = 4
Loop

MsgBox xSite.getAllResponseHeaders
MsgBox xSite.getResponseHeader("Last-Modified")
MsgBox "Status Text: " & xSite.statusText & vbCr & vbCr & "Status: " & xSite.Status

Range("A1") = Left$(xSite.responseText, 100)
End Sub

Please note that as far as I can tell, this will only work in IE 7.

First we set a reference to the XML object, then use the Open method to build a connection string. The Send method actually executes the action and accesses the website. Then we run through some message boxes showing some samples of what you can do with the query.

The first message box shows the hidden response headers from the google homepage. Once you list them, you can single them out, which is what the second message box does. The third message box shows off the Status and StatusText properties, which you can use (along with readyState) to see if a webpage has finished loading before your macro continues.

The last part shows how you can parse the website into an Excel range.

For more information on the properties and methods you can access with XML, check out:
XMLHttpRequest Object

Enjoy,
JP

About JP

I'm just an average guy who writes VBA code for a living. This is my personal blog. Excel and Outlook are my thing, with a sprinkle of Access and Word here and there. Follow this space to learn more about VBA. Keep Reading »



Share This Article:

Share and bookmark this articledelicious buttonfacebook buttonlinkedin buttonstumbleupon buttontwitter button

comment bubble 3 Comment(s) on Website Parsing/Retrieval using XML:

  1. Rajendra writes:

    Getting "Permission denied – error 70"
    If I use some other site, im getting some msgs and getting some link in A1

    What i expected is – will get the whole site objects properties into cells starting from A1 :-( think too much.

  2. Great work man. I am using some of your stuff to pul the lat and long into excell sheet and dont need the distance. Needed some more information about XML . can we get away with XML and get the results directly to Ecel like we run a macro.
    Appreciated again – Sartaj

This article is closed to any future comments.
learn excel dashboards