How to Web Scrape with VBA using IE & Parse HTML?

VBA Web scraping with Internet Explorer

Web scraping is the process of extracting data available to public.

For example: You might want to extract stock historic data from a website to Excel in your computer. This way it is easy to analyse the stocks & decide on your Market activities.

It is unfair to use it to steal data from other websites & use it for you or sell it in open market.

Lets use it for better purpose.

VBA code to initiate IE & Extract a Website Data

The VBA code in this page can be run from a Excel VBA Editor.

  • It creates a object for Internet Explorer (IE).
  • Then navigates to the URL passed a parameter.
  • Once the page is loaded, it extracts the HTML data from IE.
  • Then use VBA commands to parse HTML elements.

Here is the actual code that does the above steps.

'--------------------------------------------------------------------------------
'Code by officeticks123@gmail.com
'Visit http://officetricks.com to get more Free & Fully Functional VBA Codes
'--------------------------------------------------------------------------------
Public Sub Web_Scrape_IE(sURL As String)
    'Declare variables for IE Object & HTML file
    Dim objIE As Object, sHtml As HTMLDocument
    
    'Create Internet Explorer Object
    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Visible = True
    
    'Start navigate to Website URL
    objIE.navigate sURL
    Application.Wait DateAdd("s", 1, Now)
    
    'Wait till Page loads
    Do While objIE.busy
        Application.StatusBar = "Application Loading. Please Wait..."
        Application.Wait DateAdd("s", 1, Now)
        DoEvents
    Loop
    
    'Page loaded - Now Get the Website HTML Content
    Application.StatusBar = "Extracting Details"
    DoEvents
    Set sHtml = objIE.document
    
    'Parse HTML Data from the website
    Set t0 = sHtml.getElementsByClassName("Classname")
    
    'Loop thru each Parsed Element
    For Each t1 In t0
    
        'Parse Sub Elements
        t3 = t1.getElementsByTagName("h2")
        For Each t4 In t3
        
            'Display parsed element from HTML
            MsgBox t3.innertext
        Next
    Next
    
    'Other Possible Parsing Functions
    'sHtml.getElementById
    'sHtml.getElementsByClassName
    'sHtml.getElementsByTagName("a")
    'sHtml.getAttribute("title")
    'sHtml.parentElement.NextSibling

    'Quit Internet Explorer
    objIE.Quit
    Set objIE = Nothing
    
End Sub

 

This is pretty much the conceptual steps involved in most Web scraping tools. But just that the other technologies use Chrome & Firefox.

Be it any technology or browser, the challenging part in most cases for a programmer is to decide on the HTML parsing & dealing with pagination concepts.

In many cases, CAPTCHA provides a good technique to step some beginner level web scrapers.

This code is not only used for web scraping or extracting web data. It can also be used to Automate some of the web activities like form filling, contacting website admin or support team, testing quiz etc.,

Leave a Reply