VBA Web scraping with Internet Explorer
Web scraping is the process of extracting data available to public.
For example: You might want to extract stock historic data from a website to Excel in your computer. This way it is easy to analyse the stocks & decide on your Market activities.
It is unfair to use it to steal data from other websites & use it for you or sell it in open market.
Lets use it for better purpose.
VBA code to initiate IE & Extract a Website Data
The VBA code in this page can be run from a Excel VBA Editor.
- It creates a object for Internet Explorer (IE).
- Then navigates to the URL passed a parameter.
- Once the page is loaded, it extracts the HTML data from IE.
- Then use VBA commands to parse HTML elements.
Here is the actual code that does the above steps.
'--------------------------------------------------------------------------------
'Code by officeticks123@gmail.com
'Visit http://officetricks.com to get more Free & Fully Functional VBA Codes
'--------------------------------------------------------------------------------
Public Sub Web_Scrape_IE(sURL As String)
'Declare variables for IE Object & HTML file
Dim objIE As Object, sHtml As HTMLDocument
'Create Internet Explorer Object
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True
'Start navigate to Website URL
objIE.navigate sURL
Application.Wait DateAdd("s", 1, Now)
'Wait till Page loads
Do While objIE.busy
Application.StatusBar = "Application Loading. Please Wait..."
Application.Wait DateAdd("s", 1, Now)
DoEvents
Loop
'Page loaded - Now Get the Website HTML Content
Application.StatusBar = "Extracting Details"
DoEvents
Set sHtml = objIE.document
'Parse HTML Data from the website
Set t0 = sHtml.getElementsByClassName("Classname")
'Loop thru each Parsed Element
For Each t1 In t0
'Parse Sub Elements
t3 = t1.getElementsByTagName("h2")
For Each t4 In t3
'Display parsed element from HTML
MsgBox t3.innertext
Next
Next
'Other Possible Parsing Functions
'sHtml.getElementById
'sHtml.getElementsByClassName
'sHtml.getElementsByTagName("a")
'sHtml.getAttribute("title")
'sHtml.parentElement.NextSibling
'Quit Internet Explorer
objIE.Quit
Set objIE = Nothing
End Sub
This is pretty much the conceptual steps involved in most Web scraping tools. But just that the other technologies use Chrome & Firefox.
Be it any technology or browser, the challenging part in most cases for a programmer is to decide on the HTML parsing & dealing with pagination concepts.
In many cases, CAPTCHA provides a good technique to step some beginner level web scrapers.
This code is not only used for web scraping or extracting web data. It can also be used to Automate some of the web activities like form filling, contacting website admin or support team, testing quiz etc.,