VBA Code to Extract Data from Website to Excel
To perform HTML screen scrapping or get webpage data into Excel using VBA code, read these easy tips.
Programmers & Software testing Professionals use this technique when ‘Excel query’ to fetch webpage option could not get required data from web or due to data formatting issues. For example, if we need a stock quotes in Excel, then we can use a Excel webquery option. But, we cannot hand pick the required data alone.
Lets see how to import data from a website with Excel VBA code.
Also Read: How to Automate Chrome using Excel VBA?
How to Extract data from Website to Excel using VBA?
HTML screen scraping can be done by creating instances of the object “MSXML2.XMLHTTP” or “MSXML2.ServerXMLHTTP”, as explained in the code snippet below.
Use this code in VBA to pull data from website i.,e HTML source of a page & place the content in a excel sheet.
- Create new excel workbook
- Press Alt + F11 to open VB editor.
- Copy paste the below code in space provided for code.
- Change the URL mentioned in the code.
- Execute the code by Pressing F5.
Private Sub HTML_VBA_Extract_Data_From_Website_To_Excel() Dim oXMLHTTP As Object Dim sPageHTML As String Dim sURL As String 'Change the URL before executing the code. URL to Extract data from. sURL = "http://WWW.WebSiteName.com" 'Extract data from website to Excel using VBA Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP") oXMLHTTP.Open "GET", sURL, False oXMLHTTP.send sPageHTML = oXMLHTTP.responseText 'Get webpage data into Excel ThisWorkbook.Sheets(1).Cells(1, 1) = sPageHTML MsgBox "XMLHTML Fetch Completed" End Sub
If you want to know how to parse the HTML elements from the web page read this article: How To Extract HTML Table From Website?
It explains, how to process each HTML element, after the web page content is extracted.
Clear Cache for every VBA Extract Data from Website
Use this below code every time, before executing the above code. Because with every execution, it is possible that extracted website data can reside in cache.
If you run the code again without clearing the cache, then old data will be displayed again. To avoid this, use the below code.
Shell "RunDll32.exe InetCpl.Cpl, ClearMyTracksByProcess 11"
Note: Once this code is executed, it will clear the cache in your web browser. So, please execute this with caution. All your previous sessions and unfinished work will be deleted from browser cache.
Detect Broken URL or Dead Links In Webpage
Sometimes we might need to check only if the URL is active or not. During that time, just insert the below checking IF condition just after the .Send command.
If the URL is not active, then this code snippet will throw an error message.
oXMLHTTP.send If oXMLHTTP.Status <> 200 Then MsgBox sURL & ": URL Link is not Active" End If
Search Engines Extract Website Data
Web browsers like IE, Chrome, Firefox etc., fetch data from webpages and presents us in beautiful formats.
Why do we need to download Website data with a program? Even few crazy programmers develop such applications & name it as Spider or Crawler? Because it is going to Crawl through the web and extract HTML content from different websites.
Internet Search Engines: Google, Bing etc., have crawler programs that download webpage content and index. Guessing, nobody is trying to build a search engine with Excel, but may be for processing website content for other purposes like counting for words, checking specific URLs, Descriptions or Correctness of Email IDs etc.,
Onpage SEO Analysis: Web admins want to fetch their own website data for SEO analysis, marketing people do it for extracting email ids for promoting their products.
Excel Query Fails: Some websites get updated so frequently, like Stock Index, Facebook timeline, Sports updates, Commodity prices, News updates etc.,In this case one would not like to click on ‘Refresh Page’ multiple times.
Use this VBA to extract data from website and accomplish Automated content processing.
Though this article explained about fetching webpage content, the options available with MSXML2.ServerXMLHTTP are not discussed. To know more about this options, visit to MSDN Library.