VBA Code – To extract data – From website to Excel Macro

VBA Code to Extract Data from Website to Excel

To perform HTML screen scrapping or get webpage data into Excel using VBA code, read these easy tips.

Programmers & Software testing Professionals use this technique when ‘Excel query’ to fetch webpage option could not get required data from web or due to data formatting issues. For example, if we need a stock quotes in Excel, then we can use a Excel webquery option. But, we cannot hand pick the required data alone.

Lets see how to import data from a website with Excel VBA code.

How to Extract data from Website to Excel using VBA?

HTML screen scraping can be done by creating instances of the object “MSXML2.XMLHTTP” or “MSXML2.ServerXMLHTTP”, as explained in the code snippet below.

Use this code in VBA to pull data from website i.,e HTML source of a page & place the content in a excel sheet.

  1. Create new excel workbook
  2. Press Alt + F11 to open VB editor.
  3. Copy paste the below code in space provided for code.
  4. Change the URL mentioned in the code.
  5. Execute the code by Pressing F5.
Private Sub HTML_VBA_Extract_Data_From_Website_To_Excel()
    Dim oXMLHTTP As Object
    Dim sPageHTML  As String
    Dim sURL As String

    'Change the URL before executing the code. URL to Extract data from.
    sURL = "http://WWW.WebSiteName.com"

    'Extract data from website to Excel using VBA
    Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
    oXMLHTTP.Open "GET", sURL, False
    oXMLHTTP.send
    sPageHTML = oXMLHTTP.responseText

    'Get webpage data into Excel
    ThisWorkbook.Sheets(1).Cells(1, 1) = sPageHTML

    MsgBox "XMLHTML Fetch Completed"

End Sub

If you want to know how to parse the HTML elements from the web page read this article: How To Extract HTML Table From Website?

It explains, how to process each HTML element, after the web page content is extracted.

Clear Cache for every VBA Extract Data from Website

Use this below code every time, before executing the above code. Because with every execution, it is possible that extracted website data can reside in cache.

Also Read: Extract URL From Sitemap of a Website

If you run the code again without clearing the cache, then old data will be displayed again. To avoid this, use the below code.

Shell "RunDll32.exe InetCpl.Cpl, ClearMyTracksByProcess 11"

Note: Once this code is executed, it will clear the cache in your web browser. So, please execute this with caution. All your previous sessions and unfinished work will be deleted from browser cache.

Detect Broken URL or Dead Links In Webpage

Sometimes we might need to check only if the URL is active or not. During that time, just insert the below checking IF condition just after the .Send command.

If the URL is not active, then this code snippet will throw an error message.

    oXMLHTTP.send

    If oXMLHTTP.Status <> 200 Then
        MsgBox sURL & ": URL Link is not Active"
    End If

Search Engines Extract Website Data

Web browsers like IE, Chrome, Firefox etc., fetch data from webpages and presents us in beautiful formats.

Why do we need to download Website data with a program? Even few crazy programmers develop such applications & name it as Spider or Crawler? Because it is going to Crawl through the web and extract HTML content from different websites.

Internet Search Engines: Google, Bing etc., have crawler programs that download webpage content and index. Guessing, nobody is trying to  build a search engine with Excel, but may be for processing website content for other purposes like counting for words, checking specific URLs, Descriptions or Correctness of Email IDs etc.,

Onpage SEO Analysis: Web admins want to fetch their own website data for SEO analysis, marketing people do it for extracting email ids for promoting their products.

Excel Query Fails: Some websites get updated so frequently, like Stock Index, Facebook timeline, Sports updates, Commodity prices, News updates etc.,In this case one would not like to click on ‘Refresh Page’ multiple times.

Use this VBA to extract data from website and accomplish Automated content processing.

Additional References

Though this article explained about fetching webpage content, the options available with MSXML2.ServerXMLHTTP are not discussed. To know more about this options, visit to MSDN Library.

6 thoughts on “VBA Code – To extract data – From website to Excel Macro”

  1. Hi,

    Just a suggestion.

    When I use this code, I am getting error message “a connection with the server could not be established”.

    So, I used “MSXML2.XMLHTTP” instead of “MSXML2.ServerXMLHTTP” and it worked. But not very sure about how this worked.

    I am posting here because I have another problem now. The code is not fetching full html source. Part of HTML Source is getting truncated. Please suggest if you have solution to this problem.

    Thanks,
    Puvi.

    1. Thanks for sharing your thoughts.

      This is not a problem with the function. I believe you are trying to write the html content of the website to an Excel Cell. You cannot write more than 32767 characters into a cell. Refer this link to know more about the limitation in a excel sheet: http://office.microsoft.com/en-in/excel-help/excel-specifications-and-limits-HP010073849.aspx

      To resolve this, try to store the website content in a string variable or write it to a output text file to reuse it later.

      Hope this helps.

      1. I copied and pasted (changing the website url) the above code and it didnt import any data at all and I did receive the message “fetch completed”. the first row’s height was increased but that was the only thing the code did or at least it seemed as though. On the other hand, I used the following code on a linked in forum for excel and vba and it did import.

        https://www.linkedin.com/groups/82527/82527-6150113411775676416?midToken=AQHl66-rrs7Gxw&trk=eml-b2_anet_digest_of_digests-hero-7-discussion~subject&trkEmail=eml-b2_anet_digest_of_digests-hero-7-discussion~subject-null-a5g92~ipps82gg~sc&_mSplash=1

  2. I fоund thіѕ wеbѕіtе vеrу uѕеfull.
    Cаn I аutоmаtе thе рrосеѕѕ оf fеtсhіng thе dаtа
    frоm a wеbѕіtе without hаvіng tо еntеr thе vаluеѕ ѕеԛuеntіаllу іntо thе
    wеbраgе?? Fоr еg, I nееd tо gеt thе dаtа fоr vаluеѕ 56679878 tо 56698765. whіlе
    dоіng thіѕ wе nееd tо еntеr thе vаluеѕ іndіvіduаllу аnd ѕubmіt a buttоn аnd gеt
    thе dаtа аnd thеn thе nеxt numbеr. саn wе аutоmаtе thіѕ process??

  3. I’m looking to utilize this to scrape some sales leads about pages from their websites. Once I have it in here, any idea how to filter out all but the main content of the page?

Leave a Reply