Export HTML Table to Excel – When Web query does not work

To Import a HTML table in a web-page we can use,

  • Web Query option in Excel to import it to Excel or
  • From IE browser, right click on the Web Page Table & choose option “Export to Microsoft Excel“.

With some website this web query option will not work properly.

In that case, You can use this code to export HTML Table to Excel using VBA.

Also Read: Download File from Website Using Excel

Excel VBA HTML Table Import – Step By Step

Extract the HTML Table content of the web page. The article in this page explains about how to import the HTML content.

  • In my Previous Article, the web page content is imported to Excel sheet as text content. But in this article, it is assigned to a HTMLFile Object which has more options to retrieve the HTML Tags.
  • Once you have the web page content, it will have the Table related Tags like <Table>, <TR> and <TD>.
  • Now the last step is to process each row & cell in that table and transfer it to the Excel Sheet.

VBA To Export HTML Table – Pull Website Table Data to Excel

Copy paste this code to Excel VB editor. Type web page URL that you want to scrape in cell A1 of worksheet1. Then

Sub Export_HTML_Table_To_Excel()
    Dim htm As Object
    Dim Tr As Object
    Dim Td As Object
    Dim Tab1 As Object

    'Replace the URL of the webpage that you want to download
    Web_URL = VBA.Trim(Sheets(1).Cells(1, 1))

    'Create HTMLFile Object
    Set HTML_Content = CreateObject("htmlfile")

    'Get the WebPage Content to HTMLFile Object
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", Web_URL, False
        .send
        HTML_Content.Body.Innerhtml = .responseText
    End With

    Column_Num_To_Start = 1
    iRow = 2
    iCol = Column_Num_To_Start
    iTable = 0

    'Loop Through Each Table and Download it to Excel in Proper Format
    For Each Tab1 In HTML_Content.getElementsByTagName("table")
        With HTML_Content.getElementsByTagName("table")(iTable)
            For Each Tr In .Rows
                For Each Td In Tr.Cells
                    Sheets(1).Cells(iRow, iCol).Select
                    Sheets(1).Cells(iRow, iCol) = Td.innerText
                    iCol = iCol + 1
                Next Td
                iCol = Column_Num_To_Start
                iRow = iRow + 1
            Next Tr
        End With
        iTable = iTable + 1
        iCol = Column_Num_To_Start
        iRow = iRow + 1
    Next Tab1

    MsgBox "Process Completed"
End Sub

Press F5 to execute this code.

Excel will pull HTML table & align the content in the Excel sheet in proper format.

Limitations of Parsing HTML <Table>

In many of the Website, even including Facebook, Twitter, the webpage will look like tables.

But they are embedded inside its HTML DIV & SPAN tags and not the <TABLE> tag.

The above code will work only if HTML has <table> tag. Other wise the parsing of the data has to be done using other DOM object commands.

It is better to use Facebook, Twitter API rather than just relying on HTML tags.

More Tips: Download Your Facebook, Twitter Profile Data to PC

17 thoughts on “Excel VBA html Table import – Export Web Table to Worksheet”

  1. I can’t tell you how helpful this was, thank you!

    Quick question though. It appears as though this cache’s old information because when the query is run for the second time, it happens a lot faster.

    This causes issues when what you’re querying is large. Is there any way to clear the cache after the information has been copied?

    1. Thanks Jacob for the comment. It is a valid point and I am analyzing on how to Clear the Cache. I guess it is somewhere stored in the default browser cache.
      I will post here on this very soon. Thank You.

      1. Dear kumarapush

        Did you find any way to clear the Cache..

        Please I want to know as I am using this code for my excel…

        but I need clear catch..as I run this macro many times

        thanks

        1. To Clear Cache of Web browser through VBA, add the below line before getting the HTML content from the Webpage.

          Shell “RunDll32.exe InetCpl.Cpl, ClearMyTracksByProcess 11”

          I haven’t tested this single line of code. But it should work out.Let me know if this works out for you, so that it will be helpful for others also. Thank You.

  2. It works perfectly, thank you.
    I just have a problem with one site where there are multiply tables defined in each other,
    Could you help me how to modify this code in order to only download one table defined with its ID?
    Thank you!

    1. iTable is the variable that extracts each table in the webpage. If you don’t want to extract all tables, exit the loop when iTable > 0. before this line.
      if iTable> 0 then Exit For ‘–> Add this line. If you still face issue, let me know the webpage to resolve issue.
      With HTML_Content.getElementsByTagName(“table”)(iTable)

      1. Add this line before statement “With”.
        If Tab1.ID = “ResultsGrid” Then
        and also add an “End IF” after the “End With” Line. I checked this. It will work now.

        But you will get some empty lines, guess you will be able to resolve it.

  3. This is a fantastic piece of code. Exactly what i need.

    I use it on but it extracts all the tables in the page. I only need the one with the matchups. How can the code be modified for this?

  4. That is cool, and i’m already using this “.innertext” method, but my table has some links in it that i’d like to paste too. Does anyone here knows how can i get those? thx

      1. Helo Kumarapush!

        I think that i’ve got something using innerhtml and string manipulation.

        Bu tnow i’m trapped into this. Why doesn’t it work?

        links(1) = “http://olp.conseur.org” & “/” & Mid(link(1), InStr(1, link(1), “href”) + 7, _

        InStr(1, link(1), “>”) – InStr(1, link(1), “href”) – 8)

        Knowing that links(1) is like this below:

        7

        Many thanks

        1. 1. Are you missing a ‘s’ in link(1). should it be link(1) or links(1)
          2. Please do this and it would help in debugging the issue.
          Place the below line just before the code you have mentioned.
          Thisworkbook.sheets(2).cells(1,1) = link(1)
          Thisworkbook.sheets(2).cells(2,1) = links(1)

          and reply with the texts that you get in these cells with double quotes.

  5. Hi,

    I want to extract only one Table in the web page that is provided in the somewhere middle of the page. There are many tables in the page and sometime the desired table is not present in the webpage.
    I have used “GetAllTables” but as provide me all the tables, it is not that useful.

    HTML Code is something like:
    Related U.S. Patent Documents
    Application NumberFiling DatePatent NumberIssue Date<TD PCT/JP2011/005332Sep 22, 2011<TD

    Foreign Application Priority Data
    Sep 28, 2010
    [JP]
    2010-216445

    Table just after the HTML tag “Related U.S. Patent Documents”.

Leave a Reply