Extract URL Links from Sitemap.xml file using Excel

About Sitemap.xml

Sitemap XML file is used by the search engines to find out the complete list of the pages of the website. Along with url link of the page, it also contains other metadata like last modified date, change frequency(monthly/daily), etc.. This information is vital for the search engine to perform the crawling of the web links in the blog.

Whenever a new page is added to the website, it automatically gets notified to the search engines from Sitemap.xml file and enables quick indexing of your pages in the Search Engines.

Usually sitemaps are available at the url http://<yourWebsiteName>/sitemap.xml

To access Google or Bing or Yahoo or any webmaster tools, first thing that needs to be done is the submission of the sitemap file of website.

Plugins to Create Sitemap.xml

There are many plugins to generate Sitemap.xml in WordPress blogs.

  • Yoast WordPress SEOSitemap is one another functionality provided by this plugin and awesome plugin for handing SEO features.

  • XML Sitemap & Google News Feeds pluginOnce installed, XML Sitemaps can be set under WordPress Admin Dashboard -> Settings -> Reading Settings -> Enable XML sitemaps. 

Note: XML Sitemaps generated using Yoast WordPress SEO plugin did not work properly for Custom Post types like Portfolio. If you are using custom post types, you could go for XML Sitemap & Google News Feeds plugin and it works like a charm.

Also Read: Add Facebook, Google+, Twitter Follow US button to your WordPress Site

Import Page URLs from Sitemap.xml to Excel

For doing in-depth analysis of the website, we often need to gather all the page url links of the website. May be as a webmaster of the website, you would like to maintain all the page URLs in Excel.

We can achieve this functionality easily with the help of Excel.

Step 1: Download the XML sitemap file http://<yourWebsiteName>/sitemap.xml and save it to your computer

Step 2:Open a New Excel workbook and open the downloaded Sitemap file

Step 3: A popup box will appear and choose ‘As as XML table’ and click on Ok. You might also need to click on OK for the next warning box

Step 4: You could see the XML nodes are properly loaded as columns in Excel worksheet. One column will contain all the page links available in the website.

Download this Excel with Macro to Fetch URL from Sitemap – Downloaded 2,164 Times

Sample Sitemap Files XML and Excel

XML Entry will look like this,

<url>
   <loc>http://officetricks.com</loc>
   <lastmod>2014-06-26T10:03:33+00:00</lastmod>
   <changefreq>daily</changefreq>
   <priority>1</priority>
</url>
<url>
   <loc>http://officetricks.com/create-wordpress-google-adsense-shortcode/</loc>
   <lastmod>2014-06-26T10:03:33+00:00</lastmod>
   <changefreq>weekly</changefreq>
   <priority>0.8</priority>
</url>

XML Table created in Excel will be created as below,

ns1:loc ns1:lastmod ns1:changefreq ns1:priority
http://officetricks.com 2014-06-26T10:03:33+00:00 daily 1
http://officetricks.com/create-wordpress-google-adsense-shortcode/ 2014-06-26T10:03:33+00:00 weekly 0.8

By following the simple steps, we are able to extract URL Links from Sitemap.xml using Excel XML Table.

More Tips: Read Website Content and Detect Dead/Broken URL from Excel

5 thoughts on “Extract URL Links from Sitemap.xml file using Excel”

Leave a Reply