ADITYA R SHARMA

How to Extract Links from Websites using Sitemap in Sheets

Are you tired of manually copying and pasting website links into your Google Sheets? Do you want a more efficient way to import website links into your spreadsheet?

Look no further than using a sitemap and ImportXML in Google Sheets.

I recently discovered this method and it has saved me a significant amount of time and effort. By utilizing a website’s sitemap and the ImportXML function in Google Sheets, I am able to easily import a list of website links into my spreadsheet.

This has been especially useful for tracking backlinks and conducting competitor research.

Table of Contents

What is a Sitemap?

As I begin to explain how to import website links using sitemap in Google Sheet using ImportXML for free, it’s important to first understand what a sitemap is.

In simple terms, a sitemap is a file that provides information about the pages, videos, and other files on a website, and the relationships between them. It helps search engines like Google to crawl and index all of a website’s content.

Sitemap
Example of Yoast SEO Plugin’s Sitemap

Sitemaps come in different formats, but the most common one is the XML sitemap. This type of sitemap is a structured file that lists all the pages on a website, along with additional information like their last modified date, priority, and frequency of change.

XML sitemaps are essential for any website owner who wants to improve their website’s visibility on search engines. By providing search engines with a clear roadmap of a website’s content, sitemaps can help to ensure that all pages are indexed and ranked appropriately.

I have also created a blog where I share how to extract all video links from YouTube Channel, as with YouTube you can run Import XML, we need the YouTube API v3. – Read How

Importing Website Links using Sitemap in Google Sheets

I will show you how to import website links using a sitemap in Google Sheets for free.

Step 1: Create a new Google Sheet and name it

The first step is to create a new Google Sheet and give it a name. This will be the sheet where you will import the website links from the sitemap.

To create a new Google Sheet, go to your Google Drive and click on the “New” button. Then, select “Google Sheets” from the drop-down menu. Once you have created the sheet, give it a name that is easy to remember.

or simply type sheets.new in your browser url field.

Step 2: Use ImportXML function to import data from Sitemap

The next step is to use the ImportXML function to import data from the sitemap. The ImportXML function is a powerful tool that allows you to extract data from any XML field on a website. To use the ImportXML function, you need to know the URL of the sitemap. Once you have the URL, you can use the following formula:

Extract Site URL from Sitemap

=IMPORTXML("URL","//*[local-name() ='url']/*[local-name() ='loc']")

This formula will import all the URLs from the sitemap into your Google Sheet.

So let’s test the output, we will use Yoast SEO website’s pages URL.

List of all Extracted URL in sheets from Sitemap

Voilà! And we’re done, we have successfully scraped all the links from Yoast SEO website.

Common Errors and Troubleshooting

Importing website links using sitemap in Google Sheet using ImportXML can sometimes encounter errors. Here are some of the common errors and their corresponding troubleshooting steps.

Personalised WordPress Hosting for Agencies

Are you Spending hours:

  • Managing your WordPress website?
  • Troubleshooting Technical difficulties?
  • Dealing with plugin Security Issues?
  • Website Uptime Monitoring?
  • Cleaning Hacked Sites?

Let me take care of your WordPress website, so you can focus on scaling your agency.

Give your clients top-notch VPS Hosting for blazing fast performance, 24×7 security monitoring with autopilot safe plugin updates.

I will help you free up time by managing WordPress websites.

#N/A Error

One of the most common errors when importing website links using sitemap in Google Sheet is the #N/A error. This error appears when the ImportXML function is unable to retrieve the data from the sitemap.

To troubleshoot this error, you can try the following:

  • Check the URL of the sitemap. Make sure that it is correct and complete.
  • Verify that the sitemap is accessible and not blocked by the website’s robots.txt file.
  • Ensure that the syntax of the ImportXML function is correct.

Xpath Queries

Another issue that you might encounter when importing website links using sitemap in Google Sheet is with Xpath queries. Xpath is a query language used to extract data from XML documents, including sitemaps.

To troubleshoot Xpath errors, you may want to consider the following:

  • Verify that the Xpath query is correct and matches the structure of the sitemap.
  • Ensure that the Xpath query is enclosed in double quotes.
  • Check that the Xpath query does not contain any typos or syntax errors.

XML Files

Finally, you may encounter issues with the XML files themselves. This can happen if the sitemap is corrupted or if it contains invalid characters.

To troubleshoot XML file errors, you can try the following:

  • Verify that the sitemap is a valid XML file.
  • Check that the sitemap does not contain any invalid characters or syntax errors.
  • Try opening the sitemap in a different XML editor to see if the issue persists.

By following these troubleshooting steps, you can resolve common errors when importing website links using sitemap in Google Sheet using ImportXML.



I share my Learnings & Case studies via email.
Subscribe to Stay Updated

Conclusion

In conclusion, using ImportXML in Google Sheets is a powerful tool to import website links and other data into your spreadsheet. By following the steps outlined in this article, you can easily scrape website links using a sitemap and import them into your Google Sheet for free.

However, it is important to keep in mind that ImportXML has its limitations. It may not work with all websites, and some websites may require more complex XPath queries to extract the desired data. Additionally, ImportXML may not be as reliable as other paid web scraping tools.

Aditya R Sharma

Hello! Currently, I am working as Chief Marketing Officer at POSIMYTH Innovations, with a deep love for WordPress. I love solving problem with AI, Automations & Critical planning. In my free time, I write blogs to share all his learnings with the internet.

Content Writing for SaaS & WordPress Business
Tired of Low Traffic & see your Blog Fail?

Let me help you with content writing & SEO strategy for your product that makes sales.

After raising multiple domains to 1M+ organic click, I am helping Business owners like you harness the real power of SEO.

Explore Further