XML Sitemap – What Is It?

An XML sitemap (usually named sitemap.xml) lists the pages of a website that you want search engines to index. It is not strictly required, but it helps search engine bots find pages quickly and lets search engines index them efficiently.
There are two types of Sitemap.xml: regular and index sitemaps.
A regular XML sitemap must not exceed 50 MB (uncompressed) and can include up to 50,000 URLs.
An index sitemap combines multiple regular XML sitemaps and is used for large or multilingual websites. Each index file can reference up to 50,000 sitemap files.
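The 50,000-URL limit can be illustrated with a minimal sketch that splits a URL list into chunks, one chunk per regular sitemap. The function name, the constant, and the sample URLs are hypothetical, and the 50 MB size limit is not checked here.

```python
from typing import Iterator, List

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit mentioned above


def split_into_sitemaps(urls: List[str], limit: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `limit` URLs, one chunk per regular sitemap."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


# Hypothetical site with 120,000 pages.
all_urls = [f"https://site.com/page-{i}" for i in range(120_000)]
chunks = list(split_into_sitemaps(all_urls))
print(len(chunks), [len(chunk) for chunk in chunks])
```

Each chunk would become one regular sitemap file, and an index sitemap would then reference those files.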
How to Find an XML Sitemap

There are several ways to find a sitemap:
1. Open the robots.txt file by entering https://site.com/robots.txt in the address bar. The file may point to the XML sitemap with a line in the following format: Sitemap: https://site.com/sitemap/sitemap.xml
2. If you cannot find a sitemap link in robots.txt, enter the following URL in the address bar: https://site.com/sitemap.xml
While the robots.txt file must be located at /robots.txt, the sitemap file can have any URL.
/sitemap.xml is just the most common name for an XML sitemap. Other names are possible, for example: /sitemap-categories.xml, /sitemap-en.xml, and so on.
3. You can also search within a search engine using specific operators:
- site: restricts results to a specific website.
- filetype: restricts results to a specific file type.
To find an XML file, create a search query like:
- site:site.com filetype:xml
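The robots.txt check from step 1 can also be scripted. The sketch below only parses the text of a robots.txt file. Downloading it (for example with urllib.request) is left out, and the function name and the sample file are hypothetical.

```python
from typing import List


def sitemap_urls_from_robots(robots_txt: str) -> List[str]:
    """Return every URL declared on a `Sitemap:` line of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split the directive name from its value.
        line = line.split("#", 1)[0].strip()
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


sample = """User-agent: *
Disallow: /admin/
Sitemap: https://site.com/sitemap/sitemap.xml
sitemap: https://site.com/sitemap-en.xml
"""
print(sitemap_urls_from_robots(sample))
```

The directive name is matched case-insensitively, and a file may declare more than one sitemap.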
Elements of the XML Sitemap
We already know that a sitemap can be either regular or index-based. Below are the required and optional elements for each type.
Regular Sitemap: Elements

Required elements:
- XML version specified in the first line, plus UTF-8 encoding;
- Tag: urlset – parent tag for the tags listed below; it's the standard for the current protocol;
- Tag: url – used for each URL entry. It is a child of urlset and parent to the tags listed below;
- Tag: loc – indicates the exact URL of the page. It is a child of the url tag.
Optional but recommended:
- Tag: lastmod – shows when the page was last updated. This is a child of the url tag. Google Search takes this value into account as long as it matches the actual update time of the page. The W3C Datetime format must be used: YYYY-MM-DDThh:mm:ssTZD, where TZD is the time zone designator (Z, +hh:mm, or -hh:mm). Example: 2023-03-16T20:25:40+02:00;
- Tag: changefreq – indicates how often the page is updated, with values ranging from "always" to "never";
- Tag: priority – as the name suggests, indicates the priority of a page. Value range is from 0.0 to 1.0.
According to Google's current documentation, Google Search ignores the changefreq and priority values.
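The elements above can be put together with Python's standard library. This is a minimal sketch rather than a complete generator: the function name and page list are hypothetical, only loc and lastmod are written, and the xml_declaration argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a <urlset> document from (loc, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    # xml_declaration=True emits the first line with the XML version and encoding.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Hypothetical pages; the second one has no known modification date.
sitemap_xml = build_sitemap([
    ("https://site.com/", "2023-03-16T20:25:40+02:00"),
    ("https://site.com/about", None),
])
print(sitemap_xml.decode("utf-8"))
```

ElementTree escapes special characters in the text of loc automatically, which removes one common source of manual errors.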
Index Sitemap: Elements

Required elements:
- The first line must specify the XML version and the required UTF-8 encoding for sitemap files;
- Tag: sitemapindex – parent tag, mandatory for all tags below. This is the standard;
- Tag: sitemap – essential, as it contains all information about each file. It is a child of sitemapindex;
- Tag: loc – specifies the location of the sitemap file. It is a child of tag: sitemap.
Optional but recommended:
- Tag: lastmod – specifies when the sitemap file was last updated. Important: this refers to the sitemap file itself. It is a child of the sitemap tag.
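An index sitemap follows the same pattern with sitemapindex and sitemap in place of urlset and url. The sketch below assumes two hypothetical sitemap files; the function name is illustrative as well.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemaps):
    """Build a <sitemapindex> from (loc, lastmod) pairs describing sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in sitemaps:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        if lastmod:
            # lastmod here is the modification time of the sitemap file itself.
            ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# Hypothetical sitemap files of a multilingual site.
index_xml = build_sitemap_index([
    ("https://site.com/sitemap-en.xml", "2023-03-16T20:25:40+02:00"),
    ("https://site.com/sitemap-categories.xml", None),
])
print(index_xml.decode("utf-8"))
```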
Ways to Create a Sitemap

There are 4 ways to create a Sitemap. Here's a brief overview:
- Using a CMS. Content management systems like WordPress and Wix automatically generate Sitemaps.
- Manually. This is simple if you have a single-page website or a small resource. You’ll need a text editor and good attention to syntax and accuracy.
- Third-party generator services. For example: mysitemapgenerator.com; smallseotools.com (offers a free version); xml-sitemaps.com.
- Netpeak Spider. To use this tool, follow a clear process: scan the required URLs → launch the built-in Sitemap generator → configure settings → click "Generate" → save the file.
Sitemap: Google Recommendations
The largest search engine recommends following these principles for maximum indexing of your site:
- Ensure accuracy and correctness in writing URLs.
- URLs must belong exclusively to the specified domain (!).
- Place the Sitemap in the root (main) directory of your website.
- Use only UTF-8 encoding. Escape all special characters (if writing manually).
- The search engine does not necessarily crawl every URL in your Sitemap and ignores the order in which URLs are listed.
- Pages must not contain the NOINDEX meta tag to be indexed.
- The Sitemap should be updated automatically whenever pages are added, removed, opened, or closed for indexing.
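Two of these recommendations, same-domain URLs and escaped special characters, are easy to check in code when a Sitemap is assembled by hand. The helper names below are hypothetical, and comparing scheme and host is only one possible reading of "belongs to the specified domain".

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def same_host(url: str, sitemap_url: str) -> bool:
    """True if `url` uses the same scheme and host as the Sitemap that lists it."""
    a, b = urlsplit(url), urlsplit(sitemap_url)
    return (a.scheme, a.netloc.lower()) == (b.scheme, b.netloc.lower())


def escape_loc(url: str) -> str:
    """Entity-escape &, <, >, ' and " so the URL can be written inside <loc>."""
    return escape(url, {"'": "&apos;", '"': "&quot;"})


sitemap_location = "https://site.com/sitemap.xml"
print(same_host("https://site.com/catalog", sitemap_location))
print(same_host("https://other.com/catalog", sitemap_location))
print(escape_loc("https://site.com/search?q=a&lang=en"))
```

Escaping is only needed when the XML is written as plain text; an XML library normally handles it on its own.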
Sitemap: Bing Recommendations
Bing’s recommendations are practically identical to Google’s, just slightly rephrased.
Sitemap for Multilingual Websites
To prevent search engines from treating language versions of a site as duplicates, use one of three methods:
- Add the rel="alternate" hreflang="x" attribute in the page code (the most common and convenient method).
- Use an XML-Sitemap (recommended for large sites).
- Use HTTP headers.
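For the XML-Sitemap method, each url entry lists every language version of the page, including itself, as xhtml:link elements. The sketch below assumes a hypothetical page available in English and German; the function name is illustrative.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(versions):
    """`versions` maps a language code to the URL of that language version."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:xhtml": XHTML_NS})
    for loc in versions.values():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # Every entry repeats the full set of alternates, itself included.
        for lang, href in versions.items():
            ET.SubElement(url, "xhtml:link", rel="alternate", hreflang=lang, href=href)
    return ET.tostring(urlset, encoding="unicode")


hreflang_xml = build_hreflang_sitemap({
    "en": "https://site.com/en/",
    "de": "https://site.com/de/",
})
print(hreflang_xml)
```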
Sitemap for Static Images
Sometimes search engines cannot scan images on a website, especially when images are loaded via JavaScript.
To resolve this, you can either include image links in the regular Sitemap or create a separate Image Sitemap.
Both approaches require defining an XML namespace with image-specific tags: image:image and image:loc.
Additional optional tags include image:caption, image:geo_location, image:title, and image:license. They are not required, and search engines may ignore them; Google deprecated all four in 2022.
Important! An Image Sitemap must use UTF-8 encoding, can include up to 1,000 images per URL, and must be updated regularly!
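A minimal sketch of an Image Sitemap with the two image-specific tags, plus a check for the 1,000-images-per-URL limit. The function name and sample URLs are hypothetical; the namespace is the one Google documents for the image extension.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
MAX_IMAGES_PER_URL = 1_000  # limit mentioned above


def build_image_sitemap(pages):
    """`pages` maps a page URL to the list of image URLs shown on that page."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:image": IMAGE_NS})
    for page_url, image_urls in pages.items():
        if len(image_urls) > MAX_IMAGES_PER_URL:
            raise ValueError(f"{page_url}: more than {MAX_IMAGES_PER_URL} images")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, "image:image")
            ET.SubElement(image, "image:loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


image_xml = build_image_sitemap({
    "https://site.com/gallery": [
        "https://site.com/img/photo-1.jpg",
        "https://site.com/img/photo-2.jpg",
    ],
})
print(image_xml)
```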
Sitemap for Video Content
The approach and requirements for creating a Video Sitemap are similar to those for images. By creating a video Sitemap, you inform the search engine that your site contains video content—especially important for recently published videos.
Google provides recommendations for creating a Video Sitemap. Key points include:
- Use UTF-8 encoding.
- A Video Sitemap must not exceed 50 MB or contain more than 50,000 video entries. Larger volumes require an index video Sitemap.
- A Video Sitemap does not guarantee indexing.
- All pages must return an HTTP 200 response code.
- The link must be included in robots.txt.
- Regular automatic updates are required.
First, define the namespace for video tags, then list the tags themselves (over twenty are supported).
Three parent tags: urlset, url (child of urlset), and video:video.
Other tags relate to links and descriptions: video:thumbnail_loc, video:title, video:description, video:content_loc, video:player_loc.
Search engines also welcome: video:duration and video:expiration_date.
Tags indicating ratings, views, access restrictions, and similar data are optional.
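The tag hierarchy described above can be sketched as follows. The helper name, the sample video data, and the choice to write only the seven tags named in this section are assumptions; duration is assumed to be given in seconds.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Tags named in this section, in the order they are written out.
VIDEO_TAGS = ("thumbnail_loc", "title", "description", "content_loc",
              "player_loc", "duration", "expiration_date")


def add_video_entry(urlset, page_url, video):
    """Append url > video:video for one page; `video` is a dict keyed by tag name."""
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page_url
    node = ET.SubElement(url, "video:video")
    for tag in VIDEO_TAGS:
        if tag in video:
            ET.SubElement(node, f"video:{tag}").text = str(video[tag])
    return url


urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:video": VIDEO_NS})
add_video_entry(urlset, "https://site.com/videos/intro", {
    "thumbnail_loc": "https://site.com/thumbs/intro.jpg",
    "title": "Introduction",
    "description": "A short introduction video.",
    "content_loc": "https://site.com/media/intro.mp4",
    "duration": 95,
})
video_xml = ET.tostring(urlset, encoding="unicode")
print(video_xml)
```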
Sitemap for News Websites
Ideally, news websites should have a dedicated Sitemap that updates automatically every day to qualify for inclusion in Google News.
Such a Sitemap can contain up to 1,000 URLs. Google strongly recommends regular updates, especially if dozens of news articles are published daily. The News Sitemap should be placed in a relevant category like “News” or “News Feed,” or in the site’s root directory.
Important! Only include articles from the last 48 hours in the file; older entries should be removed. Articles will remain indexed in Google News for up to 30 days.
Required tags for a News Sitemap:
Two parent tags: news:news and news:publication, which has two child tags: news:name and news:language (in ISO 639-1 format).
Also important: news:publication_date and news:title.
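The 48-hour rule and the required tags can be combined in one small generator. This is a sketch under stated assumptions: the function name, publication name, and articles are hypothetical, the current time is passed in explicitly so the example is repeatable, and the 1,000-URL limit is not enforced.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def build_news_sitemap(articles, publication_name, language, now):
    """Build a News Sitemap from (loc, title, published) tuples.

    `published` and `now` must be timezone-aware datetimes.
    """
    cutoff = now - timedelta(hours=48)
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:news": NEWS_NS})
    for loc, title, published in articles:
        if published < cutoff:
            continue  # older than 48 hours: leave it out of the file
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        news = ET.SubElement(url, "news:news")
        publication = ET.SubElement(news, "news:publication")
        ET.SubElement(publication, "news:name").text = publication_name
        ET.SubElement(publication, "news:language").text = language
        ET.SubElement(news, "news:publication_date").text = published.isoformat()
        ET.SubElement(news, "news:title").text = title
    return ET.tostring(urlset, encoding="unicode")


now = datetime(2024, 5, 3, 12, 0, tzinfo=timezone.utc)  # fixed "now" for a repeatable example
articles = [
    ("https://site.com/news/fresh", "Fresh story", now - timedelta(hours=5)),
    ("https://site.com/news/stale", "Stale story", now - timedelta(days=4)),
]
news_xml = build_news_sitemap(articles, "Example Times", "en", now)
print(news_xml)
```

Only the fresh article ends up in the output; the four-day-old one is filtered out before any XML is written.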
Ways to Implement a Sitemap
To help search engines find your XML-Sitemap, use one or more of the following methods:
- Submit via Google Search Console;
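- Use a ping request, that is, a GET request containing your Sitemap URL, where the search engine still supports it (Google retired its ping endpoint in 2023);
- Add your Sitemap URL to robots.txt.
XML-Sitemaps are analyzed upon first discovery, not necessarily during every site crawl. If you update the file, keep the lastmod values accurate and resubmit the Sitemap in Google Search Console, or send a ping request to search engines that still accept one.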
Sitemap: How to Detect Errors
If you followed all instructions correctly, major errors should not occur. Use these tools to check for issues:
- Google Search Console → “Sitemaps” report.
- Netpeak Spider → “Tools” → “Validator” (works automatically).
Tip! If your site is poorly indexed despite correct setup and no errors, try splitting your Sitemap into smaller parts and submit them separately. However, avoid overdoing it—excessive fragmentation may cause data loss in Google Search Console.