InstaRank SEO
Free SEO Tool

Free X-Robots-Tag Checker

Analyze X-Robots-Tag HTTP headers across your website. Detect noindex, nofollow, sitemap conflicts, and directive issues across 10 parameters.

10 Parameters
Up to 500 Pages
100% Free

How It Works

1

Enter Your URL

Enter your website URL and we'll discover up to 500 pages using link crawling and sitemap analysis.

2

Analyze Headers

We check every page's HTTP response headers for X-Robots-Tag directives and compare with meta robots tags.

3

Get Results

View your score across 10 parameters with detailed issues, impact analysis, and actionable fix suggestions.

What Is X-Robots-Tag?

The X-Robots-Tag is an HTTP response header that allows webmasters and developers to control how search engines crawl, index, and display content from their websites. Unlike the more commonly known meta robots tag, which is embedded directly in the HTML of a page, the X-Robots-Tag is sent as part of the HTTP response headers. This makes it particularly useful for controlling indexing behavior on non-HTML resources such as PDF files, images, video files, and other document types that do not support HTML meta tags.

The X-Robots-Tag was introduced by Google and is now supported by all major search engines, including Bing and Yandex. It accepts the same directives as the meta robots tag, such as noindex, nofollow, noarchive, and nosnippet. When a search engine crawler encounters an X-Robots-Tag header in a server response, it processes the directives in the same way it would process a meta robots tag found in the HTML head section.

One of the key strengths of the X-Robots-Tag is its flexibility. Because it is set at the server level, it can be applied to any response returned by the web server, regardless of file type. This means you can prevent indexing of PDF documents, image galleries, downloadable files, API responses, or any other resource type. You can also apply different directives to different user agents, allowing you to give Google and Bing different instructions for the same content.

Search engines combine directives from multiple sources when processing a page. If a page contains both a meta robots tag and an X-Robots-Tag header, the search engine will follow the most restrictive combination of the two. For example, if the meta robots tag says "index, follow" but the X-Robots-Tag header says "noindex," the page will not be indexed. Understanding this behavior is critical for avoiding accidental deindexing of important pages.
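The "most restrictive wins" behavior can be sketched in a few lines of Python. This is a simplified model for illustration only, not any search engine's actual implementation; real crawlers handle more directives and per-user-agent scoping.

```python
# Simplified model of how a crawler might combine robots directives
# from a meta robots tag and an X-Robots-Tag header.

RESTRICTIVE = {"noindex", "nofollow", "none", "noarchive", "nosnippet"}

def effective_directives(meta: str, header: str) -> set[str]:
    """Union of directives from both sources; restrictive ones win."""
    directives = set()
    for source in (meta, header):
        for token in source.split(","):
            token = token.strip().lower()
            if token:
                directives.add(token)
    # "none" is shorthand for noindex + nofollow
    if "none" in directives:
        directives |= {"noindex", "nofollow"}
    # A restrictive directive overrides its permissive counterpart
    if "noindex" in directives:
        directives.discard("index")
    if "nofollow" in directives:
        directives.discard("follow")
    return directives

# Meta says "index, follow" but the header says "noindex":
# the page will not be indexed.
print("noindex" in effective_directives("index, follow", "noindex"))  # → True
```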

X-Robots-Tag vs Meta Robots: Key Differences

While the X-Robots-Tag and the meta robots tag serve the same fundamental purpose -- instructing search engines on how to handle content -- they differ in important ways that affect when and where you should use each one. Choosing the right mechanism depends on the type of content, your server configuration capabilities, and the scope of the directive.

The meta robots tag is placed inside the HTML head section of a web page using the syntax <meta name="robots" content="noindex, nofollow">. Because it lives inside the HTML document, it can only be used on resources that contain HTML. You cannot add a meta robots tag to a PDF file, an image, a CSS stylesheet, or a JavaScript file. This is the primary limitation that the X-Robots-Tag was designed to address.

The X-Robots-Tag is set via the HTTP response header, which means it can be applied to any type of resource. Whether the response is an HTML page, a PDF document, an image file, or a JSON API response, the X-Robots-Tag header can instruct the search engine on how to handle that resource. This makes it the only viable option for controlling indexing of non-HTML content.

Here is a comparison of the two approaches:

Feature                     | Meta Robots Tag       | X-Robots-Tag Header
Location                    | HTML <head> section   | HTTP response header
HTML pages                  | Yes                   | Yes
PDF / Images / Video        | No                    | Yes
User-agent targeting        | Via name attribute    | Via header prefix
Requires HTML editing       | Yes                   | No
Requires server config      | No                    | Yes
Bulk application via rules  | Difficult (per-page)  | Easy (server rules)
Supported by Google         | Yes                   | Yes
Supported by Bing           | Yes                   | Yes

An important consideration is that modifying HTTP headers requires access to server configuration, which may not be available on all hosting platforms. Shared hosting environments often limit access to server headers, making the meta robots tag the more practical option for HTML pages. On dedicated servers or virtual private servers where you have full control over web server software such as Nginx or Apache, the X-Robots-Tag offers more powerful and flexible options.

In practice, most SEO professionals use both approaches together. The meta robots tag is used for page-level HTML control, while the X-Robots-Tag is used for non-HTML resources and for applying broad rules across entire directories or file types through server configuration. The key is to ensure the two methods are not in conflict, since search engines follow the most restrictive directive found in either source.

X-Robots-Tag Directives Explained

The X-Robots-Tag supports a wide range of directives that give you granular control over how search engines interact with your content. Each directive instructs the crawler to handle a specific aspect of indexing, caching, or displaying your content in search results. Multiple directives can be combined in a single header, separated by commas.

noindex

The noindex directive tells search engines not to include the resource in their search results. This is the most commonly used directive and is critical for preventing internal pages, duplicate content, staging environments, or private documents from appearing in search results. When a search engine encounters noindex, it will crawl the page but will not add it to its index. If the page was previously indexed, it will be removed over time. Note that noindex does not prevent the page from being crawled -- for that, you need robots.txt Disallow rules.

nofollow

The nofollow directive instructs search engines not to follow any links found in the resource. This means the search engine will not pass PageRank or other link equity through the links on this page. This is useful for user-generated content pages, login pages, or pages with untrusted external links. It is important to understand that nofollow on the X-Robots-Tag applies to all links on the page, unlike the rel="nofollow" attribute on individual links.

none

The none directive is a shorthand equivalent to specifying both noindex and nofollow together. Using X-Robots-Tag: none tells search engines not to index the page and not to follow any links on it. This is the most restrictive general-purpose directive and is appropriate for content that should be completely excluded from search engine consideration.

noarchive

The noarchive directive prevents search engines from storing a cached copy of the page. Google has since retired the "Cached" link from its search results, but the directive still applies to other search engines, such as Bing, that can expose cached versions. It remains useful for pages with time-sensitive content, pricing information, or sensitive data that should not stay accessible through cached copies after the content has been updated or removed.

nosnippet

The nosnippet directive prevents search engines from displaying text snippets or video previews for the page in search results. The page title will still appear, but no description text will be shown beneath it. This is rarely recommended for SEO purposes because snippets help users decide whether to click on a result, and removing them typically reduces click-through rates significantly.

noimageindex

The noimageindex directive instructs search engines not to index images on the page for Google Images or other image search results. The page itself may still be indexed, but images embedded within it will not appear in image search. This can be applied to both HTML pages and directly to image files via the X-Robots-Tag header.

max-snippet

The max-snippet:[number] directive controls the maximum length (in characters) of a text snippet shown in search results. Setting max-snippet:0 is equivalent to nosnippet. Setting max-snippet:-1 means no limit. A value like max-snippet:160 restricts the snippet to 160 characters. This directive gives you fine-grained control over how much of your content appears in search results while still allowing some preview text to attract clicks.

max-image-preview

The max-image-preview:[setting] directive controls the maximum size of an image preview displayed in search results. Valid values are none (no image preview), standard (a default-sized preview), and large (a large preview, which is generally recommended for better visibility in search results and is needed for large image previews in Google Discover). Setting this to large is a best practice for content pages whose visibility you want to maximize.

max-video-preview

The max-video-preview:[number] directive sets the maximum number of seconds of a video snippet that may be shown in search results. A value of 0 means a still image may be used instead of a video preview. A value of -1 means there is no limit. This is relevant for pages that embed video content and appear in video search results.

unavailable_after

The unavailable_after:[date/time] directive tells search engines to stop showing the page in search results after the specified date and time. The date and time must be in a widely adopted format, such as RFC 822, RFC 850, or ISO 8601. This is useful for event pages, limited-time offers, job postings, or any content with a defined expiration. After the specified date, search engines will treat the page as if it has a noindex directive. Example: unavailable_after: 25 Jun 2026 15:00:00 EST.
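Putting the syntax together: a header value may carry an optional user-agent prefix (googlebot: noindex) and valued directives (max-snippet:160). The Python sketch below is a rough, illustrative parser of that shape, not a spec-complete implementation; real crawler parsing is more forgiving.

```python
# Illustrative parser for an X-Robots-Tag header value. Handles an
# optional user-agent prefix and "name:value" directives. A sketch only.

VALUED = {"max-snippet", "max-image-preview", "max-video-preview",
          "unavailable_after"}
FLAGS = {"noindex", "nofollow", "none", "noarchive", "nosnippet",
         "noimageindex", "all"}

def parse_x_robots_tag(value: str) -> tuple[str, dict]:
    """Return (user_agent, {directive: value_or_None})."""
    agent = "*"  # applies to all crawlers unless a prefix is present
    head, sep, rest = value.partition(":")
    first = head.strip().lower()
    # A leading token before ":" is a user agent only if it is not
    # itself a known directive name (e.g. "unavailable_after").
    if sep and first not in VALUED and first not in FLAGS:
        agent, value = first, rest
    directives = {}
    for part in value.split(","):
        name, sep, val = part.partition(":")
        name = name.strip().lower()
        if name:
            directives[name] = val.strip() if sep else None
    return agent, directives

print(parse_x_robots_tag("googlebot: noindex, nofollow"))
# → ('googlebot', {'noindex': None, 'nofollow': None})
```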

How to Set X-Robots-Tag Headers

Setting the X-Robots-Tag header requires access to your web server configuration or application code. The exact method depends on your server software, hosting environment, and the level of control you need. Below are configuration examples for the most common server environments.

Apache (.htaccess or httpd.conf)

Apache uses the Header directive from mod_headers to set X-Robots-Tag headers. You can apply the header globally, to specific file types, or to specific directories:

# Apply noindex to all PDF files
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

# Apply noindex to specific directories
<Directory "/var/www/html/private">
    Header set X-Robots-Tag "noindex"
</Directory>

# Apply noindex to all image files
<FilesMatch "\.(png|jpg|jpeg|gif|webp|svg)$">
    Header set X-Robots-Tag "noimageindex"
</FilesMatch>

# Target a specific user agent (Google only)
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "googlebot: noindex"
</FilesMatch>

# Combine multiple directives
Header set X-Robots-Tag "noindex, noarchive, max-snippet:0"

# Apply to multiple user agents with separate headers
# (Header add emits a new header line each time; append would
# merge the values into one comma-separated header)
Header add X-Robots-Tag "googlebot: noindex"
Header add X-Robots-Tag "bingbot: noindex, nofollow"

Nginx

In Nginx, use the add_header directive within server or location blocks to set X-Robots-Tag headers:

# Apply noindex to all PDF files
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow" always;
}

# Apply noindex to an entire directory
location /private/ {
    add_header X-Robots-Tag "noindex" always;
}

# Apply noindex to staging or preview pages
location /preview/ {
    add_header X-Robots-Tag "noindex, nofollow" always;
}

# Apply to all image files
location ~* \.(png|jpg|jpeg|gif|webp|svg)$ {
    add_header X-Robots-Tag "noimageindex" always;
}

# Target specific user agent
location ~* \.pdf$ {
    add_header X-Robots-Tag "googlebot: noindex" always;
}

# Set on the entire site (use with caution)
server {
    # ...
    add_header X-Robots-Tag "max-image-preview:large" always;
}

The always keyword ensures the header is sent regardless of the response status code, which is important for 4xx and 5xx error pages. Without it, Nginx only adds custom headers to successful (2xx) and redirect (3xx) responses.

Application-Level (PHP, Node.js, Python)

You can also set the X-Robots-Tag header directly from your application code, which provides the most dynamic control:

# PHP
header("X-Robots-Tag: noindex, nofollow", true);

# Node.js (Express)
app.use('/private', (req, res, next) => {
    res.set('X-Robots-Tag', 'noindex, nofollow');
    next();
});

# Python (Django middleware)
class XRobotsTagMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
    def __call__(self, request):
        response = self.get_response(request)
        if request.path.startswith('/private/'):
            response['X-Robots-Tag'] = 'noindex, nofollow'
        return response

# Next.js (in next.config.js headers)
async headers() {
    return [
        {
            source: '/private/:path*',
            headers: [
                { key: 'X-Robots-Tag', value: 'noindex, nofollow' },
            ],
        },
    ];
}

Application-level configuration is ideal when the decision to noindex a page depends on dynamic factors such as the content type, user role, or business logic that cannot easily be expressed in static server configuration rules.

How to Check X-Robots-Tag Headers

Verifying that your X-Robots-Tag headers are set correctly is essential for ensuring your indexing strategy works as intended. There are several methods to inspect HTTP headers returned by your server.

Using Browser Developer Tools

Open your browser's developer tools (F12 in Chrome, Firefox, or Edge), navigate to the Network tab, and reload the page. Click on the primary document request and inspect the Response Headers section. Look for the X-Robots-Tag header. If present, its value will show the directives being applied. This method works for any resource type, including PDF files and images, as long as the browser loads them directly.

Using Command Line Tools

The curl command-line tool is one of the quickest ways to inspect HTTP response headers. Use the -I flag to fetch only the headers, or -D - to see headers alongside the body:

# Fetch only headers
curl -I https://example.com/document.pdf

# Look specifically for X-Robots-Tag
curl -sI https://example.com/ | grep -i "x-robots-tag"

# Check with a specific user agent
curl -A "Googlebot/2.1" -sI https://example.com/ | grep -i "x-robots-tag"
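The same check can be scripted for batch audits. The sketch below extracts every X-Robots-Tag value from a raw header blob such as the output of curl -sI; header names are matched case-insensitively, and the sample response is fabricated for illustration.

```python
# Extract every X-Robots-Tag value from a raw HTTP header blob
# (e.g. the output of `curl -sI`). Header names are case-insensitive,
# and a single response may carry several of these headers.

def extract_x_robots_tag(raw_headers: str) -> list:
    values = []
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "x-robots-tag":
            values.append(value.strip())
    return values

sample = """HTTP/2 200
content-type: application/pdf
x-robots-tag: noindex, nofollow
X-Robots-Tag: googlebot: noarchive
"""
print(extract_x_robots_tag(sample))
# → ['noindex, nofollow', 'googlebot: noarchive']
```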

Using Our Free X-Robots-Tag Checker

Our X-Robots-Tag Checker tool automatically crawls your website and inspects the HTTP headers of every page and resource. It identifies pages with noindex directives, detects conflicts between X-Robots-Tag headers and meta robots tags, checks for accidental noindex on important pages, and flags staging or development configuration that may have leaked into production. The tool provides a comprehensive report with severity ratings and actionable fix recommendations.

Using Google Search Console

Google Search Console's URL Inspection tool shows how Google sees your pages, including any X-Robots-Tag directives that were detected during crawling. If Google has found an X-Robots-Tag with noindex on a page, the URL Inspection tool will report it as "Excluded by 'noindex' tag." The Page indexing report (formerly Coverage) also aggregates pages that are excluded from indexing, making it easy to spot patterns where X-Robots-Tag may be applied unintentionally.

Common X-Robots-Tag Mistakes

Misconfigured X-Robots-Tag headers can cause significant damage to your search visibility. Because the header operates at the server level and is often invisible to content editors, mistakes can persist for weeks or months before being discovered. Here are the most common errors and how to avoid them.

Accidental noindex on Important Pages

The most damaging mistake is accidentally applying a noindex directive to pages that should be indexed. This typically happens when a broad server configuration rule unintentionally matches important pages, or when a staging or development configuration is deployed to production without being removed. For example, a rule intended to noindex a specific directory may use a regex pattern that also matches other URLs. Always test your server configuration rules against your full URL inventory before deploying.

To prevent this, maintain an explicit list of URL patterns that should have noindex applied, and implement monitoring that alerts you if the number of indexed pages in Google Search Console drops unexpectedly. A sudden decrease in indexed pages often indicates an accidental noindex deployment.

Staging Configuration Leaking to Production

Many development teams apply a global X-Robots-Tag: noindex header to staging and development environments to prevent search engines from indexing test content. When the application or server configuration is deployed to production without removing this header, the entire production site becomes deindexed. This is one of the most common and catastrophic SEO mistakes that teams make.

The solution is to use environment-specific configuration that sets the X-Robots-Tag header conditionally based on the environment. In your deployment pipeline, verify that noindex headers are not present in production builds. Implement automated checks that flag any X-Robots-Tag: noindex header on production URLs as a critical alert.

Conflicting Directives Between X-Robots-Tag and Meta Robots

When both X-Robots-Tag and meta robots directives exist for the same page, search engines follow the most restrictive combination. If your meta robots tag says "index, follow" but your X-Robots-Tag header says "noindex," the page will not be indexed. These conflicts are difficult to detect because they involve two different systems -- the HTML template and the server configuration -- that are often managed by different teams.

Audit your site regularly by comparing the meta robots directives in HTML against the X-Robots-Tag headers returned by the server. Our checker tool automates this comparison and highlights any conflicts. When conflicts are found, decide which system should be authoritative for each URL pattern and remove the conflicting directive from the other.

Sitemap Conflicts with noindex

Including URLs in your XML sitemap that have an X-Robots-Tag: noindex header sends conflicting signals to search engines. The sitemap tells crawlers "this page is important, please index it," while the header says "do not index this page." Google has stated that when these signals conflict, the noindex directive takes precedence. However, this inconsistency wastes crawl budget, generates warnings in Google Search Console, and indicates poor site configuration.

Ensure that any URL with a noindex directive (whether via X-Robots-Tag or meta robots) is excluded from your XML sitemap. Set up automated sitemap generation that checks for noindex directives before including a URL. If you use a CMS plugin to generate your sitemap, verify that it respects both meta robots tags and X-Robots-Tag headers.
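That check can be automated in the sitemap build step. The Python sketch below filters a candidate URL list against known directives before the sitemap is written; the helper name and sample data are hypothetical, and in practice the directive map would come from an audit of both meta robots tags and X-Robots-Tag headers.

```python
# Drop any URL carrying a noindex (or none) directive before it is
# written to the XML sitemap. The data here is illustrative only.

def sitemap_eligible(urls, directives_by_url):
    """Return only URLs without a noindex/none directive."""
    blocked = {"noindex", "none"}
    eligible = []
    for url in urls:
        tokens = {t.strip().lower()
                  for t in directives_by_url.get(url, "").split(",")}
        if not (tokens & blocked):
            eligible.append(url)
    return eligible

directives_by_url = {
    "/pricing": "index, follow",
    "/private/report.pdf": "noindex, nofollow",
    "/old-offer": "none",
}
print(sitemap_eligible(list(directives_by_url), directives_by_url))
# → ['/pricing']
```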

Forgetting to Include the Header on All Response Codes

In Nginx, custom headers added with add_header are only sent on 2xx and 3xx responses by default. If you need the X-Robots-Tag to apply to error pages (such as a custom 404 page that you do not want indexed), you must include the always keyword. In Apache, the Header directive applies to all responses by default, but be aware of the interaction between set, append, and merge behaviors to avoid overwriting or duplicating headers.

When to Use X-Robots-Tag vs Meta Robots

Choosing between X-Robots-Tag and meta robots depends on your specific use case. In many situations, the answer is clear-cut based on the type of resource and the control you need. Here are guidelines for making the right choice.

Use X-Robots-Tag When:

Controlling non-HTML resources: PDF files, images, video files, and other binary content cannot contain a meta robots tag. The X-Robots-Tag header is the only way to instruct search engines on how to handle these resources.

Applying rules across entire directories or file types: When you need to noindex all files in a directory or all files of a specific type, server-level rules with X-Robots-Tag are more efficient and maintainable than adding meta tags to every individual page.

Applying rules without modifying HTML: If you cannot easily modify the HTML of your pages (for example, with a legacy CMS or third-party application), the X-Robots-Tag allows you to control indexing at the server level without touching the HTML source.

Targeting specific search engine bots: The X-Robots-Tag allows you to target specific user agents by prefixing the directive, such as googlebot: noindex. While meta robots can also target specific bots via the name attribute, the X-Robots-Tag approach is sometimes simpler when combined with server configuration rules.

Use Meta Robots When:

Page-level control in a CMS: When content editors need to control indexing on a per-page basis through the CMS interface, meta robots tags are more practical. Most CMS platforms provide a user-friendly option to set meta robots without requiring server access.

No server configuration access: On shared hosting or managed platforms where you cannot modify server headers, the meta robots tag is your only option for controlling indexing of HTML pages.

Dynamic per-page decisions: When the indexing decision depends on the content of the page itself (such as noindexing pages with thin content), it is often easier to implement this logic within the page template using a meta robots tag rather than at the server configuration level.

For maximum control and coverage, many sites use both mechanisms in coordination. The meta robots tag handles per-page decisions within the HTML, while the X-Robots-Tag handles non-HTML resources and broad directory-level rules at the server configuration level. The key is to document your indexing strategy clearly so that both systems remain consistent and no conflicts arise.

X-Robots-Tag Best Practices

Following best practices for X-Robots-Tag implementation ensures that your indexing strategy is effective, maintainable, and free of unintended consequences. These guidelines are based on Google's official documentation and industry experience.

1. Audit Headers Before and After Deployment

Before deploying any server configuration changes that affect X-Robots-Tag headers, test against a comprehensive list of your site's URLs. After deployment, verify the headers are correct on production by checking a sample of pages across different sections and file types. Automated header checking tools can streamline this process.

2. Use Environment-Specific Configuration

Never use the same server configuration file for staging and production without environment-specific overrides. Ensure that noindex headers applied to staging environments are automatically removed or overridden when deploying to production. Use environment variables or separate configuration files to manage this.

3. Monitor Indexed Page Counts

Set up monitoring in Google Search Console to track the number of indexed pages over time. A sudden drop in indexed pages is often the first indicator of an accidental noindex deployment. Configure alerts to notify your team immediately if the indexed page count drops below a threshold.
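A minimal alerting rule can be expressed as a drop-percentage check against a trailing baseline. In the sketch below the 20% threshold is an arbitrary example, and in practice the counts would come from the Search Console API or an export.

```python
# Flag a suspicious drop in indexed pages. A sudden fall relative to a
# trailing baseline often signals an accidental noindex deployment.
# The 20% default threshold is an arbitrary example.

def indexed_count_alert(baseline: int, current: int,
                        max_drop_pct: float = 20.0) -> bool:
    """True if indexed pages fell more than max_drop_pct vs baseline."""
    if baseline <= 0:
        return False
    drop_pct = (baseline - current) / baseline * 100
    return drop_pct > max_drop_pct

print(indexed_count_alert(baseline=10_000, current=7_500))  # → True
print(indexed_count_alert(baseline=10_000, current=9_500))  # → False
```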

4. Keep Sitemap and X-Robots-Tag in Sync

Never include URLs with noindex directives in your XML sitemap. Implement automated checks in your sitemap generation process to verify that every URL included is eligible for indexing. This prevents wasting crawl budget and sending conflicting signals to search engines.

5. Prefer Specific Rules Over Broad Rules

When configuring X-Robots-Tag directives, use the most specific rules possible. Instead of applying noindex to an entire directory, target specific file types or URL patterns. Broad rules are more likely to accidentally affect pages that should be indexed. The more specific your rules, the less likely you are to cause unintended deindexing.

6. Document Your Indexing Strategy

Maintain clear documentation of which URL patterns have X-Robots-Tag directives applied, why they are applied, and who is responsible for managing them. This documentation prevents configuration drift over time and helps new team members understand the rationale behind the current setup.

7. Test with Multiple User Agents

If you use user-agent-specific X-Robots-Tag directives, test with each targeted user agent to verify the correct headers are returned. Use curl with the -A flag to simulate different crawler user agents and confirm the expected directives are present.

8. Use max-image-preview:large for Content Pages

For pages where you want maximum visibility in search results and Google Discover, set max-image-preview:large. This allows Google to display large image previews alongside your results, which typically increases click-through rates; Google will only show large preview images for your content in Discover when this setting (or AMP) is in place.

Frequently Asked Questions

Does X-Robots-Tag prevent crawling of a page?

No. The X-Robots-Tag only controls indexing and display behavior, not crawling. Search engines must first crawl a page to discover the X-Robots-Tag header, so the directive is inherently post-crawl. If you want to prevent crawling entirely, you need to use robots.txt Disallow rules. However, be aware that blocking crawling with robots.txt means the search engine cannot see the noindex directive, so it may still index the page based on external signals like inbound links. For pages you want excluded from search results, the recommended approach is to allow crawling but use noindex via X-Robots-Tag or meta robots.

Can I use X-Robots-Tag and meta robots together on the same page?

Yes, you can use both on the same page, but be aware that search engines will combine the directives and follow the most restrictive set. If your meta robots says "index, follow" and your X-Robots-Tag says "noindex," the page will not be indexed. This combination is valid but can lead to confusion if different teams manage the HTML templates and server configuration independently. It is best practice to use one mechanism as the authoritative source for each URL pattern and ensure they do not conflict.

How quickly does Google respond to a noindex X-Robots-Tag?

Google processes noindex directives during its next crawl of the affected page. For pages that are crawled frequently (such as your homepage or high-authority pages), the change can take effect within a few days. For pages that are crawled less frequently, it may take several weeks. You can use Google Search Console's URL Inspection tool to request a recrawl, which can speed up the process. To remove pages from the index more quickly, you can also use the Removals tool in Search Console as a temporary measure while the noindex directive takes effect.

Does the X-Robots-Tag work for all search engines?

The X-Robots-Tag is supported by Google, Bing, and Yandex. Other search engines may or may not support it. For maximum compatibility, you should use both the X-Robots-Tag header and the meta robots tag on HTML pages. For non-HTML resources where meta robots is not possible, the X-Robots-Tag is your only option. Check the documentation of any specific search engine you want to target to verify support.

What happens if I accidentally noindex my entire site with X-Robots-Tag?

If you accidentally apply a global X-Robots-Tag: noindex to your entire site, search engines will begin deindexing your pages on their next crawl. The impact depends on how quickly and frequently your pages are crawled. For large, well-established sites, you may see pages dropping out of the index within days. The fix is to remove the errant header immediately and then request recrawling of your most important pages through Google Search Console. Full recovery can take weeks to months depending on the size of your site and how long the noindex was in place. This is why monitoring indexed page counts is so critical.

Can I use X-Robots-Tag to control AI crawlers and LLM bots?

The X-Robots-Tag supports user-agent-specific directives, which means you can target AI crawlers and LLM bots if they respect this header. For example, you could set X-Robots-Tag: GPTBot: noindex, nofollow to instruct OpenAI's GPTBot not to index or follow links on your page. However, whether AI crawlers actually honor the X-Robots-Tag depends on the bot. Google's AI training control, Google-Extended, is a robots.txt token rather than a separate crawler that fetches pages, so it is managed through robots.txt rules instead of response headers. Always check the documentation of specific AI crawlers to understand which directives they support, and use robots.txt as an additional layer of control.

Is X-Robots-Tag case-sensitive?

The HTTP header name "X-Robots-Tag" is case-insensitive per the HTTP specification, meaning "x-robots-tag," "X-ROBOTS-TAG," and "X-Robots-Tag" are all treated the same by search engines. The directive values (noindex, nofollow, etc.) are likewise treated case-insensitively by Google and Bing. Despite this flexibility, best practice is to use the standard casing X-Robots-Tag: noindex, nofollow for consistency and readability across your configuration files.

Should I use noindex or robots.txt Disallow to block pages from search?

For pages you want completely excluded from search results, use noindex (via X-Robots-Tag or meta robots) rather than robots.txt Disallow. Robots.txt prevents crawling but does not guarantee deindexing -- Google can still index a URL it has never crawled if there are enough external signals pointing to it (such as inbound links). The noindex directive, on the other hand, explicitly instructs search engines to remove the page from their index. The catch is that search engines must be able to crawl the page to see the noindex directive, so do not block crawling of pages that have noindex. The ideal approach for pages you want excluded from search is: allow crawling (no robots.txt Disallow), set noindex (via X-Robots-Tag or meta robots), and exclude from the XML sitemap.

Want a Complete SEO Audit?

X-Robots-Tag is just one of 19 SEO checks in our comprehensive audit. Get a full analysis of your website's technical SEO, on-page optimization, and more.

Run Full Website Audit