Add the ability to set X-Robots-Tag where specific query strings are added #88

@GeekInTheNorth

Background

As a website owner, I want URLs with specific query strings to be omitted from search engine crawls, so that they do not burn through my crawl budget.

For example, filtered URLs like this one produce many crawlable variants of a single listing page, which wastes crawl budget:
/listing-page/?category=123&page=1

URLs like this one should still be crawled, because they give unfiltered, paginated access to content:
/listing-page/?page=1

As a result, a nofollow / noindex robots instruction needs to be applied conditionally when a page is requested with a query string name such as "category".
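
The check described above could be sketched roughly as follows. This is an illustration only: the function name has_excluded_query_name and the case-insensitive comparison are assumptions, not part of the existing code base.

```python
from urllib.parse import parse_qsl, urlsplit


def has_excluded_query_name(url: str, excluded_names: set[str]) -> bool:
    """Return True when any query string name in `url` is on the exclusion list."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?category=" (no value) still counts as a match.
    names = {name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)}
    # Assumption: names are compared case-insensitively.
    return not names.isdisjoint(name.lower() for name in excluded_names)


excluded = {"category"}
print(has_excluded_query_name("/listing-page/?category=123&page=1", excluded))  # True
print(has_excluded_query_name("/listing-page/?page=1", excluded))  # False
```

Only the parameter names are inspected, so the match does not depend on the order of the parameters or on their values.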

Acceptance Criteria

Given the URL has a query string parameter whose name is on my list of excluded query string names
When I visit the page
Then the robots meta tag should be set as noindex / nofollow
And the robots header should be set as noindex / nofollow

Given the URL has no query string parameters whose names are on my list of excluded query string names
When I visit the page
Then the robots meta tag should be unaltered
And the robots header should be unaltered
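
The two scenarios above could be expressed as a small sketch like the one below. The names robots_for_request and render are hypothetical, and the fixed "noindex, nofollow" value is assumed here for simplicity.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlsplit


def robots_for_request(url: str, excluded_names: set[str], current: Optional[str]) -> Optional[str]:
    """Return the robots value for a request, or `current` when nothing qualifies."""
    names = {name for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)}
    if names & excluded_names:
        return "noindex, nofollow"
    # No qualifying query string name: leave the existing value unaltered.
    return current


def render(url: str, excluded_names: set[str], current: Optional[str] = None):
    """Build the meta tag and response headers from the same resolved value."""
    value = robots_for_request(url, excluded_names, current)
    meta = f'<meta name="robots" content="{value}">' if value else ""
    headers = {"X-Robots-Tag": value} if value else {}
    return meta, headers
```

Resolving the value once and using it for both outputs keeps the meta tag and the header consistent with each other.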

Given I am rendering an <a href="..."> element
And the URL contains a restricted query string name
When the link is rendered
Then it should have a rel="nofollow" attribute
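
A minimal sketch of the link scenario, assuming any existing rel tokens on the link should be preserved. The function name rel_for_link is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def rel_for_link(href: str, excluded_names: set[str], existing_rel: str = "") -> str:
    """Return the rel attribute value for a link, adding nofollow when the href qualifies."""
    tokens = existing_rel.split()
    names = {name for name, _ in parse_qsl(urlsplit(href).query, keep_blank_values=True)}
    already_present = any(token.lower() == "nofollow" for token in tokens)
    if names & excluded_names and not already_present:
        # Append rather than replace, so tokens such as "noopener" survive.
        tokens.append("nofollow")
    return " ".join(tokens)


print(rel_for_link("/listing-page/?category=123", {"category"}))  # nofollow
print(rel_for_link("/listing-page/?category=123", {"category"}, "noopener"))  # noopener nofollow
```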

Implementation Notes

  • Add a new screen for managing "Query Robots"
    • "Is Enabled": Checkbox for enabling / disabling this functionality
    • "Query String Names": Extensible list of edit boxes which are validated for the structure of a query string name
      • Must contain at least 1 entry
    • "Alter robots for qualifying requests": Checkbox to indicate that the robots meta tag and header should be altered for a page when the request contains a qualifying query string name
    • "Robots Instructions": Checkboxes to allow a user to select "noindex" and "nofollow"
    • "Add no follow to qualifying links": Checkbox to indicate that rel="nofollow" should be added to links whose URL contains a qualifying query string name
  • Store this in the DDS with a repository / service pattern that uses caching to reduce trips to the database.
  • Update the MetaRobotsTagHelper so that:
    • (Existing) If environment robots is set, return the environment robots value
    • (New) If query string based robots is enabled and the request contains a qualifying query string name, then return the "Robots Instructions"
    • (Existing) If the robots tag has been defined without a value, then remove the tag
  • Update the RobotsHeaderMiddleware so that:
    • (Existing) If environment robots is set, set the x-robots-tag header to match the environment robots value
    • (New) If query string based robots is enabled and the request contains a qualifying query string name, set the x-robots-tag header to the "Robots Instructions"

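The settings and the order of precedence in the notes above could be sketched as follows. Everything here is an assumption for illustration: the class QueryRobotsSettings, the function resolve_robots and the pattern used to validate a query string name are hypothetical, and storage and caching are left out.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qsl, urlsplit

# Assumed rule for a valid query string name: letters, digits, "_", "-" and ".".
NAME_PATTERN = re.compile(r"[A-Za-z0-9_.\-]+")


@dataclass(frozen=True)
class QueryRobotsSettings:
    is_enabled: bool = False
    query_string_names: tuple[str, ...] = ()
    alter_robots: bool = True
    no_index: bool = True
    no_follow: bool = True
    add_no_follow_to_links: bool = False

    def validate(self) -> list[str]:
        """Return a list of validation errors (empty when the settings are valid)."""
        errors = []
        if not self.query_string_names:
            errors.append("At least one query string name is required.")
        for name in self.query_string_names:
            if not NAME_PATTERN.fullmatch(name):
                errors.append(f"Invalid query string name: {name!r}")
        return errors

    def instructions(self) -> str:
        """Build the robots value from the selected checkboxes."""
        parts = []
        if self.no_index:
            parts.append("noindex")
        if self.no_follow:
            parts.append("nofollow")
        return ", ".join(parts)


def resolve_robots(environment_robots: Optional[str], settings: QueryRobotsSettings,
                   url: str, current: Optional[str]) -> Optional[str]:
    """Return the robots value to emit, or None when the tag/header should be omitted."""
    # (Existing) Environment robots takes precedence over everything else.
    if environment_robots:
        return environment_robots
    # (New) Query string based robots applies only when enabled and the request qualifies.
    if settings.is_enabled and settings.alter_robots:
        names = {name for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)}
        if names & set(settings.query_string_names) and settings.instructions():
            return settings.instructions()
    # (Existing) A robots tag defined without a value is removed.
    return current or None
```

The same resolve_robots result could feed both the tag helper and the middleware, which would keep the meta tag and the x-robots-tag header in step.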
Metadata

Labels: enhancement (New feature or request)
