DIRECTIVES TAB.

Analysis of the meta robots tag and the X-Robots-Tag in the HTTP header.


The Directives tab shows data about the meta robots tag and the X-Robots-Tag in the HTTP header. These robots directives control how your content and URLs are displayed in Search Engines such as Google.

To understand why these directives matter, consider two scenarios that a Search Engine bot might run into:

  1. The robots.txt file is missing or allows the URL to be crawled. Search Engines then consider the resource crawlable and indexable, unless more specific directives are defined through meta robots.
  2. The robots.txt file disallows the URL, so it cannot be crawled. In this case the meta robots directives are never detected and their instructions are not applied.
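
The two scenarios can be sketched with Python's standard `urllib.robotparser` module. This is only an illustration: the robots.txt content, the `ExampleBot` user agent, the example.com URLs and the helper name `can_read_meta_robots` are all assumptions, and the code does not describe how the Seo Spider itself works.

```python
from urllib.robotparser import RobotFileParser


def can_read_meta_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the crawler may fetch `url` and can therefore see its meta robots."""
    parser = RobotFileParser()
    # parse() expects the robots.txt content as a list of lines
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that disallows everything under /private/
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Scenario 1: crawling is allowed, so meta robots on the page can be read
print(can_read_meta_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/page.html"))

# Scenario 2: the URL is disallowed, so the page is never fetched
# and any meta robots it contains go unnoticed
print(can_read_meta_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/private/page.html"))
```

An empty robots.txt string behaves like a missing file in this sketch: every URL is treated as crawlable.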

The meta robots tag is placed in the head of an HTML page, while the X-Robots-Tag is sent in the HTTP response header. Multiple directives can be combined in either one, for example content="noindex,nofollow".
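
As a rough illustration of where the two kinds of directive live, the sketch below reads them from an HTML document and from a dictionary of response headers using only the Python standard library. The class and function names, the sample HTML and the sample headers are assumptions made for the example. It only looks at `name="robots"` (not engine-specific names) and does not check that the tag sits inside the head.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.instances = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").strip().lower() == "robots":
            self.instances.append(attributes.get("content", ""))


def collect_directives(html: str, headers: dict) -> dict:
    """Return the meta robots and X-Robots-Tag values found for one page."""
    parser = MetaRobotsParser()
    parser.feed(html)
    # Header names are case-insensitive, so compare in lower case
    x_robots = [value for name, value in headers.items()
                if name.lower() == "x-robots-tag"]
    return {"meta_robots": parser.instances, "x_robots_tag": x_robots}


# Hypothetical page and response headers
HTML = '<html><head><meta name="robots" content="noindex,nofollow"></head><body></body></html>'
HEADERS = {"Content-Type": "text/html", "X-Robots-Tag": "noarchive"}

result = collect_directives(HTML, HEADERS)
print(result)
```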

The tab has three columns: the first shows the URL (Address), and the other two show Meta Robots and X-Robots-Tag respectively. For each of these, the Seo Spider collects every instance it discovers.

The filters available to you are as follows:

  • Index: identifies pages that carry the “index” directive. It is not required, because Search Engines index URLs by default even without it.
  • Noindex: tells Search Engines not to index the page. This directive does not block crawling by the Spider! It can be set for all Search Engines or for a specific one.
  • Follow: instructs the Spider to follow all links on the page while crawling. If it is absent, Search Engines follow links by default.
  • Nofollow: a ‘hint’ that tells Search Engines not to follow any links on the page when crawling. It is commonly applied on staging sites together with “noindex” to keep Search Engines away before publication.
    To crawl pages that carry this directive, enable the ‘Follow Internal Nofollow’ option from the “Config > Spider” menu.
    Remember: with this directive, the links on the page do not pass PageRank.
  • None: the “none” value does not mean that there are no directives. It is in fact very restrictive and is equivalent to combining “noindex” and “nofollow”.
  • NoArchive: tells Google not to show the cached copy of the page in search results. By default, the cached copy of a resource is available in Chrome using the “cache:www.dominio.xxx” command.
  • NoSnippet: prevents the Search Engine (currently only Google) from showing the “Meta Description” and the link to the cached version in search results.
  • Max-Snippet: limits the text snippet for this page to at most [number] characters in Google. Special values are 0 for no snippet and -1 to allow any snippet length.
  • Max-Image-Preview: sets the maximum size of the image preview for an HTML page in Google. Accepted values are:
    • “none”: no image preview is shown;
    • “standard”: a default image preview may be shown;
    • “large”: a larger image preview may be shown, at most as wide as the visible area.
  • Max-Video-Preview: instructs Google to use at most [number] seconds of a video on the page as a snippet in search results. You can also specify 0 to allow only a static image, or -1 to allow any preview length.
  • NoODP: this now-defunct meta tag told Google not to use the Open Directory Project (DMOZ) for its snippets. It can safely be removed.
  • NoYDIR: this old meta tag told Google not to use the Yahoo Directory for its snippets. It can also be removed.
  • NoImageIndex: tells Google not to index the images on an HTML page. If images on a URL that carries “noimageindex” are also linked from other pages, they may still be indexed. One way to avoid this is to set the X-Robots-Tag in the HTTP header of the image file itself.
  • NoTranslate: tells Google not to offer a translation of the page in search results. If it is absent, the Search Engine may show a link next to the result in the SERP that opens the translated page.
  • Unavailable_After: lets you specify the exact date and time after which Google should stop showing the page in search results. It is very useful for pages tied to events, fairs, limited-time promotions and so on.
  • Refresh: redirects the user to a new URL after a certain period of time.
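
To make the filters above more concrete, here is a minimal sketch of how a combined directive string could be split into individual directives, including the expansion of “none” into “noindex” and “nofollow”. The function name `parse_directives` is hypothetical and this is not the Seo Spider's own parsing logic. The sketch splits on commas and lower-cases everything, so an Unavailable_After date that itself contains a comma would not survive intact.

```python
def parse_directives(content: str) -> dict:
    """Split a robots directive string into {name: value} pairs.

    Directives without a value (for example "noindex") map to None.
    """
    directives = {}
    for part in content.split(","):
        part = part.strip().lower()
        if not part:
            continue
        # Only the first colon separates name and value,
        # so values such as timestamps keep their own colons
        name, _, value = part.partition(":")
        directives[name.strip()] = value.strip() or None
    # "none" is shorthand for noindex plus nofollow
    if "none" in directives:
        directives.setdefault("noindex", None)
        directives.setdefault("nofollow", None)
    return directives


parsed = parse_directives("none, max-snippet:50, max-image-preview:large")
print(parsed)
print("noindex" in parsed)
```

Keeping the value as a string leaves the interpretation of special values such as 0 and -1 to the caller.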
