Analysis of the 'Canonical' attribute and best practices to avoid duplicate content.

Canonical Analysis Tutorial

The canonical element (rel="canonical") tells bots which URL is the preferred version of a piece of content that appears on multiple pages (with different slugs), so that duplicate content is avoided.

This tag matters on every website, but it becomes indispensable in e-commerce. Let’s see why.


Imagine you run an e-commerce site that sells top-brand sportswear, and the “running shoe” detail page can be reached from both the brand page and the “Running” category.

Assume two scenarios:
the visitor clicks on “sports shoe” from the “brand” page.
Product detail page from the Asics ‘brand’ page:

the visitor clicks on the product in the “Running” category.
Product detail page from the ‘Running’ category page:

In this case, the slug of the detail page also carries the category reference, so the site serves the same content under two different paths.

The canonical tag lets you define which of the two URLs the spider should consider when determining ranking.
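
To make the mechanism concrete, here is a minimal sketch of how a crawler might read the canonical link element from a page. It uses only Python's standard `html.parser`. The `example.com` URLs, the `CanonicalFinder` class and the `find_canonicals` helper are hypothetical names chosen for illustration; they are not taken from the tool described here.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, so split before testing.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return all canonical URLs declared as link elements in the markup."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical markup, assumed to be served at both the brand path and the
# category path of the same product.
PAGE_HTML = """
<html><head>
<title>Running shoe</title>
<link rel="canonical" href="https://example.com/asics/running-shoe">
</head><body>...</body></html>
"""

print(find_canonicals(PAGE_HTML))
# ['https://example.com/asics/running-shoe']
```

If both paths return this markup, both declare the same preferred URL, which is the signal the spider needs to treat them as one piece of content.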

Canonical Tab Filters

The data are available to you in the “canonicals” tab of the upper window and can be filtered by the following attributes:

  • Contains Canonicals: the page has a canonical URL set (via link element, HTTP header, or both). This could be a self-referential canonical URL, where the page URL is the same as the canonical URL, or it could be “canonicalized,” where the canonical URL is different from the page URL.
  • Self Referencing: the page has a canonical link that points to its own URL. A self-referencing canonical marks the page as the primary version of that content (see the e-commerce sports shoe example above); any other URL path with the same content should therefore carry a canonical pointing to it, that is, be “Canonicalised”.
  • Canonicalised: the page has a canonical URL other than itself, so the URL in question is “canonicalized” to another location. This asks search engines not to index the page itself and to consolidate ranking on the canonical version; search engines treat it as a strong hint rather than a binding directive.
  • Missing: there is no canonical URL present either as a link element or by HTTP header. If a page does not point to a canonical URL, Google identifies what it believes to be the best version or URL. This can lead to unpredictable ranking, and therefore generally all URLs should specify a canonical (self-referencing) version.
  • Multiple: there are multiple canonicals set for a single URL (multiple link elements, multiple HTTP headers, or both combined). This can lead to unpredictability, since there should only be a single canonical URL set by a single implementation (link element or HTTP header) for each page.
  • Non-Indexable Canonical: the canonical URL points to a non-indexable page. This includes canonicals that are blocked by robots.txt, return no response, redirect (3XX), return a client error (4XX) or server error (5XX), or carry a ‘noindex’ directive.
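
The first five filters can be sketched as a small classification function. This is only an illustration of the definitions above, not the tool's actual logic. The function name `classify_canonical` and the `example.com` URLs are hypothetical, URLs are compared as plain strings with no normalization, and it is assumed that one URL can match several filters at once. “Non-Indexable Canonical” is left out because it needs crawl data about the target page.

```python
def classify_canonical(page_url, canonical_urls):
    """Return the filter names that apply to one crawled URL.

    canonical_urls holds every canonical found for the page, whether it came
    from a link element or from an HTTP header.
    """
    if not canonical_urls:
        return ["Missing"]

    labels = ["Contains Canonicals"]
    # Assumption: exact string equality decides whether a canonical is
    # self-referencing; a real crawler would likely normalize URLs first.
    if any(url == page_url for url in canonical_urls):
        labels.append("Self Referencing")
    if any(url != page_url for url in canonical_urls):
        labels.append("Canonicalised")
    if len(canonical_urls) > 1:
        labels.append("Multiple")
    return labels


# Hypothetical URLs for the running shoe example.
BRAND_URL = "https://example.com/asics/running-shoe"
CATEGORY_URL = "https://example.com/running/running-shoe"

print(classify_canonical(BRAND_URL, [BRAND_URL]))
# ['Contains Canonicals', 'Self Referencing']
print(classify_canonical(CATEGORY_URL, [BRAND_URL]))
# ['Contains Canonicals', 'Canonicalised']
print(classify_canonical(CATEGORY_URL, []))
# ['Missing']
```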

Canonical versions of URLs should always be indexable pages that return a ‘200’ response. Canonicals that point to non-indexable pages should therefore be corrected to point to indexable versions.
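
The indexability rule can be expressed as a short check on the page a canonical points to. The sketch below assumes the crawl data for the target is already available; the function name `non_indexable_reason` and its parameters are hypothetical, and the order of the checks is an arbitrary choice.

```python
def non_indexable_reason(status_code, blocked_by_robots=False, noindex=False):
    """Return why a canonical target is non-indexable, or None if it is fine.

    status_code is the HTTP status of the target; None models "no response".
    """
    if blocked_by_robots:
        return "blocked by robots.txt"
    if status_code is None:
        return "no response"
    if 300 <= status_code < 400:
        return "redirect (3XX)"
    if 400 <= status_code < 500:
        return "client error (4XX)"
    if 500 <= status_code < 600:
        return "server error (5XX)"
    if noindex:
        return "noindex"
    if status_code != 200:
        # Anything other than a plain 200 is treated as not indexable here.
        return f"status {status_code}"
    return None


print(non_indexable_reason(200))                # None
print(non_indexable_reason(301))                # redirect (3XX)
print(non_indexable_reason(200, noindex=True))  # noindex
```

A canonical whose target yields anything other than `None` would fall under the “Non-Indexable Canonical” filter and should be pointed at an indexable URL instead.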

The ‘Occurrences’ column counts the number of rel="canonical" elements discovered for each individual URL.
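
As a rough illustration of such a count, the sketch below extracts canonical targets from an HTTP `Link` header value and adds them to the canonicals already found in the markup. Whether the column counts header entries together with link elements is an assumption made here, and the helper names and `example.com` URLs are hypothetical. The regular expressions cover common header shapes only, not every case the HTTP specification allows.

```python
import re

# One link-value: <target> followed by its parameters, up to the next "<".
LINK_VALUE = re.compile(r"<([^>]+)>([^<]*)")
# The rel parameter, quoted or unquoted.
REL_PARAM = re.compile(r'\brel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def canonicals_from_link_header(header_value):
    """Return the targets of all rel="canonical" entries in a Link header."""
    targets = []
    for target, params in LINK_VALUE.findall(header_value or ""):
        rel = REL_PARAM.search(params)
        if rel and "canonical" in rel.group(1).lower().split():
            targets.append(target.strip())
    return targets


def count_occurrences(html_canonicals, link_header=None):
    """Total canonicals for one URL: link elements plus header entries."""
    return len(html_canonicals) + len(canonicals_from_link_header(link_header))


# Hypothetical header value with one canonical and one unrelated entry.
LINK_HEADER = (
    '<https://example.com/asics/running-shoe>; rel="canonical", '
    '<https://example.com/fr/running-shoe>; rel="alternate"; hreflang="fr"'
)

print(canonicals_from_link_header(LINK_HEADER))
# ['https://example.com/asics/running-shoe']
print(count_occurrences(["https://example.com/asics/running-shoe"], LINK_HEADER))
# 2
```

A count above 1 corresponds to the “Multiple” filter.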

From the sidebar you can check in real time all URLs that have critical canonical management issues. In the image below, you can see 1 URL that is “canonicalized” and 1 URL that has a “Non-Indexable Canonical.”

