Google’s John Mueller on 404ing or using Rel Canonical on URL parameters

There’s an interesting answer from John Mueller of Google on what to do with URLs that appear to be duplicates because of URL parameters, like UTMs, at the end of the URL. John said there is absolutely no need to 404 those URLs, which no one would argue with. He also said you can use rel=canonical, because that is what it was designed for. The kicker is that he said it probably doesn’t matter for SEO anyway.

Now, I had to read John’s answer on Reddit several times, and maybe I misinterpreted the last part, so help me out here.

Here is the question:

Good morning! New to the community, but I’ve been in SEO for about five years. I started a new job as the only SEO manager and I’m thinking about crawl budget. There are about 20,000 crawled-but-unindexed URLs compared to the 2,000 that are crawled and indexed. This is not due to an error, but to the high number of UTM/campaign-specific URLs and (intentional) 404 pages.

I was hoping to balance that crawl budget a bit by blocking the UTM/campaign URLs from crawling via robots.txt and turning some of the 404s into 410s (this would also help overall site health).

Can anyone help me determine whether this might be a good idea or could potentially cause harm?
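For reference, the kind of robots.txt rule the question seems to describe would look something like the sketch below (the site’s actual paths and parameters are not in the thread; Google does support the `*` wildcard in robots.txt paths). As John notes further down, he advises rel=canonical over this kind of blocking:

```
User-agent: *
# Block crawling of any URL whose query string carries UTM tracking parameters
Disallow: /*?*utm_
```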

John’s 404 response:

Pages that don’t exist should return 404s. You gain nothing in terms of SEO by making them 410s. The only reason I’ve heard that I can follow is that it’s easier to recognize accidental 404s when intentionally removed pages return 410s. (IMO, if your important pages accidentally return 404s, you’ll probably notice it quickly regardless of the result code.)

John’s canonical response:

For UTM parameters I would just set the rel-canonical and leave them alone. Neither rel-canonical nor robots.txt will make them all go away, but rel-canonical is a cleaner approach than blocking (it’s basically what rel-canonical was designed for).

OK, so no 404s in this situation, but use rel=canonical – got it.
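To illustrate the approach John describes (a generic sketch, not anything from his answer): a page served at a UTM-tagged URL can point back to its clean URL with a rel=canonical tag in the `<head>`, and the clean URL can be derived by stripping the standard `utm_*` tracking parameters:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# The standard Google Analytics tracking parameters to strip
# when deriving the canonical URL.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                   "utm_term", "utm_content"}

def canonical_url(url: str) -> str:
    """Return the URL with tracking parameters removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag to emit in the page <head>."""
    return f'<link rel="canonical" href="{canonical_url(url)}">'

print(canonical_tag("https://example.com/page?utm_source=news&id=7"))
# → <link rel="canonical" href="https://example.com/page?id=7">
```

Note that non-tracking parameters (like `id` above) survive, so genuinely distinct pages keep distinct canonicals.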

John then explained that it probably doesn’t matter for SEO anyway:

For both, I suspect you wouldn’t see any visible changes for your site in search (sorry, tech-SEO aficionados). Rel-canonical on UTM URLs is certainly a cleaner solution than letting them accumulate and flourish on their own. Fixing this early means you won’t get 10 generations of SEOs telling you about the “duplicate content problem” (which isn’t a problem anyway if they’re not indexed; and when they are indexed, they get dropped as duplicates anyway), so I guess it’s a good investment in your future use of time 🙂

So Google will probably handle duplicate URLs with UTM parameters anyway, even if it indexes them. But to placate SEO consultants, use rel=canonical? Is that what he is saying here? I like this answer, if that is its message – but maybe I am wrong?

Forum discussion at Reddit.
