Google Exploit: Canonical Negative SEO by @martinibuster
A new negative SEO attack method has been discovered. What makes this exploit especially bad is that the attacker is virtually impossible to identify, and there is no way to recover while the attacking website remains unknown. Google has remained silent about this kind of negative SEO.
How the Attack was Uncovered
The cross site canonical attack was discovered by Bill Hartzer of Hartzer Consulting. A business approached him about a sudden drop in rankings. During the course of reviewing the backlinks, Bill discovered links to a strange site.
But the client had never linked to that site. Investigating it led him to the site carrying out the negative SEO.
If the attacking site hadn’t linked out to that third site, Bill would not have been able to identify it. It was thanks to the new index from SEO data mining company Majestic, which includes canonical data, that Bill was able to discover the attacking site.
How the Canonical Negative SEO Works
The attack works by copying the entire “head” section of the victim’s web page into the head section of the spam web page, including the canonical tag. The canonical tag tells Google that this spam page is the victim’s web page.
Google then presumably assigns all the content (and the negative spam scores) from the spam web page to the victim’s web page.
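To make the mechanism concrete, here is a minimal Python sketch of what a crawler would read from such a page. It is an illustration under stated assumptions, not a description of how Google or Majestic actually process pages: the URLs, the sample markup, and the helper names (CanonicalFinder, find_canonical, is_cross_site_canonical) are all hypothetical. It pulls the canonical URL out of a page and flags the case where that URL points at a different host than the page itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(page_url, html):
    """Return the absolute canonical URL declared by the page, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    # Resolve relative hrefs against the URL the page was fetched from.
    return urljoin(page_url, finder.canonical)


def is_cross_site_canonical(page_url, html):
    """True when the declared canonical lives on a different hostname.

    Assumption: hostnames are compared exactly, so "www.example.com" and
    "example.com" count as different sites in this sketch.
    """
    canonical = find_canonical(page_url, html)
    if canonical is None:
        return False
    return urlsplit(canonical).hostname != urlsplit(page_url).hostname


# Hypothetical spam page whose head section was copied from the victim.
SPAM_URL = "https://spam.example/page.html"
SPAM_HTML = """
<html><head>
<title>Victim Widgets</title>
<link rel="canonical" href="https://victim.example/widgets/">
</head><body>Unrelated spam content</body></html>
"""

print(find_canonical(SPAM_URL, SPAM_HTML))
print(is_cross_site_canonical(SPAM_URL, SPAM_HTML))
```

A check like this only helps once a suspect page is already in hand. It does nothing to find an unknown attacker, which is the hard part of the problem described in this article.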
How to Detect this Attack
I asked Bill if there was an alternative way to detect these attacks. He said he had tried a number of software tools, including Copyscape and many others, but so far only Majestic has been able to identify some of the attacking sites.
According to Bill Hartzer:
“I tried the source code search engine publicwww but it doesn’t show the data – only Majestic actually is showing the relationship, and that’s because the one doing the negative SEO linked out. In the other cases I’ve uncovered, though, the site is not linking out.
I know there are other sites that they’re doing this to… seen a few others.”
Is Google Doing Anything to Stop Cross Site Exploits?
Kristine Schachinger, who has recently identified a similar exploit, offered these observations:
“Usually the attack method and the results can be directly tracked back to each other. But this time the vector of the attack is not in the site being attacked, but in a weakness in Google’s algorithms.
The attack is based on Google “perceiving” the two sites as one. This transfers positive or negative variables between the attacker and victim sites.
The confusion persists for some time, meaning the attack has permanence beyond the lifecycle of the actual attack. This is a Google issue that doesn’t seem to be actively addressed by Google.”
What Can Google Do to Stop this Exploit?
This is clearly an exploit of how Google and Bing use the canonical tag. In practice, the canonical tag is not a directive. This means that, unlike the rules in a robots.txt file, search engines are not obligated to obey the canonical tag. The canonical tag is treated by search engines as a suggestion.
A possible solution may be for the search engines to update the canonical specification so that the tag can no longer be used to canonicalize across different domains. Ideally, cross-domain canonicalization is something that should be done through the Google Search Console instead.
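The restriction suggested above can be sketched as a simple rule. The following Python snippet is purely illustrative and does not describe any existing search engine behavior or Search Console feature: the function effective_canonical and the verified_groups parameter (sets of hostnames that one owner has confirmed as their own) are assumptions made for this example. Under this rule, a canonical tag pointing at another host is ignored unless both hosts belong to the same verified group.

```python
from urllib.parse import urlsplit


def effective_canonical(page_url, declared_canonical, verified_groups):
    """Return the URL a hypothetical indexer would treat as canonical.

    verified_groups is an assumed input: an iterable of sets of hostnames,
    where each set holds hosts that one owner has verified together.
    """
    page_host = urlsplit(page_url).hostname
    target_host = urlsplit(declared_canonical).hostname

    # A canonical pointing at the same host is always honored.
    if page_host == target_host:
        return declared_canonical

    # A cross-domain canonical is honored only between hosts that share
    # a verified ownership group.
    for group in verified_groups:
        if page_host in group and target_host in group:
            return declared_canonical

    # Otherwise the tag is ignored and the page stands on its own.
    return page_url


# Hypothetical ownership data for the example.
groups = [{"victim.example", "shop.victim.example"}]

# The spam page's claim to be the victim's page is rejected.
print(effective_canonical(
    "https://spam.example/page.html",
    "https://victim.example/widgets/",
    groups,
))

# The victim's own subdomain may still canonicalize to the main site.
print(effective_canonical(
    "https://shop.victim.example/widgets/",
    "https://victim.example/widgets/",
    groups,
))
```

The design choice here is to fail closed: when ownership of both hosts cannot be confirmed, the page keeps its own URL, so an unverified third party cannot attach its content or its spam signals to someone else's page.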
So far, Google has remained silent on how it intends to close this exploit in how it ranks and de-ranks web pages. If this exploit is real, it has the potential to disrupt Google’s search results in a major way.