The other day, I had a Facebook conversation with Paula Kiger about duplicate content.
She wanted to know if she could repost an article on a website, after it had run elsewhere.
The short answer? Yes.
The long answer is you can absolutely do that, but I would make sure you do two things:
- Add a line at the bottom that says, “This first appeared on WEBSITE” and link to the original article; and
- Add a canonical link (a rel=canonical tag in the page’s head) on the back end of your website or blog to signal to Google that the original article lives elsewhere and you aren’t stealing it.
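If you are curious what those two pieces look like in the page markup, here is a minimal sketch in Python. The function names and the example.com URL are my own placeholders, not something from a particular blogging platform, and most platforms or SEO plugins will write the canonical tag for you once you paste in the original URL.

```python
from html import escape


def canonical_tag(original_url: str) -> str:
    """Build the rel=canonical element that belongs in the page's <head>."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


def attribution_line(site_name: str, original_url: str) -> str:
    """Build the visible 'first appeared on' line for the bottom of the post."""
    return (
        "<p>This first appeared on "
        f'<a href="{escape(original_url, quote=True)}">{escape(site_name)}</a>.</p>'
    )


if __name__ == "__main__":
    # Hypothetical URL of the original article (an assumption for this sketch).
    url = "https://example.com/original-article"
    print(canonical_tag(url))
    print(attribution_line("Example Site", url))
```

Note that both pieces point at the original article itself, not at the other site's home page.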
What is Duplicate Content?
Let’s back up for a second.
What the heck is duplicate content?
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.
What this means is there very well could be pages on your website that are the same, most likely because your site is reachable over both http and https or at both the www and non-www versions of your URL.
For instance, here on this very blog, we have more than 4,000 pages that are considered duplicate.
This, however, isn’t a big deal because, on the back-end, we’ve used canonical links to tell Google which pages are the authority and which to ignore.
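To see why the http/https and www/non-www versions count as duplicates, here is a rough sketch that collapses the four variants of one address into a single preferred URL. The choice of https without www as the preferred form, the function name, and the example.com addresses are assumptions for illustration; your own site may prefer a different form.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str) -> str:
    """Map http/https and www/non-www variants onto one preferred form.

    Assumption for this sketch: the preferred form is https without www.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # drop the leading "www."
    path = parts.path or "/"
    # Keep the query string, drop any #fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


# Four addresses a visitor (or Google) could use to reach the same post.
variants = [
    "http://example.com/post",
    "https://example.com/post",
    "http://www.example.com/post",
    "https://www.example.com/post",
]
unique = {preferred_url(u) for u in variants}
print(unique)
```

All four variants end up as one address, which is the same idea a canonical link expresses to Google: many URLs, one page that counts.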
Don’t Worry Too Much About It
There is a big misconception that Google will penalize your site if you have duplicate content.
This just isn’t true.
The only way your site would be penalized for duplicate content is if you are stealing or scraping someone else’s work.
What happens, instead, is Google filters the pages, decides which one has more authority, and shows that in search results.
So if, for example, you had two product pages on your website that had the same copy but different images, Google would decide which one it considers more important.
To take chance out of it, you can put a canonical link on the second page, the one you deem less important. That link would point to the first page.
Or you can just leave it up to chance.
If you don’t want to worry about sorting through duplication on your site, you can let us worry about it instead.
Six Facts About Duplicate Content
Here is what else we know about duplicate content:
- Duplicate content doesn’t cause your site to be penalized, at least not in how we’ve been led to believe. Sure, some of your website pages may not show up in search results, but that’s the worst that can happen (and it’s fixable!).
- Google wants searchers to have diversity in results, which is why the same article doesn’t show up 10 times on the first page of results. The algorithms consolidate the duplicate content pages and show only one version.
- Your site will not be penalized for duplicate content unless the intent is to manipulate search results or the content has been scraped from somewhere else (which is why the sentence at the end with a link is so important if you are repurposing from another site).
- You can tell Google which page(s) you want to appear in search results by adding canonical links to the duplicate pages that don’t matter. If, however, you leave it up to chance, Google may show a less desirable page in search results.
- Google does try to determine the original source of the content and display that in search results. But if you can help them out, do so.
- Just the other day, we found someone had scraped our content. Typically, we’d leave a comment and ask them to remove it, but this site didn’t have comments enabled. So we filed a DMCA request and Google had it removed.
What About Syndication?
You absolutely can have your content syndicated to other sites.
This is good! Do this with your industry trade publications.
But do know how it will affect your site’s search results…and theirs.
Here are four ways to handle syndication so Google doesn’t penalize either one of you for scraping content:
- Canonical link: The best solution is to have the site syndicating your content add that canonical link we talked about earlier. This signals to the search engines that your article is the original and is the version to show in search results.
- Direct attribution link: This is the advice I gave to Paula earlier this week because the site where she was going to repurpose the content doesn’t have the ability to easily (without a programmer) add a canonical link. That direct attribution link at the bottom of the page will work, but it’s definitely not the first choice. Make sure the link goes to the original article and not to the website’s home page.
- NoIndex: You certainly can ask them to place a “noindex” tag on the page so the search engines leave it out of their results entirely. It’s unlikely anyone would do this (I’d tell you to go pound sand) because most webmasters want Google to index their web pages.
- Just go for it. There may be times when you just say, “Screw it,” and let someone syndicate your content without any of these three things happening. You would do this if the site had so much authority (Huffington Post, New York Times, TechCrunch) that it made sense for your name, thoughts, and content to get in front of their readers. Of course, if they won’t do one of the three things listed above, and their domain authority is higher than yours, your original article may not show up in search results. That may make strategic sense every once in a while.
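If you want to check which of these four situations a syndicated copy is in, a small script can read the page’s HTML and look for each signal. This is only a sketch under my own assumptions: the class and function names are hypothetical, it uses Python’s built-in HTML parser, and it expects you to have already saved or downloaded the page’s HTML as text.

```python
from html.parser import HTMLParser


class SyndicationCheck(HTMLParser):
    """Collect the canonical URL, robots noindex flag, and outbound links."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.noindex = "noindex" in (a.get("content") or "").lower()
        elif tag == "a" and a.get("href"):
            self.links.append(a["href"])


def check_copy(html_text: str, original_url: str) -> str:
    """Return which option the syndicated page uses, in order of preference."""
    page = SyndicationCheck()
    page.feed(html_text)
    if page.canonical == original_url:
        return "canonical"
    if page.noindex:
        return "noindex"
    if original_url in page.links:  # must link to the article, not the home page
        return "attribution"
    return "none"


if __name__ == "__main__":
    # Hypothetical syndicated page and original URL (assumptions for this sketch).
    original = "https://example.com/original-article"
    sample = f'<html><head><link rel="canonical" href="{original}"></head></html>'
    print(check_copy(sample, original))
```

A result of "none" is the “just go for it” case: nothing on the page points search engines back to your original.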
Hopefully this clears up any confusion about duplicate content.
The TL;DR version of it is that your site will not be penalized unless you are stealing copy from someone else.
What other myths have you uncovered about duplicate content?
P.S. It’s Corina Manea’s birthday today. Join me in wishing her a very happy day!