A Powerful Robots.txt Technique

Stop Duplicate Content

If you use a content management system like WordPress, Joomla, Drupal, or osCommerce, you may be suffering from duplicate content issues. These can be difficult problems to solve, sometimes requiring complicated .htaccess rewrite rules and endless hours finding links to nofollow. For most of these problems, though, there is an easier solution.

The trick here is to use your Google Webmaster Tools account and your robots.txt file.

1. Identify Duplicate Content Issues

One way to see whether your site has duplicate content issues is to log in to your Google Webmaster Tools account. Look under Diagnostics -> HTML Suggestions and check whether you have duplicate titles and meta descriptions. If you do, you may simply be repeating the same title tag and description on several pages of your site. However, it may also indicate that your website is creating duplicate pages: if your site replicates the same page over and over, the titles and meta descriptions will naturally be the same. In either case, you will want to correct the duplication.
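If you have the titles and meta descriptions of your pages in hand (for example from a crawl export), the same check can be sketched in a few lines of Python. This is only an illustration: the `find_duplicates` helper and the sample URLs, titles, and descriptions below are hypothetical and are not something Webmaster Tools provides.

```python
from collections import defaultdict


def find_duplicates(pages):
    """Group URLs that share the same title and meta description.

    `pages` maps URL -> (title, meta description). Returns only the
    groups that contain more than one URL.
    """
    groups = defaultdict(list)
    for url, (title, description) in pages.items():
        # Normalise case and whitespace so trivial differences
        # do not hide a duplicate.
        key = (" ".join(title.lower().split()),
               " ".join(description.lower().split()))
        groups[key].append(url)
    return {key: sorted(urls) for key, urls in groups.items() if len(urls) > 1}


# Hypothetical sample data, for illustration only.
pages = {
    "/widgets": ("Blue Widgets", "Buy blue widgets."),
    "/widgets?page=2": ("Blue Widgets", "Buy blue widgets."),
    "/about": ("About Us", "Who we are."),
}

duplicates = find_duplicates(pages)
for (title, _description), urls in duplicates.items():
    print(title, urls)
```

Any group the sketch reports is a candidate for either rewriting the title and description or blocking the redundant URLs.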

Also look for sections of your website such as feeds, comment pages, and review pages. These generally create duplicate content, and you do not want the search engines crawling them.

2. How To Block Search Engines From Indexing Duplicate Content

Once you have identified the sections of your website that should be blocked from the search engines, create a robots.txt file. These files are extremely simple to create, but a single small mistake can be incredibly damaging. I’m not going to go into how to make a robots.txt file, because http://www.robotstxt.org does a great job of explaining it.
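To give a feel for how small the file is, and how easy the damaging mistake is to make, here is a minimal Python sketch that writes out the rules for a list of sections. The `build_robots_txt` helper and the `/feed/`, `/comments/`, and `/reviews/` paths are assumptions made for the example; use the paths your own site actually exposes.

```python
def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text that disallows each path for `user_agent`."""
    lines = [f"User-agent: {user_agent}"]
    for path in disallowed_paths:
        # "Disallow: /" on its own blocks the entire site, which is the
        # kind of one-character mistake worth guarding against.
        if path == "/":
            raise ValueError("refusing to disallow the whole site")
        if not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path!r}")
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical sections, for illustration only.
robots_txt = build_robots_txt(["/feed/", "/comments/", "/reviews/"])
print(robots_txt)
```

The guard against a bare `/` is a design choice for this sketch, not a rule of the robots.txt format itself.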

3. How To Attack Difficult Sections Of Your Website

A lot of content management systems like to use URLs that have question marks and equals signs in them. Luckily, Google allows you to block strings within URLs by using the * wildcard. Here is an example:

User-agent: Googlebot
Disallow: /*?page=

This statement will block Googlebot from any URL that has the string ?page= in it. This is handy because most robots.txt examples only show how to block individual webpages and directories, not URLs that contain long, complicated strings. We just had a case where this little technique came in very handy!
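Before uploading a rule like this, it can help to check which of your URLs it would catch. The Python sketch below is a simplified approximation of Google-style wildcard matching (`*` matches any run of characters, and a trailing `$` anchors the end of the URL). It is not Google's actual implementation, and the `is_blocked` name and the test URLs are made up for the example.

```python
import re


def is_blocked(pattern, url_path):
    """Return True if `url_path` matches a Disallow `pattern`.

    Simplified approximation: the pattern is matched from the start of
    the path, "*" matches anything, and a trailing "$" anchors the end.
    """
    if not pattern:
        # An empty Disallow line blocks nothing.
        return False
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then join them with ".*" for each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, url_path) is not None


# Hypothetical URLs, for illustration only.
for path in ["/blog/?page=2", "/blog/", "/index.php?id=3&page=2"]:
    print(path, is_blocked("/*?page=", path))
```

Note that the last URL is not caught, because it contains &page= rather than the literal string ?page=. If your CMS puts the parameter in different positions, you may need a second rule for that form.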

Here are some good resources on robots.txt:
Robots.txt info for WordPress Websites.
Incredibly insightful post about robots.txt on SEOBook.com


One response to “A Powerful Robots.txt Technique”

admin says:

March 2, 2010 at 2:36 pm

Yeah link exchanges are bad. You can get banned from the search engines for that.
