Working with Meta Robots and Robots.txt


Robots go by many generic names: web wanderers, web crawlers, web ants, and web spiders.  They play a very important role in the world of Search Engine Optimization.  Web robots are often mistaken for viruses because the software moves between sites, but in truth, robots merely visit sites by requesting documents from them.  The majority of robots are well designed, professionally controlled, and provide a constructive service for improved web solutions.

Meta robots

The meta robots tag (for example, `<meta name="robots" content="noindex">` placed in a page's head) is one of the best-known techniques for preventing search engines from including certain pages in search results.  Website owners use the /robots.txt file to give site-wide instructions to web robots; this is called the Robots Exclusion Protocol.  With it, a site's creator can control which parts of the site search engines should index.  Robots.txt files have been used effectively in the industry for many years.  A common setup lets robots crawl everything except sensitive areas such as admin files, payment pages, and the cgi folder; it is safer to create a robots.txt file for these so that search engines do not index that information.  Place the file in the root directory.  A robots.txt file can be created in any text editor.  It must be an ASCII text file, not an HTML file, and its filename should be lowercase.
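As a sketch of that common setup, a minimal robots.txt along these lines would block the sensitive areas while leaving the rest of the site crawlable (the directory names `/admin/` and `/cgi-bin/` are illustrative; substitute your own paths):

```
User-Agent: *
Disallow: /admin/
Disallow: /cgi-bin/
```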

Robots can be used for a number of purposes, including HTML validation, link validation, indexing, mirroring, and “What’s New” monitoring.  A basic robots.txt file uses two rules:

1. User-Agent –

specifies which search engine robot the rule applies to.  A list of common bots can be found in the Web Robots Database.  An entry that applies to all bots looks like this:

User-Agent: *

2. Disallow –

lists the pages a website creator wants to block.  Each entry should begin with a forward slash (/) followed by the specific URL path.

To block the entire site, use a forward slash.
Disallow: /

To block a directory, follow the directory name with a forward slash.
Disallow: /private_directory/

To block a page, list the page.
Disallow: /private_file.html
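You can check how a crawler would interpret the rules above with Python's built-in `urllib.robotparser` module (a quick sketch; the rules and paths are the illustrative ones from this article):

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, as a list of robots.txt lines
rules = [
    "User-Agent: *",
    "Disallow: /private_directory/",
    "Disallow: /private_file.html",
]

parser = RobotFileParser()
parser.parse(rules)

# Blocked paths
print(parser.can_fetch("*", "/private_directory/page.html"))  # False
print(parser.can_fetch("*", "/private_file.html"))            # False
# Everything else remains crawlable
print(parser.can_fetch("*", "/index.html"))                   # True
```

This is a convenient way to test a robots.txt file locally before uploading it to the root directory.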

Plenty of resources and references can be found across the net.  Robots aren’t bad; they are, by and large, good.  Compared with the meta robots tag, robots.txt tends to be more effective and easier to implement, and it allows greater precision and control.
 
