If you have a development domain, or are using the site to test concepts, and don't want it harvested by the search engines, here's what you can do with your robots.txt file.
Code Example:
User-agent: *
Disallow: /
#This keeps all robots out (the setting I use now)
#Second example
User-agent: *
Disallow:
#An empty Disallow allows all user agents to harvest every file and folder.
#Third example
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /my_mail.html
#This blocks all bots from the files and folders inside
#the images and cgi-bin folders. The file my_mail.html will
#not be harvested either.
#Fourth example
User-agent: Anthill
Disallow: /pricelist/
#This will block the Anthill useragent from accessing your
#pricelist folder and all files in that directory. (Anthill
#is used to gather price information automatically from online
#stores, with support for international versions.)
#Fifth example
User-agent: linklooker
Disallow: /
#This is a new, unregistered bot, so who knows what
#the data it collects is used for.
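If you want to double-check how these rules behave before a crawler ever hits the site, Python's standard urllib.robotparser module can evaluate them for you. This is just a quick sketch: the domain example.com and the bot name SomeBot are placeholders, and the rules are copied from the third example above.
Code Example:
import urllib.robotparser

# Rules from the third example: block /images/, /cgi-bin/ and my_mail.html.
rules = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /my_mail.html
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The home page is still harvestable...
print(rp.can_fetch("SomeBot", "http://example.com/index.html"))       # True
# ...but the blocked folders and the blocked file are not.
print(rp.can_fetch("SomeBot", "http://example.com/images/logo.gif"))  # False
print(rp.can_fetch("SomeBot", "http://example.com/my_mail.html"))     # False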
------------------------------------------------------
If you are allowing some bots but not others, make sure you list the bots you allow first and put the blanket deny at the end.
Code Example:
#Allow googlebot, msnbot, and askjeeves to harvest all files and folders.
User-agent: googlebot
Disallow:
User-agent: msnbot
Disallow:
User-agent: askjeeves
Disallow:
#The rest of the bots and spiders are blocked
User-agent: *
Disallow: /
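The same trick works for the allow-some, deny-the-rest layout above: a crawler that matches one of the named groups gets in, while anything that only matches the wildcard group is turned away. Again, example.com is just a placeholder.
Code Example:
import urllib.robotparser

# The allow-some, deny-the-rest layout from the example above.
rules = """\
User-agent: googlebot
Disallow:

User-agent: msnbot
Disallow:

User-agent: askjeeves
Disallow:

User-agent: *
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# googlebot matches its own group, where the empty Disallow allows everything.
print(rp.can_fetch("googlebot", "http://example.com/page.html"))   # True
# linklooker only matches the wildcard group and is blocked site-wide.
print(rp.can_fetch("linklooker", "http://example.com/page.html"))  # False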