Search robots (bots), also called spiders, are automated agents that discover what information a site contains and what it is about, and index that content for the search engine. A bot follows links, like a spider crawling through a web, to find new sites and pages and to refresh its index regularly.
This also means that submitting your site to search engines is largely unnecessary: a good linking strategy will lead the bots to your site regardless.
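To make the crawl-and-follow idea concrete, here is a toy sketch in Python. It is an illustration only (the start URL and page cap are placeholders), not how any real search bot is built:

from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    # Collects the href of every <a> tag the parser sees.
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=10):
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", errors="ignore")
        except Exception:
            continue  # unreachable page; a real bot would retry later
        parser = LinkExtractor()
        parser.feed(html)
        # Resolve relative links and queue them for the same treatment.
        queue.extend(urljoin(url, link) for link in parser.links)
        print("indexed:", url)

crawl("https://example.com")

A real bot also respects robots.txt, throttles its requests, and stores what it fetches; the follow-links-and-remember loop above is just the skeleton.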
Spiders or bots come on their own schedule, so specifying a time frame for their visits in the robots file is unnecessary; authority sites and higher-ranked sites simply tend to be visited more often. Search bots assume by default that every page is to be indexed and every link followed, unless told otherwise with the following tag in the head of your HTML:
<meta name="robots" content="noindex">
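The same tag accepts nofollow as well, either on its own or combined with noindex:

<meta name="robots" content="nofollow">          <!-- index the page, but don't follow its links -->
<meta name="robots" content="noindex, nofollow"> <!-- don't index and don't follow -->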
Remember, the search engine’s job is to deliver relevant and useful results to a searcher. The search bot’s ability to gather information plays a key role in determining the value of these results.
Currently, the bots don't do a very good job of establishing the value of sites built with JavaScript, dynamic links, Flash, frames, or excessive images.
The biggest and most important bots out there are:
Google Bot – Agent name: googlebot
Yahoo Slurp – Agent name: slurp
MSN Bot – Agent name: msnbot
A complete list of search engine bots is available here.
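The agent name is also how you address a specific bot in your robots.txt file. A small sketch, with hypothetical paths:

# Keep Googlebot out of one directory, block Yahoo Slurp entirely,
# and let every other bot crawl the whole site.
User-agent: googlebot
Disallow: /private/

User-agent: slurp
Disallow: /

User-agent: *
Disallow: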
With all the data collected during indexing, the giant database of information has to be sorted in a meaningful way. This is where search engine algorithms come in: they establish the value of each piece of information gathered during the indexing phase. The outcome of this processing is the list of results you see when you type your keyword query into the search box.
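As a loose illustration of that sorting step, here is a toy Python model that scores a tiny index by how often the query terms appear in each page. Real ranking algorithms weigh far more signals than raw term frequency; the page names and text here are made up:

# Toy ranking sketch: score indexed pages by query-term frequency.
index = {
    "seo-basics.html": "search bots crawl links and index pages",
    "recipes.html": "how to bake bread with simple ingredients",
    "link-building.html": "a good linking strategy helps bots find your site",
}

def rank(query):
    terms = query.lower().split()
    scores = {
        page: sum(text.split().count(t) for t in terms)
        for page, text in index.items()
    }
    # Highest-scoring pages come first, just like a results page.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for page, score in rank("index bots"):
    print(score, page)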