How Search Engines (Say They) Work
By Danny Sullivan, April 17, 1996
Forget the usual debate about which search engine provides the
"most relevant" results. Yes, some engines do seem to
provide better results than others. However, all of them will
provide at least some relevant results to a query. In fact, usually
search engines produce so many relevant results that it is difficult
to understand why a page ranked first did better than another
page ranked 20th. This is the key question for the
webmaster: why are some pages making it to the top of the list
while others aren't?
As part of my study
of search engines, I first consulted the help files of each search
engine to see how they explained page rankings. Some help files
were more helpful than others, but none were of any great assistance.
For example, several of the search engines provided pages that
were "scored" identically, yet some came first, while
others with the same scores were farther down the list. Despite
having the same scores, pages were ranked against each other in
some way.
I examined top-ranked pages and compared them to those farther down
the list. I attempted to count total words and keywords in
each document, but this was a massive job and one that quickly
proved unnecessary. It was easy to see that there was little
difference between the pages. A top-ranked page might have just
as many keyword references as a lower-ranked page, so some method
not explained in the help files, or perhaps impossible to explain,
was at work. There certainly was nothing in any of the top-ranked
documents that stood out as the reason they were picked first,
although there were some helpful tips that did stand out and are
summarized in the study's conclusion.
Below is a summary of the official relevancy rules drawn from
the various help files. There are also comments from PC World's
January 1996 issue on search engines and Internet World's
"Search engine showdown: IW Labs tests seven Internet search tools" article from May 1996.
NOTE FROM APRIL 2026: If it sounds weird that a page written in April 1996 is referencing a magazine article from the next month, that's how print magazines worked (and I guess might still work). They'd arrive earlier than the published month date.
AltaVista
AltaVista says that words and phrases are used to determine
rankings, with documents getting a higher rank if (1) keywords
are in the first few words of the document, (2) keywords are found
close to one another in the document, and (3) the document contains
more of the query words than some other document.
- Help page with explanation of scoring
http://www.altavista.digital.com/cgi-bin/query?pg=h&what=web
- Add URL page
http://www.altavista.digital.com/cgi-bin/query?pg=addurl
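To make the three stated factors concrete, here is a minimal Python sketch. It is only an illustration and not AltaVista's actual method: the help page gives no formula, so the function name `altavista_style_score`, the `early_window` cutoff of 10 words, and the choice to compare the three signals as a tuple are all assumptions.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def altavista_style_score(query, document, early_window=10):
    """Return (terms_matched, early_hits, closeness) for sorting.

    Hypothetical illustration only: the window size and the order in
    which the three signals are compared are assumptions.
    """
    terms = set(tokenize(query))
    words = tokenize(document)
    positions = [i for i, w in enumerate(words) if w in terms]
    if not positions:
        return (0, 0, 0.0)

    # Factor 3: how many of the query words the document contains.
    matched = len({words[i] for i in positions})
    # Factor 1: keywords found within the first few words.
    early_hits = sum(1 for i in positions if i < early_window)
    # Factor 2: keywords close to one another. Take the smallest gap
    # between two different query words; a smaller gap scores higher.
    gaps = [
        j - i
        for i in positions
        for j in positions
        if i < j and words[i] != words[j]
    ]
    closeness = 1.0 / min(gaps) if gaps else 0.0
    return (matched, early_hits, closeness)

docs = {
    "a": "Search engines rank pages. This page explains search engines.",
    "b": "This long introduction eventually mentions a search tool and, much later, engines.",
}
ranked = sorted(
    docs,
    key=lambda name: altavista_style_score("search engines", docs[name]),
    reverse=True,
)
print(ranked)
```

Both sample documents contain both query words, so in this sketch page "a" comes first only because its keywords appear earlier and sit next to each other.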
Excite
Excite offers no explicit instructions on improving rankings
other than to suggest having descriptive text in complete sentences.
"Concepts" are created from this text.
- Help page
http://www.excite.com/FAQ.html
What others say:
- Internet World: Excite apparently constantly checks
sites that are known to be popular based on the number of links
pointed toward them. It also checks various What's New pages to
find new sites for its catalog.
InfoSeek
InfoSeek gives detailed instructions on how to use meta
information to create custom descriptions, but its help files
shed little light on how to improve a site's ratings. All that
can be gleaned is that scores are partly determined by the number
of times that a word or phrase appears on the page. InfoSeek also
warns that using a keyword more than seven times in a meta description
will cause the description to be ignored. Key help file URLs:
- Add URL page
http://guide.infoseek.com/AddUrl?sv=IS&lk=noframes&pg=DCaddurl.html
- Help page with explanation of scoring
http://guide.infoseek.com/Help?pg=ScoreHelp.html&sv=IS&lk=noframes
- Help page
http://guide.infoseek.com/Help?pg=DChelp.html&sv=IS&lk=noframes
- Feedback page
http://guide.infoseek.com/Comments?pg=DCcomments.html&sv=IS&lk=noframes
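InfoSeek's warning about repeating a keyword more than seven times in a meta description can be checked before a page is submitted. The sketch below is a rough guess at such a check, not InfoSeek's own logic: the function name `overused_words` is hypothetical, and counting whole words without regard to case is an assumption, since the help files do not say how repetitions are counted.

```python
import re
from collections import Counter

def overused_words(meta_description, limit=7):
    """Return the words that appear more than `limit` times.

    Assumption: words are compared case-insensitively and every word
    is treated as a potential keyword.
    """
    words = re.findall(r"[a-z0-9']+", meta_description.lower())
    counts = Counter(words)
    return sorted(word for word, n in counts.items() if n > limit)

ok = "Travel guides, travel tips and travel deals for budget travel."
stuffed = "travel " * 8 + "guides"

print(overused_words(ok))
print(overused_words(stuffed))
```

An empty list suggests the description stays under the stated limit; any word returned is a candidate for trimming.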
What others say:
- PC World: Scores also depend on how many times keywords
appear on a page in relation to InfoSeek's overall database, as
opposed to being based solely on appearance within the page itself.
Lycos
Lycos says higher scores are given to pages with keywords
mentioned "early on, rather than far down in some sub-section
of the site." That means either mentioning keywords early
on a page, or possibly mentioning them early on the page and then
repeating them on other pages within the site. Help file URL:
- FAQs page
http://www.lycos.com/info/faq.html
What others say:
- PC World: The number of terms on the page, proximity
to each other and position on page are relevant.
- Internet World: Lycos also measures how many links
web-wide point to a particular site. Sites with many links pointed
toward them are more "relevant" than those with few
referring links. Lycos also builds abstracts based on headers,
titles, links and the first few words of key paragraphs, so it
is possible a keyword could be mentioned on a page but not appear
in the Lycos catalog.
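The link-counting idea that Internet World describes can be sketched in a few lines of Python. This is an illustration under assumptions, not Lycos's implementation: the function name `inbound_link_counts`, the sample `.example` URLs, and the decision to ignore a site's links to itself are all invented for the example.

```python
from collections import Counter
from urllib.parse import urlparse

def inbound_link_counts(links):
    """Count links pointing at each site from other sites.

    `links` is a list of (source_url, target_url) pairs.
    """
    counts = Counter()
    for source, target in links:
        source_site = urlparse(source).netloc
        target_site = urlparse(target).netloc
        # Assumption: a site's links to its own pages do not count.
        if source_site != target_site:
            counts[target_site] += 1
    return counts

links = [
    ("http://a.example/index.html", "http://b.example/"),
    ("http://c.example/links.html", "http://b.example/page.html"),
    ("http://b.example/", "http://b.example/page.html"),
    ("http://a.example/index.html", "http://c.example/"),
]
counts = inbound_link_counts(links)
print(counts.most_common())
```

Under this scheme the site with the most referring links from elsewhere would be treated as the most "relevant."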
Open Text
Nothing can be found at the Open Text site regarding search
relevancy. Interesting note: there used to be a choice on the
Add URL page between having Open Text be "gentle" or "firm"
when it visited, with a firm visit going several layers into a
site. Now only gentle remains, so Open Text can only be prompted
to visit one page at a time.
- Add URL
http://www.opentext.com/omw/f-omw-submit.html
WebCrawler
WebCrawler says the number of times the keywords appear
in the page is divided by the total number of words in the document
to get a percentage. The page with the biggest percentage is listed
first, then the rest in descending order. Help file URLs:
- FAQ
http://www.webcrawler.com/WebCrawler/Help/FAQ.html
- Check when URL was last visited
http://www.webcrawler.com/WebCrawler/Status.html
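The percentage WebCrawler describes is simple enough to work out by hand, and the Python sketch below does the same arithmetic. It follows the stated rule (keyword occurrences divided by total words), but the function name `keyword_percentage`, the way words are split, and the sample pages are assumptions made for illustration.

```python
import re

def keyword_percentage(query, document):
    """Keyword occurrences divided by total words, as a percentage."""
    terms = set(re.findall(r"[a-z0-9']+", query.lower()))
    words = re.findall(r"[a-z0-9']+", document.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in terms)
    return 100.0 * hits / len(words)

pages = {
    "short": "Used cars for sale",
    "long": "We have many used cars and trucks for sale at our large lot downtown",
}
ranked = sorted(
    pages,
    key=lambda name: keyword_percentage("used cars", pages[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(keyword_percentage("used cars", pages[name]), 1))
```

Both sample pages mention each keyword once, yet the shorter page ranks first because the same two hits make up a larger share of its words.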
What others say:
- Internet World: Only text from pages judged "popular"
is stored in the catalog (popularity presumably based on the number
of links pointing at a page). Want to see how popular WebCrawler
thinks your page is? Check it out at http://www.webcrawler.com/WebCrawler/Links.html.