November 1998

Searching The Web

Finding the information you need is the biggest challenge facing the Web surfer. Search tools let you type a few words describing what you want and respond with a list of Web pages or, sometimes, Web sites. We'll consider three types of search tools: crawlers, metacrawlers, and directories.

Web crawlers (also called spiders) automatically read Web sites and index Web pages. Search engines which use crawlers to gather data include AltaVista, HotBot, Lycos, and Excite.

Since the different search engines use different methods of evaluating a match between the words you supply and the Web pages they have indexed, they often give differing results. Why not use more than one index and combine the results? That's what metacrawlers do. They are parasites: they don't index anything themselves, they just use other search engines to do their searches, and they combine the results. MetaCrawler, Debriefing, and SavvySearch are examples of this type.
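The combining step can be pictured with a short Python sketch. It is only an illustration: the URLs are made up, and the scoring rule (each engine contributes 1/rank for a page) is our assumption, not the method MetaCrawler, Debriefing, or SavvySearch actually uses.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Combine ranked URL lists from several engines into one ranking.

    Each URL earns 1/rank from every list in which it appears, so a page
    ranked highly by several engines rises to the top. A page returned by
    more than one engine appears only once in the merged list.
    """
    scores = defaultdict(float)
    for urls in result_lists:
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from two engines (not real data).
engine_a = ["rtg.example/home", "other.example/billing", "music.example/"]
engine_b = ["rtg.example/home", "other.example/billing"]

merged = merge_results([engine_a, engine_b])
print(merged)
```

A metacrawler that skips this step and simply shows each engine's list in turn will list the same page more than once.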

Finally, Web directories attempt to categorize Web sites. When you search, they look through the directory and give you a list of the categories, and the sites within those categories, that match your request. The best-known example is Yahoo.

Looking For RTG

In order to test the search tools, we did two searches on each. First we searched for "legal time and billing." We used a unique way of evaluating the results: did they find RTG's Web site? Our site is dedicated to RTG Bills, our legal time and billing software, so this search should find our site.

Our second search was for "RTG Bills." Once again we expected the search to find RTG's Web site, but this should be much easier because we specifically requested two words which are important on our site, yet not otherwise related to each other. There are other RTGs on the Web, but so far as we know there are no other products called RTG Bills.

Surprising Search Results

Yahoo provides an excellent example of what can go wrong. When we searched for "legal time and billing," Yahoo displayed 11 site matches. In the category "Business and Economy: Companies: Law: Software: Billing," Yahoo showed three sites. RTG's Web site was not among them.

However, if you look at the entire contents of that category, you find that it contains only six sites, and RTG's site is one of them. Why didn't the search show our site? The reason is that the description for RTG's site says "time and billing program for small law firms." It has the word "law" but not the word "legal," so it wasn't chosen as a hit.
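The behavior is what you would expect from a search that requires every query word to appear, exactly as typed, in the site description. The Python sketch below shows that kind of matching. The function name and the decision to ignore the word "and" are our assumptions; we do not know how Yahoo's search is actually implemented.

```python
def matches_all_words(query, description, ignored=("and",)):
    """Return True only if every significant query word appears,
    as a whole word, in the description (case-insensitive)."""
    wanted = [w for w in query.lower().split() if w not in ignored]
    available = set(description.lower().replace(".", " ").split())
    return all(word in available for word in wanted)


# The description quoted above for RTG's site.
description = "time and billing program for small law firms"

legal_hit = matches_all_words("legal time and billing", description)
law_hit = matches_all_words("law time and billing", description)
print(legal_hit, law_hit)
```

With exact word matching, "legal" and "law" are simply different words, so the first query misses and the second one hits.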

Snap is another Web directory. Both searches found our Web site, but Snap has added a very odd description:

RTG offers a fully integrated billing and time-management package for attorneys, which will make sure your billable hours stay right where they're supposed to be.

We certainly didn't write that.

A search for "legal time and billing" on Debriefing, a meta search engine, came up with four sites on the first page of hits. RTG's site was first! The next two sites also seemed relevant, but hit #4 was Music Boulevard, which seems strange.

SavvySearch, also a meta search engine, found RTG's site, but it appeared as hit #14 and again as hit #22 rather than as #1. Both SavvySearch and Debriefing found us by using Excite and Infoseek, but Debriefing combines the results it receives and, apparently, SavvySearch does not.

AltaVista was a disappointment. The first search found RTG at hit #56 on one day and at #80 the next day, and even then the page it found was the July 1998 issue of RTG News, not our home page. The second search returned 10 pages from our site as the top 10 hits, but the RTG home page was still not among them. On the other hand, if you put quotes around the phrase "legal time and billing," RTG's home page shows up as hit #1.
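The difference between the quoted and unquoted searches can be sketched in Python. The two page texts are invented for illustration, and the functions show only the general idea of phrase matching versus word matching, not AltaVista's actual ranking.

```python
def contains_phrase(text, phrase):
    """True if the words of `phrase` occur consecutively, in order, in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def contains_any_word(text, query):
    """True if at least one query word occurs anywhere in `text`."""
    words = set(text.lower().split())
    return any(word in words for word in query.lower().split())


# Hypothetical page texts (not taken from any real site).
home_page = "RTG Bills legal time and billing software"
other_page = "billing questions and legal notices updated every time"

query = "legal time and billing"
print(contains_phrase(home_page, query), contains_phrase(other_page, query))
print(contains_any_word(home_page, query), contains_any_word(other_page, query))
```

Without quotes, both pages are candidates and the engine must decide how to rank them. With quotes, only the page containing the exact phrase qualifies.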

Two sites, GoTo and Debriefing, didn't work the first day we tried them. GoTo said it was having technical difficulties. The next day it worked, but very slowly. Debriefing gave an "internal server error" message in response to every search. We sent them an e-mail message to tell them about it. The next day we received a response saying it was fixed, and it was.

GoTo couldn't find RTG's Web site. The first search failed to find any reference at all to us. The second search found the ABA's Legal Software Sampler II, a CD-ROM which contains a demo of RTG Bills, as hit #25. The results at HotBot were similar, with the CD-ROM appearing at #7.

We were surprised that HotBot didn't find our Web site. Last year the same search returned our site as hit #6, and HotBot also found RTG Bills at that time.

Final Test Results

The table below lists the results in order of success in finding our Web site. The word "no" means that the search did not return a reference to any page of RTG's Web site in the first 50 hits.

We realize, of course, that this is not a comprehensive evaluation of these search tools. For another opinion, you might want to look at the December 1 issue of PC Magazine.

| Search Tool | Type | Search 1: "legal time and billing" | Search 2: "RTG Bills" |
|---|---|---|---|
| Debriefing | meta | #1 | #1 |
| Infoseek, Thunderstone | crawler | #2 | #1 |
| Excite | crawler | #3 | #1 |
| Lycos | crawler | #4 | #1 |
| Northern Light, Google, Galaxy | crawler | #8 | #1 |
| MetaCrawler | meta | #11 | #1 |
| Snap | directory | #14 | #1 |
| SavvySearch | meta | #14 | #18 |
| AltaVista | crawler | no | #1 |
| Yahoo | directory | no (but found correct category) | no |
| HotBot, GoTo | crawler | no | no |
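
The scoring rule used in the table (report the position of the first hit from RTG's site, or "no" if none appears in the first 50 hits) can be written as a few lines of Python. The hit lists and the "rtg.example" address are placeholders we made up, not the actual search results.

```python
def report_rank(hits, site, cutoff=50):
    """Return '#N' for the first hit whose URL contains `site`,
    looking only at the first `cutoff` hits; otherwise return 'no'."""
    for position, url in enumerate(hits[:cutoff], start=1):
        if site in url:
            return f"#{position}"
    return "no"


# Placeholder hit list: 80 made-up URLs, with an RTG page at position 56.
hits = [f"site{i}.example/" for i in range(1, 81)]
hits[55] = "rtg.example/news"

late_result = report_rank(hits, "rtg.example")
top_result = report_rank(["rtg.example/home"] + hits, "rtg.example")
print(late_result, top_result)
```

Under this rule a page found at hit #56, as in AltaVista's first search, is still recorded as "no".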

RTG Bills and RTG Timer are trademarks of RTG Data Systems. Other company and product names may be trademarks of the companies with which they are associated.
