Using Metasearching

Most Web surfers have used one or another of the Web search engines. These resources collect Web pages and build databases that people can browse or search. They can be quite comprehensive or fairly specialized. Each has its own content and its own interface, with its own search rules and its own way of displaying results. To search exhaustively, one often has to use several of them and be familiar with their different interfaces and search rules.

A metasearch offers a central place with a uniform interface: a query is entered once, the search is conducted simultaneously in as many search engines and directories as necessary, and the results are brought back and displayed in a consistent format. Tools with these features have come to be called metasearch engines.

Unlike the individual search engines and directories, metasearch engines do not have their own databases; they do not collect web pages; they do not accept URL additions; and they do not classify or review web sites. Instead, they send queries simultaneously to multiple Web search engines and/or Web directories. Many of the metasearch engines integrate search results: duplicate findings are merged into one entry; some rank the results according to various criteria; some allow selection of search engines to be searched.
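
The integration step described above can be illustrated with a minimal Python sketch. It is not the method of any particular metasearch engine: the function names, the URL normalization rule, and the choice to rank by the number of engines reporting a page are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to a comparison key: lowercase host, no trailing slash.
    (An assumed rule; real engines may compare URLs differently.)"""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")

def merge_results(results_by_engine):
    """Merge duplicate findings into one entry and rank the entries.

    results_by_engine maps an engine name to a list of (title, url) pairs.
    Entries found by more engines come first (one possible ranking criterion).
    """
    merged = {}
    for engine, hits in results_by_engine.items():
        for title, url in hits:
            key = normalize(url)
            entry = merged.setdefault(
                key, {"title": title, "url": url, "engines": []}
            )
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
    return sorted(merged.values(), key=lambda e: (-len(e["engines"]), e["url"]))
```

Given results from two engines that both list the same home page under slightly different URLs, the merged list holds a single entry for that page, credited to both engines and placed ahead of pages found by only one.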

Before conducting a metasearch engine search, it is important to find out which search engines are included in your search. Most metasearch engines default to the major search engines, such as AltaVista, Excite, Lycos, and Infoseek. Others will also include Usenet searches, and other specialized databases. Negotiations between the metasearch engine companies and the individual search engine companies may also result in a major search engine being excluded from a metasearch engine. For example, Northern Light would not allow any of the metasearch engines to robotically search its index since this process drains its resources. Development of metasearch engines lags behind development of search engines and some metasearch engines are still including defunct search engines.

Successful use of a metasearch engine depends on the status of each of the individual search engines used. Some may be heavily loaded at the time; some may be unreachable. The added features mentioned above require further resources from the metasearch engines, resulting in slower response time, a serious problem with many of the metasearch engines. Many of them, therefore, have a timeout period, so that attempts to work with a particular search engine can be abandoned if no response comes from it within a set period of time.
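
The timeout idea can be sketched in Python as follows. This is only an illustration under stated assumptions: each search engine is represented by a hypothetical callable that takes the query and returns a list of hits, and no real network requests are made.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def query_all(engines, query, timeout):
    """Send the query to every engine at once and give up on slow ones.

    engines maps a name to a callable(query) -> list of hits (hypothetical
    stand-ins for real search engines). Returns the results that arrived in
    time, plus the sorted names of engines that timed out or failed.
    """
    pool = ThreadPoolExecutor(max_workers=len(engines))
    futures = {pool.submit(fn, query): name for name, fn in engines.items()}
    done, not_done = wait(list(futures), timeout=timeout)

    results = {}
    skipped = [futures[f] for f in not_done]      # no answer within the timeout
    for future in done:
        name = futures[future]
        try:
            results[name] = future.result()
        except Exception:
            skipped.append(name)                  # engine was unreachable
    pool.shutdown(wait=False)                     # do not block on stragglers
    return results, sorted(skipped)
```

A heavily loaded engine then costs the user no more than the timeout period, at the price of losing that engine's hits for this search.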

Remember too that a query submitted to a metasearch engine, with its uniform search interface and syntax, is to be applied against the diversity of individual search engines. It is therefore impossible for metasearch engines to take advantage of all the features of the individual search engines. Boolean searches, for example, may produce varied results. Phrase searches may not be supported. Other features, such as query refinement, are sacrificed.
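
A small Python sketch shows why features get lost when one query is translated for engines with different capabilities. The uniform syntax used here (+term, -term, "quoted phrase") and the two capability flags are assumptions for illustration, not the behaviour of any named engine.

```python
import shlex

def translate(query, supports_signs, supports_phrases):
    """Rewrite a uniform query for an engine with the given capabilities.

    supports_signs:   the engine understands +required and -excluded terms.
    supports_phrases: the engine understands "quoted phrases".
    Unsupported features are degraded rather than rejected.
    """
    out = []
    for token in shlex.split(query):          # keeps quoted phrases together
        sign = token[:1] if token[:1] in ("+", "-") else ""
        text = token[len(sign):]
        if " " in text and supports_phrases:
            text = '"' + text + '"'           # otherwise sent as loose words
        if not supports_signs:
            if sign == "-":
                continue                      # exclusion cannot be expressed
            sign = ""
        out.append(sign + text)
    return " ".join(out)
```

For the query `+celtic -football "round tower"`, an engine supporting neither feature would receive just `celtic round tower`, so it may well return pages about football that the user meant to exclude.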

Moreover, metasearch engines generally do not conduct exhaustive searches: they do not bring back all the pages from each of the individual search engines. They only make use of the top 10 to 100 hits from each of them. While this is sufficient for most searches, individual search engines must be consulted if one needs to go beyond the top hits as determined by the metasearch engines. Some metasearch engines facilitate this need by providing query links back to the individual search engines.
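
As a rough Python sketch of this behaviour, the function below keeps only the top hits from each engine and attaches a query link back to that engine. The search URLs and the `q` parameter name are hypothetical placeholders; real engines each use their own URL formats.

```python
from urllib.parse import urlencode

def top_hits_with_links(results_by_engine, search_urls, query, limit=10):
    """Keep the first `limit` hits per engine and build a link back to it.

    search_urls maps an engine name to its (hypothetical) search page URL.
    """
    report = {}
    for engine, hits in results_by_engine.items():
        report[engine] = {
            "hits": hits[:limit],
            # Link for users who need to go beyond the top hits.
            "more": search_urls[engine] + "?" + urlencode({"q": query}),
        }
    return report
```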

The following metasearch engines are among the major ones currently available.

Ask Jeeves
http://www.askjeeves.com/
Simple syntax; results presented in pull-down menus; number of matches reported from each search engine; no integration; no ranking; interesting design; fairly good response time; limited number of search engines used.
Debriefing
http://www.debriefing.com
A new contender; searches AltaVista, Yahoo, Infoseek, Excite, WebCrawler, Lycos and Hotbot in the English version; its French version searches Yahoo France, PagesWeb, Ecila, Infoseek France, Excite France and Lokace; it supports Boolean (+ -) and phrase searches (" "); collates the results, ranks them and removes duplicates; provides the most significant domain name for a search; in the advanced search mode, it allows for searches within a particular site (no need to provide a complete URL).
Dogpile
http://www.dogpile.com/
Relatively new; searches Web sites, Usenet, FTP sites and newswires (25 in all); first-time users should start with "Custom Search", where one can set the order and the number of the 25 search engines so that results from one's favorite sites return first, and/or exclude (skip) certain sites from the search engine list, a very handy feature; timeout can be set from ten to 60 seconds; it searches three sites at a time and, if there are enough results (ten hits), the search stops; otherwise it continues on to the next three sites. Ten records from each of the three sites are displayed. Further hits from the three sites can be retrieved with a click, and the next three sites can be searched with a click as well. Search results are displayed with summaries; number of hits from each site is reported; Boolean searches are supported; response time is very good; no integration of results. See also MetaFind below.
Highway 61
http://www.highway61.com/
Searches only Yahoo, Lycos, WebCrawler, Infoseek and Excite (used to search AltaVista as well); AND and OR searches; number of hits from each site is reported; results displayed with summaries; sites returned by more search engines are ranked higher; interesting way of presenting options: timeout period is presented as "Your patience level." I like the developer's sense of humour in admitting that "this is not an exact science" when referring to how many hits a search should return. Response time leaves much room for improvement.
Internet Sleuth
http://www.isleuth.com/
One of the largest collections of searchable sites, divided into several major categories: Web search engines and directories, reviewed sites, news, business and finance, software and Usenet; very flexible selection of search engines to be included (Hold Ctrl key to select multiple databases, Shift key to select a range). Maximum search time can be set between ten seconds and two minutes (used to be five); no integration of results; display of search results can be customized to show titles only or titles with summaries; number of results from each site can range from ten to 100; convenient arrangements for retrieving more records from individual search engines; response time is moderate.
Mamma
http://www.mamma.com/
Searches the Web, Usenet, news, stock symbols, company names, MP3 files, pictures and sound; supports optional phrase searches and searches limited to titles only; optionally shows summaries; Boolean operators can be used (+ and -). It claims to present results in a uniform format by relevance and source. A limited number of search engines is supported: AltaVista, Excite, Infoseek, Lycos, WebCrawler, and Yahoo. No arrangement for further searches in the individual search engines. Response time is moderate.
MetaCrawler
http://www.go2net.com/search.html
One of the earliest metasearch engines, purchased by go2net from the University of Washington. It is best to customize it before using: set the default interface (regular, power, or low bandwidth); select the default Boolean operator to be used (OR, AND, or as a phrase); optionally limit results to Web pages from North America, Europe, Asia, Australia, South America, Africa, Antarctica, or U.S. educational, commercial or government sites; set the timeout period and the number of results from each source; or start with power search, where all the options can be set before searching. Results are displayed with summaries, integrated and ranked; response time is fairly good; Web search includes only the major search engines: Lycos, Infoseek, WebCrawler, Excite, AltaVista, and Yahoo. Many other types of databases have been added recently: computer products, Usenet, files, stock quotes.
ProFusion
http://www.profusion.com/
Excellent options in search engine selection: one can choose the best three, the fastest three, all or any of the available search engines; Boolean and phrase searches are supported; searches the Web or Usenet; search results can be displayed with summaries or without; one can have up to 50 links of search results checked to make sure they are live. Results are integrated and number of hits from each search engine is reported; search terms can be saved for future reruns (this feature seems to have disappeared). ProFusion used to be very slow in response time, but since its recent address change, speed has improved dramatically.
Search.com
http://www.search.com/
Searches Google, Ask Jeeves, LookSmart and dozens of other leading search engines.
Verio Metasearch
http://search.verio.net/
The advanced query interface has a very powerful scoring feature, allowing one to decide which individual search engine's results carry more weight than others; maximum delay time can be arbitrarily set; number of search results can range from ten to all; returns the most meta-information about a site, including relevance rank and score, and number of search engines ranking a site in its top ten hits. Slow response time.
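
To make the idea of weighted scoring concrete, here is a minimal Python sketch. It is not Verio's actual formula, which is not documented here: the reciprocal-rank credit, the default weight of 1.0, and all names are assumptions chosen for illustration.

```python
def weighted_scores(rankings, weights, top_n=10):
    """Combine per-engine rankings into one scored list.

    rankings maps an engine name to its URLs, best first (each URL assumed
    to appear at most once per engine). weights maps an engine name to how
    much its opinion counts. A URL earns weight / position from each engine
    that lists it in its top_n hits (an assumed scoring rule).
    Returns (url, info) pairs sorted by descending score, where info holds
    the score and the number of engines ranking the URL in their top_n.
    """
    table = {}
    for engine, urls in rankings.items():
        weight = weights.get(engine, 1.0)
        for position, url in enumerate(urls[:top_n], start=1):
            entry = table.setdefault(url, {"score": 0.0, "engines": 0})
            entry["score"] += weight / position
            entry["engines"] += 1
    return sorted(table.items(), key=lambda item: -item[1]["score"])
```

With two engines that disagree on the order of two pages, giving one engine twice the weight of the other makes its first choice win overall, which is the kind of control a user-set weighting provides.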