Dialog customer
newsletter - Dec 98
Why do search engines only cover a fraction of the Internet?
Traditionally, retrieving external information was the domain
of information professionals, who primarily used paid-for online
databases with sophisticated search languages. The ascendancy of
the Internet, and in particular the web, has resulted in an explosion
of alternative information sources that are currently relatively
cheap or free. However, there are key differences between these
sources.
Information is increasingly accessed via the web, not on it
With database integration, the user visits a web site and
retrieves the information they require from a database, often using
a rudimentary search language. This information is presented in a
temporary, computer-generated web page. Because these temporary
(or dynamic) pages exist only when a user requests them, search
engines can't find the valuable information on them, and it remains
hidden in the database until retrieved by the user. Also, database
integration is expensive, and online advertising has not been
recouping the development costs. Consequently, there is an increasing
trend towards charging for information on the web.
Nobody ever searched the web
When using a search engine, you do not "search the web". In
fact, you are searching a database of indexed websites. These databases
have been compiled by programs called "spiders", which search the
web for new web pages. However, they cannot keep up with the phenomenal
growth of the web. Research published in the April 98 edition
of the journal "Science" revealed how much, or rather how little,
of the web each of the major search engines covers.
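The difference between searching the web and searching an index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page URLs and text are made up, the function names `build_index` and `search` are hypothetical, and a real spider would fetch pages over the network rather than read them from a dictionary.

```python
from collections import defaultdict

# Made-up pages standing in for what a spider managed to fetch.
CRAWLED_PAGES = {
    "http://example.com/a": "patent law news and analysis",
    "http://example.com/b": "company news and share prices",
}


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# The query runs against the index only; the web itself is never consulted.
INDEX = build_index(CRAWLED_PAGES)
```

A page the spider never fetched, such as one generated from a database only when a user asks for it, has no entry in `INDEX`, so no query can return it.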
Web Directories
So why doesn't Yahoo! feature in this table? Because it is not a search
engine! It is a web directory. Unlike search engines, which are compiled
automatically by computers indexing keywords on web pages, web directories
are compiled manually by human editors. They are pre-defined lists of
websites categorised by subject. However, because they are compiled
manually, web directories only cover a fraction of what's available
on the web. Also, inclusion in a directory is often entirely at the
discretion of the editor(s) - so someone else determines what constitutes
a useful website, which may not always be what you want.
Search Languages
Online search command languages are currently more powerful than their
web search engine cousins. Not only do they offer a greater range of
options for identifying and retrieving information, they also allow
the user to manipulate the results in ways that search engines
can't. RANK and SORT are examples of such commands on Dialog.
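To give a feel for this kind of result manipulation, here is a minimal Python sketch. It is not Dialog's command syntax or implementation: the records are made up, the function names `sort_records` and `rank_field` are hypothetical, and it assumes that ranking means tallying the most frequent values of a field across the retrieved records.

```python
from collections import Counter

# Made-up records standing in for a retrieved result set.
RECORDS = [
    {"title": "Chip market outlook", "company": "Acme", "year": 1998},
    {"title": "Acme annual results", "company": "Acme", "year": 1997},
    {"title": "Widget pricing trends", "company": "Globex", "year": 1998},
]


def sort_records(records, field, descending=False):
    """Return the records ordered by the given field."""
    return sorted(records, key=lambda record: record[field], reverse=descending)


def rank_field(records, field):
    """Return (value, count) pairs for a field, most frequent value first."""
    return Counter(record[field] for record in records).most_common()
```

For example, `rank_field(RECORDS, "company")` shows which company appears most often in the result set, and `sort_records(RECORDS, "year")` orders the same records by year.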
These issues have several consequences:
- There is an increasing need to manage the burgeoning choice of
sources and to quickly identify which is most appropriate
- Information retrieval skills are expanding beyond information
professionals to end users (high volume consumers of information who
are aligned to non-information functions within the organisation)
- Subscriptions to a myriad of websites must be managed, versus
single integrated invoicing for online databases
- Cost of information versus cost of time - there is a greater
appreciation of the speed and power of online databases
The web is a complement to, not a replacement for, commercial online databases.
Time has a premium value in today's business environment and "free information
that take too long to find and format is expensive information" (Information
Today, Feb 98).