Selection Criteria:
How are sites selected to be included
in the database?
Directories: These rely on the
judgment and expertise of the people compiling them, and it's not hard to
see that a subject expert would have different selection criteria
than a hobbyist or casual user. The process is also governed by the
whims of the people looking at the sites -- today I might include a site
in the directory listing, but tomorrow I might not.
Listings may be submitted to the directory
for possible inclusion, or chosen by the indexer. Whether or not your
site is included in the database depends on how relevant it is to
the subject matter of the directory, on its scope, coverage, and accuracy,
and on other selection criteria such as audience.
Key issue: maintenance
of an established level of quality.
Search Engines: These are automated. Their programs, variously called
intelligent agents, worms, crawlers, spiders and robots ('bots),
traverse web site content and its links in a variety of different ways
and collect the results into a database.
Webmasters can also submit their sites for
possible inclusion in the database. Additionally, the search engine
itself seeks out the key words or phrases in web sites (or indexes each
word) and adds the documents in which they occur to its database.
(This is a very simplistic description of how a search engine works!) So
it is frequently only a matter of time before your new site gets indexed
by one of the major search engines.
Key issue: indexing web
documents for maximum access (not quality driven).
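The traversal and indexing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular search engine works: the "web" here is a hypothetical in-memory table (TOY_WEB) with made-up example URLs, and crawl_and_index is a name invented for this sketch.

```python
from collections import deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("Used cars and trucks for sale",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("Truck repair tips", ["http://a.example/"]),
    "http://c.example/": ("Gardening with native plants", []),
    "http://d.example/": ("An unlinked page about cars", []),
}

def crawl_and_index(seed_urls, web):
    """Follow links breadth-first from the seeds and index every word found."""
    index = {}                  # word -> set of URLs containing that word
    seen = set(seed_urls)       # URLs already queued, so none is visited twice
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        if url not in web:      # dead link: nothing to fetch
            continue
        text, links = web[url]
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl_and_index(["http://a.example/"], TOY_WEB)
print(sorted(index["cars"]))    # pages reached by the crawl that contain "cars"
```

Starting from http://a.example/ alone, the crawl never reaches http://d.example/ because nothing links to it. Passing it as an extra seed plays the role of a webmaster submitting the site.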
Note: we will be discussing search
engines in greater depth in Chapter 3.
Access
How do I access the information I
need?
Directories: Because of their
hierarchical arrangement, most directories are browsable; that is, you
can click on a subject of interest to see pertinent links and
subcategories on your topic. You are dependent upon the indexer's
vocabulary to describe your topic, and you may have to figure out
exactly what that is (for example, you may want information on cars,
but find that there is no category for "cars" while there is one for
"automobiles"). Although this may be slightly confusing at first,
controlled vocabulary -- that is, one subject heading to
describe a topic (such as cars, automobiles) instead of several -- means
you'll find all the information on the subject in one place.
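A controlled vocabulary amounts to a lookup from the words a user might try to the single heading the indexer chose. The sketch below assumes a tiny hypothetical directory; the table names, headings, and URLs are invented for illustration.

```python
# Hypothetical controlled vocabulary: each synonym maps to one preferred heading.
PREFERRED_HEADING = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "automobiles": "Automobiles",
}

# Hypothetical directory tree: heading -> subcategories and site listings.
DIRECTORY = {
    "Automobiles": {
        "subcategories": ["Classic Automobiles", "Automobile Repair"],
        "sites": ["http://autos.example/", "http://carclub.example/"],
    },
}

def browse(term):
    """Resolve a user's term to the indexer's heading and return its entry."""
    heading = PREFERRED_HEADING.get(term.lower())
    if heading is None:
        return None, None       # the term is not in the vocabulary at all
    return heading, DIRECTORY.get(heading)

heading, entry = browse("cars")
print(heading)                  # Automobiles
print(entry["sites"])           # every listing on the subject, in one place
```

Because "cars", "autos", and "automobiles" all resolve to the same heading, the listings are not scattered across three categories.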
Some of the larger directories allow you to
search for topics, but it is very important to remember that this is a
limited kind of search -- you are searching for subject
categories within the directory and its database, not in actual
web documents.
Key issue: accurately
classifying web sites (usually into subject areas)
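The limit of a directory search can be shown with a small contrast. In this sketch, which uses made-up records and simple substring matching as a stand-in for whatever matching a real directory uses, directory_search sees only what the indexer wrote, while full_text_search reads the page text itself.

```python
# Hypothetical directory records: what the indexer wrote, not the page itself.
LISTINGS = [
    {"url": "http://autos.example/", "category": "Automobiles",
     "description": "Reviews of new automobiles"},
    {"url": "http://garden.example/", "category": "Gardening",
     "description": "Native plant guides"},
]

# Hypothetical full page text, which the directory search never reads.
PAGE_TEXT = {
    "http://autos.example/": "Reviews, prices, and a short history of tires",
    "http://garden.example/": "Plant guides, plus a note on composting",
}

def directory_search(term):
    """Match the term against category names and descriptions only."""
    term = term.lower()
    return [rec["url"] for rec in LISTINGS
            if term in rec["category"].lower()
            or term in rec["description"].lower()]

def full_text_search(term):
    """Match the term against the words on the pages themselves."""
    term = term.lower()
    return [url for url, text in PAGE_TEXT.items() if term in text.lower()]

print(directory_search("automobiles"))  # found via the category heading
print(directory_search("composting"))   # empty: the word is only on the page
print(full_text_search("composting"))   # found by reading the page text
```

The gardening page mentions composting only in passing, so it is invisible to the directory search even though the page is listed.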
Search engines:
You search the database of a search engine by
entering key words into a dialog box; web sites in which these terms
occur are presented as relevant documents. Almost everyone has
visited a web site that is high in the rankings, only
to find that it is completely off topic! In Lesson One, we looked at many
of the reasons this happens.
Key issue: relevant
retrieval by using automated indexing techniques
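One way an off-topic page can rise in the rankings is shown below. The sketch assumes the crudest possible scoring, a count of how often the query words occur on each page; real engines rank in far more elaborate ways, and the pages, URLs, and the rank function are hypothetical.

```python
import re
from collections import Counter

# Hypothetical pages; the second is off topic but repeats the query word.
PAGES = {
    "http://jaguar-cats.example/":
        "The jaguar is a large cat native to the Americas.",
    "http://dealer.example/":
        "Jaguar sale! Jaguar parts, jaguar service, jaguar financing.",
}

def rank(query, pages):
    """Order pages by how many times the query words occur in them."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    scores = {}
    for url, text in pages.items():
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        score = sum(counts[t] for t in terms)
        if score > 0:           # pages with no matching word are not returned
            scores[url] = score
    return sorted(scores, key=scores.get, reverse=True)

print(rank("jaguar cat", PAGES))
```

A searcher interested in the animal gets the car dealer first, because that page repeats "jaguar" four times while the page about the cat matches each query word only once.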
Special: It is
important to note that, because someone rather than something
compiles the directory databases, most of the web sites classified
in a topic area really are about your topic.
Usage
When would I use a directory? A search
engine?
Directories: Use a directory
when you want to see what is available on a topic, when you are
beginning your research or when you trust the compiler of the directory
to channel you to the best sites.
Key issue:
Well-organized directories save time in preliminary
research.
Search engines: Use a search engine
when time is not a factor (you may have to sift through many sites), when
you know what most of the directories are listing but you'd like to see
new sites, or when you want sites that may not have been included in the
major directories. Also use search engines to continue your research --
remember, if a site or page is not entirely devoted to a topic, it may
not be included in a directory, and so querying a search engine by key
word may be the only way to reach it.
Key issue:
Because they index words and images within a web document, search
engines are powerful tools for finding information not considered by a
human to be the main "topic" of a web page.