Vol. 13 No. 1 VASTA Winter 99 p. 8

 

Tech Talk

by Eric Armstrong

a column on technology


Searching the
World Wide Web:

Getting the most out of the net with Search Engines

As soon as the net started to get big, finding anything useful on the World Wide Web became tricky. Since 1995, when the web became popular, search engines have grown into a very big business whose goal is to help you get the most out of the information glut that is the web. The big problem is that getting the search engines to work for you isn't necessarily easy.

Before you start searching, it is important to realize what the web is and, even more importantly, what it isn't. The web is an information source, like a library in some ways, which can be used for quick reference. It is rarely useful for in-depth research: the content on the web usually isn't well researched, and it is often produced with a strong bias, or independently, without anyone to edit, revise, or vet the writing or check the facts.

However, the web is an ideal research environment for certain areas: current news, popular opinion, production information and statistics are areas where the data can be updated immediately via the web, and so it serves as a way to get accurate, up-to-date information. Contact information, current government documents and statistics, and data about international organizations and non-governmental organizations are also available via the web; our own <vasta.org> web site is a great example of this type of site, featuring contact information and current information about our international organization. The web isn't so good when you are looking for the following: comprehensive scholarly research, historical information, book reviews, extensive biographical information or full-length literary works. Occasionally you can find up-to-the-minute information about these items, but usually you should visit an academic library.

If you decide that you want to search the web for information about a specific topic, there are three ways of searching: the subject guide, the subject tree and the search engine. Each has its own strengths and weaknesses and is suited to a particular type of search.

A Subject Guide is usually a list of links to various internet sites prepared by an individual, group or organization that focuses on a specific discipline. Often annotated, these guides can point you to the best sites that you can trust. Unfortunately, these guides cannot be comprehensive, their text cannot be searched, and they are hard to keep up to date. Look for a Subject Guide on <vasta.org> in the new year.

A Subject Tree* is like a giant Subject Guide. The most famous version of this is Yahoo <www.yahoo.com>, which organizes the web into categories that branch out into subcategories, on and on until they reach lists of links to annotated web sites. In fact, <vasta.org> is listed on the Yahoo site under the following category: Arts, Performing Arts, Theater, Organizations. We can search this tree for other Organizations, but we can't search it for other voice-related items. Subject Trees are highly organized, and they often allow both browsing and searching (by searching the tree, I also found the Vietnamese American Space Technology Association, also known by the acronym VASTA). However, they aren't comprehensive, the annotations are very short, and the vocabulary a Subject Tree uses is not consistent with library conventions.
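For readers who like to see the idea in miniature, here is a rough sketch of a subject tree in Python. It is only an illustration and not how Yahoo is actually built: the category path is the one quoted above, while the annotation text and the function names browse and search are invented for this example.

```python
# A toy subject tree: categories nest inside categories, and the
# innermost level holds a list of (site, short annotation) pairs.
# The category path comes from the article; the annotation is made up.
SUBJECT_TREE = {
    "Arts": {
        "Performing Arts": {
            "Theater": {
                "Organizations": [
                    ("vasta.org", "Voice and Speech Trainers Association"),
                ],
            },
        },
    },
}


def browse(tree, path):
    """Walk down the tree one category at a time and return what is there."""
    node = tree
    for category in path:
        node = node[category]
    return node


def search(tree, word, trail=()):
    """Find every listed site whose name or annotation contains `word`."""
    hits = []
    if isinstance(tree, list):
        # We have reached a list of annotated links.
        for site, note in tree:
            if word.lower() in (site + " " + note).lower():
                hits.append((" > ".join(trail), site))
        return hits
    for category, subtree in tree.items():
        hits.extend(search(subtree, word, trail + (category,)))
    return hits


print(browse(SUBJECT_TREE, ["Arts", "Performing Arts", "Theater", "Organizations"]))
print(search(SUBJECT_TREE, "voice"))
print(search(SUBJECT_TREE, "singing"))
```

The last search comes back empty even though the site may well discuss singing: the tree only knows the site's name and its very short annotation, which is the weakness described above.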

Our third category, the Search Engine, is now the most common form of search device on the net. Good examples are:

Alta Vista <www.altavista.com>
HotBot <www.hotbot.com>
Excite <www.excite.com>
Northern Lights <www.nlsearch.com>, a newcomer

These have indexed many pages, up to a third of the 350 million pages that make up the World Wide Web today. They work by sending out a "spider" program that visits web pages and sends information back to the huge database that is at the heart of the search engine. They rank their search results and can do specialized searches for images, audio and video, but they lack the human touch of the subject guide or tree. When deciding which search engine to use, it is important to find out how that search engine works. What does it index? Many search engines index only the meta-tags that web authors place at the top of their web pages. Some search engines index the entire text of the web page and are therefore much more powerful.
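To make the difference between the two kinds of index concrete, here is a small Python sketch. It is an assumption-laden toy, not the workings of any real search engine: the sample page, the class name PageReader and the helper functions are all invented for illustration, and only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


def words(text):
    """Split text into lower-case words with simple punctuation removed."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - {""}


class PageReader(HTMLParser):
    """Collect the words in a page's meta-tags and in its ordinary text."""

    def __init__(self):
        super().__init__()
        self.meta_words = set()
        self.text_words = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") in ("keywords", "description"):
            self.meta_words.update(words(attrs.get("content") or ""))

    def handle_data(self, data):
        self.text_words.update(words(data))


# A made-up page: the meta-tags mention only three words.
PAGE = """<html><head>
<meta name="keywords" content="voice, speech, trainers">
<title>Voice Teachers</title></head>
<body><p>Our workshop covers dialect coaching and breathing.</p></body></html>"""


def index_page(html):
    """Return (meta-tag-only index, full-text index) for one page."""
    reader = PageReader()
    reader.feed(html)
    return reader.meta_words, reader.meta_words | reader.text_words


meta_only, full_text = index_page(PAGE)
print("dialect" in meta_only)   # the meta-tag index never saw this word
print("dialect" in full_text)   # the full-text index did
```

An engine that keeps only the first index would never return this page for a search on "dialect", however relevant the page is.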

Search engines are difficult to use effectively because they are designed to use "natural language" searches, which are very vague. If you are familiar with Boolean searches, common on library index computers, look for Boolean search capabilities on the advanced search pages that most search engines offer (look for Help links if you can't find them). To begin with, I would recommend that you try HotBot and Alta Vista. Always use more than one engine for your search, as they may have indexed different things: most of these engines have limited database capacity, and it is highly likely that the third of the web one engine has indexed isn't the same third that another has indexed.

If you are interested in a different kind of search, try Northern Lights. Designed with help from some reference librarians, Northern Lights also gives access to a number of print journals, which you can search for free; you can then download articles for a small fee. For less than the cost of a single subscription, a monthly subscription gives you a number of downloads per month at a huge savings over the per-article price.
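Here is a minimal Python sketch of what a Boolean search does, and of why it pays to ask more than one engine. Everything in it is hypothetical: the two "engines", the page names and the words on each page are invented, and real search engines use their own query syntax, so check each engine's Help page.

```python
# Each hypothetical engine has indexed a different handful of pages.
# Page names and the words on them are invented for illustration.
ENGINE_A = {
    "page1": {"voice", "training", "theater"},
    "page2": {"voice", "mail", "telephone"},
}
ENGINE_B = {
    "page2": {"voice", "mail", "telephone"},
    "page3": {"voice", "speech", "theater"},
}


def boolean_search(index, all_of=(), any_of=(), none_of=()):
    """Return the pages that match a simple Boolean query.

    all_of  -> every word must appear (AND)
    any_of  -> at least one word must appear, if any are given (OR)
    none_of -> no word may appear (NOT)
    """
    hits = set()
    for page, page_words in index.items():
        if not set(all_of) <= page_words:
            continue
        if any_of and not (set(any_of) & page_words):
            continue
        if set(none_of) & page_words:
            continue
        hits.add(page)
    return hits


# A vague one-word search also drags in the voice mail page.
print(boolean_search(ENGINE_A, all_of=["voice"]))

# "voice NOT mail" on each engine, then the two result lists combined.
from_a = boolean_search(ENGINE_A, all_of=["voice"], none_of=["mail"])
from_b = boolean_search(ENGINE_B, all_of=["voice"], none_of=["mail"])
print(sorted(from_a | from_b))
```

Neither engine alone finds both useful pages; combining the two result lists does, which is the reason for always trying more than one engine.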

*This term was coined by Brandeis University Reference Librarian Ann Frenkel.

Eric Armstrong is the Speech Guy at Brandeis University. He is the web designer for <vasta.org> and heads the technology group for VASTA. If you have any questions for him, please email him at:

<armstrong@brandeis.edu>

 

 


 


© Copyright 1987-99 Voice and Speech Trainers Association, Inc.