
The ECN No Name Newsletter is no longer being published. This is an archived issue.
Julie Dickinson and Dave Halsema
In days of old, computers were often thought of as glorified calculators, crunching numbers contentedly and hopefully spitting out correct answers. But in these modern times, not only is the face of computer hardware rapidly changing and developing, but the use of computers itself has undergone a transformation.
More and more computers these days are being used as information storage warehouses. Computers are an excellent tool for such a task with their lightning-speed searches and ability to communicate with other computers. Users are finding that they can access any information they need from anywhere in the world. This powerful ability to access information can reduce our dependence on hardcopy reference books, such as encyclopedias and dictionaries, and save many a weary student from trudging to the library to do research.
We have outlined some useful programs that take advantage of the wealth of information that is available on the Internet. These packages are ARCHIE, GOPHER, WAIS, and WWW. We hope that the descriptions and basic instructions for these programs will tempt the user to try out the examples and learn first hand how useful a computer can really be.
ARCHIE
With anonymous ftp servers (see previous article) you can obtain files from many repositories on the Internet. The question then is: How do I find these files? If you know you want a certain file, database, etc., how can you go about finding it? With the Internet being as huge as it is, no one person can keep track of everything. Fortunately, there is a tool, called ARCHIE, that may be invaluable in assisting you.
ARCHIE allows searching of indexes of files that are available on public servers on the Internet. It currently indexes about 1200 servers and 2.1 million files.
Using ARCHIE is relatively simple. You ask it to find filenames which contain a certain string. It will return the actual filenames that meet this requirement. You can then examine the list and choose which one(s) you think will meet your needs. Next you use anonymous ftp to move the file to your computer.
The ARCHIE command is very convenient to run at the ECN because
it is installed on all our machines; type "archie -modifiers
searchstring". ARCHIE will return all filenames which contain
the searchstring.
The modifiers control the type of search you are conducting.
They are:
NOTE: You may only use one of -c, -r, and -s in your command
statement.
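
To get a feel for how the four search modifiers (-c, -e, -r and -s) differ, here is a small Python sketch that imitates them on a short list of names. It is only an illustration: the filenames are made up, the function name archie_match is our own, and Python's regular expressions stand in for UNIX ones, which are similar but not identical.

```python
import re

def archie_match(filenames, search, mode="e"):
    """Return the filenames selected by one of the four search modes."""
    if mode == "e":      # exact match (the default)
        return [f for f in filenames if f == search]
    if mode == "c":      # substring, upper and lowercase must match
        return [f for f in filenames if search in f]
    if mode == "s":      # substring, case is irrelevant
        return [f for f in filenames if search.lower() in f.lower()]
    if mode == "r":      # regular expression (Python syntax, an assumption)
        pattern = re.compile(search)
        return [f for f in filenames if pattern.search(f)]
    raise ValueError("mode must be one of e, c, s, r")

# Hypothetical filenames, for illustration only.
names = ["archie", "Archie.tar.Z", "xarchie-1.3", "gopher"]

print(archie_match(names, "archie"))        # exact
print(archie_match(names, "archie", "c"))   # case-sensitive substring
print(archie_match(names, "archie", "s"))   # case-insensitive substring
print(archie_match(names, "^x", "r"))       # names starting with "x"
```

Only one mode is passed per call, which mirrors the rule that -c, -r and -s cannot be combined.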
For example, suppose that you are looking for a source for the
ARCHIE command, because you want to install it yourself. Type
"archie -s archie".
When ARCHIE completes searching a database, it will output a
listing of the matches it has found (that is, the number of
filenames which contain the string "archie"). A portion of the
output might look like this:
For these five matches (ARCHIE will really find many more than
five), ARCHIE gives the server, the directory in which the file
resides and the filename (or a directory) which matched the
search criterion. Once you find something that looks useful you
can look at the file by ftping to the host. If you do not know
the name of the file, then use "dir" once you arrive in order to
see if there is anything of interest. Now the problem is: which
site do we choose?
There are a few guidelines:
Unfortunately some people choose filenames that don't really
explain what is in the file. There is an additional service on
ARCHIE called whatis which can be used to locate software or data
files even if the filename bears no resemblance to its contents.
It is a set of alternative indexing keywords for files on the
network. When administrators put files in their ftp archives,
they may contribute to an index entry for the file to help people
find it. Since the whatis service requires this human
intervention, it may not be as complete as we may ideally like it
to be, but it is definitely better than nothing!
When you do a whatis search, ARCHIE uses your search string to
examine the keyword list. If anything is matched, ARCHIE will
print the name of the file and a brief description. If you see a
file that you think is appropriate, you must then do a filename
search (using the ARCHIE command as described above) to see where
it is located. The third and final step, of course, is to
anonymous ftp to the given host to obtain the file.
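
The keyword lookup in the first step can be pictured as a search over a small table of names and descriptions. The Python sketch below assumes a case-insensitive substring match on the description, which is our guess and not a documented rule of the whatis service; the two entries are borrowed from the E. coli example in this article.

```python
# A tiny stand-in for the whatis keyword list: name -> description.
WHATIS = {
    "ECD": "Escherichia coli db (M.Kroeger, Giessen)",
    "NGDD": "Normalized gene maps for E.coli, S.typh., etc. (Y. Abel, Montreal)",
}

def whatis(keyword, index=WHATIS):
    """Return (name, description) pairs whose description mentions keyword."""
    key = keyword.lower()
    return [(name, desc) for name, desc in index.items() if key in desc.lower()]

for name, desc in whatis("coli"):
    print(name, "-", desc)
```

Each name that comes back would then be fed to a filename search to learn which host holds it.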
Unfortunately, you cannot do an ARCHIE command to perform a
whatis search. Instead you must use ARCHIE with telnet. There
are several ARCHIE servers on the Internet and each one contains
exactly the same information. The best way to choose what server
to use is to choose the one closest to you. This makes it easier
for the network, and, if everyone follows this guideline, should
generally spread the work around. Here are the available ARCHIE
servers:
Let's do a whatis search for a gene sequence map for E. coli
bacteria. If we were to use the ARCHIE command, "archie -s
coli", ARCHIE would return almost 200 matches--most of which
would be broccoli recipes!! Let's see what happens with whatis
after telnetting to one of the above machines and logging in as
"archie."
Notice that NGDD is a directory, not a file. So when you
anonymous ftp to ncbi.nlm.nih.gov you should type the "dir"
command to see what files are there. The index which is accessed
with the ARCHIE command is updated monthly, but the whatis index
is not updated on a regular basis. Therefore there is a chance
that if you find something with whatis, you may not be able to
locate it with the ARCHIE command. This would happen if someone
deleted a file from his or her ftp archives, but did not delete
the filename from the whatis descriptions database.
GOPHER
GOPHER is a lookup tool that lets you prowl through the Internet
by selecting resources from menus. GOPHER started out as a
distributed campus information service at the University of
Minnesota, the home of the Golden Gophers. Since the main
function of this system is to "go fer" things, the name is quite
appropriate. Since its beginnings at Minnesota, the GOPHER
system has grown--from one site to over 100, all within 18
months.
To access a public server at Illinois, type "telnet
gopher.uiuc.edu" or connect to Minnesota by typing "telnet
consultant.micro.umn.edu". In either case, log in as "gopher."
The Minnesota server was giving me problems when I tried to run a
few things for this article, so all examples shown in this
section are from the Illinois server. It doesn't really matter
which server you contact though. Your choice of server will only
determine the first menu you see.
Once you telnet to a public server and are logged in as "gopher",
you will see a menu. From this main menu you can choose menu
items to move to other directories or access files. Then you can
choose from other directories or access other files. (And so on
and so on and so on.)
Here is the main menu at Illinois:
GOPHER keeps track of two types of entities: directories and
resources. A slash at the end of a line denotes a directory. If
you select a directory, you will see another menu. A period
denotes a file. Simply select a file to read it. The symbol >
refers to an indexed directory resource. This type of entry will
be explained later in the article.
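
The trailing-symbol convention just described is simple enough to express in a few lines of Python. This is a sketch of the rule as stated here, not of how any GOPHER client is actually written; the function name classify is our own.

```python
def classify(menu_line):
    """Guess the type of a GOPHER menu entry from its last character."""
    line = menu_line.rstrip()
    if line.endswith("/"):
        return "directory"
    if line.endswith(">"):
        return "index"        # indexed directory resource
    if line.endswith("."):
        return "file"
    return "unknown"          # no marker we recognize

# Entries taken from the Illinois main menu.
for entry in ["1. Welcome to the U of Illinois Gopher.",
              "2. Campus Announcements (12/1/92)/",
              "5. Keyword Search of Gopher Menus >"]:
    print(classify(entry), ":", entry)
```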
Some handy commands to get you around the menus are:
To select an item, type the number of the line, or use the arrow
keys to select the item that you wish to examine, and press
return.
When you get to the end of an article, GOPHER asks what you want
to do with it. Press "m" if you would like to e-mail a copy of
the article to yourself. If you select an indexed directory
resource (marked by > at the end of the line), GOPHER will
prompt you for search words:
Index word(s) to search for:
GOPHER will then create a new menu which is a subset of this
directory's contents that only contains items matching your
search criterion. In some ways this is like ARCHIE--searching for
items which match your search string. One distinct advantage is
that you do not have to exit GOPHER and ftp to a certain server
to view the file. All you have to do is select the file in the
menu and (abracadabra!) there it is on your screen. GOPHER
searches are always case insensitive.
Keep in mind, though, that GOPHER does enforce licensing
restrictions. That is, there may be things that users at the
University of Minnesota may have access to that we at Purdue may
not. GOPHER knows where you are and will not distribute material
to you if certain restrictions exist. For example, the
University of Minnesota has the UPI news feed on-line, but cannot
distribute it off campus.
WAIS
Wide Area Information Server (WAIS, pronounced "ways") is a tool
that allows users to search indexed databases. The search is
performed by providing keywords and having WAIS return documents
that contain these keywords. What makes WAIS so interesting is
the creation of databases that it uses. A database can contain
any information and requires no special format to be useful. In
fact, the databases are created by running an indexing program,
so the whole process is automatic. Once the database is complete,
one is able to use the WAIS search-engine to find whatever topics
may be held in the information that was indexed.
Let's take a look at how the search-engine works. You can either
have the WAIS client program compiled and installed on your local
machine, or it may be accessed via telnet. We will use the
latter method since WAIS may not be present on some computers.
The machine you want to telnet to is quake.think.com or
nnsc.nsf.net. When you reach the login prompt type "wais" and
press return.
After the login sequence you will be greeted with a screen that
has a list of servers and the sources each server contains. It
will look like:
The top right corner of the screen shows that 349 source
libraries are present. The highlighted bar acts as a cursor
which moves up and down the screen. The bar will move down by
pressing "j" and will move up by pressing "k". You can also move
the cursor by typing in a number that corresponds to the left
column of the screen and pressing return. The cursor will
immediately jump to that source. For example, to see the next
screen of sources we could enter "19" and press return. Typing a
"?" will produce a handy help screen showing a summary of all the
commands that are recognized.
To select a source to search, place the cursor on the source you
are interested in and press the space bar. An asterisk will
appear to the right of the source number. The asterisk means
that any search performed will be done on this source. You can
select more than one source to search from.
As an example I will look up a movie review for the soon to be
classic Who Framed Roger Rabbit? Place the cursor on source
number 215 and press the space bar to select this source. This
source is titled "movie-reviews" and certainly sounds like what
we want. The asterisk appears directly to the right showing that
the "movie-reviews" source will be used for our search. Now that
the source has been selected press the space bar to enter some
keywords. After entering some proper keywords, our screen now
looks like the next screendump.
We press return to begin the search. WAIS simply counts the
number of occurrences of your keywords in a document and ranks
each document with a weighted score from 1 to 1000. A score of
1000 means that WAIS believes this will be the most pertinent
document you are looking for. It is often necessary to add or
provide new keywords for WAIS to refine a search. In fact, WAIS
was built on this principle, so don't be disheartened if you
don't get exactly what you want on the first try.
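
The counting-and-ranking idea can be sketched in Python as follows. We do not know the actual weighting WAIS uses, so this version simply counts keyword occurrences and scales the counts so that the best document scores 1000; the documents and the function name rank are invented for the illustration.

```python
import re

def rank(documents, keywords):
    """Score documents by keyword occurrences, scaled so the best is 1000."""
    wanted = {k.lower() for k in keywords}
    counts = {}
    for title, text in documents.items():
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts[title] = sum(1 for w in words if w in wanted)
    best = max(counts.values(), default=0)
    if best == 0:
        return []             # no document mentions any keyword
    scored = [(max(1, round(1000 * n / best)), title)
              for title, n in counts.items() if n > 0]
    return sorted(scored, key=lambda pair: -pair[0])

# Hypothetical documents, for illustration only.
docs = {
    "review-a": "Roger Rabbit is framed. Who framed Roger Rabbit?",
    "review-b": "A rabbit appears briefly.",
    "review-c": "Nothing relevant here.",
}

for score, title in rank(docs, ["roger", "rabbit"]):
    print(score, title)
```

Adding or changing keywords changes the counts, which is why refining a search can reshuffle the ranking.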
In our example we find the following documents returned to us
after our search is performed.
I believe the screen is self-explanatory. We seem to have lucked
out because the first two lines are both reviews of the movie we
were looking for. Selecting the text to read is the same as the
procedure outlined above for selecting a source to search: place
the cursor on the document you wish to read and press return.
In this article we have been using a character-oriented version
of WAIS known as SWAIS. There are other clients available which
some people feel are better at locating documents. One example
is XWAIS, which is a WAIS client which runs under the X Window
System. Not only does it look better, the XWAIS client has more
functionality. For instance, earlier I stated that it was normal
to refine searches by adding or using new keywords. XWAIS
handles this nicely by offering something known as relevance
feedback. This allows the user to take an article that was
previously found and find other articles that are similar to it.
This in effect makes any portion or the entire document a
"keyword" resulting in better searches.
There is still much work going on with WAIS. Search-engines are
being made more powerful, new client software is being written
for many types of computers, and older clients are slowly getting
all of the bugs worked out of them. In addition to this, people
all over the world are creating, modifying, and refining their
indexed sources for others to use, resulting in more abundant and
accurate information. You can bet that you will hear more about
WAIS in the future.
WWW
The World-Wide Web, commonly referred to as WWW, is another
information service that is offered on the Internet. As you will
see, WWW is aptly named. WWW lives up to its name by offering an
interface based on the technology of hypertext.
What is hypertext? Hypertext is a way of presenting information
where certain words in a document can be used as references to
other documents, pictures, or any other type of information
imaginable. These references, known as links, allow the user to
investigate a topic in as much detail as they like.
To use WWW, the set of programs may either be installed on your
machine, or you may use a public-access client via telnet. We
will be doing the latter. You may either telnet to
eiecs2.njit.edu or info.cern.ch and log in as "www." It would be
preferable to try info.cern.ch first, but since that computer is
down at the time of this writing, we'll use the other one.
Type "telnet eiecs2.njit.edu" to access a public WWW client. At
the login prompt enter "www" and press return. You will be
prompted for the terminal type you are using, and once entered,
you will see the preceding screen.
This is the home page at NJIT. The home page is the document
that you see when you first enter the WWW. The bracketed numbers
are links to other documents. To follow a link, type the number
and hit return. The list of words below the line are commands.
To execute one of the commands type the capitalized letter that
is in the word and hit return. For example, to exit WWW enter a
capital X (in the command eXit) and hit return. Note that the
interface will be different if you telnet to info.cern.ch
instead. However the concept of links will still be preserved.
When viewing documents that are larger than the screen size, one
can use the N command (for Next) to see the next portion of the
document. Selecting Back will scroll the current document up.
Selecting Up will take you back a level. For example, from the
home page you may select 1 to view the help screen. From the
help screen, if you select Up you will come back to the home
page.
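
Numbered links and the Up command can be modelled with a table of pages and a trail of where you have been. The Python sketch below is a toy model only: the page names and contents are hypothetical, and the real WWW client certainly does more than this.

```python
# Hypothetical pages; "links" lists the targets of [1], [2], ...
PAGES = {
    "home": {"text": "Home page. Help[1] Subjects[2]", "links": ["help", "subjects"]},
    "help": {"text": "Help screen.", "links": []},
    "subjects": {"text": "Subject list.", "links": []},
}

class Browser:
    def __init__(self, pages, start="home"):
        self.pages = pages
        self.trail = [start]      # pages visited, most recent last

    @property
    def current(self):
        return self.trail[-1]

    def follow(self, number):
        """Follow the bracketed link with the given 1-based number."""
        links = self.pages[self.current]["links"]
        if not 1 <= number <= len(links):
            raise ValueError("no such link on this page")
        self.trail.append(links[number - 1])
        return self.current

    def up(self):
        """Go back one level; stay put if already at the start page."""
        if len(self.trail) > 1:
            self.trail.pop()
        return self.current

browser = Browser(PAGES)
print(browser.follow(1))   # from the home page, link 1 is the help screen
print(browser.up())        # Up returns to the home page
```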
As you can see, there is on-line help available and many of the
commands are self-explanatory. With repeated use you will see
how powerful an interface this is.
Hypertext is a concept that is gaining popularity. Hypermedia is
being applied to other areas besides information retrieval. Most
notable is its application to television viewing. There are
experimental services available in some California cities which
allow TV viewers to control and interact with what is shown
on their screen. For example while watching a baseball game, if
you are interested in the player's history and statistics,
pressing a button on a special console will give you the
information. Customers are able to play along with the
contestants on Jeopardy or solve the murder mystery airing at
8PM. The implementation of this technology is exciting and sure
to make news in the near future.
-c         return files whose names contain the search string.
           (Upper and lowercase letters must match exactly.)
-e         return files whose names match the search string
           EXACTLY. (This is the default.)
-r         treat search string as UNIX regular expression.
-s         return files whose names contain the search string.
           (Case of letters here is irrelevant.)
-l         reformat output so it is suitable for input into
           another program (for example, grep). This would
           enable you to look through the output to find what
           you want.
-mnumber   return no more than number files.
Host aix1.segi.ulg.ac.be
Location: /pub/docs/tcpip/ftpsites
FILE -rw-r--r-- 12899 Sep 19 1991 archie
Host akiu.gw.tohoku.ac.jp
Location: /pub/net
DIRECTORY drwxrwxr-x 1024 Oct 7 02:39 archie
Host ashley.cs.widener.edu
Location: /pub
DIRECTORY drwxrwxr-x 512 Oct 26 18:15 archie
Host wuarchive.wustl.edu
Location: /usenet/comp.sources.misc/volume27
DIRECTORY drwxrwxr-x 512 Apr 16 1992 archie
Location: /usenet/comp.sources.misc/volume33
DIRECTORY drwxrwxr-x 512 Nov 6 02:08 archie
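
A listing like the one above follows a regular pattern: a Host line, then one or more Location lines, each followed by the matching entries. The Python sketch below turns such a listing into records. It assumes the layout shown here and names without spaces; the function name parse_listing is our own.

```python
def parse_listing(text):
    """Turn an ARCHIE listing into (host, location, kind, name) tuples."""
    host = location = None
    matches = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Host "):
            host = line.split()[1]
        elif line.startswith("Location:"):
            location = line.split(":", 1)[1].strip()
        elif line.startswith(("FILE", "DIRECTORY")):
            fields = line.split()
            # First field is the kind, last field is the name.
            matches.append((host, location, fields[0], fields[-1]))
    return matches

sample = """Host wuarchive.wustl.edu
Location: /usenet/comp.sources.misc/volume27
DIRECTORY drwxrwxr-x 512 Apr 16 1992 archie
Location: /usenet/comp.sources.misc/volume33
DIRECTORY drwxrwxr-x 512 Nov 6 02:08 archie
"""

for host, location, kind, name in parse_listing(sample):
    print(host, location, kind, name)
```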
archie.rutgers.edu     Northeastern U.S.
archie.sura.net        Southeastern U.S. (Purdue's link)
archie.unl.edu         Western U.S.
archie.ans.net         Sites connected to the ANS network
                       (one of the Internet Service Providers)
archie.mcgill.ca       Canada
archie.au              Australia & the Pacific Basin
archie.funet.fi        Europe
archie.doc.ic.ac.uk    United Kingdom
archie> whatis coli
ECD@Escherichia coli db (M.Kroeger, Giessen)
NGDD@Normalized gene maps for E.coli, S.typh., etc.
@(Y. Abel, Montreal)
The file NGDD looks like what we need. Now we need to find out
where it lives.
archie> prog NGDD
# matches / % database searched: 1 /100%
Host ncbi.nlm.nih.gov (130.14.20.1)
Last updated 02:23 4 Mar 1992
Location: /repository
DIRECTORY rwxrwxr-x 512 Jun 25 1990 NGDD
Root gopher server: gopher.uiuc.edu
1. Welcome to the U of Illinois Gopher.
2. Campus Announcements (12/1/92)/
3. What's Now? (12/15/92)
4. Information about Gopher/
5. Keyword Search of Gopher Menus >
6. U of Illinois Campus Information/
7. Champaign-Urbana & Regional Information/
8. Computer Documentation/
9. Libraries/
10. Newspapers, Newsletters, and Weather/
11. Other Gopher and Information Servers/
12. Phone Books (PH)/
13. Internet File Server (ftp) Sites/
< move backward
> move forward
q exit gopher
? help
u move up to previous menu
% telnet quake.think.com
Trying 192.31.181.1 ...
Connected to quake.think.com.
Escape character is '^]'.
SunOS UNIX (quake)
login: wais
Last login: Mon Dec 21 09:54:37 from wugate.wustl.edu
SunOS Release 4.1.1 (QUAKE) #3:
Tue Jul 7 11:09:01 PDT 1992
Welcome to swais.
TERM = (vt100)
Starting swais (this may take a little while)...
webmaster@ecn.purdue.edu
Last modified: Friday, 12-Sep-97 19:21:37 EST