ECN No Name Newsletter: January, 1993

The ECN No Name Newsletter is no longer being published. This is an archived issue.


The Internet Offers A World Of Information

Julie Dickinson and Dave Halsema


In days of old, computers were often thought of as glorified calculators, crunching numbers contentedly and hopefully spitting out correct answers. But in these modern times, not only is the face of computer hardware rapidly changing and developing, but the use of computers itself has undergone a transformation.

More and more computers these days are being used as information storage warehouses. Computers are an excellent tool for such a task with their lightning-speed searches and ability to communicate with other computers. Users are finding that they can access any information they need from anywhere in the world. This powerful ability to access information can reduce our dependence on hardcopy reference books, such as encyclopedias and dictionaries, and save many a weary student from trudging to the library to do research.

We have outlined some useful programs that take advantage of the wealth of information available on the Internet: ARCHIE, GOPHER, WAIS, and WWW. We hope that, by following the descriptions and basic instructions for these programs, you will be tempted to try out the examples and learn firsthand how useful a computer can really be.

ARCHIE

With anonymous ftp servers (see previous article) you can obtain files from many repositories on the Internet. The question then is: How do I find these files? If you know you want a certain file, database, etc., how can you go about finding it? With the Internet being as huge as it is, no one person can keep track of everything. Fortunately, there is a tool, called ARCHIE, that may be invaluable in assisting you.

ARCHIE allows searching of indexes of files that are available on public servers on the Internet. It currently indexes about 1200 servers and 2.1 million files.

Using ARCHIE is relatively simple. You ask it to find filenames which contain a certain string. It will return the actual filenames that meet this requirement. You can then examine the list and choose which one(s) you think will meet your needs. Next you use anonymous ftp to move the file to your computer.

The ARCHIE command is very convenient to run at the ECN because it is installed on all our machines; type "archie -modifiers searchstring". ARCHIE will return all filenames which match the search string.

The modifiers control the type of search you are conducting. They are:

       -c  return files whose names contain the search
           string.  (Upper and lowercase letters must
           match exactly.)

       -e  return files whose names match the search
           string EXACTLY. (This is default.)

       -r  treat search string as UNIX regular expression.

       -s  return files whose names contain the search
           string.  (Case of letter here is irrelevant.)

       -l  reformat output so it is suitable for input into
           another program (for example, grep).  This
           would enable you to look through the output
           to find what you want.

      -mnumber  return no more than number files.

NOTE: You may only use one of -c, -r, and -s in your command statement.
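
The four search modes can be mimicked on a small list of names. The following is a minimal Python sketch, not the ARCHIE client itself: the function name archie_match, its parameters, and the sample filenames are made up for illustration, and Python's regular expressions stand in for UNIX ones.

```python
import re

def archie_match(names, pattern, mode="e", limit=None):
    """Select names the way the -e, -c, -s and -r modifiers are described.

    mode "e": exact match (the default), "c": case-sensitive substring,
    mode "s": case-insensitive substring, "r": regular expression.
    limit plays the role of -mnumber.
    """
    if mode == "e":
        hits = [n for n in names if n == pattern]
    elif mode == "c":
        hits = [n for n in names if pattern in n]
    elif mode == "s":
        hits = [n for n in names if pattern.lower() in n.lower()]
    elif mode == "r":
        # Assumption: Python's re syntax is close enough for a demonstration.
        rx = re.compile(pattern)
        hits = [n for n in names if rx.search(n)]
    else:
        raise ValueError("unknown mode: " + mode)
    return hits if limit is None else hits[:limit]

# Hypothetical filenames, chosen only to show how the modes differ.
names = ["archie", "Archie.tar.Z", "xarchie-1.3", "README"]

for mode in "ecsr":
    print(mode, archie_match(names, "archie", mode))
```

Because only one mode is passed per call, the sketch also reflects the rule that the search modifiers are not combined.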

For example, suppose that you are looking for a source for the ARCHIE command, because you want to install it yourself. Type "archie -s archie".

When ARCHIE completes searching a database, it will output a listing of the matches it has found (that is, the filenames which contain the string "archie"). A portion of the output might look like this:

Host aix1.segi.ulg.ac.be
    Location: /pub/docs/tcpip/ftpsites
           FILE -rw-r--r--      12899  Sep 19 1991  archie
Host akiu.gw.tohoku.ac.jp
    Location: /pub/net
      DIRECTORY drwxrwxr-x       1024  Oct  7 02:39  archie
Host ashley.cs.widener.edu
    Location: /pub
      DIRECTORY drwxrwxr-x        512  Oct 26 18:15  archie
Host wuarchive.wustl.edu
    Location: /usenet/comp.sources.misc/volume27
      DIRECTORY drwxrwxr-x        512  Apr 16 1992  archie
    Location: /usenet/comp.sources.misc/volume33
      DIRECTORY drwxrwxr-x        512  Nov  6 02:08  archie
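
A listing in this shape is regular enough to be read by a program. The sketch below is one possible way to do that in Python, assuming the layout is exactly as shown above (a Host line, then Location lines, then FILE or DIRECTORY entries); the function name parse_archie_listing is hypothetical.

```python
def parse_archie_listing(text):
    """Turn a Host/Location/entry listing into a list of dictionaries."""
    records = []
    host = location = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Host":
            host = fields[1]
        elif fields[0] == "Location:":
            location = fields[1]
        elif fields[0] in ("FILE", "DIRECTORY"):
            # Entry layout: kind, permissions, size, date (3 fields), name.
            records.append({
                "host": host,
                "location": location,
                "kind": fields[0],
                "size": int(fields[2]),
                "name": fields[-1],
            })
    return records

# A few lines copied from the sample output above.
sample = """\
Host aix1.segi.ulg.ac.be
    Location: /pub/docs/tcpip/ftpsites
           FILE -rw-r--r--      12899  Sep 19 1991  archie
Host wuarchive.wustl.edu
    Location: /usenet/comp.sources.misc/volume27
      DIRECTORY drwxrwxr-x        512  Apr 16 1992  archie
    Location: /usenet/comp.sources.misc/volume33
      DIRECTORY drwxrwxr-x        512  Nov  6 02:08  archie
"""

for rec in parse_archie_listing(sample):
    print(rec["host"], rec["location"], rec["kind"], rec["name"])
```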

For these five matches (ARCHIE will really find many more than five), ARCHIE gives the server, the directory in which the file resides, and the filename (or directory name) which matched the search criterion. Once you find something that looks useful you can look at the file by ftping to the host. If you do not know the name of the file, then use "dir" once you arrive in order to see if there is anything of interest. Now the problem is: which site do we choose?

There are a few guidelines:

Unfortunately some people choose filenames that don't really explain what is in the file. There is an additional service on ARCHIE called whatis which can be used to locate software or data files even if the filename bears no resemblance to its contents. It is a set of alternative indexing keywords for files on the network. When administrators put files in their ftp archives, they may contribute to an index entry for the file to help people find it. Since the whatis service requires this human intervention, it may not be as complete as we may ideally like it to be, but it is definitely better than nothing!

When you do a whatis search, ARCHIE uses your search string to examine the keyword list. If anything is matched, ARCHIE will print the name of the file and a brief description. If you see a file that you think is appropriate, you must then do a filename search (using the ARCHIE command as described above) to see where it is located. The third and final step, of course, is to anonymous ftp to the given host to obtain the file.

Unfortunately, you cannot use the ARCHIE command to perform a whatis search. Instead, you must reach an ARCHIE server with telnet. There are several ARCHIE servers on the Internet, and each one contains exactly the same information. The best way to choose a server is to pick the one closest to you. This is easier on the network and, if everyone follows this guideline, should generally spread the work around. Here are the available ARCHIE servers:

   archie.rutgers.edu        Northeastern U.S.
   archie.sura.net           Southeastern U.S. (Purdue's link)
   archie.unl.edu            Western U.S.
   archie.ans.net            Sites connected to the ANS network
                             (one of the Internet Service Providers)
   archie.mcgill.ca          Canada
   archie.au                 Australia & the Pacific Basin
   archie.funet.fi           Europe
   archie.doc.ic.ac.uk       United Kingdom

Let's do a whatis search for a gene sequence map for E. coli bacteria. If we were to use the ARCHIE command, "archie -s coli", ARCHIE would return almost 200 matches--most of which would be broccoli recipes!! Let's see what happens with whatis after telnetting to one of the above machines and logging in as "archie."

archie> whatis coli
ECD     Escherichia coli db (M.Kroeger, Giessen)
NGDD    Normalized gene maps for E.coli, S.typh., etc.
        (Y. Abel, Montreal)

The file NGDD looks like what we need. Now we need to find out where it lives.

archie> prog NGDD
# matches / % database searched:     1 /100%

Host ncbi.nlm.nih.gov     (130.14.20.1)
Last updated 02:23   4 Mar 1992
  Location:  /repository
     DIRECTORY  rwxrwxr-x     512   Jun 25   1990   NGDD

Notice that NGDD is a directory, not a file. So when you anonymous ftp to ncbi.nlm.nih.gov you should type the "dir" command to see what files are there. The index which is accessed with the ARCHIE command is updated monthly, but the whatis index is not updated on a regular basis. Therefore there is a chance that if you find something with whatis, you may not be able to locate it with the ARCHIE command. This would happen if someone deleted a file from his or her ftp archives, but did not delete the filename from the whatis descriptions database.
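
The two-step lookup, and the way a stale whatis entry shows up, can be sketched with two small tables. This is only an illustration in Python: the dictionaries stand in for the real indexes on the ARCHIE servers, the OLDMAP entry is invented to play the part of a deleted file, and matching against both the name and the description is an assumption.

```python
# Stand-in for the whatis keyword index: name -> description.
WHATIS = {
    "NGDD": "Normalized gene maps for E.coli, S.typh., etc.",
    "OLDMAP": "Gene map removed from its archive (invented stale entry)",
}

# Stand-in for the filename index: name -> list of (host, directory).
FILES = {
    "NGDD": [("ncbi.nlm.nih.gov", "/repository")],
}

def whatis(keyword):
    """Return names whose name or description contains the keyword."""
    key = keyword.lower()
    return sorted(name for name, desc in WHATIS.items()
                  if key in name.lower() or key in desc.lower())

def prog(name):
    """Return the known locations of a name, or an empty list."""
    return FILES.get(name, [])

def locate(keyword):
    """Step 1: whatis search. Step 2: filename search for each hit."""
    return {name: prog(name) for name in whatis(keyword)}

for name, places in locate("gene").items():
    print(name, places if places else "listed by whatis, but no longer found")
```

The third step, fetching the file by anonymous ftp, is left out because it needs a live network connection.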

GOPHER

GOPHER is a lookup tool that lets you prowl through the Internet by selecting resources from menus. GOPHER started out as a distributed campus information service at the University of Minnesota, the home of the Golden Gophers. Since the main function of this system is to "go fer" things, the name is quite appropriate. Since its beginnings at Minnesota, the GOPHER system has grown--from one site to over 100, all within 18 months.

To access a public server at Illinois, type "telnet gopher.uiuc.edu" or connect to Minnesota by typing "telnet consultant.micro.umn.edu". In either case, log in as "gopher." The Minnesota server was giving me problems when I tried to run a few things for this article, so all examples shown in this section are from the Illinois server. It doesn't really matter which server you contact, though. Your choice of server will only determine the first menu you see.

Once you telnet to a public server and are logged in as "gopher", you will see a menu. From this main menu you can choose menu items to move to other directories or access files. Then you can choose from other directories or access other files. (And so on and so on and so on.)

Here is the main menu at Illinois:

     Root gopher server:  gopher.uiuc.edu
      1.  Welcome to the U of Illinois Gopher.
      2.  Campus Announcements (12/1/92)/
      3.  What's Now?  (12/15/92)
      4.  Information about Gopher/
      5.  Keyword Search of Gopher Menus <?>
      6.  U of Illinois Campus Information/
      7.  Champaign-Urbana & Regional Information/
      8.  Computer Documentation/
      9.  Libraries/
     10.  Newspapers, Newsletters, and Weather/
     11.  Other Gopher and Information Servers/
     12.  Phone Books (PH)/
     13.  Internet File Server (ftp) Sites/

GOPHER keeps track of two types of entities: directories and resources. A slash at the end of a line denotes a directory. If you select a directory, you will see another menu. A period denotes a file. Simply select a file to read it. The symbol <?> refers to an indexed directory resource. This type of entry will be explained later in the article.
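
The end-of-line markers can be read mechanically. Here is a minimal Python sketch, assuming the index marker is written as <?> and that a menu line carries exactly one marker at its end; the function name classify is made up.

```python
def classify(menu_line):
    """Guess the type of a menu entry from the marker that ends the line."""
    title = menu_line.rstrip()
    if title.endswith("<?>"):
        return "index"        # indexed directory resource
    if title.endswith("/"):
        return "directory"    # selecting it shows another menu
    if title.endswith("."):
        return "file"         # selecting it displays the text
    return "unknown"          # no marker this sketch recognizes

for line in ["  1.  Welcome to the U of Illinois Gopher.",
             "  5.  Keyword Search of Gopher Menus <?>",
             "  9.  Libraries/"]:
    print(classify(line), "-", line.strip())
```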

Some handy commands to get you around the menus are:

  
 <    move backward
 >    move forward
 q    exit gopher
 ?    help
 u    move up to previous menu

To select an item, type the number of the line, or use the arrow keys to select the item that you wish to examine, and press return.

When you get to the end of an article, GOPHER asks what you want to do with it. Press "m" if you would like to e-mail a copy of the article to yourself. If you select an indexed directory resource (marked by <?> at the end of the line), GOPHER will prompt you for search words:

Index word(s) to search for:

GOPHER will then create a new menu which is a subset of this directory's contents that only contains items matching your search criterion. In some ways this is like ARCHIE--searching for items which match your search string. One distinct advantage is that you do not have to exit GOPHER and ftp to a certain server to view the file. All you have to do is select the file in the menu and (abracadabra!) there it is on your screen. GOPHER searches are always case insensitive.
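
The idea of a search producing a smaller menu can be shown in a few lines of Python. This is a sketch under assumptions: the titles are invented, and requiring every search word to appear in an item is a guess at how a server might combine several words, not documented GOPHER behavior.

```python
def index_search(items, words):
    """Return the items that contain every search word, ignoring case."""
    wanted = [w.lower() for w in words.split()]
    return [item for item in items
            if all(w in item.lower() for w in wanted)]

# Hypothetical contents of an indexed directory.
items = [
    "Campus Weather Forecast",
    "Weather Satellite Images",
    "Library Hours",
]

# Print the new, smaller menu, numbered the way GOPHER numbers its menus.
for number, title in enumerate(index_search(items, "WEATHER"), start=1):
    print("%3d.  %s" % (number, title))
```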

Keep in mind, though, that GOPHER does enforce licensing restrictions. That is, there may be things that users at the University of Minnesota may have access to that we at Purdue may not. GOPHER knows where you are and will not distribute material to you if certain restrictions exist. For example, the University of Minnesota has the UPI news feed on-line, but cannot distribute it off campus.

WAIS

Wide Area Information Server (WAIS, pronounced "ways") is a tool that allows users to search indexed databases. The search is performed by providing keywords and having WAIS return documents that contain these keywords. What makes WAIS so interesting is the way its databases are created. A database can contain any information and requires no special format to be useful. In fact, the databases are created by running an indexing program, so the whole process is automatic. Once the database is complete, one is able to use the WAIS search-engine to find whatever topics may be held in the information that was indexed.

Let's take a look at how the search-engine works. You can either have the WAIS client program compiled and installed on your local machine, or it may be accessed via telnet. We will use the latter method since WAIS may not be present on some computers. The machine you want to telnet to is quake.think.com or nnsc.nsf.net. When you reach the login prompt type "wais" and press return.

% telnet quake.think.com
Trying 192.31.181.1 ...
Connected to quake.think.com.
Escape character is '^]'.

SunOS UNIX (quake)

login: wais
Last login: Mon Dec 21 09:54:37 from wugate.wustl.edu
SunOS Release 4.1.1 (QUAKE) #3:
     Tue Jul 7 11:09:01 PDT 1992

Welcome to swais.
TERM = (vt100)
Starting swais (this may take a little while)...

After the login sequence you will be greeted with a screen that has a list of servers and the sources each server contains. It will look like:


[Screen dump not preserved in this archived copy: SWAIS list of servers and sources]

The top right corner of the screen shows that 349 source libraries are present. The highlighted bar acts as a cursor which moves up and down the screen. The bar will move down when you press "j" and up when you press "k". You can also move the cursor by typing in a number that corresponds to the left column of the screen and pressing return. The cursor will immediately jump to that source. For example, to see the next screen of sources we could enter "19" and press return. Typing a "?" will produce a handy help screen showing a summary of all the commands that are recognized.

To select a source to search, place the cursor on the source you are interested in and press the space bar. An asterisk will appear to the right of the source number. The asterisk means that any search performed will be done on this source. You can select more than one source to search from.

As an example I will look up a movie review for the soon-to-be classic Who Framed Roger Rabbit? Place the cursor on source number 215 and press the space bar to select this source. This source is titled "movie-reviews" and certainly sounds like what we want. The asterisk appears directly to the right, showing that the "movie-reviews" source will be used for our search. Now that the source has been selected, press the space bar to enter some keywords. After entering some proper keywords, our screen now looks like the next screendump.


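[Screen dump not preserved in this archived copy: SWAIS screen with the "movie-reviews" source selected and keywords entered]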

We press return to begin the search. WAIS simply counts the number of occurrences of your keywords in a document and ranks each document with a weighted score from 1 to 1000. A score of 1000 means that WAIS believes this will be the most pertinent document you are looking for. It is often necessary to add or provide new keywords for WAIS to refine a search. In fact, WAIS was built on this principle, so don't be disheartened if you don't get exactly what you want on the first try.
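
The counting and scaling described above can be sketched as follows. This Python example is a deliberate simplification: it counts whole-word occurrences and scales the best document to 1000, whereas the weighting a real WAIS server applies is not spelled out here. The function name rank and the sample documents are invented.

```python
def rank(documents, keywords):
    """Score documents by keyword occurrences, scaled so the best is 1000."""
    words = [k.lower() for k in keywords.split()]
    counts = {}
    for title, text in documents.items():
        tokens = text.lower().split()
        hits = sum(tokens.count(w) for w in words)
        if hits:
            counts[title] = hits
    if not counts:
        return []
    best = max(counts.values())
    scored = [(max(1, round(1000 * n / best)), title)
              for title, n in counts.items()]
    # Highest score first; ties are broken alphabetically by title.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

# Invented miniature "source" with three documents.
documents = {
    "review-a": "roger rabbit is a rabbit movie with roger",
    "review-b": "a rabbit appears once",
    "review-c": "nothing relevant here",
}

for score, title in rank(documents, "roger rabbit"):
    print(score, title)
```

Documents with no occurrences at all are left out of the result, which matches the idea that only documents containing the keywords are returned.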

In our example we find the following documents returned to us after our search is performed.


[Screen dump not preserved in this archived copy: SWAIS search results for the movie review]

I believe the screen is self-explanatory. We seem to have lucked out because the first two lines are both reviews of the movie we were looking for. Selecting the text to read is the same as the procedure outlined above for selecting a source to search: place the cursor on the document you wish to read and press return.

In this article we have been using a character-oriented version of WAIS known as SWAIS. There are other clients available which some people feel are better at locating documents. One example is XWAIS, a WAIS client that runs under the X Window System. Not only does it look better, but the XWAIS client also has more functionality. For instance, earlier I stated that it was normal to refine searches by adding or using new keywords. XWAIS handles this nicely by offering something known as relevance feedback. This allows the user to take an article that was previously found and find other articles that are similar to it. In effect, this makes any portion of a document, or the entire document, a "keyword", resulting in better searches.
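
Relevance feedback can be pictured as using the words of a document you liked as the query. The Python sketch below measures similarity by counting shared words, which is an assumption chosen for brevity and not a description of how XWAIS or the WAIS servers actually compute it; the function name similar and the sample texts are invented.

```python
from collections import Counter

def similar(documents, liked_title, top=3):
    """Return titles of documents sharing the most words with the liked one."""
    liked = Counter(documents[liked_title].lower().split())
    scores = []
    for title, text in documents.items():
        if title == liked_title:
            continue
        other = Counter(text.lower().split())
        # Counter & Counter keeps the smaller count of each shared word.
        overlap = sum((liked & other).values())
        if overlap:
            scores.append((overlap, title))
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scores[:top]]

# Invented miniature collection.
documents = {
    "a": "cartoon rabbit detective comedy",
    "b": "detective comedy in hollywood",
    "c": "documentary about volcanoes",
    "d": "rabbit cartoon",
}

print(similar(documents, "a"))
```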

There is still much work going on with WAIS. Search-engines are being made more powerful, new client software is being written for many types of computers, and older clients are slowly getting all of the bugs worked out of them. In addition to this, people all over the world are creating, modifying, and refining their indexed sources for others to use, resulting in more abundant and accurate information. You can bet that you will hear more about WAIS in the future.

WWW

The World-Wide Web, commonly referred to as WWW, is another information service offered on the Internet. As you will see, WWW lives up to its name by offering an interface based on the technology of hypertext.

What is hypertext? Hypertext is a way of presenting information where certain words in a document can be used as references to other documents, pictures, or any other type of information imaginable. These references, known as links, allow the user to investigate a topic in as much detail as they like.

To use WWW, the set of programs may either be installed on your machine, or you may use a public-access client via telnet. We will be doing the latter. You may telnet to either eiecs2.njit.edu or info.cern.ch and log in as "www." It would be preferable to try info.cern.ch first, but since that computer is down at the time of this writing, we'll use the other one.


[Screen dump not preserved in this archived copy: WWW home page at NJIT]

Type "telnet eiecs2.njit.edu" to access a public WWW client. At the login prompt enter "www" and press return. You will be prompted for the terminal type you are using, and once entered, you will see the preceding screen.

This is the home page at NJIT. The home page is the document that you see when you first enter the WWW. The bracketed numbers are links to other documents. To follow a link, type the number and hit return. The list of words below the line is a set of commands. To execute one of the commands, type the capitalized letter that is in the word and hit return. For example, to exit WWW enter a capital X (in the command eXit) and hit return. Note that the interface will be different if you telnet to info.cern.ch instead. However, the concept of links will still be preserved.

When viewing documents that are larger than the screen size, one can use the N command (for Next) to see the next portion of the document. Selecting Back will scroll the current document up. Selecting Up will take you back a level. For example, from the home page you may select 1 to view the help screen. From the help screen, if you select Up you will come back to the home page.
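
Numbered links and the Up command can be modeled with a small amount of state. The following Python sketch is not the WWW client's code: the class LineBrowser, the page names, and the page texts are all invented, and Up is modeled simply as returning to the page you came from.

```python
class LineBrowser:
    """Toy model of a line-mode browser with numbered links."""

    def __init__(self, pages, home):
        self.pages = pages      # page name -> {"text": ..., "links": [...]}
        self.current = home
        self.history = []       # pages we can go back Up to

    def follow(self, number):
        """Follow the link shown as [number] on the current page."""
        links = self.pages[self.current]["links"]
        if not 1 <= number <= len(links):
            raise ValueError("no such link: [%d]" % number)
        self.history.append(self.current)
        self.current = links[number - 1]
        return self.current

    def up(self):
        """Return to the previous page; stay put when already at the top."""
        if self.history:
            self.current = self.history.pop()
        return self.current

# Invented pages; link [n] on a page leads to the n-th name in "links".
PAGES = {
    "home": {"text": "Welcome. Help[1] Subjects[2]", "links": ["help", "subjects"]},
    "help": {"text": "How to use this program.", "links": []},
    "subjects": {"text": "Subject list. See also Help[1]", "links": ["help"]},
}

browser = LineBrowser(PAGES, "home")
print(browser.follow(1))   # from the home page, select 1 for the help screen
print(browser.up())        # Up comes back to the home page
```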

As you can see, there is on-line help available and many of the commands are self-explanatory. With repeated use you will see how powerful an interface this is.

Hypertext is a concept that is gaining popularity, and hypermedia is being applied to areas besides information retrieval. Most notable is the application to television viewing. There are experimental services available in some California cities which allow TV viewers to control and interact with what is shown on their screen. For example, while watching a baseball game, if you are interested in a player's history and statistics, pressing a button on a special console will give you the information. Customers are able to play along with the contestants on Jeopardy or solve the murder mystery airing at 8PM. The implementation of this technology is exciting and sure to make news in the near future.


webmaster@ecn.purdue.edu
Last modified: Friday, 12-Sep-97 19:21:37 EST
