Chapter 1: World Wide Web

"This `telephone' has too many shortcomings to be seriously considered as a means of communication. The device is inherently of no value to us." -Western Union internal memo, 1876

1.1 Introduction to the World Wide Web

So what's all this fuss about the World Wide Web? What's the big deal? Why should I bother to spend my time looking at all sorts of irrelevant drivel? These questions are a typical response by the non-techie to all the hype about the World Wide Web (also known simply as the Web). In fact, most of the content is drivel, and it takes far too long to get useful information, but the Web is a big deal and it is worth understanding its implications.

Perhaps the most thoughtful and profound demonstration of the impact of the Web is the recently completed "24 Hours in Cyberspace"(1) project. In that event, 100 photojournalists around the world photographed events and transmitted the images to mission control in San Francisco, where editors typed up stories about the events, recorded telephone interviews with the photographers, composed the entire product into Web pages, and built the information into a robust, compelling instant publication, all in 24 hours. Viewed by literally millions of people, the site had approximately four million "hits" on that first day.

It was, and still is, a compelling story: how information technology is used in the daily lives of people around the world. The project used the technology to tell the story of how technology can enhance people's lives. It was an elegant affair.

Is this journalism, radio, broadcasting, or what? Clearly the integration of all these technologies has created something much greater than the sum of its parts. The ability to assemble and edit information, including images and sounds, and make it available for instant reading/broadcast was phenomenal. The fundamental enabling technological glue and the cause of the Internet explosion is the World Wide Web(2). So what is it?

Let's start in the middle of Web time with Mosaic. Mosaic, a Web browser, was the first "killer" Internet application. Mosaic was introduced to the net in the same way as many other university research projects: it is available for free and, with source code, for non-profit use. Like many other applications, it is a product of the Internet community, specifically the National Center for Supercomputing Applications (NCSA). It was the product of yet another unheralded government (National Science Foundation) grant.

To understand the explosive growth of the Web, take a look at the number of Web sites discovered by Matthew Gray of net.Genesis over the past few years.

Growth of the Web
------------------------------------------------------------------------
Month   No. of Web sites   % of commercial sites   Hosts per Web server
------------------------------------------------------------------------
6/93    130                1.5                     13,000
12/93   623                4.6                     3,475
6/94    2,738              13.5                    1,095
12/94   10,022             18.3                    451
6/95    23,500             31.3                    270
1/96    90,000             50.2                    100 (estimate)
------------------------------------------------------------------------

The Web's exponential growth continues. As the Web becomes more widely used, it will start to impact traditional broadcast media like radio and television.
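To put a number on "exponential," here is a small sketch that works out the doubling time implied by the site counts in the table. The figures are copied from the table; the function names are illustrative, and the calculation assumes steady exponential growth between any two rows, which is a simplification.

```python
import math

# (year, month, number of Web sites), copied from the "Growth of the Web" table.
SITE_COUNTS = [
    (1993, 6, 130),
    (1993, 12, 623),
    (1994, 6, 2738),
    (1994, 12, 10022),
    (1995, 6, 23500),
    (1996, 1, 90000),
]


def months_between(start, end):
    """Whole months elapsed between two (year, month, count) rows."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


def doubling_time_months(start, end):
    """Doubling time, assuming constant exponential growth between the rows."""
    growth_factor = end[2] / start[2]
    return months_between(start, end) * math.log(2) / math.log(growth_factor)


# Doubling time for each interval in the table.
for start, end in zip(SITE_COUNTS, SITE_COUNTS[1:]):
    label = f"{start[1]}/{start[0]} to {end[1]}/{end[0]}"
    print(f"{label}: doubles every {doubling_time_months(start, end):.1f} months")

# Doubling time over the whole period, first row to last row.
overall = doubling_time_months(SITE_COUNTS[0], SITE_COUNTS[-1])
print(f"Overall: doubles every {overall:.1f} months")
```

Over the whole period the number of sites doubles a little more often than once every three and a half months, although the interval-by-interval figures show the pace is not perfectly steady.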

The developers of Mosaic did not try to invent everything. They built on a number of existing standards and systems. Prime among these was the Web developed at CERN, the European Laboratory for Particle Physics. In fact, most of the technological "breakthroughs" were the result of the WWW. The fuss and hoopla that surrounded Mosaic were due to the unified and reasonably pleasant interface it presents to the user.(3)


Figure: The Arena Web browser from the World Wide Web Organization (W3O)

Mosaic and its commercial clones, such as Netscape from Netscape Communications, offer end users a view of a compound document with many types of data: images, sounds, video, and so on. (See Section 3.4.1, Compound Document, in Chapter 3, Points of View.) Many items in the document contain links to other documents. These hypertext links allow the user to browse an entire collection of related documents easily. The documents are distributed and accessed throughout the Internet via the protocols supported by the Web. The net effect (pun intended) is to be able to read compound documents containing images and sounds with the real information sources distributed over the Internet. Web browsers have become the front end to the Internet.

Several key features make the Web extremely powerful.


Tim Berners-Lee(4) is the acknowledged "father" of the Web. Originally from CERN, he is now at the World Wide Web Organization (W3O). From his overview of the Web comes the following summary:

World Wide Web - Summary

The WWW (World Wide Web) project merges the techniques of networked information and hypertext to make an easy but powerful global information system.

The project represents any information accessible over the network as part of a seamless hypertext information space.

W3 was originally developed to allow information sharing within internationally dispersed teams, and the dissemination of information by support groups. Originally aimed at the High Energy Physics community, it has spread to other areas and attracted much interest in user support, resource discovery and collaborative work areas. It is currently the most advanced information system deployed on the Internet, and embraces within its data model most information in previous networked information systems.

In fact, the web is an architecture which will also embrace any future advances in technology, including new networks, protocols, object types and data formats.

Clients and servers for many platforms exist and are under continual development. Much more information about all aspects of the web is available on-line so skip to "Getting started" if you have an internet connection.

Reader view

The WWW world consists of documents, and links. Indexes are special documents which, rather than being read, may be searched. The result of such a search is another ("virtual") document containing links to the documents found. A simple protocol ("HTTP") is used to allow a browser program to request a keyword search by a remote information server.

The web contains documents in many formats. Those documents which are hypertext, (real or virtual) contain links to other documents, or places within documents. All documents, whether real, virtual or indexes, look similar to the reader and are contained within the same addressing scheme.

To follow a link, a reader clicks with a mouse (or types in a number if he or she has no mouse). To search and index, a reader gives keywords (or other search criteria). These are the only operations necessary to access the entire world of data.
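(An illustration, not part of the original summary.) The reader view above boils down to documents, links, and two operations: follow a link and search an index. The sketch below models that idea in a few lines. The class names, addresses, and sample texts are all hypothetical, and a real index would live on a remote server rather than in one in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    address: str                                     # where the document lives
    text: str                                        # what the reader sees
    links: List[str] = field(default_factory=list)   # addresses of other documents


class Web:
    """A toy, in-memory stand-in for the web of documents."""

    def __init__(self) -> None:
        self.documents: Dict[str, Document] = {}

    def add(self, doc: Document) -> None:
        self.documents[doc.address] = doc

    def follow(self, address: str) -> Document:
        """Operation 1: follow a link to the document it names."""
        return self.documents[address]

    def search(self, keyword: str) -> Document:
        """Operation 2: search; the result is a 'virtual' document of links."""
        hits = [
            doc.address
            for doc in self.documents.values()
            if keyword.lower() in doc.text.lower()
        ]
        return Document(
            address=f"search?{keyword}",
            text=f"Results for {keyword}",
            links=sorted(hits),
        )


web = Web()
web.add(Document("cern/home", "High energy physics at CERN", ["cern/papers"]))
web.add(Document("cern/papers", "Physics papers and technical notes"))
web.add(Document("ncsa/mosaic", "Mosaic browser documentation"))

results = web.search("physics")           # a virtual document
print(results.links)                      # links to the documents found
first = web.follow(results.links[0])      # follow one of those links
print(first.address)
```

Note that the search result is itself a Document, so the reader handles it exactly like any other page: by following its links.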

Information provider view

The WWW browsers can access many existing data systems via existing protocols (FTP, NNTP) or via HTTP and a gateway. In this way, the critical mass of data is quickly exceeded, and the increasing use of the system by readers and information suppliers encourage each other.

Providing information is as simple as running the W3 server and pointing it at an existing directory structure. The server automatically generates a hypertext view of your files to guide the user around.
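(An illustration, not part of the original summary.) To make the idea of an automatically generated hypertext view concrete, here is a minimal sketch that turns a directory listing into a page of links. It is not the code of any actual W3 server; the function name and the exact markup are assumptions chosen for brevity.

```python
import html
import os
from urllib.parse import quote


def hypertext_view(directory):
    """Return a simple HTML page linking to every entry in `directory`."""
    items = []
    for name in sorted(os.listdir(directory)):
        # Mark subdirectories with a trailing slash so readers can tell
        # them apart from plain files.
        is_dir = os.path.isdir(os.path.join(directory, name))
        label = name + ("/" if is_dir else "")
        # quote() makes the name safe inside a URL; html.escape() makes it
        # safe to display as text.
        items.append(
            f'<li><a href="{quote(label)}">{html.escape(label)}</a></li>'
        )
    title = html.escape(f"Index of {directory}")
    return "\n".join(
        [f"<title>{title}</title>", f"<h1>{title}</h1>", "<ul>", *items, "</ul>"]
    )
```

Calling hypertext_view(".") returns a page for the current directory. Python's standard http.server module produces a similar listing when it serves a directory that has no index file.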

To personalize it, you can write a few SGML hypertext files to give an even more friendly view. Also, any file available by anonymous FTP, or any internet newsgroup can be immediately linked into the web. The very small start-up effort is designed to allow small contributions. At the other end of the scale, large information providers may provide an HTTP server with full text or keyword indexing. This may allow access to a large existing database without changing the way that database is managed. Such gateways have already been made into Oracle(tm), WAIS, and Digital's VMS/Help systems, to name but a few.

The WWW model gets over the frustrating incompatibilities of data format between suppliers and reader by allowing negotiation of format between a smart browser and a smart server. This should provide a basis for extension into multimedia, and allow those who share application standards to make full use of them across the web.
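(An illustration, not part of the original summary.) In HTTP, a browser can state the formats it is willing to receive in an Accept header, optionally weighting each with a quality value such as "q=0.5". The sketch below shows how a server might pick among the formats it has on hand. It is deliberately simplified: it ignores partial wildcards such as "image/*", and the function names are hypothetical.

```python
def parse_accept(header):
    """Parse an Accept-style header into {media_type: quality}."""
    preferences = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0  # a format listed without "q=" counts as fully acceptable
        for parameter in fields[1:]:
            if parameter.startswith("q="):
                quality = float(parameter[2:])
        preferences[media_type] = quality
    return preferences


def choose_format(accept_header, available):
    """Return the available format the browser prefers most, or None."""
    preferences = parse_accept(accept_header)
    best, best_quality = None, 0.0
    for media_type in available:
        # Fall back to the "*/*" wildcard if the type is not listed by name.
        quality = preferences.get(media_type, preferences.get("*/*", 0.0))
        if quality > best_quality:
            best, best_quality = media_type, quality
    return best


# The browser prefers PNG but will settle for GIF; the server has both.
chosen = choose_format("image/png;q=0.8, image/gif;q=0.5",
                       ["image/gif", "image/png"])
print(chosen)
```

If nothing the server holds is acceptable to the browser, choose_format returns None, and a real server would then have to report that it cannot satisfy the request.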

This summary does not describe the many exciting possibilities opened up by the WWW project, such as efficient document caching, the reduction of redundant out-of-date copies, and the use of knowledge daemons. There is more information in the on-line project documentation, including some background on hypertext and many technical notes.

Getting Started

If you have nothing else but an Internet connection, then telnet to info.cern.ch (no user or password). This very simple interface works with any terminal but in fact gives you access to anything on the web. It starts you at a special beginner's entry point. Use it to find up-to-date information on the WWW client program you need to run on your computer, with details of how to get it. This is the crudest interface to the web; do not judge the web by this. Just use it to find the best client for your machine.

You can also find pointers to all documentation, including manuals, tutorials and papers.

Tim BL








© Prentice-Hall, Inc.
A Simon & Schuster Company
Upper Saddle River, New Jersey 07458
