RSNA.org

Mini Tutorial

Internet for You

 
Katarzyna J. Macura, M.D., Ph.D.
The Russell H. Morgan Department of Radiology and Radiological Science •  Johns Hopkins Medical Institutions

Part 4 — Basics on the WWW

Although many people use the terms "World-Wide Web" (WWW) and "Internet" interchangeably, the WWW is just one of the many services available on the Internet. While the WWW is second in popularity only to e-mail, other services such as FTP, telnet, and newsgroups are also commonly used. The WWW was first developed at CERN as a collaboration tool for the high-energy physics community.

CERN is the European Organization for Nuclear Research, the world's largest particle physics center. Founded in 1954, the laboratory was one of Europe's first joint ventures. CERN was born out of a need to collaborate: no single European country could afford the facilities that were needed. There was an additional political aspect to encourage cooperation between the countries of Europe that had so recently been at war. More than half the world's high-energy physicists are now involved in CERN's experiments. CERN has become a shining example of international collaboration, a world-wide laboratory.

In late 1990, Tim Berners-Lee, a CERN computer scientist, invented the World-Wide Web for the high-energy physics collaborations, which demanded instantaneous information sharing between physicists working in different universities and institutes all over the world. There were many obstacles in the 1980s to the effective exchange of information. There was a great variety of computer and network systems, with hardly any common features. Users needed to understand many inconsistent and complicated systems. Different types of information had to be accessed in different ways, which cost users a great deal of effort and time. The result was frustration and inefficiency.

This was fertile soil for the invention of the World-Wide Web. Berners-Lee together with Robert Cailliau wrote the first WWW client, a browser-editor running under NeXTStep, and the first WWW server along with most of the communications software, defining URLs, HTTP and HTML. Using WWW, scientists could at last access information from any source in a consistent and simple way. The launching of this revolutionary idea was made possible by the widespread adoption of the Internet around that time. This provided a standard for communication between computers, on which WWW and the first "virtual community" could be built.

Initially, there were only two kinds of browsers. One was the original development version, very sophisticated but only available on NeXT machines. The other was the "line-mode" browser, which was easy to install and run on any platform but limited in power and user-friendliness. The small team at CERN could no longer do all the work needed to develop the system further and CERN programmers launched a plea via the Internet for other developers to join in.

The next-generation browser, Mosaic, was released in 1993 by the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign. This friendly window-based graphical Web browser ran in the X Window System environment, popular in the research community. Shortly afterwards, NCSA also released versions for the PC and Macintosh environments. The existence of reliable, user-friendly browsers on popular computers had an immediate impact on the spread of the WWW. 1994 was the "Year of the Web." The First International World-Wide Web Conference was held at CERN in May. This event was acclaimed as the "Woodstock of the Web."

In 1994, Marc Andreessen, one of the creators of Mosaic, left NCSA to start his own company, called the Netscape Communications Corporation, selling the powerful new browser Netscape Navigator. With Navigator 2.0, Netscape introduced a scripting language called JavaScript (originally called LiveScript). JavaScript allowed the manipulation of HTML to change elements of the page or the browser window based on user interaction. In 1995, with the launch of Windows 95 and a Web browser of its own, Internet Explorer, Microsoft began an effort to challenge Netscape. For quite a while, Internet Explorer played catch-up as Netscape continually pushed the technological limits of browsing, but it had one major advantage: unlike Netscape, Internet Explorer was free of charge. Slowly, Internet Explorer gained popularity and market share to become the number one browser on the Net.

The Web's evolution has been shaped by the World-Wide Web Consortium (W3C). Berners-Lee is currently a director of the W3C at the Laboratory for Computer Science, Massachusetts Institute of Technology. W3C consists of representatives from the world's leading Internet companies. W3C develops interoperable technologies (specifications, guidelines, software and tools) to "lead the Web to its full potential as a forum for information, commerce, communication and collective understanding."

The fundamental idea of WWW was to merge the techniques of computer networking and hypertext into a powerful and easy-to-use global information system. Hypertext is text with links to further information, on the model of references in a scientific paper or cross-references in a dictionary. With electronic documents, these cross-references can be followed by a mouse-click, and with the WWW, they can be anywhere in the world.

The WWW consists of an international collection of electronic documents, called Web pages, that have built-in links to other related documents. Web pages commonly have text and images but may also contain audio and movie files. The links between Web pages, called hyperlinks, allow users to navigate quickly and in a nonlinear fashion from one Web page to another with just one click. Text links appear as underlined words or phrases, and graphic links are in the form of images or icons. The user can identify a link by placing the arrow-shaped cursor on it; the cursor then changes to a hand-shaped link pointer. By clicking a link, the user requests the Web page indicated by the link. To remind the user which pages were already visited, browsers change the color of a visited link. A collection of related Web pages forms a Web site.

All Web sites have a starting point, called a home page, that is similar to a book or magazine cover. The home page is the first Web page normally visited and provides information about the site's purpose and content. There are seven basic types of Web pages: advocacy (content that describes a cause, opinion, or idea), business/marketing (content that promotes or sells products or services), informational (factual information), news (stories and articles related to current events, life, money, sports, weather, etc.), educational (a variety of educational materials and online tutoring), portal (a variety of Internet services offered from a single location), and personal. Many pages fall into more than one of these categories.

Navigating by "wandering" from one page to another is called browsing or surfing the Web. Web pages are accessed using software called a Web browser. Users run the Web browser software, such as Netscape Navigator or Microsoft's Internet Explorer, on their own computers. The Web pages that make up a Web site are stored on another computer, called a server. A Web server is a computer, together with its software, that delivers requested Web pages to the client's browser. Many Internet Service Providers (ISPs) include in their services a Web address and several MB of storage so that customers can maintain their own Web sites. Storage space on a Web server can also be purchased from Web hosting companies for a monthly fee.
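
The exchange between a browser (the client) and a Web server can be sketched with Python's standard library. This is only a minimal illustration under stated assumptions, not how a production Web server is built: the page content, the handler name TutorialHandler, and the helper fetch_home_page are invented for the example, and the server listens only on the local machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content; a real server would usually read files from disk.
HOME_PAGE = b"<html><body><h1>Home page</h1></body></html>"


class TutorialHandler(BaseHTTPRequestHandler):
    """Answers a GET request for "/" with one small Web page."""

    def do_GET(self):
        if self.path == "/":
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(HOME_PAGE)))
            self.end_headers()
            self.wfile.write(HOME_PAGE)
        else:
            self.send_error(404)  # any other page is "not found"

    def log_message(self, format, *args):
        pass  # suppress request logging to keep the example quiet


def fetch_home_page():
    """Start a local server, request "/" as a browser would, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), TutorialHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")          # the client asks for the home page
        response = conn.getresponse()     # the server delivers it
        body = response.read()
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_home_page()
    print(status, body.decode("ascii"))
```

Here the client and the server run in the same program purely for convenience; on the Web they normally run on different computers.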

Any personal computer linked to the Internet can become a Web server. Web-enabled handheld computers and devices such as cellular phones use a special type of browser designed for their small screens. A microbrowser, or minibrowser, is a software program that accesses and displays Web pages on the Web-enabled hardware. Downloading is the process of receiving information, such as a Web page, from a server on the Internet.

Every Web page has a unique address. This address is called the Uniform Resource Locator (URL), for example http://www.rsna.org. The URL begins with http://, which stands for the hypertext transfer protocol, the communication protocol used to transfer Web pages over the Internet. The second part of the URL refers to the name of the server where the Web site resides. This part of the address is also called a domain name. Each computer with a connection to the Internet is assigned an IP address. The domain name is simply a mnemonic that makes it easier for computer users to remember the Web site. The user's Web browser converts the domain name to the appropriate IP address by looking it up in the Domain Name System (DNS). Thus, the user may enter either http://192.203.125.59/ or http://www.rsna.org as the URL and still visit the same Web page. The third portion of the URL is the directory/folder on the host computer that contains a specific part of the Web site. Subdirectories might also be indicated in this part of the address. The last segment of the URL is the document name (file name and file extension), leading to a specific Web page. Each file on the WWW has a unique URL. URLs make it possible to navigate using links because a link is associated with a URL.
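
The parts of a URL described above can be pulled apart with Python's standard urllib.parse module. The sketch below is illustrative only: the function name describe_url is invented here, and the directory and file name in the sample URL are hypothetical, not a real page on the RSNA site.

```python
import posixpath
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the four parts named in the text."""
    parts = urlsplit(url)
    # The path holds the directory (and subdirectories) plus the document name.
    directory, document = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,    # e.g. "http"
        "server": parts.hostname,    # domain name or IP address
        "directory": directory,
        "document": document,
    }


if __name__ == "__main__":
    # Hypothetical path and file name, used only for illustration.
    for name, value in describe_url(
        "http://www.rsna.org/publications/news/index.html"
    ).items():
        print(f"{name}: {value}")
```

Converting the server name to an IP address is a separate step that requires a network lookup, so it is left out of this sketch.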

Web pages are created using the hypertext markup language (HTML). HTML is a set of special codes, also called tags or markups, that defines the placement and format of text, graphics, video and sound on a Web page. These codes specify how the Web site's elements will be displayed in a browser, and where the links will lead. Thus, HTML can be thought of as a set of instructions for Web browsers, although it is a markup language rather than a true programming language. The HTML file that the browser downloads to display the Web page is actually a simple text file that can be read by any text editor such as Microsoft Notepad or Microsoft Word. The HTML file doesn't actually contain the graphics or other multimedia files. Instead, it contains HTML references to those files. The browser uses those references to find the files on the server, download them, and then display them as a part of the Web page.
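
To make the idea of tags and references concrete, the following sketch reads a small HTML file as plain text and lists the references it contains, much as a browser must do before it can fetch linked pages and images. The sample page, the file name images/logo.gif, and the names ReferenceCollector and collect_references are assumptions made for this illustration; only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser

# A made-up page: plain text with tags, a link, and a reference to an image.
SAMPLE_PAGE = """<html>
<body>
<h1>Sample page</h1>
<p>Visit the <a href="http://www.rsna.org">RSNA home page</a>.</p>
<img src="images/logo.gif" alt="Logo">
</body>
</html>"""


class ReferenceCollector(HTMLParser):
    """Collects hyperlink targets and image references from HTML text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])    # where the link leads
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])    # file the browser must fetch


def collect_references(html_text):
    """Return (links, images) found in the given HTML text."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links, collector.images


if __name__ == "__main__":
    links, images = collect_references(SAMPLE_PAGE)
    print("links:", links)
    print("images:", images)
```

Note that the image itself is not part of the HTML text; only its location is, which is why the browser has to download it separately.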

The finished HTML document is readable by any browser on any computer. The content of the Web pages does not change, but the final look may vary from browser to browser (Netscape/Internet Explorer) and platform to platform (Windows/MacOS). The more complex and specialized the HTML tagging and the more multimedia features the Web page contains, the longer it takes to download and display the document. Beyond what standard HTML tags can display, plug-ins extend the variety of files and media that a browser can show on a Web page. Certain types of video and sound files might require plug-ins for viewing or listening. If the Web page requires a plug-in that the user's browser does not have, a message will be displayed indicating which plug-ins are needed and from which sites they can be downloaded, usually for free.

The process of developing and authoring Web pages is called Web publishing. A Webmaster is the individual responsible for creating and maintaining the Web site. Since there is no single organization that controls additions and changes to Web sites all over the world, there is no central menu or catalog of Web sites' content or addresses. There are, however, several companies maintaining organized directories of Web sites to help users find information on specific topics. A search engine is a software program that can be used to locate Web pages on certain topics or find specific pages for which the user does not know the exact URL. To find a page or pages, the user enters a word or phrase, called search text or keywords, in the search engine's text box; the search engine then displays a list of all Web pages that contain the entered keywords. Any Web page listed as the result of a search is called a hit. The companies providing access to search engines also provide directories of Web sites organized by topic or category. Once connected to the search engine, the user can also search by selecting topics from an indexed list. Most search engines can handle both simple and complex text searches.

With a simple search, a search engine often returns a huge list that might contain thousands of hits. To get a smaller, more focused list, the user should use advanced search techniques. Because each search engine provides a slightly different set of advanced search tools, the online Help should be consulted for details on composing advanced searches. Search engines actually do not search the entire Internet; such a search would take a long time. Instead, they search an index of Internet sites that is constantly updated by the company that provides the search engine. Like any other Web site, a search engine has a URL (e.g., www.google.com or www.altavista.com).
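
The idea of searching an index rather than the pages themselves can be sketched as follows. This is a toy illustration, not how any particular search engine works: the three pages and their example.org addresses are made up, and the names tokenize, build_index, and search are introduced only for this example. The index maps each word to the set of pages that contain it, and a query returns the pages that contain every keyword entered.

```python
import re
from collections import defaultdict

# Made-up pages standing in for the documents a search company has visited.
PAGES = {
    "http://example.org/mri": "MRI safety guidelines for radiology departments",
    "http://example.org/ct": "CT dose guidelines and radiation safety",
    "http://example.org/web": "History of the World-Wide Web at CERN",
}


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build the index once, ahead of time: word -> set of page addresses."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the hits: pages that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())   # each extra keyword narrows the list
    return sorted(hits)


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "safety"))
    print(search(index, "safety guidelines MRI"))
```

Adding keywords shortens the list of hits, which is the simplest form of the more focused searching described above.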

The World-Wide Web is the fastest-growing and in many ways the most exciting part of the Internet. As an Internet-based hypermedia initiative for global information sharing, it has become an integral part of our personal and professional lives. The user can see the whole web of information as one vast hypertext document. There is no need to know where information is stored, or any details of its format or organization. Behind this apparent simplicity, of course, there is a set of ingenious design concepts, protocols and conventions that make the WWW easy to use and a friendly interactive window to the world that opens up from our desks.


www.aawr.org
Editor's Note: The original Mini-Tutorial on the Internet by Katarzyna J. Macura, M.D., Ph.D., was published in the AAWR Newsletter Focus. Dr. Macura updated her series for RSNA News.

Copyright © 2008 Radiological Society of North America, Inc., 820 Jorie Blvd, Oak Brook, IL 60523-2251