URL?? Earl? Yurl? Oorl?

A Uniform Resource Locator (URL) is simply the Internet address of a resource (web site, folder, file, etc.) written in a uniform manner. The web page addresses you use all the time are URLs, but there is more to URLs than general www addresses. Generally speaking you have no need to understand what all the gobbledegook means, nor do you have to type URLs in: just copy and paste them, use links, or click on what Google finds for you. But if you really want to understand the Net you need to know something about them.

An URL can be general (directed to what is known as the "Home Page"):

 protocol://name.host name.organisation.country(other than US)/

... or specific (right down to file level):

 protocol://name.host name.type of organisation.country/folder/sub-folder/filename

If the URL points to a Web Server, then the protocol is http, as explained below. But, you may ask, what is a Web Server? Well, contrary to widespread belief, a Web Server is not some gigantic computer (although dedicated, powerful computers with large storage capacity are often called servers). A Web Server is a program that runs on a computer on the Internet. The program is called a Web Server because it waits for a remote Web browser to connect to it and then serves it the requested files. In this case, the language (protocol) they use to communicate is http. The Web browser (the most popular ones are Microsoft's Internet Explorer, Opera, and Firefox) sends a request over a telephone line or dedicated cable; the Web Server then locates the file (the requested web page) and sends it back to the browser. How does it locate the requested file? Simple really: the URL tells it where it is. Unless a specific file is requested, the server usually looks for a file called index.htm or index.html.
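The way a server picks a file can be sketched in a few lines of Python. This is only an illustration, not how any particular Web Server is written: the function name file_for_request and the little set of stored files are made up for the example, and a real server does a great deal more (security checks, error pages and so on).

```python
DEFAULT_FILES = ("index.html", "index.htm")  # the default names mentioned above

def file_for_request(url_path, available_files):
    """Return the stored file a requested path maps to, or None if there is none.

    available_files is a set of file paths relative to the site's main folder
    (a hypothetical stand-in for the server's hard disk).
    """
    relative = url_path.lstrip("/")
    if relative and not relative.endswith("/"):
        # A specific file was requested.
        return relative if relative in available_files else None
    # Only a folder (or the bare site address) was requested: try the defaults.
    for name in DEFAULT_FILES:
        candidate = relative + name
        if candidate in available_files:
            return candidate
    return None

# Hypothetical contents of a small site.
site = {"index.html", "mammals/index.htm", "mammals/felines/lion.html"}

print(file_for_request("/", site))                           # index.html
print(file_for_request("/mammals/", site))                   # mammals/index.htm
print(file_for_request("/mammals/felines/lion.html", site))  # mammals/felines/lion.html
print(file_for_request("/missing.html", site))               # None
```

The point is simply that a request with no file name falls back to index.html or index.htm, while a request that names a file gets that file or nothing.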

A Domain Name System (DNS) server maintains a database of domain names (host names) and their corresponding IP addresses. The host name in a URL is matched to an IP address using DNS software. When an Internet user types www.google.com, or any other address, into a browser, a query is sent to the ISP's name server, which returns an IP address for the site.
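If you are curious, you can ask for the same lookup yourself. The sketch below uses Python's standard socket module; the function name lookup_ip is my own invention for the example, and the answer for a real host name depends on your connection and will vary from place to place.

```python
import socket

def lookup_ip(host_name):
    """Ask the computer's resolver (and, through it, a name server) for an IPv4 address."""
    return socket.gethostbyname(host_name)

# An address that is already numeric comes back unchanged, with no query sent.
print(lookup_ip("127.0.0.1"))

# A real host name needs a working Internet connection, and the result varies:
# print(lookup_ip("www.google.com"))
```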

For more detailed explanations of IP addresses and related topics click here.

Back to URLs. The protocol indicates the language your browser is required to use to gain access to a particular file, and is generally followed by :// and the Internet address of the remote computer.

The name is the distant computer. Conventionally on the Web the far computer is called www (the abbreviation for World Wide Web). The next bit is the address where the computer is located. The Home Page is a specific file (conventionally called index.html) located on that computer's hard disk at that address. That far computer is known as an Internet server, and the Home Page of, say, Jones Inc. is a file placed on that server, probably hundreds of miles from Jones Inc. This page you are reading is a daughter-file called urlpage02.html of my index.html file in my folder on my service provider, freedom2surf, who give 200 MB of free Web space to their clients.

The Full Path

To sum up, here is an imaginary URL with a breakdown of its components. If a page is stored in a subdirectory (a folder within a folder), the subdirectory's name is also separated by a slash, and subdirectories can be several levels deep. The components of the following hypothetical URL are described below:

  http://www.animals.com/vertebrates/mammals/felines/lion.html

http:                    protocol
//                       separators
www.animals.com/         domain name and main directory
vertebrates/             subdirectory name, this folder is in the main folder
mammals/                 subdirectory name, this folder is in the vertebrates folder
felines/                 subdirectory name, this folder is in the mammals folder
lion.html                file name (Web page), this file is in the felines folder

Entering http://www.animals.com into your browser would take you to the Home Page of Animals Com and, no doubt, there would be a link there to the Mammals page and on that page a link to the Lion page, the vertebrates/ folder perhaps being invisible to the user. The full URL http://www.animals.com/vertebrates/mammals/felines/lion.html would take you direct to the Lion page.
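For anyone who likes to see this done by machine, the same breakdown can be sketched with Python's standard urllib.parse module. The variable names (protocol, domain, folders, filename) are just labels chosen for this example.

```python
from urllib.parse import urlsplit

url = "http://www.animals.com/vertebrates/mammals/felines/lion.html"
parts = urlsplit(url)

protocol = parts.scheme   # the bit before ://
domain = parts.netloc     # the name of the far computer
# Remove the leading slash, then separate the folders from the file name.
*folders, filename = parts.path.lstrip("/").split("/")

print(protocol)   # http
print(domain)     # www.animals.com
print(folders)    # ['vertebrates', 'mammals', 'felines']
print(filename)   # lion.html
```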

The main protocol types which collectively make up the Internet are:

http
ftp
gopher
telnet
WAIS

http: (hypertext transfer protocol) a World Wide Web server language. This is the most common. On the Web these addresses always start with http://, but the beauty is that nearly all browsers will add this automatically.
If you want to get to, say, Jones, an American company, just bang in www.jones.com (for a UK company, put co.uk in place of com, i.e. www.jones.co.uk) and your browser will automatically convert this to http://www.jones.com/. Most addresses outside the USA end in a country code, although endings such as .com, .org and .net carry no country code and are used worldwide.
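What the browser does here can be imitated in a few lines. This is a simplified sketch only: the function name complete_address is hypothetical, and real browsers apply many more rules (searching, guessing, remembering your history) before settling on an address.

```python
def complete_address(typed):
    """Add http:// and a trailing slash to a bare address (simplified)."""
    typed = typed.strip()
    if "://" not in typed:
        # No protocol was typed, so assume the Web's usual one.
        typed = "http://" + typed
    scheme, rest = typed.split("://", 1)
    if "/" not in rest:
        # A bare host name has no path yet, so point it at the Home Page.
        rest += "/"
    return scheme + "://" + rest

print(complete_address("www.jones.com"))              # http://www.jones.com/
print(complete_address("www.jones.co.uk"))            # http://www.jones.co.uk/
print(complete_address("ftp://ftp.example.com/pub"))  # ftp://ftp.example.com/pub
```

An address that already names its protocol, like the ftp one, is left alone.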

On to the other protocol types:

ftp

file transfer protocol. There are two sorts of ftp sites: anonymous and non-anonymous.

An anonymous ftp site allows access to archives of files. Anonymous sites are the ones you log into using anonymous as the login and your e-mail address as the password. Files are read only: they may be read or downloaded but not altered in any way.

Non-anonymous ftp sites hold archives of files which are 'read only' unless you enter a password to access them. The page you are looking at is an HTML file located at ftp.######.f2s.com, a non-anonymous ftp location where ###### is my User ID. You can freely access it on http, as you have done now, but only I, the webmaster, can access it on ftp to alter it (webmaster, incidentally, does not imply any particular expertise; it is the term used for the person who maintains a website and who has full access to all the files).

So, how do you enter the password for a non-anonymous ftp site, since you are not prompted for one? Well, the password is embedded in the URL the webmaster uses and takes the form:

ftp://username:password@ftp.servername.com/home/servername/homepage.html
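The pieces of such an address can be pulled apart with Python's standard urllib.parse module, as in this small sketch. The address is the placeholder one shown above, not a real site.

```python
from urllib.parse import urlsplit

ftp_url = "ftp://username:password@ftp.servername.com/home/servername/homepage.html"
parts = urlsplit(ftp_url)

print(parts.scheme)    # ftp
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # ftp.servername.com
print(parts.path)      # /home/servername/homepage.html
```

Everything between :// and @ is the login; everything after the host name is the path to the file.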

In practice there is no need to remember any of this if you use an ftp program. On most you just enter your ftp URL and password in the labelled boxes on set up and they are securely stored for all future transmissions until changed. The utility I use is FreeFTP, from Brandyware Software. I simply click a connect icon then drag and drop the files I wish transferred. This is a boon for webmasters, simplifying editing and updating, but still allowing you to control what is going on. A more advanced ftp utility, also free, is SmartFTP, but setting it up, although straightforward, requires a little more expertise.

Which brings us on to Archie. Archie is an Internet utility used to search for file names. There are approximately 30 computer systems throughout the Internet, called Archie servers, that maintain catalogs of files available for downloading from various FTP sites. Periodically, Archie servers search FTP sites throughout the Internet and record information about the files they find. If you do not have Archie, some Internet hosts let you log on via Telnet as user "archie."

Hope all this is not getting too boring, or giving you a headache!

gopher: Gopher is a text menu driven directory of files, newsgroups, etc. Gopher addresses start with gopher://. Gopher is a program that searches for file names and resources on the Internet and presents hierarchical menus to the user. As users select options, they are moved to different Gopher servers on the Internet. Where links have been established, Usenet news and other information can be read directly from Gopher. There are more than 7,000 Gopher servers on the Internet. Veronica is a subprogram that searches the Internet for specific resources by description, not just file name. Using Boolean searches (this AND this, this OR this, etc.), users can search Gopher servers to retrieve a selected group of menus that pertain to their area of interest. Another Internet utility associated with Gopher is Jughead, which is used to search for a keyword throughout all levels of a Gopher menu. With Jughead, you do not have to jump from one menu level to the next.
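To show what a Boolean search means in practice, here is a toy sketch in Python. It is not how Veronica itself works: the menu titles are invented, the function names are hypothetical, and the matching is a crude check for the word anywhere in the title.

```python
def search_all(titles, words):
    """AND search: keep the titles that contain every one of the words."""
    return [t for t in titles if all(w.lower() in t.lower() for w in words)]

def search_any(titles, words):
    """OR search: keep the titles that contain at least one of the words."""
    return [t for t in titles if any(w.lower() in t.lower() for w in words)]

# Invented menu titles, standing in for entries on Gopher servers.
menu_titles = [
    "Library catalogues",
    "Weather forecasts for Europe",
    "Weather maps",
    "Usenet news",
]

print(search_all(menu_titles, ["weather", "europe"]))
# ['Weather forecasts for Europe']
print(search_any(menu_titles, ["weather", "usenet"]))
# ['Weather forecasts for Europe', 'Weather maps', 'Usenet news']
```

AND narrows the list down, OR widens it.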

telnet: telnet initiates a session to log on remotely to another computer. When selected, your web browser launches an external program and connects to the specified site. Telnet is a terminal emulation protocol commonly used on the Internet and TCP/IP-based networks. It allows a user at a terminal or PC to log onto a remote computer and run a program. Telnet was originally developed for ARPAnet and is an inherent part of the TCP/IP family of communications protocols. Although most computers on the Internet that allow Telnet access require users to have an established account and password, there are some that allow you to run programs such as search utilities without one.

WAIS: Wide Area Information Server: a site for searching documents by keywords. WAIS is in fact a database that contains indexes to documents that are on the Internet. Using the Z39.50 search protocol, text files can be searched based on keywords.

Peter Ghiringhelli, B.A.(Hons), M.A.
