Analyzing the Anatomy of URLs
A URL (pronounced You-Are-Ell), or Uniform Resource Locator, is a fancy way of saying an address for information on the Internet. If you hear URL, just think “address” or “location.” URLs differ based on how specific you need to be.
URLs can be absolute (complete) or relative (partial):
- If you’re creating a document that you want to publish on the Internet, you use an absolute URL so that anyone on the Internet, anywhere in the world, can find the page.
- If you’re creating links to other files within the same folder or on the same server, you need to provide only a relative URL. Remember that you’re already in the same directory (or folder or general vicinity) as the file to which you’re linking.
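To illustrate how a browser turns a relative URL into a complete address, here is a minimal Python sketch using the standard library's urljoin function. The base address is the fictitious cat.feline.org URL used in this section; the other filenames and the dog.canine.org host are made up for the example.

```python
from urllib.parse import urljoin

# Assumed address of the page that contains the links (fictitious).
base = "http://cat.feline.org/fur/fuzzy.html"

# A bare filename is looked up in the same directory as the current page.
same_dir = urljoin(base, "fluffy.html")

# A leading slash means "start from the top of the same server".
same_server = urljoin(base, "/claws/sharp.html")

# An absolute URL is already complete, so the base is ignored.
elsewhere = urljoin(base, "http://dog.canine.org/bark.html")

print(same_dir)
print(same_server)
print(elsewhere)
```

The first two results stay on cat.feline.org because the relative URLs borrow the missing parts from the page they appear on.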
All HTML documents can use URLs to link to other information. URLs, in turn, can point to many different things, such as HTML documents, other sites on the Internet, or even images and sound files.
URLs can be case sensitive. The protocol indicator and the hostname are not, but the directory name and filename often are: on some computers, typing a filename such as Kitten.html is very different from typing kitten.html. If you create a filename that uses special capitalization (instead of, say, using all lowercase characters), you must use that same capitalization every time you link to the document. (It’d be easier for you and your readers to use just lowercase.)
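As a rough sketch of which parts of a URL are affected by capitalization, Python's urlsplit function normalizes the protocol indicator and the hostname to lowercase but leaves the rest untouched. The mixed-case URL below is a hypothetical example.

```python
from urllib.parse import urlsplit

# Hypothetical URL typed with mixed capitalization.
parts = urlsplit("HTTP://Cat.Feline.ORG/fur/Kitten.html")

scheme = parts.scheme      # urlsplit lowercases the protocol indicator
host = parts.hostname      # the hostname attribute is lowercased too
path = parts.path          # the path is returned exactly as typed

print(scheme, host, path)

# Two links that differ only in filename capitalization are not
# guaranteed to reach the same file.
same_file = path == urlsplit("http://cat.feline.org/fur/kitten.html").path
print(same_file)
```

Whether Kitten.html and kitten.html actually name different files depends on the server, which is why sticking to lowercase is the safer habit.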
If you’re not used to them, URLs can be pretty odd looking. Each part of a URL has a built-in specific meaning, however, much like each part of your home address. The street address “12 Fritter Lane, Apartment G, Santa Clara, CA 95051,” for example, provides a postal carrier with essential and complete information — the specific apartment in a specific building on a specific street in a specific town in a specific state in a specific ZIP code. Specifically.
URLs work the same way by providing a browser with all the parts it needs to locate information. A URL consists of the protocol indicator, the hostname, and the directory name and/or filename. The following (fictitious) URL is an example of an absolute URL:
http://cat.feline.org/fur/fuzzy.html
Following is a description of each URL part:
- http:// portion (protocol indicator): Tells the browser how to request the information from the server. The protocol indicator names the standard that Web servers and browsers use to talk to each other. If you’re creating HTML documents, people point to them by using http:// as the protocol indicator. Publications often omit the http:// protocol indicator, both to save space and because most URLs (at least those published in the media) are http:// URLs.
- Note: Even though you can leave the http:// off the URL in casual usage, you must include it when linking to another Web site. Without it, a browser treats the address as a relative URL and looks for it on your own server.
- cat.feline.org portion (hostname): Specifies a computer on the Internet. If you publish an HTML document, you’re placing it on a computer that “serves” the document to anyone who knows the correct URL. This computer has an address that’s common to all documents that it stores. The server thus “hosts” all these documents and makes them accessible to users.
- To obtain the hostname of the server on which you place your files, check with your system administrator.
- fur portion (directory name): You may not need to show a directory name, or you may have several that represent directories inside directories (or folders inside folders). If you have an account with an Internet service provider, your directory name may also begin with a tilde and your user name, yielding something such as http://cat.feline.org/~lucy/, assuming, of course, that lucy is the account name.
- fuzzy.html portion (name of file located on the host computer): Sometimes you don’t need to provide a filename, because the server simply gives out the default file in the directory. The default filename is usually one of three: index.html, default.html, or homepage.html, depending on the kind of Web server on which the files are located. Like many other filenames, this one contains a name (fuzzy) and an extension (.html).
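The parts described above can be pulled out of a URL programmatically. The following is a minimal sketch, assuming Python's standard urllib.parse and posixpath modules and the fictitious cat.feline.org URL from this section; the variable names are only illustrative.

```python
import posixpath
from urllib.parse import urlsplit

# The fictitious absolute URL described in this section.
url = "http://cat.feline.org/fur/fuzzy.html"
parts = urlsplit(url)

protocol = parts.scheme      # protocol indicator, without the "://"
hostname = parts.hostname    # the computer that hosts the document

# Split the path into the directory name and the filename.
directory, filename = posixpath.split(parts.path)

# Split the filename into its name and its extension.
name, extension = posixpath.splitext(filename)

print(protocol, hostname, directory, filename, name, extension)
```

If the URL ends at a directory (for example, http://cat.feline.org/fur/), the filename comes back empty, which is the case where the server falls back to its default file.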
Sometimes, URLs have a hostname with a port number at the end (for example, cat.feline.org:80). This number tells the browser which port on the host computer to connect to. Web servers normally use port 80 for http:// URLs, so the number is usually left out. If you see a URL with a number, just leave the number on the URL. If you don’t see a number, don’t worry about it.
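A short sketch of how a program might read the port number, again assuming Python's urllib.parse. The effective_port helper and the DEFAULT_PORTS table are hypothetical names introduced for this example; they are not part of any library.

```python
from urllib.parse import urlsplit

# Well-known default ports; this table is an assumption made for the sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the port written in the URL, or the protocol's usual default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

with_port = urlsplit("http://cat.feline.org:80/fur/fuzzy.html")
without_port = urlsplit("http://cat.feline.org/fur/fuzzy.html")

print(with_port.hostname, with_port.port)        # port taken from the URL
print(without_port.hostname, without_port.port)  # no port given, so None
print(effective_port("http://cat.feline.org/fur/fuzzy.html"))
```

Both URLs reach the same server in practice, because a browser assumes the default port when none is written.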
Try to avoid creating directory names or filenames with spaces or other unusual characters. Stay with letters (uppercase and lowercase), numbers, underscores (_), periods (.), or plus signs (+). Some servers have problems with odd characters. In addition, if you do use any capitalization in your filenames, you must also use the same capitalization in any links pointing to those files because some servers require consistent capitalization.
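One way to check names against this advice is sketched below in Python. The is_safe_name helper is a hypothetical function that accepts only the characters listed above, and quote (from the standard urllib.parse module) shows what a space turns into when a name containing one ends up in a URL anyway.

```python
import re
from urllib.parse import quote

# Letters, numbers, underscores, periods, and plus signs only.
SAFE_NAME = re.compile(r"[A-Za-z0-9_.+]+")

def is_safe_name(name):
    """Return True if the whole name uses only the recommended characters."""
    return SAFE_NAME.fullmatch(name) is not None

print(is_safe_name("fuzzy.html"))    # only recommended characters
print(is_safe_name("my cat.html"))   # contains a space

# A space has to be percent-encoded before it can appear in a URL.
encoded = quote("my cat.html")
print(encoded)
```

The encoded form works, but it is harder to read and type than a name that avoided the space in the first place.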