How to Use a DTD with Java
An XML document can have a DTD, which spells out exactly what elements can appear in an XML document and in what order the elements can appear. DTD stands for Document Type Definition, but that won’t be on the test.
A DTD for an XML document about movies, for example, may specify that each Movie element must have Title and Price subelements and an attribute named year. It can also specify that the root element must be named Movies and consist of any number of Movie elements.
The main purpose of the DTD is to spell out the structure of an XML document so that users of the document know how to interpret it. Another, equally important use of the DTD is to validate the document to make sure that it doesn’t have any structural errors. If you create a Movies XML document that has two titles for a movie, for example, you can use the DTD to detect the error.
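As a minimal sketch of that kind of check in Java, the following class turns on DTD validation in the standard JAXP DOM parser and collects the validity errors it reports. The class name, the method name, and the sample movie data are hypothetical, and the DTD is embedded in the document itself only so that the example needs no files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // Assumption: the DTD is written as an internal subset to keep the sketch self-contained.
    static final String DOCTYPE =
        "<!DOCTYPE Movies [\n"
        + "  <!ELEMENT Movies (Movie*)>\n"
        + "  <!ELEMENT Movie (Title, Price)>\n"
        + "  <!ATTLIST Movie year CDATA #REQUIRED>\n"
        + "  <!ELEMENT Title (#PCDATA)>\n"
        + "  <!ELEMENT Price (#PCDATA)>\n"
        + "]>\n";

    /** Parses the given Movies element with validation on and returns the validity errors. */
    static List<String> validate(String moviesElementXml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // ask the parser to check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + moviesElementXml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        String good = "<Movies><Movie year=\"2001\"><Title>Sample Movie</Title>"
            + "<Price>14.95</Price></Movie></Movies>";
        String twoTitles = "<Movies><Movie year=\"2001\"><Title>First</Title>"
            + "<Title>Second</Title><Price>14.95</Price></Movie></Movies>";
        System.out.println("Valid document: " + validate(good));
        System.out.println("Two titles: " + validate(twoTitles));
    }
}
```

Validity problems arrive through the error callback rather than as exceptions, which is why the sketch gathers them in a list. Only a document that isn’t well-formed XML reaches fatalError.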
You can store the DTD for an XML document in the same file as the XML data, but more often, you store the DTD in a separate file. That way, you can use a DTD to govern the format of several XML documents of the same type. To indicate the name of the file that contains the DTD, you add a <!DOCTYPE> declaration to the XML document. Here’s an example:
<!DOCTYPE Movies SYSTEM "movies.dtd">
Here the XML file is identified as a Movies document, whose DTD you can find in the file movies.dtd. Add this declaration near the beginning of the movies.xml file, right after the <?xml ... ?> line.
This code shows a DTD file for the Movies document:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT Movies (Movie*)>
<!ELEMENT Movie (Title, Price)>
<!ATTLIST Movie year CDATA #REQUIRED>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Price (#PCDATA)>
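To see the separate-file arrangement work from Java, here is a rough sketch that writes a movies.dtd and a movies.xml into a temporary folder and parses the XML file with validation turned on. The class name, the method name, and the temporary folder are assumptions made for illustration. In a real project, the two files would already sit side by side on disk.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ExternalDtdSketch {

    // Contents of movies.dtd, matching the listing in the text.
    static final String DTD =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!ELEMENT Movies (Movie*)>\n"
        + "<!ELEMENT Movie (Title, Price)>\n"
        + "<!ATTLIST Movie year CDATA #REQUIRED>\n"
        + "<!ELEMENT Title (#PCDATA)>\n"
        + "<!ELEMENT Price (#PCDATA)>\n";

    // The DOCTYPE declaration comes right after the XML declaration.
    static final String XML_PROLOG =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE Movies SYSTEM \"movies.dtd\">\n";

    /** Writes both files to a temporary folder, parses movies.xml, and returns validity errors. */
    static List<String> validate(String moviesElementXml) {
        List<String> errors = new ArrayList<>();
        try {
            Path dir = Files.createTempDirectory("dtd-sketch"); // hypothetical location
            Path dtdFile = dir.resolve("movies.dtd");
            Path xmlFile = dir.resolve("movies.xml");
            dir.toFile().deleteOnExit();
            dtdFile.toFile().deleteOnExit();
            xmlFile.toFile().deleteOnExit();
            Files.write(dtdFile, DTD.getBytes(StandardCharsets.UTF_8));
            Files.write(xmlFile, (XML_PROLOG + moviesElementXml).getBytes(StandardCharsets.UTF_8));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler ignores warnings and rethrows fatal errors; only error() is overridden.
            builder.setErrorHandler(new DefaultHandler() {
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            });
            // Parsing a File lets the parser resolve "movies.dtd" relative to movies.xml.
            builder.parse(xmlFile.toFile());
        } catch (Exception e) {
            throw new IllegalStateException("Could not validate the document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the second Movie element lacks the required year attribute.
        System.out.println(validate("<Movies><Movie year=\"2001\"><Title>Sample Movie</Title>"
            + "<Price>14.95</Price></Movie></Movies>"));
        System.out.println(validate("<Movies><Movie><Title>Sample Movie</Title>"
            + "<Price>14.95</Price></Movie></Movies>"));
    }
}
```

This sketch relies on the parser being allowed to read external DTD files, which is the usual default for the JDK’s built-in parser but can be restricted by security settings.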
Each of the ELEMENT tags in a DTD defines a type of element that can appear in the document and indicates what can appear as the content for that element type. The general form of the ELEMENT tag is this:
<!ELEMENT element (content)>
Use the rules listed here to express the content.
|Content||Description|
|element*||The specified element can occur 0 or more times.|
|element+||The specified element can occur 1 or more times.|
|element?||The specified element can occur 0 or 1 time.|
|#PCDATA||Text data is allowed.|
|ANY||Any child elements are allowed.|
|EMPTY||No child elements of any type are allowed.|
The first ELEMENT tag in the DTD shown above, for example, says that a Movies element consists of zero or more Movie elements. The second ELEMENT tag says that a Movie element consists of a Title element followed by a Price element. The third and fourth ELEMENT tags say that the Title and Price elements consist of text data.
If this notation looks vaguely familiar, that’s because it’s derived from regular expressions.
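To make the resemblance concrete, the sketch below hand-translates two content models into Java regular expressions and matches them against a list of child element names. This is only an illustration of the analogy, not a description of how an XML parser validates content internally. The class, the encoding of names as space-terminated words, and the method names are all made up for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

class ContentModelRegexSketch {

    // Hand-written regex equivalents; each child name is followed by one space.
    static final Pattern MOVIES_MODEL = Pattern.compile("(Movie )*");    // (Movie*)
    static final Pattern MOVIE_MODEL = Pattern.compile("Title Price ");  // (Title, Price)

    /** Joins child element names into one string, such as "Title Price ". */
    static String encode(List<String> childNames) {
        StringBuilder sb = new StringBuilder();
        for (String name : childNames) {
            sb.append(name).append(' ');
        }
        return sb.toString();
    }

    /** Returns true if the whole sequence of child names fits the content model. */
    static boolean matches(Pattern model, List<String> childNames) {
        return model.matcher(encode(childNames)).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(MOVIES_MODEL, Collections.<String>emptyList()));
        System.out.println(matches(MOVIE_MODEL, Arrays.asList("Title", "Price")));
        System.out.println(matches(MOVIE_MODEL, Arrays.asList("Price", "Title")));
    }
}
```

The comma in (Title, Price) becomes plain concatenation in the regular expression, and the asterisk keeps the same meaning in both notations.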
The ATTLIST tag declares the attributes that an element can have. Its general form is this:
<!ATTLIST element attribute type default-value>
Here’s a breakdown of this tag:
element names the element whose tag the attribute can appear in.
attribute provides the name of the attribute.
type specifies what can appear as the attribute’s value. The type can be any of the items listed in the first table below.
default-value provides a default value and indicates whether the attribute is required or optional. The default-value can be any of the items listed in the second table below.
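|Type||The Attribute Value …|
|CDATA||Can be any character string.|
|(string1|string2...)||Can be one of the listed strings.|
|NMTOKEN||Must be a name token, which is a string made up of letters, digits, and a few punctuation characters (periods, hyphens, underscores, and colons).|
|NMTOKENS||Must be one or more name tokens separated by white space.|
|ID||Is an XML name (so it can’t start with a digit) that must be unique. In other words, no other element in the document can have the same value for an ID attribute.|
|IDREF||Must be the same as an ID value used elsewhere in the document.|
|IDREFS||Is a list of IDREF values separated by white space.|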
Check out the attribute defaults here.
|Default||Optional or Required?|
|#REQUIRED||Required.|
|#IMPLIED||Optional. No default value is supplied.|
|value||Optional. This value is used if the attribute is omitted.|
|#FIXED value||Optional. If included, however, it must be this value, and if omitted, this value is used by default.|
Here’s the ATTLIST tag declaration from movies.dtd:
<!ATTLIST Movie year CDATA #REQUIRED>
This declaration indicates that the attribute goes with the Movie element, is named year, can be any character string, and is required.
Here’s an ATTLIST tag that specifies a list of possible values along with a default:
<!ATTLIST Movie genre (SciFi|Action|Comedy|Drama) "Comedy">
This form of the ATTLIST tag lets you create an attribute that’s similar to an enumeration, with a list of acceptable values.
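The following sketch shows how these attribute declarations might behave when a document is parsed from Java. It assumes a cut-down, hypothetical DTD in which Movie is an empty element, so that only the year and genre attributes are in play. The class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class AttributeListSketch {

    // Assumption: a simplified DTD with an empty Movie element and two attributes.
    static final String DOCTYPE =
        "<!DOCTYPE Movie [\n"
        + "  <!ELEMENT Movie EMPTY>\n"
        + "  <!ATTLIST Movie year CDATA #REQUIRED>\n"
        + "  <!ATTLIST Movie genre (SciFi|Action|Comedy|Drama) \"Comedy\">\n"
        + "]>\n";

    /** Parses one Movie element with validation on, adding validity errors to the list. */
    static Document parse(String movieXml, List<String> errors) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            });
            return builder.parse(new InputSource(new StringReader(DOCTYPE + movieXml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    /** Returns the genre attribute as the parser sees it, including a DTD-supplied default. */
    static String genreOf(String movieXml) {
        return parse(movieXml, new ArrayList<>()).getDocumentElement().getAttribute("genre");
    }

    /** Returns the validity errors reported for the given Movie element. */
    static List<String> errorsIn(String movieXml) {
        List<String> errors = new ArrayList<>();
        parse(movieXml, errors);
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(genreOf("<Movie year=\"2001\"/>"));                      // genre omitted
        System.out.println(errorsIn("<Movie year=\"2001\" genre=\"Western\"/>"));   // not in the list
        System.out.println(errorsIn("<Movie genre=\"Drama\"/>"));                   // year missing
    }
}
```

When genre is left out, the parser fills in the default from the DTD, so code that reads the attribute sees Comedy. A value outside the list or a missing required attribute shows up as a validity error instead.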