How to Map SQL to XML and XML to SQL
To exchange data between SQL databases and XML documents, the various elements of an SQL database must be translatable into equivalent elements of an XML document, and, of course, vice versa.
In SQL, the character sets supported depend on which implementation you’re using. This means that IBM’s DB2 may support character sets that are not supported by Microsoft’s SQL Server. If you use a less common character set, migrating your database and application from one RDBMS platform to another may be difficult.
XML supports only one, Unicode. This is a good thing from the point of view of exchanging data between any given SQL implementation and XML. All the RDBMS vendors have to define a mapping between strings of each of their character sets and Unicode, as well as a reverse mapping from Unicode to each of their character sets.
Luckily, XML doesn’t also support multiple character sets. If it did, vendors would have a many-to-many problem that would require several more mappings and reverse mappings to resolve.
Characters that are legal in SQL but illegal in XML must be mapped to something legal before they can become part of an XML document. SQL supports delimited identifiers. This means that odd characters such as %, $, and & are legal, as long as they’re enclosed within double quotes. Such characters are not legal in XML.
Furthermore, XML Names that begin with the characters XML in any combination of cases are reserved and thus cannot be used with impunity. If you have any SQL identifiers that begin with those letters, you have to change them.
In moving from SQL to XML, all SQL identifiers are converted to Unicode. From there, any SQL identifiers that are also legal XML Names are left unchanged. SQL identifier characters that are not legal XML Names are replaced with a hexadecimal code that either takes the form “_xNNNN_” or “_xNNNNNNNN_”, where N represents an uppercase hexadecimal digit.
For example, the underscore will be represented by “_x005F_”. The colon will be represented by “_x003A_”. These representations are the codes for the Unicode characters for the underscore and colon. The case where an SQL identifier starts with the characters x, m, and l is handled by prefixing all such instances with a code in the form “_xFFFF_”.
Conversion from XML to SQL is much easier. All you need to do is scan the characters of an XML Name for a sequence of “_xNNNN_” or “_xNNNNNNNN_”. Whenever you find such a sequence, replace it with the character that the Unicode corresponds to. If an XML Name begins with the characters “_xFFFF_”, ignore them.
The SQL standard specifies that an SQL data type must be mapped to the closest possible XML Schema data type. The designation closest possible means that all values allowed by the SQL type will be allowed by the XML Schema type, and the fewest possible values not allowed by the SQL type will be allowed by the XML Schema type.
XML facets, such as maxInclusive and minInclusive, can restrict the values allowed by the XML Schema type to the values allowed by the corresponding SQL type.
For example, if the SQL data type restricts values of the INTEGER type to the range –2157483648<value<2157483647, in XML the maxInclusive value can be set to 2157483647, and the minInclusive value can be set to –2157483648. Here’s an example of such a mapping:
<xsd:simpleType> <xsd:restriction base="xsd:integer"/> <xsd:maxInclusive value="2157483647"/> <xsd:minInclusive value="-2157483648"/> <xsd:annotation> <sqlxml:sqltype name="INTEGER"/> </xsd:annotation> </xsd:restriction> </xsd:simpleType>
You can map a table to an XML document. Similarly, you can map all the tables in a schema or all the tables in a catalog. Privileges are maintained by the mapping. A person who has the SELECT privilege on only some table columns will be able to map only those columns to the XML document.
The mapping actually produces two documents, one that contains the data in the table and the other that contains the XML Schema that describes the first document. Here’s an example of the mapping of an SQL table to an XML data-containing document:
<CUSTOMER> <row> <FirstName>Abe</FirstName> <LastName>Abelson</LastName> <City>Springfield</City> <AreaCode>714</AreaCode> <Telephone>555-1111</Telephone> </row> <row> <FirstName>Bill</FirstName> <LastName>Bailey</LastName> <City>Decatur</City> <AreaCode>714</AreaCode> <Telephone>555-2222</Telephone> </row> . . . </CUSTOMER>
The root element of the document has been given the name of the table. Each table row is contained within a <row> element, and each row element contains a sequence of column elements, each named after the corresponding column in the source table. Each column element contains a data value.
Because SQL data might include null values, you must decide how to represent them in an XML document. You can represent a null value either as nil or absent. If you choose the nil option, then the attribute xsi:nil=“true” marks the column elements that represent null values. It might be used in the following way:
<row> <FirstName>Bill</FirstName> <LastName>Bailey</LastName> <City xsi:nil="true" /> <AreaCode>714</AreaCode> <Telephone>555-2222</Telephone> </row>
If you choose the absent option, you could implement it as follows:
<row> <FirstName>Bill</FirstName> <LastName>Bailey</LastName> <AreaCode>714</AreaCode> <Telephone>555-2222</Telephone> </row>
In this case, the row containing the null value is absent. There is no reference to it.
How to generate the XML Schema
When mapping from SQL to XML, the first document generated is the one that contains the data. The second contains the schema information. Consider the schema for the CUSTOMER document:
<xsd:schema> <xsd:simpleType name="CHAR_15"> <xsd:restriction base="xsd:string"> <xsd:length value = "15"/> </xsd:restriction> </xsd:simpleType> <xsd:simpleType name="CHAR_25"> <xsd:restriction base="xsd:string"> <xsd:length value = "25"/> </xsd:restriction> </xsd:simpleType> <xsd:simpleType name="CHAR_3"> <xsd:restriction base="xsd:string"> <xsd:length value = "3"/> </xsd:restriction> </xsd:simpleType> <xsd:simpleType name="CHAR_8"> <xsd:restriction base="xsd:string"> <xsd:length value = "8"/> </xsd:restriction> </xsd:simpleType> <xsd:sequence> <xsd:element name="FirstName" type="CHAR_15"/> <xsd:element name="LastName" type="CHAR_25"/> <xsd:element name="City" type="CHAR_25 nillable=”true"/> <xsd:element name="AreaCode" type="CHAR_3" nillable="true"/> <xsd:element name="Telephone" type="CHAR_8" nillable="true"/> </xsd:sequence> </xsd:schema>
This schema is appropriate if the nil approach to handling nulls is used. The absent approach requires a slightly different element definition. For example:
<xsd:element name="City" type="CHAR_25" minOccurs="0"/>