30 July 2010

XML in SQL Server

16 April 2009

Hon, Junior Developer, Dorset Software


XML is commonplace in today’s development environment and often stored in a database either for persisting or for later manipulation.  This means that inside SQL Server there is a data type dedicated to XML operations.  Although it is a very powerful type, it has its limitations and there are special ways to implement with this type in particular scenarios.  This article takes the reader through some of the features of XML in SQL server.

The first XML data type limitation regards data retrieval; although you can retrieve the data it does not have the ability to be casted to any other data type other than varchar, nchar, nvarchar, varbinary(max).  It also has comparison limitations whereby you cannot perform arithmetic comparisons other than a comparison to null.  The type cannot be constrained either, so it cannot be used as any type of key, used as a unique constraint or in a non-XML index.  Additionally, when storing XML, the table must have an index because this will optimise the queries run against the column.  Three secondary index types may be of interest when querying for XML:

  • Path for path centric optimisations where you know the path you are querying
  • Value for known values where the path remains unknown
  • Property to optimise queries where the path is known but using a value rather than the path

There are a couple of limitations regarding the use of XML schemas as well.  It is perfectly fine to constrain the XML data to a schema, but the schema must exist in the same database as the XML data types.  Furthermore since shredded XML data does not support formatting, this means that some XML schema properties will not be supported; this will include <xs:include>, <xs:redefine>, <xs:notation>.

It is also worth remembering that shredding XML also incurs overhead, so storing XML SQL Server is implicitly shredding the data into inaccessible tables.  This means that if you just want to persist the XML data then you should, in practise, only use varbinary(Max) which will be quicker and remove encoding issues.

Limitations aside, you can perform some interesting and flexible queries with XML Path.  It is an alternative to XML Explicit because you can define custom roots and hence hierarchies instead of relying on auto or writing the explicit mode query.  You can also try to use Raw because you can use its optional Elements parameter to replace the generic ‘row’ element name.

If you need to nest FOR XML queries this is also supported in SQL Server. With SQL Server it is possible to bulk load XML data from a file using the OpenRowSet function.  Using Bulk Rowset Provider you can insert XML from a file directly to a XML column.  If you are using either the Single_Clob or Single_NClob to let SQL server know what sort of data is inserted, try to use the preferred Single_Blob option because this allows support for Windows encoding conversions. 

A feature of XML data types is the associated methods which you can use to manipulate retrieved XML data on the fly.  These support XQuery and XPath and these functions include xml.value, xml.exists, xml.query, xml.nodes, xml.modify.

Finally SQL Server 2008 now supports all FLOWR operations where Let was not supported in earlier versions.  Lax validation has been included to XML schema in this new release, allowing you to mix string and loose typing XML.  The loose typing feature is supported via wildcards using any, anyattribute or anyType.

I hope that this article has covered some of the limitations of the XML data type but for those new to the data type it is important to recognize how powerful it can be when used correctly.


News

Intelligent Communications

29 Jun 2010
Dorset Software today announced the release of Intelligent Communications; the solution that radically enhances the way that businesses communicate. The software-based communication system...

Hiring Now!

21 May 2010
With strong growth and a solid sales pipeline Dorset Software are looking for talented individuals...

Improving Database Performance...

01 Feb 2010
Dorset Software introduce their Database Performance Auditing...

Featured Case Studies

Wandsworth Borough Council

Read how Dorset Software developed and delivered a bespoke incident tracking solution to Wandsworth on time and to budget.

Western Riverside Waste Authority

Learn how the Western Riverside Waste Authority improved transparency of waste and recycling with a suite of software applications.