Generating Relational data from xml datatype Part 1

Published: 15 Nov 2006
By: Imran Nathani

The introduction of the xml datatype in SQL Server has given us an enhancement of actually dumping raw XML obtained from different servers into a distributed environment into one relational table. This article which is the first part of two, deals with the basics of the xml datatype and retrieving data from the xml datatype.

Introduction

The xml datatype is normally used to store information of a more static nature like invoices, vouchers or receipts obtained from other systems in XML form. However it can also be queried to obtain specific data which is exactly what I will cover in this article. We simply declare the xml datatype as follows:

DECLARE @xmlvar AS xml

XQUERY language

XQUERY is the language used to retrieve XML data and is synonymous to SQL in relational data. Without going into too much depth about XQuery, the following XML block and sample XQuery should help introduce the topic:

<?xml version="1.0" ?> 
 <Invoices>   
    <Invoice InvoiceId="1">
      <Invoicedate>10-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson>
    </Invoice>   
    <Invoice InvoiceId="2">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson>
    </Invoice>     
    <Invoice InvoiceId="3">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="2">ABC</Salesperson>
    </Invoice>      
 </Invoices>

Assuming we have a simple XML document as above the implication of the following XQueries are as follows:

Invoices/Invoice Points to the 'Invoice' nodes under the root node 'Invoices'.
Invoices/Invoice/@InvoiceId Points to attribute 'InvoiceId' represented by '@' in the 'Invoice' nodes.
Invoices/Invoice[1] Points to first 'Invoice' node under the root node 'Invoices'.
data(Invoices/Invoice) Retrieves everything qualifying under the 'Invoices/Invoice' path.
data(Invoices/Invoice/@InvoiceId) Retrieves everything qualifying as the InvoiceId attribute under the 'Invoices/Invoice' path.

In addition to the data() function, XQuery also has functions like count(), sum(), avg() etc. .

XML handling methods in SQL Server

SQL Server has some methods to handle the xml datatype. The following code snippets will demonstrate these methods. Also these snippets have been written in such a way that they work directly in the query analyzer.

Query method

The query method is used to query XML from an xml data type. The parameter to this method is an XQuery expression as displayed below:

Syntax:
SELECT @xmlvar.query('[XQUERY expression]')

Example:

DECLARE @xmlvar AS xml
SET @xmldata = '<?xml version="1.0" ?> 
 <Invoices>   
    <Invoice InvoiceId="1">
      <Invoicedate>10-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson> 
    </Invoice>   
    <Invoice InvoiceId="2">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson> 
    </Invoice>     
    <Invoice InvoiceId="3">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="2">ABC</Salesperson> 
    </Invoice>      
 </Invoices>'
SELECT @xmldata.query('data(/ Invoices/Invoice/Salesperson )')

Result:

XYZ XYZ ABC

The result is the data from the Salesperson tags.

Value method

The value method is used to return a single value from an xml datatype. The parameters to this method is a XQuery expression along with a SQL return datatype (int, varchar, money etc.).

Syntax:
SELECT @xmlvar.value('[XQUERY 

expression]','[SQL-datatype]')
Example:
DECLARE @xmlvar AS xml
SET @xmldata = '<?xml version="1.0" ?> 
 <Invoices>   
    <Invoice InvoiceId="1">
      <Invoicedate>10-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson> 
    </Invoice>   
    <Invoice InvoiceId="2">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson> 
    </Invoice>     
    <Invoice InvoiceId="3">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="2">ABC</Salesperson> 
    </Invoice>      
 </Invoices>'
SELECT @xmldata.value('avg(/ 

Invoices/Invoice/@InvoiceId )', 'int')

Result:

2

The above example returns the average of the InvoiceId as an integer datatype.

Exist method

The exist method is used to determine whether a specified node, value, attribute or data exists in the xml variable. The parameter to this method is an XQUERY expression. It returns a boolean value.

Syntax:
SELECT @xmlvar.exist('[XQUERY expression]')
Example:
DECLARE @xmlvar AS xml
SET @xmldata = '<?xml version="1.0" ?> 
 <Invoices>   
    <Invoice InvoiceId="1">
      <Invoicedate>10-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson> 
    </Invoice>   
    <Invoice InvoiceId="2">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="1">XYZ</Salesperson> 
    </Invoice>     
    <Invoice InvoiceId="3">
      <Invoicedate>11-Jan-2006</Invoicedate>
      <Salesperson id="2">ABC</Salesperson> 
    </Invoice>      
 </Invoices>'
SELECT @xmldata.exist('/ Invoices/Invoice/@InvoiceId ')

Result:

True

As mentioned it returns True as the InvoiceId already exist. However if the XQuery is modified to /Invoices/Invoice/Invoicedate/@InvoiceId it would return False because there is no attribute of InvoiceId under the Invoicedate tag.

Summary

XML data received from disparate systems and data providers could be ripped apart and stored into a relational table, but this becomes an uphill task when the received XML structure is less uniform, or it needs to be further morphed into another structure. Rather, implementation of SQL Server's XML datatype is the simplest way to achieve this ,which is further bolstered by straightforward retrieval and manipulation capabilities of SQL Server.

About Imran Nathani

Sorry, no bio is available

View complete profile

Top Articles in this category

Using SQL Server Integration Services (SSIS) Control Flow in SQL Server 2005
SQL Server Integration Services (SSIS) in SQL Server 2005 has been redone, and is clearly superior over SQL Server 2000. With all of the new capabilities and features in SSIS, it is possible to create very complex transformation packages. This article will show you how to take advantage of the control flow features.

Using SQL Server Integration Services (SSIS) Data Flow in SQL Server 2005
SQL Server Integration Services (SSIS) in SQL Server 2005 has been redone, and is clearly superior over SQL Server 2000. With all of the new capabilities and features in SSIS, it is possible to create very complex data transformation packages. This article will show you how.

Introduction to SQL Server Encryption and Symmetric Key Encryption Tutorial
Pinal Dave talks about encryption in SQL Server 2005.

The SqlCacheDependency Class
Luke Stratman shows how and why to use the SqlCacheDependency class.

SQL Server Pre-Code Review Tips
A list of tips for enforcing coding standards in SQL.

Top
 
 
 

Please login to rate or to leave a comment.

Product Spotlight