XML Schema Patterns for Common Data Structures

Editor:
Paul Downey, BT

Abstract

This document provides a set of simple XML Schema 1.0 patterns for describing commonly used data structures. The data structures described are intended to be independent of any particular programming language or database or modelling environment.

Status of this Document

This document has no official standing.

This document originates from discussions in the Web Services Description Working Group surrounding a set of XML Schema patterns submitted as examples for the WSDL 2.0 Primer.

However, as a result of discussions during the W3C Workshop on XML Schema 1.0 User Experiences, the W3C may form a dedicated Working Group to develop a set of XML Schema patterns for common data structures, with the aim of simplifying the mapping of XML Schemas into programming language structures (see also the proposed charter). This note may provide useful input to such a Working Group.

The Web Services Description Working Group has neither endorsed nor reviewed this document.

Table of Contents

1 Introduction
    1.1 Notational Conventions
2 Patterns for Common Data Structures
    2.1 Naming Types
    2.2 Enumerated Type
        2.2.1 Extensible Enumerated Type
    2.3 Collection
        2.3.1 Extending A Collection With Attributes
        2.3.2 Extending A Collection With Elements
        2.3.3 Inheritance
    2.4 Vector
    2.5 Maps
        2.5.1 Map Keyed with xs:ID or xml:ID
        2.5.2 Map Type
        2.5.3 WSDL Instance Map Item Type
3 Normative References
4 Informative References

Appendix

A Acknowledgements


1 Introduction

This note provides a set of example XML Schema structures [XML Schema: Structures] and types [XML Schema: Datatypes] which may be used to exchange commonly used data structures in the form of an XML document.

Authors of tools which map or bind data structures to XML may find these patterns useful for representing simple and commonplace constructs. Recognising these patterns and presenting them in the terms most appropriate to the specific language, database or environment may preserve such data structures across boundaries, as well as providing an improved user experience when using data mapping and binding tools.

Authors of XML Schema 1.0 documents may find these patterns useful in providing a better user experience for consumers of their schemas using data mapping and binding tools.

It is not intended that these constructs should constrain the use of the XML Schema Recommendation. A processor using these constructs to map data structures should continue to support the whole of the XML Schema Recommendation.

1.1 Notational Conventions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [IETF RFC 2119].

This specification uses properties from the XML Information Set (see [XML Information Set]). Such properties are denoted by square brackets, e.g. [namespace name].

This specification uses namespace prefixes that are listed in Table 1. Note that the choice of any namespace prefix is arbitrary and not semantically significant (see [XML Information Set]).

Table 1. Prefixes and Namespaces used in this specification
Prefix Namespace Definition
xs "http://www.w3.org/2001/XMLSchema" Defined in the W3C XML Schema specifications [XML Schema: Structures], [XML Schema: Datatypes].
xsi "http://www.w3.org/2001/XMLSchema-instance" Defined in the W3C XML Schema specification [XML Schema: Structures].
wsdli "http://www.w3.org/2005/05/wsdl-instance" Defined in the W3C WSDL 2.0 specification [WSDL 2.0 Part 1].

Namespace names of the general form "http://example.org/..." and "http://example.com/..." represent application or context-dependent URIs (see [IETF RFC 3986]).

All parts of this specification are normative, with the exception of examples and sections explicitly marked as "Non-Normative".

2 Patterns for Common Data Structures

2.1 Naming Types

Giving each data structure an individual type, following the so-called "Venetian blind" schema authoring style, simplifies mapping to and from a data model.

The name of a schema type, attribute or element may be any valid XML non-colonized name including names which may be reserved or not directly representable in some programming languages, such as "object", "static", "final", "class", "Customer-Profile", etc.

2.2 Enumerated Type

The XML Schema enumerated type provides a useful mechanism for expressing a controlled vocabulary for an element or attribute value.

Example 1: Enumerated Type
<xs:simpleType name='Beatle' >
  <xs:restriction base='xs:string' >
   <xs:enumeration value='John' />
   <xs:enumeration value='Paul' />
   <xs:enumeration value='George' />
   <xs:enumeration value='Stuart' />
   <xs:enumeration value='Pete' />
   <xs:enumeration value='Ringo' />
  </xs:restriction>
</xs:simpleType>
          

2.2.1 Extensible Enumerated Type

An enumeration may be made open to extension, possibly in a later version of the schema, by creating a union with its base type.

Example 2: Extensible Enumerated Type
<xs:simpleType name='KnownCurrency' >
  <xs:restriction base='xs:string' >
   <xs:enumeration value='GBP' />
   <xs:enumeration value='USD' />
   <xs:enumeration value='CAD' />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name='Currency' >
  <xs:union memberTypes='tns:KnownCurrency xs:string' />
</xs:simpleType>
          

Version 2 of the schema may introduce additional values whilst remaining backwards compatible with the original schema:

<xs:simpleType name='KnownCurrency' >
  <xs:restriction base='xs:string' >
   <xs:enumeration value='GBP' />
   <xs:enumeration value='USD' />
   <xs:enumeration value='CAD' />
   <xs:enumeration value='EUR' />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name='Currency' >
  <xs:union memberTypes='tns:KnownCurrency xs:string' />
</xs:simpleType>
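
As an illustration only, a binding tool might present the Currency union to a programmer as an enumeration that falls back to the raw string. The following Python sketch assumes a receiver generated from version 1 of the schema; the names KnownCurrency, Currency and parse_currency on the Python side are hypothetical and are not defined by this note.

```python
from enum import Enum
from typing import Union


class KnownCurrency(Enum):
    """Values enumerated by the version 1 KnownCurrency type."""
    GBP = "GBP"
    USD = "USD"
    CAD = "CAD"


# Currency is the union of the known values and any other xs:string.
Currency = Union[KnownCurrency, str]


def parse_currency(text: str) -> Currency:
    """Return an enum member for a known value, else the raw string."""
    try:
        return KnownCurrency(text)
    except ValueError:
        # Not enumerated in this version of the schema: keep the string.
        return text
```

A version 1 receiver handed the version 2 value 'EUR' would then keep it as a plain string rather than rejecting the document.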
          

2.3 Collection

A collection of data items, typically contained by a programming language "object", "class", "structure", or "record", may be represented in XML Schema using a single model group. The individual items held in a collection may appear either as XML element or attribute values.

Example 3: Simple Collection
<xs:complexType name="ProductType">
  <xs:sequence>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="shade" type="xs:string"/>
  </xs:sequence>
  <xs:attribute name="id" type="xs:string" />
  <xs:attribute name="inStock" type="xs:int" />
</xs:complexType>
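
A tool recognising this pattern might bind ProductType to a record-like structure. The Python sketch below is one possible mapping, not a prescribed one; it assumes an instance element named "product" with unqualified child elements, and the sample values are invented for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    """Hypothetical binding of ProductType: elements and attributes become fields."""
    name: str
    shade: str
    id: Optional[str] = None        # optional attribute
    in_stock: Optional[int] = None  # optional attribute of type xs:int


def product_from_xml(text: str) -> Product:
    elem = ET.fromstring(text)
    in_stock = elem.get("inStock")
    return Product(
        name=elem.findtext("name"),
        shade=elem.findtext("shade"),
        id=elem.get("id"),
        in_stock=int(in_stock) if in_stock is not None else None,
    )


# Assumed sample instance; the element name and values are illustrative.
SAMPLE = '<product id="p1" inStock="12"><name>paint</name><shade>red</shade></product>'
product = product_from_xml(SAMPLE)
```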
          

The collection may be composed using a model group of either:

  • sequence - the elements are ordered

  • all - the elements are unordered

  • choice - only one of the elements may be present

The all model group may appear attractive, given that programming language techniques such as introspection or reflection often return items in an arbitrary order. However, significant restrictions are placed upon all groups, not least that an element within one cannot have a maxOccurs value greater than 1. The Unique Particle Attribution (UPA) constraint prevents a model group of all from being extended, whether by containing an any element wildcard, by being incorporated in a substitution group, or by derivation using extension or restriction.

2.3.1 Extending A Collection With Attributes

A collection may be made open to extension by the addition of attributes using the anyAttribute construct.

Example 4: Extending a Collection with Attributes
 <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="name" type="xs:string" />
      <xs:element name="shade" type="xs:string"/>
   </xs:sequence>
   <xs:attribute name="id" type="xs:string" />
   <xs:anyAttribute namespace="##any" processContents="lax"/>
 </xs:complexType>
          

The namespace value will typically either be:

  • ##targetNamespace - the additional items will appear in the same namespace

  • ##other - the additional items may appear in a different namespace

  • ##any - the additional items may appear in any namespace

The processContents value of lax is typically used to indicate that additional items are unlikely to be known to the schema; this is a common case when evolving a schema to another version.
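
A consumer of such a type may wish to keep attributes it does not recognise rather than discard them. The Python sketch below shows one way this might be done; the "ext" namespace, the "colour" attribute and the split_attributes helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Attributes declared explicitly by ProductType in Example 4.
KNOWN_ATTRIBUTES = {"id"}


def split_attributes(elem):
    """Separate declared attributes from those admitted by anyAttribute."""
    known = {k: v for k, v in elem.attrib.items() if k in KNOWN_ATTRIBUTES}
    extra = {k: v for k, v in elem.attrib.items() if k not in KNOWN_ATTRIBUTES}
    return known, extra


# Assumed sample instance carrying one extension attribute.
SAMPLE = (
    '<product xmlns:ext="http://example.org/ext" id="p1" ext:colour="blue">'
    "<name>paint</name><shade>navy</shade></product>"
)
known, extra = split_attributes(ET.fromstring(SAMPLE))
```

ElementTree reports a namespaced attribute name in the form {namespace}local, so the extension attribute remains distinguishable from any later addition to the schema's own namespace.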

2.3.2 Extending A Collection With Elements

Example 5: Extending a Collection with Elements in Another Namespace
 <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="name" type="xs:string" />
      <xs:element name="shade" type="xs:string" />
      <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
   </xs:sequence>
 </xs:complexType>
          

An any construct with a namespace of ##other may directly follow an optional element.

Example 6: Extending a Collection with Elements After an Optional Element
 <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="name" type="xs:string" />
      <xs:element name="shade" type="xs:string" />
      <xs:element name="nickName" type="xs:string" minOccurs="0"/>
      <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
   </xs:sequence>
 </xs:complexType>
          

The UPA constraint prevents an any wildcard with a namespace value of ##any or ##targetNamespace from following an element with a minOccurs value of 0. In the case where a collection may be extended within the same namespace, an optional wrapper element may be used to contain the any wildcard.

Example 7: Extending a Collection Within the Same Namespace
<xs:complexType name="CustomerType">
  <xs:sequence>
    <xs:element name="firstName" type="xs:string" />
    <xs:element name="lastName" type="xs:string" />
    <xs:element name="extension" type="tns:CustomerExtensionType" minOccurs="0" />
  </xs:sequence>
  <xs:anyAttribute/>
</xs:complexType>

<xs:complexType name="CustomerExtensionType">
  <xs:sequence>
    <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"
          namespace="##targetNamespace"/>
  </xs:sequence>
</xs:complexType>
          

Optional elements may be added to the CustomerExtensionType whilst remaining compatible with previous versions of the schema:

<xs:complexType name="CustomerExtensionType">
  <xs:sequence>
    <xs:element name="middleName" type="xs:string" minOccurs="0" />
    <xs:element name="title" type="xs:string" minOccurs="0" />
    <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"
          namespace="##targetNamespace"/>
  </xs:sequence>
</xs:complexType>
          

Note that the 'any' element wildcard may not be used within an all model group.
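
The following Python sketch suggests how a receiver written against the first version of CustomerType might read a later instance, keeping the content of the extension wrapper in a generic form. For brevity it assumes unqualified element names, an instance element named "customer" and invented sample values; read_customer is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def read_customer(text):
    """Read the declared elements and keep any extension content generically."""
    elem = ET.fromstring(text)
    customer = {
        "firstName": elem.findtext("firstName"),
        "lastName": elem.findtext("lastName"),
        "extensions": {},
    }
    extension = elem.find("extension")  # optional wrapper element
    if extension is not None:
        for child in extension:
            customer["extensions"][child.tag] = child.text
    return customer


# Assumed instance produced against the later version of the schema.
SAMPLE = (
    "<customer>"
    "<firstName>Ada</firstName>"
    "<lastName>Lovelace</lastName>"
    "<extension><middleName>King</middleName><title>Countess</title></extension>"
    "</customer>"
)
customer = read_customer(SAMPLE)
```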

2.3.3 Inheritance

Editorial note: PaulD 2005-07-14
Examples of extending a collection into another namespace as for a class being extended into another package

2.4 Vector

A vector is an ordered sequence of repeated items of the same data type. This is a very common construct in programming languages appearing as an array or list. Some tools which generate schemas from a model may elect to give each vector a dedicated type including a wrapper element.

Example 8: Wrapped Vector
<xs:complexType name="ItemListType">
  <xs:sequence>
    <xs:element name="item" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
          

However, there may not always be a wrapper element. A collection may contain more than one element with a maxOccurs value greater than 1.

Example 9: Bare Vectors
 <xs:complexType name="CustomerType">
  <xs:sequence>
    <xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="addressLine" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="telephoneNumber" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
          
<customer>
    <name>Mr</name>
    <name>Benn</name>
    <addressLine>52, Festive Road</addressLine>
    <addressLine>London</addressLine>
    <addressLine>England</addressLine>
    <telephoneNumber>+44 207 946 0001</telephoneNumber>
    <telephoneNumber>+44 7700 900 001</telephoneNumber>
</customer>
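
A tool might present each repeated element of this instance as a list. The Python sketch below is one possible reading of the instance shown above, using the standard ElementTree module; the choice of a dictionary of lists is an assumption of the example.

```python
import xml.etree.ElementTree as ET

CUSTOMER = """<customer>
    <name>Mr</name>
    <name>Benn</name>
    <addressLine>52, Festive Road</addressLine>
    <addressLine>London</addressLine>
    <addressLine>England</addressLine>
    <telephoneNumber>+44 207 946 0001</telephoneNumber>
    <telephoneNumber>+44 7700 900 001</telephoneNumber>
</customer>"""

elem = ET.fromstring(CUSTOMER)

# Each bare vector becomes a list, preserving document order.
customer = {
    field: [child.text for child in elem.findall(field)]
    for field in ("name", "addressLine", "telephoneNumber")
}
```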
          

A multi-dimensional array may be built by composing collections and vectors which contain vectors. More complicated array structures such as sparse or jagged matrices are beyond the scope of this note.

2.5 Maps

A map is an unordered collection of repeated items of the same type appearing in a programming language as a "hash table", "dictionary", "associative array", "associative memory", "indexed table", "keyed data", etc. Within the map, each item is accessible by a key value, unique within the scope of the collection.

Example 10: Representing a Map

The following Perl hash variable:

%products = (
  '10203' => {
    name => 'apple',
    price => '35',
  },

  '10204' => {
    name => 'pear',
    price => '50',
  },
);
          

may be represented in XML using an attribute for the key:

<products>
  <product id="10203">
    <name>apple</name>
    <price>35</price>
  </product>
  <product id="10204">
    <name>pear</name>
    <price>50</price>
  </product>
</products>
          

or an element for the key:

<products>
   <product>
    <key>10203</key>
    <name>apple</name>
    <price>35</price>
  </product>
  <product>
    <key>10204</key>
    <name>pear</name>
    <price>50</price>
  </product>
</products>
          

For a tool to be able to recognise that a repeated item is a map, it needs to be able to identify which of the repeated elements or attributes represents the unique key.
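
Once the key has been identified, building the map is straightforward. The Python sketch below assumes the attribute form of the products document and that the tool has been told the key is the "id" attribute; products_to_map is a hypothetical helper which rejects a missing or repeated key rather than silently overwriting an entry.

```python
import xml.etree.ElementTree as ET


def products_to_map(text, key_attribute="id"):
    """Build a dictionary from repeated product elements, keyed by an attribute."""
    products = {}
    for item in ET.fromstring(text).findall("product"):
        key = item.get(key_attribute)
        if key is None or key in products:
            raise ValueError(f"missing or duplicate key: {key!r}")
        products[key] = {child.tag: child.text for child in item}
    return products


PRODUCTS = (
    "<products>"
    '<product id="10203"><name>apple</name><price>35</price></product>'
    '<product id="10204"><name>pear</name><price>50</price></product>'
    "</products>"
)
product_map = products_to_map(PRODUCTS)
```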

2.5.1 Map Keyed with xs:ID or xml:ID

The XML Schema type xs:ID and the XML Core defined [xml:id] attribute guarantee uniqueness of a value within the scope of an XML document. A repeated element with a required attribute of type xs:ID may therefore be safely represented using a map, with the xs:ID attribute as the key.

Example 11: Map Keyed Using xml:id
<xs:schema targetNamespace="http://www.w3.org/XML/1998/namespace">
 <xs:attribute name="id" type="xs:ID"/>
</xs:schema>

<xs:schema targetNamespace="http://www.openuri.org/">
 <xs:import namespace="http://www.w3.org/XML/1998/namespace"/>

 <xs:complexType name="ProductType">
   <xs:sequence>
     <xs:element name="name" type="xs:string"/>
     <xs:element name="price" type="xs:string"/>
   </xs:sequence>
   <xs:attribute ref="xml:id" use="required" />
 </xs:complexType>
  
</xs:schema>
          

2.5.2 Map Type

XML Schema provides a set of Identity Constraints, which may be used to describe a value as being unique within a portion of a document identified using an XPath expression. A repeated element with a required attribute or element with a cardinality of one constrained by xs:unique may be safely represented using a map, with the unique elements or attributes forming the key.

Editorial note: PaulD 2005-07-15
This pattern seems promising, though it is limited to an element declaration and is difficult to express within a complexType. The XPath expressions could be complex for tools to detect. Other schema identity constraints such as key/keyref suffer from similar difficulties.

Example 12: Map with xs:unique Key
<?xml version="1.0" encoding="UTF-8"?>
  <xs:element name="map">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="MapItemType" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <xs:unique name="item">
      <xs:selector xpath="item"/>
      <xs:field xpath="key"/>
    </xs:unique>
  </xs:element>
    
  <xs:complexType name="MapType">
    <xs:sequence>
      <xs:element ref="map"/>
    </xs:sequence>
  </xs:complexType>
    
  <xs:complexType name="MapItemType">
    <xs:sequence>
      <xs:element name="key" type="xs:anyType"/>
      <xs:element name="value" type="xs:anyType"/>
    </xs:sequence>
  </xs:complexType>
          

2.5.3 WSDL Instance Map Item Type

Editorial note: PaulD 2005-07-14
This base type is offered as an alternative to the generic unique pattern and follows the approach taken by Axis. The wsdli namespace has been used without the approval of the WSDL WG and is therefore subject to change.

The WSDL Instance namespace defines the MapItemType, which provides a container for a pair of elements, "key" and "value".

Example 13: WSDL Instance Defined MapItemType
<xs:schema targetNamespace="http://www.w3.org/2005/05/wsdl-instance"
  elementFormDefault="qualified" attributeFormDefault="unqualified"> 
  <xs:complexType name="MapItemType">
    <xs:sequence>
      <xs:element name="key" type="xs:anyType"/>
      <xs:element name="value" type="xs:anyType"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
          

The MapItemType may be used within a repeated element to represent a map in which the index is the "key" element, which may contain complex XML content.

Example 14: Use of the MapItemType
<xs:complexType name="MapType">
  <xs:sequence>
    <xs:element name="item" minOccurs="0" maxOccurs="unbounded" type="wsdli:MapItemType"/>
  </xs:sequence>
</xs:complexType>
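
As a rough illustration, an instance of this MapType might be read into a dictionary as in the Python sketch below. It assumes a containing element named "map", an unqualified "item" element, and keys and values with simple text content only; complex key content, which the pattern permits, is not handled. Because the wsdli schema sets elementFormDefault to "qualified", the "key" and "value" elements are looked up in the wsdli namespace.

```python
import xml.etree.ElementTree as ET

NS = {"wsdli": "http://www.w3.org/2005/05/wsdl-instance"}


def items_to_map(text):
    """Read repeated item elements into a dictionary of key/value text."""
    result = {}
    for item in ET.fromstring(text).findall("item"):
        key = item.findtext("wsdli:key", namespaces=NS)
        value = item.findtext("wsdli:value", namespaces=NS)
        if key is None or key in result:
            raise ValueError(f"missing or duplicate key: {key!r}")
        result[key] = value
    return result


# Assumed sample instance; the element name "map" and the values are illustrative.
SAMPLE = (
    '<map xmlns:wsdli="http://www.w3.org/2005/05/wsdl-instance">'
    "<item><wsdli:key>10203</wsdli:key><wsdli:value>apple</wsdli:value></item>"
    "<item><wsdli:key>10204</wsdli:key><wsdli:value>pear</wsdli:value></item>"
    "</map>"
)
item_map = items_to_map(SAMPLE)
```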
          

3 Normative References

XML Schema: Structures
XML Schema Part 1: Structures Second Edition, H. Thompson, D. Beech, M. Maloney, and N. Mendelsohn, Editors. World Wide Web Consortium Recommendation, 28 October 2004. (See http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/.)
XML Schema: Datatypes
XML Schema Part 2: Datatypes Second Edition, P. Byron and A. Malhotra, Editors. World Wide Web Consortium Recommendation, 28 October 2004. (See http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/.)
xml:id
xml:id Version 1.0, J. Marsh, D. Veillard, N. Walsh, World Wide Web Consortium Proposed Recommendation, July 2005. (See http://www.w3.org/TR/xml-id/.)
WSDL 2.0 Part 1
Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language, Roberto Chinnici, Martin Gudgin, Jean-Jacques Moreau, Jeffrey Schlimmer, Sanjiva Weerawarana, World Wide Web Consortium Working Draft 10 May 2005 (See http://www.w3.org/TR/2005/WD-wsdl20-20050510.)
XML Information Set
XML Information Set (Second Edition), J. Cowan and R. Tobin, Editors. World Wide Web Consortium Recommendation, 4 February 2004. (See http://www.w3.org/TR/2004/REC-xml-infoset-20040204/.)
IETF RFC 3986
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter, Authors. Internet Engineering Task Force, January 2005. (See http://www.ietf.org/rfc/rfc3986.txt.)
IETF RFC 2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, Author. Internet Engineering Task Force, March 1997. (See http://www.ietf.org/rfc/rfc2119.txt.)

4 Informative References

XML Schema: Primer
XML Schema Part 0: Primer Second Edition, David C. Fallside, Priscilla Walmsley, Editors. World Wide Web Consortium Recommendation, 28 October 2004. (See http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/.)
WSDL: Primer
Web Services Description Language (WSDL) Version 2.0 Part 0: Primer, David Booth, Canyang Kevin Liu, Editors. World Wide Web Consortium Editors' copy, June 2005. (See http://dev.w3.org/cvsweb/~checkout~/2002/ws/desc/wsdl20/wsdl20-primer.html?content-type=text/html;%20charset=utf-8.)

A Acknowledgements

This document has been developed by participants of the Web Services Description Working Group.