This is so obscure i hesitate to blog about it, except that it took me so long to figure out that i’d love to save somebody else the trouble. You won’t care unless:

  • You’re designing an XML Schema definition (.xsd) to validate an XML file
  • You’re defining an element that contains regular text mixed with several kinds of child elements, in any order, from zero to many times

Here’s an example: suppose you have a plain text description of events that includes people, places, and Bible references.

Jesus heals Simon’s mother-in-law (Matt 8:14-17; Mark 1:29-34; Luke 4:38-41)

You want to link person references with a Link element, Bible references with a Reference element, and otherwise leave the plain text as is. This results in something like this (using square brackets since otherwise WordPress gets confused):

[Link]Jesus[/Link] heals [Link]Simon[/Link]’s mother-in-law ([Reference]Matt 8:14-17[/Reference]; [Reference]Mark 1:29-34[/Reference]; [Reference]Luke 4:38-41[/Reference])
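With angle brackets restored, the same line as an actual XML instance might look like the sketch below. Wrapping it in a TextItem element is an assumption here, borrowing the parent element name used later in this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- TextItem as the enclosing element is assumed for illustration -->
<TextItem><Link>Jesus</Link> heals <Link>Simon</Link>’s mother-in-law (<Reference>Matt 8:14-17</Reference>; <Reference>Mark 1:29-34</Reference>; <Reference>Luke 4:38-41</Reference>)</TextItem>
```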

Now imagine several of these in the same element, so potentially you can have any arbitrary sequence of Links, References, and plain text, in any order, any number of times. Describing this with a BNF grammar is trivial:

LinkRef ::= Link | Reference
TextItem ::= ( text | LinkRef )*

A cursory reading of the XML Schema description (which i’d never actually done before, instead depending on XMLSpy, which generally lets me avoid thinking that hard) might make you think that model groups like sequence, choice, and all, combined with attributes like minOccurs and maxOccurs, would do what you need. But there’s a surprisingly complex set of interactions between these that i still don’t really understand, so what seemed simple proved surprisingly hard. Here are a few things i tried that XMLSpy’s validation of XSD files (which i’m assuming is correct) wouldn’t allow:

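  • while all is for an unordered group of elements, it’s restricted to maxOccurs=1. So it doesn’t handle unbounded occurrence (though it does allow minOccurs=0, i.e. the whole group can be optional). Furthermore, it can’t be nested inside other model groups like sequence.
  • choice groupings could be neither optional nor unbounded, at least where i tried them. (In general a choice can carry minOccurs and maxOccurs, but not when it’s the model group directly inside a named group definition; there the occurrence attributes belong on the reference to the group.)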
  • trying to specify multiple occurrences of both Link and Reference, each both optional and unbounded, is flagged as an ambiguous model.

The solution i finally discovered (after embarrassingly many other permutations, more by trial and error than anything else):

  • define a LinkRef group that allows a sequence of either Link or Reference, both optional and unbounded (zero to many occurrences)
  • the TextItem (enclosing parent) element allows an optional and unbounded sequence of LinkRef groups.
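As raw XSD, the two steps above could be sketched roughly as follows. This is one possible realization under stated assumptions, not necessarily the exact schema XMLSpy produced: Link and Reference are assumed to hold plain strings, the occurrence constraints are placed only on the group reference, and mixed="true" is an addition not mentioned above (without it, a complex type with element content doesn't permit plain text between the child elements).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumption: Link and Reference simply contain text -->
  <xs:element name="Link" type="xs:string"/>
  <xs:element name="Reference" type="xs:string"/>

  <!-- LinkRef: either a Link or a Reference.
       The model group directly inside a named group
       cannot carry minOccurs/maxOccurs itself. -->
  <xs:group name="LinkRef">
    <xs:choice>
      <xs:element ref="Link"/>
      <xs:element ref="Reference"/>
    </xs:choice>
  </xs:group>

  <!-- TextItem: zero to many LinkRef groups, in any order.
       mixed="true" (an assumption) lets plain text appear between them. -->
  <xs:element name="TextItem">
    <xs:complexType mixed="true">
      <xs:group ref="LinkRef" minOccurs="0" maxOccurs="unbounded"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Putting minOccurs="0" and maxOccurs="unbounded" on the reference to the group, rather than on each element, means every repetition picks exactly one of Link or Reference, which should sidestep the ambiguity complaint from the earlier attempts.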

For the more visually oriented, here’s how it looks in XMLSpy:

(Image: TextItem and LinkRef Grouping)