I would like some advice on the best-practice approach to
representing a hierarchy in XML. Let me explain with an example of the two
different approaches I have seen.
In the first example, all of the data in the document is enclosed within what
might be an ambiguous <categories> node, although on some occasions this node
may need attributes of its own. Each individual <category> node within it
then has its own <categories> node, which in turn holds the child
<category> nodes.
In the second example, the <categories> nodes are eliminated and hierarchy
is represented simply by there being <category> nodes within <category>
nodes.
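As a rough illustration of the two shapes described above (the category names and the `name` attribute are invented for this sketch and are not taken from any real document), the first approach, with a <categories> wrapper at every level, might look like this:

```xml
<!-- Approach 1: every group of <category> nodes sits inside a <categories> wrapper. -->
<categories>
  <category name="Books">
    <categories>
      <category name="Fiction"/>
      <category name="Reference"/>
    </categories>
  </category>
  <category name="Music"/>
</categories>
```

The second approach drops the wrappers and nests <category> directly inside <category>. Because an XML document needs a single root element, this sketch assumes one top-level category called "All":

```xml
<!-- Approach 2: hierarchy expressed purely by nesting <category> in <category>. -->
<category name="All">
  <category name="Books">
    <category name="Fiction"/>
    <category name="Reference"/>
  </category>
  <category name="Music"/>
</category>
```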
What I would like guidance on is which approach would be considered
"best-practice", or whether both are valid in different circumstances.
Additionally, does anyone have good references where I could research these
XML design issues further? Are there any ramifications for writing an
XSD schema for such a document?
Could you be less clear? It's so illuminating that I have to put sunglasses on!
If I understand what you're trying to say, my advice would be to abstract as much as you can. A <categories> element enclosing many <category> elements seems fine to me. However, I don't think it's a great idea to allow <categories> elements and <category> elements to be siblings. IMHO this is an abstraction error. But you can do whatever you want, as long as it fits your needs ;o)
Good luck!
Dijkstra's law on Programming and Inertia:
If you don't know what your program is supposed to do, don't try to write it.
Your example 2 gives a flatter hierarchy and would generally be the better and more compact of the two (each extra tag adds size and processing overhead).
However, you may want to include collection elements (such as your <categories>) if you need to group sub-elements because:
- you need to qualify some of the sub-elements by a common attribute (or multiple common sub-elements), or
- you want to be able to visually inspect the file (in IE say) and a jumble of sub-elements is too confusing.
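A small sketch of the first case, where the collection element earns its place by carrying something common to all of its children. The `sort` attribute and the category names are hypothetical, chosen only to show the idea:

```xml
<category name="Books">
  <!-- The wrapper holds an attribute that applies to every child it groups,
       so it does not have to be repeated on each <category>. -->
  <categories sort="alphabetical">
    <category name="Fiction"/>
    <category name="Reference"/>
  </categories>
</category>
```

Without a wrapper, that shared attribute would have to live either on the parent <category>, where it could be confused with the parent's own properties, or on every child.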
I find that one helpful method for analysing data structures is the one used for normalising relational database structures.
The object-oriented inheritance-tree design process helps too. That is:
- Bubble commonality as high up the tree as you can, and
- Sediment differences as low in the tree as you can
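On the XSD question from the original post: either shape can be described in a schema, because a named complex type is allowed to refer to itself. A minimal sketch for the nested shape, assuming a hypothetical required `name` attribute and a type name (`CategoryType`) invented for this example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The document root is a single <category>. -->
  <xs:element name="category" type="CategoryType"/>

  <!-- Recursive type: a category may contain any number of child categories,
       each of which is again a CategoryType. -->
  <xs:complexType name="CategoryType">
    <xs:sequence>
      <xs:element name="category" type="CategoryType"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="name" type="xs:string" use="required"/>
  </xs:complexType>

</xs:schema>
```

The wrapper shape works the same way with two types that refer to each other: one for <categories> holding <category> elements, and one for <category> holding an optional <categories> element. The wrapper version costs one extra type but gives you a natural place to declare any attributes that belong to the group as a whole.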