I wish I could help! It's not easy.
I think the golden rule is that if the code is clear and well-written then there will be much less need for documentation. One aspect of this is to split the transformation into small and comprehensible units, and run them in a pipeline. Part of the documentation (as well as a regression test aid) is then an annotated schema for every intermediate document that passes from one transformation to the next down the pipeline.
You can then get the situation where the pipeline itself is quite complex, and where you need a top-level system description that explains what each of the steps in the pipeline does. But if you've coded it well, the individual stylesheets should be fairly self-explanatory.
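As a rough illustration of the pipeline idea, here is a minimal sketch in Python rather than XSLT. Everything in it is hypothetical: the step names (strip_whitespace, number_items), the element names, and the check functions, which are only stand-ins for validating each intermediate document against its annotated schema. In a real XSLT pipeline each step would be a stylesheet and each check a schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical step 1: trim the text of every <item>.
def strip_whitespace(doc):
    out = ET.Element("items")
    for item in doc.findall("item"):
        ET.SubElement(out, "item").text = (item.text or "").strip()
    return out

# Hypothetical step 2: add a position attribute "n" to every <item>.
def number_items(doc):
    out = ET.Element("items")
    for pos, item in enumerate(doc.findall("item"), start=1):
        ET.SubElement(out, "item", {"n": str(pos)}).text = item.text
    return out

# Stand-ins for schema validation of each intermediate document.
def check_trimmed(doc):
    assert doc.tag == "items"
    assert all(i.text == i.text.strip() for i in doc.findall("item"))

def check_numbered(doc):
    assert all(i.get("n", "").isdigit() for i in doc.findall("item"))

# The pipeline pairs each small step with the check on its output.
PIPELINE = [
    (strip_whitespace, check_trimmed),
    (number_items, check_numbered),
]

def run_pipeline(doc):
    for step, check in PIPELINE:
        doc = step(doc)
        check(doc)  # doubles as a regression test on the intermediate form
    return doc

source = ET.fromstring("<items><item> a </item><item>b </item></items>")
result = run_pipeline(source)
```

The PIPELINE list also serves as a starting point for the top-level description: it names every step and the contract that its output has to meet.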
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference