Parsing converts &lt; to < and converts < to a node. Conversely, serialization converts nodes to < and converts < to &lt;. So if you want your transformation to start with &lt; and end up with <, then you either need to parse it twice and serialize it once, or to parse it once and suppress the escaping that serialization would normally apply. The first solution involves extracting the HTML as a string and (re-)parsing it to convert it into nodes, using some kind of extension function (e.g. saxon:parse in Saxon). The second solution involves using disable-output-escaping (d-o-e). d-o-e is usually frowned upon for two reasons: it's often misused, and it's not supported in all environments (for example, it doesn't work in Firefox; in fact, it doesn't work in any environment where the result tree isn't serialized). But in this case, other than redesigning the source documents, it may be the best option.
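
To make the parse/serialize symmetry concrete, here is a minimal sketch in Python, using the standard library's xml.etree.ElementTree rather than an XSLT processor. The sample document and its `description` element are made up for illustration, and the second parse only stands in for what an extension function such as saxon:parse would do inside a stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: the HTML fragment is stored as escaped text.
SOURCE = "<doc><description>&lt;b&gt;bold&lt;/b&gt;</description></doc>"

# First parse: &lt; becomes the character "<" inside a text node.
doc = ET.fromstring(SOURCE)
text = doc.find("description").text

# Serializing the tree as-is turns the "<" character back into &lt;.
escaped = ET.tostring(doc, encoding="unicode")

# Second parse: the text itself is parsed, so "<b>" becomes an element node.
fragment = ET.fromstring(text)

# One serialization of that node yields real markup.
markup = ET.tostring(fragment, encoding="unicode")

print(text)
print(escaped)
print(markup)
```

Note that the second parse only succeeds if the embedded text is well-formed XML with a single root element; arbitrary HTML often is not, which is one reason d-o-e can end up being the more practical route.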
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference