Converting non-XML formats to XML is much easier in XSLT 2.0 than in 1.0, thanks to features such as regular expression handling. My paper at XML 2004 might give you some ideas.
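
As a rough illustration of the regular expression point, here is a minimal sketch of an XSLT 2.0 stylesheet that turns a plain-text file of "key = value" lines into XML. The input format, the parameter name input-uri, the default file name input.txt, and the output element names entries and entry are all assumptions made up for this example; they are not taken from the paper.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: URI of the plain-text input file
       (a relative URI is resolved against the stylesheet's base URI) -->
  <xsl:param name="input-uri" as="xs:string" select="'input.txt'"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Entry point is a named template, since there is no XML source document -->
  <xsl:template name="main">
    <entries>
      <!-- Read the file as text and split it into lines -->
      <xsl:for-each select="tokenize(unparsed-text($input-uri), '\r?\n')">
        <!-- Match each line against "key = value"; lines that do not match are dropped -->
        <xsl:analyze-string select="." regex="^\s*([^=\s]+)\s*=\s*(.*)$">
          <xsl:matching-substring>
            <entry key="{regex-group(1)}">
              <xsl:value-of select="regex-group(2)"/>
            </entry>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:for-each>
    </entries>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet has to be run with main as the initial named template; how you request that depends on your processor. Given an input line such as "colour = red", it should produce an entry element with key="colour" and the text "red". In XSLT 1.0 the same job needs recursive templates built on substring-before() and substring-after(), which is where most of the extra effort goes.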
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference