How to import data from HTML

dude153 · December 7th, 2007, 04:37 PM

In XSLT, is there any way I can import data from an HTML page?

mhkay · December 7th, 2007, 07:12 PM

If it's XHTML, you can simply read it using the document() function.
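As a minimal sketch of the document() approach (the file name page.xhtml and the output element names are hypothetical, chosen only for illustration), the stylesheet below lists every link in an XHTML page. The point to watch is that XHTML elements live in the http://www.w3.org/1999/xhtml namespace, so path expressions need a prefix bound to it; an unprefixed //a would select nothing.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="h">

  <!-- Assumption: page.xhtml is a well-formed XHTML file; the relative
       URI is resolved against the location of this stylesheet. -->
  <xsl:variable name="page" select="document('page.xhtml')"/>

  <xsl:template match="/">
    <links>
      <!-- The h: prefix is required because XHTML elements are namespaced. -->
      <xsl:for-each select="$page//h:a[@href]">
        <link href="{@href}">
          <xsl:value-of select="normalize-space(.)"/>
        </link>
      </xsl:for-each>
    </links>
  </xsl:template>

</xsl:stylesheet>
```

If the page carries a DOCTYPE declaration, the XML parser may try to fetch the referenced DTD, so the read can be slow or fail when offline.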

If it's not, you have to convert it to XML. One way to do that is to read it on the fly using John Cowan's TagSoup parser, which presents it to the XSLT application as if it had been XML all along. Another way is to do a static conversion using the JTidy utility. Finally, XSLT 2.0 allows you to read HTML as text (without conversion to XML) using the unparsed-text() function - but of course, as text, it's not so easy to manipulate.
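For the unparsed-text() route, a rough XSLT 2.0 sketch might look like the following. The file name page.html and the iso-8859-1 encoding are assumptions, and pulling out the title with a regular expression is just one illustration of how limited text-level processing is compared with working on a real tree.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: page.html sits next to the stylesheet and is encoded
       in iso-8859-1. The result is a plain string, not a node tree. -->
  <xsl:variable name="html"
                select="unparsed-text('page.html', 'iso-8859-1')"/>

  <!-- Apply to any source document; its content is ignored. -->
  <xsl:template match="/">
    <!-- flags: i = case-insensitive, s = dot also matches newlines -->
    <xsl:analyze-string select="$html"
                        regex="&lt;title&gt;(.*?)&lt;/title&gt;"
                        flags="is">
      <xsl:matching-substring>
        <xsl:value-of select="normalize-space(regex-group(1))"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

This is fine for grabbing one or two values, but for anything structural it is usually easier to convert the HTML to XML first with TagSoup or JTidy and then use ordinary path expressions.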

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference