You seem to be having trouble knowing where to start, but it's difficult to help you because you've told us so little about the problem. If I were you I would spend a little time reading first: my XSLT Programmer's Reference book, for example, has lots of worked examples.
There are two particular aspects raised by your problem:
(a) your input is HTML, but XSLT requires XML. There are many ways to convert HTML to XML (for example, using JTidy). The right choice depends a lot on how "clean" the HTML is to start with, and on how you want to deploy this application (is it a one-off, or a regular production job?)
(b) your transformation needs to read multiple input files. That's not difficult: you can read them using the document() function or, in XSLT 2.0, the collection() function. The details depend on how you want to supply the list of files. Are you processing all the files in a directory? Or are you following hyperlinks from one document to another?
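
To make (b) a little more concrete, here is a minimal sketch of the document() approach. It assumes the HTML has already been converted to well-formed XML, and that you supply the list of files as a small XML document of your own. The file name files.xml, the elements files/file, the href attribute, and the output elements combined/source are all made up for illustration; they are not part of any standard. The list document might look like <files><file href="page1.xml"/><file href="page2.xml"/></files>, and you would run the stylesheet with that list as the principal input.

```xslt
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- The principal input is the hypothetical list document (files.xml). -->
  <xsl:template match="/files">
    <combined>
      <xsl:for-each select="file">
        <!-- One wrapper element per input file, recording where it came from. -->
        <source href="{@href}">
          <!-- document() loads the file named by @href. A relative URI is
               resolved against the base URI of the attribute node, which
               is normally the location of the list document itself. -->
          <xsl:copy-of select="document(@href)/*"/>
        </source>
      </xsl:for-each>
    </combined>
  </xsl:template>

</xsl:stylesheet>
```

In a real transformation you would probably replace the xsl:copy-of with xsl:apply-templates so that each loaded document is processed by your own template rules. If you would rather process every file in a directory, collection() in XSLT 2.0 is the alternative, but how the collection URI is interpreted depends on the processor, so check its documentation.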
Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference