Question and Answer DTD

ROCXY · November 28th, 2008, 07:23 AM

Dear All,

I have come across the following case while developing an XML project.
My client will provide printed books or PDF files of a question-and-answer unit book that covers all types of questions, e.g. multiple choice, fill in the blanks, summary, etc. Put simply, it is a book full of math problems and answer keys. The client wants us to publish it on the web in an interactive format using XML.

These are the points on which I would like your advice:

1. Does converting the PDF to interactive XML for delivery via a web page sound like a good idea or not?
2. Which XML standard would be suitable for this? If possible, please send me a link to a DTD or some XML samples.

Any help would be greatly appreciated.

Thanks,
Rocxy

mhkay · November 28th, 2008, 07:55 AM

PDF is highly variable in how easy it is to reverse-engineer; it depends on how it was created. Sometimes it's just scanned images. There are tools for turning it into something more usable, but sometimes they give no better results than you could get by OCR scanning the printed pages. Ask for some samples and do some experiments before you commit to your cost estimates. Preferably, get the source documents from which the PDF was produced; they will be much easier to convert.

As for the XML standard to use, I think it would be best to design your own. Use MathML for the maths part perhaps, and you could base the rest on something like DocBook, but I suspect you'll have more flexibility if you define your own schema rather than trying to use something off the shelf.
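
To make the "design your own" suggestion concrete, here is a minimal sketch of what a home-grown question-and-answer vocabulary might look like, together with a few lines of Python that read the answer key out of it. Every element and attribute name (unit, question, prompt, option, answer, type, correct) is invented for illustration and comes from no published DTD; the prompts are plain text here, though a real schema could allow MathML inside them.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: all element/attribute names are assumptions,
# not part of any existing standard.
SAMPLE = """<unit title="Fractions">
  <question id="q1" type="multiple-choice">
    <prompt>What is 1/2 + 1/4?</prompt>
    <option id="a">1/6</option>
    <option id="b" correct="true">3/4</option>
    <option id="c">2/6</option>
  </question>
  <question id="q2" type="fill-in">
    <prompt>3 x 4 = ____</prompt>
    <answer>12</answer>
  </question>
</unit>"""


def load_answer_key(xml_text):
    """Return a dict mapping question id to the expected response."""
    root = ET.fromstring(xml_text)
    key = {}
    for question in root.findall("question"):
        if question.get("type") == "multiple-choice":
            # The expected response is the id of the option marked correct.
            correct = question.find("option[@correct='true']")
            key[question.get("id")] = correct.get("id")
        else:
            # Fill-in questions carry the expected text in <answer>.
            key[question.get("id")] = question.findtext("answer").strip()
    return key


def check(key, question_id, response):
    """Compare a reader's response with the key, ignoring outer whitespace."""
    return key[question_id] == response.strip()


ANSWER_KEY = load_answer_key(SAMPLE)
```

A web front end could call something like check(ANSWER_KEY, "q2", user_input) when the reader submits an answer. Keeping the key in the same document as the questions is only one possible design; a separate key file would work equally well.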

Michael Kay
http://www.saxonica.com/
Author, XSLT 2.0 and XPath 2.0 Programmer's Reference

ROCXY · November 29th, 2008, 01:15 AM

Dear Kay,

Thanks for your valuable input. I will ask for the PDF with the text layer behind the page image, or the composite one.

Thanks!