aspx_professional thread: convertoer from any document(word,rowdpad,txt,pdf,etc..) to html
Message #1 by "Sharad Parsana" <sharadparsana@y...> on Mon, 16 Apr 2001 15:27:18
This may be a little late for you, but I think this might be exactly
what you're looking for. You'll need some C++ to understand the
article, but it comes directly from Microsoft's MSDN: Adam Blum's April
1996 article titled "Building ISAPI Filters and the CVTDOC Sample".
Search for it on MSDN; it's pretty cool.
Basically, you can intercept a web request to IIS using an ISAPI filter
(written in C++). The filter checks whether the requested file already
exists as html on the server. If it does not, the filter attempts to
convert the file to html using a standard converter (e.g. Word's "Save
as HTML"), then hands the request back to IIS, which delivers the html
file. All future requests for this file go directly to the existing
html file, so the file only needs converting on the first request (or
again when the source document changes, per the description below).
Directly from Adam Blum's own explanation:
"CVTDOC is a simple ISAPI filter I wrote in response to a need from
several clients for "automatic file publishing": Generating HTML on the
fly for specific document types. CVTDOC uses the capability of an ISAPI
filter to supplement server capabilities by registering itself as
intercepting all URL map events, and then checking to see if the
document type requested is one that it knows how to convert.
The following fragment from the CVTDOC documentation (CVTDOC.DOC when
you download the sample) may explain this requirement better:
Web content creators and Webmasters often want to "publish" a document
or data file on the Web. However, it can be very inconvenient to
constantly run a conversion program to generate new HTML each time the
document or data file is updated. Relying on the Webmaster to run the
conversion program for data that is often updated is also prone to
error. If you are positive that the user has the software to display the
document in native form, no conversion is necessary, but this is
dangerous to assume. It would be great to be able to leave the document
in native form, and have the Web server (or a Web server add-in such as
CVTDOC) convert the document to HTML on the fly as needed.
CVTDOC is an Internet Services API (ISAPI) filter that dynamically
converts documents to HTML if required when the HTML file is accessed.
If the HTML document is out of date (older than the source document) or
missing, it is automatically generated from the ISAPI filter, based on
"conversion programs" registered for the source document type in the
Registry. I provide sample conversion programs for Word documents,
Microsoft® Excel spreadsheets, and text files, but it's important to
remember that this can be used for any document type. The primary
purpose of CVTDOC is to demonstrate the powerful capabilities of ISAPI
filters. Nevertheless, I think you will find it useful in its own right.
The following section describes in detail how the filter was
constructed. It's relatively easy to lose the forest through the
trees here: A quick glance at Using CVTDOC on installing and using the
filter (both of which are really quite simple) may help avoid any
disorientation as you plow through the minutiae of how this was built.
"
The sample is already done, so it's also a nice way to figure out ISAPI
programming.
Good luck,
Bruce Pezzlo, MCSD
-----Original Message-----
From: Sharad Parsana [mailto:sharadparsana@y...]
Sent: Monday, April 16, 2001 11:27 AM
To: ASP_Professional
Subject: [aspx_professional] convertoer from any
document(word,rowdpad,txt,pdf,etc..) to html
I am having a problem finding a converter that will convert any
document (Word file, WordPad file, PDF file, etc.) to an html file.
It is compulsory that the content of the html file be the same as in
the document file. Content means things like spacing, italic, bold, etc.
Can anybody help me with this?
What can I do for this?
Thanks
Sharad