Wrox Programmer Forums
HTML Code Clinic Do you have some HTML code you'd like to share and get suggestions from others for tweaking or improving it? This discussion is the place.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the HTML Code Clinic section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
August 20th, 2008, 11:39 AM
Friend of Wrox
 
Join Date: Jun 2007
Posts: 477
Thanks: 10
Thanked 19 Times in 18 Posts
Conversion to HTML

I was wondering if anyone has had success converting .doc files and PDFs into HTML. Obviously we are not thrilled with Word's "Save as Web Page..." option, and we'd prefer to automate the process rather than having someone hand-code everything from scratch. If we have to review the output and make edits, that's fine, but something that spits out a good approximation of clean, valid XHTML would be great.

A secondary question: is converted HTML how you store information for web retrieval on a corporate intranet and in similar applications?


-------------------------

Whatever you can do or dream you can, begin it. Boldness has genius, power and magic in it. Begin it now.
-Johann von Goethe

When Two Hearts Race... Both Win.
-Dove Chocolate Wrapper

Chroniclemaster1, Founder of www.EarthChronicle.com
A Growing History of our Planet, by our Planet, for our Planet.
 
August 20th, 2008, 01:13 PM
Friend of Wrox
 
Join Date: Jun 2008
Posts: 1,649
Thanks: 3
Thanked 141 Times in 140 Posts

You know, you *can* "script" MS Word.

In VBScript code running in ASP you can do
     Set word = Server.CreateObject("Word.Application")
(in a standalone script there is no Server object, so call plain CreateObject instead) and then use that object reference to access properties and methods of Word, just as you would do using VBA within Word.

So you *could* use that to load up a ".doc" file and then do a "save as" to HTML. Thus, as you said, automating the process.

It's not great (I'm given to understatement) but it can work. Not my cup of tea, but I've had friends who have done it.
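
For what it's worth, here is a minimal sketch of that approach as a standalone VBScript file, assuming Word is installed on the machine that runs it. The two file paths are hypothetical placeholders, and the numeric constants are Word's WdSaveFormat / WdSaveOptions values, declared by hand because VBScript cannot see Word's type library. Treat it as a starting point rather than a tested converter.

```vbscript
Option Explicit

' Word enum values, declared manually for late-bound scripting.
Const wdFormatFilteredHTML = 10   ' "Web Page, Filtered": less Office-specific markup
Const wdDoNotSaveChanges = 0

Dim srcPath, dstPath, word, doc

' Hypothetical paths: replace with real ones.
srcPath = "C:\docs\report.doc"
dstPath = "C:\docs\report.html"

' Standalone script, so plain CreateObject (Server.CreateObject in ASP).
Set word = CreateObject("Word.Application")
word.Visible = False

' Open the document, save a copy as filtered HTML, close without
' touching the original, then shut Word down.
Set doc = word.Documents.Open(srcPath)
doc.SaveAs dstPath, wdFormatFilteredHTML
doc.Close wdDoNotSaveChanges
word.Quit

Set doc = Nothing
Set word = Nothing
```

Run it with something like "cscript convert.vbs" (the file name is just an example). Using 8 (wdFormatHTML) instead of 10 gives the same output as the regular "Save as Web Page..." option, so the filtered format is probably the closer of the two to what you want, though it still won't be clean valid XHTML without review.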

As for PDF: There's a product called "AspPdf" out there that will allow you to manipulate PDF files via ASP or ASP.NET and I *think* that it will allow a "save as" to HTML, as well. You could google them to see.
 
August 25th, 2008, 02:36 AM
Friend of Wrox
 
Join Date: Jun 2007
Posts: 477
Thanks: 10
Thanked 19 Times in 18 Posts

Thanks for the tips!











Powered by vBulletin®
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
Copyright (c) 2020 John Wiley & Sons, Inc.