Issue with replace function using regex
Hi,
I'm working with a CMS tool called Smartsite. I'm not that good with regular expressions yet. I found a lot of solutions for my problems,
but I'm having problems with 2 expressions.
First issue
---------------
<td style='BORDER-RIGHT: windowtext 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: windowtext 1pt solid; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0cm; BORDER-LEFT: windowtext 1pt solid; WIDTH: 426.45pt; PADDING-TOP: 0cm; BORDER-BOTTOM: windowtext 1pt solid; BACKGROUND-COLOR: transparent' valign='top' width='569' colspan='2'>
This is what the msword converter does with a td tag.
I used the expression text.replace(/style[^<]*/gi,''); and I'm left with only <td> instead of all that crap. But I'd like to keep the colspan="2" inside the td tag.
I tried a lot of expressions but none of them work.
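One possible approach, as a minimal sketch rather than a tested Smartsite solution: the pattern style[^<]* matches from "style" up to the next "<", so it also swallows every attribute that follows the style attribute, including colspan. Instead, each td tag can be rebuilt from only the attributes worth keeping. The helper name cleanTdTags is hypothetical, and the sketch assumes attribute values never contain a ">" character.

```javascript
// Hypothetical helper: rebuild every <td ...> tag so that only
// colspan/rowspan attributes survive; everything else is dropped.
// Assumption: attribute values never contain a literal ">".
function cleanTdTags(html) {
  return html.replace(/<td\b([^>]*)>/gi, function (match, attrs) {
    // Collect colspan/rowspan attributes with double-quoted,
    // single-quoted, or unquoted values.
    var kept = attrs.match(/\b(?:colspan|rowspan)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi);
    return kept ? '<td ' + kept.join(' ') + '>' : '<td>';
  });
}
```

In a script like the one in this post, such a call (text = cleanTdTags(text);) would have to run before the generic style replace, because that replace has already removed the colspan by the time any later line runs.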
Second issue
---------------------
I'm working with XHTML, so img tags need to be closed at the end, like <img src=""/>
The MS Word to HTML converter doesn't do that automatically; it leaves the image tag unclosed.
Example:
<img height="301" src="[urlprefix]Docs/daniel/ConvertedWordFile5334240.html_files/image002.jpg" width="567">
Is there a way to keep everything inside the image tag and only add a "/" to close the tag? Something like -> text = text.replace(/<img *> /gi,'<img* \/>');
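A minimal sketch of one way to do this, using a capture group to carry the existing attributes over into the replacement. The helper name closeImgTags is hypothetical, and the sketch assumes no attribute value contains a ">" character.

```javascript
// Hypothetical helper: turn <img ...> into <img ... /> while keeping
// all attributes. Tags that are already self-closed are normalised to
// the same form, so running it twice does no harm.
// Assumption: attribute values never contain a literal ">".
function closeImgTags(html) {
  // $1 holds everything between "<img" and the optional "/" + ">".
  return html.replace(/<img\b([^>]*?)\s*\/?>/gi, '<img$1 />');
}
```

Usage would be text = closeImgTags(text); anywhere in the script, as long as no earlier replace has already cut into the img tags.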
I hope you can help me; I've already been searching for 4 hours.
Thanks for helping.
Regards, Colemonts Peter
This is my code so far. It's used to convert an imported Word document to plain HTML.
-------------------------------------------------------------------------------------------------------------------------
<![CDATA[
var text = edit().getHTML();
text = text.replace(/<font[^<]*>/gi,'');
text = text.replace(/<\/font>/gi,'');
text = text.replace(/<span[^<]*>/gi,'');
text = text.replace(/<\/span>/gi,'');
text = text.replace(/<ins[^<]*>/gi,'');
text = text.replace(/<\/ins>/gi,'');
text = text.replace(/style[^<]*/gi,'');
text = text.replace(/<table[^<]*>/gi,'<table border=1>');
text = text.replace(/<p[^<]*>/gi,'<p>');
text = text.replace(/<div[^<]*>/gi,'<div>');
text = text.replace(/<h1[^<]*>/gi,'<h1>');
text = text.replace(/<h2[^<]*>/gi,'<h2>');
text = text.replace(/<h1><b[^<]*>/gi,'<h1>');
text = text.replace(/<\/b><\/h1>/gi,'</h1>');
text = text.replace(/<h2[^<]*><b>/gi,'<h2>');
text = text.replace(/<\/b><\/h2>/gi,'</h2>');
text = text.replace(/<ul[^<]*>/gi,'[list]');
text = text.replace(/<li[^<]*>/gi,'<li>');
text = text.replace(/<a[^<]*>/gi,'');
text = text.replace(/<\/a>/gi,'');
text = text.replace(/.nbsp;/gi,'');
text = text.replace(/<p>\s*<\/p>/gi,'');
text = text.replace(/\r+/g,'\r');
text = text.replace(/\n+/g,'\n');
text = text.replace(/(\r\n)+/g,'\r\n');
text = text.replace(/<p>(-|.middot;|.bull;)\s*/gi,'<p>-');
edit().setHTML(text);
]]>