Wrox Programmer Forums

  #1
May 25th, 2009, 04:04 AM
Authorized User
 
Join Date: Apr 2009
Posts: 31
Thanks: 0
Thanked 0 Times in 0 Posts
Count and replace

Hi,

XML:
Code:
<tableset>
<table id="acprof-9780199226009-table-1" frame="none">
<tgroup cols="4">
<colspec colnum="1" colname="col1"/>
<colspec colnum="2" colname="col2"/>
<colspec colnum="3" colname="col3"/>
<colspec colnum="4" colname="col4"/>
<thead>
<row rowsep="1">
<entry colname="col1"/>
<entry colname="col2" align="center">
<p>
<b>UK</b>
</p>
</entry>
<entry colname="col3" align="center">
<p>
<b>France</b>
</p>
</entry>
<entry colname="col4" align="center">
<p>
<b>Germany</b>
</p>
</entry>
</row>
</thead>
<tbody>
<row>
<entry colname="col1" align="left">
<p>Industry</p>
</entry>
<entry colname="col2" align="char" char=".">
<p>17.5</p>
</entry>
<entry colname="col3" align="char" char=".">
<p>14.4</p>
</entry>
<entry colname="col4" align="char" char=".">
<p>25.4</p>
</entry>
</row>

</tbody>
</tgroup>
</table>
<table id="acprof-9780199226009-table-1" frame="none">
<tgroup cols="4">
<colspec colnum="1" colname="col1"/>
<colspec colnum="2" colname="col2"/>
<colspec colnum="3" colname="col3"/>
<colspec colnum="4" colname="col4"/>
<thead>
<row rowsep="1">
<entry colname="col1"/>
<entry colname="col2" align="center">
<p>
<b>UK</b>
</p>
</entry>
<entry colname="col3" align="center">
<p>
<b>France</b>
</p>
</entry>
<entry colname="col4" align="center">
<p>
<b>Germany</b>
</p>
</entry>
</row>
</thead>
<tbody>
<row>
<entry colname="col1" align="left">
<p>Industry</p>
</entry>
<entry colname="col2" align="char" char=".">
<p>17.5</p>
</entry>
<entry colname="col3" align="char" char=".">
<p>14.4</p>
</entry>
<entry colname="col4" align="char" char=".">
<p>25.4</p>
</entry>
</row>

<row>
<entry colname="col1" align="left">
<p>Industry</p>
</entry>
<entry colname="col2" align="char" char=".">
<p>17.5</p>
</entry>
<entry colname="col3" align="char" char=".">
<p>14.4</p>
</entry>
<entry colname="col4" align="char" char=".">
<p>25.4</p>
</entry>
</row>
</tbody>
</tgroup>
</table>
</tableset>

Required output XML:

Code:
<tableset>
<table id="acprof-9780199226009-table-1" totalrow="2" frame="none">
<tgroup cols="4">
<colspec colnum="1" colname="col1"/>
<colspec colnum="2" colname="col2"/>
<colspec colnum="3" colname="col3"/>
<colspec colnum="4" colname="col4"/>
<thead>
<row rowsep="1">
<entry colname="col1"/>
<entry colname="col2" align="center">
<p>
<b>UK</b>
</p>
</entry>
<entry colname="col3" align="center">
<p>
<b>France</b>
</p>
</entry>
<entry colname="col4" align="center">
<p>
<b>Germany</b>
</p>
</entry>
</row>
</thead>
<tbody>
<row>
<entry colname="col1" align="left">
<p>Industry</p>
</entry>
<entry colname="col2" align="char" char=".">
<p>17.5</p>
</entry>
<entry colname="col3" align="char" char=".">
<p>14.4</p>
</entry>
<entry colname="col4" align="char" char=".">
<p>25.4</p>
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="acprof-9780199226009-table-1" totalrow="3" frame="none">
<tgroup cols="4">
<colspec colnum="1" colname="col1"/>
<colspec colnum="2" colname="col2"/>
<colspec colnum="3" colname="col3"/>
<colspec colnum="4" colname="col4"/>
<thead>
<row rowsep="1">
<entry colname="col1"/>
<entry colname="col2" align="center">
<p>
<b>UK</b>
</p>
</entry>
<entry colname="col3" align="center">
<p>
<b>France</b>
</p>
</entry>
<entry colname="col4" align="center">
<p>
<b>Germany</b>
</p>
</entry>
</row>
</thead>
<tbody>
<row>
<entry colname="col1" align="left">
<p>Industry</p>
</entry>
<entry colname="col2" align="char" char=".">
<p>17.5</p>
</entry>
<entry colname="col3" align="char" char=".">
<p>14.4</p>
</entry>
<entry colname="col4" align="char" char=".">
<p>25.4</p>
</entry>
</row>
<row>
<entry colname="col1" align="left">
<p>Industry</p>
</entry>
<entry colname="col2" align="char" char=".">
<p>17.5</p>
</entry>
<entry colname="col3" align="char" char=".">
<p>14.4</p>
</entry>
<entry colname="col4" align="char" char=".">
<p>25.4</p>
</entry>
</row>
</tbody>
</tgroup>
</table>
</tableset>

I need to count the <row> tags inside each <table> and put that count into an attribute on the table tag, as in totalrow="2" in the required output above.

Can anyone help me count these elements in a Perl script?

Regards,
Nagaraj
  #2
June 18th, 2009, 12:50 PM
Friend of Wrox
 
Join Date: Dec 2003
Location: Oxford, United Kingdom
Posts: 488
Thanks: 0
Thanked 3 Times in 3 Posts

HTML::Parser could do this. There's a pretty good tutorial; I don't have the link, but you should be able to google it. Otherwise, read the file into an array and parse the array backwards, so that you know how many row elements you've hit by the time you get to the table declaration.
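
A minimal sketch of the second suggestion (walking the array backwards), assuming each tag sits on its own line as in the sample and that id is the first attribute of <table>. The sub name add_totalrow and the sample lines are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of the file and returns them with
# totalrow="N" inserted after the id attribute of each <table> tag.
# Assumes one tag per line and that id is the first attribute of <table>.
sub add_totalrow {
    my @lines = @_;
    my $rows  = 0;    # <row> start tags seen since the last <table>, walking upwards
    for (my $i = $#lines; $i >= 0; $i--) {
        if ($lines[$i] =~ /<row[\s>]/) {
            # Matches both <row> and <row rowsep="1">, but not </row>.
            $rows++;
        }
        elsif ($lines[$i] =~ /<table\s+id="/) {
            # Everything counted so far belongs to this table.
            $lines[$i] =~ s/(<table\s+id="[^"]*")/$1 totalrow="$rows"/;
            $rows = 0;
        }
    }
    return @lines;
}

# Illustrative input only.
my @sample = (
    '<table id="t1" frame="none">',
    '<row rowsep="1">',
    '</row>',
    '<row>',
    '</row>',
    '</table>',
);
print "$_\n" for add_totalrow(@sample);
```

Because the loop runs from the last line to the first, the count is already complete when the <table> line is reached, so a single pass is enough.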
__________________
--
Charlie Harvey's website - linux, perl, java, anarchism and punk rock: http://charlieharvey.org.uk
  #3
June 19th, 2009, 12:52 PM
Authorized User
 
Join Date: Apr 2009
Posts: 31
Thanks: 0
Thanked 0 Times in 0 Posts

Hi Cider,

Code:
#!/usr/bin/perl

use strict;

my $directory = "D:/Nagaraj/oup/OUP-Brenkert/Preprocessed";

use File::Find;
use strict;

my $count;
my $s;
my $nb;
my $cou;
my $i;
my $ss;
my $tag;

#my $directory = "";

find (\&process, $directory);

sub process
{
    my @outLines;  #Data we are going to output
    my $Replace;   #Data we are reading line by line

    #print "processing $_ / $File::Find::name\n";

    # Only parse files that end in .xml
    if ( ($File::Find::name =~ /\.XML$/) || ($File::Find::name =~ /\.xml$/)) {

        open (FILE, $File::Find::name ) or die "Cannot open file: $!";

        while ( $Replace = <FILE> ) {

            $cou =~ s/<tableGroup([^>]+)<\/tableGroup>/<tableGroup$1<\/tableGroup>/gi;
            for($i=0;$i<=$cou;$i++)
            {
                if(/(<tableGroup([^>]+)<\/tableGroup>)/)
                {
                    $ss=$1;
                    $count = ($ss =~ s/<row/<row/g);
                    $Replace =~ s/<tgroup cols\="(.*?)">/<tgroup cols\="$1"><SPiTable><SPiTable-body xmlns\:aid\="http:\/\/ns.adobe.com\/AdobeInDesign\/4.0\/" aid\:table\="table" aid\:trows\="$count" aid:tcols="$1">/g;
                    print $count;
                }

            }
            #$Replace =~ s/&ndash;/&#x2013;/g;
            push(@outLines, $Replace);
        }

        close FILE;
        open ( OUTFILE, ">$File::Find::name" ) or
            die "Cannot open file: $!";

        print ( OUTFILE @outLines );
        close ( OUTFILE );

        undef( @outLines );
    }
}
I tried my best with the Perl script above, but it does not count the rows or replace the string.

Could you correct it?
  #4
August 20th, 2009, 12:55 AM
Registered User
 
Join Date: Aug 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts

This will do what you have asked.
Code:
use strict;
use warnings;
# Define some file names.
my $infile="input.html";
my $outfile="output.html";

# Open the input file.
open (my $in, '<', $infile) or die "Cannot open $infile: $!\n";
my @input=<$in>;    # This can be memory hungry if you have a large file, but it's quick-n-dirty enough for the sake of a demo.
close $in;
chomp @input;       # Strip the newlines; we add them back when printing.

# Now parse it.
my %row_counts;     # This is where we will keep the count. Keyed on the line where we found the table start.
my $current_id;     # We keep track of the line we found the current table on.
# A C-style loop over the input.
for (my $i=0; $i<scalar(@input); $i++) {
    if ($input[$i]=~m/<table\s+id="/) {
        # We have a match. We key on the line index rather than the ID,
        # because IDs may repeat (they do in the sample input).
        $current_id=$i;
        $row_counts{$current_id}=0;
    }
    # Is this a "<row>" or "<row ...>" line inside a table?
    if (defined $current_id && $input[$i]=~m/<row[\s>]/) {
        # Yes, so increment the count for the current table.
        $row_counts{$current_id}++;
    }
}
# We now have a hash which holds all our row counts. The key of the hash is the line where the table started.
# Now write out the entire array, but whenever we have a line where a table starts, insert the count.
open (my $out, '>', $outfile) or die "Cannot open $outfile: $!\n";
for (my $i=0; $i<scalar(@input); $i++) {
    if (exists $row_counts{$i}) {
        # We are looking at a line where a table starts. We need to insert the row count.
        # Put totalrow="N" straight after the id attribute and leave the rest of the tag alone.
        (my $line=$input[$i])=~s/(<table\s+id="[^"]*")/$1 totalrow="$row_counts{$i}"/;
        print $out "$line\n";
    } else {
        print $out "$input[$i]\n";
    }
}
close $out;

print "Result:\n";
foreach (sort { $a <=> $b } keys(%row_counts)) {
    print "$_=>$row_counts{$_}\n";
}
The comments should be self-explanatory.
It does, however, make a few assumptions:

It assumes that your HTML/XML is well formed.
It assumes that your input is laid out in quite a rigid fashion: one tag per line, with id as the first attribute of each <table> tag (the substitution that inserts totalrow relies on this).

It needs some degree of work, but it does what you want.
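
One way to relax the one-tag-per-line assumption is to treat the whole file as a single string. The following is only a sketch under its own assumptions: tables are not nested, no <table> tag already carries totalrow, and the sub name add_totalrow_attr and the sample string are invented for illustration. It appends totalrow after the existing attributes instead of straight after id; attribute order is not significant in XML.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the whole document as one string and returns it
# with totalrow="N" added to every <table> start tag.
# Assumes tables are not nested and none already has a totalrow attribute.
sub add_totalrow_attr {
    my ($xml) = @_;
    # /s lets .*? cross newlines; /e evaluates the replacement as Perl code.
    $xml =~ s{<table\b([^>]*)>(.*?)</table>}{
        my ($attrs, $body) = ($1, $2);
        # Count every <row> or <row ...> start tag in this table.
        my $count = () = $body =~ /<row\b/g;
        qq{<table$attrs totalrow="$count">$body</table>};
    }gse;
    return $xml;
}

# Illustrative input only: tags share lines and the table spans several lines.
my $sample = qq{<tableset><table id="t1" frame="none"><tgroup>\n}
           . qq{<row rowsep="1"></row><row></row>\n}
           . qq{</tgroup></table></tableset>};
print add_totalrow_attr($sample), "\n";
```

The \b after table keeps the pattern from matching <tableset>. For anything less regular than the sample, a real XML parser would be the safer choice.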
All times are GMT -4.