Wrox Programmer Forums
XSLT General questions and answers about XSLT. For issues strictly specific to the book XSLT 1.1 Programmers Reference, please post to that forum instead.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XSLT section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
August 22nd, 2009, 11:10 PM

That works, thanks.
 
August 23rd, 2009, 04:22 AM
Executive officers of Kellogg company

Hello,

I now have to find the executive officers of the Kellogg company, and they are no longer in one nice table. They are spread out over many tables and divs in a repeating pattern.

If you search the document for "James M. Jenness", you will see what I mean.

I was able to take the code and advice that you provided and adapt it for two other documents, one for Microsoft and the other for 3M.

This is the Kellogg link:

http://www.sec.gov/Archives/edgar/da...47381e10vk.htm

Any help is always appreciated.

Regards.
 
August 23rd, 2009, 08:43 AM

You will need to look at the document structure and find the XPath expressions that select the elements containing the data you are looking for. I am afraid that with such an irregular structure there is not much you can automate.
The code
Code:
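  <!-- Entry point. $f and the d:htmlparse extension function are declared elsewhere in the stylesheet. -->
  <xsl:template name="main">
    <!-- Read the file as ISO-8859-1 text and parse it into a node tree -->
    <xsl:variable name="html-doc" select="d:htmlparse(unparsed-text($f, 'ISO-8859-1'))"/>

    <results>
      <!-- Second row of each table whose parent div has a child div containing 'Executive Officers.' -->
      <xsl:apply-templates select="$html-doc/document/type/sequence/filename/description/text/html/body//div[div[contains(normalize-space(), 'Executive Officers.')]]/table/tr[2]"/>
    </results>
  </xsl:template>

  <!-- One person per row: name and age from the first two cells, title from the first div after the table -->
  <xsl:template match="tr">
    <person name="{normalize-space(td[1])}" age="{normalize-space(td[2])}" title="{normalize-space(parent::table/following-sibling::div[1])}"/>
  </xsl:template>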
finds only two items:
Code:
<results>
   <person name="James M. Jenness" age="62" title="Chairman of the Board"/>
   <person name="A. D. David Mackay" age="53"
           title="President and Chief Executive Officer"/>
</results>
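If the other officer tables in the filing use the same row layout, one way to pick up more of them might be to drop the heading test and select every row whose second cell is purely numeric. The stylesheet below is only a minimal sketch under that assumption. It is plain XSLT 1.0, it expects an already parsed, well-formed input rather than the raw filing, and the numeric-age test is a heuristic introduced here for illustration, not something checked against the Kellogg document.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <results>
      <!-- Assumption: an officer row has a digits-only second cell (the age).
           translate() deletes every digit; an empty remainder means the
           cell held nothing but digits. Header rows such as "Age" fail
           this test and are skipped. -->
      <xsl:apply-templates
          select="//table/tr[td[2][normalize-space() != ''
                  and translate(normalize-space(), '0123456789', '') = '']]"/>
    </results>
  </xsl:template>

  <!-- Same row template as above: the title is assumed to sit in the
       first div that follows the table. -->
  <xsl:template match="tr">
    <person name="{normalize-space(td[1])}"
            age="{normalize-space(td[2])}"
            title="{normalize-space(parent::table/following-sibling::div[1])}"/>
  </xsl:template>

</xsl:stylesheet>
```

Any other table row with a number in its second cell would match as well, so the select expression would probably need narrowing once the real structure of the document is known.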
__________________
Martin Honnen
Microsoft MVP (XML, Data Platform Development) 2005/04 - 2013/03










Copyright (c) 2020 John Wiley & Sons, Inc.