#4
October 16th, 2007, 08:51 AM
cfflexguy
Registered User
 
Join Date: Oct 2007
Posts: 3

there are a few issues that i need to address.

1) you are correct that i'm dealing with overlapping markup structures. one reason i'm using markers for section / chapter / verse is that, throughout the text, some verses span a chapter boundary, and sections can span several chapters or several verses within a chapter. on the xhtml side, i really don't think i need to know much more than where a section, chapter, or verse begins... so i opted for markers instead of containers because of the overlap / span issues.

2) i need to store the text in smaller chunks, so i've considered letting the sections act as containers for that. the idea is that sections overlap far less often than chapters / verses do, and i can handle those cases as i bump into them by editing the xml data manually.

3) all of this text comes from the publishers' proprietary-format files. none of it is in xml to start with. i've already done a lot of work moving those files to xml using regex and string manipulation. *fun*

-------------------------

so the structure 'should' be:

section(zs)/chapter(c)/verse(v)/content(q)/...

<q> is the tag used for poetry... the 'level' attribute is the indentation level of the line. i'm using css classes to handle this in the xhtml after the transform. not all of the text contains poetry, so not all of it will have a <q> tag... but for psalms, it is what is used most often.

there is further inline markup inside the <q> tag for extra formatting, as well as things like footnotes that fall within the text.

this is an example with the <nd> tag... which is 'name of deity':

<q level='2'>and against <nd>his</nd> anointed one.</q>

the 'id' attributes within the xml are different depending on the tag. the id for <zs> (section) is a number that i've applied to the section to identify it with a row in a table within my database. each section is a range of verses.

the ids for chapter (c) and verse (v) are the actual chapter and verse numbers. i could have used:

<c>1<v>1<q level="1">content</q></v><v>2<q level="2">content of second verse</q></v></c>

as containers ... but needed to address the overlapping verses that occur at the end / beginning of some chapters, so i just used a marker for chapter / verse instead, and used the 'id' attribute so that i could still output the value.
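
to make the marker approach concrete, here is a made-up fragment. the <text> wrapper, the placeholder verse contents, and where the section ids sit are assumptions for illustration, not real data; <zs>, <c>, <v>, <q> and <nd> are the tags described above.

```xml
<!-- hypothetical sample: wrapper element, id placement and verse text are placeholders -->
<text>
  <zs id="5919"/>                      <!-- section marker: id is a database row key -->
  <c id="1"/>                          <!-- chapter marker: id is the chapter number -->
  <v id="1"/>                          <!-- verse marker: id is the verse number -->
  <q level="1">content</q>
  <v id="2"/>
  <q level="2">and against <nd>his</nd> anointed one.</q>
  <zs id="5920"/>                      <!-- next section starts mid-chapter -->
  <v id="3"/>
  <q level="1">content of third verse</q>
</text>
```

because every marker is an empty element, a section or verse that runs across a chapter boundary never has to be closed, so nothing overlaps. the trade-off is that <q> is a following sibling of <v>, not a child of it.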

what i'm trying to work out right now is two things. first, i don't want the sections (<zs>) to appear at the top of the output; i want them to appear within the content at the appropriate marker. second, i can't figure out how to transform the additional markup within the <q> tags (<nd>, <f>, etc.); it keeps coming back blank.
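
a minimal xslt 1.0 sketch of one way both points could be handled, assuming the markers and <q> elements are siblings in document order under a single wrapper element. the class names, the "zs" id prefix, and the choice of <hr> for the section marker are illustrative assumptions, not anything fixed by the data. the idea is to let xsl:apply-templates walk the nodes in document order, with one template per tag, so that each marker is emitted where it occurs and inline markup inside <q> is processed recursively.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- wrapper element (whatever it is called): process children in document order -->
  <xsl:template match="/*">
    <div class="text">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- section marker: emitted at the point where it occurs, not hoisted to the top -->
  <xsl:template match="zs">
    <hr class="section" id="zs{@id}"/>
  </xsl:template>

  <!-- chapter and verse markers: the id attribute carries the number to display -->
  <xsl:template match="c">
    <span class="chapter"><xsl:value-of select="@id"/></span>
  </xsl:template>

  <xsl:template match="v">
    <span class="verse"><xsl:value-of select="@id"/></span>
  </xsl:template>

  <!-- poetry line: level becomes a css class; apply-templates keeps the inline markup -->
  <xsl:template match="q">
    <span class="q{@level}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <!-- inline markup inside q: each tag gets its own template -->
  <xsl:template match="nd">
    <span class="nd"><xsl:apply-templates/></span>
  </xsl:template>

  <xsl:template match="f">
    <span class="footnote"><xsl:apply-templates/></span>
  </xsl:template>

</xsl:stylesheet>
```

one possible cause of the blank output, offered as a guess: with markers, a path such as v/q or a select="q" evaluated inside a template for <v> matches nothing, because <q> is a sibling of the verse marker and not its child. also, xsl:value-of on a <q> would flatten <nd> and <f> to plain text, which is why the sketch uses xsl:apply-templates there.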

sorry for being so long-winded!
-jim


Quote:
Originally posted by mhkay
 Your post suggests that you're playing with parallel or overlap markup structures and I'm afraid that's not a game for beginners.

However, you really haven't described your problem clearly enough. I can only guess at the relationship of terms like "chapter" and "section" to elements in your source; I can't tell what the level numbers or ids mean; and I can't work out the logic that makes you put ids 5919 and 5923 at the start of your output but not 5920. Superficially, though, there seems to be a very direct mapping from the <v> and <q> elements in your input to <span> elements in your output.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference