Wrox Programmer Forums
XML General XML discussions.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XML section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old August 26th, 2005, 09:11 AM
Banned
 
Join Date: Jul 2005
Posts: 317
Thanks: 0
Thanked 0 Times in 0 Posts
Opinions Wanted about XML Data Storage

I have a dilemma, and I thought that this forum could probably give me some good opinions.

My new site is being developed with this model: XML/XSLT/CSS/ASP.NET with some VB.NET scripting included.

I'm going to store links to internal web pages & external sites in XML docs. The external links are already located within one XML doc. I was thinking of creating a separate internal-links doc for each directory of my site, of which there are 7.

I then pull that document into an XSL stylesheet that's going to be transformed with another XML doc's data, like this:
<xsl:variable name="internal_links" select="document('http://SERVER/DIRECTORY/docs/xml/internal_links.xml')" />
So my questions are:
1) Would it be more efficient to store all of our site's internal links within one XML doc, or several XML docs for each directory?

2) When another XML doc is called using the document() method, is that full document loaded into the page? Or is only the data that's called from the stylesheet actually loaded?

Thanks for any input.

P.S. These are the pros & cons that I have for each setup so far:

Pros
- simplicity of use throughout the site, since there is less chance of duplication

Cons
- filesize may be too big
- could be disorganization due to the amount of entries

KWilliams
 
Old August 27th, 2005, 04:06 AM
joefawcett's Avatar
Wrox Author
 
Join Date: Jun 2003
Posts: 3,074
Thanks: 1
Thanked 38 Times in 37 Posts

In my opinion the main things to consider are
  • The size of the document(s). This is the big factor as creating a DOM uses roughly three to four times the amount of memory as the file's actual size.
  • Can you cache the DOM for re-use by different users? If so one big document may well be better given the first point. If you need to give each user a separate copy then the smaller the better.
  • It is easier to manage one file rather than many.
So unless the site has an exceptional number of internal links then one document sounds better.

Regarding the document() function: the whole document is loaded into memory. It has to be, really, to verify that it is well formed and to allow searches.

--

Joe (Microsoft MVP - XML)
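
A minimal sketch of the single-document approach recommended above, written as an XSLT 1.0 stylesheet. It assumes that each page element in one combined internal_links.xml carries a dir attribute naming its directory and has a url child; both are hypothetical and were invented for this illustration. Only the internal_links/page structure, the id attribute, and the title child are shown elsewhere in the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the "dir" attribute and the "url" child element are
     assumptions; the thread does not show the file's full structure. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Directory whose links should be listed; the caller can override it. -->
  <xsl:param name="dir" select="'SAMPLEDIR'"/>

  <!-- One combined links file for the whole site, bound to a variable
       so every template can reuse the same loaded tree. -->
  <xsl:variable name="internal_links"
      select="document('http://SERVER/DIRECTORY/docs/xml/internal_links.xml')"/>

  <xsl:template match="/">
    <ul>
      <!-- The predicate filters the in-memory tree. It does not reduce
           how much of the file is parsed. -->
      <xsl:for-each select="$internal_links/internal_links/page[@dir = $dir]">
        <li>
          <a href="{url}"><xsl:value-of select="title"/></a>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

With this layout the per-directory grouping lives in the data rather than in seven separate files, so a page that needs links from several directories (the home page, for example) still reads a single document.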
 
Old August 29th, 2005, 08:36 AM

Hi Joe,

Thanks for your reply:
Quote:
In my opinion the main things to consider are
    * The size of the document(s). This is the big factor as creating a DOM uses roughly three to four times the amount of memory as the file's actual size.
    * Can you cache the DOM for re-use by different users? If so one big document may well be better given the first point. If you need to give each user a separate copy then the smaller the better.
    * It is easier to manage one file rather than many.

How would I go about caching this doc within the XSLT stylesheet?
Quote:
So unless the site has an exceptional number of internal links then one document sounds better.

Regarding the document function the whole document is loaded into memory, it has to be really to verify it is well formed and to allow searches.

So if every XSLT stylesheet for the site references another XML doc that contains all of our site's links using the document() method, will that stylesheet actually load the entire doc, or just the part of it referenced within the XSLT stylesheet? Thanks.

KWilliams
 
Old August 29th, 2005, 12:12 PM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

You can reasonably expect that if you call document(), then that document will be parsed and loaded into a tree in memory, which will stay in memory for the duration of the transformation.

There may be products, or facilities within products, that do better than this. Saxon has an extension function discard-document() that allows you to get rid of a document from memory if it's no longer needed; Saxon-SA 8.5 also has a capability for reading documents serially without building the tree if your stylesheet is written to handle the data this way.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
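
For readers using Saxon, the discard-document() extension mentioned above might be used roughly as follows. This is a sketch and is not taken from the thread: the namespace URI http://saxon.sf.net/ and the pattern of wrapping the document() call are assumptions based on Saxon's extension-function documentation, and should be checked against the Saxon version in use. The relative file name internal_links.xml is a placeholder. The function is Saxon-specific and will not work in other processors.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: Saxon-specific. Verify the extension namespace and the
     function's availability against your Saxon release. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The document is still parsed in full. discard-document() only
         tells Saxon not to keep it in its document pool afterwards, so
         the tree can be reclaimed once nothing references it. -->
    <xsl:for-each
        select="saxon:discard-document(document('internal_links.xml'))
                /internal_links/page[@id = 'faqs']">
      <xsl:value-of select="title"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This only helps when the document is used in one place. If several templates need the same links, keeping the tree in memory for the duration of the transformation, as described above, is usually what you want.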
 
Old August 29th, 2005, 01:50 PM

Well, I've gone back and forth on this one. On the one hand, it would be easier to maintain one file with all of our site's internal links. On the other hand, our site would have at least 586 internal links in that one document, and each element would have 17 child nodes in addition to an id attribute.

So after all of this, I think that it will be best to keep it as it is with separate "internal_links.xml" docs for each directory of our site, as the only page that would need to access more than one of these pages would be the home page.

Thanks for all of your great input. It's greatly appreciated.

KWilliams
 
Old September 1st, 2005, 02:24 PM

Hi Michael,

I was hoping that you could elaborate on your answer to this post. I've been trying to work with separate XML docs that each contain internal links for that specified directory. But as you can imagine, I figured out rather quickly that it will be simpler to maintain one file.

You had mentioned Saxon's capability to flush XML documents using the extension function discard-document(), and that "...Saxon-SA 8.5 also has a capability for reading documents serially without building the tree if your stylesheet is written to handle the data this way."

I noticed a post at http://www.biglist.com/lists/xsl-lis.../msg01138.html that states "...most, if not all processors, evaluate variable only when they're needed, so if you don't use $config-top anywhere, the document names.xml is not fetched and read into a source tree. Also, processors cache the documents for the duration of the transformation (or life-span of e.g. Transformer object), so if it's used in multiple locations in your stylesheet, processed only once."

So my questions are:
Is there a way to cache such a file once it's been called from the XSLT stylesheet using the W3's XSLT Processor?
If so, how should I go about that?

Thanks for any further assistance on this matter.

KWilliams
 
Old September 2nd, 2005, 06:56 AM
mhkay's Avatar

Sorry, I'm not sure I understand your follow-up question. Are you concerned about caching documents for the duration of a single transformation, or caching them across multiple transformations? Are you concerned about the amount of memory used when you load and keep documents in memory, or about the time taken to load them when you don't?

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old September 2nd, 2005, 12:03 PM

Quote:
Are you concerned about caching documents for the duration of a single transformation, or caching them across multiple transformations?

I'm concerned about caching them across multiple transformations, since my site's going to pull data dynamically from 2 source docs and transform them. For instance, this QueryString:
http://www.mysite.com/index.aspx?pag...&dir=SAMPLEDIR

Will pull these two source docs:
http://www.mysite.com/SAMPLEDIR/docs/xml/SAMPLEPAGE.xml
http://www.mysite.com/SAMPLEDIR/docs...SAMPLEPAGE.xsl

And transform them using the transformNode method. Because each page will have its own XML & XSLT files, and because each XSLT stylesheet will reference the internal_links.xml doc using the document() function, I was wondering if it would be loaded each time.

Quote:
Are you concerned about the amount of memory used when you load and keep documents in memory, or about the time taken to load them when you don't?

I was concerned about the actual document loading for each page that named it in a variable, but I now understand that it doesn't actually load until it's called from the template body. From what I understand, the entire document will not load, only the nodes that are selected using one of the filter methods. For example, with:
<xsl:for-each select="$internal_links/internal_links/page[@id = 'faqs']">
 <xsl:value-of select="title" />
</xsl:for-each>
...only the node with the id "faqs" will be pulled.
Or if I wanted to just pull the "page" nodes, I'd do:
<xsl:for-each select="$internal_links/internal_links/page">
Am I correct?

KWilliams
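
On the last question, the earlier replies in this thread say that document() parses and loads the whole file, so a predicate such as page[@id = 'faqs'] only narrows which nodes are selected from the tree that is already in memory. A minimal XSLT 1.0 sketch of an id lookup against that loaded tree, using xsl:key, is given below. The key name page-by-id and the relative file name internal_links.xml (resolved against the stylesheet's location) are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: "page-by-id" and the relative file name are assumed. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every page element by its id attribute. -->
  <xsl:key name="page-by-id" match="page" use="@id"/>

  <!-- The whole links document is loaded when this variable is first
       evaluated, regardless of how few nodes are used afterwards. -->
  <xsl:variable name="internal_links" select="document('internal_links.xml')"/>

  <xsl:template match="/">
    <!-- In XSLT 1.0, key() searches the document that contains the
         context node, so switch context to the links document first. -->
    <xsl:for-each select="$internal_links">
      <xsl:value-of select="key('page-by-id', 'faqs')/title"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The key changes how nodes are found inside the loaded tree, not how much of the file is read. Whether the parsed document can be reused across separate transformations depends on the hosting code that runs them, not on the stylesheet.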




