Wrox Programmer Forums
XML General XML discussions.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the XML section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old September 18th, 2006, 09:35 AM
Registered User
 
Join Date: Sep 2006
Posts: 5
Thanks: 0
Thanked 0 Times in 0 Posts
Parse/Load/Search an XML file of about 1 GB

Hi All,
  I have been given a problem set in which I need to develop a .NET application that can parse, load, and search an XML document of about 1 GB. I have also been told that I should not use a database for it. Please help me solve this problem. How can I achieve this?

 
Old September 18th, 2006, 01:00 PM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

It depends very much on the nature of the "search". You either need to allocate a fairly large amount of memory, or you need to search using a low-level streaming technology such as SAX, StAX, or STX. There are some XSLT and XQuery products that can handle a limited range of searches using serial processing: for example, in XSLT, Saxon-SA has a serial processing mode for a very restricted class of XPath expressions. Some products, such as DataDirect XQuery, have an option to do "document projection", in which the parts of the document that aren't accessed by the query aren't loaded into memory.
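To illustrate the low-level streaming option, here is a minimal StAX sketch in Java (in .NET, `XmlReader` plays a comparable role). It is not taken from the thread: the element name `book`, its `id` attribute, and the text-only `subject` child are hypothetical, chosen only to show how a subject search can run in one forward pass without building a tree.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch only: "book", "id" and "subject" are assumed names, not the real catalog schema.
final class StreamingSubjectSearch {

    // Returns the ids of all <book> elements that have a <subject> containing the phrase
    // (case-insensitive). Only the current element is held in memory at any time.
    static List<String> findIdsBySubject(Reader xml, String phrase) {
        String needle = phrase.toLowerCase();
        List<String> hits = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            String currentId = null;   // id of the <book> we are inside, if any
            boolean matched = false;   // avoids reporting the same book twice
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("book")) {
                        currentId = reader.getAttributeValue(null, "id");
                        matched = false;
                    } else if (name.equals("subject") && currentId != null) {
                        // getElementText() assumes <subject> contains text only.
                        String text = reader.getElementText();
                        if (!matched && text.toLowerCase().contains(needle)) {
                            hits.add(currentId);
                            matched = true;
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("book")) {
                    currentId = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return hits;
    }
}
```

For a real file you would pass a buffered `FileReader` (or an `InputStream` via the matching `createXMLStreamReader` overload) instead of an in-memory string. Every query costs a full pass over the document, which is the trade-off of this approach.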

When I see constraints like "I should not use a database", my question is always "Why?". What are the real requirements that make a database an unacceptable solution?

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old September 19th, 2006, 12:21 AM
Registered User
 
Join Date: Sep 2006
Posts: 5
Thanks: 0
Thanked 0 Times in 0 Posts

Thank you for your reply, Michael Sir. This is the problem set I have been given to solve. They want it done without a database; perhaps they are thinking, "Why store the data again in a database when you already have it in XML?" :):)

 
Old September 19th, 2006, 02:43 AM
mhkay's Avatar
Wrox Author
 
Join Date: Apr 2004
Posts: 4,962
Thanks: 0
Thanked 292 Times in 287 Posts

I think it's always a good idea to question requirements. If "they" don't want a database, there could be any number of reasons: cost of purchase, cost of administration, performance of database loading. If you discover the real reasons, you may find that they also rule out some non-database solutions, and you may find that they don't rule out some solutions that do use a database. Users, managers, and customers have a right to define the requirements, but they don't have a right to make design decisions; that's the job of the engineer.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference
 
Old September 19th, 2006, 04:47 AM
Registered User
 
Join Date: Sep 2006
Posts: 5
Thanks: 0
Thanked 0 Times in 0 Posts

Hi Michael Sir,
   Yes, that's true. I am taking part in a Tech Fest and this problem set comes from that Tech Fest, so I can't ask them about the requirements.
The whole problem is as follows:
Objective
    Define a development approach to Parse/Load/Search an XML document of size ~1GB
Description
    Project Gutenberg (www.gutenberg.org) maintains a list of books in RDF format. An offline version of the same is available at \\ht-dynapps\gutenberg
    You need to provide the following APIs that will allow you to use the contents:
    // Given a start and end index, allows the caller to get the books from the list incrementally (à la Google)
    public List<Book> GutenbergBookManager.getBooks(int start, int end)
    // Given the ID of a book, searches the document and returns the book details
    public Book GutenbergBookManager.getBook(String id)
    // Given a search phrase, returns the list of books with a matching subject (the word may occur anywhere in the subject line)
    public List<Book> GutenbergBookManager.searchBook(String subject)
* Assume that you do not have the luxury of dumping the data into a relational database.
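One possible way to approach the three APIs above, sketched in Java under loudly stated assumptions: the document is streamed once with StAX, and only a small record per book (id, title, subjects) is kept in memory, so the 1 GB file is never loaded as a tree. The element names `book`, `title`, `subject` and the `id` attribute are hypothetical (the real catalog will use its own, namespaced names), `end` is treated as exclusive, and the methods are instance methods rather than whatever form the problem setters intended. It also assumes the extracted records themselves fit in memory.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical record type: the problem statement does not define Book's fields.
final class Book {
    final String id;
    final String title;
    final List<String> subjects;

    Book(String id, String title, List<String> subjects) {
        this.id = id;
        this.title = title;
        this.subjects = subjects;
    }
}

// Sketch only: one streaming pass builds a compact in-memory index.
final class GutenbergBookManager {
    private final List<Book> inOrder = new ArrayList<>();      // document order, for paging
    private final Map<String, Book> byId = new HashMap<>();    // direct lookup by id

    GutenbergBookManager(Reader xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            String id = null;
            String title = null;
            List<String> subjects = null;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = r.getLocalName();
                    if (name.equals("book")) {
                        id = r.getAttributeValue(null, "id");
                        title = null;
                        subjects = new ArrayList<>();
                    } else if (subjects != null && name.equals("title")) {
                        title = r.getElementText();          // assumes text-only content
                    } else if (subjects != null && name.equals("subject")) {
                        subjects.add(r.getElementText());    // assumes text-only content
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && r.getLocalName().equals("book") && subjects != null) {
                    Book book = new Book(id, title, subjects);
                    inOrder.add(book);
                    if (id != null) {
                        byId.put(id, book);
                    }
                    subjects = null;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    // Books at positions [start, end) in document order; out-of-range bounds are clamped.
    public List<Book> getBooks(int start, int end) {
        int from = Math.max(0, start);
        int to = Math.min(inOrder.size(), end);
        return from >= to ? new ArrayList<Book>() : new ArrayList<>(inOrder.subList(from, to));
    }

    // The book with the given id, or null if there is none.
    public Book getBook(String id) {
        return byId.get(id);
    }

    // Books with at least one subject containing the phrase (case-insensitive).
    public List<Book> searchBook(String subject) {
        String needle = subject.toLowerCase();
        List<Book> result = new ArrayList<>();
        for (Book book : inOrder) {
            for (String s : book.subjects) {
                if (s.toLowerCase().contains(needle)) {
                    result.add(book);
                    break;
                }
            }
        }
        return result;
    }
}
```

The subject search here is a linear scan over the extracted records; if that turns out to be too slow, a word-to-books map built during the same pass would be one natural refinement.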











Copyright (c) 2020 John Wiley & Sons, Inc.