application_development thread: RE: application_development digest: March 18, 2003


Message #1 by "Tom Degen" <TDEGEN@j...> on Tue, 18 Mar 2003 18:30:52 -0600
In a nutshell, I've found it most efficient to create a buffered file
stream and read from it asynchronously for large files.  If you
multi-thread the reads and lock your shared variables appropriately,
you should be fine.  I believe when I last tried it, a 20 MB text file
could be completely displayed in about 5 or 6 minutes.  However, when
I multi-threaded it, I could display portions of it immediately.  This
did slow down the entire display process by a minute or two, but the
fact remains that I could use portions of the file immediately.
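
A minimal C# sketch of the idea of reading in blocks and using each
block as soon as it arrives. It assumes a modern .NET runtime with
async/await (not the .NET 1.x of this thread), UTF-8 input, and the
hypothetical names ChunkedReader, ReadInBlocksAsync and onBlock; it is
an illustration, not the code described in the message above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class ChunkedReader
{
    // Reads a text file block by block and hands each block to onBlock
    // as soon as it is available, so the whole file is never held in
    // memory at once. Returns the total number of characters read.
    public static async Task<long> ReadInBlocksAsync(
        string path, int blockChars, Action<string> onBlock)
    {
        long total = 0;
        var buffer = new char[blockChars];

        // useAsync: true requests asynchronous file I/O from the OS.
        // The 64 KB stream buffer size is an assumption; measure on
        // your own hardware.
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read,
                                         FileShare.Read, 64 * 1024, useAsync: true))
        using (var reader = new StreamReader(file, Encoding.UTF8))
        {
            int read;
            while ((read = await reader.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // The caller can display or process this portion right away.
                onBlock(new string(buffer, 0, read));
                total += read;
            }
        }
        return total;
    }
}
```

A caller might pass a callback that appends each block to the display
or parses it into records. The buffer is reused for every block, so
memory use stays roughly constant regardless of file size.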

Also, one thing to note about garbage collection: you can call the GC
explicitly.  As soon as you close your buffer stream and your read
stream, you can call the GC and release your memory immediately.
Calling the GC does carry some overhead, but if you have 5+ MB of
memory (or more!) taken up by your closed file streams, then the 100
to 500 ms it takes to call the GC is well worth it.
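
As a rough C# sketch of that sequence (close the streams, then
collect): GC.Collect and GC.WaitForPendingFinalizers are real .NET
calls, while the class name ExplicitCollect, the method name and the
buffer sizes are assumptions made for illustration. Whether the
explicit call pays off depends on the workload, so it is worth timing
both variants.

```csharp
using System;
using System.IO;

public static class ExplicitCollect
{
    // Reads the file through a BufferedStream, closes both streams,
    // then asks the garbage collector to run. Returns the byte count.
    public static long CountBytesThenCollect(string path)
    {
        long count = 0;

        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var buffered = new BufferedStream(file, 64 * 1024))
        {
            var block = new byte[8192];
            int read;
            while ((read = buffered.Read(block, 0, block.Length)) > 0)
            {
                count += read;
            }
        } // both streams are closed and disposed here

        // Explicit collection: reclaims objects that are no longer
        // reachable, including the buffers of the closed streams.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        return count;
    }
}
```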



Thomas A Degen
Project Analyst
Information Technologies
Journal Sentinel Inc.
(xxx) xxx-xxxx
TDegen@J...




-----Original Message-----
From: Application Development digest
[mailto:application_development@p...]
Sent: Tuesday, March 18, 2003 6:02 PM
To: application_development digest recipients
Subject: application_development digest: March 18, 2003

The URL for this list is:
http://p2p.wrox.com/list.asp?list=application_development
APPLICATION_DEVELOPMENT Digest for Tuesday, March 18, 2003.

1. RE: large text files processing in .NET
2. RE: large text files processing in .NET

----------------------------------------------------------------------

Subject: RE: large text files processing in .NET
From: "come2study" <come2study@y...>
Date: Tue, 18 Mar 2003 04:39:01
X-Message-Number: 1

Thanks a lot for the reply. I can use only .NET; I don't have the
option to change the platform. Yes, garbage collection is a serious
problem to look into. .NET has many I/O classes, such as FileStream,
BufferedStream, StreamReader and StreamWriter. From what I have read,
reading through StreamReader is generally said to be faster. Any idea
which method is the most efficient, and whether buffering should be
used? We also have other tools, such as BizTalk. How can the files be
handled efficiently? Thanks.


> Well, I suggest you use native C++ code to do this.  This allows
> programming at a much lower level, so it can seriously improve
> performance.
>
> Also, whatever language you use, do NOT use techniques that buffer
> too much data.  I recommend splitting the data into blocks.  In
> theory, the ideal size of a data block is a multiple of the physical
> block size of your hard disk...
>
> What worries me, when I read your question, is the garbage
> collection of .NET.  When you read multiple GB of data, all of it
> will be cached in buffers, but those buffers will only be released
> by the garbage collector long after you no longer need the data.  I
> have serious doubts about the performance of this method...
>
> HTH,
> Tim
>
> > ----------
> > From: come2study[SMTP:come2study@y...]
> > Reply To: Application Development
> > Sent: Monday, March 17, 2003 13:11
> > To: Application Development
> > Subject: [application_development] large text files processing in
> > .NET
> >
> > Hi,
> > We are designing a system that reads data from large text files
> > (GB of data!), does some processing and stores the results in a
> > database. The application is in .NET. Has anyone here worked with
> > such large text files? Can you please tell me which method of I/O
> > processing is fastest in .NET? If you have come across any case
> > studies on this for .NET, please let me know the URL. Please help.
> >
> > Thanks.

----------------------------------------------------------------------

Subject: RE: large text files processing in .NET
From: "jerry Weidong Lo" <cswdluo@c...>
Date: Tue, 18 Mar 2003 12:49:47 +0800
X-Message-Number: 2

   In my experience of reading and writing large text files,
StreamReader and StreamWriter may be faster than the alternatives
provided by .NET or Java, such as BufferedReader (in Java) and
BufferedStream (.NET).
   As you need to use only .NET, I suggest you use StreamReader and
split the large files into blocks; a buffering strategy works best.
   Regards,
                                              jerry.lo
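
For the original use case (read records, process them, store them in a
database), the StreamReader suggestion might be sketched in C# as
below. The class name LineProcessor, the handleLine callback, the
UTF-8 encoding and the 64 KB buffer size are all assumptions made for
illustration; the database write is left to the callback.

```csharp
using System;
using System.IO;
using System.Text;

public static class LineProcessor
{
    // Streams a text file one line at a time through handleLine and
    // returns the number of lines processed. Only the current line and
    // the reader's internal buffer are held in memory.
    public static long ProcessLines(string path, Action<string> handleLine)
    {
        long lines = 0;

        // The last argument sets StreamReader's internal buffer size.
        using (var reader = new StreamReader(path, Encoding.UTF8, true, 64 * 1024))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                handleLine(line); // e.g. parse the record and queue it for insert
                lines++;
            }
        }
        return lines;
    }
}
```

Batching the database inserts inside the callback, rather than issuing
one statement per line, is usually where most of the remaining time
goes, so that is a sensible place to measure first.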


---

END OF DIGEST

