asptoday_discuss thread: Parsing *.doc or *.txt data into an Access Database


Message #1 by "Bill Cohen" <billcohen@h...> on Tue, 4 Dec 2001 05:30:25
I have been given the odd job of converting a Word doc filled with
repeated data.  The objective is to convert 1,000 distinct records
from the Word doc into an Access database.

Does anyone have a clue as to how I should proceed?  Please provide feedback.

Here is a sample of the doc files:

----------------------------------------------------

1995 Dalton Trust

  

Contact Info:   Unknown  

Total Investments/Commitments: 1 1 

Total Dollars Invested/Committed: 1 $100,000 

Avg. Dollars per Investment/Commitment: 1,2 $100,000 

Activity Ranking: 1,3 414 / 1,527 

Total Dollar Ranking: 1,3,4 1,195 / 1,527 

Avg. Dollar Ranking: 1,2,3,4 1,190 / 1,527 



1 Closed & DA investments/commitments: (01/01/01 - 11/28/01)

2 Excludes investments/commitments where Amount = Unknown

3 Comparison vs. all Investment Managers in EPP DatabaseSM (01/01/01 - 11/28/01)

4 Excludes Equity Line commitments

   --> actual dollars ultimately invested by Manager may be significantly less than Amount



--------------------------------------------------------------------------------

 

    

Managed

Investment Funds

  

Fund Name 1 Total 2

Investments    Total $ 2,3

Invested       Avg. $ per 2,3

Investment       

1995 Dalton Trust 1   $100,000   $100,000  

 

1 Click to view PrivateRaise ProfileSM

2 Closed & DA investments/commitments: (01/01/01 - 11/28/01)

3 Excludes investments/commitments where Amount = Unknown

  

 



--------------------------------------------------------------------------------

 

      

Investment History



(01/01/01 - 11/28/01)

  

* 1 Date  Issuer 2 Symbol  Amount 3     Security  

C 09/26/01 Internet Pictures Corporation IPIX $100,000     Pref: Conv 

 

1 C=Closed | DA=Def. Agreement | A=Announced | I=Intended | P=Postponed | X=Cancelled

2 Click to view PrivateRaise ProfileSM

3 Amount associated with Equity Line commitments may significantly overstate the actual level of investment that will ultimately be made by a Manager.

  

 



--------------------------------------------------------------------------------

 

Investment Tendencies



(Closed & DA investments: 01/01/01 - 11/28/01)



 

 

Security Type

(% of dollars invested)  

 Type Percentage 

Common Stock - 

Preferred Stock: Convertible 100.0% 

Preferred Stock: non-Convertible - 

Debt: Convertible - 

Debt: non-Convertible - 

Other: Convertible - 

Prepaid Warrant - 

Equity Line - 

Unknown - 

 

 

 

Industry

(% of dollars invested)  

 Sector Percentage 

Internet: Infrastructure-related 100.0% 

 

 



 

Market Capitalization

(% of dollars invested)  

 Category Percentage 

Less than $50 m 100.0% 

$50 m - $99 m - 

$100 m - $249 m - 

$250 m - $499 m - 

$500 m - $999 m - 

$1 b - $4.9 b - 

Greater than $5 b - 

 

 

 

Investment Terms

(% of investments which include specific Investment Terms)

  

 Term 1 Percentage 2 

Fixed Price - 

Reset Price 100.0% 

Variable Price - 

Anti-Dilution Protection - 

Hard Floor Price 100.0% 

Soft Floor Price - 

Investor Warrants - 

Investor Call Option - 

Investor Greenshoe - 

Conversion/Exercise Restrictions - 

Selling Restrictions - 

Hedging Restrictions - 

Investor Purchase Rights - 

Investor Redemption - 

Required Registration 100.0% 

Issuer Put Option - 

Forced Conversion/Exercise - 

Issuer Redemption -
Message #2 by "Jason Salas" <jason@k...> on Tue, 4 Dec 2001 15:51:42 +1000
Hi Bill,



I know it's probably not what you want to hear, and there probably is
a way to do this with a script or component, but I'd recommend just having an
intern or low-level IS staffer enter all of the data from the .DOC/.TXT file
into the DB manually.  1,000 records really isn't that many, and you'd have
total control over the fields.  From the looks of it, the content in the
.DOC isn't delimited in any consistent way (e.g., tab-delimited,
comma-delimited), so conversion through a script may be shaky at
best.  If you try a straight import, you may lose some of the data
you're working with, or it may end up in the wrong fields, which will force
you to make corrections manually and mean more work overall.
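
To make the trade-off concrete, here is a rough sketch of what the scripted
route might look like for the labelled summary lines only ("Label: markers
value").  It is Python rather than ASP, and everything in it is an assumption
read off the one sample record: the names parse_summary, FIELD_RE and
MARKERS_RE are made up for illustration, and the rule that a leading "1",
"1,2", etc. is a footnote marker rather than data is a guess.

```python
import re

# Assumed layout, based only on the posted sample:
#   "<label>: [footnote markers] <value>"
#   e.g. "Total Dollars Invested/Committed: 1 $100,000"
FIELD_RE = re.compile(r"^(?P<label>[^:]+):\s+(?P<rest>.+?)\s*$")

# Assumption: footnote markers are single digits joined by commas
# ("1", "1,2", "1,3,4") and always come before the real value.
MARKERS_RE = re.compile(r"^[1-9](?:,[1-9])*$")


def parse_summary(lines):
    """Turn the labelled summary lines of one record into a dict."""
    record = {}
    for line in lines:
        match = FIELD_RE.match(line.strip())
        if not match:
            continue  # blank line or something that is not "label: value"
        rest = match.group("rest")
        tokens = rest.split(None, 1)
        # Drop the first token only when it looks like a footnote marker
        # and there is still a value left after it.
        if len(tokens) == 2 and MARKERS_RE.match(tokens[0]):
            rest = tokens[1]
        record[match.group("label")] = rest.strip()
    return record
```

Note where it gets shaky: the function has to be fed the summary block only,
because lines further down the record ("Preferred Stock: Convertible 100.0%",
the footnotes) also contain a colon and would be picked up as fields.  A value
that itself starts with a small number and has no footnote marker in front of
it would be split wrongly, too.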



I once did this for a DB migration project that took 1,300 text-based files
into a single DB table with 16 fields.  It was a long, rough week, but
well worth it in the end.  It's a scrub job, but only a couple of days' work
for 1 or 2 people.  The lack of standardization in the data is a key reason I
feel this way.
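
The Investment History row is a good example of that lack of standardization:
nothing separates the issuer name from the ticker except a space.  A sketch of
how a script would have to guess at the columns, again in Python and again
built from the single sample row, is below.  ROW_RE and parse_history_row are
hypothetical names, and the assumptions that a symbol is 1 to 5 capital
letters and that an amount is either a dollar figure or the word "Unknown" are
mine, not something the source file guarantees.

```python
import re

# Columns are anchored on the pieces that have a recognisable shape
# (status code, date, ticker, amount); the issuer is "whatever is left".
ROW_RE = re.compile(
    r"^(?P<status>DA|[CAIPX])\s+"          # codes from the sample's legend
    r"(?P<date>\d{2}/\d{2}/\d{2})\s+"      # e.g. 09/26/01
    r"(?P<issuer>.+?)\s+"                  # free text, no delimiter
    r"(?P<symbol>[A-Z]{1,5})\s+"           # assumed ticker shape
    r"(?P<amount>\$[\d,]+|Unknown)\s+"     # assumed amount shapes
    r"(?P<security>.+?)\s*$"
)


def parse_history_row(line):
    """Return the row's columns as a dict, or None if it does not fit."""
    match = ROW_RE.match(line.strip())
    return match.groupdict() if match else None
```

Rows that come back as None (an issuer with no ticker, an amount written some
other way) are exactly the ones somebody would have to fix by hand afterwards.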



There's probably a way to get this done, but I'd recommend just doing it by
hand first, and then staying with a DB-driven repository of data from then on.
After that, you have total control over, and standardization of, the format
and look of your data.  Is the data going to be converted over to the DB
permanently, or is this just for routine archival purposes?



Other than that, I assume there are third-party solutions out there that
could pull this off.



HTH,

Jason




