Wrox Programmer Forums

#1
October 16th, 2003, 06:58 AM
Registered User
 
Join Date: Oct 2003
Location: Ernakulam, Kerala, India.
Posts: 2
Thanks: 0
Thanked 0 Times in 0 Posts
User: basil123
Count the number of words in a PDF document

Hello,

I have to develop an intranet application in PHP that reads documents and counts the number of words in each one. I have already done this for text, HTML, and Word documents by reading the file with the fopen() function.

But I don't know how to do the same for PDF documents. From the PHP manual I learned that, once the PDF libraries are installed on the server, the PDF-related functions become available. I have downloaded the PDF libraries and installed them on my web server. This is the method I have tried:

Since the function "pdf_open_pdi" is used to open an existing PDF document, I tried the following code:

$pdf = pdf_new();
pdf_open_file($pdf);
$pdi = pdf_open_pdi($pdf, "firstfile.pdf", "", 0);
$page= pdf_open_pdi_page($pdf, $pdi, 1, "");

But the script returns an error because the function "pdf_open_pdi" returns a handle of 0. The file "firstfile.pdf" is in the right path. I want to know why the function returns 0; because of this I cannot pass the document handle to "pdf_open_pdi_page".

If I am not on the right track, please advise me how to count the number of words in a PDF document that exists on the web server.

Your thoughts and advice on this would be greatly appreciated.

Thanks in advance.

:)

#2
October 17th, 2003, 05:03 PM
Friend of Wrox
 
Join Date: Jun 2003
Location: USA
Posts: 101
Thanks: 0
Thanked 1 Time in 1 Post
User: Moharo

Hello there. You may want to check out http://php.net/manual/en/ref.pdf.php for more about PDF in PHP. As for the word count, you can use the explode() function, but first you need to read the entire content of the PDF document into a variable. Consider the following:

<?php

// Stand-in for the text read from the document.
$content = "this is a sample text this is a sample text this is a sample text";
// Split on single spaces and count the pieces.
$word_count = count(explode(" ", $content));
echo $word_count;

?>

hope that helped :D:D:D

the genuine genius
#3
October 17th, 2003, 05:14 PM
Friend of Wrox
 
Join Date: Jun 2003
Location: USA
Posts: 101
Thanks: 0
Thanked 1 Time in 1 Post
User: Moharo

hey there again. here's some more...

You can't really count single characters as words, as the example above does. To be more accurate, let's ignore single characters:

<?php

// Stand-in for the text read from the document.
$content = "this is a sample text this is a sample text this is a sample text";
$str_array = explode(" ", $content);
$word_count = 0;

// Count only the pieces that are longer than one character.
foreach ($str_array as $word)
{
    if (strlen($word) > 1)
    {
         $word_count++;
    }
}

echo $word_count;

?>

:D

the genuine genius
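Splitting on a single space over-counts when the text contains runs of spaces, tabs, or line breaks, which is common in text pulled out of a document. As a rough sketch (the function name count_words is made up for this example), splitting on any run of whitespace avoids that. PHP's built-in str_word_count() is shown next to it for comparison:

```php
<?php

// Hypothetical helper: count whitespace-separated tokens in a string.
function count_words(string $text): int
{
    // PREG_SPLIT_NO_EMPTY drops the empty pieces that leading or
    // trailing whitespace would otherwise produce.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words);
}

// Sample text with a double space, a tab, a newline and a trailing space.
$content = "this  is a\tsample text\nthis is a sample text ";

echo count_words($content), "\n";     // whitespace-separated tokens
echo str_word_count($content), "\n";  // built-in: runs of letters, ' and -
```

The two functions agree on this sample but can differ on real documents: str_word_count() does not treat digits as part of a word, so a token such as "2003" is counted by count_words() and skipped by the built-in.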
#4
October 18th, 2003, 04:54 PM
Friend of Wrox
Points: 2,570, Level: 21
Activity: 0%
 
Join Date: Jun 2003
Location: San Diego, CA, USA
Posts: 836
Thanks: 0
Thanked 0 Times in 0 Posts

Well, counting the words in the document wasn't really the problem; it was getting the document open in the first place.

The PDI functions that ship with the PDF extension of PHP aren't enabled by default, if I remember correctly. I could be wrong, though... The reason I suspect this is that the basic (read: free) version of PDFlib doesn't include the PDI extension.

PHP's PDF extension would still have the functions defined, but they wouldn't actually work unless your version of PDFlib implemented the PDI functions.

I would suggest going through each line in your code to see which function is actually failing... is it pdf_new() or really pdf_open_pdi()?



Take care,

Nik
http://www.bigaction.org/
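The line-by-line check suggested in the post above can be sketched as follows. This is only an illustration: first_failing_step is a made-up helper, the closures are stand-ins that simulate return values rather than real PDFlib calls, and it assumes (as the first post reports for pdf_open_pdi) that a failed call returns 0 or false.

```php
<?php

// Hypothetical helper: run the named steps in order and return the name
// of the first one whose result is 0 or false, or null if none fails.
function first_failing_step(array $steps): ?string
{
    foreach ($steps as $name => $step) {
        $result = $step();
        if ($result === false || $result === 0) {
            return $name;
        }
    }
    return null;
}

// Stand-in closures that simulate return values; they do not call PDFlib.
$steps = [
    'pdf_new'           => function () { return 1; },
    'pdf_open_pdi'      => function () { return 0; },  // simulated bad handle
    'pdf_open_pdi_page' => function () { return 1; },
];

$failed = first_failing_step($steps);
echo $failed === null ? "all steps succeeded\n" : "first failure: $failed\n";
```

With real code, each closure would wrap one call from the original script, so the reported name points at the call that actually returned the bad handle. Note that the helper stops at the first failure, so later steps that depend on the bad handle are never run.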
#5
July 19th, 2006, 11:06 PM
Registered User
 
Join Date: Jul 2006
Posts: 1
Thanks: 0
Thanked 0 Times in 0 Posts

I'm looking into the same issue...
Were you able to read a PDF document and count the words in it?
I would appreciate your help, thank you very much.

#6
May 27th, 2009, 01:00 AM
Registered User
 
Join Date: May 2009
Posts: 1
Thanks: 0
Thanked 0 Times in 0 Posts
Word count in PDF file

I am looking for the same thing.
If anybody has the solution,
please send it to me.
I would be very thankful.

Thanks in advance
Vikas
vikasp@saamarth.net
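For readers who land here with the same question: one approach that avoids the PDI functions altogether is to convert the PDF to plain text with an external tool and count the words in the result. The sketch below assumes the pdftotext command-line program (shipped with Xpdf and Poppler) is installed and on the web server's PATH, that shell_exec() is not disabled, and that the server runs a Unix-like shell. The function name pdf_word_count is made up for this example.

```php
<?php

// Hypothetical helper: word count of a PDF, or null if no text was obtained.
// Assumes the external "pdftotext" program is installed and on the PATH.
function pdf_word_count(string $path): ?int
{
    if (!is_readable($path)) {
        return null;
    }

    // The trailing "-" asks pdftotext to write the text to standard output.
    // Error messages are discarded (Unix-style redirection).
    $command = 'pdftotext ' . escapeshellarg($path) . ' - 2>/dev/null';
    $text = shell_exec($command);

    // shell_exec() gives no string back when the command fails or prints nothing.
    if (!is_string($text) || trim($text) === '') {
        return null;
    }

    // Count whitespace-separated tokens in the extracted text.
    return count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
}

$count = pdf_word_count('firstfile.pdf');
echo $count === null ? "could not extract text\n" : "$count words\n";
```

The function returns null rather than 0 when the file is missing or no text comes back, so a scanned, image-only PDF is not mistaken for an empty document. The count is only as good as the text extraction: hyphenated line breaks, headers, and footers all end up in the total.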
All times are GMT -4.


Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
© 2013 John Wiley & Sons, Inc.