Pro PHP: Advanced PHP coding discussions. Beginning-level questions will be redirected to the Beginning PHP forum.

October 16th, 2003, 07:58 AM
Registered User
Join Date: Oct 2003
Location: Ernakulam, Kerala, India.
Posts: 2
Thanks: 0
Thanked 0 Times in 0 Posts
Count the number of words in a PDF document
Hello,
I have to develop an intranet application in PHP that reads documents and counts the number of words in each one. I have already done this for text, HTML, and Word documents by reading the file with the fopen() function.
But I don't know how to do the same for PDF documents. From the PHP manual I learned that, with the PDF libraries installed on the server, the PDF-related functions become available. I have downloaded the PDF libraries and installed them on my web server. This is what I have tried so far:
Since the function "pdf_open_pdi" is used to open an existing PDF document, I tried the following code:
$pdf = pdf_new();
pdf_open_file($pdf);
$pdi = pdf_open_pdi($pdf, "firstfile.pdf", "", 0);
$page= pdf_open_pdi_page($pdf, $pdi, 1, "");
But the script fails because the function "pdf_open_pdi" returns a handle of 0. The file "firstfile.pdf" is in the right path. I want to know why the function returns 0, because without a valid document handle I cannot call "pdf_open_pdi_page".
If I am not on the right track, please advise me how to count the number of words in a PDF document that is stored on the web server.
Your thoughts and advice on this would be greatly appreciated.
Thanks in advance.
:)
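For reference, the text/HTML case described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name count_words_in_text() and the file name in the comment are hypothetical, it needs PHP 7 or later for the type hints, and it relies on str_word_count(), which skips digits and is locale-dependent for non-ASCII letters.

```php
<?php
// Hypothetical helper (not from the original post): counts the words in
// plain text, or in HTML after stripping the tags.
function count_words_in_text(string $text, bool $isHtml = false): int
{
    if ($isHtml) {
        // Drop the markup, then decode entities such as &amp; so they
        // are not mistaken for words.
        $text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');
    }
    // str_word_count() treats runs of letters (plus ' and -) as words.
    return str_word_count($text);
}

// Reading from disk would be something like (placeholder file name):
// $text = file_get_contents('document.html');
echo count_words_in_text('<p>Hello <b>PDF</b> world</p>', true), "\n";
```

One caveat: strip_tags() removes tags without inserting spaces, so adjacent block elements such as `<p>a</p><p>b</p>` collapse into a single word unless spaces are added first. None of this helps with PDF files, whose text is usually compressed and not readable as plain strings.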

October 17th, 2003, 06:03 PM
Friend of Wrox
Join Date: Jun 2003
Location: USA.
Posts: 101
Thanks: 0
Thanked 0 Times in 0 Posts
Hello there. You may want to check out http://php.net/manual/en/ref.pdf.php for more about PDF in PHP. As for the word count, you can split the text on spaces, but first you need to read the entire content of the PDF document into a variable. Consider the following:
<?php
// Sample text standing in for the content read from the document.
$content = "this is a sample text this is a sample text this is a sample text";
// explode() is used here because split() was removed in PHP 7.
$word_count = count(explode(" ", $content));
echo $word_count;
?>
hope that helped :D:D:D
the genuine genius

October 17th, 2003, 06:14 PM
Friend of Wrox
Join Date: Jun 2003
Location: USA.
Posts: 101
Thanks: 0
Thanked 0 Times in 0 Posts
Hey there again. Here's some more...
You can't really count single characters as words, as the example above does. To be more accurate, let's ignore single characters:
<?php
$content = "this is a sample text this is a sample text this is a sample text";
// explode() is used here because split() was removed in PHP 7.
$str_array = explode(" ", $content);
$word_count = 0;
for ($i = 0; $i < count($str_array); $i++)
{
    // Count only tokens longer than one character.
    if (strlen($str_array[$i]) > 1)
    {
        $word_count++;
    }
}
echo $word_count;
?>
 :D
the genuine genius
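Splitting on a single space also produces empty tokens when the text contains double spaces, tabs, or line breaks, which is likely for text pulled out of a real document. A minimal sketch of the same idea that splits on any run of whitespace follows. The function name count_words_min_length() and its $minLength parameter are made up for illustration, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: counts whitespace-separated tokens that are at
// least $minLength bytes long.
function count_words_min_length(string $content, int $minLength = 2): int
{
    // Split on runs of whitespace and discard empty pieces.
    $tokens = preg_split('/\s+/', $content, -1, PREG_SPLIT_NO_EMPTY);

    $count = 0;
    foreach ($tokens as $token) {
        if (strlen($token) >= $minLength) {
            $count++;
        }
    }
    return $count;
}

// Mixed whitespace: double space, tab, and trailing newline.
echo count_words_min_length("this is a  sample\ttext\n"), "\n";
```

Passing 1 as the second argument counts every token, which matches the first example. Note that strlen() measures bytes, so multibyte characters would need mb_strlen() instead.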

October 18th, 2003, 05:54 PM
Friend of Wrox
Join Date: Jun 2003
Location: San Diego, CA, USA.
Posts: 833
Thanks: 0
Thanked 0 Times in 0 Posts
Well, counting the words in the document wasn't really the problem. The problem was getting the document open in the first place.
The PDI functions that ship with the PDF extension of PHP aren't enabled by default, if I remember correctly. I could be wrong, though... The reason I suspect this is that the basic (read: free) version of PDFLib doesn't include the PDI extension.
PHP's PDF extension would still have the functions defined, but they wouldn't actually work unless your version of PDFLib implemented the PDI functions.
I would suggest going through your code line by line to see which function is actually failing. Is it pdf_new(), or is it really pdf_open_pdi()?
Take care,
Nik
http://www.bigaction.org/
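As a first step in that line-by-line check, something like the sketch below could confirm whether the PDF functions are defined at all. It is only a rough illustration: check_pdf_functions() is a made-up helper, and the module name 'pdf' passed to extension_loaded() is an assumption about how the PDFlib extension registers itself.

```php
<?php
// Hypothetical helper: maps each function name to whether PHP currently
// has a function of that name defined.
function check_pdf_functions(array $names): array
{
    $report = [];
    foreach ($names as $name) {
        $report[$name] = function_exists($name);
    }
    return $report;
}

// The four functions used in the original script.
$report = check_pdf_functions([
    'pdf_new',
    'pdf_open_file',
    'pdf_open_pdi',
    'pdf_open_pdi_page',
]);

// Assumption: the PDFlib extension registers under the module name 'pdf'.
echo 'pdf extension loaded: ', extension_loaded('pdf') ? 'yes' : 'no', "\n";
foreach ($report as $name => $defined) {
    echo $name, ': ', $defined ? 'defined' : 'missing', "\n";
}
```

If every function shows up as defined and pdf_open_pdi() still returns 0, that would be consistent with the library itself lacking PDI support, which this check cannot detect.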

July 20th, 2006, 12:06 AM
Registered User
Join Date: Jul 2006
Posts: 1
Thanks: 0
Thanked 0 Times in 0 Posts
I'm looking into the same issue...
Were you able to read a PDF document and count the words in it?
I would appreciate your help. Thank you very much.

May 27th, 2009, 02:00 AM
Registered User
Join Date: May 2009
Posts: 1
Thanks: 0
Thanked 0 Times in 0 Posts
Word count in PDF file
I am looking for the same thing.
If anybody has a solution, please send it to me.
I would be very thankful.
Thanks in advance,
Vikas
vikasp@saamarth.net