02-18-2012 07:20 AM
Hi guys
I am trying to count letter occurrences in a document, say a PDF or Word file, for example how many times the letter A occurs. I also want to count the total number of letters in the document and write the counts to a new document.
Best Regards
Hammingcode
02-18-2012 10:09 AM
What have you tried?
First you need to get the document into LabVIEW as a string.
Then I would consider converting the string to a U8 array and running a histogram on it. That gives you the count of each byte value in the document, and each byte value is the ASCII code of a particular character.
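LabVIEW code is graphical, so here is a rough Python sketch of the same byte-histogram idea, just to show the logic. The function name `byte_histogram` and the sample text are made up for illustration, and it assumes the text is plain ASCII.

```python
import string

def byte_histogram(data: bytes) -> list:
    """Return a 256-element list where counts[b] is how often byte value b occurs."""
    counts = [0] * 256
    for b in data:  # iterating over bytes yields integers 0..255
        counts[b] += 1
    return counts

# Hypothetical sample text standing in for the document contents.
sample = "A banana and an Apple"
counts = byte_histogram(sample.encode("ascii"))

# Occurrences of one particular letter (upper and lower case are separate bytes).
upper_a = counts[ord("A")]
lower_a = counts[ord("a")]

# Total number of letters: sum the bins that correspond to A-Z and a-z.
total_letters = sum(counts[ord(c)] for c in string.ascii_letters)

print(upper_a, lower_a, total_letters)
```

Indexing the result with the ASCII code of a character gives that character's count, which is the same lookup you would do on the histogram output in LabVIEW.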
02-18-2012 10:32 AM
I thought about that, but a LabVIEW string has a size limit. If the document is quite big, it is not possible to convert all of its letters into a single string.
02-18-2012 10:43 AM
If the document is so large that all of its text cannot be held in one string, then you will need to read it in sections and add up the character counts from each section.
Lynn
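Reading in sections and adding the counts could look roughly like the Python sketch below. The name `count_bytes_in_chunks`, the 1 MiB default chunk size, and the in-memory demo data are assumptions for illustration. In practice the stream would be a file opened in binary mode.

```python
import io

def count_bytes_in_chunks(stream, chunk_size=1 << 20):
    """Accumulate a 256-bin byte histogram while reading the stream piece by piece."""
    counts = [0] * 256
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of data
            break
        for b in chunk:
            counts[b] += 1  # the same bins are reused for every chunk
    return counts

# Demo on an in-memory stream, with a deliberately tiny chunk size
# so that the data spans many reads.
stream = io.BytesIO(b"abcabcA" * 1000)
counts = count_bytes_in_chunks(stream, chunk_size=64)
print(counts[ord("a")], counts[ord("A")])
```

Only one chunk is held in memory at a time, so the size of the document no longer matters. Because the counting is per byte, it also does not matter where the chunk boundaries fall.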
02-18-2012 11:27 AM
Are you serious about the 'string length' limit? A string can be sized up to a U32 length, or 4 GB, which is probably more memory than your OS gives to any single process.
Ton
02-18-2012 01:11 PM
Do you mean I need to copy the contents of the document and paste them into a LabVIEW string? Is there a subVI that can read the contents of a document into a string?
02-18-2012 01:50 PM
It depends. What is the source? Is it .pdf or Word? The answer will be different for each.
I don't know about the .pdf. For Word, you could use ActiveX functions to access the file and read the data.
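As an aside, if the Word file is in the newer .docx format, the text can also be pulled out without Word installed, because a .docx is a zip archive whose main text lives in `word/document.xml`. This is a different technique from the ActiveX route above, and it does not work for legacy .doc files. A minimal Python sketch, where `docx_text` is a made-up helper name and the demo archive is a stand-in rather than a real Word file:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def docx_text(file) -> str:
    """Concatenate the text runs (w:t elements) of a .docx file.

    Paragraph breaks are dropped, which is fine for counting letters.
    """
    with zipfile.ZipFile(file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    return "".join(node.text or "" for node in root.iter("{%s}t" % W_NS))

# Demo with an in-memory stand-in that holds only word/document.xml.
# It is not a complete Word file, but it exercises the same code path.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "word/document.xml",
        '<w:document xmlns:w="%s"><w:body><w:p><w:r><w:t>Alpha </w:t></w:r>'
        '<w:r><w:t>and beta</w:t></w:r></w:p></w:body></w:document>' % W_NS,
    )
buffer.seek(0)
text = docx_text(buffer)
print(text)
print(text.count("a"))
```

The resulting string can then be counted letter by letter in the usual way.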
02-21-2012 11:51 AM
How have you been getting on with this? Do you have any other details?
I'd be interested in seeing how you could extract the text from a PDF file. I have no doubt that it could be done, but you might need to dig a little into the structure of the PDF file. Most of the available tools are for generation, for example the Report Generation Toolkit for MS applications or PDF toolkits from various third-party sources.
02-21-2012 11:53 AM
If it is a PDF - this looks to be a useful document as a starting point.