
Extracting text from images


LevelFive
26th Jan 2002, 07:30
I’m looking for any products that can extract text from images. Anyone know of any?

The images contain large amounts of text and I would like to be able to process the text.

john_tullamarine
26th Jan 2002, 09:20
About 18 months ago I saw a TV documentary on encryption. One technique it reviewed was using compressed bitmaps to hide text data among the otherwise discarded bytes. Sorry, I can't remember the contact details, but you ought to be able to track it down with a search engine.

spannersatcx
26th Jan 2002, 11:24
Where are the images from? Is it something you are scanning? If so, run OCR software on the scan to get the text rather than saving it as an image. If you did not scan it yourself and the image quality is OK, you can print it out and then scan the printout with OCR to extract the text. :)

Blacksheep
26th Jan 2002, 16:49
I use OmniPage Pro 9.0 by Caere. It came bundled with my Canon scanner, but you should be able to buy it separately. You can put image files straight through; there's no need to print the file and then scan it. Other OCR software I've tried couldn't import image files from elsewhere, and OmniPage is the best I've tried yet. Very useful.

http://www.caere.com/products/omnipage/pro/

Just checked the link and found they are on version 11 now...

**********************************
Through difficulties to the cinema

[ 26 January 2002: Message edited by: Blacksheep ]

bblank
26th Jan 2002, 19:51
If you find stand-alone commercial OCR expensive (as may be the case if you don't have a continuing need for the application), you can try WOCAR first (assuming it is Windows software you want). It is freeware for noncommercial use. It is intended for scanned text, and your files must be black and white, not greyscale, and saved in TIFF format. Depending on your images, a basic graphics editor such as IrfanView (also freeware) should be able to convert them into files WOCAR can handle.

I just checked and the zip-archive that is available now seems to be the same one I downloaded about five years ago. I don't know what the lack of development means.

http://ccambien.free.fr/wocar/
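
For anyone wondering what "black and white, not greyscale" means in practice, here is a rough sketch of the thresholding step a graphics editor performs when it reduces a greyscale image to two levels. It is only an illustration: the function name `to_bilevel` and the cutoff of 128 are my own assumptions, not anything taken from WOCAR or IrfanView, and it works on a plain list of pixel values rather than a real TIFF file.

```python
# Hypothetical helper for illustration only; not part of WOCAR or IrfanView.
def to_bilevel(pixels, threshold=128):
    """Map 8-bit greyscale values (0-255) to pure black (0) or white (255).

    `pixels` is a list of rows, each row a list of integers.
    The threshold of 128 is an assumed midpoint; real scans may need tuning.
    """
    return [
        [255 if value >= threshold else 0 for value in row]
        for row in pixels
    ]


# A tiny 2x3 "image" with mixed grey levels.
grey = [
    [12, 200, 130],
    [127, 128, 255],
]
print(to_bilevel(grey))  # [[0, 255, 255], [0, 255, 255]]
```

A fixed cutoff like this is the simplest approach; faint or unevenly lit scans usually need a different threshold before the text comes out cleanly.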

stickyb
26th Jan 2002, 20:20
There's also something called TextBridge, which does a good job of going through images and converting the text to Word or Excel format. I think you can use it either with a scanner or with an image file.

LevelFive
28th Jan 2002, 02:07
Thanks for the help. It’s very much appreciated.