Initial import of tesseract-1.04b from pkgsrc-wip (packaged by heinz@ - pkgsrc

author	wiz <wiz>	2007-05-18 06:39:27 +0000
committer	wiz <wiz>	2007-05-18 06:39:27 +0000
commit	610d73b98c0e4e40dea9ef52c263fb6ee707a485 (patch)
tree	a705c8f6b7adc55c4b7c0be896b56cd6c86a9c90 /print/auctex
parent	ad07b9b93124b6968ed7e97027519dad166d3fbd (diff)
download	pkgsrc-610d73b98c0e4e40dea9ef52c263fb6ee707a485.tar.gz

Initial import of tesseract-1.04b from pkgsrc-wip (packaged by heinz@ and myself):

This code is a raw OCR engine. It has NO PAGE LAYOUT ANALYSIS, NO OUTPUT FORMATTING, and NO UI. It can only process an image of a single column and create text from it. It can detect fixed pitch vs proportional text. Having said that, in 1995, this engine was in the top 3 in terms of character accuracy, and it compiles and runs on both Linux and Windows. Another current limitation is that it only recognizes English and its character set is only US-ASCII. Training code IS included in the open source release however, and will be included in a future release.

Diffstat (limited to 'print/auctex')

0 files changed, 0 insertions, 0 deletions
