pkgsrc - [no description]

Age	Commit message	Author	Files	Lines
2007-07-28	Update to 2.00, provided by Rumko on pkgsrc-users.	wiz	7	-73/+67
	July 02 2007 - V2.00 Converted internal character handling to UTF8. Trained with 6 languages. Added unicharset_extractor, wordlist2dawg. Added boxfile creation mode. Added UNLV regression test capability. Fixed problems with copyright and registered symbols. Fixed extern "C" declarations problem.
2007-05-18	Initial import of tesseract-1.04b from pkgsrc-wip (packaged by heinz@	wiz	9	-0/+396
	and myself): This code is a raw OCR engine. It has NO PAGE LAYOUT ANALYSIS, NO OUTPUT FORMATTING, and NO UI. It can only process an image of a single column and create text from it. It can detect fixed pitch vs proportional text. Having said that, in 1995, this engine was in the top 3 in terms of character accuracy, and it compiles and runs on both Linux and Windows. Another current limitation is that it only recognizes English and its character set is only US-ASCII. Training code IS NOT included in the open source release, however, and will be included in a future release.