author | adam <adam> | 2014-10-02 16:06:02 +0000 |
---|---|---|
committer | adam <adam> | 2014-10-02 16:06:02 +0000 |
commit | 1f412672e8db4f0981ba2699484d0914d58200e1 (patch) | |
tree | 1cfd8567995e9c6a88ab2cb1cf245f48608dc94c /graphics/tesseract/Makefile | |
parent | 951cc8217b4eba0b9c042792efb1f46994a06ae6 (diff) | |
Changes 3.02.02:
* Moved ResultIterator/PageIterator to ccmain.
* Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic.
* Added paragraph detection in layout analysis/post OCR.
* Fixed inconsistent xheight during training and over-chopping.
* Added simultaneous multi-language capability.
* Refactored top-level word recognition module.
* Added experimental equation detector.
* Improved handling of resolution from input images.
* Blamer module added for error analysis.
* Cleaned up externally used namespace by removing includes from baseapi.h.
* Removed dead memory management code.
* Tidied up constraints on control parameters.
* Added support for ShapeTable in classifier and training.
* Refactored class pruner.
* Fixed training leaks and randomness.
* Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding.
* Improved line detection and removal.
* Added fixed pitch chopper for CJK.
* Added UNICHARSET to WERD_CHOICE to make multi-language handling easier.
* Fixed problems with internally scaled images.
* Added page and bbox to string in tr files to identify source of training data better.
* Fixes to Hindi Shiroreka splitter.
* Added word bigram correction.
* Reduced stack memory consumption and eliminated some ugly typedefs.
* Added new uniform classifier API.
* Added new training error counter.
* Fixed endian bug in dawg reader.
* Many other fixes, including the way in which the chopper finds chops and messes with the outline while it does so.
Diffstat (limited to 'graphics/tesseract/Makefile')
-rw-r--r-- | graphics/tesseract/Makefile | 49 |
1 files changed, 21 insertions, 28 deletions
diff --git a/graphics/tesseract/Makefile b/graphics/tesseract/Makefile
index f6dd93aac99..a98803d4abc 100644
--- a/graphics/tesseract/Makefile
+++ b/graphics/tesseract/Makefile
@@ -1,42 +1,35 @@
-# $NetBSD: Makefile,v 1.13 2014/09/23 19:07:06 jperkin Exp $
-#
+# $NetBSD: Makefile,v 1.14 2014/10/02 16:06:02 adam Exp $
 
-DISTNAME=	tesseract-2.04
-PKGREVISION=	4
+DISTNAME=	tesseract-ocr-3.02.02
+PKGNAME=	${DISTNAME:S/-ocr//}
 CATEGORIES=	graphics
-MASTER_SITES=	http://tesseract-ocr.googlecode.com/files/
+MASTER_SITES=	https://tesseract-ocr.googlecode.com/files/
 DISTFILES+=	${DISTNAME}.tar.gz
-.for l in deu eng fra ita nld spa
-DISTFILES+=	tesseract-2.00.${l}.tar.gz
-.endfor
 
 MAINTAINER=	pkgsrc-users@NetBSD.org
 HOMEPAGE=	http://code.google.com/p/tesseract-ocr/
 COMMENT=	Commercial quality open source OCR engine
 LICENSE=	apache-2.0
 
-LDFLAGS.SunOS+=	-lsocket -lnsl
-
-INSTALLATION_DIRS=	libexec share/doc/tesseract share/tesseract
-
-GNU_CONFIGURE=	yes
-USE_LANGUAGES=	c c++
-USE_TOOLS+=	gmake pax
+USE_LANGUAGES=		c c++
+USE_LIBTOOL=		yes
+USE_TOOLS+=		gmake pax
+GNU_CONFIGURE=		yes
+CONFIGURE_ENV+=		LIBLEPT_HEADERSDIR=${BUILDLINK_PREFIX.leptonica}/include
+MAKE_ENV+=		LANGS=${TESSERACT_LANGS:Q}
 
-post-extract:
-	${RM} ${WRKSRC}/java/makefile
+WRKSRC=			${WRKDIR}/tesseract-ocr
 
-post-build:
-	${SED} -e "s,@PREFIX@,${PREFIX}," ${FILESDIR}/tesseract.sh \
-		> ${WRKSRC}/tesseract.sh
+INSTALLATION_DIRS=	libexec share/doc/tesseract share/tesseract
 
-post-install:
-	${MV} ${DESTDIR}${PREFIX}/bin/tesseract ${DESTDIR}${PREFIX}/libexec
-	${INSTALL_SCRIPT} ${WRKSRC}/tesseract.sh ${DESTDIR}${PREFIX}/bin/tesseract
-	${INSTALL_DATA} ${WRKSRC}/README ${DESTDIR}${PREFIX}/share/doc/tesseract
-	${INSTALL_DATA} ${WRKSRC}/phototest.tif ${DESTDIR}${PREFIX}/share/tesseract
-	cd ${WRKDIR}/tessdata && ${PAX} -rw * ${DESTDIR}${PREFIX}/share/tessdata
-	chmod a-x ${DESTDIR}${PREFIX}/share/tessdata/*.*
+TESSERACT_LANGS=	afr ara aze bel ben bul cat ces chi_sim chi_tra chr \
+			dan deu ell eng enm epo equ est eus fin fra frk frm \
+			glg grc heb hin hrv hun ind isl ita jpn kan kor lav \
+			lit mal mkd mlt msa nld nor pol por rus slk slv spa \
+			sqi srp swa swe tam tel tgl tha tur ukr vie
+.for l in ${TESSERACT_LANGS}
+DISTFILES+=	tesseract-ocr-3.02.${l}.tar.gz
+.endfor
 
-.include "../../graphics/tiff/buildlink3.mk"
+.include "../../graphics/leptonica/buildlink3.mk"
 .include "../../mk/bsd.pkg.mk"
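
For readers unfamiliar with the two make idioms this change introduces, here is a minimal standalone sketch of how the `:S` modifier and the `.for` loop expand. It is not part of the commit. The file name `expand.mk`, the `all` target and the three-language list are assumptions made for illustration, and it should be runnable with BSD make (`bmake -f expand.mk`) outside the pkgsrc tree.

```make
# expand.mk: hypothetical illustration file, not part of pkgsrc.
DISTNAME=	tesseract-ocr-3.02.02
# :S/-ocr// removes the first "-ocr" from the value, so the package
# name becomes tesseract-3.02.02 while the distfile keeps its name.
PKGNAME=	${DISTNAME:S/-ocr//}

# Shortened list for illustration; the commit lists many more languages.
TESSERACT_LANGS=	deu eng fra

DISTFILES+=	${DISTNAME}.tar.gz
# The loop appends one language data tarball per entry in the list.
.for l in ${TESSERACT_LANGS}
DISTFILES+=	tesseract-ocr-3.02.${l}.tar.gz
.endfor

# Hypothetical target that prints the expanded values.
all:
	@echo "PKGNAME=${PKGNAME}"
	@echo "DISTFILES=${DISTFILES}"
```

Keeping the language list in a single variable lets the same value drive both the distfile list and the `LANGS` setting passed through `MAKE_ENV`, so adding a language should only require touching `TESSERACT_LANGS`.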