author: adam <adam> 2014-10-02 16:06:02 +0000
committer: adam <adam> 2014-10-02 16:06:02 +0000
commit: 1f412672e8db4f0981ba2699484d0914d58200e1
tree: 1cfd8567995e9c6a88ab2cb1cf245f48608dc94c /graphics/tesseract/Makefile
parent: 951cc8217b4eba0b9c042792efb1f46994a06ae6
Changes 3.02.02:
* Moved ResultIterator/PageIterator to ccmain.
* Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic.
* Added paragraph detection in layout analysis/post OCR.
* Fixed inconsistent xheight during training and over-chopping.
* Added simultaneous multi-language capability.
* Refactored top-level word recognition module.
* Added experimental equation detector.
* Improved handling of resolution from input images.
* Blamer module added for error analysis.
* Cleaned up externally used namespace by removing includes from baseapi.h.
* Removed dead memory management code.
* Tidied up constraints on control parameters.
* Added support for ShapeTable in classifier and training.
* Refactored class pruner.
* Fixed training leaks and randomness.
* Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding.
* Improved line detection and removal.
* Added fixed pitch chopper for CJK.
* Added UNICHARSET to WERD_CHOICE to make multi-language handling easier.
* Fixed problems with internally scaled images.
* Added page and bbox to string in tr files to identify source of training data better.
* Fixes to Hindi Shirorekha splitter.
* Added word bigram correction.
* Reduced stack memory consumption and eliminated some ugly typedefs.
* Added new uniform classifier API.
* Added new training error counter.
* Fixed endian bug in dawg reader.
* Many other fixes, including the way in which the chopper finds chops and messes with the outline while it does so.
Diffstat (limited to 'graphics/tesseract/Makefile')
-rw-r--r-- graphics/tesseract/Makefile 49
1 files changed, 21 insertions, 28 deletions
diff --git a/graphics/tesseract/Makefile b/graphics/tesseract/Makefile
index f6dd93aac99..a98803d4abc 100644
--- a/graphics/tesseract/Makefile
+++ b/graphics/tesseract/Makefile
@@ -1,42 +1,35 @@
-# $NetBSD: Makefile,v 1.13 2014/09/23 19:07:06 jperkin Exp $
-#
+# $NetBSD: Makefile,v 1.14 2014/10/02 16:06:02 adam Exp $
-DISTNAME= tesseract-2.04
-PKGREVISION= 4
+DISTNAME= tesseract-ocr-3.02.02
+PKGNAME= ${DISTNAME:S/-ocr//}
CATEGORIES= graphics
-MASTER_SITES= http://tesseract-ocr.googlecode.com/files/
+MASTER_SITES= https://tesseract-ocr.googlecode.com/files/
DISTFILES+= ${DISTNAME}.tar.gz
-.for l in deu eng fra ita nld spa
-DISTFILES+= tesseract-2.00.${l}.tar.gz
-.endfor
MAINTAINER= pkgsrc-users@NetBSD.org
HOMEPAGE= http://code.google.com/p/tesseract-ocr/
COMMENT= Commercial quality open source OCR engine
LICENSE= apache-2.0
-LDFLAGS.SunOS+= -lsocket -lnsl
-
-INSTALLATION_DIRS= libexec share/doc/tesseract share/tesseract
-
-GNU_CONFIGURE= yes
-USE_LANGUAGES= c c++
-USE_TOOLS+= gmake pax
+USE_LANGUAGES= c c++
+USE_LIBTOOL= yes
+USE_TOOLS+= gmake pax
+GNU_CONFIGURE= yes
+CONFIGURE_ENV+= LIBLEPT_HEADERSDIR=${BUILDLINK_PREFIX.leptonica}/include
+MAKE_ENV+= LANGS=${TESSERACT_LANGS:Q}
-post-extract:
- ${RM} ${WRKSRC}/java/makefile
+WRKSRC= ${WRKDIR}/tesseract-ocr
-post-build:
- ${SED} -e "s,@PREFIX@,${PREFIX}," ${FILESDIR}/tesseract.sh \
- > ${WRKSRC}/tesseract.sh
+INSTALLATION_DIRS= libexec share/doc/tesseract share/tesseract
-post-install:
- ${MV} ${DESTDIR}${PREFIX}/bin/tesseract ${DESTDIR}${PREFIX}/libexec
- ${INSTALL_SCRIPT} ${WRKSRC}/tesseract.sh ${DESTDIR}${PREFIX}/bin/tesseract
- ${INSTALL_DATA} ${WRKSRC}/README ${DESTDIR}${PREFIX}/share/doc/tesseract
- ${INSTALL_DATA} ${WRKSRC}/phototest.tif ${DESTDIR}${PREFIX}/share/tesseract
- cd ${WRKDIR}/tessdata && ${PAX} -rw * ${DESTDIR}${PREFIX}/share/tessdata
- chmod a-x ${DESTDIR}${PREFIX}/share/tessdata/*.*
+TESSERACT_LANGS= afr ara aze bel ben bul cat ces chi_sim chi_tra chr \
+ dan deu ell eng enm epo equ est eus fin fra frk frm \
+ glg grc heb hin hrv hun ind isl ita jpn kan kor lav \
+ lit mal mkd mlt msa nld nor pol por rus slk slv spa \
+ sqi srp swa swe tam tel tgl tha tur ukr vie
+.for l in ${TESSERACT_LANGS}
+DISTFILES+= tesseract-ocr-3.02.${l}.tar.gz
+.endfor
-.include "../../graphics/tiff/buildlink3.mk"
+.include "../../graphics/leptonica/buildlink3.mk"
.include "../../mk/bsd.pkg.mk"