Commit Graph

6000 Commits

Author SHA1 Message Date
Amit D
0cb9c40528
Add configurable variables to control thresholding (#3577) 2021-09-29 23:17:22 +03:00
zdenop
c4ad9b7bbf cmake: fix curl linking if CURL_LIBRARIES is not available 2021-09-25 12:31:05 +02:00
zdenop
da084ea9e6 cmake: fix copy&paste error 2021-09-25 12:26:06 +02:00
zdenop
ebb214c443 destroy temporary page_pix 2021-09-25 10:26:31 +02:00
zdenop
9a53758bc9 cmake: improve formatting 2021-09-25 10:10:38 +02:00
zdenop
b294e3fc92
Merge pull request #3578 from adaptech-cz/improve-cmake
cmake: Improve configuration
2021-09-25 10:03:38 +02:00
Robert Pösel
362ed9b5e7 cmake: Improve configuration
- Use correct library name in TesseractConfig.cmake on all platforms
- Expose Tesseract_VERSION and Tesseract_VERSION_* variables in TesseractConfig.cmake
2021-09-24 22:15:46 +02:00
Egor Pugin
749beb6b14
Merge pull request #3576 from tesseract-ocr/Shreeshrii-patch-1
Update vcpkg.yml
2021-09-24 22:50:47 +03:00
zdenop
6b4447b931 cmake: remove REQUIRED during finding leptonica library, as we raise FATAL_ERROR later 2021-09-24 20:43:36 +02:00
zdenop
a4a14cb92b fix vcpkg action 2021-09-24 20:30:10 +02:00
Shreeshrii
e7d7ca86d6
Update vcpkg.yml
As suggested by @zdenop in https://github.com/tesseract-ocr/tesseract/issues/3574#issuecomment-926031923
2021-09-24 21:35:54 +05:30
Stefan Weil
f9d17598a8 Make automake builds less noisy by default
The old commit only silenced parts of the build,
while the new one silences the whole build.

Fixes: 47af1282f4
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-24 15:21:26 +02:00
Amit D
adaaef87a4
Fix wrong tiles parameters in Sauvola (#3570)
Thanks to Robert Sachunsky @bertsky, who pointed out the issue.
2021-09-23 10:26:07 +03:00
Amit D
6998c0ed71
Merge pull request #3571 from MerlijnWajer/hocr-write-scan-res
hocrrenderer: write scan_res property to the ocr_page
2021-09-22 09:22:26 +03:00
Stefan Weil
6ef6e36f78 codeql: Run apt-get update before trying to install Ubuntu packages
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-22 07:04:33 +02:00
Merlijn Wajer
ca177e72f3 hocrrenderer: write scan_res property to the ocr_page
This will make Tesseract emit the DPI of the document, if known at OCR
time. This is required to properly interpret the x_fsize (font size)
property of words, since Tesseract scales the font size to the DPI.

See issue #3326 (https://github.com/tesseract-ocr/tesseract/issues/3326)
2021-09-21 11:02:52 +02:00
zdenop
19cc9afb25 cmake: add Tesseract_LIBRARY_DIRS 2021-09-19 18:34:24 +02:00
zdenop
89d86d6eee cmake: improve configuration,
prefer find_package instead of PKG_CONFIG for leptonica and tiff
2021-09-19 18:28:45 +02:00
zdenop
2d397a8551 cmake: improve libarchive support - prefer cmake function instead of pkg-config 2021-09-19 11:17:40 +02:00
zdenop
eafbb2b22a cmake: add option to build with libcurl support 2021-09-19 11:13:21 +02:00
zdenop
ff5f59cf24 cmake: fix cygwin GNU c++ build; fixes #2379 2021-09-19 11:06:19 +02:00
Egor Pugin
9ac988b27a
Merge pull request #3568 from adaptech-cz/fix-cmake-linking
Fix linking Tesseract to project using CMake on Linux
2021-09-17 23:15:39 +03:00
Robert Pösel
d7528e7cea Fix linking Tesseract to project using CMake on Linux
In the past, the Tesseract library was wrongly named "liblibtesseract", so "-llibtesseract" had to be passed as an argument to the linker. The name was fixed in commit 52cac3a42e, but the ${Tesseract_LIBRARIES} variable still wrongly holds "libtesseract" and causes a linker error when it is used in target_link_libraries(...). This commit fixes that.
2021-09-17 21:35:30 +02:00
zdenop
0c49ee18cd fix visibility compilation 2021-09-17 17:38:01 +02:00
zdenop
4bdeda4af7
Merge pull request #3566 from stweil/issue3564
Simplify function LoadTrainingData and fix mastertrainer_test
2021-09-17 10:45:21 +02:00
Stefan Weil
638045133f Simplify function LoadTrainingData and fix mastertrainer_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-17 08:24:50 +02:00
Egor Pugin
2b4c3599de
Merge pull request #3565 from stweil/issue3564
Fix crash of shapeclustering (fixes #3564)
2021-09-17 00:11:49 +03:00
Stefan Weil
d87e08f266 Fix crash of shapeclustering (fixes #3564)
Fixes: 4415209fd6 ("Remove tessopt. This fixes mastertrainer test in shared build")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-16 22:31:09 +02:00
Stefan Weil
75f167ac8c Create new pre-release 5.0.0-beta-20210916
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-16 06:35:29 +02:00
Stefan Weil
386dd8a0c0 Update (master branch was renamed to main)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-13 07:42:46 +02:00
Stefan Weil
60fd2b4aba CI: Link basicapitest with Accelerate framework for MacOS
Clean also some other compiler options for basicapitest.

Fixes: 3ab8dcbf72 ("Use Apple Accelerate framework [...]")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-06 17:19:55 +02:00
Stefan Weil
e5e12f2856 Disable HAVE_FRAMEWORK_ACCELERATE for compilers which fail to compile with it
g++-10 and g++-11 throw compiler errors in builds with the
Accelerate framework, so disable it for all GNU compilers
before version 12 (which still has to be tested).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-06 17:15:46 +02:00
Egor Pugin
35dee4646f
Merge pull request #3549 from stweil/issue1573
Abort LSTM training with integer model (fixes issue #1573)
2021-09-06 13:11:46 +03:00
Stefan Weil
ec87dd4d49 Abort LSTM training with integer model (fixes issue #1573)
Tesseract currently cannot continue LSTM training from an
integer (fast) model.

Report this to users who try it nevertheless instead of crashing
with an assertion.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-06 08:18:55 +02:00
Stefan Weil
b5d4b67a3a Update test submodule
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-05 21:37:34 +02:00
Stefan Weil
a027dca007 Extend URI support for Tesseract with libcurl
libcurl not only supports HTTP and HTTPS, but also a lot of other protocols,
for example FTP and SFTP. Those protocols can also be useful for Tesseract.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-05 16:49:22 +02:00
Egor Pugin
1f437f3be8
Merge pull request #3545 from stweil/issue-3544
Rename processed TIFF output file and add page number if needed (fixe…
2021-09-01 15:26:06 +03:00
Stefan Weil
7fc9a34f79 Rename processed TIFF output file and add page number if needed (fixes issue #3544)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-01 14:16:05 +02:00
Robert Pösel
40fdacd485 Add missing check for __ARM_NEON
This makes it consistent with intsimdmatrixneon.cpp file and allows having this file included in builds even for non-NEON platforms (simplifies build config).
2021-08-26 15:28:59 +02:00
Egor Pugin
0fb170b994
Merge pull request #3540 from stweil/tessdata_prefix
Fix handling of TESSDATA_PREFIX containing // (fixes issue #3527)
2021-08-24 21:53:27 +03:00
Stefan Weil
4dcd8fa591 Fix handling of TESSDATA_PREFIX containing // (fixes issue #3527)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-24 20:05:54 +02:00
Egor Pugin
e57a3113fb
Merge pull request #3539 from stweil/submodels
Use model prefix also for submodels
2021-08-24 16:07:27 +03:00
Stefan Weil
391e713ae8 Use model prefix also for submodels
Fix also a regression in the for loop which handles submodels.

Fixes: 0d91c700c0 ("Modernize code in Tesseract::init_tesseract")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-24 13:41:00 +02:00
Stefan Weil
7cfcfe1101 cmake: Remove universalambigs.cpp
Fixes: 407346246c ("[universalambigs] Use inline variables.")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-23 07:30:03 +02:00
Stefan Weil
0d91c700c0 Modernize code in Tesseract::init_tesseract
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-23 07:30:03 +02:00
Egor Pugin
1d3d1fbc62 Move member function bodies into class template. 2021-08-20 12:42:40 +03:00
Egor Pugin
c539328d7d Merge branch 'master' of github.com-egorpugin:tesseract-ocr/tesseract 2021-08-20 12:38:12 +03:00
Egor Pugin
407346246c [universalambigs] Use inline variables. 2021-08-20 12:38:03 +03:00
Stefan Weil
7acda5cb6c Fix cloning of Image with pix_ == nullptr (issue #537)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-18 19:22:23 +02:00
Egor Pugin
feb32ecbe5 Merge branch 'master' of github.com-egorpugin:tesseract-ocr/tesseract 2021-08-18 18:15:05 +03:00