tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-11-30 23:49:05 +08:00

Author	SHA1	Message	Date
Stefan Weil	8e7b1119b5	Run more unittests with the user's locale Hopefully this improves the test coverage. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-16 18:12:55 +02:00
Stefan Weil	59e31e958b	Fix more build error for compilation without legacy engine Skip the tests which need the legacy code. Add also code to those tests to use the user's locale to test that, too. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-16 18:12:55 +02:00
Stefan Weil	780986ebfb	Fix linker error for baseapi_test when building without legacy engine Linker error reported in issue #2439: unittest/baseapi_test.cc:190: undefined reference to `tesseract::TessBaseAPI::AdaptToWordStr(tesseract::PageSegMode, char const*)' Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-16 18:12:55 +02:00
zdenop	7e9d2f4bc4	Merge pull request #2432 from nickjwhite/hocrmoretypes Add different classes to hocr output depending on BlockType	2019-05-16 17:02:48 +02:00
zdenop	b124a5f6ca	Merge pull request #2437 from stweil/locale-fix Fix some unittests with locale de_DE.UTF-8	2019-05-16 17:02:02 +02:00
Stefan Weil	331cc84d8d	Remove assertions for unsupported locale settings The latest code passed all unittests with locale de_DE.UTF-8 and has fixed the locale issues which were reported on GitHub. Therefore the assertions can be removed. Any remaining locale issue will be fixed when it is identified. To help finding such remaining issues, debug code now uses the user's locale settings instead of the default "C" locale for all executables which use TessBaseAPI. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-16 13:59:39 +02:00
Stefan Weil	77f9bad3c2	Fix UNICHARSET::save_to_string for locale de_DE.UTF-8 That function writes float values which must always use '.' as the decimal separator, no matter what the current locale setting is. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-16 11:39:59 +02:00
Stefan Weil	36ed6da349	Fix baseapi_test with locale de_DE.UTF-8 The unittest failed with LANG=de_DE.UTF-8: $ unittest/baseapi_test Running main() from ../../../../unittest/../googletest/googletest/src/gtest_main.cc [==========] Running 12 tests from 2 test suites. [----------] Global test environment set-up. [----------] 10 tests from TesseractTest [ RUN ] TesseractTest.ArraySizeTest [ OK ] TesseractTest.ArraySizeTest (0 ms) [ RUN ] TesseractTest.BasicTesseractTest [ OK ] TesseractTest.BasicTesseractTest (1251 ms) [ RUN ] TesseractTest.IteratesParagraphsEvenIfNotDetected [ OK ] TesseractTest.IteratesParagraphsEvenIfNotDetected (347 ms) [ RUN ] TesseractTest.HOCRWorksWithoutSetInputName [ OK ] TesseractTest.HOCRWorksWithoutSetInputName (403 ms) [ RUN ] TesseractTest.HOCRContainsBaseline [ OK ] TesseractTest.HOCRContainsBaseline (389 ms) [ RUN ] TesseractTest.RickSnyderNotFuckSnyder [ OK ] TesseractTest.RickSnyderNotFuckSnyder (346 ms) [ RUN ] TesseractTest.AdaptToWordStrTest Trying to adapt "136 " to "1 3 6" Trying to adapt "256 " to "2 5 6" Trying to adapt "410 " to "4 1 0" Trying to adapt "432 " to "4 3 2" Trying to adapt "540 " to "5 4 0" Trying to adapt "692 " to "6 9 2" Trying to adapt "779 " to "7 7 9" Trying to adapt "793 " to "7 9 3" Trying to adapt "808 " to "8 0 8" Trying to adapt "815 " to "8 1 5" Trying to adapt "12 " to "1 2" Trying to adapt "12 " to "1 2" [ OK ] TesseractTest.AdaptToWordStrTest (788 ms) [ RUN ] TesseractTest.BasicLSTMTest [ OK ] TesseractTest.BasicLSTMTest (4525 ms) [ RUN ] TesseractTest.LSTMGeometryTest [ OK ] TesseractTest.LSTMGeometryTest (615 ms) [ RUN ] TesseractTest.InitConfigOnlyTest Error: unichar ? in normproto file is not in unichar set. Error: unichar 0.232621 in normproto file is not in unichar set. Error: unichar 0.000400 in normproto file is not in unichar set. Error: unichar 0.231864 in normproto file is not in unichar set. [...] Error: unichar ? in normproto file is not in unichar set. 
Error: unichar 0.233915 in normproto file is not in unichar set. Error: unichar 0.000400 in normproto file is not in unichar set. Error: unichar 0.221755 in normproto file is not in unichar set. Error: unichar 0.000400 in normproto file is not in unichar set. Error: unichar ? in normproto file is not in unichar set. baseapi_test(21845,0x1134c45c0) malloc: * error for object 0x927f96c28005e0: pointer being freed was not allocated baseapi_test(21845,0x1134c45c0) malloc: * set a breakpoint in malloc_error_break to debug [INFO] Lang eng took 327ms in regular init [INFO] Lang chi_tra took 1422ms in regular init Abort trap: 6 TesseractTest.InitConfigOnlyTest is fixed by using std::istringstream instead of sscanf. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-16 11:05:09 +02:00
Stefan Weil	0dcc889e8d	Fix apiexample_test with locale de_DE.UTF-8 The unittest failed with LANG=de_DE.UTF-8: $ unittest/apiexample_test Running main() from ../../../../unittest/../googletest/googletest/src/gtest_main.cc [==========] Running 4 tests from 2 test suites. [----------] Global test environment set-up. [----------] 1 test from EuroText [ RUN ] EuroText.FastLatinOCR contains_unichar_id(unichar_id):Error:Assert failed:in file ../../../../../src/ccutil/unicharset.h, line 874 Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-15 22:43:47 +02:00
zdenop	4b397c70cc	Merge pull request #2434 from stweil/configure configure: Fix for latest developer tools on macOS	2019-05-15 07:31:44 +02:00
Stefan Weil	7917ffb6c2	configure: Fix for latest developer tools on macOS AX_CHECK_COMPILE_FLAG fails if it is used with -Werror and the compiler raises error -Wunused-macros. Add -Wno-unused-macros to disable those errors if possible. Simplify also the setting of several conditionals (AVX, AVX2, ...). Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-14 22:31:23 +02:00
Stefan Weil	6b1e709b19	Fix Doxygen comments for void functions Void functions should not use @return. It causes compiler warnings like this one: src/classify/intproto.cpp:326:5: warning: '@return' command used in a comment that is attached to a function returning void [-Wdocumentation] Some non-void functions also were documented with @return none. Fix those comments, too. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-14 21:57:17 +02:00
Stefan Weil	caa04882fd	normmatch: Remove unused private function PrintNormMatch was unused. Remove it and remove also an unused prototype. Make the only remaining private function NormEvidenceOf static. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-14 20:56:04 +02:00
Nick White	068eb4c35d	Add different classes to hocr output depending on BlockType These classes are taken from the hOCR specification, and seem to map well onto the BlockType types. There are probably more that could be added.	2019-05-14 13:25:08 +01:00
Egor Pugin	b9b74a6942	Update sw build.	2019-05-13 01:54:23 +03:00
zdenop	746674fcd5	Merge pull request #2430 from stweil/fix Fix reading of parameter from traineddata normproto component and make function independent of locale	2019-05-12 15:59:41 +02:00
Stefan Weil	5d92fbf010	Replace sscanf by std::istringstream Using std::istringstream allows conversion of string to float independent of the current locale setting. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-12 15:04:30 +02:00
Stefan Weil	c76ceafcdf	Fix reading of parameter from traineddata normproto component The NonEssential parameter was wrongly derived from linear_token instead of essential_token and therefore always set to true. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-12 14:43:58 +02:00
Stefan Weil	c07bc4e014	Fix Doxygen comment Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-12 08:55:23 +02:00
Stefan Weil	c8e96e2c02	Fix cast from pointer to integer type Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-12 08:54:46 +02:00
Zdenko Podobný	3f4dcf3c8b	cmake: uninstall target	2019-05-08 19:19:26 +02:00
zdenop	a94334a255	cmake: fix build without pkg-config (issue #2424 )	2019-05-08 18:49:48 +02:00
Zdenko Podobný	68ca3518be	autotools: remove list of traineddata files	2019-05-08 15:36:58 +02:00
zdenop	28cfaaae43	Merge pull request #2423 from jbarlow83/fix-cppflags Fix CPPFLAGS configuration for icu4c and libarchive	2019-05-07 11:28:59 +02:00
James R. Barlow	403361701a	Fix CPPFLAGS configuration for icu4c and libarchive missing from configure.ac	2019-05-07 02:01:20 -07:00
zdenop	7a5b9b8fcd	ScrollView: remove custom implementation of GetAddrInfo	2019-05-04 15:16:41 +02:00
zdenop	5e01f74648	remove unused include	2019-05-04 15:14:54 +02:00
zdenop	83e92e0179	Merge pull request #2422 from stweil/include tesscallback: Remove unused code	2019-05-04 12:20:23 +02:00
Stefan Weil	aba037329a	tesscallback: Remove more unused code Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-04 11:05:50 +02:00
Stefan Weil	57ff92e4bf	tesscallback: Remove unused code Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-02 22:14:04 +02:00
zdenop	9192c3afe2	correct tessdata comment in baseapi.h	2019-05-02 08:43:04 +02:00
zdenop	7e48368a5e	Merge pull request #2421 from stweil/includes universalambigs: Add missing include file	2019-05-02 08:36:49 +02:00
zdenop	39d3824c78	Merge pull request #2420 from stweil/locale Fix more locale dependencies	2019-05-02 08:31:41 +02:00
zdenop	4b77d9e806	Merge pull request #2419 from stweil/typos Fix some typos (most found and fixed by codespell)	2019-05-02 08:29:13 +02:00
Stefan Weil	cd749be473	universalambigs: Add missing include file This allows fixing two compiler warnings from clang++: src/ccutil/universalambigs.cpp:23:19: warning: no previous extern declaration for non-static variable 'kUniversalAmbigsFile' [-Wmissing-variable-declarations] src/ccutil/universalambigs.cpp:19019:18: warning: no previous extern declaration for non-static variable 'ksizeofUniversalAmbigsFile' [-Wmissing-variable-declarations] Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-02 07:36:31 +02:00
Stefan Weil	4fbc0a257b	commandlineflags: Replace strtod by std::stringstream Using std::stringstream allows conversion of string to double independent of the current locale setting. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-02 07:33:46 +02:00
Stefan Weil	d047fa1d1b	paramsd: Replace strtod by std::stringstream Using std::stringstream allows conversion of string to double independent of the current locale setting. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-02 07:33:46 +02:00
Stefan Weil	e3860e45b7	clusttool: Replace strtof by std::stringstream Using std::stringstream allows conversion of string to float independent of the current locale setting. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-02 07:33:45 +02:00
Stefan Weil	ed45656ec8	clusttool: Remove unused code and some global functions * WriteProtoList is unused. Remove it. * ReadNFloats, WriteNFloats and WriteProtoStyle are only used locally, so make them local. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-02 07:33:45 +02:00
Stefan Weil	28a521fec2	Fix some typos (most found and fixed by codespell) Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-05-01 20:30:41 +02:00
zdenop	41f50b19bb	fix crash in case of missing PNG support in Leptonica see #2333	2019-05-01 19:51:54 +02:00
zdenop	90aef80dd7	fix documentation about datapath: ending "/" is not relevant	2019-05-01 11:37:50 +02:00
Zdenko Podobný	087576f2d9	cmake: fix linux build	2019-04-29 18:00:03 +02:00
Jeff Breidenbach	546a9e81eb	fix #1900 : intraword spacing for slightly better pdf copy-paste performance	2019-04-29 11:28:30 +02:00
zdenop	137e6de56f	Print info when uzn file is used.	2019-04-28 19:06:38 +02:00
zdenop	0fe929010a	cmake: fixes #2337 Android cross-build	2019-04-24 21:42:58 +02:00
Zdenko Podobný	80e54e401d	fix spelling	2019-04-24 15:35:22 +02:00
Zdenko Podobný	832c257771	remove unused variable	2019-04-24 14:55:35 +02:00
Stefan Weil	b7bc71e987	Fix build for Windows * winsock2.h is case sensitive, lower case is required for cross build. * ws2tcpip.h is required for addrinfo. * FreeAddrInfo conflicts with existing freeaddrinfo, so rename it. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2019-04-24 11:24:47 +02:00
zdenop	63448de640	cmake: remove host.h from installation, remove definition of NOMINMAX and report used C++ standard	2019-04-23 23:05:26 +02:00


3917 Commits
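
Several commits in this listing (5d92fbf010, 36ed6da349, 77f9bad3c2, 4fbc0a257b, d047fa1d1b, e3860e45b7) replace sscanf, strtod, or strtof with string streams so that float values are read and written with '.' as the decimal separator regardless of the user's locale. The following is a minimal sketch of that general technique, not the code from those commits: the helper names ParseFloatClassic and FormatFloatClassic are hypothetical, and the explicit imbue of the classic "C" locale is an assumption about how one might pin the separator.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not a Tesseract function): parse a float from text
// using the classic "C" locale, so '.' is always the decimal separator,
// whatever locale the process has selected.
bool ParseFloatClassic(const std::string& text, float* value) {
  std::istringstream stream(text);
  stream.imbue(std::locale::classic());
  stream >> *value;
  // fail() is set when no number could be extracted from the text.
  return !stream.fail();
}

// Hypothetical helper (not a Tesseract function): format a float using the
// classic "C" locale, so the output always uses '.' as decimal separator.
std::string FormatFloatClassic(float value) {
  std::ostringstream stream;
  stream.imbue(std::locale::classic());
  stream << value;
  return stream.str();
}
```

With helpers like these, a value such as "0.232621" from a data file would parse the same way under LANG=de_DE.UTF-8 as under the "C" locale, whereas sscanf with "%f" follows the current LC_NUMERIC setting and may stop at the '.'.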