tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-12-11 15:09:03 +08:00

Author	SHA1	Message	Date
Stefan Weil	488cc49aa8	Use env variable in AppVeyor configuration Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-09 19:23:40 +02:00
Egor Pugin	d93094b397	Merge pull request #1971 from stweil/fix Fix integer overflow in overlap calculation	2018-10-09 19:59:09 +03:00
Stefan Weil	7f911ac5e0	Fix integer overflow in overlap calculation Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-09 16:43:31 +02:00
zdenop	ca5d285a28	hocr: add ocrp_wconf to unconditional ocr-capabilities; fixes #1470	2018-10-09 16:34:50 +02:00
zdenop	956525f5a4	fix uninitialized variable, remove unused variable	2018-10-09 15:47:20 +02:00
zdenop	a6e716659e	Merge pull request #1970 from stweil/virtual Remove virtual specifiers	2018-10-09 15:40:47 +02:00
Zdenko Podobný	67b6b02e2d	Merge branch 'master' of https://github.com/tesseract-ocr/tesseract * 'master' of https://github.com/tesseract-ocr/tesseract: Remove code for _MSC_VER < 1900 keep API compatibility with #1265 Update googletest submodule to release v1.8.1 Update test submodule Always use isascii() with isspace() Avoid crash with --psm 0 and LSTM traineddata SVPaint: Remove empty block Classify: Don't hide debug parameter UNICHARMAP: Remove comparison which is always false svpaint: Change a variable from global to local pgedit: remove unused declaration of display_bln_lines Plumbing: Remove comparison which is always false Release candidate 2 use pdf L_FLATE_ENCODE only for png input; fixes #1961	2018-10-09 15:37:40 +02:00
Stefan Weil	128422e75c	Remove virtual specifiers Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-09 15:23:59 +02:00
zdenop	a9a411613a	Merge pull request #1968 from stweil/msvc Remove code for _MSC_VER < 1900	2018-10-09 14:42:37 +02:00
Stefan Weil	f94b3fd9fc	Remove code for _MSC_VER < 1900 Tesseract does not support Visual C++ older than Visual Studio 2015. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-09 14:05:21 +02:00
zdenop	c375f4fbf7	keep API compatibility with #1265	2018-10-09 11:22:15 +02:00
zdenop	7be5f74df8	Merge pull request #1966 from stweil/tests Update submodules for testing	2018-10-08 20:57:28 +02:00
Stefan Weil	af02ac6474	Update googletest submodule to release v1.8.1 Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 19:54:56 +02:00
Stefan Weil	eba1c81d52	Update test submodule The latest version includes more files for testing. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 19:44:08 +02:00
zdenop	272ebf995f	Merge pull request #1965 from stweil/isspace Always use isascii() with isspace()	2018-10-08 18:47:39 +02:00
zdenop	ab39adbcab	Merge pull request #1964 from stweil/fix Avoid crash with --psm 0 and LSTM traineddata	2018-10-08 18:45:37 +02:00
Stefan Weil	dcd0377bf0	Always use isascii() with isspace() isspace() must only be used with an unsigned char or EOF argument, and even then its result can depend on the current locale settings. While this is not a problem for C/C++ executables which use the default "C" locale, it becomes a problem when the Tesseract API is called from languages like Python or Java which don't use the "C" locale. By calling isascii() before calling isspace() this uncertainty can be avoided, because any locale will hopefully give identical results for the basic ASCII character set. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 17:25:09 +02:00
Stefan Weil	32e92def49	Avoid crash with --psm 0 and LSTM traineddata Orientation and script detect only worked with legacy models and crashed with LSTM models. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 16:03:54 +02:00
zdenop	59ebd58fcc	Merge pull request #1963 from stweil/fix Fix some warnings from static code analyzer LGTM	2018-10-08 15:09:59 +02:00
Stefan Weil	1eeca175f7	SVPaint: Remove empty block This fixes a warning from LGTM: Empty block without comment Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 14:25:05 +02:00
Stefan Weil	9c857ab962	Classify: Don't hide debug parameter Fix a warning from LGTM: Local variable 'debug' hides a parameter of the same name. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 14:22:31 +02:00
Stefan Weil	30b75cfc05	UNICHARMAP: Remove comparison which is always false Warning from LGTM: Comparison is always false because index <= 0 and 1 <= length. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 14:15:17 +02:00
Stefan Weil	3ae765ecca	svpaint: Change a variable from global to local This fixes a warning from LGTM: Poor global variable name 'rgb'. Prefer longer, descriptive names for globals (eg. kMyGlobalConstant, not foo). Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 13:53:09 +02:00
Stefan Weil	7b5955920d	pgedit: remove unused declaration of display_bln_lines This fixes a warning from LGTM: This parameter of type ScrollView is 144 bytes - consider passing a pointer/reference instead. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 13:49:59 +02:00
Stefan Weil	ae93b65b1f	Plumbing: Remove comparison which is always false Warning from LGTM: Comparison is always false because index >= 0. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-08 13:47:16 +02:00
zdenop	944816ae3d	Release candidate 2	2018-10-07 21:10:50 +02:00
zdenop	f794571195	use pdf L_FLATE_ENCODE only for png input; fixes #1961	2018-10-07 20:57:19 +02:00
Zdenko Podobný	8598731daf	Merge branch 'master' of https://github.com/tesseract-ocr/tesseract * 'master' of https://github.com/tesseract-ocr/tesseract: (27 commits) Rework check for readable input file fix "mktemp -d --tmpdir" on Mac OS; see #1453 pgedit: Change some variables from global to local ones improve description of min_characters_to_try variable WERD_RES: Remove comparisons which are constant GENERIC_2D_ARRAY: Pass parameters by reference genericvector: Pass parameters by reference chop: Use more efficient float calculations for sqrt rect: Use more efficient float calculations for ceil, floor intproto: Use more efficient float calculations for floor genericvector: Rewrite code to satisfy static code analyzer Fix constructor for class Dict (uninitialized member variables) Fix use of wrong UNICHARSET lstmtraining: Remove dead code for purified model name combine_tessdata: Handle failures when extracting lstmtraining: Check write permission for output model implement parameter min_characters_to_try for minimum characters to try to skip page entirely. fixes #1729 Merge and enhance documentation on language and script models Document some more config options for tesseract Add Makefile rule to build HTML manpages ...	2018-10-07 15:39:02 +02:00
Egor Pugin	5cf5c80ba1	Merge pull request #1960 from stweil/errhandling Rework check for readable input file	2018-10-07 12:23:31 +03:00
Stefan Weil	67bf9062df	Rework check for readable input file This reverts commit `1a096441d0` and implements an alternate check which allows input from stdin. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 22:33:02 +02:00
zdenop	140bfa43f0	Merge branch 'master' of https://github.com/tesseract-ocr/tesseract	2018-10-06 20:50:08 +02:00
zdenop	4044ba8260	fix "mktemp -d --tmpdir" on Mac OS; see #1453	2018-10-06 20:47:48 +02:00
zdenop	c4fb194ba2	Merge pull request #1958 from stweil/lgtm Fix some warnings from static code analyzer LGTM	2018-10-06 20:27:21 +02:00
Stefan Weil	685abc91f3	pgedit: Change some variables from global to local ones This fixes compiler warnings and a warning from LGTM: Poor global variable name 'pe'. Prefer longer, descriptive names [...] Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 20:14:20 +02:00
zdenop	424dbd5dc7	improve description of min_characters_to_try variable	2018-10-06 20:10:54 +02:00
Stefan Weil	18f7ab751e	WERD_RES: Remove comparisons which are constant This fixes warnings from LGTM: Comparison is always false because id >= 0. Comparison is always true because mirrored >= 1. Comparison is always false because id >= 0. INVALID_UNICHAR_ID is -1, so the warnings are correct. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 20:06:38 +02:00
Stefan Weil	238c872753	GENERIC_2D_ARRAY: Pass parameters by reference This fixes warnings from LGTM: This parameter of type FontClassInfo is 192 bytes - consider passing a pointer/reference instead. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 19:48:13 +02:00
Stefan Weil	a7982185c9	genericvector: Pass parameters by reference This fixes warnings like the following one from LGTM: This parameter of type ParamsTrainingHypothesis is 112 bytes - consider passing a pointer/reference instead. Most parameters can also get the const attribute. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 19:47:49 +02:00
Stefan Weil	819c43d377	chop: Use more efficient float calculations for sqrt This fixes warnings from LGTM: Multiplication result may overflow 'float' before it is converted to 'double'. While the sqrt function always calculates with double, here the overloaded std::sqrt can be used to handle the float arguments more efficiently. Replace also an old C++ type cast by a static_cast. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 18:59:23 +02:00
Stefan Weil	f264464ec6	rect: Use more efficient float calculations for ceil, floor This fixes warnings from LGTM: Multiplication result may overflow 'float' before it is converted to 'double'. While the floor function always calculates with double, here the overloaded std::floor can be used to handle the float arguments more efficiently. Replace also old C++ type casts by static_cast. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 18:51:06 +02:00
zdenop	1e4768c1f5	Merge pull request #1957 from stweil/lgtm Fix some warnings from static code analyzer LGTM	2018-10-06 18:42:12 +02:00
zdenop	e78c33cfc3	Merge pull request #1956 from stweil/valgrind Fix constructor for class Dict (uninitialized member variables)	2018-10-06 18:32:39 +02:00
Stefan Weil	b26866bb3b	intproto: Use more efficient float calculations for floor This fixes warnings from LGTM: Multiplication result may overflow 'float' before it is converted to 'double'. While the floor function always calculates with double, here the overloaded std::floor can be used to handle the float arguments more efficiently. Replace also old C++ type casts by static_cast. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 18:29:38 +02:00
Stefan Weil	06a8de0b8b	genericvector: Rewrite code to satisfy static code analyzer Warning from LGTM: Resource data_ is acquired by class GenericVector<FontSpacingInfo *> but not released in the destructor. LGTM complains about data_ not being deleted in the destructor. The destructor calls the clear() method, but the delete there was conditional which confuses the static code analyzer. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 18:24:13 +02:00
Stefan Weil	c2a8aa00b8	Fix constructor for class Dict (uninitialized member variables) wildcard_unichar_id_, apostrophe_unichar_id_, question_unichar_id_ and slash_unichar_id_ were not initialized in the constructor. slash_unichar_id_ was used later in a conditional. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 17:52:52 +02:00
zdenop	9efedc15b2	Merge pull request #1954 from stweil/unicharset Fix use of wrong UNICHARSET	2018-10-06 15:04:31 +02:00
zdenop	76cd80e1d7	Merge pull request #1953 from stweil/fix lstmtraining: Remove dead code for purified model name	2018-10-06 15:02:39 +02:00
Stefan Weil	8dc9e9fd14	Fix use of wrong UNICHARSET Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 13:21:09 +02:00
Stefan Weil	0e71e5a754	lstmtraining: Remove dead code for purified model name The purified model name `model_output` was unused, so remove the comment and the unused code. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2018-10-06 09:34:17 +02:00
Egor Pugin	0e43ae5cf4	Merge pull request #1951 from stweil/checkdir combine_tessdata, lstmtraining: Check for write failures	2018-10-05 23:38:01 +03:00
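
Commit dcd0377bf0 above explains why isspace() should be guarded by an ASCII check. The following is a minimal sketch of that idea, not the code from the commit: the helper names IsAsciiSpace and CountLeadingSpaces are hypothetical, and an explicit range check stands in for isascii(), which is a POSIX function rather than standard C++.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the Tesseract API.
// The range check plays the role of isascii(): only values 0..127 ever
// reach std::isspace(), so its argument is always valid and the result
// does not depend on how a locale classifies bytes >= 0x80.
inline bool IsAsciiSpace(int c) {
  return c >= 0 && c <= 0x7f && std::isspace(c) != 0;
}

// Hypothetical usage: count leading ASCII whitespace in a byte string.
// Bytes of multi-byte UTF-8 sequences are never treated as whitespace.
inline std::size_t CountLeadingSpaces(const std::string& s) {
  std::size_t n = 0;
  while (n < s.size() && IsAsciiSpace(static_cast<unsigned char>(s[n]))) {
    ++n;
  }
  return n;
}
```

Because the check short-circuits, negative values (including EOF and sign-extended char values) and bytes above 0x7f are rejected before std::isspace() is called.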

3211 Commits
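
Commits 819c43d377, f264464ec6 and b26866bb3b above replace double-precision math calls on float operands with the overloaded std:: functions. The sketch below illustrates the pattern under stated assumptions: the function names LengthViaDouble and LengthFloat are hypothetical and the bodies are not taken from the Tesseract sources.

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical "before" shape: the sum of products is evaluated in float
// and only then widened to double. This is the pattern a static analyzer
// reports as "may overflow 'float' before it is converted to 'double'".
inline double LengthViaDouble(float x, float y) {
  return std::sqrt(static_cast<double>(x * x + y * y));
}

// Hypothetical "after" shape: std::sqrt is overloaded for float, so the
// whole calculation stays in float and no widening conversion is needed.
inline float LengthFloat(float x, float y) {
  return std::sqrt(x * x + y * y);
}

// The float overload really returns float (checked at compile time).
static_assert(std::is_same<decltype(std::sqrt(1.0f)), float>::value,
              "std::sqrt(float) should return float");
```

Neither version extends the range of the multiplication itself. The second form only makes the float arithmetic explicit and avoids the detour through double.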