Commit Graph

2360 Commits

Author SHA1 Message Date
Shreeshrii
7b409a1bfa unittest testfile 2017-08-19 18:43:57 +05:30
Shreeshrii
436ad77e44 Create readme.md 2017-08-19 18:43:10 +05:30
Shreeshrii
88e4c62b39 Add files via upload 2017-08-19 18:42:06 +05:30
theraysmith
6f13d75534 Merge pull request #1051 from stweil/googletest
Add GoogleTest infrastructure
2017-08-17 20:19:46 -07:00
zdenop
3847b7dd74 Merge pull request #1085 from KindDragon/patch-1
Added CMake option to use system ICU library
2017-08-17 08:37:33 +02:00
Arkady Shapkin
d171488e21 Added CMake option to use system ICU library 2017-08-17 02:50:54 +03:00
zdenop
7afa05a03e Merge pull request #1072 from stweil/listlangs
List available languages recursively
2017-08-13 14:50:42 +02:00
zdenop
197b89b6ac Merge pull request #1077 from chrismamo1/chore/cleanup-compiler-warnings
WIP: Chore/cleanup compiler warnings
2017-08-13 14:50:26 +02:00
zdenop
3755a29abb Merge pull request #1076 from chrismamo1/bug/listlangs-without-eng
move code around so that list-langs will work without an English traineddata file
2017-08-13 14:50:10 +02:00
chrismamo1
6f281c36a7 fix a problem I introduced in a previous commit 2017-08-12 18:09:22 -05:00
chrismamo1
7111167497 fix a set-but-not-used warning and add casts for comparing signed+unsigned numbers 2017-08-12 17:53:28 -05:00
chrismamo1
b89bb09f9b fix a set but not used warning and cleanup some old code from 2007 2017-08-12 17:48:33 -05:00
chrismamo1
f9b51d7983 suppress a strict aliasing warning; the original author was very clear about the nature of the problematic code 2017-08-12 17:36:50 -05:00
chrismamo1
5fd3e22f74 move code around so that list-langs will work without an English traineddata file 2017-08-12 17:15:27 -05:00
Stefan Weil
cc0d87c5b8 List available languages recursively
Tesseract supports hierarchies of languages and uses them since
the new files best/*.traineddata were added.

Now `tesseract --list-langs` also shows any traineddata files in
subdirectories of the tessdata directory.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-10 18:55:38 +02:00
Egor Pugin
efa50daf5a Merge pull request #1070 from stweil/resolution
Change default resolution from 70 to 300 dpi
2017-08-08 23:05:14 +03:00
Stefan Weil
0720b3f38b Change default resolution from 70 to 300 dpi
The default resolution is used for images without an explicit resolution
or with an unreasonable resolution (smaller than 70 or larger than 2400).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-08 16:48:10 +02:00
Ray Smith
5f5e85e4a0 Fixed lack of error on non-existent traineddata 2017-08-07 09:58:43 -07:00
Ray Smith
0a91498195 Improved error message on missing optional config 2017-08-07 09:50:49 -07:00
Ray Smith
4b3c5f6c35 Added check for non-empty traineddata flag 2017-08-07 09:43:30 -07:00
Egor Pugin
c67c2e9f41 Add combine_lang_model to cmake and cppan builds. 2017-08-06 14:46:32 +03:00
zdenop
08ec5775a1 Merge pull request #1064 from stweil/win32
Fix broken build for Windows
2017-08-04 10:50:01 +02:00
Stefan Weil
cdec915e17 Fix broken build for Windows
Windows does not provide a mkdir function with two parameters.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-04 10:18:35 +02:00
Ray Smith
8e55e52be7 Harder unittest that uses file i/o and string manipulation 2017-08-03 15:51:18 -07:00
Ray Smith
4572940639 Portability fix to help tests compile with the same code in both Google and github 2017-08-03 15:42:26 -07:00
Ray Smith
2fbcba62e5 Initial push of one simple unittest 2017-08-02 17:35:29 -07:00
Ray Smith
77c44cdecd Added convert to int and directory listing to combine_tessdata 2017-08-02 14:53:07 -07:00
Ray Smith
2ef1aeaeb4 Added AVX2 and AVX512 detector 2017-08-02 14:15:50 -07:00
Ray Smith
39b168a0b6 Removed errors introduced by git merge 2017-08-02 14:12:45 -07:00
Ray Smith
4e9665debf Added ADAM optimizer, unless git screwed it up, cos there is no diff 2017-08-02 14:03:50 -07:00
Ray Smith
2633fef0b6 Part 2 of separating out the unicharset from the LSTM model, fixing command line for training 2017-08-02 13:29:23 -07:00
Egor Pugin
61adbdfa4b Merge pull request #1054 from tdhintz/master
std::max build fix.
2017-07-27 02:49:21 +03:00
Hintz
67314ea9bd Merge pull request #1 from tdhintz/tdhintz-stdmax-patch
Define std::max under VS2017 x64
2017-07-26 16:40:08 -05:00
Hintz
c5a861b229 Define std::max under VS2017 x64 2017-07-26 17:19:40 -04:00
Ray Smith
0e95e2ca87 Rewrote the recoder to use an encoding based on wubi instead of radical-stroke index, changed from normalized to unnormalized unichar representation 2017-07-25 09:40:44 -07:00
Ray Smith
b0ead95d64 Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested. 2017-07-24 11:45:57 -07:00
Stefan Weil
99755b0732 googletest: Add dummy test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-24 19:45:06 +02:00
Stefan Weil
796cd7ab56 cmake: Add googletest
The submodule is built automatically as soon as it exists.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-24 19:45:06 +02:00
Stefan Weil
f36dc34c4f Add googletest submodule
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-24 19:45:06 +02:00
Ray Smith
4efc539f51 clang tidy on previous pull 2017-07-19 17:04:49 -07:00
Ray Smith
4e8018d013 Important fix to RTL languages saves last space on each line, which was previously lost 2017-07-19 17:04:06 -07:00
Ray Smith
3f7735492f Removed unnecessary using statements and cleaned up google/non-google distinction 2017-07-19 16:42:48 -07:00
Ray Smith
cec1037260 Fixed BestPix to always return the highest resolution available, even if a lower bit depth than the original 2017-07-19 16:28:26 -07:00
Egor Pugin
66e686a0e6 Merge pull request #1041 from stweil/leptonica
Use lept_free to free memory allocated by Leptonica
2017-07-16 18:04:54 +03:00
Egor Pugin
900bf6076f Merge pull request #1040 from stweil/clean
Delete unused code in PangoFontInfo
2017-07-16 14:21:08 +03:00
rays
45fb7dde49 Fixed regression of issue #644 again! 2017-07-15 23:36:58 -07:00
Stefan Weil
ba95a686aa Use lept_free to free memory allocated by Leptonica
This fixes problems on Windows when Tesseract and Leptonica use different
C runtime libraries.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 08:34:18 +02:00
Stefan Weil
5a7b7ed7e1 PangoFontInfo: Remove unused method is_italic
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:22:05 +02:00
Stefan Weil
0cd71c67c9 PangoFontInfo: Remove unused method is_bold
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:59 +02:00
Stefan Weil
fbfbf67cf9 PangoFontInfo: Remove unused method is_smallcaps
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:49 +02:00