Thijs Leegwater
f061503a14
Added JPEG quality option parameter (-c jpg_quality=n)
2018-01-11 09:11:30 +01:00
Josh Reid
cdc35338c5
Added check if input PSM value is outside of range (#1236)
...
Wrote a function to throw an error if PSM is outside 0-13 or OEM is outside 0-5.
fixes #1234
2017-12-14 11:37:44 +01:00
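A minimal sketch of the kind of range check this commit describes, using the bounds stated in the message (PSM 0-13, OEM 0-5). The function name `CheckModeRange` and the use of `std::out_of_range` are assumptions for illustration, not Tesseract's actual code.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: rejects mode values outside the ranges named in the
// commit message (PSM 0-13, OEM 0-5). Not the function added by the commit.
void CheckModeRange(int psm, int oem) {
  if (psm < 0 || psm > 13) {
    throw std::out_of_range("PSM must be in 0-13, got " + std::to_string(psm));
  }
  if (oem < 0 || oem > 5) {
    throw std::out_of_range("OEM must be in 0-5, got " + std::to_string(oem));
  }
}
```

A caller would run the check once after parsing the command line and report the exception text before exiting.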
Stefan Weil
aa6eb6bd46
Remove Tesseract parameter "include_page_breaks" and use FF by default
...
Now Tesseract adds a page break (normally form feed) by default.
It is still possible to suppress page breaks by setting an empty
page_separator.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-09-19 07:34:32 +02:00
amitdo
a905548ed6
Autotools build: Remove the option 'USING_MULTIPLELIBS'
...
Libtool's convenience libraries should never be installed. Fixes #985.
2017-09-11 15:03:53 +03:00
Ray Smith
fc6a390c6c
Added intsimdmatrix as a generic integer matrixdotvector function with AVX2 and SSE specializations
2017-09-08 15:06:19 +01:00
Ray Smith
a18620cfea
Improved results on images with no resolution. Estimates resolution from the size of the connected components, based on average text size.
2017-09-08 09:37:03 +01:00
Stefan Weil
b9365cdff1
api: Fix typo in comment
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-09-03 09:14:00 +02:00
zdenop
7afa05a03e
Merge pull request #1072 from stweil/listlangs
...
List available languages recursively
2017-08-13 14:50:42 +02:00
chrismamo1
5fd3e22f74
move code around so that list-langs will work without an English traineddata file
2017-08-12 17:15:27 -05:00
Stefan Weil
cc0d87c5b8
List available languages recursively
...
Tesseract supports hierarchies of languages and uses them since
the new files best/*.traineddata were added.
Now `tesseract --list-langs` also shows any traineddata files in
subdirectories of the tessdata directory.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-10 18:55:38 +02:00
Stefan Weil
0720b3f38b
Change default resolution from 70 to 300 dpi
...
The default resolution is used for images without an explicit resolution
or with an unreasonable resolution (smaller than 70 or larger than 2400).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-08 16:48:10 +02:00
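The fallback rule in this commit message can be sketched as below. The function name `EffectiveResolution` is hypothetical, and treating a missing resolution as a reported value of 0 is an assumption. The limits (70, 2400) and the default (300) come from the message.

```cpp
// Hypothetical helper illustrating the rule from the commit message:
// resolutions below 70 or above 2400 dpi are replaced by the 300 dpi default.
// Assumption: an image without an explicit resolution reports 0.
int EffectiveResolution(int reported_dpi) {
  const int kMinCredibleDpi = 70;
  const int kMaxCredibleDpi = 2400;
  const int kDefaultDpi = 300;
  if (reported_dpi < kMinCredibleDpi || reported_dpi > kMaxCredibleDpi) {
    return kDefaultDpi;
  }
  return reported_dpi;
}
```

Under these assumptions a value of exactly 70 or 2400 is kept, since the message only calls values smaller than 70 or larger than 2400 unreasonable.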
Ray Smith
2ef1aeaeb4
Added AVX2 and AVX512 detector
2017-08-02 14:15:50 -07:00
Ray Smith
dc8745e6fd
Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe.
2017-07-14 11:14:23 -07:00
Ray Smith
7588540296
Removed changes from last commit that didn't belong
2017-07-14 11:08:26 -07:00
Ray Smith
3ec11bd37a
Deleted some dead LSTM code, making everything use the recoder
2017-07-14 10:58:21 -07:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Justin Hotchkiss Palermo
f057938069
fix filenames in comments
2017-07-02 17:35:47 -04:00
Justin Hotchkiss Palermo
1d862a54bd
Add new line to a few error messages.
2017-07-01 08:40:57 -04:00
Stefan Weil
1cf8fe51a0
Remove mathfix.h
...
It was only needed for MS Visual Studio 2012 and older.
Those compilers are not supported for Tesseract.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-05 20:26:25 +02:00
zdenop
ffb1ec3535
Merge pull request #918 from rfschtkt/issue529
...
Issue529
2017-05-13 19:33:46 +02:00
Raf Schietekat
b4cf46697f
Issue #529: inside main() use return rather than exit
2017-05-13 18:02:00 +02:00
Stefan Weil
84396707a8
Fix crash if output file could not be opened
...
This error case results in fout_ == nullptr.
Closing a nullptr file is not allowed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 17:27:07 +02:00
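The guard this commit describes can be illustrated with a small sketch. `CloseIfOpen` is a hypothetical helper, not the code changed by the commit. It only shows the pattern of skipping `fclose` when the stream pointer is null, since passing a null pointer to `fclose` is undefined behavior.

```cpp
#include <cstdio>

// Hypothetical helper: closes the stream only if it was opened successfully.
// Returns true when a stream was closed without error, false otherwise.
bool CloseIfOpen(std::FILE* stream) {
  if (stream == nullptr) {
    return false;  // Nothing to close; the open step failed earlier.
  }
  return std::fclose(stream) == 0;
}
```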
zdenop
29f3de9be1
Merge pull request #914 from stweil/clean
...
Clean code
2017-05-13 12:45:57 +02:00
Stefan Weil
5dc4af62fb
baseapi: Simplify code
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 12:14:29 +02:00
Stefan Weil
78142593d2
Fix order of destructor calls for DawgCache and TessBaseAPI
...
TessBaseAPI must release its cache use before DawgCache is destroyed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 11:35:30 +02:00
Stefan Weil
f37f858c99
main: Fix two memory leaks
...
When Tesseract terminates by calling the exit function,
the destructor of any local auto variable is not called.
Fix two cases by using static variables.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 21:15:52 +02:00
Stefan Weil
5e3665c6ae
Remove most libtiff dependencies
...
libtiff is no longer needed for OpenCL, so remove that dependency.
It is still suggested for Windows to redirect warning messages
from the tesseract executable to the console.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 10:15:35 +02:00
Raf Schietekat
c335508e84
Fewer g++ -Wsign-compare warnings
2017-05-11 23:14:52 +02:00
zdenop
64994a2707
Merge pull request #900 from rfschtkt/cast
...
Reviewed uses of reinterpret_cast
2017-05-11 16:08:12 +02:00
Raf Schietekat
8aa0a2dd48
RAII: *::GetUNLVText()
2017-05-11 02:02:37 +02:00
Raf Schietekat
1dab23916f
RAII: *::GetBoxText()
2017-05-11 02:02:37 +02:00
Raf Schietekat
b7b68a65dd
RAII: *::GetTSVText()
2017-05-11 02:02:37 +02:00
Raf Schietekat
a1fff874b4
RAII: *::GetHOCRText()
2017-05-11 02:02:37 +02:00
Raf Schietekat
986970d6ca
RAII: pdfrenderer.cpp: pdftext
2017-05-11 02:02:37 +02:00
Raf Schietekat
3c6e18ecf9
RAII: pdfrenderer.cpp: buffer
2017-05-11 02:02:37 +02:00
Raf Schietekat
936ca00c44
RAII: pdfrenderer.cpp: cidtogidmap
2017-05-11 02:02:37 +02:00
Raf Schietekat
2772f78170
RAII: LTRResultIterator::GetUTF8Text
2017-05-11 02:02:37 +02:00
Raf Schietekat
f75665c34f
RAII: TessBaseAPI::GetUTF8Text()
2017-05-11 02:02:37 +02:00
Raf Schietekat
4840c65bf0
RAII: ResultIterator::GetUTF8Text(): was leaked inside TessBaseAPI::GetUTF8Text()
2017-05-11 02:02:37 +02:00
Raf Schietekat
3983d2f76a
Reviewed uses of reinterpret_cast
2017-05-11 01:58:40 +02:00
Egor Pugin
0afd5939b1
Use NDEBUG macro instead of DEBUG.
2017-05-08 13:01:22 +03:00
Ray Smith
6ac31dcbdd
Fixed DetectOS so it doesn't crash with a big image
2017-05-03 15:50:31 -07:00
Stefan Weil
c1d649ebbc
api: Replace Tesseract data types by POSIX data types
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-02 18:21:44 +02:00
Stefan Weil
aea0d9a8d5
api: Remove unneeded NULL checks
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-04-30 19:23:24 +02:00
Stefan Weil
1c59914b61
Use Leptonica struct names L_Compressed_Data, Pix
...
The Tesseract project prefers those names, so fix the remaining exceptions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-04-30 10:50:12 +02:00
Ray Smith
7a116ce8bb
More formatting fixes from clang tidy
2017-04-28 13:38:32 -07:00
Ray Smith
77015526fa
Jeff's fixes to pdf rendering
2017-04-28 13:38:13 -07:00
zdenop
13b7900ebf
Merge pull request #778 from cjmayo/singleopts
...
tidy tesseract(1) adding missing options
2017-04-28 18:58:40 +02:00
Ray Smith
1cc511188d
Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here.
2017-04-27 15:48:23 -07:00
James R. Barlow
f54577e6be
Fix #786 - 3.05 linkage fails on macOS Sierra with --enable-opencl
...
Also needed for 4.00.
2017-04-10 22:22:49 -07:00