Egor Pugin
0eaabc42c7
Update CMakeLists.txt
2020-05-12 11:49:15 +03:00
Egor Pugin
e720a26745
[cmake] Set inactivity timeout during icu download to 300 seconds.
...
Fixes #2972.
2020-05-09 18:55:45 +03:00
Stefan Weil
fe966cc0b1
Add build script for oss-fuzz fuzzers
...
This is a copy of projects/tesseract-ocr/build.sh including its history from
https://github.com/google/oss-fuzz.git .
It allows maintaining the build rules with the Tesseract source code.
The build rules for Leptonica were slightly modified to avoid
unneeded compilations.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-05-08 17:37:37 +02:00
Stefan Weil
016016df77
Build only required Leptonica components
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-05-08 17:37:37 +02:00
Guido Vranken
6e9a1e97db
Fix build (#3177)
...
* [tesseract-ocr] Fix build
* [tesseract-ocr] Disable AFL, lower resolution
2020-05-08 17:37:37 +02:00
jonathanmetzman
db5655333e
Migrate projects using -lFuzzingEngine to $LIB_FUZZING_ENGINE (#2325)
...
Migrate from -lFuzzingEngine to $LIB_FUZZING_ENGINE where possible and not causing breakage
2020-05-08 17:37:37 +02:00
Guido Vranken
56b94fb783
Add fuzzer that processes 512x512 images (#2279)
2020-05-08 17:37:37 +02:00
Guido Vranken
b2d1a11016
Use Leptonica master branch (#2224)
2020-05-08 17:37:37 +02:00
Guido Vranken
1a7f633ab0
Add Tesseract (#2210)
...
* Add Tesseract
* Use -lz instead of static library path
* Disable Tesseract shared build
* Minimal repository cloning (--depth 1)
* Improve tessdata directory resolution syntax
* Don't hardcode TESSDATA_PREFIX into binary
* Don't move, but copy $SRC/tessdata to $OUT
Move sometimes results in "inter-device move failed"
2020-05-08 17:37:37 +02:00
Robin Watts
80d4af6ecf
Add a mechanism to avoid creating debug fonts.
...
If TESSERACT_DISABLE_DEBUG_FONTS is defined, tesseract doesn't
attempt to create any debug fonts. This not only saves memory,
but it (combined with the change to optionally use Pix as
internal storage for the ImageData) allows us to use an
embedded Leptonica library with no format handlers at all.
2020-05-05 00:22:23 +01:00
Robin Watts
6bcb941bcf
Avoid tesseract writing Pix out/reading them back.
...
By default, when we ImageData::SetPix, we write the data out as a
PNG, just to read it back in to get a compressed buffer of data.
We then use this to generate a new Pix.
In builds of Tesseract on systems where we don't have temp files,
writing files out is problematic.
Not only that, but compressing/uncompressing is slow, and on minimal
builds of leptonica, where we've disabled the format writers to reduce
memory footprint, we get no compression anyway.
In such cases, it'd be far nicer just to keep the original Pix as
the internal data.
Also, when recovering the pixmap from the ImageData, if we know we're
only going to read from the data, we can avoid duplicating it and
just use the original. This is exactly the case when GRAPHICS_DISABLED
is set.
So, introduce a TESSERACT_IMAGEDATA_AS_PIX predefine that we can use
to cause the internal data to be a Pix rather than a compressed
buffer.
Given we don't do compression, and they were writing to memory,
this was all just more effort than we needed.
Also, if we're using GRAPHICS_DISABLED, we might as well just
pixCopy rather than pixClone as only the scaler uses this.
2020-05-04 21:01:22 +01:00
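A minimal sketch of the storage switch described in 6bcb941bcf, assuming a compile-time flag chooses between keeping the original image and keeping an encoded copy. Everything here is hypothetical and for illustration only: `Image` stands in for Leptonica's Pix, `ImageHolder` for ImageData, `SKETCH_IMAGEDATA_AS_PIX` for TESSERACT_IMAGEDATA_AS_PIX, and the "encoding" is a plain deep copy rather than PNG compression. It is not the code from the commit.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for Leptonica's Pix; not the real type.
struct Image {
  int width = 0;
  int height = 0;
  std::vector<std::uint8_t> pixels;
};

// Hypothetical holder showing the two storage strategies side by side.
class ImageHolder {
 public:
  void Set(std::shared_ptr<const Image> image) {
#ifdef SKETCH_IMAGEDATA_AS_PIX
    image_ = std::move(image);  // keep the original image, no round trip
#else
    encoded_ = *image;          // default path: store a separate "encoded" copy
#endif
  }

  std::shared_ptr<const Image> Get() const {
#ifdef SKETCH_IMAGEDATA_AS_PIX
    return image_;              // read-only callers share the stored image
#else
    return std::make_shared<const Image>(encoded_);  // rebuild on every read
#endif
  }

 private:
#ifdef SKETCH_IMAGEDATA_AS_PIX
  std::shared_ptr<const Image> image_;
#else
  Image encoded_;  // placeholder for the compressed buffer (here a deep copy)
#endif
};
```

Building with `-DSKETCH_IMAGEDATA_AS_PIX` selects the branch that avoids the copy. Callers see the same `Set`/`Get` interface either way.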
zdenop
79c3ebbbb9
Merge pull request #2962 from stweil/GetPageRes
...
Add TessBaseAPI::GetPageRes again
2020-05-04 15:15:29 +02:00
Stefan Weil
9173e6e3f7
Add TessBaseAPI::GetPageRes again
...
It is now added unconditionally, so it is always available for the unittest.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-05-04 14:03:39 +02:00
Amit D
acc4c8bff5
Merge pull request #2952 from jannick0/patch-1
...
[trie.h] pattern definition: fix documentation
2020-04-27 23:44:48 +03:00
zdenop
23be532f7d
Merge pull request #2957 from stweil/master
2020-04-27 19:56:32 +02:00
Stefan Weil
1188e0a516
Remove old code which was used for Ocropus
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-04-27 16:33:34 +02:00
jannick0
e044163085
[trie.h] pattern definition: fix documentation
...
The fix makes the definition of `\n` consistent with the examples given below the definition. Please note that I did not check this against how it is implemented in the code.
2020-04-19 13:47:42 +02:00
Egor Pugin
cdebe13d81
[ci] Add fail-fast: false strategy.
2020-03-30 01:53:41 +03:00
Stefan Weil
4a00b68c63
Fix lambda function for curl code errors
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-18 20:46:52 +01:00
Stefan Weil
9f5a3f6ac7
Fix uninitialized local variable in curl code
...
Compiler warning:
src/api/baseapi.cpp:1151:27: warning:
variable 'curlcode' is uninitialized when used here [-Wuninitialized]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-18 19:25:33 +01:00
zdenop
6e307074d8
Merge pull request #2894 from stweil/curl
...
Report errors from curl_easy functions
2020-03-18 14:14:07 +01:00
Egor Pugin
916875d74a
[sw] Fix mingw build.
2020-03-17 17:57:10 +03:00
Egor Pugin
04a7650b51
Update README.md
2020-03-14 03:23:14 +03:00
Egor Pugin
e1cf69fd9e
[ci] Update.
2020-03-13 23:45:38 +03:00
Egor Pugin
a6c8d4c692
[ci] Merge three configs into one.
2020-03-13 19:37:22 +03:00
Stefan Weil
ef4f99a994
Run xgetbv instruction only on machines which support it
...
This fixes a regression for older Intel processors.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-08 17:32:10 +01:00
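A rough sketch of the kind of guard ef4f99a994 describes, assuming GCC or Clang on x86: the xgetbv instruction is only executed once CPUID leaf 1 reports the OSXSAVE bit (ECX bit 27). The function names are hypothetical, and this is not the code from the commit.

```cpp
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>  // GCC/Clang helper header providing __get_cpuid
#define SKETCH_X86_GNU 1
#endif

// True if the CPU reports OSXSAVE, i.e. xgetbv may be executed safely.
bool cpu_reports_osxsave() {
#ifdef SKETCH_X86_GNU
  unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
    return false;  // CPUID leaf 1 not available
  }
  return (ecx & (1u << 27)) != 0;  // CPUID.1:ECX bit 27 = OSXSAVE
#else
  return false;  // other architectures or compilers: treat as unsupported
#endif
}

// True if the OS saves and restores the SSE and AVX register state.
bool os_saves_avx_state() {
#ifdef SKETCH_X86_GNU
  if (!cpu_reports_osxsave()) {
    return false;  // never run xgetbv on CPUs that lack it
  }
  unsigned lo = 0, hi = 0;
  __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0u));  // read XCR0
  (void)hi;
  return (lo & 0x6u) == 0x6u;  // XCR0 bit 1 (SSE state) and bit 2 (AVX state)
#else
  return false;
#endif
}
```

On processors without OSXSAVE the second function returns early, so the instruction that would otherwise fault is never reached.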
Stefan Weil
a7c9c566ee
Update submodule googletest to tagged release release-1.10.0
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-08 17:29:46 +01:00
Stefan Weil
a350108592
Update submodule abseil to tagged release 20200225
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-08 17:29:09 +01:00
Stefan Weil
eff4dc0603
Use lambda expressions for reporting curl errors
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-23 22:44:42 +01:00
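A minimal sketch of lambda-based error reporting in the spirit of eff4dc0603, with the status variable initialised up front (the class of warning fixed later in 9f5a3f6ac7). To stay compilable without libcurl, `FakeCode` is a hypothetical stand-in for CURLcode, and `fetch` and its steps are invented for illustration. The actual Tesseract code may look quite different.

```cpp
#include <string>

// Hypothetical stand-in for libcurl's CURLcode, so no libcurl is needed.
enum FakeCode { FAKE_OK = 0, FAKE_FAILED = 1 };

// Runs two pretend steps and reports the first failure through one lambda.
bool fetch(bool fail_setup, std::string *error) {
  FakeCode code = FAKE_OK;  // initialised, so no path reads it unset

  // Shared reporter: captures `code` by reference and formats the message.
  auto failed = [&](const char *step) {
    if (code == FAKE_OK) {
      return false;
    }
    *error = std::string(step) + " failed with code " +
             std::to_string(static_cast<int>(code));
    return true;
  };

  code = fail_setup ? FAKE_FAILED : FAKE_OK;  // pretend "setup" call
  if (failed("setup")) {
    return false;
  }
  code = FAKE_OK;                             // pretend "perform" call
  if (failed("perform")) {
    return false;
  }
  return true;
}
```

Capturing the status by reference lets each call site stay a single `if`, and the message format lives in one place.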
Stefan Weil
9972c91127
Report errors from curl_easy functions
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-23 22:26:51 +01:00
Egor Pugin
90405ad0e3
Merge pull request #2893 from stweil/piccolo
...
Update piccolo2d-core and piccolo2d-extras
2020-02-23 19:20:44 +03:00
Egor Pugin
bbd2c31b91
Merge pull request #2895 from stweil/avx
...
simd: Check whether the OS supports FMA, AVX, ...
2020-02-23 19:20:18 +03:00
Stefan Weil
57ff90687d
simd: Check whether the OS supports FMA, AVX, ...
...
The previous check was only for the MS compiler, but not for gcc and clang.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-23 16:34:35 +01:00
Stefan Weil
62010da593
Update piccolo2d-core and piccolo2d-extras
...
Also make curl less noisy.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-23 16:28:08 +01:00
Egor Pugin
695f862bd7
Update linux.yml
2020-02-21 10:37:05 +03:00
Egor Pugin
d6265166c6
Fix sw build after upload.
2020-02-20 13:44:07 +03:00
Egor Pugin
cfb2a2c3a4
Add sw options for openmp.
2020-02-19 16:08:51 +03:00
Stefan Brechtken
b2ed8038d1
TableFind: clearing the statically allocated memory on api end
2020-02-19 13:18:28 +01:00
Stefan Brechtken
b3649b9fb2
TableFind: Api access, reskew and y inversion of the resulting TBOXes
2020-02-19 12:36:22 +01:00
Stefan Brechtken
877ef39c40
TableRecognizer: Adding functions in order to calculate row and columns bounding boxes
2020-02-19 12:23:26 +01:00
Stefan Brechtken
2954367c62
Temporary: adding Singleton pattern in order to bypass tesseract iterators and lists
2020-02-19 12:12:57 +01:00
Stefan Brechtken
46f16b8430
Adding workflow order to table detector java debug output + adding final column and row calculation
2020-02-19 12:08:16 +01:00
zdenop
95befed6b1
Merge pull request #2880 from HelgeSverre/patch-1
...
Update README.md
2020-02-07 10:43:36 +01:00
Helge Sverre
0705abf827
Update README.md
...
Typo in the word "documentation" in the link to the "Running Tesseract" section
2020-02-07 10:37:10 +01:00
zdenop
ddb663c099
Merge pull request #2878 from zuphilip/patch-1
...
Update link about pre-build binary packages
2020-02-05 08:07:40 +01:00
Philipp Zumstein
ca2624cdcb
Update link about pre-build binary packages
2020-02-05 07:44:18 +01:00
zdenop
7c3ac569f9
Replace references to the old wiki by new URLs (#2877)
...
Replace references to the old wiki by new URLs
2020-02-03 14:59:18 +01:00
Stefan Weil
16553014e0
Replace references to the old wiki by new URLs
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-03 11:37:41 +01:00
Stefan Weil
20bcbc4058
Catch std::runtime_error exception when setting the locale in debug code
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-03 07:58:43 +01:00
Stefan Weil
a1a177f582
Doxyfile: Add missing source directories (include, unittest)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-01-30 14:35:24 +01:00