Commit Graph

174 Commits

Author SHA1 Message Date
sunyuechi
16fc9d90a4 Add RISC-V V support (#4346)
Convert riscv-v-spec-1.0.pdf into 111 PNG images,
then perform OCR on each one in sequence,
and measure the testing time on banana_f3:

old:        31m16.267s
new:        16m51.155s

Co-authored-by: sunyuechi <sunyuechi@iscas.ac.cn>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2024-11-08 08:09:01 +01:00
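The two timings quoted in the commit message above can be turned into a speedup figure. The following is a minimal sketch, not part of the commit; the helper name `to_seconds` is hypothetical, and the only inputs are the two durations from the message. It gives a speedup of roughly 1.86x.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '31m16.267s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Durations copied from the commit message (banana_f3, 111 PNG pages).
old = to_seconds("31m16.267s")
new = to_seconds("16m51.155s")

speedup = old / new      # how many times faster the new build is
saved = 1 - new / old    # fraction of wall-clock time saved

print(f"old: {old:.3f}s, new: {new:.3f}s")
print(f"speedup: {speedup:.2f}x, time saved: {saved:.1%}")
```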
Stefan Weil
d7c0a05ffa Remove Tensorflow support
Tensorflow was never used because of missing models.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-11-07 13:40:43 +01:00
Stefan Weil
c886e3b639 Update NSIS configuration
- Move NSIS installer file to new location
- Support cross builds with NSIS
- Clean nsis configuration
- Fix typos in nsis configuration
- Add jar files needed for ScrollView.jar
- Move ScrollView.jar to a new section
- Add missing configurations to tessdata
- Registry settings are now disabled (problems with long PATH)
- Add menu sections for all languages
- Simplify language downloads
- Tune and improve nsis configuration
- Add sizes for language data
- Add missing translations to nsis configuration
- Don't show details in installer by default
- Initial code for 64 bit Tesseract installer
- Fix uninstall for TESSDATA_PREFIX registry key
- Remove cube code
- nsis: Add all training executables
- nsis: Disable registry settings

Trying to add to PATH fails if the old PATH is very long and
will result in an empty PATH.

Remove these settings as they were already disabled by default,
and both are not needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-11-02 07:00:33 +01:00
Stefan Weil
d50600a618 Remove old comment in Makefile.am
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 08:05:07 +02:00
Stefan Weil
67aad9ed13 Compile src/lstm/tfnetwork.cpp only in builds with TensorFlow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 08:03:53 +02:00
Stefan Weil
d72567ad45 Use AM_CPPFLAGS also for compilation of all sources
It was not used for all sources before. Therefore some parts of the
code (especially the code for training) used different compiler
options. For example NDEBUG was not defined, and so the training
code was built with debug assertions (resulting in potentially
slower execution).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-05-19 18:43:55 +02:00
Jan Kamlah
577e8a8b93 Add PAGE XML renderer / export (#4214)
Add PAGE XML export and documentation.
To generate PAGE XML output just add 'page' to the tesseract command.

The output is outputname + '.page.xml' to avoid conflicts with ALTO export.

The output can be customized with the flags:
tessedit_create_page_polygon and tessedit_create_page_wordlevel.

Co-authored-by: Stefan Weil <sw@weilnetz.de>
2024-04-19 21:12:39 +02:00
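The PAGE XML commit above describes adding 'page' to the tesseract command and naming the output outputname + '.page.xml'. The sketch below only assembles such a command line and does not run Tesseract. The function name `page_xml_command`, the file names, the use of `-c NAME=VALUE` to set the two flags, and the 0/1 values are assumptions for illustration; the commit message itself names only the 'page' config and the two flags.

```python
import shlex


def page_xml_command(image, outputbase, polygon=None, wordlevel=None):
    """Build (but do not run) a tesseract command that requests PAGE XML."""
    cmd = ["tesseract", image, outputbase]
    flags = (
        ("tessedit_create_page_polygon", polygon),
        ("tessedit_create_page_wordlevel", wordlevel),
    )
    for name, value in flags:
        if value is not None:
            # Assumption: boolean flags are passed as 0/1 via -c.
            cmd += ["-c", f"{name}={int(value)}"]
    # 'page' selects the PAGE XML renderer, as described in the commit.
    cmd.append("page")
    return cmd


cmd = page_xml_command("scan.png", "scan", wordlevel=True)
print(shlex.join(cmd))
# Per the commit message, the result is written to outputname + '.page.xml'.
print("expected output file:", "scan" + ".page.xml")
```

The list returned by `page_xml_command` could be handed to `subprocess.run` on a system where a Tesseract build containing this commit is installed.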
Stefan Weil
d5e000bc58 Remove unsupported OpenCL code and related API functions (#4220)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-04-11 19:15:39 +03:00
Stefan Weil
92999505ee Abort with error message if OSD is requested with LSTM-only model
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-05-09 19:24:24 +02:00
Stefan Weil
1e04be842d Replace 'can not' by 'cannot'
Both forms are used in American English, but 'cannot' is more common
(also in Tesseract code), so use it always.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-02-08 17:34:22 +01:00
Stefan Weil
b7d7b85834 Add missing .exe for training tools to fix build with msys2
Fixes: 8c573e4cef ("autotools: Add rule for svpaint executable")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-08-25 14:34:19 +02:00
Stefan Weil
8c573e4cef autotools: Add rule for svpaint executable (#3873)
Move also its source code svpaint.cpp from src/viewer/ to src/,
so it is no longer included in libtesseract by the cmake build.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-07-19 00:50:01 +03:00
Stefan Weil
b0d82879e5 Add initial support for Intel AVX512F
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-04-14 23:47:04 +02:00
Stefan Weil
a773bf28db Fix linker flags for MSYS2 clang64 builds
MSYS2 clang64 uses the lld linker which does not support --as-needed.
The normal GNU ld uses that linker option with ELF targets but ignores
it for PE targets (.exe, .dll), so it can be removed.

Remove also the -Wl, which is only needed when linker options are
passed to the compiler but not when they are directly passed to the
linker.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-27 11:11:40 +01:00
Egor Pugin
6115200f40 Update Makefile.am
2022-02-07 03:24:51 +03:00
Stefan Weil
5884036ecd Don't use compiler flags -march=native -mtune=native in autoconf builds
Using those flags is not acceptable for Linux distributions
because the resulting code then depends on the build
infrastructure, so the build result is not deterministic.

It is still possible to use those compiler flags by specifying
CXXFLAGS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-11 12:29:51 +01:00
Stefan Weil
7058bbf282 Move googletest to unittest/third_party/googletest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 11:50:50 +01:00
Stefan Weil
104ef8f30e Move src/api/tesseractmain.cpp to src/tesseract.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-31 21:43:30 +01:00
Stefan Weil
676b86be4d Fix automake warning because of redefined DEFAULT_INCLUDES
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 22:14:06 +02:00
Stefan Weil
0aad8b8619 Fix build with OpenCL and add namespace to OpenCL code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-06 07:51:03 +02:00
Stefan Weil
f9d17598a8 Make automake builds less noisy by default
The old commit only silenced parts of the build,
while the new one silences the whole build.

Fixes: 47af1282f4
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-24 15:21:26 +02:00
zdenop
0c49ee18cd fix visibility compilation
2021-09-17 17:38:01 +02:00
Stefan Weil
4dcd8fa591 Fix handling of TESSDATA_PREFIX containing // (fixes issue #3527)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-24 20:05:54 +02:00
Egor Pugin
407346246c [universalambigs] Use inline variables.
2021-08-20 12:38:03 +03:00
Stefan Weil
63c12a9ee5 unittest: Enable more code for tatweel_test without requiring Tensorflow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-12 14:12:53 +02:00
Stefan Weil
49f410ced3 unittest: Remove dependency on absl::StripAsciiWhitespace()
This removes the last dependency on Abseil, so that submodule
is now removed completely.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:10 +02:00
Stefan Weil
87707bb8b0 unittest: Remove dependency on absl::StrSplit()
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:09 +02:00
Stefan Weil
f407345cbe unittest: Remove dependency on absl::StrJoin()
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:09 +02:00
Stefan Weil
61b8e301dd unittest: Remove dependency on absl::StrCat()
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:09 +02:00
Stefan Weil
8486f59493 unittest: Remove dependency on absl::StrFormat()
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:09 +02:00
Stefan Weil
fe5ca9dad9 unittest: Remove dependency on absl::GetCurrentTimeNanos()
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:09 +02:00
Stefan Weil
6b8b1f0007 unittest: Remove some dependencies on abseil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-06 20:59:09 +02:00
Stefan Weil
a73e7b97a4 Add float dotproduct implementation for NEON
Signed-off-by: Stefan Weil <stefan.weil@bib.uni-mannheim.de>
2021-08-03 10:35:22 +02:00
Stefan Weil
66b77e6639 Prepare using float instead of double for LSTM calculations
The new header file ccutils/tesstypes.h also prepares support
for larger images by introducing a new data type for image
size and coordinates (still unused).

FloatToDouble is now a local function.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-24 13:59:37 +02:00
Stefan Weil
4df822a3fc Revert "Merge pull request #3330 from Sintun/master" (#3505)
This reverts commit 122daf1d64, reversing
changes made to 4cd56dc5f5.

Those changes caused two regressions which resulted in an assertion
or a segmentation fault.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-22 09:04:23 +03:00
Stefan Weil
f0fb6809e3 Use SIMD instructions for DotProductNative
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-14 19:13:01 +02:00
Stefan Weil
93348a83a3 Remove scripts for training
They were replaced by Python3 scripts (part of the tesstrain repository).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-18 10:47:44 +02:00
Stefan Weil
bf3421ff12 Fix autoconf build for latest MacOS (Intel and M1)
On latest MacOS 11.3 the system header file "ostream" includes a file
named "version".

The macro DEFAULT_INCLUDES adds the source root to the list of include
directories by default. As MacOS uses a case insensitive file system,
the compiler finds and includes the file "VERSION" there which causes
compiler errors and a failing build process.

Setting an empty DEFAULT_INCLUDES fixes that, but requires moving
config_auto.h to another directory in the include search path.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-13 17:22:58 +02:00
Stefan Weil
14505484c1 automake: Add build rule for fuzzer-api-512x256
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 10:44:43 +02:00
Stefan Weil
a74bbb6032 Remove bits16.h and BITS16 data type
Add also const attribute to some functions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
6ddceac538 Remove mfdefs.cpp from CMakeLists.txt and Makefile.am
That file was removed in commit 47715e576a.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:27:08 +02:00
Stefan Weil
0611c892b6 Disable more code with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-02 16:43:26 +02:00
Stefan Weil
3f0ac1185c Add new files ccstruct/image.cpp and ccstruct/image.h to Makefile
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-01 14:00:08 +02:00
Stefan Weil
2e349dbba5 Fix compilation for Tensorflow code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:19:06 +02:00
Stefan Weil
7d70ed4b41 Modernize code for OTSU and reduce public API further
Remove thresholder.h from the public API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 08:59:04 +01:00
Stefan Weil
7fdf79aff4 Move function ExtractFontName to baseapi.cpp
It is only used there, so now a local function.
This also allows removing blobclass.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Brechtken
d856acba56 Change License to Apache V2, add new file to Makefile.am, change file name to .h ending
2021-03-16 14:16:02 +01:00
Stefan Weil
e51fcb2d31 Remove last usage of STRING
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Egor Pugin
d7823a71c2 Remove unused file.
2021-03-15 09:47:04 +03:00
Stefan Weil
58304cbfdd Don't compile OpenCL code when OpenCL is disabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 15:40:23 +01:00