Commit Graph

6481 Commits

Author SHA1 Message Date
Zdenko Podobný
9d71da7854 Merge branch 'main' of https://github.com/tesseract-ocr/tesseract 2023-02-10 12:13:26 +01:00
zdenop
392e56cd87
Update cmake.yml
libarchive is broken on macos: https://github.com/libarchive/libarchive/issues/1819
2023-02-10 12:12:38 +01:00
Zdenko Podobný
9bac701d5e cmake: fix gcc-7 fatal error: filesystem: No such file or directory 2023-02-10 09:51:59 +01:00
Stefan Weil
f1e3697dd4 Fix some typos in comments (found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-02-08 20:51:58 +01:00
Egor Pugin
0221094275
Merge pull request #4015 from stweil/spelling
Replace 'can not' by 'cannot'
2023-02-08 22:02:06 +03:00
Stefan Weil
1e04be842d Replace 'can not' by 'cannot'
Both forms are used in American English, but 'cannot' is more common
(also in Tesseract code), so use it always.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-02-08 17:34:22 +01:00
zdenop
7becbbd627
Update cmake-win64.yml 2023-02-07 15:11:00 +01:00
Egor Pugin
efa89c6dfa
Merge pull request #4013 from ferdnyc/patch-1
Fix libdir in tesseract.pc from CMake
2023-02-03 14:23:43 +03:00
Frank Dana
5e116fa5ca
Fix libdir in tesseract.pc from CMake
tesseract.pc.cmake was hardcoding libdir to
`{prefix}/lib`, which is wrong for systems that use
`/usr/lib64/` on 64-bit. `CMAKE_INSTALL_LIBDIR`
is already expected to contain the libdir path
relative to the install prefix.
2023-02-02 19:57:59 -05:00
autoantwort
1c09782354
msvc debug: fix wrong lib name in generated pkgconfig file (#4008) 2023-01-31 15:30:45 +01:00
Egor Pugin
e3fb0c657d
Merge pull request #4009 from kraj/gcc13
Fix build with gcc 13 by including <cstdint>
2023-01-30 23:11:06 +03:00
Khem Raj
2025b53de6 Fix build with gcc 13 by including <cstdint>
gcc 13 moved some includes around and as a result <cstdint> is
no longer transitively included [1]. Explicitly include it for
int32_t.

[1] https://gcc.gnu.org/gcc-13/porting_to.html#header-dep-changes

Signed-off-by: Khem Raj <raj.khem@gmail.com>
2023-01-30 11:28:24 -08:00
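The commit above describes including `<cstdint>` explicitly wherever fixed-width integer types are used. A minimal sketch of that pattern follows; the helper `average_int32` is hypothetical and is not taken from the Tesseract sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>  // explicit include: do not rely on other headers for int32_t

// Hypothetical helper for illustration only (not Tesseract code).
// Averages 32-bit samples with a 64-bit accumulator to avoid overflow.
inline std::int32_t average_int32(const std::int32_t* values, std::size_t count) {
  assert(values != nullptr && count > 0);
  std::int64_t sum = 0;
  for (std::size_t i = 0; i < count; ++i) {
    sum += values[i];
  }
  return static_cast<std::int32_t>(sum / static_cast<std::int64_t>(count));
}
```

Without the explicit include, code like this may compile only when some other standard header happens to pull in `<cstdint>`, which is the dependency the gcc 13 porting note says can no longer be assumed.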
zdenop
a3b9acfa4a
Merge pull request #4006 from autoantwort/fix-linkage
Fix linkage of icu and pango
2023-01-28 15:41:48 +01:00
Leander Schulten
680d1e231c Fix linkage of icu and pango 2023-01-28 04:19:45 +01:00
Stefan Weil
3bedea1bdd Fix FP division by zero in LanguageModel::ExtractFeaturesFromPath (issue #3995)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-01-20 16:45:09 +01:00
Stefan Weil
1852afe9f8 Remove unneeded type cast in LanguageModel::ExtractFeaturesFromPath
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-01-20 16:45:09 +01:00
Stefan Weil
4142b32815 Fix some whitespace issues in source code and text files
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-01-19 18:31:06 +01:00
Egor Pugin
2a7ed8b6a5
Merge pull request #3992 from seupedro/patch-1
Update README.md
2023-01-14 22:46:00 +03:00
Seu Pedro
1851e5a1c4
Update README.md
Added a link explaining what an OCR Engine is
2023-01-14 15:37:22 -03:00
zdenop
0ef192050a fix "cannot pass non-trivial object of type 'std::string'" 2023-01-08 19:13:48 +01:00
zdenop
804b63646f show out filename on successful created of traineddata (combine_lang_model) 2023-01-08 18:30:31 +01:00
zdenop
005bfe4950 fix "cannot pass non-trivial object of type 'std::string'" 2023-01-06 18:34:16 +01:00
zdenop
8a26329623 unicharset_extractor:
- run ReadMemBoxes only for box files
- do not write unicharset in case of broken box file
2023-01-06 15:52:42 +01:00
Amit D
da3737d371
Update issue-bug.yml 2023-01-05 11:03:20 +02:00
Amit D
a0f06e20b4
Update issue-bug.yml 2023-01-05 10:45:22 +02:00
Amit D
65b8a3b019
Update issue-bug.yml 2023-01-05 10:37:15 +02:00
Amit D
dbedbad20d
Update issue-bug.yml 2022-12-27 18:36:06 +02:00
Amit D
95e84735d5
Update issue-bug.yml 2022-12-27 11:52:28 +02:00
Amit D
98837f83a9
Update issue-bug.yml 2022-12-27 11:40:07 +02:00
Amit D
b76b5be65c
Create an issue template for a feature request 2022-12-25 20:51:23 +02:00
Amit D
ce0ed917f6
Create a new issue template 2022-12-25 20:32:57 +02:00
Stefan Weil
080da83cc5 Create new release 5.3.0
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-22 14:57:57 +01:00
Amit D
7f5345d207
Update README.md
'Promote' @stweil ... :-)
2022-12-19 10:14:32 +02:00
Zdenko Podobný
f25196151b cmake - msvc/openmp: clean&document configuration 2022-12-15 13:26:56 +01:00
Zdenko Podobný
f2f37a8323 cmake - mscvc: silent warning C4068: unknown pragma 'GCC' 2022-12-15 13:25:43 +01:00
Stefan Weil
86a7bc6c06 Create new release 5.3.0-rc1
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 17:37:35 +01:00
Stefan Weil
6e4de524d0 Replace MacOS -> macOS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 17:37:35 +01:00
Stefan Weil
6a21a74ecf Suppress compiler warning caused by very long string
Add pragmas which suppress this warning from gcc or clang:

    src/ccutil/universalambigs.h:26:5: warning:
     string literal of length 170929 exceeds maximum length 65536 that
     C++ compilers are required to support [-Woverlength-strings]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 13:34:01 +01:00
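The commit above mentions pragmas that silence `-Woverlength-strings` for gcc and clang. The following is a sketch of how such a suppression can be scoped to a single literal; the array `kEmbeddedData` and the function `embedded_data_size` are hypothetical stand-ins, and the actual change in `universalambigs.h` may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Suppress the warning only around the long literal. clang also honors
// "#pragma GCC diagnostic" and defines __GNUC__, so one guard covers both.
#if defined(__GNUC__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Woverlength-strings"
#endif

// Hypothetical stand-in for a very long embedded data string.
static const char kEmbeddedData[] =
    "line one\n"
    "line two\n";

#if defined(__GNUC__)
#  pragma GCC diagnostic pop
#endif

inline std::size_t embedded_data_size() {
  const std::size_t size = sizeof(kEmbeddedData) - 1;  // exclude the terminator
  assert(size == std::strlen(kEmbeddedData));          // no embedded NUL bytes
  return size;
}
```

The push/pop pair restores the previous diagnostic state, so the warning stays active for the rest of the translation unit.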
Stefan Weil
369b811c99 Replace at accessor by [] operator in function Classify::CreateIntTemplates
UnicityTable did not provide the [] operator, so add it for this change.

Suggested-by: Egor Pugin <egor.pugin@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
a806d21883 Fix function ReadTrainingSamples (issue #3925)
This fixes duplicate delete when running cntraining.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
23138ab88a Fix function Classify::WriteIntTemplates (issue #3925)
It crashed when running mftraining because unicharset_size in file
"inttemp" was written with 8 bytes instead of 4 bytes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
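The commit above concerns a count that was written with 8 bytes where the file format expects 4. As a sketch of the general technique (not the actual Tesseract fix), a count held in a `size_t` can be narrowed to a fixed-width type before it is written. The functions `write_count_u32` and `read_count_u32` are hypothetical, and the sketch uses streams and native byte order for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical sketch: write a count with a fixed on-disk width of 4 bytes,
// independent of sizeof(size_t) on the build platform.
inline void write_count_u32(std::ostream& out, std::size_t count) {
  assert(count <= std::numeric_limits<std::uint32_t>::max());
  const std::uint32_t narrowed = static_cast<std::uint32_t>(count);
  // sizeof(narrowed) is always 4; sizeof(count) is 8 on typical 64-bit
  // platforms, which would shift every field that follows in the file.
  out.write(reinterpret_cast<const char*>(&narrowed), sizeof(narrowed));
}

// Reads back a count written by write_count_u32 (native byte order).
inline std::uint32_t read_count_u32(std::istream& in) {
  std::uint32_t value = 0;
  in.read(reinterpret_cast<char*>(&value), sizeof(value));
  return value;
}
```

Writing through an explicitly sized variable makes the on-disk width visible at the call site instead of depending on the type of whatever expression holds the count.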
Stefan Weil
4fa046b1b3 Fix function tesseract::write_set (issue #3925)
It crashed when running mftraining with fs.size() == 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
1fd8f8165f Fix function UnicityTable::push_back (issue #3925)
mftraining crashed because the returned value was 1 instead of 0
for the first call of UnicityTable::push_back.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
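The commit above says the first call of `UnicityTable::push_back` returned 1 instead of 0. One plausible way such an off-by-one arises is returning the container size after the append rather than the index of the new element. The sketch below illustrates that assumption with a hypothetical `IndexedTable` class; it is not Tesseract's `UnicityTable`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for illustration; not Tesseract's UnicityTable.
template <typename T>
class IndexedTable {
 public:
  // Appends a copy of `object` and returns the index where it was stored.
  int push_back(const T& object) {
    const int index = static_cast<int>(items_.size());  // index before growth
    items_.push_back(object);
    assert(items_[index] == object);
    return index;  // returning items_.size() here would be off by one
  }

  int size() const { return static_cast<int>(items_.size()); }

 private:
  std::vector<T> items_;
};
```

Capturing the index before the append keeps the return value equal to the position of the new element, so the first call yields 0.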
Stefan Weil
1d3b410968 Fix function ComputeChiSquared (issue #3925)
mftraining crashed if the search did not find anything.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
5591bc04ef Remove assertion in function NewSimpleProto (issue #3925)
It was triggered by mftraining.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
f969ba9161 Fix function Classify::CreateIntTemplates (issue #3925)
The old code did not work correctly if FClass->font_set.size() was 0.
It created the FontSet fs with size 1 instead of 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
6b7cb1cbc6 Add missing serialization to FILE for vector of pointers (issue #3925)
It is required for mftraining which otherwise writes a wrong shapetable.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
90c09a3df3 Replace void_proc by kdwald_proc with correct arguments
This allows removing a reinterpret_cast and fixes a runtime error
with sanitizers:

runtime error: call to function
tesseract::MakePotentialClusters(tesseract::ClusteringContext*, tesseract::CLUSTER*, int)
through pointer to incorrect function type 'void (*)(...)'

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
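The runtime error quoted in the commit above comes from calling a function through a pointer whose type does not match the function's real signature. A sketch of the alternative, a callback type that spells out the exact arguments, is shown below. All names here (`Context`, `Node`, `WalkCallback`, `walk`, `count_visit`) are hypothetical and only mirror the shape of the problem, not the Tesseract code.

```cpp
#include <cassert>

// Hypothetical types and names for illustration; not the Tesseract code.
struct Context { int visited = 0; };
struct Node { int value; Node* left; Node* right; };

// The callback type states the exact signature of the functions that will
// be passed in, so no reinterpret_cast is needed at the call site.
using WalkCallback = void (*)(Context* context, Node* node, int level);

// Pre-order traversal that invokes `action` on every node.
inline void walk(Node* node, Context* context, WalkCallback action, int level = 0) {
  assert(action != nullptr);
  if (node == nullptr) return;
  action(context, node, level);
  walk(node->left, context, action, level + 1);
  walk(node->right, context, action, level + 1);
}

// Example callback with a signature that matches WalkCallback exactly.
inline void count_visit(Context* context, Node* /*node*/, int /*level*/) {
  ++context->visited;
}
```

With a matching pointer type the compiler checks every callback at the point where it is passed, and sanitizers that verify indirect call types have nothing to report.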
Zdenko Podobný
04551ce2a6 clang-format: use default value for line width (80) 2022-12-12 16:55:34 +01:00
Egor Pugin
0680ba870e
Merge pull request #3978 from stweil/sanfix
Modernize function ObjectCache::DeleteUnusedObjects (fix issue with s…
2022-12-12 17:56:42 +03:00