6317 Commits

Author SHA1 Message Date
Stefan Weil
3bedea1bdd Fix FP division by zero in LanguageModel::ExtractFeaturesFromPath (issue #3995)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-01-20 16:45:09 +01:00
Stefan Weil
1852afe9f8 Remove unneeded type cast in LanguageModel::ExtractFeaturesFromPath
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-01-20 16:45:09 +01:00
Stefan Weil
4142b32815 Fix some whitespace issues in source code and text files
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-01-19 18:31:06 +01:00
Egor Pugin
2a7ed8b6a5
Merge pull request #3992 from seupedro/patch-1
Update README.md
2023-01-14 22:46:00 +03:00
Seu Pedro
1851e5a1c4
Update README.md
Added a link explaining what an OCR Engine is
2023-01-14 15:37:22 -03:00
zdenop
0ef192050a fix "cannot pass non-trivial object of type 'std::string'" 2023-01-08 19:13:48 +01:00
zdenop
804b63646f show out filename on successful creation of traineddata (combine_lang_model) 2023-01-08 18:30:31 +01:00
zdenop
005bfe4950 fix "cannot pass non-trivial object of type 'std::string'" 2023-01-06 18:34:16 +01:00
zdenop
8a26329623 unicharset_extractor:
- run ReadMemBoxes only for box files
- do not write unicharset in case of broken box file
2023-01-06 15:52:42 +01:00
Amit D
da3737d371
Update issue-bug.yml 2023-01-05 11:03:20 +02:00
Amit D
a0f06e20b4
Update issue-bug.yml 2023-01-05 10:45:22 +02:00
Amit D
65b8a3b019
Update issue-bug.yml 2023-01-05 10:37:15 +02:00
Amit D
dbedbad20d
Update issue-bug.yml 2022-12-27 18:36:06 +02:00
Amit D
95e84735d5
Update issue-bug.yml 2022-12-27 11:52:28 +02:00
Amit D
98837f83a9
Update issue-bug.yml 2022-12-27 11:40:07 +02:00
Amit D
b76b5be65c
Create an issue template for a feature request 2022-12-25 20:51:23 +02:00
Amit D
ce0ed917f6
Create a new issue template 2022-12-25 20:32:57 +02:00
Stefan Weil
080da83cc5 Create new release 5.3.0
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-22 14:57:57 +01:00
Amit D
7f5345d207
Update README.md
'Promote' @stweil ... :-)
2022-12-19 10:14:32 +02:00
Zdenko Podobný
f25196151b cmake - msvc/openmp: clean&document configuration 2022-12-15 13:26:56 +01:00
Zdenko Podobný
f2f37a8323 cmake - msvc: silence warning C4068: unknown pragma 'GCC' 2022-12-15 13:25:43 +01:00
Stefan Weil
86a7bc6c06 Create new release 5.3.0-rc1
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 17:37:35 +01:00
Stefan Weil
6e4de524d0 Replace MacOS -> macOS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 17:37:35 +01:00
Stefan Weil
6a21a74ecf Suppress compiler warning caused by very long string
Add pragmas which suppress this warning from gcc or clang:

    src/ccutil/universalambigs.h:26:5: warning:
     string literal of length 170929 exceeds maximum length 65536 that
     C++ compilers are required to support [-Woverlength-strings]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 13:34:01 +01:00
Stefan Weil
369b811c99 Replace at accessor by [] operator in function Classify::CreateIntTemplates
UnicityTable did not provide the [] operator, so add it for this change.

Suggested-by: Egor Pugin <egor.pugin@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
a806d21883 Fix function ReadTrainingSamples (issue #3925)
This fixes duplicate delete when running cntraining.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
23138ab88a Fix function Classify::WriteIntTemplates (issue #3925)
It crashed when running mftraining because unicharset_size in file
"inttemp" was written with 8 bytes instead of 4 bytes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
4fa046b1b3 Fix function tesseract::write_set (issue #3925)
It crashed when running mftraining with fs.size() == 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
1fd8f8165f Fix function UnicityTable::push_back (issue #3925)
mftraining crashed because the returned value was 1 instead of 0
for the first call of UnicityTable::push_back.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
1d3b410968 Fix function ComputeChiSquared (issue #3925)
mftraining crashed if the search did not find anything.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
5591bc04ef Remove assertion in function NewSimpleProto (issue #3925)
It was triggered by mftraining.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
f969ba9161 Fix function Classify::CreateIntTemplates (issue #3925)
The old code did not work correctly if FClass->font_set.size() was 0.
It created the FontSet fs with size 1 instead of 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
6b7cb1cbc6 Add missing serialization to FILE for vector of pointers (issue #3925)
It is required for mftraining which otherwise writes a wrong shapetable.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Stefan Weil
90c09a3df3 Replace void_proc by kdwald_proc with correct arguments
This allows removing a reinterpret_cast and fixes a runtime error
with sanitizers:

runtime error: call to function
tesseract::MakePotentialClusters(tesseract::ClusteringContext*, tesseract::CLUSTER*, int)
through pointer to incorrect function type 'void (*)(...)'

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-13 08:04:50 +01:00
Zdenko Podobný
04551ce2a6 clang-format: use default value for line width (80) 2022-12-12 16:55:34 +01:00
Egor Pugin
0680ba870e
Merge pull request #3978 from stweil/sanfix
Modernize function ObjectCache::DeleteUnusedObjects (fix issue with s…
2022-12-12 17:56:42 +03:00
Stefan Weil
8c34b0de62 Modernize function ObjectCache::DeleteUnusedObjects (fix issue with sanitizers)
The old code did not work with compiler option `-fsanitize=address,undefined`
and caused apiexample_test to run forever with this error message:

Running main() from unittest/third_party/googletest/googletest/src/gtest_main.cc
[==========] Running 4 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from EuroText
[ RUN      ] EuroText.FastLatinOCR
/usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/debug/safe_iterator.h:608:
In function:
    _Safe_iterator<type-parameter-0-0, type-parameter-0-1,
    std::bidirectional_iterator_tag>
    &__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<tesseract::ObjectCache<tesseract::Dawg>::ReferenceCount
    *,
    std::__cxx1998::vector<tesseract::ObjectCache<tesseract::Dawg>::ReferenceCount,
    std::allocator<tesseract::ObjectCache<tesseract::Dawg>::ReferenceCount>>>,
[...]

That error message was followed by an endless sequence of newlines.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-12 15:00:18 +01:00
zdenop
b37de16633 Revert "fix: index variable in OpenMP 'for' statement must have signed integral type"
This reverts commit bc7a7eea2f.
2022-12-11 19:49:54 +01:00
zdenop
d89ff4667b reformat code (files with tabs) 2022-12-10 20:33:35 +01:00
zdenop
f77c63d446 report missing or empty box file 2022-12-10 19:28:17 +01:00
zdenop
4ebaa4bffb GA: use png 1.6.39 from cmake-win64 2022-12-08 20:04:10 +01:00
zdenop
b7319c26f9 Merge branch 'main' of https://github.com/tesseract-ocr/tesseract 2022-12-04 18:56:40 +01:00
zdenop
bc7a7eea2f fix: index variable in OpenMP 'for' statement must have signed integral type 2022-12-04 18:56:30 +01:00
zdenop
51cf430899 fix typo (missing space) 2022-12-04 18:49:56 +01:00
Stefan Weil
a5292214b8
Fix function tesseract::WriteFeature (issue #3925) (#3972)
Fixes: 3b0759940c ("Replace more STRING by std::string")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-12-04 01:01:20 +02:00
zdenop
c1a1d7e00c
Update cmake-win64.yml
start scheduling cmake-win64 GA
2022-11-30 15:43:52 +01:00
zdenop
cdf6b601ce
Update cmake-win64.yml 2022-11-30 14:37:32 +01:00
zdenop
9cd5012e89
Update cmake-win64.yml
remove unused features in GA test
2022-11-30 14:36:48 +01:00
Zdenko Podobný
7e51f0bac5 GA cmake-win64: uninstall strawberryperl to fix libtiff build 2022-11-30 11:34:10 +01:00
Zdenko Podobný
ac8ff2eae9 GA cmake-win64: fix getting version info 2022-11-30 10:42:39 +01:00