When textord_blockndoc_fixed was set to 1, empty rows caused a segmentation
fault. Also test textord_blockndoc_fixed first because it is typically 0.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
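The ordering of the checks in the textord_blockndoc_fixed fix can be sketched as follows. This is a minimal illustration, not the Tesseract code: `Row`, `try_fixed_pitch` and `block_fixed_flag` are hypothetical stand-ins. The cheap flag is tested first, and the empty-row check guards the access that would otherwise be undefined.

```cpp
#include <vector>

// Hypothetical stand-in for a text row; not a Tesseract type.
struct Row {
  int pitch;
};

// Hypothetical helper. The flag is tested first because it is usually
// false, so the common case exits after a single comparison. The empty()
// test must come before rows.front(), which is undefined for an empty
// vector.
bool try_fixed_pitch(bool block_fixed_flag, const std::vector<Row> &rows) {
  return block_fixed_flag && !rows.empty() && rows.front().pitch > 0;
}
```

Because `&&` short-circuits, `rows.front()` is never evaluated when the flag is off or the block has no rows.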
* Improve the DebugDump output by slightly adjusting the format widths of the numeric columns. The previous widths, 3,3,3,3, overflowed in our test runs and broke the table layout. See the rationale in the code comment:
------
// The largest (positive and negative) numbers are reported for lindent & rindent.
// While the column header has widths 5,4,4,5, it is therefore opportune to slightly
// offset the widths in the format string here to allow ample space for lindent & rindent
// while keeping the final table output nicely readable: 4,5,5,4.
* comment fix, pointed out by @stweil
Both 'can not' and 'cannot' are used in American English, but 'cannot' is more common
(also in Tesseract code), so use it always.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
gcc 13 moved some includes around and as a result <cstdint> is
no longer transitively included [1]. Explicitly include it for
int32_t.
[1] https://gcc.gnu.org/gcc-13/porting_to.html#header-dep-changes
Signed-off-by: Khem Raj <raj.khem@gmail.com>
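The <cstdint> fix amounts to including the header directly wherever a fixed-width type is used. A minimal sketch, where `clip_to_int32` is a hypothetical helper that is not part of Tesseract:

```cpp
#include <cstdint>  // declares int32_t; do not rely on other headers for it

// Hypothetical helper, used only to show a fixed-width type in a signature.
// Clamps a wide value into the range of a 32-bit signed integer.
std::int32_t clip_to_int32(long long value) {
  if (value < INT32_MIN) return INT32_MIN;
  if (value > INT32_MAX) return INT32_MAX;
  return static_cast<std::int32_t>(value);
}
```

With the explicit include, the code no longer depends on which standard headers happen to pull in <cstdint> for a given compiler release.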
Add pragmas which suppress this warning from gcc or clang:
src/ccutil/universalambigs.h:26:5: warning:
string literal of length 170929 exceeds maximum length 65536 that
C++ compilers are required to support [-Woverlength-strings]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
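One way to suppress -Woverlength-strings locally is sketched below. The placement and the `kLongData` placeholder are assumptions for illustration; the actual literal in universalambigs.h is not reproduced. Clang also accepts the `#pragma GCC diagnostic` form.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__GNUC__)  // also defined by clang
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverlength-strings"
#endif

// Placeholder standing in for a very long string literal.
static const char kLongData[] = "placeholder for a very long string literal";

#if defined(__GNUC__)
#pragma GCC diagnostic pop  // restore the previous warning state
#endif

// Hypothetical accessor so the data is used.
std::size_t long_data_length() { return std::strlen(kLongData); }
```

The push/pop pair limits the suppression to the one literal, so the warning stays active for the rest of the translation unit.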
UnicityTable did not provide the [] operator, so add it for this change.
Suggested-by: Egor Pugin <egor.pugin@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
mftraining crashed because unicharset_size in file
"inttemp" was written with 8 bytes instead of 4 bytes.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
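The 8-byte versus 4-byte mismatch is the kind of bug that appears when a `size_t` is serialized with its native width. A minimal sketch of writing a count with a fixed 4-byte width, assuming native byte order; `append_count32` and `read_count32` are hypothetical helpers and not the Tesseract serialization code:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: appends a count as exactly 4 bytes.
// Serializing the size_t directly would emit 8 bytes on typical 64-bit
// platforms, which a reader expecting 4 bytes would misinterpret.
void append_count32(std::vector<unsigned char> &out, std::size_t count) {
  const auto value = static_cast<std::uint32_t>(count);
  unsigned char bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  out.insert(out.end(), bytes, bytes + sizeof(value));
}

// Hypothetical helper: reads a 4-byte count back from the given offset.
std::uint32_t read_count32(const std::vector<unsigned char> &in,
                           std::size_t offset) {
  std::uint32_t value = 0;
  std::memcpy(&value, in.data() + offset, sizeof(value));
  return value;
}
```

Converting to `std::uint32_t` before writing keeps the on-disk width independent of the platform's `size_t`.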
mftraining crashed because the returned value was 1 instead of 0
for the first call of UnicityTable::push_back.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
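The push_back fix can be illustrated with a simplified container. This sketch assumes the return value is the index of the newly added element, which matches the expectation of 0 for the first call; `SimpleTable` is hypothetical and is not the real UnicityTable.

```cpp
#include <vector>

// Hypothetical simplified table; not the real UnicityTable.
template <typename T>
class SimpleTable {
 public:
  // Appends an element and returns its index, so the first call returns 0.
  int push_back(const T &value) {
    const int index = static_cast<int>(items_.size());  // size before growth
    items_.push_back(value);
    return index;
  }

  int size() const { return static_cast<int>(items_.size()); }

 private:
  std::vector<T> items_;
};
```

Returning the size after the insertion instead of before it gives the off-by-one result described above.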
The old code did not work correctly if FClass->font_set.size() was 0.
It created the FontSet fs with size 1 instead of 0.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This allows removing a reinterpret_cast and fixes a runtime error
with sanitizers:
runtime error: call to function
tesseract::MakePotentialClusters(tesseract::ClusteringContext*, tesseract::CLUSTER*, int)
through pointer to incorrect function type 'void (*)(...)'
Signed-off-by: Stefan Weil <sw@weilnetz.de>
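The sanitizer error goes away when the callback is declared with its exact signature instead of being cast to `void (*)(...)`. A minimal sketch under assumed names: `Context`, `Node`, `VisitFn`, `count_node` and `walk` are hypothetical and only mirror the shape of the reported call.

```cpp
// Hypothetical types standing in for the clustering context and node.
struct Context {
  int visited = 0;
};
struct Node {
  int value;
};

// The callback type names the real parameter list, so no cast is needed
// and the call goes through a pointer of the correct function type.
using VisitFn = void (*)(Context *, Node *, int);

// Hypothetical callback: counts the nodes it is called for.
void count_node(Context *ctx, Node *, int) { ++ctx->visited; }

// Hypothetical traversal: calls the callback once per node.
void walk(Node *nodes, int n, Context *ctx, VisitFn visit) {
  for (int i = 0; i < n; ++i) {
    visit(ctx, &nodes[i], i);
  }
}
```

Calling a function through a pointer of a different function type is undefined behavior in C++, which is what the sanitizer reports.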
The old code did not work with compiler option `-fsanitize=address,undefined`
and caused apiexample_test to run forever with this error message:
Running main() from unittest/third_party/googletest/googletest/src/gtest_main.cc
[==========] Running 4 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from EuroText
[ RUN ] EuroText.FastLatinOCR
/usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/debug/safe_iterator.h:608:
In function:
_Safe_iterator<type-parameter-0-0, type-parameter-0-1,
std::bidirectional_iterator_tag>
&__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<tesseract::ObjectCache<tesseract::Dawg>::ReferenceCount
*,
std::__cxx1998::vector<tesseract::ObjectCache<tesseract::Dawg>::ReferenceCount,
std::allocator<tesseract::ObjectCache<tesseract::Dawg>::ReferenceCount>>>,
[...]
That error message was followed by an endless sequence of newlines.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Fixes: cac116dd11 ("Replace more PointerVector by std::vector [...]")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Coverity Scan reports "Unnecessary object copies can affect performance"
and suggests using the auto keyword with an &.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
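The effect of the Coverity suggestion can be shown with a type that counts its copies. This is a minimal sketch; `Tracked`, `sum_by_value` and `sum_by_reference` are hypothetical and not Tesseract code.

```cpp
#include <vector>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
  static int copies;
  int value = 0;
  Tracked() = default;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked &other) : value(other.value) { ++copies; }
  Tracked &operator=(const Tracked &) = default;
};
int Tracked::copies = 0;

// `auto item` copies every element on each iteration.
int sum_by_value(const std::vector<Tracked> &items) {
  int sum = 0;
  for (auto item : items) sum += item.value;
  return sum;
}

// `const auto &item` binds a reference, so no element is copied.
int sum_by_reference(const std::vector<Tracked> &items) {
  int sum = 0;
  for (const auto &item : items) sum += item.value;
  return sum;
}
```

Both loops compute the same sum; only the by-value loop increments `Tracked::copies`, once per element.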