The internal range is 0...(n-1), but for users a page range 1...n is
more natural. Showing a range 0...n is wrong because it would imply
n+1 pages.
Change printed text from
Loaded 72/72 pages (0-72) of document ...
to
Loaded 72/72 pages (1-72) of document ...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang++ report:
api/baseapi.h:852:4: warning:
extra ';' after member function definition [-Wextra-semi]
[...]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
gcc report:
ccstruct/blamer.cpp:343:65: warning:
'truth_x' may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Coverity report:
CID 1164730 (#1 of 1): Resource leak (RESOURCE_LEAK)
4. leaked_storage: Variable lines going out of scope leaks the storage it points to.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes clang compiler warnings like this one:
wordrec/gradechop.cpp:52:3: warning:
'register' storage class specifier is deprecated [-Wdeprecated-register]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
gcc reports these warnings with -Wextra:
ccstruct/pageres.h:330:3: warning:
base class 'class ELIST_LINK' should be explicitly initialized
in the copy constructor [-Wextra]
ccstruct/ratngs.cpp:115:1: warning:
base class 'class ELIST_LINK' should be explicitly initialized
in the copy constructor [-Wextra]
ccstruct/ratngs.h:291:3: warning:
base class 'class ELIST_LINK' should be explicitly initialized
in the copy constructor [-Wextra]
ccutil/genericvector.h:435:3: warning:
base class 'class GenericVector<WERD_RES*>' should be explicitly initialized
in the copy constructor [-Wextra]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
gcc reports a potential bad array access:
ccstruct/mod128.cpp:98:20: warning:
array subscript has type 'char' [-Wchar-subscripts]
dir is of type 'char'. Most compilers use signed char by default.
Then the value of dir is in the range -128 ... 127 and cannot be
used to access an array with 256 elements.
Don't fix that but disable the buggy code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings like this one:
api/baseapi.h:739:32: warning:
type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This is what was required to get 'make-dist' to work. I left autogen alone
since it works, albeit with an error message. My practice packages appear
to work fine.
Font recognition was poor, due to forcing a 1st and 2nd choice at
a character level, when the total score for the correct font is often
correct at the word level, so allowed the propagation of a full set
of fonts and scores to the word recognizer, which can now decide word
level fonts using the scores instead of simple votes.
Change precipitated a cleanup of output data structures for classifier
results, eliminating ScoredClass and INT_RESULT_STRUCT, with a few
extra elements going in UnicharRating, and using that wherever possible.
That added the extra complexity of 1-rating due to a flip between 0 is
good and 0 is bad for the internal classifier scores before they are
converted to rating and certainty.
Tha, Vie, Kan, Tel etc.
There is a new overlap detector that detects when diacritics
cause a big increase in textline overlap. In such cases, diacritics from
overlap regions are kept separate from layout analysis completely, allowing
textline formation to happen without them. The diacritics are then assigned
to 0, 1 or 2 close words at the end of layout analysis, using and modifying
an old noise detection data path.
The stored diacritics are used or not during recognition according to the
character classifier's liking for them.