Remove debug output and fix an out-of-bounds read for unsupported arguments.
Fixes: e8a9a56f9f ("Support symbolic values for --oem and --psm options")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
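The commit text does not show the affected code. As a minimal sketch of one way to avoid an out-of-bounds read when a symbolic argument is not supported, assuming the names live in a fixed lookup table (the table contents and the `parse_symbolic` helper are hypothetical and not taken from the Tesseract sources):

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical table of symbolic option names and their numeric values.
struct SymbolicValue {
  const char *name;
  int value;
};

static const SymbolicValue kValues[] = {
    {"first", 0},
    {"second", 1},
    {"third", 2},
};
static const std::size_t kNumValues = sizeof(kValues) / sizeof(kValues[0]);

// Returns the numeric value for a symbolic name, or -1 if it is unsupported.
// The loop is bounded by the table size, so an unknown name never causes a
// read past the end of kValues.
int parse_symbolic(const char *arg) {
  for (std::size_t i = 0; i < kNumValues; ++i) {
    if (std::strcmp(arg, kValues[i].name) == 0) {
      return kValues[i].value;
    }
  }
  return -1;  // unsupported argument: report it instead of reading further
}
```

A caller would check for -1 and print a usage error rather than use the result as an index.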
Benchmark: convert riscv-v-spec-1.0.pdf into 111 PNG images,
run OCR on each image in sequence,
and measure the total run time on banana_f3:
old: 31m16.267s
new: 16m51.155s (about 1.86x faster)
Co-authored-by: sunyuechi <sunyuechi@iscas.ac.cn>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
This fixes several performance issues reported by Coverity:
Variable 'master_trainer_' is assigned in constructor body.
Consider performing initialization in initialization list.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
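A minimal sketch of the pattern Coverity flags and the suggested fix. The classes below are hypothetical stand-ins (only the member name `master_trainer_` comes from the report), and the member type is an assumption:

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the real trainer type.
struct MasterTrainer {
  explicit MasterTrainer(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Before: the member is default-constructed first and then assigned
// in the constructor body.
class TrainerBefore {
 public:
  TrainerBefore() { master_trainer_ = std::make_unique<MasterTrainer>("demo"); }
  const MasterTrainer *trainer() const { return master_trainer_.get(); }

 private:
  std::unique_ptr<MasterTrainer> master_trainer_;
};

// After: the member is constructed once, directly in the initializer list.
class TrainerAfter {
 public:
  TrainerAfter() : master_trainer_(std::make_unique<MasterTrainer>("demo")) {}
  const MasterTrainer *trainer() const { return master_trainer_.get(); }

 private:
  std::unique_ptr<MasterTrainer> master_trainer_;
};
```

Both variants end up in the same state. The second skips the redundant default construction followed by assignment.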
Add PAGE XML export and documentation.
To generate PAGE XML output, add 'page' to the tesseract command line.
The output file is named outputname + '.page.xml' to avoid conflicts with the ALTO export.
The output can be customized with the flags
tessedit_create_page_polygon and tessedit_create_page_wordlevel.
Co-authored-by: Stefan Weil <sw@weilnetz.de>
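The naming rule can be sketched as follows. Both helper functions are hypothetical, and the '.xml' suffix for ALTO output is an assumption that the commit text does not state:

```cpp
#include <string>

// Hypothetical helper: PAGE XML output name as described in the commit text.
std::string page_output_name(const std::string &outputname) {
  return outputname + ".page.xml";
}

// Hypothetical helper. Assumption: ALTO export writes outputname + ".xml".
std::string alto_output_name(const std::string &outputname) {
  return outputname + ".xml";
}
```

Under that assumption, running both exports with the same output base produces two distinct files, so neither overwrites the other.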