Test: convert riscv-v-spec-1.0.pdf into 111 PNG images,
run OCR on each image in sequence,
and measure the total run time on banana_f3:
old: 31m16.267s
new: 16m51.155s
Co-authored-by: sunyuechi <sunyuechi@iscas.ac.cn>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
libstdc++-6.dll and libgcc_s_seh-1.dll must be taken from the compiler
directory, not from the pacman DLLs.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The build process needs the packages curl, python3-venv and unzip
which are missing in the Docker image for Ubuntu.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The pattern for the training tools *.exe also includes tesseract.exe,
so it must be excluded explicitly.
Also add a macro BINDIR, which simplifies the NSIS rules.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
- Add manual pages in HTML format and helper for Tesseract command line
- Don't remove the installation directory recursively
- Add GitHub action for Tesseract installer for Windows
- Add docbook-xml to required packages (needed for doc)
- Use Unicode for NSIS installer
- Optionally sign executables
- Add more file properties to installer
- Update configuration for use with pacman
- Build Windows installer only for 64 bit Windows
Signed-off-by: Stefan Weil <sw@weilnetz.de>
- Add a couple of languages that are available for the Linux version
- Add new section for script data
- Get data from tessdata_fast
The data files are now in the "script" subdirectory.
- Update list of scripts and languages
- Update path for script trained data
- Add data for Han Simplified vertical script
- Fix names of tessdata (jpn_vert, kmr)
- Fix some path names for 64 bit version
- Remove testing files from installation
Those files were moved from tesseract.git to test.git.
- Don't enforce admin mode, but use highest available
- Don't use a checkbox for the license
- Remove unused code for registry settings (PATH, TESSDATA)
- Don't show README.md (did not work)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
- Move NSIS installer file to new location
- Support cross builds with NSIS
- Clean nsis configuration
- Fix typos in nsis configuration
- Add jar files needed for ScrollView.jar
- Move ScrollView.jar to a new section
- Add missing configurations to tessdata
- Registry settings are now disabled (problems with long PATH)
- Add menu sections for all languages
- Simplify language downloads
- Tune and improve nsis configuration
- Add sizes for language data
- Add missing translations to nsis configuration
- Don't show details in installer by default
- Initial code for 64 bit Tesseract installer
- Fix uninstall for TESSDATA_PREFIX registry key
- Remove cube code
- nsis: Add all training executables
- nsis: Disable registry settings
Trying to add to PATH fails if the old PATH is very long and
will result in an empty PATH.
Remove these settings, as they were already disabled by default
and neither of them is needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Commit db52047420 added the filename conversion for the hOCR renderer,
but it was removed later for TSV in commit 6700edd8bc.
Tesseract does not use a filename conversion anywhere else, so remove it
for the other renderers, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
As this project follows Semantic Versioning, the shared object
version should match these semantics.
The two options that make sense here are to have the soversion
set to the version major (so only breaking changes are tracked)
or to set it to version major and minor (so breaking and API additions
are tracked).
Since the Windows version of the library already uses version major
and version minor, let's just do this universally.
Fixes: 832926f5af ("Update library version handling for cmake")
Signed-off-by: Neal Gompa <neal@gompa.dev>
This fixes five performance issues reported by Codacy:
%u in format string (no. 2) requires 'unsigned int' but the argument type is 'signed int'.
%u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'.
%d in format string (no. 1) requires 'int' but the argument type is 'unsigned int'.
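
For illustration only, the kind of mismatch reported above can be sketched
as follows. The function and parameter names are hypothetical and are not
taken from the Tesseract sources; the point is merely that each conversion
specifier should match the signedness of its argument.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not Tesseract code): formats a signed index and an
// unsigned total. A mismatched version would use "%u" for `index` or "%d"
// for `total`, which is what the warnings above describe.
std::string format_counts(int index, unsigned int total) {
  char buf[64];
  // %d matches the 'int' argument, %u matches the 'unsigned int' argument.
  std::snprintf(buf, sizeof(buf), "%d of %u", index, total);
  return std::string(buf);
}
```

An alternative to changing the specifier is to change the variable's type
or to cast the argument explicitly; which one is appropriate depends on
the value range of the variable in question.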
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes several performance issues reported by Coverity:
Variable 'master_trainer_' is assigned in constructor body.
Consider performing initialization in initialization list.
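
A minimal sketch of the suggested pattern, assuming a simplified class:
the class name is hypothetical, and std::string stands in for the real
type of 'master_trainer_', which is not shown in this message.

```cpp
#include <string>

// Hypothetical class (not Tesseract code) with a member named like the one
// in the report. std::string is only a stand-in for the real member type.
class TrainerHolder {
public:
  // The member is constructed directly from `name` in the initialization
  // list. Assigning it in the constructor body instead would first
  // default-construct the member and then overwrite it.
  explicit TrainerHolder(const std::string &name) : master_trainer_(name) {}

  const std::string &master_trainer() const { return master_trainer_; }

private:
  std::string master_trainer_;
};
```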
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes two performance issues reported by Codacy:
Ineffective call of function 'substr' because a prefix of the string
is assigned to itself. Use resize() or pop_back() instead.
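
As a sketch of the suggested replacement (the helper below is hypothetical
and not part of Tesseract): instead of assigning a prefix of a string to
itself via substr(), the string can be truncated in place with resize().

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not Tesseract code): removes the last `n` characters
// of `s` in place.
void drop_suffix(std::string &s, std::size_t n) {
  // Flagged pattern: s = s.substr(0, s.size() - n);
  // That creates a temporary copy of the prefix and assigns it back.
  if (n > s.size()) {
    n = s.size();  // Clamp so that the new size cannot underflow.
  }
  // resize() only shortens the existing string, without a temporary.
  s.resize(s.size() - n);
}
```

For removing exactly one trailing character from a non-empty string,
pop_back() is the shorter equivalent.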
Signed-off-by: Stefan Weil <sw@weilnetz.de>