6467 Commits

Author SHA1 Message Date
Stefan Weil
d0d43dfbce Update NSIS installer
- Add manual pages in HTML format and helper for Tesseract command line
- Don't remove the installation directory recursively
- Add GitHub action for Tesseract installer for Windows
- Add docbook-xml to required packages (needed for doc)
- Use unicode for NSIS installer
- Optionally sign executables
- Add more file properties to installer
- Update configuration for use with pacman
- Build Windows installer only for 64 bit Windows

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-11-02 07:00:33 +01:00
Regina Retter
b7c5996248 Update installer for Windows
- Added a couple of languages that are available for the Linux version
- Add new section for script data
- Get data from tessdata_fast
  The data files are now in the "script" subdirectory.
- Update list of scripts and languages
- Update path for script trained data
- Add data for Han Simplified vertical script
- Fix names of tessdata (jpn_vert, kmr)
- Fix some path names for 64 bit version
- Remove testing files from installation
  Those files were moved from tesseract.git to test.git.
- Don't enforce admin mode, but use highest available
- Don't use a checkbox for the license
- Remove unused code for registry settings (PATH, TESSDATA)
- Don't show README.md (did not work)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-11-02 07:00:33 +01:00
Stefan Weil
c886e3b639 Update NSIS configuration
- Move NSIS installer file to new location
- Support cross builds with NSIS
- Clean nsis configuration
- Fix typos in nsis configuration
- Add jar files needed for ScrollView.jar
- Move ScrollView.jar to a new section
- Add missing configurations to tessdata
- Registry settings are now disabled (problems with long PATH)
- Add menu sections for all languages
- Simplify language downloads
- Tune and improve nsis configuration
- Add sizes for language data
- Add missing translations to nsis configuration
- Don't show details in installer by default
- Initial code for 64 bit Tesseract installer
- Fix uninstall for TESSDATA_PREFIX registry key
- Remove cube code
- nsis: Add all training executables
- nsis: Disable registry settings

Trying to add to PATH fails if the old PATH is very long and
will result in an empty PATH.

Remove these settings as they were already disabled by default,
and both are not needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-11-02 07:00:33 +01:00
zdenop@gmail.com
678e427d8b add NSIS script for Windows installer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@815 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2024-11-02 07:00:33 +01:00
Stefan Weil
7fd6d2388a Fix more typos in code comments and variable name
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-31 15:00:55 +01:00
zdenop
9f8e07cdf9
Merge pull request #4337 from stweil/typos
Fix some typos and grammar issues
2024-10-28 14:12:56 +01:00
Amit D.
3633e88b2a
Update README.md: Fix OSS-Fuzz link
2024-10-28 14:32:09 +02:00
Stefan Weil
3400ce7662 Fix more typos in code comments
2024-10-23 15:05:58 +02:00
Stefan Weil
31e864b4a4 Fix Settup -> Setup in method names
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-23 15:03:44 +02:00
Stefan Weil
688f8283c5 Fix some code comments
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-23 14:59:15 +02:00
Stefan Weil
638868ed38
Modernize code for renderers and remove filename conversion for Windows (#4330)
Commit db52047420 added the filename conversion for the hOCR renderer,
but it was removed later for TSV in commit 6700edd8bc.

Tesseract does not use a filename conversion anywhere else, so remove it
for the other renderers, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-23 08:34:06 +03:00
Stefan Weil
3020c14a60 CI: Install libtool as required dependency for macOS build
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-23 07:07:09 +02:00
Stefan Weil
e9fc2af0b2 CI: Install curl and icu4c as required dependencies for macOS build
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-22 16:41:29 +02:00
Zdenko Podobný
2976eb1678 Revert "use variable instead of hardcoded name for pkg-config file"
This reverts commit b4a4f5c6cb.
2024-10-22 11:03:58 +02:00
Stefan Weil
b4adf2464b Replace deprecated runner macos-12 by macos-latest in GitHub actions
The macOS 12 runner image will be removed by December 3rd, 2024.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-10-17 18:39:46 +02:00
Zdenko Podobný
3ebed57878 Merge branch 'main' of https://github.com/tesseract-ocr/tesseract
2024-10-17 09:11:30 +02:00
Zdenko Podobný
b4a4f5c6cb use variable instead of hardcoded name for pkg-config file
2024-10-17 09:11:22 +02:00
zdenop
aacc9052b9
Update cmake.yml
Use macOS 15 as the macOS 12 runner image will be removed by December 3rd, 2024
2024-10-17 07:06:25 +02:00
Egor Pugin
61ed4d9f36 Do not export PDBs for static libraries. Fixes #4279.
2024-10-07 20:42:52 +03:00
zdenop
900c721f14
Merge pull request #4319 from Conan-Kudo/fix-soversion
cmake: Correctly set the soversion based on SemVer properties
2024-09-19 15:18:08 +02:00
Neal Gompa
280779c615 cmake: Correctly set the soversion based on SemVer properties
As this project follows Semantic Versioning, the shared object
version should match these semantics.

The two options that make sense here are to have the soversion
set to the version major (so only breaking changes are tracked)
or to set to version major and minor (so breaking and API additions
are tracked).

Since the Windows version of the library already uses version major
and version minor, let's just do this universally.

Fixes: 832926f5af ("Update library version handling for cmake")

Signed-off-by: Neal Gompa <neal@gompa.dev>
2024-09-18 07:44:29 -04:00
Stefan Weil
4f43536335
Merge pull request #4314 from stweil/optimize
Add C++ stream for log messages and use it in two debug messages
2024-09-04 05:22:03 +02:00
Stefan Weil
37d1c6506d Add TESS_API in declaration for tesserr (fix sw build)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 17:31:38 +02:00
Stefan Weil
7ef8e3c7ee Print time for ErrorCounter::ComputeErrorRate in milliseconds
Also optimize the code: replace tprintf with a C++ stream
and call clock() only when needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 16:26:50 +02:00
Stefan Weil
bd7b3571cc Print time for tessedit_timing_debug in milliseconds
Also optimize the code a little.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 16:26:50 +02:00
Stefan Weil
33d673c46d tprintf: Add C++ stream for log messages
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 16:26:50 +02:00
Stefan Weil
a63e7ec2e6 tprintf: Modernize and simplify the code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 15:42:03 +02:00
Stefan Weil
3a4a013dfe tprintf: Remove unused macro and update comment
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 15:42:03 +02:00
Stefan Weil
1b222452f4 Remove unnecessary assignment and assertions
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-09-03 15:41:07 +02:00
Stefan Weil
027ad18a8d Fix several format strings
This fixes five performance issues reported by Codacy:

    %u in format string (no. 2) requires 'unsigned int' but the argument type is 'signed int'.
    %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'.
    %d in format string (no. 1) requires 'int' but the argument type is 'unsigned int'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 13:16:09 +02:00
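The kind of mismatch reported for commit 027ad18a8d can be illustrated with a minimal sketch. The helper `format_counts` and its parameters are hypothetical and do not come from the Tesseract sources; the sketch only shows each conversion specifier paired with an argument of the matching signedness.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the Tesseract sources): formats one signed
// and one unsigned value with matching conversion specifiers.
std::string format_counts(int delta, unsigned int total) {
  char buffer[64];
  // %d pairs with 'int' and %u pairs with 'unsigned int'. Swapping the two
  // specifiers is the kind of mismatch the quoted warnings describe.
  std::snprintf(buffer, sizeof(buffer), "delta=%d total=%u", delta, total);
  return std::string(buffer);
}
```

Calling `format_counts(-3, 7u)` formats the negative value with `%d` and the unsigned value with `%u`.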
Stefan Weil
6be58e54fa Initialize variables in initialization list
This fixes several performance issues reported by Coverity:

    Variable 'master_trainer_' is assigned in constructor body.
    Consider performing initialization in initialization list.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 13:16:09 +02:00
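The pattern behind commit 6be58e54fa can be sketched as follows. The class `Trainer` and its members are hypothetical stand-ins chosen for illustration; they are not the actual Tesseract classes named in the warning.

```cpp
#include <string>
#include <utility>

// Hypothetical class for illustration only.
class Trainer {
 public:
  // Members are initialized directly in the initialization list instead of
  // being default-constructed first and then assigned in the constructor body.
  Trainer(std::string name, int iterations)
      : name_(std::move(name)), iterations_(iterations) {}

  const std::string &name() const { return name_; }
  int iterations() const { return iterations_; }

 private:
  std::string name_;
  int iterations_;
};
```

For class-type members such as `std::string`, this avoids a default construction followed by an assignment.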
Stefan Weil
4ea8495d1c Replace std::string::substr by std::string::resize
This fixes two performance issues reported by Codacy:

    Ineffective call of function 'substr' because a prefix of the string
    is assigned to itself. Use resize() or pop_back() instead.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 13:16:09 +02:00
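A minimal sketch of the replacement described in commit 4ea8495d1c, assuming a hypothetical helper `strip_extension` that is not part of the Tesseract sources. Assigning a prefix of a string to itself through `substr` builds a temporary string, whereas `resize` truncates in place.

```cpp
#include <string>

// Hypothetical helper: removes a trailing extension such as ".tif" in place.
void strip_extension(std::string &filename) {
  const auto dot = filename.rfind('.');
  if (dot != std::string::npos) {
    // Before: filename = filename.substr(0, dot);  // creates a temporary
    filename.resize(dot);  // truncates the existing string in place
  }
}
```

A string without a dot is left unchanged, because `rfind` returns `std::string::npos` in that case.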
Stefan Weil
5fd7870cd6 Fix location of namespace statement
It separated a comment for Tesseract::recog_pseudo_word from this function.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 08:13:15 +02:00
Stefan Weil
d50600a618 Remove old comment in Makefile.am
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 08:05:07 +02:00
Stefan Weil
67aad9ed13 Compile src/lstm/tfnetwork.cpp only in builds with TensorFlow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 08:03:53 +02:00
Stefan Weil
4e42f9de54
Modernize code for list of available models (#4308)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-26 07:38:40 +02:00
Stefan Weil
fc50324986
Replace access/_access by std::filesystem::exists (#4307)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-25 18:57:22 +02:00
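The replacement named in commit fc50324986 can be sketched with a small wrapper. The function `file_exists` is a hypothetical name, and the choice of the non-throwing `std::error_code` overload is an assumption; the log does not say which overload the actual change uses. The sketch requires C++17.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: portable existence check without access()/_access().
bool file_exists(const std::string &path) {
  std::error_code ec;
  // The error_code overload does not throw; it returns false on error.
  return std::filesystem::exists(path, ec);
}
```

Unlike `access()` with a mode argument, this only tests for existence and does not check read or write permission.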
Egor Pugin
ee80dfe509
Merge pull request #4305 from Balearica/issue-4304
2024-08-23 12:13:11 +03:00
Mahesh Madhav
3b9d119518
Reduce clock syscalls (#4303)
Gate the sample of the clock by the tessedit_timing_debug flag,
which is the only time it gets used anyway.

This eliminates unnecessary clock_gettime() system calls.
2024-08-23 08:16:52 +02:00
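The idea in commit 3b9d119518, sampling the clock only when timing output is requested, can be sketched as below. `TimingOptions` is a hypothetical stand-in for the `tessedit_timing_debug` flag, `run_with_optional_timing` is an invented name, and the use of `std::clock` is an assumption made for portability; none of this is the actual Tesseract code.

```cpp
#include <ctime>

// Hypothetical stand-in for the tessedit_timing_debug flag.
struct TimingOptions {
  bool timing_debug = false;
};

// Runs work() and returns the elapsed CPU time in milliseconds, or -1.0 if
// timing is disabled. The clock is only sampled when the flag is set.
template <typename Work>
double run_with_optional_timing(const TimingOptions &options, Work work) {
  std::clock_t start = 0;
  if (options.timing_debug) {
    start = std::clock();
  }
  work();
  if (!options.timing_debug) {
    return -1.0;  // no clock calls were made on this path
  }
  return 1000.0 * static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}
```

With the flag off, the function makes no clock calls at all, which is the saving the commit message describes.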
Balearica
ba8dfcece7 Calculate row bounding box in single-word mode per #4304
2024-08-22 21:33:35 -07:00
Stefan Weil
215b023c43 Set hOCR capabilities ocrp_dir and ocrp_lang unconditionally
Both `dir` and `lang` are also written if no font information
was requested.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-12 09:10:32 +02:00
Stefan Weil
ecf0622a85 Fix comment
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-08-10 18:42:40 +02:00
Stefan Weil
46b99041eb CI: Clean more GitHub action (remove unneeded mkdir)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-29 17:28:38 +02:00
Stefan Weil
620d82812f CI: Clean GitHub action for autotools on macOS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-29 17:26:07 +02:00
Stefan Weil
c6b0082754 CI: Clean GitHub action for unittest on macOS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-29 17:20:20 +02:00
Stefan Weil
e563e83e49 CI: Replace macOS 11 runner which is no longer supported by macOS 14 runner
Also use newer compilers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-28 21:12:22 +02:00
Stefan Weil
e1fea0700f Fix whitespace issues (space at line endings)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-28 20:56:47 +02:00
Stefan Weil
c5030ea15a Add missing include statement
Fixes: bc490ea7ab ("Ignore illegal TESSDATA_PREFIX [...]")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-28 20:56:47 +02:00
JKamlah
dd08a7aa6a Fix confidence output for the PAGE XML renderer.
2024-07-25 11:29:45 +02:00
Stefan Weil
bc490ea7ab Ignore illegal TESSDATA_PREFIX (not existing filesystem entry, issue #4277)
Don't check for a directory, because a symbolic link is also allowed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2024-07-03 20:47:55 +02:00