tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-12-18 03:19:15 +08:00

Author	SHA1	Message	Date
Robin Watts	db10c7b577	intsimdmatrixneon.cpp: Do biasing in SIMD.	2020-10-12 04:30:46 -07:00
Robin Watts	d1e49d6dd2	intsimdmatrixavx2: Do biasing in SIMD. We also move to relying on both scales and output having been padded to accommodate us writing more results than are actually needed here. This was allowed for a few commits back.	2020-10-12 04:30:46 -07:00
Robin Watts	872816897a	Rejig intsimdmatrix to reduce FP ops. Avoid 1) floating point division by 127, 2) conversion of bias to double, 3) FP addition, in favour of 1) integer multiplication by 127, and 2) integer addition. (Also costs extra work in the serialisation/deserialisation of the scale values, and conversion of weights to int formats, but these are all one offs).	2020-10-12 04:30:46 -07:00
Robin Watts	aba1800f69	Round output buffers for intSimdMatrix. In order to allow intSimdMatrix implementations to 'overwrite' their outputs, ensure that the output buffers are always padded to the next block size. This doesn't make any difference yet, but it enables optimisations further down the line, especially when the biasing is pulled into the SIMD.	2020-10-12 11:47:16 +01:00
Robin Watts	9dfdac51c6	Tweak scales array for intSimdMatrix case. Currently, the size of the scales array is not rounded up in the same way as the weights are. This blocks us pushing the scale calculations into the SIMD, as when we "overread" the end of the scale array, we potentially get errors. Here, we adjust the intSimdMatrix stuff to ensure that the scales array reserves enough entries to allow such overreads to work. This doesn't make any difference for now, but opens the way for future optimisations.	2020-10-12 11:47:16 +01:00
Shatur95	5a377707e0	Generate imported target automatically	2020-10-12 11:47:16 +01:00
Shatur95	8dad1e24a2	Modernize CMake config files	2020-10-12 11:47:16 +01:00
amitdo	958f23453e	Improve disabled legacy engine build	2020-10-12 11:47:16 +01:00
amitdo	06154e028b	Improve disabled legacy engine build	2020-10-12 11:47:16 +01:00
amitdo	e81b485066	Improve disabled legacy engine build	2020-10-12 11:47:15 +01:00
amitdo	7df4918644	Improve disabled legacy engine build	2020-10-12 11:47:15 +01:00
Shatur95	ec8766ce74	Use DESTINATION instead of TYPE For compatibility with older CMake.	2020-10-12 11:47:15 +01:00
Stefan Weil	ac14ab32c6	Remove dummy functions from globaloc.cpp and related code Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-10-04 12:24:26 +02:00
zdenop	0ded9f3573	Merge pull request #3113 from stweil/pango Remove unused functions FontUtils::GetAllRenderableCharacters	2020-10-03 18:07:42 +02:00
Stefan Weil	7c4ef88dab	Remove unused functions FontUtils::GetAllRenderableCharacters They used the function pango_coverage_max which does nothing and which has been deprecated since pango version 1.44. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-10-03 12:04:40 +02:00
Egor Pugin	45413e6c42	Merge pull request #3112 from Shatur95/fix-cmake-install-configs Fix CMake install configs	2020-10-03 00:32:05 +03:00
Shatur95	72779fb185	Fix CMake install configs	2020-10-01 22:05:02 +03:00
Egor Pugin	b19e3ee63c	Update appveyor.yml	2020-09-10 15:40:07 +03:00
Egor Pugin	76ead638e3	Update sw.yml	2020-09-10 02:05:29 +03:00
zdenop	f5561c4c42	Merge pull request #3090 from nam-leduc/correct-debug-find-images Correct "NoImages" in debug pdf file	2020-09-07 09:22:43 +02:00
Le Duc Nam	eb8f1674bf	Correct "NoImages" in debug pdf file Issue: The debug information for "NoImages" was just the binary image; it did not show the result of photo_mask_pix to the developer. Fix: Subtract photo_mask_pix from the binary image; the result is the "NoImages" binary pix.	2020-09-06 23:31:30 +07:00
Stefan Weil	162f3707e2	Merge pull request #3082 from bertsky/fix-line-detector Fix separator line detector	2020-08-29 20:33:09 +02:00
Robert Sachunsky	640c14e080	AutoPageSeg/FindBlocks/GridRemoveUnderlinePartitions: avoid self-deletion When checking horizontal line partitions for possible interpretation as underline formatting, avoid confusing the hline partition itself with an overlapping neighbour (which would delete it).	2020-08-24 19:13:48 +02:00
Robert Sachunsky	65a077d3e9	FindAndRemoveLines/FindVerticalAlignment: decrease fixed vline min length When detecting vertical separators, the blob aligner is used to glue line segments (often segmented due to artificial cracks). But (unlike LineFinder) it has many parameters that are not relative to pixel density/resolution. This change decreases the minimum absolute length in pixels for vertical separators.	2020-08-24 19:13:36 +02:00
Robert Sachunsky	0228d93684	textord debugging: invert default top/bottom boundaries, improve description	2020-08-24 19:13:25 +02:00
Stefan Weil	d33edbc4b1	Merge pull request #3066 from robinwatts/pushback14 Remove unused char constant that causes a warning.	2020-07-17 15:55:51 +02:00
Robin Watts	578462109b	Remove unused char constant that causes a warning. The kDictWildcard is never actually used, so removing it makes no difference. It causes warnings in MSVC builds as MSVC doesn't know how to pack a unicode value into chars.	2020-07-17 14:22:37 +01:00
zdenop	749851d39d	Merge pull request #3065 from robinwatts/pushback13 Squash some warnings in MSVC build.	2020-07-16 14:41:43 +02:00
Robin Watts	150e2e54fe	Squash some warnings in MSVC build. In particular, "defined but not used" (caused by GRAPHICS_DISABLED), double constants being truncated to floats, and implicit casts.	2020-07-16 10:08:40 +01:00
zdenop	7fa200bfb7	Merge pull request #3064 from robinwatts/pushback12 Fix Memory leak when using TESSERACT_IMAGEDATA_AS_PIX	2020-07-15 19:08:58 +02:00
Robin Watts	7f45b719d1	Fix Memory leak when using TESSERACT_IMAGEDATA_AS_PIX If building with TESSERACT_IMAGEDATA_AS_PIX, then tesseract doesn't compress/decompress images, but rather holds the data as internal Pix structures. Unfortunately, I forgot to make the ImageData destructor free these, so memory leaked during use. Fixed here.	2020-07-15 12:35:35 +01:00
zdenop	135c8a49b5	Merge pull request #3061 from stweil/neon Always use NEON by default for ARMv8	2020-07-11 09:11:54 +02:00
zdenop	875bd48bd5	Merge pull request #3058 from stweil/scrollview Disable more code and data with GRAPHICS_DISABLED	2020-07-11 09:11:27 +02:00
Stefan Weil	548a832b98	Use strtok_s for MSVC in class SVNetwork strtok_s can be used with MSVC as a replacement for strtok_r, so less special handling is needed in the code and class SVNetwork can be made smaller by removing member has_content. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-07-10 17:47:05 +02:00
Stefan Weil	636c37fa01	Merge pull request #3060 from edwinnyawoli/patch-1	2020-07-10 16:06:25 +02:00
Edwin Nyawoli	317495ecb8	Link 'traineddata' word to its documentation This is to help make it clearer to users	2020-07-10 14:01:37 +00:00
Stefan Weil	2db2223b39	Always use NEON by default for ARMv8 Signed-off-by: Stefan Weil <stefan.weil@bib.uni-mannheim.de>	2020-07-10 15:27:09 +02:00
Edwin Nyawoli	1fb6c41e0f	Fix typo in README.md	2020-07-10 12:06:48 +00:00
Stefan Weil	cb3880fb15	Disable more code and data with GRAPHICS_DISABLED Some runtime parameters which are only relevant with graphics enabled are now removed from builds when graphics is disabled. TableFinder::DisplayColSegmentGrid is never used, so remove it completely. Builds with --disable-graphics significantly reduce the code size and avoid some function calls which might be important for certain applications: text data bss dec hex filename | 3219230 41136 13920 3274286 31f62e .libs/libtesseract.so (--disable-graphics, old) | 3211347 40976 13600 3265923 31d583 .libs/libtesseract.so (--disable-graphics, new) | 3360942 43656 15392 3419990 342f56 .libs/libtesseract.so (default) Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-07-09 11:23:33 +02:00
Stefan Weil	22e6c2e5a7	Fix division by 0.0 in BaselineRow::PerpDistanceFromBaseline It was reported by oss-fuzz (issue 23962). Add log output to find real images which trigger that issue. Avoid also some conversions from float to double by always using float. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-07-08 17:59:02 +02:00
zdenop	b67736cd6b	Merge pull request #3055 from stweil/string Use const char* for filename parameters	2020-07-07 18:15:04 +02:00
Stefan Weil	8137cf35a6	Use const char* for filename parameters This replaces the proprietary STRING data type (801 instead of 838 lines remaining). It also removes STRING from osdetect.h and serialis.h. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-07-07 14:20:09 +02:00
Stefan Weil	d01b2e43b8	unittest: Update comments in normstrngs_test.cc Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-07-07 11:29:48 +02:00
zdenop	36985fcc03	Merge pull request #3052 from stweil/msvc Fix cmake build for MSVC	2020-07-03 21:09:52 +02:00
Stefan Weil	0e79daed42	Fix cmake build for MSVC MSVC does not support /arch:FMA or /arch:SSE4.1. For /arch:AVX and /arch:AVX2 no check is needed because they have been supported for a long time. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-07-01 21:54:51 +02:00
Stefan Weil	e910b3c20b	Merge pull request #3050 from zdenop/cmake_AVX Update cmake builds to set right flags for AVX, ...	2020-07-01 08:52:20 +02:00
zdenop	511189b069	Update CMakeLists.txt thanks. Co-authored-by: Stefan Weil <sw@weilnetz.de>	2020-07-01 08:43:55 +02:00
zdenop	2538989ef5	cmake: NEON build is not supported on Mac OS X	2020-07-01 00:12:10 +02:00
zdenop	3c3e7b913f	cmake: check compiler flags for AVX,AVX2,FMA,SSE4.1 support	2020-06-30 23:09:36 +02:00
zdenop	33f1e1371b	cmake: eliminate OptimizeForArchitecture	2020-06-30 22:35:05 +02:00
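
The intSimdMatrix commits listed above (872816897a, aba1800f69, 9dfdac51c6) describe two ideas: padding the scales and output arrays up to the next block size so a SIMD kernel can read or write whole blocks, and folding the bias into the integer accumulator so the per-output floating point division by 127 disappears. The following is only a minimal scalar sketch of those two ideas, not the code from the commits. All function names are hypothetical, and the formula is an assumption inferred from the commit message: `(total / 127 + bias) * scale` rearranged as `(total + bias * 127) * (scale / 127)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: round n up to the next multiple of block.
inline std::size_t RoundUpToBlock(std::size_t n, std::size_t block) {
  assert(block > 0);
  return (n + block - 1) / block * block;
}

// Hypothetical helper: allocate an output buffer padded to a whole number
// of blocks, so a kernel may write past num_out up to the padded size.
inline std::vector<double> MakePaddedOutput(std::size_t num_out,
                                            std::size_t block) {
  return std::vector<double>(RoundUpToBlock(num_out, block), 0.0);
}

// Assumed original form: one FP division and one FP addition per output.
inline double BiasedOutputFloat(std::int32_t total, std::int8_t bias,
                                double scale) {
  return (static_cast<double>(total) / 127.0 + bias) * scale;
}

// Rearranged form: the bias is added with integer arithmetic, and the
// caller passes a scale that was divided by 127 once, ahead of time.
inline double BiasedOutputInt(std::int32_t total, std::int8_t bias,
                              double scale_over_127) {
  const std::int32_t biased = total + static_cast<std::int32_t>(bias) * 127;
  return static_cast<double>(biased) * scale_over_127;
}
```

The two output functions agree up to floating point rounding. The rearranged one moves the division into a one-off preprocessing step on the scales, which matches the trade-off the commit message mentions (extra work when the scale values are prepared, less work per output).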

4645 Commits