tesseract/ccmain
Ray Smith 0e868ef377 Major change to improve layout analysis for heavily diacritic languages:
Tha, Vie, Kan, Tel etc.
There is a new overlap detector that detects when diacritics
cause a big increase in textline overlap. In such cases, diacritics from
overlap regions are kept separate from layout analysis completely, allowing
textline formation to happen without them. The diacritics are then assigned
to 0, 1 or 2 close words at the end of layout analysis, using and modifying
an old noise detection data path.
The stored diacritics are used or not during recognition according to the
character classifier's liking for them.
2015-05-12 16:47:02 -07:00
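The final step described in the commit message above, assigning each stored diacritic to 0, 1 or 2 close words, can be illustrated with a minimal sketch. This is not the code from the commit: `Box`, `HorizontalGap`, `AssignDiacritic` and the `max_gap` threshold are hypothetical names introduced here, and the rule shown (nearest words by horizontal gap) is an assumed simplification of whatever criteria the real data path uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical horizontal extent of a blob or word (not Tesseract's TBOX).
struct Box {
  int left;
  int right;
};

// Horizontal gap between two boxes; 0 if they overlap in x.
int HorizontalGap(const Box& a, const Box& b) {
  if (a.right < b.left) return b.left - a.right;
  if (b.right < a.left) return a.left - b.right;
  return 0;
}

// Returns the indices of at most two words lying within max_gap of the
// diacritic, nearest first. An empty result means the diacritic is left
// unassigned (the "0 words" case).
std::vector<std::size_t> AssignDiacritic(const Box& diacritic,
                                         const std::vector<Box>& words,
                                         int max_gap) {
  assert(max_gap >= 0);
  std::vector<std::size_t> candidates;
  for (std::size_t i = 0; i < words.size(); ++i) {
    if (HorizontalGap(diacritic, words[i]) <= max_gap) candidates.push_back(i);
  }
  // Stable sort keeps reading order among equally distant words.
  std::stable_sort(candidates.begin(), candidates.end(),
                   [&](std::size_t a, std::size_t b) {
                     return HorizontalGap(diacritic, words[a]) <
                            HorizontalGap(diacritic, words[b]);
                   });
  if (candidates.size() > 2) candidates.resize(2);
  return candidates;
}
```

Returning up to two candidates mirrors the message's "0, 1 or 2 close words": a diacritic sitting between two words can be offered to both, leaving the choice to recognition, where the message says the classifier decides whether to use it.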
adaptions.cpp Fixed issues 1093-1097 2014-02-04 23:36:24 +00:00
applybox.cpp Fixed issue 1252: Refactored LearnBlob and its call hierarchy to make it a member of Classify. 2015-05-12 15:22:34 -07:00
control.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
control.h remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
cube_control.cpp Fixed issues 899/1220/1246 (mixed eng+ara) 2014-09-17 18:27:49 -07:00
cube_reco_context.cpp Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Added paragraph detection in layout analysis/post OCR, Fixed inconsistent xheight during training and over-chopping, Added simultaneous multi-language capability, Refactored top-level word recognition module, Fixed problems with internally scaled images 2012-02-02 02:59:49 +00:00
cube_reco_context.h Misc fixes 2014-04-23 22:54:50 +00:00
cubeclassifier.cpp Fixed issue 1102 2014-04-24 00:10:59 +00:00
cubeclassifier.h Refactored classifier to make it easier to add new ones 2013-09-23 15:16:01 +00:00
docqual.cpp Fixed issues 1093-1097 2014-02-04 23:36:24 +00:00
docqual.h remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
equationdetect.cpp Fixed slow-down that was caused by upping MAX_NUM_CLASSES 2014-01-24 21:12:35 +00:00
equationdetect.h Added experimental equation detector 2012-02-02 02:50:01 +00:00
fixspace.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
fixspace.h remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
fixxht.cpp Fixed problems with shifted baselines so recognition can recover from layout analysis errors. 2015-05-12 15:53:45 -07:00
ltrresultiterator.cpp Misc fixes 2014-04-23 22:54:50 +00:00
ltrresultiterator.h Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc. 2013-09-23 15:26:50 +00:00
Makefile.am Removed dependence on IMAGE class 2014-01-09 17:48:00 +00:00
mutableiterator.h Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Added paragraph detection in layout analysis/post OCR, Fixed inconsistent xheight during training and over-chopping, Added simultaneous multi-language capability, Refactored top-level word recognition module, Fixed problems with internally scaled images 2012-02-02 02:59:49 +00:00
osdetect.cpp Cleanup from previous changes 2014-08-12 16:12:46 -07:00
osdetect.h Fixed issues 1081-1090 2014-02-04 02:23:18 +00:00
output.cpp Fixed issues 1093-1097 2014-02-04 23:36:24 +00:00
output.h remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
pageiterator.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
pageiterator.h Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
pagesegmain.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
pagewalk.cpp Major refactor of control.cpp to enable line recognition 2014-08-11 23:23:06 +00:00
par_control.cpp Major refactor of control.cpp to enable line recognition 2014-08-11 23:23:06 +00:00
paragraphs_internal.h Fixed typos and improved comments 2012-09-21 15:31:20 +00:00
paragraphs.cpp misc fixes 2014-01-09 17:49:07 +00:00
paragraphs.h Provide better paragraph segmentation without having to run fully 2012-05-10 00:03:34 +00:00
paramsd.cpp Fixed issue 1099 2014-04-24 00:06:36 +00:00
paramsd.h Add ability to build under android (without cube or scrollview). 2015-05-12 15:41:15 -07:00
pgedit.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
pgedit.h fix svn:executable attribute, trailing spaces, version include 2013-11-03 17:24:00 +00:00
recogtraining.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
reject.cpp Fixed issues 1093-1097 2014-02-04 23:36:24 +00:00
reject.h Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc. 2013-09-23 15:26:50 +00:00
resultiterator.cpp fix VS2010 build; 2015-02-05 17:27:18 +01:00
resultiterator.h fix VS2010 build; 2015-02-05 17:27:18 +01:00
superscript.cpp Improved sub/superscript treatment 2013-09-20 19:49:47 +00:00
tessbox.cpp Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc. 2013-09-23 15:26:50 +00:00
tessbox.h remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
tessedit.cpp Add ability to build under android (without cube or scrollview). 2015-05-12 15:41:15 -07:00
tessedit.h remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
tesseract_cube_combiner.cpp Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc. 2013-09-23 15:26:50 +00:00
tesseract_cube_combiner.h fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING") 2012-02-19 21:48:45 +00:00
tesseractclass.cpp Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
tesseractclass.h Major change to improve layout analysis for heavily diacritic languages: 2015-05-12 16:47:02 -07:00
tessvars.cpp remove unused code (Windows related) 2013-07-08 18:21:10 +00:00
tessvars.h Removed dependence on IMAGE class 2014-01-09 17:46:37 +00:00
tfacepp.cpp Major refactor to improve speed on difficult images, especially when running 2015-05-12 14:59:14 -07:00
thresholder.cpp Major refactor of control.cpp to enable line recognition 2014-08-11 23:23:06 +00:00
thresholder.h Major refactor of control.cpp to enable line recognition 2014-08-11 23:23:06 +00:00
werdit.cpp Major refactor of control.cpp to enable line recognition 2014-08-11 23:23:06 +00:00
werdit.h Major refactor of control.cpp to enable line recognition 2014-08-11 23:23:06 +00:00