5987 Commits

Author SHA1 Message Date
Tom Morris
d3c69072d7 Update for Github & fix spelling 2015-06-22 16:08:18 -04:00
Zdenko Podobný
3a0da4e1b7 fix DISABLE_GRAPHICS build (google code issue 1490) 2015-06-21 22:50:14 +02:00
Zdenko Podobný
392ad881cc fix link to UNVL test 2015-06-14 17:29:55 +02:00
Zdenko Podobný
6a998ec5d0 fix redefinition in stringrenderer.cpp (stringrenderer.h) 2015-06-14 17:29:27 +02:00
Zdenko Podobný
9b7f2527f1 fix links in doc; autotools requires README 2015-06-13 00:08:05 +02:00
Ray Smith
0ee178d79b Clang fixes to earlier changes and build compatibility with Google environment part 2 2015-06-12 11:17:47 -07:00
Ray Smith
d174c4fd33 Fixed occurrence of small rotated blocks in loosely spaced text part 2 2015-06-12 11:12:06 -07:00
Ray Smith
b1d99dfe23 Added a backup adaptive classifier to take over from primary when it fills on a large document 2015-06-12 11:10:53 -07:00
Ray Smith
78b5e1a77d Fixed occurrence of small rotated blocks in loosely spaced text 2015-06-12 11:05:00 -07:00
Ray Smith
d74c625e52 Fixed blob division params to fix CJK training speed. 2015-06-12 10:59:26 -07:00
Ray Smith
4c7ab0caea Fixed font lists, improved wordlist management 2015-06-12 10:56:40 -07:00
Ray Smith
ab0f4e2c38 Clang fixes to earlier changes and build compatibility with Google environment 2015-06-12 10:53:21 -07:00
zdenop
3ba1f83eb1 Merge pull request #36 from jan-ruzicka/patch-2
ChangeLog reformatting for consistent ordering
2015-06-11 09:50:38 +02:00
Jan Ruzicka
953c563efb change order of entries V1.0 ... V2.04
This is to have the newest on top ordering of revisions.
2015-06-11 01:34:45 -04:00
Jan Ruzicka
36740897e0 convert date formats 2015-06-11 01:27:11 -04:00
Jan Ruzicka
42481f2cf4 uniform bullet formatting 2015-06-10 22:52:37 -04:00
zdenop
10ea4f0636 Merge pull request #35 from jan-ruzicka/patch-1
more link updates
2015-06-02 21:29:47 +02:00
Jan Ruzicka
f89c7808cf more link updates
modifying link to training from google code and adding link to documentation by Doxygen.
2015-06-02 14:12:42 -04:00
zdenop
8faea4bf06 Update README.md
fix links to wiki
2015-06-02 09:56:55 +02:00
Zdenko Podobný
fc793355a8 Move pdf documents to docs repository 2015-05-22 22:10:31 +02:00
Zdenko Podobný
b1b02572ab Merge branch 'Issue1474'
* Issue1474:
  Fix potential null pointer dereference in ccmain/paragraphs.cpp.
2015-05-22 21:19:14 +02:00
Zdenko Podobný
d8a55d739d Fix potential null pointer dereference in ccmain/paragraphs.cpp. 2015-05-22 21:17:33 +02:00
zdenop
e4136f28a5 Merge pull request #33 from rmtheis/tweak-readme
Minor edits to Readme
2015-05-22 08:25:44 +02:00
Robert Theis
a36a5f96d0 Minor edits to Readme 2015-05-21 19:36:50 -07:00
zdenop
f8ebff262e Merge pull request #32 from orbitcowboy/master
Fix potential null pointer dereference in ccmain/paragraphs.cpp.
2015-05-20 19:01:13 +02:00
orbitcowboy
9328f0e5d4 Fix potential null pointer dereference in ccmain/paragraphs.cpp. 2015-05-19 10:17:44 +02:00
Jim Regan
05acff6253 Merge pull request #23 from tesseract-ocr/training-sh
/usr/share/fonts is the wrong path on Mac
2015-05-18 14:05:44 +01:00
Jim O'Regan
4a6195202c fix typo 2015-05-18 12:32:36 +01:00
Jim O'Regan
99be295349 Merge branch 'monitor' of https://github.com/tesseract-ocr/tesseract into monitor 2015-05-18 12:29:11 +01:00
Renard Wellnitz
49a7ed13ea fix to compile tesseract on mac with clang 2015-05-18 09:59:10 +01:00
Jim O'Regan
16ac3b0a20 /usr/share/fonts is the wrong path on Mac 2015-05-18 09:53:14 +01:00
zdenop
e9f59351de Merge pull request #19 from haf/feature/readme-improvement
[infra] updating readme
2015-05-18 08:46:46 +02:00
Zdenko Podobný
438edd6c7b added row attributes to hocr output 2015-05-17 22:13:59 +02:00
Zdenko Podobný
917e994caa extend ETEXT_DESC by progress_callback 2015-05-17 21:56:40 +02:00
Zdenko Podobný
ed6ae9b974 Add monitor to GetHOCRText 2015-05-17 21:55:50 +02:00
Henrik Feldt
a0ea634e15 [infra] README -> README.md, links 2015-05-16 19:19:54 +02:00
Henrik Feldt
03c29f96d8 [infra] updating readme 2015-05-16 19:10:10 +02:00
Zdenko Podobný
59bcbc79b3 fix GIT_VER info in VS2010 2015-05-15 15:14:49 +02:00
Zdenko Podobný
e98849b482 print error message when pdf.ttf is not found. 2015-05-15 15:14:00 +02:00
Jim O'Regan
e7b087ffe6 update Doxyfile 2015-05-14 13:43:07 +01:00
Zdenko Podobný
aec22a47ec fix autotools c++11 issue with disabled training 2015-05-14 14:25:49 +02:00
Zdenko Podobný
1d6de86150 fix VS2010 linking error 2015-05-14 14:24:55 +02:00
Zdenko Podobný
035b324f0f reflect the latest commits in VS2010 build 2015-05-14 10:52:54 +02:00
Ray Smith
941d87057e Fixed training build 2015-05-13 17:46:58 -07:00
Ray Smith
81b67f7ed9 Removed debug logging that doesn't belong 2015-05-13 17:12:23 -07:00
Ray Smith
d91df9856b Fixed crash on debugging classifier with a shapetable present 2015-05-13 17:10:23 -07:00
Ray Smith
4598061324 Fixed infinite loop in training due to poor clipping of the table filler 2015-05-13 17:09:35 -07:00
Ray Smith
5bb0d89291 Improved debug of class pruner 2015-05-13 17:07:11 -07:00
zhivko.tabakov@gmail.com
07be522e43 Issue 1351: OpenCL build - kernel_ThresholdRectToPix() not accounting for padding bits in the output pix?!
https://code.google.com/p/tesseract-ocr/issues/detail?id=1351

What steps will reproduce the problem?
1. Use a tesseract build with OpenCL.
2. Pass a full-color image whose width is not a multiple of 32.
3. Recognition is far too slow and does not recognize anything.
I read the article at http://www.sk-spell.sk.cx/tesseract-meets-the-opencl-first-test and decided to give OCL a try. The initial result was as described in step 3 above. After some debugging I found that the OCL version of threshold rect generation does not account for padding bits in the output pix lines. To confirm this, I made a quick fix in oclkernels.h, replacing the definition of kernel_ThresholdRectToPix

Just a reminder: it is necessary to force OCL kernel recompilation after changing this source (e.g. delete “kernel - <device>.bin” from the exec folder).
The fix works, but I am not sure about it, since the original source apparently works for other people (as per the article). If I am right, the OS and GPU are irrelevant because the bug is algorithmic, but mine are Windows/AMD. A similar fix applies to kernel_ThresholdRectToPix_OneChan(), but there the input array might have some padding bytes as well, so its indexing will need further adjustments. I can come up with a proof or fix for it too; I have not played with it yet.
Disclaimer: I have no prior experience with image processing and tesseract source or with GPU computing and OpenCL (but please do explain if I am wrong).
2015-05-13 21:23:23 +01:00
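The padding problem reported in commit 07be522e43 can be illustrated with a small sketch. This is not the kernel from oclkernels.h, which is not reproduced in the log. It is a hypothetical Python model that assumes a 1 bpp output image whose rows are padded to whole 32-bit words, with bits stored most significant bit first and a set bit marking a dark pixel. The function names, the bit order, and the polarity are assumptions made for illustration only. The second function shows one way such a bug might arise: indexing the output as a continuous bit stream, which only matches the padded layout when the width is a multiple of 32.

```python
def words_per_line(width):
    """32-bit words per row of a 1 bpp image, including row padding."""
    return (width + 31) // 32


def threshold_to_packed(gray, width, height, thresh):
    """Threshold a row-major grayscale buffer into padded 1 bpp rows.

    Assumed layout (illustrative): each row starts on a word boundary,
    bits are MSB first, and a set bit means the pixel is below `thresh`.
    """
    wpl = words_per_line(width)
    out = [0] * (wpl * height)
    for y in range(height):
        for x in range(width):
            if gray[y * width + x] < thresh:
                # Row offset uses wpl, so padding bits at the end of
                # each row are skipped.
                out[y * wpl + x // 32] |= 1 << (31 - x % 32)
    return out


def threshold_unpadded(gray, width, height, thresh):
    """Hypothetical buggy variant: ignores row padding in the output."""
    wpl = words_per_line(width)
    out = [0] * (wpl * height)
    for i in range(width * height):
        if gray[i] < thresh:
            # Treats the output as one continuous bit stream, so rows
            # drift whenever width is not a multiple of 32.
            out[i // 32] |= 1 << (31 - i % 32)
    return out
```

With a width of 40, each row occupies two words and the last 24 bits of the second word are padding. The two functions then place the first pixel of the second row in different words, while for a width of 32 they agree. That is consistent with the report that only widths that are not a multiple of 32 misbehave.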
Ray Smith
1e3b671298 Fixes to make yesterday's changes compile 2015-05-13 09:58:59 -07:00