Tom Morris
fc80ceafb9
Fix hocrtsv references in Makefile
2016-03-02 10:46:52 -05:00
Tom Morris
6700edd8bc
Cleanup TSV renderer
Remove all references to hocr, hocr.tsv, etc. Remove dead code for font
info, input filename, HTML escapes. Improved comments. Fixed
indentation.
2016-03-01 13:41:19 -05:00
Sundar M. Vaidya
937ceb2d1b
Adds hocrtsv to tessdata/configs/Makefile.am
2016-03-01 12:25:15 -05:00
Sundar M. Vaidya
3163b38151
Adds hocrtsv file to configs folder.
2016-03-01 12:23:12 -05:00
Sundar M. Vaidya
59d593d796
Calls TessHOcrTsvRenderer if tessedit_create_hocrtsv is true.
2016-03-01 12:23:12 -05:00
Tom Morris
e3e1fe0e20
Document hocr_font_info in config
2016-02-14 16:49:00 -05:00
James R. Barlow
b30930b95a
Replace pdf.ttf with sharp2.ttf, keep name the same
As discussed at length in issue #182, the existing pdf.ttf causes difficulties
for certain PDF viewers, in part because the old file had zero advance width.
With testing, sharp2.ttf seems to be the best available compromise, although
it's not perfect and causes some visual difficulties in Evince. It does
seem to fix Kindle and OS X Preview.
2016-02-11 15:44:11 -08:00
Amit Dovev
6b08184a2c
Update Makefile.am
2015-12-18 16:12:32 +02:00
amitdo
c2f5e9b849
If there is no explicit renderer(s), default to TessTextRenderer
Revert fd429c32, 43834da7, 05de195e.
See #49, #59.
The code in this commit solves the issue in a more elegant way, IMHO.
Now you can use:
* `tesseract eurotext.tif eurotext txt pdf`
* `tesseract eurotext.tif eurotext txt hocr`
* `tesseract eurotext.tif eurotext txt hocr pdf`
NOTE:
With `tesseract eurotext.tif eurotext`
or `tesseract eurotext.tif eurotext txt`
the psm will be set to '3', but...
With `tesseract eurotext.tif eurotext txt pdf`
or `tesseract eurotext.tif eurotext txt hocr`
the psm will be set to '1'.
2015-12-11 19:06:49 +02:00
Zdenko Podobný
66a76a9477
Revert "temporary add config/*, configure and Makefile.in for release"
This reverts commits ec9581d8f2, 1afe382c4e, 4b2cfabcc1
2015-07-31 21:44:43 +02:00
Zdenko Podobný
5dfb0cb898
Fixes #64 - tessedit_create_txt 0 blocks box training
2015-07-25 22:49:55 +02:00
Jim O'Regan
05de195efc
disable text creation for unlv, makebox, box.train, and box.train.stderr (see #49)
2015-07-20 10:07:55 +01:00
Jim O'Regan
43834da7a2
disable text creation when creating hOCR (issue #49)
2015-07-18 08:56:21 +01:00
Jeff Breidenbach
fd429c32a0
PDF creation: not disabling tessedit_create_txt
Okay, everything is more or less under control except for this:
tesseract phototest.tif - pdf > phototest.pdf
This activates both the text renderer and the pdf renderer.
They both get sent to stdout where they mix together and cause chaos.
Same thing happens with this command.
tesseract phototest.tif stdout pdf > phototest.pdf
What's happening is tesseractmain.cpp is setting tessedit_create_pdf without
disabling tessedit_create_txt.
https://groups.google.com/d/msgid/tesseract-dev/32c065ee-aefa-441a-b37b-b6bdc234c8ab%40googlegroups.com
2015-07-18 08:39:57 +01:00
Zdenko Podobný
ec9581d8f2
temporary add configure and Makefile.in for release
2015-07-11 09:42:43 +02:00
Ray Smith
1e3b671298
Fixes to make yesterday's changes compile
2015-05-13 09:58:59 -07:00
Ray Smith
6b634170c1
Significant change to invisible font system
to improve correctness and compatibility with
external programs, particularly ghostscript.
We will start mapping everything to a single glyph,
rather than allowing characters to run off the end
of the font.
A more detailed design discussion is embedded into
pdfrenderer.cpp comments. The font, source code
that produces the font, and the design comments
were contributed by Ken Sharp from Artifex Software.
2015-05-12 17:33:18 -07:00
Ray Smith
d9699c4099
Fixed bidi handling in PDF output
2014-10-09 13:29:01 -07:00
Zdenko Podobný
369fabb7fc
fix filemode;
update autotools and distribution script to repository changes;
ignore doxygen generated files and language data files;
2014-08-14 23:37:17 +02:00
zdenop
1ea387232b
fix compatibility of uninstall: MacOSX rm needs -f instead of --force
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1127 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-24 20:39:30 +00:00
zdenop
a66f5b84c8
install pdf.ttf and pdf.ttx as part of tesseract library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1031 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-29 22:12:32 +00:00
theraysmith@gmail.com
91d2265429
More minor fixes from issues and cleanup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
4c72deea6c
Added pdf config file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@972 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 19:18:07 +00:00
theraysmith@gmail.com
bfa401a6f8
Added PDF data files
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@971 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 19:14:11 +00:00
zdenop@gmail.com
53a3e0f88a
fix issue 755; add example config files from tesseract manpage
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@894 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-20 20:20:10 +00:00
zdenop@gmail.com
d5b3c6c47c
fix Parallel Build Trees (a.k.a. VPATH Builds) ('make install-langs' and 'make install-jars')
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@888 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-03 21:26:35 +00:00
zdenop@gmail.com
32d212d0c6
add new config file - get.image
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@826 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-02-23 11:56:49 +00:00
zdenop@gmail.com
e83503022c
update script for 3.02.02 release
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@793 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-26 18:49:14 +00:00
zdenop
1131e5dd2f
addition to Issue 724
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@731 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-04 15:35:26 +00:00
zdenop@gmail.com
d72a318c5c
fix Issue 724: DESTDIR not supported with make install-langs
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@730 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-03 20:33:28 +00:00
zdenop@gmail.com
1455bf5610
set tessedit_module_name for windows;
implement 'make install LANG="eng ara deu"';
more headers need to be installed: https://groups.google.com/group/tesseract-dev/msg/a4f7424377993b2e
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@700 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-06 22:41:43 +00:00
zdenop@gmail.com
8cc34e85f1
'make install' does not require language data; language data are installed by 'make install-langs'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@695 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-05 00:11:38 +00:00
zdenop@gmail.com
3b326532cc
fix --enable-multiple-libraries; implement quiet mode (issue 580)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@691 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 11:48:59 +00:00
zdenop@gmail.com
425c2b8205
install data files; small fix of INSTALL, README; removed ABOUT-NLS (NLS not used at the moment)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@667 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-05 16:25:40 +00:00
theraysmith@gmail.com
d581ab7e12
New config for testing bigram correction.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@661 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 18:46:19 +00:00
theraysmith@gmail.com
6e273b71bd
Cube trained data for fra, ita, rus, spa
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@656 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:08:26 +00:00
theraysmith@gmail.com
aae3da5bf1
Last minute fixes for making the tarball
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@636 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-22 05:28:44 +00:00
zdenop@gmail.com
67f47008c7
fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
zdenop@gmail.com
2ded50b4d0
'make dist' improvement; removed debugwin.* from vs2008 and vs2010; decreased warning level in vs2008 project files for Release* build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@620 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 21:33:28 +00:00
joregan@gmail.com
323ee5af7a
more Makefile.in
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@618 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:40:33 +00:00
theraysmith@gmail.com
d5d15f32d7
Deleted Makefile.in from svn
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
7ec3dca968
show page 0 for multipage tiff;
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
zdenop@gmail.com
3463abfd34
commented parameters that caused error (read_params_file: parameter not found:)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@589 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-06-15 20:20:45 +00:00
theraysmith
311d1f9253
Added Hindi traineddata
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@576 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:57:08 +00:00
zdenop@gmail.com
d8a2303daf
improved makemoredists script and tessdata/Makefile.am
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@546 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-05 13:33:45 +00:00
theraysmith
dbcab0eed3
Traineddata for non-Eng languages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@540 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:07:46 +00:00
theraysmith
5c854e03ea
Cleaned up unused parameters
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@539 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:06:44 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
zdenop@gmail.com
7511d76315
fixed hocr to produce valid document (according to http://validator.w3.org/) - issue http://code.google.com/p/tesseract-ocr/issues/detail?id=401
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@525 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-17 20:03:58 +00:00
zdenop@gmail.com
fa4d4589cb
fixed hocr (escape special characters; thanks to aizvorski) + hocr config
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@515 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-29 19:03:06 +00:00