Thijs Leegwater
f061503a14
Added JPEG quality option parameter (-c jpg_quality=n)
2018-01-11 09:11:30 +01:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Stefan Weil
1cf8fe51a0
Remove mathfix.h
...
It was only needed for MS Visual Studio 2012 and older.
Those compilers are not supported for Tesseract.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-05 20:26:25 +02:00
Raf Schietekat
c335508e84
Fewer g++ -Wsign-compare warnings
2017-05-11 23:14:52 +02:00
Raf Schietekat
986970d6ca
RAII: pdfrenderer.cpp: pdftext
2017-05-11 02:02:37 +02:00
Raf Schietekat
3c6e18ecf9
RAII: pdfrenderer.cpp: buffer
2017-05-11 02:02:37 +02:00
Raf Schietekat
936ca00c44
RAII: pdfrenderer.cpp: cidtogidmap
2017-05-11 02:02:37 +02:00
Raf Schietekat
4840c65bf0
RAII: ResultIterator::GetUTF8Text(): was leaked inside TessBaseAPI::GetUTF8Text()
2017-05-11 02:02:37 +02:00
Stefan Weil
1c59914b61
Use Leptonica struct names L_Compressed_Data, Pix
...
The Tesseract project prefers those names, so fix the remaining exceptions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-04-30 10:50:12 +02:00
Ray Smith
77015526fa
Jeff's fixes to pdf rendering
2017-04-28 13:38:13 -07:00
Jeff Breidenbach
9038faf436
Better escaping for PDF title; fixes #636
2017-04-02 19:01:16 +02:00
Ray Smith
ca16a08c10
Removed dead TODO
2017-01-25 15:54:11 -08:00
James R. Barlow
bf638b9202
Fix PDF syntax error: "XObject" instead of "/XObject" when textonly_pdf=false
2017-01-20 13:36:38 -08:00
Zdenko Podobný
effa5741e6
Implement invisible text only for PDF
2017-01-20 21:26:34 +01:00
Stefan Weil
78d91701bd
Simplify new operations
...
It is not necessary to check for null pointers after new.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-30 20:24:38 +01:00
Ray Smith
5913d7344f
Added missing license headers
2016-11-18 15:53:11 -08:00
Ray Smith
c1c1e426b3
Added new LSTM-based neural network line recognizer
2016-11-07 15:38:07 -08:00
Ray Smith
2c837dffc3
Result of clang tidy on recent merge
2016-11-07 10:46:33 -08:00
Zdenko Podobný
5610738be9
fix #369 - pdf output with transparent background image
2016-08-05 22:37:58 +02:00
Zdenko Podobný
66f37f0cd3
add copyright to renderer.cpp and pdfr.cpp
2016-03-18 19:43:45 +01:00
Stefan Weil
5ce88d7f49
pdfrenderer: Fix uninitialized local variables
...
Coverity bug reports:
CID 1270405: Uninitialized scalar variable
CID 1270408: Uninitialized scalar variable
CID 1270409: Uninitialized scalar variable
CID 1270410: Uninitialized scalar variable
Those variables are set conditionally in the while loop
and must keep their values in following iterations, so
they must be declared outside of the loop.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-25 22:24:06 +01:00
Stefan Weil
997c4a6078
api: Fix printing of a size_t value
...
size_t is not always the same as long, especially not for 64 bit Windows:
api/pdfrenderer.cpp:549:31: warning:
format '%ld' expects argument of type 'long int',
but argument 4 has type 'size_t {aka long long unsigned int}' [-Wformat=]
size_t normally requires a format string "%zu", but this is unsupported
by Visual Studio, so use a type cast.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-05 06:39:35 +01:00
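A minimal sketch of the cast-based workaround described in the commit above. The helper name FormatLength, the format text, and the choice of unsigned long as the cast target are assumptions made for illustration; the commit message only says that a type cast is used, and this is not the code from pdfrenderer.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from pdfrenderer.cpp): formats an object number
// and a byte length into a short string.
std::string FormatLength(long objnum, size_t len) {
  char buf[64];
  // Passing a size_t to "%ld" triggers -Wformat where size_t is not long
  // (e.g. 64-bit Windows). Instead of "%zu", cast to a type that has a
  // universally supported conversion specifier.
  // Caveat: unsigned long is 32 bits on 64-bit Windows, so this assumed
  // cast would truncate lengths above 4 GiB.
  int n = std::snprintf(buf, sizeof(buf), "%ld 0 obj %lu", objnum,
                        static_cast<unsigned long>(len));
  assert(n > 0 && static_cast<size_t>(n) < sizeof(buf));
  (void)n;  // silence unused-variable warnings when NDEBUG is defined
  return std::string(buf);
}
```

The cast keeps the format string and the argument type in agreement on every platform, at the cost of narrowing where unsigned long is smaller than size_t.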
Stefan Weil
11b2a4d9af
api: Fix typos in comments (all found by codespell)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-14 21:54:27 +02:00
Zdenko Podobný
0337d898d4
fix bug in UTF-16BE conversion
2015-08-10 21:22:20 +02:00
Zdenko Podobný
628de5ba3f
enable pdfrender with NO_CUBE_BUILD
2015-08-07 23:20:22 +02:00
Jeff Breidenbach
9dcf2c6aa8
replace CubeUtils::UTF8ToUTF32 in pdfrenderer
2015-08-07 22:18:33 +02:00
Ray Smith
a303ab9d00
Misc fixes, mostly clang formatting, but some bug fixes in matrix, werd, and tesstrain_utils. Also updates unicharset to match traineddata files.
2015-07-09 14:28:20 -07:00
Ray Smith
ab0f4e2c38
Clang fixes to earlier changes and build compatibility with Google environment
2015-06-12 10:53:21 -07:00
orbitcowboy
9328f0e5d4
Fix potential null pointer dereference in ccmain/paragraphs.cpp.
2015-05-19 10:17:44 +02:00
Zdenko Podobný
e98849b482
Print error message when pdf.ttf is not found.
2015-05-15 15:14:00 +02:00
Ray Smith
6b634170c1
Significant change to invisible font system
...
to improve correctness and compatibility with
external programs, particularly ghostscript.
We will start mapping everything to a single glyph,
rather than allowing characters to run off the end
of the font.
A more detailed design discussion is embedded into
pdfrenderer.cpp comments. The font, source code
that produces the font, and the design comments
were contributed by Ken Sharp from Artifex Software.
2015-05-12 17:33:18 -07:00
Ray Smith
d9699c4099
Fixed bidi handling in PDF output
2014-10-09 13:29:01 -07:00
Zdenko Podobný
d0cb1071b2
remove parameters tessedit_pdf_jpg_quality, tessedit_pdf_compression (reasons are in i1300 and i1285)
2014-10-07 23:37:34 +02:00
Zdenko Podobný
4904afe65b
fix issue 1300 - patch from #35
2014-10-06 22:43:56 +02:00
Zdenko Podobný
4c01561b0f
fix issue 1300 - patch from #26
2014-10-02 21:19:17 +02:00
Zdenko Podobný
f8613fab22
fix issue 1300 (patches from breidenbach)
2014-09-21 16:38:24 +02:00
Zdenko Podobný
d1aa61c110
fix issue 1285: reimplement option to select pdf compression
2014-09-06 09:32:22 +02:00
theraysmith@gmail.com
b64ad05096
Improved efficiency of image processing for PDF
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:15:25 +00:00
zdenop
bce2cd5f33
enable selecting the pdf compression type and jpeg quality (fix issue 1263)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1134 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-08 21:18:44 +00:00
zdenop
5b779456f9
fix compatibility with leptonica 1.71 and 1.70
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1126 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-24 19:11:39 +00:00
zdenop
905e6162b9
put info about (API) version; fix typo
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1117 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-22 18:31:42 +00:00
theraysmith@gmail.com
25a8c7b720
Enabled streaming input and output of multi-page documents
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop@gmail.com
2367ba1f6e
fix PDF rendering for Arabic. http://ftp.de.debian.org/debian/pool/main/t/tesseract/tesseract_3.03.02-3.diff.gz
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1055 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-21 10:11:32 +00:00
theraysmith@gmail.com
864b2f6d80
Fixed problems with selection/copy/paste in some PDF viewers
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1042 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 19:14:16 +00:00
theraysmith@gmail.com
4585a4c9df
Fixed empty page with color input
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1032 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-30 02:18:51 +00:00
theraysmith@gmail.com
0ddc7bfcaf
Fixed first-word only bug in PDF output.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1022 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-27 22:40:03 +00:00
theraysmith@gmail.com
d11dc049e3
Fixed a lot of compiler/clang warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
5b9a7e06eb
Turned on pdfrenderer functionality that needs leptonica 1.70
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1009 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-23 23:01:10 +00:00
zdenop@gmail.com
ef3b1d936e
fix mingw build issues
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@995 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-18 09:00:54 +00:00
zdenop@gmail.com
94d08567e1
fix vs2010 (and maybe vs2008) build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00