mirror of
https://github.com/tesseract-ocr/tesseract.git
synced 2024-11-23 18:49:08 +08:00
top-skimming import from sf.net
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk/trunk@2 d0cd1f9f-072b-0410-8dd7-cf729c803f20
commit
425d593ebe
6
.cvsignore
Normal file
@@ -0,0 +1,6 @@
BUILD
OWNERS
Makefile
README.google
runautoconf
config_auto.h
8
AUTHORS
Normal file
@@ -0,0 +1,8 @@
Ray Smith (lead developer) <theraysmith@users.sourceforge.net>
Phil Cheatle
Simon Crouch
Dan Johnson
Mark Seaman
Sheelagh Huddleston
Chris Newton
... and several others.
23
COPYING
Normal file
@@ -0,0 +1,23 @@
This package contains the Tesseract Open Source OCR Engine.
Originally developed at Hewlett Packard Laboratories Bristol and
at Hewlett Packard Co, Greeley Colorado, all the code
in this distribution is now licensed under the Apache License:

** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
** http://www.apache.org/licenses/LICENSE-2.0
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.


Other Dependencies and Licenses:
================================
The Aspirin/MIGRAINES system is no longer used.

Tesseract can also make use of the libtiff library. (www.libtiff.org)
Without libtiff, Tesseract can only read uncompressed and G3 compressed
TIFF files.
20
ChangeLog
Normal file
@@ -0,0 +1,20 @@
June 2006 - V1.0 of open source Tesseract checked-in.
Sep 7 2006 - V1.01.
    Added mfcpch.cpp and getopt.cpp for VC++.
    Fixed problem with greyscale images and no libtiff.
    Stopped debug window from being used for the usage output.
    Fixed load of inttemp for big-endian architectures.
    Fixed some Mac compilation issues.
Oct 4 2006 - V1.02
    Removed dependency on Aspirin.
    Fixed a few missing Apache license headers.
    Removed $log.
Feb 2 2007 - V1.03
    Added mftraining and cntraining.
    Added baseapi with adaptive thresholding for grey and color.
    Fixed many memory leaks.
    Fixed several bugs including lack of use of adaptive classifier.
    Added ifdefs to eliminate graphics code and add embedded platform support.
    Incorporated several patches, including 64-bit builds, Mac builds.
    Minor accuracy improvements.
229
INSTALL
Normal file
@@ -0,0 +1,229 @@
Copyright 1994, 1995, 1996, 1999, 2000, 2001, 2002 Free Software
Foundation, Inc.

This file is free documentation; the Free Software Foundation gives
unlimited permission to copy, distribute and modify it.

Basic Installation
==================

These are generic installation instructions.

The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation. It uses
those values to create a `Makefile' in each directory of the package.
It may also create one or more `.h' files containing system-dependent
definitions. Finally, it creates a shell script `config.status' that
you can run in the future to recreate the current configuration, and a
file `config.log' containing compiler output (useful mainly for
debugging `configure').

It can also use an optional file (typically called `config.cache'
and enabled with `--cache-file=config.cache' or simply `-C') that saves
the results of its tests to speed up reconfiguring. (Caching is
disabled by default to prevent problems with accidental use of stale
cache files.)

If you need to do unusual things to compile the package, please try
to figure out how `configure' could check whether to do them, and mail
diffs or instructions to the address given in the `README' so they can
be considered for the next release. If you are using the cache, and at
some point `config.cache' contains results you don't want to keep, you
may remove or edit it.

The file `configure.ac' (or `configure.in') is used to create
`configure' by a program called `autoconf'. You only need
`configure.ac' if you want to change it or regenerate `configure' using
a newer version of `autoconf'.

The simplest way to compile this package is:

  1. `cd' to the directory containing the package's source code and type
     `./configure' to configure the package for your system. If you're
     using `csh' on an old version of System V, you might need to type
     `sh ./configure' instead to prevent `csh' from trying to execute
     `configure' itself.

     Running `configure' takes awhile. While running, it prints some
     messages telling which features it is checking for.

  2. Type `make' to compile the package.

  3. Optionally, type `make check' to run any self-tests that come with
     the package.

  4. Type `make install' to install the programs and any data files and
     documentation.

  5. You can remove the program binaries and object files from the
     source code directory by typing `make clean'. To also remove the
     files that `configure' created (so you can compile the package for
     a different kind of computer), type `make distclean'. There is
     also a `make maintainer-clean' target, but that is intended mainly
     for the package's developers. If you use it, you may have to get
     all sorts of other programs in order to regenerate files that came
     with the distribution.

Compilers and Options
=====================

Some systems require unusual options for compilation or linking that
the `configure' script does not know about. Run `./configure --help'
for details on some of the pertinent environment variables.

You can give `configure' initial values for configuration parameters
by setting variables in the command line or in the environment. Here
is an example:

     ./configure CC=c89 CFLAGS=-O2 LIBS=-lposix

*Note Defining Variables::, for more details.

Compiling For Multiple Architectures
====================================

You can compile the package for more than one kind of computer at the
same time, by placing the object files for each architecture in their
own directory. To do this, you must use a version of `make' that
supports the `VPATH' variable, such as GNU `make'. `cd' to the
directory where you want the object files and executables to go and run
the `configure' script. `configure' automatically checks for the
source code in the directory that `configure' is in and in `..'.

If you have to use a `make' that does not support the `VPATH'
variable, you have to compile the package for one architecture at a
time in the source code directory. After you have installed the
package for one architecture, use `make distclean' before reconfiguring
for another architecture.
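
The per-architecture layout described above can be sketched with a toy stand-in for `configure'. The script written below is hypothetical and is not Tesseract's real `configure'; it only records the source directory it was run from, which is enough to show that each build directory gets its own `Makefile' while the source tree stays untouched. The directory names `build-a' and `build-b' are assumptions made for illustration.

```shell
# Sketch only: a toy stand-in for `configure', NOT the real script.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/build-a" "$work/build-b"

# Hypothetical minimal "configure": it records where the sources live.
cat > "$work/src/configure" <<'EOF'
srcdir=$(dirname "$0")
printf 'srcdir = %s\nVPATH = %s\n' "$srcdir" "$srcdir" > Makefile
EOF

# One build directory per architecture; each run writes its own Makefile.
(cd "$work/build-a" && sh ../src/configure)
(cd "$work/build-b" && sh ../src/configure)

first_line=$(head -n 1 "$work/build-a/Makefile")
if [ -f "$work/src/Makefile" ]; then src_clean=no; else src_clean=yes; fi
echo "$first_line (source tree untouched: $src_clean)"
rm -rf "$work"
```

With a real package the same pattern applies: create an empty directory per architecture and run the package's own `configure' from inside it.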

Installation Names
==================

By default, `make install' will install the package's files in
`/usr/local/bin', `/usr/local/man', etc. You can specify an
installation prefix other than `/usr/local' by giving `configure' the
option `--prefix=PATH'.

You can specify separate installation prefixes for
architecture-specific files and architecture-independent files. If you
give `configure' the option `--exec-prefix=PATH', the package will use
PATH as the prefix for installing programs and libraries.
Documentation and other data files will still use the regular prefix.

In addition, if you use an unusual directory layout you can give
options like `--bindir=PATH' to specify different values for particular
kinds of files. Run `configure --help' for a list of the directories
you can set and what kinds of files go in them.

If the package supports it, you can cause programs to be installed
with an extra prefix or suffix on their names by giving `configure' the
option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'.
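
As a rough illustration of how the two prefixes relate, the sketch below derives a few installation directories the way the text describes: programs and libraries follow the exec prefix, while man pages and data follow the regular prefix. The function name `resolve_dirs' and the `/opt/tess' paths are hypothetical; the real `configure' computes many more directories and lets each one be overridden.

```shell
# resolve_dirs PREFIX [EXEC_PREFIX]
# Sketch only: mirrors the defaults described in the text.
resolve_dirs() {
    prefix=${1:-/usr/local}
    exec_prefix=${2:-$prefix}    # exec prefix falls back to the prefix
    bindir=$exec_prefix/bin      # architecture-specific files
    libdir=$exec_prefix/lib
    mandir=$prefix/man           # architecture-independent files
    datadir=$prefix/share
}

resolve_dirs /opt/tess /opt/tess/linux-x86
echo "programs: $bindir"
echo "man pages: $mandir"
```

Calling `resolve_dirs' with no arguments yields the `/usr/local/bin' and `/usr/local/man' defaults mentioned above.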

Optional Features
=================

Some packages pay attention to `--enable-FEATURE' options to
`configure', where FEATURE indicates an optional part of the package.
They may also pay attention to `--with-PACKAGE' options, where PACKAGE
is something like `gnu-as' or `x' (for the X Window System). The
`README' should mention any `--enable-' and `--with-' options that the
package recognizes.

For packages that use the X Window System, `configure' can usually
find the X include and library files automatically, but if it doesn't,
you can use the `configure' options `--x-includes=DIR' and
`--x-libraries=DIR' to specify their locations.

Specifying the System Type
==========================

There may be some features `configure' cannot figure out
automatically, but needs to determine by the type of machine the package
will run on. Usually, assuming the package is built to be run on the
_same_ architectures, `configure' can figure that out, but if it prints
a message saying it cannot guess the machine type, give it the
`--build=TYPE' option. TYPE can either be a short name for the system
type, such as `sun4', or a canonical name which has the form:

     CPU-COMPANY-SYSTEM

where SYSTEM can have one of these forms:

     OS KERNEL-OS

See the file `config.sub' for the possible values of each field. If
`config.sub' isn't included in this package, then this package doesn't
need to know the machine type.

If you are _building_ compiler tools for cross-compiling, you should
use the `--target=TYPE' option to select the type of system they will
produce code for.

If you want to _use_ a cross compiler, that generates code for a
platform different from the build platform, you should specify the
"host" platform (i.e., that on which the generated programs will
eventually be run) with `--host=TYPE'.
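
To make the canonical form concrete, the sketch below splits a name into its CPU, COMPANY and SYSTEM fields using only shell parameter expansion. The function name `split_triplet' and the sample names are assumptions for illustration; real validation and aliasing is the job of `config.sub', which this sketch does not attempt to reproduce.

```shell
# split_triplet CPU-COMPANY-SYSTEM
# Sketch only: SYSTEM keeps any further hyphens, so both the OS and the
# KERNEL-OS forms end up in $system.
split_triplet() {
    triplet=$1
    cpu=${triplet%%-*}       # text before the first hyphen
    rest=${triplet#*-}       # text after the first hyphen
    company=${rest%%-*}
    system=${rest#*-}
}

split_triplet i686-pc-linux-gnu
echo "cpu=$cpu company=$company system=$system"
```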

Sharing Defaults
================

If you want to set default values for `configure' scripts to share,
you can create a site shell script called `config.site' that gives
default values for variables like `CC', `cache_file', and `prefix'.
`configure' looks for `PREFIX/share/config.site' if it exists, then
`PREFIX/etc/config.site' if it exists. Or, you can set the
`CONFIG_SITE' environment variable to the location of the site script.
A warning: not all `configure' scripts look for a site script.
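
The lookup order described above can be sketched as a small shell function. This is an illustration of the rule as stated in the text, not the code that `configure' actually runs; the function name `site_files' and the temporary directory are hypothetical.

```shell
# site_files PREFIX
# Sketch only: print the site scripts that would be consulted, in order.
site_files() {
    prefix=$1
    if [ -n "${CONFIG_SITE:-}" ]; then
        echo "$CONFIG_SITE"          # an explicit setting wins
        return 0
    fi
    for f in "$prefix/share/config.site" "$prefix/etc/config.site"; do
        if [ -r "$f" ]; then echo "$f"; fi
    done
}

demo=$(mktemp -d)
mkdir -p "$demo/share" "$demo/etc"
echo 'CC=gcc' > "$demo/etc/config.site"   # only the etc/ script exists

unset CONFIG_SITE
found_default=$(site_files "$demo")
found_override=$(CONFIG_SITE=/tmp/my.site; site_files "$demo")
echo "default:  $found_default"
echo "override: $found_override"
rm -rf "$demo"
```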

Defining Variables
==================

Variables not defined in a site shell script can be set in the
environment passed to `configure'. However, some packages may run
configure again during the build, and the customized values of these
variables may be lost. In order to avoid this problem, you should set
them in the `configure' command line, using `VAR=value'. For example:

     ./configure CC=/usr/local2/bin/gcc

will cause the specified gcc to be used as the C compiler (unless it is
overridden in the site shell script).
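
A minimal sketch of how `VAR=value' words on a command line can be told apart from ordinary options is shown below. It is an assumption-laden illustration, not the parser inside `configure': the function name `parse_assignments' is hypothetical, and the pattern is deliberately simple.

```shell
# parse_assignments ARG...
# Sketch only: report arguments of the form NAME=value, skip the rest.
parse_assignments() {
    for arg in "$@"; do
        case $arg in
            [A-Za-z_]*=*)
                name=${arg%%=*}
                value=${arg#*=}
                echo "$name -> $value"
                ;;
            *) ;;   # options such as --prefix=... start with `-'
        esac
    done
}

assignments=$(parse_assignments CC=/usr/local2/bin/gcc --prefix=/opt CFLAGS=-O2)
echo "$assignments"
```

Because the values travel on the command line, a later re-run that replays the same arguments sees them again, which is the point the text makes about preferring this form over the environment.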

`configure' Invocation
======================

`configure' recognizes the following options to control how it
operates.

`--help'
`-h'
     Print a summary of the options to `configure', and exit.

`--version'
`-V'
     Print the version of Autoconf used to generate the `configure'
     script, and exit.

`--cache-file=FILE'
     Enable the cache: use and save the results of the tests in FILE,
     traditionally `config.cache'. FILE defaults to `/dev/null' to
     disable caching.

`--config-cache'
`-C'
     Alias for `--cache-file=config.cache'.

`--quiet'
`--silent'
`-q'
     Do not print messages saying which checks are being made. To
     suppress all normal output, redirect it to `/dev/null' (any error
     messages will still be shown).

`--srcdir=DIR'
     Look for the package's source code in directory DIR. Usually
     `configure' can determine that directory automatically.

`configure' also accepts some other, not widely useful, options. Run
`configure --help' for more details.
15
Makefile.am
Normal file
@@ -0,0 +1,15 @@
# TODO(luc) Add 'doc' to this list when ready
SUBDIRS = ccstruct ccutil classify cutil dict display image textord viewer wordrec ccmain training

EXTRA_DIST = tessdata phototest.tif tesseract.dsp tesseract.dsw
#EXTRA_DIST = doc/html doc/@PACKAGE_NAME@_@PACKAGE_VERSION@.pdf doc/@PACKAGE_NAME@_@PACKAGE_VERSION@.ps.gz

dist-hook:
# Need to remove CVS directories from directories
# added using EXTRA_DIST. $(distdir)/tessdata would in
# theory suffice.
	rm -rf `find $(distdir) -name CVS`
# Also remove extra files not needed in a distribution
	rm -rf `find $(distdir) -name configure.ac`
	rm -rf `find $(distdir) -name acinclude.m4`
	rm -rf `find $(distdir) -name aclocal.m4`
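
The pruning done by the dist-hook above can be tried on a throwaway tree. The sketch below builds a small hypothetical distribution directory (the file name `sample.dat' is invented for the demo), applies the same `rm -rf' over `find' pattern, and lists what survives. It assumes paths without whitespace, as the Makefile rule itself does.

```shell
# Sketch only: exercise the dist-hook pruning pattern on a temp tree.
distdir=$(mktemp -d)
mkdir -p "$distdir/tessdata/CVS" "$distdir/ccmain/CVS"
touch "$distdir/tessdata/CVS/Entries" \
      "$distdir/tessdata/sample.dat" \
      "$distdir/configure.ac"

# Same commands as the hook, with $(distdir) replaced by a shell variable.
rm -rf `find "$distdir" -name CVS`
rm -rf `find "$distdir" -name configure.ac`

remaining=$(cd "$distdir" && find . -type f | sort)
echo "$remaining"
rm -rf "$distdir"
```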
628
Makefile.in
Normal file
@@ -0,0 +1,628 @@
# Makefile.in generated by automake 1.9.6 from Makefile.am.
# @configure_input@

# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004, 2005 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.

@SET_MAKE@
srcdir = @srcdir@
top_srcdir = @top_srcdir@
VPATH = @srcdir@
pkgdatadir = $(datadir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
top_builddir = .
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = @INSTALL@
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
build_triplet = @build@
host_triplet = @host@
DIST_COMMON = README $(am__configure_deps) $(srcdir)/Makefile.am \
	$(srcdir)/Makefile.in $(top_srcdir)/config/config.h.in \
	$(top_srcdir)/configure AUTHORS COPYING ChangeLog INSTALL NEWS \
	config/config.guess config/config.sub config/depcomp \
	config/install-sh config/missing config/mkinstalldirs
subdir = .
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
	$(top_srcdir)/config/ac_define_versionlevel.m4 \
	$(top_srcdir)/config/acinclude_custom.m4 \
	$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
	$(ACLOCAL_M4)
am__CONFIG_DISTCLEAN_FILES = config.status config.cache config.log \
 configure.lineno configure.status.lineno
mkinstalldirs = $(SHELL) $(top_srcdir)/config/mkinstalldirs
CONFIG_HEADER = config_auto.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
	html-recursive info-recursive install-data-recursive \
	install-exec-recursive install-info-recursive \
	install-recursive installcheck-recursive installdirs-recursive \
	pdf-recursive ps-recursive uninstall-info-recursive \
	uninstall-recursive
ETAGS = etags
CTAGS = ctags
DIST_SUBDIRS = $(SUBDIRS)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
distdir = $(PACKAGE)-$(VERSION)
top_distdir = $(distdir)
am__remove_distdir = \
  { test ! -d $(distdir) \
    || { find $(distdir) -type d ! -perm -200 -exec chmod u+w {} ';' \
         && rm -fr $(distdir); }; }
DIST_ARCHIVES = $(distdir).tar.gz
GZIP_ENV = --best
distuninstallcheck_listfiles = find . -type f -print
distcleancheck_listfiles = find . -type f -print
ACLOCAL = @ACLOCAL@
AMDEP_FALSE = @AMDEP_FALSE@
AMDEP_TRUE = @AMDEP_TRUE@
AMTAR = @AMTAR@
AUTOCONF = @AUTOCONF@
AUTOHEADER = @AUTOHEADER@
AUTOMAKE = @AUTOMAKE@
AWK = @AWK@
CC = @CC@
CCDEPMODE = @CCDEPMODE@
CFLAGS = @CFLAGS@
CPPFLAGS = @CPPFLAGS@
CXX = @CXX@
CXXCPP = @CXXCPP@
CXXDEPMODE = @CXXDEPMODE@
CXXFLAGS = @CXXFLAGS@
CXXRPOFLAGS = @CXXRPOFLAGS@
CYGPATH_W = @CYGPATH_W@
DEFS = @DEFS@
DEPDIR = @DEPDIR@
ECHO_C = @ECHO_C@
ECHO_N = @ECHO_N@
ECHO_T = @ECHO_T@
EGREP = @EGREP@
EXEEXT = @EXEEXT@
GNUWIN32_DIR = @GNUWIN32_DIR@
HAVE_GNUWIN32_FALSE = @HAVE_GNUWIN32_FALSE@
HAVE_GNUWIN32_TRUE = @HAVE_GNUWIN32_TRUE@
HAVE_LIBTIFF_FALSE = @HAVE_LIBTIFF_FALSE@
HAVE_LIBTIFF_TRUE = @HAVE_LIBTIFF_TRUE@
INSTALL_DATA = @INSTALL_DATA@
INSTALL_PROGRAM = @INSTALL_PROGRAM@
INSTALL_SCRIPT = @INSTALL_SCRIPT@
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
LDFLAGS = @LDFLAGS@
LIBOBJS = @LIBOBJS@
LIBS = @LIBS@
LIBTIFF_CFLAGS = @LIBTIFF_CFLAGS@
LIBTIFF_LIBS = @LIBTIFF_LIBS@
LTLIBOBJS = @LTLIBOBJS@
MAINT = @MAINT@
MAINTAINER_MODE_FALSE = @MAINTAINER_MODE_FALSE@
MAINTAINER_MODE_TRUE = @MAINTAINER_MODE_TRUE@
MAKEINFO = @MAKEINFO@
OBJEXT = @OBJEXT@
OPTS = @OPTS@
PACKAGE = @PACKAGE@
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
PACKAGE_DATE = @PACKAGE_DATE@
PACKAGE_NAME = @PACKAGE_NAME@
PACKAGE_STRING = @PACKAGE_STRING@
PACKAGE_TARNAME = @PACKAGE_TARNAME@
PACKAGE_VERSION = @PACKAGE_VERSION@
PACKAGE_YEAR = @PACKAGE_YEAR@
PATH_SEPARATOR = @PATH_SEPARATOR@
RANLIB = @RANLIB@
RPO_NO = @RPO_NO@
RPO_YES = @RPO_YES@
SET_MAKE = @SET_MAKE@
SHELL = @SHELL@
STRIP = @STRIP@
USING_CL_FALSE = @USING_CL_FALSE@
USING_CL_TRUE = @USING_CL_TRUE@
VERSION = @VERSION@
ac_ct_CC = @ac_ct_CC@
ac_ct_CXX = @ac_ct_CXX@
ac_ct_RANLIB = @ac_ct_RANLIB@
ac_ct_STRIP = @ac_ct_STRIP@
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
am__fastdepCXX_FALSE = @am__fastdepCXX_FALSE@
am__fastdepCXX_TRUE = @am__fastdepCXX_TRUE@
am__include = @am__include@
am__leading_dot = @am__leading_dot@
am__quote = @am__quote@
am__tar = @am__tar@
am__untar = @am__untar@
bindir = @bindir@
build = @build@
build_alias = @build_alias@
build_cpu = @build_cpu@
build_os = @build_os@
build_vendor = @build_vendor@
datadir = @datadir@
exec_prefix = @exec_prefix@
host = @host@
host_alias = @host_alias@
host_cpu = @host_cpu@
host_os = @host_os@
host_vendor = @host_vendor@
includedir = @includedir@
infodir = @infodir@
install_sh = @install_sh@
libdir = @libdir@
libexecdir = @libexecdir@
localstatedir = @localstatedir@
mandir = @mandir@
mkdir_p = @mkdir_p@
oldincludedir = @oldincludedir@
prefix = @prefix@
program_transform_name = @program_transform_name@
sbindir = @sbindir@
sharedstatedir = @sharedstatedir@
sysconfdir = @sysconfdir@
target_alias = @target_alias@

# TODO(luc) Add 'doc' to this list when ready
SUBDIRS = ccstruct ccutil classify cutil dict display image textord viewer wordrec ccmain training
EXTRA_DIST = tessdata phototest.tif tesseract.dsp tesseract.dsw
all: config_auto.h
	$(MAKE) $(AM_MAKEFLAGS) all-recursive

.SUFFIXES:
am--refresh:
	@:
$(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am $(am__configure_deps)
	@for dep in $?; do \
	  case '$(am__configure_deps)' in \
	    *$$dep*) \
	      echo ' cd $(srcdir) && $(AUTOMAKE) --gnu '; \
	      cd $(srcdir) && $(AUTOMAKE) --gnu \
		&& exit 0; \
	      exit 1;; \
	  esac; \
	done; \
	echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu Makefile'; \
	cd $(top_srcdir) && \
	  $(AUTOMAKE) --gnu Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
	@case '$?' in \
	  *config.status*) \
	    echo ' $(SHELL) ./config.status'; \
	    $(SHELL) ./config.status;; \
	  *) \
	    echo ' cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__depfiles_maybe)'; \
	    cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__depfiles_maybe);; \
	esac;

$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
	$(SHELL) ./config.status --recheck

$(top_srcdir)/configure: @MAINTAINER_MODE_TRUE@ $(am__configure_deps)
	cd $(srcdir) && $(AUTOCONF)
$(ACLOCAL_M4): @MAINTAINER_MODE_TRUE@ $(am__aclocal_m4_deps)
	cd $(srcdir) && $(ACLOCAL) $(ACLOCAL_AMFLAGS)

config_auto.h: stamp-h1
	@if test ! -f $@; then \
	  rm -f stamp-h1; \
	  $(MAKE) stamp-h1; \
	else :; fi

stamp-h1: $(top_srcdir)/config/config.h.in $(top_builddir)/config.status
	@rm -f stamp-h1
	cd $(top_builddir) && $(SHELL) ./config.status config_auto.h
$(top_srcdir)/config/config.h.in: @MAINTAINER_MODE_TRUE@ $(am__configure_deps)
	cd $(top_srcdir) && $(AUTOHEADER)
	rm -f stamp-h1
	touch $@

distclean-hdr:
	-rm -f config_auto.h stamp-h1
uninstall-info-am:

# This directory's subdirectories are mostly independent; you can cd
# into them and run `make' without going through this Makefile.
# To change the values of `make' variables: instead of editing Makefiles,
# (1) if the variable is set in `config.status', edit `config.status'
#     (which will cause the Makefiles to be regenerated when you run `make');
# (2) otherwise, pass the desired values on the `make' command line.
$(RECURSIVE_TARGETS):
	@failcom='exit 1'; \
	for f in x $$MAKEFLAGS; do \
	  case $$f in \
	    *=* | --[!k]*);; \
	    *k*) failcom='fail=yes';; \
	  esac; \
	done; \
	dot_seen=no; \
	target=`echo $@ | sed s/-recursive//`; \
	list='$(SUBDIRS)'; for subdir in $$list; do \
	  echo "Making $$target in $$subdir"; \
	  if test "$$subdir" = "."; then \
	    dot_seen=yes; \
	    local_target="$$target-am"; \
	  else \
	    local_target="$$target"; \
	  fi; \
	  (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
	  || eval $$failcom; \
	done; \
	if test "$$dot_seen" = "no"; then \
	  $(MAKE) $(AM_MAKEFLAGS) "$$target-am" || exit 1; \
	fi; test -z "$$fail"
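
The first loop of the rule above decides what happens when a subdirectory build fails: stop at once, or record the failure and carry on when `make' was asked to keep going. A standalone sketch of that decision follows. The function name `pick_failcom' is hypothetical, and the sample flag strings are assumptions about what `MAKEFLAGS' may contain.

```shell
# pick_failcom "MAKEFLAGS-STRING"
# Sketch only: same case patterns as the rule, outside of make.
pick_failcom() {
    failcom='exit 1'
    # $1 is left unquoted on purpose so it splits into words.
    for f in x $1; do
        case $f in
            *=* | --[!k]*) ;;             # assignments and long options
            *k*) failcom='fail=yes' ;;    # a flag word containing `k'
        esac
    done
    echo "$failcom"
}

stop_mode=$(pick_failcom '')
keep_mode=$(pick_failcom 'k')
noise_mode=$(pick_failcom '--no-print-directory CC=kcc')
echo "$stop_mode / $keep_mode / $noise_mode"
```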

mostlyclean-recursive clean-recursive distclean-recursive \
  maintainer-clean-recursive:
	@failcom='exit 1'; \
	for f in x $$MAKEFLAGS; do \
	  case $$f in \
	    *=* | --[!k]*);; \
	    *k*) failcom='fail=yes';; \
	  esac; \
	done; \
	dot_seen=no; \
	case "$@" in \
	  distclean-* | maintainer-clean-*) list='$(DIST_SUBDIRS)' ;; \
	  *) list='$(SUBDIRS)' ;; \
	esac; \
	rev=''; for subdir in $$list; do \
	  if test "$$subdir" = "."; then :; else \
	    rev="$$subdir $$rev"; \
	  fi; \
	done; \
	rev="$$rev ."; \
	target=`echo $@ | sed s/-recursive//`; \
	for subdir in $$rev; do \
	  echo "Making $$target in $$subdir"; \
	  if test "$$subdir" = "."; then \
	    local_target="$$target-am"; \
	  else \
	    local_target="$$target"; \
	  fi; \
	  (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
	  || eval $$failcom; \
	done && test -z "$$fail"
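
The clean rules above walk the subdirectories in reverse order and visit the current directory last. The ordering step on its own can be sketched as follows; the function name `reverse_subdirs' is hypothetical, and the three directory names are taken from the SUBDIRS list only as sample input.

```shell
# reverse_subdirs DIR...
# Sketch only: prepend each name so the list comes out reversed,
# skip "." while looping, then append it at the end.
reverse_subdirs() {
    rev=''
    for subdir in "$@"; do
        if [ "$subdir" = "." ]; then :; else
            rev="$subdir $rev"
        fi
    done
    # Unquoted on purpose: word splitting collapses the spare blank.
    echo $rev .
}

clean_order=$(reverse_subdirs ccstruct ccutil classify)
echo "$clean_order"
```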
tags-recursive:
	list='$(SUBDIRS)'; for subdir in $$list; do \
	  test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) tags); \
	done
ctags-recursive:
	list='$(SUBDIRS)'; for subdir in $$list; do \
	  test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) ctags); \
	done

ID: $(HEADERS) $(SOURCES) $(LISP) $(TAGS_FILES)
	list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
	unique=`for i in $$list; do \
	    if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
	  done | \
	  $(AWK) '    { files[$$0] = 1; } \
	       END { for (i in files) print i; }'`; \
	mkid -fID $$unique
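
The awk fragment used by the ID rule above removes duplicate file names by storing each line as an array key. A standalone sketch follows; the function name `unique_lines' and the sample file names are hypothetical. Since awk does not promise any order for `for (i in files)', the sketch sorts the result before comparing it.

```shell
# unique_lines: read lines on stdin, print each distinct line once.
# Sketch only: same idiom as the rule, with $$0 written as $0 because
# this runs in a plain shell rather than inside a Makefile.
unique_lines() {
    awk '{ files[$0] = 1 } END { for (i in files) print i }'
}

deduped=$(printf '%s\n' main.cpp util.h main.cpp | unique_lines | sort)
echo "$deduped"
```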
tags: TAGS

TAGS: tags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
		$(TAGS_FILES) $(LISP)
	tags=; \
	here=`pwd`; \
	if ($(ETAGS) --etags-include --version) >/dev/null 2>&1; then \
	  include_option=--etags-include; \
	  empty_fix=.; \
	else \
	  include_option=--include; \
	  empty_fix=; \
	fi; \
	list='$(SUBDIRS)'; for subdir in $$list; do \
	  if test "$$subdir" = .; then :; else \
	    test ! -f $$subdir/TAGS || \
	      tags="$$tags $$include_option=$$here/$$subdir/TAGS"; \
	  fi; \
	done; \
	list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
	unique=`for i in $$list; do \
	    if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
	  done | \
	  $(AWK) '    { files[$$0] = 1; } \
	       END { for (i in files) print i; }'`; \
	if test -z "$(ETAGS_ARGS)$$tags$$unique"; then :; else \
	  test -n "$$unique" || unique=$$empty_fix; \
	  $(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
	    $$tags $$unique; \
	fi
ctags: CTAGS
CTAGS: ctags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
		$(TAGS_FILES) $(LISP)
	tags=; \
	here=`pwd`; \
	list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
	unique=`for i in $$list; do \
	    if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
	  done | \
	  $(AWK) '    { files[$$0] = 1; } \
	       END { for (i in files) print i; }'`; \
	test -z "$(CTAGS_ARGS)$$tags$$unique" \
	  || $(CTAGS) $(CTAGSFLAGS) $(AM_CTAGSFLAGS) $(CTAGS_ARGS) \
	     $$tags $$unique

GTAGS:
	here=`$(am__cd) $(top_builddir) && pwd` \
	  && cd $(top_srcdir) \
	  && gtags -i $(GTAGS_ARGS) $$here

distclean-tags:
	-rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags

distdir: $(DISTFILES)
	$(am__remove_distdir)
	mkdir $(distdir)
	$(mkdir_p) $(distdir)/config
	@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
	topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
	list='$(DISTFILES)'; for file in $$list; do \
	  case $$file in \
	    $(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
	    $(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
	  esac; \
	  if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
	  dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
	  if test "$$dir" != "$$file" && test "$$dir" != "."; then \
	    dir="/$$dir"; \
	    $(mkdir_p) "$(distdir)$$dir"; \
	  else \
	    dir=''; \
	  fi; \
	  if test -d $$d/$$file; then \
	    if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
	      cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
	    fi; \
	    cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
	  else \
	    test -f $(distdir)/$$file \
	    || cp -p $$d/$$file $(distdir)/$$file \
	    || exit 1; \
	  fi; \
	done
	list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
	  if test "$$subdir" = .; then :; else \
	    test -d "$(distdir)/$$subdir" \
	    || $(mkdir_p) "$(distdir)/$$subdir" \
	    || exit 1; \
	    distdir=`$(am__cd) $(distdir) && pwd`; \
	    top_distdir=`$(am__cd) $(top_distdir) && pwd`; \
	    (cd $$subdir && \
	      $(MAKE) $(AM_MAKEFLAGS) \
	        top_distdir="$$top_distdir" \
	        distdir="$$distdir/$$subdir" \
	        distdir) \
	      || exit 1; \
	  fi; \
	done
	$(MAKE) $(AM_MAKEFLAGS) \
	  top_distdir="$(top_distdir)" distdir="$(distdir)" \
	  dist-hook
	-find $(distdir) -type d ! -perm -777 -exec chmod a+rwx {} \; -o \
	  ! -type d ! -perm -444 -links 1 -exec chmod a+r {} \; -o \
	  ! -type d ! -perm -400 -exec chmod a+r {} \; -o \
	  ! -type d ! -perm -444 -exec $(SHELL) $(install_sh) -c -m a+r {} {} \; \
	|| chmod -R a+r $(distdir)
dist-gzip: distdir
	tardir=$(distdir) && $(am__tar) | GZIP=$(GZIP_ENV) gzip -c >$(distdir).tar.gz
	$(am__remove_distdir)

dist-bzip2: distdir
	tardir=$(distdir) && $(am__tar) | bzip2 -9 -c >$(distdir).tar.bz2
	$(am__remove_distdir)

dist-tarZ: distdir
	tardir=$(distdir) && $(am__tar) | compress -c >$(distdir).tar.Z
	$(am__remove_distdir)

dist-shar: distdir
	shar $(distdir) | GZIP=$(GZIP_ENV) gzip -c >$(distdir).shar.gz
	$(am__remove_distdir)

dist-zip: distdir
	-rm -f $(distdir).zip
	zip -rq $(distdir).zip $(distdir)
	$(am__remove_distdir)

dist dist-all: distdir
	tardir=$(distdir) && $(am__tar) | GZIP=$(GZIP_ENV) gzip -c >$(distdir).tar.gz
	$(am__remove_distdir)
|
||||
|
||||
# This target untars the dist file and tries a VPATH configuration. Then
|
||||
# it guarantees that the distribution is self-contained by making another
|
||||
# tarfile.
|
||||
distcheck: dist
|
||||
case '$(DIST_ARCHIVES)' in \
|
||||
*.tar.gz*) \
|
||||
GZIP=$(GZIP_ENV) gunzip -c $(distdir).tar.gz | $(am__untar) ;;\
|
||||
*.tar.bz2*) \
|
||||
bunzip2 -c $(distdir).tar.bz2 | $(am__untar) ;;\
|
||||
*.tar.Z*) \
|
||||
uncompress -c $(distdir).tar.Z | $(am__untar) ;;\
|
||||
*.shar.gz*) \
|
||||
GZIP=$(GZIP_ENV) gunzip -c $(distdir).shar.gz | unshar ;;\
|
||||
*.zip*) \
|
||||
unzip $(distdir).zip ;;\
|
||||
esac
|
||||
chmod -R a-w $(distdir); chmod a+w $(distdir)
|
||||
mkdir $(distdir)/_build
|
||||
mkdir $(distdir)/_inst
|
||||
chmod a-w $(distdir)
|
||||
dc_install_base=`$(am__cd) $(distdir)/_inst && pwd | sed -e 's,^[^:\\/]:[\\/],/,'` \
|
||||
&& dc_destdir="$${TMPDIR-/tmp}/am-dc-$$$$/" \
|
||||
&& cd $(distdir)/_build \
|
||||
&& ../configure --srcdir=.. --prefix="$$dc_install_base" \
|
||||
$(DISTCHECK_CONFIGURE_FLAGS) \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) dvi \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) check \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) install \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) installcheck \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) uninstall \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) distuninstallcheck_dir="$$dc_install_base" \
|
||||
distuninstallcheck \
|
||||
&& chmod -R a-w "$$dc_install_base" \
|
||||
&& ({ \
|
||||
(cd ../.. && umask 077 && mkdir "$$dc_destdir") \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) DESTDIR="$$dc_destdir" install \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) DESTDIR="$$dc_destdir" uninstall \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) DESTDIR="$$dc_destdir" \
|
||||
distuninstallcheck_dir="$$dc_destdir" distuninstallcheck; \
|
||||
} || { rm -rf "$$dc_destdir"; exit 1; }) \
|
||||
&& rm -rf "$$dc_destdir" \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) dist \
|
||||
&& rm -rf $(DIST_ARCHIVES) \
|
||||
&& $(MAKE) $(AM_MAKEFLAGS) distcleancheck
|
||||
$(am__remove_distdir)
|
||||
@(echo "$(distdir) archives ready for distribution: "; \
|
||||
list='$(DIST_ARCHIVES)'; for i in $$list; do echo $$i; done) | \
|
||||
sed -e '1{h;s/./=/g;p;x;}' -e '$${p;x;}'
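The last command of the distcheck recipe frames its message with a rule of `=` characters by using sed's hold space. The sketch below runs the same sed script outside make (so the `$$` in the recipe is written as a single `$`) on sample input. The package and archive names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: the sed script from distcheck, run on made-up sample input.
banner() {
    # Line 1: save it in the hold space (h), turn every character into '='
    # and print that rule (s, p), then swap the saved title back in (x) so
    # the automatic print emits it. Last line: print it (p), then swap in
    # the saved rule (x), which the automatic print emits as the closing line.
    sed -e '1{h;s/./=/g;p;x;}' -e '${p;x;}'
}

out=$(printf '%s\n' "pkg-1.03 archives ready for distribution: " \
                    "pkg-1.03.tar.gz" | banner)
printf '%s\n' "$out"
```

The opening and closing rules always match the width of the first line, so the banner adapts to whatever `$(distdir)` expands to.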
|
||||
distuninstallcheck:
|
||||
@cd $(distuninstallcheck_dir) \
|
||||
&& test `$(distuninstallcheck_listfiles) | wc -l` -le 1 \
|
||||
|| { echo "ERROR: files left after uninstall:" ; \
|
||||
if test -n "$(DESTDIR)"; then \
|
||||
echo " (check DESTDIR support)"; \
|
||||
fi ; \
|
||||
$(distuninstallcheck_listfiles) ; \
|
||||
exit 1; } >&2
|
||||
distcleancheck: distclean
|
||||
@if test '$(srcdir)' = . ; then \
|
||||
echo "ERROR: distcleancheck can only run from a VPATH build" ; \
|
||||
exit 1 ; \
|
||||
fi
|
||||
@test `$(distcleancheck_listfiles) | wc -l` -eq 0 \
|
||||
|| { echo "ERROR: files left in build directory after distclean:" ; \
|
||||
$(distcleancheck_listfiles) ; \
|
||||
exit 1; } >&2
|
||||
check-am: all-am
|
||||
check: check-recursive
|
||||
all-am: Makefile config_auto.h
|
||||
installdirs: installdirs-recursive
|
||||
installdirs-am:
|
||||
install: install-recursive
|
||||
install-exec: install-exec-recursive
|
||||
install-data: install-data-recursive
|
||||
uninstall: uninstall-recursive
|
||||
|
||||
install-am: all-am
|
||||
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
|
||||
|
||||
installcheck: installcheck-recursive
|
||||
install-strip:
|
||||
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
|
||||
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
|
||||
`test -z '$(STRIP)' || \
|
||||
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
|
||||
mostlyclean-generic:
|
||||
|
||||
clean-generic:
|
||||
|
||||
distclean-generic:
|
||||
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
|
||||
|
||||
maintainer-clean-generic:
|
||||
@echo "This command is intended for maintainers to use"
|
||||
@echo "it deletes files that may require special tools to rebuild."
|
||||
clean: clean-recursive
|
||||
|
||||
clean-am: clean-generic mostlyclean-am
|
||||
|
||||
distclean: distclean-recursive
|
||||
-rm -f $(am__CONFIG_DISTCLEAN_FILES)
|
||||
-rm -f Makefile
|
||||
distclean-am: clean-am distclean-generic distclean-hdr distclean-tags
|
||||
|
||||
dvi: dvi-recursive
|
||||
|
||||
dvi-am:
|
||||
|
||||
html: html-recursive
|
||||
|
||||
info: info-recursive
|
||||
|
||||
info-am:
|
||||
|
||||
install-data-am:
|
||||
|
||||
install-exec-am:
|
||||
|
||||
install-info: install-info-recursive
|
||||
|
||||
install-man:
|
||||
|
||||
installcheck-am:
|
||||
|
||||
maintainer-clean: maintainer-clean-recursive
|
||||
-rm -f $(am__CONFIG_DISTCLEAN_FILES)
|
||||
-rm -rf $(top_srcdir)/autom4te.cache
|
||||
-rm -f Makefile
|
||||
maintainer-clean-am: distclean-am maintainer-clean-generic
|
||||
|
||||
mostlyclean: mostlyclean-recursive
|
||||
|
||||
mostlyclean-am: mostlyclean-generic
|
||||
|
||||
pdf: pdf-recursive
|
||||
|
||||
pdf-am:
|
||||
|
||||
ps: ps-recursive
|
||||
|
||||
ps-am:
|
||||
|
||||
uninstall-am: uninstall-info-am
|
||||
|
||||
uninstall-info: uninstall-info-recursive
|
||||
|
||||
.PHONY: $(RECURSIVE_TARGETS) CTAGS GTAGS all all-am am--refresh check \
|
||||
check-am clean clean-generic clean-recursive ctags \
|
||||
ctags-recursive dist dist-all dist-bzip2 dist-gzip dist-hook \
|
||||
dist-shar dist-tarZ dist-zip distcheck distclean \
|
||||
distclean-generic distclean-hdr distclean-recursive \
|
||||
distclean-tags distcleancheck distdir distuninstallcheck dvi \
|
||||
dvi-am html html-am info info-am install install-am \
|
||||
install-data install-data-am install-exec install-exec-am \
|
||||
install-info install-info-am install-man install-strip \
|
||||
installcheck installcheck-am installdirs installdirs-am \
|
||||
maintainer-clean maintainer-clean-generic \
|
||||
maintainer-clean-recursive mostlyclean mostlyclean-generic \
|
||||
mostlyclean-recursive pdf pdf-am ps ps-am tags tags-recursive \
|
||||
uninstall uninstall-am uninstall-info-am
|
||||
|
||||
#EXTRA_DIST = doc/html doc/@PACKAGE_NAME@_@PACKAGE_VERSION@.pdf doc/@PACKAGE_NAME@_@PACKAGE_VERSION@.ps.gz
|
||||
|
||||
dist-hook:
|
||||
# Need to remove CVS directories from directories
|
||||
# added using EXTRA_DIST. $(distdir)/tessdata would in
|
||||
# theory suffice.
|
||||
rm -rf `find $(distdir) -name CVS`
|
||||
# Also remove extra files not needed in a distribution
|
||||
rm -rf `find $(distdir) -name configure.ac`
|
||||
rm -rf `find $(distdir) -name acinclude.m4`
|
||||
rm -rf `find $(distdir) -name aclocal.m4`
|
||||
# Tell versions [3.59,3.63) of GNU make to not export all variables.
|
||||
# Otherwise a system limit (for SysV at least) may be exceeded.
|
||||
.NOEXPORT:
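The dist-hook rule above passes backquoted `find` output to `rm -rf`, which splits path names on whitespace. A minimal sketch of the same cleanup using `find ... -prune -exec`, run against a scratch tree rather than a real distribution directory, might look like this. The directory `pkg-1.03` stands in for `$(distdir)`, and the files inside it are made up for the example.

```shell
#!/bin/sh
# Sketch only: mimic the dist-hook cleanup on a throwaway tree.
# "pkg-1.03" stands in for $(distdir); the layout below is hypothetical.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
distdir="$work/pkg-1.03"
mkdir -p "$distdir/tessdata/CVS" "$distdir/CVS" "$distdir/my dir/CVS"
touch "$distdir/configure.ac" "$distdir/tessdata/eng.txt"

# Remove every directory named CVS. -prune stops find from descending into
# a directory it is about to delete; "-exec ... {} +" passes names as
# arguments, so names containing spaces survive intact.
find "$distdir" -type d -name CVS -prune -exec rm -rf {} +

# Remove the extra files that the hook drops from the distribution.
find "$distdir" \( -name configure.ac -o -name acinclude.m4 \
    -o -name aclocal.m4 \) -exec rm -f {} +

remaining_cvs=$(find "$distdir" -name CVS | wc -l | tr -d ' ')
echo "CVS directories left: $remaining_cvs"
```

The behavior is the same as the original hook for ordinary names; the only difference is that no shell word splitting is applied to the paths that `find` reports.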
|
85
README
Normal file
@ -0,0 +1,85 @@
Introduction
============
This package contains the Tesseract Open Source OCR Engine.
Originally developed at Hewlett Packard Laboratories Bristol and
at Hewlett Packard Co, Greeley Colorado, all the code
in this distribution is now licensed under the Apache License:

** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
** http://www.apache.org/licenses/LICENSE-2.0
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.


Other Dependencies and Licenses:
================================
The Aspirin/MIGRAINES system is no longer required.

Tesseract can also make use of the libtiff library. (www.libtiff.org)
Without libtiff, Tesseract can only read uncompressed and G3 compressed
TIFF files.


History:
========
The engine was developed at Hewlett Packard Laboratories Bristol and
at Hewlett Packard Co, Greeley Colorado between 1985 and 1994, with some
more changes made in 1996 to port to Windows, and some C++izing in 1998.
A lot of the code was written in C, and then some more was written in C++.
Since then all the code has been converted to at least compile with a C++
compiler. Currently it builds under Linux with gcc2.95 and under Windows
with VC++6. The C++ code makes heavy use of a list system using macros.
This predates stl, was portable before stl, and is more efficient than stl
lists, but has the big negative that if you do get a segmentation violation,
it is hard to debug. Another "feature" of the C/C++ split is that the C++
data structures get converted to C data structures to call the low-level C
code. This is ugly, and the C++izing of the C code is a step towards
eliminating the conversion, but it has not happened yet.


Directory Structure (ordered by dependency):
============================================
ccmain     Top-level code. The main program resides in tesseractmain.cpp.
display    An "editor" to view and operate on the internal structures.
           (Requires a working viewer - batteries not included.)
wordrec    The word-level recognizer.
textord    The module that organizes (orders) text into lines and words.
classify   The low-level character classifiers.
ccstruct   Classes to hold information about a page as it is being processed.
viewer     The client side of a client server viewing system.
           Unfortunately, at this time, the server side is not available.
image      Image class and processing functions.
dict       Language model code.
cutil      Code for file I/O, lists, heaps etc, from the old C code.
ccutil     Somewhat newer code for lists, memory allocation etc from the
           old C++ code.


About the Engine
================
This code is a raw OCR engine. It has NO PAGE LAYOUT ANALYSIS, NO OUTPUT
FORMATTING, and NO UI. It can only process an image of a single column
and create text from it. It can detect fixed pitch vs proportional text.
Having said that, in 1995, this engine was in the top 3 in terms of character
accuracy, and it compiles and runs on both Linux and Windows. Another current
limitation is that it only recognizes English and its character set is only
US-ASCII. Training code IS included in the open source release however, and
will be included in a future release.


Using the Engine
================
The usage of both Windows and Linux versions is the same.
The executable must reside in the same directory as the tessdata directory.
The command line is:
tesseract <image.tif> <output> batch
The image file requires a .tif extension for its type to be recognized
correctly. If a file exists with the .tif extension replaced by .uzn, then it
will be interpreted as a UNLV-style zone file. (See www.isri.unlv.edu for
details of the zone files.)
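
The zone-file naming rule can be illustrated with a small shell helper that derives the .uzn name from the image name and reports whether such a file exists. This is only a sketch of the convention as described in this README: `zone_file_for` is a hypothetical helper, not part of Tesseract, and the sample file name is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map image.tif -> image.uzn, following the README rule.
zone_file_for() {
    case $1 in
        *.tif) printf '%s\n' "${1%.tif}.uzn" ;;
        *)     echo "error: '$1' lacks the .tif extension" >&2; return 1 ;;
    esac
}

image=phototest.tif
zone=$(zone_file_for "$image")
if [ -f "$zone" ]; then
    echo "$image: zone file $zone would be used"
else
    echo "$image: no zone file ($zone not found)"
fi
# The recognition run itself, in the form given above, would then be:
#   tesseract phototest.tif phototest batch
```

Only the trailing .tif is replaced, so a name such as scan.01.tif maps to scan.01.uzn.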
78
ReleaseNotes
Normal file
@ -0,0 +1,78 @@
Tesseract release notes Feb 2, 2007 - V1.03.
Added mftraining and cntraining. Using an image with a box file, tesseract
generates .tr output files. cntraining runs on the .tr files to make
normproto that lives in tessdata. mftraining runs on the .tr files to
make inttemp and pffmtable in tessdata. These are the main data files
that tesseract uses to recognize characters. At present, the code to make
dictionary files is not yet available, nor are any sample box files or
rebuilt inttemp or documentation to create any of these. Recognition is
still limited to the ASCII set, but when this problem is fixed, documentation
will follow.

Added a new API with adaptive thresholding for grey and color images.
See ccmain/baseapi.h/cpp for details. The main program has been converted
to use the API as an example. See main() in ccmain/tesseractmain.cpp for
details. The API is designed to make it easy to add subclasses with ability
to output the bounding boxes etc from the internal structures. The adaptive
thresholding improves accuracy (most of the time) on non-binary images.

Many memory leaks have been fixed. There are no known leaks left from using
the API correctly.

The adaptive classifier was not operating correctly. This bug, and several
others have been fixed, including poor chopping, an indefinite (if not quite
infinite) loop in the number parser, and a couple of crash bugs. Thanks to
all that have contributed bugs and bug fixes.

It is now possible to build without any of the graphics support to save code
size using #define GRAPHICS_DISABLED. There is also a new EMBEDDED define
for use on operating systems with limited library support.

64-bit and Mac OSX buildability is now included in the mainline source tree.
Thanks to all that have contributed patches and comments to help with that.
1.03 is also endian-independent, apart from the tiff i/o, so if you use
libtiff, the code should run on all platforms, even if you get/create new
data files of a different endianness.

Some of the bug fixes improve accuracy, and so do some of the changes to
DangAmbigs and user-words.

Tesseract release notes, Oct 4 2006 - V1.02.
Removed dependency on aspirin. *All* code is now licensed under Apache2.0.

Tesseract release notes, Sep 7 2006 - V1.01.

Fixes for this release:
Added mfcpch.cpp and getopt.cpp for VC++.
Fixed problem with greyscale images and no libtiff.
Stopped debug window from being used for the usage output.
Fixed load of inttemp for big-endian architectures.
Fixed some Mac compilation issues.

This version should read uncompressed 8 bit grey and 24 bit color tiffs
without having to have libtiff. It does a dumb threshold though, so don't
expect good results from poor contrast or images of natural scenes etc.

If you just run tesseract with no command line args you should now get a
sensible usage message on linux, with or without X-windows.

If you can get it to compile on a PPC Mac, it may now run correctly,
although not all the build issues are fixed yet.

Building Tesseract:
Windows:
Unpack the tar.gz archive
Open tesseract.dsw in DevStudio (preferably version 6, higher versions will be more difficult)
Set Win32 - Release as the active configuration.
Build.
Copy tesseract.exe from bin.rel up one directory level.
Run tesseract phototest.tif phototest
This will create phototest.txt.

Linux:
Unpack the tar.gz archive
./configure
make
Copy tesseract from ccmain up one directory level (or create a symbolic link)
Run tesseract phototest.tif phototest
This will create phototest.txt.
10
acinclude.m4
Normal file
@ -0,0 +1,10 @@
# Master include for AC macros. This directory structure allows
# for more flexibility with respect to CVS modules.
#
# Author: Luc Vincent

### m4_include(config/ac_compile_check_sizeof.m4)dnl
#m4_include(config/ac_create_stdint_h.m4)dnl
#m4_include(config/ax_create_stdint_h.m4)dnl
m4_include(config/ac_define_versionlevel.m4)dnl
m4_include(config/acinclude_custom.m4)dnl
920
aclocal.m4
vendored
Normal file
@ -0,0 +1,920 @@
|
||||
# generated automatically by aclocal 1.9.6 -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
|
||||
# 2005 Free Software Foundation, Inc.
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# This program is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
|
||||
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
|
||||
# PARTICULAR PURPOSE.
|
||||
|
||||
# Copyright (C) 2002, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# AM_AUTOMAKE_VERSION(VERSION)
|
||||
# ----------------------------
|
||||
# Automake X.Y traces this macro to ensure aclocal.m4 has been
|
||||
# generated from the m4 files accompanying Automake X.Y.
|
||||
AC_DEFUN([AM_AUTOMAKE_VERSION], [am__api_version="1.9"])
|
||||
|
||||
# AM_SET_CURRENT_AUTOMAKE_VERSION
|
||||
# -------------------------------
|
||||
# Call AM_AUTOMAKE_VERSION so it can be traced.
|
||||
# This function is AC_REQUIREd by AC_INIT_AUTOMAKE.
|
||||
AC_DEFUN([AM_SET_CURRENT_AUTOMAKE_VERSION],
|
||||
[AM_AUTOMAKE_VERSION([1.9.6])])
|
||||
|
||||
# AM_AUX_DIR_EXPAND -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 2001, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# For projects using AC_CONFIG_AUX_DIR([foo]), Autoconf sets
|
||||
# $ac_aux_dir to `$srcdir/foo'. In other projects, it is set to
|
||||
# `$srcdir', `$srcdir/..', or `$srcdir/../..'.
|
||||
#
|
||||
# Of course, Automake must honor this variable whenever it calls a
|
||||
# tool from the auxiliary directory. The problem is that $srcdir (and
|
||||
# therefore $ac_aux_dir as well) can be either absolute or relative,
|
||||
# depending on how configure is run. This is pretty annoying, since
|
||||
# it makes $ac_aux_dir quite unusable in subdirectories: in the top
|
||||
# source directory, any form will work fine, but in subdirectories a
|
||||
# relative path needs to be adjusted first.
|
||||
#
|
||||
# $ac_aux_dir/missing
|
||||
# fails when called from a subdirectory if $ac_aux_dir is relative
|
||||
# $top_srcdir/$ac_aux_dir/missing
|
||||
# fails if $ac_aux_dir is absolute,
|
||||
# fails when called from a subdirectory in a VPATH build with
|
||||
# a relative $ac_aux_dir
|
||||
#
|
||||
# The reason of the latter failure is that $top_srcdir and $ac_aux_dir
|
||||
# are both prefixed by $srcdir. In an in-source build this is usually
|
||||
# harmless because $srcdir is `.', but things will break when you
|
||||
# start a VPATH build or use an absolute $srcdir.
|
||||
#
|
||||
# So we could use something similar to $top_srcdir/$ac_aux_dir/missing,
|
||||
# iff we strip the leading $srcdir from $ac_aux_dir. That would be:
|
||||
# am_aux_dir='\$(top_srcdir)/'`expr "$ac_aux_dir" : "$srcdir//*\(.*\)"`
|
||||
# and then we would define $MISSING as
|
||||
# MISSING="\${SHELL} $am_aux_dir/missing"
|
||||
# This will work as long as MISSING is not called from configure, because
|
||||
# unfortunately $(top_srcdir) has no meaning in configure.
|
||||
# However there are other variables, like CC, which are often used in
|
||||
# configure, and could therefore not use this "fixed" $ac_aux_dir.
|
||||
#
|
||||
# Another solution, used here, is to always expand $ac_aux_dir to an
|
||||
# absolute PATH. The drawback is that using absolute paths prevent a
|
||||
# configured tree to be moved without reconfiguration.
|
||||
|
||||
AC_DEFUN([AM_AUX_DIR_EXPAND],
|
||||
[dnl Rely on autoconf to set up CDPATH properly.
|
||||
AC_PREREQ([2.50])dnl
|
||||
# expand $ac_aux_dir to an absolute path
|
||||
am_aux_dir=`cd $ac_aux_dir && pwd`
|
||||
])
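The comment block above explains why a relative `$ac_aux_dir` breaks when it is used from a subdirectory. A minimal shell sketch of that failure, and of the `cd ... && pwd` expansion the macro performs, follows. It runs in a scratch tree; the directory names (`config`, `sub`) and the stand-in `missing` script are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: relative vs. absolute auxiliary directory, using a scratch tree.
set -e
top=$(mktemp -d)
trap 'rm -rf "$top"' EXIT
mkdir -p "$top/config" "$top/sub"
printf '#!/bin/sh\necho missing-ran\n' > "$top/config/missing"
chmod +x "$top/config/missing"

cd "$top"
ac_aux_dir=./config                    # relative, as configure may compute it
am_aux_dir=$(cd "$ac_aux_dir" && pwd)  # same expansion as AM_AUX_DIR_EXPAND

# From the top directory the relative form resolves.
top_rel=$("$ac_aux_dir/missing")

# From a subdirectory only the absolute form still resolves.
cd sub
if [ -x "$ac_aux_dir/missing" ]; then rel_ok=yes; else rel_ok=no; fi
sub_abs=$("$am_aux_dir/missing")
echo "relative usable in sub: $rel_ok; absolute output: $sub_abs"
```

As the comment notes, the cost of the absolute path is that the configured tree cannot be moved without rerunning configure.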
|
||||
|
||||
# AM_CONDITIONAL -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 1997, 2000, 2001, 2003, 2004, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 7
|
||||
|
||||
# AM_CONDITIONAL(NAME, SHELL-CONDITION)
|
||||
# -------------------------------------
|
||||
# Define a conditional.
|
||||
AC_DEFUN([AM_CONDITIONAL],
|
||||
[AC_PREREQ(2.52)dnl
|
||||
ifelse([$1], [TRUE], [AC_FATAL([$0: invalid condition: $1])],
|
||||
[$1], [FALSE], [AC_FATAL([$0: invalid condition: $1])])dnl
|
||||
AC_SUBST([$1_TRUE])
|
||||
AC_SUBST([$1_FALSE])
|
||||
if $2; then
|
||||
$1_TRUE=
|
||||
$1_FALSE='#'
|
||||
else
|
||||
$1_TRUE='#'
|
||||
$1_FALSE=
|
||||
fi
|
||||
AC_CONFIG_COMMANDS_PRE(
|
||||
[if test -z "${$1_TRUE}" && test -z "${$1_FALSE}"; then
|
||||
AC_MSG_ERROR([[conditional "$1" was never defined.
|
||||
Usually this means the macro was only invoked conditionally.]])
|
||||
fi])])
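AM_CONDITIONAL works by setting one of `NAME_TRUE` / `NAME_FALSE` to an empty string and the other to `#`, so that the unselected Makefile lines become comments after substitution. The sketch below imitates this in plain shell, with sed standing in for the substitution step. The conditional name `USE_TIFF` and the template lines are hypothetical.

```shell
#!/bin/sh
# Sketch of the AM_CONDITIONAL mechanism; USE_TIFF is a made-up conditional.
render() {
    # $1 is the outcome of the shell condition: "yes" or anything else.
    if [ "$1" = yes ]; then
        USE_TIFF_TRUE=
        USE_TIFF_FALSE='#'
    else
        USE_TIFF_TRUE='#'
        USE_TIFF_FALSE=
    fi
    # Stand-in for the substitution step: replace @NAME@ markers in a template.
    sed -e "s/@USE_TIFF_TRUE@/$USE_TIFF_TRUE/" \
        -e "s/@USE_TIFF_FALSE@/$USE_TIFF_FALSE/" <<'EOF'
@USE_TIFF_TRUE@LIBS = -ltiff
@USE_TIFF_FALSE@LIBS =
EOF
}

echo "--- condition true";  render yes
echo "--- condition false"; render no
```

Exactly one of the two template lines stays active in each case, which is why the macro's final check rejects a conditional whose `_TRUE` and `_FALSE` variables are both empty.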
|
||||
|
||||
|
||||
# Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 8
|
||||
|
||||
# There are a few dirty hacks below to avoid letting `AC_PROG_CC' be
|
||||
# written in clear, in which case automake, when reading aclocal.m4,
|
||||
# will think it sees a *use*, and therefore will trigger all its
|
||||
# C support machinery. Also note that it means that autoscan, seeing
|
||||
# CC etc. in the Makefile, will ask for an AC_PROG_CC use...
|
||||
|
||||
|
||||
# _AM_DEPENDENCIES(NAME)
|
||||
# ----------------------
|
||||
# See how the compiler implements dependency checking.
|
||||
# NAME is "CC", "CXX", "GCJ", or "OBJC".
|
||||
# We try a few techniques and use that to set a single cache variable.
|
||||
#
|
||||
# We don't AC_REQUIRE the corresponding AC_PROG_CC since the latter was
|
||||
# modified to invoke _AM_DEPENDENCIES(CC); we would have a circular
|
||||
# dependency, and given that the user is not expected to run this macro,
|
||||
# just rely on AC_PROG_CC.
|
||||
AC_DEFUN([_AM_DEPENDENCIES],
|
||||
[AC_REQUIRE([AM_SET_DEPDIR])dnl
|
||||
AC_REQUIRE([AM_OUTPUT_DEPENDENCY_COMMANDS])dnl
|
||||
AC_REQUIRE([AM_MAKE_INCLUDE])dnl
|
||||
AC_REQUIRE([AM_DEP_TRACK])dnl
|
||||
|
||||
ifelse([$1], CC, [depcc="$CC" am_compiler_list=],
|
||||
[$1], CXX, [depcc="$CXX" am_compiler_list=],
|
||||
[$1], OBJC, [depcc="$OBJC" am_compiler_list='gcc3 gcc'],
|
||||
[$1], GCJ, [depcc="$GCJ" am_compiler_list='gcc3 gcc'],
|
||||
[depcc="$$1" am_compiler_list=])
|
||||
|
||||
AC_CACHE_CHECK([dependency style of $depcc],
|
||||
[am_cv_$1_dependencies_compiler_type],
|
||||
[if test -z "$AMDEP_TRUE" && test -f "$am_depcomp"; then
|
||||
# We make a subdir and do the tests there. Otherwise we can end up
|
||||
# making bogus files that we don't know about and never remove. For
|
||||
# instance it was reported that on HP-UX the gcc test will end up
|
||||
# making a dummy file named `D' -- because `-MD' means `put the output
|
||||
# in D'.
|
||||
mkdir conftest.dir
|
||||
# Copy depcomp to subdir because otherwise we won't find it if we're
|
||||
# using a relative directory.
|
||||
cp "$am_depcomp" conftest.dir
|
||||
cd conftest.dir
|
||||
# We will build objects and dependencies in a subdirectory because
|
||||
# it helps to detect inapplicable dependency modes. For instance
|
||||
# both Tru64's cc and ICC support -MD to output dependencies as a
|
||||
# side effect of compilation, but ICC will put the dependencies in
|
||||
# the current directory while Tru64 will put them in the object
|
||||
# directory.
|
||||
mkdir sub
|
||||
|
||||
am_cv_$1_dependencies_compiler_type=none
|
||||
if test "$am_compiler_list" = ""; then
|
||||
am_compiler_list=`sed -n ['s/^#*\([a-zA-Z0-9]*\))$/\1/p'] < ./depcomp`
|
||||
fi
|
||||
for depmode in $am_compiler_list; do
|
||||
# Setup a source with many dependencies, because some compilers
|
||||
# like to wrap large dependency lists on column 80 (with \), and
|
||||
# we should not choose a depcomp mode which is confused by this.
|
||||
#
|
||||
# We need to recreate these files for each test, as the compiler may
|
||||
# overwrite some of them when testing with obscure command lines.
|
||||
# This happens at least with the AIX C compiler.
|
||||
: > sub/conftest.c
|
||||
for i in 1 2 3 4 5 6; do
|
||||
echo '#include "conftst'$i'.h"' >> sub/conftest.c
|
||||
# Using `: > sub/conftst$i.h' creates only sub/conftst1.h with
|
||||
# Solaris 8's {/usr,}/bin/sh.
|
||||
touch sub/conftst$i.h
|
||||
done
|
||||
echo "${am__include} ${am__quote}sub/conftest.Po${am__quote}" > confmf
|
||||
|
||||
case $depmode in
|
||||
nosideeffect)
|
||||
# after this tag, mechanisms are not by side-effect, so they'll
|
||||
# only be used when explicitly requested
|
||||
if test "x$enable_dependency_tracking" = xyes; then
|
||||
continue
|
||||
else
|
||||
break
|
||||
fi
|
||||
;;
|
||||
none) break ;;
|
||||
esac
|
||||
# We check with `-c' and `-o' for the sake of the "dashmstdout"
|
||||
# mode. It turns out that the SunPro C++ compiler does not properly
|
||||
# handle `-M -o', and we need to detect this.
|
||||
if depmode=$depmode \
|
||||
source=sub/conftest.c object=sub/conftest.${OBJEXT-o} \
|
||||
depfile=sub/conftest.Po tmpdepfile=sub/conftest.TPo \
|
||||
$SHELL ./depcomp $depcc -c -o sub/conftest.${OBJEXT-o} sub/conftest.c \
|
||||
>/dev/null 2>conftest.err &&
|
||||
grep sub/conftst6.h sub/conftest.Po > /dev/null 2>&1 &&
|
||||
grep sub/conftest.${OBJEXT-o} sub/conftest.Po > /dev/null 2>&1 &&
|
||||
${MAKE-make} -s -f confmf > /dev/null 2>&1; then
|
||||
# icc doesn't choke on unknown options, it will just issue warnings
|
||||
# or remarks (even with -Werror). So we grep stderr for any message
|
||||
# that says an option was ignored or not supported.
|
||||
# When given -MP, icc 7.0 and 7.1 complain thusly:
|
||||
# icc: Command line warning: ignoring option '-M'; no argument required
|
||||
# The diagnosis changed in icc 8.0:
|
||||
# icc: Command line remark: option '-MP' not supported
|
||||
if (grep 'ignoring option' conftest.err ||
|
||||
grep 'not supported' conftest.err) >/dev/null 2>&1; then :; else
|
||||
am_cv_$1_dependencies_compiler_type=$depmode
|
||||
break
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
cd ..
|
||||
rm -rf conftest.dir
|
||||
else
|
||||
am_cv_$1_dependencies_compiler_type=none
|
||||
fi
|
||||
])
|
||||
AC_SUBST([$1DEPMODE], [depmode=$am_cv_$1_dependencies_compiler_type])
|
||||
AM_CONDITIONAL([am__fastdep$1], [
|
||||
test "x$enable_dependency_tracking" != xno \
|
||||
&& test "$am_cv_$1_dependencies_compiler_type" = gcc3])
|
||||
])
|
||||
|
||||
|
||||
# AM_SET_DEPDIR
|
||||
# -------------
|
||||
# Choose a directory name for dependency files.
|
||||
# This macro is AC_REQUIREd in _AM_DEPENDENCIES
|
||||
AC_DEFUN([AM_SET_DEPDIR],
|
||||
[AC_REQUIRE([AM_SET_LEADING_DOT])dnl
|
||||
AC_SUBST([DEPDIR], ["${am__leading_dot}deps"])dnl
|
||||
])
|
||||
|
||||
|
||||
# AM_DEP_TRACK
|
||||
# ------------
|
||||
AC_DEFUN([AM_DEP_TRACK],
|
||||
[AC_ARG_ENABLE(dependency-tracking,
|
||||
[ --disable-dependency-tracking speeds up one-time build
|
||||
--enable-dependency-tracking do not reject slow dependency extractors])
|
||||
if test "x$enable_dependency_tracking" != xno; then
|
||||
am_depcomp="$ac_aux_dir/depcomp"
|
||||
AMDEPBACKSLASH='\'
|
||||
fi
|
||||
AM_CONDITIONAL([AMDEP], [test "x$enable_dependency_tracking" != xno])
|
||||
AC_SUBST([AMDEPBACKSLASH])
|
||||
])
|
||||
|
||||
# Generate code to set up dependency tracking. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
#serial 3
|
||||
|
||||
# _AM_OUTPUT_DEPENDENCY_COMMANDS
|
||||
# ------------------------------
|
||||
AC_DEFUN([_AM_OUTPUT_DEPENDENCY_COMMANDS],
|
||||
[for mf in $CONFIG_FILES; do
|
||||
# Strip MF so we end up with the name of the file.
|
||||
mf=`echo "$mf" | sed -e 's/:.*$//'`
|
||||
# Check whether this is an Automake generated Makefile or not.
|
||||
# We used to match only the files named `Makefile.in', but
|
||||
# some people rename them; so instead we look at the file content.
|
||||
# Grep'ing the first line is not enough: some people post-process
|
||||
# each Makefile.in and add a new line on top of each file to say so.
|
||||
# So let's grep whole file.
|
||||
if grep '^#.*generated by automake' $mf > /dev/null 2>&1; then
|
||||
dirpart=`AS_DIRNAME("$mf")`
|
||||
else
|
||||
continue
|
||||
fi
|
||||
# Extract the definition of DEPDIR, am__include, and am__quote
|
||||
# from the Makefile without running `make'.
|
||||
DEPDIR=`sed -n 's/^DEPDIR = //p' < "$mf"`
|
||||
test -z "$DEPDIR" && continue
|
||||
am__include=`sed -n 's/^am__include = //p' < "$mf"`
|
||||
test -z "$am__include" && continue
|
||||
am__quote=`sed -n 's/^am__quote = //p' < "$mf"`
|
||||
# When using ansi2knr, U may be empty or an underscore; expand it
|
||||
U=`sed -n 's/^U = //p' < "$mf"`
|
||||
# Find all dependency output files, they are included files with
|
||||
# $(DEPDIR) in their names. We invoke sed twice because it is the
|
||||
# simplest approach to changing $(DEPDIR) to its actual value in the
|
||||
# expansion.
|
||||
for file in `sed -n "
|
||||
s/^$am__include $am__quote\(.*(DEPDIR).*\)$am__quote"'$/\1/p' <"$mf" | \
|
||||
sed -e 's/\$(DEPDIR)/'"$DEPDIR"'/g' -e 's/\$U/'"$U"'/g'`; do
|
||||
# Make sure the directory exists.
|
||||
test -f "$dirpart/$file" && continue
|
||||
fdir=`AS_DIRNAME(["$file"])`
|
||||
AS_MKDIR_P([$dirpart/$fdir])
|
||||
# echo "creating $dirpart/$file"
|
||||
echo '# dummy' > "$dirpart/$file"
|
||||
done
|
||||
done
|
||||
])# _AM_OUTPUT_DEPENDENCY_COMMANDS
|
||||
|
||||
|
||||
# AM_OUTPUT_DEPENDENCY_COMMANDS
|
||||
# -----------------------------
|
||||
# This macro should only be invoked once -- use via AC_REQUIRE.
|
||||
#
|
||||
# This code is only required when automatic dependency tracking
|
||||
# is enabled. FIXME. This creates each `.P' file that we will
|
||||
# need in order to bootstrap the dependency handling code.
|
||||
AC_DEFUN([AM_OUTPUT_DEPENDENCY_COMMANDS],
|
||||
[AC_CONFIG_COMMANDS([depfiles],
|
||||
[test x"$AMDEP_TRUE" != x"" || _AM_OUTPUT_DEPENDENCY_COMMANDS],
|
||||
[AMDEP_TRUE="$AMDEP_TRUE" ac_aux_dir="$ac_aux_dir"])
|
||||
])
|
||||
|
||||
# Copyright (C) 1996, 1997, 2000, 2001, 2003, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 8
|
||||
|
||||
# AM_CONFIG_HEADER is obsolete. It has been replaced by AC_CONFIG_HEADERS.
|
||||
AU_DEFUN([AM_CONFIG_HEADER], [AC_CONFIG_HEADERS($@)])
|
||||
|
||||
# Do all the work for Automake. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 12
|
||||
|
||||
# This macro actually does too much. Some checks are only needed if
|
||||
# your package does certain things. But this isn't really a big deal.
|
||||
|
||||
# AM_INIT_AUTOMAKE(PACKAGE, VERSION, [NO-DEFINE])
|
||||
# AM_INIT_AUTOMAKE([OPTIONS])
|
||||
# -----------------------------------------------
|
||||
# The call with PACKAGE and VERSION arguments is the old style
|
||||
# call (pre autoconf-2.50), which is being phased out. PACKAGE
|
||||
# and VERSION should now be passed to AC_INIT and removed from
|
||||
# the call to AM_INIT_AUTOMAKE.
|
||||
# We support both call styles for the transition. After
|
||||
# the next Automake release, Autoconf can make the AC_INIT
|
||||
# arguments mandatory, and then we can depend on a new Autoconf
|
||||
# release and drop the old call support.
|
||||
AC_DEFUN([AM_INIT_AUTOMAKE],
|
||||
[AC_PREREQ([2.58])dnl
|
||||
dnl Autoconf wants to disallow AM_ names. We explicitly allow
|
||||
dnl the ones we care about.
|
||||
m4_pattern_allow([^AM_[A-Z]+FLAGS$])dnl
|
||||
AC_REQUIRE([AM_SET_CURRENT_AUTOMAKE_VERSION])dnl
|
||||
AC_REQUIRE([AC_PROG_INSTALL])dnl
|
||||
# test to see if srcdir already configured
|
||||
if test "`cd $srcdir && pwd`" != "`pwd`" &&
|
||||
test -f $srcdir/config.status; then
|
||||
AC_MSG_ERROR([source directory already configured; run "make distclean" there first])
|
||||
fi
|
||||
|
||||
# test whether we have cygpath
|
||||
if test -z "$CYGPATH_W"; then
|
||||
if (cygpath --version) >/dev/null 2>/dev/null; then
|
||||
CYGPATH_W='cygpath -w'
|
||||
else
|
||||
CYGPATH_W=echo
|
||||
fi
|
||||
fi
|
||||
AC_SUBST([CYGPATH_W])
|
||||
|
||||
# Define the identity of the package.
|
||||
dnl Distinguish between old-style and new-style calls.
|
||||
m4_ifval([$2],
|
||||
[m4_ifval([$3], [_AM_SET_OPTION([no-define])])dnl
|
||||
AC_SUBST([PACKAGE], [$1])dnl
|
||||
AC_SUBST([VERSION], [$2])],
|
||||
[_AM_SET_OPTIONS([$1])dnl
|
||||
AC_SUBST([PACKAGE], ['AC_PACKAGE_TARNAME'])dnl
|
||||
AC_SUBST([VERSION], ['AC_PACKAGE_VERSION'])])dnl
|
||||
|
||||
_AM_IF_OPTION([no-define],,
|
||||
[AC_DEFINE_UNQUOTED(PACKAGE, "$PACKAGE", [Name of package])
|
||||
AC_DEFINE_UNQUOTED(VERSION, "$VERSION", [Version number of package])])dnl
|
||||
|
||||
# Some tools Automake needs.
|
||||
AC_REQUIRE([AM_SANITY_CHECK])dnl
|
||||
AC_REQUIRE([AC_ARG_PROGRAM])dnl
|
||||
AM_MISSING_PROG(ACLOCAL, aclocal-${am__api_version})
|
||||
AM_MISSING_PROG(AUTOCONF, autoconf)
|
||||
AM_MISSING_PROG(AUTOMAKE, automake-${am__api_version})
|
||||
AM_MISSING_PROG(AUTOHEADER, autoheader)
|
||||
AM_MISSING_PROG(MAKEINFO, makeinfo)
|
||||
AM_PROG_INSTALL_SH
|
||||
AM_PROG_INSTALL_STRIP
|
||||
AC_REQUIRE([AM_PROG_MKDIR_P])dnl
|
||||
# We need awk for the "check" target. The system "awk" is bad on
|
||||
# some platforms.
|
||||
AC_REQUIRE([AC_PROG_AWK])dnl
|
||||
AC_REQUIRE([AC_PROG_MAKE_SET])dnl
|
||||
AC_REQUIRE([AM_SET_LEADING_DOT])dnl
|
||||
_AM_IF_OPTION([tar-ustar], [_AM_PROG_TAR([ustar])],
|
||||
[_AM_IF_OPTION([tar-pax], [_AM_PROG_TAR([pax])],
|
||||
[_AM_PROG_TAR([v7])])])
|
||||
_AM_IF_OPTION([no-dependencies],,
|
||||
[AC_PROVIDE_IFELSE([AC_PROG_CC],
|
||||
[_AM_DEPENDENCIES(CC)],
|
||||
[define([AC_PROG_CC],
|
||||
defn([AC_PROG_CC])[_AM_DEPENDENCIES(CC)])])dnl
|
||||
AC_PROVIDE_IFELSE([AC_PROG_CXX],
|
||||
[_AM_DEPENDENCIES(CXX)],
|
||||
[define([AC_PROG_CXX],
|
||||
defn([AC_PROG_CXX])[_AM_DEPENDENCIES(CXX)])])dnl
|
||||
])
|
||||
])
|
||||
|
||||
|
||||
# When config.status generates a header, we must update the stamp-h file.
|
||||
# This file resides in the same directory as the config header
|
||||
# that is generated. The stamp files are numbered to have different names.
|
||||
|
||||
# Autoconf calls _AC_AM_CONFIG_HEADER_HOOK (when defined) in the
|
||||
# loop where config.status creates the headers, so we can generate
|
||||
# our stamp files there.
|
||||
AC_DEFUN([_AC_AM_CONFIG_HEADER_HOOK],
|
||||
[# Compute $1's index in $config_headers.
|
||||
_am_stamp_count=1
|
||||
for _am_header in $config_headers :; do
|
||||
case $_am_header in
|
||||
$1 | $1:* )
|
||||
break ;;
|
||||
* )
|
||||
_am_stamp_count=`expr $_am_stamp_count + 1` ;;
|
||||
esac
|
||||
done
|
||||
echo "timestamp for $1" >`AS_DIRNAME([$1])`/stamp-h[]$_am_stamp_count])
|
||||
|
||||
# Copyright (C) 2001, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# AM_PROG_INSTALL_SH
|
||||
# ------------------
|
||||
# Define $install_sh.
|
||||
AC_DEFUN([AM_PROG_INSTALL_SH],
|
||||
[AC_REQUIRE([AM_AUX_DIR_EXPAND])dnl
|
||||
install_sh=${install_sh-"$am_aux_dir/install-sh"}
|
||||
AC_SUBST(install_sh)])
|
||||
|
||||
# Copyright (C) 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 2
|
||||
|
||||
# Check whether the underlying file-system supports filenames
|
||||
# with a leading dot. For instance MS-DOS doesn't.
|
||||
AC_DEFUN([AM_SET_LEADING_DOT],
|
||||
[rm -rf .tst 2>/dev/null
|
||||
mkdir .tst 2>/dev/null
|
||||
if test -d .tst; then
|
||||
am__leading_dot=.
|
||||
else
|
||||
am__leading_dot=_
|
||||
fi
|
||||
rmdir .tst 2>/dev/null
|
||||
AC_SUBST([am__leading_dot])])
|
||||
|
||||
# Add --enable-maintainer-mode option to configure. -*- Autoconf -*-
|
||||
# From Jim Meyering
|
||||
|
||||
# Copyright (C) 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 4
|
||||
|
||||
AC_DEFUN([AM_MAINTAINER_MODE],
|
||||
[AC_MSG_CHECKING([whether to enable maintainer-specific portions of Makefiles])
|
||||
dnl maintainer-mode is disabled by default
|
||||
AC_ARG_ENABLE(maintainer-mode,
|
||||
[ --enable-maintainer-mode enable make rules and dependencies not useful
|
||||
(and sometimes confusing) to the casual installer],
|
||||
USE_MAINTAINER_MODE=$enableval,
|
||||
USE_MAINTAINER_MODE=no)
|
||||
AC_MSG_RESULT([$USE_MAINTAINER_MODE])
|
||||
AM_CONDITIONAL(MAINTAINER_MODE, [test $USE_MAINTAINER_MODE = yes])
|
||||
MAINT=$MAINTAINER_MODE_TRUE
|
||||
AC_SUBST(MAINT)dnl
|
||||
]
|
||||
)
|
||||
|
||||
AU_DEFUN([jm_MAINTAINER_MODE], [AM_MAINTAINER_MODE])
|
||||
|
||||
# Check to see how 'make' treats includes. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 2001, 2002, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 3
|
||||
|
||||
# AM_MAKE_INCLUDE()
|
||||
# -----------------
|
||||
# Check to see how make treats includes.
|
||||
AC_DEFUN([AM_MAKE_INCLUDE],
|
||||
[am_make=${MAKE-make}
|
||||
cat > confinc << 'END'
|
||||
am__doit:
|
||||
@echo done
|
||||
.PHONY: am__doit
|
||||
END
|
||||
# If we don't find an include directive, just comment out the code.
|
||||
AC_MSG_CHECKING([for style of include used by $am_make])
|
||||
am__include="#"
|
||||
am__quote=
|
||||
_am_result=none
|
||||
# First try GNU make style include.
|
||||
echo "include confinc" > confmf
|
||||
# We grep out `Entering directory' and `Leaving directory'
|
||||
# messages which can occur if `w' ends up in MAKEFLAGS.
|
||||
# In particular we don't look at `^make:' because GNU make might
|
||||
# be invoked under some other name (usually "gmake"), in which
|
||||
# case it prints its new name instead of `make'.
|
||||
if test "`$am_make -s -f confmf 2> /dev/null | grep -v 'ing directory'`" = "done"; then
|
||||
am__include=include
|
||||
am__quote=
|
||||
_am_result=GNU
|
||||
fi
|
||||
# Now try BSD make style include.
|
||||
if test "$am__include" = "#"; then
|
||||
echo '.include "confinc"' > confmf
|
||||
if test "`$am_make -s -f confmf 2> /dev/null`" = "done"; then
|
||||
am__include=.include
|
||||
am__quote="\""
|
||||
_am_result=BSD
|
||||
fi
|
||||
fi
|
||||
AC_SUBST([am__include])
|
||||
AC_SUBST([am__quote])
|
||||
AC_MSG_RESULT([$_am_result])
|
||||
rm -f confinc confmf
|
||||
])
|
||||
|
||||
# Copyright (C) 1999, 2000, 2001, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 3
|
||||
|
||||
# AM_PROG_CC_C_O
|
||||
# --------------
|
||||
# Like AC_PROG_CC_C_O, but changed for automake.
|
||||
AC_DEFUN([AM_PROG_CC_C_O],
|
||||
[AC_REQUIRE([AC_PROG_CC_C_O])dnl
|
||||
AC_REQUIRE([AM_AUX_DIR_EXPAND])dnl
|
||||
# FIXME: we rely on the cache variable name because
|
||||
# there is no other way.
|
||||
set dummy $CC
|
||||
ac_cc=`echo $[2] | sed ['s/[^a-zA-Z0-9_]/_/g;s/^[0-9]/_/']`
|
||||
if eval "test \"`echo '$ac_cv_prog_cc_'${ac_cc}_c_o`\" != yes"; then
|
||||
# Losing compiler, so override with the script.
|
||||
# FIXME: It is wrong to rewrite CC.
|
||||
# But if we don't then we get into trouble of one sort or another.
|
||||
# A longer-term fix would be to have automake use am__CC in this case,
|
||||
# and then we could set am__CC="\$(top_srcdir)/compile \$(CC)"
|
||||
CC="$am_aux_dir/compile $CC"
|
||||
fi
|
||||
])
|
||||
|
||||
# Fake the existence of programs that GNU maintainers use. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 1997, 1999, 2000, 2001, 2003, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 4
|
||||
|
||||
# AM_MISSING_PROG(NAME, PROGRAM)
|
||||
# ------------------------------
|
||||
AC_DEFUN([AM_MISSING_PROG],
|
||||
[AC_REQUIRE([AM_MISSING_HAS_RUN])
|
||||
$1=${$1-"${am_missing_run}$2"}
|
||||
AC_SUBST($1)])
|
||||
|
||||
|
||||
# AM_MISSING_HAS_RUN
|
||||
# ------------------
|
||||
# Define MISSING if not defined so far and test if it supports --run.
|
||||
# If it does, set am_missing_run to use it, otherwise, to nothing.
|
||||
AC_DEFUN([AM_MISSING_HAS_RUN],
|
||||
[AC_REQUIRE([AM_AUX_DIR_EXPAND])dnl
|
||||
test x"${MISSING+set}" = xset || MISSING="\${SHELL} $am_aux_dir/missing"
|
||||
# Use eval to expand $SHELL
|
||||
if eval "$MISSING --run true"; then
|
||||
am_missing_run="$MISSING --run "
|
||||
else
|
||||
am_missing_run=
|
||||
AC_MSG_WARN([`missing' script is too old or missing])
|
||||
fi
|
||||
])
|
||||
|
||||
# Copyright (C) 2003, 2004, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# AM_PROG_MKDIR_P
|
||||
# ---------------
|
||||
# Check whether `mkdir -p' is supported, fall back to mkinstalldirs otherwise.
|
||||
#
|
||||
# Automake 1.8 used `mkdir -m 0755 -p --' to ensure that directories
|
||||
# created by `make install' are always world readable, even if the
|
||||
# installer happens to have an overly restrictive umask (e.g. 077).
|
||||
# This was a mistake. There are at least two reasons why we must not
|
||||
# use `-m 0755':
|
||||
# - it causes special bits like SGID to be ignored,
|
||||
# - it may be too restrictive (some setups expect 775 directories).
|
||||
#
|
||||
# Do not use -m 0755 and let people choose whatever they expect by
|
||||
# setting umask.
|
||||
#
|
||||
# We cannot accept just any implementation of `mkdir' that recognizes `-p'.
|
||||
# Some implementations (such as Solaris 8's) are not thread-safe: if a
|
||||
# parallel make tries to run `mkdir -p a/b' and `mkdir -p a/c'
|
||||
# concurrently, both versions can detect that a/ is missing, but only
|
||||
# one can create it and the other will error out. Consequently we
|
||||
# restrict ourselves to GNU mkdir (using the --version option ensures
|
||||
# this.)
|
||||
AC_DEFUN([AM_PROG_MKDIR_P],
|
||||
[if mkdir -p --version . >/dev/null 2>&1 && test ! -d ./--version; then
|
||||
# We used to keep the `.' as the first argument, in order to
|
||||
# allow $(mkdir_p) to be used without argument. As in
|
||||
# $(mkdir_p) $(somedir)
|
||||
# where $(somedir) is conditionally defined. However this is wrong
|
||||
# for two reasons:
|
||||
# 1. if the package is installed by a user who cannot write `.'
|
||||
# make install will fail,
|
||||
# 2. the above comment should most certainly read
|
||||
# $(mkdir_p) $(DESTDIR)$(somedir)
|
||||
# so it does not work when $(somedir) is undefined and
|
||||
# $(DESTDIR) is not.
|
||||
# To support the latter case, we have to write
|
||||
# test -z "$(somedir)" || $(mkdir_p) $(DESTDIR)$(somedir),
|
||||
# so the `.' trick is pointless.
|
||||
mkdir_p='mkdir -p --'
|
||||
else
|
||||
# On NextStep and OpenStep, the `mkdir' command does not
|
||||
# recognize any option. It will interpret all options as
|
||||
# directories to create, and then abort because `.' already
|
||||
# exists.
|
||||
for d in ./-p ./--version;
|
||||
do
|
||||
test -d $d && rmdir $d
|
||||
done
|
||||
# $(mkinstalldirs) is defined by Automake if mkinstalldirs exists.
|
||||
if test -f "$ac_aux_dir/mkinstalldirs"; then
|
||||
mkdir_p='$(mkinstalldirs)'
|
||||
else
|
||||
mkdir_p='$(install_sh) -d'
|
||||
fi
|
||||
fi
|
||||
AC_SUBST([mkdir_p])])
|
||||
|
||||
# Helper functions for option handling. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 2001, 2002, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 3
|
||||
|
||||
# _AM_MANGLE_OPTION(NAME)
|
||||
# -----------------------
|
||||
AC_DEFUN([_AM_MANGLE_OPTION],
|
||||
[[_AM_OPTION_]m4_bpatsubst($1, [[^a-zA-Z0-9_]], [_])])
|
||||
|
||||
# _AM_SET_OPTION(NAME)
|
||||
# ------------------------------
|
||||
# Set option NAME. Presently that only means defining a flag for this option.
|
||||
AC_DEFUN([_AM_SET_OPTION],
|
||||
[m4_define(_AM_MANGLE_OPTION([$1]), 1)])
|
||||
|
||||
# _AM_SET_OPTIONS(OPTIONS)
|
||||
# ----------------------------------
|
||||
# OPTIONS is a space-separated list of Automake options.
|
||||
AC_DEFUN([_AM_SET_OPTIONS],
|
||||
[AC_FOREACH([_AM_Option], [$1], [_AM_SET_OPTION(_AM_Option)])])
|
||||
|
||||
# _AM_IF_OPTION(OPTION, IF-SET, [IF-NOT-SET])
|
||||
# -------------------------------------------
|
||||
# Execute IF-SET if OPTION is set, IF-NOT-SET otherwise.
|
||||
AC_DEFUN([_AM_IF_OPTION],
|
||||
[m4_ifset(_AM_MANGLE_OPTION([$1]), [$2], [$3])])
|
||||
|
||||
# Check to make sure that the build environment is sane. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 1996, 1997, 2000, 2001, 2003, 2005
|
||||
# Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 4
|
||||
|
||||
# AM_SANITY_CHECK
|
||||
# ---------------
|
||||
AC_DEFUN([AM_SANITY_CHECK],
|
||||
[AC_MSG_CHECKING([whether build environment is sane])
|
||||
# Just in case
|
||||
sleep 1
|
||||
echo timestamp > conftest.file
|
||||
# Do `set' in a subshell so we don't clobber the current shell's
|
||||
# arguments. Must try -L first in case configure is actually a
|
||||
# symlink; some systems play weird games with the mod time of symlinks
|
||||
# (eg FreeBSD returns the mod time of the symlink's containing
|
||||
# directory).
|
||||
if (
|
||||
set X `ls -Lt $srcdir/configure conftest.file 2> /dev/null`
|
||||
if test "$[*]" = "X"; then
|
||||
# -L didn't work.
|
||||
set X `ls -t $srcdir/configure conftest.file`
|
||||
fi
|
||||
rm -f conftest.file
|
||||
if test "$[*]" != "X $srcdir/configure conftest.file" \
|
||||
&& test "$[*]" != "X conftest.file $srcdir/configure"; then
|
||||
|
||||
# If neither matched, then we have a broken ls. This can happen
|
||||
# if, for instance, CONFIG_SHELL is bash and it inherits a
|
||||
# broken ls alias from the environment. This has actually
|
||||
# happened. Such a system could not be considered "sane".
|
||||
AC_MSG_ERROR([ls -t appears to fail. Make sure there is not a broken
|
||||
alias in your environment])
|
||||
fi
|
||||
|
||||
test "$[2]" = conftest.file
|
||||
)
|
||||
then
|
||||
# Ok.
|
||||
:
|
||||
else
|
||||
AC_MSG_ERROR([newly created file is older than distributed files!
|
||||
Check your system clock])
|
||||
fi
|
||||
AC_MSG_RESULT(yes)])
|
||||
|
||||
# Copyright (C) 2001, 2003, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# AM_PROG_INSTALL_STRIP
|
||||
# ---------------------
|
||||
# One issue with vendor `install' (even GNU) is that you can't
|
||||
# specify the program used to strip binaries. This is especially
|
||||
# annoying in cross-compiling environments, where the build's strip
|
||||
# is unlikely to handle the host's binaries.
|
||||
# Fortunately install-sh will honor a STRIPPROG variable, so we
|
||||
# always use install-sh in `make install-strip', and initialize
|
||||
# STRIPPROG with the value of the STRIP variable (set by the user).
|
||||
AC_DEFUN([AM_PROG_INSTALL_STRIP],
|
||||
[AC_REQUIRE([AM_PROG_INSTALL_SH])dnl
|
||||
# Installed binaries are usually stripped using `strip' when the user
|
||||
# runs `make install-strip'. However `strip' might not be the right
|
||||
# tool to use in cross-compilation environments, therefore Automake
|
||||
# will honor the `STRIP' environment variable to overrule this program.
|
||||
dnl Don't test for $cross_compiling = yes, because it might be `maybe'.
|
||||
if test "$cross_compiling" != no; then
|
||||
AC_CHECK_TOOL([STRIP], [strip], :)
|
||||
fi
|
||||
INSTALL_STRIP_PROGRAM="\${SHELL} \$(install_sh) -c -s"
|
||||
AC_SUBST([INSTALL_STRIP_PROGRAM])])
|
||||
|
||||
# Check how to create a tarball. -*- Autoconf -*-
|
||||
|
||||
# Copyright (C) 2004, 2005 Free Software Foundation, Inc.
|
||||
#
|
||||
# This file is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# serial 2
|
||||
|
||||
# _AM_PROG_TAR(FORMAT)
|
||||
# --------------------
|
||||
# Check how to create a tarball in format FORMAT.
|
||||
# FORMAT should be one of `v7', `ustar', or `pax'.
|
||||
#
|
||||
# Substitute a variable $(am__tar) that is a command
|
||||
# writing to stdout a FORMAT-tarball containing the directory
|
||||
# $tardir.
|
||||
# tardir=directory && $(am__tar) > result.tar
|
||||
#
|
||||
# Substitute a variable $(am__untar) that extracts such
|
||||
# a tarball read from stdin.
|
||||
# $(am__untar) < result.tar
|
||||
AC_DEFUN([_AM_PROG_TAR],
|
||||
[# Always define AMTAR for backward compatibility.
|
||||
AM_MISSING_PROG([AMTAR], [tar])
|
||||
m4_if([$1], [v7],
|
||||
[am__tar='${AMTAR} chof - "$$tardir"'; am__untar='${AMTAR} xf -'],
|
||||
[m4_case([$1], [ustar],, [pax],,
|
||||
[m4_fatal([Unknown tar format])])
|
||||
AC_MSG_CHECKING([how to create a $1 tar archive])
|
||||
# Loop over all known methods to create a tar archive until one works.
|
||||
_am_tools='gnutar m4_if([$1], [ustar], [plaintar]) pax cpio none'
|
||||
_am_tools=${am_cv_prog_tar_$1-$_am_tools}
|
||||
# Do not fold the above two lines into one, because Tru64 sh and
|
||||
# Solaris sh will not grok spaces in the rhs of `-'.
|
||||
for _am_tool in $_am_tools
|
||||
do
|
||||
case $_am_tool in
|
||||
gnutar)
|
||||
for _am_tar in tar gnutar gtar;
|
||||
do
|
||||
AM_RUN_LOG([$_am_tar --version]) && break
|
||||
done
|
||||
am__tar="$_am_tar --format=m4_if([$1], [pax], [posix], [$1]) -chf - "'"$$tardir"'
|
||||
am__tar_="$_am_tar --format=m4_if([$1], [pax], [posix], [$1]) -chf - "'"$tardir"'
|
||||
am__untar="$_am_tar -xf -"
|
||||
;;
|
||||
plaintar)
|
||||
# Must skip GNU tar: if it does not support --format= it doesn't create
|
||||
# ustar tarball either.
|
||||
(tar --version) >/dev/null 2>&1 && continue
|
||||
am__tar='tar chf - "$$tardir"'
|
||||
am__tar_='tar chf - "$tardir"'
|
||||
am__untar='tar xf -'
|
||||
;;
|
||||
pax)
|
||||
am__tar='pax -L -x $1 -w "$$tardir"'
|
||||
am__tar_='pax -L -x $1 -w "$tardir"'
|
||||
am__untar='pax -r'
|
||||
;;
|
||||
cpio)
|
||||
am__tar='find "$$tardir" -print | cpio -o -H $1 -L'
|
||||
am__tar_='find "$tardir" -print | cpio -o -H $1 -L'
|
||||
am__untar='cpio -i -H $1 -d'
|
||||
;;
|
||||
none)
|
||||
am__tar=false
|
||||
am__tar_=false
|
||||
am__untar=false
|
||||
;;
|
||||
esac
|
||||
|
||||
# If the value was cached, stop now. We just wanted to have am__tar
|
||||
# and am__untar set.
|
||||
test -n "${am_cv_prog_tar_$1}" && break
|
||||
|
||||
# tar/untar a dummy directory, and stop if the command works
|
||||
rm -rf conftest.dir
|
||||
mkdir conftest.dir
|
||||
echo GrepMe > conftest.dir/file
|
||||
AM_RUN_LOG([tardir=conftest.dir && eval $am__tar_ >conftest.tar])
|
||||
rm -rf conftest.dir
|
||||
if test -s conftest.tar; then
|
||||
AM_RUN_LOG([$am__untar <conftest.tar])
|
||||
grep GrepMe conftest.dir/file >/dev/null 2>&1 && break
|
||||
fi
|
||||
done
|
||||
rm -rf conftest.dir
|
||||
|
||||
AC_CACHE_VAL([am_cv_prog_tar_$1], [am_cv_prog_tar_$1=$_am_tool])
|
||||
AC_MSG_RESULT([$am_cv_prog_tar_$1])])
|
||||
AC_SUBST([am__tar])
|
||||
AC_SUBST([am__untar])
|
||||
]) # _AM_PROG_TAR
|
||||
|
||||
m4_include([acinclude.m4])
|
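The probe loop in _AM_PROG_TAR above packs a dummy directory, unpacks it again, and greps for a marker string to decide whether a tool works. A minimal shell sketch of that round trip follows. It assumes a plain `tar` on PATH that accepts `chf -` and `xf -` (the `plaintar` branch). The `roundtrip` variable is introduced here for illustration only and is not part of the macro.

```shell
# Sketch of the tar round-trip probe, run in the current directory.
# Assumption: `tar' is on PATH and understands old-style `chf -' / `xf -'.
rm -rf conftest.dir conftest.tar
mkdir conftest.dir
echo GrepMe > conftest.dir/file

# Pack the directory named by $tardir to stdout, as $(am__tar) would.
tardir=conftest.dir
tar chf - "$tardir" > conftest.tar

# Remove the original so that only the archive can restore the file.
rm -rf conftest.dir

# Unpack from stdin, as $(am__untar) would, and look for the marker.
roundtrip=failed   # hypothetical result variable, not used by automake
if test -s conftest.tar; then
  tar xf - < conftest.tar
  if grep GrepMe conftest.dir/file >/dev/null 2>&1; then
    roundtrip=ok
  fi
fi

# Clean up the scratch files, as the macro does.
rm -rf conftest.dir conftest.tar
echo "tar round trip: $roundtrip"
```

The macro itself repeats this test for each candidate tool (gnutar, plaintar, pax, cpio) and stops at the first one that succeeds.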
41
ccmain/Makefile.am
Normal file
@ -0,0 +1,41 @@
|
||||
SUBDIRS =
|
||||
AM_CPPFLAGS = \
|
||||
-I$(top_srcdir)/ccutil -I$(top_srcdir)/ccstruct \
|
||||
-I$(top_srcdir)/image -I$(top_srcdir)/viewer \
|
||||
-I$(top_srcdir)/ccops -I$(top_srcdir)/dict \
|
||||
-I$(top_srcdir)/classify -I$(top_srcdir)/display \
|
||||
-I$(top_srcdir)/wordrec -I$(top_srcdir)/cutil \
|
||||
-I$(top_srcdir)/textord
|
||||
|
||||
EXTRA_DIST = \
|
||||
adaptions.h applybox.h baseapi.h blobcmp.h \
|
||||
callnet.h charcut.h \
|
||||
control.h docqual.h expandblob.h fixspace.h fixxht.h \
|
||||
imgscale.h matmatch.h output.h paircmp.h reject.h scaleimg.h \
|
||||
tessbox.h tessedit.h tesseractmain.h tessvars.h tfacep.h \
|
||||
tessembedded.h tfacepp.h tstruct.h werdit.h
|
||||
|
||||
noinst_LIBRARIES = libtesseract_main.a
|
||||
libtesseract_main_a_SOURCES = \
|
||||
tessedit.cpp adaptions.cpp applybox.cpp \
|
||||
baseapi.cpp blobcmp.cpp \
|
||||
callnet.cpp charcut.cpp charsample.cpp control.cpp \
|
||||
docqual.cpp expandblob.cpp fixspace.cpp fixxht.cpp \
|
||||
imgscale.cpp matmatch.cpp output.cpp paircmp.cpp \
|
||||
reject.cpp scaleimg.cpp tessbox.cpp tessvars.cpp \
|
||||
tfacepp.cpp tstruct.cpp werdit.cpp
|
||||
|
||||
bin_PROGRAMS = tesseract
|
||||
tesseract_SOURCES = tesseractmain.cpp
|
||||
tesseract_LDADD = \
|
||||
libtesseract_main.a \
|
||||
../display/libtesseract_display.a \
|
||||
../textord/libtesseract_textord.a \
|
||||
../wordrec/libtesseract_wordrec.a \
|
||||
../classify/libtesseract_classify.a \
|
||||
../dict/libtesseract_dict.a \
|
||||
../viewer/libtesseract_viewer.a \
|
||||
../image/libtesseract_image.a \
|
||||
../cutil/libtesseract_cutil.a \
|
||||
../ccstruct/libtesseract_ccstruct.a \
|
||||
../ccutil/libtesseract_ccutil.a
|
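The per-target variables above (`libtesseract_main_a_SOURCES`, `tesseract_LDADD`) follow Automake's naming rule: every character in the target name other than letters, digits, `@`, and underscore is turned into an underscore. A small shell sketch of that mapping follows. The `canonicalize` helper is hypothetical and only illustrates the rule; Automake does this internally.

```shell
# Hypothetical helper: map a target name to its Automake variable prefix.
canonicalize() {
  printf '%s\n' "$1" | sed 's/[^a-zA-Z0-9_@]/_/g'
}

lib_prefix=`canonicalize libtesseract_main.a`
prog_prefix=`canonicalize tesseract`

# These are the variable names Makefile.am must use for each target.
echo "${lib_prefix}_SOURCES"
echo "${prog_prefix}_LDADD"
```

This is why the library `libtesseract_main.a` is configured through `libtesseract_main_a_SOURCES`, with the dot replaced by an underscore.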
636
ccmain/Makefile.in
Normal file
@ -0,0 +1,636 @@
|
||||
# Makefile.in generated by automake 1.9.6 from Makefile.am.
|
||||
# @configure_input@
|
||||
|
||||
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
|
||||
# 2003, 2004, 2005 Free Software Foundation, Inc.
|
||||
# This Makefile.in is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# This program is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
|
||||
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
|
||||
# PARTICULAR PURPOSE.
|
||||
|
||||
@SET_MAKE@
|
||||
|
||||
|
||||
srcdir = @srcdir@
|
||||
top_srcdir = @top_srcdir@
|
||||
VPATH = @srcdir@
|
||||
pkgdatadir = $(datadir)/@PACKAGE@
|
||||
pkglibdir = $(libdir)/@PACKAGE@
|
||||
pkgincludedir = $(includedir)/@PACKAGE@
|
||||
top_builddir = ..
|
||||
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
|
||||
INSTALL = @INSTALL@
|
||||
install_sh_DATA = $(install_sh) -c -m 644
|
||||
install_sh_PROGRAM = $(install_sh) -c
|
||||
install_sh_SCRIPT = $(install_sh) -c
|
||||
INSTALL_HEADER = $(INSTALL_DATA)
|
||||
transform = $(program_transform_name)
|
||||
NORMAL_INSTALL = :
|
||||
PRE_INSTALL = :
|
||||
POST_INSTALL = :
|
||||
NORMAL_UNINSTALL = :
|
||||
PRE_UNINSTALL = :
|
||||
POST_UNINSTALL = :
|
||||
build_triplet = @build@
|
||||
host_triplet = @host@
|
||||
bin_PROGRAMS = tesseract$(EXEEXT)
|
||||
subdir = ccmain
|
||||
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
|
||||
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
|
||||
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
|
||||
$(top_srcdir)/config/ac_define_versionlevel.m4 \
|
||||
$(top_srcdir)/config/acinclude_custom.m4 \
|
||||
$(top_srcdir)/configure.ac
|
||||
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
|
||||
$(ACLOCAL_M4)
|
||||
mkinstalldirs = $(SHELL) $(top_srcdir)/config/mkinstalldirs
|
||||
CONFIG_HEADER = $(top_builddir)/config_auto.h
|
||||
CONFIG_CLEAN_FILES =
|
||||
LIBRARIES = $(noinst_LIBRARIES)
|
||||
AR = ar
|
||||
ARFLAGS = cru
|
||||
libtesseract_main_a_AR = $(AR) $(ARFLAGS)
|
||||
libtesseract_main_a_LIBADD =
|
||||
am_libtesseract_main_a_OBJECTS = tessedit.$(OBJEXT) \
|
||||
adaptions.$(OBJEXT) applybox.$(OBJEXT) baseapi.$(OBJEXT) \
|
||||
blobcmp.$(OBJEXT) callnet.$(OBJEXT) charcut.$(OBJEXT) \
|
||||
charsample.$(OBJEXT) control.$(OBJEXT) docqual.$(OBJEXT) \
|
||||
expandblob.$(OBJEXT) fixspace.$(OBJEXT) fixxht.$(OBJEXT) \
|
||||
imgscale.$(OBJEXT) matmatch.$(OBJEXT) output.$(OBJEXT) \
|
||||
paircmp.$(OBJEXT) reject.$(OBJEXT) scaleimg.$(OBJEXT) \
|
||||
tessbox.$(OBJEXT) tessvars.$(OBJEXT) tfacepp.$(OBJEXT) \
|
||||
tstruct.$(OBJEXT) werdit.$(OBJEXT)
|
||||
libtesseract_main_a_OBJECTS = $(am_libtesseract_main_a_OBJECTS)
|
||||
am__installdirs = "$(DESTDIR)$(bindir)"
|
||||
binPROGRAMS_INSTALL = $(INSTALL_PROGRAM)
|
||||
PROGRAMS = $(bin_PROGRAMS)
|
||||
am_tesseract_OBJECTS = tesseractmain.$(OBJEXT)
|
||||
tesseract_OBJECTS = $(am_tesseract_OBJECTS)
|
||||
tesseract_DEPENDENCIES = libtesseract_main.a \
|
||||
../display/libtesseract_display.a \
|
||||
../textord/libtesseract_textord.a \
|
||||
../wordrec/libtesseract_wordrec.a \
|
||||
../classify/libtesseract_classify.a \
|
||||
../dict/libtesseract_dict.a ../viewer/libtesseract_viewer.a \
|
||||
../image/libtesseract_image.a ../cutil/libtesseract_cutil.a \
|
||||
../ccstruct/libtesseract_ccstruct.a \
|
||||
../ccutil/libtesseract_ccutil.a
|
||||
DEFAULT_INCLUDES = -I. -I$(srcdir) -I$(top_builddir)
|
||||
depcomp = $(SHELL) $(top_srcdir)/config/depcomp
|
||||
am__depfiles_maybe = depfiles
|
||||
CXXCOMPILE = $(CXX) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
|
||||
$(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS)
|
||||
CXXLD = $(CXX)
|
||||
CXXLINK = $(CXXLD) $(AM_CXXFLAGS) $(CXXFLAGS) $(AM_LDFLAGS) $(LDFLAGS) \
|
||||
-o $@
|
||||
SOURCES = $(libtesseract_main_a_SOURCES) $(tesseract_SOURCES)
|
||||
DIST_SOURCES = $(libtesseract_main_a_SOURCES) $(tesseract_SOURCES)
|
||||
RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
|
||||
html-recursive info-recursive install-data-recursive \
|
||||
install-exec-recursive install-info-recursive \
|
||||
install-recursive installcheck-recursive installdirs-recursive \
|
||||
pdf-recursive ps-recursive uninstall-info-recursive \
|
||||
uninstall-recursive
|
||||
ETAGS = etags
|
||||
CTAGS = ctags
|
||||
DIST_SUBDIRS = $(SUBDIRS)
|
||||
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
|
||||
ACLOCAL = @ACLOCAL@
|
||||
AMDEP_FALSE = @AMDEP_FALSE@
|
||||
AMDEP_TRUE = @AMDEP_TRUE@
|
||||
AMTAR = @AMTAR@
|
||||
AUTOCONF = @AUTOCONF@
|
||||
AUTOHEADER = @AUTOHEADER@
|
||||
AUTOMAKE = @AUTOMAKE@
|
||||
AWK = @AWK@
|
||||
CC = @CC@
|
||||
CCDEPMODE = @CCDEPMODE@
|
||||
CFLAGS = @CFLAGS@
|
||||
CPPFLAGS = @CPPFLAGS@
|
||||
CXX = @CXX@
|
||||
CXXCPP = @CXXCPP@
|
||||
CXXDEPMODE = @CXXDEPMODE@
|
||||
CXXFLAGS = @CXXFLAGS@
|
||||
CXXRPOFLAGS = @CXXRPOFLAGS@
|
||||
CYGPATH_W = @CYGPATH_W@
|
||||
DEFS = @DEFS@
|
||||
DEPDIR = @DEPDIR@
|
||||
ECHO_C = @ECHO_C@
|
||||
ECHO_N = @ECHO_N@
|
||||
ECHO_T = @ECHO_T@
|
||||
EGREP = @EGREP@
|
||||
EXEEXT = @EXEEXT@
|
||||
GNUWIN32_DIR = @GNUWIN32_DIR@
|
||||
HAVE_GNUWIN32_FALSE = @HAVE_GNUWIN32_FALSE@
|
||||
HAVE_GNUWIN32_TRUE = @HAVE_GNUWIN32_TRUE@
|
||||
HAVE_LIBTIFF_FALSE = @HAVE_LIBTIFF_FALSE@
|
||||
HAVE_LIBTIFF_TRUE = @HAVE_LIBTIFF_TRUE@
|
||||
INSTALL_DATA = @INSTALL_DATA@
|
||||
INSTALL_PROGRAM = @INSTALL_PROGRAM@
|
||||
INSTALL_SCRIPT = @INSTALL_SCRIPT@
|
||||
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
|
||||
LDFLAGS = @LDFLAGS@
|
||||
LIBOBJS = @LIBOBJS@
|
||||
LIBS = @LIBS@
|
||||
LIBTIFF_CFLAGS = @LIBTIFF_CFLAGS@
|
||||
LIBTIFF_LIBS = @LIBTIFF_LIBS@
|
||||
LTLIBOBJS = @LTLIBOBJS@
|
||||
MAINT = @MAINT@
|
||||
MAINTAINER_MODE_FALSE = @MAINTAINER_MODE_FALSE@
|
||||
MAINTAINER_MODE_TRUE = @MAINTAINER_MODE_TRUE@
|
||||
MAKEINFO = @MAKEINFO@
|
||||
OBJEXT = @OBJEXT@
|
||||
OPTS = @OPTS@
|
||||
PACKAGE = @PACKAGE@
|
||||
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
|
||||
PACKAGE_DATE = @PACKAGE_DATE@
|
||||
PACKAGE_NAME = @PACKAGE_NAME@
|
||||
PACKAGE_STRING = @PACKAGE_STRING@
|
||||
PACKAGE_TARNAME = @PACKAGE_TARNAME@
|
||||
PACKAGE_VERSION = @PACKAGE_VERSION@
|
||||
PACKAGE_YEAR = @PACKAGE_YEAR@
|
||||
PATH_SEPARATOR = @PATH_SEPARATOR@
|
||||
RANLIB = @RANLIB@
|
||||
RPO_NO = @RPO_NO@
|
||||
RPO_YES = @RPO_YES@
|
||||
SET_MAKE = @SET_MAKE@
|
||||
SHELL = @SHELL@
|
||||
STRIP = @STRIP@
|
||||
USING_CL_FALSE = @USING_CL_FALSE@
|
||||
USING_CL_TRUE = @USING_CL_TRUE@
|
||||
VERSION = @VERSION@
|
||||
ac_ct_CC = @ac_ct_CC@
|
||||
ac_ct_CXX = @ac_ct_CXX@
|
||||
ac_ct_RANLIB = @ac_ct_RANLIB@
|
||||
ac_ct_STRIP = @ac_ct_STRIP@
|
||||
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
|
||||
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
|
||||
am__fastdepCXX_FALSE = @am__fastdepCXX_FALSE@
|
||||
am__fastdepCXX_TRUE = @am__fastdepCXX_TRUE@
|
||||
am__include = @am__include@
|
||||
am__leading_dot = @am__leading_dot@
|
||||
am__quote = @am__quote@
|
||||
am__tar = @am__tar@
|
||||
am__untar = @am__untar@
|
||||
bindir = @bindir@
|
||||
build = @build@
|
||||
build_alias = @build_alias@
|
||||
build_cpu = @build_cpu@
|
||||
build_os = @build_os@
|
||||
build_vendor = @build_vendor@
|
||||
datadir = @datadir@
|
||||
exec_prefix = @exec_prefix@
|
||||
host = @host@
|
||||
host_alias = @host_alias@
|
||||
host_cpu = @host_cpu@
|
||||
host_os = @host_os@
|
||||
host_vendor = @host_vendor@
|
||||
includedir = @includedir@
|
||||
infodir = @infodir@
|
||||
install_sh = @install_sh@
|
||||
libdir = @libdir@
|
||||
libexecdir = @libexecdir@
|
||||
localstatedir = @localstatedir@
|
||||
mandir = @mandir@
|
||||
mkdir_p = @mkdir_p@
|
||||
oldincludedir = @oldincludedir@
|
||||
prefix = @prefix@
|
||||
program_transform_name = @program_transform_name@
|
||||
sbindir = @sbindir@
|
||||
sharedstatedir = @sharedstatedir@
|
||||
sysconfdir = @sysconfdir@
|
||||
target_alias = @target_alias@
|
||||
SUBDIRS =
|
||||
AM_CPPFLAGS = \
|
||||
-I$(top_srcdir)/ccutil -I$(top_srcdir)/ccstruct \
|
||||
-I$(top_srcdir)/image -I$(top_srcdir)/viewer \
|
||||
-I$(top_srcdir)/ccops -I$(top_srcdir)/dict \
|
||||
-I$(top_srcdir)/classify -I$(top_srcdir)/display \
|
||||
-I$(top_srcdir)/wordrec -I$(top_srcdir)/cutil \
|
||||
-I$(top_srcdir)/textord
|
||||
|
||||
EXTRA_DIST = \
|
||||
adaptions.h applybox.h baseapi.h blobcmp.h \
|
||||
callnet.h charcut.h \
|
||||
control.h docqual.h expandblob.h fixspace.h fixxht.h \
|
||||
imgscale.h matmatch.h output.h paircmp.h reject.h scaleimg.h \
|
||||
tessbox.h tessedit.h tesseractmain.h tessvars.h tfacep.h \
|
||||
tessembedded.h tfacepp.h tstruct.h werdit.h
|
||||
|
||||
noinst_LIBRARIES = libtesseract_main.a
|
||||
libtesseract_main_a_SOURCES = \
|
||||
tessedit.cpp adaptions.cpp applybox.cpp \
|
||||
baseapi.cpp blobcmp.cpp \
|
||||
callnet.cpp charcut.cpp charsample.cpp control.cpp \
|
||||
docqual.cpp expandblob.cpp fixspace.cpp fixxht.cpp \
|
||||
imgscale.cpp matmatch.cpp output.cpp paircmp.cpp \
|
||||
reject.cpp scaleimg.cpp tessbox.cpp tessvars.cpp \
|
||||
tfacepp.cpp tstruct.cpp werdit.cpp
|
||||
|
||||
tesseract_SOURCES = tesseractmain.cpp
|
||||
tesseract_LDADD = \
|
||||
libtesseract_main.a \
|
||||
../display/libtesseract_display.a \
|
||||
../textord/libtesseract_textord.a \
|
||||
../wordrec/libtesseract_wordrec.a \
|
||||
../classify/libtesseract_classify.a \
|
||||
../dict/libtesseract_dict.a \
|
||||
../viewer/libtesseract_viewer.a \
|
||||
../image/libtesseract_image.a \
|
||||
../cutil/libtesseract_cutil.a \
|
||||
../ccstruct/libtesseract_ccstruct.a \
|
||||
../ccutil/libtesseract_ccutil.a
|
||||
|
||||
all: all-recursive
|
||||
|
||||
.SUFFIXES:
|
||||
.SUFFIXES: .cpp .o .obj
|
||||
$(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am $(am__configure_deps)
|
||||
@for dep in $?; do \
|
||||
case '$(am__configure_deps)' in \
|
||||
*$$dep*) \
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
|
||||
&& exit 0; \
|
||||
exit 1;; \
|
||||
esac; \
|
||||
done; \
|
||||
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu ccmain/Makefile'; \
|
||||
cd $(top_srcdir) && \
|
||||
$(AUTOMAKE) --gnu ccmain/Makefile
|
||||
.PRECIOUS: Makefile
|
||||
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
|
||||
@case '$?' in \
|
||||
*config.status*) \
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
|
||||
*) \
|
||||
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
|
||||
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
|
||||
esac;
|
||||
|
||||
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
|
||||
|
||||
$(top_srcdir)/configure: @MAINTAINER_MODE_TRUE@ $(am__configure_deps)
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
|
||||
$(ACLOCAL_M4): @MAINTAINER_MODE_TRUE@ $(am__aclocal_m4_deps)
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
|
||||
|
||||
clean-noinstLIBRARIES:
|
||||
-test -z "$(noinst_LIBRARIES)" || rm -f $(noinst_LIBRARIES)
|
||||
libtesseract_main.a: $(libtesseract_main_a_OBJECTS) $(libtesseract_main_a_DEPENDENCIES)
|
||||
-rm -f libtesseract_main.a
|
||||
$(libtesseract_main_a_AR) libtesseract_main.a $(libtesseract_main_a_OBJECTS) $(libtesseract_main_a_LIBADD)
|
||||
$(RANLIB) libtesseract_main.a
|
||||
install-binPROGRAMS: $(bin_PROGRAMS)
|
||||
@$(NORMAL_INSTALL)
|
||||
test -z "$(bindir)" || $(mkdir_p) "$(DESTDIR)$(bindir)"
|
||||
@list='$(bin_PROGRAMS)'; for p in $$list; do \
|
||||
p1=`echo $$p|sed 's/$(EXEEXT)$$//'`; \
|
||||
if test -f $$p \
|
||||
; then \
|
||||
f=`echo "$$p1" | sed 's,^.*/,,;$(transform);s/$$/$(EXEEXT)/'`; \
|
||||
echo " $(INSTALL_PROGRAM_ENV) $(binPROGRAMS_INSTALL) '$$p' '$(DESTDIR)$(bindir)/$$f'"; \
|
||||
$(INSTALL_PROGRAM_ENV) $(binPROGRAMS_INSTALL) "$$p" "$(DESTDIR)$(bindir)/$$f" || exit 1; \
|
||||
else :; fi; \
|
||||
done
|
||||
|
||||
uninstall-binPROGRAMS:
|
||||
@$(NORMAL_UNINSTALL)
|
||||
@list='$(bin_PROGRAMS)'; for p in $$list; do \
|
||||
f=`echo "$$p" | sed 's,^.*/,,;s/$(EXEEXT)$$//;$(transform);s/$$/$(EXEEXT)/'`; \
|
||||
echo " rm -f '$(DESTDIR)$(bindir)/$$f'"; \
|
||||
rm -f "$(DESTDIR)$(bindir)/$$f"; \
|
||||
done
|
||||
|
||||
clean-binPROGRAMS:
|
||||
-test -z "$(bin_PROGRAMS)" || rm -f $(bin_PROGRAMS)
|
||||
tesseract$(EXEEXT): $(tesseract_OBJECTS) $(tesseract_DEPENDENCIES)
|
||||
@rm -f tesseract$(EXEEXT)
|
||||
$(CXXLINK) $(tesseract_LDFLAGS) $(tesseract_OBJECTS) $(tesseract_LDADD) $(LIBS)
|
||||
|
||||
mostlyclean-compile:
|
||||
-rm -f *.$(OBJEXT)
|
||||
|
||||
distclean-compile:
|
||||
-rm -f *.tab.c
|
||||
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/adaptions.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/applybox.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/baseapi.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/blobcmp.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/callnet.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/charcut.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/charsample.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/control.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/docqual.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/expandblob.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fixspace.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fixxht.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/imgscale.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/matmatch.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/output.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/paircmp.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/reject.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scaleimg.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tessbox.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tessedit.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tesseractmain.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tessvars.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tfacepp.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tstruct.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/werdit.Po@am__quote@
|
||||
|
||||
.cpp.o:
|
||||
@am__fastdepCXX_TRUE@ if $(CXXCOMPILE) -MT $@ -MD -MP -MF "$(DEPDIR)/$*.Tpo" -c -o $@ $<; \
|
||||
@am__fastdepCXX_TRUE@ then mv -f "$(DEPDIR)/$*.Tpo" "$(DEPDIR)/$*.Po"; else rm -f "$(DEPDIR)/$*.Tpo"; exit 1; fi
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||
@am__fastdepCXX_FALSE@ $(CXXCOMPILE) -c -o $@ $<
|
||||
|
||||
.cpp.obj:
|
||||
@am__fastdepCXX_TRUE@ if $(CXXCOMPILE) -MT $@ -MD -MP -MF "$(DEPDIR)/$*.Tpo" -c -o $@ `$(CYGPATH_W) '$<'`; \
|
||||
@am__fastdepCXX_TRUE@ then mv -f "$(DEPDIR)/$*.Tpo" "$(DEPDIR)/$*.Po"; else rm -f "$(DEPDIR)/$*.Tpo"; exit 1; fi
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||
@am__fastdepCXX_FALSE@ $(CXXCOMPILE) -c -o $@ `$(CYGPATH_W) '$<'`
|
||||
uninstall-info-am:
|
||||
|
||||
# This directory's subdirectories are mostly independent; you can cd
|
||||
# into them and run `make' without going through this Makefile.
|
||||
# To change the values of `make' variables: instead of editing Makefiles,
|
||||
# (1) if the variable is set in `config.status', edit `config.status'
|
||||
# (which will cause the Makefiles to be regenerated when you run `make');
|
||||
# (2) otherwise, pass the desired values on the `make' command line.
|
||||
$(RECURSIVE_TARGETS):
|
||||
@failcom='exit 1'; \
|
||||
for f in x $$MAKEFLAGS; do \
|
||||
case $$f in \
|
||||
*=* | --[!k]*);; \
|
||||
*k*) failcom='fail=yes';; \
|
||||
esac; \
|
||||
done; \
|
||||
dot_seen=no; \
|
||||
target=`echo $@ | sed s/-recursive//`; \
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
echo "Making $$target in $$subdir"; \
|
||||
if test "$$subdir" = "."; then \
|
||||
dot_seen=yes; \
|
||||
local_target="$$target-am"; \
|
||||
else \
|
||||
local_target="$$target"; \
|
||||
fi; \
|
||||
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|
||||
|| eval $$failcom; \
|
||||
done; \
|
||||
if test "$$dot_seen" = "no"; then \
|
||||
$(MAKE) $(AM_MAKEFLAGS) "$$target-am" || exit 1; \
|
||||
fi; test -z "$$fail"
|
||||
|
||||
mostlyclean-recursive clean-recursive distclean-recursive \
|
||||
maintainer-clean-recursive:
|
||||
@failcom='exit 1'; \
|
||||
for f in x $$MAKEFLAGS; do \
|
||||
case $$f in \
|
||||
*=* | --[!k]*);; \
|
||||
*k*) failcom='fail=yes';; \
|
||||
esac; \
|
||||
done; \
|
||||
dot_seen=no; \
|
||||
case "$@" in \
|
||||
distclean-* | maintainer-clean-*) list='$(DIST_SUBDIRS)' ;; \
|
||||
*) list='$(SUBDIRS)' ;; \
|
||||
esac; \
|
||||
rev=''; for subdir in $$list; do \
|
||||
if test "$$subdir" = "."; then :; else \
|
||||
rev="$$subdir $$rev"; \
|
||||
fi; \
|
||||
done; \
|
||||
rev="$$rev ."; \
|
||||
target=`echo $@ | sed s/-recursive//`; \
|
||||
for subdir in $$rev; do \
|
||||
echo "Making $$target in $$subdir"; \
|
||||
if test "$$subdir" = "."; then \
|
||||
local_target="$$target-am"; \
|
||||
else \
|
||||
local_target="$$target"; \
|
||||
fi; \
|
||||
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|
||||
|| eval $$failcom; \
|
||||
done && test -z "$$fail"
|
||||
tags-recursive:
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) tags); \
|
||||
done
|
||||
ctags-recursive:
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) ctags); \
|
||||
done
|
||||
|
||||
ID: $(HEADERS) $(SOURCES) $(LISP) $(TAGS_FILES)
|
||||
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
|
||||
unique=`for i in $$list; do \
|
||||
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
|
||||
done | \
|
||||
$(AWK) ' { files[$$0] = 1; } \
|
||||
END { for (i in files) print i; }'`; \
|
||||
mkid -fID $$unique
|
||||
tags: TAGS
|
||||
|
||||
TAGS: tags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
|
||||
$(TAGS_FILES) $(LISP)
|
||||
tags=; \
|
||||
here=`pwd`; \
|
||||
if ($(ETAGS) --etags-include --version) >/dev/null 2>&1; then \
|
||||
include_option=--etags-include; \
|
||||
empty_fix=.; \
|
||||
else \
|
||||
include_option=--include; \
|
||||
empty_fix=; \
|
||||
fi; \
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
if test "$$subdir" = .; then :; else \
|
||||
test ! -f $$subdir/TAGS || \
|
||||
tags="$$tags $$include_option=$$here/$$subdir/TAGS"; \
|
||||
fi; \
|
||||
done; \
|
||||
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
|
||||
unique=`for i in $$list; do \
|
||||
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
|
||||
done | \
|
||||
$(AWK) ' { files[$$0] = 1; } \
|
||||
END { for (i in files) print i; }'`; \
|
||||
if test -z "$(ETAGS_ARGS)$$tags$$unique"; then :; else \
|
||||
test -n "$$unique" || unique=$$empty_fix; \
|
||||
$(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
|
||||
$$tags $$unique; \
|
||||
fi
|
||||
ctags: CTAGS
|
||||
CTAGS: ctags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
|
||||
$(TAGS_FILES) $(LISP)
|
||||
tags=; \
|
||||
here=`pwd`; \
|
||||
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
|
||||
unique=`for i in $$list; do \
|
||||
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
|
||||
done | \
|
||||
$(AWK) ' { files[$$0] = 1; } \
|
||||
END { for (i in files) print i; }'`; \
|
||||
test -z "$(CTAGS_ARGS)$$tags$$unique" \
|
||||
|| $(CTAGS) $(CTAGSFLAGS) $(AM_CTAGSFLAGS) $(CTAGS_ARGS) \
|
||||
$$tags $$unique
|
||||
|
||||
GTAGS:
|
||||
here=`$(am__cd) $(top_builddir) && pwd` \
|
||||
&& cd $(top_srcdir) \
|
||||
&& gtags -i $(GTAGS_ARGS) $$here
|
||||
|
||||
distclean-tags:
|
||||
-rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
|
||||
|
||||
distdir: $(DISTFILES)
|
||||
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
|
||||
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
|
||||
list='$(DISTFILES)'; for file in $$list; do \
|
||||
case $$file in \
|
||||
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
|
||||
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
|
||||
esac; \
|
||||
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
|
||||
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
|
||||
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
|
||||
dir="/$$dir"; \
|
||||
$(mkdir_p) "$(distdir)$$dir"; \
|
||||
else \
|
||||
dir=''; \
|
||||
fi; \
|
||||
if test -d $$d/$$file; then \
|
||||
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
|
||||
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
|
||||
fi; \
|
||||
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
|
||||
else \
|
||||
test -f $(distdir)/$$file \
|
||||
|| cp -p $$d/$$file $(distdir)/$$file \
|
||||
|| exit 1; \
|
||||
fi; \
|
||||
done
|
||||
list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
|
||||
if test "$$subdir" = .; then :; else \
|
||||
test -d "$(distdir)/$$subdir" \
|
||||
|| $(mkdir_p) "$(distdir)/$$subdir" \
|
||||
|| exit 1; \
|
||||
distdir=`$(am__cd) $(distdir) && pwd`; \
|
||||
top_distdir=`$(am__cd) $(top_distdir) && pwd`; \
|
||||
(cd $$subdir && \
|
||||
$(MAKE) $(AM_MAKEFLAGS) \
|
||||
top_distdir="$$top_distdir" \
|
||||
distdir="$$distdir/$$subdir" \
|
||||
distdir) \
|
||||
|| exit 1; \
|
||||
fi; \
|
||||
done
|
||||
check-am: all-am
|
||||
check: check-recursive
|
||||
all-am: Makefile $(LIBRARIES) $(PROGRAMS)
|
||||
installdirs: installdirs-recursive
|
||||
installdirs-am:
|
||||
for dir in "$(DESTDIR)$(bindir)"; do \
|
||||
test -z "$$dir" || $(mkdir_p) "$$dir"; \
|
||||
done
|
||||
install: install-recursive
|
||||
install-exec: install-exec-recursive
|
||||
install-data: install-data-recursive
|
||||
uninstall: uninstall-recursive
|
||||
|
||||
install-am: all-am
|
||||
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
|
||||
|
||||
installcheck: installcheck-recursive
|
||||
install-strip:
|
||||
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
|
||||
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
|
||||
`test -z '$(STRIP)' || \
|
||||
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
|
||||
mostlyclean-generic:
|
||||
|
||||
clean-generic:
|
||||
|
||||
distclean-generic:
|
||||
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
|
||||
|
||||
maintainer-clean-generic:
|
||||
@echo "This command is intended for maintainers to use"
|
||||
@echo "it deletes files that may require special tools to rebuild."
|
||||
clean: clean-recursive
|
||||
|
||||
clean-am: clean-binPROGRAMS clean-generic clean-noinstLIBRARIES \
|
||||
mostlyclean-am
|
||||
|
||||
distclean: distclean-recursive
|
||||
-rm -rf ./$(DEPDIR)
|
||||
-rm -f Makefile
|
||||
distclean-am: clean-am distclean-compile distclean-generic \
|
||||
distclean-tags
|
||||
|
||||
dvi: dvi-recursive
|
||||
|
||||
dvi-am:
|
||||
|
||||
html: html-recursive
|
||||
|
||||
info: info-recursive
|
||||
|
||||
info-am:
|
||||
|
||||
install-data-am:
|
||||
|
||||
install-exec-am: install-binPROGRAMS
|
||||
|
||||
install-info: install-info-recursive
|
||||
|
||||
install-man:
|
||||
|
||||
installcheck-am:
|
||||
|
||||
maintainer-clean: maintainer-clean-recursive
|
||||
-rm -rf ./$(DEPDIR)
|
||||
-rm -f Makefile
|
||||
maintainer-clean-am: distclean-am maintainer-clean-generic
|
||||
|
||||
mostlyclean: mostlyclean-recursive
|
||||
|
||||
mostlyclean-am: mostlyclean-compile mostlyclean-generic
|
||||
|
||||
pdf: pdf-recursive
|
||||
|
||||
pdf-am:
|
||||
|
||||
ps: ps-recursive
|
||||
|
||||
ps-am:
|
||||
|
||||
uninstall-am: uninstall-binPROGRAMS uninstall-info-am
|
||||
|
||||
uninstall-info: uninstall-info-recursive
|
||||
|
||||
.PHONY: $(RECURSIVE_TARGETS) CTAGS GTAGS all all-am check check-am \
|
||||
clean clean-binPROGRAMS clean-generic clean-noinstLIBRARIES \
|
||||
clean-recursive ctags ctags-recursive distclean \
|
||||
distclean-compile distclean-generic distclean-recursive \
|
||||
distclean-tags distdir dvi dvi-am html html-am info info-am \
|
||||
install install-am install-binPROGRAMS install-data \
|
||||
install-data-am install-exec install-exec-am install-info \
|
||||
install-info-am install-man install-strip installcheck \
|
||||
installcheck-am installdirs installdirs-am maintainer-clean \
|
||||
maintainer-clean-generic maintainer-clean-recursive \
|
||||
mostlyclean mostlyclean-compile mostlyclean-generic \
|
||||
mostlyclean-recursive pdf pdf-am ps ps-am tags tags-recursive \
|
||||
uninstall uninstall-am uninstall-binPROGRAMS uninstall-info-am
|
||||
|
||||
# Tell versions [3.59,3.63) of GNU make to not export all variables.
|
||||
# Otherwise a system limit (for SysV at least) may be exceeded.
|
||||
.NOEXPORT:
|
1078
ccmain/adaptions.cpp
Normal file
File diff suppressed because it is too large
109
ccmain/adaptions.h
Normal file
@ -0,0 +1,109 @@
|
||||
/**********************************************************************
|
||||
* File: adaptions.h (Formerly adaptions.h)
|
||||
* Description: Functions used to adapt to blobs already confidently
|
||||
* identified
|
||||
* Author: Chris Newton
|
||||
* Created: Thu Oct 7 10:17:28 BST 1993
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef ADAPTIONS_H
|
||||
#define ADAPTIONS_H
|
||||
|
||||
#include "charsample.h"
|
||||
#include "charcut.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern BOOL_VAR_H (tessedit_reject_ems, FALSE, "Reject all m's");
|
||||
extern BOOL_VAR_H (tessedit_reject_suspect_ems, FALSE, "Reject suspect m's");
|
||||
extern double_VAR_H (tessedit_cluster_t1, 0.20,
|
||||
"t1 threshold for clustering samples");
|
||||
extern double_VAR_H (tessedit_cluster_t2, 0.40,
|
||||
"t2 threshold for clustering samples");
|
||||
extern double_VAR_H (tessedit_cluster_t3, 0.12,
|
||||
"Extra threshold for clustering samples, only keep a new sample if best score greater than this value");
|
||||
extern double_VAR_H (tessedit_cluster_accept_fraction, 0.80,
|
||||
"Largest fraction of characters in cluster for it to be used for adaption");
|
||||
extern INT_VAR_H (tessedit_cluster_min_size, 3,
|
||||
"Smallest number of samples in a cluster for it to be used for adaption");
|
||||
extern BOOL_VAR_H (tessedit_cluster_debug, FALSE,
|
||||
"Generate and print debug information for adaption by clustering");
|
||||
extern BOOL_VAR_H (tessedit_use_best_sample, FALSE,
|
||||
"Use best sample from cluster when adapting");
|
||||
extern BOOL_VAR_H (tessedit_test_cluster_input, FALSE,
|
||||
"Set reject map to enable cluster input to be measured");
|
||||
extern BOOL_VAR_H (tessedit_matrix_match, TRUE, "Use matrix matcher");
|
||||
extern BOOL_VAR_H (tessedit_old_matrix_match, FALSE, "Use matrix matcher");
|
||||
extern BOOL_VAR_H (tessedit_mm_use_non_adaption_set, FALSE,
|
||||
"Don't try to adapt to characters on this list");
|
||||
extern STRING_VAR_H (tessedit_non_adaption_set, ",.;:'~@*",
|
||||
"Characters to be avoided when adapting");
|
||||
extern BOOL_VAR_H (tessedit_mm_adapt_using_prototypes, TRUE,
|
||||
"Use prototypes when adapting");
|
||||
extern BOOL_VAR_H (tessedit_mm_use_prototypes, TRUE,
|
||||
"Use prototypes as clusters are built");
|
||||
extern BOOL_VAR_H (tessedit_mm_use_rejmap, FALSE,
|
||||
"Adapt to characters using reject map");
|
||||
extern BOOL_VAR_H (tessedit_mm_all_rejects, FALSE,
|
||||
"Adapt to all characters using matrix matcher");
|
||||
extern BOOL_VAR_H (tessedit_mm_only_match_same_char, FALSE,
|
||||
"Only match samples against clusters for the same character");
|
||||
extern BOOL_VAR_H (tessedit_process_rns, FALSE, "Handle m - rn ambigs");
|
||||
extern BOOL_VAR_H (tessedit_demo_adaption, FALSE,
|
||||
"Display cut images and matrix match for demo purposes");
|
||||
extern INT_VAR_H (tessedit_demo_word1, 62,
|
||||
"Word number of first word to display");
|
||||
extern INT_VAR_H (tessedit_demo_word2, 64,
|
||||
"Word number of second word to display");
|
||||
extern STRING_VAR_H (tessedit_demo_file, "academe",
|
||||
"Name of document containing demo words");
|
||||
BOOL8 word_adaptable( //should we adapt?
|
||||
WERD_RES *word,
|
||||
UINT16 mode);
|
||||
void collect_ems_for_adaption(WERD_RES *word,
|
||||
CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
void collect_characters_for_adaption(WERD_RES *word,
|
||||
CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
void cluster_sample(CHAR_SAMPLE *sample,
|
||||
CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
void check_wait_list(CHAR_SAMPLE_LIST *chars_waiting,
|
||||
CHAR_SAMPLE *sample,
|
||||
CHAR_SAMPLES *best_cluster);
|
||||
void complete_clustering(CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
void adapt_to_good_ems(WERD_RES *word,
|
||||
CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
void adapt_to_good_samples(WERD_RES *word,
|
||||
CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
void print_em_stats(CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
//lines of the image
|
||||
CHAR_SAMPLE *clip_sample(PIXROW *pixrow,
|
||||
IMAGELINE *imlines,
|
||||
BOX pix_box, //box of imlines extent
|
||||
BOOL8 white_on_black,
|
||||
char c);
|
||||
void display_cluster_prototypes(CHAR_SAMPLES_LIST *char_clusters);
|
||||
void reject_all_ems(WERD_RES *word);
|
||||
void reject_all_fullstops(WERD_RES *word);
|
||||
void reject_suspect_ems(WERD_RES *word);
|
||||
void reject_suspect_fullstops(WERD_RES *word);
|
||||
BOOL8 suspect_em(WERD_RES *word, INT16 index);
|
||||
BOOL8 suspect_fullstop(WERD_RES *word, INT16 i);
|
||||
#endif
|
859
ccmain/applybox.cpp
Normal file
@ -0,0 +1,859 @@
|
||||
/**********************************************************************
|
||||
* File: applybox.cpp (Formerly applybox.c)
|
||||
* Description: Resegment rows according to box file data
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 24 09:11:23 GMT 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
/*
|
||||
define SECURE_NAMES for code versions which go to UNLV to stop tessedit
|
||||
including all the newdiff stuff (which contains lots of text indicating
|
||||
what measures we are interested in).
|
||||
*/
|
||||
/* #define SECURE_NAMES done in secnames.h when necessary*/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "applybox.h"
|
||||
#include <ctype.h>
|
||||
#include <string.h>
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#include <errno.h>
|
||||
#endif
|
||||
#include "mainblk.h"
|
||||
#include "genblob.h"
|
||||
#include "fixxht.h"
|
||||
#include "control.h"
|
||||
#include "tessbox.h"
|
||||
#include "globals.h"
|
||||
#include "secname.h"
|
||||
|
||||
#define SECURE_NAMES
|
||||
#ifndef SECURE_NAMES
|
||||
#include "wordstats.h"
|
||||
#endif
|
||||
|
||||
#define EXTERN
|
||||
EXTERN BOOL_VAR (applybox_rebalance, TRUE, "Drop dead");
|
||||
EXTERN INT_VAR (applybox_debug, 0, "Debug level");
|
||||
EXTERN STRING_VAR (applybox_test_exclusions, "|",
|
||||
"Chars ignored for testing");
|
||||
EXTERN double_VAR (applybox_error_band, 0.15, "Err band as fract of xht");
|
||||
|
||||
/*************************************************************************
|
||||
* The code re-assigns outlines to form words each with ONE labelled blob.
|
||||
* Noise is left in UNLABELLED words. The chars on the page are checked crudely
|
||||
* for sensible position relative to baseline and xht. Failed boxes are
|
||||
* compensated for by duplicating other believable instances of the character.
|
||||
*
|
||||
* The box file is assumed to contain box definitions, one per line, of the
|
||||
* following format:
|
||||
* <Char> <left> <bottom> <right> <top> ... arbitrary trailing fields unused
|
||||
*
|
||||
* The approach taken is to search the WHOLE page for stuff overlapping each box.
|
||||
* - This is not too inefficient and is SAFE.
|
||||
* - We can detect overlapping blobs as we will be attempting to put a blob
|
||||
* from a LABELLED word into the current word.
|
||||
* - When all the boxes have been processed we can detect any stuff which is
|
||||
* being ignored - it is the unlabelled words left on the page.
|
||||
*
|
||||
* A box should only overlap one row.
|
||||
*
|
||||
* A warning is given if the box is on the same row as the previous box, but NOT
|
||||
* on the same row as the previous blob.
|
||||
*
|
||||
* Any OUTLINE which overlaps the box is put into the new word.
|
||||
*
|
||||
* ascender chars must ascend above xht significantly
|
||||
* xht chars must not rise above row xht significantly
|
||||
* bl chars must not descend below baseline significantly
|
||||
* descender chars must descend below baseline significantly
|
||||
*
|
||||
* ?? Certain chars are DROPPED - to limit the training data.
|
||||
*
|
||||
*************************************************************************/
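The four position rules in the comment above can be read as two threshold tests, one on the top edge of a character box and one on the bottom edge. The following is a minimal sketch of that reading, not the engine's actual check: `RowMetrics`, `top_is_plausible`, and `bottom_is_plausible` are hypothetical names, and interpreting "significantly" as a band equal to a fraction of the x-height (in the spirit of `applybox_error_band`) is an assumption.

```cpp
#include <cassert>

// Hypothetical row description; y grows upward, units are pixels.
struct RowMetrics {
  double baseline;  // y coordinate of the baseline
  double xheight;   // x-height measured up from the baseline
};

// Top-edge rule (assumed reading): an ascender must rise above the
// x-height line by more than the band; any other character must not.
bool top_is_plausible(bool is_ascender, double top,
                      const RowMetrics &row, double band_fraction) {
  assert(row.xheight > 0.0);
  const double band = band_fraction * row.xheight;
  const double xline = row.baseline + row.xheight;
  return is_ascender ? top > xline + band : top <= xline + band;
}

// Bottom-edge rule (assumed reading): a descender must drop below the
// baseline by more than the band; any other character must not.
bool bottom_is_plausible(bool is_descender, double bottom,
                         const RowMetrics &row, double band_fraction) {
  assert(row.xheight > 0.0);
  const double band = band_fraction * row.xheight;
  return is_descender ? bottom < row.baseline - band
                      : bottom >= row.baseline - band;
}
```

A character such as 'p' would be tested as a non-ascender on its top edge and as a descender on its bottom edge, so the two tests are kept separate.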
|
||||
|
||||
void apply_boxes(BLOCK_LIST *block_list //real blocks
|
||||
) {
|
||||
INT16 boxfile_lineno = 0;
|
||||
INT16 boxfile_charno = 0;
|
||||
BOX box; //boxfile box
|
||||
char ch[2]; //correct ch from boxfile
|
||||
ROW *row;
|
||||
ROW *prev_row = NULL;
|
||||
INT16 prev_box_right = MAX_INT16;
|
||||
INT16 block_id;
|
||||
INT16 row_id;
|
||||
INT16 box_count = 0;
|
||||
INT16 box_failures = 0;
|
||||
INT16 labels_ok;
|
||||
INT16 rows_ok;
|
||||
INT16 bad_blobs;
|
||||
INT16 tgt_char_counts[128]; //No. of box samples
|
||||
// INT16 labelled_char_counts[128]; //No. of unique labelled samples
|
||||
INT16 i;
|
||||
INT16 rebalance_count = 0;
|
||||
char min_char;
|
||||
INT16 min_samples;
|
||||
INT16 final_labelled_blob_count;
|
||||
|
||||
for (i = 0; i < 128; i++)
|
||||
tgt_char_counts[i] = 0;
|
||||
|
||||
FILE* box_file;
|
||||
STRING filename = imagefile;
|
||||
filename += ".box";
|
||||
if (!(box_file = fopen (filename.string(), "r"))) {
|
||||
CANTOPENFILE.error ("apply_boxes", EXIT,
|
||||
"Can't open box file %s %d",
|
||||
filename.string(), errno);
|
||||
}
|
||||
|
||||
ch[1] = '\0';
|
||||
clear_any_old_text(block_list);
|
||||
while (read_next_box (box_file, &box, &ch[0])) {
|
||||
box_count++;
|
||||
if ((unsigned char) ch[0] < 128) tgt_char_counts[(unsigned char) ch[0]]++; //guard against negative or out-of-range index
|
||||
row = find_row_of_box (block_list, box, block_id, row_id);
|
||||
if (box.left () < prev_box_right) {
|
||||
boxfile_lineno++;
|
||||
boxfile_charno = 1;
|
||||
}
|
||||
else
|
||||
boxfile_charno++;
|
||||
|
||||
if (row == NULL) {
|
||||
box_failures++;
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! box overlaps no blobs or blobs in multiple rows");
|
||||
}
|
||||
else {
|
||||
if ((box.left () >= prev_box_right) && (row != prev_row))
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"WARNING! false row break");
|
||||
box_failures += resegment_box (row, box, ch, block_id, row_id,
|
||||
boxfile_lineno, boxfile_charno);
|
||||
prev_row = row;
|
||||
}
|
||||
prev_box_right = box.right ();
|
||||
}
|
||||
tidy_up(block_list,
|
||||
labels_ok,
|
||||
rows_ok,
|
||||
bad_blobs,
|
||||
tgt_char_counts,
|
||||
rebalance_count,
|
||||
min_char,
|
||||
min_samples,
|
||||
final_labelled_blob_count);
|
||||
tprintf ("APPLY_BOXES:\n");
|
||||
tprintf (" Boxes read from boxfile: %6d\n", box_count);
|
||||
tprintf (" Initially labelled blobs: %6d in %d rows\n",
|
||||
labels_ok, rows_ok);
|
||||
tprintf (" Box failures detected: %6d\n", box_failures);
|
||||
tprintf (" Duped blobs for rebalance:%6d\n", rebalance_count);
|
||||
tprintf (" \"%c\" has fewest samples:%6d\n", min_char, min_samples);
|
||||
tprintf (" Total unlabelled words: %6d\n",
|
||||
bad_blobs);
|
||||
tprintf (" Final labelled words: %6d\n",
|
||||
final_labelled_blob_count);
|
||||
}
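apply_boxes infers box-file line breaks from geometry alone: a box whose left edge lies to the left of the previous box's right edge is taken to start a new text line. The sketch below isolates that bookkeeping so it can be followed on its own; `BoxPosition` and `advance` are hypothetical names introduced for illustration and do not exist in the source.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper mirroring the line/character counters kept
// inside apply_boxes.
struct BoxPosition {
  int lineno = 0;
  int charno = 0;
  int prev_right = INT_MAX;  // makes the very first box start line 1

  // Update the counters for a box spanning [left, right] horizontally.
  void advance(int left, int right) {
    assert(left <= right);
    if (left < prev_right) {
      ++lineno;    // the box jumped back to the left: new text line
      charno = 1;
    } else {
      ++charno;    // the box continues the current line
    }
    prev_right = right;
  }
};
```

One consequence of this heuristic is that boxes listed out of reading order within a row would be counted as extra lines, which is why the counters are only used for reporting.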
|
||||
|
||||
|
||||
void clear_any_old_text( //remove correct text
|
||||
BLOCK_LIST *block_list //real blocks
|
||||
) {
|
||||
BLOCK_IT block_it(block_list);
|
||||
ROW_IT row_it;
|
||||
WERD_IT word_it;
|
||||
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
row_it.set_to_list (block_it.data ()->row_list ());
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
|
||||
word_it.set_to_list (row_it.data ()->word_list ());
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word_it.data ()->set_text ("");
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
BOOL8 read_next_box(FILE* box_file, //
|
||||
BOX *box,
|
||||
char *ch) {
|
||||
char buff[256]; //boxfile read buffer
|
||||
char *buffptr = buff;
|
||||
STRING box_filename;
|
||||
static INT16 line = 0;
|
||||
INT32 x_min;
|
||||
INT32 y_min;
|
||||
INT32 x_max;
|
||||
INT32 y_max;
|
||||
INT32 count = 0;
|
||||
|
||||
while (!feof (box_file)) {
|
||||
if (!fgets (buff, sizeof (buff) - 1, box_file)) break; //EOF or read error: do not parse a stale buffer
|
||||
line++;
|
||||
|
||||
/* Check for blank lines in box file */
|
||||
for (buffptr = buff; isspace (*buffptr); buffptr++)
|
||||
;
|
||||
if (*buffptr != '\0') {
|
||||
count =
|
||||
sscanf (buff,
|
||||
"%c " INT32FORMAT " " INT32FORMAT " " INT32FORMAT " "
|
||||
INT32FORMAT, ch, &x_min, &y_min, &x_max, &y_max);
|
||||
if (count != 5) {
|
||||
tprintf ("Box file format error on line %i ignored\n", line);
|
||||
}
|
||||
else {
|
||||
*box = BOX (ICOORD (x_min, y_min), ICOORD (x_max, y_max));
|
||||
return TRUE; //read a box ok
|
||||
}
|
||||
}
|
||||
}
|
||||
return FALSE; //EOF
|
||||
}
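The box-file line format handled by read_next_box, `<Char> <left> <bottom> <right> <top>` with any trailing fields ignored, can be parsed in isolation as shown below. This is a minimal sketch under stated assumptions: `BoxRecord` is a hypothetical stand-in for the engine's BOX type, `parse_box_line` is not a function in the source, and plain `%d` is used in place of the `INT32FORMAT` macro.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the engine's BOX type plus its label.
struct BoxRecord {
  char ch;
  int left, bottom, right, top;
};

// Parses one line of the form "<Char> <left> <bottom> <right> <top>".
// Fields after the fifth are ignored. Returns true only when all five
// fields were read; *out is left untouched on failure.
bool parse_box_line(const char *line, BoxRecord *out) {
  assert(line != nullptr && out != nullptr);
  BoxRecord rec;
  // The leading space in the format skips whitespace before the character.
  const int count = std::sscanf(line, " %c %d %d %d %d", &rec.ch, &rec.left,
                                &rec.bottom, &rec.right, &rec.top);
  if (count != 5)
    return false;  // blank line or malformed record
  *out = rec;
  return true;
}
```

Returning a status per line, rather than looping until a good line is found, leaves the decision about skipping or reporting bad lines to the caller.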
|
||||
|
||||
|
||||
ROW *find_row_of_box( //
|
||||
BLOCK_LIST *block_list, //real blocks
|
||||
BOX box, //from boxfile
|
||||
INT16 &block_id,
|
||||
INT16 &row_id_to_process) {
|
||||
BLOCK_IT block_it(block_list);
|
||||
BLOCK *block;
|
||||
ROW_IT row_it;
|
||||
ROW *row;
|
||||
ROW *row_to_process = NULL;
|
||||
INT16 row_id;
|
||||
WERD_IT word_it;
|
||||
WERD *word;
|
||||
BOOL8 polyg;
|
||||
PBLOB_IT blob_it;
|
||||
PBLOB *blob;
|
||||
OUTLINE_IT outline_it;
|
||||
OUTLINE *outline;
|
||||
|
||||
/*
|
||||
Find row to process - error if box REALLY overlaps more than one row. (I.e.
|
||||
it overlaps blobs in the row - not just overlaps the bounding box of the
|
||||
whole row.)
|
||||
*/
|
||||
|
||||
block_id = 0;
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
block_id++;
|
||||
row_id = 0;
|
||||
block = block_it.data ();
|
||||
if (block->bounding_box ().overlap (box)) {
|
||||
row_it.set_to_list (block->row_list ());
|
||||
for (row_it.mark_cycle_pt ();
|
||||
!row_it.cycled_list (); row_it.forward ()) {
|
||||
row_id++;
|
||||
row = row_it.data ();
|
||||
if (row->bounding_box ().overlap (box)) {
|
||||
word_it.set_to_list (row->word_list ());
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
polyg = word->flag (W_POLYGON);
|
||||
if (word->bounding_box ().overlap (box)) {
|
||||
blob_it.set_to_list (word->gblob_list ());
|
||||
for (blob_it.mark_cycle_pt ();
|
||||
!blob_it.cycled_list (); blob_it.forward ()) {
|
||||
blob = blob_it.data ();
|
||||
if (gblob_bounding_box (blob, polyg).
|
||||
overlap (box)) {
|
||||
outline_it.
|
||||
set_to_list (gblob_out_list
|
||||
(blob, polyg));
|
||||
for (outline_it.mark_cycle_pt ();
|
||||
!outline_it.cycled_list ();
|
||||
outline_it.forward ()) {
|
||||
outline = outline_it.data ();
|
||||
if (goutline_bounding_box
|
||||
(outline, polyg).major_overlap (box)) {
|
||||
if ((row_to_process == NULL) ||
|
||||
(row_to_process == row)) {
|
||||
row_to_process = row;
|
||||
row_id_to_process = row_id;
|
||||
}
|
||||
else
|
||||
/* RETURN ERROR Box overlaps blobs in more than one row */
|
||||
return NULL;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return row_to_process;
|
||||
}
|
||||
|
||||
|
||||
INT16 resegment_box( //
|
||||
ROW *row,
|
||||
BOX box,
|
||||
char *ch,
|
||||
INT16 block_id,
|
||||
INT16 row_id,
|
||||
INT16 boxfile_lineno,
|
||||
INT16 boxfile_charno) {
|
||||
WERD_IT word_it;
|
||||
WERD *word;
|
||||
WERD *new_word = NULL;
|
||||
BOOL8 polyg = false;
|
||||
PBLOB_IT blob_it;
|
||||
PBLOB_IT new_blob_it;
|
||||
PBLOB *blob;
|
||||
PBLOB *new_blob;
|
||||
OUTLINE_IT outline_it;
|
||||
OUTLINE_LIST dummy; // Just to initialize new_outline_it.
|
||||
OUTLINE_IT new_outline_it = &dummy;
|
||||
OUTLINE *outline;
|
||||
BOX new_word_box;
|
||||
float word_x_centre;
|
||||
float baseline;
|
||||
INT16 error_count = 0; //number of chars lost
|
||||
|
||||
word_it.set_to_list (row->word_list ());
|
||||
for (word_it.mark_cycle_pt (); !word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
polyg = word->flag (W_POLYGON);
|
||||
if (word->bounding_box ().overlap (box)) {
|
||||
blob_it.set_to_list (word->gblob_list ());
|
||||
for (blob_it.mark_cycle_pt ();
|
||||
!blob_it.cycled_list (); blob_it.forward ()) {
|
||||
blob = blob_it.data ();
|
||||
if (gblob_bounding_box (blob, polyg).overlap (box)) {
|
||||
outline_it.set_to_list (gblob_out_list (blob, polyg));
|
||||
for (outline_it.mark_cycle_pt ();
|
||||
!outline_it.cycled_list (); outline_it.forward ()) {
|
||||
outline = outline_it.data ();
|
||||
if (goutline_bounding_box (outline, polyg).
|
||||
major_overlap (box)) {
|
||||
if (strlen (word->text ()) > 0) {
|
||||
if (error_count == 0) {
|
||||
error_count = 1;
|
||||
if (applybox_debug > 4)
|
||||
report_failed_box (boxfile_lineno,
|
||||
boxfile_charno,
|
||||
box, ch,
|
||||
"FAILURE! box overlaps blob in labelled word");
|
||||
}
|
||||
if (applybox_debug > 4)
|
||||
tprintf
|
||||
("APPLY_BOXES: ALSO ignoring corrupted char blk:%d row:%d \"%s\"\n",
|
||||
block_id, row_id,
|
||||
word_it.data ()->text ());
|
||||
word_it.data ()->set_text ("");
|
||||
//UN label it
|
||||
error_count++;
|
||||
}
|
||||
|
||||
if (error_count == 0) {
|
||||
if (new_word == NULL) {
|
||||
/* Make a new word with a single blob */
|
||||
new_word = word->shallow_copy ();
|
||||
new_word->set_text (ch);
|
||||
if (polyg)
|
||||
new_blob = new PBLOB;
|
||||
else
|
||||
new_blob = (PBLOB *) new C_BLOB;
|
||||
new_blob_it.set_to_list (new_word->
|
||||
gblob_list ());
|
||||
new_blob_it.add_to_end (new_blob);
|
||||
new_outline_it.
|
||||
set_to_list (gblob_out_list
|
||||
(new_blob, polyg));
|
||||
}
|
||||
new_outline_it.add_to_end (outline_it.
|
||||
extract ());
|
||||
//move blob
|
||||
}
|
||||
}
|
||||
}
|
||||
//no outlines in blob
|
||||
if (outline_it.empty ())
|
||||
//so delete blob
|
||||
delete blob_it.extract ();
|
||||
}
|
||||
}
|
||||
if (blob_it.empty ()) //no blobs in word
|
||||
//so delete word
|
||||
delete word_it.extract ();
|
||||
}
|
||||
}
|
||||
if (error_count > 0)
|
||||
return error_count;
|
||||
|
||||
if (new_word != NULL) {
|
||||
gblob_sort_list (new_word->gblob_list (), polyg);
|
||||
word_it.add_to_end (new_word);
|
||||
new_word_box = new_word->bounding_box ();
|
||||
word_x_centre = (new_word_box.left () + new_word_box.right ()) / 2.0f;
|
||||
baseline = row->base_line (word_x_centre);
|
||||
|
||||
if (STRING (chs_caps_ht).contains (ch[0]) &&
|
||||
(new_word_box.top () <
|
||||
baseline + (1 + applybox_error_band) * row->x_height ())) {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! caps-ht char didn't ascend");
|
||||
new_word->set_text ("");
|
||||
return 1;
|
||||
}
|
||||
if (STRING (chs_odd_top).contains (ch[0]) &&
|
||||
(new_word_box.top () <
|
||||
baseline + (1 - applybox_error_band) * row->x_height ())) {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! Odd top char below xht");
|
||||
new_word->set_text ("");
|
||||
return 1;
|
||||
}
|
||||
if (STRING (chs_x_ht).contains (ch[0]) &&
|
||||
((new_word_box.top () >
|
||||
baseline + (1 + applybox_error_band) * row->x_height ()) ||
|
||||
(new_word_box.top () <
|
||||
baseline + (1 - applybox_error_band) * row->x_height ()))) {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! x-ht char didn't have top near xht");
|
||||
new_word->set_text ("");
|
||||
return 1;
|
||||
}
|
||||
if (STRING (chs_non_ambig_bl).contains (ch[0]) &&
|
||||
((new_word_box.bottom () <
|
||||
baseline - applybox_error_band * row->x_height ()) ||
|
||||
(new_word_box.bottom () >
|
||||
baseline + applybox_error_band * row->x_height ()))) {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! non ambig BL char didnt have bottom near baseline");
|
||||
new_word->set_text ("");
|
||||
return 1;
|
||||
}
|
||||
if (STRING (chs_odd_bot).contains (ch[0]) &&
|
||||
(new_word_box.bottom () >
|
||||
baseline + applybox_error_band * row->x_height ())) {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! Odd bottom char above baseline");
|
||||
new_word->set_text ("");
|
||||
return 1;
|
||||
}
|
||||
if (STRING (chs_desc).contains (ch[0]) &&
|
||||
(new_word_box.bottom () >
|
||||
baseline - applybox_error_band * row->x_height ())) {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! Descender doesn't descend");
|
||||
new_word->set_text ("");
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
else {
|
||||
report_failed_box (boxfile_lineno, boxfile_charno, box, ch,
|
||||
"FAILURE! Couldn't find any blobs");
|
||||
return 1;
|
||||
}
|
||||
}
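The vertical sanity checks at the end of resegment_box compare the new word's top or bottom against bands around the baseline and x-height. A hedged sketch of two of those bands follows. The helper names are hypothetical, and `error_band` plays the role of applybox_error_band (0.15 by default according to applybox.h).

```cpp
// True when `top` lies inside the band baseline + (1 +/- error_band) * x_height,
// which is the condition an x-height character is expected to satisfy.
bool top_near_x_height(float top, float baseline, float x_height,
                       float error_band) {
  const float upper = baseline + (1.0f + error_band) * x_height;
  const float lower = baseline + (1.0f - error_band) * x_height;
  return top >= lower && top <= upper;
}

// True when `bottom` lies within error_band * x_height of the baseline,
// which is the condition a non-ambiguous baseline character should satisfy.
bool bottom_near_baseline(float bottom, float baseline, float x_height,
                          float error_band) {
  const float tolerance = error_band * x_height;
  return bottom >= baseline - tolerance && bottom <= baseline + tolerance;
}
```

With a baseline of 100, an x-height of 20 and a band of 0.15, the accepted range for the top is roughly 117 to 123 and for the bottom roughly 97 to 103.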
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* tidy_up()
|
||||
* - report >1 block
|
||||
* - sort the words in each row.
|
||||
* - report any rows with no labelled words.
|
||||
* - report any remaining unlabelled words
|
||||
* - report total labelled words
|
||||
*
|
||||
*************************************************************************/
|
||||
void tidy_up( //
|
||||
BLOCK_LIST *block_list, //real blocks
|
||||
INT16 &ok_char_count,
|
||||
INT16 &ok_row_count,
|
||||
INT16 &unlabelled_words,
|
||||
INT16 *tgt_char_counts,
|
||||
INT16 &rebalance_count,
|
||||
char &min_char,
|
||||
INT16 &min_samples,
|
||||
INT16 &final_labelled_blob_count) {
|
||||
BLOCK_IT block_it(block_list);
|
||||
ROW_IT row_it;
|
||||
ROW *row;
|
||||
WERD_IT word_it;
|
||||
WERD *word;
|
||||
WERD *duplicate_word;
|
||||
INT16 block_idx = 0;
|
||||
INT16 row_idx;
|
||||
INT16 all_row_idx = 0;
|
||||
BOOL8 row_ok;
|
||||
BOOL8 rebalance_needed = FALSE;
|
||||
//No. of unique labelled samples
|
||||
INT16 labelled_char_counts[128];
|
||||
INT16 i;
|
||||
char ch;
|
||||
char prev_ch = '\0';
|
||||
BOOL8 at_dupe_of_prev_word;
|
||||
ROW *prev_row = NULL;
|
||||
INT16 left;
|
||||
INT16 prev_left = -1;
|
||||
|
||||
for (i = 0; i < 128; i++)
|
||||
labelled_char_counts[i] = 0;
|
||||
|
||||
ok_char_count = 0;
|
||||
ok_row_count = 0;
|
||||
unlabelled_words = 0;
|
||||
if ((applybox_debug > 4) && (block_it.length () != 1))
|
||||
|
||||
tprintf ("APPLY_BOXES: More than one block??\n");
|
||||
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
block_idx++;
|
||||
row_idx = 0;
|
||||
row_ok = FALSE;
|
||||
row_it.set_to_list (block_it.data ()->row_list ());
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
|
||||
row_idx++;
|
||||
all_row_idx++;
|
||||
row = row_it.data ();
|
||||
word_it.set_to_list (row->word_list ());
|
||||
word_it.sort (word_comparator);
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
if (strlen (word->text ()) == 0) {
|
||||
unlabelled_words++;
|
||||
if (applybox_debug > 4) {
|
||||
tprintf
|
||||
("APPLY_BOXES: Unlabelled word blk:%d row:%d allrows:%d\n",
|
||||
block_idx, row_idx, all_row_idx);
|
||||
}
|
||||
}
|
||||
else {
|
||||
if (word->gblob_list ()->length () != 1)
|
||||
tprintf
|
||||
("APPLY_BOXES: FATALITY - MULTIBLOB Labelled word blk:%d row:%d allrows:%d\n",
|
||||
block_idx, row_idx, all_row_idx);
|
||||
|
||||
ok_char_count++;
|
||||
labelled_char_counts[*word->text ()]++;
|
||||
row_ok = TRUE;
|
||||
}
|
||||
}
|
||||
if (!row_ok) {
if (applybox_debug > 4)
tprintf
("APPLY_BOXES: Row with no labelled words blk:%d row:%d allrows:%d\n",
block_idx, row_idx, all_row_idx);
}
else
ok_row_count++;              //count only rows with labelled words
|
||||
}
|
||||
}
|
||||
|
||||
min_samples = 9999;
|
||||
for (i = 0; i < 128; i++) {
|
||||
if (tgt_char_counts[i] > labelled_char_counts[i]) {
|
||||
if (labelled_char_counts[i] <= 1) {
|
||||
tprintf
|
||||
("APPLY_BOXES: FATALITY - %d labelled samples of \"%c\" - target is %d\n",
|
||||
labelled_char_counts[i], (char) i, tgt_char_counts[i]);
|
||||
}
|
||||
else {
|
||||
rebalance_needed = TRUE;
|
||||
if (applybox_debug > 0)
|
||||
tprintf
|
||||
("APPLY_BOXES: REBALANCE REQD \"%c\" - target of %d from %d labelled samples\n",
|
||||
(char) i, tgt_char_counts[i], labelled_char_counts[i]);
|
||||
}
|
||||
}
|
||||
if ((min_samples > labelled_char_counts[i]) && (tgt_char_counts[i] > 0)) {
|
||||
min_samples = labelled_char_counts[i];
|
||||
min_char = (char) i;
|
||||
}
|
||||
}
|
||||
|
||||
while (applybox_rebalance && rebalance_needed) {
|
||||
block_it.set_to_list (block_list);
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
row_it.set_to_list (block_it.data ()->row_list ());
|
||||
for (row_it.mark_cycle_pt ();
|
||||
!row_it.cycled_list (); row_it.forward ()) {
|
||||
row = row_it.data ();
|
||||
word_it.set_to_list (row->word_list ());
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
left = word->bounding_box ().left ();
|
||||
ch = *word->text ();
|
||||
at_dupe_of_prev_word = ((row == prev_row) &&
|
||||
(left == prev_left) &&
|
||||
(ch == prev_ch));
|
||||
if ((ch != '\0') &&
|
||||
(labelled_char_counts[ch] > 1) &&
|
||||
(tgt_char_counts[ch] > labelled_char_counts[ch]) &&
|
||||
(!at_dupe_of_prev_word)) {
|
||||
/* Duplicate the word to rebalance the labelled samples */
|
||||
if (applybox_debug > 9) {
|
||||
tprintf ("Duping \"%c\" from ", ch);
|
||||
word->bounding_box ().print ();
|
||||
}
|
||||
duplicate_word = new WERD;
|
||||
*duplicate_word = *word;
|
||||
word_it.add_after_then_move (duplicate_word);
|
||||
rebalance_count++;
|
||||
labelled_char_counts[ch]++;
|
||||
}
|
||||
prev_row = row;
|
||||
prev_left = left;
|
||||
prev_ch = ch;
|
||||
}
|
||||
}
|
||||
}
|
||||
rebalance_needed = FALSE;
|
||||
for (i = 0; i < 128; i++) {
|
||||
if ((tgt_char_counts[i] > labelled_char_counts[i]) &&
|
||||
(labelled_char_counts[i] > 1)) {
|
||||
rebalance_needed = TRUE;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Now final check - count labelled blobs */
|
||||
final_labelled_blob_count = 0;
|
||||
block_it.set_to_list (block_list);
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
row_it.set_to_list (block_it.data ()->row_list ());
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
|
||||
row = row_it.data ();
|
||||
word_it.set_to_list (row->word_list ());
|
||||
word_it.sort (word_comparator);
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
if ((strlen (word->text ()) == 1) &&
|
||||
(word->gblob_list ()->length () == 1))
|
||||
final_labelled_blob_count++;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void report_failed_box(INT16 boxfile_lineno,
|
||||
INT16 boxfile_charno,
|
||||
BOX box,
|
||||
char *box_ch,
|
||||
const char *err_msg) {
|
||||
if (applybox_debug > 4)
|
||||
tprintf ("APPLY_BOXES: boxfile %1d/%1d/%s ((%1d,%1d),(%1d,%1d)): %s\n",
|
||||
boxfile_lineno,
|
||||
boxfile_charno,
|
||||
box_ch,
|
||||
box.left (), box.bottom (), box.right (), box.top (), err_msg);
|
||||
}
|
||||
|
||||
|
||||
void apply_box_training(BLOCK_LIST *block_list) {
|
||||
BLOCK_IT block_it(block_list);
|
||||
ROW_IT row_it;
|
||||
ROW *row;
|
||||
WERD_IT word_it;
|
||||
WERD *word;
|
||||
WERD *bln_word;
|
||||
WERD copy_outword; // copy to denorm
|
||||
PBLOB_IT blob_it;
|
||||
DENORM denorm;
|
||||
INT16 count = 0;
|
||||
char ch[2];
|
||||
|
||||
ch[1] = '\0';
|
||||
|
||||
tprintf ("Generating training data\n");
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
row_it.set_to_list (block_it.data ()->row_list ());
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
|
||||
row = row_it.data ();
|
||||
word_it.set_to_list (row->word_list ());
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
if ((strlen (word->text ()) == 1) &&
|
||||
(word->gblob_list ()->length () == 1)) {
|
||||
/* Here is a word with a single char label and a single blob so train on it */
|
||||
bln_word =
|
||||
make_bln_copy (word, row, row->x_height (), &denorm);
|
||||
blob_it.set_to_list (bln_word->blob_list ());
|
||||
ch[0] = *word->text ();
|
||||
tess_training_tester (blob_it.data (),
|
||||
//single blob
|
||||
&denorm, TRUE, //correct
|
||||
ch, //correct ASCII char
|
||||
1, //ASCII length
|
||||
NULL);
|
||||
copy_outword = *(bln_word);
|
||||
copy_outword.baseline_denormalise (&denorm);
|
||||
blob_it.set_to_list (copy_outword.blob_list ());
|
||||
ch[0] = *word->text ();
|
||||
delete bln_word;
|
||||
count++;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
tprintf ("Generated training data for %d blobs\n", count);
|
||||
}
|
||||
|
||||
|
||||
void apply_box_testing(BLOCK_LIST *block_list) {
|
||||
BLOCK_IT block_it(block_list);
|
||||
ROW_IT row_it;
|
||||
ROW *row;
|
||||
INT16 row_count = 0;
|
||||
WERD_IT word_it;
|
||||
WERD *word;
|
||||
WERD *bln_word;
|
||||
INT16 word_count = 0;
|
||||
PBLOB_IT blob_it;
|
||||
DENORM denorm;
|
||||
INT16 count = 0;
|
||||
char ch[2];
|
||||
WERD *outword; //bln best choice
|
||||
//segmentation
|
||||
WERD_CHOICE *best_choice; //tess output
|
||||
WERD_CHOICE *raw_choice; //top choice permuter
|
||||
//detailed results
|
||||
BLOB_CHOICE_LIST_CLIST blob_choices;
|
||||
INT16 char_count = 0;
|
||||
INT16 correct_count = 0;
|
||||
INT16 err_count = 0;
|
||||
INT16 rej_count = 0;
|
||||
#ifndef SECURE_NAMES
|
||||
WERDSTATS wordstats; //As from newdiff
|
||||
#endif
|
||||
char tess_rej_str[3];
|
||||
char tess_long_str[3];
|
||||
|
||||
ch[1] = '\0';
|
||||
strcpy (tess_rej_str, "|A");
|
||||
strcpy (tess_long_str, "|B");
|
||||
|
||||
for (block_it.mark_cycle_pt ();
|
||||
!block_it.cycled_list (); block_it.forward ()) {
|
||||
row_it.set_to_list (block_it.data ()->row_list ());
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
|
||||
row = row_it.data ();
|
||||
row_count++;
|
||||
word_count = 0;
|
||||
word_it.set_to_list (row->word_list ());
|
||||
for (word_it.mark_cycle_pt ();
|
||||
!word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
word_count++;
|
||||
if ((strlen (word->text ()) == 1) &&
|
||||
!STRING (applybox_test_exclusions).contains (*word->text ())
|
||||
&& (word->gblob_list ()->length () == 1)) {
|
||||
/* Here is a word with a single char label and a single blob so test it */
|
||||
bln_word =
|
||||
make_bln_copy (word, row, row->x_height (), &denorm);
|
||||
blob_it.set_to_list (bln_word->blob_list ());
|
||||
ch[0] = *word->text ();
|
||||
char_count++;
|
||||
best_choice = tess_segment_pass1 (bln_word,
|
||||
&denorm,
|
||||
tess_default_matcher,
|
||||
raw_choice,
|
||||
&blob_choices, outword);
|
||||
|
||||
/*
|
||||
Test for TESS screw up on word. Recog_word has already ensured that the
|
||||
choice list, outword blob lists and best_choice string are the same
|
||||
length. A TESS screw up is indicated by a blank filled or 0 length string.
|
||||
*/
|
||||
if ((best_choice->string ().length () == 0) ||
|
||||
(strspn (best_choice->string ().string (), " ") ==
|
||||
best_choice->string ().length ())) {
|
||||
rej_count++;
|
||||
tprintf ("%d:%d: \"%s\" -> TESS FAILED\n",
|
||||
row_count, word_count, ch);
|
||||
#ifndef SECURE_NAMES
|
||||
wordstats.word (tess_rej_str, 2, ch, 1);
|
||||
#endif
|
||||
}
|
||||
else {
|
||||
if ((best_choice->string ().length () !=
|
||||
outword->blob_list ()->length ()) ||
|
||||
(best_choice->string ().length () !=
|
||||
blob_choices.length ())) {
|
||||
tprintf
|
||||
("ASSERT FAIL String:\"%s\"; Strlen=%d; #Blobs=%d; #Choices=%d\n",
|
||||
best_choice->string ().string (),
|
||||
best_choice->string ().length (),
|
||||
outword->blob_list ()->length (),
|
||||
blob_choices.length ());
|
||||
}
|
||||
ASSERT_HOST (best_choice->string ().length () ==
|
||||
outword->blob_list ()->length ());
|
||||
ASSERT_HOST (best_choice->string ().length () ==
|
||||
blob_choices.length ());
|
||||
fix_quotes ((char *) best_choice->string ().string (),
|
||||
//turn to double
|
||||
outword, &blob_choices);
|
||||
if (strcmp (best_choice->string ().string (), ch) != 0) {
|
||||
err_count++;
|
||||
tprintf ("%d:%d: \"%s\" -> \"%s\"\n",
|
||||
row_count, word_count, ch,
|
||||
best_choice->string ().string ());
|
||||
}
|
||||
else
|
||||
correct_count++;
|
||||
#ifndef SECURE_NAMES
|
||||
if (best_choice->string ().length () > 2)
|
||||
wordstats.word (tess_long_str, 2, ch, 1);
|
||||
else
|
||||
wordstats.word ((char *) best_choice->string ().
|
||||
string (),
|
||||
best_choice->string ().length (), ch,
|
||||
1);
|
||||
#endif
|
||||
}
|
||||
delete bln_word;
|
||||
delete outword;
|
||||
delete best_choice;
|
||||
delete raw_choice;
|
||||
blob_choices.deep_clear ();
|
||||
count++;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
#ifndef SECURE_NAMES
|
||||
wordstats.print (1, 100.0);
|
||||
wordstats.conf_matrix ();
|
||||
tprintf ("Tested %d chars: %d correct; %d rejected by tess; %d errs\n",
|
||||
char_count, correct_count, rej_count, err_count);
|
||||
#endif
|
||||
}
|
71
ccmain/applybox.h
Normal file
@ -0,0 +1,71 @@
|
||||
/**********************************************************************
|
||||
* File: applybox.h (Formerly applybox.h)
|
||||
* Description: Re segment rows according to box file data
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 24 09:11:23 GMT 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef APPLYBOX_H
|
||||
#define APPLYBOX_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "ocrblock.h"
|
||||
#include "ocrrow.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern BOOL_VAR_H (applybox_rebalance, TRUE, "Drop dead");
|
||||
extern INT_VAR_H (applybox_debug, 0, "Debug level");
|
||||
extern STRING_VAR_H (applybox_test_exclusions, "|",
|
||||
"Chars ignored for testing");
|
||||
extern double_VAR_H (applybox_error_band, 0.15, "Err band as fract of xht");
|
||||
void apply_boxes(BLOCK_LIST *block_list //real blocks
|
||||
);
|
||||
void clear_any_old_text( //remove correct text
|
||||
BLOCK_LIST *block_list //real blocks
|
||||
);
|
||||
BOOL8 read_next_box(FILE* box_file, //
|
||||
BOX *box,
|
||||
char *ch);
|
||||
ROW *find_row_of_box( //
|
||||
BLOCK_LIST *block_list, //real blocks
|
||||
BOX box, //from boxfile
|
||||
INT16 &block_id,
|
||||
INT16 &row_id_to_process);
|
||||
INT16 resegment_box( //
|
||||
ROW *row,
|
||||
BOX box,
|
||||
char *ch,
|
||||
INT16 block_id,
|
||||
INT16 row_id,
|
||||
INT16 boxfile_lineno,
|
||||
INT16 boxfile_charno);
|
||||
void tidy_up( //
|
||||
BLOCK_LIST *block_list, //real blocks
|
||||
INT16 &ok_char_count,
|
||||
INT16 &ok_row_count,
|
||||
INT16 &unlabelled_words,
|
||||
INT16 *tgt_char_counts,
|
||||
INT16 &rebalance_count,
|
||||
char &min_char,
|
||||
INT16 &min_samples,
|
||||
INT16 &final_labelled_blob_count);
|
||||
void report_failed_box(INT16 boxfile_lineno,
|
||||
INT16 boxfile_charno,
|
||||
BOX box,
|
||||
char *box_ch,
|
||||
const char *err_msg);
|
||||
void apply_box_training(BLOCK_LIST *block_list);
|
||||
void apply_box_testing(BLOCK_LIST *block_list);
|
||||
#endif
|
395
ccmain/baseapi.cpp
Normal file
@ -0,0 +1,395 @@
|
||||
/**********************************************************************
|
||||
* File: baseapi.cpp
|
||||
* Description: Simple API for calling tesseract.
|
||||
* Author: Ray Smith
|
||||
* Created: Fri Oct 06 15:35:01 PDT 2006
|
||||
*
|
||||
* (C) Copyright 2006, Google Inc.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "baseapi.h"
|
||||
|
||||
#include "tessedit.h"
|
||||
#include "pageres.h"
|
||||
#include "tessvars.h"
|
||||
#include "control.h"
|
||||
#include "applybox.h"
|
||||
#include "pgedit.h"
|
||||
#include "varabled.h"
|
||||
#include "adaptmatch.h"
|
||||
|
||||
BOOL_VAR(tessedit_resegment_from_boxes, FALSE,
|
||||
"Take segmentation and labeling from box file");
|
||||
BOOL_VAR(tessedit_train_from_boxes, FALSE,
|
||||
"Generate training data from boxed chars");
|
||||
|
||||
// Minimum sensible image size to be worth running tesseract.
|
||||
const int kMinRectSize = 10;
|
||||
|
||||
// Start tesseract.
|
||||
// The datapath must be the name of the data directory or some other file
|
||||
// in which the data directory resides (for instance argv[0].)
|
||||
// The configfile is the name of a file in the tessconfigs directory
|
||||
// (eg batch) or NULL to run on defaults.
|
||||
// Outputbase may also be NULL, and is the basename of various output files.
|
||||
// If the output of any of these files is enabled, then a name must be given.
|
||||
// If numeric_mode is true, only possible digits and roman numbers are
|
||||
// returned. Returns 0 if successful. Crashes if not.
|
||||
// The argc and argv may be 0 and NULL respectively. They are used for
|
||||
// providing config files for debug/display purposes.
|
||||
// TODO(rays) get the facts straight. Is it OK to call
|
||||
// it more than once? Make it properly check for errors and return them.
|
||||
int TessBaseAPI::Init(const char* datapath, const char* outputbase,
|
||||
const char* configfile, bool numeric_mode,
|
||||
int argc, char* argv[]) {
|
||||
int result = init_tesseract(datapath, outputbase, configfile, argc, argv);
|
||||
bln_numericmode.set_value(numeric_mode);
|
||||
return result;
|
||||
}
|
||||
|
||||
// Recognize a rectangle from an image and return the result as a string.
|
||||
// May be called many times for a single Init.
|
||||
// Currently has no error checking.
|
||||
// Greyscale of 8 and color of 24 or 32 bits per pixel may be given.
|
||||
// Palette color images will not work properly and must be converted to
|
||||
// 24 bit.
|
||||
// Binary images of 1 bit per pixel may also be given but they must be
|
||||
// byte packed with the MSB of the first byte being the first pixel, and a
|
||||
// one pixel is WHITE. For binary images set bytes_per_pixel=0.
|
||||
// The recognized text is returned as a char* which (in future will be coded
|
||||
// as UTF8 and) must be freed with the delete [] operator.
|
||||
char* TessBaseAPI::TesseractRect(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top,
|
||||
int width, int height) {
|
||||
if (width < kMinRectSize || height < kMinRectSize)
|
||||
return NULL; // Nothing worth doing.
|
||||
|
||||
// Copy/Threshold the image to the tesseract global page_image.
|
||||
CopyImageToTesseract(imagedata, bytes_per_pixel, bytes_per_line,
|
||||
left, top, width, height);
|
||||
|
||||
return RecognizeToString();
|
||||
}
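The binary-image convention described in the comment above TesseractRect (byte packed, the MSB of the first byte is the first pixel, and a one bit is white) can be sketched as a small accessor. `packed_pixel_is_white` is a hypothetical helper for illustration only, and it assumes rows are stored top to bottom with `bytes_per_line` bytes each.

```cpp
#include <cstdint>

// Returns true when pixel (x, y) of a 1 bit-per-pixel, MSB-first packed image
// is white, i.e. its bit is set.
bool packed_pixel_is_white(const uint8_t* imagedata, int bytes_per_line,
                           int x, int y) {
  const uint8_t byte = imagedata[y * bytes_per_line + x / 8];
  // Bit 7 holds the leftmost pixel of each byte.
  return ((byte >> (7 - x % 8)) & 1) != 0;
}
```

For a row stored as the two bytes 0x80 0x01, pixels 0 and 15 are white and every pixel in between is black.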
|
||||
|
||||
// Call between pages or documents etc to free up memory and forget
|
||||
// adaptive data.
|
||||
void TessBaseAPI::ClearAdaptiveClassifier() {
|
||||
ResetAdaptiveClassifier();
|
||||
}
|
||||
|
||||
// Close down tesseract and free up memory.
|
||||
void TessBaseAPI::End() {
|
||||
ResetAdaptiveClassifier();
|
||||
end_tesseract();
|
||||
}
|
||||
|
||||
// Dump the internal binary image to a PGM file.
|
||||
void TessBaseAPI::DumpPGM(const char* filename) {
|
||||
IMAGELINE line;
|
||||
line.init(page_image.get_xsize());
|
||||
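FILE *fp = fopen(filename, "wb");  // Binary mode: P5 pixel data is raw bytes.
if (fp == NULL)
return;  // Could not open the output file.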
|
||||
fprintf(fp, "P5 " INT32FORMAT " " INT32FORMAT " 255\n", page_image.get_xsize(),
|
||||
page_image.get_ysize());
|
||||
for (int j = page_image.get_ysize()-1; j >= 0 ; --j) {
|
||||
page_image.get_line(0, j, page_image.get_xsize(), &line, 0);
|
||||
for (int i = 0; i < page_image.get_xsize(); ++i) {
|
||||
UINT8 b = line.pixels[i] ? 255 : 0;
|
||||
fwrite(&b, 1, 1, fp);
|
||||
}
|
||||
}
|
||||
fclose(fp);
|
||||
}
|
||||
|
||||
// Copy the given image rectangle to Tesseract, with adaptive thresholding
|
||||
// if the image is not already binary.
|
||||
void TessBaseAPI::CopyImageToTesseract(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top,
|
||||
int width, int height) {
|
||||
if (bytes_per_pixel > 0) {
|
||||
// Threshold grey or color.
|
||||
int* thresholds = new int[bytes_per_pixel];
|
||||
int* hi_values = new int[bytes_per_pixel];
|
||||
|
||||
// Compute the thresholds.
|
||||
OtsuThreshold(imagedata, bytes_per_pixel, bytes_per_line,
|
||||
left, top, left + width, top + height,
|
||||
thresholds, hi_values);
|
||||
|
||||
// Threshold the image to the tesseract global page_image.
|
||||
ThresholdRect(imagedata, bytes_per_pixel, bytes_per_line,
|
||||
left, top, width, height,
|
||||
thresholds, hi_values);
|
||||
delete [] thresholds;
|
||||
delete [] hi_values;
|
||||
} else {
|
||||
CopyBinaryRect(imagedata, bytes_per_line, left, top, width, height);
|
||||
}
|
||||
}
|
||||
|
||||
// Compute the Otsu threshold(s) for the given image rectangle, making one
|
||||
// for each channel. Each channel is always one byte per pixel.
|
||||
// Returns an array of threshold values and an array of hi_values, such
|
||||
// that a pixel value >threshold[channel] is considered foreground if
|
||||
// hi_values[channel] is 0 or background if 1. A hi_value of -1 indicates
|
||||
// that there is no apparent foreground. At least one hi_value will not be -1.
|
||||
// thresholds and hi_values are assumed to be of bytes_per_pixel size.
|
||||
void TessBaseAPI::OtsuThreshold(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top, int right, int bottom,
|
||||
int* thresholds,
|
||||
int* hi_values) {
|
||||
// Of all channels with no good hi_value, keep the best so we can always
|
||||
// produce at least one answer.
|
||||
int best_hi_value = 0;
|
||||
int best_hi_index = 0;
|
||||
bool any_good_hivalue = false;
|
||||
double best_hi_dist = 0.0;
|
||||
|
||||
for (int ch = 0; ch < bytes_per_pixel; ++ch) {
|
||||
thresholds[ch] = 0;
|
||||
hi_values[ch] = -1;
|
||||
// Compute the histogram of the image rectangle.
|
||||
int histogram[256];
|
||||
HistogramRect(imagedata + ch, bytes_per_pixel, bytes_per_line,
|
||||
left, top, right, bottom, histogram);
|
||||
int H;
|
||||
int best_omega_0;
|
||||
int best_t = OtsuStats(histogram, &H, &best_omega_0);
|
||||
// To be a convincing foreground we must have a small fraction of H
|
||||
// or to be a convincing background we must have a large fraction of H.
|
||||
// In between we assume this channel contains no thresholding information.
|
||||
int hi_value = best_omega_0 < H * 0.5;
|
||||
thresholds[ch] = best_t;
|
||||
if (best_omega_0 > H * 0.75) {
|
||||
any_good_hivalue = true;
|
||||
hi_values[ch] = 0;
|
||||
}
|
||||
else if (best_omega_0 < H * 0.25) {
|
||||
any_good_hivalue = true;
|
||||
hi_values[ch] = 1;
|
||||
}
|
||||
else {
|
||||
// In case all channels are like this, keep the best of the bad lot.
|
||||
double hi_dist = hi_value ? (H - best_omega_0) : best_omega_0;
|
||||
if (hi_dist > best_hi_dist) {
|
||||
best_hi_dist = hi_dist;
|
||||
best_hi_value = hi_value;
|
||||
best_hi_index = ch;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (!any_good_hivalue) {
|
||||
// Use the best of the ones that were not good enough.
|
||||
hi_values[best_hi_index] = best_hi_value;
|
||||
}
|
||||
}
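The per-channel threshold used above is the value of t that maximises the between-class variance of the histogram, which is proportional to omega_0 * omega_1 * (mu_1 - mu_0)^2. The following is a minimal standalone sketch of that search, not the TessBaseAPI::OtsuStats member itself. `otsu_threshold_sketch` is a hypothetical name, and stopping when the upper class becomes empty is an assumption added here to avoid dividing by zero.

```cpp
// Returns the threshold t in [0, 254] that maximises the between-class
// variance of a 256-bin histogram, or -1 if no t splits the pixels into
// two non-empty classes.
int otsu_threshold_sketch(const int* histogram) {
  long long total = 0;          // H: total pixel count
  double weighted_total = 0.0;  // sum of i * histogram[i]
  for (int i = 0; i < 256; ++i) {
    total += histogram[i];
    weighted_total += i * static_cast<double>(histogram[i]);
  }

  int best_t = -1;
  double best_variance = 0.0;
  long long count_low = 0;      // omega_0: pixels with value <= t
  double weighted_low = 0.0;
  for (int t = 0; t < 255; ++t) {
    count_low += histogram[t];
    weighted_low += t * static_cast<double>(histogram[t]);
    if (count_low == 0) continue;           // lower class still empty
    const long long count_high = total - count_low;  // omega_1
    if (count_high == 0) break;             // upper class empty from here on
    const double mean_low = weighted_low / count_low;
    const double mean_high = (weighted_total - weighted_low) / count_high;
    const double diff = mean_high - mean_low;
    const double variance = diff * diff * static_cast<double>(count_low) *
                            static_cast<double>(count_high);
    if (best_t < 0 || variance > best_variance) {
      best_variance = variance;
      best_t = t;
    }
  }
  return best_t;
}
```

Ties keep the lowest t, so a histogram with equal peaks at 50 and 200 yields a threshold of 50.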
|
||||
|
||||
// Compute the histogram for the given image rectangle, and the given
|
||||
// channel. (Channel pointed to by imagedata.) Each channel is always
|
||||
// one byte per pixel.
|
||||
// Bytes per pixel is used to skip channels not being
|
||||
// counted with this call in a multi-channel (pixel-major) image.
|
||||
// Histogram is always a 256 element array to count occurrences of
|
||||
// each pixel value.
|
||||
void TessBaseAPI::HistogramRect(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top, int right, int bottom,
|
||||
int* histogram) {
|
||||
int width = right - left;
|
||||
memset(histogram, 0, sizeof(*histogram) * 256);
|
||||
const UINT8* pix = imagedata +
|
||||
top*bytes_per_line +
|
||||
left*bytes_per_pixel;
|
||||
for (int y = top; y < bottom; ++y) {
|
||||
for (int x = 0; x < width; ++x) {
|
||||
++histogram[pix[x * bytes_per_pixel]];
|
||||
}
|
||||
pix += bytes_per_line;
|
||||
}
|
||||
}
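// Illustrative sketch only, not part of the Tesseract sources: the same
// per-channel counting as HistogramRect above, written as a free function so
// that the pixel-major stride arithmetic can be tried in isolation. The names
// ChannelHistogram and DemoCount are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Counts the values of one channel inside [left,right) x [top,bottom) of a
// pixel-major image. `channel0` points at the first byte of that channel.
void ChannelHistogram(const uint8_t* channel0, int bytes_per_pixel,
                      int bytes_per_line, int left, int top, int right,
                      int bottom, int* histogram) {
  std::memset(histogram, 0, sizeof(*histogram) * 256);
  const uint8_t* row = channel0 + top * bytes_per_line + left * bytes_per_pixel;
  for (int y = top; y < bottom; ++y, row += bytes_per_line) {
    for (int x = 0; x < right - left; ++x) {
      // Stepping by bytes_per_pixel skips the other channels of each pixel.
      ++histogram[row[x * bytes_per_pixel]];
    }
  }
}

// Builds a 2x2 RGB image and returns how often `value` occurs in `channel`.
int DemoCount(int channel, int value) {
  // Pixels (R,G,B): (10,20,30) (10,21,31) / (11,20,30) (10,20,32)
  const std::vector<uint8_t> rgb = {10, 20, 30, 10, 21, 31,
                                    11, 20, 30, 10, 20, 32};
  int histogram[256];
  ChannelHistogram(rgb.data() + channel, 3, 6, 0, 0, 2, 2, histogram);
  return histogram[value];
}
```

// Passing `rgb.data() + channel` mirrors the `imagedata + ch` argument used
// by OtsuThreshold above.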
|
||||
|
||||
// Compute the Otsu threshold(s) for the given histogram.
|
||||
// Also returns H = total count in histogram, and
|
||||
// omega0 = count of histogram entries at or below the threshold.
|
||||
int TessBaseAPI::OtsuStats(const int* histogram,
|
||||
int* H_out,
|
||||
int* omega0_out) {
|
||||
int H = 0;
|
||||
double mu_T = 0.0;
|
||||
for (int i = 0; i < 256; ++i) {
|
||||
H += histogram[i];
|
||||
mu_T += i * histogram[i];
|
||||
}
|
||||
|
||||
// Now maximize sig_sq_B over t.
|
||||
// http://www.ctie.monash.edu.au/hargreave/Cornall_Terry_328.pdf
|
||||
int best_t = -1;
|
||||
int omega_0, omega_1;
|
||||
int best_omega_0 = 0;
|
||||
double best_sig_sq_B = 0.0;
|
||||
double mu_0, mu_1, mu_t;
|
||||
omega_0 = 0;
|
||||
mu_t = 0.0;
|
||||
for (int t = 0; t < 255; ++t) {
|
||||
omega_0 += histogram[t];
|
||||
mu_t += t * static_cast<double>(histogram[t]);
|
||||
if (omega_0 == 0)
|
||||
continue;
|
||||
omega_1 = H - omega_0;
if (omega_1 == 0)
  break;  // Nothing lies above t, so mu_1 would be 0/0.
|
||||
mu_0 = mu_t / omega_0;
|
||||
mu_1 = (mu_T - mu_t) / omega_1;
|
||||
double sig_sq_B = mu_1 - mu_0;
|
||||
sig_sq_B *= sig_sq_B * omega_0 * omega_1;
|
||||
if (best_t < 0 || sig_sq_B > best_sig_sq_B) {
|
||||
best_sig_sq_B = sig_sq_B;
|
||||
best_t = t;
|
||||
best_omega_0 = omega_0;
|
||||
}
|
||||
}
|
||||
if (H_out != NULL) *H_out = H;
|
||||
if (omega0_out != NULL) *omega0_out = best_omega_0;
|
||||
return best_t;
|
||||
}
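// Illustrative sketch only, not part of the Tesseract sources: a stand-alone
// version of the search in OtsuStats above. It maximizes
// omega_0 * omega_1 * (mu_1 - mu_0)^2 over t, where class 0 holds the values
// at or below t. OtsuSketch and DemoBimodal are hypothetical names, and
// returning -1 when the histogram cannot be split is an assumption of this
// sketch rather than documented behaviour of OtsuStats.

```cpp
#include <vector>

int OtsuSketch(const std::vector<int>& histogram) {
  const int bins = static_cast<int>(histogram.size());
  double total = 0.0, weighted_total = 0.0;
  for (int i = 0; i < bins; ++i) {
    total += histogram[i];
    weighted_total += static_cast<double>(i) * histogram[i];
  }
  int best_t = -1;
  double best_score = 0.0;
  double omega_0 = 0.0, weighted_0 = 0.0;
  for (int t = 0; t + 1 < bins; ++t) {
    omega_0 += histogram[t];
    weighted_0 += static_cast<double>(t) * histogram[t];
    const double omega_1 = total - omega_0;
    if (omega_0 == 0.0) continue;  // Nothing at or below t yet.
    if (omega_1 == 0.0) break;     // Nothing left above t.
    const double diff =
        (weighted_total - weighted_0) / omega_1 - weighted_0 / omega_0;
    const double score = omega_0 * omega_1 * diff * diff;
    if (best_t < 0 || score > best_score) {
      best_score = score;
      best_t = t;
    }
  }
  return best_t;
}

// Two equal peaks at 50 and 200: every t in [50, 199] scores the same, and
// the strict comparison keeps the first one found.
int DemoBimodal() {
  std::vector<int> histogram(256, 0);
  histogram[50] = 100;
  histogram[200] = 100;
  return OtsuSketch(histogram);
}
```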
|
||||
|
||||
// Threshold the given grey or color image into the tesseract global
|
||||
// image ready for recognition. Requires thresholds and hi_value
|
||||
// produced by OtsuThreshold above.
|
||||
void TessBaseAPI::ThresholdRect(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top,
|
||||
int width, int height,
|
||||
const int* thresholds,
|
||||
const int* hi_values) {
|
||||
IMAGELINE line;
|
||||
page_image.create(width, height, 1);
|
||||
line.init(width);
|
||||
// For each line in the image, fill the IMAGELINE class and put it into the
|
||||
// Tesseract global page_image. Note that Tesseract stores images with the
|
||||
// bottom at y=0 and 0 is black, so we need 2 kinds of inversion.
|
||||
const UINT8* data = imagedata + top*bytes_per_line + left*bytes_per_pixel;
|
||||
for (int y = height - 1 ; y >= 0; --y) {
|
||||
const UINT8* pix = data;
|
||||
for (int x = 0; x < width; ++x, pix += bytes_per_pixel) {
|
||||
line.pixels[x] = 1;
|
||||
for (int ch = 0; ch < bytes_per_pixel; ++ch) {
|
||||
if (hi_values[ch] >= 0 &&
|
||||
(pix[ch] > thresholds[ch]) == (hi_values[ch] == 0)) {
|
||||
line.pixels[x] = 0;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
page_image.put_line(0, y, width, &line, 0);
|
||||
data += bytes_per_line;
|
||||
}
|
||||
}
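// Illustrative sketch only, not part of the Tesseract sources: the per-pixel
// decision made inside ThresholdRect above, pulled out into a hypothetical
// helper (BinarizePixel) so the meaning of hi_values can be checked directly.
// The threshold and hi_values numbers in DemoPixel are made up.

```cpp
#include <cstdint>

// Returns 0 (black, foreground) or 1 (white, background) for one pixel.
int BinarizePixel(const uint8_t* pix, int channels, const int* thresholds,
                  const int* hi_values) {
  for (int ch = 0; ch < channels; ++ch) {
    if (hi_values[ch] < 0) continue;  // Channel carries no information.
    const bool above = pix[ch] > thresholds[ch];
    // hi_value 0: values above the threshold are foreground.
    // hi_value 1: values at or below the threshold are foreground.
    if (above == (hi_values[ch] == 0)) return 0;
  }
  return 1;
}

// Red: dark is foreground. Green: ignored. Blue: bright is foreground.
int DemoPixel(int r, int g, int b) {
  const uint8_t pix[3] = {static_cast<uint8_t>(r), static_cast<uint8_t>(g),
                          static_cast<uint8_t>(b)};
  const int thresholds[3] = {128, 128, 128};
  const int hi_values[3] = {1, -1, 0};
  return BinarizePixel(pix, 3, thresholds, hi_values);
}
```

// A single channel voting "foreground" is enough to turn the pixel black,
// which matches the early break in the loop above.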
|
||||
|
||||
// Cut out the requested rectangle of the binary image to the
|
||||
// tesseract global image ready for recognition.
|
||||
void TessBaseAPI::CopyBinaryRect(const UINT8* imagedata,
|
||||
int bytes_per_line,
|
||||
int left, int top,
|
||||
int width, int height) {
|
||||
// Copy binary image, cutting out the required rectangle.
|
||||
IMAGE image;
|
||||
image.capture(const_cast<UINT8*>(imagedata),
|
||||
bytes_per_line*8, top + height, 1);
|
||||
page_image.create(width, height, 1);
|
||||
copy_sub_image(&image, left, top, width, height, &page_image, 0, 0, false);
|
||||
}
|
||||
|
||||
// Low-level function to recognize the current global image to a string.
|
||||
char* TessBaseAPI::RecognizeToString() {
|
||||
BLOCK_LIST block_list;
|
||||
|
||||
FindLines(&block_list);
|
||||
|
||||
// Now run the main recognition.
|
||||
PAGE_RES* page_res = Recognize(&block_list, NULL);
|
||||
|
||||
return TesseractToText(page_res);
|
||||
}
|
||||
|
||||
// Find lines from the image making the BLOCK_LIST.
|
||||
void TessBaseAPI::FindLines(BLOCK_LIST* block_list) {
|
||||
STRING input_file = "noname.tif";
|
||||
// The following call creates a full-page block and then runs connected
|
||||
// component analysis and text line creation.
|
||||
pgeditor_read_file(input_file, block_list);
|
||||
}
|
||||
|
||||
// Recognize the tesseract global image and return the result as Tesseract
|
||||
// internal structures.
|
||||
PAGE_RES* TessBaseAPI::Recognize(BLOCK_LIST* block_list, ETEXT_DESC* monitor) {
|
||||
if (tessedit_resegment_from_boxes)
|
||||
apply_boxes(block_list);
|
||||
if (edit_variables)
|
||||
start_variables_editor();
|
||||
|
||||
PAGE_RES* page_res = new PAGE_RES(block_list);
|
||||
if (interactive_mode) {
|
||||
pgeditor_main(block_list); //pgeditor user I/F
|
||||
} else if (tessedit_train_from_boxes) {
|
||||
apply_box_training(block_list);
|
||||
} else {
|
||||
// Now run the main recognition.
|
||||
recog_all_words(page_res, monitor);
|
||||
}
|
||||
return page_res;
|
||||
}
|
||||
|
||||
// Make a text string from the internal data structures.
|
||||
// The input page_res is deleted.
|
||||
char* TessBaseAPI::TesseractToText(PAGE_RES* page_res) {
|
||||
if (page_res != NULL) {
|
||||
int total_length = 2;
|
||||
PAGE_RES_IT page_res_it(page_res);
|
||||
// Iterate over the data structures to extract the recognition result.
|
||||
for (page_res_it.restart_page(); page_res_it.word () != NULL;
|
||||
page_res_it.forward()) {
|
||||
WERD_RES *word = page_res_it.word();
|
||||
WERD_CHOICE* choice = word->best_choice;
|
||||
if (choice != NULL) {
|
||||
total_length += choice->string().length() + 1;
|
||||
}
|
||||
}
|
||||
char* result = new char[total_length];
|
||||
char* ptr = result;
|
||||
for (page_res_it.restart_page(); page_res_it.word () != NULL;
|
||||
page_res_it.forward()) {
|
||||
WERD_RES *word = page_res_it.word();
|
||||
WERD_CHOICE* choice = word->best_choice;
|
||||
if (choice != NULL) {
|
||||
strcpy(ptr, choice->string().string());
|
||||
ptr += strlen(ptr);
|
||||
if (word->word->flag(W_EOL))
|
||||
*ptr++ = '\n';
|
||||
else
|
||||
*ptr++ = ' ';
|
||||
}
|
||||
}
|
||||
*ptr++ = '\n';
|
||||
*ptr = '\0';
|
||||
delete page_res;
|
||||
return result;
|
||||
}
|
||||
return NULL;
|
||||
}
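// Illustrative sketch only, not part of the Tesseract sources: the two-pass
// pattern used by TesseractToText above (measure first, then copy), applied
// to a plain vector of words. WordSketch, JoinWords and DemoJoin are
// hypothetical names standing in for the PAGE_RES iteration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

struct WordSketch {
  std::string text;
  bool end_of_line;
};

// Returns a new[]-allocated string; the caller frees it with delete [].
char* JoinWords(const std::vector<WordSketch>& words) {
  // Pass 1: one separator per word, plus the final '\n' and the '\0'.
  std::size_t total = 2;
  for (const WordSketch& w : words) total += w.text.size() + 1;
  char* result = new char[total];
  // Pass 2: copy each word followed by a space or a newline.
  char* ptr = result;
  for (const WordSketch& w : words) {
    std::memcpy(ptr, w.text.data(), w.text.size());
    ptr += w.text.size();
    *ptr++ = w.end_of_line ? '\n' : ' ';
  }
  *ptr++ = '\n';
  *ptr = '\0';
  return result;
}

std::string DemoJoin() {
  char* raw = JoinWords({{"Hello", false}, {"world", true}, {"again", true}});
  std::string out(raw);
  delete[] raw;
  return out;
}
```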
|
||||
|
154
ccmain/baseapi.h
Normal file
@ -0,0 +1,154 @@
|
||||
///////////////////////////////////////////////////////////////////////
|
||||
// File: baseapi.h
|
||||
// Description: Simple API for calling tesseract.
|
||||
// Author: Ray Smith
|
||||
// Created: Fri Oct 06 15:35:01 PDT 2006
|
||||
//
|
||||
// (C) Copyright 2006, Google Inc.
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
//
|
||||
///////////////////////////////////////////////////////////////////////
|
||||
|
||||
#ifndef THIRD_PARTY_TESSERACT_CCMAIN_BASEAPI_H__
|
||||
#define THIRD_PARTY_TESSERACT_CCMAIN_BASEAPI_H__
|
||||
|
||||
#include <string>
|
||||
|
||||
#include "host.h"
|
||||
#include "ocrclass.h"
|
||||
|
||||
class PAGE_RES;
|
||||
class BLOCK_LIST;
|
||||
|
||||
// Base class for all tesseract APIs.
|
||||
// Specific classes can add ability to work on different inputs or produce
|
||||
// different outputs.
|
||||
|
||||
class TessBaseAPI {
|
||||
public:
|
||||
// Start tesseract.
|
||||
// The datapath must be the name of the data directory or some other file
|
||||
// in which the data directory resides (for instance argv[0].)
|
||||
// The configfile is the name of a file in the tessconfigs directory
|
||||
// (eg batch) or NULL to run on defaults.
|
||||
// Outputbase may also be NULL, and is the basename of various output files.
|
||||
// If the output of any of these files is enabled, then a name must be given.
|
||||
// If numeric_mode is true, only digits and Roman numerals can be
// returned. Returns 0 if successful. Crashes if not.
|
||||
// The argc and argv may be 0 and NULL respectively. They are used for
|
||||
// providing config files for debug/display purposes.
|
||||
// TODO(rays) get the facts straight. Is it OK to call
|
||||
// it more than once? Make it properly check for errors and return them.
|
||||
static int Init(const char* datapath, const char* outputbase,
|
||||
const char* configfile, bool numeric_mode,
|
||||
int argc, char* argv[]);
|
||||
|
||||
// Recognize a rectangle from an image and return the result as a string.
|
||||
// May be called many times for a single Init.
|
||||
// Currently has no error checking.
|
||||
// Greyscale of 8 and color of 24 or 32 bits per pixel may be given.
|
||||
// Palette color images will not work properly and must be converted to
|
||||
// 24 bit.
|
||||
// Binary images of 1 bit per pixel may also be given but they must be
|
||||
// byte packed with the MSB of the first byte being the first pixel, and a
|
||||
// 1 represents WHITE. For binary images set bytes_per_pixel=0.
|
||||
// The recognized text is returned as a char* which (in future will be coded
|
||||
// as UTF8 and) must be freed with the delete [] operator.
|
||||
static char* TesseractRect(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top, int width, int height);
|
||||
|
||||
// Call between pages or documents etc to free up memory and forget
|
||||
// adaptive data.
|
||||
static void ClearAdaptiveClassifier();
|
||||
|
||||
// Close down tesseract and free up memory.
|
||||
static void End();
|
||||
|
||||
// Dump the internal binary image to a PGM file.
|
||||
static void DumpPGM(const char* filename);
|
||||
|
||||
protected:
|
||||
// Copy the given image rectangle to Tesseract, with adaptive thresholding
|
||||
// if the image is not already binary.
|
||||
static void CopyImageToTesseract(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top, int width, int height);
|
||||
|
||||
// Compute the Otsu threshold(s) for the given image rectangle, making one
|
||||
// for each channel. Each channel is always one byte per pixel.
|
||||
// Returns an array of threshold values and an array of hi_values, such
|
||||
// that a pixel value >threshold[channel] is considered foreground if
|
||||
// hi_values[channel] is 0 or background if 1. A hi_value of -1 indicates
|
||||
// that there is no apparent foreground. At least one hi_value will not be -1.
|
||||
// thresholds and hi_values are assumed to be of bytes_per_pixel size.
|
||||
static void OtsuThreshold(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top, int right, int bottom,
|
||||
int* thresholds,
|
||||
int* hi_values);
|
||||
|
||||
// Compute the histogram for the given image rectangle, and the given
|
||||
// channel. (Channel pointed to by imagedata.) Each channel is always
|
||||
// one byte per pixel.
|
||||
// Bytes per pixel is used to skip channels not being
|
||||
// counted with this call in a multi-channel (pixel-major) image.
|
||||
// Histogram is always a 256 element array to count occurrences of
|
||||
// each pixel value.
|
||||
static void HistogramRect(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top, int right, int bottom,
|
||||
int* histogram);
|
||||
|
||||
// Compute the Otsu threshold(s) for the given histogram.
|
||||
// Also returns H = total count in histogram, and
|
||||
// omega0 = count of histogram entries at or below the threshold.
|
||||
static int OtsuStats(const int* histogram,
|
||||
int* H_out,
|
||||
int* omega0_out);
|
||||
|
||||
// Threshold the given grey or color image into the tesseract global
|
||||
// image ready for recognition. Requires thresholds and hi_value
|
||||
// produced by OtsuThreshold above.
|
||||
static void ThresholdRect(const UINT8* imagedata,
|
||||
int bytes_per_pixel,
|
||||
int bytes_per_line,
|
||||
int left, int top,
|
||||
int width, int height,
|
||||
const int* thresholds,
|
||||
const int* hi_values);
|
||||
|
||||
// Cut out the requested rectangle of the binary image to the
|
||||
// tesseract global image ready for recognition.
|
||||
static void CopyBinaryRect(const UINT8* imagedata,
|
||||
int bytes_per_line,
|
||||
int left, int top,
|
||||
int width, int height);
|
||||
|
||||
// Low-level function to recognize the current global image to a string.
|
||||
static char* RecognizeToString();
|
||||
|
||||
// Find lines from the image making the BLOCK_LIST.
|
||||
static void FindLines(BLOCK_LIST* block_list);
|
||||
|
||||
// Recognize the tesseract global image and return the result as Tesseract
|
||||
// internal structures.
|
||||
static PAGE_RES* Recognize(BLOCK_LIST* block_list, ETEXT_DESC* monitor);
|
||||
|
||||
// Convert (and free) the internal data structures into a text string.
|
||||
static char* TesseractToText(PAGE_RES* page_res);
|
||||
};
|
||||
|
||||
#endif // THIRD_PARTY_TESSERACT_CCMAIN_BASEAPI_H__
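// Illustrative sketch only, not part of the Tesseract sources: the comment on
// TesseractRect above asks for binary images that are byte packed with the
// MSB of the first byte being the first pixel, and with 1 meaning white. The
// hypothetical helper below packs one row that way. Leaving unused bits of
// the last byte at 0 is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs one row of pixels (true = white) into bytes, first pixel in the
// most significant bit of the first byte.
std::vector<uint8_t> PackRowMsbFirst(const std::vector<bool>& white) {
  std::vector<uint8_t> packed((white.size() + 7) / 8, 0);
  for (std::size_t x = 0; x < white.size(); ++x) {
    if (white[x]) packed[x / 8] |= static_cast<uint8_t>(0x80u >> (x % 8));
  }
  return packed;
}
```

// For rows packed like this, bytes_per_line would be the packed row size and
// bytes_per_pixel would be 0, as the header comment describes.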
|
76
ccmain/blobcmp.cpp
Normal file
@ -0,0 +1,76 @@
|
||||
/**********************************************************************
|
||||
* File: blobcmp.cpp (Formerly blobcmp.c)
|
||||
* Description: Code to compare blobs using the adaptive matcher.
|
||||
* Author: Ray Smith
|
||||
* Created: Wed Apr 21 09:28:51 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "fxdefs.h"
|
||||
#include "ocrfeatures.h"
|
||||
#include "intmatcher.h"
|
||||
#include "intproto.h"
|
||||
#include "adaptive.h"
|
||||
#include "adaptmatch.h"
|
||||
#include "const.h"
|
||||
#include "tessvars.h"
|
||||
|
||||
#define CMP_CLASS 'x'
|
||||
|
||||
/**********************************************************************
|
||||
* compare_tess_blobs
|
||||
*
|
||||
* Match 2 blobs using the adaptive classifier.
|
||||
**********************************************************************/
|
||||
float compare_tess_blobs(TBLOB *blob1,
|
||||
TEXTROW *row1,
|
||||
TBLOB *blob2,
|
||||
TEXTROW *row2) {
|
||||
int fcount; /*number of features */
|
||||
ADAPT_TEMPLATES ad_templates;
|
||||
LINE_STATS line_stats1, line_stats2;
|
||||
INT_FEATURE_ARRAY int_features;
|
||||
FEATURE_SET float_features;
|
||||
INT_RESULT_STRUCT int_result; /*output */
|
||||
|
||||
BIT_VECTOR AllProtosOn = NewBitVector (MAX_NUM_PROTOS);
|
||||
BIT_VECTOR AllConfigsOn = NewBitVector (MAX_NUM_CONFIGS);
|
||||
set_all_bits (AllProtosOn, WordsInVectorOfSize (MAX_NUM_PROTOS));
|
||||
set_all_bits (AllConfigsOn, WordsInVectorOfSize (MAX_NUM_CONFIGS));
|
||||
|
||||
EnterClassifyMode;
|
||||
ad_templates = NewAdaptedTemplates ();
|
||||
GetLineStatsFromRow(row1, &line_stats1);
|
||||
/*copy baseline stuff */
|
||||
GetLineStatsFromRow(row2, &line_stats2);
|
||||
MakeNewAdaptedClass(blob1, &line_stats1, CMP_CLASS, ad_templates);
|
||||
fcount = GetAdaptiveFeatures (blob2, &line_stats2,
|
||||
int_features, &float_features);
|
||||
if (fcount > 0) {
|
||||
SetBaseLineMatch();
|
||||
IntegerMatcher (ClassForClassId (ad_templates->Templates, CMP_CLASS),
|
||||
AllProtosOn, AllConfigsOn, fcount, fcount,
|
||||
int_features, 0, 0, &int_result, testedit_match_debug);
|
||||
FreeFeatureSet(float_features);
|
||||
if (int_result.Rating < 0)
|
||||
int_result.Rating = MAX_FLOAT32;
|
||||
}
|
||||
|
||||
free_adapted_templates(ad_templates);
|
||||
FreeBitVector(AllConfigsOn);
|
||||
FreeBitVector(AllProtosOn);
|
||||
|
||||
return fcount > 0 ? int_result.Rating * fcount : MAX_FLOAT32;
|
||||
}
|
29
ccmain/blobcmp.h
Normal file
@ -0,0 +1,29 @@
|
||||
/**********************************************************************
|
||||
* File: blobcmp.h
|
||||
* Description: Code to compare blobs using the adaptive matcher.
|
||||
* Author: Ray Smith
|
||||
* Created: Wed Apr 21 09:28:51 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef BLOBCMP_H
|
||||
#define BLOBCMP_H
|
||||
|
||||
#include "tstruct.h"
|
||||
|
||||
float compare_tess_blobs(TBLOB *blob1,
|
||||
TEXTROW *row1,
|
||||
TBLOB *blob2,
|
||||
TEXTROW *row2);
|
||||
#endif
|
93
ccmain/callnet.cpp
Normal file
@ -0,0 +1,93 @@
|
||||
/**********************************************************************
|
||||
* File: callnet.cpp (Formerly callnet.c)
|
||||
* Description: Interface to Neural Net matcher
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 18 10:35:00 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "errcode.h"
|
||||
//#include "nmatch.h"
|
||||
#include "globals.h"
|
||||
|
||||
#define OUTPUT_NODES 94
|
||||
|
||||
const ERRCODE NETINIT = "NN init error";
|
||||
|
||||
//extern "C"
|
||||
//{
|
||||
//extern char* demodir; /* where program lives */
|
||||
|
||||
void init_net() { /* Initialise net */
|
||||
#ifdef ASPIRIN_INCLUDED
|
||||
char wts_filename[256];
|
||||
|
||||
if (nmatch_init_network () != 0) {
|
||||
NETINIT.error ("Init_net", EXIT, "Errcode %s", nmatch_error_string ());
|
||||
}
|
||||
strcpy(wts_filename, demodir);
|
||||
strcat (wts_filename, "tessdata/netwts");
|
||||
|
||||
if (nmatch_load_network (wts_filename) != 0) {
|
||||
NETINIT.error ("Init_net", EXIT, "Weights failed, Errcode %s",
|
||||
nmatch_error_string ());
|
||||
}
|
||||
#endif
|
||||
}
|
||||
|
||||
|
||||
void callnet( /* Apply image to net */
|
||||
float *input_vector,
|
||||
char *top,
|
||||
float *top_score,
|
||||
char *next,
|
||||
float *next_score) {
|
||||
#ifdef ASPIRIN_INCLUDED
|
||||
float *output_vector;
|
||||
int i;
|
||||
int max_out_i = 0;
|
||||
int next_max_out_i = 0;
|
||||
float max_out = -9;
|
||||
float next_max_out = -9;
|
||||
|
||||
nmatch_set_input(input_vector);
|
||||
nmatch_propagate_forward();
|
||||
output_vector = nmatch_get_output ();
|
||||
|
||||
/* Now find top two choices */
|
||||
|
||||
for (i = 0; i < OUTPUT_NODES; i++) {
|
||||
if (output_vector[i] > max_out) {
|
||||
next_max_out = max_out;
|
||||
max_out = output_vector[i];
|
||||
next_max_out_i = max_out_i;
|
||||
max_out_i = i;
|
||||
}
|
||||
else {
|
||||
if (output_vector[i] > next_max_out) {
|
||||
next_max_out = output_vector[i];
|
||||
next_max_out_i = i;
|
||||
}
|
||||
}
|
||||
}
|
||||
*top = max_out_i + '!';
|
||||
*next = next_max_out_i + '!';
|
||||
*top_score = max_out;
|
||||
*next_score = next_max_out;
|
||||
#endif
|
||||
}
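// Illustrative sketch only, not part of the Tesseract sources: the top-two
// selection from callnet above, without the ASPIRIN network calls. TopTwo and
// NodeToChar are hypothetical names. The -9 starting score is copied from the
// code above and assumes that every output is larger than -9.

```cpp
#include <utility>
#include <vector>

// Returns {index of the largest score, index of the second largest}.
std::pair<int, int> TopTwo(const std::vector<float>& scores) {
  int best = 0, next = 0;
  float best_score = -9.0f, next_score = -9.0f;
  for (int i = 0; i < static_cast<int>(scores.size()); ++i) {
    if (scores[i] > best_score) {
      // The old best is demoted to second place.
      next_score = best_score;
      next = best;
      best_score = scores[i];
      best = i;
    } else if (scores[i] > next_score) {
      next_score = scores[i];
      next = i;
    }
  }
  return {best, next};
}

// Output node 0 maps to '!', the first printable non-space ASCII character.
char NodeToChar(int node) { return static_cast<char>(node + '!'); }
```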
|
||||
|
||||
|
||||
//};
|
32
ccmain/callnet.h
Normal file
@ -0,0 +1,32 @@
|
||||
/**********************************************************************
|
||||
* File: callnet.h (Formerly callnet.h)
|
||||
* Description: Interface to Neural Net matcher
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 18 10:35:00 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef CALLNET_H
|
||||
#define CALLNET_H
|
||||
|
||||
// extern "C" {
|
||||
void init_net(); /* Initialise net */
|
||||
void callnet( /* Apply image to net */
|
||||
float *input_vector,
|
||||
char *top,
|
||||
float *top_score,
|
||||
char *next,
|
||||
float *next_score);
|
||||
// };
|
||||
#endif
|
710
ccmain/charcut.cpp
Normal file
@ -0,0 +1,710 @@
|
||||
/**********************************************************************
|
||||
* File: charcut.cpp (Formerly charclip.c)
|
||||
* Description: Code for character clipping
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 11 08:35:15 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "charcut.h"
|
||||
#include "imgs.h"
|
||||
#include "showim.h"
|
||||
#include "evnts.h"
|
||||
#include "notdll.h"
|
||||
|
||||
#define LARGEST(a,b) ( (a) > (b) ? (a) : (b) )
|
||||
#define SMALLEST(a,b) ( (a) > (b) ? (b) : (a) )
|
||||
#define BUG_OFFSET 1
|
||||
#define EXTERN
|
||||
|
||||
EXTERN INT_VAR (pix_word_margin, 3, "How far outside word BB to grow");
|
||||
|
||||
extern IMAGE page_image;
|
||||
|
||||
ELISTIZE (PIXROW)
|
||||
/*************************************************************************
|
||||
* PIXROW::PIXROW()
|
||||
*
|
||||
* Constructor for a specified size PIXROW from a blob
|
||||
*************************************************************************/
|
||||
PIXROW::PIXROW(INT16 pos, INT16 count, PBLOB *blob) {
|
||||
OUTLINE_LIST *outline_list;
|
||||
OUTLINE_IT outline_it;
|
||||
POLYPT_LIST *pts_list;
|
||||
POLYPT_IT pts_it;
|
||||
INT16 i;
|
||||
FCOORD pt;
|
||||
FCOORD vec;
|
||||
float y_coord;
|
||||
INT16 x_coord;
|
||||
|
||||
row_offset = pos;
|
||||
row_count = count;
|
||||
min = (INT16 *) alloc_mem (count * sizeof (INT16));
|
||||
max = (INT16 *) alloc_mem (count * sizeof (INT16));
|
||||
outline_list = blob->out_list ();
|
||||
outline_it.set_to_list (outline_list);
|
||||
|
||||
for (i = 0; i < count; i++) {
|
||||
min[i] = MAX_INT16 - 1;
|
||||
max[i] = -MAX_INT16 + 1;
|
||||
y_coord = row_offset + i + 0.5;
|
||||
for (outline_it.mark_cycle_pt ();
|
||||
!outline_it.cycled_list (); outline_it.forward ()) {
|
||||
pts_list = outline_it.data ()->polypts ();
|
||||
pts_it.set_to_list (pts_list);
|
||||
for (pts_it.mark_cycle_pt ();
|
||||
!pts_it.cycled_list (); pts_it.forward ()) {
|
||||
pt = pts_it.data ()->pos;
|
||||
vec = pts_it.data ()->vec;
|
||||
if ((vec.y () != 0) &&
|
||||
(((pt.y () <= y_coord) && (pt.y () + vec.y () >= y_coord))
|
||||
|| ((pt.y () >= y_coord)
|
||||
&& (pt.y () + vec.y () <= y_coord)))) {
|
||||
/* The segment crosses y_coord so find x-point and check for min/max. */
|
||||
x_coord = (INT16) floor ((y_coord -
|
||||
pt.y ()) * vec.x () / vec.y () +
|
||||
pt.x () + 0.5);
|
||||
if (x_coord < min[i])
|
||||
min[i] = x_coord;
|
||||
x_coord--; //to get pix to left of line
|
||||
if (x_coord > max[i])
|
||||
max[i] = x_coord;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
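// Illustrative sketch only, not part of the Tesseract sources: the scanline
// test used in the PIXROW constructor above, for a single outline segment
// that starts at (px, py) and runs along the vector (vx, vy). CrossingX is a
// hypothetical name and plain floats stand in for FCOORD.

```cpp
#include <cmath>

// Returns true and sets *x_out if the segment crosses the horizontal line
// y = y_coord. Horizontal segments (vy == 0) never count as a crossing.
bool CrossingX(float px, float py, float vx, float vy, float y_coord,
               int* x_out) {
  if (vy == 0.0f) return false;
  const bool upward = py <= y_coord && py + vy >= y_coord;
  const bool downward = py >= y_coord && py + vy <= y_coord;
  if (!upward && !downward) return false;
  // Linear interpolation along the segment, rounded to the nearest pixel.
  *x_out = static_cast<int>(
      std::floor((y_coord - py) * vx / vy + px + 0.5));
  return true;
}
```

// The constructor above applies this at y_coord = row + 0.5, takes the
// smallest crossing as min[i], and takes the largest crossing minus one as
// max[i].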
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* PIXROW::plot()
|
||||
*
|
||||
* Draw the PIXROW
|
||||
*************************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void PIXROW::plot(WINDOW fd //where to paint
|
||||
) const {
|
||||
INT16 i;
|
||||
INT16 y_coord;
|
||||
|
||||
for (i = 0; i < row_count; i++) {
|
||||
y_coord = row_offset + i;
|
||||
if (min[i] <= max[i]) {
|
||||
rectangle (fd, min[i], y_coord, max[i] + 1, y_coord + 1);
|
||||
}
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
/*************************************************************************
|
||||
* PIXROW::bad_box()
*
* Return true if the bounding box of the PIXROW exceeds the image size
|
||||
*************************************************************************/
|
||||
|
||||
bool PIXROW::bad_box( //return true if box exceeds image
|
||||
int xsize,
|
||||
int ysize) const {
|
||||
BOX bbox = bounding_box ();
|
||||
if (bbox.left () < 0 || bbox.right () > xsize
|
||||
|| bbox.top () > ysize || bbox.bottom () < 0) {
|
||||
tprintf("Box (%d,%d)->(%d,%d) bad compared to %d,%d\n",
|
||||
bbox.left(),bbox.bottom(), bbox.right(), bbox.top(),
|
||||
xsize, ysize);
|
||||
return true;
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* PIXROW::bounding_box()
|
||||
*
|
||||
* Generate bounding box for blob image
|
||||
*************************************************************************/
|
||||
|
||||
BOX PIXROW::bounding_box() const {
|
||||
INT16 i;
|
||||
INT16 y_coord;
|
||||
INT16 min_x = MAX_INT16 - 1;
|
||||
INT16 min_y = MAX_INT16 - 1;
|
||||
INT16 max_x = -MAX_INT16 + 1;
|
||||
INT16 max_y = -MAX_INT16 + 1;
|
||||
|
||||
for (i = 0; i < row_count; i++) {
|
||||
y_coord = row_offset + i;
|
||||
if (min[i] <= max[i]) {
|
||||
if (y_coord < min_y)
|
||||
min_y = y_coord;
|
||||
if (y_coord + 1 > max_y)
|
||||
max_y = y_coord + 1;
|
||||
if (min[i] < min_x)
|
||||
min_x = min[i];
|
||||
if (max[i] + 1 > max_x)
|
||||
max_x = max[i] + 1;
|
||||
}
|
||||
}
|
||||
if (min_x > max_x || min_y > max_y)
|
||||
return BOX ();
|
||||
else
|
||||
return BOX (ICOORD (min_x, min_y), ICOORD (max_x, max_y));
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* PIXROW::contract()
|
||||
*
|
||||
* Reduce the mins and maxs so that they end on black pixels
|
||||
*************************************************************************/
|
||||
|
||||
void PIXROW::contract( //image array
|
||||
IMAGELINE *imlines,
|
||||
INT16 x_offset, //of pixels[0]
|
||||
INT16 foreground_colour //0 or 1
|
||||
) {
|
||||
INT16 i;
|
||||
UINT8 *line_pixels;
|
||||
|
||||
for (i = 0; i < row_count; i++) {
|
||||
if (min[i] > max[i])
|
||||
continue;
|
||||
|
||||
line_pixels = imlines[i].pixels;
|
||||
while (line_pixels[min[i] - x_offset] != foreground_colour) {
|
||||
if (min[i] == max[i]) {
|
||||
min[i] = MAX_INT16 - 1;
|
||||
max[i] = -MAX_INT16 + 1;
|
||||
goto nextline;
|
||||
}
|
||||
else
|
||||
min[i]++;
|
||||
}
|
||||
while (line_pixels[max[i] - x_offset] != foreground_colour) {
|
||||
if (min[i] == max[i]) {
|
||||
min[i] = MAX_INT16 - 1;
|
||||
max[i] = -MAX_INT16 + 1;
|
||||
goto nextline;
|
||||
}
|
||||
else
|
||||
max[i]--;
|
||||
}
|
||||
nextline:;
|
||||
//goto label!
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* PIXROW::extend()
|
||||
*
|
||||
* 1 pixel extension in each direction to cover extra black area
|
||||
*************************************************************************/
|
||||
|
||||
BOOL8 PIXROW::extend( //image array
|
||||
IMAGELINE *imlines,
|
||||
BOX &imbox,
|
||||
PIXROW *prev, //for prev blob
|
||||
PIXROW *next, //for next blob
|
||||
INT16 foreground_colour) {
|
||||
INT16 i;
|
||||
INT16 x_offset = imbox.left ();
|
||||
INT16 limit;
|
||||
INT16 left_limit;
|
||||
INT16 right_limit;
|
||||
UINT8 *pixels = NULL;
|
||||
UINT8 *pixels_below = NULL; //row below current
|
||||
UINT8 *pixels_above = NULL; //row above current
|
||||
BOOL8 changed = FALSE;
|
||||
|
||||
pixels_above = imlines[0].pixels;
|
||||
for (i = 0; i < row_count; i++) {
|
||||
pixels_below = pixels;
|
||||
pixels = pixels_above;
|
||||
if (i < (row_count - 1))
|
||||
pixels_above = imlines[i + 1].pixels;
|
||||
else
|
||||
pixels_above = NULL;
|
||||
|
||||
/* Extend Left by one pixel*/
|
||||
if (prev == NULL || prev->max[i] < prev->min[i])
|
||||
limit = imbox.left ();
|
||||
else
|
||||
limit = prev->max[i] + 1;
|
||||
if ((min[i] <= max[i]) &&
|
||||
(min[i] > limit) &&
|
||||
(pixels[min[i] - 1 - x_offset] == foreground_colour)) {
|
||||
min[i]--;
|
||||
changed = TRUE;
|
||||
}
|
||||
|
||||
/* Extend Right by one pixel*/
|
||||
if (next == NULL || next->min[i] > next->max[i])
|
||||
limit = imbox.right () - 1;//-1 to index inside pix
|
||||
else
|
||||
limit = next->min[i] - 1;
|
||||
if ((min[i] <= max[i]) &&
|
||||
(max[i] < limit) &&
|
||||
(pixels[max[i] + 1 - x_offset] == foreground_colour)) {
|
||||
max[i]++;
|
||||
changed = TRUE;
|
||||
}
|
||||
|
||||
/* Extend down by one row */
|
||||
if (pixels_below != NULL) {
|
||||
if (min[i] < min[i - 1]) { //row goes left of row below
|
||||
if (prev == NULL || prev->max[i - 1] < prev->min[i - 1])
|
||||
left_limit = min[i];
|
||||
else
|
||||
left_limit = LARGEST (min[i], prev->max[i - 1] + 1);
|
||||
}
|
||||
else
|
||||
left_limit = min[i - 1];
|
||||
|
||||
if (max[i] > max[i - 1]) { //row goes right of row below
|
||||
if (next == NULL || next->min[i - 1] > next->max[i - 1])
|
||||
right_limit = max[i];
|
||||
else
|
||||
right_limit = SMALLEST (max[i], next->min[i - 1] - 1);
|
||||
}
|
||||
else
|
||||
right_limit = max[i - 1];
|
||||
|
||||
while ((left_limit <= right_limit) &&
|
||||
(pixels_below[left_limit - x_offset] != foreground_colour))
|
||||
left_limit++; //find black extremity
|
||||
|
||||
if ((left_limit <= right_limit) && (left_limit < min[i - 1])) {
|
||||
min[i - 1] = left_limit; //widen left if poss
|
||||
changed = TRUE;
|
||||
}
|
||||
|
||||
while ((left_limit <= right_limit) &&
|
||||
(pixels_below[right_limit - x_offset] != foreground_colour))
|
||||
right_limit--; //find black extremity
|
||||
|
||||
if ((left_limit <= right_limit) && (right_limit > max[i - 1])) {
|
||||
max[i - 1] = right_limit;//widen right if poss
|
||||
changed = TRUE;
|
||||
}
|
||||
}
|
||||
|
||||
/* Extend up by one row */
|
||||
if (pixels_above != NULL) {
|
||||
if (min[i] < min[i + 1]) { //row goes left of row above
|
||||
if (prev == NULL || prev->min[i + 1] > prev->max[i + 1])
|
||||
left_limit = min[i];
|
||||
else
|
||||
left_limit = LARGEST (min[i], prev->max[i + 1] + 1);
|
||||
}
|
||||
else
|
||||
left_limit = min[i + 1];
|
||||
|
||||
if (max[i] > max[i + 1]) { //row goes right of row above
|
||||
if (next == NULL || next->min[i + 1] > next->max[i + 1])
|
||||
right_limit = max[i];
|
||||
else
|
||||
right_limit = SMALLEST (max[i], next->min[i + 1] - 1);
|
||||
}
|
||||
else
|
||||
right_limit = max[i + 1];
|
||||
|
||||
while ((left_limit <= right_limit) &&
|
||||
(pixels_above[left_limit - x_offset] != foreground_colour))
|
||||
left_limit++; //find black extremity
|
||||
|
||||
if ((left_limit <= right_limit) && (left_limit < min[i + 1])) {
|
||||
min[i + 1] = left_limit; //widen left if poss
|
||||
changed = TRUE;
|
||||
}
|
||||
|
||||
while ((left_limit <= right_limit) &&
|
||||
(pixels_above[right_limit - x_offset] != foreground_colour))
|
||||
right_limit--; //find black extremity
|
||||
|
||||
if ((left_limit <= right_limit) && (right_limit > max[i + 1])) {
|
||||
max[i + 1] = right_limit;//widen right if poss
|
||||
changed = TRUE;
|
||||
}
|
||||
}
|
||||
}
|
||||
return changed;
|
||||
}
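The left and right steps of PIXROW::extend() above can be sketched on a single row as follows. This is a minimal illustration, not the routine itself: `extend_row` and its parameters are hypothetical names, the pixel row is a plain `std::vector`, and the limits that the real code derives from the image box and the neighbouring blobs are simply passed in.

```cpp
#include <cassert>
#include <vector>

// Grow one occupied row span [min_x, max_x] by at most one pixel each way.
// pixels[x - x_offset] holds the pixel at page x; left_limit and right_limit
// are the furthest x values the span may reach. Returns true if it changed.
bool extend_row(const std::vector<unsigned char> &pixels, int x_offset,
                int left_limit, int right_limit, unsigned char foreground,
                int &min_x, int &max_x) {
  // The limits must lie inside the pixel row so every index below is valid.
  assert(left_limit >= x_offset);
  assert(right_limit - x_offset < static_cast<int>(pixels.size()));
  if (min_x > max_x)
    return false;  // empty row: nothing to grow from
  bool changed = false;
  if (min_x > left_limit && pixels[min_x - 1 - x_offset] == foreground) {
    --min_x;  // the pixel just left of the span is foreground
    changed = true;
  }
  if (max_x < right_limit && pixels[max_x + 1 - x_offset] == foreground) {
    ++max_x;  // the pixel just right of the span is foreground
    changed = true;
  }
  return changed;
}
```

Calling this repeatedly until it returns false mirrors the way the caller keeps invoking extend() while any blob still changes.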
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* PIXROW::char_clip_image()
|
||||
* Cut out a sub image for a character
|
||||
*************************************************************************/
|
||||
|
||||
void PIXROW::char_clip_image( //box of imlines extent
|
||||
IMAGELINE *imlines,
|
||||
BOX &im_box,
|
||||
ROW *row, //row containing word
|
||||
IMAGE &clip_image, //unscaled sq subimage
|
||||
float &baseline_pos //baseline ht in image
|
||||
) {
|
||||
INT16 clip_image_xsize; //sub image x size
|
||||
INT16 clip_image_ysize; //sub image y size
|
||||
INT16 x_shift; //from pixrow to subim
|
||||
INT16 y_shift; //from pixrow to subim
|
||||
BOX char_pix_box; //bbox of char pixels
|
||||
INT16 y_dest;
|
||||
INT16 x_min;
|
||||
INT16 x_max;
|
||||
INT16 x_min_dest;
|
||||
INT16 x_max_dest;
|
||||
INT16 x_width;
|
||||
INT16 y;
|
||||
|
||||
clip_image_xsize = clip_image.get_xsize ();
|
||||
clip_image_ysize = clip_image.get_ysize ();
|
||||
|
||||
char_pix_box = bounding_box ();
|
||||
/*
|
||||
The y shift is calculated by first finding the coord of the bottom of the
|
||||
image relative to the image lines. It is then reduced by the amount,
|
||||
relative to the clip image size, necessary to vertically position the
|
||||
character.
|
||||
*/
|
||||
y_shift = char_pix_box.bottom () - row_offset -
|
||||
(INT16) floor ((clip_image_ysize - char_pix_box.height () + 0.5) / 2);
|
||||
|
||||
/*
|
||||
The x_shift is the shift to be applied to the page coord in the pixrow to
|
||||
generate a centred char in the clip image. Thus the left hand edge of the
|
||||
char is shifted to the margin width of the centred character.
|
||||
*/
|
||||
x_shift = char_pix_box.left () -
|
||||
(INT16) floor ((clip_image_xsize - char_pix_box.width () + 0.5) / 2);
|
||||
|
||||
for (y = 0; y < row_count; y++) {
|
||||
/*
|
||||
Check that there is something in this row of the source that will fit in the
|
||||
sub image. If there is, reduce x range if necessary, then copy it
|
||||
*/
|
||||
y_dest = y - y_shift;
|
||||
if ((min[y] <= max[y]) && (y_dest >= 0) && (y_dest < clip_image_ysize)) {
|
||||
x_min = min[y];
|
||||
x_min_dest = x_min - x_shift;
|
||||
if (x_min_dest < 0) {
|
||||
x_min = x_min - x_min_dest;
|
||||
x_min_dest = 0;
|
||||
}
|
||||
x_max = max[y];
|
||||
x_max_dest = x_max - x_shift;
|
||||
if (x_max_dest > clip_image_xsize - 1) {
|
||||
x_max = x_max - (x_max_dest - (clip_image_xsize - 1));
|
||||
x_max_dest = clip_image_xsize - 1;
|
||||
}
|
||||
x_width = x_max - x_min + 1;
|
||||
if (x_width > 0) {
|
||||
x_min -= im_box.left ();
|
||||
//offset pixel ptr
|
||||
imlines[y].pixels += x_min;
|
||||
clip_image.put_line (x_min_dest, y_dest, x_width, imlines + y,
|
||||
0);
|
||||
imlines[y].init (); //reset pixel ptr
|
||||
}
|
||||
}
|
||||
}
|
||||
/*
|
||||
Baseline position relative to clip image: First find the baseline relative
|
||||
to the page origin at the x coord of the centre of the character. Then
|
||||
make this relative to the character bottom. Finally shift by the margin
|
||||
between the bottom of the character and the bottom of the clip image.
|
||||
*/
|
||||
if (row == NULL)
|
||||
baseline_pos = 0; //Not needed
|
||||
else
|
||||
baseline_pos = row->base_line ((char_pix_box.left () +
|
||||
char_pix_box.right ()) / 2.0)
|
||||
- char_pix_box.bottom ()
|
||||
+ ((clip_image_ysize - char_pix_box.height ()) / 2);
|
||||
}
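The centring arithmetic used for x_shift and y_shift above can be tried in isolation. The sketch below is an assumption-laden simplification: `centring_shift` is a hypothetical helper, and it leaves out the row_offset term that the real y_shift also subtracts.

```cpp
#include <cassert>
#include <cmath>

// Shift to subtract from a source coordinate so that an object of size
// object_size starting at object_start lands centred in clip_size pixels.
// The margin uses the same floor((clip - size + 0.5) / 2) form as the source.
int centring_shift(int object_start, int object_size, int clip_size) {
  assert(object_size >= 0 && clip_size >= 0);
  int margin =
      static_cast<int>(std::floor((clip_size - object_size + 0.5) / 2));
  return object_start - margin;
}
```

For a 10 pixel wide character starting at x = 200 and a 40 pixel clip image, the margin is floor(30.5 / 2) = 15, so the shift is 185 and the character occupies columns 15 to 24 of the clip image.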
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* char_clip_word()
|
||||
*
|
||||
* Generate a PIXROW_LIST with one element for each blob in the word, together
|
||||
* with the image lines for the whole word.
|
||||
*************************************************************************/
|
||||
|
||||
void char_clip_word( //
|
||||
WERD *word, //word to be processed
|
||||
IMAGE &bin_image, //whole image
|
||||
PIXROW_LIST *&pixrow_list, //pixrows built
|
||||
IMAGELINE *&imlines, //lines cut from image
|
||||
BOX &pix_box //box defining imlines
|
||||
) {
|
||||
BOX word_box = word->bounding_box ();
|
||||
PBLOB_LIST *blob_list;
|
||||
PBLOB_IT blob_it;
|
||||
PIXROW_IT pixrow_it;
|
||||
INT16 pix_offset; //Y pos of pixrow[0]
|
||||
INT16 row_height; //No of pix rows
|
||||
INT16 imlines_x_offset;
|
||||
PIXROW *prev;
|
||||
PIXROW *next;
|
||||
PIXROW *current;
|
||||
BOOL8 changed; //still improving
|
||||
BOOL8 just_changed; //current blob changed
|
||||
INT16 iteration_count = 0;
|
||||
INT16 foreground_colour;
|
||||
|
||||
if (word->flag (W_INVERSE))
|
||||
foreground_colour = 1;
|
||||
else
|
||||
foreground_colour = 0;
|
||||
|
||||
/* Define region for max pixrow expansion */
|
||||
pix_box = word_box;
|
||||
pix_box.move_bottom_edge (-pix_word_margin);
|
||||
pix_box.move_top_edge (pix_word_margin);
|
||||
pix_box.move_left_edge (-pix_word_margin);
|
||||
pix_box.move_right_edge (pix_word_margin);
|
||||
pix_box -= BOX (ICOORD (0, 0 + BUG_OFFSET),
|
||||
ICOORD (bin_image.get_xsize (),
|
||||
bin_image.get_ysize () - BUG_OFFSET));
|
||||
|
||||
/* Generate pixrows list */
|
||||
|
||||
pix_offset = pix_box.bottom ();
|
||||
row_height = pix_box.height ();
|
||||
blob_list = word->blob_list ();
|
||||
blob_it.set_to_list (blob_list);
|
||||
|
||||
pixrow_list = new PIXROW_LIST;
|
||||
pixrow_it.set_to_list (pixrow_list);
|
||||
|
||||
for (blob_it.mark_cycle_pt (); !blob_it.cycled_list (); blob_it.forward ()) {
|
||||
PIXROW *row = new PIXROW (pix_offset, row_height, blob_it.data ());
|
||||
ASSERT_HOST (!row->
|
||||
bad_box (bin_image.get_xsize (), bin_image.get_ysize ()));
|
||||
pixrow_it.add_after_then_move (row);
|
||||
}
|
||||
|
||||
imlines = generate_imlines (bin_image, pix_box);
|
||||
|
||||
/* Contract pixrows - shrink min and max back to black pixels */
|
||||
|
||||
imlines_x_offset = pix_box.left ();
|
||||
|
||||
pixrow_it.move_to_first ();
|
||||
for (pixrow_it.mark_cycle_pt ();
|
||||
!pixrow_it.cycled_list (); pixrow_it.forward ()) {
|
||||
ASSERT_HOST (!pixrow_it.data ()->
|
||||
bad_box (bin_image.get_xsize (), bin_image.get_ysize ()));
|
||||
pixrow_it.data ()->contract (imlines, imlines_x_offset,
|
||||
foreground_colour);
|
||||
ASSERT_HOST (!pixrow_it.data ()->
|
||||
bad_box (bin_image.get_xsize (), bin_image.get_ysize ()));
|
||||
}
|
||||
|
||||
/* Expand pixrows iteratively 1 pixel at a time */
|
||||
do {
|
||||
changed = FALSE;
|
||||
pixrow_it.move_to_first ();
|
||||
prev = NULL;
|
||||
current = NULL;
|
||||
next = pixrow_it.data ();
|
||||
for (pixrow_it.mark_cycle_pt ();
|
||||
!pixrow_it.cycled_list (); pixrow_it.forward ()) {
|
||||
prev = current;
|
||||
current = next;
|
||||
if (pixrow_it.at_last ())
|
||||
next = NULL;
|
||||
else
|
||||
next = pixrow_it.data_relative (1);
|
||||
just_changed = current->extend (imlines, pix_box, prev, next,
|
||||
foreground_colour);
|
||||
ASSERT_HOST (!current->
|
||||
bad_box (bin_image.get_xsize (),
|
||||
bin_image.get_ysize ()));
|
||||
changed = changed || just_changed;
|
||||
}
|
||||
iteration_count++;
|
||||
}
|
||||
while (changed);
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* generate_imlines()
|
||||
* Get an array of IMAGELINES holding a portion of an image
|
||||
*************************************************************************/
|
||||
|
||||
IMAGELINE *generate_imlines( //get some imagelines
|
||||
IMAGE &bin_image, //from here
|
||||
BOX &pix_box) {
|
||||
IMAGELINE *imlines; //array of lines
|
||||
int i;
|
||||
|
||||
imlines = new IMAGELINE[pix_box.height ()];
|
||||
for (i = 0; i < pix_box.height (); i++) {
|
||||
imlines[i].init (pix_box.width ());
|
||||
//coord to start at
|
||||
bin_image.fast_get_line (pix_box.left (),
|
||||
pix_box.bottom () + i + BUG_OFFSET,
|
||||
//line to get
|
||||
pix_box.width (), //width to get
|
||||
imlines + i); //dest imline
|
||||
}
|
||||
return imlines;
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* display_clip_image()
|
||||
* All the boring user interface bits to let you see what's going on
|
||||
*************************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
WINDOW display_clip_image(WERD *word, //word to be processed
|
||||
IMAGE &bin_image, //whole image
|
||||
PIXROW_LIST *pixrow_list, //pixrows built
|
||||
BOX &pix_box //box of subimage
|
||||
) {
|
||||
WINDOW clip_window; //window for debug
|
||||
BOX word_box = word->bounding_box ();
|
||||
int border = word_box.height () / 2;
|
||||
BOX display_box = word_box;
|
||||
|
||||
display_box.move_bottom_edge (-border);
|
||||
display_box.move_top_edge (border);
|
||||
display_box.move_left_edge (-border);
|
||||
display_box.move_right_edge (border);
|
||||
display_box -= BOX (ICOORD (0, 0 - BUG_OFFSET),
|
||||
ICOORD (bin_image.get_xsize (),
|
||||
bin_image.get_ysize () - BUG_OFFSET));
|
||||
|
||||
pgeditor_msg ("Creating Clip window...");
|
||||
clip_window =
|
||||
create_window ("Clipped Blobs",
|
||||
SCROLLINGWIN,
|
||||
editor_word_xpos, editor_word_ypos,
|
||||
3 * (word_box.width () + 2 * border),
|
||||
3 * (word_box.height () + 2 * border),
|
||||
//window width,height
|
||||
// xmin, xmax
|
||||
display_box.left (), display_box.right (),
|
||||
display_box.bottom () - BUG_OFFSET,
|
||||
display_box.top () - BUG_OFFSET,
|
||||
// ymin, ymax
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
pgeditor_msg ("Creating Clip window...Done");
|
||||
|
||||
clear_view_surface(clip_window);
|
||||
show_sub_image (&bin_image,
|
||||
display_box.left (),
|
||||
display_box.bottom (),
|
||||
display_box.width (),
|
||||
display_box.height (),
|
||||
clip_window,
|
||||
display_box.left (), display_box.bottom () - BUG_OFFSET);
|
||||
|
||||
word->plot (clip_window, RED);
|
||||
word_box.plot (clip_window, INT_HOLLOW, TRUE, BLUE, BLUE);
|
||||
pix_box.plot (clip_window, INT_HOLLOW, TRUE, BLUE, BLUE);
|
||||
plot_pixrows(pixrow_list, clip_window);
|
||||
overlap_picture_ops(TRUE);
|
||||
return clip_window;
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* display_images()
|
||||
* Show a pair of clip and scaled character images and wait for key before
|
||||
* continuing.
|
||||
*************************************************************************/
|
||||
|
||||
void display_images(IMAGE &clip_image, IMAGE &scaled_image) {
|
||||
WINDOW clip_im_window; //window for debug
|
||||
WINDOW scale_im_window; //window for debug
|
||||
INT16 i;
|
||||
GRAPHICS_EVENT event; // c;
|
||||
|
||||
// xmin xmax ymin ymax
|
||||
clip_im_window = create_window ("Clipped Blob", SCROLLINGWIN,
editor_word_xpos - 20, editor_word_ypos - 100,
5 * clip_image.get_xsize (), 5 * clip_image.get_ysize (),
0, clip_image.get_xsize (), 0, clip_image.get_ysize (),
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
|
||||
clear_view_surface(clip_im_window);
|
||||
show_sub_image (&clip_image,
|
||||
0, 0,
|
||||
clip_image.get_xsize (), clip_image.get_ysize (),
|
||||
clip_im_window, 0, 0);
|
||||
|
||||
line_color_index(clip_im_window, RED);
|
||||
for (i = 1; i < clip_image.get_xsize (); i++) {
|
||||
move2d (clip_im_window, i, 0);
|
||||
draw2d (clip_im_window, i, clip_image.get_ysize ());
|
||||
}
|
||||
for (i = 1; i < clip_image.get_ysize (); i++) {
|
||||
move2d (clip_im_window, 0, i);
|
||||
draw2d (clip_im_window, clip_image.get_xsize (), i);
|
||||
}
|
||||
|
||||
// xmin xmax ymin ymax
|
||||
scale_im_window = create_window ("Scaled Blob", SCROLLINGWIN,
editor_word_xpos + 300, editor_word_ypos - 100,
5 * scaled_image.get_xsize (), 5 * scaled_image.get_ysize (),
0, scaled_image.get_xsize (), 0, scaled_image.get_ysize (),
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
|
||||
clear_view_surface(scale_im_window);
|
||||
show_sub_image (&scaled_image,
|
||||
0, 0,
|
||||
scaled_image.get_xsize (), scaled_image.get_ysize (),
|
||||
scale_im_window, 0, 0);
|
||||
|
||||
line_color_index(scale_im_window, RED);
|
||||
for (i = 1; i < scaled_image.get_xsize (); i++) {
|
||||
move2d (scale_im_window, i, 0);
|
||||
draw2d (scale_im_window, i, scaled_image.get_ysize ());
|
||||
}
|
||||
for (i = 1; i < scaled_image.get_ysize (); i++) {
|
||||
move2d (scale_im_window, 0, i);
|
||||
draw2d (scale_im_window, scaled_image.get_xsize (), i);
|
||||
}
|
||||
|
||||
overlap_picture_ops(TRUE);
|
||||
await_event(scale_im_window, TRUE, ANY_EVENT, &event);
|
||||
destroy_window(clip_im_window);
|
||||
destroy_window(scale_im_window);
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* plot_pixrows()
|
||||
* Display a list of pixrows
|
||||
*************************************************************************/
|
||||
|
||||
void plot_pixrows( //plot for all blobs
|
||||
PIXROW_LIST *pixrow_list,
|
||||
WINDOW win) {
|
||||
PIXROW_IT pixrow_it(pixrow_list);
|
||||
INT16 colour = RED;
|
||||
|
||||
for (pixrow_it.mark_cycle_pt ();
|
||||
!pixrow_it.cycled_list (); pixrow_it.forward ()) {
|
||||
if (colour > RED + 7)
|
||||
colour = RED;
|
||||
|
||||
perimeter_color_index (win, (COLOUR) colour);
|
||||
interior_style(win, INT_HOLLOW, TRUE);
|
||||
pixrow_it.data ()->plot (win);
|
||||
colour++;
|
||||
}
|
||||
}
|
||||
#endif
|
119
ccmain/charcut.h
Normal file
@ -0,0 +1,119 @@
|
||||
/**********************************************************************
|
||||
* File: charcut.h (Formerly charclip.h)
|
||||
* Description: Code for character clipping
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 11 08:35:15 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef CHARCUT_H
|
||||
#define CHARCUT_H
|
||||
|
||||
#include "pgedit.h"
|
||||
#include "notdll.h"
|
||||
|
||||
/*************************************************************************
|
||||
* CLASS PIXROW
|
||||
*
|
||||
* This class describes the pixels occupied by a blob. It uses two arrays, (min
|
||||
* and max), each with one element per row, to identify the min and max x
|
||||
* coordinates of the black pixels in the character on that row of the image.
|
||||
* The number of rows used to describe the blob is held in row_count - note that
|
||||
* some rows may be unoccupied - signified by max < min. The page coordinate of
|
||||
* the row defined by min[0] and max[0] is held in row_offset.
|
||||
*************************************************************************/
|
||||
|
||||
class PIXROW:public ELIST_LINK
|
||||
{
|
||||
public:
|
||||
INT16 row_offset; //y coord of min[0]
|
||||
INT16 row_count; //length of arrays
|
||||
INT16 *min; //array of min x
|
||||
INT16 *max; //array of max x
|
||||
|
||||
PIXROW() { //empty constructor
|
||||
row_offset = 0;
|
||||
row_count = 0;
|
||||
min = NULL;
|
||||
max = NULL;
|
||||
}
|
||||
PIXROW( //specified size
|
||||
INT16 pos,
|
||||
INT16 count,
|
||||
PBLOB *blob);
|
||||
|
||||
~PIXROW () { //destructor
|
||||
if (min != NULL)
|
||||
free_mem(min);
|
||||
if (max != NULL)
|
||||
free_mem(max);
|
||||
max = NULL;
|
||||
}
|
||||
|
||||
void plot( //use current settings
|
||||
WINDOW fd) const; //where to paint
|
||||
|
||||
BOX bounding_box() const; //return bounding box
|
||||
//return true if box exceeds image
|
||||
bool bad_box(int xsize, int ysize) const;
|
||||
|
||||
void contract( //force end on black
|
||||
IMAGELINE *imlines, //image array
|
||||
INT16 x_offset, //of pixels[0]
|
||||
INT16 foreground_colour); //0 or 1
|
||||
|
||||
//image array
|
||||
BOOL8 extend(IMAGELINE *imlines,
|
||||
BOX &imbox,
|
||||
PIXROW *prev, //for prev blob
|
||||
PIXROW *next, //for next blob
|
||||
INT16 foreground_colour); //0 or 1
|
||||
|
||||
//box of imlines extent
|
||||
void char_clip_image(IMAGELINE *imlines,
|
||||
BOX &im_box,
|
||||
ROW *row, //row containing word
|
||||
IMAGE &clip_image, //unscaled char image
|
||||
float &baseline_pos); //baseline ht in image
|
||||
|
||||
};
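As a rough illustration of the representation described in the PIXROW class comment above, the sketch below stores one [min, max] pair per row and derives a bounding box, skipping rows where max < min. `RowSpans` and `Bounds` are hypothetical stand-ins rather than the PIXROW and BOX types, and the real bounding_box() may treat its edges differently (for example, exclusive rather than inclusive right and top).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for PIXROW: one [min, max] x-range per image row.
struct RowSpans {
  int row_offset;        // page y coordinate of row 0
  std::vector<int> min;  // leftmost black x on each row
  std::vector<int> max;  // rightmost black x on each row
};

// Hypothetical inclusive bounding box; empty is true if no row is occupied.
struct Bounds {
  bool empty;
  int left, right, bottom, top;
};

Bounds bounding_box(const RowSpans &rows) {
  assert(rows.min.size() == rows.max.size());
  Bounds b = {true, 0, 0, 0, 0};
  for (std::size_t i = 0; i < rows.min.size(); ++i) {
    if (rows.max[i] < rows.min[i])
      continue;  // unoccupied row, signified by max < min
    int y = rows.row_offset + static_cast<int>(i);
    if (b.empty) {
      b = {false, rows.min[i], rows.max[i], y, y};
    } else {
      if (rows.min[i] < b.left) b.left = rows.min[i];
      if (rows.max[i] > b.right) b.right = rows.max[i];
      b.top = y;  // rows are visited in increasing y
    }
  }
  return b;
}
```

Keeping only the extremes per row makes the structure cheap to grow one pixel at a time, at the cost of not recording gaps inside a row.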
|
||||
|
||||
ELISTIZEH (PIXROW)
|
||||
extern INT_VAR_H (pix_word_margin, 3, "How far outside word BB to grow");
|
||||
extern BOOL_VAR_H (show_char_clipping, TRUE, "Show clip image window?");
|
||||
extern INT_VAR_H (net_image_width, 40, "NN input image width");
|
||||
extern INT_VAR_H (net_image_height, 36, "NN input image height");
|
||||
extern INT_VAR_H (net_image_x_height, 22, "NN input image x_height");
|
||||
void char_clip_word( //
|
||||
WERD *word, //word to be processed
|
||||
IMAGE &bin_image, //whole image
|
||||
PIXROW_LIST *&pixrow_list, //pixrows built
|
||||
IMAGELINE *&imlines, //lines cut from image
|
||||
BOX &pix_box //box defining imlines
|
||||
);
|
||||
IMAGELINE *generate_imlines( //get some imagelines
|
||||
IMAGE &bin_image, //from here
|
||||
BOX &pix_box);
|
||||
//word to be processed
|
||||
WINDOW display_clip_image(WERD *word,
|
||||
IMAGE &bin_image, //whole image
|
||||
PIXROW_LIST *pixrow_list, //pixrows built
|
||||
BOX &pix_box //box of subimage
|
||||
);
|
||||
void display_images(IMAGE &clip_image, IMAGE &scaled_image);
|
||||
void plot_pixrows( //plot for all blobs
|
||||
PIXROW_LIST *pixrow_list,
|
||||
WINDOW win);
|
||||
#endif
|
698
ccmain/charsample.cpp
Normal file
@ -0,0 +1,698 @@
|
||||
/**********************************************************************
|
||||
* File: charsample.cpp (Formerly charsample.c)
|
||||
* Description: Class to contain character samples and match scores
|
||||
* to be used for adaption
|
||||
* Author: Chris Newton
|
||||
* Created: Thu Oct 7 13:40:37 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdio.h>
|
||||
#include <ctype.h>
|
||||
#include <math.h>
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#include <unistd.h>
|
||||
#endif
|
||||
#include "memry.h"
|
||||
#include "tessvars.h"
|
||||
#include "statistc.h"
|
||||
#include "charsample.h"
|
||||
#include "paircmp.h"
|
||||
#include "matmatch.h"
|
||||
#include "adaptions.h"
|
||||
#include "secname.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern INT32 demo_word; // Hack for demos
|
||||
|
||||
ELISTIZE (CHAR_SAMPLE)
ELISTIZE (CHAR_SAMPLES)

CHAR_SAMPLE::CHAR_SAMPLE () {
|
||||
sample_blob = NULL;
|
||||
sample_denorm = NULL;
|
||||
sample_image = NULL;
|
||||
ch = '\0';
|
||||
n_samples_matched = 0;
|
||||
total_match_scores = 0.0;
|
||||
sumsq_match_scores = 0.0;
|
||||
}
|
||||
|
||||
|
||||
CHAR_SAMPLE::CHAR_SAMPLE(PBLOB *blob, DENORM *denorm, char c) {
|
||||
sample_blob = blob;
|
||||
sample_denorm = denorm;
|
||||
sample_image = NULL;
|
||||
ch = c;
|
||||
n_samples_matched = 0;
|
||||
total_match_scores = 0.0;
|
||||
sumsq_match_scores = 0.0;
|
||||
}
|
||||
|
||||
|
||||
CHAR_SAMPLE::CHAR_SAMPLE(IMAGE *image, char c) {
|
||||
sample_blob = NULL;
|
||||
sample_denorm = NULL;
|
||||
sample_image = image;
|
||||
ch = c;
|
||||
n_samples_matched = 0;
|
||||
total_match_scores = 0.0;
|
||||
sumsq_match_scores = 0.0;
|
||||
}
|
||||
|
||||
|
||||
float CHAR_SAMPLE::match_sample( // Update match scores
|
||||
CHAR_SAMPLE *test_sample,
|
||||
BOOL8 updating) {
|
||||
float score1;
|
||||
float score2;
|
||||
IMAGE *image = test_sample->image ();
|
||||
|
||||
if (sample_blob != NULL && test_sample->blob () != NULL) {
|
||||
PBLOB *blob = test_sample->blob ();
|
||||
DENORM *denorm = test_sample->denorm ();
|
||||
|
||||
score1 = compare_bln_blobs (sample_blob, sample_denorm, blob, denorm);
|
||||
score2 = compare_bln_blobs (blob, denorm, sample_blob, sample_denorm);
|
||||
|
||||
score1 = (score1 > score2) ? score1 : score2;
|
||||
}
|
||||
else if (sample_image != NULL && image != NULL) {
|
||||
CHAR_PROTO *sample = new CHAR_PROTO (this);
|
||||
|
||||
score1 = matrix_match (sample_image, image);
|
||||
delete sample;
|
||||
}
|
||||
else
|
||||
return BAD_SCORE;
|
||||
|
||||
if ((tessedit_use_best_sample || tessedit_cluster_debug) && updating) {
|
||||
n_samples_matched++;
|
||||
total_match_scores += score1;
|
||||
sumsq_match_scores += score1 * score1;
|
||||
}
|
||||
return score1;
|
||||
}
|
||||
|
||||
|
||||
double CHAR_SAMPLE::mean_score() {
|
||||
if (n_samples_matched > 0)
|
||||
return (total_match_scores / n_samples_matched);
|
||||
else
|
||||
return BAD_SCORE;
|
||||
}
|
||||
|
||||
|
||||
double CHAR_SAMPLE::variance() {
|
||||
double mean = mean_score ();
|
||||
|
||||
if (n_samples_matched > 0) {
|
||||
return (sumsq_match_scores / n_samples_matched) - mean * mean;
|
||||
}
|
||||
else
|
||||
return BAD_SCORE;
|
||||
}
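mean_score() and variance() above rely on three running totals: a count, a sum and a sum of squares. A minimal standalone sketch of that bookkeeping follows; `ScoreStats` is a hypothetical name, and unlike the source it asserts on an empty accumulator instead of returning BAD_SCORE.

```cpp
#include <cassert>

// Hypothetical accumulator mirroring the three running totals kept per sample.
struct ScoreStats {
  long n = 0;          // number of scores seen
  double total = 0.0;  // sum of scores
  double sumsq = 0.0;  // sum of squared scores

  void add(double score) {
    ++n;
    total += score;
    sumsq += score * score;
  }
  double mean() const {
    assert(n > 0);
    return total / n;
  }
  // Population variance: E[x^2] - (E[x])^2.
  double variance() const {
    double m = mean();
    return sumsq / n - m * m;
  }
};
```

This form needs no stored history, which suits incremental updates, although it can lose precision when the mean is large relative to the spread.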
|
||||
|
||||
|
||||
void CHAR_SAMPLE::print(FILE *f) {
|
||||
if (!tessedit_cluster_debug)
|
||||
return;
|
||||
|
||||
if (n_samples_matched > 0)
|
||||
fprintf (f,
|
||||
"%c - sample matched against " INT32FORMAT
|
||||
" blobs, mean: %f, var: %f\n", ch, n_samples_matched,
|
||||
mean_score (), variance ());
|
||||
else
|
||||
fprintf (f, "No matches for this sample (%c)\n", ch);
|
||||
}
|
||||
|
||||
|
||||
void CHAR_SAMPLE::reset_match_statistics() {
|
||||
n_samples_matched = 0;
|
||||
total_match_scores = 0.0;
|
||||
sumsq_match_scores = 0.0;
|
||||
}
|
||||
|
||||
|
||||
CHAR_SAMPLES::CHAR_SAMPLES() {
|
||||
type = UNKNOWN;
|
||||
samples.clear ();
|
||||
ch = '\0';
|
||||
best_sample = NULL;
|
||||
proto = NULL;
|
||||
}
|
||||
|
||||
|
||||
CHAR_SAMPLES::CHAR_SAMPLES(CHAR_SAMPLE *sample) {
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
|
||||
ASSERT_HOST (sample->image () != NULL || sample->blob () != NULL);
|
||||
|
||||
if (sample->image () != NULL)
|
||||
type = IMAGE_CLUSTER;
|
||||
else if (sample->blob () != NULL)
|
||||
type = BLOB_CLUSTER;
|
||||
|
||||
samples.clear ();
|
||||
sample_it.add_to_end (sample);
|
||||
if (tessedit_mm_only_match_same_char)
|
||||
ch = sample->character ();
|
||||
else
|
||||
ch = '\0';
|
||||
best_sample = NULL;
|
||||
proto = NULL;
|
||||
}
|
||||
|
||||
|
||||
void CHAR_SAMPLES::add_sample(CHAR_SAMPLE *sample) {
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
|
||||
if (tessedit_use_best_sample || tessedit_cluster_debug)
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ()) {
|
||||
sample_it.data ()->match_sample (sample, TRUE);
|
||||
sample->match_sample (sample_it.data (), TRUE);
|
||||
}
|
||||
|
||||
sample_it.add_to_end (sample);
|
||||
|
||||
if (tessedit_mm_use_prototypes && type == IMAGE_CLUSTER) {
if (samples.length () == tessedit_mm_prototype_min_size)
this->build_prototype ();
else if (samples.length () > tessedit_mm_prototype_min_size)
this->add_sample_to_prototype (sample);
}
|
||||
}
|
||||
|
||||
|
||||
void CHAR_SAMPLES::add_sample_to_prototype(CHAR_SAMPLE *sample) {
|
||||
BOOL8 rebuild = FALSE;
|
||||
INT32 new_xsize = proto->x_size ();
|
||||
INT32 new_ysize = proto->y_size ();
|
||||
INT32 sample_xsize = sample->image ()->get_xsize ();
|
||||
INT32 sample_ysize = sample->image ()->get_ysize ();
|
||||
|
||||
if (sample_xsize > new_xsize) {
|
||||
new_xsize = sample_xsize;
|
||||
rebuild = TRUE;
|
||||
}
|
||||
if (sample_ysize > new_ysize) {
|
||||
new_ysize = sample_ysize;
|
||||
rebuild = TRUE;
|
||||
}
|
||||
|
||||
if (rebuild)
|
||||
proto->enlarge_prototype (new_xsize, new_ysize);
|
||||
|
||||
proto->add_sample (sample);
|
||||
}
|
||||
|
||||
|
||||
void CHAR_SAMPLES::build_prototype() {
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
CHAR_SAMPLE *sample;
|
||||
INT32 proto_xsize = 0;
|
||||
INT32 proto_ysize = 0;
|
||||
|
||||
if (type != IMAGE_CLUSTER
|
||||
|| samples.length () < tessedit_mm_prototype_min_size)
|
||||
return;
|
||||
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ()) {
|
||||
sample = sample_it.data ();
|
||||
if (sample->image ()->get_xsize () > proto_xsize)
|
||||
proto_xsize = sample->image ()->get_xsize ();
|
||||
if (sample->image ()->get_ysize () > proto_ysize)
|
||||
proto_ysize = sample->image ()->get_ysize ();
|
||||
}
|
||||
|
||||
proto = new CHAR_PROTO (proto_xsize, proto_ysize, 0, 0, '\0');
|
||||
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ())
|
||||
this->add_sample_to_prototype (sample_it.data ());
|
||||
|
||||
}
|
||||
|
||||
|
||||
void CHAR_SAMPLES::find_best_sample() {
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
double score;
|
||||
double best_score = MAX_INT32;
|
||||
|
||||
if (ch == '\0' || samples.length () < tessedit_mm_prototype_min_size)
|
||||
return;
|
||||
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ()) {
|
||||
score = sample_it.data ()->mean_score ();
|
||||
if (score < best_score) {
|
||||
best_score = score;
|
||||
best_sample = sample_it.data ();
|
||||
}
|
||||
}
|
||||
#ifndef SECURE_NAMES
|
||||
if (tessedit_cluster_debug) {
|
||||
tprintf ("Best sample for this %c cluster:\n", ch);
|
||||
best_sample->print (debug_fp);
|
||||
}
|
||||
#endif
|
||||
}
|
||||
|
||||
|
||||
float CHAR_SAMPLES::match_score(CHAR_SAMPLE *sample) {
|
||||
if (tessedit_mm_only_match_same_char && sample->character () != ch)
|
||||
return BAD_SCORE;
|
||||
|
||||
if (tessedit_use_best_sample && best_sample != NULL)
|
||||
return best_sample->match_sample (sample, FALSE);
|
||||
else if ((tessedit_mm_use_prototypes
|
||||
|| tessedit_mm_adapt_using_prototypes) && proto != NULL)
|
||||
return proto->match_sample (sample);
|
||||
else
|
||||
return this->nn_match_score (sample);
|
||||
}
|
||||
|
||||
|
||||
float CHAR_SAMPLES::nn_match_score(CHAR_SAMPLE *sample) {
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
float score;
|
||||
float min_score = MAX_INT32;
|
||||
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ()) {
|
||||
score = sample_it.data ()->match_sample (sample, FALSE);
|
||||
if (score < min_score)
|
||||
min_score = score;
|
||||
}
|
||||
|
||||
return min_score;
|
||||
}
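nn_match_score() above is a nearest-neighbour search: it keeps the smallest score over all samples in the cluster. The sketch below shows the same loop on toy data. `toy_distance` is a hypothetical stand-in for match_sample(), with plain floats in place of character samples.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical distance between two scalar "samples"; lower is a better match.
float toy_distance(float a, float b) {
  return std::fabs(a - b);
}

// Nearest-neighbour score: the smallest distance from the candidate to any
// member of the cluster, or the largest float if the cluster is empty.
float nearest_score(const std::vector<float> &cluster, float candidate) {
  float min_score = std::numeric_limits<float>::max();
  for (float member : cluster) {
    float score = toy_distance(member, candidate);
    assert(score >= 0.0f);
    if (score < min_score)
      min_score = score;
  }
  return min_score;
}
```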
|
||||
|
||||
|
||||
void CHAR_SAMPLES::assign_to_char() {
|
||||
STATS char_frequency(FIRST_CHAR, LAST_CHAR);
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
INT32 i;
|
||||
INT32 max_index = 0;
|
||||
INT32 max_freq = 0;
|
||||
|
||||
if (samples.length () == 0 || tessedit_mm_only_match_same_char)
|
||||
return;
|
||||
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ())
|
||||
char_frequency.add ((INT32) sample_it.data ()->character (), 1);
|
||||
|
||||
for (i = FIRST_CHAR; i <= LAST_CHAR; i++)
|
||||
if (char_frequency.pile_count (i) > max_freq) {
|
||||
max_index = i;
|
||||
max_freq = char_frequency.pile_count (i);
|
||||
}
|
||||
|
||||
if (samples.length () >= tessedit_cluster_min_size
|
||||
&& max_freq > samples.length () * tessedit_cluster_accept_fraction)
|
||||
ch = (char) max_index;
|
||||
}
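assign_to_char() above is a thresholded majority vote over the characters of the samples in a cluster. A simplified sketch follows, under stated assumptions: `majority_label` is a hypothetical function, the labels arrive as a string, a `std::map` replaces the STATS histogram, and the function returns '\0' on rejection, whereas the source leaves ch unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Return the most frequent label if the cluster is large enough and that
// label accounts for more than accept_fraction of it; otherwise return '\0'.
char majority_label(const std::string &labels, std::size_t min_size,
                    double accept_fraction) {
  assert(accept_fraction >= 0.0 && accept_fraction <= 1.0);
  std::map<char, std::size_t> freq;
  for (char c : labels)
    ++freq[c];
  char best = '\0';
  std::size_t best_count = 0;
  // The map iterates in ascending character order, so ties go to the
  // lowest character, as with the ascending scan in the source.
  for (const auto &entry : freq) {
    if (entry.second > best_count) {
      best = entry.first;
      best_count = entry.second;
    }
  }
  if (labels.size() >= min_size &&
      best_count > labels.size() * accept_fraction)
    return best;
  return '\0';
}
```

With labels "aaab", a minimum size of 3 and an accept fraction of 0.7, 'a' wins because 3 > 2.8. With "aabb" and a fraction of 0.5, no label is accepted because 2 is not strictly greater than 2.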
|
||||
|
||||
|
||||
void CHAR_SAMPLES::print(FILE *f) {
|
||||
CHAR_SAMPLE_IT sample_it = &samples;
|
||||
|
||||
fprintf (f, "Collected " INT32FORMAT " samples\n", samples.length ());
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (tessedit_cluster_debug)
|
||||
for (sample_it.mark_cycle_pt ();
|
||||
!sample_it.cycled_list (); sample_it.forward ())
|
||||
sample_it.data ()->print (f);
|
||||
|
||||
if (ch == '\0')
|
||||
fprintf (f, "\nCluster not used for adaption\n");
|
||||
else
|
||||
fprintf (f, "\nCluster used to adapt to '%c's\n", ch);
|
||||
#endif
|
||||
}
|
||||
|
||||
|
||||
CHAR_PROTO::CHAR_PROTO() {
|
||||
xsize = 0;
|
||||
ysize = 0;
|
||||
ch = '\0';
|
||||
nsamples = 0;
|
||||
proto_data = NULL;
|
||||
proto = NULL;
|
||||
}
|
||||
|
||||
|
||||
CHAR_PROTO::CHAR_PROTO(INT32 x_size,
|
||||
INT32 y_size,
|
||||
INT32 n_samples,
|
||||
float initial_value,
|
||||
char c) {
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
|
||||
xsize = x_size;
|
||||
ysize = y_size;
|
||||
ch = c;
|
||||
nsamples = n_samples;
|
||||
|
||||
ALLOC_2D_ARRAY(xsize, ysize, proto_data, proto, float);
|
||||
|
||||
for (y = 0; y < ysize; y++)
|
||||
for (x = 0; x < xsize; x++)
|
||||
proto[x][y] = initial_value;
|
||||
}
|
||||
|
||||
|
||||
CHAR_PROTO::CHAR_PROTO(CHAR_SAMPLE *sample) {
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
IMAGELINE imline_s;
|
||||
|
||||
if (sample->image () == NULL) {
|
||||
xsize = 0;
|
||||
ysize = 0;
|
||||
ch = '\0';
|
||||
nsamples = 0;
|
||||
proto_data = NULL;
|
||||
proto = NULL;
|
||||
}
|
||||
else {
|
||||
ch = sample->character ();
|
||||
xsize = sample->image ()->get_xsize ();
|
||||
ysize = sample->image ()->get_ysize ();
|
||||
nsamples = 1;
|
||||
|
||||
ALLOC_2D_ARRAY(xsize, ysize, proto_data, proto, float);
|
||||
|
||||
for (y = 0; y < ysize; y++) {
|
||||
sample->image ()->fast_get_line (0, y, xsize, &imline_s);
|
||||
for (x = 0; x < xsize; x++)
|
||||
if (imline_s.pixels[x] == BINIM_WHITE)
|
||||
proto[x][y] = 1.0;
|
||||
else
|
||||
proto[x][y] = -1.0;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
CHAR_PROTO::~CHAR_PROTO () {
|
||||
if (proto_data != NULL)
|
||||
FREE_2D_ARRAY(proto_data, proto);
|
||||
}
|
||||
|
||||
|
||||
float CHAR_PROTO::match_sample(CHAR_SAMPLE *test_sample) {
|
||||
CHAR_PROTO *test_proto;
|
||||
float score;
|
||||
|
||||
if (test_sample->image () != NULL) {
|
||||
test_proto = new CHAR_PROTO (test_sample);
|
||||
if (xsize > test_proto->x_size ())
|
||||
score = this->match (test_proto);
|
||||
else {
|
||||
demo_word = -demo_word; // Flag different call
|
||||
score = test_proto->match (this);
|
||||
}
|
||||
}
|
||||
else
|
||||
return BAD_SCORE;
|
||||
|
||||
delete test_proto;
|
||||
|
||||
return score;
|
||||
}
|
||||
|
||||
|
||||
float CHAR_PROTO::match(CHAR_PROTO *test_proto) {
|
||||
INT32 xsize2 = test_proto->x_size ();
|
||||
INT32 y_size;
|
||||
INT32 y_size2;
|
||||
INT32 x_offset;
|
||||
INT32 y_offset;
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
CHAR_PROTO *match_proto;
|
||||
float score;
|
||||
float sum = 0.0;
|
||||
|
||||
ASSERT_HOST (xsize >= xsize2);
|
||||
|
||||
x_offset = (xsize - xsize2) / 2;
|
||||
|
||||
if (ysize < test_proto->y_size ()) {
|
||||
y_size = test_proto->y_size ();
|
||||
y_size2 = ysize;
|
||||
y_offset = (y_size - y_size2) / 2;
|
||||
|
||||
match_proto = new CHAR_PROTO (xsize,
|
||||
y_size,
|
||||
nsamples * test_proto->n_samples (),
|
||||
0, '\0');
|
||||
|
||||
for (y = 0; y < y_offset; y++) {
|
||||
for (x = 0; x < xsize2; x++) {
|
||||
match_proto->data ()[x + x_offset][y] =
|
||||
test_proto->data ()[x][y] * nsamples;
|
||||
sum += match_proto->data ()[x + x_offset][y];
|
||||
}
|
||||
}
|
||||
|
||||
for (y = y_offset + y_size2; y < y_size; y++) {
|
||||
for (x = 0; x < xsize2; x++) {
|
||||
match_proto->data ()[x + x_offset][y] =
|
||||
test_proto->data ()[x][y] * nsamples;
|
||||
sum += match_proto->data ()[x + x_offset][y];
|
||||
}
|
||||
}
|
||||
|
||||
for (y = y_offset; y < y_offset + y_size2; y++) {
|
||||
for (x = 0; x < x_offset; x++) {
|
||||
match_proto->data ()[x][y] = proto[x][y - y_offset] *
|
||||
test_proto->n_samples ();
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
|
||||
for (x = x_offset + xsize2; x < xsize; x++) {
|
||||
match_proto->data ()[x][y] = proto[x][y - y_offset] *
|
||||
test_proto->n_samples ();
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
|
||||
for (x = x_offset; x < x_offset + xsize2; x++) {
|
||||
match_proto->data ()[x][y] =
|
||||
proto[x][y - y_offset] * test_proto->data ()[x - x_offset][y];
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
}
|
||||
}
|
||||
else {
|
||||
y_size = ysize;
|
||||
y_size2 = test_proto->y_size ();
|
||||
y_offset = (y_size - y_size2) / 2;
|
||||
|
||||
match_proto = new CHAR_PROTO (xsize,
|
||||
y_size,
|
||||
nsamples * test_proto->n_samples (),
|
||||
0, '\0');
|
||||
|
||||
for (y = 0; y < y_offset; y++)
|
||||
for (x = 0; x < xsize; x++) {
|
||||
match_proto->data ()[x][y] =
|
||||
proto[x][y] * test_proto->n_samples ();
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
|
||||
for (y = y_offset + y_size2; y < y_size; y++)
|
||||
for (x = 0; x < xsize; x++) {
|
||||
match_proto->data ()[x][y] =
|
||||
proto[x][y] * test_proto->n_samples ();
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
|
||||
for (y = y_offset; y < y_offset + y_size2; y++) {
|
||||
for (x = 0; x < x_offset; x++) {
|
||||
match_proto->data ()[x][y] =
|
||||
proto[x][y] * test_proto->n_samples ();
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
|
||||
for (x = x_offset + xsize2; x < xsize; x++) {
|
||||
match_proto->data ()[x][y] =
|
||||
proto[x][y] * test_proto->n_samples ();
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
|
||||
for (x = x_offset; x < x_offset + xsize2; x++) {
|
||||
match_proto->data ()[x][y] = proto[x][y] *
|
||||
test_proto->data ()[x - x_offset][y - y_offset];
|
||||
sum += match_proto->data ()[x][y];
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
score = (1.0 - sum /
|
||||
(xsize * y_size * nsamples * test_proto->n_samples ()));
|
||||
|
||||
if (tessedit_mm_debug) {
|
||||
if (score < 0) {
|
||||
tprintf ("Match score %f\n", score);
|
||||
tprintf ("x: %d, y: %d, ns: %d, nt: %d, dx %d, dy: %d\n",
|
||||
xsize, y_size, nsamples, test_proto->n_samples (),
|
||||
x_offset, y_offset);
|
||||
for (y = 0; y < y_size; y++) {
|
||||
tprintf ("\n%d", y);
|
||||
for (x = 0; x < xsize; x++)
|
||||
tprintf ("\t%f", match_proto->data ()[x][y]);
|
||||
|
||||
}
|
||||
tprintf ("\n");
|
||||
fflush(debug_fp);
|
||||
}
|
||||
}
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
if (tessedit_display_mm) {
|
||||
tprintf ("Match score %f\n", score);
|
||||
display_images (this->make_image (),
|
||||
test_proto->make_image (), match_proto->make_image ());
|
||||
}
|
||||
else if (demo_word != 0) {
|
||||
if (demo_word > 0)
|
||||
display_image (test_proto->make_image (), "Test sample",
|
||||
300, 400, FALSE);
|
||||
else
|
||||
display_image (this->make_image (), "Test sample", 300, 400, FALSE);
|
||||
|
||||
display_image (match_proto->make_image (), "Best match",
|
||||
700, 400, TRUE);
|
||||
}
|
||||
#endif
|
||||
|
||||
delete match_proto;
|
||||
|
||||
return score;
|
||||
}
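The score computed by `match` above is one minus the normalised sum of cell products, with the smaller grid centred on the larger one and every cell outside it treated as white in all test samples. The sketch below covers only the case where the test grid fits inside the prototype in both dimensions (the `else` branch). `Grid` and `match_score` are hypothetical names, and the debug and display code is left out.

```cpp
#include <cassert>
#include <vector>

// Indexed [x][y], like proto[x][y] in the source.
using Grid = std::vector<std::vector<float>>;

// Hypothetical sketch: 0 means identical, 2 means exact inverse.
float match_score(const Grid &proto, int n_proto,
                  const Grid &test, int n_test) {
  assert(!proto.empty() && !proto[0].empty());
  assert(!test.empty() && !test[0].empty());
  assert(n_proto > 0 && n_test > 0);

  const int xsize = static_cast<int>(proto.size());
  const int ysize = static_cast<int>(proto[0].size());
  const int xsize2 = static_cast<int>(test.size());
  const int ysize2 = static_cast<int>(test[0].size());
  assert(xsize >= xsize2 && ysize >= ysize2);

  // Centre the smaller grid on the larger one.
  const int x_offset = (xsize - xsize2) / 2;
  const int y_offset = (ysize - ysize2) / 2;

  double sum = 0.0;
  for (int y = 0; y < ysize; y++) {
    for (int x = 0; x < xsize; x++) {
      const bool inside = x >= x_offset && x < x_offset + xsize2 &&
                          y >= y_offset && y < y_offset + ysize2;
      // Outside the test grid every test sample counts as white (+1 each).
      const double t = inside ? test[x - x_offset][y - y_offset]
                              : static_cast<double>(n_test);
      sum += proto[x][y] * t;
    }
  }
  return static_cast<float>(
      1.0 - sum / (static_cast<double>(xsize) * ysize * n_proto * n_test));
}
```

Under these assumptions a single-sample prototype matched against itself scores 0, and against its pixel-wise inverse scores 2.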
|
||||
|
||||
|
||||
void CHAR_PROTO::enlarge_prototype(INT32 new_xsize, INT32 new_ysize) {
|
||||
float *old_proto_data = proto_data;
|
||||
float **old_proto = proto;
|
||||
INT32 old_xsize = xsize;
|
||||
INT32 old_ysize = ysize;
|
||||
INT32 x_offset;
|
||||
INT32 y_offset;
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
|
||||
ASSERT_HOST (new_xsize >= xsize && new_ysize >= ysize);
|
||||
|
||||
xsize = new_xsize;
|
||||
ysize = new_ysize;
|
||||
ALLOC_2D_ARRAY(xsize, ysize, proto_data, proto, float);
|
||||
x_offset = (xsize - old_xsize) / 2;
|
||||
y_offset = (ysize - old_ysize) / 2;
|
||||
|
||||
for (y = 0; y < y_offset; y++)
|
||||
for (x = 0; x < xsize; x++)
|
||||
proto[x][y] = nsamples;
|
||||
|
||||
for (y = y_offset + old_ysize; y < ysize; y++)
|
||||
for (x = 0; x < xsize; x++)
|
||||
proto[x][y] = nsamples;
|
||||
|
||||
for (y = y_offset; y < y_offset + old_ysize; y++) {
|
||||
for (x = 0; x < x_offset; x++)
|
||||
proto[x][y] = nsamples;
|
||||
|
||||
for (x = x_offset + old_xsize; x < xsize; x++)
|
||||
proto[x][y] = nsamples;
|
||||
|
||||
for (x = x_offset; x < x_offset + old_xsize; x++)
|
||||
proto[x][y] = old_proto[x - x_offset][y - y_offset];
|
||||
}
|
||||
|
||||
FREE_2D_ARRAY(old_proto_data, old_proto);
|
||||
}
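`enlarge_prototype` above pads the old grid symmetrically and fills the new border with the value that means "white in every sample". A compact sketch of the same centring, assuming a vector-based grid instead of the `ALLOC_2D_ARRAY` storage; `Grid`, `enlarge_centered` and `fill` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Indexed [x][y], like proto[x][y] in the source.
using Grid = std::vector<std::vector<float>>;

// Hypothetical sketch: returns a new_x by new_y grid with `old_grid`
// centred in it and every border cell set to `fill`.
Grid enlarge_centered(const Grid &old_grid, int new_x, int new_y, float fill) {
  const int old_x = static_cast<int>(old_grid.size());
  const int old_y = old_x > 0 ? static_cast<int>(old_grid[0].size()) : 0;
  assert(new_x >= old_x && new_y >= old_y);

  // Integer division puts any odd spare column or row on the far side.
  const int x_offset = (new_x - old_x) / 2;
  const int y_offset = (new_y - old_y) / 2;

  Grid out(new_x, std::vector<float>(new_y, fill));
  for (int x = 0; x < old_x; x++)
    for (int y = 0; y < old_y; y++)
      out[x + x_offset][y + y_offset] = old_grid[x][y];
  return out;
}
```

Filling the whole grid first and then copying the old cells gives the same result as the four separate border loops in the source, at the cost of writing the interior twice.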
|
||||
|
||||
|
||||
void CHAR_PROTO::add_sample(CHAR_SAMPLE *sample) {
|
||||
INT32 x_offset;
|
||||
INT32 y_offset;
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
IMAGELINE imline_s;
|
||||
INT32 sample_xsize = sample->image ()->get_xsize ();
|
||||
INT32 sample_ysize = sample->image ()->get_ysize ();
|
||||
|
||||
x_offset = (xsize - sample_xsize) / 2;
|
||||
y_offset = (ysize - sample_ysize) / 2;
|
||||
|
||||
ASSERT_HOST (x_offset >= 0 && y_offset >= 0);
|
||||
|
||||
for (y = 0; y < y_offset; y++)
|
||||
for (x = 0; x < xsize; x++)
|
||||
proto[x][y]++; // Treat pixels outside the
|
||||
// range as white
|
||||
for (y = y_offset + sample_ysize; y < ysize; y++)
|
||||
for (x = 0; x < xsize; x++)
|
||||
proto[x][y]++;
|
||||
|
||||
for (y = y_offset; y < y_offset + sample_ysize; y++) {
|
||||
sample->image ()->fast_get_line (0,
|
||||
y - y_offset, sample_xsize, &imline_s);
|
||||
for (x = x_offset; x < x_offset + sample_xsize; x++) {
|
||||
if (imline_s.pixels[x - x_offset] == BINIM_WHITE)
|
||||
proto[x][y]++;
|
||||
else
|
||||
proto[x][y]--;
|
||||
}
|
||||
|
||||
for (x = 0; x < x_offset; x++)
|
||||
proto[x][y]++;
|
||||
|
||||
for (x = x_offset + sample_xsize; x < xsize; x++)
|
||||
proto[x][y]++;
|
||||
}
|
||||
|
||||
nsamples++;
|
||||
}
|
||||
|
||||
|
||||
IMAGE *CHAR_PROTO::make_image() {
|
||||
IMAGE *image;
|
||||
IMAGELINE imline_p;
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
|
||||
ASSERT_HOST (nsamples != 0);
|
||||
|
||||
image = new (IMAGE);
|
||||
image->create (xsize, ysize, 8);
|
||||
|
||||
for (y = 0; y < ysize; y++) {
|
||||
image->fast_get_line (0, y, xsize, &imline_p);
|
||||
|
||||
for (x = 0; x < xsize; x++) {
|
||||
imline_p.pixels[x] = 128 +
|
||||
(UINT8) ((proto[x][y] * 128.0) / (0.00001 + nsamples));
|
||||
}
|
||||
|
||||
image->fast_put_line (0, y, xsize, &imline_p);
|
||||
}
|
||||
return image;
|
||||
}
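`make_image` above maps an accumulated cell value in the range -nsamples to +nsamples onto an 8-bit grey level centred on 128. The source casts the scaled value to `UINT8` before adding 128, which relies on how a negative floating-point value converts to an unsigned type. The variant below is an alternative sketch, not the original method: it does the arithmetic in `double` and clamps before converting. `proto_to_grey` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch: map value in [-nsamples, +nsamples] to [0, 255],
// with 0 mapping to mid-grey (128).
std::uint8_t proto_to_grey(float value, int nsamples) {
  assert(nsamples > 0);
  double scaled = 128.0 + 128.0 * value / nsamples;  // nominally in [0, 256]
  scaled = std::min(255.0, std::max(0.0, scaled));   // clamp to the 8-bit range
  return static_cast<std::uint8_t>(scaled);
}
```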
|
1668
ccmain/control.cpp
Normal file
File diff suppressed because it is too large
193
ccmain/control.h
Normal file
@ -0,0 +1,193 @@
|
||||
/**********************************************************************
|
||||
* File: control.h (Formerly control.h)
|
||||
* Description: Module-independent matcher controller.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 11:09:58 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef CONTROL_H
|
||||
#define CONTROL_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "ocrblock.h"
|
||||
//#include "epapdest.h"
|
||||
#include "ratngs.h"
|
||||
#include "statistc.h"
|
||||
//#include "epapconv.h"
|
||||
#include "ocrshell.h"
|
||||
#include "pageres.h"
|
||||
#include "charsample.h"
|
||||
#include "notdll.h"
|
||||
|
||||
enum ACCEPTABLE_WERD_TYPE
|
||||
{
|
||||
AC_UNACCEPTABLE, //Unacceptable word
|
||||
AC_LOWER_CASE, //ALL lower case
|
||||
AC_UPPER_CASE, //ALL upper case
|
||||
AC_INITIAL_CAP, //ALL but initial lc
|
||||
AC_LC_ABBREV, //a.b.c.
|
||||
AC_UC_ABBREV //A.B.C.
|
||||
};
|
||||
|
||||
typedef BOOL8 (*BLOB_REJECTOR) (PBLOB *, BLOB_CHOICE_IT *, void *);
|
||||
|
||||
extern INT_VAR_H (tessedit_single_match, FALSE, "Top choice only from CP");
|
||||
//extern BOOL_VAR_H(tessedit_small_match,FALSE,"Use small matrix matcher");
|
||||
extern BOOL_VAR_H (tessedit_print_text, FALSE, "Write text to stdout");
|
||||
extern BOOL_VAR_H (tessedit_draw_words, FALSE, "Draw source words");
|
||||
extern BOOL_VAR_H (tessedit_draw_outwords, FALSE, "Draw output words");
|
||||
extern BOOL_VAR_H (tessedit_training_wiseowl, FALSE,
|
||||
"Call WO to learn blobs");
|
||||
extern BOOL_VAR_H (tessedit_training_tess, FALSE, "Call Tess to learn blobs");
|
||||
extern BOOL_VAR_H (tessedit_matcher_is_wiseowl, FALSE, "Call WO to classify");
|
||||
extern BOOL_VAR_H (tessedit_dump_choices, FALSE, "Dump char choices");
|
||||
extern BOOL_VAR_H (tessedit_fix_fuzzy_spaces, TRUE,
|
||||
"Try to improve fuzzy spaces");
|
||||
extern BOOL_VAR_H (tessedit_unrej_any_wd, FALSE,
|
||||
"Dont bother with word plausibility");
|
||||
extern BOOL_VAR_H (tessedit_fix_hyphens, TRUE, "Crunch double hyphens?");
|
||||
extern BOOL_VAR_H (tessedit_reject_fullstops, FALSE, "Reject all fullstops");
|
||||
extern BOOL_VAR_H (tessedit_reject_suspect_fullstops, FALSE,
|
||||
"Reject suspect fullstops");
|
||||
extern BOOL_VAR_H (tessedit_redo_xheight, TRUE, "Check/Correct x-height");
|
||||
extern BOOL_VAR_H (tessedit_cluster_adaption_on, TRUE,
|
||||
"Do our own adaption - ems only");
|
||||
extern BOOL_VAR_H (tessedit_enable_doc_dict, TRUE,
|
||||
"Add words to the document dictionary");
|
||||
extern BOOL_VAR_H (word_occ_first, FALSE, "Do word occ before re-est xht");
|
||||
extern BOOL_VAR_H (tessedit_xht_fiddles_on_done_wds, TRUE,
|
||||
"Apply xht fix up even if done");
|
||||
extern BOOL_VAR_H (tessedit_xht_fiddles_on_no_rej_wds, TRUE,
|
||||
"Apply xht fix up even in no rejects");
|
||||
extern INT_VAR_H (x_ht_check_word_occ, 2, "Check Char Block occupancy");
|
||||
extern INT_VAR_H (x_ht_stringency, 1, "How many confirmed a/n to accept?");
|
||||
extern BOOL_VAR_H (x_ht_quality_check, TRUE, "Dont allow worse quality");
|
||||
extern BOOL_VAR_H (tessedit_debug_block_rejection, FALSE,
|
||||
"Block and Row stats");
|
||||
extern INT_VAR_H (debug_x_ht_level, 0, "Reestimate debug");
|
||||
extern BOOL_VAR_H (rej_use_xht, TRUE, "Individual rejection control");
|
||||
extern BOOL_VAR_H (debug_acceptable_wds, FALSE, "Dump word pass/fail chk");
|
||||
extern STRING_VAR_H (chs_leading_punct, "('`\"", "Leading punctuation");
|
||||
extern
|
||||
STRING_VAR_H (chs_trailing_punct1, ").,;:?!", "1st Trailing punctuation");
|
||||
extern STRING_VAR_H (chs_trailing_punct2, ")'`\"",
|
||||
"2nd Trailing punctuation");
|
||||
extern double_VAR_H (quality_rej_pc, 0.08,
|
||||
"good_quality_doc lte rejection limit");
|
||||
extern double_VAR_H (quality_blob_pc, 0.0,
|
||||
"good_quality_doc gte good blobs limit");
|
||||
extern double_VAR_H (quality_outline_pc, 1.0,
|
||||
"good_quality_doc lte outline error limit");
|
||||
extern double_VAR_H (quality_char_pc, 0.95,
|
||||
"good_quality_doc gte good char limit");
|
||||
extern INT_VAR_H (quality_min_initial_alphas_reqd, 2,
|
||||
"alphas in a good word");
|
||||
extern BOOL_VAR_H (tessedit_tess_adapt_to_rejmap, FALSE,
|
||||
"Use reject map to control Tesseract adaption");
|
||||
extern INT_VAR_H (tessedit_tess_adaption_mode, 3,
|
||||
"Adaptation decision algorithm for tess");
|
||||
extern INT_VAR_H (tessedit_em_adaption_mode, 62,
|
||||
"Adaptation decision algorithm for ems matrix matcher");
|
||||
extern BOOL_VAR_H (tessedit_cluster_adapt_after_pass1, FALSE,
|
||||
"Adapt using clusterer after pass 1");
|
||||
extern BOOL_VAR_H (tessedit_cluster_adapt_after_pass2, FALSE,
"Adapt using clusterer after pass 2");
extern BOOL_VAR_H (tessedit_cluster_adapt_after_pass3, FALSE,
"Adapt using clusterer after pass 3");
extern BOOL_VAR_H (tessedit_cluster_adapt_before_pass1, FALSE,
"Adapt using clusterer before Tess adapting during pass 1");
|
||||
extern INT_VAR_H (tessedit_cluster_adaption_mode, 0,
|
||||
"Adaptation decision algorithm for matrix matcher");
|
||||
extern BOOL_VAR_H (tessedit_adaption_debug, FALSE,
|
||||
"Generate and print debug information for adaption");
|
||||
extern BOOL_VAR_H (tessedit_minimal_rej_pass1, FALSE,
|
||||
"Do minimal rejection on pass 1 output");
|
||||
extern BOOL_VAR_H (tessedit_test_adaption, FALSE,
|
||||
"Test adaption criteria");
|
||||
extern BOOL_VAR_H (tessedit_global_adaption, FALSE,
|
||||
"Adapt to all docs over time");
|
||||
extern BOOL_VAR_H (tessedit_matcher_log, FALSE, "Log matcher activity");
|
||||
extern INT_VAR_H (tessedit_test_adaption_mode, 3,
|
||||
"Adaptation decision algorithm for tess");
|
||||
extern BOOL_VAR_H (test_pt, FALSE, "Test for point");
|
||||
extern double_VAR_H (test_pt_x, 99999.99, "xcoord");
|
||||
extern double_VAR_H (test_pt_y, 99999.99, "ycoord");
|
||||
void recog_pseudo_word( //recognize blobs
|
||||
BLOCK_LIST *block_list, //blocks to check
|
||||
BOX &selection_box);
|
||||
BOOL8 recog_interactive( //recognize blobs
|
||||
BLOCK *, //block
|
||||
ROW *row, //row of word
|
||||
WERD *word //word to recognize
|
||||
);
|
||||
void recog_all_words( //process words
|
||||
PAGE_RES *page_res, //page structure
|
||||
volatile ETEXT_DESC *monitor //progress monitor
|
||||
);
|
||||
void classify_word_pass1( //recog one word
|
||||
WERD_RES *word, //word to do
|
||||
ROW *row,
|
||||
BOOL8 cluster_adapt,
|
||||
CHAR_SAMPLES_LIST *char_clusters,
|
||||
CHAR_SAMPLE_LIST *chars_waiting);
|
||||
//word to do
|
||||
void classify_word_pass2(WERD_RES *word, ROW *row);
|
||||
void match_word_pass2( //recog one word
|
||||
WERD_RES *word, //word to do
|
||||
ROW *row,
|
||||
float x_height);
|
||||
void fix_rep_char( //Repeated char word
|
||||
WERD_RES *word //word to do
|
||||
);
|
||||
void fix_quotes( //make double quotes
|
||||
char *string, //string to fix
|
||||
WERD *word, //word to do //char choices
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices);
|
||||
void fix_hyphens( //crunch double hyphens
|
||||
char *string, //string to fix
|
||||
WERD *word, //word to do //char choices
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices);
|
||||
void merge_blobs( //combine 2 blobs
|
||||
PBLOB *blob1, //dest blob
|
||||
PBLOB *blob2 //source blob
|
||||
);
|
||||
void choice_dump_tester( //dump chars in word
|
||||
PBLOB *, //blob
|
||||
DENORM *, //de-normaliser
|
||||
BOOL8 correct, //ly segmented
|
||||
char *text, //correct text
|
||||
INT32 count, //chars in text
|
||||
BLOB_CHOICE_LIST *ratings //list of results
|
||||
);
|
||||
WERD *make_bln_copy(WERD *src_word, ROW *row, float x_height, DENORM *denorm);
|
||||
ACCEPTABLE_WERD_TYPE acceptable_word_string(const char *s);
|
||||
BOOL8 check_debug_pt(WERD_RES *word, int location);
|
||||
void set_word_fonts( //good chars in word
|
||||
WERD_RES *word, //word to adapt to //detailed results
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices);
|
||||
void font_recognition_pass( //good chars in word
|
||||
PAGE_RES_IT &page_res_it);
|
||||
void add_in_one_row( //good chars in word
|
||||
ROW_RES *row, //current row
|
||||
STATS *fonts, //font stats
|
||||
INT8 *italic, //output count
|
||||
INT8 *bold //output count
|
||||
);
|
||||
void find_modal_font( //good chars in word
|
||||
STATS *fonts, //font stats
|
||||
INT8 *font_out, //output font
|
||||
INT8 *font_count //output count
|
||||
);
|
||||
#endif
|
1453
ccmain/docqual.cpp
Normal file
File diff suppressed because it is too large
155
ccmain/docqual.h
Normal file
@ -0,0 +1,155 @@
|
||||
/******************************************************************
|
||||
* File: docqual.h (Formerly docqual.h)
|
||||
* Description: Document Quality Metrics
|
||||
* Author: Phil Cheatle
|
||||
* Created: Mon May 9 11:27:28 BST 1994
|
||||
*
|
||||
* (C) Copyright 1994, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef DOCQUAL_H
|
||||
#define DOCQUAL_H
|
||||
|
||||
#include "control.h"
|
||||
#include "notdll.h"
|
||||
|
||||
enum GARBAGE_LEVEL
|
||||
{
|
||||
G_NEVER_CRUNCH,
|
||||
G_OK,
|
||||
G_DODGY,
|
||||
G_TERRIBLE
|
||||
};
|
||||
|
||||
extern STRING_VAR_H (outlines_odd, "%| ", "Non standard number of outlines");
|
||||
extern STRING_VAR_H (outlines_2, "ij!?%\":;",
|
||||
"Non standard number of outlines");
|
||||
extern BOOL_VAR_H (docqual_excuse_outline_errs, FALSE,
|
||||
"Allow outline errs in unrejection?");
|
||||
extern BOOL_VAR_H (tessedit_good_quality_unrej, TRUE,
|
||||
"Reduce rejection on good docs");
|
||||
extern BOOL_VAR_H (tessedit_use_reject_spaces, TRUE, "Reject spaces?");
|
||||
extern double_VAR_H (tessedit_reject_doc_percent, 65.00,
|
||||
"%rej allowed before rej whole doc");
|
||||
extern double_VAR_H (tessedit_reject_block_percent, 45.00,
|
||||
"%rej allowed before rej whole block");
|
||||
extern double_VAR_H (tessedit_reject_row_percent, 40.00,
|
||||
"%rej allowed before rej whole row");
|
||||
extern double_VAR_H (tessedit_whole_wd_rej_row_percent, 70.00,
|
||||
"%of row rejects in whole word rejects which prevents whole row rejection");
|
||||
extern BOOL_VAR_H (tessedit_preserve_blk_rej_perfect_wds, TRUE,
|
||||
"Only rej partially rejected words in block rejection");
|
||||
extern BOOL_VAR_H (tessedit_preserve_row_rej_perfect_wds, TRUE,
|
||||
"Only rej partially rejected words in row rejection");
|
||||
extern BOOL_VAR_H (tessedit_dont_blkrej_good_wds, FALSE,
|
||||
"Use word segmentation quality metric");
|
||||
extern BOOL_VAR_H (tessedit_dont_rowrej_good_wds, FALSE,
|
||||
"Use word segmentation quality metric");
|
||||
extern INT_VAR_H (tessedit_preserve_min_wd_len, 2,
|
||||
"Only preserve wds longer than this");
|
||||
extern BOOL_VAR_H (tessedit_row_rej_good_docs, TRUE,
|
||||
"Apply row rejection to good docs");
|
||||
extern double_VAR_H (tessedit_good_doc_still_rowrej_wd, 1.1,
|
||||
"rej good doc wd if more than this fraction rejected");
|
||||
extern BOOL_VAR_H (tessedit_reject_bad_qual_wds, TRUE,
|
||||
"Reject all bad quality wds");
|
||||
extern BOOL_VAR_H (tessedit_debug_doc_rejection, FALSE, "Page stats");
|
||||
extern BOOL_VAR_H (tessedit_debug_quality_metrics, FALSE,
|
||||
"Output data to debug file");
|
||||
extern BOOL_VAR_H (bland_unrej, FALSE, "unrej potential with no checks");
|
||||
extern double_VAR_H (quality_rowrej_pc, 1.1,
|
||||
"good_quality_doc gte good char limit");
|
||||
extern BOOL_VAR_H (unlv_tilde_crunching, TRUE,
|
||||
"Mark v.bad words for tilde crunch");
|
||||
extern BOOL_VAR_H (crunch_early_merge_tess_fails, TRUE,
|
||||
"Before word crunch?");
|
||||
extern BOOL_VAR_H (crunch_early_convert_bad_unlv_chs, FALSE,
|
||||
"Take out ~^ early?");
|
||||
extern double_VAR_H (crunch_terrible_rating, 80.0, "crunch rating lt this");
|
||||
extern BOOL_VAR_H (crunch_terrible_garbage, TRUE, "As it says");
|
||||
extern double_VAR_H (crunch_poor_garbage_cert, -9.0,
|
||||
"crunch garbage cert lt this");
|
||||
extern double_VAR_H (crunch_poor_garbage_rate, 60,
|
||||
"crunch garbage rating lt this");
|
||||
extern double_VAR_H (crunch_pot_poor_rate, 40,
|
||||
"POTENTIAL crunch rating lt this");
|
||||
extern double_VAR_H (crunch_pot_poor_cert, -8.0,
|
||||
"POTENTIAL crunch cert lt this");
|
||||
extern BOOL_VAR_H (crunch_pot_garbage, TRUE, "POTENTIAL crunch garbage");
|
||||
extern double_VAR_H (crunch_del_rating, 60,
|
||||
"POTENTIAL crunch rating lt this");
|
||||
extern double_VAR_H (crunch_del_cert, -10.0, "POTENTIAL crunch cert lt this");
|
||||
extern double_VAR_H (crunch_del_min_ht, 0.7, "Del if word ht lt xht x this");
|
||||
extern double_VAR_H (crunch_del_max_ht, 3.0, "Del if word ht gt xht x this");
|
||||
extern double_VAR_H (crunch_del_min_width, 3.0,
|
||||
"Del if word width lt xht x this");
|
||||
extern double_VAR_H (crunch_del_high_word, 1.5,
|
||||
"Del if word gt xht x this above bl");
|
||||
extern double_VAR_H (crunch_del_low_word, 0.5,
|
||||
"Del if word gt xht x this below bl");
|
||||
extern double_VAR_H (crunch_small_outlines_size, 0.6,
|
||||
"Small if lt xht x this");
|
||||
extern INT_VAR_H (crunch_rating_max, 10, "For adj length in rating per ch");
|
||||
extern INT_VAR_H (crunch_pot_indicators, 1,
|
||||
"How many potential indicators needed");
|
||||
extern BOOL_VAR_H (crunch_leave_ok_strings, TRUE,
|
||||
"Dont touch sensible strings");
|
||||
extern BOOL_VAR_H (crunch_accept_ok, TRUE, "Use acceptability in okstring");
|
||||
extern BOOL_VAR_H (crunch_leave_accept_strings, FALSE,
|
||||
"Dont pot crunch sensible strings");
|
||||
extern BOOL_VAR_H (crunch_include_numerals, FALSE, "Fiddle alpha figures");
|
||||
extern INT_VAR_H (crunch_leave_lc_strings, 4,
|
||||
"Dont crunch words with long lower case strings");
|
||||
extern INT_VAR_H (crunch_leave_uc_strings, 4,
"Dont crunch words with long upper case strings");
|
||||
extern INT_VAR_H (crunch_long_repetitions, 3,
|
||||
"Crunch words with long repetitions");
|
||||
extern INT_VAR_H (crunch_debug, 0, "As it says");
|
||||
INT16 word_blob_quality( //Blob seg changes
|
||||
WERD_RES *word,
|
||||
ROW *row);
|
||||
BOOL8 crude_match_blobs(PBLOB *blob1, PBLOB *blob2);
|
||||
INT16 word_outline_errs( //Outline count errs
|
||||
WERD_RES *word);
|
||||
void word_char_quality( //Blob seg changes
|
||||
WERD_RES *word,
|
||||
ROW *row,
|
||||
INT16 *match_count,
|
||||
INT16 *accepted_match_count);
|
||||
void unrej_good_chs(WERD_RES *word, ROW *row);
|
||||
void print_boxes(WERD *word);
|
||||
INT16 count_outline_errs(char c, INT16 outline_count);
|
||||
void quality_based_rejection(PAGE_RES_IT &page_res_it, BOOL8 good_quality_doc);
|
||||
void unrej_good_quality_words( //unreject potential
|
||||
PAGE_RES_IT &page_res_it);
|
||||
void doc_and_block_rejection( //reject big chunks
|
||||
PAGE_RES_IT &page_res_it,
|
||||
BOOL8 good_quality_doc);
|
||||
void reject_whole_page(PAGE_RES_IT &page_res_it);
|
||||
void tilde_crunch(PAGE_RES_IT &page_res_it);
|
||||
BOOL8 terrible_word_crunch(WERD_RES *word, GARBAGE_LEVEL garbage_level);
|
||||
BOOL8 potential_word_crunch(WERD_RES *word,
|
||||
GARBAGE_LEVEL garbage_level,
|
||||
BOOL8 ok_dict_word);
|
||||
void tilde_delete(PAGE_RES_IT &page_res_it);
|
||||
//word to do
|
||||
void convert_bad_unlv_chs(WERD_RES *word_res);
|
||||
//word to do
|
||||
void merge_tess_fails(WERD_RES *word_res);
|
||||
GARBAGE_LEVEL garbage_word(WERD_RES *word, BOOL8 ok_dict_word);
|
||||
CRUNCH_MODE word_deletable(WERD_RES *word, INT16 &delete_mode);
|
||||
INT16 failure_count(WERD_RES *word);
|
||||
BOOL8 noise_outlines(WERD *word);
|
||||
//word to do
|
||||
void insert_rej_cblobs(WERD_RES *word);
|
||||
#endif
|
82
ccmain/expandblob.cpp
Normal file
@ -0,0 +1,82 @@
|
||||
/**************************************************************************
|
||||
* Revision 5.1 89/07/27 11:46:53 11:46:53 ray ()
|
||||
* (C) Copyright 1989, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**************************************************************************/
|
||||
#include "mfcpch.h"
|
||||
#include "expandblob.h"
|
||||
#include "tessclas.h"
|
||||
#include "const.h"
|
||||
#include "structures.h"
|
||||
#include "freelist.h"
|
||||
|
||||
/***********************************************************************
free_blob(blob) frees the blob and everything it is connected to,
i.e. outlines, nodes, edgepts, bytevecs, ratings etc
*************************************************************************/
void free_blob(                        /*blob to free */
               register TBLOB *blob) {
  if (blob == NULL)
    return;                            /*duff blob */
  free_tree (blob->outlines);          /*do the tree of outlines */
  oldblob(blob);                       /*free the actual blob */
}
|
||||
|
||||
|
||||
/***************************************************************************
free_tree(outline) frees the current outline, its siblings
and its sub-tree
*****************************************************************************/
void free_tree(                        /*outline to free */
               register TESSLINE *outline) {
  if (outline == NULL)
    return;                            /*duff outline */
  if (outline->next != NULL)
    free_tree (outline->next);         /*siblings first */
  if (outline->child != NULL)
    free_tree (outline->child);        /*and sub-tree */
  free_outline(outline);               /*free the outline */
}
|
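`free_tree` above recurses on both `next` and `child`, so a long chain of sibling outlines costs one stack frame per sibling. A possible variation, shown here only as a sketch, walks the siblings in a loop and recurses into children alone. `Outline` is a hypothetical stand-in for `TESSLINE` with just the two link fields, and plain `delete` replaces `free_outline`.

```cpp
// Hypothetical stand-in for TESSLINE: only the two links matter here.
struct Outline {
  Outline *next;   // next sibling at the same level
  Outline *child;  // first outline nested inside this one
};

// Frees the outline, its siblings and all their sub-trees.
// Returns the number of nodes freed, to make the sketch easy to check.
int free_outline_tree(Outline *outline) {
  int freed = 0;
  while (outline != nullptr) {
    Outline *next = outline->next;               // save the link before freeing
    freed += free_outline_tree(outline->child);  // recursion depth = nesting depth
    delete outline;
    ++freed;
    outline = next;
  }
  return freed;
}
```

The order of destruction differs from the source (siblings are freed front to back rather than back to front), which only matters if freeing one outline depends on another still existing.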
||||
|
||||
|
||||
/*******************************************************************************
free_outline(outline) frees an outline and anything connected to it
*********************************************************************************/
void free_outline(                     /*outline to free */
                  register TESSLINE *outline) {
  if (outline->compactloop != NULL)
                                       /*free the compact loop */
    memfree (outline->compactloop);

  if (outline->loop != NULL)
    free_loop (outline->loop);

  oldoutline(outline);
}
|
||||
|
||||
|
||||
/*********************************************************************************
free_loop(startpt) frees all the elements of the closed loop
starting at startpt
***********************************************************************************/
void free_loop(                        /*loop to free */
               register EDGEPT *startpt) {
  register EDGEPT *edgept;             /*current point */

  if (startpt == NULL)
    return;
  edgept = startpt;
  do {
    edgept = oldedgept (edgept);       /*free it and move on */
  }
  while (edgept != startpt);
}
|
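`free_loop` above walks a closed ring of edge points and stops when it returns to the starting pointer, which by then has already been freed. One way to avoid comparing against a freed pointer is to cut the ring open first and then free it as an ordinary null-terminated list. The sketch below assumes a bare `Node` type and plain `new`/`delete`; `Node`, `make_ring` and `free_ring` are hypothetical names, not part of the source.

```cpp
#include <cassert>

// Hypothetical stand-in for EDGEPT: only the forward link matters here.
struct Node {
  Node *next;
};

// Builds a closed ring of n nodes and returns its first node.
Node *make_ring(int n) {
  assert(n > 0);
  Node *first = new Node{nullptr};
  Node *last = first;
  for (int i = 1; i < n; ++i) {
    last->next = new Node{nullptr};
    last = last->next;
  }
  last->next = first;  // close the loop
  return first;
}

// Frees every node of the ring; returns how many nodes were freed.
int free_ring(Node *start) {
  if (start == nullptr)
    return 0;
  Node *node = start->next;  // second node, or start itself in a 1-node ring
  start->next = nullptr;     // cut the ring so the walk ends at start
  int freed = 0;
  while (node != nullptr) {
    Node *next = node->next;
    delete node;
    ++freed;
    node = next;
  }
  return freed;
}
```

The start node is freed last, because the cut leaves it as the tail of the opened list.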
13
ccmain/expandblob.h
Normal file
@ -0,0 +1,13 @@
|
||||
#ifndef EXPANDBLOB_H
#define EXPANDBLOB_H

#include "tessclas.h"

void free_blob(register TBLOB *blob);

void free_tree(register TESSLINE *outline);

void free_outline(register TESSLINE *outline);

void free_loop(register EDGEPT *startpt);
#endif
|
974
ccmain/fixspace.cpp
Normal file
@ -0,0 +1,974 @@
|
||||
/******************************************************************
|
||||
* File: fixspace.cpp (Formerly fixspace.c)
|
||||
* Description: Implements a pass over the page res, exploring the alternative
|
||||
* spacing possibilities, trying to use context to improve the
|
||||
word spacing
|
||||
* Author: Phil Cheatle
|
||||
* Created: Thu Oct 21 11:38:43 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <ctype.h>
|
||||
#include "reject.h"
|
||||
#include "statistc.h"
|
||||
#include "genblob.h"
|
||||
#include "control.h"
|
||||
#include "fixspace.h"
|
||||
#include "tessvars.h"
|
||||
#include "tessbox.h"
|
||||
#include "secname.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN BOOL_VAR (fixsp_check_for_fp_noise_space, TRUE,
|
||||
"Try turning noise to space in fixed pitch");
|
||||
EXTERN BOOL_VAR (fixsp_fp_eval, TRUE, "Use alternate evaluation for fp");
|
||||
EXTERN BOOL_VAR (fixsp_noise_score_fixing, TRUE, "More sophisticated?");
|
||||
EXTERN INT_VAR (fixsp_non_noise_limit, 1,
|
||||
"How many non-noise blbs either side?");
|
||||
EXTERN double_VAR (fixsp_small_outlines_size, 0.28, "Small if lt xht x this");
|
||||
|
||||
EXTERN BOOL_VAR (fixsp_ignore_punct, TRUE, "In uniform spacing calc");
|
||||
EXTERN BOOL_VAR (fixsp_numeric_fix, TRUE, "Try to deal with numeric punct");
|
||||
EXTERN BOOL_VAR (fixsp_prefer_joined_1s, TRUE, "Arbitrary boost");
|
||||
EXTERN BOOL_VAR (tessedit_test_uniform_wd_spacing, FALSE,
|
||||
"Limit context word spacing");
|
||||
EXTERN BOOL_VAR (tessedit_prefer_joined_punct, FALSE,
|
||||
"Reward punctuation joins");
|
||||
EXTERN INT_VAR (fixsp_done_mode, 1, "What constitutes done for spacing");
|
||||
EXTERN INT_VAR (debug_fix_space_level, 0, "Contextual fixspace debug");
|
||||
EXTERN STRING_VAR (numeric_punctuation, ".,",
|
||||
"Punct. chs expected WITHIN numbers");
|
||||
|
||||
#define PERFECT_WERDS 999
|
||||
#define MAXSPACING 128 /*max expected spacing in pix */
|
||||
|
||||
/*************************************************************************
|
||||
* fix_fuzzy_spaces()
|
||||
* Walk over the page finding sequences of words joined by fuzzy spaces. Extract
|
||||
* them as a sublist, process the sublist to find the optimal arrangement of
|
||||
* spaces then replace the sublist in the ROW_RES.
|
||||
*************************************************************************/
|
||||
|
||||
void fix_fuzzy_spaces( //find fuzzy words
|
||||
volatile ETEXT_DESC *monitor, //progress monitor
|
||||
INT32 word_count, //count of words in doc
|
||||
PAGE_RES *page_res) {
|
||||
BLOCK_RES_IT block_res_it; //iterators
|
||||
ROW_RES_IT row_res_it;
|
||||
WERD_RES_IT word_res_it_from;
|
||||
WERD_RES_IT word_res_it_to;
|
||||
WERD_RES *word_res;
|
||||
WERD_RES_LIST fuzzy_space_words;
|
||||
INT16 new_length;
|
||||
BOOL8 prevent_null_wd_fixsp; //DONT process blobless wds
|
||||
INT32 word_index; //current word
|
||||
|
||||
block_res_it.set_to_list (&page_res->block_res_list);
|
||||
word_index = 0;
|
||||
for (block_res_it.mark_cycle_pt ();
|
||||
!block_res_it.cycled_list (); block_res_it.forward ()) {
|
||||
row_res_it.set_to_list (&block_res_it.data ()->row_res_list);
|
||||
for (row_res_it.mark_cycle_pt ();
|
||||
!row_res_it.cycled_list (); row_res_it.forward ()) {
|
||||
word_res_it_from.set_to_list (&row_res_it.data ()->word_res_list);
|
||||
while (!word_res_it_from.at_last ()) {
|
||||
word_res = word_res_it_from.data ();
|
||||
while (!word_res_it_from.at_last () &&
|
||||
!(word_res->combination ||
|
||||
word_res_it_from.data_relative (1)->
|
||||
word->flag (W_FUZZY_NON) ||
|
||||
word_res_it_from.data_relative (1)->
|
||||
word->flag (W_FUZZY_SP))) {
|
||||
fix_sp_fp_word (word_res_it_from, row_res_it.data ()->row);
|
||||
word_res = word_res_it_from.forward ();
|
||||
word_index++;
|
||||
if (monitor != NULL) {
|
||||
monitor->ocr_alive = TRUE;
|
||||
monitor->progress = 90 + 5 * word_index / word_count;
|
||||
}
|
||||
}
|
||||
|
||||
if (!word_res_it_from.at_last ()) {
|
||||
word_res_it_to = word_res_it_from;
|
||||
prevent_null_wd_fixsp =
|
||||
word_res->word->gblob_list ()->empty ();
|
||||
if (check_debug_pt (word_res, 60))
|
||||
debug_fix_space_level.set_value (10);
|
||||
word_res_it_to.forward ();
|
||||
word_index++;
|
||||
if (monitor != NULL) {
|
||||
monitor->ocr_alive = TRUE;
|
||||
monitor->progress = 90 + 5 * word_index / word_count;
|
||||
}
|
||||
while (!word_res_it_to.at_last () &&
|
||||
(word_res_it_to.data_relative (1)->
|
||||
word->flag (W_FUZZY_NON) ||
|
||||
word_res_it_to.data_relative (1)->
|
||||
word->flag (W_FUZZY_SP))) {
|
||||
if (check_debug_pt (word_res, 60))
|
||||
debug_fix_space_level.set_value (10);
|
||||
if (word_res->word->gblob_list ()->empty ())
|
||||
prevent_null_wd_fixsp = TRUE;
|
||||
word_res = word_res_it_to.forward ();
|
||||
}
|
||||
if (check_debug_pt (word_res, 60))
|
||||
debug_fix_space_level.set_value (10);
|
||||
if (word_res->word->gblob_list ()->empty ())
|
||||
prevent_null_wd_fixsp = TRUE;
|
||||
if (prevent_null_wd_fixsp)
|
||||
word_res_it_from = word_res_it_to;
|
||||
else {
|
||||
fuzzy_space_words.assign_to_sublist (&word_res_it_from,
|
||||
&word_res_it_to);
|
||||
fix_fuzzy_space_list (fuzzy_space_words,
|
||||
row_res_it.data ()->row);
|
||||
new_length = fuzzy_space_words.length ();
|
||||
word_res_it_from.add_list_before (&fuzzy_space_words);
|
||||
for (;
|
||||
(!word_res_it_from.at_last () &&
|
||||
(new_length > 0)); new_length--) {
|
||||
word_res_it_from.forward ();
|
||||
}
|
||||
}
|
||||
if (test_pt)
|
||||
debug_fix_space_level.set_value (0);
|
||||
}
|
||||
fix_sp_fp_word (word_res_it_from, row_res_it.data ()->row);
|
||||
//Last word in row
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void fix_fuzzy_space_list( //space explorer
|
||||
WERD_RES_LIST &best_perm,
|
||||
ROW *row) {
|
||||
INT16 best_score;
|
||||
WERD_RES_LIST current_perm;
|
||||
INT16 current_score;
|
||||
BOOL8 improved = FALSE;
|
||||
|
||||
//default score
|
||||
best_score = eval_word_spacing (best_perm);
|
||||
|
||||
dump_words (best_perm, best_score, 1, improved);
|
||||
|
||||
if (best_score != PERFECT_WERDS)
|
||||
initialise_search(best_perm, current_perm);
|
||||
|
||||
while ((best_score != PERFECT_WERDS) && !current_perm.empty ()) {
|
||||
match_current_words(current_perm, row);
|
||||
current_score = eval_word_spacing (current_perm);
|
||||
dump_words (current_perm, current_score, 2, improved);
|
||||
if (current_score > best_score) {
|
||||
best_perm.clear ();
|
||||
best_perm.deep_copy (&current_perm);
|
||||
best_score = current_score;
|
||||
improved = TRUE;
|
||||
}
|
||||
if (current_score < PERFECT_WERDS)
|
||||
transform_to_next_perm(current_perm);
|
||||
}
|
||||
dump_words (best_perm, best_score, 3, improved);
|
||||
}
|
||||
|
||||
|
||||
void initialise_search(WERD_RES_LIST &src_list, WERD_RES_LIST &new_list) {
|
||||
WERD_RES_IT src_it(&src_list);
|
||||
WERD_RES_IT new_it(&new_list);
|
||||
WERD_RES *src_wd;
|
||||
WERD_RES *new_wd;
|
||||
|
||||
for (src_it.mark_cycle_pt (); !src_it.cycled_list (); src_it.forward ()) {
|
||||
src_wd = src_it.data ();
|
||||
if (!src_wd->combination) {
|
||||
new_wd = new WERD_RES (*src_wd);
|
||||
new_wd->combination = FALSE;
|
||||
new_wd->part_of_combo = FALSE;
|
||||
new_it.add_after_then_move (new_wd);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void match_current_words(WERD_RES_LIST &words, ROW *row) {
|
||||
WERD_RES_IT word_it(&words);
|
||||
WERD_RES *word;
|
||||
|
||||
for (word_it.mark_cycle_pt (); !word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
if ((!word->part_of_combo) && (word->outword == NULL))
|
||||
classify_word_pass2(word, row);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* eval_word_spacing()
|
||||
* The basic measure is the number of characters in contextually confirmed
|
||||
* words. (I.e. the word is done.)
|
||||
* If all words are contextually confirmed the evaluation is deemed perfect.
|
||||
*
|
||||
* Some fiddles are done to handle "1"s as these are VERY frequent causes of
|
||||
* fuzzy spaces. The problem with the basic measure is that "561 63" would score
|
||||
* the same as "56163", though given our knowledge that the space is fuzzy, and
|
||||
* that there is a "1" next to the fuzzy space, we need to ensure that "56163"
|
||||
* is preferred.
|
||||
*
|
||||
* The solution is to NOT COUNT the score of any word which has a digit at one
|
||||
* end and a "1Il" as the character the other side of the space.
|
||||
*
|
||||
* Conversely, any character next to a "1" within a word is counted as a positive
|
||||
* score. Thus "561 63" would score 4 (3 chars in a numeric word plus 1 side of
|
||||
* the "1" joined). "56163" would score 7 - all chars in a numeric word + 2
|
||||
* sides of a "1" joined.
|
||||
*
|
||||
* The joined 1 rule is applied to any word REGARDLESS of contextual
|
||||
* confirmation. Thus "PS7a71 3/7a" scores 1 (neither word is contextually
|
||||
* confirmed; the only score is from the joined 1). "PS7a713/7a" scores 2.
|
||||
*
|
||||
*************************************************************************/
|
||||
INT16 eval_word_spacing(WERD_RES_LIST &word_res_list) {
|
||||
WERD_RES_IT word_res_it(&word_res_list);
|
||||
INT16 total_score = 0;
|
||||
INT16 word_count = 0;
|
||||
INT16 done_word_count = 0;
|
||||
INT16 word_len;
|
||||
INT16 i;
|
||||
WERD_RES *word; //current word
|
||||
INT16 prev_word_score = 0;
|
||||
BOOL8 prev_word_done = FALSE;
|
||||
BOOL8 prev_char_1 = FALSE; //prev ch a "1/I/l"?
|
||||
BOOL8 prev_char_digit = FALSE; //prev ch 2..9 or 0
|
||||
BOOL8 current_char_1 = FALSE;
|
||||
BOOL8 current_word_ok_so_far;
|
||||
STRING punct_chars = "!\"`',.:;";
|
||||
BOOL8 prev_char_punct = FALSE;
|
||||
BOOL8 current_char_punct = FALSE;
|
||||
BOOL8 word_done = FALSE;
|
||||
|
||||
do {
|
||||
word = word_res_it.data ();
|
||||
word_done = fixspace_thinks_word_done (word);
|
||||
word_count++;
|
||||
if (word->tess_failed) {
|
||||
total_score += prev_word_score;
|
||||
if (prev_word_done)
|
||||
done_word_count++;
|
||||
prev_word_score = 0;
|
||||
prev_char_1 = FALSE;
|
||||
prev_char_digit = FALSE;
|
||||
prev_word_done = FALSE;
|
||||
}
|
||||
else {
|
||||
/*
|
||||
Can we add the prev word score and potentially count this word?
|
||||
Yes IF it didn't end in a 1 when the first char of this word is a digit
|
||||
AND it didn't end in a digit when the first char of this word is a 1
|
||||
*/
|
||||
word_len = word->reject_map.length ();
|
||||
current_word_ok_so_far = FALSE;
|
||||
if (!((prev_char_1 &&
|
||||
digit_or_numeric_punct (word,
|
||||
word->best_choice->string ()[0])) ||
|
||||
(prev_char_digit &&
|
||||
((word_done &&
|
||||
(word->best_choice->string ()[0] == '1')) ||
|
||||
(!word_done &&
|
||||
STRING (conflict_set_I_l_1).contains (word->best_choice->
|
||||
string ()[0])))))) {
|
||||
total_score += prev_word_score;
|
||||
if (prev_word_done)
|
||||
done_word_count++;
|
||||
current_word_ok_so_far = word_done;
|
||||
}
|
||||
|
||||
if ((current_word_ok_so_far) &&
|
||||
(!tessedit_test_uniform_wd_spacing ||
|
||||
((word->best_choice->permuter () == NUMBER_PERM) ||
|
||||
uniformly_spaced (word)))) {
|
||||
prev_word_done = TRUE;
|
||||
prev_word_score = word_len;
|
||||
}
|
||||
else {
|
||||
prev_word_done = FALSE;
|
||||
prev_word_score = 0;
|
||||
}
|
||||
|
||||
if (fixsp_prefer_joined_1s) {
|
||||
/* Add 1 to total score for every joined 1 regardless of context and rejtn */
|
||||
|
||||
for (i = 0, prev_char_1 = FALSE; i < word_len; i++) {
|
||||
current_char_1 = word->best_choice->string ()[i] == '1';
|
||||
if (prev_char_1 || (current_char_1 && (i > 0)))
|
||||
total_score++;
|
||||
prev_char_1 = current_char_1;
|
||||
}
|
||||
}
|
||||
|
||||
/* Add 1 to total score for every joined punctuation regardless of context
|
||||
and rejtn */
|
||||
if (tessedit_prefer_joined_punct) {
|
||||
for (i = 0, prev_char_punct = FALSE; i < word_len; i++) {
|
||||
current_char_punct =
|
||||
punct_chars.contains (word->best_choice->string ()[i]);
|
||||
if (prev_char_punct || (current_char_punct && (i > 0)))
|
||||
total_score++;
|
||||
prev_char_punct = current_char_punct;
|
||||
}
|
||||
}
|
||||
prev_char_digit = digit_or_numeric_punct (word,
|
||||
word->best_choice->
|
||||
string ()[word_len - 1]);
|
||||
prev_char_1 =
|
||||
((word_done
|
||||
&& (word->best_choice->string ()[word_len - 1] == '1'))
|
||||
|| (!word_done
|
||||
&& STRING (conflict_set_I_l_1).contains (word->best_choice->
|
||||
string ()[word_len -
|
||||
1])));
|
||||
}
|
||||
/* Find next word */
|
||||
do
|
||||
word_res_it.forward ();
|
||||
while (word_res_it.data ()->part_of_combo);
|
||||
}
|
||||
while (!word_res_it.at_first ());
|
||||
total_score += prev_word_score;
|
||||
if (prev_word_done)
|
||||
done_word_count++;
|
||||
if (done_word_count == word_count)
|
||||
return PERFECT_WERDS;
|
||||
else
|
||||
return total_score;
|
||||
}
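The joined-"1" bonus described in the eval_word_spacing() comment can be isolated as a small standalone function. This is only a sketch under stated assumptions: `joined_one_bonus` is a hypothetical name that does not exist in fixspace.cpp, it takes a plain `std::string` instead of a `WERD_RES`, and it reproduces only the loop guarded by `fixsp_prefer_joined_1s`, not the contextual word score.

```cpp
#include <string>

// Hypothetical helper (not part of fixspace.cpp): counts the bonus that the
// fixsp_prefer_joined_1s loop in eval_word_spacing() adds for one word.
int joined_one_bonus(const std::string &text) {
  int bonus = 0;
  bool prev_char_1 = false;
  for (std::size_t i = 0; i < text.size(); ++i) {
    const bool current_char_1 = (text[i] == '1');
    // One point if the previous character was a '1', or if this character
    // is a '1' that has a neighbour on its left within the same word.
    if (prev_char_1 || (current_char_1 && i > 0))
      ++bonus;
    prev_char_1 = current_char_1;
  }
  return bonus;
}
```

With this helper, "561" earns 1 and "63" earns 0, while the joined form "56163" earns 2, which matches the "1 side" versus "2 sides of a 1 joined" wording in the comment.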
|
||||
|
||||
|
||||
BOOL8 digit_or_numeric_punct(WERD_RES *word, char ch) {
|
||||
return (isdigit (ch) ||
|
||||
(fixsp_numeric_fix &&
|
||||
(word->best_choice->permuter () == NUMBER_PERM) &&
|
||||
STRING (numeric_punctuation).contains (ch)));
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* transform_to_next_perm()
|
||||
* Examines the current word list to find the smallest word gap size. Then walks
|
||||
* the word list closing any gaps of this size by either inserting new
|
||||
* combination words, or extending existing ones.
|
||||
*
|
||||
* The routine COULD be limited to stop it building words longer than N blobs.
|
||||
*
|
||||
* If there are no more gaps then it DELETES the entire list and returns the
|
||||
* empty list to cause termination.
|
||||
*************************************************************************/
|
||||
void transform_to_next_perm(WERD_RES_LIST &words) {
|
||||
WERD_RES_IT word_it(&words);
|
||||
WERD_RES_IT prev_word_it(&words);
|
||||
WERD_RES *word;
|
||||
WERD_RES *prev_word;
|
||||
WERD_RES *combo;
|
||||
WERD *copy_word;
|
||||
INT16 prev_right = -1;
|
||||
BOX box;
|
||||
INT16 gap;
|
||||
INT16 min_gap = MAX_INT16;
|
||||
|
||||
for (word_it.mark_cycle_pt (); !word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
if (!word->part_of_combo) {
|
||||
box = word->word->bounding_box ();
|
||||
if (prev_right >= 0) {
|
||||
gap = box.left () - prev_right;
|
||||
if (gap < min_gap)
|
||||
min_gap = gap;
|
||||
}
|
||||
prev_right = box.right ();
|
||||
}
|
||||
}
|
||||
if (min_gap < MAX_INT16) {
|
||||
prev_right = -1; //back to start
|
||||
word_it.set_to_list (&words);
|
||||
for (; //cant use cycle pt due to inserted combos at start of list
|
||||
(prev_right < 0) || !word_it.at_first (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
if (!word->part_of_combo) {
|
||||
box = word->word->bounding_box ();
|
||||
if (prev_right >= 0) {
|
||||
gap = box.left () - prev_right;
|
||||
if (gap <= min_gap) {
|
||||
prev_word = prev_word_it.data ();
|
||||
if (prev_word->combination)
|
||||
combo = prev_word;
|
||||
else {
|
||||
/* Make a new combination and insert before the first word being joined */
|
||||
copy_word = new WERD;
|
||||
*copy_word = *(prev_word->word);
|
||||
//deep copy
|
||||
combo = new WERD_RES (copy_word);
|
||||
combo->combination = TRUE;
|
||||
prev_word->part_of_combo = TRUE;
|
||||
prev_word_it.add_before_then_move (combo);
|
||||
}
|
||||
combo->word->set_flag (W_EOL, word->word->flag (W_EOL));
|
||||
if (word->combination) {
|
||||
combo->word->join_on (word->word);
|
||||
//Move blbs to combo
|
||||
//old combo no longer needed
|
||||
delete word_it.extract ();
|
||||
}
|
||||
else {
|
||||
//Cpy current wd to combo
|
||||
combo->copy_on (word);
|
||||
word->part_of_combo = TRUE;
|
||||
}
|
||||
combo->done = FALSE;
|
||||
if (combo->outword != NULL) {
|
||||
delete combo->outword;
|
||||
delete combo->best_choice;
|
||||
delete combo->raw_choice;
|
||||
combo->outword = NULL;
|
||||
combo->best_choice = NULL;
|
||||
combo->raw_choice = NULL;
|
||||
}
|
||||
}
|
||||
else
|
||||
//catch up
|
||||
prev_word_it = word_it;
|
||||
}
|
||||
prev_right = box.right ();
|
||||
}
|
||||
}
|
||||
}
|
||||
else
|
||||
words.clear (); //signal termination
|
||||
}
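The gap-closing step of transform_to_next_perm() can be illustrated without the WERD_RES machinery. The sketch below is an assumption-laden simplification: `Span`, `smallest_gap` and `close_smallest_gaps` are hypothetical names, each word is reduced to its left and right edge, and merging two words is modelled by extending one span rather than by building a combination word.

```cpp
#include <climits>
#include <vector>

// Hypothetical stand-in for a word's bounding box (left and right edges only).
struct Span { int left; int right; };

// Smallest horizontal gap between neighbouring spans, or INT_MAX when there
// are fewer than two spans (i.e. no gaps left to close).
int smallest_gap(const std::vector<Span> &words) {
  int min_gap = INT_MAX;
  for (std::size_t i = 1; i < words.size(); ++i) {
    const int gap = words[i].left - words[i - 1].right;
    if (gap < min_gap) min_gap = gap;
  }
  return min_gap;
}

// Joins every pair of neighbours whose gap is <= the smallest gap. Returns an
// empty vector when no gaps remain, mirroring the "clear the list to signal
// termination" convention used by the original routine.
std::vector<Span> close_smallest_gaps(const std::vector<Span> &words) {
  const int min_gap = smallest_gap(words);
  if (min_gap == INT_MAX) return {};
  std::vector<Span> merged;
  merged.push_back(words[0]);
  for (std::size_t i = 1; i < words.size(); ++i) {
    if (words[i].left - words[i - 1].right <= min_gap)
      merged.back().right = words[i].right;  // extend the current combination
    else
      merged.push_back(words[i]);            // gap is larger, start a new word
  }
  return merged;
}
```

Calling `close_smallest_gaps` repeatedly yields successively coarser word lists until a single span is left, after which the next call returns an empty list.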
|
||||
|
||||
|
||||
void dump_words(WERD_RES_LIST &perm, INT16 score, INT16 mode, BOOL8 improved) {
|
||||
WERD_RES_IT word_res_it(&perm);
|
||||
static STRING initial_str;
|
||||
|
||||
if (debug_fix_space_level > 0) {
|
||||
if (mode == 1) {
|
||||
initial_str = "";
|
||||
for (word_res_it.mark_cycle_pt ();
|
||||
!word_res_it.cycled_list (); word_res_it.forward ()) {
|
||||
if (!word_res_it.data ()->part_of_combo) {
|
||||
initial_str += word_res_it.data ()->best_choice->string ();
|
||||
initial_str += ' ';
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_fix_space_level > 1) {
|
||||
switch (mode) {
|
||||
case 1:
|
||||
tprintf ("EXTRACTED (%d): \"", score);
|
||||
break;
|
||||
case 2:
|
||||
tprintf ("TESTED (%d): \"", score);
|
||||
break;
|
||||
case 3:
|
||||
tprintf ("RETURNED (%d): \"", score);
|
||||
break;
|
||||
}
|
||||
|
||||
for (word_res_it.mark_cycle_pt ();
|
||||
!word_res_it.cycled_list (); word_res_it.forward ()) {
|
||||
if (!word_res_it.data ()->part_of_combo)
|
||||
tprintf ("%s/%1d ",
|
||||
word_res_it.data ()->best_choice->string ().
|
||||
string (),
|
||||
(int) word_res_it.data ()->best_choice->permuter ());
|
||||
}
|
||||
tprintf ("\"\n");
|
||||
}
|
||||
else if (improved) {
|
||||
tprintf ("FIX SPACING \"%s\" => \"", initial_str.string ());
|
||||
for (word_res_it.mark_cycle_pt ();
|
||||
!word_res_it.cycled_list (); word_res_it.forward ()) {
|
||||
if (!word_res_it.data ()->part_of_combo)
|
||||
tprintf ("%s/%1d ",
|
||||
word_res_it.data ()->best_choice->string ().
|
||||
string (),
|
||||
(int) word_res_it.data ()->best_choice->permuter ());
|
||||
}
|
||||
tprintf ("\"\n");
|
||||
}
|
||||
#endif
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* uniformly_spaced()
|
||||
* Return true if any one of the following is true:
|
||||
* - All inter-char gaps are the same width
|
||||
* - The largest gap is no larger than twice the mean/median of the others
|
||||
* - The largest gap is < 64/5 = 13 and all others are <= 0
|
||||
* **** REMEMBER - WE'RE NOW WORKING WITH A BLN WERD !!!
|
||||
*************************************************************************/
|
||||
BOOL8 uniformly_spaced( //sensible word
|
||||
WERD_RES *word) {
|
||||
PBLOB_IT blob_it;
|
||||
BOX box;
|
||||
INT16 prev_right = -MAX_INT16;
|
||||
INT16 gap;
|
||||
INT16 max_gap = -MAX_INT16;
|
||||
INT16 max_gap_count = 0;
|
||||
STATS gap_stats (0, MAXSPACING);
|
||||
BOOL8 result;
|
||||
const ROW *row = word->denorm.row ();
|
||||
float max_non_space;
|
||||
float normalised_max_nonspace;
|
||||
INT16 i = 0;
|
||||
STRING punct_chars = "\"`',.:;";
|
||||
|
||||
blob_it.set_to_list (word->outword->blob_list ());
|
||||
|
||||
for (blob_it.mark_cycle_pt (); !blob_it.cycled_list (); blob_it.forward ()) {
|
||||
box = blob_it.data ()->bounding_box ();
|
||||
if ((prev_right > -MAX_INT16) &&
|
||||
(!fixsp_ignore_punct ||
|
||||
(!punct_chars.contains (word->best_choice->string ()[i - 1]) &&
|
||||
!punct_chars.contains (word->best_choice->string ()[i])))) {
|
||||
gap = box.left () - prev_right;
|
||||
if (gap < max_gap)
|
||||
gap_stats.add (gap, 1);
|
||||
else if (gap == max_gap)
|
||||
max_gap_count++;
|
||||
else {
|
||||
if (max_gap_count > 0)
|
||||
gap_stats.add (max_gap, max_gap_count);
|
||||
max_gap = gap;
|
||||
max_gap_count = 1;
|
||||
}
|
||||
}
|
||||
prev_right = box.right ();
|
||||
i++;
|
||||
}
|
||||
|
||||
max_non_space = (row->space () + 3 * row->kern ()) / 4;
|
||||
normalised_max_nonspace = max_non_space * bln_x_height / row->x_height ();
|
||||
|
||||
result = ((gap_stats.get_total () == 0) ||
|
||||
(max_gap <= normalised_max_nonspace) ||
|
||||
((gap_stats.get_total () > 2) &&
|
||||
(max_gap <= 2 * gap_stats.median ())) ||
|
||||
((gap_stats.get_total () <= 2) &&
|
||||
(max_gap <= 2 * gap_stats.mean ())));
|
||||
#ifndef SECURE_NAMES
|
||||
if ((debug_fix_space_level > 1)) {
|
||||
if (result)
|
||||
tprintf
|
||||
("ACCEPT SPACING FOR: \"%s\" norm_maxnon = %f max=%d maxcount=%d total=%d mean=%f median=%f\n",
|
||||
word->best_choice->string ().string (), normalised_max_nonspace,
|
||||
max_gap, max_gap_count, gap_stats.get_total (), gap_stats.mean (),
|
||||
gap_stats.median ());
|
||||
else
|
||||
tprintf
|
||||
("REJECT SPACING FOR: \"%s\" norm_maxnon = %f max=%d maxcount=%d total=%d mean=%f median=%f\n",
|
||||
word->best_choice->string ().string (), normalised_max_nonspace,
|
||||
max_gap, max_gap_count, gap_stats.get_total (), gap_stats.mean (),
|
||||
gap_stats.median ());
|
||||
}
|
||||
#endif
|
||||
|
||||
return result;
|
||||
}
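The threshold that uniformly_spaced() compares `max_gap` against can be worked through in isolation. The function below is a sketch, not the original code: the name and parameter list are hypothetical, the row metrics are passed in as plain numbers, and it uses floating point throughout, so any integer truncation in the original expression is not reproduced.

```cpp
// Hypothetical helper: largest gap still treated as a non-space, scaled from
// row coordinates into baseline-normalised coordinates.
//   space, kern  - the row's estimated space and kern widths
//   x_height     - the row's x-height
//   bln_x_height - the x-height used for baseline-normalised blobs
double normalised_max_nonspace_sketch(double space, double kern,
                                      double x_height, double bln_x_height) {
  // Weighted towards the kern width: one part space, three parts kern.
  const double max_non_space = (space + 3.0 * kern) / 4.0;
  return max_non_space * bln_x_height / x_height;
}
```

For example, with space 20, kern 4, a row x-height of 32 and a normalised x-height of 128 (illustrative values only), the weighted width is 8 and the scaled threshold is 32.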
|
||||
|
||||
|
||||
BOOL8 fixspace_thinks_word_done(WERD_RES *word) {
|
||||
if (word->done)
|
||||
return TRUE;
|
||||
|
||||
/*
|
||||
Use all the standard pass 2 conditions for mode 5 in set_done() in
|
||||
reject.c BUT DONT REJECT IF THE WERD IS AMBIGUOUS - FOR SPACING WE DONT
|
||||
CARE WHETHER WE HAVE of/at on/an etc.
|
||||
*/
|
||||
if ((fixsp_done_mode > 0) &&
|
||||
(word->tess_accepted ||
|
||||
((fixsp_done_mode == 2) &&
|
||||
(word->reject_map.reject_count () == 0)) ||
|
||||
(fixsp_done_mode == 3)) &&
|
||||
(strchr (word->best_choice->string ().string (), ' ') == NULL) &&
|
||||
((word->best_choice->permuter () == SYSTEM_DAWG_PERM) ||
|
||||
(word->best_choice->permuter () == FREQ_DAWG_PERM) ||
|
||||
(word->best_choice->permuter () == USER_DAWG_PERM) ||
|
||||
(word->best_choice->permuter () == NUMBER_PERM)))
|
||||
return TRUE;
|
||||
else
|
||||
return FALSE;
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* fix_sp_fp_word()
|
||||
* Test the current word to see if it can be split by deleting noise blobs. If
|
||||
* so, do the business.
|
||||
* Return with the iterator pointing to the same place if the word is unchanged,
|
||||
* or the last of the replacement words.
|
||||
*************************************************************************/
|
||||
void fix_sp_fp_word(WERD_RES_IT &word_res_it, ROW *row) {
|
||||
WERD_RES *word_res;
|
||||
WERD_RES_LIST sub_word_list;
|
||||
WERD_RES_IT sub_word_list_it(&sub_word_list);
|
||||
INT16 blob_index;
|
||||
INT16 new_length;
|
||||
float junk;
|
||||
|
||||
word_res = word_res_it.data ();
|
||||
if (!fixsp_check_for_fp_noise_space ||
|
||||
word_res->word->flag (W_REP_CHAR) ||
|
||||
word_res->combination ||
|
||||
word_res->part_of_combo || !word_res->word->flag (W_DONT_CHOP))
|
||||
return;
|
||||
|
||||
blob_index = worst_noise_blob (word_res, &junk);
|
||||
if (blob_index < 0)
|
||||
return;
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_fix_space_level > 1) {
|
||||
tprintf ("FP fixspace working on \"%s\"\n",
|
||||
word_res->best_choice->string ().string ());
|
||||
}
|
||||
#endif
|
||||
gblob_sort_list ((PBLOB_LIST *) word_res->word->rej_cblob_list (), FALSE);
|
||||
sub_word_list_it.add_after_stay_put (word_res_it.extract ());
|
||||
fix_noisy_space_list(sub_word_list, row);
|
||||
new_length = sub_word_list.length ();
|
||||
word_res_it.add_list_before (&sub_word_list);
|
||||
for (; (!word_res_it.at_last () && (new_length > 1)); new_length--) {
|
||||
word_res_it.forward ();
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void fix_noisy_space_list(WERD_RES_LIST &best_perm, ROW *row) {
|
||||
INT16 best_score;
|
||||
WERD_RES_IT best_perm_it(&best_perm);
|
||||
WERD_RES_LIST current_perm;
|
||||
WERD_RES_IT current_perm_it(&current_perm);
|
||||
WERD_RES *old_word_res;
|
||||
WERD_RES *new_word_res;
|
||||
INT16 current_score;
|
||||
BOOL8 improved = FALSE;
|
||||
|
||||
//default score
|
||||
best_score = fp_eval_word_spacing (best_perm);
|
||||
|
||||
dump_words (best_perm, best_score, 1, improved);
|
||||
|
||||
new_word_res = new WERD_RES;
|
||||
old_word_res = best_perm_it.data ();
|
||||
//Kludge to force deep copy
|
||||
old_word_res->combination = TRUE;
|
||||
*new_word_res = *old_word_res; //deep copy
|
||||
//Undo kludge
|
||||
old_word_res->combination = FALSE;
|
||||
//Undo kludge
|
||||
new_word_res->combination = FALSE;
|
||||
current_perm_it.add_to_end (new_word_res);
|
||||
|
||||
break_noisiest_blob_word(current_perm);
|
||||
|
||||
while ((best_score != PERFECT_WERDS) && !current_perm.empty ()) {
|
||||
match_current_words(current_perm, row);
|
||||
current_score = fp_eval_word_spacing (current_perm);
|
||||
dump_words (current_perm, current_score, 2, improved);
|
||||
if (current_score > best_score) {
|
||||
best_perm.clear ();
|
||||
best_perm.deep_copy (&current_perm);
|
||||
best_score = current_score;
|
||||
improved = TRUE;
|
||||
}
|
||||
if (current_score < PERFECT_WERDS)
|
||||
break_noisiest_blob_word(current_perm);
|
||||
}
|
||||
dump_words (best_perm, best_score, 3, improved);
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* break_noisiest_blob_word()
|
||||
* Find the word with the blob which looks like the worst noise.
|
||||
* Break the word into two, deleting the noise blob.
|
||||
*************************************************************************/
|
||||
void break_noisiest_blob_word(WERD_RES_LIST &words) {
|
||||
WERD_RES_IT word_it(&words);
|
||||
WERD_RES_IT worst_word_it;
|
||||
float worst_noise_score = 9999;
|
||||
int worst_blob_index = -1; //noisiest blb of noisiest wd
|
||||
int blob_index; //of wds noisiest blb
|
||||
float noise_score; //of wds noisiest blb
|
||||
WERD_RES *word_res;
|
||||
C_BLOB_IT blob_it;
|
||||
C_BLOB_IT rej_cblob_it;
|
||||
C_BLOB_LIST new_blob_list;
|
||||
C_BLOB_IT new_blob_it;
|
||||
C_BLOB_IT new_rej_cblob_it;
|
||||
WERD *new_word;
|
||||
INT16 start_of_noise_blob;
|
||||
INT16 i;
|
||||
|
||||
for (word_it.mark_cycle_pt (); !word_it.cycled_list (); word_it.forward ()) {
|
||||
blob_index = worst_noise_blob (word_it.data (), &noise_score);
|
||||
if ((blob_index > -1) && (worst_noise_score > noise_score)) {
|
||||
worst_noise_score = noise_score;
|
||||
worst_blob_index = blob_index;
|
||||
worst_word_it = word_it;
|
||||
}
|
||||
}
|
||||
if (worst_blob_index < 0) {
|
||||
words.clear (); //signal termination
|
||||
return;
|
||||
}
|
||||
|
||||
/* Now split the worst_word_it */
|
||||
|
||||
word_res = worst_word_it.data ();
|
||||
|
||||
/* Move blobs before noise blob to a new bloblist */
|
||||
|
||||
new_blob_it.set_to_list (&new_blob_list);
|
||||
blob_it.set_to_list (word_res->word->cblob_list ());
|
||||
for (i = 0; i < worst_blob_index; i++, blob_it.forward ()) {
|
||||
new_blob_it.add_after_then_move (blob_it.extract ());
|
||||
}
|
||||
start_of_noise_blob = blob_it.data ()->bounding_box ().left ();
|
||||
delete blob_it.extract (); //throw out noise blb
|
||||
|
||||
new_word = new WERD (&new_blob_list, word_res->word);
|
||||
new_word->set_flag (W_EOL, FALSE);
|
||||
word_res->word->set_flag (W_BOL, FALSE);
|
||||
word_res->word->set_blanks (1);//After break
|
||||
|
||||
new_rej_cblob_it.set_to_list (new_word->rej_cblob_list ());
|
||||
rej_cblob_it.set_to_list (word_res->word->rej_cblob_list ());
|
||||
for (;
|
||||
(!rej_cblob_it.empty () &&
|
||||
(rej_cblob_it.data ()->bounding_box ().left () <
|
||||
start_of_noise_blob)); rej_cblob_it.forward ()) {
|
||||
new_rej_cblob_it.add_after_then_move (rej_cblob_it.extract ());
|
||||
}
|
||||
|
||||
worst_word_it.add_before_then_move (new WERD_RES (new_word));
|
||||
|
||||
word_res->done = FALSE;
|
||||
if (word_res->outword != NULL) {
|
||||
delete word_res->outword;
|
||||
delete word_res->best_choice;
|
||||
delete word_res->raw_choice;
|
||||
word_res->outword = NULL;
|
||||
word_res->best_choice = NULL;
|
||||
word_res->raw_choice = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
INT16 worst_noise_blob(WERD_RES *word_res, float *worst_noise_score) {
|
||||
PBLOB_IT blob_it;
|
||||
INT16 blob_count;
|
||||
float noise_score[512];
|
||||
int i;
|
||||
int min_noise_blob; //1st contender
|
||||
int max_noise_blob; //last contender
|
||||
int non_noise_count;
|
||||
int worst_noise_blob; //Worst blob
|
||||
float small_limit = bln_x_height * fixsp_small_outlines_size;
|
||||
float non_noise_limit = bln_x_height * 0.8;
|
||||
|
||||
blob_it.set_to_list (word_res->outword->blob_list ());
|
||||
//normalised
|
||||
blob_count = blob_it.length ();
|
||||
ASSERT_HOST (blob_count <= 512);
|
||||
if (blob_count < 5)
|
||||
return -1; //too short to split
|
||||
/* Get the noise scores for all blobs */
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_fix_space_level > 5)
|
||||
tprintf ("FP fixspace Noise metrics for \"%s\": ",
|
||||
word_res->best_choice->string ().string ());
|
||||
#endif
|
||||
|
||||
for (i = 0; i < blob_count; i++, blob_it.forward ()) {
|
||||
if (word_res->reject_map[i].accepted ())
|
||||
noise_score[i] = non_noise_limit;
|
||||
else
|
||||
noise_score[i] = blob_noise_score (blob_it.data ());
|
||||
|
||||
if (debug_fix_space_level > 5)
|
||||
tprintf ("%1.1f ", noise_score[i]);
|
||||
}
|
||||
if (debug_fix_space_level > 5)
|
||||
tprintf ("\n");
|
||||
|
||||
/* Now find the worst one which is far enough away from the end of the word */
|
||||
|
||||
non_noise_count = 0;
|
||||
for (i = 0;
|
||||
(i < blob_count) && (non_noise_count < fixsp_non_noise_limit); i++) {
|
||||
if (noise_score[i] >= non_noise_limit)
|
||||
non_noise_count++;
|
||||
}
|
||||
if (non_noise_count < fixsp_non_noise_limit)
|
||||
return -1;
|
||||
min_noise_blob = i;
|
||||
|
||||
non_noise_count = 0;
|
||||
for (i = blob_count - 1;
|
||||
(i >= 0) && (non_noise_count < fixsp_non_noise_limit); i--) {
|
||||
if (noise_score[i] >= non_noise_limit)
|
||||
non_noise_count++;
|
||||
}
|
||||
if (non_noise_count < fixsp_non_noise_limit)
|
||||
return -1;
|
||||
max_noise_blob = i;
|
||||
|
||||
if (min_noise_blob > max_noise_blob)
|
||||
return -1;
|
||||
|
||||
*worst_noise_score = small_limit;
|
||||
worst_noise_blob = -1;
|
||||
for (i = min_noise_blob; i <= max_noise_blob; i++) {
|
||||
if (noise_score[i] < *worst_noise_score) {
|
||||
worst_noise_blob = i;
|
||||
*worst_noise_score = noise_score[i];
|
||||
}
|
||||
}
|
||||
return worst_noise_blob;
|
||||
}
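The candidate-window logic of worst_noise_blob() may be easier to follow on a plain array of scores. The sketch below assumes the per-blob scores have already been computed; `worst_noise_index` and its parameters are hypothetical names, with `non_noise_needed` playing the role of `fixsp_non_noise_limit`.

```cpp
#include <vector>

// Hypothetical helper: index of the most noise-like score (the lowest one
// below small_limit) that has at least non_noise_needed solid blobs on each
// side, or -1 if there is no such candidate.
int worst_noise_index(const std::vector<float> &noise_score, float small_limit,
                      float non_noise_limit, int non_noise_needed) {
  const int n = static_cast<int>(noise_score.size());
  if (n < 5) return -1;  // too short to split

  // Skip forward past the required number of non-noise blobs.
  int i = 0;
  int count = 0;
  for (; i < n && count < non_noise_needed; i++)
    if (noise_score[i] >= non_noise_limit) count++;
  if (count < non_noise_needed) return -1;
  const int first = i;

  // Same again from the right-hand end.
  count = 0;
  for (i = n - 1; i >= 0 && count < non_noise_needed; i--)
    if (noise_score[i] >= non_noise_limit) count++;
  if (count < non_noise_needed) return -1;
  const int last = i;

  if (first > last) return -1;

  // Lowest score inside the window that is also below small_limit.
  int worst = -1;
  float worst_score = small_limit;
  for (i = first; i <= last; i++) {
    if (noise_score[i] < worst_score) {
      worst = i;
      worst_score = noise_score[i];
    }
  }
  return worst;
}
```

A low score at either edge of the word is never selected, because the window only opens after a solid blob has been seen on that side.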
|
||||
|
||||
|
||||
float blob_noise_score(PBLOB *blob) {
|
||||
OUTLINE_IT outline_it;
|
||||
BOX box; //BB of outline
|
||||
INT16 outline_count = 0;
|
||||
INT16 max_dimension;
|
||||
INT16 largest_outline_dimension = 0;
|
||||
|
||||
outline_it.set_to_list (blob->out_list ());
|
||||
for (outline_it.mark_cycle_pt ();
|
||||
!outline_it.cycled_list (); outline_it.forward ()) {
|
||||
outline_count++;
|
||||
box = outline_it.data ()->bounding_box ();
|
||||
if (box.height () > box.width ())
|
||||
max_dimension = box.height ();
|
||||
else
|
||||
max_dimension = box.width ();
|
||||
|
||||
if (largest_outline_dimension < max_dimension)
|
||||
largest_outline_dimension = max_dimension;
|
||||
}
|
||||
|
||||
if (fixsp_noise_score_fixing) {
|
||||
if (outline_count > 5)
|
||||
//penalise LOTS of blobs
|
||||
largest_outline_dimension *= 2;
|
||||
|
||||
box = blob->bounding_box ();
|
||||
|
||||
if ((box.bottom () > bln_baseline_offset * 4) ||
|
||||
(box.top () < bln_baseline_offset / 2))
|
||||
//Lax if blob is high or low
|
||||
largest_outline_dimension /= 2;
|
||||
}
|
||||
return largest_outline_dimension;
|
||||
}
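The scoring rule in blob_noise_score() can be restated on plain numbers. This is a sketch with hypothetical names: `OutlineBox` stands in for an outline's bounding box, `baseline_offset` stands in for `bln_baseline_offset`, and the `fixsp_noise_score_fixing` adjustments are assumed to be switched on. Smaller results are the ones worst_noise_blob() treats as more noise-like.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one outline's bounding box.
struct OutlineBox { int width; int height; };

// blob_bottom / blob_top are the vertical extents of the whole blob in
// baseline-normalised coordinates.
int noise_score_sketch(const std::vector<OutlineBox> &outlines,
                       int blob_bottom, int blob_top, int baseline_offset) {
  // Base score: the largest width or height of any single outline.
  int largest = 0;
  for (const OutlineBox &o : outlines)
    largest = std::max(largest, std::max(o.width, o.height));

  // More than five outlines doubles the score.
  if (outlines.size() > 5) largest *= 2;

  // A blob sitting well above or well below the text band halves the score.
  if (blob_bottom > baseline_offset * 4 || blob_top < baseline_offset / 2)
    largest /= 2;

  return largest;
}
```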
|
||||
|
||||
|
||||
void fixspace_dbg(WERD_RES *word) {
|
||||
BOX box = word->word->bounding_box ();
|
||||
BOOL8 show_map_detail = FALSE;
|
||||
INT16 i;
|
||||
|
||||
box.print ();
|
||||
#ifndef SECURE_NAMES
|
||||
tprintf (" \"%s\" ", word->best_choice->string ().string ());
|
||||
tprintf ("Blob count: %d (word); %d/%d (outword)\n",
|
||||
word->word->gblob_list ()->length (),
|
||||
word->outword->gblob_list ()->length (),
|
||||
word->outword->rej_blob_list ()->length ());
|
||||
word->reject_map.print (debug_fp);
|
||||
tprintf ("\n");
|
||||
if (show_map_detail) {
|
||||
tprintf ("\"%s\"\n", word->best_choice->string ().string ());
|
||||
for (i = 0; word->best_choice->string ()[i] != '\0'; i++) {
|
||||
tprintf ("**** \"%c\" ****\n", word->best_choice->string ()[i]);
|
||||
word->reject_map[i].full_print (debug_fp);
|
||||
}
|
||||
}
|
||||
|
||||
tprintf ("Tess Accepted: %s\n", word->tess_accepted ? "TRUE" : "FALSE");
|
||||
tprintf ("Done flag: %s\n\n", word->done ? "TRUE" : "FALSE");
|
||||
#endif
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* fp_eval_word_spacing()
|
||||
* Evaluation function for fixed pitch word lists.
|
||||
*
|
||||
* Basically, count the number of "nice" characters - those which are in tess
|
||||
* acceptable words or in dict words and are not rejected.
|
||||
* Penalise any potential noise chars
|
||||
*************************************************************************/
|
||||
|
||||
INT16 fp_eval_word_spacing(WERD_RES_LIST &word_res_list) {
|
||||
WERD_RES_IT word_it(&word_res_list);
|
||||
WERD_RES *word;
|
||||
PBLOB_IT blob_it;
|
||||
INT16 word_length;
|
||||
INT16 score = 0;
|
||||
INT16 i;
|
||||
const char *chs;
|
||||
float small_limit = bln_x_height * fixsp_small_outlines_size;
|
||||
|
||||
if (!fixsp_fp_eval)
|
||||
return (eval_word_spacing (word_res_list));
|
||||
|
||||
for (word_it.mark_cycle_pt (); !word_it.cycled_list (); word_it.forward ()) {
|
||||
word = word_it.data ();
|
||||
word_length = word->reject_map.length ();
|
||||
chs = word->best_choice->string ().string ();
|
||||
if ((word->done ||
|
||||
word->tess_accepted) ||
|
||||
(word->best_choice->permuter () == SYSTEM_DAWG_PERM) ||
|
||||
(word->best_choice->permuter () == FREQ_DAWG_PERM) ||
|
||||
(word->best_choice->permuter () == USER_DAWG_PERM) ||
|
||||
(safe_dict_word (chs) > 0)) {
|
||||
blob_it.set_to_list (word->outword->blob_list ());
|
||||
for (i = 0; i < word_length; i++, blob_it.forward ()) {
|
||||
if ((chs[i] == ' ') ||
|
||||
(blob_noise_score (blob_it.data ()) < small_limit))
|
||||
score -= 1; //penalise possibly erroneous non-space
|
||||
|
||||
else if (word->reject_map[i].accepted ())
|
||||
score++;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (score < 0)
|
||||
score = 0;
|
||||
return score;
|
||||
}
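// Illustrative sketch only, not part of the original file: the scoring rule
// of fp_eval_word_spacing() on plain data. CharInfo, Word and
// score_fixed_pitch_words are hypothetical names. The sketch assumes the
// caller passes only words that already passed the done / accepted /
// dictionary test, and that each noise score was computed beforehand.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-character record.
struct CharInfo {
  char ch;            // recognised character
  float noise_score;  // value that blob_noise_score() would return
  bool accepted;      // true if the reject map accepts this character
};

typedef std::vector<CharInfo> Word;

int score_fixed_pitch_words(const std::vector<Word> &words,
                            float small_limit) {
  int score = 0;
  for (const Word &word : words) {
    for (const CharInfo &c : word) {
      if (c.ch == ' ' || c.noise_score < small_limit)
        score -= 1;  // space or possible noise blob
      else if (c.accepted)
        score += 1;  // "nice" character
    }
  }
  return std::max(score, 0);  // the total is never negative
}
```

// In the original, small_limit is bln_x_height * fixsp_small_outlines_size.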
|
72
ccmain/fixspace.h
Normal file
@ -0,0 +1,72 @@
|
||||
/******************************************************************
|
||||
* File: fixspace.h (Formerly fixspace.h)
|
||||
* Description: Implements a pass over the page res, exploring the alternative
|
||||
* spacing possibilities, trying to use context to improve the
|
||||
* word spacing
|
||||
* Author: Phil Cheatle
|
||||
* Created: Thu Oct 21 11:38:43 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef FIXSPACE_H
|
||||
#define FIXSPACE_H
|
||||
|
||||
#include "pageres.h"
|
||||
#include "varable.h"
|
||||
#include "ocrclass.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern BOOL_VAR_H (fixsp_check_for_fp_noise_space, TRUE,
|
||||
"Try turning noise to space in fixed pitch");
|
||||
extern BOOL_VAR_H (fixsp_fp_eval, TRUE, "Use alternate evaluation for fp");
|
||||
extern BOOL_VAR_H (fixsp_noise_score_fixing, TRUE, "More sophisticated?");
|
||||
extern INT_VAR_H (fixsp_non_noise_limit, 1,
|
||||
"How many non-noise blbs either side?");
|
||||
extern double_VAR_H (fixsp_small_outlines_size, 0.28,
|
||||
"Small if lt xht x this");
|
||||
extern BOOL_VAR_H (fixsp_ignore_punct, TRUE, "In uniform spacing calc");
|
||||
extern BOOL_VAR_H (fixsp_numeric_fix, TRUE, "Try to deal with numeric punct");
|
||||
extern BOOL_VAR_H (fixsp_prefer_joined_1s, TRUE, "Arbitrary boost");
|
||||
extern BOOL_VAR_H (tessedit_test_uniform_wd_spacing, FALSE,
|
||||
"Limit context word spacing");
|
||||
extern BOOL_VAR_H (tessedit_prefer_joined_punct, FALSE,
"Reward punctuation joins");
|
||||
extern INT_VAR_H (fixsp_done_mode, 1, "What constitutes done for spacing");
|
||||
extern INT_VAR_H (debug_fix_space_level, 0, "Contextual fixspace debug");
|
||||
extern STRING_VAR_H (numeric_punctuation, ".,",
|
||||
"Punct. chs expected WITHIN numbers");
|
||||
void fix_fuzzy_spaces( //find fuzzy words
|
||||
volatile ETEXT_DESC *monitor, //progress monitor
|
||||
INT32 word_count, //count of words in doc
|
||||
PAGE_RES *page_res);
|
||||
void fix_fuzzy_space_list( //space explorer
|
||||
WERD_RES_LIST &best_perm,
|
||||
ROW *row);
|
||||
void initialise_search(WERD_RES_LIST &src_list, WERD_RES_LIST &new_list);
|
||||
void match_current_words(WERD_RES_LIST &words, ROW *row);
|
||||
INT16 eval_word_spacing(WERD_RES_LIST &word_res_list);
|
||||
BOOL8 digit_or_numeric_punct(WERD_RES *word, char ch);
|
||||
void transform_to_next_perm(WERD_RES_LIST &words);
|
||||
void dump_words(WERD_RES_LIST &perm, INT16 score, INT16 mode, BOOL8 improved);
|
||||
BOOL8 uniformly_spaced( //sensible word
|
||||
WERD_RES *word);
|
||||
BOOL8 fixspace_thinks_word_done(WERD_RES *word);
|
||||
void fix_sp_fp_word(WERD_RES_IT &word_res_it, ROW *row);
|
||||
void fix_noisy_space_list(WERD_RES_LIST &best_perm, ROW *row);
|
||||
void break_noisiest_blob_word(WERD_RES_LIST &words);
|
||||
INT16 worst_noise_blob(WERD_RES *word_res, float *worst_noise_score);
|
||||
float blob_noise_score(PBLOB *blob);
|
||||
void fixspace_dbg(WERD_RES *word);
|
||||
INT16 fp_eval_word_spacing(WERD_RES_LIST &word_res_list);
|
||||
#endif
|
790
ccmain/fixxht.cpp
Normal file
@ -0,0 +1,790 @@
|
||||
/**********************************************************************
|
||||
* File: fixxht.cpp (Formerly fixxht.c)
|
||||
* Description: Improve x_ht and look out for case inconsistencies
|
||||
* Author: Phil Cheatle
|
||||
* Created: Thu Aug 5 14:11:08 BST 1993
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <string.h>
|
||||
#include <ctype.h>
|
||||
#include "varable.h"
|
||||
#include "tessvars.h"
|
||||
#include "control.h"
|
||||
#include "reject.h"
|
||||
#include "fixxht.h"
|
||||
#include "secname.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN double_VAR (x_ht_fraction_of_caps_ht, 0.7,
|
||||
"Fract of cps ht est of xht");
|
||||
EXTERN double_VAR (x_ht_variation, 0.35,
|
||||
"Err band as fract of caps/xht dist");
|
||||
EXTERN double_VAR (x_ht_sub_variation, 0.5,
|
||||
"Err band as fract of caps/xht dist");
|
||||
EXTERN BOOL_VAR (rej_trial_ambigs, TRUE,
|
||||
"reject x-ht ambigs when under trial");
|
||||
EXTERN BOOL_VAR (x_ht_conservative_ambigs, FALSE,
|
||||
"Dont rely on ambigs + maxht");
|
||||
EXTERN BOOL_VAR (x_ht_check_est, TRUE, "Cross check estimates");
|
||||
EXTERN BOOL_VAR (x_ht_case_flip, FALSE, "Flip or reject suspect case");
|
||||
EXTERN BOOL_VAR (x_ht_include_dodgy_blobs, TRUE,
|
||||
"Include blobs with possible noise?");
|
||||
EXTERN BOOL_VAR (x_ht_limit_flip_trials, TRUE,
|
||||
"Dont do trial flips when ambigs are close to xht?");
|
||||
EXTERN BOOL_VAR (rej_use_check_block_occ, TRUE,
|
||||
"Analyse rejection behaviour");
|
||||
|
||||
EXTERN STRING_VAR (chs_non_ambig_caps_ht,
|
||||
"!#$%&()/12346789?ABDEFGHIKLNQRT[]\\bdfhkl",
|
||||
"Reliable ascenders");
|
||||
EXTERN STRING_VAR (chs_x_ht, "acegmnopqrsuvwxyz", "X height chars");
|
||||
EXTERN STRING_VAR (chs_non_ambig_x_ht, "aenqr", "reliable X height chars");
|
||||
EXTERN STRING_VAR (chs_ambig_caps_x, "cCmMoO05sSuUvVwWxXzZ",
|
||||
"X ht or caps ht chars");
|
||||
EXTERN STRING_VAR (chs_bl_ambig_caps_x, "pPyY", " Caps or descender ambigs");
|
||||
|
||||
/* The following aren't used in this module but are used in applybox.c */
|
||||
EXTERN STRING_VAR (chs_caps_ht,
|
||||
"!#$%&()/0123456789?ABCDEFGHIJKLMNOPQRSTUVWXYZ[]\\bdfhkl{|}",
|
||||
"Ascender chars");
|
||||
EXTERN STRING_VAR (chs_desc, "gjpqy", "Descender chars");
|
||||
EXTERN STRING_VAR (chs_non_ambig_bl,
|
||||
"!#$%&01246789?ABCDEFGHIKLMNORSTUVWXYZabcdehiklmnorstuvwxz",
|
||||
"Reliable baseline chars");
|
||||
EXTERN STRING_VAR (chs_odd_top, "ijt", "Chars with funny ascender region");
|
||||
EXTERN STRING_VAR (chs_odd_bot, "()35JQ[]\\/{}|", "Chars with funny base");
|
||||
|
||||
/* The following aren't used but are defined for completeness */
|
||||
EXTERN STRING_VAR (chs_bl,
|
||||
"!#$%&()/01246789?ABCDEFGHIJKLMNOPRSTUVWXYZ[]\\abcdefhiklmnorstuvwxz{}",
|
||||
"Baseline chars");
|
||||
EXTERN STRING_VAR (chs_non_ambig_desc, "gq", "Reliable descender chars");
|
||||
|
||||
/*************************************************************************
|
||||
* re_estimate_x_ht()
|
||||
*
|
||||
* Walk the blobs in the word together with the text string and reject map.
|
||||
* NOTE: All evaluation is done on the baseline normalised word. This is so that
|
||||
* the BOX class can be used (integer). The reasons for this are:
|
||||
* a) We must use the outword - ie the Tess result
|
||||
* b) The outword is always converted to integer representation as that is how
|
||||
* Tess works
|
||||
* c) We would like to use the BOX class, cos its there - this is integer
|
||||
* precision.
|
||||
* d) If we de-normed the outword we would get rounding errors and would find
|
||||
* that integers are too imprecise (x-height around 15 pixels instead of a
|
||||
* scale of 128 in bln form).
|
||||
* CONVINCED?
|
||||
*
|
||||
* A) Try to re-estimate x-ht and caps ht from confirmed pts in word.
|
||||
*
|
||||
* FOR each non reject blob
|
||||
* IF char is baseline posn ambiguous
|
||||
* Remove ambiguity by comparing its posn with respect to baseline.
|
||||
* IF char is a confirmed x-ht char
|
||||
* Add x-ht posn to confirmed_x_ht pts for word
|
||||
* IF char is a confirmed caps-ht char
|
||||
* Add blob_ht to caps ht pts for word
|
||||
*
|
||||
* IF Std Dev of caps hts < 2 (AND # samples > 0)
|
||||
* Use mean as caps ht estimate (Dont use median as we can expect a
|
||||
* fair variation between the heights of the NON_AMBIG_CAPS_HT_CHS)
|
||||
* IF Std Dev of caps hts >= 2 (AND # samples > 0)
|
||||
* Suspect small caps font.
|
||||
* Look for 2 clusters, each with Std Dev < 2.
|
||||
* IF 2 clusters found
|
||||
* Pick the smaller median as the caps ht estimate of the smallcaps.
|
||||
*
|
||||
* IF failed to estimate a caps ht
|
||||
* Use the median caps ht if there is one,
|
||||
* ELSE use the caps ht estimate of the previous word. NO!!!
|
||||
*
|
||||
*
|
||||
* IF there are confirmed x-height chars
|
||||
* Estimate confirmed x-height as the median value
|
||||
* ELSE IF there is a confirmed caps ht
|
||||
* Estimate confirmed x-height as a fraction of confirmed caps ht value
|
||||
* ELSE
|
||||
* Use the value for the previous word or the row value if this is the
|
||||
* first word in the block. NO!!!
|
||||
*
|
||||
* B) Add in case ambiguous blobs based on confirmed x-ht/caps ht, changing case
|
||||
* as necessary. Reestimate caps ht and x-ht as in A, using the extended
|
||||
* clusters.
|
||||
*
|
||||
* C) If word contains rejects, and x-ht estimate significantly differs from
|
||||
* original estimate, return TRUE so that the word can be rematched
|
||||
*************************************************************************/
|
||||
|
||||
void re_estimate_x_ht( //improve for 1 word
|
||||
WERD_RES *word_res, //word to do
|
||||
float *trial_x_ht //new match value
|
||||
) {
|
||||
PBLOB_IT blob_it;
|
||||
INT16 blob_ht_above_baseline;
|
||||
|
||||
const char *word_str;
|
||||
INT16 i;
|
||||
|
||||
STATS all_blobs_ht (0, 300); //every blob in word
|
||||
STATS x_ht (0, 300); //confirmed pts in wd
|
||||
STATS caps_ht (0, 300); //confirmed pts in wd
|
||||
STATS case_ambig (0, 300); //lower case ambigs
|
||||
|
||||
INT16 rej_blobs_count = 0;
|
||||
INT16 rej_blobs_max_height = 0;
|
||||
INT32 rej_blobs_max_area = 0;
|
||||
float x_ht_ok_variation;
|
||||
float max_blob_ht;
|
||||
float marginally_above_x_ht;
|
||||
|
||||
BOX blob_box; //blob bounding box
|
||||
float est_x_ht = 0.0; //word estimate
|
||||
float est_caps_ht = 0.0; //word estimate
|
||||
//based on hard data?
|
||||
BOOL8 est_caps_ht_certain = FALSE;
|
||||
BOOL8 est_x_ht_certain = FALSE;//based on hard data?
|
||||
BOOL8 trial = FALSE; //Speculative values?
|
||||
BOOL8 no_comment = FALSE; //No change in xht
|
||||
float ambig_lc_x_est;
|
||||
float ambig_uc_caps_est;
|
||||
INT16 x_ht_ambigs = 0;
|
||||
INT16 caps_ht_ambigs = 0;
|
||||
|
||||
/* Calculate default variation of blob x_ht from bln x_ht for bln word */
|
||||
x_ht_ok_variation =
|
||||
(bln_x_height / x_ht_fraction_of_caps_ht - bln_x_height) * x_ht_variation;
|
||||
|
||||
word_str = word_res->best_choice->string ().string ();
|
||||
/*
|
||||
Cycle blobs, allocating to one of the stats sets when possible.
|
||||
*/
|
||||
blob_it.set_to_list (word_res->outword->blob_list ());
|
||||
for (blob_it.mark_cycle_pt (), i = 0;
|
||||
!blob_it.cycled_list (); blob_it.forward (), i++) {
|
||||
if (!dodgy_blob (blob_it.data ())) {
|
||||
blob_box = blob_it.data ()->bounding_box ();
|
||||
blob_ht_above_baseline = blob_box.top () - bln_baseline_offset;
|
||||
all_blobs_ht.add (blob_ht_above_baseline, 1);
|
||||
|
||||
if (word_res->reject_map[i].rejected ()) {
|
||||
rej_blobs_count++;
|
||||
if (blob_box.height () > rej_blobs_max_height)
|
||||
rej_blobs_max_height = blob_box.height ();
|
||||
if (blob_box.area () > rej_blobs_max_area)
|
||||
rej_blobs_max_area = blob_box.area ();
|
||||
}
|
||||
else {
|
||||
if (STRING (chs_non_ambig_x_ht).contains (word_str[i]))
|
||||
x_ht.add (blob_ht_above_baseline, 1);
|
||||
|
||||
if (STRING (chs_non_ambig_caps_ht).contains (word_str[i]))
|
||||
caps_ht.add (blob_ht_above_baseline, 1);
|
||||
|
||||
if (STRING (chs_ambig_caps_x).contains (word_str[i])) {
|
||||
case_ambig.add (blob_ht_above_baseline, 1);
|
||||
if (STRING (chs_x_ht).contains (word_str[i]))
|
||||
x_ht_ambigs++;
|
||||
else
|
||||
caps_ht_ambigs++;
|
||||
}
|
||||
|
||||
if (STRING (chs_bl_ambig_caps_x).contains (word_str[i])) {
|
||||
if (STRING (chs_x_ht).contains (word_str[i])) {
|
||||
/* confirm x_height provided > 15% total height below baseline */
|
||||
if ((bln_baseline_offset - blob_box.bottom ()) /
|
||||
(float) blob_box.height () > 0.15)
|
||||
x_ht.add (blob_ht_above_baseline, 1);
|
||||
}
|
||||
else {
|
||||
/* confirm caps_height provided < 5% total height below baseline */
|
||||
if ((bln_baseline_offset - blob_box.bottom ()) /
|
||||
(float) blob_box.height () < 0.05)
|
||||
caps_ht.add (blob_ht_above_baseline, 1);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
est_caps_ht = estimate_from_stats (caps_ht);
|
||||
est_x_ht = estimate_from_stats (x_ht);
|
||||
est_ambigs(word_res, case_ambig, &ambig_lc_x_est, &ambig_uc_caps_est);
|
||||
max_blob_ht = all_blobs_ht.ile (0.9999);
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20) {
|
||||
tprintf ("Mode20:A: %s ", word_str);
|
||||
word_res->reject_map.print (debug_fp);
|
||||
tprintf (" XHT:%f CAP:%f MAX:%f AMBIG X:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht, max_blob_ht,
|
||||
ambig_lc_x_est, ambig_uc_caps_est);
|
||||
}
|
||||
#endif
|
||||
if (!x_ht_conservative_ambigs &&
|
||||
(ambig_lc_x_est > 0) &&
|
||||
(ambig_lc_x_est == ambig_uc_caps_est) &&
|
||||
(max_blob_ht > ambig_lc_x_est + x_ht_ok_variation)) {
|
||||
//may be zero but believe xht
|
||||
ambig_uc_caps_est = est_caps_ht;
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:B: Fiddle ambig_uc_caps_est to %f\n",
|
||||
ambig_uc_caps_est);
|
||||
#endif
|
||||
}
|
||||
|
||||
/* Now make some estimates */
|
||||
|
||||
if ((est_x_ht > 0) ||
|
||||
(est_caps_ht > 0) ||
|
||||
((ambig_lc_x_est > 0) && (ambig_lc_x_est != ambig_uc_caps_est))) {
|
||||
/* There is some sensible data to go on so make the most of it. */
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:C: Sensible Data\n");
|
||||
if (est_x_ht > 0) {
|
||||
est_x_ht_certain = TRUE;
|
||||
if (est_caps_ht == 0) {
|
||||
if ((ambig_uc_caps_est > ambig_lc_x_est) &&
|
||||
(ambig_uc_caps_est > est_x_ht + x_ht_ok_variation))
|
||||
est_caps_ht = ambig_uc_caps_est;
|
||||
else
|
||||
est_caps_ht = est_x_ht / x_ht_fraction_of_caps_ht;
|
||||
}
|
||||
if (case_ambig.get_total () > 0)
|
||||
improve_estimate(word_res, est_x_ht, est_caps_ht, x_ht, caps_ht);
|
||||
est_caps_ht_certain = caps_ht.get_total () > 0;
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:D: Est from xht XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
else if (est_caps_ht > 0) {
|
||||
est_caps_ht_certain = TRUE;
|
||||
if ((ambig_lc_x_est > 0) &&
|
||||
(ambig_lc_x_est < est_caps_ht - x_ht_ok_variation))
|
||||
est_x_ht = ambig_lc_x_est;
|
||||
else
|
||||
est_x_ht = est_caps_ht * x_ht_fraction_of_caps_ht;
|
||||
if (ambig_lc_x_est + ambig_uc_caps_est > 0)
|
||||
improve_estimate(word_res, est_x_ht, est_caps_ht, x_ht, caps_ht);
|
||||
est_x_ht_certain = x_ht.get_total () > 0;
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:E: Est from caps XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
else {
|
||||
/* Do something based on case ambig chars alone - we have guessed that the
|
||||
ambigs are lower case. */
|
||||
est_x_ht = ambig_lc_x_est;
|
||||
est_x_ht_certain = TRUE;
|
||||
if (ambig_uc_caps_est > ambig_lc_x_est) {
|
||||
est_caps_ht = ambig_uc_caps_est;
|
||||
est_caps_ht_certain = TRUE;
|
||||
}
|
||||
else
|
||||
est_caps_ht = est_x_ht / x_ht_fraction_of_caps_ht;
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:F: Est from ambigs XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
/* Check for sane interpretation of evidence:
|
||||
Try shifting caps ht if min certain caps ht is not significantly greater
|
||||
than the estimated x ht or the max certain x ht is not significantly less
|
||||
than the estimated caps ht. */
|
||||
if (x_ht_check_est) {
|
||||
if ((caps_ht.get_total () > 0) &&
|
||||
(est_x_ht + x_ht_ok_variation >= caps_ht.ile (0.0001))) {
|
||||
trial = TRUE;
|
||||
est_caps_ht = est_x_ht;
|
||||
est_x_ht = x_ht_fraction_of_caps_ht * est_caps_ht;
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:G: Trial XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
else if ((x_ht.get_total () > 0) &&
|
||||
(est_caps_ht - x_ht_ok_variation <= x_ht.ile (0.9999))) {
|
||||
trial = TRUE;
|
||||
est_x_ht = est_caps_ht;
|
||||
est_caps_ht = est_x_ht / x_ht_fraction_of_caps_ht;
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:H: Trial XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
else {
|
||||
/* There is no sensible data so we're in the dark. */
|
||||
|
||||
marginally_above_x_ht = bln_x_height +
|
||||
x_ht_ok_variation * x_ht_sub_variation;
|
||||
/*
|
||||
If there are no rejects, or the only rejects have a narrow height, or have
|
||||
a small area compared to a normal char, then estimate the x-height as the
|
||||
original one. (I.e. don't fiddle about if the only rejects look like
|
||||
punctuation) - we use max height as mean or median will be too low if
|
||||
there are only two blobs - Eg "F."
|
||||
*/
|
||||
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:I: In the dark\n");
|
||||
|
||||
if ((rej_blobs_count == 0) ||
|
||||
(rej_blobs_max_height < 0.3 * max_blob_ht) ||
|
||||
(rej_blobs_max_area < 0.3 * max_blob_ht * max_blob_ht)) {
|
||||
no_comment = TRUE;
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:J: No comment due to no rejects\n");
|
||||
}
|
||||
else if (x_ht_limit_flip_trials &&
|
||||
((max_blob_ht < marginally_above_x_ht) ||
|
||||
((ambig_lc_x_est > 0) &&
|
||||
(ambig_lc_x_est == ambig_uc_caps_est) &&
|
||||
(ambig_lc_x_est < marginally_above_x_ht)))) {
|
||||
no_comment = TRUE;
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:K: No comment as close to xht %f < %f\n",
|
||||
ambig_lc_x_est, marginally_above_x_ht);
|
||||
}
|
||||
else if (x_ht_conservative_ambigs && (ambig_uc_caps_est > 0)) {
|
||||
trial = TRUE;
|
||||
est_caps_ht = ambig_lc_x_est;
|
||||
est_x_ht = x_ht_fraction_of_caps_ht * est_caps_ht;
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:L: Trial XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
/*
|
||||
If the top of the word is nowhere near where we expect ascenders to be
|
||||
(less than half the x_ht -> caps_ht distance) - suspect an all caps word
|
||||
at the x-ht. Estimate x-ht accordingly - but only as a TRIAL!
|
||||
NOTE we do NOT check location of baseline. Commas can descend as much as
|
||||
real descenders so we would need to do something to make sure that any
|
||||
disqualifying descenders were not at the end.
|
||||
*/
|
||||
else {
|
||||
if (max_blob_ht <
|
||||
(bln_x_height + bln_x_height / x_ht_fraction_of_caps_ht) / 2.0) {
|
||||
trial = TRUE;
|
||||
est_x_ht = x_ht_fraction_of_caps_ht * max_blob_ht;
|
||||
est_caps_ht = max_blob_ht;
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:M: Trial XHT:%f CAP:%f\n",
|
||||
est_x_ht, est_caps_ht);
|
||||
#endif
|
||||
}
|
||||
else {
|
||||
no_comment = TRUE;
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:N: No comment as nothing else matched\n");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Sanity check - reject word if fails */
|
||||
|
||||
if (!no_comment &&
|
||||
((est_x_ht > 2 * bln_x_height) ||
|
||||
(est_x_ht / word_res->denorm.scale () <= min_sane_x_ht_pixels) ||
|
||||
(est_caps_ht <= est_x_ht) || (est_caps_ht >= 2.5 * est_x_ht))) {
|
||||
no_comment = TRUE;
|
||||
if (!trial && rej_use_xht) {
|
||||
if (debug_x_ht_level >= 2) {
|
||||
tprintf ("Sanity check rejecting %s ", word_str);
|
||||
word_res->reject_map.print (debug_fp);
|
||||
tprintf ("\n");
|
||||
}
|
||||
word_res->reject_map.rej_word_xht_fixup ();
|
||||
|
||||
}
|
||||
if (debug_x_ht_level >= 20)
|
||||
tprintf ("Mode20:O: No comment as sanity check failed\n");
|
||||
}
|
||||
|
||||
if (no_comment || trial) {
|
||||
word_res->x_height = bln_x_height / word_res->denorm.scale ();
|
||||
word_res->guessed_x_ht = TRUE;
|
||||
word_res->caps_height = (bln_x_height / x_ht_fraction_of_caps_ht) /
|
||||
word_res->denorm.scale ();
|
||||
word_res->guessed_caps_ht = TRUE;
|
||||
/*
|
||||
Reject ambigs in the current word if we are uncertain and:
|
||||
there are rejects OR
|
||||
there is only one char which is an ambig OR
|
||||
there is conflict between the case of the ambigs even though there is
|
||||
no height separation Eg "Ms" recognised from "MS"
|
||||
*/
|
||||
if (rej_trial_ambigs &&
|
||||
((word_res->reject_map.reject_count () > 0) ||
|
||||
(word_res->reject_map.length () == 1) ||
|
||||
((x_ht_ambigs > 0) && (caps_ht_ambigs > 0)))) {
|
||||
#ifndef SECURE_NAMES
|
||||
if (debug_x_ht_level >= 2) {
|
||||
tprintf ("TRIAL Rej Ambigs %s ", word_str);
|
||||
word_res->reject_map.print (debug_fp);
|
||||
}
|
||||
#endif
|
||||
reject_ambigs(word_res);
|
||||
if (debug_x_ht_level >= 2) {
|
||||
tprintf (" ");
|
||||
word_res->reject_map.print (debug_fp);
|
||||
tprintf ("\n");
|
||||
}
|
||||
}
|
||||
}
|
||||
else {
|
||||
word_res->x_height = est_x_ht / word_res->denorm.scale ();
|
||||
word_res->guessed_x_ht = !est_x_ht_certain;
|
||||
word_res->caps_height = est_caps_ht / word_res->denorm.scale ();
|
||||
word_res->guessed_caps_ht = !est_caps_ht_certain;
|
||||
}
|
||||
|
||||
if (!no_comment && (fabs (est_x_ht - bln_x_height) > x_ht_ok_variation))
|
||||
*trial_x_ht = est_x_ht / word_res->denorm.scale ();
|
||||
else
|
||||
*trial_x_ht = 0.0;
|
||||
|
||||
#ifndef SECURE_NAMES
|
||||
if (((*trial_x_ht > 0) && (debug_x_ht_level >= 3)) ||
|
||||
(debug_x_ht_level >= 5)) {
|
||||
tprintf ("%s ", word_str);
|
||||
word_res->reject_map.print (debug_fp);
|
||||
tprintf
|
||||
(" X:%0.2f Cps:%0.2f Mxht:%0.2f RJ MxHt:%d MxAr:%d Rematch:%c\n",
|
||||
est_x_ht, est_caps_ht, max_blob_ht, rej_blobs_max_height,
|
||||
rej_blobs_max_area, *trial_x_ht > 0 ? '*' : ' ');
|
||||
}
|
||||
#endif
|
||||
|
||||
}
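// Illustrative sketch only, not part of the original file: the tolerance
// band and the final sanity check of re_estimate_x_ht() as free functions.
// x_ht_tolerance and estimates_are_sane are hypothetical names. All heights
// are in baseline normalised units, except where divided by the denorm
// scale to get pixels.

```cpp
// Error band: a fraction of the nominal distance between x-height and
// caps height in a baseline normalised word.
float x_ht_tolerance(float bln_x_height, float x_ht_fraction_of_caps_ht,
                     float x_ht_variation) {
  float nominal_caps_ht = bln_x_height / x_ht_fraction_of_caps_ht;
  return (nominal_caps_ht - bln_x_height) * x_ht_variation;
}

// Mirrors the four conditions of the sanity check: any one of them failing
// makes the pair of estimates unusable.
bool estimates_are_sane(float est_x_ht, float est_caps_ht, float bln_x_height,
                        float denorm_scale, float min_sane_x_ht_pixels) {
  if (est_x_ht > 2 * bln_x_height) return false;           // far too tall
  if (est_x_ht / denorm_scale <= min_sane_x_ht_pixels)     // too few pixels
    return false;
  if (est_caps_ht <= est_x_ht) return false;               // caps below x-ht
  if (est_caps_ht >= 2.5f * est_x_ht) return false;        // caps far above
  return true;
}
```

// With the defaults above (0.7 and 0.35) and a normalised x-height of 128,
// the tolerance comes to roughly 19.2 units.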
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* check_block_occ()
|
||||
* Checks word for coarse block occupancy, rejecting more chars and flipping
|
||||
* case of case ambiguous chars as required.
|
||||
*************************************************************************/
|
||||
|
||||
void check_block_occ(WERD_RES *word_res) {
|
||||
PBLOB_IT blob_it;
|
||||
STRING new_string;
|
||||
REJMAP new_map = word_res->reject_map;
|
||||
WERD_CHOICE *new_choice;
|
||||
|
||||
const char *word_str = word_res->best_choice->string ().string ();
|
||||
INT16 i;
|
||||
INT16 reject_count = 0;
|
||||
char confirmed_char;
|
||||
float x_ht;
|
||||
float caps_ht;
|
||||
|
||||
if (word_res->x_height > 0)
|
||||
x_ht = word_res->x_height * word_res->denorm.scale ();
|
||||
else
|
||||
x_ht = bln_x_height;
|
||||
|
||||
if (word_res->caps_height > 0)
|
||||
caps_ht = word_res->caps_height * word_res->denorm.scale ();
|
||||
else
|
||||
caps_ht = x_ht / x_ht_fraction_of_caps_ht;
|
||||
|
||||
blob_it.set_to_list (word_res->outword->blob_list ());
|
||||
|
||||
for (blob_it.mark_cycle_pt (), i = 0;
|
||||
!blob_it.cycled_list (); blob_it.forward (), i++) {
|
||||
new_string += word_str[i]; //default copy
|
||||
if (word_res->reject_map[i].accepted ()) {
|
||||
confirmed_char = check_blob_occ (word_str[i],
|
||||
blob_it.data ()->bounding_box ().
|
||||
top () - bln_baseline_offset, x_ht,
|
||||
caps_ht);
|
||||
|
||||
if (confirmed_char == '\0') {
|
||||
if (rej_use_check_block_occ) {
|
||||
new_map[i].setrej_xht_fixup ();
|
||||
reject_count++;
|
||||
}
|
||||
}
|
||||
else
|
||||
new_string[i] = confirmed_char;
|
||||
}
|
||||
}
|
||||
if ((reject_count > 0) || (new_string != word_str)) {
|
||||
if (debug_x_ht_level >= 2) {
|
||||
tprintf ("Shape Verification: %s ", word_str);
|
||||
word_res->reject_map.print (debug_fp);
|
||||
tprintf (" -> %s ", new_string.string ());
|
||||
new_map.print (debug_fp);
|
||||
tprintf ("\n");
|
||||
}
|
||||
new_choice = new WERD_CHOICE (new_string.string (),
|
||||
word_res->best_choice->rating (),
|
||||
word_res->best_choice->certainty (),
|
||||
word_res->best_choice->permuter ());
|
||||
delete word_res->best_choice;
|
||||
word_res->best_choice = new_choice;
|
||||
word_res->reject_map = new_map;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* check_blob_occ()
|
||||
*
|
||||
* Checks the blob's height above the baseline against the x-ht and caps ht.
|
||||
* Returns 0 for reject, or (possibly case shifted) confirmed char
|
||||
*************************************************************************/
|
||||
|
||||
char check_blob_occ(char proposed_char,
|
||||
INT16 blob_ht_above_baseline,
|
||||
float x_ht,
|
||||
float caps_ht) {
|
||||
BOOL8 blob_definite_x_ht;
|
||||
BOOL8 blob_definite_caps_ht;
|
||||
float acceptable_variation;
|
||||
|
||||
acceptable_variation = (caps_ht - x_ht) * x_ht_variation;
|
||||
/* ??? REJECT if expected descender and nothing significantly below BL */
|
||||
|
||||
/* ??? REJECT if expected ascender and nothing significantly above x-ht */
|
||||
|
||||
/*
|
||||
IF AMBIG_CAPS_X_CHS
|
||||
IF blob is definitely an ascender ( > xht + xht err ) AND
|
||||
char is an x-ht char
|
||||
THEN
|
||||
flip case
|
||||
IF blob is definitely an x-ht ( <= xht + xht err ) AND
|
||||
char is an ascender char
|
||||
THEN
|
||||
flip case
|
||||
*/
|
||||
blob_definite_x_ht = blob_ht_above_baseline <= x_ht + acceptable_variation;
|
||||
blob_definite_caps_ht = blob_ht_above_baseline >=
|
||||
caps_ht - acceptable_variation;
|
||||
|
||||
if (STRING (chs_ambig_caps_x).contains (proposed_char)) {
|
||||
if ((!blob_definite_x_ht && !blob_definite_caps_ht) ||
|
||||
(proposed_char == '0' && !blob_definite_caps_ht) ||
|
||||
(proposed_char == 'o' && !blob_definite_x_ht))
|
||||
return '\0';
|
||||
|
||||
else if (blob_definite_caps_ht &&
|
||||
STRING (chs_x_ht).contains (proposed_char)) {
|
||||
if (x_ht_case_flip)
|
||||
//flip to upper case
|
||||
return (char) toupper (proposed_char);
|
||||
else
|
||||
return '\0';
|
||||
}
|
||||
|
||||
else if (blob_definite_x_ht &&
|
||||
!STRING (chs_x_ht).contains (proposed_char)) {
|
||||
if (x_ht_case_flip)
|
||||
//flip to lower case
|
||||
return (char) tolower (proposed_char);
|
||||
else
|
||||
return '\0';
|
||||
}
|
||||
}
|
||||
else
|
||||
if ((STRING (chs_non_ambig_x_ht).contains (proposed_char)
|
||||
&& !blob_definite_x_ht)
|
||||
|| (STRING (chs_non_ambig_caps_ht).contains (proposed_char)
|
||||
&& !blob_definite_caps_ht))
|
||||
return '\0';
|
||||
return proposed_char;
|
||||
}
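// Illustrative sketch only, not part of the original file: the two height
// tests at the core of check_blob_occ(), separated from the character set
// lookups. HeightClass and classify_blob_height are hypothetical names.

```cpp
// Result of comparing a blob height with the x-height and caps height.
struct HeightClass {
  bool definite_x_ht;     // no taller than x-height plus tolerance
  bool definite_caps_ht;  // at least caps height minus tolerance
};

HeightClass classify_blob_height(float blob_ht_above_baseline, float x_ht,
                                 float caps_ht, float x_ht_variation) {
  // Tolerance is a fraction of the gap between the two reference heights.
  float acceptable_variation = (caps_ht - x_ht) * x_ht_variation;
  HeightClass result;
  result.definite_x_ht =
      blob_ht_above_baseline <= x_ht + acceptable_variation;
  result.definite_caps_ht =
      blob_ht_above_baseline >= caps_ht - acceptable_variation;
  return result;
}
```

// A blob for which both flags are false lies between the bands; for a case
// ambiguous character the function above rejects it.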
|
||||
|
||||
|
||||
float estimate_from_stats(STATS &stats) {
|
||||
if (stats.get_total () <= 0)
|
||||
return 0.0;
|
||||
else if (stats.get_total () >= 3)
|
||||
return stats.ile (0.5); //median
|
||||
else
|
||||
return stats.mean ();
|
||||
}
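// Illustrative sketch only, not part of the original file: the same
// "median for three or more samples, mean otherwise" rule on a plain
// vector. estimate_from_samples is a hypothetical name, and the median here
// is the middle element after partial sorting, which may differ slightly
// from what STATS::ile (0.5) returns.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

float estimate_from_samples(std::vector<int> samples) {
  if (samples.empty())
    return 0.0f;  // no data: signal "no estimate"
  if (samples.size() >= 3) {
    // Median: place the middle element where a full sort would put it.
    std::vector<int>::iterator mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return static_cast<float>(*mid);
  }
  // One or two samples: the mean is all we can do.
  float sum = std::accumulate(samples.begin(), samples.end(), 0.0f);
  return sum / static_cast<float>(samples.size());
}
```

// The vector is taken by value because nth_element reorders it.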
|
||||
|
||||
|
||||
void improve_estimate(WERD_RES *word_res,
|
||||
float &est_x_ht,
|
||||
float &est_caps_ht,
|
||||
STATS &x_ht,
|
||||
STATS &caps_ht) {
|
||||
PBLOB_IT blob_it;
|
||||
INT16 blob_ht_above_baseline;
|
||||
|
||||
const char *word_str;
|
||||
INT16 i;
|
||||
BOX blob_box; //blob bounding box
|
||||
char confirmed_char;
|
||||
float new_val;
|
||||
|
||||
/* IMPROVE estimates here - if good estimates, and case ambig chars,
|
||||
rescan blobs to fix case ambig blobs, re-estimate hts ??? maybe always do
|
||||
it after deciding x-height
|
||||
*/
|
||||
|
||||
blob_it.set_to_list (word_res->outword->blob_list ());
|
||||
word_str = word_res->best_choice->string ().string ();
|
||||
for (blob_it.mark_cycle_pt (), i = 0;
|
||||
!blob_it.cycled_list (); blob_it.forward (), i++) {
|
||||
if ((STRING (chs_ambig_caps_x).contains (word_str[i])) &&
|
||||
(!dodgy_blob (blob_it.data ()))) {
|
||||
blob_box = blob_it.data ()->bounding_box ();
|
||||
blob_ht_above_baseline = blob_box.top () - bln_baseline_offset;
|
||||
confirmed_char = check_blob_occ (word_str[i],
|
||||
blob_ht_above_baseline,
|
||||
est_x_ht, est_caps_ht);
|
||||
if (confirmed_char != '\0') {
if (STRING (chs_x_ht).contains (confirmed_char))
x_ht.add (blob_ht_above_baseline, 1);
else
caps_ht.add (blob_ht_above_baseline, 1);
}
|
||||
}
|
||||
}
|
||||
new_val = estimate_from_stats (x_ht);
|
||||
if (new_val > 0)
|
||||
est_x_ht = new_val;
|
||||
new_val = estimate_from_stats (caps_ht);
|
||||
if (new_val > 0)
|
||||
est_caps_ht = new_val;
|
||||
}
|
||||
|
||||
|
||||
void reject_ambigs( //rej any accepted xht ambig chars
|
||||
WERD_RES *word) {
|
||||
const char *word_str;
|
||||
int i = 0;
|
||||
|
||||
word_str = word->best_choice->string ().string ();
|
||||
while (*word_str != '\0') {
|
||||
if (STRING (chs_ambig_caps_x).contains (*word_str))
|
||||
word->reject_map[i].setrej_xht_fixup ();
|
||||
word_str++;
|
||||
i++;
|
||||
}
|
||||
}
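// Illustrative sketch only, not part of the original file: the walk in
// reject_ambigs() using standard strings. mark_case_ambigs is a
// hypothetical name, and the returned flags stand in for the calls to
// setrej_xht_fixup () on the reject map. The default character set is the
// value of chs_ambig_caps_x above.

```cpp
#include <string>
#include <vector>

std::vector<bool> mark_case_ambigs(
    const std::string &word,
    const std::string &ambig_chars = "cCmMoO05sSuUvVwWxXzZ") {
  std::vector<bool> rejected(word.size(), false);
  for (std::string::size_type i = 0; i < word.size(); ++i) {
    // A character is flagged if it looks the same in both cases.
    if (ambig_chars.find(word[i]) != std::string::npos)
      rejected[i] = true;
  }
  return rejected;
}
```

// For the word "Moon" this flags 'M', 'o' and 'o' but not 'n'.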
|
||||
|
||||
|
||||
void est_ambigs(                          //xht ambig ht stats
                WERD_RES *word_res,
                STATS &stats,
                float *ambig_lc_x_est,    //xht est
                float *ambig_uc_caps_est  //caps est
               ) {
  float x_ht_ok_variation;
  STATS short_ambigs (0, 300);
  STATS tall_ambigs (0, 300);
  PBLOB_IT blob_it;
  BOX blob_box;                  //blob bounding box
  INT16 blob_ht_above_baseline;

  const char *word_str;
  INT16 i;
  float min;                     //min ambig ch ht
  float max;                     //max ambig ch ht
  float short_limit;             // for lower case
  float tall_limit;              // for upper case

  x_ht_ok_variation =
    (bln_x_height / x_ht_fraction_of_caps_ht - bln_x_height) * x_ht_variation;

  if (stats.get_total () == 0) {
    *ambig_lc_x_est = 0;
    *ambig_uc_caps_est = 0;
  }
  else {
    min = stats.ile (0.0);
    max = stats.ile (0.99999);
    if ((max - min) < x_ht_ok_variation) {
      *ambig_lc_x_est = *ambig_uc_caps_est = stats.mean ();
      //close enough
    }
    else {
      /* Try reclustering into lower and upper case chars */
      short_limit = min + (max - min) * x_ht_variation;
      tall_limit = max - (max - min) * x_ht_variation;
      word_str = word_res->best_choice->string ().string ();
      blob_it.set_to_list (word_res->outword->blob_list ());
      for (blob_it.mark_cycle_pt (), i = 0;
           !blob_it.cycled_list (); blob_it.forward (), i++) {
        if (word_res->reject_map[i].accepted () &&
            STRING (chs_ambig_caps_x).contains (word_str[i]) &&
            (!dodgy_blob (blob_it.data ()))) {
          blob_box = blob_it.data ()->bounding_box ();
          blob_ht_above_baseline =
            blob_box.top () - bln_baseline_offset;
          if (blob_ht_above_baseline <= short_limit)
            short_ambigs.add (blob_ht_above_baseline, 1);
          else if (blob_ht_above_baseline >= tall_limit)
            tall_ambigs.add (blob_ht_above_baseline, 1);
        }
      }
      *ambig_lc_x_est = short_ambigs.mean ();
      *ambig_uc_caps_est = tall_ambigs.mean ();
      /* Cop out if we haven't got sensible clusters. */
      if (*ambig_uc_caps_est - *ambig_lc_x_est <= x_ht_ok_variation)
        *ambig_lc_x_est = *ambig_uc_caps_est = stats.mean ();
      //close enough
    }
  }
}
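
The reclustering step in est_ambigs() can be tried in isolation. The following is a minimal sketch, not the Tesseract implementation: it works on a plain vector of heights instead of STATS and blob lists, takes the exact min and max rather than the ile() percentiles, and the names AmbigEstimates, mean_of and split_ambig_heights are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the two estimates est_ambigs() writes back.
struct AmbigEstimates {
  float lc_x_est;     // estimate for the short (x-height) cluster
  float uc_caps_est;  // estimate for the tall (caps-height) cluster
};

// Mean of a list of heights; 0 for an empty list.
static float mean_of(const std::vector<int>& v) {
  if (v.empty()) return 0.0f;
  return std::accumulate(v.begin(), v.end(), 0.0f) / v.size();
}

// heights:      heights above the baseline of the ambiguous characters.
// ok_variation: spread below which everything counts as one cluster.
// variation:    fraction of the min..max range used for each cluster band.
AmbigEstimates split_ambig_heights(const std::vector<int>& heights,
                                   float ok_variation, float variation) {
  assert(ok_variation >= 0.0f && variation >= 0.0f);
  if (heights.empty()) return {0.0f, 0.0f};
  const auto mm = std::minmax_element(heights.begin(), heights.end());
  const float lo = *mm.first;
  const float hi = *mm.second;
  const float all_mean = mean_of(heights);
  if (hi - lo < ok_variation) return {all_mean, all_mean};  // close enough

  // Heights near the bottom of the range go to the short cluster, heights
  // near the top go to the tall cluster, anything in between is ignored.
  const float short_limit = lo + (hi - lo) * variation;
  const float tall_limit = hi - (hi - lo) * variation;
  std::vector<int> shorts, talls;
  for (int h : heights) {
    if (h <= short_limit) shorts.push_back(h);
    else if (h >= tall_limit) talls.push_back(h);
  }
  AmbigEstimates est{mean_of(shorts), mean_of(talls)};
  // Fall back to the overall mean if the clusters are not well separated.
  if (est.uc_caps_est - est.lc_x_est <= ok_variation) est = {all_mean, all_mean};
  return est;
}
```

With heights {40, 41, 42, 60, 61, 62}, ok_variation 10 and variation 0.35, the short cluster averages 41 and the tall cluster 61, so the two estimates stay distinct.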
|
||||
|
||||
|
||||
/*************************************************************************
 * dodgy_blob()
 * Returns true if the blob has more than one outline, one above the other.
 * These are dodgy as the top blob could be noise, causing the bounding box xht
 * to be misleading
 *************************************************************************/

BOOL8 dodgy_blob(PBLOB *blob) {
  OUTLINE_IT outline_it = blob->out_list ();
  INT16 highest_bottom = -MAX_INT16;
  INT16 lowest_top = MAX_INT16;
  BOX outline_box;

  if (x_ht_include_dodgy_blobs)
    return FALSE;                //no blob is ever dodgy
  for (outline_it.mark_cycle_pt ();
       !outline_it.cycled_list (); outline_it.forward ()) {
    outline_box = outline_it.data ()->bounding_box ();
    if (lowest_top > outline_box.top ())
      lowest_top = outline_box.top ();
    if (highest_bottom < outline_box.bottom ())
      highest_bottom = outline_box.bottom ();
  }
  return highest_bottom >= lowest_top;
}
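
The stacked-outline test in dodgy_blob() reduces to comparing the highest bottom edge with the lowest top edge. A minimal sketch of that comparison, assuming a hypothetical OutlineBox type in place of BOX and ignoring the x_ht_include_dodgy_blobs switch:

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical minimal box: only the vertical extent matters here.
struct OutlineBox {
  int bottom;
  int top;
};

// True when some outline lies entirely above another one, i.e. the highest
// bottom edge is at or above the lowest top edge. Assumes bottom <= top.
bool has_stacked_outlines(const std::vector<OutlineBox>& outlines) {
  int highest_bottom = INT_MIN;
  int lowest_top = INT_MAX;
  for (const OutlineBox& box : outlines) {
    assert(box.bottom <= box.top);
    if (box.top < lowest_top) lowest_top = box.top;
    if (box.bottom > highest_bottom) highest_bottom = box.bottom;
  }
  return highest_bottom >= lowest_top;
}
```

A single outline, or outlines whose vertical extents overlap, give false; an accent-like outline sitting wholly above the body gives true.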
|
92
ccmain/fixxht.h
Normal file
@ -0,0 +1,92 @@
|
||||
/**********************************************************************
|
||||
* File: fixxht.h (Formerly fixxht.h)
|
||||
* Description: Improve x_ht and look out for case inconsistencies
|
||||
* Author: Phil Cheatle
|
||||
* Created: Thu Aug 5 14:11:08 BST 1993
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef FIXXHT_H
|
||||
#define FIXXHT_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "statistc.h"
|
||||
#include "pageres.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern double_VAR_H (x_ht_fraction_of_caps_ht, 0.7,
|
||||
"Fract of cps ht est of xht");
|
||||
extern double_VAR_H (x_ht_variation, 0.35,
|
||||
"Err band as fract of caps/xht dist");
|
||||
extern double_VAR_H (x_ht_sub_variation, 0.5,
|
||||
"Err band as fract of caps/xht dist");
|
||||
extern BOOL_VAR_H (rej_trial_ambigs, TRUE,
|
||||
"reject x-ht ambigs when under trial");
|
||||
extern BOOL_VAR_H (x_ht_conservative_ambigs, FALSE,
|
||||
"Dont rely on ambigs + maxht");
|
||||
extern BOOL_VAR_H (x_ht_check_est, TRUE, "Cross check estimates");
|
||||
extern BOOL_VAR_H (x_ht_case_flip, FALSE, "Flip or reject suspect case");
|
||||
extern BOOL_VAR_H (x_ht_include_dodgy_blobs, TRUE,
|
||||
"Include blobs with possible noise?");
|
||||
extern BOOL_VAR_H (x_ht_limit_flip_trials, TRUE,
|
||||
"Dont do trial flips when ambigs are close to xht?");
|
||||
extern BOOL_VAR_H (rej_use_check_block_occ, TRUE,
|
||||
"Analyse rejection behaviour");
|
||||
extern STRING_VAR_H (chs_non_ambig_caps_ht,
|
||||
"!#$%&()/12346789?ABDEFGHIKLNQRT[]\\bdfhkl",
|
||||
"Reliable ascenders");
|
||||
extern STRING_VAR_H (chs_x_ht, "acegmnopqrsuvwxyz", "X height chars");
|
||||
extern STRING_VAR_H (chs_non_ambig_x_ht, "aenqr", "reliable X height chars");
|
||||
extern STRING_VAR_H (chs_ambig_caps_x, "cCmMoO05sSuUvVwWxXzZ",
|
||||
"X ht or caps ht chars");
|
||||
extern STRING_VAR_H (chs_bl_ambig_caps_x, "pPyY",
|
||||
" Caps or descender ambigs");
|
||||
extern STRING_VAR_H (chs_caps_ht,
|
||||
"!#$%&()/0123456789?ABCDEFGHIJKLMNOPQRSTUVWXYZ[]\\bdfhkl{|}",
|
||||
"Ascender chars");
|
||||
extern STRING_VAR_H (chs_desc, "gjpqy", "Descender chars");
|
||||
extern STRING_VAR_H (chs_non_ambig_bl,
|
||||
"!#$%&01246789?ABCDEFGHIKLMNORSTUVWXYZabcdehiklmnorstuvwxz",
|
||||
"Reliable baseline chars");
|
||||
extern STRING_VAR_H (chs_odd_top, "ijt", "Chars with funny ascender region");
|
||||
extern STRING_VAR_H (chs_odd_bot, "()35JQ[]\\/{}|", "Chars with funny base");
|
||||
extern STRING_VAR_H (chs_bl,
|
||||
"!#$%&()/01246789?ABCDEFGHIJKLMNOPRSTUVWXYZ[]\\abcdefhiklmnorstuvwxz{}",
|
||||
"Baseline chars");
|
||||
extern STRING_VAR_H (chs_non_ambig_desc, "gq", "Reliable descender chars");
|
||||
void re_estimate_x_ht( //improve for 1 word
|
||||
WERD_RES *word_res, //word to do
|
||||
float *trial_x_ht //new match value
|
||||
);
|
||||
void check_block_occ(WERD_RES *word_res);
|
||||
char check_blob_occ(char proposed_char,
|
||||
INT16 blob_ht_above_baseline,
|
||||
float x_ht,
|
||||
float caps_ht);
|
||||
float estimate_from_stats(STATS &stats);
|
||||
void improve_estimate(WERD_RES *word_res,
|
||||
float &est_x_ht,
|
||||
float &est_caps_ht,
|
||||
STATS &x_ht,
|
||||
STATS &caps_ht);
|
||||
void reject_ambigs( //rej any accepted xht ambig chars
|
||||
WERD_RES *word);
|
||||
//xht ambig ht stats
|
||||
void est_ambigs(WERD_RES *word_res,
|
||||
STATS &stats,
|
||||
float *ambig_lc_x_est, //xht est
|
||||
float *ambig_uc_caps_est //caps est
|
||||
);
|
||||
BOOL8 dodgy_blob(PBLOB *blob);
|
||||
#endif
|
154
ccmain/imgscale.cpp
Normal file
@ -0,0 +1,154 @@
|
||||
/**********************************************************************
|
||||
* File: imgscale.cpp (Formerly dyn_prog.c)
|
||||
* Description: Dynamic programming for smart scaling of images.
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 18 16:12:03 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
/*************************************************************************
|
||||
 * This is really Sheelagh's code that I've hacked into a more usable form.
 * It is used by scaleim.c. All I did to it was to change "factor" from int to
 * float.
|
||||
*************************************************************************/
|
||||
/************************************************************************
|
||||
* This version uses the result of the previous row to influence the
|
||||
* current row's calculation.
|
||||
************************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include "errcode.h"
|
||||
|
||||
/* Macro arguments are parenthesised so that expression arguments expand safely. */
#define f(xc, yc) (((xc) - factor*(yc))*((xc) - factor*(yc)))

#define g(oldyc, yc, oldxc, xc) (factor*factor*((oldyc) - (yc))*((oldyc) - (yc))/(abs((oldxc) - (xc)) + 1))
|
||||
|
||||
void
|
||||
dyn_exit (const char s[]) {
|
||||
fprintf (stderr, "%s", s);
|
||||
err_exit();
|
||||
}
|
||||
|
||||
|
||||
void dyn_prog( //The clever bit
|
||||
int n,
|
||||
int *x,
|
||||
int *y,
|
||||
int ymax,
|
||||
int *oldx,
|
||||
int *oldy,
|
||||
int oldn,
|
||||
float factor) {
|
||||
int i, z, j, matchflag;
|
||||
int **ymin;
|
||||
float **F, fz;
|
||||
|
||||
/* F[i][z] gives minimum over y <= z */
|
||||
|
||||
F = (float **) calloc (n, sizeof (float *));
|
||||
ymin = (int **) calloc (n, sizeof (int *));
|
||||
if ((F == NULL) || (ymin == NULL))
|
||||
dyn_exit ("Error in calloc\n");
|
||||
|
||||
for (i = 0; i < n; i++) {
|
||||
F[i] = (float *) calloc (ymax - n + i + 1, sizeof (float));
|
||||
ymin[i] = (int *) calloc (ymax - n + i + 1, sizeof (int));
|
||||
if ((F[i] == NULL) || (ymin[i] == NULL))
|
||||
dyn_exit ("Error in calloc\n");
|
||||
}
|
||||
|
||||
F[0][0] = f (x[0], 0);
|
||||
/* find nearest transition of same sign (white to black) */
|
||||
j = 0;
|
||||
while ((j < oldn) && (oldx[j] < x[0]))
|
||||
j += 2;
|
||||
if (j >= oldn)
|
||||
j -= 2;
|
||||
else if ((j - 2 >= 0) && ((x[0] - oldx[j - 2]) < (oldx[j] - x[0])))
|
||||
j -= 2;
|
||||
if (abs (oldx[j] - x[0]) < factor) {
|
||||
matchflag = 1;
|
||||
F[0][0] += g (oldy[j], 0, oldx[j], x[0]);
|
||||
}
|
||||
else
|
||||
matchflag = 0;
|
||||
ymin[0][0] = 0;
|
||||
|
||||
for (z = 1; z < ymax - n + 1; z++) {
|
||||
fz = f (x[0], z);
|
||||
/* add penalty for deviating from previous row if necessary */
|
||||
|
||||
if (matchflag)
|
||||
fz += g (oldy[j], z, oldx[j], x[0]);
|
||||
if (fz < F[0][z - 1]) {
|
||||
F[0][z] = fz;
|
||||
ymin[0][z] = z;
|
||||
}
|
||||
else {
|
||||
F[0][z] = F[0][z - 1];
|
||||
ymin[0][z] = ymin[0][z - 1];
|
||||
}
|
||||
}
|
||||
|
||||
for (i = 1; i < n; i++) {
|
||||
F[i][i] = f (x[i], i) + F[i - 1][i - 1];
|
||||
/* add penalty for deviating from previous row if necessary */
|
||||
if (j > 0)
|
||||
j--;
|
||||
else
|
||||
j++;
|
||||
while ((j < oldn) && (oldx[j] < x[i]))
|
||||
j += 2;
|
||||
if (j >= oldn)
|
||||
j -= 2;
|
||||
else if ((j - 2 >= 0) && ((x[i] - oldx[j - 2]) < (oldx[j] - x[i])))
|
||||
j -= 2;
|
||||
if (abs (oldx[j] - x[i]) < factor) {
|
||||
matchflag = 1;
|
||||
F[i][i] += g (oldy[j], i, oldx[j], x[i]);
|
||||
}
|
||||
else
|
||||
matchflag = 0;
|
||||
ymin[i][i] = i;
|
||||
for (z = i + 1; z < ymax - n + i + 1; z++) {
|
||||
fz = f (x[i], z) + F[i - 1][z - 1];
|
||||
/* add penalty for deviating from previous row if necessary */
|
||||
if (matchflag)
|
||||
fz += g (oldy[j], z, oldx[j], x[i]);
|
||||
if (fz < F[i][z - 1]) {
|
||||
F[i][z] = fz;
|
||||
ymin[i][z] = z;
|
||||
}
|
||||
else {
|
||||
F[i][z] = F[i][z - 1];
|
||||
ymin[i][z] = ymin[i][z - 1];
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
y[n - 1] = ymin[n - 1][ymax - 1];
|
||||
for (i = n - 2; i >= 0; i--)
|
||||
y[i] = ymin[i][y[i + 1] - 1];
|
||||
|
||||
for (i = 0; i < n; i++) {
|
||||
free (F[i]);
|
||||
free (ymin[i]);
|
||||
}
|
||||
free(F);
|
||||
free(ymin);
|
||||
|
||||
return;
|
||||
}
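
The core recurrence of dyn_prog() can be seen more easily without the previous-row penalty g(). The sketch below is a simplified illustration under that assumption, not the routine above: it uses std::vector instead of calloc'd tables, and the name scale_transitions is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Assign each source transition x[i] a strictly increasing target column
// y[i] in [0, ymax), minimising the sum of (x[i] - factor * y[i])^2.
// Requires 1 <= x.size() <= ymax. The previous-row penalty is omitted.
std::vector<int> scale_transitions(const std::vector<int>& x, int ymax,
                                   float factor) {
  const int n = static_cast<int>(x.size());
  assert(n >= 1 && n <= ymax);
  // best[i][z]: minimum cost of placing x[0..i] with y[i] <= z.
  // arg[i][z]:  the y[i] that achieves best[i][z].
  std::vector<std::vector<float>> best(n, std::vector<float>(ymax, 0.0f));
  std::vector<std::vector<int>> arg(n, std::vector<int>(ymax, 0));
  auto cost = [factor](int xc, int yc) {
    const float d = xc - factor * yc;
    return d * d;
  };
  for (int i = 0; i < n; ++i) {
    // y[i] must leave room for i earlier and n-1-i later transitions.
    for (int z = i; z <= ymax - n + i; ++z) {
      float c = cost(x[i], z);
      if (i > 0) c += best[i - 1][z - 1];
      if (z == i || c < best[i][z - 1]) {
        best[i][z] = c;
        arg[i][z] = z;
      } else {
        best[i][z] = best[i][z - 1];
        arg[i][z] = arg[i][z - 1];
      }
    }
  }
  // Trace back from the last transition to recover every y[i].
  std::vector<int> y(n);
  y[n - 1] = arg[n - 1][ymax - 1];
  for (int i = n - 2; i >= 0; --i) y[i] = arg[i][y[i + 1] - 1];
  return y;
}
```

For x = {0, 4, 8}, ymax = 5 and factor = 2 the zero-cost placement {0, 2, 4} is recovered. The real routine adds g() to the cost whenever a transition of the same sign in the previous row lies within factor pixels, which biases neighbouring rows towards the same columns.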
|
32
ccmain/imgscale.h
Normal file
@ -0,0 +1,32 @@
|
||||
/**********************************************************************
|
||||
* File: imgscale.h (Formerly dyn_prog.h)
|
||||
* Description: Dynamic programming for smart scaling of images.
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 18 16:12:03 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef IMGSCALE_H
|
||||
#define IMGSCALE_H
|
||||
|
||||
void dyn_prog( //The clever bit
|
||||
int n,
|
||||
int *x,
|
||||
int *y,
|
||||
int ymax,
|
||||
int *oldx,
|
||||
int *oldy,
|
||||
int oldn,
|
||||
float factor);
|
||||
#endif
|
404
ccmain/matmatch.cpp
Normal file
@ -0,0 +1,404 @@
|
||||
/**********************************************************************
|
||||
* File: matmatch.cpp (Formerly matrix_match.c)
|
||||
* Description: matrix matching routines for Tessedit
|
||||
* Author: Chris Newton
|
||||
* Created: Wed Nov 24 15:57:41 GMT 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdlib.h>
|
||||
#include <math.h>
|
||||
#include <string.h>
|
||||
#include <ctype.h>
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#endif
|
||||
#include "tessvars.h"
|
||||
#include "stderr.h"
|
||||
#include "img.h"
|
||||
#include "evnts.h"
|
||||
#include "showim.h"
|
||||
#include "hosthplb.h"
|
||||
#include "grphics.h"
|
||||
#include "adaptions.h"
|
||||
#include "matmatch.h"
|
||||
#include "secname.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN BOOL_VAR (tessedit_display_mm, FALSE, "Display matrix matches");
|
||||
EXTERN BOOL_VAR (tessedit_mm_debug, FALSE,
|
||||
"Print debug information for matrix matcher");
|
||||
EXTERN INT_VAR (tessedit_mm_prototype_min_size, 3,
|
||||
"Smallest number of samples in a cluster for a prototype to be used");
|
||||
|
||||
// Colours for displaying the match
|
||||
#define BB_COLOUR 0
|
||||
#define BW_COLOUR 1
|
||||
#define WB_COLOUR 3
|
||||
#define UB_COLOUR 5
|
||||
#define BU_COLOUR 7
|
||||
#define UU_COLOUR 9
|
||||
#define WU_COLOUR 11
|
||||
#define UW_COLOUR 13
|
||||
#define WW_COLOUR 15
|
||||
|
||||
#define BINIM_BLACK 0
|
||||
#define BINIM_WHITE 1
|
||||
|
||||
float matrix_match(  // returns match score; 0.0 is a perfect match
|
||||
IMAGE *image1,
|
||||
IMAGE *image2) {
|
||||
ASSERT_HOST (image1->get_bpp () == 1 && image2->get_bpp () == 1);
|
||||
|
||||
if (image1->get_xsize () >= image2->get_xsize ())
|
||||
return match1 (image1, image2);
|
||||
else
|
||||
return match1 (image2, image1);
|
||||
}
|
||||
|
||||
|
||||
float match1(  /* returns match score in [0, 2]; 0.0 is a perfect match */
|
||||
IMAGE *image_w,
|
||||
IMAGE *image_n) {
|
||||
INT32 x_offset;
|
||||
INT32 y_offset;
|
||||
INT32 x_size = image_w->get_xsize ();
|
||||
INT32 y_size;
|
||||
INT32 x_size2 = image_n->get_xsize ();
|
||||
INT32 y_size2;
|
||||
IMAGE match_image;
|
||||
IMAGELINE imline_w;
|
||||
IMAGELINE imline_n;
|
||||
IMAGELINE match_imline;
|
||||
INT32 x;
|
||||
INT32 y;
|
||||
float sum = 0.0;
|
||||
|
||||
x_offset = (image_w->get_xsize () - image_n->get_xsize ()) / 2;
|
||||
|
||||
ASSERT_HOST (x_offset >= 0);
|
||||
match_imline.init (x_size);
|
||||
|
||||
sum = 0;
|
||||
|
||||
if (image_w->get_ysize () < image_n->get_ysize ()) {
|
||||
y_size = image_n->get_ysize ();
|
||||
y_size2 = image_w->get_ysize ();
|
||||
y_offset = (y_size - y_size2) / 2;
|
||||
|
||||
if (tessedit_display_mm && !tessedit_mm_use_prototypes)
|
||||
tprintf ("I1 (%d, %d), I2 (%d, %d), MI (%d, %d)\n", x_size,
|
||||
image_w->get_ysize (), x_size2, image_n->get_ysize (),
|
||||
x_size, y_size);
|
||||
|
||||
match_image.create (x_size, y_size, 4);
|
||||
|
||||
for (y = 0; y < y_offset; y++) {
|
||||
image_n->fast_get_line (0, y, x_size2, &imline_n);
|
||||
for (x = 0; x < x_size2; x++) {
|
||||
if (imline_n.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1;
|
||||
match_imline.pixels[x] = UB_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = UW_COLOUR;
|
||||
}
|
||||
}
|
||||
match_image.fast_put_line (x_offset, y, x_size2, &match_imline);
|
||||
}
|
||||
|
||||
for (y = y_offset + y_size2; y < y_size; y++) {
|
||||
image_n->fast_get_line (0, y, x_size2, &imline_n);
|
||||
for (x = 0; x < x_size2; x++) {
|
||||
if (imline_n.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1.0;
|
||||
match_imline.pixels[x] = UB_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = UW_COLOUR;
|
||||
}
|
||||
}
|
||||
match_image.fast_put_line (x_offset, y, x_size2, &match_imline);
|
||||
}
|
||||
|
||||
for (y = y_offset; y < y_offset + y_size2; y++) {
|
||||
image_w->fast_get_line (0, y - y_offset, x_size, &imline_w);
|
||||
image_n->fast_get_line (0, y, x_size2, &imline_n);
|
||||
for (x = 0; x < x_offset; x++) {
|
||||
if (imline_w.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1.0;
|
||||
match_imline.pixels[x] = BU_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = WU_COLOUR;
|
||||
}
|
||||
}
|
||||
|
||||
for (x = x_offset + x_size2; x < x_size; x++) {
|
||||
if (imline_w.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1.0;
|
||||
match_imline.pixels[x] = BU_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = WU_COLOUR;
|
||||
}
|
||||
}
|
||||
|
||||
for (x = x_offset; x < x_offset + x_size2; x++) {
|
||||
if (imline_n.pixels[x - x_offset] == imline_w.pixels[x]) {
|
||||
sum += 1.0;
|
||||
if (imline_w.pixels[x] == BINIM_BLACK)
|
||||
match_imline.pixels[x] = BB_COLOUR;
|
||||
else
|
||||
match_imline.pixels[x] = WW_COLOUR;
|
||||
}
|
||||
else {
|
||||
sum += -1.0;
|
||||
if (imline_w.pixels[x] == BINIM_BLACK)
|
||||
match_imline.pixels[x] = BW_COLOUR;
|
||||
else
|
||||
match_imline.pixels[x] = WB_COLOUR;
|
||||
}
|
||||
}
|
||||
|
||||
match_image.fast_put_line (0, y, x_size, &match_imline);
|
||||
}
|
||||
}
|
||||
else {
|
||||
y_size = image_w->get_ysize ();
|
||||
y_size2 = image_n->get_ysize ();
|
||||
y_offset = (y_size - y_size2) / 2;
|
||||
|
||||
if (tessedit_display_mm && !tessedit_mm_use_prototypes)
|
||||
tprintf ("I1 (%d, %d), I2 (%d, %d), MI (%d, %d)\n", x_size,
|
||||
image_w->get_ysize (), x_size2, image_n->get_ysize (),
|
||||
x_size, y_size);
|
||||
|
||||
match_image.create (x_size, y_size, 4);
|
||||
|
||||
for (y = 0; y < y_offset; y++) {
|
||||
image_w->fast_get_line (0, y, x_size, &imline_w);
|
||||
for (x = 0; x < x_size; x++) {
|
||||
if (imline_w.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1;
|
||||
match_imline.pixels[x] = BU_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = WU_COLOUR;
|
||||
}
|
||||
}
|
||||
match_image.fast_put_line (0, y, x_size, &match_imline);
|
||||
}
|
||||
|
||||
for (y = y_offset + y_size2; y < y_size; y++) {
|
||||
image_w->fast_get_line (0, y, x_size, &imline_w);
|
||||
for (x = 0; x < x_size; x++) {
|
||||
if (imline_w.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1;
|
||||
match_imline.pixels[x] = BU_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = WU_COLOUR;
|
||||
}
|
||||
}
|
||||
match_image.fast_put_line (0, y, x_size, &match_imline);
|
||||
}
|
||||
|
||||
for (y = y_offset; y < y_offset + y_size2; y++) {
|
||||
image_w->fast_get_line (0, y, x_size, &imline_w);
|
||||
image_n->fast_get_line (0, y - y_offset, x_size2, &imline_n);
|
||||
for (x = 0; x < x_offset; x++) {
|
||||
if (imline_w.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1.0;
|
||||
match_imline.pixels[x] = BU_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = WU_COLOUR;
|
||||
}
|
||||
}
|
||||
|
||||
for (x = x_offset + x_size2; x < x_size; x++) {
|
||||
if (imline_w.pixels[x] == BINIM_BLACK) {
|
||||
sum += -1.0;
|
||||
match_imline.pixels[x] = BU_COLOUR;
|
||||
}
|
||||
else {
|
||||
match_imline.pixels[x] = WU_COLOUR;
|
||||
}
|
||||
}
|
||||
|
||||
for (x = x_offset; x < x_offset + x_size2; x++) {
|
||||
if (imline_n.pixels[x - x_offset] == imline_w.pixels[x]) {
|
||||
sum += 1.0;
|
||||
if (imline_w.pixels[x] == BINIM_BLACK)
|
||||
match_imline.pixels[x] = BB_COLOUR;
|
||||
else
|
||||
match_imline.pixels[x] = WW_COLOUR;
|
||||
}
|
||||
else {
|
||||
sum += -1.0;
|
||||
if (imline_w.pixels[x] == BINIM_BLACK)
|
||||
match_imline.pixels[x] = BW_COLOUR;
|
||||
else
|
||||
match_imline.pixels[x] = WB_COLOUR;
|
||||
}
|
||||
}
|
||||
|
||||
match_image.fast_put_line (0, y, x_size, &match_imline);
|
||||
}
|
||||
}
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
if (tessedit_display_mm && !tessedit_mm_use_prototypes) {
|
||||
tprintf ("Match score %f\n", 1.0 - sum / (x_size * y_size));
|
||||
display_images(image_w, image_n, &match_image);
|
||||
}
|
||||
#endif
|
||||
|
||||
if (tessedit_mm_debug)
|
||||
tprintf ("Match score %f\n", 1.0 - sum / (x_size * y_size));
|
||||
|
||||
return (1.0 - sum / (x_size * y_size));
|
||||
}
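
The score returned by match1() is easiest to follow when both images have the same size, so that no pixel falls outside the overlap. A minimal sketch of that special case, assuming a hypothetical BinaryImage type (rows of 0/1 values) in place of IMAGE and IMAGELINE:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 0 = black, 1 = white, mirroring BINIM_BLACK / BINIM_WHITE.
using BinaryImage = std::vector<std::vector<int>>;

// Score two equal-sized, non-empty binary images: +1 per agreeing pixel,
// -1 per disagreeing pixel, then 1 - sum / area.
// 0 means identical images, 2 means every pixel differs.
float match_score_same_size(const BinaryImage& a, const BinaryImage& b) {
  assert(!a.empty() && a.size() == b.size());
  float sum = 0.0f;
  std::size_t area = 0;
  for (std::size_t y = 0; y < a.size(); ++y) {
    assert(a[y].size() == b[y].size());
    for (std::size_t x = 0; x < a[y].size(); ++x) {
      sum += (a[y][x] == b[y][x]) ? 1.0f : -1.0f;
      ++area;
    }
  }
  assert(area > 0);
  return 1.0f - sum / static_cast<float>(area);
}
```

When the sizes differ, match1() centres the smaller image in the larger one and additionally subtracts 1 for every black pixel that lies outside the overlap, while white pixels outside the overlap contribute nothing.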
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* display_images()
|
||||
*
|
||||
* Show a pair of images, plus the match image
|
||||
*
|
||||
*************************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void display_images(IMAGE *image_w, IMAGE *image_n, IMAGE *match_image) {
|
||||
WINDOW w_im_window;
|
||||
WINDOW n_im_window;
|
||||
WINDOW match_window;
|
||||
GRAPHICS_EVENT event; //output event
|
||||
INT16 i;
|
||||
|
||||
// xmin xmax ymin ymax
|
||||
w_im_window = create_window ("Image 1", SCROLLINGWIN, 20, 100, 10 * image_w->get_xsize (), 10 * image_w->get_ysize (), 0, image_w->get_xsize (), 0, image_w->get_ysize (),
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
|
||||
clear_view_surface(w_im_window);
|
||||
show_sub_image (image_w,
|
||||
0, 0,
|
||||
image_w->get_xsize (), image_w->get_ysize (),
|
||||
w_im_window, 0, 0);
|
||||
|
||||
line_color_index(w_im_window, RED);
|
||||
for (i = 1; i < image_w->get_xsize (); i++) {
|
||||
move2d (w_im_window, i, 0);
|
||||
draw2d (w_im_window, i, image_w->get_ysize ());
|
||||
}
|
||||
for (i = 1; i < image_w->get_ysize (); i++) {
|
||||
move2d (w_im_window, 0, i);
|
||||
draw2d (w_im_window, image_w->get_xsize (), i);
|
||||
}
|
||||
|
||||
// xmin xmax ymin ymax
|
||||
n_im_window = create_window ("Image 2", SCROLLINGWIN, 240, 100, 10 * image_n->get_xsize (), 10 * image_n->get_ysize (), 0, image_n->get_xsize (), 0, image_n->get_ysize (),
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
|
||||
clear_view_surface(n_im_window);
|
||||
show_sub_image (image_n,
|
||||
0, 0,
|
||||
image_n->get_xsize (), image_n->get_ysize (),
|
||||
n_im_window, 0, 0);
|
||||
|
||||
line_color_index(n_im_window, RED);
|
||||
for (i = 1; i < image_n->get_xsize (); i++) {
|
||||
move2d (n_im_window, i, 0);
|
||||
draw2d (n_im_window, i, image_n->get_ysize ());
|
||||
}
|
||||
for (i = 1; i < image_n->get_ysize (); i++) {
|
||||
move2d (n_im_window, 0, i);
|
||||
draw2d (n_im_window, image_n->get_xsize (), i);
|
||||
}
|
||||
overlap_picture_ops(TRUE);
|
||||
|
||||
// xmin xmax ymin ymax
|
||||
match_window = create_window ("Match Result", SCROLLINGWIN, 460, 100, 10 * match_image->get_xsize (), 10 * match_image->get_ysize (), 0, match_image->get_xsize (), 0, match_image->get_ysize (),
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
|
||||
clear_view_surface(match_window);
|
||||
show_sub_image (match_image,
|
||||
0, 0,
|
||||
match_image->get_xsize (), match_image->get_ysize (),
|
||||
match_window, 0, 0);
|
||||
|
||||
line_color_index(match_window, RED);
|
||||
for (i = 1; i < match_image->get_xsize (); i++) {
|
||||
move2d (match_window, i, 0);
|
||||
draw2d (match_window, i, match_image->get_ysize ());
|
||||
}
|
||||
for (i = 1; i < match_image->get_ysize (); i++) {
|
||||
move2d (match_window, 0, i);
|
||||
draw2d (match_window, match_image->get_xsize (), i);
|
||||
}
|
||||
overlap_picture_ops(TRUE);
|
||||
|
||||
await_event(match_window, TRUE, ANY_EVENT, &event);
|
||||
destroy_window(w_im_window);
|
||||
destroy_window(n_im_window);
|
||||
destroy_window(match_window);
|
||||
}
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* display_image()
|
||||
*
|
||||
* Show a single image
|
||||
*
|
||||
*************************************************************************/
|
||||
|
||||
WINDOW display_image(IMAGE *image,
|
||||
const char *title,
|
||||
INT32 x,
|
||||
INT32 y,
|
||||
BOOL8 wait) {
|
||||
WINDOW im_window;
|
||||
INT16 i;
|
||||
GRAPHICS_EVENT event; //output event
|
||||
|
||||
// xmin xmax ymin ymax
|
||||
im_window = create_window (title, SCROLLINGWIN, x, y, 10 * image->get_xsize (), 10 * image->get_ysize (), 0, image->get_xsize (), 0, image->get_ysize (),
|
||||
TRUE, FALSE, FALSE, TRUE); // down event & key only
|
||||
|
||||
clear_view_surface(im_window);
|
||||
show_sub_image (image,
|
||||
0, 0,
|
||||
image->get_xsize (), image->get_ysize (), im_window, 0, 0);
|
||||
|
||||
line_color_index(im_window, RED);
|
||||
for (i = 1; i < image->get_xsize (); i++) {
|
||||
move2d (im_window, i, 0);
|
||||
draw2d (im_window, i, image->get_ysize ());
|
||||
}
|
||||
for (i = 1; i < image->get_ysize (); i++) {
|
||||
move2d (im_window, 0, i);
|
||||
draw2d (im_window, image->get_xsize (), i);
|
||||
}
|
||||
overlap_picture_ops(TRUE);
|
||||
|
||||
if (wait)
|
||||
await_event(im_window, TRUE, ANY_EVENT, &event);
|
||||
|
||||
return im_window;
|
||||
}
|
||||
#endif
|
48
ccmain/matmatch.h
Normal file
@ -0,0 +1,48 @@
|
||||
/**********************************************************************
|
||||
* File: matmatch.h (Formerly matrix_match.h)
|
||||
* Description: matrix matching routines for Tessedit
|
||||
* Author: Chris Newton
|
||||
* Created: Wed Nov 24 15:57:41 GMT 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef MATMATCH_H
|
||||
#define MATMATCH_H
|
||||
|
||||
#include "img.h"
|
||||
#include "hosthplb.h"
|
||||
#include "notdll.h"
|
||||
|
||||
#define BINIM_BLACK 0
|
||||
#define BINIM_WHITE 1
|
||||
#define BAD_MATCH 9999.0
|
||||
|
||||
extern BOOL_VAR_H (tessedit_display_mm, FALSE, "Display matrix matches");
|
||||
extern BOOL_VAR_H (tessedit_mm_debug, FALSE,
|
||||
"Print debug information for matrix matcher");
|
||||
extern INT_VAR_H (tessedit_mm_prototype_min_size, 3,
|
||||
"Smallest number of samples in a cluster for a prototype to be used");
|
||||
float matrix_match( // returns match score
|
||||
IMAGE *image1,
|
||||
IMAGE *image2);
|
||||
float match1( /* returns match score */
|
||||
IMAGE *image_w,
|
||||
IMAGE *image_n);
|
||||
void display_images(IMAGE *image_w, IMAGE *image_n, IMAGE *match_image);
|
||||
WINDOW display_image(IMAGE *image,
|
||||
const char *title,
|
||||
INT32 x,
|
||||
INT32 y,
|
||||
BOOL8 wait);
|
||||
#endif
|
1185
ccmain/output.cpp
Normal file
File diff suppressed because it is too large
112
ccmain/output.h
Normal file
@ -0,0 +1,112 @@
|
||||
/******************************************************************
|
||||
* File: output.h (Formerly output.h)
|
||||
* Description: Output pass
|
||||
* Author: Phil Cheatle
|
||||
* Created: Thu Aug 4 10:56:08 BST 1994
|
||||
*
|
||||
* (C) Copyright 1994, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef OUTPUT_H
|
||||
#define OUTPUT_H
|
||||
|
||||
#include "varable.h"
|
||||
//#include "epapconv.h"
|
||||
#include "pageres.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern BOOL_EVAR_H (tessedit_write_block_separators, TRUE,
|
||||
"Write block separators in output");
|
||||
extern BOOL_VAR_H (tessedit_write_raw_output, FALSE,
|
||||
"Write raw stuff to name.raw");
|
||||
extern BOOL_EVAR_H (tessedit_write_output, TRUE, "Write text to name.txt");
|
||||
extern BOOL_EVAR_H (tessedit_write_txt_map, TRUE,
|
||||
"Write .txt to .etx map file");
|
||||
extern BOOL_EVAR_H (tessedit_write_rep_codes, TRUE,
|
||||
"Write repetition char code");
|
||||
extern BOOL_EVAR_H (tessedit_write_unlv, FALSE, "Write .unlv output file");
|
||||
extern STRING_EVAR_H (unrecognised_char, "|",
|
||||
"Output char for unidentified blobs");
|
||||
extern INT_EVAR_H (suspect_level, 99, "Suspect marker level");
|
||||
extern INT_VAR_H (suspect_space_level, 100,
|
||||
"Min suspect level for rejecting spaces");
|
||||
extern INT_VAR_H (suspect_short_words, 2,
|
||||
"Dont Suspect dict wds longer than this");
|
||||
extern BOOL_VAR_H (suspect_constrain_1Il, FALSE,
|
||||
"UNLV keep 1Il chars rejected");
|
||||
extern double_VAR_H (suspect_rating_per_ch, 999.9,
|
||||
"Dont touch bad rating limit");
|
||||
extern double_VAR_H (suspect_accept_rating, -999.9,
|
||||
"Accept good rating limit");
|
||||
extern BOOL_EVAR_H (tessedit_minimal_rejection, FALSE,
|
||||
"Only reject tess failures");
|
||||
extern BOOL_VAR_H (tessedit_zero_rejection, FALSE, "Dont reject ANYTHING");
|
||||
extern BOOL_VAR_H (tessedit_word_for_word, FALSE,
|
||||
"Make output have exactly one word per WERD");
|
||||
extern BOOL_VAR_H (tessedit_consistent_reps, TRUE,
|
||||
"Force all rep chars the same");
|
||||
void output_pass(  //Tess output pass
                 PAGE_RES_IT &page_res_it,
                 BOOL8 write_to_shm  //send to api
                );
|
||||
void write_results( //output a word
|
||||
PAGE_RES_IT &page_res_it, //full info
|
||||
char newline_type, //type of newline
|
||||
BOOL8 force_eol, //override tilde crunch?
|
||||
BOOL8 write_to_shm //send to api
|
||||
);
|
||||
WERD_CHOICE *make_epaper_choice( //convert one word
|
||||
WERD_RES *word, //word to do
|
||||
char newline_type //type of newline
|
||||
);
|
||||
INT16 make_reject ( //make reject code
|
||||
BOX * inset_box, //bounding box
|
||||
INT16 prevright, //previous char
|
||||
INT16 nextleft, //next char
|
||||
DENORM * denorm, //de-normalizer
|
||||
char word_string[] //output string
|
||||
);
|
||||
char determine_newline_type( //test line ends
|
||||
WERD *word, //word to do
|
||||
BLOCK *block, //current block
|
||||
WERD *next_word, //next word
|
||||
BLOCK *next_block //block of next word
|
||||
);
|
||||
void write_cooked_text( //write output
|
||||
WERD *word, //word to do
|
||||
const STRING &text, //text to write
|
||||
BOOL8 acceptable, //good stuff
|
||||
BOOL8 pass2, //done on pass2
|
||||
FILE *fp //file to write
|
||||
);
|
||||
void write_shm_text( //write output
|
||||
WERD_RES *word, //word to do
|
||||
BLOCK *block, //block it is from
|
||||
ROW_RES *row, //row it is from
|
||||
const STRING &text //text to write
|
||||
);
|
||||
void write_map( //output a map file
|
||||
FILE *mapfile, //mapfile to write to
|
||||
WERD_RES *word);
|
||||
FILE *open_outfile( //open .map & .unlv file
|
||||
const char *extension);
|
||||
void write_unlv_text(WERD_RES *word);
|
||||
char get_rep_char( // what char is repeated?
|
||||
WERD_RES *word);
|
||||
void ensure_rep_chars_are_consistent(WERD_RES *word);
|
||||
void set_unlv_suspects(WERD_RES *word);
|
||||
INT16 count_alphas( //how many alphas
|
||||
const char *s);
|
||||
INT16 count_alphanums( //how many alphanums
|
||||
const char *s);
|
||||
BOOL8 acceptable_number_string(const char *s);
|
||||
#endif
|
107
ccmain/paircmp.cpp
Normal file
@ -0,0 +1,107 @@
|
||||
/**********************************************************************
|
||||
* File: paircmp.cpp (Formerly paircmp.c)
|
||||
* Description: Code to compare two blobs using the adaptive matcher
|
||||
* Author: Ray Smith
|
||||
* Created: Wed Apr 21 09:31:02 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "blobcmp.h"
|
||||
#include "tfacep.h"
|
||||
#include "paircmp.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
/**********************************************************************
|
||||
* compare_blob_pairs
|
||||
*
|
||||
* A blob processor to compare pairs of selected blobs.
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 compare_blob_pairs(  //blob processor
                         BLOCK *,
                         ROW *row,    //row it came from
                         WERD *,
                         PBLOB *blob  //blob to compare
                        ) {
  static ROW *prev_row = NULL;  //other in pair
  static PBLOB *prev_blob = NULL;
  float rating;                 //from matcher

  if (prev_row == NULL || prev_blob == NULL) {
    //first of a pair: remember it and wait for the next call
    prev_row = row;
    prev_blob = blob;
  }
  else {
    //second of a pair: compare, report, then reset for the next pair
    rating = compare_blobs (prev_blob, prev_row, blob, row);
    tprintf ("Rating=%g\n", rating);
    prev_row = NULL;
    prev_blob = NULL;
  }
  return TRUE;
}
|
||||
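The pairing behaviour of compare_blob_pairs() can be shown in isolation. The following is a minimal sketch, not the Tesseract code: `compare_item_pairs`, `toy_compare` and `g_ratings` are hypothetical names, and plain ints stand in for blobs. It only illustrates how static state turns a per-item callback into a per-pair comparison.

```cpp
#include <vector>

// Hypothetical stand-in for the matcher; the real compare_blobs() rates two PBLOBs.
static float toy_compare(int a, int b) {
  return static_cast<float>(a > b ? a - b : b - a);
}

// Collects ratings so the behaviour can be inspected (the original prints them).
static std::vector<float> g_ratings;

// Called once per item. The first call of each pair remembers the item;
// the second call compares against it and clears the remembered slot.
bool compare_item_pairs(const int *item) {
  static const int *prev_item = nullptr;  // first half of the pending pair
  if (prev_item == nullptr) {
    prev_item = item;
  } else {
    g_ratings.push_back(toy_compare(*prev_item, *item));
    prev_item = nullptr;
  }
  return true;
}
```

With an odd number of calls the last item stays pending until the next call arrives, which is also true of the original function.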
|
||||
|
||||
/**********************************************************************
|
||||
* compare_blobs
|
||||
*
|
||||
* Compare 2 blobs and return the rating.
|
||||
**********************************************************************/
|
||||
|
||||
float compare_blobs( //match 2 blobs
|
||||
PBLOB *blob1, //first blob
|
||||
ROW *row1, //row it came from
|
||||
PBLOB *blob2, //other blob
|
||||
ROW *row2) {
|
||||
PBLOB *bn_blob1; //baseline norm
|
||||
PBLOB *bn_blob2;
|
||||
DENORM denorm1, denorm2;
|
||||
float rating; //match result
|
||||
|
||||
bn_blob1 = blob1->baseline_normalise (row1, &denorm1);
|
||||
bn_blob2 = blob2->baseline_normalise (row2, &denorm2);
|
||||
rating = compare_bln_blobs (bn_blob1, &denorm1, bn_blob2, &denorm2);
|
||||
delete bn_blob1;
|
||||
delete bn_blob2;
|
||||
return rating;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* compare_bln_blobs
|
||||
*
|
||||
* Compare 2 baseline normalised blobs and return the rating.
|
||||
**********************************************************************/
|
||||
|
||||
float compare_bln_blobs( //match 2 blobs
|
||||
PBLOB *blob1, //first blob
|
||||
DENORM *denorm1,
|
||||
PBLOB *blob2, //other blob
|
||||
DENORM *denorm2) {
|
||||
TBLOB *tblob1; //tessblobs
|
||||
TBLOB *tblob2;
|
||||
TEXTROW tessrow1, tessrow2; //tess rows
|
||||
float rating; //match result
|
||||
|
||||
tblob1 = make_tess_blob (blob1, TRUE);
|
||||
make_tess_row(denorm1, &tessrow1);
|
||||
tblob2 = make_tess_blob (blob2, TRUE);
|
||||
make_tess_row(denorm2, &tessrow2);
|
||||
rating = compare_tess_blobs (tblob1, &tessrow1, tblob2, &tessrow2);
|
||||
free_blob(tblob1);
|
||||
free_blob(tblob2);
|
||||
|
||||
return rating;
|
||||
}
|
43
ccmain/paircmp.h
Normal file
@ -0,0 +1,43 @@
|
||||
/**********************************************************************
|
||||
* File: paircmp.h (Formerly paircmp.h)
|
||||
* Description: Code to compare two blobs using the adaptive matcher
|
||||
* Author: Ray Smith
|
||||
* Created: Wed Apr 21 09:31:02 BST 1993
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef PAIRCMP_H
|
||||
#define PAIRCMP_H
|
||||
|
||||
#include "ocrblock.h"
|
||||
#include "varable.h"
|
||||
#include "notdll.h"
|
||||
|
||||
BOOL8 compare_blob_pairs( //blob processor
|
||||
BLOCK *,
|
||||
ROW *row, //row it came from
|
||||
WERD *,
|
||||
PBLOB *blob //blob to compare
|
||||
);
|
||||
float compare_blobs( //match 2 blobs
|
||||
PBLOB *blob1, //first blob
|
||||
ROW *row1, //row it came from
|
||||
PBLOB *blob2, //other blob
|
||||
ROW *row2);
|
||||
float compare_bln_blobs( //match 2 blobs
|
||||
PBLOB *blob1, //first blob
|
||||
DENORM *denorm1,
|
||||
PBLOB *blob2, //other blob
|
||||
DENORM *denorm2);
|
||||
#endif
|
1655
ccmain/reject.cpp
Normal file
File diff suppressed because it is too large
175
ccmain/reject.h
Normal file
@ -0,0 +1,175 @@
|
||||
/**********************************************************************
|
||||
* File: reject.h (Formerly reject.h)
|
||||
* Description: Rejection functions used in tessedit
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Sep 23 16:50:21 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef REJECT_H
|
||||
#define REJECT_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "pageres.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern INT_VAR_H (tessedit_reject_mode, 5, "Rejection algorithm");
|
||||
extern INT_VAR_H (tessedit_ok_mode, 5, "Acceptance decision algorithm");
|
||||
extern BOOL_VAR_H (tessedit_use_nn, TRUE, "");
|
||||
extern BOOL_VAR_H (tessedit_rejection_debug, FALSE, "Adaption debug");
|
||||
extern BOOL_VAR_H (tessedit_rejection_stats, FALSE, "Show NN stats");
|
||||
extern BOOL_VAR_H (tessedit_flip_0O, TRUE, "Contextual 0O O0 flips");
|
||||
extern double_VAR_H (tessedit_lower_flip_hyphen, 1.5,
|
||||
"Aspect ratio dot/hyphen test");
|
||||
extern double_VAR_H (tessedit_upper_flip_hyphen, 1.8,
|
||||
"Aspect ratio dot/hyphen test");
|
||||
extern BOOL_VAR_H (rej_trust_doc_dawg, FALSE,
|
||||
"Use DOC dawg in 11l conf. detector");
|
||||
extern BOOL_VAR_H (rej_1Il_use_dict_word, FALSE, "Use dictword test");
|
||||
extern BOOL_VAR_H (rej_1Il_trust_permuter_type, TRUE, "Dont double check");
|
||||
extern BOOL_VAR_H (one_ell_conflict_default, TRUE,
|
||||
"one_ell_conflict default");
|
||||
extern BOOL_VAR_H (show_char_clipping, FALSE, "Show clip image window?");
|
||||
extern BOOL_VAR_H (nn_debug, FALSE, "NN DEBUGGING?");
|
||||
extern BOOL_VAR_H (nn_reject_debug, FALSE, "NN DEBUG each char?");
|
||||
extern BOOL_VAR_H (nn_lax, FALSE, "Use 2nd rate matches");
|
||||
extern BOOL_VAR_H (nn_double_check_dict, FALSE, "Double check");
|
||||
extern BOOL_VAR_H (nn_conf_double_check_dict, TRUE,
|
||||
"Double check for confusions");
|
||||
extern BOOL_VAR_H (nn_conf_1Il, TRUE, "NN use 1Il conflicts");
|
||||
extern BOOL_VAR_H (nn_conf_Ss, TRUE, "NN use Ss conflicts");
|
||||
extern BOOL_VAR_H (nn_conf_hyphen, TRUE, "NN hyphen conflicts");
|
||||
extern BOOL_VAR_H (nn_conf_test_good_qual, FALSE, "NN dodgy 1Il cross check");
|
||||
extern BOOL_VAR_H (nn_conf_test_dict, TRUE, "NN dodgy 1Il cross check");
|
||||
extern BOOL_VAR_H (nn_conf_test_sensible, TRUE, "NN dodgy 1Il cross check");
|
||||
extern BOOL_VAR_H (nn_conf_strict_on_dodgy_chs, TRUE,
|
||||
"Require stronger NN match");
|
||||
extern double_VAR_H (nn_dodgy_char_threshold, 0.99, "min accept score");
|
||||
extern INT_VAR_H (nn_conf_accept_level, 4, "NN accept dodgy 1Il matches? ");
|
||||
extern INT_VAR_H (nn_conf_initial_i_level, 3,
|
||||
"NN accept initial Ii match level ");
|
||||
extern BOOL_VAR_H (no_unrej_dubious_chars, TRUE,
|
||||
"Dubious chars next to reject?");
|
||||
extern BOOL_VAR_H (no_unrej_no_alphanum_wds, TRUE,
|
||||
"Stop unrej of non A/N wds?");
|
||||
extern BOOL_VAR_H (no_unrej_1Il, FALSE, "Stop unrej of 1Ilchars?");
|
||||
extern BOOL_VAR_H (rej_use_tess_accepted, TRUE,
|
||||
"Individual rejection control");
|
||||
extern BOOL_VAR_H (rej_use_tess_blanks, TRUE, "Individual rejection control");
|
||||
extern BOOL_VAR_H (rej_use_good_perm, TRUE, "Individual rejection control");
|
||||
extern BOOL_VAR_H (rej_use_sensible_wd, FALSE, "Extend permuter check");
|
||||
extern BOOL_VAR_H (rej_alphas_in_number_perm, FALSE, "Extend permuter check");
|
||||
extern double_VAR_H (rej_whole_of_mostly_reject_word_fract, 0.85,
|
||||
"if >this fract");
|
||||
extern INT_VAR_H (rej_mostly_reject_mode, 1,
|
||||
"0-never, 1-afterNN, 2-after new xht");
|
||||
extern double_VAR_H (tessed_fullstop_aspect_ratio, 1.2,
|
||||
"if >this fract then reject");
|
||||
extern INT_VAR_H (net_image_width, 40, "NN input image width");
|
||||
extern INT_VAR_H (net_image_height, 36, "NN input image height");
|
||||
extern INT_VAR_H (net_image_x_height, 22, "NN input image x_height");
|
||||
extern INT_VAR_H (tessedit_image_border, 2, "Rej blbs near image edge limit");
|
||||
extern INT_VAR_H (net_bl_nodes, 20, "Number of baseline nodes");
|
||||
extern double_VAR_H (nn_reject_threshold, 0.5, "NN min accept score");
|
||||
extern double_VAR_H (nn_reject_head_and_shoulders, 0.6,
|
||||
"top scores sep factor");
|
||||
extern STRING_VAR_H (ok_single_ch_non_alphanum_wds, "-?\075",
|
||||
"Allow NN to unrej");
|
||||
extern STRING_VAR_H (ok_repeated_ch_non_alphanum_wds, "-?*\075",
|
||||
"Allow NN to unrej");
|
||||
extern STRING_VAR_H (conflict_set_I_l_1, "Il1[]", "Il1 conflict set");
|
||||
extern STRING_VAR_H (conflict_set_S_s, "Ss$", "Ss conflict set");
|
||||
extern STRING_VAR_H (conflict_set_hyphen, "-_~", "hyphen conflict set");
|
||||
extern STRING_VAR_H (dubious_chars_left_of_reject, "!'+`()-./\\<>;:^_,~\"",
|
||||
"Unreliable chars");
|
||||
extern STRING_VAR_H (dubious_chars_right_of_reject, "!'+`()-./\\<>;:^_,~\"",
|
||||
"Unreliable chars");
|
||||
extern INT_VAR_H (min_sane_x_ht_pixels, 8,
|
||||
"Reject any x-ht lt or eq than this");
|
||||
void set_done( //set done flag
|
||||
WERD_RES *word,
|
||||
INT16 pass);
|
||||
void make_reject_map(  //make rej map for wd
                     WERD_RES *word,
                     BLOB_CHOICE_LIST_CLIST *blob_choices,  //detailed results
                     ROW *row,
                     INT16 pass  //1st or 2nd?
                    );
|
||||
void reject_blanks(WERD_RES *word);
|
||||
void reject_I_1_L(WERD_RES *word);
|
||||
void reject_poor_matches(WERD_RES *word,
                         BLOB_CHOICE_LIST_CLIST *blob_choices  //detailed results
                        );
float compute_reject_threshold(  //compute threshold
                               BLOB_CHOICE_LIST_CLIST *blob_choices  //detailed results
                              );
|
||||
int sort_floats( //qsort function
|
||||
const void *arg1, //ptrs to floats
|
||||
const void *arg2);
|
||||
void reject_edge_blobs(WERD_RES *word);
|
||||
BOOL8 one_ell_conflict(WERD_RES *word_res, BOOL8 update_map);
|
||||
INT16 first_alphanum_pos(const char *word);
|
||||
INT16 alpha_count(const char *word);
|
||||
BOOL8 word_contains_non_1_digit(const char *word);
|
||||
BOOL8 test_ambig_word( //test for ambiguity
|
||||
WERD_RES *word);
|
||||
BOOL8 ambig_word(const char *start_word,  //original word
                 char *temp_word,         //alterable copy
                 INT16 test_char_pos      //idx to char to alter
                );
|
||||
const char *char_ambiguities(char c);
|
||||
|
||||
#ifndef EMBEDDED
|
||||
void test_ambigs(const char *word);
|
||||
#endif
|
||||
|
||||
void nn_recover_rejects(WERD_RES *word, ROW *row);
|
||||
void nn_match_word( //Match a word
|
||||
WERD_RES *word,
|
||||
ROW *row);
|
||||
INT16 nn_match_char(IMAGE &scaled_image,      //of character
                    float baseline_pos,       //rel to scaled_image
                    BOOL8 dict_word,          //part of dict wd?
                    BOOL8 checked_dict_word,  //part of dict wd?
                    BOOL8 sensible_word,      //part acceptable str?
                    BOOL8 centre,             //not at word ends?
                    BOOL8 good_quality_word,  //initial segmentation
                    char tess_ch              //confirm this?
                   );
|
||||
INT16 evaluate_net_match(char top,
|
||||
float top_score,
|
||||
char next,
|
||||
float next_score,
|
||||
char tess_ch,
|
||||
BOOL8 dict_word,
|
||||
BOOL8 checked_dict_word,
|
||||
BOOL8 sensible_word,
|
||||
BOOL8 centre,
|
||||
BOOL8 good_quality_word);
|
||||
void dont_allow_dubious_chars(WERD_RES *word);
|
||||
|
||||
void dont_allow_1Il(WERD_RES *word);
|
||||
|
||||
INT16 count_alphanums( //how many alphanums
|
||||
WERD_RES *word);
|
||||
void reject_mostly_rejects(  //rej all if most rejected
|
||||
WERD_RES *word);
|
||||
BOOL8 repeated_nonalphanum_wd(WERD_RES *word, ROW *row);
|
||||
BOOL8 repeated_ch_string(const char *rep_ch_str);
|
||||
INT16 safe_dict_word(const char *s);
|
||||
void flip_hyphens(WERD_RES *word);
|
||||
void flip_0O(WERD_RES *word);
|
||||
BOOL8 non_O_upper(char c);
|
||||
BOOL8 non_0_digit(char c);
|
||||
#endif
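reject.h declares sort_floats() as a qsort function taking pointers to floats, but its body is in reject.cpp, which is not shown here. As a hedged illustration of what such a comparator typically looks like, here is a sketch under the assumption of ascending order. The name `sort_floats_sketch` is hypothetical and is not the function from the source.

```cpp
#include <cstdlib>

// Assumed ascending comparator in the shape qsort() expects.
// Each argument points at one float element of the array being sorted.
int sort_floats_sketch(const void *arg1, const void *arg2) {
  const float a = *static_cast<const float *>(arg1);
  const float b = *static_cast<const float *>(arg2);
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}
```

A call would look like `std::qsort(ratings, count, sizeof(float), sort_floats_sketch);`. Comparing with `<` and `>` avoids the rounding problems of returning a casted difference.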
|
366
ccmain/scaleimg.cpp
Normal file
@ -0,0 +1,366 @@
|
||||
/**********************************************************************
|
||||
* File: scaleimg.cpp (Formerly scaleim.c)
|
||||
* Description: Smart scaling of images.
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 18 16:12:03 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
/*************************************************************************
 * This is really Sheelagh's code that I've hacked into a more usable form.
 * You simply call scale_image(), passing in the source and target images.
 * The target image should be empty but already created, so that it defines
 * the destination size.
 *************************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include "fileerr.h"
|
||||
#include "tprintf.h"
|
||||
#include "grphics.h"
|
||||
#include "img.h"
|
||||
//#include "basefile.h"
|
||||
#include "imgscale.h"
|
||||
#include "scaleimg.h"
|
||||
|
||||
void scale_image( //scale an image
|
||||
IMAGE &image, //source image
|
||||
IMAGE &target_image //target image
|
||||
) {
|
||||
INT32 xsize, ysize, new_xsize, new_ysize;
|
||||
IMAGELINE line, new_line;
|
||||
int *hires, *lores, *oldhires, *oldlores;
|
||||
int i, j, n, oldn, row, col;
|
||||
int offset = 0; //not used here
|
||||
float factor;
|
||||
UINT8 curr_colour, new_colour;
|
||||
int dummy = -1;
|
||||
IMAGE image2; //horiz scaled image
|
||||
|
||||
xsize = image.get_xsize ();
|
||||
ysize = image.get_ysize ();
|
||||
new_xsize = target_image.get_xsize ();
|
||||
new_ysize = target_image.get_ysize ();
|
||||
if (new_ysize > new_xsize)
|
||||
new_line.init (new_ysize);
|
||||
else
|
||||
new_line.init (new_xsize);
|
||||
|
||||
factor = (float) xsize / (float) new_xsize;
|
||||
|
||||
hires = (int *) calloc (xsize, sizeof (int));
|
||||
lores = (int *) calloc (new_xsize, sizeof (int));
|
||||
oldhires = (int *) calloc (xsize, sizeof (int));
|
||||
oldlores = (int *) calloc (new_xsize, sizeof (int));
|
||||
if ((hires == NULL) || (lores == NULL) || (oldhires == NULL)
|
||||
|| (oldlores == NULL)) {
|
||||
fprintf (stderr, "Calloc error in scale_image\n");
|
||||
err_exit();
|
||||
}
|
||||
|
||||
image2.create (new_xsize, ysize, image.get_bpp ());
|
||||
|
||||
oldn = 0;
|
||||
/* do first row separately because hires[col-1] doesn't make sense here */
|
||||
image.fast_get_line (0, 0, xsize, &line);
|
||||
/* each line nominally begins with white */
|
||||
curr_colour = 1;
|
||||
n = 0;
|
||||
for (i = 0; i < xsize; i++) {
|
||||
new_colour = *(line.pixels + i);
|
||||
if (new_colour != curr_colour) {
|
||||
hires[n] = i;
|
||||
n++;
|
||||
curr_colour = new_colour;
|
||||
}
|
||||
}
|
||||
if (offset != 0)
|
||||
for (i = 0; i < n; i++)
|
||||
hires[i] += offset;
|
||||
|
||||
if (n > new_xsize) {
|
||||
tprintf ("Too many transitions (%d) on line 0\n", n);
|
||||
scale_image_cop_out(image,
|
||||
target_image,
|
||||
factor,
|
||||
hires,
|
||||
lores,
|
||||
oldhires,
|
||||
oldlores);
|
||||
return;
|
||||
}
|
||||
else if (n > 0)
|
||||
dyn_prog (n, hires, lores, new_xsize, &dummy, &dummy, 0, factor);
|
||||
else
|
||||
lores[0] = new_xsize;
|
||||
|
||||
curr_colour = 1;
|
||||
j = 0;
|
||||
for (i = 0; i < new_xsize; i++) {
|
||||
if (lores[j] == i) {
|
||||
curr_colour = 1 - curr_colour;
|
||||
j++;
|
||||
}
|
||||
*(new_line.pixels + i) = curr_colour;
|
||||
}
|
||||
image2.put_line (0, 0, new_xsize, &new_line, 0);
|
||||
|
||||
for (i = 0; i < n; i++) {
|
||||
oldhires[i] = hires[i];
|
||||
oldlores[i] = lores[i];
|
||||
}
|
||||
|
||||
for (i = n; i < oldn; i++) {
|
||||
oldhires[i] = 0;
|
||||
oldlores[i] = 0;
|
||||
}
|
||||
oldn = n;
|
||||
|
||||
for (row = 1; row < ysize; row++) {
|
||||
image.fast_get_line (0, row, xsize, &line);
|
||||
/* each line nominally begins with white */
|
||||
curr_colour = 1;
|
||||
n = 0;
|
||||
for (i = 0; i < xsize; i++) {
|
||||
new_colour = *(line.pixels + i);
|
||||
if (new_colour != curr_colour) {
|
||||
hires[n] = i;
|
||||
n++;
|
||||
curr_colour = new_colour;
|
||||
}
|
||||
}
|
||||
for (i = n; i < oldn; i++) {
|
||||
hires[i] = 0;
|
||||
lores[i] = 0;
|
||||
}
|
||||
if (offset != 0)
|
||||
for (i = 0; i < n; i++)
|
||||
hires[i] += offset;
|
||||
|
||||
if (n > new_xsize) {
|
||||
tprintf ("Too many transitions (%d) on line %d\n", n, row);
|
||||
scale_image_cop_out(image,
|
||||
target_image,
|
||||
factor,
|
||||
hires,
|
||||
lores,
|
||||
oldhires,
|
||||
oldlores);
|
||||
return;
|
||||
}
|
||||
else if (n > 0)
|
||||
dyn_prog(n, hires, lores, new_xsize, oldhires, oldlores, oldn, factor);
|
||||
else
|
||||
lores[0] = new_xsize;
|
||||
|
||||
curr_colour = 1;
|
||||
j = 0;
|
||||
for (i = 0; i < new_xsize; i++) {
|
||||
if (lores[j] == i) {
|
||||
curr_colour = 1 - curr_colour;
|
||||
j++;
|
||||
}
|
||||
*(new_line.pixels + i) = curr_colour;
|
||||
}
|
||||
image2.put_line (0, row, new_xsize, &new_line, 0);
|
||||
|
||||
for (i = 0; i < n; i++) {
|
||||
oldhires[i] = hires[i];
|
||||
oldlores[i] = lores[i];
|
||||
}
|
||||
for (i = n; i < oldn; i++) {
|
||||
oldhires[i] = 0;
|
||||
oldlores[i] = 0;
|
||||
}
|
||||
oldn = n;
|
||||
}
|
||||
|
||||
free(hires);
|
||||
free(lores);
|
||||
free(oldhires);
|
||||
free(oldlores);
|
||||
|
||||
/* NOW DO THE VERTICAL SCALING from image2 to target_image*/
|
||||
|
||||
xsize = new_xsize;
|
||||
factor = (float) ysize / (float) new_ysize;
|
||||
offset = 0;
|
||||
|
||||
hires = (int *) calloc (ysize, sizeof (int));
|
||||
lores = (int *) calloc (new_ysize, sizeof (int));
|
||||
oldhires = (int *) calloc (ysize, sizeof (int));
|
||||
oldlores = (int *) calloc (new_ysize, sizeof (int));
|
||||
if ((hires == NULL) || (lores == NULL) || (oldhires == NULL)
|
||||
|| (oldlores == NULL)) {
|
||||
fprintf (stderr, "Calloc error in scale_image (vert)\n");
|
||||
err_exit();
|
||||
}
|
||||
|
||||
oldn = 0;
|
||||
/* do first col separately because hires[col-1] doesn't make sense here */
|
||||
image2.get_column (0, 0, ysize, &line, 0);
|
||||
/* each line nominally begins with white */
|
||||
curr_colour = 1;
|
||||
n = 0;
|
||||
for (i = 0; i < ysize; i++) {
|
||||
new_colour = *(line.pixels + i);
|
||||
if (new_colour != curr_colour) {
|
||||
hires[n] = i;
|
||||
n++;
|
||||
curr_colour = new_colour;
|
||||
}
|
||||
}
|
||||
|
||||
if (offset != 0)
|
||||
for (i = 0; i < n; i++)
|
||||
hires[i] += offset;
|
||||
|
||||
if (n > new_ysize) {
|
||||
tprintf ("Too many transitions (%d) on column 0\n", n);
|
||||
scale_image_cop_out(image,
|
||||
target_image,
|
||||
factor,
|
||||
hires,
|
||||
lores,
|
||||
oldhires,
|
||||
oldlores);
|
||||
return;
|
||||
}
|
||||
else if (n > 0)
|
||||
dyn_prog (n, hires, lores, new_ysize, &dummy, &dummy, 0, factor);
|
||||
else
|
||||
lores[0] = new_ysize;
|
||||
|
||||
curr_colour = 1;
|
||||
j = 0;
|
||||
for (i = 0; i < new_ysize; i++) {
|
||||
if (lores[j] == i) {
|
||||
curr_colour = 1 - curr_colour;
|
||||
j++;
|
||||
}
|
||||
*(new_line.pixels + i) = curr_colour;
|
||||
}
|
||||
target_image.put_column (0, 0, new_ysize, &new_line, 0);
|
||||
|
||||
for (i = 0; i < n; i++) {
|
||||
oldhires[i] = hires[i];
|
||||
oldlores[i] = lores[i];
|
||||
}
|
||||
for (i = n; i < oldn; i++) {
|
||||
oldhires[i] = 0;
|
||||
oldlores[i] = 0;
|
||||
}
|
||||
oldn = n;
|
||||
|
||||
for (col = 1; col < xsize; col++) {
|
||||
image2.get_column (col, 0, ysize, &line, 0);
|
||||
/* each line nominally begins with white */
|
||||
curr_colour = 1;
|
||||
n = 0;
|
||||
for (i = 0; i < ysize; i++) {
|
||||
new_colour = *(line.pixels + i);
|
||||
if (new_colour != curr_colour) {
|
||||
hires[n] = i;
|
||||
n++;
|
||||
curr_colour = new_colour;
|
||||
}
|
||||
}
|
||||
for (i = n; i < oldn; i++) {
|
||||
hires[i] = 0;
|
||||
lores[i] = 0;
|
||||
}
|
||||
|
||||
if (offset != 0)
|
||||
for (i = 0; i < n; i++)
|
||||
hires[i] += offset;
|
||||
|
||||
if (n > new_ysize) {
|
||||
tprintf ("Too many transitions (%d) on column %d\n", n, col);
|
||||
scale_image_cop_out(image,
|
||||
target_image,
|
||||
factor,
|
||||
hires,
|
||||
lores,
|
||||
oldhires,
|
||||
oldlores);
|
||||
return;
|
||||
}
|
||||
else if (n > 0)
|
||||
dyn_prog(n, hires, lores, new_ysize, oldhires, oldlores, oldn, factor);
|
||||
else
|
||||
lores[0] = new_ysize;
|
||||
|
||||
curr_colour = 1;
|
||||
j = 0;
|
||||
for (i = 0; i < new_ysize; i++) {
|
||||
if (lores[j] == i) {
|
||||
curr_colour = 1 - curr_colour;
|
||||
j++;
|
||||
}
|
||||
*(new_line.pixels + i) = curr_colour;
|
||||
}
|
||||
target_image.put_column (col, 0, new_ysize, &new_line, 0);
|
||||
|
||||
for (i = 0; i < n; i++) {
|
||||
oldhires[i] = hires[i];
|
||||
oldlores[i] = lores[i];
|
||||
}
|
||||
for (i = n; i < oldn; i++) {
|
||||
oldhires[i] = 0;
|
||||
oldlores[i] = 0;
|
||||
}
|
||||
oldn = n;
|
||||
}
|
||||
free(hires);
|
||||
free(lores);
|
||||
free(oldhires);
|
||||
free(oldlores);
|
||||
}
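scale_image() works on colour transitions rather than on pixels: each line is reduced to the positions where the colour changes (starting from a nominal white), those positions are mapped into the target width, and the target line is rebuilt by toggling the colour at each mapped position. The sketch below shows that representation on its own. All three function names are hypothetical, and `scale_transitions` is a naive proportional mapping used as a stand-in. It is not the dyn_prog() routine the original calls, whose behaviour is not shown in this file.

```cpp
#include <cstddef>
#include <vector>

// Record the index of every colour change along one line of binary pixels.
// As in scale_image(), the line is assumed to start in white (1).
std::vector<int> find_transitions(const std::vector<unsigned char> &pixels) {
  std::vector<int> transitions;
  unsigned char curr_colour = 1;
  for (int i = 0; i < static_cast<int>(pixels.size()); i++) {
    if (pixels[i] != curr_colour) {
      transitions.push_back(i);
      curr_colour = pixels[i];
    }
  }
  return transitions;
}

// Naive stand-in for dyn_prog(): divide each position by the scale factor,
// where factor = source size / target size.
std::vector<int> scale_transitions(const std::vector<int> &hires, float factor) {
  std::vector<int> lores;
  for (int pos : hires)
    lores.push_back(static_cast<int>(pos / factor));
  return lores;
}

// Rebuild a line of the given width by toggling the colour at each transition.
// Unlike the original loop, the index into the list is bounds-checked here.
std::vector<unsigned char> render_line(const std::vector<int> &transitions,
                                       int width) {
  std::vector<unsigned char> pixels(width);
  unsigned char curr_colour = 1;
  std::size_t j = 0;
  for (int i = 0; i < width; i++) {
    while (j < transitions.size() && transitions[j] == i) {
      curr_colour = static_cast<unsigned char>(1 - curr_colour);
      j++;
    }
    pixels[i] = curr_colour;
  }
  return pixels;
}
```

For example, the 8-pixel line `1 1 0 0 0 0 1 1` has transitions at 2 and 6; with a factor of 2 they map to 1 and 3, and rendering at width 4 gives `1 0 0 1`.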
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* scale_image_cop_out
|
||||
*
|
||||
* Cop-out of scale_image by doing it the easy way and free the data.
|
||||
**********************************************************************/
|
||||
|
||||
void scale_image_cop_out( //scale an image
|
||||
IMAGE &image, //source image
|
||||
IMAGE &target_image, //target image
|
||||
float factor, //scale factor
|
||||
int *hires,
|
||||
int *lores,
|
||||
int *oldhires,
|
||||
int *oldlores) {
|
||||
INT32 xsize, ysize, new_xsize, new_ysize;
|
||||
|
||||
xsize = image.get_xsize ();
|
||||
ysize = image.get_ysize ();
|
||||
new_xsize = target_image.get_xsize ();
|
||||
new_ysize = target_image.get_ysize ();
|
||||
|
||||
if (factor <= 0.5)
|
||||
reduce_sub_image (&image, 0, 0, xsize, ysize,
|
||||
&target_image, 0, 0, (INT32) (1.0 / factor), FALSE);
|
||||
else if (factor >= 2)
|
||||
enlarge_sub_image (&image, 0, 0, &target_image,
|
||||
0, 0, new_xsize, new_ysize, (INT32) factor, FALSE);
|
||||
else
|
||||
copy_sub_image (&image, 0, 0, xsize, ysize, &target_image, 0, 0, FALSE);
|
||||
free(hires);
|
||||
free(lores);
|
||||
free(oldhires);
|
||||
free(oldlores);
|
||||
}
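The fallback in scale_image_cop_out() is a three-way branch on the scale factor it is handed. The sketch below mirrors only that branch and the integer scale each helper receives. `Fallback`, `FallbackChoice` and `choose_fallback` are hypothetical names introduced for illustration, and the sketch makes no claim about what reduce_sub_image(), enlarge_sub_image() or copy_sub_image() do internally.

```cpp
// Which helper the fallback would call for a given factor.
enum class Fallback { Reduce, Enlarge, Copy };

struct FallbackChoice {
  Fallback kind;
  int scale;  // integer scale handed to the helper; 1 for a plain copy
};

// Mirrors the branch in scale_image_cop_out(): the factor is truncated to an
// integer, so any factor strictly between 0.5 and 2 falls back to a copy.
FallbackChoice choose_fallback(float factor) {
  if (factor <= 0.5f)
    return {Fallback::Reduce, static_cast<int>(1.0 / factor)};
  if (factor >= 2.0f)
    return {Fallback::Enlarge, static_cast<int>(factor)};
  return {Fallback::Copy, 1};
}
```

Because of the truncation, a factor of 3.7 is treated as an integer scale of 3, so the fallback result is only approximately the requested size.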
|
35
ccmain/scaleimg.h
Normal file
@ -0,0 +1,35 @@
|
||||
/**********************************************************************
|
||||
* File: scaleimg.h (Formerly scaleim.h)
|
||||
* Description: Smart scaling of images.
|
||||
* Author: Phil Cheatle
|
||||
* Created: Wed Nov 18 16:12:03 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef SCALEIMG_H
|
||||
#define SCALEIMG_H
|
||||
|
||||
void scale_image( //scale an image
|
||||
IMAGE &image, //source image
|
||||
IMAGE &target_image //target image
|
||||
);
|
||||
void scale_image_cop_out( //scale an image
|
||||
IMAGE &image, //source image
|
||||
IMAGE &target_image, //target image
|
||||
float factor, //scale factor
|
||||
int *hires,
|
||||
int *lores,
|
||||
int *oldhires,
|
||||
int *oldlores);
|
||||
#endif
|
370
ccmain/tessbox.cpp
Normal file
@ -0,0 +1,370 @@
|
||||
/**********************************************************************
|
||||
* File: tessbox.cpp (Formerly tessbox.c)
|
||||
* Description: Black boxed Tess for developing a resaljet.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 11:03:36 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "tfacep.h"
|
||||
#include "tfacepp.h"
|
||||
#include "tessbox.h"
|
||||
#include "mfoutline.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
/**********************************************************************
|
||||
* tess_segment_pass1
|
||||
*
|
||||
* Segment a word using the pass1 conditions of the tess segmenter.
|
||||
**********************************************************************/
|
||||
|
||||
WERD_CHOICE *tess_segment_pass1( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
WERD_CHOICE *&raw_choice, //raw result //list of blob lists
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices,
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
WERD_CHOICE *result; //return value
|
||||
int saved_enable_assoc = 0;
|
||||
int saved_chop_enable = 0;
|
||||
|
||||
if (word->flag (W_DONT_CHOP)) {
|
||||
saved_enable_assoc = enable_assoc;
|
||||
saved_chop_enable = chop_enable;
|
||||
enable_assoc = 0;
|
||||
chop_enable = 0;
|
||||
if (word->flag (W_REP_CHAR))
|
||||
permute_only_top = 1;
|
||||
}
|
||||
set_pass1();
|
||||
// tprintf("pass1 chop on=%d, seg=%d, onlytop=%d",chop_enable,enable_assoc,permute_only_top);
|
||||
result = recog_word (word, denorm, matcher, NULL, NULL, FALSE,
|
||||
raw_choice, blob_choices, outword);
|
||||
if (word->flag (W_DONT_CHOP)) {
|
||||
enable_assoc = saved_enable_assoc;
|
||||
chop_enable = saved_chop_enable;
|
||||
permute_only_top = 0;
|
||||
}
|
||||
return result;
|
||||
}
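The save, override and restore sequence around recog_word could also be written as a scope guard. The sketch below is illustrative only: ScopedOverride is a hypothetical helper that does not exist in Tesseract, the two globals are local stand-ins for the real switches, and the override is applied unconditionally rather than only for W_DONT_CHOP words.

```cpp
#include <cassert>

// Stand-ins for the engine's global switches (assumption: the real
// variables are defined elsewhere and are only mimicked here).
static int enable_assoc = 1;
static int chop_enable = 1;

// Hypothetical helper: sets a variable for the lifetime of the guard
// and puts the previous value back when the guard is destroyed.
template <typename T>
class ScopedOverride {
 public:
  ScopedOverride(T &target, T temporary) : target_(target), saved_(target) {
    target_ = temporary;
  }
  ~ScopedOverride() { target_ = saved_; }
  ScopedOverride(const ScopedOverride &) = delete;
  ScopedOverride &operator=(const ScopedOverride &) = delete;

 private:
  T &target_;  // variable being overridden
  T saved_;    // value to restore on scope exit
};

// Mimics the W_DONT_CHOP branch: both switches are off while the word
// would be recognised, and are restored on every exit path.
int recognise_without_chopping() {
  ScopedOverride<int> no_assoc(enable_assoc, 0);
  ScopedOverride<int> no_chop(chop_enable, 0);
  return enable_assoc + chop_enable;  // values seen inside the guarded scope
}
```

The guard restores the saved values even if the guarded code returns early, which the hand-written pattern only achieves because it has a single exit.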
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_segment_pass2
|
||||
*
|
||||
* Segment a word using the pass2 conditions of the tess segmenter.
|
||||
**********************************************************************/
|
||||
|
||||
WERD_CHOICE *tess_segment_pass2( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
WERD_CHOICE *result; //return value
|
||||
int saved_enable_assoc = 0;
|
||||
int saved_chop_enable = 0;
|
||||
|
||||
if (word->flag (W_DONT_CHOP)) {
|
||||
saved_enable_assoc = enable_assoc;
|
||||
saved_chop_enable = chop_enable;
|
||||
enable_assoc = 0;
|
||||
chop_enable = 0;
|
||||
if (word->flag (W_REP_CHAR))
|
||||
permute_only_top = 1;
|
||||
}
|
||||
set_pass2();
|
||||
result = recog_word (word, denorm, matcher, NULL, NULL, FALSE,
|
||||
raw_choice, blob_choices, outword);
|
||||
if (word->flag (W_DONT_CHOP)) {
|
||||
enable_assoc = saved_enable_assoc;
|
||||
chop_enable = saved_chop_enable;
|
||||
permute_only_top = 0;
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* correct_segment_pass2
|
||||
*
|
||||
* Segment a word correctly using the pass2 conditions of the tess segmenter.
|
||||
* Then call the tester with all the correctly segmented blobs.
|
||||
* If the correct segmentation cannot be found, the tester is called
|
||||
* with the segmentation found by tess and all the correct flags set to
|
||||
* false and all strings are NULL.
|
||||
**********************************************************************/
|
||||
|
||||
WERD_CHOICE *correct_segment_pass2( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
set_pass2();
|
||||
return recog_word (word, denorm, matcher, NULL, tester, TRUE,
|
||||
raw_choice, blob_choices, outword);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* test_segment_pass2
|
||||
*
|
||||
* Segment a word correctly using the pass2 conditions of the tess segmenter.
|
||||
* Then call the tester on all words used by tess in its search.
|
||||
* Do this only on words where the correct segmentation could be found.
|
||||
**********************************************************************/
|
||||
|
||||
WERD_CHOICE *test_segment_pass2( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
set_pass2();
|
||||
return recog_word (word, denorm, matcher, tester, NULL, TRUE,
|
||||
raw_choice, blob_choices, outword);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_acceptable_word
|
||||
*
|
||||
* Return true if the word is regarded as "good enough".
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 tess_acceptable_word( //test acceptability
|
||||
WERD_CHOICE *word_choice, //after context
|
||||
WERD_CHOICE *raw_choice //before context
|
||||
) {
|
||||
A_CHOICE choice; //after context
|
||||
A_CHOICE tess_raw; //before
|
||||
|
||||
choice.rating = word_choice->rating ();
|
||||
choice.certainty = word_choice->certainty ();
|
||||
choice.string = (char *) word_choice->string ().string ();
|
||||
tess_raw.rating = raw_choice->rating ();
|
||||
tess_raw.certainty = raw_choice->certainty ();
|
||||
tess_raw.string = (char *) raw_choice->string ().string ();
|
||||
//call tess
|
||||
return AcceptableResult (&choice, &tess_raw);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_adaptable_word
|
||||
*
|
||||
* Return true if the word is regarded as good enough to adapt to.
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 tess_adaptable_word( //test adaptability
|
||||
WERD *word, //word to test
|
||||
WERD_CHOICE *word_choice, //after context
|
||||
WERD_CHOICE *raw_choice //before context
|
||||
) {
|
||||
TWERD *tessword; //converted word
|
||||
INT32 result; //answer
|
||||
|
||||
tessword = make_tess_word (word, NULL);
|
||||
result = AdaptableWord (tessword, word_choice->string ().string (),
|
||||
raw_choice->string ().string ());
|
||||
delete_word(tessword);
|
||||
return result != 0;
|
||||
}
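tess_adaptable_word builds a temporary converted word, queries it, and frees it again. A minimal sketch of the same lifetime handling with std::unique_ptr and a custom deleter follows. FakeWord, make_fake_word and delete_fake_word are hypothetical stand-ins for TWERD, make_tess_word and delete_word, whose definitions are not shown here.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the converted word type.
struct FakeWord {
  int blob_count;
};

static int live_words = 0;  // counts words that have not been freed yet

FakeWord *make_fake_word() {
  ++live_words;
  return new FakeWord{3};
}

void delete_fake_word(FakeWord *word) {
  --live_words;
  delete word;
}

// Owning pointer that calls the matching free function automatically.
using WordPtr = std::unique_ptr<FakeWord, void (*)(FakeWord *)>;

// Same shape as the function above: convert, query, release.
bool word_has_blobs() {
  WordPtr word(make_fake_word(), &delete_fake_word);
  return word->blob_count != 0;  // the word is freed when it leaves scope
}
```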
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_cn_matcher
|
||||
*
|
||||
* Match a blob using the Tess Char Normalized (non-adaptive) matcher
|
||||
* only.
|
||||
**********************************************************************/
|
||||
|
||||
void tess_cn_matcher( //call tess
|
||||
PBLOB *pblob, //previous blob
|
||||
PBLOB *blob, //blob to match
|
||||
PBLOB *nblob, //next blob
|
||||
WERD *word, //word it came from
|
||||
DENORM *denorm, //de-normaliser
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
) {
|
||||
LIST result; //tess output
|
||||
TBLOB *tessblob; //converted blob
|
||||
TEXTROW tessrow; //dummy row
|
||||
|
||||
tess_cn_matching = TRUE; //turn it on
|
||||
tess_bn_matching = FALSE;
|
||||
//convert blob
|
||||
tessblob = make_tess_blob (blob, TRUE);
|
||||
//make dummy row
|
||||
make_tess_row(denorm, &tessrow);
|
||||
//classify
|
||||
result = AdaptiveClassifier (tessblob, NULL, &tessrow);
|
||||
free_blob(tessblob);
|
||||
//make our format
|
||||
convert_choice_list(result, ratings);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_bn_matcher
|
||||
*
|
||||
* Match a blob using the Tess Baseline Normalized (adaptive) matcher
|
||||
* only.
|
||||
**********************************************************************/
|
||||
|
||||
void tess_bn_matcher( //call tess
|
||||
PBLOB *pblob, //previous blob
|
||||
PBLOB *blob, //blob to match
|
||||
PBLOB *nblob, //next blob
|
||||
WERD *word, //word it came from
|
||||
DENORM *denorm, //de-normaliser
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
) {
|
||||
LIST result; //tess output
|
||||
TBLOB *tessblob; //converted blob
|
||||
TEXTROW tessrow; //dummy row
|
||||
|
||||
tess_bn_matching = TRUE; //turn it on
|
||||
tess_cn_matching = FALSE;
|
||||
//convert blob
|
||||
tessblob = make_tess_blob (blob, TRUE);
|
||||
//make dummy row
|
||||
make_tess_row(denorm, &tessrow);
|
||||
//classify
|
||||
result = AdaptiveClassifier (tessblob, NULL, &tessrow);
|
||||
free_blob(tessblob);
|
||||
//make our format
|
||||
convert_choice_list(result, ratings);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_default_matcher
|
||||
*
|
||||
* Match a blob using the default functionality of the Tess matcher.
|
||||
**********************************************************************/
|
||||
|
||||
void tess_default_matcher( //call tess
|
||||
PBLOB *pblob, //previous blob
|
||||
PBLOB *blob, //blob to match
|
||||
PBLOB *nblob, //next blob
|
||||
WERD *word, //word it came from
|
||||
DENORM *denorm, //de-normaliser
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
) {
|
||||
LIST result; //tess output
|
||||
TBLOB *tessblob; //converted blob
|
||||
TEXTROW tessrow; //dummy row
|
||||
|
||||
tess_bn_matching = FALSE; //turn it off
|
||||
tess_cn_matching = FALSE;
|
||||
//convert blob
|
||||
tessblob = make_tess_blob (blob, TRUE);
|
||||
//make dummy row
|
||||
make_tess_row(denorm, &tessrow);
|
||||
//classify
|
||||
result = AdaptiveClassifier (tessblob, NULL, &tessrow);
|
||||
free_blob(tessblob);
|
||||
//make our format
|
||||
convert_choice_list(result, ratings);
|
||||
}
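The three matcher entry points above differ only in how they set tess_cn_matching and tess_bn_matching before calling the classifier. That mapping can be sketched as a small table. MatcherMode, MatcherFlags and flags_for are hypothetical names introduced for illustration and are not part of the source.

```cpp
#include <cassert>

// Hypothetical names for the three entry points.
enum class MatcherMode { kDefault, kCharNormOnly, kBaselineNormOnly };

// The pair of switches each entry point sets before classifying.
struct MatcherFlags {
  bool cn_matching;  // mirrors tess_cn_matching
  bool bn_matching;  // mirrors tess_bn_matching
};

MatcherFlags flags_for(MatcherMode mode) {
  switch (mode) {
    case MatcherMode::kCharNormOnly:
      return {true, false};   // as in tess_cn_matcher
    case MatcherMode::kBaselineNormOnly:
      return {false, true};   // as in tess_bn_matcher
    case MatcherMode::kDefault:
      break;
  }
  return {false, false};      // as in tess_default_matcher
}
```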
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_training_tester
|
||||
*
|
||||
* Matcher tester function which actually trains tess.
|
||||
**********************************************************************/
|
||||
|
||||
void tess_training_tester( //call tess
|
||||
PBLOB *blob, //blob to match
|
||||
DENORM *denorm, //de-normaliser
|
||||
BOOL8 correct, //correctly segmented
|
||||
char *text, //correct text
|
||||
INT32 count, //chars in text
|
||||
BLOB_CHOICE_LIST *ratings //list of results
|
||||
) {
|
||||
TBLOB *tessblob; //converted blob
|
||||
TEXTROW tessrow; //dummy row
|
||||
|
||||
if (correct) {
|
||||
NormMethod = character; //Force char norm spc 30/11/93
|
||||
tess_bn_matching = FALSE; //turn it off
|
||||
tess_cn_matching = FALSE;
|
||||
//convert blob
|
||||
tessblob = make_tess_blob (blob, TRUE);
|
||||
//make dummy row
|
||||
make_tess_row(denorm, &tessrow);
|
||||
//learn it
|
||||
LearnBlob(tessblob, &tessrow, text, count);
|
||||
free_blob(tessblob);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_adapter
|
||||
*
|
||||
* Adapt to the word using the Tesseract mechanism.
|
||||
**********************************************************************/
|
||||
|
||||
void tess_adapter( //adapt to word
|
||||
WERD *word, //bln word
|
||||
DENORM *denorm, //de-normaliser
|
||||
const char *string, //string for word
|
||||
const char *raw_string, //before context
|
||||
const char *rejmap //reject map
|
||||
) {
|
||||
TWERD *tessword; //converted word
|
||||
static TEXTROW tessrow; //dummy row
|
||||
|
||||
//make dummy row
|
||||
make_tess_row(denorm, &tessrow);
|
||||
//make a word
|
||||
tessword = make_tess_word (word, &tessrow);
|
||||
AdaptToWord(tessword, &tessrow, string, raw_string, rejmap);
|
||||
//adapt to it
|
||||
delete_word(tessword); //free it
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* tess_add_doc_word
|
||||
*
|
||||
* Add the given word to the document dictionary
|
||||
**********************************************************************/
|
||||
|
||||
void tess_add_doc_word( //add to doc dictionary
|
||||
WERD_CHOICE *word_choice //after context
|
||||
) {
|
||||
A_CHOICE choice; //after context
|
||||
|
||||
choice.rating = word_choice->rating ();
|
||||
choice.certainty = word_choice->certainty ();
|
||||
choice.string = (char *) word_choice->string ().string ();
|
||||
add_document_word(&choice);
|
||||
}
|
110
ccmain/tessbox.h
Normal file
@ -0,0 +1,110 @@
|
||||
/**********************************************************************
|
||||
* File: tessbox.h (Formerly tessbox.h)
|
||||
* Description: Black boxed Tess for developing a resaljet.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 11:03:36 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TESSBOX_H
|
||||
#define TESSBOX_H
|
||||
|
||||
#include "ratngs.h"
|
||||
#include "notdll.h"
|
||||
|
||||
WERD_CHOICE *tess_segment_pass1( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
WERD_CHOICE *tess_segment_pass2( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
//recog one word
|
||||
WERD_CHOICE *correct_segment_pass2(WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
WERD_CHOICE *test_segment_pass2( //recog one word
|
||||
WERD *word, //bln word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
BOOL8 tess_acceptable_word( //test acceptability
|
||||
WERD_CHOICE *word_choice, //after context
|
||||
WERD_CHOICE *raw_choice //before context
|
||||
);
|
||||
BOOL8 tess_adaptable_word( //test adaptability
|
||||
WERD *word, //word to test
|
||||
WERD_CHOICE *word_choice, //after context
|
||||
WERD_CHOICE *raw_choice //before context
|
||||
);
|
||||
void tess_cn_matcher( //call tess
|
||||
PBLOB *pblob, //previous blob
|
||||
PBLOB *blob, //blob to match
|
||||
PBLOB *nblob, //next blob
|
||||
WERD *word, //word it came from
|
||||
DENORM *denorm, //de-normaliser
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
);
|
||||
void tess_bn_matcher( //call tess
|
||||
PBLOB *pblob, //previous blob
|
||||
PBLOB *blob, //blob to match
|
||||
PBLOB *nblob, //next blob
|
||||
WERD *word, //word it came from
|
||||
DENORM *denorm, //de-normaliser
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
);
|
||||
void tess_default_matcher( //call tess
|
||||
PBLOB *pblob, //previous blob
|
||||
PBLOB *blob, //blob to match
|
||||
PBLOB *nblob, //next blob
|
||||
WERD *word, //word it came from
|
||||
DENORM *denorm, //de-normaliser
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
);
|
||||
void tess_training_tester( //call tess
|
||||
PBLOB *blob, //blob to match
|
||||
DENORM *denorm, //de-normaliser
|
||||
BOOL8 correct, //correctly segmented
|
||||
char *text, //correct text
|
||||
INT32 count, //chars in text
|
||||
BLOB_CHOICE_LIST *ratings //list of results
|
||||
);
|
||||
void tess_adapter( //adapt to word
|
||||
WERD *word, //bln word
|
||||
DENORM *denorm, //de-normaliser
|
||||
const char *string, //string for word
|
||||
const char *raw_string, //before context
|
||||
const char *rejmap);
|
||||
void tess_add_doc_word( //add to doc dictionary
|
||||
WERD_CHOICE *word_choice //after context
|
||||
);
|
||||
#endif
|
321
ccmain/tessedit.cpp
Normal file
@ -0,0 +1,321 @@
|
||||
/**********************************************************************
|
||||
* File: tessedit.cpp (Formerly tessedit.c)
|
||||
* Description: Main program for merge of tess and editor.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Jan 07 15:21:46 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
//#include <osfcn.h>
|
||||
//#include <signal.h>
|
||||
//#include <time.h>
|
||||
//#include <unistd.h>
|
||||
#include "tfacep.h" //must be before main.h
|
||||
//#include "fileerr.h"
|
||||
#include "stderr.h"
|
||||
#include "basedir.h"
|
||||
#include "tessvars.h"
|
||||
//#include "debgwin.h"
|
||||
//#include "epapdest.h"
|
||||
#include "control.h"
|
||||
#include "imgs.h"
|
||||
#include "reject.h"
|
||||
#include "pageres.h"
|
||||
//#include "gpapdest.h"
|
||||
#include "mainblk.h"
|
||||
#include "nwmain.h"
|
||||
#include "pgedit.h"
|
||||
#include "ocrshell.h"
|
||||
#include "tprintf.h"
|
||||
//#include "ipeerr.h"
|
||||
//#include "restart.h"
|
||||
#include "tessedit.h"
|
||||
//#include "fontfind.h"
|
||||
#include "permute.h"
|
||||
#include "permdawg.h"
|
||||
#include "permnum.h"
|
||||
#include "stopper.h"
|
||||
#include "adaptmatch.h"
|
||||
#include "intmatcher.h"
|
||||
#include "chop.h"
|
||||
#include "globals.h"
|
||||
|
||||
//extern "C" {
|
||||
#include "callnet.h" //phils nn stuff
|
||||
//}
|
||||
#include "notdll.h"
|
||||
|
||||
#define VARDIR "configs/" /*variables files */
|
||||
//config under api
|
||||
#define API_CONFIG "configs/api_config"
|
||||
#define EXTERN
|
||||
|
||||
EXTERN BOOL_EVAR (tessedit_write_vars, FALSE, "Write all vars to file");
|
||||
EXTERN BOOL_VAR (tessedit_tweaking_tess_vars, FALSE,
|
||||
"Fiddle tess config values");
|
||||
|
||||
EXTERN INT_VAR (tweak_ReliableConfigThreshold, 2, "Tess VAR");
|
||||
|
||||
EXTERN double_VAR (tweak_garbage, 1.5, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_ok_word, 1.25, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_good_word, 1.1, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_freq_word, 1.0, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_ok_number, 1.4, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_good_number, 1.1, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_non_word, 1.25, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_CertaintyPerChar, -0.5, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_NonDictCertainty, -2.5, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_RejectCertaintyOffset, 1.0, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_GoodAdaptiveMatch, 0.125, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_GreatAdaptiveMatch, 0.10, "Tess VAR");
|
||||
EXTERN INT_VAR (tweak_AdaptProtoThresh, 230, "Tess VAR");
|
||||
EXTERN INT_VAR (tweak_AdaptFeatureThresh, 230, "Tess VAR");
|
||||
EXTERN INT_VAR (tweak_min_outline_points, 6, "Tess VAR");
|
||||
EXTERN INT_VAR (tweak_min_outline_area, 2000, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_good_split, 50.0, "Tess VAR");
|
||||
EXTERN double_VAR (tweak_ok_split, 100.0, "Tess VAR");
|
||||
|
||||
extern INT16 XOFFSET;
|
||||
extern INT16 YOFFSET;
|
||||
extern int NO_BLOCK;
|
||||
|
||||
//progress monitor
|
||||
ETEXT_DESC *global_monitor = NULL;
|
||||
|
||||
int init_tesseract(const char *arg0,
|
||||
const char *textbase,
|
||||
const char *configfile,
|
||||
int configc,
|
||||
const char *const *configv) {
|
||||
FILE *var_file;
|
||||
static char c_path[MAX_PATH]; //path for c code
|
||||
|
||||
// Set the basename, compute the data directory and read C++ configs.
|
||||
main_setup(arg0, textbase, configc, configv);
|
||||
debug_window_on.set_value (FALSE);
|
||||
|
||||
if (tessedit_write_vars) {
|
||||
var_file = fopen ("edited.cfg", "w");
|
||||
if (var_file != NULL) {
|
||||
print_variables(var_file);
|
||||
fclose(var_file);
|
||||
}
|
||||
}
|
||||
strcpy (c_path, datadir.string ());
|
||||
c_path[strlen (c_path) - strlen (m_data_sub_dir.string ())] = '\0';
|
||||
demodir = c_path;
|
||||
start_recog(configfile, textbase);
|
||||
|
||||
ReliableConfigThreshold = tweak_ReliableConfigThreshold;
|
||||
|
||||
set_tess_tweak_vars();
|
||||
|
||||
if (tessedit_use_nn) //phils nn stuff
|
||||
init_net();
|
||||
return 0; //Normal exit
|
||||
}
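init_tesseract derives the C code's base path by chopping the data sub-directory off the end of datadir, and it assumes the suffix is present and that the result fits in MAX_PATH. A more defensive version of that step could look like the sketch below. strip_suffix is a hypothetical helper, and the example paths in the usage note are illustrative values, not ones taken from the source.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns path with suffix removed from its end,
// or path unchanged when it does not end with suffix.
std::string strip_suffix(const std::string &path, const std::string &suffix) {
  if (path.size() >= suffix.size() &&
      path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0) {
    return path.substr(0, path.size() - suffix.size());
  }
  return path;  // suffix absent: leave the path untouched
}
```

For example, strip_suffix("/usr/share/tessdata/", "tessdata/") yields "/usr/share/", while a path shorter than the suffix is returned as-is instead of indexing before the start of the buffer.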
|
||||
|
||||
void end_tesseract() {
|
||||
end_recog();
|
||||
}
|
||||
|
||||
#ifdef _TIFFIO_
|
||||
void read_tiff_image(TIFF* tif, IMAGE* image) {
|
||||
tdata_t buf;
|
||||
uint32 image_width, image_height;
|
||||
uint16 photometric;
|
||||
short bpp;
|
||||
TIFFGetField(tif, TIFFTAG_IMAGEWIDTH, &image_width);
|
||||
TIFFGetField(tif, TIFFTAG_IMAGELENGTH, &image_height);
|
||||
TIFFGetField(tif, TIFFTAG_BITSPERSAMPLE, &bpp);
|
||||
TIFFGetField(tif, TIFFTAG_PHOTOMETRIC, &photometric);
|
||||
// Tesseract's internal representation is 0-is-black,
|
||||
// so if the photometric is 1 (min is black) then high-valued pixels
|
||||
// are 1 (white), otherwise they are 0 (black).
|
||||
UINT8 high_value = photometric == 1;
|
||||
image->create(image_width, image_height, bpp);
|
||||
IMAGELINE line;
|
||||
line.init(image_width);
|
||||
|
||||
buf = _TIFFmalloc(TIFFScanlineSize(tif));
|
||||
int bytes_per_line = (image_width*bpp + 7)/8;
|
||||
UINT8* dest_buf = image->get_buffer();
|
||||
// This will go badly wrong with one of the more exotic tiff formats,
|
||||
// but the majority will work OK.
|
||||
for (uint32 y = 0; y < image_height; ++y) {
|
||||
TIFFReadScanline(tif, buf, y);
|
||||
memcpy(dest_buf, buf, bytes_per_line);
|
||||
dest_buf += bytes_per_line;
|
||||
}
|
||||
if (high_value == 0)
|
||||
invert_image(image);
|
||||
_TIFFfree(buf);
|
||||
}
|
||||
#endif
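The row-size arithmetic and the photometric test in read_tiff_image can be tried in isolation, without libtiff. This is a sketch under stated assumptions: the function names are hypothetical, and the bitwise inversion is only a guess at what invert_image does for packed 1 bpp data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed for one packed scanline: round the bit count up to a byte,
// as in (image_width*bpp + 7)/8 above.
std::size_t packed_bytes_per_line(std::uint32_t width, unsigned bits_per_pixel) {
  return (static_cast<std::size_t>(width) * bits_per_pixel + 7) / 8;
}

// TIFF PhotometricInterpretation: 0 = WhiteIsZero, 1 = BlackIsZero.
// The reader above inverts whenever the value is not 1.
bool needs_inversion(std::uint16_t photometric) {
  return photometric != 1;
}

// Assumed behaviour of the inversion step: flip every bit in the buffer.
void invert_bits(std::vector<std::uint8_t> &pixels) {
  for (std::uint8_t &byte : pixels) {
    byte = static_cast<std::uint8_t>(~byte);
  }
}
```

A 10 pixel wide bilevel row therefore occupies 2 bytes, with the last 6 bits of the second byte unused.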
|
||||
|
||||
/* Define command type identifiers */
|
||||
|
||||
enum CMD_EVENTS
|
||||
{
|
||||
ACTION_1_CMD_EVENT,
|
||||
RECOG_WERDS,
|
||||
RECOG_PSEUDO,
|
||||
ACTION_2_CMD_EVENT
|
||||
};
|
||||
|
||||
/**********************************************************************
|
||||
* extend_menu()
|
||||
*
|
||||
* Function called by pgeditor to let you extend the command menu.
|
||||
* Items can be added to the "MODES" and "OTHER" menus. The modes_id_base
|
||||
* and other_id_base parameters are required to offset your command event ids
|
||||
* from those of pgeditor, and to let pgeditor know which commands are mode
|
||||
* changes and which are unmoded commands. (Sorry if you think these offsets
|
||||
* are a bit kludgy, the alternative would be to duplicate all the menu
|
||||
* constructor modes within pgeditor so that the offsets could be hidden.)
|
||||
*
|
||||
* Items for the "MODES" menu may only be simple menu items (just a name and
|
||||
* id). Items for the "OTHER" menu can be editable parameters or boolean
|
||||
* toggles. Refer to menu.h to see how to build different types.
|
||||
**********************************************************************/
|
||||
|
||||
void extend_menu( //handle for "MODES"
|
||||
RADIO_MENU *modes_menu,
|
||||
INT16 modes_id_base, //mode cmd ids offset
|
||||
NON_RADIO_MENU *other_menu, //handle for "OTHER"
|
||||
INT16 other_id_base //other cmd ids offset
|
||||
) {
|
||||
/* Example new mode */
|
||||
|
||||
modes_menu->add_child (new RADIO_MENU_LEAF ("Recog Words",
|
||||
modes_id_base + RECOG_WERDS));
|
||||
modes_menu->add_child (new RADIO_MENU_LEAF ("Recog Blobs",
|
||||
modes_id_base + RECOG_PSEUDO));
|
||||
|
||||
/* Example toggle
|
||||
|
||||
other_menu->add_child(
|
||||
new TOGGLE_MENU_LEAF( "Action 2", //Display string
|
||||
other_id_base + ACTION_2_CMD_EVENT, //offset command id
|
||||
FALSE ) ); //Initial value
|
||||
|
||||
Example text parm (commented out)
|
||||
|
||||
other_menu->add_child(
|
||||
new VARIABLE_MENU_LEAF( "Parm change", //Display string
|
||||
other_id_base + ACTION_3_CMD_EVENT, //offset command id
|
||||
"default value" ) ); //default value string
|
||||
*/
|
||||
}
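The id offset scheme described for extend_menu can be sketched as follows. The enum repeats the CMD_EVENTS values from this file. to_menu_id and to_local_id are hypothetical helpers, and the idea that the host subtracts the base again before calling the extension is an assumption inferred from extend_moded_commands comparing its argument directly against RECOG_WERDS.

```cpp
#include <cassert>
#include <cstdint>

// Local command ids, numbered from zero as in CMD_EVENTS.
enum CmdEvents {
  ACTION_1_CMD_EVENT,
  RECOG_WERDS,
  RECOG_PSEUDO,
  ACTION_2_CMD_EVENT
};

// Id registered with the host menu: host-chosen base plus the local id.
std::int16_t to_menu_id(std::int16_t id_base, CmdEvents local_id) {
  return static_cast<std::int16_t>(id_base + local_id);
}

// Assumed inverse applied by the host before dispatching to the extension.
std::int32_t to_local_id(std::int16_t id_base, std::int16_t menu_id) {
  return menu_id - id_base;
}
```

With a base of 100, "Recog Words" would be registered as 101 and delivered back to the extension as RECOG_WERDS.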
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* extend_moded_commands()
|
||||
*
|
||||
* Function called by pgeditor when the user is in one of the extended modes
|
||||
* defined by extend_menu() and the user has selected an area in the image
|
||||
* window.
|
||||
**********************************************************************/
|
||||
|
||||
void extend_moded_commands( //current mode
|
||||
INT32 mode,
|
||||
BOX selection_box //area selected
|
||||
) {
|
||||
char msg[MAX_CHARS + 1];
|
||||
|
||||
switch (mode) {
|
||||
case RECOG_WERDS:
|
||||
command_window->msg ("Recogging selected words");
|
||||
|
||||
/* This is how to apply a "word processor" function to each selected word */
|
||||
|
||||
process_selected_words(current_block_list,
|
||||
selection_box,
|
||||
&recog_interactive);
|
||||
break;
|
||||
case RECOG_PSEUDO:
|
||||
command_window->msg ("Recogging selected blobs");
|
||||
|
||||
/* This is how to recognise the selected blobs as a single pseudo word */
|
||||
|
||||
recog_pseudo_word(current_block_list, selection_box);
|
||||
break;
|
||||
default:
|
||||
sprintf (msg, "Unexpected extended mode " INT32FORMAT, mode);
|
||||
command_window->msg (msg);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* extend_unmoded_commands()
|
||||
*
|
||||
* Function called by pgeditor when the user has selected one of the unmoded
|
||||
* extended menu options.
|
||||
**********************************************************************/
|
||||
|
||||
void extend_unmoded_commands( //command event
|
||||
INT32 cmd_event,
|
||||
char *new_value //changed value if any
|
||||
) {
|
||||
char msg[MAX_CHARS + 1];
|
||||
|
||||
switch (cmd_event) {
|
||||
case ACTION_2_CMD_EVENT: //a toggle event
|
||||
if (new_value[0] == 'T')
|
||||
//Display message
|
||||
command_window->msg ("Extended Action 2 ON!!");
|
||||
else
|
||||
command_window->msg ("Extended Action 2 OFF!!");
|
||||
break;
|
||||
default:
|
||||
sprintf (msg, "Unrecognised extended command " INT32FORMAT " (%s)",
|
||||
cmd_event, new_value);
|
||||
command_window->msg (msg);
|
||||
break;
|
||||
}
|
||||
}
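The toggle branch above reads new_value[0] without checking the pointer. A guarded version of that test might look like this sketch. toggle_is_on is a hypothetical helper, and only the leading 'T' convention is taken from the source; the full strings a toggle sends are not shown here.

```cpp
#include <cassert>

// Hypothetical helper: true when the toggle value starts with 'T'.
// A null or empty string is treated as "off" instead of being dereferenced.
bool toggle_is_on(const char *new_value) {
  return new_value != nullptr && new_value[0] == 'T';
}
```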
|
||||
|
||||
|
||||
/*************************************************************************
|
||||
* set_tess_tweak_vars()
|
||||
* Set TESS vars from the tweak values - This is only really of use during search
|
||||
* of the space of tess configs - at other times the default values are used
|
||||
*
|
||||
*************************************************************************/
|
||||
void set_tess_tweak_vars() {
|
||||
if (tessedit_tweaking_tess_vars) {
|
||||
garbage = tweak_garbage;
|
||||
ok_word = tweak_ok_word;
|
||||
good_word = tweak_good_word;
|
||||
freq_word = tweak_freq_word;
|
||||
ok_number = tweak_ok_number;
|
||||
good_number = tweak_good_number;
|
||||
non_word = tweak_non_word;
|
||||
CertaintyPerChar = tweak_CertaintyPerChar;
|
||||
NonDictCertainty = tweak_NonDictCertainty;
|
||||
RejectCertaintyOffset = tweak_RejectCertaintyOffset;
|
||||
GoodAdaptiveMatch = tweak_GoodAdaptiveMatch;
|
||||
GreatAdaptiveMatch = tweak_GreatAdaptiveMatch;
|
||||
AdaptProtoThresh = tweak_AdaptProtoThresh;
|
||||
AdaptFeatureThresh = tweak_AdaptFeatureThresh;
|
||||
min_outline_points = tweak_min_outline_points;
|
||||
min_outline_area = tweak_min_outline_area;
|
||||
good_split = tweak_good_split;
|
||||
ok_split = tweak_ok_split;
|
||||
}
|
||||
// if (expiry_day * 24 * 60 * 60 < time(NULL))
|
||||
// err_exit();
|
||||
}
|
67
ccmain/tessedit.h
Normal file
@ -0,0 +1,67 @@
|
||||
/**********************************************************************
|
||||
* File: tessedit.h (Formerly tessedit.h)
|
||||
* Description: Main program for merge of tess and editor.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Jan 07 15:21:46 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TESSEDIT_H
|
||||
#define TESSEDIT_H
|
||||
|
||||
#include "tessclas.h"
|
||||
#include "ocrclass.h"
|
||||
#include "pgedit.h"
|
||||
#include "notdll.h"
|
||||
|
||||
// Includes libtiff if HAVE_LIBTIFF is defined
|
||||
#ifdef HAVE_LIBTIFF
|
||||
#ifdef GOOGLE3
|
||||
#include "third_party/tiff/tiffio.h"
|
||||
#else
|
||||
#include "tiffio.h"
|
||||
#endif
|
||||
#endif
|
||||
|
||||
//progress monitor
|
||||
extern ETEXT_DESC *global_monitor;
|
||||
|
||||
int init_tesseract(const char *arg0,
|
||||
const char *textbase,
|
||||
const char *configfile,
|
||||
int configc,
|
||||
const char *const *configv);
|
||||
void recognize_page(STRING& image_name);
|
||||
void end_tesseract();
|
||||
|
||||
#ifdef _TIFFIO_
|
||||
void read_tiff_image(TIFF* tif, IMAGE* image);
|
||||
#endif
|
||||
|
||||
//handle for "MODES"
|
||||
void extend_menu(RADIO_MENU *modes_menu,
|
||||
INT16 modes_id_base, //mode cmd ids offset
|
||||
NON_RADIO_MENU *other_menu, //handle for "OTHER"
|
||||
INT16 other_id_base //other cmd ids offset
|
||||
);
|
||||
//current mode
|
||||
void extend_moded_commands(INT32 mode,
|
||||
BOX selection_box //area selected
|
||||
);
|
||||
//command event
|
||||
void extend_unmoded_commands(INT32 cmd_event,
|
||||
char *new_value //changed value if any
|
||||
);
|
||||
void set_tess_tweak_vars();
|
||||
#endif
|
38
ccmain/tessembedded.h
Normal file
@ -0,0 +1,38 @@
|
||||
/**********************************************************************
|
||||
* File: tessembedded.h
|
||||
* Description: Access to initialization functions in embedded environment
|
||||
* Author: Marius Renn
|
||||
* Created: Sun Oct 21
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TESSEMBEDDED_H
|
||||
#define TESSEMBEDDED_H
|
||||
|
||||
#include "ocrblock.h"
|
||||
#include "varable.h"
|
||||
#include "notdll.h"
|
||||
|
||||
int init_tessembedded(const char *arg0,
|
||||
const char *textbase,
|
||||
const char *configfile,
|
||||
int configc,
|
||||
const char *const *configv);
|
||||
|
||||
void tessembedded_read_file(STRING &name,
|
||||
BLOCK_LIST *blocks);
|
||||
|
||||
void end_tessembedded();
|
||||
|
||||
#endif
|
311
ccmain/tesseractmain.cpp
Normal file
@ -0,0 +1,311 @@
|
||||
/**********************************************************************
|
||||
* File: tesseractmain.cpp (Formerly tessedit.c)
|
||||
* Description: Main program for merge of tess and editor.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Jan 07 15:21:46 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "applybox.h"
|
||||
#include "control.h"
|
||||
#include "tessvars.h"
|
||||
#include "tessedit.h"
|
||||
#include "baseapi.h"
|
||||
#include "pageres.h"
|
||||
#include "imgs.h"
|
||||
#include "varabled.h"
|
||||
#include "tprintf.h"
|
||||
#include "tesseractmain.h"
|
||||
#include "stderr.h"
|
||||
#include "notdll.h"
|
||||
#include "mainblk.h"
|
||||
#include "globals.h"
|
||||
#include "tfacep.h"
|
||||
#include "callnet.h"
|
||||
|
||||
#define VARDIR "configs/" /*variables files */
|
||||
//config under api
|
||||
#define API_CONFIG "configs/api_config"
|
||||
#define EXTERN
|
||||
|
||||
EXTERN BOOL_VAR (tessedit_read_image, TRUE, "Ensure the image is read");
|
||||
EXTERN BOOL_VAR (tessedit_write_images, FALSE,
|
||||
"Capture the image from the IPE");
|
||||
EXTERN BOOL_VAR (tessedit_debug_to_screen, FALSE, "Dont use debug file");
|
||||
|
||||
extern INT16 XOFFSET;
|
||||
extern INT16 YOFFSET;
|
||||
extern int NO_BLOCK;
|
||||
|
||||
const ERRCODE USAGE = "Usage";
|
||||
char szAppName[] = "Tessedit"; //app name
|
||||
|
||||
/**********************************************************************
|
||||
* main()
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
int main(int argc, char **argv) {
|
||||
STRING outfile; //output file
|
||||
|
||||
if (argc < 3) {
|
||||
USAGE.error (argv[0], EXIT,
|
||||
"%s imagename outputbase [configfile [[+|-]varfile]...]\n", argv[0]);
|
||||
}
|
||||
|
||||
if (argc == 3)
|
||||
TessBaseAPI::Init(argv[0], argv[1], NULL, false, 0, argv + 2);
|
||||
else
|
||||
TessBaseAPI::Init(argv[0], argv[1], argv[3], false, argc - 4, argv + 4);
|
||||
|
||||
tprintf ("Tesseract Open Source OCR Engine\n");
|
||||
|
||||
IMAGE image;
|
||||
#ifdef _TIFFIO_
|
||||
TIFF* tif = TIFFOpen(argv[1], "r");
|
||||
if (tif) {
|
||||
read_tiff_image(tif, &image);
|
||||
TIFFClose(tif);
|
||||
} else {
|
||||
READFAILED.error (argv[0], EXIT, argv[1]);
|
||||
}
|
||||
#else
|
||||
if (image.read_header(argv[1]) < 0)
|
||||
READFAILED.error (argv[0], EXIT, argv[1]);
|
||||
if (image.read(image.get_ysize ()) < 0) {
|
||||
MEMORY_OUT.error(argv[0], EXIT, "Read of image %s",
|
||||
argv[1]);
|
||||
}
|
||||
#endif
|
||||
int bytes_per_line = check_legal_image_size(image.get_xsize(),
|
||||
image.get_ysize(),
|
||||
image.get_bpp());
|
||||
char* text = TessBaseAPI::TesseractRect(image.get_buffer(), image.get_bpp()/8,
|
||||
bytes_per_line, 0, 0,
|
||||
image.get_xsize(), image.get_ysize());
|
||||
outfile = argv[2];
|
||||
outfile += ".txt";
|
||||
FILE* fp = fopen(outfile.string(), "w");
|
||||
if (fp != NULL) {
|
||||
fwrite(text, 1, strlen(text), fp);
|
||||
fclose(fp);
|
||||
}
|
||||
delete [] text;
|
||||
TessBaseAPI::End();
|
||||
|
||||
return 0; //Normal exit
|
||||
}
|
||||
#else
|
||||
|
||||
int main(int argc, char **argv) {
|
||||
UINT16 lang; //language
|
||||
STRING pagefile; //input file
|
||||
|
||||
if (argc < 4) {
|
||||
USAGE.error (argv[0], EXIT,
|
||||
"%s imagename outputbase configfile [[+|-]varfile]...\n", argv[0]);
|
||||
}
|
||||
|
||||
time_t t_start = time(NULL);
|
||||
|
||||
init_tessembedded (argv[0], argv[2], argv[3], argc - 4, argv + 4);
|
||||
|
||||
tprintf ("Tesseract Open Source OCR Engine (graphics disabled)\n");
|
||||
|
||||
if (tessedit_read_image) {
|
||||
#ifdef _TIFFIO_
|
||||
TIFF* tif = TIFFOpen(argv[1], "r");
|
||||
if (tif) {
|
||||
read_tiff_image(tif);
|
||||
TIFFClose(tif);
|
||||
} else
|
||||
READFAILED.error (argv[0], EXIT, argv[1]);
|
||||
|
||||
#else
|
||||
if (page_image.read_header (argv[1]) < 0)
|
||||
READFAILED.error (argv[0], EXIT, argv[1]);
|
||||
if (page_image.read (page_image.get_ysize ()) < 0) {
|
||||
MEMORY_OUT.error (argv[0], EXIT, "Read of image %s",
|
||||
argv[1]);
|
||||
}
|
||||
#endif
|
||||
}
|
||||
|
||||
pagefile = argv[1];
|
||||
|
||||
BLOCK_LIST current_block_list;
|
||||
tessembedded_read_file(pagefile, &current_block_list);
|
||||
tprintf ("Done reading files.\n");
|
||||
|
||||
PAGE_RES page_res(&current_block_list);
|
||||
|
||||
recog_all_words(&page_res, NULL);
|
||||
|
||||
current_block_list.clear();
|
||||
ResetAdaptiveClassifier();
|
||||
|
||||
time_t t_end = time(NULL);
|
||||
double secs = difftime(t_end, t_start);
|
||||
tprintf ("Done. Number of seconds: %d\n", (int)secs);
|
||||
return 0; //Normal exit
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
int initialized = 0;
|
||||
|
||||
#ifdef __MSW32__
|
||||
/**********************************************************************
|
||||
* WinMain
|
||||
*
|
||||
* Main function for a windows program.
|
||||
**********************************************************************/
|
||||
|
||||
int WINAPI WinMain( //main for windows //command line
|
||||
HINSTANCE hInstance,
|
||||
HINSTANCE hPrevInstance,
|
||||
LPSTR lpszCmdLine,
|
||||
int nCmdShow) {
|
||||
WNDCLASS wc;
|
||||
HWND hwnd;
|
||||
MSG msg;
|
||||
|
||||
char **argv;
|
||||
char *argsin[2];
|
||||
int argc;
|
||||
int exit_code;
|
||||
|
||||
wc.style = CS_NOCLOSE | CS_OWNDC;
|
||||
wc.lpfnWndProc = (WNDPROC) WndProc;
|
||||
wc.cbClsExtra = 0;
|
||||
wc.cbWndExtra = 0;
|
||||
wc.hInstance = hInstance;
|
||||
wc.hIcon = NULL; //LoadIcon (NULL, IDI_APPLICATION);
|
||||
wc.hCursor = NULL; //LoadCursor (NULL, IDC_ARROW);
|
||||
wc.hbrBackground = (HBRUSH) (COLOR_WINDOW + 1);
|
||||
wc.lpszMenuName = NULL;
|
||||
wc.lpszClassName = szAppName;
|
||||
|
||||
RegisterClass(&wc);
|
||||
|
||||
hwnd = CreateWindow (szAppName, szAppName,
|
||||
WS_OVERLAPPEDWINDOW | WS_DISABLED,
|
||||
CW_USEDEFAULT, CW_USEDEFAULT, CW_USEDEFAULT,
|
||||
CW_USEDEFAULT, HWND_DESKTOP, NULL, hInstance, NULL);
|
||||
|
||||
argsin[0] = strdup (szAppName);
|
||||
argsin[1] = strdup (lpszCmdLine);
|
||||
/*Allocate memory for the args. The number of args can never exceed half*/
/*the total number of characters in the arguments, plus one.*/
|
||||
argv =
|
||||
(char **) malloc (((strlen (argsin[0]) + strlen (argsin[1])) / 2 + 1) *
|
||||
sizeof (char *));
|
||||
|
||||
/*now construct argv as it should be for C.*/
|
||||
argc = parse_args (2, argsin, argv);
|
||||
|
||||
// ShowWindow (hwnd, nCmdShow);
|
||||
// UpdateWindow (hwnd);
|
||||
|
||||
if (initialized) {
|
||||
exit_code = main (argc, argv);
|
||||
free (argsin[0]);
|
||||
free (argsin[1]);
|
||||
free(argv);
|
||||
return exit_code;
|
||||
}
|
||||
while (GetMessage (&msg, NULL, 0, 0)) {
|
||||
TranslateMessage(&msg);
|
||||
DispatchMessage(&msg);
|
||||
if (initialized) {
|
||||
exit_code = main (argc, argv);
|
||||
break;
|
||||
}
|
||||
else
|
||||
exit_code = msg.wParam;
|
||||
}
|
||||
free (argsin[0]);
|
||||
free (argsin[1]);
|
||||
free(argv);
|
||||
return exit_code;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* WndProc
|
||||
*
|
||||
* Function to respond to messages.
|
||||
**********************************************************************/
|
||||
|
||||
LONG WINAPI WndProc( //message handler
|
||||
HWND hwnd, //window with message
|
||||
UINT msg, //message type
|
||||
WPARAM wParam,
|
||||
LPARAM lParam) {
|
||||
HDC hdc;
|
||||
|
||||
if (msg == WM_CREATE) {
|
||||
//
|
||||
// Create a rendering context.
|
||||
//
|
||||
hdc = GetDC (hwnd);
|
||||
ReleaseDC(hwnd, hdc);
|
||||
initialized = 1;
|
||||
return 0;
|
||||
}
|
||||
return DefWindowProc (hwnd, msg, wParam, lParam);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* parse_args
|
||||
*
|
||||
* Turn a list of args into a new list of args with each separate
|
||||
* whitespace spaced string being an arg.
|
||||
**********************************************************************/
|
||||
|
||||
int
|
||||
parse_args ( /*refine arg list */
|
||||
int argc, /*no of input args */
|
||||
char *argv[], /*input args */
|
||||
char *arglist[] /*output args */
|
||||
) {
|
||||
int argcount; /*converted argc */
|
||||
char *testchar; /*char in option string */
|
||||
int arg; /*current argument */
|
||||
|
||||
argcount = 0; /*no of options */
|
||||
for (arg = 0; arg < argc; arg++) {
|
||||
testchar = argv[arg]; /*start of arg */
|
||||
do {
|
||||
while (*testchar
|
||||
&& (*testchar == ' ' || *testchar == '\n'
|
||||
|| *testchar == '\t'))
|
||||
testchar++; /*skip white space */
|
||||
if (*testchar) {
|
||||
/*new arg */
|
||||
arglist[argcount++] = testchar;
|
||||
/*skip to white space */
|
||||
for (testchar++; *testchar && *testchar != ' ' && *testchar != '\n' && *testchar != '\t'; testchar++);
|
||||
if (*testchar)
|
||||
*testchar++ = '\0'; /*turn to separate args */
|
||||
}
|
||||
}
|
||||
while (*testchar);
|
||||
}
|
||||
return argcount; /*new number of args */
|
||||
}
|
||||
#endif
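The in-place splitting that parse_args performs can be sketched in isolation as below. This is a minimal illustration, not part of the Tesseract sources: the name split_whitespace and the use of std::vector are assumptions made for the example. It treats space, tab and newline as separators, as parse_args does, and overwrites the first separator after each token with '\0' so that every token becomes its own C string.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of Tesseract: splits `line` in place on
// spaces, tabs and newlines, the same separators parse_args tests for.
// The start of each token is appended to `out`, and the separator that
// follows a token is overwritten with '\0'. Returns the token count.
static int split_whitespace(char *line, std::vector<char *> *out) {
  assert(line != nullptr && out != nullptr);
  char *p = line;
  for (;;) {
    while (*p == ' ' || *p == '\n' || *p == '\t')
      p++;                       // skip white space before a token
    if (*p == '\0')
      break;                     // end of input, no more tokens
    out->push_back(p);           // start of a new token
    while (*p != '\0' && *p != ' ' && *p != '\n' && *p != '\t')
      p++;                       // advance to the end of the token
    if (*p != '\0')
      *p++ = '\0';               // terminate the token and move past it
  }
  return static_cast<int>(out->size());
}
```

Because the tokens point into the caller's buffer, the buffer must be writable and must outlive the token list, which is why WinMain frees argsin only after main has returned.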
|
58
ccmain/tesseractmain.h
Normal file
@ -0,0 +1,58 @@
|
||||
/**********************************************************************
|
||||
* File: tesseractmain.h (Formerly tessedit.h)
|
||||
* Description: Main program for merge of tess and editor.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Jan 07 15:21:46 GMT 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TESSERACTMAIN_H
|
||||
#define TESSERACTMAIN_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "tessclas.h"
|
||||
#include "notdll.h"
|
||||
#include "tessembedded.h"
|
||||
|
||||
extern BOOL_VAR_H (tessedit_read_image, TRUE, "Ensure the image is read");
|
||||
INT32 api_main( //run from api
|
||||
const char *arg0, //program name
|
||||
UINT16 lang //language
|
||||
);
|
||||
INT16 setup_info( //setup dummy engine info
|
||||
UINT16 lang, //user language
|
||||
const char *name, //of engine
|
||||
const char *version //of engine
|
||||
);
|
||||
INT16 read_image( //read dummy image info
|
||||
IMAGE *im_out //output image
|
||||
);
|
||||
#ifdef __MSW32__
|
||||
int WINAPI WinMain( //main for windows //command line
|
||||
HINSTANCE hInstance,
|
||||
HINSTANCE hPrevInstance,
|
||||
LPSTR lpszCmdLine,
|
||||
int nCmdShow);
|
||||
LONG WINAPI WndProc( //message handler
|
||||
HWND hwnd, //window with message
|
||||
UINT msg, //message type
|
||||
WPARAM wParam,
|
||||
LPARAM lParam);
|
||||
int parse_args ( /*refine arg list */
|
||||
int argc, /*no of input args */
|
||||
char *argv[], /*input args */
|
||||
char *arglist[] /*output args */
|
||||
);
|
||||
#endif
|
||||
#endif
|
38
ccmain/tessvars.cpp
Normal file
@ -0,0 +1,38 @@
|
||||
/**********************************************************************
|
||||
* File: tessvars.cpp (Formerly tessvars.c)
|
||||
* Description: Variables and other globals for tessedit.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Apr 13 13:13:23 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "tessvars.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN INT_VAR (tessedit_adapt_kludge, 0,
|
||||
"Use acceptable result or dangambigs");
|
||||
EXTERN BOOL_VAR (interactive_mode, FALSE, "Run interactively?");
|
||||
EXTERN BOOL_VAR (edit_variables, FALSE, "Variables Editor Window?");
|
||||
// xiaofan EXTERN STRING_VAR(file_type,".bl","Filename extension");
|
||||
EXTERN STRING_VAR (file_type, ".tif", "Filename extension");
|
||||
INT_VAR (testedit_match_debug, 0, "Integer match debug ctrl");
|
||||
EXTERN INT_VAR (tessedit_dangambigs_chop, FALSE,
|
||||
"Use DangAmbigs to direct chop");
|
||||
EXTERN INT_VAR (tessedit_dangambigs_assoc, FALSE,
|
||||
"Use DangAmbigs to direct assoc");
|
||||
|
||||
EXTERN IMAGE page_image; //image of page
|
||||
EXTERN FILE *debug_fp; //write debug stuff here
|
48
ccmain/tessvars.h
Normal file
@ -0,0 +1,48 @@
|
||||
/**********************************************************************
|
||||
* File: tessvars.h (Formerly tessvars.h)
|
||||
* Description: Variables and other globals for tessedit.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Apr 13 13:13:23 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TESSVARS_H
|
||||
#define TESSVARS_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "img.h"
|
||||
#include "tordmain.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern INT_VAR_H (tessedit_adapt_kludge, 0,
|
||||
"Use acceptable result or dangambigs");
|
||||
extern BOOL_VAR_H (interactive_mode, FALSE, "Run interactively?");
|
||||
extern BOOL_VAR_H (edit_variables, FALSE, "Variables Editor Window?");
|
||||
//xiaofan extern STRING_VAR_H(file_type,".bl","Filename extension");
|
||||
extern STRING_VAR_H (file_type, ".tif", "Filename extension");
|
||||
extern INT_VAR_H (tessedit_truncate_wordchoice_log, 10,
|
||||
"Max words to keep in list");
|
||||
extern INT_VAR_H (testedit_match_debug, 0, "Integer match debug ctrl");
|
||||
extern INT_VAR_H (tessedit_truncate_chopper, 1,
|
||||
"Shorten chopper seam search");
|
||||
extern INT_VAR_H (tessedit_fix_sideways_chops, 1,
|
||||
"Fix sideways chop problem");
|
||||
extern INT_VAR_H (tessedit_dangambigs_chop, FALSE,
|
||||
"Use DangAmbigs to direct chop");
|
||||
extern INT_VAR_H (tessedit_dangambigs_assoc, FALSE,
|
||||
"Use DangAmbigs to direct assoc");
|
||||
|
||||
extern IMAGE page_image; //image of page
|
||||
extern FILE *debug_fp; //write debug stuff here
|
||||
#endif
|
121
ccmain/tfacep.h
Normal file
@ -0,0 +1,121 @@
|
||||
/**********************************************************************
|
||||
* File: tfacep.h (Formerly tfacep.h)
|
||||
* Description: Declarations of C functions and C owned data.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Apr 27 12:51:28 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TFACEP_H
|
||||
#define TFACEP_H
|
||||
|
||||
#include "hosthplb.h"
|
||||
#include "tessclas.h"
|
||||
#include "tessarray.h"
|
||||
#include "tstruct.h"
|
||||
#include "notdll.h"
|
||||
#include "choices.h"
|
||||
#include "oldlist.h"
|
||||
#include "hyphen.h"
|
||||
#include "tface.h"
|
||||
#include "permute.h"
|
||||
#include "adaptmatch.h"
|
||||
#include "blobclass.h"
|
||||
#include "stopper.h"
|
||||
#include "associate.h"
|
||||
#include "chop.h"
|
||||
#include "expandblob.h"
|
||||
#include "tordvars.h"
|
||||
#include "metrics.h"
|
||||
#include "tface.h"
|
||||
#include "badwords.h"
|
||||
#include "structures.h"
|
||||
|
||||
#define BLOB_MATCHING_ON
|
||||
typedef void (*TESS_TESTER) (TBLOB *, BOOL8, char *, INT32, LIST);
|
||||
typedef LIST (*TESS_MATCHER) (TBLOB *, TBLOB *, TBLOB *, void *, TEXTROW *);
|
||||
|
||||
extern "C"
|
||||
{
|
||||
/*
|
||||
int start_recog( //Real main in C
|
||||
int argc,
|
||||
char *argv[]);
|
||||
void program_editup2( //afterforking part
|
||||
int argc,
|
||||
char** argv);
|
||||
|
||||
int end_recog( //Real main in C
|
||||
int argc,
|
||||
char *argv[]);
|
||||
void set_interactive_pass();
|
||||
void set_pass1();
|
||||
void set_pass2();
|
||||
//ARRAY cc_recog(TWERD*,TESS_CHOICE*,TESS_CHOICE*,TESS_TESTER,
|
||||
// TESS_TESTER);*/
|
||||
//void wo_learn_blob(TBLOB*,TEXTROW*,char*,INT32);
|
||||
//LIST AdaptiveClassifier(TBLOB*,TBLOB*,TEXTROW*);
|
||||
//void LearnBlob(TBLOB*,TEXTROW*,char*,INT32);
|
||||
//TWERD *newword();
|
||||
//TBLOB *newblob();
|
||||
//TESSLINE *newoutline();
|
||||
//EDGEPT *newedgept();
|
||||
//void oldedgept(EDGEPT*);
|
||||
//void destroy_nodes(void*,void (*)(void*));
|
||||
//TESS_LIST *append_choice(TESS_LIST*,char*,double,double,char);
|
||||
//void fix_quotes (char*);
|
||||
//void record_certainty(double,int);
|
||||
//int AcceptableResult(A_CHOICE*,A_CHOICE*);
|
||||
//int AdaptableWord(TWERD*,const char*,const char*);
|
||||
//void delete_word(TWERD*);
|
||||
//void free_blob(TBLOB*);
|
||||
//void add_document_word(A_CHOICE*);
|
||||
//void AdaptToWord(TWERD*,TEXTROW*,const char*,const char*,const char*);
|
||||
//void SaveBadWord(const char*,double);
|
||||
//void free_choice(TESS_CHOICE*);
|
||||
//TWERD *newword();
|
||||
//TBLOB *newblob();
|
||||
//void free_blob( //free a blob
|
||||
// TBLOB *blob); //blob to free
|
||||
|
||||
//int dict_word( const char* );
|
||||
|
||||
//extern int tess_cn_matching;
|
||||
//extern int tess_bn_matching;
|
||||
//extern int last_word_on_line;
|
||||
extern TEXTROW normalized_row;
|
||||
//extern TESS_MATCHER blob_matchers[];
|
||||
//extern FILE *rawfile;
|
||||
//extern FILE *textfile;
|
||||
//extern int character_count;
|
||||
//extern int word_count;
|
||||
//extern int enable_assoc;
|
||||
//extern int chop_enable;
|
||||
//extern int permute_only_top;
|
||||
extern int display_ratings;
|
||||
|
||||
};
|
||||
|
||||
#if 0
|
||||
#define strsave(s) \
|
||||
((s) ? \
|
||||
((char*) strcpy ((char*)alloc_string (strlen(s)+1), s)) : \
|
||||
(NULL))
|
||||
#endif
|
||||
|
||||
#define BOLD_ON "&dB(s3B"
|
||||
#define BOLD_OFF "&d@(s0B"
|
||||
#define UNDERLINE_ON "&dD"
|
||||
#define UNDERLINE_OFF "&d@"
|
||||
#endif
|
411
ccmain/tfacepp.cpp
Normal file
@ -0,0 +1,411 @@
|
||||
/**********************************************************************
|
||||
* File: tfacepp.cpp (Formerly tface++.c)
|
||||
* Description: C++ side of the C/C++ Tess/Editor interface.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 15:39:23 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#endif
|
||||
#include "errcode.h"
|
||||
#include "tessarray.h"
|
||||
//#include "fxtop.h"
|
||||
#include "werd.h"
|
||||
#include "tfacep.h"
|
||||
#include "tstruct.h"
|
||||
#include "tfacepp.h"
|
||||
#include "tessvars.h"
|
||||
#include "reject.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN BOOL_VAR (tessedit_override_permuter, TRUE, "According to dict_word");
|
||||
|
||||
static POLY_MATCHER tess_matcher;//current matcher
|
||||
static POLY_TESTER tess_tester; //current tester
|
||||
static POLY_TESTER tess_trainer; //current trainer
|
||||
static DENORM *tess_denorm; //current denorm
|
||||
static WERD *tess_word; //current word
|
||||
|
||||
#define MAX_UNDIVIDED_LENGTH 24
|
||||
/**********************************************************************
|
||||
* recog_word
|
||||
*
|
||||
* Convert the word to tess form and pass it to the tess segmenter.
|
||||
* Convert the output back to editor form.
|
||||
**********************************************************************/
|
||||
WERD_CHOICE *recog_word( //recog one word
|
||||
WERD *word, //word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
POLY_TESTER trainer, //trainer function
|
||||
BOOL8 testing, //true if answer driven
|
||||
WERD_CHOICE *&raw_choice, //raw result //list of blob lists
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices,
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
WERD_CHOICE *word_choice;
|
||||
UINT8 perm_type;
|
||||
UINT8 real_dict_perm_type;
|
||||
|
||||
if (word->blob_list ()->empty ()) {
|
||||
word_choice = new WERD_CHOICE ("", 10.0f, -1.0f, TOP_CHOICE_PERM);
|
||||
raw_choice = new WERD_CHOICE ("", 10.0f, -1.0f, TOP_CHOICE_PERM);
|
||||
outword = word->poly_copy (denorm->row ()->x_height ());
|
||||
}
|
||||
else
|
||||
word_choice = recog_word_recursive (word, denorm, matcher, tester,
|
||||
trainer, testing, raw_choice,
|
||||
blob_choices, outword);
|
||||
if ((word_choice->string ().length () !=
|
||||
outword->blob_list ()->length ()) ||
|
||||
(word_choice->string ().length () != blob_choices->length ())) {
|
||||
tprintf
|
||||
("recog_word ASSERT FAIL String:\"%s\"; Strlen=%d; #Blobs=%d; #Choices=%d\n",
|
||||
word_choice->string ().string (), word_choice->string ().length (),
|
||||
outword->blob_list ()->length (), blob_choices->length ());
|
||||
}
|
||||
ASSERT_HOST (word_choice->string ().length () ==
|
||||
outword->blob_list ()->length ());
|
||||
ASSERT_HOST (word_choice->string ().length () == blob_choices->length ());
|
||||
|
||||
/* Copy any reject blobs into the outword */
|
||||
outword->rej_blob_list ()->deep_copy (word->rej_blob_list ());
|
||||
|
||||
if (tessedit_override_permuter) {
|
||||
/* Override the permuter type if a straight dictionary check disagrees. */
|
||||
perm_type = word_choice->permuter ();
|
||||
if ((perm_type != SYSTEM_DAWG_PERM) &&
|
||||
(perm_type != FREQ_DAWG_PERM) && (perm_type != USER_DAWG_PERM)) {
|
||||
real_dict_perm_type = dict_word (word_choice->string ().string ());
|
||||
if (((real_dict_perm_type == SYSTEM_DAWG_PERM) ||
|
||||
(real_dict_perm_type == FREQ_DAWG_PERM) ||
|
||||
(real_dict_perm_type == USER_DAWG_PERM)) &&
|
||||
(alpha_count (word_choice->string ().string ()) > 0))
|
||||
word_choice->set_permuter (real_dict_perm_type);
|
||||
//Use dict perm
|
||||
}
|
||||
if (tessedit_rejection_debug && perm_type != word_choice->permuter ()) {
|
||||
tprintf ("Permuter Type Flipped from %d to %d\n",
|
||||
perm_type, word_choice->permuter ());
|
||||
}
|
||||
}
|
||||
assert ((word_choice == NULL) == (raw_choice == NULL));
|
||||
return word_choice;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* recog_word_recursive
|
||||
*
|
||||
* Convert the word to tess form and pass it to the tess segmenter.
* Convert the output back to editor form.
* Words with more than MAX_UNDIVIDED_LENGTH blobs go to split_and_recog_word.
|
||||
**********************************************************************/
|
||||
|
||||
WERD_CHOICE *recog_word_recursive( //recog one word
|
||||
WERD *word, //word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
POLY_TESTER trainer, //trainer function
|
||||
BOOL8 testing, //true if answer driven
|
||||
WERD_CHOICE *&raw_choice, //raw result //list of blob lists
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices,
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
INT32 initial_blob_choice_len;
|
||||
INT32 word_length; //no of blobs
|
||||
STRING word_string; //converted from tess
|
||||
ARRAY tess_ratings; //tess results
|
||||
A_CHOICE tess_choice; //best word
|
||||
A_CHOICE tess_raw; //raw result
|
||||
TWERD *tessword; //tess format
|
||||
BLOB_CHOICE_LIST *choice_list; //fake list
|
||||
//iterator
|
||||
BLOB_CHOICE_LIST_C_IT choice_it;
|
||||
|
||||
tess_matcher = matcher; //install matcher
|
||||
tess_tester = testing ? tester : NULL;
|
||||
tess_trainer = testing ? trainer : NULL;
|
||||
tess_denorm = denorm;
|
||||
tess_word = word;
|
||||
// blob_matchers[1]=call_matcher;
|
||||
if (word->blob_list ()->length () > MAX_UNDIVIDED_LENGTH) {
|
||||
return split_and_recog_word (word, denorm, matcher, tester, trainer,
|
||||
testing, raw_choice, blob_choices,
|
||||
outword);
|
||||
}
|
||||
else {
|
||||
if (word->flag (W_EOL))
|
||||
last_word_on_line = TRUE;
|
||||
else
|
||||
last_word_on_line = FALSE;
|
||||
initial_blob_choice_len = blob_choices->length ();
|
||||
tessword = make_tess_word (word, NULL);
|
||||
tess_ratings = cc_recog (tessword, &tess_choice, &tess_raw,
|
||||
testing
|
||||
&& tester != NULL /* ? call_tester : NULL */ ,
|
||||
testing
|
||||
&& trainer !=
|
||||
NULL /* ? call_train_tester : NULL */ );
|
||||
//convert word
|
||||
outword = make_ed_word (tessword, word);
|
||||
if (outword == NULL) {
|
||||
outword = word->poly_copy (denorm->row ()->x_height ());
|
||||
}
|
||||
delete_word(tessword); //get rid of it
|
||||
//no of blobs
|
||||
word_length = outword->blob_list ()->length ();
|
||||
//convert all ratings
|
||||
convert_choice_lists(tess_ratings, blob_choices);
|
||||
//copy string
|
||||
word_string = tess_raw.string;
|
||||
while (word_string.length () < word_length)
|
||||
word_string += " "; //pad with blanks
|
||||
raw_choice = new WERD_CHOICE (word_string.string (),
|
||||
tess_raw.rating, tess_raw.certainty,
|
||||
tess_raw.permuter);
|
||||
word_string = tess_choice.string;
|
||||
if (word_string.length () > word_length) {
|
||||
tprintf ("recog_word: Discarded long string \"%s\"\n",
|
||||
word_string.string ());
|
||||
word_string = NULL; //should never happen
|
||||
}
|
||||
if (blob_choices->length () - initial_blob_choice_len != word_length) {
|
||||
word_string = NULL; //force rejection
|
||||
tprintf ("recog_word: Choices list len:%d; blob lists len:%d\n",
|
||||
blob_choices->length (), word_length);
|
||||
//list of lists
|
||||
choice_it.set_to_list (blob_choices);
|
||||
while (blob_choices->length () - initial_blob_choice_len <
|
||||
word_length) {
|
||||
//get fake one
|
||||
choice_list = new BLOB_CHOICE_LIST;
|
||||
//add to list
|
||||
choice_it.add_to_end (choice_list);
|
||||
tprintf ("recog_word: Added dummy choice list\n");
|
||||
}
|
||||
while (blob_choices->length () - initial_blob_choice_len >
|
||||
word_length) {
|
||||
choice_it.move_to_last ();
|
||||
//should never happen
|
||||
delete choice_it.extract ();
|
||||
tprintf ("recog_word: Deleted choice list\n");
|
||||
}
|
||||
}
|
||||
while (word_string.length () < word_length)
|
||||
word_string += " "; //pad with blanks
|
||||
|
||||
assert (raw_choice != NULL);
|
||||
if (tess_choice.string)
|
||||
strfree(tess_choice.string);
|
||||
if (tess_raw.string)
|
||||
strfree(tess_raw.string);
|
||||
return new WERD_CHOICE (word_string.string (),
|
||||
tess_choice.rating, tess_choice.certainty,
|
||||
tess_choice.permuter);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* split_and_recog_word
|
||||
*
|
||||
* Split a long word at the widest gap between adjacent blobs, recognise
* each half with recog_word_recursive, and join the two results.
|
||||
**********************************************************************/
|
||||
|
||||
WERD_CHOICE *split_and_recog_word( //recog one word
|
||||
WERD *word, //word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
POLY_TESTER trainer, //trainer function
|
||||
BOOL8 testing, //true if answer driven
|
||||
WERD_CHOICE *&raw_choice, //raw result //list of blob lists
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices,
|
||||
WERD *&outword //bln word output
|
||||
) {
|
||||
// INT32 outword1_len;
|
||||
// INT32 outword2_len;
|
||||
WERD *first_word; //poly copy of word
|
||||
WERD *second_word; //fabricated word
|
||||
WERD *outword2; //2nd output word
|
||||
PBLOB *blob;
|
||||
WERD_CHOICE *result; //return value
|
||||
WERD_CHOICE *result2; //output of 2nd word
|
||||
WERD_CHOICE *raw_choice2; //raw version of 2nd
|
||||
float gap; //blob gap
|
||||
float bestgap; //biggest gap
|
||||
PBLOB_LIST new_blobs; //list of gathered blobs
|
||||
PBLOB_IT blob_it;
|
||||
//iterator
|
||||
PBLOB_IT new_blob_it = &new_blobs;
|
||||
|
||||
first_word = word->poly_copy (denorm->row ()->x_height ());
|
||||
blob_it.set_to_list (first_word->blob_list ());
|
||||
bestgap = -MAX_INT32;
|
||||
while (!blob_it.at_last ()) {
|
||||
blob = blob_it.data ();
|
||||
//gap to next
|
||||
gap = blob_it.data_relative (1)->bounding_box ().left () - blob->bounding_box ().right ();
|
||||
blob_it.forward ();
|
||||
if (gap > bestgap) {
|
||||
bestgap = gap; //find biggest
|
||||
new_blob_it = blob_it; //save position
|
||||
}
|
||||
}
|
||||
//take 2nd half
|
||||
new_blobs.assign_to_sublist (&new_blob_it, &blob_it);
|
||||
//make it a word
|
||||
second_word = new WERD (&new_blobs, 1, NULL);
|
||||
ASSERT_HOST (word->blob_list ()->length () ==
|
||||
first_word->blob_list ()->length () +
|
||||
second_word->blob_list ()->length ());
|
||||
|
||||
result = recog_word_recursive (first_word, denorm, matcher,
|
||||
tester, trainer, testing, raw_choice,
|
||||
blob_choices, outword);
|
||||
delete first_word; //done that one
|
||||
result2 = recog_word_recursive (second_word, denorm, matcher,
|
||||
tester, trainer, testing, raw_choice2,
|
||||
blob_choices, outword2);
|
||||
delete second_word; //done that too
|
||||
*result += *result2; //combine ratings
|
||||
delete result2;
|
||||
*raw_choice += *raw_choice2;
|
||||
delete raw_choice2; //finished with it
|
||||
// outword1_len= outword->blob_list()->length();
|
||||
// outword2_len= outword2->blob_list()->length();
|
||||
outword->join_on (outword2); //join words
|
||||
delete outword2;
|
||||
// if ( outword->blob_list()->length() != outword1_len + outword2_len )
|
||||
// tprintf( "Split&Recog: part1len=%d; part2len=%d; combinedlen=%d\n",
|
||||
// outword1_len, outword2_len, outword->blob_list()->length() );
|
||||
// ASSERT_HOST( outword->blob_list()->length() == outword1_len + outword2_len );
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* call_matcher
|
||||
*
|
||||
* Called from Tess with a blob in tess form.
|
||||
* Convert the blob to editor form.
|
||||
* Call the matcher set up by the segmenter in tess_matcher.
|
||||
* Convert the output choices back to tess form.
|
||||
**********************************************************************/
|
||||
|
||||
LIST call_matcher( //call a matcher
|
||||
TBLOB *ptblob, //previous
|
||||
TBLOB *tessblob, //blob to match
|
||||
TBLOB *ntblob, //next
|
||||
void *, //unused parameter
|
||||
TEXTROW * //always null anyway
|
||||
) {
|
||||
PBLOB *pblob; //converted blob
|
||||
PBLOB *blob; //converted blob
|
||||
PBLOB *nblob; //converted blob
|
||||
LIST result; //tess output
|
||||
BLOB_CHOICE *choice; //current choice
|
||||
char string[2]; //char converted
|
||||
BLOB_CHOICE_LIST ratings; //matcher result
|
||||
BLOB_CHOICE_IT it; //iterator
|
||||
|
||||
blob = make_ed_blob (tessblob);//convert blob
|
||||
if (blob == NULL)
|
||||
return NULL; //can't do it
|
||||
pblob = ptblob != NULL ? make_ed_blob (ptblob) : NULL;
|
||||
nblob = ntblob != NULL ? make_ed_blob (ntblob) : NULL;
|
||||
(*tess_matcher) (pblob, blob, nblob, tess_word, tess_denorm, ratings);
|
||||
//match it
|
||||
delete blob; //don't need that now
|
||||
if (pblob != NULL)
|
||||
delete pblob;
|
||||
if (nblob != NULL)
|
||||
delete nblob;
|
||||
it.set_to_list (&ratings); //get list
|
||||
result = NULL;
|
||||
string[1] = '\0';
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ()) {
|
||||
choice = it.data ();
|
||||
string[0] = choice->char_class ();
|
||||
result = append_choice (result, string,
|
||||
choice->rating (), choice->certainty (),
|
||||
choice->config ());
|
||||
}
|
||||
return result; //converted list
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* call_tester
|
||||
*
|
||||
* Called from Tess with a blob in tess form.
|
||||
* Convert the blob to editor form.
|
||||
* Call the tester set up by the segmenter in tess_tester.
|
||||
**********************************************************************/
|
||||
|
||||
void call_tester( //call a tester
|
||||
TBLOB *tessblob, //blob to test
|
||||
BOOL8 correct_blob, //true if good
|
||||
char *text, //source text
|
||||
INT32 count, //chars in text
|
||||
LIST result //output of matcher
|
||||
) {
|
||||
PBLOB *blob; //converted blob
|
||||
BLOB_CHOICE_LIST ratings; //matcher result
|
||||
|
||||
blob = make_ed_blob (tessblob);//convert blob
|
||||
if (blob == NULL)
|
||||
return;
|
||||
//make it right type
|
||||
convert_choice_list(result, ratings);
|
||||
if (tess_tester != NULL)
|
||||
(*tess_tester) (blob, tess_denorm, correct_blob, text, count, &ratings);
|
||||
delete blob; //don't need that now
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* call_train_tester
|
||||
*
|
||||
* Called from Tess with a blob in tess form.
|
||||
* Convert the blob to editor form.
|
||||
* Call the trainer set up by the segmenter in tess_trainer.
|
||||
**********************************************************************/
|
||||
|
||||
void call_train_tester( //call a trainer
|
||||
TBLOB *tessblob, //blob to test
|
||||
BOOL8 correct_blob, //true if good
|
||||
char *text, //source text
|
||||
INT32 count, //chars in text
|
||||
LIST result //output of matcher
|
||||
) {
|
||||
PBLOB *blob; //converted blob
|
||||
BLOB_CHOICE_LIST ratings; //matcher result
|
||||
|
||||
blob = make_ed_blob (tessblob);//convert blob
|
||||
if (blob == NULL)
|
||||
return;
|
||||
//make it right type
|
||||
convert_choice_list(result, ratings);
|
||||
if (tess_trainer != NULL)
|
||||
(*tess_trainer) (blob, tess_denorm, correct_blob, text, count, &ratings);
|
||||
delete blob; //don't need that now
|
||||
}
|
85
ccmain/tfacepp.h
Normal file
@@ -0,0 +1,85 @@
|
||||
/**********************************************************************
|
||||
* File: tfacepp.h (Formerly tface++.h)
|
||||
* Description: C++ side of the C/C++ Tess/Editor interface.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 15:39:23 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TFACEPP_H
|
||||
#define TFACEPP_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "tstruct.h"
|
||||
#include "ratngs.h"
|
||||
#include "tessclas.h"
|
||||
#include "notdll.h"
|
||||
|
||||
extern BOOL_VAR_H (tessedit_override_permuter, TRUE,
|
||||
"According to dict_word");
|
||||
WERD_CHOICE *recog_word( //recog one word
|
||||
WERD *word, //word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
POLY_TESTER trainer, //trainer function
|
||||
BOOL8 testing, //true if answer driven
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
//recog one word
|
||||
WERD_CHOICE *recog_word_recursive(WERD *word, //word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
POLY_TESTER trainer, //trainer function
|
||||
BOOL8 testing, //true if answer driven
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
//recog one word
|
||||
WERD_CHOICE *split_and_recog_word(WERD *word, //word to do
|
||||
DENORM *denorm, //de-normaliser
|
||||
POLY_MATCHER matcher, //matcher function
|
||||
POLY_TESTER tester, //tester function
|
||||
POLY_TESTER trainer, //trainer function
|
||||
BOOL8 testing, //true if answer driven
|
||||
WERD_CHOICE *&raw_choice, //raw result
|
||||
BLOB_CHOICE_LIST_CLIST *blob_choices, //list of blob lists
|
||||
WERD *&outword //bln word output
|
||||
);
|
||||
LIST call_matcher( //call a matcher
|
||||
TBLOB *ptblob, //previous
|
||||
TBLOB *tessblob, //blob to match
|
||||
TBLOB *ntblob, //next
|
||||
void *, //unused parameter
|
||||
TEXTROW * //always null anyway
|
||||
);
|
||||
void call_tester( //call a tester
|
||||
TBLOB *tessblob, //blob to test
|
||||
BOOL8 correct_blob, //true if good
|
||||
char *text, //source text
|
||||
INT32 count, //chars in text
|
||||
LIST result //output of matcher
|
||||
);
|
||||
void call_train_tester( //call a trainer
|
||||
TBLOB *tessblob, //blob to test
|
||||
BOOL8 correct_blob, //true if good
|
||||
char *text, //source text
|
||||
INT32 count, //chars in text
|
||||
LIST result //output of matcher
|
||||
);
|
||||
#endif
|
511
ccmain/tstruct.cpp
Normal file
@@ -0,0 +1,511 @@
|
||||
/**********************************************************************
|
||||
* File: tstruct.cpp (Formerly tstruct.c)
|
||||
* Description: Code to manipulate the structures of the C++/C interface.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 15:49:29 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "tfacep.h"
|
||||
#include "tstruct.h"
|
||||
//#include "structures.h"
|
||||
|
||||
static ERRCODE BADFRAGMENTS = "Couldn't find matching fragment ends";
|
||||
|
||||
ELISTIZE (FRAGMENT)
|
||||
//extern /*"C"*/ oldoutline(TESSLINE*);
|
||||
/**********************************************************************
|
||||
* FRAGMENT::FRAGMENT
|
||||
*
|
||||
* Constructor for fragments.
|
||||
**********************************************************************/
|
||||
FRAGMENT::FRAGMENT ( //constructor
|
||||
EDGEPT * head_pt, //start point
|
||||
EDGEPT * tail_pt //end point
|
||||
):head (head_pt->pos.x, head_pt->pos.y), tail (tail_pt->pos.x,
|
||||
tail_pt->pos.y) {
|
||||
headpt = head_pt; //save ptrs
|
||||
tailpt = tail_pt;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_ed_word
|
||||
*
|
||||
* Make an editor format word from the tess style word.
|
||||
**********************************************************************/
|
||||
|
||||
WERD *make_ed_word( //construct word
|
||||
TWERD *tessword, //word to convert
|
||||
WERD *clone //clone this one
|
||||
) {
|
||||
WERD *word; //converted word
|
||||
TBLOB *tblob; //current blob
|
||||
PBLOB *blob; //new blob
|
||||
PBLOB_LIST blobs; //list of blobs
|
||||
PBLOB_IT blob_it = &blobs; //iterator
|
||||
|
||||
for (tblob = tessword->blobs; tblob != NULL; tblob = tblob->next) {
|
||||
blob = make_ed_blob (tblob);
|
||||
if (blob != NULL)
|
||||
blob_it.add_after_then_move (blob);
|
||||
}
|
||||
if (!blobs.empty ())
|
||||
word = new WERD (&blobs, clone);
|
||||
else
|
||||
word = NULL;
|
||||
return word;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_ed_blob
|
||||
*
|
||||
* Make an editor format blob from the tess style blob.
|
||||
**********************************************************************/
|
||||
|
||||
PBLOB *make_ed_blob( //construct blob
|
||||
TBLOB *tessblob //blob to convert
|
||||
) {
|
||||
TESSLINE *tessol; //tess outline
|
||||
FRAGMENT_LIST fragments; //list of fragments
|
||||
OUTLINE *outline; //current outline
|
||||
OUTLINE_LIST out_list; //list of outlines
|
||||
OUTLINE_IT out_it = &out_list; //iterator
|
||||
|
||||
for (tessol = tessblob->outlines; tessol != NULL; tessol = tessol->next) {
|
||||
//stick in list
|
||||
register_outline(tessol, &fragments);
|
||||
}
|
||||
while (!fragments.empty ()) {
|
||||
outline = make_ed_outline (&fragments);
|
||||
if (outline != NULL)
|
||||
out_it.add_after_then_move (outline);
|
||||
}
|
||||
if (out_it.empty())
|
||||
return NULL; //couldn't do it
|
||||
return new PBLOB (&out_list); //turn to blob
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_ed_outline
|
||||
*
|
||||
* Make an editor format outline from the list of fragments.
|
||||
**********************************************************************/
|
||||
|
||||
OUTLINE *make_ed_outline( //construct outline
|
||||
FRAGMENT_LIST *list //list of fragments
|
||||
) {
|
||||
FRAGMENT *fragment; //current fragment
|
||||
EDGEPT *edgept; //current point
|
||||
ICOORD headpos; //coords of head
|
||||
ICOORD tailpos; //coords of tail
|
||||
FCOORD pos; //coords of edgept
|
||||
FCOORD vec; //empty
|
||||
POLYPT *polypt; //current point
|
||||
POLYPT_LIST poly_list; //list of points
|
||||
POLYPT_IT poly_it = &poly_list;//iterator
|
||||
FRAGMENT_IT fragment_it = list;//fragment
|
||||
|
||||
headpos = fragment_it.data ()->head;
|
||||
do {
|
||||
fragment = fragment_it.data ();
|
||||
edgept = fragment->headpt; //start of segment
|
||||
do {
|
||||
pos = FCOORD (edgept->pos.x, edgept->pos.y);
|
||||
vec = FCOORD (edgept->vec.x, edgept->vec.y);
|
||||
polypt = new POLYPT (pos, vec);
|
||||
//add to list
|
||||
poly_it.add_after_then_move (polypt);
|
||||
edgept = edgept->next;
|
||||
}
|
||||
while (edgept != fragment->tailpt);
|
||||
tailpos = ICOORD (edgept->pos.x, edgept->pos.y);
|
||||
//get rid of it
|
||||
delete fragment_it.extract ();
|
||||
if (tailpos != headpos) {
|
||||
if (fragment_it.empty ()) {
|
||||
// tprintf("Bad tailpos (%d,%d), Head=(%d,%d), no fragments.\n",
|
||||
// fragment->head.x(),fragment->head.y(),
|
||||
// headpos.x(),headpos.y());
|
||||
return NULL;
|
||||
}
|
||||
fragment_it.forward ();
|
||||
//find next segment
|
||||
for (fragment_it.mark_cycle_pt (); !fragment_it.cycled_list () && fragment_it.data ()->head != tailpos;
|
||||
fragment_it.forward ());
|
||||
if (fragment_it.data ()->head != tailpos) {
|
||||
// tprintf("Bad tailpos (%d,%d), Fragments are:\n",
|
||||
// tailpos.x(),tailpos.y());
|
||||
for (fragment_it.mark_cycle_pt ();
|
||||
!fragment_it.cycled_list (); fragment_it.forward ()) {
|
||||
fragment = fragment_it.extract ();
|
||||
// tprintf("Head=(%d,%d), tail=(%d,%d)\n",
|
||||
// fragment->head.x(),fragment->head.y(),
|
||||
// fragment->tail.x(),fragment->tail.y());
|
||||
delete fragment;
|
||||
}
|
||||
return NULL; //can't do it
|
||||
// BADFRAGMENTS.error("make_ed_blob",ABORT,NULL);
|
||||
}
|
||||
}
|
||||
}
|
||||
while (tailpos != headpos);
|
||||
return new OUTLINE (&poly_it); //turn to outline
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* register_outline
|
||||
*
|
||||
* Add the fragments in the given outline to the list
|
||||
**********************************************************************/
|
||||
|
||||
void register_outline( //add fragments
|
||||
TESSLINE *outline, //tess format
|
||||
FRAGMENT_LIST *list //list to add to
|
||||
) {
|
||||
EDGEPT *startpt; //start of outline
|
||||
EDGEPT *headpt; //start of fragment
|
||||
EDGEPT *tailpt; //end of fragment
|
||||
FRAGMENT *fragment; //new fragment
|
||||
FRAGMENT_IT it = list; //iterator
|
||||
|
||||
startpt = outline->loop;
|
||||
do {
|
||||
startpt = startpt->next;
|
||||
if (startpt == NULL)
|
||||
return; //illegal!
|
||||
}
|
||||
while (startpt->flags[0] == 0 && startpt != outline->loop);
|
||||
headpt = startpt;
|
||||
do
|
||||
startpt = startpt->next;
|
||||
while (startpt->flags[0] != 0 && startpt != headpt);
|
||||
if (startpt->flags[0] != 0)
|
||||
return; //all hidden!
|
||||
|
||||
headpt = startpt;
|
||||
do {
|
||||
tailpt = headpt;
|
||||
do
|
||||
tailpt = tailpt->next;
|
||||
while (tailpt->flags[0] == 0 && tailpt != startpt);
|
||||
fragment = new FRAGMENT (headpt, tailpt);
|
||||
it.add_after_then_move (fragment);
|
||||
while (tailpt->flags[0] != 0)
|
||||
tailpt = tailpt->next;
|
||||
headpt = tailpt;
|
||||
}
|
||||
while (tailpt != startpt);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* convert_choice_lists
|
||||
*
|
||||
* Convert the ARRAY of TESS_LIST of TESS_CHOICEs into a BLOB_CHOICE_LIST_CLIST.
|
||||
**********************************************************************/
|
||||
|
||||
void convert_choice_lists( //convert lists
|
||||
ARRAY tessarray, //list from tess
|
||||
BLOB_CHOICE_LIST_CLIST *ratings //list of results
|
||||
) {
|
||||
INT32 length; //elements in array
|
||||
INT32 index; //index to array
|
||||
LIST result; //tess output
|
||||
//iterator
|
||||
BLOB_CHOICE_LIST_C_IT it = ratings;
|
||||
BLOB_CHOICE_LIST *choice; //created choice
|
||||
|
||||
if (tessarray != NULL) {
|
||||
length = array_count (tessarray);
|
||||
for (index = 0; index < length; index++) {
|
||||
result = (LIST) array_value (tessarray, index);
|
||||
//make one
|
||||
choice = new BLOB_CHOICE_LIST;
|
||||
//convert blob choices
|
||||
convert_choice_list(result, *choice);
|
||||
//add to super list
|
||||
it.add_after_then_move (choice);
|
||||
}
|
||||
free_mem(tessarray); //lists already freed
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* convert_choice_list
|
||||
*
|
||||
* Convert the LIST of TESS_CHOICEs into a BLOB_CHOICE_LIST.
|
||||
**********************************************************************/
|
||||
|
||||
void convert_choice_list( //convert lists
|
||||
LIST list, //list from tess
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
) {
|
||||
LIST result; //tess output
|
||||
BLOB_CHOICE_IT it = &ratings; //iterator
|
||||
BLOB_CHOICE *choice; //created choice
|
||||
A_CHOICE *tesschoice; //choice to convert
|
||||
|
||||
for (result = list; result != NULL; result = result->next) {
|
||||
//traverse list
|
||||
tesschoice = (A_CHOICE *) result->node;
|
||||
//make one
|
||||
choice = new BLOB_CHOICE (tesschoice->string[0], tesschoice->rating, tesschoice->certainty, tesschoice->config);
|
||||
it.add_after_then_move (choice);
|
||||
}
|
||||
destroy_nodes (list, (void (*)(void *)) free_choice);
|
||||
//get rid of it
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_tess_row
|
||||
*
|
||||
* Make a fake row structure to pass to the tesseract matchers.
|
||||
**********************************************************************/
|
||||
|
||||
void make_tess_row( //make fake row
|
||||
DENORM *denorm, //row info
|
||||
TEXTROW *tessrow //output row
|
||||
) {
|
||||
tessrow->baseline.segments = 1;
|
||||
tessrow->baseline.xstarts[0] = -32767;
|
||||
tessrow->baseline.xstarts[1] = 32767;
|
||||
tessrow->baseline.quads[0].a = 0;
|
||||
tessrow->baseline.quads[0].b = 0;
|
||||
tessrow->baseline.quads[0].c = bln_baseline_offset;
|
||||
tessrow->xheight.segments = 1;
|
||||
tessrow->xheight.xstarts[0] = -32767;
|
||||
tessrow->xheight.xstarts[1] = 32767;
|
||||
tessrow->xheight.quads[0].a = 0;
|
||||
tessrow->xheight.quads[0].b = 0;
|
||||
tessrow->xheight.quads[0].c = bln_x_height + bln_baseline_offset;
|
||||
tessrow->lineheight = bln_x_height;
|
||||
tessrow->ascrise = denorm->row ()->ascenders () * denorm->scale ();
|
||||
tessrow->descdrop = denorm->row ()->descenders () * denorm->scale ();
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_tess_word
|
||||
*
|
||||
* Convert the word to Tess format.
|
||||
**********************************************************************/
|
||||
|
||||
TWERD *make_tess_word( //convert word
|
||||
WERD *word, //word to do
|
||||
TEXTROW *row //fake row
|
||||
) {
|
||||
TWERD *tessword; //tess format
|
||||
|
||||
tessword = newword (); //use old allocator
|
||||
tessword->row = row; //give them something
|
||||
//copy string
|
||||
tessword->correct = strsave (word->text ());
|
||||
tessword->guess = NULL;
|
||||
tessword->blobs = make_tess_blobs (word->blob_list ());
|
||||
tessword->blanks = 1;
|
||||
tessword->blobcount = word->blob_list ()->length ();
|
||||
tessword->next = NULL;
|
||||
return tessword;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_tess_blobs
|
||||
*
|
||||
* Make Tess style blobs from a list of BLOBs.
|
||||
**********************************************************************/
|
||||
|
||||
TBLOB *make_tess_blobs( //make tess blobs
|
||||
PBLOB_LIST *bloblist //list to convert
|
||||
) {
|
||||
PBLOB_IT it = bloblist; //iterator
|
||||
PBLOB *blob; //current blob
|
||||
TBLOB *head; //output list
|
||||
TBLOB *tail; //end of list
|
||||
TBLOB *tessblob;
|
||||
|
||||
head = NULL;
|
||||
tail = NULL;
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ()) {
|
||||
blob = it.data ();
|
||||
tessblob = make_tess_blob (blob, TRUE);
|
||||
if (head)
|
||||
tail->next = tessblob;
|
||||
else
|
||||
head = tessblob;
|
||||
tail = tessblob;
|
||||
}
|
||||
return head;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_tess_blob
|
||||
*
|
||||
* Make a single Tess style blob
|
||||
**********************************************************************/
|
||||
|
||||
TBLOB *make_tess_blob( //make tess blob
|
||||
PBLOB *blob, //blob to convert
|
||||
BOOL8 flatten //flatten outline structure
|
||||
) {
|
||||
INT32 index;
|
||||
TBLOB *tessblob;
|
||||
|
||||
tessblob = newblob ();
|
||||
tessblob->outlines = (struct olinestruct *)
|
||||
make_tess_outlines (blob->out_list (), flatten);
|
||||
for (index = 0; index < TBLOBFLAGS; index++)
|
||||
tessblob->flags[index] = 0; //!!
|
||||
tessblob->correct = 0;
|
||||
tessblob->guess = 0;
|
||||
for (index = 0; index < MAX_WO_CLASSES; index++) {
|
||||
tessblob->classes[index] = 0;
|
||||
tessblob->values[index] = 0;
|
||||
}
|
||||
tessblob->next = NULL;
|
||||
return tessblob;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_tess_outlines
|
||||
*
|
||||
* Make Tess style outlines from a list of OUTLINEs.
|
||||
**********************************************************************/
|
||||
|
||||
TESSLINE *make_tess_outlines( //make tess outlines
|
||||
OUTLINE_LIST *outlinelist, //list to convert
|
||||
BOOL8 flatten //flatten outline structure
|
||||
) {
|
||||
OUTLINE_IT it = outlinelist; //iterator
|
||||
OUTLINE *outline; //current outline
|
||||
TESSLINE *head; //output list
|
||||
TESSLINE *tail; //end of list
|
||||
TESSLINE *tessoutline;
|
||||
|
||||
head = NULL;
|
||||
tail = NULL;
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ()) {
|
||||
outline = it.data ();
|
||||
tessoutline = newoutline ();
|
||||
tessoutline->compactloop = NULL;
|
||||
tessoutline->loop = make_tess_edgepts (outline->polypts (),
|
||||
tessoutline->topleft,
|
||||
tessoutline->botright);
|
||||
if (tessoutline->loop == NULL) {
|
||||
oldoutline(tessoutline);
|
||||
continue;
|
||||
}
|
||||
tessoutline->start = tessoutline->loop->pos;
|
||||
tessoutline->node = NULL;
|
||||
tessoutline->next = NULL;
|
||||
tessoutline->child = NULL;
|
||||
if (!outline->child ()->empty ()) {
|
||||
if (flatten)
|
||||
tessoutline->next = (struct olinestruct *)
|
||||
make_tess_outlines (outline->child (), flatten);
|
||||
else {
|
||||
tessoutline->next = NULL;
|
||||
tessoutline->child = (struct olinestruct *)
|
||||
make_tess_outlines (outline->child (), flatten);
|
||||
}
|
||||
}
|
||||
else
|
||||
tessoutline->next = NULL;
|
||||
if (head)
|
||||
tail->next = tessoutline;
|
||||
else
|
||||
head = tessoutline;
|
||||
while (tessoutline->next != NULL)
|
||||
tessoutline = tessoutline->next;
|
||||
tail = tessoutline;
|
||||
}
|
||||
return head;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* make_tess_edgepts
|
||||
*
|
||||
* Make Tess style edgepts from a list of POLYPTs.
|
||||
**********************************************************************/
|
||||
|
||||
EDGEPT *make_tess_edgepts( //make tess edgepts
|
||||
POLYPT_LIST *edgeptlist, //list to convert
|
||||
TPOINT &tl, //bounding box
|
||||
TPOINT &br) {
|
||||
INT32 index;
|
||||
POLYPT_IT it = edgeptlist; //iterator
|
||||
POLYPT *edgept; //current edgept
|
||||
EDGEPT *head; //output list
|
||||
EDGEPT *tail; //end of list
|
||||
EDGEPT *tessedgept;
|
||||
|
||||
head = NULL;
|
||||
tail = NULL;
|
||||
tl.x = MAX_INT16;
|
||||
tl.y = -MAX_INT16;
|
||||
br.x = -MAX_INT16;
|
||||
br.y = MAX_INT16;
|
||||
for (it.mark_cycle_pt (); !it.cycled_list ();) {
|
||||
edgept = it.data ();
|
||||
tessedgept = newedgept ();
|
||||
tessedgept->pos.x = (INT16) edgept->pos.x ();
|
||||
tessedgept->pos.y = (INT16) edgept->pos.y ();
|
||||
if (tessedgept->pos.x < tl.x)
|
||||
tl.x = tessedgept->pos.x;
|
||||
if (tessedgept->pos.x > br.x)
|
||||
br.x = tessedgept->pos.x;
|
||||
if (tessedgept->pos.y > tl.y)
|
||||
tl.y = tessedgept->pos.y;
|
||||
if (tessedgept->pos.y < br.y)
|
||||
br.y = tessedgept->pos.y;
|
||||
if (head != NULL && tessedgept->pos.x == tail->pos.x
|
||||
&& tessedgept->pos.y == tail->pos.y) {
|
||||
oldedgept(tessedgept);
|
||||
}
|
||||
else {
|
||||
for (index = 0; index < EDGEPTFLAGS; index++)
|
||||
tessedgept->flags[index] = 0;
|
||||
if (head != NULL) {
|
||||
tail->vec.x = tessedgept->pos.x - tail->pos.x;
|
||||
tail->vec.y = tessedgept->pos.y - tail->pos.y;
|
||||
tessedgept->prev = tail;
|
||||
}
|
||||
tessedgept->next = head;
|
||||
if (head)
|
||||
tail->next = tessedgept;
|
||||
else
|
||||
head = tessedgept;
|
||||
tail = tessedgept;
|
||||
}
|
||||
it.forward ();
|
||||
}
|
||||
head->prev = tail;
|
||||
tail->vec.x = head->pos.x - tail->pos.x;
|
||||
tail->vec.y = head->pos.y - tail->pos.y;
|
||||
if (head == tail) {
|
||||
oldedgept(head);
|
||||
return NULL; //empty
|
||||
}
|
||||
return head;
|
||||
}
|
108
ccmain/tstruct.h
Normal file
@@ -0,0 +1,108 @@
|
||||
/**********************************************************************
|
||||
* File: tstruct.h (Formerly tstruct.h)
|
||||
* Description: Code to manipulate the structures of the C++/C interface.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 15:49:29 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef TSTRUCT_H
|
||||
#define TSTRUCT_H
|
||||
|
||||
#include "tessarray.h"
|
||||
#include "werd.h"
|
||||
#include "tessclas.h"
|
||||
#include "ratngs.h"
|
||||
#include "notdll.h"
|
||||
#include "oldlist.h"
|
||||
|
||||
/*
|
||||
struct TESS_LIST
|
||||
{
|
||||
TESS_LIST *node; //data
|
||||
TESS_LIST *next; //next in list
|
||||
};
|
||||
|
||||
struct TESS_CHOICE
|
||||
{
|
||||
float rating; //scaled
|
||||
float certainty; //absolute
|
||||
char permuter; //which permuter code
|
||||
INT8 config; //which config
|
||||
char* string; //really can!
|
||||
};
|
||||
*/
|
||||
class FRAGMENT:public ELIST_LINK
|
||||
{
|
||||
public:
|
||||
FRAGMENT() { //constructor
|
||||
}
|
||||
FRAGMENT(EDGEPT *head_pt, //start
|
||||
EDGEPT *tail_pt); //end
|
||||
|
||||
ICOORD head; //coords of start
|
||||
ICOORD tail; //coords of end
|
||||
EDGEPT *headpt; //start point
|
||||
EDGEPT *tailpt; //end point
|
||||
|
||||
NEWDELETE2 (FRAGMENT)
|
||||
};
|
||||
|
||||
ELISTIZEH (FRAGMENT)
|
||||
WERD *make_ed_word( //construct word
|
||||
TWERD *tessword, //word to convert
|
||||
WERD *clone //clone this one
|
||||
);
|
||||
PBLOB *make_ed_blob( //construct blob
|
||||
TBLOB *tessblob //blob to convert
|
||||
);
|
||||
OUTLINE *make_ed_outline( //construct outline
|
||||
FRAGMENT_LIST *list //list of fragments
|
||||
);
|
||||
void register_outline( //add fragments
|
||||
TESSLINE *outline, //tess format
|
||||
FRAGMENT_LIST *list //list to add to
|
||||
);
|
||||
void convert_choice_lists( //convert lists
|
||||
ARRAY tessarray, //list from tess
|
||||
BLOB_CHOICE_LIST_CLIST *ratings //list of results
|
||||
);
|
||||
void convert_choice_list( //convert lists
|
||||
LIST list, //list from tess
|
||||
BLOB_CHOICE_LIST &ratings //list of results
|
||||
);
|
||||
void make_tess_row( //make fake row
|
||||
DENORM *denorm, //row info
|
||||
TEXTROW *tessrow //output row
|
||||
);
|
||||
TWERD *make_tess_word( //convert word
|
||||
WERD *word, //word to do
|
||||
TEXTROW *row //fake row
|
||||
);
|
||||
TBLOB *make_tess_blobs( //make tess blobs
|
||||
PBLOB_LIST *bloblist //list to convert
|
||||
);
|
||||
TBLOB *make_tess_blob( //make tess blob
|
||||
PBLOB *blob, //blob to convert
|
||||
BOOL8 flatten //flatten outline structure
|
||||
);
|
||||
TESSLINE *make_tess_outlines( //make tess outlines
|
||||
OUTLINE_LIST *outlinelist, //list to convert
|
||||
BOOL8 flatten //flatten outline structure
|
||||
);
|
||||
EDGEPT *make_tess_edgepts( //make tess edgepts
|
||||
POLYPT_LIST *edgeptlist, //list to convert
|
||||
TPOINT &tl, //bounding box
|
||||
TPOINT &br);
|
||||
#endif
|
193
ccmain/werdit.cpp
Normal file
@@ -0,0 +1,193 @@
|
||||
/**********************************************************************
|
||||
* File: werdit.cpp (Formerly wordit.c)
|
||||
* Description: An iterator for passing over all the words in a document.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Apr 27 08:51:22 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "werdit.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
//EXTERN BOOL_VAR(wordit_linearc,FALSE,"Pass poly of linearc to Tess");
|
||||
|
||||
/**********************************************************************
 * WERDIT::start_page
 *
 * Get ready to iterate over the page by setting the iterators.
 **********************************************************************/

void WERDIT::start_page(   //set iterators
  BLOCK_LIST *block_list   //blocks to check
) {
  block_it.set_to_list (block_list);
  block_it.mark_cycle_pt ();
  do {
    while (block_it.data ()->row_list ()->empty ()
           && !block_it.cycled_list ()) {
      block_it.forward ();
    }
    if (!block_it.data ()->row_list ()->empty ()) {
      row_it.set_to_list (block_it.data ()->row_list ());
      row_it.mark_cycle_pt ();
      while (row_it.data ()->word_list ()->empty ()
             && !row_it.cycled_list ()) {
        row_it.forward ();
      }
      if (!row_it.data ()->word_list ()->empty ()) {
        word_it.set_to_list (row_it.data ()->word_list ());
        word_it.mark_cycle_pt ();
      }
    }
  }
  while (!block_it.cycled_list () && row_it.data ()->word_list ()->empty ());
}


/**********************************************************************
 * WERDIT::forward
 *
 * Give the next word on the page, or NULL if none left.
 * This code assumes all rows to be non-empty, but blocks are allowed
 * to be empty as eventually we will have non-text blocks.
 * The output is always a copy and needs to be deleted by somebody.
 **********************************************************************/

WERD *WERDIT::forward() {        //use iterators
  WERD *word;                    //actual word
  //      WERD *larc_word;       //linearc copy
  WERD *result;                  //output word
  ROW *row;                      //row of word

  if (word_it.cycled_list ()) {
    return NULL;                 //finished page
  }
  else {
    word = word_it.data ();
    row = row_it.data ();
    word_it.forward ();
    if (word_it.cycled_list ()) {
      row_it.forward ();         //finished row
      if (row_it.cycled_list ()) {
        do {
          block_it.forward ();   //finished block
          if (!block_it.cycled_list ()) {
            row_it.set_to_list (block_it.data ()->row_list ());
            row_it.mark_cycle_pt ();
          }
        }
        //find non-empty block
        while (!block_it.cycled_list ()
               && row_it.cycled_list ());
      }
      if (!row_it.cycled_list ()) {
        word_it.set_to_list (row_it.data ()->word_list ());
        word_it.mark_cycle_pt ();
      }
    }

    //      if (wordit_linearc && !word->flag(W_POLYGON))
    //      {
    //              larc_word=word->larc_copy(row->x_height());
    //              result=larc_word->poly_copy(row->x_height());
    //              delete larc_word;
    //      }
    //      else
    result = word->poly_copy (row->x_height ());
    return result;
  }
}


/**********************************************************************
 * make_pseudo_word
 *
 * Make all the blobs inside a selection into a single word.
 * The word is always a copy and needs to be deleted.
 **********************************************************************/

WERD *make_pseudo_word(          //make fake word
  BLOCK_LIST *block_list,        //blocks to check
  BOX &selection_box,            //box of selection
  BLOCK *&pseudo_block,          //block of selection
  ROW *&pseudo_row               //row of selection
) {
  BLOCK_IT block_it(block_list);
  BLOCK *block;
  ROW_IT row_it;
  ROW *row;
  WERD_IT word_it;
  WERD *word;
  PBLOB_IT blob_it;
  PBLOB *blob;
  PBLOB_LIST new_blobs;          //list of gathered blobs
  //iterator
  PBLOB_IT new_blob_it = &new_blobs;
  WERD *pseudo_word;             //fabricated word
  WERD *poly_word;               //poly copy of word
  //      WERD *larc_word;       //linearc copy

  for (block_it.mark_cycle_pt ();
       !block_it.cycled_list (); block_it.forward ()) {
    block = block_it.data ();
    if (block->bounding_box ().overlap (selection_box)) {
      pseudo_block = block;
      row_it.set_to_list (block->row_list ());
      for (row_it.mark_cycle_pt ();
           !row_it.cycled_list (); row_it.forward ()) {
        row = row_it.data ();
        if (row->bounding_box ().overlap (selection_box)) {
          word_it.set_to_list (row->word_list ());
          for (word_it.mark_cycle_pt ();
               !word_it.cycled_list (); word_it.forward ()) {
            word = word_it.data ();
            if (word->bounding_box ().overlap (selection_box)) {
              //      if (wordit_linearc && !word->flag(W_POLYGON))
              //      {
              //              larc_word=word->larc_copy(row->x_height());
              //              poly_word=larc_word->poly_copy(row->x_height());
              //              delete larc_word;
              //      }
              //      else
              poly_word = word->poly_copy (row->x_height ());
              blob_it.set_to_list (poly_word->blob_list ());
              for (blob_it.mark_cycle_pt ();
                   !blob_it.cycled_list (); blob_it.forward ()) {
                blob = blob_it.data ();
                if (blob->bounding_box ().overlap (selection_box)) {
                  //steal off list
                  new_blob_it.add_after_then_move (blob_it.extract ());
                  pseudo_row = row;
                }
              }
              delete poly_word;  //get rid of it
            }
          }
        }
      }
    }
  }
  if (!new_blobs.empty ()) {
    //make new word
    pseudo_word = new WERD (&new_blobs, 1, NULL);
  }
  else
    pseudo_word = NULL;
  return pseudo_word;
}
67
ccmain/werdit.h
Normal file
@ -0,0 +1,67 @@
/**********************************************************************
 * File:        wordit.c
 * Description: An iterator for passing over all the words in a document.
 * Author:      Ray Smith
 * Created:     Mon Apr 27 08:51:22 BST 1992
 *
 * (C) Copyright 1992, Hewlett-Packard Ltd.
 ** Licensed under the Apache License, Version 2.0 (the "License");
 ** you may not use this file except in compliance with the License.
 ** You may obtain a copy of the License at
 ** http://www.apache.org/licenses/LICENSE-2.0
 ** Unless required by applicable law or agreed to in writing, software
 ** distributed under the License is distributed on an "AS IS" BASIS,
 ** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 ** See the License for the specific language governing permissions and
 ** limitations under the License.
 *
 **********************************************************************/

#ifndef WERDIT_H
#define WERDIT_H

#include "varable.h"
#include "ocrblock.h"
#include "notdll.h"

class WERDIT
{
  public:
    WERDIT() {
    }                            //empty constructor
    WERDIT(                      //constructor from block list
      BLOCK_LIST *blocklist) {   //blocks on page
      start_page(blocklist);     //ready to scan
    }

    void start_page(             //get ready
      BLOCK_LIST *blocklist);    //blocks on page

    WERD *forward();             //get next word
    WERD *next_word() {          //peek at next word
      return word_it.data ();    //already at next
    }
    ROW *row() {                 //get current row
      return word_it.cycled_list ()? NULL : row_it.data ();
    }
    ROW *next_row() {            //get next row
      return row_it.data_relative (1);
    }
    BLOCK *block() {             //get current block
      return block_it.data ();
    }

  private:
    BLOCK_IT block_it;           //iterators
    ROW_IT row_it;
    WERD_IT word_it;
};

//extern BOOL_VAR_H(wordit_linearc,FALSE,"Pass poly of linearc to Tess");
WERD *make_pseudo_word(          //make fake word
  BLOCK_LIST *block_list,        //blocks to check
  BOX &selection_box,            //box of selection
  BLOCK *&pseudo_block,          //block of selection
  ROW *&pseudo_row               //row of selection
);
#endif
25
ccstruct/Makefile.am
Normal file
@ -0,0 +1,25 @@
SUBDIRS =
AM_CPPFLAGS = \
    -I$(top_srcdir)/ccutil -I$(top_srcdir)/cutil \
    -I$(top_srcdir)/image -I$(top_srcdir)/viewer

EXTRA_DIST = \
    blckerr.h blobbox.h blobs.h blread.h coutln.h crakedge.h \
    genblob.h hpddef.h hpdsizes.h ipoints.h labls.h linlsq.h \
    lmedsq.h mod128.h normalis.h ocrblock.h ocrrow.h pageblk.h \
    pageres.h pdblock.h pdclass.h points.h polyaprx.h polyblk.h \
    polyblob.h polyvert.h poutline.h quadlsq.h quadratc.h \
    quspline.h ratngs.h rect.h rejctmap.h rwpoly.h statistc.h \
    stepblob.h txtregn.h vecfuncs.h werd.h

noinst_LIBRARIES = libtesseract_ccstruct.a
libtesseract_ccstruct_a_SOURCES = \
    blobbox.cpp blobs.cpp blread.cpp callcpp.cpp \
    coutln.cpp genblob.cpp labls.cpp linlsq.cpp \
    lmedsq.cpp mod128.cpp normalis.cpp ocrblock.cpp \
    ocrrow.cpp pageblk.cpp pageres.cpp pdblock.cpp \
    points.cpp polyaprx.cpp polyblk.cpp polyblob.cpp \
    polyvert.cpp poutline.cpp quadlsq.cpp quadratc.cpp \
    quspline.cpp ratngs.cpp rect.cpp rejctmap.cpp \
    rwpoly.cpp statistc.cpp stepblob.cpp txtregn.cpp \
    vecfuncs.cpp werd.cpp
587
ccstruct/Makefile.in
Normal file
@ -0,0 +1,587 @@
|
||||
# Makefile.in generated by automake 1.9.6 from Makefile.am.
|
||||
# @configure_input@
|
||||
|
||||
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
|
||||
# 2003, 2004, 2005 Free Software Foundation, Inc.
|
||||
# This Makefile.in is free software; the Free Software Foundation
|
||||
# gives unlimited permission to copy and/or distribute it,
|
||||
# with or without modifications, as long as this notice is preserved.
|
||||
|
||||
# This program is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
|
||||
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
|
||||
# PARTICULAR PURPOSE.
|
||||
|
||||
@SET_MAKE@
|
||||
|
||||
srcdir = @srcdir@
|
||||
top_srcdir = @top_srcdir@
|
||||
VPATH = @srcdir@
|
||||
pkgdatadir = $(datadir)/@PACKAGE@
|
||||
pkglibdir = $(libdir)/@PACKAGE@
|
||||
pkgincludedir = $(includedir)/@PACKAGE@
|
||||
top_builddir = ..
|
||||
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
|
||||
INSTALL = @INSTALL@
|
||||
install_sh_DATA = $(install_sh) -c -m 644
|
||||
install_sh_PROGRAM = $(install_sh) -c
|
||||
install_sh_SCRIPT = $(install_sh) -c
|
||||
INSTALL_HEADER = $(INSTALL_DATA)
|
||||
transform = $(program_transform_name)
|
||||
NORMAL_INSTALL = :
|
||||
PRE_INSTALL = :
|
||||
POST_INSTALL = :
|
||||
NORMAL_UNINSTALL = :
|
||||
PRE_UNINSTALL = :
|
||||
POST_UNINSTALL = :
|
||||
build_triplet = @build@
|
||||
host_triplet = @host@
|
||||
subdir = ccstruct
|
||||
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
|
||||
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
|
||||
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
|
||||
$(top_srcdir)/config/ac_define_versionlevel.m4 \
|
||||
$(top_srcdir)/config/acinclude_custom.m4 \
|
||||
$(top_srcdir)/configure.ac
|
||||
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
|
||||
$(ACLOCAL_M4)
|
||||
mkinstalldirs = $(SHELL) $(top_srcdir)/config/mkinstalldirs
|
||||
CONFIG_HEADER = $(top_builddir)/config_auto.h
|
||||
CONFIG_CLEAN_FILES =
|
||||
LIBRARIES = $(noinst_LIBRARIES)
|
||||
AR = ar
|
||||
ARFLAGS = cru
|
||||
libtesseract_ccstruct_a_AR = $(AR) $(ARFLAGS)
|
||||
libtesseract_ccstruct_a_LIBADD =
|
||||
am_libtesseract_ccstruct_a_OBJECTS = blobbox.$(OBJEXT) blobs.$(OBJEXT) \
|
||||
blread.$(OBJEXT) callcpp.$(OBJEXT) coutln.$(OBJEXT) \
|
||||
genblob.$(OBJEXT) labls.$(OBJEXT) linlsq.$(OBJEXT) \
|
||||
lmedsq.$(OBJEXT) mod128.$(OBJEXT) normalis.$(OBJEXT) \
|
||||
ocrblock.$(OBJEXT) ocrrow.$(OBJEXT) pageblk.$(OBJEXT) \
|
||||
pageres.$(OBJEXT) pdblock.$(OBJEXT) points.$(OBJEXT) \
|
||||
polyaprx.$(OBJEXT) polyblk.$(OBJEXT) polyblob.$(OBJEXT) \
|
||||
polyvert.$(OBJEXT) poutline.$(OBJEXT) quadlsq.$(OBJEXT) \
|
||||
quadratc.$(OBJEXT) quspline.$(OBJEXT) ratngs.$(OBJEXT) \
|
||||
rect.$(OBJEXT) rejctmap.$(OBJEXT) rwpoly.$(OBJEXT) \
|
||||
statistc.$(OBJEXT) stepblob.$(OBJEXT) txtregn.$(OBJEXT) \
|
||||
vecfuncs.$(OBJEXT) werd.$(OBJEXT)
|
||||
libtesseract_ccstruct_a_OBJECTS = \
|
||||
$(am_libtesseract_ccstruct_a_OBJECTS)
|
||||
DEFAULT_INCLUDES = -I. -I$(srcdir) -I$(top_builddir)
|
||||
depcomp = $(SHELL) $(top_srcdir)/config/depcomp
|
||||
am__depfiles_maybe = depfiles
|
||||
CXXCOMPILE = $(CXX) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
|
||||
$(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS)
|
||||
CXXLD = $(CXX)
|
||||
CXXLINK = $(CXXLD) $(AM_CXXFLAGS) $(CXXFLAGS) $(AM_LDFLAGS) $(LDFLAGS) \
|
||||
-o $@
|
||||
SOURCES = $(libtesseract_ccstruct_a_SOURCES)
|
||||
DIST_SOURCES = $(libtesseract_ccstruct_a_SOURCES)
|
||||
RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
|
||||
html-recursive info-recursive install-data-recursive \
|
||||
install-exec-recursive install-info-recursive \
|
||||
install-recursive installcheck-recursive installdirs-recursive \
|
||||
pdf-recursive ps-recursive uninstall-info-recursive \
|
||||
uninstall-recursive
|
||||
ETAGS = etags
|
||||
CTAGS = ctags
|
||||
DIST_SUBDIRS = $(SUBDIRS)
|
||||
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
|
||||
ACLOCAL = @ACLOCAL@
|
||||
AMDEP_FALSE = @AMDEP_FALSE@
|
||||
AMDEP_TRUE = @AMDEP_TRUE@
|
||||
AMTAR = @AMTAR@
|
||||
AUTOCONF = @AUTOCONF@
|
||||
AUTOHEADER = @AUTOHEADER@
|
||||
AUTOMAKE = @AUTOMAKE@
|
||||
AWK = @AWK@
|
||||
CC = @CC@
|
||||
CCDEPMODE = @CCDEPMODE@
|
||||
CFLAGS = @CFLAGS@
|
||||
CPPFLAGS = @CPPFLAGS@
|
||||
CXX = @CXX@
|
||||
CXXCPP = @CXXCPP@
|
||||
CXXDEPMODE = @CXXDEPMODE@
|
||||
CXXFLAGS = @CXXFLAGS@
|
||||
CXXRPOFLAGS = @CXXRPOFLAGS@
|
||||
CYGPATH_W = @CYGPATH_W@
|
||||
DEFS = @DEFS@
|
||||
DEPDIR = @DEPDIR@
|
||||
ECHO_C = @ECHO_C@
|
||||
ECHO_N = @ECHO_N@
|
||||
ECHO_T = @ECHO_T@
|
||||
EGREP = @EGREP@
|
||||
EXEEXT = @EXEEXT@
|
||||
GNUWIN32_DIR = @GNUWIN32_DIR@
|
||||
HAVE_GNUWIN32_FALSE = @HAVE_GNUWIN32_FALSE@
|
||||
HAVE_GNUWIN32_TRUE = @HAVE_GNUWIN32_TRUE@
|
||||
HAVE_LIBTIFF_FALSE = @HAVE_LIBTIFF_FALSE@
|
||||
HAVE_LIBTIFF_TRUE = @HAVE_LIBTIFF_TRUE@
|
||||
INSTALL_DATA = @INSTALL_DATA@
|
||||
INSTALL_PROGRAM = @INSTALL_PROGRAM@
|
||||
INSTALL_SCRIPT = @INSTALL_SCRIPT@
|
||||
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
|
||||
LDFLAGS = @LDFLAGS@
|
||||
LIBOBJS = @LIBOBJS@
|
||||
LIBS = @LIBS@
|
||||
LIBTIFF_CFLAGS = @LIBTIFF_CFLAGS@
|
||||
LIBTIFF_LIBS = @LIBTIFF_LIBS@
|
||||
LTLIBOBJS = @LTLIBOBJS@
|
||||
MAINT = @MAINT@
|
||||
MAINTAINER_MODE_FALSE = @MAINTAINER_MODE_FALSE@
|
||||
MAINTAINER_MODE_TRUE = @MAINTAINER_MODE_TRUE@
|
||||
MAKEINFO = @MAKEINFO@
|
||||
OBJEXT = @OBJEXT@
|
||||
OPTS = @OPTS@
|
||||
PACKAGE = @PACKAGE@
|
||||
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
|
||||
PACKAGE_DATE = @PACKAGE_DATE@
|
||||
PACKAGE_NAME = @PACKAGE_NAME@
|
||||
PACKAGE_STRING = @PACKAGE_STRING@
|
||||
PACKAGE_TARNAME = @PACKAGE_TARNAME@
|
||||
PACKAGE_VERSION = @PACKAGE_VERSION@
|
||||
PACKAGE_YEAR = @PACKAGE_YEAR@
|
||||
PATH_SEPARATOR = @PATH_SEPARATOR@
|
||||
RANLIB = @RANLIB@
|
||||
RPO_NO = @RPO_NO@
|
||||
RPO_YES = @RPO_YES@
|
||||
SET_MAKE = @SET_MAKE@
|
||||
SHELL = @SHELL@
|
||||
STRIP = @STRIP@
|
||||
USING_CL_FALSE = @USING_CL_FALSE@
|
||||
USING_CL_TRUE = @USING_CL_TRUE@
|
||||
VERSION = @VERSION@
|
||||
ac_ct_CC = @ac_ct_CC@
|
||||
ac_ct_CXX = @ac_ct_CXX@
|
||||
ac_ct_RANLIB = @ac_ct_RANLIB@
|
||||
ac_ct_STRIP = @ac_ct_STRIP@
|
||||
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
|
||||
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
|
||||
am__fastdepCXX_FALSE = @am__fastdepCXX_FALSE@
|
||||
am__fastdepCXX_TRUE = @am__fastdepCXX_TRUE@
|
||||
am__include = @am__include@
|
||||
am__leading_dot = @am__leading_dot@
|
||||
am__quote = @am__quote@
|
||||
am__tar = @am__tar@
|
||||
am__untar = @am__untar@
|
||||
bindir = @bindir@
|
||||
build = @build@
|
||||
build_alias = @build_alias@
|
||||
build_cpu = @build_cpu@
|
||||
build_os = @build_os@
|
||||
build_vendor = @build_vendor@
|
||||
datadir = @datadir@
|
||||
exec_prefix = @exec_prefix@
|
||||
host = @host@
|
||||
host_alias = @host_alias@
|
||||
host_cpu = @host_cpu@
|
||||
host_os = @host_os@
|
||||
host_vendor = @host_vendor@
|
||||
includedir = @includedir@
|
||||
infodir = @infodir@
|
||||
install_sh = @install_sh@
|
||||
libdir = @libdir@
|
||||
libexecdir = @libexecdir@
|
||||
localstatedir = @localstatedir@
|
||||
mandir = @mandir@
|
||||
mkdir_p = @mkdir_p@
|
||||
oldincludedir = @oldincludedir@
|
||||
prefix = @prefix@
|
||||
program_transform_name = @program_transform_name@
|
||||
sbindir = @sbindir@
|
||||
sharedstatedir = @sharedstatedir@
|
||||
sysconfdir = @sysconfdir@
|
||||
target_alias = @target_alias@
|
||||
SUBDIRS =
|
||||
AM_CPPFLAGS = \
|
||||
-I$(top_srcdir)/ccutil -I$(top_srcdir)/cutil \
|
||||
-I$(top_srcdir)/image -I$(top_srcdir)/viewer
|
||||
|
||||
EXTRA_DIST = \
|
||||
blckerr.h blobbox.h blobs.h blread.h coutln.h crakedge.h \
|
||||
genblob.h hpddef.h hpdsizes.h ipoints.h labls.h linlsq.h \
|
||||
lmedsq.h mod128.h normalis.h ocrblock.h ocrrow.h pageblk.h \
|
||||
pageres.h pdblock.h pdclass.h points.h polyaprx.h polyblk.h \
|
||||
polyblob.h polyvert.h poutline.h quadlsq.h quadratc.h \
|
||||
quspline.h ratngs.h rect.h rejctmap.h rwpoly.h statistc.h \
|
||||
stepblob.h txtregn.h vecfuncs.h werd.h
|
||||
|
||||
noinst_LIBRARIES = libtesseract_ccstruct.a
|
||||
libtesseract_ccstruct_a_SOURCES = \
|
||||
blobbox.cpp blobs.cpp blread.cpp callcpp.cpp \
|
||||
coutln.cpp genblob.cpp labls.cpp linlsq.cpp \
|
||||
lmedsq.cpp mod128.cpp normalis.cpp ocrblock.cpp \
|
||||
ocrrow.cpp pageblk.cpp pageres.cpp pdblock.cpp \
|
||||
points.cpp polyaprx.cpp polyblk.cpp polyblob.cpp \
|
||||
polyvert.cpp poutline.cpp quadlsq.cpp quadratc.cpp \
|
||||
quspline.cpp ratngs.cpp rect.cpp rejctmap.cpp \
|
||||
rwpoly.cpp statistc.cpp stepblob.cpp txtregn.cpp \
|
||||
vecfuncs.cpp werd.cpp
|
||||
|
||||
all: all-recursive
|
||||
|
||||
.SUFFIXES:
|
||||
.SUFFIXES: .cpp .o .obj
|
||||
$(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am $(am__configure_deps)
|
||||
@for dep in $?; do \
|
||||
case '$(am__configure_deps)' in \
|
||||
*$$dep*) \
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
|
||||
&& exit 0; \
|
||||
exit 1;; \
|
||||
esac; \
|
||||
done; \
|
||||
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu ccstruct/Makefile'; \
|
||||
cd $(top_srcdir) && \
|
||||
$(AUTOMAKE) --gnu ccstruct/Makefile
|
||||
.PRECIOUS: Makefile
|
||||
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
|
||||
@case '$?' in \
|
||||
*config.status*) \
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
|
||||
*) \
|
||||
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
|
||||
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
|
||||
esac;
|
||||
|
||||
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
|
||||
|
||||
$(top_srcdir)/configure: @MAINTAINER_MODE_TRUE@ $(am__configure_deps)
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
|
||||
$(ACLOCAL_M4): @MAINTAINER_MODE_TRUE@ $(am__aclocal_m4_deps)
|
||||
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
|
||||
|
||||
clean-noinstLIBRARIES:
|
||||
-test -z "$(noinst_LIBRARIES)" || rm -f $(noinst_LIBRARIES)
|
||||
libtesseract_ccstruct.a: $(libtesseract_ccstruct_a_OBJECTS) $(libtesseract_ccstruct_a_DEPENDENCIES)
|
||||
-rm -f libtesseract_ccstruct.a
|
||||
$(libtesseract_ccstruct_a_AR) libtesseract_ccstruct.a $(libtesseract_ccstruct_a_OBJECTS) $(libtesseract_ccstruct_a_LIBADD)
|
||||
$(RANLIB) libtesseract_ccstruct.a
|
||||
|
||||
mostlyclean-compile:
|
||||
-rm -f *.$(OBJEXT)
|
||||
|
||||
distclean-compile:
|
||||
-rm -f *.tab.c
|
||||
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/blobbox.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/blobs.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/blread.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/callcpp.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/coutln.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/genblob.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/labls.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/linlsq.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lmedsq.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/mod128.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/normalis.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ocrblock.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ocrrow.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/pageblk.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/pageres.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/pdblock.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/points.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/polyaprx.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/polyblk.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/polyblob.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/polyvert.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/poutline.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/quadlsq.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/quadratc.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/quspline.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ratngs.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/rect.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/rejctmap.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/rwpoly.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/statistc.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stepblob.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/txtregn.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/vecfuncs.Po@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/werd.Po@am__quote@
|
||||
|
||||
.cpp.o:
|
||||
@am__fastdepCXX_TRUE@ if $(CXXCOMPILE) -MT $@ -MD -MP -MF "$(DEPDIR)/$*.Tpo" -c -o $@ $<; \
|
||||
@am__fastdepCXX_TRUE@ then mv -f "$(DEPDIR)/$*.Tpo" "$(DEPDIR)/$*.Po"; else rm -f "$(DEPDIR)/$*.Tpo"; exit 1; fi
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||
@am__fastdepCXX_FALSE@ $(CXXCOMPILE) -c -o $@ $<
|
||||
|
||||
.cpp.obj:
|
||||
@am__fastdepCXX_TRUE@ if $(CXXCOMPILE) -MT $@ -MD -MP -MF "$(DEPDIR)/$*.Tpo" -c -o $@ `$(CYGPATH_W) '$<'`; \
|
||||
@am__fastdepCXX_TRUE@ then mv -f "$(DEPDIR)/$*.Tpo" "$(DEPDIR)/$*.Po"; else rm -f "$(DEPDIR)/$*.Tpo"; exit 1; fi
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
|
||||
@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||
@am__fastdepCXX_FALSE@ $(CXXCOMPILE) -c -o $@ `$(CYGPATH_W) '$<'`
|
||||
uninstall-info-am:
|
||||
|
||||
# This directory's subdirectories are mostly independent; you can cd
|
||||
# into them and run `make' without going through this Makefile.
|
||||
# To change the values of `make' variables: instead of editing Makefiles,
|
||||
# (1) if the variable is set in `config.status', edit `config.status'
|
||||
# (which will cause the Makefiles to be regenerated when you run `make');
|
||||
# (2) otherwise, pass the desired values on the `make' command line.
|
||||
$(RECURSIVE_TARGETS):
|
||||
@failcom='exit 1'; \
|
||||
for f in x $$MAKEFLAGS; do \
|
||||
case $$f in \
|
||||
*=* | --[!k]*);; \
|
||||
*k*) failcom='fail=yes';; \
|
||||
esac; \
|
||||
done; \
|
||||
dot_seen=no; \
|
||||
target=`echo $@ | sed s/-recursive//`; \
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
echo "Making $$target in $$subdir"; \
|
||||
if test "$$subdir" = "."; then \
|
||||
dot_seen=yes; \
|
||||
local_target="$$target-am"; \
|
||||
else \
|
||||
local_target="$$target"; \
|
||||
fi; \
|
||||
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|
||||
|| eval $$failcom; \
|
||||
done; \
|
||||
if test "$$dot_seen" = "no"; then \
|
||||
$(MAKE) $(AM_MAKEFLAGS) "$$target-am" || exit 1; \
|
||||
fi; test -z "$$fail"
|
||||
|
||||
mostlyclean-recursive clean-recursive distclean-recursive \
|
||||
maintainer-clean-recursive:
|
||||
@failcom='exit 1'; \
|
||||
for f in x $$MAKEFLAGS; do \
|
||||
case $$f in \
|
||||
*=* | --[!k]*);; \
|
||||
*k*) failcom='fail=yes';; \
|
||||
esac; \
|
||||
done; \
|
||||
dot_seen=no; \
|
||||
case "$@" in \
|
||||
distclean-* | maintainer-clean-*) list='$(DIST_SUBDIRS)' ;; \
|
||||
*) list='$(SUBDIRS)' ;; \
|
||||
esac; \
|
||||
rev=''; for subdir in $$list; do \
|
||||
if test "$$subdir" = "."; then :; else \
|
||||
rev="$$subdir $$rev"; \
|
||||
fi; \
|
||||
done; \
|
||||
rev="$$rev ."; \
|
||||
target=`echo $@ | sed s/-recursive//`; \
|
||||
for subdir in $$rev; do \
|
||||
echo "Making $$target in $$subdir"; \
|
||||
if test "$$subdir" = "."; then \
|
||||
local_target="$$target-am"; \
|
||||
else \
|
||||
local_target="$$target"; \
|
||||
fi; \
|
||||
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|
||||
|| eval $$failcom; \
|
||||
done && test -z "$$fail"
|
||||
tags-recursive:
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) tags); \
|
||||
done
|
||||
ctags-recursive:
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) ctags); \
|
||||
done
|
||||
|
||||
ID: $(HEADERS) $(SOURCES) $(LISP) $(TAGS_FILES)
|
||||
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
|
||||
unique=`for i in $$list; do \
|
||||
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
|
||||
done | \
|
||||
$(AWK) ' { files[$$0] = 1; } \
|
||||
END { for (i in files) print i; }'`; \
|
||||
mkid -fID $$unique
|
||||
tags: TAGS
|
||||
|
||||
TAGS: tags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
|
||||
$(TAGS_FILES) $(LISP)
|
||||
tags=; \
|
||||
here=`pwd`; \
|
||||
if ($(ETAGS) --etags-include --version) >/dev/null 2>&1; then \
|
||||
include_option=--etags-include; \
|
||||
empty_fix=.; \
|
||||
else \
|
||||
include_option=--include; \
|
||||
empty_fix=; \
|
||||
fi; \
|
||||
list='$(SUBDIRS)'; for subdir in $$list; do \
|
||||
if test "$$subdir" = .; then :; else \
|
||||
test ! -f $$subdir/TAGS || \
|
||||
tags="$$tags $$include_option=$$here/$$subdir/TAGS"; \
|
||||
fi; \
|
||||
done; \
|
||||
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
|
||||
unique=`for i in $$list; do \
|
||||
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
|
||||
done | \
|
||||
$(AWK) ' { files[$$0] = 1; } \
|
||||
END { for (i in files) print i; }'`; \
|
||||
if test -z "$(ETAGS_ARGS)$$tags$$unique"; then :; else \
|
||||
test -n "$$unique" || unique=$$empty_fix; \
|
||||
$(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
|
||||
$$tags $$unique; \
|
||||
fi
|
||||
ctags: CTAGS
|
||||
CTAGS: ctags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
|
||||
$(TAGS_FILES) $(LISP)
|
||||
tags=; \
|
||||
here=`pwd`; \
|
||||
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
|
||||
unique=`for i in $$list; do \
|
||||
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
|
||||
done | \
|
||||
$(AWK) ' { files[$$0] = 1; } \
|
||||
END { for (i in files) print i; }'`; \
|
||||
test -z "$(CTAGS_ARGS)$$tags$$unique" \
|
||||
|| $(CTAGS) $(CTAGSFLAGS) $(AM_CTAGSFLAGS) $(CTAGS_ARGS) \
|
||||
$$tags $$unique
|
||||
|
||||
GTAGS:
|
||||
here=`$(am__cd) $(top_builddir) && pwd` \
|
||||
&& cd $(top_srcdir) \
|
||||
&& gtags -i $(GTAGS_ARGS) $$here
|
||||
|
||||
distclean-tags:
|
||||
-rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
|
||||
|
||||
distdir: $(DISTFILES)
|
||||
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
|
||||
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
|
||||
list='$(DISTFILES)'; for file in $$list; do \
|
||||
case $$file in \
|
||||
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
|
||||
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
|
||||
esac; \
|
||||
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
|
||||
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
|
||||
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
|
||||
dir="/$$dir"; \
|
||||
$(mkdir_p) "$(distdir)$$dir"; \
|
||||
else \
|
||||
dir=''; \
|
||||
fi; \
|
||||
if test -d $$d/$$file; then \
|
||||
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
|
||||
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
|
||||
fi; \
|
||||
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
|
||||
else \
|
||||
test -f $(distdir)/$$file \
|
||||
|| cp -p $$d/$$file $(distdir)/$$file \
|
||||
|| exit 1; \
|
||||
fi; \
|
||||
done
|
||||
list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
|
||||
if test "$$subdir" = .; then :; else \
|
||||
test -d "$(distdir)/$$subdir" \
|
||||
|| $(mkdir_p) "$(distdir)/$$subdir" \
|
||||
|| exit 1; \
|
||||
distdir=`$(am__cd) $(distdir) && pwd`; \
|
||||
top_distdir=`$(am__cd) $(top_distdir) && pwd`; \
|
||||
(cd $$subdir && \
|
||||
$(MAKE) $(AM_MAKEFLAGS) \
|
||||
top_distdir="$$top_distdir" \
|
||||
distdir="$$distdir/$$subdir" \
|
||||
distdir) \
|
||||
|| exit 1; \
|
||||
fi; \
|
||||
done
|
||||
check-am: all-am
|
||||
check: check-recursive
|
||||
all-am: Makefile $(LIBRARIES)
|
||||
installdirs: installdirs-recursive
|
||||
installdirs-am:
|
||||
install: install-recursive
|
||||
install-exec: install-exec-recursive
|
||||
install-data: install-data-recursive
|
||||
uninstall: uninstall-recursive
|
||||
|
||||
install-am: all-am
|
||||
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
|
||||
|
||||
installcheck: installcheck-recursive
|
||||
install-strip:
|
||||
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
|
||||
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
|
||||
`test -z '$(STRIP)' || \
|
||||
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
|
||||
mostlyclean-generic:
|
||||
|
||||
clean-generic:
|
||||
|
||||
distclean-generic:
|
||||
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
|
||||
|
||||
maintainer-clean-generic:
|
||||
@echo "This command is intended for maintainers to use"
|
||||
@echo "it deletes files that may require special tools to rebuild."
|
||||
clean: clean-recursive
|
||||
|
||||
clean-am: clean-generic clean-noinstLIBRARIES mostlyclean-am
|
||||
|
||||
distclean: distclean-recursive
|
||||
-rm -rf ./$(DEPDIR)
|
||||
-rm -f Makefile
|
||||
distclean-am: clean-am distclean-compile distclean-generic \
|
||||
distclean-tags
|
||||
|
||||
dvi: dvi-recursive
|
||||
|
||||
dvi-am:
|
||||
|
||||
html: html-recursive
|
||||
|
||||
info: info-recursive
|
||||
|
||||
info-am:
|
||||
|
||||
install-data-am:
|
||||
|
||||
install-exec-am:
|
||||
|
||||
install-info: install-info-recursive
|
||||
|
||||
install-man:
|
||||
|
||||
installcheck-am:
|
||||
|
||||
maintainer-clean: maintainer-clean-recursive
|
||||
-rm -rf ./$(DEPDIR)
|
||||
-rm -f Makefile
|
||||
maintainer-clean-am: distclean-am maintainer-clean-generic
|
||||
|
||||
mostlyclean: mostlyclean-recursive
|
||||
|
||||
mostlyclean-am: mostlyclean-compile mostlyclean-generic
|
||||
|
||||
pdf: pdf-recursive
|
||||
|
||||
pdf-am:
|
||||
|
||||
ps: ps-recursive
|
||||
|
||||
ps-am:
|
||||
|
||||
uninstall-am: uninstall-info-am
|
||||
|
||||
uninstall-info: uninstall-info-recursive
|
||||
|
||||
.PHONY: $(RECURSIVE_TARGETS) CTAGS GTAGS all all-am check check-am \
|
||||
clean clean-generic clean-noinstLIBRARIES clean-recursive \
|
||||
ctags ctags-recursive distclean distclean-compile \
|
||||
distclean-generic distclean-recursive distclean-tags distdir \
|
||||
dvi dvi-am html html-am info info-am install install-am \
|
||||
install-data install-data-am install-exec install-exec-am \
|
||||
install-info install-info-am install-man install-strip \
|
||||
installcheck installcheck-am installdirs installdirs-am \
|
||||
maintainer-clean maintainer-clean-generic \
|
||||
maintainer-clean-recursive mostlyclean mostlyclean-compile \
|
||||
mostlyclean-generic mostlyclean-recursive pdf pdf-am ps ps-am \
|
||||
tags tags-recursive uninstall uninstall-am uninstall-info-am
|
||||
|
||||
# Tell versions [3.59,3.63) of GNU make to not export all variables.
|
||||
# Otherwise a system limit (for SysV at least) may be exceeded.
|
||||
.NOEXPORT:
|
29
ccstruct/blckerr.h
Normal file
@ -0,0 +1,29 @@
|
||||
/**********************************************************************
|
||||
* File: blckerr.h (Formerly blockerr.h)
|
||||
* Description: Error codes for the page block classes.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Mar 19 17:43:30 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef BLCKERR_H
|
||||
#define BLCKERR_H
|
||||
|
||||
#include "errcode.h"
|
||||
|
||||
const ERRCODE BADBLOCKLINE = "Y coordinate in block out of bounds";
|
||||
const ERRCODE LOSTBLOCKLINE = "Can't find rectangle for line";
|
||||
const ERRCODE ILLEGAL_GRADIENT = "Gradient wrong side of edge step!";
|
||||
const ERRCODE WRONG_WORD = "Word doesn't have blobs of that type";
|
||||
#endif
|
778
ccstruct/blobbox.cpp
Normal file
@ -0,0 +1,778 @@
|
||||
/**********************************************************************
|
||||
* File: blobbox.cpp (Formerly blobnbox.c)
|
||||
* Description: Code for the textord blob class.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Jul 30 09:08:51 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "blobbox.h"
|
||||
|
||||
#define PROJECTION_MARGIN 10 //arbitrary
|
||||
#define EXTERN
|
||||
|
||||
EXTERN double_VAR (textord_error_weight, 3,
|
||||
"Weighting for error in believability");
|
||||
EXTERN BOOL_VAR (pitsync_projection_fix, TRUE,
|
||||
"Fix bug in projection profile");
|
||||
|
||||
ELISTIZE (BLOBNBOX) ELIST2IZE (TO_ROW) ELISTIZE (TO_BLOCK)
|
||||
/**********************************************************************
|
||||
* BLOBNBOX::merge
|
||||
*
|
||||
* Merge this blob with the given blob, which should be after this.
|
||||
**********************************************************************/
|
||||
void BLOBNBOX::merge( //merge blobs
|
||||
BLOBNBOX *nextblob //blob to join with
|
||||
) {
|
||||
box += nextblob->box; //merge boxes
|
||||
nextblob->joined = TRUE;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOBNBOX::chop
|
||||
*
|
||||
* Chop this blob into equal sized pieces using the x height as a guide.
|
||||
* The blob is not actually chopped. Instead, fake blobs are inserted
|
||||
* with the relevant bounding boxes.
|
||||
**********************************************************************/
|
||||
|
||||
void BLOBNBOX::chop( //chop blobs
|
||||
BLOBNBOX_IT *start_it, //location of this
|
||||
BLOBNBOX_IT *end_it, //iterator
|
||||
FCOORD rotation, //for landscape
|
||||
float xheight //of line
|
||||
) {
|
||||
INT16 blobcount; //no of blobs
|
||||
BLOBNBOX *newblob; //fake blob
|
||||
BLOBNBOX *blob; //current blob
|
||||
INT16 blobindex; //number of chop
|
||||
INT16 leftx; //left edge of blob
|
||||
float blobwidth; //width of each
|
||||
float rightx; //right edge to scan
|
||||
float ymin, ymax; //limits of new blob
|
||||
float test_ymin, test_ymax; //limits of part blob
|
||||
ICOORD bl, tr; //corners of box
|
||||
BLOBNBOX_IT blob_it; //blob iterator
|
||||
|
||||
//get no of chops
|
||||
blobcount = (INT16) floor (box.width () / xheight);
|
||||
if (blobcount > 1 && (blob_ptr != NULL || cblob_ptr != NULL)) {
|
||||
//width of each
|
||||
blobwidth = (float) (box.width () + 1) / blobcount;
|
||||
for (blobindex = blobcount - 1, rightx = box.right ();
|
||||
blobindex >= 0; blobindex--, rightx -= blobwidth) {
|
||||
ymin = (float) MAX_INT32;
|
||||
ymax = (float) -MAX_INT32;
|
||||
blob_it = *start_it;
|
||||
do {
|
||||
blob = blob_it.data ();
|
||||
if (blob->blob_ptr != NULL)
|
||||
find_blob_limits (blob->blob_ptr, rightx - blobwidth, rightx,
|
||||
rotation, test_ymin, test_ymax);
|
||||
else
|
||||
find_cblob_vlimits (blob->cblob_ptr, rightx - blobwidth,
|
||||
rightx,
|
||||
/*rotation, */ test_ymin, test_ymax);
|
||||
blob_it.forward ();
|
||||
if (test_ymin < ymin)
|
||||
ymin = test_ymin;
|
||||
if (test_ymax > ymax)
|
||||
ymax = test_ymax;
|
||||
}
|
||||
while (blob != end_it->data ());
|
||||
if (ymin < ymax) {
|
||||
leftx = (INT16) floor (rightx - blobwidth);
|
||||
if (leftx < box.left ())
|
||||
leftx = box.left (); //clip to real box
|
||||
bl = ICOORD (leftx, (INT16) floor (ymin));
|
||||
tr = ICOORD ((INT16) ceil (rightx), (INT16) ceil (ymax));
|
||||
if (blobindex == 0)
|
||||
box = BOX (bl, tr); //change box
|
||||
else {
|
||||
newblob = new BLOBNBOX;
|
||||
//box is all it has
|
||||
newblob->box = BOX (bl, tr);
|
||||
//stay on current
|
||||
end_it->add_after_stay_put (newblob);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
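The chop arithmetic above can be sketched in isolation. This is a minimal sketch, not the Tesseract implementation: `XSpan` and `chop_spans` are hypothetical names, the box is reduced to its integer x extent, and the width is assumed to be `right - left`. It only reproduces how the piece count, piece width, and right-to-left clipping are derived.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the x extent of one fake blob.
struct XSpan {
  int left;
  int right;
};

// Split [box_left, box_right] into floor(width / xheight) pieces,
// walking from the right edge to the left as BLOBNBOX::chop does.
// Returns an empty vector when the box is too narrow to chop.
std::vector<XSpan> chop_spans(int box_left, int box_right, float xheight) {
  assert(xheight > 0.0f && box_right >= box_left);
  std::vector<XSpan> spans;
  const int width = box_right - box_left;  // assumed definition of width
  const int count = static_cast<int>(std::floor(width / xheight));
  if (count <= 1)
    return spans;
  const float piece = static_cast<float>(width + 1) / count;
  float rightx = static_cast<float>(box_right);
  for (int index = count - 1; index >= 0; --index, rightx -= piece) {
    int leftx = static_cast<int>(std::floor(rightx - piece));
    if (leftx < box_left)
      leftx = box_left;  // clip to the real box
    spans.push_back({leftx, static_cast<int>(std::ceil(rightx))});
  }
  return spans;
}
```

For a box spanning x = 0..100 with an x-height of 30, this yields three overlapping spans, the rightmost first. The real function additionally scans the outlines to find the y limits of each piece and skips pieces that contain no outline points.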
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* find_blob_limits
|
||||
*
|
||||
* Scan the outlines of the blob to locate the y min and max
|
||||
* between the given x limits.
|
||||
**********************************************************************/
|
||||
|
||||
void find_blob_limits( //get y limits
|
||||
PBLOB *blob, //blob to search
|
||||
float leftx, //x limits
|
||||
float rightx,
|
||||
FCOORD rotation, //for landscape
|
||||
float &ymin, //output y limits
|
||||
float &ymax) {
|
||||
float testy; //y intercept
|
||||
FCOORD pos; //rotated
|
||||
FCOORD vec;
|
||||
POLYPT *polypt; //current point
|
||||
//outlines
|
||||
OUTLINE_IT out_it = blob->out_list ();
|
||||
POLYPT_IT poly_it; //outline pts
|
||||
|
||||
ymin = (float) MAX_INT32;
|
||||
ymax = (float) -MAX_INT32;
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
//get points
|
||||
poly_it.set_to_list (out_it.data ()->polypts ());
|
||||
for (poly_it.mark_cycle_pt (); !poly_it.cycled_list ();
|
||||
poly_it.forward ()) {
|
||||
polypt = poly_it.data ();
|
||||
pos = polypt->pos;
|
||||
pos.rotate (rotation);
|
||||
vec = polypt->vec;
|
||||
vec.rotate (rotation);
|
||||
if ((pos.x () < leftx && pos.x () + vec.x () > leftx)
|
||||
|| (pos.x () > leftx && pos.x () + vec.x () < leftx)) {
|
||||
testy = pos.y () + vec.y () * (leftx - pos.x ()) / vec.x ();
|
||||
//intercept of boundary
|
||||
if (testy < ymin)
|
||||
ymin = testy;
|
||||
if (testy > ymax)
|
||||
ymax = testy;
|
||||
}
|
||||
if (pos.x () >= leftx && pos.x () <= rightx) {
|
||||
if (pos.y () > ymax)
|
||||
ymax = pos.y ();
|
||||
if (pos.y () < ymin)
|
||||
ymin = pos.y ();
|
||||
}
|
||||
if ((pos.x () > rightx && pos.x () + vec.x () < rightx)
|
||||
|| (pos.x () < rightx && pos.x () + vec.x () > rightx)) {
|
||||
testy = pos.y () + vec.y () * (rightx - pos.x ()) / vec.x ();
|
||||
//intercept of boundary
|
||||
if (testy < ymin)
|
||||
ymin = testy;
|
||||
if (testy > ymax)
|
||||
ymax = testy;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
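The boundary intercept used in find_blob_limits is plain linear interpolation along one polygon edge. A minimal sketch, assuming an edge is given by a start point and a direction vector; `crosses` and `intercept_y` are hypothetical helper names that do not exist in the source:

```cpp
#include <cassert>

// True when the edge from x0 to x0 + dx strictly crosses the vertical
// line x = cut, in either direction. Touching the line does not count.
bool crosses(float x0, float dx, float cut) {
  return (x0 < cut && x0 + dx > cut) || (x0 > cut && x0 + dx < cut);
}

// Y value at which the edge starting at (x0, y0) with direction (dx, dy)
// meets the vertical line x = cut.
float intercept_y(float x0, float y0, float dx, float dy, float cut) {
  assert(crosses(x0, dx, cut));  // a strict crossing guarantees dx != 0
  return y0 + dy * (cut - x0) / dx;
}
```

Because the crossing test is strict, the division by `dx` can never be a division by zero, which is why the original code needs no separate check for vertical edges.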
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* find_cblob_limits
|
||||
*
|
||||
* Scan the outlines of the cblob to locate the y min and max
|
||||
* between the given x limits.
|
||||
**********************************************************************/
|
||||
|
||||
void find_cblob_limits( //get y limits
|
||||
C_BLOB *blob, //blob to search
|
||||
float leftx, //x limits
|
||||
float rightx,
|
||||
FCOORD rotation, //for landscape
|
||||
float &ymin, //output y limits
|
||||
float &ymax) {
|
||||
INT16 stepindex; //current point
|
||||
ICOORD pos; //current coords
|
||||
ICOORD vec; //rotated step
|
||||
C_OUTLINE *outline; //current outline
|
||||
//outlines
|
||||
C_OUTLINE_IT out_it = blob->out_list ();
|
||||
|
||||
ymin = (float) MAX_INT32;
|
||||
ymax = (float) -MAX_INT32;
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
outline = out_it.data ();
|
||||
pos = outline->start_pos (); //get coords
|
||||
pos.rotate (rotation);
|
||||
for (stepindex = 0; stepindex < outline->pathlength (); stepindex++) {
|
||||
//inside
|
||||
if (pos.x () >= leftx && pos.x () <= rightx) {
|
||||
if (pos.y () > ymax)
|
||||
ymax = pos.y ();
|
||||
if (pos.y () < ymin)
|
||||
ymin = pos.y ();
|
||||
}
|
||||
vec = outline->step (stepindex);
|
||||
vec.rotate (rotation);
|
||||
pos += vec; //move to next
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* find_cblob_vlimits
|
||||
*
|
||||
* Scan the outlines of the cblob to locate the y min and max
|
||||
* between the given x limits.
|
||||
**********************************************************************/
|
||||
|
||||
void find_cblob_vlimits( //get y limits
|
||||
C_BLOB *blob, //blob to search
|
||||
float leftx, //x limits
|
||||
float rightx,
|
||||
float &ymin, //output y limits
|
||||
float &ymax) {
|
||||
INT16 stepindex; //current point
|
||||
ICOORD pos; //current coords
|
||||
ICOORD vec; //current step
|
||||
C_OUTLINE *outline; //current outline
|
||||
//outlines
|
||||
C_OUTLINE_IT out_it = blob->out_list ();
|
||||
|
||||
ymin = (float) MAX_INT32;
|
||||
ymax = (float) -MAX_INT32;
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
outline = out_it.data ();
|
||||
pos = outline->start_pos (); //get coords
|
||||
for (stepindex = 0; stepindex < outline->pathlength (); stepindex++) {
|
||||
//inside
|
||||
if (pos.x () >= leftx && pos.x () <= rightx) {
|
||||
if (pos.y () > ymax)
|
||||
ymax = pos.y ();
|
||||
if (pos.y () < ymin)
|
||||
ymin = pos.y ();
|
||||
}
|
||||
vec = outline->step (stepindex);
|
||||
pos += vec; //move to next
|
||||
}
|
||||
}
|
||||
}
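The edge-step walk in find_cblob_vlimits can be illustrated without the Tesseract classes. This is a sketch under stated assumptions: `Step`, `YLimits`, and `vertical_limits` are hypothetical names, and the outline is passed as a start point plus a vector of unit steps rather than as a C_OUTLINE.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical unit step of an edge-step outline.
struct Step {
  int dx;
  int dy;
};

struct YLimits {
  float ymin;
  float ymax;
};

// Walk the outline and track the lowest and highest y of every visited
// point whose x lies in [leftx, rightx]. When no point qualifies the
// result keeps ymin > ymax, which callers can test for.
YLimits vertical_limits(int start_x, int start_y,
                        const std::vector<Step> &steps,
                        float leftx, float rightx) {
  assert(leftx <= rightx);
  YLimits limits{std::numeric_limits<float>::max(),
                 std::numeric_limits<float>::lowest()};
  int x = start_x;
  int y = start_y;
  for (const Step &step : steps) {
    if (x >= leftx && x <= rightx) {
      if (y > limits.ymax)
        limits.ymax = static_cast<float>(y);
      if (y < limits.ymin)
        limits.ymin = static_cast<float>(y);
    }
    x += step.dx;  // move to the next outline point
    y += step.dy;
  }
  return limits;
}
```

The "empty" result with ymin greater than ymax mirrors the `ymin < ymax` check that BLOBNBOX::chop applies before creating a fake blob.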
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* find_cblob_hlimits
|
||||
*
|
||||
* Scan the outlines of the cblob to locate the x min and max
|
||||
* between the given y limits.
|
||||
**********************************************************************/
|
||||
|
||||
void find_cblob_hlimits( //get x limits
|
||||
C_BLOB *blob, //blob to search
|
||||
float bottomy, //y limits
|
||||
float topy,
|
||||
float &xmin, //output x limits
|
||||
float &xmax) {
|
||||
INT16 stepindex; //current point
|
||||
ICOORD pos; //current coords
|
||||
ICOORD vec; //current step
|
||||
C_OUTLINE *outline; //current outline
|
||||
//outlines
|
||||
C_OUTLINE_IT out_it = blob->out_list ();
|
||||
|
||||
xmin = (float) MAX_INT32;
|
||||
xmax = (float) -MAX_INT32;
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
outline = out_it.data ();
|
||||
pos = outline->start_pos (); //get coords
|
||||
for (stepindex = 0; stepindex < outline->pathlength (); stepindex++) {
|
||||
//inside
|
||||
if (pos.y () >= bottomy && pos.y () <= topy) {
|
||||
if (pos.x () > xmax)
|
||||
xmax = pos.x ();
|
||||
if (pos.x () < xmin)
|
||||
xmin = pos.x ();
|
||||
}
|
||||
vec = outline->step (stepindex);
|
||||
pos += vec; //move to next
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* rotate_blob
|
||||
*
|
||||
* Deep copy the blob and rotate the copy by the given vector.
|
||||
**********************************************************************/
|
||||
|
||||
PBLOB *rotate_blob( //rotate a copy
|
||||
PBLOB *blob, //blob to rotate
|
||||
FCOORD rotation //vector to rotate by
|
||||
) {
|
||||
PBLOB *copy; //copy of blob
|
||||
POLYPT *polypt; //current point
|
||||
OUTLINE_IT out_it;
|
||||
POLYPT_IT poly_it; //outline pts
|
||||
|
||||
copy = new PBLOB;
|
||||
*copy = *blob; //deep copy
|
||||
out_it.set_to_list (copy->out_list ());
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
//get points
|
||||
poly_it.set_to_list (out_it.data ()->polypts ());
|
||||
for (poly_it.mark_cycle_pt (); !poly_it.cycled_list ();
|
||||
poly_it.forward ()) {
|
||||
polypt = poly_it.data ();
|
||||
//rotate it
|
||||
polypt->pos.rotate (rotation);
|
||||
polypt->vec.rotate (rotation);
|
||||
}
|
||||
out_it.data ()->compute_bb ();
|
||||
}
|
||||
return copy;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* rotate_cblob
|
||||
*
|
||||
* Make a polygonal approximation of the cblob and rotate it by the given vector.
|
||||
**********************************************************************/
|
||||
|
||||
PBLOB *rotate_cblob( //rotate it
|
||||
C_BLOB *blob, //blob to rotate
|
||||
float xheight, //for poly approx
|
||||
FCOORD rotation //for landscape
|
||||
) {
|
||||
PBLOB *copy; //copy of blob
|
||||
POLYPT *polypt; //current point
|
||||
OUTLINE_IT out_it;
|
||||
POLYPT_IT poly_it; //outline pts
|
||||
|
||||
copy = new PBLOB (blob, xheight);
|
||||
out_it.set_to_list (copy->out_list ());
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
//get points
|
||||
poly_it.set_to_list (out_it.data ()->polypts ());
|
||||
for (poly_it.mark_cycle_pt (); !poly_it.cycled_list ();
|
||||
poly_it.forward ()) {
|
||||
polypt = poly_it.data ();
|
||||
//rotate it
|
||||
polypt->pos.rotate (rotation);
|
||||
polypt->vec.rotate (rotation);
|
||||
}
|
||||
out_it.data ()->compute_bb ();
|
||||
}
|
||||
return copy;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* crotate_cblob
|
||||
*
|
||||
* Build rotated copies of the blob's outlines and return them as a new C_BLOB.
|
||||
**********************************************************************/
|
||||
|
||||
C_BLOB *crotate_cblob( //rotate it
|
||||
C_BLOB *blob, //blob to rotate
|
||||
FCOORD rotation //for landscape
|
||||
) {
|
||||
C_OUTLINE_LIST out_list; //output outlines
|
||||
//input outlines
|
||||
C_OUTLINE_IT in_it = blob->out_list ();
|
||||
//output outlines
|
||||
C_OUTLINE_IT out_it = &out_list;
|
||||
|
||||
for (in_it.mark_cycle_pt (); !in_it.cycled_list (); in_it.forward ()) {
|
||||
out_it.add_after_then_move (new C_OUTLINE (in_it.data (), rotation));
|
||||
}
|
||||
return new C_BLOB (&out_list);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* box_next
|
||||
*
|
||||
* Compute the bounding box of this blob with merging of x overlaps
|
||||
* but no pre-chopping.
|
||||
* Then move the iterator on to the start of the next blob.
|
||||
**********************************************************************/
|
||||
|
||||
BOX box_next( //get bounding box
|
||||
BLOBNBOX_IT *it //iterator to blobs
|
||||
) {
|
||||
BLOBNBOX *blob; //current blob
|
||||
BOX result; //total box
|
||||
|
||||
blob = it->data ();
|
||||
result = blob->bounding_box ();
|
||||
do {
|
||||
it->forward ();
|
||||
blob = it->data ();
|
||||
if (blob->blob () == NULL && blob->cblob () == NULL)
|
||||
//was pre-chopped
|
||||
result += blob->bounding_box ();
|
||||
}
|
||||
//until next real blob
|
||||
while ((blob->blob () == NULL && blob->cblob () == NULL) || blob->joined_to_prev ());
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* box_next_pre_chopped
|
||||
*
|
||||
* Compute the bounding box of this blob with merging of x overlaps
|
||||
* but WITH pre-chopping.
|
||||
* Then move the iterator on to the start of the next pre-chopped blob.
|
||||
**********************************************************************/
|
||||
|
||||
BOX box_next_pre_chopped( //get bounding box
|
||||
BLOBNBOX_IT *it //iterator to blobs
|
||||
) {
|
||||
BLOBNBOX *blob; //current blob
|
||||
BOX result; //total box
|
||||
|
||||
blob = it->data ();
|
||||
result = blob->bounding_box ();
|
||||
do {
|
||||
it->forward ();
|
||||
blob = it->data ();
|
||||
}
|
||||
//until next real blob
|
||||
while (blob->joined_to_prev ());
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TO_ROW::TO_ROW
|
||||
*
|
||||
* Constructor to make a row from a blob.
|
||||
**********************************************************************/
|
||||
|
||||
TO_ROW::TO_ROW ( //constructor
|
||||
BLOBNBOX * blob, //first blob
|
||||
float top, //corrected top
|
||||
float bottom, //of row
|
||||
float row_size //ideal
|
||||
):y_min (bottom), y_max (top), initial_y_min (bottom) {
|
||||
float diff; //in size
|
||||
BLOBNBOX_IT it = &blobs; //list of blobs
|
||||
|
||||
it.add_to_end (blob);
|
||||
diff = top - bottom - row_size;
|
||||
if (diff > 0) {
|
||||
y_max -= diff / 2;
|
||||
y_min += diff / 2;
|
||||
}
|
||||
//very small object
|
||||
else if ((top - bottom) * 3 < row_size) {
|
||||
diff = row_size / 3 + bottom - top;
|
||||
y_max += diff / 2;
|
||||
y_min -= diff / 2;
|
||||
}
|
||||
}
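The way the constructor derives the initial row limits can be checked with a few numbers. A minimal sketch, assuming only the arithmetic matters; `RowLimits` and `initial_row_limits` are hypothetical names and the blob list is left out:

```cpp
#include <cassert>

// Hypothetical holder for the two limits the constructor sets.
struct RowLimits {
  float y_min;
  float y_max;
};

// A blob taller than row_size is shrunk symmetrically to row_size.
// A blob shorter than a third of row_size is grown symmetrically
// to row_size / 3. Anything in between keeps its own limits.
RowLimits initial_row_limits(float top, float bottom, float row_size) {
  assert(top >= bottom && row_size > 0.0f);
  RowLimits limits{bottom, top};
  float diff = top - bottom - row_size;
  if (diff > 0) {
    limits.y_max -= diff / 2;
    limits.y_min += diff / 2;
  } else if ((top - bottom) * 3 < row_size) {
    diff = row_size / 3 + bottom - top;
    limits.y_max += diff / 2;
    limits.y_min -= diff / 2;
  }
  return limits;
}
```

With a row size of 30, a blob spanning 0..40 gives limits 5..35, a blob spanning 0..6 gives limits -2..8, and a blob spanning 0..20 is left at 0..20.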
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TO_ROW::add_blob
|
||||
*
|
||||
* Add the blob to the end of the row.
|
||||
**********************************************************************/
|
||||
|
||||
void TO_ROW::add_blob( //add to end of row
|
||||
BLOBNBOX *blob, //blob to add
|
||||
float top, //corrected top
|
||||
float bottom, //of row
|
||||
float row_size //ideal
|
||||
) {
|
||||
float allowed; //allowed expansion
|
||||
float available; //expansion
|
||||
BLOBNBOX_IT it = &blobs; //list of blobs
|
||||
|
||||
it.add_to_end (blob);
|
||||
allowed = row_size + y_min - y_max;
|
||||
if (allowed > 0) {
|
||||
available = top > y_max ? top - y_max : 0;
|
||||
if (bottom < y_min)
|
||||
//total available
|
||||
available += y_min - bottom;
|
||||
if (available > 0) {
|
||||
available += available; //do it gradually
|
||||
if (available < allowed)
|
||||
available = allowed;
|
||||
if (bottom < y_min)
|
||||
y_min -= (y_min - bottom) * allowed / available;
|
||||
if (top > y_max)
|
||||
y_max += (top - y_max) * allowed / available;
|
||||
}
|
||||
}
|
||||
}
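The gradual expansion in add_blob is easier to follow with the list handling removed. This is a sketch under stated assumptions: `RowLimits` and `add_extent` are hypothetical names, and only the update of y_min and y_max is reproduced.

```cpp
#include <cassert>

// Hypothetical holder for the current limits of a row.
struct RowLimits {
  float y_min;
  float y_max;
};

// Grow the row towards a new blob's top and bottom. The row may only
// grow by `allowed` in total, and the overshoot is doubled before
// scaling so that a single blob claims at most half of that room.
void add_extent(RowLimits &row, float top, float bottom, float row_size) {
  assert(row_size > 0.0f);
  const float allowed = row_size + row.y_min - row.y_max;
  if (allowed <= 0)
    return;  // row already at or beyond its ideal size
  float available = top > row.y_max ? top - row.y_max : 0;
  if (bottom < row.y_min)
    available += row.y_min - bottom;
  if (available <= 0)
    return;  // blob lies inside the current limits
  available += available;  // do it gradually
  if (available < allowed)
    available = allowed;
  if (bottom < row.y_min)
    row.y_min -= (row.y_min - bottom) * allowed / available;
  if (top > row.y_max)
    row.y_max += (top - row.y_max) * allowed / available;
}
```

For a row at 0..20 with an ideal size of 30, a blob reaching up to 24 moves y_max to 24 in full, because the overshoot is small. A blob spanning -10..30 overshoots by 20 in total, so each side moves by only 2.5.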
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TO_ROW::insert_blob
|
||||
*
|
||||
* Add the blob to the row in the correct position.
|
||||
**********************************************************************/
|
||||
|
||||
void TO_ROW::insert_blob( //insert in x order
|
||||
BLOBNBOX *blob //blob to insert
|
||||
) {
|
||||
BLOBNBOX_IT it = &blobs; //list of blobs
|
||||
|
||||
if (it.empty ())
|
||||
it.add_before_then_move (blob);
|
||||
else {
|
||||
it.mark_cycle_pt ();
|
||||
while (!it.cycled_list ()
|
||||
&& it.data ()->bounding_box ().left () <=
|
||||
blob->bounding_box ().left ())
|
||||
it.forward ();
|
||||
if (it.cycled_list ())
|
||||
it.add_to_end (blob);
|
||||
else
|
||||
it.add_before_stay_put (blob);
|
||||
}
|
||||
}
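insert_blob keeps the row ordered by the left edge of each bounding box, placing a new blob after any existing blobs with the same left edge. A minimal sketch of the same ordering rule, using std::vector and std::upper_bound in place of the ELIST iterator; `Item`, `left_before`, and `insert_by_left` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a blob: its left edge plus an id for tracing.
struct Item {
  int left;
  int id;
};

// Strict ordering by left edge.
bool left_before(const Item &a, const Item &b) {
  return a.left < b.left;
}

// Insert before the first element whose left edge is strictly greater,
// so equal left edges keep their arrival order.
void insert_by_left(std::vector<Item> &row, const Item &item) {
  assert(std::is_sorted(row.begin(), row.end(), left_before));
  row.insert(std::upper_bound(row.begin(), row.end(), item, left_before),
             item);
}
```

The original loop advances while the existing left edge is less than or equal to the new one, which is the same position std::upper_bound returns here.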
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TO_ROW::compute_vertical_projection
|
||||
*
|
||||
* Compute the vertical projection of a TO_ROW from its blobs.
|
||||
**********************************************************************/
|
||||
|
||||
void TO_ROW::compute_vertical_projection() { //project whole row
|
||||
BOX row_box; //bound of row
|
||||
BLOBNBOX *blob; //current blob
|
||||
BOX blob_box; //bounding box
|
||||
BLOBNBOX_IT blob_it = blob_list ();
|
||||
|
||||
if (blob_it.empty ())
|
||||
return;
|
||||
row_box = blob_it.data ()->bounding_box ();
|
||||
for (blob_it.mark_cycle_pt (); !blob_it.cycled_list (); blob_it.forward ())
|
||||
row_box += blob_it.data ()->bounding_box ();
|
||||
|
||||
projection.set_range (row_box.left () - PROJECTION_MARGIN,
|
||||
row_box.right () + PROJECTION_MARGIN);
|
||||
projection_left = row_box.left () - PROJECTION_MARGIN;
|
||||
projection_right = row_box.right () + PROJECTION_MARGIN;
|
||||
for (blob_it.mark_cycle_pt (); !blob_it.cycled_list (); blob_it.forward ()) {
|
||||
blob = blob_it.data ();
|
||||
if (blob->blob () != NULL)
|
||||
vertical_blob_projection (blob->blob (), &projection);
|
||||
else if (blob->cblob () != NULL)
|
||||
vertical_cblob_projection (blob->cblob (), &projection);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* vertical_blob_projection
|
||||
*
|
||||
* Compute the vertical projection of a blob from its outlines
|
||||
* and add to the given STATS.
|
||||
**********************************************************************/
|
||||
|
||||
void vertical_blob_projection( //project outlines
|
||||
PBLOB *blob, //blob to project
|
||||
STATS *stats //output
|
||||
) {
|
||||
//outlines of blob
|
||||
OUTLINE_IT out_it = blob->out_list ();
|
||||
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
vertical_outline_projection (out_it.data (), stats);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* vertical_outline_projection
|
||||
*
|
||||
* Compute the vertical projection of an outline from its polygon points
|
||||
* and add to the given STATS.
|
||||
**********************************************************************/
|
||||
|
||||
void vertical_outline_projection( //project outlines
|
||||
OUTLINE *outline, //outline to project
|
||||
STATS *stats //output
|
||||
) {
|
||||
POLYPT *polypt; //current point
|
||||
INT32 xcoord; //current pixel coord
|
||||
float end_x; //end of vec
|
||||
POLYPT_IT poly_it = outline->polypts ();
|
||||
OUTLINE_IT out_it = outline->child ();
|
||||
float ymean; //amount to add
|
||||
float width; //amount of x
|
||||
|
||||
for (poly_it.mark_cycle_pt (); !poly_it.cycled_list (); poly_it.forward ()) {
|
||||
polypt = poly_it.data ();
|
||||
end_x = polypt->pos.x () + polypt->vec.x ();
|
||||
if (polypt->vec.x () > 0) {
|
||||
for (xcoord = (INT32) floor (polypt->pos.x ());
|
||||
xcoord < end_x; xcoord++) {
|
||||
if (polypt->pos.x () < xcoord) {
|
||||
width = (float) xcoord;
|
||||
ymean =
|
||||
polypt->vec.y () * (xcoord -
|
||||
polypt->pos.x ()) / polypt->vec.x () +
|
||||
polypt->pos.y ();
|
||||
}
|
||||
else {
|
||||
width = polypt->pos.x ();
|
||||
ymean = polypt->pos.y ();
|
||||
}
|
||||
if (end_x > xcoord + 1) {
|
||||
width -= xcoord + 1;
|
||||
ymean +=
|
||||
polypt->vec.y () * (xcoord + 1 -
|
||||
polypt->pos.x ()) / polypt->vec.x () +
|
||||
polypt->pos.y ();
|
||||
}
|
||||
else {
|
||||
width -= end_x;
|
||||
ymean += polypt->pos.y () + polypt->vec.y ();
|
||||
}
|
||||
ymean = ymean * width / 2;
|
||||
stats->add (xcoord, (INT32) floor (ymean + 0.5));
|
||||
}
|
||||
}
|
||||
else if (polypt->vec.x () < 0) {
|
||||
for (xcoord = (INT32) floor (end_x);
|
||||
xcoord < polypt->pos.x (); xcoord++) {
|
||||
if (polypt->pos.x () > xcoord + 1) {
|
||||
width = xcoord + 1.0f;
|
||||
ymean =
|
||||
polypt->vec.y () * (xcoord + 1 -
|
||||
polypt->pos.x ()) / polypt->vec.x () +
|
||||
polypt->pos.y ();
|
||||
}
|
||||
else {
|
||||
width = polypt->pos.x ();
|
||||
ymean = polypt->pos.y ();
|
||||
}
|
||||
if (end_x < xcoord) {
|
||||
width -= xcoord;
|
||||
ymean +=
|
||||
polypt->vec.y () * (xcoord -
|
||||
polypt->pos.x ()) / polypt->vec.x () +
|
||||
polypt->pos.y ();
|
||||
}
|
||||
else {
|
||||
width -= end_x;
|
||||
ymean += polypt->pos.y () + polypt->vec.y ();
|
||||
}
|
||||
ymean = ymean * width / 2;
|
||||
stats->add (xcoord, (INT32) floor (ymean + 0.5));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
vertical_outline_projection (out_it.data (), stats);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* vertical_cblob_projection
|
||||
*
|
||||
* Compute the vertical projection of a cblob from its outlines
|
||||
* and add to the given STATS.
|
||||
**********************************************************************/
|
||||
|
||||
void vertical_cblob_projection( //project outlines
|
||||
C_BLOB *blob, //blob to project
|
||||
STATS *stats //output
|
||||
) {
|
||||
//outlines of blob
|
||||
C_OUTLINE_IT out_it = blob->out_list ();
|
||||
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
vertical_coutline_projection (out_it.data (), stats);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* vertical_coutline_projection
|
||||
*
|
||||
* Compute the vertical projection of an outline from its edge steps
|
||||
* and add to the given STATS.
|
||||
**********************************************************************/
|
||||
|
||||
void vertical_coutline_projection( //project outlines
|
||||
C_OUTLINE *outline, //outline to project
|
||||
STATS *stats //output
|
||||
) {
|
||||
ICOORD pos; //current point
|
||||
ICOORD step; //edge step
|
||||
INT32 length; //of outline
|
||||
INT16 stepindex; //current step
|
||||
C_OUTLINE_IT out_it = outline->child ();
|
||||
|
||||
pos = outline->start_pos ();
|
||||
length = outline->pathlength ();
|
||||
for (stepindex = 0; stepindex < length; stepindex++) {
|
||||
step = outline->step (stepindex);
|
||||
if (step.x () > 0) {
|
||||
if (pitsync_projection_fix)
|
||||
stats->add (pos.x (), -pos.y ());
|
||||
else
|
||||
stats->add (pos.x (), pos.y ());
|
||||
}
|
||||
else if (step.x () < 0) {
|
||||
if (pitsync_projection_fix)
|
||||
stats->add (pos.x () - 1, pos.y ());
|
||||
else
|
||||
stats->add (pos.x () - 1, -pos.y ());
|
||||
}
|
||||
pos += step;
|
||||
}
|
||||
|
||||
for (out_it.mark_cycle_pt (); !out_it.cycled_list (); out_it.forward ()) {
|
||||
vertical_coutline_projection (out_it.data (), stats);
|
||||
}
|
||||
}
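With pitsync_projection_fix enabled, each rightward step subtracts the current y from its column and each leftward step adds it, so the per-column sum comes out as the height of ink in that column. A minimal sketch, assuming an outline traversed counterclockwise and using std::map in place of STATS; `Step` and `project_outline` are hypothetical names:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical unit step of an edge-step outline.
struct Step {
  int dx;
  int dy;
};

// Accumulate signed y values per pixel column, following the branch
// taken when pitsync_projection_fix is TRUE. Child outlines are not
// handled in this sketch.
std::map<int, int> project_outline(int start_x, int start_y,
                                   const std::vector<Step> &steps) {
  std::map<int, int> columns;
  int x = start_x;
  int y = start_y;
  for (const Step &step : steps) {
    assert((step.dx == 0) != (step.dy == 0));  // one axis per step
    if (step.dx > 0)
      columns[x] -= y;      // lower edge, moving right
    else if (step.dx < 0)
      columns[x - 1] += y;  // upper edge, moving left
    x += step.dx;
    y += step.dy;
  }
  return columns;
}
```

For a rectangle three pixels wide whose lower edge is at y = 1 and upper edge at y = 3, every column receives -1 from the lower edge and +3 from the upper edge, giving 2. With the flag disabled the signs are swapped, which under the same traversal assumption would give negative column sums.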
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TO_BLOCK::TO_BLOCK
|
||||
*
|
||||
* Constructor to make a TO_BLOCK from a real block.
|
||||
**********************************************************************/
|
||||
|
||||
TO_BLOCK::TO_BLOCK( //make a block
|
||||
BLOCK *src_block //real block
|
||||
) {
|
||||
block = src_block;
|
||||
}
|
||||
|
||||
static void clear_blobnboxes(BLOBNBOX_LIST* boxes) {
|
||||
BLOBNBOX_IT it = boxes;
|
||||
// A BLOBNBOX generally doesn't own its blobs, so where it does, they
|
||||
// have to be deleted explicitly.
|
||||
for (it.mark_cycle_pt(); !it.cycled_list(); it.forward()) {
|
||||
BLOBNBOX* box = it.data();
|
||||
if (box->blob() != NULL)
|
||||
delete box->blob();
|
||||
if (box->cblob() != NULL)
|
||||
delete box->cblob();
|
||||
}
|
||||
}
|
||||
|
||||
TO_BLOCK::~TO_BLOCK() {
|
||||
// Any residual BLOBNBOXes at this stage own their blobs, so delete them.
|
||||
clear_blobnboxes(&blobs);
|
||||
clear_blobnboxes(&underlines);
|
||||
clear_blobnboxes(&noise_blobs);
|
||||
clear_blobnboxes(&small_blobs);
|
||||
clear_blobnboxes(&large_blobs);
|
||||
}
|
||||
|
381
ccstruct/blobbox.h
Normal file
@ -0,0 +1,381 @@
|
||||
/**********************************************************************
|
||||
* File: blobbox.h (Formerly blobnbox.h)
|
||||
* Description: Code for the textord blob class.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Jul 30 09:08:51 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef BLOBBOX_H
|
||||
#define BLOBBOX_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "clst.h"
|
||||
#include "elst2.h"
|
||||
#include "werd.h"
|
||||
#include "ocrblock.h"
|
||||
#include "statistc.h"
|
||||
|
||||
extern double_VAR_H (textord_error_weight, 3,
|
||||
"Weighting for error in believability");
|
||||
|
||||
enum PITCH_TYPE
|
||||
{
|
||||
PITCH_DUNNO, //insufficient data
|
||||
PITCH_DEF_FIXED, //definitely fixed
|
||||
PITCH_MAYBE_FIXED, //could be
|
||||
PITCH_DEF_PROP,
|
||||
PITCH_MAYBE_PROP,
|
||||
PITCH_CORR_FIXED,
|
||||
PITCH_CORR_PROP
|
||||
};
|
||||
|
||||
class BLOBNBOX;
|
||||
ELISTIZEH (BLOBNBOX)
|
||||
class BLOBNBOX:public ELIST_LINK
|
||||
{
|
||||
public:
|
||||
BLOBNBOX() { //empty
|
||||
blob_ptr = NULL;
|
||||
cblob_ptr = NULL;
|
||||
joined = FALSE;
|
||||
reduced = FALSE;
|
||||
area = 0;
|
||||
}
|
||||
BLOBNBOX( //constructor
|
||||
PBLOB *srcblob) {
|
||||
blob_ptr = srcblob;
|
||||
cblob_ptr = NULL;
|
||||
box = srcblob->bounding_box ();
|
||||
joined = FALSE;
|
||||
reduced = FALSE;
|
||||
area = (int) srcblob->area ();
|
||||
}
|
||||
BLOBNBOX( //constructor
|
||||
C_BLOB *srcblob) {
|
||||
blob_ptr = NULL;
|
||||
cblob_ptr = srcblob;
|
||||
box = srcblob->bounding_box ();
|
||||
joined = FALSE;
|
||||
reduced = FALSE;
|
||||
area = (int) srcblob->area ();
|
||||
}
|
||||
|
||||
//get bounding box
|
||||
const BOX &bounding_box() const {
|
||||
return box;
|
||||
}
|
||||
//get reduced box
|
||||
const BOX &reduced_box() const {
|
||||
return red_box;
|
||||
}
|
||||
void set_reduced_box( //set other box
|
||||
BOX new_box) {
|
||||
red_box = new_box;
|
||||
reduced = TRUE;
|
||||
}
|
||||
INT32 enclosed_area() const { //get area
|
||||
return area;
|
||||
}
|
||||
|
||||
void rotate_box( //just box
|
||||
FCOORD vec) {
|
||||
box.rotate (vec);
|
||||
}
|
||||
|
||||
BOOL8 joined_to_prev() const { //access function
|
||||
return joined != 0;
|
||||
}
|
||||
BOOL8 red_box_set() const { //access function
|
||||
return reduced != 0;
|
||||
}
|
||||
void merge( //merge with next
|
||||
BLOBNBOX *nextblob);
|
||||
void chop( //fake chop blob
|
||||
BLOBNBOX_IT *start_it, //location of this
|
||||
BLOBNBOX_IT *blob_it, //iterator
|
||||
FCOORD rotation, //for landscape
|
||||
float xheight); //line height
|
||||
|
||||
PBLOB *blob() { //access function
|
||||
return blob_ptr;
|
||||
}
|
||||
C_BLOB *cblob() { //access function
|
||||
return cblob_ptr;
|
||||
}
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void plot( //draw one
|
||||
WINDOW window, //window to draw in
|
||||
COLOUR blob_colour, //for outer bits
|
||||
COLOUR child_colour) { //for holes
|
||||
if (blob_ptr != NULL)
|
||||
blob_ptr->plot (window, blob_colour, child_colour);
|
||||
if (cblob_ptr != NULL)
|
||||
cblob_ptr->plot (window, blob_colour, child_colour);
|
||||
}
|
||||
#endif
|
||||
|
||||
NEWDELETE2 (BLOBNBOX) private:
|
||||
int area:30; //enclosed area
|
||||
int joined:1; //joined to prev
|
||||
int reduced:1; //reduced box set
|
||||
BOX box; //bounding box
|
||||
BOX red_box; //reduced box
|
||||
PBLOB *blob_ptr; //poly blob
|
||||
C_BLOB *cblob_ptr; //edgestep blob
|
||||
};
|
||||
|
||||
class TO_ROW:public ELIST2_LINK
|
||||
{
|
||||
public:
|
||||
TO_ROW() {
|
||||
} //empty
|
||||
TO_ROW( //constructor
|
||||
BLOBNBOX *blob, //from first blob
|
||||
float top, //of row //target height
|
||||
float bottom,
|
||||
float row_size);
|
||||
|
||||
float max_y() const { //access function
|
||||
return y_max;
|
||||
}
|
||||
float min_y() const {
|
||||
return y_min;
|
||||
}
|
||||
float mean_y() const {
|
||||
return (y_min + y_max) / 2.0f;
|
||||
}
|
||||
float initial_min_y() const {
|
||||
return initial_y_min;
|
||||
}
|
||||
float line_m() const { //access to line fit
|
||||
return m;
|
||||
}
|
||||
float line_c() const {
|
||||
return c;
|
||||
}
|
||||
float line_error() const {
|
||||
return error;
|
||||
}
|
||||
float parallel_c() const {
|
||||
return para_c;
|
||||
}
|
||||
float parallel_error() const {
|
||||
return para_error;
|
||||
}
|
||||
float believability() const { //baseline goodness
|
||||
return credibility;
|
||||
}
|
||||
float intercept() const { //real parallel_c
|
||||
return y_origin;
|
||||
}
|
||||
void add_blob( //put in row
|
||||
BLOBNBOX *blob, //blob to add
|
||||
float top, //of row //target height
|
||||
float bottom,
|
||||
float row_size);
|
||||
void insert_blob( //put in row in order
|
||||
BLOBNBOX *blob);
|
||||
|
||||
BLOBNBOX_LIST *blob_list() { //get list
|
||||
return &blobs;
|
||||
}
|
||||
|
||||
void set_line( //set line spec
|
||||
float new_m, //line to set
|
||||
float new_c,
|
||||
float new_error) {
|
||||
m = new_m;
|
||||
c = new_c;
|
||||
error = new_error;
|
||||
}
|
||||
void set_parallel_line( //set fixed gradient line
|
||||
float gradient, //page gradient
|
||||
float new_c,
|
||||
float new_error) {
|
||||
para_c = new_c;
|
||||
para_error = new_error;
|
||||
credibility =
|
||||
(float) (blobs.length () - textord_error_weight * new_error);
|
||||
y_origin = (float) (new_c / sqrt (1 + gradient * gradient));
|
||||
//real intercept
|
||||
}
|
||||
void set_limits( //set min,max
|
||||
float new_min, //bottom and
|
||||
float new_max) { //top of row
|
||||
y_min = new_min;
|
||||
y_max = new_max;
|
||||
}
|
||||
void compute_vertical_projection();
|
||||
//get projection
|
||||
|
||||
//true when dead
|
||||
NEWDELETE2 (TO_ROW) BOOL8 merged;
|
||||
BOOL8 all_caps; //had no ascenders
|
||||
BOOL8 used_dm_model; //in guessing pitch
|
||||
INT16 projection_left; //start of projection
|
||||
INT16 projection_right; //end of projection
|
||||
PITCH_TYPE pitch_decision; //how strong is decision
|
||||
float fixed_pitch; //pitch or 0
|
||||
float fp_space; //sp if fixed pitch
|
||||
float fp_nonsp; //nonsp if fixed pitch
|
||||
float pr_space; //sp if prop
|
||||
float pr_nonsp; //non sp if prop
|
||||
float spacing; //to "next" row
|
||||
float xheight; //of line
|
||||
float ascrise; //ascenders
|
||||
float descdrop; //descenders
|
||||
INT32 min_space; //min size for real space
|
||||
INT32 max_nonspace; //max size of non-space
|
||||
INT32 space_threshold; //space vs nonspace
|
||||
float kern_size; //average non-space
|
||||
float space_size; //average space
|
||||
WERD_LIST rep_words; //repeated chars
|
||||
ICOORDELT_LIST char_cells; //fixed pitch cells
|
||||
QSPLINE baseline; //curved baseline
|
||||
STATS projection; //vertical projection
|
||||
|
||||
private:
|
||||
BLOBNBOX_LIST blobs; //blobs in row
|
||||
float y_min; //coords
|
||||
float y_max;
|
||||
float initial_y_min;
|
||||
float m, c; //line spec
|
||||
float error; //line error
|
||||
float para_c; //constrained fit
|
||||
float para_error;
|
||||
float y_origin; //rotated para_c;
|
||||
float credibility; //baseline believability
|
||||
};
|
||||
|
||||
ELIST2IZEH (TO_ROW)
|
||||
class TO_BLOCK:public ELIST_LINK
|
||||
{
|
||||
public:
|
||||
TO_BLOCK() {
|
||||
} //empty
|
||||
TO_BLOCK( //constructor
|
||||
BLOCK *src_block); //real block
|
||||
~TO_BLOCK();
|
||||
|
||||
TO_ROW_LIST *get_rows() { //access function
|
||||
return &row_list;
|
||||
}
|
||||
|
||||
void print_rows() { //debug info
|
||||
TO_ROW_IT row_it = &row_list;
|
||||
TO_ROW *row;
|
||||
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list ();
|
||||
row_it.forward ()) {
|
||||
row = row_it.data ();
|
||||
printf ("Row range (%g,%g), para_c=%g, blobcount=" INT32FORMAT
|
||||
"\n", row->min_y (), row->max_y (), row->parallel_c (),
|
||||
row->blob_list ()->length ());
|
||||
}
|
||||
}
|
||||
|
||||
BLOBNBOX_LIST blobs; //medium size
|
||||
BLOBNBOX_LIST underlines; //underline blobs
|
||||
BLOBNBOX_LIST noise_blobs; //very small
|
||||
BLOBNBOX_LIST small_blobs; //fairly small
|
||||
BLOBNBOX_LIST large_blobs; //big blobs
|
||||
BLOCK *block; //real block
|
||||
PITCH_TYPE pitch_decision; //how strong is decision
|
||||
float line_spacing; //estimate
|
||||
float line_size; //estimate
|
||||
float max_blob_size; //line assignment limit
|
||||
float baseline_offset; //phase shift
|
||||
float xheight; //median blob size
|
||||
float fixed_pitch; //pitch or 0
|
||||
float kern_size; //average non-space
|
||||
float space_size; //average space
|
||||
INT32 min_space; //min definite space
|
||||
INT32 max_nonspace; //max definite
|
||||
float fp_space; //sp if fixed pitch
|
||||
float fp_nonsp; //nonsp if fixed pitch
|
||||
float pr_space; //sp if prop
|
||||
float pr_nonsp; //non sp if prop
|
||||
TO_ROW *key_row; //starting row
|
||||
|
||||
NEWDELETE2 (TO_BLOCK) private:
|
||||
TO_ROW_LIST row_list; //temporary rows
|
||||
};
|
||||
|
||||
ELISTIZEH (TO_BLOCK)
|
||||
extern double_VAR_H (textord_error_weight, 3,
|
||||
"Weighting for error in believability");
|
||||
void find_blob_limits( //get y limits
|
||||
PBLOB *blob, //blob to search
|
||||
float leftx, //x limits
|
||||
float rightx,
|
||||
FCOORD rotation, //for landscape
|
||||
float &ymin, //output y limits
|
||||
float &ymax);
|
||||
void find_cblob_limits( //get y limits
|
||||
C_BLOB *blob, //blob to search
|
||||
float leftx, //x limits
|
||||
float rightx,
|
||||
FCOORD rotation, //for landscape
|
||||
float &ymin, //output y limits
|
||||
float &ymax);
|
||||
void find_cblob_vlimits( //get y limits
|
||||
C_BLOB *blob, //blob to search
|
||||
float leftx, //x limits
|
||||
float rightx,
|
||||
float &ymin, //output y limits
|
||||
float &ymax);
|
||||
void find_cblob_hlimits( //get x limits
|
||||
C_BLOB *blob, //blob to search
|
||||
float bottomy, //y limits
|
||||
float topy,
|
||||
float &xmin, //output x limits
|
||||
float &xmax);
|
||||
PBLOB *rotate_blob( //rotate it
|
||||
PBLOB *blob, //blob to search
|
||||
FCOORD rotation //vector to rotate by
|
||||
);
|
||||
PBLOB *rotate_cblob( //rotate it
|
||||
C_BLOB *blob, //blob to search
|
||||
float xheight, //for poly approx
|
||||
FCOORD rotation //for landscape
|
||||
);
|
||||
C_BLOB *crotate_cblob( //rotate it
|
||||
C_BLOB *blob, //blob to search
|
||||
FCOORD rotation //for landscape
|
||||
);
|
||||
BOX box_next( //get bounding box
|
||||
BLOBNBOX_IT *it //iterator to blobs
|
||||
);
|
||||
BOX box_next_pre_chopped( //get bounding box
|
||||
BLOBNBOX_IT *it //iterator to blobs
|
||||
);
|
||||
void vertical_blob_projection( //project outlines
|
||||
PBLOB *blob, //blob to project
|
||||
STATS *stats //output
|
||||
);
|
||||
//project outlines
|
||||
void vertical_outline_projection(OUTLINE *outline, //outline to project
|
||||
STATS *stats //output
|
||||
);
|
||||
void vertical_cblob_projection( //project outlines
|
||||
C_BLOB *blob, //blob to project
|
||||
STATS *stats //output
|
||||
);
|
||||
void vertical_coutline_projection( //project outlines
|
||||
C_OUTLINE *outline, //outline to project
|
||||
STATS *stats //output
|
||||
);
|
||||
#endif
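
The arithmetic in TO_ROW::set_parallel_line above can be illustrated in isolation. The following is a minimal sketch, not code from this commit: ParallelFit, fit_parallel_line and kErrorWeight are hypothetical names, and kErrorWeight is assumed to equal the default of 3 declared for textord_error_weight.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the four TO_ROW fields that
// set_parallel_line assigns.
struct ParallelFit {
  float para_c;       // intercept of the constrained (fixed gradient) fit
  float para_error;   // error of that fit
  float credibility;  // baseline believability
  float y_origin;     // para_c rotated onto the page gradient
};

// Assumed to match the default value of textord_error_weight.
const double kErrorWeight = 3.0;

// Mirrors the body of set_parallel_line: more blobs raise the
// credibility, fit error lowers it, and the intercept is scaled
// by 1 / sqrt(1 + gradient^2).
ParallelFit fit_parallel_line(int blob_count, float gradient,
                              float new_c, float new_error) {
  assert(blob_count >= 0);
  ParallelFit fit;
  fit.para_c = new_c;
  fit.para_error = new_error;
  fit.credibility = (float) (blob_count - kErrorWeight * new_error);
  fit.y_origin =
      (float) (new_c / std::sqrt(1.0 + (double) gradient * gradient));
  return fit;
}
```

For instance, a row of 10 blobs with a fit error of 2 gets a credibility of 10 - 3 * 2 = 4, and with a gradient of 0.75 an intercept of 5 maps to a y_origin of 5 / 1.25 = 4.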
|
247
ccstruct/blobs.cpp
Normal file
@ -0,0 +1,247 @@
|
||||
/* -*-C-*-
|
||||
********************************************************************************
|
||||
*
|
||||
* File: blobs.c (Formerly blobs.c)
|
||||
* Description: Blob definition
|
||||
* Author: Mark Seaman, OCR Technology
|
||||
* Created: Fri Oct 27 15:39:52 1989
|
||||
* Modified: Thu Mar 28 15:33:26 1991 (Mark Seaman) marks@hpgrlt
|
||||
* Language: C
|
||||
* Package: N/A
|
||||
* Status: Experimental (Do Not Distribute)
|
||||
*
|
||||
* (c) Copyright 1989, Hewlett-Packard Company.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
*********************************************************************************/
|
||||
|
||||
/*----------------------------------------------------------------------
|
||||
I n c l u d e s
|
||||
----------------------------------------------------------------------*/
|
||||
#include "mfcpch.h"
|
||||
#include "blobs.h"
|
||||
#include "cutil.h"
|
||||
#include "emalloc.h"
|
||||
#include "structures.h"
|
||||
|
||||
/*----------------------------------------------------------------------
|
||||
F u n c t i o n s
|
||||
----------------------------------------------------------------------*/
|
||||
/**********************************************************************
|
||||
* blob_origin
|
||||
*
|
||||
* Compute the origin of a compound blob, defined to be the centre
|
||||
* of the bounding box.
|
||||
**********************************************************************/
|
||||
void blob_origin(TBLOB *blob, /*blob to compute on */
|
||||
TPOINT *origin) { /*return value */
|
||||
TPOINT topleft; /*bounding box */
|
||||
TPOINT botright;
|
||||
|
||||
/*find bounding box */
|
||||
blob_bounding_box(blob, &topleft, &botright);
|
||||
/*centre of box */
|
||||
origin->x = (topleft.x + botright.x) / 2;
|
||||
origin->y = (topleft.y + botright.y) / 2;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* blob_bounding_box
|
||||
*
|
||||
* Compute the bounding_box of a compound blob, defined to be the
|
||||
* max coordinate value of the bounding boxes of all the top-level
|
||||
* outlines in the box.
|
||||
**********************************************************************/
|
||||
void blob_bounding_box(TBLOB *blob, /*blob to compute on */
|
||||
register TPOINT *topleft, /*bounding box */
|
||||
register TPOINT *botright) {
|
||||
register TESSLINE *outline; /*current outline */
|
||||
|
||||
if (blob == NULL || blob->outlines == NULL) {
|
||||
topleft->x = topleft->y = 0;
|
||||
*botright = *topleft; /*default value */
|
||||
}
|
||||
else {
|
||||
outline = blob->outlines;
|
||||
*topleft = outline->topleft;
|
||||
*botright = outline->botright;
|
||||
for (outline = outline->next; outline != NULL; outline = outline->next) {
|
||||
if (outline->topleft.x < topleft->x)
|
||||
/*find extremes */
|
||||
topleft->x = outline->topleft.x;
|
||||
if (outline->botright.x > botright->x)
|
||||
/*find extremes */
|
||||
botright->x = outline->botright.x;
|
||||
if (outline->topleft.y > topleft->y)
|
||||
/*find extremes */
|
||||
topleft->y = outline->topleft.y;
|
||||
if (outline->botright.y < botright->y)
|
||||
/*find extremes */
|
||||
botright->y = outline->botright.y;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* blobs_bounding_box
|
||||
*
|
||||
* Return the extreme points of the smallest box that contains this word.
|
||||
**********************************************************************/
|
||||
void blobs_bounding_box(TBLOB *blobs, TPOINT *topleft, TPOINT *botright) {
|
||||
TPOINT tl;
|
||||
TPOINT br;
|
||||
TBLOB *blob;
|
||||
/* Start with first blob */
|
||||
blob_bounding_box(blobs, topleft, botright);
|
||||
|
||||
iterate_blobs(blob, blobs) {
|
||||
blob_bounding_box(blob, &tl, &br);
|
||||
|
||||
if (tl.x < topleft->x)
|
||||
topleft->x = tl.x;
|
||||
if (tl.y > topleft->y)
|
||||
topleft->y = tl.y;
|
||||
if (br.x > botright->x)
|
||||
botright->x = br.x;
|
||||
if (br.y < botright->y)
|
||||
botright->y = br.y;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* blobs_origin
|
||||
*
|
||||
* Compute the origin of a compound blob, defined to be the centre
|
||||
* of the bounding box.
|
||||
**********************************************************************/
|
||||
void blobs_origin(TBLOB *blobs, /*blob to compute on */
|
||||
TPOINT *origin) { /*return value */
|
||||
TPOINT topleft; /*bounding box */
|
||||
TPOINT botright;
|
||||
|
||||
/*find bounding box */
|
||||
blobs_bounding_box(blobs, &topleft, &botright);
|
||||
/*center of box */
|
||||
origin->x = (topleft.x + botright.x) / 2;
|
||||
origin->y = (topleft.y + botright.y) / 2;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* blobs_widths
|
||||
*
|
||||
* Compute the widths of a list of blobs. Return an array of the widths
|
||||
* and gaps.
|
||||
**********************************************************************/
|
||||
WIDTH_RECORD *blobs_widths(TBLOB *blobs) { /*blob to compute on */
|
||||
WIDTH_RECORD *width_record;
|
||||
TPOINT topleft; /*bounding box */
|
||||
TPOINT botright;
|
||||
TBLOB *blob; /*blob to compute on */
|
||||
int i = 0;
|
||||
int blob_end;
|
||||
int num_blobs = count_blobs (blobs);
|
||||
|
||||
/* Get memory */
|
||||
width_record = (WIDTH_RECORD *) memalloc (sizeof (int) * num_blobs * 2);
|
||||
width_record->num_chars = num_blobs;
|
||||
|
||||
blob_bounding_box(blobs, &topleft, &botright);
|
||||
width_record->widths[i++] = botright.x - topleft.x;
|
||||
/* First width */
|
||||
blob_end = botright.x;
|
||||
|
||||
iterate_blobs (blob, blobs->next) {
|
||||
blob_bounding_box(blob, &topleft, &botright);
|
||||
width_record->widths[i++] = topleft.x - blob_end;
|
||||
width_record->widths[i++] = botright.x - topleft.x;
|
||||
blob_end = botright.x;
|
||||
}
|
||||
return (width_record);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* count_blobs
|
||||
*
|
||||
* Return a count of the number of blobs attached to this one.
|
||||
**********************************************************************/
|
||||
int count_blobs(TBLOB *blobs) {
|
||||
TBLOB *b;
|
||||
int x = 0;
|
||||
|
||||
iterate_blobs (b, blobs) x++;
|
||||
return (x);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* delete_word
|
||||
*
|
||||
* Reclaim the memory taken by this word structure and all of its
|
||||
* lower level structures.
|
||||
**********************************************************************/
|
||||
void delete_word(TWERD *word) {
|
||||
TBLOB *blob;
|
||||
TBLOB *nextblob;
|
||||
TESSLINE *outline;
|
||||
TESSLINE *nextoutline;
|
||||
TESSLINE *child;
|
||||
TESSLINE *nextchild;
|
||||
|
||||
for (blob = word->blobs; blob; blob = nextblob) {
|
||||
nextblob = blob->next;
|
||||
|
||||
for (outline = blob->outlines; outline; outline = nextoutline) {
|
||||
nextoutline = outline->next;
|
||||
|
||||
delete_edgepts (outline->loop);
|
||||
|
||||
for (child = outline->child; child; child = nextchild) {
|
||||
nextchild = child->next;
|
||||
|
||||
delete_edgepts (child->loop);
|
||||
|
||||
oldoutline(child);
|
||||
}
|
||||
oldoutline(outline);
|
||||
}
|
||||
oldblob(blob);
|
||||
}
|
||||
if (word->correct != NULL)
|
||||
strfree (word->correct); /* Reclaim memory */
|
||||
oldword(word);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* delete_edgepts
|
||||
*
|
||||
* Delete a list of EDGEPT structures.
|
||||
**********************************************************************/
|
||||
void delete_edgepts(register EDGEPT *edgepts) {
|
||||
register EDGEPT *this_edge;
|
||||
register EDGEPT *next_edge;
|
||||
|
||||
if (edgepts == NULL)
|
||||
return;
|
||||
|
||||
this_edge = edgepts;
|
||||
do {
|
||||
next_edge = this_edge->next;
|
||||
oldedgept(this_edge);
|
||||
this_edge = next_edge;
|
||||
}
|
||||
while (this_edge != edgepts);
|
||||
}
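
The layout that blobs_widths writes into WIDTH_RECORD::widths (first width, then alternating gap and width for each later blob) can be sketched on plain data. This is an illustrative sketch only: widths_and_gaps is a hypothetical helper, and each blob is assumed to be reduced to the (left, right) x-extent of its bounding box, in reading order.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: each pair is the (left, right) x-extent of one
// blob. Returns width0, gap1, width1, gap2, width2, ... which is the
// order blobs_widths uses, giving 2n - 1 entries for n blobs.
std::vector<int> widths_and_gaps(
    const std::vector<std::pair<int, int> > &extents) {
  std::vector<int> out;
  if (extents.empty())
    return out;  // unlike blobs_widths, tolerate an empty list

  out.push_back(extents[0].second - extents[0].first);  // first width
  int blob_end = extents[0].second;

  for (std::size_t i = 1; i < extents.size(); ++i) {
    out.push_back(extents[i].first - blob_end);          // gap to previous
    out.push_back(extents[i].second - extents[i].first); // width of this one
    blob_end = extents[i].second;
  }
  assert(out.size() == 2 * extents.size() - 1);
  return out;
}
```

With n blobs the result holds 2n - 1 integers; adding the num_chars field gives the 2n integers that blobs_widths requests from memalloc.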
|
119
ccstruct/blobs.h
Normal file
@ -0,0 +1,119 @@
|
||||
/* -*-C-*-
|
||||
********************************************************************************
|
||||
*
|
||||
* File: blobs.h (Formerly blobs.h)
|
||||
* Description: Blob definition
|
||||
* Author: Mark Seaman, OCR Technology
|
||||
* Created: Fri Oct 27 15:39:52 1989
|
||||
* Modified: Thu Mar 28 15:33:38 1991 (Mark Seaman) marks@hpgrlt
|
||||
* Language: C
|
||||
* Package: N/A
|
||||
* Status: Experimental (Do Not Distribute)
|
||||
*
|
||||
* (c) Copyright 1989, Hewlett-Packard Company.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
*********************************************************************************/
|
||||
|
||||
#ifndef BLOBS_H
|
||||
#define BLOBS_H
|
||||
|
||||
/*----------------------------------------------------------------------
|
||||
I n c l u d e s
|
||||
----------------------------------------------------------------------*/
|
||||
#include "vecfuncs.h"
|
||||
#include "tessclas.h"
|
||||
|
||||
/*----------------------------------------------------------------------
|
||||
T y p e s
|
||||
----------------------------------------------------------------------*/
|
||||
typedef struct
|
||||
{ /* Widths of pieces */
|
||||
int num_chars;
|
||||
int widths[1];
|
||||
} WIDTH_RECORD;
|
||||
|
||||
/*----------------------------------------------------------------------
|
||||
M a c r o s
|
||||
----------------------------------------------------------------------*/
|
||||
/**********************************************************************
|
||||
* free_widths
|
||||
*
|
||||
* Free the memory taken up by a width array.
|
||||
**********************************************************************/
|
||||
#define free_widths(w) \
|
||||
if (w) memfree (w)
|
||||
|
||||
/*----------------------------------------------------------------------
|
||||
F u n c t i o n s
|
||||
----------------------------------------------------------------------*/
|
||||
void blob_origin(TBLOB *blob, /*blob to compute on */
|
||||
TPOINT *origin); /*return value */
|
||||
|
||||
/*blob to compute on */
|
||||
void blob_bounding_box(TBLOB *blob,
|
||||
register TPOINT *topleft, /*bounding box */
|
||||
register TPOINT *botright);
|
||||
|
||||
void blobs_bounding_box(TBLOB *blobs, TPOINT *topleft, TPOINT *botright);
|
||||
|
||||
void blobs_origin(TBLOB *blobs, /*blob to compute on */
|
||||
TPOINT *origin); /*return value */
|
||||
|
||||
/*blob to compute on */
|
||||
WIDTH_RECORD *blobs_widths(TBLOB *blobs);
|
||||
|
||||
int count_blobs(TBLOB *blobs);
|
||||
|
||||
void delete_word(TWERD *word);
|
||||
|
||||
void delete_edgepts(register EDGEPT *edgepts);
|
||||
|
||||
/*
|
||||
#if defined(__STDC__) || defined(__cplusplus)
|
||||
# define _ARGS(s) s
|
||||
#else
|
||||
# define _ARGS(s) ()
|
||||
#endif*/
|
||||
|
||||
/* blobs.c
|
||||
void blob_origin
|
||||
_ARGS((BLOB *blob,
|
||||
TPOINT *origin));
|
||||
|
||||
void blob_bounding_box
|
||||
_ARGS((BLOB *blob,
|
||||
TPOINT *topleft,
|
||||
TPOINT *botright));
|
||||
|
||||
void blobs_bounding_box
|
||||
_ARGS((BLOB *blobs,
|
||||
TPOINT *topleft,
|
||||
TPOINT *botright));
|
||||
|
||||
void blobs_origin
|
||||
_ARGS((BLOB *blobs,
|
||||
TPOINT *origin));
|
||||
|
||||
WIDTH_RECORD *blobs_widths
|
||||
_ARGS((BLOB *blobs));
|
||||
|
||||
int count_blobs
|
||||
_ARGS((BLOB *blobs));
|
||||
|
||||
void delete_word
|
||||
_ARGS((TWERD *word));
|
||||
|
||||
void delete_edgepts
|
||||
_ARGS((EDGEPT *edgepts));
|
||||
#undef _ARGS
|
||||
*/
|
||||
#endif
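
The bounding-box routines declared above combine boxes with y growing upwards, so topleft carries the minimum x and maximum y while botright carries the maximum x and minimum y. The sketch below shows that merge on stand-in types; Point, Box and union_of_boxes are hypothetical names, not part of the Tesseract sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { int x; int y; };                 // stand-in for TPOINT
struct Box { Point topleft; Point botright; };  // extremes of one outline

// Merge boxes the way blob_bounding_box merges outline boxes:
// start from the first box and widen each extreme as needed.
Box union_of_boxes(const std::vector<Box> &boxes) {
  Box result = { {0, 0}, {0, 0} };  // default for an empty input
  if (boxes.empty())
    return result;

  result = boxes[0];
  for (std::size_t i = 1; i < boxes.size(); ++i) {
    const Box &b = boxes[i];
    if (b.topleft.x < result.topleft.x)
      result.topleft.x = b.topleft.x;    // further left
    if (b.botright.x > result.botright.x)
      result.botright.x = b.botright.x;  // further right
    if (b.topleft.y > result.topleft.y)
      result.topleft.y = b.topleft.y;    // higher top
    if (b.botright.y < result.botright.y)
      result.botright.y = b.botright.y;  // lower bottom
  }
  assert(result.topleft.x <= result.botright.x);
  return result;
}
```

The centre used by blob_origin then follows as the midpoint of the two corners, ((topleft.x + botright.x) / 2, (topleft.y + botright.y) / 2).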
|
537
ccstruct/blread.cpp
Normal file
@ -0,0 +1,537 @@
|
||||
/**********************************************************************
|
||||
* File: blread.cpp (Formerly pdread.c)
|
||||
* Description: Friend function of BLOCK to read the uscan pd file.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Mar 18 14:39:00 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdlib.h>
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#endif
|
||||
#include "scanutils.h"
|
||||
#include "fileerr.h"
|
||||
#include "imgtiff.h"
|
||||
#include "pdclass.h"
|
||||
#include "rwpoly.h"
|
||||
#include "blread.h"
|
||||
|
||||
#define PD_EXT ".pd"
|
||||
#define VEC_EXT ".vec" //accupage file
|
||||
#define HPD_EXT ".bl" //hand pd file
|
||||
//unlv zone file
|
||||
#define UNLV_EXT ".uzn"
|
||||
#define BLOCK_EXPANSION 8 //boundary expansion
|
||||
#define EXTERN
|
||||
|
||||
EXTERN BOOL_EVAR (ignore_weird_blocks, TRUE, "Don't read weird blocks");
|
||||
|
||||
static BOX convert_vec_block( //make non-rect block
|
||||
VEC_ENTRY *entries, //vectors
|
||||
UINT16 entry_count, //no of entries
|
||||
INT32 ysize, //image size
|
||||
ICOORDELT_IT *left_it, //block sides
|
||||
ICOORDELT_IT *right_it);
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::read_pd_file
|
||||
*
|
||||
* Read a whole pd file to make a list of blocks, or use the whole page.
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 read_pd_file( //read .pd file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
) {
|
||||
FILE *pdfp; //file pointer
|
||||
BLOCK *block; //current block
|
||||
INT32 block_count; //no of blocks
|
||||
INT32 junk_count; //no of junks to read
|
||||
INT32 junks[4]; //junk elements
|
||||
INT32 vertex_count; //boundary vertices
|
||||
INT32 xcoord; //current coords
|
||||
INT32 ycoord;
|
||||
INT32 prevx; //previous coords
|
||||
INT32 prevy;
|
||||
BLOCK_IT block_it = blocks; //block iterator
|
||||
ICOORDELT_LIST dummy; //for constructor
|
||||
ICOORDELT_IT left_it = &dummy; //iterator
|
||||
ICOORDELT_IT right_it = &dummy;//iterator
|
||||
|
||||
if (read_hpd_file (name, xsize, ysize, blocks))
|
||||
return TRUE; //succeeded
|
||||
if (read_vec_file (name, xsize, ysize, blocks))
|
||||
return TRUE; //succeeded
|
||||
if (read_unlv_file (name, xsize, ysize, blocks))
|
||||
return TRUE; //succeeded
|
||||
name += PD_EXT; //add extension
|
||||
if ((pdfp = fopen (name.string (), "r")) == NULL) {
|
||||
//make rect block
|
||||
block = new BLOCK (name.string (), TRUE, 0, 0, 0, 0, xsize, ysize);
|
||||
block_it.add_to_end (block); //on end of list
|
||||
return FALSE; //didn't read one
|
||||
}
|
||||
else {
|
||||
if (fread (&block_count, sizeof (block_count), 1, pdfp) != 1)
|
||||
READFAILED.error ("read_pd_file", EXIT, "Block count");
|
||||
tprintf ("%d blocks in .pd file.\n", block_count);
|
||||
while (block_count > 0) {
|
||||
if (fread (&junk_count, sizeof (junk_count), 1, pdfp) != 1)
|
||||
READFAILED.error ("read_pd_file", EXIT, "Junk count");
|
||||
if (fread (&vertex_count, sizeof (vertex_count), 1, pdfp) != 1)
|
||||
READFAILED.error ("read_pd_file", EXIT, "Vertex count");
|
||||
block = new BLOCK; //make a block
|
||||
//on end of list
|
||||
block_it.add_to_end (block);
|
||||
left_it.set_to_list (&block->leftside);
|
||||
right_it.set_to_list (&block->rightside);
|
||||
|
||||
//read a pair
|
||||
get_pd_vertex (pdfp, xsize, ysize, &block->box, xcoord, ycoord);
|
||||
vertex_count -= 2; //count read ones
|
||||
prevx = xcoord;
|
||||
do {
|
||||
if (xcoord == prevx) {
|
||||
if (!right_it.empty ()) {
|
||||
if (right_it.data ()->x () <= xcoord + BLOCK_EXPANSION)
|
||||
right_it.data ()->set_y (right_it.data ()->y () +
|
||||
BLOCK_EXPANSION);
|
||||
else
|
||||
right_it.data ()->set_y (right_it.data ()->y () -
|
||||
BLOCK_EXPANSION);
|
||||
}
|
||||
right_it.
|
||||
add_before_then_move (new
|
||||
ICOORDELT (xcoord + BLOCK_EXPANSION,
|
||||
ycoord));
|
||||
}
|
||||
prevx = xcoord; //remember previous
|
||||
prevy = ycoord;
|
||||
get_pd_vertex (pdfp, xsize, ysize, &block->box, xcoord, ycoord);
|
||||
vertex_count -= 2; //count read ones
|
||||
}
|
||||
while (ycoord <= prevy);
|
||||
right_it.data ()->set_y (right_it.data ()->y () - BLOCK_EXPANSION);
|
||||
|
||||
//start of left
|
||||
left_it.add_to_end (new ICOORDELT (prevx - BLOCK_EXPANSION, prevy - BLOCK_EXPANSION));
|
||||
|
||||
do {
|
||||
prevx = xcoord; //remember previous
|
||||
get_pd_vertex (pdfp, xsize, ysize, &block->box, xcoord, ycoord);
|
||||
vertex_count -= 2;
|
||||
if (xcoord != prevx && vertex_count > 0) {
|
||||
if (xcoord > prevx)
|
||||
left_it.
|
||||
add_to_end (new
|
||||
ICOORDELT (xcoord - BLOCK_EXPANSION,
|
||||
ycoord + BLOCK_EXPANSION));
|
||||
else
|
||||
left_it.
|
||||
add_to_end (new
|
||||
ICOORDELT (xcoord - BLOCK_EXPANSION,
|
||||
ycoord - BLOCK_EXPANSION));
|
||||
}
|
||||
else if (vertex_count == 0)
|
||||
left_it.add_to_end (new ICOORDELT (prevx - BLOCK_EXPANSION,
|
||||
ycoord + BLOCK_EXPANSION));
|
||||
}
|
||||
while (vertex_count > 0); //until all read
|
||||
|
||||
while (junk_count > 0) {
|
||||
if (fread (junks, sizeof (INT32), 4, pdfp) != 4)
|
||||
READFAILED.error ("read_pd_file", EXIT, "Junk coords");
|
||||
junk_count--;
|
||||
}
|
||||
block_count--; //count read blocks
|
||||
}
|
||||
}
|
||||
fclose(pdfp);
|
||||
return TRUE; //read one
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* get_pd_vertex
|
||||
*
|
||||
* Read a pair of coords, invert the y and clip to image limits.
|
||||
* Also update the bounding box.
|
||||
*
|
||||
* Read a whole pd file to make a list of blocks, or use the whole page.
|
||||
**********************************************************************/
|
||||
|
||||
void get_pd_vertex( //get new vertex
|
||||
FILE *pdfp, //file to read
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BOX *box, //bounding box
|
||||
INT32 &xcoord, //output coords
|
||||
INT32 &ycoord) {
|
||||
BOX new_coord; //expansion box
|
||||
|
||||
//get new coords
|
||||
if (fread (&xcoord, sizeof (xcoord), 1, pdfp) != 1)
|
||||
READFAILED.error ("read_pd_file", EXIT, "Xcoord");
|
||||
if (fread (&ycoord, sizeof (ycoord), 1, pdfp) != 1)
|
||||
READFAILED.error ("read_pd_file", EXIT, "Ycoord");
|
||||
ycoord = ysize - ycoord; //invert y
|
||||
if (xcoord < BLOCK_EXPANSION)
|
||||
xcoord = BLOCK_EXPANSION; //clip to limits
|
||||
if (xcoord > xsize - BLOCK_EXPANSION)
|
||||
xcoord = xsize - BLOCK_EXPANSION;
|
||||
if (ycoord < BLOCK_EXPANSION)
|
||||
ycoord = BLOCK_EXPANSION;
|
||||
if (ycoord > ysize - BLOCK_EXPANSION)
|
||||
ycoord = ysize - BLOCK_EXPANSION;
|
||||
|
||||
new_coord =
|
||||
BOX (ICOORD (xcoord - BLOCK_EXPANSION, ycoord - BLOCK_EXPANSION),
|
||||
ICOORD (xcoord + BLOCK_EXPANSION, ycoord + BLOCK_EXPANSION));
|
||||
(*box) += new_coord;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::read_hpd_file
|
||||
*
|
||||
* Read a whole hpd file to make a list of blocks.
|
||||
* Return FALSE if the .bl file cannot be found
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 read_hpd_file( //read .bl file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
) {
|
||||
FILE *pdfp; //file pointer
|
||||
PAGE_BLOCK_LIST *page_blocks;
|
||||
INT32 block_no; //no of blocks
|
||||
BLOCK_IT block_it = blocks; //block iterator
|
||||
|
||||
name += HPD_EXT; //add extension
|
||||
if ((pdfp = fopen (name.string (), "r")) == NULL) {
|
||||
return FALSE; //can't find it
|
||||
}
|
||||
fclose(pdfp);
|
||||
page_blocks = read_poly_blocks (name.string ());
|
||||
block_no = 0;
|
||||
scan_hpd_blocks (name.string (), page_blocks, block_no, &block_it);
|
||||
tprintf ("Text region count=%d\n", block_no);
|
||||
return TRUE; //read one
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::scan_hpd_blocks
|
||||
*
|
||||
* Scan the page blocks read from an hpd file and add a BLOCK for each
|
||||
* text region, recursing into child blocks. block_no counts the blocks added.
|
||||
**********************************************************************/
|
||||
|
||||
void scan_hpd_blocks( //add hpd blocks
|
||||
const char *name, //block label
|
||||
PAGE_BLOCK_LIST *page_blocks, //head of full page
|
||||
INT32 &block_no, //no of blocks
|
||||
BLOCK_IT *block_it //block iterator
|
||||
) {
|
||||
BLOCK *block; //current block
|
||||
//page blocks
|
||||
PAGE_BLOCK_IT pb_it = page_blocks;
|
||||
PAGE_BLOCK *current_block;
|
||||
TEXT_REGION_IT tr_it;
|
||||
TEXT_BLOCK *tb;
|
||||
TEXT_REGION *tr;
|
||||
BOX *block_box; //from text region
|
||||
|
||||
for (pb_it.mark_cycle_pt (); !pb_it.cycled_list (); pb_it.forward ()) {
|
||||
current_block = pb_it.data ();
|
||||
if (current_block->type () == PB_TEXT) {
|
||||
tb = (TEXT_BLOCK *) current_block;
|
||||
if (!tb->regions ()->empty ()) {
|
||||
tr_it.set_to_list (tb->regions ());
|
||||
for (tr_it.mark_cycle_pt ();
|
||||
!tr_it.cycled_list (); tr_it.forward ()) {
|
||||
block_no++;
|
||||
tr = tr_it.data ();
|
||||
block_box = tr->bounding_box ();
|
||||
block = new BLOCK (name, TRUE, 0, 0,
|
||||
block_box->left (), block_box->bottom (),
|
||||
block_box->right (), block_box->top ());
|
||||
block->hand_block = tr;
|
||||
block->hand_poly = tr;
|
||||
block_it->add_after_then_move (block);
|
||||
}
|
||||
}
|
||||
}
|
||||
else if (current_block->type () == PB_WEIRD
|
||||
&& !ignore_weird_blocks
|
||||
&& ((WEIRD_BLOCK *) current_block)->id_no () > 0) {
|
||||
block_no++;
|
||||
block_box = current_block->bounding_box ();
|
||||
block = new BLOCK (name, TRUE, 0, 0,
|
||||
block_box->left (), block_box->bottom (),
|
||||
block_box->right (), block_box->top ());
|
||||
block->hand_block = NULL;
|
||||
block->hand_poly = current_block;
|
||||
block_it->add_after_then_move (block);
|
||||
}
|
||||
if (!current_block->child ()->empty ())
|
||||
scan_hpd_blocks (name, current_block->child (), block_no, block_it);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::read_vec_file
|
||||
*
|
||||
* Read a whole vec file to make a list of blocks.
|
||||
* Return FALSE if the .vec file cannot be found
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 read_vec_file( //read a vec file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
) {
|
||||
FILE *pdfp; //file pointer
|
||||
BLOCK *block; //current block
|
||||
INT32 block_no; //no of blocks
|
||||
INT32 block_index; //current blocks
|
||||
INT32 vector_count; //total vectors
|
||||
VEC_HEADER header; //file header
|
||||
BLOCK_HEADER *vec_blocks; //blocks from file
|
||||
VEC_ENTRY *vec_entries; //vectors from file
|
||||
BLOCK_IT block_it = blocks; //block iterator
|
||||
ICOORDELT_IT left_it; //iterators
|
||||
ICOORDELT_IT right_it;
|
||||
|
||||
name += VEC_EXT; //add extension
|
||||
if ((pdfp = fopen (name.string (), "r")) == NULL) {
|
||||
return FALSE; //can't find it
|
||||
}
|
||||
if (fread (&header, sizeof (header), 1, pdfp) != 1)
|
||||
READFAILED.error ("read_vec_file", EXIT, "Header");
|
||||
//from intel
|
||||
header.filesize = reverse32 (header.filesize);
|
||||
header.bytesize = reverse16 (header.bytesize);
|
||||
header.arraysize = reverse16 (header.arraysize);
|
||||
header.width = reverse16 (header.width);
|
||||
header.height = reverse16 (header.height);
|
||||
header.res = reverse16 (header.res);
|
||||
header.bpp = reverse16 (header.bpp);
|
||||
tprintf ("%d blocks in %s file:", header.arraysize, VEC_EXT);
|
||||
vector_count = header.filesize - header.arraysize * sizeof (BLOCK_HEADER);
|
||||
vector_count /= sizeof (VEC_ENTRY);
|
||||
vec_blocks =
|
||||
(BLOCK_HEADER *) alloc_mem (header.arraysize * sizeof (BLOCK_HEADER));
|
||||
vec_entries = (VEC_ENTRY *) alloc_mem (vector_count * sizeof (VEC_ENTRY));
|
||||
xsize = header.width; //real image size
|
||||
ysize = header.height;
|
||||
if (fread (vec_blocks, sizeof (BLOCK_HEADER), header.arraysize, pdfp)
|
||||
!= static_cast<size_t>(header.arraysize))
|
||||
READFAILED.error ("read_vec_file", EXIT, "Blocks");
|
||||
if (fread (vec_entries, sizeof (VEC_ENTRY), vector_count, pdfp)
|
||||
!= static_cast<size_t>(vector_count))
|
||||
READFAILED.error ("read_vec_file", EXIT, "Vectors");
|
||||
for (block_index = 0; block_index < header.arraysize; block_index++) {
|
||||
vec_blocks[block_index].offset =
|
||||
reverse16 (vec_blocks[block_index].offset);
|
||||
vec_blocks[block_index].order =
|
||||
reverse16 (vec_blocks[block_index].order);
|
||||
vec_blocks[block_index].entries =
|
||||
reverse16 (vec_blocks[block_index].entries);
|
||||
vec_blocks[block_index].charsize =
|
||||
reverse16 (vec_blocks[block_index].charsize);
|
||||
}
|
||||
for (block_index = 0; block_index < vector_count; block_index++) {
|
||||
vec_entries[block_index].start =
|
||||
ICOORD (reverse16 (vec_entries[block_index].start.x ()),
|
||||
reverse16 (vec_entries[block_index].start.y ()));
|
||||
vec_entries[block_index].end =
|
||||
ICOORD (reverse16 (vec_entries[block_index].end.x ()),
|
||||
reverse16 (vec_entries[block_index].end.y ()));
|
||||
}
|
||||
for (block_no = 1; block_no <= header.arraysize; block_no++) {
|
||||
for (block_index = 0; block_index < header.arraysize; block_index++) {
|
||||
if (vec_blocks[block_index].order == block_no
|
||||
&& vec_blocks[block_index].valid) {
|
||||
block = new BLOCK;
|
||||
left_it.set_to_list (&block->leftside);
|
||||
right_it.set_to_list (&block->rightside);
|
||||
block->box =
|
||||
convert_vec_block (&vec_entries
|
||||
[vec_blocks[block_index].offset],
|
||||
vec_blocks[block_index].entries, ysize,
|
||||
&left_it, &right_it);
|
||||
block->set_xheight (vec_blocks[block_index].charsize);
|
||||
//on end of list
|
||||
block_it.add_to_end (block);
|
||||
// tprintf("Block at (%d,%d)->(%d,%d) has index %d and order %d\n",
|
||||
// block->box.left(),
|
||||
// block->box.bottom(),
|
||||
// block->box.right(),
|
||||
// block->box.top(),
|
||||
// block_index,vec_blocks[block_index].order);
|
||||
}
|
||||
}
|
||||
}
|
||||
free_mem(vec_blocks);
|
||||
free_mem(vec_entries);
|
||||
tprintf ("%d valid\n", block_it.length ());
|
||||
fclose(pdfp);
|
||||
return TRUE; //read one
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::convert_vec_block
|
||||
*
|
||||
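* Convert the vectors of one block from a vec file into left and right
* side point lists. Flips y to bottom-up coordinates in place and returns
* the bounding box of the block expanded by BLOCK_EXPANSION.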
|
||||
**********************************************************************/
|
||||
|
||||
static BOX convert_vec_block( //make non-rect block
|
||||
VEC_ENTRY *entries, //vectors
|
||||
UINT16 entry_count, //no of entries
|
||||
INT32 ysize, //image size
|
||||
ICOORDELT_IT *left_it, //block sides
|
||||
ICOORDELT_IT *right_it) {
|
||||
BOX block_box; //bounding box
|
||||
BOX vec_box; //box of vec
|
||||
ICOORD box_point; //expanded coord
|
||||
ICOORD shift_vec; //for box expansion
|
||||
ICOORD prev_pt; //previous coord
|
||||
ICOORD end_pt; //end of vector
|
||||
INT32 vertex_index; //boundary vertices
|
||||
|
||||
for (vertex_index = 0; vertex_index < entry_count; vertex_index++) {
|
||||
entries[vertex_index].start = ICOORD (entries[vertex_index].start.x (),
|
||||
ysize - 1 -
|
||||
entries[vertex_index].start.y ());
|
||||
entries[vertex_index].end =
|
||||
ICOORD (entries[vertex_index].end.x (),
|
||||
ysize - 1 - entries[vertex_index].end.y ());
|
||||
vec_box = BOX (entries[vertex_index].start, entries[vertex_index].end);
|
||||
block_box += vec_box; //find total bounds
|
||||
}
|
||||
|
||||
for (vertex_index = 0; vertex_index < entry_count
|
||||
&& (entries[vertex_index].start.y () != block_box.bottom ()
|
||||
|| entries[vertex_index].end.y () != block_box.bottom ());
|
||||
vertex_index++);
|
||||
ASSERT_HOST (vertex_index < entry_count);
|
||||
prev_pt = entries[vertex_index].start;
|
||||
end_pt = entries[vertex_index].end;
|
||||
do {
|
||||
for (vertex_index = 0; vertex_index < entry_count
|
||||
&& entries[vertex_index].start != end_pt; vertex_index++);
|
||||
//found start of vertical
|
||||
ASSERT_HOST (vertex_index < entry_count);
|
||||
box_point = entries[vertex_index].start;
|
||||
if (box_point.x () <= prev_pt.x ())
|
||||
shift_vec = ICOORD (-BLOCK_EXPANSION, -BLOCK_EXPANSION);
|
||||
else
|
||||
shift_vec = ICOORD (-BLOCK_EXPANSION, BLOCK_EXPANSION);
|
||||
left_it->add_to_end (new ICOORDELT (box_point + shift_vec));
|
||||
prev_pt = box_point;
|
||||
for (vertex_index = 0; vertex_index < entry_count
|
||||
&& entries[vertex_index].start != end_pt; vertex_index++);
|
||||
//found horizontal
|
||||
ASSERT_HOST (vertex_index < entry_count);
|
||||
end_pt = entries[vertex_index].end;
|
||||
}
|
||||
while (end_pt.y () < block_box.top ());
|
||||
shift_vec = ICOORD (-BLOCK_EXPANSION, BLOCK_EXPANSION);
|
||||
left_it->add_to_end (new ICOORDELT (end_pt + shift_vec));
|
||||
|
||||
for (vertex_index = 0; vertex_index < entry_count
|
||||
&& (entries[vertex_index].start.y () != block_box.top ()
|
||||
|| entries[vertex_index].end.y () != block_box.top ());
|
||||
vertex_index++);
|
||||
ASSERT_HOST (vertex_index < entry_count);
|
||||
prev_pt = entries[vertex_index].start;
|
||||
end_pt = entries[vertex_index].end;
|
||||
do {
|
||||
for (vertex_index = 0; vertex_index < entry_count
|
||||
&& entries[vertex_index].start != end_pt; vertex_index++);
|
||||
//found start of vertical
|
||||
ASSERT_HOST (vertex_index < entry_count);
|
||||
box_point = entries[vertex_index].start;
|
||||
if (box_point.x () < prev_pt.x ())
|
||||
shift_vec = ICOORD (BLOCK_EXPANSION, -BLOCK_EXPANSION);
|
||||
else
|
||||
shift_vec = ICOORD (BLOCK_EXPANSION, BLOCK_EXPANSION);
|
||||
right_it->add_before_then_move (new ICOORDELT (box_point + shift_vec));
|
||||
prev_pt = box_point;
|
||||
for (vertex_index = 0; vertex_index < entry_count
|
||||
&& entries[vertex_index].start != end_pt; vertex_index++);
|
||||
//found horizontal
|
||||
ASSERT_HOST (vertex_index < entry_count);
|
||||
end_pt = entries[vertex_index].end;
|
||||
}
|
||||
while (end_pt.y () > block_box.bottom ());
|
||||
shift_vec = ICOORD (BLOCK_EXPANSION, -BLOCK_EXPANSION);
|
||||
right_it->add_before_then_move (new ICOORDELT (end_pt + shift_vec));
|
||||
|
||||
shift_vec = ICOORD (BLOCK_EXPANSION, BLOCK_EXPANSION);
|
||||
box_point = block_box.botleft () - shift_vec;
|
||||
end_pt = block_box.topright () + shift_vec;
|
||||
return BOX (box_point, end_pt);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* read_unlv_file
|
||||
*
|
||||
* Read a whole unlv zone file to make a list of blocks.
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8 read_unlv_file( //read a unlv zone file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
) {
|
||||
FILE *pdfp; //file pointer
|
||||
BLOCK *block; //current block
|
||||
int x; //current top-down coords
|
||||
int y;
|
||||
int width; //of current block
|
||||
int height;
|
||||
BLOCK_IT block_it = blocks; //block iterator
|
||||
|
||||
name += UNLV_EXT; //add extension
|
||||
if ((pdfp = fopen (name.string (), "r")) == NULL) {
|
||||
return FALSE; //didn't read one
|
||||
}
|
||||
else {
|
||||
while (fscanf (pdfp, "%d %d %d %d %*s", &x, &y, &width, &height) >= 4) {
|
||||
//make rect block
|
||||
block = new BLOCK (name.string (), TRUE, 0, 0, (INT16) x, (INT16) (ysize - 1 - y - height), (INT16) (x + width), (INT16) (ysize - 1 - y));
|
||||
//on end of list
|
||||
block_it.add_to_end (block);
|
||||
}
|
||||
fclose(pdfp);
|
||||
}
|
||||
return TRUE; //read one
|
||||
}
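The coordinate flip in read_unlv_file can be looked at in isolation. The sketch below is only an illustration: Rect and unlv_zone_to_rect are hypothetical names standing in for BOX and the BLOCK constructor call, and it assumes the zone file gives top-down x, y, width, height as the fscanf format above suggests.

```cpp
#include <cassert>

// Hypothetical stand-in for Tesseract's BOX: a bottom-up rectangle.
struct Rect {
  int left, bottom, right, top;
};

// Convert one UNLV zone (top-down x, y, width, height) into bottom-up
// coordinates for an image ysize pixels high. Same arithmetic as the
// BLOCK constructor arguments in read_unlv_file.
Rect unlv_zone_to_rect(int x, int y, int width, int height, int ysize) {
  assert(ysize > 0);  // assumption: caller passes the real image height
  Rect r;
  r.left = x;
  r.bottom = ysize - 1 - y - height;  // lower edge after the flip
  r.right = x + width;
  r.top = ysize - 1 - y;              // upper edge after the flip
  return r;
}
```

A zone at the top of the page (small y) therefore ends up with a large top value, which is what the bottom-up BLOCK coordinates expect.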
|
63
ccstruct/blread.h
Normal file
@ -0,0 +1,63 @@
|
||||
/**********************************************************************
|
||||
* File: blread.h (Formerly pdread.h)
|
||||
* Description: Friend function of BLOCK to read the uscan pd file.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Mar 18 14:39:00 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef BLREAD_H
|
||||
#define BLREAD_H
|
||||
|
||||
#include "varable.h"
|
||||
#include "ocrblock.h"
|
||||
|
||||
BOOL8 read_pd_file( //read a pd file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
);
|
||||
void get_pd_vertex( //get new vertex
|
||||
FILE *pdfp, //file to read
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BOX *box, //bounding box
|
||||
INT32 &xcoord, //output coords
|
||||
INT32 &ycoord);
|
||||
BOOL8 read_hpd_file( //read an hpd file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
);
|
||||
void scan_hpd_blocks( //collect blocks from page tree
|
||||
const char *name, //block label
|
||||
PAGE_BLOCK_LIST *page_blocks, //head of full page
|
||||
INT32 &block_no, //no of blocks
|
||||
BLOCK_IT *block_it //block iterator
|
||||
);
|
||||
BOOL8 read_vec_file( //read a vec file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
);
|
||||
BOOL8 read_unlv_file( //read a unlv zone file
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //image size
|
||||
INT32 ysize, //image size
|
||||
BLOCK_LIST *blocks //output list
|
||||
);
|
||||
#endif
|
270
ccstruct/callcpp.cpp
Normal file
@ -0,0 +1,270 @@
|
||||
/**********************************************************************
|
||||
* File: callcpp.cpp
|
||||
* Description: extern C interface calling C++ from C.
|
||||
* Author: Ray Smith
|
||||
* Created: Sun Feb 04 20:39:23 MST 1996
|
||||
*
|
||||
* (C) Copyright 1996, Hewlett-Packard Co.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "errcode.h"
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#include <stdarg.h>
|
||||
#endif
|
||||
#include <time.h>
|
||||
#include "memry.h"
|
||||
#include "grphics.h"
|
||||
#include "evnts.h"
|
||||
#include "varable.h"
|
||||
#include "callcpp.h"
|
||||
#include "tprintf.h"
|
||||
//#include "strace.h"
|
||||
#include "host.h"
|
||||
|
||||
//extern "C" {
|
||||
|
||||
INT_VAR (tess_cp_mapping0, 0, "Mappings for class pruner distance");
|
||||
INT_VAR (tess_cp_mapping1, 1, "Mappings for class pruner distance");
|
||||
INT_VAR (tess_cp_mapping2, 2, "Mappings for class pruner distance");
|
||||
INT_VAR (tess_cp_mapping3, 3, "Mappings for class pruner distance");
|
||||
INT_VAR (stopper_numbers_on, 0, "Allow numbers to be acceptable choices");
|
||||
INT_VAR (config_pruner_enabled, 0, "Turn on config pruner");
|
||||
INT_VAR (feature_prune_percentile, 0, "Percent of features to use");
|
||||
INT_VAR (newcp_ratings_on, 0, "Use new class pruner normalisation");
|
||||
INT_VAR (record_matcher_output, 0, "Record detailed matcher info");
|
||||
INT_VAR (il1_adaption_test, 0, "Dont adapt to i/I at beginning of word");
|
||||
double_VAR (permuter_pending_threshold, 0.0,
|
||||
"Worst conf for using pending dictionary");
|
||||
double_VAR (newcp_duff_rating, 0.30, "Worst rating for calling real matcher");
|
||||
double_VAR (newcp_prune_threshold, 1.2, "Ratio of best to prune");
|
||||
double_VAR (tessedit_cp_ratio, 0.0, "Ratio from best to prune");
|
||||
//Global matcher info from the class pruner.
|
||||
INT32 cp_classes;
|
||||
INT32 cp_bestindex;
|
||||
INT32 cp_bestrating;
|
||||
INT32 cp_bestconf;
|
||||
char cp_chars[2];
|
||||
INT32 cp_ratings[2];
|
||||
INT32 cp_confs[2];
|
||||
INT32 cp_maps[4];
|
||||
//Global info to control writes of matcher info
|
||||
INT32 blob_type; //write control
|
||||
char blob_answer; //correct char
|
||||
char *word_answer; //correct word
|
||||
INT32 matcher_pass; //pass in chopper.c
|
||||
INT32 bits_in_states; //no of bits in states
|
||||
|
||||
#ifndef __UNIX__
|
||||
/**********************************************************************
|
||||
* assert
|
||||
*
|
||||
* A version of assert for C on NT.
|
||||
**********************************************************************/
|
||||
|
||||
void assert( //assert for C on NT
|
||||
int testing //assert fail if false
|
||||
) {
|
||||
ASSERT_HOST(testing);
|
||||
}
|
||||
#endif
|
||||
|
||||
void setup_cp_maps() {
|
||||
cp_maps[0] = tess_cp_mapping0;
|
||||
cp_maps[1] = tess_cp_mapping1;
|
||||
cp_maps[2] = tess_cp_mapping2;
|
||||
cp_maps[3] = tess_cp_mapping3;
|
||||
}
|
||||
|
||||
|
||||
void trace_stack() { //Trace current stack
|
||||
}
|
||||
|
||||
|
||||
void
|
||||
cprintf ( //Trace printf
|
||||
const char *format, ... //special message
|
||||
) {
|
||||
va_list args; //variable args
|
||||
char msg[1000];
|
||||
|
||||
va_start(args, format); //variable list
|
||||
vsnprintf(msg, sizeof(msg), format, args); //Format into msg, bounded
|
||||
va_end(args);
|
||||
|
||||
tprintf ("%s", msg);
|
||||
}
|
||||
|
||||
|
||||
char *c_alloc_string( //allocate string
|
||||
INT32 count //no of chars required
|
||||
) {
|
||||
return alloc_string (count);
|
||||
}
|
||||
|
||||
|
||||
void c_free_string( //free a string
|
||||
char *string //string to free
|
||||
) {
|
||||
free_string(string);
|
||||
}
|
||||
|
||||
|
||||
void *c_alloc_struct( //allocate memory
|
||||
INT32 count, //no of chars required
|
||||
const char *name //class name
|
||||
) {
|
||||
return alloc_struct (count, name);
|
||||
}
|
||||
|
||||
|
||||
void c_free_struct( //free a structure
|
||||
void *deadstruct, //structure to free
|
||||
INT32 count, //no of bytes
|
||||
const char *name //class name
|
||||
) {
|
||||
free_struct(deadstruct, count, name);
|
||||
}
|
||||
|
||||
|
||||
void *c_alloc_mem_p( //allocate permanent space
|
||||
INT32 count //block size to allocate
|
||||
) {
|
||||
return alloc_mem_p (count);
|
||||
}
|
||||
|
||||
|
||||
void *c_alloc_mem( //get some memory
|
||||
INT32 count //no of bytes to get
|
||||
) {
|
||||
return alloc_mem (count);
|
||||
}
|
||||
|
||||
|
||||
void c_free_mem( //free mem from alloc_mem
|
||||
void *oldchunk //chunk to free
|
||||
) {
|
||||
free_mem(oldchunk);
|
||||
}
|
||||
|
||||
|
||||
void c_check_mem( //check consistency
|
||||
const char *string, //context message
|
||||
INT8 level //level of check
|
||||
) {
|
||||
check_mem(string, level);
|
||||
}
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void *c_create_window( /*create a window */
|
||||
const char *name, /*name/title of window */
|
||||
INT16 xpos, /*coords of window */
|
||||
INT16 ypos, /*coords of window */
|
||||
INT16 xsize, /*size of window */
|
||||
INT16 ysize, /*size of window */
|
||||
double xmin, /*scrolling limits */
|
||||
double xmax, /*to stop users */
|
||||
double ymin, /*getting lost in */
|
||||
double ymax /*empty space */
|
||||
) {
|
||||
return create_window (name, SCROLLINGWIN, xpos, ypos, xsize, ysize,
|
||||
xmin, xmax, ymin, ymax, TRUE, FALSE, FALSE, TRUE);
|
||||
}
|
||||
|
||||
|
||||
void c_line_color_index( /*set color */
|
||||
void *win,
|
||||
C_COL index) {
|
||||
WINDOW window = (WINDOW) win;
|
||||
|
||||
// ASSERT_HOST(index>=0 && index<=48);
|
||||
if (index < 0 || index > 48)
|
||||
index = (C_COL) 1;
|
||||
window->Line_color_index ((COLOUR) index);
|
||||
}
|
||||
|
||||
|
||||
void c_move( /*move pen */
|
||||
void *win,
|
||||
double x,
|
||||
double y) {
|
||||
WINDOW window = (WINDOW) win;
|
||||
|
||||
window->Move2d (x, y);
|
||||
}
|
||||
|
||||
|
||||
void c_draw( /*draw line to point */
|
||||
void *win,
|
||||
double x,
|
||||
double y) {
|
||||
WINDOW window = (WINDOW) win;
|
||||
|
||||
window->Draw2d (x, y);
|
||||
}
|
||||
|
||||
|
||||
void c_make_current( /*make picture current */
|
||||
void *win) {
|
||||
WINDOW window = (WINDOW) win;
|
||||
|
||||
window->Make_picture_current ();
|
||||
}
|
||||
|
||||
|
||||
void c_clear_window( /*clear the window */
|
||||
void *win) {
|
||||
WINDOW window = (WINDOW) win;
|
||||
|
||||
window->Clear_view_surface ();
|
||||
}
|
||||
|
||||
|
||||
char window_wait( /*wait for an event */
|
||||
void *win) {
|
||||
WINDOW window = (WINDOW) win;
|
||||
GRAPHICS_EVENT event;
|
||||
|
||||
await_event(window, TRUE, ANY_EVENT, &event);
|
||||
if (event.type == KEYPRESS_EVENT)
|
||||
return event.key;
|
||||
else
|
||||
return '\0';
|
||||
}
|
||||
#endif
|
||||
|
||||
void reverse32(void *ptr) {
|
||||
char tmp;
|
||||
char *cptr = (char *) ptr;
|
||||
|
||||
tmp = *cptr;
|
||||
*cptr = *(cptr + 3);
|
||||
*(cptr + 3) = tmp;
|
||||
tmp = *(cptr + 1);
|
||||
*(cptr + 1) = *(cptr + 2);
|
||||
*(cptr + 2) = tmp;
|
||||
}
|
||||
|
||||
|
||||
void reverse16(void *ptr) {
|
||||
char tmp;
|
||||
char *cptr = (char *) ptr;
|
||||
|
||||
tmp = *cptr;
|
||||
*cptr = *(cptr + 1);
|
||||
*(cptr + 1) = tmp;
|
||||
}
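reverse32 and reverse16 above swap bytes in place through a pointer, while the call sites in read_vec_file assign a returned value. A value-returning form of the same byte swap could look like the sketch below; swap16 and swap32 are hypothetical names and are not part of this source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical value-returning byte swap for a 16 bit quantity.
inline uint16_t swap16(uint16_t v) {
  return static_cast<uint16_t>((v << 8) | (v >> 8));
}

// Hypothetical value-returning byte swap for a 32 bit quantity:
// byte 0 <-> byte 3 and byte 1 <-> byte 2, as in reverse32.
inline uint32_t swap32(uint32_t v) {
  return ((v & 0x000000FFu) << 24) |
         ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) |
         ((v & 0xFF000000u) >> 24);
}
```

Applying either function twice gives back the original value, which is a convenient sanity check when converting file headers between byte orders.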
|
||||
|
||||
|
||||
//};
|
604
ccstruct/coutln.cpp
Normal file
@ -0,0 +1,604 @@
|
||||
/**********************************************************************
|
||||
* File: coutln.c (Formerly coutline.c)
|
||||
* Description: Code for the C_OUTLINE class.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Oct 07 16:01:57 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <string.h>
|
||||
#ifdef __UNIX__
|
||||
#include <assert.h>
|
||||
#endif
|
||||
#include "coutln.h"
|
||||
|
||||
ELISTIZE_S (C_OUTLINE)
|
||||
ICOORD C_OUTLINE::step_coords[4] = {
|
||||
ICOORD (-1, 0), ICOORD (0, -1), ICOORD (1, 0), ICOORD (0, 1)
|
||||
};
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::C_OUTLINE
|
||||
*
|
||||
* Constructor to build a C_OUTLINE from a CRACKEDGE LOOP.
|
||||
**********************************************************************/
|
||||
|
||||
C_OUTLINE::C_OUTLINE (
|
||||
//constructor
|
||||
CRACKEDGE * startpt, //outline to convert
|
||||
ICOORD bot_left, //bounding box
|
||||
ICOORD top_right, INT16 length //length of loop
|
||||
):box (bot_left, top_right), start (startpt->pos) {
|
||||
INT16 stepindex; //index to step
|
||||
CRACKEDGE *edgept; //current point
|
||||
|
||||
stepcount = length; //no of steps
|
||||
//get memory
|
||||
steps = (UINT8 *) alloc_mem (step_mem());
|
||||
memset(steps, 0, step_mem());
|
||||
edgept = startpt;
|
||||
|
||||
for (stepindex = 0; stepindex < length; stepindex++) {
|
||||
//set compact step
|
||||
set_step (stepindex, edgept->stepdir);
|
||||
edgept = edgept->next;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::C_OUTLINE
|
||||
*
|
||||
* Constructor to build a C_OUTLINE from a C_OUTLINE_FRAG.
|
||||
**********************************************************************/
|
||||
C_OUTLINE::C_OUTLINE (
|
||||
//constructor
|
||||
//steps to copy
|
||||
ICOORD startpt, DIR128 * new_steps,
|
||||
INT16 length //length of loop
|
||||
):start (startpt) {
|
||||
INT8 dirdiff; //direction difference
|
||||
DIR128 prevdir; //previous direction
|
||||
DIR128 dir; //current direction
|
||||
DIR128 lastdir; //dir of last step
|
||||
BOX new_box; //easy bounding
|
||||
INT16 stepindex; //index to step
|
||||
INT16 srcindex; //source steps
|
||||
ICOORD pos; //current position
|
||||
|
||||
pos = startpt;
|
||||
stepcount = length; //no of steps
|
||||
//get memory
|
||||
steps = (UINT8 *) alloc_mem (step_mem());
|
||||
memset(steps, 0, step_mem());
|
||||
|
||||
lastdir = new_steps[length - 1];
|
||||
prevdir = lastdir;
|
||||
for (stepindex = 0, srcindex = 0; srcindex < length;
|
||||
stepindex++, srcindex++) {
|
||||
new_box = BOX (pos, pos);
|
||||
box += new_box;
|
||||
//copy steps
|
||||
dir = new_steps[srcindex];
|
||||
set_step(stepindex, dir);
|
||||
dirdiff = dir - prevdir;
|
||||
pos += step (stepindex);
|
||||
if ((dirdiff == 64 || dirdiff == -64) && stepindex > 0) {
|
||||
stepindex -= 2; //cancel there-and-back
|
||||
prevdir = stepindex >= 0 ? step_dir (stepindex) : lastdir;
|
||||
}
|
||||
else
|
||||
prevdir = dir;
|
||||
}
|
||||
ASSERT_HOST (pos.x () == startpt.x () && pos.y () == startpt.y ());
|
||||
do {
|
||||
dirdiff = step_dir (stepindex - 1) - step_dir (0);
|
||||
if (dirdiff == 64 || dirdiff == -64) {
|
||||
start += step (0);
|
||||
stepindex -= 2; //cancel there-and-back
|
||||
for (int i = 0; i < stepindex; ++i)
|
||||
set_step(i, step_dir(i + 1));
|
||||
}
|
||||
}
|
||||
while (stepindex > 1 && (dirdiff == 64 || dirdiff == -64));
|
||||
stepcount = stepindex;
|
||||
ASSERT_HOST (stepcount >= 4);
|
||||
}
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::C_OUTLINE
|
||||
*
|
||||
* Constructor to build a C_OUTLINE from a rotation of a C_OUTLINE.
|
||||
**********************************************************************/
|
||||
|
||||
C_OUTLINE::C_OUTLINE( //constructor
|
||||
C_OUTLINE *srcline, //outline to
|
||||
FCOORD rotation //rotate
|
||||
) {
|
||||
BOX new_box; //easy bounding
|
||||
INT16 stepindex; //index to step
|
||||
INT16 dirdiff; //direction change
|
||||
ICOORD pos; //current position
|
||||
ICOORD prevpos; //previous dest point
|
||||
|
||||
ICOORD destpos; //destination point
|
||||
INT16 destindex; //index to step
|
||||
DIR128 dir; //coded direction
|
||||
UINT8 new_step;
|
||||
|
||||
stepcount = srcline->stepcount * 2;
|
||||
//get memory
|
||||
steps = (UINT8 *) alloc_mem (step_mem());
|
||||
memset(steps, 0, step_mem());
|
||||
|
||||
for (int iteration = 0; iteration < 2; ++iteration) {
|
||||
DIR128 round1 = iteration == 0 ? 32 : 0;
|
||||
DIR128 round2 = iteration != 0 ? 32 : 0;
|
||||
pos = srcline->start;
|
||||
prevpos = pos;
|
||||
prevpos.rotate (rotation);
|
||||
start = prevpos;
|
||||
box = BOX (start, start);
|
||||
destindex = 0;
|
||||
for (stepindex = 0; stepindex < srcline->stepcount; stepindex++) {
|
||||
pos += srcline->step (stepindex);
|
||||
destpos = pos;
|
||||
destpos.rotate (rotation);
|
||||
if (destpos.x () != prevpos.x () || destpos.y () != prevpos.y ()) {
|
||||
dir = DIR128 (FCOORD (destpos - prevpos));
|
||||
dir += 64; //turn to step style
|
||||
new_step = dir.get_dir ();
|
||||
if (new_step & 31) {
|
||||
set_step(destindex++, dir + round1);
|
||||
if (destindex < 2
|
||||
|| (dirdiff =
|
||||
step_dir (destindex - 1) - step_dir (destindex - 2)) !=
|
||||
-64 && dirdiff != 64)
|
||||
set_step(destindex++, dir + round2);
|
||||
else {
|
||||
set_step(destindex - 1, dir + round2);
|
||||
set_step(destindex++, dir + round1);
|
||||
}
|
||||
}
|
||||
else {
|
||||
set_step(destindex++, dir);
|
||||
if (destindex >= 2
|
||||
&&
|
||||
((dirdiff =
|
||||
step_dir (destindex - 1) - step_dir (destindex - 2)) ==
|
||||
-64 || dirdiff == 64))
|
||||
destindex -= 2; // Forget u turn
|
||||
}
|
||||
prevpos = destpos;
|
||||
new_box = BOX (destpos, destpos);
|
||||
box += new_box;
|
||||
}
|
||||
}
|
||||
ASSERT_HOST (destpos.x () == start.x () && destpos.y () == start.y ());
|
||||
dirdiff = step_dir (destindex - 1) - step_dir (0);
|
||||
while ((dirdiff == 64 || dirdiff == -64) && destindex > 1) {
|
||||
start += step (0);
|
||||
destindex -= 2;
|
||||
for (int i = 0; i < destindex; ++i)
|
||||
set_step(i, step_dir(i + 1));
|
||||
dirdiff = step_dir (destindex - 1) - step_dir (0);
|
||||
}
|
||||
if (destindex >= 4)
|
||||
break;
|
||||
}
|
||||
stepcount = destindex;
|
||||
destpos = start;
|
||||
for (stepindex = 0; stepindex < stepcount; stepindex++) {
|
||||
destpos += step (stepindex);
|
||||
}
|
||||
ASSERT_HOST (destpos.x () == start.x () && destpos.y () == start.y ());
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::area
|
||||
*
|
||||
* Compute the area of the outline, adding the areas of its children.
|
||||
**********************************************************************/
|
||||
|
||||
INT32 C_OUTLINE::area() { //area including children
|
||||
int stepindex; //current step
|
||||
INT32 total_steps; //steps to do
|
||||
INT32 total; //total area
|
||||
ICOORD pos; //position of point
|
||||
ICOORD next_step; //step to next pix
|
||||
C_OUTLINE_IT it = child ();
|
||||
|
||||
pos = start_pos ();
|
||||
total_steps = pathlength ();
|
||||
total = 0;
|
||||
for (stepindex = 0; stepindex < total_steps; stepindex++) {
|
||||
//all intersected
|
||||
next_step = step (stepindex);
|
||||
if (next_step.x () < 0)
|
||||
total += pos.y ();
|
||||
else if (next_step.x () > 0)
|
||||
total -= pos.y ();
|
||||
pos += next_step;
|
||||
}
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
total += it.data ()->area ();//add areas of children
|
||||
|
||||
return total;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::outer_area
|
||||
*
|
||||
* Compute the area of the outline, ignoring any child outlines.
|
||||
**********************************************************************/
|
||||
|
||||
INT32 C_OUTLINE::outer_area() { //area without children
|
||||
int stepindex; //current step
|
||||
INT32 total_steps; //steps to do
|
||||
INT32 total; //total area
|
||||
ICOORD pos; //position of point
|
||||
ICOORD next_step; //step to next pix
|
||||
|
||||
pos = start_pos ();
|
||||
total_steps = pathlength ();
|
||||
total = 0;
|
||||
for (stepindex = 0; stepindex < total_steps; stepindex++) {
|
||||
//all intersected
|
||||
next_step = step (stepindex);
|
||||
if (next_step.x () < 0)
|
||||
total += pos.y ();
|
||||
else if (next_step.x () > 0)
|
||||
total -= pos.y ();
|
||||
pos += next_step;
|
||||
}
|
||||
|
||||
return total;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::count_transitions
|
||||
*
|
||||
* Compute the number of x and y maxes and mins in the outline.
|
||||
**********************************************************************/
|
||||
|
||||
INT32 C_OUTLINE::count_transitions( //count x and y extrema
|
||||
INT32 threshold //on size
|
||||
) {
|
||||
BOOL8 first_was_max_x; //what was first
|
||||
BOOL8 first_was_max_y;
|
||||
BOOL8 looking_for_max_x; //what is next
|
||||
BOOL8 looking_for_min_x;
|
||||
BOOL8 looking_for_max_y; //what is next
|
||||
BOOL8 looking_for_min_y;
|
||||
int stepindex; //current step
|
||||
INT32 total_steps; //steps to do
|
||||
//current limits
|
||||
INT32 max_x, min_x, max_y, min_y;
|
||||
INT32 initial_x, initial_y; //initial limits
|
||||
INT32 total; //total changes
|
||||
ICOORD pos; //position of point
|
||||
ICOORD next_step; //step to next pix
|
||||
|
||||
pos = start_pos ();
|
||||
total_steps = pathlength ();
|
||||
total = 0;
|
||||
max_x = min_x = pos.x ();
|
||||
max_y = min_y = pos.y ();
|
||||
looking_for_max_x = TRUE;
|
||||
looking_for_min_x = TRUE;
|
||||
looking_for_max_y = TRUE;
|
||||
looking_for_min_y = TRUE;
|
||||
first_was_max_x = FALSE;
|
||||
first_was_max_y = FALSE;
|
||||
initial_x = pos.x ();
|
||||
initial_y = pos.y (); //stop uninit warning
|
||||
for (stepindex = 0; stepindex < total_steps; stepindex++) {
|
||||
//all intersected
|
||||
next_step = step (stepindex);
|
||||
pos += next_step;
|
||||
if (next_step.x () < 0) {
|
||||
if (looking_for_max_x && pos.x () < min_x)
|
||||
min_x = pos.x ();
|
||||
if (looking_for_min_x && max_x - pos.x () > threshold) {
|
||||
if (looking_for_max_x) {
|
||||
initial_x = max_x; //remember first max
|
||||
first_was_max_x = FALSE;
|
||||
}
|
||||
total++;
|
||||
looking_for_max_x = TRUE;
|
||||
looking_for_min_x = FALSE;
|
||||
min_x = pos.x (); //reset min
|
||||
}
|
||||
}
|
||||
else if (next_step.x () > 0) {
|
||||
if (looking_for_min_x && pos.x () > max_x)
|
||||
max_x = pos.x ();
|
||||
if (looking_for_max_x && pos.x () - min_x > threshold) {
|
||||
if (looking_for_min_x) {
|
||||
initial_x = min_x; //remember first min
|
||||
first_was_max_x = TRUE;
|
||||
}
|
||||
total++;
|
||||
looking_for_max_x = FALSE;
|
||||
looking_for_min_x = TRUE;
|
||||
max_x = pos.x ();
|
||||
}
|
||||
}
|
||||
else if (next_step.y () < 0) {
|
||||
if (looking_for_max_y && pos.y () < min_y)
|
||||
min_y = pos.y ();
|
||||
if (looking_for_min_y && max_y - pos.y () > threshold) {
|
||||
if (looking_for_max_y) {
|
||||
initial_y = max_y; //remember first max
|
||||
first_was_max_y = FALSE;
|
||||
}
|
||||
total++;
|
||||
looking_for_max_y = TRUE;
|
||||
looking_for_min_y = FALSE;
|
||||
min_y = pos.y (); //reset min
|
||||
}
|
||||
}
|
||||
else {
|
||||
if (looking_for_min_y && pos.y () > max_y)
|
||||
max_y = pos.y ();
|
||||
if (looking_for_max_y && pos.y () - min_y > threshold) {
|
||||
if (looking_for_min_y) {
|
||||
initial_y = min_y; //remember first min
|
||||
first_was_max_y = TRUE;
|
||||
}
|
||||
total++;
|
||||
looking_for_max_y = FALSE;
|
||||
looking_for_min_y = TRUE;
|
||||
max_y = pos.y ();
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
if (first_was_max_x && looking_for_min_x) {
|
||||
if (max_x - initial_x > threshold)
|
||||
total++;
|
||||
else
|
||||
total--;
|
||||
}
|
||||
else if (!first_was_max_x && looking_for_max_x) {
|
||||
if (initial_x - min_x > threshold)
|
||||
total++;
|
||||
else
|
||||
total--;
|
||||
}
|
||||
if (first_was_max_y && looking_for_min_y) {
|
||||
if (max_y - initial_y > threshold)
|
||||
total++;
|
||||
else
|
||||
total--;
|
||||
}
|
||||
else if (!first_was_max_y && looking_for_max_y) {
|
||||
if (initial_y - min_y > threshold)
|
||||
total++;
|
||||
else
|
||||
total--;
|
||||
}
|
||||
|
||||
return total;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::operator<
|
||||
*
|
||||
* Return TRUE if the left operand is inside the right one.
|
||||
**********************************************************************/
|
||||
|
||||
BOOL8
|
||||
C_OUTLINE::operator< ( //containment test
|
||||
const C_OUTLINE & other //other outline
|
||||
) const
|
||||
{
|
||||
INT16 count = 0; //winding count
|
||||
ICOORD pos; //position of point
|
||||
INT32 stepindex; //index to cstep
|
||||
|
||||
if (!box.overlap (other.box))
|
||||
return FALSE; //can't be contained
|
||||
|
||||
pos = start;
|
||||
for (stepindex = 0; stepindex < stepcount
|
||||
&& (count = other.winding_number (pos)) == INTERSECTING; stepindex++)
|
||||
pos += step (stepindex); //try all points
|
||||
if (count == INTERSECTING) {
|
||||
//all intersected
|
||||
pos = other.start;
|
||||
for (stepindex = 0; stepindex < other.stepcount
|
||||
&& (count = winding_number (pos)) == INTERSECTING; stepindex++)
|
||||
//try other way round
|
||||
pos += other.step (stepindex);
|
||||
return count == INTERSECTING || count == 0;
|
||||
}
|
||||
return count != 0;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::winding_number
|
||||
*
|
||||
* Return the winding number of the outline around the given point.
|
||||
**********************************************************************/
|
||||
|
||||
INT16 C_OUTLINE::winding_number( //winding number
|
||||
ICOORD point //point to wind around
|
||||
) const {
|
||||
INT16 stepindex; //index to cstep
|
||||
INT16 count; //winding count
|
||||
ICOORD vec; //to current point
|
||||
ICOORD stepvec; //step vector
|
||||
INT32 cross; //cross product
|
||||
|
||||
vec = start - point; //vector to it
|
||||
count = 0;
|
||||
for (stepindex = 0; stepindex < stepcount; stepindex++) {
|
||||
stepvec = step (stepindex); //get the step
|
||||
//crossing the line
|
||||
if (vec.y () <= 0 && vec.y () + stepvec.y () > 0) {
|
||||
cross = vec * stepvec; //cross product
|
||||
if (cross > 0)
|
||||
count++; //crossing right half
|
||||
else if (cross == 0)
|
||||
return INTERSECTING; //going through point
|
||||
}
|
||||
else if (vec.y () > 0 && vec.y () + stepvec.y () <= 0) {
|
||||
cross = vec * stepvec;
|
||||
if (cross < 0)
|
||||
count--; //crossing back
|
||||
else if (cross == 0)
|
||||
return INTERSECTING; //illegal
|
||||
}
|
||||
vec += stepvec; //sum vectors
|
||||
}
|
||||
return count; //winding number
|
||||
}
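The function above counts signed crossings of the horizontal ray running to the right of the test point, using the cross product to decide on which side of the point each vertical step lies. The following is a minimal standalone sketch of that test, assuming a closed loop of unit steps; `Vec`, `cross`, `kIntersecting` and this free-function `winding_number` are hypothetical names introduced for illustration, not the Tesseract declarations.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical integer vector type (stand-in for an integer coordinate).
struct Vec {
  int x;
  int y;
};

// Hypothetical sentinel meaning "the path passes through the point".
const int kIntersecting = INT16_MAX;

// z component of the 2D cross product a x b.
inline int cross(Vec a, Vec b) { return a.x * b.y - a.y * b.x; }

// Winding number of a closed step path around point, or kIntersecting
// if the path touches the point.
int winding_number(Vec start, const std::vector<Vec>& steps, Vec point) {
  Vec vec = {start.x - point.x, start.y - point.y};  // point -> path
  int count = 0;
  for (const Vec& s : steps) {
    if (vec.y <= 0 && vec.y + s.y > 0) {  // step crosses the ray going up
      int c = cross(vec, s);
      if (c > 0)
        ++count;  // crossing lies to the right of the point
      else if (c == 0)
        return kIntersecting;
    } else if (vec.y > 0 && vec.y + s.y <= 0) {  // crosses going down
      int c = cross(vec, s);
      if (c < 0)
        --count;  // crossing back, again to the right of the point
      else if (c == 0)
        return kIntersecting;
    }
    vec.x += s.x;
    vec.y += s.y;
  }
  return count;
}
```

For an anticlockwise 2x2 square, a point inside gives 1, a point outside gives 0, and a point on the boundary gives the sentinel.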
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::turn_direction
|
||||
*
|
||||
* Return the sum of the direction changes around the outline (+128 or -128).
|
||||
**********************************************************************/
|
||||
|
||||
INT16 C_OUTLINE::turn_direction() const { //sum of turns
|
||||
DIR128 prevdir; //previous direction
|
||||
DIR128 dir; //current direction
|
||||
INT16 stepindex; //index to cstep
|
||||
INT8 dirdiff; //direction difference
|
||||
INT16 count; //winding count
|
||||
|
||||
count = 0;
|
||||
prevdir = step_dir (stepcount - 1);
|
||||
for (stepindex = 0; stepindex < stepcount; stepindex++) {
|
||||
dir = step_dir (stepindex);
|
||||
dirdiff = dir - prevdir;
|
||||
ASSERT_HOST (dirdiff == 0 || dirdiff == 32 || dirdiff == -32);
|
||||
count += dirdiff;
|
||||
prevdir = dir;
|
||||
}
|
||||
ASSERT_HOST (count == 128 || count == -128);
|
||||
return count; //total turn
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::reverse
|
||||
*
|
||||
* Reverse the direction of an outline.
|
||||
**********************************************************************/
|
||||
|
||||
void C_OUTLINE::reverse() { //reverse direction
|
||||
DIR128 halfturn = MODULUS / 2; //amount to shift
|
||||
DIR128 stepdir; //direction of step
|
||||
INT16 stepindex; //index to cstep
|
||||
INT16 farindex; //index to other side
|
||||
INT16 halfsteps; //half of stepcount
|
||||
|
||||
halfsteps = (stepcount + 1) / 2;
|
||||
for (stepindex = 0; stepindex < halfsteps; stepindex++) {
|
||||
farindex = stepcount - stepindex - 1;
|
||||
stepdir = step_dir (stepindex);
|
||||
set_step (stepindex, step_dir (farindex) + halfturn);
|
||||
set_step (farindex, stepdir + halfturn);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::move
|
||||
*
|
||||
* Move C_OUTLINE by vector
|
||||
**********************************************************************/
|
||||
|
||||
void C_OUTLINE::move( // reposition OUTLINE
|
||||
const ICOORD vec // by vector
|
||||
) {
|
||||
C_OUTLINE_IT it(&children); // iterator
|
||||
|
||||
box.move (vec);
|
||||
start += vec;
|
||||
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
it.data ()->move (vec); // move child outlines
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::plot
|
||||
*
|
||||
* Draw the outline in the given colour.
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void C_OUTLINE::plot( //draw it
|
||||
WINDOW window, //window to draw in
|
||||
COLOUR colour //colour to draw in
|
||||
) const {
|
||||
INT16 stepindex; //index to cstep
|
||||
ICOORD pos; //current position
|
||||
DIR128 stepdir; //direction of step
|
||||
DIR128 oldstepdir; //previous stepdir
|
||||
|
||||
pos = start; //current position
|
||||
line_color_index(window, colour);
|
||||
move2d (window, pos.x (), pos.y ());
|
||||
stepindex = 0;
|
||||
stepdir = step_dir (0); //get direction
|
||||
while (stepindex < stepcount) {
|
||||
do {
|
||||
pos += step (stepindex); //step to next
|
||||
stepindex++; //count steps
|
||||
oldstepdir = stepdir;
|
||||
//new direction
|
||||
if (stepindex < stepcount) //don't read past the last step
stepdir = step_dir (stepindex);
|
||||
}
|
||||
while (stepindex < stepcount
|
||||
&& oldstepdir.get_dir () == stepdir.get_dir ());
|
||||
//merge straight lines
|
||||
draw2d (window, pos.x (), pos.y ());
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* C_OUTLINE::operator=
|
||||
*
|
||||
* Assignment - deep copy data
|
||||
**********************************************************************/
|
||||
|
||||
//assignment
|
||||
C_OUTLINE & C_OUTLINE::operator= (
|
||||
const C_OUTLINE & source //from this
|
||||
) {
|
||||
if (this == &source)
return *this; //self-assignment: nothing to copy
box = source.box;
|
||||
start = source.start;
|
||||
if (steps != NULL)
|
||||
free_mem(steps);
|
||||
stepcount = source.stepcount;
|
||||
steps = (UINT8 *) alloc_mem (step_mem());
|
||||
memmove (steps, source.steps, step_mem());
|
||||
if (!children.empty ())
|
||||
children.clear ();
|
||||
children.deep_copy (&source.children);
|
||||
return *this;
|
||||
}
|
176
ccstruct/coutln.h
Normal file
@ -0,0 +1,176 @@
|
||||
/**********************************************************************
|
||||
* File: coutln.h (Formerly: coutline.c)
|
||||
* Description: Code for the C_OUTLINE class.
|
||||
* Author: Ray Smith
|
||||
* Created: Mon Oct 07 16:01:57 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef COUTLN_H
|
||||
#define COUTLN_H
|
||||
|
||||
#include "grphics.h"
|
||||
#include "crakedge.h"
|
||||
#include "mod128.h"
|
||||
#include "bits16.h"
|
||||
#include "rect.h"
|
||||
#include "blckerr.h"
|
||||
|
||||
#define INTERSECTING MAX_INT16//no winding number
|
||||
|
||||
//mask to get step
|
||||
#define STEP_MASK 3
|
||||
|
||||
enum C_OUTLINE_FLAGS
|
||||
{
|
||||
COUT_INVERSE //White on black blob
|
||||
};
|
||||
|
||||
class DLLSYM C_OUTLINE; //forward declaration
|
||||
|
||||
ELISTIZEH_S (C_OUTLINE)
|
||||
class DLLSYM C_OUTLINE:public ELIST_LINK
|
||||
{
|
||||
public:
|
||||
C_OUTLINE() { //empty constructor
|
||||
steps = NULL;
|
||||
}
|
||||
C_OUTLINE( //constructor
|
||||
CRACKEDGE *startpt, //from edge detector
|
||||
ICOORD bot_left, //bounding box
ICOORD top_right,
INT16 length); //length of loop
|
||||
C_OUTLINE(ICOORD startpt, //start of loop
|
||||
DIR128 *new_steps, //steps in loop
|
||||
INT16 length); //length of loop
|
||||
//outline to copy
|
||||
C_OUTLINE(C_OUTLINE *srcline, FCOORD rotation); //and rotate
|
||||
~C_OUTLINE () { //destructor
|
||||
if (steps != NULL)
|
||||
free_mem(steps);
|
||||
steps = NULL;
|
||||
}
|
||||
|
||||
BOOL8 flag( //test flag
|
||||
C_OUTLINE_FLAGS mask) const { //flag to test
|
||||
return flags.bit (mask);
|
||||
}
|
||||
void set_flag( //set flag value
|
||||
C_OUTLINE_FLAGS mask, //flag to set
|
||||
BOOL8 value) { //value to set
|
||||
flags.set_bit (mask, value);
|
||||
}
|
||||
|
||||
C_OUTLINE_LIST *child() { //get child list
|
||||
return &children;
|
||||
}
|
||||
|
||||
//access function
|
||||
const BOX &bounding_box() const {
|
||||
return box;
|
||||
}
|
||||
void set_step( //set a step
|
||||
INT16 stepindex, //index of step
|
||||
INT8 stepdir) { //chain code
|
||||
int shift = stepindex%4 * 2;
|
||||
UINT8 mask = 3 << shift;
|
||||
steps[stepindex/4] = ((stepdir << shift) & mask) |
|
||||
(steps[stepindex/4] & ~mask);
|
||||
//squeeze 4 into byte
|
||||
}
|
||||
void set_step( //set a step
|
||||
INT16 stepindex, //index of step
|
||||
DIR128 stepdir) { //direction
|
||||
//clean it
|
||||
INT8 chaindir = stepdir.get_dir() >> (DIRBITS - 2);
|
||||
//store the 2-bit chain code
|
||||
set_step(stepindex, chaindir);
|
||||
//squeeze 4 into byte
|
||||
}
|
||||
|
||||
//get start position
|
||||
const ICOORD &start_pos() const {
|
||||
return start;
|
||||
}
|
||||
INT32 pathlength() const { //get path length
|
||||
return stepcount;
|
||||
}
|
||||
// Return step at a given index as a DIR128.
|
||||
DIR128 step_dir(INT16 index) const {
|
||||
return DIR128((INT16)(((steps[index/4] >> (index%4 * 2)) & STEP_MASK) <<
|
||||
(DIRBITS - 2)));
|
||||
}
|
||||
// Return the step vector for the given outline position.
|
||||
ICOORD step(INT16 index) const { //index of step
|
||||
return step_coords[(steps[index/4] >> (index%4 * 2)) & STEP_MASK];
|
||||
}
|
||||
|
||||
INT32 area(); //return area
|
||||
INT32 outer_area(); //return area excluding children
|
||||
INT32 count_transitions( //count maxima
|
||||
INT32 threshold); //size threshold
|
||||
|
||||
BOOL8 operator< ( //containment test
|
||||
const C_OUTLINE & other) const;
|
||||
BOOL8 operator> ( //containment test
|
||||
const C_OUTLINE & other) const
|
||||
{
|
||||
return other < *this; //use the < to do it
|
||||
}
|
||||
INT16 winding_number( //get winding number
|
||||
ICOORD testpt) const; //around this point
|
||||
//get direction
|
||||
INT16 turn_direction() const;
|
||||
void reverse(); //reverse direction
|
||||
|
||||
void move( // reposition outline
|
||||
const ICOORD vec); // by vector
|
||||
|
||||
void plot( //draw one
|
||||
WINDOW window, //window to draw in
|
||||
COLOUR colour) const; //colour to draw it
|
||||
|
||||
void prep_serialise() { //set ptrs to counts
|
||||
children.prep_serialise ();
|
||||
}
|
||||
|
||||
void dump( //write external bits
|
||||
FILE *f) {
|
||||
//step_mem() = # bytes
|
||||
serialise_bytes (f, (void *) steps, step_mem());
|
||||
children.dump (f);
|
||||
}
|
||||
|
||||
void de_dump( //read external bits
|
||||
FILE *f) {
|
||||
steps = (UINT8 *) de_serialise_bytes (f, step_mem());
|
||||
children.de_dump (f);
|
||||
}
|
||||
|
||||
//assignment
|
||||
make_serialise (C_OUTLINE) C_OUTLINE & operator= (
|
||||
const C_OUTLINE & source); //from this
|
||||
|
||||
private:
|
||||
int step_mem() const { return (stepcount+3) / 4; }
|
||||
|
||||
BOX box; //bounding box
|
||||
ICOORD start; //start coord
|
||||
UINT8 *steps; //step array
|
||||
INT16 stepcount; //no of steps
|
||||
BITS16 flags; //flags about outline
|
||||
C_OUTLINE_LIST children; //child elements
|
||||
static ICOORD step_coords[4];
|
||||
};
|
||||
#endif
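The `set_step`, `step_dir` and `step_mem` members above pack four 2-bit chain codes into each byte of the step array. The following is a minimal standalone sketch of that packing scheme; `PackedSteps` is a hypothetical class introduced for illustration and uses `std::vector` rather than the `alloc_mem`/`free_mem` calls of the original.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container holding one 2-bit code (0..3) per step,
// four steps to a byte.
class PackedSteps {
 public:
  explicit PackedSteps(int count)
      : count_(count), bytes_((count + 3) / 4, 0) {}  // round up to whole bytes

  // Overwrite the 2-bit code at index, leaving its three neighbours intact.
  void set(int index, int code) {
    const int shift = (index % 4) * 2;
    const std::uint8_t mask = static_cast<std::uint8_t>(3u << shift);
    bytes_[index / 4] = static_cast<std::uint8_t>(
        ((code << shift) & mask) | (bytes_[index / 4] & ~mask));
  }

  // Read back the 2-bit code at index.
  int get(int index) const {
    return (bytes_[index / 4] >> ((index % 4) * 2)) & 3;
  }

  int size() const { return count_; }                      // number of steps
  std::size_t byte_count() const { return bytes_.size(); } // storage in bytes

 private:
  int count_;
  std::vector<std::uint8_t> bytes_;
};
```

With this layout, five steps occupy two bytes, and the byte count rather than the step count is the amount to allocate, copy or serialise.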
|
39
ccstruct/crakedge.h
Normal file
@ -0,0 +1,39 @@
|
||||
/**********************************************************************
|
||||
* File: crakedge.h (Formerly: crkedge.h)
|
||||
* Description: Structures for the Crack following edge detector.
|
||||
* Author: Ray Smith
|
||||
* Created: Fri Mar 22 16:06:38 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef CRAKEDGE_H
|
||||
#define CRAKEDGE_H
|
||||
|
||||
#include "points.h"
|
||||
#include "mod128.h"
|
||||
|
||||
class CRACKEDGE
|
||||
{
|
||||
public:
|
||||
ICOORD pos; /*position of crack */
|
||||
INT8 stepx; //edge step
|
||||
INT8 stepy;
|
||||
INT8 stepdir; //chaincode
|
||||
CRACKEDGE *prev; /*previous point */
|
||||
CRACKEDGE *next; /*next point */
|
||||
|
||||
NEWDELETE2 (CRACKEDGE) CRACKEDGE () {
|
||||
} //empty constructor
|
||||
};
|
||||
#endif
|
133
ccstruct/genblob.cpp
Normal file
@ -0,0 +1,133 @@
|
||||
/**********************************************************************
|
||||
* File: genblob.cpp (Formerly gblob.c)
|
||||
* Description: Generic Blob processing routines
|
||||
* Author: Phil Cheatle
|
||||
* Created: Mon Nov 25 10:53:26 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "stepblob.h"
|
||||
#include "polyblob.h"
|
||||
#include "genblob.h"
|
||||
|
||||
/**********************************************************************
|
||||
* blob_comparator()
|
||||
*
|
||||
* Blob comparator used to sort a blob list so that blobs are in increasing
|
||||
* order of left edge.
|
||||
**********************************************************************/
|
||||
|
||||
int blob_comparator( //sort blobs
|
||||
const void *blob1p, //ptr to ptr to blob1
|
||||
const void *blob2p //ptr to ptr to blob2
|
||||
) {
|
||||
PBLOB *blob1 = *(PBLOB **) blob1p;
|
||||
PBLOB *blob2 = *(PBLOB **) blob2p;
|
||||
|
||||
return blob1->bounding_box ().left () - blob2->bounding_box ().left ();
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* c_blob_comparator()
|
||||
*
|
||||
* Blob comparator used to sort a blob list so that blobs are in increasing
|
||||
* order of left edge.
|
||||
**********************************************************************/
|
||||
|
||||
int c_blob_comparator( //sort blobs
|
||||
const void *blob1p, //ptr to ptr to blob1
|
||||
const void *blob2p //ptr to ptr to blob2
|
||||
) {
|
||||
C_BLOB *blob1 = *(C_BLOB **) blob1p;
|
||||
C_BLOB *blob2 = *(C_BLOB **) blob2p;
|
||||
|
||||
return blob1->bounding_box ().left () - blob2->bounding_box ().left ();
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* gblob_bounding_box()
|
||||
*
|
||||
* Return the bounding box of a generic blob.
|
||||
**********************************************************************/
|
||||
|
||||
BOX gblob_bounding_box( //Get bounding box
|
||||
PBLOB *blob, //generic blob
|
||||
BOOL8 polygonal //is blob polygonal?
|
||||
) {
|
||||
if (polygonal)
|
||||
return blob->bounding_box ();
|
||||
else
|
||||
return ((C_BLOB *) blob)->bounding_box ();
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* gblob_sort_list()
|
||||
*
|
||||
* Sort a generic blob list into order of bounding box left edge
|
||||
**********************************************************************/
|
||||
|
||||
void gblob_sort_list( //Sort a gblob list
|
||||
PBLOB_LIST *blob_list, //generic blob list
|
||||
BOOL8 polygonal //is list polygonal?
|
||||
) {
|
||||
PBLOB_IT b_it;
|
||||
C_BLOB_IT c_it;
|
||||
|
||||
if (polygonal) {
|
||||
b_it.set_to_list (blob_list);
|
||||
b_it.sort (blob_comparator);
|
||||
}
|
||||
else {
|
||||
c_it.set_to_list ((C_BLOB_LIST *) blob_list);
|
||||
c_it.sort (c_blob_comparator);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* gblob_out_list()
|
||||
*
|
||||
* Return the generic outline list of a generic blob.
|
||||
**********************************************************************/
|
||||
|
||||
OUTLINE_LIST *gblob_out_list( //Get outline list
|
||||
PBLOB *blob, //generic blob
|
||||
BOOL8 polygonal //is blob polygonal?
|
||||
) {
|
||||
if (polygonal)
|
||||
return blob->out_list ();
|
||||
else
|
||||
return (OUTLINE_LIST *) ((C_BLOB *) blob)->out_list ();
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* goutline_bounding_box()
|
||||
*
|
||||
* Return the bounding box of a generic outline.
|
||||
**********************************************************************/
|
||||
|
||||
BOX goutline_bounding_box( //Get bounding box
|
||||
OUTLINE *outline, //generic outline
|
||||
BOOL8 polygonal //is outline polygonal?
|
||||
) {
|
||||
if (polygonal)
|
||||
return outline->bounding_box ();
|
||||
else
|
||||
return ((C_OUTLINE *) outline)->bounding_box ();
|
||||
}
|
52
ccstruct/genblob.h
Normal file
@ -0,0 +1,52 @@
|
||||
/**********************************************************************
|
||||
* File: genblob.h (Formerly gblob.h)
|
||||
* Description: Generic Blob processing routines
|
||||
* Author: Phil Cheatle
|
||||
* Created: Mon Nov 25 10:53:26 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef GENBLOB_H
|
||||
#define GENBLOB_H
|
||||
|
||||
#include "polyblob.h"
|
||||
#include "hosthplb.h"
|
||||
#include "rect.h"
|
||||
#include "notdll.h"
|
||||
|
||||
int blob_comparator( //sort blobs
|
||||
const void *blob1p, //ptr to ptr to blob1
|
||||
const void *blob2p //ptr to ptr to blob2
|
||||
);
|
||||
int c_blob_comparator( //sort blobs
|
||||
const void *blob1p, //ptr to ptr to blob1
|
||||
const void *blob2p //ptr to ptr to blob2
|
||||
);
|
||||
BOX gblob_bounding_box( //Get bounding box
|
||||
PBLOB *blob, //generic blob
|
||||
BOOL8 polygonal //is blob polygonal?
|
||||
);
|
||||
void gblob_sort_list( //Sort a gblob list
|
||||
PBLOB_LIST *blob_list, //generic blob list
|
||||
BOOL8 polygonal //is list polygonal?
|
||||
);
|
||||
OUTLINE_LIST *gblob_out_list( //Get outline list
|
||||
PBLOB *blob, //generic blob
|
||||
BOOL8 polygonal //is blob polygonal?
|
||||
);
|
||||
BOX goutline_bounding_box( //Get bounding box
|
||||
OUTLINE *outline, //generic outline
|
||||
BOOL8 polygonal //is outline polygonal?
|
||||
);
|
||||
#endif
|
39
ccstruct/hpddef.h
Normal file
@ -0,0 +1,39 @@
|
||||
/**********************************************************************
|
||||
* File: hpddef.h
|
||||
* Description: Defines for dll symbols for handpd.dll.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Apr 30 17:15:01 MDT 1996
|
||||
*
|
||||
* (C) Copyright 1996, Hewlett-Packard Co.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
//This file does NOT use the usual single inclusion code as it
|
||||
//is necessary to allow it to be executed every time it is included.
|
||||
//#ifndef HPDDEF_H
|
||||
//#define HPDDEF_H
|
||||
|
||||
#undef DLLSYM
|
||||
#ifndef __IPEDLL
|
||||
# define DLLSYM
|
||||
#else
|
||||
# ifdef __BUILDING_HANDPD__
|
||||
# define DLLSYM DLLEXPORT
|
||||
# else
|
||||
# define DLLSYM DLLIMPORT
|
||||
# endif
|
||||
#endif
|
||||
#if defined(__CFM68K__) && !defined(__USING_STATIC_LIBS__)
|
||||
# pragma import on
|
||||
#endif
|
||||
|
||||
//#endif
|
8
ccstruct/hpdsizes.h
Normal file
@ -0,0 +1,8 @@
|
||||
#ifndef HPDSIZES_H
|
||||
#define HPDSIZES_H
|
||||
|
||||
#define NUM_TEXT_ATTR 10
|
||||
#define NUM_BLOCK_ATTR 7
|
||||
#define MAXLENGTH 128
|
||||
#define NUM_BACKGROUNDS 8
|
||||
#endif
|
479
ccstruct/ipoints.h
Normal file
@ -0,0 +1,479 @@
|
||||
/**********************************************************************
|
||||
* File: ipoints.h (Formerly icoords.h)
|
||||
* Description: Inline functions for coords.h.
|
||||
* Author: Ray Smith
|
||||
* Created: Fri Jun 21 15:14:21 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef IPOINTS_H
|
||||
#define IPOINTS_H
|
||||
|
||||
#include <math.h>
|
||||
|
||||
/**********************************************************************
|
||||
* operator!
|
||||
*
|
||||
* Rotate an ICOORD 90 degrees anticlockwise.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD
|
||||
operator! ( //rotate 90 deg anti
|
||||
const ICOORD & src //thing to rotate
|
||||
) {
|
||||
ICOORD result; //output
|
||||
|
||||
result.xcoord = -src.ycoord;
|
||||
result.ycoord = src.xcoord;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator-
|
||||
*
|
||||
* Unary minus of an ICOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD
|
||||
operator- ( //unary minus
|
||||
const ICOORD & src //thing to minus
|
||||
) {
|
||||
ICOORD result; //output
|
||||
|
||||
result.xcoord = -src.xcoord;
|
||||
result.ycoord = -src.ycoord;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator+
|
||||
*
|
||||
* Add 2 ICOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD
|
||||
operator+ ( //sum vectors
|
||||
const ICOORD & op1, //operands
|
||||
const ICOORD & op2) {
|
||||
ICOORD sum; //result
|
||||
|
||||
sum.xcoord = op1.xcoord + op2.xcoord;
|
||||
sum.ycoord = op1.ycoord + op2.ycoord;
|
||||
return sum;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator+=
|
||||
*
|
||||
* Add 2 ICOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD &
|
||||
operator+= ( //sum vectors
|
||||
ICOORD & op1, //operands
|
||||
const ICOORD & op2) {
|
||||
op1.xcoord += op2.xcoord;
|
||||
op1.ycoord += op2.ycoord;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator-
|
||||
*
|
||||
* Subtract 2 ICOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD
|
||||
operator- ( //subtract vectors
|
||||
const ICOORD & op1, //operands
|
||||
const ICOORD & op2) {
|
||||
ICOORD sum; //result
|
||||
|
||||
sum.xcoord = op1.xcoord - op2.xcoord;
|
||||
sum.ycoord = op1.ycoord - op2.ycoord;
|
||||
return sum;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator-=
|
||||
*
|
||||
* Subtract 2 ICOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD &
|
||||
operator-= ( //subtract vectors
|
||||
ICOORD & op1, //operands
|
||||
const ICOORD & op2) {
|
||||
op1.xcoord -= op2.xcoord;
|
||||
op1.ycoord -= op2.ycoord;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator%
|
||||
*
|
||||
* Scalar product of 2 ICOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline INT32
|
||||
operator% ( //scalar product
|
||||
const ICOORD & op1, //operands
|
||||
const ICOORD & op2) {
|
||||
return op1.xcoord * op2.xcoord + op1.ycoord * op2.ycoord;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator*
|
||||
*
|
||||
* Cross product of 2 ICOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline INT32 operator *( //cross product
|
||||
const ICOORD &op1, //operands
|
||||
const ICOORD &op2) {
|
||||
return op1.xcoord * op2.ycoord - op1.ycoord * op2.xcoord;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator*
|
||||
*
|
||||
* Scalar multiply of an ICOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD operator *( //scalar multiply
|
||||
const ICOORD &op1, //operands
|
||||
INT16 scale) {
|
||||
ICOORD result; //output
|
||||
|
||||
result.xcoord = op1.xcoord * scale;
|
||||
result.ycoord = op1.ycoord * scale;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
inline ICOORD operator *( //scalar multiply
|
||||
INT16 scale,
|
||||
const ICOORD &op1 //operands
|
||||
) {
|
||||
ICOORD result; //output
|
||||
|
||||
result.xcoord = op1.xcoord * scale;
|
||||
result.ycoord = op1.ycoord * scale;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator*=
|
||||
*
|
||||
* Scalar multiply of an ICOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD &
|
||||
operator*= ( //scalar multiply
|
||||
ICOORD & op1, //operands
|
||||
INT16 scale) {
|
||||
op1.xcoord *= scale;
|
||||
op1.ycoord *= scale;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator/
|
||||
*
|
||||
* Scalar divide of an ICOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD
|
||||
operator/ ( //scalar divide
|
||||
const ICOORD & op1, //operands
|
||||
INT16 scale) {
|
||||
ICOORD result; //output
|
||||
|
||||
result.xcoord = op1.xcoord / scale;
|
||||
result.ycoord = op1.ycoord / scale;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator/=
|
||||
*
|
||||
* Scalar divide of an ICOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline ICOORD &
|
||||
operator/= ( //scalar divide
|
||||
ICOORD & op1, //operands
|
||||
INT16 scale) {
|
||||
op1.xcoord /= scale;
|
||||
op1.ycoord /= scale;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* ICOORD::rotate
|
||||
*
|
||||
* Rotate an ICOORD by the given (normalized) (cos,sin) vector.
|
||||
**********************************************************************/
|
||||
|
||||
inline void ICOORD::rotate( //rotate by vector
|
||||
const FCOORD& vec) {
|
||||
INT16 tmp;
|
||||
|
||||
tmp = (INT16) floor (xcoord * vec.x () - ycoord * vec.y () + 0.5);
|
||||
ycoord = (INT16) floor (ycoord * vec.x () + xcoord * vec.y () + 0.5);
|
||||
xcoord = tmp;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator!
|
||||
*
|
||||
* Rotate an FCOORD 90 degrees anticlockwise.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD
|
||||
operator! ( //rotate 90 deg anti
|
||||
const FCOORD & src //thing to rotate
|
||||
) {
|
||||
FCOORD result; //output
|
||||
|
||||
result.xcoord = -src.ycoord;
|
||||
result.ycoord = src.xcoord;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator-
|
||||
*
|
||||
* Unary minus of an FCOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD
|
||||
operator- ( //unary minus
|
||||
const FCOORD & src //thing to minus
|
||||
) {
|
||||
FCOORD result; //output
|
||||
|
||||
result.xcoord = -src.xcoord;
|
||||
result.ycoord = -src.ycoord;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator+
|
||||
*
|
||||
* Add 2 FCOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD
|
||||
operator+ ( //sum vectors
|
||||
const FCOORD & op1, //operands
|
||||
const FCOORD & op2) {
|
||||
FCOORD sum; //result
|
||||
|
||||
sum.xcoord = op1.xcoord + op2.xcoord;
|
||||
sum.ycoord = op1.ycoord + op2.ycoord;
|
||||
return sum;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator+=
|
||||
*
|
||||
* Add 2 FCOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD &
|
||||
operator+= ( //sum vectors
|
||||
FCOORD & op1, //operands
|
||||
const FCOORD & op2) {
|
||||
op1.xcoord += op2.xcoord;
|
||||
op1.ycoord += op2.ycoord;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator-
|
||||
*
|
||||
* Subtract 2 FCOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD
|
||||
operator- ( //subtract vectors
|
||||
const FCOORD & op1, //operands
|
||||
const FCOORD & op2) {
|
||||
FCOORD sum; //result
|
||||
|
||||
sum.xcoord = op1.xcoord - op2.xcoord;
|
||||
sum.ycoord = op1.ycoord - op2.ycoord;
|
||||
return sum;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator-=
|
||||
*
|
||||
* Subtract 2 FCOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD &
|
||||
operator-= ( //subtract vectors
|
||||
FCOORD & op1, //operands
|
||||
const FCOORD & op2) {
|
||||
op1.xcoord -= op2.xcoord;
|
||||
op1.ycoord -= op2.ycoord;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator%
|
||||
*
|
||||
* Scalar product of 2 FCOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline float
|
||||
operator% ( //scalar product
|
||||
const FCOORD & op1, //operands
|
||||
const FCOORD & op2) {
|
||||
return op1.xcoord * op2.xcoord + op1.ycoord * op2.ycoord;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator*
|
||||
*
|
||||
* Cross product of 2 FCOORDS.
|
||||
**********************************************************************/
|
||||
|
||||
inline float operator *( //cross product
|
||||
const FCOORD &op1, //operands
|
||||
const FCOORD &op2) {
|
||||
return op1.xcoord * op2.ycoord - op1.ycoord * op2.xcoord;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator*
|
||||
*
|
||||
* Scalar multiply of an FCOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD operator *( //scalar multiply
|
||||
const FCOORD &op1, //operands
|
||||
float scale) {
|
||||
FCOORD result; //output
|
||||
|
||||
result.xcoord = op1.xcoord * scale;
|
||||
result.ycoord = op1.ycoord * scale;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
inline FCOORD operator *( //scalar multiply
|
||||
float scale,
|
||||
const FCOORD &op1 //operands
|
||||
) {
|
||||
FCOORD result; //output
|
||||
|
||||
result.xcoord = op1.xcoord * scale;
|
||||
result.ycoord = op1.ycoord * scale;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator*=
|
||||
*
|
||||
* Scalar multiply of an FCOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD &
|
||||
operator*= ( //scalar multiply
|
||||
FCOORD & op1, //operands
|
||||
float scale) {
|
||||
op1.xcoord *= scale;
|
||||
op1.ycoord *= scale;
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator/
|
||||
*
|
||||
* Scalar divide of an FCOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD
|
||||
operator/ ( //scalar divide
|
||||
const FCOORD & op1, //operands
|
||||
float scale) {
|
||||
FCOORD result; //output
|
||||
|
||||
if (scale != 0) {
|
||||
result.xcoord = op1.xcoord / scale;
|
||||
result.ycoord = op1.ycoord / scale;
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* operator/=
|
||||
*
|
||||
* Scalar divide of an FCOORD.
|
||||
**********************************************************************/
|
||||
|
||||
inline FCOORD &
|
||||
operator/= ( //scalar divide
|
||||
FCOORD & op1, //operands
|
||||
float scale) {
|
||||
if (scale != 0) {
|
||||
op1.xcoord /= scale;
|
||||
op1.ycoord /= scale;
|
||||
}
|
||||
return op1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* rotate
|
||||
*
|
||||
* Rotate an FCOORD by the given (normalized) (cos,sin) vector.
|
||||
**********************************************************************/
|
||||
|
||||
inline void FCOORD::rotate( //rotate by vector
|
||||
const FCOORD vec) {
|
||||
float tmp;
|
||||
|
||||
tmp = xcoord * vec.x () - ycoord * vec.y ();
|
||||
ycoord = ycoord * vec.x () + xcoord * vec.y ();
|
||||
xcoord = tmp;
|
||||
}
|
||||
#endif
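The inline operators above treat ICOORD and FCOORD as 2-D vectors: `%` is the scalar (dot) product, the two-vector `*` is the z component of the cross product, `!` rotates by 90 degrees anticlockwise, and `rotate` multiplies by a unit (cos, sin) vector. The following is a minimal sketch of the same arithmetic, assuming a hypothetical stand-in struct `Vec2` and free functions `dot`, `cross`, `rotate` and `perp`; none of these names come from the Tesseract sources.

```cpp
#include <cassert>  // only needed for checks such as those in the usage note

// Hypothetical stand-in for FCOORD; not the Tesseract class.
struct Vec2 {
  float x;
  float y;
};

// Scalar (dot) product, mirroring operator% above.
inline float dot(const Vec2 &a, const Vec2 &b) {
  return a.x * b.x + a.y * b.y;
}

// z component of the cross product, mirroring the two-vector operator*.
inline float cross(const Vec2 &a, const Vec2 &b) {
  return a.x * b.y - a.y * b.x;
}

// Rotate v by the unit vector rot = (cos t, sin t), as FCOORD::rotate does.
// The new x is computed from the old x and y before either is overwritten,
// which is why the original code needs the tmp variable.
inline Vec2 rotate(const Vec2 &v, const Vec2 &rot) {
  return Vec2{v.x * rot.x - v.y * rot.y, v.y * rot.x + v.x * rot.y};
}

// 90 degree anticlockwise rotation, mirroring operator!.
inline Vec2 perp(const Vec2 &v) {
  return Vec2{-v.y, v.x};
}
```

Rotating by the unit vector (0, 1), that is cos = 0 and sin = 1, gives the same result as `perp`, and a vector is always orthogonal to its own `perp`, so `dot(v, perp(v))` is zero while `cross(v, perp(v))` equals the squared length of `v`.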
|
188
ccstruct/labls.cpp
Normal file
@ -0,0 +1,188 @@
|
||||
/**********************************************************************
|
||||
* File: labls.cpp (Formerly labels.c)
|
||||
* Description: Attribute definition tables
|
||||
* Author: Sheelagh Lloyd?
|
||||
* Created:
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "hpdsizes.h"
|
||||
#include "labls.h"
|
||||
|
||||
/******************************************************************************
|
||||
* TEXT REGIONS
|
||||
*****************************************************************************/
|
||||
DLLSYM INT32 tn[NUM_TEXT_ATTR] = {
|
||||
3, //T_HORIZONTAL
|
||||
4, //T_TEXT
|
||||
2, //T_SERIF
|
||||
2, //T_PROPORTIONAL
|
||||
2, //T_NORMAL
|
||||
2, //T_UPRIGHT
|
||||
2, //T_SOLID
|
||||
3, //T_BLACK
|
||||
2, //T_NOTUNDER
|
||||
2, //T_NOTDROP
|
||||
};
|
||||
|
||||
DLLSYM char tlabel[NUM_TEXT_ATTR][4][MAXLENGTH] = { {
|
||||
//T_HORIZONTAL
|
||||
"Horizontal",
|
||||
"Vertical",
|
||||
"Skew",
|
||||
""
|
||||
},
|
||||
{ //T_TEXT
|
||||
"Text",
|
||||
"Table",
|
||||
"Form",
|
||||
"Mixed"
|
||||
},
|
||||
{ //T_SERIF
|
||||
"Serif",
|
||||
"Sans-serif",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //T_PROPORTIONAL
|
||||
"Proportional",
|
||||
"Fixed pitch",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //T_NORMAL
|
||||
"Normal",
|
||||
"Bold",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //T_UPRIGHT
|
||||
"Upright",
|
||||
"Italic",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //T_SOLID
|
||||
"Solid",
|
||||
"Outline",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //T_BLACK
|
||||
"Black",
|
||||
"White",
|
||||
"Coloured",
|
||||
""
|
||||
},
|
||||
{ //T_NOTUNDER
|
||||
"Not underlined",
|
||||
"Underlined",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //T_NOTDROP
|
||||
"Not drop caps",
|
||||
"Drop Caps",
|
||||
"",
|
||||
""
|
||||
}
|
||||
};
|
||||
|
||||
DLLSYM INT32 bn[NUM_BLOCK_ATTR] = {
|
||||
4, //G_MONOCHROME
|
||||
2, //I_MONOCHROME
|
||||
2, //I_SMOOTH
|
||||
3, //R_SINGLE
|
||||
3, //R_BLACK
|
||||
3, //S_BLACK
|
||||
2 //W_TEXT
|
||||
};
|
||||
|
||||
DLLSYM INT32 tvar[NUM_TEXT_ATTR];
|
||||
DLLSYM INT32 bvar[NUM_BLOCK_ATTR];
|
||||
DLLSYM char blabel[NUM_BLOCK_ATTR][4][MAXLENGTH] = { {
|
||||
//G_MONOCHROME
|
||||
|
||||
/****************************************************************************
|
||||
* GRAPHICS
|
||||
***************************************************************************/
|
||||
"Monochrome ",
|
||||
"Two colour ",
|
||||
"Spot colour",
|
||||
"Multicolour"
|
||||
},
|
||||
|
||||
/****************************************************************************
|
||||
* IMAGE
|
||||
***************************************************************************/
|
||||
{ //I_MONOCHROME
|
||||
"Monochrome ",
|
||||
"Colour ",
|
||||
"",
|
||||
""
|
||||
},
|
||||
{ //I_SMOOTH
|
||||
"Smooth ",
|
||||
"Grainy ",
|
||||
"",
|
||||
""
|
||||
},
|
||||
|
||||
/****************************************************************************
|
||||
* RULES
|
||||
***************************************************************************/
|
||||
{ //R_SINGLE
|
||||
"Single ",
|
||||
"Double ",
|
||||
"Multiple",
|
||||
""
|
||||
},
|
||||
{ //R_BLACK
|
||||
"Black ",
|
||||
"White ",
|
||||
"Coloured",
|
||||
""
|
||||
},
|
||||
|
||||
/****************************************************************************
|
||||
* SCRIBBLE
|
||||
***************************************************************************/
|
||||
{ //S_BLACK
|
||||
"Black ",
|
||||
"White ",
|
||||
"Coloured",
|
||||
""
|
||||
},
|
||||
/****************************************************************************
|
||||
* WEIRD
|
||||
***************************************************************************/
|
||||
{ //W_TEXT
|
||||
"No text ",
|
||||
"Contains text",
|
||||
"",
|
||||
""
|
||||
}
|
||||
};
|
||||
|
||||
DLLSYM char backlabel[NUM_BACKGROUNDS][MAXLENGTH] = {
|
||||
"White", //B_WHITE
|
||||
"Black", //B_BLACK
|
||||
"Coloured", //B_COLOURED
|
||||
"Textured", //B_TEXTURED
|
||||
"Patterned", //B_PATTERNED
|
||||
"Gradient fill", //B_GRADIENTFILL
|
||||
"Image", //B_IMAGE
|
||||
"Text" //B_TEXT
|
||||
};
|
38
ccstruct/labls.h
Normal file
@ -0,0 +1,38 @@
|
||||
/**********************************************************************
|
||||
* File: labls.h (Formerly labels.h)
|
||||
* Description: Attribute definition tables
|
||||
* Author: Sheelagh Lloyd?
|
||||
* Created:
|
||||
*
|
||||
* (C) Copyright 1993, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
#ifndef LABLS_H
|
||||
#define LABLS_H
|
||||
|
||||
#include "host.h"
|
||||
#include "hpdsizes.h"
|
||||
|
||||
#include "hpddef.h" //must be last (handpd.dll)
|
||||
|
||||
extern DLLSYM INT32 tn[NUM_TEXT_ATTR];
|
||||
|
||||
extern DLLSYM char tlabel[NUM_TEXT_ATTR][4][MAXLENGTH];
|
||||
|
||||
extern DLLSYM INT32 bn[NUM_BLOCK_ATTR];
|
||||
|
||||
extern DLLSYM INT32 tvar[NUM_TEXT_ATTR];
|
||||
extern DLLSYM INT32 bvar[NUM_BLOCK_ATTR];
|
||||
extern DLLSYM char blabel[NUM_BLOCK_ATTR][4][MAXLENGTH];
|
||||
|
||||
extern DLLSYM char backlabel[NUM_BACKGROUNDS][MAXLENGTH];
|
||||
#endif
|
249
ccstruct/linlsq.cpp
Normal file
@ -0,0 +1,249 @@
|
||||
/**********************************************************************
|
||||
* File: linlsq.cpp (Formerly llsq.c)
|
||||
* Description: Linear Least squares fitting code.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Sep 12 08:44:51 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdio.h>
|
||||
#include <math.h>
|
||||
#include "errcode.h"
|
||||
#include "linlsq.h"
|
||||
|
||||
#ifndef __UNIX__
|
||||
#define M_PI 3.14159265359
|
||||
#endif
|
||||
|
||||
const ERRCODE EMPTY_LLSQ = "Can't delete from an empty LLSQ";
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN double_VAR (pdlsq_posdir_ratio, 4e-6, "Mult of dir to cf pos");
|
||||
EXTERN double_VAR (pdlsq_threshold_angleavg, 0.1666666,
|
||||
"Frac of pi for simple fit");
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::clear
|
||||
*
|
||||
* Function to initialize a LLSQ.
|
||||
**********************************************************************/
|
||||
|
||||
void LLSQ::clear() { //initialize
|
||||
n = 0; //no elements
|
||||
sigx = 0; //zero accumulators
|
||||
sigy = 0;
|
||||
sigxx = 0;
|
||||
sigxy = 0;
|
||||
sigyy = 0;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::add
|
||||
*
|
||||
* Add an element to the accumulator.
|
||||
**********************************************************************/
|
||||
|
||||
void LLSQ::add( //add an element
|
||||
double x, //xcoord
|
||||
double y //ycoord
|
||||
) {
|
||||
n++; //count elements
|
||||
sigx += x; //update accumulators
|
||||
sigy += y;
|
||||
sigxx += x * x;
|
||||
sigxy += x * y;
|
||||
sigyy += y * y;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::remove
|
||||
*
|
||||
* Delete an element from the accumulator.
|
||||
**********************************************************************/
|
||||
|
||||
void LLSQ::remove( //delete an element
|
||||
double x, //xcoord
|
||||
double y //ycoord
|
||||
) {
|
||||
if (n <= 0)
|
||||
//illegal
|
||||
EMPTY_LLSQ.error ("LLSQ::remove", ABORT, NULL);
|
||||
n--; //count elements
|
||||
sigx -= x; //update accumulators
|
||||
sigy -= y;
|
||||
sigxx -= x * x;
|
||||
sigxy -= x * y;
|
||||
sigyy -= y * y;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::m
|
||||
*
|
||||
* Return the gradient of the line fit.
|
||||
**********************************************************************/
|
||||
|
||||
double LLSQ::m() { //get gradient
|
||||
if (n > 1)
|
||||
return (sigxy - sigx * sigy / n) / (sigxx - sigx * sigx / n);
|
||||
else
|
||||
return 0; //too little
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::c
|
||||
*
|
||||
* Return the constant of the line fit.
|
||||
**********************************************************************/
|
||||
|
||||
double LLSQ::c( //get constant
|
||||
double m //gradient to fit with
|
||||
) {
|
||||
if (n > 0)
|
||||
return (sigy - m * sigx) / n;
|
||||
else
|
||||
return 0; //too little
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::rms
|
||||
*
|
||||
* Return the rms error of the fit.
|
||||
**********************************************************************/
|
||||
|
||||
double LLSQ::rms( //get error
|
||||
double m, //gradient to fit with
|
||||
double c //constant to fit with
|
||||
) {
|
||||
double error; //total error
|
||||
|
||||
if (n > 0) {
|
||||
error =
|
||||
sigyy + m * (m * sigxx + 2 * (c * sigx - sigxy)) + c * (n * c -
|
||||
2 * sigy);
|
||||
if (error >= 0)
|
||||
error = sqrt (error / n); //sqrt of mean
|
||||
else
|
||||
error = 0;
|
||||
}
|
||||
else
|
||||
error = 0; //too little
|
||||
return error;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LLSQ::spearman
|
||||
*
|
||||
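* Return the correlation coefficient.
* Note: despite the name, the formula used is the Pearson product-moment
* coefficient, not Spearman's rank correlation.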
|
||||
**********************************************************************/
|
||||
|
||||
double LLSQ::spearman() { //get correlation
|
||||
double error; //total error
|
||||
|
||||
if (n > 1) {
|
||||
error = (sigxx - sigx * sigx / n) * (sigyy - sigy * sigy / n);
|
||||
if (error > 0) {
|
||||
error = (sigxy - sigx * sigy / n) / sqrt (error);
|
||||
}
|
||||
else
|
||||
error = 1;
|
||||
}
|
||||
else
|
||||
error = 1; //too little
|
||||
return error;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PDLSQ::fit
|
||||
*
|
||||
* Return all the parameters of the fit to pos/dir.
|
||||
* The return value is the rms error.
|
||||
**********************************************************************/
|
||||
|
||||
float PDLSQ::fit( //get fit
|
||||
DIR128 &ang, //output angle
|
||||
float &sin_ang, //r,theta parameterisation
|
||||
float &cos_ang,
|
||||
float &r) {
|
||||
double a, b; //intermediates
|
||||
double angle; //resulting angle
|
||||
double avg_angle; //simple average
|
||||
double error; //total error
|
||||
double sinx, cosx; //return values
|
||||
|
||||
if (pos.n > 0) {
|
||||
a = pos.sigxy - pos.sigx * pos.sigy / pos.n
|
||||
+ pdlsq_posdir_ratio * dir.sigxy;
|
||||
b =
|
||||
pos.sigxx - pos.sigyy + (pos.sigy * pos.sigy -
|
||||
pos.sigx * pos.sigx) / pos.n +
|
||||
pdlsq_posdir_ratio * (dir.sigxx - dir.sigyy);
|
||||
if (dir.sigy != 0 || dir.sigx != 0)
|
||||
avg_angle = atan2 (dir.sigy, dir.sigx);
|
||||
else
|
||||
avg_angle = 0;
|
||||
if ((a != 0 || b != 0) && pos.n > 1)
|
||||
angle = atan2 (2 * a, b) / 2;
|
||||
else
|
||||
angle = avg_angle;
|
||||
error = avg_angle - angle;
|
||||
if (error > M_PI / 2) {
|
||||
error -= M_PI;
|
||||
angle += M_PI;
|
||||
}
|
||||
if (error < -M_PI / 2) {
|
||||
error += M_PI;
|
||||
angle -= M_PI;
|
||||
}
|
||||
if (error > M_PI * pdlsq_threshold_angleavg
|
||||
|| error < -M_PI * pdlsq_threshold_angleavg)
|
||||
angle = avg_angle; //go simple
|
||||
//convert direction
|
||||
ang = (INT16) (angle * MODULUS / (2 * M_PI));
|
||||
sinx = sin (angle);
|
||||
cosx = cos (angle);
|
||||
r = (sinx * pos.sigx - cosx * pos.sigy) / pos.n;
|
||||
// tprintf("x=%g, y=%g, xx=%g, xy=%g, yy=%g, a=%g, b=%g, ang=%g, r=%g\n",
|
||||
// pos.sigx,pos.sigy,pos.sigxx,pos.sigxy,pos.sigyy,
|
||||
// a,b,angle,r);
|
||||
error = dir.sigxx * sinx * sinx + dir.sigyy * cosx * cosx
|
||||
- 2 * dir.sigxy * sinx * cosx;
|
||||
error *= pdlsq_posdir_ratio;
|
||||
error += sinx * sinx * pos.sigxx + cosx * cosx * pos.sigyy
|
||||
- 2 * sinx * cosx * pos.sigxy
|
||||
- 2 * r * (sinx * pos.sigx - cosx * pos.sigy) + r * r * pos.n;
|
||||
if (error >= 0)
|
||||
//rms value
|
||||
error = sqrt (error / pos.n);
|
||||
else
|
||||
error = 0; //-0
|
||||
sin_ang = sinx;
|
||||
cos_ang = cosx;
|
||||
}
|
||||
else {
|
||||
sin_ang = 0.0f;
|
||||
cos_ang = 0.0f;
|
||||
ang = 0;
|
||||
error = 0; //too little
|
||||
}
|
||||
return error;
|
||||
}
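LLSQ fits y = m x + c without storing the points: it keeps only n and the sums of x, y, x*x, x*y and y*y, so adding or removing a point is a constant-time update. The following is a minimal sketch of that idea, assuming a hypothetical struct `RunningLineFit` with methods `gradient`, `intercept` and `rms`; these names are illustrative and are not the Tesseract API. Unlike LLSQ::m above, the sketch also returns 0 when all x values are equal, which is an added guard rather than the original behaviour.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for LLSQ: running sums only, no stored points.
struct RunningLineFit {
  long n = 0;
  double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0, syy = 0.0;

  void add(double x, double y) {
    ++n;
    sx += x; sy += y;
    sxx += x * x; sxy += x * y; syy += y * y;
  }

  // Exact inverse of add(); the caller must only remove points it added.
  void remove(double x, double y) {
    assert(n > 0);
    --n;
    sx -= x; sy -= y;
    sxx -= x * x; sxy -= x * y; syy -= y * y;
  }

  // m = (Sxy - Sx*Sy/n) / (Sxx - Sx*Sx/n), or 0 if the fit is undetermined.
  double gradient() const {
    if (n < 2) return 0.0;
    const double denom = sxx - sx * sx / n;
    return denom != 0.0 ? (sxy - sx * sy / n) / denom : 0.0;
  }

  // c = (Sy - m*Sx) / n for a given gradient m.
  double intercept(double m) const {
    return n > 0 ? (sy - m * sx) / n : 0.0;
  }

  // Root mean square of (y - m*x - c), expanded in terms of the sums.
  double rms(double m, double c) const {
    if (n <= 0) return 0.0;
    const double sq = syy + m * (m * sxx + 2.0 * (c * sx - sxy)) +
                      c * (n * c - 2.0 * sy);
    return sq > 0.0 ? std::sqrt(sq / n) : 0.0;  // clamp rounding noise
  }
};
```

For the three points (0,1), (1,3) and (2,5), which lie exactly on y = 2x + 1, the sketch yields a gradient of 2, an intercept of 1 and an rms error of 0. Adding and then removing an outlier restores the same fit, which is the property the remove operation exists to provide.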
|
102
ccstruct/linlsq.h
Normal file
@ -0,0 +1,102 @@
|
||||
/**********************************************************************
|
||||
* File: linlsq.h (Formerly llsq.h)
|
||||
* Description: Linear Least squares fitting code.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Sep 12 08:44:51 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef LINLSQ_H
|
||||
#define LINLSQ_H
|
||||
|
||||
#include "points.h"
|
||||
#include "mod128.h"
|
||||
#include "varable.h"
|
||||
|
||||
class LLSQ
|
||||
{
|
||||
friend class PDLSQ; //pos & direction
|
||||
|
||||
public:
|
||||
LLSQ() { //constructor
|
||||
clear(); //set to zeros
|
||||
}
|
||||
void clear(); //initialize
|
||||
|
||||
void add( //add element
|
||||
double x, //coords to add
|
||||
double y);
|
||||
void remove( //delete element
|
||||
double x, //coords to delete
|
||||
double y);
|
||||
INT32 count() { //no of elements
|
||||
return n;
|
||||
}
|
||||
|
||||
double m(); //get gradient
|
||||
double c( //get constant
|
||||
double m); //gradient
|
||||
double rms( //get error
|
||||
double m, //gradient
|
||||
double c); //constant
|
||||
double spearman(); //get correlation
|
||||
|
||||
private:
|
||||
INT32 n; //no of elements
|
||||
double sigx; //sum of x
|
||||
double sigy; //sum of y
|
||||
double sigxx; //sum x squared
|
||||
double sigxy; //sum of xy
|
||||
double sigyy; //sum y squared
|
||||
};
|
||||
|
||||
class PDLSQ
|
||||
{
|
||||
public:
|
||||
PDLSQ() { //constructor
|
||||
clear(); //set to zeros
|
||||
}
|
||||
void clear() { //initialize
|
||||
pos.clear (); //clear both
|
||||
dir.clear ();
|
||||
}
|
||||
|
||||
void add( //add element
|
||||
const ICOORD &addpos, //position of pt
|
||||
const ICOORD &adddir) { //dir of pt
|
||||
pos.add (addpos.x (), addpos.y ());
|
||||
dir.add (adddir.x (), adddir.y ());
|
||||
}
|
||||
void remove( //remove element
|
||||
const ICOORD &removepos, //position of pt
|
||||
const ICOORD &removedir) { //dir of pt
|
||||
pos.remove (removepos.x (), removepos.y ());
|
||||
dir.remove (removedir.x (), removedir.y ());
|
||||
}
|
||||
INT32 count() { //no of elements
|
||||
return pos.count ();
|
||||
}
|
||||
|
||||
float fit( //get fit parameters
|
||||
DIR128 &ang, //output angle
|
||||
float &sin_ang, //output components
|
||||
float &cos_ang,
|
||||
float &r);
|
||||
|
||||
private:
|
||||
LLSQ pos; //position
|
||||
LLSQ dir; //directions
|
||||
};
|
||||
extern double_VAR_H (pdlsq_posdir_ratio, 4e-6, "Mult of dir to cf pos");
|
||||
#endif
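PDLSQ::fit returns the line in (r, theta) form. Ignoring the direction term weighted by pdlsq_posdir_ratio and the fallback to the average direction, the angle it computes from the position sums is atan2(2a, b) / 2 with a = Sxy - Sx*Sy/n and b = Sxx - Syy + (Sy*Sy - Sx*Sx)/n, which is the orientation of the principal axis of the points. The following is a minimal sketch of that position-only part, assuming hypothetical helpers `principal_axis_angle` and `line_offset` that take a plain vector of (x, y) pairs; they are illustrative and not part of Tesseract.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

using Points = std::vector<std::pair<double, double>>;

// Hypothetical helper: angle (radians, from the x axis) of the principal
// axis of pts, using the same a and b terms as the position part of
// PDLSQ::fit. Returns 0 when there is no preferred direction.
inline double principal_axis_angle(const Points &pts) {
  if (pts.size() < 2) return 0.0;
  const double n = static_cast<double>(pts.size());
  double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0, syy = 0.0;
  for (const auto &p : pts) {
    sx += p.first;
    sy += p.second;
    sxx += p.first * p.first;
    sxy += p.first * p.second;
    syy += p.second * p.second;
  }
  const double a = sxy - sx * sy / n;                    // n * cov(x, y)
  const double b = sxx - syy + (sy * sy - sx * sx) / n;  // n * (var x - var y)
  if (a == 0.0 && b == 0.0) return 0.0;
  return std::atan2(2.0 * a, b) / 2.0;
}

// Hypothetical helper: the offset r = (sin * Sx - cos * Sy) / n, using the
// same sign convention as PDLSQ::fit.
inline double line_offset(const Points &pts, double angle) {
  if (pts.empty()) return 0.0;
  double sx = 0.0, sy = 0.0;
  for (const auto &p : pts) {
    sx += p.first;
    sy += p.second;
  }
  return (std::sin(angle) * sx - std::cos(angle) * sy) /
         static_cast<double>(pts.size());
}
```

Points on the diagonal y = x give an angle of pi/4, and points on the horizontal line y = 2 give an angle of 0 with r = -2 under this sign convention.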
|
453
ccstruct/lmedsq.cpp
Normal file
@ -0,0 +1,453 @@
|
||||
/**********************************************************************
|
||||
* File: lmedsq.cpp (Formerly lms.c)
|
||||
* Description: Code for the LMS class.
|
||||
* Author: Ray Smith
|
||||
* Created: Fri Aug 7 09:30:53 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdlib.h>
|
||||
#include "statistc.h"
|
||||
#include "memry.h"
|
||||
#include "lmedsq.h"
|
||||
|
||||
#define EXTERN
|
||||
|
||||
EXTERN INT_VAR (lms_line_trials, 12, "Number of line fits to do");
|
||||
#define SEED1 0x1234 //default seeds
|
||||
#define SEED2 0x5678
|
||||
#define SEED3 0x9abc
|
||||
#define LMS_MAX_FAILURES 3
|
||||
|
||||
#ifndef __UNIX__
|
||||
UINT32 nrand48( //get random number
|
||||
UINT16 *seeds //seeds to use
|
||||
) {
|
||||
static UINT32 seed = 0; //only seed
|
||||
|
||||
if (seed == 0) {
|
||||
seed = seeds[0] ^ (seeds[1] << 8) ^ (seeds[2] << 16);
|
||||
srand(seed);
|
||||
}
|
||||
//make 32 bit one
|
||||
return rand () | (rand () << 16);
|
||||
}
|
||||
#endif
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::LMS
|
||||
*
|
||||
* Construct an LMS object, given the maximum number of samples it will hold.
|
||||
**********************************************************************/
|
||||
|
||||
LMS::LMS ( //constructor
|
||||
INT32 size //samplesize
|
||||
):samplesize (size) {
|
||||
samplecount = 0;
|
||||
a = 0;
|
||||
m = 0.0f;
|
||||
c = 0.0f;
|
||||
samples = (FCOORD *) alloc_mem (size * sizeof (FCOORD));
|
||||
errors = (float *) alloc_mem (size * sizeof (float));
|
||||
line_error = 0.0f;
|
||||
fitted = FALSE;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::~LMS
|
||||
*
|
||||
* Destroy an LMS object.
|
||||
**********************************************************************/
|
||||
|
||||
LMS::~LMS ( //destructor
|
||||
) {
|
||||
free_mem(samples);
|
||||
free_mem(errors);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::clear
|
||||
*
|
||||
* Clear samples from array.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::clear() { //clear sample
|
||||
samplecount = 0;
|
||||
fitted = FALSE;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::add
|
||||
*
|
||||
* Add another sample. More than the constructed number will be ignored.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::add( //add sample
|
||||
FCOORD sample //sample coords
|
||||
) {
|
||||
if (samplecount < samplesize)
|
||||
//save it
|
||||
samples[samplecount++] = sample;
|
||||
fitted = FALSE;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::fit
|
||||
*
|
||||
* Fit a line to the given sample points.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::fit( //fit sample
|
||||
float &out_m, //output line
|
||||
float &out_c) {
|
||||
INT32 index; //of median
|
||||
INT32 trials; //no of medians
|
||||
float test_m, test_c; //candidate line
|
||||
float test_error; //error of test line
|
||||
|
||||
switch (samplecount) {
|
||||
case 0:
|
||||
m = 0.0f; //no info
|
||||
c = 0.0f;
|
||||
line_error = 0.0f;
|
||||
break;
|
||||
|
||||
case 1:
|
||||
m = 0.0f;
|
||||
c = samples[0].y (); //horiz thru pt
|
||||
line_error = 0.0f;
|
||||
break;
|
||||
|
||||
case 2:
|
||||
if (samples[0].x () != samples[1].x ()) {
|
||||
m = (samples[1].y () - samples[0].y ())
|
||||
/ (samples[1].x () - samples[0].x ());
|
||||
c = samples[0].y () - m * samples[0].x ();
|
||||
}
|
||||
else {
|
||||
m = 0.0f;
|
||||
c = (samples[0].y () + samples[1].y ()) / 2;
|
||||
}
|
||||
line_error = 0.0f;
|
||||
break;
|
||||
|
||||
default:
|
||||
pick_line(m, c); //use pts at random
|
||||
compute_errors(m, c); //from given line
|
||||
index = choose_nth_item (samplecount / 2, errors, samplecount);
|
||||
line_error = errors[index];
|
||||
for (trials = 1; trials < lms_line_trials; trials++) {
|
||||
//random again
|
||||
pick_line(test_m, test_c);
|
||||
compute_errors(test_m, test_c);
|
||||
index = choose_nth_item (samplecount / 2, errors, samplecount);
|
||||
test_error = errors[index];
|
||||
if (test_error < line_error) {
|
||||
//find least median
|
||||
line_error = test_error;
|
||||
m = test_m;
|
||||
c = test_c;
|
||||
}
|
||||
}
|
||||
}
|
||||
fitted = TRUE;
|
||||
out_m = m;
|
||||
out_c = c;
|
||||
a = 0;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::fit_quadratic
|
||||
*
|
||||
* Fit a quadratic to the given sample points.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::fit_quadratic( //fit sample
|
||||
float outlier_threshold, //min outlier size
|
||||
double &out_a, //x squared
|
||||
float &out_b, //output line
|
||||
float &out_c) {
|
||||
INT32 trials; //no of medians
|
||||
double test_a;
|
||||
float test_b, test_c; //candidate line
|
||||
float test_error; //error of test line
|
||||
|
||||
if (samplecount < 3) {
|
||||
out_a = 0;
|
||||
fit(out_b, out_c);
|
||||
return;
|
||||
}
|
||||
pick_quadratic(a, m, c);
|
||||
line_error = compute_quadratic_errors (outlier_threshold, a, m, c);
|
||||
for (trials = 1; trials < lms_line_trials * 2; trials++) {
|
||||
pick_quadratic(test_a, test_b, test_c);
|
||||
test_error = compute_quadratic_errors (outlier_threshold,
|
||||
test_a, test_b, test_c);
|
||||
if (test_error < line_error) {
|
||||
line_error = test_error; //find least median
|
||||
a = test_a;
|
||||
m = test_b;
|
||||
c = test_c;
|
||||
}
|
||||
}
|
||||
fitted = TRUE;
|
||||
out_a = a;
|
||||
out_b = m;
|
||||
out_c = c;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::constrained_fit
|
||||
*
|
||||
* Fit a line to the given sample points.
|
||||
* The line must have the given gradient.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::constrained_fit( //fit sample
|
||||
float fixed_m, //forced gradient
|
||||
float &out_c) {
|
||||
INT32 index; //of median
|
||||
INT32 trials; //no of medians
|
||||
float test_c; //candidate line
|
||||
static UINT16 seeds[3] = { SEED1, SEED2, SEED3 };
|
||||
//for nrand
|
||||
float test_error; //error of test line
|
||||
|
||||
m = fixed_m;
|
||||
switch (samplecount) {
|
||||
case 0:
|
||||
c = 0.0f;
|
||||
line_error = 0.0f;
|
||||
break;
|
||||
|
||||
case 1:
|
||||
//line thru pt
|
||||
c = samples[0].y () - m * samples[0].x ();
|
||||
line_error = 0.0f;
|
||||
break;
|
||||
|
||||
case 2:
|
||||
c = (samples[0].y () + samples[1].y ()
|
||||
- m * (samples[0].x () + samples[1].x ())) / 2;
|
||||
line_error = m * samples[0].x () + c - samples[0].y ();
|
||||
line_error *= line_error;
|
||||
break;
|
||||
|
||||
default:
|
||||
index = (INT32) nrand48 (seeds) % samplecount;
|
||||
//compute line
|
||||
c = samples[index].y () - m * samples[index].x ();
|
||||
compute_errors(m, c); //from given line
|
||||
index = choose_nth_item (samplecount / 2, errors, samplecount);
|
||||
line_error = errors[index];
|
||||
for (trials = 1; trials < lms_line_trials; trials++) {
|
||||
index = (INT32) nrand48 (seeds) % samplecount;
|
||||
test_c = samples[index].y () - m * samples[index].x ();
|
||||
//compute line
|
||||
compute_errors(m, test_c);
|
||||
index = choose_nth_item (samplecount / 2, errors, samplecount);
|
||||
test_error = errors[index];
|
||||
if (test_error < line_error) {
|
||||
//find least median
|
||||
line_error = test_error;
|
||||
c = test_c;
|
||||
}
|
||||
}
|
||||
}
|
||||
fitted = TRUE;
|
||||
out_c = c;
|
||||
a = 0;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::pick_line
|
||||
*
|
||||
* Fit a line to a random pair of sample points.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::pick_line( //fit sample
|
||||
float &line_m, //output gradient
|
||||
float &line_c) {
|
||||
INT16 trial_count; //no of attempts
|
||||
static UINT16 seeds[3] = { SEED1, SEED2, SEED3 };
|
||||
//for nrand
|
||||
INT32 index1; //picked point
|
||||
INT32 index2; //picked point
|
||||
|
||||
trial_count = 0;
|
||||
do {
|
||||
index1 = (INT32) nrand48 (seeds) % samplecount;
|
||||
index2 = (INT32) nrand48 (seeds) % samplecount;
|
||||
line_m = samples[index2].x () - samples[index1].x ();
|
||||
trial_count++;
|
||||
}
|
||||
while (line_m == 0 && trial_count < LMS_MAX_FAILURES);
|
||||
if (line_m == 0) {
|
||||
line_c = (samples[index2].y () + samples[index1].y ()) / 2;
|
||||
}
|
||||
else {
|
||||
line_m = (samples[index2].y () - samples[index1].y ()) / line_m;
|
||||
line_c = samples[index1].y () - samples[index1].x () * line_m;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::pick_quadratic
|
||||
*
|
||||
* Fit a quadratic to a random triplet of sample points.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::pick_quadratic( //fit sample
|
||||
double &line_a, //x squared
|
||||
float &line_m, //output gradient
|
||||
float &line_c) {
|
||||
INT16 trial_count; //no of attempts
|
||||
static UINT16 seeds[3] = { SEED1, SEED2, SEED3 };
|
||||
//for nrand
|
||||
INT32 index1; //picked point
|
||||
INT32 index2; //picked point
|
||||
INT32 index3;
|
||||
FCOORD x1x2; //vector
|
||||
FCOORD x1x3;
|
||||
FCOORD x3x2;
|
||||
double bottom; //of a
|
||||
|
||||
trial_count = 0;
|
||||
do {
|
||||
if (trial_count >= LMS_MAX_FAILURES - 1) {
|
||||
index1 = 0;
|
||||
index2 = samplecount / 2;
|
||||
index3 = samplecount - 1;
|
||||
}
|
||||
else {
|
||||
index1 = (INT32) nrand48 (seeds) % samplecount;
|
||||
index2 = (INT32) nrand48 (seeds) % samplecount;
|
||||
index3 = (INT32) nrand48 (seeds) % samplecount;
|
||||
}
|
||||
x1x2 = samples[index2] - samples[index1];
|
||||
x1x3 = samples[index3] - samples[index1];
|
||||
x3x2 = samples[index2] - samples[index3];
|
||||
bottom = x1x2.x () * x1x3.x () * x3x2.x ();
|
||||
trial_count++;
|
||||
}
|
||||
while (bottom == 0 && trial_count < LMS_MAX_FAILURES);
|
||||
if (bottom == 0) {
|
||||
line_a = 0;
|
||||
pick_line(line_m, line_c);
|
||||
}
|
||||
else {
|
||||
line_a = x1x3 * x1x2 / bottom;
|
||||
line_m = x1x2.y () - line_a * x1x2.x ()
|
||||
* (samples[index2].x () + samples[index1].x ());
|
||||
line_m /= x1x2.x ();
|
||||
line_c = samples[index1].y () - samples[index1].x ()
|
||||
* (samples[index1].x () * line_a + line_m);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::compute_errors
|
||||
*
|
||||
* Compute the squared error from all the points.
|
||||
**********************************************************************/
|
||||
|
||||
void LMS::compute_errors( //fit sample
|
||||
float line_m, //input gradient
|
||||
float line_c) {
|
||||
INT32 index; //picked point
|
||||
|
||||
for (index = 0; index < samplecount; index++) {
|
||||
errors[index] =
|
||||
line_m * samples[index].x () + line_c - samples[index].y ();
|
||||
errors[index] *= errors[index];
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::compute_quadratic_errors
|
||||
*
|
||||
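* Compute the squared errors of all the points from the quadratic. Return the mean inlier error, or the median outlier error if outliers are too common.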
|
||||
**********************************************************************/
|
||||
|
||||
float LMS::compute_quadratic_errors( //fit sample
|
||||
float outlier_threshold, //min outlier
|
||||
double line_a,
|
||||
float line_m, //input gradient
|
||||
float line_c) {
|
||||
INT32 outlier_count; //total outliers
|
||||
INT32 index; //picked point
|
||||
INT32 error_count; //no in total
|
||||
double total_error; //summed squares
|
||||
|
||||
total_error = 0;
|
||||
outlier_count = 0;
|
||||
error_count = 0;
|
||||
for (index = 0; index < samplecount; index++) {
|
||||
errors[error_count] = line_c + samples[index].x ()
|
||||
* (line_m + samples[index].x () * line_a) - samples[index].y ();
|
||||
errors[error_count] *= errors[error_count];
|
||||
if (errors[error_count] > outlier_threshold) {
|
||||
outlier_count++;
|
||||
errors[samplecount - outlier_count] = errors[error_count];
|
||||
}
|
||||
else {
|
||||
total_error += errors[error_count++];
|
||||
}
|
||||
}
|
||||
if (outlier_count * 3 < error_count)
|
||||
return total_error / error_count;
|
||||
else {
|
||||
index = choose_nth_item (outlier_count / 2,
|
||||
errors + samplecount - outlier_count,
|
||||
outlier_count);
|
||||
//median outlier
|
||||
return errors[samplecount - outlier_count + index];
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* LMS::plot
|
||||
*
|
||||
* Plot the fitted line of a LMS.
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void LMS::plot( //plot fit
|
||||
WINDOW win, //window
|
||||
COLOUR colour //colour to draw in
|
||||
) {
|
||||
if (fitted) {
|
||||
line_color_index(win, colour);
|
||||
move2d (win, samples[0].x (),
|
||||
c + samples[0].x () * (m + samples[0].x () * a));
|
||||
draw2d (win, samples[samplecount - 1].x (),
|
||||
c + samples[samplecount - 1].x () * (m +
|
||||
samples[samplecount -
|
||||
1].x () * a));
|
||||
}
|
||||
}
|
||||
#endif
|
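The fitting loop in LMS::fit can be summarised outside the class. The following is a minimal sketch, not the Tesseract implementation: `Point`, `Line`, `median_sq_error` and `lmeds_fit` are hypothetical names, `std::mt19937` stands in for `nrand48`, and `std::nth_element` stands in for `choose_nth_item`. It also simplifies `pick_line` by not retrying when the two picked points share an x coordinate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct Point { float x, y; };          // hypothetical stand-in for FCOORD
struct Line { float m, c, error; };    // gradient, intercept, median squared error

// Median of the squared vertical residuals of pts against y = m*x + c.
static float median_sq_error(const std::vector<Point>& pts, float m, float c) {
  std::vector<float> errs;
  errs.reserve(pts.size());
  for (const Point& p : pts) {
    const float e = m * p.x + c - p.y;
    errs.push_back(e * e);
  }
  // Element size / 2 in sorted order, the position the source asks
  // choose_nth_item for.
  const std::vector<float>::iterator mid = errs.begin() + errs.size() / 2;
  std::nth_element(errs.begin(), mid, errs.end());
  return *mid;
}

// Least-median-of-squares line fit: try `trials` lines through random
// pairs of points and keep the one with the smallest median squared error.
// Assumes at least three points and at least one trial.
Line lmeds_fit(const std::vector<Point>& pts, int trials, unsigned seed) {
  assert(pts.size() >= 3 && trials >= 1);
  std::mt19937 rng(seed);
  Line best = {0.0f, 0.0f, 0.0f};
  for (int t = 0; t < trials; ++t) {
    const Point& p1 = pts[rng() % pts.size()];
    const Point& p2 = pts[rng() % pts.size()];
    const float dx = p2.x - p1.x;
    // Vertical pair: fall back to a horizontal line through the midpoint.
    float m = 0.0f;
    float c = (p1.y + p2.y) / 2;
    if (dx != 0.0f) {
      m = (p2.y - p1.y) / dx;
      c = p1.y - p1.x * m;
    }
    const float err = median_sq_error(pts, m, c);
    if (t == 0 || err < best.error) {
      best.m = m;
      best.c = c;
      best.error = err;
    }
  }
  return best;
}
```

Because only the median residual is scored, up to about half of the samples can be gross outliers without pulling the line away from the rest, which is presumably why the source prefers this to a least-squares fit.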
84
ccstruct/lmedsq.h
Normal file
@ -0,0 +1,84 @@
|
||||
/**********************************************************************
|
||||
* File: lmedsq.h (Formerly lms.h)
|
||||
* Description: Code for the LMS class.
|
||||
* Author: Ray Smith
|
||||
* Created: Fri Aug 7 09:30:53 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef LMEDSQ_H
|
||||
#define LMEDSQ_H
|
||||
|
||||
#include "points.h"
|
||||
#include "varable.h"
|
||||
#include "grphics.h"
|
||||
#include "notdll.h"
|
||||
|
||||
class LMS
|
||||
{
|
||||
public:
|
||||
LMS( //constructor
|
||||
INT32 size); //no of samples
|
||||
~LMS (); //destructor
|
||||
void clear(); //clear samples
|
||||
void add( //add sample
|
||||
FCOORD sample); //sample coords
|
||||
void fit( //generate fit
|
||||
float &m, //output line
|
||||
float &c);
|
||||
void constrained_fit( //fixed gradient
|
||||
float fixed_m, //forced gradient
|
||||
float &out_c); //output line
|
||||
void fit_quadratic( //easy quadratic
|
||||
float outlier_threshold, //min outlier
|
||||
double &a, //x squared
|
||||
float &b, //x
|
||||
float &c); //constant
|
||||
void plot( //plot fit
|
||||
WINDOW win, //window
|
||||
COLOUR colour); //colour to draw in
|
||||
float error() { //get error
|
||||
return fitted ? line_error : -1;
|
||||
}
|
||||
|
||||
private:
|
||||
|
||||
void pick_line( //random choice
|
||||
float &m, //output line
|
||||
float &c);
|
||||
void pick_quadratic( //random choice
|
||||
double &a, //output curve
|
||||
float &b,
|
||||
float &c);
|
||||
void compute_errors( //find errors
|
||||
float m, //from line
|
||||
float c);
|
||||
//find errors
|
||||
float compute_quadratic_errors(float outlier_threshold, //min outlier
|
||||
double a, //from curve
|
||||
float m,
|
||||
float c);
|
||||
|
||||
BOOL8 fitted; //line parts valid
|
||||
INT32 samplesize; //max samples
|
||||
INT32 samplecount; //current sample size
|
||||
FCOORD *samples; //array of samples
|
||||
float *errors; //error distances
|
||||
double a; //x squared
|
||||
float m; //line gradient
|
||||
float c;
|
||||
float line_error; //error of fit
|
||||
};
|
||||
extern INT_VAR_H (lms_line_trials, 12, "Number of line fits to do");
|
||||
#endif
|
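The arithmetic in LMS::pick_quadratic is dense, so here is a minimal sketch of the same calculation on plain doubles. The names `Pt`, `Quad` and `quad_through` are hypothetical, and the sketch assumes that `operator*` on two FCOORDs in the source is the 2D cross product (that definition lives in points.h, which is not shown here). Where the source falls back to pick_line when two x coordinates coincide, this sketch simply asserts that they are distinct.

```cpp
#include <cassert>

struct Pt { double x, y; };        // hypothetical stand-in for FCOORD
struct Quad { double a, b, c; };   // y = a*x*x + b*x + c

// Exact quadratic through three points with pairwise distinct x.
Quad quad_through(Pt p1, Pt p2, Pt p3) {
  const double dx12 = p2.x - p1.x, dy12 = p2.y - p1.y;  // x1x2 in the source
  const double dx13 = p3.x - p1.x, dy13 = p3.y - p1.y;  // x1x3 in the source
  const double dx32 = p2.x - p3.x;                      // x3x2 in the source
  const double bottom = dx12 * dx13 * dx32;
  assert(bottom != 0.0);  // the source calls pick_line in this case
  Quad q;
  // Cross product of x1x3 and x1x2, divided by the product of x gaps.
  q.a = (dx13 * dy12 - dy13 * dx12) / bottom;
  // Gradient of the chord p1-p2, corrected for the curvature term.
  q.b = (dy12 - q.a * dx12 * (p2.x + p1.x)) / dx12;
  // Force the curve through p1.
  q.c = p1.y - p1.x * (p1.x * q.a + q.b);
  return q;
}
```

For example, the points (0, 3), (1, 2) and (3, 6) lie on y = x*x - 2*x + 3, and the function recovers a = 1, b = -2, c = 3.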
100
ccstruct/mod128.cpp
Normal file
@ -0,0 +1,100 @@
|
||||
/**********************************************************************
|
||||
* File: mod128.c (Formerly dir128.c)
|
||||
* Description: Code to convert a DIR128 to an ICOORD.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Oct 22 11:56:09 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h" //precompiled headers
|
||||
#include "mod128.h"
|
||||
|
||||
static INT16 idirtab[] = {
|
||||
1000, 0, 998, 49, 995, 98, 989, 146,
|
||||
980, 195, 970, 242, 956, 290, 941, 336,
|
||||
923, 382, 903, 427, 881, 471, 857, 514,
|
||||
831, 555, 803, 595, 773, 634, 740, 671,
|
||||
707, 707, 671, 740, 634, 773, 595, 803,
|
||||
555, 831, 514, 857, 471, 881, 427, 903,
|
||||
382, 923, 336, 941, 290, 956, 242, 970,
|
||||
195, 980, 146, 989, 98, 995, 49, 998,
|
||||
0, 1000, -49, 998, -98, 995, -146, 989,
|
||||
-195, 980, -242, 970, -290, 956, -336, 941,
|
||||
-382, 923, -427, 903, -471, 881, -514, 857,
|
||||
-555, 831, -595, 803, -634, 773, -671, 740,
|
||||
-707, 707, -740, 671, -773, 634, -803, 595,
|
||||
-831, 555, -857, 514, -881, 471, -903, 427,
|
||||
-923, 382, -941, 336, -956, 290, -970, 242,
|
||||
-980, 195, -989, 146, -995, 98, -998, 49,
|
||||
-1000, 0, -998, -49, -995, -98, -989, -146,
|
||||
-980, -195, -970, -242, -956, -290, -941, -336,
|
||||
-923, -382, -903, -427, -881, -471, -857, -514,
|
||||
-831, -555, -803, -595, -773, -634, -740, -671,
|
||||
-707, -707, -671, -740, -634, -773, -595, -803,
|
||||
-555, -831, -514, -857, -471, -881, -427, -903,
|
||||
-382, -923, -336, -941, -290, -956, -242, -970,
|
||||
-195, -980, -146, -989, -98, -995, -49, -998,
|
||||
0, -1000, 49, -998, 98, -995, 146, -989,
|
||||
195, -980, 242, -970, 290, -956, 336, -941,
|
||||
382, -923, 427, -903, 471, -881, 514, -857,
|
||||
555, -831, 595, -803, 634, -773, 671, -740,
|
||||
707, -707, 740, -671, 773, -634, 803, -595,
|
||||
831, -555, 857, -514, 881, -471, 903, -427,
|
||||
923, -382, 941, -336, 956, -290, 970, -242,
|
||||
980, -195, 989, -146, 995, -98, 998, -49
|
||||
};
|
||||
|
||||
static ICOORD *dirtab = (ICOORD *) idirtab;
|
||||
|
||||
/**********************************************************************
|
||||
* DIR128::DIR128
|
||||
*
|
||||
* Quantize the direction of an FCOORD to make a DIR128.
|
||||
**********************************************************************/
|
||||
|
||||
DIR128::DIR128( //from fcoord
|
||||
const FCOORD fc //vector to quantize
|
||||
) {
|
||||
int high, low, current; //binary search
|
||||
|
||||
low = 0;
|
||||
if (fc.y () == 0) {
|
||||
if (fc.x () >= 0)
|
||||
dir = 0;
|
||||
else
|
||||
dir = MODULUS / 2;
|
||||
return;
|
||||
}
|
||||
high = MODULUS;
|
||||
do {
|
||||
current = (high + low) / 2;
|
||||
if (dirtab[current] * fc >= 0)
|
||||
low = current;
|
||||
else
|
||||
high = current;
|
||||
}
|
||||
while (high - low > 1);
|
||||
dir = low;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* DIR128::vector
|
||||
*
|
||||
* Convert a direction to a vector.
|
||||
**********************************************************************/
|
||||
|
||||
ICOORD DIR128::vector() const { //convert to vector
|
||||
return dirtab[dir]; //easy really
|
||||
}
|
85
ccstruct/mod128.h
Normal file
@ -0,0 +1,85 @@
|
||||
/**********************************************************************
|
||||
* File: mod128.h (Formerly dir128.h)
|
||||
* Description: Header for class which implements modulo arithmetic.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Mar 26 17:48:13 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef MOD128_H
|
||||
#define MOD128_H
|
||||
|
||||
#include "points.h"
|
||||
|
||||
#define MODULUS 128 /*range of directions */
|
||||
#define DIRBITS 7 //no of bits used
|
||||
#define DIRSCALE 1000 //length of vector
|
||||
|
||||
class DLLSYM DIR128
|
||||
{
|
||||
public:
|
||||
DIR128() {
|
||||
} //empty constructor
|
||||
|
||||
DIR128( //constructor
|
||||
INT16 value) { //value to assign
|
||||
value %= MODULUS; //modulo arithmetic
|
||||
if (value < 0)
|
||||
value += MODULUS; //done properly
|
||||
dir = (INT8) value;
|
||||
}
|
||||
DIR128(const FCOORD fc); //quantize vector
|
||||
|
||||
DIR128 & operator= ( //assign of INT16
|
||||
INT16 value) { //value to assign
|
||||
value %= MODULUS; //modulo arithmetic
|
||||
if (value < 0)
|
||||
value += MODULUS; //done properly
|
||||
dir = (INT8) value;
|
||||
return *this;
|
||||
}
|
||||
INT8 operator- ( //subtraction
|
||||
const DIR128 & minus) const//for signed result
|
||||
{
|
||||
//result
|
||||
INT16 result = dir - minus.dir;
|
||||
|
||||
if (result > MODULUS / 2)
|
||||
result -= MODULUS; //get in range
|
||||
else if (result < -MODULUS / 2)
|
||||
result += MODULUS;
|
||||
return (INT8) result;
|
||||
}
|
||||
DIR128 operator+ ( //addition
|
||||
const DIR128 & add) const //of itself
|
||||
{
|
||||
DIR128 result; //sum
|
||||
|
||||
result = dir + add.dir; //let = do the work
|
||||
return result;
|
||||
}
|
||||
DIR128 & operator+= ( //same as +
|
||||
const DIR128 & add) {
|
||||
*this = dir + add.dir; //let = do the work
|
||||
return *this;
|
||||
}
|
||||
INT8 get_dir() const { //access function
|
||||
return dir;
|
||||
}
|
||||
ICOORD vector() const; //turn to vector
|
||||
|
||||
private:
|
||||
INT8 dir; //a direction
|
||||
};
|
||||
#endif
|
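The wrap-around behaviour of DIR128 is easiest to see on plain integers. The sketch below mirrors the constructor and `operator-` from the header above; `kModulus`, `wrap_dir` and `dir_diff` are hypothetical names used only for illustration.

```cpp
#include <cassert>

const int kModulus = 128;  // same value as MODULUS in the header

// Wrap any integer into [0, kModulus), as the INT16 constructor does.
int wrap_dir(int value) {
  value %= kModulus;       // may be negative for negative input
  if (value < 0)
    value += kModulus;
  return value;
}

// Signed difference a - b between two wrapped directions, folded into
// [-kModulus / 2, kModulus / 2], as DIR128::operator- does.
int dir_diff(int a, int b) {
  assert(a >= 0 && a < kModulus && b >= 0 && b < kModulus);
  int result = a - b;
  if (result > kModulus / 2)
    result -= kModulus;
  else if (result < -kModulus / 2)
    result += kModulus;
  return result;
}
```

With these definitions, `wrap_dir(-1)` is 127 and `dir_diff(2, 126)` is 4, meaning direction 2 is four steps past direction 126 going the short way round. Both +64 and -64 are left unchanged, so opposite directions can produce either sign.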
176
ccstruct/normalis.cpp
Normal file
176
ccstruct/normalis.cpp
Normal file
@ -0,0 +1,176 @@
|
||||
/**********************************************************************
|
||||
* File: normalis.cpp (Formerly denorm.c)
|
||||
* Description: Code for the DENORM class.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 09:22:43 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "werd.h"
|
||||
#include "normalis.h"
|
||||
|
||||
/**********************************************************************
|
||||
* DENORM::binary_search_segment
|
||||
*
|
||||
* Find the segment to use for the given x.
|
||||
**********************************************************************/
|
||||
|
||||
const DENORM_SEG *DENORM::binary_search_segment(float src_x) const {
|
||||
int bottom, top, middle; //binary search
|
||||
|
||||
bottom = 0;
|
||||
top = segments;
|
||||
do {
|
||||
middle = (bottom + top) / 2;
|
||||
if (segs[middle].xstart > src_x)
|
||||
top = middle;
|
||||
else
|
||||
bottom = middle;
|
||||
}
|
||||
while (top - bottom > 1);
|
||||
return &segs[bottom];
|
||||
}
|
||||
|
||||
/**********************************************************************
|
||||
* DENORM::scale_at_x
|
||||
*
|
||||
* Return scaling at a given (normalized) x coord.
|
||||
**********************************************************************/
|
||||
|
||||
float DENORM::scale_at_x(float src_x) const { // In normalized coords.
|
||||
if (segments != 0) {
|
||||
const DENORM_SEG* seg = binary_search_segment(src_x);
|
||||
if (seg->scale_factor > 0.0)
|
||||
return seg->scale_factor;
|
||||
}
|
||||
return scale_factor;
|
||||
}
|
||||
|
||||
/**********************************************************************
|
||||
* DENORM::yshift_at_x
|
||||
*
|
||||
* Return yshift at a given (normalized) x coord.
|
||||
**********************************************************************/
|
||||
|
||||
float DENORM::yshift_at_x(float src_x) const { // In normalized coords.
|
||||
if (segments != 0) {
|
||||
const DENORM_SEG* seg = binary_search_segment(src_x);
|
||||
if (seg->ycoord == -MAX_INT32) {
|
||||
if (base_is_row)
|
||||
return source_row->base_line(x(src_x)/scale_at_x(src_x) + x_centre);
|
||||
else
|
||||
return m * x(src_x) + c;
|
||||
} else {
|
||||
return seg->ycoord;
|
||||
}
|
||||
}
|
||||
return source_row->base_line (x(src_x)/scale_at_x(src_x) + x_centre);
|
||||
}
|
||||
|
||||
/**********************************************************************
|
||||
* DENORM::x
|
||||
*
|
||||
* Denormalise an x coordinate.
|
||||
**********************************************************************/
|
||||
|
||||
float DENORM::x( //convert x coord
|
||||
float src_x //coord to convert
|
||||
) const {
|
||||
return src_x / scale_at_x(src_x) + x_centre;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* DENORM::y
|
||||
*
|
||||
* Denormalise a y coordinate.
|
||||
**********************************************************************/
|
||||
|
||||
float DENORM::y( //convert y coord
|
||||
float src_y, //coord to convert
|
||||
float src_centre //x location for base
|
||||
) const {
|
||||
return (src_y - bln_baseline_offset) / scale_at_x(src_centre)
|
||||
+ yshift_at_x(src_centre);
|
||||
}
|
||||
|
||||
|
||||
DENORM::DENORM(float x, //from same pieces
|
||||
float scaling,
|
||||
double line_m, //default line
|
||||
double line_c,
|
||||
INT16 seg_count, //no of segments
|
||||
DENORM_SEG *seg_pts, //actual segments
|
||||
BOOL8 using_row, //as baseline
|
||||
ROW *src) {
|
||||
x_centre = x; //just copy
|
||||
scale_factor = scaling;
|
||||
source_row = src;
|
||||
if (seg_count > 0) {
|
||||
segs = new DENORM_SEG[seg_count];
|
||||
for (segments = 0; segments < seg_count; segments++) {
|
||||
// It is possible, if infrequent, that the segments may be out of order.
|
||||
// Since we are searching with a binary search, keep them in order.
|
||||
if (segments == 0 || segs[segments - 1].xstart <=
|
||||
seg_pts[segments].xstart) {
|
||||
segs[segments] = seg_pts[segments];
|
||||
} else {
|
||||
int i;
|
||||
for (i = 0; i < segments
|
||||
&& segs[segments - 1 - i].xstart > seg_pts[segments].xstart;
|
||||
++i) {
|
||||
segs[segments - i ] = segs[segments - 1 - i];
|
||||
}
|
||||
segs[segments - i] = seg_pts[segments];
|
||||
}
|
||||
}
|
||||
}
|
||||
else {
|
||||
segments = 0;
|
||||
segs = NULL;
|
||||
}
|
||||
base_is_row = using_row;
|
||||
m = line_m;
|
||||
c = line_c;
|
||||
}
|
||||
|
||||
|
||||
DENORM::DENORM(const DENORM &src) {
|
||||
segments = 0;
|
||||
segs = NULL;
|
||||
*this = src;
|
||||
}
|
||||
|
||||
|
||||
DENORM & DENORM::operator= (const DENORM & src) {
|
||||
x_centre = src.x_centre;
|
||||
scale_factor = src.scale_factor;
|
||||
source_row = src.source_row;
|
||||
if (segments > 0)
|
||||
delete[]segs;
|
||||
if (src.segments > 0) {
|
||||
segs = new DENORM_SEG[src.segments];
|
||||
for (segments = 0; segments < src.segments; segments++)
|
||||
segs[segments] = src.segs[segments];
|
||||
}
|
||||
else {
|
||||
segments = 0;
|
||||
segs = NULL;
|
||||
}
|
||||
base_is_row = src.base_is_row;
|
||||
m = src.m;
|
||||
c = src.c;
|
||||
return *this;
|
||||
}
|
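The lookup in DENORM::binary_search_segment returns the last segment whose xstart is not greater than the query, or the first segment when the query lies to the left of all of them. A minimal sketch of that search follows; `Seg` and `find_segment` are hypothetical names, a `std::vector` replaces the raw array, and a `while` loop replaces the source's `do ... while`, which gives the same result for a non-empty, sorted input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Seg {       // hypothetical, reduced form of DENORM_SEG
  int xstart;      // start of segment
  float scale;     // scale factor for this segment
};

// Assumes segs is non-empty and sorted by increasing xstart.
const Seg& find_segment(const std::vector<Seg>& segs, float x) {
  assert(!segs.empty());
  std::size_t bottom = 0;
  std::size_t top = segs.size();
  // Invariant: the answer lies in [bottom, top).
  while (top - bottom > 1) {
    const std::size_t middle = (bottom + top) / 2;
    if (segs[middle].xstart > x)
      top = middle;      // segment starts to the right of x
    else
      bottom = middle;   // segment starts at or before x
  }
  return segs[bottom];
}
```

The search depends on the array being sorted, which is why the DENORM constructor insertion-sorts the incoming segments by xstart before storing them.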
108
ccstruct/normalis.h
Normal file
@ -0,0 +1,108 @@
|
||||
/**********************************************************************
|
||||
* File: normalis.h (Formerly denorm.h)
|
||||
* Description: Code for the DENORM class.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Apr 23 09:22:43 BST 1992
|
||||
*
|
||||
* (C) Copyright 1992, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef NORMALIS_H
|
||||
#define NORMALIS_H
|
||||
|
||||
#include <stdio.h>
|
||||
|
||||
class ROW; //forward decl
|
||||
|
||||
class DENORM_SEG
|
||||
{
|
||||
public:
|
||||
DENORM_SEG() {
|
||||
} //empty
|
||||
|
||||
INT32 xstart; //start of segment
|
||||
INT32 ycoord; //y at segment
|
||||
float scale_factor; //for this segment
|
||||
};
|
||||
|
||||
class DENORM
|
||||
{
|
||||
public:
|
||||
DENORM() { //constructor
|
||||
source_row = NULL;
|
||||
x_centre = 0.0f;
|
||||
scale_factor = 1.0f;
|
||||
segments = 0;
|
||||
segs = NULL;
|
||||
base_is_row = TRUE;
|
||||
m = c = 0;
|
||||
}
|
||||
DENORM( //constructor
|
||||
float x, //from same pieces
|
||||
float scaling,
|
||||
ROW *src) {
|
||||
x_centre = x; //just copy
|
||||
scale_factor = scaling;
|
||||
source_row = src;
|
||||
segments = 0;
|
||||
segs = NULL;
|
||||
base_is_row = TRUE;
|
||||
m = c = 0;
|
||||
}
|
||||
DENORM( //constructor
|
||||
float x, //from same pieces
|
||||
float scaling,
|
||||
double line_m, //default line
|
||||
double line_c,
|
||||
INT16 seg_count, //no of segments
|
||||
DENORM_SEG *seg_pts, //actual segments
|
||||
BOOL8 using_row, //as baseline
|
||||
ROW *src);
|
||||
DENORM(const DENORM &);
|
||||
DENORM & operator= (const DENORM &);
|
||||
~DENORM () {
|
||||
if (segments > 0)
|
||||
delete[]segs;
|
||||
}
|
||||
|
||||
float origin() const { //get x centre
|
||||
return x_centre;
|
||||
}
|
||||
float scale() const { //get scale
|
||||
return scale_factor;
|
||||
}
|
||||
ROW *row() const { //get row
|
||||
return source_row;
|
||||
}
|
||||
float x( //convert an xcoord
|
||||
float src_x) const;
|
||||
float y( //convert a ycoord
|
||||
float src_y, //coord to convert
|
||||
float src_centre) const; //normed x centre
|
||||
float scale_at_x( // Return scaling at this coord.
|
||||
float src_x) const;
|
||||
float yshift_at_x( // Return yshift at this coord.
|
||||
float src_x) const;
|
||||
|
||||
private:
|
||||
const DENORM_SEG *binary_search_segment(float src_x) const;
|
||||
|
||||
BOOL8 base_is_row; //using row baseline?
|
||||
INT16 segments; //no of segments
|
||||
double c, m; //baseline
|
||||
float x_centre; //middle of word
|
||||
float scale_factor; //scaling
|
||||
ROW *source_row; //row it came from
|
||||
DENORM_SEG *segs; //array of segments
|
||||
};
|
||||
#endif
|
368
ccstruct/ocrblock.cpp
Normal file
@ -0,0 +1,368 @@
|
||||
/**********************************************************************
|
||||
* File: ocrblock.cpp (Formerly block.c)
|
||||
* Description: BLOCK member functions and iterator functions.
|
||||
* Author: Ray Smith
|
||||
* Created: Fri Mar 15 09:41:28 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include <stdlib.h>
|
||||
#include "blckerr.h"
|
||||
#include "ocrblock.h"
|
||||
#include "tprintf.h"
|
||||
|
||||
#define BLOCK_LABEL_HEIGHT 150 //char height of block id
|
||||
|
||||
ELISTIZE_S (BLOCK)
|
||||
/**********************************************************************
|
||||
* BLOCK::BLOCK
|
||||
*
|
||||
* Constructor for a simple rectangular block.
|
||||
**********************************************************************/
|
||||
BLOCK::BLOCK ( //rectangular block
|
||||
const char *name, //filename
|
||||
BOOL8 prop, //proportional
|
||||
INT16 kern, //kerning
|
||||
INT16 space, //spacing
|
||||
INT16 xmin, //bottom left
|
||||
INT16 ymin, INT16 xmax, //top right
|
||||
INT16 ymax):
|
||||
PDBLK (xmin, ymin, xmax, ymax),
|
||||
filename(name) { //box(ICOORD(xmin,ymin),ICOORD(xmax,ymax))
|
||||
//boundaries
|
||||
ICOORDELT_IT left_it = &leftside;
|
||||
ICOORDELT_IT right_it = &rightside;
|
||||
|
||||
proportional = prop;
|
||||
kerning = kern;
|
||||
spacing = space;
|
||||
font_class = -1; //not assigned
|
||||
hand_block = NULL;
|
||||
hand_poly = NULL;
|
||||
left_it.set_to_list (&leftside);
|
||||
right_it.set_to_list (&rightside);
|
||||
//make default box
|
||||
left_it.add_to_end (new ICOORDELT (xmin, ymin));
|
||||
left_it.add_to_end (new ICOORDELT (xmin, ymax));
|
||||
right_it.add_to_end (new ICOORDELT (xmax, ymin));
|
||||
right_it.add_to_end (new ICOORDELT (xmax, ymax));
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::set_sides
|
||||
*
|
||||
* Sets left and right vertex lists
|
||||
**********************************************************************/
|
||||
|
||||
//void BLOCK::set_sides( //set vertex lists
|
||||
//ICOORDELT_LIST *left, //left vertices
|
||||
//ICOORDELT_LIST *right //right vertices
|
||||
//)
|
||||
//{
|
||||
// ICOORDELT_IT left_it= &leftside; //boundaries
|
||||
// ICOORDELT_IT right_it= &rightside;
|
||||
|
||||
// leftside.clear();
|
||||
// left_it.move_to_first();
|
||||
// left_it.add_list_before(left);
|
||||
// rightside.clear();
|
||||
// right_it.move_to_first();
|
||||
// right_it.add_list_before(right);
|
||||
//}
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::contains
|
||||
*
|
||||
* Return TRUE if the given point is within the block.
|
||||
**********************************************************************/
|
||||
|
||||
//BOOL8 BLOCK::contains( //test containment
|
||||
//ICOORD pt //point to test
|
||||
//)
|
||||
//{
|
||||
// BLOCK_RECT_IT it=this; //rectangle iterator
|
||||
// ICOORD bleft,tright; //corners of rectangle
|
||||
|
||||
// for (it.start_block();!it.cycled_rects();it.forward())
|
||||
// {
|
||||
// it.bounding_box(bleft,tright); //get rectangle
|
||||
// if (pt.x()>=bleft.x() && pt.x()<=tright.x() //inside rect
|
||||
// && pt.y()>=bleft.y() && pt.y()<=tright.y())
|
||||
// return TRUE; //is inside
|
||||
// }
|
||||
// return FALSE; //not inside
|
||||
//}
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::move
|
||||
*
|
||||
* Reposition block
|
||||
**********************************************************************/
|
||||
|
||||
//void BLOCK::move( // reposition block
|
||||
//const ICOORD vec // by vector
|
||||
//)
|
||||
//{
|
||||
// ROW_IT row_it( &rows );
|
||||
// ICOORDELT_IT it( &leftside );
|
||||
|
||||
// for( row_it.mark_cycle_pt(); !row_it.cycled_list(); row_it.forward() )
|
||||
// row_it.data()->move( vec );
|
||||
|
||||
// for( it.mark_cycle_pt(); !it.cycled_list(); it.forward() )
|
||||
// *(it.data()) += vec;
|
||||
|
||||
// it.set_to_list( &rightside );
|
||||
|
||||
// for( it.mark_cycle_pt(); !it.cycled_list(); it.forward() )
|
||||
// *(it.data()) += vec;
|
||||
|
||||
// box.move( vec );
|
||||
//}
|
||||
|
||||
/**********************************************************************
|
||||
* decreasing_top_order
|
||||
*
|
||||
* Sort Comparator: Return <0 if row1 top > row2 top
|
||||
**********************************************************************/
|
||||
|
||||
int decreasing_top_order( //
|
||||
const void *row1,
|
||||
const void *row2) {
|
||||
return (*(ROW **) row2)->bounding_box ().top () -
|
||||
(*(ROW **) row1)->bounding_box ().top ();
|
||||
}
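The casts in decreasing_top_order suggest that the list iterator sorts an array of element pointers with a qsort-style callback. A minimal sketch of that pattern follows, assuming exactly that. DemoRow, demo_decreasing_top_order and demo_sort_rows are hypothetical stand-ins, not part of Tesseract.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for ROW: only the top edge matters here.
struct DemoRow {
  int top;
};

// Same shape as decreasing_top_order: the array being sorted holds
// pointers, so each argument really points at a DemoRow*.
int demo_decreasing_top_order(const void *row1, const void *row2) {
  const DemoRow *r1 = *static_cast<DemoRow *const *>(row1);
  const DemoRow *r2 = *static_cast<DemoRow *const *>(row2);
  return r2->top - r1->top;  // < 0 when row1 is higher than row2
}

// Sort an array of row pointers so that the highest row comes first.
void demo_sort_rows(DemoRow **rows, int count) {
  assert(rows != NULL && count >= 0);
  std::qsort(rows, count, sizeof(DemoRow *), demo_decreasing_top_order);
}
```

Subtracting the two tops is safe only while the difference fits in an int, which holds for page coordinates of the size used here.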
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::sort_rows
|
||||
*
|
||||
* Order rows so that they are in order of decreasing Y coordinate
|
||||
**********************************************************************/
|
||||
|
||||
void BLOCK::sort_rows() { // order on "top"
|
||||
ROW_IT row_it(&rows);
|
||||
|
||||
row_it.sort (decreasing_top_order);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::compress
|
||||
*
|
||||
* Delete space between the rows. (And maybe one day, compress the rows)
|
||||
* Fill space of block from top down, left aligning rows.
|
||||
**********************************************************************/
|
||||
|
||||
void BLOCK::compress() { // squash it up
|
||||
#define ROW_SPACING 5
|
||||
|
||||
ROW_IT row_it(&rows);
|
||||
ROW *row;
|
||||
ICOORD row_spacing (0, ROW_SPACING);
|
||||
|
||||
ICOORDELT_IT icoordelt_it;
|
||||
|
||||
sort_rows();
|
||||
|
||||
box = BOX (box.topleft (), box.topleft ());
|
||||
box.move_bottom_edge (ROW_SPACING);
|
||||
for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
|
||||
row = row_it.data ();
|
||||
row->move (box.botleft () - row_spacing -
|
||||
row->bounding_box ().topleft ());
|
||||
box += row->bounding_box ();
|
||||
}
|
||||
|
||||
leftside.clear ();
|
||||
icoordelt_it.set_to_list (&leftside);
|
||||
icoordelt_it.add_to_end (new ICOORDELT (box.left (), box.bottom ()));
|
||||
icoordelt_it.add_to_end (new ICOORDELT (box.left (), box.top ()));
|
||||
rightside.clear ();
|
||||
icoordelt_it.set_to_list (&rightside);
|
||||
icoordelt_it.add_to_end (new ICOORDELT (box.right (), box.bottom ()));
|
||||
icoordelt_it.add_to_end (new ICOORDELT (box.right (), box.top ()));
|
||||
}
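The stacking step in BLOCK::compress can be sketched in isolation as follows. This is an illustration under stated assumptions, not the Tesseract code: DemoBox, kRowSpacing and demo_compress are hypothetical names, y is assumed to increase upward, and the rows are assumed to be sorted by decreasing top already.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for BOX; y increases upward.
struct DemoBox {
  int left, bottom, right, top;
};

const int kRowSpacing = 5;  // mirrors ROW_SPACING

// Stack rows downward from (left, top), left-aligned and kRowSpacing
// apart. Returns the box that encloses all repositioned rows.
DemoBox demo_compress(std::vector<DemoBox> &rows, int left, int top) {
  DemoBox block = {left, top, left, top};  // collapsed to the top-left corner
  int next_top = top;                      // top edge for the next row
  for (DemoBox &row : rows) {
    const int width = row.right - row.left;
    const int height = row.top - row.bottom;
    assert(width >= 0 && height >= 0);
    row = {left, next_top - height, left + width, next_top};
    // Grow the block to enclose the repositioned row.
    if (row.bottom < block.bottom) block.bottom = row.bottom;
    if (row.right > block.right) block.right = row.right;
    next_top = row.bottom - kRowSpacing;
  }
  return block;
}
```

As in the original, the first row ends up flush with the top of the block and every later row sits kRowSpacing below the one before it.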
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::check_pitch
|
||||
*
|
||||
* Check whether the block is fixed or prop, set the flag, and set
|
||||
* the pitch if it is fixed.
|
||||
**********************************************************************/
|
||||
|
||||
void BLOCK::check_pitch() { // check prop
|
||||
// tprintf("Missing FFT fixed pitch stuff!\n");
|
||||
pitch = -1;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::compress
|
||||
*
|
||||
* Compress and move in a single operation.
|
||||
**********************************************************************/
|
||||
|
||||
void BLOCK::compress( // squash it up
|
||||
const ICOORD vec // and move
|
||||
) {
|
||||
box.move (vec);
|
||||
compress();
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::print
|
||||
*
|
||||
* Print the info on a block
|
||||
**********************************************************************/
|
||||
|
||||
void BLOCK::print( //print list of sides
|
||||
FILE *, //file to print on
|
||||
BOOL8 dump //print full detail
|
||||
) {
|
||||
ICOORDELT_IT it = &leftside; //iterator
|
||||
|
||||
box.print ();
|
||||
tprintf ("Proportional= %s\n", proportional ? "TRUE" : "FALSE");
|
||||
tprintf ("Kerning= %d\n", kerning);
|
||||
tprintf ("Spacing= %d\n", spacing);
|
||||
tprintf ("Fixed_pitch=%d\n", pitch);
|
||||
tprintf ("Filename= %s\n", filename.string ());
|
||||
|
||||
if (dump) {
|
||||
tprintf ("Left side coords are:\n");
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
tprintf ("(%d,%d) ", it.data ()->x (), it.data ()->y ());
|
||||
tprintf ("\n");
|
||||
tprintf ("Right side coords are:\n");
|
||||
it.set_to_list (&rightside);
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
tprintf ("(%d,%d) ", it.data ()->x (), it.data ()->y ());
|
||||
tprintf ("\n");
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::plot
|
||||
*
|
||||
* Plot the outline of a block in the given colour.
|
||||
**********************************************************************/
|
||||
|
||||
//void BLOCK::plot( //draw outline
|
||||
//WINDOW window, //window to draw in
|
||||
//INT32 serial, //serial number
|
||||
//COLOUR colour //colour to draw in
|
||||
//)
|
||||
//{
|
||||
// ICOORD startpt; //start of outline
|
||||
// ICOORD endpt; //end of outline
|
||||
// ICOORD prevpt; //previous point
|
||||
// ICOORDELT_IT it= &leftside; //iterator
|
||||
// char number[32]; //block id
|
||||
|
||||
// line_color_index(window,colour); //set the colour
|
||||
// text_color_index(window,colour);
|
||||
// character_height(window,(float)BLOCK_LABEL_HEIGHT);
|
||||
// text_font_index(window,6);
|
||||
|
||||
// if (!leftside.empty())
|
||||
// {
|
||||
// startpt= *(it.data()); //bottom left corner
|
||||
//// fprintf(stderr,"Block %d bottom left is (%d,%d)\n",
|
||||
//// serial,startpt.x(),startpt.y());
|
||||
// sprintf(number,"%d",serial);
|
||||
// text2d(window,startpt.x(),startpt.y(),number,0,FALSE);
|
||||
|
||||
// move2d(window,startpt.x(),startpt.y());
|
||||
// do
|
||||
// {
|
||||
// prevpt= *(it.data()); //previous point
|
||||
// it.forward(); //move to next point
|
||||
// draw2d(window,prevpt.x(),it.data()->y()); //draw round corner
|
||||
// draw2d(window,it.data()->x(),it.data()->y());
|
||||
// }
|
||||
// while (!it.at_last()); //until end of list
|
||||
// endpt= *(it.data()); //end point
|
||||
|
||||
// move2d(window,startpt.x(),startpt.y()); //other side of boundary
|
||||
// it.set_to_list(&rightside);
|
||||
// prevpt=startpt;
|
||||
// for (it.mark_cycle_pt();!it.cycled_list();it.forward())
|
||||
// {
|
||||
// draw2d(window,prevpt.x(),it.data()->y()); //draw round corner
|
||||
// draw2d(window,it.data()->x(),it.data()->y());
|
||||
// prevpt= *(it.data()); //previous point
|
||||
// }
|
||||
// draw2d(window,endpt.x(),endpt.y()); //close boundary
|
||||
// if (hand_block!=NULL)
|
||||
// hand_block->plot(window,colour,serial);
|
||||
// }
|
||||
//}
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::show
|
||||
*
|
||||
* Show the image corresponding to a block as its set of rectangles.
|
||||
**********************************************************************/
|
||||
|
||||
//void BLOCK::show( //show image block
|
||||
//IMAGE *image, //image to show
|
||||
//WINDOW window //window to show in
|
||||
//)
|
||||
//{
|
||||
// BLOCK_RECT_IT it=this; //rectangle iterator
|
||||
// ICOORD bleft,tright; //corners of rectangle
|
||||
|
||||
// for (it.start_block();!it.cycled_rects();it.forward())
|
||||
// {
|
||||
// it.bounding_box(bleft,tright); //get rectangle
|
||||
//// fprintf(stderr,"Drawing a block with a bottom left of (%d,%d)\n",
|
||||
//// bleft.x(),bleft.y());
|
||||
// show_sub_image(image,bleft.x(),bleft.y(),
|
||||
// tright.x()-bleft.x(),tright.y()-bleft.y(),
|
||||
// window,bleft.x(),bleft.y()); //show it
|
||||
// }
|
||||
//}
|
||||
|
||||
/**********************************************************************
|
||||
* BLOCK::operator=
|
||||
*
|
||||
* Assignment - duplicate the block structure, but with an EMPTY row list.
|
||||
**********************************************************************/
|
||||
|
||||
BLOCK & BLOCK::operator= ( //assignment
|
||||
const BLOCK & source //from this
|
||||
) {
|
||||
this->ELIST_LINK::operator= (source);
|
||||
this->PDBLK::operator= (source);
|
||||
proportional = source.proportional;
|
||||
kerning = source.kerning;
|
||||
spacing = source.spacing;
|
||||
filename = source.filename; //STRINGs assign ok
|
||||
if (!rows.empty ())
|
||||
rows.clear ();
|
||||
// if ( !leftside.empty() )
|
||||
// leftside.clear();
|
||||
// if ( !rightside.empty() )
|
||||
// rightside.clear();
|
||||
// leftside.deep_copy( &source.leftside );
|
||||
// rightside.deep_copy( &source.rightside );
|
||||
// box=source.box;
|
||||
return *this;
|
||||
}
|
228
ccstruct/ocrblock.h
Normal file
@ -0,0 +1,228 @@
|
||||
/**********************************************************************
|
||||
* File: ocrblock.h (Formerly block.h)
|
||||
* Description: Page block class definition.
|
||||
* Author: Ray Smith
|
||||
* Created: Thu Mar 14 17:32:01 GMT 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef OCRBLOCK_H
|
||||
#define OCRBLOCK_H
|
||||
|
||||
#include "img.h"
|
||||
#include "ocrrow.h"
|
||||
#include "pageblk.h"
|
||||
#include "pdblock.h"
|
||||
|
||||
class BLOCK; //forward decl
|
||||
|
||||
ELISTIZEH_S (BLOCK)
|
||||
class BLOCK:public ELIST_LINK, public PDBLK
|
||||
//page block
|
||||
{
|
||||
friend class BLOCK_RECT_IT; //block iterator
|
||||
|
||||
//block label
|
||||
friend void scan_hpd_blocks(const char *name,
|
||||
PAGE_BLOCK_LIST *page_blocks, //head of full pag
|
||||
INT32 &block_no, //no of blocks
|
||||
BLOCK_IT *block_it);
|
||||
friend BOOL8 read_vec_file( //read uscan output
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //page size
INT32 ysize,
BLOCK_LIST *blocks); //output list
|
||||
friend BOOL8 read_pd_file( //read uscan output
|
||||
STRING name, //basename of file
|
||||
INT32 xsize, //page size
INT32 ysize,
BLOCK_LIST *blocks); //output list
|
||||
|
||||
public:
|
||||
BLOCK() { //empty constructor
|
||||
hand_block = NULL;
|
||||
hand_poly = NULL;
|
||||
}
|
||||
BLOCK( //simple constructor
|
||||
const char *name, //filename
|
||||
BOOL8 prop, //proportional
|
||||
INT16 kern, //kerning
|
||||
INT16 space, //spacing
|
||||
INT16 xmin, //bottom left
|
||||
INT16 ymin,
|
||||
INT16 xmax, //top right
|
||||
INT16 ymax);
|
||||
|
||||
// void set_sides( //set vertex lists
|
||||
// ICOORDELT_LIST *left, //list of left vertices
|
||||
// ICOORDELT_LIST *right); //list of right vertices
|
||||
|
||||
~BLOCK () { //destructor
|
||||
}
|
||||
|
||||
void set_stats( //set space size etc.
|
||||
BOOL8 prop, //proportional
|
||||
INT16 kern, //inter char size
|
||||
INT16 space, //inter word size
|
||||
INT16 ch_pitch) { //pitch if fixed
|
||||
proportional = prop;
|
||||
kerning = (INT8) kern;
|
||||
spacing = space;
|
||||
pitch = ch_pitch;
|
||||
}
|
||||
void set_xheight( //set char size
|
||||
INT32 height) {
|
||||
xheight = height;
|
||||
}
|
||||
void set_font_class( //set font class
|
||||
INT16 font) {
|
||||
font_class = font;
|
||||
}
|
||||
// TEXT_REGION* text_region()
|
||||
// {
|
||||
// return hand_block;
|
||||
// }
|
||||
// POLY_BLOCK* poly_block()
|
||||
// {
|
||||
// return hand_poly;
|
||||
// }
|
||||
BOOL8 prop() const { //return proportional
|
||||
return proportional;
|
||||
}
|
||||
INT32 fixed_pitch() const { //return pitch
|
||||
return pitch;
|
||||
}
|
||||
INT16 kern() const { //return kerning
|
||||
return kerning;
|
||||
}
|
||||
INT16 font() const { //return font class
|
||||
return font_class;
|
||||
}
|
||||
INT16 space() const { //return spacing
|
||||
return spacing;
|
||||
}
|
||||
const char *name() const { //return filename
|
||||
return filename.string ();
|
||||
}
|
||||
INT32 x_height() const { //return xheight
|
||||
return xheight;
|
||||
}
|
||||
ROW_LIST *row_list() { //get rows
|
||||
return &rows;
|
||||
}
|
||||
C_BLOB_LIST *blob_list() { //get blobs
|
||||
return &c_blobs;
|
||||
}
|
||||
C_BLOB_LIST *reject_blobs() {
|
||||
return &rej_blobs;
|
||||
}
|
||||
// void bounding_box( //get box
|
||||
// ICOORD& bottom_left, //bottom left
|
||||
// ICOORD& top_right) const //topright
|
||||
// {
|
||||
// bottom_left=box.botleft();
|
||||
// top_right=box.topright();
|
||||
// }
|
||||
// const BOX& bounding_box() const //get real box
|
||||
// {
|
||||
// return box;
|
||||
// }
|
||||
|
||||
// BOOL8 contains( //is pt inside block
|
||||
// ICOORD pt);
|
||||
|
||||
// void move( // reposition block
|
||||
// const ICOORD vec); // by vector
|
||||
|
||||
void sort_rows(); //decreasing y order
|
||||
|
||||
void compress(); //shrink white space
|
||||
|
||||
void check_pitch(); //check proportional
|
||||
|
||||
void compress( //shrink white space
|
||||
const ICOORD vec); //and move by vector
|
||||
|
||||
void print( //print summary/table
|
||||
FILE *fp, //file to print on
|
||||
BOOL8 dump); //dump whole table
|
||||
|
||||
// void plot( //draw outline
|
||||
// WINDOW window, //window to draw in
|
||||
// INT32 serial, //serial number
|
||||
// COLOUR colour); //colour to draw in
|
||||
|
||||
// void show( //show image
|
||||
// IMAGE *image, //image to show
|
||||
// WINDOW window); //window to show in
|
||||
|
||||
void prep_serialise() { //set ptrs to counts
|
||||
filename.prep_serialise ();
|
||||
rows.prep_serialise ();
|
||||
c_blobs.prep_serialise ();
|
||||
rej_blobs.prep_serialise ();
|
||||
leftside.prep_serialise ();
|
||||
rightside.prep_serialise ();
|
||||
}
|
||||
|
||||
void dump( //write external bits
|
||||
FILE *f) {
|
||||
filename.dump (f);
|
||||
rows.dump (f);
|
||||
c_blobs.dump (f);
|
||||
rej_blobs.dump (f);
|
||||
leftside.dump (f);
|
||||
rightside.dump (f);
|
||||
if (hand_block != NULL)
|
||||
hand_block->serialise (f);
|
||||
}
|
||||
|
||||
void de_dump( //read external bits
|
||||
FILE *f) {
|
||||
filename.de_dump (f);
|
||||
rows.de_dump (f);
|
||||
c_blobs.de_dump (f);
|
||||
rej_blobs.de_dump (f);
|
||||
leftside.de_dump (f);
|
||||
rightside.de_dump (f);
|
||||
if (hand_block != NULL)
|
||||
hand_block = TEXT_REGION::de_serialise (f);
|
||||
}
|
||||
|
||||
//assignment
|
||||
make_serialise (BLOCK) BLOCK & operator= (
|
||||
const BLOCK & source); //from this
|
||||
|
||||
private:
|
||||
BOOL8 proportional; //proportional
|
||||
INT8 kerning; //inter blob gap
|
||||
INT16 spacing; //inter word gap
|
||||
INT16 pitch; //pitch of non-props
|
||||
INT16 font_class; //correct font class
|
||||
INT32 xheight; //height of chars
|
||||
STRING filename; //name of block
|
||||
// TEXT_REGION* hand_block; //if it exists
|
||||
// POLY_BLOCK* hand_poly; //weird as well
|
||||
ROW_LIST rows; //rows in block
|
||||
C_BLOB_LIST c_blobs; //before textord
|
||||
C_BLOB_LIST rej_blobs; //duff stuff
|
||||
// ICOORDELT_LIST leftside; //left side vertices
|
||||
// ICOORDELT_LIST rightside; //right side vertices
|
||||
// BOX box; //bounding box
|
||||
};
|
||||
|
||||
int decreasing_top_order( //
|
||||
const void *row1,
|
||||
const void *row2);
|
||||
#endif
|
216
ccstruct/ocrrow.cpp
Normal file
@ -0,0 +1,216 @@
|
||||
/**********************************************************************
|
||||
* File: ocrrow.cpp (Formerly row.c)
|
||||
* Description: Code for the ROW class.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Oct 08 15:58:04 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#include "mfcpch.h"
|
||||
#include "ocrrow.h"
|
||||
#include "blobbox.h"
|
||||
|
||||
ELISTIZE_S (ROW)
|
||||
/**********************************************************************
|
||||
* ROW::ROW
|
||||
*
|
||||
* Constructor to build a ROW. Only the stats are given here.
|
||||
* The words are added directly.
|
||||
**********************************************************************/
|
||||
ROW::ROW ( //constructor
|
||||
INT32 spline_size, //no of segments
|
||||
INT32 * xstarts, //segment boundaries
|
||||
double *coeffs, //coefficients
|
||||
float x_height, //line height
|
||||
float ascenders, //ascender size
|
||||
float descenders, //descender drop
|
||||
INT16 kern, //char gap
|
||||
INT16 space //word gap
|
||||
):
|
||||
baseline(spline_size, xstarts, coeffs) {
|
||||
kerning = kern; //just store stuff
|
||||
spacing = space;
|
||||
xheight = x_height;
|
||||
ascrise = ascenders;
|
||||
descdrop = descenders;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::ROW
|
||||
*
|
||||
* Constructor to build a ROW from a TO_ROW. Only the stats are copied here.
|
||||
* The words are added directly.
|
||||
**********************************************************************/
|
||||
|
||||
ROW::ROW( //constructor
|
||||
TO_ROW *to_row, //source row
|
||||
INT16 kern, //char gap
|
||||
INT16 space //word gap
|
||||
) {
|
||||
kerning = kern; //just store stuff
|
||||
spacing = space;
|
||||
xheight = to_row->xheight;
|
||||
ascrise = to_row->ascrise;
|
||||
descdrop = to_row->descdrop;
|
||||
baseline = to_row->baseline;
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::recalc_bounding_box
|
||||
*
|
||||
* Set the bounding box correctly
|
||||
**********************************************************************/
|
||||
|
||||
void ROW::recalc_bounding_box() { //recalculate BB
|
||||
WERD *word; //current word
|
||||
WERD_IT it = &words; //words of ROW
|
||||
INT16 left; //of word
|
||||
INT16 prev_left; //old left
|
||||
|
||||
if (!it.empty ()) {
|
||||
word = it.data ();
|
||||
prev_left = word->bounding_box ().left ();
|
||||
it.forward ();
|
||||
while (!it.at_first ()) {
|
||||
word = it.data ();
|
||||
left = word->bounding_box ().left ();
|
||||
if (left < prev_left) {
|
||||
it.move_to_first ();
|
||||
//words in BB order
|
||||
it.sort (word_comparator);
|
||||
break;
|
||||
}
|
||||
prev_left = left;
|
||||
it.forward ();
|
||||
}
|
||||
}
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ()) {
|
||||
word = it.data ();
|
||||
if (it.at_first ())
|
||||
word->set_flag (W_BOL, TRUE);
|
||||
else
|
||||
//not start of line
|
||||
word->set_flag (W_BOL, FALSE);
|
||||
if (it.at_last ())
|
||||
word->set_flag (W_EOL, TRUE);
|
||||
else
|
||||
//not end of line
|
||||
word->set_flag (W_EOL, FALSE);
|
||||
//extend BB as reqd
|
||||
bound_box += word->bounding_box ();
|
||||
}
|
||||
}
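ROW::recalc_bounding_box does three things: it re-sorts the words only if they are out of left-to-right order, flags the first and last word, and grows the bounding box over every word. A minimal sketch of those steps follows. DemoBox, DemoWord and demo_recalc_bounding_box are hypothetical, word_comparator is assumed to order words by left edge, and unlike the original the sketch starts from the first word's box and requires a non-empty row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DemoBox {
  int left, bottom, right, top;
};

struct DemoWord {
  DemoBox box;
  bool bol;  // beginning of line, like W_BOL
  bool eol;  // end of line, like W_EOL
};

// Assumed ordering: by left edge of the bounding box.
static bool left_of(const DemoWord &a, const DemoWord &b) {
  return a.box.left < b.box.left;
}

DemoBox demo_recalc_bounding_box(std::vector<DemoWord> &words) {
  assert(!words.empty());
  // Sort only when some word starts left of its predecessor.
  if (!std::is_sorted(words.begin(), words.end(), left_of))
    std::stable_sort(words.begin(), words.end(), left_of);

  DemoBox bound = words.front().box;
  for (std::size_t i = 0; i < words.size(); ++i) {
    words[i].bol = (i == 0);
    words[i].eol = (i + 1 == words.size());
    const DemoBox &b = words[i].box;
    bound.left = std::min(bound.left, b.left);
    bound.bottom = std::min(bound.bottom, b.bottom);
    bound.right = std::max(bound.right, b.right);
    bound.top = std::max(bound.top, b.top);
  }
  return bound;
}
```

Checking the order before sorting keeps the common case, where the words are already in order, to a single pass.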
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::move
|
||||
*
|
||||
* Reposition row by vector
|
||||
**********************************************************************/
|
||||
|
||||
void ROW::move( // reposition row
|
||||
const ICOORD vec // by vector
|
||||
) {
|
||||
WERD_IT it(&words); // word iterator
|
||||
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
it.data ()->move (vec);
|
||||
|
||||
bound_box.move (vec);
|
||||
baseline.move (vec);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::print
|
||||
*
|
||||
* Display members
|
||||
**********************************************************************/
|
||||
|
||||
void ROW::print( //print
|
||||
FILE *fp //file to print on
|
||||
) {
|
||||
tprintf ("Kerning= %d\n", kerning);
|
||||
tprintf ("Spacing= %d\n", spacing);
|
||||
bound_box.print ();
|
||||
tprintf ("Xheight= %f\n", xheight);
|
||||
tprintf ("Ascrise= %f\n", ascrise);
|
||||
tprintf ("Descdrop= %f\n", descdrop);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::plot
|
||||
*
|
||||
* Draw the ROW in the given colour.
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void ROW::plot( //draw it
|
||||
WINDOW window, //window to draw in
|
||||
COLOUR colour //colour to draw in
|
||||
) {
|
||||
WERD *word; //current word
|
||||
WERD_IT it = &words; //words of ROW
|
||||
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ()) {
|
||||
word = it.data ();
|
||||
word->plot (window, colour); //all in one colour
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::plot
|
||||
*
|
||||
* Draw the ROW in rainbow colours.
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void ROW::plot( //draw it
|
||||
WINDOW window //window to draw in
|
||||
) {
|
||||
WERD *word; //current word
|
||||
WERD_IT it = &words; //words of ROW
|
||||
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ()) {
|
||||
word = it.data ();
|
||||
word->plot (window); //in rainbow colours
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
/**********************************************************************
|
||||
* ROW::operator=
|
||||
*
|
||||
* Assign rows by duplicating the row structure but NOT the WERDLIST
|
||||
**********************************************************************/
|
||||
|
||||
ROW & ROW::operator= ( //assignment
|
||||
const ROW & source //from this
|
||||
) {
|
||||
this->ELIST_LINK::operator= (source);
|
||||
kerning = source.kerning;
|
||||
spacing = source.spacing;
|
||||
xheight = source.xheight;
|
||||
ascrise = source.ascrise;
|
||||
descdrop = source.descdrop;
|
||||
if (!words.empty ())
|
||||
words.clear ();
|
||||
baseline = source.baseline; //QSPLINES must do =
|
||||
bound_box = source.bound_box;
|
||||
return *this;
|
||||
}
|
133
ccstruct/ocrrow.h
Normal file
@ -0,0 +1,133 @@
|
||||
/**********************************************************************
|
||||
* File: ocrrow.h (Formerly row.h)
|
||||
* Description: Definition of the ROW class.
|
||||
* Author: Ray Smith
|
||||
* Created: Tue Oct 08 15:58:04 BST 1991
|
||||
*
|
||||
* (C) Copyright 1991, Hewlett-Packard Ltd.
|
||||
** Licensed under the Apache License, Version 2.0 (the "License");
|
||||
** you may not use this file except in compliance with the License.
|
||||
** You may obtain a copy of the License at
|
||||
** http://www.apache.org/licenses/LICENSE-2.0
|
||||
** Unless required by applicable law or agreed to in writing, software
|
||||
** distributed under the License is distributed on an "AS IS" BASIS,
|
||||
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
** See the License for the specific language governing permissions and
|
||||
** limitations under the License.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
#ifndef OCRROW_H
|
||||
#define OCRROW_H
|
||||
|
||||
#include <stdio.h>
|
||||
#include "quspline.h"
|
||||
#include "werd.h"
|
||||
|
||||
class TO_ROW;
|
||||
|
||||
class ROW:public ELIST_LINK
|
||||
{
|
||||
friend void tweak_row_baseline(ROW *);
|
||||
public:
|
||||
ROW() {
|
||||
} //empty constructor
|
||||
ROW( //constructor
|
||||
INT32 spline_size, //no of segments
|
||||
INT32 *xstarts, //segment boundaries
|
||||
double *coeffs, //coefficients
float x_height, //line height
float ascenders, //ascender size
float descenders, //descender size
|
||||
INT16 kern, //char gap
|
||||
INT16 space); //word gap
|
||||
ROW( //constructor
|
||||
TO_ROW *row, //textord row
|
||||
INT16 kern, //char gap
|
||||
INT16 space); //word gap
|
||||
|
||||
WERD_LIST *word_list() { //get words
|
||||
return &words;
|
||||
}
|
||||
|
||||
float base_line( //compute baseline
|
||||
float xpos) const { //at the position
|
||||
//get spline value
|
||||
return (float) baseline.y (xpos);
|
||||
}
|
||||
float x_height() const { //return x height
|
||||
return xheight;
|
||||
}
|
||||
INT32 kern() const { //return kerning
|
||||
return kerning;
|
||||
}
|
||||
INT32 space() const { //return spacing
|
||||
return spacing;
|
||||
}
|
||||
float ascenders() const { //return size
|
||||
return ascrise;
|
||||
}
|
||||
float descenders() const { //return size
|
||||
return descdrop;
|
||||
}
|
||||
BOX bounding_box() const { //return bounding box
|
||||
return bound_box;
|
||||
}
|
||||
|
||||
void recalc_bounding_box(); //recalculate BB
|
||||
|
||||
void move( // reposition row
|
||||
const ICOORD vec); // by vector
|
||||
|
||||
void print( //print
|
||||
FILE *fp); //file to print on
|
||||
|
||||
void plot( //draw one
|
||||
WINDOW window, //window to draw in
|
||||
COLOUR colour); //uniform colour
|
||||
void plot( //draw one
|
||||
WINDOW window); //in rainbow colours
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void plot_baseline( //draw the baseline
|
||||
WINDOW window, //window to draw in
|
||||
COLOUR colour) { //colour to draw
|
||||
//draw it
|
||||
baseline.plot (window, colour);
|
||||
}
|
||||
#endif
|
||||
|
||||
void prep_serialise() { //set ptrs to counts
|
||||
words.prep_serialise ();
|
||||
baseline.prep_serialise ();
|
||||
}
|
||||
|
||||
void dump( //write external bits
|
||||
FILE *f) {
|
||||
words.dump (f);
|
||||
baseline.dump (f);
|
||||
}
|
||||
|
||||
void de_dump( //read external bits
|
||||
FILE *f) {
|
||||
words.de_dump (f);
|
||||
baseline.de_dump (f);
|
||||
}
|
||||
|
||||
//assignment
|
||||
make_serialise (ROW) ROW & operator= (
|
||||
const ROW & source); //from this
|
||||
|
||||
private:
|
||||
INT32 kerning; //inter char gap
|
||||
INT32 spacing; //inter word gap
|
||||
BOX bound_box; //bounding box
|
||||
float xheight; //height of line
|
||||
float ascrise; //size of ascenders
|
||||
float descdrop; //-size of descenders
|
||||
WERD_LIST words; //words
|
||||
QSPLINE baseline; //baseline spline
|
||||
};
|
||||
|
||||
ELISTIZEH_S (ROW)
|
||||
#endif
|
879
ccstruct/pageblk.cpp
Normal file
@ -0,0 +1,879 @@
|
||||
#include "mfcpch.h"
|
||||
#include "pageblk.h"
|
||||
#include <stdio.h>
|
||||
#include <ctype.h>
|
||||
#include <math.h>
|
||||
#ifdef __UNIX__
|
||||
#include <unistd.h>
|
||||
#else
|
||||
#include <io.h>
|
||||
#endif
|
||||
|
||||
#include "hpddef.h" //must be last (handpd.dll)
|
||||
|
||||
#define G_START 0
|
||||
#define I_START 1
|
||||
#define R_START 3
|
||||
#define S_START 5
|
||||
|
||||
extern char blabel[NUM_BLOCK_ATTR][4][MAXLENGTH];
|
||||
extern char backlabel[NUM_BACKGROUNDS][MAXLENGTH];
|
||||
|
||||
ELISTIZE_S (PAGE_BLOCK)
|
||||
void PAGE_BLOCK::pb_delete() {
|
||||
switch (pb_type) {
|
||||
case PB_TEXT:
|
||||
delete ((TEXT_BLOCK *) this);
|
||||
break;
|
||||
case PB_GRAPHICS:
|
||||
delete ((GRAPHICS_BLOCK *) this);
|
||||
break;
|
||||
case PB_IMAGE:
|
||||
delete ((IMAGE_BLOCK *) this);
|
||||
break;
|
||||
case PB_RULES:
|
||||
delete ((RULE_BLOCK *) this);
|
||||
break;
|
||||
case PB_SCRIBBLE:
|
||||
delete ((SCRIBBLE_BLOCK *) this);
|
||||
break;
|
||||
case PB_WEIRD:
|
||||
delete ((WEIRD_BLOCK *) this);
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
#define QUOTE_IT( parm ) #parm
|
||||
|
||||
void PAGE_BLOCK::serialise(FILE *f) {
|
||||
|
||||
if (fwrite (&pb_type, sizeof (PB_TYPE), 1, f) != 1)
|
||||
WRITEFAILED.error (QUOTE_IT (PAGE_BLOCK::serialise), ABORT, NULL);
|
||||
switch (pb_type) {
|
||||
case PB_TEXT:
|
||||
((TEXT_BLOCK *) this)->serialise (f);
|
||||
break;
|
||||
case PB_GRAPHICS:
|
||||
((GRAPHICS_BLOCK *) this)->serialise (f);
|
||||
break;
|
||||
case PB_RULES:
|
||||
((RULE_BLOCK *) this)->serialise (f);
|
||||
break;
|
||||
case PB_IMAGE:
|
||||
((IMAGE_BLOCK *) this)->serialise (f);
|
||||
break;
|
||||
case PB_SCRIBBLE:
|
||||
((SCRIBBLE_BLOCK *) this)->serialise (f);
|
||||
break;
|
||||
case PB_WEIRD:
|
||||
((WEIRD_BLOCK *) this)->serialise (f);
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
PAGE_BLOCK *PAGE_BLOCK::de_serialise(FILE *f) {
|
||||
PB_TYPE type;
|
||||
TEXT_BLOCK *tblock;
|
||||
GRAPHICS_BLOCK *gblock;
|
||||
RULE_BLOCK *rblock;
|
||||
IMAGE_BLOCK *iblock;
|
||||
SCRIBBLE_BLOCK *sblock;
|
||||
WEIRD_BLOCK *wblock;
|
||||
|
||||
if (fread ((void *) &type, sizeof (PB_TYPE), 1, f) != 1)
|
||||
READFAILED.error (QUOTE_IT (PAGE_BLOCK::de_serialise), ABORT, NULL);
|
||||
switch (type) {
|
||||
case PB_TEXT:
|
||||
tblock = (TEXT_BLOCK *) alloc_struct (sizeof (TEXT_BLOCK));
|
||||
return tblock->de_serialise (f);
|
||||
case PB_GRAPHICS:
|
||||
gblock = (GRAPHICS_BLOCK *) alloc_struct (sizeof (GRAPHICS_BLOCK));
|
||||
return gblock->de_serialise (f);
|
||||
case PB_RULES:
|
||||
rblock = (RULE_BLOCK *) alloc_struct (sizeof (RULE_BLOCK));
|
||||
return rblock->de_serialise (f);
|
||||
case PB_IMAGE:
|
||||
iblock = (IMAGE_BLOCK *) alloc_struct (sizeof (IMAGE_BLOCK));
|
||||
return iblock->de_serialise (f);
|
||||
case PB_SCRIBBLE:
|
||||
sblock = (SCRIBBLE_BLOCK *) alloc_struct (sizeof (SCRIBBLE_BLOCK));
|
||||
return sblock->de_serialise (f);
|
||||
case PB_WEIRD:
|
||||
wblock = (WEIRD_BLOCK *) alloc_struct (sizeof (WEIRD_BLOCK));
|
||||
return wblock->de_serialise (f);
|
||||
default:
|
||||
return NULL;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PAGE_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void PAGE_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
serialise_INT32(f, pb_type);
|
||||
switch (pb_type) {
|
||||
case PB_TEXT:
|
||||
((TEXT_BLOCK *) this)->serialise_asc (f);
|
||||
break;
|
||||
case PB_GRAPHICS:
|
||||
((GRAPHICS_BLOCK *) this)->serialise_asc (f);
|
||||
break;
|
||||
case PB_RULES:
|
||||
((RULE_BLOCK *) this)->serialise_asc (f);
|
||||
break;
|
||||
case PB_IMAGE:
|
||||
((IMAGE_BLOCK *) this)->serialise_asc (f);
|
||||
break;
|
||||
case PB_SCRIBBLE:
|
||||
((SCRIBBLE_BLOCK *) this)->serialise_asc (f);
|
||||
break;
|
||||
case PB_WEIRD:
|
||||
((WEIRD_BLOCK *) this)->serialise_asc (f);
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PAGE_BLOCK::internal_serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void PAGE_BLOCK::internal_serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((POLY_BLOCK *) this)->serialise_asc (f);
|
||||
serialise_INT32(f, pb_type);
|
||||
children.serialise_asc (f);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PAGE_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void PAGE_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
PAGE_BLOCK *page_block; //new block for list
|
||||
INT32 len; /*length to retrieve */
|
||||
PAGE_BLOCK_IT it;
|
||||
|
||||
((POLY_BLOCK *) this)->de_serialise_asc (f);
|
||||
pb_type = (PB_TYPE) de_serialise_INT32 (f);
|
||||
// children.de_serialise_asc(f);
|
||||
len = de_serialise_INT32 (f);
|
||||
it.set_to_list (&children);
|
||||
for (; len > 0; len--) {
|
||||
page_block = new_de_serialise_asc (f);
|
||||
it.add_to_end (page_block); /*put on the list */
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PAGE_BLOCK::new_de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
PAGE_BLOCK *PAGE_BLOCK::new_de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
PB_TYPE type;
|
||||
TEXT_BLOCK *tblock;
|
||||
GRAPHICS_BLOCK *gblock;
|
||||
RULE_BLOCK *rblock;
|
||||
IMAGE_BLOCK *iblock;
|
||||
SCRIBBLE_BLOCK *sblock;
|
||||
WEIRD_BLOCK *wblock;
|
||||
|
||||
type = (PB_TYPE) de_serialise_INT32 (f);
|
||||
switch (type) {
|
||||
case PB_TEXT:
|
||||
tblock = new TEXT_BLOCK;
|
||||
tblock->de_serialise_asc (f);
|
||||
return tblock;
|
||||
case PB_GRAPHICS:
|
||||
gblock = new GRAPHICS_BLOCK;
|
||||
gblock->de_serialise_asc (f);
|
||||
return gblock;
|
||||
case PB_RULES:
|
||||
rblock = new RULE_BLOCK;
|
||||
rblock->de_serialise_asc (f);
|
||||
return rblock;
|
||||
case PB_IMAGE:
|
||||
iblock = new IMAGE_BLOCK;
|
||||
iblock->de_serialise_asc (f);
|
||||
return iblock;
|
||||
case PB_SCRIBBLE:
|
||||
sblock = new SCRIBBLE_BLOCK;
|
||||
sblock->de_serialise_asc (f);
|
||||
return sblock;
|
||||
case PB_WEIRD:
|
||||
wblock = new WEIRD_BLOCK;
|
||||
wblock->de_serialise_asc (f);
|
||||
return wblock;
|
||||
default:
|
||||
return NULL;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void PAGE_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
PAGE_BLOCK_IT it;
|
||||
|
||||
switch (pb_type) {
|
||||
case PB_TEXT:
|
||||
((TEXT_BLOCK *) this)->show_attrs (f);
|
||||
break;
|
||||
case PB_GRAPHICS:
|
||||
((GRAPHICS_BLOCK *) this)->show_attrs (f);
|
||||
break;
|
||||
case PB_RULES:
|
||||
((RULE_BLOCK *) this)->show_attrs (f);
|
||||
break;
|
||||
case PB_IMAGE:
|
||||
((IMAGE_BLOCK *) this)->show_attrs (f);
|
||||
break;
|
||||
case PB_SCRIBBLE:
|
||||
((SCRIBBLE_BLOCK *) this)->show_attrs (f);
|
||||
break;
|
||||
case PB_WEIRD:
|
||||
((WEIRD_BLOCK *) this)->show_attrs (f);
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
|
||||
if (!children.empty ()) {
|
||||
f->dprintf ("containing subblocks\n");
|
||||
it.set_to_list (&children);
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
it.data ()->show_attrs (f);
|
||||
f->dprintf ("end of subblocks\n");
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
PAGE_BLOCK::PAGE_BLOCK (ICOORDELT_LIST * points, PB_TYPE type, PAGE_BLOCK_LIST * child):POLY_BLOCK (points,
|
||||
POLY_PAGE) {
|
||||
PAGE_BLOCK_IT
|
||||
c = &children;
|
||||
|
||||
pb_type = type;
|
||||
children.clear ();
|
||||
c.move_to_first ();
|
||||
c.add_list_before (child);
|
||||
}
|
||||
|
||||
|
||||
PAGE_BLOCK::PAGE_BLOCK (ICOORDELT_LIST * points, PB_TYPE type):POLY_BLOCK (points,
|
||||
POLY_PAGE) {
|
||||
pb_type = type;
|
||||
children.clear ();
|
||||
}
|
||||
|
||||
|
||||
void PAGE_BLOCK::add_a_child(PAGE_BLOCK *newchild) {
|
||||
PAGE_BLOCK_IT c = &children;
|
||||
|
||||
c.move_to_first ();
|
||||
c.add_to_end (newchild);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PAGE_BLOCK::rotate
|
||||
*
|
||||
* Rotate the PAGE_BLOCK and its children
|
||||
**********************************************************************/
|
||||
|
||||
void PAGE_BLOCK::rotate( //cos,sin
|
||||
FCOORD rotation) {
|
||||
//sub block iterator
|
||||
PAGE_BLOCK_IT child_it = &children;
|
||||
PAGE_BLOCK *child; //child block
|
||||
|
||||
for (child_it.mark_cycle_pt (); !child_it.cycled_list ();
|
||||
child_it.forward ()) {
|
||||
child = child_it.data ();
|
||||
child->rotate (rotation);
|
||||
}
|
||||
if (pb_type == PB_TEXT)
|
||||
((TEXT_BLOCK *) this)->rotate (rotation);
|
||||
else
|
||||
POLY_BLOCK::rotate(rotation);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* PAGE_BLOCK::move
|
||||
*
|
||||
* Move the PAGE_BLOCK and its children
|
||||
**********************************************************************/
|
||||
|
||||
void PAGE_BLOCK::move(ICOORD shift //amount to move
|
||||
) {
|
||||
//sub block iterator
|
||||
PAGE_BLOCK_IT child_it = &children;
|
||||
PAGE_BLOCK *child; //child block
|
||||
|
||||
for (child_it.mark_cycle_pt (); !child_it.cycled_list ();
|
||||
child_it.forward ()) {
|
||||
child = child_it.data ();
|
||||
child->move (shift);
|
||||
}
|
||||
if (pb_type == PB_TEXT)
|
||||
((TEXT_BLOCK *) this)->move (shift);
|
||||
else
|
||||
POLY_BLOCK::move(shift);
|
||||
}
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void PAGE_BLOCK::basic_plot(WINDOW window, COLOUR colour) {
|
||||
PAGE_BLOCK_IT c = &children;
|
||||
|
||||
POLY_BLOCK::plot (window, colour, 0);
|
||||
|
||||
if (!c.empty ())
|
||||
for (c.mark_cycle_pt (); !c.cycled_list (); c.forward ())
|
||||
c.data ()->plot (window, colour);
|
||||
}
|
||||
|
||||
|
||||
void PAGE_BLOCK::plot(WINDOW window, COLOUR colour) {
|
||||
TEXT_BLOCK *tblock;
|
||||
WEIRD_BLOCK *wblock;
|
||||
|
||||
switch (pb_type) {
|
||||
case PB_TEXT:
|
||||
basic_plot(window, colour);
|
||||
tblock = (TEXT_BLOCK *) this;
|
||||
tblock->plot (window, colour, REGION_COLOUR, SUBREGION_COLOUR);
|
||||
break;
|
||||
case PB_WEIRD:
|
||||
wblock = (WEIRD_BLOCK *) this;
|
||||
wblock->plot (window, colour);
|
||||
break;
|
||||
default:
|
||||
basic_plot(window, colour);
|
||||
break;
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
void show_all_in(PAGE_BLOCK *pblock, POLY_BLOCK *show_area, DEBUG_WIN *f) {
|
||||
PAGE_BLOCK_IT c;
|
||||
INT16 i, pnum;
|
||||
|
||||
c.set_to_list (pblock->child ());
|
||||
pnum = pblock->child ()->length ();
|
||||
for (i = 0; i < pnum; i++, c.forward ()) {
|
||||
if (show_area->contains (c.data ()))
|
||||
c.data ()->show_attrs (f);
|
||||
else if (show_area->overlap (c.data ()))
|
||||
show_all_in (c.data (), show_area, f);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void delete_all_in(PAGE_BLOCK *pblock, POLY_BLOCK *delete_area) {
|
||||
PAGE_BLOCK_IT c;
|
||||
INT16 i, pnum;
|
||||
|
||||
c.set_to_list (pblock->child ());
|
||||
pnum = pblock->child ()->length ();
|
||||
for (i = 0; i < pnum; i++, c.forward ()) {
|
||||
if (delete_area->contains (c.data ()))
|
||||
c.extract ()->pb_delete ();
|
||||
else if (delete_area->overlap (c.data ()))
|
||||
delete_all_in (c.data (), delete_area);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
PAGE_BLOCK *smallest_containing(PAGE_BLOCK *pblock, POLY_BLOCK *other) {
|
||||
PAGE_BLOCK_IT c;
|
||||
|
||||
c.set_to_list (pblock->child ());
|
||||
if (c.empty ())
|
||||
return (pblock);
|
||||
|
||||
for (c.mark_cycle_pt (); !c.cycled_list (); c.forward ())
|
||||
if (c.data ()->contains (other))
|
||||
return (smallest_containing (c.data (), other));
|
||||
|
||||
return (pblock);
|
||||
}
|
||||
|
||||
|
||||
TEXT_BLOCK::TEXT_BLOCK (ICOORDELT_LIST * points, BOOL8 backg[NUM_BACKGROUNDS]):PAGE_BLOCK (points,
|
||||
PB_TEXT) {
|
||||
int
|
||||
i;
|
||||
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++)
|
||||
background.set_bit (i, backg[i]);
|
||||
|
||||
text_regions.clear ();
|
||||
}
|
||||
|
||||
|
||||
void
|
||||
TEXT_BLOCK::set_attrs (BOOL8 backg[NUM_BACKGROUNDS]) {
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++)
|
||||
background.set_bit (i, backg[i]);
|
||||
}
|
||||
|
||||
|
||||
void TEXT_BLOCK::add_a_region(TEXT_REGION *newchild) {
|
||||
TEXT_REGION_IT c;
|
||||
|
||||
c.set_to_list (&text_regions);
|
||||
|
||||
c.move_to_first ();
|
||||
c.add_to_end (newchild);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TEXT_BLOCK::rotate
|
||||
*
|
||||
* Rotate the TEXT_BLOCK and its children
|
||||
**********************************************************************/
|
||||
|
||||
void TEXT_BLOCK::rotate( //cos,sin
|
||||
FCOORD rotation) {
|
||||
//sub block iterator
|
||||
TEXT_REGION_IT child_it = &text_regions;
|
||||
TEXT_REGION *child; //child block
|
||||
|
||||
for (child_it.mark_cycle_pt (); !child_it.cycled_list ();
|
||||
child_it.forward ()) {
|
||||
child = child_it.data ();
|
||||
child->rotate (rotation);
|
||||
}
|
||||
POLY_BLOCK::rotate(rotation);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TEXT_BLOCK::move
|
||||
*
|
||||
* Move the TEXT_BLOCK and its children
|
||||
**********************************************************************/
|
||||
|
||||
void TEXT_BLOCK::move(ICOORD shift //amount to move
|
||||
) {
|
||||
//sub block iterator
|
||||
TEXT_REGION_IT child_it = &text_regions;
|
||||
TEXT_REGION *child; //child block
|
||||
|
||||
for (child_it.mark_cycle_pt (); !child_it.cycled_list ();
|
||||
child_it.forward ()) {
|
||||
child = child_it.data ();
|
||||
child->move (shift);
|
||||
}
|
||||
POLY_BLOCK::move(shift);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TEXT_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void TEXT_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->internal_serialise_asc (f);
|
||||
serialise_INT32 (f, background.val);
|
||||
text_regions.serialise_asc (f);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* TEXT_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void TEXT_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->de_serialise_asc (f);
|
||||
background.val = de_serialise_INT32 (f);
|
||||
text_regions.de_serialise_asc (f);
|
||||
}
|
||||
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void TEXT_BLOCK::plot(WINDOW window,
|
||||
COLOUR colour,
|
||||
COLOUR region_colour,
|
||||
COLOUR subregion_colour) {
|
||||
TEXT_REGION_IT t = &text_regions, tc;
|
||||
|
||||
PAGE_BLOCK::basic_plot(window, colour);
|
||||
|
||||
if (!t.empty ())
|
||||
for (t.mark_cycle_pt (); !t.cycled_list (); t.forward ()) {
|
||||
t.data ()->plot (window, region_colour, t.data ()->id_no ());
|
||||
tc.set_to_list (t.data ()->regions ());
|
||||
if (!tc.empty ())
|
||||
for (tc.mark_cycle_pt (); !tc.cycled_list (); tc.forward ())
|
||||
tc.data ()->plot (window, subregion_colour, -1);
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
|
||||
void TEXT_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
TEXT_REGION_IT it;
|
||||
|
||||
f->dprintf ("TEXT BLOCK\n");
|
||||
print_background(f, background);
|
||||
if (!text_regions.empty ()) {
|
||||
f->dprintf ("containing text regions:\n");
|
||||
it.set_to_list (&text_regions);
|
||||
for (it.mark_cycle_pt (); !it.cycled_list (); it.forward ())
|
||||
it.data ()->show_attrs (f);
|
||||
f->dprintf ("end of regions\n");
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
DLLSYM void show_all_tr_in(TEXT_BLOCK *tblock,
|
||||
POLY_BLOCK *show_area,
|
||||
DEBUG_WIN *f) {
|
||||
TEXT_REGION_IT t, tc;
|
||||
INT16 i, tnum, j, ttnum;
|
||||
|
||||
t.set_to_list (tblock->regions ());
|
||||
tnum = tblock->regions ()->length ();
|
||||
for (i = 0; i < tnum; i++, t.forward ()) {
|
||||
if (show_area->contains (t.data ()))
|
||||
t.data ()->show_attrs (f);
|
||||
else if (show_area->overlap (t.data ())) {
|
||||
tc.set_to_list (t.data ()->regions ());
|
||||
ttnum = t.data ()->regions ()->length ();
|
||||
for (j = 0; j < ttnum; j++, tc.forward ())
|
||||
if (show_area->contains (tc.data ()))
|
||||
tc.data ()->show_attrs (f);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void delete_all_tr_in(TEXT_BLOCK *tblock, POLY_BLOCK *delete_area) {
|
||||
TEXT_REGION_IT t, tc;
|
||||
INT16 i, tnum, j, ttnum;
|
||||
|
||||
t.set_to_list (tblock->regions ());
|
||||
tnum = tblock->regions ()->length ();
|
||||
for (i = 0; i < tnum; i++, t.forward ()) {
|
||||
if (delete_area->contains (t.data ()))
|
||||
delete (t.extract ());
|
||||
else if (delete_area->overlap (t.data ())) {
|
||||
tc.set_to_list (t.data ()->regions ());
|
||||
ttnum = t.data ()->regions ()->length ();
|
||||
for (j = 0; j < ttnum; j++, tc.forward ())
|
||||
if (delete_area->contains (tc.data ()))
|
||||
delete (tc.extract ());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
RULE_BLOCK::RULE_BLOCK (ICOORDELT_LIST * points, INT8 sing, INT8 colo):PAGE_BLOCK (points,
|
||||
PB_RULES) {
|
||||
multiplicity = sing;
|
||||
colour = colo;
|
||||
}
|
||||
|
||||
|
||||
void RULE_BLOCK::set_attrs(INT8 sing, INT8 colo) {
|
||||
multiplicity = sing;
|
||||
colour = colo;
|
||||
}
|
||||
|
||||
|
||||
void RULE_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
f->dprintf ("RULE BLOCK with attributes %s, %s\n",
|
||||
blabel[R_START][multiplicity], blabel[R_START + 1][colour]);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* RULE_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void RULE_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->internal_serialise_asc (f);
|
||||
serialise_INT32(f, multiplicity);
|
||||
serialise_INT32(f, colour);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* RULE_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void RULE_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->de_serialise_asc (f);
|
||||
multiplicity = de_serialise_INT32 (f);
|
||||
colour = de_serialise_INT32 (f);
|
||||
}
|
||||
|
||||
|
||||
GRAPHICS_BLOCK::GRAPHICS_BLOCK (ICOORDELT_LIST * points, BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg):PAGE_BLOCK (points,
|
||||
PB_GRAPHICS) {
|
||||
int
|
||||
i;
|
||||
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++)
|
||||
background.set_bit (i, backg[i]);
|
||||
|
||||
foreground = foreg;
|
||||
}
|
||||
|
||||
|
||||
void
|
||||
GRAPHICS_BLOCK::set_attrs (BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg) {
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++)
|
||||
background.set_bit (i, backg[i]);
|
||||
|
||||
foreground = foreg;
|
||||
}
|
||||
|
||||
|
||||
void GRAPHICS_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
f->dprintf ("GRAPHICS BLOCK with attribute %s\n",
|
||||
blabel[G_START][foreground]);
|
||||
print_background(f, background);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* GRAPHICS_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void GRAPHICS_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->internal_serialise_asc (f);
|
||||
serialise_INT32 (f, background.val);
|
||||
serialise_INT32(f, foreground);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* GRAPHICS_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void GRAPHICS_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->de_serialise_asc (f);
|
||||
background.val = de_serialise_INT32 (f);
|
||||
foreground = de_serialise_INT32 (f);
|
||||
}
|
||||
|
||||
|
||||
IMAGE_BLOCK::IMAGE_BLOCK (ICOORDELT_LIST * points, INT8 colo, INT8 qual):PAGE_BLOCK (points,
|
||||
PB_IMAGE) {
|
||||
colour = colo;
|
||||
quality = qual;
|
||||
}
|
||||
|
||||
|
||||
void IMAGE_BLOCK::set_attrs(INT8 colo, INT8 qual) {
|
||||
colour = colo;
|
||||
quality = qual;
|
||||
}
|
||||
|
||||
|
||||
void IMAGE_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
f->dprintf ("IMAGE BLOCK with attributes %s, %s\n", blabel[I_START][colour],
|
||||
blabel[I_START + 1][quality]);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* IMAGE_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void IMAGE_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->internal_serialise_asc (f);
|
||||
serialise_INT32(f, colour);
|
||||
serialise_INT32(f, quality);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* IMAGE_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void IMAGE_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->de_serialise_asc (f);
|
||||
colour = de_serialise_INT32 (f);
|
||||
quality = de_serialise_INT32 (f);
|
||||
}
|
||||
|
||||
|
||||
SCRIBBLE_BLOCK::SCRIBBLE_BLOCK (ICOORDELT_LIST * points, BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg):PAGE_BLOCK (points,
|
||||
PB_SCRIBBLE) {
|
||||
int
|
||||
i;
|
||||
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++)
|
||||
background.set_bit (i, backg[i]);
|
||||
|
||||
foreground = foreg;
|
||||
}
|
||||
|
||||
|
||||
void
|
||||
SCRIBBLE_BLOCK::set_attrs (BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg) {
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++)
|
||||
background.set_bit (i, backg[i]);
|
||||
|
||||
foreground = foreg;
|
||||
}
|
||||
|
||||
|
||||
void SCRIBBLE_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
f->dprintf ("SCRIBBLE BLOCK with attribute %s\n",
|
||||
blabel[S_START][foreground]);
|
||||
print_background(f, background);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* SCRIBBLE_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void SCRIBBLE_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->internal_serialise_asc (f);
|
||||
serialise_INT32 (f, background.val);
|
||||
serialise_INT32(f, foreground);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* SCRIBBLE_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void SCRIBBLE_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->de_serialise_asc (f);
|
||||
background.val = de_serialise_INT32 (f);
|
||||
foreground = de_serialise_INT32 (f);
|
||||
}
|
||||
|
||||
|
||||
WEIRD_BLOCK::WEIRD_BLOCK (ICOORDELT_LIST * points, INT32 id_no):PAGE_BLOCK (points,
|
||||
PB_WEIRD) {
|
||||
id_number = id_no;
|
||||
}
|
||||
|
||||
|
||||
#ifndef GRAPHICS_DISABLED
|
||||
void WEIRD_BLOCK::plot(WINDOW window, COLOUR colour) {
|
||||
PAGE_BLOCK_IT c = this->child ();
|
||||
|
||||
POLY_BLOCK::plot(window, colour, id_number);
|
||||
|
||||
if (!c.empty ())
|
||||
for (c.mark_cycle_pt (); !c.cycled_list (); c.forward ())
|
||||
c.data ()->plot (window, colour);
|
||||
}
|
||||
#endif
|
||||
|
||||
|
||||
void WEIRD_BLOCK::set_id(INT32 id_no) {
|
||||
id_number = id_no;
|
||||
}
|
||||
|
||||
|
||||
void WEIRD_BLOCK::show_attrs(DEBUG_WIN *f) {
|
||||
f->dprintf ("WEIRD BLOCK with id number %d\n", id_number);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* WEIRD_BLOCK::serialise_asc() Convert to ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void WEIRD_BLOCK::serialise_asc( //convert to ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->internal_serialise_asc (f);
|
||||
serialise_INT32(f, id_number);
|
||||
}
|
||||
|
||||
|
||||
/**********************************************************************
|
||||
* WEIRD_BLOCK::de_serialise_asc() Convert from ascii file.
|
||||
*
|
||||
**********************************************************************/
|
||||
|
||||
void WEIRD_BLOCK::de_serialise_asc( //convert from ascii
|
||||
FILE *f //file to use
|
||||
) {
|
||||
((PAGE_BLOCK *) this)->de_serialise_asc (f);
|
||||
id_number = de_serialise_INT32 (f);
|
||||
}
|
||||
|
||||
|
||||
void print_background(DEBUG_WIN *f, BITS16 background) {
|
||||
int i;
|
||||
|
||||
f->dprintf ("Background is \n");
|
||||
for (i = 0; i < NUM_BACKGROUNDS; i++) {
|
||||
if (background.bit (i))
|
||||
f->dprintf ("%s, ", backlabel[i]);
|
||||
}
|
||||
|
||||
f->dprintf ("\n");
|
||||
|
||||
}
|
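The tag-switched scheme used by PAGE_BLOCK::serialise_asc and PAGE_BLOCK::new_de_serialise_asc above (write the type tag first, then let a factory switch on it when reading) can be sketched in isolation. This is a minimal illustration under stated assumptions, not Tesseract code: Block, TextBlock, WeirdBlock, write_block and read_block are hypothetical names, std::iostream stands in for the FILE-based serialise_INT32 helpers, and virtual functions replace the explicit casts on pb_type.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <sstream>

// Hypothetical stand-ins for PB_TYPE and two PAGE_BLOCK subclasses.
enum BlockType { BT_TEXT = 0, BT_WEIRD = 1 };

struct Block {
  virtual ~Block() {}
  virtual BlockType type() const = 0;
  virtual void write_body(std::ostream &out) const = 0;
  virtual void read_body(std::istream &in) = 0;
};

struct TextBlock : Block {
  int background = 0;
  BlockType type() const override { return BT_TEXT; }
  void write_body(std::ostream &out) const override { out << background << '\n'; }
  void read_body(std::istream &in) override { in >> background; }
};

struct WeirdBlock : Block {
  int id_number = 0;
  BlockType type() const override { return BT_WEIRD; }
  void write_body(std::ostream &out) const override { out << id_number << '\n'; }
  void read_body(std::istream &in) override { in >> id_number; }
};

// Write the tag first so the reader knows which class to construct.
void write_block(std::ostream &out, const Block &b) {
  out << static_cast<int>(b.type()) << '\n';
  b.write_body(out);
}

// Read the tag, construct the matching class, then read its body.
// Returns null for an unknown tag or a failed read.
std::unique_ptr<Block> read_block(std::istream &in) {
  int tag = -1;
  if (!(in >> tag)) return nullptr;
  std::unique_ptr<Block> b;
  switch (tag) {
    case BT_TEXT:  b.reset(new TextBlock);  break;
    case BT_WEIRD: b.reset(new WeirdBlock); break;
    default:       return nullptr;
  }
  b->read_body(in);
  if (!in) return nullptr;
  return b;
}
```

Because the object is only constructed after the tag has been read, an element cannot be filled in place; the caller has to accept a freshly allocated object, which is the same constraint the PAGE_BLOCK code works under.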
318
ccstruct/pageblk.h
Normal file
@ -0,0 +1,318 @@
|
||||
#ifndef PAGEBLK_C
|
||||
#define PAGEBLK_C
|
||||
|
||||
#include "elst.h"
|
||||
#include "txtregn.h"
|
||||
#include "bits16.h"
|
||||
|
||||
#include "hpddef.h" //must be last (handpd.dll)
|
||||
|
||||
enum PB_TYPE
|
||||
{
|
||||
PB_TEXT,
|
||||
PB_RULES,
|
||||
PB_GRAPHICS,
|
||||
PB_IMAGE,
|
||||
PB_SCRIBBLE,
|
||||
PB_WEIRD
|
||||
};
|
||||
|
||||
class DLLSYM PAGE_BLOCK; //forward decl
|
||||
class DLLSYM TEXT_BLOCK; //forward decl
|
||||
class DLLSYM GRAPHICS_BLOCK; //forward decl
|
||||
class DLLSYM RULE_BLOCK; //forward decl
|
||||
class DLLSYM IMAGE_BLOCK; //forward decl
|
||||
class DLLSYM SCRIBBLE_BLOCK; //forward decl
|
||||
class DLLSYM WEIRD_BLOCK; //forward decl
|
||||
|
||||
ELISTIZEH_S (PAGE_BLOCK)
|
||||
class DLLSYM PAGE_BLOCK:public ELIST_LINK, public POLY_BLOCK
|
||||
//page block
|
||||
{
|
||||
public:
|
||||
PAGE_BLOCK() {
|
||||
} //empty constructor
|
||||
PAGE_BLOCK( //simple constructor
|
||||
ICOORDELT_LIST *points,
|
||||
PB_TYPE type,
|
||||
PAGE_BLOCK_LIST *child);
|
||||
|
||||
PAGE_BLOCK( //simple constructor
|
||||
ICOORDELT_LIST *points,
|
||||
PB_TYPE type);
|
||||
|
||||
~PAGE_BLOCK () { //destructor
|
||||
}
|
||||
|
||||
void add_a_child(PAGE_BLOCK *newchild);
|
||||
|
||||
PB_TYPE type() { //get type
|
||||
return pb_type;
|
||||
}
|
||||
|
||||
PAGE_BLOCK_LIST *child() { //get children
|
||||
return &children;
|
||||
}
|
||||
|
||||
void rotate( //rotate it
|
||||
FCOORD rotation);
|
||||
void move( //move it
|
||||
ICOORD shift); //vector
|
||||
|
||||
void basic_plot(WINDOW window, COLOUR colour);
|
||||
|
||||
void plot(WINDOW window, COLOUR colour);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
NEWDELETE2 (PAGE_BLOCK) void pb_delete ();
|
||||
|
||||
void serialise(FILE *f);
|
||||
|
||||
static PAGE_BLOCK *de_serialise(FILE *f);
|
||||
|
||||
void prep_serialise() { //set ptrs to counts
|
||||
POLY_BLOCK::prep_serialise();
|
||||
children.prep_serialise ();
|
||||
}
|
||||
|
||||
void dump( //write external bits
|
||||
FILE *f) {
|
||||
POLY_BLOCK::dump(f);
|
||||
children.dump (f);
|
||||
}
|
||||
|
||||
void de_dump( //read external bits
|
||||
FILE *f) {
|
||||
POLY_BLOCK::de_dump(f);
|
||||
children.de_dump (f);
|
||||
}
|
||||
|
||||
//note that due to the awful switched nature of the PAGE_BLOCK class,
|
||||
//a PAGE_BLOCK_LIST cannot be de-serialised by the normal mechanism, since
|
||||
//each element cannot be de-serialised in place.
|
||||
//To work around this, use read_poly_blocks or the code therein.
|
||||
void serialise_asc( //serialise to ascii
|
||||
FILE *f);
|
||||
void internal_serialise_asc( //serialise to ascii
|
||||
FILE *f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
//make one from ascii
|
||||
static PAGE_BLOCK *new_de_serialise_asc(FILE *f);
|
||||
|
||||
private:
|
||||
PB_TYPE pb_type;
|
||||
PAGE_BLOCK_LIST children;
|
||||
};
|
||||
|
||||
DLLSYM void show_all_in(PAGE_BLOCK *pblock,
|
||||
POLY_BLOCK *show_area,
|
||||
DEBUG_WIN *f);
|
||||
|
||||
DLLSYM void delete_all_in(PAGE_BLOCK *pblock, POLY_BLOCK *delete_area);
|
||||
|
||||
DLLSYM PAGE_BLOCK *smallest_containing(PAGE_BLOCK *pblock, POLY_BLOCK *other);
|
||||
|
||||
class DLLSYM TEXT_BLOCK:public PAGE_BLOCK
|
||||
//text block
|
||||
{
|
||||
public:
|
||||
TEXT_BLOCK() {
|
||||
} //empty constructor
|
||||
TEXT_BLOCK(ICOORDELT_LIST *points);
|
||||
|
||||
TEXT_BLOCK (ICOORDELT_LIST * points, BOOL8 backg[NUM_BACKGROUNDS]);
|
||||
|
||||
//get children
|
||||
TEXT_REGION_LIST *regions() {
|
||||
return &text_regions;
|
||||
}
|
||||
|
||||
INT32 nregions() {
|
||||
return text_regions.length ();
|
||||
}
|
||||
|
||||
void add_a_region(TEXT_REGION *newchild);
|
||||
|
||||
void rotate( //rotate it
|
||||
FCOORD rotation);
|
||||
void move( //move it
|
||||
ICOORD shift); //vector
|
||||
|
||||
void plot(WINDOW window,
|
||||
COLOUR colour,
|
||||
COLOUR region_colour,
|
||||
COLOUR subregion_colour);
|
||||
|
||||
void set_attrs (BOOL8 backg[NUM_BACKGROUNDS]);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
void prep_serialise() { //set ptrs to counts
|
||||
PAGE_BLOCK::prep_serialise();
|
||||
text_regions.prep_serialise ();
|
||||
}
|
||||
|
||||
void dump( //write external bits
|
||||
FILE *f) {
|
||||
PAGE_BLOCK::dump(f);
|
||||
text_regions.dump (f);
|
||||
}
|
||||
|
||||
void de_dump( //read external bits
|
||||
FILE *f) {
|
||||
PAGE_BLOCK::de_dump(f);
|
||||
text_regions.de_dump (f);
|
||||
}
|
||||
|
||||
//serialise to ascii
|
||||
make_serialise (TEXT_BLOCK) void serialise_asc (
|
||||
FILE * f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
|
||||
private:
|
||||
BITS16 background;
|
||||
|
||||
TEXT_REGION_LIST text_regions;
|
||||
};
|
||||
|
||||
DLLSYM void delete_all_tr_in(TEXT_BLOCK *tblock, POLY_BLOCK *delete_area);
|
||||
|
||||
DLLSYM void show_all_tr_in(TEXT_BLOCK *tblock,
|
||||
POLY_BLOCK *show_area,
|
||||
DEBUG_WIN *f);
|
||||
|
||||
class DLLSYM RULE_BLOCK:public PAGE_BLOCK
|
||||
//rule block
|
||||
{
|
||||
public:
|
||||
RULE_BLOCK() {
|
||||
} //empty constructor
|
||||
RULE_BLOCK(ICOORDELT_LIST *points, INT8 sing, INT8 colo);
|
||||
|
||||
void set_attrs(INT8 sing, INT8 colo);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
//serialise to ascii
|
||||
make_serialise (RULE_BLOCK) void serialise_asc (
|
||||
FILE * f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
|
||||
private:
|
||||
INT8 multiplicity;
|
||||
INT8 colour;
|
||||
|
||||
};
|
||||
|
||||
class DLLSYM GRAPHICS_BLOCK:public PAGE_BLOCK
|
||||
//graphics block
|
||||
{
|
||||
public:
|
||||
GRAPHICS_BLOCK() {
|
||||
} //empty constructor
|
||||
GRAPHICS_BLOCK (ICOORDELT_LIST * points,
|
||||
BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg);
|
||||
|
||||
void set_attrs (BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
//serialise to ascii
|
||||
make_serialise (GRAPHICS_BLOCK) void serialise_asc (
|
||||
FILE * f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
|
||||
private:
|
||||
BITS16 background;
|
||||
INT8 foreground;
|
||||
|
||||
};
|
||||
|
||||
class DLLSYM IMAGE_BLOCK:public PAGE_BLOCK
|
||||
//image block
|
||||
{
|
||||
public:
|
||||
IMAGE_BLOCK() {
|
||||
} //empty constructor
|
||||
IMAGE_BLOCK(ICOORDELT_LIST *points, INT8 colo, INT8 qual);
|
||||
|
||||
void set_attrs(INT8 colo, INT8 qual);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
//serialise to ascii
|
||||
make_serialise (IMAGE_BLOCK) void serialise_asc (
|
||||
FILE * f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
|
||||
private:
|
||||
INT8 colour;
|
||||
INT8 quality;
|
||||
|
||||
};
|
||||
|
||||
class DLLSYM SCRIBBLE_BLOCK:public PAGE_BLOCK
|
||||
//scribble block
|
||||
{
|
||||
public:
|
||||
SCRIBBLE_BLOCK() {
|
||||
} //empty constructor
|
||||
SCRIBBLE_BLOCK (ICOORDELT_LIST * points,
|
||||
BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg);
|
||||
|
||||
void set_attrs (BOOL8 backg[NUM_BACKGROUNDS], INT8 foreg);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
//serialise to ascii
|
||||
make_serialise (SCRIBBLE_BLOCK) void serialise_asc (
|
||||
FILE * f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
|
||||
private:
|
||||
BITS16 background;
|
||||
INT8 foreground;
|
||||
};
|
||||
|
||||
class DLLSYM WEIRD_BLOCK:public PAGE_BLOCK
|
||||
//weird block
|
||||
{
|
||||
public:
|
||||
WEIRD_BLOCK() {
|
||||
} //empty constructor
|
||||
WEIRD_BLOCK(ICOORDELT_LIST *points, INT32 id_no);
|
||||
|
||||
void set_id(INT32 id_no);
|
||||
|
||||
void show_attrs(DEBUG_WIN *debug);
|
||||
|
||||
void set_id_no(INT32 new_id) {
|
||||
id_number = new_id;
|
||||
}
|
||||
|
||||
void plot(WINDOW window, COLOUR colour);
|
||||
|
||||
INT32 id_no() {
|
||||
return id_number;
|
||||
}
|
||||
|
||||
//serialise to ascii
|
||||
make_serialise (WEIRD_BLOCK) void serialise_asc (
|
||||
FILE * f);
|
||||
void de_serialise_asc( //de-serialise from ascii
|
||||
FILE *f);
|
||||
|
||||
private:
|
||||
INT32 id_number; //unique id
|
||||
|
||||
};
|
||||
|
||||
void print_background(DEBUG_WIN *f, BITS16 background);
|
||||
#endif
|
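The comment in PAGE_BLOCK about lists not being de-serialisable in place corresponds to the loop in PAGE_BLOCK::de_serialise_asc in pageblk.cpp: read a length, then build that many children one at a time. A minimal sketch of that count-prefixed, recursive layout follows. It is an illustration only: Node, write_node and read_node are hypothetical names, a std::vector replaces the ELIST-based PAGE_BLOCK_LIST, and streams replace FILE.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical tree node; stands in for a PAGE_BLOCK with a children list.
struct Node {
  int id = 0;
  std::vector<std::unique_ptr<Node>> children;
};

// Write the node's own field, then the child count, then each child in turn.
void write_node(std::ostream &out, const Node &n) {
  out << n.id << '\n' << n.children.size() << '\n';
  for (const auto &c : n.children) write_node(out, *c);
}

// Mirror of write_node: read the count, then construct that many children.
// Returns null if the stream ends early or holds a negative count.
std::unique_ptr<Node> read_node(std::istream &in) {
  std::unique_ptr<Node> n(new Node);
  long len = 0;
  if (!(in >> n->id >> len) || len < 0) return nullptr;
  for (; len > 0; len--) {
    std::unique_ptr<Node> child = read_node(in);
    if (!child) return nullptr;          // truncated or malformed input
    n->children.push_back(std::move(child));
  }
  return n;
}
```

Unlike the original loop, this sketch checks every read and gives up on truncated input; whether the real code should do the same depends on how de_serialise_INT32 reports errors, which is not shown here.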
325
ccstruct/pageres.cpp
Normal file
@ -0,0 +1,325 @@
|
||||
/**********************************************************************
 * File:        pageres.cpp  (Formerly page_res.c)
 * Description: Results classes used by control.c
 * Author:      Phil Cheatle
 * Created:     Tue Sep 22 08:42:49 BST 1992
 *
 * (C) Copyright 1992, Hewlett-Packard Ltd.
 ** Licensed under the Apache License, Version 2.0 (the "License");
 ** you may not use this file except in compliance with the License.
 ** You may obtain a copy of the License at
 ** http://www.apache.org/licenses/LICENSE-2.0
 ** Unless required by applicable law or agreed to in writing, software
 ** distributed under the License is distributed on an "AS IS" BASIS,
 ** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 ** See the License for the specific language governing permissions and
 ** limitations under the License.
 *
 **********************************************************************/
#include "mfcpch.h"
#include <stdlib.h>
#ifdef __UNIX__
#include <assert.h>
#endif
#include "pageres.h"
#include "notdll.h"

ELISTIZE (BLOCK_RES)
CLISTIZE (BLOCK_RES) ELISTIZE (ROW_RES) ELISTIZE (WERD_RES)
/*************************************************************************
 * PAGE_RES::PAGE_RES
 *
 * Constructor for page results
 *************************************************************************/
PAGE_RES::PAGE_RES(                            //recursive construct
                   BLOCK_LIST *the_block_list  //real page
                  ) {
  BLOCK_IT block_it(the_block_list);
  BLOCK_RES_IT block_res_it(&block_res_list);

  char_count = 0;
  rej_count = 0;
  rejected = FALSE;

  for (block_it.mark_cycle_pt ();
  !block_it.cycled_list (); block_it.forward ()) {
    block_res_it.add_to_end (new BLOCK_RES (block_it.data ()));
  }
}


/*************************************************************************
 * BLOCK_RES::BLOCK_RES
 *
 * Constructor for BLOCK results
 *************************************************************************/

BLOCK_RES::BLOCK_RES(                   //recursive construct
                     BLOCK *the_block   //real BLOCK
                    ) {
  ROW_IT row_it (the_block->row_list ());
  ROW_RES_IT row_res_it(&row_res_list);

  char_count = 0;
  rej_count = 0;
  font_class = -1;               //not assigned
  x_height = -1.0;
  font_assigned = FALSE;
  bold = FALSE;
  italic = FALSE;
  row_count = 0;

  block = the_block;

  for (row_it.mark_cycle_pt (); !row_it.cycled_list (); row_it.forward ()) {
    row_res_it.add_to_end (new ROW_RES (row_it.data ()));
  }
}


/*************************************************************************
 * ROW_RES::ROW_RES
 *
 * Constructor for ROW results
 *************************************************************************/

ROW_RES::ROW_RES(                //recursive construct
                 ROW *the_row    //real ROW
                ) {
  WERD_IT word_it (the_row->word_list ());
  WERD_RES_IT word_res_it(&word_res_list);
  WERD_RES *combo = NULL;        //current combination of fuzzies
  WERD_RES *word_res;            //current word
  WERD *copy_word;

  char_count = 0;
  rej_count = 0;
  whole_word_rej_count = 0;
  font_class = -1;
  font_class_score = -1.0;
  bold = FALSE;
  italic = FALSE;

  row = the_row;

  for (word_it.mark_cycle_pt (); !word_it.cycled_list (); word_it.forward ()) {
    word_res = new WERD_RES (word_it.data ());

    if (word_res->word->flag (W_FUZZY_NON)) {
      ASSERT_HOST (combo != NULL);
      word_res->part_of_combo = TRUE;
      combo->copy_on (word_res);
    }
    if (word_it.data_relative (1)->flag (W_FUZZY_NON)) {
      if (combo == NULL) {
        copy_word = new WERD;
                                 //deep copy
        *copy_word = *(word_it.data ());
        combo = new WERD_RES (copy_word);
        combo->combination = TRUE;
        word_res_it.add_to_end (combo);
      }
      word_res->part_of_combo = TRUE;
    }
    else
      combo = NULL;
    word_res_it.add_to_end (word_res);
  }
}
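
The grouping that the ROW_RES constructor performs on fuzzy words can be sketched in isolation. The following is a minimal illustration only, not the Tesseract code: `SketchWord`, `SketchResult` and `BuildRowResults` are hypothetical names, standard containers stand in for the ELIST types, and string concatenation stands in for `copy_on`. Unlike the original, which looks ahead with `data_relative (1)` on its own list type, this sketch simply stops looking ahead at the end of the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified word: its text plus a flag meaning
// "joined to the previous word by a fuzzy non-space".
struct SketchWord {
  std::string text;
  bool fuzzy_non;
};

// Hypothetical simplified result entry.
struct SketchResult {
  std::string text;
  bool combination;    // this entry is the merged word
  bool part_of_combo;  // this entry is a member of a merged word
};

// Emits one combination entry ahead of each run of fuzzy-joined words,
// followed by the individual members flagged as part_of_combo.
std::vector<SketchResult> BuildRowResults(const std::vector<SketchWord>& words) {
  const std::size_t kNoCombo = static_cast<std::size_t>(-1);
  std::vector<SketchResult> out;
  std::size_t combo = kNoCombo;  // index in out of the open combination
  for (std::size_t i = 0; i < words.size(); ++i) {
    SketchResult res{words[i].text, false, false};
    if (words[i].fuzzy_non) {
      assert(combo != kNoCombo);    // a fuzzy word needs a predecessor
      res.part_of_combo = true;
      out[combo].text += res.text;  // stand-in for copy_on
    }
    const bool next_is_fuzzy =
        i + 1 < words.size() && words[i + 1].fuzzy_non;
    if (next_is_fuzzy) {
      if (combo == kNoCombo) {
        // Start a new combination seeded with a copy of this word.
        out.push_back(SketchResult{res.text, true, false});
        combo = out.size() - 1;
      }
      res.part_of_combo = true;
    } else {
      combo = kNoCombo;  // the run ends with this word
    }
    out.push_back(res);
  }
  return out;
}
```

For the input words "ab", "cd" (fuzzy) and "ef", the sketch yields four entries: a combination holding "abcd", the two flagged members, and the untouched "ef".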


WERD_RES & WERD_RES::operator= ( //assign word_res
const WERD_RES & source          //from this
) {
  this->ELIST_LINK::operator= (source);
  if (source.combination) {
    word = new WERD;
    *word = *(source.word);      //deep copy
  }
  else
    word = source.word;          //pt to same word

  if (source.outword != NULL) {
    outword = new WERD;
    *outword = *(source.outword);//deep copy
  }
  else
    outword = NULL;

  denorm = source.denorm;
  if (source.best_choice != NULL) {
    best_choice = new WERD_CHOICE;
    *best_choice = *(source.best_choice);
    raw_choice = new WERD_CHOICE;
    *raw_choice = *(source.raw_choice);
  }
  else {
    best_choice = NULL;
    raw_choice = NULL;
  }
  if (source.ep_choice != NULL) {
    ep_choice = new WERD_CHOICE;
    *ep_choice = *(source.ep_choice);
  }
  else
    ep_choice = NULL;
  reject_map = source.reject_map;
  tess_failed = source.tess_failed;
  tess_accepted = source.tess_accepted;
  tess_would_adapt = source.tess_would_adapt;
  done = source.done;
  unlv_crunch_mode = source.unlv_crunch_mode;
  italic = source.italic;
  bold = source.bold;
  font1 = source.font1;
  font1_count = source.font1_count;
  font2 = source.font2;
  font2_count = source.font2_count;
  x_height = source.x_height;
  caps_height = source.caps_height;
  guessed_x_ht = source.guessed_x_ht;
  guessed_caps_ht = source.guessed_caps_ht;
  combination = source.combination;
  part_of_combo = source.part_of_combo;
  reject_spaces = source.reject_spaces;
  return *this;
}


WERD_RES::~WERD_RES () {
  if (combination)
    delete word;
  if (outword != NULL)
    delete outword;
  if (best_choice != NULL) {
    delete best_choice;
    delete raw_choice;
  }
  if (ep_choice != NULL) {
    delete ep_choice;
  }
}


/*************************************************************************
 * PAGE_RES_IT::restart_page
 *
 * Set things up at the start of the page
 *************************************************************************/

WERD_RES *PAGE_RES_IT::restart_page() {
  block_res_it.set_to_list (&page_res->block_res_list);
  block_res_it.mark_cycle_pt ();
  block_res = NULL;
  row_res = NULL;
  word_res = NULL;
  next_block_res = NULL;
  next_row_res = NULL;
  next_word_res = NULL;
  internal_forward(TRUE);
  return internal_forward (FALSE);
}


/*************************************************************************
 * PAGE_RES_IT::internal_forward
 *
 * Find the next word on the page. Empty blocks and rows are skipped.
 * The iterator maintains pointers to block, row and word for the previous,
 * current and next words.  These are correct, regardless of block/row
 * boundaries. NULL values denote start and end of the page.
 *************************************************************************/

WERD_RES *PAGE_RES_IT::internal_forward(BOOL8 new_block) {
  BOOL8 found_next_word = FALSE;
  BOOL8 new_row = FALSE;

  prev_block_res = block_res;
  prev_row_res = row_res;
  prev_word_res = word_res;
  block_res = next_block_res;
  row_res = next_row_res;
  word_res = next_word_res;

  while (!found_next_word && !block_res_it.cycled_list ()) {
    if (new_block) {
      new_block = FALSE;
      row_res_it.set_to_list (&block_res_it.data ()->row_res_list);
      row_res_it.mark_cycle_pt ();
      new_row = TRUE;
    }
    while (!found_next_word && !row_res_it.cycled_list ()) {
      if (new_row) {
        new_row = FALSE;
        word_res_it.set_to_list (&row_res_it.data ()->word_res_list);
        word_res_it.mark_cycle_pt ();
      }
      while (!found_next_word && !word_res_it.cycled_list ()) {
        next_block_res = block_res_it.data ();
        next_row_res = row_res_it.data ();
        next_word_res = word_res_it.data ();
        found_next_word = TRUE;
        do {
          word_res_it.forward ();
        }
        while (word_res_it.data ()->part_of_combo);
      }
      if (!found_next_word) {    //end of row reached
        row_res_it.forward ();
        new_row = TRUE;
      }
    }
    if (!found_next_word) {      //end of block reached
      block_res_it.forward ();
      new_block = TRUE;
    }
  }
  if (!found_next_word) {        //end of page reached
    next_block_res = NULL;
    next_row_res = NULL;
    next_word_res = NULL;
  }
  return word_res;
}
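
The previous/current/next window that internal_forward maintains over a block/row/word hierarchy can be shown with a much smaller sketch. This is an assumption-laden illustration, not the PAGE_RES_IT implementation: `WordWalker` and the `Row`/`Block`/`Page` aliases are hypothetical, nested `std::vector`s replace the ELIST iterators, and the `part_of_combo` skipping is left out. It does mirror two ideas from the function above: empty blocks and rows are skipped, and the constructor advances twice (as restart_page does) so that the first word ends up in the current slot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the page / block / row / word hierarchy.
using Row = std::vector<std::string>;
using Block = std::vector<Row>;
using Page = std::vector<Block>;

// Walks every word on a page while tracking previous, current and next.
// The page must outlive the walker.
class WordWalker {
 public:
  explicit WordWalker(const Page& page) : page_(page) {
    Advance();  // first call only primes next_
    Advance();  // second call moves the first word into current_
  }

  const std::string* prev() const { return prev_; }
  const std::string* word() const { return current_; }
  const std::string* next() const { return next_; }

  // Shifts the window by one word; returns the new current word,
  // or nullptr once the end of the page has been passed.
  const std::string* Forward() {
    Advance();
    return current_;
  }

 private:
  void Advance() {
    prev_ = current_;
    current_ = next_;
    next_ = nullptr;  // stays null if the page is exhausted
    while (b_ < page_.size()) {
      const Block& block = page_[b_];
      while (r_ < block.size()) {
        const Row& row = block[r_];
        if (w_ < row.size()) {
          next_ = &row[w_++];
          return;
        }
        ++r_;  // end of row reached (or row was empty)
        w_ = 0;
      }
      ++b_;  // end of block reached (or block was empty)
      r_ = 0;
      w_ = 0;
    }
  }

  const Page& page_;
  std::size_t b_ = 0, r_ = 0, w_ = 0;
  const std::string* prev_ = nullptr;
  const std::string* current_ = nullptr;
  const std::string* next_ = nullptr;
};
```

A caller would typically loop with `for (const std::string* w = walker.word(); w != nullptr; w = walker.Forward())`, reading `prev()` and `next()` as needed; a null pointer in either marks the start or end of the page.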


/*************************************************************************
 * PAGE_RES_IT::forward_block
 *
 * Move to the first word of the next block
 * Can be followed by subsequent calls to forward() BUT at the first word in
 * the block, the prev block, row and word are all NULL.
 *************************************************************************/

WERD_RES *PAGE_RES_IT::forward_block() {
  if (block_res == next_block_res) {
    block_res_it.forward ();
    block_res = NULL;
    row_res = NULL;
    word_res = NULL;
    next_block_res = NULL;
    next_row_res = NULL;
    next_word_res = NULL;
    internal_forward(TRUE);
  }
  return internal_forward (FALSE);
}


void PAGE_RES_IT::rej_stat_word() {
  INT16 chars_in_word;
  INT16 rejects_in_word = 0;

  chars_in_word = word_res->reject_map.length ();
  page_res->char_count += chars_in_word;
  block_res->char_count += chars_in_word;
  row_res->char_count += chars_in_word;

  rejects_in_word = word_res->reject_map.reject_count ();

  page_res->rej_count += rejects_in_word;
  block_res->rej_count += rejects_in_word;
  row_res->rej_count += rejects_in_word;
  if (chars_in_word == rejects_in_word)
    row_res->whole_word_rej_count += rejects_in_word;
}
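
The bookkeeping in rej_stat_word can be tried out without the Tesseract types. The sketch below is illustrative only: `Tally` and `AddWord` are hypothetical names, and a `std::vector<bool>` (true meaning "rejected") is an assumed stand-in for the reject map. As in the function above, the character and reject counts go to the page, block and row totals, while the whole-word count is kept on the row only.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical per-level counters (page, block or row).
struct Tally {
  int char_count = 0;
  int rej_count = 0;
  int whole_word_rej_count = 0;  // only updated on the row level here
};

// Adds one word's statistics to all three levels.
// rejected[i] is true when character i of the word was rejected.
void AddWord(const std::vector<bool>& rejected,
             Tally* page, Tally* block, Tally* row) {
  const int chars = static_cast<int>(rejected.size());
  const int rejects = static_cast<int>(
      std::count(rejected.begin(), rejected.end(), true));
  for (Tally* level : {page, block, row}) {
    level->char_count += chars;
    level->rej_count += rejects;
  }
  // A word whose every character was rejected counts as a whole-word reject.
  if (chars == rejects)
    row->whole_word_rej_count += rejects;
}
```

Feeding in a three-character word with two rejects and then a fully rejected two-character word leaves each level with 5 characters and 4 rejects, and the row with a whole-word reject count of 2.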
Some files were not shown because too many files have changed in this diff