tesseract/doc/ambiguous_words.1.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<?asciidoc-toc?>
<?asciidoc-numbered?>
<refentry lang="en">
<refentryinfo>
    <title>AMBIGUOUS_WORDS(1)</title>
</refentryinfo>
<refmeta>
<refentrytitle>ambiguous_words</refentrytitle>
<manvolnum>1</manvolnum>
<refmiscinfo class="source">&#160;</refmiscinfo>
<refmiscinfo class="manual">&#160;</refmiscinfo>
</refmeta>
<refnamediv>
    <refname>ambiguous_words</refname>
    <refpurpose>generate sets of words Tesseract is likely to find ambiguous</refpurpose>
</refnamediv>
<refsynopsisdiv id="_synopsis">
<simpara><emphasis role="strong">ambiguous_words</emphasis> [-l lang] <emphasis>TESSDATADIR</emphasis> <emphasis>WORDLIST</emphasis> <emphasis>AMBIGUOUSFILE</emphasis></simpara>
</refsynopsisdiv>
<refsect1 id="_description">
<title>DESCRIPTION</title>
<simpara>ambiguous_words(1) runs Tesseract in a special mode and, for each word
in <emphasis>WORDLIST</emphasis>, produces the set of words that Tesseract considers
potentially ambiguous with it. <emphasis>TESSDATADIR</emphasis> must be the absolute path of
a directory containing <emphasis>tessdata/lang.traineddata</emphasis>.</simpara>
</refsect1>
<refsect1 id="_see_also">
<title>SEE ALSO</title>
<simpara>tesseract(1)</simpara>
</refsect1>
<refsect1 id="_copying">
<title>COPYING</title>
<simpara>Copyright (C) 2012 Google, Inc.
Licensed under the Apache License, Version 2.0.</simpara>
</refsect1>
<refsect1 id="_author">
<title>AUTHOR</title>
<simpara>The Tesseract OCR engine was written by Ray Smith and his research groups
at Hewlett-Packard (1985-1995) and Google (2006-present).</simpara>
</refsect1>
</refentry>