shapeclustering(1) takes extracted feature \&.tr files (generated from box files by tesseract(1) run in a special mode) and produces a \fBshapetable\fR file and an enhanced unicharset\&. This program is still experimental and is not (yet) required for training Tesseract\&.
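.PP
As a rough sketch, an invocation that uses only the options documented in this page might look like the following\&. The file and directory names (unicharset, font_properties, out, eng\&.unicharset, eng\&.arial\&.exp0\&.tr) are hypothetical placeholders, not names the program requires:
.sp
.if n \{\
.RS 4
.\}
.nf
shapeclustering \-U unicharset \-F font_properties \-D out \-O eng\&.unicharset eng\&.arial\&.exp0\&.tr
.fi
.if n \{\
.RE
.\}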
.SH "OPTIONS"
.PP
\-U \fIFILE\fR
.RS 4
The unicharset generated by unicharset_extractor(1)\&.
.RE
.PP
\-D \fIdir\fR
.RS 4
Directory to write output files to\&.
.RE
.PP
\-F \fIfont_properties_file\fR
.RS 4
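(Input) font properties file\&. Each line has the following form, where every field other than the font name is 0 or 1:
.sp
.if n \{\
.RS 4
.\}
.nf
\*(Aqfont_name\*(Aq \*(Aqitalic\*(Aq \*(Aqbold\*(Aq \*(Aqfixed_pitch\*(Aq \*(Aqserif\*(Aq \*(Aqfraktur\*(Aq
.fi
.if n \{\
.RE
.\}
.RE
.PP
\-X \fIxheights_file\fR
.RS 4
(Input) x heights file\&. Each line has the following form, where xheight is the x height, in pixels, of a character drawn at 32pt on 300 dpi\&. [ That is, if base x height + ascenders + descenders = 133, how much is x height? ]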
.sp
.if n \{\
.RS 4
.\}
.nf
\*(Aqfont_name\*(Aq \*(Aqxheight\*(Aq
.fi
.if n \{\
.RE
.\}
.RE
.PP
\-O \fIFILE\fR
.RS 4
The output unicharset that will be given to combine_tessdata(1)\&.