Intelligent Character Recognition

queXF ICR Process

Since version 1.12.0, queXF has included an Intelligent Character Recognition (ICR) process for detecting isolated handwritten characters. The process is available for testing, but it has not yet been optimised and may run slowly.

The ICR process is broken into the following seven steps:

  1. Character isolation
  2. Noise reduction
  3. Boundary removal
  4. Normalising
  5. Thinning
  6. Feature extraction
  7. Training or Recognition

Character isolation

queXF already provides character isolation, because it treats each “box” on a form as an independent entity. Where text is entered into an individual character or number field, queXF knows the coordinates of the box on the page. Using the page edges, queXF determines the rotation, zoom (scale) and offset of the scanned page and applies these transformations to the box coordinates. It then extracts the character image from the page image.
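As a rough illustration of the coordinate mapping described above, the following Python sketch applies a rotation, scale and offset to the corners of a template box. The function names, the order of operations (rotate about the origin, then scale, then translate) and the (left, top, right, bottom) box layout are assumptions made for this example and are not taken from the queXF source.

```python
import math


def transform_point(x, y, angle_deg, scale, offset_x, offset_y):
    """Rotate a template coordinate about the origin, scale it, then translate it."""
    a = math.radians(angle_deg)
    xr = x * math.cos(a) - y * math.sin(a)
    yr = x * math.sin(a) + y * math.cos(a)
    return (xr * scale + offset_x, yr * scale + offset_y)


def transform_box(box, angle_deg, scale, offset_x, offset_y):
    """Map a (left, top, right, bottom) box onto the scanned page.

    Returns the axis-aligned rectangle (in whole pixels) that encloses
    the four transformed corners.
    """
    left, top, right, bottom = box
    corners = [(left, top), (right, top), (right, bottom), (left, bottom)]
    pts = [transform_point(x, y, angle_deg, scale, offset_x, offset_y)
           for x, y in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))
```

Taking the enclosing rectangle of the rotated corners keeps the crop axis-aligned, at the cost of including a small margin when the page is skewed.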

Original box location overlay on scanned form:

Box location after rotation, zoom and offset:

Examples of extracted original images of the handwritten letter A:

Noise reduction

Given the isolated character image, queXF then removes “salt and pepper” noise (usually introduced via scanning) using the kFill algorithm.

See: the kfill_modified function in functions/functions.ocr.php, called with a k value of 5

The modified kFill algorithm is described in: K. Chinnasarn, Y. Rangsanseri, P. Thitimajshima: Removing Salt-and-Pepper Noise in Text/Graphics Images. Proceedings of The Asia-Pacific Conference on Circuits and Systems (APCCAS'98), pp. 459-462, 1998
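kFill itself uses a k x k window with connectivity tests and is not reproduced here. To illustrate the general idea of removing “salt and pepper” noise, the sketch below uses a deliberately simpler filter: it clears any filled pixel that has no filled neighbour among the 8 surrounding pixels. This is not the kFill algorithm and not the queXF implementation; the function name and the list-of-rows image format (1 = filled, 0 = empty) are assumptions made for the example.

```python
def remove_isolated_pixels(image):
    """Clear filled pixels that have no filled 8-neighbour.

    image is a list of rows containing 1 (filled) or 0 (empty).
    A new image is returned; the input is left unchanged.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    # Pixels outside the image count as empty.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        neighbours += image[rr][cc]
            if neighbours == 0:
                out[r][c] = 0
    return out
```

Unlike kFill, this filter only removes single-pixel specks and does not fill single-pixel holes inside strokes.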

Examples of the characters after noise reduction. Notice the effect on the dots on the left hand side of the third character:

Boundary removal

The BOX_EDGE value in config.default.php is used to strip the expected box outline from each character, but a poor scan may still leave unwanted filled pixels at the box edge. If non-character data such as “salt and pepper” noise or edge lines remains, image normalisation will fail, because the detected bounding box will include the noise. queXF therefore runs a boundary noise removal function.

See: remove_boundary_noise in functions/functions.ocr.php

This function implements part 4, "Noise Cleaning along Character Image Boundaries", of: Preprocessing and Image Enhancement Algorithms for a Form-based Intelligent Character Recognition System, Dipti Deodhare, NNR Ranga Suri and R Amit. International Journal of Computer Science & Applications Vol. II, No. II pp. 131-144

Examples of boundary removal. Notice the bottom line on the second character has been removed, along with the dots on the left hand edge of the third character:

Normalising (resizing)

The whitespace around the character is then discarded by detecting the character's bounding box, and the image within the bounding box is resized to a standard size (queXF uses 44x34 pixels).

See: resize_bounding in functions/functions.ocr.php
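The crop-and-resize step can be sketched in Python as follows. This is an illustration only: the function names, the use of nearest-neighbour sampling, and the reading of 44x34 as 44 rows by 34 columns are assumptions, and resize_bounding in queXF may differ on each point.

```python
def bounding_box(image):
    """Return (top, left, bottom, right) of the filled pixels, or None if blank."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return None
    return (rows[0], cols[0], rows[-1], cols[-1])


def crop_and_resize(image, out_rows=44, out_cols=34):
    """Crop to the bounding box, then stretch to out_rows x out_cols.

    Nearest-neighbour sampling is used; the 44 x 34 orientation is an
    assumption made for this example.
    """
    box = bounding_box(image)
    if box is None:
        return [[0] * out_cols for _ in range(out_rows)]
    top, left, bottom, right = box
    in_rows = bottom - top + 1
    in_cols = right - left + 1
    out = []
    for r in range(out_rows):
        src_r = top + (r * in_rows) // out_rows
        out.append([image[src_r][left + (c * in_cols) // out_cols]
                    for c in range(out_cols)])
    return out
```

Because the bounding box is computed from every filled pixel, a single stray pixel near the edge would shrink the character within the output, which is why the earlier noise and boundary cleaning steps matter.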

Examples of resizing. Notice that, because of the noise reduction and boundary removal, the entire character fits within the resized box:

Thinning

Thinning is the process of reducing an image to a skeleton that is 1 pixel wide. This removes the effect of pen stroke “thickness” on character recognition, so that only the shape is left. queXF uses the Zhang-Suen algorithm to thin the images.

See: thinzs_np in functions/functions.ocr.php

The thinning function was ported from analysis.c, an implementation of the algorithm described in: T. Y. Zhang, C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Communications of the ACM, v.27 n.3, p.236-239, March 1984
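For readers unfamiliar with Zhang-Suen thinning, a compact Python sketch of the published two-pass algorithm follows. It is written for this page as an illustration and is not a translation of thinzs_np; the function name and the list-of-rows image format (1 = filled, 0 = empty) are assumptions.

```python
def zhang_suen_thin(image):
    """Thin a binary image with the two sub-iteration Zhang-Suen scheme."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Pixels outside the image are treated as empty.
        return img[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    def neighbours(r, c):
        # P2..P9, clockwise starting from the pixel directly above.
        return [px(r - 1, c), px(r - 1, c + 1), px(r, c + 1), px(r + 1, c + 1),
                px(r + 1, c), px(r + 1, c - 1), px(r, c - 1), px(r - 1, c - 1)]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(rows):
                for c in range(cols):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    b = sum(n)  # number of filled neighbours
                    # a = number of 0 -> 1 transitions around the ring
                    a = sum(1 for i in range(8)
                            if n[i] == 0 and n[(i + 1) % 8] == 1)
                    if not (2 <= b <= 6 and a == 1):
                        continue
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if ok:
                        to_clear.append((r, c))
            # Deletions are applied together after each sub-iteration.
            for r, c in to_clear:
                img[r][c] = 0
            if to_clear:
                changed = True
    return img
```

Collecting the pixels to delete and clearing them only at the end of each sub-iteration is what makes the algorithm "parallel": every decision in a pass is based on the same snapshot of the image.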

Examples of thinning applied to the resized characters:

Overview of image manipulation before feature extraction

Overview of image manipulation before feature extraction for ICR

Feature extraction

This is the process of identifying pertinent information in the image that can be used for training or recognition.

queXF calculates 16 features for each image:

  • Split the image into 30 degree sectors around the centroid (12 sectors)
  • For each sector, calculate the normalised vector distance: the sum of the distances between each filled pixel in the sector and the centroid, divided by the number of filled pixels in the sector. These become 12 “features” of the character image
  • Split the image into 90 degree sectors around the centroid (4 sectors)
  • For each sector, calculate the proportion of filled pixels in the sector relative to the entire image. These become 4 more “features” of the image

See: sector_distance in functions/functions.ocr.php

The algorithm for feature extraction is described in Section 2 of "Hand printed Character Recognition using Neural Networks" by Vamsi K. Madasu, Brian C. Lovell and M. Hanmandlu
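The 16 features listed above can be computed along the following lines. This Python sketch is an illustration rather than a port of sector_distance: the function name, the choice of where sector 0 starts (the positive x axis, with angles measured in image coordinates where rows increase downwards) and the value 0 for sectors with no filled pixels are all assumptions.

```python
import math


def sector_features(image):
    """Return 12 mean centroid distances (30 degree sectors) followed by
    4 filled-pixel proportions (90 degree sectors)."""
    pixels = [(r, c) for r, row in enumerate(image)
              for c, v in enumerate(row) if v]
    if not pixels:
        return [0.0] * 16
    total = len(pixels)
    cr = sum(r for r, _ in pixels) / total  # centroid row
    cc = sum(c for _, c in pixels) / total  # centroid column

    dist_sum = [0.0] * 12
    count12 = [0] * 12
    count4 = [0] * 4
    for r, c in pixels:
        dy, dx = r - cr, c - cc
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        # min() guards against an angle that rounds up to exactly 360.0.
        s12 = min(int(angle // 30), 11)
        s4 = min(int(angle // 90), 3)
        dist_sum[s12] += math.hypot(dx, dy)
        count12[s12] += 1
        count4[s4] += 1

    # Assumption: a sector with no filled pixels scores 0.
    distances = [dist_sum[i] / count12[i] if count12[i] else 0.0
                 for i in range(12)]
    proportions = [n / total for n in count4]
    return distances + proportions
```

A pixel lying exactly on the centroid has distance 0 and falls into sector 0 under this scheme.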

Training or Recognition

Once the features have been extracted from a character image, the data can either form part of the training set, or be compared against an existing training set for recognition.

Training

queXF implements training based on fuzzy logic. The 16 features extracted from each character image form 16 fuzzy sets. The mean and variance of each of the 16 fuzzy sets are calculated for each character. These form the knowledge base (KB).

See: generate_kb in functions/functions.ocr.php
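Building such a knowledge base amounts to computing per-feature statistics for each character. The Python sketch below shows one way to do this; the function name, the dictionary layout and the use of population (rather than sample) variance are assumptions made for the example and may not match generate_kb.

```python
from statistics import fmean, pvariance


def build_knowledge_base(samples):
    """samples maps each character to a list of feature vectors.

    Returns {character: {"mean": [...], "variance": [...]}} with one
    entry per feature position.
    """
    kb = {}
    for char, vectors in samples.items():
        # zip(*vectors) regroups the values by feature position.
        columns = list(zip(*vectors))
        kb[char] = {
            "mean": [fmean(col) for col in columns],
            # Assumption: population variance.
            "variance": [pvariance(col) for col in columns],
        }
    return kb
```

For example, two training images of "A" with (shortened) feature vectors [1.0, 2.0] and [3.0, 2.0] give a mean of [2.0, 2.0] and a variance of [1.0, 0.0].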

Recognition

The “fuzzy distance” between the features of the character image to be recognised and each character in the knowledge base is calculated. The knowledge base character with the minimum “fuzzy distance” is taken as the recognised character.

See: ocr_guess in functions/functions.ocr.php

The algorithms for training and recognition are described in Section 4.2 of "Hand printed Character Recognition using Neural Networks" by Vamsi K. Madasu, Brian C. Lovell and M. Hanmandlu
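The minimum-distance lookup can be illustrated with the Python sketch below. The exact “fuzzy distance” formula is not given on this page, so the example assumes a variance-weighted squared difference summed over the features; this formula, the function names, the small eps term and the knowledge base layout are assumptions for illustration only and should not be read as the method used by ocr_guess or by the cited paper.

```python
def fuzzy_distance(features, entry, eps=1e-6):
    """Assumed distance: squared difference from the mean, divided by
    the variance, summed over all features. eps avoids division by
    zero when a feature never varied in training."""
    return sum((x - m) ** 2 / (v + eps)
               for x, m, v in zip(features, entry["mean"], entry["variance"]))


def guess_character(features, kb):
    """Return the knowledge base character with the smallest distance."""
    return min(kb, key=lambda ch: fuzzy_distance(features, kb[ch]))


# Usage with a tiny two-feature knowledge base (hypothetical values).
example_kb = {
    "A": {"mean": [0.0, 0.0], "variance": [1.0, 1.0]},
    "B": {"mean": [5.0, 5.0], "variance": [1.0, 1.0]},
}
best = guess_character([0.5, 0.2], example_kb)
```

Dividing by the variance means that a feature which varied widely across the training samples for a character counts for less than one which was consistent.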