Sunday, July 1, 2007

Constrained Run-Length Algorithm (CRLA)

The Constrained Run-Length Algorithm (CRLA), also known as the Run-Length Smoothing/Smearing Algorithm (RLSA), is a well-known technique for page segmentation, that is, for segmenting a document image into homogeneous regions. The algorithm is very efficient at partitioning documents with Manhattan layouts, i.e. layouts in which the text, graphics, and halftone-image regions are separable by horizontal and vertical line segments (for example, two-column text together with rectangularly aligned graphics and halftone images). It is not suited to pages with complex layouts, e.g. irregular graphics embedded in a text paragraph. Its main drawback is that it uses only local information during the smearing stage, which may lead to erroneous linkage of text and graphics.
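
To make the smearing stage concrete, here is a minimal sketch of one common formulation of run-length smoothing on a binary image (1 = black, 0 = white). It is not taken from the cited paper: the function names, the threshold parameters, and the choice to fill only white runs bounded by black pixels on both sides are assumptions made for illustration. Short white runs are filled along rows and along columns, and the two results are combined with a logical AND.

```python
def smear_row(row, threshold):
    """Fill white runs (0s) of length <= threshold lying between two black pixels (1s).

    Assumption: runs touching the border of the row are left untouched.
    """
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1
                if 0 < gap <= threshold:
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def smear_horizontal(image, threshold):
    """Apply the row smearing to every row of the image."""
    return [smear_row(row, threshold) for row in image]


def smear_vertical(image, threshold):
    """Apply the same smearing to every column by transposing the image."""
    columns = [list(col) for col in zip(*image)]
    smeared = [smear_row(col, threshold) for col in columns]
    return [list(row) for row in zip(*smeared)]


def rlsa(image, h_threshold, v_threshold):
    """Combine horizontal and vertical smearing with a pixel-wise AND."""
    horizontal = smear_horizontal(image, h_threshold)
    vertical = smear_vertical(image, v_threshold)
    return [[a & b for a, b in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]


if __name__ == "__main__":
    # The 2-pixel gap is filled; the 3-pixel gap exceeds the threshold.
    print(smear_row([1, 0, 0, 1, 0, 0, 0, 1], 2))
```

Because each pixel is decided only from the run it sits in, a small gap between a text line and a nearby graphic is filled just like a gap between two words, which is the local-information drawback described above.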

The previous two posts briefly describe the Constrained Run-Length Algorithm (CRLA), or Run-Length Smoothing Algorithm (RLSA), and its related issues.

Source:

[1] Hung-Ming Sun, 2006, "Enhanced Constrained Run-Length Algorithm for Complex Layout Document Processing", International Journal of Applied Science and Engineering, 4(3): 297-309.
