Black border removal is a feature of document management software that automatically removes the black edges of a document that has been digitally scanned. These black borders can cause problems with optical character recognition (OCR) software.
The document structure analysis and character recognition are usually done in several phases:
scanning
thresholding
skew detection and correction
despeckle or speckle removal
line removal
border removal
detection of preprinted elements (like boxes, rakes or combs)
page orientation detection and correction
layout analysis
classification
character recognition
Each step must succeed for the sequence as a whole to produce a good result. In particular, the steps that follow border removal perform poorly if the borders are not removed correctly.
BordersHelper uses an algorithm based on flood fill, component labelling, and region adjacency graphs. It removes the noisy borders that digitization with automatically fed scanners introduces into monochrome images of documents.
BordersHelper expects a monochrome image as input.
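The flood-fill part of such an approach can be illustrated with a minimal sketch. This is not the BordersHelper implementation: it leaves out component labelling and region adjacency graphs entirely. It assumes the monochrome image is a list of rows in which 1 is a black pixel and 0 is a white pixel. The function name `remove_black_borders` is hypothetical. The sketch clears every black region that is connected to the outer edge of the image.

```python
from collections import deque


def remove_black_borders(image):
    """Return a copy of `image` with edge-connected black pixels cleared.

    Assumption for this sketch: `image` is a list of rows, and each
    pixel is 1 (black) or 0 (white).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]  # leave the input untouched

    # Seed the fill with every black pixel lying on the outer edge.
    queue = deque()
    for y in range(height):
        for x in range(width):
            on_edge = y in (0, height - 1) or x in (0, width - 1)
            if on_edge and result[y][x] == 1:
                result[y][x] = 0
                queue.append((y, x))

    # Breadth-first flood fill over 4-connected black neighbours.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and result[ny][nx] == 1:
                result[ny][nx] = 0
                queue.append((ny, nx))
    return result


# A tiny scan: a black frame around two black "text" pixels.
scan = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]
cleaned = remove_black_borders(scan)
```

A plain fill like this also erases any text or line that touches the border region. A fuller algorithm would presumably need to analyse the connected regions to guard against that.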