Partitioning of feature space by iterative classification for degraded document image binarisation

IET Image Processing

Proper partitioning of the feature space into text and background regions is essential in document image binarisation. This study presents an iterative classification algorithm that efficiently partitions a two-dimensional feature space into text and background regions. It uses the result of Niblack's binarisation algorithm as training data and employs its characteristics to define classification rules. In each iteration, it labels only those points of the feature space that can be classified reliably and leaves the remaining points to later iterations. The label assigned to a point in the current iteration influences the classification of its neighbours in subsequent iterations, making them more likely to be classified correctly. After a few iterations, the feature space is partitioned into two regions associated with the text and background pixels. Two global thresholding methods are then applied as an additional refinement of the text class, making the proposed algorithm robust against bleeding-through and shadow-through degradations. Finally, each pixel is labelled as either text or background according to its corresponding region in the feature space. The authors' binarisation algorithm outperformed six well-known algorithms on three datasets and is suitable for various types of degraded images.
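The abstract names Niblack's binarisation as the source of training data but does not restate it. As a reminder of what that first step computes, the following is a minimal sketch of the standard Niblack rule, in which each pixel is compared with a local threshold T = m + k·s (local mean m, local standard deviation s). The function name `niblack_binarise`, the window size and the value of k are assumptions chosen for illustration and are not taken from the article; the feature-space partitioning and refinement stages described above are not reproduced here.

```python
from math import sqrt


def niblack_binarise(image, window=3, k=-0.2):
    """Label each pixel True (text) or False (background) with Niblack's rule.

    image  : list of equal-length rows of grey levels (dark text on light paper)
    window : odd side length of the square neighbourhood (assumed value)
    k      : weight of the local standard deviation (assumed value)
    """
    rows, cols = len(image), len(image[0])
    half = window // 2
    labels = []
    for r in range(rows):
        row_labels = []
        for c in range(cols):
            # Collect the neighbourhood, clipped at the image border.
            patch = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(patch) / len(patch)
            var = sum((v - mean) ** 2 for v in patch) / len(patch)
            # Niblack's local threshold: T = m + k * s.
            threshold = mean + k * sqrt(var)
            # Pixels darker than the local threshold are taken as text.
            row_labels.append(image[r][c] < threshold)
        labels.append(row_labels)
    return labels


if __name__ == "__main__":
    # Hypothetical 5x5 page: light background with one dark pixel in the centre.
    page = [[200] * 5 for _ in range(5)]
    page[2][2] = 20
    for row in niblack_binarise(page):
        print("".join("#" if is_text else "." for is_text in row))
```

On real scans this rule is known to mark noise in flat background areas as text, which is consistent with the abstract treating its output as training data to be refined rather than as a final result.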

Inspec keywords: feature extraction; pattern classification; iterative methods; document image processing

Other keywords: iterative classification; global thresholding methods; Niblack binarisation algorithm; shadow-through degradation; text class refinement; bleeding-through degradation; degraded document image binarisation; feature space partitioning

Subjects: Interpolation and function approximation (numerical analysis); Computer vision and image processing techniques; Document processing and analysis techniques; Optical, image and video signal processing; Pattern recognition

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2011.0399