Efficient Exploration of Text Regions in Natural Scene Images Using Adaptive Image Sampling
2016
Abstract
An adaptive image sampling framework is proposed for identifying text regions in natural scene images. Only a small fraction of the pixels in such images actually correspond to text, so it is desirable to eliminate non-text regions at the early stages of text detection. First, the image is sampled row by row at a specific rate, and each sampled row is tested for the presence of text using a 1D adaptation of the Maximally Stable Extremal Regions (MSER) algorithm. The rows surrounding a detected row are then recursively sampled at finer rates until the text is fully contained. The same adaptive sampling process is also performed on the vertical dimension for the identified regions. The final output is a binary mask that can be used for text detection and/or recognition. Experiments on the ICDAR’03 dataset show that the proposed approach is up to 7x faster than the MSER baseline on a single CPU core, with comparable text localization scores. The approach is also inherently parallelizable, which allows further speed improvements.
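The coarse-to-fine row sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the 1D MSER adaptation, so `row_has_text` is a hypothetical stand-in that simply counts strong intensity jumps along a row, and the function names, the `step` parameter, and the thresholds are all assumptions made for illustration. The sketch covers only the row pass and returns a per-row mask.

```python
from typing import Callable, Dict, List, Sequence

Row = Sequence[int]


def row_has_text(row: Row, min_edges: int = 4, min_jump: int = 64) -> bool:
    """Hypothetical stand-in for the 1D MSER row test (an assumption).

    Flags a row as text-like when it has enough strong intensity jumps
    between neighbouring pixels.
    """
    edges = sum(1 for a, b in zip(row, row[1:]) if abs(a - b) >= min_jump)
    return edges >= min_edges


def adaptive_row_mask(
    image: Sequence[Row],
    step: int = 8,
    row_test: Callable[[Row], bool] = row_has_text,
) -> List[bool]:
    """Return a per-row mask, testing only the rows that adaptive sampling visits."""
    height = len(image)
    if height == 0:
        return []

    tested: Dict[int, bool] = {}  # row index -> result of row_test

    def is_text(y: int) -> bool:
        if y not in tested:
            tested[y] = row_test(image[y])
        return tested[y]

    def refine(lo: int, hi: int) -> None:
        # Stop when there is no untested row between lo and hi,
        # or when neither endpoint looks like text.
        if hi - lo <= 1 or not (is_text(lo) or is_text(hi)):
            return
        mid = (lo + hi) // 2
        is_text(mid)      # sample the gap at a finer rate
        refine(lo, mid)
        refine(mid, hi)

    # Coarse pass: every `step`-th row, plus the last row.
    samples = list(range(0, height, step))
    if samples[-1] != height - 1:
        samples.append(height - 1)
    for y in samples:
        is_text(y)

    # Fine pass: refine only the gaps that touch a text-like row.
    for lo, hi in zip(samples, samples[1:]):
        refine(lo, hi)

    # Rows that were never tested are treated as non-text.
    return [tested.get(y, False) for y in range(height)]
```

In this sketch, a text band thinner than `step` that falls entirely between two coarse samples is missed, so the coarse rate trades speed against the smallest text height that can be found. Each gap between coarse samples is refined independently, which is one way the parallelism mentioned in the abstract could be exploited.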