SCATTER: Selective context attentional scene text recognizer
2020
Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Despite the recent success of STR methods, current state-of-the-art (SOTA) STR methods still struggle to recognize text written in arbitrary shapes. In this paper, we introduce a novel architecture for STR, named Selective Context ATtentional Text Recognizer (SCATTER). SCATTER utilizes a stacked block architecture with intermediate supervision during training, which paves the way to successfully training a deep BiLSTM encoder, thus improving the encoding of contextual dependencies. Decoding is done using a two-step 1D attention mechanism. The first attention step re-weights between visual features from a CNN and contextual features from a BiLSTM. The second attention step, similarly to previous works, treats the features as a sequence and attends to the intra-sequence relationships. Experiments show that the proposed approach surpasses SOTA performance on irregular text recognition benchmarks by 3.7% on average.
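The two-step decoding described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the feature dimensions, the random stand-ins for learned parameters, and the dot-product scoring are illustrative assumptions. Step 1 computes a per-position weighting between the visual (CNN) and contextual (BiLSTM) feature sequences; step 2 applies standard 1D attention over the fused sequence to produce a glimpse vector for one decoding step.

```python
import numpy as np

rng = np.random.default_rng(0)

T, D = 26, 8                               # sequence length and feature dim (illustrative)
visual = rng.standard_normal((T, D))       # stand-in for CNN visual features
context = rng.standard_normal((T, D))      # stand-in for BiLSTM contextual features

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Step 1: selective attention re-weights visual vs. contextual features.
# w_sel plays the role of a learned scoring vector (random here).
w_sel = rng.standard_normal(D)
scores = np.stack([visual @ w_sel, context @ w_sel], axis=1)   # (T, 2)
alpha = softmax(scores, axis=1)                                # per-position source weights
fused = alpha[:, :1] * visual + alpha[:, 1:] * context         # (T, D) fused sequence

# Step 2: 1D attention over the fused sequence for a single decode step.
query = rng.standard_normal(D)             # stand-in for the decoder hidden state
att = softmax(fused @ query)               # (T,) attention over positions
glimpse = att @ fused                      # (D,) context vector fed to the decoder
```

In the actual model these weights are learned end-to-end and the glimpse is consumed by an attentional character decoder at each output step; the sketch only shows the data flow of the two attention stages.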