RNN-T lattice enhancement by grafting of pruned paths
Recurrent Neural Network Transducers (RNN-T) — a streaming variant of end-to-end models — became very popular in recent years. Since RNN-T networks condition the future output label sequence on all previous labels, the natural search space is represented by a tree. In contrast, hybrid systems employ limited-context language models, where the natural search space is a network, i.e. a lattice. While this lattice represents a rich set of hypotheses (n-best list), the search tree produced by RNN-T is more limited. In this work, we introduce a heuristic method which preserves some of the pruned RNN-T hypotheses by attaching them, or grafting them to the surviving hypotheses in the search tree. We achieved up to 21% oracle WERR without degrading the best-path WER. There is no impact on CPU cost, in fact we can speed up the decoder by using a smaller beam size, and still get improvements on oracle WER.