This setup lets you train end-to-end neural models for spoken language understanding (SLU). It supports either the Snips SLU dataset or the Fluent Speech Commands (FSC) dataset. The framework is built on PyTorch with torchaudio and the transformers package from HuggingFace. We tested it with PyTorch 1.5.0 and torchaudio 0.5.0.
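Since only one version pair is reported as tested, it can help to compare the local environment against it before training. The following is a minimal sketch, not part of the framework itself: the helper name `check_versions` and the `TESTED_VERSIONS` table are hypothetical, and it assumes the packages were installed under the distribution names `torch` and `torchaudio`. It reads installed metadata only, so it runs even when the packages are missing.

```python
from importlib import metadata

# Versions the description reports as tested; other versions may also work.
TESTED_VERSIONS = {"torch": "1.5.0", "torchaudio": "0.5.0"}


def check_versions(tested=TESTED_VERSIONS):
    """Return {package: (installed_or_None, tested, matches)}.

    Hypothetical helper: it queries package metadata and never imports
    the packages themselves.
    """
    report = {}
    for name, wanted in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Strip local build tags such as "+cu101" before comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, wanted, base == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions().items():
        status = "ok" if ok else "differs from tested version"
        print(f"{name}: installed={installed}, tested={wanted} ({status})")
```

A mismatch does not necessarily mean the framework will fail; it only signals that the environment differs from the one reported as tested.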
Alexa end-to-end SLU
2020
Last updated January 3, 2024