Code and datasets

Handwriting recognition using Amazon SageMaker

Jonathan Chung, Ehsan M. Kermani

2020

The SageMaker handwriting recognition solution applies deep learning techniques to transcribe text in images of passages into strings. If you have your own data, you can use this solution to label your own data and train a new network with it. Endpoints are then automatically deployed with the solution.

Computer vision

Point of view of message conversion dataset

Isabelle G. Lee, Vera Zu, Sai Srujana Buddi, Dennis Liang, Purva Kulkarni, Jack G. M. FitzGerald

2020

Virtual assistants (VAs) tend to be literal in their delivery of messages. Most likely, if you ask them to deliver a message, the VAs either send a recorded message or a literal transcription to the receiver. To make incremental improvement towards a virtual assistant that you may speak to conversationally and naturally, we have provided the data necessary to build AI systems that can convert the point

Conversational AI

Genomics secondary analysis using AWS Step Functions and AWS Batch

Lee Pang, Ryan Ulaszek

2020

This solution provides a framework for Next Generation Sequencing (NGS) genomics secondary-analysis pipelines using AWS Step Functions and AWS Batch. It deploys AWS services to develop and run custom workflow pipelines, monitor pipeline status and performance, fail over to on-demand, handle errors, optimize for cost, and secure data with least privilege. The solution is designed to be a starting point for

Cloud and systems

STIL - Simultaneous Slot filling, translation, intent classification, and language identification: Initial results using mBART on MultiATIS++

Jack G. M. FitzGerald

2020

Slot filling, Translation, Intent classification, and Language identification, or STIL, is a newly proposed task for multilingual Natural Language Understanding (NLU). By performing simultaneous slot filling and translation into a single output language (English in this case), some portion of downstream system components can be monolingual, reducing development and maintenance cost. Results are given using

Conversational AI

Deep demand forecasting with Amazon SageMaker

Ehsan M. Kermani, Patrick Yang, Alex Voitau

2020

This project provides an end-to-end solution for the demand forecasting task using LSTNet, a state-of-the-art deep learning model available in GluonTS and Amazon SageMaker.

Machine learning

Embedding-based zero-shot retrieval through query generation

Davis Liang, Peng Xu, Siamak Shakeri, Cicero Nogueira dos Santos, Ramesh Nallapati, Zhiheng Huang, Bing Xiang

2020

Embedding-based Zero-shot Retrieval through Query Generation leverages query synthesis over large corpuses of unlabeled text (such as Wikipedia) to pre-train siamese neural retrieval models. The resulting models significantly improve over previous BM25 baselines as well as state-of-the-art neural methods. This package provides support for leveraging BART-large for query synthesis as well as code for training

Information and knowledge management

Privacy-preserving XGBoost inference

Xianrui Meng, Joan Feigenbaum

2020

Although machine learning (ML) is widely used for predictive tasks, there are important scenarios in which ML cannot be used or at least cannot achieve its full potential. A major barrier to adoption is the sensitive nature of predictive queries. Individual users may lack sufficiently rich datasets to train accurate models locally but also be unwilling to send sensitive queries to commercial services that

Security, privacy, and abuse prevention

DSTQA - Dialogue state tracking via question answering

Li Zhou, Kevin Small

2020

Multi-domain dialogue state tracking (DST) is a critical component for conversational AI systems. The domain ontology (i.e., specification of domains, slots, and values) of a conversational AI system is generally incomplete, making the capability for DST models to generalize to new slots, values, and domains during inference imperative. In this paper, we propose to model multi-domain DST as a question answering

Conversational AI

MultiATIS++ Corpus

Weijia Xu, Batool Haider, Saab Mansour

2020

Natural language understanding (NLU) in the context of goal-oriented dialog systems typically includes intent classification and slot labeling tasks. Existing methods to expand an NLU system to new languages use machine translation with slot label projection from source to the translated utterances, and thus are sensitive to projection errors. In this work, we propose a novel end-to-end model that learns

Conversational AI

Amazon SageMaker solution for privacy in natural language processing

Theodore Vasiloudis, Ehsan M. Kermani

2020

More and more text data are becoming available these days to train Natural Language Processing models such as sentiment analysis, predictive keyboards and question-answering chatbots. If companies that deploy such models use data provided by users, they have a responsibility to take steps to ensure their users' privacy. In this solution we demonstrate how one can use Differential Privacy to build accurate

Conversational AI

Alexa Skills Kit SDK for Node.js

Tianren Zhang, Hidetaka Okamoto, Nikhil Yogendra Murali, Kakha Urigashvili, Jonathan Breedlove, Prashanth Bheemagani, Josh Bean, Kipp Ashford, Ian Gilham, Brian Broll, Shreyas Govinda Raju, Anthony Dall'Agnola-Bomier, Andrew King, Tomislav Skoković, German Viscuso, Justin Kovac, Saburo Higuchi, Olivia Sung, Thorsten Höger, Nat Burgwyn

2020

The ASK SDK v2 for Node.js makes it easier for you to build highly engaging skills by allowing you to spend more time on implementing features and less on writing boilerplate code. The ASK SDK Controls Framework (Beta) builds on the ASK SDK v2 for Node.js, providing a scalable solution for creating large, multi-turn skills in code with reusable components called controls. The ASK SMAPI SDK for Node.js provides

Conversational AI

Homomorphic implementor's toolkit

Bryce Ferenczi, Eric Crockett, Greg Linden

2020

Homomorphic encryption is a special type of encryption scheme that enables computation of arbitrary functions on encrypted data. To evaluate a function f, a developer must implement f as a circuit F using only the "native" operations supported by the underlying homomorphic encryption scheme. Libraries that implement homomorphic encryption provide an API for these native operations which can be used to

Security, privacy, and abuse prevention
