
Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Jan 23, 2018

Transcript
Page 1: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Adding Conversation to GUIs

Dekang Lin
Naturali

Page 2: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

A Tale of Two Uber Rides

"uber ride to crowne plaza sfo"

Page 3: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Naturali

A Beijing-based startup company

Upgrade apps with a speech interface

Naturali Sesami
✦ Translates speech inputs into action sequences in apps and executes them on users' behalf.
✦ Chinese version launched on LeTV phones as a system app on April 12, 2017
✦ Available as a third-party app on all Android phones since Aug. 2017

Page 4: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Advantages of Speech

Speed:
✦ voice input is three times as fast as typing

Hands-free:
✦ send messages, play music, order food
✦ turn on hotspot: 5 clicks

Mind-free:
✦ where is my luggage?

Page 5: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Voice Assistants

Chat window

Fulfillment by backend API calls

Page 6: Dekang Lin at AI Frontiers: Adding Conversation to GUIs
Page 7: Dekang Lin at AI Frontiers: Adding Conversation to GUIs
Page 8: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Chat + API: the downsides

Chat assistants displace apps, but chat is not the best mode of interaction for everything:
✦ editing
✦ browsing
✦ viewing

Nonetheless, there is still plenty of need for voice interaction.

"who has access to this?"

Page 9: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Who has access? Just ask

Page 10: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Chat + API: the downsides

Re-invention of user experience inside the chat window:
✦ usually not as good as specialized apps,
✦ requires a great deal of repeated development effort

Page 11: Dekang Lin at AI Frontiers: Adding Conversation to GUIs


Page 12: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Chat + API: the downsides

Economic interests of the assistant and the backend services may not be aligned.

Page 13: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Naturali Sesami

A thin, transparent translation layer over apps.
✦ voice ➜ front end UI actions

Seamless integration of speech and graphics
✦ Existing GUI interactions are still available
✦ Making voice interaction available on any app page

Page 14: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Use Yelp to find greek food near Santa Clara Convention Center

Page 15: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Voice to Actions in Three Steps

Speech Recognition: sound → text
✦ data

Semantic Interpretation: text → intent
✦ knowledge

Plan Generation: intent → actions
✦ grounding
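
The three steps compose into a simple pipeline. The sketch below only illustrates that composition: the stage functions (`fake_recognize`, `fake_interpret`, `fake_plan`) and the action strings are hypothetical stand-ins with toy logic, not Naturali's implementation.

```python
from typing import Callable, Dict, List

# Hypothetical stage signatures; the talk does not describe the real interfaces.
Recognizer = Callable[[bytes], str]              # sound -> text
Interpreter = Callable[[str], Dict[str, str]]    # text -> intent
Planner = Callable[[Dict[str, str]], List[str]]  # intent -> actions


def voice_to_actions(audio: bytes, recognize: Recognizer,
                     interpret: Interpreter, plan: Planner) -> List[str]:
    """Run the three steps in order and return the UI actions to execute."""
    text = recognize(audio)
    intent = interpret(text)
    return plan(intent)


def fake_recognize(audio: bytes) -> str:
    # Toy stub: pretend the audio bytes already contain the transcript.
    return audio.decode("utf-8")


def fake_interpret(text: str) -> Dict[str, str]:
    # Toy stub: recognize a single phrasing of a ride request.
    prefix = "uber ride to "
    if text.startswith(prefix):
        return {"task": "UberRide", "dest": text[len(prefix):]}
    return {"task": "Unknown"}


def fake_plan(intent: Dict[str, str]) -> List[str]:
    # Toy stub: made-up action names standing in for front-end UI actions.
    if intent["task"] == "UberRide":
        return ["open_app:Uber", "set_destination:" + intent["dest"], "tap:request"]
    return []


actions = voice_to_actions(b"uber ride to crowne plaza sfo",
                           fake_recognize, fake_interpret, fake_plan)
```

Keeping each stage behind its own function boundary mirrors the slide: each step could be swapped (for example, a third-party recognizer) without touching the others.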

Page 16: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Speech Recognition: sound → text

Third party services

Open source tools

Page 17: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Naturali Speech

End-to-end DNN: CNN+LSTM+Attention+CTC
✦ built from scratch with TensorFlow
✦ trained with thousands of hours of transcribed speech

Personalized and contextualized language model:
✦ contact names
✦ app specific vocabulary
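
For readers unfamiliar with the CTC part of the model named above: a CTC output layer emits one label per audio frame, including a special blank symbol, and the simplest decoder collapses repeated labels and then drops the blanks. The sketch below shows only that collapse rule. The blank symbol `_` and the function name are assumptions, and this is not a description of how Naturali's decoder or language model actually works.

```python
from itertools import groupby
from typing import List

BLANK = "_"  # assumed blank symbol, chosen for readability


def ctc_greedy_collapse(frame_labels: List[str], blank: str = BLANK) -> str:
    """Collapse consecutive repeated labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in collapsed if label != blank)


decoded = ctc_greedy_collapse(list("hh_e_ll_lo__"))
```

The blank between the two runs of `l` is what lets a genuinely doubled letter survive the collapse.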

Page 18: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Semantic Interpretation: text → intent

An intent identifies a task and the necessary information (parameters) for the task

Example:
✦ task: FlightSearch
✦ parameters: (to, from, date, airline, class)
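
One possible way to represent such an intent in code is a task name plus a parameter map. This is a minimal sketch: the `Intent` class and the example values (`SFO`, `PEK`, the date) are illustrative assumptions. Only the task name and the parameter names come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Intent:
    task: str
    # None marks a parameter the user has not supplied yet.
    params: Dict[str, Optional[str]] = field(default_factory=dict)


# "from" and "class" are reserved words in Python, so the slide's parameter
# names are kept as dictionary keys rather than attribute names.
flight_search = Intent(
    task="FlightSearch",
    params={"to": "SFO", "from": "PEK", "date": "2017-11-04",
            "airline": None, "class": None},
)

unfilled = [name for name, value in flight_search.params.items() if value is None]
```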

Page 19: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Entities and Types

Persons: singers/directors/contacts

Locations: cities/POIs/addresses

Apps and Games

Media: songs/shows/movies/books

Time and Date

Food

Sports teams

……

Page 20: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Recognizing Thousands of Types

It is not an option to use manually labeled training examples.

An alternative is to use naturally annotated data:
✦ Hearst patterns: NP_type such as NP_inst
✦ Other examples: navigate to NP_loc
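
The two patterns above can be approximated with regular expressions to harvest (instance, type) pairs from raw text. This is a rough sketch under simplifying assumptions: a single word stands in for the type noun phrase, instances are split on commas and "and"/"or", and no real noun-phrase chunker is used. The function names and the type label `location` are hypothetical.

```python
import re
from typing import List, Tuple

# "<type> such as <inst>, <inst> and <inst>"; one word approximates NP_type.
SUCH_AS = re.compile(r"(?P<type>\w+) such as (?P<insts>[^.;]+)")
# "navigate to <location>"; everything up to the sentence end approximates NP_loc.
NAVIGATE_TO = re.compile(r"navigate to (?P<loc>[^.;]+)")
# Separators between listed instances.
LIST_SEP = re.compile(r",\s*(?:and\s+|or\s+)?|\s+and\s+|\s+or\s+")


def hearst_pairs(sentence: str) -> List[Tuple[str, str]]:
    """Return (instance, type) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        for inst in LIST_SEP.split(match.group("insts")):
            if inst.strip():
                pairs.append((inst.strip(), match.group("type")))
    return pairs


def navigation_pairs(sentence: str) -> List[Tuple[str, str]]:
    """Return (instance, 'location') pairs found by the 'navigate to' pattern."""
    return [(m.group("loc").strip(), "location")
            for m in NAVIGATE_TO.finditer(sentence)]


singers = hearst_pairs("He likes singers such as Adele, Bono and Sting.")
places = navigation_pairs("navigate to crowne plaza sfo")
```

The appeal of this kind of data is that the text itself supplies the label, so no annotator has to tag each of the thousands of types by hand.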

Page 21: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Multi-round Conversation

Complex intents may not be articulated in one shot
✦ FlightSearch(to, from, date, airline, class)

A multi-round conversation incrementally collects information from the user and guides the user through the process.
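
A common way to implement this kind of incremental collection is slot filling: keep asking for the first missing parameter until the intent is complete. The sketch below assumes that `to`, `from`, and `date` are the mandatory slots (the talk does not say which are required) and uses a hypothetical `ask` callback in place of a real speech round trip.

```python
from typing import Callable, Dict, List, Optional

# Assumption: which FlightSearch parameters are mandatory is not stated in the talk.
REQUIRED_SLOTS = ["to", "from", "date"]


def next_missing(slots: Dict[str, str], required: List[str]) -> Optional[str]:
    """Return the first required slot that has no value yet, or None."""
    for name in required:
        if name not in slots:
            return name
    return None


def run_dialog(initial: Dict[str, str], ask: Callable[[str], str],
               required: List[str] = REQUIRED_SLOTS) -> Dict[str, str]:
    """Ask one question per round until every required slot is filled."""
    slots = dict(initial)
    name = next_missing(slots, required)
    while name is not None:
        slots[name] = ask("Please tell me the '" + name + "' for your flight.")
        name = next_missing(slots, required)
    return slots


# Scripted user replies stand in for spoken answers.
replies = iter(["PEK", "2017-11-04"])
completed = run_dialog({"to": "SFO"}, lambda prompt: next(replies))
```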

Page 22: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Dialog Management

Page 23: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Composite Intents

"Messenger chat with Alex and say let's meet on saturday"
✦ OpenMessenger
✦ ChatWithPerson
✦ SendMessage

"get a uber black ride to SFO"
✦ UberRide
✦ SetDest
✦ SelectUberBlack
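
One way to read these two examples is that a composite intent expands into an ordered list of simpler intents, each carrying its own parameters. The sketch below encodes the two slide examples in that form. The sub-intent names come from the slide, while the parameter names (`person`, `text`, `dest`) and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

SubIntent = Tuple[str, Dict[str, str]]  # (task name, parameters)


def messenger_chat(person: str, text: str) -> List[SubIntent]:
    """Expand 'Messenger chat with <person> and say <text>' into ordered steps."""
    return [
        ("OpenMessenger", {}),
        ("ChatWithPerson", {"person": person}),
        ("SendMessage", {"text": text}),
    ]


def uber_black_ride(dest: str) -> List[SubIntent]:
    """Expand 'get a uber black ride to <dest>' into ordered steps."""
    return [
        ("UberRide", {}),
        ("SetDest", {"dest": dest}),
        ("SelectUberBlack", {}),
    ]


steps = messenger_chat("Alex", "let's meet on saturday")
step_names = [name for name, _ in steps]
```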

Page 24: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Messenger Chat

Page 25: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Plan Generation: intent → actions

Grounding: establishes the connection between the inside (the assistant) and the outside (apps and devices).

Example:
✦ intent:

{"task": "FlightStatus", "number": "UA888", "date": "2017-11-04"}

✦ action:

select * from flight_db where airline = 'United Airlines' and flight_num = '888' and year = 2017 and month = 11 and day = 4
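
The grounding step in this example can be sketched as a function that turns the intent into a parameterized query. Everything beyond the slide's intent and query shape is an assumption made for illustration: the `AIRLINES` lookup table, the table schema, and the demo row (including its `status` value) are invented so the query can run against an in-memory SQLite database.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical mapping from airline code to the name stored in the database.
AIRLINES = {"UA": "United Airlines"}


def ground_flight_status(intent: Dict[str, str]) -> Tuple[str, List[object]]:
    """Translate a FlightStatus intent into SQL text plus bound parameters."""
    code, flight_num = intent["number"][:2], intent["number"][2:]
    year, month, day = (int(part) for part in intent["date"].split("-"))
    sql = ("SELECT * FROM flight_db WHERE airline = ? AND flight_num = ? "
           "AND year = ? AND month = ? AND day = ?")
    return sql, [AIRLINES[code], flight_num, year, month, day]


intent = {"task": "FlightStatus", "number": "UA888", "date": "2017-11-04"}
sql, args = ground_flight_status(intent)

# Demo database with one made-up row, only to show the query executing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flight_db (airline TEXT, flight_num TEXT, "
             "year INTEGER, month INTEGER, day INTEGER, status TEXT)")
conn.execute("INSERT INTO flight_db VALUES "
             "('United Airlines', '888', 2017, 11, 4, 'demo status')")
rows = conn.execute(sql, args).fetchall()
conn.close()
```

Binding values through `?` placeholders rather than string concatenation keeps user-derived text out of the SQL itself.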

Page 26: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Actions on Google

grounding

Page 27: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

What is my data usage?

Page 28: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Teaching a New Skill

Page 29: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Grounding by Crowd Sourcing

Skills = (context, expression, actions)
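
Read as a data structure, a skill ties together where the user is, what they say, and what should happen. The sketch below is one hypothetical encoding: the field meanings, the exact-match lookup, and the action strings are assumptions, and only the three component names come from the slide. The sample utterance reuses "What is my data usage?" from the earlier slide.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Skill:
    context: str              # assumed: the app or page the user is on
    expression: str           # what the user says to trigger the skill
    actions: Tuple[str, ...]  # recorded UI actions to replay


def normalize(utterance: str) -> str:
    return utterance.strip().lower()


def find_skill(skills: List[Skill], context: str, utterance: str) -> Optional[Skill]:
    """Return the first skill whose context and expression both match."""
    for skill in skills:
        if skill.context == context and normalize(skill.expression) == normalize(utterance):
            return skill
    return None


# Made-up action names; a crowd-sourced skill would record real UI steps.
taught = [Skill("home screen", "what is my data usage?",
                ("open_app:Settings", "tap:Data usage"))]
match = find_skill(taught, "home screen", "What is my data usage? ")
```

A production matcher would presumably need to generalize beyond exact string equality; exact matching is used here only to keep the example small.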

Page 30: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Crowd Sourced Skills

Skills are immediately usable by the creator.
✦ The user may share the skills with others, e.g., tech support for parents

Vetted skills can be made available to the public

Page 31: Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Summary

Voice interaction is inevitable

Naturali Sesami translates user requests into sequences of actions in apps.

Sesami grows by crowd sourcing skills.

Join us!
✦ [email protected]