
The rise of voice platforms - Comparing voice related API's

Feb 14, 2017


Transcript
Page 1: The rise of voice platforms - Comparing voice related API's

Comparing voice related API’s

Christian Rebernik @crebernik7791

Page 2: The rise of voice platforms - Comparing voice related API's

Voice First Footprint

In 2017 there will be 33 million devices

● The Voice 2017 Report - VoiceLabs analysis combined with research from CIRP, KPCB and InfoScout

Page 3: The rise of voice platforms - Comparing voice related API's

Voice adoption

The ‘Voice First’ era has already started

● Alexa in 4% of US households (end 2016)

● Siri handles over 2bn commands a week

● 20% of Google searches on Android handsets input by voice

Alexa

Google Home

Ding Dong

Page 4: The rise of voice platforms - Comparing voice related API's

Voice Devices

Creating an open ecosystem

Amazon Echo: Skills and Alexa Voice Service

Google Home: Google Assistant Actions

Page 5: The rise of voice platforms - Comparing voice related API's

Speech Recognition API

Developing for Amazon Alexa

● Limited understanding: Amazon Echo is built for predefined options (e.g. no custom notes). The session ends after 8 seconds.

● Predefined wake word defines the customer experience: Only 4 wake words are available, and one of them must be used in every conversation.

● No notifications and no presence: You can't alert the user to an event, and you can't react to events such as the user coming home.

● No audio / no identification: Anybody can use Alexa (guests, etc.) and access all information.
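The session limit and the reliance on predefined options shape how a skill has to answer. The following is a minimal sketch, assuming the JSON response layout of the Alexa Skills Kit (version "1.0" with `outputSpeech`, `reprompt` and `shouldEndSession`). The helper names `build_response` and `handle_room` and the room list are hypothetical and only serve as illustration; they are not from the slides.

```python
import json

# Hypothetical set of predefined options the skill accepts (illustration only).
KNOWN_ROOMS = {"kitchen", "bedroom", "living room"}


def build_response(speech, reprompt=None, end_session=True):
    """Build an Alexa-style response body as a plain dict."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        "shouldEndSession": end_session,
    }
    if reprompt is not None:
        # The reprompt is spoken if the user stays silent while the session is open.
        response["reprompt"] = {
            "outputSpeech": {"type": "PlainText", "text": reprompt}
        }
    return {"version": "1.0", "response": response}


def handle_room(slot_value):
    """Accept only predefined options; otherwise keep the session open and ask again."""
    room = (slot_value or "").strip().lower()
    if room in KNOWN_ROOMS:
        return build_response(f"Turning on the light in the {room}.")
    return build_response(
        "Which room do you mean?",
        reprompt="You can say kitchen, bedroom or living room.",
        end_session=False,
    )


if __name__ == "__main__":
    print(json.dumps(handle_room("Kitchen"), indent=2))
```

Keeping the session open only when a follow-up question is actually pending avoids leaving the device listening after a completed request.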

Page 6: The rise of voice platforms - Comparing voice related API's

Technology Stack

Components enabling Voice User Interfaces

● Applications: Implemented use cases leveraging the hardware and AI software.

● AI Software: Software that interprets speech, enables conversations and provides a natural voice.

● Hardware: Devices the consumer interacts with, such as Amazon Echo or Google Home.

Page 7: The rise of voice platforms - Comparing voice related API's

AI overview

120 companies in Speech Recognition

Ventures Scanner, Contact [email protected]

Page 8: The rise of voice platforms - Comparing voice related API's

Speech Recognition API

Real-time speech-to-text APIs

Feature | Google (4) | IBM (3) | Microsoft (2)
Status | Beta | Beta/Production | Preview
Language support (1) | 43 (89) | 8 (14) | 6 (7)
Cost/min | 0,024 € (0,006 / 15 sec) | 0,02 € | 0,06 € (1000 calls of 15 sec for 4 $)
Speaker detection | No | English (8 kHz) | No
Audio formats | FLAC, Linear16, MULAW, AMR, AMR_WB | FLAC, PCM, WAV, OGG, MULAW | PCM single channel, Siren, SirenSR
Noise friendly | Yes | Unknown | Unknown
Word hints | Yes | No | No

1) Language support: languages supported (number including dialects in parentheses)
2) Microsoft: https://www.microsoft.com/cognitive-services/en-us/speech-api
3) IBM: http://www.ibm.com/watson/developercloud/speech-to-text.html
4) Google: https://cloud.google.com/speech/
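The Cost/min row mixes per-minute and per-15-second prices, so a small calculation helps when comparing them. The sketch below converts a block price into a per-minute price and estimates what a single request costs. It assumes that usage is rounded up to whole 15-second blocks, which is an assumption and not stated on the slide. The function names are hypothetical, and the currency is whatever the input price uses.

```python
import math


def price_per_minute(price_per_block, block_seconds=15):
    """Convert a price per block (e.g. per 15 seconds) into a price per minute."""
    return price_per_block * 60 / block_seconds


def request_cost(duration_seconds, price_per_block, block_seconds=15):
    """Estimate the cost of one request.

    Assumption: the provider bills whole blocks and rounds up,
    so a 20-second clip counts as two 15-second blocks.
    """
    blocks = math.ceil(duration_seconds / block_seconds)
    return blocks * price_per_block


if __name__ == "__main__":
    # 0.006 per 15 seconds is the figure listed in the Google column.
    print(f"per minute: {price_per_minute(0.006):.3f}")
    print(f"20 s clip:  {request_cost(20, 0.006):.3f}")
```

Under the round-up assumption, many short utterances cost more per second of audio than one long recording, which matters for command-style voice interfaces.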

Page 9: The rise of voice platforms - Comparing voice related API's

Speech Recognition API

Best practices

● High audio capturing quality: Use lossless coding. Capture audio at 16,000 Hz or higher. Use the native sample rate.

● No additional noise: The APIs include noise reduction, and applying noise reduction twice can reduce quality. Echo and noise have a huge impact on speech recognition quality.

● User education: Educate users to stay close to the microphone.

● One speaker per stream: In multi-speaker settings, try to separate the audio streams, as the current APIs are built for dictation.

● Provide context: Context matters a lot. Provide word hints to help the system detect words correctly.
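Two of these practices can be checked or applied in code before any audio is sent: the 16,000 Hz capture rate and the word hints. The sketch below uses only the Python standard library. The request layout (`config`, `sampleRateHertz`, `speechContexts`, `audio.content`) is modeled on the request body of the Google Cloud Speech REST API as an assumption; field names have changed between API versions, so check them against the current documentation. All function names are hypothetical.

```python
import base64
import io
import wave

MIN_SAMPLE_RATE = 16000  # from the slide: capture at 16,000 Hz or higher


def capture_problems(wav_bytes):
    """Return a list of best-practice violations found in WAV data."""
    problems = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate < MIN_SAMPLE_RATE:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE} Hz")
    if channels != 1:
        problems.append(f"{channels} channels; consider splitting into mono streams")
    return problems


def build_request(wav_bytes, sample_rate, hints, language="en-US"):
    """Build a recognition request body with word hints (assumed field names)."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate,
            "languageCode": language,
            "speechContexts": [{"phrases": list(hints)}],
        },
        "audio": {"content": base64.b64encode(wav_bytes).decode("ascii")},
    }


def make_silent_wav(sample_rate, channels=1, frames=160):
    """Create a short silent 16-bit WAV in memory, used here only for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * frames * channels)
    return buf.getvalue()


if __name__ == "__main__":
    audio = make_silent_wav(8000, channels=2)
    for problem in capture_problems(audio):
        print("warning:", problem)
    body = build_request(make_silent_wav(16000), 16000, ["Alexa", "Echo"])
    print(body["config"])
```

Running the checks on the client keeps obviously unsuitable recordings from being sent and billed.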

Page 10: The rise of voice platforms - Comparing voice related API's

Problem

Real life - Voice is in the early days

Speech-to-text quality

Speaker recognition

Language mixing

Punctuation

Page 11: The rise of voice platforms - Comparing voice related API's

Demo

Voice interaction in IoT

Page 12: The rise of voice platforms - Comparing voice related API's

We are building a voice first company and are looking for support

- Technical Research
- Deep Learning & NLP Scientist
- Software Engineers

Christian Rebernik Contact: [email protected]