
Real-Time Caption Streaming over WiFi Network

Balaji Vasu ([email protected])

Joseph Joswig ([email protected])

CS 218

Outline

Project Description

Assumptions

Demo

Results

Motivation for Caption Streaming

Each student can view the captions of the ongoing lesson in real time and store them.

Useful in many real situations: to aid comprehension for foreign students and students with hearing impairment, or simply to support note taking during class (so students can focus on the lecture).

Project Overview

Speech recognition algorithm running in the central host

Multicast transmission via IEEE802.11b from the central host (sender) to the devices (receivers)

Receivers do not acknowledge receipt of packets

Streaming Protocol

The instructor is equipped with a headset and an IEEE802.11b-enabled station running speech recognition software.

The lecturer’s voice is captured by the headset and converted into a text file by the speech recognition software.

The text is sent via an IEEE802.11b multicast transmission to all the students.

The lecturer’s host runs a client program which handles the multicast transmission.

The host is wirelessly connected to the receiving stations, which run a server program that receives the packets and displays the speech in real time.
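The sender and receiver roles described above could be sketched in Java roughly as follows. This is a minimal illustration, not the project's actual code: the class name `CaptionMulticast`, the group address `230.0.0.1`, the port `4446`, and the 1024-byte receive buffer are all assumptions chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; group address and port are illustrative assumptions.
final class CaptionMulticast {
    static final String GROUP = "230.0.0.1"; // assumed multicast group
    static final int PORT = 4446;            // assumed UDP port

    // Resolve the literal group address (no DNS lookup is needed for an IP literal).
    static InetAddress group() {
        try {
            return InetAddress.getByName(GROUP);
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wrap a piece of recognized text in a datagram addressed to the group.
    static DatagramPacket buildPacket(String text) {
        byte[] payload = text.getBytes(StandardCharsets.US_ASCII);
        return new DatagramPacket(payload, payload.length, group(), PORT);
    }

    // Lecturer side: send once; no acknowledgement is expected from receivers.
    static void send(String text) throws IOException {
        try (MulticastSocket socket = new MulticastSocket()) {
            socket.send(buildPacket(text));
        }
    }

    // Student side: join the group and block until one datagram arrives.
    static String receiveOne() throws IOException {
        try (MulticastSocket socket = new MulticastSocket(PORT)) {
            // A null interface defers to the socket's default network interface.
            socket.joinGroup(new InetSocketAddress(group(), PORT), null);
            byte[] buffer = new byte[1024]; // assumed upper bound on datagram size
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            socket.receive(packet);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.US_ASCII);
        }
    }
}
```

A receiving station would call `CaptionMulticast.receiveOne()` in a loop and append each result to the display, while the lecturer's host calls `CaptionMulticast.send(...)` for each new piece of text.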

Redundant Transmission Protocol

Each multicast packet contains N fragments of spoken text.

Fragments sent within a packet are determined by a sliding window protocol.

With N = 4, the first packet delivers fragments P to P + 3, the second P + 1 to P + 4, etc.

As long as at least one out of every 4 consecutive packets reaches the receiver, all the fragments reach the receiver.
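The sliding window can be illustrated with a short Java sketch. The class and method names here (`SlidingWindow`, `fragmentsInPacket`, `recovered`) are hypothetical, and fragments are identified by plain integer indices for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration of the sliding window; names are assumptions.
final class SlidingWindow {
    // Packet number seq carries fragments seq, seq + 1, ..., seq + n - 1.
    static List<Integer> fragmentsInPacket(int seq, int n) {
        List<Integer> fragments = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            fragments.add(seq + i);
        }
        return fragments;
    }

    // Receiver side: the set of fragments recovered from the packets that arrived.
    static SortedSet<Integer> recovered(List<Integer> arrivedPackets, int n) {
        SortedSet<Integer> fragments = new TreeSet<>();
        for (int seq : arrivedPackets) {
            fragments.addAll(fragmentsInPacket(seq, n));
        }
        return fragments;
    }
}
```

For example, with N = 4, receiving only packets 0, 4, and 8 still recovers fragments 0 through 11 without a gap, whereas receiving only packets 0 and 5 leaves fragment 4 missing.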

Redundant Transmission Protocol (cont’d)

The greater the N, the more robust the transmission is to channel errors.

The larger the N, the more bandwidth is wasted on redundancy.

At the end of the lecture, receivers can ask the host for any missing packets. These will be retransmitted over TCP.
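The trade-off can be made concrete under a simplifying assumption that is not stated in the slides: packets are lost independently with probability p. Each fragment then travels in N packets and is lost only if all N are lost, so it arrives with probability 1 - p^N, while each fragment is transmitted N times. The names below are hypothetical.

```java
// Hypothetical helper; assumes independent packet losses, which real WiFi
// channels (with bursty errors) may not satisfy.
final class RedundancyMath {
    // Probability that a fragment carried in n packets arrives at least once.
    static double fragmentArrivalProbability(double lossRate, int n) {
        return 1.0 - Math.pow(lossRate, n);
    }

    // Number of times each fragment is transmitted (bandwidth multiplier).
    static int transmissionsPerFragment(int n) {
        return n;
    }
}
```

Under this assumption, a 50% packet loss rate with N = 4 gives a fragment arrival probability of 1 - 0.5^4 = 0.9375, at the cost of sending every fragment four times.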

Redundant Packet Protocol

S S+1 S+2 S+3

S+1 S+2 S+3 S+4

S+2 S+3 S+4 S+5

Platform Choices

Java: operating system independence; easy-to-use networking tools.

Speech Recognition: Sphinx, Dragon NaturallySpeaking, IBM ViaVoice

Programming Notes

The speech-to-text program writes to an ASCII file and holds a write lock on the file.

N = 4 in this case; results for other values of N are shown later.

Packet size is 512 bytes of payload plus 4 bytes for the sequence number. Each segment is 128 bytes.
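One possible byte layout matching these sizes is sketched below. The slides give only the sizes, so the field order (sequence number first), big-endian encoding, and zero padding of short segments are assumptions, as are the class and method names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical packet layout: 4-byte sequence number, then 4 segments of 128 bytes.
final class CaptionPacket {
    static final int SEGMENT_SIZE = 128;
    static final int SEGMENTS = 4;
    static final int PACKET_SIZE = 4 + SEGMENTS * SEGMENT_SIZE; // 516 bytes

    // Build a packet; segments shorter than 128 bytes are zero padded,
    // longer ones are truncated (an assumption for this sketch).
    static byte[] encode(int sequence, String[] segments) {
        ByteBuffer buf = ByteBuffer.allocate(PACKET_SIZE); // big-endian by default
        buf.putInt(sequence);
        for (int i = 0; i < SEGMENTS; i++) {
            byte[] text = segments[i].getBytes(StandardCharsets.US_ASCII);
            byte[] slot = new byte[SEGMENT_SIZE];
            System.arraycopy(text, 0, slot, 0, Math.min(text.length, SEGMENT_SIZE));
            buf.put(slot);
        }
        return buf.array();
    }

    // Read the sequence number from the first 4 bytes.
    static int sequenceOf(byte[] packet) {
        return ByteBuffer.wrap(packet).getInt();
    }

    // Read one segment back, stopping at the zero padding.
    static String segmentOf(byte[] packet, int index) {
        int offset = 4 + index * SEGMENT_SIZE;
        int length = 0;
        while (length < SEGMENT_SIZE && packet[offset + length] != 0) {
            length++;
        }
        return new String(packet, offset, length, StandardCharsets.US_ASCII);
    }
}
```

A fixed-size layout keeps parsing trivial on the receiver, since the offset of each segment can be computed directly from its index.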

Simulation

Simulated the program by pulling data from a text file.

Measure the arrival percentage for different window sizes and different error rates.

Calculate the number of packets that are lost during the broadcast.

The simulated error rate is much larger than the anticipated actual transmission error rate.
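A simulation of this kind might look like the following Monte Carlo sketch. It is not the authors' simulator: it assumes each packet is dropped independently with a fixed loss rate and that every segment is carried in exactly N consecutive packets, and the names are hypothetical.

```java
import java.util.Random;

// Hypothetical loss simulation; assumes independent packet drops.
final class LossSimulation {
    // Returns the fraction of segments delivered for window size n.
    static double arrivalRate(int segments, int n, double lossRate, long seed) {
        Random rng = new Random(seed);
        // Index i stands for packet number i - (n - 1), so that every segment,
        // including the first ones, is carried by exactly n packets.
        boolean[] arrived = new boolean[segments + n - 1];
        for (int i = 0; i < arrived.length; i++) {
            arrived[i] = rng.nextDouble() >= lossRate;
        }
        int delivered = 0;
        for (int s = 0; s < segments; s++) {
            // Segment s is carried by the packets at indices s .. s + n - 1.
            for (int i = s; i < s + n; i++) {
                if (arrived[i]) {
                    delivered++;
                    break;
                }
            }
        }
        return (double) delivered / segments;
    }
}
```

Calling `LossSimulation.arrivalRate(...)` over a grid of loss rates and window sizes from 1 to 10 would yield one arrival-rate curve per window size.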

Simulation Results

[Figure: Arrival Rate of different window sizes. Arrival rate (axis from 0% to 120%) plotted against loss rate (0% to 100%), with one curve per window size from 1 segment to 10 segments.]

Remaining Issues

PDAs

Robust Speech Recognition Program
