Swapnil Belhe Team Lead C-DAC, GIST, Pune [email protected] WELCOME Enabling Mobile & Wireless Handheld Devices with Indian Languages

Swapnil Belhe Team Lead C-DAC, GIST, Pune swapnilb@cdac

Jan 04, 2016


Transcript
Page 1: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Swapnil Belhe, Team Lead, C-DAC, GIST, Pune [email protected]

WELCOME

Enabling Mobile & Wireless Handheld Devices with Indian Languages

Page 2: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Telecom Subscribers Base in India

Ref: TRAI Press Release Dec 2011

• 70% of India’s population (~800 million people) lives in rural areas [Census 2001]

Page 3: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Future Growth

• Language enabled mobile devices

• Big and easy to read displays

• Multi-modal interactions (like keyboard, pen, speech etc)

• Indian language content and applications

“...What matters most about a new technology is not how it works, but how people use it and the changes it brings about in human lives...”
Frances Cairncross

Page 4: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

CDAC-GIST: IL Display Solution Scalability
Indian Language Ecosystem on All Devices

• Public Display Systems
• Printers with Indian language support
• Mobiles (Feature Phones & Smart Phones)
• Tablets
• Set Top Boxes
• Lifts, Washing Machines, Microwave Ovens etc.

Page 5: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

What is required from the Handset?

• Mobiles and PDAs are an important means of communication today. We believe the end user expects at least the following text-based Indian language components from a mobile or PDA:
– SMS editing
– Indian language menus
– Phonebook data
– Notes/Notepad/Word
– Indian language based games
– Browser supporting Indian languages
– Multi-modal input, e.g. handwriting
– Text to Speech

• VAS (Value Added Services): alerts, reminders, mandi rates, farming tips etc.

Page 6: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Some Inputting Methods for Smart Phones

Page 7: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

W3C MW4D IG 2009

• Recommendations
– Targeted at network operators
  • Implementing Unicode support for SMS on all networks
– Targeted at handset manufacturers
  • Handsets should be extensible to support external/new character sets and to be usable in all languages of the world
  • Handsets should provide software modules such as Text-to-Speech engines to improve accessibility and offer opportunity for a greater support of voice

Page 8: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Challenges in Wide Spread usage of Mobiles other than for Voice Calls

Page 9: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Mobile Handset Scenario - Past

• Early mobiles had small displays, such as
– 64 x 64
– 96 x 72
– 128 x 128
• They also had very little memory, measured in kilobytes (KB)
• This was sufficient for languages like English, which need only 52 linear characters (26 upper-case and 26 lower-case letters)
• Many such legacy phones are still in the market

Page 10: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Mobile Handset Scenario

• Nowadays, with the advent of better LED and TFT displays, screen sizes have increased to
– 256 x 256
– 512 x 420
– 640 x 480
• Available memory is now in megabytes (MB)
• But the number of operating systems (OS) has also increased: Windows Mobile, Symbian, Android, Embedded Linux, etc.

Mobile Handset | Features | IL Communication through
Low End | SMS | Picture SMS
Mid Range | SMS, MMS, WAP, J2ME | SMS, MMS (image)
High End | SMS, MMS, WAP, J2ME, Browser, E-mail | SMS, E-mail

Page 11: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Mobile Handset Scenario

• Even though memory and display sizes have increased, seamless support for Indian languages on handsets still requires the following:
– Indian language keyboard for text input
– Rasterizer for displaying text
– Indian language layout engine
– Fonts
– Common storage format

Ideally, all these components should be backward compatible with legacy handsets.

Page 12: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Indian Language SMS

Indian Language SMS Garbled Characters

• Current scenario: on most handsets, Indian language text appears garbled. Only a few compatible handsets display the text properly.

Page 13: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Most mobile manufacturers support Indian languages, so where is the problem?

• Proprietary picture-SMS based solutions
– Require picture-enabled handsets to display; message size is limited to between 72x28 and 72x56 pixels, hence only a few words can be sent
• Different SMS encodings
• Keypads with Indian languages are available, but every vendor's keypad differs in
– Number of characters on each key
– Choice of characters placed on different keys
– Position of characters on a specific key
– Height and width of characters displayed on the keypad
• Everyone is using different proprietary keyboard layouts

Page 14: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Fonts

• There are three types of fonts

– Bitmap fonts (used by low-end handsets)

– TrueType fonts (used by high-end handsets)

– OpenType fonts (currently not widely used)

Page 15: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Issues - Fonts

• Every handset model differs from the others in terms of
– Screen resolution
– Screen color depth
– Screen technology (especially display pitch)
– Available memory

• Thus a bitmap font designed for one handset model may not be readable on other handsets. Hence fonts have to be custom designed to the specifications of every model

• For TrueType fonts the display is governed by the handset’s operating system (Symbian, Windows Mobile, Android etc.) and the availability of an Indic layout mechanism

Page 16: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

What is Layout Mechanism?
• A layout mechanism allows proper display of Indian language text.
• Without a layout mechanism, re-ordering of text will not happen; e.g., in Devanagari the vowel sign ि is stored after its consonant but must be drawn before it (क + ि = कि).

Lay-outing provides basic facilities like
• Text re-ordering
• Bold, italics
• Insert, delete characters
• Cursor movement
• Word wrapping
• Text scrolling
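As a rough illustration of the text re-ordering facility, the sketch below moves the Devanagari vowel sign I (U+093F) in front of the consonant cluster it follows in storage order. The function name `visual_order` and the restriction to one vowel sign and the basic consonant range are assumptions made for this example; a real layout engine works on glyphs and also handles conjuncts, reph and other scripts.

```python
import re

# Basic Devanagari consonants KA..HA (assumption: nukta forms are ignored here).
CONS = "[\u0915-\u0939]"
VIRAMA = "\u094D"
VOWEL_SIGN_I = "\u093F"  # stored after the consonant, drawn before it

# A consonant cluster (consonants joined by virama) followed by the vowel sign I.
CLUSTER_I = re.compile(f"((?:{CONS}{VIRAMA})*{CONS})({VOWEL_SIGN_I})")


def visual_order(text):
    """Return the code points in left-to-right drawing order."""
    return CLUSTER_I.sub(r"\2\1", text)


def code_points(text):
    """Format a string as a list of U+XXXX labels for inspection."""
    return [f"U+{ord(ch):04X}" for ch in text]


stored = "\u0915\u093F\u0924\u093E\u092C"  # the word 'kitab' in storage order
print("stored :", code_points(stored))
print("drawn  :", code_points(visual_order(stored)))
```

The code points are printed rather than the re-ordered string itself, because a modern renderer would apply its own layout to that string and show it incorrectly.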

Page 17: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Other Issues/Challenges for Mobiles

Indic Languages
• One script, many languages: covering only 10 languages may not support all 22 official languages.
• Some languages are written in more than one script, so there is a dependency at the implementation level. E.g.
– Sindhi can be written using Devanagari and Perso-Arabic
– Santhali can be written using Devanagari and Ol Chiki
– Manipuri can be written using Bangla and Meetei Mayek

Page 18: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Other Issues/Challenges for Mobiles

• Viewing Indian language websites, reading emails sent from a PC, and displaying files created on a PC all have issues on handsets, such as
– Lack of ZWJ/ZWNJ support on handsets

Page 19: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

• Explicit Virama: Halant is a dead consonant in the 1st case

Page 20: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Half Explicit consonant:
• Example of usage of the ZWJ character
– क + ् + ष = क्ष
– क + ् + ZWJ + ष = क्‍ष
• Kannada example
– ಕ + ್ + ಷ = ಕ್ಷ
– ಕ + ್ + ZWJ + ಷ = ಕ್‍ಷ
• The ZWJ in the above examples prevents the conjunct ligature, so Ka is displayed in its half form (or with a visible Virama where the font has no half form).
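The ZWJ sequences can be inspected at the code point level. The following minimal sketch builds the Devanagari pair (Ka, Virama, Ssa with and without ZWJ) and lists the Unicode name of each code point; the variable and function names are chosen for this illustration only. How the two strings are actually drawn depends on the handset's font and layout engine.

```python
import unicodedata

KA, VIRAMA, SSA = "\u0915", "\u094D", "\u0937"
ZWJ = "\u200D"  # ZERO WIDTH JOINER

conjunct = KA + VIRAMA + SSA        # renderer may form the conjunct ligature
no_ligature = KA + VIRAMA + ZWJ + SSA  # ZWJ asks the renderer not to


def describe(text):
    """Return the Unicode name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]


for label, text in (("without ZWJ", conjunct), ("with ZWJ", no_ligature)):
    print(label, len(text), describe(text))
```

A handset that silently drops ZWJ/ZWNJ stores and displays both sequences identically, which is the loss described above.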

Page 21: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Challenges in Overall Framework

Stakeholders
• Mobile Subscribers
• Handset Manufacturers
• SMSC Vendors
• Mobile Operators
• Content Providers

Page 22: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Need Standardization

• SMS: the 3GPP TS 23.038 standard for sending and receiving SMS, and its versions, were made primarily for English and European scripts
– Work has been started by CeWIT

• USSD: GSM 02.90 (ETSI TS 100 549), GSM 03.90 (ETSI TS 100 549)
– No work started for Indian languages

• CBS: 3GPP TS 25.419 SABP standard
– No work started for Indian languages

• CDMA: TIA/EIA IS-824

• The above standards, and many others, describe SMS protocols, trigger alerts, news broadcast etc. At present these standards do not cover Indian languages

• Supporting Indian languages in USSD and CBS will allow emergency disaster alerts and other e-governance alerts to be sent or broadcast to handsets

Page 23: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Cell Broadcast (CB)
• It is the most important protocol that has been overlooked, and it urgently requires Indianization.

• Cell Broadcast is a genuine one-to-many geographically focused messaging service.

• Cell Broadcast is ideal for delivering local or regional information suited to all the people in that area, such as, hazard warnings, local weather, health concerns (such as Swine Flu outbreaks), flight or bus delays, tourist information, parking and traffic information.

• Regardless of network state (congested or not) CB is always available.

• CB is a mature system that has been around for over a decade and is robust enough to support national public warning systems.

• There is no cost to the subscriber to receive the message.

[ref: W3C MW4D 2009]

Page 24: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Encoding Scheme

• 3GPP TS 23.038 GSM standard supports 7-bit default alphabets (and their octets) and UCS2.

• Possible schemes use one of the following encodings:

1. 7-bit Default GSM Alphabets

2. UCS2

3. 7-bit EA-ISCII
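To make the difference between the first two schemes concrete, the sketch below packs 7-bit values into octets (least significant bits first, as the GSM default alphabet is carried in an SMS) and compares that with UCS2, which spends two octets per character. The helper name `pack_septets` is an assumption for this example, and the shortcut of using ASCII codes only holds for unaccented Latin letters and digits, where the GSM 7-bit codes coincide with ASCII.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, least significant bits first."""
    bits = 0    # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for septet in septets:
        bits |= (septet & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(bits & 0xFF)
            bits >>= 8
            nbits -= 8
    if nbits:
        out.append(bits & 0xFF)  # final partial octet, zero padded
    return bytes(out)


# Letters a-z have the same code in the GSM 7-bit default alphabet as in ASCII.
latin = pack_septets(ord(ch) for ch in "hello")
# UCS2 stores every character as one 16-bit big-endian unit.
hindi = "\u0928\u092E\u0938\u094D\u0924\u0947".encode("utf-16-be")

print(len(latin), latin.hex())   # 5 characters in 5 octets
print(len(hindi), hindi.hex())   # 6 code points in 12 octets
print(len(pack_septets([0] * 160)))  # 160 septets fill the 140-octet payload
```

The same 140-octet payload therefore holds 160 characters in the 7-bit scheme but only 70 in UCS2.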

Page 25: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Encoding Scheme for SMS

• The complexity of Indian scripts requires more characters to be entered than for English
– 7-bit GSM: supports the Latin character set
– UCS-2: supports all languages of the world

• The cost of an SMS becomes high with UCS-2, but considering its advantages, support for UCS-2 should be made mandatory for all service providers and SMSCs, without escalating the cost, in order to promote the use of Indian languages
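The cost difference can be estimated by counting message segments. A single SMS carries 160 characters in the GSM 7-bit alphabet or 70 in UCS-2; concatenated messages carry 153 or 67 per segment because of the concatenation header. The sketch below is a simplified estimate under stated assumptions: the function name `sms_segments` is made up for this example, and only the basic GSM character table is checked (the extension table and national language shift tables are ignored).

```python
# Basic character table of the GSM 7-bit default alphabet (ESC position omitted).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)


def sms_segments(text):
    """Return (encoding, number of segments) needed to send text."""
    if all(ch in GSM7_BASIC for ch in text):
        encoding, units, single, multi = "GSM-7", len(text), 160, 153
    else:
        # Count 16-bit units; each Indic character in the BMP takes one unit.
        units = len(text.encode("utf-16-be")) // 2
        encoding, single, multi = "UCS-2", 70, 67
    if units <= single:
        return encoding, 1
    return encoding, -(-units // multi)  # ceiling division


print(sms_segments("a" * 160))        # fits in one 7-bit message
print(sms_segments("\u0915" * 160))   # same length in Devanagari needs UCS-2
```

Under these assumptions a 160-character Devanagari message is billed as three segments where the same number of Latin characters fits in one, which is the cost escalation the slide refers to.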

Page 26: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Encoding scheme

• Efforts are underway to add Indic language enhancements to 3GPP for sending SMS

• But it does not cover support for ZWJ and ZWNJ characters. Hence it will not be pleasant to read Indian Language Websites.

• It also does not cover layouting of Indian text which is very crucial for common display and common storage of text in all handsets.

• Hence even if this standard is implemented it will be falling short of the goal of reaching out to more people.

Page 27: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Recommended Best Practices For Indian Languages on Mobile

Parameter: Usable Screen Width – 120 pixels min.
• For Indian language text, the text height should not exceed 16 pixels if a complete valid word is to fit within the limited width of 120 pixels.
• With a taller font, the effective width of a word or syllable may exceed 120 pixels, so the complete word or syllable cannot be displayed without panning.

Breaking/Wrapping of the text
• Additional guidelines are to be provided for breaking text at the word level or at the syllable level, depending on the font size and display size.
• Guidelines for a hyphenation mechanism are to be provided for breaking words to enable text wrapping.

Page 28: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Line Wrapping for Indian Scripts

Word Level Wrapping With Hyphenation

Syllable Level Wrapping With Hyphenation

Wrong Line Wrapping

Page 29: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Guidelines

Cursor Movement
• Guidelines should also be provided for cursor movement and for deletion of a character/syllable.
• While editing text in Indian languages, the cursor should move syllable by syllable instead of by individual character/vowel.

Deletion
• Deletion should happen syllable by syllable, which avoids the burden of processing and redisplaying a half syllable. The ‘Clear’ key deletes the syllable just after the cursor position and the ‘Back’ key deletes the syllable just before it.
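Syllable-wise cursor stops and deletion could be sketched as follows. This is a simplified illustration for Devanagari only: the regular expression treats a consonant cluster with its vowel signs and modifiers as one syllable, and the names `syllables`, `cursor_stops`, `back_key` and `clear_key` are assumptions for this example, not part of any handset API.

```python
import re

CONS = "[\u0915-\u0939\u0958-\u095F]\u093C?"  # consonant, optional nukta
VIRAMA = "\u094D"
ZW = "[\u200C\u200D]?"                         # optional ZWNJ / ZWJ
MATRA = "[\u093E-\u094C]*"                     # dependent vowel signs
MOD = "[\u0900-\u0903]*"                       # candrabindu, anusvara, visarga

SYLLABLE = re.compile(
    f"{CONS}(?:{VIRAMA}{ZW}{CONS})*(?:{VIRAMA}{ZW}|{MATRA}{MOD})"
    f"|[\u0904-\u0914]{MOD}"   # independent vowel
    "|.",                      # anything else is its own unit
    re.DOTALL,
)


def syllables(text):
    """Split text into syllable-sized editing units."""
    return [m.group(0) for m in SYLLABLE.finditer(text)]


def cursor_stops(text):
    """Return the string offsets where the cursor is allowed to rest."""
    stops = [0]
    for unit in syllables(text):
        stops.append(stops[-1] + len(unit))
    return stops


def back_key(text, cursor):
    """Delete the syllable just before the cursor; return (text, cursor)."""
    if cursor == 0:
        return text, 0
    start = max(p for p in cursor_stops(text) if p < cursor)
    return text[:start] + text[cursor:], start


def clear_key(text, cursor):
    """Delete the syllable just after the cursor; return (text, cursor)."""
    later = [p for p in cursor_stops(text) if p > cursor]
    end = min(later) if later else cursor
    return text[:cursor] + text[end:], cursor


word = "\u0928\u092E\u0938\u094D\u0924\u0947"  # 'namaste', 6 code points
print(syllables(word), cursor_stops(word))
print(back_key(word, len(word)))
```

With this segmentation the six code points of the sample word give only four cursor stops, and one press of ‘Back’ at the end removes the whole final conjunct syllable instead of leaving a dangling half form.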

URL
• With Indian language domain names (IDN) such as .भारत likely to come, it will be necessary to provide these as a separate key (like .com) for typing in the browser URL bar.
• Also, for handsets sold in India, a .in key should be made available on all handsets.

Page 30: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Very Important points to achieve common Indian Language Support

• Indic eco-system
– All handset manufacturers must use the same fonts, layout engine and inputting scheme for all handset models.
– All content providers must use the same encoding scheme for sending/receiving SMSs, possibly UCS-2, which is a global standard.
– Ideally, all of the above should be implemented by a single entity so that updating and maintenance are easy, especially since Indian language computing is constantly evolving with new standards such as Unicode 5.2, 6.0, 6.1 etc.

Page 31: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

• Regulation & Certification
– The 3GPP specification states that the current work undertaken for including Indian languages in 3GPP TS 23.038 is not intended to be implemented until a formal request is issued by the relevant national regulatory body.
– There should be an independent verifying and benchmarking agency that can endorse the compatibility of the latest equipment (SMSC, RNC, handsets etc.) with the prescribed Indian standards.
– The verification and certification agency should have thorough knowledge of Indian language issues (all 22 languages) and a background in mobile computing.

Page 32: Swapnil Belhe Team Lead C-DAC, GIST, Pune  swapnilb@cdac

Thank You