Issues with SignWriting in Unicode 8. Prepared for UTC #144 / L2 #241 (July 27-31, 2015), a Unicode Technical Committee meeting in Redmond, WA, by Stephen E Slevinski Jr in association with the Center for Sutton Movement Writing
Page 1: SIGNWRITING IN UNICODE 8 ISSUES 2015 by Stephen E Slevinski Jr

Issues with SignWriting in Unicode 8

Prepared for UTC #144 / L2 #241 (July 27-31, 2015), a Unicode Technical Committee meeting in Redmond, WA

by Stephen E Slevinski Jr

in association with the Center for Sutton Movement Writing

Page 2:

My Background

Bachelor of Science in Mathematics

Raised two kids with sign language

Started collaboration with Valerie Sutton in 2004, continuing today

Complete symbol encoding model on PUA Plane 16 (37,811 characters)

Complete script encoding model on PUA Plane 15 (1,179 characters)

Argued with Unicode in 2011 and then walked away

Released the ISWA 2010 symbol set in 2010

Finalized Formal SignWriting in ASCII on Jan 12, 2012

5 years of stability with the symbol set and font design

3 1/2 years of stability with the character encoding models

Involved with dozens of sign languages around the world

Foundation for all online use and modern publishing efforts

Page 3:

SignWriting in Software

All major SignWriting editors and viewers are compatible.

• SignPuddle Online: Primary source of written sign language.

• Delegs Editor: Educational software from Germany for bilingual education.

• SignWriter Studio: General purpose SignWriting editor, integrated dictionary, and printing.

• SWift: SignWriting improved fast transcriber that aims to simplify the editing process.

• JSPad: SignWriting editor for Japanese sign language, based at Gifu University.

• Tunisigner: Interact with SignWriting notations through a 3D virtual signer able to reproduce the exact gestures represented within the sign language transcription.

• SignTyp: A linguistic coding system developed by Rachel Channon through an NSF grant that is being integrated with SignWriting.

Page 4:

SignWriting in Software

SignMaker 2015

Cross-browser, drag-and-drop sign editor, with dictionary and advanced sign searching

http://www.signbank.org/signmaker.html

[Chart: Code Breakdown, in KB (axis from 0 KB to 70 KB). Categories: Configuration; Support Libraries; Custom HTML, JS, and CSS]

Page 5:

SignWriting in Software

JavaScript-based SignWriting Keyboard

Keyboard editing has returned to SignWriting

Wikimedia Incubator

The keyboard editor is enabled on Wikimedia Incubator for the American Sign Language Wikipedia and every other sign language project.

Bookmarklet

Store JavaScript in a bookmark and you can use SignWriting on any web page in any text field.

Any Website

Add a few KB of JavaScript and the keyboard editor can be enabled on any website using standard edit boxes and visual presentation.

http://www.signwriting.org/symposium/presentation0041.html

Page 6:

What about Unicode?

PUA Plane 15 design (1,179 characters)

The symbol-only design removed 2-D layout by dropping 5 structural markers and 500 number characters

N4015 Preliminary Unicode (674 characters)

N4090 Revised Unicode (672 characters)

N4342 Unicode Proposal (672 characters)

A new inherent design removes 2 characters (F1 and R1) and breaks collation as stated in the proposal

A new facial diacritic design is proposed that is unsupported and untested

The original design is still compatible with the community efforts.
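The character counts on this slide reconcile with one another, which a quick check makes explicit. The variable names below are illustrative only.

```python
# Character counts quoted on this slide.
plane15_total = 1179        # PUA Plane 15 script encoding model
structural_markers = 5      # dropped by the symbol-only design
number_characters = 500     # dropped by the symbol-only design

# N4015: symbols only, no 2-D layout characters.
n4015_total = plane15_total - structural_markers - number_characters

# N4090 / N4342: the inherent design also drops F1 and R1.
n4342_total = n4015_total - 2

print(n4015_total)  # 674
print(n4342_total)  # 672
```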

Page 7:

Issues with SignWriting in Unicode 8

The Unicode 8 specification will not be used for any SignWriting project around the world.

The Unicode 8 specification for SignWriting is politically valuable, but unhelpful for developers.

Page 8:

Issues with SignWriting in Unicode 8

The issue of the moment is sorting, but there are three main issues.

If we address all of the issues for SignWriting, the existing international community of SignWriters is ready, able, and willing to embrace the standard.

Page 9:

Issue 1: Unicode 8 is incomplete

Unicode 8 only encodes the symbols and ignores the issue of layout.

Unicode 8 is missing the structural markers and number characters required for 2-D layout.

Unicode 8 requires SVG for the visual presentation.

Unicode 8 requires additional characters/markup to write a sign.

http://signbank.org/SignWriting_Character_Viewer.html

Page 10:

Issue 2: Unicode 8 is flawed

The idea of inherent characters breaks from community use, both today and historically.

Because of inherent modifiers, sorting is broken, searching is ambiguous, and replacements can be destructive.

Triadic Symbol: a symbol is identified with a string of 3 tokens.

Symbol base tokens: w, s, P

Symbol modifier tokens: i (fill), o (rotation)

Writing symbol: w i o

Punctuation symbol: P i o
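As a rough illustration of the triadic model, the sketch below splits an ASCII symbol key such as S10007 into base, fill, and rotation. It then emits either three explicit Plane 15 characters, or a Unicode 8 sequence in which the first fill and first rotation are inherent and therefore omitted. The offsets are inferred from code points quoted elsewhere in this deck (uFD830 uFD810 uFD820 for S10000, u1DA8B u1DAA1 for S38b01, 1DA9B for F2). Extending them linearly to every symbol is an assumption, and the function names are hypothetical.

```python
def split_key(key):
    """Split a symbol key like 'S10007' into (base, fill, rotation) integers."""
    base = int(key[1:4], 16)    # three hex digits, e.g. 0x100
    fill = int(key[4], 16)      # 0 is the first fill (F1)
    rotation = int(key[5], 16)  # 0 is the first rotation (R1)
    return base, fill, rotation


def to_plane15(key):
    """Explicit model: always three characters (base, fill, rotation)."""
    base, fill, rotation = split_key(key)
    return (chr(0xFD830 + base - 0x100)
            + chr(0xFD810 + fill)
            + chr(0xFD820 + rotation))


def to_unicode8(key):
    """Inherent model: F1 and R1 are implied, so 1 to 3 characters result."""
    base, fill, rotation = split_key(key)
    out = chr(0x1D800 + base - 0x100)
    if fill > 0:
        out += chr(0x1DA9A + fill)      # assumed offset; F2 is U+1DA9B
    if rotation > 0:
        out += chr(0x1DAA0 + rotation)  # assumed offset; R2 is U+1DAA1
    return out


print(len(to_plane15("S10000")), len(to_unicode8("S10000")))
```

Under these assumptions every Plane 15 symbol has the same length, while the Unicode 8 length depends on which modifiers happen to be the first of their kind.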

Page 11:

Issue 2: Unicode 8 is flawed

Sorting is broken

1D800 SIGNWRITING HAND-FIST INDEX (HFI)
1DAA1 SIGNWRITING ROTATION MODIFIER-2 (R2)
1DA9B SIGNWRITING FILL MODIFIER-2 (F2)

Correct sorting with F1 & R1

1. HFI F1 R1
5. HFI F1 R1 HFI F1 R1
2. HFI F1 R2
6. HFI F1 R2 HFI F1 R1
3. HFI F2 R1
7. HFI F2 R1 HFI F1 R1
4. HFI F2 R2

Incorrect sorting without F1 & R1

1. HFI
5. HFI HFI
3. HFI F2
7. HFI F2 HFI
4. HFI F2 R2
2. HFI R2
6. HFI R2 HFI

http://www.unicode.org/L2/L2015/15184-signwriting-ducet.txt

http://signpuddle.net/15184-signwriting-ducet-response.txt

http://www.unicode.org/L2/L2015/15202-signwriting-ducet-aux.txt

Page 12:

Issue 2: Unicode 8 is flawed

Sorting is broken

1D800 SIGNWRITING HAND-FIST INDEX (HFI)
1DAA1 SIGNWRITING ROTATION MODIFIER-2 (R2)
1DA9B SIGNWRITING FILL MODIFIER-2 (F2)

DUCET Fix: Correct sorting with DUCET

HFI weight of 100
F2 weight of 420
R2 weight of 410

1. HFI: 100
5. HFI HFI: 100 100
2. HFI R2: 100 410
6. HFI R2 HFI: 100 410 100
3. HFI F2: 100 420
7. HFI F2 HFI: 100 420 100
4. HFI F2 R2: 100 420 410

Correct Sort Order: 1, 5, 2, 6, 3, 7, 4

Incorrect Sort Order: 1, 2, 3, 4, 5, 6, 7
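The orderings on the sorting slides can be reproduced with a small sketch that compares each entry as a list of per-token weights, left to right. This is a deliberate simplification of the Unicode Collation Algorithm (single-level weights only). The explicit ranks for F1 and R1 are hypothetical, since Unicode 8 has no such characters; the code points and the weights 100, 420, and 410 are taken from the slides.

```python
# The seven test entries, written with explicit F1 and R1 tokens.
ENTRIES = {
    1: ["HFI", "F1", "R1"],
    2: ["HFI", "F1", "R2"],
    3: ["HFI", "F2", "R1"],
    4: ["HFI", "F2", "R2"],
    5: ["HFI", "F1", "R1", "HFI", "F1", "R1"],
    6: ["HFI", "F1", "R2", "HFI", "F1", "R1"],
    7: ["HFI", "F2", "R1", "HFI", "F1", "R1"],
}

# Hypothetical ranks for the explicit model: base, then fills, then rotations.
EXPLICIT_RANK = {"HFI": 0, "F1": 1, "F2": 2, "R1": 3, "R2": 4}
# Unicode 8 code points and the DUCET weights quoted on the slide.
CODE_POINT = {"HFI": 0x1D800, "F2": 0x1DA9B, "R2": 0x1DAA1}
DUCET_WEIGHT = {"HFI": 100, "F2": 420, "R2": 410}


def inherent(tokens):
    """Drop F1 and R1, as the Unicode 8 inherent design does."""
    return [t for t in tokens if t not in ("F1", "R1")]


def order(weights, transform=lambda tokens: tokens):
    """Return entry numbers sorted by their list of token weights."""
    return sorted(ENTRIES, key=lambda n: [weights[t] for t in transform(ENTRIES[n])])


explicit_order = order(EXPLICIT_RANK)
code_point_order = order(CODE_POINT, inherent)
ducet_order = order(DUCET_WEIGHT, inherent)

print(explicit_order)    # [1, 5, 2, 6, 3, 7, 4]
print(code_point_order)  # [1, 5, 3, 7, 4, 2, 6]
print(ducet_order)       # [1, 5, 2, 6, 3, 7, 4]
```

In this simplified model, plain code point order without F1 and R1 gives the incorrect list, while weighting R2 below F2 restores the order obtained with explicit modifiers.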

Page 13:

Issue 2: Unicode 8 is flawed

Searching is ambiguous

1D800 SIGNWRITING HAND-FIST INDEX (HFI)
1DAA1 SIGNWRITING ROTATION MODIFIER-2 (R2)
1DA9B SIGNWRITING FILL MODIFIER-2 (F2)

Searching with F1 & R1

1. HFI F1 R1
5. HFI F1 R1 HFI F1 R1
2. HFI F1 R2
6. HFI F1 R2 HFI F1 R1
3. HFI F2 R1
7. HFI F2 R1 HFI F1 R1
4. HFI F2 R2

Searching for the symbol HFI F1 R1 correctly finds 4 matches

Searching without F1 & R1

1. HFI
5. HFI HFI
3. HFI F2
7. HFI F2 HFI
4. HFI F2 R2
2. HFI R2
6. HFI R2 HFI

Searching for the symbol HFI incorrectly finds 10 matches without negative lookaheads
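A minimal sketch of the search problem, using space-separated token names instead of real characters so that the patterns stay readable. The helper name and the lookahead pattern are illustrative assumptions. In this sketch the figure of 4 corresponds to the number of matching entries, and the figure of 10 to the number of raw occurrences of HFI.

```python
import re

# The seven entries with explicit F1 and R1 tokens.
EXPLICIT = [
    "HFI F1 R1", "HFI F1 R2", "HFI F2 R1", "HFI F2 R2",
    "HFI F1 R1 HFI F1 R1", "HFI F1 R2 HFI F1 R1", "HFI F2 R1 HFI F1 R1",
]
# The same entries with the inherent F1 and R1 tokens dropped.
INHERENT = [
    " ".join(t for t in entry.split() if t not in ("F1", "R1"))
    for entry in EXPLICIT
]


def count(pattern, entries):
    """Return (matching entries, total occurrences) for a regex pattern."""
    hits = [len(re.findall(pattern, entry)) for entry in entries]
    return sum(1 for h in hits if h), sum(hits)


# Explicit model: the full symbol is a literal three-token string.
explicit_hits = count(r"\bHFI F1 R1\b", EXPLICIT)
# Inherent model, naive search: every HFI matches, whatever follows it.
naive_hits = count(r"\bHFI\b", INHERENT)
# Inherent model, with a negative lookahead that rejects a following modifier.
lookahead_hits = count(r"\bHFI\b(?! [FR]\d)", INHERENT)

print(explicit_hits)   # (4, 5)
print(naive_hits)      # (7, 10)
print(lookahead_hits)  # (4, 5)
```

The lookahead recovers the explicit result, but every search for a symbol with inherent modifiers then has to carry that extra clause.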

Page 14:

Issue 2: Unicode 8 is flawed

Searching is ambiguous

Query String: QS10000S20500

Searching for signs that include 2 exact symbols will return these results from the ASL Dictionary.

Page 15:

Issue 2: Unicode 8 is flawed

Searching is ambiguous

Query String: QS100uuS205uu

Plus 6 more pages of signs.

In Unicode 8, searching for a symbol base without fill or rotation modifiers will return 6 times as much noise as signal.
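The difference between the two query strings can be sketched as a translation from query terms to regular expressions. This is a simplified illustration, not the SignPuddle implementation: it assumes that "u" in a query term means "any fill" or "any rotation", that fills run 0-5 and rotations 0-f, and that a symbol inside a sign is followed by coordinates such as 490x483. The function names are hypothetical, and the sample sign is the Formal SignWriting string shown later in this deck.

```python
import re

# One query term: S, three hex digits for the base, then fill and rotation,
# either of which may be the wildcard "u".
SYMBOL_TERM = re.compile(r"S[123][0-9a-f]{2}[0-5u][0-9a-fu]")


def term_to_regex(term):
    """Turn 'S10000' or 'S100uu' into a regex for one spatial symbol."""
    fill = "[0-5]" if term[4] == "u" else term[4]
    rotation = "[0-9a-f]" if term[5] == "u" else term[5]
    # A spatial symbol is a key followed by coordinates such as 490x483.
    return term[:4] + fill + rotation + r"\d{3}x\d{3}"


def query_matches(query, sign):
    """True if every symbol term in the query occurs in the sign."""
    terms = SYMBOL_TERM.findall(query)
    return all(re.search(term_to_regex(t), sign) for t in terms)


sign = "AS18711S20500M514x517S18711490x483S20500486x506"
print(query_matches("QS18711S20500", sign))  # True: both exact symbols present
print(query_matches("QS187uuS205uu", sign))  # True: any fill and rotation
print(query_matches("QS18710S20500", sign))  # False: rotation differs
```

An exact query stays narrow, while the "uu" form matches every fill and rotation of the base, which is the broad search that a bare Unicode 8 base character resembles.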

Page 16:

Issue 2: Unicode 8 is flawed

Replacements can be destructive

The TrueType Fonts use ligatures to support multiple character sets.

https://github.com/Slevinski/signwriting_2010_tools

Plane 15 Characters

sub uFD830 uFD810 uFD820 by S10000;
sub uFD830 uFD810 uFD821 by S10001;
sub uFD830 uFD810 uFD822 by S10002;
sub uFD830 uFD810 uFD823 by S10003;
sub uFD830 uFD810 uFD824 by S10004;
sub uFD830 uFD810 uFD825 by S10005;
sub uFD830 uFD810 uFD826 by S10006;
sub uFD830 uFD810 uFD827 by S10007;

Increasing or decreasing symbol keys both work without issue.

Unicode 8 Characters

sub u1DA8B u1DAA7 by S38b07;
sub u1DA8B u1DAA6 by S38b06;
sub u1DA8B u1DAA5 by S38b05;
sub u1DA8B u1DAA4 by S38b04;
sub u1DA8B u1DAA3 by S38b03;
sub u1DA8B u1DAA2 by S38b02;
sub u1DA8B u1DAA1 by S38b01;
sub u1DA8B by S38b00;

Symbol keys must be listed in decreasing order to avoid destruction.
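Why rule order matters for the Unicode 8 characters can be shown with ordinary sequential string replacement. This is a sketch of the general hazard, not a model of how a font engine applies ligature rules. The rule list mirrors the Unicode 8 substitutions on this slide; the helper name `apply_rules` is hypothetical.

```python
BASE = "\U0001DA8B"  # base character that the rules map to S38b00

# Longest sequences first: base + rotation modifier, then the bare base.
RULES_DECREASING = [
    (BASE + chr(0x1DAA0 + i), f"S38b0{i}") for i in range(7, 0, -1)
]
RULES_DECREASING.append((BASE, "S38b00"))


def apply_rules(rules, text):
    """Apply each (source, target) replacement in the order given."""
    for source, target in rules:
        text = text.replace(source, target)
    return text


# Two symbols: base + ROTATION MODIFIER-2, followed by a bare base.
text = BASE + "\U0001DAA1" + BASE

safe = apply_rules(RULES_DECREASING, text)
destroyed = apply_rules(list(reversed(RULES_DECREASING)), text)

print(safe)             # S38b01S38b00
print(repr(destroyed))  # the bare-base rule ran first and orphaned the modifier
```

With explicit fill and rotation characters every source sequence has the same length and none is a prefix of another, so the order of the rules does not matter.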

Page 17:

Issue 3: Unicode 8 is fictional

Facial diacritics do not exist. There is no font support, no software support, and no data.

Facial diacritics are described in one document, using 177 words.

Facial diacritics have never been tested on any individual, let alone an international group.

Facial expressions are created using overlap and overlay of many symbols, using Cartesian coordinates for each.

Facial diacritics should be handled in software rather than the character encoding.

Facial diacritics development was quietly abandoned at the end of 2012.

Page 18:

[Diagram: Community Use of Formal SignWriting]

Components: Formal SignWriting, ASCII Lite Markup, Regular Expressions, Query Strings, JS, CSS, SVG, TTF, Graphite Font, PUA Plane 15, PUA Plane 16, Unicode 8

Annotations: 10% to 50% reduction; 15 to 50 times expansion; 15 times expansion; twice the size; process millions of characters per second; search results; style text; single character per symbol; ligatures of 1 to 3 characters; Cartesian coordinates with GPOS; isomorphic; preferred; unused; prototype; 6 KB zipped

Page 19:

Community Use: Formal SignWriting

Standard ASCII format is isomorphic to PUA Plane 15

AS18711S20500M514x517S18711490x483S20500486x506

Time: AS18711S20500

Space: M514x517S18711490x483S20500486x506

A: Sequence Marker

S18711, S20500: Symbol

M: Middle Lane SignBox

514x517: Max Coord (514,517)

S18711490x483, S20500486x506: Spatial Symbol, with coordinates (490,483) and (486,506)
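The breakdown of the example string can be reproduced with a short regular-expression parser. This is a sketch under stated assumptions rather than the reference grammar: the lane marker set [BLMR] and the symbol key ranges are assumed (the slide only shows M for the middle lane), and the function name `parse_sign` is hypothetical.

```python
import re

SYMBOL = r"S[123][0-9a-f]{2}[0-5][0-9a-f]"  # key: base, fill, rotation
COORD = r"\d{3}x\d{3}"                      # e.g. 514x517
FSW_SIGN = re.compile(
    rf"(?P<time>A(?:{SYMBOL})+)?"            # optional sequence (time) part
    rf"(?P<lane>[BLMR])(?P<max>{COORD})"     # signbox marker and max coordinate
    rf"(?P<space>(?:{SYMBOL}{COORD})*)"      # spatial symbols (space part)
)


def to_point(coord):
    """Convert '490x483' into the tuple (490, 483)."""
    return tuple(int(n) for n in coord.split("x"))


def parse_sign(text):
    """Split one sign into its sequence, lane, max coordinate, and spatials."""
    match = FSW_SIGN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Formal SignWriting sign: {text!r}")
    return {
        "sequence": re.findall(SYMBOL, match["time"] or ""),
        "lane": match["lane"],
        "max": to_point(match["max"]),
        "spatials": [
            (key, to_point(coord))
            for key, coord in re.findall(rf"({SYMBOL})({COORD})", match["space"])
        ],
    }


parsed = parse_sign("AS18711S20500M514x517S18711490x483S20500486x506")
print(parsed["sequence"])  # ['S18711', 'S20500']
print(parsed["max"])       # (514, 517)
print(parsed["spatials"])  # [('S18711', (490, 483)), ('S20500', (486, 506))]
```

Because every token has a fixed width, the whole sign is described by a single regular expression with no lookahead.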

Page 20:

Unicode 9: Ideal Solution

[Diagram: Unicode 9 ideal solution]

Components: Unicode 9, Regular Expressions, Query Strings, JS, CSS, TTF, Graphite Font

Annotations: 10% to 50% reduction; 15 to 50 times expansion; process millions of characters per second; search results; style text; Cartesian coordinates with GPOS; 6 KB zipped

Prototype font uses Cartesian coordinates for 2-D layout with Graphite

http://signpuddle.net/iswa/#smartfont

Page 21:

Too Late?

SignWriting is spreading around the world and exploding online. All of the SignWriting projects are using an ASCII solution and have no plans to switch to the Unicode 8 design for the symbols.

Without a full script solution for SignWriting, Unicode will not be used for SignWriting, especially the Unicode 8 design, which complicates otherwise simple routines.

Using Unicode for SignWriting is a great idea in theory, but there are few advantages and too many disadvantages to seriously consider applying the Unicode 8 design, even if sorting is fixed.

I left the Unicode effort at the end of 2011. In 2012, I was shown the latest proposal (N4342). I objected privately and asked that they produce a working font before they contact me again.

In 2014, I was told that SignWriting would be in Unicode 8. I reiterated my objections, pointing out the issues, and was told it was too late to change the design in any way.

Page 22:

Discussion Ideas

2-Color Fonts

SignWriting relies on a 2-color font. Currently, SignWriting mimics a 2-color font by using 2 TrueType Fonts: one for the line and another for the filling. If you have any experience with 2-color fonts, let’s discuss the possibilities.

2-Dimensional Layout with Graphite and Cartesian coordinates

SignWriting has a prototype font that uses Cartesian coordinates to control the 2-dimensional layout with Graphite and PUA Plane 15 characters. If you have any experience with 2-dimensional layout using Cartesian coordinates, let’s discuss the possibilities.

Alternate designs for a 2-dimensional script

This type of discussion is interesting, but it will not affect the SignWriting community. The standards are stable and widely used. This would make for an interesting project, but it is not work that I will be doing myself.

Page 23:

Discussion Ideas

Unicode 9 or 10

Can we deprecate Unicode 8? The community design has been stable for 3 1/2 years. There is an interested community and there are many possibilities for 2-color fonts and 2-dimensional layout.

Unicode 8

I will not be using Unicode 8. I partially support Unicode 8 with the SignWriting 2010 Fonts, but not the facial diacritics. I suggested that people avoid using SignWriting in Unicode 8. I’m willing to discuss any of the 3 issues that I have outlined, but I’m not invested in any tweaks to the Unicode 8 design.

Symbol Encoding Model: PUA Plane 16 (37,811 characters)

Script Encoding Model: PUA Plane 15 (1,179 characters)

Both designs are productive and used today.

Page 24:

Issues with SignWriting in Unicode 8

by Stephen E Slevinski Jr

http://slevinski.github.io

[email protected]

http://www.slideshare.net/StephenSlevinski/presentations