Unicode Support in SAP Web Application Server

Dr. Christian Hansen, Matthias Mittelstein
Server Technology Internationalization, SAP AG

Page 2: Unicode Support in SAP WAS.E - Archive

2002 SAP Labs, LLC, WAS203, Christian Hansen 2

Agenda

Scripts, Characters and Code Pages
• Conventional code pages
• Unicode

Unicode in the Web Application Server
• Front End
• Communication
• Application server
• Database
• Printing

Conversion to Unicode
• C and C++ programs
• ABAP programs
• Database
• System landscape

Availability and Release Planning

Summary

Page 3: Unicode Support in SAP WAS.E - Archive

Scripts, Characters and Code Pages: Languages

English
German
Turkish
Danish, Dutch, Finnish, French, Italian, Norwegian, Portuguese, Spanish, Swedish
Croatian, Czech, Hungarian, Polish, Rumanian, Slovakian, Slovene
Russian, Ukrainian
Greek
Hebrew
Thai
Korean
Japanese
Chinese
Taiwanese
Icelandic

Page 4: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: ISO8859-1

[Figure: language map as on page 3]

Page 5: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: ISO8859-5

[Figure: language map as on page 3]

Page 6: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: ISO8859-7

[Figure: language map as on page 3]

Page 7: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: ISO8859-8

[Figure: language map as on page 3]

Page 8: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: Shift-JIS

[Figure: language map as on page 3]

Page 9: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: GB2312

[Figure: language map as on page 3]

Page 10: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: KSC5601-1992

[Figure: language map as on page 3]

Page 11: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: Earlier SAP approaches

Earlier SAP approaches used a fixed mapping

from database to code page
• "single code page system"

from application server to code page
• "MNLS" (obsolete)
• R/3 releases 2.2F to 3.0C

from language to code page
• "MDMP" (current SAP technique for using multiple code pages)
• Using tables TCP0C, TCP0D, TCPDB

Page 12: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: Earlier SAP approaches (MDMP)

[Figure: West European View, Japanese View, Korean View]

Page 13: Unicode Support in SAP WAS.E - Archive

Conventional Code Pages: Earlier SAP approaches

Earlier SAP approaches have disadvantages as the number of concurrent system code pages increases:

• ambiguous data when accessing across a code page boundary
• data encoding hard to understand for non-SAP programs
• sophisticated programming techniques needed to handle data in the appropriate code page
• each user limited to her own language
• …
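The "ambiguous data" disadvantage can be made concrete with a small Python sketch. It is not SAP code: it only uses Python's standard codecs, and the byte value 0xE4 is chosen purely for illustration. It shows that one stored byte becomes a different character under each code page a reader might assume.

```python
# One stored byte, interpreted under four conventional code pages.
RAW = bytes([0xE4])

CODE_PAGES = ["iso8859_1", "iso8859_5", "iso8859_7", "iso8859_8"]

def interpretations(raw, code_pages):
    """Decode the same bytes under each code page and return the results."""
    return {cp: raw.decode(cp) for cp in code_pages}

views = interpretations(RAW, CODE_PAGES)
for code_page, text in views.items():
    # Show the code page, the resulting character and its Unicode code point.
    print(f"{code_page}: {text!r} U+{ord(text):04X}")
```

Without knowing which code page the writer used, a reader of this byte cannot tell a Latin, Cyrillic, Greek or Hebrew letter apart.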

Page 14: Unicode Support in SAP WAS.E - Archive

Solution: Unicode

[Figure: language map as on page 3]

And more languages can be supported easily, without the need for new code pages or other new methods.

Page 15: Unicode Support in SAP WAS.E - Archive

Solution: Unicode characters

[Figure: Unicode code space with the areas ASCII, General Scripts, Symbols, CJK Ideographs, Hangul, Compatibility and Surrogate Area]

65,000 characters

Additional 1,000,000 characters
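The two round figures can be checked with a little arithmetic. The sketch below is an illustration in Python, not part of the original slides. It assumes the standard UTF-16 surrogate scheme (1,024 high surrogates starting at D800 combined with 1,024 low surrogates starting at DC00), and the sample code point U+1D11E is an arbitrary choice.

```python
# Code points addressable with a single 16-bit unit (the "65,000 characters").
BMP_SIZE = 2 ** 16

# Code points reachable by combining a high and a low surrogate
# (the "additional 1,000,000 characters").
SUPPLEMENTARY_SIZE = 1024 * 1024

def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # upper 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits of the offset
    return high, low

high, low = to_surrogate_pair(0x1D11E)
print(f"BMP: {BMP_SIZE}, supplementary: {SUPPLEMENTARY_SIZE}")
print(f"U+1D11E -> {high:04X} {low:04X}")
```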

Page 16: Unicode Support in SAP WAS.E - Archive

Representation of Unicode Characters

Character          Unicode scalar value   UTF-16 big endian   UTF-16 little endian   UTF-8
a                  U+0061                 00 61               61 00                  61
ä                  U+00E4                 00 E4               E4 00                  C3 A4
α                  U+03B1                 03 B1               B1 03                  CE B1
(CJK ideograph)    U+3479                 34 79               79 34                  E3 91 B9

UTF-16 – Unicode Transformation Format, 16 bit encoding
• Fixed length, 1 character = 2 bytes (surrogate pairs = 2 + 2 bytes)
• Platform-dependent byte order (big/little endian)
• 2 byte alignment restriction

UTF-8 – Unicode Transformation Format, 8 bit encoding
• Variable length, 1 character = 1...4 bytes
• Platform independent
• no alignment restriction
• 7 bit US ASCII compatible
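The byte sequences on this slide can be reproduced with Python's built-in codecs. This is a sketch for checking the values, not SAP code. The helper names are made up for the example, and U+1D11E is an added sample (not on the slide) that shows the surrogate-pair case.

```python
# The four characters from the slide plus one supplementary character.
CHARACTERS = ["a", "\u00e4", "\u03b1", "\u3479", "\U0001d11e"]

def hex_bytes(data):
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

def representations(ch):
    """Return the scalar value and the three encodings of one character."""
    return {
        "scalar": f"U+{ord(ch):04X}",
        "utf-16-be": hex_bytes(ch.encode("utf-16-be")),
        "utf-16-le": hex_bytes(ch.encode("utf-16-le")),
        "utf-8": hex_bytes(ch.encode("utf-8")),
    }

table = [representations(ch) for ch in CHARACTERS]
for row in table:
    print(f'{row["scalar"]:<8} {row["utf-16-be"]:<12} {row["utf-16-le"]:<12} {row["utf-8"]}')
```

The last row takes 2 + 2 bytes in UTF-16 and 4 bytes in UTF-8, matching the length rules listed for the two formats.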

Page 18: Unicode Support in SAP WAS.E - Archive

Unicode in the SAP Web Application Server

Front-end
  non-Unicode:
  • SAP GUI
  Unicode:
  • Release 6.20 GUIs use UTF-8 for communication and UTF-8 and UTF-16 internally
  • WinGUI 6.30 will use UTF-16 internally

Communication
  non-Unicode:
  • RFC and other: Implicit reinterpretations and memcopy based loopholes create ambiguous data
  Unicode:
  • RFC, XML and other: Code page conversions on character data are explicit and mandatory

Appl. Server
  non-Unicode:
  • ASCII, DBCS, EBCDIC, MDMP, blended code pages, …
  Unicode:
  • UTF-16

Database
  non-Unicode:
  • ASCII, DBCS, EBCDIC, MDMP, ...
  • Or only bytes?
  Unicode:
  • UTF-16 or CESU-8, chosen by DBMS vendor

Printing
  non-Unicode:
  • One standard printer type for each code page
  • UTF-8 printer for all code pages
  Unicode:
  • UTF-8 printer to cover all characters
  • Normal printers restricted to local texts with reduced character set

Page 19: Unicode Support in SAP WAS.E - Archive

Unicode in the Front-end: SAPGUI

Use a 6.20 SAPGUI when working on a Unicode system.
a) Frontend code page is set to 4110 (UTF-8)
b) Multibyte functionality is activated (see note 508854)

The feature to use Unicode is built into the SAPGUIs. No separate executable or separate installation is necessary.

6.20 SAPGUIs use UTF-8 for communication when in Unicode mode.

Page 20: Unicode Support in SAP WAS.E - Archive

SAPGUI for the Windows Environment

WinGUI 6.20
• delivered
• Screenshot: Version 6.20 Revision 2 Patch level 18
• Standard installation

Page 21: Unicode Support in SAP WAS.E - Archive

SAPGUI for the Java Environment

JavaGUI 6.20
• delivered
• Screenshot: Version 6.20 Revision 4
• Standard installation
• Plus modified C:\program files\JavaSoft\JRE\1.3.1_02\lib\font.properties

Page 22: Unicode Support in SAP WAS.E - Archive

SAPGUI for the HTML Environment

WebGUI
• prototype
• Version 6.20
• Remember: WAS 6.20 comes with ITS 6.10

Page 23: Unicode Support in SAP WAS.E - Archive

Unicode and RFC (1)

RFC non-Unicode – non-Unicode
• Long running standard
• Receiver converts code pages, if it can

RFC Unicode – Unicode
• No problem

RFC Unicode – non-Unicode
• The Unicode side converts from/to the old code page
• In Unicode <-> MDMP communication, data is interpreted using language key information
• System settings allow possible conversion problems to be caught or ignored
• Applications have to readjust fields if structured data has been transported or stored in character containers

When the data has a complicated structure, RFC has used XML and UTF-8 since release 4.6C.

Page 24: Unicode Support in SAP WAS.E - Archive

Unicode and RFC (2)

• A Unicode receiver can receive all characters. (solid lines)

• A non-Unicode receiver cannot receive characters that are not in its own code page. But as long as you restrict the character set, data can be sent from everywhere to everywhere. (dotted lines)
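The limits of a non-Unicode receiver can be sketched with Python's standard codecs. This illustrates the general behaviour only and does not use the SAP RFC API. The function name and the `on_problem` switch are invented for the example and merely mimic the idea of a setting that either reports or tolerates characters the target code page lacks.

```python
def to_code_page(text, code_page, on_problem="strict"):
    """Convert Unicode text to a conventional code page.

    on_problem="strict" raises on characters outside the code page,
    on_problem="replace" substitutes '?' and carries on.
    """
    return text.encode(code_page, errors=on_problem)

german = "Gr\u00f6\u00dfe"                           # Größe
russian = "\u041f\u0440\u0438\u0432\u0435\u0442"     # Привет

# Restricted character set: every character exists in ISO8859-1.
ok = to_code_page(german, "iso8859_1")

# Cyrillic text sent to an ISO8859-1 receiver: the problem is reported ...
try:
    to_code_page(russian, "iso8859_1")
    caught = False
except UnicodeEncodeError:
    caught = True

# ... or tolerated, at the price of losing the characters.
ignored = to_code_page(russian, "iso8859_1", on_problem="replace")
print(ok, caught, ignored)
```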

Page 25: Unicode Support in SAP WAS.E - Archive

Unicode in the application servers

Application servers use UTF-16
• not UCS-4 or UTF-32, because that would be too expensive (memory, bandwidth)
• not UTF-8, because the dynamic length and offset would be too complicated (CPU consumption, robustness of applications)
• ABAP like Java

Kernel has only one C / C++ source

Applications have only one ABAP source

Additional hardware requirement:
• CPU 30-35%
• Memory 50%
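The trade-off between the three encodings can be illustrated in Python. The sample strings and helper names are assumptions made for this sketch, and the byte counts it prints are not meant to explain the hardware percentages quoted on the slide. It shows that UTF-32 always costs four bytes per character, that UTF-8 has a length that depends on the text, and that UTF-16 allows fixed offsets for characters outside the surrogate range.

```python
SAMPLES = {
    "English": "Unicode",
    "German": "Gr\u00f6\u00dfe",          # Größe
    "Greek": "\u03b1\u03b2\u03b3",        # αβγ
    "Japanese": "\u65e5\u672c\u8a9e",     # 日本語
}

def sizes(text):
    """Return the encoded length in bytes for each candidate encoding."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

def char_at_utf16(buf, index):
    """Fixed-width access: character i starts at byte 2*i (no surrogate pairs)."""
    return buf[2 * index: 2 * index + 2].decode("utf-16-le")

for language, text in SAMPLES.items():
    print(language, sizes(text))

# In UTF-16 the fourth character is found by offset arithmetic alone.
fourth = char_at_utf16(SAMPLES["German"].encode("utf-16-le"), 3)
print(fourth)
```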

Page 26: Unicode Support in SAP WAS.E - Archive

Unicode in the databases

Databases use UTF-16 or CESU-8 * internally
• Hidden by database client software library
• The library interface to the kernel uses UTF-16.

Additional hardware requirement:
• UTF-8/CESU-8 36%
• UTF-16 60-70%

Database         Encoding
Oracle           CESU-8
MS SQL Server    UTF-16
DB/2 6000        UTF-8
DB/2 400         UTF-16
DB/2 390         later: UTF-8
Informix         - not supported -
SAPDB 7.0        UTF-16

* CESU-8 is similar to UTF-8, but when binary sorting is used, it gives the same result as binary sorting on UTF-16BE. The difference between UTF-8 and CESU-8 is visible only for surrogate pairs.

See note 379940 for current status.
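The footnote on CESU-8 can be checked with a short Python sketch. Python has no built-in CESU-8 codec, so the `cesu8_encode` helper below is hand-written for this illustration and is not taken from any database library. The two sample characters are arbitrary choices: one from the upper end of the 16-bit range and one that needs a surrogate pair.

```python
import struct

def cesu8_encode(text):
    """Encode text as CESU-8: every UTF-16 code unit is encoded on its own."""
    utf16 = text.encode("utf-16-be")
    units = struct.unpack(f">{len(utf16) // 2}H", utf16)
    # 'surrogatepass' lets the UTF-8 codec encode single surrogate code units.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)

bmp_char = "\uff5e"            # U+FF5E, one 16-bit unit in UTF-16
supplementary = "\U00010000"   # U+10000, a surrogate pair in UTF-16
words = [bmp_char, supplementary]

# Binary sort order under each encoding.
by_utf8 = sorted(words, key=lambda w: w.encode("utf-8"))
by_utf16 = sorted(words, key=lambda w: w.encode("utf-16-be"))
by_cesu8 = sorted(words, key=cesu8_encode)

print(cesu8_encode(supplementary).hex(), supplementary.encode("utf-8").hex())
print(by_cesu8 == by_utf16, by_utf8 == by_utf16)
```

For characters without surrogate pairs the helper returns exactly the UTF-8 bytes, which is why the difference only shows up for supplementary characters.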

Page 27: Unicode Support in SAP WAS.E - Archive

Unicode and printers: LEXMARK UTF-8 printer

Printer of choice when connected to a Unicode system

Can handle any single language in an MDMP system

Can print mixed languages in an MDMP system, when SAPscript is used carefully. (See scan of printout.)

Page 29: Unicode Support in SAP WAS.E - Archive

Conversion of C and C++ programs

SAP: C and C++ programs, the Kernel (done)

• Modify sources (with tools ccU and ccQ++ and some manual changes). Replace char, strcmp, fopen, fread with SAP_UC, strcmpU, fopenU, freadU or SAP_RAW, strcmpR, , freadR

• Generate backward compatible non-Unicode kernels and UTF-16 kernels from the same sources.

• Suggest char16_t and u"String" to standard organizations.

• Use ICU in Unicode systems for locale-dependent functionality. The International Components for Unicode (ICU) is a C and C++ library that provides Unicode support functionality. ICU is a collaborative, open-source development project. It is licensed under the X License. For more details see http://oss.software.ibm.com/icu/index.hltml. There is also a parallel ICU for Java project: http://oss.software.ibm.com/icu4j/.

Customer: external RFC applications in C or C++

• Modify sources (with tools ccU and ccQ++ and some manual changes; see the SAP RFC Software Development Kit documentation for further information)
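Why byte-oriented functions such as strcmp cannot simply be kept for UTF-16 text can be sketched as follows. The example is written in Python and only mimics the C behaviour. `c_strlen` is a hypothetical stand-in for the C library function, and none of the SAP macros named above are used or modelled here.

```python
def c_strlen(buf):
    """Mimic C strlen: count bytes up to the first NUL byte."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end

text = "SAP"

# A conventional single-byte string with its one-byte terminator.
as_latin1 = text.encode("iso8859_1") + b"\x00"

# The same text as UTF-16 (little endian) with a two-byte terminator.
as_utf16 = text.encode("utf-16-le") + b"\x00\x00"

print(c_strlen(as_latin1))   # sees the whole string
print(c_strlen(as_utf16))    # stops at the zero byte inside the first character
```

Every ASCII character carries a zero byte in UTF-16, so code that treats character data as a NUL-terminated byte sequence has to be switched to a 16-bit character type and matching functions.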

Page 30: Unicode Support in SAP WAS.E - Archive

Conversion of ABAP programs

ABAP character data types (C,N,D,T,STRING) are automatically Unicode in a Unicode system.

• Major part of ABAP coding is ready for Unicode without any changes

• Minor part of ABAP coding has to be adapted to comply with Unicode restrictions (see Workshop ABAP217)

Use UCCHECK to activate Unicode syntax check and view problematic places

Do runtime tests to detect semantic changes in the application. Screen runtime tests with the ABAP Coverage Analyzer SCOV.

Page 31: Unicode Support in SAP WAS.E - Archive

Conversion of ABAP programs: UCCHECK

Page 32: Unicode Support in SAP WAS.E - Archive

Conversion of ABAP programs: SCOV

Page 33: Unicode Support in SAP WAS.E - Archive

Conversion of the database: Single Code Page System

Building up a Unicode system requires converting all character data in the system to Unicode.

In a single code page system the code page of the character data is unambiguous.

The conversion is done with R3load:

• R3load – export (conversion to Unicode is done)

• R3load – import (Unicode data is imported)

Downtime may be reduced with IMIG (incremental conversion with minimal downtime; in development).

Page 34: Unicode Support in SAP WAS.E - Archive

Unicode database conversion and downtime

[Figure: Non-Unicode system, Export File, Unicode system; phases labeled Online conversion time, Conversion of rest, Downtime]

Page 35: Unicode Support in SAP WAS.E - Archive

Conversion of the database: MDMP System

In MDMP systems the code page of the character data is ambiguous and has to be derived from secondary information:

• Scan database with R/3 transaction SPUMG
  - Find explicit language fields and hidden language fields
  - Recognize typical characters, recognize language-dependent words
  - Classify tables and report problematic data

• Enhance database or give hints

The conversion is done with R3load:

• R3load multi-code page export

• R3load import

• Post-conversion repair with SUMG (to correct wrong table classifications)

(plus IMIG)
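The idea of deriving the code page from a language field can be sketched in Python. This is a conceptual illustration only and says nothing about how SPUMG or R3load work internally. The language keys, the `LANGUAGE_CODE_PAGE` table and the sample rows are hypothetical.

```python
# Hypothetical language-key to code-page assignment, for illustration only.
LANGUAGE_CODE_PAGE = {
    "DE": "iso8859_1",
    "EN": "iso8859_1",
    "RU": "iso8859_5",
    "EL": "iso8859_7",
    "JA": "shift_jis",
}

def convert_rows(rows):
    """Decode each row with the code page implied by its language key.

    Rows without a usable language key, or whose bytes do not fit the
    implied code page, are collected as problematic instead of guessed.
    """
    converted, problematic = [], []
    for language, raw in rows:
        code_page = LANGUAGE_CODE_PAGE.get(language)
        if code_page is None:
            problematic.append((language, raw))
            continue
        try:
            converted.append((language, raw.decode(code_page)))
        except UnicodeDecodeError:
            problematic.append((language, raw))
    return converted, problematic

# Sample rows as they might sit in a multi-code-page table (made-up data).
ROWS = [
    ("DE", "Gr\u00f6\u00dfe".encode("iso8859_1")),
    ("RU", "\u041f\u0440\u0438\u0432\u0435\u0442".encode("iso8859_5")),
    ("JA", "\u65e5\u672c\u8a9e".encode("shift_jis")),
    (None, b"\xe4\xf6\xfc"),   # no language information available
]

converted, problematic = convert_rows(ROWS)
print(converted)
print(problematic)
```

The last row is the kind of data that needs a hint or a later repair, because nothing in the row says which code page its bytes belong to.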

Page 36: Unicode Support in SAP WAS.E - Archive

Unicode Conversion: System landscapes

Convert systems one by one:

• RFC and other communication between non-Unicode and Unicode systems can do code page conversions where necessary

• Convert data destinations first (only Unicode systems can receive all data)

• Treat external files as systems of their own and determine their code page. Convert files once or set up conversion during reading.

• If you had systems separated by code pages, they can be migrated into a single system now
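The two options for external files, converting once or converting while reading, can be sketched with plain Python file handling. The file names, the sample content and the function names are assumptions for this example. The code page of a real file has to be determined first, as the bullet says.

```python
import os
import tempfile

def convert_file_once(src_path, dst_path, src_code_page, dst_encoding="utf-8"):
    """Rewrite a file from its conventional code page into a Unicode encoding."""
    with open(src_path, "r", encoding=src_code_page, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)

def read_converting(path, code_page):
    """Leave the file untouched and convert while reading."""
    with open(path, "r", encoding=code_page, newline="") as handle:
        return handle.read()

with tempfile.TemporaryDirectory() as workdir:
    legacy = os.path.join(workdir, "legacy.txt")
    unicode_copy = os.path.join(workdir, "unicode.txt")

    # Create a small ISO8859-1 file to stand in for an external file.
    with open(legacy, "wb") as handle:
        handle.write("Gr\u00f6\u00dfe\n".encode("iso8859_1"))

    # Option 1: conversion during reading.
    text_on_read = read_converting(legacy, "iso8859_1")

    # Option 2: convert the file once and keep the Unicode copy.
    convert_file_once(legacy, unicode_copy, "iso8859_1")
    with open(unicode_copy, "rb") as handle:
        converted_bytes = handle.read()

print(repr(text_on_read), converted_bytes)
```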

Page 38: Unicode Support in SAP WAS.E - Archive

Unicode Conversion and Release planning

[Figure: paths between R/3 4.6D (NU), SAP Web AS 6.20 (NU), SAP Web AS 6.20 Unicode and SAP Web AS 6.30 Unicode; arrows labeled New Installation, Unicode conversion, Upgrade, Upgrade, and "Upgrade with Unicode conversion?"]

Page 39: Unicode Support in SAP WAS.E - Archive

Unicode enabled mySAP.com Status and Planning

mySAP CRM
• SAP CRM 3.0: selected stand-alone
• SAP CRM 4.0: complete

mySAP SCM
• SAP SC Event Manager 1.1
• SAP SCM 4.01

mySAP Enterprise Portals
• SAP EP 5.0: ISO-LATIN1
• SAP EP 6.0: Unicode

mySAP BI
• SAP BW 3.1

SAP R/3 Enterprise

mySAP Exchanges
• SAP XI 2.0

Unicode Roll-Out is started

All the availability dates and release schedules given (see note 79991) are based on SAP internal planning only and, thus, may be subject to change.

Page 41: Unicode Support in SAP WAS.E - Archive

Summary

After this lecture you know

the benefits of Unicode

SAP's implementation of Unicode in the Web Application Server

the steps necessary to convert an existing SAP system into a Unicode system

the availability dates of already Unicode enabled mySAP.com products

Page 42: Unicode Support in SAP WAS.E - Archive

Further Information

Service Marketplace:
Technical information: http://service.sap.com/Unicode@SAP
Customer contact: http://service.sap.com/Unicode

Public Web:
www.sap.com

Related Workshop at SAP TechEd 2002
Unicode Enabling of ABAP Programs:
Tue., 4:15 PM - 6:15 PM, 391
Wed., 4:15 PM - 6:15 PM, 298 / 299

Related Lectures at SAP TechEd 2002
Global Solutions: Legal Requirements, Languages, Unicode:
Tue., 2:45 PM - 3:45 PM, 398 / 399
Wed., 5:45 PM - 6:45 PM, 350 / 351

Page 43: Unicode Support in SAP WAS.E - Archive

Q&A

Questions?

Page 44: Unicode Support in SAP WAS.E - Archive

Feedback

Please complete your session evaluation and drop it in the box on your way out.

Be courteous: deposit your trash, and do not take the handouts for the following session.

The SAP TechEd ’02 New Orleans Team

Page 45: Unicode Support in SAP WAS.E - Archive

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.

Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.

Microsoft®, WINDOWS®, NT®, EXCEL®, Word®, PowerPoint® and SQL Server® are registered trademarks of Microsoft Corporation.

IBM®, DB2®, DB2 Universal Database, OS/2®, Parallel Sysplex®, MVS/ESA, AIX®, S/390®, AS/400®, OS/390®, OS/400®, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere®, Netfinity®, Tivoli®, Informix and Informix® Dynamic ServerTM are trademarks of IBM Corporation in USA and/or other countries.

ORACLE® is a registered trademark of ORACLE Corporation.

UNIX®, X/Open®, OSF/1®, and Motif® are registered trademarks of the Open Group.

Citrix®, the Citrix logo, ICA®, Program Neighborhood®, MetaFrame®, WinFrame®, VideoFrame®, MultiWin® and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc.

HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology.

JAVA® is a registered trademark of Sun Microsystems, Inc.

JAVASCRIPT® is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape.

MarketSet and Enterprise Buyer are jointly owned trademarks of SAP Markets and Commerce One.

SAP, SAP Logo, R/2, R/3, mySAP, mySAP.com and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are trademarks of their respective companies.

Copyright 2002 SAP AG. All Rights Reserved