Unicode in Distributed Systems

Michael G. McKenna
Globalisation Strategist
Haddon Hill International
(c) 2002 M. McKenna
1
Unicode in Dist Sys
IUC#22
Distributed Systems
[Figure: six distribution models, each showing how functional areas a to e are split across machines]
Models: Terminal remote access, Host-centric co-operative, Co-operative processing, Client / Server, Distributed Data, Fully distributed processing
Functional areas:
a End user device
b End User interface standards
c Application
d Data Access
e Data
Unicode can be implemented in any of the functional areas
Distributed Systems: Status Quo
• Heterogeneous
• Large Investments
• Mixed Proprietary and International Standards
• Often Under Parochial Control
• Work Group to Global Organizational Size
The Enterprise in the Real World
[Figure: enterprise landscape diagram. Labelled elements: Java; Internet Clients; Appl Server; Web Servers; SYBASE, DB2 and Oracle; Multiple SQL Database Access; Flat Files, Legacy & Non-Relational Data; Real-Time Mainframe Data (ticker-style feed: IBM, QFR, SYB); Application Development; Data Collection; Distributed Enterprise Information; Embedded Training; System Management (Auditing, System Configuration, Performance Monitoring); Plug & Play Users; Distributed Systems; TCP/IP; CORBA; J2EE; Security]
The Enterprise in the Real World
[Figure: the same enterprise landscape, with "LAN-Based Systems" in place of "Distributed Systems"]
Enterprise Client/Server Requirements
[Figure: the enterprise landscape (Java, Appl Server, WWW Clients, Data Entry, SYBASE, DB2, Oracle, Embedded Training, Distributed Enterprise Information, Plug & Play Users, Real-Time Mainframe Data, LAN Based Systems, Flat Files, Legacy & Non-Relational Data, System Management, Multiple SQL Database Access, TCP/IP, CORBA, J2EE, Security) annotated with requirements]
Requirements shown:
• Multilingual Application Development
• Developer & End-User Productivity
• Transparency
• Remote Backup
• Localisable
• Seamless Interoperability
Any Component Can Affect Globalization
[Figure: components of a distributed application]
• Client Application
• GUI API
• Network
• Client API
• Server API
• Database Server
• OS API
• Database Design
• Non-RDBMS Data
Globalisation Spans all Areas
[Figure: application stacks on both sides of the network]
Layers shown: 4 GL; Application; GUI; Server Comm API; Client Comm API; Network API; O/S; Data
Rating Distributed Systems
A system for rating levels of Internationalisation
+3  Global Ready / Local Cultural Authenticity
+2  Global Ready
+1  Single-Locale Ready (Europe or Asia)
 0  Locale-Specific Early Adopter
-1  8-bit Clean
-2  7-bit “Dirty”
-3  Don’t Care
Level (-3) “Don’t Care”
• Ethnocentric attitude of organization
• Lack of understanding
• No desire
• I18N thought of as “another feature”
• Fear, uncertainty and doubt
Level (-2) “7-bit dirty”
• 7-bit ASCII support
• U.S. only
• ASCII sort only
• U.S. platforms/environments only
• U.S.-specific UI
• U.S. keyboards, terminals, printers
Level (-1) “8-bit clean”
• 8-bit data integrity (the 8th bit is not stripped)
• Support for 8-bit object names
• 16-bit data integrity for pass through
Level (0) “Minimum I18N”
• 8-bit and multibyte codeset support
• 8-bit and multibyte lexical support
• 8-bit and multibyte object names
• European sort orders
• Localizable
• European and Asian platform/HW support
• Documentation on I18N
• Multibyte input and display
• Application development in target language
• European and Asian keyboards, terminals, printers
Unicode really needed here!
Level (+1) “Minimum Heterogeneous I18N”
• Unicode support
• Can add European sort orders
• Distributed locale management
• All messages localizable
• Language-sensitive string operations
• Locale-sensitive cultural string formatting
• Transparent Connectivity
• Imperial calendars
• Codeset conversion
• Localizable user interface
• European multilingual application development
Level (+2) “Global Ready”
• Can add new character set conversions in the field
• Bi-directional support
• Robust codeset conversion
• Support world-wide multilingual application development
• Multiscript heterogeneous distributed processing
Level (+3) “Cultural Authenticity”
• Full Unicode support
• Keisen tables, radar charts in Japan
• Non-Gregorian calendars
• Composite characters
• Vertical input and display
Evolution of Client/Server/Intranet
Departmental
• 10-100 Users
• Systems: Centralized Server(s), Mainframe Extracts
• Applications: Single Function, Stand Alone
• Simple Administration
• Single Vendor
• Databases: 10s of Gigabytes
Enterprise
• 100s to 1000s of Users
• Distributed Servers
• Mainframe Integration
• Corporate-Wide
• Integrated
• Complex Management
Internet
• Any User
• Any Machine
• Anywhere
• World-wide
• HTTP/HTML
• Remote Mgmt
• Many Vendors
• Gigabytes to Terabytes
Evolution of Client/Server/Intranet
[Figure: the same Departmental / Enterprise chart, with "Intranet" as the third column and the Departmental column marked Level (0) I18N]
Evolution of Client/Server/Intranet
[Figure: the same Departmental / Enterprise / Intranet chart, with the Departmental column marked Level (0) I18N and a Level (+2) I18N marker added]
Legacy Systems
• Communication through Gateways
• Proprietary Character Sets
• Many Asian Implementations
• Lots of Data = Lot$ of Mone¥
[Figure: systems linked through two gateways and a bridge]
Three-Tier I18N System
Normalisation
LANGUAGE
VIEW
DATA
Phased Approach for Distributed Unicode
• Phase I - encapsulated Unicode
– used internally, with conversion filters to the operating system environment and external distributed APIs
• Phase II - Unicode on the wire
– Unicode for transmission to distributed applications
– Requires application control on both sides of the wire
• Phase III - Unicode end-to-end
– Unicode enabled user-I/O with appropriate software
– Competitive advantage for multiplatform portability
• Finally - Unicode everywhere
– Operating environments and standards catch up. Change the conversion filters and the distributed applications continue to work
Phase I - Encapsulated Unicode
• Unicode Enabled application inside a conversion envelope.
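A minimal Java sketch of such a conversion envelope may help make the idea concrete. The class name `ConversionEnvelope` and its `handle` method are hypothetical, not taken from the talk; the sketch assumes the surrounding environment exchanges bytes in a single legacy code set while the application logic works only on Unicode strings.

```java
import java.nio.charset.Charset;
import java.util.function.UnaryOperator;

// Hypothetical helper: Unicode-only logic wrapped by inbound and outbound conversion filters.
final class ConversionEnvelope {
    private final Charset external;           // code set used by the surrounding environment
    private final UnaryOperator<String> core; // application logic, sees Unicode strings only

    ConversionEnvelope(Charset external, UnaryOperator<String> core) {
        this.external = external;
        this.core = core;
    }

    // Inbound filter -> Unicode core -> outbound filter.
    byte[] handle(byte[] request) {
        String unicodeIn = new String(request, external);
        String unicodeOut = core.apply(unicodeIn);
        // Characters the external code set cannot represent are replaced by the JDK here.
        return unicodeOut.getBytes(external);
    }
}
```

Moving to a later phase then means changing only the `external` charset passed to the envelope, not the core logic.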
Phase II - Unicode on the Wire
• Conversion filters to operating environment and distributed non-Unicode APIs
[Figure: includes a non-Unicode App]
Phase III - Unicode End-to-End
• If needed, use proprietary software to enable Unicode technology for user interfaces.
[Figure: includes a non-Unicode App]
Final Phase - Unicode Everywhere
• Distributed Environment vendors and standards bodies support Unicode
• Unicode used everywhere for communication, data representation, and user interfacing
Legacy System Integration
[Figure: PCs, 3270s, Local Data Servers, AS/400s]
Rightsizing a Large Legacy System
• 3270 terminals in 16 countries connected to AS/400 MIS system
• Integration with new Client/Server
– Microsoft Windows clients (1st tier)
– Sun Sparc Solaris Unix servers (2nd tier)
– IBM AS/400 backend (3rd tier)
Example: B2B Technology to enable Global eCommerce
• All data in Unicode
• UTF-8 in XML
• All resource and message files stored in Unicode for portability
• Support for Unicode internally
• Java and XML
Convertibility
• Mapping Tables
– National standards
– International standards
– Vendor standards
• Always a mapping between Unicode and base standards
• Replacement characters (?)
[Figure: code sets CS0 to CS5 mapped to one another, and the same code sets mapped through Unicode]
CORBA and Code Set Conversion
• Use Unicode for Inter-ORB global communications
• OMG Common Object Services (COS)
• Inter-ORB Bridge Support
• General Inter-ORB Protocol (GIOP)
• Internet Inter-ORB Protocol (IIOP)
• Code Set Negotiation: use CONV_FRAME IDL
Diagram from “The Common Object Request Broker: Architecture and Specification”,
Rev. 2.2, Chapter 11: ORB Interoperability Architecture
CORBA IOP/IOR
• IOP: Inter-ORB Protocol
• IOR: Interoperable Object Reference (like a URL, with attributes)
Diagram from “The Common Object Request Broker: Architecture and Specification”,
Rev. 2.2, Chapter 11: ORB Interoperability Architecture
Transmission Code Set
• Character Set: The characters, independent of encoding
• Code Set: The encoded values of a Character Set
• OSF Character and Code Set Registry:
– ftp.opengroup.org:/pub/code_set_registry
Diagram from “The Common Object Request Broker: Architecture and Specification”,
Rev. 2.2, Chapter 11: ORB Interoperability Architecture
Character Set Conversion
• User definable
– Table-driven
– API for user-defined routines
• Robust
• Negotiated conversion policy with Server:
– CMR - Client Makes Right
– SMR - Server Makes Right
– UNR - Universal Network Representation
• Unicode based conversions
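The three negotiated policies can be sketched in Java as a choice of which code set travels on the wire. This is an illustration under an assumed reading of the slide, not a documented protocol: the side that "makes right" is taken to be the side that converts, so the wire carries the other side's code set, and UNR is taken to mean a Unicode encoding (UTF-8 here). The names `ConversionPolicy`, `Policy` and `wireCharset` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ConversionPolicy {
    enum Policy { CMR, SMR, UNR }

    // Assumed semantics: the side that "makes right" performs the conversion,
    // so the data on the wire is in the other side's code set.
    static Charset wireCharset(Policy policy, Charset client, Charset server) {
        if (client.equals(server)) {
            return client; // identical code sets, nothing to convert
        }
        switch (policy) {
            case CMR: return server;                 // client converts to and from the server's code set
            case SMR: return client;                 // server converts to and from the client's code set
            default:  return StandardCharsets.UTF_8; // UNR: both sides convert to a common Unicode form
        }
    }
}
```

With UNR each side needs only one converter pair (native to Unicode), which is why it scales better as the number of code sets grows.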
Character Set Conversion
• Configurable error results depending on data-integrity needs
– Exact Match
– Best Guess
– Error plus replacement character
• Multiple character sets supportable with ICU as a “Conversion Envelope”
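Two of these error policies can be illustrated with the JDK's own `java.nio.charset` API, used here in place of ICU, which the slide names. In this sketch `CodingErrorAction.REPORT` stands in for "Exact Match" and `CodingErrorAction.REPLACE` for "replacement character"; "Best Guess" fallback mappings are not covered. The class name `ErrorPolicy` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class ErrorPolicy {
    // Encodes text into the target code set, applying the given action
    // to malformed input and to characters that have no mapping.
    static byte[] encode(String text, Charset target, CodingErrorAction action)
            throws CharacterCodingException {
        ByteBuffer out = target.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action)
                .encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }
}
```

With `REPORT` the call throws a `CharacterCodingException` as soon as a character cannot be represented; with `REPLACE` the encoder substitutes its replacement bytes (a question mark for US-ASCII) and continues.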
Character Set and Sort Order Definitions
• International and commercial standards supplied by ICU
• User definable for others
– 8-bit
– Multi-byte
– Unicode reference set
• Utilities for creating character sets
• Sort order issues:
– Multilingual sorting
– Multiple sort orders and indexing
– Default vs expected sorting
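The gap between default and expected sorting can be shown with a short Java sketch. It uses the JDK's `java.text.Collator` rather than ICU, and the class name `SortOrders` is hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class SortOrders {
    // Default order: compares UTF-16 code units, so accented capitals sort after "z".
    static List<String> binarySort(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    // Expected order: uses the collation rules the JDK provides for the locale.
    static List<String> localeSort(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }
}
```

For the words "zebra", "Äpfel" and "apple", the binary sort places "Äpfel" last, while a German collator places it next to "apple" and leaves "zebra" last. An index built with one order cannot serve queries that expect the other, which is the indexing issue the slide raises.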
Unicode SQL Database
• Virtually every written business language supported
• Allows worldwide solutions
Unicode in Databases
[Table: support matrix of Unicode-aware operations by database; some check marks carry footnote markers 1 to 4]
Databases: DB2, Microsoft, Oracle, Interbase, Teradata, SQL Anywhere, Sybase
Operations: Length Functions; Folding (upper, lower); Comparison (< = >); Searching (Like); Collation; Order By; Indexes (<collate clause>)
IETF
• RFC 2277 - All Protocols
IETF Policy on Character Sets and Languages
“The Internet is international”
– Must identify charset
– Must support UTF-8
– Must identify language
– Multilanguage support required
• RFC 2130 - new Protocols and Formats
– Unicode default (UTF-8)
MIME
• MIME ‘charset’ Parameters
– Used for Character encoding identification
» HTTP
» HTML
» XML
» CSS
HTTP encoding negotiation
• Client sends Accept-Charset HTTP header
Accept-Charset: UTF-8, ISO-8859-1;q=0.9,*;q=0.1
• Server knows the encoding and sends the ‘charset’ parameter in the HTTP header
Content-Type: text/html; charset="UTF-8"
• HTML clues
– document header
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
– links
<a href=… charset="UTF-8"> … </a>
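The server side of this negotiation can be sketched in Java as follows. The class `AcceptCharset` and its methods are hypothetical; the parser handles only the simple `name;q=value` form shown in the header above and does not validate malformed quality values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class AcceptCharset {
    // Parses "name;q=value" items into lower-cased name -> quality (default 1.0).
    static Map<String, Double> parse(String header) {
        Map<String, Double> prefs = new LinkedHashMap<>();
        for (String item : header.split(",")) {
            String[] parts = item.trim().split(";");
            String name = parts[0].trim().toLowerCase(Locale.ROOT);
            if (name.isEmpty()) {
                continue;
            }
            double q = 1.0;
            for (int i = 1; i < parts.length; i++) {
                String p = parts[i].trim();
                if (p.startsWith("q=")) {
                    q = Double.parseDouble(p.substring(2)); // throws on malformed input
                }
            }
            prefs.put(name, q);
        }
        return prefs;
    }

    // Picks the server-supported charset the client rates highest; null if none is acceptable.
    static String choose(String header, List<String> supported) {
        Map<String, Double> prefs = parse(header);
        String best = null;
        double bestQ = 0.0;
        for (String candidate : supported) {
            Double q = prefs.get(candidate.toLowerCase(Locale.ROOT));
            if (q == null) {
                q = prefs.get("*"); // wildcard covers charsets not listed by name
            }
            if (q != null && q > bestQ) {
                best = candidate;
                bestQ = q;
            }
        }
        return best;
    }
}
```

The chosen name would then be echoed back in the `charset` parameter of the Content-Type header.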
Determining Internet Encodings
In priority order:
1. User override
2. HTTP header or protocol information
3. Self-identification
» <meta> for HTML
» encoding declaration for XML
» @charset for CSS
4. ‘charset’ parameters on links
5. User preferences/heuristics
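The priority order above amounts to "first source that gives an answer wins". A minimal Java sketch, assuming each source has already been reduced to a charset name or `null`; the class and parameter names are hypothetical.

```java
import java.util.Objects;
import java.util.stream.Stream;

final class EncodingResolver {
    // Each argument is null when that source gives no answer.
    // The argument order mirrors the priority list.
    static String resolve(String userOverride, String httpCharset, String selfDeclared,
                          String linkCharset, String userPreference) {
        return Stream.of(userOverride, httpCharset, selfDeclared, linkCharset, userPreference)
                .filter(Objects::nonNull)
                .findFirst()
                .orElse(null);
    }
}
```

For example, a page served with `charset=ISO-8859-1` in the HTTP header but carrying a `<meta>` tag that says UTF-8 resolves to ISO-8859-1 unless the user overrides it.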
LDAP version 3.0
• LDAP strings are UTF-8
• Directory entries can be in any language
• RFC 2251 to RFC 2256
XML and Java
• XML - Portable Data
• Java - Portable Code
• XML tag structures map to Java Classes
• Default encoding for XML is Unicode
• Encoding for Java Strings is Unicode
XML
• Conforming parsers must support
– UTF-16
– UTF-8
• UTF-8 is the default encoding
<?xml version="1.0" encoding="UTF-8" ?>
• Character references are Unicode values
– &#dddd; (decimal)
– &#xhhhh; (hexadecimal)
• CSS
@charset "UTF-8";
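Because character references are plain Unicode code point values, any XML text can be carried over a 7-bit channel. A small Java sketch, with the hypothetical name `XmlCharRefs`; it rewrites only non-ASCII characters and assumes markup characters such as `&` and `<` have already been escaped elsewhere.

```java
import java.util.Locale;

final class XmlCharRefs {
    // Replaces every non-ASCII code point with a hexadecimal character reference.
    static String toAsciiXml(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase(Locale.ROOT))
                   .append(';');
            }
        });
        return out.toString();
    }
}
```

Iterating by code point rather than by `char` matters for characters outside the Basic Multilingual Plane, which Java stores as surrogate pairs but XML references as a single value.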
Java
• java.lang.String - Unicode
• InputStreamReader
– converts sourceCharset to Unicode
• OutputStreamWriter
– converts Unicode to targetCharset
• Different list of charsets supported per Vendor
• Java 1.1 vs Java 2 and Unicode
– Java 2 “Swing” set has better display support
http://www.javasoft.com search on “internationalization”
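The reader/writer pair can be combined into a streaming transcoder. The sketch below uses in-memory streams so that it is self-contained; the class name `Transcoder` is hypothetical, and real code would pass file or socket streams instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

final class Transcoder {
    // Reads bytes in the source charset, passes through Unicode chars, writes the target charset.
    static byte[] transcode(byte[] input, Charset source, Charset target) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(input), source);
             Writer out = new OutputStreamWriter(sink, target)) {
            char[] buf = new char[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray(); // the writer was flushed when the try block closed it
    }
}
```

Since the set of supported charsets varies by vendor, `Charset.isSupported(name)` is worth checking before relying on anything beyond the constants in `StandardCharsets`.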
GUI in Java
• Portable consistent interface
• Use Java 1.2
• Use Java Foundation Classes (e.g. JTextArea)
• Use Java Locale class
• Link to O/S through JNI
• Java runs in native Unicode
Java and I18n
• java.io
–Character set conversion
–InputStreamReader, OutputStreamWriter
• java.util
–Locale
–Date, Calendar
–ResourceBundle
• java.text
–String handling, formatting
–Collation
Java Methods for JDBC
• Connection
• Data Binding
• Formatting Output
• Date and Time
• Collation
• Translated Messages
Connection
– What language?
• System default?
Locale defLocale = Locale.getDefault();
// English display name of the default language, lower-cased (e.g. "french")
properties.put("LANGUAGE",
    defLocale.getDisplayLanguage(Locale.US).toLowerCase());
• User choice?
• Server choices list + us_english
select name from master..syslanguages
• Server Default?
• Java Application/Applet can be different
– What character set?
• sp_server_info server_csname
properties.put("CHARSET", server_csname);
Data Binding
–Static locale
• User-driven
• System default
• Menu pull-down
–Dynamic locale
• Data-driven
• per-column
• per-row
• generated by business rules
–Format using java.text
Formatting Output
–Numeric
• java.text
• DecimalFormat
• NumberFormat
• ChoiceFormat
• SQL Types: INT, SMALLINT, TINYINT, NUMERIC, FLOAT, REAL, DOUBLE, MONEY
–Date and Time
• java.text
• DateFormat, SimpleDateFormat
• java.util
• Calendar, GregorianCalendar
• Date
• TimeZone, SimpleTimeZone
• SQL Types: DATETIME, TIMESTAMP
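As an illustration of the numeric case, the following Java sketch formats a value fetched from a numeric SQL column for a given locale using `java.text.NumberFormat`. The class name `LocaleFormatting` and the fixed two fraction digits are assumptions made for the example.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class LocaleFormatting {
    // Formats a numeric column value with the grouping and decimal symbols of the locale.
    static String formatAmount(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(value);
    }
}
```

The same value, 1234567.5, comes out as `1,234,567.50` for `Locale.US` and `1.234.567,50` for `Locale.GERMANY`. Date and time columns follow the same pattern with `DateFormat.getDateInstance(style, locale)`.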
Example: www.3m.com
Flag
Banner
Meta Data
Content
Generic Datetime for input to remote systems
• Use “YYYYMMDD” format (ISO 8601 basic format)
insert into table1 (date_col) values ('20021104')
/* 4 November 2002 */
– avoids language and format confusions
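In current Java the same generic form is available through `java.time`, which postdates this talk. A minimal sketch with the hypothetical class name `GenericDate`:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class GenericDate {
    // Locale-independent YYYYMMDD text, safe to send to a remote system.
    static String toGeneric(LocalDate date) {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Parses YYYYMMDD text received from a remote system.
    static LocalDate fromGeneric(String text) {
        return LocalDate.parse(text, DateTimeFormatter.BASIC_ISO_DATE);
    }
}
```

`GenericDate.toGeneric(LocalDate.of(2002, 11, 4))` yields `20021104` whatever the default locale is, so 4 November cannot be misread as 11 April.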
E-Marketplace Technology
XML Facilitates eCommerce.
Example Message (DTD)
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE Book [
<!ELEMENT Book (BookDesc)+ >
<!ELEMENT BookDesc (Title, Author, Publisher, ISBN,
                    Price, CoverImage, Desc) >
<!ATTLIST BookDesc xml:lang CDATA #REQUIRED >
<!ELEMENT Title (#PCDATA) >
<!ELEMENT Author (#PCDATA) >
<!ELEMENT Publisher (#PCDATA) >
<!ELEMENT ISBN (#PCDATA) >
<!ELEMENT Price (Currency, Amount) >
<!ELEMENT Currency (#PCDATA) >
<!ELEMENT Amount (#PCDATA) >
<!ELEMENT CoverImage (#PCDATA) >
<!ELEMENT Desc (#PCDATA) >
<!ATTLIST CoverImage type (bmp|gif|jpg|other) "gif" >
<!NOTATION gif SYSTEM "gwswin/gws.exe" >
<!NOTATION bmp SYSTEM "gwswin/gws.exe" >
<!NOTATION jpg SYSTEM "gwswin/gws.exe" >
<!NOTATION other SYSTEM "gwswin/gws.exe" >
]>
Example Message (XML)
<Book>
  <BookDesc xml:lang="EN">
    <Title>Java in a Nutshell</Title>
    <Author>David Flanagan</Author>
    <Publisher>O'Reilly &amp; Associates</Publisher>
    <ISBN>156592262X</ISBN>
    <Price>
      <Currency>USD</Currency>
      <Amount>24.95</Amount>
    </Price>
    <CoverImage>jnut_us.gif</CoverImage>
    <Desc>The bestselling Java in a Nutshell
    has been updated to cover Java 1.1.
    If you're a Java programmer who is
    migrating to 1.1, this second
    </Desc>
  </BookDesc>
  ...
Example Message (XML)
  <BookDesc xml:lang="DE">
    <Title>Java in a Nutshell</Title>
    <Author>David Flanagan</Author>
    <Publisher>OReilly/VVA</Publisher>
    <ISBN>3897211009</ISBN>
    <Price>
      <Currency>EUR</Currency>
      <Amount>25.95</Amount>
    </Price>
    <CoverImage>jnut_de.gif</CoverImage>
    <Desc>Dieses Handbuch ist eine
    unentbehrliche Kurzreferenz, die dazu
    gedacht ist, aufgeschlagen neben der Tastatur
    jedes Java-Programmierers zu liegen. Es enthält eine
    </Desc>
  </BookDesc>
  ...
Example Message (XML)
  <BookDesc xml:lang="ja">
    <Title>
    </Title>
    <Author>David Flanagan</Author>
    <Publisher>
    </Publisher>
    <ISBN>4-900900-08-7</ISBN>
    <Price>
      <Currency>JPY</Currency>
      <Amount>3900.00</Amount>
    </Price>
    <CoverImage>jnut_jp.gif</CoverImage>
    <Desc>
    </Desc>
  </BookDesc>
</Book>
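A receiving application can use the xml:lang attribute to pick the description that matches a trading partner's language. The Java sketch below uses the JDK's DOM parser; the class `BookLookup` and its method are hypothetical, and the code assumes each BookDesc carries a Price/Currency element as in the messages above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class BookLookup {
    // Returns the Currency text of the first BookDesc whose xml:lang matches, or null if none does.
    static String currencyFor(String xml, String lang) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList descs = doc.getElementsByTagName("BookDesc");
            for (int i = 0; i < descs.getLength(); i++) {
                Element desc = (Element) descs.item(i);
                // Language tags are case-insensitive, so "DE" and "de" both match.
                if (lang.equalsIgnoreCase(desc.getAttribute("xml:lang"))) {
                    return desc.getElementsByTagName("Currency").item(0).getTextContent();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a parsable Book message", e);
        }
    }
}
```

Matching on the exact tag is the simplest policy; a fuller implementation would also fall back from a regional tag such as "de-AT" to "de".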
Example Order (XML)
<Order>
  <OrderNum>20193786</OrderNum>
  <UserIdNum>A47US37892</UserIdNum>
  <ItemsOrdered>
    <Item>
      <ProductId>156592262X</ProductId>
      <Qty>1</Qty>
    </Item>
    <Item>
      <ProductId>3897211009</ProductId>
      <Qty>12</Qty>
    </Item>
    <Item>
      <ProductId>4900900087</ProductId>
      <Qty>2</Qty>
    </Item>
  </ItemsOrdered>
</Order>
Business Rules
• Reports
• Taxes
• Currency
– Dual Currency Display
– Currency Conversion
– Payment and Settlement
• Import/Export
• Business Process
• Workflow
Communication Protocols
• CORBA 3.0
– string, char: supports UTF-8
– wstring, wchar: supports UTF-16
• COM, DCOM
– Allows Unicode
• ActiveX
– Unicode interface
Fonts
• True Type
– Bitstream Cyberbit
– Monotype
• BDF, Java
– cobble together from many sources
• Dynamic
– Composed of multiple fonts
– Bitstream Truedoc (www.truedoc.com)
Web Services
UDDI
SOAP (XML)
WSDL (XML)
HTTP
E-Marketplace Technology
Service Provider layers in their Services to seamlessly
add value to all trading partners.
UDDI
Describes:
• What is it?
• Where is it?
• How do I get it?
UDDI - I18n
• Need to track time zone usage
• Useful to have alternate names
• Specify normalized formats to use
WSDL
Web Services Description Language
Services are defined using six major elements:
• types : data type definitions used to describe the messages exchanged
• message : abstract definition of the data being transmitted
• portType : set of abstract operations, each referring to an input message and output messages
• binding : protocol and data format specifications
• port : address for a binding - single communication endpoint
• service : aggregate a set of related ports
WSDL - I18n
Web Services Description Language
• Pure XML
• Use xml:lang and locale attributes
• Export to UDDI
• Service provider localizes to supported locales
SOAP
Simple Object Access Protocol
[Figure: SOAP message exchange]
SOAP - I18n
Simple Object Access Protocol
[Figure: Locale A and Locale B exchanging SOAP messages that carry I18N Info]
E-Marketplace Technology
System Architecture
[Figure: layered architecture]
Elements shown: Application Software; String Formatting; Conversions; Message System; Collations; Character Handling; Cultural Profiler; Text Handling; Portability Layer; Operating System; Resource Store with Resource Files for en, fr, de, jp
Summary
• Unicode is a powerful portability and interoperability solution for distributed environments
• Global, distributed computing (+2 Level of I18N) requires Unicode to be effective
• Unicode can be acquired in a phased approach
• Unicode is now required to use new technologies (RFC 2277)
– XML, Java
Global Vision
• “Think Globally, Act Locally”
• The trade relationships of the World make for a very small planet economically, but a complex one culturally
• The World Needs Unicode Today!