Internationalization of Java Platform

Presenter: Ataru Nakazawa
Advisor: Xiaoping Jia
Date: January 23, 2004
Overview
1. Introduction
2. Issues
3. Unicode 4.0
4. Research Design
5. Problem and Challenge
6. Work Plan
7. References
Introduction

How many countries are in the world?

How many languages are in the world?

How many people are in the world?
Answers

How many countries are in the world?
192 countries (U.S. Department of State)

How many languages are in the world?
It is difficult to give an exact figure of the number
of languages that exist in the world. According to
“Ethnologue”, it is usually estimated between 3,000
and 8,000.

How many people are in the world?
6,328,406,644 (U.S. Census Bureau)
What is Internationalization?

Internationalization (i18n) is the process of
designing an application so that it can
be adapted to various languages and
regions without engineering changes.
What is Localization?
Localization (l10n) is the process of adapting
software for a specific region or
language by adding locale-specific
components and translating text.

Example: Hello (English) becomes Buongiorno (Italian) after l10n.
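
The Hello/Buongiorno example can be sketched with Java resource bundles. This is a minimal illustration, not a recommended project layout: the class names GreetingDemo, Greetings and Greetings_it and the key "greeting" are hypothetical, and the bundles are written as nested classes only to keep the sketch in one file. The application code in greeting() never changes; only a new bundle is added per locale.

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

class GreetingDemo {

    // Hypothetical default (English) bundle; the key "greeting" is made up for this sketch.
    public static class Greetings extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] { { "greeting", "Hello" } };
        }
    }

    // Italian localization: same key, translated value. The "_it" suffix is
    // what the bundle lookup uses to match Locale.ITALIAN.
    public static class Greetings_it extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] { { "greeting", "Buongiorno" } };
        }
    }

    // Looks up the greeting for a locale. The no-fallback control keeps the
    // result independent of the machine's default locale.
    static String greeting(Locale locale) {
        ResourceBundle.Control noFallback =
                ResourceBundle.Control.getNoFallbackControl(ResourceBundle.Control.FORMAT_CLASS);
        ResourceBundle bundle =
                ResourceBundle.getBundle(Greetings.class.getName(), locale, noFallback);
        return bundle.getString("greeting");
    }

    public static void main(String[] args) {
        System.out.println(greeting(Locale.ENGLISH));
        System.out.println(greeting(Locale.ITALIAN));
    }
}
```

In a real application the translations would more commonly live in .properties files, so translators do not need to touch Java source.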
Issues (Culturally Dependent Data)

• Messages (e.g. English, Italian, Spanish and Japanese)
• Dates (e.g. November/14/2003, 14/11/2003)
• Times
• Numbers
• Currencies (e.g. $, £, ¥)
• Measurements (e.g. length, weight, capacity, temperature)
• Phone numbers
• Postal addresses
• Labels on GUI components
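
Several of these items (numbers, currencies, dates) can be handled with the formatters in java.text. The sketch below is illustrative only; the class name LocaleFormatDemo and its helper methods are hypothetical. The exact strings produced, especially for dates and currency symbols, depend on the locale data shipped with the JDK in use.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class LocaleFormatDemo {

    // Formats a plain number with the locale's grouping and decimal separators.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats an amount with the locale's currency symbol and conventions.
    static String formatCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    // Formats a date in the locale's long style (month name, day/month order).
    static String formatDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    public static void main(String[] args) {
        // The date used as an example on the slide: November 14, 2003.
        Date date = new GregorianCalendar(2003, Calendar.NOVEMBER, 14).getTime();
        Locale[] locales = { Locale.US, Locale.ITALY, Locale.JAPAN };
        for (Locale locale : locales) {
            System.out.println(locale + ": "
                    + formatNumber(1234567.891, locale) + " | "
                    + formatCurrency(1234.5, locale) + " | "
                    + formatDate(date, locale));
        }
    }
}
```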
Issues that have been solved

• Locale
• Formatting Messages
• Character Sets and Unicode
• Collation
• Fonts
• Graphical User Interfaces
• Input Methods
Issues that have not been solved

• Unicode 4.0 Support
• Complex Text Enhancement
• Character Converter Framework
• Input Method Framework

My research focuses on Unicode 4.0 support in Java.
Unicode 4.0 support in Java
As of January 5, 2004, the current releases are Unicode
Standard version 4.0 and Java 2 Standard Edition (J2SE)
1.4.2. However, J2SE 1.4.2 supports only Unicode 3.0, not
Unicode 4.0. Java 1.5 is scheduled for release in the
summer of 2004. It is expected to support Unicode 3.1 but
not Unicode 4.0.
Unsupported Unicode Blocks

• 47,188 new character assignments were made between
  Unicode standard version 3.0 and version 4.0.
• 38 Unicode blocks are unsupported. Full list:
  http://students.depaul.edu/~anakazaw/se690/UnsupportedUnicodeBlocks.txt
Research Design

• The java.lang.Character and java.lang.Character.UnicodeBlock
  classes need updating to provide access to the new
  characters and blocks.
• The normalization algorithm, used by the Collator object,
  needs modification so that collation and string
  comparison are carried out correctly.
• The case mapping rules used to perform case mappings
  between upper and lower case letters need upgrading to
  handle new characters.
• The character encoding converters need modification to
  support these new characters.
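
One way to check what a given runtime already knows about the new characters is to probe java.lang.Character directly. The sketch below assumes a current JDK, where Character has overloads that take an int code point (these overloads are not present in J2SE 1.4.2). The class name UnicodeSupportProbe is hypothetical, and the two sample code points were chosen for illustration: U+10000 (Linear B, added in Unicode 4.0) and U+10400 (a Deseret capital letter, added in Unicode 3.1).

```java
class UnicodeSupportProbe {

    // True if the running JVM's Unicode tables assign a character to this code point.
    static boolean isSupported(int codePoint) {
        return Character.isDefined(codePoint);
    }

    // The Unicode block the JVM reports for this code point, or null if it knows none.
    static Character.UnicodeBlock blockOf(int codePoint) {
        return Character.UnicodeBlock.of(codePoint);
    }

    // Lowercase mapping according to the JVM's case mapping tables.
    static int toLower(int codePoint) {
        return Character.toLowerCase(codePoint);
    }

    public static void main(String[] args) {
        int linearB = 0x10000;         // LINEAR B SYLLABLE B008 A
        int deseretCapital = 0x10400;  // DESERET CAPITAL LETTER LONG I

        System.out.println("U+10000 defined: " + isSupported(linearB));
        System.out.println("U+10000 block:   " + blockOf(linearB));
        System.out.println("lowercase of U+10400: U+"
                + Integer.toHexString(toLower(deseretCapital)).toUpperCase());
    }
}
```

A probe like this covers only the first and third points above (character data and case mapping); normalization and encoding converters would need separate checks.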
Problem and Challenge

Unicode versions after 3.0 assign characters to code points
above U+FFFF, which do not fit in 16 bits. Because a Java
char is a 16-bit value, supporting these new characters is
likely to be the hardest part.
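
To make the 16-bit limit concrete, the sketch below shows how UTF-16 represents a code point above U+FFFF as two char values (a surrogate pair). It is an illustration of the encoding arithmetic only, not a proposed design; the class name SurrogateDemo and the method toSurrogatePair are hypothetical, and the String code point methods used in main assume a current JDK.

```java
class SurrogateDemo {

    // Splits a supplementary code point (U+10000..U+10FFFF) into a UTF-16 surrogate pair.
    static char[] toSurrogatePair(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;               // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int codePoint = 0x10400; // DESERET CAPITAL LETTER LONG I
        String s = new String(toSurrogatePair(codePoint));

        // One character for the reader, but two 16-bit char units in the String.
        System.out.println("char units:  " + s.length());
        System.out.println("code points: " + s.codePointCount(0, s.length()));
        System.out.println("round trip:  U+"
                + Integer.toHexString(s.codePointAt(0)).toUpperCase());
    }
}
```

Any code that indexes strings by char, or stores a single character in a char variable, has to be reviewed for this case.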
Work Plan
• Design: January, February, March
• Testing: April
References

• http://java.sun.com, Java Tutorial, Sun Microsystems, 2004.
• http://www.oreilly.com, Java Internationalization, O'Reilly, 2001.
• http://www.unicode.org, Unicode Standard Version 4.0, Unicode Inc, 2003.
• http://java.sun.com, Java Internationalization Forum.
End