Transcript Document

Globalization
Copyright © 2008, Oracle. All rights reserved.
Objectives
After completing this lesson, you should be able to:
• Determine a correct database character set that meets your
business requirements
• Obtain globalization support configuration information
• Customize language-dependent behavior for the database
and individual sessions
• Specify different linguistic sorts for queries
• Retrieve data that matches a search string ignoring case or
accent differences
20 - 2
Copyright © 2008, Oracle. All rights reserved.
Globalization Support Features
•
•
•
•
•
•
•
•
20 - 3
Language support
Territory support
Character set support
Linguistic sorting
Message support
Date and time formats
Numeric formats
Monetary formats
French
data
Copyright © 2008, Oracle. All rights reserved.
Japanese
data
What Every DBA Needs to Know
•
•
•
•
•
•
•
•
20 - 4
What is a character set?
How are character sets used?
Problems to avoid
Choosing your character set
Obtaining character set information
Specifying language-dependent behavior
Using linguistic searching and sorting
Using data conversion
Copyright © 2008, Oracle. All rights reserved.
What Is a Character Set?
The Oracle database supports different classes of characterencoding schemes:
• Single-byte character sets
– 7-bit
– 8-bit
• Multibyte character sets, including Unicode
20 - 5
Copyright © 2008, Oracle. All rights reserved.
Understanding Unicode
AL32UTF8
c
á
t
d
ö
63
C3 91
74
AL16UTF16
Supplementary
characters
0063
00E1
0074
EE AA 9E
F0 9D 84 9E
64
C3 B6
A89E
D834 DD1E
0064
D0 A4
0424
00F6
Encoding: Representing characters with byte sequences
20 - 7
Copyright © 2008, Oracle. All rights reserved.
How Are Character Sets Used?
• Oracle Net compares the client NLS_LANG setting to the
character set on the server.
• If needed, conversion occurs automatically and
transparently.
NLS_LANG
Oracle Net
Client
20 - 9
Server
Copyright © 2008, Oracle. All rights reserved.
Problems to Avoid
Example:
NLS_LANG:
AL32UTF8
Client
Windows English
Code page: WE8MSWIN1252
Oracle Net
Server
Database character set:
AL32UTF8
No conversion occurs, because it does not seem to be
required.
Issue: Invalid data are entered into the database.
20 - 10
Copyright © 2008, Oracle. All rights reserved.
Another Sample Problem
CREATE DATABASE ...
CHARACTER SET US7ASCII
NATIONAL CHARACTER SET
UTF8 ...
% export NLS_LANG=‘SIMPLIFIED
CHINESE_HONG KONG.ZHS16GBK’
20 - 11
Copyright © 2008, Oracle. All rights reserved.
Choosing Your Character Set
• Trade-offs to consider
• Choosing the correct character set that meets your business
requirements now and in the future
• Specifying the character set
• Changing the character set after database creation
20 - 12
Copyright © 2008, Oracle. All rights reserved.
Database Character Sets and
National Character Sets
Database Character Sets
National Character Sets
Defined at creation time
Defined at creation time
Cannot be changed without
re-creation (exceptions in certain
configurations)
Can be exchanged
Store data columns of type CHAR, Store data columns of type
VARCHAR2, CLOB, LONG
NCHAR, NVARCHAR2, NCLOB
Can store varying-width character Can store Unicode using either
AL16UTF16 or UTF8
sets
20 - 13
Copyright © 2008, Oracle. All rights reserved.
Obtaining Character Set Information
SQL> SELECT parameter, value
2
FROM nls_database_parameters
3
WHERE parameter LIKE '%CHARACTERSET%';
PARAMETER
VALUE
----------------------- ------------NLS_CHARACTERSET
WE8ISO8859P1
NLS_NCHAR_CHARACTERSET AL16UTF16
2 rows selected.
20 - 14
Copyright © 2008, Oracle. All rights reserved.
Specifying Language-Dependent Behavior
Initialization parameters for the database server
Environment variables for the clients
ALTER SESSION command
SQL function
SELECT sysdate FROM dual;
20 - 15
Copyright © 2008, Oracle. All rights reserved.
Specifying Language-Dependent
Behavior for the Session
• Specify the locale behavior with the NLS_LANG environment
variable:
– Language
– Territory
– Character set
NLS_LANG=FRENCH_CANADA.WE8ISO8859P1
• Set other NLS environment variables to:
– Override database initialization parameter settings for all
sessions
– Customize the locale behavior
– Change the default location of the NLS library files
20 - 16
Copyright © 2008, Oracle. All rights reserved.
Language-Dependent and Territory-Dependent
Parameters
Parameter
Default Values
NLS_LANGUAGE
NLS_DATE_LANGUAGE
NLS_SORT
AMERICAN
AMERICAN
BINARY
NLS_TERRITORY
NLS_CURRENCY
NLS_DUAL_CURRENCY
NLS_ISO_CURRENCY
NLS_DATE_FORMAT
NLS_NUMERIC_CHARACTERS
NLS_TIMESTAMP_FORMAT
NLS_TIMESTAMP_TZ_FORMAT
AMERICA
$
$
AMERICA
DD-MON-RR
.,
DD-MON-RRHH.MI.SSXFF AM
DD-MON-RRHH.MI.SSXFF AM TZR
20 - 17
Copyright © 2008, Oracle. All rights reserved.
Specifying Language-Dependent Behavior
Using NLS parameters in SQL functions:
ALTER SESSION SET NLS_DATE_FORMAT='DD.MM.YYYY';
DBMS_SESSION.SET_NLS('NLS_DATE_FORMAT',
'''DD.MM.YYYY''') ;
SELECT TO_CHAR(hire_date,'DD.Mon.YYYY',
'NLS_DATE_LANGUAGE=FRENCH')
FROM employees
WHERE hire_date > '01-JAN-2000';
20 - 19
Copyright © 2008, Oracle. All rights reserved.
Linguistic Searching and Sorting
Sort order can be affected by:
• Case-sensitivity
• Diacritics or accent characters
• Combination of characters that is treated as a single
character
• Phonetics or character appearance
• Cultural preferences
20 - 20
Copyright © 2008, Oracle. All rights reserved.
Linguistic Searching and Sorting
Three types of sorting:
• Binary sorting
– Sorted according to the binary values of the encoded
characters
• Monolingual linguistic sorting
– A two-pass sort based on a character’s assigned major and
minor values
• Multilingual linguistic sorting
– Based on the ISO standard (ISO 14651), and the Unicode 3.2
Standard for multilingual collation
– Ordered by the number of strokes, PinYin, or radicals for
Chinese characters
20 - 21
Copyright © 2008, Oracle. All rights reserved.
Using Linguistic Searching and Sorting
You can specify the type of sort used for character data with
the:
• NLS_SORT parameter
– Default value derived from the NLS_LANG environment
variable, if set
– Can be specified for the session, client, or server
• NLSSORT function
– Defines the sorting method at the query level
20 - 22
Copyright © 2008, Oracle. All rights reserved.
Case-Insensitive and Accent-Insensitive
Search and Sort
• Specify the linguistic name:
NLS_SORT = <NLS_sort_name>[_AI | _CI]
• Examples:
NLS_SORT = FRENCH_M_AI
NLS_SORT = XGERMAN_CI
• Specify the sort action for WHERE clauses and PL/SQL
blocks:
NLS_COMP = BINARY | ANSI
• Useful for migrated databases
20 - 24
Copyright © 2008, Oracle. All rights reserved.
Support in SQL and Functions
• The following SQL clauses support NLS_SORT and
NLS_COMP settings:
–
–
–
–
–
–
–
WHERE
ORDER BY
START WITH
HAVING
IN/NOT IN
BETWEEN
CASE-WHEN
• The NLSSORT() function supports the case-insensitive and
accent-insensitive functionality.
20 - 25
Copyright © 2008, Oracle. All rights reserved.
Linguistic Index Support
• Create an index on linguistically sorted values.
• Rapidly query data without having to specify ORDER BY
clause and NLSSORT:
CREATE INDEX list_word ON
list (NLSSORT(word, 'NLS_SORT=French_M'));
SELECT word FROM list;
• Set the NLS_SORT parameter to match the linguistic
definition that you want to use for the linguistic sort when
creating the index.
20 - 26
Copyright © 2008, Oracle. All rights reserved.
Customizing Linguistic
Searching and Sorting
You can customize linguistic sorting for:
• Ignorable characters
• Contracting or expanding characters
• Special combination letters or special letters
• Expanding characters or special letters
• Special uppercase and lowercase letters
• Context-sensitive characters
• Reverse secondary sorting
• Canonical equivalence
20 - 27
Copyright © 2008, Oracle. All rights reserved.
Implicit Conversion
Between CLOB and NCLOB
Transparent implicit conversion is supported in:
• SQL IN and OUT bind variables for query and DML
• PL/SQL functions and procedure parameter passing
• PL/SQL variable assignment
20 - 29
Copyright © 2008, Oracle. All rights reserved.
NLS Data Conversion with Oracle Utilities
• Multiple data conversions can take place when data is
exported from one database and imported into another if the
same character sets are not used.
• External tables use the NLS settings on the server for
determining the data character set.
• SQL*Loader:
– Conventional path: Data is converted into the session
character set specified by NLS_LANG.
– Direct path: Data is converted using client-side directives.
20 - 30
Copyright © 2008, Oracle. All rights reserved.
NLS Data Conversion with Data Pump
• Data Pump Export always saves data in the same character
set as the database from which the data originates.
• Data Pump Import converts the data to the character set of
the target database, if needed.
• The Data Pump log file is written in the language specified
by NLS_LANG for the session that started Data Pump.
20 - 32
Copyright © 2008, Oracle. All rights reserved.
Language and Character Set File Scanner
(LCSSCAN)
Character set
20 - 33
Copyright © 2008, Oracle. All rights reserved.
Setting the Database Time Zone
The current time zone in the database is determined by the
following:
• The SET TIME_ZONE clause of the CREATE DATABASE
statement
• The time zone of the operating system on the database
server host
• The time zone specified by the ALTER SESSION SET
TIME_ZONE command
CREATE DATABASE ... SET TIME_ZONE='-04:00';
20 - 34
Copyright © 2008, Oracle. All rights reserved.
Summary
In this lesson, you should have learned how to:
• Determine a correct database character set that meets your
business requirements
• Obtain globalization support configuration information
• Customize language-dependent behavior for the database
and individual sessions
• Specify different linguistic sorts for queries
• Retrieve data that matches a search string ignoring case or
accent differences
20 - 35
Copyright © 2008, Oracle. All rights reserved.
Practice 20 Overview:
Using Globalization Support
This practice covers the following topics:
• Determining the database character set
• Setting the NLS_SORT variable
20 - 36
Copyright © 2008, Oracle. All rights reserved.