1 Overview of Globalization Support
This chapter provides an overview of globalization support for Oracle Database.
1.1 Globalization Support Architecture
The globalization support in Oracle Database enables you to store, process, and retrieve data in native languages. It ensures that database utilities, error messages, sort order, and date, time, monetary, numeric, and calendar conventions automatically adapt to any native language and locale.
In the past, Oracle referred to globalization support capabilities as National Language Support (NLS) features. NLS is actually a subset of globalization support. NLS is the ability to choose a national language and store data in a specific character set. Globalization support enables you to develop multilingual applications and software products that can be accessed and run from anywhere in the world simultaneously. An application can render user interface content and process data in the users' native languages and according to their locale preferences.
1.1.1 Locale Data on Demand
Oracle Database globalization support is implemented with the Oracle NLS Runtime Library (NLSRTL). NLSRTL provides a comprehensive suite of language-independent functions that perform proper text and character processing and language-convention manipulations. Behavior of these functions for a specific language and territory is governed by a set of locale-specific data that is identified and loaded at run time.
The locale-specific data is structured as independent sets of data for each locale that Oracle Database supports. The data for a particular locale can be loaded independently of other locale data.
The advantages of this design are as follows:
- You can manage memory consumption by choosing the set of locales that you need.
- You can add and customize locale data for a specific locale without affecting other locales.
The following figure shows how locale-specific data is loaded at run time. In this example, French data and Japanese data are loaded into the multilingual database, but German data is not.
Figure 1-1 Loading Locale-Specific Data to the Database
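The on-demand design can be pictured with a minimal Python sketch. It is an illustration only, not the NLSRTL implementation: the `LocaleRegistry` class, the `LOCALE_FILES` table, and the sample data are hypothetical names introduced here.

```python
# Hypothetical stand-in for the per-locale data files; real NLS data
# is binary and far richer than a single decimal separator.
LOCALE_FILES = {
    "FRENCH": {"decimal_separator": ","},
    "JAPANESE": {"decimal_separator": "."},
    "GERMAN": {"decimal_separator": ","},
}


class LocaleRegistry:
    """Loads each locale's data set independently, on first request."""

    def __init__(self, loader):
        self._loader = loader   # callable: locale name -> locale data
        self._loaded = {}       # holds only the locales actually requested

    def get(self, name):
        if name not in self._loaded:
            self._loaded[name] = self._loader(name)
        return self._loaded[name]

    def loaded(self):
        return sorted(self._loaded)


registry = LocaleRegistry(lambda name: dict(LOCALE_FILES[name]))
registry.get("FRENCH")
registry.get("JAPANESE")
print(registry.loaded())  # ['FRENCH', 'JAPANESE']; GERMAN was never loaded
```

Because each locale is a separate entry, memory use follows the set of locales requested, and replacing one entry leaves the others untouched.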
The locale-specific data is stored in the $ORACLE_HOME/nls/data directory. The ORA_NLS10 environment variable should be defined only when you need to change the default directory location for the locale-specific data files, for example, when the system has multiple Oracle Database homes that share a single copy of the locale-specific data files.
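The lookup rule described above can be sketched in Python. This is a reading of the paragraph, not Oracle's code: the function name `nls_data_dir` and both sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def nls_data_dir(env):
    """Return the locale data directory implied by the environment.

    ORA_NLS10, when set, overrides the default location under
    $ORACLE_HOME/nls/data.
    """
    override = env.get("ORA_NLS10")
    if override:
        return PurePosixPath(override)
    return PurePosixPath(env["ORACLE_HOME"]) / "nls" / "data"


# Hypothetical paths, for illustration only.
home_only = {"ORACLE_HOME": "/u01/app/oracle/dbhome_1"}
shared = {"ORACLE_HOME": "/u01/app/oracle/dbhome_2",
          "ORA_NLS10": "/u01/app/oracle/shared_nls"}

print(nls_data_dir(home_only))  # /u01/app/oracle/dbhome_1/nls/data
print(nls_data_dir(shared))     # /u01/app/oracle/shared_nls
```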
A boot file is used to determine the availability of the NLS objects that can be loaded. Oracle Database supports both system and user boot files. The user boot file gives you the flexibility to tailor what NLS locale objects are available for the database. Also, new locale data can be added and some locale data components can be customized.
1.1.2 Architecture to Support Multilingual Applications
Oracle Database enables multitier applications and client/server applications to support languages for which the database is configured.
The locale-dependent operations are controlled by several parameters and environment variables on both the client and the database server. On the database server, each session that is started on behalf of a client can run in the same locale as other sessions or in a different one, and can have the same or different language requirements specified.
Oracle Database has a set of session-independent NLS parameters that are specified when you create a database. Two of the parameters specify the database character set and the national character set, which is an alternative Unicode character set that can be specified for NCHAR, NVARCHAR2, and NCLOB data. The parameters specify the character set that is used to store text data in the database. Other parameters, such as language and territory, are used to evaluate and check constraints.
If the client session and the database server specify different character sets, then the database automatically converts character strings between the two character sets.
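As a rough picture of what such a conversion involves, the following Python sketch re-encodes a string from a single-byte Western European character set to UTF-8. Python codecs stand in for Oracle character sets here, and the `convert` helper is a hypothetical name; the database performs the equivalent step internally.

```python
def convert(data, source_codec, target_codec):
    """Decode bytes from the source character set, re-encode in the target."""
    return data.decode(source_codec).encode(target_codec)


# Bytes as a client using ISO 8859-1 would send them.
client_bytes = "Señor Müller".encode("iso-8859-1")

# Bytes as a UTF-8 database would store them.
db_bytes = convert(client_bytes, "iso-8859-1", "utf-8")

print(len(client_bytes))  # 12: one byte per character
print(len(db_bytes))      # 14: ñ and ü take two bytes each in UTF-8
```

The text is unchanged by the conversion, but its byte length is not, which is one reason column sizing matters when character sets differ.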
From a globalization support perspective, all applications are considered to be clients, even if they run on the same physical machine as the Oracle Database instance. For example, when SQL*Plus is started by the UNIX user who owns the Oracle Database software from the Oracle home in which the RDBMS software is installed, and SQL*Plus connects to the database through an adapter by specifying the ORACLE_SID parameter, SQL*Plus is considered a client. Its behavior is ruled by client-side NLS parameters.
Another example of an application being considered a client occurs when the middle tier is an application server. The different sessions spawned by the application server are considered to be separate client sessions.
When a client application is started, it initializes the client NLS environment from environment settings. All NLS operations performed locally are executed using these settings. Examples of local NLS operations are:
- Display formatting in Oracle Developer applications
- User OCI code that executes NLS OCI functions with OCI environment handles
When the application connects to a database, a session is created on the server. The new session initializes its NLS environment from NLS instance parameters specified in the initialization parameter file. These settings can be subsequently changed by an ALTER SESSION statement. The statement changes only the session NLS environment. It does not change the local client NLS environment. The session NLS settings are used to process SQL and PL/SQL statements that are executed on the server. For example, use an ALTER SESSION statement to set the NLS_LANGUAGE initialization parameter to Italian:
ALTER SESSION SET NLS_LANGUAGE=Italian;
Enter a SELECT statement:
SQL> SELECT last_name, hire_date, ROUND(salary/8,2) salary FROM employees;
You should see results similar to the following:
LAST_NAME                 HIRE_DATE     SALARY
------------------------- --------- ----------
...
Sciarra                   30-SET-05      962.5
Urman                     07-MAR-06        975
Popp                      07-DIC-07      862.5
...
Note that the month name abbreviations are in Italian.
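The effect of the language setting on date display can be mimicked in a few lines of Python. This is only a sketch of the idea: the `MONTH_ABBR` table and `format_dd_mon_rr` function are hypothetical, the abbreviations are hard-coded rather than read from Oracle locale data, and the hire dates are assumed to fall in the 2000s.

```python
from datetime import date

# Month abbreviations per language, hard-coded for illustration.
MONTH_ABBR = {
    "AMERICAN": ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                 "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
    "ITALIAN": ["GEN", "FEB", "MAR", "APR", "MAG", "GIU",
                "LUG", "AGO", "SET", "OTT", "NOV", "DIC"],
}


def format_dd_mon_rr(d, language):
    """Format a date as day, month abbreviation, two-digit year."""
    mon = MONTH_ABBR[language][d.month - 1]
    return f"{d.day:02d}-{mon}-{d.year % 100:02d}"


hired = date(2005, 9, 30)
print(format_dd_mon_rr(hired, "AMERICAN"))  # 30-SEP-05
print(format_dd_mon_rr(hired, "ITALIAN"))   # 30-SET-05
```

The stored date is the same in both calls; only the language-dependent rendering differs.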
Immediately after the connection has been established, if the NLS_LANG environment setting is defined on the client side, then an implicit ALTER SESSION statement synchronizes the client and session NLS environments.
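The following Python sketch shows one way to picture this synchronization. It assumes the usual `language_territory.charset` layout of NLS_LANG and assumes that the language and territory parts are what the session picks up; the helper names are hypothetical, and the exact statements that Oracle issues internally are not stated in this chapter.

```python
def parse_nls_lang(value):
    """Split an NLS_LANG value of the form language_territory.charset."""
    lang_terr, _, charset = value.partition(".")
    language, _, territory = lang_terr.partition("_")
    return {"language": language, "territory": territory, "charset": charset}


def implicit_alter_session(value):
    """Build session-level statements mirroring the client setting.

    Assumption: only language and territory are applied to the session;
    the character set part describes the client side.
    """
    parts = parse_nls_lang(value)
    statements = []
    if parts["language"]:
        statements.append(
            f"ALTER SESSION SET NLS_LANGUAGE='{parts['language']}'")
    if parts["territory"]:
        statements.append(
            f"ALTER SESSION SET NLS_TERRITORY='{parts['territory']}'")
    return statements


for stmt in implicit_alter_session("ITALIAN_ITALY.WE8ISO8859P1"):
    print(stmt)
```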
1.1.3 Using Unicode in a Multilingual Database
Unicode, the universal encoded character set, enables you to store information in any language by using a single character set. Unicode provides a unique code value for every character, regardless of the platform, program, or language. Oracle recommends using AL32UTF8 as the database character set. AL32UTF8 is the proper implementation of the UTF-8 encoding form of the Unicode standard.
Note:
Starting with Oracle Database 12c Release 2, if you use Oracle Universal Installer (OUI) or Oracle Database Configuration Assistant (DBCA) to create a database, the default database character set used is the Unicode character set AL32UTF8.
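A short Python sketch illustrates why a single Unicode character set suits multilingual data. Python codecs are used as stand-ins for database character sets, and `can_encode` is a hypothetical helper.

```python
def can_encode(text, codec):
    """Report whether every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# German, Japanese, and Greek in one string.
mixed = "Grüße, 日本語, Ελληνικά"

print(can_encode(mixed, "utf-8"))       # True
print(can_encode(mixed, "iso-8859-1"))  # False: no Japanese or Greek
print(can_encode(mixed, "shift_jis"))   # False: no ü or ß
```

Each legacy character set covers only part of the string, whereas UTF-8 stores all of it without loss.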
Unicode has the following advantages:
- Simplifies character set conversion and linguistic sort functions.
- Improves performance compared with native multibyte character sets.
- Supports the Unicode data type based on the Unicode standard.
To help you migrate to a Unicode environment, Oracle provides the Database Migration Assistant for Unicode (DMU). The DMU is a graphical tool that streamlines the migration process: its interface minimizes the workload, ensures that all migration issues are addressed, and ensures that the data conversion is carried out correctly and efficiently. The DMU offers many advantages over past methods of migrating data, some of which are:
- It guides you through the workflow.
- It offers suggestions for handling certain problems, such as failures during the cleansing of the data.
- It supports selective conversion of data.
- It offers progress monitoring.
1.2 Globalization Support Features
1.2.1 Language Support
Oracle Database enables you to store, process, and retrieve data in native languages. The languages that can be stored in a database are all languages written in scripts that are encoded by Oracle-supported character sets. Through the use of Unicode databases and data types, Oracle Database supports most contemporary languages.
Additional support is available for a subset of the languages. The database can, for example, display dates using translated month names, and can sort text data according to cultural conventions.
When this document uses the term language support, it refers to the additional language-dependent functionality, and not to the ability to store text of a specific language. For example, language support includes displaying dates or sorting text according to specific locales and cultural conventions. Additionally, for some supported languages, Oracle Database provides translated error messages and a translated user interface for the database utilities.
See Also:
- "Languages" for the list of Oracle Database language names and abbreviations
- "Translated Messages" for the list of languages into which Oracle Database messages are translated
1.2.2 Territory Support
Oracle Database supports cultural conventions that are specific to geographical locations. The default local time format, date format, and numeric and monetary conventions depend on the local territory setting. Setting different NLS parameters enables the database session to use different cultural settings. For example, you can set the euro (EUR) as the primary currency and the Japanese yen (JPY) as the secondary currency for a given database session, even when the territory is defined as AMERICA.
See Also:
- "Territories" for a list of territories that are supported by Oracle Database
1.2.3 Date and Time Formats
Different conventions for displaying the hour, day, month, and year can be handled in local formats. For example, in the United Kingdom, the date is displayed using the DD-MON-YYYY format, while Japan commonly uses the YYYY-MM-DD format.
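To make the two formats concrete, the following Python sketch renders one date under each format model. The `render` function is a hypothetical, much simplified stand-in for a format-model engine; it understands only the four elements used in this section and uses English month abbreviations.

```python
import re
from datetime import date

EN_MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
             "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def render(d, model):
    """Substitute YYYY, MON, MM, and DD in a format model."""
    values = {
        "YYYY": f"{d.year:04d}",
        "MON": EN_MONTHS[d.month - 1],
        "MM": f"{d.month:02d}",
        "DD": f"{d.day:02d}",
    }
    # A single regex pass avoids re-scanning text that was already replaced.
    return re.sub(r"YYYY|MON|MM|DD", lambda m: values[m.group()], model)


d = date(2024, 3, 7)
print(render(d, "DD-MON-YYYY"))  # 07-MAR-2024
print(render(d, "YYYY-MM-DD"))   # 2024-03-07
```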
Time zones and daylight saving support are also available.
1.2.4 Monetary and Numeric Formats
Currency, credit, and debit symbols can be represented in local formats. Radix symbols and thousands separators can be defined by locales. For example, in the US, the decimal point is a dot (.), while it is a comma (,) in France. Therefore, the amount $1,234 has different meanings in different countries.
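The ambiguity is easy to demonstrate. In the Python sketch below, the same digits are parsed under two conventions; `parse_amount` is a hypothetical helper, and a plain space is assumed as the French group separator.

```python
from decimal import Decimal


def parse_amount(text, decimal_sep, group_sep):
    """Parse a number written with the given decimal and group separators."""
    # Remove group separators first, then normalize the decimal separator.
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


us_value = parse_amount("1,234", decimal_sep=".", group_sep=",")
fr_value = parse_amount("1,234", decimal_sep=",", group_sep=" ")

print(us_value)  # 1234
print(fr_value)  # 1.234
```

The two readings differ by a factor of one thousand, which is why numeric conventions must be tied to the locale rather than guessed from the text.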
1.2.5 Calendar Systems
Many different calendar systems are in use around the world. Oracle Database supports eight different calendar systems:
- Gregorian
- Japanese Imperial
- ROC Official (Republic of China)
- Thai Buddha
- Persian
- English Hijrah
- Arabic Hijrah
- Ethiopian
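For three of these calendars, the year is related to the Gregorian year by a fixed offset, which a brief Python sketch can show. The offsets are the commonly cited ones for modern dates; the function names are hypothetical, and this is not how Oracle Database computes calendar values.

```python
from datetime import date


def thai_buddhist_year(d):
    """Buddhist Era year used in Thailand (modern dates)."""
    return d.year + 543


def roc_year(d):
    """Republic of China year; year 1 corresponds to 1912."""
    return d.year - 1911


def reiwa_year(d):
    """Japanese Imperial year within the Reiwa era (from May 1, 2019)."""
    if d < date(2019, 5, 1):
        raise ValueError("date precedes the Reiwa era")
    return d.year - 2018


d = date(2024, 6, 1)
print(thai_buddhist_year(d))  # 2567
print(roc_year(d))            # 113
print(reiwa_year(d))          # 6
```

Calendars such as Persian, Hijrah, and Ethiopian differ in month and year structure as well, so they cannot be reduced to a simple year offset.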
See Also:
- "Calendar Systems" for more information about supported calendars
1.2.6 Linguistic Sorting
Oracle Database provides linguistic definitions for culturally accurate sorting and case conversion. The basic definition treats strings as sequences of independent characters. The extended definition recognizes pairs of characters that should be treated as special cases.
Strings that are converted to upper case or lower case using the basic definition always retain their lengths. Strings converted using the extended definition may become longer or shorter.
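The difference in length behavior can be seen with the German sharp s. The Python sketch below uses Python's built-in Unicode case mapping as a stand-in for the extended definition and a hypothetical `basic_upper` function as a stand-in for the basic one; neither is Oracle's implementation.

```python
def basic_upper(s):
    """Per-character uppercasing that never changes the string length.

    Characters whose uppercase form is more than one character are kept
    as they are.
    """
    return "".join(c.upper() if len(c.upper()) == 1 else c for c in s)


def extended_upper(s):
    """Full Unicode uppercasing, which can lengthen the string."""
    return s.upper()


word = "straße"
print(basic_upper(word), len(basic_upper(word)))        # STRAßE 6
print(extended_upper(word), len(extended_upper(word)))  # STRASSE 7
```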
1.2.7 Character Set Support
Oracle Database supports a large number of single-byte, multibyte, and fixed-width encoding schemes that are based on national, international, and vendor-specific standards.
See Also:
- "Character Sets" for a list of supported character sets
1.2.8 Character Semantics
Oracle Database provides character semantics, which lets you define the storage requirements for multibyte strings of varying widths in terms of characters instead of bytes.
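The distinction matters as soon as a value contains multibyte characters. The Python sketch below checks whether a value fits a length limit of 6 under each interpretation, roughly corresponding to a column declared as VARCHAR2(6 CHAR) or VARCHAR2(6 BYTE) in a UTF-8 database. The `fits` function is a hypothetical illustration, not an Oracle API.

```python
def fits(value, limit, semantics):
    """Check a value against a length limit in characters or UTF-8 bytes."""
    if semantics == "CHAR":
        size = len(value)                  # count characters
    elif semantics == "BYTE":
        size = len(value.encode("utf-8"))  # count encoded bytes
    else:
        raise ValueError("semantics must be 'CHAR' or 'BYTE'")
    return size <= limit


name = "Müller"  # 6 characters, 7 bytes in UTF-8
print(fits(name, 6, "CHAR"))  # True
print(fits(name, 6, "BYTE"))  # False

kanji = "日本語"  # 3 characters, 9 bytes in UTF-8
print(fits(kanji, 6, "CHAR"))  # True
print(fits(kanji, 6, "BYTE"))  # False
```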
1.2.9 Customization of Locale and Calendar Data
You can customize locale data such as language, character set, territory, or linguistic sort using the Oracle Locale Builder.
You can customize calendars with the NLS Calendar Utility.
1.2.10 Unicode Support
Unicode is an industry standard that enables text and symbols from all languages to be consistently represented and manipulated by computers.
Oracle Database has complied with the Unicode standard since Oracle 7. Subsequently, Oracle Database 10g release 2 supports Unicode 4.0. Oracle Database 11g release supports Unicode 5.0. Oracle Database 12c Release 1 supports Unicode 6.2. Oracle Database 12c Release 2 (12.2) supports Unicode 7.0. Oracle Database Release 18c and Oracle Database Release 19c support Unicode 9.0. Oracle Database Release 21c and later support Unicode 12.1.
You can store Unicode characters in an Oracle database in two ways:
- You can create a Unicode database that enables you to store UTF-8 encoded characters as SQL CHAR data types: VARCHAR2, CHAR, LONG (deprecated), and CLOB.
- You can support multilingual data in specific columns by using SQL NCHAR data types: NVARCHAR2, NCHAR, and NCLOB. You can store Unicode characters into columns of the NCHAR data types regardless of how the database character set has been defined. The NCHAR data types are exclusively Unicode data types.
Note:
Starting with Oracle Database 12c Release 2 (12.2), if you use Oracle Universal Installer (OUI) or Oracle Database Configuration Assistant (DBCA) to create a database, then the default database character set used is the Unicode character set AL32UTF8.
1.3 Changes in Oracle Database Globalization Support Guide for Oracle Database Release 21c, Version 21.1
The following are changes in Oracle Database Globalization Support Guide for Oracle Database release 21c, version 21.1.
1.3.1 New Features
- Support for Unicode 12.1, a major version of the Unicode Standard that supersedes all its previous versions.
  See "Unicode Support".
- Support for the Unicode Collation Algorithm (UCA) 12.1 collations (UCA1210_*).
  See "UCA Collation".
- Two new linguistic sorts (XGERMAN_S and XGERMAN_DIN_S), which support Latin Capital Letter Sharp S as the uppercase form of Latin Small Letter Sharp S.
  See "Linguistic Collations".
- The new Japanese era Reiwa, which went into effect on May 1, 2019, is now supported in Oracle Database for the Japanese Imperial Calendar.
  See "Japanese Imperial Calendar".
- You can now upgrade the time zone data in your database without incurring any database downtime.
  See "Upgrading the Time Zone Data Using the DBMS_DST Package".
- The BURMESE, GEORGIAN, and KYRGYZ languages are now supported.
  See "Languages".
- The MYANMAR, GEORGIA, and KYRGYZSTAN territories are now supported.
  See "Territories".
1.3.2 Desupported Features
- The Unicode Collation Algorithm (UCA) 6.1 collations (UCA0610_*) are desupported in this release. Oracle recommends that you use the latest supported version of UCA collations, which in Oracle Database 21c is UCA 12.1. UCA 12.1 incorporates all of the UCA enhancements since version 6.1, as well as proper collation weight assignments for all new characters introduced since Unicode 6.1.
  See Table A-17 for the list of UCA collations supported in this release.