Skip to content
Home » Oracle » BYTE vs CHAR in NLS_LENGTH_SEMANTICS

BYTE vs CHAR in NLS_LENGTH_SEMANTICS

BYTE vs CHAR in NLS_LENGTH_SEMANTICS

Most of databases set NLS_LENGTH_SEMANTICS as BYTE by default, but there are some databases in Non-English country will take CHAR as the default unit of data type for CHAR and VARCHAR2.

In other words, if there is no explicit unit in the columns definition during table creation, the unit of type CHAR and VARCHAR2 column will follow the default NLS_LENGTH_SEMANTICS.

An Example to Distinguish the Differences

Let's see an example to distinguish the differences:

1. Check Current NLS_LENGTH_SEMANTICS

SQL> show parameter nls_length_semantics;
 NAME                                 TYPE        VALUE
 ------------------------------------ ----------- --------------------------
 nls_length_semantics                 string      BYTE

As we can see, current setting is BYTE.

2. Create a Table Without Semantics

SQL> CREATE TABLE HR.NAME1 (FIRST_NAME VARCHAR2(20), LAST_NAME VARCHAR2(25) NOT NULL);

Table created.

We created a table hr.name1 without explicit unit in VARCHAR2 columns.

3. Describe the Table

SQL> DESC hr.name1
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------------------
 FIRST_NAME                                         VARCHAR2(20)
 LAST_NAME                                 NOT NULL VARCHAR2(25)

We checked the definition of the new table hr.name1, the implied unit of VARCHAR2 is BYTE, even though it's not shown in the output.

That is to say, if the unit of the character length is not displayed, it uses the default value BYTE, otherwise it's CHAR.

4. Change NLS_LENGTH_SEMANTICS to CHAR

SQL> ALTER system SET nls_length_semantics='CHAR' scope=both;

System altered.

Please note that, NLS_LENGTH_SEMANTICS can be changed at session-level.

5. Bounce the database

$ srvctl stop database -d compdb
$ srvctl start database -d compdb

6. Check Current NLS_LENGTH_SEMANTICS

SQL> show parameter nls_length_semantics;
 NAME                                 TYPE        VALUE
 ------------------------------------ ----------- --------------------------
 nls_length_semantics                 string      CHAR

We checked NLS_LENGTH_SEMANTICS, current setting is CHAR.

7. Create a Table Without Semantics

SQL>  CREATE TABLE HR.NAME2 (FIRST_NAME VARCHAR2(20), LAST_NAME VARCHAR2(25) NOT NULL);

Table created.

We created a table hr.name2 without explicit unit in VARCHAR2 column.

8. Describe the Table

SQL> DESC hr.name2
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------------------
 FIRST_NAME                                         VARCHAR2(20 CHAR)
 LAST_NAME                                 NOT NULL VARCHAR2(25 CHAR)

We checked the definition of the new table hr.name2, the explicit unit of VARCHAR2 is CHAR as you can see.

BYTE is the Implicit Value

Which means, if the unit notation of a column is CHAR, it will show the unit explicitly. But if the unit notation of a column is BYTE, it won't show the unit.

You can see the unit of VARCHAR2 columns follow the implicit unit semantics (i.e. BYTE) if there are no semantics assigned. In other words, in the first table hr.name1, it creates VARCHAR2 columns in BYTE, and in the second table hr.name2, it creates VARCHAR2 columns in CHAR.

NLS_LENGTH_SEMANTICS in Enterprise Manager

Next, let's see a different point of view in the followings which are DDL of hr.name1 and hr.name2 generated by Oracle Enterprise Manager. Don't forget, the current semantics is still CHAR:

--hr.name1 DDL generated by OEM.
CREATE TABLE "HR"."NAME1" ( "FIRST_NAME" VARCHAR2(20 BYTE), "LAST_NAME" VARCHAR2(25 BYTE) NOT NULL ) TABLESPACE "USERS" PCTFREE 10 INITRANS 1 MAXTRANS 255 STORAGE ( BUFFER_POOL DEFAULT) LOGGING NOCOMPRESS

/* This DDL with explicit semantics is portable across databases. */


--hr.name2 DDL generated by OEM.
CREATE TABLE "HR"."NAME2" ( "FIRST_NAME" VARCHAR2(20), "LAST_NAME" VARCHAR2(25) NOT NULL ) TABLESPACE "USERS" PCTFREE 10 INITRANS 1 MAXTRANS 255 STORAGE ( BUFFER_POOL DEFAULT) LOGGING NOCOMPRESS

This DDL is reasonable, but not good enough. Since we are in CHAR semantics, which has no need to append unit notations in the table definition. But it will be a troublesome table if you migrate the table with such DDL to a target database with BYTE semantics manually. The target database will take BYTE to create the table.

NLS_LENGTH_SEMANTICS in Generated DDL

But the table DDL generated by the package DBMS_METADATA tells a different story:

SQL> show parameter nls_length_semantics

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_length_semantics                 string      CHAR

SQL> SET LONG 2000000
SQL> select dbms_metadata.get_ddl('TABLE','NAME1','HR') ddl from dual;

DDL
--------------------------------------------------------------------------------

  CREATE TABLE "HR"."NAME1"
   (    "FIRST_NAME" VARCHAR2(20),
        "LAST_NAME" VARCHAR2(25) NOT NULL ENABLE
   ) SEGMENT CREATION DEFERRED
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  TABLESPACE "USERS"

SQL> select dbms_metadata.get_ddl('TABLE','NAME2','HR') ddl from dual;

DDL
--------------------------------------------------------------------------------

  CREATE TABLE "HR"."NAME2"
   (    "FIRST_NAME" VARCHAR2(20 CHAR),
        "LAST_NAME" VARCHAR2(25 CHAR) NOT NULL ENABLE
   ) SEGMENT CREATION DEFERRED
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  TABLESPACE "USERS"

Semantics in Table Design

Columns with the length in CHAR can be converted into BYTE in some manners. One character could be 1, 2, 3 or 4 bytes under UTF-8, which depends on what language you are using. For Chinese, Japanese, Korean and Indian languages, one character takes 3 bytes. Oracle concludes some simple rules as followings:

  1. Characters from the European and Middle Eastern scripts are represented in either 1 or 2 bytes.
  2. Characters from most Asian scripts are represented in 3 bytes.
  3. Supplementary characters are represented in 4 bytes.
According to Oracle Database Globalization Support Guide : 6 Supporting Multilingual Databases with Unicode.

For example, if you defined a column like this:

...FIRST_NAME VARCHAR2(20 CHAR)

It equals to the following definition in Taiwan, which is using TRADITIONAL CHINESE.

...FIRST_NAME VARCHAR2(60 BYTE)

Please notice that the length of NCHAR or NVARCHAR2 is always in character (i.e. CHAR), no matter what the current semantics is. For example, if you assign the unit explicitly in a NVARCHAR2 column to BYTE or CHAR during table creation, it will fail.

SQL> CREATE TABLE HR.NAME3 (FIRST_NAME NVARCHAR2(20 BYTE), LAST_NAME NVARCHAR2(25 BYTE) NOT NULL);
CREATE TABLE HR.NAME3 (FIRST_NAME NVARCHAR2(20 BYTE), LAST_NAME NVARCHAR2(25 BYTE) NOT NULL)
                                               *
ERROR at line 1:
ORA-00907: missing right parenthesis


SQL> CREATE TABLE HR.NAME3 (FIRST_NAME NVARCHAR2(20 CHAR), LAST_NAME NVARCHAR2(25 CHAR) NOT NULL);
CREATE TABLE HR.NAME3 (FIRST_NAME NVARCHAR2(20 CHAR), LAST_NAME NVARCHAR2(25 CHAR) NOT NULL)
                                               *
ERROR at line 1:
ORA-00907: missing right parenthesis


SQL> CREATE TABLE HR.NAME3 (FIRST_NAME NVARCHAR2(20), LAST_NAME NVARCHAR2(25) NOT NULL);

Table created.

SQL> desc hr.name3
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 FIRST_NAME                                         NVARCHAR2(20)
 LAST_NAME                                 NOT NULL NVARCHAR2(25)


SQL> select dbms_metadata.get_ddl('TABLE','NAME3','HR') ddl from dual;

DDL
--------------------------------------------------------------------------------

  CREATE TABLE "HR"."NAME3"
   (    "FIRST_NAME" NVARCHAR2(20),
        "LAST_NAME" NVARCHAR2(25) NOT NULL ENABLE
   ) SEGMENT CREATION DEFERRED
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  TABLESPACE "USERS"

You had better not to assign semantics when defining such columns.

Leave a Reply

Your email address will not be published. Required fields are marked *