Oracle varchar2或char類型的byte和char的差別

Oracle定義字元串類型VARCHAR2和CHAR指定長度的用法如下：

varchar2(<SIZE> <BYTE|CHAR>) <SIZE>是介于1~4000之間的一個數，表示最多占用4000位元組的存儲空間。

char(<SIZE> <BYTE|CHAR>) <SIZE>是介于1~2000之間的一個數，表示最多占用2000位元組的存儲空間。

那其中的BYTE和CHAR有什麼差別呢

BYTE,用位元組指定：VARCHAR2(10 BYTE)。這能支援最多10位元組的資料，在一個多位元組字元集中，這可能隻是兩個字元。采用多位元組字元集時，位元組與字元并不相同。

CHAR,用字元指定：VARCHAR2(10 CHAR)。這将支援最多10字元資料，可能是多達40位元組的資訊。另外，VARCHAR2(4000 CHAR)理論上支援最多4000個字元的資料，不過由于Oracle中字元串資料類型限制為4000位元組，是以可能無法得到全部4000個字元。

使用UTF8之類的多位元組字元集時，建議你在VARCHAR2/CHAR定義中使用CHAR修飾會，也就是說，使用VARCHAR2(30 CHAR)，而不是VARCHAR2(30)，因為你的本意很可能是定義一個實際上能存儲30字元資料的列。還可以使用會話參數或系統參數NLS_LENGTH_SEMANTICS來修改預設行為，即把預設設定BYTE改為CHAR。不建議在系統級修改這個設定，而應該使用ALTER SESSION修改會話級。還有重要的一點，VARCHAR2中存儲的位元組數上界是4000。不過，即使你指定了VARCHAR(4000 CHAR)，可能并不能在這個字段中放下4000個字元實際上，采用你選擇的字元集時，如果所有字元都要用4個位元組來表示，那麼這個字段中就隻能放下1000個字元！

下面使用一個小例子展示BYTE和CHAR之間的差別，并顯示出上界的作用。

測試環境11.2.0.4，是在多位元組字元集資料庫上完成的，在此使用了字元集AL32UTF8，這個字元集支援最新版本的Unicode标準，采用一種變長方式對每個字元使用1~4個位元組進行編碼

<code>zx@ORCL>col value </code><code>for</code> <code>a30</code>

<code>zx@ORCL>col parameter </code><code>for</code> <code>a30</code>

<code>zx@ORCL></code><code>select</code> <code>* </code><code>from</code> <code>nls_database_parameters </code><code>where</code> <code>parameter=</code><code>'NLS_CHARACTERSET'</code><code>;</code>

<code>PARAMETER VALUE</code>

<code>NLS_CHARACTERSET AL32UTF8</code>

<code>zx@ORCL>show parameter nls_leng</code>

<code>NAME</code> <code>TYPE VALUE</code>

<code>nls_length_semantics string BYTE</code>

建立測試表

<code>zx@ORCL></code><code>create</code> <code>table</code> <code>t (a varchar2(1),b varchar2(1 </code><code>char</code><code>),c varchar2(4000 </code><code>char</code><code>));</code>

<code>Table</code> <code>created.</code>

現在，這個表中插入一個UTF字元unistr('\00d6')，這個字元長度為2個位元組，可以觀察到以下結果：

<code>zx@ORCL></code><code>select</code> <code>length(unistr(</code><code>'\00d6'</code><code>)),lengthb(unistr(</code><code>'\00d6'</code><code>)) </code><code>from</code> <code>dual;</code>

<code>LENGTH(UNISTR(</code><code>'\00D6'</code><code>)) LENGTHB(UNISTR(</code><code>'\00D6'</code><code>))</code>

<code>zx@ORCL></code><code>insert</code> <code>into</code> <code>t (a) </code><code>values</code> <code>(unistr(</code><code>'\00d6'</code><code>));</code>

<code>insert</code> <code>into</code> <code>t (a) </code><code>values</code> <code>(unistr(</code><code>'\00d6'</code><code>))</code>

<code>ERROR </code><code>at</code> <code>line 1:</code>

<code>ORA-12899: value too large </code><code>for</code> <code>column</code> <code>"ZX"</code><code>.</code><code>"T"</code><code>.</code><code>"A"</code> <code>(actual: 2, maximum: 1)</code>

這說明：VARCHAR(1)的機關是位元組而不是字元。這裡确實隻有一個Unicode字元，但是它在一個位元組中放不下；将應用從單位元組定寬字元集移植到一個多位元組字元集時，可能會發現原來在字段中能放下的文本現在卻無法放下。第二點的原因是：在一個單位元組字元集中，包含20個字元的字元串長度就是20位元組，完全可以在VARCHAR2(20)中放下。不過在一個多位元組字元集中，20個字元的長度可以達到80位元組(如果每個字元用4個位元組表示)，這樣一傑，20個Unicode字元很可能無法在20個位元組中放下。你可能會考慮将DDL修改為VARCHAR2(20 CHAR)，或在運作DDL建立表時使用前面提到的NLS_LENGTH_SEMENTICS會話參數。

插入包含一個字元的字段時觀察到以下結果：

<code>zx@ORCL></code><code>insert</code> <code>into</code> <code>t (b) </code><code>values</code> <code>(unistr(</code><code>'\00d6'</code><code>));</code>

<code>1 row created.</code>

<code>zx@ORCL></code><code>select</code> <code>length(b),lengthb(b),dump(b) dump </code><code>from</code> <code>t;</code>

<code> </code><code>LENGTH(B) LENGTHB(B) DUMP</code>

這個INSERT成功了，而且可以看到，所有插入資料的長度(LENGTH)就是一個字元，所有字元串函數都以字元為機關工作。LENGTHB函數(位元組長度)顯示出這個字段占用了2位元組的存儲空間，另外DUMP函數顯示了這些位元組到底是什麼。這個例子展示了VARCHAR2(N)并不一定存儲N個字元，而隻是存儲N個位元組。

下面測試VARCHAR2(4000)可能存儲不了4000個字元

<code>zx@ORCL></code><code>declare</code>

<code> </code><code>2 l_date varchar2(4000 </code><code>char</code><code>);</code>

<code> </code><code>3 l_ch varchar2(1 </code><code>char</code><code>) := unistr(</code><code>'\00d6'</code><code>);</code>

<code> </code><code>4 </code><code>begin</code>

<code> </code><code>6 </code><code>insert</code> <code>into</code> <code>t(c) </code><code>values</code><code>(l_date);</code>

<code>declare</code>

<code>ORA-01461: can bind a LONG value </code><code>only</code> <code>for</code> <code>insert</code> <code>into</code> <code>a LONG </code><code>column</code>

在此顯示出，一個4000字元的實際上長度為8000位元組，這樣一個字元串無法永久地存儲在一個VARCHAR(4000 char)字段中，這個字元串能放在PL/SQL變量中，因為在PL/SQL中VARCHAR2最大可以達到32K。不過，存儲在表中，VARCHAR2則被硬性限制為最多隻能存放4000位元組。我們可以成功地存儲其中2000個字元：

<code>PL/SQL </code><code>procedure</code> <code>successfully completed.</code>

<code>zx@ORCL></code><code>select</code> <code>length(c),lengthb(c) </code><code>from</code> <code>t </code><code>where</code> <code>c </code><code>is</code> <code>not</code> <code>null</code><code>;</code>

<code> </code><code>LENGTH(C) LENGTHB(C)</code>

輸出可見，c占用了4000個位元組的存儲空間。

參考：《9I10G11G程式設計藝術深入資料庫體系結構》《Oracle Database Globalization Support Guide》

本文轉自hbxztc 51CTO部落格，原文連結：http://blog.51cto.com/hbxztc/1893768，如需轉載請自行聯系原作者

Oracle varchar2或char類型的byte和char的差別

繼續閱讀

報錯：'mysql' 不是内部或外部指令，也不是可運作的程式或批處理檔案。

Linxu常用指令技巧彙總

ERROR 1 (HY000): Can't create/write to file '/tmp/#sql_4188_1.MYI' (Errcode: 28)

艱難安裝LDAP,SSL認證

《Linux指令行與Shell腳本程式設計大全第2版.布盧姆》pdf

MySQL的4種隔離級别？出現問題

XX系統實施過程問題總結

無元件上傳圖檔到資料庫中，最完整解決方案

【MySQL資料庫】資料庫索引事務1.索引2.事務

neo4j之cypher使用文檔

NOSQL安全攻擊

mybatis_入門程式Mybatis入門

登入plsql 報錯 the account is locked --使用者被鎖

sqlServer根據經緯查距離

SequoiaDB巨杉資料庫C++驅動概述

Oracle 批量查詢傳入List 傳回List