近來在做資料庫設計,有時候真弄不清SQL2000裡的資料類型,是以摘了這篇文章。
(1)char、varchar、text和nchar、nvarchar、ntext
char和varchar的長度都在1到8000之間,它們的差別在于char是定長字元資料,而varchar是變長字元資料。所謂定長就是長度固定的,當輸入的資料長度沒有達到指定的長度時将自動以英文空格在其後面填充,使長度達到相應的長度;而變長字元資料則不會以空格填充。text存儲可變長度的非Unicode資料,最大長度為2^31-1(2,147,483,647)個字元。
後面三種資料類型和前面的相比,從名稱上看隻是多了個字母"n",它表示存儲的是Unicode資料類型的字元。寫過程式的朋友對Unicode應該很了解。字元中,英文字元隻需要一個位元組存儲就足夠了,但漢字衆多,需要兩個位元組存儲,英文與漢字同時存在時容易造成混亂,Unicode字元集就是為了解決字元集這種不相容的問題而産生的,它所有的字元都用兩個位元組表示,即英文字元也是用兩個位元組表示。nchar、nvarchar的長度是在1到4000之間。和char、varchar比較:nchar、nvarchar則最多存儲4000個字元,不論是英文還是漢字;而char、varchar最多能存儲8000個英文,4000個漢字。可以看出使用nchar、nvarchar資料類型時不用擔心輸入的字元是英文還是漢字,較為友善,但在存儲英文時數量上有些損失。
(2)datetime和smalldatetime
datetime:從1753年1月1日到9999年12月31日的日期和時間資料,精确到百分之三秒。
smalldatetime:從1900年1月1日到2079年6月6日的日期和時間資料,精确到分鐘。
(3)bitint、int、smallint、tinyint和bit
bigint:從-2^63(-9223372036854775808)到2^63-1(9223372036854775807)的整型資料。
int:從-2^31(-2,147,483,648)到2^31-1(2,147,483,647)的整型資料。
smallint:從-2^15(-32,768)到2^15-1(32,767)的整數資料。
tinyint:從0到255的整數資料。
bit:1或0的整數資料。
(4)decimal和numeric
這兩種資料類型是等效的。都有兩個參數:p(精度)和s(小數位數)。p指定小數點左邊和右邊可以存儲的十進制數字的最大個數,p必須是從 1到38之間的值。s指定小數點右邊可以存儲的十進制數字的最大個數,s必須是從0到p之間的值,預設小數位數是0。
(5)float和real
float:從-1.79^308到1.79^308之間的浮點數字資料。
real:從-3.40^38到3.40^38之間的浮點數字資料。在SQL Server中,real的同義詞為float(24)。
資料庫定義到char類型的字段時,不知道大家是否會猶豫一下,到底選char、nchar、varchar、nvarchar、text、ntext中哪一種呢?結果很可能是兩種,一種是節儉人士的選擇:最好是用定長的,感覺比變長能省些空間,而且處理起來會快些,無法定長隻好選用定長,并且将長度設定盡可能地小;另一種是則是覺得無所謂,盡量用可變類型的,長度盡量放大些。
鑒于現在硬體像蘿蔔一樣便宜的大好形勢,糾纏這樣的小問題實在是沒多大意義,不過如果不弄清它,總覺得對不起勞累過度的CPU和硬碟。
下面開始了(以下說明隻針對SqlServer有效):
1、當使用非unicode時慎用以下這種查詢:
select f from t where f = N'xx'
原因:無法利用到索引,因為資料庫會将f先轉換到unicode再和N'xx'比較
2、char 和相同長度的varchar處理速度差不多(後面還有說明)
3、varchar的長度不會影響處理速度!!!(看後面解釋)
4、索引中列總長度最多支援總為900位元組,是以長度大于900的varchar、char和大于450的nvarchar,nchar将無法建立索引
5、text、ntext上是無法建立索引的
6、O/R Mapping中對應實體的屬性類型一般是以string居多,用char[]的非常少,是以如果按mapping的合理性來說,可變長度的類型更加吻合
7、一般基礎資料表中的name在實際查詢中基本上全部是使用like '%xx%'這種方式,而這種方式是無法利用索引的,是以如果對于此種字段,索引建了也白建
8、其它一些像remark的字段則是根本不需要查詢的,是以不需要索引
9、varchar的存放和string是一樣原理的,即length {block}這種方式,是以varchar的長度和它實際占用空間是無關的
10、對于固定長度的字段,是需要額外空間來存放NULL辨別的,是以如果一個char字段中出現非常多的NULL,那麼很不幸,你的占用空間比沒有NULL的大(但這個大并不是大太多,因為NULL辨別是用bit存放的,可是如果你一行中隻有你一個NULL需要辨別,那麼你就白白浪費1byte空間了,罪過罪過!),這時候,你可以使用特殊辨別來存放,如:'NV'
11、同上,是以對于這種NULL查詢,索引是無法生效的,假如你使用了NULL辨別替代的話,那麼恭喜你,你可以利用到索引了
12、char和varchar的比較成本是一樣的,現在關鍵就看它們的索引查找的成本了,因為查找政策都一樣,是以應該比較誰占用空間小。在存放相同數量的字元情況下,如果數量小,那麼char占用長度是小于varchar的,但如果數量稍大,則varchar完全可能小于char,而且要看實際填充數值的充實度,比如說varchar(3)和char(3),那麼理論上應該是char快了,但如果是char(10)和varchar(10),充實度隻有30%的情況下,理論上就應該是varchar快了。因為varchar需要額外空間存放塊長度,是以隻要length(1-fillfactor)大于這個存放空間(好像是2位元組),那麼它就會比相同長度的char快了。
13、nvarchar比varchar要慢上一些,而且對于非unicode字元它會占用雙倍的空間,那麼這麼一種類型推出來是為什麼呢?對,就是為了國際化,對于unicode類型的資料,排序規則對它們是不起作用的,而非unicode字元在處理不同語言的資料時,必須指定排序規則才能正常工作,是以n類型就這麼一點好處。
總結陳詞:
1、如果資料量非常大,又能100%确定長度且儲存隻是ansi字元,那麼char
2、能确定長度又不一定是ansi字元或者,那麼用nchar;
3、不确定長度,要查詢且希望利用索引的話,用nvarchar類型吧,将它們設到400;
4、不查詢的話沒什麼好說的,用nvarchar(4000)
5、性格豪爽的可以隻用3和4,偶爾用用1,畢竟這是一種額外說明,等于告訴别人說,我一定需要長度為X位的資料
SQL 文法參考手冊
DB2 提供了關連式資料庫的查詢語言 SQL (Structured Query Language),是一種非常口語化、既易學又易懂的文法。 此一語言幾乎是每個資料庫系統都必須提供的,用以表示關連式的操作,包含了資料的定義(DDL)以及資料的處理(DML)。SQL原來拼成SEQUEL,這語言的原型以“系統 R“的名字在 IBM 聖荷西實驗室完成,經過IBM内部及其他的許多使用性及效率測試,其結果相當令人滿意,并決定在系統R 的技術基礎發展出來 IBM 的産品。而且美國國家标準學會(ANSI)及國際标準化組織(ISO)在1987遵循一個幾乎是以 IBM SQL 為基礎的标準關連式資料語言定義。
一、資料定義 DDL(Data Definition Language)
資料定語言是指對資料的格式和形态下定義的語言,他是每個資料庫要建立時候時首先要面對的,舉凡資料分哪些表格關系、表格内的有什麽欄位主鍵、表格和表格之間互相參考的關系等等,都是在開始的時候所必須規劃好的。
1、建表格:
CREATE TABLE table_name(
column1 DATATYPE [NOT NULL] [NOT NULL PRIMARY KEY],
column2 DATATYPE [NOT NULL],
...)
說明:
DATATYPE --是資料的格式,詳見表。
NUT NULL --可不可以允許資料有空的(尚未有資料填入)。
PRIMARY KEY --是本表的主鍵。
2、更改表格
ALTER TABLE table_name
ADD COLUMN column_name DATATYPE
說明:增加一個欄位(沒有删除某個欄位的文法。
ALTER TABLE table_name
ADD PRIMARY KEY (column_name)
說明:更改表得的定義把某個欄位設為主鍵。
ALTER TABLE table_name
DROP PRIMARY KEY (column_name)
說明:把主鍵的定義删除。
3、建立索引
CREATE INDEX index_name ON table_name (column_name)
說明:對某個表格的欄位建立索引以增加查詢時的速度。
4、删除
DROP table_name
DROP index_name
二、的資料形态 DATATYPEs
smallint
16 位元的整數。
interger
32 位元的整數。
decimal(p,s)
p 精确值和 s 大小的十進位整數,精确值p是指全部有幾個數(digits)大小值,s是指小數
點後有幾位數。如果沒有特别指定,則系統會設為 p=5; s=0 。
float
32位元的實數。
double
64位元的實數。
char(n)
n 長度的字串,n不能超過 254。
varchar(n)
長度不固定且其最大長度為 n 的字串,n不能超過 4000。
graphic(n)
和 char(n) 一樣,不過其機關是兩個字元 double-bytes, n不能超過127。這個形态是為
了支援兩個字元長度的字型,例如中文字。
vargraphic(n)
可變長度且其最大長度為 n 的雙字元字串,n不能超過 2000。
date
包含了 年份、月份、日期。
time
包含了 小時、分鐘、秒。
timestamp
包含了 年、月、日、時、分、秒、千分之一秒。
三、資料操作 DML (Data Manipulation Language)
資料定義好之後接下來的就是資料的操作。資料的操作不外乎增加資料(insert)、查詢資料(query)、更改資料(update) 、删除資料(delete)四種模式,以下分 别介紹他們的文法:
1、增加資料:
INSERT INTO table_name (column1,column2,...)
VALUES ( value1,value2, ...)
說明:
1.若沒有指定column 系統則會按表格内的欄位順序填入資料。
2.欄位的資料形态和所填入的資料必須吻合。
3.table_name 也可以是景觀 view_name。
INSERT INTO table_name (column1,column2,...)
SELECT columnx,columny,... FROM another_table
說明:也可以經過一個子查詢(subquery)把别的表格的資料填入。
2、查詢資料:
基本查詢
SELECT column1,columns2,...
FROM table_name
說明:把table_name 的特定欄位資料全部列出來
SELECT *
FROM table_name
WHERE column1 = xxx
[AND column2 > yyy] [OR column3 <> zzz]
說明:
1.'*'表示全部的欄位都列出來。
2.WHERE 之後是接條件式,把符合條件的資料列出來。
SELECT column1,column2
FROM table_name
ORDER BY column2 [DESC]
說明:ORDER BY 是指定以某個欄位做排序,[DESC]是指從大到小排列,若沒有指明,則是從小到大
排列
組合查詢
組合查詢是指所查詢得資料來源并不隻有單一的表格,而是聯合一個以上的
表格才能夠得到結果的。
SELECT *
FROM table1,table2
WHERE table1.colum1=table2.column1
說明:
1.查詢兩個表格中其中 column1 值相同的資料。
2.當然兩個表格互相比較的欄位,其資料形态必須相同。
3.一個複雜的查詢其動用到的表格可能會很多個。
整合性的查詢:
SELECT COUNT (*)
FROM table_name
WHERE column_name = xxx
說明:
查詢符合條件的資料共有幾筆。
SELECT SUM(column1)
FROM table_name
說明:
1.計算出總和,所選的欄位必須是可數的數字形态。
2.除此以外還有 AVG() 是計算平均、MAX()、MIN()計算最大最小值的整合性查詢。
SELECT column1,AVG(column2)
FROM table_name
GROUP BY column1
HAVING AVG(column2) > xxx
說明:
1.GROUP BY: 以column1 為一組計算 column2 的平均值必須和 AVG、SUM等整合性查詢的關鍵字
一起使用。
2.HAVING : 必須和 GROUP BY 一起使用作為整合性的限制。
複合性的查詢
SELECT *
FROM table_name1
WHERE EXISTS (
SELECT *
FROM table_name2
WHERE conditions )
說明:
1.WHERE 的 conditions 可以是另外一個的 query。
2.EXISTS 在此是指存在與否。
SELECT *
FROM table_name1
WHERE column1 IN (
SELECT column1
FROM table_name2
WHERE conditions )
說明:
1. IN 後面接的是一個集合,表示column1 存在集合裡面。
2. SELECT 出來的資料形态必須符合 column1。
其他查詢
SELECT *
FROM table_name1
WHERE column1 LIKE 'x%'
說明:LIKE 必須和後面的'x%' 相呼應表示以 x為開頭的字串。
SELECT *
FROM table_name1
WHERE column1 IN ('xxx','yyy',..)
說明:IN 後面接的是一個集合,表示column1 存在集合裡面。
SELECT *
FROM table_name1
WHERE column1 BETWEEN xx AND yy
說明:BETWEEN 表示 column1 的值介於 xx 和 yy 之間。
3、更改資料:
UPDATE table_name
SET column1='xxx'
WHERE conditoins
說明:
1.更改某個欄位設定其值為'xxx'。
2.conditions 是所要符合的條件、若沒有 WHERE 則整個 table 的那個欄位都會全部被更改。
4、删除資料:
DELETE FROM table_name
WHERE conditions
說明:删除符合條件的資料。
說明:關于WHERE條件後面如果包含有日期的比較,不同資料庫有不同的表達式。具體如下:
(1)如果是ACCESS資料庫,則為:WHERE mydate>#2000-01-01#
(2)如果是ORACLE資料庫,則為:WHERE mydate>cast('2000-01-01' as date)
或:WHERE mydate>to_date('2000-01-01','yyyy-mm-dd')
在Delphi中寫成:
thedate= '2000-01-01';
query1.SQL.add('select * from abc where mydate>cast('+''''+thedate+''''+' as date)');
SQL Server 2000 Datatypes
By Sergey Vartanyan
- Introduction Binary datatypes Character datatypes Date and Time datatypes Numeric datatypes Integer datatypes Monetary datatypes Special datatypes Text and image datatypes Unicode Character datatypes User-Defined datatypes
Introduction
There are three new datatypes in SQL Server 2000 in comparison with SQL Server 7.0.
These datatypes were added:
- bigint
- sql_variant
- table
In this article, I want to tell you about built-in SQL Server 2000 datatypes, about user-defined datatypes and what datatype is generally used in a particular situation.
Binary datatypes
Binary data is similar to hexadecimal data and consists of the characters 0 through 9 and A through F, in groups of two characters each. You should specify 0x before binary value when input it.
There are two binary datatypes:
- binary[(n)]
- varbinary[(n)]
Binary[(n)] datatype can store up to 8000 bytes of fixed-length binary data. You can specify the maximum byte length with n.
Varbinary[(n)] datatype can store up to 8000 bytes of variable-length binary data. You can specify the maximum byte length with n. Variable-length means that binary data can contain less than n bytes, and the storage size will be the actual length of the data entered.
You should use varbinary datatype instead of binary datatype, when you expect null values or a variation in data size.
Character datatypes
Character datatypes are used to store any combination of letters, symbols, and numbers. You should enclose character data with quotation marks, when enter it.
There are two character datatypes:
- char[(n)]
- varchar[(n)]
Char[(n)] datatype can store up to 8000 bytes of fixed-length character data. You can specify the maximum byte length with n.
Varchar[(n)] datatype can store up to 8000 bytes of variable-length character data. You can specify the maximum byte length with n. Variable-length means that character data can contain less than n bytes, and the storage size will be the actual length of the data entered.
You should use varchar datatype instead of char datatype, when you expect null values or a variation in data size.
Date and Time datatypes
There are two datetime datatypes:
- datetime
- smalldatetime
Datetime is stored in 8 bytes of two 4-byte integers: 4 bytes for the number of days before or after the base date of January 1, 1900, and 4 bytes for the number of milliseconds after midnight.
Datetime datatype can store dates from January 1, 1753, to December 31, 9999, with accuracy of 3.33 milliseconds.
If you will not specify date portion of the datetime value, then January 1, 1900 is supplied; if you will not specify time portion of datetime value, then 12:00:00:000AM is supplied.
Smalldatetime is stored in 4 bytes of two 2-byte integers: 2 bytes for the number of days after the base date of January 1, 1900, and 2 bytes for the number of minutes after midnight.
Smalldatetime datatype can store dates from January 1, 1900, to June 6, 2079, with accuracy to the minute.
If you will not specify date portion of the datetime value, then January 1, 1900 is supplied; if you will not specify time portion of datetime value, then 12:00AM is supplied.
Numeric datatypes
There are two kinds of the numeric datatypes:
- Exact Numeric Data
- Approximate Numeric Data
The difference between Exact Numeric Data and Approximate Numeric Data in that Exact Numeric Data can store all decimal numbers with complete accuracy, and Approximate Numeric Data cannot.
Exact Numeric Data are:
- decimal[(p[, s])]
- numeric[(p[, s])]
The decimal and numeric datatypes are synonyms in the SQL Server 2000. Exact Numeric Data holds values from 10^38 - 1 through - 10^38 - 1. The storage size varies based on the specified precision, and it ranges from a minimum of 2 bytes to a maximum of 17 bytes.
p - is a precision, that specify the maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point. The maximum precision is 28 digits.
s - is a scale, that specify the maximum number of decimal digits that can be stored to the right of the decimal point, and it must be less than or equal to the precision.
Approximate Numeric Data are:
- float[(n)]
- real
Float[(n)] datatype is stored in 8 bytes and is used to hold positive or negative floating-point numbers. By default, this column has a 15-digit precision. Float[(n)] datatype can store positive values from 2.23E-308 to 1.79E308 and negative values from -2.23E-308 to -1.79E308.
Real datatype is stored in 4 bytes and is used as float datatype to hold positive or negative floating-point numbers. This column has a 7-digit precision. Real datatype can store positive values from 1.18E-38 to 3.40E38 and negative values from -1.18E-38 to -3.40E38.
Integer datatypes
There are four integer datatypes:
- tinyint
- smallint
- int
- bigint
Tinyint is stored in 1 byte and is used to hold integer values from 0 through 255.
Smallint is stored in 2 bytes and is used to hold integer values from -32768 through 32,767.
Int is stored in 4 bytes and is used to hold integer values from -2147483648 through 2147483647.
Bigint is stored in 8 bytes and is used to hold integer values from -9223372036854775808 through 9223372036854775807.
Monetary datatypes
Monetary datatypes are usually used to store monetary values. There are two monetary datatypes:
- money
- smallmoney
Money datatype is stored in 8 bytes and is used to hold monetary values from -922337203685477.5808 through 922337203685477.5807.
Smallmoney datatype is stored in 4 bytes and is used to hold monetary values from - 214748.3648 through 214748.3647.
Special datatypes
These are the special datatypes:
- bit
- cursor
- sql_variant
- table
- timestamp
- uniqueidentifier
Bit datatype is usually used for true/false or yes/no types of data, because it holds either 1 or 0. All integer values other than 1 or 0 are always interpreted as 1. One bit column stores in 1 byte, but multiple bit types in a table can be collected into bytes. Bit columns cannot be NULL and cannot have indexes on them.
Cursor datatype is used for variables or stored procedure OUTPUT parameters that contain a reference to a cursor. The variables created with the cursor data type are nullable. You cannot use this datatype for a column in a CREATE TABLE statement.
sql_variant datatype is used to store values of various SQL Server supported data types, except text, ntext, timestamp, and sql_variant. The maximum length of sql_variant datatype is 8016 bytes. You can store in one column of type sql_variant the rows of different data types, for example int, char, and varchar values.
This is the example of using sql_variant datatype:
SET NOCOUNT ON
GO
if object_id('tbTest') is not null drop table tbTest
GO
CREATE TABLE tbTest (
id int primary key,
sql_v sql_variant
)
GO
INSERT INTO tbTest VALUES (1, 1)
INSERT INTO tbTest VALUES (2, 1.0)
INSERT INTO tbTest VALUES (3, '1')
INSERT INTO tbTest VALUES (4, 0x01)
GO
SELECT sql_v FROM tbTest
GO
Here is the result set:
sql_v
------------------------
1
1.0
1
0x01
Table datatype is used to store a result set for later processing. You cannot use this datatype for a column in a CREATE TABLE statement. You should use DECLARE @local_variable to declare variables of type table. Table variables should be used instead of temporary tables, whenever possible, because table variables are cleaned up automatically at the end of the function, stored procedure, or batch in which they are defined, and table variables require less locking and logging resources.
This is the example of using table datatype:
DECLARE @tbl table (id int)
INSERT INTO @tbl VALUES (1)
INSERT INTO @tbl VALUES (2)
SELECT * FROM @tbl
Here is the result set:
id
-----------
1
2
Timestamp datatype is stored in 8 bytes as binary(8) datatype. The timestamp value is automatically updated every time a row containing a timestamp column is inserted or updated.
Timestamp value is a monotonically increasing counter whose values will always be unique within a database and can be selected by queried global variable @@DBTS.
Uniqueidentifier is a GUID (globally unique identifier). A GUID is a 16-byte binary number that is guaranteed to be unique in the world. This datatype is usually used in replication or as primary key to unique identify rows in a table.
You can get the new uniqueidentifier value by calling the NEWID function.
Note You should use IDENTITY property instead of uniqueidentifier, if global uniqueness is not necessary, because the uniqueidentifier values are long and more slowly generated.
Text and image datatypes
Text and image data are stored on the Text/Image pages, not on the Data pages as other SQL Server 2000 data.
There are three datatypes in this category:
- text
- ntext
- image
Text datatype is a variable-length datatype that can hold up to 2147483647 characters. This datatype is used when you want to store the character values with the total length more than 8000 bytes.
ntext datatype is a variable-length unicode datatype that can hold up to 1073741823 characters. This datatype is used when you want to store the variable-length unicode data with the total length more than 4000 bytes.
Image datatype is a variable-length datatype that can hold up to 2147483647 bytes of binary data. This datatype is used when you want to store the binary values with the total length more than 8000 bytes. This datatype is also used to store pictures.
Unicode Character datatypes
A column with unicode character datatype can store all of the characters that are defined in the various character sets, not only the characters from the particular character set, which was chosen during SQL Server Setup. Unicode datatypes take twice as much storage space as non-Unicode datatypes.
The unicode character data, as well as character data, can be used to store any combination of letters, symbols, and numbers. You should enclose unicode character data with quotation marks, when enter it.
There are two unicode character datatypes:
- nchar[(n)]
- nvarchar[(n)]
nchar[(n)] datatype can store up to 4000 bytes of fixed-length unicode character data. You can specify the maximum byte length with n.
nvarchar[(n)] datatype can store up to 4000 bytes of variable-length unicode character data. You can specify the maximum byte length with n. Variable-length means that character data can contain less than n bytes, and the storage size will be the actual length of the data entered.
You should use nvarchar datatype instead of nchar datatype, when you expect null values or a variation in data size.
User-Defined datatypes
You can create your own User-Defined datatypes by executing sp_addtype system stored procedure. Once a User-Defined datatype is created, you can use it in the CREATE TABLE or ALTER TABLE statements, as built-in SQL Server 2000 datatypes.
This is the syntax of sp_addtype stored procedure:
sp_addtype [ @typename = ] type,
[ @phystype = ] system_data_type
[ , [ @nulltype = ] 'null_type' ]
[ , [ @owner = ] 'owner_name' ]
where
typename
- is the User-Defined datatype name.
phystype
- is the SQL Server 2000 datatype, on which the User-Defined datatype is based.
nulltype
- is the NULL or NOT NULL.
owner
- is the owner of the User-Defined datatype.