這是本文件的舊版!


Data Alignment in PostgreSQL

一年前開始處理千萬級資料時,考慮到migration速度與資料儲存空間的使用,才知道資料庫也有alignment的機制。64位元作業系統是8 bytes的對齊,取自link的範例,對齊前:

CREATE TABLE t1 (
 , a char
 , b int2    -- 1 byte of padding after a
 , c char
 , d int4    -- 3 bytes of padding after c
 , e char
 , f int8    -- 7 bytes of padding after e
 );
對齊後,可節省11位元組空間,不管是儲存空間還是CPU使用效率都會有所提升。
CREATE TABLE t1 (
 , f int8
 , d int4
 , b int2
 , a char
 , c char
 , e char
 );
我的問題是,各種型態所需的數據對齊方式該怎查呢?

link提供了一個SQL,讓你可以列出所有type的typalign:

SELECT typname,typbyval,typlen,typalign FROM pg_type;
假如我要找某一種特定type,加入where限制條件即可:
SELECT typname,typbyval,typlen,typalign FROM pg_type WHERE typname='int2';
結果如下,可以發現typalign的字元有多種:


接下來就是去link中搜尋typalign,
typalign char

typalign is the alignment required when storing a value of this type. It applies to storage on disk as well as most representations of the value inside PostgreSQL. When multiple values are stored consecutively, such as in the representation of a complete row on disk, padding is inserted before a datum of this type so that it begins on the specified boundary. The alignment reference is the beginning of the first datum in the sequence. Possible values are:
c = char alignment, i.e., no alignment needed.
s = short alignment (2 bytes on most machines).
i = int alignment (4 bytes on most machines).
d = double alignment (8 bytes on many machines, but by no means all).