当前位置:   article > 正文

postgres Page详解、存储结构、空间使用、空间回收_pg的pageinit是在内存还是磁盘

pg的pageinit是在内存还是磁盘

存储结构

       在PG中,磁盘存储和内存中的最小管理单位都是page,也是通常所说的block。一般PG页的大小为8K,在源码编译时可以设置。此后都不可更改,因为许多PG内存结构设计都是以此为基础的

PG块的定义文件为src/include/storage/bufpage.h

PAGE的结构如下:

参考网上整理之后的结构图如下:

Page Header的定义如下:

  1. lsn:记录最后一次对页修改的xlog记录
  2. checksum:页面的校验和,用于判断当前页是否完整
  3. flags:(指示当前页的状态)
  4. lower:本页空闲位置的起始指针
  5. uper:本业空闲位置的结束指针
  6. special:页预留的位置
  7. pagesize:页面大小
  8. version:当前版本
  9. prune_xid:最后一次删除或更新的xid
  10. 页后面存储的是元组(tuple)信息的数据(表的数据行,一个元组信息就是一行),也就是下一节的page item

ItemIdData的定义如下:

itemid.h头文件

参考网友整理图

ItemIdData之后之后是tuple,tuple的组成为tuple header+data

首先tuple header的定义文件为access/htup_details.h

其中HeapTupleFields的定义如下:

参考网上整理图如下:

tuple header之后就是data数据了。

1)、空间使用测试

       首先确认您是否拥有了 pageinspect 这个 extension

如果没有,使用create extension pageinspect命令安装插件。

  1. postgres=# drop table if EXISTS test1;
  2. DROP TABLE
  3. postgres=# create table test1(id int);
  4. CREATE TABLE
  5. postgres=# insert into test1 values(1);
  6. INSERT 0 1
  7. postgres=# select * from page_header(get_raw_page('test1', 0));
  8. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  9. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  10. 2/DB128048 | 0 | 0 | 28 | 8160 | 8192 | 8192 | 4 | 0
  11. (1 row)
  12. postgres=# insert into test1 values(2);
  13. INSERT 0 1
  14. postgres=# select * from page_header(get_raw_page('test1', 0));
  15. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  16. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  17. 2/DB1741C0 | 0 | 0 | 32 | 8128 | 8192 | 8192 | 4 | 0
  18. (1 row)

通过上面可以看到,每插入一行数据,lower的值就会增加4(itemIdData),其中page header的大小为24,插入两行数据之后增加了两个ItemIdData。相应的upper地址会向上移动。

查询tuple header的

 select * from heap_page_items(get_raw_page('test1',0));

  1. lp:这是插件自己定义的列,在源码中其实没有,这个是项指针的顺序
  2. lp_off: tuple在页面中的位置
  3. lp_len:tuple的实际长度(tuple header + data)
  4. t_xmin,t_xmax: 插入,删除和更新时的事物id,插入时xmin写入当前事物id,删除时xmax写入事物id。更新也是先删除再插入
  5. t_field3:
  6. t_ctid:物理id,或者关联更新之后的物理ID

2)、更新第二行数据

  1. postgres=# update test1 set id = 3 where id = 2;
  2. UPDATE 1
  3. postgres=# select * from page_header(get_raw_page('test1', 0));
  4. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  5. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  6. 2/DB174418 | 0 | 0 | 36 | 8096 | 8192 | 8192 | 4 | 2097527
  7. (1 row)
  8. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  9. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  10. ----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------
  11. 1 | 8160 | 1 | 28 | 2097522 | 0 | 0 | (0,1) | 1 | 2304 | 24 | | | \x01000000
  12. 2 | 8128 | 1 | 28 | 2097526 | 2097527 | 0 | (0,3) | 16385 | 256 | 24 | | | \x02000000
  13. 3 | 8096 | 1 | 28 | 2097527 | 0 | 0 | (0,3) | 32769 | 10240 | 24 | | | \x03000000
  14. (3 rows)
  15. postgres=#

更新之后发现,lower增加了4,说明新增了一个ItemIddata,同时upper的值减少也说明了free space空间在减少。查询heap_page_items发现lp的值从原来的2增加到了3,多了一行数据。原来ID=2的lp=2的行还在,t_xmax=2097527和lp=3的t_xmin的值一样,并且lp=2和lp=3的t_ctid一样,说明是update test1 set id = 3 where id = 2;更新之后的新数据为lp=3的行,被更新之前的行还在,只是数据指向了lp=3的数据(t_ctid=(0,3))。

3)、再插入一行新数据

  1. postgres=# insert into test1 values (4);
  2. INSERT 0 1
  3. postgres=# select * from page_header(get_raw_page('test1', 0));
  4. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  5. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  6. 2/DB174688 | 0 | 0 | 40 | 8064 | 8192 | 8192 | 4 | 2097527
  7. (1 row)
  8. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  9. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  10. ----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------
  11. 1 | 8160 | 1 | 28 | 2097522 | 0 | 0 | (0,1) | 1 | 2304 | 24 | | | \x01000000
  12. 2 | 8128 | 1 | 28 | 2097526 | 2097527 | 0 | (0,3) | 16385 | 1280 | 24 | | | \x02000000
  13. 3 | 8096 | 1 | 28 | 2097527 | 0 | 0 | (0,3) | 32769 | 10496 | 24 | | | \x03000000
  14. 4 | 8064 | 1 | 28 | 2097528 | 0 | 0 | (0,4) | 1 | 2048 | 24 | | | \x04000000
  15. (4 rows)
  16. postgres=#

通过heap_page_items新增了一行数据,被更新lp=2的行对应的空间没有被重复使用。其实这里使用到的就是postgres的MVCC机制,如果表频繁的发生更新、删掉等。vacuum没有即使的清理旧版本的信息,就会导致表膨胀。关于vacuum什么时候会触发清理旧数据,可以参考链接:postgres统计信息规则和触发条件分析_qxy0503的博客-CSDN博客_postgres 统计信息(postgres统计信息规则和触发条件分析)。

4)、手动对表test1做vacuum操作,然后再插入数据观察

  1. postgres=# vacuum test1;
  2. VACUUM
  3. postgres=# select * from page_header(get_raw_page('test1', 0));
  4. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  5. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  6. 2/DB176940 | 0 | 4 | 40 | 8096 | 8192 | 8192 | 4 | 0
  7. (1 row)
  8. <=====vacuum 之后lower没有发生变化,但是upper从原来的8064 变成了8096
  9. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  10. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  11. ----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  12. 1 | 8160 | 1 | 28 | 2097522 | 0 | 0 | (0,1) | 1 | 2304 | 24 | | | \x01000000
  13. 2 | 3 | 2 | 0 | | | | | | | | | |
  14. 3 | 8128 | 1 | 28 | 2097527 | 0 | 0 | (0,3) | 32769 | 10496 | 24 | | | \x03000000
  15. 4 | 8096 | 1 | 28 | 2097528 | 0 | 0 | (0,4) | 1 | 2304 | 24 | | | \x04000000
  16. (4 rows)
  17. <=====新插入一行数据
  18. postgres=# insert into test1 values (5);
  19. INSERT 0 1
  20. postgres=# select * from page_header(get_raw_page('test1', 0));
  21. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  22. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  23. 2/DB17E7C8 | 0 | 0 | 44 | 8064 | 8192 | 8192 | 4 | 0
  24. (1 row)
  25. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  26. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  27. ----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  28. 1 | 8160 | 1 | 28 | 2097522 | 0 | 0 | (0,1) | 1 | 2304 | 24 | | | \x01000000
  29. 2 | 3 | 2 | 0 | | | | | | | | | |
  30. 3 | 8128 | 1 | 28 | 2097527 | 0 | 0 | (0,3) | 32769 | 10496 | 24 | | | \x03000000
  31. 4 | 8096 | 1 | 28 | 2097528 | 0 | 0 | (0,4) | 1 | 2304 | 24 | | | \x04000000
  32. 5 | 8064 | 1 | 28 | 2097529 | 0 | 0 | (0,5) | 1 | 2048 | 24 | | | \x05000000
  33. (5 rows)
  34. postgres=#

vacuum test1之后,虽然空间已经回收,upper从原来的8096变成了8064,但是新插入的一行记录并没有使用lp=2的空间(理论上这个空间(t_ctid(0,2))的空间是可以重新使用的)。

5)、再插入一行数据测试

  1. postgres=# insert into test1 values (2);
  2. INSERT 0 1
  3. postgres=# \x
  4. Expanded display is off.
  5. postgres=# select * from page_header(get_raw_page('test1', 0));
  6. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  7. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  8. 2/DB17EA10 | 0 | 0 | 48 | 8032 | 8192 | 8192 | 4 | 0
  9. (1 row)
  10. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  11. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  12. ----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  13. 1 | 8160 | 1 | 28 | 2097522 | 0 | 0 | (0,1) | 1 | 2304 | 24 | | | \x01000000
  14. 2 | 3 | 2 | 0 | | | | | | | | | |
  15. 3 | 8128 | 1 | 28 | 2097527 | 0 | 0 | (0,3) | 32769 | 10496 | 24 | | | \x03000000
  16. 4 | 8096 | 1 | 28 | 2097528 | 0 | 0 | (0,4) | 1 | 2304 | 24 | | | \x04000000
  17. 5 | 8064 | 1 | 28 | 2097529 | 0 | 0 | (0,5) | 1 | 2304 | 24 | | | \x05000000
  18. 6 | 8032 | 1 | 28 | 2097530 | 0 | 0 | (0,6) | 1 | 2304 | 24 | | | \x02000000
  19. (6 rows)

新插入的数据还是没有使用lp=3的空间。

6)、先把id<6的数据都删除了,之后再插入新数据

  1. postgres=# delete from test1 where id <6;
  2. DELETE 4
  3. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  4. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  5. ----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------
  6. 1 | 8160 | 1 | 28 | 2097522 | 2097540 | 0 | (0,1) | 8193 | 1280 | 24 | | | \x01000000
  7. 2 | 3 | 2 | 0 | | | | | | | | | |
  8. 3 | 8128 | 1 | 28 | 2097527 | 2097540 | 0 | (0,3) | 40961 | 9472 | 24 | | | \x03000000
  9. 4 | 8096 | 1 | 28 | 2097528 | 2097540 | 0 | (0,4) | 8193 | 1280 | 24 | | | \x04000000
  10. 5 | 8064 | 1 | 28 | 2097529 | 2097540 | 0 | (0,5) | 8193 | 1280 | 24 | | | \x05000000
  11. 6 | 8032 | 1 | 28 | 2097530 | 0 | 0 | (0,6) | 1 | 2304 | 24 | | | \x02000000
  12. (6 rows)
  13. postgres=# vacuum VERBOSE test1;
  14. INFO: vacuuming "public.test1"
  15. INFO: "test1": removed 4 row versions in 1 pages
  16. INFO: "test1": found 4 removable, 1 nonremovable row versions in 1 out of 1 pages
  17. DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 2097542
  18. There were 1 unused item pointers.
  19. Skipped 0 pages due to buffer pins, 0 frozen pages.
  20. 0 pages are entirely empty.
  21. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
  22. VACUUM
  23. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  24. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  25. ----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  26. 1 | 0 | 0 | 0 | | | | | | | | | |
  27. 2 | 0 | 0 | 0 | | | | | | | | | |
  28. 3 | 0 | 0 | 0 | | | | | | | | | |
  29. 4 | 0 | 0 | 0 | | | | | | | | | |
  30. 5 | 0 | 0 | 0 | | | | | | | | | |
  31. 6 | 8160 | 1 | 28 | 2097530 | 0 | 0 | (0,6) | 1 | 2304 | 24 | | | \x06000000
  32. (6 rows)
  33. postgres=# select * from page_header(get_raw_page('test1', 0));
  34. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  35. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  36. 2/DB18FA68 | 0 | 5 | 48 | 8160 | 8192 | 8192 | 4 | 0
  37. (1 row)
  38. <======vacuum之后lower的值没有发生变化,但是upper从原来的8032 变成了8160,说明空间释放了
  39. postgres=# insert into test1 values (1);
  40. INSERT 0 1
  41. postgres=#
  42. postgres=# select * from page_header(get_raw_page('test1', 0));
  43. lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
  44. ------------+----------+-------+-------+-------+---------+----------+---------+-----------
  45. 2/DB18FC00 | 0 | 1 | 48 | 8128 | 8192 | 8192 | 4 | 0
  46. (1 row)
  47. postgres=# select * from heap_page_items(get_raw_page('test1',0));
  48. lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
  49. ----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  50. 1 | 8128 | 1 | 28 | 2097542 | 0 | 0 | (0,1) | 1 | 2048 | 24 | | | \x01000000
  51. 2 | 0 | 0 | 0 | | | | | | | | | |
  52. 3 | 0 | 0 | 0 | | | | | | | | | |
  53. 4 | 0 | 0 | 0 | | | | | | | | | |
  54. 5 | 0 | 0 | 0 | | | | | | | | | |
  55. 6 | 8160 | 1 | 28 | 2097530 | 0 | 0 | (0,6) | 1 | 2304 | 24 | | | \x06000000
  56. (6 rows)
  57. postgres=#
  58. <=====新插入的数据也使用了之前已经分配过的空间(t_ctid(0,1))。

通过上面的测试发现,其实vacuum table是不会降低page的高水位的,只是把旧数据占的空间释放,所以如果vaccum不及时导致了表的膨胀,那么再使用vacuum table是不能把表的大小降下去的。

测试如下:

7)、创建测试表

  1. postgres=#
  2. postgres=# create table tbl (id int primary key, info text, crt_time timestamp);
  3. CREATE TABLE
  4. postgres=# insert into tbl select generate_series(1,2000000),md5(random()::text),clock_timestamp();
  5. INSERT 0 2000000
  6. postgres=# \dt+ tbl
  7. List of relations
  8. Schema | Name | Type | Owner | Size | Description
  9. --------+------+-------+----------+--------+-------------
  10. public | tbl | table | postgres | 146 MB |
  11. (1 row)
  12. postgres=# \dt+ tbl_pkey
  13. Did not find any relation named "tbl_pkey".
  14. postgres=# \di+ tbl_pkey
  15. List of relations
  16. Schema | Name | Type | Owner | Table | Size | Description
  17. --------+----------+-------+----------+-------+-------+-------------
  18. public | tbl_pkey | index | postgres | tbl | 43 MB |
  19. (1 row)
  20. postgres=#

8)、编写测试脚本

[postgres@postgres ~]$ cat test.sql 
\set id random(1,2000000)
update tbl set info=md5(random()::text) where id between :id-250000 and :id+250000;  
[postgres@postgres ~]$ 

9)、测试数据库中存在持有事务号的长事务,这个事务时间段内,数据库产生的垃圾无法被回收。

  1. [postgres@postgres ~]$ pgbench -M prepared -h 127.0.0.1 -p 5442 -n -r -f ./test.sql -c 1 -j 1 -T 30
  2. transaction type: ./test.sql
  3. scaling factor: 1
  4. query mode: prepared
  5. number of clients: 1
  6. number of threads: 1
  7. duration: 30 s
  8. number of transactions actually processed: 4
  9. latency average = 10183.262 ms
  10. tps = 0.098200 (including connections establishing)
  11. tps = 0.098204 (excluding connections establishing)
  12. statement latencies in milliseconds:
  13. 0.005 \set id random(1,2000000)
  14. 10182.856 update tbl set info=md5(random()::text) where id between :id-250000 and :id+250000;
  15. [postgres@postgres ~]$

查看更新表之后的表大小和索引大小

  1. postgres=# \dt+ tbl
  2. List of relations
  3. Schema | Name | Type | Owner | Size | Description
  4. --------+------+-------+----------+--------+-------------
  5. public | tbl | table | postgres | 203 MB |
  6. (1 row)
  7. postgres=# \di+ tbl_pkey
  8. List of relations
  9. Schema | Name | Type | Owner | Table | Size | Description
  10. --------+----------+-------+----------+-------+--------+-------------
  11. public | tbl_pkey | index | postgres | tbl | 103 MB |
  12. (1 row)

10)、执行vacuum tbl

  1. postgres=# vacuum VERBOSE tbl;
  2. INFO: vacuuming "public.tbl"
  3. INFO: "tbl": found 0 removable, 107 nonremovable row versions in 1 out of 26029 pages
  4. DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 2097561
  5. There were 0 unused item pointers.
  6. Skipped 0 pages due to buffer pins, 7335 frozen pages.
  7. 0 pages are entirely empty.
  8. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
  9. INFO: vacuuming "pg_toast.pg_toast_157481"
  10. INFO: index "pg_toast_157481_index" now contains 0 row versions in 1 pages
  11. DETAIL: 0 index row versions were removed.
  12. 0 index pages have been deleted, 0 are currently reusable.
  13. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
  14. INFO: "pg_toast_157481": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
  15. DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 2097561
  16. There were 0 unused item pointers.
  17. Skipped 0 pages due to buffer pins, 0 frozen pages.
  18. 0 pages are entirely empty.
  19. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
  20. VACUUM
  21. postgres=# \dt+ tbl
  22. List of relations
  23. Schema | Name | Type | Owner | Size | Description
  24. --------+------+-------+----------+--------+-------------
  25. public | tbl | table | postgres | 203 MB |
  26. (1 row)
  27. postgres=# \di+ tbl_pkey
  28. List of relations
  29. Schema | Name | Type | Owner | Table | Size | Description
  30. --------+----------+-------+----------+-------+--------+-------------
  31. public | tbl_pkey | index | postgres | tbl | 103 MB |
  32. (1 row)
  33. postgres=#

vacuum发现表的空间和tbl的空间并没有释放,所以,如果已经膨胀之后的表,再使用vacuum table是无法降低表的大小的,但是使用vacuum full table是可以降低表的大小的。

如下:

  1. postgres=# vacuum FULL VERBOSE tbl;
  2. INFO: vacuuming "public.tbl"
  3. INFO: "tbl": found 0 removable, 2000000 nonremovable row versions in 26029 pages
  4. DETAIL: 0 dead row versions cannot be removed yet.
  5. CPU: user: 0.25 s, system: 2.93 s, elapsed: 3.54 s.
  6. VACUUM
  7. postgres=# \dt+ tbl
  8. List of relations
  9. Schema | Name | Type | Owner | Size | Description
  10. --------+------+-------+----------+--------+-------------
  11. public | tbl | table | postgres | 146 MB |
  12. (1 row)
  13. postgres=# \di+ tbl_pkey
  14. List of relations
  15. Schema | Name | Type | Owner | Table | Size | Description
  16. --------+----------+-------+----------+-------+-------+-------------
  17. public | tbl_pkey | index | postgres | tbl | 43 MB |
  18. (1 row)

       PostgreSQL的表或索引发生膨胀后,用户可以使用vacuum full或rewrite table(如cluster)的方式重建表。
但是vacuum full或者rewrite都需要持有排它锁,会堵塞读操作。为了减少锁冲突,社区有一个名为pg_reorg或pg_repack的插件,使用了增量的方式重组数据,最后通过切换FILENODE完成数据重组。仅仅在切换FILENODE时需要持有排他锁,非常短暂,影响比VACUUM FULL和rewrite的方式小多了。

参考资料:

PostgreSQL中page页结构_ITPUB博客

postgresql磁盘存储结构 - exact - 博客园

https://github.com/digoal/blog/blob/master/201504/20150429_02.md

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/我家自动化/article/detail/256384
推荐阅读
相关标签
  

闽ICP备14008679号