
Introduction VACUUM, Freezing, XID wraparound

Jan 23, 2018


Masahiko Sawada
Transcript
Page 1

Copyright © 2016 NTT DATA Corporation

03/17/2016 NTT DATA Corporation Masahiko Sawada

Introduction VACUUM, FREEZING, XID wraparound

Page 2

A little about me

•  Masahiko Sawada
•  twitter : @sawada_masahiko
•  NTT DATA Corporation
•  Database engineer
•  PostgreSQL Hacker
•  Core feature
•  pg_bigm (Multi-byte full text search module for PostgreSQL)

Page 3

Contents

•  VACUUM

•  Visibility Map

•  Freezing Tuple

•  XID wraparound

•  New VACUUM feature for 9.6

Page 4

What is VACUUM?

Page 5

VACUUM

[Figure: a table holding rows (1, AAA), (2, BBB) and (3, CCC). UPDATE : BBB->bbb adds a new row version (2, bbb) and leaves the old version (2, BBB) behind as garbage. Between "VACUUM Starts" and "VACUUM Done", INSERT/DELETE/UPDATE keep running concurrently (row (4, DDD) is inserted). When VACUUM is done, the old version is gone and the freed space is recorded in the FSM.]

•  Postgres garbage collection feature

•  Acquires a ShareUpdateExclusiveLock

Page 6

Why do we need to VACUUM?

•  Recover or reuse disk space occupied by updated or deleted rows

•  Update data statistics used by the planner

•  Update the visibility map to speed up Index-Only Scans

•  Protect against loss of very old data due to XID wraparound

Page 7

Evolution history of VACUUM

•  v8.1 (2005): autovacuum
•  v8.4 (2009): Visibility Map, Free Space Map
•  v9.5 (2016): vacuumdb parallel option
•  v9.6: !?

Page 8

VACUUM Syntax

-- VACUUM whole database
=# VACUUM;

-- Multiple options, analyzing only the col1 column
=# VACUUM FREEZE VERBOSE ANALYZE hoge (col1);

-- Multiple options with parentheses
=# VACUUM (FULL, ANALYZE, VERBOSE) hoge;

Page 9

Visibility Map

Page 10

Visibility Map

•  Introduced in 8.4
•  A bitmap for each table (1 bit per page)
•  Each table relation can have a visibility map.
•  Keeps track of which pages are all-visible.
•  Pages whose bit is not set may contain garbage.
•  For a 500GB table, the Visibility Map is less than 10MB.

[Figure: the table file (base/XXX/1234) is divided into Block 0, Block 1, Block 2, Block 3, Block 4, ...; the Visibility Map file (base/XXX/1234_vm) holds one bit per block: 11001…]
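The 500GB figure can be checked with a little arithmetic. The sketch below assumes the default 8kB block size and ignores the small page-header overhead inside the map file itself; the function name `visibility_map_bytes` is made up for this illustration.

```python
PAGE_SIZE = 8192  # assumption: default PostgreSQL block size (8kB)


def visibility_map_bytes(table_bytes, bits_per_page=1, page_size=PAGE_SIZE):
    """Approximate visibility map size in bytes for a table of table_bytes."""
    pages = -(-table_bytes // page_size)  # ceiling division: number of heap pages
    bits = pages * bits_per_page          # one map bit per heap page
    return -(-bits // 8)                  # round up to whole bytes


vm_size = visibility_map_bytes(500 * 1024**3)
print(f"{vm_size / 1024**2:.1f} MiB")
```

Under these assumptions a 500GB table needs about 7.8 MiB of map, which agrees with the "less than 10MB" bullet.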

Page 11

State transition of Visibility Map bit

•  0 (NOT all-visible) -> 1 (all-visible) : VACUUM
•  1 (all-visible) -> 0 (NOT all-visible) : INSERT, UPDATE, DELETE

Page 12

How does VACUUM actually work?

•  VACUUM works in two phases:

1.  Scan the table to collect the TIDs of garbage tuples

2.  Reclaim garbage (Table, Index)

[Figure: in the 1st Phase, VACUUM scans the table and collects garbage TIDs into maintenance_work_mem; in the 2nd Phase, it reclaims the garbage from the index and the table.]
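The two phases can be modelled with a toy in-memory table. This is only a sketch of the idea, not PostgreSQL's implementation: the dictionaries standing in for the heap and the index, the `max_dead_tids` limit standing in for maintenance_work_mem, and the function name `lazy_vacuum` are all assumptions made for this example.

```python
def lazy_vacuum(heap, index, max_dead_tids):
    """Toy two-phase vacuum.

    heap:  {page_no: {slot_no: (value, is_dead)}}
    index: set of TIDs, each a (page_no, slot_no) pair
    Returns how many reclaim passes (index scans) were needed.
    """
    dead_tids = []      # stands in for the TID array kept in maintenance_work_mem
    index_scans = 0

    def reclaim():
        # 2nd phase: remove index entries first, then free the heap slots.
        nonlocal index_scans
        if not dead_tids:
            return
        index.difference_update(dead_tids)
        for page_no, slot_no in dead_tids:
            del heap[page_no][slot_no]
        dead_tids.clear()
        index_scans += 1

    # 1st phase: scan the table page by page and collect garbage TIDs.
    for page_no in sorted(heap):
        for slot_no, (_value, is_dead) in sorted(heap[page_no].items()):
            if is_dead:
                dead_tids.append((page_no, slot_no))
        if len(dead_tids) >= max_dead_tids:
            reclaim()   # memory is full: reclaim now, then keep scanning
    reclaim()
    return index_scans
```

In this model a smaller `max_dead_tids` forces more reclaim passes over the index, which is why the amount of memory available for TIDs matters for VACUUM speed.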

Page 13

Performance improvement point of VACUUM

•  VACUUM scans table pages one by one.

•  VACUUM can skip pages iff there are more than 32 consecutive all-visible pages.

•  Garbage tuple IDs are stored and remembered in maintenance_work_mem.

[Figure: two tables made of all-visible blocks and not all-visible blocks. SLOW!!: VACUUM needs to scan all pages. FAST!: VACUUM can skip pages and scan efficiently.]
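The skipping rule can be sketched as follows. This is a simplified model that applies only the "more than 32 consecutive all-visible pages" rule stated on the slide; the function name `pages_to_scan` is hypothetical and other details of the real scan are left out.

```python
SKIP_PAGES_THRESHOLD = 32  # run length taken from the slide


def pages_to_scan(all_visible):
    """Return the block numbers VACUUM reads, given one bool per block."""
    scanned = []
    n = len(all_visible)
    blk = 0
    while blk < n:
        if not all_visible[blk]:
            scanned.append(blk)
            blk += 1
            continue
        # Measure the run of consecutive all-visible blocks starting here.
        run_end = blk
        while run_end < n and all_visible[run_end]:
            run_end += 1
        if run_end - blk <= SKIP_PAGES_THRESHOLD:
            scanned.extend(range(blk, run_end))  # run too short: read it anyway
        blk = run_end                            # long run: skipped entirely
    return scanned
```

For example, 40 all-visible blocks followed by one dirty block means only the dirty block is read, while short all-visible runs scattered between dirty blocks are all read.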

Page 14

XID wraparound and freezing tuple

Page 15

What is the transaction ID (XID)?

•  Every tuple has two transaction IDs.
•  xmin : Inserted XID
•  xmax : Deleted/Updated XID

 xmin | xmax | col
------+------+-----
 1810 | 1820 | AAA
 1812 |    0 | BBB
 1814 | 1830 | CCC
 1820 |    0 | XXX

In REPEATABLE READ transaction isolation level,
•  Transaction 1815 can see ‘AAA’, ‘BBB’ and ‘CCC’.
•  Transaction 1821 can see ‘BBB’, ‘CCC’ and ‘XXX’.
•  Transaction 1831 can see ‘BBB’ and ‘XXX’.
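The three visibility results can be reproduced with a very small model. It assumes every listed transaction has committed, that no wraparound has occurred (so plain integer comparison is enough), and that xmax = 0 means "not deleted"; the names `ROWS` and `visible_rows` are illustrative only.

```python
# (xmin, xmax, col) taken from the example table on the slide
ROWS = [
    (1810, 1820, "AAA"),
    (1812, 0, "BBB"),
    (1814, 1830, "CCC"),
    (1820, 0, "XXX"),
]


def visible_rows(xid, rows=ROWS):
    """Return the col values a transaction with the given XID can see."""
    result = []
    for xmin, xmax, col in rows:
        inserted = xmin <= xid                 # inserted in the past
        deleted = xmax != 0 and xmax <= xid    # deleted/updated in the past
        if inserted and not deleted:
            result.append(col)
    return result


for xid in (1815, 1821, 1831):
    print(xid, visible_rows(xid))
```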

Page 16

What is the transaction ID (XID)?

•  Can represent up to 4 billion transactions (uint32).

•  XID space is circular with no endpoint.

•  For any given XID, there are 2 billion XIDs that are “older” and 2 billion XIDs that are “newer”.

[Figure: the XID space drawn as a circle running from 0 to 2^32-1. Seen from the current XID, one half of the circle is older (in the past, visible) and the other half is newer (in the future, not visible).]
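A circular "older than" test can be sketched with modulo-2^32 arithmetic. This is a simplified illustration that ignores the special reserved XIDs PostgreSQL treats separately; the function name `xid_precedes` is chosen for this example.

```python
MASK32 = 0xFFFFFFFF
HALF = 0x80000000  # 2^31: half of the circular XID space


def xid_precedes(a, b):
    """True if XID a is 'older' than XID b on the 32-bit circle."""
    diff = (a - b) & MASK32  # difference modulo 2^32
    return diff >= HALF      # same as: the signed 32-bit difference is negative


# XID 100 is older than XID 200 ...
print(xid_precedes(100, 200))
# ... but once the counter is more than 2^31 ahead, XID 100 looks newer.
far_ahead = (100 + 2**31 + 1) & MASK32
print(xid_precedes(100, far_ahead))
```

The second call is the wraparound case: the same XID that used to be in the past now compares as being in the future.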

Page 17

What is XID wraparound?

[Figure: three snapshots of the circular XID space as the current XID advances. At first a tuple with XID=100 lies in the older half, so XID 100 is visible. Later it is still visible. Once the current XID has advanced more than 2 billion past it, XID=100 falls into the newer half and XID 100 becomes not visible.]

•  Postgres could lose very old data due to XID wraparound.

•  This can happen when a tuple is more than 2 billion transactions old.

•  On a 200 TPS system, that takes about 120 days.

•  Note that it can happen even on an INSERT-only table.
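The 120-day estimate follows from dividing half of the XID space by the transaction rate. The sketch assumes each transaction consumes exactly one XID and that the rate is constant; `days_until_wraparound` is a name invented for this example.

```python
SECONDS_PER_DAY = 86_400


def days_until_wraparound(tps):
    """Days needed to consume 2^31 XIDs at a constant transaction rate."""
    seconds = 2**31 / tps
    return seconds / SECONDS_PER_DAY


print(round(days_until_wraparound(200)))
```

At 200 TPS this gives roughly 124 days, which the slide rounds to 120.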

Page 18

Freezing tuple

•  Mark the tuple as “Frozen”.

•  A tuple marked “frozen” appears to be “in the past” to all transactions.

•  Old tuples must be frozen *before* they become 2 billion transactions old.

[Figure: the same three snapshots of the circular XID space, but the tuple with XID=100 is marked as FREEZE before the current XID advances 2 billion past it. The tuple is still visible even after XID=100 falls into the newer half.]
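The effect of freezing can be shown by adding a flag to a circular XID comparison. This is a conceptual sketch only: the boolean `frozen` argument stands in for however the frozen state is actually recorded on the tuple, and both function names are illustrative.

```python
MASK32 = 0xFFFFFFFF
HALF = 0x80000000  # 2^31


def xid_precedes(a, b):
    """True if XID a is 'older' than XID b on the 32-bit circle."""
    return ((a - b) & MASK32) >= HALF


def tuple_in_past(xmin, frozen, current_xid):
    """A frozen tuple is always in the past; otherwise compare XIDs."""
    return frozen or xid_precedes(xmin, current_xid)


wrapped = (100 + 2**31 + 1) & MASK32         # counter more than 2^31 past XID 100
print(tuple_in_past(100, False, wrapped))    # unfrozen tuple looks "in the future"
print(tuple_in_past(100, True, wrapped))     # frozen tuple stays visible
```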

Page 19

To prevent old data loss due to XID wraparound

•  Emit a WARNING log when 10 million transactions remain.

•  Refuse to generate new XIDs when 1 million transactions remain.

•  Run anti-wraparound VACUUM automatically.
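The first two safeguards amount to comparing the number of XIDs left before wraparound with two margins. The sketch below uses the margins quoted on the slide and a made-up function name, `xid_guard`; the exact limits inside the server may be computed differently.

```python
WRAP_DISTANCE = 2**31  # XIDs between the oldest unfrozen XID and wraparound


def xid_guard(age_of_oldest_xid, warn_margin=10_000_000, stop_margin=1_000_000):
    """Classify the situation by how many XIDs remain before wraparound."""
    remaining = WRAP_DISTANCE - age_of_oldest_xid
    if remaining <= stop_margin:
        return "stop"   # refuse to generate new XIDs
    if remaining <= warn_margin:
        return "warn"   # emit WARNING log messages
    return "ok"


print(xid_guard(200_000_000))
print(xid_guard(2**31 - 5_000_000))
print(xid_guard(2**31 - 500_000))
```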

Page 20

Anti-wraparound VACUUM

•  Every table has a pg_class.relfrozenxid value.
•  All tuples inserted by XIDs older than relfrozenxid have been marked as “Frozen”.
•  Like a forcibly executed VACUUM *FREEZE*.

[Figure: an XID timeline starting at pg_class.relfrozenxid, with the Current XID moving along it. Past relfrozenxid + vacuum_freeze_table_age (default 150 million), VACUUM could do a whole table scan. Past relfrozenxid + autovacuum_freeze_max_age (default 200 million), anti-wraparound VACUUM is launched forcibly. At relfrozenxid + 2 billion, XID wraparound occurs.]

Page 21

Anti-wraparound VACUUM

While the Current XID is below relfrozenxid + vacuum_freeze_table_age (default 150 million), lazy VACUUM is executed.

[Figure: the XID timeline from pg_class.relfrozenxid (+ vacuum_freeze_table_age (default 150 million): VACUUM could do a whole table scan; + autovacuum_freeze_max_age (default 200 million): anti-wraparound VACUUM is launched forcibly; + 2 billion: XID wraparound), with the Current XID before the vacuum_freeze_table_age mark, where VACUUM runs.]

Page 22

Anti-wraparound VACUUM

If you execute VACUUM while the Current XID is between relfrozenxid + vacuum_freeze_table_age (default 150 million) and relfrozenxid + autovacuum_freeze_max_age (default 200 million), anti-wraparound VACUUM will be executed: VACUUM could do a whole table scan.

[Figure: the XID timeline from pg_class.relfrozenxid (+ vacuum_freeze_table_age (default 150 million); + autovacuum_freeze_max_age (default 200 million): anti-wraparound VACUUM is launched forcibly; + 2 billion: XID wraparound), with the Current XID between the first two marks, where a VACUUM runs as anti-wraparound VACUUM.]

Page 23

Anti-wraparound VACUUM

Once the Current XID exceeds relfrozenxid + autovacuum_freeze_max_age (default 200 million), anti-wraparound VACUUM is launched forcibly by autovacuum.

[Figure: the XID timeline from pg_class.relfrozenxid (+ vacuum_freeze_table_age (default 150 million): VACUUM could do a whole table scan; + autovacuum_freeze_max_age (default 200 million); + 2 billion: XID wraparound), with the Current XID past the autovacuum_freeze_max_age mark, where anti-wraparound auto VACUUM runs.]

Page 24

Anti-wraparound VACUUM

After anti-wraparound VACUUM, the relfrozenxid value is updated.

[Figure: the XID timeline with pg_class.relfrozenxid now placed vacuum_freeze_min_age (default 50 million) behind the Current XID.]
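The age thresholds on the preceding slides can be summarised as a small decision function. It is a sketch under simplifying assumptions: the age is the circular distance from relfrozenxid to the current XID, the defaults are the ones quoted on the slides, and the function name and return strings are invented here.

```python
MASK32 = 0xFFFFFFFF


def vacuum_mode(current_xid, relfrozenxid,
                freeze_table_age=150_000_000,      # vacuum_freeze_table_age
                freeze_max_age=200_000_000):       # autovacuum_freeze_max_age
    """Describe what happens at a given distance from relfrozenxid."""
    age = (current_xid - relfrozenxid) & MASK32    # circular XID age of the table
    if age > freeze_max_age:
        return "anti-wraparound autovacuum is launched forcibly"
    if age > freeze_table_age:
        return "a VACUUM does a whole table scan"
    return "lazy VACUUM"


print(vacuum_mode(100_001_000, 1_000))
print(vacuum_mode(160_001_000, 1_000))
print(vacuum_mode(250_001_000, 1_000))
```

Because the age is computed modulo 2^32, the function also works when relfrozenxid is numerically larger than the current XID after the counter has wrapped.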

Page 25

Anti-wraparound VACUUM is too slow

•  A whole-table scan is always required to advance relfrozenxid.

•  This is because lazy VACUUM may skip pages holding tuples that are visible but not yet frozen.

 Visibility Map | Block # | xmin
----------------+---------+----------------
              0 |       0 | FREEZE, FREEZE
              1 |       1 | FREEZE, FREEZE
              1 |       2 | 101, 102, 103
              0 |       3 | Garbage, 104

[Figure: arrows marking the blocks read by Normal VACUUM and by Anti-wraparound VACUUM. Normal VACUUM reads only the blocks whose Visibility Map bit is 0, while Anti-wraparound VACUUM reads every block.]

Page 26

How can we improve anti-wraparound VACUUM?

Page 27

Approaches

•  Freeze Map

•  Track which pages still need to be frozen.

•  64bit XID

•  Change size of XID from 32bit to 64bit.

•  LSN to XID map

•  Mapping XID to LSN.

Page 28

Freeze Map

•  New feature for 9.6.

•  Improves the performance of VACUUM FREEZE and anti-wraparound VACUUM.

•  Brings us closer to supporting VLDB (very large databases).

Page 29

Idea - Add an additional bit

•  No new map is added.

•  An additional bit is added to the Visibility Map.

•  The additional bit tracks which pages are all-frozen.

•  An all-frozen page should be all-visible as well.

[Figure: Visibility Map bits 10110010, read as pairs of one all-visible bit and one all-frozen bit per page.]

Page 30

State transition of two bits

Each state is written as the all-visible bit followed by the all-frozen bit.

•  00 -> 10 : VACUUM
•  00 -> 11 : VACUUM FREEZE
•  10 -> 11 : VACUUM FREEZE
•  10 -> 00 : UPDATE/DELETE/INSERT
•  11 -> 00 : UPDATE/DELETE/INSERT
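The two-bit state machine can be written down directly. The sketch assumes that a VACUUM pass finds every tuple on the page visible to all transactions (otherwise the bits would stay clear); states are (all_visible, all_frozen) pairs and `next_state` is a name made up for this example.

```python
WRITES = {"INSERT", "UPDATE", "DELETE"}


def next_state(state, operation):
    """Return the (all_visible, all_frozen) bits of a page after an operation."""
    _all_visible, all_frozen = state
    if operation in WRITES:
        return (0, 0)            # any modification clears both bits
    if operation == "VACUUM":
        return (1, all_frozen)   # sets all-visible, keeps the frozen bit
    if operation == "VACUUM FREEZE":
        return (1, 1)            # an all-frozen page is also all-visible
    raise ValueError(f"unknown operation: {operation}")


state = (0, 0)
for op in ("VACUUM", "VACUUM FREEZE", "UPDATE"):
    state = next_state(state, op)
    print(op, state)
```

Note that the state (0, 1) is never produced, matching the rule that an all-frozen page should be all-visible as well.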

Page 31

Idea - Improve anti-wraparound performance

•  VACUUM can skip all-frozen pages even when anti-wraparound VACUUM is required.

Visibility Map bits (visible, frozen) per block:

 visible | frozen | Block # | xmin
---------+--------+---------+----------------
       1 |      0 |       0 | FREEZE, FREEZE
       1 |      1 |       1 | FREEZE, FREEZE
       1 |      0 |       2 | 101, 102, 103
       0 |      0 |       3 | Garbage, 104

[Figure: arrows marking the blocks read by Normal VACUUM and by Anti-wraparound VACUUM.]
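How the extra bit changes the set of pages read can be sketched as follows. The bit values in `vm_bits` are hypothetical sample data rather than the slide's table, the 32-page skipping threshold is ignored, and `blocks_to_scan` is an illustrative name.

```python
def blocks_to_scan(vm_bits, aggressive, has_frozen_bit=True):
    """Return block numbers to read, given (all_visible, all_frozen) per block.

    aggressive=True models anti-wraparound VACUUM or VACUUM FREEZE.
    has_frozen_bit=False models a map that only has the all-visible bit.
    """
    scanned = []
    for block, (all_visible, all_frozen) in enumerate(vm_bits):
        if aggressive:
            skip = has_frozen_bit and all_frozen  # only all-frozen pages are safe
        else:
            skip = all_visible                    # normal VACUUM skips all-visible
        if not skip:
            scanned.append(block)
    return scanned


# Hypothetical map: block 1 is all-frozen, block 2 is only all-visible.
vm_bits = [(0, 0), (1, 1), (1, 0), (0, 0)]
print(blocks_to_scan(vm_bits, aggressive=False))
print(blocks_to_scan(vm_bits, aggressive=True, has_frozen_bit=False))
print(blocks_to_scan(vm_bits, aggressive=True))
```

With only the all-visible bit, the aggressive scan has to read every block; with the all-frozen bit it can leave out block 1.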

Page 32

Pros/Cons

•  Pros

•  Dramatic performance improvement for VACUUM FREEZE.

•  Read only table. (future)

•  Cons

•  Doubles the size of the Visibility Map.

Page 33

No More Full-Table Vacuums

http://rhaas.blogspot.jp/2016/03/no-more-full-table-vacuums.html#comment-form

Page 34

Another work

Page 35

Vacuum Progress Checker

•  New feature for 9.6. (under review)

•  Reports progress information of VACUUM via a system view.

Page 36

Idea

•  Add a new system view.

•  Report detailed, meaningful progress information for each process running VACUUM.

postgres(1)=# SELECT * FROM pg_stat_vacuum_progress ;
-[ RECORD 1 ]-------+--------------
pid                 | 55513
relid               | 16384
phase               | Scanning Heap
total_heap_blks     | 451372
current_heap_blkno  | 77729
total_index_pages   | 559364
scanned_index_pages | 559364
index_scan_count    | 1
percent_complete    | 17

Page 37

Future works

•  Read Only Table

•  Report progress information of other maintenance commands.

Page 38

PostgreSQL git repository

git://git.postgresql.org/git/postgresql.git

Page 39

VERBOSE option

=# VACUUM VERBOSE hoge;
INFO: vacuuming "public.hoge"
INFO: scanned index "hoge_idx1" to remove 1000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: "hoge": removed 1000 row versions in 443 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hoge_idx1" now contains 100000 row versions in 276 pages
DETAIL: 1000 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "hoge": found 1000 removable, 100000 nonremovable row versions in 447 out of 447 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.05u sec elapsed 0.05 sec.
VACUUM

VACUUM

Page 40

FREEZE option

•  Aggressive freezing of tuples

•  Same as running normal VACUUM with vacuum_freeze_min_age = 0 and vacuum_freeze_table_age = 0

•  Always scans the whole table

Page 41

ANALYZE option

•  Do ANALYZE after VACUUM
•  Update data statistics used by planner

-- VACUUM and analyze with VERBOSE option
=# VACUUM ANALYZE VERBOSE hoge;
INFO: vacuuming "public.hoge"
:
INFO: analyzing "public.hoge"
INFO: "hoge": scanned 452 of 452 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
VACUUM

Page 42

FULL option

•  Completely different from lazy VACUUM

•  Similar to CLUSTER

•  Acquires AccessExclusiveLock

•  Takes much longer than lazy VACUUM

•  Needs more disk space, at most twice the table size

•  Rebuilds the table and its indexes

•  Freezes tuples during VACUUM FULL (9.3~)