Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Shutting down the cluster while the master is recovering in pass 3 causes an inconsistency between pg_class and the gp_persistent table, which in turn causes data loss.
2016-03-01 01:56:33.032318 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","checkpoint record is at 0/302AD30",,,,,,,0,,"xlog.c",6304,
2016-03-01 01:56:33.032337 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","redo record is at 0/302AD30; undo record is at 0/0; shutdown FALSE",,,,,,,0,,"xlog.c",6338,
2016-03-01 01:56:33.032353 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","next transaction ID: 0/1045; next OID: 24726",,,,,,,0,,"xlog.c",6342,
2016-03-01 01:56:33.032367 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","next MultiXactId: 1; next MultiXactOffset: 0",,,,,,,0,,"xlog.c",6345,
2016-03-01 01:56:33.032382 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","database system was not properly shut down; automatic recovery in progress",,,,,,,0,,"xlog.c",6434,
2016-03-01 01:56:33.033329 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","redo starts at 0/302AD80",,,,,,,0,,"xlog.c",6523,
2016-03-01 01:56:33.089749 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","record with zero length at 0/77A7708",,,,,,,0,,"xlog.c",4110,
2016-03-01 01:56:33.089792 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","redo done at 0/77A76D8",,,,,,,0,,"xlog.c",6560,
2016-03-01 01:56:33.089893 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","end of transaction log location is 0/77A7708",,,,,,,0,,"xlog.c",6582,
2016-03-01 01:56:33.738889 PST,,,p119941,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","Finished startup pass 1. Proceeding to startup crash recovery passes 2 and 3.",,,,,,,0,,"xlog.c",6816,
2016-03-01 01:56:34.525387 PST,,,p118947,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","received smart shutdown request",,,,,,,0,,"postmaster.c",3447,
2016-03-01 01:56:35.042857 PST,,,p119958,th731297984,,,,0,,,seg-10000,,,,,"WARNING","XX000","could not remove relation directory 16385/16536/20219: Success (smgr.c:1049)",,,,,"Dropping file-system object -- Relation Directory: '16385/16536/20219'",,0,,"smgr.c",1049,
2016-03-01 01:56:35.131058 PST,,,p119958,th731297984,,,,0,,,seg-10000,,,,,"WARNING","XX000","could not remove relation directory 16385/16536/16894: Success (smgr.c:1049)",,,,,"Dropping file-system object -- Relation Directory: '16385/16536/16894'",,0,,"smgr.c",1049,
2016-03-01 01:56:35.584893 PST,,,p119958,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","Finished startup crash recovery pass 2",,,,,,,0,,"xlog.c",6987,
2016-03-01 01:56:35.590423 PST,,,p120017,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","shutting down",,,,,,,0,,"xlog.c",7853,
2016-03-01 01:56:35.592973 PST,,,p120017,th731297984,,,,0,,,seg-10000,,,,,"LOG","00000","database system is shut down",,,,,,,0,,"xlog.c",7874,
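The ordering that matters in the log above is that "received smart shutdown request" is logged after "Finished startup pass 1" while the later recovery passes are still running. A minimal sketch of checking a master log for that ordering follows. The field positions are assumptions read off the sample lines in this report, not a documented log schema, and the function names are hypothetical. The sketch does not know what a completed pass 3 looks like in the log, so it only reports whether a shutdown request followed the end of pass 1.

```python
import csv
import io

# Field positions are assumptions inferred from the sample log lines
# in this report; they are not taken from a documented schema.
PID_FIELD = 3        # e.g. "p119941"
SEVERITY_FIELD = 16  # e.g. "LOG" or "WARNING"
MESSAGE_FIELD = 18   # the human-readable message text


def parse_records(text):
    """Yield (timestamp, pid, severity, message) for each CSV log record."""
    for row in csv.reader(io.StringIO(text)):
        if len(row) <= MESSAGE_FIELD:
            continue  # skip blank or truncated lines
        yield row[0], row[PID_FIELD], row[SEVERITY_FIELD], row[MESSAGE_FIELD]


def shutdown_after_pass1(text):
    """Return True if a shutdown request is logged after pass 1 finished."""
    pass1_done = False
    for _, _, _, message in parse_records(text):
        if message.startswith("Finished startup pass 1"):
            pass1_done = True
        elif pass1_done and "shutdown request" in message:
            return True
    return False
```

Calling `shutdown_after_pass1(open(path).read())` on a master log file (where `path` is whatever log location applies to the installation) returns `True` for a log with the sequence shown in this report.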
cr_workload=# select * from pg_class where relname like 'create_insert%' and relname not like '%prt%';
    relname     | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | reltoastrelid | reltoastidxid | relaosegrelid | relaosegidxid | relhasindex | relisshared | relkind | relstorage | relnatts | relchecks | reltriggers | relukeys | relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl |    reloptions
----------------+--------------+---------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+---------------+---------------+-------------+-------------+---------+------------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------------+--------+-------------------
 create_insert1 |         2200 |  696503 |       10 |     0 |      702761 |             0 |        0 |         0 |             0 |             0 |             0 |             0 | f           | f           | r       | a          |        3 |         0 |           0 |        0 |        0 |       0 | f          | f          | f           | t              |        11609 |        | {appendonly=true}
(1 row)

cr_workload=# \d
No relations found.
cr_workload=# select * from create_insert1;
ERROR: relation "create_insert1" does not exist
LINE 1: select * from create_insert1;
                      ^
cr_workload=# select * from gp_persistent_relation_node where relfilenode_oid = 702761;
 tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+----------+------------+-----------------------+-------------------
          16385 |       696501 |          702761 |                2 |        0 |          0 |                 31380 | (0,0)
(1 row)
Issue Links
- relates to HAWQ-471 Reindex bug (Resolved)