When we create a new table, and insert data into it. There will be records in gp_relfile_node, gp_persistent_relfile_node and gp_persistent_relation_node. But if we run the HAWQ build-in function gp_persistent_build_all, we will find that the record in gp_relfile_node and gp_persistent_relfile_node for this table is lost. And if there are more than 1 file in this talbe, we will get error when we drop this table. Here are the steps to recur this bug:
1. Create table a, and insert data into a with two concurrent process:
postgres=# create table a(id int);
CREATE TABLE
postgres=# insert into a select generate_series(1, 10000000);
INSERT 0 10000000
postgres=# insert into a select generate_series(10000000, 20000000); INSERT 0 10000001
2. Check the persistent table and find two files in this table's directory:
postgres=# select oid from pg_class where relname='a';
oid
---------
3017232
(1 row)
postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
-----------------+------------------+----------------+-----------------------
3017232 | 1 | (4,128) | 855050
3017232 | 2 | (4,129) | 855051
(2 rows)
postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232;
tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+----------+------------+-----------------------+-------------------
16385 | 16387 | 3017232 | 2 | 0 | 0 | 158943 | (0,0)
(1 row)
postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232;
tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+-------------------
16385 | 16387 | 3017232 | 1 | 2 | 2 | 0 | 0 | 855050 | (0,0)
16385 | 16387 | 3017232 | 2 | 2 | 2 | 0 | 0 | 855051 | (0,0)
(2 rows)
hadoop fs -ls /hawq_default/16385/16387/3017232
-rw------- 3 wangchunling supergroup 100103584 2016-11-08 17:02 /hawq_default/16385/16387/3017232/1
-rw------- 3 wangchunling supergroup 100103600 2016-11-08 17:02 /hawq_default/16385/16387/3017232/2
3. Rebuilt persistent tables.
postgres=# insert into a select generate_series(10000000, 20000000);
INSERT 0 10000001
postgres=# select gp_persistent_reset_all();
gp_persistent_reset_all
-------------------------
1
(1 row)
postgres=# select gp_persistent_build_all(false);
gp_persistent_build_all
-------------------------
1
(1 row)
4. Check persistent table and find data lost in gp_relfile_node and gp_persistent_relfile_node.
postgres=# select * from gp_relfile_node where relfilenode_oid=3017232; relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num -----------------+------------------+----------------+----------------------- (0 rows) postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid ----------------+--------------+-----------------+------------------+----------+------------+-----------------------+------------------- 16385 | 16387 | 3017232 | 2 | 0 | 0 | 159020 | (0,0) (1 row) postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid ----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+------------------- (0 rows)
5. Drop talbe a and get error.
postgres=# DROP TABLE a; ERROR: TID for persistent 'Relation Directory: '16385/16387/3017232'' tuple is invalid (0,0) (index 2, transaction kind 'Commit') (persistentendxactrec.c:264)
- relates to
-
HAWQ-1145 After registering a partition table, if we want to insert some data into the table, it fails.
-
- Closed
-
- links to