Uploaded image for project: 'Apache HAWQ'
  1. Apache HAWQ
  2. HAWQ-1149

Built-in function gp_persistent_build_all loses data in gp_relfile_node and gp_persistent_relfile_node

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: backlog
    • Component/s: Core
    • Labels:
      None

      Description

      When we create a new table, and insert data into it. There will be records in gp_relfile_node, gp_persistent_relfile_node and gp_persistent_relation_node. But if we run the HAWQ build-in function gp_persistent_build_all, we will find that the record in gp_relfile_node and gp_persistent_relfile_node for this table is lost. And if there are more than 1 file in this talbe, we will get error when we drop this table. Here are the steps to recur this bug:
      1. Create table a, and insert data into a with two concurrent process:

      postgres=# create table a(id int);
      CREATE TABLE
      postgres=# insert into a select generate_series(1, 10000000);
      INSERT 0 10000000
      
      postgres=# insert into a select generate_series(10000000, 20000000);
      INSERT 0 10000001
      

      2. Check the persistent table and find two files in this table's directory:

      postgres=# select oid from pg_class where relname='a';
         oid
      ---------
       3017232
      (1 row)
      
      postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
       relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
      -----------------+------------------+----------------+-----------------------
               3017232 |                1 | (4,128)        |                855050
               3017232 |                2 | (4,129)        |                855051
      (2 rows)
      
      postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232;
       tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid
      ----------------+--------------+-----------------+------------------+----------+------------+-----------------------+-------------------
                16385 |        16387 |         3017232 |                2 |        0 |          0 |                158943 | (0,0)
      (1 row)
      
      postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232;
       tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid
      ----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+-------------------
                16385 |        16387 |         3017232 |                1 |                        2 |                2 |                     0 |          0 |                855050 | (0,0)
                16385 |        16387 |         3017232 |                2 |                        2 |                2 |                     0 |          0 |                855051 | (0,0)
      (2 rows)
      
      hadoop fs -ls /hawq_default/16385/16387/3017232
      -rw-------   3 wangchunling supergroup  100103584 2016-11-08 17:02 /hawq_default/16385/16387/3017232/1
      -rw-------   3 wangchunling supergroup  100103600 2016-11-08 17:02 /hawq_default/16385/16387/3017232/2
      

      3. Rebuilt persistent tables.

      postgres=# insert into a select generate_series(10000000, 20000000);
      INSERT 0 10000001
      postgres=# select gp_persistent_reset_all();
       gp_persistent_reset_all
      -------------------------
                             1
      (1 row)
      
      postgres=# select gp_persistent_build_all(false);
       gp_persistent_build_all
      -------------------------
                             1
      (1 row)
      

      4. Check persistent table and find data lost in gp_relfile_node and gp_persistent_relfile_node.

      postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
       relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
      -----------------+------------------+----------------+-----------------------
      (0 rows)
      
      postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232;
       tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid
      ----------------+--------------+-----------------+------------------+----------+------------+-----------------------+-------------------
                16385 |        16387 |         3017232 |                2 |        0 |          0 |                159020 | (0,0)
      (1 row)
      
      postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232;
       tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid
      ----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+-------------------
      (0 rows)
      

      5. Drop talbe a and get error.

      postgres=# DROP TABLE a;
      ERROR:  TID for persistent 'Relation Directory: '16385/16387/3017232'' tuple is invalid (0,0) (index 2, transaction kind 'Commit') (persistentendxactrec.c:264)
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                lei_chang Lei Chang
                Reporter:
                wcl14 Chunling Wang
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: