Data loss in many multi-partitions

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • MAC/Linux

    Description

      Overall description: If we configure more than 24 partitions on each NC, we always lose almost half of the partitions, with no error messages or logs.
      Schema:

      drop dataverse tpch if exists;
      create dataverse tpch;
      use dataverse tpch;
      
      create type LineItemType as closed {
        l_orderkey: int32,
        l_partkey: int32,
        l_suppkey: int32,
        l_linenumber: int32,
        l_quantity: int32,
        l_extendedprice: double,
        l_discount: double,
        l_tax: double,
        l_returnflag: string,
        l_linestatus: string,
        l_shipdate: string,
        l_commitdate: string,
        l_receiptdate: string,
        l_shipinstruct: string,
        l_shipmode: string,
        l_comment: string
      }
      
      create dataset LineItem(LineItemType)
        primary key l_orderkey, l_linenumber;
      load dataset LineItem 
      using localfs
      (("path"="127.0.0.1:///path-to-tpch-data/tpch0.001/lineitem.tbl"),("format"="delimited-text"),("delimiter"="|"));
      

      Query:

      use dataverse tpch;
      let $s := count(
      for $d in dataset LineItem
      return $d
      )
      return $s
      

      Return:

      6005
      

      Command:

      managix stop -n tpch
      managix start -n tpch
      

      Query:

      use dataverse tpch;
      let $s := count(
      for $d in dataset LineItem
      return $d
      )
      return $s
      

      Return:

      4521
      

      We lose about a quarter of the records (1,484 of 6,005) in this tiny test. When we scale TPC-H up to 200 GB across 196 partitions, distributed as 8 X 24, we should get 1.2 billion records, but the query returned only 0.45 billion!
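For reference, the loss can be worked out directly from the counts reported above. The following is only a minimal sketch of that arithmetic; the helper name `loss_summary` is hypothetical and not part of AsterixDB, and the large-scale figures are the rounded values from this report (1.2 billion expected, 0.45 billion returned).

```python
from typing import Tuple


def loss_summary(before: int, after: int) -> Tuple[int, float]:
    """Return (records lost, fraction lost) for a count taken before and after a restart."""
    if before <= 0:
        raise ValueError("the 'before' count must be positive")
    lost = before - after
    return lost, lost / before


# Small test: counts returned by the query before and after the managix restart.
small_lost, small_fraction = loss_summary(6005, 4521)
print(f"small test: lost {small_lost} records ({small_fraction:.1%})")

# 200 GB run: rounded figures from the report, so the result is approximate.
large_lost, large_fraction = loss_summary(1_200_000_000, 450_000_000)
print(f"200 GB run: lost about {large_lost} records ({large_fraction:.1%})")
```

On the reported numbers this gives a loss of roughly 24.7% in the small test and roughly 62.5% in the 200 GB run.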

      Attachments

        1. demo.xml
          3 kB
          Wenhai Li
        2. tpch_node2.log
          115 kB
          Wenhai Li
        3. tpch_node1.log
          106 kB
          Wenhai Li
        4. cc.log
          151 kB
          Wenhai Li
        5. execute.log
          51 kB
          Wenhai Li

          People

            imaxon Ian Maxon
            lwhay Wenhai Li