  1. Hive
  2. HIVE-20868

SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

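      In MapRecordProcessor::getFinalOp(), for a reason that is not yet known, the TezDummyStoreOperator intermittently has a MergeJoin operator as a child. When that happens, getFinalOp() continues past the dummy operator and returns a downstream operator (RS[14] in the bad init order below) instead of the dummy operator itself. As a result, fetchDone on the dummy operator is not reset and keeps the value true that the previous task set. fetchDone should be reset for each task. With the stale flag, the join operator skips rows from that dummy operator, which produces wrong results.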

      Good init order

      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = TS[3] (core)
      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = FIL[24]
      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = SEL[5]
      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp child Ops = DUMMY_STORE[45]
      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: Iterating children of dummy op DUMMY_STORE[45]
      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp returns DUMMY_STORE[45]
      2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: InitProcessor : setting fetchDone to false
      

      Bad init order

      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = TS[3] (core)
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = FIL[24]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = SEL[5]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = DUMMY_STORE[45]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  Iterating children of dummy op DUMMY_STORE[45]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  Child of Dummy Op MERGEJOIN[44]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = MERGEJOIN[44]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = SEL[13]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp child Ops = RS[14]
      2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp returns RS[14]
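
      The two init orders above can be reproduced with a small stand-alone model. This is only a sketch under assumptions: Op, finalOpNaive, finalOpGuarded, resetFetchDone and buildPlan are hypothetical names and not Hive classes or methods, the traversal is inferred from the log lines rather than copied from MapRecordProcessor, and the guarded variant is one possible way to stop at the dummy operator, not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the operator chain seen in the logs; not Hive code.
class FinalOpSketch {
    // Minimal stand-in for an operator with children and a fetchDone flag.
    static class Op {
        final String name;
        final boolean isDummyStore;
        final List<Op> children = new ArrayList<>();
        boolean fetchDone = true; // stale value left behind by a previous task

        Op(String name, boolean isDummyStore) {
            this.name = name;
            this.isDummyStore = isDummyStore;
        }

        // Adds a child and returns it so chains can be built fluently.
        Op addChild(Op child) {
            children.add(child);
            return child;
        }
    }

    // Follows the first child down to a leaf, as the bad log order suggests.
    static Op finalOpNaive(Op op) {
        while (!op.children.isEmpty()) {
            op = op.children.get(0);
        }
        return op;
    }

    // Assumed guard: stop as soon as a dummy store operator is reached.
    static Op finalOpGuarded(Op op) {
        while (!op.isDummyStore && !op.children.isEmpty()) {
            op = op.children.get(0);
        }
        return op;
    }

    // Assumed per-task init step: reset fetchDone only on a dummy store op.
    static void resetFetchDone(Op finalOp) {
        if (finalOp.isDummyStore) {
            finalOp.fetchDone = false;
        }
    }

    // Builds TS[3] -> FIL[24] -> SEL[5] -> DUMMY_STORE[45], optionally
    // followed by MERGEJOIN[44] -> SEL[13] -> RS[14]. Returns {root, dummy}.
    static Op[] buildPlan(boolean dummyHasChild) {
        Op root = new Op("TS[3]", false);
        Op dummy = root.addChild(new Op("FIL[24]", false))
                       .addChild(new Op("SEL[5]", false))
                       .addChild(new Op("DUMMY_STORE[45]", true));
        if (dummyHasChild) {
            dummy.addChild(new Op("MERGEJOIN[44]", false))
                 .addChild(new Op("SEL[13]", false))
                 .addChild(new Op("RS[14]", false));
        }
        return new Op[] {root, dummy};
    }

    public static void main(String[] args) {
        // Bad init order: the naive walk ends at RS[14], so the flag stays stale.
        Op[] bad = buildPlan(true);
        Op naive = finalOpNaive(bad[0]);
        resetFetchDone(naive);
        System.out.println(naive.name + ", dummy fetchDone = " + bad[1].fetchDone);

        // Guarded walk ends at the dummy store, so the flag is reset.
        Op guarded = finalOpGuarded(bad[0]);
        resetFetchDone(guarded);
        System.out.println(guarded.name + ", dummy fetchDone = " + bad[1].fetchDone);
    }
}
```

      In this model the naive walk leaves fetchDone at true on DUMMY_STORE[45] whenever the dummy operator has a child, which matches the missing "setting fetchDone to false" line in the bad init order.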
      

      Attachments

        1. HIVE-20868.2.patch
          0.9 kB
          Deepak Jaiswal
        2. HIVE-20868.1.patch
          0.9 kB
          Deepak Jaiswal

          People

            Assignee: djaiswal Deepak Jaiswal
            Reporter: djaiswal Deepak Jaiswal