IMPALA-7816

Race condition in HdfsScanNodeBase::StopAndFinalizeCounters

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 3.1.0
    • Fix Version/s: None
    • Component/s: Backend

    Description

      While working on IMPALA-6964, I noticed that the runtime profile for an HDFS_SCAN_NODE sometimes includes the line File Formats: PARQUET/NONE:2 and sometimes does not, depending on the query. However, looking at the code, any scan of Parquet files should include this line.

      I debugged the code and there seems to be a race condition: HdfsScanNodeBase::StopAndFinalizeCounters can be called before HdfsParquetScanner::Close has been called for all the scan ranges. This causes the File Formats issue above because HdfsParquetScanner::Close calls HdfsScanNodeBase::RangeComplete, which updates the shared object file_type_counts_, and StopAndFinalizeCounters reads that object. As a result, StopAndFinalizeCounters can write out the contents of file_type_counts_ before all scanners have updated it.
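
      The ordering problem described above can be sketched with a small, single-threaded model. This is only an illustration, not Impala's code: ScanNodeModel, CloseThenFinalize and FinalizeThenClose are hypothetical names, and the real race involves scanner threads rather than hand-ordered calls. The two helper functions replay the two possible interleavings deterministically.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-in for the scan node. Not Impala's real class.
class ScanNodeModel {
 public:
  // Models HdfsScanNodeBase::RangeComplete, which a scanner's Close() reaches.
  void RangeComplete(const std::string& format_and_codec) {
    ++file_type_counts_[format_and_codec];
  }

  // Models StopAndFinalizeCounters: snapshots file_type_counts_ into the profile.
  void StopAndFinalizeCounters() {
    std::string line;
    for (const auto& entry : file_type_counts_) {
      if (!line.empty()) line += " ";
      line += entry.first + ":" + std::to_string(entry.second);
    }
    // Nothing is written if no range has completed yet.
    if (!line.empty()) profile_ = "File Formats: " + line;
  }

  const std::string& profile() const { return profile_; }

 private:
  std::map<std::string, int> file_type_counts_;  // shared between scanners and node
  std::string profile_;
};

// Interleaving A: both scanners close before the counters are finalized.
std::string CloseThenFinalize() {
  ScanNodeModel node;
  node.RangeComplete("PARQUET/NONE");
  node.RangeComplete("PARQUET/NONE");
  node.StopAndFinalizeCounters();
  return node.profile();
}

// Interleaving B: the counters are finalized before any scanner has closed.
std::string FinalizeThenClose() {
  ScanNodeModel node;
  node.StopAndFinalizeCounters();
  node.RangeComplete("PARQUET/NONE");  // too late, the snapshot is already taken
  node.RangeComplete("PARQUET/NONE");
  return node.profile();
}
```

      In this model CloseThenFinalize() returns the expected File Formats line, while FinalizeThenClose() returns an empty profile because the late updates to file_type_counts_ are never read.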

      StopAndFinalizeCounters can be called in two places: HdfsScanNodeBase::Close and HdfsScanNode::GetNext. It is called in GetNext when GetNextInternal reads enough rows to cross the query-defined limit. So GetNext calls StopAndFinalizeCounters as soon as the limit is reached, whether or not the scanners have been closed yet.
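
      The two call sites can be modelled as follows. This is a rough sketch under stated assumptions: LimitScanModel and WhoFinalizes are hypothetical names, a limit of -1 is used here to mean "no limit", and the once-only guard inside StopAndFinalizeCounters is an assumption made for illustration rather than something this report confirms.

```cpp
#include <string>

// Hypothetical model of the two call sites. Not Impala's real classes.
class LimitScanModel {
 public:
  // limit < 0 means "no limit on this scan" (a convention of this sketch only).
  explicit LimitScanModel(int limit) : limit_(limit) {}

  // Models HdfsScanNode::GetNext: finalizes as soon as the limit is reached.
  void GetNext(int rows_in_batch) {
    rows_returned_ += rows_in_batch;
    if (limit_ >= 0 && rows_returned_ >= limit_) {
      StopAndFinalizeCounters("GetNext");
    }
  }

  // Models HdfsScanNodeBase::Close: the other call site.
  void Close() { StopAndFinalizeCounters("Close"); }

  const std::string& finalized_by() const { return finalized_by_; }

 private:
  void StopAndFinalizeCounters(const std::string& caller) {
    // ASSUMPTION: finalization takes effect only once, so a later call
    // from Close() cannot repair an early snapshot taken from GetNext().
    if (finalized_) return;
    finalized_ = true;
    finalized_by_ = caller;
  }

  int limit_;
  int rows_returned_ = 0;
  bool finalized_ = false;
  std::string finalized_by_;
};

// Runs one batch through GetNext, then Close, and reports which call site
// actually finalized the counters.
std::string WhoFinalizes(int limit, int rows_in_batch) {
  LimitScanModel node(limit);
  node.GetNext(rows_in_batch);
  node.Close();
  return node.finalized_by();
}
```

      With a limit of 10 and a batch of 10 rows, WhoFinalizes reports GetNext as the finalizing call site; with no limit it reports Close. Only the first case finalizes at a point where scanners may still be open.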

      I'm able to reproduce this locally using the following queries:

       select * from functional_parquet.lineitem_sixblocks limit 10 

      The runtime profile does not include File Formats

       select * from functional_parquet.lineitem_sixblocks order by l_orderkey limit 10 

      The runtime profile does include File Formats

      I tried simply removing the call to StopAndFinalizeCounters from GetNext, but that doesn't seem to work. It actually caused several other runtime profile (RP) messages to go missing (not entirely sure why).

          People

            Assignee: Unassigned
            Reporter: Sahil Takiar (stakiar)
            Votes: 0
            Watchers: 2
