IMPALA-11704

Remote Ozone scans are slow even after data cache warmup

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 4.1.1
    • Fix Version/s: Impala 4.2.0
    • Component/s: Backend

    Description

      From drorke:

      Running some basic performance sanity tests ... with Impala TPC-DS queries against Ozone vs HDFS. Impala appears to be using its data cache for both Ozone and HDFS remote reads, but in the case of Ozone reads I'm still seeing long scan times and high I/O wait times even after cache warmup. Excerpts below are from profiles of q90. Note that in both cases the Impala profiles show 100% cache hit rates, but for some reason the scan I/O wait times are still much longer for the Ozone scans.

      HDFS:
      - TotalTime: 1s924ms
      - ScannerIoWaitTime: 52.037ms
      
      Ozone:
      - TotalTime: 8s917ms
      - ScannerIoWaitTime: 7s454ms

      If I disable the local cache explicitly via query option I get the following times for the same scan:

      HDFS:
      - TotalTime: 7s792ms
      - ScannerIoWaitTime: 6s244ms
      
      Ozone:
      - TotalTime: 8s963ms
      - ScannerIoWaitTime: 7s464ms

      Investigating a bit, joemcdonnell noticed the following in the Ozone profile:

       - ScannerIoWaitTime: 7s454ms
       - TotalRawHdfsOpenFileTime: 5s782ms
      

      Based on profile differences around TotalRawHdfsOpenFileTime=5s782ms (vs 0ms for HDFS), I believe this is a performance difference that appears when the data cache is in use but the file handle cache is disabled. That traces back to an incomplete implementation of IMPALA-10147.
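
To put the quoted counters side by side, here is a small sketch that converts the duration strings from the excerpts above into milliseconds and derives a few ratios. The helper name `to_ms` and the set of unit suffixes it accepts are assumptions made for illustration; this is not Impala's own profile parser.

```python
import re

# Milliseconds per unit suffix. The suffix set is an assumption for this sketch.
_UNIT_MS = {"h": 3_600_000.0, "m": 60_000.0, "s": 1_000.0,
            "ms": 1.0, "us": 1e-3, "ns": 1e-6}
_PART = r"\d+(?:\.\d+)?(?:ms|us|ns|h|m|s)"
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")


def to_ms(text):
    """Convert a duration such as '8s917ms' or '52.037ms' to milliseconds."""
    if not re.fullmatch(f"(?:{_PART})+", text):
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * _UNIT_MS[unit]
               for value, unit in _TOKEN.findall(text))


# Values copied from the profile excerpts in this issue.
ozone_total_cached = to_ms("8s917ms")
ozone_total_uncached = to_ms("8s963ms")
hdfs_total_cached = to_ms("1s924ms")
hdfs_total_uncached = to_ms("7s792ms")
ozone_io_wait = to_ms("7s454ms")
ozone_open_time = to_ms("5s782ms")

# Share of the Ozone scanner I/O wait spent in hdfsOpenFile.
open_share = ozone_open_time / ozone_io_wait
# How much the data cache shortens the scan for each filesystem.
hdfs_speedup = hdfs_total_uncached / hdfs_total_cached
ozone_speedup = ozone_total_uncached / ozone_total_cached

print(f"open time share of Ozone I/O wait: {open_share:.1%}")
print(f"HDFS speedup from data cache:  {hdfs_speedup:.2f}x")
print(f"Ozone speedup from data cache: {ozone_speedup:.2f}x")
```

With these figures the open-file time makes up roughly three quarters of the Ozone I/O wait, while the data cache leaves the Ozone scan time almost unchanged and shortens the HDFS scan about fourfold.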

      A data read:
      1. Checks that it can open a file handle. When the file handle cache is enabled, this is a no-op.
      2. Tries to read data. If the data cache is enabled, it tries to read from the data cache first.
      3. If the data cache hits, that data is returned and any open file handle goes unused.

      When the file handle cache is disabled, opening the file handle calls hdfsOpenFile and hdfsSeek. hdfsOpenFile in particular is monitored and added to the profile as TotalRawHdfsOpenFileTime. That time in the Ozone profile accounts for most of the difference in performance between HDFS and Ozone in this case.
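
The read path described above can be modeled with a toy simulation that counts how many times a file is opened when every read hits the data cache. All names here (`FakeRemoteFile`, `read_eager`, `read_lazy`) are hypothetical and none of this is Impala code. The lazy variant shows one possible way to avoid the cost and is not necessarily the fix that was applied.

```python
class FakeRemoteFile:
    """Stand-in for a remote file that counts how often it is opened."""

    def __init__(self, data):
        self.data = data
        self.open_calls = 0

    def open(self):
        # Stands in for hdfsOpenFile + hdfsSeek with the file handle cache disabled.
        self.open_calls += 1
        return self

    def read(self, offset, length):
        return self.data[offset:offset + length]


def read_eager(remote, cache, offset, length):
    """Order described in the issue: open a handle, then consult the data cache."""
    handle = remote.open()
    key = (offset, length)
    if key in cache:
        return cache[key]  # cache hit: the handle opened above goes unused
    cache[key] = handle.read(offset, length)
    return cache[key]


def read_lazy(remote, cache, offset, length):
    """Hypothetical alternative: open a handle only after a data cache miss."""
    key = (offset, length)
    if key in cache:
        return cache[key]
    cache[key] = remote.open().read(offset, length)
    return cache[key]


def count_opens(read_fn, repeats=100):
    """Warm the cache with one read, then repeat the same read `repeats` times."""
    remote, cache = FakeRemoteFile(b"x" * 1024), {}
    read_fn(remote, cache, 0, 128)  # warm-up read, a cache miss
    for _ in range(repeats):
        assert read_fn(remote, cache, 0, 128) == b"x" * 128  # cache hits
    return remote.open_calls


eager_opens = count_opens(read_eager)
lazy_opens = count_opens(read_lazy)
print("opens with eager handle:", eager_opens)
print("opens with lazy handle: ", lazy_opens)
```

In the eager model every cached read still pays for an open. That matches the profile, where TotalRawHdfsOpenFileTime stays high despite a 100% cache hit rate.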

            People

              Assignee: Michael Smith (MikaelSmith)
              Reporter: Michael Smith (MikaelSmith)