Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-11886

Data cache should support asynchronous writes

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • Impala 4.3.0
    • None
    • None
    • ghx-label-7

    Description

      Currently, writes to the data cache are synchronized with hdfs file reads, and both are handled by remote hdfs IO threads. In other words, if a cache miss occurs, the IO thread needs to take additional responsibility for cache writes, which will lead to query performance deterioration in some cases.
      Therefore, the data cache should be able to defer the writes to another thread(or thread pool) which writes asynchronously, allowing the IO thread to copy the data into the temporary buffer and immediately return it to the Scanner. Also need to bound the extra memory consumption for holding the temporary buffer though.

      Attachments

        Activity

          People

            eyizoha Zihao Ye
            eyizoha Zihao Ye
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: