  1. IMPALA
  2. IMPALA-7252

Backport rate limiting of fadvise calls into toolchain glog

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: Impala 3.0
    • Fix Version: Impala 3.1.0
    • Component: Backend
    • Labels: None

    Description

      Currently, glog's default behavior is to call fadvise(FADV_DONTNEED) on the log file after every entry it writes. In many versions of the Linux kernel, each such call schedules work on all other CPUs, causing up to one context switch per CPU for every log line.

      We saw this cause an extremely long GC pause in the catalogd: the native side of the catalog was logging many messages about publishing metadata updates while the Java side was running a GC. The high context switch rate caused many TLB clears and misses, so the GC spent almost all of its time in the kernel and paused the JVM for several minutes instead of a couple of seconds.

      This was identified and fixed upstream in glog here: https://github.com/google/glog/commit/dacd29679633c9b845708e7015bd2c79367a6ea2

      We should backport this fix into the version in the toolchain.
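
For illustration only, here is a minimal C++ sketch of the general idea of rate limiting the fadvise call: count the bytes written since the last call and issue posix_fadvise only once a threshold has accumulated. This is not the upstream glog patch. The class name RateLimitedCacheDropper, its methods, and the threshold parameter are all hypothetical, and the threshold upstream uses is not reproduced here.

```cpp
#include <fcntl.h>      // posix_fadvise, POSIX_FADV_DONTNEED
#include <sys/types.h>  // off_t
#include <cstdint>

// Hypothetical helper (not glog's actual code): tracks how many bytes have
// been written since the last fadvise call and only issues a new call once
// at least threshold_bytes of new data has accumulated.
class RateLimitedCacheDropper {
 public:
  explicit RateLimitedCacheDropper(std::uint64_t threshold_bytes)
      : threshold_bytes_(threshold_bytes) {}

  // Call after each log write. Returns true if posix_fadvise was issued.
  bool OnWrite(int fd, std::uint64_t bytes_written) {
    file_length_ += bytes_written;
    const std::uint64_t pending = file_length_ - dropped_length_;
    if (pending < threshold_bytes_) {
      return false;  // Not enough new data yet, so skip the syscall.
    }
    // Advise the kernel that the not-yet-advised range is no longer needed.
    // The call is advisory, so its return value is deliberately ignored.
    (void)posix_fadvise(fd, static_cast<off_t>(dropped_length_),
                        static_cast<off_t>(pending), POSIX_FADV_DONTNEED);
    dropped_length_ = file_length_;
    ++fadvise_calls_;
    return true;
  }

  std::uint64_t fadvise_calls() const { return fadvise_calls_; }

 private:
  const std::uint64_t threshold_bytes_;
  std::uint64_t file_length_ = 0;     // Total bytes written so far.
  std::uint64_t dropped_length_ = 0;  // Bytes already covered by fadvise.
  std::uint64_t fadvise_calls_ = 0;   // Number of fadvise calls issued.
};
```

With a threshold of, say, 1 MB, a burst of short log lines triggers one fadvise call per megabyte written rather than one per line, which bounds the cross-CPU work described above by log volume instead of by log line count.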

            People

              Assignee: Tianyi Wang (tianyiwang)
              Reporter: Todd Lipcon (tlipcon)