IMPALA-9714

SimpleLogger does not respect limits when there are high frequency appends



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 4.0.0
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend
    • Labels: None


      SimpleLogger provides a basic guarantee to limit disk space usage for logs: it limits the number of entries in each log file, and it limits the total number of log files. Tests added for these limits show that both can be exceeded when a SimpleLogger receives appends at a high rate.

      The first issue is that SimpleLogger names its files with a prefix plus the current time in milliseconds. When SimpleLogger reaches its limit of entries for the current file, it flushes that file and computes a new filename for subsequent output. However, if appends are happening at a high rate, a full millisecond may not have elapsed, in which case the new filename is identical to the old one. SimpleLogger then keeps appending to the current file, so the per-file entry limit is exceeded.
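
      The filename collision can be reproduced in isolation. The following is a minimal sketch, not Impala's actual code: MakeLogFileName(), SimulateAppends() and CountEntriesIn() are hypothetical helpers, and the clock readings are injected so that the behavior is deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the naming scheme described above:
// a prefix followed by a timestamp in milliseconds.
std::string MakeLogFileName(const std::string& prefix, int64_t now_ms) {
  return prefix + std::to_string(now_ms);
}

// Simulates a logger that rolls to a "new" file once max_entries have been
// written. append_times_ms holds the injected clock reading for each append.
// Returns, for every entry, the name of the file it was written to.
std::vector<std::string> SimulateAppends(const std::string& prefix,
    const std::vector<int64_t>& append_times_ms, int max_entries) {
  std::vector<std::string> file_per_entry;
  std::string current_file;
  int entries_in_file = 0;
  for (int64_t now_ms : append_times_ms) {
    if (current_file.empty() || entries_in_file >= max_entries) {
      // If the clock has not advanced since the last roll, this produces the
      // same name again: the counter resets but the file does not change.
      current_file = MakeLogFileName(prefix, now_ms);
      entries_in_file = 0;
    }
    file_per_entry.push_back(current_file);
    ++entries_in_file;
  }
  return file_per_entry;
}

// Counts how many entries ended up in the file with the given name.
int CountEntriesIn(const std::vector<std::string>& file_per_entry,
    const std::string& name) {
  int count = 0;
  for (const std::string& f : file_per_entry) {
    if (f == name) ++count;
  }
  return count;
}
```

      With a limit of two entries per file and four appends inside the same millisecond, all four entries land in the same file, because the roll between the second and third append regenerates the same name.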

      The second issue concerns how the limit on the number of files is enforced. SimpleLogger relies on LoggingSupport::DeleteOldLogs() for this. DeleteOldLogs() lists the files in the directory matching the prefix pattern and inserts them into a map sorted by their mtime. The mtime has a time_t type, which has a granularity of seconds. When there are high-frequency appends to a SimpleLogger, multiple files can be created per second, causing key collisions in this map. DeleteOldLogs() can only see one file per distinct mtime, so it cannot enforce the limit. This also means that it can delete at most one file per distinct mtime in each run.
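
      The effect of keying on a second-granularity mtime can be shown with a plain std::map. This is a sketch under assumptions, not the real DeleteOldLogs(): the LogFile struct, IndexByMtime() and PickFilesToDelete() are hypothetical, and the deletion policy (oldest first, until the indexed count fits the limit) is a guess at the general shape of the logic.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for a file found in the log directory.
struct LogFile {
  std::string name;
  std::time_t mtime;  // second granularity, as described above
};

// Builds an mtime -> name index. std::map holds one value per key, and
// insert() leaves the existing entry in place when the key already exists,
// so files that share an mtime are silently dropped from the index.
std::map<std::time_t, std::string> IndexByMtime(
    const std::vector<LogFile>& files) {
  std::map<std::time_t, std::string> by_mtime;
  for (const LogFile& f : files) {
    by_mtime.insert({f.mtime, f.name});
  }
  return by_mtime;
}

// Picks files to delete, oldest first, until the number of *indexed* files
// is within max_files. Files missing from the index are never considered.
std::vector<std::string> PickFilesToDelete(
    const std::vector<LogFile>& files, std::size_t max_files) {
  std::map<std::time_t, std::string> by_mtime = IndexByMtime(files);
  std::vector<std::string> to_delete;
  auto it = by_mtime.begin();
  while (it != by_mtime.end() &&
         by_mtime.size() - to_delete.size() > max_files) {
    to_delete.push_back(it->second);
    ++it;
  }
  return to_delete;
}
```

      Given five files spread over two distinct mtimes and a limit of one file, the index contains only two entries, so a single file is selected for deletion and four files remain on disk. A container that permits duplicate keys, such as std::multimap, would not drop entries in this way.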

      The first issue is offset by the second: the second issue makes DeleteOldLogs() slower, which in turn limits the number of records written per millisecond.

      The existing users of SimpleLogger do not appear to generate this kind of high-frequency update. Even so, the behavior argues for caution when setting the number of log entries per file, since a small value can exacerbate these cases. For now, this mainly impacts writing unit tests for SimpleLogger.





            Assignee: Unassigned
            Reporter: Joe McDonnell (joemcdonnell)



