Hadoop Common / HADOOP-1524

Task logs (userlogs) don't show up for a while

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels:
      None

      Description

      When I start a task and go to the task logs, nothing shows up for a while. An examination of TaskLog.Writer and TaskLog.Reader reveals:

      1. TaskLog.Reader relies on the presence of a split.idx file to identify which parts of the logs to display.
      2. TaskLog.Writer only updates the split.idx file when it moves on to the next log file.

      As a result, new log output only becomes visible once an entire log file has been completed.

      Why is there a split.idx file? Since the log files are named part-00000, part-00001, etc., it seems that TaskLog.Reader could simply list all of the files and sort them by name. The split.idx file also records each file's length, but the filesystem already stores that information.
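
A minimal sketch of the index-free lookup suggested above, assuming the part files sit directly in one log directory and use five-digit, zero-padded names. The class PartFileLister and its methods are hypothetical and are not part of Hadoop's TaskLog; the sketch uses plain java.nio.file calls rather than any Hadoop API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not Hadoop code): locates log parts without split.idx. */
public class PartFileLister {

    /** Returns the part-NNNNN files in logDir, sorted by file name. */
    public static List<Path> listParts(Path logDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        // The glob matches "part-" followed by exactly five digits,
        // so split.idx and any other files are skipped.
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(logDir, "part-[0-9][0-9][0-9][0-9][0-9]")) {
            for (Path p : stream) {
                parts.add(p);
            }
        }
        // Zero-padded names sort lexicographically in numeric order.
        parts.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
        return parts;
    }

    /** Sums the current sizes of the parts, as reported by the filesystem. */
    public static long totalLength(List<Path> parts) throws IOException {
        long total = 0;
        for (Path p : parts) {
            total += Files.size(p);
        }
        return total;
    }
}
```

Because the lengths are read from the filesystem at call time, a reader built this way would see a part file that is still being written, which is the behavior the report is asking for.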

      If nobody has objections, I'd like to write a patch to eliminate the split.idx file.

        Attachments

        1. accelerate-task-log.patch
          5 kB
          Michael Bieniosek
        2. eliminate-split-idx.patch
          5 kB
          Michael Bieniosek

            People

            • Assignee:
              bien Michael Bieniosek
              Reporter:
              bien Michael Bieniosek
