Chukwa / CHUKWA-132

Handle multiline output in Job History file

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Data Processors
    • Labels: None
    • Environment: Redhat EL 5.1, Java 6

      Description

      When the Job History file contains multi-line output, the parser fails with an exception like this:

      MapAttempt TASK_TYPE="MAP" TASKID="task_200904060626_2141_m_000108" TASK_ATTEMPT_ID="attempt_200904060626_2141_m_000108_1" START_TIME="1239190934835" TRACKER_NAME="tracker_kry50024\.inktomisearch\.com:localhost/127\.0\.0\.1:39507" HTTP_PORT="50060" .
      MapAttempt TASK_TYPE="MAP" TASKID="task_200904060626_2141_m_000108" TASK_ATTEMPT_ID="attempt_200904060626_2141_m_000108_1" TASK_STATUS="FAILED" FINISH_TIME="1239190949062" HOSTNAME="kry50024\.inktomisearch\.com" ERROR="java\.io\.IOException: MROutput/MRErrThread failed:java\.lang\.ArrayIndexOutOfBoundsException: -1
      at org\.apache\.hadoop\.mapred\.lib\.KeyFieldBasedPartitioner\.hashCode(KeyFieldBasedPartitioner\.java:95)
      at org\.apache\.hadoop\.mapred\.lib\.KeyFieldBasedPartitioner\.getPartition(KeyFieldBasedPartitioner\.java:87)
      at org\.apache\.hadoop\.mapred\.MapTask$MapOutputBuffer\.collect(MapTask\.java:801)
      at org\.apache\.hadoop\.streaming\.PipeMapRed$MROutputThread\.run(PipeMapRed\.java:378)

      at org\.apache\.hadoop\.streaming\.PipeMapper\.map(PipeMapper\.java:87)
      at org\.apache\.hadoop\.mapred\.MapRunner\.run(MapRunner\.java:50)
      at org\.apache\.hadoop\.streaming\.PipeMapRunner\.run(PipeMapRunner\.java:36)
      at org\.apache\.hadoop\.mapred\.MapTask\.runOldMapper(MapTask\.java:356)
      at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:305)
      at org\.apache\.hadoop\.mapred\.Child\.main(Child\.java:156)
      " .
      MapAttempt TASK_TYPE="CLEANUP" TASKID="task_200904060626_2141_m_000197" TASK_ATTEMPT_ID="attempt_200904060626_2141_m_000197_0" START_TIME="1239190961843" TRACKER_NAME="tracker_kry3083\.inktomisearch\.com:localhost/127\.0\.0\.1:60970" HTTP_PORT="50060" .
      MapAttempt TASK_TYPE="CLEANUP" TASKID="task_200904060626_2141_m_000197" TASK_ATTEMPT_ID="attempt_200904060626_2141_m_000197_0" TASK_STATUS="SUCCESS" FINISH_TIME="1239190963602" HOSTNAME="/74\.6\.135\.128/kry3083\.inktomisearch\.com" STATE_STRING="cleanup" COUNTERS="

      {(org\.apache\.hadoop\.mapred\.Task$Counter)(Map-Reduce Framework)[(SPILLED_RECORDS)(Spilled Records)(0)]}" .
      Task TASKID="task_200904060626_2141_m_000197" TASK_TYPE="CLEANUP" TASK_STATUS="SUCCESS" FINISH_TIME="1239190963509" COUNTERS="{(org.apache.hadoop.mapred.Task$Counter)(Map-Reduce Framework)[(SPILLED_RECORDS)(Spilled Records)(0)]}

      " .
      Job JOBID="job_200904060626_2141" FINISH_TIME="1239190963510" JOB_STATUS="FAILED" FINISHED_MAPS="0" FINISHED_REDUCES="0" .

      [cchunkException] :java.lang.StringIndexOutOfBoundsException: String index out of range: -1
      at java.lang.String.substring(String.java:1938)
      at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.JobLog$JobLogLine.<init>(JobLog.java:114)
      at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.JobLog.parse(JobLog.java:39)
      at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.AbstractProcessor.process(AbstractProcessor.java:90)
      at org.apache.hadoop.chukwa.extraction.demux.Demux$MapClass.map(Demux.java:94)
      at org.apache.hadoop.chukwa.extraction.demux.Demux$MapClass.map(Demux.java:60)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2210)

      [csource] :host.example.com
      [ctags] :cluster="demo"
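
      The excerpt above suggests that a single logical record (for example the ERROR or COUNTERS value) can span several physical lines, and that a parser looking for a delimiter on each line separately gets -1 from an index lookup and then fails in substring. The following is a minimal sketch of one way to handle this, not the code from chukwa-132.patch. The class JobHistoryRecords and its methods are hypothetical, and the sketch assumes that every record ends with a closing quote followed by " ." and that values contain no double quote characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Chukwa.
class JobHistoryRecords {

    // KEY="value" pairs. The negated class [^"] also matches newlines,
    // so a quoted value may span several lines.
    // Assumption: values never contain a double quote.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    /** Groups physical lines into logical records that end with '" .'. */
    static List<String> joinRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            // Skip blank lines between records, but keep them inside a record.
            if (current.length() == 0 && line.trim().isEmpty()) {
                continue;
            }
            if (current.length() > 0) {
                current.append('\n');
            }
            current.append(line);
            // Assumed terminator: closing quote, space, dot.
            if (line.trim().endsWith("\" .")) {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        // Keep an unterminated trailing record so nothing is silently dropped.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    /** Extracts the record type and all KEY="value" pairs of one record. */
    static Map<String, String> parseFields(String record) {
        Map<String, String> fields = new LinkedHashMap<>();
        String trimmed = record.trim();
        int space = trimmed.indexOf(' ');
        // indexOf returns -1 when there is no space; guard before substring.
        fields.put("RECORD_TYPE", space < 0 ? trimmed : trimmed.substring(0, space));
        Matcher m = PAIR.matcher(trimmed);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "MapAttempt TASKID=\"task_1\" ERROR=\"java.io.IOException: failed",
            "at Example.run(Example.java:1)",
            "\" .",
            "Job JOBID=\"job_1\" JOB_STATUS=\"FAILED\" .");
        for (String record : joinRecords(lines)) {
            System.out.println(parseFields(record).keySet());
        }
    }
}
```

      Joining lines before parsing keeps the per-record parsing logic simple. The cost is that a record with a missing terminator absorbs every line after it, so a real implementation would probably want a size limit or a check for the start of the next record.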

        Activity

        Eric Yang created issue
        Eric Yang made changes
        Field | Original Value | New Value
        Assignee | (none) | Cheng [ zhangyongjiang ]
        Mac Yang made changes
        Priority | Critical [ 2 ] | Blocker [ 1 ]
        Cheng made changes
        Attachment | (none) | chukwa-132.patch [ 12405462 ]
        Cheng made changes
        Status | Open [ 1 ] | Patch Available [ 10002 ]
        Eric Yang made changes
        Description: log excerpt unchanged apart from its last two lines (repeated text omitted)
        Original Value: [csource] :kry-jt1.red.ygrid.yahoo.com / [ctags] :cluster="kryptonitered"
        New Value: [csource] :host.example.com / [ctags] :cluster="demo"

        Eric Yang made changes
        Status | Patch Available [ 10002 ] | Resolved [ 5 ]
        Resolution | (none) | Fixed [ 1 ]

          People

          • Assignee:
            Cheng
            Reporter:
            Eric Yang
          • Votes:
            0
            Watchers:
            0
