  1. Chukwa
  2. CHUKWA-131

Collect finished_maps, finished_reduces, failed_maps, failed_reduces in mr_job table

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.1.2
    • Fix Version/s: 0.1.2
    • Component/s: Data Processors
    • Labels:
      None
    • Environment:

      Redhat 5.1, Java 6

      Description

      The current metrics collected from MapReduce jobs are missing the following important columns:

      finished_maps
      finished_reduces
      failed_maps
      failed_reduces

      1. CHUKWA-131-3.patch
        0.4 kB
        Eric Yang
      2. CHUKWA-131-2.patch
        4 kB
        Eric Yang
      3. CHUKWA-131-1.patch
        16 kB
        Eric Yang
      4. CHUKWA-131.patch
        2 kB
        Eric Yang

          Activity

          hudson Hudson added a comment -

          Integrated in Chukwa-trunk #45 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/45/)

          hudson Hudson added a comment -

          Integrated in Chukwa-trunk #8 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/8/)
          Changed from using task id to task attempt id. (Eric Yang)

          eyang Eric Yang added a comment -

          I just committed this, thanks Cheng.

          zhangyongjiang Cheng added a comment -

          +1 good

          eyang Eric Yang added a comment -

          Add the missing task_type field.

          eyang Eric Yang added a comment -

          Missing task_type definition in MDL.

          eyang Eric Yang added a comment -

          I just committed this, thanks Cheng.

          zhangyongjiang Cheng added a comment -

          +1 good.

          eyang Eric Yang added a comment -

          Update MDL dictionary and database schema to match the new columns.

          eyang Eric Yang added a comment -

          MR Job Data:

          finished_maps
          finished_reduces
          failed_maps
          failed_reduces
          total_maps
          total_reduces
          reduce_shuffle_bytes

          MR Task Data:

          type
          reduce_shuffle_bytes
          hostname
          shuffle_finished
          sort_finished
          splits

          Collect those new metrics.

          macyang Mac Yang added a comment -

          All fields should be loaded. Two that I'm not sure about are http_port and state_string.

          Possible Map Reduce Job Fields:
          FAILED_MAPS
          FAILED_REDUCES
          FINISHED_MAPS
          FINISHED_REDUCES
          FINISH_TIME
          JOBCONF
          JOBID
          JOBNAME
          JOB_PRIORITY
          JOB_STATUS
          LAUNCH_TIME
          SUBMIT_TIME
          TOTAL_MAPS
          TOTAL_REDUCES
          USER
          Counter:FileSystemCounters:FILE_BYTES_READ
          Counter:FileSystemCounters:FILE_BYTES_WRITTEN
          Counter:FileSystemCounters:HDFS_BYTES_READ
          Counter:FileSystemCounters:HDFS_BYTES_WRITTEN
          Counter:org.apache.hadoop.mapred.JobInProgress$Counter:TOTAL_LAUNCHED_MAPS
          Counter:org.apache.hadoop.mapred.JobInProgress$Counter:TOTAL_LAUNCHED_REDUCES
          Counter:org.apache.hadoop.mapred.Task$Counter:COMBINE_INPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:COMBINE_OUTPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_INPUT_BYTES
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_INPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_OUTPUT_BYTES
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_OUTPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_INPUT_GROUPS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_INPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_OUTPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_SHUFFLE_BYTES
          Counter:org.apache.hadoop.mapred.Task$Counter:SPILLED_RECORDS

          Possible Task Fields:

          Counter:FileSystemCounters:FILE_BYTES_WRITTEN
          Counter:FileSystemCounters:HDFS_BYTES_READ
          Counter:org.apache.hadoop.mapred.Task$Counter:COMBINE_INPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:COMBINE_OUTPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_INPUT_BYTES
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_INPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_OUTPUT_BYTES
          Counter:org.apache.hadoop.mapred.Task$Counter:MAP_OUTPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_INPUT_GROUPS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_INPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_OUTPUT_RECORDS
          Counter:org.apache.hadoop.mapred.Task$Counter:REDUCE_SHUFFLE_BYTES
          Counter:org.apache.hadoop.mapred.Task$Counter:SPILLED_RECORDS
          ERROR
          FINISH_TIME
          HOSTNAME
          HTTP_PORT
          JOBID
          SHUFFLE_FINISHED
          SORT_FINISHED
          SPLITS
          START_TIME
          STATE_STRING
          TASKID
          TASK_ATTEMPT_ID
          TASK_ATTEMPT_TIMES
          TASK_STATUS
          TASK_TYPE
          TRACKER_NAME
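
The job-level count fields in the list above correspond to the lower-case column names requested for the mr_job table. As an illustration only (it is not taken from the attached patches), the following Python sketch pulls those counts out of a single record. It assumes the record is a line of KEY="value" pairs; the sample line, the job id, and the helper name job_counts are hypothetical.

```python
import re

# Assumption: a job record is a line of KEY="value" pairs.
PAIR = re.compile(r'(\w+)="([^"]*)"')

# Job-level count fields named in the comment above.
JOB_COUNT_FIELDS = (
    "FINISHED_MAPS",
    "FINISHED_REDUCES",
    "FAILED_MAPS",
    "FAILED_REDUCES",
    "TOTAL_MAPS",
    "TOTAL_REDUCES",
)


def job_counts(record):
    """Return {column_name: int} for the count fields present in the record."""
    fields = dict(PAIR.findall(record))
    return {
        name.lower(): int(fields[name])
        for name in JOB_COUNT_FIELDS
        if name in fields
    }


# Made-up sample record for demonstration.
sample = (
    'Job JOBID="job_example_0001" FINISHED_MAPS="4" FINISHED_REDUCES="1" '
    'FAILED_MAPS="0" FAILED_REDUCES="0"'
)
print(job_counts(sample))
```

Fields that are absent from a record are simply omitted from the result, so a loader built this way could fall back on column defaults for them.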

          terencekwan Terence Kwan added a comment -

          finished_maps default 0,
          finished_reduces default 0,
          failed_maps default 0,
          failed_reduces default 0,

          need to be:

          finished_maps bigint default 0,
          finished_reduces bigint default 0,
          failed_maps bigint default 0,
          failed_reduces bigint default 0,
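
The corrected definitions can be tried out with a small sketch. This is not the project's schema file: the table is trimmed to the four columns above plus a hypothetical job_id key, and an in-memory SQLite database is used only because it ships with Python (the comment does not name the target database).

```python
import sqlite3

# Hypothetical, trimmed-down mr_job table. Only the four count columns come
# from the comment above; job_id is an illustrative key.
DDL = """
CREATE TABLE mr_job (
    job_id           VARCHAR(80) PRIMARY KEY,
    finished_maps    BIGINT DEFAULT 0,
    finished_reduces BIGINT DEFAULT 0,
    failed_maps      BIGINT DEFAULT 0,
    failed_reduces   BIGINT DEFAULT 0
)
"""


def create_and_probe():
    """Create the table, insert a row with only the key, and read the counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.execute("INSERT INTO mr_job (job_id) VALUES (?)", ("job_example_0001",))
    row = conn.execute(
        "SELECT finished_maps, finished_reduces, failed_maps, failed_reduces "
        "FROM mr_job"
    ).fetchone()
    conn.close()
    return row


print(create_and_probe())
```

A row inserted without any counts picks up the default of 0 in each of the four columns.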

          eyang Eric Yang added a comment -

          I just committed this, thanks Cheng.

          zhangyongjiang Cheng added a comment -

          +1 looks good.

          eyang Eric Yang added a comment -

          This issue contains the patch for CHUKWA-127.

          eyang Eric Yang added a comment -

          Added MDL dictionary to collect finished_maps, finished_reduces, failed_maps, failed_reduces.


            People

            • Assignee:
              eyang Eric Yang
              Reporter:
              eyang Eric Yang
            • Votes:
              0
              Watchers:
              1
