Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels:
      None

      Description

      Typically a user submits jobs with similar characteristics. Aggregating the following metrics by user can help
      quickly identify VIP users and what their jobs look like:

      • slot-hours used for map tasks, for reduce tasks
      • total jobs, jobs failed
      • data-local-maps, rack-local-maps, remote-maps
      • total map-input-bytes, reduce-output-records
      • total map tasks, total reduce tasks

      The granularity of the aggregation can be as coarse as daily. Data may be used to report top-K users in certain
      categories. Data shall be available as chukwa records (namely, one record per day per user).
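
The daily per-user aggregation described above can be sketched as follows. This is only an illustration in Python, not the Pig script attached to this issue; the input layout (one dict per job with `day`, `user`, and `failed` fields) and the `jobsFailed` counter name are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Metrics that are simply summed per user per day (names follow the issue text).
SUMMED_FIELDS = (
    "mapSlotHours", "reduceSlotHours",
    "dataLocalMaps", "rackLocalMaps", "remoteMaps",
    "mapInputBytes", "reduceOutputRecords",
    "totalMaps", "totalReduces",
)


def summarize_by_user_and_day(jobs):
    """Return {(day, user): summary} with one summary per day per user."""
    summaries = defaultdict(Counter)
    for job in jobs:
        summary = summaries[(job["day"], job["user"])]
        summary["totalJobs"] += 1
        # "jobsFailed" is a hypothetical key name for the failed-job count.
        summary["jobsFailed"] += 1 if job.get("failed") else 0
        for field in SUMMED_FIELDS:
            summary[field] += job.get(field, 0)
    return {key: dict(value) for key, value in summaries.items()}


def top_k_users(summaries, day, metric, k):
    """Return the k users with the highest value of `metric` on `day`."""
    rows = [(user, s[metric]) for (d, user), s in summaries.items() if d == day]
    return sorted(rows, key=lambda row: row[1], reverse=True)[:k]
```

Keying the summaries by (day, user) mirrors the "one record per day per user" requirement, and the top-K report becomes a sort over a single day's summaries.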

      1. chukwa-253.2.patch
        9 kB
        Cheng
      2. chukwa-253.3.patch
        13 kB
        Cheng
      3. chukwa-253.5.patch
        4 kB
        Cheng
      4. chukwa-253.patch
        6 kB
        Cheng
      5. chukwa-253-1.patch
        8 kB
        Cheng
      6. CHUKWA-253-4.patch
        4 kB
        Eric Yang

        Activity

        zhangyongjiang Cheng added a comment -

        Patch submitted.

        • The Pig script is at chukwa-home/script/pig.
        • The shell script for the cron job is at chukwa-home/bin. To run the shell script manually, use the command
          /path/to/chukwa-home/bin/UserDailySummary.sh <YYYYMMDD> <CLUSTER>
        zhangyongjiang Cheng added a comment -

        This patch depends on the patch for chukwa-20.

        zhangyongjiang Cheng added a comment -

        Compared with the previous patch:

        • Changed file name format
        • Use CHUKWA_IDENT_STRING for the cluster name
        zhangyongjiang Cheng added a comment -

        Just talked to Eric. CHUKWA_IDENT_STRING might not be the right source for the cluster name. Also, each Chukwa instance could monitor more than one cluster. We may need a conf file to specify all clusters.

        jboulon Jerome Boulon added a comment -

        The cluster should be specified manually. Also keep in mind that you can use a set of parameter files for the same query.
        That way, to add another cluster you only have to drop in one additional parameter file.
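
One way to picture the parameter-file approach is a small driver that runs the same query once per file. The sketch below is an assumption-laden illustration, not part of any patch here: the `*.params` naming convention and the script name passed in are hypothetical, and it relies on Pig's `-param_file` command-line option for parameter substitution.

```python
from pathlib import Path


def build_pig_commands(param_dir, script):
    """Build one Pig invocation per parameter file found in param_dir.

    Each cluster is described by its own parameter file, so adding a
    cluster only requires dropping another file into the directory.
    """
    commands = []
    for param_file in sorted(Path(param_dir).glob("*.params")):
        commands.append(["pig", "-param_file", str(param_file), str(script)])
    return commands
```

The returned lists can be handed to `subprocess.run` one at a time; building them separately keeps the sketch easy to inspect without a Pig installation.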

        zhangyongjiang Cheng added a comment -

        New patch submitted. Users can specify the cluster name, date, jobfile, and taskfile from the command line. All the parameters have default values.
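
For illustration only, command-line handling of this kind might look like the Python sketch below. The flag names and every default value shown are hypothetical; the patch's actual shell script and its defaults are not reproduced here.

```python
import argparse
from datetime import date, datetime, timedelta


def yyyymmdd(value):
    """Validate that a string is a calendar date in YYYYMMDD form."""
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected YYYYMMDD, got {value!r}")
    return value


def parse_args(argv=None, today=None):
    """Parse the four parameters; every one of them has a default."""
    today = today or date.today()
    # Assumed default: summarize the previous day.
    default_day = (today - timedelta(days=1)).strftime("%Y%m%d")
    parser = argparse.ArgumentParser(description="Daily per-user summary")
    parser.add_argument("--cluster", default="default-cluster")  # hypothetical default
    parser.add_argument("--date", default=default_day, type=yyyymmdd)
    parser.add_argument("--jobfile", default="jobs.evt")    # hypothetical default
    parser.add_argument("--taskfile", default="tasks.evt")  # hypothetical default
    return parser.parse_args(argv)
```

Calling `parse_args([])` yields all defaults, while `parse_args(["--cluster", "demo", "--date", "20090430"])` overrides only the two named values.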

        zhangyongjiang Cheng added a comment -

        Record type is UserDailySummary

        ChukwaRecord keys:
        ts (timestamp),
        user,
        totalJobs,
        dataLocalMaps,
        rackLocalMaps,
        remoteMaps,
        mapInputBytes,
        reduceOutputRecords,
        mapSlotHours,
        reduceSlotHours,
        totalMaps,
        totalReduces;
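
As a rough picture of that record layout, the sketch below models one UserDailySummary record as a plain Python dict. It does not use Chukwa's Java ChukwaRecord class; the `build_record` helper and the choice to default missing metrics to 0 are assumptions made for the example.

```python
RECORD_TYPE = "UserDailySummary"

# Key names exactly as listed in the comment above.
RECORD_KEYS = (
    "ts", "user", "totalJobs",
    "dataLocalMaps", "rackLocalMaps", "remoteMaps",
    "mapInputBytes", "reduceOutputRecords",
    "mapSlotHours", "reduceSlotHours",
    "totalMaps", "totalReduces",
)


def build_record(ts, user, **metrics):
    """Return a dict holding every listed key, rejecting unknown ones."""
    unknown = set(metrics) - set(RECORD_KEYS)
    if unknown:
        raise KeyError(f"unknown {RECORD_TYPE} keys: {sorted(unknown)}")
    record = {key: 0 for key in RECORD_KEYS}  # assumed default for missing metrics
    record.update(metrics)
    record["ts"] = ts
    record["user"] = user
    return record
```

Rejecting unknown keys catches misspelled metric names early, before a record reaches any downstream loader.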

        zhangyongjiang Cheng added a comment -

        Added mdl support to load data into database.

        eyang Eric Yang added a comment -

        +1 looks good.

        eyang Eric Yang added a comment -

        I just committed this, thanks Cheng.

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #49 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/49/).
        Added aggregations by user. (Cheng Zhang via Eric Yang)

        eyang Eric Yang added a comment -

        The aggregator.sql contains incorrect syntax.

        eyang Eric Yang added a comment -

        Fixed the GROUP BY bracket.

        zhangyongjiang Cheng added a comment -

        +1 Looks good and thank you.

        eyang Eric Yang added a comment -

        Thanks Cheng, I just committed this.

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #53 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/53/)

        eyang Eric Yang added a comment -

        SQL statement is still not right.

        zhangyongjiang Cheng added a comment -

        Fixed the syntax error.

        eyang Eric Yang added a comment -

        +1 looks good. The new syntax works on my machine.

        eyang Eric Yang added a comment -

        I just committed this to both branch and trunk. Thanks Cheng.

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #55 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/55/).
        Updated SQL statement with proper column names and brackets. (Cheng Zhang via Eric Yang)


          People

          • Assignee:
            zhangyongjiang Cheng
            Reporter:
            zhangyongjiang Cheng
