
Key: HADOOP-5733
Type: Improvement
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Sreekanth Ramakrishnan
Reporter: Hong Tang
Votes: 0
Watchers: 4
Hadoop Common

Add map/reduce slot capacity and lost map/reduce slot capacity to JobTracker metrics

Created: 23/Apr/09 06:58 PM   Updated: 08/Jul/09 04:53 PM
Component/s: metrics
Affects Version/s: None
Fix Version/s: 0.21.0

Time Tracking:
Not Specified

File Attachments (text files, licensed for inclusion in ASF works):
  • hadoop-5733-1.patch (7 kB), 2009-04-29 09:06 AM, Sreekanth Ramakrishnan
  • hadoop-5733-2.patch (6 kB), 2009-05-04 06:51 AM, Sreekanth Ramakrishnan
  • hadoop-5733-3.patch (6 kB), 2009-05-04 09:57 AM, Sreekanth Ramakrishnan
  • hadoop-5733-4.patch (6 kB), 2009-05-04 10:11 PM, Chris Douglas
  • hadoop-5733-v20.patch (6 kB), 2009-05-28 11:32 PM, Robert Chansler

Hadoop Flags: Reviewed
Resolution Date: 04/May/09 10:13 PM


Description
It would be nice to have the actual map/reduce slot capacity and the lost map/reduce slot capacity (number of blacklisted nodes * map slots per node, or * reduce slots per node) in the JobTracker metrics. This information can be used to calculate the JobTracker's (JT) view of slot utilization.
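
As a rough illustration of the arithmetic in the description, the sketch below computes lost capacity and a utilization ratio. The class and method names are hypothetical and not part of Hadoop. It assumes every blacklisted node has the same number of slots, and it takes the usable capacity as an explicit argument because the description does not say whether the reported capacity already excludes blacklisted slots.

```java
// Hypothetical helper for illustration only; none of these names exist in Hadoop.
final class SlotUtilization {

    /** Lost capacity = blacklisted nodes * slots per node (assumes uniform slots per node). */
    static int lostSlots(int blacklistedNodes, int slotsPerNode) {
        return blacklistedNodes * slotsPerNode;
    }

    /** Fraction of usable slots currently occupied; 0.0 when there is no usable capacity. */
    static double utilization(int occupiedSlots, int usableSlots) {
        return usableSlots <= 0 ? 0.0 : (double) occupiedSlots / usableSlots;
    }

    public static void main(String[] args) {
        int nominalMapSlots = 100;          // assumed nominal capacity, blacklisted slots included
        int lost = lostSlots(3, 4);         // 3 blacklisted nodes with 4 map slots each
        int usable = nominalMapSlots - lost;
        System.out.println("lost map slots = " + lost);                 // prints 12
        System.out.println("utilization = " + utilization(66, usable)); // prints 0.75
    }
}
```

If the published capacity already excludes blacklisted slots, the subtraction in `main` would be dropped and the metric passed to `utilization` directly.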

Sreekanth Ramakrishnan added a comment - 29/Apr/09 09:06 AM
Attaching a patch addressing this issue.

Added the following new fields:

  • map_slots : Number of map slots in the cluster.
  • reduce_slots : Number of reduce slots in the cluster.
  • blacklisted_maps : Number of map slots blacklisted.
  • blacklisted_reduces : Number of reduce slots blacklisted.

Made changes in JobTracker to publish these metrics.


Chris Douglas added a comment - 04/May/09 05:58 AM
Looks good

For the map/reduce slots:

  • Instead of {add,dec}*Slots, consider adding set*Slots to the instrumentation and updating it with total*TaskCapacity (use MetricsRecord::setMetric).
  • Updates can occur outside the synchronized block in addHostCapacity and removeHostCapacity. With get/set, the field in the metrics can be volatile and updated without synchronizing on the instrumentation.

For the blacklisted slots:

  • The add/dec methods should be synchronized; there's a race condition with doUpdate
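
The distinction in this review can be sketched as follows. This is a minimal, self-contained illustration, not the code from the patch: the class and its method names are assumptions. The slot total is a single overwritten value, so a volatile write is enough; the blacklisted count is accumulated as deltas that a periodic update reads and resets, so add/dec and the read-and-reset step must share a lock or an increment could be lost between the read and the reset.

```java
// Illustrative sketch only; names are hypothetical and not taken from the patch.
final class SlotMetricsSketch {
    // Set-style value: each write replaces the previous one, so volatile suffices.
    private volatile int mapSlots;
    // Delta accumulated between updates; guarded by 'this'.
    private int pendingBlacklistedMaps;

    void setMapSlots(int total) { mapSlots = total; }
    int getMapSlots() { return mapSlots; }

    synchronized void addBlacklistedMapSlots(int n) { pendingBlacklistedMaps += n; }
    synchronized void decBlacklistedMapSlots(int n) { pendingBlacklistedMaps -= n; }

    /** Reads and resets the delta atomically, as a periodic update would. */
    synchronized int drainBlacklistedMapSlots() {
        int delta = pendingBlacklistedMaps;
        pendingBlacklistedMaps = 0;
        return delta;
    }

    public static void main(String[] args) throws InterruptedException {
        SlotMetricsSketch m = new SlotMetricsSketch();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                m.addBlacklistedMapSlots(1);
            }
        });
        writer.start();
        int published = 0;
        while (writer.isAlive()) {
            published += m.drainBlacklistedMapSlots(); // concurrent read-and-reset
        }
        writer.join();
        published += m.drainBlacklistedMapSlots();     // pick up any remainder
        System.out.println(published);                 // prints 100000
    }
}
```

Without `synchronized` on the add/dec methods, an increment landing between the read and the reset in `drainBlacklistedMapSlots` could be overwritten by the reset and never published.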

Sreekanth Ramakrishnan added a comment - 04/May/09 06:51 AM
Attaching a patch incorporating the comments:
  • Changed the map slot and reduce slot metrics from incrMetric to setMetric.
  • Changed the fields holding map slots and reduce slots to volatile, so the setters need not be synchronized.
  • The map and reduce slot counts are set in updateTaskTrackerStatus in JobTracker.
  • The setters for blacklisted slots have been made synchronized.

Chris Douglas added a comment - 04/May/09 08:26 AM
  • When using setMetric, doUpdates shouldn't reset the metric to 0
  • set*Slots doesn't need to be adjusted in add/removeHostCapacity, as in the original patch?

Sreekanth Ramakrishnan added a comment - 04/May/09 09:57 AM
  • The metrics fields are no longer reset during doUpdates.
  • Setting the slots from add/removeHostCapacity has been removed. In the previous patch the map and reduce slots were incremental fields, so they were adjusted whenever capacity was added or removed. Now that they are set rather than incremented, they are set whenever the TT statuses have updated the JT's internal capacity fields. The blacklisted slot counts, however, are still incremented/decremented in add/removeHostCapacity when a tracker is marked blacklisted.
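
The reset question discussed here can be illustrated with a small sketch. `FakeRecord` is a hypothetical stand-in that only mimics set and increment semantics in a simplified way; it is not the Hadoop MetricsRecord class, and the other names are assumptions as well. A set-style value is republished unchanged on every cycle and must not be zeroed, while an increment-style delta must be zeroed after each push or it would be counted again on the next cycle.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical.
final class UpdateCycleSketch {

    /** Simplified stand-in for a metrics record with set and increment semantics. */
    static final class FakeRecord {
        private final Map<String, Integer> values = new HashMap<>();
        void setMetric(String name, int value) { values.put(name, value); }
        void incrMetric(String name, int delta) { values.merge(name, delta, Integer::sum); }
        int get(String name) { return values.getOrDefault(name, 0); }
    }

    private volatile int mapSlots;      // set-style value, never reset by doUpdates
    private int blacklistedMapsDelta;   // increment-style delta, guarded by 'this'

    void setMapSlots(int total) { mapSlots = total; }
    synchronized void addBlacklistedMapSlots(int n) { blacklistedMapsDelta += n; }

    /** One periodic update cycle. */
    synchronized void doUpdates(FakeRecord record) {
        record.setMetric("map_slots", mapSlots);                     // republish current total
        record.incrMetric("blacklisted_maps", blacklistedMapsDelta); // publish delta since last cycle
        blacklistedMapsDelta = 0;                                    // reset so it is not re-added
    }

    public static void main(String[] args) {
        FakeRecord record = new FakeRecord();
        UpdateCycleSketch metrics = new UpdateCycleSketch();
        metrics.setMapSlots(40);
        metrics.addBlacklistedMapSlots(4);
        metrics.doUpdates(record);
        metrics.doUpdates(record); // second cycle with no new events
        System.out.println(record.get("map_slots"));        // prints 40
        System.out.println(record.get("blacklisted_maps")); // prints 4
    }
}
```

Had `doUpdates` zeroed `mapSlots`, the second cycle would have published 0 until the next status update arrived; had it not zeroed the delta, the second cycle would have published 8 blacklisted slots.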

Chris Douglas added a comment - 04/May/09 10:11 PM
Merged with trunk, since the patch conflicted with HADOOP-5738. Also re-added the blacklist metric resets to doUpdates, since incrMetric semantics still require them there.
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

Chris Douglas added a comment - 04/May/09 10:13 PM
I committed this. Thanks, Sreekanth

Chris Douglas added a comment - 04/May/09 10:51 PM

Setting the slots from add/removeHostCapacity has been removed. In the previous patch the map and reduce slots were incremental fields, so they were adjusted whenever capacity was added or removed. Now that they are set rather than incremented, they are set whenever the TT statuses have updated the JT's internal capacity fields. The blacklisted slot counts, however, are still incremented/decremented in add/removeHostCapacity when a tracker is marked blacklisted.

Sorry, I forgot to acknowledge this. Thanks for the explanation.


Hudson added a comment - 05/May/09 07:06 PM
Integrated in Hadoop-trunk #827 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/827/)
Add map/reduce slot capacity and blacklisted capacity to JobTracker metrics. Contributed by Sreekanth Ramakrishnan

Robert Chansler added a comment - 28/May/09 11:32 PM
Attached an example patch for an earlier version; not to be committed.