Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-679

XML-based metrics as JSP servlet for JobTracker

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: jobtracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added XML-based JobTracker status JSP page for metrics reporting

      Description

      In HADOOP-4559, a general REST API for reporting metrics was proposed but work seems to have stalled. In the interim, we have a simple XML translation of the existing JobTracker status page which provides the same metrics (including the tables of running/completed/failed jobs) as the human-readable page. This is a relatively lightweight addition to provide some machine-understandable metrics reporting.

      1. MAPREDUCE-679.patch
        5 kB
        Aaron Kimball
      2. MAPREDUCE-679.7.patch
        16 kB
        Aaron Kimball
      3. MAPREDUCE-679.6.patch
        16 kB
        Aaron Kimball
      4. MAPREDUCE-679.5.patch
        16 kB
        Aaron Kimball
      5. MAPREDUCE-679.4.patch
        11 kB
        Aaron Kimball
      6. MAPREDUCE-679.3.patch
        11 kB
        Aaron Kimball
      7. MAPREDUCE-679.2.patch
        10 kB
        Aaron Kimball
      8. example-jobtracker-running-job.xml
        1 kB
        Aaron Kimball
      9. example-jobtracker-completed-job.xml
        1 kB
        Aaron Kimball

        Issue Links

          Activity

          Hide
          Aaron Kimball added a comment -

          Initial version of this patch

          Show
          Aaron Kimball added a comment - Initial version of this patch
          Hide
          Steve Loughran added a comment -

          What is your test plan for this? I'd suggest something involving html unit while jobs are pushed up

          Show
          Steve Loughran added a comment - What is your test plan for this? I'd suggest something involving html unit while jobs are pushed up
          Hide
          Aaron Kimball added a comment -

          htmlunit isn't the right choice here since I'm not generating HTML. I'll add some tests that start a MiniMRCluster, perform an HTTP GET to retrieve the page, and check for well-formedness of XML results.

          Show
          Aaron Kimball added a comment - htmlunit isn't the right choice here since I'm not generating HTML. I'll add some tests that start a MiniMRCluster, perform an HTTP GET to retrieve the page, and check for well-formedness of XML results.
          Hide
          Aaron Kimball added a comment -

          New patch available. Per a code-review comment on HDFS-453, I've split the helper methods into a separate static class. Also added a test case which starts a MiniMRCluster and retrieves the JSP page from the appropriate URL. It uses Java's built-in SAX parser to parse the page, assure it is well-formed, and that it contains exactly one <cluster>..</cluster> block which defines the page of data.

          Show
          Aaron Kimball added a comment - New patch available. Per a code-review comment on HDFS-453 , I've split the helper methods into a separate static class. Also added a test case which starts a MiniMRCluster and retrieves the JSP page from the appropriate URL. It uses Java's built-in SAX parser to parse the page, assure it is well-formed, and that it contains exactly one <cluster>..</cluster> block which defines the page of data.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12412525/MAPREDUCE-679.2.patch
          against trunk revision 790971.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12412525/MAPREDUCE-679.2.patch against trunk revision 790971. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/351/console This message is automatically generated.
          Hide
          Aaron Kimball added a comment -

          Test failures are all unrelated.

          Show
          Aaron Kimball added a comment - Test failures are all unrelated.
          Hide
          Steve Loughran added a comment -

          Same comments as for HDFS-453

          1. a JSPX page may make it easier for generating XML; especially for ensuring there isn't any whitespace ahead of the xml header declaration, or bad xml inside it.
          2. caching needs to be turned off here and in all JSP pages as per HDFS-91
          3. your generateJobTable code can/should iterate as {{for(JobInProgress job:jobs) instead of Object

          Can you attach some example XML, I'd like to look at it a bit more too, see how well it would for for XSL/Xpath navigation

          Show
          Steve Loughran added a comment - Same comments as for HDFS-453 a JSPX page may make it easier for generating XML; especially for ensuring there isn't any whitespace ahead of the xml header declaration, or bad xml inside it. caching needs to be turned off here and in all JSP pages as per HDFS-91 your generateJobTable code can/should iterate as {{for(JobInProgress job:jobs) instead of Object Can you attach some example XML, I'd like to look at it a bit more too, see how well it would for for XSL/Xpath navigation
          Hide
          Aaron Kimball added a comment -

          Attaching example xml outputs

          Show
          Aaron Kimball added a comment - Attaching example xml outputs
          Hide
          Aaron Kimball added a comment -

          Attaching new patch

          • moves /jobtracker.xml.jsp to /jobtracker.jspx
          • updates test to go along with it
          • updates build.xml to fix bug in webapp compilation.
          Show
          Aaron Kimball added a comment - Attaching new patch moves /jobtracker.xml.jsp to /jobtracker.jspx updates test to go along with it updates build.xml to fix bug in webapp compilation.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417204/MAPREDUCE-679.3.patch
          against trunk revision 806408.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417204/MAPREDUCE-679.3.patch against trunk revision 806408. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/500/console This message is automatically generated.
          Hide
          Aaron Kimball added a comment -

          2 findbugs warnings (dead stores) are in code auto-generated by JspC. I don't think there's anything I can do about that.

          The failing unit tests don't seem related.

          Adding this patch, now that it uses jspx, adds the following warning:

          [jsp-compile] 09/08/21 14:30:00 WARN compiler.TldLocationsCache: Internal Error: File /WEB-INF/web.xml not found
          

          This is confusing, since it seems like JspC auto-generates the correct web.xml anyway. Problematic?

          Show
          Aaron Kimball added a comment - 2 findbugs warnings (dead stores) are in code auto-generated by JspC. I don't think there's anything I can do about that. The failing unit tests don't seem related. Adding this patch, now that it uses jspx, adds the following warning: [jsp-compile] 09/08/21 14:30:00 WARN compiler.TldLocationsCache: Internal Error: File /WEB-INF/web.xml not found This is confusing, since it seems like JspC auto-generates the correct web.xml anyway. Problematic?
          Hide
          Steve Loughran added a comment -

          TldLocations cache is some cache for globally defined taglibs

          http://tomcat.apache.org/tomcat-5.5-doc/jasper/docs/api/org/apache/jasper/compiler/TldLocationsCache.html

          source is here:
          http://svn.apache.org/repos/asf/tomcat/tc6.0.x/trunk/java/org/apache/jasper/compiler/TldLocationsCache.java

          Looking at the source, the message comes from {{

          {processWebDotXml()}

          }}; it doesnt do any harm, except that it doesnt bother parsing any web.xml -defined content if web.xml is nowhere to be found. Its a warning, not an error.

          There is a servlet context property, org.apache.catalina.deploy.alt_dd, which can be used to identify an alternate deployment descriptor, but I have no idea how to set that from command line jspc.

          Recommendation: ignore the warning.

          Show
          Steve Loughran added a comment - TldLocations cache is some cache for globally defined taglibs http://tomcat.apache.org/tomcat-5.5-doc/jasper/docs/api/org/apache/jasper/compiler/TldLocationsCache.html source is here: http://svn.apache.org/repos/asf/tomcat/tc6.0.x/trunk/java/org/apache/jasper/compiler/TldLocationsCache.java Looking at the source, the message comes from {{ {processWebDotXml()} }}; it doesnt do any harm, except that it doesnt bother parsing any web.xml -defined content if web.xml is nowhere to be found. Its a warning, not an error. There is a servlet context property, org.apache.catalina.deploy.alt_dd, which can be used to identify an alternate deployment descriptor, but I have no idea how to set that from command line jspc. Recommendation: ignore the warning.
          Hide
          Aaron Kimball added a comment -

          Good enough. Is this ready to be committed?

          Show
          Aaron Kimball added a comment - Good enough. Is this ready to be committed?
          Hide
          Chris Douglas added a comment -

          The code looks reasonable, save a really minor nit on semicolons:

          +            JobTrackerJspHelper.generateJobTable(out, "running", tracker.runningJobs());;;;
          

          and DecimalFormat is not threadsafe.

          Adding a new metrics interface requires that we maintain it. Particularly if this is an "interim solution," I hesitate to commit it when these metrics are already exported through existing interfaces. I'm not against adding this, but neither am I particularly enthusiastic about adding yet another metrics API. What is this in support of?

          Show
          Chris Douglas added a comment - The code looks reasonable, save a really minor nit on semicolons: + JobTrackerJspHelper.generateJobTable(out, "running", tracker.runningJobs());;;; and DecimalFormat is not threadsafe. Adding a new metrics interface requires that we maintain it. Particularly if this is an "interim solution," I hesitate to commit it when these metrics are already exported through existing interfaces. I'm not against adding this, but neither am I particularly enthusiastic about adding yet another metrics API. What is this in support of?
          Hide
          Steve Loughran added a comment -

          I actually quite like XML interfaces you can GET as a way of seeing what's going on

          • people can read them
          • you can directly import them into any spreadsheet that can GET+Xpath a document, including google spreadsheets (if you were to expose your cluster to the world)
          • code can get then XSLT them into useful forms
          • if the error code is meaningful, you can have routers keep an eye on them. For that to work, you need the ability to set limits in the report (eg. have a query param that specifies the minimum #of nodes you want for the cluster to be considered functional)

          And, unlike every other JSP page in the codebase, it has a test. This will reduce maintenance costs, as future patches are less likely to break things.

          Show
          Steve Loughran added a comment - I actually quite like XML interfaces you can GET as a way of seeing what's going on people can read them you can directly import them into any spreadsheet that can GET+Xpath a document, including google spreadsheets (if you were to expose your cluster to the world) code can get then XSLT them into useful forms if the error code is meaningful, you can have routers keep an eye on them. For that to work, you need the ability to set limits in the report (eg. have a query param that specifies the minimum #of nodes you want for the cluster to be considered functional) And, unlike every other JSP page in the codebase, it has a test. This will reduce maintenance costs, as future patches are less likely to break things.
          Hide
          Chris Douglas added a comment -

          I'm not arguing against exporting XML APIs- we already do this for listing paths in the NameNode- but I'd like to have a use case before we commit to supporting an interface. The aforementioned is an example of an interim, XML interface: listing for HftpFileSystem, which was added for cross-version copying while we work on better protocols. Which is why I ask: what is this for? "XML is widely supported and we have a unit test" is answering a different question.

          Show
          Chris Douglas added a comment - I'm not arguing against exporting XML APIs- we already do this for listing paths in the NameNode- but I'd like to have a use case before we commit to supporting an interface. The aforementioned is an example of an interim, XML interface: listing for HftpFileSystem, which was added for cross-version copying while we work on better protocols. Which is why I ask: what is this for? "XML is widely supported and we have a unit test" is answering a different question.
          Hide
          Aaron Kimball added a comment -

          We have clients who want to integrate some basic jobtracker/namenode monitoring and jobtracker history into external metrics-tracking applications. Scraping the HTML off the existing JSP's doesn't cut it – XML representations of them are considerably more sane. Earlier versions of this patch are already in one-off deployments.

          As for the "interim" nature of this – Really, I think the interim is here to stay. HADOOP-4559 has not had any attention since early February, and that was just a one-off comment. No substantive dialogue there since Dec '08.

          Show
          Aaron Kimball added a comment - We have clients who want to integrate some basic jobtracker/namenode monitoring and jobtracker history into external metrics-tracking applications. Scraping the HTML off the existing JSP's doesn't cut it – XML representations of them are considerably more sane. Earlier versions of this patch are already in one-off deployments. As for the "interim" nature of this – Really, I think the interim is here to stay. HADOOP-4559 has not had any attention since early February, and that was just a one-off comment. No substantive dialogue there since Dec '08.
          Hide
          Chris Douglas added a comment -

          Scraping HTML is obviously a poor choice, but there's already a metrics API and several implementations. This would exist outside of that framework and report (IIRC) a subset of the available metrics. Is there some aspect of that API that makes it unsuitable?

          Show
          Chris Douglas added a comment - Scraping HTML is obviously a poor choice, but there's already a metrics API and several implementations. This would exist outside of that framework and report (IIRC) a subset of the available metrics. Is there some aspect of that API that makes it unsuitable?
          Hide
          Aaron Kimball added a comment -

          Chris, to what API are you referring? No API reports job history in a universally-readable manner (e.g., REST + XML).

          Show
          Aaron Kimball added a comment - Chris, to what API are you referring? No API reports job history in a universally-readable manner (e.g., REST + XML).
          Hide
          Aaron Kimball added a comment -

          Attaching new patch. Fixes formatting issue. Also makes JobTrackerJspHelper non-static to avoid thread-safety issue with DecimalFormat.

          Show
          Aaron Kimball added a comment - Attaching new patch. Fixes formatting issue. Also makes JobTrackerJspHelper non-static to avoid thread-safety issue with DecimalFormat.
          Hide
          Chris Douglas added a comment -

          Chris, to what API are you referring?

          o.a.h.metrics and o.a.h.mapred.JobTrackerInstrumentation (JobTrackerMetricsInst). Ganglia, Chukwa, etc. push metrics through this interface. Adding a new interface isn't a small change. For example, the current patch calls JobTracker::runningJobs rather than JobTracker::getRunningJobs; the former is unsynchronized and accesses shared data structures. There have been similar issues caused by metrics frameworks attempting to pull information out of the JobTracker without an audit of the (regrettable) lock hierarchy. If this receives its data through the metrics API, then neither you nor maintainers need to consider how this servlet affects the shared JT data.

          No API reports job history in a universally-readable manner (e.g., REST + XML).

          That's fair. Again, I like machine-readable formats and wouldn't object to this even if it were an interim solution, but I want to be clear about how it gets its data and what it supports, since we'll be committing ourselves to maintaining both.

          Would JobHistory be a better home for this? Much of the data are not currently available in the web UI, let alone in a reasonable format.

          Show
          Chris Douglas added a comment - Chris, to what API are you referring? o.a.h.metrics and o.a.h.mapred.JobTrackerInstrumentation (JobTrackerMetricsInst). Ganglia, Chukwa, etc. push metrics through this interface. Adding a new interface isn't a small change. For example, the current patch calls JobTracker::runningJobs rather than JobTracker::getRunningJobs ; the former is unsynchronized and accesses shared data structures. There have been similar issues caused by metrics frameworks attempting to pull information out of the JobTracker without an audit of the (regrettable) lock hierarchy. If this receives its data through the metrics API, then neither you nor maintainers need to consider how this servlet affects the shared JT data. No API reports job history in a universally-readable manner (e.g., REST + XML). That's fair. Again, I like machine-readable formats and wouldn't object to this even if it were an interim solution, but I want to be clear about how it gets its data and what it supports, since we'll be committing ourselves to maintaining both. Would JobHistory be a better home for this? Much of the data are not currently available in the web UI, let alone in a reasonable format.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12418040/MAPREDUCE-679.4.patch
          against trunk revision 808730.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418040/MAPREDUCE-679.4.patch against trunk revision 808730. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/538/console This message is automatically generated.
          Hide
          Chris Douglas added a comment -

          I'd assumed that this was just pushing metrics, but looking more closely at the patch, it's writing per-job statistics. My comments about using the metrics APIs are invalid.

          The unsynchronized calls into the JT should be changed, though it looks like the jobtracker jsp does the same thing. sigh This may also be affected by MAPREDUCE-870.

          Show
          Chris Douglas added a comment - I'd assumed that this was just pushing metrics, but looking more closely at the patch, it's writing per-job statistics. My comments about using the metrics APIs are invalid. The unsynchronized calls into the JT should be changed, though it looks like the jobtracker jsp does the same thing. sigh This may also be affected by MAPREDUCE-870 .
          Hide
          Aaron Kimball added a comment -

          As you note, this API exposes nothing that's not already part of jobtracker.jsp. I don't see this as a fundamentally new API, just a new presentation layer for existing data. The same exact hooks are used in both.

          I'm attaching a new patch with a thread-safe call to getRunningJobs() instead of runningJobs(). I changed jobtracker.jsp to do the same; also added getCompletedJobs() and getFailedJobs() that use the same synchronzation discipline as getRunningJobs(). Finally, I added a "retired jobs" section that conforms to the API of MAPREDUCE-870.

          Tested all of this locally by running jobs and checking the output XML for validation.

          Show
          Aaron Kimball added a comment - As you note, this API exposes nothing that's not already part of jobtracker.jsp. I don't see this as a fundamentally new API, just a new presentation layer for existing data. The same exact hooks are used in both. I'm attaching a new patch with a thread-safe call to getRunningJobs() instead of runningJobs(). I changed jobtracker.jsp to do the same; also added getCompletedJobs() and getFailedJobs() that use the same synchronzation discipline as getRunningJobs(). Finally, I added a "retired jobs" section that conforms to the API of MAPREDUCE-870 . Tested all of this locally by running jobs and checking the output XML for validation.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419112/MAPREDUCE-679.5.patch
          against trunk revision 813140.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12419112/MAPREDUCE-679.5.patch against trunk revision 813140. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/21/console This message is automatically generated.
          Hide
          Aaron Kimball added a comment -

          The new test failure is unrelated.

          Show
          Aaron Kimball added a comment - The new test failure is unrelated.
          Hide
          Chris Douglas added a comment -
          • This adds generateRetiredJobXml to JSPUtil to use the package-private "mechanism" for obtaining retired jobs from generateRetiredJobTable, right? I suppose it's unfair to hold this to a higher standard than the tracker page, but building a buffer of up to 100 jobs in XML before writing it out is not ideal. Could this be a little more sparing of memory by passing the Writer to the utility function?
          • It may not matter given the preceding, but StringBuilder is preferred over StringBuffer in a single-threaded context
          • This should use a real XML writer instead of writing the tags as string literals. xmlenc is used in some of the existing code. It might be better- since the data are all key=value pairs not nested structures- if these were expressed less verbosely. On reflection, wouldn't it make sense to just let each item in a <%s_jobs> be a job instead of a %s_job? e.g.
            <running_jobs>
              <job jobid="job_xxx_0003" user="fuser" name="word count" map_complete="8" ...>
              <job jobid="job_xxx_0005" user="faser" name="word count" map_complete="6" ...>
            </running_jobs>
            
          Show
          Chris Douglas added a comment - This adds generateRetiredJobXml to JSPUtil to use the package-private "mechanism" for obtaining retired jobs from generateRetiredJobTable , right? I suppose it's unfair to hold this to a higher standard than the tracker page, but building a buffer of up to 100 jobs in XML before writing it out is not ideal. Could this be a little more sparing of memory by passing the Writer to the utility function? It may not matter given the preceding, but StringBuilder is preferred over StringBuffer in a single-threaded context This should use a real XML writer instead of writing the tags as string literals. xmlenc is used in some of the existing code. It might be better- since the data are all key=value pairs not nested structures- if these were expressed less verbosely. On reflection, wouldn't it make sense to just let each item in a <%s_jobs> be a job instead of a %s_job ? e.g. <running_jobs> <job jobid="job_xxx_0003" user="fuser" name="word count" map_complete="8" ...> <job jobid="job_xxx_0005" user="faser" name="word count" map_complete="6" ...> </running_jobs>
          Hide
          Steve Loughran added a comment -

          I'm not sure if Aaron could use Xmlenc here, as it is biased towards writing whole XML docs (with the XML header) rather than fragments.

          Given that XMLenc is no longer an active project, there is value in having our own classes to chuck out well-formed XML and XHTML though I'm not in a rush to implement this. I'd like to, though -something that escapes XML and makes it hard to embed bits of Javascript into your cluster status reports.

          Show
          Steve Loughran added a comment - I'm not sure if Aaron could use Xmlenc here, as it is biased towards writing whole XML docs (with the XML header) rather than fragments. Given that XMLenc is no longer an active project, there is value in having our own classes to chuck out well-formed XML and XHTML though I'm not in a rush to implement this. I'd like to, though -something that escapes XML and makes it hard to embed bits of Javascript into your cluster status reports.
          Hide
          Chris Douglas added a comment -

          I'm not sure if Aaron could use Xmlenc here, as it is biased towards writing whole XML docs (with the XML header) rather than fragments.

          I don't see what you mean. XMLOutputter doesn't appear to require the declaration.

          Given that XMLenc is no longer an active project, there is value in having our own classes to chuck out well-formed XML and XHTML

          It's a fast, simple XML writer; I'm pleased that it's not getting new features. That said, escaping user strings displayed in the web UI is an excellent idea.

          Show
          Chris Douglas added a comment - I'm not sure if Aaron could use Xmlenc here, as it is biased towards writing whole XML docs (with the XML header) rather than fragments. I don't see what you mean. XMLOutputter doesn't appear to require the declaration. Given that XMLenc is no longer an active project, there is value in having our own classes to chuck out well-formed XML and XHTML It's a fast, simple XML writer; I'm pleased that it's not getting new features. That said, escaping user strings displayed in the web UI is an excellent idea.
          Hide
          Aaron Kimball added a comment -

          I don't see why using some additional XML writing library is necessary. I have already provided a unit test which ensures that the output of this page is well-formed XML. (For what it's worth, none of the other JSPs have any unit tests at all.) Perhaps there is a broader problem of refactoring the existing JSPs to ensure that all generated HTML/XML is well-formed, but that is outside the scope of this issue.

          I'll get rid of the buffered job history and pass in a writer; expect a new patch for that soon.

          Show
          Aaron Kimball added a comment - I don't see why using some additional XML writing library is necessary. I have already provided a unit test which ensures that the output of this page is well-formed XML. (For what it's worth, none of the other JSPs have any unit tests at all.) Perhaps there is a broader problem of refactoring the existing JSPs to ensure that all generated HTML/XML is well-formed, but that is outside the scope of this issue. I'll get rid of the buffered job history and pass in a writer; expect a new patch for that soon.
          Hide
          Aaron Kimball added a comment -

          Fix buffer issue in JSPUtil.java

          Show
          Aaron Kimball added a comment - Fix buffer issue in JSPUtil.java
          Hide
          Doug Cutting added a comment -

          Maybe we should file a separate issue to convert XML-generating JSPs to use an XML generator?

          As a new feature, this patch has no potential for breaking anything existing and is hence seems low-risk and yet provides some valuable functionality that would be good to have in 0.21.

          Show
          Doug Cutting added a comment - Maybe we should file a separate issue to convert XML-generating JSPs to use an XML generator? As a new feature, this patch has no potential for breaking anything existing and is hence seems low-risk and yet provides some valuable functionality that would be good to have in 0.21.
          Hide
          Chris Douglas added a comment -

          +0

          The format is still overly verbose and should add key/value attributes to job tags rather than building nested structures, but if Aaron is unwilling to invest the time then the current patch is sufficient.

          Show
          Chris Douglas added a comment - +0 The format is still overly verbose and should add key/value attributes to job tags rather than building nested structures, but if Aaron is unwilling to invest the time then the current patch is sufficient.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419911/MAPREDUCE-679.6.patch
          against trunk revision 816240.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12419911/MAPREDUCE-679.6.patch against trunk revision 816240. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/100/console This message is automatically generated.
          Hide
          Aaron Kimball added a comment -

          Unrelated test failure

          Show
          Aaron Kimball added a comment - Unrelated test failure
          Hide
          Aaron Kimball added a comment -

          New patch sync'd with trunk

          Show
          Aaron Kimball added a comment - New patch sync'd with trunk
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12420056/MAPREDUCE-679.7.patch
          against trunk revision 816735.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420056/MAPREDUCE-679.7.patch against trunk revision 816735. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/console This message is automatically generated.
          Hide
          Aaron Kimball added a comment -

          unrelated test failure.

          Show
          Aaron Kimball added a comment - unrelated test failure.
          Hide
          Tom White added a comment -

          +1

          I've just committed this. Thanks Aaron!

          Show
          Tom White added a comment - +1 I've just committed this. Thanks Aaron!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/)
          . XML-based metrics as JSP servlet for JobTracker. Contributed by Aaron Kimball.

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/ ) . XML-based metrics as JSP servlet for JobTracker. Contributed by Aaron Kimball.
          Hide
          Steve Loughran added a comment -

          It's worth noting this is the first JSP that has a test alongside. Aaron has now set the baseline for all JSP work going forward: tests are expected.

          Show
          Steve Loughran added a comment - It's worth noting this is the first JSP that has a test alongside. Aaron has now set the baseline for all JSP work going forward: tests are expected.
          Hide
          Konstantin Boudnik added a comment -

          Steve, while it's great to set such a baseline, we need to make sure that we aren't creating new testing infrastructure, incompatible with what we have so far. While JUnit framework isn't an ideal environment in any sense it gives a certain level of automation which is still useful for most of our activities. In my understanding this particular test is manual, right?

          Show
          Konstantin Boudnik added a comment - Steve, while it's great to set such a baseline, we need to make sure that we aren't creating new testing infrastructure, incompatible with what we have so far. While JUnit framework isn't an ideal environment in any sense it gives a certain level of automation which is still useful for most of our activities. In my understanding this particular test is manual, right?
          Hide
          Steve Loughran added a comment -

          Konstantin -no, it's not manual.

          A JUnit test case brings up a MiniMR cluster, constructs the URL "http://localhost:" + infoPort + "/jobtracker.jspx" and then does something very devious - hands off the URL to the XML parser and says "parse this". If the JSPX isnt there or wont run, the HTTP errors get picked up and reported. If the page is there but isn't valid XML, the parser will catch and report that. Very nice indeed. No need for even an extra JAR like HttpUnit or HtmlUnit.

          Show
          Steve Loughran added a comment - Konstantin -no, it's not manual. A JUnit test case brings up a MiniMR cluster, constructs the URL "http://localhost:" + infoPort + "/jobtracker.jspx" and then does something very devious - hands off the URL to the XML parser and says "parse this". If the JSPX isnt there or wont run, the HTTP errors get picked up and reported. If the page is there but isn't valid XML, the parser will catch and report that. Very nice indeed. No need for even an extra JAR like HttpUnit or HtmlUnit.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          I really like the fact that something real is finally done with respect to testing information presented via JSPs. Thanks Aaron!

          But I still have one doubt - how do we keep the jsp and jspx pages in sync? I can think of ways in which we take the XML output as the standard and generate html from it (ala forrest). I can already see this kind of duplication across html/xml in this patch, generateRetiredJobTable() and generateRetiredJobXml() in JSPUtil.java for example. Thoughts about this?

          Show
          Vinod Kumar Vavilapalli added a comment - I really like the fact that something real is finally done with respect to testing information presented via JSPs. Thanks Aaron! But I still have one doubt - how do we keep the jsp and jspx pages in sync? I can think of ways in which we take the XML output as the standard and generate html from it (ala forrest). I can already see this kind of duplication across html/xml in this patch, generateRetiredJobTable() and generateRetiredJobXml() in JSPUtil.java for example. Thoughts about this?
          Hide
          Aaron Kimball added a comment -

          The "most correct" way to do this would be to integrate with some Java web framework's templating engine, which would, given a set of key/value data, either provide HTML or XML or JSON or a dozen other formats from the same source. But that is a big change and not something that's just going to come together overnight. Generating HTML from XML is a reasonable-sounding alternative.

          For now, it's on us to add fields to the JSPX when doing so to the JSP, or vice versa. Suboptimal, for sure.

          Show
          Aaron Kimball added a comment - The "most correct" way to do this would be to integrate with some Java web framework's templating engine, which would, given a set of key/value data, either provide HTML or XML or JSON or a dozen other formats from the same source. But that is a big change and not something that's just going to come together overnight. Generating HTML from XML is a reasonable-sounding alternative. For now, it's on us to add fields to the JSPX when doing so to the JSP, or vice versa. Suboptimal, for sure.
          Hide
          Todd Lipcon added a comment -

          I can think of ways in which we take the XML output as the standard and generate html from it (ala forrest)

          The standard way to do this is to use XSLT. Unfortunately, most everyone I've talked to who has used XSLT for generating web pages has decided it's a giant pain and wished they hadn't

          Show
          Todd Lipcon added a comment - I can think of ways in which we take the XML output as the standard and generate html from it (ala forrest) The standard way to do this is to use XSLT. Unfortunately, most everyone I've talked to who has used XSLT for generating web pages has decided it's a giant pain and wished they hadn't
          Hide
          Steve Loughran added a comment -

          -1 to anything too fancy in the way of content generation
          +1 to adding a couple of XSLs at the root of the webapps, so that XML status pages can be presented in a human readable form.

          Show
          Steve Loughran added a comment - -1 to anything too fancy in the way of content generation +1 to adding a couple of XSLs at the root of the webapps, so that XML status pages can be presented in a human readable form.
          Hide
          Nigel Daley added a comment -

          Aaron Kimball added a comment - 18/Sep/09 02:54 PM
          unrelated test failure.

          Aaron, the test failure seen by Hadoop QA was related to this patch. Looks like you remove an unzip function in build.xml which has caused TestCopyFile to fail now for the past 2+ weeks. MAPREDUCE-1029 has been opened to fix this. Please comment there on why it was removed and please be more careful analyzing failed test cases.

          Show
          Nigel Daley added a comment - Aaron Kimball added a comment - 18/Sep/09 02:54 PM unrelated test failure. Aaron, the test failure seen by Hadoop QA was related to this patch. Looks like you remove an unzip function in build.xml which has caused TestCopyFile to fail now for the past 2+ weeks. MAPREDUCE-1029 has been opened to fix this. Please comment there on why it was removed and please be more careful analyzing failed test cases.
          Hide
          Aaron Kimball added a comment -

          Oops; I did not understand the linkage between these two components at all. Sorry about that. Commented on MAPREDUCE-1029.

          Show
          Aaron Kimball added a comment - Oops; I did not understand the linkage between these two components at all. Sorry about that. Commented on MAPREDUCE-1029 .
          Hide
          Amir Youssefi added a comment -

          What's level of effort to back-port this to 0.20? Are there other JIRAs which this patch is dependent on?

          Show
          Amir Youssefi added a comment - What's level of effort to back-port this to 0.20? Are there other JIRAs which this patch is dependent on?

            People

            • Assignee:
              Aaron Kimball
              Reporter:
              Aaron Kimball
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development