Hadoop Map/Reduce
MAPREDUCE-2399

The embedded web framework for MAPREDUCE-279

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: None
    • Labels:
      None

      Description

      Discuss the web framework which is a part of MAPREDUCE-279.

      1. multi-column-stable-sort-default-theme.png
        299 kB
        Luke Lu
      2. context-sensitive.png
        109 kB
        Luke Lu
      3. capacity-scheduler-dark-theme.png
        192 kB
        Luke Lu
      4. autocompletion-tag-stack.png
        196 kB
        Luke Lu

        Activity

        Luke Lu added a comment -

        I got quite a few requests from otherwise competent engineers to "enhance" Hamlet, all of which turned out to be invalid. One guy sent me a long email complete with an html example (even with a doctype!) for a tooltip widget that has a table contained in a span element. I wondered why he didn't bother to validate the example.

        IMO, html templating is a vestige of scripting language hacks that should not be used when a superior alternative exists.

        Aaron T. Myers added a comment -

        "Ideally I'd also like the same to be reused across HDFS too. Could you file a JIRA when you resolve this, for that effect, and outline some base starter articles/points for anyone who'd like to pick that up?"

        If we were to redo the HDFS web UIs, I'd much rather use a pre-existing templating system that more people know, like Jamon. HDFS doesn't have any of the security issues that MR does with user applications having web interfaces, which was the main motivating factor for developing this framework for MR2.

        Harsh J added a comment -

        I guess there isn't anything to discuss as this has already been merged. Should we just close this out?

        Ideally I'd also like the same to be reused across HDFS too. Could you file a JIRA when you resolve this, for that effect, and outline some base starter articles/points for anyone who'd like to pick that up?

        Luke Lu added a comment -

        Sorry for missing the question.

        "Upon merge to trunk, would all these need to be closed as invalid where applied?"

        Since JT/TT remains as a legacy MR framework in 0.23, these issues are still relevant for people staying on the legacy framework (maybe in a transition period). MAPREDUCE-1720 is not applicable to the v2 MR framework. Others, like the UI for killing/failing tasks, are still applicable as they won't be implemented by the time of the merge.

        Harsh J added a comment -

        Luke - Would like to know how this umbrella issue affects all outstanding UI issues with the former UI (such as MAPREDUCE-1720, for example). Upon merge to trunk, would all these need to be closed as invalid where applied?

        Luke Lu added a comment -

        The attachments demonstrate context-sensitive autocompletion and tag stacks in NetBeans (similar in Eclipse). Also included are some screenshots (started inside an IDE) of the new mapreduce webapps: one demonstrates multi-column stable sort by user name (ascending) and progress (descending), and the other demonstrates the capacity scheduler page in a dark theme. I chose jquery-ui as the css framework (the framework can use any javascript library/framework) to leverage all the available and upcoming themes.
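
        As an illustration only (not taken from the patch or the screenshots), the idea behind a multi-column stable sort can be sketched in Java: sort by the secondary key first, then by the primary key with a stable sort, so that rows with equal user names keep their progress order. The `Row` class and the sample data are hypothetical, and this says nothing about how the webapp itself sorts its tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not code from the YARN webapps.
final class StableSortDemo {
    static final class Row {
        final String user;
        final int progress;
        Row(String user, int progress) { this.user = user; this.progress = progress; }
        @Override public String toString() { return user + ":" + progress; }
    }

    // User name ascending, then progress descending within each user.
    static List<Row> sortByUserThenProgress(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        // Secondary key first: progress, descending.
        sorted.sort(Comparator.comparingInt((Row r) -> r.progress).reversed());
        // Primary key last: the sort is stable, so ties on user keep the order above.
        sorted.sort(Comparator.comparing((Row r) -> r.user));
        return sorted;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("bob", 40), new Row("alice", 10),
            new Row("bob", 90), new Row("alice", 75));
        System.out.println(sortByUserThenProgress(rows));
    }
}
```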

        Luke Lu added a comment -

        We needed a secure, lightweight web framework, embedded in existing daemons, that can leverage all the security configurations in hadoop-common's HttpServer. Most web frameworks assume standalone webapps and want to dictate things like directory layout and configuration. A major portion of existing MVC frameworks is devoted to ORM layers that we don't need.

        The main goals for the framework are security, ease of development (IDE refactor friendly, esp. for typical Hadoop developers who're not professional web developers with tons of plugins installed) and ease of testing (webapp code needs to be fully unit testable at the controller and view level without booting up web servers). This precludes frameworks that require embedding class names in config files (properties or xml), which is arguably a pervasive java anti-pattern.

        It's very easy to accidentally write insecure and thread-unsafe code in existing web frameworks. Anything that uses JSP as the view layer is insecure and thread-unsafe by default (cf. HDFS-1758). In this framework, all views are statically checked by the java compiler and every piece of data is escaped by default. Thread-safety is simplified as controllers and views are request scoped by default.
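
        As an illustration only (this is not the framework's API), the "escaped by default" and "request scoped" ideas can be sketched as follows. Every name here (`SafeViewDemo`, `SafeView`, `jobPage`) is hypothetical. The view object has no way to emit data without escaping it, a fresh instance is created per request so there is no shared mutable state, and the view can be exercised as a plain method call without a web server.

```java
// Hypothetical sketch; names are illustrative, not the YARN webapp API.
final class SafeViewDemo {

    static final class SafeView {
        private final StringBuilder out = new StringBuilder();

        // Escape the characters that are significant in HTML text and attributes.
        static String escape(String raw) {
            StringBuilder sb = new StringBuilder(raw.length());
            for (int i = 0; i < raw.length(); i++) {
                char c = raw.charAt(i);
                switch (c) {
                    case '&': sb.append("&amp;"); break;
                    case '<': sb.append("&lt;"); break;
                    case '>': sb.append("&gt;"); break;
                    case '"': sb.append("&quot;"); break;
                    case '\'': sb.append("&#39;"); break;
                    default: sb.append(c);
                }
            }
            return sb.toString();
        }

        // The only way to emit data: it is always escaped.
        SafeView text(String data) { out.append(escape(data)); return this; }

        // Markup comes only from tag names chosen by the view author.
        SafeView open(String tag) { out.append('<').append(tag).append('>'); return this; }
        SafeView close(String tag) { out.append("</").append(tag).append('>'); return this; }

        String render() { return out.toString(); }
    }

    // One SafeView per call, standing in for one view instance per request.
    static String jobPage(String jobName) {
        return new SafeView().open("h1").text(jobName).close("h1").render();
    }

    public static void main(String[] args) {
        // Untrusted input is rendered as text, not as markup.
        System.out.println(jobPage("<script>alert(1)</script>"));
    }
}
```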

        Hamlet is a novel and superior view technology that has no counterparts in any existing framework/language in that it's IDE friendly (without having to install tons of plugins): it gives instant autocompletion, javadoc and compiler feedback, and shows the current tag stack (see attached IDE screenshots), making it practically impossible to write invalid html (4.01 strict). In contrast, most frameworks don't even generate valid html in their tutorials, and neither do the existing hadoop webapps. The framework precisely targets devs who are not html experts, who don't know about quirks mode and don't remember the html spec (e.g., block elements are forbidden inside inline elements) by heart. In the end, Hamlet is just a plain old java object builder (with creative use of generics) that every Java developer can pick up without having to learn any new (expression) language syntax of the kind needed for JSP etc. (Spring MVC typically uses JSP as its view layer.)
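
        As an illustration only, here is a minimal sketch of the general technique: an object builder whose generic type parameter tracks the enclosing element, so the compiler knows the tag stack. This is not the real Hamlet API; `Element`, `Div`, `Span`, `Page` and `end()` are hypothetical names. Because `Span` offers no `div()` method, putting a block element inside an inline one is a compile error rather than invalid html at runtime.

```java
// Hypothetical sketch of a typed HTML builder; not the real Hamlet API.
final class TypedHtmlDemo {

    // Base element: remembers its parent so end() can return to it with its static type.
    static abstract class Element<P> {
        final StringBuilder out;
        private final P parent;
        private final String tag;

        Element(StringBuilder out, P parent, String tag) {
            this.out = out;
            this.parent = parent;
            this.tag = tag;
            out.append('<').append(tag).append('>');
        }

        // Close the current tag and pop one level of the tag stack.
        P end() {
            out.append("</").append(tag).append('>');
            return parent;
        }
    }

    // Inline element: only text and nested inline elements are offered.
    static final class Span<P> extends Element<P> {
        Span(StringBuilder out, P parent) { super(out, parent, "span"); }
        Span<P> text(String s) { out.append(Page.escape(s)); return this; }
        Span<Span<P>> span() { return new Span<>(out, this); }
    }

    // Block element: may contain both block and inline children.
    static final class Div<P> extends Element<P> {
        Div(StringBuilder out, P parent) { super(out, parent, "div"); }
        Div<P> text(String s) { out.append(Page.escape(s)); return this; }
        Div<Div<P>> div() { return new Div<>(out, this); }
        Span<Div<P>> span() { return new Span<>(out, this); }
    }

    static final class Page {
        private final StringBuilder out = new StringBuilder();
        static String escape(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        }
        Div<Page> div() { return new Div<>(out, this); }
        String render() { return out.toString(); }
    }

    static String sample() {
        return new Page()
            .div()                               // Div<Page>
                .span().text("a < b").end()      // Span<Div<Page>>, then back to Div<Page>
            .end()                               // back to Page
            .render();
        // new Page().div().span().div() would not compile: Span has no div().
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

        In this sketch the IDE's autocompletion at any point in the chain lists only the children that are valid for the current element, which is the effect described above.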

        Note, Hamlet.java only needs to be regenerated by people who want to hack on the framework (fixing bugs, supporting new html5 tags etc.). Webapps just use hamlet as a library. No xml, templates or config files to futz with. It just works by default.

        After writing all the new webapps with it, I'm happy to report that the yarn web framework achieved all the goals and then some. I have quite a few years of startup production experience with Rails and I honestly prefer this framework despite its being in Java, mostly due to Hamlet, Guice and the precise IDE-assisted refactoring.

        This framework might become an Apache project by itself (as many other Apache projects came to be), as it's emerging as a perfect lightweight embedded web framework for distributed systems. Note, this framework doesn't preclude a standalone web app that pulls data from different servers via (REST) APIs and uses a different framework more suitable for standalone webapps. In fact, the framework can serve web APIs in a very simple way (comparable to or simpler than typical JAX-RS implementations).

        See attachments for examples.


          People

          • Assignee:
            Luke Lu
            Reporter:
            Arun C Murthy
          • Votes:
            1
            Watchers:
            23
