Hadoop Common / HADOOP-3854

org.apache.hadoop.http.HttpServer should support user configurable filter

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: util
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added a configuration property hadoop.http.filter.initializers and a class org.apache.hadoop.http.FilterInitializer for supporting servlet filters. Cluster administrators can configure customized filters for their web sites.

      Description

      Filters provide universal functions such as authentication, logging, and auditing. HttpServer should support configurable filters so that individual site administrators can configure filters for their web sites.

      1. 3854_20080919_0.18.patch
        42 kB
        Tsz Wo Nicholas Sze
      2. LoggingFilterInitializer.java
        1 kB
        Tsz Wo Nicholas Sze
      3. 3854_20080821c.patch
        27 kB
        Tsz Wo Nicholas Sze
      4. 3854_20080821b.patch
        26 kB
        Tsz Wo Nicholas Sze
      5. 3941_20080820.patch
        18 kB
        Tsz Wo Nicholas Sze
      6. 3854_20080809.patch
        26 kB
        Tsz Wo Nicholas Sze
      7. 3854_20080808b.patch
        26 kB
        Tsz Wo Nicholas Sze
      8. 3854_20080808.patch
        22 kB
        Tsz Wo Nicholas Sze
      9. 3854_20080807.patch
        20 kB
        Tsz Wo Nicholas Sze
      10. LoggingFilterConfiguration.java
        1 kB
        Tsz Wo Nicholas Sze
      11. 3854_20080806.patch
        14 kB
        Tsz Wo Nicholas Sze
      12. 3854_20080805.patch
        14 kB
        Tsz Wo Nicholas Sze
      13. 3854_20080730.patch
        11 kB
        Tsz Wo Nicholas Sze
      14. LoggingFilter.java
        2 kB
        Tsz Wo Nicholas Sze
      15. SampleInitializer.java
        2 kB
        Tsz Wo Nicholas Sze
      16. 3854_20080729.patch
        10 kB
        Tsz Wo Nicholas Sze

          Activity

          Tsz Wo Nicholas Sze added a comment -

          How about defining a new interface HttpServerInitializer as follows?

          public interface HttpServerInitializer {
            void initialize(WebApplicationHandler webAppHandler) throws IOException;
          }
          

          We also define a new conf property "hadoop.http.initializer", which specifies a class implementing HttpServerInitializer. Then, HttpServer calls initialize(...) if the conf property is set.
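
          For illustration, a minimal sketch of what an implementation of this proposed interface might look like. It is not from any attached patch; it assumes Jetty's WebApplicationHandler exposes defineFilter(...) and addFilterPathMapping(...) (the latter appears later in this thread), and the filter class name is a placeholder.

          import java.io.IOException;

          import org.mortbay.jetty.servlet.Dispatcher;
          import org.mortbay.jetty.servlet.WebApplicationHandler;

          // Hypothetical example: registers a single servlet filter and maps it to all paths.
          public class ExampleInitializer implements HttpServerInitializer {
            public void initialize(WebApplicationHandler webAppHandler) throws IOException {
              // "org.example.ExampleFilter" is a placeholder for any javax.servlet.Filter implementation.
              webAppHandler.defineFilter("example", "org.example.ExampleFilter");
              webAppHandler.addFilterPathMapping("/*", "example", Dispatcher.__ALL);
            }
          }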

          Tsz Wo Nicholas Sze added a comment -

          3854_20080729.patch: added HttpServerInitializer interface.

          Tsz Wo Nicholas Sze added a comment -

          Here is an example showing how to configure filters:

          • SampleInitializer.java: It implements HttpServerInitializer.
          • LoggingFilter.java: A very simple filter which does logging.
          • Specify SampleInitializer in conf as below
            <property>
              <name>hadoop.http.initializer</name>
              <value>org.apache.hadoop.http.SampleInitializer</value>
              <description>A class implementing org.apache.hadoop.http.HttpServerInitializer for customized initialization of HttpServer objects.</description>
            </property>
            

          Then, a message will be added to the log for all HTTP requests/responses.
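
          (The attached LoggingFilter.java is not reproduced here. As a rough idea only, a minimal logging filter built on the standard javax.servlet API could look like the sketch below; it is not the actual attachment.)

          import java.io.IOException;

          import javax.servlet.Filter;
          import javax.servlet.FilterChain;
          import javax.servlet.FilterConfig;
          import javax.servlet.ServletException;
          import javax.servlet.ServletRequest;
          import javax.servlet.ServletResponse;
          import javax.servlet.http.HttpServletRequest;

          import org.apache.commons.logging.Log;
          import org.apache.commons.logging.LogFactory;

          // Sketch of a simple filter that logs every request passing through it.
          public class LoggingFilter implements Filter {
            private static final Log LOG = LogFactory.getLog(LoggingFilter.class);

            public void init(FilterConfig filterConfig) throws ServletException {
              // No init parameters needed for this sketch.
            }

            public void doFilter(ServletRequest request, ServletResponse response,
                FilterChain chain) throws IOException, ServletException {
              if (request instanceof HttpServletRequest) {
                HttpServletRequest req = (HttpServletRequest)request;
                LOG.info("HTTP " + req.getMethod() + " " + req.getRequestURI()
                    + " from " + request.getRemoteAddr());
              }
              chain.doFilter(request, response); // continue the filter chain
            }

            public void destroy() {
            }
          }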

          Tsz Wo Nicholas Sze added a comment -

          3854_20080730.patch: updated the patch. Note that the sample programs are not included in the patch.

          Doug Cutting added a comment -

          I'm trying to think of use cases for this. Logging and authentication are core functionalities of Hadoop, that folks shouldn't be replacing in this way, no? We should consistently log using a logging API, and provide ways for folks to replace that. Similarly for authentication: we should provide an extensible authentication API that folks can replace. Both logging and authentication are done in other places besides from HTTP servers. I can see this as a nice internal API, as a way to call the standard logging and authentication APIs, but, once we've done that, do we want to support users overriding that and specifying different logging and authentication mechanisms, just for the HTTP services, and not for the RPC and raw socket interfaces? That sounds error-prone. Or am I missing something?

          Tsz Wo Nicholas Sze added a comment -

          Filters support universal functions. Logging and authentication are just examples of them. See http://java.sun.com/products/servlet/Filters.html for more filter use cases.

          Even for logging and authentication, site administrators may want to monitor the entire web site (not only Hadoop web pages) uniformly. Some organizations might have their own logging database and authentication service. They might want to log information according to their site policy. They might already have a single sign-on system on their web site and not want Hadoop web pages to have a special login page.

          Doug Cutting added a comment -

          I agree that authentication and logging on our web servers need to be extensible. I am also aware that filters support universal functions. What I question is whether we want to support authentication and logging extensibility with such a universal function, or whether it might be better to provide more specific extensibility mechanisms for these cases. Should we perhaps instead have one filter we call to authenticate requests, and another to log them, that folks can separately replace? In the authentication case, should we support a standard mechanism of annotating the http request with the user's credentials? Or do you think for some reason those would be overkill, that the more-general mechanism is more appropriate here?

          I don't necessarily think this is a bad approach, I'm just trying to better understand its motivation.

          Allen Wittenauer added a comment -

          Basically, this spawned out of a request of mine to support Y!'s custom authentication and access control system for the UIs until Hadoop gets a security system in place. If this works the way I hope it does, it would allow us to enable HDFS browsing and potentially even enable the JT UI in "write" mode.

          Doug Cutting added a comment -

          Ah, so the motivating use-case is authentication. Should we address this more directly then? With the proposed general mechanism, one might have to be careful to disable it for shuffle transfers, and once configured, adding a second filter for logging might be tricky. We currently really only want this for authentication on user-facing web interfaces, right?

          Allen Wittenauer added a comment -

          Correct. Although I'm not sure how distcp is impacted by such a change, given that it likely talks to the same UIs as the users.

          Tsz Wo Nicholas Sze added a comment -

          > Although I'm not sure how distcp is impacted by such a change, given that it likely talks to the same UIs as the users.

          We could set the filter mapping to exclude the web pages for distcp. Alternatively, we could disable hftp and only allow hsftp.

          Tsz Wo Nicholas Sze added a comment -

          > Ah, so the motivating use-case is authentication. Should we address this more directly then? With the proposed general mechanism, one might have to be careful to disable it for shuffle transfers, and once configured, adding a second filter for logging might be tricky. We currently really only want this for authentication on user-facing web interfaces, right?

          The filter mapping will solve the problem. We could exclude the web pages for shuffle transfers. For adding a second filter, administrators could define a filter chain via filter mapping.

          Doug Cutting added a comment -

          > The filter mapping will solve the problem.

          But shouldn't the filter mapping be done in Hadoop, not by the application, so that when, e.g., we restructure Hadoop's urls, applications that use this feature don't have to be updated? Folks could specify an authentication filter that's called for user-facing pages. That seems like a much easier API for Hadoop to maintain compatibly.

          Tsz Wo Nicholas Sze added a comment -

          > But shouldn't the filter mapping be done in Hadoop, not by the application, ...

          +1 It is true that it is hard for administrators to tell which servlets should be excluded from the filter mapping.

          Tsz Wo Nicholas Sze added a comment -

          3854_20080805.patch: changed the API such that administrators define a list of FilterConfiguration in conf. If the list is specified, the corresponding Filter will be initialized for each FilterConfiguration.

          Doug Cutting added a comment -

          I don't see where this is limited to user-facing pages.

          + handler.addFilterPathMapping("*", filtername, Dispatcher.__ALL);

          That looks like it adds it for every request. Don't we need to add a mapping per daemon that restricts this to user-facing pages? For example, the TaskTracker uses the same HTTP daemon for user-facing pages and for the shuffle. Do we want to run these filters for shuffle requests? I don't think so.

          Tsz Wo Nicholas Sze added a comment -

          > I don't see where this is limited to user-facing pages.

          I need to list all the jsp/servlet paths and check whether they are user facing or not. I did not have time to do it yesterday. I wanted to show the new API design first; that's why I posted my last patch. Sorry for the confusion.

          Tsz Wo Nicholas Sze added a comment -

          Some of the jsp/servlet pages are both user facing and used internally. For example, consider the LogLevel servlet. Users can access it through the URL /logLevel with a browser or through the command "hadoop -daemonlog". If an authentication filter is installed, it will block "hadoop -daemonlog". Fsck is another example.

          Tsz Wo Nicholas Sze added a comment -

          common
          User facing only

          • /stacks
          • /logs/*
          • /static/*

          Both

          • /logLevel

          NameNode
          User facing only

          • /index.html
          • /nn_browsedfscontent.jsp
          • /dfshealth.jsp

          Internal use only

          • /getimage
          • /listPaths/*
          • /data/*

          Both

          • /fsck

          DataNode
          User facing only

          • /browseDirectory.jsp
          • /tail.jsp
          • /browseBlock.jsp
          • /blockScannerReport

          Internal use only

          • /streamFile/*

          SecondaryNameNode
          Internal use only

          • /getimage

          JobTracker
          User facing only

          • /index.html
          • /taskgraph
          • /machines.jsp
          • /jobconf_history.jsp
          • /loadhistory.jsp
          • /jobhistory.jsp
          • /jobtaskshistory.jsp
          • /jobdetailshistory.jsp
          • /analysejobhistory.jsp
          • /jobfailures.jsp
          • /jobblacklistedtrackers.jsp
          • /jobconf.jsp
          • /taskstats.jsp
          • /taskdetailshistory.jsp
          • /jobtracker.jsp
          • /taskdetails.jsp
          • /jobdetails.jsp
          • /jobtasks.jsp

          TaskTracker
          User facing only

          • /index.html
          • /tasktracker.jsp
          • /taskgraph

          Internal use only

          • /mapOutput
          • /tasklog
          Tsz Wo Nicholas Sze added a comment -

          From the list above, the filters should apply to

          • all static contents
          • all log files
          • all jsp pages
          • all other servlets except the ones for internal use only
          • /logLevel and /fsck (it is better to have a strict policy)

          I suggest defining the following patterns for the user-facing pages:

          /stacks
          /logs/*
          /static/*
          /logLevel
          /fsck
          *.html
          *.jsp
          /blockScannerReport
          /taskgraph
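
          As a sketch only, these patterns could be applied by reusing the addFilterPathMapping(...) call quoted earlier in this thread; the class and method names below are hypothetical and not part of any attached patch.

          import org.mortbay.jetty.servlet.Dispatcher;
          import org.mortbay.jetty.servlet.WebApplicationHandler;

          // Hypothetical helper: map an already-defined filter onto the user-facing patterns above.
          public class UserFacingFilterMapping {
            public static final String[] USER_FACING_URLS = {
              "/stacks", "/logs/*", "/static/*", "/logLevel", "/fsck",
              "*.html", "*.jsp", "/blockScannerReport", "/taskgraph"
            };

            public static void map(WebApplicationHandler handler, String filterName) {
              for (String pathSpec : USER_FACING_URLS) {
                handler.addFilterPathMapping(pathSpec, filterName, Dispatcher.__ALL);
              }
            }
          }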
          
          Lohit Vijayarenu added a comment -

          > From the list above, the filters should apply to (/fsck)

          We should not block fsck since it is accessed via a command right now.

          Tsz Wo Nicholas Sze added a comment -

          3854_20080806.patch: used the above patterns for the filter mapping, except /fsck.

          Tsz Wo Nicholas Sze added a comment -

          Tested manually; the mapping seems to be working fine. Will write a test for it.

          Tsz Wo Nicholas Sze added a comment -

          LoggingFilterConfiguration.java: this is a filter example which requires the LoggingFilter.java posted earlier.

          Tsz Wo Nicholas Sze added a comment -

          3854_20080807.patch: added a test and rewrote some javadocs. Another filter usage example can be found in the new unit test.

          Doug Cutting added a comment -

          Since authentication is the motivating case, should we add an example for that? It should, e.g., check to see if the request already specifies credentials, and only redirect the user when it does not. That way non-user-facing pages can be easily accessed programmatically by putting the credentials in the headers. Or is this thought to be obvious enough that we don't need an example?

          Tsz Wo Nicholas Sze added a comment -

          Since we define the filter mapping for the user-facing pages only, the non-user-facing pages won't trigger filters. So if there is a user-defined authentication filter, it won't protect the non-user-facing pages.

          The non-user-facing pages should use Hadoop's internal authentication mechanism (currently, we use UGI), not a user-defined authentication filter. Otherwise, we would have to provide a plug-in framework for users to integrate their authentication technology into Hadoop.

          Tsz Wo Nicholas Sze added a comment -

          Indeed authentication needs more thought: for example, we should provide some way for a user-defined authentication filter to propagate user information to Hadoop. Then, the user information can be used for permission checking. We are currently using a pre-defined web account (see dfs.web.ugi in conf) for all web accesses to the file system.

          I think we should commit this issue first and then work on authentication in a separate jira.

          Tsz Wo Nicholas Sze added a comment -

          Passed all tests locally. Try Hudson.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12387767/3854_20080807.patch
          against trunk revision 683671.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3034/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3034/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3034/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3034/console

          This message is automatically generated.

          Doug Cutting added a comment -

          > I think we should commit this issue first and then work on authentication in a separated jira.

          I see. This will not in fact be used to perform full user authentication, but rather just to check whether the user is permitted to access things as dfs.web.ugi.

          The only issue I have is that FilterConfiguration should probably be an abstract class rather than an interface, so that we can update it without breaking user code. Other than that, +1.

          Tsz Wo Nicholas Sze added a comment -

          3854_20080808.patch:

          • changed FilterConfiguration to an abstract class
          • moved the static method getFilterConfigurations(conf) to FilterConfiguration
          Doug Cutting added a comment -

          Sorry, one more idea. The constant PUBLIC_FACING_URLS is fragile. If someone adds another servlet that's intended to be public-facing, subject to filters, then it will not automatically be added to this. I propose limiting this constant to non-servlet stuff like "*.html" and "*.jsp", then modify HttpServer#addServlet() to add a boolean isFiltered parameter, deprecating the existing signature. Then each added servlet must declare whether it's filtered or not, which should reduce surprises.

          Also, I don't see the point of passing the filter configs to the ctor. Shouldn't we just always create the filters automatically from the config? We need to add a Configuration parameter to HttpServer for this, but that seems better than a FilterConfiguration[] parameter, since 'FilterConfiguration.getFilterConfigurations(conf)' won't be duplicated so much, and it would permit further configuration of HttpServer in the future.
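
          A sketch of the signatures this proposal implies (hypothetical; the parameter lists are guesses and the actual patch may differ):

          import java.io.IOException;
          import javax.servlet.http.HttpServlet;
          import org.apache.hadoop.conf.Configuration;

          // Hypothetical sketch of the proposed HttpServer surface; bodies are stubs.
          public class HttpServer {
            // Configuration-driven construction: filters would be created from conf here.
            public HttpServer(String name, String bindAddress, int port,
                boolean findPort, Configuration conf) throws IOException {
              // ... build filters from conf instead of taking a FilterConfiguration[] parameter
            }

            /** @deprecated use the isFiltered variant below. */
            @Deprecated
            public void addServlet(String name, String pathSpec,
                Class<? extends HttpServlet> clazz) {
              addServlet(name, pathSpec, clazz, true); // the default here is arbitrary in this sketch
            }

            // Each servlet declares whether user-configured filters apply to it.
            public void addServlet(String name, String pathSpec,
                Class<? extends HttpServlet> clazz, boolean isFiltered) {
              // ... register the servlet; add filter path mappings only when isFiltered is true
            }
          }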

          Tsz Wo Nicholas Sze added a comment -

          > Sorry, one more idea. The constant PUBLIC_FACING_URLS is fragile. If someone adds another servlet that's intended to be public-facing, subject to filters, then it will not automatically be added to this. I propose limiting this constant to non-servlet stuff like "*.html" and "*.jsp", then modify HttpServer#addServlet() to add a boolean isFiltered parameter, deprecating the existing signature. Then each added servlet must declare whether it's filtered or not, which should reduce surprises.

          No problem. I think this is a very good idea.

          > Also, I don't see the point of passing the filter configs to the ctor. Shouldn't we just always create the filters automatically from the config? We need to add a Configuration parameter to HttpServer for this, but that seems better than a FilterConfiguration[] parameter, since 'FilterConfiguration.getFilterConfigurations(conf)' won't be duplicated so much, and it would permit further configuration of HttpServer in the future.

          Done.

          Tsz Wo Nicholas Sze added a comment -

          3854_20080808b.patch:

          • added a new parameter isFiltered to addServlet(...). Deprecated the old method.
          • added new methods addContext(...) and addFilterPathMapping(...)
          • changed the HttpServer constructor parameter to conf
          Tsz Wo Nicholas Sze added a comment -

          3854_20080809.patch: fixed a bug. This patch passed all tests locally.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12387887/3854_20080809.patch
          against trunk revision 685406.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3047/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3047/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3047/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3047/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          The javadoc warnings have nothing to do with the patch. See HADOOP-3949.

          The failed tests are TestMapRed and TestMiniMRDFSSort, but they also fail on trunk. See HADOOP-3950.

          Owen O'Malley added a comment -

          I don't think this change is the right one for the API. In the long term, we should have two servers, one for the internal work and one for external work. Otherwise, you can't get security with these plugins. Therefore, I'd propose that we leave the addServlet alone and the requests will be filtered. We also add:

          @Deprecated
          public void addInternalServlet(...);
          

          that adds a servlet that doesn't use the filters.

          Doug Cutting added a comment -

          Owen's addInternalServlet() proposal makes good sense. It would be easier to replace calls to this method than to replace all calls that pass 'true' as a boolean parameter.

          Tsz Wo Nicholas Sze added a comment -

          3941_20080820.patch: changed the method to addInternalServlet().

          Tsz Wo Nicholas Sze added a comment -

          3854_20080821b.patch: oops, the previous patch was not for this issue.

          Tsz Wo Nicholas Sze added a comment -

          Passed all tests locally. The following are the "ant test-patch" results. The javadoc warnings are due to HADOOP-3964. The new javac warnings are due to the deprecated method. Submitting...

               [exec] -1 overall.  
          
               [exec]     +1 @author.  The patch does not contain any @author tags.
          
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
          
               [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
          
               [exec]     -1 javac.  The applied patch generated 480 javac compiler warnings (more than the trunk's current 472 warnings).
          
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          
          Owen O'Malley added a comment -

          The FilterConfiguration seems like a confusing interface. How about:

          public abstract class FilterInitializer {
            public abstract void initFilters(HttpServer server, Configuration conf) throws IOException;
          }
          
          public class HttpServer {
            ...
            public void addFilter(String name, String className, Map<String, String> params) { ... }
          }
          
          Tsz Wo Nicholas Sze added a comment -

          Talked to Owen. It is better to define FilterInitializer than FilterConfiguration.

          3854_20080821c.patch: made the change.
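
          As an illustration of the new shape, a logging initializer written against the signatures Owen sketched above might look roughly like this. It is hypothetical: the attached LoggingFilterInitializer.java and the committed code may differ, and the filter class name is a placeholder.

          import java.io.IOException;
          import java.util.HashMap;
          import java.util.Map;

          import org.apache.hadoop.conf.Configuration;

          // Hypothetical example: installs a single logging filter on the server.
          public class LoggingFilterInitializer extends FilterInitializer {
            public void initFilters(HttpServer server, Configuration conf) throws IOException {
              Map<String, String> params = new HashMap<String, String>();
              // Filter init parameters, if any, could be read from conf and put into params here.
              server.addFilter("logging", "org.apache.hadoop.http.LoggingFilter", params);
            }
          }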

          Tsz Wo Nicholas Sze added a comment -
               [exec] -1 overall.  
          
               [exec]     +1 @author.  The patch does not contain any @author tags.
          
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
          
               [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
          
               [exec]     -1 javac.  The applied patch generated 480 javac compiler warnings (more than the trunk's current 472 warnings).
          
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          

          Same as before.

          Tsz Wo Nicholas Sze added a comment -

          Passed all tests locally.

          Owen O'Malley added a comment -

          +1

          Owen O'Malley added a comment -

          I just committed this. Thanks, Nicholas!

          Tsz Wo Nicholas Sze added a comment -

          LoggingFilterInitializer.java: This is the updated example, which works with LoggingFilter.java.

          Hudson added a comment -

          Integrated in Hadoop-trunk #586 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/586/ )
          Tsz Wo Nicholas Sze added a comment -

          Got some requests for a 0.18 patch. 3854_20080919_0.18.patch is the one. Feel free to use it. It won't be committed.


            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0
              Watchers:
              2
