Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: security
    • Labels:
      None

      Description

      This is a top-level tracking JIRA for the security work we are doing in Hadoop. Please add a reference to this issue when opening new security-related JIRAs.

      1. security-design.pdf
        322 kB
        Owen O'Malley

        Issue Links

          Activity

          Transition: Open → Resolved | Time in source status: 751d 21m | Execution times: 1 | Last executer: Owen O'Malley | Last execution date: 12/Nov/10 18:57
          Transition: Resolved → Closed | Time in source status: 394d 11h 22m | Execution times: 1 | Last executer: Konstantin Shvachko | Last execution date: 12/Dec/11 06:19
          Liang Tang made changes -
          Assignee Kan Zhang [ kzhang ]
          Liang Tang made changes -
          Assignee Liang Tang [ latang ]
          Liang Tang made changes -
          Assignee Kan Zhang [ kzhang ] Liang Tang [ latang ]
          Chris Nauroth made changes -
          Link This issue is related to HADOOP-9621 [ HADOOP-9621 ]
          Eugene Koontz made changes -
          Link This issue is related to GIRAPH-211 [ GIRAPH-211 ]
          Konstantin Shvachko made changes -
          Link This issue relates to HADOOP-8357 [ HADOOP-8357 ]
          Eugene Koontz made changes -
          Link This issue relates to ZOOKEEPER-938 [ ZOOKEEPER-938 ]
          Konstantin Shvachko made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 0.22.0 [ 12314296 ]
          Resolution Fixed [ 1 ]
          Owen O'Malley added a comment -

          The various parts of this have been committed.

          Jeff Hammerbacher made changes -
          Link This issue is duplicated by HADOOP-6960 [ HADOOP-6960 ]
          Kan Zhang made changes -
          Link This issue relates to HDFS-1305 [ HDFS-1305 ]
          Jakob Homan made changes -
          Link This issue relates to HDFS-1150 [ HDFS-1150 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6589 [ HADOOP-6589 ]
          Jakob Homan made changes -
          Link This issue relates to HADOOP-6584 [ HADOOP-6584 ]
          Kan Zhang made changes -
          Link This issue relates to HDFS-992 [ HDFS-992 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6581 [ HADOOP-6581 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6572 [ HADOOP-6572 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6510 [ HADOOP-6510 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6543 [ HADOOP-6543 ]
          Doug Cutting made changes -
          Link This issue is related to AVRO-341 [ AVRO-341 ]
          Philip Zeyliger added a comment -

          I'm surprised I'm the first to comment: is the discussion going on elsewhere?

          I read the design document over Christmas. Great to see a document with so much detail, thanks! I had some questions, and thought a couple of places could be clearer; my comments are below.

          ******

          One thing that hasn't been covered (outside of assumptions) is more detail about how to operationally secure a Hadoop cluster in Unix-land. The assumptions section lays out some of these ("root" needs to be secure). Some things that I thought about: (1) data nodes need to write their data as a Unix user that users don't have access to, and with appropriate permissions (or umask). (Looking at my local system, the DataNode has left blocks world-readable.) (2) We assume that the JT and NN are also run under Unix accounts which users do not have access to.
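          The world-readable check mentioned in (1) can be sketched as follows. This is only an illustration: the function name world_readable_files is made up here, and the directory to scan is whatever local directory the DataNode stores blocks in on a given system, which this sketch does not assume.

```python
import os
import stat


def world_readable_files(root):
    """Return the regular files under root that grant read access to 'other'.

    Hypothetical helper for illustration; it is not part of Hadoop.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are inspected themselves, not followed
            mode = os.lstat(path).st_mode
            if stat.S_ISREG(mode) and (mode & stat.S_IROTH):
                found.append(path)
    return sorted(found)
```

          Running it against a block directory and getting an empty list back would suggest the blocks are not readable by arbitrary local users; it says nothing about group permissions or about who can become the owning user.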

          Since Data Nodes and the NameNode share a key, it's important to limit cluster membership. (This is critical for task trackers, too, since an evil task tracker could do nasty things.) What's the mechanism to limit cluster participation?

          Is there a central registry of what users can access HDFS and queues?

          Is there an "HDFS" superuser? In existing Hadoop, it's the username corresponding to the uid running the NameNode process.

          "If the token doesn't exist in memory, which indicates NameNode has restarted"

          It could also mean that the token has expired, no? I think this is made clearer in the following sentences.

          "READ, WRITE, COPY, REPLACE"

          What is the COPY access mode used for?

          "only the user will be able to kill their own jobs and tasks"

          Somewhere else in the document, there's discussion of jobs having owners/groups, not just owners. Surely a superuser or cluster manager can kill jobs with appropriate permissions?

          "API and environment changes"

          Will users still be able to use Hadoop in a "non-secure" manner? How much work would be involved in using a different security model? This is probably answered by the patch itself.

          Kan Zhang made changes -
          Link This issue relates to MAPREDUCE-1335 [ MAPREDUCE-1335 ]
          Owen O'Malley made changes -
          Attachment security-design.pdf [ 12428537 ]
          Owen O'Malley added a comment -

          Fixed title page to format better.

          Owen O'Malley made changes -
          Attachment security-design.pdf [ 12428493 ]
          Andrew Purtell made changes -
          Link This issue is related to HBASE-1697 [ HBASE-1697 ]
          Owen O'Malley made changes -
          Attachment security-design.pdf [ 12428493 ]
          Owen O'Malley added a comment -

          A security design overview for Hadoop.

          Kan Zhang made changes -
          Link This issue relates to HADOOP-6415 [ HADOOP-6415 ]
          Kan Zhang made changes -
          Link This issue is related to HADOOP-6415 [ HADOOP-6415 ]
          Kan Zhang made changes -
          Link This issue is related to HADOOP-6415 [ HADOOP-6415 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6419 [ HADOOP-6419 ]
          Kan Zhang made changes -
          Link This issue relates to MAPREDUCE-1250 [ MAPREDUCE-1250 ]
          Andrew Purtell made changes -
          Link This issue is depended upon by HBASE-2016 [ HBASE-2016 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-6367 [ HADOOP-6367 ]
          Boris Shkolnik made changes -
          Link This issue relates to HADOOP-6325 [ HADOOP-6325 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-6299 [ HADOOP-6299 ]
          Kan Zhang made changes -
          Link This issue is related to MAPREDUCE-563 [ MAPREDUCE-563 ]
          Jeff Hammerbacher made changes -
          Link This issue is related to MAPREDUCE-1026 [ MAPREDUCE-1026 ]
          Kan Zhang made changes -
          Link This issue is related to HADOOP-6151 [ HADOOP-6151 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue incorporates MAPREDUCE-720 [ MAPREDUCE-720 ]
          Kan Zhang made changes -
          Link This issue is related to HADOOP-3578 [ HADOOP-3578 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-5405 [ HADOOP-5405 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-4853 [ HADOOP-4853 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-3953 [ HADOOP-3953 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-4656 [ HADOOP-4656 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-4852 [ HADOOP-4852 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-4851 [ HADOOP-4851 ]
          Arun C Murthy made changes -
          Link This issue is related to HADOOP-4850 [ HADOOP-4850 ]
          Arun C Murthy made changes -
          Issue Type Wish [ 5 ] New Feature [ 2 ]
          Component/s security [ 12312526 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-4343 [ HADOOP-4343 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-4348 [ HADOOP-4348 ]
          Kan Zhang made changes -
          Link This issue relates to HADOOP-4359 [ HADOOP-4359 ]
          Kan Zhang made changes -
          Field Original Value New Value
          Link This issue relates to HADOOP-4453 [ HADOOP-4453 ]
          Kan Zhang created issue -

            People

            • Assignee:
              Kan Zhang
              Reporter:
              Kan Zhang
            • Votes:
              0
              Watchers:
              39
