Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: security
    • Labels: None

      Description

      Audit: Important actions taken by subjects should be logged for accountability. The audit trail is a chronological record that enables the full reconstruction and examination of a sequence of events, e.g. schema changes or data mutations. The logging activity should be protected from all subjects except a restricted set with administrative privilege, perhaps only a single super-user.

      Auditing should support dynamic scaling transparently and should support multi-tenancy. It should capture enough detail and support streamlined, timely auditing. It should be configurable on a per-table basis to avoid the overhead where it is not wanted.

      Consider logging audit trails to an HBase table (bigtable-type schemas are natural for this), and also external options with Java library support, such as syslog. Or perhaps commons-logging is sufficient, punting to the administrator to set up appropriate commons-logging/log4j configurations for their needs.
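      To make the "punt to log4j" option concrete, here is a hedged sketch of a log4j.properties fragment that routes a dedicated audit category to local syslog. The category name org.apache.hadoop.hbase.audit and the facility are illustrative assumptions, not anything this issue has decided on; SyslogAppender is a standard log4j 1.x appender.

```properties
# Hypothetical: route audit events to syslog, separate from the main log.
# The category name below is an assumption for illustration.
log4j.logger.org.apache.hadoop.hbase.audit=INFO, AUDIT
log4j.additivity.org.apache.hadoop.hbase.audit=false

log4j.appender.AUDIT=org.apache.log4j.net.SyslogAppender
log4j.appender.AUDIT.SyslogHost=localhost
log4j.appender.AUDIT.Facility=LOCAL0
log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.AUDIT.layout.ConversionPattern=%d{ISO8601} %p %c %m%n
```

      An administrator could then point SyslogHost at a hardened remote collector to keep the trail out of reach of ordinary subjects.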

      Consider integration with Scribe (http://developers.facebook.com/scribe/) or Chukwa (http://wiki.apache.org/hadoop/Chukwa).

      • Session information (Required)
        • Client, server, when, how, where.
      • Command information (Required)
        • Command detail and intent
        • Command result and why
        • Data event (data of interest from input and output, depending on a predefined policy)
          • Metadata, data detail, session identity and command identity, data direction, etc.
        • Command Counts (optional)
          • Execution duration
          • Response/request data amount
          • Resource usage
      • Node status
        • Node resource counts
        • Session status
        • Abnormal events (Required)

        Activity

        stack added a comment -

        Moving out of 0.92.0. Pull it back in if you think different.

        linden lin added a comment -

        I give my thoughts for reference.
        The routing string is hierarchical and easily matched in order.

        The routing string is as follows:

        {Event Type}.{Candidate Router Key}.{Sub Event Item}.{Other items}

        {Event Type} indicates the main event type; {Candidate Router Key} is a consideration for scalability and performance; {Sub Event Item} is the more precise type for filtering and routing; and {Other items} is reserved for the future.
        {Event Type}:
        1. Session
        2. Command
        3. Data in Command
        4. Counts in Command
        5. Node Status: necessary status and abnormal events.

        Relation (A->B: A depends on B):
        Counts in Command->Data In Command->Command->Session;
        Counts in Command->Command->Session;
        Node Status;
        {Candidate Router Key} (choose only one):
        1. Object Name (recommended; for HBase this is the table name. If there is no table in the event, the Object Name is null. If the client queries metadata from ZooKeeper, use a hardcoded table name instead, such as "HBase Metadata".)
        2. HRegion Identity
        3. RegionServer IP
        4. Others.....

        {Sub Event Item}: it depends on the {Event Type}.
        1. Session (in the current HBase version this is the connection: establishing a connection and closing a connection)
        Session Log-in
        Session Log-off
        2. Command
        Command Request
        Command Response
        3. Data in Command (data of interest from the input and output of a command)
        Data metadata and content (only)
        4. Counts In Command
        Counts Set (only)
        5. Node Status
        Performance counts (resource usage, session count, and other performance-related counts)
        Abnormal events (defined by the user; normally this includes error events, a huge number of requests in a short time, and so on).

        BTW, I suggest trying to shorten the routing string while keeping the capability of dynamic routing. For example:

        2.ObjectName.2.Others => this means Command.ObjectName.CommandResponse.Others (the string-to-number mapping is only for predefined types).
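        The shorten/expand idea above can be sketched as a small lookup. This is a hypothetical illustration of the comment's proposal, not an existing HBase API; the numeric codes follow the numbering in the comment, and only the Command sub-event table is filled in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the shortened-routing-string proposal.
public class RoutingStrings {
    private static final Map<String, String> EVENT = new HashMap<>();
    private static final Map<String, Map<String, String>> SUB = new HashMap<>();
    static {
        EVENT.put("1", "Session");
        EVENT.put("2", "Command");
        EVENT.put("3", "DataInCommand");
        EVENT.put("4", "CountsInCommand");
        EVENT.put("5", "NodeStatus");
        // Sub-event tables; only Command's is filled in here.
        SUB.put("2", Map.of("1", "CommandRequest", "2", "CommandResponse"));
    }

    /** Expand a shortened routing string such as "2.ObjectName.2.Others". */
    public static String expand(String routing) {
        String[] p = routing.split("\\.", 4);
        String event = EVENT.getOrDefault(p[0], p[0]);
        String sub = SUB.getOrDefault(p[0], Map.of()).getOrDefault(p[2], p[2]);
        return event + "." + p[1] + "." + sub + "." + p[3];
    }
}
```

        Unknown codes pass through unchanged, so predefined types stay short while dynamic (free-form) segments still route.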

        Andrew Purtell added a comment -

        Instead of building a complex audit data management system, I suggest making a log tap that sends audit trace to syslog, either local or remote.

        I agree. Via log4j preferably, as we already bundle it. I once had a log4j setup which aggregated into a MySQL db via an rsyslog hierarchy. Not that such a thing is necessarily ideal; the point is that log4j affords a lot of flexibility to the user and is clean and simple to use in the HBase code.

        I suggest defining a format for audit logging to conveniently support message routing by regexp.
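        A minimal sketch of such regexp routing, assuming a dot-delimited routing key at the front of each audit message in the {Event Type}.{Candidate Router Key}.{Sub Event Item} shape discussed earlier; the table name "accounts" is an invented example, and no HBase format has been fixed here.

```java
import java.util.regex.Pattern;

// Sketch: route audit messages by regexp over a fixed routing-key format.
public class AuditRouter {
    // Match any Command event for table "accounts", regardless of sub event.
    private static final Pattern ACCOUNT_COMMANDS =
            Pattern.compile("^Command\\.accounts\\.[^.]+(\\..*)?$");

    public static boolean route(String routingKey) {
        return ACCOUNT_COMMANDS.matcher(routingKey).matches();
    }
}
```

        The value of a fixed, position-based format is exactly that such patterns stay simple: each dot-delimited field can be matched or wildcarded independently.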

        ryan rawson added a comment -

        Instead of building a complex audit data management system, I suggest making a log tap that sends audit trace to syslog, either local or remote. Using syslog to audit machines is fairly common and there are a lot of good syslog systems for a variety of levels of paranoia.

        linden lin added a comment -

        @stack, requiring the user to have another HBase instance for audit isn't a reasonable solution in my view. For acquiring enough detail, log4j or another logging solution is OK (it leverages existing implementation efforts). But my concern is how to transfer the log to different kinds of sinks efficiently.
        My draft idea: I recommend a distributed subscriber & receiver model for HBase audit. Each HBase server (or HRegion) is a subscriber to the distributed framework (so there are many subscribers), and a receiver is any sink which receives the content it is interested in from the framework. The key point is that receivers can divide the subscribers' logs for load balance (for example, by topic name, where the topic name is an IP address, table name, key range, and so on).
        Thus, HBase only needs to add a client plug-in for the distributed framework (a message bus, etc.) and define the log title for the router (it is static by design).
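        The receiver-side load-balancing idea can be sketched in a few lines; the topic scheme and hashing below are illustrative assumptions, not part of any proposed framework.

```java
// Sketch: receivers divide the subscribers' log streams by topic name
// (an IP address, table name, key range, and so on).
public class TopicPartitioner {
    /** Stable assignment of a topic to one of receiverCount receivers. */
    public static int receiverFor(String topic, int receiverCount) {
        // floorMod keeps the result non-negative for any hash code.
        return Math.floorMod(topic.hashCode(), receiverCount);
    }
}
```

        Any consistent mapping works; the point is only that a given topic always lands on the same receiver, so one receiver sees a coherent per-table (or per-server) stream.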

        Normally, the auditing feature is disabled. When users want to enable it, they should install the specific third-party router cluster (a distributed, scalable framework), then add the cluster address to the HBase configuration. Thus, the HBase cluster can act as the subscribers for the router cluster. The remaining pieces I think are all the customer's task (adding receivers, operating on the log, and so on).

        Meanwhile, do we need to support dynamic subscribers and dynamic subscription content in this version?

        stack added a comment -

        @Linden That makes sense. So, if writing to HBase, write to a different HBase instance? Does emitting audit logs via Apache Commons Logging or slf4j make sense to you, and then hooking the logging system up to different kinds of sinks, writing any necessary plugins as needed?

        linden lin added a comment -

        There is a security concern about storing the audit log on the same HBase instance. Audit's motivation includes the observation of the administrator's behavior.

        Andrew Purtell added a comment -

        So I think participants on this issue are in basic agreement that we can start with commons logging, routed into a log aggregation framework. We should put support in a package such as o.a.h.h.log.audit to facilitate routing and filtering in log4j properties.
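        As a hedged sketch of that routing-and-filtering idea: with audit loggers under a dedicated package (the name below mirrors the o.a.h.h.log.audit suggestion but is not committed), log4j category configuration alone separates the audit trail into its own file. RollingFileAppender and its properties are standard log4j 1.x; the path and sizes are placeholders.

```properties
# Hypothetical: keep the audit trail in its own rolling file,
# not mixed into the general server log.
log4j.logger.org.apache.hadoop.hbase.log.audit=INFO, AUDITFILE
log4j.additivity.org.apache.hadoop.hbase.log.audit=false

log4j.appender.AUDITFILE=org.apache.log4j.RollingFileAppender
log4j.appender.AUDITFILE.File=/var/log/hbase/audit.log
log4j.appender.AUDITFILE.MaxFileSize=100MB
log4j.appender.AUDITFILE.MaxBackupIndex=10
log4j.appender.AUDITFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.AUDITFILE.layout.ConversionPattern=%d{ISO8601} %m%n
```

        Sub-packages (e.g. per event type) would then get per-subsystem routing for free via log4j's hierarchical categories.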

        stack added a comment -

        I like the idea of audit logs going out via commons logging so you could hook up a sink of your choosing (and yes, the sink could be an HBase table; we could write a logger plugin for log4j or some such to do this).

        Andrew Purtell added a comment -

        @ryan: Thanks for the reminder about the historian. Agree.

        linden lin added a comment -

        Audit is always for regulatory needs. How to secure the auditing data as evidence, and whether there is enough detail to trace the source and the problem, are the key points, I think. If the auditing data can be delivered to its target in time, so much the better.

        For regulatory compliance, we not only need to acquire all events on a table, but also to collect the necessary events from the cluster, such as server-offline information, plus the information (metadata and status at that time) needed to analyze an event. Thus, third-party software can get detailed events in time for monitoring, content inspection, or policy enforcement in the company.

        ryan rawson added a comment -

        Beware the lessons of the historian: storing data like this in an actual table may cause problems when the systems are offline. I would vote for straight-up normal logging, letting people put together a log aggregation infrastructure as needed.


          People

          • Assignee: Unassigned
          • Reporter: Andrew Purtell
          • Votes: 2
          • Watchers: 8
