RANGER-4910

Develop Apache Ranger Plugin for Polaris to Enhance Access Control for Apache Iceberg

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: plugins
    • Labels: None

    Description

      Polaris, a catalog for Apache Iceberg recently open-sourced by Snowflake, manages technical metadata for Iceberg tables. Its key access-control features include:

      • RBAC (Role-Based Access Control): Polaris supports RBAC for table and view-level operations. [See Documentation](https://polaris.io/#tag/Access-Control)
      • Role Management: Polaris allows the creation of Principals with roles such as Data Engineer or Data Scientist.
      • Catalog Roles: Specialized roles like Catalog Administrators, Catalog Readers, and Catalog Contributors can be defined to manage access to different parts of the data catalog.
      • Granular Privileges: Polaris provides fine-grained privileges for operations on Tables, Views, Namespaces, and Catalogs. Examples include `TABLE_CREATE`, `TABLE_READ_DATA`, `TABLE_WRITE_DATA`, `VIEW_CREATE`, `NAMESPACE_CREATE`, `CATALOG_MANAGE_CONTENT`, and more.
      • Credential Vending: Polaris vends credentials based on the specific table the user is trying to access.
      • API for Role Management: Polaris offers an API to manage grants for roles, allowing fine-tuned control over data access.
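
      The privilege names listed above follow a `<RESOURCE>_<OPERATION>` pattern, which suggests one way a plugin could derive Ranger-style resource and access types from them. The following is a minimal sketch under that assumption; `split_privilege` and `access_types_by_resource` are hypothetical helpers, not part of Polaris or Ranger, and the prefix only names the kind of object an operation concerns, not necessarily where the grant is attached.

```python
from typing import Dict, List, Tuple

# Privilege names quoted in this issue; Polaris defines more than these.
POLARIS_PRIVILEGES = [
    "TABLE_CREATE", "TABLE_READ_DATA", "TABLE_WRITE_DATA",
    "VIEW_CREATE", "NAMESPACE_CREATE", "CATALOG_MANAGE_CONTENT",
]


def split_privilege(privilege: str) -> Tuple[str, str]:
    """Split a privilege name at its first underscore into
    (resource type, access type), both lower-cased."""
    resource, _, access = privilege.partition("_")
    return resource.lower(), access.lower()


def access_types_by_resource(privileges: List[str]) -> Dict[str, List[str]]:
    """Group access types under the resource type they apply to."""
    grouped: Dict[str, List[str]] = {}
    for privilege in privileges:
        resource, access = split_privilege(privilege)
        grouped.setdefault(resource, []).append(access)
    return grouped
```

      A grouping like this could serve as a starting point for listing the resources and access types a Ranger service definition for Polaris would need to cover; the actual mapping is a design decision for the plugin.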

      Objective:

      To improve the usability and security of Polaris for Apache Iceberg users, this issue requests an Apache Ranger plugin that integrates Polaris's access control features with Apache Ranger. The integration would allow centralized, consistent management of access policies, audit logging, and fine-grained access control across the tools used with Apache Iceberg.

      Use Cases:

      1. Centralized Access Policy Management:

      • Implement centralized and consistent management of access policies for data stored using Apache Iceberg across multiple tools and environments.

      2. Access Control for Data Engineering Workloads:

      • Control access to datasets used by Data Engineering workloads (e.g., Apache Spark) at a coarser granularity, namely the table level.

      3. Fine-Grained Access Control for Data Analysts:

      • Provide fine-grained access control for Data Analysts using compute engines like Trino. Enforcement can rely on the native Ranger plugin in Trino, which allows control at the table, view, or even column level.

      4. Centralized Access Auditing:

      • Enable centralized collection and analysis of access audit logs across all tools used to access datasets in Iceberg, ensuring comprehensive auditing and compliance.
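
      The difference between the table-level and column-level use cases, together with centralized auditing, can be illustrated with a small model. This is a sketch only: `Policy` and `Authorizer` are hypothetical names, the role and table names are invented for illustration, and the matching logic is deliberately simpler than Ranger's policy engine.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Policy:
    role: str
    table: str                                 # e.g. "sales.orders" (illustrative)
    access: str                                # e.g. "read_data"
    columns: Optional[FrozenSet[str]] = None   # None grants the whole table


@dataclass
class Authorizer:
    policies: List[Policy]
    audit_log: List[dict] = field(default_factory=list)

    def is_allowed(self, role: str, table: str, access: str,
                   columns: Optional[Iterable[str]] = None) -> bool:
        """Evaluate a request and record the decision in the audit log."""
        requested = frozenset(columns) if columns else None
        allowed = any(
            self._matches(p, role, table, access, requested)
            for p in self.policies
        )
        self.audit_log.append({
            "role": role, "table": table, "access": access,
            "columns": sorted(requested) if requested else None,
            "allowed": allowed,
        })
        return allowed

    @staticmethod
    def _matches(policy: Policy, role: str, table: str, access: str,
                 requested: Optional[FrozenSet[str]]) -> bool:
        if (policy.role, policy.table, policy.access) != (role, table, access):
            return False
        if policy.columns is None:
            return True  # table-level grant covers any set of columns
        # Column-level grant: the request must name columns, all of them granted.
        return requested is not None and requested <= policy.columns
```

      For example, a `data_engineer` policy with no `columns` would allow any read of the table, while a `data_analyst` policy restricted to `{"order_id", "amount"}` would deny a request for any other column. Every decision, allowed or denied, lands in `audit_log`, which stands in for the centralized audit trail.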

      Expected Deliverables:

      • A fully functional Apache Ranger plugin for Polaris that supports the outlined use cases.
      • Documentation on how to configure and deploy the plugin.
      • Integration tests to ensure the plugin works as expected with Apache Iceberg and other tools like Apache Spark and Trino.
      • A detailed user guide explaining how to use the plugin for managing access control in various scenarios.

          People

            Assignee: Unassigned
            Reporter: Bosco (bosco)