Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ranger
    • Labels: None

    Description

      The primary business value of Apache Ranger is to enable sharing of resources. It would therefore help if Apache Ranger provided an abstraction, a dataset, that makes a set of resources/data across services the unit of sharing, instead of one or more resources in each service. This has several benefits:

      1. A single policy to manage access to data in multiple services - like HBase, Hive, Snowflake, Kafka, Google BigQuery, AWS S3, AWS Redshift, ADLS-Gen2. This enables authorization to be centered around a purpose, like:
        • Marketing Campaign 2022 dataset
        • Sales 2021 dataset
        • CA Claims 2021 dataset
      2. Enables different sets of users to manage sharing data into a dataset and to manage access to the data in a dataset:
        • Data owners share data into a dataset, with the necessary masking, row-filters and schedules; they can update the share details, including stopping sharing into a dataset.
        • Dataset admins manage who has access to the data in the dataset. This relieves data owners from having to micromanage access to the shared data, for example when a user needs access to the data in multiple services to participate in a project.
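      The separation of roles described above can be sketched as a small data model. This is only an illustrative sketch, not Apache Ranger's API or the design in the attached document: the names Share, Dataset, share, stop_sharing, grant, revoke and find_share are all hypothetical, and schedules are left out for brevity. It assumes a user may access a resource through a dataset only when a dataset admin has granted that user access and a data owner has an active share for that resource.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Share:
    """A resource shared into a dataset by a data owner (hypothetical model)."""
    service: str                      # e.g. "hive", "s3"
    resource: str                     # e.g. "sales.orders_2021"
    mask: Optional[str] = None        # masking applied to the shared data, if any
    row_filter: Optional[str] = None  # row-filter applied to the shared data, if any
    active: bool = True               # False once the owner stops sharing


@dataclass
class Dataset:
    """A named set of shares across services, plus the users granted access."""
    name: str
    shares: List[Share] = field(default_factory=list)
    members: Set[str] = field(default_factory=set)

    # Operations performed by data owners.
    def share(self, share: Share) -> None:
        self.shares.append(share)

    def stop_sharing(self, service: str, resource: str) -> None:
        for s in self.shares:
            if s.service == service and s.resource == resource:
                s.active = False

    # Operations performed by dataset admins.
    def grant(self, user: str) -> None:
        self.members.add(user)

    def revoke(self, user: str) -> None:
        self.members.discard(user)

    def find_share(self, user: str, service: str, resource: str) -> Optional[Share]:
        """Return the active share the user may access, or None if access is denied."""
        if user not in self.members:
            return None
        for s in self.shares:
            if s.active and s.service == service and s.resource == resource:
                return s
        return None


# Example: one dataset spanning two services.
sales = Dataset("Sales 2021")
sales.share(Share("hive", "sales.orders_2021", row_filter="region = 'CA'"))
sales.share(Share("s3", "s3://sales-2021/reports"))
sales.grant("alice")  # a single grant covers the shared data in both services
```

      In this sketch a single grant by the dataset admin gives a user access to every active share in the dataset, while each data owner keeps control over what is shared and can withdraw it with stop_sharing without touching the access list.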

      The attached document has more details on this new abstraction, including a number of questions and answers that help explain various aspects of this feature. Please read it and add your comments/suggestions.

      Attachments

        1. Apache Ranger - Dataset-1.pdf
          329 kB
          Madhan Neethiraj


          People

            Assignee: Madhan Neethiraj
            Reporter: Madhan Neethiraj
            Votes: 1
            Watchers: 6
