  1. Sentry
  2. SENTRY-1779

HDFS full snapshot should limit to a set of path prefixes

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.1, 1.8.0, 2.0.0
    • Fix Version/s: None
    • Component/s: Hdfs Plugin
    • Labels:
      None

      Description

      Currently, when the cluster starts up, HDFS requests a full snapshot from Sentry. Sentry returns the complete list of privileges and permissions to the HDFS plugin, and only after receiving the data does the plugin filter it down to the subset that matches the prefixes. This happens on every HDFS service restart and whenever the snapshot expires (every 24 hours). Each time, Sentry does the heavy lifting of loading all the metadata into memory to send the full snapshot, even though HDFS might not care about most of the data. Sentry's memory requirement spikes during this window, and it could hit an OOM as the metadata grows over time.

      A better option would be for the plugin to request a full snapshot for a list of prefixes. Sentry would then query the database for permissions, filtering by the supplied paths. This would reduce Sentry's memory usage and also the amount of data transferred to HDFS.
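
      The proposed filtering can be illustrated with a minimal sketch. This is not Sentry code: the function names, the shape of the path map (authorizable object name to list of HDFS paths), and the sample paths are all hypothetical, and the in-memory loop only stands in for what would be a filter applied in the database query. It shows the one detail that is easy to get wrong, which is matching prefixes on path-component boundaries so that `/user` does not match `/username`.

```python
from typing import Dict, Iterable, List


def is_under_prefix(path: str, prefix: str) -> bool:
    """Return True if path equals prefix or lies beneath it.

    Matching is done on path-component boundaries, so the prefix
    "/user" matches "/user/hive" but not "/username".
    """
    prefix = prefix.rstrip("/") or "/"  # normalize a trailing slash
    if prefix == "/":
        return path.startswith("/")
    return path == prefix or path.startswith(prefix + "/")


def snapshot_for_prefixes(
    path_map: Dict[str, List[str]], prefixes: Iterable[str]
) -> Dict[str, List[str]]:
    """Build a snapshot restricted to paths under the requested prefixes.

    path_map is a hypothetical stand-in for the stored metadata:
    authorizable object name -> HDFS paths. A real implementation
    would push this filter into the database query instead of
    loading every entry into memory first.
    """
    wanted = list(prefixes)
    snapshot: Dict[str, List[str]] = {}
    for authz_obj, paths in path_map.items():
        kept = [p for p in paths if any(is_under_prefix(p, w) for w in wanted)]
        if kept:  # drop objects with no path under any requested prefix
            snapshot[authz_obj] = kept
    return snapshot
```

      With this shape, the plugin would send its prefixes along with the snapshot request, and objects whose paths all fall outside those prefixes would never be loaded or transferred.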

            People

            • Assignee:
              Unassigned
              Reporter:
              vamsee Vamsee K. Yarlagadda
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated: