SENTRY-1915: Sentry is doing a lot of work to convert a list of paths to the HMSPaths structure

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: Sentry
    • Labels: None

      Description

      It turns out that in 2.0 we changed the way full snapshots are sent from Sentry to HDFS. Previously they used HMSPaths, which uses a tree structure and eliminates some duplication. SENTRY-1827 also helped compress this on the serialization side.

      Now we use the TPathChanges structure, which is not tree-based and represents paths very inefficiently: required list<list<string>> addPaths; so we split each path on slashes and store a list of its elements instead of storing a tree. As a result we may use much more memory.
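To illustrate the memory concern, here is a minimal sketch that compares the two representations on a few sample paths. It is not Sentry code: `PathRepresentationSketch`, `Node`, and both counting methods are hypothetical names, and the tree is a plain prefix tree rather than the real HMSPaths class. It only counts how many path components each form has to hold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not Sentry's HMSPaths or TPathChanges implementation. */
class PathRepresentationSketch {

    /** Minimal prefix-tree node: one child entry per distinct component. */
    static final class Node {
        final Map<String, Node> children = new HashMap<>();
    }

    /** Splits "/a/b/c" into [a, b, c], skipping empty components. */
    static List<String> split(String path) {
        List<String> parts = new ArrayList<>();
        for (String p : path.split("/")) {
            if (!p.isEmpty()) {
                parts.add(p);
            }
        }
        return parts;
    }

    /** Flat form (list of lists): every path keeps its own full component list. */
    static int flatComponentCount(List<String> paths) {
        List<List<String>> addPaths = new ArrayList<>();
        for (String path : paths) {
            addPaths.add(split(path));
        }
        int count = 0;
        for (List<String> components : addPaths) {
            count += components.size();
        }
        return count;
    }

    /** Tree form: shared prefixes are stored once; returns nodes created below the root. */
    static int treeNodeCount(List<String> paths) {
        Node root = new Node();
        int created = 0;
        for (String path : paths) {
            Node current = root;
            for (String component : split(path)) {
                Node next = current.children.get(component);
                if (next == null) {
                    next = new Node();
                    current.children.put(component, next);
                    created++;
                }
                current = next;
            }
        }
        return created;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "/user/hive/warehouse/db1.db/t1",
                "/user/hive/warehouse/db1.db/t2",
                "/user/hive/warehouse/db2.db/t1");
        System.out.println("flat components: " + flatComponentCount(paths));
        System.out.println("tree nodes:      " + treeNodeCount(paths));
    }
}
```

The more tables share a warehouse prefix, the wider the gap between the two counts becomes, which is the duplication the tree structure avoids.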

      1. SENTRY-1915.01.patch
        12 kB
        Alexander Kolbasov
      2. SENTRY-1915.02.patch
        18 kB
        Alexander Kolbasov

          Activity

          akolb Alexander Kolbasov added a comment -

          Vamsee Yarlagadda, Misha Dmitriev, Sergio Peña, Mathew Crocker: FYI

          akolb Alexander Kolbasov added a comment -

          Hao Hao FYI

          akolb Alexander Kolbasov added a comment -

          It turns out that we do ultimately use PathDump structures to send snapshots to HDFS. The problem is that we handle them inefficiently: we create a lot of intermediate structures before we get there.

          akolb Alexander Kolbasov added a comment -

          The idea of the fix is to collapse all the layers and move the code directly into SentryStore, where we read the objects. As we read them, we convert each path to a list of components and add it directly to the HMSPath() object.
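A rough sketch of this idea follows, assuming the store yields (authorizable object, path) rows one at a time. All names here (`DirectTreeBuildSketch`, `PathTree`, `buildFromRows`, `toComponents`) are hypothetical stand-ins and do not reflect the actual SentryStore or HMSPaths API. Each row is split and inserted into the tree as soon as it is read, so no intermediate collection of all split paths is built.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the approach; class and method names are not Sentry's. */
class DirectTreeBuildSketch {

    /** Simplified stand-in for a tree-based path structure. */
    static final class PathTree {
        private static final class Node {
            final Map<String, Node> children = new HashMap<>();
            final Set<String> authzObjects = new HashSet<>();
        }

        private final Node root = new Node();

        /** Walks or creates one node per component, then tags the last node. */
        void addPath(String authzObject, List<String> components) {
            Node current = root;
            for (String component : components) {
                current = current.children.computeIfAbsent(component, k -> new Node());
            }
            current.authzObjects.add(authzObject);
        }

        /** Returns the objects mapped exactly at the given path (empty if none). */
        Set<String> find(List<String> components) {
            Node current = root;
            for (String component : components) {
                current = current.children.get(component);
                if (current == null) {
                    return new HashSet<>();
                }
            }
            return new HashSet<>(current.authzObjects);
        }
    }

    /** Splits a path on slashes, dropping empty components. */
    static List<String> toComponents(String path) {
        List<String> parts = new ArrayList<>();
        for (String p : path.split("/")) {
            if (!p.isEmpty()) {
                parts.add(p);
            }
        }
        return parts;
    }

    /**
     * Consumes rows of {authzObject, path} one at a time and inserts each
     * directly into the tree; only the current row's components are held
     * outside the tree at any moment.
     */
    static PathTree buildFromRows(Iterable<String[]> rows) {
        PathTree tree = new PathTree();
        for (String[] row : rows) {
            tree.addPath(row[0], toComponents(row[1]));
        }
        return tree;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"db1.t1", "/user/hive/warehouse/db1.db/t1"});
        rows.add(new String[] {"db1.t2", "/user/hive/warehouse/db1.db/t2"});
        PathTree tree = buildFromRows(rows);
        System.out.println(tree.find(toComponents("/user/hive/warehouse/db1.db/t1")));
    }
}
```

Taking an `Iterable` rather than a fully materialized list is the point of the sketch: the caller can pass a cursor over query results and the only long-lived structure is the tree itself.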

          hadoopqa Hadoop QA added a comment -

          Here are the results of testing the latest attachment
          https://issues.apache.org/jira/secure/attachment/12885521/SENTRY-1915.01.patch against master.

          Overall: -1 due to 5 errors

          ERROR: mvn test exited 1
          ERROR: Failed: org.apache.sentry.hdfs.TestImageRetriever
          ERROR: Failed: org.apache.sentry.hdfs.TestImageRetriever
          ERROR: Failed: org.apache.sentry.hdfs.TestSentryHDFSServiceProcessor
          ERROR: Failed: org.apache.sentry.hdfs.TestSentryHDFSServiceProcessor

          Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/3243/console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          Here are the results of testing the latest attachment
          https://issues.apache.org/jira/secure/attachment/12885530/SENTRY-1915.01.patch against master.

          Overall: +1 all checks pass

          SUCCESS: all tests passed

          Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/3244/console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          Here are the results of testing the latest attachment
          https://issues.apache.org/jira/secure/attachment/12885649/SENTRY-1915.02.patch against master.

          Overall: -1 due to an error

          ERROR: failed to build with patch (exit code 1)

          Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/3251/console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          Here are the results of testing the latest attachment
          https://issues.apache.org/jira/secure/attachment/12885678/SENTRY-1915.02.patch against master.

          Overall: -1 due to an error

          ERROR: failed to apply patch (exit code 1):
          The patch does not appear to apply with p0, p1, or p2

          Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/3252/console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          Here are the results of testing the latest attachment
          https://issues.apache.org/jira/secure/attachment/12885689/SENTRY-1915.02.patch against master.

          Overall: +1 all checks pass

          SUCCESS: all tests passed

          Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/3253/console

          This message is automatically generated.


            People

            • Assignee: akolb Alexander Kolbasov
            • Reporter: akolb Alexander Kolbasov
            • Votes: 0
            • Watchers: 4
