  1. Hadoop Common
  2. HADOOP-17079

Optimize UGI#getGroups by adding UGI#getGroupsSet

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: build
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Added a UserGroupMapping#getGroupsSet() API and deprecated UserGroupMapping#getGroups().

      UserGroupMapping#getGroups() can be expensive because it involves a Set->List conversion. For users with large group memberships (e.g., > 1000 groups), we recommend using getGroupsSet() to avoid the conversion and to get fast membership lookups.

    Description

      UGI#getGroups was optimized in HADOOP-13442 by avoiding the List->Set->List conversion. However, the returned list is not optimized for contains() lookups, especially when the user's group membership list is huge (thousands of groups or more). This ticket adds UGI#getGroupsSet and uses Set#contains() instead of List#contains() to speed up lookups in large group lists, while minimizing List->Set conversions in Groups#getGroups() calls.
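
The difference between the two lookup styles can be sketched in plain Java. This is a minimal illustration, not Hadoop code: the class GroupLookupSketch and its methods are hypothetical stand-ins, and the Set returned by buildGroups only plays the role of what the ticket calls UGI#getGroupsSet. The list-based path copies the set and then scans it linearly, while the set-based path does a single hash lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GroupLookupSketch {

    // Hypothetical stand-in for a user's resolved groups (assumption: the
    // groups are held as an insertion-ordered set, as a cache might do).
    static Set<String> buildGroups(int count) {
        Set<String> groups = new LinkedHashSet<>();
        for (int i = 0; i < count; i++) {
            groups.add("group-" + i);
        }
        return Collections.unmodifiableSet(groups);
    }

    // List-style check: an O(n) copy followed by an O(n) scan.
    static boolean isMemberViaList(Set<String> groups, String group) {
        List<String> asList = new ArrayList<>(groups);
        return asList.contains(group);
    }

    // Set-style check: one hash lookup, no copy.
    static boolean isMemberViaSet(Set<String> groups, String group) {
        return groups.contains(group);
    }

    public static void main(String[] args) {
        Set<String> groups = buildGroups(5000);
        System.out.println(isMemberViaList(groups, "group-4999"));
        System.out.println(isMemberViaSet(groups, "group-4999"));
        System.out.println(isMemberViaSet(groups, "no-such-group"));
    }
}
```

Both checks return the same answer; only the cost differs, and the gap grows with the number of groups and the number of membership checks per request.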

      Attachments

        1. HADOOP-17079.002.patch
          65 kB
          Xiaoyu Yao
        2. HADOOP-17079.003.patch
          78 kB
          Xiaoyu Yao
        3. HADOOP-17079.004.patch
          80 kB
          Xiaoyu Yao
        4. HADOOP-17079.005.patch
          81 kB
          Xiaoyu Yao
        5. HADOOP-17079.006.patch
          81 kB
          Xiaoyu Yao
        6. HADOOP-17079.007.patch
          80 kB
          Xiaoyu Yao

            People

              Assignee: Xiaoyu Yao
              Reporter: Xiaoyu Yao
              Votes: 1
              Watchers: 8

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h