Hadoop Common / HADOOP-7160

Configurable initial buffersize for getGroupDetails()

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: native, security
    • Labels: None

    Description

      trunk/src/native/src/org/apache/hadoop/security/getGroup.c
      int getGroupDetails(gid_t group, char **grpBuf) {
        struct group * grp = NULL;
        size_t currBufferSize = sysconf(_SC_GETGR_R_SIZE_MAX);
        if (currBufferSize < 1024) {
          currBufferSize = 1024;
        }
        *grpBuf = NULL; 
        char *buf = (char*)malloc(sizeof(char) * currBufferSize);
      
        if (!buf) {
          return ENOMEM;
        }
        int error;
        for (;;) {
          error = getgrgid_r(group, (struct group*)buf,
                             buf + sizeof(struct group),
                             currBufferSize - sizeof(struct group), &grp);
          if(error != ERANGE) {
             break;
          }
          free(buf);
          currBufferSize *= 2;
          buf = malloc(sizeof(char) * currBufferSize);
          if(!buf) {
            return ENOMEM;
          }
      ...
      

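      For large groups, this implies at least 2 queries for the group. Each ERANGE result doubles the buffer and repeats the lookup, so with a 1024-byte starting buffer, number of queries = 1 + math.ceil(math.log(response_size/1024, 2)) for response_size > 1024 (ignoring the sizeof(struct group) bytes reserved at the start of the buffer).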
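
      The query count can be checked by simulating the doubling loop directly. The sketch below is illustrative only: count_lookups() is a hypothetical helper, not part of getGroup.c, and it assumes the whole buffer is usable for the response (it ignores the sizeof(struct group) offset).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in getGroup.c): count how many getgrgid_r()
 * calls the doubling loop would issue before the buffer reaches `needed`
 * bytes, starting from `initial` bytes.
 */
unsigned count_lookups(size_t needed, size_t initial) {
  assert(initial > 0);
  unsigned lookups = 1;            /* the first attempt always happens */
  size_t size = initial;
  /* Each ERANGE doubles the buffer and costs one more lookup.
   * Stop doubling before size * 2 could overflow size_t. */
  while (size < needed && size <= SIZE_MAX / 2) {
    size *= 2;
    lookups++;
  }
  return lookups;
}
```

      Under these assumptions, a 64 KiB response costs 7 lookups from a 1024-byte start and a single lookup when the initial buffer is already 64 KiB.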

      In a large cluster with central user/group databases (exposed via LDAP, etc.), these repeated lookups put unnecessary load on the central services. This can be alleviated to a large extent by making the initial buffer size a configurable parameter.
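
      One possible shape for the change is sketched below. This is an assumption, not a patch: initial_group_buffer_size(), its `configured` parameter, and MIN_GROUP_BUFFER_SIZE are hypothetical names, and how the configured value would reach the native code (for example, passed down from the Java side) is left open. The sketch keeps the existing 1024-byte floor and the sysconf() hint, and only raises the starting size when a larger value is configured.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Same floor that getGroupDetails() applies today. */
#define MIN_GROUP_BUFFER_SIZE 1024

/*
 * Hypothetical helper: pick the starting buffer size for getgrgid_r().
 * `configured` is the value of an assumed configuration setting;
 * zero or a negative value means "not set".
 */
size_t initial_group_buffer_size(long configured) {
  /* sysconf() returns a long and may return -1 if no hint is available. */
  long size = sysconf(_SC_GETGR_R_SIZE_MAX);
  if (size < MIN_GROUP_BUFFER_SIZE) {
    size = MIN_GROUP_BUFFER_SIZE;
  }
  /* A configured value can only raise the starting size, never lower it. */
  if (configured > size) {
    size = configured;
  }
  return (size_t)size;
}
```

      getGroupDetails() could then start from initial_group_buffer_size(configured) instead of the sysconf() value alone, leaving the ERANGE doubling loop in place as a fallback for groups that still do not fit.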

          People

            Assignee: Unassigned
            Reporter: T Meyarivan