Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HADOOP-10388
    • Fix Version/s: HADOOP-10388
    • Component/s: native
    • Labels:
      None

      Description

      Add the ability for the native RPCv9 client to set a username when initiating a connection.

      1. HADOOP-10564.001.patch
        23 kB
        Colin Patrick McCabe
      2. HADOOP-10564-pnative.002.patch
        30 kB
        Colin Patrick McCabe
      3. HADOOP-10564-pnative.003.patch
        30 kB
        Colin Patrick McCabe
      4. HADOOP-10564-pnative.004.patch
        30 kB
        Colin Patrick McCabe
      5. HADOOP-10564-pnative.005.patch
        33 kB
        Colin Patrick McCabe
      6. HADOOP-10564-pnative.006.patch
        33 kB
        Binglin Chang

        Activity

        Colin Patrick McCabe added a comment -

        committed. thanks, guys

        Binglin Chang added a comment -

        patch lgtm +1
        update the patch one line to match latest branch HEAD.

        Colin Patrick McCabe added a comment -

        Hi Binglin,

        Earlier, on HADOOP-10389, you made a comment that we should set call id rather than just using 0 everywhere. I implemented this in version #5 of this patch.

        I think the comment was this:

        1. To my understanding, the RPC client should have a map<callid, call> to record all unfinished calls, but I could not find any code that assigns call IDs (they are only ever set to 0) or manages unfinished calls. Could you help me locate that logic?

        We don't need a map here, since we can only have one call in flight at a time (the server doesn't yet support multiple outstanding calls per connection). In the future this might change, but for now it's OK to just check that the call ID in the response matches the call ID we sent in the request.

        Colin Patrick McCabe added a comment -

        Abe said:

        In hadoop-native-core/common/user.c, it seems like it should be ERANGE? It seems like it's checking an upper boundary.

        OK.

        Binglin said:

        1. It's hard to remember which fields need to be freed (some are stack-allocated, some are heap-allocated) and which don't; could you add comments on each field's memory ownership?

        I added comments to hrpc_proxy_init.

        2. In the patch, line 690 is duplicated:

        Fixed

        3. reactor.c line 71: RB_NFIND may always find nothing, based on the contents of the RB-tree compare method (only pointer equality counts as equal). I am not familiar with the RB-tree's semantics, and the header file doesn't provide any documentation. Also, hrpc_conn_usable may be redundant, since RB_NFIND already checks those fields.

        There is some documentation in tree.h:

        /* Finds the first node greater than or equal to the search key */  \
        attr struct type *              \
        name##_RB_NFIND(struct name *head, struct type *elm)      \
        

        As the comment says, RB_NFIND finds the first node greater than or equal to the search key. This is similar to java.util.TreeMap.ceilingKey (floorKey is the greater-than-or-equal counterpart's mirror, returning the greatest key less than or equal to the search key).

        Binglin Chang added a comment -

        Hi Colin, thanks for the patch. Some comments:
        1. It's hard to remember which fields need to be freed (some are stack-allocated, some are heap-allocated) and which don't; could you add comments on each field's memory ownership?
        2. In the patch, line 690 is duplicated:

        690 +    proxy->call.remote = *remote;
        691 +    proxy->call.remote = *remote;
        

        3. reactor.c line 71: RB_NFIND may always find nothing, based on the contents of the RB-tree compare method (only pointer equality counts as equal). I am not familiar with the RB-tree's semantics, and the header file doesn't provide any documentation. Also, hrpc_conn_usable may be redundant, since RB_NFIND already checks those fields.

        Abraham Elmahrek added a comment -

        Looking better.

        ret = ENOMEM;
        

        In hadoop-native-core/common/user.c, it seems like it should be ERANGE? It seems like it's checking an upper boundary.

        Colin Patrick McCabe added a comment -

        """void hrpc_proxy_init(struct hrpc_proxy *proxy, ...""" seems to be missing a @param comment about the username

        fixed

        If you jump to done, I don't think you can set 'ret'?

        fixed

        Users of the client would need to be cognizant of the fact that "uid_to_string" allocates from the heap. This delegates memory management to the user. Usernames generally have a maximum size. Why not just allocate enough memory to exceed most systems' maximum size and strcpy?

        This function will only be accessible from libhdfs and libyarn... so I think it's less of a concern, right?

        Why get rid of the builder pattern?

        When I was working on the HDFS client, I found that putting it on the stack was WAY easier and faster. I liked the builder stuff but it was getting too verbose... and I (almost) don't have to make any allocations this way

        Abraham Elmahrek added a comment -

        A few nits:

        1. """void hrpc_proxy_init(struct hrpc_proxy *proxy, ...""" seems to be missing a @param comment about the username
        2. You seem to have goto logic of the following form:
          goto done;
          ret = ENOMEM;
          goto done;
          

          If you jump to done, I don't think you can set 'ret'?

        3. The following seems like a tough thing to manage:
          *out = strdup(result->pw_name);
          

          Users of the client would need to be cognizant of the fact that "uid_to_string" allocates from the heap. This delegates memory management to the user. Usernames generally have a maximum size. Why not just allocate enough memory to exceed most systems' maximum size and strcpy?

        4. Why get rid of the builder pattern?
        Colin Patrick McCabe added a comment -

        Hi Colin, about user.h: we may need a struct to represent a user (like UGI in Hadoop), so that in the future more things can be added to it, like auth method, tokens...

        Yeah. We probably want a ugi struct at some point. I look at the stuff I'm adding here (a way to get a user name from the current user ID) as a building block for that. One step at a time...

        Binglin Chang added a comment -

        Hi Colin, about user.h: we may need a struct to represent a user (like UGI in Hadoop), so that in the future more things can be added to it, like auth method, tokens. Something like:
        struct hadoop_user;
        hadoop_user_(alloc|get_login|free)
        It is better to add it while the change is small. Thoughts?

        Colin Patrick McCabe added a comment -

        To be more specific, I put spaces around the PRId64, etc. macros, and added #include <inttypes.h> to a few files that needed it on certain platforms.

        Colin Patrick McCabe added a comment -

        Added some compilation fixes suggested by Binglin and Yongjun.

        Colin Patrick McCabe added a comment -

        This patch adds the ability to set the username in the IpcConnectionContext. It does not yet implement authentication (SASL, etc.); it just sets that field.

        I also changed hrpc_proxy so that instances can be created on the stack. This is helpful in the forthcoming RPC client, since it means that blocking RPCs can just create one of these on the stack and make a call without having to call malloc().


          People

          • Assignee:
            Colin Patrick McCabe
          • Reporter:
            Colin Patrick McCabe
          • Votes:
            0
          • Watchers:
            4
