Thrift
  1. Thrift
  2. THRIFT-242

Python struct constructors are clunky and error-prone

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python - Compiler
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      The python constructors are clunky and unnecessarily difficult to use. Instead of

      Cls(value1, value2, ...)

      you must write

      Cls(

      {'arg1': value1, 'arg2': value2}

      )

      (or use the dict constructor instead of literal notation).

      Additionally, having the constructor silently do nothing if a non-dict is passed in (an object of the type being created would be another reasonable guess) or if an argument is mis-spelled causes users unnecessary trouble debugging.

      Since removing the d argument entirely would be backwards-incompatible, the attached patch keeps it, but puts real keyword parameters in for forwards-compatibility. Thus, old code will work with this patch, but new code can be written using keyword args that will still work when the d argument is removed in a release that allows backwards-incompatibility. (Removing the d argument is desireable because otherwise positional arguments can't be used.)

      1. init.patch
        2 kB
        Jonathan Ellis
      2. thrift-242_no_d_argument.patch
        4 kB
        Esteve Fernandez
      3. thrift-242_v2_no_d_argument.patch
        5 kB
        Esteve Fernandez

        Activity

        Jake Farrell made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hide
        David Reiss added a comment -

        It is marked resolved.

        Show
        David Reiss added a comment - It is marked resolved.
        Hide
        Mark Slee added a comment -

        Should we close this issue? Looks done to me, we just updated the website to reflect the change.

        Show
        Mark Slee added a comment - Should we close this issue? Looks done to me, we just updated the website to reflect the change.
        David Reiss made changes -
        Resolution Fixed [ 1 ]
        Assignee Esteve Fernandez [ esteve ]
        Status Open [ 1 ] Resolved [ 5 ]
        Hide
        David Reiss added a comment -

        Committed the fix as r734536. I had to make one additional small change to render_constant_value for constant structures.

        Show
        David Reiss added a comment - Committed the fix as r734536. I had to make one additional small change to render_constant_value for constant structures.
        Hide
        Esteve Fernandez added a comment -

        Could someone apply the patch? Ben gave his approval on removing the d argument and
        David's concern on mutable arguments is already solved in the last version of the patch.

        Show
        Esteve Fernandez added a comment - Could someone apply the patch? Ben gave his approval on removing the d argument and David's concern on mutable arguments is already solved in the last version of the patch.
        Esteve Fernandez made changes -
        Attachment thrift-242_v2_no_d_argument.patch [ 12397651 ]
        Hide
        Esteve Fernandez added a comment -

        Updated patch, constructors now take default values from thrift_spec and use the "field is value" trick.

        Show
        Esteve Fernandez added a comment - Updated patch, constructors now take default values from thrift_spec and use the "field is value" trick.
        Hide
        Esteve Fernandez added a comment -

        Oops, I forgot that. In any case, I would prefer to define defaults in thrift_spec, rather than using a sentinel. I'll upload a new patch by the end of the day.

        Show
        Esteve Fernandez added a comment - Oops, I forgot that. In any case, I would prefer to define defaults in thrift_spec, rather than using a sentinel. I'll upload a new patch by the end of the day.
        Hide
        David Reiss added a comment -

        This suggestion was sent to the dev list. I like it.

        A simple way around this is to write

            _default = object()
        
            def func(x=_default):
                if x is _default:
                    x = DEFAULT
        

        You can use the same _default object for all args of this type in the same
        file. This is a fairly well-known approach to this problem. The only
        possible downside is someone deliberately reaching into the file to pass
        _default as an arg, but in that case they get what they asked for...

        Show
        David Reiss added a comment - This suggestion was sent to the dev list. I like it. A simple way around this is to write _default = object() def func(x=_default): if x is _default: x = DEFAULT You can use the same _default object for all args of this type in the same file. This is a fairly well-known approach to this problem. The only possible downside is someone deliberately reaching into the file to pass _default as an arg, but in that case they get what they asked for...
        Hide
        David Reiss added a comment -

        Hey, there is one thing that concerns me about this patch, regarding the handling of container and structure fields with default values. For an example, look at test/DebugProtoTest.thrift, the OneOfEach structure, the i16_list field. The generated Python code looks like...

          def __init__(self, im_true=None, im_false=None, a_bite=200, integer16=33000, integer32=None, integer64=10000000000, double_precision=None, some_characters=None, zomg_unicode=None, what_who=None, base64=None, byte_list=[
            1,
            2,
            3,
          ], i16_list=[
            1,
            2,
            3,
          ], i64_list=[
            1,
            2,
            3,
          ],):
        

        What this means is that all OneOfEachs constructed without specifying i16_list will have a reference to the same list (Python default args are only evaluated at definition time, not call time). So I wrote this test program...

        #!/usr/bin/env python
        import sys
        sys.path.append('./gen-py')
        from DebugProtoTest.ttypes import OneOfEach
        
        ooe1 = OneOfEach()
        ooe2 = OneOfEach()
        print ooe2.byte_list[0]
        ooe1.byte_list[0] = 0
        print ooe2.byte_list[0]
        

        And it prints 1, 0. I think this needs to be avoided. The two best solutions I can come up with are...

        1/ Default everything (at least non-scalars) to None, and make the body of the _init_ say "if foo is None: foo = DEFAULT". The downside is that it becomes impossible to override a default value with None (in the constructor).
        2/ Take **kwargs. The downside is that it is less self-documenting. Also, I think we would want to write code to throw an exception if someone specified an invalid field.

        Thoughts?

        Show
        David Reiss added a comment - Hey, there is one thing that concerns me about this patch, regarding the handling of container and structure fields with default values. For an example, look at test/DebugProtoTest.thrift, the OneOfEach structure, the i16_list field. The generated Python code looks like... def __init__(self, im_true=None, im_false=None, a_bite=200, integer16=33000, integer32=None, integer64=10000000000, double_precision=None, some_characters=None, zomg_unicode=None, what_who=None, base64=None, byte_list=[ 1, 2, 3, ], i16_list=[ 1, 2, 3, ], i64_list=[ 1, 2, 3, ],): What this means is that all OneOfEachs constructed without specifying i16_list will have a reference to the same list (Python default args are only evaluated at definition time, not call time). So I wrote this test program... #!/usr/bin/env python import sys sys.path.append('./gen-py') from DebugProtoTest.ttypes import OneOfEach ooe1 = OneOfEach() ooe2 = OneOfEach() print ooe2.byte_list[0] ooe1.byte_list[0] = 0 print ooe2.byte_list[0] And it prints 1, 0. I think this needs to be avoided. The two best solutions I can come up with are... 1/ Default everything (at least non-scalars) to None, and make the body of the _ init _ say "if foo is None: foo = DEFAULT". The downside is that it becomes impossible to override a default value with None (in the constructor). 2/ Take **kwargs. The downside is that it is less self-documenting. Also, I think we would want to write code to throw an exception if someone specified an invalid field. Thoughts?
        Hide
        Ben Maurer added a comment -

        I'm fine with removing the dictionary argument completely. Using a dictionary doesn't really help you do a copy constructor, and if you're going to specify fields, I think it's a better practice to not do that in a constructor.

        Show
        Ben Maurer added a comment - I'm fine with removing the dictionary argument completely. Using a dictionary doesn't really help you do a copy constructor, and if you're going to specify fields, I think it's a better practice to not do that in a constructor.
        Hide
        Jonathan Ellis added a comment -

        Note that Esteve's patch does not create an _init_ method at all when no arguments are possible. This is good python behavior. +1 for Esteve's patch.

        Show
        Jonathan Ellis added a comment - Note that Esteve's patch does not create an _ init _ method at all when no arguments are possible. This is good python behavior. +1 for Esteve's patch.
        Hide
        David Reiss added a comment -

        I'd be interested to hear what Ben (Maurer) thinks of this, since ReCaptcha is one of the biggest Python/Thrift users.

        Show
        David Reiss added a comment - I'd be interested to hear what Ben (Maurer) thinks of this, since ReCaptcha is one of the biggest Python/Thrift users.
        Esteve Fernandez made changes -
        Attachment thrift-242_no_d_argument.patch [ 12396915 ]
        Hide
        Esteve Fernandez added a comment -

        Patch that builds constructors without the d argument. Unit tests updated as well.

        Show
        Esteve Fernandez added a comment - Patch that builds constructors without the d argument. Unit tests updated as well.
        Hide
        Will Pierce added a comment -

        You're right, and explicit argument naming will be far more usable for introspecting code.

        Show
        Will Pierce added a comment - You're right, and explicit argument naming will be far more usable for introspecting code.
        Hide
        Jonathan Ellis added a comment -

        That's true, but since the code is autogenerated anyway, there's no reason not to put the exact parameter names in for human readability.

        Show
        Jonathan Ellis added a comment - That's true, but since the code is autogenerated anyway, there's no reason not to put the exact parameter names in for human readability.
        Hide
        Will Pierce added a comment -

        Removing the extra dict to thrift constructors will improve both performance and readability in my python apps.

        Another way to do this might be to use the built in kwargs feature:

        def _init_(self, **d):

        1. start of constructor
        2. initialize self by using d.get(NAME, default) for each NAMEd element? (maybe default is None?)

        justa thought!

        Show
        Will Pierce added a comment - Removing the extra dict to thrift constructors will improve both performance and readability in my python apps. Another way to do this might be to use the built in kwargs feature: def _ init _(self, **d): start of constructor initialize self by using d.get(NAME, default) for each NAMEd element? (maybe default is None?) justa thought!
        Hide
        Jonathan Ellis added a comment -

        I am happy to submit a d-less patch if The Powers That Be want one sooner rather than later. (Who are The Powers here?

        Show
        Jonathan Ellis added a comment - I am happy to submit a d-less patch if The Powers That Be want one sooner rather than later. (Who are The Powers here?
        Hide
        Jonathan Ellis added a comment -

        1) True, but since this addresses a special case of the "having the constructor silently do nothing if nonsense is passed" problem, I think it's worth causing (very) minor trouble if someone is actually explicitly passing {} – or worse, a non-empty dict or any other value, all of which have the same effect! Which illustrates the problem. Basically, if they are doing any of those things, it is an error, and we are not doing them a favor by preserving the existing behavior.

        (Obviously we also break back-compat for any parameters named "d.")

        2) This is a stylistic preference; IMO, it's more clear (for users of help() as well as code divers) to have d be an empty instance of the type it expects, but certainly there is a lot of Python code out there that does things the way you suggest. (You didn't bring up performance as an argument, but for the record, the extra if may be more or less performant depending on what the common case is, so I think that is a wash.) I don't much care here since ultimately I want the d argument to go away.

        3) Good catch.

        Show
        Jonathan Ellis added a comment - 1) True, but since this addresses a special case of the "having the constructor silently do nothing if nonsense is passed" problem, I think it's worth causing (very) minor trouble if someone is actually explicitly passing {} – or worse, a non-empty dict or any other value, all of which have the same effect! Which illustrates the problem. Basically, if they are doing any of those things, it is an error, and we are not doing them a favor by preserving the existing behavior. (Obviously we also break back-compat for any parameters named "d.") 2) This is a stylistic preference; IMO, it's more clear (for users of help() as well as code divers) to have d be an empty instance of the type it expects, but certainly there is a lot of Python code out there that does things the way you suggest. (You didn't bring up performance as an argument, but for the record, the extra if may be more or less performant depending on what the common case is, so I think that is a wash.) I don't much care here since ultimately I want the d argument to go away. 3) Good catch.
        Hide
        Esteve Fernandez added a comment -

        +1 However, I think backwards compatibility cannot be achieved completely. What happens if a structure has a member called "d"?

        I would propose an even more dramatic change and drop the current behavior. The latest release is from March (20080411) and other language libraries have already changed their API, so I think backwards compatibility can be sacrificed in favor of correctness.

        Show
        Esteve Fernandez added a comment - +1 However, I think backwards compatibility cannot be achieved completely. What happens if a structure has a member called "d"? I would propose an even more dramatic change and drop the current behavior. The latest release is from March (20080411) and other language libraries have already changed their API, so I think backwards compatibility can be sacrificed in favor of correctness.
        Hide
        Todd Lipcon added a comment -

        +1 on the general intent, but a couple questions:

        1) Why is there no d argument in the case that there are no members? This seems like it would break back compat if someone was passing in {} before

        2) With a default value of d={}, the loop checking for members in d will always execute. Why not set d=None and then check "if d != None"? This seems to me to be slightly less error prone since (a) passing a non-dictionary type will end up with an error instead of being ignored, and (b) you could then pass in any type that has a dictionary-like behavior (eg defaultdict)

        3) There's a small typo: "d-handing" instead of "d-handling"

        Show
        Todd Lipcon added a comment - +1 on the general intent, but a couple questions: 1) Why is there no d argument in the case that there are no members? This seems like it would break back compat if someone was passing in {} before 2) With a default value of d={}, the loop checking for members in d will always execute. Why not set d=None and then check "if d != None"? This seems to me to be slightly less error prone since (a) passing a non-dictionary type will end up with an error instead of being ignored, and (b) you could then pass in any type that has a dictionary-like behavior (eg defaultdict) 3) There's a small typo: "d-handing" instead of "d-handling"
        Jonathan Ellis made changes -
        Field Original Value New Value
        Attachment init.patch [ 12396742 ]
        Jonathan Ellis created issue -

          People

          • Assignee:
            Esteve Fernandez
            Reporter:
            Jonathan Ellis
          • Votes:
            2 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development