Thrift
  1. Thrift
  2. THRIFT-1115

python TBase class for dynamic (de)serialization, and __slots__ option for memory savings

    Details

    • Patch Info:
      Patch Available

      Description

      This patch adds several new features to the compiler for python and the python libraries, and exercises the new features with expanded unit testing.

      This adds support for generating python classes that have no read() or write() methods. Instead, the generated classes inherit from a new base class, TProtocolDynamic. This new class implements the de/serialization with read() and write() methods that iterate over the derived class's "thrift_spec" class member that describes the structure and types of the thrift. This dynamic evaluation works with both binary and compact protocols, and has the same hook as before for delegating de/serialization to fastbinary.so for the "accelerated binary" protocol. This new baseclass read() method may even be more efficient than the generated explicit read() code for objects with lots of attributes because it doesn't have a case/switch style series of "if field_id == X..." nested inside a loop. Instead, it indexes the decoded field ID into the thrift_spec tuple directly. That efficiency gain is probably just noise though, since the dynamic process probably uses more CPU later on, though I haven't benchmarked it. (Up[date: see the benchmarking results posted below for construction/serialization/deserialization comparisons.)

      If the 'dynamic' flag is given as a -gen py: flag to the compiler, then the generated classes no longer get individual __repr__ and __eq__ and __ne__ methods, instead they inherit from the TProtocolDynamic base class implementation, which uses __slots__ instead of __dict__ for repr and equality testing.

      When "dynamic" python classes are generated, they have very little code, just a constructor and class data. All the work of serialization and deserialization is done by the base class. This produces about 980 lines for DebugProtoTest vs. 3540 lines in default "-gen py" mode, or about 1/3 the original code size.

      The __slots__ support is available without requiring the dynamic base class, so users can save memory using the slots flag to generate non-dict based instances. The memory difference between dict and slots based objects is hard to measure, but seems to be around 10x smaller using slots, as long as the base class also uses __slots__. If the generated classes are old-style, and use slots, there's no memory savings at all, because the base class still creates a __dict__ object for every instance. Python is just tricky when it comes to using __slots__ best.

      The memory savings is pretty astounding using new-style classes and __slots__. Building DebugProtoTest.thrift with: -gen py:dynamic,slots versus -gen py results in some pretty amazing memory savings. I tested by instantiating 1 million of the heavy DebugProtoTest.thrift's CompactProtoTestStruct(), which has 49 attributes in it, using regular "-gen py" code versus "-gen py:dynamic,slots" and compared the VmRSS resident memory usage of both processes. I didn't set any values to any attributes, so every attribute was left with the null value, None. The slots technique used 441 MB with slots vs. 3485 MB using non-slots, non-dynamic generated code. That's about 13% of the original size, or 87% memory savings.

      I tried the same test using a very tiny object instead, the DebugThrift.thift Bonk() class, which only has two fields. For this, I made 10 million instances of Bonk() and the results were very similar: 730 MB with slots vs. 2622 MB using non-slots.

      The speed difference between non-slots and slots is about 25% better with slots, or 14.6 seconds for non-slots 10 million Bonk() test, versus 11.1 seconds for the same slots test, or a savings of 24% time. This only measured object creation time, not serialization/deserialization or instance attribute access. (See the benchmarking results below for serialization/deserialization comparisons).

      If the user wants to subclass the TProtocolDynamic class and override the read/write methods, or add more functionality, this patch also adds a dynbase=CLS thrift compiler option and a "dynimport='from foo import *" option, to specify a different base class (CLS) for the generated "dynamic" objects, and an import line that brings that base class into scope.

      The python unit tests now get yet another test loop layer, with the Makefile.am specifying multiple thrift generation lines, exercising the possible combinations of thrift -gen py:blah, sending the output to multiple gen-py-blah output directories (like gen-py-default, gen-py-dynamicslots, gen-py-newstyle, etc), so that RunClientServer.py can then loop over each of these and do all of its testing (which is combinatorially exhaustive) on each of those subdirectories. It's a pretty insane number of tests now, over 390 individual tests on python >= 2.5 (ssl + multiprocessing). Since the test/py "make check" tests aren't done for the 'casual' installer, I think it's acceptable to have the "make check" tests take this long to complete- but it is annoying to have to wait a few minutes for all the tests to finish/succeed. I think it could be improved a fair bit.

      This patch also makes a few other small changes:

      • adds a comment to the top of each generated .py file with the options to -gen, i.e. "py:dynamic,slots", to make it easier to figure out how the thrift compiler generated code was created, after the fact. This should probably be backported to other languages.
      • added another test for inequality to SerializationTest.py to compare objects of different types, to exercise the __eq__ code more
      • added --genpydir cmdline options to TestServer.py and TestClient.py to let RunClientServer.py loop over the gen-py-* directories as a kind of test controller.
      • reduced verbosity on the TestServer/Client.py code from 2 to 1, since the number of invocations is now huge.
      • changed TestServer's intentional Exception to have a message "Exception test PASSES." instead of "foo", which previously made the unit test output look kind of scary.
      • Adds container read/write methods to TProtocol, only used by TProtocolDynamic for "dynamic" generated classes
      • Added a _VALUES_TO_NAMES list to TType in Thrift.py to make it easier to log about TType related (de)serialization issues.
      • Changes the import line in generated code from "from thrift.Thrift import *" to "from thrift.Thrift import TType, TMessageType" to be a little more controlled about what is imported into the generated-code namespace. May be slightly more efficient too.
      • reduces the inter-test sleep() delays somewhat, to try to speed up the "make check" unit testing a bit. hopefully the delays are large enough that no spurious errors pop up.

      Things to ponder:

      Using -gen py:dynamic,slots actually results in python rejecting any attempt to assign to a non-existent member variable of a thrift object. It's not the same as statically-typed languages blowing up at compile time for the same reason, but it could expose more subtle user-code errors that otherwise are undetected during serialization.

      There's an interesting edge case in python 2.4 that arises due to old-style classes and exception processing that only shows up with new-style classes (those derived ultimately from object). I'm not sure how to handle this, since it's a problem that you don't know exists at code-generation time, since we can't really know what version of python is going to execute the generated code. Usually it just doesn't matter, but the case of exceptions with multiple inheritance is an edge case, it seems. Since this is only a tricky problem for the unit tests, I put some failure detection code into TestClient.py to skip the exception testing if the generated class causes a TypeError exception when raised, instead of itself. The test code prints a warning: "Skipping testing of Xception because python (2, 4) does not support this type of exception (gen py is: gen-py-dynamic)".

      I'm not sure the TProtocolDynamic class name is the best possible name for the base class for "dynamic" objects to derive from. The TProtocolDynamic class does embody the underlying protocol, at an abstract level, but the only other class in thrift/protocol/ that it is similar to is the TProtocolBase class.

      This patch adds more dynamic behavior to the thrift library, and it makes sense that we'd want to take it another step and implement a python class that can parse a .thrift file, and generate the corresponding message/struct/etc classes at runtime. This patch doesn't do that, but I think it makes it easier. Obviously the use-cases are compelling for generating the actual per-class read() and write() methods at thrift-compile time. But there's some interesting possibilities ahead.

      The unit testing code needs a little more TLC... I'm not too familiar with best practices here, but I'd love to see some code-coverage metrics produced by the unit testing / test cases. It would be nice to know if some code paths aren't being exercised. I won't be surprised if there are edge-cases that break the dynamic/slots generated code. It would be nice to know of any, so we can update the unit tests to explore it (and fix whatever it is). The unit tests don't exercise the py:twisted compatible code at all, so there's likely some lurking bugs in there.

      With this patch, the new output from thrift -h relating to python is:

        --gen STR   Generate code with a dynamically-registered generator.
                      STR has the form language[:key1=val1[,key2,[key3=val3]]].
                      Keys and values are options passed to the generator.
                      Many options will not require values.
      
      Available generators (and options):
      
      [ ... snip ... ]
      
        py (Python):
          new_style:       Generate new-style classes.
          twisted:         Generate Twisted-friendly RPC services.
          slots:           Generate code using slots for instance members.
          dynamic:         Generate dynamic code, less code generated but slower.
          dynbase=CLS      Derive generated classes from class CLS instead of TProtocolDynamic.
          dynimport='from foo.bar import CLS'
                           Add an import line to generated code to find the dynbase class.
      

      Sorry for such a long ticket body. I'll attach the patch in a minute or so,

      1. THRIFT-1115.followup_add_import.patch
        0.5 kB
        Will Pierce
      2. THRIFT-1115.python_dynamic_code_and_slots_v6.patch
        46 kB
        Will Pierce
      3. THRIFT-1115.python_dynamic_code_and_slots_v5.patch
        35 kB
        Will Pierce
      4. THRIFT-1115.python_dynamic_code_and_slots_v4.patch
        35 kB
        Will Pierce
      5. THRIFT-1115.python_dynamic_code_and_slots_v3.patch
        33 kB
        Will Pierce
      6. test_dynser.py
        3 kB
        Will Pierce
      7. THRIFT-1115.python_dynamic_code_and_slots_v2.patch
        32 kB
        Will Pierce
      8. test_size.py
        1 kB
        Will Pierce
      9. THRIFT-1115.python_dynamic_code_and_slots_v1.patch
        32 kB
        Will Pierce

        Activity

        Hide
        Will Pierce added a comment -

        I forgot to mention, don't forget to re-run: ./bootstrap.sh; ./configure; make after applying this patch, or else the Makefile.am changes for test/py won't be rebuilt.

        Show
        Will Pierce added a comment - I forgot to mention, don't forget to re-run: ./bootstrap.sh; ./configure; make after applying this patch, or else the Makefile.am changes for test/py won't be rebuilt.
        Hide
        Will Pierce added a comment - - edited

        Attaching test benchmarking script "test_size.py" that measures the RAM used and the Time elapsed to instantiate 1 million thrift objects.

        To use this code:

        1. cd into the current thrift-svn trunk/
        2. apply the patch to trunk
        3. run ./bootstrap.sh && ./configure --with-ruby=no && make
        4. cd test/py (put the test_size.py script here into this test/py/ directory)
        5. make check
        6. hit control-C after the tests start up, since all want is the test-suite to build out the various gen-py-* directories with different variations of generated python code.

        The test code takes 3 arguments:

        • arg1: directory to add to sys.path to import the generated code. One of gen-py (same as gen-py-default), gen-py-dynamicslots, gen-py-newstyle, gen-py-newstyleslots, gen-py-dynamic
        • arg2: name of class from DebugProtoTest and ThriftTest to instantiate for size/speed tests: i.e. Bonk, Empty, OneOfEach, CompactProtoTestStruct, Bools, etc...
        • args: optional, the number of objects to instantiate. Defaults to 1 million. If you have the RAM, try 10 million, i.e. 10000000

        This requires linux, as it gets the RAM usage at end-of-execution from the VmRSS line of /proc/PID/status

        The two particular test-cases that are most interesting (to me) are: gen-py-default vs. gen-py-dynamicslots The gen-py-default is what "-gen py" produces, using old-style non-dynamic generated code. The gen-py-dynamicslots directory contains the results of running "-gen py:dynamic,slots" which produces new-style classes that inherit from the TProtocolDynamic class for serialization/deserialization and uses _slots_ for more efficient storage and faster execution.

        I will post some test results from this script shortly, with the time and space savings that the dynamic/slots based code produces. The savings surprised me.

        Show
        Will Pierce added a comment - - edited Attaching test benchmarking script "test_size.py" that measures the RAM used and the Time elapsed to instantiate 1 million thrift objects. To use this code: cd into the current thrift-svn trunk/ apply the patch to trunk run ./bootstrap.sh && ./configure --with-ruby=no && make cd test/py (put the test_size.py script here into this test/py/ directory) make check hit control-C after the tests start up, since all want is the test-suite to build out the various gen-py-* directories with different variations of generated python code. The test code takes 3 arguments: arg1: directory to add to sys.path to import the generated code. One of gen-py (same as gen-py-default), gen-py-dynamicslots, gen-py-newstyle, gen-py-newstyleslots, gen-py-dynamic arg2: name of class from DebugProtoTest and ThriftTest to instantiate for size/speed tests: i.e. Bonk, Empty, OneOfEach, CompactProtoTestStruct, Bools, etc... args: optional, the number of objects to instantiate. Defaults to 1 million. If you have the RAM, try 10 million, i.e. 10000000 This requires linux, as it gets the RAM usage at end-of-execution from the VmRSS line of /proc/PID/status The two particular test-cases that are most interesting (to me) are: gen-py-default vs. gen-py-dynamicslots The gen-py-default is what "-gen py" produces, using old-style non-dynamic generated code. The gen-py-dynamicslots directory contains the results of running "-gen py:dynamic,slots" which produces new-style classes that inherit from the TProtocolDynamic class for serialization/deserialization and uses _ slots _ for more efficient storage and faster execution. I will post some test results from this script shortly, with the time and space savings that the dynamic/slots based code produces. The savings surprised me.
        Hide
        Will Pierce added a comment -

        Results of testing default generated code versus this patch's code using "-gen py:dynamic,slots":

        Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation
        default code OneOfEach 10000000 14056.4 MB 1473.93 KB 83.20 sec 8.3204 usec/iteration
        dynamic,slots OneOfOEach 10000000 4712.1 MB 494.10 KB 46.13 sec 4.6133 usec/iteration

        The (dynamic,slots) version uses 33.5% the RAM of the default code, and runs 1.8x faster.

        Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation
        default code Empty 10000000 3537.6 MB 370.94 KB 10.93 sec 1.0933 usec/iteration
        dynamic,slots Empty 10000000 239.8 MB 25.14 KB 3.20 sec 0.3196 usec/iteration

        The (dynamic,slots) version uses 6.8% the RAM of the default code, and runs 3.4x faster.

        Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation
        default code CompactProtoTestStruct 5000000 16992.7 MB 3563.64 KB 31.47 sec 6.2935 usec/iteration
        dynamic,slots CompactProtoTestStruct 5000000 2180.5 MB 457.29 KB 31.37 sec 6.2739 usec/iteration

        The (dynamic,slots) version uses 12.8% the RAM of the default code, and runs 1.003x faster (insignificant difference).

        Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation
        default code Bonk 10000000 3537.6 MB 370.94 KB 14.76 sec 1.4763 usec/iteration
        dynamic,slots Bonk 10000000 713.5 MB 74.81 KB 11.06 sec 1.1057 usec/iteration

        The (dynamic,slots) version uses 20.2% the RAM of the default code, and runs 1.3x faster.

        Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation
        default code Xtruct 10000000 3537.6 MB 370.94 KB 15.78 sec 1.5779 usec/iteration
        dynamic,slots Xtruct 10000000 877.5 MB 92.01 KB 13.29 sec 1.3290 usec/iteration

        The (dynamic,slots) version uses 24.8% the RAM of the default code, and runs 1.2x faster.

        Show
        Will Pierce added a comment - Results of testing default generated code versus this patch's code using "-gen py:dynamic,slots": Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation default code OneOfEach 10000000 14056.4 MB 1473.93 KB 83.20 sec 8.3204 usec/iteration dynamic,slots OneOfOEach 10000000 4712.1 MB 494.10 KB 46.13 sec 4.6133 usec/iteration The (dynamic,slots) version uses 33.5% the RAM of the default code, and runs 1.8x faster. Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation default code Empty 10000000 3537.6 MB 370.94 KB 10.93 sec 1.0933 usec/iteration dynamic,slots Empty 10000000 239.8 MB 25.14 KB 3.20 sec 0.3196 usec/iteration The (dynamic,slots) version uses 6.8% the RAM of the default code, and runs 3.4x faster. Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation default code CompactProtoTestStruct 5000000 16992.7 MB 3563.64 KB 31.47 sec 6.2935 usec/iteration dynamic,slots CompactProtoTestStruct 5000000 2180.5 MB 457.29 KB 31.37 sec 6.2739 usec/iteration The (dynamic,slots) version uses 12.8% the RAM of the default code, and runs 1.003x faster (insignificant difference). Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation default code Bonk 10000000 3537.6 MB 370.94 KB 14.76 sec 1.4763 usec/iteration dynamic,slots Bonk 10000000 713.5 MB 74.81 KB 11.06 sec 1.1057 usec/iteration The (dynamic,slots) version uses 20.2% the RAM of the default code, and runs 1.3x faster. Test Case Class Type Number Objects Total RAM MB KB / Object Total Elapsed Sec Microseconds / Object Instantiation default code Xtruct 10000000 3537.6 MB 370.94 KB 15.78 sec 1.5779 usec/iteration dynamic,slots Xtruct 10000000 877.5 MB 92.01 KB 13.29 sec 1.3290 usec/iteration The (dynamic,slots) version uses 24.8% the RAM of the default code, and runs 1.2x faster.
        Hide
        Will Pierce added a comment -

        So far, the general savings seems to be about 10% to 33% memory usage (1/10th to 1/3rd original sizes), and speedups varying from no-change to ~3x faster. The memory savings seems to always be significant, but the time-savings, or object-instantiation seems to level out when a struct has a ton of fields, like the CompactProtoTestStruct which has 49 fields in it. The "dynamic,slots" versions of objects are actually new-style objects, so this code ends up producing code that is faster using new-style python classes than old-style classes, since the TProtocolDynamic base class itself inherits from object.

        I haven't yet benchmarked the serialization/deserialization time differences between the "dynamic,slots" version of generated code versus default non-dynamic code. I'm not surprised by the memory savings in the object-instantiation tests so far, but I am surprised that the "dynamic,slots" generated code is so much faster in some cases. I don't expect the dynamic serialization/deserialization to be faster than the static(default) generated code, but only real benchmarking will answer that question.

        Show
        Will Pierce added a comment - So far, the general savings seems to be about 10% to 33% memory usage (1/10th to 1/3rd original sizes), and speedups varying from no-change to ~3x faster. The memory savings seems to always be significant, but the time-savings, or object-instantiation seems to level out when a struct has a ton of fields, like the CompactProtoTestStruct which has 49 fields in it. The "dynamic,slots" versions of objects are actually new-style objects, so this code ends up producing code that is faster using new-style python classes than old-style classes, since the TProtocolDynamic base class itself inherits from object. I haven't yet benchmarked the serialization/deserialization time differences between the "dynamic,slots" version of generated code versus default non-dynamic code. I'm not surprised by the memory savings in the object-instantiation tests so far, but I am surprised that the "dynamic,slots" generated code is so much faster in some cases. I don't expect the dynamic serialization/deserialization to be faster than the static(default) generated code, but only real benchmarking will answer that question.
        Hide
        Will Pierce added a comment -

        FYI, the object instantiation time is not as significant compared to the serialization/deserialization time. I'm working on some additional benchmarks to test the performance of de/serialization. There may be a _v2 of the original patch if I spot any ways to speed up the _v1 patch in this ticket without making the dynamic code evil. The dynamic read() and write() methods from TProtocolDynamic are slower than the statically generated read()/write() methods. It won't be surprising if the dynamic read/write baseclass methods are slower than the static methods, I suppose.

        Show
        Will Pierce added a comment - FYI, the object instantiation time is not as significant compared to the serialization/deserialization time. I'm working on some additional benchmarks to test the performance of de/serialization. There may be a _v2 of the original patch if I spot any ways to speed up the _v1 patch in this ticket without making the dynamic code evil. The dynamic read() and write() methods from TProtocolDynamic are slower than the statically generated read()/write() methods. It won't be surprising if the dynamic read/write baseclass methods are slower than the static methods, I suppose.
        Hide
        Will Pierce added a comment -

        version 2 of this patch makes some slight changes for speed in the container type read/write methods in TProtocol and a slight speedup in the TProtocolDynamic write method. This just makes the loops a little more efficient by adjusting the loops to do invariant work outside the loop where it's simple.

        Show
        Will Pierce added a comment - version 2 of this patch makes some slight changes for speed in the container type read/write methods in TProtocol and a slight speedup in the TProtocolDynamic write method. This just makes the loops a little more efficient by adjusting the loops to do invariant work outside the loop where it's simple.
        Hide
        Will Pierce added a comment -

        benchmarking script, copy it into trunk/test/py/ and use the same instructions as above for the test_size.py script (build the code, and run "make check")

        This benchmarking script measures the execution time of object instantiation, serialization, and deserialization, taking the specific gen-py-* diretory name on the cmdline. This is useful for comparing the speeds of default generated code versus the new dynamic,slots version. I'll post the results I got in a table comment in a moment.

        Show
        Will Pierce added a comment - benchmarking script, copy it into trunk/test/py/ and use the same instructions as above for the test_size.py script (build the code, and run "make check") This benchmarking script measures the execution time of object instantiation, serialization, and deserialization, taking the specific gen-py-* diretory name on the cmdline. This is useful for comparing the speeds of default generated code versus the new dynamic,slots version. I'll post the results I got in a table comment in a moment.
        Hide
        Will Pierce added a comment -

        Benchmark results from testing 3 million of construction, serialization, and deserialization (executed in test/py/): for j in Bonk CompactProtoTestStruct; do echo Testing $j; for i in gen-py-default gen-py-dynamic; do ./test_dynser.py $i Bonk 3000000; echo; done ; echo;echo ; done Pasted into table format for easier comparison.

        Test Case Class Type Number Objects Object Instantiation Serialization Deserialization
        default Bonk 3000000 1.0380 usec/iteration 15.7197 usec/iteration 23.0692 usec/iteration
        dynamic,slots Bonk 3000000 1.0117 usec/iteration 17.7777 usec/iteration 24.5451 usec/iteration

        Instantiation time iss about the same (see Note below), and Serialization is 13% slower with dynamic,slots, and Deserialization is 6% slower with dynamic,slots. The Bonk object only has two fields, a string and an integer.

        Test Case Class Type Number Objects Object Instantiation Serialization Deserialization
        default CompactProtoTestStruct 3000000 6.5773 usec/iteration 83.5817 usec/iteration 145.0053 usec/iteration
        dynamic,slots CompactProtoTestStruct 3000000 7.3533 usec/iteration 112.5280 usec/iteration 161.8131 usec/iteration

        Instantiation time is 11% slower with dynamic,slots, and Serialization is 34% slower with dynamic,slots, and Deserialization is 11% slower with dynamic,slots. The CompactProtoTestStruct has 49 fields, and the test code initializes 12 of them, to simulate a realistic object.

        • Note: I think the object construction times are different in this test from the test_size.py results because in this test, the constructed objects are thrown away, so a lot of the time measured in test_size.py is due to memory management in cpython needing to build out tons of storage for the heavier objects from gen-default.

        So, it seems that thrift objects using the dynamic,slots feature from this patch are about 10-30% slower at serialization and deserialization, but they use about 1/8 to 1/5th the RAM and about 1/3 the generated code size.

        Show
        Will Pierce added a comment - Benchmark results from testing 3 million of construction, serialization, and deserialization (executed in test/py/): for j in Bonk CompactProtoTestStruct; do echo Testing $j; for i in gen-py-default gen-py-dynamic; do ./test_dynser.py $i Bonk 3000000; echo; done ; echo;echo ; done Pasted into table format for easier comparison. Test Case Class Type Number Objects Object Instantiation Serialization Deserialization default Bonk 3000000 1.0380 usec/iteration 15.7197 usec/iteration 23.0692 usec/iteration dynamic,slots Bonk 3000000 1.0117 usec/iteration 17.7777 usec/iteration 24.5451 usec/iteration Instantiation time iss about the same (see Note below), and Serialization is 13% slower with dynamic,slots, and Deserialization is 6% slower with dynamic,slots. The Bonk object only has two fields, a string and an integer. Test Case Class Type Number Objects Object Instantiation Serialization Deserialization default CompactProtoTestStruct 3000000 6.5773 usec/iteration 83.5817 usec/iteration 145.0053 usec/iteration dynamic,slots CompactProtoTestStruct 3000000 7.3533 usec/iteration 112.5280 usec/iteration 161.8131 usec/iteration Instantiation time is 11% slower with dynamic,slots, and Serialization is 34% slower with dynamic,slots, and Deserialization is 11% slower with dynamic,slots. The CompactProtoTestStruct has 49 fields, and the test code initializes 12 of them, to simulate a realistic object. Note : I think the object construction times are different in this test from the test_size.py results because in this test, the constructed objects are thrown away, so a lot of the time measured in test_size.py is due to memory management in cpython needing to build out tons of storage for the heavier objects from gen-default. So, it seems that thrift objects using the dynamic,slots feature from this patch are about 10-30% slower at serialization and deserialization, but they use about 1/8 to 1/5th the RAM and about 1/3 the generated code size.
        Hide
        Will Pierce added a comment -

        I would love to hear some feedback on this patch. The memory savings it provides are very large compared to the extra CPU this option costs. As an optional new feature that doesn't break existing code/apis, I think it's valuable. Any thoughts?

        Show
        Will Pierce added a comment - I would love to hear some feedback on this patch. The memory savings it provides are very large compared to the extra CPU this option costs. As an optional new feature that doesn't break existing code/apis, I think it's valuable. Any thoughts?
        Hide
        Bryan Duxbury added a comment -

        As a non-python user, I'm pro the changes. If no one objects, I'll commit this today.

        Show
        Bryan Duxbury added a comment - As a non-python user, I'm pro the changes. If no one objects, I'll commit this today.
        Hide
        Will Pierce added a comment -

        Thanks Bryan! I'm thinking of adding TJSONProtocol support next. (Not that I would ever use it in production, but just for completeness on the python side.)

        Show
        Will Pierce added a comment - Thanks Bryan! I'm thinking of adding TJSONProtocol support next. (Not that I would ever use it in production, but just for completeness on the python side.)
        Hide
        David Reiss added a comment -

        I think this is a solid design and definitely the way we should be moving with the Python bindings. There are a bunch of small things that need to be addressed.

        You mentioned an "interesting edge case", but didn't describe what it was. Can you explain the cause?

        You're right that TProtocolDynamic is not the right name. I think that Java calls this TBase.

        I think that the TProtocol.readStruct should be the real implementation and TBase.read should be the wrapper, not the other way around.

        In the makefile, you define thrift_gen1, thrift_gen2, etc. Why not just make these all part of the same variable?

        In the makefile, you use "mkdir gen-py-dynamicslots || true". I think "test -d gen-py-dynamicslots || mkdir gen-py-dynamicslots" is safer.

        +      my_val, other_val = [getattr(obj, attr) for obj in [self, other]]
        

        I think this is more understandable as just

        my_val = getattr(self, attr)
        other_val = getattr(other, attr)
        

        +      if my_val is other_val:
        +        continue
        +      if my_val != other_val:
        +          return False
        

        Indentation is off here, plus I'm a bit skeptical of the optimization of checking "is" first.

        +      finally:
        +        iprot.readFieldEnd()
        

        This should not be in a finally. We don't want to try to keep reading if something goes wrong.

        +          field = self.thrift_spec[fid]
        ...
        +        except IndexError:
        +          iprot.skip(ftype)
        

        Having the try block for this line so far away makes it difficult to read and also increases the risk of an exception coming from another line and incorrectly triggering the catch block. I'd rather see this as

        try:
          field = self.thrift_spec[fid]
        except IndexError:
          iprot.skip(ftype)
        else:
          # read the field
        self.readFieldEnd()
        

        +    if r_handler is None:
        +      return
        

        This should be an exception.

        +      except TypeError:
        +        pass # TODO: add logging about unhashable types. i.e. d=dict(); d[[0,1]] = 2 fails
        

        This should be an exception.

        Show
        David Reiss added a comment - I think this is a solid design and definitely the way we should be moving with the Python bindings. There are a bunch of small things that need to be addressed. You mentioned an "interesting edge case", but didn't describe what it was. Can you explain the cause? You're right that TProtocolDynamic is not the right name. I think that Java calls this TBase. I think that the TProtocol.readStruct should be the real implementation and TBase.read should be the wrapper, not the other way around. In the makefile, you define thrift_gen1, thrift_gen2, etc. Why not just make these all part of the same variable? In the makefile, you use "mkdir gen-py-dynamicslots || true". I think "test -d gen-py-dynamicslots || mkdir gen-py-dynamicslots" is safer. + my_val, other_val = [getattr(obj, attr) for obj in [self, other]] I think this is more understandable as just my_val = getattr(self, attr) other_val = getattr(other, attr) – + if my_val is other_val: + continue + if my_val != other_val: + return False Indentation is off here, plus I'm a bit skeptical of the optimization of checking "is" first. + finally: + iprot.readFieldEnd() This should not be in a finally. We don't want to try to keep reading if something goes wrong. + field = self.thrift_spec[fid] ... + except IndexError: + iprot.skip(ftype) Having the try block for this line so far away makes it difficult to read and also increases the risk of an exception coming from another line and incorrectly triggering the catch block. I'd rather see this as try: field = self.thrift_spec[fid] except IndexError: iprot.skip(ftype) else: # read the field self.readFieldEnd() – + if r_handler is None: + return This should be an exception. + except TypeError: + pass # TODO: add logging about unhashable types. i.e. d=dict(); d[[0,1]] = 2 fails This should be an exception.
        Hide
        Will Pierce added a comment -

        Hi David, thanks for reviewing this! I'm glad to get your feedback (which I agree with on all counts actually- good eye). I'll address all the points and submit an updated patch either later tonight or tomorrow morning. I really appreciate the time you put in to reviewing this patch. I wasn't sure where the right place for the dynamic read/write should go. TBase is definitely the name I was looking for. Thanks! -W

        Show
        Will Pierce added a comment - Hi David, thanks for reviewing this! I'm glad to get your feedback (which I agree with on all counts actually- good eye). I'll address all the points and submit an updated patch either later tonight or tomorrow morning. I really appreciate the time you put in to reviewing this patch. I wasn't sure where the right place for the dynamic read/write should go. TBase is definitely the name I was looking for. Thanks! -W
        Hide
        Will Pierce added a comment -

        Took me some time to get back to this, but I am almost done with an updated patch that makes all but one of the changes you recommended. Here's an update:

        On the "interesting edge case" with exceptions. Python 2.4 apparently doesn't let you raise an exception which is a new style class... Here's a snippet of demo code (to run from trunk/test/py/ after "make check" has built the gen-py-* subdirectories):

        % cat te.py
        #!/usr/bin/python2.4
        import sys, glob
        sys.path.insert(0, 'gen-py-dynamicslots')
        sys.path.insert(0, glob.glob('../../lib/py/build/lib.*')[0])
        import thrift
        import ThriftTest
        import ThriftTest.ttypes
        try:
            raise ThriftTest.ttypes.Xception()
        except ThriftTest.ttypes.Xception:
            print 'OK!'
        %
        % python2.7 te.py 
        OK!
        % python2.4 te.py 
        Traceback (most recent call last):
          File "te.py", line 9, in ?
            raise ThriftTest.ttypes.Xception()
        TypeError: exceptions must be classes, instances, or strings (deprecated), not Xception
        % 
        

        If you change the sys.path to include ./gen-py-default/ instead of ./gen-py-dynamicslots/ then the code works for both python2.4 and python2.7.

        The edge case is that the -gen py:dynamic option generates code that doesn't work with python2.4 for exception structs. The TestClient.py code has code in testException (in v1 and v2 of the patch) for skipping the exception test, if raising Xception itself raises the TypeError. What's going on, is the generated Xception code for dynamic type code (-gen py:dynamic) has to use multiple inheritance to be both an Exception and a TBase (correct name). The Exception makes it raisable, and the TBase gives it .read and .write (for the -gen py:dynamic case). TBase inherits from object, so the weird python2.4 limitation on exceptions kicks in. (Found this on python.org to explain it.)

        I think this can be fixed by making TBase an old style class (remove 'object' from its class definition.) I'm trying that now, and will post an update if it works and what impacts it has on the performance test results (if any).

        To my mind, the edge case is that we don't really know what version of python will be used to execute this code, when we're generating code at compile time. An ugly way to work around this would be to let TBase remain a new style class, and offer an additional compiler flag to indicate that users' exception structs need to be old style classes, and generate out their explicit read and write methods so a class like Xception only inherits from Exception, which will always work in python 2.x and up. That is pretty ugly though, to have two different serialization/deserialization methods in use at the same time, in the same code.

        I'm hoping that just making TBase into an old-style class will just work without affecting any of the performance tests. I'll post results soon. The 396 unit test cases all pass with TBase as an old style class, so now it's just performance tests that I have to check.

        Show
        Will Pierce added a comment - Took me some time to get back to this, but I am almost done with an updated patch that makes all but one of the changes you recommended. Here's an update: On the "interesting edge case" with exceptions. Python 2.4 apparently doesn't let you raise an exception which is a new style class... Here's a snippet of demo code (to run from trunk/test/py/ after "make check" has built the gen-py-* subdirectories): % cat te.py #!/usr/bin/python2.4 import sys, glob sys.path.insert(0, 'gen-py-dynamicslots') sys.path.insert(0, glob.glob('../../lib/py/build/lib.*')[0]) import thrift import ThriftTest import ThriftTest.ttypes try: raise ThriftTest.ttypes.Xception() except ThriftTest.ttypes.Xception: print 'OK!' % % python2.7 te.py OK! % python2.4 te.py Traceback (most recent call last): File "te.py", line 9, in ? raise ThriftTest.ttypes.Xception() TypeError: exceptions must be classes, instances, or strings (deprecated), not Xception % If you change the sys.path to include ./gen-py-default/ instead of ./gen-py-dynamicslots/ then the code works for both python2.4 and python2.7. The edge case is that the -gen py:dynamic option generates code that doesn't work with python2.4 for exception structs. The TestClient.py code has code in testException (in v1 and v2 of the patch) for skipping the exception test, if raising Xception itself raises the TypeError. What's going on, is the generated Xception code for dynamic type code (-gen py:dynamic) has to use multiple inheritance to be both an Exception and a TBase (correct name). The Exception makes it raisable, and the TBase gives it .read and .write (for the -gen py:dynamic case). TBase inherits from object, so the weird python2.4 limitation on exceptions kicks in. (Found this on python.org to explain it.) I think this can be fixed by making TBase an old style class (remove 'object' from its class definition.) I'm trying that now, and will post an update if it works and what impacts it has on the performance test results (if any). To my mind, the edge case is that we don't really know what version of python will be used to execute this code, when we're generating code at compile time. An ugly way to work around this would be to let TBase remain a new style class, and offer an additional compiler flag to indicate that users' exception structs need to be old style classes, and generate out their explicit read and write methods so a class like Xception only inherits from Exception, which will always work in python 2.x and up. That is pretty ugly though, to have two different serialization/deserialization methods in use at the same time, in the same code. I'm hoping that just making TBase into an old-style class will just work without affecting any of the performance tests. I'll post results soon. The 396 unit test cases all pass with TBase as an old style class, so now it's just performance tests that I have to check.
        Hide
        Will Pierce added a comment -

        Hmm, bad news. The TBase class needs to be a new-style class, or else the _slots_ feature has no massive memory savings benefit.

        Old style classes seem to always have a _dict_ object. A quote from the python.org link from my previous comment is succinct: "New Style classes can use descriptors (including _slots_), and Old Style classes cannot."

        % cat ob.py 
        #!/usr/bin/python
        class cls_old:
          __slots__ = []
        class cls_new(object):
          __slots__ = []
        
        c_new = cls_new()
        c_old = cls_old()
        
        print 'Old style object __dict__:', c_old.__dict__  # has one.
        print 'New style object __dict__:', c_new.__dict__  # does not, good!
        
        % python ob.py
        Old style object __dict__: {}
        New style object __dict__:
        Traceback (most recent call last):
          File "./ob.py", line 11, in <module>
            print 'New style object __dict__:', c_new.__dict__
        AttributeError: 'cls_new' object has no attribute '__dict__'
        

        So, to get huge memory savings from _slots_, we need TBase to be a new-style class. But, python2.4 won't let Exceptions derive from object, so there's a couple options ahead:

        Either:

        • Let it slide, and if someone chooses to use the -gen py:dynamic option, warn that generated Exception structs will break on python <= 2.4.x.

        Or:

        • Generate exception classes that don't inherit from TBase, so they don't inherit from object, and can be raised in python2.4.
          • They will either need to inherit from an old-style version of TBase that implements the same API. Might be doable without duplicating any code anywhere, with a new TBaseSlots kind of class maybe...
          • Or, the exception structs need to have their own explicitly generated read() and write() methods, so they won't have anything to do with TBase. This isn't too awful, since exception structs are almost always going to be very simple objects.

        Maybe "interesting edge case" was an understatement.

        What do you think is the right approach? I really want to see the memory savings from using _slots_ and have super-small generated code files for my structs. But supporting python2.4 is also important.

        Maybe this is moot? If someone tells the compiler to generate code with the optional 'dynamic' flag, then maybe it's just up to them to make sure they either avoid python2.4 and user-defined exceptions (if it's a non-RPC application), or use python2.5 or newer. The default output from the compiler still works in all 100% cases, so this is really only a problem if someone uses the -gen py:dynamic feature. Cavear emptor?

        I'm interested in hearing feedback...

        Show
        Will Pierce added a comment - Hmm, bad news. The TBase class needs to be a new-style class, or else the _ slots _ feature has no massive memory savings benefit. Old style classes seem to always have a _ dict _ object. A quote from the python.org link from my previous comment is succinct: "New Style classes can use descriptors (including _ slots _), and Old Style classes cannot." % cat ob.py #!/usr/bin/python class cls_old: __slots__ = [] class cls_new(object): __slots__ = [] c_new = cls_new() c_old = cls_old() print 'Old style object __dict__:', c_old.__dict__ # has one. print 'New style object __dict__:', c_new.__dict__ # does not, good! % python ob.py Old style object __dict__: {} New style object __dict__: Traceback (most recent call last): File "./ob.py", line 11, in <module> print 'New style object __dict__:', c_new.__dict__ AttributeError: 'cls_new' object has no attribute '__dict__' So, to get huge memory savings from _ slots _ , we need TBase to be a new-style class. But, python2.4 won't let Exceptions derive from object, so there's a couple options ahead: Either: Let it slide, and if someone chooses to use the -gen py:dynamic option, warn that generated Exception structs will break on python <= 2.4.x. Or: Generate exception classes that don't inherit from TBase, so they don't inherit from object, and can be raised in python2.4. They will either need to inherit from an old-style version of TBase that implements the same API. Might be doable without duplicating any code anywhere, with a new TBaseSlots kind of class maybe... Or, the exception structs need to have their own explicitly generated read() and write() methods, so they won't have anything to do with TBase. This isn't too awful, since exception structs are almost always going to be very simple objects. Maybe "interesting edge case" was an understatement. What do you think is the right approach? I really want to see the memory savings from using _ slots _ and have super-small generated code files for my structs. But supporting python2.4 is also important. Maybe this is moot? If someone tells the compiler to generate code with the optional 'dynamic' flag, then maybe it's just up to them to make sure they either avoid python2.4 and user-defined exceptions (if it's a non-RPC application), or use python2.5 or newer. The default output from the compiler still works in all 100% cases, so this is really only a problem if someone uses the -gen py:dynamic feature. Cavear emptor? I'm interested in hearing feedback...
        Hide
        Will Pierce added a comment -

        David, at first I didn't understand what you meant when you said "I think that the TProtocol.readStruct should be the real implementation and TBase.read should be the wrapper, not the other way around." It finally clicked. I agree. The code looks a lot cleaner without lots of repeated calls out to iprot.blah() and oprot.blah() all over the place.

        I confirmed you are correct, that the _eq_ method didn't need the extra check to compare "is" equality before testing != inequality. Looking at the disassembled bytecode, they both generate the same COMPARE_OP followed by a JUMP instruction, so it was redundant.

        I have an idea on how to work around the python2.4 exception issue. The "interesting edge case" could be solved if the generated code for thrift exceptions inherited from something like TBase which implements the same API but is an old-style class. After I moved the readStruct/writeStruct code out of TBase and into TProtocol, there's very little left (that's got to be good for a class named TBase). With very little repeated code, I think we can have a TExceptionBase class that is old-style, so is raisable by python2.4, and still do dynamic read() and write() from TProtocol's new readStruct() and writeStruct().

        I will have version 3 of this patch uploaded in the next hour.

        Show
        Will Pierce added a comment - David, at first I didn't understand what you meant when you said "I think that the TProtocol.readStruct should be the real implementation and TBase.read should be the wrapper, not the other way around." It finally clicked. I agree. The code looks a lot cleaner without lots of repeated calls out to iprot.blah() and oprot.blah() all over the place. I confirmed you are correct, that the _ eq _ method didn't need the extra check to compare "is" equality before testing != inequality. Looking at the disassembled bytecode, they both generate the same COMPARE_OP followed by a JUMP instruction, so it was redundant. I have an idea on how to work around the python2.4 exception issue. The "interesting edge case" could be solved if the generated code for thrift exceptions inherited from something like TBase which implements the same API but is an old-style class. After I moved the readStruct/writeStruct code out of TBase and into TProtocol, there's very little left (that's got to be good for a class named TBase). With very little repeated code, I think we can have a TExceptionBase class that is old-style, so is raisable by python2.4, and still do dynamic read() and write() from TProtocol's new readStruct() and writeStruct(). I will have version 3 of this patch uploaded in the next hour.
        Hide
        Will Pierce added a comment -

        Patch attached, version 3.

        Changes from v2:

        • test/py/TestClient.py: removed skipping of testException for Xception case
        • test/py/Makefile.am: single variable for all gen-py-* dirs now, and uses test -d instead of mkdir || new
        • lib/py/src/protocol/TBase.py: new name for TProtocolDynamic
        • lib/py/src/protocol/TBase.py: removed unnecessary "is" testing from _eq_ and made getattr() calls clearer
        • lib/py/src/protocol/TBase.py: factored out read() and write() into TProtocol readStruct() and writeStruct()
        • lib/py/src/protocol/TBase.py: added TExceptionBase class, old-style to be compatible with python2.4, reusing some TBase code without inheritance
        • lib/py/src/protocol/_init_.py: renamed TProtocolDynamic to TBase
        • lib/py/src/protocol/TProtocol.py: brought readStruct() and writeStruct() in from TBase
        • compiler/cpp/src/generate/t_py_generator.cc: renames for TBase, and added TExceptionBase to handle python2.4 "interesting edge case", and added new "dynexc" optional flag to complete the feature set for pluggable base classing at compile time.

        The tests pass on my local machine for python2.4 and 2.7.

        Show
        Will Pierce added a comment - Patch attached, version 3. Changes from v2: test/py/TestClient.py: removed skipping of testException for Xception case test/py/Makefile.am: single variable for all gen-py-* dirs now, and uses test -d instead of mkdir || new lib/py/src/protocol/TBase.py: new name for TProtocolDynamic lib/py/src/protocol/TBase.py: removed unnecessary "is" testing from _ eq _ and made getattr() calls clearer lib/py/src/protocol/TBase.py: factored out read() and write() into TProtocol readStruct() and writeStruct() lib/py/src/protocol/TBase.py: added TExceptionBase class, old-style to be compatible with python2.4, reusing some TBase code without inheritance lib/py/src/protocol/_ init _.py: renamed TProtocolDynamic to TBase lib/py/src/protocol/TProtocol.py: brought readStruct() and writeStruct() in from TBase compiler/cpp/src/generate/t_py_generator.cc: renames for TBase, and added TExceptionBase to handle python2.4 "interesting edge case", and added new "dynexc" optional flag to complete the feature set for pluggable base classing at compile time. The tests pass on my local machine for python2.4 and 2.7.
        Hide
        Will Pierce added a comment -

        adjusted ticket title to match the patch (version 3)

        Show
        Will Pierce added a comment - adjusted ticket title to match the patch (version 3)
        Hide
        Will Pierce added a comment - - edited

        Does the updated patch address all your concerns/suggestions? (I think it's a lot better than the first version, especially the exception class generation.)

        I'm interested in knowing if this is likely to be committed or not. I don't mind going through more revisions, if there's more improvements you'd like to see.

        Thanks!

        Show
        Will Pierce added a comment - - edited Does the updated patch address all your concerns/suggestions? (I think it's a lot better than the first version, especially the exception class generation.) I'm interested in knowing if this is likely to be committed or not. I don't mind going through more revisions, if there's more improvements you'd like to see. Thanks!
        Hide
        Will Pierce added a comment -

        Does anyone have any time this week to review this patch?

        Show
        Will Pierce added a comment - Does anyone have any time this week to review this patch?
        Hide
        David Reiss added a comment -

        Looks solid. Mostly style changes.

        Does dynamic require the slots option and conflict with the newstyle option? If so, this should be enforced. If not, there are a few places where they conflict.

        In TExceptionBase, can you copy the read and write methods the way you do with _eq_?

        +      except TypeError:
        +        pass # TODO: add logging about unhashable types. i.e. d=dict(); d[[0,1]] = 2 fails
        

        This should be an exception.

        @@ -176,7 +180,6 @@
             except Xception, x:
               self.assertEqual(x.errorCode, 1001)
               self.assertEqual(x.message, 'Xception')
        -
             try:
               self.client.testException("throw_undeclared")
               self.fail("should have thrown exception")
        @@ -225,4 +228,4 @@
                 self.createTests()
        

        Looks like this is an unnecessary change.

        +    // hdr += std::string("from thrift.protocol.TBase import TBase\n");
        

        Remove commented-out code.

        +    hdr +=
        +      "from thrift.transport import TTransport\n"
        +      "from thrift.protocol import TBinaryProtocol, TProtocol\n";
        +    hdr += "try:\n"
             "  from thrift.protocol import fastbinary\n"
             "except:\n"
             "  fastbinary = None\n";
        

        The two "hdr += " statements can be combined into one.
        In fact, it looks like the hdr variable can be eliminated entirely if you want.

        @@ -384,6 +439,7 @@
           f_types_ <<
             "class " << tenum->get_name() <<
             (gen_newstyle_ ? "(object)" : "") <<
        +    (gen_dynamic_ ? "(" + gen_dynbaseclass_ + ")" : "") <<  
             ":" << endl;
        

        This is one conflict between newstyle and dynamic. I think the generation of _eq_ and friends is another.

        +    if (gen_newstyle_) {
        +     out << "(object)";
        +    }
        +    else if (gen_dynamic_) {
        +      out << "(" << gen_dynbaseclass_ << ")";
        +    }
        

        Combine "else if" to previous line.

        +    indent_down();
        +  } else {
        +    if (! gen_dynamic_) {
        +      // no base class available to implement __eq__ and __repr__ and __ne__ for us
        

        Combine "else" with "if". Also, indentation is off inside the "if".

        +    // perhaps we should try __slots__?
        

        What does this mean?

        +    indent_up();
        +    for (m_iter = sorted_members.begin(); m_iter != sorted_members.end(); ++m_iter) {
        +      indent(out) <<  "'" << (*m_iter)->get_name()  << "', "
        +            << endl;
        +    }
        

        This puts trailing whitespace in generated code.

        +  if (! gen_dynamic_) {
        

        No space after "!".

        +      indent() << "  for attr in self.__slots__:" << endl <<
        +      indent() << "    my_val, other_val = [getattr(obj, attr) for obj in [self, other]]" << endl <<
        +      indent() << "    if my_val is other_val:" << endl <<
        +      indent() << "      continue" << endl <<
        +      indent() << "    if my_val != other_val:" << endl <<
        +      indent() << "        return False" << endl <<
        +      indent() << "  return True" << endl <<
        

        Please make the same updates as you did to the TBase implementation (two separte statements to extract vals, and skip the is check).

             if (gen_twisted_) {
               extends_if = "(Interface)";
        +    } else if (gen_dynamic_) {
        +      extends_if = "(" + gen_dynbaseclass_  + ")"; // TBase
        

        The service interface should not be a TBase.

        +    if (gen_twisted_ && gen_newstyle_) { // TODO: handle dynbase here?
        

        Same for the client.

        Show
        David Reiss added a comment - Looks solid. Mostly style changes. Does dynamic require the slots option and conflict with the newstyle option? If so, this should be enforced. If not, there are a few places where they conflict. In TExceptionBase, can you copy the read and write methods the way you do with _ eq _? + except TypeError: + pass # TODO: add logging about unhashable types. i.e. d=dict(); d[[0,1]] = 2 fails This should be an exception. @@ -176,7 +180,6 @@ except Xception, x: self.assertEqual(x.errorCode, 1001) self.assertEqual(x.message, 'Xception') - try: self.client.testException("throw_undeclared") self.fail("should have thrown exception") @@ -225,4 +228,4 @@ self.createTests() Looks like this is an unnecessary change. + // hdr += std::string("from thrift.protocol.TBase import TBase\n"); Remove commented-out code. + hdr += + "from thrift.transport import TTransport\n" + "from thrift.protocol import TBinaryProtocol, TProtocol\n"; + hdr += "try:\n" " from thrift.protocol import fastbinary\n" "except:\n" " fastbinary = None\n"; The two "hdr += " statements can be combined into one. In fact, it looks like the hdr variable can be eliminated entirely if you want. @@ -384,6 +439,7 @@ f_types_ << "class " << tenum->get_name() << (gen_newstyle_ ? "(object)" : "") << + (gen_dynamic_ ? "(" + gen_dynbaseclass_ + ")" : "") << ":" << endl; This is one conflict between newstyle and dynamic. I think the generation of _ eq _ and friends is another. + if (gen_newstyle_) { + out << "(object)"; + } + else if (gen_dynamic_) { + out << "(" << gen_dynbaseclass_ << ")"; + } Combine "else if" to previous line. + indent_down(); + } else { + if (! gen_dynamic_) { + // no base class available to implement __eq__ and __repr__ and __ne__ for us Combine "else" with "if". Also, indentation is off inside the "if". + // perhaps we should try __slots__? What does this mean? + indent_up(); + for (m_iter = sorted_members.begin(); m_iter != sorted_members.end(); ++m_iter) { + indent(out) << "'" << (*m_iter)->get_name() << "', " + << endl; + } This puts trailing whitespace in generated code. + if (! gen_dynamic_) { No space after "!". + indent() << " for attr in self.__slots__:" << endl << + indent() << " my_val, other_val = [getattr(obj, attr) for obj in [self, other]]" << endl << + indent() << " if my_val is other_val:" << endl << + indent() << " continue" << endl << + indent() << " if my_val != other_val:" << endl << + indent() << " return False" << endl << + indent() << " return True" << endl << Please make the same updates as you did to the TBase implementation (two separte statements to extract vals, and skip the is check). if (gen_twisted_) { extends_if = "(Interface)"; + } else if (gen_dynamic_) { + extends_if = "(" + gen_dynbaseclass_ + ")"; // TBase The service interface should not be a TBase. + if (gen_twisted_ && gen_newstyle_) { // TODO: handle dynbase here? Same for the client.
        Hide
        Will Pierce added a comment -

        Hi David, thanks for the feedback. I have an updated patch now. I tried to make as few additional changes as possible, so there's forward progress, but unfortunately found a couple bugs lurking in there that needed a structural change (item 3 below). I really appreciate the time you've put into helping get this patch refined. (This code is important to get right.)

        Answering your questions in-order:

        1. Options: (Dynamic or newstyle) and slots
        The dynamic option doesn't require the slots option, but there's no memory savings in that case. The slots option can be used without the dynamic (TBase) feature, which gives memory savings but the classes use statically generated read/write code in the usual way. The "make check" test-cases exercise all four combinations of (dynamic, slots, dynamicslots, <neither>). It used to be a single option 'dynslots', but I split them apart because the slots feature can be used independently with the 'newstyle' option to get memory savings without TBase dynamic decoding. The test cases exercise each of (slots, newstyle, newstyleslots, <neither>).

        2. Options: Dynamic and newstyle
        The dynamic option produces new-style classes, so I think it would make sense for the compiler to ignore the 'newstyle' option if it is included with the 'dynamic' option. In this version of the patch, I've adjusted the compiler code so that the 'dynamic' option overrides the 'newstyle' option (sets it to false), so the code paths won't intersect. The python test cases don't exercise this combination of options, as it treats 'dynamic,newstyle' the same as just 'dynamic'. If you think it makes sense, though, I can add that combination to the "make check" testing in trunk/tests/py/Makefile.am file.

        3. TExceptionBase read/write method copying:
        I've tried it and uncovered some problems. The method copying that I used for _eq and repr_ in TExceptionBase was just wrong, and wasn't being exercised by any test code. After I added the tests, I found a latent bug I introduced into the compiler generated non-dynamic, slots-enabled generated repr() method, which I've fixed in this updated patch. (I also fixed some inconsistent indentation in TestClient.py) Also, I reversed the inheritance order of generated exception structs, so they inherit from (TExceptionBase,Exception) instead of Exception,TExceptionBase, which fixes a bug where python's Exception repr() overrode TExceptionBase's. That multiple-inheritance happens within TBase.py now, instead of in the generated code, keeping it in one place. This is fixed in this latest version of the patch and the unit tests in TestClient now check the output of Xception's repr() using an explicit test assertion.

        There was still a problem though, because python doesn't actually allow method copying across classes like I thought. It throws "TypeError: unbound method write() must be called with TBase instance as first argument (got TBinaryProtocol instance instead)" when the copied method is invoked. (More detail: http://mousebender.wordpress.com/2007/02/17/copying-methods-in-python/) So that's a broken path, and I've removed the method copying... One alternative is delegating to module-level functions like a shared_read(), but that would add significant runtime cost. It doesn't seem worthwhile to add an extra function call in every repr/eq/ne/read/write just to avoid duplicating code.

        The only way I can think of to avoid cut-n-paste code in TBase without incurring runtime cost is to build a new old-style subclass, TRoot which implements what TBase did, then derive TBase from TRoot and 'object' to make it new-style (slots capable), and then derive TExceptionBase from TRoot and Exception, keeping it old-style for python2.4 compatibility. The only code left in TBase and TExceptionBase are the mandatory _slots_ = [] lines to stop python from generating a _dict_ in between TRoot and a struct. The result looks even cleaner than any previous version of this patch.

        A picture will demonstrate it better than words. For a thrift struct named Foo and an exception named Xcept, the class hierarchy would have three possible shapes, for dynamic vs. newstyle-nondynamic vs. default compiler options:

        Dynamic                      Newstyle-nondynamic     Default
        ========================     ===================     ==============
        object  TRoot  Exception     object  Exception       Foo  Exception
            |   /  \      |             |       |                    |
            |  /    \     |             |       |                  Xcept
           TBase   TExceptionBase      Foo    Xcept
            |             |
           Foo          Xcept
        

        The only extra runtime cost here is the internal python method resolution lookup for handling on extra level of inheritance. Given that TBase implements _slots_ and has no local methods of its own, I think it's an acceptable cost. The dynamic read() and write() methods are already somewhat slower than the auto-generated static code, so this extra method resolution cost is not significant.

        Note: The impact of this inheritance change is that code generated using the "dynamic,slots" option isn't quite as memory-efficient as code generated with "newstyle,slots". This is because the TRoot class is old-style, and even though is implements _slots, the python interpreter sticks a dict in there. It's an empty dict though, so its size is constant even for thrift structs with very many fields. The resulting objects still far less RAM than the dict_ based ones, and this constant difference is less significant with larger objects.

        I did some proof of concept timing tests, and the extra cost for invoking a method that is two inheritance levels up versus one level is pretty small. For thrift object construction (Bonk class), the difference is 0.9837 microseconds with TRoot versus 0.9488 microseconds without. Serialization is 18.1025 microseconds with TRoot versus 18.3218 microseconds without (just noise). And deserialization is 24.3407 microseconds with TRoot versus 24.5639 microseconds without (noise). So the extra level of inheritance doesn't make a practical speed difference.

        When python2.4 support is no longer needed (2014? 2015?), we can collapse TRoot into TBase and inherit TExceptionBase from TBase and Exception. The linecount will hardly change, but the class hierarchy will be a bit simpler.

        And of course, an alternative to using this TRoot parent class would be to simply cut and paste the same code between TBase and TExceptionBase, and let the only difference be the 'class' line (object vs. Exception).

        4. TypeError in TProtocol.py's readContainerMap() at line 303:
        I've removed the TypeError exception handling, and left a 1-line comment in place to help anyone who traces their issues to that point.

        Aside: The python deserializer will now fail to decode a message that has a map/dict using mutable keys, instead of silently eating the error but decoding the rest of the message. If there was a mechanism for indicating "partial success" at decoding, we could allow for different degrees of decoder "strictness". A user-settable strictness property might fit in with some of the other discussions about limiting the maximum sizes of some message types, like list and string lengths. I'm definitely interested in adding that kind of robustness to the code in a followup patch, in a way that would work well across platforms of course.

        5. unnecessary whitespace change
        I added the whitespace back in, but also have inserted two lines above it to implement the repr() result testing, after fixing up my bug in the generated TException _repr_ code.

        6. unnecessary C++ comment left in
        Zapped, thanks!

        7. extra hdr string assignment
        I merged the two hdr += assignments and made the indentation consistent, so it looks a lot cleaner. I left hdr in for now. I don't think it does any harm.

        8. conflict between newstyle and dynamic
        This is resolved now with the options handling code that makes 'dynamic' take precedence over newstyle (since dynamic TBase is itself new-style).

        9. "else if" to previous line and indentation fix
        Done. I switched back to emacs for the C++ edits, since it auto indents, and the GUI IDE I was using before didn't do the job. Also fixed a vertical-alignment on line 805 for def _eq_ output. Also touched a couple other places where my indentation was off a space or two.

        10. "perhaps we should try slots" comment
        That was a note to myself when I first started this patch. It's zapped now.

        11. trailing whitespace
        Zapped teh trailing whitespace, and moved the << endl up to the previous line so it reads cleaner.

        12. no space after !
        Fixed in both places.

        13. _eq_ should match TBase's
        Fixed up, good catch, thank you!

        14. Service class wrong base class
        I removed the TBase reference, and made the else check compare (gen_newstyle_ || gen_dynamic_) so services generated with 'dyanmic' end up as new-style, to repeat the structs behavior.

        15. service class for client
        Updated the check to make sure '(object)' is the base class when either newstyle or dynamic is selected, not just newstyle. I'm not entirely sure this is what you meant, though. I'm not as familiar with the service code as other parts of thrift.

        I'm attaching version of the patch now which passes all unit tests under python2.7 and python2.4 on my host at home.

        Show
        Will Pierce added a comment - Hi David, thanks for the feedback. I have an updated patch now. I tried to make as few additional changes as possible, so there's forward progress, but unfortunately found a couple bugs lurking in there that needed a structural change (item 3 below). I really appreciate the time you've put into helping get this patch refined. (This code is important to get right.) Answering your questions in-order: 1. Options: (Dynamic or newstyle) and slots The dynamic option doesn't require the slots option, but there's no memory savings in that case. The slots option can be used without the dynamic (TBase) feature, which gives memory savings but the classes use statically generated read/write code in the usual way. The "make check" test-cases exercise all four combinations of (dynamic, slots, dynamicslots, <neither>). It used to be a single option 'dynslots', but I split them apart because the slots feature can be used independently with the 'newstyle' option to get memory savings without TBase dynamic decoding. The test cases exercise each of (slots, newstyle, newstyleslots, <neither>). 2. Options: Dynamic and newstyle The dynamic option produces new-style classes, so I think it would make sense for the compiler to ignore the 'newstyle' option if it is included with the 'dynamic' option. In this version of the patch, I've adjusted the compiler code so that the 'dynamic' option overrides the 'newstyle' option (sets it to false), so the code paths won't intersect. The python test cases don't exercise this combination of options, as it treats 'dynamic,newstyle' the same as just 'dynamic'. If you think it makes sense, though, I can add that combination to the "make check" testing in trunk/tests/py/Makefile.am file. 3. TExceptionBase read/write method copying: I've tried it and uncovered some problems. The method copying that I used for _ eq and repr _ in TExceptionBase was just wrong, and wasn't being exercised by any test code. After I added the tests, I found a latent bug I introduced into the compiler generated non-dynamic, slots-enabled generated repr() method, which I've fixed in this updated patch. (I also fixed some inconsistent indentation in TestClient.py) Also, I reversed the inheritance order of generated exception structs, so they inherit from (TExceptionBase,Exception) instead of Exception,TExceptionBase, which fixes a bug where python's Exception repr() overrode TExceptionBase's. That multiple-inheritance happens within TBase.py now, instead of in the generated code, keeping it in one place. This is fixed in this latest version of the patch and the unit tests in TestClient now check the output of Xception's repr() using an explicit test assertion. There was still a problem though, because python doesn't actually allow method copying across classes like I thought. It throws "TypeError: unbound method write() must be called with TBase instance as first argument (got TBinaryProtocol instance instead)" when the copied method is invoked. (More detail: http://mousebender.wordpress.com/2007/02/17/copying-methods-in-python/ ) So that's a broken path, and I've removed the method copying... One alternative is delegating to module-level functions like a shared_read(), but that would add significant runtime cost. It doesn't seem worthwhile to add an extra function call in every repr/eq/ne/read/write just to avoid duplicating code. The only way I can think of to avoid cut-n-paste code in TBase without incurring runtime cost is to build a new old-style subclass, TRoot which implements what TBase did, then derive TBase from TRoot and 'object' to make it new-style (slots capable), and then derive TExceptionBase from TRoot and Exception, keeping it old-style for python2.4 compatibility. The only code left in TBase and TExceptionBase are the mandatory _ slots _ = [] lines to stop python from generating a _ dict _ in between TRoot and a struct. The result looks even cleaner than any previous version of this patch. A picture will demonstrate it better than words. For a thrift struct named Foo and an exception named Xcept, the class hierarchy would have three possible shapes, for dynamic vs. newstyle-nondynamic vs. default compiler options: Dynamic Newstyle-nondynamic Default ======================== =================== ============== object TRoot Exception object Exception Foo Exception | / \ | | | | | / \ | | | Xcept TBase TExceptionBase Foo Xcept | | Foo Xcept The only extra runtime cost here is the internal python method resolution lookup for handling on extra level of inheritance. Given that TBase implements _ slots _ and has no local methods of its own, I think it's an acceptable cost. The dynamic read() and write() methods are already somewhat slower than the auto-generated static code, so this extra method resolution cost is not significant. Note: The impact of this inheritance change is that code generated using the "dynamic,slots" option isn't quite as memory-efficient as code generated with "newstyle,slots". This is because the TRoot class is old-style, and even though is implements _ slots , the python interpreter sticks a dict in there. It's an empty dict though, so its size is constant even for thrift structs with very many fields. The resulting objects still far less RAM than the dict _ based ones, and this constant difference is less significant with larger objects. I did some proof of concept timing tests, and the extra cost for invoking a method that is two inheritance levels up versus one level is pretty small. For thrift object construction (Bonk class), the difference is 0.9837 microseconds with TRoot versus 0.9488 microseconds without. Serialization is 18.1025 microseconds with TRoot versus 18.3218 microseconds without (just noise). And deserialization is 24.3407 microseconds with TRoot versus 24.5639 microseconds without (noise). So the extra level of inheritance doesn't make a practical speed difference. When python2.4 support is no longer needed (2014? 2015?), we can collapse TRoot into TBase and inherit TExceptionBase from TBase and Exception. The linecount will hardly change, but the class hierarchy will be a bit simpler. And of course, an alternative to using this TRoot parent class would be to simply cut and paste the same code between TBase and TExceptionBase, and let the only difference be the 'class' line ( object vs. Exception ). 4. TypeError in TProtocol.py's readContainerMap() at line 303: I've removed the TypeError exception handling, and left a 1-line comment in place to help anyone who traces their issues to that point. Aside: The python deserializer will now fail to decode a message that has a map/dict using mutable keys, instead of silently eating the error but decoding the rest of the message. If there was a mechanism for indicating "partial success" at decoding, we could allow for different degrees of decoder "strictness". A user-settable strictness property might fit in with some of the other discussions about limiting the maximum sizes of some message types, like list and string lengths. I'm definitely interested in adding that kind of robustness to the code in a followup patch, in a way that would work well across platforms of course. 5. unnecessary whitespace change I added the whitespace back in, but also have inserted two lines above it to implement the repr() result testing, after fixing up my bug in the generated TException _ repr _ code. 6. unnecessary C++ comment left in Zapped, thanks! 7. extra hdr string assignment I merged the two hdr += assignments and made the indentation consistent, so it looks a lot cleaner. I left hdr in for now. I don't think it does any harm. 8. conflict between newstyle and dynamic This is resolved now with the options handling code that makes 'dynamic' take precedence over newstyle (since dynamic TBase is itself new-style). 9. "else if" to previous line and indentation fix Done. I switched back to emacs for the C++ edits, since it auto indents, and the GUI IDE I was using before didn't do the job. Also fixed a vertical-alignment on line 805 for def _ eq _ output. Also touched a couple other places where my indentation was off a space or two. 10. "perhaps we should try slots" comment That was a note to myself when I first started this patch. It's zapped now. 11. trailing whitespace Zapped teh trailing whitespace, and moved the << endl up to the previous line so it reads cleaner. 12. no space after ! Fixed in both places. 13. _ eq _ should match TBase's Fixed up, good catch, thank you! 14. Service class wrong base class I removed the TBase reference, and made the else check compare (gen_newstyle_ || gen_dynamic_) so services generated with 'dyanmic' end up as new-style, to repeat the structs behavior. 15. service class for client Updated the check to make sure '(object)' is the base class when either newstyle or dynamic is selected, not just newstyle. I'm not entirely sure this is what you meant, though. I'm not as familiar with the service code as other parts of thrift. I'm attaching version of the patch now which passes all unit tests under python2.7 and python2.4 on my host at home.
        Hide
        Will Pierce added a comment -

        After thinking about this latest version of the patch, issue 3 w/ the duplicated code between TBase and TExceptionBase gnaws at me. We can eliminate TRoot if we just cut and paste its contents into both TBase and TExceptionBase. It isn't super elegant, but the inheritance diagram is a lot simpler. The only reason TRoot exists at all is to work around the python2.4 requirement that Exceptions cannot be new-style classes, otherwise the inheritance diagram would be simple: TBase(object) and TExceptionBase(TBase,Exception)

        I'm not sure it's worth complicating the object hierarchy just to avoid having two copies of the same code, especially when they live in the same file. I'm curious what others think of the tradeoff. It's ~25 lines of code for 5 methods, and 4 lines of whitespace.

        Show
        Will Pierce added a comment - After thinking about this latest version of the patch, issue 3 w/ the duplicated code between TBase and TExceptionBase gnaws at me. We can eliminate TRoot if we just cut and paste its contents into both TBase and TExceptionBase. It isn't super elegant, but the inheritance diagram is a lot simpler. The only reason TRoot exists at all is to work around the python2.4 requirement that Exceptions cannot be new-style classes, otherwise the inheritance diagram would be simple: TBase(object) and TExceptionBase(TBase,Exception) I'm not sure it's worth complicating the object hierarchy just to avoid having two copies of the same code, especially when they live in the same file. I'm curious what others think of the tradeoff. It's ~25 lines of code for 5 methods, and 4 lines of whitespace.
        Hide
        David Reiss added a comment -
        class First(object):
          def go(self, val):
            return val + 1
        
        class Second(object):
          go = First.go.im_func
        
        def go_impl(self, val):
          return val + 1
        
        class Third(object):
          go = go_impl
        

        I think First, Second, and Third are all equivalent in terms of functionality and performance, though I haven't verified this. I'm not sure how they compare in terms of debug-ability. I'm sure it's subjective.

        Do either of these options strike you as preferable to copying the implementations?

        If not, I'd be fine with the copying. I think we might do the same thing in PHP.

        Show
        David Reiss added a comment - class First(object): def go(self, val): return val + 1 class Second(object): go = First.go.im_func def go_impl(self, val): return val + 1 class Third(object): go = go_impl I think First, Second, and Third are all equivalent in terms of functionality and performance, though I haven't verified this. I'm not sure how they compare in terms of debug-ability. I'm sure it's subjective. Do either of these options strike you as preferable to copying the implementations? If not, I'd be fine with the copying. I think we might do the same thing in PHP.
        Hide
        Will Pierce added a comment -

        Aha, the .im_func is the secret sauce that will make the method copying work. I wondered if it was possible to get around python's instance type checking, im_func is it. Thanks for pointing it out!

        The method copying in Second seems the cleanest to me. I'm a little queasy about the Third option, mostly because seeing a 'self' argument in a module-level function will confusingly look like a bug in a year or so when we've forgotten all about this detour.

        I think you're right about the performance being equal. I disassembled with dis.dis three functions that exercised each version (instance construction and a .go() call), and all three produced the same bytecode:

          1           0 LOAD_GLOBAL              0 (Third)  # or Second, or First
                      3 CALL_FUNCTION            0
                      6 STORE_FAST               0 (x)
                      9 LOAD_FAST                0 (x)
                     12 LOAD_ATTR                2 (go)
                     15 POP_TOP             
                     16 LOAD_CONST               0 (None)
                     19 RETURN_VALUE        
        

        I'll post an updated patch in a moment, after running the tests.

        Show
        Will Pierce added a comment - Aha, the .im_func is the secret sauce that will make the method copying work. I wondered if it was possible to get around python's instance type checking, im_func is it. Thanks for pointing it out! The method copying in Second seems the cleanest to me. I'm a little queasy about the Third option, mostly because seeing a 'self' argument in a module-level function will confusingly look like a bug in a year or so when we've forgotten all about this detour. I think you're right about the performance being equal. I disassembled with dis.dis three functions that exercised each version (instance construction and a .go() call), and all three produced the same bytecode: 1 0 LOAD_GLOBAL 0 (Third) # or Second, or First 3 CALL_FUNCTION 0 6 STORE_FAST 0 (x) 9 LOAD_FAST 0 (x) 12 LOAD_ATTR 2 (go) 15 POP_TOP 16 LOAD_CONST 0 (None) 19 RETURN_VALUE I'll post an updated patch in a moment, after running the tests.
        Hide
        Will Pierce added a comment -

        version 5 of patch attached, this changes method copying for TExceptionBase from TBase to use the .im_func special variable, and removes the TRoot class entirely.

        the 'make check' tests for python2.7 and python2.4 work on my box.

        Show
        Will Pierce added a comment - version 5 of patch attached, this changes method copying for TExceptionBase from TBase to use the .im_func special variable, and removes the TRoot class entirely. the 'make check' tests for python2.7 and python2.4 work on my box.
        Hide
        Christian Rakow added a comment -

        Hi,
        when I looked at the generated python code I had exactly the same idea that the code can be reduced significantly. So thanks for your patch Will its exactly what I was looking for and I would love to see it in thrift soon!

        I noticed a little bug concerning nested list, maybe other container types are affected too.
        Such data field will result in a traceback:

        0: list<list<list<Data>>> data
        
          File "thrift/protocol/TProtocol.py", line 388, in writeStruct
            self.writeFieldByTType(ftype, val, fspec)
          File "thrift/protocol/TProtocol.py", line 397, in writeFieldByTType
            writer(val, spec)
          File "thrift/protocol/TProtocol.py", line 339, in writeContainerList
            e_writer(elem, spec)
          File "thrift/protocol/TProtocol.py", line 339, in writeContainerList
            e_writer(elem, spec)
          File "thrift/protocol/TProtocol.py", line 339, in writeContainerList
            e_writer(elem, spec)
          File "thrift/protocol/TProtocol.py", line 331, in writeContainerList
            self.writeListBegin(spec[0], len(val))
        TypeError: object of type 'Data' has no len()
        
        

        Thanks for your work and I hope someone will commit it to trunk when its fixed.

        Show
        Christian Rakow added a comment - Hi, when I looked at the generated python code I had exactly the same idea that the code can be reduced significantly. So thanks for your patch Will its exactly what I was looking for and I would love to see it in thrift soon! I noticed a little bug concerning nested list, maybe other container types are affected too. Such data field will result in a traceback: 0: list<list<list<Data>>> data File "thrift/protocol/TProtocol.py" , line 388, in writeStruct self.writeFieldByTType(ftype, val, fspec) File "thrift/protocol/TProtocol.py" , line 397, in writeFieldByTType writer(val, spec) File "thrift/protocol/TProtocol.py" , line 339, in writeContainerList e_writer(elem, spec) File "thrift/protocol/TProtocol.py" , line 339, in writeContainerList e_writer(elem, spec) File "thrift/protocol/TProtocol.py" , line 339, in writeContainerList e_writer(elem, spec) File "thrift/protocol/TProtocol.py" , line 331, in writeContainerList self.writeListBegin(spec[0], len(val)) TypeError: object of type 'Data' has no len() Thanks for your work and I hope someone will commit it to trunk when its fixed.
        Hide
        Will Pierce added a comment -

        Thanks for finding the bug, Christian! I'll add this specific test-case to the unit tests and fix what I did wrong and provide an updated patch. It sounds like I may have used the wrong fspec value in the recursion (the outer, not the inner type). I'll post with an update soon!

        Show
        Will Pierce added a comment - Thanks for finding the bug, Christian! I'll add this specific test-case to the unit tests and fix what I did wrong and provide an updated patch. It sounds like I may have used the wrong fspec value in the recursion (the outer, not the inner type). I'll post with an update soon!
        Hide
        Will Pierce added a comment -

        FYI, I just found that a struct X { } with only one field of ID zero will fail to get a thrift_spec member in the generated code, but that's probably by design (I'm not sure if field IDs are supposed to support ID 0 or not offhand).

        I changed the test to use (in ThriftTest.thrift):

        struct NestedListsBonk {
          1: list<list<list<Bonk>>> bonk
        }
        

        which reproduces the error now. I'm working on finding the bug now.

        Show
        Will Pierce added a comment - FYI, I just found that a struct X { } with only one field of ID zero will fail to get a thrift_spec member in the generated code, but that's probably by design (I'm not sure if field IDs are supposed to support ID 0 or not offhand). I changed the test to use (in ThriftTest.thrift): struct NestedListsBonk { 1: list<list<list<Bonk>>> bonk } which reproduces the error now. I'm working on finding the bug now.
        Hide
        Will Pierce added a comment -

        The bug is in the code for handling containers of containers. The writeContainer*() method(s) wasn't removing the outermost container before recursing in to write the inner containers, and it looks like there's a symmetric bug in the readContainer*() method(s). There's some very similar-looking code here that I think I combine together into a more generalized implementation (in TProtocol.py). I'll post the fix as soon as it's ready.

        Show
        Will Pierce added a comment - The bug is in the code for handling containers of containers. The writeContainer*() method(s) wasn't removing the outermost container before recursing in to write the inner containers, and it looks like there's a symmetric bug in the readContainer*() method(s). There's some very similar-looking code here that I think I combine together into a more generalized implementation (in TProtocol.py ). I'll post the fix as soon as it's ready.
        Hide
        Will Pierce added a comment -

        This is version 6 of the patch, which has been updated to include:

        • a fix in TProtocol.py for encoding/decoding container types with non-primitive contents, i.e. lists of lists, etc.
        • changed python unit-tests to support multiple generated python directories (i.e. the gen-py-default, gen-py-newstyle, gen-py-dynamicslots, etc made by different arguments to the thrift compiler's -gen option). SerializationTest.py and others now take a --genpydir argument to adjust sys.path, but defaults to "gen-py" if not specified
        • The RunClientServer.py test now wraps the non-client/server tests as well, so the SerializationTest.py and other single-script tests are executed against all generated python directories, to ensure better coverage of the tests.
        • expanded SerializationTest.py to include more testing of non-trivial container types
        • added the missing Apache license comment header to TCompactProtocol.py
        • updated patch so it works cleanly against trunk
        Show
        Will Pierce added a comment - This is version 6 of the patch, which has been updated to include: a fix in TProtocol.py for encoding/decoding container types with non-primitive contents, i.e. lists of lists, etc. changed python unit-tests to support multiple generated python directories (i.e. the gen-py-default, gen-py-newstyle, gen-py-dynamicslots, etc made by different arguments to the thrift compiler's -gen option). SerializationTest.py and others now take a --genpydir argument to adjust sys.path, but defaults to "gen-py" if not specified The RunClientServer.py test now wraps the non-client/server tests as well, so the SerializationTest.py and other single-script tests are executed against all generated python directories, to ensure better coverage of the tests. expanded SerializationTest.py to include more testing of non-trivial container types added the missing Apache license comment header to TCompactProtocol.py updated patch so it works cleanly against trunk
        Hide
        Will Pierce added a comment -

        Here is sample comparison of the code generated for the trunk/test/ThriftTest.thrift's new "NestedMixedx2" struct, which is defined as:

        struct NestedMixedx2 {
          1: list<set<i32>> int_set_list
          2: map<i32,set<string>> map_int_strset
          3: list<map<i32,set<string>>> map_int_strset_list
        }
        

        Here is the generated code from ttypes.py for this class using the new dynamic,slots option:

        class NestedMixedx2(TBase):
          """
          Attributes:
           - int_set_list
           - map_int_strset
           - map_int_strset_list
          """
        
          __slots__ = [ 
            'int_set_list',
            'map_int_strset',
            'map_int_strset_list',
           ]
        
          thrift_spec = (
            None, # 0
            (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1
            (2, TType.MAP, 'map_int_strset', (TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2
            (3, TType.LIST, 'map_int_strset_list', (TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3
          )
        
          def __init__(self, int_set_list=None, map_int_strset=None, map_int_strset_list=None,):
            self.int_set_list = int_set_list
            self.map_int_strset = map_int_strset
            self.map_int_strset_list = map_int_strset_list
        

        Here is the same class using old style default options:

        class NestedMixedx2:
          """
          Attributes:
           - int_set_list
           - map_int_strset
           - map_int_strset_list
          """
        
          thrift_spec = (
            None, # 0
            (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1
            (2, TType.MAP, 'map_int_strset', (TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2
            (3, TType.LIST, 'map_int_strset_list', (TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3
          )
        
          def __init__(self, int_set_list=None, map_int_strset=None, map_int_strset_list=None,):
            self.int_set_list = int_set_list
            self.map_int_strset = map_int_strset
            self.map_int_strset_list = map_int_strset_list
        
          def read(self, iprot):
            if iprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None and f
        astbinary is not None:
              fastbinary.decode_binary(self, iprot.trans, (self.__class__, self.thrift_spec))
              return
            iprot.readStructBegin()
            while True:
              (fname, ftype, fid) = iprot.readFieldBegin()
              if ftype == TType.STOP:
                break
              if fid == 1:
                if ftype == TType.LIST:
                  self.int_set_list = []
                  (_etype176, _size173) = iprot.readListBegin()
                  for _i177 in xrange(_size173):
                    _elem178 = set()
                    (_etype182, _size179) = iprot.readSetBegin()
                    for _i183 in xrange(_size179):
                      _elem184 = iprot.readI32();
                      _elem178.add(_elem184)
                    iprot.readSetEnd()
                    self.int_set_list.append(_elem178)
                  iprot.readListEnd()
                else:
                  iprot.skip(ftype)
              elif fid == 2:
                if ftype == TType.MAP:
                  self.map_int_strset = {}
                  (_ktype186, _vtype187, _size185 ) = iprot.readMapBegin() 
                  for _i189 in xrange(_size185):
                    _key190 = iprot.readI32();
                    _val191 = set()
                    (_etype195, _size192) = iprot.readSetBegin()
                    for _i196 in xrange(_size192):
                      _elem197 = iprot.readString();
                      _val191.add(_elem197)
                    iprot.readSetEnd()
                    self.map_int_strset[_key190] = _val191
                  iprot.readMapEnd()
                else:
                  iprot.skip(ftype)
              elif fid == 3:
                if ftype == TType.LIST:
                  self.map_int_strset_list = []
                  (_etype201, _size198) = iprot.readListBegin()
                  for _i202 in xrange(_size198):
                    _elem203 = {}
                    (_ktype205, _vtype206, _size204 ) = iprot.readMapBegin() 
                    for _i208 in xrange(_size204):
                      _key209 = iprot.readI32();
                      _val210 = set()
                      (_etype214, _size211) = iprot.readSetBegin()
                      for _i215 in xrange(_size211):
                        _elem216 = iprot.readString();
                        _val210.add(_elem216)
                      iprot.readSetEnd()
                      _elem203[_key209] = _val210
                    iprot.readMapEnd()
                    self.map_int_strset_list.append(_elem203)
                  iprot.readListEnd()
                else:
                  iprot.skip(ftype)
              else:
                iprot.skip(ftype)
              iprot.readFieldEnd()
            iprot.readStructEnd()
        
          def write(self, oprot):
            if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None:
              oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
              return
            oprot.writeStructBegin('NestedMixedx2')
            if self.int_set_list is not None:
              oprot.writeFieldBegin('int_set_list', TType.LIST, 1)
              oprot.writeListBegin(TType.SET, len(self.int_set_list))
              for iter217 in self.int_set_list:
                oprot.writeSetBegin(TType.I32, len(iter217))
                for iter218 in iter217:
                  oprot.writeI32(iter218)
                oprot.writeSetEnd()
              oprot.writeListEnd()
              oprot.writeFieldEnd()
            if self.map_int_strset is not None:
              oprot.writeFieldBegin('map_int_strset', TType.MAP, 2)
              oprot.writeMapBegin(TType.I32, TType.SET, len(self.map_int_strset))
              for kiter219,viter220 in self.map_int_strset.items():
                oprot.writeI32(kiter219)
                oprot.writeSetBegin(TType.STRING, len(viter220))
                for iter221 in viter220:
                  oprot.writeString(iter221)
                oprot.writeSetEnd()
              oprot.writeMapEnd()
              oprot.writeFieldEnd()
            if self.map_int_strset_list is not None:
              oprot.writeFieldBegin('map_int_strset_list', TType.LIST, 3)
              oprot.writeListBegin(TType.MAP, len(self.map_int_strset_list))
              for iter222 in self.map_int_strset_list:
                oprot.writeMapBegin(TType.I32, TType.SET, len(iter222))
                for kiter223,viter224 in iter222.items():
                  oprot.writeI32(kiter223)
                  oprot.writeSetBegin(TType.STRING, len(viter224))
                  for iter225 in viter224:
                    oprot.writeString(iter225)
                  oprot.writeSetEnd()
                oprot.writeMapEnd()
              oprot.writeListEnd()
              oprot.writeFieldEnd()
            oprot.writeFieldStop()
            oprot.writeStructEnd()
        
          def validate(self):
            return
        
        
          def __repr__(self):
            L = ['%s=%r' % (key, value)
              for key, value in self.__dict__.iteritems()]
            return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
        
          def __eq__(self, other):
            return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
        
          def __ne__(self, other):
            return not (self == other)
        

        If this (THRIFT-1115) passes muster, then the next direction that might be interesting to go would be to look at reviving the python .thrift compiler from branches/py-compiler/compiler/py/src/, removing all of its code text generation, and instead build up the thrift classes at runtime using metaclasses. If the parser/builder class lived in the python thrift libraries, then it would allow some kinds of applications (proxies maybe) to work more dynamically.

        Show
        Will Pierce added a comment - Here is sample comparison of the code generated for the trunk/test/ThriftTest.thrift's new "NestedMixedx2" struct, which is defined as: struct NestedMixedx2 { 1: list<set<i32>> int_set_list 2: map<i32,set<string>> map_int_strset 3: list<map<i32,set<string>>> map_int_strset_list } Here is the generated code from ttypes.py for this class using the new dynamic,slots option: class NestedMixedx2(TBase): """ Attributes: - int_set_list - map_int_strset - map_int_strset_list """ __slots__ = [ 'int_set_list', 'map_int_strset', 'map_int_strset_list', ] thrift_spec = ( None, # 0 (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1 (2, TType.MAP, 'map_int_strset', (TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2 (3, TType.LIST, 'map_int_strset_list', (TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3 ) def __init__(self, int_set_list=None, map_int_strset=None, map_int_strset_list=None,): self.int_set_list = int_set_list self.map_int_strset = map_int_strset self.map_int_strset_list = map_int_strset_list Here is the same class using old style default options: class NestedMixedx2: """ Attributes: - int_set_list - map_int_strset - map_int_strset_list """ thrift_spec = ( None, # 0 (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1 (2, TType.MAP, 'map_int_strset', (TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2 (3, TType.LIST, 'map_int_strset_list', (TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3 ) def __init__(self, int_set_list=None, map_int_strset=None, map_int_strset_list=None,): self.int_set_list = int_set_list self.map_int_strset = map_int_strset self.map_int_strset_list = map_int_strset_list def read(self, iprot): if iprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None and f astbinary is not None: fastbinary.decode_binary(self, iprot.trans, (self.__class__, self.thrift_spec)) return iprot.readStructBegin() while True: (fname, ftype, fid) = iprot.readFieldBegin() if ftype == TType.STOP: break if fid == 1: if ftype == TType.LIST: self.int_set_list = [] (_etype176, _size173) = iprot.readListBegin() for _i177 in xrange(_size173): _elem178 = set() (_etype182, _size179) = iprot.readSetBegin() for _i183 in xrange(_size179): _elem184 = iprot.readI32(); _elem178.add(_elem184) iprot.readSetEnd() self.int_set_list.append(_elem178) iprot.readListEnd() else : iprot.skip(ftype) elif fid == 2: if ftype == TType.MAP: self.map_int_strset = {} (_ktype186, _vtype187, _size185 ) = iprot.readMapBegin() for _i189 in xrange(_size185): _key190 = iprot.readI32(); _val191 = set() (_etype195, _size192) = iprot.readSetBegin() for _i196 in xrange(_size192): _elem197 = iprot.readString(); _val191.add(_elem197) iprot.readSetEnd() self.map_int_strset[_key190] = _val191 iprot.readMapEnd() else : iprot.skip(ftype) elif fid == 3: if ftype == TType.LIST: self.map_int_strset_list = [] (_etype201, _size198) = iprot.readListBegin() for _i202 in xrange(_size198): _elem203 = {} (_ktype205, _vtype206, _size204 ) = iprot.readMapBegin() for _i208 in xrange(_size204): _key209 = iprot.readI32(); _val210 = set() (_etype214, _size211) = iprot.readSetBegin() for _i215 in xrange(_size211): _elem216 = iprot.readString(); _val210.add(_elem216) iprot.readSetEnd() _elem203[_key209] = _val210 iprot.readMapEnd() self.map_int_strset_list.append(_elem203) iprot.readListEnd() else : iprot.skip(ftype) else : iprot.skip(ftype) iprot.readFieldEnd() iprot.readStructEnd() def write(self, oprot): if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None: oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec))) return oprot.writeStructBegin('NestedMixedx2') if self.int_set_list is not None: oprot.writeFieldBegin('int_set_list', TType.LIST, 1) oprot.writeListBegin(TType.SET, len(self.int_set_list)) for iter217 in self.int_set_list: oprot.writeSetBegin(TType.I32, len(iter217)) for iter218 in iter217: oprot.writeI32(iter218) oprot.writeSetEnd() oprot.writeListEnd() oprot.writeFieldEnd() if self.map_int_strset is not None: oprot.writeFieldBegin('map_int_strset', TType.MAP, 2) oprot.writeMapBegin(TType.I32, TType.SET, len(self.map_int_strset)) for kiter219,viter220 in self.map_int_strset.items(): oprot.writeI32(kiter219) oprot.writeSetBegin(TType.STRING, len(viter220)) for iter221 in viter220: oprot.writeString(iter221) oprot.writeSetEnd() oprot.writeMapEnd() oprot.writeFieldEnd() if self.map_int_strset_list is not None: oprot.writeFieldBegin('map_int_strset_list', TType.LIST, 3) oprot.writeListBegin(TType.MAP, len(self.map_int_strset_list)) for iter222 in self.map_int_strset_list: oprot.writeMapBegin(TType.I32, TType.SET, len(iter222)) for kiter223,viter224 in iter222.items(): oprot.writeI32(kiter223) oprot.writeSetBegin(TType.STRING, len(viter224)) for iter225 in viter224: oprot.writeString(iter225) oprot.writeSetEnd() oprot.writeMapEnd() oprot.writeListEnd() oprot.writeFieldEnd() oprot.writeFieldStop() oprot.writeStructEnd() def validate(self): return def __repr__(self): L = ['%s=%r' % (key, value) for key, value in self.__dict__.iteritems()] return '%s(%s)' % (self.__class__.__name__, ', '.join(L)) def __eq__(self, other): return isinstance(other, self.__class__) and self.__dict__ == other.__dict__ def __ne__(self, other): return not (self == other) If this ( THRIFT-1115 ) passes muster, then the next direction that might be interesting to go would be to look at reviving the python .thrift compiler from branches/py-compiler/compiler/py/src/, removing all of its code text generation, and instead build up the thrift classes at runtime using metaclasses. If the parser/builder class lived in the python thrift libraries, then it would allow some kinds of applications (proxies maybe) to work more dynamically.
        Hide
        Will Pierce added a comment -

        FYI, this latest patch significantly changes the python unit tests. The RunClientServer.py program spawned by make check now wraps the command line tests which were previously run directly.

        The tests are now run with varying command lines to make them use different generated code, i.e. different gen-py-* directories on sys.path. The reason is that this ensures each of the python thrift compiler options get identical test coverage. (The script 'RunClientServer.py' should probably be renamed to 'RunTests.py' or something, but that's a small detail.) Unfortunately, there's very little, if any, testing of the python Twisted code, so that's still an area for improvement.

        As this patch stands, there's over 400 distinct unit tests now executed on the python codebase, exercising many permutations (server type, transport, zlib compression, SSL encryption, etc). Going forward, I'll be sure to add a unit test case any time I uncover a bug in the python thrift code. Incrementally this will be pretty awesome.

        Show
        Will Pierce added a comment - FYI, this latest patch significantly changes the python unit tests. The RunClientServer.py program spawned by make check now wraps the command line tests which were previously run directly. The tests are now run with varying command lines to make them use different generated code, i.e. different gen-py-* directories on sys.path . The reason is that this ensures each of the python thrift compiler options get identical test coverage. (The script 'RunClientServer.py' should probably be renamed to 'RunTests.py' or something, but that's a small detail.) Unfortunately, there's very little, if any, testing of the python Twisted code, so that's still an area for improvement. As this patch stands, there's over 400 distinct unit tests now executed on the python codebase, exercising many permutations (server type, transport, zlib compression, SSL encryption, etc). Going forward, I'll be sure to add a unit test case any time I uncover a bug in the python thrift code. Incrementally this will be pretty awesome.
        Hide
        Will Pierce added a comment -

        How does this look? The unit tests are thorough and default compiler behavior doesn't change, so it won't break existing users. Is this good enough for committers?

        PS thanks for the patience throughout this patch. A lot has happened since the initial March 25th submission and subsequent updates. I even moved 4000 km.

        Show
        Will Pierce added a comment - How does this look? The unit tests are thorough and default compiler behavior doesn't change, so it won't break existing users. Is this good enough for committers? PS thanks for the patience throughout this patch. A lot has happened since the initial March 25th submission and subsequent updates. I even moved 4000 km.
        Hide
        Roger Meier added a comment -

        Thanks Will for this huge work!

        just committed
        -roger

        Show
        Roger Meier added a comment - Thanks Will for this huge work! just committed -roger
        Hide
        Hudson added a comment -

        Integrated in Thrift #259 (See https://builds.apache.org/job/Thrift/259/)
        THRIFT-1115 python TBase class for dynamic (de)serialization, and _slots_ option for memory savings
        Patch: Will Pierce

        roger : http://svn.apache.org/viewvc/?view=rev&rev=1169492
        Files :

        • /thrift/trunk/.gitignore
        • /thrift/trunk/compiler/cpp/src/generate/t_py_generator.cc
        • /thrift/trunk/lib/py/src/Thrift.py
        • /thrift/trunk/lib/py/src/protocol/TBase.py
        • /thrift/trunk/lib/py/src/protocol/TCompactProtocol.py
        • /thrift/trunk/lib/py/src/protocol/TProtocol.py
        • /thrift/trunk/lib/py/src/protocol/_init_.py
        • /thrift/trunk/test/ThriftTest.thrift
        • /thrift/trunk/test/py/Makefile.am
        • /thrift/trunk/test/py/RunClientServer.py
        • /thrift/trunk/test/py/SerializationTest.py
        • /thrift/trunk/test/py/TestClient.py
        • /thrift/trunk/test/py/TestEof.py
        • /thrift/trunk/test/py/TestServer.py
        • /thrift/trunk/test/py/TestSocket.py
        • /thrift/trunk/test/py/TestSyntax.py
        Show
        Hudson added a comment - Integrated in Thrift #259 (See https://builds.apache.org/job/Thrift/259/ ) THRIFT-1115 python TBase class for dynamic (de)serialization, and _ slots _ option for memory savings Patch: Will Pierce roger : http://svn.apache.org/viewvc/?view=rev&rev=1169492 Files : /thrift/trunk/.gitignore /thrift/trunk/compiler/cpp/src/generate/t_py_generator.cc /thrift/trunk/lib/py/src/Thrift.py /thrift/trunk/lib/py/src/protocol/TBase.py /thrift/trunk/lib/py/src/protocol/TCompactProtocol.py /thrift/trunk/lib/py/src/protocol/TProtocol.py /thrift/trunk/lib/py/src/protocol/_ init _.py /thrift/trunk/test/ThriftTest.thrift /thrift/trunk/test/py/Makefile.am /thrift/trunk/test/py/RunClientServer.py /thrift/trunk/test/py/SerializationTest.py /thrift/trunk/test/py/TestClient.py /thrift/trunk/test/py/TestEof.py /thrift/trunk/test/py/TestServer.py /thrift/trunk/test/py/TestSocket.py /thrift/trunk/test/py/TestSyntax.py
        Hide
        Christian Rakow added a comment -

        I noticed a little bug when using slots and dynamic generation:
        Currently TApplicationException is not imported anymore in the service file, but still in use. The new TExceptionBase class is never used.

        Best regards

        Show
        Christian Rakow added a comment - I noticed a little bug when using slots and dynamic generation: Currently TApplicationException is not imported anymore in the service file, but still in use. The new TExceptionBase class is never used. Best regards
        Hide
        Will Pierce added a comment -

        Thanks Christian, I'll update the test cases and submit a patch in another ticket to fix it. Thx!

        Show
        Will Pierce added a comment - Thanks Christian, I'll update the test cases and submit a patch in another ticket to fix it. Thx!
        Hide
        Nathaniel Cook added a comment -

        Did the TApplicationException import bug ever get fixed?

        Show
        Nathaniel Cook added a comment - Did the TApplicationException import bug ever get fixed?
        Hide
        Will Pierce added a comment -

        Not yet. I'm in transit at the moment, but can send a patch tonight. Should be a 1 or 2 liner if you want to snag it yourself.

        Show
        Will Pierce added a comment - Not yet. I'm in transit at the moment, but can send a patch tonight. Should be a 1 or 2 liner if you want to snag it yourself.
        Hide
        Will Pierce added a comment -

        missing import for TApplicationException, have 2 line patch

        Show
        Will Pierce added a comment - missing import for TApplicationException, have 2 line patch
        Hide
        Will Pierce added a comment -

        2 line patch adds TApplicationException to the import stanza (along with TException).

        Show
        Will Pierce added a comment - 2 line patch adds TApplicationException to the import stanza (along with TException).
        Hide
        Jake Farrell added a comment -

        Import line currently contains all items from that patch. Anything left on this Will or can this be closed?

        Show
        Jake Farrell added a comment - Import line currently contains all items from that patch. Anything left on this Will or can this be closed?
        Hide
        Will Pierce added a comment -

        terrific!

        Show
        Will Pierce added a comment - terrific!

          People

          • Assignee:
            Will Pierce
            Reporter:
            Will Pierce
          • Votes:
            1 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development