  1. Pig
  2. PIG-2632

Create a SchemaTuple which generates efficient Tuples via code gen


    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels:


      This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing the Schema on the frontend, we can code-generate Tuples which can be used for fun and profit. In rudimentary tests, memory efficiency is 2-4x better, and the serialized form is ~15% smaller (though this depends heavily on the data). Get/set benchmarks still need to be run, but assuming performance is on par with (or even faster than) Tuple, the memory gain is huge.
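
      To make the idea concrete, here is a rough sketch of what a generated Tuple might look like for an assumed schema of (int, long, chararray), next to a generic list-of-Objects tuple. Every name here (SchemaTupleSketch, GenericTuple, Tuple_int_long_chararray, getInt0, setInt0) is hypothetical and is not taken from the patch; the null bitmask is likewise just one plausible way to track nulls. The point is only that fields stored as primitives avoid one boxed object per field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: none of these names come from the Pig patch.
final class SchemaTupleSketch {

    /** Generic representation: every field is a boxed Object held in a list. */
    static final class GenericTuple {
        private final List<Object> fields = new ArrayList<>();

        void append(Object value) { fields.add(value); }
        Object get(int i) { return fields.get(i); }
        void set(int i, Object value) { fields.set(i, value); }
        int size() { return fields.size(); }
    }

    /** What a generator might emit for the assumed schema (int, long, chararray). */
    static final class Tuple_int_long_chararray {
        private int f0;             // stored unboxed
        private long f1;            // stored unboxed
        private String f2;
        private byte nulls = 0b111; // bit i set means field i is null

        int size() { return 3; }

        /** Generic accessor; boxes primitives only when asked through this path. */
        Object get(int i) {
            checkIndex(i);
            if ((nulls & (1 << i)) != 0) return null;
            switch (i) {
                case 0: return f0;
                case 1: return f1;
                default: return f2;
            }
        }

        /** Generic mutator; casts fail fast if the value does not match the schema. */
        void set(int i, Object value) {
            checkIndex(i);
            if (value == null) {
                nulls |= (1 << i);
                if (i == 2) f2 = null; // drop the reference so it can be collected
                return;
            }
            switch (i) {
                case 0: f0 = (Integer) value; break;
                case 1: f1 = (Long) value; break;
                default: f2 = (String) value; break;
            }
            nulls &= ~(1 << i);
        }

        /** Typed accessors skip boxing entirely; getInt0 returns 0 if the field is null. */
        int getInt0() { return f0; }
        void setInt0(int v) { f0 = v; nulls &= ~1; }

        private static void checkIndex(int i) {
            if (i < 0 || i >= 3) throw new IndexOutOfBoundsException("field " + i);
        }
    }

    public static void main(String[] args) {
        Tuple_int_long_chararray t = new Tuple_int_long_chararray();
        t.setInt0(42);
        t.set(1, 7L);
        t.set(2, "pig");
        System.out.println(t.get(0) + ", " + t.get(1) + ", " + t.get(2));
    }
}
```

      In the generic version each int or long costs a separate boxed object plus a list slot; in the specialized version the same fields live inline in the tuple object, which is the kind of saving the memory numbers above refer to.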

      Need to clean up the code and add tests.

      Right now, it generates a SchemaTuple for every inputSchema and outputSchema given to UDFs. The next step is to make a SchemaBag, where I think the serialization savings will be really huge.


        1. schematuple benchmarking.pptx
          104 kB
          Jonathan Coveney
        2. schematuple benchmarking.pdf
          70 kB
          Jonathan Coveney
        3. PIG-2632-9.patch
          67 kB
          Jonathan Coveney
        4. PIG-2632-9.patch
          284 kB
          Jonathan Coveney
        5. PIG-2632-8.patch
          242 kB
          Jonathan Coveney
        6. PIG-2632-7.patch
          236 kB
          Jonathan Coveney
        7. PIG-2632-6.patch
          232 kB
          Jonathan Coveney
        8. PIG-2632-5.patch
          222 kB
          Jonathan Coveney
        9. PIG-2632-4.patch
          219 kB
          Jonathan Coveney
        10. PIG-2632-3.patch
          100 kB
          Jonathan Coveney
        11. PIG-2632-10.patch
          318 kB
          Jonathan Coveney
        12. PIG-2632-10.patch
          319 kB
          Jonathan Coveney
        13. PIG-2632-1.patch
          69 kB
          Jonathan Coveney
        14. PIG-2632-0.patch
          55 kB
          Jonathan Coveney




              • Assignee:
                jcoveney Jonathan Coveney
              • Votes:
                1
              • Watchers:
                10


                • Created: