HIVE-6262

Remove unnecessary copies of schema + table desc from serialized plan

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

      Description

      Currently for a partitioned table the following are true:

      • for each partitiondesc we send a copy of the corresponding tabledesc
      • for each partitiondesc we send two copies of the schema (in different formats).

      Obviously we need to send different schemas if they are required by schema evolution, but in our case we'll always end up with multiple copies.

      The effect can be dramatic. Removing these copies can easily reduce the plan size for partitioned tables by 8-10x. Plans themselves can be tens to hundreds of MB (even with Kryo), and the size difference is paid again in every task we run on the cluster.
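
      To illustrate why the duplicated descriptors matter, here is a minimal sketch of the general idea. It is not the code from HIVE-6262.1.patch. The classes `TableDesc` and `PartitionDesc` below are hypothetical stand-ins rather than Hive's real classes, the property names and values are made up, and plain Java serialization is used in place of Kryo to keep the example self-contained. The sketch points every partition descriptor at one shared table descriptor instance, so the serializer writes that object once and emits back-references afterwards.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PlanDedupSketch {

    /** Hypothetical stand-in for a table descriptor (not Hive's real class). */
    static final class TableDesc implements Serializable {
        private static final long serialVersionUID = 1L;
        final Properties props;
        TableDesc(Properties props) { this.props = props; }
    }

    /** Hypothetical stand-in for a partition descriptor. */
    static final class PartitionDesc implements Serializable {
        private static final long serialVersionUID = 1L;
        final String location;
        TableDesc table; // replaced by a shared instance during deduplication
        PartitionDesc(String location, TableDesc table) {
            this.location = location;
            this.table = table;
        }
    }

    /** Builds n partitions that each carry their own, identical table descriptor. */
    static List<PartitionDesc> buildPartitions(int n) {
        List<PartitionDesc> partitions = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Properties props = new Properties(); // illustrative values only
            props.setProperty("name", "sales");
            props.setProperty("columns", "id,amount,region");
            props.setProperty("columns.types", "int:double:string");
            partitions.add(new PartitionDesc("/warehouse/sales/ds=" + i, new TableDesc(props)));
        }
        return partitions;
    }

    /**
     * Points partitions with equal table properties at a single shared
     * TableDesc instance. Returns the number of distinct descriptors kept.
     */
    static int shareTableDescs(List<PartitionDesc> partitions) {
        Map<Properties, TableDesc> canonical = new HashMap<>();
        for (PartitionDesc p : partitions) {
            TableDesc existing = canonical.putIfAbsent(p.table.props, p.table);
            if (existing != null) {
                p.table = existing;
            }
        }
        return canonical.size();
    }

    /** Serializes the object graph with Java serialization and returns its size in bytes. */
    static int serializedSize(Object plan) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(plan);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        List<PartitionDesc> partitions = buildPartitions(1000);
        int before = serializedSize(partitions);
        int distinct = shareTableDescs(partitions);
        int after = serializedSize(partitions);
        System.out.println("distinct table descriptors: " + distinct);
        System.out.println("bytes before: " + before + ", after: " + after);
    }
}
```

      The savings in this toy example depend on how large the descriptor is relative to the per-partition data, so the printed ratio should not be read as a reproduction of the 8-10x figure above.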

      1. HIVE-6262.1.patch
        294 kB
        Gunther Hagleitner

        Activity

        Gunther Hagleitner added a comment -

        Committed to trunk. Thanks for the review Vikram!

        Vikram Dixit K added a comment -

        This is really good in terms of memory efficiency. LGTM +1.

        Gunther Hagleitner added a comment -

        Tests have successfully run: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/994/testReport/ (the 7 failures are unrelated). Unfortunately jira was down when the tests completed (so no auto update)


          People

          • Assignee: Gunther Hagleitner
          • Reporter: Gunther Hagleitner
          • Votes: 0
          • Watchers: 2
