Hive / HIVE-6265

Deduplicate Metastore data structures, or at least the protocol

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Metastore
    • Labels:
      None

      Description

      The Metastore currently stores a storage descriptor (SD) per partition, and the column schema, serde, and related settings per SD.
      In most tables all partitions share the same setup, and the only thing that differs in the SD, column descriptor (CD), and related objects is the location. In such cases there is no need to store these objects separately, or to send them all to the client when many partitions are retrieved for a large table. Changing the storage may be too complex, both because of backward compatibility and because DataNucleus controls the DB schema and persistence. At a minimum, though, we can avoid sending large amounts of duplicate data to the client over the network: the Thrift protocol can be modified to omit duplicate data in a backward-compatible manner.
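      A minimal sketch of the protocol-level idea, not the Metastore's actual API: the response carries each distinct SD once, and every partition refers to it by index while keeping its own location. All names here (`StorageDescriptor`, `CompactPartition`, `dedup`, `expand`) are hypothetical and chosen for illustration only. In Thrift, the index and the shared list would presumably be added as new optional fields so that older clients keep working, but that mapping is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StorageDescriptor:
    """Hypothetical SD minus the location: the part that rarely varies."""
    columns: Tuple[Tuple[str, str], ...]  # (name, type) pairs
    serde: str
    input_format: str
    output_format: str


@dataclass(frozen=True)
class Partition:
    """Full form: every partition carries its own copy of the SD."""
    values: Tuple[str, ...]
    location: str
    sd: StorageDescriptor


@dataclass(frozen=True)
class CompactPartition:
    """Wire form: the SD is replaced by an index into a shared list."""
    values: Tuple[str, ...]
    location: str
    sd_index: int


def dedup(partitions: List[Partition]) -> Tuple[List[StorageDescriptor], List[CompactPartition]]:
    """Collect each distinct SD once and point partitions at it by index."""
    shared: List[StorageDescriptor] = []
    index_of: Dict[StorageDescriptor, int] = {}
    compact: List[CompactPartition] = []
    for p in partitions:
        idx = index_of.get(p.sd)
        if idx is None:
            idx = len(shared)
            index_of[p.sd] = idx
            shared.append(p.sd)
        compact.append(CompactPartition(p.values, p.location, idx))
    return shared, compact


def expand(shared: List[StorageDescriptor], compact: List[CompactPartition]) -> List[Partition]:
    """Client side: rebuild full partitions from the compact form."""
    return [Partition(c.values, c.location, shared[c.sd_index]) for c in compact]


# Example: three partitions of one table that differ only in location.
sd = StorageDescriptor(
    columns=(("id", "bigint"), ("name", "string")),
    serde="LazySimpleSerDe",
    input_format="TextInputFormat",
    output_format="HiveIgnoreKeyTextOutputFormat",
)
parts = [
    Partition((f"2014-01-{d:02d}",), f"/warehouse/t/ds=2014-01-{d:02d}", sd)
    for d in range(1, 4)
]
shared, compact = dedup(parts)
```

      Here `shared` holds a single SD for all three partitions, and `expand(shared, compact)` restores the original list. A partition with a different serde or schema simply gets its own entry in `shared`, so the scheme degrades gracefully when a table is not uniform.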


          People

          • Assignee: Unassigned
          • Reporter: Sergey Shelukhin
          • Votes: 0
          • Watchers: 1
