Hive / HIVE-6265

Deduplicate Metastore data structures, or at least the protocol


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Metastore
    • Labels: None

    Description

      The Metastore currently stores a storage descriptor (SD) per partition, and the column schema, serde, etc. per SD.
      Most of the time, all partitions in a table have the same setup, and the only thing that differs in the SD/CD/... is the location. In such cases, we don't need to store these separately, or send all of them to the client when many partitions are retrieved for a large table. Storage changes may be too complex with respect to backward compatibility, and because DataNucleus is in the picture and controls the DB schema and persistence. At the least, though, we can avoid sending lots of duplicate data to the client over the network: the Thrift protocol can be modified to omit duplicate data in a backward-compatible manner.
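
The protocol-level idea above can be illustrated with a minimal sketch. This is not Hive's actual Thrift API; every name here (`SharedSd`, `Partition`, `CompactPartition`, `dedup`, `expand`) is hypothetical, and the split between "shared" fields and the per-partition location is an assumption based on the description. The server would send each distinct descriptor once plus a small per-partition record that refers to it by index, and the client would rebuild the full records.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SharedSd:
    """Hypothetical: the SD fields that partitions of one table usually share."""
    cols: Tuple[Tuple[str, str], ...]  # (column name, column type) pairs
    serde: str
    input_format: str
    output_format: str


@dataclass(frozen=True)
class Partition:
    """Hypothetical full record: the shared SD fields are repeated per partition."""
    values: Tuple[str, ...]
    location: str
    sd: SharedSd


@dataclass(frozen=True)
class CompactPartition:
    """Hypothetical wire record: location plus an index into the shared SD list."""
    values: Tuple[str, ...]
    location: str
    sd_index: int


def dedup(partitions: List[Partition]) -> Tuple[List[SharedSd], List[CompactPartition]]:
    """Factor out each distinct shared descriptor so it is sent only once."""
    shared: List[SharedSd] = []
    index_of: Dict[SharedSd, int] = {}
    compact: List[CompactPartition] = []
    for p in partitions:
        idx = index_of.get(p.sd)
        if idx is None:
            # First time this descriptor is seen: assign it the next index.
            idx = len(shared)
            index_of[p.sd] = idx
            shared.append(p.sd)
        compact.append(CompactPartition(p.values, p.location, idx))
    return shared, compact


def expand(shared: List[SharedSd], compact: List[CompactPartition]) -> List[Partition]:
    """Client side: rebuild full partition records from the compact form."""
    return [Partition(c.values, c.location, shared[c.sd_index]) for c in compact]
```

For a table whose partitions all share one setup, `dedup` returns a single `SharedSd` and one small record per partition, and `expand(*dedup(parts))` gives back the original list. In a Thrift setting, one plausible way to keep this backward compatible would be to carry the compact form in new optional fields, since Thrift readers skip field IDs they do not recognize; that mapping is an assumption and is not spelled out in the issue.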

          People

            Assignee: Unassigned
            Reporter: Sergey Shelukhin (sershe)
            Votes: 0
            Watchers: 1
