Hive / HIVE-7195

Improve Metastore performance

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Metastore
    • Labels:
      None

      Description

      Even with direct SQL, which significantly improves metastore (MS) performance, some operations take a considerable amount of time when a table has many partitions. Specifically, I believe the issues are:

      • When a client requests all partitions, we do not return an iterator; we build a collection of all the data and then send the entire object over the network at once.
      • Operations that require looking up data on the NameNode (NN) can still be slow, since nothing is cached and the lookups are done serially.
      • Perhaps a tangent, but our client timeout is quite dumb. The client will time out and the server has no idea the client is gone. We should use deadlines, i.e. pass the timeout to the server so it can determine when the client's request has expired.
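
      To make the first and last points concrete, here is a minimal sketch in Python (not Hive's actual Java/Thrift code) of a server-side loop that hands out partitions in batches and gives up once the client's timeout has elapsed. Every name in it (iter_partitions, fetch_batch, DeadlineExceeded, the clock parameter) is hypothetical and chosen only for illustration; it assumes the client sends its timeout along with the request.

```python
import time
from typing import Callable, Iterator, Sequence


class DeadlineExceeded(Exception):
    """Raised when the client-supplied timeout has elapsed on the server."""


def iter_partitions(
    fetch_batch: Callable[[int, int], Sequence[str]],
    timeout_s: float,
    batch_size: int = 1000,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield partitions batch by batch instead of building one big list.

    fetch_batch(offset, limit) is a hypothetical stand-in for a paged
    metastore query; it is not a real Hive API.
    """
    # The deadline is computed when the request arrives, from the timeout
    # the client passed along, so the server can tell when the client is gone.
    deadline = clock() + timeout_s

    def generate() -> Iterator[str]:
        offset = 0
        while True:
            # Check before each batch: no point fetching for an expired client.
            if clock() >= deadline:
                raise DeadlineExceeded(f"gave up after {offset} partitions")
            batch = fetch_batch(offset, batch_size)
            if not batch:
                return
            yield from batch
            offset += len(batch)

    return generate()


# Usage with an in-memory stand-in for the metastore backend.
PARTITIONS = [f"ds=2014-06-{day:02d}" for day in range(1, 11)]


def fake_fetch(offset: int, limit: int) -> Sequence[str]:
    return PARTITIONS[offset:offset + limit]


parts = list(iter_partitions(fake_fetch, timeout_s=30.0, batch_size=4))
```

      With this shape, only one batch is held in memory at a time, and a request whose client has already timed out stops at the next batch boundary rather than running to completion.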

              People

              • Assignee:
                Unassigned
              • Reporter:
                Brock Noland (brocknoland)
              • Votes:
                3
              • Watchers:
                25