Hive / HIVE-5775 Introduce Cost Based Optimizer to Hive / HIVE-7324

CBO: provide a mechanism to test CBO features based on table stats only (w/o table data)

    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: CBO
    • Labels:
      None

      Description

      Since a lot of the CBO work is focused on planning, it would be nice to be able to run EXPLAIN on queries to test CBO features without needing table data. TPCDS has a rich enough schema and query set, so the patch loads a dump of TPCDS (scale 10000) stats.

      1. TestCBO shows a way to load stats from a dump and run EXPLAIN on a TPCDS query. The output is currently dumped to System.out. This can be improved by hooking into QTestUtil, but hopefully this is a good start.

      2. Uncovered a couple of issues in the process of testing this:
      a) PartitionPruner fails on 'true' constants. For example, you will get an error for

      SELECT * 
      FROM t WHERE
      partCol < 100 AND true
      

      This is exposed because the predicates coming out of Optiq can contain 'true' predicates.
      b) OpTraitsRulesProcFactory:checkBucketedTable checks that the number of files equals numBuckets. This fails because there are no data files. So I have altered it to catch exceptions and assume bucketMapJoinConvertible = false if an exception is encountered here.
      I am uploading these changes as part of this patch for now and will carve them out as separate patches.
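
      The two workarounds in item 2 can be sketched roughly as follows. This is only an illustration and not the code in the patch: the class name, both method names, and the use of plain strings in place of real expression nodes are assumptions made for the example. The first helper drops literal 'true' conjuncts before a predicate would reach the pruner; the second treats any failure while counting data files as "not bucket map join convertible".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch; none of these names come from the Hive code base.
public class CboStatsOnlySketch {

    // (a) Remove literal TRUE conjuncts from an AND-ed predicate.
    // Strings stand in for expression nodes to keep the sketch self-contained.
    static List<String> dropTrueConjuncts(List<String> conjuncts) {
        List<String> kept = new ArrayList<>();
        for (String conjunct : conjuncts) {
            if (!"true".equalsIgnoreCase(conjunct.trim())) {
                kept.add(conjunct);
            }
        }
        return kept;
    }

    // An empty conjunction is equivalent to TRUE.
    static String toPredicate(List<String> conjuncts) {
        return conjuncts.isEmpty() ? "true" : String.join(" AND ", conjuncts);
    }

    // (b) Compare the file count with numBuckets, but assume "not convertible"
    // if the count cannot be obtained (for example, when there are no data files).
    static boolean isBucketMapJoinConvertible(int numBuckets, Callable<Integer> fileCounter) {
        try {
            return fileCounter.call() == numBuckets;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> pruned = dropTrueConjuncts(Arrays.asList("partCol < 100", "true"));
        System.out.println(toPredicate(pruned));

        boolean convertible = isBucketMapJoinConvertible(4, () -> {
            throw new java.io.FileNotFoundException("no data files");
        });
        System.out.println(convertible);
    }
}
```

      Swallowing the exception keeps planning alive in a stats-only metastore at the cost of never choosing a bucket map join there, which seems acceptable for plan-level tests.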

      Ashutosh Chauhan, Gunther Hagleitner, can you please take a look?

      1. HIVE-7324.1.patch
        9.45 MB
        Harish Butani
      2. HIVE-7324.2.patch
        9.45 MB
        Gunther Hagleitner

        Activity

        Harish Butani added a comment -

        The issue is that MetadataImporter only works on an empty DB; otherwise you run into id conflicts, because ids imported from the export files collide with ids already in the metastore. To get this to work through QTestUtil, I can look into having it react to a flag such that it either imports the schema or invokes createSources.
        But this still would not get to what you were trying to do: you will only be able to run these tests along with other .q tests in one go.

        Gunther Hagleitner added a comment -

        However, when I ran the tests, they failed because loading the metastore violated some foreign key constraints.

        Gunther Hagleitner added a comment -

        I've added the import tool to QTestUtil as an alternative way of writing tests with this. You can now bake it into q file tests.


          People

          • Assignee:
            Harish Butani
            Reporter:
            Harish Butani
          • Votes:
            0
            Watchers:
            2
