Hive / HIVE-17035

Optimizer: Lineage transform() should be invoked after the rest of the optimizers

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Logical Optimizer
    • Labels: None

    Description

      In a fairly large query with tens of left joins, creating the lineageInfo alone took 1500+ seconds. The table had a large number of columns, and at one stage ReduceSinkLineage ended up processing 7000+ value columns, even though the query projected only 50 columns.

      It would be good to invoke the lineage transform after the rest of the optimizers in Optimizer have been invoked. This would avoid unnecessary processing and improve the runtime.
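
      The effect of the ordering can be illustrated with a small, self-contained sketch. It is not Hive code: the Transform interface, pruner(), LineagePass and columnsVisited() are hypothetical names made up for this example, and the pruning pass simply stands in for whichever optimizers reduce the column set. It only shows that a lineage pass placed after pruning visits the projected columns rather than the full set.

```java
import java.util.ArrayList;
import java.util.List;

public class LineageOrderSketch {

    /** Hypothetical optimizer pass over a list of column names (not a Hive interface). */
    interface Transform {
        List<String> apply(List<String> columns);
    }

    /** Hypothetical pruning pass: keeps only the first `projected` columns. */
    static Transform pruner(int projected) {
        return cols -> new ArrayList<>(cols.subList(0, Math.min(projected, cols.size())));
    }

    /** Hypothetical lineage pass: records how many columns it had to visit. */
    static final class LineagePass implements Transform {
        int visited = 0;

        @Override
        public List<String> apply(List<String> columns) {
            visited += columns.size();
            return columns;
        }
    }

    /** Runs the two passes in the requested order and returns the lineage pass's workload. */
    static int columnsVisited(boolean lineageFirst, int total, int projected) {
        List<String> cols = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            cols.add("col" + i);
        }

        LineagePass lineage = new LineagePass();
        List<Transform> pipeline = new ArrayList<>();
        if (lineageFirst) {
            pipeline.add(lineage);      // lineage sees every column
        }
        pipeline.add(pruner(projected));
        if (!lineageFirst) {
            pipeline.add(lineage);      // lineage sees only what survived pruning
        }

        for (Transform t : pipeline) {
            cols = t.apply(cols);
        }
        return lineage.visited;
    }

    public static void main(String[] args) {
        System.out.println(columnsVisited(true, 7000, 50));   // prints 7000
        System.out.println(columnsVisited(false, 7000, 50));  // prints 50
    }
}
```

      The numbers 7000 and 50 are taken from the report above; how much of the 1500+ seconds such a reordering would actually recover in Hive is not something this sketch can show.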

      Attachments

        1. HIVE-17035.1.patch (2 kB, Rajesh Balamohan)
        2. HIVE-17035.2.patch (817 kB, Rajesh Balamohan)
        3. HIVE-17035.3.patch (799 kB, Rajesh Balamohan)
        4. HIVE-17035.4.patch (799 kB, Rajesh Balamohan)

            People

              Assignee: Rajesh Balamohan (rajesh.balamohan)
              Reporter: Rajesh Balamohan (rajesh.balamohan)
              Votes: 0
              Watchers: 3