  1. Hive
  2. HIVE-24336

Turn off the direct insert for EXPLAIN ANALYZE queries

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: None

      Description

      If we run EXPLAIN ANALYZE on an INSERT query with direct insert on, the new files are created in the table directory, and they are not cleaned up when the EXPLAIN query finishes.

      Example: 

      create table analyze_table (id int) stored as orc tblproperties('transactional'='true');
      explain analyze insert into analyze_table values (1),(2),(3),(4);
      
      select * from analyze_table;
      1
      2
      3
      4
      Time taken: 0.1 seconds, Fetched: 4 row(s)
      
      The result should be empty after the explain command.
      

      An EXPLAIN ANALYZE query executes the actual query, so the files are created in the staging directory, but the MoveTask does not move them to the final location. When the EXPLAIN ANALYZE query finishes, the staging directory is deleted and the table directory is the same as it was before the query. With direct insert on, however, the files are written straight into the table directory, so an additional cleanup would be needed to restore the table directory to its state before the EXPLAIN ANALYZE query. This can be avoided by turning off direct insert for EXPLAIN ANALYZE queries. Direct insert improves performance by eliminating the file movements in the MoveTask, but it has no effect on the query execution plan, so it can be safely turned off for EXPLAIN queries.
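
      On a build without this fix, the same effect can presumably be achieved by hand by disabling direct insert for the session before running the EXPLAIN ANALYZE. The following is only a sketch: it assumes the switch is the hive.acid.direct.insert.enabled property, a name that does not appear in this issue and should be checked against the Hive version in use.

```sql
-- Assumption: hive.acid.direct.insert.enabled controls direct insert
-- (property name not taken from this issue; verify it for your Hive version).
set hive.acid.direct.insert.enabled=false;

-- With direct insert off, the files should go to the staging directory,
-- which is deleted when the EXPLAIN ANALYZE query finishes.
explain analyze insert into analyze_table values (1),(2),(3),(4);

-- Check that the table directory was left untouched.
select * from analyze_table;

-- Restore direct insert for regular INSERT queries in this session.
set hive.acid.direct.insert.enabled=true;
```

      Setting the property per session keeps the performance benefit of direct insert for ordinary INSERT queries and affects only the statements run while it is off.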

              People

              • Assignee:
                kuczoram Marta Kuczora
                Reporter:
                kuczoram Marta Kuczora

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 10m