Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Transactions

    Description

      1. DDLSemanticAnalyzer.alterTableOutput is unused
      2. DDLTask.generateAddMmTasks(Table) - stmtId should probably come from TransactionManager
      3. DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has Long mmWriteId = crtTbl.getInitialMmWriteId(); the logic is unclear, since this ID is only set in one place.
      4. FileSinkOperator has multiple places that look like conf.getWriteType() == AcidUtils.Operation.NOT_ACID || conf.isMmTable() - what is the writeType for MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType() != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. MoveTask.handleStaticParts() call to Hive.loadPartition()
      5. HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is obsolete
      6. Compactor Initiator likely doesn't work for MM tables. It's triggered by info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to either because DbTxnManager.acquireLocks() does compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t)); i.e. it treats MM tables as non-acid tables
      7. In general, integration with full Acid seems confused with respect to MM: it treats MM as a special table type rather than as a subtype of Acid table (mostly, but not always).
        1. e.g. SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)
        2. SemanticAnalyzer.validate() has if (tbl != null && (AcidUtils.isFullAcidTable(tbl) || MetaStoreUtils.isInsertOnlyTable(tbl.getParameters()))) {
      8. LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather than from TM
      9. ImportCommitTask - doesn't currently do anything. It used to commit mmID. Need to verify we properly commit the txn in the Driver
      10. As far as I can tell all the mm_*.q tests run on TestCliDriver which means MR. This doesn't exercise some code specifically for dealing with writes from Union All queries (CTAS, Insert into). On MR this requires "hive.optimize.union.remove=true" (false by default)
      11. Remove MoveWork().setNoop(boolean) and usages per todo in GenMapRedUtils.createMRWorkForMergingFiles(FileSinkOperator fsInput, Path finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)
      12. PartialScanWork.tblDesc - unused
      13. Partition.getBucketPath(int bucketNum) has "// Note: this makes assumptions that won't work with MM tables, unions, etc.". File a Jira?
      14. PartitionDesc.LOG is unused
      15. Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding IOW and multi IOW
      16. mm_bucket_convert.q - doesn't install DbTxnManager, doesn't write any data - not sure what it tests in practice
      17. There are no concurrency tests that check locking
      18. There are no tests with aborted txns
      19. Tests don't run on Tez/LLAP - this affects some optimizations, such as Union All writes
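
Items 4, 6 and 7 above all come down to how a table is classified. The following is a minimal, self-contained sketch of the "MM as a subtype of Acid" view, not code from Hive. The class, enum and method names are hypothetical, and it assumes the usual table parameters "transactional" = "true" and "transactional_properties" = "insert_only" are what mark an MM table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hive.
public class MmTableKindSketch {

    // MM (insert-only) is modelled as one kind of transactional table.
    enum TableKind { NON_ACID, FULL_ACID, INSERT_ONLY }

    // Assumption: the table parameters below are what distinguish the kinds.
    static TableKind classify(Map<String, String> params) {
        if (params == null) {
            return TableKind.NON_ACID;
        }
        boolean transactional = "true".equalsIgnoreCase(params.get("transactional"));
        if (!transactional) {
            return TableKind.NON_ACID;
        }
        boolean insertOnly =
            "insert_only".equalsIgnoreCase(params.get("transactional_properties"));
        return insertOnly ? TableKind.INSERT_ONLY : TableKind.FULL_ACID;
    }

    // A single predicate covering both full Acid and MM tables.
    static boolean isTransactional(TableKind kind) {
        return kind != TableKind.NON_ACID;
    }

    public static void main(String[] args) {
        Map<String, String> mm = new HashMap<>();
        mm.put("transactional", "true");
        mm.put("transactional_properties", "insert_only");

        TableKind kind = classify(mm);
        // A check equivalent to "is full Acid" is false for this table,
        // while the unified predicate is true.
        System.out.println(kind + " fullAcid=" + (kind == TableKind.FULL_ACID)
            + " transactional=" + isTransactional(kind));
    }
}
```

With a classification like this, a caller such as the lock-acquisition path in item 6 could ask whether a table is transactional at all, rather than whether it is full Acid, and then branch on the kind only where MM and full Acid genuinely differ.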

            People

              Assignee: Unassigned
              Reporter: Eugene Koifman (ekoifman)