  1. Apache Hudi
  2. HUDI-4396

Add a boolean parameter to decide whether column changes cascade to partitions when Hive table columns change


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Critical
    • Resolution: Unresolved
    • None
    • None
    • hive, meta-sync
    • 2

    Description

      Currently, when the HudiHiveSync tool applies column changes to a Hive table, which happens in HMSDDLExecutor.updateTableDefinition(), whether the change cascades to partitions is decided only by META_SYNC_PARTITION_FIELDS:

      boolean cascade = syncConfig.getSplitStrings(META_SYNC_PARTITION_FIELDS).size() > 0;

      However, some scenarios do not need the partitions updated. Cascading takes a long time, and Hive may even hang when the number of partitions is large.

      Therefore, I want to add a supplementary boolean config parameter:

      HIVE_SYNC_PARTITION_CASCADE_WITH_COLUMN_CHANGE

      The default is true. Users who do not want the partitions updated can set it to false.
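
      To illustrate the proposal, here is a minimal sketch of how the cascade decision might combine the existing partition-fields check with the new flag. The class name CascadeDecision, the method shouldCascade and its parameter names are hypothetical stand-ins for illustration; they are not the actual Hudi code, and the real change would read both values from the sync config.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Hudi's implementation.
class CascadeDecision {

    // partitionFields stands in for syncConfig.getSplitStrings(META_SYNC_PARTITION_FIELDS).
    // cascadeWithColumnChange stands in for the proposed
    // HIVE_SYNC_PARTITION_CASCADE_WITH_COLUMN_CHANGE value (default true).
    static boolean shouldCascade(List<String> partitionFields, boolean cascadeWithColumnChange) {
        // Cascade only when the table is partitioned AND the user has not opted out.
        return !partitionFields.isEmpty() && cascadeWithColumnChange;
    }

    public static void main(String[] args) {
        List<String> partitioned = Arrays.asList("year", "month");
        List<String> unpartitioned = Collections.emptyList();

        // Default (true): same behavior as the current check.
        System.out.println(shouldCascade(partitioned, true));
        // Opt-out: partitioned table, but partitions are left untouched.
        System.out.println(shouldCascade(partitioned, false));
        // Unpartitioned table: never cascades, whatever the flag says.
        System.out.println(shouldCascade(unpartitioned, true));
    }
}
```

      With the flag left at its default, the result matches the current size() > 0 check, so existing users would see no change in behavior under this sketch.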


            People

              honeyaya XixiHua