  1. Apache Hudi
  2. HUDI-5892

Code refactoring for config defaults and wiring


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: configs
    • Labels: None

    Description

      See https://github.com/apache/hudi/pull/7912/files for more details

      Note that this ticket may be split into separate improvement areas for clarity.

      The items below are tracked in separate tickets:

      CLEANER_COMMITS_RETAINED: when either "hoodie.cleaner.commits.retained", "hoodie.cleaner.hours.retained", or "hoodie.cleaner.fileversions.retained" is set, should we automatically use the corresponding clean policy?

      Clustering around group size: PLAN_STRATEGY_MAX_BYTES_PER_OUTPUT_FILEGROUP

      LAYOUT_TYPE

      ORDERING_FIELD (PAYLOAD_ORDERING_FIELD_PROP_KEY)

      PAYLOAD_CLASS_NAME ("hoodie.compaction.payload.class")

      KEYGENERATOR_TYPE (auto inference)
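
      To make the CLEANER_COMMITS_RETAINED question above concrete, here is a minimal sketch of what such inference could look like. It is not the logic from the linked PR. The class name, the "hoodie.cleaner.policy" key, the policy names, and the choice to infer nothing when more than one "retained" key is set are all assumptions made for illustration; only the three "retained" keys are quoted from this ticket.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: infers a clean policy from user configs.
public class CleanPolicyInference {

    // Assumed key for the explicit policy setting.
    static final String POLICY_KEY = "hoodie.cleaner.policy";

    // "Retained" keys quoted in the ticket, mapped to assumed policy names.
    private static final Map<String, String> KEY_TO_POLICY = new LinkedHashMap<>();
    static {
        KEY_TO_POLICY.put("hoodie.cleaner.commits.retained", "KEEP_LATEST_COMMITS");
        KEY_TO_POLICY.put("hoodie.cleaner.hours.retained", "KEEP_LATEST_BY_HOURS");
        KEY_TO_POLICY.put("hoodie.cleaner.fileversions.retained", "KEEP_LATEST_FILE_VERSIONS");
    }

    // Returns the explicit policy if set; otherwise the policy implied by the
    // single "retained" key the user set; otherwise empty, so the caller can
    // fall back to the static default.
    public static Optional<String> inferPolicy(Map<String, String> userConfigs) {
        String explicit = userConfigs.get(POLICY_KEY);
        if (explicit != null) {
            return Optional.of(explicit); // an explicit setting always wins
        }
        String inferred = null;
        for (Map.Entry<String, String> entry : KEY_TO_POLICY.entrySet()) {
            if (userConfigs.containsKey(entry.getKey())) {
                if (inferred != null) {
                    return Optional.empty(); // ambiguous: more than one key set
                }
                inferred = entry.getValue();
            }
        }
        return Optional.ofNullable(inferred);
    }

    public static void main(String[] args) {
        Map<String, String> configs = new HashMap<>();
        configs.put("hoodie.cleaner.hours.retained", "24");
        System.out.println(inferPolicy(configs).orElse("<static default>"));
    }
}
```

      Under these assumptions, setting only "hoodie.cleaner.hours.retained" selects the hours-based policy, while setting two of the keys leaves the decision to the static default. Whether ambiguity should instead raise an error is one of the open design questions.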

      Low ROI, which can be punted:

      PLAN_STRATEGY_CLASS_NAME to enum

      MERGE_ALLOW_DUPLICATE_ON_INSERTS_ENABLE

      EQUALITY_SQL_QUERIES

      These should be untouched:

      EMBEDDED_TIMELINE_SERVER_REUSE_ENABLED (Flink may still need this)

      DYNAMODB_ENDPOINT_URL (this is in DynamoDbBasedLockConfig, and we only use the config key. There is no ROI for adding infer logic.)

      AWS_ACCESS_KEY, AWS_SECRET_KEY (These configs should only be necessary if the corresponding environment variables are not set (to be confirmed against the code); no code change should be needed.)


            People

              Assignee: Unassigned
              Reporter: Ethan Guo (guoyihua)