Apache Hudi / HUDI-6041

Add a `properties` parameter to Hudi Spark Procedures

Details

    Description

      Currently, passing extra properties to the Bootstrap procedure requires writing them to a file on HDFS and setting `props_file_path` to that file, which makes the procedure cumbersome to call. For example:

      call run_bootstrap(table => 'test_hudi_table', table_type => 'COPY_ON_WRITE', 
      bootstrap_path => 'hdfs://ns1/hive/warehouse/hudi.db/test_hudi_table', 
      base_path => 'hdfs://ns1//tmp/hoodie/test_hudi_table', 
      rowKey_field => 'id', partition_path_field => 'dt',
      props_file_path => 'hdfs://ns1//tmp/tableProp.txt'); 

      Alternatively, those properties can be set through session configs, but that means executing additional `set` SQL statements first.

      We can add a new procedure input parameter named `properties` and collect the key-value pairs passed through it, like:

      call run_bootstrap(table => 'test_hudi_table', table_type => 'COPY_ON_WRITE', 
      bootstrap_path => 'hdfs://ns1/hive/warehouse/hudi.db/test_hudi_table', 
      base_path => 'hdfs://ns1//tmp/hoodie/test_hudi_table', 
      rowKey_field => 'id', partition_path_field => 'dt', 
      properties => 'hoodie.datasource.write.hive_style_partitioning=true');  

      This way, we no longer need to put a separate file on HDFS.
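      As a rough illustration of how a `properties` string could be turned into key-value pairs, here is a minimal Python sketch. It is not Hudi's implementation: the function name `parse_properties` is hypothetical, and the comma used to separate multiple pairs is an assumption, since the description above only shows a single `key=value` pair.

```python
def parse_properties(raw, pair_sep=","):
    """Parse a string such as 'k1=v1,k2=v2' into a dict.

    Assumption: pairs are separated by `pair_sep` (comma by default);
    the issue text itself only shows a single key=value pair.
    """
    props = {}
    for item in raw.split(pair_sep):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments, e.g. a trailing separator
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = item.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        props[key] = value.strip()
    return props


props = parse_properties("hoodie.datasource.write.hive_style_partitioning=true")
print(props)
```

      Splitting on the first `=` only keeps values that contain `=` intact, and rejecting malformed items early avoids silently dropping a misspelled property.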

            People

              Assignee: Unassigned
              Reporter: lvyanquan (1365976815@qq.com)