  1. Apache Hudi
  2. HUDI-3177

CREATE INDEX command

Details

    • Type: Task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Pending Closed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: index, metadata
    • Labels: None

    Description

      Users should be able to trigger index creation for one or more partitions, either with a CREATE INDEX statement or with a CLI tool that captures the options below.
       

      CREATE [BLOOM | COL_STATS | SOME_INDEX_TYPE] INDEX ON TABLE [table_name] FOR COLUMNS (col1, col2, col3) WITH OPTION (<file_group_count>, <some_other_option>);

       
      This maps to the following Hudi configs:

      METADATA_PREFIX + ".index.bloom.filter.file.group.count"
      METADATA_PREFIX + ".index.column.stats.file.group.count"
      METADATA_PREFIX + ".index.bloom.filter.for.columns" -> comma-separated column names
      METADATA_PREFIX + ".index.column.stats.for.columns" -> comma-separated column names

      The CLI indexer tool will also map user inputs to the above configs.
      By default, the bloom filter index covers only the record key, and the column stats index covers all columns.
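
      The mapping from statement options to config keys can be sketched as follows. This is a minimal illustration, not the actual Hudi implementation: the function name index_configs and the lookup table are hypothetical, and the value "hoodie.metadata" for METADATA_PREFIX is an assumption, since this ticket only names the constant.

```python
# Assumption: METADATA_PREFIX resolves to "hoodie.metadata"; the ticket only names the constant.
METADATA_PREFIX = "hoodie.metadata"

# Hypothetical lookup from the index type keyword in the statement
# to the key segment used by the configs listed above.
KEY_SEGMENT = {
    "BLOOM": "bloom.filter",
    "COL_STATS": "column.stats",
}


def index_configs(index_type, columns=None, file_group_count=None):
    """Translate CREATE INDEX inputs into config key/value pairs.

    Options that are omitted are left unset, so the defaults described
    above (record key for bloom filter, all columns for column stats) apply.
    """
    segment = KEY_SEGMENT[index_type.upper()]
    configs = {}
    if file_group_count is not None:
        key = f"{METADATA_PREFIX}.index.{segment}.file.group.count"
        configs[key] = str(file_group_count)
    if columns:
        # Column names are passed as a single comma-separated string.
        key = f"{METADATA_PREFIX}.index.{segment}.for.columns"
        configs[key] = ",".join(columns)
    return configs


if __name__ == "__main__":
    for key, value in index_configs("BLOOM", ["col1", "col2"], 4).items():
        print(f"{key}={value}")
```

      A CLI indexer could reuse the same translation, which keeps the SQL statement and the tool aligned on one set of config keys.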

      For v0.11.0, we assume the following:

      1. The file group count is static for all columns.
      2. The set of columns that have already been indexed is inferred from the metadata table (MT) partition layout (see HUDI-3258).

            People

              Assignee: codope Sagar Sumit
              Reporter: codope Sagar Sumit