Apache Hudi / HUDI-3760

Rebase ColStats onto fetching Records by Column prefix


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • None

    Description

      Right now, all ColStats records, for all columns and all files, are read to compose the index used in Data Skipping.

      In practice, an individual query touches only a handful of columns, so we can prune the number of records fetched very effectively by reading only the records for the columns the query references. This can be done by key prefix, since the ColStats record key is the concatenation of column, partition path, and file name.
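
      A minimal sketch of the idea, not Hudi's actual implementation: it assumes plain-string keys joined with a hypothetical separator (the real ColStats key encoding may differ), and the names make_key and fetch_by_column_prefix are illustrative only. Because the column comes first in the key, all records for one column are contiguous in sorted key order and can be located with a binary search followed by a short scan.

```python
from bisect import bisect_left

# Hypothetical separator; it sorts below any printable character, so the
# prefix "price" + SEP never matches a longer column name such as "price_usd".
SEP = "\x00"


def make_key(column, partition_path, file_name):
    """Build a key as column + partition path + file name (assumed layout)."""
    return SEP.join((column, partition_path, file_name))


def fetch_by_column_prefix(sorted_keys, records, columns):
    """Return only the records whose key starts with one of the given columns."""
    result = []
    for column in sorted(set(columns)):
        prefix = column + SEP
        # Jump to the first key that could carry this prefix...
        i = bisect_left(sorted_keys, prefix)
        # ...then scan forward while keys still belong to this column.
        while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
            result.append(records[sorted_keys[i]])
            i += 1
    return result


# Toy index: 4 columns x 2 files = 8 records (all values are made up).
records = {
    make_key(col, "2022/03", f): {"column": col, "file": f}
    for col in ("id", "price", "price_usd", "ts")
    for f in ("f1.parquet", "f2.parquet")
}
sorted_keys = sorted(records)

# A query that references only "price" fetches 2 of the 8 records.
hits = fetch_by_column_prefix(sorted_keys, records, ["price"])
print(len(hits), "of", len(records), "records fetched")
```

      The number of records read then scales with the number of columns referenced by the query rather than with the total number of columns in the table.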

            People

              Assignee: Alexey Kudinkin (alexey.kudinkin)
              Reporter: Alexey Kudinkin (alexey.kudinkin)
              Votes: 0
              Watchers: 2